Authors: Gresho P.M., Sani R.L.
Tags: numerical methods, heat transfer, hydrodynamics, finite element method
ISBN: 0-471-96789-0
Year: 1998
Cover illustration
Snapshot of streamlines and pressure (in color) shortly after an 'impulsive' start for
Re=1000. See section 3.19 for an explanation.
Incompressible Flow and the Finite Element Method
Advection-Diffusion and Isothermal Laminar Flow
P. M. Gresho
Lawrence Livermore National Laboratory
R. L. Sani
University of Colorado
in collaboration with
M. S. Engelman
Fluid Dynamics International, Evanston
JOHN WILEY AND SONS
Chichester · New York · Weinheim · Brisbane · Singapore · Toronto
Copyright © 1998 John Wiley & Sons Ltd,
Baffins Lane, Chichester,
West Sussex PO19 1UD, England
National 01243 779777
International (+44) 1243 779777
e-mail (for orders and customer service enquiries): cs-books@wiley.co.uk
Visit our Home Page on http://www.wiley.co.uk
or http://www.wiley.com
Reprinted with corrections January 1999
All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system, or
transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, scanning
or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the
terms of a licence issued by the Copyright Licensing Agency, 90 Tottenham Court Road, London, UK,
W1P 9HE, without the permission in writing of the Publisher.
Other Wiley Editorial Offices
John Wiley & Sons, Inc., 605 Third Avenue,
New York, NY 10158-0012, USA
WILEY-VCH Verlag GmbH, Pappelallee 3,
D-69469 Weinheim, Germany
Jacaranda Wiley Ltd, 33 Park Road, Milton,
Queensland 4064, Australia
John Wiley & Sons (Asia) Pte Ltd, Clementi Loop #02-01,
Jin Xing Distripark, Singapore 129809
John Wiley & Sons (Canada) Ltd, 22 Worcester Road,
Rexdale, Ontario M9W 1L1, Canada
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN 0 471 96789 0
Typeset in 10/12pt Times from the authors' disks by Laser Words, Madras, India. Printed and bound in
Great Britain by Bookcraft (Bath) Ltd. This book is printed on acid-free paper responsibly
manufactured from sustainable forestation, for which at least two trees are planted for each one used
for paper production.
Contents
Preface xv
Glossary of Abbreviations xix
1 Introduction 1
1.1 Introduction 1
1.2 Incompressible Flow 3
1.3 The Finite Element Method 6
1.4 Incompressible Flow and the Finite Element Method 12
1.5 Overview of this Book 15
1.6 Some Subjective Discussion 17
1.7 Why Finite Elements? Why Not Finite Volumes? 18
2 The Advection-Diffusion Equation 23
2.1 The Continuum Equation 23
2.1.1 The Advective (Convective) Form 23
2.1.2 Dimensionless Forms and Limiting Cases of the Equation 25
2.1.3 The Divergence (Conservation) Form 29
2.1.4 Conservation Laws 30
2.1.5 Weak Forms of the PDE's/Natural Boundary Conditions 32
2.2 The Finite Element Equations/Discretization of the Weak Form 37
2.2.1 Advective Form 37
2.2.2 Divergence Form 44
2.2.3 Conservation Laws 44
2.2.4 An Absolutely Conserving Form 48
2.2.5 A Finite Difference Interpretation 52
2.2.6 A Control Volume FEM 54
2.3 Some Semi-Discrete Equations 58
2.3.1 One Dimension 58
a. Linear elements 59
b. Quadratic elements 65
A * indicates advanced or peripheral material.
2.3.2 Two Dimensions with Bilinear Elements 69
a. An Interior 4-Patch 69
b. A Boundary 2-Patch 76
c. A Boundary Corner 79
d. An Internal Line Heat Source 82
2.3.3 Two Dimensions with Biquadratic Elements 84
a. An Interior 4-Patch 85
b. An Interior 2-Patch 86
c. An Interior 1-Patch 87
2.3.4 Two Dimensions with Serendipity Elements 88
a. An Interior 4-Patch 88
b. An Interior 2-Patch 89
c. Another Interior 2-Patch 90
2.4 Open Boundary Conditions (OBC's) 93
2.4.1 One Dimension 93
2.4.2 Two Dimensions 104
2.5 Some Non-Galerkin Results 107
2.5.1 The Lumped Mass Approximation 107
2.5.2 One-point Quadrature 108
a. An Interior 4-Patch of Uniform Rectangles 108
b. A Boundary 2-Patch 110
c. A Boundary Corner 110
2.5.3 Control Volume Finite Element (CVFEM) 111
a. An Interior 4-Patch 111
b. A Boundary 2-Patch 116
c. A Boundary Corner 119
d. OBC's 120
e. A Nine-Node CVFEM 120
2.5.4 The Group FEM/Product Approximation 123
2.5.5 The Petrov-Galerkin FEM 124
2.6 Dispersion, Dissipation, Phase Speed, Group Velocity, Mesh Design, and Wiggles 125
2.6.1 Qualitative Discussion 125
a. Wiggles 125
b. Dispersion 128
c. Dissipation 129
d. Phase Speed 130
e. Group Velocity 131
f. Mesh Design 132
2.6.2 Quantitative Discussion for Some 1D Problems 133
a. Pure Advection with Periodic BC's 134
b. Advection-Diffusion with Periodic BC's 187
c. Advection-Diffusion with Dirichlet BC's 188
d. Advection-Diffusion with Dirichlet/ Neumann BC's 208
e. Advection-Diffusion with Neumann BC's at Both Ends 213
f. Advection-Diffusion with Dirichlet/ Robin BC's 214
g. The Advective-Diffusive Time Scale 217
h. Final Remarks on 1D Advection-Diffusion 218
2.6.3 Extension to 2D 219
a. Pure Advection with Periodic BC's 220
b. Pure Advection with Dirichlet BC's (Inlet Only) 224
c. Advection-Diffusion with Dirichlet BC's 228
d. Advection-Diffusion with Periodic BC's 230
e. Advection-Diffusion with OBC's 231
f. Final Remarks on Advection-Diffusion via GFEM 231
2.7 Time Integration 232
2.7.1 Some Explicit ODE Methods 240
a. Second-Order Adams-Bashforth (AB2), an 'Explicit Multi-Step Method' 240
b. Third-Order Adams-Bashforth (AB3), Another 'Explicit Multi-Step Method' 240
c. Runge-Kutta Methods (RK2,4) 241
d. Leapfrog (Another Explicit Midpoint Rule) 242
e. Rational Runge-Kutta (RRK) 243
2.7.2 Application to Advection-Diffusion (Scalar Transport) 244
a. Generalities 244
b. Lumping the Mass 245
c. Stability Estimates and the Case for Implicit Methods 246
d. Matrix Method of Stability Analysis 251
e. Balancing Tensor Diffusivity (BTD) 252
2.7.3 Some Implicit ODE Methods 257
a. The Trapezoid Rule (TR) 259
b. Implicit Midpoint Rule (IMR) 262
c. Backward Differentiation Formulae (BDF) 265
2.7.4 Variable-Step Implicit Methods 266
a. Variable-Step Trapezoid Rule 266
b. Variable-Step Backward Euler 270
c. A Model Problem 270
d. An Aerospace Version of TR 273
e. TR on Advection-Diffusion 274
f. The Smoothing Property 282
2.7.5 A Semi-Implicit Method 285
2.7.6 Dispersion (et al.) Errors For Some Fully Discrete Methods 288
a. Introduction 288
b. Semi-Discrete the Other Way 289
c. Fully Discrete 291
d. Generalizations and Extensions 314
2.7.7 Other (Different) Methods Used by Others 316
a. Methods Based on Trajectories/Characteristics 317
b. Methods Based on Modified Equations 329
c. Some Least-Squares Finite Element Methods (LSFEM) 332
d. Methods Based on a Discontinuous-in-Time Galerkin ODE Technique 334
e. Methods Based on Least-Squares and Time-Discontinuous ODE's 339
f. A Wave Equation Method 341
g. Another Combined Method: Taylor Least Squares 341
2.7.8 Concluding Remarks and Suggestions 342
2.8 Additional Numerical Examples 343
2.8.1 Unstable ODE Example 343
2.8.2 Advection-Diffusion of a Puff (Point Source) 353
2.8.3 The Rotating Cone: A Pure Advection Test Problem 354
3 The Navier-Stokes Equations 357
3.1 Notational Introduction 357
3.2 The Continuum Equations (The PDE's) 360
3.3 Alternate Forms of the Viscous Term 362
3.3.1 Stress-Divergence Form 363
3.3.2 Div-Curl Form 363
3.3.3 Curl Form 364
3.4 Alternate Forms of the Non-Linear Term 364
3.4.1 Divergence Form 364
3.4.2 Rotational Form 364
3.4.3 Skew-Symmetric Form 365
3.4.4 A Symmetric Form 365
3.5 Derived Equations 367
3.5.1 The Pressure Poisson Equation (PPE) 367
3.5.2 The Vorticity Transport Equation (VTE) 368
3.5.3 The Penalized Momentum Equation 369
3.6 Alternate Statements of the NS Equations 371
3.6.1 Velocity-Pressure in Divergence Form 371
3.6.2 Velocity-Pressure in Rotational Form 371
3.6.3 PPE Form 371
3.6.4 The Stream Function-Vorticity (ψ-ω) Formulation 372
3.6.5 The Velocity-Vorticity Formulation 372
3.6.6 Other Formulations 373
3.7 Special Cases of Interest 373
3.7.1 Stokes Flow 373
3.7.2 Inviscid Flow 376
3.7.3 Potential Flow 378
3.7.4 Axisymmetric Flow 379
3.8 Boundary Conditions 380
3.8.1 u-P Equations 381
a. Traction 381
b. Mixed 382
c. Total Momentum Flux 383
d. Symmetry 383
e. Robin 384
f. OBC's 385
g. More OBC's 390
h. Penalty method OBC's 391
i. Ill-posed OBC's 391
3.8.2 The Pressure Poisson Equation and Pressure Boundary Conditions 392
3.8.3 The Vorticity Transport Equation and Boundary Conditions on the Vorticity 395
a. The 2D Stream Function-Vorticity Formulation 395
b. The 3D Velocity-Vorticity Formulation 397
3.9 Initial Conditions (and Well-Posedness) 397
3.9.1 The u-P Formulation 397
3.9.2 The PPE Formulation 400
3.9.3 Vorticity-Based Methods 401
3.10 Interim Summary 403
3.10.1 A Well-Posed IBVP for Incompressible Flow, and the Equivalence 'Theorem' 403
3.10.2 Some Ill-Posed Problems 406
3.10.3 The Simplified PPE Is Also Ill-Posed 407
3.10.4 Fixing the SPPE, and the PPE Paradox 409
3.10.5 PPE Solutions That Are Not NSE Solutions 410
3.10.6 A Remark on the Penalty Method 412
3.10.7 Key Features of Incompressible Flow 412
3.11 Global Conservation Laws 412
3.11.1 Conservation of Mass 413
3.11.2 Momentum Conservation 413
3.11.3 Kinetic Energy Conservation 413
3.11.4 Vorticity Conservation 415
3.11.5 Enstrophy Conservation 417
3.12 Weak Forms of the PDE's/Natural Boundary Conditions (NBC's) 418
3.12.1 The Conventional u-P Formulation and the Stress-Divergence Form Combined 418
3.12.2 Other u-P Formulations 426
a. Full Divergence Form 426
b. Skew-Symmetric Form 427
c. Rotational Form and Other Recent Forms 427
d. Other Recent Formulations 430
e. Divergence-Free Basis Functions 431
3.12.3 Pressure Poisson Equation Formulations 432
3.12.4 The Stream Function-Vorticity Formulation 433
3.12.5 Some Ill-Posed Formulations 435
3.13 The Finite Element Equations/Discretization of the Weak Form 438
3.13.1 Detailed Derivation of One u-P Formulation 439
a. Continuum Formulation 439
b. GFEM Equations 440
c. Matrix-Vector Representation 444
d. Ill-Posed Equations 449
e. Normal and Tangential BC's 450
f. Axisymmetric Case 455
g. Fixing Ill-Posed Dirichlet BC's 456
3.13.2 The Choice of Elements 457
a. Introduction and Summary Tables 457
*b. Null Spaces and Their Effects; Pressure Modes 469
*c. LBB-Stability/Div-Stability 503
d. Bringing LBB to the Rest of the People 510
e. Penalty Methods 523
f. Some 2D vs 3D Considerations 533
3.13.3 Stabilization (D. J. Silvester) 533
a. Stable vs. Stabilized Methods 534
b. Equal Order Interpolation via Stabilization 535
c. Stabilized Approximations Using Discontinuous Pressure 541
d. Impact on Iterative Solvers 546
e. Recommendations 549
3.13.4 The Discrete Pressure Poisson Equations (PPE's) 550
a. The Consistent PPE 550
b. Some Inconsistent (Approximate) PPE's 552
3.13.5 Additional Detailed Discussion of the Slightly UNSTABLE but Highly USABLE Q1Q0 Element 554
a. Introduction 554
b. General Problem Statement 554
c. Interior Momentum Equation 557
d. Interior PPE 561
e. Boundary Momentum Equations/NBC's 564
f. The PPE at Boundaries 570
g. Flow Past a Flat Plate 581
h. Flow Past a Corner 583
i. Div u = 0 as a PPE BC 586
j. Q1Q0 Convergence Proof 587
k. Quantitative Description of Some Unstable Modes 596
l. The Boundary Vector, g 601
3.13.6 Higher-Order Elements 604
a. Q2Q1 605
b. Q2Q1 607
c. Q2P1 608
3.13.7 Divergence-Free Elements (and Methods) 609
3.13.8 Conservation Laws Revisited 614
3.13.9 Periodic Boundary Conditions 617
3.14 A Control Volume Finite Element Method 622
*3.15 Variational Principles for Potential and Stokes Flow 626
3.15.1 Introduction 626
3.15.2 Discrete Stokes 627
3.15.3 Discrete Potential 631
3.15.4 Continuous Potential 632
3.15.5 Continuous Stokes 636
3.16 Solution Methods for the Semi-Discretized Time-Dependent (and Steady) Equations 639
3.16.1 Some Time-Integration Methods for the DAE's 643
a. Primitive Equations/Index 2 645
b. PPE Methods/Index 1 651
c. Error Analysis for Index 1 and 2 655
d. Some Numerical Results (Taylor Vortex) 660
*3.16.2 A Model DAE Problem 677
a. Introduction 677
b. Index 2 678
c. Index 1 685
d. Index 0 687
e. Penalty 687
f. Energetics 691
g. Numerical Integration 692
h. Final Exercise 700
*3.16.3 Analytical Solution of the Stokes Equations 701
a. Introduction 701
b. Index 2 701
c. Index 1 704
d. Linear Stability Theory 706
3.16.4 Three Variable-Step Implicit (Index 2) Methods and Some Steady-State Methods 707
a. Introduction 707
b. Trapezoid Rule 707
c. Backward Euler 713
d. BDF2 715
e. Discussion 717
f. Penalty Method 718
3.16.5 An Explicit (Index 1) Method, Plus a Few Tricks 722
3.16.6 Semi-Implicit Projection Methods 734
a. Introduction 734
b. Derivation of an 'Optimal' Projection Method, Simplifications Thereto, and Analysis Thereof 736
c. A GFEM (Almost) Implementation of the Second-Order Projection Method: Projection 2 759
d. A Sampling of Projection Methods Used by Others 770
3.16.7 Fully-Implicit Segregated Solution Methods: Transient and Steady-State 773
3.16.8 A Fractional-Step (Index 2) Method 778
3.16.9 Other Methods (Used by Others) 782
a. Methods Based on Trajectories/Characteristics 782
b. Methods Based on Least Squares (LSFEM) 783
c. Methods Based on Galerkin Least Squares 785
3.16.10 A Strategy for Hastening Steady Solutions 785
3.17 Aliasing and Aliasing Instability, Linear and Non-Linear 786
*3.18 A New Look at Two Old Finite Difference Methods 790
3.19 Numerical Example: Impulsive Start 794
3.19.1 Introduction 794
3.19.2 Domain, Mesh, BC's, IC's 796
3.19.3 Two Steady-state Results (ν = 0, ∞) 797
3.19.4 Pressure Impulse 805
3.19.5 Minimum Time of Believability, Re = 1000 Results 806
3.19.6 Transient Stokes Flow 809
3.19.7 Divergence for h → 0 815
3.19.8 A Better Model 821
3.19.9 Drag Coefficients 822
3.19.10 A New Analytical Model 826
3.19.11 A Better Mesh 833
3.19.12 Δt vs. t 838
3.19.13 A Deficient Mesh Design 840
3.19.14 Concluding Remarks 842
3.20 Closure: Some Additional Remarks on the Pressure 844
4 Derived Quantities 847
4.1 Introduction 847
4.2 Two Dimensions 848
4.2.1 Smoothing in General 848
4.2.2 Vorticity 849
4.2.3 Stream Function 851
4.2.4 Heat Flux 853
4.2.5 Forces and Moments 859
4.2.6 A Recommended Method for Computing First Derivatives at Nodes 862
4.2.7 Particle Paths 865
4.2.8 Effective Peclet (Reynolds) Number 867
*4.2.9 Pressure Smoothing and Node Moving for Q1Q0 868
4.3 Three Dimensions 871
4.3.1 Vorticity 871
4.3.2 Helicity Density 871
Appendix 1 Some Element Matrices 873
A.1.1 Advection-Diffusion Matrices 873
A.1.2 One-Dimensional Element Matrices 873
A.1.3 Two-Dimensional Element Matrices 874
A.1.4 Navier-Stokes; Additional Matrices 878
A.1.4.1 Gradient Matrix 878
A.1.4.2 Divergence Matrix 878
A.1.4.3 Consistent Laplacian Matrix 878
A.1.5 Two-Dimensional Control Volume Finite Element Matrices 881
Appendix 2 Further Comparison of Finite Elements and Finite Volumes 883
A.2.1 Introduction 883
A.2.2 Viewpoint One 883
A.2.3 Viewpoint Two 891
Appendix 3 Projections, Orthogonal and Not, and Projection Methods 897
A.3.1 Introduction 897
A.3.2 Scalar Projections 901
A.3.2.1 The L2-Projection, [symbol illegible] 902
A.3.2.2 The L2-projection, [symbol illegible] 904
A.3.2.3 The H1-Projection, [symbol illegible] 911
A.3.2.4 The H1-projection, [symbol illegible] 920
A.3.2.5 The Projection Method 925
A.3.2.6 Brief Discussion of GFEM Errors on Elliptic BVP's 929
A.3.2.7 Numerical Examples 931
A.3.3 Vector Projections 940
A.3.3.1 The [symbol illegible]-Projection 942
A.3.3.2 The [symbol illegible]-Projection 945
A.3.3.3 The [symbol illegible]-Projection 951
A.3.3.4 The [symbol illegible]-Projection 954
A.3.3.5 Sequential Projections 956
A.3.3.6 The Projection Method 957
A.3.3.7 Ranking Elements via Projections 960
References 961
Author Index 1009
Subject Index 1017
Preface
There are many ways to 'do' CFD (computational fluid dynamics) today, and there will
undoubtedly be more rather than fewer 'tomorrow'. C'est la vie of CFD. There are also
several (not 'many') ways for one to become acquainted with the need or desire to
actually do CFD. The latter, of course, can have a major impact on the former. Rather
than attempting to elaborate on either of the above subjects (a probably pedantic,
difficult, and mostly useless exercise), we shall simply state our own opinion up-front (and
'opinion' it must be, as the jury is still out, and likely to remain so, regarding 'How best
to do CFD'): the Galerkin finite element method (GFEM) is one of the good ways to
'do CFD'—especially when flows in or around 'real world' (complex) geometry are of
principal interest. Note that 'good' does not necessarily imply easy, or robust. It does,
in our view, imply accuracy and generality—and, in some sense, 'honesty'. It is an
objective and honest method that tries to remain true to the underlying PDE's (partial
differential equations). Hopefully there is still a market for a method that displays these
characteristics.
There is also a significant market for what we perceive to be less honest methods;
namely, those modifying the Galerkin principle in various seemingly ad hoc ways such
as 'upwinding' and related stabilizing and artificially dissipative methods. Such methods
would be acceptable to us (and, we believe, to many others) if, in addition to the continual
demonstration of their more-or-less acknowledged robustness, they would always be used
in conjunction with appropriate mesh refinement efforts that would convince both giver
and receiver that their final results do represent an accurate approximate solution to the
stated problem. If these 'final' results require a significantly finer mesh than does a
(frequently harder-to-obtain) wiggle-free GFEM solution, then there is clearly room for
'compromise' methods that cleverly combine robustness and accuracy in an efficient way.
This book was actually 'conceived' way back in 1981. That it did not actually get
written until 'now' (1989-1997) is probably a good thing, as a book back then would have
been premature. (We, at least, have learned many important aspects of the subject—FEM
in fluids—that were opaque back then, as most of the world was still in the Dark Ages
relative to the present.) Some may say, perhaps rightly, that we are still premature. Others,
however, will (we hope!) say 'It's about time! Promises, promises!'
Finite element methods (FEM's) are not as easy to understand as finite difference
methods (FDM's) or even finite volume methods (FVM's). The incompressible Navier-
Stokes equations (NSE's) are also not easy to understand. Finite elements applied to
incompressible flows can be especially hard to understand. One of the major goals of this
book is to significantly clarify the use of FEM for discretizing the incompressible NSE's.
Another major goal has been to make the FEM more accessible/understandable—even
desirable (!)—to the many engineers and applied scientists who might use it, but who are
put off by, perhaps even frustrated by, some of the heavy mathematical machinery that
is often associated with the subject. Thus, this book is also intended to place a readable
finite element text in the hands of would-be FEM practitioners who might otherwise select
an alternative method by presenting a lucid and reasonably uncluttered explanation of the
FEM and by presenting and interpreting the final discrete equations in the form of simple
finite difference equations. We even occasionally digress to derive and discuss a particular
finite volume method—and compare it with the GFEM.
On the other hand we intend—as the title suggests—that this book also be useful if
only the first part of the title is of interest to the potential reader; for example, for those
who prefer or are already committed to other numerical methods than the FEM and/or
those who are not even interested in numerical computations but only wish to learn more
about incompressible fluid dynamics.
In contrast to some texts that provide only broad brush introductory information on
many topics, this text is clearly focused on providing the important details of only a
few useful-but-important topics—including much previously unpublished related research
results and many illuminating discussions of one of the most mysterious parts of
incompressible fluid dynamics: the pressure. Also, after careful derivation of the semi-discretized
NSE's via the GFEM, the resulting differential-algebraic equations (DAE's) are—for the
first time—addressed in great detail. This book puts modern theory and methods of
ordinary differential equations (ODE's—for the advection-diffusion equation) and DAE's
into the hands of the CFD practitioner, complete with robust algorithms that
automatically select the proper time step size (small enough to 'follow the physics' but no
smaller—and large when possible). The time integration methods presented in this book
are also virtually independent of the underlying spatial approximation method—and thus
will also be useful to those using FDM, FVM, or spectral methods (another Galerkin
method).
The FEM enjoys and utilizes, like spectral methods but not like FDM's, the particularly
useful aspect of providing a final solution in a well-defined functional form; no ad hoc
interpolation procedures are needed to determine the solution away from a node point.
Another advantage is the systematic and consistent generation (via weighted residual/error
distribution principles) of the appropriate ODE's, DAE's, and/or linear and nonlinear
algebraic equations—with no need of 'intervention' by the analyst to consider, for example,
how to treat a certain derivative at or near a boundary.
In addition to removing much of the mystique related to 'natural boundary conditions'
(NBC's), we also explain and emphasize consistency in (at least) the following six areas
that even now are not as widely known and appreciated as they should be: (i) consistent
mass matrices, (ii) consistent heat flux, (iii) consistent penalty methods, (iv) consistent
PPE (Pressure Poisson Equation), (v) consistent normal direction, (vi) consistent forces
and moments.
We hope that this book will be useful to each of the following four 'types': graduate
students, researchers, code developers, and code users. Although we are definitely not
writing the book for mathematicians, we hope and believe that they too will find much that
is of interest. We are writing the book, at least in part, for that class of people whom Strang
and Fix (1973) have referred to as 'mathematical engineers' in the following sense: (i) We
assume that the (average) reader does not have a PhD in applied mathematics/numerical
analysis. (ii) We do assume that the reader has an advanced degree in applied physical
science or engineering and an interest in the mathematics of CFD. (iii) We do assume
a 'reasonable' background/facility with some relatively advanced mathematics (ODE's,
linear algebra, PDE's, and numerical analysis), and, finally, (iv) we do assume the reader
has a reasonable background in fluid mechanics.
The task of writing a book soon brings the putative author face-to-face with a stark
and embarrassing reality: he knows a bit less about 'his' subject than he had thought.
Even though one may like to think that writing a book turns one into an authority of sorts,
this is unfortunately not the case. Rather, it helps to define/reveal the author's level of
ignorance. Also, we realize well the bane of frustrated researchers trying to learn from the
literature: a confusing misprint, or, worse yet, a misleading misprint or statement. So let us
profusely apologize in advance for errors of this type (which there will inevitably be)—as
well as those that are even more serious/harmful: errors in conceptual understanding that
show up in the form of promulgating our own misunderstanding.
The scope of this text is both narrow and broad; it is narrow in that it covers only
advection-diffusion and isothermal laminar flow, and it is broad because these
important 'basic' topics are covered in much detail. Looking ahead to a second volume, with
the tentative subtitle, 'Additional and Advanced Topics', we plan to further broaden the
scope via (at least) the following chapters: Flows with coupled transport (for example,
the Boussinesq equations for both stratified flows and buoyancy-driven flows); stability,
continuation, and bifurcation (advanced analysis of steady flows); free and/or moving
boundaries/interfaces; turbulent flows and turbulence modeling; other specialized
applications; good simulation practices—with appendices that discuss non-dimensionalization,
solution methods for algebraic systems (linear and nonlinear), and (perhaps) weak
operators.
Before acknowledging the assistance provided by others in this endeavour, we wish to
echo the too-true words of Telionis (1981) in his preface: 'At the end, one realizes that a
manuscript is never finished. It is simply abandoned.'
The first person owed thanks is Dr. Joseph B. Knox, who provided both intellectual
support and an excellent research environment at LLNL during the 'early' years (circa
1975-1982 or so). Next, we opine that the ultimate portion of the CFD learning curve
is that traversed by writing code and making runs. Special thanks are offered here to
several important contributing LLNL colleagues: Bob Lee (an early and able 'partner'),
Steve Chan (a later and longer partner), and John Leone, Jr (last but not least—he also
provided much help with many numerical results that are presented for the first time in
this text). Thanks also to the Institut Universitaire des Systèmes Thermiques Industriels in
Marseille, directed by J. Pantoloni and J. Martin, for providing computational resources
and a stimulating environment for RLS during the final stages of this book. RLS would
also like to acknowledge his family and the family Aubert in Marseille, France, for
encouragement and support during the project, as well as the Council on Research and Creative
Work at the University of Colorado for providing financial support via a faculty
fellowship. PMG also acknowledges the sabbatical-like environment provided by the University
of Minnesota in the form of a George T. Piercy Distinguished Visiting Professorship in the
fall of 1989. And special thanks to Damien Veyret of the CNRS-UMR 6595 groups, who
provided invaluable assistance in the generation of some of the new numerical results.
The following alphabetical list, probably incomplete, of professional colleagues who
have contributed indirectly (usually) but significantly to this product is presented with
a general but sincere 'Thank You:' Doug Arnold, Alex Chorin, Peter Brown, Michel
Fortin, Dave Gartling, Vivette Girault, Max Gunzburger, John Heywood, Dave Malkus,
Ty Olson, 'Nick' Nicolaides, Tinsley Oden, Olivier Pironneau, Rolf Rannacher, Rolf
Stenberg, and David Silvester.
Finally, no mere words of thanks can repay the active and able assistance provided
by two mathematicians/friends who have helped us to remain 'honest' over many years
and, most importantly for this project, during the entire preparation of this book: David
Griffiths and Alan Hindmarsh.
And to those whose help we have forgotten to acknowledge explicitly: please accept
both our thanks and our apologies.
This book is dedicated to the person who, via the significant labor of 'true' TeX (not
LaTeX) and the ability to improve many (but not all!) clumsy sentences, worked as hard
as anyone: Doris Getsla Gresho
PMG
RLS
January 1998
GLOSSARY OF ABBREVIATIONS
AB        Adams-Bashforth
AD        advection-diffusion
BC        boundary condition
BDF       backward differentiation formula
BE        backward Euler
BHC       biharmonic catastrophe
BHE       biharmonic equation
BHM       biharmonic miracle
BL        boundary layer
BLT       boundary layer thickness
BTD       balancing tensor diffusivity
BVP       boundary value problem
CB        checkerboard
CFD       computational fluid dynamics
CG        conjugate gradient
CM        consistent mass
CPPE      consistent pressure Poisson equation
CVFEM     control volume finite element method
DAE       differential-algebraic equation
DSCG      diagonally-scaled conjugate gradient
DTSF      delta t scale factor
FDM       finite difference method
FE        forward Euler
FEM       finite element method
FVM       finite volume method
GFEM      Galerkin finite element method
GFEMIA    Galerkin finite element method intelligently-applied
HOT       higher-order terms
IBVP      initial boundary value problem
IC        initial condition
IMR       implicit midpoint rule
IRK       implicit Runge-Kutta
IVP       initial value problem
LBB       Ladyzhenskaya-Babuška-Brezzi
LDC       lid-driven cavity
LHS       left-hand side
LM        lumped mass
LMM       linear-multistep method
LTE       local truncation error
NR        Newton-Raphson
NS        Navier-Stokes
NSE's     Navier-Stokes equations
OBC       outflow/open boundary condition
ODE       ordinary differential equation
PDE       partial differential equation
PPE       pressure Poisson equation
RHS       right-hand side
RK        Runge-Kutta
RMS       root mean square
SPD       symmetric positive definite
SPPE      simplified pressure Poisson equation
SS        steady state
STR       shortened trapezoidal rule
TBD       to be determined
TR        trapezoid rule
TS        Taylor series
VBL       viscous boundary layer
VTE       vorticity transport equation
1 INTRODUCTION
1.1 INTRODUCTION
This short chapter is intended primarily to help bring the 'novice' up-to-speed by supplying
a sufficiently-generous (but by no means complete) sampling of the relevant introductory
and contemporary related literature. A secondary purpose is to provide a sort of 'road-
map' to chart our plans for the rest of the book. A final purpose is to state 'up-front'
what is not covered in this book.
We assume a certain level of 'sophistication' on the part of the reader, which we now
attempt to define.
1. The basic concepts ('physics') and equations of fluid mechanics—especially those for
incompressible flow—should be reasonably well-known and understood. (We will further
your understanding.)
2. The (Galerkin) finite element method (GFEM) should also be in the reader's sphere
of knowledge, as we begin at well-above an introductory level (although we often 'drop
back down' to generate discrete equations).
3. We hope that an 'intermediate' level of mathematical sophistication—such as that
attained by about the end of a 'second' course in finite elements would have provided—is
all that is required.
Related to this last point, we quote from John Whiteman (1975) the following very-
relevant, but perhaps (for some 'engineers') somewhat frightening summary of FEM and
mathematics:
It is perhaps part of the fascination of the subject that so many branches of mathematics
are involved in the theory of finite elements. As can be seen from the above, this draws
on such areas as functional analysis, the theory of differential and integral equations,
variational principles, optimisation, interpolation, approximation and the solution of
linear and nonlinear systems. The task of becoming conversant with this wide spectrum
of knowledge is indeed a challenge.
But we will not spend much time on 'the theory of finite elements' (there are quite
enough publications on this subject), as we try to get the subject matter of our book
into the hands of 'practitioners'—code writers and code users. But it is a fact of life
that the FEM is somewhat unique in the sense that it is a 'perfect' subject for heavy
involvement by engineers and applied scientists on the one hand and by a wide range of
applied mathematicians on the other. We present in this text a product of our experience in
which we have strongly attempted to keep 'one foot in each camp'. While we try to keep
'advanced' mathematics to a minimum (for example, we invoke very little 'functional
analysis,' as we mostly avoid the sophisticated subject of error analysis), we also hope
that the less sophisticated reader who masters the material herein will be able to proceed
more confidently into much of the FEM literature. In their excellent and timely text,
Strang and Fix (1973) stated in their preface,
... this book is absolutely not intended for the exclusive use of specialists in numerical
analysis. On the contrary, we hope it may help establish closer communication between
the mathematical engineer and the mathematical analyst.
In our opinion, they succeeded well—and we would be quite pleased if our book
is judged in a similar vein, but with a slightly different slant: we would like the not-
necessarily-mathematical engineer to learn, appreciate, and use effectively the FEM—as
we attempt to further bridge the gap between mathematicians and 'engineers'.
Two related historical facts/events that we believe worth reporting here are the
following, the first from Whiteman again (1981):
Over a decade ago I reached the conclusion that there existed a very definite and
detrimental lack of communication between those mathematicians, who were at that
time working in increasing numbers on the mathematical theory of finite elements, and
engineers who were routinely using finite element methods to solve practical problems.
The second is from Tinsley Oden (1991):
One unfamiliar with aspects of the history of finite elements may be led to the
erroneous conclusion that the method of finite elements emerged from the growing wealth
of information on partial differential equations, weak solutions of boundary value
problems, Sobolev spaces, and the associated approximation theory for elliptic boundary
value problems. This is a natural mistake, because the seeds for the modern theory of
partial differential equations were sown about the same time as those for the
development of modern finite element methods, but in an entirely different garden. ... the rich
mathematical theory of partial differential equations, which began in the 1940's and
1950's, blossomed in the 1960's, and is now an integral part of the foundations of not
only partial differential equations but also approximation theory, grew independently
and parallel to the development of finite element methods for almost two decades. ...
It was, perhaps, an unavoidable occurrence, that in the late 1960's these two
independent subjects, finite element methodology and the theory of approximation of partial
differential equations via functional analysis methods, united in an inseparable way,
so much so that it is difficult to appreciate the fact that they were ever separate.
Next, we summarize our 'interpretation' of the reason that the Galerkin Finite Element
Method is so useful and powerful (cf. Fourier series and other eigenfunction expansion
methods).
1. The approximate solution is represented as an expansion in a convenient (and finite)
set of linearly-independent basis functions (piecewise-polynomials) with coefficients to
be determined.
2. Placing the expansion into the governing PDE results in an error, called the residual.
3. Application of the 'Galerkin principle' yields a set of equations for the undetermined
coefficients as follows. Since the basis functions form a complete set of functions and since
the only function that is orthogonal to a complete set of functions is the zero function,
it makes very good sense to orthogonalize the residual function against the same basis
functions.
4. The choice of piecewise-polynomials and the 'Galerkin principle' more or less 'define'
the GFEM—the former leading to sparse matrices and the latter (often) to optimal
accuracy. Of course there are other linearly-independent sets of functions against which one
might orthogonalize the error—leading to, for example, Petrov-Galerkin methods. More
general yet is the so-called 'method of weighted residuals', of which the Galerkin method
is one member; for background here, see Crandall (1956), Finlayson and Scriven (1966),
and Finlayson (1972). But we prefer the 'Galerkin method' for reasons that will be clarified
throughout the book.
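The four steps above can be sketched for the simplest possible case. The following is only an illustrative sketch, not the book's own code: it assumes the model problem -u'' = f on (0, 1) with u(0) = u(1) = 0, piecewise-linear ('hat') basis functions on a uniform mesh, and a one-point nodal quadrature for the load term (exact when f is constant). The function name galerkin_poisson_1d and its arguments are hypothetical.

```python
def galerkin_poisson_1d(n_elements, f=lambda x: 1.0):
    """Galerkin FEM sketch for -u'' = f on (0, 1), u(0) = u(1) = 0.

    Uses piecewise-linear hat functions on a uniform mesh. Requiring the
    residual to be orthogonal to every hat function (after one integration
    by parts) gives a sparse, tridiagonal system for the coefficients.
    """
    if n_elements < 2:
        raise ValueError("need at least two elements (one interior node)")
    h = 1.0 / n_elements
    n = n_elements - 1                      # number of interior nodes

    # Stiffness entries: integral of phi_i' * phi_j' over the domain.
    diag = [2.0 / h] * n
    off = [-1.0 / h] * (n - 1)              # symmetric: same above and below

    # Load entries: integral of f * phi_i, approximated by h * f(x_i)
    # (an assumption of this sketch; exact for constant f).
    rhs = [h * f((i + 1) * h) for i in range(n)]

    # Thomas algorithm (tridiagonal Gaussian elimination).
    for i in range(1, n):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - off[i] * u[i + 1]) / diag[i]

    nodes = [i * h for i in range(n_elements + 1)]
    values = [0.0] + u + [0.0]              # re-attach the boundary values
    return nodes, values


nodes, values = galerkin_poisson_1d(8)
exact = [x * (1.0 - x) / 2.0 for x in nodes]   # exact solution for f = 1
max_error = max(abs(a - b) for a, b in zip(values, exact))
```

For f = 1 the nodal values of this particular one-dimensional problem coincide with the exact solution to rounding error, which is a special property of the model problem and should not be expected in general.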
We conclude this introductory discussion with the following remark: we hope, as the
title suggests, that this book will also be useful if only the first part of the title is of interest;
for example, for those who prefer numerical methods other than the FEM or for those
who are not at all interested in numerical computations (CFD) but only in Incompressible
Fluid Dynamics.
1.2 INCOMPRESSIBLE FLOW
As stated earlier, we must assume that the reader is at least somewhat versed in this
subject, but we will offer a short list of references that we have found useful—hopefully
some on our list will actually be 'new' to the reader.
But before doing this, let us help to 'define'/review the subject via some citations from
both a classic text and a neo-classic (?) one:
A fluid is said to be incompressible when the density of an element of fluid is not
affected by changes in the pressure. [Batchelor (1967, p. 75)]
The layman is usually surprised to learn that the pattern of the flow of air can be
similar to that of water. From a thermodynamic standpoint, gases and liquids have
quite different characteristics. As we know, liquids are often modeled as
incompressible fluids. However, 'incompressible fluid' is a thermodynamic term whereas
'incompressible flow' is a fluid-mechanical term. We can have incompressible flow of
a compressible fluid. [Panton (1984, p. 228)]
I regard the flow of an incompressible viscous fluid as being at the centre of fluid
dynamics by virtue of its fundamental nature and its practical importance. [Batchelor
(1967, p. xiv)]
In the remainder of this book we therefore concentrate on the flow of a fluid which
possesses inertia and viscosity but which is effectively incompressible. This programme
might appear to be modest, but it lies at the center of fluid mechanics and both deserves
and requires serious study. [Batchelor (1967, p. 173)]
Following on from these relevant words, we add just a few of our own: While it is well
and good to remember that incompressible flow ($\nabla \cdot \mathbf{u} = 0$) is always only an
approximate conservation of mass equation ($\partial \rho / \partial t + \mathbf{u} \cdot \nabla \rho = -\rho \, \nabla \cdot \mathbf{u}$ being the exact equation),
it is also always true that the exact mathematical consequences of this 'approximate'
conservation of mass equation are both significant and profound, as we shall see.
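For reference, the link between the exact mass-conservation statement and the incompressibility constraint can be written out in three lines. This is a standard textbook derivation added here as a sketch; the material-derivative notation $D/Dt$ is an assumption of this note and may differ from the notation used later in the book.

```latex
\begin{align*}
\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{u}) &= 0
  && \text{(conservation of mass)} \\
\frac{\partial \rho}{\partial t} + \mathbf{u} \cdot \nabla \rho
  &= -\rho \, \nabla \cdot \mathbf{u}
  && \text{(product rule applied to } \nabla \cdot (\rho \mathbf{u})\text{)} \\
\frac{D\rho}{Dt} \equiv \frac{\partial \rho}{\partial t} + \mathbf{u} \cdot \nabla \rho = 0
  \;&\Longrightarrow\; \nabla \cdot \mathbf{u} = 0
  && \text{(density constant following a fluid element, } \rho > 0\text{)}
\end{align*}
```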
Returning for a minute to real fluids, please consult (at least) both Batchelor's
and Panton's books to see under what conditions it is a 'good' approximation to
model a compressible fluid via the incompressible assumption—the perhaps most
important/common condition being related to the Mach number, $Ma = q/c$, where $q$ is
the fluid's speed and $c$ is the speed of sound in the fluid: if $Ma^2 \ll 1$ everywhere (say
$Ma^2 < 0.1$, that is, $Ma < \sim 0.3$), the flow will behave basically incompressibly. (Note
that $\nabla \cdot \mathbf{u} = 0$ implies that $Ma = 0$.)
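The rule of thumb just quoted is easy to apply in practice. The sketch below is only illustrative: the function names are hypothetical, the threshold of 0.1 on the squared Mach number is the 'say' value from the text rather than a sharp limit, and the sound speed of 340 m/s used in the example is an assumed round figure for air near room temperature.

```python
import math


def mach_number(speed, sound_speed):
    """Mach number Ma = q / c for flow speed q and sound speed c (same units)."""
    return speed / sound_speed


def is_effectively_incompressible(speed, sound_speed, ma2_limit=0.1):
    """Apply the rule of thumb Ma^2 < ma2_limit (0.1 is the text's 'say' value)."""
    return mach_number(speed, sound_speed) ** 2 < ma2_limit


def mach_limit(ma2_limit=0.1):
    """Mach number corresponding to the squared-Mach threshold."""
    return math.sqrt(ma2_limit)


# Illustrative values only: c = 340 m/s is an assumed sound speed for air.
slow_ok = is_effectively_incompressible(50.0, 340.0)    # Ma is about 0.15
fast_ok = is_effectively_incompressible(150.0, 340.0)   # Ma is about 0.44
```

The check must hold everywhere in the flow, so in a computation it would be applied to the maximum speed in the domain, not to a representative or inlet value.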
The rest of our brief 'survey' of incompressible flow is presented in the form of a briefly
annotated table, with some relevant educational quotations, in (roughly) chronological
order (Table 1.1)—about which we make three remarks: (i) The books listed cover
an incredibly large range of mathematical tools/approaches for examining/studying the
NSE's. (ii) Our 'sampling' is, of course, incomplete. (iii) Abbreviations are defined in
the Glossary of Abbreviations.
Table 1.1 Some books on incompressible flow.
Author | Title (year) | Comments
J. Serrin | Mathematical Principles of Classical Fluid Mechanics (1959) | Classic; careful discussion, via vector and tensor calculus, of both compressible and incompressible flows.
L. Landau and E. Lifshitz | Fluid Mechanics (1959) | Much emphasis on 'basics'/physics; very broad coverage but mostly compressible.
H. Schlichting | Boundary Layer Theory (1955-1979) | The classic text on the subject; it covers laminar, turbulent, and thermal boundary layers, mostly incompressible.
R. Bird, W. Stewart and E. Lightfoot | Transport Phenomena (1960) | Classic introductory text on momentum, heat, and mass transfer; many practical/tutorial examples.
G. Batchelor | An Introduction to Fluid Dynamics (1967) | Another classic, at an intermediate level, focussing on incompressible flow; a 'must' for the serious student.
O. Ladyzhenskaya | The Mathematical Theory of Viscous Incompressible Flow (1963-1969) | The title tells all for this classical mathematical text. Most of the discussion is in Sobolev spaces (functional analysis methods). See footnotes 1-3.
D. Tritton | Physical Fluid Dynamics (1977) | The focus is on physics, not mathematics; broad coverage of incompressible fluid dynamics including rotational and thermal effects.
R. Temam | Navier-Stokes Equations (1979, 1984) | From the preface: 'This book stands at the boundary between computational fluid dynamics and mathematical analysis to which CFD is firmly tied'. The 'analysis' is in function space, but some FDM's (and FEM's) are also discussed.
M. Van Dyke | An Album of Fluid Motion (1982) | Beautiful photographs of 'all' of fluid mechanics; 'a picture is worth a thousand words', and 'some' equations.
H. Lugt | Vortex Flow in Nature and Technology (1983) | A unique and mostly qualitative approach (with some mathematics) in which the theory of vorticity dynamics is used to illustrate many fluid-mechanical phenomena.
R. Panton | Incompressible Flow (1984, 1996) | A good beginning-to-intermediate-level text; the mathematics is 'classical'; broad range of topics.
H-O. Kreiss and J. Lorenz | Initial-Boundary Value Problems and the Navier-Stokes Equations (1989) | Much hyperbolic and parabolic PDE analysis (linear and non-linear) in 1 to 3 dimensions; mostly periodic BC's for NSE's; the depth is in IBVP's, less extensive is NSE's; see footnotes 4 and 5.
P. Constantin and C. Foias | Navier-Stokes Equations (1989) | A short treatise using modern techniques in PDE's; also discusses attractors, fractal dimensions, and inertial manifolds. See footnotes 6-8.
G. Galdi | An Introduction to the Mathematical Theory of the Navier-Stokes Equations. Volume I: Linearized Steady Problems. Volume II: Nonlinear Steady Problems (1994) | Modern, comprehensive, and thorough (over 750 pages), and mathematical (see footnote 9); the time-dependent case (Volume III) is anxiously awaited.
1. 'However, the only way to verify what the Navier-Stokes equations really have to say about the
motion of actual fluids is first to carry out a rigorous mathematical analysis of the solution of boundary-
value problems for the Navier-Stokes equations, corresponding to the actual hydrodynamical situations.'
(p. 4)
2. 'Before studying the nonlinear Navier-Stokes equations, we investigate various linearized versions of
the equations. These studies show that the boundary-value problems for the linearized equations always
have unique solutions, and that properties of the operators corresponding to stationary problems are
very much like those of the Laplace operator, while the properties of the operators corresponding to
nonstationary problems resemble those of the heat-conduction operator but have some distinction.' (p. 5)
(See too footnote 7.)
3. 'Finally, we warn the reader who is accustomed to the classical methods of mathematical physics that
the interpretations given here of what is understood by the solution of a problem and what it means to
solve a problem differ from those with which he is familiar. To a large extent, a precise analysis of these
matters is responsible for the success of the investigations reported here.' (p. 6)
4. 'Existence and regularity questions play a fundamental role in computations because the resolution
required depends on the smoothness of the solution, and there is always the danger that one tries to
compute things that do not exist.' (p. ix)
5. 'There is no existence proof except for small time intervals. Thus it has been questioned whether the
N-S equations really describe general flows. ... Possibly a lack of mathematical ingenuity is the reason
for the missing existence proof, and the N-S equations are physically correct.' (p. 1)
6. 'Questions regarding the notions of weak and strong solutions and their relations to classical solutions
are studied in some detail.' (p. viii)
7. 'The difference between the Laplacian for the Dirichlet problem and the Stokes operator originates
from the fact that Leray's projector $P$ and $(-\Delta)$ do not commute in general.' (p. 41) (See too
footnote 2.)
8. 'The need to study weak solutions arises mainly for $d = 3$ because even if $u_0$ and $f$ are very nice
functions, in this case the existence of a classical solution of the Navier-Stokes equations is known, in
general, only for short time intervals.' (p. 63)
9. 'The book is essentially mathematically self-contained: the knowledge of Banach spaces and their basic
properties (completeness, separability, reflexivity) along with some classical results in operator theory
are the only necessary prerequisites to reading this book, which is devoted to students (graduate and
undergraduate) and those mathematicians and applied mathematicians who wish to become acquainted
with the subject.' (p. vii)
1.3 THE FINITE ELEMENT METHOD
As in the previous section, we shall merely provide the interested reader with a
reasonably complete shopping list of books on the FEM. We wait until the next section
to cite some of those who, like us, connected the FEM with fluid mechanics—mostly
incompressible. Our list will range from beginner's books to those requiring a rather
more solid background in advanced mathematics.
But before we do this, we wish to get three things out of the way: (1) the history of
FEM, (2) FEM error analysis via Taylor series, and (3) some brief discussion of 'function
spaces'.
1. For the interested reader, we offer a small sample of papers that trace one or another
aspect of FEM history. We begin by returning to a reference already cited—Oden (1991).
Slightly more recent is Babuska (1994), and more recent yet is the paper by Gupta
and Meek (1996). Additionally, some of the books referred to below trace the early
development of the FEM.
2. Even though the discrete equations generated by the FEM are often 'amenable to' Taylor
series analysis, especially when uniform meshes are employed, we caution the reader not to
draw 'final' conclusions regarding the accuracy of GFEM from such a (necessarily) local
analysis—especially when the 'nodal equations' are not all of the same 'type' [see, for
example, Carey (1976)]. For but a single example: the discrete momentum equation for the
center node of a 9-node (biquadratic) velocity-piecewise-constant pressure approximation
to the incompressible NSE's contains no pressure gradient term. (!) This is a Taylor-series-inconsistent
equation, yet the 'element' converges nicely, though 'slowly' (it is
consistent).
3. We now turn to the subject of function spaces—wherein all the GFEM solutions reside.
So as not to lose a large fraction of our planned audience, we cover the subject as follows:
nearly all that we wish to say about the subject is contained in the 'graphic' shown in
Figure 1.1.
Fig. 1.1 Our GFEM lives in a Banach space that is also in the intersection of two subspaces
of it: a Sobolev space and a Hilbert space.
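Returning briefly to remark 2 above, the kind of local Taylor-series analysis it mentions can be illustrated on the simplest case. The sketch below is an assumption-laden example rather than anything taken from the book: it takes the stiffness row produced by linear elements on a uniform 1D mesh, divided by h, applies it to the smooth function u = sin(x), and compares it with -u''. The function name and the sample point x = 0.7 are hypothetical choices.

```python
import math


def stencil_error(h, x=0.7):
    """Local error of the linear-element stiffness row applied to u = sin(x).

    On a uniform mesh the Galerkin stiffness row for -u'' with linear
    elements is (-u[i-1] + 2*u[i] - u[i+1]) / h; dividing by h once more
    gives an approximation to -u''(x_i). A Taylor expansion predicts an
    error of about (h**2 / 12) * |u''''(x_i)|.
    """
    u = math.sin
    discrete = (-u(x - h) + 2.0 * u(x) - u(x + h)) / h ** 2
    exact = math.sin(x)                 # -u'' when u = sin(x)
    return abs(discrete - exact)


coarse = stencil_error(0.1)
fine = stencil_error(0.05)
ratio = coarse / fine                   # close to 4 for a second-order stencil
predicted = (0.1 ** 2 / 12.0) * abs(math.sin(0.7))
```

As the remark cautions, such an estimate is purely local; it says something about this one nodal equation on a uniform mesh and should not be read as a final statement about the accuracy of the GFEM solution.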
With function spaces out of the way, we now present another briefly annotated table,
this time on the FEM, in approximately chronological order—and again with no intention
of 'completeness'. (For example we list none that are too 'solid-mechanics-oriented'.)
The three remarks regarding Table 1.1 apply equally well to Table 1.2.
We conclude this FEM section with a short discussion on 'element matrices' and
'assembly of equations'—two common and important aspects of the FEM. While we
generally refer the novice to the (more basic, introductory) texts in Table 1.2 for learning
about these things, we believe that there is one 'area' that is not 'completely' covered
there—and that is what we wish to contribute here: element matrices for several common
quadrilateral elements for incompressible fluid mechanics. And this we do only for
the simplest shapes—rectangles; isoparametric element matrices are better left to the
computer. With some apologies for getting a little bit ahead of ourselves in terminology,
we present in Appendix 1 the following element matrices in one and two dimensions:
$M$, $K$, and $N(u)$ (all defined in Chapter 2) for both linear and quadratic basis functions,
and $C^T$ (defined in Chapter 3) for 2D. Also included there are $M$, $K$, and $N(u)$ for the
control-volume-finite-element method (CVFEM; defined in Chapters 2 and 3) and, for
one higher-order element (QjQx), we even show CTM^}C (Chapter 3). We shall refer
to this appendix frequently in the following chapters; we introduce it here because it
is FEM.
Two related remarks:
1. We will actually present the 'nitty-gritty' details of assembling the global semi-discrete
NSE's from the element matrices for one special case ($Q_1Q_0$) in
Section 3.13.5 of Chapter 3.
2. For some advice on numerical integration 'quadrature rules'—required for the
isoparametric case in which only approximate results are generally attainable—see
Leone et al. (1979).
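To make 'element matrices' and 'assembly of equations' concrete, here is a minimal sketch for 1D linear elements of length h. It assumes that M denotes the mass matrix (integrals of products of basis functions) and K the diffusion, or stiffness, matrix (integrals of products of their derivatives); the book defines these in Chapter 2, so the identification here is an assumption, as are the function names. Dense Python lists are used only for readability; a real code would use sparse storage.

```python
def linear_element_matrices(h):
    """Exact 2x2 element matrices for a 1D linear element of length h.

    mass[a][b]  = integral of phi_a * phi_b   over the element
    stiff[a][b] = integral of phi_a' * phi_b' over the element
    """
    mass = [[h / 3.0, h / 6.0],
            [h / 6.0, h / 3.0]]
    stiff = [[1.0 / h, -1.0 / h],
             [-1.0 / h, 1.0 / h]]
    return mass, stiff


def assemble(n_elements, element_matrix):
    """Add identical element matrices into a global (n_elements + 1)^2 matrix."""
    n_nodes = n_elements + 1
    glob = [[0.0] * n_nodes for _ in range(n_nodes)]
    for e in range(n_elements):
        dofs = (e, e + 1)               # global node numbers of element e
        for a in range(2):
            for b in range(2):
                glob[dofs[a]][dofs[b]] += element_matrix[a][b]
    return glob


h = 0.25
mass_e, stiff_e = linear_element_matrices(h)
M = assemble(4, mass_e)                 # global mass matrix on (0, 1)
K = assemble(4, stiff_e)                # global stiffness matrix on (0, 1)
```

Two quick sanity checks follow from the definitions: the entries of the assembled M sum to the length of the domain, and every row of K sums to zero because constants have zero derivative.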
Table 1.2 Some books on finite elements.
Author | Title (year) | Comments
O. Zienkiewicz and R. Taylor | The Finite Element Method. Volume 1: Basic Formulation and Linear Problems (1989) (1st Edition: 1967) | Latest edition of one of the 'standards' on the subject; while each edition tends to be more general and less solid mechanics, the latter still dominates; a basic book that ranges from introductory material on through to some advanced topics.
G. Strang and G. Fix | An Analysis of the Finite Element Method (1973) | A well-deserved classic; an excellent and readable book for the 'mathematical engineer' to learn about error estimates, convergence rate, and more, at an 'intermediate' level of mathematics. See footnotes 1 and 2.
P. Ciarlet | The Finite Element Method for Elliptic Problems (1978) | This is the 'standard' reference for elliptic problems; the mathematical level is fairly high (functional analysis, etc); includes theorems, proofs, and much error analysis. See footnote 3.
E. Becker*, G. Carey and J.T. Oden (*Volume I only) | The Texas Finite Element Series. I. Finite Elements: An Introduction (1981); II. Finite Elements: A Second Course (1983); III. Finite Elements: Computational Aspects (1984); IV. Finite Elements: Mathematical Aspects (1983); VI. Fluid Mechanics (1984) | A series of 'small' books that range from undergraduate level to graduate and postgraduate level; good introductions to the various 'spaces' of solutions; very wide coverage of the subject.
O. Zienkiewicz and K. Morgan | Finite Elements and Approximation (1983) | A sort of 'special-purpose' book that is appropriate for a beginner and for one trained in FDM, for example. See footnote 4.
J.N. Reddy | An Introduction to the Finite Element Method (1984, 1993) | One of the first 'basic' texts that addresses the 'rest' of the science and engineering community. See footnotes 5 and 6.
O. Axelsson and V. Barker | Finite Element Solution of Boundary Value Problems (1984) | Written by mathematicians, this is another text addressed to non-solid-mechanics readers; it emphasizes second-order BVP's and methods of solving the resulting linear algebra problems, and the mathematical level is 'intermediate'.
V. Thomee | Galerkin Finite Element Methods for Parabolic Problems (1984) | Very thorough, very mathematical; much error analysis (with proofs), and in several norms, including $L_\infty$; also full discretizations, including BE and TR; smoothing properties discussed and analyzed.
R. Wait and A. Mitchell | Finite Element Analysis and Applications (1985) | Another book by mathematicians that avoids structural mechanics; intermediate level of mathematics; includes time-dependent problems; concisely introduces function spaces. See footnote 7.
D. Burnett | Finite Element Analysis (1987) | A long, but thorough and useful introductory text that is directed toward the advanced undergraduate in non-solid mechanics curricula. See footnotes 8-10.
T. Hughes | The Finite Element Method: Linear Static and Dynamic Analysis (1987) | Although written by an expert in (at least) solid mechanics, this first level text is an excellent starting point for other branches of engineering and applied science; mixed methods (Stokes flow) are also addressed. See footnote 11.
H. Kardestuncer (Editor) | Finite Element Handbook (1987) | This large four-part volume begins with 'FEM mathematics', written by mathematicians at a high/advanced level; next, 'FEM Fundamentals', is written for and by solid mechanicers; 'FEM Applications' does contain some fluid mechanics; the last section, 'FEM computations', again addresses mostly issues in solid mechanics. (C'est la vie.)
C. Johnson | Numerical Solution of Partial Differential Equations by the Finite Element Method (1987) | Addressed to advanced undergraduates and first-year graduate students, this text does a very good job of introducing the necessary mathematics (but no more) and shows how it applies to a range of problems. See footnote 12.
R. Cook, D. Malkus and M. Plesha | Concepts and Applications of Finite Element Analysis (1989) | While mainly a 'solid mechanics' FEM text, we include it for two reasons: (i) the chapter on constraints, (ii) the Appendix on eigenvalues and eigenvectors is unique and useful.
O. Zienkiewicz and R. Taylor | The Finite Element Method, 4th Edition. Volume 2: Solid and Fluid Mechanics, Dynamics and Non-Linearity (1991) | This is 'Part 2' of the first-listed book; it is more advanced with the first third emphasizing 'solids'; the remainder brings in the other physics, including fluid mechanics, both compressible and incompressible.
A. Baker and D. Pepper | Finite Elements 1-2-3 (1991) | A rather different (unique) approach to introduce the FEM, mostly via heat transfer and advection-diffusion examples, is presented; a disk-supplied code is an integral part of the text; it alleges to streamline the analysis via unconventional hypermatrix terminology. See footnote 13.
D. Pepper and J. Heinrich | The Finite Element Method: Basic Concepts and Applications (1992) | Another introductory text that emphasises heat transfer and not structural mechanics. See footnotes 14-16.
S. Brenner and R. Scott | The Mathematical Theory of Finite Element Methods (1998) | This useful text, written by mathematicians, seems to be a sort of 'modern-day Strang and Fix (1973)', though perhaps more advanced in the mathematics. See footnote 17.
K.J. Bathe | Finite Element Procedures (1996) | Although solid mechanics-oriented, this new text covers several other relevant issues: mixed methods for incompressible behavior, new ways to analyze and understand the inf-sup condition, and incompressible flow. See footnotes 18 and 19.
1. 'Whenever flexibility in the geometry is important—and the power of the computer is needed not only
to solve a system of equations, but also to formulate and assemble the discrete approximation in the first
place—the finite element method has something to contribute' (p. ix)
2. 'This completes the technical error estimates for parabolic problems; there are no surprises in the
results. Our impression is that just as in static problems, finite elements are particularly effective in
coarse mesh calculations, with a large value of h. In this situation, the physics is often more adequately
represented by Galerkin's principle, on which the finite element method is based, than by supposing
difference quotients to be close to the derivative.' (p. 251)
3. 'Although the emphasis is mathematical, it is one of the author's wishes that some parts of the book
will be of some value to engineers, whose familiar objects are perhaps seen from a different view point.
Indeed, in the selection of topics, we have been careful in selecting only actual problems and we have
likewise restricted ourselves to finite element methods which are actually used in contemporary engineering
applications.' (p. vii)
4. 'Many alternative numerical approximation processes existed before the advent of the finite element
method. Here boundary solution techniques and finite difference methods have established their own
useful existence—and proponents of these have at times crossed swords with those advocating finite
element methods in claiming particular superiority. Today some of us see the essential unity of all
approximation processes used in the solution of problems defined by differential equations and in this book we
stress this throughout. We endeavour to show that a 'generalized finite element method' can be defined
embracing all the alternative variants, thus leaving scope for choosing the 'optimal approximation' to the
user.' (p. vii)
5. 'The many discussions I have had with students who had no background in solids and structural
mechanics gave rise to my writing a book that should fill the rather unfortunate gap in the literature.'
(p. xi)
6. 'In introducing the finite element method in Chapters 3 and 4, the traditional solid mechanics approach
is avoided in favor of the "differential equation" approach, which has broader interpretations than a single
special case.' (p. xii)
7. 'In this book, the authors have attempted to provide an introduction to the method by considering both
the theory and the practice. The book is aimed at final-year undergraduate and first-year postgraduate
students in mathematical sciences and engineering. No specialized mathematical knowledge beyond a
familiarity with calculus and elementary differential equations is assumed. The applications are drawn from
many areas and no knowledge of structural mechanics, or any other branch of engineering science, is
assumed. The abstract mathematics is kept to a minimum and is concentrated in a single chapter.' (p. vii)
8. 'Because of these origins, most of the FE literature is permeated with mechanical concepts such
as forces, moments, displacements, rotations, masses, dampers, springs, rods, beams, plates, and
shells. Understandably, many physicists, applied mathematicians, and non-mechanical engineers have
concluded that this field is not for them.' (p. vi)
9. 'Of course, the only technical language understandable to all fields is mathematics, so that is the
approach used here. This, however, presented a challenge: how to avoid a lot of the sophisticated
mathematical concepts and specialized mathematical jargon prevalent in much of the mathematically-
oriented FE literature, while preserving only those concepts necessary for intelligent application of the
FEM.' (p. vi)
10. 'Judging from my own experience and that of my colleagues, such topics as functional analysis and
variational calculus are definitely not needed to understand and use the FEM.' (p. ix)
11. 'Background in structural mechanics (i.e., the theory of beams, plates, and shells) is certainly an asset
when it comes to studying this book but is not essential. Only Chapters 5 and 6 deal exclusively with this
subject, and these chapters may be ignored by students whose interest lies elsewhere. ... In this spirit the
book emphasises fundamental finite element concepts and techniques applicable to a very broad range
of problems and thus constitutes a suitable text for most students in the physical sciences.' (p. xvii)
12. 'The purpose of this book is to give an easily accessible introduction to the finite element method
as a general method for the numerical solution of partial differential equations in mechanics and physics
covering all the three main types of equations, namely elliptic, parabolic, and hyperbolic equations.' (p. 7)
13. 'The intrinsic beauty of the method is that this theory completely accounts for each and every critical
decision that must be made in the design of a numerical algorithm. No other approximation method in use
today is so complete in giving firm and accurate guidance on algorithm construction, consistent boundary
condition implementation, direct adjustment of spatial order of accuracy, and geometric flexibility.' (p. xiii)
14. 'Most practitioners of the finite element now employ Galerkin's method to establish the approximations
to the governing equations. The underlying theme in this book likewise follows Galerkin's method. The
simplicity and richness of the method pays for itself as the user progresses into more complicated and
demanding types of problems. Once this fundamental concept is grasped, application of the finite element
method unfolds quickly.' (p. 2)
15. 'The finite element method is rapidly becoming the de facto standard for numerical approximation of
the partial differential equations which define engineering and scientific problems. Many of the commercial
computer codes currently available are finite element based—especially in the structural and heat transfer
areas.' (p. 3)
16. 'The theory of the finite element method is found in variational calculus, and it is this mathematical
basis that allowed it to be developed in a very short time and makes it the powerful tool for engineers that
it is today. However, this has also created the misconception that a strong mathematical background is
essential in order to understand the finite element method. Here, we will indeed show that this is not the
case and that all of the finite element methodology can be developed utilizing the theorems of advanced
calculus and basic physical principles.' (p. 5) (The reference to variational calculus for the theory is, of
course, mainly addressed to 'classical' FEM and elliptic BVP's.)
17. 'This book develops the basic mathematical theory of the finite element method, the most widely
used technique for engineering design and analysis. One purpose of this book is to formalize basic
tools that are commonly used by researchers in the field but never published. It is intended primarily for
mathematics graduate students and mathematically sophisticated engineers and scientists.' (p. vii)
18. 'My objective in writing this book was to provide a text for upper-level undergraduate and graduate
courses on finite element analysis and to provide a book for self-study by engineers and scientists.'
(p. xiii)
19. 'This text does not present a survey of finite element methods. ... Instead, this book concentrates on
only certain finite element procedures, namely, on techniques that I consider very useful in engineering
practice and that will probably be employed for many years to come.' (p. xiii)
1.4 INCOMPRESSIBLE FLOW AND THE FINITE ELEMENT METHOD
We are certainly not the first to try to tie these two important subjects together—and we
will not be the last (we hope!). We mentioned earlier that the 'simplification' of the mass
conservation equation leads to some significant and profound consequences. Well, such is
the case when one attacks the incompressible NSE's via the FEM—or any other spatial
discretization method, we hasten to add. The 'divergence free constraint' that is placed on
the vector field called velocity causes nearly no end of difficulties, mostly mathematical.
It is no wonder, as anyone who is familiar with the historical development of CFD is no
doubt aware, that many 'slightly compressible' approximations to truly incompressible
flow have been devised. Even today, many 'incompressible' discretizations are not really
that; discrete incompressibility, and all of its associated 'stability' problems, are often
side-stepped by invoking 'stabilized' formulations that permit the discrete divergence to
be proportional to some power (usually the first) of the mesh spacing, rather than strictly
zero. A perusal of the literature over the last
25 years [30 years if the FDM history is also included, which it is in the recent paper by
Williams and Baker (1996)] will reveal many, many (too many, some detractors might
suggest) publications that 'test' or analyze (or both) various types of 'discretizations'
(velocity and pressure basis functions, basically) both for accuracy and for that truly
annoying mathematical concept called 'stability': uniform invertibility of the operator
(roughly).
This is all we care to reveal at this time; the remainder of the 'iceberg' will be detailed
in Chapter 3 and in many if not all of the citations listed in our final historical introduction
table (Table 1.3)—about which the same remarks made regarding Table 1.1 again apply.
In addition to the references cited above, useful material can also be found in (at least)
those listed below—some of which 'occur' periodically:
(i) the FEM in Fluids Series, John Wiley and Sons, in seven volumes, from 1974 through
1988. Also, the FEM in Fluids Conferences, from 1974 through the present (the
next is in January 1998);
(ii) the Numerical Methods in Laminar and Turbulent Flow series, Pineridge Press, in
nine volumes from 1978 through 1995;
(iii) the 'Water Resources Conference' series; called Finite Elements and Water
Resources for the first 10 or so years, beginning in 1976, until 'external influences'
(apparently) caused a name change to Computational Methods in Water Resources;
(iv) various ASME Special Meetings Proceedings—especially the Division of Applied
Mechanics and the Division of Fluids Engineering;
(v) various other 'CFD' conferences—too many to list;
(vi) Annual Review of Fluid Mechanics (annually);
(vii) All journals in which CFD is one of the main subjects.
We conclude this section with a very brief listing of some of the 'pioneering' papers
in which the FEM was applied to incompressible flow: Thompson et al. (1969), Oden
(1970), Cheng (1972), Taylor and Hood (1973), and Hood and Taylor (1974)—from each
of which we learned something when we began our FEM effort in 1975.
Table 1.3 Some books on finite elements and fluid mechanics.

Author | Title (year) | Comments
A. Baker | Finite Element Computational Fluid Dynamics (1983) | Although 'dated' with respect to 'FEM in fluids,' this early text is still useful, both to introduce FEM as well as presenting parabolic (boundary layer) methods; the language of 'hypermatrix' formulation is, however, rather unconventional. See footnote 1.
G. Carey and J.T. Oden | Finite Elements. Volume VI: Fluid Mechanics (1986) | One of the 'Texas series'; another early text that recognized, and responded to, the need; still a useful book. See footnote 2.
V. Girault and P.A. Raviart | Finite Element Methods for Navier-Stokes Equations (1986) | This mathematical text, written by mathematicians, is a 'standard' (advanced) reference for the theory of FEM and incompressible flows, as far as it goes (it does not address the time-dependent case).
C. Cuvelier, A. Segal, and A. Van Steenhoven | Finite Element Methods and Navier-Stokes Equations (1986) | Another 'early' text written by mathematicians, but for engineers (no functional analysis); includes discussions of div-free bases; includes some error analysis and convergence proofs.
M. Gunzburger | Finite Element Methods for Viscous Incompressible Flows (1989) | Although written at a higher level of mathematics (by a mathematician), this text probably comes closer to our own than any other; it is complementary; it covers also stream function-related methods and a few others that we do not. See footnote 3.
O. Pironneau | Finite Element Methods for Fluids (1989) | A small-but-useful book that touches on a wide range of flows: irrotational, incompressible Stokes and NS, compressible Euler, and shallow water equations, plus advection-diffusion. See footnotes 4 and 5.
O. Zienkiewicz and R. Taylor | The Finite Element Method, 4th Edition, Volume 2: Solid and Fluid Mechanics, Dynamics and Non-linearity (1991) | Begun in Volume 1 (mixed formulations), this text devotes about 15% to the subject matter of our book; it is thus somewhat complementary. See footnotes 6 and 7.
F. Brezzi and M. Fortin | Mixed and Hybrid Finite Element Methods (1991) | While incompressible NSE occupies only about 20% of this important mathematical text, it is the 'standard reference' for the analysis of mixed methods and their stability; we appeal to it as the 'authority'. See footnote 8.
M. Gunzburger and R. Nicolaides (Editors) | Incompressible Computational Fluid Dynamics: Trends and Advances (1993) | While not restricted to FEM, this book is still timely and relevant; the state-of-the-art circa 1992, or a segment thereof.
J.N. Reddy and D. Gartling | The Finite Element Method in Heat Transfer and Fluid Dynamics (1994) | Nearly at the opposite end of the 'spectrum' from Girault and Raviart (1986), this text is focussed on applied FEM and shows many results; solution methods (linear and non-linear equations) are also discussed. See footnotes 9 and 10.

1. 'For engineers and scientists whose expertise lies generally outside mathematics or structural analysis,
and in fluid mechanics in particular, these approaches [mathematical or structural] are probably confusing
if not rather incomprehensible and frustrating [Amen!] ... This text addresses this dilemma.' (p. xiii)
2. 'The main framework of the subject for flow problems has now been established. Our objective in this
volume is to develop and discuss the method in this context. Accordingly, the scope has been limited
to the following fundamental classes of flow problems: linear potential flow, compressible inviscid flow,
viscous flow, and transport processes [advection-diffusion].' (p. ix)
3. 'A principal goal is to present some of the important mathematical results that are relevant to practical
computations. In so doing, useful algorithms are also discussed. Although rigorous results are stated, no
detailed proofs are supplied; rather, the intention is to present these results so that they can serve as a
guide for the selection and, in certain respects, the implementation of the algorithms.' (p. xv)
4. 'This course addresses students having a good knowledge of basic numerical analysis, a general idea
about variational techniques and finite element methods for partial differential equations and if possible
a little knowledge of fluid mechanics; its purpose is to prepare them to do research in numerical analysis
applied to problems in fluid mechanics.' (p. 7)
5. 'Computational fluid dynamics (CFD) is in a fair way to becoming an important engineering tool like
wind tunnels. For Dassault Industries, 1986 was the year when the numerical budget overtook the budget
for experimentation in wind tunnels. In other domains, like nuclear security and aerospace, experiments
are difficult if not impossible to make.' (p. 11)
6. 'We were tempted to publish this section [fluid mechanics] as a separate volume. This is not only
because it deals with a field of application of its own wide interest but also because it extends the field
of finite element applications to a difficult area in which 'variational principles' do not exist naturally.'
(p. xiv)
7. 'The whole field of computational fluid mechanics in which finite difference approximation has been
the mainstay is today in a transition stage in which the advantages of finite elements are being realized.'
(p. xiv) [optimists!]
8. 'One, therefore, sees that we must keep a delicate balance between coerciveness on the kernel of
B and the inf-sup condition which are in a sense conflicting conditions with respect to the choice of
spaces.' (p. 198)
9. 'Though finite difference methods have been [playing] and will continue to play a major role in
computational fluid dynamics (CFD) and heat transfer, finite element techniques have spurred the
explosive development of 'general purpose' methods and the growth of commercial software. The inherent
strengths of the finite element method such as unstructured meshes, element-by-element formulation and
processing, and the simplicity and rigor of boundary condition application are being coupled with modern
developments in automatic mesh generation, adaptive meshing, and improved solution techniques to
produce accurate and reliable simulation packages that are widely accessible.' (p. iii)
10. 'As in any rapidly developing field, the education of the nonexpert user community is of primary
importance. The present text is an attempt to fill a need for those interested in using the finite element
method in the study of fluid mechanics and heat transfer. It is a pragmatic book that views numerical
computations as means to an end—we do not dwell on theory or proof.' (p. iii)
1.5 OVERVIEW OF THIS BOOK
We now briefly summarize what lies ahead and what does not; i.e., we also list as many
relevant issues/items as we can think of that, for one reason or another, are not discussed
in this book—although many of these omitted subjects will be taken up in Volume II.
In Chapter 2 we develop an appropriate weak formulation of the advection-diffusion
(AD) equations after appropriate discussion of the underlying partial differential equation
(PDE). The equations governing the approximate solution of the weak form are then
carefully developed, including both boundary conditions (BC's) and initial conditions (IC's).
After displaying in 'full glory' (probably for the first time in many cases) the semi-discrete
form—the 'method of lines'—of these equations (time remaining continuous) generated
by the Galerkin finite element method in both 1D and 2D for several 'elements', we
discuss outflow/open boundary conditions (OBC's) in general, and then present some
non-Galerkin semi-discrete equations, including a finite volume method. After an
extensive discussion of the 'behaviour' of the 'GFEM-generated' ordinary differential equations
(ODE's)—including such things as dispersion, phase and group velocity, and wiggles as
a warning signal, we present a major discussion of time-integrating these ODE's—to get
the 'final' results (numbers) from the computer. Besides reviewing explicit and implicit
classical ODE methods, we discuss in detail one of our most important contributions:
how to employ local error control to vary the integration step size in such a way that a
specified level of overall accuracy is attained without 'wasting' time steps; i.e., small steps
are used only when required by the physics, and not by either stability limits or the analyst's
arbitrary 'guesses'. After returning to a discussion of dispersion (etc.) for fully-discretized
systems, we summarize other methods of solving the GFEM equations, and conclude with
a few numerical examples that are tutorial in nature; i.e., they demonstrate one or more
of the characteristics described earlier. Finally, an appendix presents further comparisons
of finite elements to finite volumes.
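The idea of letting a local error estimate choose the time step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scheme developed in Chapter 2: the scalar test problem y' = -lam*y, the step-doubling error estimate, the safety factor 0.9, the growth limits, and all function names are choices made here for brevity.

```python
import math

def tr_step(y, dt, lam):
    """One trapezoid-rule (TR) step for the linear test problem y' = -lam * y."""
    return y * (1.0 - 0.5 * lam * dt) / (1.0 + 0.5 * lam * dt)

def integrate_adaptive(y0, lam, t_end, dt0, tol):
    """Integrate y' = -lam*y with TR, sizing each step from a local error estimate."""
    t, y, dt = 0.0, y0, dt0
    steps = []  # accepted step sizes, kept for inspection
    while t < t_end:
        dt = min(dt, t_end - t)
        y_full = tr_step(y, dt, lam)                                   # one step of size dt
        y_half = tr_step(tr_step(y, 0.5 * dt, lam), 0.5 * dt, lam)     # two steps of size dt/2
        # TR is second order, so the local error scales like dt**3; the
        # difference of the two results, divided by 3, estimates the error of y_half.
        err = abs(y_half - y_full) / 3.0
        if err <= tol:          # accept the step
            t += dt
            y = y_half
            steps.append(dt)
        # Local error ~ C*dt**3, hence the cube root. The factor 0.9 is a safety
        # margin, and the change per step is limited to [0.1, 4] (assumed values).
        factor = 0.9 * (tol / max(err, 1e-300)) ** (1.0 / 3.0)
        dt *= min(4.0, max(0.1, factor))
    return y, steps

y_end, steps = integrate_adaptive(y0=1.0, lam=10.0, t_end=1.0, dt0=1e-3, tol=1e-7)
print("steps taken:", len(steps))
print("first and largest step:", steps[0], max(steps))
print("error at t_end:", abs(y_end - math.exp(-10.0)))
```

As the solution decays, the same absolute tolerance permits progressively larger steps, so the step size is set by the accuracy requested rather than by a fixed guess.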
Chapter 3 is clearly the most important chapter in the book; in a sense, the other
chapters are 'adjuncts' to this one. After some careful discussion of the myriad ways
to express the continuum PDE's that are the Navier-Stokes equations (NSE's), and a
brief introduction to 'related' equations (for vorticity, for pressure) and limiting cases,
we present extensive discussion of the wide variety of permissible BC's, concluding
the 'introduction' portion with a discussion of IC's and well-posedness followed by a
summary section for the PDE's. After a brief discussion of associated global
conservation laws we embark on a detailed derivation of the various weak forms of the NSE's,
followed by an even more detailed exposition of a particular weak form (the most common
one). Next is a lengthy discussion related to element 'choice' (still a 'non-converged'
process, worldwide), including the most difficult of the criteria for making a selection
(div-stability) and some consequences of making 'wrong' choices (pressure modes). Following
a 'contributed' section (by D. J. Silvester) on 'Stabilizing' the GFEM and another
discussion about the pressure Poisson equation (PPE), we offer a very detailed section which,
from the element matrices onward, shows 'how' the GFEM 'works'—how both natural
boundary conditions (NBC's) and the always elusive-in-the-past pressure BC's really
come about. Also included is a new convergence proof for the controversial $Q_1Q_0$ element.
Following discussions of some higher-order elements and global conservation laws for
the GFEM equations, a digression to show a finite volume method is permitted. The new
'meat' of the chapter begins with some tutorial information on various ways to integrate
the differential-algebraic equations (DAE's) that are the GFEM equations. After a useful
digression on a model DAE system with an analytic solution, and another on the Stokes
equation via eigenvectors, we discuss a variety of methods used to solve (integrate in time)
the DAE's, beginning with our favorite and strongly-recommended method: trapezoid rule
(TR) with intelligently selected ('smart') variable step-sizes. Methods also included are:
explicit via forward Euler (FE), semi-implicit via a popular projection method, other
projection-related methods, a fractional step method, methods based on characteristics,
methods based on least squares, and methods based on Galerkin least squares. After a
small section on aliasing and one that clarifies some old finite difference methods, the
chapter concludes with but one numerical example—treated in some detail: impulsive
start of flow past a circular cylinder. One of the objectives of this chapter that we think
we have achieved is to remove most of the cloak of mystery surrounding the pressure.
The last chapter addresses the various issues of 'what to do with the results', and how
to derive other 'secondary' quantities from the 'primary' (computed) variables. We show
several ways to compute vorticity, heat flux, and forces (and moments)—and recommend
some over others.
The book concludes with an extensive appendix on 'projections' related to the
GFEM—and projection methods, again presenting some quite new material that we
believe is illuminating and useful.
What (else) did we leave out? Lots, as the reader has by now already ascertained.
Probably more than we know. Below we list some of the important items, and hasten to
point out that many of these will be covered in Volume II.
1. Local error estimates and adaptive meshing; error estimates in general. This is a serious
defect, but perhaps our treatment of the 'same' subject in the time domain will help
compensate. Besides, serious error estimates require too much functional analysis. For a
fairly recent 'update' on this important-and-still-developing area, see Zienkiewicz and
Taylor (1991). (We do provide a few error analyses, and summarize a few others.)
2. Detailed discussion/analysis of triangular/tetrahedral elements.
3. Petrov-Galerkin and related schemes. We simply 'apologize' for our 'naivete'—and
lack of experience.
4. Free surface flows: See Volume II.
5. Thermal problems, buoyancy-induced flows (Boussinesq equations), stratified flows:
See Volume II.
6. Turbulence; turbulence modeling; turbulent flows: see Volume II, but note also, for
example: 'Laminar flows are not just of academic interest. They are of considerable
practical importance to the designer of forced convection heating and cooling devices used
in the electronics, biomechanics, and aerospace industries, among others.' [Mohammed
et al. (1991)]
7. Solution methods for linear and nonlinear algebraic equations: This very important
practical aspect of the GFEM unfortunately also had to be deferred to Volume II.
8. p-methods and h-p methods (p: polynomial, h: element 'size'); spectral methods.
We cover only h-methods. For an introduction to the first of these higher-order methods,
see, for example, Oden (1990), Ainsworth and Oden (1991), Babuska and Suri (1994),
Oden et al. (1993), and Babuska and Oden (1996). For spectral methods, start with the
excellent text by Canuto et al. (1988a).
We hope to make up for many, but definitely not all, of these deficiencies in Volume II.
1.6 SOME SUBJECTIVE DISCUSSION
In this short section, we wish to present a potpourri of items and issues that we believe
are useful, poignant, relevant, or simply 'interesting'—that fit well nowhere else. They
are rather subjective, but are hopefully useful to some of our readers.
We begin with a few references that we believe are useful reading for the reasons
stated. The recent book by Morton (1996) covers many aspects of 'our Chapter 2' from a
variety of alternative analytical viewpoints, with—unlike our presentation—an emphasis
on various Petrov-Galerkin methods. (He is, ultimately, more interested in compressible
CFD, in which shocks can wreak havoc with 'centered difference' methods like GFEM);
nevertheless, the book is a good adjunct to ours. The recent contribution to the Handbook
of Numerical Analysis by Marion and Temam (1996) both complements and supplements
our treatment of the NSE's in Chapter 3; it also includes a concise and readable summary
of what is and what is not known (3D global existence) regarding existence and uniqueness
of solutions in both 2D and 3D—recommended reading. On a rather lighter note, the short
paper by Russell (1989) whose title tells all, is also recommended: 'Finite Elements and
Finite Differences: Are they Really Different, and Does it Matter?' Returning to a much
heavier subject, the paper by Johnson et al. (1985) is a piercing critical review of most
of the 'classical' methods of performing 'stability analysis' of a given steady solution of
the NSE's; also critically reviewed and allegedly improved are the related error analyses
in CFD, including error control based on a combination of 'strong stability and Galerkin
orthogonality.' Related to this is the following, from an M.Sc. thesis:
Mathematically there are two reasons for the partial failure of classical stability
(eigenvalue) analysis. First, as already indicated, the classical analysis is purely
qualitative, investigating the behaviour of [infinitesimal] perturbations as time tends to
infinity. This does not properly describe cases where the operators occurring are non-
normal; e.g., the Orr-Sommerfeld operator in plane Couette and Poiseuille flows.
Non-normality means that the eigenfunctions of the O-S operator, though forming a
complete set, are not orthogonal. In this case a standard eigenvalue analysis fails, since
it does not account for the large transient perturbation growth that can arise when the
eigenfunctions are nearly linearly-dependent. The second reason is the restriction to
2D flow. The operators occurring in two- and three-dimensional Couette and Poiseuille
flow are both non-normal, but only the 3D case appears to have the potential of large,
transient perturbation growth. [Ericsson (1993)]
[See also the seminal works in this area by L.N. Trefethen; for example, Trefethen
et al. (1993) and Reddy and Trefethen (1994).] For a 'new' approach to the larger problem
of fluid dynamics from a mathematical viewpoint, see Feistauer (1992), which treats both
incompressible and (mostly) compressible flow. Numerous additional relevant references that
we have 'overlooked' are also cited there.
We now cite a few (additional) quotations that simply seem relevant:
'The FEM is, first, a systematic and powerful method of interpolation' [Oden and
Reddy (1976, p. 197)].
... there is also a great controversy surrounding them [numerical methods]—focusing
on the trade off between ease of obtaining a solution and its accuracy. Solution of
the Navier-Stokes equations is much more difficult than that of the linear equations
governing most solid mechanics problems; hence computational methods for fluid
dynamics are much more involved and require greater expertise on the part of the
user than stress or thermal analysis by finite-element methods. [SAE Automotive
Engineering Staff (1995)]
At the conclusion of an article in the Annual Review of Fluid Mechanics eulogizing
Karl Pohlhausen, Millsaps (1984) states,
... by a legendary figure on the ascent to the summit of human intellectual
activity—fluid mechanics.
Finally getting back down to Earth—in an unpublished set of notes that was a precursor
to their 1973 text (see their preface), Strang and Fix state:
The development of the method has led naturally from piecewise linear functions to
splines and other piecewise polynomials of fixed degree p: each increase in p adds
both to the accuracy and to the complexity of the method. As usual, the extra accuracy
is initially worth the price, but just as Newton's method is more popular than its higher-
order analogues, questions of convenience soon become paramount. In applications to
second-order equations, cubic approximants (p = 3) are apparently close to the turning
point.
This is probably still true today for fluid mechanics via h-methods, although it seems
that 'cubic' should be replaced by 'quadratic'.
We conclude this brief section with the following remark: As have most if not all
previous books of this type, we shall probably be found guilty of spending too much time
discussing our own previous work relative to that of others—we, too, are human.
1.7 WHY FINITE ELEMENTS? WHY NOT FINITE VOLUMES?
In some CFD circles the finite element method has not yet achieved the same level of
acceptance as other numerical solution methods. Part of this is probably a backlash of
relief by those who feared the worst some 10 or so years ago when it had become pretty
clear that the FEM had few serious competitors in its field of origin: solid mechanics
computations in arbitrarily complex geometry. The fact that FEM is clearly 'best' for
elliptic problems on arbitrary domains led some FEM optimists/advocates/zealots to
believe (and espouse) that other applied fields of computational physics—fluid mechanics
in particular—would also be 'easy pickings'. However, the fact that the Navier-Stokes
equations, for both compressible and incompressible flows, are much more difficult to
solve, let alone solve efficiently, combined with oversell by some and by dazzling displays
of advanced and often obfuscating mathematics by others, seems to have caused the
FEM bandwagon to become less attractive to jump on during the 80's and 90's—a
state of affairs that was furthered by the simultaneous development and near-maturation
of two finite-difference-related areas of CFD: (i) body-fitted (and related) coordinate
transformations for reasonably complex geometries, and (ii) finite/control volume methods
(FVM) for spatial discretization, the former being a competitor to the inherent geometric
flexibility via the isoparametric mappings of FEM and the latter offering a more
readily understood (low-order) weighted residual method for approximating the solution
of PDE's via 'local conservation.' Additionally, the FVM (developed and applied
mostly by mechanical engineers) was not much 'burdened' by the oft-times frightening
mathematical analyses that accompanied (or at least lurked in the background of) applied
FEM code development... such as mixed interpolation, LBB stability (inf-sup condition),
Hilbert spaces, Sobolev spaces, etc. ... not to mention the seemingly never-ending quest
for the best—or optimal—element. Even the experts do not agree. Much of this 'trend'
is, it should be noted, attributable to the historical development of both FEM and FVM in
CFD, the former spinning off from solid mechanics, where fully-coupled systems solved
via the Newton-Raphson method and 'Gaussian elimination' were the rule—and the
latter evolving from the 'simple geometry' FDM's (of an earlier computer era) in which
uncoupled equations and iterative solution methods dominated the solution algorithms.
The advantages of the uncoupled/iterative methods are two: (i) reduced computational
cost (memory and CPU—ignoring accuracy considerations) and (ii) a significantly larger
radius of convergence. The disadvantage is a significant increase in the number of
iterations needed to achieve convergence and, in some cases, even the loss of the ability
to converge to tight tolerances. It is thus interesting to note that a current trend in the finite volume
community is toward coupled iterative solution methods, because of the disadvantage
noted above. Also we note that very often the methods are compared simply on a 'CPU
work per node' basis. However this is a very misleading way of comparing the relative
costs, particularly in the (most common) case of nonlinear problems in which additional
issues such as accuracy and convergence rates should be factored into the 'cost equation'.
All of the above issues have, we believe, conspired to cause the FEM, which is really
just a 'generalized' FVM (because a consistently-formulated FVM has more similarities
than differences when compared with a low-order FEM—as we shall demonstrate), to
either lose followers/advocates or at least to not gain new ones at a rate proportional to
the total rate of growth of CFD. But the facts are that the situation has changed rather
significantly in the recent past—on both sides of the fence: FEM has, in some circles at
least, been switching to simple (low-order) elements and to iterative solution methods on
uncoupled (or less coupled) equations—and FVM is beginning to become 'saddled' with
mathematical analysis relating to stability and convergence. Three examples of the latter
are: Cai et al. (1991), Cai (1991), and Shin and Strikwerda (1997).
We believe that the FEM, using low-order elements (basically linear and
piecewise-constant), does represent a viable version of generalized FVM (especially now that
the CFD FEM community, or at least some of it, has 'caught on' regarding solution
strategy) and that it can and will compete cost-effectively in the arena of large CFD
simulations involving millions of unknowns. We are less certain regarding the 'viability'
of higher-order elements—although this may in fact be less important since there are no
(or very few) higher-order FVM's. (As we shall demonstrate in the next chapter, the FVM
is inherently a low-order method.)
We now present a comparison of FEM and FVM in several areas, beginning with the
principal and most-oft-stated advantage of the latter over the former: the linear
conservation laws implied by the governing PDE's are always and inherently satisfied locally
(at control volume level) and thus globally (via simple summation). Hence, the discrete
equations are always amenable to simple physical interpretation because the resulting
stencils are also simple—at least on simple grids. In fact, the above is stated rather well
by some users:
One reason for this (the increasing popularity of FVM's) is that they combine the
intrinsic geometric flexibility of FEM together with the desirable, direct physical
invocation (sic) of a conservation principle to clearly identified and delineated control
volumes comprising the domain. [Schneider and Raw (1986)]
While there is no doubt that the above statements are true and that local
conservation is a decided asset, it is also true that the stencils are far from simple and not so
easy to interpret 'physically' when the FVM is applied to complex geometry with its
mappings, Jacobians, metrics, equation transformations, and sometimes even (God-awful)
Christoffel symbols—and that local conservation does not necessarily imply local
accuracy. [It is also true that virtually all FVM's we have seen employ a lumped (diagonal)
mass matrix, thus vitiating the claim of local conservation—at least for time-dependent
situations.]
Now we would like to list some definite advantages of GFEM over FVM.
1. The inherent (built-in) geometric flexibility permits the easy use of simple Cartesian
velocity components on unstructured meshes for arbitrarily complex geometry. There is
absolutely no need for 'add-on' global mappings, global transformation of equations to
covariant (or contravariant) components and thus no need for Christoffel symbols (etc.)
This FEM simplification is really significant, further appreciation of which could be
obtained by reading a carefully written paper on FVM by a team of mathematicians; see
Segal et al. (1992). In fact, however, many 'modern' FVM methods are also based on
Cartesian velocities—as are many commercial codes; see Ferziger and Peric (1996).
2. The inherent ability to easily and accurately apply the appropriate (physical) BC's on
complex domains—especially the Neumann type, and especially at outflow regions—is a
real asset.
3. Global physical (linear) conservation laws are either satisfied automatically (a la
FVM's) or can be made to do so with a slight change in formulation; the conservation of
linear momentum (global 'force balance') is obtained simply by using the divergence form
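for advection, $\nabla \cdot (\mathbf{u}\mathbf{u})$, as do FVM's. In addition, global energy conservation (quadratic)
is guaranteed by writing the advection terms as $\tfrac{1}{2}[\nabla \cdot (\mathbf{u}\mathbf{u}) + \mathbf{u} \cdot \nabla \mathbf{u}]$; these are issues we shall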
elaborate upon in due course. On this point we note that many (us included) believe that the
most important reason to seek global conservation of quadratic quantities (such as kinetic
energy) is that it assures stability (boundedness) of the numerical results. Again, however,
we remark that stability does not imply accuracy—although it is true (obviously) that
instability implies inaccuracy. Finally, we point out that FVM's also generally do not
conserve (locally or globally) quadratic quantities; the only quantities guaranteed to be
conserved are those for which the divergence theorem applies.
4. If the original differential operator is symmetric (self-adjoint), so too are the FEM
discretizations of the operator, but not (in general, on non-rectangular grids, etc.) FVM.
Examples: (i) Laplacian; (ii) the divergence and gradient operators are adjoint to each
other in the continuum and in the GFEM, but not in the FVM.
5. The phase speed of an FEM is always more accurate than that of the
'corresponding' FVM.
6. For elliptic problems, the GFEM is always more accurate than the 'corresponding' FVM.
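The quadratic-conservation claim in item 3 can be checked numerically in a much simpler setting than the GFEM. The sketch below is an illustration under assumptions made here, not the formulation used in this book: a scalar T advected by a known, variable velocity u on a 1D periodic grid, with second-order centered differences standing in for the spatial discretization. The names `centered_difference`, `A_div`, `A_adv`, `A_skew`, and `energy_rate` are hypothetical.

```python
import numpy as np

def centered_difference(n, h):
    """Periodic second-order centered difference matrix (skew-symmetric)."""
    D = np.zeros((n, n))
    for i in range(n):
        D[i, (i + 1) % n] = 1.0 / (2.0 * h)
        D[i, (i - 1) % n] = -1.0 / (2.0 * h)
    return D

n = 64
h = 2.0 * np.pi / n
x = h * np.arange(n)
u = 1.0 + 0.5 * np.sin(x)        # assumed known, spatially varying velocity
T = np.exp(np.cos(x))            # arbitrary smooth scalar field

D = centered_difference(n, h)
U = np.diag(u)

A_div = D @ U                    # divergence form:  d(uT)/dx
A_adv = U @ D                    # advective form:   u dT/dx
A_skew = 0.5 * (A_div + A_adv)   # average of the two forms

def energy_rate(A, T):
    """d/dt of sum(T**2)/2 under the semi-discrete equation dT/dt = -A T."""
    return -T @ (A @ T)

print("divergence form, d/dt sum(T):      ", -np.sum(A_div @ T))
print("divergence form, d/dt sum(T^2)/2:  ", energy_rate(A_div, T))
print("averaged form,   d/dt sum(T^2)/2:  ", energy_rate(A_skew, T))
```

The averaged operator is skew-symmetric by construction, so its quadratic form vanishes for any T, whereas the divergence form alone conserves only the linear quantity sum(T).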
Related to item 6, we would like to react to the following 'related' assertion
by some FVM advocates: FEM is great for elliptic problems, but it may be out of its
realm for Navier-Stokes. We respond thus: This best approximation property for elliptic
problems should not even be surfaced when discussing CFD—but if it is, it should be
interpreted as a bonus, an extra. This is just another example, like that of phase speed
already mentioned, wherein GFEM trades ease of 'interpretation' for increased accuracy.
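The phase-speed remark can be made concrete for the simplest case. The sketch below assumes 1D advection at constant speed on a uniform grid, continuous time, linear Galerkin elements with the consistent mass matrix, and, as the 'corresponding' FVM, the same centered stencil with a lumped mass matrix; the closed-form ratios follow from inserting a Fourier mode into each semi-discrete equation. The function names are hypothetical.

```python
import numpy as np

def phase_speed_ratio_fem(theta):
    """c_numerical / c for linear Galerkin elements with the consistent mass matrix."""
    return (np.sin(theta) / theta) * 3.0 / (2.0 + np.cos(theta))

def phase_speed_ratio_lumped(theta):
    """c_numerical / c for centered differences (equivalently, a lumped mass matrix)."""
    return np.sin(theta) / theta

# theta = k*h; these values correspond to wavelengths of 32, 16, 8 and 4 grid spacings
theta = np.pi / np.array([16.0, 8.0, 4.0, 2.0])
err_fem = np.abs(1.0 - phase_speed_ratio_fem(theta))
err_lumped = np.abs(1.0 - phase_speed_ratio_lumped(theta))

for th, ef, el in zip(theta, err_fem, err_lumped):
    print(f"kh = {th:.4f}   consistent-mass error = {ef:.2e}   lumped-mass error = {el:.2e}")
```

For well-resolved waves the consistent-mass error behaves like (kh)^4/180, while the lumped-mass error behaves like (kh)^2/6; the price is a non-diagonal mass matrix that is harder to interpret stencil-wise.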
Another 'excuse,' we believe, for going with FVM rather than FEM is this: The
mathematics is too difficult and the concepts of weak formulation, weighted residuals,
etc. are non-physical and non-intuitive. We have a three-part response to this one.
1. The FVM is (often if not always) also a weighted residual method; only the weighting
functions are different. They are simpler to be sure (piecewise-constant); and this is the
reason we refer to GFEM as a generalized finite volume method (GFVM). We shall,
in fact, derive, describe and demonstrate (partially) a particular FVM in the next two
chapters.
2. Most of the difficult mathematics can be bypassed by practitioners and even code
builders—just as it is and has been in FVM; i.e., there is a large body of (difficult)
mathematics in the numerical analysis literature that is largely ignored by applied scientists
writing finite difference codes (finite volume, too, but to a lesser extent, as the difficult
mathematics is still in its infancy).
3. Finally, the same criticism should be leveled against all elementary Fourier-series
methods (and related eigenfunction expansion methods), as they too are certainly 'non-
physical and non-intuitive'—but in fact they are all Galerkin methods. But we believe
that even finite volume advocates appreciate the 'power' (and magic?) of Fourier series.
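The first of these points can be written out for a generic conservation law. The notation below (conserved scalar $\phi$, flux $\mathbf{F}$, source $s$, basis functions $\varphi_i$, control volumes $V_i$) is assumed here purely for illustration and is not taken from later chapters; the only difference between the last two lines is the choice of weighting function.

```latex
\begin{align*}
&\text{Conservation law:}
  && \frac{\partial \phi}{\partial t} + \nabla\cdot\mathbf{F}(\phi) = s
     \quad \text{in } \Omega, \\
&\text{Weighted residual:}
  && \int_\Omega w_i \left( \frac{\partial \phi_h}{\partial t}
     + \nabla\cdot\mathbf{F}(\phi_h) - s \right) d\Omega = 0,
     \quad i = 1,\dots,N, \\
&\text{GFEM } (w_i = \varphi_i):
  && \int_\Omega \varphi_i \frac{\partial \phi_h}{\partial t}\, d\Omega
     - \int_\Omega \nabla\varphi_i \cdot \mathbf{F}\, d\Omega
     + \oint_{\partial\Omega} \varphi_i\, \mathbf{F}\cdot\mathbf{n}\, d\Gamma
     = \int_\Omega \varphi_i\, s\, d\Omega, \\
&\text{FVM } (w_i = 1 \text{ in } V_i,\ 0 \text{ elsewhere}):
  && \frac{d}{dt}\int_{V_i} \phi_h\, dV
     + \oint_{\partial V_i} \mathbf{F}\cdot\mathbf{n}\, dS
     = \int_{V_i} s\, dV .
\end{align*}
```

With piecewise-constant weights the divergence theorem turns the residual statement directly into a flux balance over each control volume, which is the source of the local conservation property discussed earlier in this section.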
We conclude by emphasizing that there are actually more similarities than
differences between these two methods: (i) they are both members of the family of weighted
residual methods (although FVM methods need not be; there is a wide variety of methods
called FVM); (ii) they both rely more on integration than divided differences to generate
the discrete equations; and (iii) they both treat complex geometry via a 'mapping'. We
strongly hope that enough 'mechanical engineers' (et alii) will be sufficiently swayed by
this book that some good fraction of this large group of practitioners will help the FEM to
grow and improve.
2 The Advection-Diffusion Equation
An appropriate starting point for the study and numerical simulation of incompressible
fluid flow is that of the simpler but important linear equation of advection-diffusion
(AD), in which the velocity field is presumed known. Indeed, many fluid flow
simulations are primarily (or ultimately) concerned with the transport and diffusion of scalar
quantities such as 'heat' (temperature) or concentration (e.g. air pollution). Unfortunately,
even in these cases, the more-difficult-to-obtain velocity field must usually be computed
first. Here, however, we shall assume that the velocity is known, either analytically or
from a numerical solution of the incompressible NSE's; in the next chapter we will
turn to the problem of computing the velocity field itself. Finally, since the advection-
diffusion equation is, in many ways, prototypical of the (much more difficult) NSE's, it
is useful to study it first—and we remark that it is often called the convection-diffusion
equation.
2.1 THE CONTINUUM EQUATION
2.1.1 The Advective (Convective) Form
The conservation principle for energy or chemical species (mass) can often be well-
approximated by the following partial differential equation (PDE)—the scalar transport
equation—written here in terms of temperature, T(x, t), where x is a short-cut notation
for all spatial directions and applies to one, two, or three dimensions:
$$\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T = \nabla \cdot (\mathbf{K} \cdot \nabla T) + S, \tag{2.1-1}$$
where the velocity field, u(x, t), is given and satisfies $\nabla \cdot \mathbf{u} = 0$, as is the diffusivity tensor,
K, and the source term, S. {For S(x, t) > 0 [< 0], it is a source [sink] term.}
In this chapter we will consider the simple case wherein the equation is linear (i.e. no
temperature-dependent terms); a further simplification, which we also use for the most
part, is that the coefficients are invariant in time. The term $\mathbf{u} \cdot \nabla T$ represents advection
(temperature is 'carried' by the velocity field), which is also often called convection;
the term $\partial T/\partial t$ represents 'accumulation' for non-steady processes; and finally, the term
$\nabla \cdot (\mathbf{K} \cdot \nabla T)$ is, of course, the diffusion term; if K is a scalar ($\kappa$) and constant, then this
term becomes noticeably simpler: $\kappa \nabla^2 T$, where $\kappa = k/\rho C_p$ (thermal conductivity divided
by density and heat capacity) is the thermal diffusivity.
The solution to (2.1-1) will generally be sought within a bounded domain, $\Omega$, with
boundary, $\partial\Omega$, also called $\Gamma$. Given an initial distribution of temperature, (2.1-1) can, in
principle, be solved subject to an appropriate set of boundary conditions (BC's), which
typically are:
$$T = T_D \quad \text{on } \Gamma_D \tag{2.1-2}$$
and
$$\mathbf{n} \cdot (\mathbf{K} \cdot \nabla T) + H(T - \hat{T}) = q \quad \text{on } \Gamma_N, \tag{2.1-3}$$
where $\partial\Omega$ ($= \Gamma_D + \Gamma_N$) is composed of the two non-overlapping segments, $\Gamma_D$ and $\Gamma_N$;
also $T_D$, $\hat{T}$, $H \ge 0$ (heat transfer coefficient, albeit slightly redefined, with units of
velocity, from that used in conventional heat transfer analysis), and q (specified normal
heat flux into $\Omega$, and again slightly redefined) are given functions (time-dependent, in
general, and often simply called 'data' by mathematicians) on the appropriate portion
of the boundary, and n is the outward pointing unit normal vector. The BC given by
(2.1-2) is called a Dirichlet BC, or a BC of the first kind (an 'essential' BC in the weak
formulations to follow); if H = 0 in (2.1-3), then the resulting BC is called a Neumann
BC or a BC of the second kind; finally, for $H \ne 0$, we have a BC of the third kind, or
a Robin BC—see, for example, James and James (1959). [The Robin BC is, for a heat
conduction problem, also Newton's law of cooling; hence, the Robin BC is sometimes
referred to as the Newton BC (e.g. Bird et al. 1960; Rektorys, 1980; Reddy, 1993). Both
Neumann and Robin BC's will later also be referred to as 'natural' BC's in the weak
formulations to follow.]
While K can, in general, be a full (but symmetric and positive-definite) second-order
tensor (a 2 x 2 matrix of coefficients in 2D, and 3 x 3 in 3D) representing anisotropic
diffusion, it is usually much simpler; e.g., a diagonal matrix or even a scalar. Since this
presentation is largely introductory, we shall usually consider the simplest case of a scalar
(and constant) diffusion coefficient $\kappa$.
Finally, the statement of the scalar transport problem (abbreviated henceforth by AD;
advection-diffusion) is completed by specifying an initial condition (IC):
$$T(\mathbf{x}, 0) = T_0(\mathbf{x}) \quad \text{in } \bar{\Omega}, \tag{2.1-4}$$
where $T_0$ is a given function of position and $\bar{\Omega} = \Omega + \partial\Omega$.
Before continuing, we make several
Remarks:
(1) The above BC's are the most general linear BC's that can be applied to (2.1-1);
if K or H or q are temperature-dependent, a 'family' of non-linear BC's unfolds.
{Actually, to incorporate a BC that will be discussed later, which may be even more
'general,' a convective transport term $-\mathbf{n} \cdot \mathbf{u}T$ should be added to the LHS of (2.1-3)
[cf. (2.1-25)].}
(2) The IC need not (and generally does not) satisfy the BC's, but if it does, the resulting
solution will be smoother, i.e., possess higher-order derivatives—especially if (2.1-4)
satisfies (2.1-2), the Dirichlet BC. (This 'flexibility' regarding IC's and BC's will be
partially lost when we advance to the Navier-Stokes equations in the next chapter.)
(3) A practical application of BC (2.1-3) occurs when $\Gamma_N$ is a wall (at which u = 0,
usually) containing a heater (for q > 0) and on the other side of which flows a fluid
at temperature $\hat{T}$.
(4) Another practical and very common use of (2.1-3) occurs when $\Gamma_N$ represents an
'outflow' ($\mathbf{n} \cdot \mathbf{u} > 0$) boundary that is usually artificial/synthetic in the real world but
very real in the mathematical modeling world. Here the use of H = 0 and q = 0 is
often effective as an approximation to the true coupling with the rest of the universe.
(5) It is impractical, and generally ill-advised, to apply (2.1-3) at inflow ($\mathbf{n} \cdot \mathbf{u} < 0$)
portions of the boundary because it could lead to an ill-posed problem, or at least
to a problem whose existence of a unique solution cannot be proven, as we shall
demonstrate.
(6) If K = 0, we have the limiting case of pure advection, a hyperbolic equation for
which no BC is permitted at outflow; i.e., BC (2.1-3) must be dropped in this
situation, because the theory of characteristics tells us that T must be specified at
inlet points on $\Gamma$ ($\mathbf{n} \cdot \mathbf{u} < 0$), but that there is no BC at outlet points; at these points,
the PDE itself prevails.
(7) Consider the 1D AD equation with constant coefficients. To help appreciate the
tremendous difference between advection and diffusion, we point out that
discontinuities in the initial data, $T_0(x)$, are transported unaltered ($C^{-1}$ initial data remains
$C^{-1}$, i.e. rough) under pure advection but are instantaneously smoothed (to $C^\infty$ for
$t = 0^+$) when diffusion is present.
(8) There will generally occur a singularity at the junction of $\Gamma_D$ and $\Gamma_N$, at which
certain derivatives of T (e.g., diffusive heat flux) will fail to exist (be unbounded).
(9) Periodic (or cyclic) BC's are sometimes useful/appropriate, for which both (2.1-2)
and (2.1-3) are replaced by the requirement that the solution at one part of the
boundary must be equal to that at another.
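Remark (7) can be illustrated with closed-form solutions on the whole real line. The sketch below is only an illustration under stated assumptions: the step-shaped initial data, the values of u and kappa, and all function names are hypothetical choices, not taken from the text. It compares the jump across the front for pure advection (the step is translated unchanged) with the advection-diffusion case (the step becomes a smooth complementary error function profile for any t > 0).

```python
import math

def step(x):
    """Discontinuous initial data: 1 upstream of x = 0, 0 downstream."""
    return 1.0 if x < 0.0 else 0.0

def advected(x, t, u):
    """Pure advection (kappa = 0): the initial profile is translated unchanged."""
    return step(x - u * t)

def advected_diffused(x, t, u, kappa):
    """Constant-coefficient 1D AD on the whole line with step initial data."""
    if t == 0.0:
        return step(x)
    return 0.5 * math.erfc((x - u * t) / math.sqrt(4.0 * kappa * t))

# Hypothetical parameter values, chosen only for illustration.
u, kappa, t = 1.0, 0.01, 0.5
front = u * t      # position of the advected discontinuity
eps = 1e-6         # sample just upstream and just downstream of the front

jump_adv = advected(front - eps, t, u) - advected(front + eps, t, u)
jump_ad = (advected_diffused(front - eps, t, u, kappa)
           - advected_diffused(front + eps, t, u, kappa))

print("jump across the front, pure advection:     ", jump_adv)
print("jump across the front, advection-diffusion:", jump_ad)
```

The first jump stays at its initial size, while the second shrinks with eps, which is the behavior of a continuous profile.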
Given sufficiently smooth data (u, K, $T_D$, $\hat{T}$, q, $T_0$, and $\partial\Omega$), the solution of (2.1-1)
will possess two continuous spatial derivatives and (for t > 0) one continuous time
derivative, as implied by (2.1-1). Such solutions are called classical solutions, which
distinguishes them from the weak solutions to be discussed later (and which 'dominate' the
solution space in the 'applied' world). Given that a classical solution to (2.1-1) exists,
its uniqueness (for $\mathbf{n} \cdot \mathbf{u} \ge 0$ on $\Gamma_N$) follows easily in the usual way; i.e., insert the
difference of two putative/alleged solutions into (2.1-1), multiply the equation by this
difference, and integrate the equation over the domain. [See Remark (6) following (2.1-18)
for help.]
2.1.2 Dimensionless Forms and Limiting Cases of the
Equation
It is often useful, both physically and mathematically, to recast a given PDE into one or
another dimensionless form to develop/improve one's intuition regarding the solution's
behavior. We provide two such forms below.
Suppose that a given problem is characterized by a particular (characteristic) length
scale, L, and a particular (characteristic) velocity scale, $u_0$. We now consider two useful
non-dimensional forms of (2.1-1), for the case $\mathbf{K} \to \kappa$, where $\kappa$ could represent the
average of the $K_{ij}$'s, for example, by representing distance (x) in terms of L and velocity
in terms of $u_0$. In the first form, we will assume that the time scale is 'set' by diffusion,
an assumption that must often be verified a posteriori; i.e., we will non-dimensionalize
time via $L^2/\kappa$, a diffusion time constant.
Equation (2.1-1) then becomes
$$\frac{\kappa}{L^2}\left(\frac{\partial T}{\partial t}\right) + \frac{u_0}{L}(\mathbf{u} \cdot \nabla T) = \frac{\kappa}{L^2}(\nabla^2 T) + S, \tag{2.1-5}$$
where the terms in parentheses all have the units of temperature (it is not necessary for
our purposes here to use dimensionless temperature). Multiplication of (2.1-5) by $L^2/\kappa$
gives the first dimensionless form,
$$\frac{\partial T}{\partial t} + Pe\, \mathbf{u} \cdot \nabla T = \nabla^2 T + Q_1, \tag{2.1-6}$$
where $Q_1 = L^2 S/\kappa$ and the Peclet number,
$$Pe = u_0 L/\kappa, \tag{2.1-7}$$
has been introduced. Each term in T is now presumably (at least 'globally', on average)
of order unity in $\Delta T$, the characteristic driving temperature difference: i.e., $\partial T/\partial t = O(1)$,
$\mathbf{u} \cdot \nabla T = O(1)$, and $\nabla^2 T = O(1)$. The Peclet number represents a ratio between the
'strength' of the advective and diffusive processes. In fact, an alternative derivation
of the Peclet number considers it as the ratio that estimates the relative magnitude of
advection to that of diffusion: $\mathbf{u} \cdot \nabla T/\kappa \nabla^2 T$; it then approximates $\mathbf{u} \cdot \nabla T$ by $u_0 \Delta T/L$,
and approximates $\kappa \nabla^2 T$ by $\kappa \Delta T/L^2$; the result is $u_0 L/\kappa$. If $Pe \ll 1$, then advection
is unimportant (almost everywhere, usually) and the process is said to be diffusion-dominated.
If $Pe \gg 1$, then diffusion is secondary (again; almost everywhere, usually)
and the process is called advection-dominated. Finally, of course, when $Pe = O(1)$, both
processes are important. In practice, it is often the case that both transport terms are on
nearly equal footing over most of $\Omega$ if 0.1 < Pe < 10, say.
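To obtain the second non-dimensional form, we assume, in contrast to the above, that
the time scale is set by advection; in this case, the appropriate measure of time is $L/u_0$, a
transport time, and (2.1-1) yields
$$\frac{u_0}{L}\left(\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T\right) = \frac{\kappa}{L^2}(\nabla^2 T) + S, \tag{2.1-8}$$
which, when multiplied by $L/u_0$, gives the second non-dimensional form,
$$\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T = \frac{1}{Pe}\nabla^2 T + Q_2, \tag{2.1-9}$$
where $Q_2 = LS/u_0$, and Pe is still given by (2.1-7). Again, $\partial T/\partial t$, $\mathbf{u} \cdot \nabla T$, and $\nabla^2 T$ are
$O(1)$ in $\Delta T$, presumably.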
Remark:
It is sometimes useful to note that the Peclet number (or Reynolds number for momentum
transport—next chapter) is also the ratio of the diffusion time scale (or time constant) to
the advection time scale (or time constant); e.g., $Pe \gg 1$ means that the 'response' of the
advection process to a 'perturbation' is much faster than that of the diffusion process.
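The two readings of the Peclet number given above, as $u_0 L/\kappa$ from (2.1-7) and as the ratio of the diffusion time scale $L^2/\kappa$ to the advection time scale $L/u_0$, can be checked side by side. The following is a minimal sketch; the function names and the numerical values are hypothetical, and the regime thresholds simply reuse the rough range 0.1 < Pe < 10 quoted in the text.

```python
def peclet(u0, L, kappa):
    """Pe = u0 * L / kappa, as in (2.1-7)."""
    return u0 * L / kappa

def time_scales(u0, L, kappa):
    """Return (diffusion time L^2/kappa, advection time L/u0)."""
    return L**2 / kappa, L / u0

def regime(pe, lo=0.1, hi=10.0):
    """Rough classification; lo and hi follow the '0.1 < Pe < 10, say' rule of thumb."""
    if pe < lo:
        return "diffusion-dominated"
    if pe > hi:
        return "advection-dominated"
    return "both processes important"

# Hypothetical values in consistent units, chosen only for illustration.
u0, L, kappa = 0.5, 0.2, 1.0e-3
pe = peclet(u0, L, kappa)
t_diff, t_adv = time_scales(u0, L, kappa)

print("Pe =", pe)
print("t_diff / t_adv =", t_diff / t_adv)
print("regime:", regime(pe))
```

With these values the diffusion time is much longer than the advection time, so a perturbation is carried across the domain long before diffusion can act on it.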
It is also of interest to write the Robin BC, (2.1-3), with $\mathbf{K} \to \kappa$, in dimensionless
form. Noting first that $\mathbf{n} \cdot \nabla T = n_x(\partial T/\partial x) + n_y(\partial T/\partial y)$ (2D), we see that only
length need be re-scaled, and the result is
$$\frac{\kappa}{L}\,\mathbf{n} \cdot \nabla T + H(T - \hat{T}) = q,$$
or, using the short-cut notation, $\mathbf{n} \cdot \nabla T = \partial T/\partial n$,
$$\frac{\partial T}{\partial n} + \frac{HL}{\kappa}(T - \hat{T}) = \hat{q}, \tag{2.1-10}$$
where $\hat{q} = qL/\kappa$, and the dimensionless group $HL/\kappa$ has two names, at least in thermal
analysis, noting that h, the conventional heat transfer coefficient, is $h = \rho C_p H$:
1. $HL/\kappa$ = Nu, the Nusselt number (convective heat transfer), and
2. $HL/\kappa$ = Bi, the Biot number (transient heat conduction).
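Because H is the conventional coefficient h divided by $\rho C_p$, and $\kappa = k/\rho C_p$, the group $HL/\kappa$ reduces to the familiar $hL/k$. A short sketch of that bookkeeping follows; the function names and the property values are hypothetical and serve only to show that the two expressions agree.

```python
def thermal_diffusivity(k, rho, cp):
    """kappa = k / (rho * Cp)."""
    return k / (rho * cp)

def redefined_coefficient(h, rho, cp):
    """H = h / (rho * Cp); this quantity has the units of velocity."""
    return h / (rho * cp)

def robin_group(H, L, kappa):
    """Dimensionless group H*L/kappa that multiplies (T - T_hat) in (2.1-10)."""
    return H * L / kappa

# Hypothetical property values in SI units, chosen only for illustration.
k, rho, cp, h, L = 0.6, 1000.0, 4000.0, 300.0, 0.05

kappa = thermal_diffusivity(k, rho, cp)
H = redefined_coefficient(h, rho, cp)
group = robin_group(H, L, kappa)
conventional = h * L / k   # the usual textbook form of Nu or Bi

print("H*L/kappa =", group)
print("h*L/k     =", conventional)
```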
Returning now to the dimensionless PDE, it turns out that (2.1-6) is the appropriate
form to consider for diffusion-dominated situations in that, as $u_0 \to 0$, $Pe \to 0$, and (2.1-6)
becomes the appropriate and familiar 'transient heat equation'; i.e., it describes pure
diffusion. On the other hand, (2.1-6) is generally inappropriate for studying advection-dominated
cases; here we should use (2.1-9) and thus obtain the appropriate hyperbolic
limit, via $Pe \to \infty$ as $\kappa \to 0$, of pure advection (with, in general, a source term).
Equation (2.1-9) is also the proper one for studying the effects of boundary layers when
$Pe \gg 1$. Thus, (2.1-6) is often more appropriate if $Pe \lesssim 1$, and (2.1-9) is often better if
$Pe \gtrsim 1$. It is to be emphasized, however, that either (2.1-6) or (2.1-9) may be used for
any value of Pe other than the asymptotic limits of 0 or $\infty$; the forms presented simply
better place the weaker of advection or diffusion into the more appropriate setting. The
different forms can also have important numerical ramifications, which will be discussed
later. In many cases of interest here we will be concerned with the (more difficult) case
of advection-dominated flow, $Pe \gg 1$.
We next discuss some general characteristics of a given problem that can cause the
numerical solution of the advection-diffusion equation (and the NS equations, for that
matter) to be (relatively) either easy or difficult to obtain (the latter requiring fine zoning
in at least some portions of the domain):
1. Since diffusion is a smoothing process, diffusion-dominated flows ($Pe \ll 1$), for which
the PDE is predominantly parabolic in character, are generally easier than advection-dominated
flows ($Pe \gg 1$), which are more hyperbolic in character and do not smooth the
solution. Exceptions occur, of course, but are usually associated with early time (small
t); e.g., a sharp change in a BC and/or a very non-smooth source function (S) will make
the solution more difficult, even if $Pe \ll 1$, at least during the initial transient period.
Steady-state simulations ($\partial T/\partial t = 0$) with $Pe \ll 1$ are generally the easiest.
2. Smooth wave forms (spatial via IC's, or temporal via inflow BC's) are easy compared
with rapidly changing ones. The combination of non-smooth wave forms, which contain
large amounts of 'energy' in the short wavelengths (in the context of Fourier analysis),
and large Pe, in which case the difficult wave form must be translated with little change of
shape, is a difficult problem. This situation is usually only encountered in time-dependent
simulations wherein, for example, the advection of a 'square wave,' or a sequence of
them via time-varying inlet BC's, is very difficult, as we will demonstrate.
3. For problems in which the flow must, usually for reasons of computational feasibility,
actually leave the computational domain, the simulation can be either easy or difficult,
depending on the form of the outflow boundary condition (OBC) employed. Usually in
these cases, the outflow 'boundary' ($\mathbf{n} \cdot \mathbf{u} > 0$) is artificial in that the true domain does
not end at this location, and thus the true (or physical) 'BC' is not known (there is no
boundary and therefore no BC); nevertheless, the PDE, e.g., (2.1-1), cannot generally
be solved unless some (mathematically allowed) BC is employed. [An exception occurs
if the problem is truly hyperbolic ($\kappa = 0$), in which case the PDE does not require an
OBC, and none should be imposed.] These simulations are especially difficult if $Pe \gg 1$
(but bounded; i.e., $\kappa \ne 0$) and (2.1-2) is employed at an outflow boundary, as already
mentioned. The reason for the difficulty is that a thin, outflow boundary layer will form,
and the interior solution, T, which is generally not close to $T_D$ as the flow approaches $\Gamma_D$
from the interior, must rapidly change to be equal to $T_D$ in a very short (non-dimensional)
distance, of $O(1/Pe)$, from (2.1-9). (This distance, say, $\delta$, in the flow direction can be
derived by 'equating' advection to diffusion in this boundary layer: $u_0 \Delta T/\delta = \kappa \Delta T/\delta^2$, or
$\delta = \kappa/u_0$, which in non-dimensional form is $\delta/L = \kappa/u_0 L = 1/Pe$.) The problem is much
easier, on the other hand, if an appropriate form of (2.1-3) is employed; if H and q are
zero, then the BC given by $\mathbf{n} \cdot (\mathbf{K} \cdot \nabla T) = 0$ does not cause a boundary layer phenomenon
(see also Gartling, 1978; Chang and Finlayson, 1980; Gresho and Lee, 1981) and is usually
a good OBC for a 'computationally truncated' domain. For inflow boundaries ($\mathbf{n} \cdot \mathbf{u} < 0$),
this problem does not arise, even for $Pe \gg 1$, and either (2.1-2) or (2.1-3) can be utilized.
[The former is often more appropriate, especially for large Pe; indeed, if K = 0, then
(2.1-2) is the only allowable BC at inflow boundaries.] Finally, these comments apply to
both time-dependent and steady-state simulations.
4. Flows around obstacles can be difficult if $Pe \gg 1$, again for both steady-state and
time-dependent situations; i.e., the advection-diffusion equation can be difficult to solve, even
if the flow field is 'simple,' like potential flow or Stokes flow. The reason is, again, the
formation of thin boundary layers, especially when (2.1-2) is applied on the 'upwind'
portion of the obstacle (where fine zoning will be required). This type of situation is
frequently encountered in heat and/or mass-transfer analysis.
5. Pure advection ($\kappa = 0$, $Pe = \infty$) simulations are often difficult, but are sometimes
actually easier than large Pe (e.g., $10^4$) problems, owing to the simpler BC's and absence
of boundary layers for the hyperbolic equation. For these situations, generally the only
appropriate BC for (2.1-1) is (2.1-2), applied at inflow boundaries only. Again, exceptions
are possible; e.g., in the case of a steady flow with an internal (closed) recirculation region
(attached eddy), the steady, pure advection equation ($\mathbf{u} \cdot \nabla T = 0$) has no solution unless T
is specified at some point on each streamline in the eddy. A general solution to $\mathbf{u} \cdot \nabla T = 0$
for constant $\mathbf{u} = (u, v)$ is $T(x, y) = f(vx - uy)$ for any $f(\cdot)$. For the time-dependent case, on the
other hand, the solution is completely determined from the initial conditions, and there
is, in general, no steady state. The numerical simulation of such flows can be difficult.
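The outflow boundary layer of item 3 can be seen in the simplest possible setting: the steady, sourceless, one-dimensional version of (2.1-9) with unit velocity, $Pe\,T' = T''$ on 0 < x < 1. The sketch below assumes, purely for illustration, an inflow value T(0) = 0 and a Dirichlet outflow value T(1) = 1; the function name is hypothetical. The exact solution is $(e^{Pe\,x} - 1)/(e^{Pe} - 1)$, and the code evaluates it one estimated layer thickness, 1/Pe, upstream of the outflow boundary.

```python
import numpy as np

def outflow_layer_profile(x, pe):
    """Exact solution of Pe*T' = T'' on [0, 1] with T(0) = 0 and T(1) = 1.

    Algebraically equal to (exp(pe*x) - 1) / (exp(pe) - 1), but written with
    shifted exponents and expm1 so that large pe does not overflow.
    """
    return np.exp(pe * (x - 1.0)) * (-np.expm1(-pe * x)) / (-np.expm1(-pe))

# Value of T at a distance delta/L = 1/Pe upstream of the outflow boundary.
layer_values = {}
for pe in (10.0, 100.0, 1000.0):
    layer_values[pe] = float(outflow_layer_profile(1.0 - 1.0 / pe, pe))
    print(f"Pe = {pe:6.0f}:  T(1 - 1/Pe) = {layer_values[pe]:.4f}")
```

In each case the solution has fallen to roughly 1/e of the imposed outflow value within a distance 1/Pe, so a mesh must resolve a layer that thins in proportion to 1/Pe.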
One of the advantages of the FEM (a property also shared with most spectral methods
but not by many finite difference methods) is that the approximate solutions are always
obtained subject to the same BC's as are appropriate to the continuum equations—no
more, no less; e.g., (i) when Neumann or mixed BC's are involved, they are
'automatically,' unambiguously, and consistently (although approximately) incorporated (as we
shall demonstrate) into the discretized equations; (ii) in the hyperbolic limit, $\kappa = 0$, the
PDE requires no BC's at any outflow region ($\mathbf{n} \cdot \mathbf{u} > 0$) of $\partial\Omega$, a situation that is again
'built-in' to the final FEM equations, as long as it is remembered that both H and q
must then also be zero. Hence, an appreciation and understanding of the right (and wrong)
BC's for the PDE's is especially useful in FEM approximations. In fact, the motivation
behind many of the above remarks is related to these issues—and more. If the analyst (or
'modeler') possesses a good understanding of the qualitative behavior of the continuum
equations and BC's, including limiting cases, s/he is in a much better position to plan the
experiments, create a good mesh, and finally, to understand and interpret the numerical
results—especially with regard to the frequently asked question,
'What went wrong?'
2.1.3 The Divergence (Conservation) Form
Since $\nabla \cdot \mathbf{u} = 0$, an equivalent form of (2.1-1) is (for $\mathbf{K} \to \kappa$ = constant, for simplicity)
$$\frac{\partial T}{\partial t} + \nabla \cdot (\mathbf{u}T) = \kappa \nabla^2 T + S$$
or
$$\frac{\partial T}{\partial t} + \nabla \cdot (\mathbf{u}T - \kappa \nabla T) = S, \tag{2.1-11}$$
which is called the (flux-) divergence form, since
$$\mathbf{q}_A = \mathbf{u}T \tag{2.1-12}$$
is the advective flux vector and
$$\mathbf{q}_D = -\kappa \nabla T \tag{2.1-13}$$
is the diffusive flux vector. That is, with $\mathbf{q}_T = \mathbf{q}_A + \mathbf{q}_D$, the total flux vector, (2.1-11) is
clearly
$$\frac{\partial T}{\partial t} + \nabla \cdot \mathbf{q}_T = S, \tag{2.1-14}$$
which is called a conservation form because integration over $\Omega$ gives directly, via the
divergence theorem, the following global conservation law:
$$\frac{d}{dt}\int T = \int S - \int_\Gamma \mathbf{n} \cdot \mathbf{q}_T; \tag{2.1-15}$$
i.e., the total energy (or mass if T represents a concentration or mass fraction) changes
(decreases) only by the net flux of T out of the domain through the boundary, except,
of course, for the source term.
Remark:
Here and hereafter, we often employ the abbreviated but convenient notation that $\int(\cdot)$
means integration of $(\cdot)$ over $\Omega$ and that $\int_\Gamma(\cdot)$ denotes integration of $(\cdot)$ over the boundary
of $\Omega$.
Now it is clear that the same global conservation law could also have been derived
from (2.1-1) because $\nabla \cdot \mathbf{u} = 0$. So, one may reasonably ask: What is the reason for
discussing the divergence form? The detailed answer will come later and is in two parts,
which we merely hint at now: (1) in the weak formulations of the transport equation, the
two forms thus far discussed—advective form and divergence form—can differ owing
to different natural BC's; and (2) in the spatially discretized equations, we generally do
not obtain $\nabla \cdot \mathbf{u} = 0$ pointwise, with the result that only the divergence form can assure
global conservation—an assertion we shall later prove. And this leads naturally to the
subject of the next section.
2.1.4 Conservation Laws
Often, one of the goals of approximate solutions to PDE's, in addition to the principal goal
of finding a cost-effective approximate solution that is close to the continuum solution,
is the assurance that the approximate solution will satisfy discrete approximations to
certain global conservation laws that are satisfied by the continuum solution and that are
basically independent of the 'local error'; i.e., they are satisfied on the coarsest of meshes.
The principal reason for this goal is the desire to attain stable and bounded numerical
solutions, independently of the issue of accuracy. This presumes, of course, that the PDE
solution is itself stable and bounded.
Toward this end then, we present next a brief discussion of the relevant conservation
laws for the AD equation, so that we can set our sights toward the proper goals when
later generating numerical approximations. The first of these, global conservation of T,
has already been derived in (2.1-15), which we restate in expanded form:
$$\frac{d}{dt}\int T = \int S - \int_\Gamma \mathbf{n} \cdot (\mathbf{u}T - \kappa \nabla T), \tag{2.1-16}$$
showing that internal transport (i.e., within $\Omega$) of T via the principal transport processes
(advection and diffusion) makes no contribution to the global change of T; it merely
redistributes it within $\Omega$.
Invoking BC (2.1-3) in (2.1-16) yields another equivalent form of the global energy
(enthalpy) conservation statement:
$$\frac{d}{dt}\int T = \int S + \int_{\Gamma_N}[q + H(\hat{T} - T)] + \int_{\Gamma_D} \kappa\frac{\partial T}{\partial n} - \int_\Gamma \mathbf{n} \cdot \mathbf{u}\,T, \tag{2.1-17}$$
in which the individual boundary contributions are more clearly displayed. The global
energy increases owing to (1) the internal heat source, (2) the applied heat flux on $\Gamma_N$,
(3) the convective heat flow, also on $\Gamma_N$, (4) inflow (for $\partial T/\partial n > 0$) of diffusive heat
flux owing to the specified value of T on $\Gamma_D$, and (5) net inflow of advective flux owing
to the velocity field (recall that $\mathbf{u} \cdot \mathbf{n} < 0$ for inflow).
Remark:
If a steady solution is sought for the somewhat special case of $\Gamma_D = \emptyset$, H = 0, and
$\mathbf{u} \cdot \mathbf{n} = 0$ on $\Gamma$, (2.1-17) yields a constraint on the data; i.e., it states that $0 = \int S + \int_\Gamma q$.
If this solvability condition is not satisfied, then the problem is ill-posed, and no solution
exists because the given data preclude a global balance and are thus inconsistent.
Another energy-like quantity that is often of interest is a quadratic one: How does
$E = \int T^2$, a positive-definite quantity, behave? (Note that $\int T$ could be well-behaved
even if T is locally 'poorly-behaved'; e.g., small regions of large negative T could be
cancelled by small regions of large positive T.) To answer this question, we first multiply
(2.1-11) by T and integrate over $\Omega$:
$$\int T\frac{\partial T}{\partial t} + \int T\,\nabla \cdot (\mathbf{u}T - \kappa \nabla T) = \int ST.$$
Application of the divergence theorem after an integration by parts of the two transport
terms yields, with $\nabla \cdot \mathbf{u} = 0$,
$$\frac{1}{2}\frac{d}{dt}\int T^2 = \int ST - \kappa \int \nabla T \cdot \nabla T - \frac{1}{2}\int_\Gamma \mathbf{n} \cdot (\mathbf{u}T^2 - \kappa \nabla T^2), \tag{2.1-18}$$
which merits the following
Remarks:
(1) If S > 0, then the source term will act to increase (decrease) E if T > 0 (< 0).
(2) Dissipation, the second term on the RHS, will (try to) decrease E monotonically
(because $\int \nabla T \cdot \nabla T \ge 0$) and is the reason that diffusional processes are called dissipative.
It is noteworthy that this type of 'damping' is present in the $T^2$ equation, but
not in the T equation; internal diffusion acts to equalize T, conserve its integral,
and decrease $\int T^2$, consistent with thermodynamics.
(3) The boundary terms show that $T^2$, like T, is subject to inflow/outflow along $\Gamma$ by
(again, like T) both transport processes.
(4) If $\mathbf{n} \cdot \mathbf{u} = 0$ (contained flow) and $\mathbf{n} \cdot \nabla T = 0$ on $\Gamma$ ('insulated' container), then we
have $d\int T/dt = \int S$ and $\frac{1}{2}\, d\int T^2/dt = \int ST - \kappa \int \nabla T \cdot \nabla T$. In a situation with no
source term, $\int T = \int T_0$, where $T_0(\mathbf{x})$ is the initial temperature, and E decays
monotonically, showing that $dE/dt \to 0$ and $T \to$ constant as $t \to \infty$; i.e., a steady
state will be attained in which the constant final temperature is the same as the
average initial temperature. For $S \ne 0$, a steady solution can clearly only be attained
if $\int S = 0$; sources and sinks must balance.
(5) If $\kappa = 0$ (pure advection) and $\mathbf{n} \cdot \mathbf{u} = 0$ on $\Gamma$, then the sourceless situation will
conserve all powers of T; i.e., it then follows that $\int T^m = \int T_0^m$, m = 1, 2, ....
(6) (2.1-18) can be used to prove uniqueness as follows: denote T as the difference
between two alleged solutions, whence the source term vanishes and the BC's (and
IC) are homogeneous: T = 0 on $\Gamma_D$ and $\kappa\, \partial T/\partial n + HT = 0$ on $\Gamma_N$; (2.1-18) then
becomes
$$\frac{1}{2}\frac{d}{dt}\int T^2 = -\kappa \int \nabla T \cdot \nabla T - \frac{1}{2}\int_\Gamma \mathbf{n} \cdot \mathbf{u}\, T^2 + \kappa \int_{\Gamma_D} T\frac{\partial T}{\partial n} - H\int_{\Gamma_N} T^2;$$
but T = 0 on $\Gamma_D$ and thus, if $\mathbf{n} \cdot \mathbf{u} \ge 0$ on $\Gamma_N$, $d\int T^2/dt \le 0$; finally, since T = 0 at
t = 0, so too is $\int T^2$; thus, $\int T^2 = 0$ is the only possible solution, which implies T = 0.
QED. Note that any inflow on $\Gamma_N$ ($\mathbf{n} \cdot \mathbf{u} < 0$) loses the negative semi-definiteness of
the RHS and precludes a simple uniqueness proof; there could presumably exist two
(or more) solutions, differing because the temperature at the inflow portions of $\Gamma_N$
could differ—it is 'uncontrolled' there. [In practice, however, the non-unique case
is sometimes 'solved' in the CFD laboratory—uniquely or not. In fact, even the
full Navier-Stokes equations and Boussinesq equations are often 'solved' in cases
wherein there is some inflow at an ostensibly outflow portion of the domain (an eddy
gets 'chopped in half' at $\Gamma_N$), thus bringing into $\Omega$ 'unknown' values of both velocity
and temperature—and the solutions usually look quite believable. C'est la vie.]
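Remark (4) above (contained flow, insulated boundary, no source) lends itself to a quick numerical check. The sketch below is not the Galerkin method of this book; it is a stand-in, a plain explicit finite-volume scheme for pure diffusion (u = 0) on a 1D bar with zero-flux ends, chosen because its discrete flux balance mimics (2.1-16) directly. The mesh size, time step, initial profile, and function name are all hypothetical choices.

```python
import numpy as np

def diffuse_insulated(T0, kappa, dx, dt, steps):
    """Explicit finite-volume diffusion on a 1D bar with zero-flux (insulated) ends."""
    T = T0.copy()
    total = [T.sum() * dx]            # discrete analogue of int T
    energy = [np.sum(T**2) * dx]      # discrete analogue of E = int T^2
    for _ in range(steps):
        flux = np.zeros(T.size + 1)                    # diffusive flux at cell faces
        flux[1:-1] = -kappa * (T[1:] - T[:-1]) / dx    # interior faces; end faces stay zero
        T = T - dt * (flux[1:] - flux[:-1]) / dx       # flux-divergence update
        total.append(T.sum() * dx)
        energy.append(np.sum(T**2) * dx)
    return T, np.array(total), np.array(energy)

# Hypothetical setup: unit-length bar, hot region occupying x < 0.3.
n = 50
dx = 1.0 / n
kappa = 1.0
dt = 0.25 * dx**2 / kappa             # well inside the explicit stability limit
x = (np.arange(n) + 0.5) * dx         # cell centers
T0 = np.where(x < 0.3, 1.0, 0.0)

T, total, energy = diffuse_insulated(T0, kappa, dx, dt, steps=20000)

print("drift in int T        :", np.max(np.abs(total - total[0])))
print("largest increase in E :", np.max(np.diff(energy)))
print("final T range         :", T.min(), T.max(), " initial mean:", T0.mean())
```

The integral of T stays fixed to round-off, E never increases, and the final state is uniform at the initial mean, which are the three properties stated in the remark.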
These results can be regarded as some goals for the approximate (numerical)
solutions. We will later return to these conservation issues after deriving the numerical
approximations—both semi-discrete, which lead to a set of ordinary differential equations
(ODE's) in time, and fully discrete, in which a time-marching method has been selected.
2.1.5 Weak Forms of the PDE's/Natural Boundary Conditions
The next step toward a Galerkin FEM solution is to recast the governing PDE—either
(2.1-1) or (2.1-11)—into the weak (or Galerkin) form, sometimes also referred to as a
variational form. Here and hereafter, when we speak (loosely, sometimes) of the weak
form of an equation (PDE), we are usually referring to the final result of a weak
formulation of the spatial part of the problem (PDE plus BC's); i.e., weak forms generally come
with BC's. The weak form can be derived in several ways, but below we offer mainly
one—one that is usefully heuristic even if not totally rigorous in all situations. (The 'end
of the day'/'bottom line' results, however, are rigorous). We also state at the outset that
while the classical statement of a problem (PDE + BC's + IC's, also often referred to as
an IBVP—Initial Boundary Value Problem) is generally unique and unambiguous, there
is usually no unique weak statement of the same problem. But while there may exist
alternate weak formulations of a given problem, they are actually equivalent—at least when
a classical solution exists, in which case the solution is said to be 'sufficiently smooth.'
Some weak formulations, however, are more useful than others because (at least) they
more efficiently and more 'naturally' take account of the BC's. Part of the 'game,'
therefore, is to find the most appropriate weak form—a task that is often non-obvious and
non-trivial—especially when we consider the NS equations in the next chapter. (Thus,
the FDM problems of 'how to discretize each operator and how to treat each term at a
boundary?' is replaced by the FEM problem of 'selection of the weak form.') Another,
and very important, attribute of weak solutions is this:
a great many physically interesting and seemingly well-posed problems possess weak
solutions but do not possess classical solutions.
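Beginning with (2.1-1) then, we first suppose that we have a solution, T(x, t), that
satisfies this PDE. It is then clear that $w[\partial T/\partial t + \mathbf{u} \cdot \nabla T - S - \nabla \cdot (\mathbf{K} \cdot \nabla T)] = 0$, and
therefore that
$$\int w\left(\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T\right) = \int w[S + \nabla \cdot (\mathbf{K} \cdot \nabla T)] \tag{2.1-19}$$
is also satisfied for all (bounded, which we assume) functions, w(x). The next step is
to immediately narrow the class of so-called test functions, {w}, so that w is at least
once-differentiable. This restriction on w permits the use of the following identity:
$$\nabla \cdot [w(\mathbf{K} \cdot \nabla T)] = w\nabla \cdot (\mathbf{K} \cdot \nabla T) + \nabla w \cdot (\mathbf{K} \cdot \nabla T),$$
which we integrate over $\Omega$ and apply the divergence theorem to the LHS to obtain
$$\int \nabla \cdot [w(\mathbf{K} \cdot \nabla T)] = \int_\Gamma w\,\mathbf{n} \cdot (\mathbf{K} \cdot \nabla T) = \int w\nabla \cdot (\mathbf{K} \cdot \nabla T) + \int \nabla w \cdot (\mathbf{K} \cdot \nabla T);$$
i.e., we have obtained, via integration-by-parts and application of the divergence theorem,
the following important result:
$$\int w\nabla \cdot (\mathbf{K} \cdot \nabla T) = \int_\Gamma w\,\mathbf{n} \cdot (\mathbf{K} \cdot \nabla T) - \int \nabla w \cdot (\mathbf{K} \cdot \nabla T), \tag{2.1-20}$$
whose 'importance' will soon become clear. Using (2.1-20), (2.1-19) becomes
$$\int \left[w\left(\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T\right) + \nabla w \cdot (\mathbf{K} \cdot \nabla T)\right] = \int wS + \int_\Gamma w\,\mathbf{n} \cdot (\mathbf{K} \cdot \nabla T), \tag{2.1-21}$$
in which the diffusive normal boundary flux is now prominent, and is one reason that
(2.1-20) is important. Namely, recalling now the Neumann (Robin) BC, given by (2.1-3),
leads to
$$\int \left[w\left(\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T\right) + \nabla w \cdot (\mathbf{K} \cdot \nabla T)\right] = \int wS + \int_{\Gamma_D} w\,\mathbf{n} \cdot (\mathbf{K} \cdot \nabla T) + \int_{\Gamma_N} w[q - H(T - \hat{T})], \tag{2.1-22}$$
where we have separated the boundary integral into two parts, one over $\Gamma_D$ and the other
over $\Gamma_N$, in order to (naturally) incorporate the Neumann BC; recall that $\Gamma_D + \Gamma_N = \Gamma$.
We are almost, but not quite, to the desired weak form. To finish, we now further
restrict the class of test functions, {w}, of which there is an infinite number, to those that
vanish on the Dirichlet portion of $\partial\Omega$; i.e., we now require w = 0 on $\Gamma_D$. Calling this class
of functions $H_0^1$, our final weak form of the advective form of the AD equation is (and
this is important) obtained by dropping the assumption that T(x, t) satisfies (2.1-1),
and instead considering it as an unknown function that need only be once piecewise-differentiable
[i.e., it no longer need satisfy (2.1-1)] and require (2.1-22) to hold for
every function, w(x) in $H_0^1$; i.e.,
Find $T(\mathbf{x}, t) \in H_E^1$ such that
$$\int \left[w\left(\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T\right) + \nabla w \cdot (\mathbf{K} \cdot \nabla T)\right] = \int wS + \int_{\Gamma_N} w[q - H(T - \hat{T})] \quad \forall w \in H_0^1,$$
which we rearrange to place the unknown boundary temperature on the LHS and the data
on the RHS: find T(x, t) in $H_E^1$ such that
$$\int \left[w\left(\frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T\right) + \nabla w \cdot (\mathbf{K} \cdot \nabla T)\right] + \int_{\Gamma_N} wHT = \int wS + \int_{\Gamma_N} w(q + H\hat{T}) \quad \forall w \in H_0^1, \tag{2.1-23}$$
where $H_E^1$ is that set of once piecewise-differentiable functions in $\Omega$ that satisfy the
essential BC, (2.1-2), on $\Gamma_D$.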
This is the final weak form of (2.1-1); it incorporates automatically BC (2.1-3) and
can be solved, in principle at least, for T(x, t) once the initial data, (2.1-4), are supplied
at t = 0. In fact, we now discard (2.1-1) through (2.1-3) and regard (2.1-23) as the
'God-given' form of the problem; and also note that T(x, t), the weak solution, can (but need
not) now reside in a larger function space than do solutions of (2.1-1), since the weak
solution need not even possess second spatial derivatives, at least in the classical sense. A
final comment on this weak formulation is that the BC (2.1-3) has been incorporated into
the solution in a (relatively) natural way and is the reason (or one reason, at least) that such
a BC is called a natural boundary condition (NBC); it (the Neumann BC for the Laplacian
operator) is 'natural' to this weak formulation—and (2.1-23) is the 'natural' weak form
of (2.1-1). See also Strang and Fix (1973) for further useful elucidation, including the
concept of 'completion' of the function space. [The actual origin of the term is in the
calculus of variations; e.g., 'Any BC in the boundary value problem which need not be
imposed on the set of admissible functions in the variational principle is said to be a
natural boundary condition. Other BC's are essential'—Stakgold (1979).]
It is interesting and instructive to reverse the procedure that 'generated' this weak
form—at least when this is permissible; i.e., when T(x, t) is sufficiently smooth. To this
end, we assume sufficient regularity and manipulate—a la (2.1-20)—the diffusion term
as follows:
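$$\begin{aligned}
\int_\Omega \nabla w\cdot(\mathbf{K}\cdot\nabla T) &= \int_\Omega \nabla\cdot(w\,\mathbf{K}\cdot\nabla T) - \int_\Omega w\,\nabla\cdot(\mathbf{K}\cdot\nabla T) \\
&= \int_\Gamma w\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) - \int_\Omega w\,\nabla\cdot(\mathbf{K}\cdot\nabla T) \\
&= \int_{\Gamma_N} w\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) - \int_\Omega w\,\nabla\cdot(\mathbf{K}\cdot\nabla T),
\end{aligned}$$
so that (2.1-23) becomes, after rearrangement,
$$\int_\Omega w\left[\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T - \nabla\cdot(\mathbf{K}\cdot\nabla T) - S\right] = \int_{\Gamma_N} w\,[q - H(T - \hat{T}) - \mathbf{n}\cdot(\mathbf{K}\cdot\nabla T)] \quad \forall w \in H_0^1. \tag{2.1-24}$$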
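Since this equation holds for all $w \in H_0^1$, it follows that it holds for that subset (say $\hat{H}_0^1$)
that vanishes on $\Gamma_N$ (as well as on $\Gamma_D$); i.e., for this subset, we have
$$\int_\Omega w\left[\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T - \nabla\cdot(\mathbf{K}\cdot\nabla T) - S\right] = 0 \quad \forall w \in \hat{H}_0^1.$$
But since even this subset contains an infinite number of functions (i.e., w is an arbitrary
function in $\hat{H}_0^1$), it follows that
$$\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T = \nabla\cdot(\mathbf{K}\cdot\nabla T) + S \quad \text{in } \Omega,$$
and we see that T(x, t) satisfies the original PDE. And this fact, with (2.1-24), leads
directly to
$$\int_{\Gamma_N} w\,[q - H(T - \hat{T}) - \mathbf{n}\cdot(\mathbf{K}\cdot\nabla T)] = 0 \quad \forall w \in H_0^1.$$
But again the set of test functions is of infinite dimension, and thus
$$\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) + H(T - \hat{T}) = q \quad \text{on } \Gamma_N$$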
is necessarily true, and we see also that T(x, t) satisfies the Neumann/Robin BC. But it also
satisfies the Dirichlet BC, (2.1-2), and the IC, (2.1-4), both by construction; i.e., T(x, t)
satisfies the original IBVP. Hence, we have just proven a special case of the following
general result: if the solution of a weak form of the problem is sufficiently smooth, then
that solution is also a classical solution of the same problem. The key word is IF, a word
that is missing when going the other direction—i.e., a classical solution is always also a
weak solution. This distinction is actually rather important in practice because most (or
at least many) problems posed do not satisfy all of the smoothness requirements in order
that a classical solution exists. (It is perhaps also worth pointing out that conventional
finite-difference approximate solutions to such problems can also converge only to a weak
or generalized solution in such cases.) As stated by Rektorys (1980, p. 377), 'The weak
solution ... represents a substantial generalization of the concept of a classical solution
of a differential equation with boundary conditions. However, the weak solution is a
generalization considerably more expressive as regards the range of problems, on the
one hand, and the assumptions imposed on the given data of the problem on the other
hand.' Finally, for an interesting 'formal' proof of the equivalence—at least for the steady
case—see Hughes (1987, pp. 4 and 60).
Weak solutions are also referred to as generalized solutions or solutions in a
distributional sense. If the classical solution exists, which requires (at least) smooth data (e.g.,
no delta functions in S or jumps in K) and a domain with a sufficiently smooth boundary
(e.g., no L-shaped domains are allowed), then the weak solution will also be a classical
solution, as shown above. If, however, the problem is not smooth enough, a strictly
classical solution (e.g., $\nabla^2 T$ exists everywhere) will not exist, whereas a weak solution usually
will. For further discussion regarding technical definitions of 'sufficiently smooth,' refer
to, for example, Strang and Fix (1973). We conclude by reiterating: classical solutions
are subsets of weak solutions; classical solutions [i.e., the solution to (2.1-1)] will always
satisfy (2.1-23), but solutions of (2.1-23) will not always satisfy (2.1-1).
Before taking the next step toward a finite element solution, we digress to consider
another weak formulation: if we start with the conservation form of the PDE, (2.1-11), to
generate the weak form, it may seem natural (although it is not necessary) to also integrate
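the other divergence term, $w\,\nabla\cdot(\mathbf{u}T)$, by parts, so that the total flux, $\mathbf{u}T - \mathbf{K}\cdot\nabla T$, is thus
so treated. The result is
$$\int_\Omega \left[ w\frac{\partial T}{\partial t} + \nabla w\cdot(\mathbf{K}\cdot\nabla T - \mathbf{u}T)\right] = \int_\Omega wS + \int_\Gamma w\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T - \mathbf{u}T),$$
in which the total flux appears in both domain and boundary integrals. This suggests,
properly, that if the Neumann/Robin BC was
$$\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T - \mathbf{u}T) + H(T - \hat{T}) = q \quad \text{on } \Gamma_N \tag{2.1-25}$$
instead of (2.1-6), then the appropriate weak formulation, generated from the divergence
form (2.1-11), would lead to:
Find T(x, t) in $H_E^1$ such that
$$\int_\Omega \left[ w\frac{\partial T}{\partial t} + \nabla w\cdot(\mathbf{K}\cdot\nabla T - \mathbf{u}T)\right] + \int_{\Gamma_N} wHT = \int_\Omega wS + \int_{\Gamma_N} w\,(q + H\hat{T}) \quad \forall w \in H_0^1, \tag{2.1-26}$$
rather than that given by (2.1-23).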
Remarks:
(1) If the advection term (in the flux-divergence form) was not integrated by parts, then
the resulting weak form would be equivalent to that derived earlier, with $\mathbf{u}\cdot\nabla T$
replaced by $\nabla\cdot(\mathbf{u}T)$ in (2.1-23), and would satisfy the natural BC implied by it;
i.e., (2.1-3) rather than (2.1-25) above.
(2) While either form of the PDE (advective or flux divergence) could actually be used
to solve the AD equation with either (natural) BC—(2.1-3) or (2.1-25)—in the weak
form, the former is a more natural choice if the BC is (2.1-3) and the latter if it is
(2.1-25).
(3) We will have more to say regarding the choice of weak form and associated BC's
in Section 2.4 after we 'discretize the weak form' via the finite element method.
(4) The weak forms of the scalar transport equation presented above contain several
important and simpler (usually) special cases; e.g., (i) u = 0 gives the transient heat
equation (parabolic equation) in a weak form, (ii) u = 0 and $\partial T/\partial t = 0$ gives a weak
form of a Poisson problem (elliptic equation), which, in the special case of $\Gamma_N = \Gamma$
and H = 0, leads to the previously-stated solvability condition, $\int_\Omega S + \int_\Gamma q = 0$, a
global heat balance requirement that is obtained directly from (2.1-23) by setting
w = 1 there, and (iii) K = 0 gives the pure advection equation (hyperbolic) in a
weak form, which also requires q = 0 and H = 0.
(5) We shall concentrate mainly on the first weak form, (2.1-23), for reasons that will
be explained later.
(6) Alternate weak forms could be obtained by either not integrating the diffusion term
by parts or integrating it by parts a second time, thereby shifting all of the derivatives
to test functions. These options will not be examined herein since they are less
relevant to our purpose; see, for example, Hopf (1950), Ladyzhenskaya (1969).
(7) If both q and H are zero, then the boundary integral vanishes, and the 'data
preparation' involves 'no action' (no data input) on $\Gamma_N$, resulting in the sometimes-used
jargon of a 'do-nothing' BC.
(8) See also Strang and Fix (1973, p. 70) and Carey and Oden (1983, p. 4) for further
discussion of the non-uniqueness of weak formulations.
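As a quick check of special case (ii) in Remark (4), the solvability condition can be recovered in one line. This is only a sketch; it assumes the steady, pure-diffusion version of (2.1-23) with $\Gamma_N = \Gamma$ and H = 0, so that there is no Dirichlet boundary and the constant function w = 1 is an admissible test function:

```latex
\underbrace{\int_\Omega \nabla(1)\cdot(\mathbf{K}\cdot\nabla T)}_{=\,0}
  \;=\; \int_\Omega (1)\,S \;+\; \int_\Gamma (1)\,q
\quad\Longrightarrow\quad
\int_\Omega S + \int_\Gamma q = 0 .
```

In words: with no temperature level fixed anywhere on the boundary, a steady state can exist only if the net internal source balances the net boundary flux.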
A final remark on existence and uniqueness for the steady-state AD equation: if
and only if $\Gamma_N = \emptyset$ or if $\mathbf{n}\cdot\mathbf{u} \ge 0$ on $\Gamma_N$, some powerful 'machinery' from functional
analysis (the Lax-Milgram theory in particular; e.g., Axelsson and Barker (1984, p. 158)
and Johnson (1987)) applies to (2.1-23): it then assures us that a solution exists and that
it is unique. Powerful and useful as this may be (and is), it does not say, as indeed the
classical uniqueness 'proof' presented in Remark (6) following (2.1-18) (for the time-dependent
case) did not say, that if $\mathbf{n}\cdot\mathbf{u} < 0$ on some portion of $\Gamma_N$ a unique solution
does not exist; it simply becomes silent in that case, being 'merely' a sufficient
condition. All we can say for sure when portions of $\Gamma_N$ display inflow is that we may be
flirting with danger and should proceed cautiously and suspiciously. However, to end
this section on a relatively high note, in spite of these theoretical 'deficiencies' for our
non-self-adjoint problems in fluid mechanics, we remark that when the Lax-Milgram
theory applies, our GFEM solution will possess existence and uniqueness because it
resides in a proper subspace of the continuum solution; i.e., the matrix will be guaranteed
invertible.
2.2 THE FINITE ELEMENT EQUATIONS/DISCRETIZATION
OF THE WEAK FORM
2.2.1 Advective Form
We now address the issue of 'solving'—albeit approximately—the weak form of the
problem, (2.1-23), and mention up front that thus far it is far from obvious that solving
(2.1-23) is any easier than solving (2.1-1). But it is the weak form upon which the FEM—a
weighted residual method (see Finlayson and Scriven, 1966) or a projection method—is
based. The FEM is a general and systematic technique for obtaining approximate solutions
of weak forms; i.e., it is a method of 'discretizing' the weak form with the result that
the underlying function spaces become finite—and thus amenable to representation via
computer. Of course, the resulting solution in this 'truncated' function space is only an
approximation to the true weak solution; i.e., of the solution to (2.1-23) in this case.
In addition, the approximate solution is based on the approximation of functions—be
they given or need to be found—via (usually) piecewise polynomials defined on the
spatial domain of the problem. Different FEM solutions—by which here and hereafter
we always mean approximate solutions to a given continuous problem—arise from the
use of different piecewise polynomials and/or different weak forms, all of which are
ostensibly trying to solve the same IBVP. But once the choice of weak form is made and
once the choice of piecewise polynomial (e.g., linear, quadratic, or cubic) is made, there
are virtually no more choices available to the analyst—except, of course, the difficult and
important one of how many and what distribution of piecewise polynomials are to be used;
i.e., the number and distribution of nodes/elements in the mesh. Bothersome issues, such
as 'How should I treat this term or that term near and/or at this curved boundary, or at that
corner?' are simply not present. They are part of the 'package deal' mentioned by Strang
and Fix (1973); the rest of the 'recipe' is well defined—just turn the crank (which crank
admittedly may sometimes be somewhat resistant, and other times generate less-than-ideal
results) to generate the complete—and usually quite good—spatial approximation.
The piecewise polynomials of the FEM are also called basis functions or shape
functions and are said to 'span the space': any function in this finite-dimensional subspace is
presumed to be representable by an appropriate linear combination of these basis
functions. When the test functions, {w}, are also represented by a linear combination of the
same basis functions used to approximate the solution, which we assume to be the case
herein, the FEM that evolves is called the Galerkin FEM or GFEM. If the test
functions differ from the basis functions, we have a so-called Petrov-Galerkin method, which
leads to different numerical approximations; we will say a little more about some of
these methods later. So we now have our next task—namely to apply the GFEM to the
weak form of the AD equation given by (2.1-23). Since the integrals in (2.1-23) involve
no derivatives of higher order than one, we can (and do) employ the simplest class of
piecewise polynomials called $C^0$ functions (zeroth derivatives are continuous; $C^1$ functions
also have continuous first derivatives, etc.); the basis functions are piecewise continuous
and linearly independent, and their first derivatives, while discontinuous (they typically
suffer jumps at node points), are square integrable—and that is all that is needed for the
terms (namely, the diffusion term) in (2.1-23) to 'make sense'; i.e., they can be evaluated.
Second- and higher-order derivatives are not required to 'make sense' nor even to exist.
The next step toward a GFEM solution is to represent the unknown function, T(x, t),
in (2.1-23) as a linear combination of (known) basis functions (piecewise polynomials)
with unknown amplitude coefficients that are to be determined in such a way that the
resulting approximate solution function, which we call Th(x, t), represents T(x, t) from
(2.1-23) in a reasonable way. The generic symbol h is used both to represent a typical
(or maximum) element size (length) on the discrete mesh and to remind us that we are
henceforth dealing with an approximate solution: $T^h(\mathbf{x}, t) \ne T(\mathbf{x}, t)$ from (2.1-23), but
we hope that $T^h(\mathbf{x}, t) - T(\mathbf{x}, t)$ is 'small.' Thus, we write
$$T^h(\mathbf{x}, t) = \tilde{T}(\mathbf{x}, t) + \sum_{j=1}^{N} T_j(t)\,\phi_j(\mathbf{x}), \tag{2.2-1}$$
where $\phi_j$ is the j-th (global) finite element basis function (with, however, compact
support), $T_j(t)$ is the j-th unknown (to-be-determined) amplitude coefficient, N is the
number of nodes (in $\Omega$ and on $\Gamma_N$) at which $T^h$ is to be determined, and $\tilde{T}(\mathbf{x}, t)$ is a given
function (to be discussed in detail below) whose purpose is to ensure that $T^h(\mathbf{x}, t)$
satisfies the Dirichlet (essential) BC, (2.1-2), since a property of the $\{\phi_j\}$, inherited from
that of {w}, is that $\phi_j = 0$ for $\mathbf{x} \in \Gamma_D$; i.e., for points located on $\Gamma_D$, (2.2-1) gives
$T^h(\mathbf{x}, t) = \tilde{T}(\mathbf{x}, t) \approx T_D$ of (2.1-2). (It may be worthwhile to emphasize that N is not
the total number of nodes in the mesh; it does not include those on $\Gamma_D$.)
Another useful and important property of the $\{\phi_j\}$ is that they are a Lagrange
interpolating basis of $C^0$ piecewise polynomials; i.e., $\phi_j(\mathbf{x}_i) = \delta_{ij}$, the Kronecker delta,
where $\mathbf{x}_i$ is the location of node i. This property, combined with a similar representation
of $\tilde{T}(\mathbf{x}, t)$ which we shall discuss below, endows the discrete system of equations with
the convenient property (as with finite difference methods) that the numerical value of
the amplitude coefficient, $T_j(t)$, is also the value of $T^h(\mathbf{x}, t)$ at $\mathbf{x} = \mathbf{x}_j$; i.e., the numbers
that come out at the other end are the values of the approximate solution at the nodal
points. (This 'convenience feature' is not an essential part of the FEM and will, in some
cases to be discussed later when dealing with the NS equations, be waived.)
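The Lagrange (Kronecker delta) property can be illustrated with a minimal sketch in Python. It assumes the simplest possible setting, piecewise-linear 'hat' functions on a non-uniform 1D mesh; the function names `hat` and `fe_interpolant` and the sample mesh are illustrative choices, not notation from the text.

```python
def hat(j, x, nodes):
    """Piecewise-linear C0 Lagrange basis function phi_j on a 1D mesh.

    nodes is an increasing list of node coordinates (an assumed 1D mesh).
    """
    xj = nodes[j]
    # Rising branch on the element to the left of node j (if there is one).
    if j > 0 and nodes[j - 1] <= x <= xj:
        return (x - nodes[j - 1]) / (xj - nodes[j - 1])
    # Falling branch on the element to the right of node j (if there is one).
    if j < len(nodes) - 1 and xj <= x <= nodes[j + 1]:
        return (nodes[j + 1] - x) / (nodes[j + 1] - xj)
    return 0.0  # compact support: zero on all other elements


def fe_interpolant(coeffs, x, nodes):
    """Evaluate sum_j coeffs[j] * phi_j(x), a linear combination of hats."""
    return sum(c * hat(j, x, nodes) for j, c in enumerate(coeffs))


if __name__ == "__main__":
    nodes = [0.0, 0.3, 0.5, 1.0]
    coeffs = [2.0, -1.0, 4.0, 0.5]
    for i, xi in enumerate(nodes):
        # Because phi_j(x_i) = delta_ij, the amplitude IS the nodal value.
        print(i, fe_interpolant(coeffs, xi, nodes), coeffs[i])
```

Because each hat function is one at its own node and zero at all others, evaluating the expansion at a node returns exactly that node's amplitude coefficient, which is the 'convenience feature' described above.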
Next we deal with the 'function space issue' represented by the expression '$\forall w \in H_0^1$'
in (2.1-23). First, we note that since our approximation lives in a finite-dimensional
subspace of $H^1$, of dimension N, we need only (can only, in fact) force (2.1-23) to be
satisfied for an analogous finite-dimensional subspace of $H_0^1$, also of dimension N. This
is accomplished as follows: since each $w^h(\mathbf{x})$, drawn from a subset of {w}, is representable as a linear
combination of the N basis functions, $\{\phi_j\}$, it suffices to enforce (2.1-23) only for each
of these particular test functions; i.e., '$\forall w \in H_0^1$' is equivalent to and replaced by the
finite-dimensional version, 'for $w^h = \phi_i$, $i = 1, 2, \ldots, N$.' [See Becker et al. (1981) and
Hughes (1987) for alternative derivations/explanations. See also Remark (6) below.]
Thus, the finite-dimensional/GFEM statement of (2.1-23) is obtained by inserting
(2.2-1) into the finite-dimensional analog of (2.1-23) to obtain the following set of ordinary
differential equations (ODE's) for the amplitude coefficients (nodal values of T):
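$$\sum_{j=1}^{N}\left\{ \dot{T}_j \int_\Omega \phi_i\phi_j + T_j\left[\int_\Omega \left(\phi_i\,\mathbf{u}\cdot\nabla\phi_j + \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla\phi_j)\right) + \int_{\Gamma_N} H\phi_i\phi_j\right]\right\}$$
$$= \int_\Omega \phi_i S + \int_{\Gamma_N} \phi_i\,(q + H\hat{T}) - \left\{\int_\Omega \left[\phi_i\left(\frac{\partial \tilde{T}}{\partial t} + \mathbf{u}\cdot\nabla\tilde{T}\right) + \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla\tilde{T})\right] + \int_{\Gamma_N} \phi_i H\tilde{T}\right\}$$
$$\text{for } i = 1, 2, \ldots, N, \tag{2.2-2}$$
where $\dot{T}_j \equiv dT_j/dt$, and we note that the entire term in curly brackets on the RHS is
actually a formal method of enforcing the essential BC, (2.1-2), and is not nearly so
cumbersome in practice as it appears at first glance.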
Further Remarks:
(1) The solution of (2.2-2) approximates that of (2.1-23), which represents a
generalized/weak solution of (2.1-1) through (2.1-4).
(2) The entire approximate solution (once IC's are set) of the scalar transport problem
is contained in this set of equations, both at all points in $\Omega$ [via solving (2.2-2) for
nodal values and by using (2.2-1) elsewhere] and at all points on $\Gamma_N$ [where T is
also an unknown function owing to the derivative BC, (2.1-3)].
(3) Hopefully, the dual use of the symbol N, for Neumann ($\Gamma_N$) and for the number
of nodal unknowns, will not cause a problem.
(4) It is noteworthy (and significant, and perhaps even somewhat amazing) that none of
the individual basis functions satisfies the NBC of (2.1-3), yet the solution of (2.2-2)
will do so, albeit approximately (as indeed is the entire solution an approximate one)
and more closely as N is increased, even when $\Gamma_N$ has a complicated shape. This
is, in fact, one of the major advantages of approximating the weak form rather than
the strong form. [See Strang and Fix (1973), for more detailed discussions of the
theory behind such 'unstable' BC's.]
(5) The ODE's become algebraic equations ($\dot{T}_j = 0$) if the steady AD equation is being
solved via the GFEM—a linear system of N equations in N unknowns.
(6) An 'implicit' method of obtaining (2.2-2) that is sometimes used goes as follows: in
the finite-dimensional subspace associated with (2.1-23), the generic test function,
$w^h$, can be represented as $w^h = \sum_{i=1}^{N} a_i\phi_i = \sum_{i=1}^{N} w^h(\mathbf{x}_i)\,\phi_i(\mathbf{x})$, and the statement,
'for every $w^h \in H_0^1$' is replaced by 'where the $a_i$ are arbitrary,' which leads to the
following version of (2.2-2): $\sum_{i=1}^{N} a_i\{\text{LHS} - \text{RHS}\} = 0$, where LHS is the left-hand
side of (2.2-2), etc.; and (2.2-2) then follows immediately since the $a_i$'s are arbitrary
coefficients.
(7) For elucidation of our opinion on the subject of the GFEM equation formulation via
the process of 'element assembly', or looping through the elements, and the frequent
confusion that it has sometimes caused, see the first part of Appendix 2.
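To make Remarks (4) and (5) concrete, the following is a minimal sketch of the steady GFEM in one dimension. It is not the authors' code: it assumes a uniform mesh of linear elements on (0, 1), constant u, k, and S, a Dirichlet node at x = 0, and a Robin/Neumann condition $k\,dT/dx + H(T - \hat{T}) = q$ at x = 1. The names `steady_gfem_1d` and `solve_dense` are hypothetical.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            fac = aug[r][c] / aug[c][c]
            for m in range(c, n + 1):
                aug[r][m] -= fac * aug[c][m]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][m] * x[m] for m in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def steady_gfem_1d(n_el, u, k, S, T_D, q, H, T_hat):
    """Steady 1D advection-diffusion, u T' = k T'' + S, on (0, 1).

    Assumed setup: T = T_D at x = 0 (essential BC) and
    k T' + H (T - T_hat) = q at x = 1 (natural BC).
    Returns the nodal values at all n_el + 1 nodes.
    """
    h = 1.0 / n_el
    n_tot = n_el + 1
    A = [[0.0] * n_tot for _ in range(n_tot)]
    f = [0.0] * n_tot
    adv = [[-u / 2, u / 2], [-u / 2, u / 2]]    # int phi_a u dphi_b/dx
    dif = [[k / h, -k / h], [-k / h, k / h]]    # int dphi_a/dx k dphi_b/dx
    for e in range(n_el):                       # element assembly loop
        for a in range(2):
            f[e + a] += S * h / 2               # int phi_a S (constant S)
            for b in range(2):
                A[e + a][e + b] += adv[a][b] + dif[a][b]
    A[-1][-1] += H                              # boundary matrix at x = 1
    f[-1] += q + H * T_hat                      # natural-BC data
    # Essential BC: no test function at node 0; its known value goes to the RHS.
    A_red = [row[1:] for row in A[1:]]
    f_red = [f[i] - A[i][0] * T_D for i in range(1, n_tot)]
    return [T_D] + solve_dense(A_red, f_red)


if __name__ == "__main__":
    for n_el in (4, 8, 16):
        T = steady_gfem_1d(n_el, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0)
        flux = (T[-1] - T[-2]) * n_el           # k dT^h/dx in the last element
        print(n_el, T[-1], flux)
```

For pure diffusion with S = 1, q = 0, and H = 0, the exact solution is $T = x - x^2/2$. In this setting the nodal values are reproduced exactly, whereas the discrete boundary flux in the last element equals h/2 rather than the prescribed q = 0: the natural BC is satisfied only approximately, and more closely as the mesh is refined.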
It is interesting and perhaps fruitful to show how the C° approximation actually 'tries'
to satisfy both the original PDE and flux-continuity between elements. To do this
expeditiously and without loss of generality, we consider the simpler (homogeneous) version
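of (2.2-2) in which $\tilde{T}$, H, and q are all zero, so that (2.2-2) becomes simply
$$\int_\Omega \left[\phi_i\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h\right) + \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla T^h)\right] = \int_\Omega \phi_i S \quad \forall i.$$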
Then, following Hughes (1987, p. 68), we break up the global integral into a sum over
elements (which is, in fact, the way most codes are actually written), focus on one of the
elements containing node i, and integrate the diffusion term by parts as follows:
$$\int_{\Omega_e} \nabla\phi_i\cdot\mathbf{K}\cdot\nabla T^h = \int_{\Gamma_e} \phi_i\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T^h) - \int_{\Omega_e} \phi_i\,\nabla\cdot(\mathbf{K}\cdot\nabla T^h),$$
which is legitimate even for a $C^0$ approximation because our basis functions are smooth
within $\Omega_e$. (They are, in fact, $C^\infty$ there.) Next we realize that for each element boundary,
$\Gamma_e$, that is internal to the domain (not on $\Gamma$, the boundary of the global domain, $\Omega$), the
boundary integral will be generated twice as we loop through the elements, once from
each side of $\Gamma_e$. But both n and $T^h$ (and thus $\nabla T^h$) are different in each of these two
boundary integrals (but not $\phi_i$), the former merely in sign (n is outward-pointing in each
element) and the latter because $\nabla T^h$ is computed from different 'data.' The net result
from each pair of boundary integrals is, upon summation over elements, a jump in the
heat flux across each element boundary since, after all, it is precisely on $\Gamma_e$ where the
(normal) derivative of $T^h$ is discontinuous. Denoting each jump by $[[\mathbf{n}\cdot\mathbf{K}\cdot\nabla T^h]]$ gives,
for each i (node),
$$\sum_e \left\{ \int_{\Gamma_e} \phi_i\,[[\mathbf{n}\cdot\mathbf{K}\cdot\nabla T^h]] + \int_{\Omega_e} \phi_i\left[\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - \nabla\cdot(\mathbf{K}\cdot\nabla T^h) - S\right]\right\} = 0,$$
where the summation is, in effect, only over those elements containing node i (typically
a four-patch in 2D). The interpretation of this result is that, at every node, $T^h$ from
the GFEM is satisfying both the original PDE and flux continuity, both weakly. [The
Euler-Lagrange equations associated with the above 'variational' statement are (2.1-1)
and (2.1-3), the latter applying on all internal boundaries (with q = H = 0) as well as on
the domain boundary. See Hughes (1987) for further discussion.]
Remarks:
(1) It is remarkable how many misleading papers have appeared in the finite element
literature in which the jump term was claimed to be zero—typically via a statement
like '... and all interior boundary integral terms cancel upon summation ...'
(2) We shall return to this issue, and generalize it, in Chapter 4.
Before moving on to the discussion of the solution of (2.2-2), we must address two
more issues: (i) the function $\tilde{T}(\mathbf{x}, t)$, and (ii) IC's.
The main job of $\tilde{T}(\mathbf{x}, t)$, as alluded to earlier, is to ensure that the approximate solution
satisfies (closely if not exactly) the essential/Dirichlet/stable BC of (2.1-23); there is no
'free lunch' for these BC's. Wait and Mitchell (1985, pp. 88-91) present an interesting
sample problem in which a comparison is made of 'blending functions' (which exactly
satisfy the essential BC's) and finite element, piecewise-polynomial basis functions (which
interpolate the BC's and are therefore exact only at the nodes). The result is that both
'work' quite well, and neither is clearly superior—and both preserve the overall 'order'
of accuracy. Another analysis, and with a similar conclusion (at least for polynomial
domains), was done by Fix et al. (1983); they compared the boundary interpolant with a
least-squares fit (L2 projection; see Appendix 3) using the same basis functions. We shall
follow common practice and use the same class of piecewise polynomials to interpolate
$\tilde{T}(\mathbf{x}, t)$ for $\mathbf{x} \in \Gamma_D$ that are used to approximate the solution (and the test functions, in
the Galerkin method), a procedure that may enforce a certain degree of regularity upon
$T_D(\mathbf{x} \in \Gamma_D)$. That is, we take
$$\tilde{T}(\mathbf{x}, t) = \sum_{j=N+1}^{N_T} T_D(\mathbf{x}_j, t)\,\phi_j(\mathbf{x}) \quad \text{for } \mathbf{x}_j \in \Gamma_D, \tag{2.2-3}$$
where $N_T$ is the total number of nodes in the finite element mesh, and we quickly note that
the implied ordering/numbering of the nodes (i.e., $j = 1, 2, \ldots, N, N+1, N+2, \ldots, N_T$)
is definitely not appropriate for solution by the computer; it merely
simplifies the presentation of the 'theory.'
The advantages of this choice for the interpolation of TD are several:
1. Simplicity; it is 'natural' to the FEM technique, and code writing is much easier.
2. All of the amplitude coefficients, $\{T_j(t)\}$, both those in $\Omega$ and those on $\partial\Omega = \Gamma_D + \Gamma_N$,
represent the value of the function $T^h(\mathbf{x}, t)$ at the nodes. (This is not true if blending
or other functional forms are employed.)
3. The function $\tilde{T}(\mathbf{x}, t)$ is of compact support; it is non-zero only on those elements that
are contiguous to $\Gamma_D$ and zero elsewhere. The bracketed term on the RHS of (2.2-2) is
thus zero over most of the domain.
For other methods (not recommended by us) of enforcing Dirichlet BC's—Lagrange
multipliers, penalty methods, least-squares methods—see Strang and Fix (1973).
Turning finally to the subject of initial conditions, we mention that again at least one
alternative to 'interpolation via the basis functions' exists, but that there is usually not a
sufficiently compelling reason to introduce this more complicated technique, which is:
Compute the 'consistent' IC's by setting $T^h(\mathbf{x}, 0) = T_0(\mathbf{x})$ weakly; i.e., from (2.2-1) we
obtain
$$\int_\Omega T^h(\mathbf{x}, 0)\,\phi_i = \int_\Omega \tilde{T}(\mathbf{x}, 0)\,\phi_i + \sum_{j=1}^{N} T_j(0)\int_\Omega \phi_j\phi_i = \int_\Omega T_0(\mathbf{x})\,\phi_i \quad \text{for } i = 1, 2, \ldots, N, \tag{2.2-4}$$
which is an N x N linear system for $\{T_j(0)\}$. We leave as an exercise the proof that this
is the same $T^h(\mathbf{x}, 0)$ that minimizes the following functional,
$$Q = \int_\Omega [T^h(\mathbf{x}, 0) - T_0(\mathbf{x})]^2, \tag{2.2-5}$$
where $T^h(\mathbf{x}, 0)$ is again expressed via (2.2-1) and (2.2-3). The initial values thus obtained
will generally not agree with $T_0(\mathbf{x})$ at the nodal points, but the resulting $T^h(\mathbf{x}, 0)$ will be as
close as possible, in the least squares sense, to $T_0(\mathbf{x})$ in $\Omega$; $T^h(\mathbf{x}, 0)$ is an $L^2$-projection
of $T_0(\mathbf{x})$ (see also Appendix 3). While this IC computation is indeed more consistent,
we shall generally again follow precedent/common practice, and simply interpolate the
initial data via
$$T_j(0) = T_0(\mathbf{x}_j), \quad j = 1, 2, \ldots, N, \tag{2.2-6}$$
which again simplifies code writing and is usually sufficiently accurate (indeed, the
error is zero at each node, so that the only error is that caused by interpolation). Note
too that (2.2-6) also obtains from (2.2-4) simply by approximating $T_0(\mathbf{x})$ itself via the
interpolant, a quite reasonable procedure, usually; the best fit to the interpolant is the
interpolant.
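The difference between the projected IC of (2.2-4) and the interpolated IC of (2.2-6) can be seen in a small sketch. The assumptions here are ours: a uniform 1D mesh of linear elements on (0, 1) with no Dirichlet nodes (so the $\tilde{T}$ term is absent), and 2-point Gauss-Legendre quadrature for the right-hand side. The names `l2_project` and `interpolate_ic` are hypothetical.

```python
import math


def l2_project(T0, n_el):
    """L2 projection of T0 onto linear elements on a uniform mesh of (0, 1).

    Solves M T = b with the consistent (tridiagonal) mass matrix M and
    b_i = int T0 phi_i, evaluated by 2-point Gauss-Legendre quadrature.
    """
    h = 1.0 / n_el
    n = n_el + 1
    lower = [0.0] * n
    diag = [0.0] * n
    upper = [0.0] * n
    b = [0.0] * n
    g = 1.0 / math.sqrt(3.0)                 # Gauss abscissa on [-1, 1]
    for e in range(n_el):
        # Element mass matrix (h/6) * [[2, 1], [1, 2]].
        diag[e] += h / 3
        diag[e + 1] += h / 3
        upper[e] += h / 6
        lower[e + 1] += h / 6
        for xi in (-g, g):                   # both weights equal 1
            x = (e + 0.5 * (1.0 + xi)) * h
            b[e] += 0.5 * h * T0(x) * 0.5 * (1.0 - xi)
            b[e + 1] += 0.5 * h * T0(x) * 0.5 * (1.0 + xi)
    # Thomas algorithm; safe here because M is SPD and tridiagonal.
    for i in range(1, n):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        b[i] -= m * b[i - 1]
    T = [0.0] * n
    T[-1] = b[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        T[i] = (b[i] - upper[i] * T[i + 1]) / diag[i]
    return T


def interpolate_ic(T0, n_el):
    """Nodal interpolation of T0, as in (2.2-6)."""
    return [T0(j / n_el) for j in range(n_el + 1)]


if __name__ == "__main__":
    T0 = lambda x: x * x
    for p, q in zip(l2_project(T0, 4), interpolate_ic(T0, 4)):
        print(p, q)
```

When the initial data already lie in the finite element space (a linear function here), the two approaches coincide. For curved data such as $x^2$, the projected nodal values differ from the pointwise values, which is the trade-off discussed in the text.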
Final Remarks on IC's:
(1) Only the $L^2$-projected IC of (2.2-4) can always [and easily; see, for example,
Johnson (1987, p. 151)] be shown, at least for the heat equation (u = 0), to satisfy
the following stability 'condition' (which is also satisfied by the PDE for S = 0,
q = 0, and H = 0): $\|T^h(t)\| \le \|T^h(0)\| \le \|T_0(\mathbf{x})\|$, properly reflecting its dissipative
behavior.
(2) Only the $L^2$-projected IC can successfully (and easily) deal with arbitrary rough data
($T_0(\mathbf{x}) \in L^2$) such as: $T_0(\mathbf{x})$ is a smooth function (say $C^0$ or $C^1$) except at a finite
number (or a countably infinite number) of points in $\Omega$, where it takes the value
of, say, 1000. Whereas the $L^2$-projection does not even 'see' these points [they live
in a set of measure zero in the term $\int_\Omega \phi_i T_0(\mathbf{x})$], the interpolated IC, which would
presumably require locating a node at each point, would definitely see them and
generate a very 'bad' solution, since the weak solution of the PDE would also
[necessarily, at least via the form presented in (2.2-2)] ignore them. Thus, in using
(2.2-6) in the sequel, we must implicitly agree to preclude such irregular initial data,
or else explicitly revert to (2.2-4).
(3) If $T_0(\mathbf{x})$ and $T_D(\mathbf{x}, 0)$ disagree at any nodal points on $\Gamma_D$, then the BC must prevail;
i.e., it is necessary that $T_j(0) = T_D(\mathbf{x}_j, 0)$ for all nodes on $\Gamma_D$. (If the IC and BC
are the same on $\Gamma_D$, then there is no jump there at t = 0, and the resulting solution
will be smoother.)
(4) If $T_0(\mathbf{x})$ is sufficiently smooth, yet another projected initial condition is also possible
(but not used in practice, to our knowledge): the $H^1$ projection; see, e.g., Thomee
(1984) and Appendix 3.
The total GFEM problem has now been posed; namely, using (2.2-3), solve (2.2-2)
for Tj(t) with IC's obtained from (2.2-6), and (optionally, and rarely done in practice)
use (2.2-1) to obtain Th(x, t), the full finite-element solution. But 'solve (2.2-2)' is easier
said than done, even though we now have only a finite number of unknowns. We will
thus later devote a fair amount of attention to methods for solving the ODE's of (2.2-2),
but first we shall spend some time studying the ODE system that has been generated. To
begin, we rewrite the GFEM problem in the more compact matrix-vector form
$$\mathbf{M}\dot{\mathbf{T}} + [\mathbf{N}(\mathbf{u}) + \mathbf{K}]\mathbf{T} = \mathbf{f} \quad \text{for } t > 0, \tag{2.2-7}$$
where $\mathbf{T} = (T_1, T_2, \ldots, T_N)^T$ is an N-vector of the nodal values, $T_j(t)$, which satisfy
$T_j(0) = T_0(\mathbf{x}_j)$ at t = 0. Also, M, N(u), and K are sparse N x N matrices ($i, j = 1, 2, \ldots, N$):
$$M_{ij} = \int_\Omega \phi_i\phi_j \tag{2.2-8}$$
is the mass matrix,
$$N_{ij}(\mathbf{u}) = \int_\Omega \phi_i\,\mathbf{u}(\mathbf{x}, t)\cdot\nabla\phi_j \tag{2.2-9}$$
is the advection matrix,
$$K_{ij} = K_{ij}^D + K_{ij}^B, \tag{2.2-10}$$
where
$$K_{ij}^D = \int_\Omega \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla\phi_j) \tag{2.2-11}$$
is the diffusion matrix, and
$$K_{ij}^B \equiv \int_{\Gamma_N} H\phi_i\phi_j \tag{2.2-12}$$
is the boundary matrix representing the contribution of the Robin BC.
Finally, f is an N-vector that comprises the entire RHS of (2.2-2); i.e., it incorporates
the internal source term, the specified boundary heat flux, the remainder (specified portion)
of the Robin BC, and it contains information that couples the Dirichlet BC (including the
time derivative) to the rest of the problem (the term in curly brackets).
Remarks:
(1) The boundary matrix, $K_{ij}^B$, is zero for most i, j; it is only non-zero for those nodes {i} on $\Gamma_N$ that 'see'
node j (via the support of the basis function).
(2) M is symmetric and positive-definite (SPD), and causes $(\partial T^h/\partial t)(\mathbf{x}, t)$ to be a best
(least-squares) fit to the data: $\nabla\cdot(\mathbf{K}\cdot\nabla T^h) + S - \mathbf{u}\cdot\nabla T^h$ and the NBC of (2.1-3).
It is sometimes referred to as the finite-element version of the identity matrix; see,
for example, Wathen (1991), or 'the variational equivalent of the identity operator',
in Karniadakis et al. (1993).
(3) K is always symmetric; it is SPD unless $\Gamma_D = \emptyset$ and H = 0; i.e., K is symmetric
but singular if Neumann data prevail on all of $\partial\Omega$, a rare occurrence in practice.
(4) Both M and K (when SPD) possess all positive eigenvalues, because all positive-definite
matrices do.
(5) N(u) is unsymmetric and indefinite, and its eigenvalues are complex in general, and
purely imaginary if N is skew-symmetric; it is also time-dependent when u is. N(u)
is always 'close,' in some sense, to being skew-symmetric, 'because' $\mathbf{u}\cdot\nabla$ is a
skew-symmetric operator.
(6) Variable coefficients, especially u(x, t) or, perhaps more commonly, u(x), are
usually interpolated via the basis functions before performing the integrations in
(2.2-9).
(7) We will often, with apologies, use the bad notations that $\mathbf{T}^T$ is the transpose of
the temperature vector, that N is both the advection matrix and the length of the
T-vector, and that $N_T$ refers to the total number of nodes.
(8) Some of these matrices are tabulated—at element level—in Appendix 1.
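Several of the matrix properties in the Remarks above can be checked numerically with a minimal sketch. The assumptions are ours: linear elements on a uniform 1D mesh of (0, 1), constant u and k, H = 0, and assembly over all nodes with no Dirichlet rows removed (so that K is the singular, all-Neumann case of Remark (3)). The function name `assemble_1d` is hypothetical.

```python
def assemble_1d(n_el, u, k):
    """Assemble mass, advection, and diffusion matrices for 1D linear elements.

    Uniform mesh on (0, 1), constant velocity u and conductivity k; all
    n_el + 1 nodes are kept (no essential BC is applied in this sketch).
    """
    h = 1.0 / n_el
    n = n_el + 1
    M = [[0.0] * n for _ in range(n)]
    N = [[0.0] * n for _ in range(n)]
    K = [[0.0] * n for _ in range(n)]
    Me = [[h / 3, h / 6], [h / 6, h / 3]]       # int phi_a phi_b
    Ne = [[-u / 2, u / 2], [-u / 2, u / 2]]     # int phi_a u dphi_b/dx
    Ke = [[k / h, -k / h], [-k / h, k / h]]     # int dphi_a/dx k dphi_b/dx
    for e in range(n_el):
        for a in range(2):
            for b in range(2):
                M[e + a][e + b] += Me[a][b]
                N[e + a][e + b] += Ne[a][b]
                K[e + a][e + b] += Ke[a][b]
    return M, N, K


if __name__ == "__main__":
    M, N, K = assemble_1d(4, 2.0, 1.0)
    n = len(M)
    # N + N^T vanishes except at the two boundary nodes.
    sym_part = [[N[i][j] + N[j][i] for j in range(n)] for i in range(n)]
    for row in sym_part:
        print(row)
```

In this setting M and K come out symmetric, the rows of K sum to zero (a constant temperature lies in its null space when no Dirichlet data are imposed), and N + N^T is zero except for the two boundary-node diagonal entries, which is one concrete sense in which N is 'close' to skew-symmetric.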
This may be a good time to attempt to define what we like to refer to as 'honest
GFEM':
1. Perform the integrals in the above matrix definitions 'as accurately as possible'; do
not use so-called reduced quadrature (the quadrature is typically Gauss-Legendre; use a higher than
'minimum' quadrature rule; see Leone et al. (1979));
2. Do not cheat on the mass matrix via lumping (we shall later address 'mass lumping'
in some detail);
3. When it comes to time integration (Section 2.7), use a non-dissipative method—at least
for advection-dominated cases.
2.2.2 Divergence Form
It is now a very simple matter to write the GFEM equations in flux-divergence form, a
la (2.1-11); just change the definition of the advection matrix, from (2.2-9) to
$$N_{ij}(\mathbf{u}) = \int_\Omega \phi_i\,\nabla\cdot[\phi_j\,\mathbf{u}(\mathbf{x}, t)]. \tag{2.2-13}$$
But since $\nabla\cdot(\phi_j\mathbf{u}) = \mathbf{u}\cdot\nabla\phi_j + \phi_j\nabla\cdot\mathbf{u}$ and the velocity field is allegedly divergence-free,
one may properly ask, 'Why bother with the divergence form since the results are
the same?' While the detailed answer can only be provided after we have discussed the
GFEM solution of the NS equations in the next chapter, it is appropriate to point out
here that $\nabla\cdot\mathbf{u} \ne 0$ when u is obtained from the approximate (GFEM) solution of the
NS equations, and that the velocity field that drives the scalar transport equation often,
if not usually, is obtained from just these equations. So we must face the case where
the velocity divergence is small but not zero. (The velocity is generally only discretely
divergence-free.) Hence, we do not require that $N_{ij}$ from (2.2-13) be the same as $N_{ij}$
from (2.2-9). The consequence of this is that only the use of (2.2-13) can assure global
conservation of T in the GFEM solution, an assertion that we shall soon prove.
If, of course, the Robin BC was (2.1-25) rather than (2.1-3), then the GFEM would (or
at least, should) be based on the weak form given by (2.1-26) rather than on that given
by (2.1-23). The resulting semi-discretized equations would differ from those in (2.2-2)
in the following ways:
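1. $\int_\Omega \phi_i\,\mathbf{u}\cdot\nabla\phi_j$ is replaced by $-\int_\Omega \nabla\phi_i\cdot(\phi_j\mathbf{u})$; i.e., in this case, $N_{ij}(\mathbf{u}) = -\int_\Omega \phi_j\,\mathbf{u}\cdot\nabla\phi_i$,
vis-a-vis (2.2-9) and (2.2-13).
2. The same replacement must be made in the advection part of the Dirichlet BC term on
the RHS; i.e., $\int_\Omega \phi_i\,\mathbf{u}\cdot\nabla\tilde{T}$ is replaced by $-\int_\Omega \tilde{T}\,\mathbf{u}\cdot\nabla\phi_i$.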
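The difference between the advective form (2.2-9) and the divergence form (2.2-13) can be made visible in a small 1D sketch, where a spatially varying velocity stands in for a velocity field that is not exactly divergence-free. The assumptions are ours: a uniform mesh of linear elements on (0, 1), u interpolated from nodal values via the basis functions, 2-point Gauss-Legendre quadrature (exact for these integrands), and all nodes retained. The names `advection_matrices` and `column_sums` are hypothetical.

```python
import math


def advection_matrices(u_nodes):
    """Return (N_adv, N_div) on a uniform 1D mesh of (0, 1), linear elements.

    N_adv[i][j] = int phi_i u dphi_j/dx        (advective form)
    N_div[i][j] = int phi_i d(phi_j u)/dx      (divergence form)
    with u interpolated linearly from the nodal values u_nodes.
    """
    n = len(u_nodes)
    h = 1.0 / (n - 1)
    N_adv = [[0.0] * n for _ in range(n)]
    N_div = [[0.0] * n for _ in range(n)]
    g = 1.0 / math.sqrt(3.0)                   # Gauss abscissa on [-1, 1]
    dphi = (-1.0 / h, 1.0 / h)                 # element shape-function slopes
    for e in range(n - 1):
        u0, u1 = u_nodes[e], u_nodes[e + 1]
        dudx = (u1 - u0) / h                   # 1D analog of div u
        for xi in (-g, g):
            phi = (0.5 * (1.0 - xi), 0.5 * (1.0 + xi))
            u = u0 * phi[0] + u1 * phi[1]
            w = 0.5 * h                        # Gauss weight times Jacobian
            for a in range(2):
                for b in range(2):
                    adv = w * phi[a] * u * dphi[b]
                    N_adv[e + a][e + b] += adv
                    N_div[e + a][e + b] += adv + w * phi[a] * phi[b] * dudx
    return N_adv, N_div


def column_sums(A):
    """Sum each column over all rows, i.e. test with w = sum_i phi_i = 1."""
    n = len(A)
    return [sum(A[i][j] for i in range(n)) for j in range(n)]


if __name__ == "__main__":
    N_adv, N_div = advection_matrices([1.0, 1.2, 0.9, 1.1, 1.0])
    print(column_sums(N_adv))
    print(column_sums(N_div))
```

Summing over all rows corresponds to using the constant test function w = 1. For interior columns the divergence-form sums vanish, leaving only boundary flux contributions, whereas the advective-form sums do not vanish when the velocity varies. This is only a 1D illustration of why the divergence form is the one associated with global conservation.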
2.2.3 Conservation Laws
We now attempt to mimic the analyses presented in Section 2.1.4, this time for the semi-
discrete system of GFEM equations. But before we can do so conveniently, we will
modify/generalize/augment our GFEM in the following way (see, for example, Mizukami,
1986; Gresho et al., 1987): rather than stating the problem a la (2.2-1), (2.2-2), and (2.2-3);
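i.e., find $T^h(\mathbf{x}, t)$ in $\Omega$ and on $\Gamma_N$ from
$$\int \left[ \phi_i \left( \frac{\partial T^h}{\partial t} + \mathbf{u} \cdot \nabla T^h \right) + \nabla\phi_i \cdot (\mathbf{K} \cdot \nabla T^h) \right] + \int_{\Gamma_N} \phi_i H T^h = \int \phi_i S + \int_{\Gamma_N} \phi_i (q + H\hat{T}) \quad \text{for } i = 1, 2, \ldots, N, \tag{2.2-14}$$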
we generalize this weak form in three ways:
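1. Replace $\mathbf{u} \cdot \nabla T^h$ by $\mathbf{u} \cdot \nabla T^h + \beta T^h \nabla \cdot \mathbf{u}$, where the scalar $\beta$ will be defined (and
discussed) below.
2. Introduce a new unknown, the diffusive heat flux into $\Omega$ through $\Gamma_D$ (on which $T^h$ is
specified, and which is $\mathbf{n} \cdot \mathbf{K} \cdot \nabla T$ in the continuum), as follows:
$$q_D^h(\mathbf{x}, t) = \sum_{j=N+1}^{N_T} q_{D_j}(t)\, \phi_j(\mathbf{x}), \tag{2.2-15}$$
where the $\{q_{D_j}\}$ are to be determined. Note that $q_D^h$ is a continuous function, whereas
$\mathbf{n} \cdot \mathbf{K} \cdot \nabla T^h$ is not.
3. Increase the size of the space of test functions, from those in $\Omega$ and on $\Gamma_N$ to those in
$\Omega$ and on $\Gamma = \Gamma_D + \Gamma_N = \partial\Omega$.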
The generalized/augmented weak form is then: find $T^h(\mathbf{x}, t)$ in $\Omega$ and on $\Gamma_N$, and find
$q_D^h$ on $\Gamma_D$, from
$$\int \left[ \phi_i \left( \frac{\partial T^h}{\partial t} + \mathbf{u} \cdot \nabla T^h + \beta T^h \nabla \cdot \mathbf{u} \right) + \nabla\phi_i \cdot (\mathbf{K} \cdot \nabla T^h) \right] + \int_{\Gamma_N} \phi_i H T^h = \int \phi_i S + \int_{\Gamma_N} \phi_i (q + H\hat{T}) + \int_{\Gamma_D} \phi_i q_D^h \quad \text{for } i = 1, 2, \ldots, N_T, \tag{2.2-16}$$
where (still) $T^h$ is given by (2.2-1) and its Dirichlet part by (2.2-3), and we immediately point out that
(2.2-16) naturally 'decomposes' into two sets of equations: the first set given by (2.2-14),
which (as before) can be used to solve the $N$ ODE's for $T^h(\mathbf{x}, t)$ from the first $N$ equations;
and the second set by the last $N_T - N$ algebraic equations of (2.2-16), which can be used
to solve for the $N_T - N$ values of $q_{D_j}$ (with $T^h$ known). The reason this decomposition
occurs is that the first $N$ equations are independent of the rest (the converse, of course, is
not true). The reason we introduced this additional complexity is that it is a nice way to
ensure that the total GFEM solution ($T^h$ in $\Omega$ and on $\Gamma_N$, and $q_D^h$ on $\Gamma_D$) can be made to
satisfy a global energy balance, as we demonstrate below, after making some additional
Remarks:
(1) $q_D^h$ from (2.2-15) and (2.2-16) is called the consistent heat flux because, in addition
to yielding global energy conservation (shown below), it is the only heat flux that
permits reversibility; i.e., if the Dirichlet BC, $T = T_D$ on $\Gamma_D$, were to be replaced
by a Neumann BC, $\mathbf{n} \cdot \mathbf{K} \cdot \nabla T = q_D$ with $q_D$ specified on $\Gamma_D$, then only $q_D^h$ as
computed from (2.2-15) and (2.2-16) would produce the same $T^h$ as did the original
problem. This is an important point even if somewhat 'theoretical' in that the proper $q_D$
can only be determined (for the continuum case as well) by first solving the problem
with $T_D$ specified.
(2) The actual value of $q_D^h$ is seen to depend on much more than just the normal
component of $\mathbf{K} \cdot \nabla T$ on $\Gamma_D$, at least on a finite mesh; but in the limit of $h \to 0$ ($N_T \to \infty$),
all of the terms in the last $N_T - N$ equations of (2.2-16) would vanish ($\to 0$) except
$\int \nabla\phi_i \cdot (\mathbf{K} \cdot \nabla T^h)$ on the LHS and $\int_{\Gamma_D} \phi_i q_D^h$ on the RHS, with the final result that
$q_D = \mathbf{n} \cdot \mathbf{K} \cdot \nabla T$.
(3) We shall have much more to say about the consistent heat flux (and other 'boundary'
quantities) in Chapter 4.
We will return to these (non-obvious) issues later; for now we just allege their veracity
so that we can get on with the problem at hand—the derivation of global conservation
laws. To this end, we now note the final reason for introducing the generalized problem
of (2.2-16): the sum of all $N_T$ basis functions is unity,
$$\sum_{i=1}^{N_T} \phi_i(\mathbf{x}) = 1.0 \quad \forall \mathbf{x}, \tag{2.2-17}$$
a result that is crucial to the establishment of global conservation statements. (Note that
$\sum_{i=1}^{N} \phi_i(\mathbf{x}) \ne 1$ near $\Gamma_D$.) The important property (2.2-17), which can also be stated in
the more 'elegant' form that 'the constant function is a member of the test space,' leads
easily to the following result when all $N_T$ equations of (2.2-16) are summed (equivalently,
set $\phi_i = 1$):
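$$\int \left( \frac{\partial T^h}{\partial t} + \mathbf{u} \cdot \nabla T^h + \beta T^h \nabla \cdot \mathbf{u} \right) + \int_{\Gamma_N} H T^h = \int S + \int_{\Gamma_N} (q + H\hat{T}) + \int_{\Gamma_D} q_D^h,$$
which we rearrange to
$$\frac{d}{dt} \int T^h = \int S + \int_{\Gamma_N} H(\hat{T} - T^h) + \int_{\Gamma_D} q_D^h + \int_{\Gamma_N} q - \int (\mathbf{u} \cdot \nabla T^h + \beta T^h \nabla \cdot \mathbf{u}), \tag{2.2-18}$$
and note that all but the last term on the RHS are in the desired form [cf. (2.1-17)]; namely,
the second term is the heat input from Newton's 'law of cooling,' the third is the heat
flux into $\Gamma_D$ that results from the specified temperature there, and the fourth term is the
original applied heat flux on $\Gamma_N$. To finish, we use $\int \mathbf{u} \cdot \nabla T^h = \int \nabla \cdot (\mathbf{u}T^h) - \int T^h \nabla \cdot \mathbf{u} = \int_\Gamma \mathbf{n} \cdot \mathbf{u}T^h - \int T^h \nabla \cdot \mathbf{u}$ to obtain, finally,
$$\frac{d}{dt} \int T^h = \int S + \int_{\Gamma_N} [q + H(\hat{T} - T^h)] + \int_{\Gamma_D} q_D^h - \int_\Gamma \mathbf{n} \cdot \mathbf{u}T^h + (1 - \beta) \int T^h \nabla \cdot \mathbf{u}, \tag{2.2-19}$$
which, upon identifying $q_D^h$ as $K\,\partial T/\partial n$, the diffusive (only) flux through $\Gamma_D$, now properly
mimics (2.1-17) except for the last term, which should, but does not, vanish unless
$\nabla \cdot \mathbf{u} = 0$ or $\beta = 1$. Since, as mentioned earlier, we often must solve the scalar transport
equation using velocity fields that have small (hopefully) but indefinite divergence, we
conclude that for these cases, it is necessary to set $\beta = 1$ if we wish to assure global
conservation of our scalar field, $T^h$.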
But since $\nabla \cdot (\mathbf{u}T^h) = \mathbf{u} \cdot \nabla T^h + T^h \nabla \cdot \mathbf{u}$, we see from (2.2-16) that $\beta = 1$ is nothing
but the flux-divergence form of the advective term ($\beta = 0$ being the advective form).
Thus, while the advective form cannot assure (and will not attain) global conservation
of energy/enthalpy when $\nabla \cdot \mathbf{u} \ne 0$, the divergence form can and will. See Lee et al.
(1982) for some demonstrations of these facts in the case of an ideal fluid (zero diffusion
coefficient for the AD equation and zero viscosity in the Boussinesq equations, whose
computed velocity field drives the $T$-field).
So at this point it seems clear that the 'proper' GFEM for scalar transport that is driven
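by a GFEM-computed velocity field (or any other for which $\nabla \cdot \mathbf{u} \ne 0$) should not use
the simpler advective form; the flux divergence (conservation) form ($\beta = 1$) is clearly
preferred. Or is it? What about quadratic conservation, $E^h = \int (T^h)^2$, and the associated
stability/boundedness that conservation of $E^h$ would guarantee? To answer this question,
we must attempt to duplicate the steps that led to (2.1-18) for the continuum. We begin
with (2.2-16) again, and with the observation that $T^h$ is a linear combination of all ($N_T$)
basis functions. Thus, we form (in principle) that same linear combination of the $N_T$
equations of (2.2-16) to obtain (in fact, just replace $\phi_i$ by $T^h$, a perfectly legitimate test
function)
$$\int \left[ T^h \left( \frac{\partial T^h}{\partial t} + \mathbf{u} \cdot \nabla T^h + \beta T^h \nabla \cdot \mathbf{u} \right) + \nabla T^h \cdot (\mathbf{K} \cdot \nabla T^h) \right] + \int_{\Gamma_N} H (T^h)^2 = \int T^h S + \int_{\Gamma_N} T^h (q + H\hat{T}) + \int_{\Gamma_D} T^h q_D^h,$$
an equation that is also satisfied if (2.2-16) is [it is implied by (2.2-16)].
Next, recall BC (2.1-3) and rearrange the above equation to obtain
$$\frac{1}{2} \frac{d}{dt} \int (T^h)^2 = \int T^h S - \int \nabla T^h \cdot (\mathbf{K} \cdot \nabla T^h) + \int_{\Gamma_N} T^h \mathbf{n} \cdot (\mathbf{K} \cdot \nabla T^h) + \int_{\Gamma_D} T^h q_D^h - \int T^h (\mathbf{u} \cdot \nabla T^h + \beta T^h \nabla \cdot \mathbf{u}),$$
which is seen to agree, except for the advection term, with the continuum version, (2.1-18),
once we generalize ($k \to \mathbf{K}$) and then replace $\frac{1}{2} \int_\Gamma \mathbf{n} \cdot (\mathbf{K} \cdot \nabla T^2)$ by $\int_{\Gamma_N} T \mathbf{n} \cdot (\mathbf{K} \cdot \nabla T) + \int_{\Gamma_D} T q_D$ there.
The final step is
$$\int T^h \mathbf{u} \cdot \nabla T^h = \frac{1}{2} \int \mathbf{u} \cdot \nabla (T^h)^2 = \frac{1}{2} \int_\Gamma \mathbf{n} \cdot \mathbf{u} (T^h)^2 - \frac{1}{2} \int (T^h)^2 \nabla \cdot \mathbf{u}$$
to obtain
$$\frac{1}{2} \frac{d}{dt} \int (T^h)^2 = \int S T^h - \int \nabla T^h \cdot (\mathbf{K} \cdot \nabla T^h) - \frac{1}{2} \int_\Gamma \mathbf{n} \cdot [\mathbf{u} (T^h)^2 - \mathbf{K} \cdot \nabla (T^h)^2] + \left( \frac{1}{2} - \beta \right) \int (T^h)^2 \nabla \cdot \mathbf{u}, \tag{2.2-20}$$
in which agreement with (2.1-18), and the assurance of global conservation of
$\int (T^h)^2$ (and, when 'appropriate,' the associated assurance of a bounded solution), can
now only be obtained (when $\nabla \cdot \mathbf{u} \ne 0$) by choosing $\beta = \frac{1}{2}$! A dilemma, to be sure: for
$\nabla \cdot \mathbf{u} \ne 0$, $\beta = 0$ conserves 'nothing,' $\beta = \frac{1}{2}$ conserves $T^2$ but not $T$, and $\beta = 1$ conserves
$T$ but not $T^2$. Which $\beta$ (and associated form) should we choose, and why?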
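The dilemma can be made concrete with a small numerical sketch. The setting is an assumption made for illustration, not the text's: pure advection ($\mathbf{K} = 0$, $S = 0$) on a 1D periodic mesh of linear elements, so that $M\dot{T} = -N_\beta T$ with $N_\beta = N_A + \beta D$ and $D_{ij} = \int \phi_i \phi_j\, du/dx$. The names `assemble`, `N_A`, and `D` are hypothetical. The two printed rates are $d/dt \int T^h = -\mathbf{1}^T N_\beta T$ and $\frac{1}{2}\, d/dt \int (T^h)^2 = -T^T N_\beta T$.

```python
import numpy as np

def assemble(u_nodes, h):
    """Return (N_A, D) for 1D periodic linear elements:
    N_A[i, j] = int phi_i u dphi_j/dx,  D[i, j] = int phi_i phi_j du/dx."""
    n = len(u_nodes)
    N_A = np.zeros((n, n))
    D = np.zeros((n, n))
    pts = 0.5 + np.array([-0.5, 0.5]) / np.sqrt(3.0)   # 2-point Gauss on [0, 1]
    for e in range(n):
        idx = np.array([e, (e + 1) % n])
        ue = u_nodes[idx]
        dphi = np.array([-1.0, 1.0]) / h
        dudx = (ue[1] - ue[0]) / h
        for s in pts:
            phi = np.array([1.0 - s, s])
            w = 0.5 * h
            N_A[np.ix_(idx, idx)] += w * (phi @ ue) * np.outer(phi, dphi)
            D[np.ix_(idx, idx)] += w * dudx * np.outer(phi, phi)
    return N_A, D

n = 32
h = 1.0 / n
x = h * np.arange(n)
u_nodes = 1.0 + 0.3 * np.sin(2.0 * np.pi * x)   # velocity with du/dx != 0
T = 0.5 + np.cos(2.0 * np.pi * x)               # arbitrary nodal temperatures
N_A, D = assemble(u_nodes, h)
ones = np.ones(n)

results = {}
for beta in (0.0, 0.5, 1.0):
    N_beta = N_A + beta * D
    rate_T = -ones @ N_beta @ T     # d/dt of int T^h
    rate_E = -T @ N_beta @ T        # (1/2) d/dt of int (T^h)^2
    results[beta] = (rate_T, rate_E)
    print(f"beta = {beta}: d/dt int T = {rate_T:+.3e}, "
          f"(1/2) d/dt int T^2 = {rate_E:+.3e}")
```

Under these assumptions only $\beta = 1$ zeroes the first rate and only $\beta = 1/2$ zeroes the second, while $\beta = 0$ zeroes neither.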
In the experiments performed by Lee et al. (1982), these discouraging aspects of global
conservation when $\nabla \cdot \mathbf{u} \ne 0$ were indeed verified; see also Cliffe (1981). But they also
reported that they would still not switch from the simpler ($\beta = 0$) form. Why is this? Besides
computer costs (the conservative form is slightly more expensive than the advective form)
and laziness, the reasons are basically these:
1. Any decent (weak) solution of the NS equations, in which $\int \psi \nabla \cdot \mathbf{u} = 0$ $\forall \psi$, where
$\psi$ is a pressure test function (Chapter 3), will generate only small (and of variable
sign, generally) values of $\nabla \cdot \mathbf{u}$, so that the offending terms are probably always pretty
small, although they may cause instability if diffusion is small or absent.
2. Any real (physical) solution (i.e., one with non-zero diffusion coefficients) should
provide sufficient physical dissipation to control most potential instabilities related to
the (indefinite) term $\int (T^h)^2 \nabla \cdot \mathbf{u}$. Experience, both our own and that of many others,
suggests that this is indeed true, usually.
So, for now at least, we shall leave the issue of
'$\beta$-selection' open, except perhaps for the rare hyperbolic ($\mathbf{K} = 0$) case wherein $\beta = 1/2$
is to be preferred to ensure stability of the ODE's, an example of which we shall later
demonstrate in Section 2.8.1.
2.2.4 An Absolutely Conserving Form
In this section we examine the case $\beta = 1/2$ in a little more detail. We first mention
that the term 'absolutely conserving' was apparently introduced by Piacsek and Williams
(1970) in a finite difference context, and refers to what we have called quadratic
conservation: in the absence of diffusion and sources, and with no inflow or outflow ($\mathbf{n} \cdot \mathbf{u} = 0$),
(2.2-20) shows that $\dot{E}^h = d\int (T^h)^2/dt = 0$ if and only if $\beta = 1/2$ when $\nabla \cdot \mathbf{u} \ne 0$. If
$\beta \ne 1/2$, $E^h$ behaves in an indefinite (i.e., unpredictable) manner so that boundedness of
the approximate solution cannot be a priori guaranteed.
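It is interesting to first note that $\beta = 1/2$ is actually the average of the advective and
divergence forms; i.e.,
$$\mathbf{u} \cdot \nabla T^h + \tfrac{1}{2} T^h \nabla \cdot \mathbf{u} = \tfrac{1}{2} [\mathbf{u} \cdot \nabla T^h + \nabla \cdot (\mathbf{u}T^h)]. \tag{2.2-21}$$
Next, consider the matrix representation of this form; the advection matrix becomes, from
(2.2-9) and (2.2-13),
$$N^Q_{ij}(\mathbf{u}) = \frac{1}{2} \int \phi_i [\mathbf{u} \cdot \nabla\phi_j + \nabla \cdot (\mathbf{u}\phi_j)] = \frac{1}{2} \int (\phi_i \mathbf{u} \cdot \nabla\phi_j + \phi_i \mathbf{u} \cdot \nabla\phi_j + \phi_i \phi_j \nabla \cdot \mathbf{u}) = \int \phi_i \mathbf{u} \cdot \nabla\phi_j + \frac{1}{2} \int \phi_i \phi_j \nabla \cdot \mathbf{u}, \tag{2.2-22}$$
vis-a-vis (2.2-9) or (2.2-13); i.e., if the matrix of (2.2-9) is, after increasing its dimension
from $N \times N$ to $N_T \times N_T$, called $N_A$ (for advection form), and that of (2.2-22) is called
$N_Q$ (for quadratically conserving form), we have
$$N_Q = N_A + B, \tag{2.2-23}$$
where
$$B_{ij} = \frac{1}{2} \int \phi_i \phi_j \nabla \cdot \mathbf{u}. \tag{2.2-24}$$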
Remarks:
(1) It is interesting to first form the transpose of $N_Q$, $N_Q^T = N_A^T + B$ (since $B$ is symmetric), and then the sum,
$S = N_Q + N_Q^T = N_A + N_A^T + 2B$; i.e.,
$$S_{ij} = \int \phi_i \mathbf{u} \cdot \nabla\phi_j + \int \phi_j \mathbf{u} \cdot \nabla\phi_i + \int \phi_i \phi_j \nabla \cdot \mathbf{u} = \int \nabla \cdot (\phi_i \phi_j \mathbf{u}) - \int \phi_i \phi_j \nabla \cdot \mathbf{u} + \int \phi_i \phi_j \nabla \cdot \mathbf{u} = \int_\Gamma \phi_i \phi_j \mathbf{n} \cdot \mathbf{u}.$$
Thus, if $\mathbf{n} \cdot \mathbf{u} = 0$ on $\Gamma$, then $N_Q$ is skew-symmetric, $N_Q = -N_Q^T$; a useful
observation, and one that does not exist for $\beta = 0$ or $\beta = 1$ when $\nabla \cdot \mathbf{u} \ne 0$. Thus,
noting first from (2.1-18) or (2.2-20) for $\beta = 1/2$ that when there is no inflow
or outflow the advection process has no effect on $\int T^2$, we see that the skew-
symmetry of $N_Q$ assures the same since $x^T A x = 0$ for every vector $x$ when $A = -A^T$
[proven below in Remark (4)]. Here, $A = N_Q$ and $x = T$. Another property of skew-
symmetric matrices that is useful to know for CFD is that their spectra (eigenvalues)
are purely imaginary. Finally, it is important to note that $S_{ij} = \int_\Gamma \phi_i \phi_j \mathbf{n} \cdot \mathbf{u}$ obtains
even if $\nabla \cdot \mathbf{u} = 0$; i.e., even a solenoidal velocity field will only yield a skew-
symmetric advection matrix if either $\phi_i = 0$ (e.g., specified $T$) or $\mathbf{n} \cdot \mathbf{u} = 0$ on $\Gamma$.
(2) In matrix language, and (again/still) for the case with no inflow/outflow ($\mathbf{u} \cdot \mathbf{n} = 0$
on $\Gamma$), we have that $N_Q = (N_A - N_A^T)/2$ is the skew-symmetric part of $N_A$ and that
$-B = (N_A + N_A^T)/2$ is the symmetric part of $N_A$; the decomposition ($N_A = N_Q - B$)
is unique. (Note/recall also that the diagonal entries of a skew-symmetric
matrix are zero.)
(3) These results generalize as follows: any discrete (matrix) 'centered-difference-like'
approximation to $\mathbf{u} \cdot \nabla(\cdot)$ when $\mathbf{u}$ is spatially varying and $\mathbf{n} \cdot \mathbf{u} = 0$ on $\Gamma$, say
$N_G(\mathbf{u})$, can be uniquely decomposed into the sum of a skew-symmetric part,
$(N_G - N_G^T)/2$, and a symmetric part, $(N_G + N_G^T)/2$, the former 'truly' (more properly)
representing advection and the latter corresponding to (representing) $-\frac{1}{2}(\cdot)\nabla \cdot \mathbf{u}$. If,
however, the discrete approximation to $\mathbf{u} \cdot \nabla(\cdot)$ is dissipative, then the symmetric part
will always contain a term that looks like diffusion, and may or may not display a
$-\frac{1}{2}(\cdot)\nabla \cdot \mathbf{u}$ term; e.g., for pure upwinded advection in 1D, the 'usual' result obtains:
(i) the skew-symmetric part looks like centered advection and
(ii) the symmetric part looks like centered diffusion with diffusivity $u\Delta x/2$.
The eigenvalues of the 'true' advection matrix are purely imaginary (as they should
be when approximating the hyperbolic operator), and those of the symmetric matrix
are indefinite (because $\nabla \cdot \mathbf{u}$ is). Discarding the symmetric part guarantees that the
advection terms will not destabilize the ODE's because $T^T N_Q T = 0$. It would also
convert any dissipative advection approximation into a non-dissipative one.
(4) Theorem: For any real matrix, $A$, a necessary and sufficient condition that the
quadratic form $q = x^T A x$ vanish for all $x$ is that $A$ be skew-symmetric. Proof: First
note that $q = \frac{1}{2} x^T (A + A^T) x$ too, because the quadratic form of the skew-symmetric
part of a matrix, $\frac{1}{2}(A - A^T)$, is always zero.
(i) Sufficiency. If $A$ is skew-symmetric, $A^T + A = 0$.
(ii) Necessity. If $q = 0$, let $B = (A + A^T)/2$, the symmetric part of $A$, and consider
the eigenproblem $Bz = \lambda z$. Since $q = 0$, $z^T B z = \lambda z^T z = 0$ for every
eigenvector, which implies that all of the eigenvalues of $B$ are zero. But a symmetric matrix
with all zero eigenvalues must be the zero matrix. QED.
(5) The above theorem does not preclude the following possibility: for some non-trivial
$x$, $x^T A x = 0$ with $A$ not skew-symmetric, a situation that can actually occur in
practice. [See, for example, Gary (1979), who studied $u_t = u u_x$ and obtained $u^T N(u) u = 0$
using GFEM and linear basis functions, thus conserving energy but with $N^T(u) \ne -N(u)$.
We shall return to this point in the next chapter (Section 3.16).]
(6) For a recent 'spectral element' discussion of skew-symmetric advection, see Rønquist
(1996).
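The 1D upwind example in Remark (3) and the quadratic-form property in Remark (4) can be checked with a short sketch. The setting is an illustrative assumption: constant $u > 0$ on a uniform periodic grid, with first-order upwind differencing written as a matrix (the names `N_up`, `P`, and `Q` are hypothetical).

```python
import numpy as np

n, u, dx = 12, 1.0, 0.1            # illustrative values (assumptions)
I = np.eye(n)
P = np.roll(I, 1, axis=0)          # periodic shift: (P @ T)[i] == T[i-1]
Q = P.T                            # (Q @ T)[i] == T[i+1]

N_up = u * (I - P) / dx            # upwind: u * (T[i] - T[i-1]) / dx
skew = 0.5 * (N_up - N_up.T)       # skew-symmetric part
sym = 0.5 * (N_up + N_up.T)        # symmetric part

# Reference operators for comparison.
centered_advection = u * (Q - P) / (2.0 * dx)
centered_diffusion = (u * dx / 2.0) * (2.0 * I - P - Q) / dx**2

rng = np.random.default_rng(0)
x = rng.standard_normal(n)
quad_full = x @ N_up @ x           # equals the symmetric part's quadratic form
quad_sym = x @ sym @ x
quad_skew = x @ skew @ x           # vanishes for any x

max_real_part = np.abs(np.linalg.eigvals(skew).real).max()
min_sym_eig = np.linalg.eigvalsh(sym).min()

print(np.allclose(skew, centered_advection), np.allclose(sym, centered_diffusion))
print(quad_skew, max_real_part, min_sym_eig)
```

With constant $u$ there is no $\nabla \cdot \mathbf{u}$ contribution, so the symmetric part here is purely the numerical diffusion $u\Delta x/2$ and its eigenvalues are non-negative; with spatially varying $u$ the symmetric part would in general be indefinite.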
The generalized matrix-vector form of (2.2-7) is now
$$M\dot{T} + [N_Q(\mathbf{u}) + K]T = f, \tag{2.2-25}$$
where now $T$ is an $N_T$-vector that represents all of the nodes:
$$T = (T_1, T_2, \ldots, T_N, T_{N+1}, T_{N+2}, \ldots, T_{N_T})^T,$$
and the matrices $M$, $K$, and $N_Q$ are of size $N_T \times N_T$. Also, consistent with this
formulation, the $N_T$-vector $f$ is different: it does not contain the terms in curly brackets of
(2.2-2) because the Dirichlet part has been absorbed into $T^h$ (the $T$-vector 'contains' it), but it does
contain an additional term, that corresponding to the 'Dirichlet' heat flux term on the
RHS of (2.2-16). In fact, $f$ is the RHS of (2.2-16).
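We will now generate the $\beta = 1/2$ version of (2.2-20) from the matrix-vector form of
the equations, a procedure that may be regarded as an exercise, but hopefully a useful
one. Multiplication of (2.2-25) by $T^T$ (the transpose of the nodal temperature vector)
yields a scalar equation that is a combination of quadratic forms, and one linear form:
$$\frac{1}{2} \frac{d}{dt} T^T M T + T^T N_Q T + T^T K T = T^T f, \tag{2.2-26}$$
and we examine each term in turn:
1. $$T^T M T = \sum_{i,j=1}^{N_T} T_i M_{ij} T_j = \int \left( \sum_i T_i \phi_i \right) \left( \sum_j T_j \phi_j \right) = \int (T^h)^2.$$
2. $$T^T N_Q T = \sum_{i,j} T_i \left( \int \phi_i \mathbf{u} \cdot \nabla\phi_j \right) T_j + \frac{1}{2} \sum_{i,j} T_i \left( \int \phi_i \phi_j \nabla \cdot \mathbf{u} \right) T_j = \frac{1}{2} \int \mathbf{u} \cdot \nabla (T^h)^2 + \frac{1}{2} \int (T^h)^2 \nabla \cdot \mathbf{u}$$
$$= \frac{1}{2} \int \nabla \cdot [\mathbf{u} (T^h)^2] - \frac{1}{2} \int (T^h)^2 \nabla \cdot \mathbf{u} + \frac{1}{2} \int (T^h)^2 \nabla \cdot \mathbf{u} = \frac{1}{2} \int_\Gamma (T^h)^2 \mathbf{n} \cdot \mathbf{u}.$$
3. $$T^T K T = \sum_{i,j} T_i \left[ \int \nabla\phi_i \cdot (\mathbf{K} \cdot \nabla\phi_j) \right] T_j + \sum_{i,j} T_i \left[ \int_{\Gamma_N} H \phi_i \phi_j \right] T_j = \int \nabla T^h \cdot (\mathbf{K} \cdot \nabla T^h) + \int_{\Gamma_N} H (T^h)^2.$$
4. $$T^T f = \sum_i T_i \left[ \int \phi_i S + \int_{\Gamma_N} \phi_i (q + H\hat{T}) + \int_{\Gamma_D} \phi_i q_D^h \right] = \int T^h S + \int_{\Gamma_N} T^h (q + H\hat{T}) + \int_{\Gamma_D} T^h q_D^h.$$
Thus, (2.2-26) becomes
$$\frac{1}{2} \frac{d}{dt} \int (T^h)^2 + \frac{1}{2} \int_\Gamma (T^h)^2 \mathbf{n} \cdot \mathbf{u} + \int \nabla T^h \cdot (\mathbf{K} \cdot \nabla T^h) + \int_{\Gamma_N} H (T^h)^2 = \int S T^h + \int_{\Gamma_N} T^h (q + H\hat{T}) + \int_{\Gamma_D} T^h q_D^h, \tag{2.2-27}$$
which, using $\mathbf{n} \cdot \mathbf{K} \cdot \nabla T^h + H(T^h - \hat{T}) = q$ on $\Gamma_N$, reproduces (2.2-20) when $\beta = 1/2$
there. Since this is also the proper conservation law, we see (again) that the absolutely
conserving form does indeed properly 'conserve' $E^h$. It also follows that the quadratic
forms associated with the $N$-matrix for either the advective form ($\beta = 0$) or the divergence
form ($\beta = 1$) lead to indefinite forms, and global conservation of $E^h$ cannot then be assured
unless $\nabla \cdot \mathbf{u} = 0$.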
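The first two quadratic forms in this exercise can be verified numerically. The sketch below assumes a 1D non-periodic mesh of linear elements on $[0, 1]$ with all $N_T$ nodes retained; the mesh grading, velocity, nodal values, and the name `assemble_M_NQ` are illustrative assumptions. It checks that $T^T M T = \int (T^h)^2$ and that $T^T N_Q T$ reduces to the boundary term $\frac{1}{2} \int_\Gamma (T^h)^2 \mathbf{n} \cdot \mathbf{u}$, which in 1D is $\frac{1}{2}[u (T^h)^2]_{x=0}^{x=1}$.

```python
import numpy as np

GAUSS_PTS = 0.5 + np.array([-0.5, 0.5]) / np.sqrt(3.0)   # 2-point Gauss on [0, 1]

def assemble_M_NQ(x_nodes, u_nodes):
    """Consistent mass matrix M and quadratically conserving advection matrix
    N_Q[i, j] = int phi_i u dphi_j/dx + (1/2) int phi_i phi_j du/dx."""
    n = len(x_nodes)
    M = np.zeros((n, n))
    N_Q = np.zeros((n, n))
    for e in range(n - 1):
        idx = np.array([e, e + 1])
        h = x_nodes[e + 1] - x_nodes[e]
        ue = u_nodes[idx]
        dphi = np.array([-1.0, 1.0]) / h
        dudx = (ue[1] - ue[0]) / h
        M[np.ix_(idx, idx)] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        for s in GAUSS_PTS:
            phi = np.array([1.0 - s, s])
            N_Q[np.ix_(idx, idx)] += 0.5 * h * ((phi @ ue) * np.outer(phi, dphi)
                                                + 0.5 * dudx * np.outer(phi, phi))
    return M, N_Q

x_nodes = np.linspace(0.0, 1.0, 11) ** 1.5      # graded mesh (assumption)
u_nodes = 1.0 + x_nodes**2                      # non-solenoidal 1D velocity
T = np.cos(3.0 * x_nodes) + x_nodes             # arbitrary nodal temperatures
M, N_Q = assemble_M_NQ(x_nodes, u_nodes)

# Independent quadrature of int (T^h)^2, element by element.
integral_T2 = 0.0
for e in range(len(x_nodes) - 1):
    h = x_nodes[e + 1] - x_nodes[e]
    for s in GAUSS_PTS:
        Th = (1.0 - s) * T[e] + s * T[e + 1]
        integral_T2 += 0.5 * h * Th**2

mass_form = T @ M @ T
advection_form = T @ N_Q @ T
boundary_term = 0.5 * (u_nodes[-1] * T[-1]**2 - u_nodes[0] * T[0]**2)
print(mass_form, integral_T2, advection_form, boundary_term)
```

Because the quadrature is exact for these piecewise-polynomial integrands, the agreement holds to roundoff rather than to truncation error.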
The 'bottom line' from these results is as follows: if $\nabla \cdot \mathbf{u} \ne 0$,
1. It is not possible, in general, to conserve both $T$ and $T^2$.
2. If the problem is diffusion-dominated and/or good boundary heat fluxes and/or an exact
overall heat balance is important, $\beta = 1$ is the recommended choice (unless, of course,
it 'blows up,' which is probably unlikely under these conditions).
3. If the problem is advection-dominated (or, worse yet, the limiting case of a hyperbolic
problem), then $\beta = 1/2$ is advisable since a bounded solution may be difficult to obtain
otherwise.
4. For many (if not most) practical problems, however, $\beta = 0$ may be the best choice,
since it is simpler, and it does not necessarily follow that conservation forms lead to more
accurate approximate solutions of the PDE, at least when $\beta = 0$ and $\beta = 1$ generate
stable results. While instability $\Rightarrow$ inaccuracy, it does not follow that stability $\Rightarrow$ accuracy.
5. We have occasionally been 'burned' via the GFEM generation of unstable ODE's when
using the advective form, an example of which we shall present in Section 2.8.1; we thus
warn the reader that only $(N - N^T)/2$ can assure that the FEM ODE's will be stable, a
result that also applies to all other spatial discretizations.
6. Although various upwind and/or streamline diffusion advocates of advection seem
never to worry about the fact that their ODE's too are not guaranteed to be stable because
their advection matrix is also indefinite, it is probably the case that instability never
occurs simply because the extra dose of diffusion (numerical), with its positive (decaying)
eigenvalues, swamps the indefinite contribution from $-\frac{1}{2}(\cdot)\nabla \cdot \mathbf{u}$.
As a final remark, we mention that we will return to these issues (in Volume II) after
studying the NS and Boussinesq equations, and thereby muddy the waters of conservation
even further.
2.2.5 A Finite Difference Interpretation
Since the GFEM equations are in 'weighted residual' form, it is of some interest to
'undo' the Galerkin weighting so that the equations can be more readily interpreted
as finite difference equations, a procedure that can unfortunately lead also to
misinterpretations, as we shall demonstrate. In this section we will convert (2.2-2), or (2.2-7),
to an equivalent form that more readily permits such an interpretation.
Since the GFEM equations are formed by the process 'multiply by each basis (test)
function and integrate the result over the domain,' we now consider the effect of dividing
the final results by the same test functions integrated over the domain; i.e., by $\int \phi_i$. For
reasons that will become more clear later, we define
$$M_{L_{ij}} = \delta_{ij} \int \phi_i, \tag{2.2-28}$$
where $\delta_{ij}$ is the Kronecker delta. Whereas $\int \phi_i$, $i = 1, 2, \ldots, N_T$, is, of course, a vector,
this diagonal matrix representation is more convenient for our purposes, because often
$M_L$ is also the so-called lumped mass matrix, about which we will say more later. Noting
that $M_L^{-1}$ is trivial to compute, we multiply (2.2-7) by $M_L^{-1}$ to get
$$A\dot{T} + M_L^{-1} [N(\mathbf{u}) + K] T = M_L^{-1} f, \tag{2.2-29}$$
where
$$A = M_L^{-1} M \tag{2.2-30}$$
is, in fact, an averaging matrix; i.e.,
$$A_{ij} = \sum_{k=1}^{N} \left( \delta_{ik} \Big/ \int \phi_i \right) \int \phi_k \phi_j = \int \phi_i \phi_j \Big/ \int \phi_i \tag{2.2-31}$$
is dimensionless and has the property that the sum over each row is unity (since $\sum_{j=1}^{N_T} \phi_j = 1.0$),
so that each element of the vector $A\dot{T}$ represents a particular weighted average of
the elements of the $N_T$-vector $\dot{T}$. In fact, the $i$-th component of $A\dot{T}$ is
$$(A\dot{T})_i = \sum_j \int \phi_i (\dot{T}_j \phi_j) \Big/ \int \phi_i = \int \phi_i \frac{\partial T^h(\mathbf{x}, t)}{\partial t} \Big/ \int \phi_i, \tag{2.2-32}$$
and it is now clear that we did indeed undo the Galerkin weighting, at least for the time
derivative.
Next, note that $M_L^{-1} N(\mathbf{u})$ corresponds to (represents) a weak advection operator and
$(-)M_L^{-1} K$ corresponds, with the exception noted below, to a weak Laplacian operator [at
least when the heat transfer coefficient, see (2.2-12), is zero].
Next,
$$(M_L^{-1} N(\mathbf{u}) T)_i = \sum_j \left( \int \phi_i \mathbf{u} \cdot \nabla\phi_j \right) T_j \Big/ \int \phi_i = \int \phi_i \mathbf{u} \cdot \nabla T^h \Big/ \int \phi_i, \tag{2.2-33}$$
and
$$(M_L^{-1} K_D T)_i = \int \nabla\phi_i \cdot \mathbf{K} \cdot \nabla T^h \Big/ \int \phi_i. \tag{2.2-34}$$
Similarly, $M_L^{-1} f$ [see (2.2-16)] is
$$(M_L^{-1} f)_i = \left[ \int \phi_i S + \int_{\Gamma_N} \phi_i (q + H\hat{T}) + \int_{\Gamma_D} \phi_i q_D^h \right] \Big/ \int \phi_i, \tag{2.2-35}$$
and the finite difference interpretation is now available; namely, except for the time-
derivative term, each of the other terms now corresponds (in some sense) to a point-wise
approximation of the corresponding term in the original PDE, because $\phi_i$ is the 'proper'
piecewise polynomial of the FEM, the number of neighboring nodes ($j$) that couple with
the node in question ($i$) being only a function of the support of the basis function, $\phi_i$.
For example, a bilinear approximation in 2D will couple (generally, and away from $\partial\Omega$)
eight neighbors to each node. The time derivative term, though, is 'special' in the sense
that the GFEM performs a weighted average of all the $\dot{T}_j$'s in the neighborhood of node
$i$ to approximate $\partial T/\partial t$ at $\mathbf{x} = \mathbf{x}_i$. Again the details depend on the support of the basis
functions, but the key point to note is that only $(A\dot{T})_i$ is not a pointwise approximation in
(2.2-29); a pointwise approximation of $\partial T/\partial t$ would be simply $\dot{T}_i$, as with lumped mass.
A final remark: if $\phi_i$ belongs to a node on $\Gamma$, then the interpretation is somewhat trickier.
It turns out that if $\mathbf{x}_i \in \Gamma_N$, then the nodal equation is actually (as for the original GFEM)
an approximation to the Neumann (Robin) BC, (2.1-3), and will approach this exactly
as the mesh is refined; i.e., all other terms will $\to 0$ as $N_T \to \infty$. Finally, if $\mathbf{x}_i \in \Gamma_D$, a
similar result is obtained, with only $M_L^{-1} K_D T$ and $\int_{\Gamma_D} \phi_i q_D^h / \int \phi_i$ remaining significant as
$N_T \to \infty$, wherein they give $\mathbf{n} \cdot \mathbf{K} \cdot \nabla T = q_D$.
If some of the above assertions are not obvious, they will become more transparent
soon, when we show some actual nodal equations.
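As a small illustration of the averaging matrix $A = M_L^{-1} M$, the following sketch assumes 1D linear elements on a non-uniform mesh (the node locations are arbitrary illustrative values). It builds the consistent mass matrix, forms $M_L$ from $\int \phi_i$ as in (2.2-28), and confirms that each row of $A$ sums to one.

```python
import numpy as np

x_nodes = np.array([0.0, 0.1, 0.25, 0.45, 0.7, 1.0])   # non-uniform mesh (assumption)
n = len(x_nodes)

M = np.zeros((n, n))          # consistent mass matrix, M[i, j] = int phi_i phi_j
int_phi = np.zeros(n)         # int phi_i for each node
for e in range(n - 1):
    h = x_nodes[e + 1] - x_nodes[e]
    M[e:e + 2, e:e + 2] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    int_phi[e:e + 2] += h / 2.0        # each linear hat integrates to h/2 per element

M_L = np.diag(int_phi)                 # diagonal matrix of (2.2-28)
A = np.linalg.solve(M_L, M)            # averaging matrix A = M_L^{-1} M
row_sums = A.sum(axis=1)

T_dot = np.sin(2.0 * np.pi * x_nodes)  # sample nodal time derivatives
averaged = A @ T_dot                   # weighted averages of neighbouring values

print(np.round(A, 4))
print(row_sums)
```

For linear elements the interior diagonal entries equal 2/3 regardless of the mesh spacing, with the remaining 1/3 shared between the two neighbours in proportion to the adjacent element lengths; on a uniform mesh this gives the familiar (1/6, 2/3, 1/6) weights. Here $M_L$ also coincides with the row-sum lumped mass matrix.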
2.2.6 A Control Volume FEM
In this section we develop one form of a non-Galerkin weighted residual method that
has been gaining in popularity, partly via religious beliefs (via the God of 'local
conservation')—and probably at the 'expense' of both the GFEM and FDM's. Called
the control volume finite element method (CVFEM), it is a subdomain method of
weighted residuals (see Crandall, 1956; Finlayson and Scriven, 1966), and seems to have
been spearheaded by, among others, Professor S. Patankar and colleagues—at least for
incompressible flow. For elliptic BVP's, it was employed at least as far back as 1952
(MacNeal, 1953); see Varga (1962), who presents (but does not 'name') both GFEM—in
the guise of a Ritz method—and a finite volume method, and attributes the latter to
MacNeal (1953). While most of the recent papers we have seen involve more than a
simple change in the test function (such as directional upwinding and mass lumping),
herein we develop and present the CVFEM as a fully legitimate (no cheating) alternate
finite element technique, beginning with the appropriate weak formulation and introducing
the CVFEM version of natural boundary conditions (NBC's). Its extension from the
advection-diffusion equation to the NS equations will be considered in the next chapter.
The 'Patankar-family' and related publications are now summarized (there are many
more than those cited here), beginning with Baliga and Patankar (1980), which we believe
to be the first in the 'series,' using triangular elements and 'exponential upwinding.'
Further efforts/improvements, still on triangles, followed in Hookey et al. (1988) for
advection-diffusion and Hookey and Baliga (1988) for Navier-Stokes. Quadrilateral
CVFEM seems to have begun with Ramadhyani and Patankar (1985), again using
'upwinded' shape functions. Prakash (1987) shows that CVFEM, like FDM, suffers
from excess diffusion if simple 1D 'smart upwinding' (see Section 2.6.2c) methods are
employed. Schneider and Raw (1986) develop a type of skew upwinding on quadrilateral
elements, and Prakash and Baliga (1989) compare FDM, FEM, and CVFEM, concluding
with the hope that CVFEM will catch up with FEM for problems with 'truly complex
shapes' via unstructured grids. A sort of state-of-the-art summary, as of about 1987, plus
references, is available in Minkowycz et al. (1988). Finally, in a recent pair of papers,
Swaminathan and Voller (1992a,b) add SUPG [Streamline-Upwind Petrov-Galerkin; see,
for example, Hughes and Brooks (1982) and Brooks and Hughes (1982)] to CVFEM
to minimize the up-to-then excessive cross-wind diffusion via extant upwind methods. It
compared favorably with 'GFEM/SUPG' in most respects, although its phase speed errors
were noticeably larger (which we explain later: Section 2.6.2a)—thus bringing CVFEM
nearly up to date with one popular type of FEM.
Outside of the 'Patankar family,' we mention just two recent contributions: Zienkiewicz
and Oñate (1991) and Idelsohn and Oñate (1994), and a quotation: 'The FVM is a poor
man's FEM; it's an FDM moved over half-way'—O.C. Zienkiewicz (invited lecture, 8th
Int. Conf. on Finite Elements in Fluids, Barcelona, Spain, September 1993).
A final 'justification' for FVM's is revealed in the following quotation (typical
of many): 'Finite element methods can easily be used on irregular geometries, but
the equations are more complex and it is often more difficult to explain them
physically'—from Melaaen (1992). Somewhat ironically, this paper then goes on to a
discussion of curvilinear non-orthogonal grids via tensor calculus and seems, in our
humble opinion, to get bogged down in equations that are far 'more complex' than GFEM
equations ever are!
A brief sampling of recent theoretical/convergence results, a field that is rather nascent
compared with FEM, follows (and it is important to point out that there are many varieties
of what is most often called the 'finite volume method'): Cai (1991) and Cai et al. (1991);
Süli (1991) and Morton and Süli (1991), from which we quote: 'Despite the practical
success of finite volume methods, their theoretical foundation is still unsatisfactory and
their stability and accuracy properties are not well understood'; see also Nicolaides et al.
(1995) and Shin and Strikwerda (1997).
Crucial to the CVFEM is the conservation (divergence) form of the PDE, (2.1-11) in
this case, because it is based on the 'conservation of $T$' at control volume level. The weak
form of this AD equation begins, as usual, by multiplication by a test (weighting) function
and integration over the domain. In this case, however, the test function is piecewise
constant; it is unity over a particular subdomain (control volume or, in 2D, the only case
we consider in detail, a control area) and zero over the rest of $\Omega$. Thus, in this subdomain
method the PDE is satisfied on the average over each subdomain. Calling the test function
for subdomain '$i$' surrounding node $i$ $\psi_i$, we have
$$\int \psi_i \left[ \frac{\partial T}{\partial t} - S + \nabla \cdot (\mathbf{u}T - k\nabla T) \right] = 0, \quad i = 1, 2, \ldots, N, \tag{2.2-36}$$
where $N$ is now the number of non-overlapping subdomains covering $\Omega$. But owing to
the nature of the test functions, the above equation is equivalent, via the divergence
theorem, to
$$\int_{\Omega_i} \left( \frac{\partial T}{\partial t} - S \right) + \int_{\Gamma_i} \mathbf{n} \cdot (\mathbf{u}T - k\nabla T) = 0, \quad i = 1, 2, \ldots, N, \tag{2.2-37}$$
where $\Gamma_i$ is the boundary of subdomain $\Omega_i$. This simple set of equations, each
representing an energy balance over one subdomain, is the starting point for the finite-element
discretization; it is the desired, and intuitively appealing, weak form.
What about boundary conditions? They are not nearly as apparent here as in the GFEM
weak form, a la (2.1-23) and (2.2-2). But the answer is actually simple: if (and only if)
$\Gamma_i$ includes a portion of the full domain boundary, $\Gamma$, then 'special procedures' need to
be introduced so that both Dirichlet and Robin/Neumann data are properly incorporated.
These procedures are, in fact, little different from those already discussed and will later
be presented in some detail. Suffice it to say here that the simple looking equation of the
weak form is, in practice, only slightly simpler than that from GFEM. (It is also easy to
interpret physically, a very important attribute in the eyes of many would-be finite-element
practitioners.)
The next step is to approximate the solution in the finite-element spirit: expand the
'solution' in the same set of piecewise polynomials used in the GFEM, a la (2.2-1) and
(2.2-3), which converts (2.2-37) to the final control volume weak form:
$$\int_{\Omega_i} \sum_{j=1}^{N} \phi_j \left( \frac{dT_j}{dt} - S_j \right) + \int_{\Gamma_i} \sum_{j=1}^{N} \mathbf{n} \cdot (\mathbf{u}\phi_j - k\nabla\phi_j) T_j = -\int_{\Omega_i} \frac{\partial \tilde{T}}{\partial t} - \int_{\Gamma_i} \mathbf{n} \cdot (\mathbf{u}\tilde{T} - k\nabla\tilde{T}), \quad i = 1, 2, \ldots, N, \tag{2.2-38}$$
where, for simplicity, we have also expanded the source term into the FEM basis functions,
$S = \sum_j S_j \phi_j$, via interpolation. Note that the RHS is only non-zero at points where
$\Omega_i$ and $\Gamma_i$ involve $\Gamma_D$; i.e., the (now-less-simple-looking) weak formulation now does
account for the Dirichlet BC. (Neumann and Robin BC's are deferred until Section 2.5.3,
where they will appear as NBC's.)
Note too that $j$ also ranges over 1 to $N$, where $N$ is the number of nodes (in $\Omega$ and on
$\Gamma_N$, as before) at which $T_j$ is to be determined; there must be one subdomain for each
unknown nodal temperature. Thus, we have reduced the weak form of the continuous
problem [obtained via $N \to \infty$ in (2.2-38)] to one of finite dimension. All that remains
to be addressed prior to programming is the precise definition of $\Omega_i$ and $\Gamma_i$ for $i =
1, 2, \ldots, N$, in such a way that the test functions retain linear independence. We do this
first in the 2D context in which the basis functions $\{\phi_j\}$ are bilinear. Consider the 4-patch
of isoparametric elements shown in Figure 2.2-1 surrounding a generic node ($i$) in the
domain.
The subdomain $\Omega_i$ (control 'volume') is that formed by joining the element centroids
($x_c = \frac{1}{4} \sum_{j=1}^{4} x_j$ is the $x$-coordinate of a centroid, etc.) with eight straight line segments,
each of which passes through the midside of the appropriate element.
It should now be apparent, at least in principle, 'how to build' a CVFEM code: each
internal node's control volume, for the integration of 'volume quantities' ($\partial T/\partial t$ and
$S$ above), is composed of pieces of neighboring elements (four for a 4-patch, two for
a 2-patch, etc.); each internal node's control volume boundary, for the integration of
flux quantities (like $\mathbf{u}T$ and $k\nabla T$), is made up from two internal segments from each
element that has something to contribute. It may also be apparent that this method is more
'localized' than GFEM, owing to the nature of the test functions; i.e., CVFEM will give
more weight to node $i$ relative to its neighbors than does GFEM. In local coordinates
($\xi$, $\eta$), these line segments are simply pieces of the coordinate lines themselves, e.g.,
$\xi = 0$ or $\eta = 0$, and this fact actually makes the 'boundary' calculations significantly
easier to perform since the general bilinear interpolation becomes simply linear on each
of these segments. [The volume quantities are not simplified, however, and conventional
element-level matrices (or the equivalent) need to be constructed. See Appendix 1 for
some CVFEM element matrices.]
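The 'more localized' remark can be illustrated in a simplified setting. The sketch below is a 1D analogue, which is an assumption made for brevity (the text treats 2D bilinear elements): linear hat functions on a uniform mesh of spacing $h$, with the control volume for node $i$ taken as $[x_i - h/2,\, x_i + h/2]$, and constant $u$ and $k$. The helper names `hat`, `dhat`, and `gauss` are hypothetical. It compares the time-derivative weights of the two methods and evaluates the control-volume boundary flux stencil of (2.2-38).

```python
import numpy as np

def hat(x, xc, h):
    """Linear 'hat' basis function centred at xc with node spacing h."""
    return np.maximum(0.0, 1.0 - np.abs(x - xc) / h)

def dhat(x, xc, h):
    """Derivative of the hat function (evaluate away from the nodes)."""
    return np.where(np.abs(x - xc) < h, -np.sign(x - xc) / h, 0.0)

def gauss(f, a, b):
    """Two-point Gauss-Legendre quadrature of f on [a, b]."""
    pts, wts = np.polynomial.legendre.leggauss(2)
    xq = 0.5 * (b - a) * pts + 0.5 * (a + b)
    return 0.5 * (b - a) * np.sum(wts * f(xq))

h, xi = 0.1, 0.5
centres = [xi - h, xi, xi + h]        # nodes i-1, i, i+1

# CVFEM: the test function is 1 on the control volume and 0 elsewhere.
cv_row = np.array([gauss(lambda x: hat(x, xc, h), xi - h / 2, xi)
                   + gauss(lambda x: hat(x, xc, h), xi, xi + h / 2)
                   for xc in centres]) / h

# GFEM: the test function is the hat function phi_i itself.
gfem_row = np.array([gauss(lambda x: hat(x, xi, h) * hat(x, xc, h), xi - h, xi)
                     + gauss(lambda x: hat(x, xi, h) * hat(x, xc, h), xi, xi + h)
                     for xc in centres]) / h

# Flux out of the control volume: n.(u phi_j - k dphi_j/dx) at both ends.
k, u = 1.0, 1.0
xl, xr = xi - h / 2, xi + h / 2
flux_row = np.array([(u * hat(xr, xc, h) - k * dhat(xr, xc, h))
                     - (u * hat(xl, xc, h) - k * dhat(xl, xc, h))
                     for xc in centres])

print(cv_row, gfem_row, flux_row)
```

In this 1D analogue the CVFEM weights are (1/8, 3/4, 1/8) against the GFEM weights (1/6, 2/3, 1/6), so the central node does receive more weight, and the flux stencil reduces to centered advection plus the usual three-point diffusion operator.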
This is a sufficient exposition of the method at this point. Later, we will actually present
the resulting semi-discrete equations and compare them with those from the GFEM.
Suffice it to say here that there are more similarities (when done our way) than differences.
How do the two schemes compare theoretically? Numerically? While we do not have
many answers here [indeed we have not (yet) programmed a CVFEM], some conjectures,
opinions, and assertions are offered:
Fig. 2.2-1 A 4-patch of control volume finite elements.
1. Because of the particular conservation formulation, the CVFEM has the nice property
that 'whatever exits one CV through its boundary surface enters the neighboring CV.'
This physically appealing property—which also assures global conservation—accounts in
(very) large measure for its popularity, as already hinted at above. [The GFEM does not, in
general, incorporate control volume or element-level conservation properties. In a sense, it
resembles more a Fourier series expansion method—or any other eigenfunction expansion
method (e.g., spectral methods)—which also lack the appeal of physical intuition but
nonetheless provide very useful mathematical tools. But, see Appendix 2.] Also, and
importantly, local conservation does not imply local accuracy.
2. There is no assurance that quadratic quantities are conserved (e.g., $\int T^2$), and in general
they will not be; hence, boundedness of the solutions is not a priori guaranteed, as indeed
it is not for the analogous ($\beta = 1$) GFEM.
3. For any mesh other than simple rectangles, both the mass matrix and the diffusion
matrix are unsymmetric—not a nice feature.
4. For situations in which variational principles apply (which generally require
$\mathbf{u} = 0$), the GFEM is guaranteed to produce the most accurate solution possible on a
given mesh, at least when errors are measured in the appropriate norm. For example, for
the elliptic (Poisson) problem PDE, $\nabla^2 T = -S$, the error from GFEM, $e = T - T^h$, is a
minimum in the 'energy' (or 'heat flux') norm, $\left(\int \nabla e \cdot \nabla e\right)^{1/2}$.
5. The dispersion (phase speed) error in the advection-dominated situation is significantly
smaller for GFEM than CVFEM, as we will demonstrate.
6. Higher-order schemes are rare (non-existent?)—CVFEM advantages tend to disappear
when higher accuracy is demanded.
7. Finally, we note that no one has yet (to our knowledge) presented results from the
'honest' CVFEM presented here—at least for the time-dependent case; the practitioners
thus far seem to both lump the mass (thus 'blowing' the very local conservation that they
so admire!) to simplify the algorithms and modify the advection terms to control wiggles
(Section 2.6.1) by adding dissipation. Comini et al. (1996) have recently performed some
steady heat conduction analyses for CVFEM and GFEM; but they, like 'Patankar' (see
below), did not use the correct norm in which to measure the error—although they did
come closer. They used the proper (intrinsic) energy norm, but did so after interpolating
the exact solution into the finite element subspace.
Following up on Remarks (4) and (7), we wish to point out a misleading paper by
Ramadhyani and Patankar (1980) in which they 'showed' that the bilinear CVFEM is
more accurate than the bilinear GFEM by solving several 'Poisson' problems on the
same meshes and measuring the errors in the wrong norms. The facts, somewhat alluded
to by Habashi and Youngson (1980), are these: (i) if they had used the proper norm
(energy norm, or $H^1$-semi-norm), they would have reversed their conclusions: GFEM
would win every time, as it must since there is no more accurate solution than that from
GFEM; (ii) they would, however, see that GFEM does not win by much. We (RLS and
PMG) repeated these experiments and found consistency between numerics and theory;
the 'bad' news was that GFEM only won by a 'small' margin in all cases.
CVFEM is definitely a valid and viable competitor to GFEM, but we are not pleased
with some of the misleading 'advertising' that often accompanies it.
2.3 SOME SEMI-DISCRETE EQUATIONS
In this section we will get closer to the nitty gritty and actually display the final semi-
discrete equations for a few common elements (time is still treated as a continuous
variable), both in 1D and 2D. (We leave 3D to the reader! Or to his/her computer with
symbolic manipulator.) We shall use both the GFEM form and, so that the reader can more
easily interpret the results via identification with the PDE and its BC's, the finite difference
form developed above—(2.2-29). We limit the presentation to linear and quadratic basis
functions and, in 2D, we further limit the discussion to rectangular quadrilateral elements
[distorted, isoparametric element equations are better left to the computer—see, for
example, Irons and Ahmad (1980) for further discussion of this point—and for much other
interesting 'discussion']. Also, we further limit the presentation—for the most part—to
the constant coefficient case, although we will permit one important coefficient—the
velocity—to be (sometimes) non-constant. Finally, we make the obvious remark that
while our presentation of the time-dependent case yields ODE's, the simpler steady case
follows easily by discarding all time derivative terms to arrive at the appropriate systems
of linear algebraic equations.
As we mentioned in Chapter 1, we generally presume that the reader is sufficiently
FEM-educated both to construct 'element matrices' and to use them for the generation
of the global GFEM equations. And to provide a little bit of assistance, we have already
mentioned, in Chapter 1, many element matrices that are displayed in Appendix 1. In this
chapter, we have used these matrices to present the final, 'assembled' equations. For the
reader who wants even more assistance, a look-ahead to Section 3.13.5 in the next chapter
will actually show, step-by-step, how to 'build' the global equations using bilinear basis
functions.
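As a reminder of what 'assembly' means in the simplest setting, the following is a minimal sketch (not the authors' code) that builds the global mass, advection, and diffusion matrices from the 2x2 linear element matrices on a variable 1D mesh, assuming constant $u$ and $\kappa$. The function name `assemble_1d_linear` and the left-to-right node numbering are illustrative assumptions; the text numbers the Dirichlet node last. An interior row can then be compared, coefficient by coefficient, with the GFEM form (2.3-8) given below.

```python
import numpy as np

def assemble_1d_linear(lengths, u=1.0, kappa=1.0):
    """Assemble mass, advection and diffusion matrices for 1D linear elements.

    Nodes are numbered left to right (0 .. n_el); this is an illustrative
    choice and differs from the text, which numbers the Dirichlet node last.
    """
    n_nodes = len(lengths) + 1
    M = np.zeros((n_nodes, n_nodes))
    N = np.zeros((n_nodes, n_nodes))
    K = np.zeros((n_nodes, n_nodes))
    for e, l in enumerate(lengths):
        idx = np.ix_([e, e + 1], [e, e + 1])
        M[idx] += l / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])      # int phi_a phi_b
        N[idx] += u / 2.0 * np.array([[-1.0, 1.0], [-1.0, 1.0]])    # u int phi_a dphi_b/dx
        K[idx] += kappa / l * np.array([[1.0, -1.0], [-1.0, 1.0]])  # kappa int dphi_a/dx dphi_b/dx
    return M, N, K

lengths = [0.1, 0.3, 0.2]                 # hypothetical variable mesh
M, N, K = assemble_1d_linear(lengths, u=2.0, kappa=0.5)
i = 1                                     # an interior node between elements 0.1 and 0.3
print("mass row      :", M[i, i - 1:i + 2])
print("advection row :", N[i, i - 1:i + 2])
print("diffusion row :", K[i, i - 1:i + 2])
```

The row sum of the mass matrix at node `i` equals $(l_i + l_{i+1})/2$, which is the lumped mass and also the factor used to convert the GFEM form into the finite difference form.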
2.3.1 One Dimension
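The 1D version of (2.1-1) is
$$\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} = \kappa\frac{\partial^2 T}{\partial x^2} + S \quad \text{on } 0 < x < L, \tag{2.3-1}$$
$$T = T_D \quad \text{at } x = 0, \tag{2.3-2}$$
$$\kappa\frac{\partial T}{\partial x} + H(T - \hat{T}) = q \quad \text{at } x = L, \tag{2.3-3}$$
with
$$T(x, 0) = T_0(x). \tag{2.3-4}$$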
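The weak formulation of the advective form is given by the 1D version of (2.2-16)
with $\beta = 0$: find $T^h(x, t)$ on $0 < x \le L$ and (optionally) $q_D^h$ at $x = 0$ from
$$\int \phi_i\left(\frac{\partial T^h}{\partial t} + u\frac{\partial T^h}{\partial x}\right) + \kappa\int \frac{d\phi_i}{dx}\frac{\partial T^h}{\partial x} + \phi_i H T^h\Big|_{x=L} = \int \phi_i S + \phi_i\,(q + H\hat{T})\Big|_{x=L} + \phi_i\, q_D^h\Big|_{x=0}, \quad i = 1, 2, \ldots, N_T, \tag{2.3-5}$$
where
$$T^h(x, t) = T_D\,\phi_{N_T}(x) + \sum_{j=1}^{N} T_j(t)\,\phi_j(x). \tag{2.3-6}$$
Note that for this simple case, $N_T = N + 1$, and $\phi_{N_T}$ corresponds to the single node at
$x = 0$; i.e., $\phi_{N_T}(0) = 1$, $\phi_i(0) = 0$ for $i = 1, 2, \ldots, N$. Also, $\int(\cdot) = \int_0^L (\cdot)\,dx$.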
Although we will only treat the case of constant $S$ (and $\kappa$, etc.) for the most part, note
that it is very easy to extend the results to the case of variable $S$ when $S$ is interpolated
into the basis functions via $S = \sum_j S_j\phi_j$ by simply realizing that the (consistent) mass
matrix applies so that the same coefficients that multiply $\dot{T}_i$ will multiply $S_i$. (We shall
soon explicate this in the next section, in 2D.)
a. Linear elements
The domain, showing a single 'hat,' 'chapeau,' 'roof,' 'tent,' 'teepee,' 'pagoda,' 'pyramid,'
etc. basis function (all of these names and probably others, have been used—although
perhaps/probably not all in 1D) is shown in Figure 2.3-1:
Using linear basis functions on (at first) the variable mesh shown above with N elements
(and N + 1 nodes) leads to the following equations, of which there are four different types:
1. $i = 1$. The first node in from $x = 0$ is special because it incorporates the
Dirichlet/essential BC from (2.3-5); it is
$$\frac{1}{6}\left[l_1\dot{T}_D + 2(l_1 + l_2)\dot{T}_1 + l_2\dot{T}_2\right] + \frac{1}{2}u(T_2 - T_D) + \kappa\left(\frac{T_1 - T_D}{l_1} + \frac{T_1 - T_2}{l_2}\right) = \frac{l_1 + l_2}{2}S, \tag{2.3-7}$$
in which $T_D(t)$ is given by BC (2.3-2) and, in practice, the terms in $T_D(t)$ and $\dot{T}_D$ are
transposed to the RHS to form a portion of the 'data' (forcing function), unless mass
lumping (see Section 2.5.1) is invoked, in which case all of the $\dot{T}$ terms are replaced by
$[(l_1 + l_2)/2]\dot{T}_1$.
Fig. 2.3-1 1D mesh and one linear basis function.

2. $1 < i < N$, where node $N$ is located at $x = L$:
$$\frac{1}{6}\left[l_i\dot{T}_{i-1} + 2(l_i + l_{i+1})\dot{T}_i + l_{i+1}\dot{T}_{i+1}\right] + \frac{u}{2}(T_{i+1} - T_{i-1}) + \kappa\left(\frac{T_i - T_{i-1}}{l_i} + \frac{T_i - T_{i+1}}{l_{i+1}}\right) = \frac{l_i + l_{i+1}}{2}S \tag{2.3-8}$$
in the GFEM form, or, via (2.2-29), converted to the FD form, where here $\int\phi_i = (l_i + l_{i+1})/2$,
$$\frac{1}{6}\,\frac{2l_i}{l_i + l_{i+1}}\dot{T}_{i-1} + \frac{2}{3}\dot{T}_i + \frac{1}{6}\,\frac{2l_{i+1}}{l_i + l_{i+1}}\dot{T}_{i+1} + u\,\frac{T_{i+1} - T_{i-1}}{l_i + l_{i+1}} + \frac{2\kappa}{l_i + l_{i+1}}\left(\frac{T_i - T_{i-1}}{l_i} - \frac{T_{i+1} - T_i}{l_{i+1}}\right) = S, \tag{2.3-9}$$
which appears to be more amenable to a term-by-term identification with those of the
PDE. In fact, it is also ostensibly amenable to a Taylor-series 'error analysis,' as follows:
assume that the exact solution to (2.3-1) is available and smooth enough that a Taylor
series makes sense. Thus, for example,
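$$T(x_{i-1}, t) = T_{i-1} = T_i - l_i T_i' + \frac{l_i^2}{2}T_i'' - \frac{l_i^3}{6}T_i''' + \frac{l_i^4}{24}T_i'''' + O(l_i^5),$$
where $T_i' = \partial T/\partial x|_{x=x_i}$, etc. Performing a similar expansion for $T_{i+1}$ and inserting the
results into the above difference equation yields [using $\dot{T}_i' = \partial(\partial T/\partial t)/\partial x|_{x_i}$, etc., and
$h = l_i + l_{i+1}$]
$$\begin{aligned}
&\dot{T}_i + \frac{1}{3}(l_{i+1} - l_i)\dot{T}_i' + \frac{l_i^3 + l_{i+1}^3}{6h}\dot{T}_i'' + \frac{l_{i+1}^4 - l_i^4}{18h}\dot{T}_i''' + \frac{l_i^5 + l_{i+1}^5}{72h}\dot{T}_i'''' + O(l^5) \\
&\quad + u\left[T_i' + \frac{l_{i+1} - l_i}{2}T_i'' + \frac{l_i^3 + l_{i+1}^3}{6h}T_i''' + \frac{l_{i+1}^4 - l_i^4}{24h}T_i'''' + \frac{l_i^5 + l_{i+1}^5}{120h}T_i^{(v)} + O(l^5)\right] \\
&= \kappa\left[T_i'' + \frac{l_{i+1} - l_i}{3}T_i''' + \frac{l_i^3 + l_{i+1}^3}{12h}T_i'''' + \frac{l_{i+1}^4 - l_i^4}{60h}T_i^{(v)} + \frac{l_i^5 + l_{i+1}^5}{360h}T_i^{(vi)} + O(l^5)\right] + S.
\end{aligned}$$
But since we have assumed/claimed that $T$ is the exact solution to (2.3-1), the leading
terms (within each bracketed group) cancel, and we are left with one measure of the local
truncation error (LTE) of the scheme (subtract the RHS from the LHS):
$$TE_i = (l_{i+1} - l_i)\left[\frac{1}{3}\dot{T}_i' + \frac{1}{2}uT_i'' - \frac{1}{3}\kappa T_i'''\right] + \frac{l_i^3 + l_{i+1}^3}{6(l_i + l_{i+1})}\left[\dot{T}_i'' + uT_i''' - \frac{1}{2}\kappa T_i''''\right] + O\!\left(\frac{l_{i+1}^4 - l_i^4}{l_i + l_{i+1}}\right) + O(l^4).$$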
Digression:
Before proceeding with this 'error analysis,' we note that if the velocity is not constant and
is interpolated via the basis function as $u = \sum_j u_j\phi_j$, then the advection term changes to
$$\frac{1}{6}\left[u_{i-1}\frac{T_i - T_{i-1}}{(l_i + l_{i+1})/2} + 4u_i\frac{T_{i+1} - T_{i-1}}{l_i + l_{i+1}} + u_{i+1}\frac{T_{i+1} - T_i}{(l_i + l_{i+1})/2}\right]$$
for the finite difference form, in which the 1/6, 2/3, 1/6 averaging of upwind, centered,
and downwind difference approximations (encountered here for the first of many times
with linear basis functions) is clearly apparent. We make the important remark that this
advective form ($\beta = 0$) can actually generate an unstable ODE [sometimes referred to as
a linear (or non-linear, depending on the author) aliasing instability in the finite
difference literature] if $u(x)$ varies in certain ways. To see this, we write the general row of
the corresponding advection matrix (GFEM form, wherein, recall, the variable element
lengths show up in the mass matrix but not in the advection matrix):
$$\frac{1}{6}\left[0, \to 0,\; -(2u_i + u_{i-1}),\; (u_{i-1} - u_{i+1}),\; (2u_i + u_{i+1}),\; 0, \to 0\right],$$
whose (generally complex) eigenvalues may display positive real parts; i.e., there is
no guarantee that they will not. In contrast, if the quadratically conserving form ($\beta =
1/2$), equivalent to $\frac{1}{2}\left[\partial(uT)/\partial x + u(\partial T/\partial x)\right] = u(\partial T/\partial x) + \frac{1}{2}T(\partial u/\partial x)$, is employed,
then the GFEM advection term becomes
$$\frac{1}{4}\left[u_i(T_{i+1} - T_{i-1}) + (u_{i+1}T_{i+1} - u_{i-1}T_{i-1})\right] = \frac{1}{2}\left(\frac{u_i + u_{i+1}}{2}T_{i+1} - \frac{u_i + u_{i-1}}{2}T_{i-1}\right),$$
whose matrix row is
$$\frac{1}{4}\left[0, \to 0,\; -(u_i + u_{i-1}),\; 0,\; (u_i + u_{i+1}),\; 0, \to 0\right],$$
which shows that the advection matrix is skew symmetric and thus displays purely
imaginary eigenvalues regardless of the behavior of $u(x)$; i.e., it cannot generate an unstable
ODE. [The linearly conserving form, $\beta = 1$, leads to the following GFEM advection term:
$$\frac{1}{6}\left[2(u_{i+1}T_{i+1} - u_{i-1}T_{i-1}) + u_i(T_{i+1} - T_{i-1}) + (u_{i+1} - u_{i-1})T_i\right],$$
and a general matrix row like
$$\frac{1}{6}\left[0, \to 0,\; -(2u_{i-1} + u_i),\; (u_{i+1} - u_{i-1}),\; (2u_{i+1} + u_i),\; 0, \to 0\right],$$
which, because it corresponds to an indefinite advection matrix, like that for $\beta = 0$, is
also susceptible to unstable behavior even though it conserves $T$.]
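The three matrix rows above are easy to examine numerically. The following is a minimal sketch, assuming a uniform periodic mesh (so that no boundary rows are needed), a consistent mass matrix, and a randomly generated nodal velocity; the function name `advection_matrix`, the periodic setting, and the linear blending in $\beta$ between the $\beta = 0$ and $\beta = 1$ rows are assumptions of this illustration. It reports the size of the symmetric part of each advection matrix and the largest real part among the eigenvalues of the ODE system $M\dot{T} = -N T$.

```python
import numpy as np

def advection_matrix(u, beta):
    """Periodic 1D linear-element advection matrix for nodal velocities u.

    Row i blends the beta = 0 (advective) and beta = 1 (conservation) rows;
    beta = 1/2 is then the average of the two.
    """
    n = len(u)
    A = np.zeros((n, n))
    for i in range(n):
        im, ip = (i - 1) % n, (i + 1) % n
        row0 = np.array([-(2 * u[i] + u[im]), u[im] - u[ip], 2 * u[i] + u[ip]]) / 6.0
        row1 = np.array([-(2 * u[im] + u[i]), u[ip] - u[im], 2 * u[ip] + u[i]]) / 6.0
        A[i, [im, i, ip]] = (1.0 - beta) * row0 + beta * row1
    return A

n = 32
l = 1.0 / n
eye = np.eye(n)
# Consistent mass matrix on the uniform periodic mesh: (l/6) * circulant(1, 4, 1).
M = l / 6.0 * (4.0 * eye + np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1))

rng = np.random.default_rng(0)
u = 1.0 + 0.9 * rng.standard_normal(n)      # rough velocity field (hypothetical data)

for beta in (0.0, 0.5, 1.0):
    A = advection_matrix(u, beta)
    sym_part = np.abs(A + A.T).max()                           # zero only if skew-symmetric
    growth = np.linalg.eigvals(np.linalg.solve(M, -A)).real.max()
    print(f"beta = {beta}: max|A + A^T| = {sym_part:.3e}, max Re(eig) = {growth:.3e}")
```

For $\beta = 1/2$ the symmetric part vanishes to round-off, so the eigenvalues are purely imaginary whatever `u` is; for the other two values no such guarantee exists, and whether growth actually occurs depends on the particular velocity field.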
Remark:
It is interesting that the famous 'non-linear aliasing instability' of Phillips (1959) may
have in fact been a simpler linear instability (or perhaps some of each) caused by trying
to solve what are effectively unstable ODE's. The possibility of explaining the growth
as a linear instability seems to have first been put forth by Miyakoda (1962). Later,
this possibility was rediscovered by Gerrity (1972) and later yet by Gary (1979). A
finite difference analog of the quadratically stable ($\beta = 1/2$) version was presented by
Piacsek and Williams (1970). And, earlier, Lilly (1965) recognized that any quadratically
conserving scheme would preclude the Phillips' instability. Finally, see Kreiss and Oliger
(1973), who also showed that linear equations with sufficiently rough coefficients can
be unstable. Neither of the latter two (P & W, K & O) seemed aware of the previous
and essentially equivalent descriptions described by the earlier authors. We will discuss
aliasing, briefly, in the next chapter (Section 3.17).
In our own experience (Lee et al., 1982), we too observed linear 'aliasing' instability
in certain inviscid simulations: only the quadratically conserving advection approximation
precluded blow up of $\int T^2$ in a velocity field that was semi-chaotic, with small scales
in both space and time. Later in this chapter, we will demonstrate the occurrence of an
unstable ODE system for $\beta \ne 1/2$.
End Digression
Returning now to the Taylor series analysis, we note that if a uniform mesh is employed,
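then $l_i = l_{i+1} = l = L/N$, and the semi-discrete difference equation [(2.3-8)] simplifies to
$$\frac{1}{6}(\dot{T}_{i-1} + 4\dot{T}_i + \dot{T}_{i+1}) + u\,\frac{T_{i+1} - T_{i-1}}{2l} = \kappa\,\frac{T_{i+1} - 2T_i + T_{i-1}}{l^2} + S, \tag{2.3-10}$$
and its Taylor series analog to
$$\dot{T}_i + \frac{l^2}{6}\dot{T}_i'' + \frac{l^4}{72}\dot{T}_i'''' + u\left[T_i' + \frac{l^2}{6}T_i''' + \frac{l^4}{120}T_i^{(v)}\right] = \kappa\left[T_i'' + \frac{l^2}{12}T_i'''' + \frac{l^4}{360}T_i^{(vi)}\right] + S + O(l^5).$$
The corresponding LTE in this case is
$$TE_i = \frac{l^2}{6}\left(\dot{T}_i'' + uT_i''' - \frac{\kappa}{2}T_i''''\right) + \frac{l^4}{24}\left(\frac{1}{3}\dot{T}_i'''' + \frac{1}{5}uT_i^{(v)} - \frac{\kappa}{15}T_i^{(vi)}\right) + O(l^5).$$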
So what do these equations tell us, besides the obvious fact that if $T(x, t)$ that solves
the given IBVP is available, then the quantities called $TE_i(t)$ could be computed (up
to some power of $l$) at each node? Considering first the general result ($l_i \ne l_{i+1}$), it
seems to say that the FDM version of the GFEM equation has leading truncation error
terms in $(l_{i+1} - l_i)$ and in $(l_{i+1}^3 + l_i^3)/(l_i + l_{i+1})$, the latter of which can be interpreted as
second-order in $l$; i.e., it is $O(l^2)$. Considering the first of these, however, there are two
possibilities: (i) the mesh grading is random/rough such that $(l_{i+1} - l_i) = O(l_i)$, e.g.,
$l_{i+1} = 3l_i$; i.e., large changes in element size will occur; and (ii) $(l_{i+1} - l_i) = O(l^2)$; i.e.,
$(l_{i+1} - l_i)/l_i = O(l_i)$, so that the mesh gradation is gradual. In the former case, the Taylor series
accuracy is clearly only first-order, whereas in the second, it is second-order. One obvious
conclusion, and one that is generally also borne out in practice, is to grade the mesh
gradually. In practice, a common mesh grading procedure is 'geometric'; e.g., $l_{i+1} - l_i =
\epsilon l_i$, where $\epsilon \ll 1$. While theoretically still only first-order accurate, the computed results
are usually quite acceptable/accurate. We quickly add, however, the important remark
that while Taylor series analyses are useful to some extent, they are subservient to the
more powerful error analysis techniques inherent in the GFEM and the results of these
techniques, in that the truncation error as computed above is not necessarily the correct
measure of the error in a GFEM solution. Another remark, related to the first, is that Taylor series
consistency is not even necessary for convergence (to the PDE solution) to occur; although
this is not the case here, there are FEM situations where it is (e.g., Carey, 1976). Part of
the reason for these assertions is that the GFEM is not based on Taylor series expansions,
which are actually only appropriate (i.e., when used to actually generate approximations
to derivatives) when '$\Delta x$' is 'sufficiently small.' The GFEM tries, in some sense (and
often-but-not-always succeeds) to deliver an approximate solution that is 'as accurate as
possible' based on the given mesh (and the basis functions employed).
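The 'geometric' grading mentioned above is simple to generate. The following is a minimal sketch, assuming the element lengths obey $l_{i+1} = (1 + \epsilon)\,l_i$ and are rescaled so that they sum to the domain length; the function name `geometric_mesh` and the sample values of `L`, `n_el`, and `eps` are hypothetical.

```python
import numpy as np

def geometric_mesh(L, n_el, eps):
    """Element lengths with l[i+1] - l[i] = eps * l[i], scaled to sum to L."""
    ratios = (1.0 + eps) ** np.arange(n_el)
    return L * ratios / ratios.sum()

lengths = geometric_mesh(L=1.0, n_el=20, eps=0.05)
nodes = np.concatenate(([0.0], np.cumsum(lengths)))    # node coordinates, left to right

# Relative jump in element size between neighbors; this is the quantity
# (l[i+1] - l[i]) / l[i] that multiplies the first-order truncation terms.
relative_jump = np.diff(lengths) / lengths[:-1]
print(lengths[0], lengths[-1], relative_jump.max())
```

Because the relative jump is the constant `eps`, the leading truncation term is formally first-order, but its coefficient can be kept small by choosing a small `eps`.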
Finally, we consider two interesting limiting cases. In the first, we will uncover here
(and demonstrate later) an example of 'improved' convergence: let $\kappa = 0 = S$ so that
we have the pure advection equation, $T_t + uT_x = 0$. The uniform grid result in this case
enjoys a property of truncation error cancellation that increases its Taylor series accuracy
from first-order on a variable mesh all the way to fourth-order! This is because the term
$\dot{T}_i'' + uT_i''' = \partial^2(T_t + uT_x)/\partial x^2 = 0$, and the $l^2$ term drops out of the TE equation. The
second case is that of pure diffusion ($u = 0$). Here the general TE equation becomes
$$TE = \frac{l_{i+1} - l_i}{3}\,\frac{\partial}{\partial x}\left(\frac{\partial T}{\partial t} - \kappa\frac{\partial^2 T}{\partial x^2}\right) + \frac{l_i^3 + l_{i+1}^3}{6(l_i + l_{i+1})}\left(\dot{T}'' - \frac{1}{2}\kappa T''''\right) + (\text{HOT}).$$
But since $T_t = \kappa T_{xx} + S$, we have $\partial(T_t - \kappa T_{xx})/\partial x = 0$; the variable-grid GFEM for the
transient heat equation is second-order accurate, a result that is general; i.e., it also holds
for a uniform mesh. We remark too that this is a consequence of the so-called consistent
mass matrix of GFEM and does not occur if the mass is lumped [setting $\dot{T}_{i-1}$ and $\dot{T}_{i+1}$ to
$\dot{T}_i$ in (2.3-10)]; lumping gives only first-order accuracy. (See Section 2.5.1 for a more
extensive discussion on mass lumping.) The fourth-order accuracy for the pure advection
case is also lost if the mass is lumped; it is then only second-order accurate. We also
mention that the second-order accuracy obtains even if the source term is not constant
and is treated either via interpolation into the FEM basis functions,
$$\frac{l_i + l_{i+1}}{2}S \to \frac{1}{6}\left[l_iS_{i-1} + 2(l_i + l_{i+1})S_i + l_{i+1}S_{i+1}\right],$$
or via numerical (approximate) integration (Strang and Fix, 1973).
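The claimed orders for pure advection on a uniform mesh can be checked directly by inserting an exact solution into the difference equation (2.3-10) with $\kappa = S = 0$. The following is a minimal sketch, assuming the travelling wave $T = \sin k(x - ut)$ as the exact solution and an arbitrary evaluation point; the function names and the sample values of `u`, `k`, `x`, and `l` are hypothetical. Halving $l$ should reduce the residual by roughly $2^4$ with the consistent mass and by roughly $2^2$ with the lumped mass.

```python
import numpy as np

def residual(l, u=1.0, k=1.0, x=0.3, lumped=False):
    """Residual of (2.3-10) with kappa = S = 0 for T = sin(k (x - u t)) at t = 0."""
    T = lambda s: np.sin(k * s)
    Tdot = lambda s: -u * k * np.cos(k * s)        # dT/dt, from T_t = -u T_x
    if lumped:
        mass = Tdot(x)                             # neighbors' time derivatives set to the node's
    else:
        mass = (Tdot(x - l) + 4.0 * Tdot(x) + Tdot(x + l)) / 6.0
    return mass + u * (T(x + l) - T(x - l)) / (2.0 * l)

def observed_order(lumped):
    """Estimate the truncation order from residuals at l = 0.1 and l = 0.05."""
    r_coarse = residual(0.1, lumped=lumped)
    r_fine = residual(0.05, lumped=lumped)
    return np.log2(abs(r_coarse / r_fine))

order_consistent = observed_order(lumped=False)
order_lumped = observed_order(lumped=True)
print("consistent mass:", order_consistent)
print("lumped mass    :", order_lumped)
```

This is a statement about the Taylor series truncation error only; as stressed in the surrounding text, it is not by itself a reliable estimate of the error in the GFEM solution.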
Before proceeding, however, we emphasize the point (as mentioned earlier) that we must
be careful not to fall into the common trap of actually believing that the FDM (Taylor
series) truncation error analysis actually governs the behavior of the error in the solution
to the GFEM equations (it does not) and thereby provides accurate error estimates. [For
a recent example of such entrapment, see Fletcher's (1991) discussion of non-uniform
grids. See also Fletcher and Srinivas (1983).] The 'reduction' of the GFEM equations to
an FDM 'equivalent' is thus actually misleading in several ways:
(i) The (global) finite element theory [via energy inner products, Sobolev spaces,
etc.; see, for example, Strang and Fix (1973); Wait and Mitchell (1985); Girault
and Raviart (1986); Ciarlet and Lions (1990); Kardestuncer and Norrie (1987)]
prevails, and the (local) Taylor series theory is secondary; i.e., the one-node-at-
a-time-analyzed-separately-and-independently approach does not reveal the whole
truth/the big picture.
(ii) Whereas M and KD are symmetric matrices in the GFEM version of the semi-
discrete equations, these important properties are lost (or at least, camouflaged) in
the FD version.
(iii) Whereas the advection matrix, see (2.2-9), is skew-symmetric in the GFEM
version [$N(u)$ for variable velocity with $\beta = 1/2$], this property too is 'lost'
in the FD version. Thus, for example, the former version assures that the
advection approximation is dissipation-free (since $x^TNx = 0$, $\forall x$), but the latter
version, via the particular Taylor series term, $[(l_{i+1} - l_i)/2]\,u\,(\partial^2T/\partial x^2)|_i$, which
looks dissipative/diffusive, at least if $(l_{i+1} - l_i)$ were replaced by $|l_{i+1} - l_i|$, seems
to imply otherwise; pointwise! This 'conclusion' is erroneous because the local error
analysis via FDM/Taylor series is itself erroneous, and this is the cause of, for example,
Fletcher's (1991) errors when discussing this sort of example.
So the best advice that we can offer seems to be, 'Do not be afraid to use Taylor series
analysis to assist in studying accuracy and verifying consistency, but do not rely on it
too heavily—certainly not exclusively. Also, when it generates "good news," advertise
the fact; but if it generates bad or suspicious results, be suspicious—or ignore it.'
3. $i = N$ (the right-most node). Here, (2.3-5) gives
$$\frac{l_N}{6}(\dot{T}_{N-1} + 2\dot{T}_N) + \frac{u}{2}(T_N - T_{N-1}) + \frac{\kappa}{l_N}(T_N - T_{N-1}) + HT_N = \frac{1}{2}l_NS + q + H\hat{T}, \tag{2.3-11}$$
which rearranges to
$$\kappa\,\frac{T_N - T_{N-1}}{l_N} + H(T_N - \hat{T}) = q + \frac{l_N}{2}\left[S - u\,\frac{T_N - T_{N-1}}{l_N} - \frac{2}{3}\dot{T}_N\right] - \frac{l_N}{6}\dot{T}_{N-1}, \tag{2.3-12}$$
a notable result in several ways:
(i) The GFEM form of the semi-discrete equation is 'useful' as it stands because it
is the 'integrated form' that is appropriate for studying Neumann/Robin boundary
conditions; there is no need to divide by $\int\phi_N = l_N/2$. (This is true only in 1D.)
(ii) It shows how the GFEM satisfies approximately the natural boundary condition,
(2.3-3). The ODE at node $N$ is, in effect, the BC given by (2.3-3). We shall return
to this important point later (Section 2.4).
(iii) Presuming convergence as $l_i \to 0$, all terms on the RHS but the first vanish so that
convergence to the NBC indeed occurs.
(iv) A Taylor series analysis about $T_N$ at $x = L$ gives, for $l_N = l$,
$$\kappa\frac{\partial T}{\partial x} + H(T_N - \hat{T}) = q - \frac{l}{2}\left(\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} - \kappa\frac{\partial^2 T}{\partial x^2} - S\right) + \frac{l^2}{12}uT_{xx} + O(l^3),$$
which, if the PDE holds on $\Gamma$ (i.e., if the term in parentheses vanishes), and if the
Taylor series interpretation is valid, is a second-order accurate (third-order if $u = 0$
and the mass is not lumped) approximation to BC (2.3-3); see also Strang and Fix
(1973, p. 33).
(v) The equation can also be interpreted as a sort of discrete (GFEM) energy balance in
the last element, or perhaps better, in the last half of the last element, or perhaps
better yet, at the last node. (See Section 4.2.4 in Chapter 4.) Finally, we remark that
the discrete equations generated by the GFEM (or any other approximation), and
their solution, are indifferent to the manner of 'interpretation.'
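4. $i = N_T$ (the left-most node), optional/'post-processing.' Here, since all $\phi_j = 0$ at
$x = 0$ except for $j = N_T$, (2.3-5) gives
$$\frac{l_1}{6}(2\dot{T}_D + \dot{T}_1) + \frac{1}{2}u(T_1 - T_D) - \kappa\,\frac{T_1 - T_D}{l_1} = \frac{1}{2}l_1S + q_D^h, \tag{2.3-13}$$
which is similar to (2.3-11) and is actually an equation for $q_D^h$, the heat flux into $\Omega$ at
$x = 0$. That is, it is presumed that equations 1 through $N$ have already been solved so
that $T_1$ is known. Again we note that this (and only this) calculation is optional, and is a
result of expressing the equations in the generalized/augmented weak form.
Clearly, $l \to 0 \Rightarrow q_D^h = -\kappa(\partial T/\partial x)|_{x=0} = \kappa(\partial T/\partial n)|_\Gamma$; the heat flux through $\Gamma_D$, for
$l \to 0$, is purely diffusive, even though $\mathbf{n}\cdot\mathbf{u} \ne 0$ there. Also, for constant $l$,
$$q_D^h = -\kappa\,\frac{T_1 - T_D}{l} + \frac{l}{2}\left[\frac{2}{3}\dot{T}_D + \frac{1}{3}\dot{T}_1 + u\,\frac{T_1 - T_D}{l} - S\right],$$
which yields, via Taylor series analysis,
$$q_D^h + \kappa\frac{\partial T}{\partial x} = \frac{l}{2}\left(\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} - \kappa\frac{\partial^2 T}{\partial x^2} - S\right) + \frac{l^2}{6}\,\frac{\partial}{\partial x}\left(\frac{\partial T}{\partial t} + \frac{3}{2}u\frac{\partial T}{\partial x} - \kappa\frac{\partial^2 T}{\partial x^2}\right) + O(l^3),$$
which gives, again assuming the PDE holds on $\Gamma$,
$$q_D^h = -\kappa\frac{\partial T}{\partial x} + \frac{l^2}{12}uT_{xx} + O(l^3),$$
another second-order accurate heat flux result (again, it turns out to be third-order in the
absence of advection, when consistent mass is retained).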
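The benefit of the consistent flux can be seen in the simplest possible setting. The following is a minimal sketch, assuming steady conduction with $u = 0$, $H = 0$, a constant source, and a uniform mesh, so that (2.3-13) reduces to $q_D^h = -\kappa(T_1 - T_D)/l - lS/2$; the function name `steady_conduction` and all parameter values are hypothetical. It compares this with the 'naive' one-sided difference $-\kappa(T_1 - T_D)/l$ against the exact flux $-(q + SL)$.

```python
import numpy as np

def steady_conduction(n_el, L=1.0, kappa=2.0, S=3.0, q=0.5, T_D=0.0):
    """Solve kappa T'' + S = 0, T(0) = T_D, kappa T'(L) = q with linear elements."""
    l = L / n_el
    A = np.zeros((n_el, n_el))              # unknowns T_1 .. T_N
    b = np.full(n_el, l * S)                # interior rows: (l_i + l_{i+1})/2 * S = l * S
    for i in range(n_el):
        A[i, i] = 2.0 * kappa / l
        if i > 0:
            A[i, i - 1] = -kappa / l
        if i < n_el - 1:
            A[i, i + 1] = -kappa / l
    A[-1, -1] = kappa / l                   # last node sees one element only, cf. (2.3-11)
    b[-1] = 0.5 * l * S + q
    b[0] += kappa * T_D / l                 # Dirichlet datum transposed to the RHS
    T = np.linalg.solve(A, b)
    naive = -kappa * (T[0] - T_D) / l       # one-sided difference estimate of the flux
    consistent = naive - 0.5 * l * S        # steady, u = 0 form of (2.3-13)
    return naive, consistent

exact = -(0.5 + 3.0 * 1.0)                  # -kappa dT/dx at x = 0 equals -(q + S L)
for n_el in (4, 8, 16):
    naive, consistent = steady_conduction(n_el)
    print(n_el, naive - exact, consistent - exact)
```

In this special case the consistent flux reproduces the global heat balance exactly on any mesh, whereas the one-sided difference is in error by $lS/2$, i.e., it is only first-order accurate.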
b. Quadratic elements
The domain in this case, with N/2 elements and N + 1 nodes, is shown in Figure 2.3-2
(N is necessarily even). Now the fun begins; i.e., higher-order basis functions generate
higher-order complexity, complete with different 'types' of nodes, basis functions, and
associated (and often counter-intuitive) equations—a necessary adjunct to higher-order
accuracy, at least when we insist on C° Lagrange basis functions with compact support.
But, in fact, quadratic basis functions are but the first step up in what is called the p-finite
element method ($p = 2$ here and stands for polynomial of degree 2), a technique
that we shall not present here, just summarize: rather than employing a simple mesh
refinement using the same type of element on each mesh (h-method) to increase the
accuracy, it holds a fixed mesh (relatively coarse) and increases the order ($p$) of the basis
functions. In the h-p method (see, for example, Oden et al., 1992), both $h$ and $p$ are
varied, usually 'adaptively,' in order to find an optimal solution strategy. After seeing
some of the discrete equations below from $p = 2$, the reader will probably appreciate
why most people are content to just trust Galerkin and let the computer do the work.

Fig. 2.3-2 1D mesh and four quadratic basis functions.

We now 'evaluate'/explicate (2.3-5) for 'quads'; here there are six different 'types' of nodes:
1. $i = 1$, the first of two nodes that see the Dirichlet data, $T_D(t)$:
$$\frac{l_1}{15}(\dot{T}_D + 8\dot{T}_1 + \dot{T}_2) + \frac{2u}{3}(T_2 - T_D) + \frac{8\kappa}{3l_1}(-T_D + 2T_1 - T_2) = \frac{2l_1}{3}S, \tag{2.3-14}$$
in which the 'data,'
$$\frac{l_1}{15}\dot{T}_D - \frac{2}{3}uT_D - \frac{8\kappa}{3l_1}T_D,$$
would be transposed to the RHS when forming the ODE system a la (2.2-7). (As for
linear elements, mass lumping removes the effect of $\dot{T}_D$ by replacing all $\dot{T}$ terms by $\dot{T}_1$.)
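2. $i = 2$, the second of two nodes coupling with $T_D(t)$:
$$\frac{1}{30}\left[-l_1\dot{T}_D + 2l_1\dot{T}_1 + 4(l_1 + l_2)\dot{T}_2 + 2l_2\dot{T}_3 - l_2\dot{T}_4\right] + \frac{u}{6}\left[4(T_3 - T_1) - (T_4 - T_D)\right] + \frac{\kappa}{3}\left[\frac{T_D - 8T_1 + 7T_2}{l_1} + \frac{7T_2 - 8T_3 + T_4}{l_2}\right] = \frac{l_1 + l_2}{6}S, \tag{2.3-15}$$
in which the data, $-l_1\dot{T}_D/30 + uT_D/6 + \kappa T_D/3l_1$, again go over to the RHS. [If the mass
is lumped, $\dot{T}_D$ does not go to the RHS because all time derivative terms are then replaced
by their 'sum,' $(l_1 + l_2)\dot{T}_2/6$.]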
3. $i = 3, 5, 7, \ldots, N - 1$; these are the rest of the 'center' nodes, which we easily display
as FDM equations by dividing each GFEM equation by $\int\phi_i = \frac{2}{3}l_{(i+1)/2}$:
$$\frac{1}{10}(\dot{T}_{i-1} + 8\dot{T}_i + \dot{T}_{i+1}) + u\,\frac{T_{i+1} - T_{i-1}}{l_{(i+1)/2}} + \kappa\,\frac{-T_{i-1} + 2T_i - T_{i+1}}{\Delta x_i^2} = S, \tag{2.3-16}$$
where $\Delta x_i = \frac{1}{2}l_{(i+1)/2}$ is the node-to-node distance on element $(i + 1)/2$.
Remarks:
(i) For the case of variable velocity via $u = \sum_j u_j\phi_j$, the term $u(T_{i+1} - T_{i-1})/l_{(i+1)/2}$
is replaced by
$$\frac{u_{i-1}(4T_i - 3T_{i-1} - T_{i+1}) + 8u_i(T_{i+1} - T_{i-1}) + u_{i+1}(T_{i-1} - 4T_i + 3T_{i+1})}{10\,l_{(i+1)/2}},$$
a linear combination of first-derivative approximations.
(ii) To lump the mass, just set $\dot{T}_{i-1}$ and $\dot{T}_{i+1}$ to $\dot{T}_i$.
This result, (2.3-16), 'smells like' a simple, second-order accurate approximation, and it
is if 'simple-think' is employed. But the actual scent is better: the global theory, which
couples all midside nodes with all 'edge' nodes, gives the result that quadratic basis
functions are third-order accurate, even for variable coefficients.
4. $i = 4, 6, \ldots, N - 2$, the remaining edge nodes except the last: here we first present the
GFEM form, obtained from the element matrices (Appendix 1) for two adjacent elements:
$$\frac{1}{30}\left[-l_{i/2}\dot{T}_{i-2} + 2l_{i/2}\dot{T}_{i-1} + 4(l_{i/2} + l_{i/2+1})\dot{T}_i + 2l_{i/2+1}\dot{T}_{i+1} - l_{i/2+1}\dot{T}_{i+2}\right] + \frac{u}{6}\left[4(T_{i+1} - T_{i-1}) - (T_{i+2} - T_{i-2})\right] + \frac{\kappa}{3}\left[\frac{T_{i-2} - 8T_{i-1} + 7T_i}{l_{i/2}} + \frac{7T_i - 8T_{i+1} + T_{i+2}}{l_{i/2+1}}\right] = \frac{1}{6}(l_{i/2} + l_{i/2+1})S, \tag{2.3-17}$$
which we further present in FDM form by dividing through by $\int\phi_i = (l_{i/2} + l_{i/2+1})/6$:
$$\frac{1}{5}\left[4\dot{T}_i + 2\,\frac{l_{i/2}\dot{T}_{i-1} + l_{i/2+1}\dot{T}_{i+1}}{l_{i/2} + l_{i/2+1}} - \frac{l_{i/2}\dot{T}_{i-2} + l_{i/2+1}\dot{T}_{i+2}}{l_{i/2} + l_{i/2+1}}\right] + u\left(2\,\frac{T_{i+1} - T_{i-1}}{(l_{i/2} + l_{i/2+1})/2} - \frac{T_{i+2} - T_{i-2}}{l_{i/2} + l_{i/2+1}}\right) + \frac{2\kappa}{l_{i/2} + l_{i/2+1}}\left[\frac{T_{i-2} - 8T_{i-1} + 7T_i}{l_{i/2}} + \frac{7T_i - 8T_{i+1} + T_{i+2}}{l_{i/2+1}}\right] = S, \tag{2.3-18}$$
wherein we note/remark:
(i) The diffusion terms are easily identified as differences of fluxes. For $l = 2\Delta x =$
constant, they become $\kappa\left[(T_{i-2} - 2T_i + T_{i+2})/l^2 - 2(T_{i-1} - 2T_i + T_{i+1})/\Delta x^2\right]$.
(ii) Lumped mass is, as usual, easily obtained by setting $\dot{T}_{i\pm1}$ and $\dot{T}_{i\pm2}$ to $\dot{T}_i$.
(iii) The variable velocity case, via $\sum_j u_j\phi_j$, replaces the advection term in (2.3-18) by
$$\frac{1}{5}\left[(u_{i-2} + 2u_{i-1} + 2u_i)T_{i-2} - 4(2u_{i-1} + 3u_i)T_{i-1} + (u_{i+2} - 6u_{i+1} + 6u_{i-1} - u_{i-2})T_i + 4(3u_i + 2u_{i+1})T_{i+1} - (2u_i + 2u_{i+1} + u_{i+2})T_{i+2}\right]/(l_{i/2} + l_{i/2+1}).$$
(iv) Again, local Taylor series analysis is not recommended, except/unless to demonstrate
consistency. (Recall, though, that consistency is not necessary for convergence; it is merely
sufficient.)
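5. $i = N$; the right-most (NBC) node: here, (2.3-5) gives
$$\frac{1}{30}l_{N/2}(-\dot{T}_{N-2} + 2\dot{T}_{N-1} + 4\dot{T}_N) + \frac{u}{6}(T_{N-2} - 4T_{N-1} + 3T_N) + \frac{\kappa}{3l_{N/2}}(T_{N-2} - 8T_{N-1} + 7T_N) + HT_N = \frac{1}{6}l_{N/2}S + q + H\hat{T}, \tag{2.3-19}$$
which is an ODE for node $N$ that can be conveniently rearranged to reveal its alias, the
boundary condition given by (2.3-3), via
$$\kappa\,\frac{T_{N-2} - 8T_{N-1} + 7T_N}{3l_{N/2}} + H(T_N - \hat{T}) = q + \frac{l_{N/2}}{6}\left[S - u\,\frac{T_{N-2} - 4T_{N-1} + 3T_N}{l_{N/2}} - \frac{-\dot{T}_{N-2} + 2\dot{T}_{N-1} + 4\dot{T}_N}{5}\right], \tag{2.3-20}$$
and letting $l_{N/2} \to 0$.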
6. $i = N_T$; the left-most node, via post-processing: here (2.3-5) gives
$$\frac{1}{30}l_1(4\dot{T}_D + 2\dot{T}_1 - \dot{T}_2) + \frac{u}{6}(-3T_D + 4T_1 - T_2) + \frac{\kappa}{3l_1}(7T_D - 8T_1 + T_2) = \frac{l_1}{6}S + q_D^h, \tag{2.3-21}$$
which we recognize as the (consistent) heat flux into the domain at $x = 0$ via
rearrangement:
$$q_D^h = \kappa\,\frac{7T_D - 8T_1 + T_2}{3l_1} + \frac{l_1}{6}\left[\frac{1}{5}(4\dot{T}_D + 2\dot{T}_1 - \dot{T}_2) + u\,\frac{-3T_D + 4T_1 - T_2}{l_1} - S\right], \tag{2.3-22}$$
which, for $l_1 \to 0$, is $q_D^h = -\kappa\,\partial T/\partial x|_0 + O(l_1^3)$. But for finite $l_1$, (2.3-22) describes the
most accurate estimate to $q_D$ available by accounting for more ('finite') physics than
simply heat conduction. (Further details of such post-processing will be provided in
Chapter 4, 'Derived Quantities,' as well as in Appendix 2.)
This concludes our 1D 'demonstration.' The reader should now both have a better 'feel'
for the GFEM and be able to 'go farther'. Thus, we leave as an exercise the generation
of the ODE's using cubic (or higher) basis functions.
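For readers who prefer to let the computer do this exercise, the following is a minimal sketch, assuming Lagrange basis functions on equally spaced nodes of a reference element and exact polynomial integration via `numpy.polynomial.Polynomial`; the function names `lagrange_basis` and `element_matrices` are hypothetical. It returns the element mass, advection (unit velocity), and diffusion (unit diffusivity) matrices for any degree $p$, and for $p = 2$ it reproduces the coefficients that appear in (2.3-14) through (2.3-22).

```python
import numpy as np
from numpy.polynomial import Polynomial

def lagrange_basis(p):
    """Lagrange polynomials of degree p on equally spaced nodes in [0, 1]."""
    nodes = np.linspace(0.0, 1.0, p + 1)
    basis = []
    for a, xa in enumerate(nodes):
        poly = Polynomial.fromroots(np.delete(nodes, a))   # vanishes at all other nodes
        basis.append(poly / float(poly(xa)))               # normalize to 1 at its own node
    return basis

def element_matrices(p, length):
    """Exact mass, advection and diffusion element matrices for degree p."""
    phi = lagrange_basis(p)
    dphi = [f.deriv() for f in phi]

    def integral(poly):
        antiderivative = poly.integ()
        return antiderivative(1.0) - antiderivative(0.0)

    n = p + 1
    M = np.array([[integral(phi[i] * phi[j]) for j in range(n)] for i in range(n)]) * length
    C = np.array([[integral(phi[i] * dphi[j]) for j in range(n)] for i in range(n)])
    K = np.array([[integral(dphi[i] * dphi[j]) for j in range(n)] for i in range(n)]) / length
    return M, C, K

M2, C2, K2 = element_matrices(p=2, length=1.0)
print(np.round(30.0 * M2, 10))    # mass matrix times 30
print(np.round(6.0 * C2, 10))     # advection matrix times 6
print(np.round(3.0 * K2, 10))     # diffusion matrix times 3

M3, C3, K3 = element_matrices(p=3, length=1.0)   # cubic element matrices
```

Assembling these element matrices node by node, exactly as for linear elements, gives the global ODE system; equally spaced interior nodes are one choice among several for higher-degree elements.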
2.3.2 Two Dimensions with Bilinear Elements
Here we wish to present and examine the semi-discrete equations, (2.2-2) or (2.2-16) with
$\beta = 0$, on a mesh of variable rectangles for both bilinear and, in the next two sections,
biquadratic elements, both in the interior ($\Omega$) and on the boundary ($\Gamma$). For one case
(bilinear elements), we will even show how the advection terms change when $\beta$ is set
to 1 (flux-divergence/conservation form) or to 1/2 (quadratically conserving form). We
remark that anyone who decides to verify/check these results is well-advised to use a
'symbolic/algebraic-manipulator' package rather than 'do it by hand' as was done by the
authors (PMG, in fact).
a. An Interior 4-Patch
We start with a general 4-patch of general rectangular elements and a 'general' velocity
field, i.e., the velocity is variable and expressed (via interpolation) in terms of the bilinear
basis functions. The semi-discrete equation for the 'control' node (0) comprises the
following terms for the 4-patch shown in Figure 2.3-3, in which we employ compass-point
notation:
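Fig. 2.3-3 A 4-patch of bilinear elements.

1. $$M\dot{T}|_0 = \frac{1}{36}\left\{l_1h_1\dot{T}_{SW} + l_2h_1\dot{T}_{SE} + l_2h_2\dot{T}_{NE} + l_1h_2\dot{T}_{NW} + 2\left[(l_1 + l_2)(h_1\dot{T}_S + h_2\dot{T}_N) + (h_1 + h_2)(l_1\dot{T}_W + l_2\dot{T}_E)\right] + 4(l_1 + l_2)(h_1 + h_2)\dot{T}_0\right\}.$$
2. For $\mathbf{K} \to \kappa$, $K_D \approx -\kappa\nabla^2$:
$$\begin{aligned}
K_DT|_0 = {}& \frac{\kappa}{6}\left\{\frac{h_1}{l_1}\left[2(T_0 - T_W) + (T_S - T_{SW})\right] + \frac{h_1}{l_2}\left[2(T_0 - T_E) + (T_S - T_{SE})\right]\right. \\
&\qquad \left. + \frac{h_2}{l_1}\left[2(T_0 - T_W) + (T_N - T_{NW})\right] + \frac{h_2}{l_2}\left[2(T_0 - T_E) + (T_N - T_{NE})\right]\right\} \\
& + \frac{\kappa}{6}\left\{\frac{l_1}{h_1}\left[2(T_0 - T_S) + (T_W - T_{SW})\right] + \frac{l_2}{h_1}\left[2(T_0 - T_S) + (T_E - T_{SE})\right]\right. \\
&\qquad \left. + \frac{l_1}{h_2}\left[2(T_0 - T_N) + (T_W - T_{NW})\right] + \frac{l_2}{h_2}\left[2(T_0 - T_N) + (T_E - T_{NE})\right]\right\}.
\end{aligned}$$
3. $N(\mathbf{u})T = N_x(u)T + N_y(v)T \approx u(\partial T/\partial x) + v(\partial T/\partial y)$, where $u = \sum_j u_j\phi_j$ and $v = \sum_j v_j\phi_j$:
$$\begin{aligned}
N_x(u)T|_0 = {}& \frac{1}{72}\left\{h_1(u_W + u_{SW})(T_S - T_{SW}) + \left[3(h_1 + h_2)u_W + h_1u_{SW} + h_2u_{NW}\right](T_0 - T_W) + h_2(u_W + u_{NW})(T_N - T_{NW})\right\} \\
& + \frac{1}{36}\left\{h_1(u_0 + u_S)(T_{SE} - T_{SW}) + \left[3(h_1 + h_2)u_0 + h_1u_S + h_2u_N\right](T_E - T_W) + h_2(u_0 + u_N)(T_{NE} - T_{NW})\right\} \\
& + \frac{1}{72}\left\{h_1(u_E + u_{SE})(T_{SE} - T_S) + \left[3(h_1 + h_2)u_E + h_1u_{SE} + h_2u_{NE}\right](T_E - T_0) + h_2(u_E + u_{NE})(T_{NE} - T_N)\right\},
\end{aligned}$$
$$\begin{aligned}
N_y(v)T|_0 = {}& \frac{1}{72}\left\{l_1(v_{SW} + v_S)(T_W - T_{SW}) + \left[3(l_1 + l_2)v_S + l_1v_{SW} + l_2v_{SE}\right](T_0 - T_S) + l_2(v_S + v_{SE})(T_E - T_{SE})\right\} \\
& + \frac{1}{36}\left\{l_1(v_0 + v_W)(T_{NW} - T_{SW}) + \left[3(l_1 + l_2)v_0 + l_1v_W + l_2v_E\right](T_N - T_S) + l_2(v_0 + v_E)(T_{NE} - T_{SE})\right\} \\
& + \frac{1}{72}\left\{l_1(v_N + v_{NW})(T_{NW} - T_W) + \left[3(l_1 + l_2)v_N + l_1v_{NW} + l_2v_{NE}\right](T_N - T_0) + l_2(v_N + v_{NE})(T_{NE} - T_E)\right\},
\end{aligned}$$
which involves a linear combination of upwind, centered, and downwind differences for
$\mathbf{u}\cdot\nabla T$.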
Combining the above three results in the form $M\dot{T}|_0 + N(\mathbf{u})T|_0 + K_DT|_0 = MS|_0$,
where $MS|_0$ is, for $S = \sum_j S_j\phi_j$, the same form as $M\dot{T}|_0$ and thus need not be explicitly
written out, yields the GFEM ODE for $T_0$. [Note that if $S$ is constant, then $MS|_0 =
(l_1 + l_2)(h_1 + h_2)/4 \cdot S$.]
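The 4-patch coefficients can be cross-checked without any hand algebra. The following is a minimal sketch, assuming a tensor-product mesh of rectangles, so that the assembled 2D matrices are Kronecker products of the assembled 1D linear-element matrices, and assuming unit diffusivity and a constant unit $x$-velocity; the helper names `line_matrices` and `node`, and the sample element sizes, are hypothetical.

```python
import numpy as np

def line_matrices(a, b):
    """Assembled 1D linear-element matrices on a 3-node line with element lengths a, b."""
    M = np.array([[2 * a, a, 0.0], [a, 2 * (a + b), b], [0.0, b, 2 * b]]) / 6.0
    C = 0.5 * np.array([[-1.0, 1.0, 0.0], [-1.0, 0.0, 1.0], [0.0, -1.0, 1.0]])
    K = np.array([[1 / a, -1 / a, 0.0], [-1 / a, 1 / a + 1 / b, -1 / b], [0.0, -1 / b, 1 / b]])
    return M, C, K

l1, l2, h1, h2 = 0.2, 0.5, 0.3, 0.4          # west/east widths, south/north heights
Mx, Cx, Kx = line_matrices(l1, l2)
My, Cy, Ky = line_matrices(h1, h2)

M2d = np.kron(Mx, My)                        # mass
KD = np.kron(Kx, My) + np.kron(Mx, Ky)       # diffusion with kappa = 1
Nx = np.kron(Cx, My)                         # x-advection with constant u = 1

def node(ix, iy):
    """Global index of grid point (ix, iy); ix counts west to east, iy south to north."""
    return 3 * ix + iy

c = node(1, 1)                               # the 'control' node 0 of the 4-patch
print("SW mass coefficient :", M2d[c, node(0, 0)], "vs", l1 * h1 / 36.0)
print("S  mass coefficient :", M2d[c, node(1, 0)], "vs", 2 * (l1 + l2) * h1 / 36.0)
print("lumped mass (row sum):", M2d[c].sum(), "vs", (l1 + l2) * (h1 + h2) / 4.0)
print("E advection coeff.  :", Nx[c, node(2, 1)], "vs", (h1 + h2) / 6.0)
```

The row sum of the mass matrix at the control node equals the lumped mass $(l_1 + l_2)(h_1 + h_2)/4$, which is the factor used below to convert the GFEM equation into its finite difference form.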
Since the GFEM equation as stated (and as usually seen by the computer) is not in
a clearly tasteful form, we next multiply it by the inverse of the lumped mass matrix
[$M_L = (l_1 + l_2)(h_1 + h_2)/4$] and rearrange the terms to reduce it to a more palatable
form; i.e., the finite difference form discussed in Section 2.2.5, which is more easily
interpreted. The result is, where $A_{SW} = l_1h_1$ (the area of the southwest element), etc.,
and $A_T = A_{SW} + A_{SE} + A_{NE} + A_{NW} = (l_1 + l_2)(h_1 + h_2)$,
Asw ■ ANW ■ ANE ■ Ase ^ \ 0 (ANE + ANW ■ Ase + Asw +
—A— I SW ~\ — I nw ~\ — I NE ~r —— 1 SE ) -r *■ \ : '/vi 1 s
AT AT AT AT ) \ AT AT
. Ase +ANE + . Asw + ANW ■ \ ■
+ : 1 e H : lw +4/()
SOME SEMI-DISCRETE EQUATIONS 71
+
6 6
2h\ M/v + Usw Ts — T<
sw
/_ , hi usw + h2uNW \
3uw H
hi+h2 2 (l\+h
+ 4
/ll +/I2
V
7\) — Tw
/i+/2
+
2/i2
w
h\ +h2
4 l
+ 6l6
/l+/2
2/Zi Mo + M.v TSE — T
sw
(' , h{us + h2uN \
3m0 H ; ;
hi+h2 2 (/1+/2)
+ 4
/ll +/l2
V
T'e — T w
(/1+/2)
+
2/z2 Mo + «/v 7^/vf - 7
/vw
/ii+/i2 2 (/1+/2)
+ 6I6
2h\ uE + use Tse - Ts
(\ + h^uSE + h2uNE\
h\+h2 2 fl\+l
+ 4
hx +h2
V
/1+/2
+
2/?2 Mf + Une T^e ~ T^
h{+hQ 2 fli+l
+ 6I6
2/1 vs + vsw Tw - T<
sw
(~ , Uvsw + 12Vse\
3VS H : ;
/l+/2
/ii +h2
+ 4
/1+/2
V
hi +h2
+
2/2 ^5 + vSE TE - TSE
h+h
h{ +h2
+
6 | 6
2/1 Vq + Vw TNW-T
( 3v,^W + l2VE\
SW
/1+/2
(h+h2)
+ 4
/.+/;
TN-TS
(/ii+/j2)
+
2/2 Vo + vE TNE - TSE
/,+/2 2 (h{+h2)
+ 6l6
2/1 t>/v + Vnw Tnw~Tw
3vN H
/1+/2
hx +h2
+ 4
/1+/2
V
/
Tn — Tq
h{+h2\
72 THE ADVECTION-DIFFUSION EQUATION
+
2/2 vN + vNE TNE - TE
l\ +/o
h\ -f h2
l\+h
2h\ (Ts — TSw TSe — Ts\ {TQ — Tw TE — TQ
+
2hj
h\ + h2
K
hi+h2 V l\
Tn ~ Tftw The — Tn
h
hx +h2
l\ h
2/1 I1"w — Tsw Tnw ~ Tw
/,+/2
/i,
h2
+
+
21:
Te — Tse Tne — Tt
/,+/2
A$w
hx
/t/vw AftE a AsE „
. Sjiv H—-—Snw H—-—SNE H—-—Sse
AT At At At
. ^ ,'Ane + Anw „ Ase + Asw _
+ I I : ^N H : J5
+
AT " AT
Ase + Ane „ ^5iv + Anw
^ _l —
Sw + 4S0
+ 4
/,
/5
To - Ts TN- TQ
(2.3-23)
Interesting, isn't it? The finite element method displayed. (Palatable, maybe; but still
not enticing!) The following remarks seem appropriate:
1. All terms are consistently weighted according to element size, a point especially
noteworthy, both now and later when we address the subject of 'mass lumping.'
2. 'Superimposed' on the element size weighting is the characteristic averaging associated
with linear basis functions; namely, (1 4 1)/6.
3. The advection approximation is composed of one-sixth upwind, two-thirds centered, and
one-sixth downwind differences. It also employs an interesting combination of averaged
coefficients.
4. The diffusion term (as in 1D) is in the form $-\nabla\cdot\mathbf{q}$ where $\mathbf{q} = -\kappa\nabla T$.
5. It is trivial to reduce the advection terms to their constant velocity counterparts.
6. The GFEM treatment of products (here in the term $\mathbf{u}\cdot\nabla T$) is complicated and costly
in 'operation counts'; its cost-effectiveness for advection is still an open issue, especially
when 'wiggles' are generated in such a costly way—a subject we will later return to,
briefly, in Section 2.5.4.
7. GFEM generates many averages of averages (usually weighted) on a uniform
rectangular mesh, wherein the extra effort (of averaging) may not always be worth the extra
work; it may be the case that it is on general, distorted isoparametric element meshes for
which the extra effort really pays off—i.e., for complex geometry.
8. If side SE-E-NE were a boundary segment with time-varying Dirichlet data supplied,
the nodal values of $T_{SE}$, $T_E$, and $T_{NE}$, as well as their time derivatives, would ultimately
wind up on the RHS as given data.
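For the record, and for completeness, we present below the uniform mesh ($l_i = l$, $h_i = h$)
version of the above AD equation:
\[
\begin{aligned}
&\frac{1}{36}\big[(\dot T_{SW}+\dot T_{NW}+\dot T_{NE}+\dot T_{SE}) + 4(\dot T_N+\dot T_S+\dot T_E+\dot T_W) + 16\dot T_0\big] \\
&+ \frac16\Big\{\frac16\,\frac{u_W+u_{SW}}{2}\,\frac{T_S-T_{SW}}{l}
 + \frac23\cdot\frac14\Big(3u_W+\frac{u_{SW}+u_{NW}}{2}\Big)\frac{T_0-T_W}{l}
 + \frac16\,\frac{u_W+u_{NW}}{2}\,\frac{T_N-T_{NW}}{l}\Big\} \\
&+ \frac23\Big\{\frac16\,\frac{u_0+u_S}{2}\,\frac{T_{SE}-T_{SW}}{2l}
 + \frac23\cdot\frac14\Big(3u_0+\frac{u_S+u_N}{2}\Big)\frac{T_E-T_W}{2l}
 + \frac16\,\frac{u_0+u_N}{2}\,\frac{T_{NE}-T_{NW}}{2l}\Big\} \\
&+ \frac16\Big\{\frac16\,\frac{u_E+u_{SE}}{2}\,\frac{T_{SE}-T_S}{l}
 + \frac23\cdot\frac14\Big(3u_E+\frac{u_{SE}+u_{NE}}{2}\Big)\frac{T_E-T_0}{l}
 + \frac16\,\frac{u_E+u_{NE}}{2}\,\frac{T_{NE}-T_N}{l}\Big\} \\
&+ \frac16\Big\{\frac16\,\frac{v_S+v_{SW}}{2}\,\frac{T_W-T_{SW}}{h}
 + \frac23\cdot\frac14\Big(3v_S+\frac{v_{SW}+v_{SE}}{2}\Big)\frac{T_0-T_S}{h}
 + \frac16\,\frac{v_S+v_{SE}}{2}\,\frac{T_E-T_{SE}}{h}\Big\} \\
&+ \frac23\Big\{\frac16\,\frac{v_0+v_W}{2}\,\frac{T_{NW}-T_{SW}}{2h}
 + \frac23\cdot\frac14\Big(3v_0+\frac{v_W+v_E}{2}\Big)\frac{T_N-T_S}{2h}
 + \frac16\,\frac{v_0+v_E}{2}\,\frac{T_{NE}-T_{SE}}{2h}\Big\} \\
&+ \frac16\Big\{\frac16\,\frac{v_N+v_{NW}}{2}\,\frac{T_{NW}-T_W}{h}
 + \frac23\cdot\frac14\Big(3v_N+\frac{v_{NW}+v_{NE}}{2}\Big)\frac{T_N-T_0}{h}
 + \frac16\,\frac{v_N+v_{NE}}{2}\,\frac{T_{NE}-T_E}{h}\Big\} \\
&- \frac{\kappa}{6}\big[(T_{SW}-2T_S+T_{SE})/l^2 + 4(T_W-2T_0+T_E)/l^2 + (T_{NW}-2T_N+T_{NE})/l^2\big] \\
&- \frac{\kappa}{6}\big[(T_{SW}-2T_W+T_{NW})/h^2 + 4(T_S-2T_0+T_N)/h^2 + (T_{SE}-2T_E+T_{NE})/h^2\big] \\
&= \frac{1}{36}\big[(S_{SW}+S_{NW}+S_{NE}+S_{SE}) + 4(S_N+S_S+S_E+S_W) + 16S_0\big],
\end{aligned}
\tag{2.3-24}
\]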
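which seems to warrant no further comment, except that the simple case of constant
velocity is easily obtainable and gives, not surprisingly,
\[
u\frac{\partial T}{\partial x} \to \frac{u}{6}\Big(\frac{T_{SE}-T_{SW}}{2l} + 4\,\frac{T_E-T_W}{2l} + \frac{T_{NE}-T_{NW}}{2l}\Big),
\]
with an obvious analogous term for $v\,\partial T/\partial y$.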
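The reduction to constant velocity can be checked numerically. The sketch below is illustrative only: it assumes the x-advection expression $72N_x(\mathbf{u})T|_0$ has the same upwind/centered/downwind structure as the visible $v$-terms (obtained here by symmetry), and the function names and nodal data are hypothetical.

```python
NODES = ["SW", "S", "SE", "W", "0", "E", "NW", "N", "NE"]


def advective_x_72(u, T, h1, h2):
    """72 * N_x(u)T at node 0 of the bilinear 4-patch (advective form, x-part).

    Assumed form: written by symmetry with the y-terms, one row of nodes at a time.
    """
    def row(hh, s):  # s = "S" (southern row, height h1) or "N" (northern row, height h2)
        sE, sW = s + "E", s + "W"
        return hh * ((u["W"] + u[sW]) * (T[s] - T[sW])            # upwind
                     + 2 * (u["0"] + u[s]) * (T[sE] - T[sW])      # centered
                     + (u["E"] + u[sE]) * (T[sE] - T[s])          # downwind
                     + (3 * u["W"] + u[sW]) * (T["0"] - T["W"])   # middle row, upwind
                     + (6 * u["0"] + 2 * u[s]) * (T["E"] - T["W"])
                     + (3 * u["E"] + u[sE]) * (T["E"] - T["0"]))
    return row(h1, "S") + row(h2, "N")


def constant_velocity_72(u0, T, l, h):
    """72 * M_L * (u/6)[...] on a uniform l x h mesh, where M_L = l*h."""
    bracket = ((T["SE"] - T["SW"]) + 4 * (T["E"] - T["W"]) + (T["NE"] - T["NW"])) / (2 * l)
    return 72 * (l * h) * (u0 / 6) * bracket


l, h, u0 = 0.5, 0.25, 2.0
T = {name: float((7 * i) % 5) + 0.1 * i for i, name in enumerate(NODES)}  # arbitrary data
u = {name: u0 for name in NODES}
general = advective_x_72(u, T, h, h)
reduced = constant_velocity_72(u0, T, l, h)
```

With a constant velocity and equal element heights, `general` and `reduced` should agree to rounding error, which is the (1 4 1)/6 weighting of three centered differences.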
For the special case of constant velocity on a uniform rectangular mesh, it is interesting
to note that the 2D equation can be generated using the outer products (tensor products;
see, for example, Wait and Mitchell, 1985) of the corresponding 1D 'operators', because
the 2D basis functions can be written as the (tensor) product of the corresponding 1D
basis functions; $\phi_i(x, y) = \psi_i(x)\cdot\psi_i(y)$, where $\psi(x) = 1 \pm x/l$, etc. This permits the
separation of each 2D integration into the product of two 1D integrations. The result is
a 'speedy' way to go from a 1D 'stencil' to the corresponding 2D version. Example: the
1D mass matrix contains the operator $l\,(1\;4\;1)/6$ in the x-direction and $h\,(1\;4\;1)/6$ in y.
The outer product of these operators is
\[
\frac{h}{6}\begin{pmatrix}1\\4\\1\end{pmatrix}\frac{l}{6}\,(1\;\;4\;\;1) = \frac{lh}{36}\begin{pmatrix}1&4&1\\4&16&4\\1&4&1\end{pmatrix},
\]
which is just the 2D stencil for M presented above. Similarly, the term $u\,\partial T/\partial x$ in 2D can
be generated from the outer product of the 1D mass operator in y with the 1D advection
operator in x, thus
\[
\frac{h}{6}\begin{pmatrix}1\\4\\1\end{pmatrix}\frac{u}{2}\,(-1\;\;0\;\;1) = \frac{uh}{12}\begin{pmatrix}-1&0&1\\-4&0&4\\-1&0&1\end{pmatrix},
\]
which is clearly the appropriate 2D representation. For further details on these matters,
see Wait and Mitchell (1985) or, especially, Fletcher (1991), who shows how to use it
on non-uniform rectangular meshes, although still with constant coefficients. For general,
isoparametric meshes with variable velocity, however, the simple tensor-product
formulation does not work; numerical integration a la 'conventional' FEM construction of the
assembled equations is the only way.
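The tensor-product shortcut is easy to mimic in a few lines. The following is a minimal sketch (plain Python lists; the helper name `outer` and the sample values of `l`, `h`, `u` are assumptions for illustration) that builds the 2D mass and x-advection stencils from the 1D operators quoted above.

```python
def outer(op_y, op_x):
    """Outer (tensor) product: rows follow the y-operator, columns the x-operator."""
    return [[wy * wx for wx in op_x] for wy in op_y]


l, h, u = 0.5, 0.25, 2.0                      # illustrative mesh sizes and velocity

mass_x = [l * w / 6 for w in (1, 4, 1)]       # 1D consistent mass operator in x
mass_y = [h * w / 6 for w in (1, 4, 1)]       # 1D consistent mass operator in y
adv_x = [u * w / 2 for w in (-1, 0, 1)]       # 1D GFEM advection operator, u dT/dx

mass_2d = outer(mass_y, mass_x)               # (l*h/36) * [[1,4,1],[4,16,4],[1,4,1]]
adv_2d = outer(mass_y, adv_x)                 # (u*h/12) * [[-1,0,1],[-4,0,4],[-1,0,1]]

for row in adv_2d:
    print(["%+.4f" % entry for entry in row])
```

Each row of `adv_2d` is a centered difference in x, and the rows carry the (1 4 1)/6 weighting in y, as in the constant-velocity form of (2.3-24).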
Digression on Mass Lumping:
Since we will later spend a fair amount of time discussing lumped mass approximations
to GFEM, it may be useful and appropriate here to show one form of the relationship
between lumped and consistent mass for bilinear elements. If we had used (2.2-28) to
define the mass matrix, $M \to M_L$, and the time-derivative terms in (2.3-24) would be
much simpler (uncoupled): all time derivatives are replaced by $\dot T_0$.
We now wish to examine the difference between $M$ and $M_L$ when 'operating' on a
smooth function, say $F(x, y)$, evaluated at the nodes and denoted by the vector $f$. We
easily obtain, for a mesh of $l \times h$ elements, from (2.3-24):
\[
(Mf - M_Lf)|_0 = \frac{lh}{36}\big[(f_{SW}+f_{NW}+f_{NE}+f_{SE}) + 4(f_N+f_S+f_E+f_W) - 20f_0\big],
\tag{2.3-25}
\]
which, if subjected to Taylor series analysis about node 0, in the form
$F(x+\delta, y+\epsilon) = F(x,y) + \sum_{n=1}^{\infty}\frac{1}{n!}\big(\delta\frac{\partial}{\partial x}+\epsilon\frac{\partial}{\partial y}\big)^n F(x,y)$, yields the following:
\[
(M - M_L)f|_0 = \frac{lh}{6}\,(l^2F_{xx} + h^2F_{yy})|_0 + lh\cdot O(l^4, h^4),
\tag{2.3-26}
\]
showing that the 'difference' between $M$ and $M_L$ shrinks with mesh refinement. Also,
multiplication by $M_L^{-1}$ and rearranging gives [see (2.2-30) through (2.2-32)],
\[
M_L^{-1}Mf|_0 = F(x_0, y_0) + \tfrac16\,(l^2F_{xx} + h^2F_{yy})|_0 + O(l^4, h^4).
\tag{2.3-27}
\]
For a fine enough mesh, CM and LM are equivalent; for the finite meshes that we are
forced to use in practice, however, they are not, and the effect of the difference can be
very significant, as we will later demonstrate.
End Digression
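The leading term in (2.3-26) can be confirmed on a test function. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the quadratic $F = x^2 + 3y^2$ is chosen because its Taylor series terminates, so (2.3-25) and the leading term of (2.3-26) should coincide exactly (up to rounding).

```python
def consistent_minus_lumped(F, x0, y0, l, h):
    """(M f - M_L f) at node 0 for a uniform l x h bilinear 4-patch, per (2.3-25)."""
    corners = sum(F(x0 + sx * l, y0 + sy * h) for sx in (-1, 1) for sy in (-1, 1))
    edges = F(x0, y0 + h) + F(x0, y0 - h) + F(x0 + l, y0) + F(x0 - l, y0)
    return l * h / 36 * (corners + 4 * edges - 20 * F(x0, y0))


def leading_term(Fxx, Fyy, l, h):
    """Leading term of (2.3-26): (l*h/6) * (l^2 F_xx + h^2 F_yy)."""
    return l * h / 6 * (l * l * Fxx + h * h * Fyy)


def F(x, y):
    return x * x + 3 * y * y      # F_xx = 2, F_yy = 6 everywhere


l, h = 0.2, 0.1
difference = consistent_minus_lumped(F, 0.3, -0.7, l, h)
estimate = leading_term(2.0, 6.0, l, h)
print(difference, estimate)
```

For a general smooth $F$ the two numbers differ by the higher-order remainder, which shrinks as the mesh is refined.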
The results above are for the advective form ($\beta = 0$). The conservative form ($\beta = 1$) is
interesting in that the mix of upwind, centered, and downwind differences is replaced
by a mix of pointwise flux terms, $uT$, all in a centered difference format (a la finite
volume methods, before upwinding is invoked). It is sufficient to focus on just the
x-portion, as the rest follows from 'symmetry.' Calling then $N^c_x(uT)|_i = \int \phi_i\,\partial(u^hT^h)/\partial x$,
the conservative form of x-advection yields, for the same 4-patch:
\[
\begin{aligned}
72N^c_x(uT)|_0 ={}& h_1\big\{3[(u_0+u_E)(T_0+T_E) - (u_0+u_W)(T_0+T_W)] \\
&+ 2[(u_E+u_{SE})(T_E+T_{SE}) - (u_W+u_{SW})(T_W+T_{SW}) - (u_{SE}T_{SE} - u_{SW}T_{SW})] \\
&+ (u_S+u_{SE})(T_S+T_{SE}) - (u_S+u_{SW})(T_S+T_{SW}) + (u_0+u_{SE})(T_0+T_{SE}) \\
&- (u_0+u_{SW})(T_0+T_{SW}) + (u_S+u_E)(T_S+T_E) - (u_S+u_W)(T_S+T_W)\big\} \\
&+ h_2\big\{3[(u_0+u_E)(T_0+T_E) - (u_0+u_W)(T_0+T_W)] \\
&+ 2[(u_E+u_{NE})(T_E+T_{NE}) - (u_W+u_{NW})(T_W+T_{NW}) - (u_{NE}T_{NE} - u_{NW}T_{NW})] \\
&+ (u_N+u_{NE})(T_N+T_{NE}) - (u_N+u_{NW})(T_N+T_{NW}) + (u_0+u_{NE})(T_0+T_{NE}) \\
&- (u_0+u_{NW})(T_0+T_{NW}) + (u_N+u_E)(T_N+T_E) - (u_N+u_W)(T_N+T_W)\big\}.
\end{aligned}
\]
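If we multiply this by $M_L^{-1} = 4/[(l_1+l_2)(h_1+h_2)]$ and judiciously rearrange the result,
we obtain the FDM form of $\partial(uT)/\partial x$; i.e.,
\[
\begin{aligned}
M_L^{-1}N^c_x(uT)|_0 ={}& \frac19\,\frac{2h_1}{h_1+h_2}\,\frac{1}{l_1+l_2}\Big\{
 2\Big[\frac{u_E+u_{SE}}{2}\frac{T_E+T_{SE}}{2} - \frac{u_W+u_{SW}}{2}\frac{T_W+T_{SW}}{2}\Big]
 - \frac12\,(u_{SE}T_{SE} - u_{SW}T_{SW}) \\
&\quad + \Big[\frac{u_S+u_{SE}}{2}\frac{T_S+T_{SE}}{2} - \frac{u_S+u_{SW}}{2}\frac{T_S+T_{SW}}{2}\Big]
 + \Big[\frac{u_0+u_{SE}}{2}\frac{T_0+T_{SE}}{2} - \frac{u_0+u_{SW}}{2}\frac{T_0+T_{SW}}{2}\Big] \\
&\quad + \Big[\frac{u_S+u_E}{2}\frac{T_S+T_E}{2} - \frac{u_S+u_W}{2}\frac{T_S+T_W}{2}\Big]\Big\} \\
&+ \frac23\,\frac{1}{l_1+l_2}\Big[\frac{u_0+u_E}{2}\frac{T_0+T_E}{2} - \frac{u_0+u_W}{2}\frac{T_0+T_W}{2}\Big] \\
&+ \frac19\,\frac{2h_2}{h_1+h_2}\,\frac{1}{l_1+l_2}\Big\{
 2\Big[\frac{u_E+u_{NE}}{2}\frac{T_E+T_{NE}}{2} - \frac{u_W+u_{NW}}{2}\frac{T_W+T_{NW}}{2}\Big]
 - \frac12\,(u_{NE}T_{NE} - u_{NW}T_{NW}) \\
&\quad + \Big[\frac{u_N+u_{NE}}{2}\frac{T_N+T_{NE}}{2} - \frac{u_N+u_{NW}}{2}\frac{T_N+T_{NW}}{2}\Big]
 + \Big[\frac{u_0+u_{NE}}{2}\frac{T_0+T_{NE}}{2} - \frac{u_0+u_{NW}}{2}\frac{T_0+T_{NW}}{2}\Big] \\
&\quad + \Big[\frac{u_N+u_E}{2}\frac{T_N+T_E}{2} - \frac{u_N+u_W}{2}\frac{T_N+T_W}{2}\Big]\Big\},
\end{aligned}
\tag{2.3-28}
\]
which is rather easily interpreted: the $2h_1/(h_1+h_2)$ terms contribute a net $\tfrac13\,\partial(uT)/\partial x$
from the southern nodes, the middle term is clearly $\tfrac13\,\partial(uT)/\partial x$ from the central nodes,
and the remainder (the $2h_2/(h_1+h_2)$ terms) contribute the remaining $\tfrac13\,\partial(uT)/\partial x$, from
the northern nodes.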
Final 4-Patch Remarks:
(1) The conservation form results displayed as above involved a non-trivial amount
of manipulative algebra, but the results are felt to be worthwhile and instructive,
especially because we will later compare them with those generated by the control
volume FEM.
(2) It is easy to see how the conservation form will conserve T via addition of the nodal
equations: all interior contributions will cancel.
(3) Taylor series analysis of these equations might be fruitful, but this is doubtful.
(4) The quadratically conserving (skew-symmetric) form of $72N^S(\mathbf{u})T|_0$, obtainable
either by averaging the advective and divergence forms above or by forming the skew-symmetric
part of the advective-form matrix, a la Section 2.2.4, is (again for the x-portion):
\[
\begin{aligned}
72N^S(\mathbf{u})T|_0 = \tfrac32\big\{& h_1[(u_0+u_S+u_E+u_{SE})T_{SE} - (u_0+u_S+u_W+u_{SW})T_{SW}] \\
&+ [3(h_1+h_2)(u_0+u_E) + h_1(u_S+u_{SE}) + h_2(u_N+u_{NE})]\,T_E \\
&- [3(h_1+h_2)(u_0+u_W) + h_1(u_S+u_{SW}) + h_2(u_N+u_{NW})]\,T_W \\
&+ h_2[(u_0+u_N+u_E+u_{NE})T_{NE} - (u_0+u_N+u_W+u_{NW})T_{NW}]\big\},
\end{aligned}
\]
in which we note that the coefficients of $T_S$, $T_0$, and $T_N$ are all zero. (The diagonal entries
of any skew-symmetric matrix are zero.)
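Two of the statements above lend themselves to a quick numerical check: the conservative form is the discrete analogue of $u\,T_x + T\,u_x$, and the averaged (skew-symmetric) form does not involve $T_S$, $T_0$, or $T_N$. The sketch below is illustrative; the advective expression is assumed by symmetry with the $v$-terms quoted earlier, and all names and the random test data are hypothetical.

```python
import random

NODES = ["SW", "S", "SE", "W", "0", "E", "NW", "N", "NE"]


def advective_x_72(u, T, h1, h2):
    """72 * N_x(u)T at node 0 (advective form, x-part), one node row at a time."""
    def row(hh, s):
        sE, sW = s + "E", s + "W"
        return hh * ((u["W"] + u[sW]) * (T[s] - T[sW])
                     + 2 * (u["0"] + u[s]) * (T[sE] - T[sW])
                     + (u["E"] + u[sE]) * (T[sE] - T[s])
                     + (3 * u["W"] + u[sW]) * (T["0"] - T["W"])
                     + (6 * u["0"] + 2 * u[s]) * (T["E"] - T["W"])
                     + (3 * u["E"] + u[sE]) * (T["E"] - T["0"]))
    return row(h1, "S") + row(h2, "N")


def conservative_x_72(u, T, h1, h2):
    """72 * N^c_x(uT) at node 0 (conservative form, x-part)."""
    def row(hh, s):
        sE, sW = s + "E", s + "W"
        return hh * (3 * ((u["0"] + u["E"]) * (T["0"] + T["E"])
                          - (u["0"] + u["W"]) * (T["0"] + T["W"]))
                     + 2 * ((u["E"] + u[sE]) * (T["E"] + T[sE])
                            - (u["W"] + u[sW]) * (T["W"] + T[sW])
                            - (u[sE] * T[sE] - u[sW] * T[sW]))
                     + (u[s] + u[sE]) * (T[s] + T[sE]) - (u[s] + u[sW]) * (T[s] + T[sW])
                     + (u["0"] + u[sE]) * (T["0"] + T[sE]) - (u["0"] + u[sW]) * (T["0"] + T[sW])
                     + (u[s] + u["E"]) * (T[s] + T["E"]) - (u[s] + u["W"]) * (T[s] + T["W"]))
    return row(h1, "S") + row(h2, "N")


def skew_x_72(u, T, h1, h2):
    """Average of the advective and conservative forms."""
    return 0.5 * (advective_x_72(u, T, h1, h2) + conservative_x_72(u, T, h1, h2))


rng = random.Random(1)
u = {n: rng.uniform(-1, 1) for n in NODES}
T = {n: rng.uniform(-1, 1) for n in NODES}
h1, h2 = 0.3, 0.7

product_rule_gap = (conservative_x_72(u, T, h1, h2)
                    - advective_x_72(u, T, h1, h2) - advective_x_72(T, u, h1, h2))

T_shifted = dict(T, **{"S": T["S"] + 1.0, "0": T["0"] - 2.0, "N": T["N"] + 3.0})
skew_gap = skew_x_72(u, T_shifted, h1, h2) - skew_x_72(u, T, h1, h2)
```

Both `product_rule_gap` and `skew_gap` should vanish to rounding error for any nodal data.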
b. A boundary 2-patch
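Suppose the N-0-S side of the 4-patch shown above is an outflow plane, and the two
right-most elements are (of course) omitted. It is of interest (and not difficult, having
completed the 4-patch analysis) to determine the ODE corresponding to node 0 for this
case, when the BC given by (2.1-3) is employed. It is, from (2.2-2) term-by-term, in the
FDM form [here, $M_L = l_1(h_1+h_2)/4$]:
\[
\begin{aligned}
&\frac19\Big[\frac{h_1}{h_1+h_2}(\dot T_{SW}+2\dot T_S) + \frac{h_2}{h_1+h_2}(\dot T_{NW}+2\dot T_N) + 2(\dot T_W+2\dot T_0)\Big] \\
&+ \frac16\,\frac{2h_1}{h_1+h_2}\,\frac{u_{SW}+u_W+2u_0+2u_S}{6}\,\frac{T_S-T_{SW}}{l_1}
 + \frac16\,\frac{2h_2}{h_1+h_2}\,\frac{u_W+u_{NW}+2u_0+2u_N}{6}\,\frac{T_N-T_{NW}}{l_1} \\
&+ \Big[\frac{1}{12}\,\frac{2h_1}{h_1+h_2}\,\frac{2u_S+u_{SW}}{3} + \frac12\,\frac{2u_0+u_W}{3} + \frac{1}{12}\,\frac{2h_2}{h_1+h_2}\,\frac{2u_N+u_{NW}}{3}\Big]\frac{T_0-T_W}{l_1} \\
&+ \frac13\Big[\frac{v_{SW}+3v_S+6v_0+2v_W}{12}\,\frac{T_0-T_S}{(h_1+h_2)/2} + \frac{2v_W+6v_0+3v_N+v_{NW}}{12}\,\frac{T_N-T_0}{(h_1+h_2)/2}\Big] \\
&+ \frac16\Big[\frac{v_{SW}+v_S+2v_0+2v_W}{6}\,\frac{T_W-T_{SW}}{(h_1+h_2)/2} + \frac{2v_W+2v_0+v_N+v_{NW}}{6}\,\frac{T_{NW}-T_W}{(h_1+h_2)/2}\Big] \\
&+ \frac{2}{l_1}\,\frac{\kappa}{6}\Big[\frac{2h_1}{h_1+h_2}\,\frac{T_S-T_{SW}}{l_1} + 4\,\frac{T_0-T_W}{l_1} + \frac{2h_2}{h_1+h_2}\,\frac{T_N-T_{NW}}{l_1}\Big] \\
&- \frac{\kappa}{3}\,\frac{2}{h_1+h_2}\Big[2\Big(\frac{T_N-T_0}{h_2} - \frac{T_0-T_S}{h_1}\Big) + \Big(\frac{T_{NW}-T_W}{h_2} - \frac{T_W-T_{SW}}{h_1}\Big)\Big] \\
&+ \frac{2}{l_1}\,\frac{H}{6}\Big[\frac{2h_1}{h_1+h_2}T_S + 4T_0 + \frac{2h_2}{h_1+h_2}T_N\Big] \\
&= \frac19\Big[\frac{h_1}{h_1+h_2}(S_{SW}+2S_S) + 2(S_W+2S_0) + \frac{h_2}{h_1+h_2}(S_{NW}+2S_N)\Big] \\
&\quad + \frac{2}{l_1}\,\frac16\Big[\frac{2h_1}{h_1+h_2}(q_S+H\tilde T_S) + 4(q_0+H\tilde T_0) + \frac{2h_2}{h_1+h_2}(q_N+H\tilde T_N)\Big],
\end{aligned}
\tag{2.3-29}
\]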
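where we have allowed both $q$ and $\tilde T$ to vary along $\Gamma_N$ and have interpolated them via
the basis functions.
While most terms in this equation permit obvious identification with their continuous
counterparts, there are three that do not quite; namely, those with a coefficient of $2/l_1$,
and this is a key observation that will show how the ODE at node 0 is actually also the
Robin BC given by (2.1-3). So let us first multiply the equation by $l_1/2$ and rearrange:
\[
\begin{aligned}
&\frac{l_1}{2}\Big\{\frac{1}{18}\Big[\frac{2h_1}{h_1+h_2}(\dot T_{SW}+2\dot T_S-S_{SW}-2S_S) + 4(\dot T_W+2\dot T_0-S_W-2S_0)
 + \frac{2h_2}{h_1+h_2}(\dot T_{NW}+2\dot T_N-S_{NW}-2S_N)\Big] \\
&\quad+ \frac16\,\frac{2h_1}{h_1+h_2}\,\frac{u_{SW}+u_W+2(u_0+u_S)}{6}\,\frac{T_S-T_{SW}}{l_1}
 + \frac16\,\frac{2h_2}{h_1+h_2}\,\frac{u_W+u_{NW}+2(u_0+u_N)}{6}\,\frac{T_N-T_{NW}}{l_1} \\
&\quad+ \Big[\frac{1}{12}\,\frac{2h_1}{h_1+h_2}\,\frac{2u_S+u_{SW}}{3} + \frac12\,\frac{2u_0+u_W}{3} + \frac{1}{12}\,\frac{2h_2}{h_1+h_2}\,\frac{2u_N+u_{NW}}{3}\Big]\frac{T_0-T_W}{l_1} \\
&\quad+ \frac13\Big[\frac{v_{SW}+3v_S+6v_0+2v_W}{12}\,\frac{T_0-T_S}{(h_1+h_2)/2} + \frac{2v_W+6v_0+3v_N+v_{NW}}{12}\,\frac{T_N-T_0}{(h_1+h_2)/2}\Big] \\
&\quad+ \frac16\Big[\frac{v_{SW}+v_S+2v_0+2v_W}{6}\,\frac{T_W-T_{SW}}{(h_1+h_2)/2} + \frac{2v_W+2v_0+v_N+v_{NW}}{6}\,\frac{T_{NW}-T_W}{(h_1+h_2)/2}\Big] \\
&\quad- \frac{\kappa}{3}\,\frac{2}{h_1+h_2}\Big[2\Big(\frac{T_N-T_0}{h_2} - \frac{T_0-T_S}{h_1}\Big) + \Big(\frac{T_{NW}-T_W}{h_2} - \frac{T_W-T_{SW}}{h_1}\Big)\Big]\Big\} \\
&+ \frac{\kappa}{6}\Big[\frac{2h_1}{h_1+h_2}\,\frac{T_S-T_{SW}}{l_1} + 4\,\frac{T_0-T_W}{l_1} + \frac{2h_2}{h_1+h_2}\,\frac{T_N-T_{NW}}{l_1}\Big] \\
&+ \frac{H}{6}\Big[\frac{2h_1}{h_1+h_2}(T_S-\tilde T_S) + 4(T_0-\tilde T_0) + \frac{2h_2}{h_1+h_2}(T_N-\tilde T_N)\Big]
 = \frac16\Big[\frac{2h_1}{h_1+h_2}\,q_S + 4q_0 + \frac{2h_2}{h_1+h_2}\,q_N\Big].
\end{aligned}
\tag{2.3-30}
\]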
Note that the operation of multiplying by $l_1/2$ is equivalent to realizing that it is $M_{BL} =
(h_1+h_2)/2$, a boundary (lumped) mass matrix, which is the appropriate 'de-scrambler'
rather than the 'area' bulk/mass matrix used initially.
While this is indeed an ODE describing the behavior of $T_0$ and is indeed the equation
'solved' by the computer, it is also the Robin BC (2.1-3), at least approximately. For further
clarity, we first simplify the equation to the constant coefficient case on a uniform mesh:
\[
\begin{aligned}
&\frac{l}{2}\Big\{\frac19\Big[\frac{\dot T_{SW}+\dot T_{NW}}{2} + (\dot T_S+\dot T_N) + 2(\dot T_W+2\dot T_0)\Big] - S
 + \frac{u}{6}\Big[\frac{T_S-T_{SW}}{l} + 4\,\frac{T_0-T_W}{l} + \frac{T_N-T_{NW}}{l}\Big] \\
&\quad+ \frac{v}{3}\Big[\frac{T_{NW}-T_{SW}}{2h} + 2\,\frac{T_N-T_S}{2h}\Big]
 - \frac{\kappa}{3}\big[(T_{NW}-2T_W+T_{SW})/h^2 + 2(T_N-2T_0+T_S)/h^2\big]\Big\} \\
&+ \frac{\kappa}{6}\Big[\frac{T_S-T_{SW}}{l} + 4\,\frac{T_0-T_W}{l} + \frac{T_N-T_{NW}}{l}\Big]
 + H\Big[\frac{T_S+4T_0+T_N}{6} - \tilde T\Big] = q,
\end{aligned}
\tag{2.3-31}
\]
which represents a sort of (time-dependent) energy balance in the vicinity of node 0 on
a finite mesh, a balance that includes the specified (and time-independent) Robin BC.
As the mesh is refined, the contribution from the terms in curly brackets tends to vanish
because these terms are multiplied by $l$, which $\to 0$, and all that remains is the BC on
$\Gamma_N$. {The ostensibly 'missing' term, $\kappa\,\partial^2T/\partial x^2$ in the curly brackets terms, can be found
by a Taylor series analysis of the diffusive flux term in x; namely,
\[
\frac{\kappa}{6}\Big[\frac{T_S-T_{SW}}{l} + 4\,\frac{T_0-T_W}{l} + \frac{T_N-T_{NW}}{l}\Big]
 = \kappa\Big[\frac{\partial T}{\partial x} - \frac{l}{2}\,\frac{\partial^2T}{\partial x^2}\Big]_0 + O(l^2, h^2).\}
\]
From the ODE viewpoint, $l \to 0$ means that the time constant for node 0 is also
approaching zero, and that the ODE therefore approaches a quasi-steady-state condition,
which is another way to say that the Robin BC is being enforced.
Returning briefly to Taylor series one more time, the insertion of the exact solution
into (2.3-31) gives, after Taylor series expansion about node 0,
\[
\frac{l}{2}\Big\{\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} + v\frac{\partial T}{\partial y} - \kappa\nabla^2T - S + O(l)\Big\}_0
 + \kappa\frac{\partial T}{\partial x} + H(T-\tilde T) = q + O(h^2);
\]
i.e., the ODE at the outflow yields a second-order accurate (a la Taylor series)
approximation to the Robin BC, (2.1-3), as in 1D.
Finally, we remark that a variable-grid version of the Taylor series analysis would—for
both the boundary 2-patch and the interior 4-patch—yield only first-order accuracy, a
misleading result since the prevailing global theory proves second-order accuracy (still)
for the bilinear element.
c. A boundary corner
To finish, we consider the case involving a corner in the domain; for the sketch in
Figure 2.3-4, let W-0-S lie on the boundary of the domain. We will first present the
equation for node 0 under the condition that the Robin BC applies on both W-0 and 0-S.
Fig. 2.3-4 A corner bilinear element.
Referring to the element matrices again, Row 3 gives the final result; i.e., (2.2-2) for this
case is:
\[
\begin{aligned}
&\frac{lh}{36}\,(4\dot T_0 + 2\dot T_W + 2\dot T_S + \dot T_{SW}) \\
&+ \frac{h}{72}\big[(6u_0+3u_W+2u_S+u_{SW})(T_0-T_W) + (2u_0+2u_S+u_W+u_{SW})(T_S-T_{SW})\big] \\
&+ \frac{l}{72}\big[(6v_0+3v_S+2v_W+v_{SW})(T_0-T_S) + (2v_0+2v_W+v_S+v_{SW})(T_W-T_{SW})\big] \\
&+ \frac{\kappa h}{6l}\big[2(T_0-T_W) + (T_S-T_{SW})\big] + \frac{\kappa l}{6h}\big[2(T_0-T_S) + (T_W-T_{SW})\big] \\
&+ \frac{H}{6}\big[l(T_W+2T_0) + h(2T_0+T_S)\big] \\
&= \frac{lh}{36}\,(4S_0+2S_W+2S_S+S_{SW})
 + \frac16\Big\{l\big[(q_W+H\tilde T_W) + 2(q_0+H\tilde T_0)\big] + h\big[2(q_0+H\tilde T_0) + (q_S+H\tilde T_S)\big]\Big\}.
\end{aligned}
\tag{2.3-32}
\]
Here, the appropriate 'de-averaging' mass matrix is the boundary integral $M_{BL} = \int_\Gamma\phi_0 =
(l+h)/2$; division by $M_{BL}$ and rearranging then yields the FDM form,
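\[
\begin{aligned}
&\frac{lh}{2(l+h)}\Big[\frac19\,(4\dot T_0+2\dot T_W+2\dot T_S+\dot T_{SW})\Big] \\
&+ \frac{lh}{6(l+h)}\Big[2\,\frac{6u_0+3u_W+2u_S+u_{SW}}{12}\,\frac{T_0-T_W}{l} + \frac{2u_0+2u_S+u_W+u_{SW}}{6}\,\frac{T_S-T_{SW}}{l} \\
&\qquad\qquad + 2\,\frac{6v_0+3v_S+2v_W+v_{SW}}{12}\,\frac{T_0-T_S}{h} + \frac{2v_0+2v_W+v_S+v_{SW}}{6}\,\frac{T_W-T_{SW}}{h}\Big] \\
&+ \frac{\kappa}{3(l+h)}\Big[h\Big(2\,\frac{T_0-T_W}{l} + \frac{T_S-T_{SW}}{l}\Big) + l\Big(2\,\frac{T_0-T_S}{h} + \frac{T_W-T_{SW}}{h}\Big)\Big] \\
&+ \frac{H}{6}\Big[\frac{2l}{l+h}(T_W-\tilde T_W) + 4(T_0-\tilde T_0) + \frac{2h}{l+h}(T_S-\tilde T_S)\Big] \\
&= \frac{lh}{2(l+h)}\Big[\frac19\,(4S_0+2S_W+2S_S+S_{SW})\Big] + \frac16\Big[\frac{2l}{l+h}\,q_W + 4q_0 + \frac{2h}{l+h}\,q_S\Big]
\end{aligned}
\tag{2.3-33}
\]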
as the ODE for node 0, another interesting energy-balance-plus BC. As usual, the only
surviving terms upon mesh refinement are those from the Robin BC; in fact, the asymptotic
version of (2.3-33) is
\[
\kappa\Big(\frac{h}{l+h}\,\frac{\partial T}{\partial x} + \frac{l}{l+h}\,\frac{\partial T}{\partial y}\Big) + H(T-\tilde T) = q + O(l, h),
\]
which might seem to suggest a 'heat balance' definition of a consistent (and unique)
normal direction: $\mathbf{n}\cdot\nabla T = n_x(\partial T/\partial x) + n_y(\partial T/\partial y)$. But this implies $n_x = h/(l+h)$ and
$n_y = l/(l+h)$, which does not satisfy the requirement that $n_x^2 + n_y^2 = 1$. A better
interpretation of this result can be obtained via multiplication by $(l+h)$ and regrouping:
\[
h\Big[\kappa\frac{\partial T}{\partial x} + H(T-\tilde T) - q\Big] + l\Big[\kappa\frac{\partial T}{\partial y} + H(T-\tilde T) - q\Big] = O(l^2, h^2);
\]
i.e., it leads to a linear combination of the BC (energy balance) in the x-direction and that
in the y-direction. Or, perhaps more accurately, it is a linear combination of the residuals
of the BC equations.
There is one more 'corner scene' that we wish to examine, because the results are
interesting. Suppose the Robin BC is applied to only one of the two surfaces (say 0-S)
and Dirichlet data on the other. In this case, $T_0$ is given, and the GFEM equation at
node 0 is an equation for the heat flux there. It is obtained from (2.2-16) rather than from
(2.2-2), and all of the nodal temperatures are presumed to be known. [Recall that these
come from (2.2-2), in which there is no equation at node 0.] The result is (for $\beta = 0$),
after dividing by $M_{BL} = \int_\Gamma\phi_i = \int_W^0\phi\,dx = l/2$,
\[
\begin{aligned}
&\frac{h}{2}\Big[\frac19\,(4\dot T_0+2\dot T_W+2\dot T_S+\dot T_{SW}) \\
&\quad + \frac13\Big(2\,\frac{6u_0+3u_W+2u_S+u_{SW}}{12}\,\frac{T_0-T_W}{l} + \frac{2u_0+2u_S+u_W+u_{SW}}{6}\,\frac{T_S-T_{SW}}{l} \\
&\qquad\quad + 2\,\frac{6v_0+3v_S+2v_W+v_{SW}}{12}\,\frac{T_0-T_S}{h} + \frac{2v_0+2v_W+v_S+v_{SW}}{6}\,\frac{T_W-T_{SW}}{h}\Big) \\
&\quad - \frac19\,(4S_0+2S_W+2S_S+S_{SW})\Big] \\
&+ \frac{h}{3l}\,\kappa\Big(2\,\frac{T_0-T_W}{l} + \frac{T_S-T_{SW}}{l}\Big) + \frac{\kappa}{3}\Big(2\,\frac{T_0-T_S}{h} + \frac{T_W-T_{SW}}{h}\Big)
 + \frac{h}{3l}\,H\big[2(T_0-\tilde T_0) + (T_S-\tilde T_S)\big] \\
&= \frac{h}{3l}\,(2q_0+q_S) + \frac13\,(2q_{D0}+q_{DW}),
\end{aligned}
\tag{2.3-34}
\]
where $q_D^h = \sum_i q_{Di}\phi_i$ has been utilized to describe the (diffusive, in the limit) heat flux
into $\Omega$ through $\Gamma_D$. This is an equation for $q_{D0}$, basically. (See also Chapter 4.)
But, as seen several times already, this equation for $q_{D0}$ involves quite a lot of
information, and all gleaned from a single bilinear element! What does it tell us? In
general, it is an accounting of all energy 'flows' in the neighborhood of node 0. In
particular, asymptotic analysis leads to
\[
\frac{h}{l}\Big[\kappa\frac{\partial T}{\partial x} + H(T-\tilde T) - q\Big] + \kappa\frac{\partial T}{\partial y} = q_D + O(l, h),
\]
or
\[
h\Big[\kappa\frac{\partial T}{\partial x} + H(T-\tilde T) - q\Big] + l\Big(\kappa\frac{\partial T}{\partial y} - q_D\Big) = O(l^2, lh, h^2),
\]
which has the following interpretation, roughly:
1. If $h \to 0$ with $l$ fixed, then $q_D = \kappa\,\partial T/\partial y + O(l)$; the corner diffusive flux is in the
y-direction.
2. If $l \to 0$ with $h$ fixed, then the Robin BC, $\kappa(\partial T/\partial x) + H(T-\tilde T) = q$, is recovered: an
x-direction heat balance.
3. If $h/l$ is fixed and both $\to 0$, the most meaningful/sensible case, then it is (again) a
linear combination of the horizontal and vertical components of an energy balance in the
corner.
d. An internal line heat source
A sometimes useful mathematical model is one that employs a point heat source in 1D,
a line heat source in 2D, or a plane heat source in 3D. Here we shall show how, using
bilinear elements, a line source may be modeled in the plane (2D). And we shall show
two ways to do so, the second of which extends nicely (in the next chapter) to the
mathematical insertion of a pump, which makes the pressure jump.
For the first way, we return to Figure 2.3-3 and suppose that we wish to solve the heat
equation (transient in general, steady by dropping the acceleration terms) with a line heat
source along the column of nodes passing through S, 0, and N. The problem statement is
the following:
\[
\frac{\partial T}{\partial t} = \kappa\nabla^2T + Q(y)\,\delta(x - x_Q) \quad\text{in } \Omega,
\tag{2.3-35}
\]
plus appropriate BC's and IC's, the details of which do not concern us here because
we are simply interested in generating a typical internal node's equation. The line source
location is $x_Q$ (along which line we insist on placing some of our nodes), and $\delta(x)$ is the
Dirac delta function. The weak form of (2.3-35) is
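\[
\int_\Omega \phi_i\,Q(y)\,\delta(x - x_Q) - \int_\Omega \phi_i\frac{\partial T}{\partial t}
 = \int_\Omega \kappa\nabla\phi_i\cdot\nabla T - \oint_\Gamma \kappa\,\phi_i\frac{\partial T}{\partial n},
\tag{2.3-36}
\]
and we henceforth omit the boundary integral, since we are studying internal nodes only.
In fact, we will only examine the case $\phi_i = \phi_0$ in Figure 2.3-3, and we further simplify
to the case of equal element size, the result being, with $T \approx T^h = \sum_j T_j\phi_j$, and lumping
the mass for simplicity,
\[
\begin{aligned}
\int_{y_S}^{y_N}\phi_0(x_Q, y)\,Q(y)\,dy - lh\,\dot T_0
 ={}& \frac{\kappa h}{6l}\big\{(T_S-T_{SW}) - (T_{SE}-T_S) + 4[(T_0-T_W) - (T_E-T_0)] + (T_N-T_{NW}) - (T_{NE}-T_N)\big\} \\
&- \frac{\kappa l}{6h}\big[(T_{SW}-2T_W+T_{NW}) + 4(T_S-2T_0+T_N) + (T_{SE}-2T_E+T_{NE})\big],
\end{aligned}
\tag{2.3-37}
\]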
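in which the algebraic form of the x- and y-diffusion terms is purely intentional: the line
heat source will cause a jump (discontinuity) in the x-direction heat flux. If we divide by
$h$ and rearrange, we obtain
\[
\begin{aligned}
\frac1h\int_{y_S}^{y_N}\phi_0(x_Q, y)\,Q(y)\,dy - l\,\dot T_0
 ={}& \frac{\kappa}{6}\Big[\Big(\frac{T_S-T_{SW}}{l} - \frac{T_{SE}-T_S}{l}\Big) + 4\Big(\frac{T_0-T_W}{l} - \frac{T_E-T_0}{l}\Big)
 + \Big(\frac{T_N-T_{NW}}{l} - \frac{T_{NE}-T_N}{l}\Big)\Big] \\
&- \frac{\kappa l}{6}\Big[\frac{T_{SW}-2T_W+T_{NW}}{h^2} + 4\,\frac{T_S-2T_0+T_N}{h^2} + \frac{T_{SE}-2T_E+T_{NE}}{h^2}\Big],
\end{aligned}
\tag{2.3-38}
\]
which yields the following result for $l, h \to 0$:
\[
Q(y_0) = \kappa\Big[\!\Big[\frac{\partial T}{\partial x}\Big]\!\Big]_{x_Q},
\tag{2.3-39}
\]
where $[[\;]]$ denotes the jump in $\partial T/\partial x$ at $x = x_Q$; i.e.,
\[
Q(y_0) = \kappa\Big(\frac{\partial T}{\partial x}\Big|_{x_Q^-} - \frac{\partial T}{\partial x}\Big|_{x_Q^+}\Big).
\tag{2.3-40}
\]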
Remarks:
(1) The line heat source causes a jump in heat flux.
(2) Setting $Q(y) = 0$ in (2.3-37) recovers, properly, the 'conventional' GFEM equation,
in which both $\partial T/\partial t$ and $\partial^2T/\partial y^2$ are still present as $l, h \to 0$, converging to (2.3-35)
with $Q$ omitted.
(3) The 1D case, obtained by taking $Q(y) = Q_0$ and setting $T_N = T_S = T_0$, etc. in
(2.3-37), is
\[
Q_0 = \kappa\Big(\frac{T_0-T_W}{l} - \frac{T_E-T_0}{l}\Big) + l\,\dot T_0,
\tag{2.3-41}
\]
which converges to $Q_0 = \kappa[[\partial T/\partial x]]_{x_0}$ and has the following exact solution for the
steady-state situation with $T = 0$ at $x = 0$ and $x = L$:
\[
T_j = \begin{cases}
\dfrac{N-J}{N}\,j\,l\,Q_0/\kappa & \text{for } j = 0, 1, \ldots, J, \\[2mm]
\dfrac{N-j}{N}\,J\,l\,Q_0/\kappa & \text{for } j = J, J+1, \ldots, N,
\end{cases}
\tag{2.3-42}
\]
where there are $N$ elements and $N+1$ total nodes ($l = L/N$) and $j = J$ is the
location of the (only) point source of heat, $Q_0$. This piecewise linear function is, in
fact, the Green's function for $T_{xx}$, and the flux jump at $x = x_J$ is
\[
\Delta q = \kappa\Big(\frac{T_J-T_{J-1}}{l} - \frac{T_{J+1}-T_J}{l}\Big),
\]
which gives, using (2.3-42), $\Delta q = Q_0$.
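The 1D result is simple enough to verify directly. The sketch below is a minimal illustration, not the authors' code: it assembles the steady form of (2.3-41) at every interior node (a source only at node $J$), solves the tridiagonal system with the Thomas algorithm, and compares with the piecewise linear solution (2.3-42). Function names and the sample values of `N`, `J`, `L`, `kappa`, `Q0` are assumptions.

```python
def solve_point_source(N, J, L=1.0, kappa=1.0, Q0=1.0):
    """Solve kappa*(2*T_j - T_{j-1} - T_{j+1})/l = Q0*delta_{jJ}, T_0 = T_N = 0."""
    l = L / N
    n = N - 1                                  # number of interior unknowns
    sub = [-kappa / l] * n
    diag = [2 * kappa / l] * n
    sup = [-kappa / l] * n
    rhs = [0.0] * n
    rhs[J - 1] = Q0                            # point source at node J (1 <= J <= N-1)
    for i in range(1, n):                      # forward elimination (Thomas algorithm)
        m = sub[i] / diag[i - 1]
        diag[i] -= m * sup[i - 1]
        rhs[i] -= m * rhs[i - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = (rhs[i] - sup[i] * x[i + 1]) / diag[i]
    return [0.0] + x + [0.0]


def exact_solution(N, J, L=1.0, kappa=1.0, Q0=1.0):
    """Piecewise linear nodal values from (2.3-42)."""
    l = L / N
    return [(N - J) / N * j * l * Q0 / kappa if j <= J
            else (N - j) / N * J * l * Q0 / kappa
            for j in range(N + 1)]


N, J, L, kappa, Q0 = 10, 3, 2.0, 0.5, 4.0
T = solve_point_source(N, J, L, kappa, Q0)
T_exact = exact_solution(N, J, L, kappa, Q0)
l = L / N
flux_jump = kappa * ((T[J] - T[J - 1]) / l - (T[J + 1] - T[J]) / l)
max_error = max(abs(a - b) for a, b in zip(T, T_exact))
```

Because the exact solution is piecewise linear with its kink at a node, the discrete solution should reproduce it at the nodes, and `flux_jump` should equal `Q0`.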
The second way to solve the problem is less direct, involves NBC's, and starts by
'neglecting' the line heat source. Suppose we imagine, for a moment, that the two sides
of the line S-0-N in Figure 2.3-3 are disjoint and that we apply, separately to each surface,
an applied heat flux. The weak form is now (2.3-36) with the heat source term omitted
and the boundary integral retained. For the case in which the domain is on the left, we
say $q_L(y) = \kappa\,\partial T/\partial x|_\Gamma$ is the (leftward flowing) applied heat flux, and when the domain
is on the right, we use $q_R(y) = -\kappa\,\partial T/\partial x|_\Gamma$ as the (rightward flowing) applied heat flux in
the boundary integral terms. Thus, the two disjoint equations, from (2.3-36), are
\[
\begin{aligned}
\int_{y_S}^{y_N}\phi_0\,q_L(y)\,dy ={}& \frac{\kappa h}{6l}\big[(T_S-T_{SW}) + 4(T_0-T_W) + (T_N-T_{NW})\big] \\
&- \frac{\kappa l}{6h}\big[(T_{SW}-2T_W+T_{NW}) + 2(T_S-2T_0+T_N)\big] + \frac{lh}{2}\,\dot T_0,
\end{aligned}
\tag{2.3-43}
\]
which, not surprisingly, is like the 2-patch result given by (2.3-31), and
\[
\begin{aligned}
\int_{y_S}^{y_N}\phi_0\,q_R(y)\,dy ={}& \frac{\kappa h}{6l}\big[(T_S-T_{SE}) + 4(T_0-T_E) + (T_N-T_{NE})\big] \\
&- \frac{\kappa l}{6h}\big[2(T_S-2T_0+T_N) + (T_{NE}-2T_E+T_{SE})\big] + \frac{lh}{2}\,\dot T_0.
\end{aligned}
\tag{2.3-44}
\]
Finally, we rejoin the two pieces by summing the two equations to obtain the total applied
flux $q_{TOT} = q_L + q_R$. The result is (2.3-37), with $q_{TOT} = Q$; the total applied heat flux is
'equivalent' to a line source of internal generation.
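The 'rejoining' step can be checked term by term. In the sketch below (illustrative only; the function names, the sign conventions for $q_L$ and $q_R$ as inflows to their respective half-domains, and the random nodal data are assumptions of this sketch), the two one-sided flux integrals of (2.3-43) and (2.3-44) are summed and compared with the line-source integral implied by (2.3-37).

```python
import random


def flux_left(T, Tdot0, l, h, kappa):
    """Integral of phi_0 * q_L over S-0-N, from the left 2-patch, cf. (2.3-43)."""
    return (kappa * h / (6 * l) * ((T["S"] - T["SW"]) + 4 * (T["0"] - T["W"]) + (T["N"] - T["NW"]))
            - kappa * l / (6 * h) * ((T["SW"] - 2 * T["W"] + T["NW"]) + 2 * (T["S"] - 2 * T["0"] + T["N"]))
            + l * h / 2 * Tdot0)


def flux_right(T, Tdot0, l, h, kappa):
    """Integral of phi_0 * q_R over S-0-N, from the right 2-patch, cf. (2.3-44)."""
    return (kappa * h / (6 * l) * ((T["S"] - T["SE"]) + 4 * (T["0"] - T["E"]) + (T["N"] - T["NE"]))
            - kappa * l / (6 * h) * (2 * (T["S"] - 2 * T["0"] + T["N"]) + (T["NE"] - 2 * T["E"] + T["SE"]))
            + l * h / 2 * Tdot0)


def line_source(T, Tdot0, l, h, kappa):
    """Integral of phi_0 * Q over S-0-N, solved from (2.3-37) with lumped mass."""
    x_part = ((T["S"] - T["SW"]) - (T["SE"] - T["S"])
              + 4 * ((T["0"] - T["W"]) - (T["E"] - T["0"]))
              + (T["N"] - T["NW"]) - (T["NE"] - T["N"]))
    y_part = ((T["SW"] - 2 * T["W"] + T["NW"]) + 4 * (T["S"] - 2 * T["0"] + T["N"])
              + (T["SE"] - 2 * T["E"] + T["NE"]))
    return kappa * h / (6 * l) * x_part - kappa * l / (6 * h) * y_part + l * h * Tdot0


rng = random.Random(0)
T = {n: rng.uniform(0, 1) for n in ["SW", "S", "SE", "W", "0", "E", "NW", "N", "NE"]}
l, h, kappa, Tdot0 = 0.4, 0.25, 1.3, 0.7
total = flux_left(T, Tdot0, l, h, kappa) + flux_right(T, Tdot0, l, h, kappa)
source = line_source(T, Tdot0, l, h, kappa)
```

The two half-domain contributions split the y-diffusion and mass terms evenly, so `total` and `source` should agree to rounding error.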
Remarks:
(1) The individual heat fluxes (leftward-flowing, qL, and rightward-flowing, qR) can
be obtained from (2.3-43) and (2.3-44), respectively, after the temperature has been
obtained. This is nothing other than another application of the consistent flux
methodology of Chapter 4.
(2) If the sum of the fluxes ($q_{TOT}$) is set to zero, the conventional GFEM internal
node equation is obtained, modeling/approximating a continuous heat flux. Also
in this case, the use of (2.3-43) and (2.3-44) to compute $q_L$ and $q_R$ in a
'postprocessing' mode, would give (consistent) flux continuity: $q_L + q_R = 0$ as required.
(See Appendix 2 for further discussion of related issues.)
(3) If, instead of a specified heat source or a specified total heat flux along the line S-0-N,
it is desired to specify the temperature along this same line (an internal Dirichlet
BC!), we now see how: simply switch known and unknown variables in (2.3-37); $T_0$
(and $\dot T_0$ if time-dependent) is the given variable, and $Q_0$ from $\int_{y_S}^{y_N}\phi_0\,Q(y)\,dy$ is to be
determined, via $Q(y) = Q(y_S)\phi_S + Q(y_0)\phi_0 + Q(y_N)\phi_N$. A specified temperature
distribution along a line internal to $\Omega$ can be realized by applying the appropriate
heat source distribution along this same line. After the full temperature field is
available, (2.3-43) and (2.3-44) may be utilized to see how the specified heat flux is
distributed in the two directions, by computing $q_L$ and $q_R$, again via $q_L = q_{LS}\phi_S +
q_{L0}\phi_0 + q_{LN}\phi_N$, etc. [In both cases, i.e., in computing either $Q(y)$ or $q_L(y)$ and $q_R(y)$,
one may use (at least for this element) consistent or lumped mass.]
2.3.3 Two Dimensions with Biquadratic Elements
The first 'higher-order' element (and the highest we consider) in more than 1D is too
elaborate to make anything but the simplest case 'palatable' when it comes to writing
and pondering nodal equations. Thus, we a priori limit ourselves to the advective form
on a mesh of uniform rectangles with constant coefficients, and we drop $S$. The fact
that there are several different types of nodes to consider more than compensates for the
above simplifications.
To further streamline the presentation, we show only final results (and those in FDM
form only) obtainable by dividing by $\int\phi_i$, which is $lh/9$ for a corner node (in a 4-patch),
$2lh/9$ for an edge node (in a 2-patch), and $4lh/9$ for the center node of a 1-patch; surely
every reader by this time knows how to assemble equations from element matrices and
is thus more interested in the final results than more 'tutorial.'
a. An interior 4-patch
For a different presentation of the nine-node (and eight-node) results to be presented
below, see Gray and Pinder (1976). By 'different,' we mean simply an algebraic
rearrangement. We are suspicious, however, of portions of the ensuing error analysis in that
paper.
The 4-patch (and associated stencil/'molecule') for the interior node '0' comprises 25
nodes, as shown in Figure 2.3-5, in which the three node 'types' are shown with different
symbols.
[Figure: 5 x 5 array of nodes, rows listed from north to south: NNWW NNW NN NNE NNEE /
NWW NW N NE NEE / WW W 0 E EE / SWW SW S SE SEE / SSWW SSW SS SSE SSEE.]
Fig. 2.3-5 A 4-patch of biquadratic elements.
Thus, upon forming the GFEM equations for node 0 and converting them to FDM form,
one is led to the following 'interesting' 25-point stencil/ODE ($\Delta x = l/2$, $\Delta y = h/2$):
\[
\begin{aligned}
&\frac{1}{100}\big[64\dot T_0 + 16(\dot T_N+\dot T_S+\dot T_E+\dot T_W) - 8(\dot T_{NN}+\dot T_{SS}+\dot T_{EE}+\dot T_{WW})
 + 4(\dot T_{NE}+\dot T_{NW}+\dot T_{SW}+\dot T_{SE}) \\
&\qquad - 2(\dot T_{NNE}+\dot T_{NNW}+\dot T_{NWW}+\dot T_{SWW}+\dot T_{SSW}+\dot T_{SSE}+\dot T_{SEE}+\dot T_{NEE})
 + (\dot T_{NNEE}+\dot T_{NNWW}+\dot T_{SSWW}+\dot T_{SSEE})\big] \\
&+ \frac{u}{20}\Big(32\,\frac{T_E-T_W}{l} - 16\,\frac{T_{EE}-T_{WW}}{2l} + 16\,\frac{T_{NE}-T_{NW}+T_{SE}-T_{SW}}{2l}
 + 4\,\frac{T_{NNEE}-T_{NNWW}+T_{SSEE}-T_{SSWW}}{4l} \\
&\qquad - 8\,\frac{T_{SSE}-T_{SSW}+T_{NNE}-T_{NNW}}{2l} - 8\,\frac{T_{SEE}-T_{SWW}+T_{NEE}-T_{NWW}}{4l}\Big) \\
&+ \frac{v}{20}\Big(32\,\frac{T_N-T_S}{h} - 16\,\frac{T_{NN}-T_{SS}}{2h} + 16\,\frac{T_{NE}-T_{SE}+T_{NW}-T_{SW}}{2h}
 + 4\,\frac{T_{NNEE}-T_{SSEE}+T_{NNWW}-T_{SSWW}}{4h} \\
&\qquad - 8\,\frac{T_{NWW}-T_{SWW}+T_{NEE}-T_{SEE}}{2h} - 8\,\frac{T_{NNE}-T_{SSE}+T_{NNW}-T_{SSW}}{4h}\Big) \\
&= \frac{\kappa}{40}\Big(64\,\frac{T_E-2T_0+T_W}{\Delta x^2} + 32\,\frac{T_{NE}-2T_N+T_{NW}+T_{SE}-2T_S+T_{SW}}{2\Delta x^2}
 + 8\,\frac{T_{NNEE}-2T_{NN}+T_{NNWW}+T_{SSEE}-2T_{SS}+T_{SSWW}}{2l^2} \\
&\qquad - 32\,\frac{T_{EE}-2T_0+T_{WW}}{l^2} - 16\,\frac{T_{NNE}-2T_{NN}+T_{NNW}+T_{SSE}-2T_{SS}+T_{SSW}}{2\Delta x^2}
 - 16\,\frac{T_{NEE}-2T_N+T_{NWW}+T_{SEE}-2T_S+T_{SWW}}{2l^2}\Big) \\
&+ \frac{\kappa}{40}\Big(64\,\frac{T_N-2T_0+T_S}{\Delta y^2} + 32\,\frac{T_{NW}-2T_W+T_{SW}+T_{NE}-2T_E+T_{SE}}{2\Delta y^2}
 + 8\,\frac{T_{NNEE}-2T_{EE}+T_{SSEE}+T_{NNWW}-2T_{WW}+T_{SSWW}}{2h^2} \\
&\qquad - 32\,\frac{T_{NN}-2T_0+T_{SS}}{h^2} - 16\,\frac{T_{NEE}-2T_{EE}+T_{SEE}+T_{NWW}-2T_{WW}+T_{SWW}}{2\Delta y^2}
 - 16\,\frac{T_{NNE}-2T_E+T_{SSE}+T_{NNW}-2T_W+T_{SSW}}{2h^2}\Big).
\end{aligned}
\tag{2.3-45}
\]
Remarks:
(1) This higher-order approximation is seen to be simply a (non-simple) linear
combination of second-order centered difference approximations—an observation that applies
to most, if not all, of those stencils to follow.
(2) As discovered long ago by Carey (1976), it is not appropriate (though tempting) to
analyze the accuracy of this equation via TS expansions. The results would generally
be wrong, because they do not properly account for the 'difference' between node
types and the couplings between the different types of equations. This point is also
brought out in the recent text by Morton (1996).
(3) This is the 2D extension of the constant $l$ version of (2.3-18), to which it simplifies
by assuming 1D behavior.
(4) Mass lumping (see Section 2.5.1) can also be invoked by setting all 25 time
derivatives to $\dot T_0$.
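Because the coefficients are constant and the mesh is uniform, the 25 weights in (2.3-45) can be regenerated as tensor products of 1D operators. The sketch below is an illustration under a stated assumption: it uses the standard 1D quadratic-element results at a vertex node (mass row $(-1\;2\;8\;2\;-1)/10$ and first-derivative row $(1\;{-4}\;0\;4\;{-1})/(2l)$, both already divided by the lumped mass $l/3$), which are not quoted in this section. All names are hypothetical.

```python
from fractions import Fraction as Fr

OFFSETS = (-2, -1, 0, 1, 2)          # node offsets in units of dx = l/2 (or dy = h/2)

# 1D quadratic element, vertex node, normalised by the 1D lumped mass l/3 (assumed standard form)
VERTEX_MASS = [Fr(w, 10) for w in (-1, 2, 8, 2, -1)]


def vertex_ddx(l):
    """1D GFEM first-derivative row at a vertex node: 2*(T_E-T_W)/l - (T_EE-T_WW)/(2l)."""
    return [Fr(c) / (2 * l) for c in (1, -4, 0, 4, -1)]


# 2D mass weights: outer product of the y- and x-direction mass rows
mass_2d = {(i, j): wy * wx
           for j, wy in zip(OFFSETS, VERTEX_MASS)
           for i, wx in zip(OFFSETS, VERTEX_MASS)}

# 2D x-advection weights: mass row in y times derivative row in x
l = Fr(1, 2)
adv_2d = {(i, j): wy * wx
          for j, wy in zip(OFFSETS, VERTEX_MASS)
          for i, wx in zip(OFFSETS, vertex_ddx(l))}

# Apply the advection stencil to T = x (node spacing l/2); the exact derivative is 1
dTdx = sum(w * (i * l / 2) for (i, j), w in adv_2d.items())
```

Keys are `(i, j)` offsets east and north of node 0, so `mass_2d[(0, 0)]` is the weight of $\dot T_0$, `mass_2d[(1, 0)]` that of $\dot T_E$, and `mass_2d[(2, 2)]` that of $\dot T_{NNEE}$.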
b. An interior 2-patch
An appropriate 2-patch is shown in Figure 2.3-6 for a 'typical' edge node, such as node
N in the above 4-patch sketch. (The other edge nodes, like node E, can be easily derived
from that below via 'symmetry'.)
[Figure: 3 x 5 array of nodes, rows listed from north to south: NWW NW N NE NEE /
WW W 0 E EE / SWW SW S SE SEE.]
Fig. 2.3-6 A 2-patch of biquadratic elements.
The GFEM for node 0 in this case couples 14 others and generates the following ODE
in FDM format:
\[
\begin{aligned}
&\frac{1}{200}\big[128\dot T_0 + 32(\dot T_E+\dot T_W) + 16(\dot T_S+\dot T_N)
 + 4(\dot T_{SE}+\dot T_{SW}+\dot T_{NE}+\dot T_{NW}) - 16(\dot T_{EE}+\dot T_{WW}) \\
&\qquad - 2(\dot T_{SWW}+\dot T_{NWW}+\dot T_{SEE}+\dot T_{NEE})\big] \\
&+ \frac{u}{40}\Big(64\,\frac{T_E-T_W}{l} - 32\,\frac{T_{EE}-T_{WW}}{2l} + 16\,\frac{T_{NE}-T_{NW}+T_{SE}-T_{SW}}{2l}
 - 8\,\frac{T_{NEE}-T_{NWW}+T_{SEE}-T_{SWW}}{4l}\Big) \\
&+ \frac{v}{40}\Big(32\,\frac{T_N-T_S}{h} + 16\,\frac{T_{NW}-T_{SW}+T_{NE}-T_{SE}}{2h}
 - 8\,\frac{T_{NWW}-T_{SWW}+T_{NEE}-T_{SEE}}{2h}\Big) \\
&= \frac{\kappa}{80}\Big(128\,\frac{T_E-2T_0+T_W}{\Delta x^2} + 32\,\frac{T_{SE}-2T_S+T_{SW}+T_{NE}-2T_N+T_{NW}}{2\Delta x^2}
 - 64\,\frac{T_{EE}-2T_0+T_{WW}}{l^2} \\
&\qquad - 16\,\frac{T_{SWW}-2T_S+T_{SEE}+T_{NWW}-2T_N+T_{NEE}}{2l^2}\Big) \\
&+ \frac{\kappa}{80}\Big(64\,\frac{T_S-2T_0+T_N}{\Delta y^2} + 32\,\frac{T_{SE}-2T_E+T_{NE}+T_{SW}-2T_W+T_{NW}}{2\Delta y^2}
 - 16\,\frac{T_{SWW}-2T_{WW}+T_{NWW}+T_{SEE}-2T_{EE}+T_{NEE}}{2\Delta y^2}\Big),
\end{aligned}
\tag{2.3-46}
\]
wherein the 'anisotropy' of such a midside node is obvious. This equation properly
also degenerates to (2.3-18), for $l$ = constant, if the solution is taken to be 1D (no
y-variation). Also, mass lumping by equating all $\dot T$'s to $\dot T_0$ is viable (but less accurate).
c. An interior 1-patch
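Our last 'biquadratic equation' is also the simplest, and is contained in the single element
shown in Figure 2.3-7.
[Figure: 3 x 3 array of nodes, rows listed from north to south: NW N NE / W 0 E / SW S SE.]
Fig. 2.3-7 One nine-node element.
The central node equation is easily found to be
\[
\begin{aligned}
&\frac{1}{400}\big[256\dot T_0 + 32(\dot T_N+\dot T_S+\dot T_E+\dot T_W) + 4(\dot T_{NE}+\dot T_{NW}+\dot T_{SW}+\dot T_{SE})\big] \\
&+ \frac{u}{10}\Big(\frac{T_{SE}-T_{SW}}{l} + 8\,\frac{T_E-T_W}{l} + \frac{T_{NE}-T_{NW}}{l}\Big)
 + \frac{v}{10}\Big(\frac{T_{NW}-T_{SW}}{h} + 8\,\frac{T_N-T_S}{h} + \frac{T_{NE}-T_{SE}}{h}\Big) \\
&= \frac{\kappa}{10}\Big(\frac{T_{SW}-2T_S+T_{SE}}{\Delta x^2} + 8\,\frac{T_E-2T_0+T_W}{\Delta x^2} + \frac{T_{NE}-2T_N+T_{NW}}{\Delta x^2}\Big) \\
&\quad + \frac{\kappa}{10}\Big(\frac{T_{NW}-2T_W+T_{SW}}{\Delta y^2} + 8\,\frac{T_N-2T_0+T_S}{\Delta y^2} + \frac{T_{NE}-2T_E+T_{SE}}{\Delta y^2}\Big),
\end{aligned}
\tag{2.3-47}
\]
whose 1D version is easily seen to agree with (2.3-16); and, with the following statement,
we are done: boundary patches are left as exercises for the reader.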
2.3.4 Two Dimensions with Serendipity Elements
Because some still use the eight-node serendipity element (it can sometimes be slightly
more cost-effective than the nine-node element, although we do not recommend it),
and—especially—since it leads to some very interesting (bizarre) behavior when mass
lumping is invoked, we repeat the exercise for this element. Again, only final (FDM)
forms of the equations will be presented, and again only for constant coefficients on a
uniform rectangular mesh with S = 0. (As usual, the basis function-interpolated source
term can be instantly realized via the mass matrix.)
a. An interior 4-patch
We begin with the 25-node stencil shown earlier (Figure 2.3-5) for the nine-node element
and reduce it to a 21-node stencil by removing the center node from each element. The
concomitant serendipity basis functions and their element matrices (Appendix 1) then lead
to the following ODE, after some judicious rearrangement that even includes adding and
subtracting equal terms to make more sense of the GFEM equations:
\[
\begin{aligned}
&\frac{1}{60}\big[12(\dot T_N+\dot T_S+\dot T_E+\dot T_W)
 + 8(\dot T_{NEE}+\dot T_{NNE}+\dot T_{NNW}+\dot T_{NWW}+\dot T_{SWW}+\dot T_{SSW}+\dot T_{SSE}+\dot T_{SEE}) \\
&\qquad - 4(\dot T_{NN}+\dot T_{SS}+\dot T_{EE}+\dot T_{WW}) - 3(\dot T_{NNEE}+\dot T_{NNWW}+\dot T_{SSEE}+\dot T_{SSWW}) - 24\dot T_0\big] \\
&+ \frac{u}{15}\Big(8\,\frac{T_{EE}-T_{WW}}{2l} + 3\,\frac{T_{NNEE}-T_{NNWW}+T_{SSEE}-T_{SSWW}}{4l}
 - 10\,\frac{T_E-T_W}{l} + 14\,\frac{T_{NEE}-T_{NWW}+T_{SEE}-T_{SWW}}{4l}\Big) \\
&+ \frac{v}{15}\Big(8\,\frac{T_{NN}-T_{SS}}{2h} + 3\,\frac{T_{NNEE}-T_{SSEE}+T_{NNWW}-T_{SSWW}}{4h}
 - 10\,\frac{T_N-T_S}{h} + 14\,\frac{T_{NNE}-T_{SSE}+T_{NNW}-T_{SSW}}{4h}\Big) \\
&= \frac{\kappa}{15}\Big(28\,\frac{T_{EE}-2T_0+T_{WW}}{l^2} - 20\,\frac{T_E-2T_0+T_W}{\Delta x^2}
 - 10\,\frac{T_{NNE}-2T_{NN}+T_{NNW}+T_{SSW}-2T_{SS}+T_{SSE}}{2\Delta x^2} \\
&\qquad - 6\,\frac{T_{NEE}-2T_N+T_{NWW}+T_{SEE}-2T_S+T_{SWW}}{2l^2}
 + 23\,\frac{T_{NNEE}-2T_{NN}+T_{NNWW}+T_{SSEE}-2T_{SS}+T_{SSWW}}{2l^2}\Big) \\
&+ \frac{\kappa}{15}\Big(28\,\frac{T_{NN}-2T_0+T_{SS}}{h^2} - 20\,\frac{T_N-2T_0+T_S}{\Delta y^2}
 - 10\,\frac{T_{NEE}-2T_{EE}+T_{SEE}+T_{NWW}-2T_{WW}+T_{SWW}}{2\Delta y^2} \\
&\qquad - 6\,\frac{T_{NNE}-2T_E+T_{SSE}+T_{NNW}-2T_W+T_{SSW}}{2h^2}
 + 23\,\frac{T_{NNEE}-2T_{EE}+T_{SSEE}+T_{NNWW}-2T_{WW}+T_{SSWW}}{2h^2}\Big),
\end{aligned}
\tag{2.3-48}
\]
which deserves several
Remarks:
(1) The lumped mass for node 0, J cpo, is negative (—lh/3); and, we have divided the
GFEM equations by this negative 'mass' in order to obtain the FDM form above.
(2) The 'small' change from the nine-node element has caused a large change in the
coefficients—with some rather 'strange' weightings.
(3) Even though the coefficients of the $\dot{T}$ terms sum to unity, lumping via 'row sum'
does not work, an important point that we will soon return to.
b. An interior 2-patch
Again we can use the nine-node, 2-patch sketch presented earlier (Figure 2.3-6) and
simply omit nodes E and W. Again, only the FDM form of the equation is shown, and is
— [647Yj + 20(tNE + fNW + Tse + tsw) + ^(Tee + tww) - \2(tN + ts)
— &(Tnee + Tnww + Tsww + Tsee)\
u f TNE — TNW + Tse — Tsw \^EE ~ Tww
15 V 2/ 2/
„ TVgg ~ ^JVffff + TsEE — Ts\VW \
~ 21 J
v_ (Tn — Ts Tne — Tse + TNw — Tsw \
3 V ^ 2h J
k (' Tee — 2To + Tww TNww — 2TN + TNee + Tsww — 2Ts + TSee \
= To [s ? + 1 tf )
« f Tn — 2T0 + Ts TNWW — 2TWW + Tsww + TNee — 2Tee + Tsee \
3 V A? + 2A? J '
(2.3-49)
which displays even more anisotropy than does the analogous nine-node result in (2.3-46).
Remarks:
(1) At least now the 'close-in' nodes get the largest positive weighting.
(2) Here $\int \varphi_0 = 2lh/3$ was used to convert the GFEM form to the displayed FDM form.
(3) Again, even though the sum of the $\dot{T}$ coefficients is unity, this form of mass lumping is
to be avoided.
c. Another interior 2-patch
Because the eight-node element is 'different,' it is of some interest to examine the other
type of midside node, node 0 in Figure 2.3-8. And the result is
[64ro + 20(tNE + tNW + tsw + tSE) + 16(7^ + tss)
120
— \2(TE + Tw) — 8(TNne + TNNW + Tsse + ^ssw)]
u fTE — Ty/ TNE — TNw + Tse — Tsw \
+ 3^ / + 21 ;
v_ (,ftTne — Tse + TNW — Tsw \^nn ~ ^ss
- 7
15 V 2h 2h
Tnne — Tsse + TNNw — Tssw \
2h
— - ('? ^E ~ ^° ~*~ ^w _i_ Tnne — 2TNN + TNNW + Tsse — 2Tss + Tssw \
~ 3 \l Ax2 + 2A? )
, *" (Q TNN ~ 2Tq + TSS
+ Tol8 1?
ry Tnnw — 27V + Tssw + TNNE — 2TE + Tsse \ _ „ n
+ Z 2h2 I, (2.J-5U)
[Fig. 2.3-8 A 2-patch of serendipity elements. Node labels, by row from top to bottom:
NNW, NN, NNE; NW, NE; W, 0, E; SW, SE; SSW, SS, SSE.]
an equation that is actually easily derived from (2.3-49) via 'symmetry,' but one that we
present because it leads to a different 1D equation, which we show below.
Before continuing our promised mass lumping discussion, it is of some interest to see
how the serendipity element responds to a 1D problem; thus, suppressing all y-variation
in the above equations leads to the following three 'x-direction' equations:
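$$\frac{1}{30}\left[14(\dot{T}_{i+1} + \dot{T}_{i-1}) + 3(\dot{T}_{i+2} + \dot{T}_{i-2}) - 4\dot{T}_i\right]
+ \frac{u}{3}\left(5\,\frac{T_{i+2} - T_{i-2}}{2l} - 2\,\frac{T_{i+1} - T_{i-1}}{l}\right)
= \kappa\left(3\,\frac{T_{i-2} - 2T_i + T_{i+2}}{l^2} - 2\,\frac{T_{i-1} - 2T_i + T_{i+1}}{\Delta x^2}\right) \tag{2.3-51}$$
from the 4-patch equation (here $\Delta x = l/2$ is the node spacing);
$$\frac{1}{3}(\dot{T}_{i-1} + \dot{T}_i + \dot{T}_{i+1})
+ \frac{u}{3}\left(2\,\frac{T_{i+1} - T_{i-1}}{l} + \frac{T_{i+2} - T_{i-2}}{2l}\right)
= \kappa\,\frac{T_{i-2} - 2T_i + T_{i+2}}{l^2} \tag{2.3-52}$$
from the first 2-patch equation; and
$$\frac{1}{10}(\dot{T}_{i-1} + 8\dot{T}_i + \dot{T}_{i+1}) + u\,\frac{T_{i+1} - T_{i-1}}{l}
= \kappa\,\frac{T_{i+1} - 2T_i + T_{i-1}}{\Delta x^2} \tag{2.3-53}$$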
from the second 2-patch equation, wherein it is clear that we have switched from compass
point to index notation. Whereas (2.3-53) does recover the analogous equation from 1D
quadratic elements, i.e., (2.3-16), the other two are new. But they are valid/legitimate;
i.e., the serendipity element can indeed solve 1D problems even though its basis functions
are not tensor products of 1D basis functions.
To conclude this discussion (on a really sour note), we now discuss mass lumping. The
first—and rather obvious—lumping is simply 'row sum' (see too Section 2.5.1), which
is obtained from the above consistent mass equations, (2.3-48) through (2.3-50), simply
by setting all $\dot{T}$'s to $\dot{T}_0$. The equations that result are the same as those derived via the
so-called optimal lumping scheme of Malkus et al. (1988); see too the original reference,
Fried and Malkus (1975), which was derived using nodal quadrature rather than exact
integration. Regardless of the fact that row sum lumping is the same as optimal nodal
quadrature, and regardless of the fact that the resulting ODE's 'look' okay, they are in fact
very far from 'okay.' In what might be called one of the 'finite element surprises,' it turns
out that such a lumped mass approximation turns what were perfectly legitimate GFEM
ODE's into totally inappropriate non-GFEM ODE's that are sometimes even unstable
($e^{\lambda t}$ behavior, where $\mathrm{Re}(\lambda) > 0$). In our numerical experiments, summarized in Gresho
et al. (1976, 1978), we found that these lumped mass ODE's were unstable for pure
diffusion, stable but 'meaningless' (i.e., not representing the PDE) for pure advection,
and somewhere in between for finite Peclet number. The case of pure diffusion was
later substantiated in Malkus and Plesha (1986) and further discussed in Malkus et al.
(1988).
To add even more confusion to the mass lumping arena wherein PDE consistency can
be lost, we must discuss the innovative (but still ad hoc, as are all) mass lumping scheme
of Hinton et al. (1976), beginning with the admonition that they were mainly interested
in structural dynamics—they did not advocate nor test their scheme on the equations of
fluid mechanics. (They in fact cleverly side-stepped this issue by restricting their claims
to problems for which variational principles exist.) Their scheme is simple to state and to
implement, and it has apparently served well in some areas—but not advection-diffusion
via 'serendipity' (!): compute the diagonal entries of the element consistent-mass matrix,
sum them, and multiply each by the (same) unique scalar that preserves total element
mass. Since both diagonal entries and total mass are always positive, this scheme never
generates negative lumped masses, which, indeed, was one of the authors' objectives.
Referring then to the eight-node mass matrix in Appendix 1, this mass lumping algorithm
puts 3/76 of the mass at each corner node and 16/76 at each midside node. What does
this do to the three nodal equations in (2.3-48) through (2.3-50)? To answer, we must first
back up from the FDM form presented to the original GFEM form, then lump the mass,
and then divide by $\int \varphi_0$, so that all terms except the time derivatives are as presented
above, and it is no longer GFEM. The results are:
1. $-\frac{9}{19}\dot{T}_0$ for a corner node (2.3-48), and
2. $\frac{12}{19}\dot{T}_0$ for a midside node (2.3-49) and (2.3-50), which merit the following
Remarks:
(1) Recall that row sum lumping of the FDM versions of the equations gives $\dot{T}_0$ for
each, a result that 'looks' consistent (but is not).
(2) The minus sign in front of the corner node equation is caused by our 'undoing' of
the weighting by division by $\int \varphi_0 = -lh/3$.
(3) Neither coefficient looks consistent.
(4) These same coefficients apply in the 1D cases, (2.3-51) through (2.3-53), the first
giving $-\frac{9}{19}\dot{T}_0$ and the other two $\frac{12}{19}\dot{T}_0$.
(5) We also tested this lumping scheme in Gresho et al. (1976, 1978) with the following
results: (i) it was always stable, (ii) it was—like row sum—completely inappropriate
for the hyperbolic limit of pure advection, and (iii) it was 'reasonable' (but much
less accurate than consistent mass) for pure diffusion. We did no mesh refinement
experiments, but believe that the resulting 'pure advection' ODE's do not model
$\partial T/\partial t + \mathbf{u} \cdot \nabla T = 0$ even when $h \to 0$.
While further analysis may be interesting, we believe it to be unwarranted; suffice it to
say that neither lumping scheme should be used in practice—especially the more logical-
looking row sum lumping, since it will often generate unstable ODE's. If you wish to use
the serendipity element for the scalar transport equation (or even for the NS equations),
then you should use only the consistent mass (GFEM) formulation (for which it behaves
very well) or design a new lumping scheme that works. (We have here a perfect example
of the use of the adjective 'consistent'!)
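The two lumping schemes discussed above are easy to check numerically. The following is a minimal sketch, assuming the standard eight-node serendipity basis functions on the reference square and a rectangular l-by-h element; the function names are illustrative and are not taken from the text. It builds the consistent mass matrix by Gauss quadrature, then forms the row-sum masses and the masses obtained by scaling the diagonal to preserve total element mass.

```python
import math

# Reference coordinates (xi, eta) of the eight serendipity nodes:
# the four corners first, then the four midside nodes.
NODES = [(-1, -1), (1, -1), (1, 1), (-1, 1),
         (0, -1), (1, 0), (0, 1), (-1, 0)]


def shape(k, xi, eta):
    """Standard eight-node serendipity basis function k at (xi, eta)."""
    a, b = NODES[k]
    if a != 0 and b != 0:  # corner node
        return 0.25 * (1 + a * xi) * (1 + b * eta) * (a * xi + b * eta - 1)
    if a == 0:  # midside node on an edge eta = +/-1
        return 0.5 * (1 - xi * xi) * (1 + b * eta)
    return 0.5 * (1 + a * xi) * (1 - eta * eta)  # midside node on xi = +/-1


def consistent_mass(l=1.0, h=1.0):
    """Consistent mass matrix of one l-by-h element.

    A 3x3 Gauss rule integrates the basis-function products exactly
    on a rectangle (degree 4 at most in each variable).
    """
    g = math.sqrt(0.6)
    rule = [(-g, 5.0 / 9.0), (0.0, 8.0 / 9.0), (g, 5.0 / 9.0)]
    jac = l * h / 4.0
    m = [[0.0] * 8 for _ in range(8)]
    for xi, wx in rule:
        for eta, wy in rule:
            n = [shape(k, xi, eta) for k in range(8)]
            for i in range(8):
                for j in range(8):
                    m[i][j] += wx * wy * jac * n[i] * n[j]
    return m


def row_sum_lump(m):
    """Row-sum lumping: each nodal mass is the sum of its matrix row."""
    return [sum(row) for row in m]


def diagonal_scaling_lump(m):
    """Scale the diagonal entries so that total element mass is preserved."""
    total = sum(sum(row) for row in m)
    diag = [m[i][i] for i in range(len(m))]
    scale = total / sum(diag)
    return [d * scale for d in diag]


M = consistent_mass()            # unit element, l = h = 1
rs = row_sum_lump(M)             # rs[0]: a corner node, rs[4]: a midside node
ds = diagonal_scaling_lump(M)

print("row sum, corner node of a 4-patch :", 4 * rs[0])
print("row sum, midside node of a 2-patch:", 2 * rs[4])
print("diagonal scaling, corner fraction :", ds[0])
print("diagonal scaling, midside fraction:", ds[4])
print("ratio to row sum (corner, midside):", ds[0] / rs[0], ds[4] / rs[4])
```

On a unit element the assembled row sums come out as -1/3 and 2/3 (that is, $-lh/3$ and $2lh/3$), and the diagonal-scaling fractions as 3/76 and 16/76. The last line prints the factor that multiplies the nodal time derivative once the diagonally scaled mass is divided by $\int \varphi_0$.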
2.4 OPEN BOUNDARY CONDITIONS (OBC'S)
We now specifically address, for but one of several times in this text, the important
issue of outflow (or, more generally, open) boundary conditions (OBC's)—a special case
(usually) of the NBC's associated with the weak form.
In many simulations of interest in fluid mechanics, the fluid—and the 'load' that it
carries/advects, here the scalar T—flows through (i.e., both into and out of) the
computational domain, a situation necessitated by the fact that the true (physical) domain of
interest is (much) too large to even be considered in the numerical simulation. For an
engineering example, consider a physical laboratory in which the experiment of interest is
flow past an obstacle—a cylinder in a channel, or an airplane in a wind tunnel—and the
flow is forced via a pump or fan/compressor; to attempt to model the entire closed loop
would be expensive. (Modeling an open wind tunnel would be out of the question.) For a
geophysical application, consider the problem of trying to predict the air pollution from
a (dirty) factory that is located (to make the problem more interesting) in mountainous
terrain; to attempt to model the entire atmosphere of the earth would be expensive.
So we must consider inflow/outflow situations in which our computational domain is
truncated and some BC's necessarily applied at these artificial/synthetic 'boundaries'; i.e.,
the PDE does not know that we are truncating the universe—all it knows is that BC's on
T are required in order to 'solve for T.' OBC's synthesize the connection of our restricted
computational domain to the rest of the 'universe.' The general goal is to apply BC's at
inflow (n • u < 0) and—especially—at outflow (n • u > 0) that are both mathematically
legitimate and computationally useful. But what does 'useful' mean? While necessarily
vague, it is basically this: useful BC's are those that lead to good results in the 'smallest'
truncated domain. But what does 'good' mean? What does 'smallest' mean? Good results
are those that cause the solution in the 'subdomain of principal interest' to change little
when the computational domain is made larger and that would agree well with those
from the true (physical) domain. The smallest truncated domain is often (but not always)
the largest domain that one can afford to model. Naturally, all of these issues are rather
qualitative in nature—a necessary consequence of domain truncation. But it is a very real
fact of life that many CFD simulations must deal with the open-boundary situation, and
whereas many BC's are supplied by nature, OBC's are not. Finally, we remark that good
OBC's are cost-effective, and bad OBC's can even destroy the entire simulation.
2.4.1 One Dimension
So now let us assume that our 1D problem, which indeed displays inflow (x = 0) and
outflow (x = L) as posed in (2.3-1) for u > 0 and constant, is one in which the 'desired'
simulation is to produce good results on some subdomain, $L_s < L$. We have already
imposed a Dirichlet BC at the inflow point, which presumes that we have been supplied
with some 'outside information' there. But this is not actually necessary (nor desirable,
sometimes), in that a Robin BC could also (and sometimes judiciously; see, for example,
Novy et al., 1990) be applied at x = 0, this time either in the form
$-\kappa(\partial T/\partial x) + H(T - \hat{T}) = q_0$ or
$-\kappa(\partial T/\partial x) + uT + H(T - \hat{T}) = q_0$, the minus sign accounting for the fact
that the unit normal vector in the x-direction is $-1$ at the inlet.
But here we retain the Dirichlet BC at x = 0 and shift our attention to the outlet,
x = L. The NBC of (2.3-3) can, in fact, be usefully applied there as an OBC, usually in
the following way: take H = q = 0; i.e., the useful OBC is simply $\partial T/\partial x = 0$ at x = L.
Recalling the semi-discrete equation for i = N [(2.3-12)] gives, for this case,
$$\kappa\,\frac{T_N - T_{N-1}}{l_N} + \frac{u(T_N - T_{N-1})}{2} + \frac{1}{3}\,l_N\dot{T}_N = -\frac{1}{6}\,l_{N-1}\dot{T}_{N-1}, \tag{2.4-1}$$
an ODE for node $T_N$, with (ignoring the coupling) time constant $\tau \approx (3\kappa/l_N^2 + 3u/2l_N)^{-1}$,
that is also the NBC approximation to $\partial T/\partial x = 0$. Note that the time constant, which
is $l_N^2/3\kappa$ for pure diffusion and $2l_N/3u$ for pure advection, tends to zero with mesh
refinement; the ODE solution appears to then respond so rapidly that it is in
quasi-equilibrium; i.e., $l_N\dot{T}_N \to 0$ with $\dot{T}_N$ finite, which presumably could have implications
regarding the stability of the chosen ODE method. (But in fact it does not, fortunately; see
Section 2.7.2c.) For $l_N \to 0$, the ODE is (automatically) 'sacrificed' in favor of the OBC.
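The mesh dependence of this time constant can be tabulated directly. The snippet below is a small sketch, assuming the time constant quoted above, $\tau \approx (3\kappa/l_N^2 + 3u/2l_N)^{-1}$; the function name and the sample parameter values are illustrative only.

```python
def outflow_time_constant(kappa, u, l_n):
    """Time constant of the node-N ODE, ignoring the coupling to node N-1."""
    return 1.0 / (3.0 * kappa / l_n**2 + 3.0 * u / (2.0 * l_n))


for l_n in (0.1, 0.05, 0.025):
    tau_diffusion = outflow_time_constant(kappa=1.0, u=0.0, l_n=l_n)
    tau_advection = outflow_time_constant(kappa=0.0, u=1.0, l_n=l_n)
    tau_mixed = outflow_time_constant(kappa=1.0, u=1.0, l_n=l_n)
    print(l_n, tau_diffusion, tau_advection, tau_mixed)
```

Halving the last element length divides the pure-diffusion value by four and the pure-advection value by two, which is the sense in which the ODE is 'sacrificed' in favor of the OBC under refinement.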
We shall demonstrate later (Section 2.6.2d) that this BC permits a passive exit (of the
advective flux) from the domain even in the advection-dominated ($Pe \gg 1$) case. We will
also show that a Dirichlet BC at the outflow is (usually) not at all passive, especially for
advection-dominated flow: 'From the study of singular perturbation problems, it is well
known that boundary layers are weaker if boundary conditions on the derivatives are used
(soft BC's) rather than boundary conditions on a function itself (hard BC's)' (Naughton,
1986).
A specific and simple example might be useful to help investigate the concept of
quasi-equilibrium of the ODE for node N, and the 'dual' assignment of this ODE; i.e.,
it must simultaneously (i) satisfy closely the BC of the PDE and (ii) ensure that $T_N(t)$
is an accurate approximation to the PDE solution at the boundary. Consider the problem
$T_t = \kappa T_{xx}$ for x > 0, $-\kappa T_x = q$ at x = 0, T = 0 at t = 0; i.e., the transient heat equation
in a semi-infinite domain with an applied flux BC. The exact solution is
$$T = \frac{q}{\kappa}\left[\sqrt{4\kappa t/\pi}\;e^{-x^2/4\kappa t} - x\left(1 - \operatorname{erf}\frac{x}{\sqrt{4\kappa t}}\right)\right], \tag{2.4-2}$$
which for 'small' t (namely, $4\kappa t \ll L^2$), approximates well the solution on the finite span,
$0 \le x \le L$, which we shall use for a numerical solution (with BC T = 0 at x = L = 1).
The ODE for node 1 (at x = 0; node N $\to$ node 1), corresponding to the OBC, is
[linear elements, lumped mass (for simplicity), which is also 'FDM + image point,' see
(2.4-26) through (2.4-28) below]
$$\tfrac{1}{2}\,l\,\dot{T}_1 + \kappa(T_1 - T_2)/l = q, \tag{2.4-3}$$
and our objective is to focus on the temporal behavior of node 1 with an applied flux
BC. Viewed 'in isolation,' it responds with a time constant $\tau = l^2/2\kappa$, which goes to
zero with mesh refinement, suggesting, as stated above, quasi-equilibrium behavior; i.e.,
for l 'sufficiently small' (and t 'sufficiently large,' both of which we shall soon define),
node 1 should respond so fast that $\dot{T}_1(t)$ in (2.4-3) will agree very closely with that
of the exact solution, $\partial T/\partial t|_0 = q/\sqrt{\pi\kappa t}$, from (2.4-2). We then presume that the
quasi-equilibrium ODE will then very nearly satisfy
$\kappa(T_1 - T_2)/l = q - (l/2)\,\partial T/\partial t|_0 = q[1 - l/(2\sqrt{\pi\kappa t})] \approx q$;
it will of course always satisfy $\kappa(T_1 - T_2)/l = q - (l/2)\dot{T}_1$. But since
the exact acceleration is unbounded at t = 0, quasi-equilibrium behavior is clearly not
true for all t. This is because of the discontinuity in heat flux at t = 0; the closer you go
toward t = 0, the finer the needed mesh if the approximate solution is to be accurate: a
given mesh will not be able to capture $\dot{T}_1 = O(1/\sqrt{t})$ for t 'too small.' We can quantify
this approximately as follows: supposing that l is small enough that the mesh nearly
captures the correct solution at x = 0, we can say
$$\dot{T}(0, t) = q/\sqrt{\pi\kappa t} \approx T_1(t)/\tau \approx T(0, t)/\tau = (q/\tau)\sqrt{4t/\pi\kappa}, \tag{2.4-4}$$
with $\tau = l^2/2\kappa$, to give the 'limiting case' result,
$$l = \sqrt{4\kappa t}. \tag{2.4-5}$$
Thus, for a fixed mesh, our approximate solution will be poor for $t < O(l^2/4\kappa)$;
conversely, if we wish to capture the behavior at small t, then the first element length
should be less than $\sqrt{4\kappa t}$. Quasi-equilibrium (and concomitant good approximation of the
constant flux BC) in this case will be observed only for $t \gg l^2/4\kappa$; for $t \le O(l^2/4\kappa)$, $T_1(t)$
will not be in quasi-equilibrium, and neither the PDE near x = 0 nor the BC will be
accurately approximated. Quasi-equilibrium and accurate results near x = 0 go hand-in-hand,
and can only occur for $t \gg O(l^2/\kappa)$.
Figure 2.4-1 shows (thanks to A.C. Hindmarsh and his VODE code; Brown et al.,
1989) the exact and approximate (N = 20) solutions at x = 0, and Figure 2.4-2 shows
the 'flux error,' $q - \kappa(T_1 - T_2)/l$, which is also $l\dot{T}_1/2$, for a variable mesh [$x_i = (i/N)^{1.2}$]
solution of the appropriate ODE's on the unit span with T = 0 at x = 1 for 10, 20, 40, and
80 grid points, with $q = \kappa = 1$. [The $\infty$ span solution is 'valid' for $t \ll L^2/\kappa = 1$. Also,
the quantity called 'flux error' is not exactly consistent with our discussion of 'consistent
flux' in Chapter 4. C'est la vie.] Also shown is the 'scaled' exact acceleration,
$l\dot{T}(0, t)/2 = lq/2\sqrt{\pi\kappa t}$, where $l = (1/N)^{1.2}$ = 0.063, 0.027, 0.012, and 0.0052 for N = 10, 20, 40, and
80, respectively. These give the following values for $\tau$: 0.0020, 0.00038, 0.000071, and
0.000014, respectively. Figure 2.4-3 shows the actual acceleration error, $\dot{T}_1(t) - q/\sqrt{\pi\kappa t}$,
for the same four cases. (N = 80 is really there; it is just 'too accurate' to see, and the
initial error in all cases is $-\infty$.) It is seen that for $t > \sim 10\tau$ quasi-equilibrium and
good accuracy ('error' in flux $\le \sim 0.13$) obtain. (The horizontal line shown corresponds
to $\sim 5.3\tau$ and an 'error' of $\le \sim 0.18$ in Figure 2.4-2; e.g., for N = 20, the flux error
is below 0.18 for $t > \sim 0.002$.) This time ($10\tau$) is closely related to the 'minimum time of
believability' discussed in Gresho and Lee (1981) and would be even more obvious had
we used consistent mass in the above example because $T_2(t)$ would then start off in the
wrong direction, and would have recovered and become accurate by $10\tau$.
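The element lengths, time constants, and flux-error levels quoted above can be reproduced with a few lines. This is a sketch only: the grading exponent 1.2 is an assumption inferred from the quoted first-element lengths, $q = \kappa = 1$ as in the text, and the function names are illustrative.

```python
import math


def first_element_length(n, exponent=1.2):
    """Length of the element touching x = 0 for nodes x_i = (i/n)**exponent."""
    return (1.0 / n) ** exponent


def time_constant(l, kappa=1.0):
    """tau = l**2 / (2 kappa) for the lumped-mass flux node."""
    return l * l / (2.0 * kappa)


def scaled_exact_acceleration(l, t, q=1.0, kappa=1.0):
    """l * dT/dt(0, t) / 2, using the semi-infinite exact solution."""
    return l * q / (2.0 * math.sqrt(math.pi * kappa * t))


for n in (10, 20, 40, 80):
    l = first_element_length(n)
    tau = time_constant(l)
    print(n, l, tau,
          scaled_exact_acceleration(l, 5.3 * tau),
          scaled_exact_acceleration(l, 10.0 * tau))
```

Measured in units of $\tau$, the scaled exact acceleration does not depend on l: at $t = c\tau$ it equals $1/(2\sqrt{\pi c/2})$, which is about 0.17 for c = 5.3 and about 0.13 for c = 10.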
Remark:
Another approximation that can be applied for this problem, from the FDM point of
view: replace the ODE for node 1 by the PDE BC approximation, $q = \kappa(T_1 - T_2)/l$. We
did the experiment and obtained much less accurate answers. The ODE NBC is the best
way to go.
[Fig. 2.4-1 Exact and approximate solutions at x = 0. Curves: $T(0, t) = \sqrt{4t/\pi}$ and $T_1(t)$
for N = 20 and $x_i = (i/N)^{1.2}$; horizontal axis: t.]
[Fig. 2.4-2 Scaled acceleration: numerical (solid curves) and exact. Horizontal axis: t.]
[Fig. 2.4-3 Acceleration error at x = 0. Horizontal axis: t.]
We conclude the analysis by actually determining the exact solution to the ODE's via
an eigenvector expansion, hopefully for further elucidation. Also, the solution is both
simple and interesting, so we present it. To do so, we switch to a uniform mesh and utilize
the (uniform mesh) eigenproblem results from Hindmarsh et al. (1984), accounting for
the reversed BC's there (T = 0 at x = 0, $T_x = 0$ at x = L):
$$K y = -\mu M_L y, \tag{2.4-6}$$
where
$$K = \frac{\kappa}{l}\begin{bmatrix} -2 & 1 & & & \\ 1 & -2 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & -2 & 1 \\ & & & 1 & -1 \end{bmatrix} \tag{2.4-7}$$
is the N x N diffusion matrix and
$$M_L = l\,\operatorname{diag}(1, 1, \ldots, 1, 1/2) \tag{2.4-8}$$
is the lumped mass matrix. The corresponding ODE system is
$$M_L \dot{T} = K T + b, \quad T(0) = 0, \tag{2.4-9}$$
where $b^T = (0, \ldots, 0, q)$ is the driving force. The eigensolution is
$$\mu_n = 2\kappa N^2\left[1 - \cos\frac{(2n-1)\pi}{2N}\right]\Big/L^2 \tag{2.4-10}$$
and
$$y_j^{(n)} = \sin\frac{(2n-1)j\pi}{2N} \quad \text{for } n, j = 1, 2, \ldots, N, \tag{2.4-11}$$
where l = 1/N, and j = N corresponds to the ODE + BC; namely, from (2.4-9),
$$\frac{l}{2}\,\dot{T}_N = \frac{\kappa}{l}(T_{N-1} - T_N) + q, \tag{2.4-12}$$
which corresponds to the $T_1$ equation in the earlier discussion. The solution for T(t) in
terms of the eigenvectors, $\{y^{(n)}\}$, goes like:
1. Expand the solution in terms of the eigenvectors (which form a basis):
$$T(t) = \sum_{n=1}^{N} a_n(t)\,y^{(n)}. \tag{2.4-13}$$
2. Expand the RHS vector in a 'related' basis:
$$b = \sum_{n=1}^{N} b_n M_L y^{(n)}, \tag{2.4-14}$$
where the expansion coefficients are obtained using the following orthogonality relation
$$(y^{(m)})^T M_L\,y^{(n)} = \frac{L}{2}\,\delta_{mn}, \tag{2.4-15}$$
to obtain $b_n = 2b^T y^{(n)}/L = (2q/L)\sin\frac{(2n-1)\pi}{2} = (2q/L)(-1)^{n+1}$.
3. Insert both of the above expansions into (2.4-9) and utilize (2.4-6) to obtain
$$\sum_{n=1}^{N}(\dot{a}_n + \mu_n a_n - b_n)\,M_L y^{(n)} = 0, \tag{2.4-16}$$
which, since the vectors $\{M_L y^{(n)}\}$ are linearly independent, is an expansion of the zero
function so that we have
$$\dot{a}_n + \mu_n a_n = b_n, \quad a_n(0) = 0, \tag{2.4-17}$$
with solution
$$a_n(t) = b_n(1 - e^{-\mu_n t})/\mu_n. \tag{2.4-18}$$
Hence, the complete solution at node j is
$$T_j(t) = \frac{2q}{L}\sum_{n=1}^{N}(-1)^{n+1}(1 - e^{-\mu_n t})\sin\frac{(2n-1)j\pi}{2N}\Big/\mu_n, \tag{2.4-19}$$
and we zoom in on the 'BC node,' j = N, to obtain
$$T_N(t) = \frac{2q}{L}\sum_{n=1}^{N}(1 - e^{-\mu_n t})/\mu_n, \tag{2.4-20}$$
and its rate of change (much more relevant in this case since T itself starts out with zero
error) is
$$\dot{T}_N(t) = \frac{2q}{L}\sum_{n=1}^{N} e^{-\mu_n t}. \tag{2.4-21}$$
In Table 2.4-1, we show, for N = 20 ($\tau = l^2/2\kappa = 0.00125$), some eigenvector results at
t = 0.002 ($1.6\tau$) and 0.01 ($8\tau$), at which times $T_N(0.002) = 0.046$ vs $T(0, 0.002) = 0.050$
(8% error), $\dot{T}_N(0.002) = 14.1$ vs $\dot{T}(0, 0.002) = 12.6$ (12% error), and $T_N(0.01) = 0.111$
vs $T(0, 0.01) = 0.113$ (1.8% error), and finally $\dot{T}_N(0.01) = 5.74$ vs $\dot{T}(0, 0.01) = 5.64$
(1.7% error), showing, again, good accuracy for $t > \sim 10\tau$. We first note the lack of
'mode separation' at t = 0.002 (all modes are significant) for $T_N$, whereas at t = 0.01
there is a hint of mode separation. This actually is not surprising when it is realized that
initially all modes contribute equally, that from each mode to $T_N(t)$ being $2qt/L$. We
seem to have chosen a bad example for arguing 'quasi-equilibrium' for node N! Surely
$\dot{T}_N(t)$ is a better indicator than $T_N(t)$, since both exact and approximate values of T start
at zero. By t = 0.01, though, there is significant mode separation, especially in $\dot{T}_N$: only
the first five or so are really significant.
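The quoted boundary-node values can be recomputed from the modal sums. The sketch below assumes the eigenvalues $\mu_n = 2\kappa N^2[1 - \cos((2n-1)\pi/2N)]/L^2$ and the sums $T_N = (2q/L)\sum(1 - e^{-\mu_n t})/\mu_n$ and $\dot{T}_N = (2q/L)\sum e^{-\mu_n t}$, as (2.4-10), (2.4-20), and (2.4-21) are read here; the function names are illustrative.

```python
import math


def eigenvalues(n_nodes, kappa=1.0, length=1.0):
    """mu_n for n = 1..n_nodes on a uniform mesh with lumped mass."""
    return [2.0 * kappa * n_nodes**2
            * (1.0 - math.cos((2 * n - 1) * math.pi / (2 * n_nodes))) / length**2
            for n in range(1, n_nodes + 1)]


def boundary_node(t, n_nodes=20, q=1.0, kappa=1.0, length=1.0):
    """Return (T_N, dT_N/dt) at time t from the modal sums."""
    mu = eigenvalues(n_nodes, kappa, length)
    value = (2.0 * q / length) * sum((1.0 - math.exp(-m * t)) / m for m in mu)
    rate = (2.0 * q / length) * sum(math.exp(-m * t) for m in mu)
    return value, rate


def exact_boundary(t, q=1.0, kappa=1.0):
    """Return (T, dT/dt) at the flux boundary of the semi-infinite problem."""
    value = q * math.sqrt(4.0 * t / (math.pi * kappa))
    rate = q / math.sqrt(math.pi * kappa * t)
    return value, rate


for t in (0.002, 0.01):
    print(t, boundary_node(t), exact_boundary(t))
```

Printing the individual terms of each sum instead of the totals gives the per-mode contributions, which is a convenient way to look at the 'mode separation' discussed above.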
Table 2.4-1 Eigenvalues and eigenvector contributions to $T_N$ and $\dot{T}_N$ for N = 20 at t = 0.002
and 0.01. Columns: (a) $10^3(1 - e^{-0.002\mu_n})/\mu_n$, (b) $e^{-0.002\mu_n}$,
(c) $10^3(1 - e^{-0.01\mu_n})/\mu_n$, (d) $e^{-0.01\mu_n}$.

 n    mu_n      (a)       (b)        (c)       (d)
 1    2.46613   1.99508   0.995080   9.87770   0.975640
 2    22.1041   1.95644   0.956755   8.97192   0.801684
 3    60.8964   1.88301   0.885172   7.48954   0.543425
 4    117.888   1.78171   0.789958   5.87318   0.307623
 5    191.675   1.66128   0.681574   4.44980   0.147084
 6    280.442   1.53078   0.570705   3.34992   0.060542
 7    382.001   1.39843   0.465798   2.56039   0.021928
 8    493.853   1.27076   0.372430   2.01038   0.007165
 9    613.244   1.15236   0.293321   1.62713   0.002171
10    737.233   1.04594   0.228901   1.35557   0.000628
11    862.767   0.95266   0.178078   1.15885   0.000179
12    986.756   0.87259   0.138968   1.01337   0.000052
13    1106.15   0.80509   0.109449   0.90402   0.000016
14    1218.00   0.74917   0.087510   0.82102   0.000005
15    1319.56   0.70370   0.071423   0.75783   0.000002
16    1408.32   0.66760   0.059806   0.71006   0.000001
17    1482.11   0.63990   0.051601   0.67471   0
18    1539.10   0.61981   0.046042   0.64973   0
19    1577.90   0.60675   0.042605   0.63376   0
20    1597.53   0.60037   0.040964   0.62596   0

We conclude our simple(?) example with the following
Remarks:
(1) Pure diffusion is not a particularly good example of an OBC.
(2) We are open to suggestion for a better model problem that includes advection.
Turning now to another OBC issue, we conjecture: Suppose, however, that one
employed the weak form (2.1-26) rather than (2.1-23), which incorporates the total flux
as an NBC. Our purpose here is to indicate the possibility of serious danger in this
case with respect to OBC's, especially (again) for the advection-dominated case. The
1D version corresponds to the following BC at x = L: $\kappa(\partial T/\partial x) - uT + H(T - \hat{T}) = q$,
which leads to the following semi-discrete equation for i = N, after replacing the term
$\int \varphi_i u(\partial T^h/\partial x)$ by $-\int uT^h(\partial \varphi_i/\partial x)$, à la (2.1-26):
$$\frac{1}{6}\left[l_{N-1}\dot{T}_{N-1} + 2l_N\dot{T}_N\right] - \frac{u}{2}(T_{N-1} + T_N) + \frac{\kappa}{l_N}(T_N - T_{N-1}) + HT_N = \frac{1}{2}\,l_N S + q + H\hat{T}_N,$$
which is the NBC of the 'total-flux' BC given above. Consider now the use of these two
forms of the OBC (at i = N) for the advection-dominated case, the dimensionless form
of which is (2.1-10) for the former, and
$$\frac{\partial T}{\partial x} - Pe\,T + Bi(T - \hat{T}) = q \tag{2.4-22}$$
for the latter (total flux), which is the dimensionless form of (2.1-25), where
$Pe = uL/\kappa$, $Bi = HL/\kappa$, and q is rescaled to $qL/\kappa$. To simplify(?) matters, first take q = 0 and consider
the case $Pe \gg 1$. If Bi = O(1), the total flux BC becomes $\partial T/\partial x = Pe\,T$ at the exit
rather than $\partial T/\partial x = 0$; forward advection is matched/balanced by back-diffusion with (in
general) an extremely large gradient. Whereas the homogeneous Neumann case is
computationally effective regardless of the size of Pe, the homogeneous Robin case can be a
disaster for $Pe \gg 1$, another result we shall demonstrate in due course (Section 2.6.2f).
The reason may be clear already: q = 0 is usually quite inappropriate for the total flux
case, damming as it does the total flux. It is appropriate for the 'advective' form because
it permits the advective flux to leave passively, while damming only the diffusive flux.
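The contrast between the two homogeneous outflow conditions can be seen on the steady 1D problem, whose solution is available in closed form. The following is a sketch under stated assumptions: the steady dimensionless equation $Pe\,T' = T''$ on $0 \le x \le 1$, an inflow value T(0) = 1 chosen purely for illustration, and a reference temperature of zero in the Robin term; the function and parameter names are not from the text.

```python
import math


def steady_exit_state(pe, bi=0.0, t_hat=0.0, q=0.0, total_flux=False):
    """Exit value and exit gradient of the steady 1D solution with T(0) = 1.

    The steady equation Pe*T' = T'' has the general solution
    T = a + b*exp(Pe*(x - 1)).  At x = 1 the outflow condition is
    T' - alpha*Pe*T + Bi*(T - t_hat) = q, with alpha = 1 for the
    total-flux form and alpha = 0 for the 'advective' form.
    """
    alpha = 1.0 if total_flux else 0.0
    e = math.exp(-pe)
    # Two linear equations for (a, b):
    #   a + e*b = 1                                       (inflow, x = 0)
    #   (Bi - alpha*Pe)*a + (Pe - alpha*Pe + Bi)*b = rhs   (outflow, x = 1)
    c1 = bi - alpha * pe
    c2 = pe - alpha * pe + bi
    rhs = q + bi * t_hat
    det = c2 - e * c1
    a = (c2 - e * rhs) / det
    b = (rhs - c1) / det
    return a + b, pe * b  # T(1) and dT/dx at x = 1


for pe in (10.0, 50.0):
    print(pe,
          steady_exit_state(pe, bi=1.0),
          steady_exit_state(pe, bi=1.0, total_flux=True))
```

With Bi = 1 the advective form leaves the exit value near the inflow value, while the total-flux form produces an exit value of order Pe and an exit gradient of order $Pe^2$: the 'extremely large gradient' referred to above.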
Another OBC has been used and advocated in the finite difference literature (e.g.,
Orlanski, 1976; Sani and Gresho, 1994), again for the advection-dominated case wherein
it tries to satisfy the maxim, 'Let the advective flux leave the domain passively.' In the
simple 1D context, it is
$$\frac{\partial T}{\partial t} + V\frac{\partial T}{\partial x} = 0, \tag{2.4-23}$$
where it clearly makes the most sense here to take V = u. Since, however, the
multidimensional case rarely has a fixed and uniform outflow velocity, we will for now keep
V as a 'general' (but > 0) velocity. [In Sani and Gresho (1994) are also cited several other
OBC's: some for FDM, some for FEM, and a few via the spectral element method.]
In the finite element context, this OBC might be implemented simply by replacing the
semi-discrete equation at the outlet (i = N) by (for linears)
$$\frac{1}{3}(l_{N-1}\dot{T}_{N-1} + 2l_N\dot{T}_N) + V(T_N - T_{N-1}) = 0, \tag{2.4-24}$$
where we note that in this same context, it looks more like a Dirichlet (essential) BC
[as indeed does (2.4-23)] than a Robin/Neumann BC; i.e., it is not an NBC. But it
is also clear from the above discussion that, provided V = u, it corresponds
properly to the hyperbolic limit associated with $Pe \to \infty$ after, of course, setting H = 0,
q = 0, and S = 0 in these equations. At this time, we have little or no experience in
using this BC via GFEM; nor do we know others who have. It should probably receive
more attention/testing, especially in the more difficult context of multi-dimensional flow
wherein, for example, setting (in 2D) $V = (1/H)\int_0^H (\mathbf{n}\cdot\mathbf{u})\,dy$, where $\mathbf{n}\cdot\mathbf{u}$ (> 0) is
the normal velocity at the outlet, may be appropriate. If, of course, $\mathbf{n}\cdot\mathbf{u}$ varies greatly
between 0 < y < H, then this BC would be more severely tested.
An alternative, and indeed more appropriate, way to enforce (2.4-23) is via the NBC
route of GFEM, as follows: find $T^h$ from
$$\int_\Omega \varphi_i\frac{\partial T^h}{\partial t} + \int_{\Gamma_0}\frac{\kappa}{V}\,\varphi_i\frac{\partial T^h}{\partial t} + \int_\Omega\left(\varphi_i\,\mathbf{u}\cdot\nabla T^h + \kappa\nabla\varphi_i\cdot\nabla T^h\right) = 0 \quad \forall i, \tag{2.4-25}$$
which is obtained via the usual integration by parts and (2.4-23), and we see that a simple
modification to the mass matrix is all that is needed.
Remarks:
(1) The $l \to$ '0' limit yields, for linears, the lumped version of (2.4-24).
(2) It has not yet been tested (to our knowledge), probably in part because the FEM
community is perfectly happy with the simple 'do-nothing' OBC: $\partial T/\partial n = 0$.
(3) It properly 'vanishes' for the hyperbolic ($\kappa = 0$) case.
(4) It also vanishes at the steady state, leaving $\partial T^h/\partial n = 0$ as the OBC.
(5) It appears to be also interpretable as a Dirichlet BC, applied weakly.
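The 'simple modification to the mass matrix' can be sketched in 1D as follows. This is an illustration only, assuming linear elements on a uniform mesh and assuming that the boundary term in (2.4-25) carries the factor $\kappa/V$, which is what integration by parts combined with (2.4-23) gives; the function name is hypothetical.

```python
def mass_matrix_with_outflow_term(n_elems, l, kappa, v):
    """Consistent mass matrix for linear elements of length l,
    with kappa/V added at the outflow node."""
    n = n_elems + 1
    m = [[0.0] * n for _ in range(n)]
    for e in range(n_elems):
        m[e][e] += l / 3.0
        m[e][e + 1] += l / 6.0
        m[e + 1][e] += l / 6.0
        m[e + 1][e + 1] += l / 3.0
    # In 1D the boundary integral of (kappa/V) * phi_i * dT/dt reduces to a
    # point value at x = L, i.e., a single extra diagonal entry.
    m[n - 1][n - 1] += kappa / v
    return m


M_obc = mass_matrix_with_outflow_term(n_elems=4, l=0.25, kappa=0.1, v=2.0)
print(M_obc[-1])  # last row: only the diagonal entry differs from the usual one
```

Setting kappa to zero returns the unmodified matrix, in line with remark (3).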
The last issue to be raised in this, our first visit to OBC's, is (perhaps somewhat
unfairly, but we do not believe so) to compare the OBC/NBC of the GFEM with a
common OBC that is derived in a finite difference version of (2.3-3). This is the (Taylor
series) second-order BC at x = L that is derived using the image point method, wherein
a fictitious node (N + 1) is temporarily introduced beyond x = L (at a distance $l_N$, to be
precise) and used to formulate a discrete version of BC (2.3-3); namely, the FDM might
proceed as follows for i = N:
$$\dot{T}_N + \frac{u(T_{N+1} - T_{N-1})}{2l_N} = \kappa\,\frac{(T_{N-1} - 2T_N + T_{N+1})}{l_N^2} + S, \tag{2.4-26}$$
where $T_{N+1}$ is found from the Robin BC,
$$\frac{\kappa(T_{N+1} - T_{N-1})}{2l_N} + H(T_N - \hat{T}_N) = q, \tag{2.4-27}$$
and used to eliminate the fictitious node from the semi-discrete equation above, resulting
in the FDM-ODE-plus OBC,
$$\dot{T}_N + H\left(\frac{2}{l_N} - \frac{u}{\kappa}\right)T_N + \frac{2\kappa}{l_N^2}(T_N - T_{N-1}) = S + (q + H\hat{T}_N)\left(\frac{2}{l_N} - \frac{u}{\kappa}\right). \tag{2.4-28}$$
First we note that the equation is consistent: multiplication by $l_N$ and letting $l_N \to 0$
recovers (2.3-3). Next, we write the equation in non-dimensional form ($x \to x/L$,
$t \to ut/L$), à la (2.1-8), and let $\Delta x = l_N/L$:
$$Pe\,\dot{T}_N + Bi\left(\frac{2}{\Delta x} - Pe\right)T_N + 2(T_N - T_{N-1})/\Delta x^2 = Q_1 + (q + Bi\,\hat{T}_N)\left(\frac{2}{\Delta x} - Pe\right), \tag{2.4-29}$$
where $Q_1 = SL^2/\kappa$, and we note two things: (i) $\Delta x \to 0$ recovers (2.4-22), and (ii) the
nature of the equation changes at $Pe = 2/\Delta x$; while $(2/\Delta x - Pe) > 0$ for any fixed Pe as
$\Delta x \to 0$, it is also true that $(2/\Delta x - Pe) < 0$ for $\Delta x$ fixed and Pe sufficiently large, and
the crossover corresponds to a grid Peclet number, $P = u\Delta x/2\kappa$, equal to 1. For
comparison, the dimensionless form of the GFEM analog is
$$\frac{1}{6}(\Delta x_{N-1}\dot{T}_{N-1} + 2\Delta x_N\dot{T}_N) + \frac{1}{2}Pe\,(T_N - T_{N-1}) + (T_N - T_{N-1})/\Delta x_N + Bi\,T_N = \frac{1}{2}\Delta x_N Q_1 + (q + Bi\,\hat{T}_N), \tag{2.4-30}$$
which, (i) for $\Delta x \to 0$ recovers (2.4-22), and (ii) sees no 'transition' at P = 1.
These differences can be important. Consider, for example, the advection-dominated
case, $Pe \gg 1/\Delta x$ for $\Delta x$ fixed [and Bi = O(1), $Q_1$ = O(1)]: the FDM equation becomes,
approximately, $\dot{T}_N = Bi\,Pe\,(T_N - \hat{T}_N)$, and the FEM equation above becomes, again
approximately, $\frac{1}{3}(\Delta x_{N-1}\dot{T}_{N-1} + 2\Delta x_N\dot{T}_N) + Pe\,(T_N - T_{N-1}) = 0$, which approximates
the large-Pe PDE at the outflow, $\partial T/\partial t + Pe(\partial T/\partial x) = 0$. Thus, we see the proper
asymptotic behavior of the GFEM approximation; for $Pe \to \infty$, the AD equation (i.e., the
PDE) approaches the pure hyperbolic equation for which no BC is applicable: the PDE
itself applies at the outflow point. The FDM ODE equation above is (of course) then
unstable in the following sense: it would yield $T_N \sim e^{Bi\,Pe\,t}$. Even if H = 0 (Bi = 0),
the equation would be poorly-behaved; i.e., $\dot{T}_N = 0$ implies that $T_N(t) = T_N(0)$, similar
to the 'hard'/foolish Dirichlet OBC referred to earlier and discussed in detail later
(Section 2.6.2). See also Smith (1980), who shows that this OBC is a 'wiggle-maker'
even at steady state.
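The sign change can be checked numerically by forming the outflow-node rate directly from the image-point construction, with no algebra done by hand. The sketch below works with the dimensional equations (2.4-26) and (2.4-27); the upwind variant differs only in the advection difference. Function names and sample values are illustrative assumptions, and `t_hat` stands for the reference temperature in the Robin BC.

```python
def rate_image_point(t_n, t_nm1, u, kappa, l, h, t_hat, q=0.0, s=0.0):
    """dT_N/dt from the centered FDM equation, with the image node
    eliminated through the discrete Robin BC."""
    t_np1 = t_nm1 + (2.0 * l / kappa) * (q - h * (t_n - t_hat))
    advection = u * (t_np1 - t_nm1) / (2.0 * l)
    diffusion = kappa * (t_nm1 - 2.0 * t_n + t_np1) / l**2
    return -advection + diffusion + s


def rate_upwind(t_n, t_nm1, u, kappa, l, h, t_hat, q=0.0, s=0.0):
    """Same node, but with an upwind difference for the advection term."""
    t_np1 = t_nm1 + (2.0 * l / kappa) * (q - h * (t_n - t_hat))
    advection = u * (t_n - t_nm1) / l
    diffusion = kappa * (t_nm1 - 2.0 * t_n + t_np1) / l**2
    return -advection + diffusion + s


def self_growth_rate(rate, **params):
    """d(dT_N/dt)/dT_N; the ODE is linear, so a unit difference is exact."""
    return rate(1.0, 0.0, **params) - rate(0.0, 0.0, **params)


for kappa in (0.1, 0.001):
    params = dict(u=1.0, kappa=kappa, l=0.1, h=1.0, t_hat=0.0)
    grid_peclet = params["u"] * params["l"] / (2.0 * kappa)
    print(grid_peclet,
          self_growth_rate(rate_image_point, **params),
          self_growth_rate(rate_upwind, **params))
```

For the small grid Peclet number both self-coupling rates are negative. For the large one (with H > 0) the image-point rate becomes large and positive, whereas the upwind rate stays negative.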
So, another nice feature of the NBC/OBC is that it better mimics the 'physics' of the
problem and over all ranges of the parameters/data. It does so by approximating the
time-independent BC (for constant H, $\hat{T}$, etc.) via a stable ODE in such a way that, while $T_N$
varies in time, it does so in such a manner that the NBC is always and stably, although
approximately, satisfied.
In this case, however, the 'physics' could also be well-approximated via the FDM
simply by using an upwind approximation to $u\,\partial T/\partial x$ at the outlet (as indeed is the method
'chosen' by the GFEM); if $u(T_{N+1} - T_{N-1})/2l_N$ is replaced by $u(T_N - T_{N-1})/l_N$, then
the FDM ODE becomes
$$\dot{T}_N + \frac{u(T_N - T_{N-1})}{l_N} + \frac{2\kappa}{l_N^2}(T_N - T_{N-1}) + \frac{2H}{l_N}(T_N - \hat{T}_N) = S + \frac{2}{l_N}\,q, \tag{2.4-31}$$
which is stable for all values of Pe and is, in fact, the (wiggle-free) lumped mass version
of the GFEM equation. This is but one example of a situation in which FDM modelers
could benefit by looking to the FEM for guidance in selecting BC's.
A final remark: if an FDM was a priori generated to solve the purely hyperbolic
equation, Tt + uTx = 0, surely no silly OBC of the form discussed above would arise,
because the analyst would know that an approximation of the PDE itself is required at the
exit point. But an analyst might fall into the trap discussed above if s/he was 'thinking
parabolic when writing the code,' i.e., $\kappa > 0$, in which case the image point derivation of
an OBC is natural and useful, and the trap could later catch either the analyst or (more
probably and probably more dangerously) an unwary user who might innocently set
$\kappa = 0$ (or simply 'very small') in order to solve a pure advection problem.
A final final remark for the FDM community: if the image point (wiggle-making, at
large P) BC of (2.4-27) with H = q = 0 is employed, with a forward Euler timestepping
scheme (see Section 2.7.2e), then the OBC wiggles can be eliminated (precluded) by
adding the BTD term to the diffusion term and operating at or near the stability limit—a
change that actually converts the BAD OBC to the good one.
We conclude by returning to the GFEM and introduce the reader to what is often
a better OBC than dT/dx = 0. While it is easy to state and easy to implement in a
GFEM code, it is very difficult to analyze—which has led to its being called a 'fuzzy
boundary condition' (FBC) by Sani and Gresho (1994), and a 'no BC boundary condition'
by Griffiths (1997). Stated in words, it is simply this: for nodes on the open boundary
(only), do not integrate the diffusion term by parts when generating the GFEM equations.
That such a simple procedure seems to leave the AD equation without any BC at the
open boundary was first noted by Sani and Gresho (1994), and remarked upon by two
mathematicians who later analyzed it, as follows: 'It is our feeling that it should be
referred to as the "no BC boundary condition" as this more accurately describes the
situation, since within a purely continuous setting (as opposed to finite element
approximations) the weak formulation is invalid because it is equivalent to not setting any BC at
x = L and the governing equations cannot therefore isolate a unique solution' (Griffiths,
1997), and 'At the level of the partial differential equation it is of course impossible to
simply drop a boundary condition. However the discrete problem in Reference 1 yields
perfectly well-defined solutions which appear to be better than those obtained with more
"classical" choice of boundary conditions' (Renardy, 1997). Thus, for finite h (i.e., any
result from a finite element code), the 'new' BC yields unique (and often surprisingly
good) results, but the problem actually becomes ill-posed (non-unique) for h → 0, thus
causing us to call it a fuzzy BC. Reference 1 is Papanastasiou et al. (1992), in which the
so-called 'free boundary condition' was put forth and demonstrated (on non-trivial 2D
problems, in fact). It was also put forth even earlier by Frind (1988), who called it a
'free exit boundary condition'. Whereas we introduced it above by stating that
integration by parts is waived at open boundary nodes, both of the above references derived
it in an equivalent, but perhaps slightly more confusing manner; viz. they did perform
the integration by parts, but then put the resulting boundary integral terms back onto
the LHS as 'unknown' quantities. It is also worth mentioning that Frind used (only)
linear basis functions (for which ∇²T_h vanishes identically) and Papanastasiou et al.
used (only) quadratic basis functions. Frind even tested the pure diffusion case, with
the disappointing result (later analyzed and explained by Sani and Gresho, 1994) that
the initial condition at x = L also served as a BC there. He thus suggested, and
Griffiths later 'proved', that there should be some advection present in order that the new
OBC work 'well'. (Sani and Gresho also showed failure of the new BC for steady state
diffusion.)
Thus, whereas it is a demonstrably effective OBC and better than dT/dx = 0 (see the
numerical results in Frind, Papanastasiou et al., and Griffiths), and is thus recommended
as perhaps the very best way to treat outflow boundaries, it is still slightly mysterious as
to how and why it works, although Griffiths has done an excellent job of unravelling
at least most of the mystery, albeit with some non-'classical' results; e.g., 'It does not
converge in the conventional sense, but it does converge to something, and this paper
tells what that something is' (D.F. Griffiths, 1997, personal communication). What is it?
Unfortunately, one thing that it is, is 'element-dependent'; the results for linears
are different from those for quadratics, which are different from those for cubics, etc., with
higher-order elements giving higher-order (better) results. Part of the reason for this is that
the hidden/fuzzy BC is actually not applied at the boundary, but somewhere within the
element that contains the boundary node (nodes in multi-D), although neither Griffiths
nor Renardy studied any but 1D. (In fact, Griffiths is currently, at the time of writing,
working on the 2D case, which appears to be rather more difficult, with results that may
be more fuzzy.) We conclude by summarizing a few of Griffiths' (1997) results, and
we do so for the much more common case of zero source terms (S = 0), although both
Griffiths and Renardy also obtained results with S ≠ 0.
dT
(i) k
dx (2.4-32)
x=L
x=L
where the integration is over only the last element, and ψ(x) is an 'appropriate' linear
combination of the FEM basis functions (and is thus 'element-dependent'). Clearly the
RHS of (2.4-32) is 'small' (because L — X[ is) and thus the 'free' OBC is actually quite
close to dT/dx = 0.
(ii) The error is O[k(k + h)^p] for pth-order elements (p = 1 for linears), whereas that
from the conventional GFEM NBC OBC (dT/dx = 0) adds a term like h^{p+1} to the above
error; it is noticeably larger for the convection-dominated case, defined here by very
small k (k ≪ h).
(iii) The general ('non-standard') OBC is
$$\left(\frac{\partial}{\partial t} + u\,\frac{\partial}{\partial x}\right)\frac{\partial^{\,p-1} T}{\partial x^{\,p-1}} = 0 \qquad (2.4\text{-}33)$$
at x = ξ, where ξ is somewhere in the last element. [This result, for p = 2, was also
independently derived by Renardy (1997).]
(iv) For u sufficiently smooth, this OBC can also be stated as
$$\frac{\partial^{\,p+1} T}{\partial x^{\,p+1}} = 0 \quad \text{at } x = \xi. \qquad (2.4\text{-}34)$$
Thus, as mentioned at the outset, it is easier to apply this BC than to understand it. So
our final remark is: try it; you will probably like it—usually.
2.4.2 Two Dimensions
In fact, not very much more need be added in the multi-dimensional case. The goals are
the same and, for the most part, so is the method of implementation.
The analog of (2.4-1) for 2D can be obtained from the boundary two-patch equations
presented earlier—(2.3-30) and (2.3-31). We take the latter case for simplicity; (i.e.) as
an OBC, (2.3-31) with q = H = S = 0 gives
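$$
\begin{aligned}
&\frac{K}{6}\left(\frac{T_S - T_{SW}}{l} + 4\,\frac{T_0 - T_W}{l} + \frac{T_N - T_{NW}}{l}\right) \\
&\quad + \frac{l}{2}\Bigg\{\frac{1}{9}\left[\frac{\dot T_{SW} + \dot T_{NW}}{2} + (\dot T_S + \dot T_N) + 2(\dot T_W + 2\dot T_0)\right]
 + \frac{u}{6}\left(\frac{T_S - T_{SW}}{l} + 4\,\frac{T_0 - T_W}{l} + \frac{T_N - T_{NW}}{l}\right) \\
&\qquad + \frac{v}{3}\left(\frac{T_{NW} - T_{SW}}{2h} + 2\,\frac{T_N - T_S}{2h}\right)
 - \frac{K}{3}\left(\frac{T_{NW} - 2T_W + T_{SW}}{h^2} + 2\,\frac{T_S - 2T_0 + T_N}{h^2}\right)\Bigg\} = 0,
\end{aligned}
\qquad (2.4\text{-}35)
$$
an ODE that also represents the energy-balance-plus-NBC/OBC at node 0 that
approximates dT/dx = 0 there, but achieves dT/dx = 0 only as l → 0. Again, even for
advection-dominated flow, this OBC equation is effective because it does not interfere with the
passage of the advective flux through Γ_N.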
If the limiting condition, k = 0, applies, then (2.4-35) automatically becomes the
appropriate hyperbolic equation at the exit,
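$$
\frac{1}{9}\left[\frac{\dot T_{SW} + \dot T_{NW}}{2} + (\dot T_S + \dot T_N) + 2(\dot T_W + 2\dot T_0)\right]
+ \frac{u}{6}\left(\frac{T_S - T_{SW}}{l} + 4\,\frac{T_0 - T_W}{l} + \frac{T_N - T_{NW}}{l}\right)
+ \frac{v}{3}\left(\frac{T_{NW} - T_{SW}}{2h} + 2\,\frac{T_N - T_S}{2h}\right) = S,
\qquad (2.4\text{-}36)
$$
which properly approximates ∂T/∂t + u · ∇T = S at the outflow. It is noteworthy that this
PDE (not BC) is, as are the NBC's for k > 0, built in to a GFEM code; if k ≠ 0 and a
Robin BC a la (2.1-3) is appropriate, then the code user need only input the Robin data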
(H, q, T). If a passive OBC is desired, then the code user simply 'does nothing'—i.e.,
homogeneous Neumann data are automatically applied if no data are supplied. Finally,
if the pure advection case is being run, then the user again 'does nothing' —except of
course set k = 0—and the GFEM will automatically deliver (2.4-36) at outflow points.
Perhaps the largest difference in 2D is that ∇T and u are generally not parallel, neither
to each other nor to the unit normal vector at the domain exit, a situation that is conceivably
more difficult. This is because the NBC always involves K n · ∇T, the normal flux on Γ_N,
independently of either the flow direction or the shape of Γ_N; e.g., consider the situation
in Figure 2.4-4 (thanks to J. Leone), with an x-directed velocity field and an outflow
boundary not parallel to y.
For example, if the flow were fully developed (plane-parallel/uni-directional) with
v = 0 and T linear in y (e.g., a steady, stably stratified flow), then the NBC of n · ∇T =
n_x(∂T/∂x) + n_y(∂T/∂y) = 0 can cause some outflow problems relative to the desired
solution, T = T(y), or ∂T/∂x = 0; i.e., the (homogeneous) NBC may force ∂T/∂x ≠ 0.
Figure 2.4-4(b)-(d), with u = 1 and v = 0, should all show T = y, but only the advection-
dominated case comes even close at the outlet—because k is 'small.'
Fig. 2.4-4 (a) Isotherms for non-parallel outflow boundary; (b) Pe = 100; (c) Pe = 10;
(d) Pe = 1.
2.5 SOME NON-GALERKIN RESULTS
We mentioned earlier that there is a significant body of literature in which—for one reason
or another—the Galerkin method is not used to generate the discrete FEM equations.
Herein, we address several of these for the AD equation—all based on bilinear
function approximation. We do not discuss here the many non-Galerkin approximations of
the advection terms—most of which involve the intentional introduction of some form
of numerical dissipation. Those that we introduce here are: mass lumping, one-point
quadrature, and our version of a CVFEM (Control Volume FEM).
2.5.1 The Lumped Mass Approximation
A very common approximation/short-cut in the FEM is that of 'lumping the mass,' the
term deriving from solid mechanics in which the mass matrix of (2.2-8) is related to the
acceleration of inertial mass; see Archer (1963). The terminology is rather fixed because it
is so prevalent, and we retain it here—usually in the analogous context wherein the mass
matrix is associated with the time rate of change of a dependent variable. We first attempt
to motivate the approximation in three ways: physically, mathematically, and numerically.
Physically, the procedure may be (at least for some elements) motivated by the
following assumption/approximation: assume—at least temporarily—that all nodes that
share the support of the test function [e.g., φ_i in (2.2-2)] vary in time at the same rate.
In the 4-patch context of Section 2.3.2—see (2.3-23) or (2.3-24) and Figure 2.2-3—the
time rate of change of all eight nodes surrounding node 0 is assumed to be no different
from that of node 0 itself. It is a 'temporary' assumption because it must be changed
when changing the focus from node i to any other node, a fact that clearly calls such a
rationale into question.
Mathematically (to permit equivalent formulation of the FEM equations) it can be
described/derived by replacing ∫φ_iφ_j in (2.2-8) by ∫φ_i, as follows:
$$M_{ij} = \int \varphi_i \varphi_j \;\;\to\;\; M^{L}_{ij} = \delta_{ij} \int \varphi_i ,$$
which is (2.2-28). Note that since Σ_j φ_j = 1, this procedure is equivalent to summing
the rows of the consistent mass (CM) matrix, M_ij. We hasten to add, however, that this
so-called 'row-sum' technique of mass lumping does not work for all elements since
∫φ_i is not always positive, the simplest example being the six-node triangle, and the
next simplest, the eight-node serendipity (semi-biquadratic) element already discussed
in Section 2.3.4. But here we use the row-sum technique because it does work for the
four-node bilinear element. By 'work' we mean that the resulting semi-discrete equations
generate 'appropriate' ODE's, which converge to the PDE as h —> 0.
Numerically, the principal motivation behind the technique is the generation of a
diagonal mass matrix—basically a vector—whose inverse is trivial to evaluate and 'compute
with' when compared with the CM matrix, which, while sparse and banded, has an inverse
that is dense.
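The row-sum recipe just described can be sketched for a single bilinear rectangle. This is only an illustration under stated assumptions: the function name `bilinear_mass`, the local node ordering, and the element size are hypothetical choices, not taken from the text, and a 2 x 2 Gauss rule is used because it integrates the products φ_iφ_j exactly on a rectangle.

```python
import numpy as np

def bilinear_mass(l, h):
    """Consistent mass matrix of an l-by-h bilinear rectangle (2x2 Gauss rule)."""
    g = 1.0 / np.sqrt(3.0)
    corners = [(-1, -1), (1, -1), (1, 1), (-1, 1)]    # assumed local node order
    M = np.zeros((4, 4))
    for xi in (-g, g):
        for eta in (-g, g):
            # bilinear basis functions evaluated at the Gauss point
            phi = np.array([(1 + cx * xi) * (1 + cy * eta) / 4 for cx, cy in corners])
            M += np.outer(phi, phi) * (l * h / 4)     # Jacobian l*h/4, unit Gauss weights
    return M

M = bilinear_mass(2.0, 1.0)
M_lumped = np.diag(M.sum(axis=1))    # row-sum lumping: delta_ij * integral of phi_i
```

For this element every row sum equals one quarter of the element area, so the lumped matrix is positive and diagonal; the same recipe applied to elements whose ∫φ_i is not positive would not yield a usable matrix.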
We will return to all of these aspects in more detail later, but here our goal is simply to
point out the effect of this ad hoc modification on the ODE's of the semi-discrete system,
and to emphasize that the result is a non-Galerkin modification. And this is perhaps best
done by referring to the element patches discussed earlier; (2.3-23) through (2.3-47). In all
of these equations the LM approximation is effected simply by replacing the coupled first
derivatives in the first term by Ṫ_0, as done in most FDM approximations. [In 1D (e.g.,
in (2.3-8)) the analogous procedure is to replace Ṫ_{i−1} and Ṫ_{i+1} by Ṫ_i, of course.]
It is important to point out that the LM technique can give quite inconsistent (read
'silly') weighting to the time-derivative terms—especially on a non-uniform mesh.
Consider, for example, the 4-patch shown earlier, in which the northwest (NW) element is
'much' larger than the southeast (SE) element (i.e., h_2 ≫ h_1). The Galerkin weighting is
basically area weighting, for which node NW assumes a larger role (is more 'important')
than does node SE in the ODE for node 0—but this larger role is felt consistently in all
terms when GFEM is used.
Lumping the mass removes this consistent weighting from the dT/dt approximation with
the result that the resulting semi-discrete equations are much less accurate approximations
to the PDE.
(They are often even less accurate than the simplest FDM on the same mesh!)
This inconsistency of mass lumping will show up again later, when we examine phase
speed and group velocity—in which case a serious loss of accuracy occurs even on a
uniform mesh.
In closing this brief section (see Section 2.7.2b for further LM discussion), it is also
worthwhile pointing out (again) that only the CM matrix produces a best least-squares fit
to the data; i.e., for (2.2-7), with T given,
$$\dot T = M^{-1}\,[\,f - N(u)T - KT\,]$$
produces an 'acceleration field', ∂T_h(x, t)/∂t = Σ_j Ṫ_j(t) φ_j(x), that is also the L2-projection
(see Appendix 3) of ∇ · (K · ∇T) + S − u · ∇T onto the bilinear basis functions. This 'best
approximation' property of the time derivative is lost when LM is involved.
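The best-fit property can be illustrated with a small 1D experiment. The sketch below is an assumption-laden stand-in: it projects an arbitrary given function g(x) (playing the role of the right-hand side of the PDE) onto piecewise-linear elements on [0, 1], once with the consistent mass matrix and once with its row-sum lumped version, and measures the L2 error of each. The function name, mesh size, and quadrature order are hypothetical choices.

```python
import numpy as np

def l2_projection_errors(g, n=8, nq=4):
    """L2 errors of projecting g onto linear elements with consistent and lumped mass."""
    x = np.linspace(0.0, 1.0, n + 1)
    h = 1.0 / n
    qp, qw = np.polynomial.legendre.leggauss(nq)   # Gauss rule on [-1, 1]
    s, w = (qp + 1) / 2, qw / 2                    # mapped to [0, 1]
    phi = np.vstack([1 - s, s])                    # two local hat functions
    M = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    for e in range(n):
        xe = x[e] + h * s
        M[e:e + 2, e:e + 2] += h * (phi * w) @ phi.T
        b[e:e + 2] += h * phi @ (w * g(xe))
    c_cm = np.linalg.solve(M, b)                   # consistent mass: true L2 projection
    c_lm = b / M.sum(axis=1)                       # row-sum lumped mass

    def err(c):
        total = 0.0
        for e in range(n):
            xe = x[e] + h * s
            th = c[e] * (1 - s) + c[e + 1] * s
            total += h * np.sum(w * (g(xe) - th) ** 2)
        return np.sqrt(total)

    return err(c_cm), err(c_lm)

e_cm, e_lm = l2_projection_errors(lambda x: np.sin(2 * np.pi * x))
```

Because the consistent-mass coefficients minimize the (quadrature) L2 norm of the misfit by construction, `e_cm` cannot exceed `e_lm`; how much larger the lumped error is depends on g and on the mesh.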
An interim bottom line on mass lumping, especially for advection-dominated flow:
avoid it if at all possible; it is not honest GFEM.
2.5.2 One-point Quadrature
Another common short-cut to the GFEM is to employ a less-accurate (and much less-
expensive, especially in 3D) integration rule to form the element matrices. Here, for
linear (bilinear, trilinear) basis functions we examine the effect of replacing a sufficiently
accurate Gauss-Legendre rule to evaluate the integrals in (2.2-8) through (2.2-12) by
a simple and cheap one: evaluate (at element level, of course) each integrand at the
element centroid and multiply the result by the element area (volume in 3D). The element
matrices corresponding to this particular non-Galerkin (or perhaps 'approximate Galerkin,'
since the spirit is still the same) modification are presented in Appendix 1. Here, we
use those results to 'replicate' some of those presented earlier for GFEM. We remark,
however, that this approximation tends to make the diffusion matrix singular [to a 2Δx ×
2Δy 'checkerboard' eigenvector, (−1)^{i+j} in the simplest case], and various 'hourglass
corrections' (a term from Lagrangian solid mechanics; see, for example, Goudreau and
Hallquist, 1982) often need be applied, a subject we defer until the next chapter.
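The checkerboard singularity mentioned above is easy to see at element level. The following sketch (function name and element size are illustrative assumptions) builds the diffusion matrix of one bilinear rectangle with the integrand sampled only at the centroid and applies it to the alternating-sign corner vector.

```python
import numpy as np

def one_point_diffusion(l, h, K=1.0):
    """Diffusion matrix of a bilinear l-by-h rectangle, integrand sampled at the centroid only."""
    corners = np.array([(-1, -1), (1, -1), (1, 1), (-1, 1)], dtype=float)
    dphi_dx = corners[:, 0] / (2 * l)     # d(phi_i)/dx at the centroid
    dphi_dy = corners[:, 1] / (2 * h)     # d(phi_i)/dy at the centroid
    return K * l * h * (np.outer(dphi_dx, dphi_dx) + np.outer(dphi_dy, dphi_dy))

Ke = one_point_diffusion(1.0, 0.5)
hourglass = np.array([1.0, -1.0, 1.0, -1.0])   # (-1)^(i+j) pattern on the four corners
residual = Ke @ hourglass
rank = np.linalg.matrix_rank(Ke)
```

The centroid gradients of the alternating pattern vanish, so `residual` is zero and the element matrix has rank 2 instead of the rank 3 of the fully integrated matrix; this extra null vector is the mode that hourglass corrections are meant to control.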
a. An interior 4-patch of uniform rectangles
The ODE for node 0 in Figure 2.3-3 that approximates the GFEM equation given by
(2.3-24) is
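$$
\begin{aligned}
&\frac{1}{16}\left[(\dot T_{SW} + \dot T_{NW} + \dot T_{NE} + \dot T_{SE}) + 2(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + 4\dot T_0\right] \\
&\;+ \frac14\cdot\frac14\left(\frac{u_W + u_{SW}}{2}\,\frac{T_S - T_{SW}}{l} + 2\,\frac{u_{SW} + 2u_W + u_{NW}}{4}\,\frac{T_0 - T_W}{l} + \frac{u_W + u_{NW}}{2}\,\frac{T_N - T_{NW}}{l}\right) \\
&\;+ \frac12\cdot\frac14\left(\frac{u_0 + u_S}{2}\,\frac{T_{SE} - T_{SW}}{2l} + 2\,\frac{u_S + 2u_0 + u_N}{4}\,\frac{T_E - T_W}{2l} + \frac{u_0 + u_N}{2}\,\frac{T_{NE} - T_{NW}}{2l}\right) \\
&\;+ \frac14\cdot\frac14\left(\frac{u_E + u_{SE}}{2}\,\frac{T_{SE} - T_S}{l} + 2\,\frac{u_{SE} + 2u_E + u_{NE}}{4}\,\frac{T_E - T_0}{l} + \frac{u_E + u_{NE}}{2}\,\frac{T_{NE} - T_N}{l}\right) \\
&\;+ \frac14\cdot\frac14\left(\frac{v_S + v_{SW}}{2}\,\frac{T_W - T_{SW}}{h} + 2\,\frac{v_{SW} + 2v_S + v_{SE}}{4}\,\frac{T_0 - T_S}{h} + \frac{v_S + v_{SE}}{2}\,\frac{T_E - T_{SE}}{h}\right) \\
&\;+ \frac12\cdot\frac14\left(\frac{v_0 + v_W}{2}\,\frac{T_{NW} - T_{SW}}{2h} + 2\,\frac{v_W + 2v_0 + v_E}{4}\,\frac{T_N - T_S}{2h} + \frac{v_0 + v_E}{2}\,\frac{T_{NE} - T_{SE}}{2h}\right) \\
&\;+ \frac14\cdot\frac14\left(\frac{v_N + v_{NW}}{2}\,\frac{T_{NW} - T_W}{h} + 2\,\frac{v_{NW} + 2v_N + v_{NE}}{4}\,\frac{T_N - T_0}{h} + \frac{v_N + v_{NE}}{2}\,\frac{T_{NE} - T_E}{h}\right) \\
&= \frac{K}{4}\left[\frac{T_{SW} - 2T_S + T_{SE}}{l^2} + 2\,\frac{T_W - 2T_0 + T_E}{l^2} + \frac{T_{NW} - 2T_N + T_{NE}}{l^2}\right] \\
&\;+ \frac{K}{4}\left[\frac{T_{SW} - 2T_W + T_{NW}}{h^2} + 2\,\frac{T_S - 2T_0 + T_N}{h^2} + \frac{T_{SE} - 2T_E + T_{NE}}{h^2}\right] \\
&\;+ \frac{1}{16}\left[(S_{SW} + S_{NW} + S_{NE} + S_{SE}) + 2(S_N + S_S + S_E + S_W) + 4S_0\right].
\end{aligned}
\qquad (2.5\text{-}1)
$$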
Besides a different averaging of the velocity coefficients, the one-point quadrature
approximation has consistently converted the (1 4 1)/6 averaging coefficients (and their
tensor product equivalent for 2D via bilinear basis functions) to (1 2 1)/4 for 1D and its
tensor product equivalent for 2D. Also, the 1/6 upwind, 2/3 centered, and 1/6 downwind
of GFEM becomes 1/4 upwind, 1/2 centered, and 1/4 downwind.
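The change of averaging weights can be reproduced in 1D with exact rational arithmetic. In this minimal sketch (the helper name `interior_mass_row` and the unit element length are assumptions made for illustration), the mass-matrix row of an interior node is assembled from two linear elements, first with a rule that integrates the products of hat functions exactly and then with a single centroid sample.

```python
from fractions import Fraction as F

def interior_mass_row(rule):
    """Mass-matrix row (W, 0, E) for an interior node of a uniform 1D mesh, unit element length."""
    # rule: list of (point, weight) pairs on [0, 1]; the local hats are 1 - s and s
    def integ(f):
        return sum(w * f(s) for s, w in rule)
    same = integ(lambda s: s * s)           # integral of phi_0 * phi_0 over one element
    cross = integ(lambda s: s * (1 - s))    # integral of phi_0 * phi_neighbor
    return [cross, 2 * same, cross]

simpson = [(F(0), F(1, 6)), (F(1, 2), F(2, 3)), (F(1), F(1, 6))]   # exact for these integrands
one_point = [(F(1, 2), F(1))]                                       # centroid sample only

row_exact = interior_mass_row(simpson)
row_one_point = interior_mass_row(one_point)
```

The exact rule returns (1/6, 2/3, 1/6) and the centroid rule returns (1/4, 1/2, 1/4), i.e. the one-point approximation shifts weight from the node itself to its neighbors.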
It is interesting to note that the advection terms can be rearranged to
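$$
\begin{aligned}
\frac14\Bigg[&\bar u_{SW}\,\frac{(T_0 - T_W) + (T_S - T_{SW})}{2l} + \bar u_{SE}\,\frac{(T_E - T_0) + (T_{SE} - T_S)}{2l} \\
&+ \bar u_{NE}\,\frac{(T_E - T_0) + (T_{NE} - T_N)}{2l} + \bar u_{NW}\,\frac{(T_0 - T_W) + (T_N - T_{NW})}{2l}\Bigg] \\
&+ \frac14\left[\text{similar terms from the } v\,\partial T/\partial y \text{ portion}\right],
\end{aligned}
$$
where ū_SW = (1/4)(u_SW + u_S + u_0 + u_W), the average (centroid) x-velocity in the southwest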
element, etc.; i.e., the one-point quadrature approximation—at least on a patch of uniform
rectangles—can be usefully interpreted and described as follows: the average velocity in
each element is multiplied by the average temperature gradient in the same element, and
the results are averaged over the four elements sharing the node in question.
If the skew-symmetric version of the advection matrix is desired (β = 1/2), it can
be obtained via the skew-symmetric part of the above advection matrix, as discussed in
Section 2.2.4. For example, the u ∂T/∂x portion becomes
$$
\frac14\left[\frac{\bar u_{SE} T_{SE} - \bar u_{SW} T_{SW}}{2l}
+ 2\,\frac{\dfrac{\bar u_{SE} + \bar u_{NE}}{2}\,T_E - \dfrac{\bar u_{SW} + \bar u_{NW}}{2}\,T_W}{2l}
+ \frac{\bar u_{NE} T_{NE} - \bar u_{NW} T_{NW}}{2l}\right],
$$
which is 'similar' to the GFEM skew-symmetric version shown earlier.
b. A boundary 2-patch
Similarly, the one-point quadrature approximation that corresponds to (2.3-30)—simplified
to a uniform mesh— is
< - [Tsw + TS + TN + TNW + 2(7\) + Tw) — (Ssw + Ss + SN + SNW) — 2(S0 + Sw)\
I (I
2
1
+ 4
Usw + Us + Uq + Uw Ts — Tsw ~ USW + «5 + 2(«o + «IV ) + M/V + 1*NW Tq — Tw
+
4 /
Uw + Uq -\- M/v + Uftw Tn — Tfi/w
+
+
1
4 /
vSw + vs + vQ + vw Tw - Tsw 2 vsw + vs + 2(vQ +vw) + vN + vNW TN - Ts
h
8
2/i
vw +v0 + vN + vNW TNW - T
w
K
2
h
Tsw — 27V + TNW \ [Ts — 2TQ + TN
h1
hz
« (Ts — Tsw 0 T0 — Tw TN — TNW\ H
= -(qs + 2q0 + qN);
(Ts - Ts) + 2(T0 - TQ) + (TN - TN)
(2.5-2)
again a similar modification of the GFEM result.
c. A boundary corner
Moving right along, the one-point quadrature approximation corresponding to (2.3-33) is
Ih
2(1+h)
1 .... 1
-(Tsw + Ts + T0 + Tw) — -(Ssw + Ss + Sq + Sw)
, _ ( Tq — Tw + Ts — Tsw \ , _ / ^o — Ts + Tw — Tsw
+ u — +v
2/
2h
+ K
h (Tq — Tw + Ts — TSw \ I {Tq — Ts + Tw — Tsw
l+h
2/
l + h
2h
+ H
T0 + T
w
w
+
h
l+h
Tq-Tq + Ts- Ts
+
h fqs + qo
l+h
(2.5-3)
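where ū and v̄ are the centroid (average) velocities. As for the GFEM, the l, h → 0
limit is
$$
\frac{h}{l+h}\left[K\frac{\partial T}{\partial x} + H(T - \tilde T) - q\right]
+ \frac{l}{l+h}\left[K\frac{\partial T}{\partial y} + H(T - \tilde T) - q\right] = 0.
$$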
2.5.3 Control Volume Finite Element (CVFEM)
A straightforward application of the theory presented in Section 2.2.6 will display the
CVFEM and permit some interesting comparisons with GFEM (both honest and via the
one-point quadrature approximation). However, as mentioned earlier, there is not yet
available a computational comparison on advection-dominated flows (to our knowledge),
although Comini et al. (1996) come close. It will become reasonably clear, though, that
the CVFEM has more similarities with (a lowest-order) GFEM than differences.
We proceed as before, beginning with an interior four-patch and ending with an
exposition of NBC's and OBC's for the CVFEM, after mentioning again that only on simple
meshes of the type shown below are M and KD symmetric.
a. An interior 4-patch
We begin with a new sketch—shown in Figure 2.5-1—for the CVFEM on rectangles
(as for GFEM, distorted meshes and isoparametric elements are probably best left to the
computer).
A direct application of (2.2-38)—with Sj = 0 for simplicity—to the CV for node 0
yields the following, in which the matrix notation of (2.2-7) is employed:
Fig. 2.5-1 A bilinear 4-patch with control volume.
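1. $$M\dot T\big|_0 = \frac{1}{64}\Big\{l_1h_1\dot T_{SW} + l_2h_1\dot T_{SE} + l_2h_2\dot T_{NE} + l_1h_2\dot T_{NW} + 3\big[(l_1 + l_2)(h_1\dot T_S + h_2\dot T_N) + (h_1 + h_2)(l_1\dot T_W + l_2\dot T_E)\big] + 9(l_1 + l_2)(h_1 + h_2)\dot T_0\Big\};$$
2. $$
\begin{aligned}
K_D T\big|_0 = \frac{K}{8}\Big\{&\frac{h_1}{l_1}[3(T_0 - T_W) + (T_S - T_{SW})] + \frac{h_1}{l_2}[3(T_0 - T_E) + (T_S - T_{SE})] \\
&+ \frac{h_2}{l_1}[3(T_0 - T_W) + (T_N - T_{NW})] + \frac{h_2}{l_2}[3(T_0 - T_E) + (T_N - T_{NE})]\Big\} \\
+ \frac{K}{8}\Big\{&\frac{l_1}{h_1}[3(T_0 - T_S) + (T_W - T_{SW})] + \frac{l_2}{h_1}[3(T_0 - T_S) + (T_E - T_{SE})] \\
&+ \frac{l_1}{h_2}[3(T_0 - T_N) + (T_W - T_{NW})] + \frac{l_2}{h_2}[3(T_0 - T_N) + (T_E - T_{NE})]\Big\};
\end{aligned}
$$
3. N(uT)|_0 = N_x(uT) + N_y(vT) ≈ ∂(uT)/∂x + ∂(vT)/∂y,
where u = Σ_j u_jφ_j and v = Σ_j v_jφ_j:
$$
\begin{aligned}
N(\mathbf{u}T)\big|_0 = \frac{1}{24}\Bigg\{ & h_1\bigg[\frac{u_S + u_{SE}}{2}\frac{T_S + T_{SE}}{2} - \frac{u_S + u_{SW}}{2}\frac{T_S + T_{SW}}{2} \\
&\quad + 2\bigg(\frac{u_0 + u_E}{2}\frac{T_S + T_{SE}}{2} - \frac{u_0 + u_W}{2}\frac{T_S + T_{SW}}{2}
 + \frac{u_S + u_{SE}}{2}\frac{T_0 + T_E}{2} - \frac{u_S + u_{SW}}{2}\frac{T_0 + T_W}{2}\bigg)\bigg] \\
& + 7(h_1 + h_2)\bigg[\frac{u_0 + u_E}{2}\frac{T_0 + T_E}{2} - \frac{u_0 + u_W}{2}\frac{T_0 + T_W}{2}\bigg] \\
& + h_2\bigg[\frac{u_N + u_{NE}}{2}\frac{T_N + T_{NE}}{2} - \frac{u_N + u_{NW}}{2}\frac{T_N + T_{NW}}{2} \\
&\quad + 2\bigg(\frac{u_0 + u_E}{2}\frac{T_N + T_{NE}}{2} - \frac{u_0 + u_W}{2}\frac{T_N + T_{NW}}{2}
 + \frac{u_N + u_{NE}}{2}\frac{T_0 + T_E}{2} - \frac{u_N + u_{NW}}{2}\frac{T_0 + T_W}{2}\bigg)\bigg]\Bigg\} \\
+ \frac{1}{24}\Bigg\{ & l_1\bigg[\frac{v_W + v_{NW}}{2}\frac{T_W + T_{NW}}{2} - \frac{v_W + v_{SW}}{2}\frac{T_W + T_{SW}}{2} \\
&\quad + 2\bigg(\frac{v_0 + v_N}{2}\frac{T_W + T_{NW}}{2} - \frac{v_0 + v_S}{2}\frac{T_W + T_{SW}}{2}
 + \frac{v_W + v_{NW}}{2}\frac{T_0 + T_N}{2} - \frac{v_W + v_{SW}}{2}\frac{T_0 + T_S}{2}\bigg)\bigg] \\
& + 7(l_1 + l_2)\bigg[\frac{v_0 + v_N}{2}\frac{T_0 + T_N}{2} - \frac{v_0 + v_S}{2}\frac{T_0 + T_S}{2}\bigg] \\
& + l_2\bigg[\frac{v_E + v_{NE}}{2}\frac{T_E + T_{NE}}{2} - \frac{v_E + v_{SE}}{2}\frac{T_E + T_{SE}}{2} \\
&\quad + 2\bigg(\frac{v_0 + v_N}{2}\frac{T_E + T_{NE}}{2} - \frac{v_0 + v_S}{2}\frac{T_E + T_{SE}}{2}
 + \frac{v_E + v_{NE}}{2}\frac{T_0 + T_N}{2} - \frac{v_E + v_{SE}}{2}\frac{T_0 + T_S}{2}\bigg)\bigg]\Bigg\},
\end{aligned}
$$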
which is (as required) symmetric in velocity and temperature and is a combination of
centered differences.
These are the contributions to the CVFEM version of (2.2-7) for node 0 and can
be compared with the GFEM equivalent in Section 2.3.2. Similar to the GFEM, we
can generate an FDM form via multiplication by the inverse of the (same!) lumped
mass matrix, M_L = (l_1 + l_2)(h_1 + h_2)/4, which now is simply the size of the CV,
and rearrange, a la (2.3-23) and, especially, the conservation form (β = 1) that came
later; see (2.3-28). The result is
l
16
Asw + , ANW ■ ANE ■ ASe ■
— ' SW n I NW H — 1 NE + —— ' SE
Ay A? Aj Aj
, „ [ANE +ANW. ASe +Asw + , ASe +ANE ■ Asw +ANW ■
+ -3 I / n H -A 1 s H / e H -A ' w ) + 9/o
+
2h\
24 h\+h2
Aj Aj Aj
us + use Ts + Tse _ us + usw Ts + TSw
2 ' 2 2 ' 2
/1+/2
+ 2
UE + USE Te + Tse Uw + USW TW + T
SW
2-
(/1+/2)
+
uq + use Tq + Tse _ uq + usw Tg + Tsw us + ue Ts + TE _ us + uw Ts + Tw
i -> °2 2. "> "> n
h+h
h+h
- 2-
useTse — uswTsw
(/1+/2)
+
12
uq + ue Tq + Te uq + uw Tq + T
w
1 ■
h+h
-2-
ueT e — uwT
w
h + h
+
1 2h2
24 h\ + h2
un + une Tn + The un + unw Tn + T^w
h+h
+ 2
he + une Te + Tne u\v + unw Tw + T^
w
2-
(/1+/2)
«o + une Tq + T^E uq + unw Tq + TNW un + ue T^ + Te w/v + uw T^ + T
w
+
h+h
+
h+h
- 2-
uneTne — unwTnw
(77+71)
+ 'vertical advection'
K
h+h
2h\ (T$- Tsw TSE - Ts\ (T0 - Tw Te - Tq
+ o
M +h2 V h
h
h
+
2h2 (TN-T,
NW TNe — Tft
h\ +h2 \ h
K
h\ +/i2
h
2/i fTw — Tsw TNW-TW\ (Tq — Ts Tn — T0
+ o
h+h
hi
h2
+
2/2 (Te - Tse TNE - Te
h+h
hi
(2.5-4)
which merits the following
Remarks:
(1) The vertical advection terms can be written by inspection/symmetry.
(2) The advection terms were judiciously rearranged (as was done earlier for the β = 1
GFEM case) so that pointwise products (fluxes) are present everywhere.
(3) The x-advection terms are interpretable as follows: the 2h_1/(h_1 + h_2) terms
contribute (1/8) ∂(uT)/∂x from the southern nodes, the 2h_2/(h_1 + h_2) terms contribute
a like amount from the northern nodes, and the remaining (3/4) ∂(uT)/∂x is contributed
by the center nodes; cf. (2.3-28).
(4) The similarity with the GFEM conservation form is obvious; only the averaging is
changed.
(5) The source term can be easily incorporated by replacing each Ṫ_j by (Ṫ_j − S_j),
since they share the same mass matrix.
(6) The diffusion term again is in the form −∇ · q, where q = −K∇T.
To make further 'progress,' we simplify the results to the simplest case: constant
velocity on a uniform mesh. The result is
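$$
\begin{aligned}
&\frac{1}{64}\left[(\dot T_{SW} + \dot T_{NW} + \dot T_{NE} + \dot T_{SE}) + 6(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + 36\dot T_0\right] \\
&\;+ \frac{u}{8}\left[\frac{T_{SE} - T_{SW}}{2l} + 6\,\frac{T_E - T_W}{2l} + \frac{T_{NE} - T_{NW}}{2l}\right]
 + \frac{v}{8}\left[\frac{T_{NW} - T_{SW}}{2h} + 6\,\frac{T_N - T_S}{2h} + \frac{T_{NE} - T_{SE}}{2h}\right] \\
&= \frac{K}{8}\left[\frac{T_{SW} - 2T_S + T_{SE}}{l^2} + 6\,\frac{T_W - 2T_0 + T_E}{l^2} + \frac{T_{NW} - 2T_N + T_{NE}}{l^2}\right] \\
&\;+ \frac{K}{8}\left[\frac{T_{SW} - 2T_W + T_{NW}}{h^2} + 6\,\frac{T_S - 2T_0 + T_N}{h^2} + \frac{T_{SE} - 2T_E + T_{NE}}{h^2}\right],
\end{aligned}
\qquad (2.5\text{-}5)
$$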
which is to be compared with (2.3-24)—GFEM—and (2.5-1)—one-point GFEM, after
setting the velocity to a constant there.
Before making any obvious remarks other than noting that the Galerkin weighting
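(1 4 1)/6 has gone over to (1 6 1)/8, let us first display the analogous 1D CVFEM
results, easily derived from (2.5-5):
$$
\frac18\left[\frac{2l_1}{l_1 + l_2}\,\dot T_W + 6\dot T_0 + \frac{2l_2}{l_1 + l_2}\,\dot T_E\right]
+ u\,\frac{T_E - T_W}{l_1 + l_2}
= K\,\frac{2}{l_1 + l_2}\left(\frac{T_E - T_0}{l_2} - \frac{T_0 - T_W}{l_1}\right),
\qquad (2.5\text{-}6)
$$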
which is a seemingly small change from the GFEM equation, (2.3-9), and it may be
interesting to examine this 'small change' (mass matrix only) from the point of view
of local conservation, which is the 'strong point' of control volume methods. To this
end, consider the local solution, Th(x), shown in Figure 2.5-2. While the support of the
GFEM test function, φ_0, spans all of both elements, that of CVFEM, ψ_0, spans one half
of each. The change in the mass matrix, from (1/6)[l_1, 2(l_1 + l_2), l_2] to (1/8)[l_1, 3(l_1 + l_2), l_2],
is just that change needed to conserve energy in Ω_0 = [x_0 − l_1/2, x_0 + l_2/2]. If the mass
is lumped, then GFEM and CVFEM become identical, and neither one represents
a proper local energy balance since the total energy in the CV (via the piecewise-linear
representation) is exactly (1/8)[l_1 T_W + 3(l_1 + l_2)T_0 + l_2 T_E]; it is not [(l_1 + l_2)/2]T_0. Also,
whereas the GFEM diffusive flux is discontinuous at element boundaries, the CVFEM
flux is continuous at CV boundaries, even though it too is discontinuous at element
boundaries. The CVFEM 'cleverly' arranges local flux continuity and (using consistent
mass) local energy conservation, two physically appealing attributes that are sacrificed
via Galerkin weighting.
Fig. 2.5-2 Piecewise linear function, GFEM test function (φ_0), and CVFEM test function (ψ_0).
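The local energy statement can be checked directly by integrating a piecewise-linear T_h over the control volume [x_0 − l_1/2, x_0 + l_2/2]. The sketch below uses hypothetical function names and arbitrary sample values; the trapezoid rule on each half is exact because T_h is linear there.

```python
from fractions import Fraction as F

def cv_energy(TW, T0, TE, l1, l2):
    """Integral of the piecewise-linear T_h over the control volume of node 0."""
    west_face = (TW + T0) / 2      # T_h at x0 - l1/2
    east_face = (T0 + TE) / 2      # T_h at x0 + l2/2
    return (l1 / 2) * (west_face + T0) / 2 + (l2 / 2) * (T0 + east_face) / 2

def cvfem_mass_row(TW, T0, TE, l1, l2):
    """Consistent CVFEM mass row, (1/8)[l1, 3(l1 + l2), l2], applied to the nodal values."""
    return (l1 * TW + 3 * (l1 + l2) * T0 + l2 * TE) / 8

def lumped_row(TW, T0, TE, l1, l2):
    """Lumped-mass value, which credits the whole control volume with T0."""
    return (l1 + l2) / 2 * T0

sample = (F(2), F(5), F(-1), F(1, 3), F(2))   # TW, T0, TE, l1, l2 (arbitrary)
exact = cv_energy(*sample)
consistent = cvfem_mass_row(*sample)
lumped = lumped_row(*sample)
```

For any nodal values and element lengths the consistent row reproduces the integral exactly, whereas the lumped value agrees only when T_W = T_0 = T_E.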
Note that this comparison is at odds with those recently published by Comini et al.
(1991, 1992), who assert that GFEM also displays element-level balances, which clashes
with the opinion that only a weighted-residual method that utilizes discontinuous test
functions can generate element-level balances. Both viewpoints are presented in Appendix 2.
Remark
A quadratically conserving CVFEM and the associated guaranteed stable ODE's do not
appear to be easily derived—even if desired—since by construction/definition, only the
divergence form (not skew-symmetric) of advection is permitted. This could be construed
as a disadvantage.
The (interim) bottom line on these last two methods is this: the (1 4 1)/6 averaging
inherent in the GFEM is replaced by (1 2 1)/4 for the one-point quadrature approximation,
and by (1 6 1)/8 for the CVFEM; the former increases the influence of 'neighbors,' and the
latter decreases it. In 2D, the tensor product of these terms applies, at least to the mass
matrix. In fact, in this simple context, we have: (GFEM) = 1/3 (GFEM via one-point) + 2/3
(CVFEM), or, (CVFEM) = 3/2 (GFEM) − 1/2 (GFEM via one-point). A few further
comparisons will be presented later, when we focus on pure advection and the associated problems
(errors) associated with dispersion (phase speed and group velocity).
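The linear relation among the three 1D averaging stencils can be verified with exact fractions. In this sketch the blending coefficients are solved for rather than assumed; the variable names are illustrative only.

```python
from fractions import Fraction as F

gfem      = [F(1, 6), F(4, 6), F(1, 6)]   # (1 4 1)/6
one_point = [F(1, 4), F(2, 4), F(1, 4)]   # (1 2 1)/4
cvfem     = [F(1, 8), F(6, 8), F(1, 8)]   # (1 6 1)/8

# Seek gfem = a * one_point + b * cvfem with a + b = 1 (all stencils sum to one);
# the off-diagonal entry alone then fixes a.
a = (gfem[0] - cvfem[0]) / (one_point[0] - cvfem[0])
b = 1 - a

def combo(c1, s1, c2, s2):
    """Entry-by-entry linear combination of two stencils."""
    return [c1 * x + c2 * y for x, y in zip(s1, s2)]

gfem_rebuilt = combo(a, one_point, b, cvfem)
cvfem_rebuilt = combo(1 / b, gfem, -a / b, one_point)
```

The solve gives a = 1/3 and b = 2/3, and the diagonal entry is matched as well, so the GFEM stencil lies between the one-point and CVFEM stencils, twice as close to the latter. The check is made on the 1D stencils only.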
b. A boundary 2-patch
The key new feature here will be the NBC associated with the CVFEM, for which purpose
we introduce the following 2-patch in Figure 2.5-3.
Fig. 2.5-3 A boundary 2-patch with control volume.
Application of (2.2-38) to the CV of
node 0 yields, with due account taken of the Robin BC of (2.1-3) on the two segments
of Γ_0 that are coincident with Γ_N for the K ∂T/∂n terms,
/
64
[M3r5 + tsw) + 3(/*i + h2)(3t0 + fw) + h2(3fN + fNW)}
+
24
7 u0Tq
Uq + UW Tq + T
w
+ 2(u0Ts + usT0) + usTs -
us + usw Ts + Tsw
/us + usw Tq + Ts uq + uw Ts + Tsw
+
24
7 M0^0 -
2 2
«o + «w 7o + Tw
M/V + Ma/W T/y + 7"
/VW
+ 2(u0TN + wa/70) + wa/7a/ -
(un + unw To + Tw uq + uw Tn + 7a/w
+
/
24
+ 2f
7
2 2 2 2
' vo + vN 70 + TN v0 + vs T0 + Ts
2 2 2 2,
i;o + vw Tw + 7,/vw ^o + ^s 7w + 75vy
+
+
2 2 2 2
vw + vnw Tq + TV ^ vw + ^sw 7^o + ^5 \
2 2 2 2 J
/% + vnw Tw + 7Vw vw + ^sw ?V + Tsw
+ hlf9L^)+h2
2 2
Qn + 3Q0^
8
= y [3(70 - Tw) + (7S - 75vv)] + —-[3(70 - Tw) + (7* - 7^)]
+ ^-[3(7* - 70) + (7^ - 7*)] + ^-[3(70 - 7S) + (Tw - Tsw)], (2.5-7)
where we have introduced the short-cut notation Q_i = q_i − H(T_i − T̃_i) from the Robin
BC, in which both q and T̃ are assumed variable and represented (as before) via
interpolation using the bilinear basis functions.
Since we now know (or at least suspect) that the proper averaging coefficient is the
lumped mass matrix on Γ_N, namely (h_1 + h_2)/2, we divide by this quantity and
rearrange the result to see the CVFEM ODE + NBC at Γ_N:
/
64
1 (375 + Tsw) + 6(370 + fw) + t-^tOTn + tNW)
hi +h2
hi + hi
+
/
2h\
24 h\+h2
'u0Ts + usT0'
/' us + usw To + Tw uq + uw Ts + Tsw
+ usTs -
7
+
+
12
/
2 2 2
us + uSw Ts + TSw
2 2 .
uq + uw Tq + Tw
uoTo -
2h2
24 hi + h2
2 2
(UqTn + M/V^O
f UN + M/vw 7\) + Tw Uq + Uw TN + 7/vw
+ "^^,
Af
+
/
24 h\ +h2
2 2 ' 2
un + "a'W Tn + ^w
[similar terms in v and T]
kI
~2
1
8
3(TN — Tq) + (TNw — Tw) 3(70 — Ts) + (T^w — Tsw)
2hi
hi + h2
Ah-,
QS-K
Ah\
Ts — Tsw
I
+ 6 Go - *:
To-T
w
I
+
2h,
Qn - k
TN-T
NW
I
= 0, (2.5-8)
h\ -\-h2
which is in a simpler form but is still rather complicated. But the main result is the
same as it was for the GFEM in (2.3-30); namely, all terms but the last are (in effect)
multiplied by l and will decrease in size with mesh refinement. So, analogous to (2.3-31),
we simplify the above to the case of constant velocity on a uniform mesh; the result is
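$$
\begin{aligned}
&\frac{l}{2}\Bigg[\dot T_0 + \frac{u}{8}\left(\frac{T_S - T_{SW}}{l} + 6\,\frac{T_0 - T_W}{l} + \frac{T_N - T_{NW}}{l}\right)
 + \frac{v}{4}\,\frac{3(T_N - T_S) + (T_{NW} - T_{SW})}{2h} \\
&\qquad - \frac{K}{4}\left(\frac{T_{NW} - 2T_W + T_{SW}}{h^2} + 3\,\frac{T_N - 2T_0 + T_S}{h^2}\right)\Bigg] \\
&+ \frac18\Bigg\{\left[K\frac{T_S - T_{SW}}{l} + H(T_S - \tilde T_S) - q_S\right]
 + \left[K\frac{T_N - T_{NW}}{l} + H(T_N - \tilde T_N) - q_N\right] \\
&\qquad + 6\left[K\frac{T_0 - T_W}{l} + H(T_0 - \tilde T_0) - q_0\right]\Bigg\} = 0,
\end{aligned}
\qquad (2.5\text{-}9)
$$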
where we also lumped the mass for variety. Equation (2.5-9) appears to converge to
(2.3-3) with mesh refinement—as desired.
So again, CVFEM is similar to GFEM; only the averaging coefficients differ. (Probably
the Taylor series definition of local error also differs, but probably not by very much; its
inherent irrelevance stops us from doing the work.)
c. A boundary corner
For completeness, and to assure the absence of surprises, we present the CVFEM at
node 0 in Figure 2.5-4.
The equation below, derived from (2.2-38), has been divided by ∫_Γ ψ_0 = (l + h)/2, as
was the case for GFEM in (2.3-33):
Ih
9T0 + 3fs + 37V + t
sw
2(1 + h)
16
+
h
12(1+ h)
+ 2 ( u0Ts + usTq
7 uqTq -
Uq + UW Tq + T
w
2 2 )
uq + uw Ts + TSw us + uSw To + Tw
+ ( usTs -
I
+
2 2
us + usw Ts + TswN
7 ^0r0
12(1+ h)
+ 2 ( v0Tw + vwT0 -
_ vq + vs Tq + Ts \
2 ' 2 J
vq + vs Tw + Ts vw + vSw T0 + Ts'
+ [vwTw
1 f 2h
+
+ 6
8 U+^
h
2 2
vw + vsw Tw + ^w \
2 * 2 J
7^5 _ 7"5W
AT-
/
+ h(ts- fs)
qs
l + h
To — Tw , /
K : h
/
l+h
J0 . Ts + H (t0 - f q) -
h
go
+
21
l+h
Tw — T sw
k h H(TW - Tw) ~ qw
h
= 0,
(2.5-10)
which is simultaneously the ODE for node 0, an energy balance over the control volume
Ω_0, and (above all) an NBC for the CVFEM.
Fig. 2.5-4 A boundary corner with control volume.
The lumped-mass, constant-velocity version of this equation converts the storage term
and advection terms to
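$$
\frac{lh}{2(l + h)}\left\{\dot T_0 + \frac{u}{4}\left[\frac{3(T_0 - T_W)}{l} + \frac{T_S - T_{SW}}{l}\right]
+ \frac{v}{4}\left[\frac{3(T_0 - T_S)}{h} + \frac{T_W - T_{SW}}{h}\right]\right\},
$$
terms that diminish with mesh refinement owing to the coefficient lh.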
Finally, in either the variable or constant coefficient case, the asymptotic result is
simply
dT
dx
h
dT / ~\
q
+ i
dT
Ty+H{T
(t-t\
q
= 0(l\h2),
the same as the GFEM and the one-point quadrature results; it is a linear combination of
the x-direction and y-direction Robin BC's at the corner.
A final remark:
If in the above CVFEM equations the mass is lumped, then the results would be closer
to what is more often seen in the literature—a rather sad circumstance, in that lumping
loses what the CV method had gained, i.e., local conservation.
d. OBC's
Little need be said here except perhaps that the CVFEM is close enough to the GFEM at
outflow points that the technique that works well there should also do so here; namely,
set q = H = 0 in the Robin BC and let the code 'do its thing'; the results should usually
be acceptable (sometimes even good) for any value of the Peclet number.
e. A nine-node CVFEM
Finally, to emphasize that the finite volume method ('our' way, at least) is inherently a
low-order method, we present the results of an analysis using higher-order (biquadratic)
approximation. It turns out that since the test functions (piecewise-constants) are (still)
low-order, the results are also low-order (second-), so that all one can get from higher-
order basis functions is an expensive second-order method.
The 4-patch of nine-node elements in Figure 2.5-5 below is used to both define the
control 'volumes' and to facilitate the discussion/analysis—the 'gaps' being shown for
clarity only. Shown are the CV's for nodes 0, E, N, and NE, three of which we shall
pursue (node E and node N are related by some obvious symmetries). A direct application
Fig. 2.5-5 A 4-patch of biquadratic elements with four control volumes.
SOME NON-GALERKIN RESULTS
121
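of (2.2-38) to node 0 gives (the elements are equal rectangles, all coefficients are constant,
and S = 0 for simplicity), in FDM 'format' (divide the CVFEM equation by $M_0 = A_0 = lh/4 = \Delta x\,\Delta y$):
$$
\begin{aligned}
&\frac{1}{576}\Big[256\,\dot T_0 + 80(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + 25(\dot T_{NE} + \dot T_{NW} + \dot T_{SE} + \dot T_{SW}) \\
&\qquad - 16(\dot T_{NN} + \dot T_{SS} + \dot T_{EE} + \dot T_{WW}) \\
&\qquad - 5(\dot T_{NEE} + \dot T_{NNE} + \dot T_{NNW} + \dot T_{NWW} + \dot T_{SWW} + \dot T_{SSW} + \dot T_{SSE} + \dot T_{SEE}) \\
&\qquad + (\dot T_{NNEE} + \dot T_{NNWW} + \dot T_{SSWW} + \dot T_{SSEE})\Big] \\
&+ u\bigg\{\frac{1}{16}\,\frac{-(T_{SSE} - T_{SSW}) + 5(T_{SE} - T_{SW}) + 16(T_E - T_W) + 5(T_{NE} - T_{NW}) - (T_{NNE} - T_{NNW})}{l} \\
&\qquad - \frac{1}{48}\,\frac{-(T_{SSEE} - T_{SSWW}) + 5(T_{SEE} - T_{SWW}) + 16(T_{EE} - T_{WW}) + 5(T_{NEE} - T_{NWW}) - (T_{NNEE} - T_{NNWW})}{2l}\bigg\} \\
&+ v\bigg\{\frac{1}{16}\,\frac{-(T_{NEE} - T_{SEE}) + 5(T_{NE} - T_{SE}) + 16(T_N - T_S) + 5(T_{NW} - T_{SW}) - (T_{NWW} - T_{SWW})}{h} \\
&\qquad - \frac{1}{48}\,\frac{-(T_{NNEE} - T_{SSEE}) + 5(T_{NNE} - T_{SSE}) + 16(T_{NN} - T_{SS}) + 5(T_{NNW} - T_{SSW}) - (T_{NNWW} - T_{SSWW})}{2h}\bigg\} \\
&= \frac{\kappa}{24}\bigg\{\frac{-(T_{SSE} - 2T_{SS} + T_{SSW}) + 5(T_{SE} - 2T_S + T_{SW}) + 16(T_E - 2T_0 + T_W) + 5(T_{NE} - 2T_N + T_{NW}) - (T_{NNE} - 2T_{NN} + T_{NNW})}{\Delta x^2} \\
&\qquad + \frac{-(T_{NEE} - 2T_{EE} + T_{SEE}) + 5(T_{NE} - 2T_E + T_{SE}) + 16(T_N - 2T_0 + T_S) + 5(T_{NW} - 2T_W + T_{SW}) - (T_{NWW} - 2T_{WW} + T_{SWW})}{\Delta y^2}\bigg\},
\end{aligned}
$$
(2.5-11)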
which is to be compared with the GFEM analog, (2.3-48). It is interesting that only the
nearest neighbors are present in the diffusion terms. The 1D version of this result is, for
the x-direction,
$$\frac{-\dot T_{WW} + 5\dot T_W + 16\dot T_0 + 5\dot T_E - \dot T_{EE}}{24} + u\left(\frac{3}{2}\,\frac{T_E - T_W}{l} - \frac{1}{2}\,\frac{T_{EE} - T_{WW}}{2l}\right) = \kappa\,\frac{T_E - 2T_0 + T_W}{\Delta x^2}.$$
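The coefficients of this 1D corner-node equation can be checked independently. The following is a minimal sketch, not taken from the text: it assumes the standard three-node quadratic Lagrange basis on an element of length l (local coordinate xi = x/l) and a control volume that extends a quarter-element to either side of the corner node, which is consistent with a control-volume length of l/2. All names are illustrative.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

# Quadratic Lagrange basis on one element, xi = x/l in [0, 1];
# nodes: corner (xi = 0), mid-side (xi = 1/2), far corner (xi = 1).
xi = P([0.0, 1.0])
N_corner = (1 - xi) * (1 - 2 * xi)
N_mid = 4 * xi * (1 - xi)
N_far = xi * (2 * xi - 1)
basis = [N_corner, N_mid, N_far]

def cv_integral(p, a, b):
    """Integral of polynomial p over [a, b], in units of l."""
    q = p.integ()
    return q(b) - q(a)

# Half of the corner-node control volume lies in this element: xi in [0, 1/4].
# Dividing by the control-volume length (l/2) gives the storage weights.
half = np.array([cv_integral(p, 0.0, 0.25) for p in basis]) / 0.5
mass = np.array([half[2], half[1], 2 * half[0], half[1], half[2]])  # WW, W, 0, E, EE

# Interpolated value and slope on the control-volume face at xi = 1/4
face_value = np.array([p(0.25) for p in basis])          # weights of T_0, T_E, T_EE
face_slope = np.array([p.deriv()(0.25) for p in basis])  # in units of 1/l

print(mass * 24)       # storage weights times 24
print(face_value * 8)  # face interpolation weights times 8
print(face_slope)      # face slope weights
```

Under these assumptions the storage weights come out as (-1, 5, 16, 5, -1)/24. The face slope involves only the corner and mid-side nodes, which is why only nearest neighbors appear in the diffusion term, and differencing the east and west face values over the length l/2 gives the 3/2 and 1/2 advection coefficients.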
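Next we present a typical 2-patch result, using Figure 2.3-6:
$$
\begin{aligned}
&\frac{1}{576}\Big[352\,\dot T_0 + 110(\dot T_E + \dot T_W) + 16(\dot T_N + \dot T_S) + 5(\dot T_{NE} + \dot T_{NW} + \dot T_{SE} + \dot T_{SW}) \\
&\qquad - 22(\dot T_{EE} + \dot T_{WW}) - (\dot T_{NEE} + \dot T_{NWW} + \dot T_{SWW} + \dot T_{SEE})\Big] \\
&+ u\bigg\{\frac{1}{16}\,\frac{T_{SE} - T_{SW} + 22(T_E - T_W) + T_{NE} - T_{NW}}{l} - \frac{1}{48}\,\frac{T_{SEE} - T_{SWW} + 22(T_{EE} - T_{WW}) + T_{NEE} - T_{NWW}}{2l}\bigg\} \\
&+ \frac{v}{24}\,\frac{-(T_{NWW} - T_{SWW}) + 5(T_{NW} - T_{SW}) + 16(T_N - T_S) + 5(T_{NE} - T_{SE}) - (T_{NEE} - T_{SEE})}{h} \\
&= \frac{\kappa}{24}\bigg\{\frac{(T_{SW} - 2T_S + T_{SE}) + 22(T_W - 2T_0 + T_E) + (T_{NW} - 2T_N + T_{NE})}{\Delta x^2} \\
&\qquad + \frac{5(T_{NW} - 2T_W + T_{SW}) - (T_{NWW} - 2T_{WW} + T_{SWW}) + 16(T_N - 2T_0 + T_S) + 5(T_{NE} - 2T_E + T_{SE}) - (T_{NEE} - 2T_{EE} + T_{SEE})}{\Delta y^2}\bigg\},
\end{aligned}
$$
(2.5-12)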
which is the CVFEM version of (2.3-46) and whose 1D version in the x-direction is the
same as that above, and in the y-direction is
$$\frac{1}{24}\left(\dot T_N + 22\dot T_0 + \dot T_S\right) + v\,\frac{T_N - T_S}{h} = \kappa\,\frac{T_N - 2T_0 + T_S}{\Delta y^2}.$$
Finally, the central node; referring to Figure 2.3-7, gives
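$$
\begin{aligned}
&\frac{1}{576}\Big[484\,\dot T_0 + 22(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + (\dot T_{NE} + \dot T_{NW} + \dot T_{SE} + \dot T_{SW})\Big] \\
&+ \frac{u}{24}\,\frac{T_{SE} - T_{SW} + 22(T_E - T_W) + T_{NE} - T_{NW}}{l} + \frac{v}{24}\,\frac{T_{NE} - T_{SE} + 22(T_N - T_S) + T_{NW} - T_{SW}}{h} \\
&= \frac{\kappa}{24}\bigg\{\frac{T_{SE} - 2T_S + T_{SW} + 22(T_E - 2T_0 + T_W) + T_{NE} - 2T_N + T_{NW}}{\Delta x^2} \\
&\qquad + \frac{T_{NE} - 2T_E + T_{SE} + 22(T_N - 2T_0 + T_S) + T_{NW} - 2T_W + T_{SW}}{\Delta y^2}\bigg\},
\end{aligned}
$$
(2.5-13)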
whose 1D version is
$$\frac{\dot T_W + 22\dot T_0 + \dot T_E}{24} + u\,\frac{T_E - T_W}{l} = \kappa\,\frac{T_E - 2T_0 + T_W}{\Delta x^2}$$
(2.5-14)
(2.5-15)
in the x-direction. [Note that both the equation for the central node and its interpretation
are rather different from that presented in Raw et al. (1984).]
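The storage coefficients in (2.5-11), (2.5-12), and the central-node equation appear to be tensor products of two 1D weight sets: (-1, 5, 16, 5, -1)/24 for a direction in which node 0 is shared by two elements, and (1, 22, 1)/24 for a direction in which it is interior to one element. The following minimal sketch, with illustrative names, rebuilds the three 2D storage stencils from these 1D sets so that they can be compared with the coefficients quoted above.

```python
import numpy as np

# 1D storage weights (times 24) over a five-node line WW, W, 0, E, EE
corner = np.array([-1, 5, 16, 5, -1])   # node 0 shared by two elements in this direction
middle = np.array([0, 1, 22, 1, 0])     # node 0 interior to one element in this direction

def storage_stencil(w_y, w_x):
    """5 x 5 storage stencil (times 576); rows index y, columns index x."""
    return np.outer(w_y, w_x)

corner_node = storage_stencil(corner, corner)    # compare with (2.5-11)
side_node = storage_stencil(middle, corner)      # compare with (2.5-12)
central_node = storage_stencil(middle, middle)   # compare with the central-node equation

for name, s in [("corner", corner_node), ("side", side_node), ("central", central_node)]:
    print(name, "center:", s[2, 2], "sum:", s.sum())
```

Each stencil sums to 576, so a spatially constant rate of change is reproduced exactly after division by 576.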
So there they are; a highly coupled, very complex, second-order accurate
approximation to the scalar transport equation—lots of work for little gain, an example of a
cost-ineffective method. (Later, when we present 1D phase speed results for pure advection,
we will point out that this quadratic CVFEM is less accurate than even linear
GFEM.)
A higher-order CVFEM could perhaps be generated as follows: (i) replace the four
contiguous subdomains in the sketch by the single subdomain obtained by omitting
the internal boundaries, and (ii) apply a discontinuous, bilinear test function over this
new control volume. The four parameters of the test function 'balance' the four nodal
unknowns. In practice, this would generate four independent equations by making the error
orthogonal to the following four functions: 1, x, y, and xy, with a resulting approximation
that is just as complicated as GFEM—but probably less accurate.
If, however, one completely abandons the finite element methodology and turns instead
to a finite difference methodology, it is possible to generate higher-order finite volume
methods—starting from (2.2-37). For example, Lilek and Peric (1995) generated a fourth-
order CV-FDM by examining each term separately and devising higher-order
approximations term-by-term. These authors also support our 'philosophy' that no upwinding
is required on properly designed grids—although they do also advocate the so-called
'deferred-correction' approach, which utilizes some upwinding 'technology' during the
iterations toward a non-upwinded converged result. Perhaps the FEM community could
also benefit from such an approach.
2.5.4 The Group FEM/Product Approximation
We present one other non-Galerkin technique in this section; an ad hoc-but-quite-inexpensive
method for dealing with product non-linearities. An early work in this area
is that by Swartz and Wendroff (1969); more recent is Christie et al. (1981), Abia and
Sanz-Serna (1984), and Fletcher (1991). In the last reference, Fletcher, a pioneer and
strong advocate of the approximation, devotes much space to it and argues for its cost-
effectiveness. [We remark, parenthetically, that R. Taylor suggested back in 1975 that we
(at LLNL) try it for our non-linear terms—a suggestion we briefly tested on Hamel flow;
Newton's method behaved less robustly than it did for GFEM.] Here we present it briefly
and with little discussion; indeed, like CVFEM, we have no personal experience (i.e.,
computing) with it.
The group FEM must begin, like the CVFEM, with the divergence form of the equation,
because the whole idea rests on approximating the product, uT, in a simple but hopefully
cost-effective way. It is simply this: instead of multiplying the two variables u and T
together after expressing each in the basis set, do it before expanding (!); i.e., uT is
approximated via
$$uT = \sum_{j=1}^{N} (uT)_j\,\phi_j,$$
(2.5-16)
a trick that essentially linearizes the advection term. The resulting advection approximation
via GFEM is equivalent to that with a constant velocity; i.e., use (2.2-9) with u = constant
to form the advection matrix, but replace the final result, $T_j$, at the nodes by $(uT)_j$; i.e., the
velocity can vary.
Thus, we begin by returning to (2.3-23) and re-writing the advection term for the case
of constant velocity on the four-patch; it simplifies to
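$$
\begin{aligned}
\mathbf{u}\cdot\nabla_h T\big|_0 ={}& \frac{u}{6}\left(\frac{2h_1}{h_1+h_2}\,\frac{T_{NE} - T_{NW}}{l_1+l_2} + 4\,\frac{T_E - T_W}{l_1+l_2} + \frac{2h_2}{h_1+h_2}\,\frac{T_{SE} - T_{SW}}{l_1+l_2}\right) \\
&+ \frac{v}{6}\left(\frac{2l_1}{l_1+l_2}\,\frac{T_{NW} - T_{SW}}{h_1+h_2} + 4\,\frac{T_N - T_S}{h_1+h_2} + \frac{2l_2}{l_1+l_2}\,\frac{T_{NE} - T_{SE}}{h_1+h_2}\right),
\end{aligned}
$$
(2.5-17)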
another averaging of centered differences.
The group FEM for the variable velocity case is now easily obtained in the manner
mentioned above; i.e., the (entire) set of advection terms in (2.3-23) is simply replaced by
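$$
\begin{aligned}
\nabla_h\cdot(\mathbf{u}T)\big|_0 ={}& \frac{1}{6}\left(\frac{2h_1}{h_1+h_2}\,\frac{u_{NE}T_{NE} - u_{NW}T_{NW}}{l_1+l_2} + 4\,\frac{u_E T_E - u_W T_W}{l_1+l_2} + \frac{2h_2}{h_1+h_2}\,\frac{u_{SE}T_{SE} - u_{SW}T_{SW}}{l_1+l_2}\right) \\
&+ \frac{1}{6}\left(\frac{2l_1}{l_1+l_2}\,\frac{v_{NW}T_{NW} - v_{SW}T_{SW}}{h_1+h_2} + 4\,\frac{v_N T_N - v_S T_S}{h_1+h_2} + \frac{2l_2}{l_1+l_2}\,\frac{v_{NE}T_{NE} - v_{SE}T_{SE}}{h_1+h_2}\right),
\end{aligned}
$$
(2.5-18)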
a remarkable simplification, to be sure, and one that could easily be immediately converted
to the CVFEM group approximation to advection; just replace the (1 4 1)/6 weighting
above by (1 6 1)/8, and then place the result in the CVFEM 4-patch equation, (2.5-4),
after deleting the (many) advection terms from that equation.
The cost reduction of the product approximation appears to be very significant. It
also ensures global conservation of T (but not $T^2$). What else can we say, except that
Fletcher advocates it quite strongly and that we (negligently?) have not tested it? It seems
worthy of further careful exploration, especially in the context of the NS equations,
where the product non-linearity is present in large measure. A negative opinion on the
approximation, which may or may not be too relevant for our particular advection non-
linearity, also exists, however: Abia and Sanz-Serna (1984) assert that in 2D and 3D, the
product approximation may be more costly than standard GFEM.
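A 1D analog may make the product approximation and its global conservation of T concrete. The following is a minimal sketch under stated assumptions, not the 2D formula above: a uniform, periodic mesh of linear elements, on which the Galerkin advection operator reduces to a centered difference, so the group approximation to d(uT)/dx at node j is the centered difference of the nodal products. The velocity and temperature fields are hypothetical.

```python
import numpy as np

def group_advection(u, T, h):
    """Group (product) approximation to d(uT)/dx at the nodes of a uniform,
    periodic 1D mesh: centered difference of the nodal products (uT)_j."""
    f = u * T
    return (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * h)

n = 64
h = 1.0 / n
x = h * np.arange(n)
u = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)     # hypothetical variable velocity
T = np.exp(-100.0 * (x - 0.5) ** 2)         # hypothetical transported field

adv = group_advection(u, T, h)
total_T_rate = -h * adv.sum()               # rate of change of sum(h * T_j) from advection
print(total_T_rate)
```

The nodal sum telescopes, so the advective contribution to the rate of change of the total T vanishes to rounding error, whatever the velocity field.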
2.5.5 The Petrov-Galerkin FEM
To conclude this section, we mention the existence of a large class of 'non-Galerkin'
methods, all of which add artificial diffusion in one way or another to control/damp
wiggles, called Petrov-Galerkin methods, wherein the test functions are different than
the basis functions. For the interested reader, we cite but two recent references, the first
of which seems to present a pretty useful historical account; i.e., see Goldschmit and
Dvorkin (1994) for a Petrov-Galerkin 'family tree.' The second reference is a new book
by Bill Morton—an acknowledged expert numerical analyst with broad experience in
DISPERSION, DISSIPATION, PHASE SPEED, GROUP VELOCITY, MESH DESIGN 125
FDM, FEM, and FVM. In Morton (1996) is much useful analysis of the AD equation
(and its approximate solutions) that complements what we present here—including GFEM
error analysis for both linear/bilinear and quadratic/biquadratic on triangles and
quadrilaterals. But the bulk (main thrust) of the book is nearly 'orthogonal' to ours: Petrov-Galerkin
finite volume methods for compressible flow. Nevertheless, especially for the reader
interested in learning about this approach to CFD, this text is 'required reading'. In the next
Section (2.6-1) we discuss some CFD 'philosophy', but mention here that the above
reference generally reflects a philosophy that is different from ours.
We give short shrift to this large 'branch' of CFD for the following, probably somewhat
naive (and not quite 100% true) reasons: we have no (or little) experience with them
because we (usually) see no need for them, causing us to have little interest in them.
(Those who believe we are 'wrong' in this regard would probably assert that we never
solved any 'really hard' problems. ... Our only defence against such a probably-valid
assertion is this: probably neither have you really solved the 'hard' problems that you
have in mind.) If the reader detects that we are being somewhat contentious here, please
read on! (If not, please wake up.)
2.6 DISPERSION, DISSIPATION, PHASE SPEED, GROUP
VELOCITY, MESH DESIGN, AND— WIGGLES
2.6.1 Qualitative Discussion
a. Wiggles
CFD is not a 'hard' science in that some of its aspects (such as 'conservation form') are
often more religious than rational in nature. We are about to discuss one of these aspects.
Wiggles—the Nemesis of CFD. Perhaps no other single difficulty has generated more
frustration and caused more effort than the rapid, high-frequency—typically (but
definitely not always) node-to-node (or time-step to time-step)—oscillations that come out of
the computer and pollute the putative 'solution' than that called, most simply, wiggles.
Perhaps no other aspect of CFD has so divided the world into two basic camps: those
who hate/fear the wiggles so much that they use only methods that never permit their
occurrence, and those who, while not exactly embracing them, believe that there is a
message in the wiggles and that there is more to good CFD analyses than simply being
wiggle-free. And—in many cases—never the twain shall meet; there is too large a gap
between the two 'religions.' Indeed, some zealots in the teaching profession—especially
those who belong to the smooth-is-good, wiggle-free religion—seem to generate
'products' (typically Ph.D. students) who have never had any first-hand experience with wiggles
and thus leave the university wearing blinders—an unfortunate situation in our opinion.
(Our 'religion' already shows through.) The price that is too often paid by those who
a priori suppress wiggles by their choice of a numerical method is simply that they are
often solving the wrong problem; i.e., the effectively/numerically much-reduced Peclet
or Reynolds number leads these analysts to believe that they are really solving some
tough problems when, in fact, they are not (a virtual reality of CFD) because they have
changed the problem. The wiggly-camp, on the other hand, often has much difficulty with
tough problems wherein, in the worst case, they can get no solution at all. This camp, in
which we are fairly firmly (but perhaps not permanently) entrenched, believes that 'The
wiggles are telling you something' and try to use wiggle signals as a guide to better mesh
design (where possible) or, in the worst of cases, admit that the stated problem—truth
be told—is just too difficult (for the current generation of computers). Consistent with
our religious belief, we shall not do the disservice of citing any but a bare minimum of
publications from the 'other side'; e.g., as mentioned above, a good history is available
in Goldschmit and Dvorkin (1994). On the other hand, sometimes the wiggle signal can
be too strong—forcing a fine mesh upon the analyst in regions of the domain where
high accuracy is known (or assumed) to be not important. Thus, it is not hard to see
why there are two camps, because one must basically make a Hobson's choice: results
(when obtainable!) that may display spurious wiggles or results that may be deceptively
smooth. In between these two 'extremist' philosophies are those who seriously try to
reduce the wiggles—not always a priori—in ways that are not otherwise too deleterious,
thus suggesting at least some hope for a 'middle ground.' We conclude this metaphilo-
sophical introduction with a quotation from another field of 'science' that also abounds
with religious zealots—stratospheric ozone depletion—because it probably applies here
as well: 'The debate has become quasi-theological, with each side basing its arguments
on faith in its own imperfect calculations'—Singer (1994). Holy ozone, Batman!
While this divisive issue transcends the boundaries of FEM to encompass FDM,
FVM, and spectral methods, we shall naturally focus mostly on FEM, in which our
belief—generally—is reflected in our new acronym,
GFEMIA: Galerkin finite element method intelligently applied.
In partial support of this belief, we shall show below how a thousand nodes, badly placed,
will lead to a wiggly solution, but that just a single node, intelligently placed, can give
a really good solution—an example that will also obviate much of the typical related
GFEM error analyses. The GFEMIA requires, besides a lower bound on the analyst's
IQ, not much more than common sense—and, of course, an appreciation for some of the
subtleties of both fluid mechanics and the numerical methods used to describe it. It can
yield results that are unbelievably more accurate than most error analyses can/would ever
predict.
An interesting wiggle-opinion from the spectral method side of the house is
this: 'A very important attribute of spectral methods is their self-diagnosis property.
Inadequate grid resolution is reflected in excessive values of high-order expansion
coefficients.'—Rogallo and Moin (1984). Hopefully, it is obvious that 'excessive values
of high-order expansion coefficients' is just another way to say 'wiggles,' and we
now borrow their useful terminology: wiggles are a self-diagnosis property. To further
emphasize this property, this time in the other direction, which again introduces some
nice terminology, we quote from Gropp and Keyes (1992): 'Complaints that heavily
upwinded discretizations conceal their own errors are common in the literature ...' So,
another way to state the extremes of this dilemma is this: choose a method that either
has a self-diagnosis property (and thus often 'makes waves') or one that conceals its own
errors (and is wiggle-free).
While not everyone understands the wiggles—or their causes—it is a safe bet to assert
that everyone does recognize them—and nearly everyone has an opinion about them.
(Those who have no opinion have probably been brainwashed early-on and have never
seen them from their 'smooth-is-beautiful' codes. Smooth may be 'beautiful', but it can
also be very wrong.) While not as bad as turbulence, or pornography, or even art, in each
of which you may not be able to define it but believe that you recognize it when you
see it, there really is no pure and simple always-applicable definition of wiggles except
perhaps this (which may require some knowledge of physics): wiggles are non-physical
oscillations.
What does cause the wiggles? One principal cause of spatial wiggles is the
specification of a problem in which the dependent variable is forced to suffer/experience
gradients (rates of change) in the flow direction that are 'too large' to be 'captured' by
the mesh—with the resulting solution (when a solution can be attained, which is basically
'always' for the linear problems now under discussion) that 'radiates' $2\Delta x$ (and other
short-wave) oscillations away from the 'source' (the non-resolved 'boundary layer'),
typically in the upstream direction. But not always upstream—especially on grids that are
regular in the sense of having the node points more or less aligned with the direction
of the coordinate axes. We believe—and will act on this belief in what follows—that
one good way to understand wiggles is to study the eigenproblems associated with the
spatial operators (matrices) in that wiggles are excitations of the high-frequency (wave
number) eigenmodes; i.e., the amplitude coefficients of one or more of the oscillatory
eigenvectors that describe the GFEM (or other method) for advection-dominated flows
become relatively 'too large.' In the above category is one of the most important wiggle
makers: flow toward a Dirichlet (hard) BC that generates a boundary layer that is 'too
thin' relative to the mesh employed; i.e., the boundary layer thickness is smaller than
the (normal) distance from the boundary to the first node point, typically quantified by
$\kappa/u_n < \Delta x_n/2$, where $u_n$ is the component of the velocity at the node in question that
is directed toward the boundary. This inequality, which also reads $P = u_n\Delta x_n/(2\kappa) > 1$,
where P is called the grid Peclet number, makes wiggles—and if $P \gg 1$, the wiggles
become stronger and more widespread. The flow at large $Pe\ (= u_0 L/\kappa)$ toward the hard
BC can force the dependent/transported variable to undergo a large adjustment/change in
a very short distance, $O(\delta)$, where $\delta/L = 1/Pe$, so that transport by diffusion—which is
small away from $\Gamma$—'suddenly' and necessarily becomes large, to 'balance' the large
transport via advection, thus generating a steep gradient through the boundary layer. This
is a challenging and (in many cases) important physical problem in virtually all areas
of transport phenomena: momentum, heat, and mass transfer. An example of the
importance of the above situation, we surmise, may have been related to the jet engine failures
suffered by certain manufacturers twenty-some years ago when the wiggle-suppressing
numerical results failed to reveal overheating and a concomitant serious thermal stress
condition. ... Heavy-handed but 'robust' techniques can and do generate a
'What, me worry?'
type of attitude among otherwise good analysts/designers, and a perfect example of
the following statement that we attribute to J. Ferziger: 'The greatest disaster one can
encounter in computation is not instability or lack of convergence, but results that are
good enough to be believable but bad enough to cause trouble.' This is the euphoria of
CFD: a heavily 'damped' code that always delivers smooth, wiggle-free results. How does
a designer/analyst know how to make a good mesh if his/her code never makes wiggles?
If the answer to this question is, 'They don't', then perhaps it is more understandable why
so many clearly difficult problems have been ostensibly solved using wiggle-free methods
on really coarse grids. (We admit, here, to inexperience with good adaptive mesh methods,
which, in the best of all worlds, would generate an appropriate mesh for the unskilled
analyst—at least if employed in conjunction with 'honest,' not overly artificially-damped
methods.)
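The role of the grid Peclet number can be seen in the simplest possible setting. The following minimal sketch is an illustration under stated assumptions, not a computation from the text: steady 1D advection-diffusion on the unit interval with T(0) = 0 and a hard BC T(1) = 1 at the outflow end, a uniform mesh, and centered differences (which coincide with linear GFEM for this steady, constant-coefficient, source-free problem). The function name and parameter values are hypothetical.

```python
import numpy as np

def steady_ad_1d(n_elem, peclet):
    """Centered solution of u T' = kappa T'' on [0, 1] with T(0) = 0, T(1) = 1
    on a uniform mesh; peclet = u * L / kappa with L = 1."""
    h = 1.0 / n_elem
    P = peclet * h / 2.0                 # grid Peclet number u * h / (2 * kappa)
    n = n_elem - 1                       # number of interior unknowns
    A = np.zeros((n, n))
    b = np.zeros(n)
    for i in range(n):
        # Nodal equation: -(1 + P) T[j-1] + 2 T[j] - (1 - P) T[j+1] = 0
        A[i, i] = 2.0
        if i > 0:
            A[i, i - 1] = -(1.0 + P)
        if i < n - 1:
            A[i, i + 1] = -(1.0 - P)
    b[-1] = 1.0 - P                      # contribution of the Dirichlet value T(1) = 1
    T = np.linalg.solve(A, b)
    return P, np.concatenate(([0.0], T, [1.0]))

P_coarse, T_coarse = steady_ad_1d(10, 50.0)    # boundary layer unresolved
P_fine, T_fine = steady_ad_1d(100, 50.0)       # boundary layer resolved
print(P_coarse, T_coarse.min())
print(P_fine, T_fine.min())
```

With the same physical Peclet number, the coarse mesh (P = 2.5) produces node-to-node oscillations that run upstream from the outflow boundary, while the fine mesh (P = 0.25) gives a monotone profile.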
Two other wiggle makers, to be addressed below, that are worthy of mention in this
introduction are: advection of a wave-form that is too tough for the chosen mesh, and
transient diffusion at early time and close to a large local disturbance—such as a step
change in a Dirichlet BC or a discontinuous source term. The first of these is probably
far more important, prevalent, and obvious—and the second is more subtle and less
well appreciated. Returning to the former—a good description of a special but common
and important special case of which is the propagation of a (steep) front through the
domain—it may be the case that this situation is even more common than the 'hard BC'
case discussed above. In fact, a recent (and excellent) book has been devoted to this
single subject (Finlayson, 1992). But a moment's reflection will reveal a strong similarity
between these two wiggle makers—the key difference being that the moving front problem
cannot be 'solved' by simple local mesh refinement. Another cause of wiggles is poorly
resolved, or rough in general, IC's—a consequence of the fact that the GFEM mass
matrix generates an L2-best fit to the data (in this case, the data are the operations of the
advection and diffusion matrices on the initial condition vector).
What are the wiggles telling you? They are suggestive oscillations that are trying to
tell you that there may exist a serious deficiency in your mesh design—or in the problem
specifications if a too-sharp IC is used or a hard OBC is used when a soft one could or
should be used. They are saying, 'An important steep gradient—and a concomitant large
diffusive flux—exists that cannot be resolved ("captured") by the chosen mesh.' They
also tell you, usually (if you can translate the wiggle signal) where in your mesh a better
resolution is needed. If it is a thermal analysis, temperature wiggles are often telling you
where the local Nusselt number is very large and where to improve (refine) the mesh so
that the next solution might be wiggle-free and the high heat flux properly computed.
(An analogous momentum transfer problem, to be studied in the next chapter, is that of
obtaining proper lift, drag, and torque.)
To close this extended and somewhat philosophical introduction to the 'theory' of
wiggles, we point out a recent serious application of the same philosophy: in September
1993, the ASME's Journal of Fluids Engineering modified their editorial policy in a major
way such that smooth-but-lousy numerical results would/should be much more difficult to
slip past the referees and mislead the readers. In a somewhat less brash manner, the
International Journal for Numerical Methods in Fluids followed suit—as did the AIAA Journal.
Finally, two suggested readings for the novice interested in wiggles—whose full titles we
must present: (1) 'A Survey of Finite Differences of Opinion on Numerical Muddling
of the Incomprehensible Defective-Confusion Equation,' by B.P. Leonard (1979), and
(2) 'Don't Suppress the Wiggles—They're Telling You Something!' by P.M. Gresho and
R. Lee (1979); both in the ASME publication, Finite Element Methods for Convection-
Dominated Flow, T.J.R. Hughes, Ed. (1979). [The latter appeared later—and longer—in
Gresho and Lee (1981).]
b. Dispersion
To open this discussion, we appeal to authority for the definition (Whitham, 1974, p. 3),
'A linear dispersive system is any system which admits solutions of the form
$$\varphi = a\cos(kx - \omega t),$$
(2.6-1)
where the frequency $\omega$ is a definite real function of the wavenumber k (with wavelength
$\lambda = 2\pi/k$), and the function $\omega(k)$ is determined by the particular system. The phase
speed is then $\omega(k)/k$, and the waves are said to be "dispersive" if this phase speed is
not a constant but depends on k. The term refers to the fact that a more general solution
will consist of the superposition of several modes like (2.6-1) with different k. If the
phase speed $\omega/k$ is not the same for all k, that is $\omega \neq c_0 k$ where $c_0$ is some constant, the
modes with different k will propagate at different speeds; they will disperse.' $\omega/2\pi$ is the
frequency at which wave crests/troughs pass a fixed point.
The first important remark for our 'waves' is this: the continuum solutions are non-
dispersive. And the second is this: all grid-based numerical approximations are dispersive.
Thus, all of our numerical dispersion is spurious—it is pure error. [N.B. In some fields, such
as groundwater flow, the term dispersion has a different meaning than the above wave-based
definition; it refers to flow processes in which small-scale velocity variations cause enhanced
mixing. See, for example, Shubin and Bell (1984) for a description of porous medium
dispersion, and Cussler (1984) for turbulence-induced dispersion—both cases being models of the
true physics. Also, some writers confound the term dispersion with dissipation, which we
will discuss next.] It is ironic (and paradoxical) that the ostensible 'simplifications'
engendered by making the incompressible flow assumption/restriction that should surely preclude
most internal wave phenomena (except in Boussinesq fluids—covered in Volume 2)—and
does in the PDE's—nevertheless require the (learned) analyst to become familiar with a
fairly significant portion of 'wave theory' simply because of the deficiencies associated with
the modeling of (non-dissipative!) advection. The simplest case of dispersion (or dispersion
error, an interchangeable term in this text) is when an initial wave-form (e.g., a Gaussian,
or a triangular 'pulse') is placed on the grid, and the pure advection solution sought; it
will, if followed long enough in time, break up into a trail of wiggles. Results—and some
theory—follow later.
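As a small preview of numerical dispersion, the following minimal sketch evaluates the ratio of numerical to exact phase speed for two semi-discrete approximations of 1D pure advection on a uniform mesh. It assumes exact time integration, centered differences for the lumped-mass case, and the (1, 4, 1)/6 consistent mass matrix of linear elements for the GFEM case; substituting a Fourier mode into each nodal equation gives the expressions coded below. The function name is illustrative.

```python
import numpy as np

def phase_speed_ratio(kh, mass="consistent"):
    """c_numerical / u for semi-discrete centered advection on a uniform 1D mesh,
    as a function of the dimensionless wave number k*h (0 < k*h <= pi)."""
    kh = np.asarray(kh, dtype=float)
    centered = np.sin(kh) / kh                  # lumped mass (centered differences)
    if mass == "lumped":
        return centered
    return 3.0 * centered / (2.0 + np.cos(kh))  # consistent (1, 4, 1)/6 mass

for waves in (16, 8, 4, 2):                     # wavelength in units of the mesh size
    kh = 2.0 * np.pi / waves
    print(waves, phase_speed_ratio(kh, "lumped"), phase_speed_ratio(kh, "consistent"))
```

Both ratios depend on the wave number, so both schemes are dispersive; long waves travel at nearly the correct speed, and in both cases the shortest resolvable wave (two mesh widths) does not propagate at all.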
c. Dissipation
A synonym for this term, in the context that we shall use it, is 'artificial dissipation,' and
another is 'artificial diffusion,' and, finally, some call it 'numerical diffusion.' Others mean
artificial dissipation when they use the term dispersion. C'est la vie. They all mean this:
when the pure advection equation—which, by definition (almost), is free of dissipation—is
solved by a numerical approximation method that reduces the amplitude and changes the
shape of the initial wave in a way analogous to a diffusional process, the method is said
to contain (or to suffer from) 'dissipation.' A dissipative scheme will monotonically and
erroneously decrease the 'energy' in the wave—the quadratic conservation property of
the advection equation (cf. Sections 2.1.4, 2.2.3, and 2.2.4) is lost.
Whereas it seems safe to say that all 'numerical methods designers' seek and covet
schemes that display as little dispersion as possible, it is definitely not the case that
they also want to minimize dissipation. And the reason is wiggle-related: dispersion
leads to/generates wiggles, and some people are simply allergic to wiggles. And,
adding numerical diffusion to an otherwise non-dissipative method will always reduce
the wiggles, and often eliminates them entirely. Hence the occurrence or intentional
introduction of artificial diffusion, although sometimes the intent is disguised by phrases
such as 'upwinded advection' approximations. [Pig-pen advection, a name attributed to
B. Spalding by Roache (1982), is also an appropriate description of 'upwinding.']
More later.
d. Phase speed
Although briefly introduced (in 1D) just above, a few more words may be helpful—and
will here be specifically applied to our problem: advection. If a sine wave with wave
number k (and wave vector k with direction—by definition—orthogonal to the wave
crests) is placed in a flow field with constant velocity, u, the phase speed (a scalar, c) is
the projection of u in the direction of k—see Figure 2.6-1:
$$c = \mathbf{u}\cdot\mathbf{k}/k = |\mathbf{u}|\cos(\theta - \beta),$$
(2.6-2)
where $k = |\mathbf{k}|$; it is the speed at which wave crests move past a stationary observer.
Clearly c is maximized when u and k are parallel. The apparent wavelength, to a stationary
observer, is $\lambda_a = 2\pi|\mathbf{u}|/\mathbf{k}\cdot\mathbf{u} = 2\pi|\mathbf{u}|/kc$, which is a minimum ($\lambda_a = \lambda = 2\pi/k$) when k
and u are parallel ($c = |\mathbf{u}|$) and is a maximum ($\lambda_a = \infty$) when k and u are perpendicular
(c = 0). Finally, since $\mathbf{k}/k$ is just the unit vector in the direction of k, c is independent
of the magnitude of k; thus, there is no dispersion. In 1D, of course, c = u, and the sine
wave is simply dragged along (translated) by the flow at speed u.
The above is a special case of the more general concept of phase velocity, which is the
speed of the wave train in the wave direction k, which is normal to the lines of constant
phase. It is shown in Figure 2.6-1 and given by Whitham (1974)
$$\mathbf{c} = \frac{\omega}{k}(\mathbf{k}/k) = \omega\mathbf{k}/k^2 = c\,\mathbf{k}/k = (\mathbf{u}\cdot\mathbf{k})\,\mathbf{k}/k^2,$$
(2.6-3)
giving $|\mathbf{c}| = c = \omega/k$, where $\omega$ is the temporal frequency of the wave and $2\pi/\omega = \tau$ is
its period, via a generalization of (2.6-1) to a plane wave in multi-dimensions,
$$\varphi = a\cos(\mathbf{k}\cdot\mathbf{x} - \omega t).$$
(2.6-4)
It is clear from (2.6-3) that the direction of the phase velocity is that of the wave number
vector. To obtain (2.6-2) from (2.6-3) and (2.6-4), simply seek a solution of the form
(2.6-4) to the advection equation,
$$\partial\varphi/\partial t + \mathbf{u}\cdot\nabla\varphi = 0.$$
(2.6-5)
Fig. 2.6-1 Plane waves in a fluid with constant velocity u.
The result is $\omega = \mathbf{u}\cdot\mathbf{k}$ and, using (2.6-3) and $c = |\mathbf{c}|$ yields the phase speed given in
(2.6-2). Finally, the phase velocity can also be written as
$$\mathbf{c} = \frac{\lambda}{\tau}(\mathbf{k}/k).$$
(2.6-6)
e. Group velocity
Whereas the study of wave motion for an isothermal incompressible fluid in the absence
of free surfaces is, or should be, a nearly vacuous subject, it is (unfortunately) far from
that when numerical approximations are employed. So we must, because of our imperfect
model of the PDE's, address certain aspects of wave motion—mainly to help to assess the
errors contained in the model. First, however, another appeal to authority: 'The theory
of group velocity is an essentially mathematical theory that has been developed over the
years with an eye on a great variety of spheres of application' —Lighthill (1965). A general
qualitative description of group velocity is that it is the velocity of the energy (amplitude
squared, basically) of the wave (or wave packet, to which it is generally applied). Rather
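than simply dividing the wave's frequency by its wave number, à la (2.6-3) for phase
speed, the group velocity is the vector that obtains by differentiating $\omega(\mathbf{k})$ with respect
to the components of the wave number vector, k:
$$\mathbf{G}(\mathbf{k}) = \nabla_{\mathbf{k}}\,\omega = \left(\frac{\partial\omega}{\partial k_1},\ \frac{\partial\omega}{\partial k_2}\right)\ \text{in 2D}.$$
(2.6-7)
Now for our simple advection equation, $\omega = \mathbf{u}\cdot\mathbf{k}$, and thus
$$\mathbf{G} = \mathbf{u};$$
(2.6-8)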
the group velocity is merely the fluid's velocity and is independent of k: there is no
dispersion in the continuous case, and the theory of wave motion (for an isothermal
incompressible fluid) need not be invoked—at least in the absence of free surfaces or
stratified fluids (see Volume 2).
If one were to try to examine the 'components' of the phase velocity, (2.6-3),
in an 'analogous' way as above for group velocity, via $c_1 = \omega/k_1$ and $c_2 = \omega/k_2$,
then one would find that these non-physical components do not 'transform' properly;
and, rather than satisfying $c^2 = c_1^2 + c_2^2$, it turns out that $c^{-2} = c_1^{-2} + c_2^{-2}$ because, in
fact, $c_1 = c/\cos\theta$ and $c_2 = c/\sin\theta$. These phase 'components' are geometrical, not
physical—whereas those of G are physical. (There are some slippery concepts in wave
motion.) The 'physical' components of c come from (2.6-3) via $\mathbf{k} = k(\mathbf{e}_1\cos\theta + \mathbf{e}_2\sin\theta)$
and $\mathbf{c} = c_1\mathbf{e}_1 + c_2\mathbf{e}_2$; i.e., $c_1 = c\cos\theta$ and $c_2 = c\sin\theta$ are the projections of c onto the
coordinate directions.
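The distinction between the 'geometrical' and 'physical' components of the phase velocity is easy to check numerically. The following minimal sketch uses hypothetical values for the velocity, the wave number magnitude, and the direction of k; it relies only on the continuum dispersion relation of pure advection (frequency equal to the dot product of u and k) and on the definitions of phase velocity and group velocity given above.

```python
import numpy as np

u = np.array([2.0, 1.0])                    # hypothetical constant velocity
theta = np.deg2rad(35.0)                    # hypothetical direction of k
k_mag = 3.0                                 # hypothetical wave number magnitude
k = k_mag * np.array([np.cos(theta), np.sin(theta)])

def omega(kvec):
    """Dispersion relation of the pure advection equation."""
    return u @ kvec

c = omega(k) / k_mag                        # phase speed
c_vec = omega(k) * k / k_mag**2             # phase velocity vector
geometric = omega(k) / k                    # omega/k_1 and omega/k_2
physical = c * np.array([np.cos(theta), np.sin(theta)])

# Group velocity by centered differences in each wave number component
dk = 1e-6
G = np.array([(omega(k + dk * e) - omega(k - dk * e)) / (2.0 * dk) for e in np.eye(2)])

print(c, geometric, physical, G)
```

The geometrical components satisfy the reciprocal-square relation rather than the Pythagorean one, the physical components are the projections of the phase velocity vector, and the group velocity equals the fluid velocity for any k.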
When we try to solve (2.6-5) by any numerical approximation method, however, both
phase speed and group velocity will differ from those above and will exhibit dispersion by
not being independent of wavelength. These concepts will also prove to be much more
useful than local ($h \to 0$) error analyses, whether based on Taylor series or via more
sophisticated convergence theory, because they do not require $h \to 0$ and thus apply
on 'real' meshes. Stated differently, asymptotic (small $h$) analyses are restricted to long
waves ($h/\lambda \to 0$), whereas the general situation always involves a linear combination of
long and short waves, and the behavior of the actual error must account for this fact.
(Short waves are typically those whose wavelengths are between $4\Delta x$ and $2\Delta x$, the latter
being the shortest wave resolvable on the mesh. More on this later. And yes, $h = \Delta x$ and
we use both designations, depending on the context; e.g., a $2h$ wave just does not 'sound
right,' whereas a $2\Delta x$ wave does.)
To conclude this brief introduction to phase and group velocity, we present a few more
gems from Whitham (1974):
1. (p. 10), 'For linear problems, solutions more general than (2.6-1) are obtained by
superposition to form Fourier integrals, such as
$$\varphi = \int_0^\infty F(k)\cos[kx - \omega(k)t]\,dk,$$
where $\omega(k)$ is the dispersion function appropriate to the system. Formally, at least, this is
a solution for arbitrary F(k), which is then chosen to fit the boundary or initial conditions,
with use of the Fourier inversion theorem.
The solution of (2.6-1) is a superposition of wave-trains of different wave numbers,
each traveling at its own phase speed, $c(k) = \omega(k)/k$. As time evolves, these different
component modes 'disperse,' with the result that a single concentrated hump, for example,
disperses into a whole oscillatory train. This process is studied by various asymptotic
expansions of (2.6-1). The key concept that comes out of the analysis is that of the group
velocity, defined as $G(k) = \partial\omega(k)/\partial k$.'
2. (p. 371), 'Although the Fourier integrals give exact solutions, the content is hard to see.'
This statement can often also be applied to the exact solutions of semi-discrete equations
given by the eigenvector expansions to be presented soon.
3. (p. 376), '... an observer moving with the velocity $G(k_0)$ will always see waves with
wave number $k_0$ and frequency $\omega(k_0)$.'
4. (p. 377), 'An observer following any particular crest moves with the local phase velocity
but sees the local wave number and frequency changing; that is, neighboring crests get
farther away. An observer moving with the group velocity sees the same local wave
number and frequency, but crests keep passing him.' And, finally,
5. (p. 380, 381), 'The group velocity G(k) is the propagation velocity for the wave
number $k$, $\partial k/\partial t + G(k)\,\partial k/\partial x = 0$ ... It is interesting and significant that (this equation)
is non-linear, even though the original problem is linear.'
f. Mesh design
We have stated above that the design of smart meshes is part of GFEMIA, and we
reiterate here: in our opinion, not enough use has been made of the inherent flexibility
of isoparametric finite elements to really maximize the cost-effective use of GFEM. But,
unfortunately, we too have no magic recipe. But we can offer some advice on initial
mesh design (more on this in Volume 2): think in advance of calling your mesh generator,
analyze the problem parameter range of interest, and, qualitatively, the associated fluid
dynamics and potential 'wiggle dynamics.' As a start, consider the following questions
and answers:
Q: Should I use a uniform or a slightly graded or a highly graded mesh?
A1: If the flow is advection-dominated and if transport of wave-forms ('shapes') is
important and if there are no obstacles to flow around that require hard (Dirichlet)
BC's, then a uniform or quasi-uniform mesh is suggested (slow growth of element
size in the flow direction if diffusion is strong enough to be 'noticeable' and if there
is a clear flow direction).
A2: If flow past obstacles is the important part of the problem and if the obstacles require
hard BC's, then an advection-dominated flow needs a graded mesh such that one
or (preferably) a few nodes are placed in the boundary layer at least upstream
of any stagnation points, and wherever there is a boundary layer on the obstacle
in the normal direction. Such a procedure will properly both eliminate (or make
small) wiggles and produce a reasonably accurate solution. If, though, transverse
boundary layers are also important, then mesh refinement/grading is needed near
these surfaces for other than wiggle reasons—usually. (Transverse boundary layers
are not often big wiggle generators, except under certain transient conditions.) Hard
BC's are the name of the game when no-penetration and no-slip are appropriate
for the momentum equations and $Re \gg 1$. They apply to the scalar problem if, for
example, cool fluid flows around a hot obstacle and $Pe \gg 1$.
A3: If both wave form transport and flow past obstacles are present in the problem, then
obviously a combination of A1 and A2 is required.
Q: Does the problem involve a boundary layer that you consider to be unimportant and
therefore do not wish to pay the extra price of selective mesh refinement?
A: If yes, why? (If no, see above.)
This protracted 'introduction' has been intended partly as an 'eye opener' and partly
as an incentive for the reader to believe it worthwhile to plow through the often lengthy
mathematical analyses to follow both here and (especially) in the next chapter—and which
is but a minuscule 'sample' of a type of 'CFD' problem that has consumed the efforts
of so many for so long, and continues to do so. An appreciation of this single aspect of
CFD will make a CFD 'analyst' much more 'useful' in both the applied world of CFD
and in the never-ending world of research seeking better methods.
For the reader truly interested in the numerical treatment of advection, it would be
advisable to stay abreast of this portion of the following 'specialty areas,' since each works
quite hard at devising improved numerical methods (most of which are not FEM's) and
more often than not, each field develops in relative isolation—not knowing or (it seems)
caring what developments are occurring outside of the particular specialty, all in addition
(of course) to perusing the applied math literature: geophysical fluid dynamics (general
circulation modeling, planetary boundary layer modeling, ocean modeling, shallow-water
equations, etc.), flow through porous media and ground water pollution, oil reservoir
analysis, aerodynamics, air pollution, astrophysics, magnetohydrodynamics and particle
physics, river dynamics and sediment transport, gas dynamics, materials science,
separation science,
2.6.2 Quantitative Discussion for Some 1D Problems
Many of the above items are amenable to at least some useful quantitative analysis—at
least in 1D. Here we shall look at these issues from the point of view of the semi-discrete
equations—time remaining continuous. In a later section (2.7.6), we shall include the
additional complexity of time-marching and analyze some full discretizations. Finally, we
shall spend just a little time discussing the steady equations—which of course are just a
'special case,' albeit an important one.
Before embarking on some extensive analysis in 1D, we make the comment that much
less analysis will be applied when we go to multi-dimensions (for several reasons!) but
that a good understanding of the behavior (idiosyncratic and not) of 1D
approximations, although seemingly slightly academic, is actually helpful for the multi-dimensional
cases in the following sense: if the 1D method has 'problems,' surely so too will the
multi-dimensional version have these and more (a sort of Murphy's Law of CFD).
Thus, if 1D analysis causes rejection of a scheme, then it is a safe bet that it should
not be tried in multi-dimensions. The 'converse,' unfortunately, is not generally true:
good/great methods in 1D can be poor (or, more often, 'impossible' to implement
properly) in multi-dimensions. Even in 1D, the bulk of the effort will be for the simple
(but useful, for learning) case of periodic BC's, and mostly for pure advection, to which
we now turn.
a. Pure advection with periodic BC's
o Continuum. We start with the simplest of all hyperbolic equations, one which is a true
time-dependent equation since steady solutions (other than T = constant) do not exist,
$$\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} = 0 \quad \text{on } 0 \le x \le L = 1 \tag{2.6-9}$$
with periodic BC's,
$$T(0,t) = T(L,t), \tag{2.6-10}$$
and some IC,
$$T(x, 0) = T_0(x), \tag{2.6-11}$$
where $u$ is constant. The exact solution is
$$T(x,t) = T_0(x - ut); \tag{2.6-12}$$
the solution simply translates the initial data to the right at speed u, thus showing
how simple this PDE really is. Until you spatially discretize it—at which point you
will encounter what Mitchell (1984) has called CFD's 'ultimate embarrassment.' [More
precisely: 'The ultimate embarrassment is to be unable to solve the simplest of equations,
$\partial u/\partial t + \partial u/\partial x = 0$, accurately by numerical methods on a fixed grid'.] The solution of the
semi-discrete equations is unbelievably more 'difficult' than that of the continuum. This is
due, at least in part, to the lack of damping in this first-order hyperbolic equation; whereas
(2.6-12) will propagate all wavelengths—and even discontinuities—with no change in size
or shape, the short waves—and especially discontinuities—are very difficult to propagate
properly by most numerical approximations to (2.6-9). Another relevant opinion from an
advection expert is this: 'There is no such thing as a perfect advection scheme—only
differing degrees of badness'—A. Staniforth (personal communication). Finally, from
Lien and Leschziner (1994), we quote: 'The approximation of convection poses
challenges which might not be expected at first sight. The problem is, essentially, one of
reconciling stability, boundedness, and accuracy.'
Before pursuing the general problem, we first digress to perform some Fourier analyses
that will be both useful and important when we study the semi-discretized version. To do
this, we take a single Fourier mode as our initial data:
$$T_0(x) = e^{ikx}, \tag{2.6-13}$$
where, because of the periodicity constraint, (2.6-10), we have a constraint on the
allowable wave number:
$$k = k_n = 2\pi n, \tag{2.6-14}$$
where $n$ is any integer (or 0). The wavelength, of course, is just
$$\lambda = \lambda_n = 2\pi/k = 1/n, \tag{2.6-15}$$
and the exact solution to (2.6-9) is simply
$$T(x,t) = e^{ik(x - ut)} = e^{i(2\pi n x - \omega_c t)}, \tag{2.6-16}$$
where $\omega_c = ku = 2\pi n u$ is the temporal frequency of the wave, with period $\tau = 2\pi/\omega_c = 1/(nu)$.
(The subscript refers to the continuous case, to distinguish this frequency from those
to follow.)
Remarks:
(1) If we studied instead the closely related IVP on the infinite span rather than the
stated IBVP, then the restriction on k would be removed—it would be a continuous
independent variable. We choose the periodic IBVP because our finite-dimensional
'analog' will be always restricted to (a finite set of) discrete wave numbers. Even
so, we shall often consider n to be a continuous variable, for 'convenience' (and
with little error, usually)—as have many before us.
(2) The restriction to periodic BC's is not as 'severe' (unrealistic) as one might first
imagine, in the following sense: lots of what is learned applies to situations with
different BC's, at least during a time interval in which the BC's are not important/felt.
(3) The spatial derivatives of this wave behave as follows: $\partial^m T/\partial x^m = (ik)^m T$, which
gives $\pm k^m T$ for $m$ even and $\pm i k^m T$ for $m$ odd; the former corresponds to dissipation
and the latter to dispersion, which accounts for the often-seen statement that even-
order derivatives are dissipative and odd-order are dispersive; see, for example,
Ramshaw (1994).
(4) The restriction to a single Fourier mode is, of course, not really a restriction in that
any IC, and solution, can always be represented as a linear combination of such
modes.
(5) A quotation from Strang (1986, p. 264) is relevant here: 'Every harmonic $e^{ikx}$ is an
eigenfunction of every derivative and every finite difference.' It is, however, less
relevant when we move up to quadratic finite elements—as we shall see.
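Strang's remark can be illustrated with the periodic centered difference, whose eigenvalue on a harmonic is $i\sin\theta/h$ with $\theta = kh$, the discrete counterpart of $ik$. The following is a minimal sketch; the mesh size `N` and mode number `n` are arbitrary assumptions.

```python
import numpy as np

N = 16                       # nodes on the periodic unit interval (assumed)
h = 1.0 / N
n = 3                        # mode number (assumed), so k = 2*pi*n
k = 2.0 * np.pi * n
x = h * np.arange(N)
mode = np.exp(1j * k * x)    # the harmonic e^{ikx} sampled at the nodes

# Periodic centered difference: (T[j+1] - T[j-1]) / (2h).
centered = (np.roll(mode, -1) - np.roll(mode, 1)) / (2.0 * h)

theta = k * h
eigenvalue = 1j * np.sin(theta) / h      # tends to i*k as theta -> 0
residual = np.max(np.abs(centered - eigenvalue * mode))
```

The residual is at round-off level: the difference operator returns the same harmonic, merely scaled.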
o Linear elements. So—each Fourier mode translates to the right at speed u. How well
is this simple behavior approximated by the GFEM (or other)? Let us find the answer
to this very important question by first examining 'linear elements' on a uniform mesh
(the only kind of mesh amenable to Fourier analysis, unfortunately), for which (2.6-9) is
approximated by [see (2.3-10)]
$$\frac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + \frac{u}{2h}\left(T_{j+1} - T_{j-1}\right) = 0, \quad j = 1, 2, \ldots, N, \tag{2.6-17}$$
and (2.6-10) by
$$T_0 = T_N; \tag{2.6-18}$$
there are N elements and N nodes for the periodic BC problem, and Nh = 1. (It may be
helpful to think of the ID problem as one on a circular track, wherein node 0 and node
N coincide—at x = 0 and x = 1, when unrolled; nodes 1 and N + 1 also coincide.)
Taking our cue from (2.6-16), the general solution to the above system of ODE's is
sought in the form ($x \to x_j = jh$):
$$T_j(t) = e^{i(kjh - \omega t)}, \tag{2.6-19}$$
where $k$ is a given wave number, and $\omega$ is to be determined; hopefully it is close to $2\pi n u$.
Note that (2.6-19) satisfies (2.6-18) only when $k = k_n = 2\pi n$.
It is convenient, for later use, to rewrite the above result as
$$T_j(t) = e^{ik_n jh}\cdot e^{-i\omega t} = v_j^{(n)} e^{-i\omega t}, \tag{2.6-20}$$
which has introduced the eigenvectors $\{v^{(n)},\ n = 1, 2, \ldots, N\}$. $T_j(t)$ thus describes the
temporal behavior of the $n$-th eigenvector. We remark that $v^{(n)}$ qualifies as an eigenvector
because it satisfies the generalized eigenvalue problem
$$K v^{(n)} = \lambda_n M v^{(n)} \tag{2.6-21}$$
obtained from (2.6-17) and (2.6-18) via $T_j = v_j^{(n)} e^{-\lambda_n t}$, where $K$ and $M$ are (after
multiplying by $h$) 'defined by' (2.6-17) and $\lambda_n$ is here an eigenvalue (not a wavelength); i.e.,
a row of the skew-symmetric matrix $K$ reads
$$\frac{u}{2}\,(0, \ldots, -1, 0, 1, 0, \ldots),$$
and one of the SPD matrix $M$ reads
$$\frac{h}{6}\,(0, \ldots, 1, 4, 1, 0, \ldots),$$
and it follows that $\lambda_n = i\omega_n$, where $\omega_n$ is to be determined.
Inserting the 'test' solution, (2.6-19), into (2.6-17) leads to
$$-\frac{i\omega}{6}\left(e^{-i\theta} + 4 + e^{i\theta}\right) + \frac{u}{2h}\left(e^{i\theta} - e^{-i\theta}\right) = 0, \tag{2.6-22}$$
where $\theta = kh$ is a dimensionless wave number and $k = k_n = 2\pi n$, and we henceforth
omit the subscript for 'convenience.' The 'approximate' frequency is thus found to be
$$\omega = \frac{u}{h}\sin\theta\cdot\frac{3}{2 + \cos\theta} = uk\,\frac{\sin\theta}{\theta}\cdot\frac{3}{2 + \cos\theta}, \tag{2.6-23}$$
which approximates the true frequency, $\omega_c = uk$; and it is (only) a good approximation
for $\theta$ 'small.'
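The result (2.6-23) can be checked directly against the generalized eigenvalue problem (2.6-21). The sketch below builds the periodic matrices from the rows of (2.6-17) multiplied by $h$; the helper names and the values of `N`, `u` and `n` are illustrative assumptions.

```python
import numpy as np

def linear_element_matrices(N, u):
    """Periodic M and K from (2.6-17) multiplied by h, with N*h = 1."""
    h = 1.0 / N
    M = np.zeros((N, N))
    K = np.zeros((N, N))
    for j in range(N):
        left, right = (j - 1) % N, (j + 1) % N
        M[j, left] += h / 6.0
        M[j, j] += 4.0 * h / 6.0
        M[j, right] += h / 6.0
        K[j, right] += u / 2.0
        K[j, left] -= u / 2.0
    return M, K

def omega_consistent(n, N, u):
    """Consistent-mass frequency (2.6-23) for mode n."""
    h = 1.0 / N
    theta = 2.0 * np.pi * n * h
    return (u / h) * 3.0 * np.sin(theta) / (2.0 + np.cos(theta))

N, u, n = 20, 1.0, 3                      # illustrative values (assumptions)
M, K = linear_element_matrices(N, u)
nodes = np.arange(N)
v = np.exp(1j * 2.0 * np.pi * n * nodes / N)   # eigenvector from (2.6-20)
lam = 1j * omega_consistent(n, N, u)           # lambda_n = i * omega_n
residual = np.max(np.abs(K @ v - lam * (M @ v)))
skew_defect = np.max(np.abs(K + K.T))          # zero if K is skew-symmetric
```

A round-off-level `residual` confirms that the Fourier mode, paired with the frequency of (2.6-23), satisfies $Kv = \lambda Mv$ exactly on the uniform periodic mesh.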
Remarks:
(1) We have apparently succeeded in finding the analytic solution to (2.6-17).
(2) If the mass is lumped in (2.6-17), the factor $3/(2 + \cos\theta)$ is replaced by unity; i.e.,
$$\omega_{LM} = uk\,\frac{\sin\theta}{\theta}, \tag{2.6-24}$$
which is also the second-order (centered) finite difference result.
(3) It is sometimes convenient, but not necessary nor rigorous, to regard $\theta$ as a
continuous variable, rather than $\theta = \theta_n = 2\pi n/N$; $n = 1, 2, \ldots, N$. It is a good
approximation when $N \gg 1$; see, for example, Hindmarsh et al. (1984).
(4) The frequency in (2.6-23) is sometimes called the 'symbol' of the operator/matrix
(Vichnevetsky and Bowles, 1982), $A = -M^{-1}K$ from $\dot T = AT$; i.e., it is the
(normalized) result of the effect of operating on a Fourier mode with the operator $A$:
$S_A = e^{-ij\theta}A\cdot(e^{ij\theta})$. The separate symbols in this case are: $S_M = (2 + \cos\theta)/3$ and
$S_K = iu\sin\theta/h$, whose continuous analogs are, respectively, 1 and $uk$. See also
Swartz and Wendroff (1974), who may have been the first to analyze the phase
speed of this GFEM.
Recalling that $\theta = \theta_n = k_n h = 2\pi n/N$ for $n = 1, 2, \ldots, N$, we plot in Figure 2.6-2
the eigenfrequency, $\omega_n/u$ vs (continuous) $n$ for $N = 80$ for both CM and LM [e.g., from
(2.6-24) we have $\omega_n/u = N\sin(2\pi n/N)$].
Noteworthy from this figure are:
1. The CM curve remains much closer to the goal, $2\pi n$, for the low-frequency modes,
than does that for LM. But both wander quite far from the goal for n 'large.' (This is the
'cause' of dispersion error.)
2. Both go through 0 at $n = N/2$ ($\theta = \pi$), which corresponds to $\lambda = 2/N = 2h$, the
infamous '$2\Delta x$-wave.' The eigenvector of the $n = N/2$ mode, from (2.6-20), is
$v_j^{(N/2)} = e^{i\pi j} = (-1)^j$, showing a spatial period (wavelength) of $2\Delta x$; hence, the name
'$2\Delta x$-mode.' (If $N$ is odd, then this wave does not strictly exist, but one very close to it, for
large $N$, does.)
3. The curves are anti-symmetric about $n = N/2$.
4. The upper half of the spectrum ($n > N/2$) corresponds to waves shorter than $2\Delta x$,
none of which can be resolved/seen/displayed on the discrete mesh. Each of these waves
is 'aliased' to the lower half of the spectrum (see Fornberg, 1996, for discussion and
pictures). In fact,
Fig. 2.6-2 Eigenvalues for linear elements on pure advection. [Plot of $\omega_n/u$, ranging from $-\sqrt{3}N$ to $\sqrt{3}N$, versus $n$ from 1 to $N$.]
5. $\omega_{N-n} = -\omega_n$ and $v^{(N-n)} = \bar v^{(n)}$ (complex conjugate) for $n = 1, 2, \ldots, N-1$. This
leads [see (2.6-20)] to $T_j^{(N-n)}(t) = \bar T_j^{(n)}(t)$, showing that each high-'frequency' mode is
aliased to the complex conjugate of the corresponding low-frequency mode. (What you
'thought' was a high-frequency wave is actually a low-frequency one.)
6. n = N is the special case of the constant eigenvector with zero frequency and infinite
wavelength.
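The symmetry and aliasing properties listed above can be confirmed numerically from (2.6-20), (2.6-23) and (2.6-24). This is a minimal sketch with $N = 80$ as in the figure; the mode `m` and the helper name `eigenvector` are illustrative assumptions.

```python
import numpy as np

N, u = 80, 1.0
h = 1.0 / N
n = np.arange(1, N + 1)
theta = 2.0 * np.pi * n / N

omega_cm = (u / h) * 3.0 * np.sin(theta) / (2.0 + np.cos(theta))   # (2.6-23)
omega_lm = (u / h) * np.sin(theta)                                 # (2.6-24)

def eigenvector(mode):
    """Discrete Fourier eigenvector of (2.6-20) at nodes j = 1..N."""
    j = np.arange(1, N + 1)
    return np.exp(1j * 2.0 * np.pi * mode * j / N)

m = 7   # any mode with 1 <= m <= N - 1 (illustrative choice)
# Arrays are indexed by n - 1.
antisymmetric = np.isclose(omega_cm[N - m - 1], -omega_cm[m - 1])
conjugate_pair = np.allclose(eigenvector(N - m), np.conj(eigenvector(m)))
two_dx_mode = np.allclose(eigenvector(N // 2), (-1.0) ** np.arange(1, N + 1))
zero_at_midpoint = abs(omega_cm[N // 2 - 1]) < 1e-9
peak_cm = omega_cm.max() / u      # close to sqrt(3) * N
peak_lm = omega_lm.max() / u      # equals N (attained at n = N/4)
```

The two peaks correspond to the vertical-axis extremes of Figure 2.6-2: about $\sqrt{3}N$ for consistent mass and $N$ for lumped mass.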
Thus, in a sense, we can 'disregard' the upper half of the spectrum, and focus our
attention on only the resolvable lower half. (Note, however, that this does not mean we
are really discarding useless or redundant 'information'—we just look in two different
'ways' at the lower portion of the spectrum, a 'consequence' of periodic BC's. The
eigenvectors for the upper half, for example, are needed for the expansion of an arbitrary
vector on the grid—as we will show later.) So now let us look more closely at the lower
half—Figure 2.6-3—in which we also show the wavelength scale.
Additional noteworthy points are:
1. The midpoint, $N/4$ ($\lambda = 4\Delta x$), is special in two ways, the first of which is quite
important:
(i) It divides the modes 50/50 (for all semi-discrete approximations, not just the two
under discussion); one half of the modes, which we might call long-wave modes,
have wave numbers between $2\pi$ and $N\pi/2$ (wavelengths between $4\Delta x$ and $N\Delta x = 1$),
and the other half, which are obviously called short waves, have wave numbers
between $N\pi/2$ and $N\pi$ (wavelengths between $2\Delta x$ and $4\Delta x$). Clearly, 'too many'
waves are short, and this is why wiggles of 'all kinds' can occur, especially for
the lumped-mass case.
Fig. 2.6-3 Eigenvalues for linear elements on pure advection: a closer look. [Plot of $\omega_n/u$ versus $n$ over the lower half of the spectrum, $Nh = 1$; curve labels include 'Analytical ($2\pi n$)' and 'Consistent'.]
(ii) It is the point where $d\omega/dk = 0$ for the lumped-mass (finite difference) case; i.e.,
the group velocity of the $4\Delta x$ wave is zero.
2. The eigenvector of each short-wave mode ($2h < \lambda < 4h$) is also actually the product
of the $2\Delta x$ eigenvector with the (complex conjugate of the) eigenvector of the
'corresponding' long-wave mode; i.e., we have
$$\begin{aligned} v_j^{(N/2-m)} &= e^{2\pi i j[(N/2-m)/N]} \\ &= e^{i\pi j}\,e^{-2\pi i jm/N} \\ &= v_j^{(N/2)}\cdot\bar v_j^{(m)} \\ &= (-1)^j\,\bar v_j^{(m)}; \end{aligned} \tag{2.6-25}$$
thus, if $v^{(m)}$ is considered as 'smooth,' then $v^{(N/2-m)}$ is 'rough,' a distinction that is most
clear when $m$ is 'small' and least clear as $m \to N/4$, the $4\Delta x$ wave. Even though mode
$N/2 - m$ can be obtained from mode $m$, using (2.6-25), it is important to realize that this
does not imply that mode $N/2 - m$ is not an independent eigenvector. In fact, each of
the $N$ eigenvectors is distinct; the vector space is only spanned when all $N$ of them are
utilized, even though $v^{(N-m)} = \bar v^{(m)}$ for $m = 1, 2, \ldots, N-1$, and $v_j^{(N/2-m)} = (-1)^j\,\bar v_j^{(m)}$
for $m = 1, 2, \ldots, N/2 - 1$.
3. If we instead define $d\omega/dk = 0$ as the dividing point between long and short waves
(i.e., $G = 0$), which actually makes good sense, the consistent mass spectrum clearly
contains more long-wave modes than does the lumped-mass spectrum (one-third more, in
fact, since $d\omega/dk = 0$ gives $\lambda = 3\Delta x$; 2/3 of the modes are 'long'). Many of the waves
that look 'short' (hard) for the less-accurate lumped-mass model, look 'long' (easy) for
the GFEM (consistent-mass) model. And we shall later argue that $d\omega/dk = 0$ is indeed a
useful definition of long vs short.
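The zero-group-velocity wavelengths quoted above follow from differentiating (2.6-23) and (2.6-24) with respect to $k$, treating $\theta = kh$ as continuous. The sketch below locates the sign change of $G$ on a fine grid in $\theta$; the function names and the grid resolution are assumptions.

```python
import numpy as np

def group_velocity_lumped(theta, u=1.0):
    # d(omega)/dk for omega = (u/h) sin(theta), with theta = k*h
    return u * np.cos(theta)

def group_velocity_consistent(theta, u=1.0):
    # d(omega)/dk for omega = (u/h) * 3 sin(theta) / (2 + cos(theta))
    return 3.0 * u * (1.0 + 2.0 * np.cos(theta)) / (2.0 + np.cos(theta)) ** 2

theta = np.linspace(1e-3, np.pi, 200001)

def wavelength_of_zero(G):
    """Wavelength, in units of dx, at which G first becomes negative."""
    first_negative = np.argmax(G < 0.0)
    return 2.0 * np.pi / theta[first_negative]

lam_lm = wavelength_of_zero(group_velocity_lumped(theta))       # about 4
lam_cm = wavelength_of_zero(group_velocity_consistent(theta))   # about 3

# Fraction of the resolvable modes (0 < theta <= pi) having G > 0.
long_fraction_lm = 2.0 / lam_lm
long_fraction_cm = 2.0 / lam_cm
```

With $G = 0$ as the dividing line, one half of the lumped-mass modes and two-thirds of the consistent-mass modes are 'long'. Note also that the consistent-mass group velocity at $\theta = \pi$ evaluates to $-3u$ in this formula.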
It may be useful to show a few of these analytical eigenvector results pictorially,
which we do with $N = 80$ ($h = 0.0125$). Figures 2.6-4 and 2.6-5 show the eigenvector
(real part) $\cos(2\pi nj/N) = \cos(2\pi n x_j)$, $j = 1, 2, \ldots, N$ for $n = 1$ (an $80\Delta x$ wave, shown
dashed) and $n = 2$ (a $40\Delta x$ wave, dashed) along with their short-wave analogs, $n = 39$
and 38, respectively (shown solid), on the same graphs.
Fig. 2.6-4 Eigenvectors for linear elements; n = 1,39. [Plotted against $x_j$ from $1/N$ to 1.]
Fig. 2.6-5 Eigenvectors for linear elements; n = 2,38. [Plotted against $x_j$ from $1/N$ to 1.]
The wavelengths of the latter two are $(80/39)\Delta x$ and $(40/19)\Delta x$, respectively; i.e., they are
close to the limit of $2\Delta x$, given by $n = 40$. This shows the long and short 'pairs'
referred to above, and it is relevant to note that only the long-wave modes look like
'conventional' cosine curves. The modulated cosine that is the short-wave mode is another
consequence of dealing with a high-frequency wave on a discrete mesh; e.g., the
function $\cos(2\pi j \times 38/80) = (-1)^j\cos(2\pi j \times 2/80)$ displays the shape that it does owing to
sampling 'error' (a form of 'aliasing'). If $j$ was instead a continuous variable, the 38-th
mode would be a pure cosine wave (amplitude ±1 with no modulation) of wavelength
80/38; i.e., there would be 38 cosine cycles across the mesh. Sampling this 'pure' wave
at the discrete points j = 1, 2, ..., 80 yields the discrete eigenvector for mode 38—and
is the 'cause' of group velocity error, as we shall show (near the end of this section).
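The sampling identity quoted above for mode 38 is simple to confirm at the nodes. A minimal sketch, using the same $N = 80$ mesh:

```python
import numpy as np

N = 80
j = np.arange(1, N + 1)

# Mode 38 sampled at the nodes: a pure cosine with 38 cycles across the mesh.
short_wave = np.cos(2.0 * np.pi * j * 38 / N)

# The same nodal values written as a 2*dx modulation of the long mode n = 2.
modulated_long_wave = (-1.0) ** j * np.cos(2.0 * np.pi * j * 2 / N)

max_difference = np.max(np.abs(short_wave - modulated_long_wave))
```

The two arrays agree to round-off, so on the mesh the short wave is indistinguishable from the sign-alternating envelope of its long-wave partner.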
Moving more toward the middle of the spectrum, Figures 2.6-6 and 2.6-7 show $n = 10$
(an $8\Delta x$ wave) and $n = 40 - 10 = 30$ (an $8\Delta x/3$ wave, 3 waves per $8\Delta x$), shown
separately for ease of viewing, even though mode 30 is a $2\Delta x$ modulation of mode 10.
Finally, Figure 2.6-8 shows a $4\Delta x$ wave, $n = 20$, halfway through the spectrum.
Enough pictures for now. Later we shall return to these figures in our discussion of
phase and group velocities.
o Quadratic elements. Before discussing phase speed and group velocity, and their
comparison with the exact results, the next 'logical' step, let us examine one 'higher-
order' element—quadratic—to see how much more difficult the analysis becomes for
methods in which 'one-node-looks-just-like-another' (on a uniform mesh) does not hold
true. It will also reveal how significantly more accurate this element is than the linear
one (and not just asymptotically for $h \to 0$, which is all that 'local' theory can predict).
The pure advection equation, via quadratics, is available from (2.3-16) and (2.3-18), as
$$\frac{1}{10}\left(\dot T_{j-1} + 8\dot T_j + \dot T_{j+1}\right) + u\,\frac{T_{j+1} - T_{j-1}}{2h} = 0 \tag{2.6-26}$$
for center nodes, $j = 1, 3, \ldots, N$; $N$ is necessarily odd, in contrast to the situation
discussed earlier (Section 2.3-1) with different BC's, and
$$\frac{1}{10}\left(-\dot T_{j-2} + 2\dot T_{j-1} + 8\dot T_j + 2\dot T_{j+1} - \dot T_{j+2}\right) + 2u\,\frac{T_{j+1} - T_{j-1}}{2h} - u\,\frac{T_{j+2} - T_{j-2}}{4h} = 0 \tag{2.6-27}$$
for edge nodes, $j = 2, 4, \ldots, N+1$, where $h$ is the internodal distance (half the element length), the number
of elements is $(N+1)/2$, the number of nodes is $N+1$, $h = 1/(N+1)$, and we require
$N \ge 5$ for the equations to make sense.
Fig. 2.6-6 Eigenvector for linear elements; n = 10.
Fig. 2.6-7 Eigenvector for linear elements; n = 30.
Fig. 2.6-8 Eigenvector for linear elements; n = 20.
If we seek a solution to (2.6-26) and (2.6-27) in the form of (2.6-20) (and we have),
we will fail (and we did). Due account must be made of the difference between edge
and midside nodes, and this can be accomplished by seeking a solution in the form of
(2.6-20) for edge nodes, and
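$$T_j(t) = \beta e^{i(kjh - \omega t)} \tag{2.6-28}$$
for center nodes, where $\beta$ is to be determined along with $\omega$. A combined form that covers
all nodes is
$$T_j(t) = \tfrac{1}{2}\left[(1 + \beta) + (-1)^j(1 - \beta)\right]e^{i(kjh - \omega t)}, \tag{2.6-29}$$
and another is
$$T_j(t) = \left[\frac{1 + (-1)^j}{2} + \beta\,\frac{1 - (-1)^j}{2}\right]e^{i(kjh - \omega t)} \tag{2.6-30}$$
for $j = 1, 2, \ldots, N+1$. The analog of (2.6-10) is $T_0 = T_{N+1}$ and $T_{N+2} = T_1$, which,
again, requires $k = k_n = 2\pi n$. Inserting the trial solution into (2.6-26) and (2.6-27) yields
the pair
$$-\frac{i\omega}{10}\left(-e^{-2i\theta} + 2\beta e^{-i\theta} + 8 + 2\beta e^{i\theta} - e^{2i\theta}\right) + 2\beta u\,\frac{e^{i\theta} - e^{-i\theta}}{2h} - u\,\frac{e^{2i\theta} - e^{-2i\theta}}{4h} = 0 \tag{2.6-31}$$
and
$$-\frac{i\omega}{10}\left(e^{-i\theta} + 8\beta + e^{i\theta}\right) + u\,\frac{e^{i\theta} - e^{-i\theta}}{2h} = 0, \tag{2.6-32}$$
for $\omega$ and $\beta$. Rewriting these in terms of the trigonometric functions gives
$$5u(4\beta\sin\theta - \sin 2\theta) = 2\omega h(4 + 2\beta\cos\theta - \cos 2\theta) \tag{2.6-33}$$
and
$$5u\sin\theta = \omega h(4\beta + \cos\theta). \tag{2.6-34}$$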
Solving the second of these for $\beta$, inserting the result into the first, and rearranging yields
the following quadratic equation:
$$(3 - \cos 2\theta)\left(\frac{\omega h}{u}\right)^2 + (4\sin 2\theta)\,\frac{\omega h}{u} - 5(1 - \cos 2\theta) = 0, \tag{2.6-35}$$
with solution
$$\omega h/u = \frac{-2\sin 2\theta \pm \sqrt{(1 - \cos 2\theta)(19 - \cos 2\theta)}}{3 - \cos 2\theta}, \tag{2.6-36}$$
and we are up against the dilemma that has confused many an analyst in the past; namely,
the quadratic equation gives two roots (frequencies) for each wave number. (Recall
$\theta = kh = 2\pi nh$.) Is one of them spurious (extraneous) and if so, which? And why? Before
answering these questions, we briefly review some of the previous literature on the subject
that may have given 'quads' a bad name—all because both roots of the quadratic were
utilized and assumed needed/present, with the 'explanation' (rationale/excuse) that these
silly elements possessed one spurious non-physical mode for each physical mode. And
to show that we too are not immune as we criticize previous efforts, please note that the
earliest entry on this list of references is ourselves. The list that we are aware of includes
Gresho et al. (1976), Hedstrom (1979), Cullen and Morton (1980), Cullen (1982), Cathers
and O'Conner (1985), Bates and Cathers (1986). On the other hand, there is at least one
publication early-on that got it right: Vichnevetsky and DeSchutter (1975). It is too bad
that this early paper was not seen by all those later people who 'blew it.' [We finally
discovered the error, and published correct results—there are no spurious modes—in
Gresho and Lee (1987) and Rowley and Gresho (1987).]
The best way we have found to resolve this dilemma is to realize that $\omega h/u$ is closely
related to an eigenvalue of $M^{-1}K$ and to invoke linear algebra theory, where $M$ is the mass
matrix and $K$ the advection matrix defined by (2.6-26) and (2.6-27). Actually, to display
the symmetries that we will need below, these equations must first be returned/converted
to a form more like that of GFEM. (We presented them in finite difference form for ease
of 'interpretation.') Thus, multiplying (2.6-26) by $4h/3$ and (2.6-27) by $2h/3$ yields the
desired form; namely,
$$M\dot T + KT = 0, \tag{2.6-37}$$
whose solution we now seek in the form $T(t) = q e^{-\lambda t}$ to give
$$Kq = \lambda Mq; \tag{2.6-38}$$
$\lambda$ is an eigenvalue of $M^{-1}K$ (one of precisely $N+1$), and $q$ the corresponding ('quad')
eigenvector, of length $N+1$; also, $\lambda = i\omega$. Thus, if we can determine the proper properties
of the spectrum, we will have resolved the dilemma regarding $\omega$. And this we can do
because: $M$ is symmetric-positive-definite (SPD), and $K$ is skew-symmetric. We proceed
as follows: since $M$ is SPD, its square root exists (and is also SPD) as does the inverse
of its square root. Thus, (2.6-38) can be written as $M^{-1/2}KM^{-1/2}x = Ax = \lambda x$, where
$x = M^{1/2}q$. Now, since $K^T = -K$, so too does $A^T = -A$. Since $A$ is a real skew-symmetric
matrix, its eigenvalues [except $\lambda = 0$, which occurs for $n = (N+1)/2$ and $n = N+1$]
are pure imaginary and occur in conjugate pairs. Since there are $N + 1 - 2$ such values of $\lambda$,
there are at most $(N-1)/2$ distinct values of $|\omega|$.
Now return to the quadratic equation solution, (2.6-36), and use $\theta = kh = k_n h = 2\pi nh = 2\pi n/(N+1)$
for $n = 1, 2, \ldots, N+1$. In order to obtain just $N+1$ roots and
not 2(N + 1) and to obtain the proper complex conjugate pairs, it is necessary to select the
plus (or minus) sign for the first (lower) half of the spectrum ($n = 1, 2, \ldots, (N+1)/2$)
and the minus (or plus) for the upper half. (This also rejects the $N+1$ extraneous roots.)
We choose the former (plus for first half, minus for second) to give, for convenience
and compatibility, $\omega_n \to uk_n$ for $h \to 0$ when $n < (N+1)/2$, rather than $\omega_n \to -uk_n$.
The first $(N+1)/2$ frequencies are thus positive, and the second half negative, with
$\omega_{N+1-n} = -\omega_n$ corresponding to the complex conjugate pairs of eigenvalues (cf. the
previous frequency plot, Figure 2.6-2, in which the same behavior is shown using linear
basis functions—with no need to solve a quadratic equation). The spurious root issue was
a spurious issue.
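The root-selection rule can be tested by assembling the periodic quadratic-element matrices and comparing their spectrum with (2.6-36). The sketch below rests on stated assumptions: the standard three-node element mass and advection matrices (which reproduce the rows of (2.6-26) and (2.6-27) after the scalings by $4h/3$ and $2h/3$), a Cholesky factor of $M$ used in place of the symmetric square root $M^{1/2}$, nodes indexed from 0 with edge nodes at even indices, and `P = N + 1 = 16` chosen arbitrarily.

```python
import numpy as np

def quadratic_matrices(P, u):
    """Periodic M and K for P nodes (P even), i.e. P // 2 quadratic elements."""
    h = 1.0 / P                          # internodal distance; element length 2h
    Me = (2.0 * h / 30.0) * np.array([[4.0, 2.0, -1.0],
                                      [2.0, 16.0, 2.0],
                                      [-1.0, 2.0, 4.0]])
    Ke = (u / 6.0) * np.array([[-3.0, 4.0, -1.0],
                               [-4.0, 0.0, 4.0],
                               [1.0, -4.0, 3.0]])
    M = np.zeros((P, P))
    K = np.zeros((P, P))
    for e in range(P // 2):
        nodes = [2 * e, 2 * e + 1, (2 * e + 2) % P]   # edge, center, edge
        for a in range(3):
            for b in range(3):
                M[nodes[a], nodes[b]] += Me[a, b]
                K[nodes[a], nodes[b]] += Ke[a, b]
    return M, K

def omega_formula(P, u):
    """Frequencies from (2.6-36): plus root for n <= P/2, minus root above."""
    h = 1.0 / P
    n = np.arange(1, P + 1)
    theta = 2.0 * np.pi * n / P
    c2, s2 = np.cos(2.0 * theta), np.sin(2.0 * theta)
    root = np.sqrt(np.maximum((1.0 - c2) * (19.0 - c2), 0.0))
    sign = np.where(n <= P // 2, 1.0, -1.0)
    return (u / h) * (-2.0 * s2 + sign * root) / (3.0 - c2)

P, u = 16, 1.0                           # P = N + 1 nodes (illustrative)
M, K = quadratic_matrices(P, u)
L = np.linalg.cholesky(M)                # M = L L^T
B = np.linalg.solve(L, K)                # L^{-1} K
A = np.linalg.solve(L, B.T).T            # L^{-1} K L^{-T}, real skew-symmetric
# K q = i*omega*M q  is equivalent to  (-i A) x = omega x, with -iA Hermitian.
omega_from_matrices = np.linalg.eigvalsh(-1j * A)
omega_from_formula = np.sort(omega_formula(P, u))
```

If the two sorted arrays agree, the $N+1$ selected roots account for the whole spectrum of $M^{-1}K$, leaving no room for additional 'spurious' frequencies.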
Remark:
We suspect, but have not proven and thus merely offer a potentially useful warning, that
similar erroneous conclusions have been obtained, via similar erroneous analyses, in two
other CFD fields—shallow water equations and acoustics. For the former, see Kinnmark
and Gray (1985) and Kinnmark (1986), wherein they asserted the existence of spurious
solutions ('numerical artifacts'). For the latter, see Belytschko and Mullen (1978), Mullen
and Belytschko (1982), and Schreyer (1983), wherein the extraneous roots were identified
as 'optical (rather than acoustical) branches.' That these optical branches may be optical
illusions has also been pointed out by, at least, Abboud and Pinsky (1990, 1992).
To complete the analysis, we need $\beta$; from (2.6-34), this is simply
$$\beta_n = \frac{1}{4}\left(\frac{5u\sin\theta_n}{\omega_n h} - \cos\theta_n\right), \quad n = 1, 2, \ldots, N+1, \tag{2.6-39}$$
$\theta_n = 2\pi n/(N+1)$, and we are done.
If the mass is lumped [replacing all time derivatives by $\dot T_j$], then the RHS's of (2.6-33)
and (2.6-34) are replaced by $10\omega h$ and $5\beta\omega h$, respectively, (2.6-36) is replaced by
$$\omega h/u = \frac{-\sin 2\theta \pm \sqrt{17 - 16\cos 2\theta - \cos^2 2\theta}}{4}, \tag{2.6-40}$$
and (2.6-39) by
$$\beta_n = \frac{u\sin\theta_n}{\omega_n h}. \tag{2.6-41}$$
It is noteworthy that the analog of (2.6-25) also applies to these quads, with the
eigenvector being now [cf. (2.6-30)]
$$q_j^{(n)} = \left[\frac{1 + (-1)^j}{2} + \beta_n\,\frac{1 - (-1)^j}{2}\right]e^{ikjh}, \tag{2.6-42}$$
where $k = 2\pi n$, $n = 1, 2, \ldots, N+1$; i.e.,
$$q_j^{[(N+1)/2-m]} = (-1)^j\,\bar q_j^{(m)}, \tag{2.6-43}$$
although in this case—as we shall see—(N + l)/2 is perhaps not the best definition of the
long-wave/short-wave dividing point. (It still does divide the modes 50/50, however—as
mentioned earlier.)
Now that we have found our 'Fourier' modes for quads, we realize that they are
not simple/conventional Fourier modes—a consequence of the two different types of
equations. The plot of $\beta_n$ vs $n$ in Figure 2.6-9 for $N = 79$, which is symmetrical about
$(N+1)/2$, shows that only the low modes ($n \ll N$) look like simple Fourier modes;
higher modes have a reduced amplitude for mid-side nodes, reaching 1/2 at the shortest
resolvable mode [$n = (N+1)/2$, $k_n h = 2\pi n/(N+1) = \pi$, $\lambda_n = 1/n = 2h$, the '$2\Delta x$'
mode]. This causes, for example, the 'simple'/conventional $2\Delta x$ mode ($\pm 1$) to be a
linear combination of two quad eigenvectors, which we will later explicate.
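The end values of the mid-side amplitude can be reproduced from (2.6-36) and (2.6-39). A minimal sketch for $N = 79$, using the plus root over the lower half of the spectrum and stopping one mode short of the $2\Delta x$ mode, where (2.6-39) becomes 0/0 (that cutoff is an implementation choice, not part of the text):

```python
import numpy as np

N, u = 79, 1.0
P = N + 1
h = 1.0 / P
n = np.arange(1, P // 2)            # n = 1 .. (N+1)/2 - 1
theta = 2.0 * np.pi * n / P
c2, s2 = np.cos(2.0 * theta), np.sin(2.0 * theta)

# Plus root of (2.6-36) for the lower half of the spectrum.
omega = (u / h) * (-2.0 * s2 + np.sqrt((1.0 - c2) * (19.0 - c2))) / (3.0 - c2)

# Mid-side amplitude coefficient, (2.6-39).
beta = 0.25 * (5.0 * u * np.sin(theta) / (omega * h) - np.cos(theta))

beta_longest = beta[0]      # close to 1: nearly a simple Fourier mode
beta_shortest = beta[-1]    # close to 1/2 next to the 2*dx mode
```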
This somewhat awkward behavior, while in no way harming the approximating
properties of the quadratic basis functions, leads to another point of confusion in the literature
(see, for example, Cullen, 1982). This is awkward and unfortunate—the latter because
those potential newcomers to quadratic finite elements who study only these negative and
misleading papers will probably be 'turned off' when they should not be. We try herein
to 'restore the faith', since quads are actually very accurate for advection.
To start, let us try to reinterpret the situation as follows: it is just pure dumb luck that
some numerical approximations (linear FEM, centered FDM) share the same eigenvector
(discrete version, on a uniform mesh) as the continuum. Such a fortunate occurrence
merely makes the analysis easier—it does not make the approximation better. For
example, if quadratic B-splines are utilized as basis functions in the Galerkin method (a
non-interpolatory, and thus awkward, basis), the semi-discrete equations are simpler in
that they do satisfy the one-node-looks-like-another property, on a uniform mesh, and the
advection problem looks like (see, for example, Chin et al. 1979, or Vichnevetsky and
Bowles, 1982)
$$\frac{1}{120}\left(\dot T_{j-2} + 26\dot T_{j-1} + 66\dot T_j + 26\dot T_{j+1} + \dot T_{j+2}\right) + \frac{u}{24h}\left(T_{j+2} + 10T_{j+1} - 10T_{j-1} - T_{j-2}\right) = 0, \tag{2.6-44}$$
Fig. 2.6-9 Mid-side node amplitude coefficient for quads.
which also gives a highly accurate approximation asymptotically [phase error = $O(h^6)$
vis-a-vis $O(h^4)$ for the GFEM quads] from $c = 5u(\sin 2\theta + 10\sin\theta)/[\theta(33 + 26\cos\theta + \cos 2\theta)]$.
But if one plots $c(\theta)$ and $G(\theta)$ for these 'unambiguous' (and smoother, with $C^1$
continuity) quads, one would find that they lie virtually on top of the corresponding curves
from the 'conventional' (GFEM) quads that we will present later; i.e., they would perform
just about like the $C^0$ quads in practice. Another blow to the 'simple' Fourier analysis
is this: real grids rarely have equal nodal spacing, thus restricting the Fourier analysis
method to a small subset of 'model' problems. What happens in practice, and which can
only be dealt with in 'generalities,' is that every 'real' problem can be represented in
terms of eigenvectors of the matrix ($M^{-1}K$ in our case) approximating the differential
operator. Only for special methods and special grids do the eigenvectors take the simple
form of Fourier modes—or any other simple 'analytic' form.
Thus, whereas the quadratic element displays 'modes' that differ even more from
simple trigonometric functions, these modes still qualify for (i) representing an arbitrary
function via linear combinations, and (ii) modal analysis that asks 'How well is each
mode propagated with respect to the ideal?' That is, how close is the phase velocity of
each mode to the fluid velocity, u, as we let n range through the spectrum? The answer
is this: quads do an excellent job of modal translation/advection relative to linears (and
to many very high-order FDM's) because they transport a larger fraction of their modes
at much closer to the proper velocity—as we shall see. (The errant literature referred to
above has interpreted their extraneous roots as modes that move upstream—against the
flow—a totally fallacious conclusion.)
A hint of their behavior is shown in Figure 2.6-10, which should be compared with
Figure 2.6-3 and which shows graphs of (2.6-36) and (2.6-40)—using the + sign only
(because we are plotting only the lower half of the spectrum, recall). The GFEM version
(consistent mass) hugs the exact solution line for a very large fraction of its
spectrum. Lumped mass quads, on the other hand, are (slightly) less accurate than consistent
mass linears.
Fig. 2.6-10 Eigenvalues for quadratic elements on pure advection ($\omega_n/u$ vs $n$; curves: analytical, $2\pi n$, and consistent mass).
If $d\omega/dk = 0$ is used to separate short from long waves for CM quads,
which definition we strongly recommend in general, then almost three-quarters of the
waves/modes fall into the long (easy) category (vs two-thirds for GFEM linears and
one-half for LM linears).
Again, a few eigenvector pictures may be useful, although we will again be somewhat
misled using quads, this time because of what could be called laziness; i.e., we use
a 'conventional' plotting package that connects discrete data points with straight line
segments rather than the proper piecewise-parabolas. The key difference, though, will
be in the '$\beta$-factor' for the center nodes; e.g., the $2\Delta x$ wave oscillates between $+1$
and $-1/2$ rather than between $\pm 1$, and should look like a sequence of parabolas, as
shown, in part, in Figure 2.6-11. This is the mode that modulates the lower modes
to 'generate' the corresponding higher modes. Figure 2.6-12 shows $n = 1$ (dashed) and
$n = 39$ (solid), corresponding to Figure 2.6-4 for linears, in which the $\beta$-factor is again
easily recognized.
We will skip $n = 2$, 38, and 20, and conclude with the analog of Figures 2.6-6 and
2.6-7: $n = 10$, 30, shown in Figures 2.6-13 and 2.6-14.
Fig. 2.6-11 The $2\Delta x$ eigenvector for quads.
Fig. 2.6-12 Eigenvectors for quadratic elements; $n = 1, 39$.
Fig. 2.6-13 Eigenvector for quadratic elements; $n = 10$.
Fig. 2.6-14 Eigenvector for quadratic elements; $n = 30$.
o Phase and group speeds. We are now ready to examine the numerical phase and
group velocities (neither of which is simply $u$, as in the continuum), for both linear
and quadratic elements, and see how well the semi-discrete systems can translate/advect
the various 'Fourier' modes. The phase speed, from (2.6-3), is just $c = |\mathbf{c}| = \omega/k$, and the
group velocity (= group speed in 1D), from (2.6-7), is just $G = d\omega/dk$, where now it will
really be seen how 'convenient' it is to regard $k$ as a continuous variable.
For linear finite elements, we thus have, from (2.6-19),
$$T_j(t) = e^{i(kx_j - \omega t)} = e^{ik(x_j - \omega t/k)} = e^{ik(x_j - ct)} \tag{2.6-45}$$
rather than $T(x_j, t) = e^{ik(x_j - ut)}$ from (2.6-16). Here $c = \omega/k$ is the phase speed of the
Fourier mode $k = k_n = 2\pi n$, obtained from (2.6-23) for GFEM and from (2.6-24) for its
LM counterpart; i.e.,
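$$c = u\,\frac{\sin\theta}{\theta}\,\frac{3}{2+\cos\theta} \tag{2.6-46}$$
and
$$c_{LM} = u\,\frac{\sin\theta}{\theta}. \tag{2.6-47}$$
The respective group velocities, $G = d\omega/dk = h\,d\omega/d\theta = d(ck)/dk = c + k\,dc/dk = c + \theta\,dc/d\theta$, are
$$G = u\,\frac{1+2\cos\theta}{2+\cos\theta}\,\frac{3}{2+\cos\theta} \tag{2.6-48}$$
and
$$G_{LM} = u\cos\theta. \tag{2.6-49}$$
For quads, the corresponding results are [from (2.6-36)]
$$c = u\,\frac{-2\sin 2\theta + \sqrt{(1-\cos 2\theta)(19-\cos 2\theta)}}{\theta\,(3-\cos 2\theta)} \tag{2.6-50}$$
for consistent mass and, for lumped mass,
$$c_{LM} = u\,\frac{-\sin 2\theta + \sqrt{17 - 16\cos 2\theta - \cos^2 2\theta}}{4\theta} \tag{2.6-51}$$
from (2.6-40), with corresponding group velocities
$$G = \frac{2u}{(3-\cos 2\theta)^2}\left[2(1-3\cos 2\theta) + \frac{\sin 2\theta\,(11+7\cos 2\theta)}{\sqrt{(1-\cos 2\theta)(19-\cos 2\theta)}}\right] \tag{2.6-52}$$
and
$$G_{LM} = \frac{u}{2}\left[\frac{(8+\cos 2\theta)\sin 2\theta}{\sqrt{17 - 16\cos 2\theta - \cos^2 2\theta}} - \cos 2\theta\right]. \tag{2.6-53}$$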
These functions are plotted in Figures 2.6-15 and 2.6-16, in which we switch to $\theta_n$
on the abscissa rather than $n$; also, we ignore the 'discrete' effects (such as $n = N/2 \Rightarrow$
$N$ is even, and $N$ should really be $N + 1$ for quads) and consider $\theta_n$ as a continuous
variable, for convenience. (The curve labeled 'Best Petrov-Galerkin' will be discussed
later.)
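The quadratic-element formulas above are easy to mistype, so a small numerical sketch may help. It evaluates (2.6-50) through (2.6-53) with $\theta = kh$ treated as a continuous variable, scans for the maximum phase speed, and evaluates the group velocity just short of the $2\Delta x$ limit. The function names `c_quad` and `g_quad`, the scan resolution, and the offset `1e-3` are illustrative choices, not part of the text.

```python
import math

def c_quad(theta, u=1.0, lumped=False):
    """Phase speed of quadratic elements: (2.6-50), or (2.6-51) if lumped."""
    s, c = math.sin(2.0 * theta), math.cos(2.0 * theta)
    if lumped:
        return u * (-s + math.sqrt(17.0 - 16.0 * c - c * c)) / (4.0 * theta)
    return u * (-2.0 * s + math.sqrt((1.0 - c) * (19.0 - c))) / (theta * (3.0 - c))

def g_quad(theta, u=1.0, lumped=False):
    """Group velocity of quadratic elements: (2.6-52), or (2.6-53) if lumped."""
    s, c = math.sin(2.0 * theta), math.cos(2.0 * theta)
    if lumped:
        return 0.5 * u * ((8.0 + c) * s / math.sqrt(17.0 - 16.0 * c - c * c) - c)
    root = math.sqrt((1.0 - c) * (19.0 - c))
    return 2.0 * u / (3.0 - c) ** 2 * (2.0 * (1.0 - 3.0 * c) + s * (11.0 + 7.0 * c) / root)

# Scan 0 < theta < pi (endpoints excluded: the formulas are 0/0 there).
thetas = [math.pi * i / 2000.0 for i in range(1, 2000)]
theta_max = max(thetas, key=c_quad)   # location of the largest phase lead
c_max = c_quad(theta_max)
g_short = g_quad(math.pi - 1e-3)      # group velocity near the 2*dx wave

print(f"c_max/u = {c_max:.4f} at theta = {theta_max / math.pi:.3f} pi")
print(f"G/u near theta = pi: {g_short:.3f}")
```

The scan is only a grid search, so the reported location of the maximum is accurate to the grid spacing; the finite-difference check $G = d(\theta c)/d\theta$ is a convenient way to confirm that a transcribed pair of formulas is self-consistent.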
Remarks:
(1) The phase speed for CVFEM, from (2.5-6), is easily found to be [replacing
(1 4 1)/6 by (1 6 1)/8 in the mass matrix 'translates' to: replace $(2 + \cos\theta)/3$ by
$(3 + \cos\theta)/4$ in the phase speed equation] $c = u\,(\sin\theta/\theta) \cdot 4/(3 + \cos\theta)$, a result also
shown in Figure 2.6-15. $G = u \cdot 4/(3 + \cos\theta) \cdot (1 + 3\cos\theta)/(3 + \cos\theta)$ is shown
in Figure 2.6-16. CVFEM is not nearly as accurate as GFEM with linear basis
functions.
Fig. 2.6-15 Phase speed for several elements ($c/u$ vs $\theta$; curves: Quadratic, Linear, Lumped Quad., Lumped Linear, CVFEM, Best Petrov-Galerkin).
Fig. 2.6-16 Group velocity for several elements ($G/u$ vs $\theta$; curves: Quadratic, Linear, Lumped Quadratic, Lumped Linear, CVFEM).
(2) In fact, however, virtually all FVM users lump their mass, giving up what
little gain they could have. The curve labeled 'Lumped Linear' describes
their scheme.
(3) The phase speed of quadratic CVFEM [see (2.5-11) through (2.5-15)] would
lie just below that of linear GFEM (not plotted, not worth it).
Recall: the phase speed shows the modal speed—mode n moves at speed cn, and G
gives the velocity of a wave group—a concept that may thus far seem somewhat vague,
but soon we shall make it clear (hopefully) via examples, including some with negative
group speed.
Additional Remarks:
(1) Lumped linears (FDM) are not very accurate.
(2) All except quads show a lagging phase speed ($c < u$), whereas quads show a slight
phase lead for a good fraction of the spectrum ($c_{max} = 1.0067u$ at $n = 0.24N$; $\theta =
0.48\pi$; $\lambda \approx 4\Delta x$). All modes move in the proper direction, in spite of what some
have said about quads.
(3) The higher the phase accuracy is for 'long' waves ($G > 0$), the larger is the
(negative) short-wave group velocity. This translates in practice to: an accurate scheme
will track most of the waves very well ($c \approx u$), but those few (short) waves that
it cannot track well will cause fast upstream-moving wiggles. [By 'waves' here,
we mean of course the projection of the initial data onto the eigenvectors; i.e., we
imagine (and could do, at higher cost) the problem is being solved via actual
eigenvector expansion. See below.] Thus, for example, the group velocity of $-5u$ for the
$2\Delta x$ wave with quads is actually a blessing in disguise, in some sense, relative to $-u$
for lumped linears. And by 'upstream-moving,' we mean that the appropriate linear
combination of all modes, each of which is moving to the right, causes the resulting
total wave form to display leftward-moving wiggles; this is group velocity (soon
to be shown).
(4) CVFEM is in between LM and CM for the linear GFEM; while it may conserve
locally better than GFEM, that conservation is obtained at the cost of not transporting
the solution as accurately. When LM is employed for CVFEM (and it seems to be
always the case), then it does not even conserve locally, in spite of what its adherents
assert.
(5) Another interpretation is this: quads are especially good for smooth data.
Let us now return to the eigenmode pictures in Figures 2.6-4 to 2.6-8, and imagine
first that a single linear basis function eigenvector is placed on the grid as initial data (a
portion of modal analysis/eigenvector expansion analysis). Setting $u = 1$ for convenience,
the $n = 1$ mode will translate at a speed given by (2.6-46) for $\theta = 2\pi n/N = \pi/40$ of
$c_1 = 0.99999979$, unless the mass is lumped, in which case the speed is given by (2.6-47)
and is only $c_1^{LM} = 0.9989723$. For $n = 2$, the speeds are reduced to $c_2 = 0.9999966$
and $c_2^{LM} = 0.99589274$. In contrast, the short-wave 'sisters' will be much lazier in
their forward motion. For $n = 39$, we have $\theta = 39\pi/40$ to give $c_{39} = 0.0766$ and
$c_{39}^{LM} = 0.0256$. For the second mode, $n = 38$, we obtain $c_{38} = 0.1553$ and $c_{38}^{LM} = 0.0524$.
Finally, the associated group velocities, from (2.6-48) and (2.6-49), are (even though a
single mode follows the phase speed and not the group speed; there is no 'group' for
a pure/monochromatic wave, more later) $G_1 = 0.9999989$, $G_1^{LM} = 0.9969173$, $G_2 =
0.999983$, $G_2^{LM} = 0.987688$, $G_{39} = -2.963$, $G_{39}^{LM} = -0.997$, $G_{38} = -2.855$, and $G_{38}^{LM} =
-0.987$. For quads, we let the reader do the work. The gist of all of this is this: an arbitrary
IC will behave as some linear combination of all 80 modes and will only be accurate
if the amplitude coefficients of the higher frequency modes are very small, which will
normally only be the case when the IC is 'sufficiently smooth', the definition of which
varies with $N$ (rougher data require larger $N$ for fixed 'error'). Demonstrations will be
made soon.
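The modal speeds quoted above can be reproduced directly from (2.6-46) through (2.6-49). The following is a minimal sketch, assuming $N = 80$ and $u = 1$ as in the text; the function name `linear_speeds` and the dictionary keys are illustrative only.

```python
import math

def linear_speeds(n, N=80, u=1.0):
    """Phase and group speeds of mode n for linear elements on N uniform nodes.

    Uses theta = 2*pi*n/N with (2.6-46)/(2.6-48) for consistent mass and
    (2.6-47)/(2.6-49) for lumped mass.
    """
    theta = 2.0 * math.pi * n / N
    ratio = math.sin(theta) / theta
    mass = 3.0 / (2.0 + math.cos(theta))   # consistent-mass factor
    return {
        "c": u * ratio * mass,
        "c_lm": u * ratio,
        "G": u * (1.0 + 2.0 * math.cos(theta)) / (2.0 + math.cos(theta)) * mass,
        "G_lm": u * math.cos(theta),
    }

for n in (1, 2, 38, 39):
    s = linear_speeds(n)
    print(f"n={n:2d}  c={s['c']:.8f}  c_LM={s['c_lm']:.8f}  "
          f"G={s['G']:.7f}  G_LM={s['G_lm']:.7f}")
```

Printing the long-wave and short-wave pairs side by side makes the asymmetry plain: the long waves are nearly exact in both speeds, while their short-wave partners crawl forward in phase and run backward as a group.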
o Finite difference comparison. To compare some of these results with FDM, as an
'aside', we go to Vichnevetsky and Bowles (1982)—a short book that should be required
reading for anyone interested in understanding more on this interesting subject (as also
should Trefethen, 1982, and Vichnevetsky, 1987)—and plot a few of 'their' finite
difference schemes alongside our finite elements. In Figures 2.6-17, 2.6-18, and 2.6-19, we
compare frequency, phase speed, and group velocity for 2nd-, 4th-, 6th-, 12th-, 18th-, and
24th-order centered FDM's, along with linear and quadratic FEM's, the FDM frequency
equation being
$$\frac{\omega h}{u} = \sin\theta\left[1 + \sum_{j=1}^{K}\frac{(j!)^2\,(2\sin\theta/2)^{2j}}{(2j+1)!}\right] \tag{2.6-54}$$
for $K = 0, 1, \ldots$, where $2(K + 1)$ is the 'order' of the FDM. For variety, we also plot
phase-speed error ($1 - c/u$) in Figure 2.6-18(b) as a function of the number of points per
wave ($\lambda/\Delta x$). The type of plot in Figure 2.6-18(c) is useful for answering questions like:
How many points per wave are required to attain 2% phase-speed error?
These figures seem to say that GFEM with linear elements is a good competitor to
8th-order FDM and that it takes a 24th-order FDM to beat quads. Asymptotically, of
course, this is not true. Both linears and quads show a 4th-order convergence rate
(superconvergence for linears) as $h \to 0$. But in practice ('in the real world'), unfortunately, we
are almost never lucky enough to operate in this limit. Thus, the phase and group speeds
over the full range tell the story better.
Nevertheless, in Table 2.6-1 we list some asymptotic results ($kh \to 0$) for most of the
schemes pictured above, for completeness, wherein the following formulas describe the
FDM cases, with thanks to B. Fornberg:
$$\frac{c}{u} = 1 - \left(\frac{\theta}{2}\right)^{n}\prod_{j=2(2)}^{n}\frac{j}{j+1} \tag{2.6-55}$$
Fig. 2.6-17 Eigenvalues of some FDM's and FEM's ($\omega h/u$ vs $\theta$; curves: analytical; 2nd-, 4th-, 6th-, 12th-, 18th-, and 24th-order FDM; quadratic; linear).
Fig. 2.6-18 Phase speeds corresponding to Figure 2.6-17. (a) Phase speeds ($c/u$ vs $\theta$); (b) phase speed vs $\lambda/h$ (wave length/grid spacing); (c) phase speed error ($1 - c/u$ vs wave length/grid spacing).
Fig. 2.6-19 Group velocities ($G/u$ vs $\theta$; same schemes as Figure 2.6-17).
Table 2.6-1 Asymptotic phase speed for several discretizations.
Method (θ = kh)           c/u                     G/u
Lumped linears (= FDM)    1 - θ^2/6               1 - θ^2/2
Linear GFEM               1 - θ^4/180             1 - θ^4/36
4th-order FDM             1 - θ^4/30              1 - θ^4/6
6th-order FDM             1 - θ^6/140             1 - θ^6/20
12th-order FDM            1 - θ^12/12,012         1 - θ^12/924
18th-order FDM            1 - θ^18/923,780        1 - θ^18/48,620
Lumped quads              1 - θ^4/270             1 - θ^4/54
Quadratic GFEM            1 + θ^4/270             1 + θ^4/54
24th-order FDM            1 - θ^24/67,603,900     1 - θ^24/2,704,156
and
$$\frac{G}{u} = 1 - (n+1)\left(\frac{\theta}{2}\right)^{n}\prod_{j=2(2)}^{n}\frac{j}{j+1}, \tag{2.6-56}$$
where $n = 2(K + 1)$ is the order of the scheme.
A more useful comparison is framed with the following question: How many points
per wave are required for a 1% (5%) error in the phase speed? Table 2.6-2 answers this
($\lambda/h = \lambda k/kh = 2\pi/\theta$).
Remarks:
(1) A result shown in Table 7 of Swartz and Wendroff (1974) is reasonably compatible
with our results: for a phase error of 0.01 per period, the implicit quadratic FEM
via TR costs about the same as the explicit 8th-order centered FDM via leapfrog.
(We shall discuss the ODE methods called 'TR' and 'leapfrog' in Section 2.7.)
(2) For additional very-high-order FDM results, and pseudo-spectral results, see the
recent book by Fornberg (1996).
Table 2.6-2 Resolution required (λ/h) to get 1% (5%) phase speed error for several discretizations.
Method                    1% error    5% error
Lumped linears (= FDM)    26          11.3
4th-order FDM             8.31        5.49
6th-order FDM             5.70        4.25
Linear GFEM               5.61        3.93
12th-order FDM            3.82        3.21
18th-order FDM            3.29        2.89
Quadratic GFEM            3.19        2.85
24th-order FDM            3.03        2.72
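The points-per-wave question can be answered numerically from the dispersion relations alone. The sketch below is one possible way to do it, not the procedure used to build the table: it evaluates the FDM frequency (2.6-54), the linear GFEM phase speed (2.6-46), and the consistent-mass quadratic phase speed (2.6-50), and bisects on $\theta$ for the point where $|1 - c/u|$ first reaches the target error. The helper names and the bracketing interval are assumptions; the bisection presumes the error stays below the target for all smaller $\theta$, which holds for the schemes and error levels used here.

```python
import math

def fdm_frequency(theta, K):
    """Dimensionless frequency omega*h/u of the centered FDM of order 2(K+1), per (2.6-54)."""
    series = 1.0
    for j in range(1, K + 1):
        series += (math.factorial(j) ** 2 * (2.0 * math.sin(theta / 2.0)) ** (2 * j)
                   / math.factorial(2 * j + 1))
    return math.sin(theta) * series

def c_fdm(K):
    """Return c/u as a function of theta for the FDM of order 2(K+1)."""
    return lambda theta: fdm_frequency(theta, K) / theta

def c_linear(theta):
    """c/u for linear GFEM (consistent mass), (2.6-46)."""
    return math.sin(theta) / theta * 3.0 / (2.0 + math.cos(theta))

def c_quadratic(theta):
    """c/u for quadratic GFEM (consistent mass), (2.6-50)."""
    s, c = math.sin(2.0 * theta), math.cos(2.0 * theta)
    return (-2.0 * s + math.sqrt((1.0 - c) * (19.0 - c))) / (theta * (3.0 - c))

def points_per_wave(c_over_u, error, lo=1e-3, hi=math.pi):
    """Bisect for the theta at which |1 - c/u| reaches `error`; return lambda/h = 2*pi/theta."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if abs(1.0 - c_over_u(mid)) < error:
            lo = mid      # still accurate enough: the crossing lies at larger theta
        else:
            hi = mid
    return 2.0 * math.pi / lo

schemes = [("Lumped linears (2nd-order FDM)", c_fdm(0)),
           ("4th-order FDM", c_fdm(1)),
           ("6th-order FDM", c_fdm(2)),
           ("Linear GFEM", c_linear),
           ("Quadratic GFEM", c_quadratic)]
for name, c in schemes:
    print(f"{name:32s} 1%: {points_per_wave(c, 0.01):6.2f}   5%: {points_per_wave(c, 0.05):6.2f}")
```

The computed resolutions should land close to the entries of Table 2.6-2; small differences in the last digit are possible, since the exact criterion used for the printed table is not stated.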
o Reduced quadrature. To wrap up this 1D analytical discussion, we move away from
GFEM and examine two more (besides lumped-mass) non-Galerkin FEM's. The first is
reduced quadrature, and the second goes under the name of SUPG (originally, Streamline
Upwinding Petrov-Galerkin, but see Hughes (1988) for an etymological discussion). The
former is interesting because, with linear elements, it generates a finite element derivation
of a well-known finite difference method called Keller's (1971) box scheme, and the latter
because it is a dissipative scheme that is called 'simple first-order upwinding' in the finite
difference literature. (The true 'power' of SUPG is revealed in multi-dimensions—it has
no harmful 'crosswind diffusion.')
The one-point quadrature analog of (2.6-17), available for example by appropriately
simplifying (2.5-1), is
$$\frac{1}{4}\left(\dot T_{j-1} + 2\dot T_j + \dot T_{j+1}\right) + \frac{u}{2h}\left(T_{j+1} - T_{j-1}\right) = 0, \quad j = 1, 2, \ldots, N, \tag{2.6-57}$$
in which only the mass matrix is changed. This is Keller's box scheme, and it is
interesting to note that it is also a finite element scheme, although non-Galerkin, obtained via
inaccurate quadrature on the mass matrix. Repeating the analysis following (2.6-18), it is
immediate to obtain the one-point quadrature frequency equation
$$\omega = uk\,\frac{\sin\theta}{\theta}\,\frac{2}{1+\cos\theta} = \frac{2uk}{\theta}\tan(\theta/2), \tag{2.6-58}$$
which leads to
$$c = u\,\frac{\sin\theta}{\theta}\,\frac{2}{1+\cos\theta} = \frac{2u}{\theta}\tan(\theta/2) \tag{2.6-59}$$
and
$$G = 2u/(1+\cos\theta). \tag{2.6-60}$$
This second-order accurate approximation displays a quite different error behavior than
those seen heretofore. In particular, it exhibits a leading error in both phase and group
velocities, with both $c$ and $G$ approaching $\infty$(!) as $\theta \to \pi$, even though the actual $2\Delta x$
wave ($\theta = \pi$) is stationary; i.e., the functions are not continuous at $\theta = \pi$. See also
Vichnevetsky and Bowles (1982) for further discussion of this method, and for methods
'in between.'
The box scheme above was obtained via reduced quadrature and led to a scheme with a
leading phase error. It is interesting and worth pointing out that the same sort of behavior
(leading phase error) obtains with quadratic basis functions and, like linears, it also
carries over to 2D and 3D. If a two-point quadrature rule is used on the 'quadratic' GFEM
equations, then the mass matrix (only) changes (GFEM requires three-point Gaussian
quadrature on $M$); it becomes
$$M = \frac{l}{18}\begin{pmatrix} 2 & 2 & -1 \\ 2 & 8 & 2 \\ -1 & 2 & 2 \end{pmatrix}, \tag{2.6-61}$$
where $l = 2h$ is the element length, which leads to the following advection equations
[cf. (2.6-26) and (2.6-27)]:
$$\frac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + u\,\frac{T_{j+1} - T_{j-1}}{2h} = 0 \tag{2.6-62}$$
for center nodes (like linear elements!) and
$$\frac{1}{6}\left(-\dot T_{j-2} + 2\dot T_{j-1} + 4\dot T_j + 2\dot T_{j+1} - \dot T_{j+2}\right) + 2u\,\frac{T_{j+1} - T_{j-1}}{2h} - u\,\frac{T_{j+2} - T_{j-2}}{4h} = 0 \tag{2.6-63}$$
for edge nodes. The analogous phase speed/dispersion analysis leads to
$$c = u\,\frac{-3\sin 2\theta + \sqrt{3(7-\cos 2\theta)(1-\cos 2\theta)}}{2\theta\,(1-\cos 2\theta)} \tag{2.6-64}$$
for the phase speed. The associated group velocity is
$$G = u\,\frac{3\sqrt{3(7 - 8\cos 2\theta + \cos^2 2\theta)} - 9\sin 2\theta}{(1-\cos 2\theta)\sqrt{3(7 - 8\cos 2\theta + \cos^2 2\theta)}}, \tag{2.6-65}$$
wherein it is clear that, like linears, both go to $\infty$ at $\theta = \pi$. Finally, the midside node
coefficient, $\beta$, is [cf. (2.6-39)]
$$\beta_n = \frac{1}{2}\left(\frac{3u\sin\theta_n}{c_n\theta_n} - \cos\theta_n\right), \tag{2.6-66}$$
which again varies from 1 for $\theta_n \to 0$ to 1/2 for $\theta_n \to \pi$.
Figure 2.6-20 shows the phase and group speeds for the reduced quadrature
approximations, whose leading phase and group error we shall later demonstrate in 2D
(Section 2.6.3b), but the 'message' is already clear: use full quadrature/honest GFEM.
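The reduced-quadrature mass matrix (2.6-61) can be checked by direct numerical integration of the quadratic shape functions. The sketch below assumes the usual three-node Lagrange basis on the reference element $-1 \le \xi \le 1$ (nodes at $\xi = -1, 0, 1$) and standard Gauss-Legendre points and weights; the helper names are illustrative. For comparison it also forms the three-point result, which is the well-known consistent mass matrix $(l/30)[4\ 2\ {-1};\ 2\ 16\ 2;\ {-1}\ 2\ 4]$.

```python
import math

def quad_shape(xi):
    """Quadratic Lagrange shape functions at xi in [-1, 1]: (left edge, midside, right edge)."""
    return (0.5 * xi * (xi - 1.0), 1.0 - xi * xi, 0.5 * xi * (xi + 1.0))

def mass_matrix(points, weights, length=1.0):
    """Element mass matrix M_ab = int phi_a phi_b dx by the given quadrature rule."""
    M = [[0.0] * 3 for _ in range(3)]
    for xi, w in zip(points, weights):
        phi = quad_shape(xi)
        for a in range(3):
            for b in range(3):
                M[a][b] += 0.5 * length * w * phi[a] * phi[b]   # dx = (l/2) d(xi)
    return M

g2 = 1.0 / math.sqrt(3.0)
M_reduced = mass_matrix((-g2, g2), (1.0, 1.0))                         # two-point Gauss
g3 = math.sqrt(0.6)
M_full = mass_matrix((-g3, 0.0, g3), (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0))  # three-point Gauss

print("18 M / l (two-point):  ", [[round(18 * v, 12) for v in row] for row in M_reduced])
print("30 M / l (three-point):", [[round(30 * v, 12) for v in row] for row in M_full])
```

Only the mass matrix is affected by the change of rule here; the advection matrix is not recomputed in this sketch.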
o Upwind, SUPG. Before examining SUPG, we first introduce and study its 'predecessor,'
pure upwinding; i.e.,
$$\frac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + \frac{u}{h}\left(T_j - T_{j-1}\right) = 0. \tag{2.6-67}$$
The 'usual' procedure now leads to an unusual result; i.e., the use of (2.6-19) in (2.6-67)
results in the following analog of (2.6-22):
$$\frac{-i\omega}{6}\left(e^{-i\theta} + 4 + e^{i\theta}\right) + \frac{u}{h}\left(1 - e^{-i\theta}\right) = 0, \tag{2.6-68}$$
Fig. 2.6-20 Reduced quadrature phase speed and group velocity (curves: group and phase speed for quadratic and linear elements, vs $\theta$).
which has the solution
$$\omega = \frac{u}{h}\left[\sin\theta - i(1-\cos\theta)\right]\frac{3}{2+\cos\theta}; \tag{2.6-69}$$
an imaginary part has been tacked onto the GFEM frequency. This has the effect of
changing the solution, (2.6-20), from the GFEM result:
$$T_j(t) = e^{ik(x_j - ct)}, \tag{2.6-70}$$
where $c = u(\sin\theta/\theta)(3/(2 + \cos\theta))$, to
$$T_j(t) = e^{-[ut(1-\cos\theta)/h][3/(2+\cos\theta)]} \cdot e^{ik(x_j - ct)} \tag{2.6-71}$$
with the same $c$. Thus we now also have dissipation (numerical damping) of the
traveling and dispersing [$c = c(k)$] wave. In the 'limit' $\theta \ll 1$, this becomes
$$T_j(t) = e^{-k^2hut/2} \cdot e^{ik(x_j - ct)}, \tag{2.6-72}$$
which actually looks like a solution to the full advection-diffusion equation with
diffusivity
$$\kappa = uh/2, \tag{2.6-73}$$
because if (2.6-13) is used as initial data for the transient heat equation, then it follows
easily that the analytic solution is just $T(x, t) = e^{-k^2\kappa t} \cdot e^{ikx}$. Thus, the upwinded advection
equation reacts as if it were solving the parabolic advection-diffusion equation, not the
hyperbolic advection equation. On the other hand, $(1 - \cos\theta)$ is a monotone function
between $\theta = 0$ and $\theta = \pi$; for $\theta = \pi$, the damping is a maximum. The shorter the wave, the
stronger the damping, with the 'deleterious' $2\Delta x$ being damped faster than any others. In
fact, this [in the LM mode, in which the $3/(2 + \cos\theta)$ factor is omitted] is the world's most
famous wiggle suppressor; it has the property of always suppressing wiggles, regardless
of the initial data, continuous or not, such as a step function or even a discrete Dirac
delta function ($T_j = 1/h$ at node $j$, zero at all others), both of which 'excite' the entire
spectrum quite significantly. In fact, the exact (LM version) solution of a related delta
function problem above is available (Wurtele, 1961): the solution of the lumped mass
(FDM) version of (2.6-67) is the Poisson distribution, $T_j(t) = e^{-\tau}\tau^j/j!$ for $j \ge 0$ and
$T_j(t) = 0$ for $j < 0$, where $T_j(0) = 1$ for $j = 0$ and 0 for $j \ne 0$, and $\tau = ut/h$. (Unlike
the true delta function solution, Wurtele's solution goes to zero as $h \to 0$. For further
discussion of the true delta function, see the end of Section 2.7.4.) For further discussion
of the upwinded AD equation, see Section 2.6.2b.
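Wurtele's Poisson-distribution solution is simple to confirm numerically. The sketch below integrates the lumped-mass version of (2.6-67), $\dot T_j = -(u/h)(T_j - T_{j-1})$, from a unit spike at $j = 0$ and compares the result with $e^{-\tau}\tau^j/j!$. The classical fourth-order Runge-Kutta integrator, the truncated (non-periodic) node range, and the step count are illustrative choices only; the ODE methods actually discussed in the text are deferred to Section 2.7.

```python
import math

def rhs(T, u_over_h):
    """Lumped-mass upwind ODEs: dT_j/dt = -(u/h)(T_j - T_{j-1}), with T_{-1} = 0 upstream."""
    return [-u_over_h * (T[j] - (T[j - 1] if j > 0 else 0.0)) for j in range(len(T))]

def integrate_rk4(T, u_over_h, t_end, steps):
    """Classical RK4 time integration of the semi-discrete system (illustrative choice)."""
    dt = t_end / steps
    for _ in range(steps):
        k1 = rhs(T, u_over_h)
        k2 = rhs([a + 0.5 * dt * b for a, b in zip(T, k1)], u_over_h)
        k3 = rhs([a + 0.5 * dt * b for a, b in zip(T, k2)], u_over_h)
        k4 = rhs([a + dt * b for a, b in zip(T, k3)], u_over_h)
        T = [a + dt * (b + 2.0 * c + 2.0 * d + e) / 6.0
             for a, b, c, d, e in zip(T, k1, k2, k3, k4)]
    return T

J = 30                       # nodes j = 0 .. J-1 (downstream truncation is harmless for upwinding)
tau = 5.0                    # tau = u*t/h, with u/h = 1 here
T0 = [1.0] + [0.0] * (J - 1)
T = integrate_rk4(T0, 1.0, tau, 2000)
poisson = [math.exp(-tau) * tau ** j / math.factorial(j) for j in range(J)]
max_diff = max(abs(a - b) for a, b in zip(T, poisson))
print(f"max |T_j - Poisson| = {max_diff:.2e},  min T_j = {min(T):.2e}")
```

The absence of negative nodal values in the computed solution is the 'wiggle suppression' property in action: the spike is smeared out, but never overshoots.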
Turning now to SUPG, with its convenient upwinding 'parameter' (a tuning knob), we
will see that it can do better on pure advection. Following Brooks and Hughes (1982),
who followed (generalized) Raymond and Garder (1976), who followed Dendy (1974),
we shall 'tune for the best phase speed.' But first it is instructive to rewrite (2.6-67) in
the following equivalent form:
$$\frac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + u\,\frac{T_{j+1} - T_{j-1}}{2h} = \frac{uh}{2}\,\frac{T_{j-1} - 2T_j + T_{j+1}}{h^2}, \tag{2.6-74}$$
in which the numerical diffusion coefficient (2.6-73) is clearly displayed. The SUPG,
though, is better than the above 'simple' upwinded approximation because the effect of
the weighting function ('Petrov-type') also 'shows up' in the (unsymmetric) mass matrix
for approximating (2.6-9); it is given by
$$\frac{1}{6}\left[(1+3\beta)\dot T_{j-1} + 4\dot T_j + (1-3\beta)\dot T_{j+1}\right] + u\,\frac{T_{j+1} - T_{j-1}}{2h} = \beta uh\,(T_{j-1} - 2T_j + T_{j+1})/h^2, \tag{2.6-75}$$
where $\beta$ is an upwinding parameter. Whereas $\beta = 1/2$ is the recommended value for
steady-state simulations at large values of the grid Peclet number ($P = uh/2\kappa$), which then
gives simple upwinding as shown above, the choice $\beta = 1/\sqrt{15} = 0.26$ (Raymond and
Garder, 1976) minimizes the phase speed error, at least for long waves ($kh \ll 1$), which
we shall demonstrate. Proceeding as before, the analog of (2.6-22) is
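$$\frac{-i\omega}{6}\left[(1+3\beta)e^{-i\theta} + 4 + (1-3\beta)e^{i\theta}\right] + u\left(e^{i\theta} - e^{-i\theta}\right)/2h = \beta uh\left(e^{-i\theta} - 2 + e^{i\theta}\right)/h^2, \tag{2.6-76}$$
which leads to the solution
$$T_j(t) = e^{-k^2Kt} \cdot e^{ik(x_j - ct)}, \tag{2.6-77}$$
where
$$K = \frac{\beta u}{3k^2h}\,\frac{(1-\cos\theta)^2}{\left(\frac{2+\cos\theta}{3}\right)^2 + \beta^2\sin^2\theta} \tag{2.6-78}$$
is the artificial/numerical diffusion coefficient, and
$$c = u\,\frac{\sin\theta}{\theta}\,\frac{\frac{2+\cos\theta}{3} + 2\beta^2(1-\cos\theta)}{\left(\frac{2+\cos\theta}{3}\right)^2 + \beta^2\sin^2\theta} \tag{2.6-79}$$
is the phase speed. For $\theta \to 0$, the phase speed is
$$c = u\left[1 - \left(\frac{1}{180} - \frac{\beta^2}{12}\right)\theta^4 + O(\theta^6)\right], \tag{2.6-80}$$
which is clearly a sixth-order accurate result (i.e., for long waves) if the upwind parameter,
$\beta$, is taken to be $1/\sqrt{15}$. For this choice, we have
$$c = u\,\frac{\sin\theta}{\theta}\,\frac{9(4+\cos\theta)}{23 + 20\cos\theta + 2\cos^2\theta} \tag{2.6-81}$$
and
$$K = \frac{\sqrt{15}\,u}{k^2h}\,\frac{(1-\cos\theta)^2}{23 + 20\cos\theta + 2\cos^2\theta}, \tag{2.6-82}$$
which gives, for $\theta \to 0$,
$$K = \frac{\sqrt{15}\,u}{180\,k^2h}\,\theta^4 + O(\theta^6) = \frac{\sqrt{15}}{180}\,uk^2h^3 + O(h^5), \tag{2.6-83}$$
which is actually a higher-order diffusive term; SUPG is a big improvement over simple
upwinding.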
The phase speed for this accurate advection method is plotted in Figure 2.6-15 where
it is rather interesting that it nearly overlaps the curve for quadratic basis functions. This
suggests that, in practice, the 'best' linear and the standard quadratic would perform quite
similarly—and very well, although only the latter has zero numerical diffusion.
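The SUPG dispersion and damping results can be cross-checked without any algebra by solving the Fourier-transformed semi-discrete equation for the complex frequency and comparing with the closed forms for $\beta = 1/\sqrt{15}$. This is a sketch under the assumption that the symbol equation is the one written in (2.6-76); the function names are illustrative, and the frequency is returned in the dimensionless form $\Omega = \omega h/u$, so that $c/u = \mathrm{Re}\,\Omega/\theta$ and the damping rate is $k^2K\,h/u = -\mathrm{Im}\,\Omega$.

```python
import cmath
import math

def supg_frequency(theta, beta):
    """Dimensionless complex frequency Omega = omega*h/u solving (2.6-76)."""
    mass = ((1 + 3 * beta) * cmath.exp(-1j * theta) + 4
            + (1 - 3 * beta) * cmath.exp(1j * theta)) / 6
    advection = (cmath.exp(1j * theta) - cmath.exp(-1j * theta)) / 2
    diffusion = beta * (cmath.exp(-1j * theta) - 2 + cmath.exp(1j * theta))
    # (2.6-76) times h/u reads: -i*Omega*mass + advection = diffusion
    return (advection - diffusion) / (1j * mass)

def closed_form(theta):
    """c/u from (2.6-81) and k^2*K*h/u from (2.6-82), both for beta = 1/sqrt(15)."""
    den = 23 + 20 * math.cos(theta) + 2 * math.cos(theta) ** 2
    c = math.sin(theta) / theta * 9 * (4 + math.cos(theta)) / den
    damping = math.sqrt(15) * (1 - math.cos(theta)) ** 2 / den
    return c, damping

beta_best = 1 / math.sqrt(15)
for theta in (0.3, 1.0, 2.5):
    omega = supg_frequency(theta, beta_best)
    c, damping = closed_form(theta)
    print(f"theta={theta}: c/u = {omega.real / theta:.8f} (closed form {c:.8f}), "
          f"damping = {-omega.imag:.6e} (closed form {damping:.6e})")
```

Setting `beta = 0` in `supg_frequency` recovers the undamped Galerkin frequency, which makes it easy to compare the long-wave phase error of the two schemes at the same $\theta$.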
o Fourier analysis, eigenvector expansion for linear elements. Next, we show some sample
results, mostly in the form of (a sampling of) popular tutorial test problems, which we
hope the reader will find useful. Before presenting any numerical results, however, we
introduce more quantitatively the notion of easy vs difficult problems in the context of
Fourier analysis. Although the Fourier transforms that we are about to introduce are only
strictly valid for a single copy of the waveform on an $\infty$-span, they are nevertheless
useful for our periodic case, at least if the given function goes to zero 'sufficiently fast.'
Figure 2.6-21 shows three types of waveforms (e.g., initial data) that can span the
range from 'easy' to difficult. The Gaussian is easy in that its Fourier transform, shown
normalized in the figure, is also a Gaussian (in wave number) and thus rapidly goes to
zero as $\sigma k$ increases; i.e.,
$$F_1(k\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-ikx}\,e^{-x^2/2\sigma^2}\,dx = e^{-\sigma^2k^2/2}, \tag{2.6-84}$$
which has virtually zero response for $\sigma k > 5$ or so ($e^{-12.5} = 4 \times 10^{-6}$). Translation:
'A fine mesh should not be required.' Moving next to the other end of the scale, the
square wave is very difficult; the discontinuities cause the excitation of many high wave
numbers. The normalized spectral amplitude (transform) in this case is given by
$$F_2(kl) = \frac{1}{2l}\int_{-\infty}^{\infty} f(x)\,e^{ikx}\,dx = (\sin kl)/kl, \tag{2.6-85}$$
where $f(x)$ describes the step function (of unit amplitude). Noteworthy here is the slow
drop-off with increasing $kl$; the $n$-th local maximum or minimum is given,
approximately, by $F_2(kl = [(2n + 1)/2]\pi) = \pm 2/\pi(2n + 1)$ for $n = 1, 2, \ldots$; i.e., it decays like
$1/n\pi$, which is not fast enough, as we shall demonstrate. A (very) fine mesh is required.
Finally, the intermediate case of a waveform given by the triangle has the intermediate
response curve given by
$$F_3(kl) = \frac{1}{l}\int_{-\infty}^{\infty} g(x)\,e^{ikx}\,dx = 2(1 - \cos kl)/k^2l^2, \tag{2.6-86}$$
where $g(x)$ defines the triangle with unit amplitude and base $2l$.
Fig. 2.6-21 Normalized Fourier spectrum of several waveforms (abscissa: dimensionless wavenumber, $kl$ or $k\sigma$).
Consider now the problem of advecting these three waves across the periodic unit
span. Let us begin by selecting a grid for the Gaussian (guided by the figure) such that
$\sigma k_{max} = 5$, which is achieved if $\sigma = 1.6h$, so this is what we choose first; i.e., $\sigma k_{max} = 5 =
\sigma \cdot 2\pi/\lambda_{min} = 2\pi\sigma/2h \Rightarrow \sigma = 5h/\pi = 1.6h$. Later, we will also compare some methods
on a well-resolved Gaussian: $\sigma = 4h$ ($\sigma k_{max} = 4\pi$). For the triangle, we try $kl = 4\pi$ to
catch the first small lobe but neglect the rest (whose amplitudes decrease like $1/n^2$ per
period). Since $k_{max} = 2\pi/\lambda_{min} = \pi/h$, we get $l = 4h$ as a guess at the minimum resolution
required. Finally, for the step, we try two different 'grids'; for the first, we take a $10h$
step width ($2l$) and for the second, we take $40h$. Both grids will get to the $2\Delta x$ wave at
$k = \pi/h$ or $kl = \pi l/h$, giving $5\pi$ for the first and $20\pi$ for the second. The former will
'capture' the first four 'lobes' in Figure 2.6-21, and the latter about the first 19 (not shown
in the figure).
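The grid choices above amount to asking how much spectral amplitude is left at the shortest resolvable wave, $k_{max} = \pi/h$. The following sketch evaluates the three normalized spectra (2.6-84) through (2.6-86) and the sinc envelope at the cutoff for each of the grids mentioned; the function names are illustrative, and the envelope $1/kl$ for the step is simply the bound $|\sin kl| \le 1$.

```python
import math

def f_gauss(k_sigma):
    """Normalized Gaussian spectrum, (2.6-84)."""
    return math.exp(-0.5 * k_sigma ** 2)

def f_step(kl):
    """Normalized step (square-wave) spectrum, (2.6-85)."""
    return math.sin(kl) / kl

def f_triangle(kl):
    """Normalized triangle spectrum, (2.6-86)."""
    return 2.0 * (1.0 - math.cos(kl)) / kl ** 2

print(f"Gaussian at sigma*k = 5:          {f_gauss(5.0):.3e}")
print(f"Gaussian cutoff, sigma = 1.6h:    {f_gauss(1.6 * math.pi):.3e}")
print(f"Gaussian cutoff, sigma = 4h:      {f_gauss(4.0 * math.pi):.3e}")
print(f"Triangle envelope at kl = 4*pi:   {4.0 / (4.0 * math.pi) ** 2:.3e}")
for l_over_h in (5, 20):   # step widths 2l = 10h and 40h
    print(f"Step envelope at kl = {l_over_h}*pi:     {1.0 / (l_over_h * math.pi):.3e}")

# Local extrema of the step spectrum near kl = (2n+1)*pi/2 decay only like 1/n.
for n in (1, 2, 3):
    kl = (2 * n + 1) * math.pi / 2.0
    print(f"n={n}: F2 = {f_step(kl):+.4f}, 2/(pi(2n+1)) = {2.0 / (math.pi * (2 * n + 1)):.4f}")
```

Comparing the printed cutoff amplitudes makes the 'easy' versus 'difficult' ranking quantitative: the Gaussian is negligible at the cutoff, while the step still carries a few percent of its peak amplitude even on the finer grid.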
The analog of the continuous Fourier transform of the initial data is the amplitude
coefficients of the eigenvectors corresponding to the same data; i.e., the ($L_2$, see Appendix 3)
projection of the initial data onto the eigenmodes, which brings us to an important, and
somewhat lengthy, diversion: the exact representation of the approximate (semi-discrete)
solution in terms of the discrete eigenvectors. We have already introduced these in (2.6-21)
for linear elements and in (2.6-38) for quads. But let us 'take it from the top,' more or
less, starting with linears: we seek a solution of (2.6-17) in the form
$$T_j(t) = \sum_{m=1}^{N} a_m v_j^{(m)} e^{-i\omega_m t}, \tag{2.6-87}$$
where $v_j^{(m)} = e^{i2\pi mj/N}$ is the $j$-th element in the $m$-th eigenvector, the eigenvalue $\omega_m$
is given by (2.6-23) [(2.6-24) for LM], and the amplitude coefficients, $\{a_m\}$, are to be
determined from the initial data. Thus, at $t = 0$, we have [see (2.6-11)]
$$T_j(0) = T_0(x_j) = \sum_{m=1}^{N} a_m v_j^{(m)}, \quad j = 1, 2, \ldots, N; \tag{2.6-88}$$
an expansion of a real number (for each $j$) in terms of $N$ complex numbers. Defining
$$(u, w) = \sum_{j=1}^{N} w_j \bar u_j, \tag{2.6-89}$$
it follows that our eigenvectors are orthogonal,
$$(v^{(m)}, v^{(n)}) = N\,\delta_{mn}, \tag{2.6-90}$$
which allows us to easily evaluate $a_m$; i.e.,
$$a_m = (v^{(m)}, T_0(x_j))/(v^{(m)}, v^{(m)}) = \frac{1}{N}\sum_{j=1}^{N} T_0(x_j)\,e^{-i2\pi mj/N} = \frac{1}{N}\sum_{j=1}^{N} T_0(x_j)\,\bar v_j^{(m)}, \tag{2.6-91}$$
and we see that $a_m$ indeed is the projection of the IC onto the $m$-th eigenvector, which is
also our approximate/discrete/finite Fourier transform of $T_0(x)$. Introducing the phase
speed, from (2.6-46) and (2.6-47), the exact solution to the approximate advection
equation is
$$T^h(x, t) = \sum_{j=1}^{N} T_j(t)\,\phi_j(x) = \sum_{j=1}^{N}\left[\sum_{m=1}^{N} a_m e^{i2\pi m(j/N - c_m t)}\right]\phi_j(x), \tag{2.6-92}$$
where $a_m$ is obtained from (2.6-91) and $c_m$ from (2.6-46) or (2.6-47) with $\theta = \theta_m$. So,
in principle, we are finished. But in practice, we actually still have a long way to go,
because it is not cost-effective to actually compute the solution this way. The cost-effective
way is to invoke a numerical method for integrating (approximately) the discrete ODE's,
(2.6-17), a subject we shall turn to in the next section (Section 2.7). The above analytical
solution is presented only/mainly to increase our 'awareness level' and to help us better
understand what our ODE solutions will be trying to do. So, back to the theory. While
(2.6-92) is a full and complete representation, there are a few matters of interpretation to
deal with [e.g., for $m > N/2$, the eigenvectors are aliased to lower (resolvable) modes].
Also, we can usually omit (defer) the basis function summation and just focus on each
nodal coefficient, $T_j(t)$. To this end, then, we first rewrite the equation for $T_j(t)$ as
$$T_j(t) = \sum_{m=1}^{N/2} a_m v_j^{(m)} e^{-i2\pi mc_m t} + \sum_{m=0}^{N/2-1} a_{N-m} v_j^{(N-m)} e^{-i2\pi(N-m)c_{N-m} t}, \tag{2.6-93}$$
where obviously we are (for now) assuming that $N$ is an even number. (We shall account
for the remaining 50% of the cases later.) The reason for the rewrite is because of the
following easily verified facts:
1. $a_{N-m} = \bar a_m$;
2. $v_j^{(N-m)} = \bar v_j^{(m)}$; and
3. $(N - m)c_{N-m} = -mc_m$.
Before using this information, however, let us note two special features of (2.6-93):
1. For $m = N/2$ in the first summation, we have
$$a_{N/2}\,v_j^{(N/2)}\,e^{-i\pi Nc_{N/2}t} = (-1)^j a_{N/2} = (-1)^j\,\frac{1}{N}\sum_{l=1}^{N}(-1)^l\,T_0(x_l)$$
because $v_j^{(N/2)} = (-1)^j$ is the $2\Delta x$ mode, and its phase speed, $c_{N/2}$, is zero. $a_{N/2}$ measures
the amount of $2\Delta x$ 'noise' in the IC. (A smooth IC will obviously give a small $a_{N/2}$.)
2. For $m = 0$ in the second summation, we have
$$a_N\,v_j^{(N)}\,e^{-i2\pi Nc_N t} = a_N = \frac{1}{N}\sum_{l=1}^{N} T_0(x_l),$$
which is the average value of $T_0(x_j)$ because $v_j^{(N)} = 1$ is the constant mode, and $c_N = 0$.
Thus, we can conveniently rewrite (2.6-93) as
$$T_j(t) = \sum_{m=1}^{N/2-1}\left[a_m v_j^{(m)} e^{-i2\pi mc_m t} + \bar a_m \bar v_j^{(m)} e^{i2\pi mc_m t}\right] + (-1)^j a_{N/2} + a_N. \tag{2.6-94}$$
Finally, noting for any complex number $z = x + iy$ that $z + \bar z = 2\,\mathrm{Re}(z)$ leads to
$$T_j(t) = \frac{2}{N}\sum_{m=1}^{N/2-1}\left\{\left[\sum_{l=1}^{N} T_0(x_l)\cos 2\pi ml/N\right]\cos 2\pi m(j/N - c_m t) + \left[\sum_{l=1}^{N} T_0(x_l)\sin 2\pi ml/N\right]\sin 2\pi m(j/N - c_m t)\right\} + \frac{1}{N}\sum_{l=1}^{N}\left[(-1)^{j+l} + 1\right]T_0(x_l), \tag{2.6-95}$$
which merits the following
Remarks:
(1) The solution is clearly seen to be a projection of the IC onto each mode, followed by
a linear combination of all of these translating (to the right because cm > 0) modes.
The larger the projection coefficient for the higher modes, the worse—usually—will
be the approximate solution because cm is too far from u.
(2) N must be even for the above result to be valid. If N is odd, there is no 'pure' $2\Delta x$
mode, and the result is slightly different (an exercise we leave to the reader) and
is this: drop the $(-1)^{j+l}$ term and change the upper limit in the outer (modal)
summation to (N − 1)/2.
(3) The original N modes with complex exponential eigenvectors finally 'break down'
to the first ≈ N/2 modes represented twice: once by cosines and once by sines.
(4) $T_0(x_j)$ must be real for the above result to be valid (the principal case of interest
herein). [Complex $T_0(x_j)$ can be easily accounted for, however.]
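The step from the complex modal sum to the real form (2.6-95) can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes u = 1, a unit-length periodic domain with nodes $x_l = l/N$, and (as an illustrative choice) the lumped-mass linear phase speed $c_m = \sin\theta_m/\theta_m$ with $\theta_m = 2\pi m/N$; the function names are hypothetical.

```python
import cmath
import math

def c_lumped(m, N):
    """Assumed phase speed of mode m (lumped linears, u = 1): sin(theta)/theta."""
    theta = 2.0 * math.pi * m / N
    return math.sin(theta) / theta

def modal_sum_complex(T0, j, t, c=c_lumped):
    """Sum over all N complex modes; T0[l-1] holds the IC at node l = 1..N.
    Note: 2j below is Python's imaginary literal, unrelated to the node index j."""
    N = len(T0)
    total = 0j
    for m in range(1, N + 1):
        a_m = sum(T0[l - 1] * cmath.exp(-2j * math.pi * m * l / N)
                  for l in range(1, N + 1)) / N
        total += a_m * cmath.exp(2j * math.pi * m * (j / N - c(m, N) * t))
    return total

def modal_sum_real(T0, j, t, c=c_lumped):
    """Real form (2.6-95); requires even N and a real IC."""
    N = len(T0)
    nodes = range(1, N + 1)
    total = 0.0
    for m in range(1, N // 2):          # m = 1 .. N/2 - 1
        C = sum(T0[l - 1] * math.cos(2.0 * math.pi * m * l / N) for l in nodes)
        S = sum(T0[l - 1] * math.sin(2.0 * math.pi * m * l / N) for l in nodes)
        phase = 2.0 * math.pi * m * (j / N - c(m, N) * t)
        total += C * math.cos(phase) + S * math.sin(phase)
    total *= 2.0 / N
    # Stationary 2*dx mode plus the constant (average) mode.
    total += sum(((-1) ** (j + l) + 1) * T0[l - 1] for l in nodes) / N
    return total
```

At t = 0 both functions should return the nodal IC itself, and for t > 0 they should agree to roundoff, which is just facts 1 to 3 above at work.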
o Eigenvector expansion for quadratic elements. For quads, the analysis proceeds in much
the same way but, of course, it is (necessarily) more complicated, although the bottom
line will be just the same: project the initial data onto the eigenmodes and then let
all modes 'advect'—or try to, since, by definition, (almost) advection means to move
precisely at the fluid's velocity. What happens is that each mode 'drifts' to the right at
its characteristic (phase) speed, and the total solution is the superposition of all moving
modes. Here we still have the solution as described by (2.6-87), with N replaced by N + 1
and eigenvectors, now called q and given by (2.6-42), rewritten as
$$q_j^{(m)} = \left[ \frac{1 + (-1)^j}{2} + \frac{1 - (-1)^j}{2}\,\beta_m \right] e^{i2\pi mj/(N+1)} \qquad (2.6\text{-}96)$$
and the eigenvalues given by [the positive (!) roots of] (2.6-36), or (2.6-40) for LM.
The next complication arises from the fact that these eigenvectors, although linearly
independent, are not orthogonal in the sense of (2.6-89). But they are M-orthogonal;
$q^{(m)T} M q^{(n)} = 0$ if $m \ne n$, where M is the [proper, a la (2.6-37), not a la (2.6-26) and
(2.6-27)] GFEM mass matrix (or the lumped mass matrix if lumping is invoked). The
vector $w^{(m)} = Mq^{(m)}$ is the eigenvector of the adjoint (transpose) problem; q satisfies
$M^{-1}Kq = \lambda q$ and w satisfies the adjoint problem, $KM^{-1}w = -\lambda w$ with $(w^{(m)}, q^{(n)}) = 0$
for $m \ne n$. In fact, it is not too difficult to obtain the following explicit orthogonality
result:
$$(Mq^{(m)}, q^{(n)}) = \frac{N+1}{15} \left[ 4 + 8\beta_m^2 + 4\beta_m \cos\theta_m - \cos 2\theta_m \right] \delta_{mn}, \qquad (2.6\text{-}97)$$
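which replaces (2.6-90). The corresponding amplitude coefficient, from (2.6-88) with q
replacing v, is now, via (2.6-97),
$$a_m = \frac{(w^{(m)}, T_0(x_j))}{(w^{(m)}, q^{(m)})} = \sum_{j=1}^{N+1} \frac{2\left[1 - (-1)^j\right](4\beta_m + \cos\theta_m) + \left[1 + (-1)^j\right](4 + 2\beta_m \cos\theta_m - \cos 2\theta_m)}{(N+1)(4 + 8\beta_m^2 + 4\beta_m \cos\theta_m - \cos 2\theta_m)}\; T_0(x_j)\, e^{-i2\pi mj/(N+1)}, \qquad (2.6\text{-}98)$$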
which is the quadratic element's version of the projection of $T_0(x)$ onto mode m. The
full solution is now, as before, given by (2.6-92) with N replaced by N + 1. The
analysis leading from (2.6-93) to (2.6-95) follows analogously (with, in addition, the use of
$\beta_{N+1-m} = \beta_m$) to give the final result for the j-th node:
$$T_j(t) = 2 \sum_{m=1}^{\hat{N}} \left[ \frac{1 + (-1)^j}{2} + \frac{1 - (-1)^j}{2}\,\beta_m \right] \left[ \mathrm{Re}(a_m) \cos 2\pi m\left(\frac{j}{N+1} - c_m t\right) - \mathrm{Im}(a_m) \sin 2\pi m\left(\frac{j}{N+1} - c_m t\right) \right] + \frac{1 + 3(-1)^j}{4}\, a_{(N+1)/2} + a_{N+1}, \qquad (2.6\text{-}99)$$
where $\hat{N} = (N+1)/2 - 1$ and $a_{(N+1)/2} = \frac{4}{3(N+1)} \sum_{j=1}^{N+1} (-1)^j T_0(x_j)$ is the
projection of the IC onto the $2\Delta x$ mode, $(1 + 3(-1)^j)/4$ (see Figure 2.6-11), and
$a_{N+1} = \frac{1}{3(N+1)} \sum_{j=1}^{N+1} \left[3 - (-1)^j\right] T_0(x_j)$ is the projection of the IC onto the constant mode,
wherein the midside nodes get twice the weight of the edge nodes: the 'average' value
of $T_0(x_j)$ a la quads.
If mass lumping is employed, the following changes must be made:
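1. Replace (2.6-97) by
$$(Mq^{(m)}, q^{(n)}) = \tfrac{1}{3}(N+1)(2\beta_m^2 + 1)\,\delta_{mn}. \qquad (2.6\text{-}100)$$
2. Replace (2.6-98) by
$$a_m = \sum_{j=1}^{N+1} \frac{2\left[1 - (-1)^j\right]\beta_m + \left[1 + (-1)^j\right]}{(N+1)(2\beta_m^2 + 1)}\; T_0(x_j)\, e^{-i2\pi mj/(N+1)}. \qquad (2.6\text{-}101)$$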
Remarks:
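(1) If the 'simple' $2\Delta x$ mode, $T_0(x_j) = (-1)^j$, is the IC, then the quad needs both
$q^{((N+1)/2)}$ and $q^{(N+1)}$ to represent it, and the result is
$$(-1)^j = \tfrac{4}{3}\, q_j^{((N+1)/2)} - \tfrac{1}{3}\, q_j^{(N+1)} = \tfrac{4}{3} \cdot \frac{1 + 3(-1)^j}{4} - \frac{1}{3},$$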
which is but one example of a two-for-one deal that we now examine further.
(2) The eigenvectors for linear elements are, of course, also M-orthogonal; they just
'happen to' also satisfy (2.6-90).
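As a quick consistency check on remark (1), for the consistent-mass case, the two projection coefficients defined after (2.6-99) can be evaluated directly for the IC $T_0(x_j) = (-1)^j$. The only fact used is that N + 1 is even, so that $\sum_j (-1)^j = 0$:

```latex
\begin{align*}
a_{(N+1)/2} &= \frac{4}{3(N+1)} \sum_{j=1}^{N+1} (-1)^j (-1)^j = \frac{4}{3}, \\
a_{N+1}     &= \frac{1}{3(N+1)} \sum_{j=1}^{N+1} \left[ 3 - (-1)^j \right] (-1)^j
             = \frac{1}{3(N+1)} \left[ 0 - (N+1) \right] = -\frac{1}{3}.
\end{align*}
```

These are the weights 4/3 and −1/3 of the $2\Delta x$ and constant quad modes, respectively.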
o One eigenvector or two? Before showing a few sample results, it may be important
to point out another ostensible 'problem' (Cullen, 1982) with quads (while eigenvector
expansions are still fresh in our minds): because of the 'β-factor' in the quad's eigenvector,
the ostensibly simple experiment of placing a single discrete 'Fourier mode,' $e^{i2\pi mj/N}$,
on the grid to see how accurately it is moved to the right (a common form of
'dispersion analysis') is in fact not so simple, and has, owing again to misinterpretation, been
interpreted as another detraction of this element. The reason it is not simple is that each
simple Fourier mode (more precisely: the linear interpolant of each continuous Fourier
mode, which of course introduces sampling error) requires a linear combination of two
quad eigenvectors for its representation: one with the same index (say m) and the other
with index (N + 1 )/2 — m. The reason it can be misinterpreted is that the second mode
is inaccurately translated, with the result that quads might appear to do a poor job of
(discrete) Fourier mode advection—even when the mode has a long wavelength—because
of 'eigenvector error' in addition to phase error. And in fact they do, but this fact is
irrelevant, usually, because what is relevant is how well do the quad eigenvectors translate
the general initial condition; i.e., that made up of all Fourier modes (eigenvectors for
the simple cases, such as linears) and all quad eigenvectors? The answer is that quads
do very, very well—as we shall soon demonstrate. The confusing issue may become less
confusing if we also asked the inverse question: How well do linear elements (or simple
centered differences—or any method with 'zero' eigenvector error—at node points) advect
a given quad eigenvector? The inverse answer is (for linears): not very well, because to
represent the m-th quad eigenvector requires two simple (Fourier) eigenvectors, the
m-th and the [(N + 1)/2 − m]-th, the second of which will 'spoil' the result. Thus, to be
complete, in some sense, we show below the results of advecting a single eigenvector
from lumped linear elements via two quad eigenvectors—and vice versa. We leave the
details of the analyses as exercises. First, given the m-th Fourier mode as the IC (real,
for convenience/variety)
$$T_0(x_j) = v_j^{(m)} = \cos 2\pi mj/(N+1) = \cos j\theta_m, \qquad (2.6\text{-}102)$$
where our mesh has N + 1 nodes because of quads (and, recall, N + 1 is even), the
solution in terms of quads is
$$T_j(t) = a_m \left[ \frac{1 + (-1)^j}{2} + \frac{1 - (-1)^j}{2}\,\beta_m \right] \cos 2\pi m\left(\frac{j}{N+1} - c_m t\right) + a_{\frac{N+1}{2}-m} \left[ \frac{1 + (-1)^j}{2} + \frac{1 - (-1)^j}{2}\,\beta_{\frac{N+1}{2}-m} \right] \cos 2\pi\left(\frac{N+1}{2} - m\right)\left(\frac{j}{N+1} - c_{\frac{N+1}{2}-m}\, t\right), \qquad (2.6\text{-}103)$$
where
$$a_m = \frac{2(4\beta_m + \cos\theta_m) + (4 + 2\beta_m \cos\theta_m - \cos 2\theta_m)}{4 + 8\beta_m^2 + 4\beta_m \cos\theta_m - \cos 2\theta_m}, \qquad (2.6\text{-}104)$$
$$a_{\frac{N+1}{2}-m} = 1 - a_m, \qquad (2.6\text{-}105)$$
the phase speeds are given by (2.6-50), and the center-node amplitude coefficients by
(2.6-39). Whereas linear elements would translate $T_0(x_j)$ at the single appropriate phase
speed (close to u for m small), from (2.6-47), the quad representation will ostensibly
not be nearly so good (for t > 0) because of the seriously lagging phase speed of the
(N + 1)/2 − m mode.
The other side of this coin is this: given the real part of the m-th quad eigenvector
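$$T_0(x_j) = \left[ \frac{1 + (-1)^j}{2} + \frac{1 - (-1)^j}{2}\,\beta_m \right] \cos j\theta_m \qquad (2.6\text{-}106)$$
as the IC, its representation via linear basis functions also requires two eigenvectors and
is given by the (simpler) linear combination
$$T_j(t) = \frac{1 + \beta_m}{2} \cos 2\pi m\left(\frac{j}{N+1} - c_m t\right) + \frac{1 - \beta_m}{2} \cos 2\pi\left(\frac{N+1}{2} - m\right)\left(\frac{j}{N+1} - c_{\frac{N+1}{2}-m}\, t\right), \qquad (2.6\text{-}107)$$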
with the phase speeds obtained from (2.6-47), which again shows a good mode and a bad
(slow) one.
For 1 ≪ m ≪ N, however, it turns out in both cases that the amplitude coefficient of the
slow modes is virtually zero and that of the good (long) modes is ≈ 1. The harder and
more interesting cases have 'intermediate' values of m, e.g., m = (N + 1)/8 (an $8\Delta x$
wave), for which the slow mode, 3(N + 1)/8 with $\lambda = 8\Delta x/3$, has both a significant
amplitude and a slow phase speed. The results of one such experiment are shown in
Figures 2.6-22 and 2.6-23 for N = 7 and m = 1, which is an $8\Delta x$ wave. The curve
labeled t = 0 in Figure 2.6-22 is the $8\Delta x$ wave for lumped linears that is composed of
the two quadratic eigenvectors, m = 1 and m = (N + 1)/2 − 1 = 3, which agrees with
$\cos 2\pi j/8$ at the nodes (and also in between when linear interpolation is employed, as
done here). The curves labeled t = 1 and t = 2 in Figure 2.6-22 show the quadratic
element solution, from (2.6-103), after one and two laps, respectively, and that labeled
t = 100 shows it after 100 laps. The goal in all cases is the exact transport of the
linear eigenvector (the t = 0 curve). Now look at Figure 2.6-23, in which the m = 1
quad eigenvector is synthesized by two linears (m = 1 and m = 3). [Also shown is the
first mode linear eigenvector at t = 0 to show that they (linears vis-a-vis quads) really are
not all that different.] The solution at later times, however, is quite different. The results
after just one lap (t = 1) are already in serious error, which just gets worse with larger t,
so we stop with the t = 2 result. The surprising (?) result is that the two quads advect the
single $8\Delta x$ mode of lumped linears (= second-order FDM in this case) much better than
do the linears themselves! This is simply because $c_1$ for quads is very close to u, and $\beta_1$
is not far from unity. This can perhaps be better appreciated by showing the 'numbers'
for this case (and let the reader examine other cases); (2.6-103) gives
$$T_j(t) = 1.03418 \left[ \frac{1 + (-1)^j}{2} + 0.94733 \cdot \frac{1 - (-1)^j}{2} \right] \cos 2\pi(j/8 - 1.00115\,t) - 0.03418 \left[ \frac{1 + (-1)^j}{2} + 0.59378 \cdot \frac{1 - (-1)^j}{2} \right] \cos 6\pi(j/8 - 0.89960\,t),$$
and (2.6-107) gives
$$T_j(t) = 0.97367 \cos 2\pi(j/8 - 0.90032\,t) + 0.02633 \cos 6\pi(j/8 - 0.30011\,t).$$
The m = 1 quad has a (leading) phase speed error of 0.115% vs 10% (lagging) for lumped
linears (and 0.19% lagging for consistent linears—not shown) and an amplitude 'error,' in
each case, of only about 3%. And this is nearly the 'worst' case; larger or smaller values
of m/N would reduce these differences. So much for the 'wrong' quad eigenvector. The
bottom line on this two-for-one digression is that while higher-order elements are definitely
much harder to analyze, a correct analysis is all the more important if unwarranted or
even erroneous conclusions are not to be drawn.
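The 'numbers' quoted for the N + 1 = 8, m = 1 case can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the authors' code: the two center-node coefficients $\beta_1 = 0.94733$ and $\beta_3 = 0.59378$ are simply copied from the text (they come from (2.6-39), which is not reproduced here), the lumped-linear phase speed $\sin\theta/\theta$ is the expression quoted later in this section for u = 1, and all function names are hypothetical.

```python
import math

N1 = 8                                # N + 1 nodes (four quadratic elements)
m = 1
theta = 2.0 * math.pi * m / N1
beta_1, beta_3 = 0.94733, 0.59378     # center-node coefficients quoted in the text

def quad_shape(j, beta):
    """Bracketed factor of the quad eigenvector: 1 at even nodes, beta at odd nodes."""
    return 0.5 * (1 + (-1) ** j) + 0.5 * (1 - (-1) ** j) * beta

# Amplitude coefficients, (2.6-104) and (2.6-105).
num = 2 * (4 * beta_1 + math.cos(theta)) + (4 + 2 * beta_1 * math.cos(theta) - math.cos(2 * theta))
den = 4 + 8 * beta_1 ** 2 + 4 * beta_1 * math.cos(theta) - math.cos(2 * theta)
a_1 = num / den
a_3 = 1.0 - a_1

def two_quads_at_t0(j):
    """(2.6-103) at t = 0: two quad eigenvectors synthesizing cos(j*theta)."""
    return (a_1 * quad_shape(j, beta_1) * math.cos(2 * math.pi * m * j / N1)
            + a_3 * quad_shape(j, beta_3) * math.cos(2 * math.pi * (N1 // 2 - m) * j / N1))

def two_linears_at_t0(j):
    """(2.6-107) at t = 0: two Fourier modes synthesizing the m-th quad eigenvector."""
    return (0.5 * (1 + beta_1) * math.cos(2 * math.pi * m * j / N1)
            + 0.5 * (1 - beta_1) * math.cos(2 * math.pi * (N1 // 2 - m) * j / N1))

def c_lumped_linear(k):
    """Lumped-linear phase speed of Fourier mode k for u = 1."""
    th = 2.0 * math.pi * k / N1
    return math.sin(th) / th

print(f"a_1 = {a_1:.5f}, a_3 = {a_3:.5f}")
print(f"lumped-linear phase speeds: c_1 = {c_lumped_linear(1):.5f}, c_3 = {c_lumped_linear(3):.5f}")
```

Because the two β values are rounded to five decimals, the synthesized 8Δx wave matches $\cos 2\pi j/8$ at the nodes only to about that precision.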
o Numerical experiments. We now abandon exact eigenvector solutions and turn to some
'numerical' results, all with u = 1 and obtained by integrating the ODE's with a
'time-marching' ODE method (to be discussed in the next section) and a small enough $\Delta t$
that the results can be considered virtually exact. We do this to study spatial errors
separately from temporal errors. Thus, all results to follow can be considered to be the
exact solution of the ODE's with interpolated IC's—as if we had actually employed the
eigenvector expansions just discussed.
Fig. 2.6-22 Two quads make one linear. [Vertical axis: $T_j(t)$.]
Fig. 2.6-23 Two linears make one quad. [Vertical axis: $T_j(t)$; curves labeled t = 1, t = 2, and 'linear eigenvector'.]
We start with the $\sigma = 1.6\Delta x$ Gaussian discussed earlier. Figure 2.6-24 shows the initial
Gaussian, centered at x = 0.15 on a mesh with 40 nodes (N = 40 for linears, N + 1 = 40
for 20 quads) as well as its linear interpolate. The integration was stopped at t = 0.6 with
the exact solution now centered at x = 0.75. Also shown are the results from GFEM,
for both linear and quadratic basis functions. While a small amount of dispersion (and
associated wiggles) is already apparent for linears, the quadratic element's solution is
still quite close to the interpolant. To gain further perspective, Figure 2.6-25 shows a
few finite difference results for the same problem. The highly damped shape (owing to
numerical diffusion) of the curve labeled first-order FD is that from equations (2.6-67)
through (2.6-73) with lumped mass [omit the $(2 + \cos\theta)/3$ factor]. It is readily apparent
that the other (centered) FDM results display significant dispersion error ($\sigma = 1.6\Delta x$
is not easy for these methods, in spite of the nice Fourier spectrum of Figure 2.6-21).
Also clear is that linear FEM is quite a bit more accurate than even fourth-order FDM.
A final noteworthy result is that the highly dispersive second-order FDM is the same as
FEM linear basis functions with lumped mass: our first actual demonstration of the truly
deleterious effect caused by mass lumping in advection-dominated flows.
Fig. 2.6-24 Pure advection in 1D; Gaussian waveform, finite element results. [T vs x; curves: Exact, Linear FE, Quadratic FE.]
Fig. 2.6-25 Pure advection in 1D; Gaussian waveform, finite difference results. [T vs x; curves: Exact, 1st order FD, 2nd order FD, 4th order FD.]
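The contrast between consistent and lumped linears for the $\sigma = 1.6\Delta x$ Gaussian can be sketched with a few lines of code. This is not the calculation behind the figures: it assumes periodic BC's on a unit domain (the published runs used an inflow BC), u = 1, and it advances each discrete Fourier mode exactly in time using the semi-discrete frequencies $\omega = u\sin\theta/\Delta x$ (lumped) and $\omega = 3u\sin\theta/[\Delta x(2 + \cos\theta)]$ (consistent), so that only spatial error remains. Function names are hypothetical.

```python
import cmath
import math

def advect_periodic_linears(T0, t, u=1.0, lumped=False):
    """Exact-in-time solution of the periodic semi-discrete advection ODE's
    for linear elements on N equal elements spanning a unit domain."""
    N = len(T0)
    dx = 1.0 / N
    # Discrete Fourier coefficients of the nodal IC.
    coeffs = [sum(T0[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)) / N
              for k in range(N)]
    evolved = []
    for k, a_k in enumerate(coeffs):
        theta = 2.0 * math.pi * k / N
        omega = u * math.sin(theta) / dx              # lumped mass (= centered 2nd-order FDM)
        if not lumped:
            omega *= 3.0 / (2.0 + math.cos(theta))    # consistent-mass factor
        evolved.append(a_k * cmath.exp(-1j * omega * t))
    return [sum(evolved[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)).real
            for n in range(N)]

def periodic_gaussian(x, center, sigma):
    """Gaussian evaluated with the shortest periodic distance on a unit domain."""
    d = (x - center + 0.5) % 1.0 - 0.5
    return math.exp(-0.5 * (d / sigma) ** 2)

N = 40
sigma = 1.6 / N
x = [n / N for n in range(N)]
T0 = [periodic_gaussian(xn, 0.15, sigma) for xn in x]
exact = [periodic_gaussian(xn, 0.75, sigma) for xn in x]      # IC translated by u*t = 0.6
T_cm = advect_periodic_linears(T0, 0.6)
T_lm = advect_periodic_linears(T0, 0.6, lumped=True)
err_cm = max(abs(a - b) for a, b in zip(T_cm, exact))
err_lm = max(abs(a - b) for a, b in zip(T_lm, exact))
print(f"max nodal error: consistent {err_cm:.3f}, lumped {err_lm:.3f}")
```

The direct O(N²) transform is used only to keep the sketch dependency-free; for larger meshes an FFT would be the natural replacement.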
If one studies the GFEM equations with linear basis functions on a variable
mesh (different element lengths) via Taylor-series methods, the result is rather
discouraging—the super-convergence enjoyed on a uniform mesh (fourth-order) drops
to 'inferior'-convergence (first-order). But we have said several times that GFEM is
generally not amenable to such analyses, and we demonstrate that now. We designed
a mesh with 53 linear elements that alternated between $\Delta x = 0.025$ and 0.0125 and
solved the same ($\sigma = 1.6 \times 0.025$) problem. Figure 2.6-26 shows the result, and also that
from a truly second-order FDM on a variable mesh, in which $\partial T/\partial x|_0$ is approximated
by $\frac{1}{h_L + h_R}\left[\frac{h_R}{h_L}(T_0 - T_L) + \frac{h_L}{h_R}(T_R - T_0)\right]$. The results are rather striking;
both results are insensitive to the variable mesh! That is, FDM still looks poor, and
GFEM still looks good—in fact, the GFEM result is even more accurate here than it was
on the uniform mesh; it responds to the extra nodes by giving extra accuracy. So much
for Taylor—and for the misleading 'note of caution' on p. 422 of Vichnevetsky (1987).
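That the variable-mesh difference formula quoted above is 'truly second-order' is easy to confirm: it differentiates quadratics exactly, and for a cubic the error is proportional to $h_L h_R$. A minimal sketch (the function name is hypothetical):

```python
def dTdx_variable_mesh(T_L, T_0, T_R, h_L, h_R):
    """Three-point first-derivative approximation at the middle node of a
    non-uniform stencil with left spacing h_L and right spacing h_R."""
    return ((h_R / h_L) * (T_0 - T_L) + (h_L / h_R) * (T_R - T_0)) / (h_L + h_R)

# Example on the alternating mesh described in the text.
h_L, h_R, x0 = 0.025, 0.0125, 0.4
quadratic = lambda x: 3.0 * x * x - 2.0 * x + 1.0      # derivative at x0 is 6*x0 - 2
cubic = lambda x: x ** 3                               # derivative at x0 is 3*x0**2
d_quad = dTdx_variable_mesh(quadratic(x0 - h_L), quadratic(x0), quadratic(x0 + h_R), h_L, h_R)
d_cubic = dTdx_variable_mesh(cubic(x0 - h_L), cubic(x0), cubic(x0 + h_R), h_L, h_R)
print(d_quad, d_cubic - 3.0 * x0 ** 2)
```

For $T = x^3$ the formula returns $3x_0^2 + h_L h_R$, so the leading truncation error is $h_L h_R\,T'''/6$ whatever the ratio of the two spacings.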
Moving up in difficulty, a la Figure 2.6-21, we next study the advection of a triangle.
The base width ($2l$) is $8\Delta x$, and the grid contains 50 nodes. The IC in Figure 2.6-27
is centered at x = 0.12, and it is noteworthy that here we have no interpolation error in
the IC. The t = 0.6 result is shown for linear elements, both consistent (GFEM) and
lumped (FDM). Comparing the latter result with that in Figure 2.6-25, it is clear, since
the solutions look much the same, that the FDM cannot recognize the difference between
a $1.6\Delta x$ Gaussian and the triangle. The GFEM does notice the difference: it holds the
triangular shape pretty well but leaves behind a longer trail of wiggles for this less-smooth
waveform.
Now we move to the really hard wave, a square wave, for which GFEM falls on its
face, as does FDM. Figure 2.6-28 shows a $10h$ step ($2l = 10\Delta x$ in Figure 2.6-21) on a
100-node uniform mesh, both at t = 0 ($x_0 = 0.15$) and at t = 0.6 ($x_0 = 0.75$) for linear and
quadratic GFEM, as well as second-order FDM (lumped linears). All are lousy, wiggly,
dispersive. The discontinuities excite the bad end of the spectrum. Clearly, if advection of
discontinuities is important, GFEM is of questionable utility. Fortunately, such is rarely
the case in the 'applied' (incompressible!) world with which we are familiar. For methods
that are more-or-less designed to solve problems with discontinuities and sharp fronts,
see the recent text by Finlayson (1992).
Fig. 2.6-26 Pure advection in 1D; Gaussian waveform, variable-grid results. [T vs x; curves: Exact, Linear FE, 2nd order FD.]
Fig. 2.6-27 Pure advection in 1D; triangular waveform. [T vs x; curves: Exact, Linear FE, 2nd order FD.]
Fig. 2.6-28 Pure advection in 1D; rectangular waveform.
Remarks:
(1) The above results were taken from Lee et al. (1976) and did not employ periodic
BC's; the inlet had T = 0, and the outlet had no BC. But the calculations were
stopped well short of the exit, so that the periodic BC case would be little different.
(2) It is interesting that for lumped linears, the exact solution of the ODE's is
available in terms of Bessel functions of the first kind. For example, Wurtele
(1961) shows the following solution for the discrete 'delta' function IC ($T_j(0) = 1$
for j = 0, and 0 for $j \ne 0$): $T_j(t) = J_j(ut/h)$, where $J_j$ is a Bessel function of the
first kind, and more general solutions (for more general IC's) via linear combinations
of these 'fundamental solutions.'
(3) Sincovec (1972) shows, for the square wave above, that other higher-order methods,
such as cubic spline Galerkin and cubic Hermite collocation, also do a very bad job.
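The Bessel-function solution in remark (2) can be verified against the lumped-linear (centered-difference) ODE, $dT_j/dt + u(T_{j+1} - T_{j-1})/(2h) = 0$, which it satisfies by virtue of the recurrence $2J_n'(z) = J_{n-1}(z) - J_{n+1}(z)$. The sketch below is an illustration only: it evaluates $J_n$ from Bessel's integral with the trapezoid rule rather than calling a special-function library, and the function names and the choice h = 0.025 are assumptions.

```python
import math

def bessel_j(n, z, samples=400):
    """J_n(z) for integer n from Bessel's integral,
    (1/pi) * integral_0^pi cos(n*tau - z*sin(tau)) dtau, via the trapezoid rule."""
    step = math.pi / samples
    f = lambda tau: math.cos(n * tau - z * math.sin(tau))
    total = 0.5 * (f(0.0) + f(math.pi))
    for i in range(1, samples):
        total += f(i * step)
    return total * step / math.pi

def delta_response(j, t, u=1.0, h=0.025):
    """Nodal solution T_j(t) = J_j(u*t/h) for the discrete delta-function IC."""
    return bessel_j(j, u * t / h)

def ode_residual(j, t, u=1.0, h=0.025):
    """Residual of dT_j/dt + u*(T_{j+1} - T_{j-1})/(2h), using a centered
    difference in time to approximate dT_j/dt."""
    dt = 1.0e-4 * h / u
    dTdt = (delta_response(j, t + dt, u, h) - delta_response(j, t - dt, u, h)) / (2.0 * dt)
    dTdx = (delta_response(j + 1, t, u, h) - delta_response(j - 1, t, u, h)) / (2.0 * h)
    return dTdt + u * dTdx

for j in (-2, 0, 1, 5):
    print(j, delta_response(j, 0.3), ode_residual(j, 0.3))
```

Note that nodes upstream of the initial spike (j < 0) are nonzero for t > 0, which is the upstream-moving short-wave noise discussed throughout this section.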
The next series of figures (with thanks to J. Rowley) shows another view of a $10\Delta x$
square wave ($2l = 10h$), this time on an 80-node mesh, beginning with the modal
decomposition plot of the IC; i.e., Figure 2.6-29 shows the absolute value of the (normalized)
projection of the step function onto the 41 resolvable modes [see (2.6-95) with N = 80
and (2.6-99) with N = 79], which shows 39 sines and cosines plus the constant (zero wave
number) mode plus the $2\Delta x$ mode for both linears and quads. Mode 0 is the constant
mode [F(0) = 1.0], and mode 41 is $2\Delta x$. Comparing the (discrete) eigenvector expansion
in Figure 2.6-29 with the (continuous) Fourier transform of Figure 2.6-21 (for $kl = 5\pi$)
shows some similarity and some differences; e.g., the discrete case shows 4½ lobes vs 4
for the continuous case. The similarity of the spectra for the two discrete cases is closer
yet—suggesting more similarities than differences between the two—even though each
quad mode is a linear combination of two linear modes and vice versa. The differences
show up in the phase speeds—especially, for example, that between CM quads and LM
linears; recall that, based on the zero-group-velocity criterion, three-quarters of the modes
look easy for quads, but only one-half look easy for lumped linears—a 50% increase in
favor of quads.
The practical effect of these phase speed differences will be well-demonstrated later.
Now we just show a short integration, in Figure 2.6-30, wherein the center of the square
wave moves just a small distance, from 0.5625 to 0.625, in which we show results in
the order 'best to worst'; i.e., CM quads, CM linears, LM quads, and LM linears. It is
clear that none do very well; also clear is the short-wave upstream-moving wiggles, the
fastest moving at close to the group speed of the $2\Delta x$ mode; namely, −5, −3, −2, and
−1, respectively, in the order of appearance.
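Two of the four group speeds just quoted, those for linear elements, follow directly from the semi-discrete dispersion relations $\omega = u\sin\theta/\Delta x$ (lumped) and $\omega = 3u\sin\theta/[\Delta x(2 + \cos\theta)]$ (consistent), with $\theta = k\Delta x$ and $G = d\omega/dk$. The sketch below differentiates them numerically at the $2\Delta x$ mode ($\theta = \pi$); the quadratic-element values (−5 and −2) require the quad dispersion relations given earlier in the section and are not attempted here. Function names are hypothetical.

```python
import math

def omega_linear(theta, dx, u=1.0, lumped=False):
    """Semi-discrete frequency of a Fourier mode with phase angle theta = k*dx."""
    omega = u * math.sin(theta) / dx
    if not lumped:
        omega *= 3.0 / (2.0 + math.cos(theta))
    return omega

def group_velocity(theta, dx, u=1.0, lumped=False, d=1.0e-6):
    """G = d(omega)/dk = dx * d(omega)/d(theta), by a centered difference."""
    up = omega_linear(theta + d, dx, u, lumped)
    down = omega_linear(theta - d, dx, u, lumped)
    return dx * (up - down) / (2.0 * d)

dx = 1.0 / 80
G_cm = group_velocity(math.pi, dx)                 # consistent mass linears
G_lm = group_velocity(math.pi, dx, lumped=True)    # lumped mass linears
print(f"2*dx group velocity: consistent {G_cm:.4f}, lumped {G_lm:.4f}")
```

Analytically, $G = u\cos\theta$ for lumped linears and $G = 3u(1 + 2\cos\theta)/(2 + \cos\theta)^2$ for consistent linears, which give −u and −3u at $\theta = \pi$.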
Shown next is another step function—this time one with a much longer flat top
($2l = 40\Delta x$ rather than $10\Delta x$), shown in Figure 2.6-31, the first of which shows the
IC and its 'Fourier' spectrum (i.e., eigenvector amplitude coefficients) in which the
excruciatingly slow decay of the higher modes' amplitude coefficients is particularly
notable, showing again that square waves are difficult. (Shown in Figure 2.6-31 is the
spectrum for quads, but that for linears is virtually identical.) Figure 2.6-32 shows the
results at t = 0.0625, complete with upstream wiggles, again corroborating the Fourier
'analysis.'
A simpler periodic function is examined next, briefly. The $C^0$ function given by
$x(1 - x)$ is 'easy' in that only the first derivative is discontinuous, as the discrete
spectrum in Figure 2.6-33 shows: only the lower 20% or so of the modes are 'active.'
The single result shown in Figure 2.6-34, linear elements at t = 0.125, is again
'consistent'—only a small amount of dispersion is present.
A less simple ($C^{-1}$) function, a periodic ramp with a discontinuity, is portrayed
next: a sawtooth function. Note the smooth but slow drop-off of the discrete spectrum
shown in Figure 2.6-35 (for linears; quads look virtually the same), like $1/n$ (or $1/k$
for the continuum). The four results, at t = 0.125, are shown in Figure 2.6-36. Here
again the more accurate methods tend not to look so because of the upstream-moving
wiggles: group velocity 'in action.'
Fig. 2.6-29 A $10\Delta x$ square wave and its modal decomposition. [Panels: (a) initial condition, f(x) vs x; |F(k)| vs normalized frequency.]
The next four examples, and the last in this series, are designed to further accentuate
the group velocity aspects of the several methods under scrutiny. We begin with the
experiment inspired by Cathers and O'Connor (1985) and place a truncated $2\Delta x$ wave
on the 80-node span (see Figure 2.6-37) for both linear and quadratic elements (and
apologize for the different 'scale' used for quads). The discrete Fourier spectra are again
revealing, showing (logically) the dominance of the $2\Delta x$ eigenvector. [The spectrum for
the quadratic shows a larger fraction of long modes; these are required to compensate
for the 'β-factor' that shifts the (dominant) short modes away from an average value of
zero.] The solutions for all four cases, shown at t = 0.125 in Figure 2.6-38, reveal that the
wave packet is indeed dominated by the group velocity of the $2\Delta x$ wave, since all packets
move leftward at close to its corresponding group velocity. Thus, in some sense, the error
is 'total' in that the exact solution (translation to the right of the initial sawtooth at unit
speed) can in no way be well approximated by any of the methods. The positive 'bias'
shown by quads is a reflection of the IC, which sawtooth displayed positive values for end
nodes and negative (of the same magnitude) for center nodes. If the IC is shifted by one
grid length, then the 'reverse' is seen, and the solution displays a negative bias. [Note that
Figure 13 in Cathers and O'Connor (1985) is, as pointed out by Gresho and Lee (1987),
erroneous. The code bug was later corrected by Cathers (personal communication).]
Fig. 2.6-30 Brief advection of a square wave. [Panels include (a) consistent quads and (b) consistent linears.]
Fig. 2.6-31 A $40\Delta x$ square wave and its modal decomposition. [Horizontal axis: normalized frequency.]
Next we show (Figures 2.6-39 and 2.6-40) a wave packet that is obtained by
modulating, and reflecting about the x-axis, a smooth Gaussian ($\sigma = 10h$) by $(-1)^j$,
the $2\Delta x$ wave for linear elements, in which the excitation of long waves for the
quad case (Figure 2.6-39) is even more evident/prominent, and for the same reason;
except for this feature, the symmetric modulated Gaussian displays only short-wave
excitation, from linears. And the results are, correspondingly, cleaner: the wave packets
are (with the exception of quads, whose difference will be made clear in the next
example to follow) really translating leftward at the group velocity of the 'shortest'
waves. (Exercise for the reader: from these two figures, deduce t in Figure 2.6-40.)
Fig. 2.6-32 Brief advection of a longer square wave. [Panels include (a) consistent quads and (b) consistent linears.]
Fig. 2.6-33 A periodic $C^0$ function and its modal decomposition. [Horizontal axis: normalized frequency.]
Fig. 2.6-34 Brief advection of $x(1 - x)$ via linears.
The fact that there are no rightward moving waves (for linears) is a consequence of
the reflected Gaussian—the smooth envelope cancels out so that only the short waves
are seen. In contrast, if we had used an unsymmetric Gaussian, say positive only (as
in earlier examples), the solution would display similar leftward moving 'noise' plus a
smooth Gaussian moving rightward at c = u = 1. For examples of this case, see
Vichnevetsky (1987), in which paper he also 'corrects' (on p. 423) an erroneous statement
related to this behavior in his book with Bowles (Vichnevetsky and Bowles, 1982). There
they discuss the separation of smooth and rough solutions via, in part, an appeal to the
second-order wave equation, $T_{tt} = u^2 T_{xx}$, easily derivable from the first-order equation
of interest, $T_t + uT_x = 0$, to make the following statement, '... which shows that the
central-difference semi-discretization is a consistent approximation of (the second-order
wave equation) rather than that of the advection equation, $T_t + uT_x = 0$.' What they
meant to say [R. Vichnevetsky (1985), personal communication] was 'in addition to'
rather than 'rather than.' Nevertheless, the remaining discussion and analysis presented
there is useful, interesting, and enlightening (even if not absolutely necessary, nor perhaps
totally rigorous; e.g., for $h \to 0$, implied by the writing of PDE's, the amplitude of their
left-moving difference wave goes to zero).
Fig. 2.6-35 A sawtooth function and its modal decomposition. [Panels: (a) initial condition, f(x) vs x; (b) linears, vs normalized frequency.]
The third case is for quads (CM) only, and was kindly supplied by D.F. Griffiths. It
uses an IC of a 'symmetric' Gaussian modulated by the quad's $2\Delta x$ eigenvector, hence
the positive bias; see Figure 2.6-41. This wave packet (for which the theory of group
velocity is most apt) corresponds more closely to that in the previous example using linear
elements; i.e., it shows a purely leftward motion at a velocity of ≈ −5. The
rightward-moving smooth portion of the wave displayed in the previous example is completely
absent (as are low mode eigenvector amplitude coefficients of the IC, not shown), owing
to cancellation.
Another interesting experiment was performed by Cathers and O'Connor (1985); in
their Figure 9 is shown an IC that is comprised of about five $2\Delta x$ waves, followed by the
same number of $4\Delta x$ waves, followed by the same number of $8\Delta x$ waves. The results
showed the proper wave packet 'separation' as each portion of the IC moved at (nearly)
its own group velocity—and concludes (almost) our group velocity examples for the time
being.
The last of the four group velocity demonstrations returns us to the eigenvector pictures
in Figure 2.6-5—as promised there. The 'exact' solution referred to in the legends of
Figures 2.6-42 and 2.6-43, which figures are to be viewed top-to-bottom, is the exact
solution of the ODE with the given mode (eigenvector) as an IC; they are given by a
special case (pure cosine IC) of (2.6-95): $T_j(t) = \cos 2\pi m(j/N - c_m t)$ for m = 2, N = 80,
where $c_m = \sin(2\pi m/N)/(2\pi m/N) = 0.9959$, and the same equation for m = 38, for
which $c_m = 0.0524$. Each mode is shown at t = 0 and at $t = \tau/4$, $2\tau/4$, $3\tau/4$, and $4\tau/4$,
where $\tau = 2\pi/[N \sin(2\pi m/N)]$ is the wave's period: $2\pi/\omega = 1/(mc_m) = 0.5021$ for both
m = 2 and m = 38, owing to the symmetry of the sine function. [The consistent mass
version (not shown) would not display this (fortuitous) symmetry. It would have $c_2 =
0.9999966$, $c_{38} = 0.15533$ and thus $\tau_2 = 0.50000$ and $\tau_{38} = 0.1694$.] Anyway, the object
of the exercise is to point out the temporal behavior of low vis-a-vis high modes, and
this we now do. Mode 2 moves rightward at its phase speed (0.9959) 'because' it is
a pure sinusoidal wave. But this is definitely not the case for mode 38, which moves
leftward at its group velocity, $G_{38} = \cos 2\pi m/N = -0.9877$, 'because' it is a wave
group, comprised as it is of a $2\Delta x$ wave and a $40\Delta x$ wave. A reaction from the maker
of these figures, J. Rowley, after viewing the results in 'movie mode,' is relevant: 'The
groups zoom to the left while the envelope creeps to the right.' Thus, since a general
IC requires 'all' modes to represent it (80 in this case), the solution will be poor to the
extent that the IC contains significant amounts of the high modes. [A close perusal of
Mode 38 in Figure 2.6-43 will reveal that the actual plot is not quite right; plotted is
$\cos 76\pi(j - 1)/80$ for $1 \le j \le 79$.]
Fig. 2.6-36 Brief advection of a sawtooth wave.
Fig. 2.6-37 A finite $2\Delta x$ wave and its modal decomposition. [Horizontal axis: normalized frequency.]
Fig. 2.6-38 Brief advection of the truncated $2\Delta x$ wave. [Panels include (a) consistent quads and (c) lumped quads.]
Fig. 2.6-39 A modulated Gaussian and its modal decomposition. [|F(k)| vs normalized frequency.]
Fig. 2.6-40 Brief advection of a modulated Gaussian wave. [Panels include (b) consistent linears.]
Fig. 2.6-41 Reverse advection of a $2\Delta x$-modulated Gaussian; $\sigma/\Delta x = 5$, $\Delta x = 0.02$. [Curves at t = 0, 0.04, 0.08, 0.12, 0.16, and 0.20; x from −1.0 to 1.0.]
Fig. 2.6-42 'Exact' advection of mode 2 for lumped linears (N = 80), shown each quarter cycle. [Horizontal axis: node index j, 0 to 80.]
Fig. 2.6-43 'Exact' advection of mode 38 for lumped linears (N = 80), shown each quarter cycle. [Horizontal axis: node index j, 0 to 80.]
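The phase speeds, periods, and group velocity quoted for modes 2 and 38 on the 80-node mesh can be reproduced directly. This is a minimal sketch for u = 1 on a unit periodic domain; the consistent-mass phase speed $3\sin\theta/[\theta(2 + \cos\theta)]$ is the standard linear-element result and is assumed here to be what (2.6-47) expresses. Function names are hypothetical.

```python
import math

N = 80

def theta(m):
    """Phase angle per grid interval for mode m."""
    return 2.0 * math.pi * m / N

def c_lumped(m):
    """Lumped-mass linear phase speed (u = 1)."""
    return math.sin(theta(m)) / theta(m)

def c_consistent(m):
    """Consistent-mass linear phase speed (u = 1)."""
    return 3.0 * math.sin(theta(m)) / (theta(m) * (2.0 + math.cos(theta(m))))

def period(m, phase_speed):
    """Temporal period of mode m: 2*pi/omega = 1/(m*c_m)."""
    return 1.0 / (m * phase_speed(m))

def group_velocity_lumped(m):
    """Lumped-mass linear group velocity (u = 1)."""
    return math.cos(theta(m))

for m in (2, 38):
    print(f"m = {m:2d}: c_LM = {c_lumped(m):.4f}, tau_LM = {period(m, c_lumped):.4f}, "
          f"c_CM = {c_consistent(m):.7f}, tau_CM = {period(m, c_consistent):.4f}, "
          f"G_LM = {group_velocity_lumped(m):.4f}")
```

The equal lumped-mass periods for m = 2 and m = 38 follow from $\sin(2\pi m/N) = \sin(2\pi(N/2 - m)/N)$, the symmetry mentioned in the text.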
For our final final example, we 'get more serious' about sending waveforms around the
periodic circuit—finally getting away from all of the short time results shown heretofore.
Specifically, we shall chase a $\sigma = 4\Delta x$ Gaussian around and around an 80-node circuit
and compare several methods—using results previously published in Rowley and Gresho
(1987). Figure 2.6-44(a) shows a result using the least accurate of the methods considered:
lumped linears (= centered second-order FDM) after 15 circuits/cycles. Dispersion already
so dominates that the Gaussian is no longer discernible. The next three are much
more accurate, so we subjected them to a tougher test: 80 laps through the mesh.
Figures 2.6-44(b) and (c) show, as discussed and demonstrated previously, that lumped
quads and consistent linears are quite close in advection accuracy—their average speeds
are ~0.9975, with slightly larger wiggles from lumped quads. Consistent quads, on the
other hand, are almost 'spot on'—see Figure 2.6-44(d)—demonstrating the remarkable
accuracy alluded to earlier. To finish, we show some variable-grid results, partly to show
that GFEM is still quite good, but also to show some puzzling, and not yet understood,
results. Starting with the 80-node uniform mesh, we randomly perturbed the nodes to
generate element lengths up to 20% above or below the uniform value of 0.0125. (Note
that variable-grid FEM results are also non-dissipative because the advection matrix
remains skew-symmetric.) All results are shown after 80 cycles with an average CFL
number of 0.1 using the trapezoid rule for time integration. Figure 2.6-45(a) shows the
worst result—lumped linears not only lose the Gaussian but really generate a lot of high
frequency noise, especially near $2\Delta x$. Mass lumping on non-uniform meshes of linear
elements is not a wise move. Consistent linears (GFEM), on the other hand, do quite
well—in spite of their 'first-order spatial accuracy'; Figure 2.6-45(b) shows about the
same phase error as for the uniform ('fourth-order') mesh, but the results are somewhat
polluted by low-amplitude, high-frequency wiggles.
Fig. 2.6-44 Uniform mesh of 80 nodes: (a) Lumped linears after 15 cycles through the domain; (b) Lumped quads after 80 cycles through the domain; (c) Consistent linears after 80 cycles through the domain; (d) Consistent quads after 80 cycles through the domain.
Perhaps a remark by Vichnevetsky
(1987), for gradually changing mesh size, is relevant here, even though our mesh change
may not qualify here as gradual: 'During the passage of such wave packets, the frequency
($\omega$) remains constant and satisfaction of the dispersion relation with h variable results
in an x-dependence of the wave number.' Lumped and consistent quads are shown in
Figures 2.6-45(c) and (d), respectively, in which more puzzling results are apparent:
(i) although lumped quads show slightly stronger wiggles, both show about the same
phase error, (ii) consistent quads seem to retain the proper shape of the Gaussian, thus
Fig. 2.6-45 Non-uniform 80-node mesh; 80 cycles: (a) Lumped linears, (b) Consistent
linears, (c) Lumped quads, (d) Consistent quads.
suggesting that the phase speed error is basically the same for all eigenvectors—although
we note (or believe) that only the low modes will have significant initial amplitudes for
this easy-for-quads-to-resolve case. But another part of the puzzle—not shown by these
results—is this: different runs, with different randomized node locations, yield different,
and surprising, results in that dispersion seems to be virtually absent and sometimes the
numerical solution moves faster, sometimes slower, than the continuous one. For example,
for six runs, the average speed of the Gaussian over 80 laps ranged from 0.9963 to 1.0031.
For further results and discussion of this issue, see Rowley and Gresho (1987). We leave
the resolution/explanation of these curious results as an exercise for the reader (!), with
a plea to share the explanation with us. Perhaps alternate IC's, such as the best least-
squares fit via the consistent mass matrix, would deliver further insight—we simply used
the interpolant IC.
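The uniform-mesh part of this comparison can be explored without a time integrator, because on a periodic uniform mesh each discrete Fourier mode of the semi-discrete linear-element equations evolves as $e^{-i\omega t}$ with $\omega = u\sin\theta/(m h)$ and $\theta = kh$. The sketch below is an illustration only, not the procedure of Rowley and Gresho (1987): it assumes exact time integration (not the trapezoid rule), linear elements, 80 nodes, and $\sigma = 4\Delta x$; the function name `advect_periodic` and its arguments are hypothetical.

```python
import cmath
import math


def advect_periodic(T0, laps, mass="consistent", u=1.0, length=1.0):
    """Advance nodal values T0 through `laps` circuits of a periodic uniform mesh.

    Each discrete Fourier mode is propagated exactly in time with the
    semi-discrete frequency omega = u*sin(theta)/(m(theta)*h).
    """
    n = len(T0)
    h = length / n
    t = laps * length / u
    # Forward discrete Fourier transform (O(n^2), adequate for n = 80).
    That = [sum(T0[j] * cmath.exp(-2j * math.pi * p * j / n) for j in range(n))
            for p in range(n)]
    omega = []
    for p in range(n):
        q = p if p <= n // 2 else p - n          # wrap so theta lies in (-pi, pi]
        theta = 2.0 * math.pi * q / n
        m = (2.0 + math.cos(theta)) / 3.0 if mass == "consistent" else 1.0
        omega.append(u * math.sin(theta) / (m * h))
    # Inverse transform with each mode's phase shifted by -omega*t.
    return [sum(That[p] * cmath.exp(1j * (2.0 * math.pi * p * j / n - omega[p] * t))
                for p in range(n)).real / n
            for j in range(n)]


n = 80
h = 1.0 / n
sigma = 4.0 * h
T0 = [math.exp(-((j * h - 0.5) ** 2) / (2.0 * sigma ** 2)) for j in range(n)]

# After an integer number of laps the exact solution equals the initial data.
err_lumped = max(abs(a - b) for a, b in zip(advect_periodic(T0, 15, "lumped"), T0))
err_consistent = max(abs(a - b) for a, b in zip(advect_periodic(T0, 15, "consistent"), T0))
print(err_lumped, err_consistent)
```

Because the time integration is exact here, the two printed errors isolate the spatial (dispersion) error of lumped and consistent linears after 15 laps; the variable-grid experiments are not covered by this shortcut, since Fourier modes are no longer eigenvectors on a non-uniform mesh.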
b. Advection-diffusion with periodic BC's
Little need be added to the pure advection discussion when the diffusion term is added to
the equation, except to note that: (i) the problem becomes parabolic rather than hyperbolic,
and thus much easier since 'roughness' no longer remains but is diffused into
'smoothness'; and (ii) the rate of diffusion in the semi-discrete equations only approximates the
correct rate. Each of the eigenmodes (still $e^{ikx}$) will decay (diffuse) while advecting, and
more quickly the higher the wave number owing to the larger gradients. The analytic
solution for the continuous case is easily found to be a simple combination of that for
pure advection (already discussed) and that for pure diffusion ($e^{ikx - k^2\kappa t}$); namely,
$$T(x,t) = e^{-k^2\kappa t}\,e^{ik(x-ut)}, \tag{2.6-108}$$
where $k\,(= k_n) = 2n\pi$. The eigenvalue for the AD operator is thus $\lambda_n = k_n^2\kappa + ik_n u$. Each
Fourier mode (eigenfunction) undergoes scale-dependent diffusion while being advected at
speed u—and, interestingly, this is the only case (BC's) for which eigenfunction advection
actually occurs; they decay in place for all other BC's, as we shall see.
For the semi-discretized approximation we will restrict the discussion to linear elements
and focus primarily on how GFEM and several other schemes approximate the decay
rate. The semi-discrete equations of interest are obtained by adding the diffusion term to
(2.6-17):
$$\tfrac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + \frac{u}{2h}\left(T_{j+1} - T_{j-1}\right) = \frac{\kappa}{h^2}\left(T_{j-1} - 2T_j + T_{j+1}\right), \quad j = 1, 2, \ldots, N. \tag{2.6-109}$$
We seek a solution of the form $T_j(t) = e^{-\mu t}e^{i(kx_j - \omega t)}$ to easily obtain
$$-(\mu + i\omega)\,m(\theta) + \frac{iu}{h}\sin\theta = -\frac{2\kappa}{h^2}(1 - \cos\theta), \tag{2.6-110}$$
where $\theta = kh$ and $m(\theta)$ is the mass matrix 'response,' $m(\theta) = (2 + \cos\theta)/3$, also called
the 'matrix symbol.' But we shall 'generalize' $m(\theta)$ to permit the inclusion of three
additional mass matrix approximations, as follows: $m = 1$ for lumped mass, $m = (1 + \cos\theta)/2$
for one-point quadrature (the 'Box' scheme), and $m = (3 + \cos\theta)/4$ for CVFEM.
In all cases, (2.6-110) yields
$$\mu = 2\kappa(1 - \cos\theta)/mh^2, \tag{2.6-111}$$
$$\omega = u\sin\theta/mh, \tag{2.6-112}$$
to give the eigenvalue $\lambda = \mu + i\omega$ and the solution
$$T_j(t) = e^{-2\kappa t(1-\cos\theta)/mh^2}\,e^{ik(jh - ut\sin\theta/m\theta)}, \tag{2.6-113}$$
which is to be compared with (2.6-45) through (2.6-47), the pure advection limit ($\kappa = 0$)
above, and of course to (2.6-108), the 'goal.' These eigenvectors translate to the right at
the (mode-dependent) phase speed while being damped by diffusion. It is seen that the
advection portion of the solution is unchanged, thus permitting us to focus on the diffusion
portion. Figure 2.6-46 shows the effective diffusivity, $\kappa_{\mathrm{eff}}/\kappa = 2(1 - \cos\theta)/m\theta^2$, ratioed
to the true diffusivity for the above-mentioned schemes.
We see that (i) CM (GFEM) is over-diffusive—an ostensible advantage in the
shortwave portion (6 > |;r; see Figure 2.6-3), since these are the modes that are inaccurately
advected; i.e., why not get rid of the noise more quickly? (ii) LM is rather under-diffusive
(noise will linger longer than it should); (iii) CVFEM is clearly the winner (finally!);
and (iv) the box scheme is a clear loser. Finally, we leave the analysis of quads as an
exercise—perhaps not easy.
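The ranking above follows directly from $\kappa_{\mathrm{eff}}/\kappa = 2(1 - \cos\theta)/m\theta^2$. A minimal sketch that tabulates the ratio for the four mass-matrix symbols quoted in the text is given here; the dictionary and function names are hypothetical.

```python
import math

# Mass-matrix symbols m(theta) for the four schemes discussed above.
MASS_SYMBOL = {
    "consistent": lambda th: (2.0 + math.cos(th)) / 3.0,
    "lumped": lambda th: 1.0,
    "box": lambda th: (1.0 + math.cos(th)) / 2.0,
    "cvfem": lambda th: (3.0 + math.cos(th)) / 4.0,
}


def diffusivity_ratio(scheme, theta):
    """Return kappa_eff/kappa = 2(1 - cos(theta)) / (m(theta) * theta**2)."""
    return 2.0 * (1.0 - math.cos(theta)) / (MASS_SYMBOL[scheme](theta) * theta ** 2)


for theta in (math.pi / 8, math.pi / 4, math.pi / 2):
    row = ", ".join(f"{name}={diffusivity_ratio(name, theta):.4f}" for name in MASS_SYMBOL)
    print(f"theta={theta:.4f}: {row}")
```

At $\theta = \pi/2$ the ratios are $12/\pi^2 \approx 1.216$ (consistent mass), $8/\pi^2 \approx 0.811$ (lumped mass), $16/\pi^2 \approx 1.621$ (box), and $32/3\pi^2 \approx 1.081$ (CVFEM), in line with the qualitative ranking given above.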
Exercise for the reader:
(1) Replace the GFEM advection term by its upwinded counterpart (cf. 2.6-67). Show
that the resulting AD ODE is
$$\tfrac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + \frac{u}{2h}\left(T_{j+1} - T_{j-1}\right) = \frac{u}{2h}\left(\frac{2L}{h\,Pe} + 1\right)\left(T_{j-1} - 2T_j + T_{j+1}\right),$$
which shows, for large Pe, that the equation (and therefore its solution) becomes
completely independent of Pe (all diffusion is numerical), a result that obtains for
other BC's as well, and one that carries over to both 2D and 3D and thus should
relegate simple upwinding to the numerical methods cemetery.
(2) Show that the resulting effective diffusivity is $\kappa_{\mathrm{eff}} = \kappa + uh/2$ and that the
effective Peclet number is $Pe_{\mathrm{eff}} = Pe/(1 + P)$, where $P = uh/2\kappa$ is the grid
Peclet number, showing that $Pe_{\mathrm{eff}} \to 2L/h$ for $\kappa \to 0$; hyperbolic behavior is
completely impossible with upwinding.
Fig. 2.6-46 Effective diffusivity of four methods. [Plot of $\kappa_{\mathrm{eff}}/\kappa$ versus $\theta$; labeled curves include the box scheme, consistent mass, and lumped mass.]
c. Advection-diffusion with Dirichlet BC's
o Continuum. We now switch from easy BC's to hard ones, still in 1D: we change from
studying IVP's to IBVP's in which physical diffusion will play a major role. We are thus
interested in approximate solutions to (2.3-1), such as given by (2.6-109), now with BC's
T(0,t) = T0(t) (2.6-114)
and
T(L,t) = TL(t), (2.6-115)
and IC (2.3-4). We will also address the steady-state version of the AD equation with
the above BC's, which will lead to a discussion of a classic tough case that has been
used, effectively but erroneously—we believe—to denigrate GFEM as a viable solution
technique. In fact, if it were not for the excessive use and abuse of the above case, the
steady-state section would probably not need to be written—at least for the advection-
dominated case (which we emphasize here). But written it must be—and for reasons that
are perhaps as much philosophical as technical. We hope, at the end, though, that the
reader will share with us a more balanced view regarding the utility, or otherwise of
GFEMIA—as measured against the various 'upwinded' alternatives whose pushers/sales
people/advocates/zealots believe in the need for artificial dissipation, intelligently (or,
sometimes, otherwise) applied. The reason that this section should not need to be written
is that the use of a Dirichlet boundary condition as an 'outflow' BC, is, to say the least, a
bit silly in a fluid flow problem. It (usually) makes little or no sense physically and causes
(or can cause) serious problems mathematically. That it is 'silly' has been addressed in (at
least) Gartling (1978), Chang and Finlayson (1980), and Gresho and Lee (1981). But we
believe that more yet needs to be said, and we now proceed to do so, beginning with the
associated eigenproblem, obtained by seeking $e^{-\lambda t}$ temporal behavior with homogeneous
BC's, which leads directly to
$$u\Phi_x - \kappa\Phi_{xx} = \lambda\Phi \quad \text{on } 0 < x < L = 1, \tag{2.6-116}$$
$$\Phi = 0 \quad \text{at } x = 0, L, \tag{2.6-117}$$
where $\Phi$ is an eigenfunction, with concomitant eigenvalue $\lambda$.
The solution is
$$\lambda_n = (n^2\pi^2 + Pe^2)\,\kappa/L^2, \tag{2.6-118}$$
$$\Phi_n(x) = e^{Pe\,x/L}\sin(n\pi x/L), \quad n = 1, 2, \ldots, \tag{2.6-119}$$
where
$$Pe = uL/2\kappa \tag{2.6-120}$$
is the (new) global Peclet number, the factor of two introduced solely for mathematical
convenience [as indeed it was for the local (grid) Peclet number, $P = uh/2\kappa = Pe\,h/L$],
and it is quite worth noting that each 'advection-diffusion' eigenfunction, as initial data,
only diffuses (!) in place, according to $T(x,t) = e^{-\lambda t}\Phi(x)$; they do not advect at all,
yet the linear combinations of them that define the general solution to the IBVP do
'advect,' or appear to, at least. Such is the 'power' of linear combinations/superposition.
This somewhat 'strange' (abnormal?) modal behavior is related to the fact, discussed in
more detail at the end of the next section (2.6.2d), that because of the BC's, the operator
is no longer normal and the modes no longer 'simply' orthogonal (vis-a-vis the normal
periodic case with orthogonal modes). It is also noteworthy that the 'advection part' of
the eigenvalue is independent of n (cf. the periodic case). The exponential factor—the
advection effect—is part of 'the problem.' And the Dirichlet OBC is the other part.
These two 'do battle' in that the former wants large $\Phi$, and the latter small $\Phi$ at the
outlet. Figure 2.6-47(a) shows the net result at a moderate Pe (24) and small n; the trend
is clear, and Dirichlet wins, as it must. The discrete versions (linear elements) shown in
Figures 2.6-47(b) and (c) will be derived in the next subsection; suffice it to say here that
the low modes are easily simulated when P is small.
But it wins at the expense of an extremely large gradient at x = L, given by
$$\Phi_n'(L) = (-1)^n\,\frac{n\pi}{L}\,e^{Pe}, \tag{2.6-121}$$
$= 2.6 \times 10^{10}\,(-1)^n n\pi/L$ for Pe = 24, and of a concomitant thin OBL (outflow boundary
layer), whose thickness is $O(Pe^{-1})$ or 'less' (less for large n). [The extremum closest
Fig. 2.6-47 First three Dirichlet eigenfunctions and eigenvectors for Pe = 24; N = 79 (P = 0.3) for (b) and (c). (b) Consistent mass. (c) Lumped mass.
to x = L (defining the OBL) is given by $x/L = (n\pi - \tan^{-1} n\pi/Pe)/n\pi$, giving $\delta/L = 1 - x/L = (1/n\pi)\tan^{-1}(n\pi/Pe)$, which looks like $\delta/L = 1/Pe$ for $n\pi \ll Pe$ and like $\delta/L = 1/2n$ for $n\pi \gg Pe$; i.e., the BL is very thin for large n and/or $Pe \gg 1$.] Figure 2.6-48
shows the same first three eigenfunctions, but now with Pe = 100—and only for the
right-most 20% of the domain. It is clear that advection takes the pure heat equation
eigenfunction and, for large Pe, 'compresses' it toward the right boundary—while strongly
amplifying it there.
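The size of the outlet gradient and the thinness of the OBL can be checked numerically from (2.6-119) and (2.6-121). The following is a small sketch, with hypothetical function names, that evaluates $\Phi_n'(L)$ and $\delta/L = \tan^{-1}(n\pi/Pe)/n\pi$ for the two Peclet numbers used in Figures 2.6-47 and 2.6-48.

```python
import math


def outlet_gradient(n, Pe, L=1.0):
    """Phi_n'(L) = (-1)**n * (n*pi/L) * exp(Pe) for Phi_n = exp(Pe*x/L)*sin(n*pi*x/L)."""
    return (-1) ** n * (n * math.pi / L) * math.exp(Pe)


def boundary_layer_thickness(n, Pe):
    """delta/L = atan(n*pi/Pe)/(n*pi): distance from x = L to the nearest extremum."""
    return math.atan(n * math.pi / Pe) / (n * math.pi)


for Pe in (24.0, 100.0):
    for n in (1, 2, 3):
        print(Pe, n, outlet_gradient(n, Pe), boundary_layer_thickness(n, Pe))
```

The two limits quoted in the text appear directly: the thickness approaches 1/Pe when $n\pi \ll Pe$ and 1/2n when $n\pi \gg Pe$.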
If one is really trying to solve an applied problem in thermal (or other) analysis, it
is almost clear already that the Dirichlet OBC is not such a good idea; not only does it
make little or no sense physically, but it causes unnecessarily difficult mathematical and
computational problems. But we shall 'push on' for largely historical reasons, leaving
until the next section a better, more sensible OBC.
Thus we have a hard problem for $Pe \gg 1$, since any solution of (2.3-1) with Dirichlet
BC's is always a linear combination of these 'badly behaved' eigenfunctions, whose
analytical solution, at least, is made simple (in principle) via the (adjoint) orthogonality
condition
$$(\Phi_n, \Phi_m) = \int_0^L e^{-2Pe\,x/L}\,\Phi_n\Phi_m\,dx = \delta_{nm}/2, \tag{2.6-122}$$
permitting (awkward) eigenfunction expansions of the following form: $f(x) = \sum_n a_n\Phi_n(x)$
via $a_n = 2\int_0^L f(x)\,e^{-Pe\,x/L}\sin(n\pi x/L)\,dx$, a somewhat counter-intuitive result for
large Pe, weighting 'small x' as it does.
Note too that the time constant for mode n,
$$\tau_n = 1/\lambda_n = \frac{L^2/\kappa}{n^2\pi^2 + Pe^2} = \frac{1}{n^2\pi^2\kappa/L^2 + u^2/4\kappa}, \tag{2.6-123}$$
is a combination of a (physical) diffusional time scale, $\tau_D = L^2/\kappa n^2\pi^2$, and an
advection-diffusional time scale, $\tau_{AD} = 4\kappa/u^2$, that is non-physical, the former (ultimately)
Fig. 2.6-48 Dirichlet eigenfunctions for Pe = 100.
dominating for the high modes and the latter for the low modes (at large Pe), the
'crossover' occurring near n = Pe, suggesting the possible need for many modes when
$Pe \gg 1$. We shall encounter the above non-physical time scale several more times.
A simple example of the use of these eigenfunctions is given by $T_0(x) = $ constant and
$T_L = T_0 = 0$, which describes the advection and diffusion of a (two-sided) step function:
$$T(x,t) = T_0\sum_{n=1}^{\infty}\frac{2n\pi\left[1 - (-1)^n e^{-Pe}\right]}{n^2\pi^2 + Pe^2}\,\Phi_n e^{-\lambda_n t}
= T_0\sum_{n=1}^{\infty}\frac{2n\pi\left[1 - (-1)^n e^{-Pe}\right]}{n^2\pi^2 + Pe^2}\,e^{Pe\,x/L}\sin(n\pi x/L)\,e^{-(n^2\pi^2 + Pe^2)\kappa t/L^2}, \tag{2.6-124}$$
which, for $Pe \gg 1$, is a complex way to say that the (near) step function translates
across the interval on the advective time scale (L/u) with 'massive' diffusion occurring
at x = L—and at x = ut for ut < L (back-diffusion from the backside of the step). There
are two interesting points worth noting here for this, or any other, Tq(x): (i) while each
individual mode has a decay time constant much smaller than the advective scale of L/u,
the total process does occur on this slower time scale, thus suggesting (again) that 'many'
modes will be required to get an accurate result—which is another manifestation of the
statement that these (Dirichlet BC at outlet) are 'hard' problems for large Pe; (ii) each
individual mode (eigenfunction, $\Phi_n$) decays 'in place' (N.B. As noted above, there is in
fact no advection of these 'advection-diffusion' modes.) according to its decay constant
($\lambda_n$), the linear combination of them conspiring to generate a total result (waveform) that
does move to the right (at speed u).
o Linear elements, mostly. We turn now to the approximation eigenproblem, via GFEM
and the associated discrete eigenproblem, the discrete version of (2.6-116) and (2.6-117),
given for linear basis functions by
$$u(T_{j+1} - T_{j-1})/2h - \kappa(T_{j-1} - 2T_j + T_{j+1})/h^2 = \lambda(T_{j-1} + 4T_j + T_{j+1})/6, \quad j = 1, 2, \ldots, N, \tag{2.6-125}$$
with $T_0 = T_{N+1} = 0$, the solution of which is sufficiently challenging/interesting that we
present it; i.e., we show one way to solve the generalized eigenproblem $Kz = \lambda Mz$ for
simple tri-diagonal matrices. It is this: using the fact (e.g. Fletcher and Griffiths, 1980) that
the eigenproblem $Az = \gamma z$, where A is N x N and tri-diagonal with lower diagonal l, main
diagonal d, and upper diagonal u, has the solution $z_j^{(m)} = (l/u)^{j/2}\sin jm\pi/(N+1)$ and
$\gamma_m = d + 2\sqrt{lu}\cos m\pi/(N+1)$, we form $A = \lambda M - K$, and solve $Az = \gamma z$, after which
we set $\gamma = 0$ to get $\lambda$. The result, applied to (2.6-125), is, with $h = L/(N+1)$ and $z = T$,
$$T_j^{(m)} = \left[\frac{1 + \lambda_m h^2/6\kappa + P}{1 + \lambda_m h^2/6\kappa - P}\right]^{j/2}\sin(jm\pi h/L), \tag{2.6-126}$$
and
$$\lambda_m = \frac{6\kappa}{h^2}\,\frac{2 + \cos^2\theta_m - \cos\theta_m\sqrt{9 - P^2(4 - \cos^2\theta_m)}}{4 - \cos^2\theta_m} \quad \text{for } m = 1, 2, \ldots, N, \tag{2.6-127}$$
where $\theta_m = m\pi h/L$ and (recall) $P = uh/2\kappa$. Also, we have changed the 'name' of the
eigenvectors, from $V_j$ in (2.6-21) to $T_j^{(m)}$. Actually, an easier alternative to solving
the quadratic equation $d(\lambda) + 2\sqrt{l(\lambda)u(\lambda)}\cos m\pi h = 0$ is to simply 'guess' the above
eigenvector (from the form of the tri-diagonal matrices) and place it into $Kz = \lambda Mz$,
which gives $\lambda$ directly. Anyway, we are done, for linears. This result [for $\lambda_m$, not for
$z^{(m)}$] is also presented in Fletcher and Griffiths (1980) and in Mitchell and Griffiths
(1979), in the latter of which they also show that all of the eigenvalues are real only for
$P \le 3/2$. They also show that all are complex for $P > \sqrt{3}$. For $1.5 < P < \sqrt{3}$, some are
real and some are complex. Recalling that the continuous case has only real eigenvalues,
clearly points to a 'problem' in the approximate solution, and one may suspect that the
approximation can only be good for 'small P.' We shall see later that this is generally a
valid suspicion; but we shall also show how a smart mesh with variable h can get good
results for very large values of P in most of the domain and small values [0(1)] only
locally, selectively.
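The closed forms (2.6-126) and (2.6-127) can be sanity-checked by substituting them back into a scaled version of (2.6-125), and the real/complex thresholds in P can be counted directly. The sketch below is an illustration with hypothetical function names; it works with $x = \lambda_m h^2/6\kappa$, and it selects the branch of $\sqrt{lu}$ from the condition $d + 2\sqrt{lu}\cos\theta_m = 0$ (an implementation choice, needed once the quantities become complex).

```python
import cmath
import math


def scaled_eigenvalue_cm(m, N, P):
    """Return x = lambda_m*h**2/(6*kappa) for consistent-mass linears, per (2.6-127)."""
    c = math.cos(m * math.pi / (N + 1))
    root = cmath.sqrt(9.0 - P * P * (4.0 - c * c))
    return (2.0 + c * c - c * root) / (4.0 - c * c)


def residual(m, N, P):
    """Largest scaled row residual of (2.6-125) for mode m (requires cos(theta_m) != 0)."""
    theta = m * math.pi / (N + 1)
    x = scaled_eigenvalue_cm(m, N, P)
    d = 4.0 * x - 2.0                  # scaled main diagonal of lambda*M - K
    upper = x + 1.0 - P                # scaled upper diagonal of lambda*M - K
    s = -d / (2.0 * math.cos(theta))   # branch of sqrt(l*u) with d + 2*s*cos(theta) = 0
    rho = s / upper                    # equals (l/u)**0.5 on that branch
    z = [rho ** j * math.sin(j * theta) for j in range(N + 2)]
    z[N + 1] = 0.0                     # enforce T_{N+1} = 0 exactly
    worst = 0.0
    for j in range(1, N + 1):
        lhs = P * (z[j + 1] - z[j - 1]) - (z[j - 1] - 2.0 * z[j] + z[j + 1])
        rhs = x * (z[j - 1] + 4.0 * z[j] + z[j + 1])
        worst = max(worst, abs(lhs - rhs))
    return worst / max(abs(v) for v in z)


def count_complex(N, P, tol=1e-12):
    """Number of modes whose eigenvalue has a non-negligible imaginary part."""
    return sum(abs(scaled_eigenvalue_cm(m, N, P).imag) > tol for m in range(1, N + 1))


for P in (1.4, 1.6, 1.8):
    print(P, count_complex(20, P))
print(residual(3, 20, 0.5), residual(7, 20, 2.0))
```

With N = 20 (chosen even so that no mode has $\cos\theta_m = 0$, for which the eigenvalue stays real at any P), the count goes from none complex at P = 1.4, to a mixture at P = 1.6, to all complex at P = 1.8, matching the thresholds 3/2 and $\sqrt{3}$ quoted above.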
The lumped mass version of the above results is easier to obtain and has been presented
in, at least, Hindmarsh et al. (1984):
$$\lambda_m = \frac{2\kappa}{h^2}\left(1 - \sqrt{1 - P^2}\,\cos\theta_m\right), \tag{2.6-128}$$
and
$$T_j^{(m)} = \left(\frac{1 + P}{1 - P}\right)^{j/2}\sin(jm\pi h/L) \tag{2.6-129}$$
for $P \ne 1$. For P = 1, the degenerate case is $\lambda_m = 2\kappa/h^2$ for all m and the single
eigenvector $T^{(m)} = (0, 0, \ldots, 0, 1)^T$. The lumped case has a real solution only for P < 1. For
P > 1, it is complex and can be written in a more convenient form:
$$T_j^{(m)} = (-i)^j\left(\frac{P + 1}{P - 1}\right)^{j/2}\sin(jm\pi h/L), \tag{2.6-130}$$
with a similar form of (2.6-126) for $P > \sqrt{3}$:
$$T_j^{(m)} = (-i)^j\left[\frac{P + (1 + \lambda_m h^2/6\kappa)}{P - (1 + \lambda_m h^2/6\kappa)}\right]^{j/2}\sin(jm\pi h/L). \tag{2.6-131}$$
In both cases, CM and LM, the corresponding eigenvalues have a factor of i in front of
the radical, and the quantity under the radical is negated; this corresponds to spurious,
damped, temporal oscillations, with the same damping rate ($\tau = h^2/2\kappa$) for all modes,
with frequency (LM) $(2\kappa/h^2)\sqrt{P^2 - 1}\,\cos\theta_m$ [highest for mode 1 and lowest (0) for the
(stationary) $2\Delta x$ mode]. Also, in both cases, both eigenvalue and eigenvector for the
discrete case converge, as they must, to the continuous results for $h \to 0$. Another
relevant property of all of the above four eigenvectors is that they display adjoint
orthogonality [cf. (2.6-122)] with respect to the mass matrix; i.e., the adjoint eigenvector is given
by $\tilde T_j^{(m)}(P) = T_j^{(m)}(-P)$, and it follows that $(\tilde T^{(m)})^T M T^{(n)} = 0$ for $m \ne n$. For m = n, the
'normalization' results are
$$[\tilde T^{(n)}]^T M_L T^{(n)} = (N + 1)/2 \tag{2.6-132}$$
for LM and
nn
3[f{n)]TMT{n) = N + 1 + cos
— sin • sin ' • cos —:— (2.6-133)
N+l N + 1 N + 1
for CM. A final property of these eigenvectors that is worth pointing out is that each
higher mode is obtainable from a particular lower mode and the 2Ax mode; i.e., as for
the periodic BC case, we have, for both lumped and consistent mass,
$$T_j^{(N+1-m)} = (-1)^{j+1}\,T_j^{(m)} \quad \text{for all } m. \tag{2.6-134}$$
Enough 'theory'—now for some pictures; beginning with a return to Figure 2.6-47,
where we have already noted that both CM and LM simulate well the low modes when
P is small. The high modes, even for P 'small,' are not nearly so well simulated, as
seen in Figure 2.6-49 for the (nearly) $4\Delta x$ mode (40) and for the nearly $2\Delta x$ mode [the
wavelength of mode 79 is 80/79 times $2\Delta x$]. Worse yet (much worse, in fact) is the case of
'large' P, where large here is basically any P giving complex eigenvalues and
eigenvectors; i.e., P > 1 for LM and $P > \sqrt{3}$ for CM. Thus, whereas the eigenvalues of the discrete
case are reasonable approximations to the continuum for small P, see Figure 2.6-50(a),
they are rather unreasonable for 'large' P, as shown in Figures 2.6-50(b) and (c) for P = 3,
wherein we note the 'similarity' between N = 79 and N = 799. The continuum
eigenvalues are purely real, and the approximate eigenvalues are complex—with the real part
being really far from the correct value. And we will not even try to display the complex
eigenvectors! No wonder that 'advection-dominated' flows are hard for Dirichlet BC's.
It even seems slightly amazing that either approximate solution can ever be close to
the exact solution for large P. But close it can be, even for $P = 10^6$ (say), as long as
the boundary at x = L is not 'seen', as we shall soon show. The 'success' alluded to
is apparently 'just' another manifestation of the miracle of linear combinations. But
see too the remarks on non-normal matrix eigenvector expansions at the end of the next
section (2.6.2d).
Considering quadratic elements, the analysis is much more difficult (penta-diagonal
matrices, quartic equations); thus, we just mention that some results are available in
Fletcher and Griffiths (1980), and some of their properties are discussed in Mitchell
and Griffiths (1979)—including lumped mass. They tell us that all eigenvalues are real
(GFEM) when $P \le 2\sqrt{31/15} \approx 2.9$; i.e., close to the previous results, from linears.
o Wiggles or not? We now continue our discussion of the time-dependent case, (2.3-1), by
returning to an example first presented in Gresho and Lee (1981)—again by integrating
the ODE's with a very small At to get an accurate ODE solution. To see how the Dirichlet
OBC makes an otherwise easy problem difficult, we set $T_0 = T_L = 0$, u = 1, $\kappa = 0.004$,
and use for $T_0(x)$ a fairly well-resolved Gaussian, $T_0(x) = e^{-(x - x_0)^2/2\sigma^2}$ with $\sigma = 1.6\Delta x$,
$Pe = u\sigma/\kappa = 10$ (P = 3.125), $\Delta x = 0.025$, and 40 linear elements. The problem is easy
until the Gaussian 'sees' (or feels) the hard BC, at which time GFEM announces the
difficulty by generating non-negligible wiggles. In Figure 2.6-51, we show the solution a short
time later (t = 0.3) to show that advection of this well-resolved IC really is easy (for CM
only, cf. Figures 2.6-24 and 2.6-25), and finally at t = 0.8, wherein the $\infty$-span solution
Fig. 2.6-49 Some higher modes with Dirichlet BC's for Pe = 24, N = 79 (P = 0.3): (a) Continuum, mode 40; (b) Consistent mass, mode 40; (c) Lumped mass, mode 40; (d) Continuum, mode 79; (e) Consistent mass, mode 79; (f) Lumped mass, mode 79.
would be halfway out of the domain. The infamous GFEM wiggles are sending out their
signal—and even more clearly in Figure 2.6-52, in which Pe has been increased to 1000.
The dotted lines show the corresponding $\infty$-span analytical solution,
$$T(x,t) = \exp\left[-(x - x_0 - t)^2/2(1 + 2t/Pe)\right]/\sqrt{1 + 2t/Pe}, \tag{2.6-135}$$
interpolated to the nodes, which, of course, is not valid when the BC $T_L = 0$ is
encountered. But it is a reasonable goal for OBC testing; i.e., the perfect OBC would generate
the $\infty$-span solution.
Remark:
It is interesting to note that the solutions obtained via numerical time integration could
also be obtained via the eigenvector expansion method, using a linear combination of
Fig. 2.6-50 Continuous and discrete eigenvalues for Dirichlet BC's: (a) Pe = 24, N = 79 (P = 0.3); (b) Pe = 240, N = 79 (P = 3); (c) Pe = 2400, N = 799 (P = 3). Panels (b) and (c) plot $\mathrm{Im}(\lambda_m)$ against $\mathrm{Re}(\lambda_m)$.
Fig. 2.6-51 Gaussian hitting hard OBC; Pe = 10. [GFEM at t = 0.3 and 0.8; exact at t = 0.0, 0.3, and 0.8; plotted against x.]
Fig. 2.6-52 Gaussian hitting hard OBC; Pe = 1000. [GFEM at t = 0.3 and 0.8; exact at t = 0.0, 0.3, and 0.8; plotted against x.]
those modes of, for example, (2.6-126), and it is interesting (again, perhaps) to ponder
how the short modes 'get excited' later when they appear to be absent from the IC
expansion and in the 'small t' solution. Such is the power of linear combinations; i.e.,
the apparently $2\Delta x$ oscillation is much more than just the $2\Delta x$ mode since its amplitude
coefficient must be very small. It is in fact a particular linear combination of many of the
short modes, each of which has a very small amplitude coefficient, that conspire (with
just the proper phases) to make up the wiggly solution.
Supposing, however, that we really wanted to solve the 'Dirichlet' problem, we must
respond to the wiggle signal, which forces us to realize that there is an OBL of thickness
$O(1/Pe)$ that is not being resolved by our uniform mesh. So we take advantage of one
of FEM's greatest virtues and design a new mesh that will, with little or no additional
cost, solve the stated problem. We re-meshed by adding five more elements, graded
between 0.00244 and 0.00595, from x = 1 to x = 0.98, and used a uniform mesh of
39 elements to the left of x = 0.98 ($\Delta x = 0.02513$). The virtually wiggle-free results are
shown in Figure 2.6-53 for Pe = 1000 at the same times as above. Resolving the OBL
has solved the main wiggle problem, and reveals the minor wiggles associated with
dispersion error; $\Delta x = 0.02513$ is just not quite small enough. So, if we resolve the
(silly?) OBL, we can get a good solution; in the next section we shall show how to get
a good solution on the 'coarse' mesh by employing a smarter OBC.
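A mesh of the kind just described can be generated in a few lines. The text does not state the grading law, so the sketch below assumes geometric grading with ratio 0.8, an assumption that happens to reproduce the quoted end sizes (about 0.00595 and 0.00244) when five elements fill the strip 0.98 <= x <= 1; the function name and parameters are hypothetical.

```python
def graded_outlet_mesh(n_uniform=39, x_split=0.98, n_graded=5, ratio=0.8, length=1.0):
    """Return (nodes, sizes): uniform elements on [0, x_split], then geometrically
    shrinking elements on [x_split, length]. Grading law and ratio are assumptions."""
    h_uniform = x_split / n_uniform
    weights = [ratio ** i for i in range(n_graded)]
    scale = (length - x_split) / sum(weights)    # graded sizes fill the outlet strip exactly
    sizes = [h_uniform] * n_uniform + [scale * w for w in weights]
    nodes = [0.0]
    for h in sizes:
        nodes.append(nodes[-1] + h)
    nodes[-1] = length                           # remove accumulated round-off at the outlet
    return nodes, sizes


nodes, sizes = graded_outlet_mesh()
print(len(sizes), sizes[0], sizes[39], sizes[-1])
```

The resulting 44-element mesh has a uniform spacing of about 0.02513 and a smallest element of about 0.00244 at the outlet.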
o Diffusion wiggles, minimum time of believability. We will conclude the discussion of
the time-dependent case by showing how the mass matrix wiggles that can occur for even
pure diffusion (u = 0) can and should probably be regarded as a blessing in disguise, a
point first brought out in Gresho and Lee (1981). Consider the following common 'sharp'
transient—a step change in boundary temperature:
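$$\frac{\partial T}{\partial t} = \kappa\frac{\partial^2 T}{\partial x^2} \quad \text{on } 0 < x < 1, \tag{2.6-136}$$
with initial temperature zero and boundary temperature
$$T(0,t) = 1, \qquad T(1,t) = 0. \tag{2.6-137}$$
The exact solution is given by
$$T(x,t) = (1 - x) - \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{\sin n\pi x}{n}\,e^{-n^2\pi^2\kappa t}, \tag{2.6-138}$$
and a useful small-time approximation (based on a semi-infinite thickness) is
$$T(x,t) = 1 - \mathrm{erf}\left(x/\sqrt{4\kappa t}\right), \tag{2.6-139}$$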
Fig. 2.6-53 Gaussian hitting hard resolved OBC; Pe = 1000. [GFEM at t = 0.3 and 0.8; exact at t = 0.0, 0.3, and 0.8; plotted against x.]
from which the heat flux at x = 0 is seen to be
$$q = \kappa\,\partial T/\partial x = -\kappa/\sqrt{\pi\kappa t}, \tag{2.6-140}$$
which, of course, is unbounded for $t \to 0$. This should clearly alert the analyst to 'worry'
about the approximate solution's behavior near x = 0 for small time. But if the analyst
does not know of the asymptotic heat flux approximation, or the 'real' problem is so
much more difficult that no analytical solution of any sort is available, s/he can usually
rely on the GFEM to signal the danger—by making WIGGLES. In the above case the
GFEM ODE's, given by $M\dot T + KT = f$, give the initial temperature 'acceleration'
$$\dot T_0 = M^{-1}f, \tag{2.6-141}$$
where f is a vector of zeros except for the first entry (from the inhomogeneous Dirichlet
BC). Now, a property of the mass matrix is that its inverse (which is dense) oscillates in
sign (and decreases in magnitude) away from the main diagonal, which is positive, 'so
that' $M^{-1}f$ can be a least-squares best fit. The net result, for this case, is that $M^{-1}f$
picks out the first column of $M^{-1}$ to get $\dot T_0$. But since the entries of $(M^{-1})_{i1}$
oscillate in sign, so too do the entries in $\dot T_0$; thus, whereas the exact T(x, t) is always
positive, every other node in the GFEM acceleration vector is negative, thus causing
one half of the nodal values to start out in a definitely non-physical way. If the mass
is lumped (i.e., if we do not employ GFEM), only node 1 has a non-zero value of $\dot T_0$
(and it is positive), and there are never negative temperatures. (But of course, the LM
results are also wrong for small t—a subject we return to at the end of the next chapter:
Section 3.19.) So what are the GFEM wiggles telling us? They are alerting us to the
fact—perhaps not appreciated a priori and certainly not recognized by the lumped mass
approximation—that we have defined a very difficult problem and that the approximate
solution at small time [at least while any nodal values $T_j(t)$ are negative], especially
near x = 0, is not reliable. This is the self-diagnosis capability referred to earlier. Once
so alerted though, what is the analyst to do? Here are two responses—each of which is
better than mass lumping and living in its associated and erroneous dream world: (i) do
not believe the solution for times shorter than that at which the last negative temperature
(which will be closest to the step change) passes through zero from below—called the
'minimum time of believability' in Gresho and Lee (1981); (ii) equate the t = 0 heat flux
from the discrete solution, which can be approximated by $\kappa\,\partial T/\partial x = -\kappa/h$, where h is the
first element length, to the analytic flux at small time given by (2.6-140), which yields a
simple estimate of the minimum time of believability (since $-\kappa/h$ is an upper bound for
the discrete problem). This yields
$$t_c = h^2/\pi\kappa, \tag{2.6-142}$$
which is also close to the element time constant—see Gresho and Lee (1981), in which
is also shown the exact solution to the finite element equations.
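The sign oscillation of $M^{-1}f$, and the estimate (2.6-142), are easy to reproduce. The sketch below is illustrative only: it assumes a uniform mesh of linear elements on 0 < x < 1, takes f to have the single non-zero entry $\kappa/h$ at node 1, and solves the tri-diagonal system with a hand-written Thomas algorithm; all function names are hypothetical.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tri-diagonal system with constant band entries."""
    n = len(rhs)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper / diag
    d_prime[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - lower * c_prime[i - 1]
        c_prime[i] = upper / denom
        d_prime[i] = (rhs[i] - lower * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def initial_rates(n_interior, kappa=1.0, consistent=True):
    """Initial nodal rates M^{-1} f for the step change T(0, t) = 1 on a uniform mesh."""
    h = 1.0 / (n_interior + 1)
    f = [kappa / h] + [0.0] * (n_interior - 1)   # only node 1 feels the boundary value
    if consistent:
        return solve_tridiagonal(h / 6.0, 4.0 * h / 6.0, h / 6.0, f)
    return [fi / h for fi in f]                  # lumped mass: M = h * I


def believability_time(h, kappa=1.0):
    """t_c = h**2 / (pi * kappa), equation (2.6-142)."""
    return h * h / (math.pi * kappa)


rates_cm = initial_rates(9)
rates_lm = initial_rates(9, consistent=False)
print(rates_cm)
print(rates_lm)
print(believability_time(0.1))
```

With consistent mass the initial rates alternate in sign and decay away from the boundary; with lumped mass only the first node responds. The last line evaluates $t_c$ for h = 0.1 and $\kappa = 1$.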
Related to (2.6-142), we present below the maximum eigenvalues (minimum time
constants) for (2.6-136) when 'solved' via linear (L) or quadratic (Q) FEM:
$$\tau^L_{CM} \approx h^2/12\kappa, \qquad \tau^Q_{CM} \approx h^2/15\kappa \tag{2.6-143}$$
and their lumped mass counterparts:
$$\tau^L_{LM} \approx h^2/4\kappa, \qquad \tau^Q_{LM} \approx h^2/6\kappa, \tag{2.6-144}$$
where h is the smallest nodal separation in the mesh—all of which are useful to know
and are derived by studying the eigenproblem for a single element; see Gresho and Lee
(1981) and Hughes et al. (1979).
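For linear elements, the two values quoted above can be checked from a single two-node element of length h. The short derivation below is our own illustration using the standard element matrices; it is not taken from the cited references, and it does not cover the quadratic case.

$$K^e = \frac{\kappa}{h}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad
M^e_{CM} = \frac{h}{6}\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
M^e_{LM} = \frac{h}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

With $v = (1, -1)^T$ we have $K^e v = (2\kappa/h)\,v$, $M^e_{CM} v = (h/6)\,v$, and $M^e_{LM} v = (h/2)\,v$, so that $K^e v = \lambda M^e v$ gives

$$\lambda_{CM} = \frac{2\kappa/h}{h/6} = \frac{12\kappa}{h^2}, \qquad \lambda_{LM} = \frac{2\kappa/h}{h/2} = \frac{4\kappa}{h^2},$$

i.e., $\tau^L_{CM} = h^2/12\kappa$ and $\tau^L_{LM} = h^2/4\kappa$. The other eigenvector, $(1, 1)^T$, has $\lambda = 0$, so these are the largest element eigenvalues.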
Another thing the GFEM solution would show you is the desirability of employing
a graded mesh—small elements near x = 0—because a uniform mesh will cause larger
wiggles near x = 0 than elsewhere. Finally, even a very well-designed graded mesh will
have its own (smaller!) minimum time of believability—and GFEM will also announce
that fact, in the usual way.
Our last time-dependent 'example' is borrowed, but appears to be useful enough to pass
on to others. In Vichnevetsky (1985) an inlet boundary condition was proposed that lets
any spurious upwind-moving noise (wiggles) leave gracefully—rather than being fully
reflected (and 'aliased' to long waves) as does the simple Dirichlet BC, $T(0,t) = T_0(t)$.
He addresses only linear elements, but shows for them that a boundary condition that
couples the inlet node, say $\tilde T_0$, to the first interior node, as follows:
$$\tilde T_0(t) = 2T_0(t) - T_1(t), \tag{2.6-145}$$
which looks like $T_0(t)$ applied at the middle of the first element, successfully removes
upstream-moving wiggles.
o Steady state. Now we turn to the steady-state case. We will introduce the (over-
publicized) 'tough' problem alluded to above and solve it three ways: (i) uniform mesh
GFEM, (ii) smart upwind methods, and (iii) GFEMIA. But in between the first two we
will summarize the error analysis that applies to (i)—because (iii) will obviate it.
The steady-state solution of (2.3-1), (2.6-114), and (2.6-115) is (with $Pe = uL/2\kappa$)
$$\frac{T(x) - T_0}{T_L - T_0} = \frac{e^{2Pe\,x/L} - 1}{e^{2Pe} - 1}, \tag{2.6-146}$$
while that of its GFEM approximation with N + 1 linear elements,
$$Pe\,(T_{j+1} - T_{j-1}) = \frac{T_{j+1} - T_j}{h_{j+1}/L} + \frac{T_{j-1} - T_j}{h_j/L}, \quad j = 1, 2, \ldots, N, \tag{2.6-147}$$
on a uniform mesh [$h_j = h = L/(N + 1)$] is (use $T_j = a + b\,\xi^j$)
$$\frac{T_j - T_0}{T_L - T_0} = \frac{\left(\frac{1+P}{1-P}\right)^{j} - 1}{\left(\frac{1+P}{1-P}\right)^{N+1} - 1}, \tag{2.6-148}$$
j = 0, 1, ..., N + 1. The approximation of (2.6-148) to (2.6-146) can be shown to be:
excellent for $P \ll 1$, 'reasonable' for P = O(1) but not > 1, oscillatory for P > 1, very
bad/unreasonable/wiggly for $P \gg 1$, and sometimes unbounded (!) for $Pe \to \infty$. For
example, in Figure 2.6-54 we show, for $T_0 = 1$ and $T_L = 0$, the exact solution and the
approximate solution for N = 7 for Pe = 4 (P = 1/2) and Pe = 16 (P = 2).
Worse yet is large P: Figure 2.6-55 shows the results for Pe = 160 on two (coarse)
grids: N = 6 (P = 22 6/7) and N = 7 (P = 20). The reason for two values of N is to display
the disjoint solutions for odd and even j when N is odd, wherein the odd nodes actually
become unbounded for fixed N ($T_j \sim Pe/N^2 \sim P/N$ for j odd and $P \gg N$).
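The behavior described above can be reproduced by evaluating (2.6-146) and (2.6-148) directly. The sketch below assumes $T_0 = 1$, $T_L = 0$, and L = 1, with $P = Pe/(N+1)$; the function names are hypothetical.

```python
import math


def exact_steady(x, Pe, T0=1.0, TL=0.0, L=1.0):
    """Exact steady solution (2.6-146), with Pe = u*L/(2*kappa)."""
    return T0 + (TL - T0) * math.expm1(2.0 * Pe * x / L) / math.expm1(2.0 * Pe)


def gfem_steady(N, Pe, T0=1.0, TL=0.0):
    """Nodal GFEM solution (2.6-148) on a uniform mesh of N + 1 linear elements (P != 1)."""
    P = Pe / (N + 1)                  # grid Peclet number, P = Pe*h/L
    r = (1.0 + P) / (1.0 - P)
    return [T0 + (TL - T0) * (r ** j - 1.0) / (r ** (N + 1) - 1.0) for j in range(N + 2)]


for N, Pe in ((7, 4.0), (7, 16.0), (6, 160.0), (7, 160.0)):
    nodal = gfem_steady(N, Pe)
    exact = [exact_steady(j / (N + 1), Pe) for j in range(N + 2)]
    print(N, Pe, [round(v, 3) for v in nodal], [round(v, 3) for v in exact])
```

For Pe = 160 the odd-N mesh (N = 7) shows the large odd-node values noted above, whereas the even-N mesh (N = 6) stays bounded but oscillatory.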
Fig. 2.6-54 Exact and GFEM (N = 7) solutions for two Peclet numbers: Pe = 4 (P = 1/2) and Pe = 16 (P = 2); T(x) plotted against x.
Fig. 2.6-55 Exact and two GFEM solutions for Pe = 160; T(x) plotted against x.
In any event, it is clear that GFEM gives very poor accuracy for $P \gg 1$. But it also gives a very clear wiggle signal. And we shall respond to these two aspects in two different ways: first, by verifying/ascertaining analytically (via error analysis) that GFEM can be very 'bad,' and later, by heeding the wiggle signal, to show that GFEMIA can be very 'good'. To this end, then, we present an error analysis of the GFEM approximation, largely following T. Hughes (personal communication). Defining the error as $e = T^h - T$, where $T^h = \sum_j T_j \varphi_j(x)$ and $T$ is the exact solution, a first basic step in finite element error analysis is to decompose the error into two parts, as follows, and analyze each part separately:
$$e = (T^h - \tilde{T}^h) + (\tilde{T}^h - T) = e^h + \eta, \tag{2.6-149}$$
where $\tilde{T}^h$ is the finite element interpolant of $T$, $e^h$ is the part of the error contained in the finite element space, and $\eta$ (the interpolation error) is that part of the error not contained in the finite element space (because $T$ is not so contained). The plan then is to independently bound these two errors, then combine the results via the triangle inequality.
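To set the stage for $e^h$, we first restate the weak form of the problem as
$$B(T^h, w^h) = \int_0^L \left(u\,T^h_x w^h + \kappa\,T^h_x w^h_x\right) dx = 0 \quad \text{for every } w^h, \tag{2.6-150}$$
where, for convenience, we are assuming that every trial solution satisfies the inhomogeneous Dirichlet BC's. Next, we note that the true solution also satisfies (2.6-150) to give $B(e, w^h) = 0$; the projection of the error onto the finite-dimensional subspace vanishes. We also need the following 'stability' result:
$$B(w^h, w^h) = \int_0^L \left(u\,w^h_x w^h + \kappa\,(w^h_x)^2\right) dx = \kappa\,\|w^h_x\|_0^2 \tag{2.6-151}$$
since $w^h = 0$ on $\Gamma$. We will use this result to obtain an estimate for $e^h$, via
$$\begin{aligned}
\kappa\,\|e^h_x\|_0^2 &= B(e^h, e^h) \\
&= B(e^h, e - \eta) \\
&= B(e^h, e) - B(e^h, \eta) \\
&= -B(e^h, \eta) \\
&= |B(e^h, \eta)| \\
&= \left|\int_0^L \left(u\,e^h_x \eta + \kappa\,e^h_x \eta_x\right) dx\right|,
\end{aligned} \tag{2.6-152}$$
which, via the triangle inequality, yields $\kappa\,\|e^h_x\|_0^2 \le u\,|(e^h_x, \eta)| + \kappa\,|(e^h_x, \eta_x)|$, which, via the Cauchy-Schwarz inequality, yields $\kappa\,\|e^h_x\|_0^2 \le u\,(\|e^h_x\|_0 \cdot \|\eta\|_0) + \kappa\,(\|e^h_x\|_0 \cdot \|\eta_x\|_0)$, which, via an application of Young's inequality,
$$xy \le \frac{1}{2}\left(a x^2 + \frac{1}{a} y^2\right) \quad \text{for all } a > 0, \tag{2.6-153}$$
[obtainable from $(x\sqrt{a} - y/\sqrt{a})^2 \ge 0$], yields
$$\kappa\,\|e^h_x\|_0^2 \le \frac{u}{2}\left(a_1 \|e^h_x\|_0^2 + \frac{1}{a_1}\|\eta\|_0^2\right) + \frac{\kappa}{2}\left(a_2 \|e^h_x\|_0^2 + \frac{1}{a_2}\|\eta_x\|_0^2\right) \tag{2.6-154}$$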
for arbitrary $a_1$ and $a_2$. We now apply a common trick in such error analyses: pick $a_1$ and $a_2$ such that when the $\|e^h_x\|_0^2$ terms on the RHS of (2.6-154) are transferred to the LHS, the final coefficient is positive; i.e., we want $\kappa - \frac{1}{2}(a_1 u + a_2 \kappa) = c > 0$. A good way to do this is to set $c = \kappa/2$, which implies $a_1 u + a_2 \kappa = \kappa$, which leads to the following logical choice: $a_1 = \kappa/2u$ and $a_2 = 1/2$. Then (2.6-154) rearranges to
$$\|e^h_x\|_0^2 \le 2\left(\frac{u^2}{\kappa^2}\|\eta\|_0^2 + \|\eta_x\|_0^2\right). \tag{2.6-155}$$
To make further progress, we recall two facts:
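1. $|\varphi|_1^2 = \int \varphi_x^2 = \|\varphi_x\|_0^2$,
2. $\|\varphi\|_1^2 = \int (\varphi^2 + \varphi_x^2) = \|\varphi\|_0^2 + |\varphi|_1^2$,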
and that the semi-norm, $|\varphi|_1$, qualifies as a norm because the only constant function allowed is $\varphi = 0$ because $\varphi = 0$ on $\Gamma$. Thus, (2.6-155) can be rewritten as
$$|e^h|_1^2 \le 2\left(\frac{u^2}{\kappa^2}\|\eta\|_0^2 + |\eta|_1^2\right), \tag{2.6-156}$$
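and we have a valid (norm) estimate of $e^h$ in terms of the interpolation error. To finish, we apply the triangle inequality to (2.6-149) in the $H^1$ semi-norm, $|e|_1 \le |e^h|_1 + |\eta|_1$, square it, $|e|_1^2 \le |e^h|_1^2 + 2|e^h|_1 \cdot |\eta|_1 + |\eta|_1^2$, and apply Young's inequality with $a = 1$ to the middle term, $|e^h|_1 \cdot |\eta|_1 \le \frac{1}{2}(|e^h|_1^2 + |\eta|_1^2)$, to give, finally,
$$\begin{aligned}
|e|_1^2 &\le 2\left(|e^h|_1^2 + |\eta|_1^2\right) \\
&\le 2\left[2\left(\frac{u^2}{\kappa^2}\|\eta\|_0^2 + |\eta|_1^2\right) + |\eta|_1^2\right] \\
&= 2\left[\frac{2u^2}{\kappa^2}\|\eta\|_0^2 + 3|\eta|_1^2\right] \\
&\le 2\max\left(\frac{2u^2}{\kappa^2}, 3\right)\left(\|\eta\|_0^2 + |\eta|_1^2\right) \\
&= 2\max\left(\frac{8Pe^2}{L^2}, 3\right)\|\eta\|_1^2.
\end{aligned} \tag{2.6-157}$$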
To finish, all we need is the $H^1$-estimate of the interpolation error, a standard result (e.g., Strang and Fix, 1973, or p. 190 of Wait and Mitchell, 1985):
$$\|\eta\|_1 \le c\,h^k\,\|T\|_{k+1}, \tag{2.6-158}$$
where $k$ is the polynomial degree ($k = 1$ for linear elements). Thus, finally,
$$|e|_1 \le \max\left(\frac{4Pe}{L}, \sqrt{6}\right) c\,h^k\,\|T\|_{k+1} \tag{2.6-159}$$
bounds the GFEM error in terms of the exact solution, and we zoom right in on the GFEM's major alleged defect: for $Pe > L\sqrt{6}/4$, the error increases linearly with Pe (for fixed $h$), so that large Pe simulations may show large errors. (Indeed, this is quite the case for the simple example presented above, and perhaps also for the time-dependent case; cf. Figures 2.6-51 and 2.6-52.) For this case, it follows easily that to have $|e|_1$ small, it is necessary (at least) to keep $Pe\,h^k$ small, giving $P$ 'small' for linear elements, $hP$ 'small' for quadratics, etc. It thus follows, from this 'worst case' error analysis, that the only way linear elements could be accurate is if $P < O(1)$, and this is true, per the above example, if a uniform mesh is employed. For example, if $Pe = 10^4$, we would need $10^4$ elements to get $P = 1$. And, as it turns out, the result is 'only if,' in a sense; i.e., it turns out that the worst thing you can do for this hard OBC/Dirichlet case is to use a uniform mesh; it maximizes the error! To make our main point of this section in the clearest possible way, we state now what we shall prove later: the large Pe case can always be solved quite accurately, in spite of the dreary estimate of (2.6-159), by the intelligent use of just two elements (one degree of freedom)! Details later.
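The arithmetic behind the 'worst case' estimate is easy to tabulate. The sketch below is an illustration only: it evaluates the factor $\max(4Pe/L, \sqrt{6})$ appearing in (2.6-159) and the number of equal elements needed to bring the grid Peclet number $P = Pe\,h/L$ down to a target value. The helper names are hypothetical, and the interpolation constant $c$ and the norm of $T$ are deliberately left out.

```python
import math


def error_constant(Pe, L=1.0):
    """Factor max(4 Pe / L, sqrt(6)) that multiplies c h^k ||T||_{k+1} in (2.6-159)."""
    return max(4.0 * Pe / L, math.sqrt(6.0))


def uniform_elements_for(Pe, P_target=1.0):
    """Number of equal elements so that P = Pe h / L does not exceed P_target."""
    return math.ceil(Pe / P_target)


if __name__ == "__main__":
    for Pe in (0.1, 1.0, 100.0, 1.0e4):
        print(f"Pe={Pe:9.1f}  factor={error_constant(Pe):10.3f}  "
              f"elements for P=1: {uniform_elements_for(Pe)}")
```

The factor is constant ($\sqrt{6}$) below $Pe = L\sqrt{6}/4$ and grows linearly with Pe above it.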
o Smart upwinding. Before returning to GFEMIA, we believe a short digression is in order to discuss what some believe is the 'proper' solution to GFEM's hard OBC/wiggles dilemma: smart upwinding. To review the history of smart upwinding prior to about 1980, see Gresho and Lee (1981) wherein it was stated that 'Further ... literature review would undoubtedly continue to reveal more and more re-discoveries of this highly touted scheme.' Here are two more, unknown to us then and brought to our attention via Segal (1982) in a paper contemporaneous and complementary [and (thus) well worth reading!] to our 'wiggle paper': (i) Il'in (1969) and (ii) Chien (1977); both are in a 'disguised' (and less compact) form, as is Segal's, representing the effective diffusivity as $\kappa P[1 - 2(e^{2P} - 1)/(e^{4P} - 1)]^{-1}$ rather than the simpler equivalent form, $\kappa P \coth P$, which we present below.
First, we define the smart upwind method: it is a method that obtains the exact solution, (2.6-146), at the nodes (in 1D only). And it 'looks like'/becomes simple upwinding when the grid Peclet number is large. Next, we present our version of it, on a variable grid yet (sought by Segal, 1982), beginning by rewriting (2.6-147) as
$$T_{j+1} - T_{j-1} = \frac{T_{j+1} - T_j}{P_{j+1}} + \frac{T_{j-1} - T_j}{P_j}, \tag{2.6-160}$$
where $P_j = uh_j/2\kappa$, and we note that
$$\sum_{j=1}^{N+1} P_j = Pe. \tag{2.6-161}$$
Replacing $1/P_j$ by $\coth P_j$ [which functions agree to $O(P_j)$ for $P_j \to 0$] gives
$$T_{j+1} - T_{j-1} = (\coth P_{j+1})(T_{j+1} - T_j) + (\coth P_j)(T_{j-1} - T_j), \tag{2.6-162}$$
the solution of which is exactly (2.6-146) evaluated at the nodes. For a uniform mesh, an alternate representation of (2.6-162) is
$$u(T_{j+1} - T_{j-1})/2h = \kappa(P \coth P)(T_{j+1} - 2T_j + T_{j-1})/h^2, \tag{2.6-163}$$
thus revealing the effective diffusivity (when viewed as a centered scheme),
$$\kappa_{\mathrm{eff}} = \kappa P \coth P, \tag{2.6-164}$$
giving (for $P \to 0$) $\kappa_{\mathrm{eff}} = \kappa[1 + P^2/3 + O(P^4)]$. This is the form that has been derived many times and in many different ways: finite differences, finite elements with Petrov-Galerkin weighting, etc. For small $P$, $\kappa_{\mathrm{eff}} \approx \kappa$, and the central differencing scheme is recovered; but for large $P$, $\kappa_{\mathrm{eff}} \approx \kappa P = uh/2$, which recovers 'pure' upwinding for $P \to \infty$ (and yields $T_j = T_{j-1}$). But it is for intermediate $P$ that this scheme gained its fame: not many approximation schemes are nodally exact. The problem is: it does not generalize; not to the time-dependent case, not to situations with internal sources or sinks, and not to multi-dimensions. Thus, we believe it to be cute, interesting, but not really viable. Also, some observations from the study by Segal (1982) of it and other methods, in which smart upwinding is referred to as 'Il'in's scheme', are worth mentioning:
1. 'Hence we may conclude that the numerical oscillations... are caused by the presence
of the normal boundary layer and the fact that Dirichlet boundary conditions are given at
the outflow boundary.'—p. 332.
2. 'These examples show that when a normal boundary layer is present, a central difference
scheme with mesh refinement is preferable to upwind differencing. Only in the case that
the outer solution is constant does the Il'in scheme appear to be very accurate.'—p. 334.
3. 'Our final conclusion is that for the general (2D) case, where boundary layers are
present, the "outer solution" is not a constant, a Neumann condition is given at the outflow
(normal) boundary and mesh refinement is used in the horizontal (parallel) boundary,
central differencing is much more accurate than upwind differencing...'—p. 340.
Another paper on this subject that is worth reading is Smith (1980), in which a number
of exact solutions to the discrete equations, via FDM, linear FEM, and even quadratic
FEM, are derived and discussed. The conclusions reached by Smith are in complete
agreement with Segal's—and with our own.
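The nodal exactness claimed for the smart upwind equations (2.6-162) can be checked directly by substituting the exact nodal values of (2.6-146) on a non-uniform mesh and inspecting the residuals. The sketch below does this, and also compares the 'disguised' effective diffusivity with the compact form $P \coth P$ of (2.6-164). It assumes $T_0 = 1$, $T_L = 0$; the mesh, the value of Pe, and all function names are arbitrary illustrative choices.

```python
import math


def coth(p):
    return 1.0 / math.tanh(p)


def exact(xi, Pe, T0=1.0, TL=0.0):
    """Exact steady solution (2.6-146) at xi = x/L, arranged to avoid overflow."""
    ratio = (math.exp(-2.0 * Pe * (1.0 - xi))
             * math.expm1(-2.0 * Pe * xi) / math.expm1(-2.0 * Pe))
    return T0 + (TL - T0) * ratio


def smart_upwind_residuals(xi, Pe):
    """Residuals of (2.6-162) with exact nodal values substituted.

    xi: increasing node positions x/L with xi[0] = 0 and xi[-1] = 1.
    """
    T = [exact(x, Pe) for x in xi]
    # P[j] = Pe * h_j / L for the element between nodes j-1 and j (P[0] unused)
    P = [0.0] + [Pe * (xi[j] - xi[j - 1]) for j in range(1, len(xi))]
    res = []
    for j in range(1, len(xi) - 1):
        lhs = T[j + 1] - T[j - 1]
        rhs = coth(P[j + 1]) * (T[j + 1] - T[j]) + coth(P[j]) * (T[j - 1] - T[j])
        res.append(lhs - rhs)
    return res


def kappa_eff_ratio(P):
    """kappa_eff / kappa = P coth P, as in (2.6-164)."""
    return P * coth(P)


def kappa_eff_ratio_disguised(P):
    """The less compact form P [1 - 2 (e^{2P} - 1)/(e^{4P} - 1)]^{-1}."""
    return P / (1.0 - 2.0 * math.expm1(2.0 * P) / math.expm1(4.0 * P))


if __name__ == "__main__":
    mesh = [0.0, 0.3, 0.55, 0.8, 0.95, 1.0]
    print("max residual:", max(abs(r) for r in smart_upwind_residuals(mesh, 5.0)))
    for P in (0.1, 0.7, 3.0):
        print(P, kappa_eff_ratio(P), kappa_eff_ratio_disguised(P))
```

The residuals vanish to round-off on any mesh because $\coth p\,(e^{2p} - 1) = e^{2p} + 1$, which is the identity that makes the scheme nodally exact.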
o Try GFEMIA! So much for smart (and other) upwinding. We now return to GFEMIA and attempt to drive another nail into the upwinder's coffin, which is probably just wishful thinking since this religion will probably never die. And this brings us to an interesting paper by Veldman and Rinzema (1992) that is worth recognizing before we present our version of it, because they too look at crude-but-smart solutions employing but one internal grid point (solutions that we will generalize and improve), and they compare GFEM's version of variable grids, (2.6-160), with a common, second-order-accurate (Taylor series) variable grid used by finite differencers. These are called, respectively, 'Method A' and 'Method B' in their paper. Some quotations:
1. 'Method A is only of first order, whereas Method B is of second order. Nevertheless,
Method A gives good results for all three grids, whereas Method B is not able to produce
an acceptable solution at all. Thus, the local truncation error does not give a reliable
indication about the behavior of the global discretization error'—p. 124.
2. 'Method A has often been mentioned, but each time it was rejected because of its LTE.
The present experiments show that this rejection has been premature; Method A is much
more powerful than generally assumed'—p. 130.
So let us return to 'Method A' (GFEM) and consider the intelligent placement of nodes for the hard (Dirichlet) OBC problem, (2.6-160) with $T_0 = 1$ and $T_L = 0$, as before. Let us start with a two-element mesh ($N = 1$) for which the single nodal equation, from (2.6-160), gives
$$T_1 = \frac{1/P_1 + 1}{1/P_1 + 1/P_2}, \tag{2.6-165}$$
where $P_1 = uh_1/2\kappa$, $P_2 = uh_2/2\kappa$, and $P_1 + P_2 = Pe$, à la (2.6-161); i.e., $h_1 + h_2 = L$. We take Pe large and see how $T_1$ varies as we 'move' the single node. It is not hard to find that $T_1(P_1)$ peaks at $P_1 = \frac{1}{2}(Pe - 1) \approx Pe/2$ with value $T_1 = (Pe + 1)^2/4Pe \approx Pe/4$, which is consistent with (2.6-159) and corresponds to a uniform mesh; $h_1 = h_2 = L/2$. Since $T(L/2) \approx 1$ for $Pe \gg 1$, we see that the largest error occurs on a uniform mesh, at least for this simple case. (But the result does generalize, when all $P_j \gg 1$.) On the other hand, if we place the node such that $P_2 = 1$, we get a much better result (and one which also generalizes); $T_1 = 1$ (the inlet temperature). While not perfect, the result is quite good, considering, for $Pe \gg 1$. That is, $P_2 = 1$ implies $h_2/L = 1/Pe$, and the exact solution at $x/L = 1 - h_2/L = 1 - 1/Pe$ is, from (2.6-146), $T(x/L) = (1 - e^{-2})/(1 - e^{-2Pe}) \approx 0.865$ for $Pe \gg 1$ (say $Pe > 10$). Placing the node at $\delta/L = 1/Pe$ from the hard OBC turns out to be a smart thing to do (GFEMIA) for any number of nodes in the mesh. In fact, it is easy to show from (2.6-160) that, for any $N$, setting $P_{N+1} = 1$ will always give $T_j = 1$, $j = 1, 2, \ldots, N$, no matter where the other nodes are placed and no matter how many of them are used! No wiggles, reasonable solution, and no more need for the pessimistic, uniform grid error analysis (or, more commonly, and less useful yet, error analysis in which '$h$' is considered to be the size of the largest element).
Figure 2.6-56 shows $T_1$ vs $h_1/L$ and the resulting two-element solution, $T(x)$, for $Pe = 50$ and three values of $h_1/L$: 0.2, 0.5, and 0.8. Clearly the uniform mesh has the largest error; also clear is that $h_1/L = 1 - 1/Pe = 0.98$ is an excellent choice.
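The node-placement argument can be tried out with a few lines of code. The following sketch solves the variable-mesh GFEM equations (2.6-160) with a plain tridiagonal (Thomas) elimination, assuming $T_0 = 1$ and $T_L = 0$. The solver, its name `gfem_variable`, and the sample meshes are illustrative assumptions; no pivoting is used, which is adequate only for small demonstrations like this one.

```python
def gfem_variable(xi, Pe, T0=1.0, TL=0.0):
    """Solve (2.6-160) on nodes xi = x/L (xi[0] = 0, xi[-1] = 1); returns all nodal T."""
    n_el = len(xi) - 1                       # number of elements, N + 1
    N = n_el - 1                             # number of interior nodes
    if N < 1:
        return [T0, TL]
    # P[j] = u h_j / (2 kappa) = Pe * h_j / L for element j (P[0] unused)
    P = [0.0] + [Pe * (xi[j] - xi[j - 1]) for j in range(1, n_el + 1)]
    # Row j:  a_j T_{j-1} + b_j T_j + c_j T_{j+1} = 0
    a = [-(1.0 + 1.0 / P[j]) for j in range(1, N + 1)]
    b = [1.0 / P[j + 1] + 1.0 / P[j] for j in range(1, N + 1)]
    c = [1.0 - 1.0 / P[j + 1] for j in range(1, N + 1)]
    d = [0.0] * N
    d[0] -= a[0] * T0                        # known inlet value moved to the RHS
    d[-1] -= c[-1] * TL                      # known outlet value moved to the RHS
    # Thomas algorithm: forward elimination, then back substitution
    for i in range(1, N):
        m = a[i] / b[i - 1]
        b[i] -= m * c[i - 1]
        d[i] -= m * d[i - 1]
    T = [0.0] * N
    T[-1] = d[-1] / b[-1]
    for i in range(N - 2, -1, -1):
        T[i] = (d[i] - c[i] * T[i + 1]) / b[i]
    return [T0] + T + [TL]


if __name__ == "__main__":
    Pe = 50.0
    for h1 in (0.2, 0.5, 0.8, 0.98):
        print(f"h1/L = {h1:4.2f}  T1 = {gfem_variable([0.0, h1, 1.0], Pe)[1]:8.4f}")
    print(gfem_variable([0.0, 0.3, 0.7, 0.98, 1.0], Pe))
```

For $Pe = 50$ the two-element results are $T_1 = 8.8$, $13$, $8.2$, and $1$ for $h_1/L = 0.2$, $0.5$, $0.8$, and $0.98$, and the five-node mesh with its last element sized so that $P_{N+1} = 1$ returns $T_j = 1$ at every interior node.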
Remarks:
(1) The above property was called 'disconnected' by Griffiths and Lorenz (1978) in their study of Petrov-Galerkin methods, because $T_j = T_0$ (the inlet value) for all $j$ is independent of the specified value of $T_L$.
(2) If the 'second-order-accurate,' variable-grid finite difference scheme is employed, then (2.6-160) changes to
$$\frac{1 - P_j}{P_{j+1}}(T_{j+1} - T_j) = \frac{1 + P_{j+1}}{P_j}(T_j - T_{j-1}), \tag{2.6-166}$$
which has the (peculiar) property that $P_N = 1$ (rather than $P_{N+1} = 1$) gives $T_j = 1$ for all $j$. Thus, $P_{N+1}$ could be 'large' and the boundary layer totally missed, with still $T_j = 1$ for all $j$. For a similar comparison of centered and upwind FDM's, with a similar conclusion as our own, see Ferziger and Peric (1996).
(3) Whereas a uniform mesh would require on the order of Pe node points to get a good result, a smart mesh could do nearly as well with only one or two nodes, a considerable saving for large Pe; e.g., $10^4$.
(4) The high cost (large error constant) of a uniform finite-difference grid for this
problem is also clearly shown in Table IV of Roache and Knupp (1993).
Our bottom line(s) for the Dirichlet problem in 1D are the following:
1. Do not bother with smart upwind methods because they do not generalize.
2. Never use a uniform mesh.
3. Do not place the first node in from the outflow farther than $\delta/L = 1/Pe$ from the exit. If you do, GFEM will wiggle.
4. General advice (which later needs to be, and will be, extended to multi-dimensions): place a node at $\delta/L = 1/Pe$ away from the 'hard' boundary, place another halfway between it and the boundary, and, unless there is a source term in the steady, 1D, advection-diffusion equation, you need no more nodes; and no supercomputer, no workstation, no desktop computer, no calculator, no slide rule: just pencil and paper. GFEMIA can really work.
5. For a better-yet mesh design, in both 1D and multi-dimensions, see Hegarty et al. (1995) and the new book by Miller et al. (1996).

Fig. 2.6-56 Locus of nodal temperature and T(x) for Pe = 50; exact solution shown dashed. (Horizontal axis: $h_1/L$ and $x/L$.)
Somewhat along these lines is a sample 'pure advection with source term' test problem,
originated by Leonard (1979), in which Brooks and Hughes (1980, 1982) present what we
believe is a somewhat incomplete picture—which we complete below. Specifically, they
neglected to mention GFEM's performance on the problem, perhaps leaving the reader
with the impression that it is not worth considering, because it is so 'wiggle-prone.' In
Figure 2.6-57, we correct this impression by showing the GFEM solution (it is 1/24th larger than the exact solution at the four nodes marked 'X'; it is exact at the others) as well as those shown by Brooks and Hughes for the problem $uT_x = S(x)$ on $0 < x < 15$ with $T(0) = 0$, where $S(x) = 1 - x/4$ for $0 \le x \le 6$, $S(x) = -2 + x/4$ for $6 \le x \le 8$ and $S(x) = 0$ for $x > 8$; here, $u = \Delta x = 1$.

Fig. 2.6-57 1D steady advection with a source term. (Legend: solid line, exact (nodal interpolate); □, upwind/classical; ○, upwind/Petrov-Galerkin; ×, GFEM.)
d. Advection-diffusion with Dirichlet/Neumann BC's
This case applies when the Dirichlet BC at the outlet, $x = L$, (2.6-115), is replaced by a Neumann BC, sometimes called a 'soft' BC,
$$\kappa\,\partial T/\partial x = q \quad \text{at } x = L. \tag{2.6-167}$$
(In the next section we will consider the Neumann BC at both ends.) Except for pure heat conduction problems (see, for example, Reddy and Gartling, 1994), there is really only one important case worth detailing here: the use of the homogeneous ($q = 0$) Neumann BC as an OBC. This simple change can go a long way toward relieving/precluding wiggle problems near $x = L$. Reason? Dirichlet imposes a large gradient upon the solution, and Neumann imposes a small one (zero). Large gradients are wiggle makers, unless they are 'smashed' by upwinding.
But let us start by summarizing the associated eigenproblem results, which will suggest the alleviation of difficulties relative to the previous case (usually). Solving (2.6-116) with the BC $\Phi = 0$ at $x = L$ replaced by $\Phi' = 0$, yields
$$\lambda_n = (\gamma_n^2 + Pe^2)\kappa/L^2, \tag{2.6-168}$$
where $\gamma_n$ are the roots of the transcendental equation
$$Pe \tan\gamma + \gamma = 0, \tag{2.6-169}$$
and
$$\Phi_n(x) = e^{Pe\,x/L} \sin(\gamma_n x/L), \tag{2.6-170}$$
where $\gamma_n$ lies between $(n - \frac{1}{2})\pi$ and $n\pi$. Again, the advection effect seems to want to cause 'problems' near $x = L$, although not nearly as badly as before because $\sin\gamma_n \ne 0$; in fact, the eigenfunction attains an extremum at $x = L$ rather than crashing back down to zero there, as in the Dirichlet OBC situation. This key difference helps to explain its relative success as an OBC; note though that for $Pe \gg 1$, $\gamma_n \sim n\pi[1 + O(1/Pe)]$ from which, for low modes at least, (2.6-170) tends to be a return to the Dirichlet case, a point to bear in mind when we see how an FDM approximation behaves. Figure 2.6-58 shows the first several modes for $Pe = 10$, for which $\gamma_1 = 2.86$, $\gamma_2 = 5.76$, and $\gamma_3 = 8.71$.

Fig. 2.6-58 Dirichlet-Neumann eigenfunctions for Pe = 10.
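The roots quoted for $Pe = 10$ are easy to reproduce. The sketch below is an illustration rather than the authors' procedure: it brackets each root of (2.6-169) in $((n - \frac{1}{2})\pi, n\pi)$ and bisects, using the equivalent form $Pe \sin\gamma + \gamma\cos\gamma = 0$ to avoid the poles of the tangent. The function names and default tolerances are assumptions.

```python
import math


def neumann_roots(Pe, n_roots=3, tol=1e-12):
    """Roots gamma_n of Pe tan(gamma) + gamma = 0, one per ((n - 1/2) pi, n pi)."""
    def f(g):
        # (2.6-169) multiplied by cos(gamma); cos is nonzero inside each bracket
        return Pe * math.sin(g) + g * math.cos(g)

    roots = []
    for n in range(1, n_roots + 1):
        lo, hi = (n - 0.5) * math.pi, n * math.pi
        f_lo = f(lo)
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            f_mid = f(mid)
            if (f_mid > 0.0) == (f_lo > 0.0):
                lo, f_lo = mid, f_mid        # root lies in the upper half
            else:
                hi = mid                     # root lies in the lower half
        roots.append(0.5 * (lo + hi))
    return roots


def eigenvalue(gamma, Pe, kappa=1.0, L=1.0):
    """lambda_n from (2.6-168)."""
    return (gamma**2 + Pe**2) * kappa / L**2


if __name__ == "__main__":
    for n, g in enumerate(neumann_roots(10.0), start=1):
        print(f"n={n}  gamma={g:.4f}  lambda={eigenvalue(g, 10.0):.3f}")
```

Because each root sits just below $n\pi$ when Pe is large, $\sin\gamma_n$ is small but nonzero, which is the near-return to the Dirichlet case noted above.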
The discrete case is interesting, and paradoxical, in at least two ways (via linear
elements and, for simplicity, lumped mass): (i) a common FDM version of the Neumann
BC ('image point') yields to analysis but gives bad results, and (ii) the GFEM version
of the same BC (via an NBC) does not yield to analysis but gives good results—where
here bad/good means big/small wiggles when an otherwise well-simulated waveform tries
to leave the grid. We shall briefly summarize this awkward situation, and then appeal to
the recent revelations of L.N. ('Nick') Trefethen and colleagues to help 'excuse' our
incompetence.
The image point method of approximating $\partial T/\partial x = 0$ has already been introduced, in Section 2.4.1 on OBC's; namely, from (2.4-28) with $H = S = q = f_N = 0$, we have (for a uniform mesh)
$$h\dot{T}_N + 2\kappa(T_N - T_{N-1})/h = 0 \tag{2.6-171}$$
as the ODE/OBC for the last node, in which the absence of any advection effect should be noted. Recall that the 'image point,' $T_{N+1}$, was eliminated by approximating $\partial T/\partial x = 0$ by $T_{N+1} = T_{N-1}$.
The FEM version of this OBC is also a special case of the $N$-th node's equation of Section 2.4.1; i.e., set $S = H = q = 0$ in (2.4-31) to give, for a uniform mesh,
$$h\dot{T}_N + u(T_N - T_{N-1}) + 2\kappa(T_N - T_{N-1})/h = 0, \tag{2.6-172}$$
a seemingly small perturbation from (2.6-171), since both yield $\partial T/\partial x \to 0$ for $h \to 0$. But the fact is that this 'perturbation' is almost unbelievably powerful in its effect: even though all other equations in the two problems are identical, with only the term $u(T_N - T_{N-1})$ in the last equation being different, the difference in the 'response' of the two sets of equations, for large Pe, is profound. Before expounding these differences,
however, let us briefly return to the discussion near the end of Section 2.4.1, wherein
we suggested that only foolish or naive modelers would actually invoke the image point
method. That this is not necessarily the case is seen in at least the two following FDM
papers: Price et al. (1966) needed to invoke the 'theory of oscillatory matrices' to explain
their results, and Fisk (1982) showed that even the Keller box scheme is not immune to
wiggles. Both were concerned about the resulting 'oscillations' in their solutions; both
used the 'conventional' image point OBC approximation to dT/dx = 0.
Since in this case pictures speak much louder than words, we show comparative results before trying to understand/explain them. The IC in Figure 2.6-59 is a Gaussian wave on a 50-element mesh with $u = 1$, $\sigma/h = 5$ (fairly easy even for LM, at least for small $t$), $Pe = u\sigma/\kappa = 1000$ ($P = uh/2\kappa = 100$), and $\Delta t$ was small enough to consider the ODE integration as exact. The solution is on $0 \le x \le 1$, but we also show the exact solution, (2.6-135), on $0 \le x \le 2$, as a dashed line. The image point BC is obviously quite disruptive, and the homogeneous NBC is obviously quite the opposite, showing only a modicum of small amplitude $2\Delta x$ waves that appear to trail the exact solution as it leaves the mesh. Both these little wiggles and the big ones from the FDM case will actually move upstream at the group velocity of about $-1$.
The analysis of this behavior, in terms of eigenvectors, is not easy. Although the results for the FDM case are well known (see, for example, Hindmarsh et al., 1984), those for the FEM case (even with LM) are not. The analog of (2.6-168) through (2.6-170) for the FDM case is given by
$$\lambda_m = \frac{2\kappa}{h^2}\left(1 - \sqrt{1 - P^2}\,\cos\psi_m\right) \tag{2.6-173}$$
and $\sqrt{1 - P^2}$ is replaced by $i\sqrt{P^2 - 1}$ for $P > 1$, where $\psi_m$ are (for both cases) the $N$ roots ($N = L/h$) between 0 and $\pi$ of
$$P \tan N\psi_m + \tan\psi_m = 0, \tag{2.6-174}$$
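with eigenvector
$$T_j^{(m)} = \begin{cases} \left(\dfrac{1+P}{1-P}\right)^{j/2} \sin j\psi_m & \text{for } P < 1, \\[2ex] (-1)^{j/2} \left(\dfrac{P+1}{P-1}\right)^{j/2} \sin j\psi_m & \text{for } P > 1. \end{cases} \tag{2.6-175}$$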
For the finite element case, the (small?) change in the last row of the matrix changes the characteristic equation from (2.6-174) to
$$P \tan N\psi_m + \sqrt{1 - P^2}\,\sin\psi_m = 0, \tag{2.6-176}$$
with $\lambda_m$ still given by (2.6-173) and $T_j^{(m)}$ still given by (2.6-175). The difference here is that $P > 1$ converts (2.6-176) to a complex equation with complex roots; the roots of (2.6-174) remain real for all $P$. While the details of these subtle differences are not yet fully appreciated, one limiting case is (with thanks to D.F. Griffiths): as $P \to \infty$, (2.6-174) $\to \tan N\psi_m = 0$ or $\psi_m = m\pi/N$ and the Dirichlet OBC case is recovered, an observation that obviously helps to explain FDM's wiggles. For the FEM case, $P \to \infty$ does not recover Dirichlet; rather, (2.6-176) yields
$$\tan N\psi_m + i \sin\psi_m = 0, \tag{2.6-177}$$
producing a complex set of roots that somehow recognizes that, for $\kappa \to 0$, the OBC should vanish because the AD equation becomes the pure advection equation, $\partial T/\partial t + u\,\partial T/\partial x = 0$. The details of this 'recognition' are, however, still somewhat obscure. All we know for sure is that, if an eigenvector expansion approach is pursued, the wiggle-free FEM result and the wiggly FDM result are both complex linear combinations of complex eigenvectors, only one set of which conspires to make wiggles. The FEM is often smarter than its users.

Fig. 2.6-59 Comparison of FDM image point and FEM NBC as OBC's: Pe = 1000, P = 100, lumped linears. (Panels show T versus x for FDM and FEM at successive times, including t = 0.25 and t = 0.50.)
To conclude this section, we return to the relevant issues raised by Trefethen in this regard, as alluded to earlier. For our AD equation on a finite domain, it is a fact that only the periodic BC case generates an operator (in the continuum) and a matrix (in the discrete approximation) that is 'normal'; a normal operator commutes with its adjoint, and a normal matrix does the same (the adjoint being the complex conjugate of the transpose matrix). For either Dirichlet/Dirichlet or Dirichlet/Neumann BC's, $AA^T \ne A^TA$; our matrices are non-normal, a measure of the non-normality being the condition number of the transformation matrix that converts our AD matrix to a symmetric matrix, which condition number increases with Pe. Such matrix transformations are discussed in Fletcher and Griffiths, 1980; also, for the record, symmetric and skew-symmetric matrices are normal. Reddy and Trefethen (1994) point out that the condition number of the continuous AD operator, as well as that of its basis (as we have seen), is $O(e^{Pe})$. The key points are these: (i) the eigenvectors corresponding to non-normal matrices are not orthogonal (indeed, for $P \gg 1$, they can be nearly parallel), and (ii) an expansion in terms of these guys (or even the eigenfunctions in the continuous case) would encounter insuperable numerical difficulties (cf. $e^{\pm Pe}$ for $Pe = 10, 100, 1000, \ldots$). Let us end the discussion with some cogent words and advice from Trefethen and friends:
1. 'This is a reflection of the fact that these operators* are non-normal. This means that
they cannot be unitarily diagonalized, or to put it another way, their eigenfunctions are
not orthogonal. For the convection-diffusion problem, the degree of non-normality grows
exponentially with Pe. It follows that any attempt to make quantitative estimates of the
behavior of L (the AD operator) by means of its eigenfunctions or eigenvalues is likely
to lead to exponentially large constants. Such estimates are of little use when Pe is large, and have no content at all that is uniformly valid as $Pe \to \infty$.' (Reddy and Trefethen, 1994).
2. 'If it is far from normal, however, the change to eigenvector coordinates may involve
an extreme distortion of the state space. In the new coordinates, the physics of the system
may become strangely complicated. A typical state of the system may be a superposition
of huge eigenfunction components that nearly cancel, and the evolution over time intervals
of scientific interest may be determined by how this pattern of cancellation evolves, rather
than by the growth or decay of the individual eigenfunctions. In other words, there may
be no good scientific reason for attempting to analyze the problem in terms of eigenvalues
or eigenvectors.'—Trefethen (1995).
3. 'Eigenvalues and eigenvectors are an imperfect tool for analyzing non-normal matrices and operators, a tool that has often been abused. Physically, it is not always the eigenmodes that dominate what one observes in a highly non-normal system. Mathematically, eigenanalysis is not always an efficient means to the end that really matters: understanding behavior.' (Trefethen, 1991).
* Not just matrices; i.e., it is true for the continuum too.
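The normal/non-normal distinction is simple to verify on a small matrix. The sketch below builds the lumped-mass, centered-difference AD matrix on a uniform mesh (scaled by $h^2/\kappa$, so its rows are $-(1+P),\ 2,\ -(1-P)$) for periodic and for Dirichlet/Dirichlet BC's, and measures the largest entry of $AA^T - A^TA$. The scaling, the matrix size, and the function names are illustrative assumptions; the Dirichlet/Neumann matrix is not constructed here.

```python
def ad_matrix(n, P, periodic=False):
    """n x n centered AD matrix, scaled by h^2/kappa, on a uniform mesh (use n >= 3)."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0
        for k, v in ((i - 1, -(1.0 + P)), (i + 1, -(1.0 - P))):
            if periodic:
                A[i][k % n] += v             # wrap around the ends
            elif 0 <= k < n:
                A[i][k] = v                  # Dirichlet: boundary values eliminated
    return A


def normality_defect(A):
    """Largest entry of |A A^T - A^T A|; zero for a normal real matrix."""
    n = len(A)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            aat = sum(A[i][k] * A[j][k] for k in range(n))
            ata = sum(A[k][i] * A[k][j] for k in range(n))
            worst = max(worst, abs(aat - ata))
    return worst


if __name__ == "__main__":
    for P in (0.5, 10.0, 100.0):
        print(f"P={P:6.1f}  periodic: {normality_defect(ad_matrix(6, P, True)):.3e}"
              f"  Dirichlet: {normality_defect(ad_matrix(6, P)):.3e}")
```

For the Dirichlet/Dirichlet matrix the defect appears only in the two corner entries and equals $|(1-P)^2 - (1+P)^2| = 4P$ in this scaling, so it grows with the grid Peclet number; the periodic (circulant) matrix is normal for every $P$.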
With this we abandon the analysis and return to the numerical results, which tell us
that the homogeneous NBC is an excellent OBC. They also provide guidance to the FDM
community.
So, we have found a desirable OBC for the 1D advection-diffusion equation, namely, $\kappa\,\partial T/\partial x = 0$. This result generalizes to multi-dimensions via $\kappa\,\mathbf{n} \cdot \nabla T = 0$, at least when exit 'planes' are perpendicular to the coordinate axes. It also nicely 'goes away' as $\kappa \to 0$, and we reach the proper pure advection limit for which there is no BC at the exit. And this is as far as we need to take the Dirichlet/Neumann case, except to mention: see Smith (1980) for more results on this class of problems.
e. Advection-diffusion with Neumann BC's at both ends
We include this case only because it is mathematically interesting—if physically lacking
in interpretation. And we mention at the outset that it is only legitimate for non-zero
diffusion coefficients—pure advection requires a Dirichlet BC at the inlet (and no outlet
BC). Here we will restrict attention to the continuous problem, because this set of BC's
seems to have little practical utility.
The solution of (2.6-116) with $\Phi' = 0$ at both ends is (again, from Hindmarsh et al., 1984)
$$\lambda_0 = 0, \quad \lambda_n = (n^2\pi^2 + Pe^2)\kappa/L^2, \tag{2.6-178}$$
$$\Phi_0 = 1, \quad \Phi_n(x) = e^{Pe\,x/L}\left[\cos\frac{n\pi x}{L} - \frac{Pe}{n\pi}\sin\frac{n\pi x}{L}\right], \tag{2.6-179}$$
and we refer to the original reference for the discrete analogs via lumped linears, and the 'image point' BC.
The interesting aspect of this set of BC's is what it shows us if we pose an IBVP using them; namely,
$$\partial T/\partial t + u\,\partial T/\partial x = \kappa\,\partial^2 T/\partial x^2 \quad \text{on } 0 < x < L, \tag{2.6-180}$$
$$\partial T/\partial x = 0 \quad \text{at } x = 0, L, \tag{2.6-181}$$
and
$$T(x, 0) = T_0(x). \tag{2.6-182}$$
An eigenfunction expansion solution of this IBVP reveals the following steady-state solution, from the constant eigenfunction that does not decay:
$$T(x, \infty) = \int_0^L T_0(x)\,e^{-2Pe\,x/L}\,dx \Big/ \int_0^L e^{-2Pe\,x/L}\,dx = \frac{2Pe}{L}\int_0^L T_0(x)\,e^{-2Pe\,x/L}\,dx \Big/ (1 - e^{-2Pe}), \tag{2.6-183}$$
a constant temperature that approximates $T_0(0)$ for $Pe \gg 1$: the zero derivative at the inlet forces inflow to occur at close to $T_0(0)$ for all time.
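To see how quickly the steady constant of (2.6-183) approaches the inlet value, the weighted average can be evaluated numerically. The sketch below is an illustration under stated assumptions: $L = 1$, a composite Simpson rule with an arbitrary number of intervals, and a sample initial condition $T_0(x) = 1 + x$ chosen only because its integrals are available in closed form. None of the names come from the text.

```python
import math


def steady_constant(T0_func, Pe, n=2000):
    """T(x, infinity) from (2.6-183) with L = 1, by composite Simpson (n even)."""
    h = 1.0 / n
    num = 0.0
    den = 0.0
    for i in range(n + 1):
        x = i * h
        w = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)   # Simpson weights
        e = math.exp(-2.0 * Pe * x)
        num += w * T0_func(x) * e
        den += w * e
    return num / den                  # the common factor h/3 cancels in the ratio


def steady_constant_linear_ic(Pe):
    """Closed form of (2.6-183) for the sample IC T0(x) = 1 + x with L = 1."""
    a = 2.0 * Pe
    return 1.0 + (1.0 - math.exp(-a) * (1.0 + a)) / (a * (1.0 - math.exp(-a)))


if __name__ == "__main__":
    for Pe in (0.5, 5.0, 50.0):
        print(f"Pe={Pe:5.1f}  quadrature={steady_constant(lambda x: 1.0 + x, Pe):.6f}"
              f"  closed form={steady_constant_linear_ic(Pe):.6f}")
```

For this sample initial condition the constant is about $1 + 1/(2Pe)$ once Pe is large, i.e., within roughly one percent of $T_0(0) = 1$ at $Pe = 50$.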
f. Advection-diffusion with Dirichlet/Robin BC's
The last case we visit returns to Dirichlet at the inlet but employs a BC of the third kind at the exit; namely,
$$\kappa\,\partial T/\partial x + HT = 0 \quad \text{at } x = L, \tag{2.6-184}$$
where $H > 0$ corresponds to a 'Newton's law of cooling' (convective) BC, and $H < 0$ can cause problems (e.g., exponential growth in $x$). In fact, the special case of $H = -u$ gives the zero total flux BC,
$$\kappa\,\partial T/\partial x - uT = 0, \tag{2.6-185}$$
which is interesting in its own right, although perhaps more in the field of mass transfer than heat transfer.
In neither case will we present much in the way of results, however, and we also omit/neglect discussing the associated eigenproblems; see Hindmarsh et al. (1984) for a discussion of these. What we will do is refer the reader to some literature for the $H > 0$ case and offer a new challenge for the case $H = -u$; i.e., we conclude this section by returning to the zero total flux case ($H = -u$) and pose the following IBVP:
$$\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} = \kappa\frac{\partial^2 T}{\partial x^2} \quad \text{on } 0 < x < L, \tag{2.6-186}$$
$$T(0, t) = T_0, \tag{2.6-187}$$
$$uT - \kappa\frac{\partial T}{\partial x} = 0 \quad \text{at } x = L, \tag{2.6-188}$$
with
$$T(x, 0) = 0, \tag{2.6-189}$$
which describes the advection-diffusion of a 'front.' For large $Pe = uL/\kappa$, advection will dominate for $t \le O(L/u)$, and the front will simply translate at speed $u$. But when the front slams into the wall at $x = L$, advection will no longer dominate, owing to the no-flux OBC. In fact, far from it; even though the flow would like to 'carry $T$' with itself right through the exit 'plane,' back-diffusion precludes it: no $T$ can leave at $x = L$. The resulting battle between advection and diffusion is interesting to say the least; it is, in fact, the toughest, 1D, linear advection-diffusion problem that we have encountered. The one thing we can say for sure is that at sufficiently large $t$, the solution will approach the following steady state
$$T(x) = T_0\,e^{Pe\,x/L}, \tag{2.6-190}$$
giving $T(L) \approx 22{,}000\,T_0$ for $Pe = 10$ and $T(L) \approx 2.7 \times 10^{43}\,T_0$ for $Pe = 100$. Clearly, the advection-dominated case is very difficult, and we 'pause' to note that we are well aware of the fact that this is a problem in mathematics, not in physics. But we nevertheless offer it as a 'next generation' hard problem designed to break codes, and computers; i.e., while a GFEMIA analyst would probably know how (or soon learn how from the wiggle signal) to build a decent mesh, the numbers may become too difficult to cope/compute with.
Another challenge in this problem is time integration itself, as there are at least two time scales, one very short ($\tau_A = L/u$) and one very long, $\tau_{ss} = Le^{Pe}/(u\,Pe) = \tau_A e^{Pe}/Pe$, whose derivation came about as follows: (i) the total 'energy' in $(0, L)$ at a steady state is $\int_0^L T(x)\,dx \approx LT_0 e^{Pe}/Pe$, (ii) the rate of energy addition by advection at $x = 0$ is $uT_0$, and (iii) a lower bound on the time to reach a steady state is obtained by neglecting back-diffusion and just estimating how long it takes to advect in the total energy; the result is that given above, $\tau_{ss}$. The true time scale for reaching the steady state is even longer!
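The numbers involved are worth seeing once. The sketch below simply evaluates the steady outlet value from (2.6-190), the steady 'energy' in $(0, L)$, and the lower-bound time scale described in steps (i) to (iii), with $Pe = uL/\kappa$ as in this subsection. The function names and the default values $L = u = T_0 = 1$ are assumptions made for illustration.

```python
import math


def steady_outlet_ratio(Pe):
    """T(L)/T_0 at steady state from (2.6-190)."""
    return math.exp(Pe)


def steady_energy(Pe, L=1.0, T0=1.0):
    """Integral of T_0 e^{Pe x/L} over (0, L); close to L T_0 e^{Pe}/Pe for large Pe."""
    return L * T0 * math.expm1(Pe) / Pe


def tau_ss_lower_bound(Pe, L=1.0, u=1.0):
    """Lower bound L e^{Pe} / (u Pe) on the time to reach the steady state."""
    return L * math.exp(Pe) / (u * Pe)


if __name__ == "__main__":
    for Pe in (10.0, 100.0):
        print(f"Pe={Pe:6.1f}  T(L)/T0={steady_outlet_ratio(Pe):.4e}"
              f"  energy={steady_energy(Pe):.4e}  tau_ss >= {tau_ss_lower_bound(Pe):.4e}")
```

With $L = u = 1$, the short time scale is 1 while the lower bound on the long one is about 2200 for $Pe = 10$ and about $2.7 \times 10^{41}$ for $Pe = 100$, which illustrates why both the arithmetic and the time integration become difficult.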
Shown next is one solution (for To = 1) of this no-flux problem, obtained in a
semi-analytic way—via Laplacian transforms [courtesy of Dr. Novy, then (1989) a
postdoctoral student at the University of Minnesota]—but only for small Pe; i.e., the software
subroutine INLAP of the IMSL V.IO library on the Cray 2 was not able to obtain a
reliable solution (inverse Laplace transform) for Pe much above 10 owing to insufficient
resolving capability (machine round-off)- The solution for small time (t ^ 1) is shown in
Figure 2.6-60 for three values of Pe in which the 'front' is clearly not sharp; i.e., we thus
far have a nearly diffusion-dominated flow But the medium and large t solution, shown
in Figure 2.6-61 for Pe = 10, shows that even this amount of advection does indeed cause
Fig. 2.6-60 Advection-diffusion with a no-flux outflow boundary condition. [Plot of T (0 to 3.0) versus x (0 to 1.0) at small times; panel (c): Pe = 10.]
Fig. 2.6-61 As in Figure 2.6-60 except longer times and Pe = 10 only. [Plot with vertical axis from 0 to 25000.]
problems as the outlet value slowly climbs to $e^{10} \approx 22{,}000$ on the time scale whose lower
bound of $\tau_{ss} = 2200$ is remarkably accurate; in fact, the following equation describes very
accurately the temperature at x = L: $T(L, t) = T_0 e^{Pe}(1 - e^{-Pe\,t/e^{Pe}})$ for $t > O(u/L)$, for
reasons that we do not yet understand.
Returning briefly to the H > 0 case, we first remark that a large H ($Hh/K \gg 1$) will
cause a previously 'easy' simulation to become hard, unless proper mesh refinement
accompanies large H. In particular, it is obvious that $H \to \infty$ returns us to the 'hard'
(Dirichlet) OBC, complete with wiggles, unless the grid Peclet number is less than O(1)
at the exit.
The only literature on Robin BC's that we are aware of has come from the research
group at the University of Minnesota's Department of Chemical Engineering and
Materials Science, under the leadership of L.E. Scriven and H.T. Davis. A sampling of this
literature is the following: Higgins (1982), Bixler and Scriven (1987), Christodoulou and
Scriven (1989), and Novy et al. (1990, 1991). Most, if not all, of these results were
obtained via asymptotic analysis of downstream conditions. Since many (most) of the
above references are solving the NS equations, we shall revisit some of them in the next
chapter (Section 3.8.1).
g. The advective-diffusive time scale
When the AD operator is normal (see Section 2.6.2d), the time scales that appear in any
analytical solutions clearly show both advective and diffusive time scales. Two examples:
1. The dimensional form of the Gaussian solution on the $\infty$-span is, from (2.6-135),
$$T(x, t) = \frac{1}{\sqrt{1 + 2Kt/\sigma^2}}\,\exp\left[-\frac{(x - x_0 - ut)^2}{2\sigma^2\,(1 + 2Kt/\sigma^2)}\right],$$ (2.6-191)
in which $\tau_A = x_0/u$ and $\tau_D = \sigma^2/2K$ are clearly identifiable as advective and diffusive
time scales, respectively.
2. For the case of periodic BC's, (2.6-108) also clearly shows both time scales: $\tau_A = 1/ku = \lambda/2\pi u$
and $\tau_D = 1/k^2K = \lambda^2/4\pi^2K$, where $\lambda$ is the wavelength.
When the AD operator is non-normal, some of the above 'physics' is lost; four more
examples:
1. The Dirichlet BC case of Section 2.6.2c has, from (2.6-118), $\tau_{D_n} = L^2/n^2\pi^2K$ and
$\tau_{AD} = L^2/K\,Pe^2 = 4K/u^2$, wherein only $\tau_D$ is 'physical'; see, for example, (2.6-123).
2. The Dirichlet/Neumann case of Section 2.6.2d has, from (2.6-168), $\tau_{D_n} = L^2/\gamma_n^2K$ and
$\tau_{AD} = 4K/u^2$, similar to the pure Dirichlet case.
3. The Neumann/Neumann case of Section 2.6.2e is the same as all Dirichlet.
4. The Dirichlet/Robin case of Section 2.6.2f: while not stated there, the results are the
'same' as the Dirichlet/Neumann case; only the value of $\gamma_n$ differs, slightly; it comes
from $(HL/K + Pe)\tan\gamma + \gamma = 0$ rather than from (2.6-169).
In all four cases, each mode (n) decays in place, at a rate given by the combined
time constant, $\tau_n = (1/\tau_{D_n} + 1/\tau_{AD})^{-1}$, so that it seems to be the case that the non-physical
(and mode-independent) advective-diffusive time constant is closely related
to the apparently non-physical response that does not obviously 'display' advection.
Also, if we define the physical advective time scale via $\tau_A = \lambda_n/u = L/nu$, we obtain
$\tau_{AD}/\tau_A = 4nK/uL \sim n/Pe$ and $\tau_{AD}/\tau_D = (2\gamma_n K/uL)^2 \sim n^2/Pe^2$, neither of which 'make
sense' vis-a-vis $\tau_D/\tau_A = uL/nK = Pe^{(n)}$, which of course does make sense; e.g., if
$Pe = uL/K \gg 1$, then the low modes are advection-dominated ($\tau_A \ll \tau_D$), whereas the
very high (short wave length) modes ($n \gg Pe$) are diffusion-dominated.
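The combined time constant can be illustrated with a short sketch for the all-Dirichlet case. It assumes $\tau_{D_n} = L^2/n^2\pi^2K$ and $\tau_{AD} = 4K/u^2$ as quoted above, and cross-checks them against the decay rate $(n^2\pi^2 + Pe^2)K/L^2$ with $Pe = uL/2K$, i.e., the one-directional form of (2.6-208); the function name and parameter values are hypothetical.

```python
import math

def combined_time_constant(n, u, K, L):
    """tau_n = 1/(1/tau_Dn + 1/tau_AD) for the all-Dirichlet case."""
    tau_dn = L**2 / (n**2 * math.pi**2 * K)   # diffusive time scale of mode n
    tau_ad = 4.0 * K / u**2                   # mode-independent advective-diffusive scale
    return 1.0 / (1.0 / tau_dn + 1.0 / tau_ad)

u, K, L = 1.0, 0.01, 1.0
pe = u * L / (2.0 * K)   # assumed convention, chosen so that tau_AD = L^2/(K*Pe^2)
for n in (1, 10, 100):
    tau = combined_time_constant(n, u, K, L)
    rate = (n**2 * math.pi**2 + pe**2) * K / L**2
    print(n, tau, 1.0 / rate)
```

With these illustrative numbers the low modes decay at essentially the mode-independent rate $1/\tau_{AD}$, which is the non-physical behavior discussed in the text.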
All of the above observations lend more credence to the admonitions put forth by
Trefethen at the end of Section 2.6.2d; namely, it is not a good idea to put much stock
in eigenproblems that come from non-normal operators.
h. Final remarks on 1D advection-diffusion
We have seen what may be called 'easy' problems and 'hard' problems in the
above discussion, where some 'hard' problems may be contrived or may occur
inadvertently (hard OBC's, for example), where here 'easy' means 'no wiggles/very
small wiggles/smooth solution' and 'hard' means 'wiggles that are large enough to be
distasteful.'
What have we learned about wiggles? Basically this: there are two major causes of
wiggles, the first of which is sometimes only revealed via honest GFEM (i.e., consistent
mass): (i) poorly resolved or rough IC's and (ii) hard OBC's, wherein the following
definitions apply: a poorly resolved IC is one that, while smooth enough ($C^0$ at least), is
attempted on a too coarse mesh [e.g., a Gaussian ($C^\infty$) with $h \gtrsim O(\sigma)$] and a rough IC is
typically a $C^{-1}$ function (e.g., a step function, or any discontinuous function). A caveat
is needed in the second (rough) case, however: if $Pe \ll 1$ (which of course $\Rightarrow P \ll 1$)
and the mass is lumped (where possible), the wiggles generated by the consistent mass
matrix of GFEM ($L^2$-best fit) are suppressed, and a false sense of security may be thereby
engendered. A smooth IC that is well resolved is called 'easy.' By hard OBC's we mean
Dirichlet (usually) or Robin with a large H, and by easy OBC's we mean homogeneous
Neumann (usually, and only when properly implemented via NBC's) or periodic (that
rare but wonderful case). Finally, the hard OBC case applies equally well to steady-state
simulations.
Another, perhaps less common wiggle-maker is a smooth IC (e.g., constant) but a
rough source term; e.g., a discontinuity in a 'heat source' will cause CM wiggles in the
initial 'acceleration.'
The remainder of this wiggle discussion precludes the (always) hard IC cases and
examines the remaining possibilities with easy IC's. Here the results are first presented in
the form of a table (2.6-2), which we do for large N ($h \ll L$), the only 'sensible' situation,
then via some remarks. (Recall that $P = uh/2K$ and $Pe = uL/2K = P(L/h) = NP$.)
Table 2.6-2 Some good and bad simulation results (Easy IC's and $N \gg 1$).
              Easy OBC    Hard OBC
$P \ll 1$     Good        Good
$P \gg 1$     Good        Bad
Remarks:
(1) All of the above results are based on a uniform mesh.
(2) The hard OBC case can be made easy simply by employing an appropriate fine
mesh near the outlet (i.e., $h_N/L < 1/Pe$ and smoothly graded from the outlet).
(3) Since $Pe \ll 1 \Rightarrow P \ll 1$, a small global Peclet number is always easy.
(4) Clearly $P \gg 1 \Rightarrow Pe \gg 1$; locally advection-dominated flows are always globally so.
(5) If a non-uniform mesh is employed, even easy IC's and easy OBC's can lead to
a 'bad' result if, downwind of the IC (presumed to be of compact support) for an
advection-dominated flow, the grid becomes sufficiently coarse that the local Peclet
number, $P_j = uh_j/2K$, becomes large compared with unity. For advection-dominated
flow, the entire mesh must be 'sufficiently fine,' and is permitted to coarsen only
gradually in the flow direction, or else wiggles will occur.
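Remark (5) suggests a simple mesh check before running a simulation. The sketch below computes the local grid Peclet number $P_j = uh_j/2K$ element by element on a 1D mesh and lists the elements that exceed a threshold. The function names, the threshold `limit=1.0`, and the sample mesh are all assumptions made for illustration; the text only says that $P_j$ should not become 'large compared with unity.'

```python
def local_peclet_numbers(nodes, u, K):
    """P_j = u*h_j/(2K) for each element of a 1D mesh given by sorted node coordinates."""
    return [u * (b - a) / (2.0 * K) for a, b in zip(nodes, nodes[1:])]

def flag_coarse_elements(nodes, u, K, limit=1.0):
    """Indices of elements whose local Peclet number exceeds `limit` (hypothetical threshold)."""
    return [j for j, p in enumerate(local_peclet_numbers(nodes, u, K)) if p > limit]

# A mesh that coarsens geometrically in the flow direction (growth ratio 1.5, illustrative only).
nodes = [0.0]
h = 0.01
while nodes[-1] < 1.0:
    nodes.append(nodes[-1] + h)
    h *= 1.5

print(flag_coarse_elements(nodes, u=1.0, K=0.05))
```

Any flagged element downwind of the initial condition is a candidate location for wiggles and a candidate for refinement.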
Next, we present another brief discussion on the subject of 'advection-dominated'
flows for $N \gg 1$. Does $P \gg 1 \Rightarrow$ advection-dominated? It surely makes Pe very very
large. Is $Pe \ll 1$ diffusion-dominated? It surely makes P very very small. The answer to
both questions is, 'Yes, usually.' If $P \gg 1$ and we have a waveform (IC) characterized
by a length scale l, say, then $P_l = ul/2K = P \cdot l/h$ is a quite reasonable definition of an
appropriate (to this problem) Peclet number. Since it would be silly (usually) not to take
$h \ll l$ for good resolution, we have that $P \gg 1 \Rightarrow P_l \gg 1$; advection-dominated. The
other case, though, is not so easy, partly because what $P \ll 1$ really means is that we
have good resolution. For example, consider the case $P = uh/2K \ll 1 \ll ul/2K = P_l$; this
very-well-resolved advection-dominated situation would appear to be diffusion-dominated
if viewed solely from the 'viewpoint' that $P \ll 1$. Finally, let us suggest that advection-dominated
means that D [$= e^{-4Kt/l^2}$, where l is the characteristic length of the waveform in
question or of the object being 'flowed past'] should not be less than 0.9 when advected
a distance l. This gives $D = e^{-4K/ul} = e^{-2/P_l} > 0.9$ or $P_l \gtrsim 20$. (If D > 0.99 is more to
your liking, then you will need $P_l \gtrsim 200$.) If l/h = 100, then we have P = 0.2 for this
advection-dominated ($P_l = 20$) flow.
As final final remarks, we briefly attempt to play devil's advocate by pointing a finger
of failure at GFEM. Thus, we ask, 'After all is said and done, why even consider GFEM
for the myriad of cases presented above when you nearly always (or, often) run flat
up against a wiggle problem? Haven't you been hoisted with your own petard?' While
many 'anti-Galerkin' advocates would indeed argue that we have just presented many
reasons to shun GFEM—in favor of ad hoc or other methods that can simultaneously
suppress/preclude the wiggles while otherwise generating accurate solutions that are not
overly dissipative/smooth—we still generally prefer GFEM, partly because we still feel
that the wiggle signals tell us how to do a better job and partly because methods that never
wiggle will sometimes be deceptive with respect to alleged or assumed accuracy. Also,
we have presented GFEM in the manner shown in part to 'expose' it fully, so that its
strong points may also be better appreciated. Consistent with this belief, we will not offer
panaceas to wiggle-sensitized readers—we merely warn them that all such non-wiggly
methods should be employed with due caution and, usually, healthy skepticism. Finally,
we point out that whereas tricky methods that do most of the right things are not too
tough to devise in 1D, their extension to multi-dimensions is often either unsuccessful or
tremendously difficult and/or expensive. Whereas most of our 1D examples via GFEM
have truly analogous 2D and 3D versions, the smart 1D wiggle-suppressors often do not.
For a fairly recent and useful summary of 1D methodologies, see Finlayson (1992).
2.6.3 Extension to 2D
The jump from 1D to 2D is a big one, too big in many ways, especially when it comes
to analysis; useful, closed-form solutions are much harder to find and often are difficult
to 'interpret' even when found. The additional jump from 2D to 3D is, fortunately, not
a big one—at least conceptually. But 3D analysis does involve lots of long equations.
Thus, we will move into 2D, but not into 3D, for our analytical discussions. In fact, even
our 2D presentation will be brief due, in part, to a lack of known results. For example, we
will restrict almost all of our analysis to bilinear elements—leaving quads (and the entire
and important class of triangular elements) to 'others,' or to the reader. We will, however,
show some computational results comparing these latter elements with bilinears.
a. Pure advection with periodic BC's
The principal purpose of this section is to extend the phase speed/group velocity analysis
to 2D, to see what new treats/surprises are in store. The two basic results of the effort
below can be easily summarized up front: (i) dispersion error extends to directional
error as well as to just translational error (spurious anisotropic behavior of the numerical
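approximation), and (ii) lumping the mass is even more deleterious than it was in 1D.
To start, we return to the uniform-mesh, 4-patch equation of Section 2.3.2, (2.3-24),
simplified to constant velocity and with diffusion and source terms dropped:
$$\begin{aligned}
&\frac{1}{36}\left[16\dot T_0 + 4(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + (\dot T_{NE} + \dot T_{NW} + \dot T_{SE} + \dot T_{SW})\right] \\
&\quad + \frac{u}{6}\left(\frac{T_{SE} - T_{SW}}{2l} + 4\,\frac{T_E - T_W}{2l} + \frac{T_{NE} - T_{NW}}{2l}\right) \\
&\quad + \frac{v}{6}\left(\frac{T_{NE} - T_{SE}}{2h} + 4\,\frac{T_N - T_S}{2h} + \frac{T_{NW} - T_{SW}}{2h}\right) = 0,
\end{aligned}$$ (2.6-192)
which, of course, approximates the pure advection equation,
$$\partial T/\partial t + u\,\partial T/\partial x + v\,\partial T/\partial y = 0.$$ (2.6-193)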
Actually, for the purpose of the analysis to follow, there is really no reason to be restricted
to a finite domain with periodic BC's; thus, we switch to a pure IVP and allow the wave
vector to be a continuous variable. (Restriction to periodic domains can always be done
at the end, if desired.)
Thus, inserting the IC
$$T_0(x, y) = e^{i(k_1 x + k_2 y)},$$ (2.6-194)
where $k_1^2 + k_2^2 = k^2$, into (2.6-193) yields the solution
$$T(x, y, t) = e^{i(k_1 x + k_2 y - \omega_c t)},$$ (2.6-195)
where
$$\omega_c = k_1 u + k_2 v = \mathbf{k} \cdot \mathbf{u},$$ (2.6-196)
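and the subscript on $\omega$ refers (again, still) to the continuum result. Also, referring to
Figure 2.6-1, the phase velocity [see (2.6-3)] is
$$\mathbf{c} = \omega_c \mathbf{k}/k^2 = \frac{\omega_c}{k}\begin{pmatrix}\cos\theta \\ \sin\theta\end{pmatrix} = c\begin{pmatrix}\cos\theta \\ \sin\theta\end{pmatrix}$$ (2.6-197)
and $\mathbf{G} = \mathbf{u}$, from (2.6-7).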
A similar approach using (2.6-192) on a mesh of $l \times h$ elements leads to
$$T_0(x_m, y_n) = e^{i(k_1 m l + k_2 n h)}$$ (2.6-198)
as initial data,
$$T(x_m, y_n, t) = T_0(x_m, y_n)\,e^{-i\omega t}$$ (2.6-199)
as the trial solution, and
$$\omega = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{3}{2 + \cos\theta_1} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{3}{2 + \cos\theta_2}$$ (2.6-200)
as the resulting frequency, with $\theta_1 = k_1 l = kl\cos\theta$ and $\theta_2 = k_2 h = kh\sin\theta$. Comparing
this result with that from the 1D case, (2.6-23), shows a 'nice' generalization that is totally
lost if we lump the mass; i.e., whereas the 1D, lumped-mass result is given by (2.6-24),
it does not generalize. What happens if the mass is lumped in (2.6-192) is this: (2.6-198)
and (2.6-199) lead to
$$\omega_l = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{2 + \cos\theta_2}{3} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{2 + \cos\theta_1}{3};$$ (2.6-201)
not only do we lose the improvement generated by the 'consistent mass factor,' $3/(2 + \cos\theta)$,
we pick up a further reduction in frequency (and therefore in phase speed) by the
cross-directional 'pollution' factors, $(2 + \cos\theta_i)/3$! This is perhaps simpler to appreciate
if we display the frequency from the simple, second-order centered FDM (the five-point
stencil),
$$\omega_{FDM} = u k_1\,\frac{\sin\theta_1}{\theta_1} + v k_2\,\frac{\sin\theta_2}{\theta_2},$$ (2.6-202)
which is the proper generalization of the 1D, lumped-mass result. Lumped-mass linears
equal FDM in 1D only; they are inferior to FDM in multi-dimensions, an important point
not generally appreciated. Consistent mass really is consistent. We will soon demonstrate
these differences via examples, but first let us bring two other non-Galerkin methods,
previously considered, into the picture: the CVFEM of Section 2.5.3 and the one-point
quadrature approximation of Section 2.5.2, wherein we recall that the key differences
are in the mass matrix averages that result; the (1 4 1)/6 of GFEM goes over to (1 6 1)/8
for CVFEM and to (1 2 1)/4 for one-point quadrature. The resulting frequencies are
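$$\omega_{CVFEM} = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{4}{3 + \cos\theta_1} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{4}{3 + \cos\theta_2},$$ (2.6-203)
and its (more frequently employed) lumped-mass counterpart,
$$\omega_{LMCVFEM} = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{3 + \cos\theta_2}{4} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{3 + \cos\theta_1}{4},$$ (2.6-204)
and
$$\omega_{1\text{-pt}} = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{2}{1 + \cos\theta_1} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{2}{1 + \cos\theta_2},$$ (2.6-205)
a generalization of the Box-scheme result of (2.6-58) to 2D. Finally, the lumped mass
and one-point quadrature result is
$$\omega_{LM+1\text{-pt}} = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{1 + \cos\theta_2}{2} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{1 + \cos\theta_1}{2}.$$ (2.6-206)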
The numerical phase speeds are simply obtained from (2.6-197) with $\omega_c$ replaced by
the approximate frequency from any of (2.6-200) through (2.6-206), say $\omega_i$; and the
phase speed magnitude is $\omega_i/k$. The numerical group velocities, however, are another
matter; some calculus is required per (2.6-7), and the results are shown in Table 2.6-3;
recall that $\mathbf{G} = \mathbf{u}$ for the continuum (see Figure 2.6-1).
The cross-pollution caused by mass lumping is particularly evident in the group
velocities, for both FEM and CVFEM. A quantitative comparison is given in Table 2.6-4
for $\theta = 45°$ and three wave-number pairs, given in terms of wave lengths: (1) $16l \times 8h$
(a fairly well-resolved wave), with $\theta_1 = (\pi/8)\cos(\pi/4)$, $\theta_2 = (\pi/4)\cos(\pi/4)$; (2) $8l \times 4h$ (a
not-so-well-resolved wave); and (3) a $4l \times 4h$ wave (barely resolved):
Finally, the middle case of Table 2.6-4 is depicted in Figure 2.6-62 for u = 1,
v = 2, and l/h = 2. Most notable is the superiority of GFEM and the inferiority of its
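Table 2.6-3 Group velocities for several methods.
Method       $G_x = \partial\omega/\partial k_1$    $G_y = \partial\omega/\partial k_2$
GFEM         $u \cdot \frac{3}{2+\cos\theta_1} \cdot \frac{1+2\cos\theta_1}{2+\cos\theta_1}$    $v \cdot \frac{3}{2+\cos\theta_2} \cdot \frac{1+2\cos\theta_2}{2+\cos\theta_2}$
CVFEM        $u \cdot \frac{4}{3+\cos\theta_1} \cdot \frac{1+3\cos\theta_1}{3+\cos\theta_1}$    $v \cdot \frac{4}{3+\cos\theta_2} \cdot \frac{1+3\cos\theta_2}{3+\cos\theta_2}$
LMCVFEM      $u\cos\theta_1 \cdot \frac{3+\cos\theta_2}{4} - \frac{v}{4}\frac{l}{h}\sin\theta_1\sin\theta_2$    $v\cos\theta_2 \cdot \frac{3+\cos\theta_1}{4} - \frac{u}{4}\frac{h}{l}\sin\theta_1\sin\theta_2$
1-pt         $u \cdot \frac{2}{1+\cos\theta_1}$    $v \cdot \frac{2}{1+\cos\theta_2}$
FDM          $u\cos\theta_1$    $v\cos\theta_2$
LM           $u\cos\theta_1 \cdot \frac{2+\cos\theta_2}{3} - \frac{v}{3}\frac{l}{h}\sin\theta_1\sin\theta_2$    $v\cos\theta_2 \cdot \frac{2+\cos\theta_1}{3} - \frac{u}{3}\frac{h}{l}\sin\theta_1\sin\theta_2$
LM + 1-pt    $u\cos\theta_1 \cdot \frac{1+\cos\theta_2}{2} - \frac{v}{2}\frac{l}{h}\sin\theta_1\sin\theta_2$    $v\cos\theta_2 \cdot \frac{1+\cos\theta_1}{2} - \frac{u}{2}\frac{h}{l}\sin\theta_1\sin\theta_2$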
Table 2.6-4 Three special cases.
             $G_x/u$                                                                 $G_y/v$
Method       16l × 8h               8l × 4h                4l × 4h                16l × 8h               8l × 4h                4l × 4h
GFEM         0.9998                 0.9972                 0.9461                 0.9972                 0.9461                 0.9461
CVFEM        0.9739                 0.9579                 0.7864                 0.9579                 0.7864                 0.7864
LMCVFEM      0.9256 - 0.036(vl/uh)  0.7316 - 0.118(vl/uh)  0.3823 - 0.201(vl/uh)  0.8416 - 0.036(uh/vl)  0.4273 - 0.118(uh/vl)  0.3823 - 0.201(uh/vl)
1-pt         1.020                  1.081                  1.385                  1.081                  1.385                  1.385
FDM          0.9617                 0.8497                 0.4440                 0.8497                 0.4440                 0.4440
LM           0.9135 - 0.048(vl/uh)  0.6922 - 0.126(vl/uh)  0.3617 - 0.268(vl/uh)  0.8172 - 0.048(uh/vl)  0.4218 - 0.126(uh/vl)  0.3617 - 0.268(uh/vl)
LM + 1-pt    0.8894 - 0.072(vl/uh)  0.6135 - 0.236(vl/uh)  0.3206 - 0.401(vl/uh)  0.8334 - 0.072(uh/vl)  0.4106 - 0.236(uh/vl)  0.3206 - 0.401(uh/vl)
Fig. 2.6-62 Pictorial of group velocities for an 8l × 4h wave with u = 1, v = 2, l/h = 2. [Panels: (a) GFEM, (b) LM, (c) CVFEM, (d) FDM, (e) LMCVFEM, (f) 1-pt.]
lumped-mass counterpart—as usual. CVFEM is in between FDM and GFEM—unless
lumping is employed, in which case it is poor. Also, the one-point quadrature
approximation gives too large a result, although it is probably not in last place, which
seems to go to one-point plus LM.
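The group velocity comparison can be reproduced for a few of the methods with a short sketch. It assumes the closed forms of Table 2.6-3 as reconstructed from differentiating (2.6-200) through (2.6-206) with respect to $k_1$ and $k_2$; the function name and the choice of methods are illustrative. For the 16l × 8h wave it returns values that round to the GFEM, FDM and 1-pt entries printed in Table 2.6-4; no claim is made here for every tabulated entry.

```python
import math

def group_velocity(method, theta1, theta2, u, v, l_over_h):
    """(Gx, Gy) from the closed forms of Table 2.6-3, for a subset of the methods."""
    c1, c2 = math.cos(theta1), math.cos(theta2)
    s12 = math.sin(theta1) * math.sin(theta2)
    if method == "GFEM":
        return (u * 3.0 * (1.0 + 2.0 * c1) / (2.0 + c1) ** 2,
                v * 3.0 * (1.0 + 2.0 * c2) / (2.0 + c2) ** 2)
    if method == "FDM":
        return (u * c1, v * c2)
    if method == "1-pt":
        return (u * 2.0 / (1.0 + c1), v * 2.0 / (1.0 + c2))
    if method == "LM":
        # The second term in each component is the cross-directional 'pollution'.
        return (u * c1 * (2.0 + c2) / 3.0 - v * l_over_h * s12 / 3.0,
                v * c2 * (2.0 + c1) / 3.0 - u / l_over_h * s12 / 3.0)
    raise ValueError(f"unknown method: {method}")

# 16l x 8h wave at theta = 45 degrees
t1 = (math.pi / 8.0) * math.cos(math.pi / 4.0)
t2 = (math.pi / 4.0) * math.cos(math.pi / 4.0)
for m in ("GFEM", "FDM", "1-pt", "LM"):
    print(m, group_velocity(m, t1, t2, u=1.0, v=2.0, l_over_h=2.0))
```

Note that only the lumped-mass result depends on the velocity ratio and the mesh aspect ratio, which is the cross-pollution visible in the table.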
b. Pure advection with Dirichlet BC's (inlet only)
The GFEM automatically 'switches' to an upwind approximation to advection at the
exit if the domain is truncated in a region of outflow ($\mathbf{n} \cdot \mathbf{u} > 0$); see (2.3-11), with
K = H = S = q = 0. One might initially think that such an approximation would add
artificial diffusion and thus be bad for pure advection. While it is true that the advection
matrix is then no longer skew-symmetric, it is also true that the behavior/response is
innocuous. In fact, Gustafsson has shown (see Kreiss and Oliger, 1973) that the overall
order of accuracy is not harmed by one-order-lower approximations to derivatives at
boundaries for finite differences, and Hindmarsh (1975) has shown the same for 1D,
linear finite elements.
But the main purpose of this section is to present some numerical results for pure
advection, which uses a Dirichlet BC at the inlet (T = 0) and no BC at the exit. Presented
below are numerical results for simple, constant-velocity, pure advection of a Gaussian
through a 2D rectangular domain with the purpose of demonstrating some of the features
discussed above and to compare various elements numerically/pictorially, including those
not studied analytically: quadratic rectangles and triangles, both linear and quadratic. In
all cases, the ODE's were integrated numerically using a sufficiently small timestep so
that all of the errors can be regarded as 'spatial.'
The domain in all cases is a unit square, and the discretization in all cases corresponds
to that of a uniform bilinear element mesh of 30 × 50; i.e., $\Delta x = 1/30$ and $\Delta y = 1/50$.
The velocity field is given by u = cos 37° = 0.7986, v = sin 37° = 0.6018, and the time
integration goes from zero to 1 (500 steps, trapezoid rule; see Section 2.7). The IC is a
Gaussian centered at (0.2, 0.2) with $\sigma_x = \sigma_y = 0.05$, giving $\sigma_x/\Delta x = 1.5$ and $\sigma_y/\Delta y = 2.5$,
which we 'define' to be fairly well resolved in y but slightly under-resolved in x. Thus,
at the end of the integration, the Gaussian center should be at x = 0.2 + cos 37° = 0.9986
(virtually at the exit) and y = 0.2 + sin 37° = 0.8018. The comparison is purely pictorial
(the 'augenmethod'), and there are 16 different runs, as defined in Table 2.6-5, in which
an additional piece of comparison data is given for each case: the max and min of the
computed result at t = 1/2 (x = 0.599 and y = 0.501 for the exact result), which sort
of measures 'conservation' and wiggles. The pictures referred to in the table follow the
table, and in each the left side shows t = 1/2 and the right side shows t = 1.
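The setup numbers quoted above (resolution ratios and exact center positions) can be checked with a few lines. This is a sketch only: the Gaussian is assumed here to have the form $\exp[-r^2/2\sigma^2]$ with unit peak, and the helper names are hypothetical.

```python
import math

dx, dy = 1.0 / 30.0, 1.0 / 50.0          # uniform 30 x 50 mesh on the unit square
sigma = 0.05                             # sigma_x = sigma_y
angle = math.radians(37.0)
u, v = math.cos(angle), math.sin(angle)  # constant velocity field

def exact_center(t, x0=0.2, y0=0.2):
    """Center of the advected Gaussian at time t (pure advection, no diffusion)."""
    return (x0 + u * t, y0 + v * t)

def exact_gaussian(x, y, t):
    """Exact solution, assuming the IC exp(-r^2/(2*sigma^2)) with unit peak."""
    xc, yc = exact_center(t)
    return math.exp(-((x - xc) ** 2 + (y - yc) ** 2) / (2.0 * sigma ** 2))

print(sigma / dx, sigma / dy)            # resolution ratios in x and y
print(exact_center(0.5), exact_center(1.0))
```

Because the exact solution keeps a peak of one and never goes negative, the max and -min columns of Table 2.6-5 measure directly how much each scheme loses from the peak and how large its wiggles are.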
Table 2.6-5 Some advection experiments.
Run  Element                          Mass matrix        Max/(-min) at t = 0.5  Figure
1    Bilinear                         Consistent         0.972/0.012            2.6-63(a)
2    Bilinear                         Lumped (row sum)   0.770/0.277            2.6-63(b)
3    8-node serendipity               Consistent         1.002/0.004            2.6-63(c)
4    8-node serendipity               Lumped(1)          0.400/0.439            None (gibberish)
5    Biquadratic                      Consistent         1.004/0.006            2.6-64(a)
6    Biquadratic                      Lumped (row sum)   0.962/0.07             2.6-64(b)
7    Linear triangle(2)               Consistent         0.963/0.017            2.6-65(b)
8    Linear triangle(2)               Lumped             0.758/0.305            2.6-65(c)
9    Linear triangle(3)               Consistent         0.968/0.015            2.6-66(a)
10   Linear triangle(3)               Lumped             0.787/0.266            2.6-66(b)
11   Linear triangle(4)               Consistent         0.964/0.044            None
12   Linear triangle(4)               Lumped             0.748/0.302            None
13   Quadratic (6-node) triangle(2)   Consistent         0.998/0.045            2.6-64(c)
14   Quadratic (6-node) triangle(2)   Lumped(1)          0.662/0.414            None
15   Quadratic (6-node) triangle(3)   Consistent         0.998/0.011            2.6-65(a)
16   Bilinear                         Consistent(5)      0.860/0.200            2.6-66(c)
(1) Compute total element mass (area), then distribute it equally to the nodes; see also
Section 2.3.4, in which two other poor lumping schemes were discussed.
(2) Grid designed so the triangles were aligned with the flow, more or less.
(3) Grid designed so the triangles were aligned against the flow, more or less.
(4) 'Union-jack' mesh design, preferred by some (shunned by others).
(5) Reduced quadrature (one-point).
We offer the following comments regarding the results:
1. The size of the 'base' (added for 'display' purposes) on which the Gaussian sits is 10%
of the height of the Gaussian, if the Gaussian remains positive. For the plots in these
figures, however, there are some negative values, in which cases the height of the base
is 10% of the full range of the plotted function plus the absolute value of the smallest
negative value.
2. The three rectangular-element GFEM results, Figures 2.6-63(a), 2.6-63(c), and
2.6-64(a), show clearly the group velocity effect of being under-resolved in x and 'okay' in
y by the fact that the dispersion wiggles are basically moving in the negative x-direction,
irrespective of the fluid's velocity. The fact that lumped biquadratics, Figure 2.6-64(b),
also look like the others, suggests that lumping the mass for this element is not nearly
as deleterious as it is for most others, an effect we have seen before; Gresho et al.
(1978, 1980).
3. The lumped bilinear element, Figure 2.6-63(b), performs poorly, as
predicted, showing lots of dispersion in both directions.
4. The lumped serendipity element is inconsistent/non-convergent, an interesting point
since the consistent mass result for this element, Figure 2.6-63(c), is arguably the best
of all.
5. Linear triangles do indeed demonstrate their famous (infamous) 'mesh-orientation
effect'; no surprise, really, since we know that group velocity errors are definitely
mesh-dependent.
Fig. 2.6-63 Pure advection results for three quadrilateral elements.
6. Consistent linears are almost, but not quite, as good as consistent bilinears—again not
a new result.
7. Lumped linears are about as bad as lumped bilinears.
8. Quadratic triangles, Figures 2.6-64(c) and 2.6-65(a), are quite good, but the aligned
result [Figure 2.6-64(c)] has a surprising trail of wiggles in the better-resolved direction.
Fig. 2.6-64 Pure advection results for two quadrilateral elements and one triangular element.
9. Finally, the lumping algorithm employed for the quadratic triangle is, like that for the
eight-node quad, inappropriate.
10. Reduced quadrature on the bilinear element [Figure 2.6-66(c)] causes phase lead, as
predicted; ditto nine-node (2 x 2), but smaller error (not shown).
11. For some analytical results using linear triangles in several configurations, see
Neta and Williams (1986)—although at least one of their results is known to be in
error (D. Griffiths, personal communication): the criss-cross triangle formulation is not
unstable.
Fig. 2.6-65 Pure advection results for three triangular elements.
c. Advection-diffusion with Dirichlet BC's
The only eigensystem result known to us for this case is that for the related FDM on
the simple five-point stencil with constant coefficients. Thus, we shall present it, knowing
(believing) that the nine-point GFEM version is 'similar but better.' The FDM version of
$\partial T/\partial t + u\,\partial T/\partial x + v\,\partial T/\partial y = K\nabla^2 T$ is, on a uniform mesh (see Figure 2.3-3),
$$\dot T_0 + \frac{u}{2l}(T_E - T_W) + \frac{v}{2h}(T_N - T_S) = K\left[(T_W - 2T_0 + T_E)/l^2 + (T_N - 2T_0 + T_S)/h^2\right],$$ (2.6-207)
which is the analog of (2.3-24), after setting $u_i = u$, $v_j = v$, and S = 0 there.
Fig. 2.6-66 Pure advection results for two triangular elements and one quadrilateral element.
We present the eigensolution only for the simplest case: a square domain with l = h.
It turns out that the five-point stencil above leads, as does the continuum, to a simple
generalization of the 1D results given in (2.6-128) and (2.6-129), a generalization that
is not easy for the GFEM. For the continuum, we have [cf. (2.6-118) and (2.6-119)]
$$\lambda_{m,n} = [(m^2 + n^2)\pi^2 + (Pe_1^2 + Pe_2^2)]K/L^2,$$ (2.6-208)
where $Pe_1 = uL/2K$ and $Pe_2 = vL/2K$, with eigenfunction
$$\Phi_{m,n} = e^{(Pe_1 x/L + Pe_2 y/L)}\,\sin(m\pi x/L)\,\sin(n\pi y/L),$$ (2.6-209)
for m, n = 1, 2, .... The corresponding FDM result is [for l = h = L/(N + 1)]
$$\lambda_{m,n} = \frac{2K}{h^2}\left[\left(1 - \sqrt{1 - P_1^2}\,\cos(m\pi h/L)\right) + \left(1 - \sqrt{1 - P_2^2}\,\cos(n\pi h/L)\right)\right]$$ (2.6-210)
with eigenvector
$$T_{jk}^{(m,n)} = \left(\frac{1 + P_1}{1 - P_1}\right)^{j/2} \sin(jm\pi h/L) \cdot \left(\frac{1 + P_2}{1 - P_2}\right)^{k/2} \sin(kn\pi h/L),$$ (2.6-211)
for m, n, j, k = 1, 2, ..., N and $P_1 = uh/2K$, $P_2 = vh/2K$.
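The FDM eigenpair can be verified by substituting (2.6-211) into the stencil (2.6-207) with $\dot T = -\lambda T$. The sketch below does this in plain Python and reports the largest relative residual; it assumes l = h, grid Peclet numbers below one so that the square roots are real, and homogeneous Dirichlet values on all four sides. The function names and test parameters are illustrative.

```python
import math

def fdm_eigenpair(m, n, N, u, v, K, L=1.0):
    """Eigenvalue (2.6-210) and eigenvector (2.6-211) on an (N+2) x (N+2) grid."""
    h = L / (N + 1)
    P1, P2 = u * h / (2.0 * K), v * h / (2.0 * K)    # requires |P1|, |P2| < 1
    lam = (2.0 * K / h**2) * (
        (1.0 - math.sqrt(1.0 - P1**2) * math.cos(m * math.pi * h / L))
        + (1.0 - math.sqrt(1.0 - P2**2) * math.cos(n * math.pi * h / L)))
    r1, r2 = (1.0 + P1) / (1.0 - P1), (1.0 + P2) / (1.0 - P2)
    T = [[0.0] * (N + 2) for _ in range(N + 2)]      # boundary rows/columns stay zero
    for j in range(1, N + 1):                        # j indexes x, k indexes y
        for k in range(1, N + 1):
            T[j][k] = (r1 ** (j / 2.0) * math.sin(j * m * math.pi * h / L)
                       * r2 ** (k / 2.0) * math.sin(k * n * math.pi * h / L))
    return lam, T, h

def max_relative_residual(m, n, N, u, v, K, L=1.0):
    """Largest |advection - diffusion - lam*T| over interior nodes, scaled by lam*max|T|."""
    lam, T, h = fdm_eigenpair(m, n, N, u, v, K, L)
    worst, biggest = 0.0, 0.0
    for j in range(1, N + 1):
        for k in range(1, N + 1):
            adv = (u * (T[j + 1][k] - T[j - 1][k]) + v * (T[j][k + 1] - T[j][k - 1])) / (2.0 * h)
            dif = K * (T[j - 1][k] + T[j + 1][k] + T[j][k - 1] + T[j][k + 1] - 4.0 * T[j][k]) / h**2
            worst = max(worst, abs(adv - dif - lam * T[j][k]))
            biggest = max(biggest, abs(T[j][k]))
    return worst / (lam * biggest)

print(max_relative_residual(1, 2, 9, u=1.0, v=0.5, K=0.1))
```

A residual at round-off level confirms that the 2D result is just the sum of two 1D results, one per coordinate direction.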
Clearly the 2D case is a simple and appropriate generalization of the 1D case discussed
in Section 2.6.2c, and we leave to the reader the extension of those results with respect
to wiggles (transient and steady-state), minimum time of believability, etc.—including,
unfortunately, the extension to GFEM!
But there are two additional features of the 2D case that merit mention here—both of
which are much more important than the somewhat silly case discussed above: (i) flows
past objects (with or without wiggles) and (ii) contained flows—both of which must
extend the previously restricted case to one of variable velocity and variable mesh—via
a return to GFEM. With respect to the former, consider flow of a cool fluid past a hot
step, or block, or circular cylinder—at large Peclet number, which leads to two remarks.
The first remark is this: if $P_1 \gg 1$ and $P_2 \gg 1$ in the neighborhood of the object, then the
T-field will wiggle; the second is this: if the mesh is properly graded such that $P_1 < 1$
and $P_2 < 1$ (GFEMIA), again (only) in the neighborhood of the object, then the solution
will not wiggle and it will be accurate. (We surmise, but have not proven, that it is really
the local normal grid Peclet number that makes or breaks wiggles; i.e., $P_n = u_n h_n/2K$,
where $h_n$ is the normal distance from the first node in from $\Gamma$, and $u_n = \mathbf{u} \cdot \mathbf{n}$ is the
'normal' velocity at that node.) The final remark relates to contained flow ($\mathbf{u} \cdot \mathbf{n} = 0$ on $\Gamma$),
although actually not much is new for this case, because wiggles (and their suppression
via local mesh refinement) can still occur when there is a velocity component normal to
the boundary at the first node in, and the local (normal) Peclet number is too large there.
d. Advection-diffusion with periodic BC's
Periodic BC's are sometimes appropriate; e.g., if the 'geometry' of the problem is itself
periodic (in one or more directions), so too might be the solution. Consider, for example,
flow in a channel containing periodically placed identical 'obstacles'—such as cross-flow
through a tube bank. The mathematical conditions that apply under such conditions are
two: (i) continuity of T, and (ii) continuity of $K\,\partial T/\partial n$, where $\partial T/\partial n$ refers to the normal
derivative along the periodic boundary (at x = 0 and x = L, typically). The first of these
is incorporated into the finite element code simply by appropriate node 'labeling'; i.e.,
at 'outflow' the degree-of-freedom numbers must match, one-for-one, the corresponding
numbers at 'inflow', a situation that is especially obvious for the (rare) case of flow
through a torus/doughnut; i.e., a 'truly' periodic flow. By giving the same node number
at inlet and outlet, the FEM 'assembly' process will take care of the rest; i.e., the solution
will be periodic. The second of the 'BC's' actually comes 'free' in the usual case; i.e.,
the first BC assures, as well as can be approximated, the second. (Of course, with our $C^0$
basis functions, we do not have $C^0$ gradients; but the jump in $\partial T/\partial n$ will be 'as small as
possible,' in some sense: the same sense in which jumps in $\nabla T$ are small 'internally.')
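The node-labeling idea can be sketched in 1D: the last element simply reuses the degree-of-freedom number of the first node, and ordinary assembly does the rest. The example below assembles the consistent mass matrix for linear elements; the function name and the dense list-of-lists storage are choices made for this illustration, and at least three elements are assumed so that the wrap-around entries are distinct.

```python
def assemble_periodic_mass(num_elements, h):
    """Consistent mass matrix for 1D linear elements with periodic node labeling."""
    n = num_elements                       # periodic: n elements share n unknowns
    M = [[0.0] * n for _ in range(n)]
    elem = [[h / 3.0, h / 6.0],
            [h / 6.0, h / 3.0]]            # element mass matrix for a linear element
    for e in range(n):
        dofs = (e, (e + 1) % n)            # the last element's right node reuses dof 0
        for a in range(2):
            for b in range(2):
                M[dofs[a]][dofs[b]] += elem[a][b]
    return M

M = assemble_periodic_mass(5, 0.2)
for row in M:
    print(["%.4f" % x for x in row])
```

Every row then carries the same (1 4 1)/6 pattern scaled by h, including the first and last rows, which is what makes the discrete operator periodic without any special boundary treatment.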
To finish, we take a quick look at the associated 'eigenproblems.' The analytic solution
for this case is easily found to be [cf. (2.6-108)]
$$T(x, y, t) = e^{-k^2Kt} \cdot e^{ik_1(x - ut)} \cdot e^{ik_2(y - vt)},$$ (2.6-212)
where $k^2 = k_1^2 + k_2^2$ and periodicity requires $k_1 = 2\pi m/L$ and $k_2 = 2\pi n/H$ with m, n =
1, 2, ... in an L × H domain. The bilinear GFEM 'analog' on a uniform mesh is obtained
by first adding the diffusion term to the RHS of (2.6-192), see (2.3-24), and then seeking
a solution in the form $T_{j\ell}(t) = e^{-\mu t}\,e^{i(k_1 j\Delta x + k_2 \ell\Delta y - \omega t)}$ to obtain
$$\mu = \frac{2K}{l^2}(1 - \cos\theta_1) \cdot \frac{3}{2 + \cos\theta_1} + \frac{2K}{h^2}(1 - \cos\theta_2) \cdot \frac{3}{2 + \cos\theta_2},$$ (2.6-213)
and $\omega$ is (still) given by (2.6-200), where $\theta_1 = k_1\Delta x$ and $\theta_2 = k_2\Delta y$; i.e., we again have an
appropriate generalization of the 1D result in Section 2.6.2b. We leave the mass-lumped
analog to the reader, except for the following parting remark: as for pure advection, the
2D result is not an appropriate generalization of the 1D result.
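The decay rate (2.6-213) can be compared with the exact value $k^2K$ from (2.6-212). The following minimal sketch assumes $\Delta x = l$ and $\Delta y = h$, so that $k_1 = \theta_1/l$ and $k_2 = \theta_2/h$; the function names are hypothetical.

```python
import math

def gfem_decay_rate(theta1, theta2, K, l, h):
    """mu from (2.6-213) for bilinear GFEM on a uniform l-by-h mesh."""
    return ((2.0 * K / l**2) * (1.0 - math.cos(theta1)) * 3.0 / (2.0 + math.cos(theta1))
            + (2.0 * K / h**2) * (1.0 - math.cos(theta2)) * 3.0 / (2.0 + math.cos(theta2)))

def exact_decay_rate(theta1, theta2, K, l, h):
    """k^2 K from (2.6-212), with k1 = theta1/l and k2 = theta2/h."""
    return K * ((theta1 / l) ** 2 + (theta2 / h) ** 2)

for theta in (0.01, 0.5, 1.0, math.pi):
    ratio = gfem_decay_rate(theta, theta, 1.0, 1.0, 1.0) / exact_decay_rate(theta, theta, 1.0, 1.0, 1.0)
    print(theta, ratio)
```

The ratio tends to one for well-resolved waves and grows to $12/\pi^2$ at the two-grid-interval wave, so consistent-mass GFEM damps the shortest waves somewhat faster than the continuum does.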
e. Advection-diffusion with OBC's
Little need be said here except that the FEM NBC as OBC, the homogeneous Neumann
BC, $K\,\partial T/\partial n = 0$, should generally be employed or perhaps (probably) the new 'free' OBC
discussed in Section 2.4.1. It will usually produce quite acceptable results, at both small
and large global and local Peclet numbers, even when $\partial T/\partial n$ is not even close to zero at
the outflow boundary, as in 1D.
Finally, a brief return to Section 2.4.2 and Figure 2.4-4 may be worthwhile, showing
as it does a possible problem when n is not parallel to a coordinate direction.
f. Final remarks on advection-diffusion via GFEM
The AD equation can range over the following PDE classifications: elliptic (steady-state
with u = 0), parabolic (the general case), and hyperbolic (K = 0, pure advection). It
is nearly common knowledge by now that elliptic equations are the easiest to solve
numerically and that the GFEM generates, in the appropriate norm, the 'best' possible
solution. For the other extreme, however (K = 0), it usually does not; wiggles often get
in the way. A primary purpose of much of what has been presented above was to display
and explain GFEM—not necessarily to always advocate it.
We have demonstrated the 'wiggle-weakness' of GFEM for pure advection (and for
advection-dominated, $Pe \gg 1$, flows) and 'rough data'; i.e., for wave forms with some
length scales that are much smaller than the mesh size. Thus, for the parabolic (general)
case that is 'close to' the hyperbolic limit, GFEM leaves something to be desired—which
causes us to end this section with some ostensibly apologetic remarks. The main one is
that, although many in the field of numerical simulation believe that there is a definite
need for a dissipative advection scheme (to preclude wiggles, if nothing else), we have
rarely found such a need—in spite of some of the wiggles presented above. Thus, we
cannot offer the wiggle-sensitized reader any of the many alleged panaceas that have
appeared in the literature. But we do suggest the following when considering a particular
dissipative scheme: make sure it works well in multi-D and not just 1D. We presented
very few multi-D versions of our various GFEM examples, because we know that the
behavior generalizes properly (or improperly, if you are a wiggle-hater). But beware the
'smart' dissipative scheme that works great in 1D but often fails to do so in 2D or 3D,
especially with regard to phase and group velocity errors. And we will always remain
suspicious of Eulerian methods that never ever wiggle.
If we were to advocate a 'better' way than GFEM for advection-dominated flows, it
would be this: use the method of characteristics—much of which will be summarized in
Section 2.7.7a.
2.7 TIME INTEGRATION
While all of the equations of interest in this text are PDE's, we are advocates of what
some call the method of lines [we solve for T as a 'continuous' function of time on
the j-th line, $T_j(t)$, at least in 1D], and others call semi-discretization, and have been
practicing it thus far; i.e., the spatial operators of the PDE's are approximated via spatial
'discretization' with the time variable remaining continuous, thus generating (countable)
systems of ODE's (this chapter) or DAE's (next chapter). Then, in 'Step 2,' the
(considerable) theory and machinery of ODE's (or DAE's) is brought to bear in order to effect an
appropriate/cost-effective time-integration method. A dominant reason for our proceeding
along these lines is that one of us (PMG) has had (near) ready and willing access to one
of the ODE experts of the day, Dr. A.C. Hindmarsh (LLNL), whose impact on our work
has been, to say the least, considerable.
While we do not see total agreement by many others with this philosophy, neither
are we alone. We borrow from Warming and Beam (1979), who stated our case very
well back in 1979 (shortly after our first FEM Navier-Stokes papers were published,
in which the same approach was taken): 'Historically, the development and analysis of
methods for ODE's have been more advanced than those for PDE's. The present state
of numerical methods is no exception; therefore, it behooves the numerical analyst to
exploit the sophisticated ODE methods for the numerical solution of PDE's.' This is our
belief too; they just said it better. [See also Beam and Warming (1982).] Note too that
the recent basic (and excellent) text on finite elements by Hughes (Hughes, 1987) adopts
a similar approach. We just take (a portion of) it one important step further than either
of the above—we utilize that portion of ODE theory that employs variable timesteps.
But we are also well aware that this approach is not without some disadvantages and
even pitfalls. For openers, it generally pays little heed to the possibility of 'matching'
spatial methods with temporal methods to obtain an optimum balance; e.g., a very-high-
order ODE method does not match well with a spatial differencing scheme that uses simple
upwinding. Another disadvantage of the method of lines is that one may sometimes not
see the forest for the trees; i.e., it is not always fruitful, nor even possible, to get at some
of the deeper issues buried in PDE's such as regularity (smoothness), spatial singularities,
and even convergence as $\Delta x$ and $\Delta t \to 0$. For example, it is not always sufficient to
simply study convergence by letting $\Delta t \to 0$ for a fixed $\Delta x$ (fixed ODE system size);
sometimes it is necessary to attempt to recover some of the forest by letting the number
of ODE's grow toward infinity ($\Delta x \to 0$) while simultaneously allowing $\Delta t$ to approach
zero.
So, with certain admonitions placed 'up front,' we nevertheless believe that there are
many more advantages to this 'ODE' approach than disadvantages, among which are:
1. There exists a large and growing theoretical base that is not accessible if a 'full PDE'
treatment (time and space together) is selected.
2. The FEM is especially well developed and suited for spatial approximation, which
more or less naturally leads to the method of lines.
3. The FEM in time, while explored by some, has not been notably successful—partly
because of the many well-developed and well-understood finite difference methods that
are available. Recent exceptions to this statement will be discussed later.
4. Spatial approximation error can be studied and evaluated on its own merits (or
otherwise).
5. ODE theory can show us how to select the proper timestep—and when and how to
change it.
6. Additional 'machinery' associated with ODE theory, such as solving linear and
nonlinear algebraic equations, can also be usefully utilized.
7. Finally, our time derivatives are 'low' (first-order), and our space derivatives are high
(second-order), thus relegating—in some sense—temporal integration to a secondary
(easier) role.
Enough on philosophy; let us now get on with it. What we shall do in this chapter is to
introduce some ODE methods and some 'model' ODE's, beginning with the largest, single,
ODE 'categorization' term: explicit vs implicit, the former usually leading to simple step-
by-step 'marching' methods, and the latter to the intermediate solution of simultaneous
(non-linear and linear) algebraic equations. Intellectual efforts (considerable) for implicit
methods focus on 'How to solve linear and non-linear equation sets efficiently,' and those
for explicit methods on 'What are the stability limits as a function of problem parameters
and how can they be improved?' Another categorization, lesser in total significance, is
low-order (first- and second-, typically) vis-a-vis high-order ODE methods, that will not
consume much of our time (higher-order is, usually, not really required; unless, perhaps,
one is solving (low-Re) turbulent flows via direct numerical simulation and a high-order
spatial method). After the introduction via model ODE's, we will apply a few ODE
methods to the scalar transport equation.
We begin by writing a model ODE that in some sense corresponds to the subjects of
this chapter: advection and diffusion. It is
$$\dot{y}(t) = i\omega y - y/\tau = -\lambda y, \tag{2.7-1}$$
$y(0) = y_0$, with (obvious) solution $y(t) = y_0 e^{-\lambda t} = y_0 e^{-t/\tau} e^{i\omega t}$ (so that $\lambda = 1/\tau - i\omega$), where $\tau\,(> 0)$ corresponds
to a time-constant associated with diffusion (like $L^2/\kappa$ or $1/(k^2\kappa)$, where $k$ is a wavenumber),
and the frequency $\omega$ corresponds (via $\omega = u/L$ or $\omega = ku$) to a velocity (advection); the
term $i\omega y$ is 'non-dissipative', like advection. For pure diffusion ('friction'), set $\omega = 0$,
and for pure advection, set $\tau = \infty$. Equation (2.7-1) can also be interpreted as one Fourier
mode of a converged ($h \to 0$) AD equation; i.e., one with no spatial error. A further
correspondence with the PDE describing advection-diffusion is obtained when it is recalled
that the diffusion matrix has purely real eigenvalues (corresponding to monotonic decay
in time), and the advection matrix (skew-symmetric form, at least) has purely imaginary
eigenvalues (corresponding to non-decaying transport). Furthermore, if one were to solve
the ODE's corresponding to the AD equation via the eigenvector expansion method, it
would turn out that each (uncoupled) 'mode' (eigenvector) would look like (2.7-1); see
Remark (4) below.
Remarks:
(1) Whereas the ODE's of advection-diffusion involve the GFEM mass matrix, we shall
defer this additional complexity in order to introduce some ODE methods in the
easiest way.
(2) It is useful and immediate to generalize (2.7-1) to a vector system of equations each,
in general, with a different $\lambda$ (eigenvalue); see Remark (4).
(3) Equation (2.7-1) is sometimes referred to as 'Dahlquist's test equation'; see Hairer
and Wanner (1991).
(4) To clarify the above discussion and to motivate (formally at least) the study of
a single scalar equation, return to (2.2-7) and rewrite it as $\dot{T} + AT = b$, where
$A = M^{-1}[N(u) + K]$ and $b = M^{-1} f$. Now consider the eigenvalue problem
$A x_i = \lambda_i x_i$, $i = 1, 2, \ldots, N$, and assume the existence of a complete set of eigenvectors
(valid for virtually all cases of interest to us) so that the 'total' eigenproblem is
$AX = X\Lambda$, where the j-th column of the matrix $X$ is the j-th eigenvector and $\Lambda$ is a
diagonal matrix of the eigenvalues, $\{\lambda_i\}$. Thus $A = X\Lambda X^{-1}$ and $\dot{T} + X\Lambda X^{-1}T = b$.
Finally, let $y = X^{-1}T$ to give $T = Xy$ and $X\dot{y} + X\Lambda y = b$ or $\dot{y} + \Lambda y = X^{-1}b \equiv \tilde{b}$,
an uncoupled system of N ODE's.
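The decoupling described in Remark (4) can be checked numerically. What follows is only a minimal sketch: the 2 x 2 matrices `K` and `N`, the vectors `b` and `T0`, and the function name `modal_solution` are all hypothetical choices made for illustration (with M taken as the identity), not the GFEM system (2.2-7) itself. It assumes NumPy is available, that A has a complete set of eigenvectors, and that no eigenvalue is zero.

```python
import numpy as np

def modal_solution(A, b, T0, t):
    """Solve dT/dt + A T = b, T(0) = T0, by eigenvector expansion.

    Assumes A has a complete set of eigenvectors and no zero eigenvalue.
    """
    lam, X = np.linalg.eig(A)        # A X = X Lambda
    y0 = np.linalg.solve(X, T0)      # y = X^{-1} T
    b_hat = np.linalg.solve(X, b)    # X^{-1} b
    # Each mode obeys its own scalar ODE: dy/dt + lam*y = b_hat.
    y_inf = b_hat / lam
    y = y_inf + (y0 - y_inf) * np.exp(-lam * t)
    return (X @ y).real              # imaginary parts cancel for real data

# Illustrative data only (hypothetical, not taken from the text).
K = np.array([[2.0, -1.0], [-1.0, 2.0]])   # symmetric, "diffusion-like"
N = np.array([[0.0, 2.0], [-2.0, 0.0]])    # skew-symmetric, "advection-like"
A = K + N
b = np.array([1.0, 0.0])
T0 = np.array([1.0, -1.0])
```

For this toy data the symmetric part has purely real eigenvalues and the skew-symmetric part purely imaginary ones, mirroring the correspondence with diffusion and advection noted before the Remarks.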
Actually, to better introduce the several ODE methods to be described below, and
to better set the stage for the next chapter, we will also consider the general
(nonlinear) ODE
$$\dot{y} = f(y, t), \tag{2.7-2}$$
$y(0) = y_0$, where $f(y, t)$ is a given function that is fairly well-behaved (continuous in
time and satisfies a Lipschitz condition; see, for example, Gear, 1971).
Next we list a few 'desirables' of ODE methods and point out right up front that no
method yet devised possesses all of the listed attributes:
1. Self-starting; i.e., the method should be applicable given only the IC, $y(0) = y_0$. (Many
are not.)
2. No spurious/extraneous/parasitic roots (solutions); i.e., the method should not introduce
any numerical artifacts in addition to the desired solution. (Many do.)
3. Stable for all timestep sizes, At. (Most are not.)
4. No spurious damping; i.e., if the ODE is of the non-dissipative variety, then the solution
method should not introduce numerical damping. (Many do.)
5. Easy to implement/inexpensive.
6. Finally, the method should be 'accurate', an obvious but vague (thus far) statement;
i.e., it should solve correctly the ODE for $\Delta t \to 0$, and the error should be small at finite
but 'reasonable' $\Delta t$'s.
Additional Remarks:
(1) Attributes (1) and (2) above tend to go together in that self-starting methods have
no extraneous roots, and methods with extraneous roots are not self-starting.
(2) Selection of an ODE method always involves compromises among the above list of
attributes.
(3) For a method with 'good' stability properties, an additional attribute is that the local
error be easy enough to estimate that a cost-effective, variable-step method, with
$\Delta t$ based on desired accuracy, can be designed.
(4) As in the linear case, we can also consider that (2.7-2) describes a vector system of
(in this case, coupled) ODE's.
(5) ODE people often/usually use h for $\Delta t$ (one symbol vs two), and PDE people often
use k for $\Delta t$ because h to them refers to $\Delta x$. But in fluid mechanics, k often means
wavenumber. Thus, since we are discussing all of the above (ODE's, PDE's, and
fluid mechanics), we shall simply use $\Delta t$ for the timestep.
To further set the stage for our brief discussion of ODE methods for AD, we refer to
Figure 2.7-1, taken (with permission) from the Stanford CFD course notes in the
Department of Aeronautics and Astronautics, #AA214—Numerical Methods in Fluid Mechanics,
taught for many years by Harvard Lomax (ours is from 1980). The terminology in the
figure is this: his $\lambda$ is our $\tau^{-1}$, his h is our $\Delta t$, and his $\sigma$'s are the (complex in general)
roots of the so-called 'characteristic polynomial' that characterizes the ODE method. We
shall soon show some $\sigma$'s, but will call them $\xi$'s. Each dot on each curve in the figure
corresponds to the value of $\sigma$ for a particular timestep size, and the numerical solution is
stable if and only if all $\sigma$'s lie within or on the unit circle ($|\sigma| \le 1$), the latter condition
($|\sigma| = 1$) requiring the further constraint that the roots be simple (repeated roots on the
unit circle are unstable). To continue the discussion, we find it best to quote directly
from the Lomax course notes, after pointing out that the left 'column' in Figure 2.7-1
is modeling pure diffusion, the right column is modeling pure advection, and the middle
is AD:
'One can picture the stability of the $\sigma$-roots by plotting them, for given values of
$\lambda h$, in the complex $\sigma$-plane relative to the unit circle about the origin. Figure 2.7-1
illustrates what could happen for a numerical method that produced one principal and
two spurious roots. We assume the method is applied to inherently stable ODE. Shown
are the traces of the $\sigma$-roots as $\lambda h$ starts at zero and is increased by some constant
increment. The top row shows a (hypothetical) exact behavior. The second row shows
the attempt of the principal $\sigma$-root to match the exact behavior and its ultimate failure
for large $\lambda h$. Since $e^{\lambda h} = 1$ for $h = 0$, the principal root always starts at +1 on the
unit circle. Traces of two spurious roots are shown on the bottom row. They can start
anywhere inside, on, or outside the unit circle. For most useful methods they start
inside the circle, and are kept inside by a proper choice of h. When any root, principal
or spurious, leaves the unit circle, the method is said to be numerically unstable for
that $\lambda h$.
[Figure 2.7-1: hand-drawn sketches (not reproduced here) of the traces of the $\sigma$-roots in the complex $\sigma$-plane relative to the unit circle, including the locations of the spurious roots at $h = 0$ for Adams-type and Milne-type methods.]
Fig. 2.7-1 Two figures from H. Lomax's CFD course. (Reproduced by permission from
H. Lomax).
Asymptotic Numerical Stability for $\lambda h \to 0$.
In the classical study of ODE's, considerable attention is given to the subject of
asymptotic stability which refers to the behavior of the numerical $\sigma$-root structure in
the limit as $h \to 0$. The theory applies only to the study of spurious $\sigma$-roots since all
principal $\sigma$-roots must start at +1 when $h = 0$. We mention here two classifications
of multi-root (defined as methods that produce at least one spurious root) methods.
If all the spurious $\sigma$-roots fall on the origin in the complex $\sigma$-plane when $h = 0$, the
method is said to be of the Adams type. The Adams-Bashforth and Adams-Moulton
methods have this property. If any spurious $\sigma$-root falls on the unit circle when $h = 0$,
the method is said to be of the Milne type. The leap-frog method has this property.
Milne methods are usually more accurate, but less stable than Adams methods. They
are illustrated in Figure 2.7-1.'
Returning now to a general discussion of ODE integration methods, five important
ODE definitions are the following:
1. A method is said to be k-th order accurate if, given the exact solution at time $t_n$, the
local (single-step) error,
$$l_n = y_{n+1} - y(t_{n+1}), \tag{2.7-3}$$
where $y(t_{n+1})$ denotes the exact solution and $y_{n+1}$ the approximate solution at $t_{n+1}$, varies
like $l_n = O(\Delta t^{k+1})$.
2. The global error,
$$e_n = y_n - y(t_n), \tag{2.7-4}$$
is the error over the whole range and includes the accumulation of local errors. It is
in general larger than the local error by one power of $\Delta t$; $e_n = O(\Delta t^k)$ for a k-th order
method. [See, for example, Gear (1971) for the proof of this assertion.]
3. A method is said to be A-stable (A $\ne$ Absolutely; it is simply A) if, when applied to
(2.7-1) with arbitrary $\tau > 0$, the numerical solution $\to 0$ as $n \to \infty$. Note that there is
no constraint on $\Delta t$ in this definition; thus, all $\Delta t$'s must cause $y_n \to 0$ as $n \to \infty$ if the
method is A-stable. (Few are.)
4. A method is said to be L-stable if it is A-stable and, when applied to (2.7-1), gives
$\lim_{\Delta t \to \infty} y_{n+1}/y_n = 0$.
5. An intermediate stability, called strong A-stability, has $\lim_{\Delta t \to \infty} y_{n+1}/y_n = \delta$, where
$0 < \delta < 1$.
Before getting too embroiled in details, let us clarify some of the above discussion by
showing the simplest (and lowest-order) explicit and implicit ODE methods—both first
used (apparently) by Euler, because they are called 'forward Euler' (FE) and 'backward
Euler' (BE), respectively—or 'explicit' and 'implicit' Euler. They are, applied to (2.7-2),
$$\text{FE:}\quad y_{n+1} = y_n + \Delta t\, f(y_n, t_n) = y_n + \Delta t\, \dot{y}_n. \tag{2.7-5}$$
$$\text{BE:}\quad y_{n+1} = y_n + \Delta t\, f(y_{n+1}, t_{n+1}) = y_n + \Delta t\, \dot{y}_{n+1}. \tag{2.7-6}$$
Clearly the explicit method is much simpler in that, since $y_n$ and $t_n$ are known, the
evaluation of $f(y_n, t_n)$ is explicit, and a simple marching method is obtained. For BE, on
the other hand, $f(y_{n+1}, t_{n+1})$ is not known because $y_{n+1}$ is not; the method is implicit in
$y_{n+1}$, and each step requires the solution of the non-linear (in general) equation, (2.7-6).
Even if the ODE is linear, an implicit method will always involve solving for $y_{n+1}$ rather
than simply evaluating the RHS, which characterizes explicit methods. Thus, for (2.7-1),
BE yields $y_{n+1} = y_n - \lambda\Delta t\, y_{n+1}$, which is 'solved' via $y_{n+1} = y_n/(1 + \lambda\Delta t)$.
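The difference between 'marching' and 'solving' can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are ours, the scalar Newton iteration used for the implicit equation is one possible choice (the text does not prescribe a solver), and the derivative `dfdy` must be supplied by the user. The decay rate `lam = 4.0` is an arbitrary illustrative value.

```python
def forward_euler(f, y0, t0, dt, nsteps):
    """March y' = f(y, t) with y_{n+1} = y_n + dt*f(y_n, t_n), as in (2.7-5)."""
    y, t = y0, t0
    for _ in range(nsteps):
        y = y + dt * f(y, t)
        t += dt
    return y

def backward_euler(f, dfdy, y0, t0, dt, nsteps, tol=1e-12, max_iter=50):
    """March y' = f(y, t) with y_{n+1} = y_n + dt*f(y_{n+1}, t_{n+1}), as in (2.7-6).

    Each step solves g(z) = z - y_n - dt*f(z, t_{n+1}) = 0 by Newton iteration
    (an assumed solver choice); dfdy(y, t) is the user-supplied df/dy.
    """
    y, t = y0, t0
    for _ in range(nsteps):
        t_new = t + dt
        z = y                                  # initial guess: previous value
        for _ in range(max_iter):
            g = z - y - dt * f(z, t_new)
            dz = g / (1.0 - dt * dfdy(z, t_new))
            z -= dz
            if abs(dz) <= tol * (1.0 + abs(z)):
                break
        y, t = z, t_new
    return y

lam = 4.0                          # illustrative decay rate
f = lambda y, t: -lam * y          # the linear model problem y' = -lam*y
dfdy = lambda y, t: -lam
```

For the linear model problem the Newton iteration converges in a single pass and reproduces $y_{n+1} = y_n/(1 + \lambda\Delta t)$; for a non-linear $f$ the inner loop is where the extra cost of the implicit method appears.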
Before evaluating these two methods further (clearly, FE looks to be the 'winner' thus
far), let us show them graphically, starting at yn, tn in Figure 2.7-2. Note that FE
overshoots and BE undershoots; this is also true for a monotonically decreasing function—and
helps to understand (as we shall soon see) the inherent instability of the former and the
inherent stability of the latter. Even in the diagram, FE is much simpler than BE, the
latter requiring first finding the slope of y(t) at tn+\ and then 'translating' this
downward (in this case) until it passes through y(tn) to finally yield yn+\. And in truth it is
even worse yet because the curve y(t)—the exact solution of the ODE—is not available
to us! Thus, the BE sketch is more suggestive than real. For example, even the simple
linear equation, $\dot{y} = -\lambda y$, would yield, in going from $y(t_n)$ at $t_n$ to $t_{n+1} = t_n + \Delta t$,
$y_{n+1} = y(t_n)/(1 + \lambda\Delta t)$ for BE vs the true solution $y(t_{n+1}) = y(t_n)e^{-\lambda\Delta t}$; i.e., the true
slope at $t_{n+1}$ is $-\lambda y(t_{n+1}) = -\lambda y(t_n)e^{-\lambda\Delta t}$, whereas the BE slope is given by
$[y_{n+1} - y(t_n)]/\Delta t = -\lambda y(t_n)/(1 + \lambda\Delta t)$, differing from the true slope by $O(\Delta t^2)$ for $\lambda\Delta t$ small.
Fig. 2.7-2 Forward and backward Euler.
How do the two methods stack up against our list of attributes? Like so:
1,2. Both are self-starting and introduce no parasitic solutions.
3. Only BE qualifies here, and we see that the extra effort associated with the implicit
method can yield a dividend; if the ODE itself is stable (not going to $\infty$ at 'large' t),
then BE is also stable for arbitrary $\Delta t$ whereas FE is, at best, conditionally stable, and,
at worst, unstable (yielding solutions that grow without bound). For example, $\dot{y} = -\lambda y$
with $\lambda > 0$ is only stable via FE for $\Delta t < 2/\lambda$, and for another example, $\dot{y} = i\omega y$,
the FE method, simple though it is, is unstable for all $\Delta t$. [Proof: $\dot{y} = -\lambda y \Rightarrow y_{n+1} =
(1 - \lambda\Delta t)y_n \Rightarrow y_n = (1 - \lambda\Delta t)^n y_0$, and stability thus requires $|1 - \lambda\Delta t| < 1$ giving, for
$\lambda = \lambda_r + i\lambda_i$, $(1 - \lambda_r\Delta t)^2 + (\lambda_i\Delta t)^2 < 1$, or $\Delta t < 2\lambda_r/(\lambda_r^2 + \lambda_i^2)$. This same analysis for
BE, left as an exercise, gives stability when $\Delta t > -2\lambda_r/(\lambda_r^2 + \lambda_i^2)$, which is everywhere
in the complex $\lambda\Delta t$ plane except within the unit circle centered at $\lambda\Delta t = -1$; in particular,
it is stable for all ODE's with $\lambda_r \ge 0$. It is, in fact, so stable that it will drive to zero
(damped oscillation) the numerical solution of an ODE with an exponentially growing
solution ($\lambda_r < 0$) if $\Delta t$ is sufficiently large: $\lambda_r\Delta t < -2$ suffices. This is the most 'dissipative'
(and stable) method ever devised, and FE is the least stable.]
4. Here they both lose; BE will damp the solution to $\dot{y} = i\omega y$, $y = y_0 e^{i\omega t}$, eventually
getting to y = 0, and FE will blow up, thus suggesting (properly), with due respect to
Leonhard Euler, that perhaps better ODE methods should be sought. [FE yields
$y_n = (1 + i\omega\Delta t)^n y_0$ and $|y_n/y_0| = (1 + \omega^2\Delta t^2)^{n/2}$, and BE gives $y_n = y_0/(1 - i\omega\Delta t)^n$ and
$|y_n/y_0| = 1/(1 + \omega^2\Delta t^2)^{n/2}$.]
5. Only FE qualifies here, since solving non-linear equations is, by comparison with
simple time-marching, expensive.
6. Both are of minimal acceptable accuracy. As we will see below, they are 'first-order'
methods; for small $\Delta t$, $y_n - y(t_n) = O(\Delta t)$.
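The stability statements in items 3 and 4 reduce to the size of a single complex amplification factor per step, which is easy to tabulate. The sketch below assumes the model problem $\dot{y} = -\lambda y$ with complex $\lambda = \lambda_r + i\lambda_i$ (so pure advection is $\lambda = -i\omega$); the function names are ours and purely illustrative.

```python
def g_fe(lam, dt):
    """FE amplification factor for y' = -lam*y, i.e. y_{n+1} = g*y_n."""
    return 1.0 - lam * dt

def g_be(lam, dt):
    """BE amplification factor for y' = -lam*y."""
    return 1.0 / (1.0 + lam * dt)

def fe_dt_limit(lam):
    """FE stability limit 2*lam_r/|lam|^2; a value <= 0 means no stable dt exists."""
    return 2.0 * lam.real / abs(lam) ** 2
```

As a usage example, `abs(g_fe(-1j * omega, dt))` exceeds one for every positive `dt`, while `abs(g_be(-1j * omega, dt))` is always below one: FE amplifies and BE damps a solution that should neither grow nor decay.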
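The local truncation error (LTE), $d_n$, is defined as the residual in the ODE formula
when the exact solution is inserted. For FE, the LTE is determined via Taylor series
analysis of a single step (hence, 'local'), on the assumption that the exact solution is
available at the beginning of the step; we now derive it:
$$\begin{aligned}
d_n^{FE} &= y(t_n) + \Delta t\,\dot{y}(t_n) - y(t_{n+1}) \\
&= y_n + \Delta t\,\dot{y}_n - \left[ y_n + \Delta t\,\dot{y}_n + \frac{\Delta t^2}{2}\ddot{y}_n + \frac{\Delta t^3}{6}\dddot{y}_n + \cdots \right] \\
&= -\frac{\Delta t^2}{2}\ddot{y}_n + O(\Delta t^3),
\end{aligned} \tag{2.7-7}$$
and we see that in this case, the LTE is also the local error, $l_n$. For BE,
$$\begin{aligned}
d_n^{BE} &= y(t_n) + \Delta t\,\dot{y}(t_{n+1}) - y(t_{n+1}) \\
&= \left[ y(t_{n+1}) - \Delta t\,\dot{y}(t_{n+1}) + \frac{\Delta t^2}{2}\ddot{y}(t_{n+1}) + O(\Delta t^3) \right] + \Delta t\,\dot{y}(t_{n+1}) - y(t_{n+1}) \\
&= \frac{\Delta t^2}{2}\ddot{y}(t_{n+1}) + O(\Delta t^3) \\
&= \frac{\Delta t^2}{2}\ddot{y}_n + O(\Delta t^3),
\end{aligned} \tag{2.7-8}$$
the sign change being consistent with the sketches presented earlier. Also, the actual local
error, $l_n$, is not quite $d_n^{BE}$ because, in fact, we do not, as assumed in obtaining (2.7-8), have
the exact solution available at $t_{n+1}$ (only at $t_n$, by definition), but it is within $O(\Delta t^3)$
of $d_n^{BE}$.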
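The leading terms of the two truncation errors can be confirmed by inserting a known exact solution into the formulas. The sketch below assumes the test problem $\dot{y} = -\lambda y$, $y(t) = e^{-\lambda t}$, with an arbitrary illustrative value `lam = 2.0`; the helper names are ours.

```python
import math

lam = 2.0   # illustrative value only

def y_exact(t):
    return math.exp(-lam * t)

def ydot(t):
    return -lam * y_exact(t)

def yddot(t):
    return lam ** 2 * y_exact(t)

def lte_fe(tn, dt):
    """Residual of the FE formula with the exact solution inserted."""
    return y_exact(tn) + dt * ydot(tn) - y_exact(tn + dt)

def lte_be(tn, dt):
    """Residual of the BE formula with the exact solution inserted."""
    return y_exact(tn) + dt * ydot(tn + dt) - y_exact(tn + dt)
```

Comparing `lte_fe(tn, dt)` with `-0.5 * dt**2 * yddot(tn)` and `lte_be(tn, dt)` with `+0.5 * dt**2 * yddot(tn)` for a small `dt` shows the opposite signs, and halving `dt` reduces either residual by a factor close to four.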
So much for the two first-order ODE methods—for now, except to point out that the
average of the two (first-order) Euler methods might be second-order accurate—and this
is true. But it then goes by a different name—trapezoid rule (TR), a method that we shall
return to later (many times, in fact). We now move on to describe a few higher-order
methods, both explicit and implicit. Also, it is important to point out that we will provide
only a brief overview of (some) 'numerical' ODE theory; for basic, in-depth, authoritative
discussions, see, for example, Gear (1971), Shampine and Gordon (1975), Butcher (1987),
and Hairer et al. (1987).
2.7.1 Some Explicit ODE Methods
a. Second-order Adams-Bashforth (AB2), an 'explicit multi-step method'
Applied to (2.7-2), this method is
$$y_{n+1} = y_n + \frac{\Delta t}{2}(3\dot{y}_n - \dot{y}_{n-1}), \tag{2.7-9}$$
which is a two-step method (needing one extra 'history vector') with the following
properties:
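1. It is second-order accurate; the local error and the LTE are the same, and given by
$$\begin{aligned}
d_n &= y_n + \frac{\Delta t}{2}(3\dot{y}_n - \dot{y}_{n-1}) - y(t_{n+1}) \\
&= y_n + \frac{\Delta t}{2}\left[ 3\dot{y}_n - \left( \dot{y}_n - \Delta t\,\ddot{y}_n + \frac{\Delta t^2}{2}\dddot{y}_n + O(\Delta t^3) \right) \right] \\
&\quad - \left[ y_n + \Delta t\,\dot{y}_n + \frac{\Delta t^2}{2}\ddot{y}_n + \frac{\Delta t^3}{6}\dddot{y}_n + O(\Delta t^4) \right] \\
&= -\frac{5}{12}\Delta t^3\,\dddot{y}_n + O(\Delta t^4),
\end{aligned} \tag{2.7-10}$$
where we have used the exact solution at $t_{n-1}$ in order to invoke Taylor series.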
2. It has a startup problem (not self-starting). Step 1 ($n = 0$) is therefore typically
performed with FE, preferably with a smaller timestep.
3. It introduces a spurious/extraneous 'root,' derivable for $\dot{y} = -\lambda y$ by seeking a
solution to (2.7-9) of the form $y_n = y_0\xi^n$, which gives $\xi^{n+1} = \xi^n - (\lambda\Delta t/2)(3\xi^n - \xi^{n-1})$,
with solutions $\xi_\pm = \frac{1}{2}\left(1 - \frac{3}{2}\lambda\Delta t \pm \sqrt{1 - \lambda\Delta t + \frac{9}{4}\lambda^2\Delta t^2}\right)$, one of which ($\xi_+$) is
'physical' (approximates the ODE solution, $y(t) = y_0 e^{-\lambda t}$) and the other (the $-$ sign)
spurious/extraneous. (A k-step Adams-Bashforth method has $k - 1$ spurious solutions.) As
$\Delta t \to 0$, however, so does the spurious root (a property of the Adams family of methods;
all spurious roots $\to 0$ as $\Delta t \to 0$, as shown earlier in Figure 2.7-1, with $\xi$ replaced by $\sigma$).
But as $\Delta t$ is increased, so too does the magnitude of one or both of $\xi_+$ or $\xi_-$ until eventually one of them
'punches through' the unit circle, giving instability. Whether the physical or spurious
root goes unstable first depends on the relative magnitudes of $\omega$ and $\tau$ in (2.7-1), an
issue we shall return to when applying AB2 to advection-diffusion in Section 2.7.6. If
$\lambda = -i\omega$ (pure advection), then the physical root ($\xi_+$) is larger than 1 in magnitude for all $\Delta t > 0$; as
for explicit Euler, which is also AB1, stability is lost for purely oscillatory solutions. (For
$\lambda$ real, instability commences at $\lambda\Delta t = 1$, with $\xi_- = -1$.)
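The two AB2 roots, and the way they combine in an actual computation, can be sketched as follows. The closed form used is $\xi_\pm = \frac{1}{2}\left(1 - \frac{3}{2}a \pm \sqrt{1 - a + \frac{9}{4}a^2}\right)$ with $a = \lambda\Delta t$, which follows from inserting $y_n = y_0\xi^n$ into AB2 for $\dot{y} = -\lambda y$. The function names and the forward Euler startup with the same timestep are illustrative assumptions.

```python
import cmath

def ab2_roots(a):
    """Physical (xi_plus) and spurious (xi_minus) AB2 roots for a = lam*dt."""
    disc = cmath.sqrt(1.0 - a + 2.25 * a * a)
    return 0.5 * (1.0 - 1.5 * a + disc), 0.5 * (1.0 - 1.5 * a - disc)

def ab2_march(lam, dt, nsteps, y0=1.0):
    """AB2 on y' = -lam*y, using one forward Euler step for startup."""
    f_old = -lam * y0
    y = y0 + dt * f_old                       # step 1 (n = 0): FE
    for _ in range(nsteps - 1):
        f_new = -lam * y
        y = y + 0.5 * dt * (3.0 * f_new - f_old)
        f_old = f_new
    return y
```

The marched solution is exactly $A\xi_+^n + B\xi_-^n$, with $A$ and $B$ fixed by $y_0$ and the startup value $y_1$, so a poor startup step excites the spurious mode as well as the physical one.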
b. Third-order Adams-Bashforth (AB3), another 'explicit multi-step method'
Applied to (2.7-2), it is
$$y_{n+1} = y_n + \frac{\Delta t}{12}(23\dot{y}_n - 16\dot{y}_{n-1} + 5\dot{y}_{n-2}), \tag{2.7-11}$$
which requires one more history vector ($\dot{y}_{n-2}$). But it offers a key advantage over AB2
(in addition to being third-order accurate): it is not unconditionally unstable for pure
advection. It is conditionally stable; $\omega\Delta t \lesssim 0.724$ [Durran (1991), a recent paper that,
with what seem to be very good arguments, strongly advocates this method for pure
advection]. It, of course, has a startup problem, and it introduces two extraneous roots
(solutions), which can (as with AB2) cause stability as well as accuracy problems, as we
shall demonstrate in Section 2.7.6. Startup could be done with two FE steps or one FE
and one AB2 step—all at smaller At if similar accuracy is to be preserved.
The stability boundary for the first three AB methods is shown in Figure 2.7-3 (AB1
is forward Euler), taken from L-W. Ho (1989, Ph.D. Thesis), with permission. The curves
plot the real and imaginary portions of $\lambda\Delta t$ for the ODE $\dot{y} = \lambda y$ for complex $\lambda$ (note the
sign change which is conventional in the ODE literature), and each method is stable only
when $\lambda\Delta t$ lies on (the stability limit) or within the closed stability boundary. They are
obtained by seeking a solution of the form $y_n = y_0\xi^n$ and setting $\xi = e^{i\theta}$, which has $|\xi| = 1$.
For example, applied to explicit Euler, which we have already derived in a different
way, $y_{n+1} = (1 + \lambda\Delta t)y_n = \xi y_n = e^{i\theta}y_n$ or $\lambda\Delta t = e^{i\theta} - 1 = (\cos\theta - 1) + i\sin\theta$, which,
as $\theta$ ranges from 0 to $2\pi$, describes the curve labeled AB1. For other stability curves,
see, for example, Gear (1971) or Hairer and Wanner (1991). Note too the small regions
in the right half plane in which AB3 would drive to zero the numerical solution of an
unstable ODE; cf. the discussion of BE above.
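The boundary-locus construction just described extends directly to AB2 and AB3. The following sketch assumes NumPy and uses the ODE-literature convention $\dot{y} = \lambda y$, with $z = \lambda\Delta t$; the dictionary `AB_BETA` and the two function names are ours. It traces the boundary and, as an independent check, computes the largest characteristic root for a given $z$.

```python
import numpy as np

# Adams-Bashforth weights multiplying f_n, f_{n-1}, f_{n-2}.
AB_BETA = {1: [1.0], 2: [1.5, -0.5], 3: [23 / 12, -16 / 12, 5 / 12]}

def ab_boundary(order, theta):
    """Value of z = lam*dt for which xi = exp(i*theta) is a characteristic root."""
    xi = np.exp(1j * theta)
    k = order
    sigma = sum(b * xi ** (k - 1 - j) for j, b in enumerate(AB_BETA[order]))
    return (xi ** k - xi ** (k - 1)) / sigma

def ab_max_root(order, z):
    """Largest |xi| among the roots of xi^k - xi^(k-1) - z*sum(beta_j*xi^(k-1-j)) = 0."""
    coeffs = np.zeros(order + 1, dtype=complex)
    coeffs[0] = 1.0
    coeffs[1] = -1.0
    coeffs[1:] -= z * np.array(AB_BETA[order])
    return np.max(np.abs(np.roots(coeffs)))
```

For example, `ab_max_root(3, 0.70j)` does not exceed one while `ab_max_root(3, 0.75j)` does, consistent with the quoted AB3 limit near 0.724 on the imaginary axis, whereas `ab_max_root(2, 0.5j)` already exceeds one.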
Fig. 2.7-3 Stability regions for first-, second- and third-order Adams-Bashforth formulas.
(Reproduced by permission from L-W. Ho). [Plot not reproduced; the horizontal axis ticks run from -2.2 to 0.2 and the vertical axis ticks from -1.2 to 1.2.]
c. Runge-Kutta methods (RK2, 4)
These one-step (and thus self-starting with no spurious roots) multi-stage methods are
characterized by the fact that they involve evaluations of $f$ at intermediate values of
t in addition to those at the temporal mesh points, $t_n$. Since we have little experience
with them but since they are extremely popular ODE methods in general (especially
RK4, often called 'classical Runge-Kutta') and are reasonably popular in CFD (at least
for the advection terms), we shall summarize the lowest (second-order) member of the
family, which itself describes a 'family' (as do RK3, RK4, etc.) in that it comes with a
free parameter, $\gamma$, which may take on any value except zero (higher-order RK methods
involve more free parameters):
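$$\begin{aligned}
\text{Stage 1:}\quad & y_{n+1/2\gamma} = y_n + \Delta t\,\dot{y}_n/2\gamma, \\
\text{Stage 2:}\quad & y_{n+1} = y_n + \Delta t\,[(1 - \gamma)\dot{y}_n + \gamma\,\dot{y}_{n+1/2\gamma}],
\end{aligned} \tag{2.7-12}$$
where $\dot{y}_{n+1/2\gamma} = f(y_{n+1/2\gamma}, t_{n+1/2\gamma})$, with $t_{n+1/2\gamma} = t_n + \Delta t/2\gamma$. When $\gamma = 1$, this RK2 method also goes by the
name 'explicit midpoint rule', and for $\gamma = 1/2$, it has several aliases: modified Euler,
modified (explicit) trapezoid rule, or one of Heun's methods. For $\gamma = 3/4$, a certain
bound on the LTE is minimized; see Gear (1971) and/or Hairer et al. (1987) for details
and further discussion, and also for other RK methods except RK4, which we shall
later use (here $\dot{y}(\cdot)$ denotes $f$ evaluated at the indicated argument):
$$\begin{aligned}
\text{Stage 1:}\quad & y_1 = \Delta t\,\dot{y}_n, \\
\text{Stage 2:}\quad & y_2 = \Delta t\,\dot{y}(y_n + \tfrac{1}{2}y_1), \\
\text{Stage 3:}\quad & y_3 = \Delta t\,\dot{y}(y_n + \tfrac{1}{2}y_2), \\
\text{Stage 4:}\quad & y_4 = \Delta t\,\dot{y}(y_n + y_3), \\
& y_{n+1} = y_n + \tfrac{1}{6}(y_1 + 2y_2 + 2y_3 + y_4).
\end{aligned} \tag{2.7-13}$$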
A serious impediment to both explicit and implicit [not considered herein; see, for
example, Butcher (1987)] RK methods when one is interested in generating a variable-
timestep ODE method, is that the LTE's are difficult, if not impossible, to estimate.
(They are, however, good fixed-step integrators.) Finally, if $f(y, t) = i\omega y$ (advection),
RK2 is unstable; higher-order RK methods—especially RK4—are then conditionally
stable. See Hirsch (1988), Canuto et al. (1988a), or Vitchnevetsky and Bowles (1982)
for stability plots of the first four RK methods, and Hairer and Wanner (1991), for these
and many more.
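Both Runge-Kutta formulas are short enough to sketch directly. The code below is a minimal illustration: the function names are ours, the time arguments of the intermediate RK4 stages follow the usual classical-RK4 convention (the formula above leaves them implicit), and `omega = 1.0` is an arbitrary illustrative frequency for the pure-advection test $\dot{y} = i\omega y$.

```python
def rk2_step(f, y, t, dt, gamma=1.0):
    """One step of the RK2 family (2.7-12); gamma may be any nonzero value."""
    k1 = f(y, t)
    y_mid = y + dt * k1 / (2.0 * gamma)
    k2 = f(y_mid, t + dt / (2.0 * gamma))
    return y + dt * ((1.0 - gamma) * k1 + gamma * k2)

def rk4_step(f, y, t, dt):
    """One step of classical RK4 (2.7-13)."""
    y1 = dt * f(y, t)
    y2 = dt * f(y + 0.5 * y1, t + 0.5 * dt)
    y3 = dt * f(y + 0.5 * y2, t + 0.5 * dt)
    y4 = dt * f(y + y3, t + dt)
    return y + (y1 + 2.0 * y2 + 2.0 * y3 + y4) / 6.0

omega = 1.0                               # illustrative frequency
advect = lambda y, t: 1j * omega * y      # pure advection, y' = i*omega*y
```

Starting from $y = 1$, one RK2 step on the advection problem returns $1 + z + z^2/2$ with $z = i\omega\Delta t$ for every $\gamma$, whose magnitude $\sqrt{1 + (\omega\Delta t)^4/4}$ always exceeds one; the corresponding RK4 factor has magnitude below one for moderate $\omega\Delta t$, which is the conditional stability mentioned above.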
d. Leapfrog (another explicit midpoint rule)
We would surely be remiss in the eyes of many if we omitted this meteorologically
(and oceanographically) popular, simple two-step method of second-order accuracy—even
though it is sometimes not regarded as a 'legitimate' ODE 'method' by the experts. [It is
only 'weakly stable,' which often translates to (slightly) 'unstable' in practice.] It is
$$y_{n+1} = y_{n-1} + 2\Delta t\,\dot{y}_n, \tag{2.7-14}$$
which is not self-starting and displays one extraneous solution, often referred to as a
time-splitting 'instability' (even and odd timesteps tend to look 'de-coupled'). Seeking
a solution of the form $y_n = y_0\xi^n$ for $f = -\lambda y$ yields $\xi = -\lambda\Delta t \pm \sqrt{1 + \lambda^2\Delta t^2}$, which,
for $\lambda$ real, always has an unstable root; leapfrog is unconditionally unstable for
diffusion. If, though, $\lambda = -i\omega$, then the roots are $\xi = i\omega\Delta t \pm \sqrt{1 - \omega^2\Delta t^2}$, and the method
is both stable and properly non-dissipative when $\omega\Delta t < 1$ [$|\xi| = 1$; this is the feature
loved by meteorologists vis-a-vis virtually all other explicit methods; it also seems to be
popular in plasma physics; see Miller (1991).] For $\omega\Delta t \ge 1$, it is unstable; $\omega\Delta t = 1$ is
unstable even though $|\xi| = 1$ because the root is not simple: $y_n = (1 - in)i^n$. $\xi_+$ is the
physical root, and $\xi_-$ is the extraneous root. But since $\xi$ is unimodular (when stable),
it is more convenient to use $\xi = e^{i\theta}$ to obtain $\xi_+ = e^{i\theta}$, where $\theta = \sin^{-1}\omega\Delta t$ and
$\xi_- = -\sqrt{1 - \omega^2\Delta t^2} + i\omega\Delta t = (-1)(\sqrt{1 - \omega^2\Delta t^2} - i\omega\Delta t) = -e^{-i\theta}$. To see the time-splitting
(Lilly, 1965), we write the exact leapfrog solution to $\dot{y} = i\omega y$ as $y_n = a\xi_+^n + b\xi_-^n =
ae^{in\theta} + (-1)^n be^{-in\theta}$, where a and b are determined from $y_0$ and $y_1 = y_0 + i\omega\Delta t\,y_0$, which
comes from explicit Euler on the first step; the result is (for $y_0 = 1$ and $\omega\Delta t$ strictly less
than 1) $y_n = \frac{1}{2}(1 + 1/\sqrt{1 - \omega^2\Delta t^2})e^{in\theta} + (-1)^n \cdot \frac{1}{2}(1 - 1/\sqrt{1 - \omega^2\Delta t^2})e^{-in\theta}$, vis-a-vis
the exact solution, $y(t_n) = e^{i\omega n\Delta t}$, which also corrects a small error in Lilly's paper (he
had $\sqrt{1 - \omega^2\Delta t^2}$ rather than its inverse). If $\omega\Delta t$ is not 'sufficiently small,' then the even
and odd timesteps will differ noticeably because of the oscillating extraneous root—this
is 'time-splitting.' To see how meteorologists cope with this problem, via 'time-filtering,'
see, for example, Durran (1991).
The stability boundary, obtained by setting $|\xi| = 1$ ($\xi = e^{i\theta}$) in the leapfrog solution to
$\dot{y} = \lambda y$, is found to be $\lambda\Delta t = i\sin\theta$, which is merely a small piece of the imaginary axis.
If there is any diffusion at all [$0 < \tau < \infty$ in (2.7-1)], leapfrog is outside of its stability
boundary; the extraneous root will exceed one in magnitude. For further discussion of
leapfrog for non-linear equations, see Sanz-Serna (1985).
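Time-splitting is easy to reproduce. The sketch below marches leapfrog on $\dot{y} = i\omega y$ with $y_0 = 1$ and a forward Euler first step, and compares it with the two-mode closed form $y_n = ae^{in\theta} + (-1)^n be^{-in\theta}$, where $\theta = \sin^{-1}\omega\Delta t$, $a = \frac{1}{2}(1 + 1/\sqrt{1 - \omega^2\Delta t^2})$ and $b = \frac{1}{2}(1 - 1/\sqrt{1 - \omega^2\Delta t^2})$. The function names are ours, and the closed form assumes $\omega\Delta t$ strictly less than 1.

```python
import cmath
import math

def leapfrog_advection(omega, dt, nsteps):
    """Leapfrog on y' = i*omega*y, y_0 = 1, with a forward Euler first step.

    Returns the list [y_0, y_1, ..., y_nsteps].
    """
    ys = [1.0 + 0.0j, 1.0 + 1j * omega * dt]
    for n in range(1, nsteps):
        ys.append(ys[n - 1] + 2.0 * dt * 1j * omega * ys[n])
    return ys

def leapfrog_closed_form(omega, dt, n):
    """Physical plus extraneous mode; valid for omega*dt strictly less than 1."""
    theta = math.asin(omega * dt)
    c = math.sqrt(1.0 - (omega * dt) ** 2)
    a = 0.5 * (1.0 + 1.0 / c)
    b = 0.5 * (1.0 - 1.0 / c)
    return a * cmath.exp(1j * n * theta) + (-1) ** n * b * cmath.exp(-1j * n * theta)
```

Because the extraneous mode carries the factor $(-1)^n$, plotting the even-indexed and odd-indexed entries of `leapfrog_advection(omega, dt, nsteps)` separately shows two slightly different curves whenever $\omega\Delta t$ is not small.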
e. Rational Runge-Kutta (RRK)
We end our brief introduction to explicit methods with a particularly suspicious one—and
we include it only to steer the reader away from it because (and in spite of) several
recent references (Wambecq, 1978; Hairer, 1980; Liu and Zhang, 1984; Liu et al., 1984;
Pironneau, 1989) have advocated it (probably some at least without testing it). They are
non-linear RK methods that are unconditionally stable (hence the interest), of which we
will show just a two-stage, second-order method (from Hairer, 1980): on $\dot{y} = f(y)$, it is
$$y_{n+1} = y_n + \Delta t\,\dot{y}_n\,\frac{\dot{y}_n}{2\dot{y}_n - f\left(y_n + \frac{\Delta t}{2}\dot{y}_n\right)}, \tag{2.7-15}$$
where the two stages have been combined. While explicit and unconditionally stable, we
eschew it for the following reasons—all of which we learned from others:
1. In spite of Hairer's paper (1980), 'Unconditionally Stable Explicit Methods for Parabolic
Equations,' in which stiff ODE's were addressed, a follow-up paper by Sottas (1984)—in
which Hairer served as a 'consultant' (see Acknowledgements)—has the following title,
'Rational Runge-Kutta methods are not suitable for stiff systems of ODE's.' [We will
discuss 'stiffness' in more detail below—for now, just imagine that it is necessary to
solve a pair of ODE's like (2.7-1) simultaneously and that the ratio of the two $\tau$'s is
$\gg$ 1.]
2. In another paper whose title also tells the story, 'On the lack of convergence of
unconditionally stable rational Runge-Kutta schemes,' Pierce and Prevost (1986) state, 'This
paper establishes that the poor performance of the RRK scheme, which has been described
as a 'loss of accuracy' by Liu et al. (1984), is in fact due to a lack of convergence. Since
the method is unconditionally stable, the loss of accuracy does not manifest itself in a
blow-up of the solution or wild oscillations.'
3. At the end of a brief, unpublished analysis, A. Hindmarsh (1985, personal
communication) concluded that 'So it seems that the non-stiff components are not being approximated
even to order 1.'
4. In a short note on the subject, Fried (1984) states, 'Unconditionally stable (semi-)
explicit integration schemes for stiff systems of equations as generated by the finite
element analysis of elasto dynamics and non-stationary heat transfer are shown to suffer
from the Dufort-Frankel-Saulev syndrome, whereby coupling between the space and time
discretizations may have a ruinous effect on the accuracy of the computations.' For the heat
equation, he refers to the RRK method, and the 'syndrome' referred to is that consistency
as the finite element mesh is refined is obtained only if $\Delta t/\Delta x \to 0$ [cf. forward Euler
for the heat equation; here stability requires $\Delta t/\Delta x \le O(\Delta x)$ as $\Delta x \to 0$].
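To make formula (2.7-15) concrete (and not to recommend it), here is a minimal Python sketch for a scalar ODE. The names `rrk2_step` and `integrate` are ours, introduced only for illustration, and the guard for a vanishing $\dot{y}_n$ is an assumption we add to avoid a 0/0. On the linear test equation $\dot{y} = -\lambda y$ the update reduces to the trapezoid-rule factor $(1 + z/2)/(1 - z/2)$ with $z = -\lambda\Delta t$, which is why no blow-up is seen there; the difficulties cited above concern (stiff) systems and are not visible in this scalar check.

```python
import math


def rrk2_step(f, y, dt):
    """One step of the two-stage rational Runge-Kutta formula (2.7-15), scalar y."""
    ydot = f(y)
    if ydot == 0.0:
        # Assumption added for this sketch: at a steady state the rational
        # update is 0/0, so we simply leave y unchanged.
        return y
    denominator = 2.0 * ydot - f(y + 0.5 * dt * ydot)
    return y + dt * ydot * ydot / denominator


def integrate(f, y0, dt, nsteps):
    """Apply rrk2_step nsteps times starting from y0."""
    y = y0
    for _ in range(nsteps):
        y = rrk2_step(f, y, dt)
    return y


if __name__ == "__main__":
    lam = 50.0
    dt = 0.1
    z = -lam * dt
    # One step on ydot = -lam*y versus the trapezoid-rule amplification factor.
    print(rrk2_step(lambda y: -lam * y, 1.0, dt), (1 + z / 2) / (1 - z / 2))
    # Error at t = 1 for ydot = -y with two step sizes (second-order behavior).
    for h, n in ((0.1, 10), (0.05, 20)):
        print(h, abs(integrate(lambda y: -y, 1.0, h, n) - math.exp(-1.0)))
```

Halving the step in the last loop should reduce the error by roughly a factor of four, consistent with the second-order claim for this scalar linear case.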
2.7.2 Application to Advection-Diffusion (Scalar Transport)
We now address the interesting problem of solving the semi-discrete AD equations—the
ODE's—for transient situations. A very common occurrence in CFD is that wherein over
large portions of the computational domain the main job of the fluid is just to carry its
load (this is advection); i.e., $Pe \gg 1$, and the passive scalar 'simply' tags along with the
fluid (for the 'ride'), except near certain boundaries, where the diffusion process can
also be quite strong. But, over large portions of many flow domains, the scalar transport
equation is, effectively, $\partial T/\partial t + \mathbf{u} \cdot \nabla T = 0$. While the analytic solution of this equation
is extremely simple, the numerical solutions are very definitely not; indeed, the design
and analysis of numerical schemes for 'solving' the advection equation might well have
more person-hours invested in it than has any other linear and scalar PDE. The analytical
solution is simple (in concept, at least) because the equation just represents a Lagrangian,
or substantial, derivative; DT/Dt = 0 is a derivative following the motion, and thus T
does not change along a streamline. In fact, in a later section (Section 2.7.7), we shall
return to this concept; for now, however, we address the solution of these equations in
their Eulerian form.
a. Generalities
Switching from general (and uncoupled) ODE's to the specific (and coupled) GFEM
ODE's governing scalar transport, (2.2-7), we note immediately the obvious fact that
these ODE's 'look' implicit (coupled) in that even the 'simple' evaluation of $f(y, t)$ à la
explicit ODE methods goes over to: solve $M\dot{y} = f(y, t)$ for $\dot{y}$. Another fact is that three
of the simplest ODE methods discussed above, forward Euler (which is also the first-order
Adams-Bashforth method and the first-order Runge-Kutta method), AB2, and RK2, are
unconditionally unstable in the absence of diffusion (and, of course, in the absence of
an advection scheme that introduces diffusion, such as upwinding and its offshoots). But
if diffusion is present (and it usually is), then these schemes are at least conditionally
stable, and if one insisted upon the simplicity of an explicit scheme, one could easily
solve for the needed values of $\dot{y}$ using but two or three iterations of the diagonally scaled
conjugate gradient (DSCG) method (see Volume II, and Wathen, 1991). But, as will soon
be demonstrated, the conditional stability ($\Delta t$ must be less than some critical value,
$\Delta t_{crit}$) is often so restrictive as to require an inordinate number of timesteps to complete
the desired time integration. And then even the few DSCG iterations per timestep might
drive one to 'mass lumping,' which brings a pair of bonuses: (i) $\Delta t_{crit}$ is larger and
(ii) the 'implicit' ODE's are converted to explicit ones in that $M^{-1}f(y, t)$ is a cheap
operation. Indeed, for transient diffusion problems, or even diffusion-dominated (small
Peclet number) problems, mass lumping is often employed (when 'legitimate'; see below)
and is not very deleterious. But for advection-dominated flows, it usually is—as we will
soon show, even though the phase speed and group velocity results shown earlier should
suffice.
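The two options just described can be compared in a small Python sketch. It assumes a periodic 1D mesh of linear elements, for which the consistent mass matrix has the familiar $(h/6)(1, 4, 1)$ stencil; the function names (`consistent_mass_1d`, `dscg`, `lumped_solve`) are hypothetical, and the 'DSCG' here is simply conjugate gradients with Jacobi (diagonal) scaling, which is our reading of the term rather than a specific published implementation.

```python
import numpy as np


def consistent_mass_1d(n, h):
    """Consistent mass matrix for n linear elements on a periodic 1D mesh."""
    M = np.zeros((n, n))
    for j in range(n):
        M[j, j] += 4.0 * h / 6.0
        M[j, (j - 1) % n] += h / 6.0
        M[j, (j + 1) % n] += h / 6.0
    return M


def dscg(M, b, iterations):
    """Diagonally scaled (Jacobi-preconditioned) conjugate gradients, x0 = 0."""
    d_inv = 1.0 / np.diag(M)
    x = np.zeros_like(b)
    r = b.copy()
    z = d_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(iterations):
        Mp = M @ p
        step = rz / (p @ Mp)
        x += step * p
        r -= step * Mp
        z = d_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x


def lumped_solve(M, b):
    """Row-sum lumping: the 'solve' collapses to a division by the row sums."""
    return b / M.sum(axis=1)


rng = np.random.default_rng(0)
n = 50
M = consistent_mass_1d(n, 1.0 / n)
f = rng.standard_normal(n)

for its in (3, 10):
    residual = np.linalg.norm(M @ dscg(M, f, its) - f) / np.linalg.norm(f)
    print(f"DSCG, {its:2d} iterations: relative residual = {residual:.2e}")
```

Because the diagonally scaled matrix here has eigenvalues between 1/2 and 3/2, a handful of iterations already gives a small residual; `lumped_solve` is cheaper still, at the accuracy cost discussed in the text.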
b. Lumping the mass
Since many have done it in the past, and in spite of our aversion to it, we discuss briefly
some mass lumping methods that have been employed (recall—all are ad hoc), which
is an incomplete coverage at best. [For further information on this subject, see Hughes
(1987), Fried and Malkus (1975), Cook et al. (1989), and Zienkiewicz and Taylor (1991).]
One of the arguments, hopefully nearly passé, goes like a finite difference vs finite element
argument: 'The extra speed of my resulting algorithm will permit me to use rather more
elements than otherwise, thus compensating for loss of accuracy by a gain in mesh
density.'
Some mass lumping methods:
1. For the Lagrange family of quadrilateral elements, the procedure called 'row summing'
is effective; the rows of the consistent mass matrix are summed and the result placed on the
diagonal, with (of course) zeros placed in all other locations; i.e., $M^e_{ii} = \sum_j \int_e \phi_i\phi_j = \int_e \phi_i$,
since $\sum_j \phi_j = 1$.
2. There is no procedure available, to our knowledge, that is successful for the serendipity
elements (e.g., the eight-node element in 2D) when the advection terms are important
(high Pe) and the Galerkin method is employed—see Section 2.3.4. Not all elements are
susceptible to mass lumping.
3. For the linear triangle, the procedure developed by Winslow (1967) may be more
effective (S. Sackett, personal communication) than the commonly advocated row-sum
technique, which always places one third of the element mass at each node. Here, the mass
is distributed among the three nodes in proportion to the fraction of relevant area associated
with each node, this area being determined by forming the perpendicular bisectors from
each side. For right triangles, this leads to the placement of one-half of the mass at the
node with the right angle, and one-quarter at the other two. This proportion is also used
for obtuse triangles, wherein the three bisectors do not intersect inside the triangle.
4. For the quadratic, six-node triangle, the row-sum technique is completely inappropriate
using GFEM since the vertex nodes are assigned zero mass. Donea et al. (in Hughes,
1979) have suggested a modified, weighted residual method (Petrov-Galerkin) which,
upon row summing, places 4/15 of the total mass at each midside node and 1/15 at each
vertex.
5. Nodal quadrature is an approximate integration method that, by construction, gives
a diagonal mass matrix; see, for example, Fried and Malkus (1975), Gray (1977), and
Zienkiewicz and Taylor (1989, 1991). (Figure A8.2 of Vol. 1 and the figure on p. 321
of Vol. 2 have an error for the eight-node element result; 8/36 should be 8/38, and 1/36
should be 3/76. Also, the ninth node, with weight 16/36, should be added to the nine-node
element in the same figure, Figure A8.2.)
To our knowledge, none of the above lumping schemes for triangular elements has yet
been tested numerically on the advection-diffusion equation.
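The row-sum statements in items 3 and 4 above are easy to check, since the row sum for node $i$ is just $\int_e \phi_i$. The Python sketch below uses the standard exact formula for integrals of barycentric monomials over a triangle, $\int L_1^a L_2^b L_3^c = 2A\,a!\,b!\,c!/(a+b+c+2)!$; the helper names are ours, and the basis functions are the usual Lagrange ones for the three-node and six-node triangles.

```python
from math import factorial


def integrate_barycentric(a, b, c, area=1.0):
    """Exact integral of L1**a * L2**b * L3**c over a triangle of the given area."""
    return (2.0 * area * factorial(a) * factorial(b) * factorial(c)
            / factorial(a + b + c + 2))


def integrate_poly(terms, area=1.0):
    """Integrate sum(coef * L1**a * L2**b * L3**c), given as [(coef, (a, b, c)), ...]."""
    return sum(coef * integrate_barycentric(*powers, area=area)
               for coef, powers in terms)


# The row sum of the consistent mass matrix for node i equals the integral of
# phi_i, because the basis functions sum to one.
linear_vertex = [(1.0, (1, 0, 0))]                         # phi = L1
quadratic_vertex = [(2.0, (2, 0, 0)), (-1.0, (1, 0, 0))]   # phi = L1 (2 L1 - 1)
quadratic_midside = [(4.0, (1, 1, 0))]                     # phi = 4 L1 L2

row_sum_linear = integrate_poly(linear_vertex)
row_sum_quad_vertex = integrate_poly(quadratic_vertex)
row_sum_quad_midside = integrate_poly(quadratic_midside)

print("linear triangle, each node:   ", row_sum_linear)
print("six-node triangle, vertex:    ", row_sum_quad_vertex)
print("six-node triangle, midside:   ", row_sum_quad_midside)
```

For unit area this gives one third at each node of the linear triangle, and zero vertex mass with one third at each midside node for the six-node triangle, which is the defect of row summing noted in item 4.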
The use of LM (at least for the simplest elements) has an additional advantage with
respect to the stability-limited timestep size in the low Pe range: for diffusion-dominated
flows, the stability limit ($\Delta t_{max}$) of the explicit Euler scheme is, in 1D at least, three times
larger than that when CM is employed (for high Pe, both schemes have the same stability
limit, $\Delta t \le 2\kappa/u^2$, in 1D, a result we shall soon derive). Also, for $Pe \ll 1$, the LM
results are not necessarily less accurate than CM (phase error is less important), although
they could be misleading since they tend to suppress important wiggle signals (Gresho
and Lee, 1981).
All things considered, however, the LM method (even using more elements) can
sometimes be more cost-effective than CM when explicit time integration is employed.
c. Stability estimates and the case for implicit methods
The most common and the most useful method for analyzing stability of numerical
approximations to PDE's is the so-called von Neumann method, which is based on Fourier
analysis. Before presenting our first application of this method, let us point out that while
extremely important, it does have a few shortcomings—seemingly serious:
1. It requires a uniform mesh and periodic BC's (or, what is the same thing for Fourier
analysis, no BC's and an infinite spatial domain)—neither of which occurs much in
practice.
2. It is often difficult to apply in multi-dimensions (but then, so too are alternative
methods).
But it turns out that von Neumann stability results are always necessary conditions,
at least on a uniform mesh, regardless of BC's. Alternative methods of analysis also
exist, such as the matrix method (in which the eigenvalues of the appropriate matrix are
estimated and from which stability estimates obtain) and the energy method (in which
the 'energy,' $\frac{1}{2}y^TMy$ in the ODE system $M\dot{y} = f$, a positive-definite quantity, is to
remain bounded). We shall discuss these methods a little bit, but not as much as von
Neumann, which we now apply to the 1D AD equation discretized via linear elements
on a uniform grid with periodic BC's and constant coefficients (velocity, diffusivity)
[cf. (2.3-10)], and time integrated via forward Euler, and no source term (stability, or
otherwise, of numerical time-integration methods is independent of forcing functions):
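$$\frac{1}{6\Delta t}\left[(T^{n+1}_{j-1} - T^n_{j-1}) + 4(T^{n+1}_j - T^n_j) + (T^{n+1}_{j+1} - T^n_{j+1})\right] + \frac{u}{2h}(T^n_{j+1} - T^n_{j-1}) = \frac{\kappa}{h^2}(T^n_{j-1} - 2T^n_j + T^n_{j+1}), \tag{2.7-16}$$
where $T^n_j$ denotes the value of $T$ at node $j$ and time $t_n$.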
It is convenient at this stage to introduce two common dimensionless parameters,
$$c = u\Delta t/h \quad \text{and} \quad \alpha = 2\kappa\Delta t/h^2, \tag{2.7-17}$$
the first of which has a name and the second of which generally does not, although some
(e.g., Roache, 1982) call it the diffusion number or the diffusion parameter. $c$ is called the
Courant number [or CFL number, after the famous paper by Courant et al. (1928)] and
denotes the travel distance measured in mesh spacing during one timestep; c = 1 means
the fluid moves one grid length per timestep. Rearranging (2.7-16) and introducing $c$ and
$\alpha$ gives
$$\frac{1}{6}(T^{n+1}_{j-1} + 4T^{n+1}_j + T^{n+1}_{j+1}) = \frac{1}{6}(T^n_{j-1} + 4T^n_j + T^n_{j+1}) + \frac{\alpha}{2}(T^n_{j-1} - 2T^n_j + T^n_{j+1}) - \frac{c}{2}(T^n_{j+1} - T^n_{j-1}), \tag{2.7-18}$$
whose solution we seek in the form
$$T^n_j = \xi^n e^{ij\theta}, \tag{2.7-19}$$
where $\theta = kh$ is the dimensionless wave number [$k$ is the ('user-selected') actual wave
number of the 'sinusoidal' IC that is part and parcel of Fourier analysis], and $\xi$ is the
('von Neumann') amplification factor. If we find such a solution, it is clear, since $e^{ij\theta}$ is
unimodular, that the magnitude of $\xi$ relative to unity will determine whether the
numerical solution grows or decays. Since $\xi$ is generally complex, the von Neumann stability
criterion is
$$|\xi|^2 = \xi\bar{\xi} \le 1, \tag{2.7-20}$$
where the overbar denotes a complex conjugate.
Remarks:
(1) This is actually a modified von Neumann criterion, as the original criterion would
be a bit more generous, $|\xi| \le 1 + O(\Delta t)$, permitting growth of the numerical
solution even when the true (ODE) solution does not grow. Richtmyer and Morton
(1967) and Morton (1971—an excellent paper) call this the 'practical' stability
requirement—one that precludes the numerical solution from growing faster than the
true solution. We do, however, take issue with one of the points made by Morton in
that paper; he blames the trapezoid rule for phase error that is more properly ascribed
to the second-order, centered, spatial differencing employed to obtain the results in
his Figure 2—those for a Courant number of 0.25 and 0.50 being virtually the same
indicates that the 'inaccurate' ODE's were actually solved accurately by TR.
(2) The wave number selection should actually not be arbitrary; rather, because of
discrete Fourier analysis (or because of the periodic BC's) it should be a multiple of
$2\pi$, namely $2\pi n$, where $n$ is an integer; i.e., for a mesh with $N$ nodes ($h = 1/N = \Delta x$), the mesh
can only support a finite set ($N$) of wave numbers $n = n_0, n_0 + 1, \ldots, n_0 + N - 1$,
where $n_0$ is any integer. But it is common practice (and much easier) for the analysis
to regard $k$ as a continuous variable over the range 0 to $\pi/\Delta x$ (the '$2\Delta x$' wave), and
that is what we shall do. But since being 'easy' is not sufficient justification, we
also point out that for large $N$ at least, the results of the two analyses differ only by
$O(h^2)$; see Paolucci and Chenoweth (1982) and Hindmarsh et al. (1984).
(3) Similar von Neumann analyses can be done (but not as easily) for other explicit
methods, but the end result will usually be the same: conditional stability
(with different 'constants'). We shall perform some such analyses in a later
section—Section 2.7.6.
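Anyway, inserting (2.7-19) into (2.7-18) leads to the von Neumann amplification factor
equation,
$$\frac{1}{6}(e^{-i\theta} + 4 + e^{i\theta})\,\xi = \frac{1}{6}(e^{-i\theta} + 4 + e^{i\theta}) + \frac{\alpha}{2}(e^{-i\theta} - 2 + e^{i\theta}) - \frac{c}{2}(e^{i\theta} - e^{-i\theta}),$$
which can be simplified and rearranged to give the final result:
$$\xi = \frac{\frac{2 + \cos\theta}{3} - \alpha(1 - \cos\theta) - ic\sin\theta}{\frac{2 + \cos\theta}{3}}, \tag{2.7-21}$$
and thus
$$|\xi|^2 = \frac{\left(\frac{2 + \cos\theta}{3}\right)^2 - 2\alpha\left(\frac{2 + \cos\theta}{3}\right)(1 - \cos\theta) + \alpha^2(1 - \cos\theta)^2 + c^2(1 - \cos^2\theta)}{\left(\frac{2 + \cos\theta}{3}\right)^2}. \tag{2.7-22}$$
Requiring $|\xi|^2 \le 1$ leads easily to
$$c \le \frac{2P(1 - \cos\theta)(2 + \cos\theta)/3}{(1 - \cos\theta)^2 + P^2(1 - \cos^2\theta)}, \tag{2.7-23}$$
where
$$P \equiv u\Delta x/2\kappa \tag{2.7-24}$$
is called the grid Peclet number. (Note that $\alpha P = c$.) This is our 'CFL' stability limit in
terms of the arbitrarily variable dimensionless wave number, $\theta$. If $\theta \ne 0$, then it can be
simplified to
$$c \le \frac{2P(2 + \cos\theta)/3}{1 - \cos\theta + P^2(1 + \cos\theta)} \equiv f(P, \theta). \tag{2.7-25}$$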
Studying the function $f(P, \theta)$ over the range $0 < \theta \le \pi$ and all $P \ge 0$, or plotting it as
a function of $P$ with $\theta$ as a parameter as in Figure 2.7-4 (all curves passing through
$f = 1/\sqrt{3}$ at $P = \sqrt{3}$), yields the desired stability results. The key point, and the final
result of this stability analysis, is that the two curves defined by $\theta \to 0$ and $\theta = \pi$ are,
for $P > \sqrt{3}$ and $P < \sqrt{3}$, respectively, the lowest of all $f(P, \theta)$. Thus, the stability limits
are the following:
(i) $c \le P/3$ for $P < \sqrt{3}$, (2.7-26)
(ii) $c \le 1/P$ for $P > \sqrt{3}$, (2.7-27)
and we make the following
Remarks:
(1) The diffusion limit, $c \le P/3$, translates to $\Delta t \le \Delta x^2/6\kappa$, and the $\theta = \pi$ boundary
curve suggests (properly) that it is the ubiquitous $2\Delta x$ wave that goes unstable
first. That pure diffusion requires $\Delta t \to 0$ as $\Delta x \to 0$ can perhaps be better
appreciated if FE is attempted on the heat equation with the spatially continuous
Laplacian, $(T^{n+1} - T^n)/\Delta t = \kappa\nabla^2 T^n$, which, at least up to BC's, implies that
$T^{n+1} = (I + \Delta t\,\kappa\nabla^2)^{n+1}T^0$, which 'blows up' for all $\Delta t > 0$ because $(I + \Delta t\,\kappa\nabla^2)$
is an unbounded operator.
[Fig. 2.7-4 Forward Euler stability diagram: $f(P, \theta)$ versus grid Peclet number $P$.]
(2) The pure advection limit ($P \to \infty$) is unstable for all $\Delta t$, a reflection of the fact
that the explicit Euler method is unconditionally unstable for ODE's with purely
imaginary eigenvalues. That it is bounded by the $\theta \to 0$ curve suggests (again
properly) that the longest waves are the most unstable. This 'advection-diffusion'
forward Euler stability limit ($c \le 1/P$ or $\Delta t \le 2\kappa/u^2$) also applies to higher-order
centered difference approximations, and presumably (probably) to higher-order
GFEM's, since it is basically an 'asymptotic' result ($\theta \to 0$ is the most unstable
case); i.e., it even applies to explicit Euler with exact spatial representation, in the
advection-dominated limit.
(3) If the mass is lumped, then the $(2 + \cos\theta)/3$ factor in the above equations is replaced
by unity, as is the factor of three in the final stability results. That is, the consistent
mass matrix reduces (only) the diffusional stability limit by a factor of three.
(4) The lumped mass result is of course also the second-order, centered, finite-difference
result for which erroneous stability results have appeared in the literature that
asserted instability if P > 2, regardless of At; see Hindmarsh et al. (1984) and
Hirsch (1988) for details—and see Thompson et al. (1985), whose title tells all:
'The cell Reynolds number myth.'
(5) The advection-dominated stability limit, $c \le 1/P$, or, equivalently, $\Delta t \le 2\kappa/u^2$, is
particularly distressing when $P \gg 1$; in the next section we will show how to
improve this to the much more tolerable limit, $c \le 1$. It is interesting, even if not
too 'relevant,' that the FE stability limit is as 'non-physical,' non-intuitive, as was
the advective-diffusive time scale discussed in Section 2.6.2g. See too E and Liu
(1996) on this issue.
(6) Recalling the discussion following (2.4-1), in which the time constant for the ODE
at the outflow boundary is (for the LM version, for simplicity) $\tau = (2\kappa/\Delta x^2 + u/\Delta x)^{-1}$,
and recalling the stability result for FE of $\Delta t \le 2\tau$, gives (approximately,
since the outflow ODE is actually coupled to all of the others) $c \le {\sim}2P/(1 + P)$,
which is less restrictive than the LM version of (2.7-26), $c \le P$; i.e., the OBC ODE
appears not to further reduce the allowable $\Delta t$ for explicit methods even though the
time constant tends to zero with mesh refinement.
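The limits (2.7-26) and (2.7-27), and the lumped-mass variant of Remark (3), can be checked by brute force: evaluate the amplification factor (2.7-21) on a grid of $\theta$ and test whether its magnitude ever exceeds one. The Python sketch below does this; the function names, the grid resolution, and the small tolerance are our own choices, not part of the analysis.

```python
import numpy as np


def amplification_factor(c, P, theta, lumped=False):
    """Forward Euler amplification factor (2.7-21).

    With lumped=True the mass factor (2 + cos(theta))/3 is replaced by unity.
    """
    alpha = c / P  # because alpha * P = c
    if lumped:
        mass = np.ones_like(theta)
    else:
        mass = (2.0 + np.cos(theta)) / 3.0
    return (mass - alpha * (1.0 - np.cos(theta)) - 1j * c * np.sin(theta)) / mass


def is_stable(c, P, lumped=False, n_theta=4001, tol=1e-12):
    """Brute-force von Neumann test: max |xi| <= 1 over 0 <= theta <= pi."""
    theta = np.linspace(0.0, np.pi, n_theta)
    xi = amplification_factor(c, P, theta, lumped)
    return bool(np.max(np.abs(xi)) <= 1.0 + tol)


def critical_courant(P, lumped=False):
    """Limits (2.7-26) and (2.7-27); lumped mass drops the factor of three."""
    if lumped:
        return min(P, 1.0 / P)
    return min(P / 3.0, 1.0 / P)


for lumped in (False, True):
    for P in (0.5, 5.0):
        crit = critical_courant(P, lumped)
        print(f"lumped={lumped!s:5} P={P:4.1f} c_crit={crit:.4f} "
              f"stable just below: {is_stable(0.95 * crit, P, lumped)}, "
              f"just above: {is_stable(1.05 * crit, P, lumped)}")
```

In each case the scan should report stability slightly below the quoted limit and instability slightly above it, with the $2\Delta x$ wave responsible at small $P$ and the long waves at large $P$.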
The multi-dimensional analog of the above stability analysis, while vitally important,
has not yet been successfully completed. We [PMG, with A. Hindmarsh and D. Griffiths;
see Hindmarsh et al. (1984)] have tried but failed—for the GFEM case (except for some
special cases) with bilinear basis functions. But we did succeed for the lumped mass case
and the finite difference (five-point) Laplacian, for which the following rigorous results
were obtained: the FTCS (forward-time-centered space) method is stable provided (if and
only if)
$$\text{(i)}\quad \Delta t \le \frac{1}{2\kappa\sum_{j=1}^{n_s}(1/\Delta x_j)^2} \quad \text{or} \quad \sum_{j=1}^{n_s}\alpha_j \le 1 \tag{2.7-28}$$
and
$$\text{(ii)}\quad \Delta t \le \frac{2\kappa}{\sum_{j=1}^{n_s}u_j^2}, \tag{2.7-29}$$
where $n_s$ is the spatial dimensionality (1, 2, or 3). Note the obvious similarity to the
easy-to-obtain 1D results. It follows that (2.7-28) prevails when all $P_j$ (grid Peclet number in
the $j$-th direction) are $< 1$ and (2.7-29) when all are $> 1$.
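As a convenience, conditions (2.7-28) and (2.7-29) can be wrapped in a few lines of Python. This is only a sketch for the lumped-mass/FTCS case to which those results apply, with constant $\kappa$ and a uniform mesh assumed; the function names are ours.

```python
def ftcs_dt_limits(kappa, dx, u):
    """Return the (diffusive, advective-diffusive) limits (2.7-28) and (2.7-29).

    kappa : scalar diffusivity
    dx    : sequence of mesh spacings, one per space dimension
    u     : sequence of velocity components, one per space dimension
    """
    dt_diffusive = 1.0 / (2.0 * kappa * sum(1.0 / h**2 for h in dx))
    speed_squared = sum(v * v for v in u)
    if speed_squared == 0.0:
        dt_advective = float("inf")  # no advection, so (2.7-29) is inactive
    else:
        dt_advective = 2.0 * kappa / speed_squared
    return dt_diffusive, dt_advective


def ftcs_dt_max(kappa, dx, u):
    """Both conditions must hold, so the usable timestep is the smaller one."""
    return min(ftcs_dt_limits(kappa, dx, u))


print(ftcs_dt_limits(1.0, [0.1], [1.0]))             # 1D, diffusion controls
print(ftcs_dt_limits(0.01, [0.1, 0.1], [3.0, 4.0]))  # 2D, advection-diffusion controls
```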
The real bottom line here (which must generalize to the true GFEM case even though
the details are not yet available) is this: there is a diffusional stability limit on $\Delta t$ that
becomes severe for small elements, and there is an 'advection-diffusion' stability limit
that becomes severe when grid Peclet numbers are much larger than unity. For some
recent, analogous results when leapfrog time integration is used (for a variety of spatial
FDM's), see Kwok and Tam (1993).
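For the record (and/or the challenged reader), the stability condition for
$\partial T/\partial t + u\,\partial T/\partial x + v\,\partial T/\partial y = \kappa_1\,\partial^2T/\partial x^2 + 2\kappa_{12}\,\partial^2T/\partial x\partial y + \kappa_2\,\partial^2T/\partial y^2$
for bilinear elements on a mesh of uniform rectangles, which has resisted our efforts to 'solve,' is:
$$[1 - g_1(\theta_2)(1 - \cos\theta_1) - g_2(\theta_1)(1 - \cos\theta_2) - \alpha_{12}\sin\theta_1\sin\theta_2]^2 + [h_1(\theta_2)\sin\theta_1 + h_2(\theta_1)\sin\theta_2]^2 \le 1,$$
where $\theta_i = k_i\Delta x_i$, $g_i(\theta_j) = \alpha_i(2 + \cos\theta_j)/3$, where $\alpha_i = 2\kappa_i\Delta t/\Delta x_i^2$ and $\alpha_{12} = 2\kappa_{12}\Delta t/\Delta x\Delta y$, and
$h_i(\theta_j) = \beta_i(2 + \cos\theta_j)/3$, where $\beta_i = u_i\Delta t/\Delta x_i$, and $|\theta_i| \le \pi$.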
We now (somewhat boldly) employ some of these stability results to two hypothetical,
but hopefully practical, cases, using a graded mesh (and even variable velocities), simply
in order to show (or at least suggest) that explicit methods can sometimes be too expensive.
In general, this situation can occur when the stability $\Delta t$ is very small relative to that
required for an accurate solution and/or to the total integration time required.
1. Pure diffusion (Pe = 0). Consider the transient heat equation (even in ID) and a step
change (or other rapid variation) in boundary temperature on a mesh that is highly graded
(very small elements near the boundary) in order to obtain an accurate result near this
boundary. If L is the characteristic length and the solution is required for the full transient,
then the total number of timesteps will be $\sim \tau/\Delta t$, where $\tau \sim L^2/\kappa$ is the longest 'time
constant.' Since $\Delta t \sim \Delta x_{min}^2/\kappa$, we have $\tau/\Delta t \sim (L/\Delta x_{min})^2$; a finely graded mesh with
fewer than even ${\sim}100$ nodes in the 'x-direction' could easily result in $\Delta x_{min}/L < 0.001$,
giving $\tau/\Delta t > 10^6$. [$(L/\Delta x_{min})^2$ in this case is a measure of the 'stiffness' of the problem.]
2. Flow past an obstacle. Consider steady flow around an obstacle such as a cylinder.
At t = 0, a temperature anomaly, or hot-spot, is introduced into the otherwise isothermal
flow field at the inlet and approximately on the 'axis' (i.e., it is transported toward the
obstacle); it is desired to find how much heat is transferred to the (isothermal, say) obstacle
and the downstream temperature distribution for two cases: low (0.01) and high ($10^4$)
Peclet numbers, $UD/\kappa$, where $D = O(1)$ is the 'size' of the obstacle. Since an accurate
solution is desired, a graded mesh is again employed such that $\Delta x_{min}/D \sim 0.001$ (near the
obstacle) and $\Delta x_{max}/D \sim 1$ (far downstream). Consider a total domain length of ${\sim}10D$
and a nominal (characteristic) velocity of $O(1)$; thus, the total mesh transport time, $\tau$, is
${\sim}10$. Using the 1D stability results as a first approximation, we have $P = (\Delta x/2D)\,Pe$,
giving $P_{min} = 0.0005\,Pe$.
For $Pe = 0.01$, the diffusion limit gives
$$\Delta t = \frac{\Delta x_{min}^2}{2\kappa} = \frac{1}{2}\times 10^{-6}\,D^2/\kappa = \frac{1}{2}\times 10^{-6}\,\frac{D}{U}\,Pe = 0.5\times 10^{-8},$$
and ${\sim}10^9$ steps would be required for the simulation. On the other hand, for $Pe = 10^4$,
the advection-diffusion limit (using $\Delta x_{min}$) gives $\Delta t \le (2/Pe)(D/U) = 2\times 10^{-4}$, which
is controlling. In this case, 'only' $10^5$ or so timesteps would be required.
In general, a strongly graded mesh (often required for optimum spatial accuracy) and a
low Pe lead to expensive explicit calculations. On the other hand, a rather uniform mesh
and a moderate or high Pe may not be so bad; i.e., explicit methods are more affordable
(and can be more cost-effective) for certain 'hyperbolic' cases.
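The arithmetic of the flow-past-an-obstacle example is easily reproduced. The Python sketch below uses the 1D lumped-mass limits, $\Delta t \le \Delta x_{min}^2/2\kappa$ and $\Delta t \le 2\kappa/U^2$, with the nominal values from the text ($D = U = 1$, $\Delta x_{min}/D = 0.001$, total time 10) as defaults; the function name and its signature are ours.

```python
def explicit_step_count(peclet, dx_min_over_D=1.0e-3, D=1.0, U=1.0, total_time=10.0):
    """Estimate the stable forward Euler timestep and the number of steps.

    Uses the 1D lumped-mass results as a first approximation:
    diffusion limit dx_min**2 / (2 kappa) and advection-diffusion limit 2 kappa / U**2.
    """
    kappa = U * D / peclet
    dx_min = dx_min_over_D * D
    dt_diffusion = dx_min**2 / (2.0 * kappa)
    dt_advection_diffusion = 2.0 * kappa / U**2
    dt = min(dt_diffusion, dt_advection_diffusion)
    return dt, total_time / dt


for pe in (0.01, 1.0e4):
    dt, steps = explicit_step_count(pe)
    print(f"Pe = {pe:g}: dt = {dt:.2e}, steps = {steps:.1e}")
```

With these inputs the low-Pe case is controlled by the diffusion limit and the high-Pe case by the advection-diffusion limit, reproducing the two step counts quoted above to within the factor-of-two looseness of the '~' estimates.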
d. Matrix method of stability analysis
An alternate method of stability analysis that is sometimes useful—partly because BC's
other than periodic can, in principle, be easily accommodated—is based on estimating
either the spectral radius or an appropriate norm of the matrix that relates the solution
from time level $t_n$ to that at $t_{n+1}$. We provide a bare introduction to this topic in this
small section. For further details, see, for example, the book by Hirsch (1988), and the
following papers: Hindmarsh et al. (1984), Griffiths et al. (1980), and Morton (1980),
which contains the following important remark: 'Unfortunately, most of the analysis was
based on the so-called matrix method, and an associated concept of stability which is
misleading both in theory and practice for such problems.'
The 'matrix method' begins with the system of ODE's, such as (2.2-7), and applies
the ODE method of interest (say FE, for simplicity) to obtain (for constant forcing)
$$MT_{n+1} = \{M - \Delta t[N(u) + K]\}T_n + \Delta t\,f \tag{2.7-30}$$
and thus (see Hindmarsh et al., 1984)
$$T_n = E^nT_0 - (E^n - I)[N(u) + K]^{-1}f, \tag{2.7-31}$$
where
$$E = I - \Delta t\,M^{-1}[N(u) + K] \tag{2.7-32}$$
is the 'amplification' matrix. Clearly both stability and the attainment of a steady
solution require, for $n \to \infty$, $E^n \to 0$ (the zero matrix). This in turn requires $\|E\| < 1$ for
some (induced) matrix norm, because $\|E\| < 1 \Rightarrow \|E^n\| < 1$, which then $\Rightarrow \|E^nT_0\| \le \|E^n\|\,\|T_0\| < \|T_0\|$.
But since the norm of $E$ is usually very difficult to estimate relative
to the spectral radius, $\rho(E) = |\lambda_{max}(E)|$, and since $\rho(E) \le \|E\|$, the matrix method of
stability analysis is often/usually simply taken to be
$$\rho(E) < 1 \tag{2.7-33}$$
or, if $|\lambda_{max}(E)|$ is a simple eigenvalue, $\rho(E) \le 1$. Some authors use $\rho(E) \le 1$ as the
stability limit even when there are repeated eigenvalues of unit modulus; see Hindmarsh
et al. (1984) to see what sort of trouble this more lax definition can cause.
Now, $\rho(E) < 1$ does guarantee that $E^n \to 0$ as $n \to \infty$ for fixed $N$ (ODE size) and
fixed $\Delta t$. Thus, ultimately the satisfaction of (2.7-33) will assure stability. But what it does
not assure is that $\|E\| < 1$ nor $\|E^n\| < 1$, the violation of which, even with $\rho(E) < 1$,
can cause large growth in $\|E^n\|$ before it peaks and turns around, even in the cases in
which there are no repeated eigenvalues. Just such behavior was in fact demonstrated by
Griffiths et al. (1980) and Hindmarsh et al. (1984) for a case in which [while $\rho(E) < 1$]
the von Neumann stability analysis predicted instability. What happens in the computer is
this: if you have $\rho(E) < 1$ but are unstable according to von Neumann, then the solution
magnitude can become exceedingly large (say $10^{10}$) before finally turning around and
decaying. Thus, the matrix method 'wins' ultimately, but certainly not in practice. See
too the discussion of non-normal matrices at the end of Section 2.6.2d, to which such
behaviour is closely related.
For these reasons, we generally suggest that stability analyses be performed, wherever
possible, using the von Neumann method: $|\xi| \le 1$.
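The disagreement between the two criteria can be seen in a small experiment. The Python sketch below builds the lumped-mass forward Euler amplification matrix for the 1D AD equation with $T = 0$ imposed at both ends (our choice of BC's, not the case studied in the references cited), using $c = 0.9$ and $P = 2$, which violates the von Neumann limit $c \le 1/P$. Since $E$ is then a tridiagonal Toeplitz matrix, its eigenvalues are taken from the standard closed form $a + 2\sqrt{bc'}\cos[k\pi/(n+1)]$ rather than from a numerical eigensolver, whose output for such a non-normal matrix may be unreliable. All names and parameter values are illustrative assumptions.

```python
import numpy as np


def ftcs_matrix(n, c, alpha):
    """Lumped-mass forward Euler amplification matrix with T = 0 at both ends."""
    lower = 0.5 * (alpha + c)  # coefficient multiplying T[j-1]
    upper = 0.5 * (alpha - c)  # coefficient multiplying T[j+1]
    E = (1.0 - alpha) * np.eye(n) + lower * np.eye(n, k=-1) + upper * np.eye(n, k=1)
    return E, lower, upper


n, c, P = 120, 0.9, 2.0
alpha = c / P
E, lower, upper = ftcs_matrix(n, c, alpha)

# Exact eigenvalues of a tridiagonal Toeplitz matrix (closed form).
k = np.arange(1, n + 1)
eigenvalues = (1.0 - alpha) + 2.0 * np.sqrt(complex(lower * upper)) * np.cos(k * np.pi / (n + 1))
rho = float(np.max(np.abs(eigenvalues)))

# Largest von Neumann amplification factor for the same scheme (periodic analysis).
theta = np.linspace(0.0, np.pi, 2001)
xi = 1.0 - alpha * (1.0 - np.cos(theta)) - 1j * c * np.sin(theta)
xi_max = float(np.max(np.abs(xi)))

# Track the 2-norm of E**m over the first few hundred steps.
power = np.eye(n)
peak = 0.0
for _ in range(300):
    power = E @ power
    peak = max(peak, float(np.linalg.norm(power, 2)))

print(f"spectral radius rho(E)  = {rho:.4f}")
print(f"max von Neumann |xi|    = {xi_max:.4f}")
print(f"largest ||E^m||, m<=300 = {peak:.3e}")
```

Here the spectral radius is below one while the von Neumann factor exceeds one, and the printed peak norm shows how much transient growth the matrix criterion fails to rule out.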
e. Balancing tensor diffusivity (BTD)
If one insists on lumping the mass and using explicit time integration (à la many
FDM's), one would probably also at least consider (or start with) the simplest
method, forward Euler (FE), as we did above. In this section we present a forward
Euler method that is modified in such a way that both accuracy (usually) and stability
(always) are increased, a rare occurrence in CFD.
To motivate the derivation and to obtain the result in the simplest manner, we revert to
the pure advection equation for which FE is unconditionally unstable (the $P \to \infty$ limit in
Figure 2.7-3) and ask the question: Can we (slightly) modify (perturb) the spatial operator
(i.e., the problem) in such a way that FE will be both stable (at least conditionally) and
sufficiently accurate? The answer turns out to be 'yes,' and it is obtained as follows:
rather than trying to integrate
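$$\partial T/\partial t = -\mathbf{u}\cdot\nabla T \equiv LT \tag{2.7-34}$$
with FE (which is fruitless unless we 'upwind'), let us integrate
$$\partial T/\partial t = \tilde{L}T \tag{2.7-35}$$
with FE, where $\tilde{L}$ approximates $L$ in an appropriate sense. We first use Taylor series on
the exact solution as follows: starting at $t_n$,
$$T(t_{n+1}) = T^n + \Delta t\left.\frac{\partial T}{\partial t}\right|_n + \frac{\Delta t^2}{2}\left.\frac{\partial^2T}{\partial t^2}\right|_n + O(\Delta t^3) \tag{2.7-36}$$
$$= T^n + \Delta t\,LT\big|_n + \frac{\Delta t^2}{2}\left[L^2T + \frac{\partial L}{\partial t}T\right]_n + O(\Delta t^3), \tag{2.7-37}$$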
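using (2.7-34) with $\mathbf{u} = \mathbf{u}(\mathbf{x}, t)$. Next, apply FE to (2.7-35), starting from the same place:
$$T^{n+1} = T^n + \Delta t\,\tilde{L}T\big|_n. \tag{2.7-38}$$
We find our modified operator [to $O(\Delta t^2)$] by equating $T^{n+1}$ to $T(t_{n+1})$:
$$\tilde{L} = L + \frac{\Delta t}{2}\left(L^2 + \frac{\partial L}{\partial t}\right), \tag{2.7-39}$$
and we do find an operator that is close to $L$. That is, integrating (2.7-35) via FE will,
to $O(\Delta t^2)$, agree with the exact solution of (2.7-34). Now use $L^2 = (\mathbf{u}\cdot\nabla)(\mathbf{u}\cdot\nabla)$ and
$\partial L/\partial t = -(\partial\mathbf{u}/\partial t)\cdot\nabla$ to obtain $\tilde{L} = -\mathbf{u}\cdot\nabla + (\Delta t/2)[(\mathbf{u}\cdot\nabla)^2 - (\partial\mathbf{u}/\partial t)\cdot\nabla]$, and to make
further progress, we note (i.e., we leave as an exercise for the reader) that
$\nabla\cdot(\mathbf{u}\mathbf{u}\cdot\nabla) = (\mathbf{u}\cdot\nabla)^2 + (\nabla\cdot\mathbf{u})\,\mathbf{u}\cdot\nabla = (\mathbf{u}\cdot\nabla)^2$ for our incompressible flow field. Thus,
$$\tilde{L} = -\left(\mathbf{u} + \frac{\Delta t}{2}\frac{\partial\mathbf{u}}{\partial t}\right)\cdot\nabla + \frac{\Delta t}{2}\nabla\cdot(\mathbf{u}\mathbf{u}\cdot\nabla) \tag{2.7-40}$$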
is, we assert, an appropriate modified operator wherein we hasten to add that it is
the diffusion-like term ($u^2\Delta t$ has units of a diffusion coefficient) that is important.
(Indeed, limited testing with a time-dependent velocity field suggested to us that the
extra work associated with the $\partial\mathbf{u}/\partial t$ term is not worth the extra effort.) In fact, our
final modified operator omits the acceleration term [thus losing some of the
theoretical closeness, $O(\Delta t^2)$, to the exact solution when $\mathbf{u}$ is time-dependent] so that the
associated, modified advection equation reads
$$\partial T/\partial t + \mathbf{u}\cdot\nabla T = \frac{\Delta t}{2}\nabla\cdot(\mathbf{u}\mathbf{u}\cdot\nabla T), \tag{2.7-41}$$
which is to be integrated via FE, which we now examine for the simplest case; i.e., 1D
lumped linears with constant velocity. Thus, (2.7-41) becomes
$$\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} = \frac{u^2\Delta t}{2}\frac{\partial^2T}{\partial x^2}, \tag{2.7-42}$$
and another application of 'von Neumann' [cf. (2.7-16) through (2.7-20)] to
$$\frac{T^{n+1}_j - T^n_j}{\Delta t} + \frac{u}{2h}(T^n_{j+1} - T^n_{j-1}) = \frac{u^2\Delta t}{2}\,\frac{T^n_{j-1} - 2T^n_j + T^n_{j+1}}{h^2} \tag{2.7-43}$$
gives
$$\xi = 1 - c^2(1 - \cos\theta) - ic\sin\theta, \tag{2.7-44}$$
$$|\xi|^2 = 1 - c^2(1 - c^2)(1 - \cos\theta)^2, \tag{2.7-45}$$
and it is not too difficult to find that the stability limit occurs at $\theta = \pi$ (the $2\Delta x$ wave)
with the result
$$c \le 1 \tag{2.7-46}$$
for stability. Also, for $h \to 0$ (and $c \le 1$), $|\xi| \to 1 - u^2k^4\Delta t^2h^2/8$, which is called (owing
to $k^4$) 'dissipative of fourth order in the sense of Kreiss' (Richtmyer and Morton, 1967).
Thus we have stabilized forward Euler for pure advection by changing (with justification)
the equation—at least for this special case.
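A quick numerical check of (2.7-44) through (2.7-46) is sketched below in Python: it evaluates the BTD amplification factor on a grid of $\theta$, compares $|\xi|^2$ with the closed form (2.7-45), and contrasts the result with unmodified forward Euler on pure advection. The function names and sample Courant numbers are our own.

```python
import numpy as np


def btd_amplification(c, theta):
    """Amplification factor (2.7-44): forward Euler with the BTD term, lumped linears."""
    return 1.0 - c**2 * (1.0 - np.cos(theta)) - 1j * c * np.sin(theta)


def fe_amplification(c, theta):
    """Unmodified forward Euler on pure advection (no BTD term), lumped linears."""
    return 1.0 - 1j * c * np.sin(theta)


theta = np.linspace(0.0, np.pi, 2001)

for c in (0.5, 1.0, 1.05):
    xi = btd_amplification(c, theta)
    closed_form = 1.0 - c**2 * (1.0 - c**2) * (1.0 - np.cos(theta))**2
    print(f"c = {c:4.2f}: max |xi| = {np.max(np.abs(xi)):.6f}, "
          f"matches (2.7-45): {np.allclose(np.abs(xi)**2, closed_form)}")

print("plain FE, c = 0.1: max |xi| =", np.max(np.abs(fe_amplification(0.1, theta))))
```

The BTD factor stays within the unit circle up to $c = 1$ and leaves it just beyond, while plain forward Euler exceeds one even at a small Courant number.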
It is noteworthy that FE stability has been achieved by adding a diffusion term to
the pure advection equation, a result that helps to explain the instability mechanism of
FE on pure advection; i.e., FE is unstable by virtue of the fact that its truncation error
'generates' an advection-diffusion equation with negative diffusivity—an unstable PDE.
The addition of the positive diffusion coefficient, $u^2\Delta t/2$, balances this destabilizing
LTE. (Likewise, the BE integration of the advection equation adds the same amount of
numerical diffusion—helping to 'explain' its robustness.)
It is also noteworthy that this result is actually quite old—and well enough known
to have a name: the Lax-Wendroff method (see Richtmyer and Morton, 1967). [Our
derivation, from A. Hindmarsh (personal communication) is, we believe, new—and first
appeared in Upson et al. (1983).] Also relevant from this classic text is the following
quotation (p. 332):
It has occasionally been argued that the Lax-Wendroff equations must in some sense
be less accurate than the centered or 'leapfrog' equations because of the damping
of those Fourier components exp (ikx) having $k\Delta x \sim 1$, which does not occur for
the leapfrog equations or for the differential equations. This argument fails to take
account of the phases of the Fourier coefficients, which are falsified, for $k\Delta x \sim 1$,
by both the Lax-Wendroff and the leapfrog schemes, in fact by an amount of order
$(k\Delta x)^3$, the same amount for both schemes, which is one order of magnitude larger
than the falsification of the amplitudes (the damping). It seems to us that to retain the
short-wave Fourier components with unchanged amplitudes is unrealistic under these
circumstances.
More history: in 1D, the BTD method is also known as Leith's method (Roache, 1982),
referring to Leith (1965)—but in fact it is older yet; in Knox (1961), it is mentioned that
Leith suggested the scheme to him in 1960. Knox also showed that the scheme competes
well with 'the well-known centered explicit scheme'—leapfrog.
Generalization to include diffusion is possible (almost), but we present only the final
results, referring to the original references for details (i.e., Gresho et al., 1984b; Hindmarsh
et al., 1984): to solve $\partial T/\partial t + \mathbf{u}\cdot\nabla T = \nabla\cdot(\mathbf{K}\cdot\nabla T)$, where $\mathbf{K}$ is a diagonal diffusion tensor,
via FE and lumped mass FEM on bilinear elements (trilinear in 3D), simply replace $\mathbf{K}$
by $\mathbf{K} + \mathbf{u}\mathbf{u}\Delta t/2$. We succinctly summarize the results in the form of
Remarks:
(1) The entire procedure is called BTD (balancing tensor diffusivity) because it balances
the FE truncation error with a diffusivity tensor.
(2) The von Neumann stability limit (necessary and sufficient) for the 1D case is (see,
e.g., Hindmarsh et al., 1984)
$$c \le \frac{2P}{1 + \sqrt{1 + 4P^2}} \tag{2.7-47}$$
and is plotted in Figure 2.7-5, and called modified FTCS. The 'upwind' stability
result, from FDM or LM linears with pure upwinding, is also shown, as is the
no-BTD LM result, à la Figure 2.7-4 for CM, and called FTCS here. Note that CM
causes a factor of three reduction in $\Delta t$ for unmodified FE and $P < 1$.
Fig. 2.7-5 Stability results for three schemes with forward Euler; lumped linears (FTCS);
upwind FDM, and lumped linears using BTD (modified FTCS).
(3) The stability limit for multi-dimensions is not yet known unless the one-point
quadrature approximation is invoked, in which case it is (2.7-47), applied separately in each
direction—but only as necessary conditions. Presumably, it is close to a sufficient
condition and is a good approximation to the full quadrature (2 x 2) case. (It has
served us (PMG) well in practice, even for 'real' problems—those with variable
grids, velocity, etc.).
(4) Since the 'fix' is based on the time-dependent AD equation, it is not strictly valid (though still needed for stability) if a steady state is reached (the time truncation error of FE that introduces negative diffusion is no longer present; there is nothing left to 'balance'). It is then merely another example of streamline upwinding, so called because the diffusion tensor, $u_i u_j \Delta t/2$, is anisotropic in just such a way that it is non-zero (value $|\mathbf{u}|^2 \Delta t/2$) only along streamlines; crosswind diffusion, the bane of multi-dimensional upwinding, is absent. [The proof that the diffusion acts only along streamlines is simple: rotate the diffusion tensor to streamline and normal (to streamline) coordinates, i.e., to principal directions, via (in 2D, for simplicity) $\tan\theta = v/u$ and $K'_{ij} = (\mathbf{R}^T \cdot \mathbf{K} \cdot \mathbf{R})_{ij}$, where $K_{ij} = u_i u_j$ and
$$R_{ij} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
is the appropriate rotation matrix. The result is that the rotated 'diffusivity' matrix,
$$K'_{ij} = \begin{pmatrix} u^2 + v^2 & 0 \\ 0 & 0 \end{pmatrix},$$
is non-zero only in the streamline direction. A second, and simpler, 'proof' follows from $\nabla\cdot(\mathbf{u}\mathbf{u}\cdot\nabla) = (\mathbf{u}\cdot\nabla)^2$ and the realization that $\mathbf{u}\cdot\nabla = u_s\,\partial/\partial s$, the directional derivative in the streamline direction.]
(5) If BE is used to integrate the AD equation, then it adds (implicitly) the streamline diffusivity, $\mathbf{u}\mathbf{u}\,\Delta t/2$, to the physical diffusivity; BE is thus inherently a 'streamline-diffusion' method (see Johnson, 1987) for 'small' $\Delta t$. For large $\Delta t$, BE goes from extra damping in the streamline direction to massive amplitude reduction everywhere. Returning to 'small' $\Delta t$, it is clear that the accuracy of BE can be increased (and stability decreased) by subtracting a BTD term, a result that was demonstrated by Engelman (see Gresho and Chan, 1990) on a spinning vortex problem for the incompressible Euler equations.
We now demonstrate that BTD 'works' via a simple 1D example from Gresho et al. (1984b): a unit-amplitude Gaussian wave with $\sigma = 2\Delta x$ is advected with unit velocity through a uniform mesh of 100 lumped linear elements with $Pe = u\sigma/\kappa = 20$ (and $P = 5$) and periodic BC's on the unit span. Figure 2.7-6 shows three approximate solutions and the exact (infinite span) solution, given by $T(x,t) = \exp[-(x - x_0 - ut)^2/(2\sigma^2 + 4\kappa t)]/\sqrt{1 + 2\kappa t/\sigma^2}$ with $x_0 = 0.5$ and $t = 1.0$ (one lap) and shown as the dotted curve. Curve (1) shows the true solution (very small $\Delta t$) to the ODE with no BTD, and curve (2) shows the no-BTD result at its stability limit ($c = 1/P = 0.2$) and seems to corroborate the notion that (at the stability limit) the negative diffusivity of forward Euler has 'precisely' cancelled the physical diffusivity, because the curve looks just like a pure advection result with lumped mass (dispersion error). Finally, curve (3) shows the BTD result ($\kappa + u^2\Delta t/2$) at its larger stability limit ($c = 2P/(1 + \sqrt{1 + 4P^2}) = 0.905$), a very cost-effective result because the accuracy is much better and the timestep is significantly larger (a factor of $0.905/0.2 \approx 4.5$). While these truly desirable assets gained by BTD with forward Euler are not always so striking in the general multi-dimensional case, they are sufficiently good as to strongly recommend the use of BTD when forward Euler time integration is employed.
Fig. 2.7-6 Advection-diffusion of a Gaussian ($T$ versus $x$): curve (1) no BTD at very small $\Delta t$; curve (2) no BTD at the stability limit; curve (3) BTD result at the stability limit. (The dotted curve is the exact solution.)
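The two stability limits quoted in this example can be checked with a short von Neumann scan. The following is a minimal sketch, not the authors' code: it assumes that the lumped-mass linear-element advection term reduces to a centered difference on a uniform mesh, and that $P = u\Delta x/(2\kappa)$ (inferred from $Pe = 20$, $\sigma = 2\Delta x$ and $P = 5$ above). All function names are illustrative.

```python
import math

def amplification(theta, c, P, btd):
    """Von Neumann factor G(theta) for forward Euler with lumped linears (FTCS).

    c = u*dt/dx is the Courant number and P = u*dx/(2*kappa) (assumed definition),
    so the diffusion number is d = kappa*dt/dx**2 = c/(2*P).
    With BTD, kappa is replaced by kappa + u**2*dt/2, i.e. d -> d + c**2/2.
    """
    d = c / (2.0 * P)
    if btd:
        d += 0.5 * c * c
    return 1.0 - 1j * c * math.sin(theta) - 2.0 * d * (1.0 - math.cos(theta))

def max_growth(c, P, btd, n=2001):
    """Largest |G| over n equally spaced wavenumbers theta in [0, pi]."""
    return max(abs(amplification(math.pi * k / (n - 1), c, P, btd))
               for k in range(n))

def ftcs_limit(P):
    """No-BTD limit c <= 1/P (the advection-controlled branch, P >= 1)."""
    return 1.0 / P

def btd_limit(P):
    """BTD ('modified FTCS') limit, equation (2.7-47)."""
    return 2.0 * P / (1.0 + math.sqrt(1.0 + 4.0 * P * P))

if __name__ == "__main__":
    P = 5.0
    print("FTCS limit:", ftcs_limit(P), " BTD limit:", btd_limit(P))
    for c, btd in [(0.19, False), (0.21, False), (0.90, True), (0.92, True)]:
        print(f"c = {c:4.2f}  BTD = {btd!s:5}  max|G| = {max_growth(c, P, btd):.6f}")
```

For $P = 5$ the two limits evaluate to 0.2 and about 0.905, the values used for curves (2) and (3); Courant numbers just above each limit give max $|G| > 1$ in the scan.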
As a final remark on BTD, it seems that it may only be effective for the explicit Euler method; e.g., if second-order Adams-Bashforth is analyzed in a similar way, the leading FE truncation error term, $\Delta t\,L^2/2$, where $L = -\mathbf{u}\cdot\nabla$, is replaced (for constant $\mathbf{u}$) by the higher-order AB2 truncation error term, $5\Delta t^2 L^3/12$, which does not look like simple diffusion.
2.7.3 Some Implicit ODE Methods
Since explicit methods may be too expensive for some cases of interest (for which the
stability-induced step size is often orders of magnitude smaller than that required to
integrate the ODE's with acceptable accuracy), it is appropriate to additionally consider
stable, implicit methods. In these situations the $\Delta t$ question is changed from 'How small
must $\Delta t$ be to maintain stability?' to 'How large can $\Delta t$ be while still assuring sufficient
accuracy?' The answer to this important question is, unfortunately, not easy to obtain in
a simple and general way since the answer is problem-dependent and is often a strong
function of time for a given problem. (The question is important because implicit methods
are relatively expensive per timestep, and the goal is to use as few steps as possible.) Here
we first give a few general guidelines or insights, before suggesting the 'proper' answer.
If $Pe \ll 1$, then the diffusion-dominated, time-dependent flow usually exhibits multiple time scales; e.g., consider the typical (1D) analytical solution, $T = \sum_{n=1}^{\infty} a_n(x)\,e^{-\beta n^2 \kappa t/L^2}$, where $a_n(x)$ is proportional to an appropriate eigenfunction, and $\beta = O(1)$. For very small time ($\kappa t/L^2 \ll 1$), many of the 'faster' modes (large $n$) are often quite important; but as time goes on, these modes contribute less and less until, for $\kappa t/L^2 \gtrsim 1$, only the slowest mode ($n = 1$) remains significant. This suggests, and properly so, that numerical solutions of the FEM model (the ODE's) should use a very small $\Delta t$ initially (on the
order of the time constant of the smallest element; e.g., $\tau \sim 0.1h^2/\kappa$), and $\Delta t$ should
(or at least, could) be increased monotonically and significantly during the simulation,
without sacrificing accuracy. On the other hand, for certain advection-dominated situations
($Pe \gg 1$) involving the transport (with little diffusion) of a discrete waveform, the solution
is more hyperbolic in nature—the 'temperature' follows the fluid streamlines (particle
paths for time-varying velocity). As a result, the relevant time scale is not nearly so
variable, and often a fixed step size (and perhaps an explicit method) is most appropriate.
For problems not pushing the hyperbolic limit, a variable step size may generally be useful
and cost-effective—if done properly. Fortunately, the theory of the numerical solution of
ODE's is sufficiently well advanced to provide simple but effective methods for selecting
the appropriate step size for the general case. Thus, after introducing implicit methods
in the simpler context of fixed-step-size algorithms, we will present 'smarter' and more
cost-effective algorithms for solving the ODE's.
The stability issues discussed above for AD are usually translated into what is called 'stiffness' in the ODE jargon. The early portion of a diffusion (-dominated) problem is usually 'non-stiff' in that small timesteps are required for accuracy no matter what type of time integrator (explicit or implicit) is employed. A general definition of stiffness, in terms of our model ODE [(2.7-1)] (considered now to be one of many, each with a different time constant and frequency), is this: 'The problem is stiff if the smallest time constant ($\tau_{\min}$) is "very small" relative to the time over which we wish to solve the problem.' Thus, for example, the transient heat equation, when spatially discretized, becomes stiff at 'large' time because the 'fast' modes have decayed away. If the maximum time constant
is called $\tau_{\max}$ and if the minimum frequency is called $\omega_{\min}$, then it is typically the case that the time interval of interest is at least $\tau_{\max}$ or $1/\omega_{\min}$. Since all explicit methods, and many implicit methods, are only stable for $\Delta t$ not far from $\tau_{\min}$, a measure of the stiffness of a problem is the ratio $\tau_{\max}/\tau_{\min}$ or the inverse product $1/(\omega_{\min}\tau_{\min})$, each of which is an estimate of the number of timesteps necessary to do the job stably, though not necessarily accurately, although it is often true that the step size needed for stability will give adequate accuracy. (Sometimes in the extreme, $\Delta t_{\rm crit}$ can be so small that the ODE's are integrated with an accuracy much greater than is needed, or reasonable, especially when solving PDE's, which almost never warrant 'too precise' an ODE solution because of '$\Delta x$ errors.') In the case $\dot{y} = Ay$, where $A$ is a matrix, the stiffness ratio is often taken to be the ratio of largest to smallest eigenvalue moduli of $A$. For further discussions of this subject, see Aiken (1985), Lambert (1991), and Radhakrishnan and Hindmarsh (1993).
Three examples should suffice to demonstrate stiffness (for others, see, for example,
Gear, 1971; Shampine and Gear, 1979; and Hairer and Wanner, 1991, in addition to those
already mentioned); the first is from Dahlquist and Bjorck (1974), and the second is our
own invention (Lee et al., 1983), and will be referred to later (Volume II) when we
discuss anelastic equations:
1. Consider the ODE
$$\dot{y} = 100(\sin t - y), \quad y(0) = 0,$$ (2.7-48)
with solution
$$y(t) = \frac{\sin t - 0.01\cos t + 0.01\,e^{-100t}}{1.0001},$$ (2.7-49)
which displays $y(0) = 0$ and $y(t) \approx \sin t - 0.01\cos t$ for $t \gtrsim 0.1$, and even $y(t) \approx \sin t$ to within 1%. But since $a = 1/\tau = 100$, any of the explicit schemes discussed above would be unstable for $\Delta t$ much larger than 0.02 or so. While $\Delta t = O(0.01)$ is fine (appropriate) for the initial rapid transient, the stability limit later gives the requirement of $n = 2\pi/\Delta t$ = several hundred steps per period of the simple sine wave: many more steps than would be necessary to give reasonable accuracy, probably ten-fold too many. And suppose 100 were changed to $10^4$!
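2. Here we consider a pair of equations,
$$\dot{y}_1 = -y_1 + y_2, \quad y_1(0) = 1,$$ (2.7-50)
and
$$\dot{y}_2 = y_1 - 1000\,y_2, \quad y_2(0) = 1.$$ (2.7-51)
The negatives of the eigenvalues of the associated matrix,
$$A = \begin{pmatrix} -1 & 1 \\ 1 & -1000 \end{pmatrix},$$ (2.7-52)
are $\lambda_1 = 1000.001001$ and $\lambda_2 = 0.998999$, and the exact solution is
$$y_1 = \frac{\lambda_1}{\lambda_1 - \lambda_2}\,e^{-\lambda_2 t} - \frac{\lambda_2}{\lambda_1 - \lambda_2}\,e^{-\lambda_1 t}, \qquad y_2 = \frac{\lambda_2(\lambda_1 - 1)}{\lambda_1 - \lambda_2}\,e^{-\lambda_1 t} + \frac{\lambda_1(1 - \lambda_2)}{\lambda_1 - \lambda_2}\,e^{-\lambda_2 t}.$$ (2.7-53)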
This solution, to five or six digits, is
$$y_1 = 1.001\,e^{-0.999t} - 0.001\,e^{-1000t}$$ (2.7-54)
and
$$y_2 = 0.001\,e^{-0.999t} + 0.999\,e^{-1000t},$$ (2.7-55)
which rapidly ($t \gtrsim 0.01$) changes to $y_1 = 1.001\,e^{-0.999t}$ and $y_2 = 0.001\,e^{-0.999t}$; only the slowly decaying portion of the solution is significant. If explicit Euler were to be applied to this problem, stability would require $\Delta t \le 2/\lambda_1 = 0.002$; to follow $y_1$'s decay to, say, $10^{-4}$ would mean $t_{\rm final} \approx 10$, and a total of 5000 steps would be needed when less than 1/100 that many, based on accuracy, would suffice.
Exercise: Solve $\dot{y} = 2t - 10^6(y - t^2)$, $y(0) = 0$ via FE and BE. Show that FE is not useful unless $\Delta t \lesssim 10^{-6}$, whereas BE gives six-place accuracy with $\Delta t = 1$ (!). (See also Gear, 1971.)
3. One more fairly common situation in which stiffness is encountered is that where the
advection-diffusion equation becomes the 'reaction-advection-diffusion' equation—i.e.,
when chemical reactions occur in the fluid. It is often the case that very wide differences
in chemical reaction rates ('time constants') exist, thus introducing 'chemical' stiffness.
After a short time, some of the reactants are virtually in equilibrium ($\dot{y} = 0$), yet a
stability-limited ODE method would not know this and would need to always use the
small timestep (to follow the fastest reaction, which is in equilibrium) that only makes
sense at very early time, in general. For examples of stiffness of this type, see, for example,
Enright and Hull (1976), Radhakrishnan (1986), and Aiken (1985).
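The exercise stated after the second example can be tried numerically in a few lines. This is a minimal sketch under stated assumptions: the exact solution of that ODE is $y = t^2$, the ODE is linear in $y$ so the backward Euler update can be solved in closed form, and the function names and the particular step sizes are illustrative choices, not taken from the text.

```python
STIFF = 1.0e6  # the large coefficient in y' = 2t - STIFF*(y - t**2)

def rhs(t, y):
    """Right-hand side of the exercise ODE; the exact solution is y = t**2."""
    return 2.0 * t - STIFF * (y - t * t)

def forward_euler(dt, nsteps):
    """Explicit Euler from y(0) = 0; returns (t_final, y_final)."""
    y = 0.0
    for n in range(nsteps):
        y += dt * rhs(n * dt, y)
    return nsteps * dt, y

def backward_euler(dt, nsteps):
    """Implicit Euler from y(0) = 0; the linear implicit equation is solved exactly."""
    y = 0.0
    for n in range(nsteps):
        t_new = (n + 1) * dt
        y = (y + dt * (2.0 * t_new + STIFF * t_new * t_new)) / (1.0 + STIFF * dt)
    return nsteps * dt, y

if __name__ == "__main__":
    for name, (t, y) in [
        ("FE, dt = 1e-6", forward_euler(1.0e-6, 1000)),
        ("FE, dt = 3e-6", forward_euler(3.0e-6, 100)),
        ("BE, dt = 1   ", backward_euler(1.0, 5)),
    ]:
        print(f"{name}: t = {t:g}, y = {y:.9g}, error = {abs(y - t * t):.3g}")
```

With the larger explicit step the amplification factor is $1 - 10^6\Delta t = -2$, so the error doubles in magnitude at every step, whereas BE with $\Delta t = 1$ stays within about $10^{-6}$ of $t^2$.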
It is interesting (again) to note that the ODE's that are also the natural BC's, which are also used for OBC's, become stiffer and stiffer as $\ell \to 0$; i.e., the ODE's time constant goes to zero like $\ell^2/\kappa$ (see, for example, (2.3-11) or (2.4-1)), causing, appropriately, 'equilibrium' behavior. Since this time constant is essentially the same as the $\Delta t$ stability limit for explicit methods, we see that, fortunately, the NBC ODE does no additional 'harm' when an explicit ODE method is used. Implicit methods, of course, also take this 'stiffness' in their stride and should use a step size based on accuracy, even for very small $\ell$.
Recall: 'A method is A-stable if $\lim_{n\to\infty} y_n = 0$ for all $\mathrm{Re}(\lambda) < 0$ and a fixed positive $\Delta t$ when applied to the test problem $\dot{y} = \lambda y$' (Brenan et al., 1989, 1996).
'The condition of A-stability, as a requirement for a method to be considered for stiff problems, leads to disappointingly few methods. A more fruitful approach to this subject was followed by Gear. In this approach, instead of requiring that the entire half-plane $\mathrm{Re}(\lambda\Delta t) \le 0$ lie in the absolute stability region, it is only required that a suitably large part of it lie there.' (Hindmarsh, 1974). See Gear (1971) for the precise definition of
'stiff stability'; suffice it to say here that there are many more stiffly stable methods than
A-stable methods. Every A-stable method is also stiffly stable, but not vice versa.
We now consider implicit ODE methods that can cope with stiffness—after stating
that a final and significant reason to consider implicit methods is that the GFEM ODE's
from PDE's are 'inherently implicit' owing to the mass matrix.
a. The trapezoid rule (TR)
This is the second member of the Adams-Moulton family, or implicit Adams family (the
first is backward Euler); applied to (2.7-2), the TR gives
$$y_{n+1} = y_n + \frac{\Delta t}{2}\left(\dot{y}_n + \dot{y}_{n+1}\right),$$ (2.7-56)
and applied to the linear ODE [(2.7-1)], it yields
$$y_{n+1} = \frac{1 - \lambda\Delta t/2}{1 + \lambda\Delta t/2}\,y_n.$$ (2.7-57)
This popular method, called 'Crank-Nicolson' in much of the PDE literature (at least
when applied to the diffusion-only equation), has a number of interesting and important
features—most of them very good; recall our list of virtues in the introductory discussion
in Section 2.7:
1. It is self-starting.
2. There are no extraneous roots.
3. It is stable for all $\Delta t$ whenever the ODE is stable (i.e., it is A-stable). It is also unstable
whenever the ODE is unstable.
4. It displays no spurious damping; it is completely neutral, at least for constant $\omega$.
5. It is nearly as simple to implement as its first-order dissipative relative, backward Euler.
6. It is second-order accurate.
Before presenting the next neat feature of TR, the 'Dahlquist theorem,' we briefly digress to define a full class of methods (two classes, actually): the Adams methods. These are embedded in a yet more general class of methods called linear multi-step methods, as is yet another: BDF (backward differentiation formula) methods, popularized by Gear (1971) and Hindmarsh, beginning with Hindmarsh (1972) on through to Hindmarsh and Petzold (1995a,b). The linear multi-step methods are given by
$$y_n = \sum_{j=1}^{K_1} \alpha_j\,y_{n-j} + \Delta t\sum_{j=0}^{K_2} \beta_j\,\dot{y}_{n-j},$$ (2.7-58)
wherein the terms 'linear' (for linear combinations, vis-a-vis, for example, RK methods, which are non-linear) and 'multi-step' (at least for $K_1$ and/or $K_2 > 1$) are obvious. The quantities $\{K_i, \alpha_j, \beta_j\}$ are constants given by the particular choice of method. If we define $K = \max(K_1, K_2)$, then the formula gives a $K$-step method. It is explicit if $\beta_0 = 0$ (e.g., Adams-Bashforth) and implicit otherwise (e.g., Adams-Moulton). Also, the order (accuracy) of the method is determined by $K_1$ and $K_2$. The subsets (families) referred to above, of order $q$, are
1. Adams-Bashforth (explicit): $K_1 = 1$ (and $\alpha_1 = 1$), $K_2 = q$, $\beta_0 = 0$.
2. Adams-Moulton (implicit): $K_1 = 1$ (and $\alpha_1 = 1$), $K_2 = q - 1$, $\beta_0 > 0$.
3. BDF: $K_1 = q$, $K_2 = 0$, $\beta_0 > 0$.
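As a concrete illustration of the general formula (2.7-58), the second-order ($q = 2$) member of each family can be written out in that notation. These are the standard textbook forms, added here as a reader's aid rather than quoted from the original.

```latex
% AB2 (explicit): K_1 = 1, K_2 = 2, \beta_0 = 0
y_n = y_{n-1} + \Delta t \left( \tfrac{3}{2}\,\dot{y}_{n-1} - \tfrac{1}{2}\,\dot{y}_{n-2} \right)

% Second-order Adams-Moulton, i.e. TR (implicit): K_1 = 1, K_2 = 1, \beta_0 = 1/2
y_n = y_{n-1} + \tfrac{\Delta t}{2} \left( \dot{y}_n + \dot{y}_{n-1} \right)

% BDF2 (implicit): K_1 = 2, K_2 = 0, \beta_0 = 2/3
y_n = \tfrac{4}{3}\,y_{n-1} - \tfrac{1}{3}\,y_{n-2} + \tfrac{2}{3}\,\Delta t\,\dot{y}_n
```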
See, for example, Gear (1971) for values of the coefficients as functions of q in each
case. From this text (p. 130), we quote: '... the region of stability for the implicit Adams-
Moulton methods is larger by a factor of ten or more than that of the explicit Adams-
Bashforth methods. The truncation errors are also smaller for the implicit methods, so
the implicit methods can be used with a step size that is several times larger than that of
the explicit methods. This increase in step size usually more than offsets the additional
effort in solving the corrector, which may require two or three function evaluations.' The
'corrector' here refers to a predictor-corrector method for these implicit ODE methods
which we shall define below—or see p. 114 of Gear (1971).
This excursion into general ODE methodology, for our purposes, has ended (at least
until the next chapter); but for those interested in how general ODE software packages (of
variable order and automatically varying step sizes) are built upon the above multi-step
methods, see Hindmarsh (1979); Shampine and Gordon (1975, non-stiff ODE's); Burrage
et al. (1980); Hindmarsh (1983); and Brown et al. (1989). Also, a general CFD-oriented discussion of these methods (and others, such as one-leg methods that we will define below) is available in the excellent Von Karman Institute lecture series publication by Beam and Warming (1982).
The forward Euler method is clearly the first in the AB family, and the implicit Euler
is the first member in both the Adams-Moulton and BDF families, whereas TR is the
second member ($q = 2$) of the Adams-Moulton family. The BDF methods are so named because they look like formulas for $\dot{y}_n$ that are obtained by looking backwards at earlier values of $y_j$. They are famous because of their 'stiff-stability.'
We now return to the TR and quote, loosely, the famous Dahlquist theorems (following Hughes, 1987), which help explain our attraction to it:
1. There are no explicit linear multi-step methods that are A-stable.
2. The highest order attainable that has A-stability is 2 (q = 2), and it is necessarily
implicit.
3. Of all second-order, A-stable methods, the TR is the most accurate (smallest constant
in the LTE).
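It is useful to demonstrate the accuracy, stability, and lack of dissipation of TR since we will rely heavily on this 'optimal' method in the sequel. The LTE is determined from (2.7-56) as follows:
$$\begin{aligned} d_n &= y_n + \frac{\Delta t}{2}\left[\dot{y}_n + \dot{y}(t_{n+1})\right] - y(t_{n+1}) \\ &= y_n + \frac{\Delta t}{2}\left[\dot{y}_n + \dot{y}_n + \Delta t\,\ddot{y}_n + \frac{\Delta t^2}{2}\,\dddot{y}_n + O(\Delta t^3)\right] - \left[y_n + \Delta t\,\dot{y}_n + \frac{\Delta t^2}{2}\,\ddot{y}_n + \frac{\Delta t^3}{6}\,\dddot{y}_n + O(\Delta t^4)\right] \\ &= \frac{1}{12}\,\Delta t^3\,\dddot{y}_n + O(\Delta t^4), \end{aligned}$$ (2.7-59)
where, as with BE, the LTE differs slightly from the true local error, $l_n$, here by $O(\Delta t^4)$.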
The stability of TR, for the constant-coefficient case, from (2.7-57), is determined by studying $\xi = (1 - \lambda\Delta t/2)/(1 + \lambda\Delta t/2)$ and setting $\xi = e^{i\theta}$ to find the stability boundary ($|\xi| = 1$), which yields $\lambda\Delta t = -2i\tan(\theta/2)$: the stability boundary is the entire imaginary axis. This is the epitome of A-stability: all values of $\lambda\Delta t$ with $\mathrm{Re}(\lambda\Delta t) > 0$ are stable and all values of $\lambda\Delta t$ with $\mathrm{Re}(\lambda\Delta t) < 0$ are unstable, thus mimicking optimally the ODE behavior. Finally, to test dissipation, we set $\lambda = i\omega$ and compute
$$|\xi|^2 = \left|\frac{1 - i\omega\Delta t/2}{1 + i\omega\Delta t/2}\right|^2 = \frac{1 + (\omega\Delta t/2)^2}{1 + (\omega\Delta t/2)^2} = 1;$$
TR is neutrally stable and (properly) will not damp solutions that should be purely
oscillatory—thus showing that TR shares that coveted property of the leapfrog method.
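Both properties just derived (the $\Delta t^3/12$ leading error and the unit modulus on the imaginary axis) are easy to confirm numerically for the linear model problem $\dot{y} = -\lambda y$. The sketch below is illustrative only; the function names are hypothetical and $z$ stands for $\lambda\Delta t$.

```python
import cmath

def tr_factor(z):
    """TR amplification factor for y' = -lambda*y, with z = lambda*dt (real or complex)."""
    return (1.0 - 0.5 * z) / (1.0 + 0.5 * z)

def tr_lte_ratio(z):
    """One-step TR error (numerical minus exact, y0 = 1) divided by its leading term.

    For y' = -lambda*y the third derivative is -lambda**3 * y, so the leading
    term (1/12)*dt**3*y''' equals -z**3/12; the ratio should tend to 1 as z -> 0.
    """
    d = tr_factor(z) - cmath.exp(-z)
    return d / (-(z ** 3) / 12.0)

if __name__ == "__main__":
    for z in (0.1, 0.01):
        print("z =", z, " error / (-z^3/12) =", tr_lte_ratio(z).real)
    for w in (0.1, 1.0, 10.0, 1000.0):
        print("omega*dt =", w, " |xi| =", abs(tr_factor(1j * w)))
```

The ratio approaches 1 like $1 - z$, and $|\xi|$ equals 1 to rounding error for every purely imaginary $\lambda\Delta t$, however large.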
The most serious criticism ever leveled against TR (perhaps besides being only second-order accurate) is that its neutral stability (lack of numerical damping) is sometimes regarded as a disadvantage because, although A-stable, too large a timestep (used at the wrong time) can (but need not; see below) lead to what are called TR oscillations, or ringing. TR is also not very good at damping perturbations, or round-off error, for the same reason. To see the ringing, consider $\lambda$ real so that the ODE solution is monotonically decaying like $e^{-\lambda t}$. If $\lambda\Delta t \gg 1$ in (2.7-57), then the TR solution is, approximately, $y_{n+1} = -(1 - 4/\lambda\Delta t)\,y_n$, giving $y_n = (-1)^n(1 - 4/\lambda\Delta t)^n y_0 \approx (-1)^n y_0$; the solution oscillates nearly between $\pm y_0$ (similar to forward Euler at its stability limit, $\Delta t = 2/\lambda$), while only very slowly decaying, non-monotonically, to zero. Soon, though, we shall show how this alleged shortcoming is easily overcome by a 'smart integrator' (a la ODE theory) that precludes ringing by intelligently (and efficiently) selecting appropriate step sizes for TR. Unfortunately, the literature abounds with papers in which a fixed step size is used and, especially for diffusion-dominated situations, spurious oscillations result because the fixed $\Delta t$ is inappropriate (at early time) for the 'higher modes.' These TR oscillations should be regarded as another 'wiggle signal,' telling the astute analyst that a smaller $\Delta t$ should have been used, at least at early time. (TR for time integration is the ODE 'portion' of what we call 'honest GFEM' for PDE's.) We will return to (and solve) this problem soon, but first we introduce two serious competitors to TR, both also second-order: implicit midpoint rule and BDF2, after mentioning that TR is also a member of the second-order implicit Runge-Kutta family.
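The ringing described above is easy to reproduce on the scalar decay problem $\dot{y} = -\lambda y$. The following sketch (illustrative names; $z = \lambda\Delta t$ is the only parameter) marches TR and, for contrast, backward Euler with the same fixed step.

```python
def tr_step(y, z):
    """One TR step for y' = -lambda*y, with z = lambda*dt; see (2.7-57)."""
    return y * (1.0 - 0.5 * z) / (1.0 + 0.5 * z)

def be_step(y, z):
    """One backward Euler step for the same problem."""
    return y / (1.0 + z)

def march(step, z, nsteps, y0=1.0):
    """Return the list [y0, y1, ..., y_nsteps] produced by repeated steps."""
    ys = [y0]
    for _ in range(nsteps):
        ys.append(step(ys[-1], z))
    return ys

if __name__ == "__main__":
    z = 100.0  # lambda*dt >> 1: far too large a step for this transient
    tr, be = march(tr_step, z, 10), march(be_step, z, 10)
    for n in range(0, 11, 2):
        approx = (-1) ** n * (1.0 - 4.0 / z) ** n
        print(f"n = {n:2d}  TR = {tr[n]: .4f}  approx = {approx: .4f}  BE = {be[n]:.3e}")
```

With $z = 100$ the TR iterates alternate in sign and shrink by only about 4% per step, close to the $(-1)^n(1 - 4/\lambda\Delta t)^n$ estimate, while BE decays monotonically; for $z < 2$ the TR factor is positive and no sign changes occur.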
A suggestion by R. Rannacher that may sometimes be useful (e.g., in the presence of 'rough data') in the context of a fixed-step integrator is called 'a slight shift to the implicit side' in Heywood and Rannacher (1990); namely, modify (2.7-57) to
$$y_{n+1} = y_n\,\frac{1 - \dfrac{\lambda\Delta t}{2}\,(1 - \lambda\Delta t)}{1 + \dfrac{\lambda\Delta t}{2}\,(1 + \lambda\Delta t)}.$$
(See also Timmermans et al., 1994, who make a similar suggestion.)
A final remark on TR that also applies to leapfrog and to the implicit midpoint rule to be described next, but does not apply to dissipative methods: it is a symmetric (self-adjoint) method in that any given time integration can be reversed ($\Delta t \to -\Delta t$) and backward-integrated to recover the original IC's. {Proof (A. Hindmarsh): Given $\dot{y} = f(y, t)$ and $y = y_0$ at $t = t_0$, one TR step gives the non-linear system, $y_1 = y_0 + (\Delta t/2)[f(y_0, t_0) + f(y_1, t_1)]$, for $y_1$. Assuming a unique solution exists, all we need to show is that a backward step recovers $y_0$, which is easy: given $y_1$ at $t_1$, integrate backwards to find $\tilde{y}(t_0) = \tilde{y}_0$ as follows: $\tilde{y}_0 = y_1 - (\Delta t/2)[f(y_1, t_1) + f(\tilde{y}_0, t_0)]$, which is again assumed to have a unique solution. But this is just $y_1 = \tilde{y}_0 + (\Delta t/2)[f(y_1, t_1) + f(\tilde{y}_0, t_0)]$, which proves that $\tilde{y}_0 = y_0$. For further discussion of symmetric ODE methods, see Hairer et al. (1987).} Later in this chapter we will demonstrate this symmetry for a 2D, pure advection problem.
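The time-reversal property can be seen in a small experiment. This is a sketch under simplifying assumptions, not the 2D demonstration promised above: it uses the linear scalar ODE $\dot{y} = -\lambda(t)\,y$ with a hypothetical coefficient $\lambda(t) = 1 + t^2$, so that each implicit step can be solved in closed form, and it compares TR with the (non-symmetric) backward Euler method.

```python
def lam(t):
    """Hypothetical time-varying decay rate used only for this illustration."""
    return 1.0 + t * t

def tr_step(y, t_old, t_new):
    """TR step for y' = -lam(t)*y from t_old to t_new (dt may be negative)."""
    dt = t_new - t_old
    return y * (1.0 - 0.5 * dt * lam(t_old)) / (1.0 + 0.5 * dt * lam(t_new))

def be_step(y, t_old, t_new):
    """Backward Euler step for the same ODE (dt may be negative)."""
    dt = t_new - t_old
    return y / (1.0 + dt * lam(t_new))

def round_trip(step, y0=1.0, dt=0.1, nsteps=20):
    """Integrate forward nsteps, then backward with -dt, and return the result."""
    y = y0
    for n in range(nsteps):
        y = step(y, n * dt, (n + 1) * dt)
    for n in range(nsteps, 0, -1):
        y = step(y, n * dt, (n - 1) * dt)
    return y

if __name__ == "__main__":
    print("TR round trip:", round_trip(tr_step))
    print("BE round trip:", round_trip(be_step))
```

TR returns to the initial value to rounding error because each backward factor is the exact reciprocal of the corresponding forward factor; backward Euler, being dissipative in both directions, does not.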
b. Implicit midpoint rule (IMR)
There exists a class of methods pioneered by Dahlquist (see, for example, Dahlquist,
1983) called 'one-leg methods' because only one function evaluation is involved in each
timestep. The implicit midpoint rule is the second-order member of this family (backward Euler is the first); applied to (2.7-2), it is
$$y_{n+1} = y_n + \Delta t\,f\!\left(\frac{y_n + y_{n+1}}{2},\; t_n + \frac{\Delta t}{2}\right),$$ (2.7-60)
with the function evaluation occurring at the 'midpoint' of the interval. Note that application of the IMR to the linear ODE of (2.7-1) simply returns the TR, a specific example of a general fact: one-leg methods are the same as linear multi-step methods when applied to linear ODE's with constant coefficients and a fixed step size. Even for the general nonlinear case, they are quite closely related (Hairer and Wanner, 1991): if $\{y_n\}$ is the solution of (2.7-60), then $\bar{y}_n = \tfrac12(y_n + y_{n+1})$ at $\bar{t}_n = \tfrac12(t_n + t_{n+1})$ satisfies (2.7-56); conversely, if $\{y_n, t_n\}$ is the TR solution, then $\{y_n - (\Delta t/2)f(y_n, t_n),\; t_n - \Delta t/2\}$ satisfies (2.7-60).
The main reason that one might be attracted to this method, which is self-starting and displays no spurious roots, is that it is somewhat more stable than TR, without being dissipative, which we now demonstrate for the GFEM ODE's, (2.2-25),
$$M\dot{T} + [N_Q(u) + K]\,T = 0,$$ (2.7-61)
in the absence of a forcing function. Recall that $N_Q$ is the skew-symmetric version of the advection approximation for $\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma$, which we assume. The 'energy' (quadratic form) of the system is $E = \tfrac12 T^T M T$, and (2.7-61) easily leads to
$$\dot{E} = -T^T K T,$$ (2.7-62)
which, since $K$ is SPD, shows monotonic decay: the ODE's are stable, and the advection process (properly) has no net effect on the energy. (Recall that $x^T A x = 0$ for all $x$ when $A$ is a skew-symmetric matrix.)
The TR applied to (2.7-61) with a time-varying velocity field is
$$M\,\frac{T_{n+1} - T_n}{\Delta t} + \frac12\left[N_n T_n + N_{n+1} T_{n+1} + K(T_n + T_{n+1})\right] = 0,$$ (2.7-63)
where we have dropped the $Q$-subscript on $N$, and $N_n = N[u(t_n)]$. Taking the scalar product of this equation with $(T_n + T_{n+1})$ yields, using the symmetry of $M$ and $K$ and the skew-symmetry of $N$,
$$\frac{1}{\Delta t}\left(T_{n+1}^T M T_{n+1} - T_n^T M T_n\right) = -\frac12\left[T_{n+1}^T (N_n - N_{n+1})\,T_n + (T_{n+1} + T_n)^T K\,(T_{n+1} + T_n)\right].$$ (2.7-64)
Since the diffusion term is behaving 'properly,' we now focus on the pure advection limit by setting $K = 0$ to give
$$E_{n+1} = E_n + \frac{\Delta t}{4}\left[T_{n+1}^T (N_{n+1} - N_n)\,T_n\right] = E_n + O(\Delta t^3)$$ (2.7-65)
rather than energy conservation, $E_{n+1} = E_n$ from (2.7-62). The TR gives an indefinite result for the energy, which seems to imply that stability (in the 'energy sense') is not guaranteed. Soon we will present an alternate analysis that 'returns' stability, but first we examine the IMR result for the pure advection case:
$$M\,\frac{T_{n+1} - T_n}{\Delta t} + N\!\left(\frac{u_n + u_{n+1}}{2}\right)\left(\frac{T_n + T_{n+1}}{2}\right) = 0.$$ (2.7-66)
Forming the same scalar product as above leads easily to the desirable result that $E_{n+1} = E_n$; energy conservation is achieved for all step sizes. The TR result, in an asymptotic form that replaced $(N_{n+1} - N_n)$ by $\Delta t\,\dot{N} + O(\Delta t^2)$, was first discussed from the standpoint of stability and conservation by Lee et al. (1982), and the better qualities of the IMR on the same were first pointed out by Cliffe (1981).
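The difference between the two energy results can be illustrated on the smallest possible skew-symmetric system. The sketch below is an illustration under stated assumptions, not the GFEM system itself: it takes $M = I$ and a 2x2 advection matrix $N(t) = \omega(t)\,J$ with $J$ skew-symmetric, which is equivalent to the complex scalar ODE $\dot{z} = i\omega(t)z$ with $z = T_1 + iT_2$; the rotation rate $\omega(t) = 1 + t$ is a hypothetical stand-in for a time-varying velocity.

```python
def omega(t):
    """Hypothetical time-varying 'velocity' (rotation rate) for the 2x2 model."""
    return 1.0 + t

def tr_step(z, t, dt):
    """TR: average of the right-hand side at both ends of the step, as in (2.7-63) with K = 0."""
    return z * (1.0 + 0.5j * dt * omega(t)) / (1.0 - 0.5j * dt * omega(t + dt))

def imr_step(z, t, dt):
    """IMR: one evaluation with the averaged velocity, as in (2.7-66)."""
    w = 0.5 * (omega(t) + omega(t + dt))
    return z * (1.0 + 0.5j * dt * w) / (1.0 - 0.5j * dt * w)

def energies(step, dt=0.1, nsteps=50, z0=1.0 + 0.0j):
    """Return E_n = |z_n|**2 / 2 for n = 0..nsteps."""
    z, out = z0, [0.5 * abs(z0) ** 2]
    for n in range(nsteps):
        z = step(z, n * dt, dt)
        out.append(0.5 * abs(z) ** 2)
    return out

if __name__ == "__main__":
    e_tr, e_imr = energies(tr_step), energies(imr_step)
    print("TR : E_0 =", e_tr[0], " E_50 =", e_tr[-1])
    print("IMR: E_0 =", e_imr[0], " E_50 =", e_imr[-1])
```

In this model the IMR factor has unit modulus at every step, so the energy is conserved to rounding error, while the TR energy drifts (here downward, since $\omega$ increases) even though it remains bounded.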
We now turn to an interesting model ODE with a time-varying decay rate,
$$\dot{y} = -\lambda(t)\,y, \quad y(0) = y_0,$$ (2.7-67)
and compare TR with IMR. The former yields
$$y_{n+1} = \left(\frac{1 - \lambda_n\Delta t/2}{1 + \lambda_{n+1}\Delta t/2}\right) y_n \equiv \xi_n\,y_n,$$ (2.7-68)
and the latter
$$y_{n+1} = \frac{1 - \bar\lambda\Delta t/2}{1 + \bar\lambda\Delta t/2}\,y_n,$$ (2.7-69)
where $\bar\lambda = \lambda(t_n + \Delta t/2)$. As first pointed out by Gourlay (1970) and further studied by Hughes (1977) (see also Nevanlinna and Liniger, 1978), the TR result above actually looks unstable if $\lambda(t) > 0$ is a decreasing function of time and if $\Delta t > 4/(\lambda_n - \lambda_{n+1}) \equiv \Delta t_c$, a result that obtains from violating the left inequality in the stability statement $-1 \le \xi_n \le 1$. In contrast to this ostensibly conditional stability behavior, the IMR is easily seen to be stable for all step sizes.
But it is actually too hasty to conclude instability if $\Delta t > \Delta t_c$ because we have a variable-coefficient ODE. If $\lambda$ was constant and if $|\xi| > 1$, we would get $y_n = \xi^n y_0$, which is indeed unbounded. But for $\lambda = \lambda(t)$, a function of time, the analogous behavior is $y_n = \left(\prod_{j=0}^{n-1}\xi_j\right) y_0 \equiv \varphi_n y_0$, and it does not necessarily follow that $\varphi_n$ is unbounded for $n \to \infty$ even when each $\xi_j$ has $|\xi_j| > 1$; e.g., $\prod_{j=1}^{\infty}(1 + 1/j^2) = \sinh\pi/\pi$. In fact, it is not hard to show that TR is (at least for constant $\Delta t$) actually still stable (in a slightly extended sense, and not as 'cleanly' stable as IMR) even when $\Delta t > \Delta t_c$ for every step. The proof uses 'energy' arguments and for the scalar problem goes like: the TR step gives $(1 + \lambda_{n+1}\Delta t/2)\,y_{n+1} = (1 - \lambda_n\Delta t/2)\,y_n$ with $\lambda_{n+1} < \lambda_n$, and we permit $\Delta t > \Delta t_c$ for which $|\xi_n| > 1$. Squaring both sides gives $(1 + \lambda_{n+1}\Delta t/2)^2 y_{n+1}^2 = (1 - \lambda_n\Delta t/2)^2 y_n^2 = (1 - \lambda_n\Delta t + (\Delta t^2/4)\lambda_n^2)\,y_n^2 \le (1 + \lambda_n\Delta t/2)^2 y_n^2$, which implies $|y_n| \le [(1 + \lambda_0\Delta t/2)/(1 + \lambda_n\Delta t/2)]\,|y_0|$; i.e., although $y_{n+1}$ may not decay to zero if $\Delta t$ is large, it is true that $y_{n+1}$ does not 'blow up' for any $\Delta t$. This boundedness is stability. This result also generalizes to the system of ODE's given by (2.7-61), written now as
$$M\dot{T} + [N(t) + K(t)]\,T = 0,$$ (2.7-70)
where $N^T = -N$ represents advection, and $K$ is SPD (diffusion with, for generality but not necessarily, a time-varying diffusivity): the TR yields $(M + (\Delta t/2)A_{n+1})\,T_{n+1} = (M - (\Delta t/2)A_n)\,T_n$, where $A = N + K$. The energy analysis goes as follows (and uses the fact that $M$, and thus $M^{-1}$, is SPD): let $(M + (\Delta t/2)A_{n+1})\,T_{n+1} = a$ and $(M - (\Delta t/2)A_n)\,T_n = b$. Since $a = b$, we have $M^{-1}a = M^{-1}b$ and $a^T M^{-1} a = a^T M^{-1} b = b^T M^{-1} b$ which, after some algebra, yields
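$$\begin{aligned} &T_{n+1}^T\left[M + \Delta t\,K_{n+1} + \frac{\Delta t^2}{4}\left(K_{n+1}M^{-1}K_{n+1} - 2N_{n+1}M^{-1}K_{n+1}\right)\right]T_{n+1} + \frac{\Delta t^2}{4}\,(N_{n+1}T_{n+1})^T M^{-1}(N_{n+1}T_{n+1}) \\ &\quad = T_n^T\left[M - \Delta t\,K_n + \frac{\Delta t^2}{4}\left(K_nM^{-1}K_n - 2N_nM^{-1}K_n\right)\right]T_n + \frac{\Delta t^2}{4}\,(N_nT_n)^T M^{-1}(N_nT_n) \\ &\quad \le T_n^T\left[M + \Delta t\,K_n + \frac{\Delta t^2}{4}\left(K_nM^{-1}K_n - 2N_nM^{-1}K_n\right)\right]T_n + \frac{\Delta t^2}{4}\,(N_nT_n)^T M^{-1}(N_nT_n), \end{aligned}$$ (2.7-71)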
which, since now all terms on each side of the inequality are the 'same' functions of $n$, leads by induction to $\mathrm{LHS}_{n+1} \le \mathrm{RHS}_0$, and we have stability. If in fact $K = 0$ (pure advection), then we have, from (2.7-71), the following 'conservation' law: $T_n^T M T_n + (\Delta t^2/4)(N_n T_n)^T M^{-1}(N_n T_n) = T_0^T M T_0 + (\Delta t^2/4)(N_0 T_0)^T M^{-1}(N_0 T_0)$, which is seen to be more useful than (2.7-65) in that now we have shown a useful and definite result: boundedness.
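The scalar version of this boundedness argument can be checked directly. The sketch below assumes a hypothetical decreasing decay rate $\lambda(t) = 100/(1 + t)$ and a deliberately large step $\Delta t = 1$, so that the first few TR factors violate $-1 \le \xi_n$; the function names are illustrative.

```python
def lam(t):
    """Hypothetical decreasing decay rate, lambda(t) > 0."""
    return 100.0 / (1.0 + t)

def tr_factor(n, dt):
    """TR amplification factor xi_n of (2.7-68)."""
    return (1.0 - 0.5 * dt * lam(n * dt)) / (1.0 + 0.5 * dt * lam((n + 1) * dt))

def imr_factor(n, dt):
    """IMR amplification factor of (2.7-69), using lambda at the midpoint."""
    lbar = lam((n + 0.5) * dt)
    return (1.0 - 0.5 * dt * lbar) / (1.0 + 0.5 * dt * lbar)

def tr_history(dt, nsteps, y0=1.0):
    """Return [y0, y1, ..., y_nsteps] for TR."""
    ys = [y0]
    for n in range(nsteps):
        ys.append(tr_factor(n, dt) * ys[-1])
    return ys

def tr_bound(n, dt, y0=1.0):
    """Bound from the squaring argument: (1 + lam_0*dt/2)/(1 + lam_n*dt/2) * |y0|."""
    return (1.0 + 0.5 * dt * lam(0.0)) / (1.0 + 0.5 * dt * lam(n * dt)) * abs(y0)

if __name__ == "__main__":
    dt = 1.0
    ys = tr_history(dt, 10)
    for n in range(10):
        print(f"n = {n}  xi_n = {tr_factor(n, dt): .3f}  |y_n| = {abs(ys[n]):.3f}"
              f"  bound = {tr_bound(n, dt):.3f}  IMR factor = {imr_factor(n, dt): .3f}")
```

For these choices $|\xi_n| > 1$ on the first four steps, so $|y_n|$ initially grows, yet it never exceeds the bound; the IMR factor stays inside $(-1, 1)$ throughout.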
In Hughes (1983) similar arguments were presented, but he apparently did not realize (nor did Gourlay for the scalar case) that $\varphi_n = \prod_{j=0}^{n-1}\xi_j$ need not become unbounded even though $|\xi_j| > 1$, and thus saw a discrepancy between the scalar case and the ODE system. Our analysis shows that there need not be a discrepancy, at least for the linear case [Hughes was considering the more general non-linear case, and there are examples from this class, e.g., Fornberg (1973), for which TR is indeed only conditionally stable].
But the main reason that we are not attracted to IMR is that only TR provides us with
easy and accurate methods of error control and automatic step size variation, as we will
soon show.
c. Backward differentiation formulae (BDF)
The last implicit method that we will consider is BDF2 (see Gear, 1971, and Hindmarsh, 1972). It is the second of the so-called 'stiffly-stable' BDF ODE methods (implicit Euler being the first) meaning, roughly at least, that ODE's like $\dot{y} = -\lambda y$ with $\lambda > 0$ are integrated in an unconditionally stable way, and those like $\dot{y} = i\omega y$ are at least conditionally stable; recall that stiff stability is weaker than A-stability. BDF2 is sometimes referred to as the second-order, implicit Euler method in the CFD literature (BDF1 is precisely implicit Euler). BDF2 is not neutrally stable, like TR or IMR; rather, it (like backward Euler, but less enthusiastically) will damp an oscillatory solution like that of $\dot{y} = i\omega y$, and thus would be shunned by those wishing to solve pure advection on long time scales. It is, for (2.7-2),
$$\frac{y_{n+1} - y_n}{\Delta t} = \frac13\,\frac{y_n - y_{n-1}}{\Delta t} + \frac23\,\dot{y}_{n+1},$$ (2.7-72)
in which form it appears to be one-third 'extrapolation' and two-thirds backward Euler. (The general BDF family seems susceptible to a similar interpretation.) Another way to write it is
$$\frac{3y_{n+1} - 4y_n + y_{n-1}}{2\Delta t} = \dot{y}_{n+1},$$ (2.7-73)
in which the LHS is (via Taylor series) a well-known, second-order-accurate, one-sided approximation to $\dot{y}$ at $t_{n+1}$. BDF2 is actually better than just stiffly stable, and even better
than A-stable [see Figures 11.6 and 11.7 of Gear, 1971 (the higher-order BDF methods are only stiffly stable)]. Applied to $\dot{y} = -\lambda y$, it gives
$$y_{n+1} = \frac{4y_n - y_{n-1}}{3 + 2\lambda\Delta t},$$ (2.7-74)
which, via $y_{n+1} = \xi y_n$, yields a quadratic equation for $\xi$ with roots $\xi = (2 \pm \sqrt{1 - 2\lambda\Delta t})/(3 + 2\lambda\Delta t)$, with $|\xi| < 1$, which displays L-stability ($y_{n+1} \to 0$ for $\lambda\Delta t \to \infty$), a good feature for dissipative systems (at least). The disadvantages of BDF2 are two: it is not self-starting (requiring, for example, a BE first step at a smaller $\Delta t$ or, preferably in our opinion, TR at the same $\Delta t$) and it displays one spurious root (given by the minus sign in the $\xi$-equation). Also, for pure diffusion ($\lambda$ real) and $\lambda\Delta t > 1/2$, BDF2 displays (in both roots) an oscillatory damped behavior, vis-a-vis BDF1 (BE), which is monotone. While BDF2 may seem thus to be less attractive than TR (which we believe to be generally the case), it may be a useful complement to TR when variable timesteps are employed and a solution is heading for a (non-zero) steady state (which, of course, precludes the undamped, pure advection cases). We shall return to this point later.
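The statements about the two BDF2 roots can be verified with a few lines of arithmetic. This is an illustrative sketch for the real-$\lambda$ test problem only; the function name is hypothetical and $z$ stands for $\lambda\Delta t$.

```python
import cmath

def bdf2_roots(z):
    """Both roots of (3 + 2z)*xi**2 - 4*xi + 1 = 0, obtained from (2.7-74) with y_{n+1} = xi*y_n.

    The '+' root is the principal one (it tends to 1 as z -> 0); the '-' root is spurious.
    """
    s = cmath.sqrt(1.0 - 2.0 * z)
    return (2.0 + s) / (3.0 + 2.0 * z), (2.0 - s) / (3.0 + 2.0 * z)

if __name__ == "__main__":
    for z in (0.1, 0.5, 1.0, 10.0, 1.0e6):
        principal, spurious = bdf2_roots(z)
        print(f"z = {z:9.1f}  principal = {principal:.5f} (|.| = {abs(principal):.5f})"
              f"  spurious = {spurious:.5f} (|.| = {abs(spurious):.5f})")
```

Both moduli stay below one for every $z > 0$ and shrink toward zero as $z$ grows (the L-stability noted above); the roots become a complex-conjugate pair once $z > 1/2$, which is the damped oscillatory behavior mentioned for pure diffusion.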
To conclude our brief visit to implicit ODE methods, we summarize as follows: TR
and IMR are A-stable whereas BDF1 (implicit Euler) and BDF2 are L-stable. The former
pair are clearly preferred for pure advection whereas the latter pair have certain 'stability'
advantages (e.g., errors are always damped).
2.7.4 A Variable-Step Implicit Method for Advection-Diffusion
Having presented a number of methods in the fixed-$\Delta t$ context, it is time to get 'serious' regarding implicit ODE methods and show how they should be used. If one is committed to (or more interested in) implicit methods so that $\Delta t$ can be selected based principally on desired accuracy with little or no regard to stability issues, one is naturally led to consider (stiffly stable or even A-stable) implicit methods in which $\Delta t$ is varied during the time integration procedure, being 'small' only when necessary (lots of high-frequency and/or small time-constant modes that are active/important, or via a 'busy' forcing function) and 'large' whenever possible (lack of high-frequency modes, small time-constant modes are of small amplitude, and any forcing is 'slow'). Such a method we refer to as a smart integrator because it follows the 'physics' intelligently, and is a natural adjunct to FEM which, when employed optimally, puts nodes where they are most needed.
a. Variable step trapezoid rule
Based on the above considerations, the implicit method that we favor for such an approach
is TR, whose only detraction is its tendency to oscillate when $\Delta t$ is (too) large, a
detraction that is absent when $\Delta t$ is selected properly, a point that we cannot emphasize too
strongly. That is, our TR integrator will not generate visible oscillations even when $\lambda\Delta t$,
as selected by the smart integrator, is very large. (Large $\Delta t$ will only be employed when
it is 'safe'/appropriate to do so.)
The ODE theory that we employ is that in which an explicit method (cheap by comparison
with the implicit method) is also employed in such a way that the LTE of TR can be
easily and reliably estimated and used to control accuracy via step size changes: increase
$\Delta t$ whenever possible, decrease $\Delta t$ only when necessary. (The timesteps are selected to
follow the 'physics.') Since TR is second-order accurate and uses two values of y, the
natural explicit method is second-order Adams-Bashforth, AB2. These two are used side
by side to make a variable-step TR that we call 'smart'—it emulates just what the better
ODE software packages do, which packages could sometimes be used to advantage in
CFD, at least if the problem is not too large. (But they rarely are used; one exception
we are aware of is Randriamampianiva et al. (1987).)
The strategy will first be introduced with the scalar model problem—and favorably
compared with its first-order counterpart (backward Euler, with forward Euler used to
estimate the local error)—and then for the GFEM ODE's for advection-diffusion, including
some 'heuristics' and some 'warnings.' In the next chapter we will extend it to the NS
equations.
The trick is to combine the variable-$\Delta t$ AB2 method, as a sort of 'predictor' (see, for
example, Shampine and Gordon, 1975),
$$y^p_{n+1} = y_n + \frac{\Delta t_n}{2}\left[\left(2 + \frac{\Delta t_n}{\Delta t_{n-1}}\right)\dot{y}_n - \frac{\Delta t_n}{\Delta t_{n-1}}\,\dot{y}_{n-1}\right], \tag{2.7-75}$$
with LTE [cf. (2.7-10)]
$$y^p_{n+1} - y(t_{n+1}) = -\left(2 + 3\,\frac{\Delta t_{n-1}}{\Delta t_n}\right)\frac{\Delta t_n^3\,\dddot{y}_n}{12} + O(\Delta t^4), \tag{2.7-76}$$
which is used to estimate the LTE of TR, given by (2.7-59), with $\Delta t$ replaced by $\Delta t_n$, the
current step size; i.e., the pair of equations, (2.7-59) and (2.7-76), in the two unknowns,
$y(t_{n+1})$ and $\dddot{y}_n$, can be solved, and the item of interest, $d_n \equiv y_{n+1} - y(t_{n+1})$, where $y_{n+1}$
is the TR solution, computed. The result is
$$d_n = \frac{y_{n+1} - y^p_{n+1}}{3\,(1 + \Delta t_{n-1}/\Delta t_n)}; \tag{2.7-77}$$
the local error in the TR step [to $O(\Delta t_n^4)$] is simply proportional to the difference between
the TR and the AB2 solution, with a known proportionality constant. Armed with this
knowledge, we can vary the step size in such a way that this local error is maintained
below a (user-specified) tolerance, which also keeps any TR oscillations within the 'noise'
level [$O(\varepsilon)$, where $\varepsilon$ is the specified error allowance], at least when $\varepsilon$ is chosen to be
'sufficiently small.' [Some of the smart integrator 'package' falls into the lap of (is the
responsibility of) the user.]
Remarks:
(1) A simple rearrangement of (2.7-75) makes it more 'transparent':
$$y^p_{n+1} = y_n + \Delta t_n\,\dot{y}_n + \frac{\Delta t_n^2}{2}\left(\frac{\dot{y}_n - \dot{y}_{n-1}}{\Delta t_{n-1}}\right).$$
(2) If the ODE is linear in $y$, then AB2 is used only to estimate $d_n$; $y^p_{n+1}$ is then
'discarded.' (For the general non-linear case, or for the alternative AB/TR algorithm
to be described below, $y^p_{n+1}$ has a dual use, further amortizing its (small) added
cost.)
Suppose we have just computed dn from (2.7-77). The first thing we might want to
do is improve our current result by using (2.7-77) again, this time in the form
$$y(t_{n+1}) = y_{n+1} - d_n, \tag{2.7-78}$$
which is ostensibly a better estimate than the TR result, $y_{n+1}$; i.e., subtracting the LTE
yields a result that should be third-order accurate. And this is true. But it is not true that the
above value should be used to replace $y_{n+1}$ in the overall algorithm, which would then no
longer be TR. It could be used at 'output' times, but should not be used in any other way
since A-stability would then be lost. And, it is only really sensible (and recommended)
to use it at the first step beyond the specified output times, and an [$O(\Delta t^4)$] interpolation
scheme would need to be invoked to retain the third-order accuracy at exactly the specified
output times.
The next step is to compute the next step: $\Delta t$. This is done using the LTE of TR,
$d_n = \Delta t_n^3\,\dddot{y}_n/12$, as follows: if we change $\Delta t$ from $\Delta t_n$ to $\Delta t_{n+1}$ and take the next step, we
would clearly expect to see
$d_{n+1}/d_n = (\Delta t_{n+1})^3\,\dddot{y}_{n+1}/(\Delta t_n^3\,\dddot{y}_n) = (\Delta t_{n+1}/\Delta t_n)^3 + O(\Delta t_n)$,
which is just a reflection of TR's second-order accuracy. Now we place the following
accuracy constraint on the next TR solution:
$$|d_{n+1}| \le \varepsilon\,y_{\max}, \tag{2.7-79}$$
where $\varepsilon$ is the user-specified, dimensionless relative-error tolerance parameter, and
$y_{\max}$ (also user-supplied) is an appropriate scaling factor. This leads to
$\varepsilon\,y_{\max}/|d_n| \ge (\Delta t_{n+1}/\Delta t_n)^3$, which we use to obtain our next step size (by invoking equality):
$$\Delta t_{n+1} = \Delta t_n\,(\varepsilon\,y_{\max}/|d_n|)^{1/3}, \tag{2.7-80}$$
and we are basically finished with the derivation. If
$|d_n| = |y_{n+1} - y^p_{n+1}|/[3(1 + \Delta t_{n-1}/\Delta t_n)]$ is larger than $\varepsilon\,y_{\max}$, then the timestep will be reduced, and vice versa.
Finally, to set up for the next AB2 predictor, (2.7-75), we 'invert' the TR as follows:
$$\dot{y}_{n+1} = \frac{2}{\Delta t_n}(y_{n+1} - y_n) - \dot{y}_n, \tag{2.7-81}$$
thus providing a cheap, recursive formula for use in the next predictor step.
Remarks:
(1) An alternative method of error control is based on the (dimensionally inconsistent)
concept of 'local error per unit step' (Shampine and Gordon, 1975); an example of
its inferiority is given in Griffiths (1988).
(2) It is generally not advisable to replace (2.7-81) by (2.7-2); i.e., $\dot{y}_{n+1} = f(y_{n+1}, t_{n+1})$,
because it is generally more costly and would cause accumulation of undamped
errors.
(3) The global error, $e_n \equiv y_n - y(t_n)$, is of course the error that we would like to
control directly, a feat which, unfortunately, is too difficult (expensive) to perform.
(It would require, for example, at least two full integrations at different $\Delta t$'s, via
'step doubling,' or equivalent.) Thus, the following caveat from the experts: 'Local
error control in a code can be viewed as a knob that can be turned to try to adjust
the step sizes and hence the global error. It is not a guarantee of small global
error'—Hindmarsh and Petzold (1995a).
Written as an algorithm, the overall smart integrator is the following:
Initialization: (The predictor begins after the first step and error control after the second.)
(1) Given $y_0$ and $\varepsilon$, compute $\Delta t_0 = \tau\varepsilon^{1/3}$, where $\tau$ is an estimate of the initial
'time-constant.' (If no such estimate is available, then a conservatively small $\Delta t_0$ can
be selected because the smart integrator will quickly increase $\Delta t$ to the proper,
$\varepsilon$-dependent value.)
(2) Solve for $y_1$ from TR: $y_1 = y_0 + (\Delta t_0/2)(\dot{y}_0 + \dot{y}_1)$.
(3) Compute $\dot{y}_1 = (2/\Delta t_0)(y_1 - y_0) - \dot{y}_0$, and we are ready for the
General Step: $n = 1, 2, \ldots$, with $\Delta t_1 = \Delta t_0$:
(1) Compute $y^p_{n+1} = y_n + (\Delta t_n/2)[(2 + \Delta t_n/\Delta t_{n-1})\dot{y}_n - (\Delta t_n/\Delta t_{n-1})\dot{y}_{n-1}]$.
(2) Solve $y_{n+1} = y_n + (\Delta t_n/2)(\dot{y}_n + \dot{y}_{n+1})$, where, when $f(y, t)$ is non-linear, use also
$y^p_{n+1}$ as a first guess when solving for $y_{n+1}$. [Note that in general, the solution
for $y_{n+1}$ involves the solution of a non-linear equation,
$y_{n+1} - (\Delta t_n/2)f(y_{n+1}, t_{n+1}) = y_n + (\Delta t_n/2)f(y_n, t_n) \equiv b_n$.]
(3) $d_n = (y_{n+1} - y^p_{n+1})/[3(1 + \Delta t_{n-1}/\Delta t_n)]$.
(4) $\dot{y}_{n+1} = (2/\Delta t_n)(y_{n+1} - y_n) - \dot{y}_n$.
(5) $t_{n+1} = t_n + \Delta t_n$.
(6) $\Delta t_{n+1} = \Delta t_n\,(\varepsilon\,y_{\max}/|d_n|)^{1/3}$.
(7) Go to (1) unless it is time to STOP.
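The initialization and general step listed above can be sketched in a few lines of Python for the linear model problem $\dot{y} = -\lambda y$, for which the TR solve in step (2) has a closed form. This is only an illustrative sketch under stated assumptions: the function name `smart_tr`, its arguments, the default $\tau = 1/\lambda$, and the tiny floor on $|d_n|$ (to avoid division by zero) are hypothetical choices, not part of the original algorithm statement.

```python
import math

def smart_tr(lam, y0, eps, t_end, y_max=1.0, tau=None):
    """Variable-step AB2/TR for dy/dt = -lam*y.

    Returns a list of (t, y, dt_used) tuples, one per completed step.
    tau defaults to 1/lam (an assumed estimate of the initial time constant).
    """
    tau = 1.0 / lam if tau is None else tau
    dt = tau * eps ** (1.0 / 3.0)                 # initialization (1)
    t, y, ydot = 0.0, y0, -lam * y0
    # Initialization (2)-(3): first TR step, then "invert" TR to get ydot_1.
    y_new = (y + 0.5 * dt * ydot) / (1.0 + 0.5 * lam * dt)
    ydot_new = (2.0 / dt) * (y_new - y) - ydot
    t += dt
    out = [(t, y_new, dt)]
    y, ydot_prev, ydot, dt_prev = y_new, ydot, ydot_new, dt
    while t < t_end:
        r = dt / dt_prev
        # (1) AB2 predictor, eq. (2.7-75)
        y_pred = y + 0.5 * dt * ((2.0 + r) * ydot - r * ydot_prev)
        # (2) TR corrector (closed form because f is linear in y)
        y_new = (y + 0.5 * dt * ydot) / (1.0 + 0.5 * lam * dt)
        # (3) local error estimate, eq. (2.7-77)
        d = (y_new - y_pred) / (3.0 * (1.0 + dt_prev / dt))
        # (4) inverted TR, eq. (2.7-81)
        ydot_new = (2.0 / dt) * (y_new - y) - ydot
        t += dt                                   # (5)
        out.append((t, y_new, dt))
        y, ydot_prev, ydot = y_new, ydot, ydot_new
        # (6) next step size, eq. (2.7-80); the floor on |d| is an added guard
        dt_prev, dt = dt, dt * (eps * y_max / max(abs(d), 1e-300)) ** (1.0 / 3.0)
    return out

steps = smart_tr(lam=1.0, y0=1.0, eps=1.0e-4, t_end=8.0)
max_err = max(abs(y - math.exp(-t)) for t, y, _ in steps)
```

Because the sketch starts from $\Delta t_0 = \tau\varepsilon^{1/3}$ rather than from the theoretical step size, its step history should not be expected to match any tabulated results digit for digit.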
An alternative AB/TR algorithm, equivalent in theory but more accurate in the face
of round-off error, is the following (A. Hindmarsh, personal communication), beginning
with a better way of updating the derivative for the predictor step; instead of 'inverting'
TR to obtain the next value of $\dot{y}$ for the predictor, we proceed as follows:
$$\dot{y}^p_{n+1} = \frac{2}{\Delta t_n}(y^p_{n+1} - y_n) - \dot{y}_n \tag{2.7-82}$$
rather than (2.7-81), a small change [$O(\Delta t^3)$] that, combined with those to follow, can
reduce the round-off error by a factor (roughly) of $|y_{n+1} - y_n|/|y_n|$. Inserting $y^p_{n+1}$ from
(2.7-75) gives the final form for computation:
$$\dot{y}^p_{n+1} = (1 + \Delta t_n/\Delta t_{n-1})\,\dot{y}_n - (\Delta t_n/\Delta t_{n-1})\,\dot{y}_{n-1}. \tag{2.7-83}$$
[Noting that $\ddot{y}_n \approx (\dot{y}_n - \dot{y}_{n-1})/\Delta t_{n-1}$, (2.7-83) 'looks like' $\dot{y}^p_{n+1} = \dot{y}_n + \Delta t_n\,\ddot{y}_n$.] The next
change is to solve for the (small) difference between the predictor and the corrector
rather than for $y_{n+1}$ itself. Subtracting $y^p_{n+1} = y_n + (\Delta t/2)(\dot{y}_n + \dot{y}^p_{n+1})$ from
$y_{n+1} = y_n + (\Delta t/2)(\dot{y}_n + \dot{y}_{n+1})$ yields the '$\delta$-form' of TR,
$$\begin{aligned}
y_{n+1} - y^p_{n+1} \equiv \delta y &= \frac{\Delta t_n}{2}\,(\dot{y}_{n+1} - \dot{y}^p_{n+1}) \\
&= \frac{\Delta t_n}{2}\,[f(y_{n+1}, t_{n+1}) - \dot{y}^p_{n+1}] \\
&= \frac{\Delta t_n}{2}\,[f(y^p_{n+1} + \delta y,\, t_{n+1}) - \dot{y}^p_{n+1}],
\end{aligned} \tag{2.7-84}$$
a non-linear (in general) equation in $\delta y$ [cf. Step (2) above]. After solving (2.7-84) for
$\delta y$, the final change is in the way that $\dot{y}$ is updated for the next step:
$$\dot{y}_{n+1} = \dot{y}^p_{n+1} + \frac{2}{\Delta t_n}\,\delta y, \tag{2.7-85}$$
obtained by subtracting (2.7-82) from (2.7-81). [Note that, in the absence of round-off
error, (2.7-81) and (2.7-85) are completely equivalent, as are the other results.] Thus, in
the improved AB/TR algorithm, Step (2) is replaced by (2.7-83) and (2.7-84) and Step
(4) by (2.7-85). Also, of course, we need to add $y_{n+1} = y^p_{n+1} + \delta y$.
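As a quick check on the claimed equivalence, the following Python sketch applies one standard TR step and one $\delta$-form step to $\dot{y} = -\lambda y$, where (2.7-84) is linear in $\delta y$ and can be solved in closed form. The function names and the sample numbers are hypothetical, chosen only for illustration.

```python
def tr_step_standard(lam, dt, y, ydot):
    """One TR step for dy/dt = -lam*y, followed by the inverted-TR update (2.7-81)."""
    y_new = (y + 0.5 * dt * ydot) / (1.0 + 0.5 * lam * dt)
    ydot_new = (2.0 / dt) * (y_new - y) - ydot
    return y_new, ydot_new

def tr_step_delta(lam, dt, dt_prev, y, ydot, ydot_prev):
    """The same step in delta-form: solve for delta = y_new - y_pred."""
    r = dt / dt_prev
    y_pred = y + 0.5 * dt * ((2.0 + r) * ydot - r * ydot_prev)   # (2.7-75)
    ydot_pred = (1.0 + r) * ydot - r * ydot_prev                 # (2.7-83)
    # (2.7-84) with f = -lam*y:
    #   (1 + lam*dt/2) * delta = -(dt/2) * (lam*y_pred + ydot_pred)
    delta = -0.5 * dt * (lam * y_pred + ydot_pred) / (1.0 + 0.5 * lam * dt)
    y_new = y_pred + delta
    ydot_new = ydot_pred + (2.0 / dt) * delta                    # (2.7-85)
    return y_new, ydot_new

# Arbitrary sample state (illustrative numbers only).
a = tr_step_standard(1.0, 0.1, 0.8, -0.8)
b = tr_step_delta(1.0, 0.1, 0.08, 0.8, -0.8, -0.9)
```

In exact arithmetic `a` and `b` coincide; any difference in floating point is at round-off level, which is the point of the $\delta$-form.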
Finally, we address the proper way to present the results. To retain the TR accuracy at
user-specified output times, a second-order-accurate interpolation formula is needed: for
tn < t < tn+\, the following formula does the trick—and is recommended:
(2.7-86)
b. Variable step backward Euler
Before applying TR to AD, let us briefly present a similar, but first-order algorithm
(partly) for the purpose of showing that it is not much cheaper per timestep (although
rather less accurate). Then we will show that it often requires many more timesteps and
is thus usually not a serious competitor. (The details of the algorithm design are left as
an exercise.)
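(0) Initial step size: $\Delta t_0 = \tau\varepsilon^{1/2}$.
For $n = 0, 1, 2, \ldots$, do
(1) FE predictor: $y^p_{n+1} = y_n + \Delta t_n\,\dot{y}_n$.
(2) BE step: $y_{n+1} = y_n + \Delta t_n\,\dot{y}_{n+1}$; solve for $y_{n+1}$.
(3) $d_n = (y_{n+1} - y^p_{n+1})/2$.
(4) $t_{n+1} = t_n + \Delta t_n$.
(5) $\Delta t_{n+1} = \Delta t_n\,(\varepsilon\,y_{\max}/|d_n|)^{1/2}$.
(6) Go to (1), or STOP.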
Remarks:
(1) If $f(y)$ is expensive to evaluate, $\dot{y}$ in step (1) should/could be obtained by 'inverting'
the BE formula via $\dot{y}_n = (y_n - y_{n-1})/\Delta t$, which yields the simple extrapolation
formula
$$y^p_{n+1} = 2y_n - y_{n-1}.$$
(2) A variable step version of BDF2 will be presented in the next
chapter—Section 3.16.4—and would be preferred to that above for BE (BDF1) in the event
that a more stable method than TR is desired or required.
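A minimal Python sketch of the FE/BE steps (0) through (6) for $\dot{y} = -\lambda y$ follows. As with any such sketch, the function name `smart_be`, the default $\tau = 1/\lambda$, and the floor on $|d_n|$ are assumptions made for illustration; the BE solve is written in closed form because $f$ is linear.

```python
import math

def smart_be(lam, y0, eps, t_end, y_max=1.0, tau=None):
    """Variable-step FE/BE for dy/dt = -lam*y; returns (t, y, dt_used) per step."""
    tau = 1.0 / lam if tau is None else tau
    dt = tau * math.sqrt(eps)                    # (0) initial step size
    t, y = 0.0, y0
    out = []
    while t < t_end:
        ydot = -lam * y
        y_pred = y + dt * ydot                   # (1) forward Euler predictor
        y_new = y / (1.0 + lam * dt)             # (2) backward Euler step
        d = 0.5 * (y_new - y_pred)               # (3) local error estimate
        t += dt                                  # (4)
        out.append((t, y_new, dt))
        y = y_new
        # (5) next step size; the floor on |d| is an added guard
        dt *= math.sqrt(eps * y_max / max(abs(d), 1e-300))
    return out

be_steps = smart_be(lam=1.0, y0=1.0, eps=1.0e-4, t_end=8.0)
be_max_err = max(abs(y - math.exp(-t)) for t, y, _ in be_steps)
```

Counting `len(be_steps)` against the step count of a second-order integrator at the same tolerance gives a direct feel for why the first-order pair is usually not competitive.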
c. A model problem
It is of some interest to note that for the simple linear model problem $\dot{y} = -\lambda y$ with
$y(0) = 1$, a 'perfect' algorithm (one whose local error estimate, from $d_n = \Delta t_n^3\,\dddot{y}_n/12$ for
TR and $d_n = \Delta t_n^2\,\ddot{y}_n/2$ for BE, uses the exact solution for calculating $d_n$) would produce
timesteps that increase exponentially like
(i) TR: $\lambda\Delta t(t) = (12\varepsilon)^{1/3}\,e^{\lambda t/3}$. (2.7-87)
(ii) BE: $\lambda\Delta t(t) = (2\varepsilon)^{1/2}\,e^{\lambda t/2}$. (2.7-88)
These are the theoretical goals. It turns out that in practice the growth of $\Delta t$ is conservative
(sometimes too much so); $\Delta t$ grows more slowly than these theoretical 'goals,' partly
because true $e^{-\lambda t}$ behavior is (by design) 'lost' upon reaching $y = \varepsilon$. For $\varepsilon = 10^{-4}$ (a
typical value), $\lambda\Delta t(0) = 0.106$ from TR and $\lambda\Delta t(0) = 0.014$ for BE, which is 7.5 times
smaller. But $\Delta t$ grows faster for BE and would (in theory) catch TR in step size at
the time given by $(12\varepsilon)^{1/3}e^{\lambda t/3} = (2\varepsilon)^{1/2}e^{\lambda t/2}$, or
$\lambda t = 6\ln[(12\varepsilon)^{1/3}/\sqrt{2\varepsilon}] = 12.1$ for this
example. But $e^{-12.1} = 5 \times 10^{-6}$, so there is virtually nothing left of the transient; TR
wins, especially when the total number of steps to the cross-over point is counted: about
32 for TR and about 148 for BE. (For the 'real,' not theoretical, results; see below).
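The arithmetic quoted above is easy to reproduce. The short Python sketch below evaluates (2.7-87) and (2.7-88) at $t = 0$ and the cross-over time for a given tolerance; the function name `crossover` is a hypothetical label used only here.

```python
import math

def crossover(eps):
    """Initial theoretical step sizes (scaled by lambda) and the TR/BE cross-over time."""
    tr0 = (12.0 * eps) ** (1.0 / 3.0)       # lambda*dt(0) for TR, from (2.7-87)
    be0 = math.sqrt(2.0 * eps)              # lambda*dt(0) for BE, from (2.7-88)
    ratio = tr0 / be0                       # how much smaller the BE step starts
    lam_t_cross = 6.0 * math.log(ratio)     # lambda*t at which the step sizes match
    remaining = math.exp(-lam_t_cross)      # what is left of the transient by then
    return tr0, be0, ratio, lam_t_cross, remaining

tr0, be0, ratio, lam_t_cross, remaining = crossover(1.0e-4)
```

For $\varepsilon = 10^{-4}$ this gives values that round to the 0.106, 0.014, 7.5, and 12.1 quoted in the text.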
We will show two more simple examples before dropping the simple model
problems—the first a continuation, with more details, of the above case and the second
with two disparate time constants.
The results of the model problem $\dot{y} = -\lambda y$ are shown in two tables (Tables 2.7-1 and
2.7-2) and two figures (Figures 2.7-7 and 2.7-8) for TR and BE. The numerical ('real')
results, in the tables, are seen to be increasingly conservative as $\Delta t$ increases, as mentioned
above. The second figure shows the maximum global error ($e_{\max}$) as a function of the local
error ($\varepsilon$) for the 'real' results and verifies the theory that asserts that the former is one order
lower than the latter; i.e., if $d = c_1 h^{q+1}$ for a $q$-th order method, then $e = c_2 h^q$ is the global
error for a fixed-$\Delta t$ integration. If $d = \varepsilon$, then $e = c_2(\varepsilon/c_1)^{q/(q+1)}$, and the slopes of the
graphs are 'close' to this result: ~0.633 ($q = 2$) for TR and ~0.465 for BE ($q = 1$). Note
too, for the same $\varepsilon$, that BE's global error is approximately an order of magnitude larger
than TR's; e.g., for $\varepsilon = 10^{-4}$, $e_{\max}$(BE) = 0.0036 and $e_{\max}$(TR) = 0.00047. (Inversely,
for $e_{\max} = 10^{-3}$, TR needs $\varepsilon = 3 \times 10^{-4}$, but BE needs $\varepsilon = (7\text{-}8) \times 10^{-6}$.)
A bound on the global error for TR for the above situation (monotonically increasing
$\Delta t$) is given in Hairer and Wanner (1991) via
$y_n - y(t_n) \le c\,\Delta t_{\max}^2 \cdot \max_{0 \le t \le t_n} \dddot{y}(t)$, where
$\Delta t_{\max} = \Delta t_n$. They also give the result for non-monotonic $\Delta t$ increases.
Our last 'theoretical' demonstration of a smart integrator is the following: suppose we
have the solution
$$y = 0.5\,(e^{-\lambda_1 t} + e^{-\lambda_2 t}) \tag{2.7-89}$$
for one of the components of a pair of ODE's. If we apply the 'theoretical' TR to this
solution, then we can get a pretty good idea as to how the integrator would change $\Delta t$
through the course of the integration. Requiring $|\Delta t^3\,\dddot{y}/12| = \varepsilon$ leads to the theoretical
Table 2.7-1 AB/TR on $\dot{y} = -\lambda y$.
  n    λΔt_n     λt_n      y^p_n      y^TR_n     y(t_n)     y^TR_n - y(t_n)
  1    0.1063    0.1063    -          0.8991     0.8992     -9.01 x 10^-5
  2    0.1063    0.2125    0.8089     0.8084     0.8085     -1.62 x 10^-4
  3    0.1100    0.3225    0.7246     0.7241     0.7243     -2.26 x 10^-4
  4    0.1140    0.4366    0.6465     0.6460     0.6463     -2.81 x 10^-4
  5    0.1183    0.5549    0.5743     0.5738     0.5742     -3.29 x 10^-4
 10    0.1453    1.223     0.2944     0.2939     0.2944     -4.65 x 10^-4
 15    0.1874    2.068     0.1265     0.1260     0.1265     -4.58 x 10^-4
 20    0.2618    3.209     0.04050    0.04005    0.04040    -3.53 x 10^-4
 25    0.4233    4.944     0.00732    0.00693    0.00712    -1.94 x 10^-4
 30    0.9722    8.381     0.00043    0.00019    0.00023    -4.34 x 10^-5
Table 2.7-2 FE/BE on $\dot{y} = -\lambda y$.
   n    λΔt_n      λt_n       y^FE_n     y^BE_n     y(t_n)     y^BE_n - y(t_n)
   1    0.01414    0.01414    0.9859     0.9861     0.9860     9.77 x 10^-5
   2    0.01424    0.02838    0.9720     0.9722     0.9720     1.94 x 10^-4
   3    0.01445    0.04273    0.9583     0.9585     0.9582     2.89 x 10^-4
   4    0.01455    0.05717    0.9446     0.9448     0.9444     3.82 x 10^-4
   5    0.01465    0.07172    0.9311     0.9313     0.9308     4.74 x 10^-4
  10    0.01521    0.1461     0.8648     0.8650     0.8641     9.14 x 10^-4
  15    0.01580    0.2233     0.8010     0.8012     0.7999     1.32 x 10^-3
  20    0.01644    0.3035     0.7397     0.7399     0.7382     1.69 x 10^-3
  40    0.01963    0.6607     0.5191     0.5193     0.5165     2.82 x 10^-3
  60    0.02432    1.095      0.3379     0.3381     0.3347     3.40 x 10^-3
  80    0.03193    1.646      0.1960     0.1962     0.1927     3.43 x 10^-3
 100    0.04631    2.404      0.09306    0.09325    0.09033    2.92 x 10^-3
 120    0.08317    3.611      0.02873    0.02891    0.02702    1.89 x 10^-3
 140    0.3323     6.613      0.00165    0.00181    0.00134    4.68 x 10^-4
[Figure 2.7-7: semi-log plot of $\Delta t$ (from $10^{-2}$ to $10^{1}$) versus time (0 to 16).]
Fig. 2.7-7 TR and BE on a scalar test problem (theoretical results).
TR formula
$$\Delta t = \left(\frac{24\varepsilon}{\lambda_1^3\,e^{-\lambda_1 t} + \lambda_2^3\,e^{-\lambda_2 t}}\right)^{1/3}. \tag{2.7-90}$$
A picture of this result, for $\lambda_1 = 1$, $\lambda_2 = 10$, and $\varepsilon = 10^{-3}$, is shown in Figure 2.7-9, along
with the variations that would occur if each time constant was behaving independently.
It is seen that the smart integrator does just what it should: follow the rapidly varying
part with sufficiently small steps while it is important to do so but not thereafter. If this
was a true two-equation ODE system, then the algorithm presented above would behave
much like these theoretical results, and no TR oscillations would occur [none larger than
$O(10^{-3})$ anyway], with the solution being accurate for all $t$, without 'wasting' timesteps.
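The theoretical step-size history (2.7-90) can be tabulated with a few lines of Python. In this sketch, `dt_theory` evaluates (2.7-90) directly, while `dt_single` is an assumed reading of "each time constant behaving independently" (one exponential term of (2.7-89) kept, with its factor of 0.5); both names are hypothetical.

```python
import math

def dt_theory(t, eps=1.0e-3, lam1=1.0, lam2=10.0):
    """Theoretical TR step size for y = 0.5*(exp(-lam1*t) + exp(-lam2*t)), eq. (2.7-90)."""
    s = lam1 ** 3 * math.exp(-lam1 * t) + lam2 ** 3 * math.exp(-lam2 * t)
    return (24.0 * eps / s) ** (1.0 / 3.0)

def dt_single(t, lam, eps=1.0e-3):
    """Step size if only one exponential term (with its factor 0.5) were present."""
    return (24.0 * eps / (lam ** 3 * math.exp(-lam * t))) ** (1.0 / 3.0)

# The two single-mode curves intersect where lam1^3*exp(-lam1*t) = lam2^3*exp(-lam2*t),
# i.e. at t = ln(10)/3 for lam1 = 1, lam2 = 10.
t_star = math.log(10.0) / 3.0
history = [(t, dt_theory(t)) for t in (0.0, 0.25, 0.5, t_star, 1.0, 2.0, 3.0)]
```

Early on the combined curve hugs the fast mode's curve, and after roughly $t = \ln(10)/3$ it follows the slow mode, which is the behavior described in the paragraph above.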
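[Figure 2.7-8: log-log plot of the maximum global error, $e_{\max}$, versus the local error tolerance, with separate axis scales labeled (TR) and (BE).]
Fig. 2.7-8 Maximum global errors for TR and BE.
[Figure 2.7-9: plot of $\Delta t$ versus $t$, with $t = \ln(10)/3$ and $t = 1$ marked on the time axis.]
Fig. 2.7-9 Trapezoid rule $\Delta t$ selection.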
d. An aerospace version of TR
In some implicit CFD codes (in which the ODE's are no longer linear) developed by
and employed in the aerospace industry—typically at NASA Ames (e.g., Beam and
Warming, 1982), a 'simplified' BE or TR method is used. They use the term 'linearized
BE' (first-order) or 'linearized TR' (second-order) to describe what is also sometimes
called 'one-step Newton' —although it is also called a linearly implicit method by some
(see, for example, Hairer and Wanner, 1991). Additionally, these codes typically are
deficient by being inefficient in that they use fixed timesteps, no error control, and are actually
not guaranteed to be A-stable because of the linearizing approximations invoked. Here we
shall compare three methods for solving the explicit ODE's described by the (non-linear)
vector-valued system $\dot{y} = f(y)$: (i) the rigorous-but-somewhat costly way, (ii) the
'inefficient' way, and (iii) a reasonable compromise that might even help the large aerospace
CFD community. We shall employ the second-order method and leave the other as an
exercise. We show only the fixed $\Delta t$ version, for simplicity. Generalization via error
estimates and local timestep control is, or should be, straightforward (see below). Starting
with the TR formula, $y_{n+1} = y_n + \Delta t(\dot{y}_n + \dot{y}_{n+1})/2 = y_n + \Delta t(f_n + f_{n+1})/2$, we have:
1. Rigorous TR. Define $F(y_{n+1}) \equiv y_{n+1} - y_n - \Delta t(f_n + f_{n+1})/2 = 0$ and apply
Newton's method:
$\left.\partial F(y_{n+1})/\partial y_{n+1}\right|_{y^{(m)}_{n+1}} \cdot (y^{(m+1)}_{n+1} - y^{(m)}_{n+1}) = -F(y^{(m)}_{n+1})$,
where $y^{(0)}_{n+1} = y^p_{n+1} = y_n + \Delta t(3f_n - f_{n-1})/2$, and $m$ is the iteration index.
[$\partial F/\partial y = I - \tfrac{1}{2}\Delta t\,\partial f(y)/\partial y$ is
called the Jacobian matrix.] When $F(y^{(m)}_{n+1})$ is 'sufficiently small,' stop.
2. Aerospace method. Linearize $f_{n+1}$ in the TR formula via
$f_{n+1} = f_n + \partial f(y_n)/\partial y_n \cdot (y_{n+1} - y_n) + O(\Delta t^2)$ to obtain
$y_{n+1} = y_n + \tfrac{1}{2}\Delta t\,[2f_n + J_n(y_{n+1} - y_n) + O(\Delta t^2)]$, where $J_n \equiv \partial f/\partial y$ is also
called the Jacobian matrix. Dropping the $O(\Delta t^2)$
term gives $(I - \Delta t J_n/2)(y_{n+1} - y_n) = \Delta t f_n$, which is also sometimes called the 'delta
method.' It is important to note that if no iterations are taken (and this seems to be the
'rule') then the result is no longer TR; it is an approximation to TR. This is the linearized
TR that we call the 'aerospace method.' The Euler version of the method is also called
the linearly implicit Euler method (Hairer and Wanner, 1991).
3. One-step Newton. In fact, the linearized TR described above is also what some ODE
people call 'one-step Newton.' But it is very easy to improve this one-step Newton method
by obtaining a much better first guess, via the AB2 predictor, which is our
recommended, linearized/one-step Newton scheme: solve
$\partial F/\partial y \cdot (y_{n+1} - y^p_{n+1}) = [I - \tfrac{1}{2}\Delta t\,J(y^p_{n+1})] \cdot (y_{n+1} - y^p_{n+1}) = -F(y^p_{n+1})
= -y^p_{n+1} + y_n + \tfrac{1}{2}\Delta t\,[f_n + f(y^p_{n+1})] = \tfrac{1}{2}\Delta t\,[f_{n-1} - 2f_n + f(y^p_{n+1})]$
for $y_{n+1}$, which is (very) little more work yet much more accurate than
the 'aerospace method.'
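To make the comparison concrete, here is a small Python sketch of one fixed-$\Delta t$ step of each of the three variants, applied to a scalar test equation. The test equation $\dot{y} = -y^3$ [exact solution $y = (1 + 2t)^{-1/2}$ for $y(0) = 1$], the function names, and the Newton stopping tolerance are all assumptions chosen for illustration; the three update formulas follow items 1 to 3 above.

```python
def f(y):
    return -y ** 3           # assumed scalar test problem (not from the text)

def dfdy(y):
    return -3.0 * y ** 2     # its Jacobian (a scalar here)

def ab2_predictor(y, f_prev, dt):
    return y + dt * (3.0 * f(y) - f_prev) / 2.0

def step_rigorous(y, f_prev, dt, tol=1e-14, max_iter=50):
    """Item 1: Newton iteration on F(z) = z - y - dt*(f(y) + f(z))/2."""
    z = ab2_predictor(y, f_prev, dt)
    for _ in range(max_iter):
        F = z - y - dt * (f(y) + f(z)) / 2.0
        if abs(F) < tol:
            break
        z -= F / (1.0 - 0.5 * dt * dfdy(z))
    return z

def step_aerospace(y, dt):
    """Item 2: linearized TR, (1 - dt*J_n/2)*(y_new - y) = dt*f_n."""
    return y + dt * f(y) / (1.0 - 0.5 * dt * dfdy(y))

def step_one_newton(y, f_prev, dt):
    """Item 3: a single Newton step started from the AB2 predictor."""
    yp = ab2_predictor(y, f_prev, dt)
    rhs = 0.5 * dt * (f_prev - 2.0 * f(y) + f(yp))
    return yp + rhs / (1.0 - 0.5 * dt * dfdy(yp))

dt = 0.1
y_n = 1.0                                   # y(0)
f_prev = f((1.0 - 2.0 * dt) ** -0.5)        # f at the exact y(-dt)
y_tr = step_rigorous(y_n, f_prev, dt)
err_aero = abs(step_aerospace(y_n, dt) - y_tr)
err_newton = abs(step_one_newton(y_n, f_prev, dt) - y_tr)
```

Here the errors are measured against the converged TR solution, since both shortcuts are approximations to TR; for this sample step the predictor-started Newton step lands much closer to it.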
e. TR on advection-diffusion
We now return to the full GFEM linear ODE's governing advection and diffusion, (2.2-7)
or (2.2-25), and generalize the variable-step integrator to these coupled ODE's, which of
course introduces significant additional issues—and some 'heuristics.' First we write the
general 'corrector' step, then the general 'predictor' step, then the general 'acceleration'
update step. After that, we will return to the beginning—startup—and describe an entire
algorithm. [In the next chapter, we will extend these results to cases in which the ODE's
are non-linear—which in fact we have already just (tersely) introduced above.]
The TR applied to (2.2-7) leads to the linear system
$$\left(\frac{2}{\Delta t}M + A\right)T_{n+1} = \left(\frac{2}{\Delta t}M - A\right)T_n + 2f, \tag{2.7-91}$$
where $A \equiv N(u) + K$ is $N \times N$, and we limit, for now, the discussion to the
constant-coefficient case; in particular, the velocity is time-independent. Noting that $f - AT_n =
M\dot{T}_n$ leads to the more efficient form of the RHS,
$$\left(\frac{2}{\Delta t}M + A\right)T_{n+1} = M\left(\frac{2}{\Delta t}T_n + \dot{T}_n\right) + f, \tag{2.7-92}$$
because $\dot{T}_n$ will be available 'recursively.'
It is worth pointing out that the 'more efficient' form is also more stable, a result that
may be more important. Let us demonstrate this assertion for the simple scalar equation,
$\dot{y} = -\lambda y$. The form corresponding to (2.7-91) is $y_{n+1} = [(1 - \lambda\Delta t/2)/(1 + \lambda\Delta t/2)]\,y_n$,
followed by $\dot{y}_{n+1} = 2(y_{n+1} - y_n)/\Delta t - \dot{y}_n$, and that corresponding to (2.7-92) is
$y_{n+1} = (y_n + (\Delta t/2)\dot{y}_n)/(1 + \lambda\Delta t/2)$, followed by the same (inverted) formula for $\dot{y}_{n+1}$ (needed
for the predictor portion). If we let $x \equiv (y, \dot{y})^T$, then we can relate $x_{n+1}$ to $x_n$ via a
$2 \times 2$ matrix, $x_{n+1} = Bx_n$, where in the former case
$$B = \begin{pmatrix} \dfrac{1 - \lambda\Delta t/2}{1 + \lambda\Delta t/2} & 0 \\[2ex] \dfrac{-2\lambda}{1 + \lambda\Delta t/2} & -1 \end{pmatrix} \tag{2.7-93}$$
and in the latter case,
$$B = \frac{1}{1 + \lambda\Delta t/2}\begin{pmatrix} 1 & \Delta t/2 \\ -\lambda & -\lambda\Delta t/2 \end{pmatrix}. \tag{2.7-94}$$
The corresponding eigenvalues, $\{\mu\}$, from $Bz = \mu z$, are
$\mu = [-1,\ (1 - \lambda\Delta t/2)/(1 + \lambda\Delta t/2)]$ for the first case, and
$\mu = [0,\ (1 - \lambda\Delta t/2)/(1 + \lambda\Delta t/2)]$ for the second. Since
$x_n = B^n x_0$, the former (with eigenvalue $-1$) displays a complete lack of damping (of
roundoff errors, for example). Thus, the latter case, (2.7-92) for the AD equation, is to be
preferred.
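The effect of the eigenvalue $-1$ can be seen numerically by injecting a small perturbation into $\dot{y}_0$ and running both forms side by side. The Python sketch below does this for $\dot{y} = -\lambda y$; the function name `run`, the string labels for the two forms, and the parameter values are hypothetical choices for illustration.

```python
def run(form, lam, dt, nsteps, ydot_perturb=0.0):
    """Advance (y, ydot) with TR written as in (2.7-91) or as in (2.7-92)."""
    y, ydot = 1.0, -lam + ydot_perturb
    for _ in range(nsteps):
        if form == "2.7-91":
            # RHS built from y_n only; ydot is carried along but never fed back
            y_new = (1.0 - lam * dt / 2.0) / (1.0 + lam * dt / 2.0) * y
        else:
            # RHS built from y_n and the recursively available ydot_n
            y_new = (y + 0.5 * dt * ydot) / (1.0 + lam * dt / 2.0)
        ydot = 2.0 * (y_new - y) / dt - ydot      # inverted TR
        y = y_new
    return y, ydot

lam, dt, n, delta = 1.0, 0.2, 50, 1.0e-6
base_a = run("2.7-91", lam, dt, n)
pert_a = run("2.7-91", lam, dt, n, delta)
base_b = run("2.7-92", lam, dt, n)
pert_b = run("2.7-92", lam, dt, n, delta)
growth_a = abs(pert_a[1] - base_a[1]) / delta   # stays near 1 (eigenvalue -1)
growth_b = abs(pert_b[1] - base_b[1]) / delta   # decays with the physical mode
```

With the first form the perturbation in $\dot{y}$ simply changes sign every step and never decays; with the second it is damped at the same rate as the solution itself.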
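The general 'predictor step' is
$$T^p_{n+1} = T_n + \frac{\Delta t_n}{2}\left[\left(2 + \frac{\Delta t_n}{\Delta t_{n-1}}\right)\dot{T}_n - \frac{\Delta t_n}{\Delta t_{n-1}}\,\dot{T}_{n-1}\right], \tag{2.7-95}$$
where, of course, the 'accelerations' in the RHS are easily obtained by inverting TR;
namely, (2.7-81).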
We now present the entire algorithm and introduce a few heuristics—needed to answer
the several natural questions that arise:
Startup:
(1) Solve $M\dot{T}_0 = f - AT_0$ for $\dot{T}_0$; DSCG is the recommended solution method.
(2) Select $\Delta t_0$ via $\Delta t_0 = \tau\varepsilon^{1/3}$, where $\varepsilon$ is the relative error tolerance (e.g., $10^{-4}$), and
$\tau$ is an estimate of the initial time constant, via
$$\tau = \frac{\max(T_s,\ \max_i |T_{0,i}|)}{\max_i |\dot{T}_{0,i}|},$$
where $T_s$ is a 'user-specified' input value (needed, for example, if $T_0 = 0$). An
alternative to estimating $\tau$ is simply to select a conservatively small value for $\Delta t_0$
and watch the smart algorithm quickly (after two timesteps) recover to a more
appropriate value. Another alternative, one advocated by careful mathematicians
(e.g., R. Rannacher) who worry about 'rough data,' is to use a dissipative scheme
such as BE or BDF2 for the first step or so, because TR will not 'algorithmically'
damp noisy data, a subject that we shall return to at the end of this section.
(3) Take the first TR step; i.e., solve
$$\left(\frac{2}{\Delta t_0}M + A\right)T_1 = M\left(\frac{2}{\Delta t_0}T_0 + \dot{T}_0\right) + f.$$
(4) Invert TR to get the required AB2 data:
$\dot{T}_1 = 2(T_1 - T_0)/\Delta t_0 - \dot{T}_0$.
General Step: With $\Delta t_1 = \Delta t_0$, for $n = 1, 2, \ldots$, do:
(1) $T^p_{n+1} = T_n + \tfrac{1}{2}\Delta t_n[(2 + \Delta t_n/\Delta t_{n-1})\dot{T}_n - (\Delta t_n/\Delta t_{n-1})\dot{T}_{n-1}]$.
(2) Solve $(2M/\Delta t_n + A)\,T_{n+1} = M(2T_n/\Delta t_n + \dot{T}_n) + f$.
(3) $\dot{T}_{n+1} = 2(T_{n+1} - T_n)/\Delta t_n - \dot{T}_n$.
(4) $d_n = (T_{n+1} - T^p_{n+1})/[3(1 + \Delta t_{n-1}/\Delta t_n)]$.
(5) $t_{n+1} = t_n + \Delta t_n$.
(6) $\Delta t_{n+1} = \Delta t_n\,(\varepsilon/\|d_n\|)^{1/3}$.
(7) Go to (1) unless it is time to STOP.
The choice of norm, $\|\cdot\|$, deserves some discussion. Following the lead of ODE
general-purpose software designers, we would generally opt for a properly weighted RMS
norm, i.e., a relative root mean square norm. (Recall: $\|\cdot\|_{RMS} \equiv \|\cdot\|_2/\sqrt{N}$, where $\|\cdot\|_2$
denotes the $L_2$-norm. Division by $N$ of the square gives the mean square; etc.) A 'properly
weighted' norm will be dimensionless and well-scaled; cf. (2.7-80) for the scalar ODE.
A good choice of scale factor for the $i$-th entry of $d_n$, $d_{n,i}$, $i = 1, 2, \ldots, N$, is $|T_{n+1,i}| + T_s$,
where $T_s$ is the user-supplied estimate (minimum expected value, in case $|T_{n+1,i}|$ is close
to zero) discussed above. This leads to
$$\|d_n\|^2 = \frac{1}{N}\sum_{i=1}^{N}\left[\frac{d_{n,i}}{|T_{n+1,i}| + T_s}\right]^2, \tag{2.7-96}$$
as the generally suggested relative RMS norm.
Remarks:
(1) The all-important, user-specified tolerance parameter, $\varepsilon$, can obviously have a
significant effect on both cost and accuracy. Too large an $\varepsilon$ can cause: (i) inaccuracy, (ii) TR
oscillations, and (iii) a 'weakened' theory (no longer in the asymptotic range). Too
small an $\varepsilon$ will merely cause the simulation to be excessively expensive, seeking
more accuracy than the spatial grid 'deserves.' Recommended values, at least to
start: $10^{-4} \le \varepsilon \le 10^{-3}$; experiment!
(2) Because we are really trying to solve PDE's and not 'just' ODE's, there may be
times when a maximum (L°°) norm is actually preferable—so as to better (or, at
least, more easily) capture the behavior near a spatial singularity, for example; i.e.,
if most of the error comes from one small region of the domain, the above norm
may not be sufficiently 'sensitive' to the potentially locally large error. Thus, an
alternate norm that may sometimes be useful is
$$\|d_n\|_\infty = \max_i \frac{|d_{n,i}|}{|T_{n+1,i}| + T_s}, \tag{2.7-97}$$
wherein the selected $i$ should be 'printed' (or saved), although a reasonable argument
might be made that (2.7-96) with a tighter tolerance (smaller $\varepsilon$) might serve nearly
as well.
(3) The $\tau$ estimate given in step (2) of 'Startup' is a simple 'heuristic' that tries to
generalize from a scalar ODE to a system, and is not guaranteed to be conservative;
others are surely possible, such as $\tau = \max_i(T_s, |T_{0,i}|)/\max_i |\dot{T}_{0,i}|$.
(4) The following relationships between norms may be useful:
$$\|\cdot\|_\infty/\sqrt{N} \le \|\cdot\|_{RMS} \le \|\cdot\|_\infty \le \|\cdot\|_2 = \sqrt{N}\,\|\cdot\|_{RMS} \le \sqrt{N}\,\|\cdot\|_\infty,$$
a manifestation of the fact that all norms are equivalent in a finite-dimensional space.
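The two weighted norms, (2.7-96) and (2.7-97), are simple to code. The Python sketch below uses plain lists; the function names and the sample vectors are hypothetical, and the max-norm routine also returns the index $i$ at which the maximum occurs so that it can be 'printed' as suggested.

```python
import math

def rel_rms_norm(d, T_new, T_s):
    """Relative RMS norm of the error vector d, eq. (2.7-96)."""
    n = len(d)
    total = sum((di / (abs(Ti) + T_s)) ** 2 for di, Ti in zip(d, T_new))
    return math.sqrt(total / n)

def rel_max_norm(d, T_new, T_s):
    """Relative max norm, eq. (2.7-97); returns (value, index of the worst entry)."""
    scaled = [abs(di) / (abs(Ti) + T_s) for di, Ti in zip(d, T_new)]
    i_worst = max(range(len(scaled)), key=scaled.__getitem__)
    return scaled[i_worst], i_worst

# Sample data: all of the error sits in one entry (a 'local' error).
d = [1.0e-4, 0.0, 0.0, 0.0]
T_new = [1.0, 1.0, 1.0, 1.0]
rms = rel_rms_norm(d, T_new, T_s=1.0e-3)
mx, i_worst = rel_max_norm(d, T_new, T_s=1.0e-3)
```

In this sample the RMS norm equals the max norm divided by $\sqrt{N}$, the extreme case of the first inequality in Remark (4), which illustrates how a localized error is diluted by the RMS norm.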
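The 'improved' (less round-off error) version of the above 'general step' presented
earlier [(2.7-83) through (2.7-85)] for the single ODE, goes like:
(i) Replace Step (2) by: solve for $\delta T$ from
$$\left(\frac{2}{\Delta t_n}M + A\right)\delta T = f - AT^p_{n+1} - M\dot{T}^p_{n+1},$$
where $\dot{T}^p_{n+1} = (1 + \Delta t_n/\Delta t_{n-1})\dot{T}_n - (\Delta t_n/\Delta t_{n-1})\dot{T}_{n-1}$ is computed first,
(ii) Replace Step (3) by
$$\dot{T}_{n+1} = \dot{T}^p_{n+1} + \frac{2}{\Delta t_n}\,\delta T.$$
(iii) Set $T_{n+1} = T^p_{n+1} + \delta T$.
(iv) Return to Step (4).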
Except for a few (crucial) 'details,' the description of the smart integrator is complete.
The rest is 'simply' linear algebra, in which reside the crucial details, which we now
partially reveal via a list of questions:
Q1. Noting that the TR step generally involves the formation and solution of a new
linear system (different matrix) for each timestep, we ask: Should we really change
$\Delta t$ at every step? Related to this is:
Q2. Letting DTSF $\equiv \Delta t_{n+1}/\Delta t_n$ (Delta T Scale Factor), should we treat DTSF < 1
differently than DTSF > 1? Note that the former will generally only occur with
variable-coefficient problems (addressed below), and the latter will be 'the rule' for
most AD simulations, especially diffusion-dominated ones; smaller steps are usually
required at the beginning of a simulation rather than later.
Q3. Should we limit the magnitude of DTSF in general, e.g., $0.2 \le$ DTSF $\le 2$? That
is, stop with a 'Warning Message' if these limits are exceeded (after Step (2), to
permit initially large changes, up or down, sometimes needed to 'correct' a poor
choice of $\Delta t_0$).
But before answering these questions, we will 'generalize' the problem somewhat,
which will raise even more questions. Then we shall try to answer all of them. Thus far
we have assumed u(x) is constant in time and that any boundary conditions and/or source
terms were time-independent. Let us now generalize to perhaps a more typical case:
$u(x, t)$ is time-dependent and, usually, supplied by a GFEM solution of the Navier-Stokes
equations on the same mesh. It is then just as convenient to permit the source
term and BC's to vary with time, so that (2.7-92) generalizes to
$$\left(\frac{2}{\Delta t_n}M + A_{n+1}\right)T_{n+1} = M\left(\frac{2}{\Delta t_n}T_n + \dot{T}_n\right) + f_{n+1} \equiv b, \tag{2.7-98}$$
which leads to the obvious simple changes in the above algorithm: replace $f$ and $A$ by $f_n$
and $A_n$ with the appropriate value of $n$ in Steps (1) and (3) of 'Startup' and in Step (2)
of 'General Step.'
The most important observation to be made in this more general case is that A(t)
does change at each step whether or not we actually change At, so that a matrix update
seems to be absolutely required at each timestep. And this is true—as it stands. Before
addressing this issue further, however, let us raise a few more key questions:
Q4. Since variable coefficients could conceivably change rapidly (thus 'surprising' the
algorithm), at what lower limit on DTSF should we reject the result and repeat the
current step at the (significantly) smaller $\Delta t$?
Q5. Should we 'test for stiffness,' as in good, general ODE software [such as LSODA,
in Hindmarsh (1983); see also Sepehrnoori and Carey (1981)], and switch to a
more efficient method when non-stiff, or should we simply retain the algorithm as
presented (which 'presumes' that stiffness is always present)?
Q6. Suppose one wishes (for some strange reason) to lump the mass, or selects an FDM,
which (effectively) converts the ODE's from 'implicit' to 'explicit.' Are there then
more-efficient ODE methods (e.g., non-stiff) that should be considered? {It is a
somewhat ironic fact that mass lumping is more deleterious for advection problems
than for diffusion problems in that mass lumping for advection is less accurate yet
yields the 'more appropriate' (and non-stiff) explicit ODE's [$\dot{y} = M^{-1}f(y) \equiv g(y)$]
for which a nearly constant $\Delta t$ and an explicit method are often appropriate; yet
it is the diffusion equation that generates stiff ODE's and 'demands' stiff (implicit)
ODE methods whether or not mass lumping is invoked; i.e., consistent mass for
advection generates implicit ODE's [$M\dot{y} = f(y)$], and lumped mass for diffusion
generates explicit ODE's. C'est la vie.}
Q7. Last, but certainly not least: how are we going to solve the linear systems?
Enough questions. Time for some answers—mixed with guesses. Since the answer to
some of the early questions depends in a crucial way on the answer to the last question,
we answer it first.
A7. On most modern computers it is probably safe to say that the best way to solve
the linear systems for all but the largest 2D simulations and for 'small' 3D
simulations will turn out to be Gaussian elimination (direct method) in one form or
another—usually via an LU decomposition (see Volume II)—at least for those
cases (probably the majority) in which it is not required to form a new LHS matrix
at every timestep. For extremely large 2D problems and most 3D problems, the
preferred solution method is iterative rather than direct—the reason being that the
direct methods can usually only be 'best' when both L and U can be stored in
'memory.' If the problem is too large for in-core storage of L and U, then the
relatively low storage requirements of iterative methods usually promote them
into first place. The iterative method that we have in mind at this point is one
of the simplest: DSCGS (diagonally scaled conjugate gradient squared). Diagonal
scaling is particularly simple and should be generally effective. [If more
sophisticated 'preconditioners' are to be employed, such as ILU (incomplete LU), then
the strategy discussed below would be subject to change because then the cost of
applying the preconditioner would not be 'negligible,' as it is with diagonal scaling.]
A1. This issue is noticeably different depending on the 'solver' selected, so we answer
it in two parts, first for the DSCGS method: yes. If a different preconditioned
iterative method is used, the 'yes' becomes a maybe, depending in part on the cost
of applying the preconditioner. For the LU (direct) method, however, a few special
'tricks' are very worthwhile. We describe two LU algorithms, with the latter largely
borrowed from ODE 'software.'
The first algorithm is simple and effective, but perhaps a bit naive (relative to the
second); and it is this:
(1) Compute $\Delta t_{n+1}$ as above for each step.
(2) If DTSF $\le 0.8$, then reject the current solution and re-compute $T_{n+1}$ using the
smaller step.
(3) If 0.8 < DTSF < 1.0, then do not change the step size. This of course permits the
re-use of the factored matrix, $((2/\Delta t_n)M + A_{n+1})$, via forward-reduction and
back-substitution. {A 're-solve' is generally much cheaper than the first solve since the
cost of performing the LU decomposition is high relative to the re-solution [the ratio
is $O(N)$, where $N$ is the length of the $T$-vector].}
(4) If $1.0 \le$ DTSF $\le 1.5$ and if this occurs four times in a row (to detect and verify a
trend), then change $\Delta t$ and re-factor the matrix with the new $\Delta t$.
(5) If DTSF > 1.5, then change $\Delta t$ and re-factor the matrix; a 50% or more increase
in step size is enough to make the change worthwhile.
Remarks:
(1) The scalar parameters (0.8, 1.5, 4) are just suggestions/rules of thumb. Perhaps
others would be better; the key piece of advice is that it is not cost-effective
to re-factor at every step.
(2) For the limiting case of pure diffusion, the above algorithm can deliver good results
out to steady state in as few as 50 timesteps, or even fewer, because $\Delta t$ grows rapidly.
(3) For pure advection and 'uniform' flow, the algorithm may change $\Delta t$ only very
rarely, thus giving an implicit scheme that costs about the same as an explicit
scheme in that the latter would factor the mass matrix (LU) once and for all, and
each timestep would be a back-substitution.
(4) For moderate Pe, the algorithm still works well, often increasing $\Delta t$ moderately
during the simulation. We shall show sample results at the end of the chapter.
(5) As pointed out earlier in our discussion of model (scalar) problems, the growth rate
of $\Delta t$ (vs $t$) may occasionally tend to 'stall'; i.e., grow more slowly than you think
it should, especially on problems that are approaching the steady state. C'est la vie.
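The decision logic of the first LU algorithm can be sketched in a few lines of Python. This is only an illustration under stated assumptions: DTSF is taken to be the ratio $\Delta t_{n+1}/\Delta t_n$ of the proposed step to the current one, the step adopted after a four-step trend is taken to be the most recent proposal, and the names `StepDecision` and `first_lu_heuristic` are hypothetical. The thresholds are the rules of thumb quoted above.

```python
from dataclasses import dataclass


@dataclass
class StepDecision:
    accept: bool      # keep the solution just computed for T_{n+1}?
    refactor: bool    # form and LU-factor the matrix again before the next solve?
    dt_next: float    # step size to use for the next solve
    streak: int       # consecutive steps seen so far with 1.0 <= DTSF <= 1.5


def first_lu_heuristic(dtsf, dt, streak, reject_at=0.8, jump_at=1.5, trend_len=4):
    """Apply rules (2)-(5) of the first LU algorithm for one completed step."""
    dt_proposed = dtsf * dt  # assumption: DTSF = dt_{n+1} / dt_n
    if dtsf <= reject_at:
        # Rule (2): reject the step and redo it with the smaller step size.
        return StepDecision(False, True, dt_proposed, 0)
    if dtsf < 1.0:
        # Rule (3): keep the step size, so the stored L and U can be re-used.
        return StepDecision(True, False, dt, 0)
    if dtsf <= jump_at:
        # Rule (4): only change the step once a trend has been verified.
        streak += 1
        if streak >= trend_len:
            return StepDecision(True, True, dt_proposed, 0)
        return StepDecision(True, False, dt, streak)
    # Rule (5): a 50% or larger increase is worth a new factorization.
    return StepDecision(True, True, dt_proposed, 0)
```

A driver would call this once per step, carrying `streak` from one call to the next and re-factoring only when `refactor` is true.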
This answers Ql for the first LU algorithm. We now describe a second algorithm also
based on direct solvers—a more sophisticated method that is more closely aligned with
what is done in, for example, LSODE (Radhakrishnan and Hindmarsh, 1993). The first
thing we do is to significantly modify the TR solution of (2.7-98); and this we do by
invoking a Newton-(or Newton-Raphson-) like strategy. (Yes, a Newton method on a
linear system of algebraic equations! Bear with us.) Rearranging (2.7-98) as
$F(T_{n+1}) = ((2/\Delta t_n)M + A_{n+1})\,T_{n+1} - b = 0$ and applying Newton's method leads to
$$J_m\left(T_{n+1}^{(m+1)} - T_{n+1}^{(m)}\right) = -F\left(T_{n+1}^{(m)}\right), \tag{2.7-99}$$
where $J_m = \partial F(T_{n+1}^{(m)})/\partial T_{n+1}^{(m)}$ is the Jacobian matrix and $m$ is the iteration index; i.e.,
$$\left(\frac{2}{\Delta t_n}M + A_{n+1}\right)\left(T_{n+1}^{(m+1)} - T_{n+1}^{(m)}\right) = b - \left(\frac{2}{\Delta t_n}M + A_{n+1}\right)T_{n+1}^{(m)}, \tag{2.7-100}$$
for $m = 0, 1, 2, \ldots$, and $T_{n+1}^{(0)} = T_{n+1}^{p}$. One might, at this point, question our sanity
since (as is well known and easily shown) the above iteration scheme always converges
exactly in one iteration; $T_{n+1}^{(1)}$ is the solution of (2.7-100). The trick (strategy is probably a
better word) that makes it useful for our purposes is this: use a Jacobian matrix that is not
current; it is an out-of-date Jacobian from some earlier time, say $J_O = (2/\Delta t_O)M + A_O$,
where O stands for 'old.' Thus we have a modified Newton method (or chord method)
and now do have a reason to iterate on
$$J_O\left(T_{n+1}^{(m+1)} - T_{n+1}^{(m)}\right) = b - \left(\frac{2}{\Delta t_n}M + A_{n+1}\right)T_{n+1}^{(m)}, \tag{2.7-101}$$
for $m = 0, 1, \ldots$, with $T_{n+1}^{(0)} = T_{n+1}^{p}$.
Here is how it works, in general: at some earlier time, we formed and factored our old
Jacobian (J0 = LU) and saved L and U. Thus, at all later times, the iterations defined
by (2.7-101) represent the 'cheap' part of direct solution methods; one forward reduction
and one back-substitution. The iterations are stopped when
$$\left\|T_{n+1}^{(m+1)} - T_{n+1}^{(m)}\right\| \le \gamma\varepsilon, \tag{2.7-102}$$
where $\|\cdot\|$ is as in (2.7-96), $\varepsilon$ is the relative error tolerance parameter introduced
above, and $\gamma$ is a 'safety' factor chosen so that our approximate solution of (2.7-98)
is close enough to the true solution that the error caused by the difference does not
contaminate our local time truncation error estimate ($d_{n+1}$); $\gamma = 0.1$ is typical. When all
is well, only a few iterations will be required—typically < 1.5, on average (A. Hindmarsh,
personal communication). In fact, only when convergence becomes too slow (too many
iterations) do we update the Jacobian. Finally, we point out that most of the 'heuristics'
of the first algorithm apply here as well—as do some new ones; namely,
(1) If $|\Delta t_{n+1}/\Delta t_O - 1.0| \ge 0.3$, then refactor the matrix based on $\Delta t_{n+1}$.
(2) If the modified Newton method fails to pass the convergence test, (2.7-102),
after three or four iterations, then stop the iterations, update the matrix, form
its LU decomposition, and start the timestep over (same $\Delta t$).
(3) If DTSF $\le$ 0.8, then reject the step, as before.
(4) If 0.8 < DTSF < 1.0, then do not change the step size, as before.
(5) If 1.0 < DTSF $\le$ 1.3, then follow Step (4) above, to prevent frivolous
updates.
(6) Finally, if DTSF > 1.3, then update the matrix (with the new $\Delta t$), form its
LU decomposition, and go to the next step. (This is to keep the number of
iterations low.)
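The chord iteration (2.7-101) can be illustrated on a small model problem. The sketch below makes several assumptions that are not in the text: the problem is 1D pure diffusion with linear elements and Dirichlet ends (so the matrices are tridiagonal), the norm of (2.7-96) is replaced by a relative RMS norm, and all function names are hypothetical. The 'old' matrix is factored once; each iteration then costs one matrix-vector product plus one forward reduction and back-substitution.

```python
import math


def tri_factor(lo, di, up):
    """LU-factor a tridiagonal matrix without pivoting.

    lo[i], di[i], up[i] are the sub-, main- and super-diagonal entries of
    row i (lo[0] and up[-1] are unused).
    """
    n = len(di)
    mult, piv = [0.0] * n, list(di)
    for i in range(1, n):
        mult[i] = lo[i] / piv[i - 1]
        piv[i] = di[i] - mult[i] * up[i - 1]
    return mult, piv, list(up)


def tri_solve(factors, rhs):
    """Forward reduction and back-substitution with stored factors."""
    mult, piv, up = factors
    n = len(piv)
    y = list(rhs)
    for i in range(1, n):
        y[i] -= mult[i] * y[i - 1]
    x = [0.0] * n
    x[-1] = y[-1] / piv[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - up[i] * x[i + 1]) / piv[i]
    return x


def tri_matvec(lo, di, up, x):
    """Product of a tridiagonal matrix with a vector."""
    n = len(di)
    out = []
    for i in range(n):
        s = di[i] * x[i]
        if i > 0:
            s += lo[i] * x[i - 1]
        if i < n - 1:
            s += up[i] * x[i + 1]
        out.append(s)
    return out


def rms(v):
    return math.sqrt(sum(a * a for a in v) / len(v))


def chord_iterate(J_now, old_factors, b, T_start, eps, gamma=0.1, max_iter=4):
    """Iterate J_old * dT = b - J_now * T until the update is small enough."""
    T = list(T_start)
    for m in range(1, max_iter + 1):
        JT = tri_matvec(*J_now, T)
        dT = tri_solve(old_factors, [bi - ji for bi, ji in zip(b, JT)])
        T = [t + d for t, d in zip(T, dT)]
        # Assumed stand-in for the norm test (2.7-102): relative RMS change.
        if rms(dT) <= gamma * eps * max(rms(T), 1e-300):
            return T, m, True
    return T, max_iter, False  # caller should re-factor and restart the step


def heat_jacobian(n, h, kappa, dt):
    """(2/dt) M + A for 1D linear elements, pure diffusion, n interior nodes."""
    d = (2.0 / dt) * (4.0 * h / 6.0) + 2.0 * kappa / h
    o = (2.0 / dt) * (h / 6.0) - kappa / h
    return [o] * n, [d] * n, [o] * n
```

When `chord_iterate` returns `False` as its third value, rule (2) above applies: update and re-factor the matrix, then repeat the step with the same $\Delta t$.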
A2. For iterative solvers or for the modified Newton method: no. For the other
LU method, the answer is contained in A1.
A3. Yes—in all cases—to catch 'glitches.'
A4. DTSF = 0.8 is reasonable—but not sacrosanct.
A5.&A6. In general: no; the non-stiff portions of most simulations (except pure advection
or Pe $\gg$ 1) will be a fairly small fraction of the total simulation time
so that using the 'stiff method' for the entire simulation is usually not very
wasteful. If, though, pure advection or highly advection-dominated
simulations are more common than others, and if you wish to use lumped mass
(and rather more node points, to make up for the accuracy loss), then a non-
stiff method based on functional iteration may be more appropriate. Called
Adams-Bashforth-Moulton predictor-corrector methods for explicit ODE's,
the AB2/TR version of same, for
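$$\dot T = M_L^{-1}\left[f(t) - A(t)\,T\right], \tag{2.7-103}$$
where $M_L$ is a diagonal matrix, goes like:
(1) Use the 'standard' AB2 predictor, from (2.7-95).
(2) Write the TR as
$$T_{n+1} = T_n + \frac{\Delta t}{2}\left[\dot T_n + M_L^{-1}\left(f_{n+1} - A_{n+1}T_{n+1}\right)\right] = b - \frac{\Delta t}{2}M_L^{-1}A_{n+1}T_{n+1} \tag{2.7-104}$$
and 'solve' it via functional iteration instead of linear algebra; namely,
for $m = 0, 1, \ldots$, do
$$T_{n+1}^{(m+1)} = b - \frac{\Delta t}{2}M_L^{-1}A_{n+1}T_{n+1}^{(m)}, \quad \text{with } T_{n+1}^{(0)} = T_{n+1}^{p}, \tag{2.7-105}$$
until
$$\left\|T_{n+1}^{(m+1)} - T_{n+1}^{(m)}\right\| \le 0.1\,\varepsilon. \tag{2.7-106}$$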
(3) If the total number of functional iterations becomes too large (more than,
say, five), the problem is probably becoming 'stiff,' and the method
should be dropped in favor of those discussed above. If this occurs,
though, the 'assumption' of hyperbolic behavior may have been wrong.
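A minimal sketch of the functional iteration (2.7-105) with the stopping test (2.7-106) follows. It assumes a relative RMS norm in place of (2.7-96), and the names `functional_iteration` and `apply_MLinv_A` (a user-supplied function returning $M_L^{-1}A_{n+1}T$) are hypothetical. A return flag of `False` signals that the iteration budget was exhausted, which step (3) interprets as the onset of stiffness.

```python
import math


def rel_change(new, old):
    """Size of the update relative to the size of the new iterate."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(new, old)))
    den = math.sqrt(sum(a * a for a in new))
    return num / max(den, 1e-300)


def functional_iteration(apply_MLinv_A, b, T_pred, dt, eps, max_iter=5):
    """Iterate T <- b - (dt/2) * ML^{-1} A T, starting from the predictor."""
    T = list(T_pred)
    for m in range(1, max_iter + 1):
        AT = apply_MLinv_A(T)
        T_new = [bi - 0.5 * dt * ai for bi, ai in zip(b, AT)]
        change = rel_change(T_new, T)
        T = T_new
        if change <= 0.1 * eps:
            return T, m, True
    return T, max_iter, False  # too many iterations: treat the problem as stiff


# Illustrative 2x2 skew (advection-like) operator; the values are made up.
def skew(T):
    return [T[1], -T[0]]


T, iters, converged = functional_iteration(skew, [1.0, 0.0], [1.0, 0.0],
                                           dt=0.2, eps=1e-3, max_iter=20)
```

The iteration contracts only when $(\Delta t/2)$ times the spectral radius of $M_L^{-1}A_{n+1}$ is below one, which is why it suits non-stiff (advection-dominated) problems and fails quickly on stiff ones.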
Remarks:
(1) In ODE jargon, the above method is also referred to as P(EC)$^m$: predict; evaluate
(the RHS), correct, evaluate, correct, ... $m$ times; i.e., until (2.7-106) is satisfied.
(2) Some (naive) predictor-corrector methods do not use error control to determine the
number of iterations, but instead take one predictor and one corrector per timestep.
To see what kind of troubles this can cause for non-linear ODE's (hint: spurious
solutions, non-linear dynamical systems, strange attractors), see Griffiths (1988) and
the following papers by H. Yee, who properly chastises/admonishes some portion (at
least) of the aerospace industry [cf. our discussion of the 'aerospace method' above
(2.7-91)]: Yee et al. (1991), Yee and Sweby (1995a), and Lafon and Yee (1996); see
too Yee and Sweby (1994, 1995b).
For 'completeness,' we present a consistent, 'ODE-style' $\delta$-form of the second
LU-algorithm [(2.7-99) through (2.7-102)]. It consists of updating $T_{n+1}^{(m)}$ and $\dot T_{n+1}^{(m)}$
'simultaneously' and uses the form of the algorithm shown below (2.7-97); i.e., replace
(2.7-101) by
$$\left(\frac{2}{\Delta t_O}M + A_O\right)\delta T^{(m+1)} = f_{n+1} - A_{n+1}T_{n+1}^{(m)} - M\dot T_{n+1}^{(m)}, \tag{2.7-107}$$
followed by
$$T_{n+1}^{(m+1)} = T_{n+1}^{(m)} + \delta T^{(m+1)} \quad\text{and}\quad \dot T_{n+1}^{(m+1)} = \dot T_{n+1}^{(m)} + \frac{2}{\Delta t_n}\,\delta T^{(m+1)}, \tag{2.7-108}$$
and we remark that there does not appear to be any compelling reason to choose one of
these over the other; they are mathematically equivalent.
Remark:
B. Finlayson and students have also developed some smart integrators that are similar
to ours—but different; see Jensen (1980), Finlayson (1985; Section 3.4 in Aiken, 1985),
Josse and Finlayson (1984), and Jensen and Finlayson (1986).
f. The smoothing property
To close this discussion of implicit ODE methods, we turn briefly to a limiting case, the
transient heat equation (u = 0), which has received very much mathematical analysis.
(Some, but probably not all, of what we summarize below also applies for $u \ne 0$.) This
parabolic equation has a powerful 'smoothing property' that actually permits rather 'wild'
initial data to be contemplated, and leads to some quite technical analyses regarding the
solution's behavior/regularity as $t \downarrow 0$. In fact, from Wahlbin (1980), we quote, 'Note that
even if $v$ is only in $L_1$, say, $u(\cdot, t)$ is infinitely differentiable for $t > 0$; this is the parabolic
smoothing property ...,' where $v$ is the IC for the 1D transient heat equation, $u(\cdot, t)$ is its
solution, and $L_1$ is the function space such that $\int_\Omega |v|\,dx < \infty$, which can even include
Dirac delta functions (which functions are not permitted in $L_2$). The 'heat' equation is a
really strong smoother, and its solution, for $t \downarrow 0$, can thus obviously be quite difficult to
simulate for finite h and finite At. Should the numerical time-integration method used to
integrate the finite element approximation to the heat equation also possess this powerful
damping behavior? This is a good question and the answer is not simple—nor even, we
believe, uniformly agreed upon by the 'experts.' For example, Rannacher (1984) believes
that TR—being the 'delicate' integrator that it is—should generally not be used 'in
isolation'; rather, it should be 'assisted' by a little bit of a dissipative method (e.g., two
to four steps of BE)—at t = 0+ if only the initial data are 'rough' (but still in L2), and
later too if the source terms are rough (also in L2 but time-dependent). [In fact, both the
FIDAP code (Fluid Dynamics International, 1993) and the Nachos code (Gartling, 1987)
are 'hard-wired' to do just that.] He also states that TR alone with rough initial data can
be very bad: 'For rough data the global order of convergence may reduce even to $o(1)$, in
the extreme case.' That is to say, convergence will occur but not at any predictable rate.
While the above may indeed be excellent advice in general, we believe that there is
another side to the story/argument: smart integrators with variable $\Delta t$, even if 'only'
A-stable, like TR, can probably usually deal rather well with rough data, especially
if/when it is presumed that the rough data are introduced intentionally; i.e., that one
is really seeking the time-accurate solution to the rough data problem. (Why otherwise
introduce rough data?) It is then, of course, true that the initial $\Delta t$ will be (and should
be) quite small, probably $O(\Delta x^2/\kappa)$, as pointed out in Luskin and Rannacher (1982); but
$\Delta t$ can and should grow rapidly as the initial sharp transient 'diffuses away.' In fact, if
a variable-step BE method were applied to the same rough data problem that presumably
causes trouble for TR, both the initial step size and all subsequent ones would be rather
smaller (with a more expensive but not more accurate result) than those selected by TR
for the same specified accuracy.
In fact, to test the above 'hypothesis,' we [with the able help of A. Hindmarsh and his
LSODI code (Hindmarsh, 1983), restricted for this problem to the second-order implicit
Adams method (TR)] repeated the numerical experiments reported in Rannacher (1982)
in which, for an IC, the (discrete) $L_2$-projection of a Dirac delta function was placed
at $x = 0.5$ and the 1D heat equation on $0 \le x \le 1$ solved via linear finite elements on a
uniform mesh with homogeneous Dirichlet BC's. [The above IC translates to $MT_0 = e_{N/2}$,
where $e_{N/2}$ is the unit vector, with zeros in all entries except $N/2$ (we take $N$ even) where
it contains 1.0, which comes from $\int_\Omega \phi_i(x)\,\delta(x - 0.5)\,dx = 1.0$ for $i = N/2$; this is very
rough data. For $N = 160$, it gives $T_0(N/2) = 277$ and $T_0(N/2 \pm 1) = -74.3$; the signs
alternate and the successive nodal amplitudes decrease in the ratio $2 - \sqrt{3} = 0.27$. For
$N = 80$, the corresponding values are $\approx 139$ and $\approx -37.1$, very close to linear with $N$, as
is the norm of each eigenvector.] We first integrated the associated ODE's via variable-step
TR with a tolerance ($\varepsilon$) selected to give virtually the same accuracy reported by R.
Rannacher, who used four BE steps and the rest TR, all at a fixed $\Delta t$ ($= 2h$). The results
are as follows: for virtually the same number of steps, the smart TR delivers accurate
results over the entire time interval: e.g., for $N = 80$, Rannacher took $\Delta t = 1/40 = 0.025$,
integrated from $t = 0$ to $t = 1$ ($\kappa = 1/\pi^2$) with four BE and 36 TR steps to get $T(0.5, 1) = 0.7371$,
whereas LSODI started with $\Delta t = 0.00002$, finished in 63 steps at $\Delta t = 0.38$
to get $T(0.5, 1) = 0.7351$; the 'exact' result is 0.7361. [N.B. Rannacher's results are
not accurate at small time, intentionally. Also note that the really exact delta-function
solution, $T(x, t) = e^{-(x-0.5)^2/4\kappa t}/\sqrt{4\pi\kappa t}$, gives $T(0.5, 1.0) = \sqrt{\pi/4} = 0.8862$, whereas we
followed Rannacher by setting homogeneous Dirichlet BC's; the true value at $x = 0, 1$
and $t = 1$ is $\sqrt{\pi/4}\,e^{-\pi^2/16} = 0.4782$.] The smoothing property of the heat equation can
be readily duplicated via TR if the 'proper' step sequence is taken, a sequence easily
realized with a smart integrator.
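The rough initial data quoted above can be reproduced by solving $MT_0 = e_{N/2}$ directly. The sketch below assumes the consistent mass matrix of linear elements on a uniform mesh, $(h/6)\,\mathrm{tridiag}(1, 4, 1)$, with the two Dirichlet end nodes removed; the function names are hypothetical. Its output can be compared with the nodal values and the decay ratio $2 - \sqrt{3}$ given in the text.

```python
def solve_tridiagonal(lo, di, up, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting)."""
    n = len(di)
    piv, y = list(di), list(rhs)
    for i in range(1, n):
        w = lo[i] / piv[i - 1]
        piv[i] -= w * up[i - 1]
        y[i] -= w * y[i - 1]
    x = [0.0] * n
    x[-1] = y[-1] / piv[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - up[i] * x[i + 1]) / piv[i]
    return x


def delta_projection(N):
    """Solve M T0 = e_{N/2} for linear elements on N equal elements of [0, 1]."""
    h = 1.0 / N
    n = N - 1                     # interior nodes 1..N-1 (Dirichlet ends removed)
    off, diag = h / 6.0, 4.0 * h / 6.0
    rhs = [0.0] * n
    rhs[N // 2 - 1] = 1.0         # node N/2 sits at x = 0.5
    return solve_tridiagonal([off] * n, [diag] * n, [off] * n, rhs)


T0 = delta_projection(160)
mid = 160 // 2 - 1
# Peak value, its neighbor, and the ratio of successive nodal amplitudes.
print(T0[mid], T0[mid + 1], T0[mid + 2] / T0[mid + 1])
```

Calling `delta_projection(80)` gives the corresponding $N = 80$ values, which makes the near-linear scaling with $N$ easy to confirm.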
The above calculations used a rather crude $\varepsilon = 0.01$. To see further how a smart
integrator behaves with a more typical tolerance, we repeated the above experiment for
$\varepsilon = 5 \times 10^{-4}$ and $N = 160$, and we also repeated it with (variable-step) BE all the way.
The results are shown in Table 2.7-3.
Noteworthy are the following points.
1. TR does, on average, match $\Delta t$ to $h$ (164 timesteps for $N = 160$).
2. BE is quite inefficient by comparison.
3. Most of the effort goes toward small time accuracy.
4. The minimum time of believability (Section 2.6.2c) is about $0.1h^2/\kappa = 0.1(\pi/160)^2 = 4 \times 10^{-5}$,
which 'consumed' about 20 TR steps and 84 BE steps; i.e., these steps are
necessarily 'wasted' in order to obtain accurate results at later time.
5. The small difference in the 'exact' solutions at $t = 1$ between $N = 80$ and $N = 160$
is interesting, and explainable: the projection of the initial data vector onto the first
eigenvector (see Section 2.6.2a) is close to 2, and this eigenvector dominates the solution
by time $t = 1$, because the lowest eigenvalue is $\approx \kappa\pi^2$ and, since $\kappa = 1/\pi^2$, we have,
for the first mode, $\approx 2e^{-\lambda_1 t} = 2e^{-1} = 0.7358$, a result that changes only in the fourth
decimal place between $N = 80$ and $N = 160$. Thus, $N = 80$ is quite sufficient to accurately
track the first mode, and this mode is nearly all that remains at $t = 1$ ($e^{-\lambda_2 t} = e^{-4} = 0.02$, etc.).
To see how well BE and TR fare against the 'full power' of LSODI, two additional
runs were made (at the same $\varepsilon$, $5 \times 10^{-4}$, and $N = 160$) in which all the stops were pulled
out. In the first, the Adams family was selected with the following results: 204 steps were
required (final solution: 0.7350) and the 'order selector' got as high as fifth (twelfth is
the maximum available)—and spent most of the time there—dropping back to second
(TR) toward the end. The second run used the BDF family with the following results:
156 steps were required (final solution: 0.7361) and again the order quickly rose to fifth
(the maximum available)—and stayed there. Thus, BDF 'wins'—fewer steps and smaller
error—for this transient heat equation example. But the variable-step TR was not very far
behind, and would show up better yet if the problem was one of advection-diffusion,
especially if advection-dominated (or pure advection), thus—in our opinion—further justifying
our selection of (smart) TR as an optimal integration method.
Table 2.7-3 More LSODI results. Entries are (closest step number to time $t$) / $T(0.5, t)$.

Time, t      TR            BE
$10^{-4}$    37/98.94      262/99.42
$10^{-3}$    80/28.49      479/28.29
$10^{-2}$    113/8.878     620/8.864
$10^{-1}$    141/2.734     724/2.812
$10^{0}$     164/0.7357    800/0.7429

'Exact' at $t = 1$ ($\varepsilon = 10^{-6}$, 1250 TR steps): 0.7360

We hope (and believe) that these results lend further support to the use of smart
integrators in general and TR in particular for CFD. We would not, however, be too
much against the occasional introduction of a 'damping' step (say, every 50 or 100
steps), with BDF2 the method of choice, in order to help control 'extraneous noise.'
Our final bottom line on implicit integrators is this: constant step size should be used
with about the same frequency as uniform finite element meshes; i.e., rarely.
Final Remark:
In the next chapter we will also present variable-step versions of BE (BDF1) and
BDF2—albeit for the rather more complicated Navier-Stokes equations.
2.7.5 A Semi-Implicit Method
There are two principal reasons why one may wish to employ a 'hybrid' integration
method in which the diffusion term is integrated with an implicit method, and the advection
term with an explicit method: (i) there are thin BL's that are resolved by the mesh (and
the grid Peclet number there is less than one) and (ii) robust, unsymmetric 'solvers' are
not available/affordable/desirable. In fact, once one realizes that explicit advection leads
typically to the infamous 'CFL stability limit,' $u\Delta t/h \le O(1)$, a third reason appears if
the flow is advection-dominated (with still thin BL's and fine mesh there): the timestep
required for accurate tracking of 'parcels' will often need $\Delta t$ small enough that $u\Delta t/h \le 1$
even if an implicit method were used for advection.
Thus we, and many others, have also developed methods for time-marching that are
semi-implicit—for which we note immediately that error control and variable timestep
sizes are more or less abandoned; whereas one might vary $\Delta t$ based on accuracy if the
resulting $\Delta t$ is less than the stability limit [CFL = $O(1)$] and if the local error could be
adequately estimated, no one (to our knowledge) has seriously attempted it. What is more
common (and careless) is to assume that the CFL limit will always give a sufficiently
accurate time integration. And this is the approach we take in this section—although we
will advise the user of one of these methods to always verify the temporal accuracy of a
simulation by repeating it at a smaller $\Delta t$.
Our semi-implicit scheme begins where BTD left off; recall that in Section 2.7.2e
we found a modified (perturbed) operator for forward Euler on advection, which both
stabilized it (conditionally) and improved its accuracy. Here we extend/modify that scheme
in a simple but far-reaching way; we integrate both diffusion and BTD terms via TR
while still marching advection with FE. The gain turns out to be (at least for special
cases) unconditional stability, and the loss is, of course, the need to solve linear systems.
But the matrix is SPD, and the linear systems are actually quite easy to solve so that
usually there is a net gain.
Starting from
$$\frac{\partial T}{\partial t} + u\cdot\nabla T = \nabla\cdot\left[\left(K + \frac{\Delta t}{2}\,uu\right)\cdot\nabla T\right], \tag{2.7-109}$$
where $K$ is diagonal (and typically/usually $K = \kappa I$), we let $K + (\Delta t/2)\,uu = \tilde K$ and apply
our semi-implicit integrator to the GFEM form of (2.7-109) to obtain
$$\frac{1}{\Delta t}M(T_{n+1} - T_n) + N_nT_n + \frac{1}{2}\left(\tilde K_nT_n + \tilde K_{n+1}T_{n+1}\right) = \frac{1}{2}(f_n + f_{n+1}), \tag{2.7-110}$$
in which we have permitted time-varying velocity and BC's. Thus, each timestep requires
the formation and solution of the linear system,
$$\left(\frac{2}{\Delta t}M + \tilde K_{n+1}\right)T_{n+1} = \left(\frac{2}{\Delta t}M - \tilde K_n - 2N_n\right)T_n + (f_n + f_{n+1}), \tag{2.7-111}$$
where the coefficient matrix is SPD and amenable to efficient iterative methods. For
'time-accurate' integrations, it is often even better yet because then $\Delta t$ is often small enough
that the mass matrix 'dominates' $\tilde K_{n+1}$, and it is well known (e.g., Wathen, 1991) that $M$
is a 'very nice' matrix for the conjugate gradient method. (Its eigenvalues are clustered,
and DSCG converges very quickly and is cheap.) (See Volume II for further details on
CG and DSCG.)
Whereas we cannot prove unconditional stability in the general case (arbitrary mesh,
variable velocity, etc.), we can for the 1D, constant coefficient case with periodic
BC's, via von Neumann [see Bullister et al. (1986) for some further discussion of the
general case, and see Ascher et al. (1995) for a recent FDM paper on the subject]. In 1D,
(2.7-110) is, using linear basis functions,
$$\frac{1}{6}\left[(T_{j-1}^{n+1} - T_{j-1}^{n}) + 4(T_j^{n+1} - T_j^{n}) + (T_{j+1}^{n+1} - T_{j+1}^{n})\right] + \frac{c}{2}\left(T_{j+1}^{n} - T_{j-1}^{n}\right) = \frac{1}{4}(\alpha + c^2)\left[(T_{j-1}^{n} - 2T_j^{n} + T_{j+1}^{n}) + (T_{j-1}^{n+1} - 2T_j^{n+1} + T_{j+1}^{n+1})\right], \tag{2.7-112}$$
where $c = u\Delta t/h$ and $\alpha = 2\kappa\Delta t/h^2$. Performing the von Neumann analysis a la
Section 2.7.2c yields
$$\xi = \frac{\dfrac{2 + \cos\theta}{3} - \dfrac{(\alpha + c^2)}{2}(1 - \cos\theta) - ic\sin\theta}{\dfrac{2 + \cos\theta}{3} + \dfrac{(\alpha + c^2)}{2}(1 - \cos\theta)} \tag{2.7-113}$$
for the amplification factor ($\theta = kh$, where $k$ is the wavenumber, recall). It is
straightforward to form $|\xi|^2$ and to see that it is less than one for all values of $\theta$, $\alpha$, and $c$. Stability
is unconditional, even for pure advection.
Remarks:
(1) It is somewhat remarkable that TR stability is obtained even though advection is
treated explicitly. It will be less remarkable, however, after Section 2.7.7b is
assimilated.
(2) Limited numerical experiments suggest that these von Neumann results generalize to
multi-dimensional problems on real grids with variable (but divergence-free) velocity
and real BC's.
(3) Unlike TR, however, this semi-implicit scheme is not neutral when $\kappa = 0$; it is
dissipative. For $\Delta t$ and $h \to 0$, in fact, $|\xi| = 1 - u^2k^4\Delta t^2h^2/8$. But for $\Delta t \to \infty$
with $h$ fixed, it does not damp at all; in fact, $\xi \to -1$ to give the characteristic $2\Delta t$
'wiggles' of TR. The strongest damping occurs for $c = 1/\sqrt{3}$, for which a $2\Delta x$ wave
dies in a single step.
(4) If LM is used, then $(2 + \cos\theta)/3$ is replaced by unity; stability is still unconditional
(and accuracy suffers). For pictures of $|\xi|$, $|\xi|^{1/c}$, and phase speed vs $\theta$, for both
CM and LM, see Gresho and Chan (1985). ($|\xi|^{1/c}$ measures the amplitude error
corresponding to the true solution moving one grid length, $h$.)
(5) If an AD simulation attains a steady state and $\Delta t$ is very large ($c \gg 1$), then the
resulting solution will be overly diffusive (but only along streamlines); the effective
diffusivity is $\tilde\alpha = \alpha + c^2 = \alpha(1 + Pc)$, which is much larger than $\alpha$ when $P \ge O(1)$.
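The unconditional stability claimed for (2.7-113), and the single-step death of the $2\Delta x$ wave at $c = 1/\sqrt{3}$, are easy to spot-check numerically. The sketch below simply evaluates the amplification factor on a grid of $\theta$, $\alpha$ and $c$; the sampled values and the function names are illustrative choices, not from the text, and a finite sweep is of course a check rather than a proof.

```python
import math


def xi_semi_implicit(theta, alpha, c, lumped=False):
    """Amplification factor (2.7-113): FE advection, TR on diffusion + BTD."""
    m = 1.0 if lumped else (2.0 + math.cos(theta)) / 3.0  # mass-matrix symbol
    beta = 0.5 * (alpha + c * c) * (1.0 - math.cos(theta))
    return complex(m - beta, -c * math.sin(theta)) / (m + beta)


def max_modulus(alphas, cs, n_theta=181, lumped=False):
    """Largest |xi| found over 0 <= theta <= pi for the given alpha, c samples."""
    worst = 0.0
    for a in alphas:
        for c in cs:
            for i in range(n_theta):
                theta = math.pi * i / (n_theta - 1)
                worst = max(worst, abs(xi_semi_implicit(theta, a, c, lumped)))
    return worst


alphas = [0.0, 0.1, 1.0, 10.0]
cs = [0.01, 0.1, 1.0, 10.0, 100.0]
print(max_modulus(alphas, cs), max_modulus(alphas, cs, lumped=True))
# 2*dx wave (theta = pi), pure advection, c = 1/sqrt(3):
print(abs(xi_semi_implicit(math.pi, 0.0, 1.0 / math.sqrt(3.0))))
```

The maximum is attained at $\theta = 0$, where $\xi = 1$ exactly, so the sweep should never report a value above one apart from rounding.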
A demonstration of the last situation, first discovered by McCallen (1988), is obtained
by integrating the semi-implicit equations to a steady state for the 'tough' problem (with
OBL for Pe $\gg$ 1) discussed in Section 2.6.2c; rather than (2.6-148), the uniform mesh
solution for the semi-implicit/BTD method is
$$T_j = \frac{\left[\dfrac{1 + P(1 + c)}{1 - P(1 - c)}\right]^{j} - 1}{\left[\dfrac{1 + P(1 + c)}{1 - P(1 - c)}\right]^{N+1} - 1} \quad \text{for } j = 0, 1, \ldots, N + 1, \tag{2.7-114}$$
which (appropriately) degenerates to (2.6-148) for $c \to 0$. Also, for $c \gg 1$ and $P \ge O(1)$,
the solution becomes, approximately, $[(1 + 2/c)^j - 1]/[(1 + 2/c)^{N+1} - 1] \approx j/(N + 1)$,
the solution corresponding to pure diffusion. So, for fixed $P$, the steady solution ranges
from the wiggly GFEM result for $c \ll 1$ on to a non-oscillatory, overly diffusive
solution for $c \gg 1$ (which is only obtained after many timesteps that display slowly damped
$2\Delta t$ oscillations: a clue to a problem, a wiggle signal). Plotted in Figures 2.7-10 and
2.7-11 are the 'BTD' results analogous to those shown in Figures 2.6-54 and 2.6-55 for
GFEM, for several values of $c$.
Whereas small $c$ sends out a steady-state, spatial wiggle signal, large $c$ is, at the steady
state, oblivious to the difficulty of the problem (it does see the $2\Delta t$ TR wiggles during the
transient). Bottom line: DO NOT USE BTD with $c \gg 1$. (Pure TR, on the other hand,
would yield a 'valid' steady state, although not a useful transient, when $c \gg 1$.) Finally,
it is good to recall that BTD is 'inherently' a transient trick and should, in fact, be viewed
with some suspicion if and when a steady state is attained.
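The dependence of the steady BTD solution (2.7-114) on $c$ can be tabulated directly. The sketch below takes the grid Peclet number $P$ as an input, assumes $P > 0$, and uses the hypothetical name `btd_steady_profile`; raising an error when $P(1 - c) = 1$ is a convenience of the sketch, since the ratio in (2.7-114) is then unbounded.

```python
def btd_steady_profile(P, c, N):
    """Nodal values T_j, j = 0..N+1, of the steady BTD solution (2.7-114)."""
    denom = 1.0 - P * (1.0 - c)
    if denom == 0.0:
        raise ValueError("ratio in (2.7-114) is unbounded when P(1 - c) = 1")
    r = (1.0 + P * (1.0 + c)) / denom
    return [(r ** j - 1.0) / (r ** (N + 1) - 1.0) for j in range(N + 2)]


# Parameters of Fig. 2.7-10 (P = 2, N = 7) for three values of c.
for c in (0.1, 1.0, 10.0):
    print(c, [round(t, 4) for t in btd_steady_profile(P=2.0, c=c, N=7)])
```

For small $c$ the ratio is negative and the profile alternates in sign (the wiggle signal), whereas for large $c$ the ratio approaches $1 + 2/c$ and the profile tends to the straight line $j/(N + 1)$.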
[Plot for Fig. 2.7-10: T(x) versus x on 0 to 1; curves for C = 0.1, C = 1.0, C = 10.0, and the analytic solution.]
Fig. 2.7-10 Steady results for the hard 1D AD problem via BTD (Pe = 16, N = 7, P = 2).
[Plot for Fig. 2.7-11: T versus x on 0 to 1, with node index j from 0 to 8.]
Fig. 2.7-11 Same as Figure 2.7-10 except Pe = 160 (P = 20).
2.7.6 Dispersion (et al.) Errors For Some Fully Discrete Methods
a. Introduction
Now it is time to tie together some previous pieces to obtain a better feeling for what
actually happens in the computer when approximating solutions of the scalar transport
equation. That is to say, we shall apply several ODE methods to the semi-discrete AD
equation (whose spatial error properties we have already examined) to see how $\Delta x$-errors
and $\Delta t$-errors 'interact.' We shall be interested in phase errors, damping errors, and, of
course, stability, which will often show up as a 'negative' damping error. Our 'study
vehicle' will be the 1D advection-diffusion equation using linear basis functions.
We have already studied one end of the spectrum—spatial errors in the absence of
temporal errors. Now we shall do the opposite—but only briefly since it is usually of
more academic interest than the ones already studied. Specifically, we shall apply TR
to the PDE for pure advection to see how temporal error alone, from a specific ODE
'integrator,' manifests itself. [It will turn out that stable implicit methods at large CFL
'obtain their stability' by seriously slowing down the wave form; whereas explicit methods
usually give leading phase error and require $c \le O(1)$ for stability.] After deriving this
one in some detail, we will present 'final results' for one first-order method (BE), one
explicit second-order method (LF2), and one fourth-order explicit method (RK4). Then,
having developed a feeling for the temporal error end of the spectrum, we will 'bite
the big bullet' and study the real error when the specimen PDE is discretized with
linear elements and time-integrated with six explicit methods, three implicit methods,
and one semi-implicit method; namely, FE, AB2, RK2, LF2, AB3, and RK4 for the
explicit family; BE, TR, and BDF2 for the implicit family; and FE/BTD(TR) for the
semi-implicit method. In each case we will present the key quantitative item needed for
error and stability quantification: the von Neumann amplitude coefficient, $\xi$; yes, the
analysis is (necessarily, it seems) of the common (but useful) 'Fourier type.' Given $\xi$,
it is then a simple matter (in principle) to recover the two limiting (semi-discrete) cases
via letting either $\Delta x \to 0$ or $\Delta t \to 0$ and performing asymptotic analysis (perhaps using
appropriate software). Our results, however, will be mostly pictures; i.e., we will show
graphical results and leave asymptotics to the reader.
The target is the PDE
$$T_t + uT_x = \kappa T_{xx} \tag{2.7-115}$$
on the unit span with periodic BC's (or, alternatively, on the infinite span) and IC $T_0(x) = e^{ikx}$,
with solution
$$T(x, t) = e^{-k^2\kappa t}\,e^{ik(x - ut)}, \tag{2.7-116}$$
which is a repeat of (2.6-108) and clearly depicts the $k$-th Fourier mode being transported
at the fluid velocity and decaying monotonically in time; recall [see (2.6-14)] that
$k = k_n = 2n\pi$, and we suppress the mode number, $n$. The ODE analog of the target is the
semi-discretized system
$$\frac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + u(T_{j+1} - T_{j-1})/2h = \kappa(T_{j-1} - 2T_j + T_{j+1})/h^2, \tag{2.7-117}$$
or its lumped mass counterpart ($\dot T_{j-1}$ and $\dot T_{j+1} \to \dot T_j$). These ODE's have the exact
solution [cf. (2.6-113)]
$$T_j(t) = e^{-2k^2\kappa t(1 - \cos\theta)/m\theta^2}\,e^{ik(jh - ut\sin\theta/m\theta)}, \tag{2.7-118}$$
where
$$m = m(\theta) = (2 + \cos\theta)/3 \quad \text{for CM}, \tag{2.7-119}$$
and
$$m = 1 \quad \text{for LM} \tag{2.7-120}$$
is the 'symbol' of the mass matrix [see Remark (4) below (2.6-24)] and $\theta = kh = 2n\pi/N$
(as usual). Clearly the effective diffusivity is $2\kappa(1 - \cos\theta)/m\theta^2$, which $\to \kappa$ (with
second-order accuracy) as $h \to 0$, and the effective velocity (phase speed) is $u\sin\theta/m\theta$,
which $\to u$ as $h \to 0$ (with second-order accuracy for LM and fourth-order accuracy for
CM). Both the effective diffusivity and the effective velocity have been discussed in
Sections 2.6.2b and 2.6.2a, respectively. Every ODE method must, of course, recover
(2.7-118) for $\Delta t \to 0$.
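The convergence orders quoted for the semi-discrete solution (2.7-118) can be observed by evaluating the effective diffusivity and phase speed at a few values of $\theta$. The sketch below uses hypothetical function names; the sample values of $\theta$ are arbitrary.

```python
import math


def mass_symbol(theta, lumped=False):
    """Symbol m of the mass matrix: (2 + cos(theta))/3 for CM, 1 for LM."""
    return 1.0 if lumped else (2.0 + math.cos(theta)) / 3.0


def relative_diffusivity(theta, lumped=False):
    """Effective diffusivity / kappa = 2(1 - cos(theta)) / (m theta^2)."""
    return 2.0 * (1.0 - math.cos(theta)) / (mass_symbol(theta, lumped) * theta ** 2)


def relative_phase_speed(theta, lumped=False):
    """Effective velocity / u = sin(theta) / (m theta)."""
    return math.sin(theta) / (mass_symbol(theta, lumped) * theta)


# Halving theta (i.e., halving h at fixed k) should cut the LM phase error by
# about 4 and the CM phase error by about 16.
for theta in (0.4, 0.2, 0.1):
    print(theta,
          1.0 - relative_phase_speed(theta, lumped=True),
          1.0 - relative_phase_speed(theta, lumped=False),
          relative_diffusivity(theta) - 1.0)
```

Expanding in $\theta$ gives a relative phase speed of $1 - \theta^2/6$ for LM and $1 - \theta^4/180$ for CM to leading order, consistent with the second- and fourth-order statements above.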
b. Semi-discrete the other way
We begin by applying our favorite ODE integrator, TR, to the pure advection PDE (if
$\kappa \ne 0$, then this analysis becomes too complex to be useful),
$$\frac{T(x, t_{n+1}) - T(x, t_n)}{\Delta t} + \frac{u}{2}\left[T_x(x, t_n) + T_x(x, t_{n+1})\right] = 0. \tag{2.7-121}$$
Seeking a solution in the form $T(x, t_n) = e^{ik(x - \tilde u t_n)}$, for some $\tilde u$ TBD, gives
$$\frac{e^{-ik\tilde u\Delta t} - 1}{\Delta t} + \frac{iku}{2}\left(e^{-ik\tilde u\Delta t} + 1\right) = 0,$$
or
$$\frac{e^{ik\tilde u\Delta t/2} - e^{-ik\tilde u\Delta t/2}}{e^{ik\tilde u\Delta t/2} + e^{-ik\tilde u\Delta t/2}} = i\tan\frac{k\tilde u\Delta t}{2} = \frac{iku\Delta t}{2};$$
i.e.,
$$\frac{\tilde u}{u} = \frac{2}{ku\Delta t}\tan^{-1}\frac{ku\Delta t}{2}, \tag{2.7-122}$$
which is the phase speed resulting when TR is applied to the advection PDE.
There is, in fact, a more useful way to interpret this result; it obtains by relating the step
size, $\Delta t$, to the period of the wave from the exact solution, $\tau = \lambda/u = 2\pi/ku = 2\pi/\omega$,
where $\omega$ is the wave (angular) frequency. Defining $N_P = \tau/\Delta t$ as the number of steps
per period gives $N_P = 2\pi/ku\Delta t$ and thus
$$\frac{\tilde u}{u} = \frac{N_P}{\pi}\tan^{-1}\frac{\pi}{N_P} \tag{2.7-123}$$
as a more useful formula for the TR phase error. (If one considers $\tilde N_P = 2\pi/k\tilde u\Delta t$,
the number of timesteps for the slower wave to pass by, there follows $N_P/\tilde N_P = \tilde u/u$.)
Figure 2.7-12 shows that while a 'coarse mesh' (large $\Delta t$, small $N_P$) yields a much
too small phase speed, it does not require very many steps per period to recover good
accuracy; e.g., for $N_P = 10$, the error is just over 3%, and for $N_P = 20$, it is less than 1%.
This seems to suggest that at least 20 steps per period should be employed in practice
(more yet if many periods are to be computed), especially considering that spatial error
also (usually) decreases the wave speed.
[Plot for Fig. 2.7-12: relative phase speed versus $N_P$ (up to 20) for TR, BE, LF2, and RK4, together with the decay per period $|\xi|^{N_P}$ for BE and RK4.]
Fig. 2.7-12 Phase speeds and decay per period for several ODE methods.
Also shown in the figure is the result for BE,
$$\frac{\tilde u}{u} = \frac{N_P}{2\pi}\tan^{-1}\frac{2\pi}{N_P}, \tag{2.7-124}$$
which requires finer resolution (naturally) for similar accuracy. But the situation is worse
yet for BE; it gives a solution that goes like $|\xi|^n$, where $|\xi| = 1/\sqrt{1 + (2\pi/N_P)^2}$ is the
decrease per timestep: serious numerical damping. Even worse is the more appropriate
measure, $|\xi|^{N_P}$, the decay per period, shown in the same figure. (Recall that $|\xi| = 1$
for TR.)
Remark:
FE gives the same phase speed as BE but, unfortunately, grows (unboundedly) at the
same rate that BE decays; i.e., $|\xi| = \sqrt{1 + (2\pi/N_P)^2}$. Neither Euler method should be
used for advection.
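Also shown are the following results from two explicit methods, in the stable range
only:
$$\text{LF2:}\quad \frac{\tilde u}{u} = \frac{N_P}{2\pi}\sin^{-1}\frac{2\pi}{N_P} \tag{2.7-125}$$
with $|\xi| = 1$, which is unstable for $N_P < 2\pi$, and
$$\text{RK4:}\quad \frac{\tilde u}{u} = \frac{N_P}{2\pi}\tan^{-1}\left\{\frac{(2\pi/N_P)\left[1 - \frac{1}{6}(2\pi/N_P)^2\right]}{1 - \frac{1}{2}(2\pi/N_P)^2 + \frac{1}{24}(2\pi/N_P)^4}\right\} \tag{2.7-126}$$
with
$$|\xi| = \left\{\left[1 - \frac{1}{2}\left(\frac{2\pi}{N_P}\right)^2 + \frac{1}{24}\left(\frac{2\pi}{N_P}\right)^4\right]^2 + \left(\frac{2\pi}{N_P}\right)^2\left[1 - \frac{1}{6}\left(\frac{2\pi}{N_P}\right)^2\right]^2\right\}^{1/2}; \tag{2.7-127}$$
$|\xi|^{N_P}$ is also shown. The stability limit for this case is $N_P = 2\pi/\sqrt{8} = 2.22$. The following
explicit methods are not shown because they are unstable: FE, AB2, and RK2. AB3, which
is conditionally stable, is left as an exercise.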
From the above results, we observe the following:
1. Implicit methods generate lagging phase error, thus exacerbating that caused (usually)
by spatial error.
2. Explicit methods generate somewhat compensatory (usually) leading phase error.
3. RK4 is impressively accurate and should give good results with as few as 10 steps per
period.
4. BE is impressively inaccurate and highly overdamped.
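The time-discretization-only results (2.7-123) through (2.7-127) all follow from the one-step amplification factor of each method applied to $\dot T = -i\omega T$, so they can be generated from a single routine. The sketch below is an illustration with hypothetical function names; it writes $w = \omega\Delta t = 2\pi/N_P$, takes the physical root of the leapfrog characteristic equation, and assumes that the phase change per step stays within $(-\pi, \pi]$.

```python
import cmath
import math


def amplification(method, w):
    """One-step factor xi for dT/dt = -i*omega*T, with w = omega*dt = 2*pi/N_P."""
    z = complex(0.0, -w)
    if method == "TR":
        return (1 + z / 2) / (1 - z / 2)
    if method == "BE":
        return 1 / (1 - z)
    if method == "RK4":
        return 1 + z + z ** 2 / 2 + z ** 3 / 6 + z ** 4 / 24
    if method == "LF2":
        # Leapfrog: xi^2 - 2*z*xi - 1 = 0; physical root, meaningful for w <= 1.
        return z + cmath.sqrt(1 + z * z)
    raise ValueError(method)


def phase_speed_ratio(method, steps_per_period):
    """Numerical phase speed divided by the exact one."""
    w = 2.0 * math.pi / steps_per_period
    return -cmath.phase(amplification(method, w)) / w


def decay_per_period(method, steps_per_period):
    """|xi| raised to the number of steps per period."""
    w = 2.0 * math.pi / steps_per_period
    return abs(amplification(method, w)) ** steps_per_period


for n_p in (10, 20):
    print(n_p, [(name, phase_speed_ratio(name, n_p), decay_per_period(name, n_p))
                for name in ("TR", "BE", "LF2", "RK4")])
```

Ratios below one indicate lagging phase error and ratios above one indicate leading phase error; a decay per period of exactly one corresponds to a neutral method such as TR.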
c. Fully discrete
o Trapezoid rule. We now return to the real world of finite $\Delta t$, finite $h$, and finite $\kappa$, and
apply several ODE methods to (2.7-117), with special thanks to J. Leone, Jr., whose
skills with modern software packages were crucial. Again, we shall derive the results in
any detail for only one method, but give final results for all those considered. Applying
TR to (2.7-117) and then seeking a solution of the resulting difference equation in the
usual form,
$$T_j^{(n)} = \xi^n e^{ij\theta}, \tag{2.7-128}$$
where $T_j^{(n)}$ denotes the value of $T_j$ at time $t_n$, yields
$$m(\xi - 1)\,\xi^n e^{ij\theta} + \frac{iu\Delta t}{h}\left(\frac{1 + \xi}{2}\right)\sin\theta\;\xi^n e^{ij\theta} = -\frac{2\kappa\Delta t}{h^2}\left(\frac{1 + \xi}{2}\right)(1 - \cos\theta)\,\xi^n e^{ij\theta};$$
i.e.,
$$\xi = \frac{1 - [\alpha(1 - \cos\theta) + ic\sin\theta]/2m}{1 + [\alpha(1 - \cos\theta) + ic\sin\theta]/2m}, \tag{2.7-129}$$
in terms of $\alpha$ and $c$, where (as before) $\alpha = 2\kappa\Delta t/h^2$ and $c = u\Delta t/h$. Another, perhaps
more useful, form makes $\xi = \xi(\theta, c, P)$ via $\alpha = c/P$, and this is the form we will use.
Simplifying by multiplying numerator and denominator by the complex conjugate of the
denominator yields the final form,
$$\xi = \frac{1 - \dfrac{\alpha^2(1 - \cos\theta)^2 + c^2\sin^2\theta}{4m^2} - \dfrac{ic\sin\theta}{m}}{\left[1 + \dfrac{\alpha(1 - \cos\theta)}{2m}\right]^2 + \dfrac{c^2\sin^2\theta}{4m^2}}. \tag{2.7-130}$$
[This is the only method whose form we shall bother to explicate as $\mathrm{Re}(\xi) + i\,\mathrm{Im}(\xi)$; for
the others, we simply let the 'software' do the complex arithmetic.] To relate this result to
the exact solution, and to determine the effective diffusivity (damping) and phase speed,
we proceed as follows [cf. (2.7-116)] and write £ = x + iy = |£|e'arg®:
<n)
n ij9
jyn, = £„e„* = |£|„e
= l£l" exp
In i(j6+n tan ' y/x)
ik{Jh + IKt^ ylx,
= \£\ne<k(~xJ~uP!)
(2.7-131)
where uP = — (l/&A?)tan ' y/x with proper account taken of the quadrant in which £
lies (not always easy), or
1 , . 1
u
ukAt
tan y/x =
cO
tan ' y/x
(2.7-132)
as the numerical (relative) phase speed of the fully discrete approximation. This equation
is general in that it depends only on the von Neumann amplification factor, $\xi$, which
itself is, of course, a function of both spatial and temporal discretization methods. Also,
$|\xi| = (x^2 + y^2)^{1/2} = |T_j^{(n+1)}/T_j^{(n)}|$ is the (general) amplification factor magnitude, which
must be $\le 1$ for stability and is to be compared with $e^{-\kappa k^2\Delta t}$ from (2.7-116), which latter
shows that short waves damp fast. The relative amplitude change per timestep,
$$R_{\Delta t} = |\xi|\,e^{\kappa k^2\Delta t}, \tag{2.7-133}$$
the relative amplitude change per grid length traversed by the exact solution,
$$R_h = R_{\Delta t}^{1/c}, \tag{2.7-134}$$
because $1/c = h/u\Delta t$ is the number of steps per grid length for the true solution, and the
relative amplitude change per period,
$$R_P = (R_{\Delta t})^{N_P} = (R_{\Delta t})^{2\pi/c\theta}, \tag{2.7-135}$$
are all interesting measures of the error (all are 1.0 for the perfect method, and
all $\to 1$ as $\theta \to 0$).
To further the interpretation of these results, it is useful to re-express (2.7-133) in terms
of the local Peclet number, $P = uh/2\kappa$, and the CFL number, $c = u\Delta t/h$, as
$$R_{\Delta t} = |\xi|\,e^{c\theta^2/2P}, \tag{2.7-136}$$
or in terms of the local 'diffusion number,' $\alpha = c/P$, as
$$R_{\Delta t} = |\xi|\,e^{\alpha\theta^2/2}, \tag{2.7-137}$$
the first of which is the form of relative 'diffusion' that we shall utilize. (But it will be
useful to keep the relationship $\alpha = c/P$ in mind when studying the curves to follow, along with
the fact that $\alpha \ll 1$ usually means an accurate diffusion calculation and $\alpha \gg 1$ an
inaccurate one.)
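The quantities defined in (2.7-129), (2.7-132), and (2.7-136) are easy to evaluate by letting the software do the complex arithmetic. The sketch below assumes linear elements with consistent mass, $m(\theta) = (2 + \cos\theta)/3$ (the form that appears in (2.7-146)); the function names are illustrative, and `cmath.phase` returns only the principal value of $\arg(\xi)$, so the quadrant caveat above still applies for large $c$.

```python
import cmath
import math

def lambda_dt(theta, c, P):
    """lambda*dt for consistent mass, assuming m(theta) = (2 + cos(theta))/3."""
    alpha = c / P                       # diffusion number, alpha = c/P
    m = (2.0 + math.cos(theta)) / 3.0
    return (alpha * (1.0 - math.cos(theta)) + 1j * c * math.sin(theta)) / m

def tr_measures(theta, c, P):
    """Return (xi, relative phase speed, R_dt) for the trapezoid rule."""
    z = lambda_dt(theta, c, P)
    xi = (1.0 - z / 2.0) / (1.0 + z / 2.0)               # (2.7-129)
    rel_speed = -cmath.phase(xi) / (c * theta)            # (2.7-132), principal branch
    r_dt = abs(xi) * math.exp(c * theta**2 / (2.0 * P))   # (2.7-136)
    return xi, rel_speed, r_dt

xi, rel_speed, r_dt = tr_measures(theta=math.pi / 4, c=0.5, P=100.0)
print(f"|xi| = {abs(xi):.6f}, relative phase speed = {rel_speed:.4f}, R_dt = {r_dt:.6f}")
```

Both the relative phase speed and $R_{\Delta t}$ returned here should approach 1 as $\theta \to 0$, which is a convenient sanity check on any implementation.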
Remarks:
(1) Perhaps counterintuitively, but nevertheless rather importantly, it follows that
$R_{\Delta t} < 1$ means an overdamped numerical solution and $R_{\Delta t} > 1$ an underdamped one.
The goal, of course, is $R_{\Delta t} = 1$.
(2) The limiting case for pure advection is, appropriately, $R_{\Delta t} = |\xi|$, and $|\xi| = 1$ is the
goal; that for pure diffusion is just (2.7-137), and $R_{\Delta t} = 1$ is the goal, as usual.
(3) For finite $P$ and $\xi$, $R_{\Delta t} \to \infty$ as $c \to \infty$ because the exact solution becomes zero in
one timestep, thus reducing somewhat the utility of the following results at large $c$.
(4) It is interesting at this point to note that the exact PDE solution, (2.7-116), can be
expressed in terms of the discrete variables and parameters (and thus at the discrete
mesh points in both space and time) as
$$T(x_j, t_n) = [e^{-c\theta^2/2P}]^n e^{i\theta(j-nc)} = [e^{-\alpha\theta^2/2}]^n e^{i\theta(j-nc)} \tag{2.7-138}$$
and that of the ODE's, (2.7-118), as
$$T_j(t_n) = [e^{-\alpha(1-\cos\theta)/m}]^n \cdot e^{i\theta(j - nc\sin\theta/m\theta)}. \tag{2.7-139}$$
(5) In the case of TR, as with all of the ODE methods considered herein, the
following analogy exists between the AD equations and the single scalar (model)
equation [(2.7-1)] with complex $\lambda$:
$$\dot y = -\lambda y : \quad \lambda\Delta t = [\alpha(1-\cos\theta) + ic\sin\theta]/m(\theta), \tag{2.7-140}$$
so that any ODE solution with amplification factor $\xi = \xi(\lambda\Delta t)$ can be instantly
'translated' to the von Neumann growth rate for the AD PDE. For TR, cf. (2.7-57)
and (2.7-129). ($\lambda$ is an eigenvalue of $M^{-1}K$, where, recall, the modal index has
been suppressed; $\theta = \theta_n = 2\pi n/N \Rightarrow \lambda = \lambda_n$.) Different spatial discretizations will,
of course, give different functionalities between $\lambda\Delta t$ and $\theta$, $\alpha$, $c$; i.e., different
spatial 'response functions' than $[\alpha(1-\cos\theta) + ic\sin\theta]/m(\theta)$, but the analogy will
probably still hold, at least if every nodal equation is of the same type. (Quadratic
elements, for example, would again cause 'difficulty.')
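The 'translation' in Remark (5) amounts to composing an ODE amplification factor with a spatial response function. A minimal sketch follows, assuming the consistent-mass factor $m(\theta) = (2 + \cos\theta)/3$ for linear elements and $m = 1$ for lumped mass; `response`, `translate`, and `tr` are hypothetical helper names.

```python
import math

def response(theta, c, P, lumped=False):
    """Spatial response lambda*dt of (2.7-140); m = 1 if the mass is lumped."""
    m = 1.0 if lumped else (2.0 + math.cos(theta)) / 3.0
    return ((c / P) * (1.0 - math.cos(theta)) + 1j * c * math.sin(theta)) / m

def translate(xi_ode, theta, c, P, lumped=False):
    """Von Neumann amplification factor from any ODE factor xi_ode(lambda*dt)."""
    return xi_ode(response(theta, c, P, lumped))

def tr(z):
    """Trapezoid-rule amplification factor for y' = -lambda*y, z = lambda*dt."""
    return (1.0 - z / 2.0) / (1.0 + z / 2.0)

xi_consistent = translate(tr, math.pi / 2, 1.0, 1.0)
xi_lumped = translate(tr, math.pi / 2, 1.0, 1.0, lumped=True)
print(xi_consistent, xi_lumped)
```

Swapping `tr` for another one-step factor, or `response` for another spatial discretization, changes only one of the two pieces, which is the point of the analogy.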
o Range of P? Before pushing on, we pose a few relevant questions that will help to
guide our analysis: What range of $P$ should be covered? And, at least for implicit (stable)
schemes, what range of $c$? Or: when is $P$ small enough for advection to be neglected or
large enough to be called (or replaced by) pure advection, when is $c$ small enough to
say that the ODE solution has been attained and, finally, when is $c$ too
large for the 'solution' to make sense? These are reasonable questions, only some of which
we shall answer with any certainty, beginning with the range of $P$, which itself begins
with an analysis of the exact solution for a wave of length $\lambda = 2\pi/k = 2\pi h/\theta$. Next, we
ask: How fast does this wave decay by diffusion relative to its transport by advection? Or,
more quantitatively, how much decay (say, $D$, $D \le 1$) occurs per wavelength advected?
If $D$ is small, then the decay is strong and the process is diffusion-dominated, while if
$D$ is close to one, then there is little damping and the process is advection-dominated.
Pure diffusion will have $D = 0$ (and $Pe = uL/2\kappa = 0$) and $D = 1$ corresponds to pure
advection ($Pe = \infty$).
The time required for one wavelength of advection, say $\Delta t_\lambda$, is given by $\Delta t_\lambda = \lambda/u =
n\Delta t$, giving $n = \lambda/u\Delta t = 2\pi/ku\Delta t = 2\pi/c\theta$ timesteps per period. Thus, (2.7-138) gives
$$T(x_j, \Delta t_\lambda) = e^{-\pi\theta/P} \cdot e^{i(j\theta - 2\pi)} = e^{-\pi\theta/P} e^{ij\theta}, \tag{2.7-141}$$
and we have found the damping factor:
$$D = e^{-\pi\theta/P} = e^{-2\pi k\kappa/u} = e^{-\pi kL/Pe}. \tag{2.7-142}$$
$D$ describes the relative amplitude of an initial 'sine' wave after it has moved forward one
wavelength; $D$ is the damping of the exact solution per wavelength advected, as would
occur on a fixed, discrete mesh (with periodic BC's).
To interpret these results in a useful way, we consider the following sequence:
(1) Pick $L$ ($L = 1$ is usual), the domain length.
(2) Pick $N$, giving $h = \Delta x = L/N$.
(3) Pick $P = uh/2\kappa$, which implies that $Pe = NP$ (global Peclet number) is also fixed.
(4) Vary $k$ such that $0 \le kh \le \pi$; i.e., $0 \le \theta \le \pi$, with the upper limit on $k$ being set by
the mesh. Alternatively, we have $\infty \ge \lambda/\Delta x \ge 2$; the given mesh can only support
waves equal to or longer than $2\Delta x$.
(5) Plot $D$ vs $\theta$; equivalently, $D$ vs $k$ for fixed $h$, $P$.
(6) Change $P$ and repeat.
The results of such a procedure are shown in Figures 2.7-13 and 2.7-14, the latter
perhaps showing better the wide range of damping covered between $P = 0.1$ and $P =
100$; we more or less arbitrarily show a 0.1% cutoff at $D = 0.001$: any wave that
decays more than three decades per period is considered to be rather evanescent.
So, what values of $P$ should we consider in subsequent analysis? We have selected
$P = 0.1$, $P = 1$, and $P = 100$ as covering sufficiently the entire range of behavior, with
$P = 1$ ($\alpha = c$) 'defining' the case where advection and diffusion are equally important
(on the mesh), more or less. $P = 100$ is reasonably close to the pure advection limit, and
$P = 0.1$ surely represents a diffusion-dominated case; e.g., for $P = 100$, an $8\Delta x$-wave
($\theta = \pi/4$) is damped by only ~2.4% per period, which requires about 28 periods for a
decrease of 50%, whereas for $P = 0.1$ even a wave as long as $\lambda/\Delta x = 28.5$ ($\theta = 0.22$)
decays one-thousand fold per period (a $285\Delta x$ wave decays by 50% per period).
Fig. 2.7-13 Decay per wavelength for several values of P.
Fig. 2.7-14 Decay per wavelength for several values of P; semi-log plot. [Axes: $D$ versus $\theta$, with $\lambda/\Delta x$ from $\infty$ to 2 on a secondary scale; curves for $P$ = 0.1, 1, and 100; 0.1% cutoff marked.]
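The damping figures quoted above follow directly from (2.7-142). A short sketch (function names are illustrative) that reproduces them:

```python
import math

def damping_per_wavelength(theta, P):
    """D = exp(-pi*theta/P) from (2.7-142): exact decay per wavelength advected."""
    return math.exp(-math.pi * theta / P)

def periods_to_halve(theta, P):
    """Number of periods for the exact amplitude to drop by 50%."""
    return math.log(0.5) / math.log(damping_per_wavelength(theta, P))

d_adv = damping_per_wavelength(math.pi / 4, 100.0)        # 8*dx wave, P = 100
d_diff = damping_per_wavelength(2 * math.pi / 28.5, 0.1)  # 28.5*dx wave, P = 0.1
print(f"P = 100, 8dx wave: loss per period = {100 * (1 - d_adv):.2f}%, "
      f"periods to halve = {periods_to_halve(math.pi / 4, 100.0):.1f}")
print(f"P = 0.1, 28.5dx wave: D = {d_diff:.2e}")
```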
This forms our response to the issue regarding the range of P. For that of c, see the
figures that follow, since the answer depends a lot on both the ODE method and P itself.
o Return to TR. The results for TR, which 'set the stage' for most of the other ODE
methods that follow, are shown in Figure 2.7-15. Phase speed (relative) is on the left
and relative amplitude, per (2.7-133) and (2.7-136), on the right, for each of the three
selected values of P—the top set showing 'advection-dominated' and the bottom set
'diffusion-dominated.'
Fig. 2.7-15 Phase and amplitude results for TR at three values of P. [Panels (a), (c), (e): relative phase speed versus $\theta$; panels (b), (d), (f): relative amplitude $R_{\Delta t}$ versus $\theta$; rows for $P$ = 100, 1, and 0.1; curves labeled by $c$.]
Remarks:
(1) Whereas reasonable accuracy is obtained for $c \le 1$ for the advection-dominated case,
large $c$ seriously slows down the waves, a typical characteristic of implicit methods:
stability is gained at the expense of slowly advected waves. For $c \le \sim 0.1$, all of
the phase error is 'spatial.'
(2) For the diffusion-dominated case, the phase speed is both strange and mostly
irrelevant for $c \ge \sim 0.10$, for which TR oscillations have precluded any hope for accuracy.
Accuracy requires $c < \sim 0.01$, which corresponds to $\alpha \le 0.1$ for this case.
(3) The intermediate/transition case, $P = 1$, is 'intermediate.'
(4) The relative amplitude for $P = 100$ is basically just $e^{c\theta^2/2P}$ since $|\xi| \approx 1$ for all
$c$'s shown; see Remark (3) following (2.7-137). Also, here and in those results to
follow, $R_{\Delta t}$ is bounded (for $c$ and $P$ finite) even though it may not appear so. For
example, for a $4\Delta x$ wave ($\theta = \pi/2$) and $c = 100$, $R_{\Delta t} = 0.99973 \times e^{\pi^2/8} = 3.433$
and a $2\Delta x$ wave with $c = 100$ has $R_{\Delta t} = 0.5\,e^{\pi^2/2} = 69.5$ (it peaks at ~110 at
$\theta = 3.07$). TR is clearly underdamped for large $c$.
(5) From (2.7-140), we note (for any ODE method) that $P = 1$ tends to equalize advection
and diffusion in that the coefficient of each trigonometric term is $c$; $P = 1$ is
indeed a logical 'transition' value.
(6) $R_s(\theta = \pi) = 0$, for TR and for all succeeding results; some curves are discontinuous
at $\theta = \pi$.
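The large-$c$ values quoted in Remark (4) can be checked with a few lines of complex arithmetic. This is a sketch only, assuming consistent mass with $m(\theta) = (2 + \cos\theta)/3$ and the TR factor (2.7-129); `tr_r_dt` is an illustrative name, and the peak is located by a simple scan rather than by any method from the text.

```python
import math

def tr_r_dt(theta, c, P):
    """Relative amplitude change per timestep, (2.7-136), for TR."""
    alpha = c / P
    m = (2.0 + math.cos(theta)) / 3.0
    z = (alpha * (1.0 - math.cos(theta)) + 1j * c * math.sin(theta)) / m
    xi = (1.0 - z / 2.0) / (1.0 + z / 2.0)
    return abs(xi) * math.exp(c * theta**2 / (2.0 * P))

r_4dx = tr_r_dt(math.pi / 2, 100.0, 100.0)   # 4*dx wave
r_2dx = tr_r_dt(math.pi, 100.0, 100.0)       # 2*dx wave

# Coarse scan for the maximum of R_dt over 0 < theta <= pi.
thetas = [i * math.pi / 2000 for i in range(1, 2001)]
theta_peak = max(thetas, key=lambda t: tr_r_dt(t, 100.0, 100.0))
r_peak = tr_r_dt(theta_peak, 100.0, 100.0)
print(f"R_dt(4dx) = {r_4dx:.3f}, R_dt(2dx) = {r_2dx:.1f}, "
      f"peak = {r_peak:.1f} at theta = {theta_peak:.3f}")
```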
o Backward Euler. Changing to a BE integration leads to
$$\xi = \frac{1}{1+\lambda\Delta t} = \frac{1}{1 + [\alpha(1-\cos\theta) + ic\sin\theta]/m(\theta)} \tag{2.7-143}$$
and
$$|\xi| = \frac{1}{\sqrt{\left[1 + \dfrac{\alpha(1-\cos\theta)}{m}\right]^2 + \left[\dfrac{c\sin\theta}{m}\right]^2}}, \tag{2.7-144}$$
the results of which are shown in Figure 2.7-16, here omitting $P = 0.1$, which looks
much like $P = 1$. The major difference from the TR results shows up for the
advection-dominated case with large $\Delta t$; e.g., $c = P = 100$ shows that all but the shortest
waves are seriously overdamped. On the other hand, $R_{\Delta t} = \frac{1}{7}e^{\pi^2/2} \approx 20$ at $\theta = \pi$; the
$2\Delta x$ wave is underdamped. Thus, BE is bad at both ends for large $c$; it overdamps the
good (long) waves and underdamps the bad (short) ones. Its phase speeds are not really
different from those for TR except for $c < \sim 1$; TR converges faster.
Fig. 2.7-16 Phase and amplitude results for BE at two values of P. [Panels (a), (b): relative phase speed and $R_{\Delta t}$ versus $\theta$ for $P = 100$; panels (c), (d): the same for $P = 1$; curves labeled by $c$.]
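The 'bad at both ends' behavior of BE at $c = P = 100$ can be seen by evaluating (2.7-143) and (2.7-136) at a long and a short wave. A minimal sketch, again assuming $m(\theta) = (2 + \cos\theta)/3$; `be_r_dt` is an illustrative name.

```python
import math

def be_r_dt(theta, c, P):
    """Relative amplitude change per timestep for backward Euler."""
    alpha = c / P
    m = (2.0 + math.cos(theta)) / 3.0
    z = (alpha * (1.0 - math.cos(theta)) + 1j * c * math.sin(theta)) / m
    xi = 1.0 / (1.0 + z)                                   # (2.7-143)
    return abs(xi) * math.exp(c * theta**2 / (2.0 * P))    # (2.7-136)

r_long = be_r_dt(math.pi / 8, 100.0, 100.0)   # 16*dx wave: strongly overdamped
r_short = be_r_dt(math.pi, 100.0, 100.0)      # 2*dx wave: underdamped
print(f"R_dt(16dx) = {r_long:.4f}, R_dt(2dx) = {r_short:.2f}")
```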
o BDF2. Changing to our last implicit method, BDF2, leads to (see Section 2.7.3c; take
the positive root)
$$\xi = \frac{2 + \sqrt{1 - 2\lambda\Delta t}}{3 + 2\lambda\Delta t}, \tag{2.7-145}$$
where again $\lambda\Delta t$ is given by (2.7-140), the results of which are shown in Figure 2.7-17,
which seems to merit the following remarks:
1. Its phase speed characteristics are remarkably close to those from BDF1 (BE) for
$P = 100$, and rather more like TR for the diffusion-dominated case.
2. It shares the BE property of overdamping the good waves and underdamping the bad
for large $P$ and $c$.
3. A final remark, not from the figure: the spurious (negative) root for BDF2 is basically
innocuous, a statement that will be seen not to apply to some explicit methods. If $\lambda\Delta t$ is
real and $> 1/2$, then the decay of $\xi_-$ [change the sign in front of the radical in (2.7-145)]
is damped oscillatory; both roots $\to 0$ as $\Delta t \to \infty$.
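Remark 3 can be illustrated by evaluating both BDF2 roots directly. The sketch below uses the quadratic $(3 + 2\lambda\Delta t)\xi^2 - 4\xi + 1 = 0$ that underlies (2.7-145); `bdf2_roots` is an illustrative name.

```python
import cmath

def bdf2_roots(z):
    """Both roots of (3 + 2z) xi^2 - 4 xi + 1 = 0, with z = lambda*dt.
    The first is the '+' (physical) root of (2.7-145), the second the spurious one."""
    s = cmath.sqrt(1 - 2 * z)
    return (2 + s) / (3 + 2 * z), (2 - s) / (3 + 2 * z)

for z in (0.0, 0.25, 2.0, 1.0e6):
    xi_plus, xi_minus = bdf2_roots(z)
    print(f"z = {z:g}: |xi+| = {abs(xi_plus):.4g}, |xi-| = {abs(xi_minus):.4g}")
```

For real $\lambda\Delta t > 1/2$ the radical is imaginary, so the two roots form a complex-conjugate pair of equal modulus below one, which is the damped oscillatory decay mentioned above.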
o Forward Euler. So much for implicit methods. We now turn to explicit methods,
beginning with the simplest, FE:
$$\xi = 1 - \lambda\Delta t = 1 - [\alpha(1-\cos\theta) + ic\sin\theta]\,3/(2 + \cos\theta), \tag{2.7-146}$$
the results of which are shown in Figure 2.7-18, for which the following remarks apply:
1. Recall (Section 2.7.2c) that FE is unstable if $c > P/3$ for $P \le \sqrt{3}$ (diffusion-limited
case) and if $c > 1/P$ for $P > \sqrt{3}$. Several unstable results are shown for $P = 100$ to help
'understand' unstable behavior (at least for FE; different methods often display different
types of unstable behavior); e.g., for $c = 0.02$, which is twice the stability limit, FE is
unstable for all $\theta < \sim 2.1$ ($\lambda > \sim 3\Delta x$), with the longer waves being more unstable (larger
$|\xi| > 1$); see Hindmarsh et al. (1984) for further discussion of unstable modes for FE.
2. For the advection-dominated case, we see that if FE is stable, then it is also accurate, as
well it should be since $c \le 0.01$ gives a very small timestep.
3. For the diffusion-dominated case ($P = 0.1$), stability requires $c \le 1/30$, or $\alpha = c/P \le
1/3$; the two results in the phase speed graph that 'end midstream' ($c = 1$, $c = 0.1$) are
unstable for $\theta$ larger than the cutoff. Yes, the shorter waves are more unstable when
diffusion dominates.
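The FE stability limits recalled in Remark 1 can be confirmed by scanning $|\xi|$ from (2.7-146) over $0 \le \theta \le \pi$. A minimal sketch with illustrative names; the scan resolution `n` is an arbitrary choice.

```python
import math

def fe_xi(theta, c, P):
    """FE amplification factor (2.7-146), consistent mass."""
    alpha = c / P
    m = (2.0 + math.cos(theta)) / 3.0
    return 1.0 - (alpha * (1.0 - math.cos(theta)) + 1j * c * math.sin(theta)) / m

def fe_c_crit(P):
    """Stability limit: c <= P/3 for P <= sqrt(3), c <= 1/P otherwise."""
    return P / 3.0 if P <= math.sqrt(3.0) else 1.0 / P

def max_growth(c, P, n=2000):
    """Largest |xi| over a uniform scan of theta in [0, pi]."""
    return max(abs(fe_xi(i * math.pi / n, c, P)) for i in range(n + 1))

for P in (0.1, 1.0, 100.0):
    cc = fe_c_crit(P)
    print(f"P = {P}: c_crit = {cc:.4f}, "
          f"max|xi| at 0.99 c_crit = {max_growth(0.99 * cc, P):.8f}, "
          f"at 1.01 c_crit = {max_growth(1.01 * cc, P):.8f}")
```

For $P = 100$ the growth just above the limit is tiny and confined to long waves, whereas for $P = 0.1$ it appears at $\theta = \pi$, consistent with Remarks 1 and 3.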
o FE with BTD. Changing now to a more stable (modified) FE, we add BTD (see
Section 2.7.2e) and, for variety, lump the mass ($m = 1$; recall that mass lumping and
explicit methods are a 'natural' combination). Here
$$\xi = 1 - [(\alpha + c^2)(1-\cos\theta) + ic\sin\theta], \tag{2.7-147}$$
a result obtained by adding physical diffusion to the RHS of (2.7-43); $u^2\Delta t/2 \to
\kappa + u^2\Delta t/2$. Figure 2.7-19 shows the results, which now are stable for $c \le 2P/(1 +
\sqrt{1 + 4P^2})$, giving critical CFL numbers of ~0.995, 0.618, and 0.099 for the three
values of $P$ in the figure. Again some unstable results are displayed, this time showing
short wave instability (when unstable) for all values of $P$. BTD has stabilized the long
waves. Also noteworthy, for $P = 100$, is that time truncation error improves the phase
speed, right up to the stability limit, as previously demonstrated in Figure 2.7-6.
Fig. 2.7-17 Phase and amplitude results for BDF2 at three values of P. [Panels: relative phase speed and $R_{\Delta t}$ versus $\theta$ for $P$ = 100, 1, and 0.1; curves labeled by $c$.]
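The critical CFL numbers quoted for FE with BTD follow from requiring $\alpha + c^2 \le 1$ with $\alpha = c/P$, which is where the $2\Delta x$ wave in (2.7-147) first reaches $|\xi| = 1$. A short sketch (names are illustrative) that evaluates the limit and confirms it by scanning $\theta$:

```python
import math

def btd_c_crit(P):
    """Critical CFL number for FE + BTD with lumped mass: c/P + c^2 = 1."""
    return 2.0 * P / (1.0 + math.sqrt(1.0 + 4.0 * P * P))

def btd_xi(theta, c, P):
    """Amplification factor (2.7-147)."""
    alpha = c / P
    return 1.0 - ((alpha + c * c) * (1.0 - math.cos(theta)) + 1j * c * math.sin(theta))

def max_growth(c, P, n=2000):
    return max(abs(btd_xi(i * math.pi / n, c, P)) for i in range(n + 1))

for P in (100.0, 1.0, 0.1):
    print(f"P = {P}: c_crit = {btd_c_crit(P):.3f}")
```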
o FE with BTD à la TR. The most stable 'FE' method is that which adds BTD but treats
it implicitly, via TR. From Section 2.7.5, we have
$$\xi = \frac{1 - \left[\dfrac{\alpha + c^2}{2}(1-\cos\theta) + ic\sin\theta\right]/m}{1 + \dfrac{\alpha + c^2}{2}(1-\cos\theta)/m}, \tag{2.7-148}$$
which is unconditionally stable. The results, shown in Figure 2.7-20, look rather like
those for TR alone (Figure 2.7-15); thus, remarks made there apply here as well, except
to emphasize that TR generates an unsymmetric matrix in the general case, whereas
FE/BTD(TR) leads to a symmetric matrix.
Fig. 2.7-18 Phase and amplitude results for FE at three values of P. [Panels (a), (c), (e): relative phase speed versus $\theta$; panels (b), (d), (f): $R_{\Delta t}$ versus $\theta$; for $P$ = 100, 1, and 0.1; stable and unstable curves labeled by $c$.]
Fig. 2.7-19 Phase and amplitude results for FE with BTD (and LM) at three values of P. [Panels (a), (c), (e): relative phase speed versus $\theta$; panels (b), (d), (f): $R_{\Delta t}$ versus $\theta$; for $P$ = 100, 1, and 0.1; curves labeled by $c$.]
o Second-order Adams-Bashforth. Switching back to purely explicit, we next visit our
first second-order method, AB2, in the LM version ($m = 1$). It turns out that AB2 is also
a method that forces a more serious consideration of the spurious/extraneous root, once
it is realized that this root can go unstable before the physical one. From Section 2.7.1,
we have
$$\xi_\pm = \tfrac{1}{2}\left[1 - \tfrac{3}{2}\lambda\Delta t \pm \sqrt{1 - \lambda\Delta t + \tfrac{9}{4}(\lambda\Delta t)^2}\,\right] \tag{2.7-149}$$
with $\lambda\Delta t$ given by (2.7-140). Since also AB2 is not self-starting, we will show the solution
that obtains when FE is used for the first step:
[Equation (2.7-150), the AB2 solution with an FE first step, is missing from the extracted text.]
Fig. 2.7-20 [Results for FE with BTD à la TR; panels plot relative phase speed and $R_{\Delta t}$ versus $\theta$. The printed caption is not recoverable.]
It turns out that the stability limit, $|\xi_\pm(\theta, c, P = c/\alpha)| \le 1$, is rather trickier here than for
AB1 (FE) in that sometimes it is the physical root ($\xi_+$) and sometimes the spurious root
($\xi_-$) that sets the limit on $\Delta t$. This is clearly not a nice feature; spurious roots should never
dominate. C'est la vie. Figure 2.7-21 shows the phase (from $\xi_+$) and amplitude results
for this case, including some unstable results for each $P$. For $P = 100$, the physical root
sets the stability limit (at $c$ = 0.315 to 0.316 with $\theta_{\rm crit} \approx 1.25$); in fact, for $P \to \infty$,
we have 'numerically' determined the following asymptotic stability result:
$$c_{\rm crit} = 3/(2P^{1/3}) \tag{2.7-151}$$
at $\theta_{\rm crit}$ = 1.23 to 1.24, which gives $c_{\rm crit} = 0.323$ for $P = 100$, suggesting that (2.7-151)
is approached from below. But for $P \le 1$, we have the disconcerting result that stability
is set by the spurious root; the $2\Delta x$ wave in $\xi_-$ is the first to go unstable, and it does
so at the 'diffusion' limit, $\alpha = 1/2$ ($c_{\rm crit} = P/2$). Thus the ($\xi_+$) curves for $c > 1/2$ in
Figure 2.7-21(c) and those for $c > 1/20$ in Figures 2.7-21(e) and (f) are actually unstable
in $\xi_-$ (not shown); they in fact show the behavior of the physical root, which itself can
also be unstable (e.g., $c = 5$ for $P = 1$). Regarding damping, we generally see the nice
result that stable solutions are fairly accurate and the less nice (but typical) result that the
shortest waves are the least damped. A more detailed look at the stability behavior as a
function of $\theta$ for each of the three values of $P$ is shown in Figure 2.7-22, which plots
$\xi_\pm(\theta)$ for $0 \le \theta \le \pi$ for each $P$ at close to the critical value of $c$: namely, $c = 0.315$ for
$P = 100$, $c = 0.5$ for $P = 1$, and $c = 0.05$ for $P = 0.1$. Figure 2.7-23 is a 'zoom' from
Figure 2.7-22 for the $P = 100$ case, in order to see the detailed behavior, and the unit
circle is thus distorted. All physical roots start at $|\xi| = 1.0$ (for $\theta = 0$) and all spurious
roots start at zero. For $P = 100$, the spurious root is innocuous and the physical root
is (properly) neutrally stable, $|\xi_+| \approx 1$, for all waves between $\infty$ and $4\Delta x$, with shorter
waves being nicely damped; $\theta_c$ has the largest $|\xi|$. But when diffusion dominates, so does
the spurious root. For $P = 1$ and 0.1, the physical root just reaches $|\xi_+| = 1/2$ at the same
'time' ($\theta$) that the spurious root attains $|\xi_-| = 1$, at $\theta = \pi$; for $c > c_{\rm crit}$, the $2\Delta x$ unphysical
mode blows up. As a final comment, we mention that the (apparent) phase speed of the
spurious mode for $P \le 1$, while not very meaningful/useful, behaves, for $\theta \ll 1$, like
$R_s \approx \pi/2c\theta$ or $R_s \approx -3\pi/2c\theta$, the result depending on the 'quadrant selection' when
evaluating $\tan^{-1}[\mathrm{Im}(\xi_-)/\mathrm{Re}(\xi_-)]$ in (2.7-132). Either choice leads to the same (worthless)
numerical solution; they differ only by an integral number of wavelengths (one in this
case), which is not discernible; i.e., the difference between the two expressions is $2\pi/c\theta$.
We defer further discussion of these subtle and perhaps not-so-important aspects until we
present leapfrog results, which we do after AB3.
Fig. 2.7-21 Phase and amplitude results for AB2 (with LM) at three values of P. [Panels (a), (c), (e): relative phase speed versus $\theta$; panels (b), (d), (f): $R_{\Delta t}$ versus $\theta$; for $P$ = 100, 1, and 0.1; curves labeled by $c$.]
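The interplay between the two AB2 roots can be explored numerically from (2.7-149) with lumped mass. The following is a sketch under those assumptions; the function names are illustrative, and because the principal branch of the square root does not by itself guarantee which root is 'physical' for large $|\lambda\Delta t|$, the stability scan simply takes the larger modulus of the two.

```python
import cmath
import math

def ab2_roots(z):
    """Roots of xi^2 - (1 - 1.5 z) xi - 0.5 z = 0, z = lambda*dt; see (2.7-149)."""
    s = cmath.sqrt(1 - z + 2.25 * z * z)
    return 0.5 * (1 - 1.5 * z + s), 0.5 * (1 - 1.5 * z - s)

def lambda_dt_lumped(theta, c, P):
    """(2.7-140) with m = 1."""
    return (c / P) * (1.0 - math.cos(theta)) + 1j * c * math.sin(theta)

def max_modulus(c, P, n=2000):
    """Largest |xi| over both roots and a uniform scan of theta in [0, pi]."""
    worst = 0.0
    for i in range(n + 1):
        roots = ab2_roots(lambda_dt_lumped(i * math.pi / n, c, P))
        worst = max(worst, abs(roots[0]), abs(roots[1]))
    return worst

def ab2_c_crit_asymptotic(P):
    """Asymptotic (large P) estimate (2.7-151)."""
    return 3.0 / (2.0 * P ** (1.0 / 3.0))

print(ab2_roots(lambda_dt_lumped(math.pi, 0.5, 1.0)))   # 2*dx wave at alpha = 1/2
print(max_modulus(0.30, 100.0), max_modulus(0.35, 100.0))
print(ab2_c_crit_asymptotic(100.0))
```

At $\alpha = 1/2$ and $\theta = \pi$ the two roots come out as $1/2$ and $-1$, i.e., the spurious root sits exactly on the unit circle while the physical root is well inside it.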
o Third-order AB. Switching now to third-order Adams-Bashforth, we admit up front that
it is extremely difficult to analyze the three-step formula given by (2.7-11), leading as it
does to the following cubic equation:
$$(\xi - 1) + \tfrac{1}{12}\left[23 - 16/\xi + 5/\xi^2\right](\lambda\Delta t) = 0, \tag{2.7-152}$$
where $\lambda\Delta t$ is given by (2.7-140) with $m = 1$ (LM). The first thing we would like to mention
is that the 'simple-looking' closed-form expressions given in many 'handbooks' (e.g., the
CRC Math Tables) are inadequate as presented when the equation and roots are complex.
The second thing to mention is that even current mathematical software packages can get
'lost' when branch cuts in the complex plane are involved ('root switching' will occur in
ways that only the programmer can know for sure). And, since we had so much trouble
with the cube root formula, we present, for the convenience of the reader, the proper way
to obtain the three roots of
$$\xi^3 + \left(\tfrac{23}{12}\lambda\Delta t - 1\right)\xi^2 - \tfrac{4}{3}(\lambda\Delta t)\,\xi + \tfrac{5}{12}\lambda\Delta t = 0:$$
$$\xi_1 = A + B - (23\lambda\Delta t/12 - 1)/3, \qquad \xi_\pm = -\tfrac{1}{2}(A + B) \pm (i\sqrt{3}/2)(A - B) - (23\lambda\Delta t/12 - 1)/3,$$
where $A = [-b/2 + \sqrt{(b/2)^2 + (a/3)^3}\,]^{1/3}$ and $B = -a/3A$, where
$a = -4\lambda\Delta t/3 - (23\lambda\Delta t/12 - 1)^2/3$ and
$b = -\lambda\Delta t/36 + 23(\lambda\Delta t)^2/27 + 2(23\lambda\Delta t/12 - 1)^3/27$. The handbook formula
is as above except for the calculation of $B$, for which the seemingly equivalent form is
presented: $B = [-b/2 - \sqrt{(b/2)^2 + (a/3)^3}\,]^{1/3}$, which of course $\Rightarrow AB = -a/3$. But if
left to its own devices, there is no assurance that the $B$ obtained from the cube root formula
will be the one that is compatible with $A$; only $B = -a/3A$ can assure this.
Fig. 2.7-22 The two roots from AB2 at three values of P (at $c_{\rm crit}$). [Complex-plane plot, $\mathrm{Im}(\xi)$ versus $\mathrm{Re}(\xi)$, with the unit circle; loci of $\xi_\pm$ from $\theta = 0$ to $\theta = \pi$ for ($P$ = 0.1, $c$ = 0.05), ($P$ = 1, $c$ = 0.5), and ($P$ = 100, $c$ = 0.315).]
Fig. 2.7-23 Zoom on physical root for $P = 100$ ($c = 0.315$). [Annotation: $\theta_c = 1.236$, $|\xi| = 0.9999815$.]
We show only the phase speed for the advection-dominated case for AB3, in
Figure 2.7-24, for which the pure advection stability limit is $12\sqrt{11}/55 = 0.72362727$
(D. Griffiths, personal communication), showing that all stable results are accurate.
Fig. 2.7-24 AB3 phase speed for P = 100 (with LM).
The more interesting graphs are those shown in Figure 2.7-25 for $0 \le \theta \le \pi$, for both the
physical mode ($\xi_1$) and the two spurious modes ($\xi_2 = \xi_+$ and $\xi_3 = \xi_-$) at very close to
the critical value of $c$ for each of two $P$'s: $P = 100$ (advection-dominated) and $P = 1$
(nearly diffusion-dominated; $P = 0.1$ did not seem necessary). For each $P$, the first mode
to go unstable is the spurious mode, $\xi_-$ (we surmise that this is also the case for $P = 0$
and $P = \infty$). It does so at $\theta = \pi/2$ ($4\Delta x$ wave) for the former and at $\theta = \pi$ ($2\Delta x$) for
the latter, for which case the $P = 1$ results are very close to those for pure diffusion
($P = 0$), for which we see in Figure 2.7-25 that at $\theta = \pi$, $\xi_1 \approx 1/2$, $\xi_2 \approx 15/33 = 0.45$,
and $\xi_3 \approx -1$, the approximations being exact for $P = 0$, for which $c = 3/11$ defines the
stability limit (D. Griffiths, personal communication). (The difference between the two
former roots may be difficult to discern in the figure.) The interesting and disconcerting
consequence of this behavior is that even when stable there is a range of $\Delta t$ for which
the spurious root will have a larger value of $|\xi|$ than that of the physical root (e.g., for
$\lambda$ near $4\Delta x$ for the advection-dominated case), ultimately resulting in a solution that
is stable but specious, which behavior is of course reminiscent of implicit methods.
Fig. 2.7-25 The three roots for AB3 at two values of P, each at very close to its stability
limit. [Panels: (a) $P = 100$, $c = 12\sqrt{11}/55$; (b) $P = 1$, $c = 3/11$; axes $\mathrm{Im}(\xi)$ versus $\mathrm{Re}(\xi)$.]
o Leapfrog. Next, we switch to one of the most meteorologically popular explicit schemes
for pure advection ($P = \infty$), LF2 (second-order leapfrog), which also displays some
peculiarities that probably are not familiar to many, partly because in meteorology LF2
is always 'filtered' in practice, to control what they call the 'time-splitting instability'; see,
for example, Durran (1991). Since LF2 is much more commonly associated with FDM
rather than FEM, we shall (again) lump the mass via $m(\theta) = 1$. We begin by showing the
analytic solution when FE is used for the first step:
$$T_j^{(n)} = (a\xi_+^n + b\xi_-^n)e^{ij\theta}, \tag{2.7-153}$$
where
$$a = \frac{1 + \sqrt{1 - c^2\sin^2\theta}}{2\sqrt{1 - c^2\sin^2\theta}}, \qquad b = \frac{-1 + \sqrt{1 - c^2\sin^2\theta}}{2\sqrt{1 - c^2\sin^2\theta}},$$
$\xi_+ = \sqrt{1 - c^2\sin^2\theta} - ic\sin\theta$ is the physical root,
$\xi_- = -\sqrt{1 - c^2\sin^2\theta} - ic\sin\theta$ is the spurious root,
and we note that $a + b = 1$, $\xi_+\xi_- = -1$, and, for $c\sin\theta < 1$, $|\xi_+| = |\xi_-| = 1$. Also, this
solution is only valid for $c\sin\theta \ne 1$; if $c\sin\theta = 1$, we have the special case of root
coalescence and linear instability owing to a repeated root ($\xi_+ = \xi_- = -i$). The solution
then is given by $T_j^{(n)} = (A + Bn)\xi^n e^{ij\theta}$ in general, and for a FE first step, it is
$$T_j^{(n)} = (1 + in)(-i)^n e^{ij\theta}, \tag{2.7-154}$$
where $c\sin\theta = 1$.
An overall picture of the LF2 behavior can be obtained by examining $\xi_\pm$ in the complex
plane, which we present in Figure 2.7-26, in which the arrows depicting the parameter
$\theta$ ($0 \le \theta \le \pi$) that look 'parallel' to the unit circle are actually on the unit circle. The
symmetric behavior between $\xi_+$ and $\xi_-$ is only 'violated' for $c > 1$ and $\sin^{-1}(1/c) < \theta <
\pi - \sin^{-1}(1/c)$.
Fig. 2.7-26 Amplification factors for leapfrog. [Panels: (a) $c < 1$; (b) $c > 1$, where the roots leave the unit circle between $\theta = \sin^{-1}(1/c)$ and $\pi - \sin^{-1}(1/c)$, with $\xi_- = -i(\sqrt{c^2 - 1} + c)$ at $\theta = \pi/2$; axes $\mathrm{Im}(\xi)$ versus $\mathrm{Re}(\xi)$.]
The phase speeds of the two roots, meaningful only when $c < 1$ in practice, are given
by $R_s = -\arg(\xi)/c\theta = -\varphi/c\theta$ in general [see (2.7-132)] and
$$R_s^+ = -\frac{1}{c\theta}\tan^{-1}\left(\frac{-c\sin\theta}{\sqrt{1 - c^2\sin^2\theta}}\right) = \frac{\sin^{-1}(c\sin\theta)}{c\theta} \tag{2.7-155}$$
and
$$R_s^- = -\frac{1}{c\theta}\left[\sin^{-1}(c\sin\theta) \pm \pi\right] \tag{2.7-156}$$
in particular, both valid for $c \le 1$ and $\theta \le \pi$. The difference between the two choices of
sign for $R_s^-$ gives a difference in phase speed of $2\pi/c\theta$ that corresponds to a difference in
phase (wave location) of $2\pi/\theta$ per timestep, which is just $\lambda/\Delta x$ for the chosen wave; i.e.,
the difference is not discernible: either choice of sign gives the same numerical result,
even though one choice appears to give a positive phase speed and the other a negative
one. The spurious root generates an ambiguous wave.
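The closed-form solution (2.7-153) and the repeated-root case (2.7-154) can be checked against a direct march of the three-level recursion. The sketch below assumes the Fourier-transformed leapfrog scheme for pure advection with lumped mass, $T^{n+1} = T^{n-1} - 2ic\sin\theta\,T^n$, started with one FE step; the function names are illustrative, and the common factor $e^{ij\theta}$ is omitted.

```python
import cmath
import math

def lf2_closed_form(n, theta, c):
    """(2.7-153) without the common factor exp(i*j*theta); requires c*sin(theta) != 1."""
    s = c * math.sin(theta)
    root = cmath.sqrt(1 - s * s)
    xi_plus = root - 1j * s          # physical root
    xi_minus = -root - 1j * s        # spurious root
    a = (1 + root) / (2 * root)
    b = (-1 + root) / (2 * root)
    return a * xi_plus**n + b * xi_minus**n

def lf2_march(n, theta, c):
    """March T^{n+1} = T^{n-1} - 2i c sin(theta) T^n with an FE first step."""
    s = c * math.sin(theta)
    prev, cur = 1.0 + 0j, 1.0 - 1j * s
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, prev - 2j * s * cur
    return cur

print(lf2_closed_form(10, math.pi / 4, 0.8), lf2_march(10, math.pi / 4, 0.8))
print(lf2_march(5, math.pi / 2, 1.0), (1 + 5j) * (-1j) ** 5)   # repeated root
```

In the repeated-root case the amplitude grows linearly with $n$, which is the linear instability referred to above.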
These phase speeds are plotted in Figure 2.7-27 and 2.7-28 for several values of c
and using the-sign in (2.7-156), and the following remarks apply—considering also the
previous figures (£±) and equations:
1. Only c < 1 gives bounded solutions; and then, only small 6 gives accurate ones (a ~ 1
and b « 0)—except for the exceptional case of c ~ 1 and 6 ^ n/2.
1 .U
0.9
0.8
T3 0.7
CD
CD
wO.6
CD
(/)
i=U0.5
Q.
CD
■3 0.4
CD
CD
"■ 0.3
0.2
1.0
n
—
—
—
^^^■'•r- ""■--. C = 0.9 \
l
i /
c<o.N<Cvs v\ \c = i.o
/ / \\%>\
C = 0.5X / N$V\
0 = 0.7/ ^v
^&
N%.
I I I I I
—
—
—
—
—
—
—
0.5
1.0
1.5
2.0
2.5
3.0
Fig. 2.7-27 Phase speed for LF2; pure advection with lumped mass.
■a
CD
CD
O
(/)
ase
ve ph
laf
CD
IX
IU
9
8
7
6
5
4
3
2
1
I
I
\ ' \ I \ I \ I I I
\W \ \
\ \ \ \ \
\ \ V \ C = 0.2\
W \ \ \
\ \ \ '••• \
\ ' v '■■• ^\
\ \ \ ■■-.. 0.4
\ \ V-
\ \ "V
\ \ V0-6
\ ~*-
I I ^-^ I I I-
I
~
—
—
—
J.
0.1
0.6
1.1
1.6
e
2.1
2.6
3.1
Fig. 2.7-28 Phase speed for leapfrog's spurious root.
310 THE ADVECTION-DIFFUSION EQUATION
2. At c = 1, the 4Δx wave (θ = π/2) is always linearly unstable, and the curve for θ ≤ π/2 is given by R_S = π/θ − 1. [See also Miller (1991).]
3. For c > 1, the physical mode is damped (|ξ₊| < 1) for θ in the 'neighborhood' of π/2; i.e., for sin⁻¹(1/c) < θ < π − sin⁻¹(1/c), while the spurious mode displays unbounded growth with period 4Δt (|ξ₋| > 1) in this same θ range, with the 4Δx wave growing fastest.
4. For an example of the ambiguity, consider an 8Δx wave (θ = π/4) and c = 1. Here a = (1 + √2)/2 = 1.207, b = 1 − a ≈ −0.207, ξ₊ = (1 − i)/√2, ξ₋ = −(1 + i)/√2. For the first choice for phase speed (+ sign), we have R_P = 1 and R_S = −5; the physical mode has 'size' 1.207 and speed 1, while the spurious mode (noise) has size −0.207 and speed −5. The net result (sum of the two) is, of course, dispersion. For the second choice (− sign), however, while R_P remains at 1, we obtain R_S = 3. The difference in the two speeds for the 'noisy' part is 8, which means, since c = 1, that the difference is 8Δx per timestep, which is exactly one wavelength and not discernible.
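The 8Δx example in item 4 can be checked numerically. The sketch below assumes the standard leapfrog characteristic equation for lumped-mass pure advection, ξ² + 2ic sin θ ξ − 1 = 0, and the physical phase speed R_P = sin⁻¹(c sin θ)/(cθ); both are taken from the analysis preceding this passage rather than shown here, and the function names are ours.

```python
import cmath
import math

def leapfrog_roots(c, theta):
    """Roots of xi**2 + 2j*c*sin(theta)*xi - 1 = 0 (assumed leapfrog/LM form)."""
    s = c * math.sin(theta)
    root = cmath.sqrt(1.0 - s * s)
    # First root tends to +1 as theta -> 0 (physical); second tends to -1 (spurious).
    return root - 1j * s, -root - 1j * s

def relative_phase_speeds(c, theta):
    """R_P and both sign choices of R_S in (2.7-156); needs c*sin(theta) <= 1."""
    phi = math.asin(c * math.sin(theta))
    r_p = phi / (c * theta)
    r_s_plus = -(phi + math.pi) / (c * theta)    # '+' sign in (2.7-156)
    r_s_minus = -(phi - math.pi) / (c * theta)   # '-' sign in (2.7-156)
    return r_p, r_s_plus, r_s_minus

# The 8*dx wave at unit Courant number discussed in item 4.
xi_p, xi_s = leapfrog_roots(1.0, math.pi / 4)
r_p, r_s_plus, r_s_minus = relative_phase_speeds(1.0, math.pi / 4)
```

The two spurious speeds differ by 2π/(cθ) = 8, i.e. by one full wavelength per timestep, which is why the sign choice cannot be detected on the grid.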
If diffusion is present (in 1D, 2D, or 3D) and treated via Dufort-Frankel, then some recent results by Kwok and Tam (1993) may prove useful, at least for second-order centered FDM:
$$\Delta t \le \left[(u^2 + v^2 + w^2)\left(\frac{1}{\Delta x^2} + \frac{1}{\Delta y^2} + \frac{1}{\Delta z^2}\right)\right]^{-1/2} \tag{2.7-157}$$
for constant diffusivity, a result which is then independent of k [cf. (2.7-28) and (2.7-29) for FE on the same equation] and therefore also applies to pure advection.
o Second-order Runge-Kutta. The last explicit method we shall examine is RK (Runge-Kutta), both second- and fourth-order. RK2 [(2.7-12) with γ = 1 for simplicity] is, for ẏ = −λy,
$$y_{n+1/2} = y_n - \frac{\lambda\Delta t}{2}\,y_n,$$
$$y_{n+1} = y_n + \Delta t\,\dot y_{n+1/2} = y_n - \lambda\Delta t\left(y_n - \frac{\lambda\Delta t}{2}\,y_n\right) = \left[1 - \lambda\Delta t + (\lambda\Delta t)^2/2\right]y_n, \tag{2.7-158}$$
and our amplification factor for the AD equation is thus
$$\xi = 1 - \lambda\Delta t + (\lambda\Delta t)^2/2, \tag{2.7-159}$$
with λΔt given by (2.7-140) and we choose m(θ) = 1 for LM. The results, including a few that are unstable, are shown in Figure 2.7-29, which we believe merit the following
Remarks:
(1) While RK2 is, like AB2 and FE, unconditionally unstable for pure advection, it is 'reasonably' stable for the advection-dominated case; i.e., c = 0.416 is not too terribly stringent.
(2) For P ≤ 1, the stability limit is that of pure diffusion, α ≤ 1 (c ≤ P).
(3) The damping characteristics for P = 100 are quite good.
[Fig. 2.7-29 Phase and amplitude results for RK2 with LM at three values of P. Six panels, each plotted against θ: (a), (c), (e) phase speed; (b), (d), (f) amplitude. Panel (c) is labeled P = 1 and panel (e) P = 0.1.]
(4) The 4Δx wave is very special (strange) when P = 1 and c = 1 (the stability limit); it has an ambiguous phase speed and a damping factor of zero! Whatever its phase speed is, it is gone after one timestep, because ξ = 0.
(5) The negative phase speeds for short waves in the diffusion-dominated case (and for P = 1, the 'transitional' case) are disconcerting; while stable (for c ≤ P), they surely are not accurately represented. It seems that α ≲ 0.1 is needed for good accuracy in these cases.
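Remarks (1) and (4) are easy to reproduce. The sketch below evaluates (2.7-159) with an assumed form of λΔt, since (2.7-140) itself is not reproduced in this passage: we take λΔt = [ic sin θ + α(1 − cos θ)]/m(θ) with α = c/P, which is consistent with the pure-advection expression quoted for RK4 and with the limits α ≤ 1 and c ≤ P quoted above. The function names are ours.

```python
import math

def lambda_dt(c, P, theta, m=1.0):
    """Assumed form of (2.7-140): advection term i*c*sin(theta) plus diffusion
    term alpha*(1 - cos(theta)), alpha = c/P, divided by the mass factor m."""
    alpha = c / P                      # P = math.inf gives pure advection
    return (1j * c * math.sin(theta) + alpha * (1.0 - math.cos(theta))) / m

def rk2_xi(z):
    """RK2 amplification factor (2.7-159), with z = lambda*dt."""
    return 1.0 - z + z * z / 2.0

# Remark (4): the 4*dx wave at P = 1, c = 1 is annihilated in one step.
xi_4dx = rk2_xi(lambda_dt(c=1.0, P=1.0, theta=math.pi / 2))

# Remark (1): pure advection always gives |xi| > 1, since |xi|**2 = 1 + mu**4/4
# for lambda*dt = i*mu.
xi_adv = rk2_xi(lambda_dt(c=0.5, P=math.inf, theta=math.pi / 2))
```

With λΔt = 1 + i the factor is 1 − (1 + i) + (1 + i)²/2, which vanishes identically; that is the 'strange' 4Δx behavior noted in remark (4).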
o Fourth-order Runge-Kutta. Finally, we discuss our last explicit method for this section, RK4. (In the next section we shall subject a few specialized AD methods to this von Neumann analysis.) Applying (2.7-13) to ẏ = −λy gives, perhaps not surprisingly,
$$y_{n+1} = \left[1 - \lambda\Delta t + (\lambda\Delta t)^2/2 - (\lambda\Delta t)^3/6 + (\lambda\Delta t)^4/24\right]y_n, \tag{2.7-160}$$
and thus ξ for AD is given by the bracketed term. Figures 2.7-30 and 2.7-31 show the results, for CM and LM, respectively; the latter was really computed only to help us to believe the CM results. Again, some unstable results are also displayed.
[Fig. 2.7-30 Phase and amplitude results for RK4 at three values of P. Six panels, each plotted against θ: (a) phase speed and (b) amplitude, P = 100; (c) phase speed and (d) amplitude, P = 1; (e) phase speed and (f) amplitude, P = 0.1.]
[Fig. 2.7-31 Phase and amplitude results for RK4 with LM at three values of P. Same panel layout as Fig. 2.7-30.]
Remarks:
(1) For pure advection, λΔt = ic sin θ/m(θ) from (2.7-140), and it is easy to verify that |ξ| ≤ 1 for c sin θ/m(θ) ≤ √8. Hence, the stability limit for LM (m = 1) occurs at c = √8 = 2.828 at θ = π/2 (a 4Δx wave), and, for CM, at c = √8(2 + cos θ_c)/(3 sin θ_c) = √(8/3) = 1.633, where θ_c = 2π/3 (a 3Δx wave).
(2) For P = 100, the CM (LM) phase speed error is 'all spatial' for c ≲ 1 (2), where of course the spatial error is rather larger for LM.
(3) The damping factor (RAt) for P = 100 is somewhat unusual for both CM and LM; middle waves are rather overdamped as c approaches the stability limit, except those close to the critical (≈ 4Δx for LM and ≈ 3Δx for CM).
(4) For pure diffusion and LM, we have the (2Δx) stability limit α = 1.39 from Hirsch (1988, p. 448), which we extend to CM via α_c = 1.39/3 = 0.463; these results are close to the experimental results for both P = 1 and P = 0.1.
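The pure-advection limits in remark (1) can be confirmed with a few lines. This is a minimal sketch: it uses (2.7-160) with λΔt = iμ, μ = c sin θ/m(θ), and the consistent-mass factor m(θ) = (2 + cos θ)/3 implied by the CM formula in remark (1); the helper names and the grid search are ours.

```python
import math

def rk4_xi(z):
    """RK4 amplification factor, the bracketed term in (2.7-160), z = lambda*dt."""
    return 1.0 - z + z**2 / 2.0 - z**3 / 6.0 + z**4 / 24.0

def mass_factor(theta, consistent):
    """m(theta): (2 + cos(theta))/3 for consistent mass (CM), 1 for lumped mass (LM)."""
    return (2.0 + math.cos(theta)) / 3.0 if consistent else 1.0

def max_courant(consistent, n=20001):
    """Largest c with c*sin(theta)/m(theta) <= sqrt(8) for all theta in (0, pi)."""
    thetas = (k * math.pi / (n - 1) for k in range(1, n - 1))
    worst = max(math.sin(t) / mass_factor(t, consistent) for t in thetas)
    return math.sqrt(8.0) / worst

c_lm = max_courant(consistent=False)   # LM limit, reached by the 4*dx wave
c_cm = max_courant(consistent=True)    # CM limit, reached by the 3*dx wave
```

On the imaginary axis |ξ|² = 1 − μ⁶/72 + μ⁸/576, which returns to 1 exactly at μ² = 8; this is the origin of the √8 in remark (1).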
With this, we close (finally) our discussion of ODE methods applied to the AD equation with linear elements, leaving both quads and other ODE methods to others, except for the next short section!
d. Generalizations and extensions
The results presented above, while rather thorough in some ways, are noticeably deficient
in others. Here are two:
1. Only linear elements were considered.
2. Only phase speed, not group speed, was considered.
Thus, to partially remedy these shortcomings we offer below a methodology, and a
few results using it, that may be profitably pursued by others. But we shall also restrict
the discussion and analysis to the more important limiting case—pure advection. (The
methodology, however, is not so limited.) The methodology depends/relies upon a
knowledge of the eigenproblem results for the spatial discretization chosen—linear FEM being
just one special case.
Thus, we begin by returning to the general ODE's that describe pure advection,
$$M\dot T + KT = 0, \tag{2.7-161}$$
which is intended to be a generalization of (2.6-17) for linear FEM; i.e., M and K could come from quadratic [cf. (2.6-37)] or cubic FEM's, or from a CVFEM, or even simple FDM with M = I. Now choose an eigenvector of M⁻¹K as an IC, say v, and find a solution in the form of (2.6-20) to obtain
$$Kv = i\omega Mv, \tag{2.7-162}$$
which, since v (and M, and K) are presumed known, is a 'defining' equation for the frequency, ω; iω is, of course, an eigenvalue of M⁻¹K. Once ω is known, so too is the solution of (2.7-161) with the given IC. But we care more about phase and group speeds, both available from the frequency, via
$$P = \omega/k \tag{2.7-163}$$
and
$$G = d\omega/dk, \tag{2.7-164}$$
à la (2.6-46) and (2.6-48) for linear FEM. We have changed the name of the phase speed from c to P because now c is reserved for uΔt/h, the Courant number. Note that the dependence of ω on the wave number implies, necessarily, that the eigenvector, v, is a (known) function of k, which we presume to be always true. After all, this is just some more 'Fourier analysis'; and, in fact, (2.6-20) will most often be the appropriate eigenvector.
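As a concrete instance of (2.7-162), the sketch below builds M and K for linear elements on a periodic uniform mesh and compares the eigenvalues of M⁻¹K with the Fourier result. The assumptions are ours: the usual linear-element stencils, M = (h/6)(1, 4, 1) and K = (u/2)(−1, 0, 1), and Fourier eigenvectors v_j = e^{ijθ}, for which ω = 3u sin θ/[h(2 + cos θ)].

```python
import numpy as np

def linear_fem_matrices(n, h, u):
    """Periodic consistent-mass matrix M and advection matrix K for linear elements."""
    eye = np.eye(n)
    shift_up = np.roll(eye, 1, axis=1)    # (shift_up @ T)[j] = T[j+1]
    shift_dn = np.roll(eye, -1, axis=1)   # (shift_dn @ T)[j] = T[j-1]
    M = h * (4.0 * eye + shift_up + shift_dn) / 6.0
    K = u * (shift_up - shift_dn) / 2.0
    return M, K

n, h, u = 16, 0.1, 1.0
M, K = linear_fem_matrices(n, h, u)

# Eigenvalues of M^{-1} K should be i*omega, one for each resolvable wave.
eigs = np.linalg.eigvals(np.linalg.solve(M, K))

# Fourier prediction from K v = i*omega*M v with v_j = exp(i*j*theta).
theta = 2.0 * np.pi * np.arange(n) / n
omega = 3.0 * u * np.sin(theta) / (h * (2.0 + np.cos(theta)))
```

With k = θ/h, the phase speed (2.7-163) then follows as P/u = 3 sin θ/[θ(2 + cos θ)], and G follows from (2.7-164) by differentiating ω with respect to k.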
Next, to show the methodology that we wish to introduce, we first get specific and apply a particular ODE method, TR in our case, to (2.7-161), but in a somewhat special way. TR on (2.7-161) yields, using T_{n+1} = ξT_n as usual,
$$KT_n = \frac{2}{\Delta t}\,\frac{1 - \xi}{1 + \xi}\,MT_n, \tag{2.7-165}$$
which clearly 'resembles' (2.7-162). All we need do now to complete the connection is to apply TR to the same eigenvector that we employed for the ODE. Thus, with T_0 = v and n = 0, it is clear that (1 − ξ)/(1 + ξ) = iωΔt/2; i.e.,
$$\xi = \frac{1 - i\omega\Delta t/2}{1 + i\omega\Delta t/2} = \frac{(1 - \omega^2\Delta t^2/4) - i\omega\Delta t}{1 + \omega^2\Delta t^2/4} = x - iy = |\xi|e^{-i\varphi}, \tag{2.7-166}$$
where, recall, ω is 'known.' Recalling, and repeating, the analysis leading to (2.7-131) yields the phase speed à la TR:
$$u_P = \frac{\varphi}{k\Delta t} = \frac{u}{c\theta}\tan^{-1}\frac{\omega\Delta t}{1 - \omega^2\Delta t^2/4} = \frac{2u}{c\theta}\tan^{-1}\frac{\omega\Delta t}{2}, \tag{2.7-167}$$
the last coming from the 'trig identity,' 2 tan⁻¹(θ/2) = tan⁻¹(θ/(1 − θ²/4)). But from (2.7-163), ω = kP and thus ωΔt = ΔtkP = (uΔt/h)(kh)(P/u) = cθP/u to give
$$u_P = \frac{2u}{c\theta}\tan^{-1}\left(\frac{c\theta}{2}\,\frac{P}{u}\right), \tag{2.7-168}$$
the final phase speed result for TR. Given P(θ)/u, the relative phase speed, from the ODE solution, (2.7-168) shows how TR approximates it as a function of θ (and, of course, c); clearly u_P → P for c → 0.
To now extend the analysis to group 'velocity,' we first return to (2.7-131), this time in the form
$$T_j^{(n)} = |\xi|^n v_j e^{-in\varphi} = |\xi|^n v_j e^{-i\tilde\omega t_n}, \tag{2.7-169}$$
giving ω̃ = φ/Δt and thus
$$u_G = \frac{d\tilde\omega}{dk} = \frac{1}{\Delta t}\frac{d\varphi}{dk}. \tag{2.7-170}$$
But we have φ = φ(ω) given so that, using also (2.7-164),
$$u_G = \frac{1}{\Delta t}\frac{d\varphi}{d\omega}\frac{d\omega}{dk} = \frac{G}{\Delta t}\frac{d\varphi}{d\omega} \tag{2.7-171}$$
as the final equation for the group speed for the general ODE method selected; i.e., we recall that both G(θ) and φ(ω) are 'given' from the spatial discretization selected (and its eigenproblem results). For TR, φ = 2 tan⁻¹(ωΔt/2) to give dφ/dω = Δt/(1 + ω²Δt²/4), and we thus obtain the group speed à la TR as
$$u_G = \frac{G}{1 + \omega^2\Delta t^2/4}, \tag{2.7-172}$$
and we are done, with our specific example. Other ODE methods will give other results in (2.7-165) et seq., the 'general' results being u_P = φ/kΔt and u_G = (G/Δt)(dφ/dω).
We conclude by presenting a summary table (Table 2.7-4) of these and other results, leaving both details and other ODE methods to others.

Table 2.7-4 Fully discrete results in terms of ODE results [ωΔt = cθP/u in all cases].

ODE method    u_P (1)                                                        u_G (2)
FE(3) & BE    (u/cθ) tan⁻¹(ωΔt)                                              G / [1 + (ωΔt)²]
LF2           (u/cθ) sin⁻¹(ωΔt)                                              G / √[1 − (ωΔt)²]
RK2(3)        (u/cθ) tan⁻¹{ωΔt / [1 − (ωΔt)²/2]}                             G [1 + (ωΔt)²/2] / [1 + (ωΔt)⁴/4]
TR            (2u/cθ) tan⁻¹(ωΔt/2)                                           G / [1 + (ωΔt/2)²]
RK4           (u/cθ) tan⁻¹{ωΔt[1 − (ωΔt)²/6] / [1 − (ωΔt)²/2 + (ωΔt)⁴/24]}   G [1 − (ωΔt)⁴/24 + (ωΔt)⁶/144] / [1 − (ωΔt)⁶/72 + (ωΔt)⁸/576]

(1) For linear FEM, P/u is given by (2.6-46) and for quads by (2.6-50).
(2) For linear FEM, G/u is given by (2.6-48) and for quads by (2.6-52).
(3) Unstable; presented for completeness. A 'small' amount of diffusion could stabilize the methods without much affecting the results in the table.

So now we are really finished, with apologies for not showing some of these results in pictures. Time just ran out.
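The 'general' results u_P = φ/kΔt and u_G = (G/Δt)(dφ/dω) lend themselves to a numerical cross-check of any closed form of this kind. The sketch below is ours, not the authors': it takes an amplification factor as a function of s = ωΔt, extracts φ from ξ = |ξ|e^{−iφ}, and differentiates φ by central differences, so that u_P/P = φ/s and u_G/G = dφ/ds. The step size eps is an arbitrary choice.

```python
import cmath
import math

def tr_xi(s):
    """TR amplification factor (2.7-166), s = omega*dt."""
    return (1.0 - 0.5j * s) / (1.0 + 0.5j * s)

def rk4_xi(s):
    """RK4 amplification factor (2.7-160) with lambda*dt = i*s (pure advection)."""
    z = 1j * s
    return 1.0 - z + z**2 / 2.0 - z**3 / 6.0 + z**4 / 24.0

def phase(xi_func, s):
    """phi defined by xi = |xi|*exp(-i*phi); valid while phi stays inside (-pi, pi)."""
    return -cmath.phase(xi_func(s))

def dphi_ds(xi_func, s, eps=1e-6):
    """Central-difference estimate of d(phi)/d(omega*dt), i.e. u_G/G."""
    return (phase(xi_func, s + eps) - phase(xi_func, s - eps)) / (2.0 * eps)

s = 0.8
ratio_phase_tr = phase(tr_xi, s) / s     # u_P / P for TR
ratio_group_tr = dphi_ds(tr_xi, s)       # u_G / G for TR
ratio_group_rk4 = dphi_ds(rk4_xi, s)     # u_G / G for RK4
```

Any other one-step method can be checked the same way by supplying its own ξ(s); multi-root methods such as LF2 would need the physical root selected first.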
2.7.7 Other (Different) Methods Used by Others
In this final section on methods we merely perform (probably poorly) a sort of 'duty' by
first noting that there are many other 'interesting' time-integration methods used in CFD
(but not by us), and then by pointing out some of them—briefly. While we cannot fully
embrace these methods for various reasons (not the least of which is that we have not
tested them—who has the time and personnel? Would that we did!)—we would be remiss
if we neglected even to mention at least what seem to be the best (or most popular) of
them. Perhaps we will even find time to implement the 'best' of them. Some—indeed
perhaps most—of the methods have been derived specifically for the advection-diffusion
(AD) equation or even just the pure advection equation, and are thus custom/specialized
methods. Others were derived for solving the NS equations of the next chapter but can
also be applied to AD. A nearly common feature of all of them is that they eschew GFEM
(rightly or wrongly) for advection-dominated flows, thus displaying somewhat 'religious'
underpinnings (as do we, of course). We are, however, ready to admit that there may be
one or more 'winners' among them. Probably only time (lots of it, most likely) will tell.
Before describing a few particular methods, we point out that the reader interested in
learning about the wide choice of schemes (FEM and not) available before 1986 or so
will find the paper by Rood (1987) useful and interesting as it attempts to summarize the
many methods (with many more references) applied to advection and advection-diffusion.
Interesting quotations: 'Originally, an attempt was made to review the literature of plasma
physics, meteorology, oceanography, computational physics, applied mathematics, and air
pollution research. Well over 100 schemes were found, and during the research for this
article, at least ten new algorithms were introduced, and at least four new comparison
studies were published.' Another interesting pair of 'comparison' papers are those by
Chock and Dunker (1983) and Chock (1985). Another survey paper is that by Thompson
(1984). For a more recent survey of many of these 'other' schemes and some not discussed
herein, see Donea and Quartapelle (1992). Finally, for a new report on the 'status' of some
AD methods, see Baptista et al. (1995).
a. Methods based on trajectories/characteristics
Most of the better schemes are based in one way or another on the method of
characteristics (for hyperbolic equations). Two useful quotations in this regard are: 'Any method
for approximating hyperbolic equations sacrifices a good deal if it takes no account of the
method of characteristics'—Morton (1982); and 'Numerical schemes that follow
characteristics backwards in time and then interpolate at their feet have a history stretching back
to the very early days of computational fluid dynamics (Courant et al., 1952)'—Priestley
(1993).
That this method is good (read also, difficult) is perhaps best demonstrated by
realizing/recognizing the tremendous number of 'people-years' in a wide variety of disciplines
that have gone into its development—which development is not yet over, at least in some
disciplines. The number of names given to virtually the same method is also somewhat
remarkable.
Before attempting to describe the method(s), we mention the principal reasons that
it is worthy of such pursuit—some of which may only be realized in 'special' cases: it
is very accurate (sometimes more accurate for large timesteps than small ones!) because
(in part, at least) the constant in the error estimate is much smaller when the equations
are solved along the characteristics than otherwise (see, for example, Russell, 1990);
it is unconditionally stable (when done 'correctly'); and it involves linear systems of
equations with only SPD matrices. On the other hand, it is not perfect: it introduces
numerical diffusion and dispersion (both small, usually), and it only sometimes conserves
'mass.' On balance, though, those who use it believe that the advantages win big!
The method is based on the BMOC (backward method of characteristics), which is only one of its many names (perhaps the best one), and comes about as follows: given that we have the solution at the current time, T(x, t_n), we wish to find it at the next (or a later) time level t_{n+1} = t_n + Δt. This is done as follows: at time t_{n+1}, select a point of interest, say x_j, which is naturally taken as a node point on the mesh and, looking backwards along the trajectory (characteristic), ask the following question of an imaginary 'fluid particle' at this point, T(x_j, t_{n+1}): 'Where were you at t_n?' In the case of pure advection, the reason we ask this question is quite simple: the value of T(x_j, t_{n+1}) is exactly the value of T at the point and time in question. So the advection process becomes, appropriately and simply (again, simple in words), finding the trajectory of each 'particle' (fictitious Lagrangian moving point) of the mesh, there being one particle for each node point for each timestep; a new set of particles, one for each node (or integration point in some algorithms), is employed for each timestep. In the more general case, with diffusion and sources present, these processes must/should be accounted for during the transit of the 'particle', a complicating feature to be sure, but not insurmountable.
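The 'Where were you at t_n?' question translates almost directly into code. The following is only a minimal sketch for constant velocity on a periodic uniform 1D grid: it traces each node back a distance uΔt and then interpolates the old solution at the foot of the characteristic. Plain linear interpolation is used purely for illustration (our choice, not one of the schemes analyzed in this section), and the function name is hypothetical.

```python
import numpy as np

def bmoc_step_linear(T, u, dt, h):
    """One backward-characteristic step with periodic linear interpolation.

    T  : nodal values at t_n on the grid x_j = j*h
    u  : constant advection velocity
    dt : timestep (no CFL restriction is imposed)
    """
    n = T.size
    x = h * np.arange(n)
    x_foot = x - u * dt                  # where each node's 'particle' was at t_n
    # np.interp with `period` wraps the feet back into the periodic domain.
    return np.interp(x_foot, x, T, period=n * h)

n, h = 16, 0.125
x = h * np.arange(n)
T0 = np.sin(np.pi * x)                   # one full wave on the domain of length 2
T1 = bmoc_step_linear(T0, u=1.0, dt=0.25, h=h)   # Courant number c = 2
```

At integer Courant number the feet land exactly on nodes, so the step is a pure shift; for fractional c the interpolation error is all that remains, which is the same 'fractional part' behavior discussed for the Galerkin version of the method.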
Remark:
It is the act of looking backwards along characteristics that leads to the unconditional stability of the method. In contrast, many early characteristic-based methods looked forward and thus encountered the typical CFL stability limit. See Staniforth and Côté (1991) for references to some of these earlier stability-limited methods; but in particular we cite Tremback et al. (1987) as a good recent example of 'forward-looking' characteristics methods, and Smolarkiewicz and Rasch (1991) as a recent example of how to convert all of the Tremback et al. methods to 'backward-looking' characteristic methods.
Since we (regrettably) do not have personal experience with this class of methods, our coverage of it will be both brief and non-authoritarian; we will, however, steer the interested reader toward the bulk of the vast literature on the subject. Regarding actual details, all that we shall do is present the simplest possible case (pure advection in 1D on a uniform grid with constant velocity and linear basis functions) in sufficient detail that the gist of the method will hopefully be realized. This presentation will be enough to convince the reader that although the concept is simple, the realization is not. (Computational advection is not easy, no matter how you look at it.) To see this method in its simplest form, for expository purposes, consider the constant-velocity pure advection equation in 1D, T_t + uT_x = 0, written more usefully here as DT/Dt = 0, with known exact solution
$$T(x,t) = T_0(x - ut), \tag{2.7-173}$$
where T_0(x) is the IC, stating simply that T does not change when followed on a characteristic curve, a trajectory; here x = ut + x_0 for all x_0. In pictures, we have the situation shown in Figure 2.7-32.
The simplest finite difference approximation to DT/Dt = 0 is
$$\frac{T(x,t_{n+1}) - T(x - u\Delta t, t_n)}{\Delta t} = 0, \tag{2.7-174}$$
or
$$T(x,t_{n+1}) = T(x - u\Delta t, t_n) \equiv T^*(x); \tag{2.7-175}$$
the discrete time-continuous space approximate solution is, in this semi-trivial case, also the exact PDE solution. But we must discretize in space, too, and this is where the fun begins. Since we want a finite element approximate solution, we begin by expressing (2.7-175) weakly (Galerkin form):
$$\int\varphi_i\,T(x,t_{n+1}) = \int\varphi_i\,T^*(x) \quad \forall i. \tag{2.7-176}$$

[Fig. 2.7-32 Pure advection: the initial profile translates a distance ut along x, T(x,t) = T_0(x − ut).]
Next, express T(x, t_{n+1}) in the conventional GFEM manner, T(x, t_{n+1}) → T^h(x, t_{n+1}) = Σ_{j=1}^{N} T_j^{n+1} φ_j(x), to arrive at
$$\sum_{j=1}^{N} T_j^{n+1}\int\varphi_i\varphi_j = \int\varphi_i\,T^*(x) \quad \forall i, \tag{2.7-177}$$
and we recognize the familiar mass matrix on the LHS and thus realize that our solutions represent an L2-projection (see Appendix 3) of T*(x), a best fit to T* via the FEM basis functions. We also see, by setting φ_i = 1, that the method (in theory) displays conservation of T: Σ_j T_j^{n+1} ∫φ_j = ∫T*. When the integrations are replaced by imperfect quadratures, however, conservation is also imperfect. So far, so good, and seemingly very simple. The 'problem' is the quadrature on the RHS; T*(x) is not a 'nice' function, representing (approximately) as it does T(x − uΔt, t_n), the solution at an earlier time translated along x from x − uΔt to x. To make further progress, we simplify (2.7-177) to the case with linear basis functions (the most common case by far in the literature) to obtain
$$\frac{h}{6}\left(T_{i-1}^{n+1} + 4T_i^{n+1} + T_{i+1}^{n+1}\right) = \int\varphi_i(x)\,T^*(x) \quad \forall i. \tag{2.7-178}$$
Next, we attempt to represent T*(x) = T^h(x − uΔt, t_n) via the same functions, T*(x) = Σ_j T_j* φ_j(x) = Σ_j T_j^n φ_j(x − uΔt), to obtain
$$\frac{h}{6}\left(T_{i-1}^{n+1} + 4T_i^{n+1} + T_{i+1}^{n+1}\right) = \sum_j T_j^n\int\varphi_i(x)\,\varphi_j(x - u\Delta t) \quad \forall i. \tag{2.7-179}$$
While (2.7-179) is actually in the final form needed to 'write code,' the whole process may be better appreciated/comprehended via another sketch; see Figure 2.7-33. The solid curve labeled T^h(x, t_n) is the known solution at time t_n, and the dashed curve labeled T*(x) is its translate via uΔt and is in fact the exact solution at time t_{n+1}. But we cannot exactly represent this piecewise-linear function via our chosen basis functions (because our nodes are not in the right places), so we must project it 'down,' giving the function shown with small dashes and labeled φ_i(x)T*(x). This is, for node i in the sketch, the function whose integral forms the RHS of (2.7-178).

[Fig. 2.7-33 Characteristic GFEM for pure advection; linear elements. The sketch spans nodes i − 3 through i + 2 along x.]

It is instructive to show the step-by-step construction of this RHS because it will show that the method will surely not be easy to implement in the general (multi-dimensional, isoparametric) case. Placing the origin at node i gives
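$$\begin{aligned}
\text{RHS} &= \sum_j T_j^n\int\varphi_i(x)\,\varphi_j(x - u\Delta t) \\
&= T_{i-2}^n\int\varphi_i(x)\,\varphi_{i-2}(x - u\Delta t) + \cdots + T_{i+1}^n\int\varphi_i(x)\,\varphi_{i+1}(x - u\Delta t) \\
&= T_{i-2}^n\int_{-h}^{-h+u\Delta t}\left(1 + \frac{x}{h}\right)\left(-1 + \frac{u\Delta t - x}{h}\right) \\
&\quad + T_{i-1}^n\left[\int_{-h}^{-h+u\Delta t}\left(1 + \frac{x}{h}\right)\left(2 + \frac{x - u\Delta t}{h}\right) + \int_{-h+u\Delta t}^{0}\left(1 + \frac{x}{h}\right)\frac{u\Delta t - x}{h} + \int_{0}^{u\Delta t}\left(1 - \frac{x}{h}\right)\frac{u\Delta t - x}{h}\right] \\
&\quad + T_i^n\left[\int_{-h+u\Delta t}^{0}\left(1 + \frac{x}{h}\right)\left(1 + \frac{x - u\Delta t}{h}\right) + \int_{0}^{u\Delta t}\left(1 - \frac{x}{h}\right)\left(1 + \frac{x - u\Delta t}{h}\right) + \int_{u\Delta t}^{h}\left(1 - \frac{x}{h}\right)\left(1 - \frac{x - u\Delta t}{h}\right)\right] \\
&\quad + T_{i+1}^n\int_{u\Delta t}^{h}\left(1 - \frac{x}{h}\right)\frac{x - u\Delta t}{h}.
\end{aligned}$$
Letting z = x/h and c = uΔt/h gives
$$\begin{aligned}
\frac{\text{RHS}}{h} &= T_{i-2}^n\int_{-1}^{-1+c}(1+z)(-1-z+c) \\
&\quad + T_{i-1}^n\left[\int_{-1}^{-1+c}(1+z)(2-c+z) + \int_{-1+c}^{0}(1+z)(c-z) + \int_{0}^{c}(1-z)(c-z)\right] \\
&\quad + T_i^n\left[\int_{-1+c}^{0}(1+z)(1-c+z) + \int_{0}^{c}(1-z)(1-c+z) + \int_{c}^{1}(1-z)(1+c-z)\right] \\
&\quad + T_{i+1}^n\int_{c}^{1}(1-z)(z-c) \\
&= \frac{1}{6}\Big\{c^3T_{i-2}^n + \left[c^2(3-c) + (1-c)(c^2+4c+1) + c^2(3-c)\right]T_{i-1}^n \\
&\qquad + \left[(1-c)(2-c-c^2) + c(6-6c+c^2) + (1-c)(2-c-c^2)\right]T_i^n + (1-c)^3T_{i+1}^n\Big\},
\end{aligned}$$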
which is the RHS of (2.7-179). Inserting it into (2.7-179) and rearranging yields a recognizable form of the 'characteristic-Galerkin' equation for node j:
$$\frac{(T_{j-1}^{n+1} - T_{j-1}^n) + 4(T_j^{n+1} - T_j^n) + (T_{j+1}^{n+1} - T_{j+1}^n)}{6} + \frac{c}{2}\left(T_{j+1}^n - T_{j-1}^n\right) = \frac{c^2}{2}\left(T_{j-1}^n - 2T_j^n + T_{j+1}^n\right) + \frac{c^3}{6}\left(T_{j-2}^n - 3T_{j-1}^n + 3T_j^n - T_{j+1}^n\right), \tag{2.7-180}$$
which induces the following
Remarks:
(1) Except for the c³ (dispersive) term, it looks like forward Euler + BTD; see Section 2.7.2e.
(2) It was previously presented in Hasbani et al. (1983), as their equation (15); they also briefly study the errors induced by approximate (Gauss-Legendre) integration. See also Celia et al. (1990), who obtained this result as a special case of their ELLAM method.
(3) It possesses the 'unit CFL' property; if c = 1, then T_j^{n+1} = T_{j−1}^n, at least for periodic BC's.
(4) It is unconditionally stable (when done properly for c > 1), a result we prove below.
(5) It is easy to see that, for h → 0, the equation approximates
$$\frac{(T_{j-1}^{n+1} - T_{j-1}^n) + 4(T_j^{n+1} - T_j^n) + (T_{j+1}^{n+1} - T_{j+1}^n)}{6\Delta t} + uT_x|_j = \frac{u^2\Delta t}{2}T_{xx}|_j - \frac{u^3\Delta t^2}{6}T_{xxx}|_j.$$
(6) The name given to the above result (but usually not to the procedure employed to obtain it; see below) in the meteorological literature is 'semi-Lagrangian,' stemming from the Lagrangian form of the time-derivative and from the use of a new set of (imaginary) particles for each timestep (see, for example, Staniforth and Côté, 1991).
Before studying the accuracy and stability of this apparently dissipative and dispersive scheme, we point out how this characteristic Galerkin method is to be used if c > 1:
1. Set c − I = c̄, where I is an integer and 0 ≤ c̄ ≤ 1; i.e., I counts the number of nodes (elements) skipped over when looking backwards along the characteristic to get to x − uΔt when uΔt > Δx.
2. Subtract I from all indices in (2.7-180); j → j − I.
3. Replace c by c̄ in (2.7-180).
More Remarks:
(1) It will turn out—for the simple case of constant velocity at least—that the method
is actually more accurate for large c than small c!
(2) The variable grid case is (for constant velocity in ID) a simple extension of the
above procedure—an exercise we leave to the reader.
(3) The multi-D case is not a simple extension—especially on unstructured meshes.
(4) For an example with both variable velocity and variable mesh, see Roache (1992b).
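The characteristic-Galerkin update, including the three-step recipe for c > 1, can be sketched for a periodic uniform grid as follows. The right-hand side weights are the four coefficients obtained from the integrals above, the left-hand side is the (1, 4, 1)/6 mass matrix, and the integer part I of the Courant number becomes an index shift. The dense solve and the function name are our simplifications; a production code would use a cyclic tridiagonal solver.

```python
import numpy as np

def cg_step(T, c):
    """One characteristic-Galerkin step (linear elements, periodic, constant u).

    T : nodal values at t_n
    c : Courant number u*dt/h (any non-negative value)
    """
    I = int(np.floor(c))          # whole elements skipped along the characteristic
    cb = c - I                    # fractional part, 0 <= cb < 1
    # Weights multiplying T[j-2-I], T[j-1-I], T[j-I], T[j+1-I] on the RHS.
    w = np.array([cb**3,
                  1.0 + 3.0*cb + 3.0*cb**2 - 3.0*cb**3,
                  4.0 - 6.0*cb**2 + 3.0*cb**3,
                  (1.0 - cb)**3]) / 6.0
    # np.roll(T, s)[j] == T[j - s], so the shifts are I+2, I+1, I, I-1.
    rhs = sum(wk * np.roll(T, I + s) for wk, s in zip(w, (2, 1, 0, -1)))
    n = T.size
    eye = np.eye(n)
    M = (4.0*eye + np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)) / 6.0
    return np.linalg.solve(M, rhs)

rng = np.random.default_rng(0)
T0 = rng.standard_normal(12)
T_unit = cg_step(T0, 1.0)         # 'unit CFL' case
T_frac = cg_step(T0, 2.37)        # I = 2, fractional part 0.37
```

Because the four weights sum to one and the mass matrix has unit row and column sums, the nodal sum of T is preserved by each step, which is the discrete counterpart of the conservation statement that follows (2.7-177).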
To examine accuracy and stability in the standard way (Fourier analysis à la von Neumann), we seek a solution to the general (c > 1) equation [j → j − I and c → c̄ in (2.7-180)] via T_j^{(n)} = ξⁿe^{ijθ} to get
$$\left(\xi e^{iI\theta} - 1\right)\frac{2 + \cos\theta}{3} + i\bar c\sin\theta = -\bar c^2(1 - \cos\theta) + \bar c^3(3 - 4\cos\theta + \cos 2\theta)/6 + i\bar c^3(2\sin\theta - \sin 2\theta)/6,$$
or
$$\xi = e^{-iI\theta}\left\{1 - \frac{3\bar c}{2 + \cos\theta}\left[\bar c(1 - \cos\theta) - \frac{\bar c^2}{6}(3 - 4\cos\theta + \cos 2\theta) + i\left(\sin\theta - \frac{\bar c^2}{6}(2\sin\theta - \sin 2\theta)\right)\right]\right\}, \tag{2.7-181}$$
wherein the unimodular factor e^{−iIθ} accounts for the shift (jump) over I elements (I = 0 describes the case analyzed in detail above). Using now ξ = (x + iy)e^{−iIθ} and thus T_j^{(n)} = (√(x² + y²))ⁿ e^{i[jkh + n tan⁻¹(y/x) − nIθ]} = |ξ|ⁿ e^{ik(x_j − u_P t_n)} gives
$$\frac{u_P}{u} = \frac{I\theta}{uk\Delta t} - \frac{1}{uk\Delta t}\tan^{-1}\frac{y}{x} = \frac{I}{I + \bar c} - \frac{1}{(I + \bar c)\theta}\tan^{-1}\left\{\frac{-\bar c\sin\theta + \bar c^3(2\sin\theta - \sin 2\theta)/6}{\dfrac{2 + \cos\theta}{3} - \bar c\left[\bar c(1 - \cos\theta) - \bar c^2(3 - 4\cos\theta + \cos 2\theta)/6\right]}\right\} \tag{2.7-182}$$
as the numerical phase speed. Note that I/(I + c̄) = 1 − c̄/c and I + c̄ = c; thus, c → ∞ ⇒ u_P/u → 1: perfect advection via 'translation.' This is a statement of the fact that all of the error in this method is caused by the 'fractional part' of the Courant number; if c̄ = 0, then there is no phase error. This beautiful behavior is, of course, related to the use of a constant velocity on a uniform mesh in 1D.
In Figure 2.7-34 we show some amplitude and phase speed results. We observe that |ξ| ≤ 1 for all c̄ and all I and all θ (for c̄ ≤ 1, which is true by construction/definition).
Additional Remarks:
(1) Perfect results are obtained for c̄ = 1 and any I.
(2) Larger I gives more accurate results.
(3) One of the few criticisms leveled against the BMOC is shown in the |ξ| plot: it is dissipative, although in a useful way, damping mostly the short waves. Finally, the symmetry about c̄ = 1/2 is interesting.
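These observations can be spot-checked by coding (2.7-181) directly. In the sketch below the argument cbar is the fractional Courant number c̄ and I the integer shift; the sampling grid used afterwards is an arbitrary choice of ours and is a numerical check, not a proof of stability.

```python
import cmath
import math

def cg_xi(cbar, theta, I=0):
    """Amplification factor (2.7-181) of the characteristic-Galerkin scheme."""
    re = (cbar * (1.0 - math.cos(theta))
          - cbar**2 * (3.0 - 4.0 * math.cos(theta) + math.cos(2.0 * theta)) / 6.0)
    im = math.sin(theta) - cbar**2 * (2.0 * math.sin(theta) - math.sin(2.0 * theta)) / 6.0
    body = 1.0 - 3.0 * cbar / (2.0 + math.cos(theta)) * complex(re, im)
    return cmath.exp(-1j * I * theta) * body

# Sample 0 <= cbar <= 1 and 0 < theta <= pi.
thetas = [k * math.pi / 50 for k in range(1, 51)]
cbars = [k / 20 for k in range(21)]
max_amp = max(abs(cg_xi(cb, th)) for cb in cbars for th in thetas)
```

On this sample the largest |ξ| does not exceed one, c̄ = 1 reproduces the exact shift factor e^{−iθ}, and |ξ| at c̄ and at 1 − c̄ coincide, which is the symmetry about c̄ = 1/2 noted in remark (3).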
[Fig. 2.7-34 Relative phase speed and amplitude for pure advection. Four panels, each plotted against θ: (a) relative phase speed, I = 0; (b) relative phase speed, I = 1; (c) relative phase speed, C = 0.3; (d) amplitude (I = 0).]

Having shown that this BMOC (backward method of characteristics, not Big Man On Campus) has some really good qualities, we now show how that cumbersome RHS integral can be alternately (and exactly, still) evaluated using cubic spline interpolation; i.e., cubic spline interpolation of both sides of (2.7-175), using linear basis functions, is equivalent to the projection derived above. This result was discovered by Bermejo (1990, 1991) and is perhaps not too surprising when it is noted that when φ_i(x) is linear, φ_i(x)φ_j(x) is quadratic, and its integral is cubic.
A convenient, but not necessary, cubic spline derivation of (2.7-180) is through the use of B-splines, which form a local basis in the linear space of cubic splines (B = Basis). First we recall the key property of a cubic spline, the smoothness property: it has C² continuity; the function and its first two derivatives are continuous. For further spline theory, see, for example, Ahlberg et al. (1967), De Boor (1978), and Schumaker (1981); for a brief discussion of cubic splines and finite elements, see Strang and Fix (1973). The B-spline we shall use is equivalent to that on p. 136 of Schumaker (1981) and is shown in Figure 2.7-35 (z = x/h); it is defined piecewise over four elements:
$$\Phi_0(z) = \begin{cases}
B_1(z) = (8 + 12z + 6z^2 + z^3)/6 & \text{for } -2 \le z \le -1 \qquad (2.7\text{-}183)\\
B_2(z) = (4 - 6z^2 - 3z^3)/6 & \text{for } -1 \le z \le 0 \qquad (2.7\text{-}184)\\
B_3(z) = (4 - 6z^2 + 3z^3)/6 & \text{for } 0 \le z \le 1 \qquad (2.7\text{-}185)\\
B_4(z) = (8 - 12z + 6z^2 - z^3)/6 & \text{for } 1 \le z \le 2. \qquad (2.7\text{-}186)
\end{cases}$$
We can now state and prove our (1D) version of Bermejo's (2D) Equivalence Theorem.
The solution of (2.7-179), i.e., the characteristic Galerkin solution of (2.7-176) via linear basis functions, is equivalent to each of:
1. Put a cubic spline through all node point values of T^n at t_n and, for each node i, evaluate the result via cubic spline interpolation at the point uΔt upstream of node i to give T_i^{n+1}. Note that the first step involves the solution of a tridiagonal system whose matrix is equivalent to, if not identical to, the FEM mass matrix. For proof, see Bermejo (1990, 1991).
2. At time t_{n+1}, for each node i, form the sum Σ_j T_j^{n+1} Φ_j(x_i) from the (unknown) nodal values and the B-spline basis functions. This gives (1/6)(T_{i−1}^{n+1} + 4T_i^{n+1} + T_{i+1}^{n+1}) and corresponds to the LHS of (2.7-179). Also, for each node i, form the RHS sum Σ_j T_j^n Φ_i(x_j + uΔt), in which Φ_i(x + uΔt) corresponds to a leftward (upstream) shift of the B-spline Φ_i(x) by a distance uΔt. The result is the RHS of (2.7-179). Solve the resulting linear system.

[Fig. 2.7-35 A B-spline, plotted for −2 ≤ z ≤ 2.]
[Fig. 2.7-36 The dashed curve on the left represents the cubic fit through T(x, t_n), × represents a B-spline node, and u = ch/Δt.]

Proof of (2). First we present in Figure 2.7-36 another helpful sketch, in which I is (as before) the integral portion of the number of elements traversed in Δt. The dashed curve on the left represents the cubic fit through T(x, t_n). The LHS follows easily (instantly) from the right sketch; the left sketch yields the RHS as
$$T_{i-2-I}^n\Phi_i(x_{i-2-I} + u\Delta t) + T_{i-1-I}^n\Phi_i(x_{i-1-I} + u\Delta t) + T_{i-I}^n\Phi_i(x_{i-I} + u\Delta t) + T_{i+1-I}^n\Phi_i(x_{i+1-I} + u\Delta t),$$
which, using (2.7-183) through (2.7-186) and c = c̄ + I, can be rewritten as
$$T_{i-2-I}^nB_1(-2 + \bar c) + T_{i-1-I}^nB_2(-1 + \bar c) + T_{i-I}^nB_3(\bar c) + T_{i+1-I}^nB_4(1 + \bar c) = \frac{1}{6}\left[T_{i-2-I}^n\bar c^3 + T_{i-1-I}^n(1 + 3\bar c + 3\bar c^2 - 3\bar c^3) + T_{i-I}^n(4 - 6\bar c^2 + 3\bar c^3) + T_{i+1-I}^n(1 - \bar c)^3\right],$$
which is easily seen to be the same as the T^n terms in (2.7-180), after setting I to 0 and c̄ to c there. QED.
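The heart of the equivalence is that the overlap integral of two unit 'hat' functions a distance s apart (in units of h) is exactly the cubic B-spline Φ_0(s). The sketch below checks this by brute-force quadrature; the trapezoid rule, the grid size, and the function names are our choices for illustration.

```python
import numpy as np

def hat(z):
    """Linear basis function of unit half-width centered at z = 0."""
    return np.maximum(0.0, 1.0 - np.abs(z))

def cubic_bspline(z):
    """Phi_0(z) of (2.7-183) to (2.7-186), written using the symmetry in z."""
    a = abs(z)
    if a <= 1.0:
        return (4.0 - 6.0 * a**2 + 3.0 * a**3) / 6.0
    if a <= 2.0:
        return (2.0 - a)**3 / 6.0
    return 0.0

def hat_overlap(s, n=400001):
    """Trapezoid-rule value of the integral of hat(z)*hat(z - s) over [-1, 1]."""
    z = np.linspace(-1.0, 1.0, n)
    f = hat(z) * hat(z - s)
    dz = z[1] - z[0]
    return dz * (f.sum() - 0.5 * (f[0] + f[-1]))

cb = 0.37
# B-spline values at -2+cb, -1+cb, cb, 1+cb: the four RHS weights in the proof.
weights = [cubic_bspline(k + cb) for k in (-2, -1, 0, 1)]
```

The four weights reproduce c̄³/6, (1 + 3c̄ + 3c̄² − 3c̄³)/6, (4 − 6c̄² + 3c̄³)/6 and (1 − c̄)³/6, so the Galerkin integrals never need to be evaluated element by element in this simple setting.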
As a short digression, it is of interest, given the close relationship between the Galerkin projection and the cubic spline interpolation, to inquire whether we can generate a cubic spline using the piecewise-linear basis functions. The answer is yes: suppose we are given a piecewise-linear function, f(x) = Σ_i f_i φ_i, and seek the best L2 fit (projection) to d²f/dx² via the same linear basis. Thus we seek y = Σ_j y_j φ_j = f″ in the weak form; i.e., we solve Σ_j y_j ∫φ_iφ_j = −Σ_k f_k ∫φ_i′φ_k′, or My = −Kf, for y, where K is the linear basis function 'diffusion' matrix. After solving for the {y_j}, we focus on the element spanning [x_i, x_{i+1}] and perform an integration of y = f″ to get ∫_0^x f″ = Σ_j y_j ∫_0^x φ_j, or f′(x) − f_i′ = y_i(x − x²/2h) + y_{i+1}x²/2h, where we chose x_i = 0 for convenience; here f_i′ is to be regarded as unknown. One more integration, this time from 0 to x_{i+1}, yields f_{i+1} − f_i = hf_i′ + h²(2y_i + y_{i+1})/6, which is used to evaluate the unknown value of f_i′. Finally, returning to the f′(x) equation and this time integrating between x_i (= 0) and x yields our cubic spline over the selected element:
$$f(x) = f_i + \left[\frac{f_{i+1} - f_i}{h} - \frac{h(2y_i + y_{i+1})}{6}\right]x + \frac{y_i}{2}\left(x^2 - \frac{x^3}{3h}\right) + \frac{y_{i+1}x^3}{6h}. \tag{2.7-187}$$
This cubic function displays C² continuity [because f″(x) = y_i + (y_{i+1} − y_i)x/h is continuous because y(x) is, by construction] and describes the cubic spline in terms of the known data, {f_j} and {y_j}; it also agrees with equation (2.1.2) on p. 10 of Ahlberg et al. (1967) after the appropriate simplifications are made.
Remarks:
(1) The special case of constant grid spacing was taken solely for the purpose of
simplifying the presentation. All equations in this section generalize easily to variable
element lengths.
(2) If quadratic basis functions were used in the above, then it would ostensibly turn out
that the equivalence between BMOC via Galerkin and spline interpolation would lead
to quintic splines, since quadratic $\varphi_i$ yields quartic $\varphi_i\varphi_j$ whose integral is a quintic
polynomial, etc. End digression.
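The digression can be checked numerically. The following is a minimal sketch, assuming a periodic uniform grid (a choice made here only so that $M$ and $K$ are simple circulant matrices); the function names are illustrative and not from the text. It solves $My = -Kf$ for the nodal second derivatives and evaluates (2.7-187) on one element.

```python
import numpy as np

def second_derivative_projection(f, h):
    """Solve M y = -K f on a periodic uniform grid (linear elements)."""
    n = len(f)
    eye = np.eye(n)
    nbr = np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)  # periodic neighbours
    M = h * (4.0 * eye + nbr) / 6.0   # consistent mass matrix
    K = (2.0 * eye - nbr) / h         # linear-element 'diffusion' matrix
    return np.linalg.solve(M, -K @ f)

def spline_on_element(f_i, f_ip1, y_i, y_ip1, h, x):
    """Evaluate (2.7-187) at local coordinate x in [0, h] (x_i = 0)."""
    slope = (f_ip1 - f_i) / h - h * (2.0 * y_i + y_ip1) / 6.0   # this is f'_i
    return (f_i + slope * x
            + 0.5 * y_i * (x**2 - x**3 / (3.0 * h))
            + y_ip1 * x**3 / (6.0 * h))

def spline_slope(f_i, f_ip1, y_i, y_ip1, h, x):
    """First derivative of (2.7-187) on the same element."""
    slope = (f_ip1 - f_i) / h - h * (2.0 * y_i + y_ip1) / 6.0
    return slope + y_i * (x - x**2 / (2.0 * h)) + y_ip1 * x**2 / (2.0 * h)

n = 16
h = 1.0 / n
x_nodes = h * np.arange(n)
f = np.sin(2.0 * np.pi * x_nodes)          # sample nodal data (an assumption)
y = second_derivative_projection(f, h)

# Compare the spline at the midpoint of element 3 with the smooth function.
i = 3
mid = spline_on_element(f[i], f[i + 1], y[i], y[i + 1], h, 0.5 * h)
print(mid, np.sin(2.0 * np.pi * (x_nodes[i] + 0.5 * h)))
```

Continuity of the first derivative across a node is exactly row $i+1$ of $My = -Kf$, which is why the projection produces a genuine spline rather than merely a piecewise cubic.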
The next step in our advection adventure is to see what happens if we replace the
$C^2$ cubic splines with $C^0$ cubic Lagrange polynomials (e.g., via cubic Lagrange FEM
basis functions) to perform the interpolation step. The reason that we do this, since
it clearly deviates from the Galerkin projection discussed thus far, is suggested by the
following, 'Cubic Lagrange interpolation is particularly popular since it represents a good
compromise between computational efficiency and numerical accuracy'—Bermejo and
Staniforth (1992). The efficiency arises because there are no linear algebraic equations
to solve; the interpolation is local and direct and corresponds, in some sense, to lumping
the mass in the characteristic Galerkin method. The Lagrange interpolation goes like so:
simply use (2.7-175) as it stands, but expand the LHS into linear basis functions (as
before) and interpolate the RHS from the ('displaced') piecewise-linear basis into the
Lagrange cubic basis. Thus imagine that the cubic shown through the points on the left
side of the last sketch is the $C^0$ Lagrange polynomial. Then, $T_i^{n+1} = T^n(x_i - u\Delta t) = \sum_j T_j^n\psi_j(x_i - u\Delta t)$, where $\{\psi_j(x)\}$ are the Lagrange cubic basis functions (see,
for example, Reddy, 1993). The result is

$$T_i^{n+1} = \sum_{m=-2}^{1} T^n_{i+m-l}\,\psi_{i+m-l}(x_{i-l} - \hat{c}h)$$

$$\quad = T^n_{i-2-l}\frac{\hat{c}^3 - \hat{c}}{6} + T^n_{i-1-l}\frac{2\hat{c} + \hat{c}^2 - \hat{c}^3}{2} + T^n_{i-l}\frac{2 - \hat{c} - 2\hat{c}^2 + \hat{c}^3}{2} + T^n_{i+1-l}\frac{-2\hat{c} + 3\hat{c}^2 - \hat{c}^3}{6},$$

which rearranges to

$$\frac{T_i^{n+1} - T^n_{i-l}}{\Delta t} + \frac{\hat{c}}{6\Delta t}\left(2T^n_{i+1-l} + 3T^n_{i-l} - 6T^n_{i-1-l} + T^n_{i-2-l}\right) = \frac{\hat{c}^2}{2\Delta t}\left(T^n_{i+1-l} - 2T^n_{i-l} + T^n_{i-1-l}\right) + \frac{\hat{c}^3}{6\Delta t}\left(T^n_{i-2-l} - 3T^n_{i-1-l} + 3T^n_{i-l} - T^n_{i+1-l}\right), \tag{2.7-188}$$
which is to be compared with (2.7-180) after generalizing it via $T_j^n \to T_{j-l}^n\ \forall j$ and
$c \to \hat{c}$. It is clear, once we realize that $(2T_{i+1} + 3T_i - 6T_{i-1} + T_{i-2})/6h = T_x|_i + O(h^3)$,
that we have traded consistent mass plus 'second-order' advection (whose net result is,
recall, fourth-order accurate) for lumped mass plus a higher-order (upwind-biased) advection
difference, more like a higher-order, finite difference method. What is the net result? Let von Neumann tell us;
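$T_j^{(n)} = \xi^n e^{ij\theta} = |x + iy|^n\,e^{ik[jh + (t_n/k\Delta t)\tan^{-1}(y/x)]}$ gives $u_p = -(1/k\Delta t)\tan^{-1}(y/x)$ and (taking
$l = 0$ and thus $\hat{c} = c$ for simplicity, with no adverse effect)

$$\xi = 1 - c^2(1 - \cos\theta) - \frac{c(1 - c^2)}{6}(3 - 4\cos\theta + \cos 2\theta) - \frac{ic}{6}\left[8\sin\theta - \sin 2\theta - c^2(2\sin\theta - \sin 2\theta)\right]. \tag{2.7-189}$$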
Figure 2.7-37 shows the amplitude and phase and is to be compared with the / = 0
curves of Figure 2.7-34, the cubic spline results. We see that they are remarkably similar,
showing that the trade-off was probably a good one.
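As a cross-check on (2.7-189), the following sketch (function names are illustrative, not from the text) builds the amplification factor directly from the four cubic Lagrange weights for the departure point $x_i - ch$, with $l = 0$, and compares it with the closed form. It also evaluates the relative phase speed $-\arg(\xi)/(c\theta)$.

```python
import numpy as np

def lagrange_cubic_weights(c):
    """Weights on nodes i-2, i-1, i, i+1 for the point x_i - c*h, 0 <= c <= 1."""
    return np.array([
        (c**3 - c) / 6.0,
        (2.0 * c + c**2 - c**3) / 2.0,
        (2.0 - c - 2.0 * c**2 + c**3) / 2.0,
        (-2.0 * c + 3.0 * c**2 - c**3) / 6.0,
    ])

def xi_from_weights(c, theta):
    """Amplification factor: sum of weight * exp(i * offset * theta)."""
    offsets = np.array([-2.0, -1.0, 0.0, 1.0])
    w = lagrange_cubic_weights(c)
    return np.sum(w[:, None] * np.exp(1j * offsets[:, None] * theta), axis=0)

def xi_closed_form(c, theta):
    """Equation (2.7-189)."""
    re = (1.0 - c**2 * (1.0 - np.cos(theta))
          - c * (1.0 - c**2) / 6.0 * (3.0 - 4.0 * np.cos(theta) + np.cos(2.0 * theta)))
    im = -(c / 6.0) * (8.0 * np.sin(theta) - np.sin(2.0 * theta)
                       - c**2 * (2.0 * np.sin(theta) - np.sin(2.0 * theta)))
    return re + 1j * im

theta = np.linspace(0.01, np.pi, 200)
for c in (0.1, 0.3, 0.5, 0.7, 0.9):
    xi = xi_closed_form(c, theta)
    amplitude = np.abs(xi)
    rel_phase_speed = -np.angle(xi) / (c * theta)
    print(c, amplitude.min(), rel_phase_speed[0])
```

The weights sum to one for any $c$, so a constant field is advected unchanged.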
Fig. 2.7-37 Relative phase speed and amplitude for pure advection: (a) phase speed and (b) amplitude, each plotted against θ for several values of c.

If the velocity is not constant, then the method is even more difficult to implement;
here we summarize what is needed, and since it is just as easy to describe the method in
multidimensions as in 1D, we do so, following, for example, Pironneau (1982): to solve

$$\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T \equiv \frac{DT}{Dt} = f(\mathbf{x}, t), \tag{2.7-190}$$

where now $f$ describes source terms and diffusion and $\mathbf{u}(\mathbf{x}, t)$ is given, we need 'only'
to define the trajectories and then integrate $f(\mathbf{x}, t)$ along them. Each 'timestep' looks
like so:
1. Select a point in the domain, $\mathbf{x}$, and a time, $t$, at which $T(\mathbf{x}, t)$ is desired. Find $\mathbf{X}(\mathbf{x}, t; \tau)$,
the location of a 'particle' at 'auxiliary time' $\tau$ that was located at $\mathbf{x}$ at time $t$,
by integrating 'backwards' (from head to tail/foot) the trajectory ODE that defines the
characteristic curves,

$$\frac{d\mathbf{X}}{d\tau} = \mathbf{u}(\mathbf{X}, \tau), \tag{2.7-191}$$

from $\tau = t$ to $\tau = t - \Delta t$ (some earlier time), where at $\tau = t$, $\mathbf{X}(\mathbf{x}, t; t) = \mathbf{x}$. Thus,

$$\mathbf{X}(\mathbf{x}, t; \tau) = \mathbf{x} - \int_\tau^t \mathbf{u}[\mathbf{X}(\mathbf{x}, t; s), s]\,ds. \tag{2.7-192}$$

Note that $\mathbf{x}$ and $t$ are simply parameters in the trajectory equation. In practice, $\mathbf{x}$ is the
location of a node (or Gauss point in some methods) and $t = t_{n+1} = t_n + \Delta t$; thus giving
$\mathbf{X}(\mathbf{x}, t_{n+1}; t_n)$ as the particle position one timestep back.
2. Integrate (2.7-190) forward (tail to head) along the trajectory:

$$T(\mathbf{x}, t) = T_0 + \int_{t-\Delta t}^{t} f[\mathbf{X}(\mathbf{x}, t; \tau), \tau]\,d\tau, \tag{2.7-193}$$

where $T_0 = T[\mathbf{X}(\mathbf{x}, t; t - \Delta t), t - \Delta t]$ is the temperature at the foot (tail) of the
trajectory at the earlier time $t - \Delta t$. Thus, for pure advection with no source term, $T(\mathbf{x}, t) =
T[\mathbf{X}(\mathbf{x}, t; t - \Delta t), t - \Delta t]$; the solution remains constant along a trajectory.
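The two steps above can be sketched in 1D for pure advection ($f = 0$). This is only an illustration under stated assumptions, not the method of any cited reference: the grid is periodic and uniform, the trajectory ODE (2.7-191) is integrated backwards with a few midpoint-rule substeps, and the foot value $T_0$ is obtained by cubic Lagrange interpolation. All function names are hypothetical.

```python
import numpy as np

def departure_points(x, t_new, dt, u, substeps=4):
    """Integrate dX/dtau = u(X, tau) backwards from tau = t_new to t_new - dt."""
    X = np.asarray(x, dtype=float).copy()
    tau = t_new
    ds = dt / substeps
    for _ in range(substeps):
        X_half = X - 0.5 * ds * u(X, tau)          # midpoint predictor
        X = X - ds * u(X_half, tau - 0.5 * ds)     # midpoint corrector
        tau -= ds
    return X

def cubic_lagrange_periodic(T, h, xq):
    """Cubic Lagrange interpolation of periodic nodal data T at points xq."""
    n = len(T)
    s = np.asarray(xq) / h
    k = np.floor(s).astype(int)     # the point lies in interval [k, k+1]
    r = s - k                       # local coordinate in [0, 1)
    weights = (
        -r * (r - 1.0) * (r - 2.0) / 6.0,          # node k-1
        (r + 1.0) * (r - 1.0) * (r - 2.0) / 2.0,   # node k
        -(r + 1.0) * r * (r - 2.0) / 2.0,          # node k+1
        (r + 1.0) * r * (r - 1.0) / 6.0,           # node k+2
    )
    return sum(w * T[(k + j - 1) % n] for j, w in enumerate(weights))

def semi_lagrangian_step(T, x, t, dt, u, h):
    """Pure advection: T(x, t+dt) equals T at the foot of the trajectory."""
    feet = departure_points(x, t + dt, dt, u)
    return cubic_lagrange_periodic(T, h, feet)

n = 64
h = 1.0 / n
x = h * np.arange(n)
u_const = lambda X, tau: np.ones_like(X)    # constant unit velocity (test case)
dt = 2.5 * h                                # CFL number 2.5, i.e. l = 2, c-hat = 0.5
T = np.sin(2.0 * np.pi * x)
t = 0.0
for _ in range(20):
    T = semi_lagrangian_step(T, x, t, dt, u_const, h)
    t += dt
max_err = np.max(np.abs(T - np.sin(2.0 * np.pi * (x - t))))
print(max_err)
```

A source term would be handled by adding a quadrature of $f$ along the same trajectory, as in (2.7-193); that part is omitted here.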
The methods used to approximate the above exact results are too many and too varied
to report here; consult the references—more of which we will cite below and in the next
chapter—and the references within references, for details. In fact, one of the reasons that
we have not yet tried these methods is that—difficult as they are for 2D problems on
simple meshes with rectangular elements—they must be nearly impossible to implement
on the general, distorted iso-parametric element meshes in 3D. Or, at least this is our
perception of the state of the art. In fact, the number of person-years thus far expended
on just 1D problems is somewhat staggering, judging by the literature.
The last feature of the 1D problem we have been discussing that we wish to consider
before moving on to 2D is that of numerical integration. It is a sad surprise that quadrature
errors can convert the algorithm from unconditionally stable to conditionally stable
(somewhat like explicit Eulerian methods, but displaying some bizarre/illogical results) and, in
the worst of cases, to unconditionally unstable (!). Morton et al. (1988) present some stability
analyses of certain characteristic-based methods that are similar (and in some cases identical)
to those discussed above. We report just a few cases
from that paper:
1. If one-point quadrature is used on both sides of (2.7-177), then the resulting algorithm
is unconditionally unstable.
2. If the exact mass matrix is used (LHS) with one-point quadrature on the RHS, then
the resulting algorithm is unstable for the fractional part of the CFL number in the range
$1/\sqrt{6} = 0.408$ to $1 - 1/\sqrt{6} = 0.592$; strange.
3. If lumped mass is used on the LHS and one-point quadrature on the RHS, then the
method is unconditionally stable. (Numerical advection is still full of surprises; it is not
easy.)
Moving finally to 2D and 3D, we begin by listing some of the mostly-still-open
problems—some of which have contributed to the delay in our becoming personally
involved in these advanced techniques:
1. Loss of unconditional stability caused by inexact integration (already referred to in
1D), although the bicubic spline method, already referred to, is equivalent to exact
integration and thus generally recommended (when applicable; e.g., for rectangles).
2. Little or no 3D progress—except for that at EDF (Electricite de France) employing
tetrahedral elements with flat faces. See Section 3.16.9a in the next chapter.
3. Little or no complex geometry progress—except for that at EDF employing tetrahedral
elements with flat faces. See Section 3.16.9a in the next chapter.
4. There are still unresolved boundary condition issues, especially for $c \gg 1$, at both
inflow and outflow.
Rather than presenting any of the myriad of details for the multi-dimensional case,
we shall content ourselves with providing a relevant cross-section of the literature—and
a large cross-section it is, with some (most) subgroups not knowing (or caring,
probably—each thinking that they are the best and so ignore the rest) about very closely
related efforts in another field. As we have already made clear, there is no subject like
'numerical advection' that gets so much attention in so many disciplines. Further citations
will be presented in the next chapter, wherein the Navier-Stokes equations are the principal
subject.
A brief (and assuredly incomplete) synopsis of several key groups of contributing
researchers and their principal interest (the 'physics' side) is now attempted—and
concludes our discussion on 'characteristic advection':
1. The French: starting in about 1980, several French researchers, including O. Pironneau
and M. Bercovier, investigated characteristic-based methods as an attractive alternative
to upwinding. See Benque et al. (1980, 1982), Bardos et al. (1981), Pironneau (1982),
Bercovier and Pironneau (1982), Bercovier et al. (1983), Hasbani et al. (1983), and
Pironneau (1989). Their field, besides applied mathematics of course, is advection-diffusion
and—principally—incompressible viscous flow.
2. The English: spearheaded by K.W. Morton and focusing on advection-diffusion,
compressible flow, and (to a lesser extent) incompressible flow, this group of applied
mathematicians has produced many papers—and many names for the methods, some of
which we cite: under the name Euler Characteristic Galerkin Methods, see Morton and
Stokes (1982), Morton (1982, 1983), and Childs and Morton (1990); under Characteristic
Galerkin Methods, see Morton (1985); under Lagrange-Galerkin Method, see Morton
et al. (1988) and Priestley (1994). For a good overall discussion of both of the latter two,
see Morton (1990).
3. The groundwater flow simulators: here the key players are many and they often
team up. The methods are somewhat different and so are some of the names of the
methods. Under Characteristic Galerkin Methods, we cite Ewing and Russell (1981, 1983),
Douglas and Russell (1982), Krishnamachari et al. (1989); under Modified Method of
Characteristics are Ewing and Russell (1981, 1983), Douglas and Russell (1982), and
Roache (1992a); under Eulerian-Lagrangian Localized Adjoint Methods (ELLAM), we
have Russell (1990), Celia et al. (1990, 1993), Russell and Trujillo (1990), and Arbogast
and Wheeler (1995); under the Backward Method of Characteristics is Baptista (1987).
Finally, under 'particle methods' see Tompson and Gelhar (1990) and Schafer-Perini and
Wilson (1991).
4. In spectral finite elements, spearheaded by A. Patera and applied to incompressible
flow, are: Ho et al. (1990), and Maday et al. (1990).
5. In magnetohydrodynamics, we cite the Ephemeral Particle-in-Cell (PIC) method of
Eastwood (1987). See also Arter and Eastwood (1995) for particle methods in fluid flow.
6. The meteorologists: last but not least are those who worry about advection at least as
much as any other group—computational meteorologists/applied mathematicians. In both
advection-diffusion and their versions of the NS equations (numerical weather
prediction, general circulation models, and even climate simulation models), there has been a
tremendous amount of work—mostly via finite difference methods and under the name
of Semi-Lagrangian Methods. Fortunately, most of our work on citations has been done
for us by one of the leading researchers in that field, A. Staniforth. In Staniforth and
Côté (1991) is given a thorough review of the history of characteristic-based methods in
meteorology. Since that review, the following relevant papers have appeared: in Bermejo
and Staniforth (1992) is discussed a method to convert 'conventional' semi-Lagrangian
schemes to those which in addition suppress most wiggles via a new scheme inspired by
the important paper by Zalesak (1979) in finite differences, in which a compound scheme
is employed that utilizes Godunov's theorem relating wiggles to order of accuracy in order
to generate a quasi-monotone scheme that retains the higher order of accuracy in regions
where the solution is smooth but reverts to first-order when necessary to suppress wiggles
(simple upwinding in the Eulerian context, linear interpolation in the Lagrangian one).
The demonstrated results are impressive. In Priestley (1993), it was shown how to recover
conservation ($\int T$) after adding the improvements of Bermejo and Staniforth to obtain
a method that is monotone, inexpensive, and accurate. A new twist to these methods is
espoused in a recent paper by Purser and Leslie (1994): a third-order method, 'equivalent'
to AB3, is used with an efficient, forward trajectory, semi-Lagrangian method that is also
stable 'in practice.' Finally, mention should be made of another paper that ostensibly
generalizes/unifies the entire concept using the advanced mathematics associated with
differential forms; see Smolarkiewicz and Pudykiewicz (1992).
A final geophysical application is that of Malevsky (see Malevsky and Yuen, 1991,
and Malevsky, 1993): apparently totally unaware of the efforts of the meteorologists, the
cubic spline interpolation-characteristics method was rediscovered and even applied in
3D (thermal convection at infinite Prandtl number).
b. Methods based on modified equations
Inspired by Lax and Wendroff (1960, 1962, 1964) and Leith (1965) in the finite difference
community, Donea (1984) launched a new research direction, to be followed by quite
a few others, by introducing a finite element version of these techniques—in which a
modified equation results from a Taylor series expansion in time that is applied prior to
spatial discretization via the FEM. It stands in contrast to the family of so-called Petrov-
Galerkin methods that always and intentionally introduce some artificial diffusion via
special test functions in that the Galerkin weak formulation is applied to the modified
equation. The original publication on pure advection was quickly followed by one applied
to advection-diffusion (Donea et al., 1984) and then further developed in Selmin et al.
(1985) and Donea et al. (1987).
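We present the method for one case only: the trapezoid rule integration of the pure
advection equation with constant (and divergence-free) velocity, $T_t + \mathbf{u}\cdot\nabla T = 0$:

1. Expand forward in time about $t_n$:

$$T^{n+1} = T^n + \Delta t\,T_t^n + \frac{\Delta t^2}{2}T_{tt}^n + \frac{\Delta t^3}{6}T_{ttt}^n + O(\Delta t^4). \tag{2.7-194}$$

2. Expand backward in time about $t_{n+1}$:

$$T^n = T^{n+1} - \Delta t\,T_t^{n+1} + \frac{\Delta t^2}{2}T_{tt}^{n+1} - \frac{\Delta t^3}{6}T_{ttt}^{n+1} + O(\Delta t^4). \tag{2.7-195}$$

3. Subtracting the second from the first and dividing the result by $2\Delta t$ yields, upon
rearrangement,

$$\frac{T^{n+1} - T^n}{\Delta t} = \frac{1}{2}\left(T_t^n + T_t^{n+1}\right) + \frac{\Delta t}{4}\left(T_{tt}^n - T_{tt}^{n+1}\right) + \frac{\Delta t^2}{12}\left(T_{ttt}^n + T_{ttt}^{n+1}\right) + O(\Delta t^3). \tag{2.7-196}$$

4. Use the original PDE to obtain $T_{tt} = -\mathbf{u}\cdot\nabla T_t = (\mathbf{u}\cdot\nabla)^2T$ and $T_{ttt} = (\mathbf{u}\cdot\nabla)^2T_t$ to give

$$\frac{T^{n+1} - T^n}{\Delta t} + \mathbf{u}\cdot\nabla\left(\frac{T^n + T^{n+1}}{2}\right) = \frac{\Delta t}{4}(\mathbf{u}\cdot\nabla)^2\left(T^n - T^{n+1}\right) + \frac{\Delta t^2}{12}(\mathbf{u}\cdot\nabla)^2\left(T_t^n + T_t^{n+1}\right) + O(\Delta t^3). \tag{2.7-197}$$

5. Now note from (2.7-196) that, to within $O(\Delta t^3)$, $\Delta t^2(T_t^n + T_t^{n+1})/2$ in (2.7-197) can
be replaced by $\Delta t(T^{n+1} - T^n)$ to give, dropping the $O(\Delta t^3)$ terms, the modified equation,
which, by construction, is fourth-order accurate,

$$\frac{T^{n+1} - T^n}{\Delta t} + \mathbf{u}\cdot\nabla\left(\frac{T^n + T^{n+1}}{2}\right) = \frac{\Delta t}{12}(\mathbf{u}\cdot\nabla)^2\left(T^n - T^{n+1}\right),$$

which immediately simplifies to

$$\left[1 + \frac{\Delta t^2}{12}(\mathbf{u}\cdot\nabla)^2\right]\frac{T^{n+1} - T^n}{\Delta t} + \mathbf{u}\cdot\nabla\left(\frac{T^n + T^{n+1}}{2}\right) = 0, \tag{2.7-198}$$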
where, since V • u = 0, (u • V)2 = V • (uu • V) = V • r • V, which 'looks like' a diffusion
term (but really is not) and can be so-treated a la Galerkin's weak form; i.e., via the usual
integration by parts. The final result is then called a Taylor-Galerkin method ('Crank-
Nicolson/Taylor-Galerkin' to be more precise), and it looks like
$$\left[M - \frac{\Delta t^2}{12}K(\mathbf{u})\right]\frac{T^{n+1} - T^n}{\Delta t} + N(\mathbf{u})\left(\frac{T^n + T^{n+1}}{2}\right) = 0, \tag{2.7-199}$$

where $K_{ij} = \int\nabla\varphi_i\cdot\mathbf{uu}\cdot\nabla\varphi_j = \int(\mathbf{u}\cdot\nabla\varphi_i)(\mathbf{u}\cdot\nabla\varphi_j)$. Donea (1984) shows, in 1D, that
(2.7-199), with linear basis functions, is indeed more accurate than conventional TR
(which is 'recovered' by dropping the $\Delta t^2$ term), especially for larger $\Delta t$. It is also (like
TR) non-dissipative and fourth-order accurate in space (also like TR with linear elements).
Remarks:
(1) If diffusion is included, then the Tttt terms cannot be so neatly utilized (they are
dropped), with the result that the final Taylor-Galerkin equation is only second-order
accurate in time (Donea et al., 1984).
(2) If the forward Euler method is employed rather than TR, then the resulting Taylor-
Galerkin equation looks like our BTD result except that the mass matrix is modified
in the same way as in (2.7-199)—and the result is third-order accurate in time.
(3) For more background on the method of modified equations, see Warming and Hyett
(1974)—and Griffiths and Sanz-Serna (1986).
(4) See Donea and Quartapelle (1992) for further results.
Figure 2.7-38 shows the phase speed for CNTG, derived in the usual way, which is
obtained from

$$\xi = \frac{\dfrac{4 - c^2}{3} + \dfrac{2 + c^2}{3}\cos\theta - ic\sin\theta}{\dfrac{4 - c^2}{3} + \dfrac{2 + c^2}{3}\cos\theta + ic\sin\theta}, \tag{2.7-200}$$

which is easily seen to be unimodular. The $\xi(\theta)$ curve traces the entire unit circle as $\theta$
varies from 0 to $\pi$. Comparing these results with those for straight TR in Figure 2.7-15
for P = 100 shows a big gain in accuracy over a wide range of useful CFL numbers; e.g.,
$0.1 < c < 3$. Finally, we remark again that the non-zero phase speed shown for several
values of P at $\theta = \pi$ is somewhat illusory, as the $2\Delta x$ wave is stationary for all $c$; the
curves are simply discontinuous at $\theta = \pi$.
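A short numerical check of (2.7-200) is sketched below. It assumes linear elements on a uniform 1D mesh, so that the Fourier symbols of $M$, $K$ and $N$ are $(4 + 2\cos\theta)/6$, $u^2(2 - 2\cos\theta)/h^2$ and $iu\sin\theta/h$; the function names are illustrative. Conventional Galerkin TR is recovered by dropping the $c^2$ contributions.

```python
import numpy as np

def xi_cntg(c, theta):
    """Amplification factor (2.7-200) for Crank-Nicolson/Taylor-Galerkin."""
    a = (4.0 - c**2) / 3.0 + (2.0 + c**2) * np.cos(theta) / 3.0
    b = c * np.sin(theta)
    return (a - 1j * b) / (a + 1j * b)

def xi_tr(c, theta):
    """Conventional Galerkin TR: the same form with the dt^2 term dropped."""
    a = (4.0 + 2.0 * np.cos(theta)) / 3.0
    b = c * np.sin(theta)
    return (a - 1j * b) / (a + 1j * b)

def relative_phase_speed(xi, c, theta):
    """Numerical phase speed divided by the exact one."""
    return -np.angle(xi) / (c * theta)

theta = np.linspace(0.05, 3.0, 60)
for c in (0.1, 0.5, 1.0):
    print(c,
          np.max(np.abs(np.abs(xi_cntg(c, theta)) - 1.0)),      # unimodular
          relative_phase_speed(xi_cntg(c, 0.5), c, 0.5),
          relative_phase_speed(xi_tr(c, 0.5), c, 0.5))
```

Both factors have the form $(a - ib)/(a + ib)$ with real $a$ and $b$, which is why neither scheme is dissipative; they differ only in phase.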
Fig. 2.7-38 Phase speed for Donea's CNTG method (pure advection).

To conclude this section, we point out that A.J. Baker has both generalized and
categorized several of these methods under the term 'Taylor weak statement.' In Baker and Kim
(1987) is a presentation and discussion of many related CFD algorithms, including some
finite difference methods. Much related history is also discussed there. Finally, Chaffin
and Baker (1995) analyze some of these methods for both linear and higher-order FEM's
(quadratic and cubic) and compare them against some finite difference (finite volume)
schemes [QUICK methods, a la Leonard, both third-order (Leonard, 1979) and fifth-
order (Leonard and Mokhtari, 1992)—and even seventh-order, in Leonard and Mokhtari
(1990)]. They also introduce a new, improved method with linear basis functions. Their
results show FEM to be more accurate and, especially for CFL near one, their new scheme,
based on a Taylor weak statement and linear basis functions, is really quite good. Again,
a useful 'history' (updated from the 1987 paper) is included.
c. Some least-squares finite element methods (LSFEM)
Whereas B. Jiang is becoming to the least-squares FEM as A. Patera is to the spectral
element method, we have neither the time nor space to describe the many and varied
applications by either—we simply provide a sampling and refer the reader to relevant
citations for the remainder. Some citations first: Carey and Jiang (1988), Jiang (1993),
Carey and Shen (1989). In the next chapter we will return to this list and mention some
papers devoted to the NS equations. We first remark that the 'obvious' least-squares
method is generally of little use for approximating diffusion because integration by parts
is not allowed; i.e., the Laplacian operator must remain 'as is'—thus ostensibly requiring
higher-order continuity (C1) in the basis functions. To clarify this point, we quickly
summarize the least-squares method as a weighted residual method for linear PDE's (for
further details, see, for example, Eason, 1976; Carey and Oden, 1983): given the general
equation $Lu = f$, an approximate solution, $u^h = \sum_j u_j\varphi_j$, is generated by selecting the
coefficients $\{u_j\}$ in such a way that the mean square of the residual, $R = Lu^h - f$, is
minimized (in an appropriate norm). Thus using the $L_2$-norm, for example (the most
common LSFEM),

$$I = \int\left(Lu^h - f\right)^2 \tag{2.7-201}$$

is minimized via $\partial I/\partial u_i = 0$, $i = 1, 2, \ldots, N$, to yield

$$\int\left(Lu^h - f\right)\left(L\varphi_i\right) = 0 \quad \forall i; \tag{2.7-202}$$

i.e.,

$$\sum_j\left(\int L\varphi_i\,L\varphi_j\right)u_j = \int f\,L\varphi_i, \quad i = 1, 2, \ldots, N. \tag{2.7-203}$$

Thus, the weighting (test) function is $L\varphi_i$ rather than simply $\varphi_i$ of the GFEM, an
observation that makes clear our above remark regarding the Laplacian. Another noteworthy
observation is that the coefficient matrix, here $\int L\varphi_i\,L\varphi_j$, is symmetric regardless of
whether $L$ is or is not, a nice feature. A final introductory and summary remark is
this: $C^0$ finite elements can be, and commonly are, used even when $L$ includes the
Laplacian, by first rewriting each PDE with higher-order operators as coupled systems
of PDE's each with only first-derivative operators; e.g., $\nabla^2u = f$ is rewritten as the
system $\mathbf{a} = \nabla u$ and $\nabla\cdot\mathbf{a} = f$ (and, at least in theory, $\nabla\times\mathbf{a} = 0$; see, for example,
Jiang and Povinelli, 1993), and then LSFEM is applied to the system, thus permitting
the (convenient) use of low-order $C^0$ basis functions but requiring rather more dependent
variables and equations. More on this in the next chapter.
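The LSFEM 'sample' that we will present begins by applying TR to $T_t + \mathbf{u}\cdot\nabla T = 0$
on a periodic BC domain and ends by studying $\xi$ for linear basis functions in 1D. Thus,
for $T^h = \sum_{j=1}^N T_j(t)\varphi_j(\mathbf{x})$, we define $T_n^h = T^h(n\Delta t) = T^h(t_n)$ and

$$R \equiv \frac{T^h_{n+1} - T^h_n}{\Delta t} + \mathbf{u}\cdot\nabla\left(\frac{T^h_{n+1} + T^h_n}{2}\right), \tag{2.7-204}$$

where $T_n^h$ is given, and the LSFEM finds $T^h_{n+1}$ by minimizing $\frac{1}{2}\int R^2$ with respect to each
amplitude coefficient, $T_i^{n+1} = T_i(t_{n+1})$; $i = 1, 2, \ldots, N$; i.e.,

$$0 = \frac{1}{2}\frac{\partial}{\partial T_i^{n+1}}\int\left[\sum_j\left(\frac{T_j^{n+1} - T_j^n}{\Delta t}\varphi_j + \mathbf{u}\cdot\nabla\varphi_j\frac{T_j^{n+1} + T_j^n}{2}\right)\right]^2,$$

giving

$$\sum_j\int\left(\frac{T_j^{n+1} - T_j^n}{\Delta t}\varphi_j + \mathbf{u}\cdot\nabla\varphi_j\frac{T_j^{n+1} + T_j^n}{2}\right)\left(\frac{\varphi_i}{\Delta t} + \frac{1}{2}\mathbf{u}\cdot\nabla\varphi_i\right) = 0,$$

which, upon multiplying by $\Delta t$, looks like (and can be interpreted as) a Petrov-Galerkin
method with test function $w_i = \varphi_i + (\Delta t/2)\mathbf{u}\cdot\nabla\varphi_i$. In fact, the above LSFEM is called
(with $\Delta t/2$ replaced by a 'generic' $\tau$) SUPG (streamline upwind Petrov-Galerkin) by T.J.
Hughes and colleagues/students (see, for example, Hughes et al., 1989). Rearrangement
gives, after another multiplication by $\Delta t$,

$$\sum_j\left[\int\varphi_i\varphi_j + \frac{\Delta t}{2}\int\left(\varphi_i\mathbf{u}\cdot\nabla\varphi_j + \varphi_j\mathbf{u}\cdot\nabla\varphi_i\right) + \frac{\Delta t^2}{4}\int(\mathbf{u}\cdot\nabla\varphi_i)(\mathbf{u}\cdot\nabla\varphi_j)\right]T_j^{n+1}$$

$$\quad = \sum_j\left[\int\varphi_i\varphi_j + \frac{\Delta t}{2}\int\left(\varphi_j\mathbf{u}\cdot\nabla\varphi_i - \varphi_i\mathbf{u}\cdot\nabla\varphi_j\right) - \frac{\Delta t^2}{4}\int(\mathbf{u}\cdot\nabla\varphi_i)(\mathbf{u}\cdot\nabla\varphi_j)\right]T_j^n, \quad i = 1, 2, \ldots, N, \tag{2.7-205}$$

an equation (with symmetric matrices) previously presented in, for example, Carey and
Jiang (1988, in 1D), Jiang and Povinelli (1990, albeit for BE instead of TR), and Donea
and Quartapelle (1992). But this mess can be simplified since $\nabla\cdot\mathbf{u} = 0$ and since our
BC's are periodic (non-periodic BC's would simplify only if $\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma$) via
(i) $\int(\varphi_i\mathbf{u}\cdot\nabla\varphi_j + \varphi_j\mathbf{u}\cdot\nabla\varphi_i) = \int\nabla\cdot(\mathbf{u}\varphi_i\varphi_j) = \int_\Gamma\mathbf{n}\cdot\mathbf{u}\varphi_i\varphi_j = 0$; and (ii) $\int(\varphi_j\mathbf{u}\cdot\nabla\varphi_i -
\varphi_i\mathbf{u}\cdot\nabla\varphi_j) = \int[\nabla\cdot(\mathbf{u}\varphi_i\varphi_j) - 2\varphi_i\mathbf{u}\cdot\nabla\varphi_j] = -2\int\varphi_i\mathbf{u}\cdot\nabla\varphi_j$, to give, upon division by $\Delta t$
and rearranging,

$$\sum_j\left\{\left(\int\varphi_i\varphi_j\right)\frac{T_j^{n+1} - T_j^n}{\Delta t} + \left(\int\varphi_i\mathbf{u}\cdot\nabla\varphi_j\right)T_j^n + \frac{\Delta t}{2}\left[\int(\mathbf{u}\cdot\nabla\varphi_i)(\mathbf{u}\cdot\nabla\varphi_j)\right]\frac{T_j^{n+1} + T_j^n}{2}\right\} = 0, \quad i = 1, 2, \ldots, N; \tag{2.7-206}$$

which, while still symmetric (in the coefficients of $T_j^{n+1}$) is much simplified, and admits
a very interesting interpretation/identification: namely for linear basis functions at least,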
it is identical to the pure advection version of the semi-implicit FE + TR + BTD scheme
presented in Section 2.7.5; cf. (2.7-110) and (2.7-112)! The LSFEM 'automatically' inserts
BTD—a feature that also extends to 2D and 3D. We leave the proof of this assertion to
the reader and merely point out that we have here another example wherein the final
result is obtainable in any of several ways. (Is a method that is insensitive to its manner
of derivation generally good? Bad? Or just insensitive?)
Part of the 'cost' of generating symmetric matrices is that, even for (or, especially
for) pure advection, the scheme must be diffusive since only a skew-symmetric advection
matrix can be dissipation-free.
d. Methods based on a discontinuous-in-time Galerkin ODE technique
In a very large number of papers, originally emanating from two far apart 'countries,'
Sweden (C. Johnson et al., the originators) and California (T. Hughes et al.), but recently
joined by (at least) T. Tezduyar at Minnesota, a set of methods variously described as
streamline diffusion, SUPG, Galerkin/least squares, and SST (stabilized space-time) has
evolved, all based on a (never-clearly-derived) discontinuous-in-time ($C^{-1}$) Galerkin ODE
method. In this section we shall derive this family of ODE methods and
summarize (mostly via the citation of relevant references) its use by the above-named
researchers. Those that apply also to the NS equations will be recalled again in the next
chapter, and some that apply only to NS will only be cited there.
To introduce the discontinuous Galerkin method for ODE's (in time, for our purposes)
in the simplest and cleanest way that we have found, consider the 'standard' ODE
$\dot{y} = f(y, t)$ in which, even though $y(t)$ is continuous, we will be interested in Galerkin
approximations that are discontinuous. [Continuous-in-time Galerkin methods (and
least-squares methods) have been tried and, to the best of our knowledge, largely abandoned;
see Zienkiewicz and Taylor (1991) and references therein; also Wood (1990).] For the
discontinuous Galerkin method, to be described below, see also: Delfour et al. (1981,
1992), Johnson (1988, 1992), and (especially) Estep (1995) and references therein.
To motivate the discussion, imagine that we wish to integrate our ODE over $(0, \Delta t)$
using a discontinuous approximation that suffers a jump at $\delta t$, where $0 \le \delta t < \Delta t$ (see
Figure 2.7-39). A weak solution, using a continuous test function $\varphi(t)$ on $(0, \Delta t)$ is as
follows:

$$\int_0^{\Delta t}\varphi\dot{y}\,dt = \int_0^{\delta t}\varphi\dot{y}\,dt + (y_0 - y^-)\varphi(\delta t) + \int_{\delta t}^{\Delta t}\varphi\dot{y}\,dt = \int_0^{\Delta t}\varphi f\,dt, \tag{2.7-207}$$

in which the Dirac delta function in the integrand at $t = \delta t$ has been properly accounted
for. Letting $\delta t \to 0$ causes the jump to occur at the boundary of each time-slab and is the
discontinuous Galerkin (DG) method on $(0, \Delta t)$ (once $y$ is approximated via the same
functions comprising the test space, which we do below):

$$(y_0 - y^-)\varphi(0) + \int_0^{\Delta t}\varphi\dot{y}\,dt = \int_0^{\Delta t}\varphi f\,dt, \tag{2.7-208}$$

where now $y^-$ $(= y(0^-))$ is the value to the left of $t = 0$ and $y_0$ that to the right. The next
figure (Figure 2.7-40) and the next weak statement generalize the situation. We are at $t_n$
and have available $y(t_n^-)$; the discontinuous Galerkin ODE method between $t_n$ and $t_{n+1}$
is then: find $y(t) = \sum_{j=1}^{N+1}y_j\varphi_j(t)$ from

$$\varphi_i(t_n)\left[y(t_n^+) - y(t_n^-)\right] + \int_{t_n}^{t_{n+1}}\varphi_i\dot{y}(t)\,dt = \int_{t_n}^{t_{n+1}}\varphi_if(y, t)\,dt, \quad i = 1, 2, \ldots, N + 1, \tag{2.7-209}$$

where $\{\varphi_i\}$ are, à la FEM, piecewise polynomials on $(t_n, t_{n+1})$.

Fig. 2.7-39 Discontinuous ODE solution concept.

Fig. 2.7-40 Discontinuous Galerkin method for our ODE.
Remarks:
(1) The jump term enforces continuity, weakly.
(2) $y(t_n^+)$ is to be determined as part of the solution; the closer it is to $y(t_n^-)$, the less
'jumpy' is the solution.
(3) The solution on each time 'slab' involves $N + 1$ simultaneous (non-linear in general)
algebraic equations.
(4) $N = 1$ appears to be a 'practical' limit; all results we have seen use $N = 0$ (piecewise
constant) or 1 (piecewise linear).
(5) In some of the references on the subject (e.g., Hughes et al., 1989), a different but
equivalent form is presented; it is a result of integrating by parts in (2.7-209):

$$\int_{t_n}^{t_{n+1}}\varphi_i\dot{y}\,dt = \varphi_i(t_{n+1})y(t_{n+1}^-) - \varphi_i(t_n)y(t_n^+) - \int_{t_n}^{t_{n+1}}\dot{\varphi}_iy\,dt$$

to give

$$\varphi_i(t_{n+1})y(t_{n+1}^-) - \varphi_i(t_n)y(t_n^-) - \int_{t_n}^{t_{n+1}}\dot{\varphi}_iy\,dt = \int_{t_n}^{t_{n+1}}\varphi_if\,dt. \tag{2.7-210}$$

Remark:
It is equivalent (in multi-dimensions) only in the absence of quadrature error; inexact
integration favors the latter (T. Hughes, personal communication).
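Let us now demonstrate this ODE method on $\dot{y} = -\lambda y$, $y(0) = y_0 = 1$ for both $N = 0$
and $N = 1$:

1. $N = 0$. Here $y(t) = y_{n+1} = y(t_n^+)$, $\varphi_i = \varphi_1 = 1$, $y(t_n^-) = y_n$, and $\dot{y} = 0$ to give
$(y_{n+1} - y_n) + 0 = \int_0^{\Delta t}f\,dt = -\int_0^{\Delta t}\lambda y\,dt = -\lambda\Delta t\,y_{n+1}$, or $y_{n+1} = y_n/(1 + \lambda\Delta t)$, and we have
merely recovered the BE method, albeit with a (perhaps) different interpretation. Note
though that if $f$ is non-linear or includes a source term, say $s(t)$, that this result would
be slightly better than BE because of $\int_0^{\Delta t}s(t)\,dt$ rather than simply $\Delta t\cdot s_{n+1}$.

2. $N = 1$. Here $y(t) = y_1\varphi_1 + y_2\varphi_2$, where $y_1 = y_n^+$, $y_2 = y_{n+1}^-$, $\varphi_1 = (t_{n+1} - t)/\Delta t_n$, and
$\varphi_2 = (t - t_n)/\Delta t_n$. At $t = 0$, $y_0^- = y(0)$. In this case, (2.7-209) for $i = 1$ gives

$$(y_n^+ - y_n^-) + \int_{t_n}^{t_{n+1}}\frac{t_{n+1} - t}{\Delta t_n}\cdot\frac{y_{n+1}^- - y_n^+}{\Delta t_n}\,dt = -\lambda\int_{t_n}^{t_{n+1}}\frac{t_{n+1} - t}{\Delta t_n}\left[y_n^+\frac{t_{n+1} - t}{\Delta t_n} + y_{n+1}^-\frac{t - t_n}{\Delta t_n}\right]dt,$$

which incorporates the jump, and for $i = 2$, yields

$$0 + \int_{t_n}^{t_{n+1}}\frac{t - t_n}{\Delta t_n}\cdot\frac{y_{n+1}^- - y_n^+}{\Delta t_n}\,dt = -\lambda\int_{t_n}^{t_{n+1}}\frac{t - t_n}{\Delta t_n}\left[y_n^+\frac{t_{n+1} - t}{\Delta t_n} + y_{n+1}^-\frac{t - t_n}{\Delta t_n}\right]dt,$$

a pair of equations in $y_n^+$ and $y_{n+1}^-$. Integrating yields

$$\frac{y_{n+1}^- + y_n^+}{2} - y_n^- = -\frac{\lambda\Delta t}{6}\left(2y_n^+ + y_{n+1}^-\right) \tag{2.7-211}$$

and

$$\frac{y_{n+1}^- - y_n^+}{2} = -\frac{\lambda\Delta t}{6}\left(y_n^+ + 2y_{n+1}^-\right), \tag{2.7-212}$$

with solution $y_n^+ = y_n^-\cdot(1 + 2\lambda\Delta t/3)/(1 + 2\lambda\Delta t/3 + \lambda^2\Delta t^2/6)$ and $y_{n+1}^- = y_n^-(1 -
\lambda\Delta t/3)/(1 + 2\lambda\Delta t/3 + \lambda^2\Delta t^2/6)$. The jump at $t_n$ is $y_n^+ - y_n^- = -y_n^-\lambda^2\Delta t^2/[6(1 +
2\lambda\Delta t/3 + \lambda^2\Delta t^2/6)]$; i.e., it is small, $O(\Delta t^2)$. It is, however, larger than the global
truncation error, which is $O(\Delta t^3)$. Also, as is obvious from the above results, the $\Delta t \to \infty$
results are that both $y_n^+$ and $y_{n+1}^- \to 0$; i.e., it is L-stable, similar to (dissipative) low-order
BDF's and different from (non-dissipative) TR. The discontinuous Galerkin ODE solution
will not 'wiggle', one of its sales points by its promoters.
The general end-of-step solution is easily found to be

$$y_n^- = y_0\cdot\left[\frac{1 - \lambda\Delta t/3}{1 + 2\lambda\Delta t/3 + \lambda^2\Delta t^2/6}\right]^n, \tag{2.7-213}$$

which is a third-order accurate approximation that is also unconditionally stable; the factor
$(1 - \lambda\Delta t/3)/(1 + 2\lambda\Delta t/3 + \lambda^2\Delta t^2/6)$ is, in fact, the so-called 2,1 Padé approximation
to $e^{-\lambda\Delta t}$ (e.g., Vichnevetsky and Bowles, 1982, or Wood, 1990).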
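The $N = 1$ case can be reproduced in a few lines. The sketch below (function names are illustrative) solves the $2\times2$ system (2.7-211)/(2.7-212) on each time slab for $\dot{y} = -\lambda y$ with $\lambda\Delta t = 0.1$, and compares the end-of-step values with the exact solution and with TR run at half the step size.

```python
import numpy as np

def dg1_step(y_minus, lam, dt):
    """One linear-DG slab for y' = -lam*y; returns (y_n^+, y_{n+1}^-)."""
    z = lam * dt
    # Rows are (2.7-211) and (2.7-212) with unknowns (y_n^+, y_{n+1}^-).
    A = np.array([[0.5 + z / 3.0, 0.5 + z / 6.0],
                  [-0.5 + z / 6.0, 0.5 + z / 3.0]])
    rhs = np.array([y_minus, 0.0])
    y_plus, y_end = np.linalg.solve(A, rhs)
    return y_plus, y_end

def pade_factor(z):
    """End-of-step growth factor appearing in (2.7-213)."""
    return (1.0 - z / 3.0) / (1.0 + 2.0 * z / 3.0 + z**2 / 6.0)

def tr_factor(z):
    """Trapezoid-rule growth factor for y' = -lam*y with z = lam*dt."""
    return (1.0 - z / 2.0) / (1.0 + z / 2.0)

lam, dt, nsteps = 1.0, 0.1, 10
y = 1.0
dg_errors, jumps = [], []
for n in range(1, nsteps + 1):
    y_plus, y = dg1_step(y, lam, dt)
    jumps.append(y_plus - dg_errors[-1] * 0.0 - (y / pade_factor(lam * dt)) if False else y_plus)
    dg_errors.append(np.exp(-lam * dt * n) - y)

# Jump at the start of the first slab, y_0^+ - y_0^-, with y_0^- = 1.
first_jump = jumps[0] - 1.0
tr2_errors = [np.exp(-lam * dt * n) - tr_factor(lam * dt / 2.0) ** (2 * n)
              for n in range(1, nsteps + 1)]
print(dg_errors[0], dg_errors[-1], tr2_errors[0], first_jump)
```

The per-slab linear solve is the price of the method: two unknowns per ODE per step, against one for TR.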
How does the method compare in accuracy with TR? To TR at 1/2 the step size, since
it is about twice as much work (twice as many equations) as TR? To BDF3 or AB3?
Quite well, actually; Table 2.7-5 shows the results for XAt = 0.1 and 10 timesteps. The
error columns are to be multiplied by 10~6. (See Notes below table for explanation.)
Remarks:
(1) DG is remarkably accurate, and global error grows slowly—or not at all ('a
consequence of Galerkin orthogonality' — R. Rannacher, personal communication). This
nice result is rather special, however; for a general, non-linear system of ODE's, the
global error will increase with time (A. Hindmarsh, personal communication).
(2) TR/2 is listed because it is a good second-order method with about the same work
(one-half as many equations to solve).
(3) TR/8 is listed because it is the break-even point for TR with respect to accuracy.
Table 2.7-5 (error columns are in units of 10^-6)

 n    y(t_n) = e^(-0.1n)   y(t_n)-y_DG   y(t_n)-y_AB3   y(t_n)-y_BDF3   y(t_n)-y_TR/2   y(t_n)-y_TR/8
 1    0.9048               1.225         0 (1)          0 (1)           18.86           1.178
 2    0.8187               2.216         0 (1)          0 (1)           34.13           2.132
 3    0.7408               3.008         32.54          -10.81          46.32           2.894
 4    0.6703               3.629         55.75          -26.56          55.88           3.491
 5    0.6065               4.104         76.04          -41.68          63.20           3.949
 6    0.5488               4.456         91.65          -53.94          68.62           4.288
 7    0.4966               4.704         103.71         -63.20          72.44           4.526
 8    0.4493               4.865         112.6          -69.96          74.91           4.681
 9    0.4065               4.952         118.9          -74.75          76.25           4.765
 10   0.3679               4.979         123.0          -77.99          76.66           4.790

Notes:
DG: Discontinuous (linear) Galerkin
BDF3: Third-order backward differentiation:
$y_{n+1} = \frac{1}{11}[18y_n - 9y_{n-1} + 2y_{n-2} + 6h\dot{y}_{n+1}]$
AB3: Third-order Adams-Bashforth
TR/2: Trapezoid rule at $\Delta t/2$
TR/8: Trapezoid rule at $\Delta t/8$
(1) Exact solution used for first two steps.
(4) AB3 is rather disappointing.
(5) The jump in DG is about 0.001 ± 0.0005, decreasing with t.
So, for the model heat equation, linear-DG does very well. What about the model
advection equation, $\dot{y} = i\omega y$? First we note that it is not conservative; it introduces
numerical damping. The solution is of course given by (2.7-213) with $\lambda = -i\omega$, which in
turn leads to

$$\xi = \frac{1 + i\omega\Delta t/3}{1 - 2i\omega\Delta t/3 - \omega^2\Delta t^2/6} = \frac{1 - 7\omega^2\Delta t^2/18 + i\omega\Delta t\,(1 - \omega^2\Delta t^2/18)}{1 + \omega^2\Delta t^2/9 + \omega^4\Delta t^4/36}, \qquad |\xi| < 1,$$

which, according to Johnson et al. (1984), is third-order accurate and fourth-order
dissipative.
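Applying the linear DG method to the AD equation (GFEM in space and DG in time)
is straightforward but quite complex. (We shall study the piecewise-constant DG method
in the next section.) What we shall do here, instead, is to take a mere first step or so in
this direction, applying it to $T_t + uT_x = 0$. The result, on a uniform mesh, is (where $T_j^n$
corresponds, at node $j$, to $y(t_n^-)$ in the scalar ODE case and $\tilde{T}_j^n$ corresponds to $y(t_n^+)$, the
value of $T_j$ just after the jump), analogous to the ODE result in (2.7-213), the pair

$$\frac{1}{6}\left[\left(\tilde{T}^n_{j-1} + T^{n+1}_{j-1} - 2T^n_{j-1}\right) + 4\left(\tilde{T}^n_j + T^{n+1}_j - 2T^n_j\right) + \left(\tilde{T}^n_{j+1} + T^{n+1}_{j+1} - 2T^n_{j+1}\right)\right] + \frac{u\Delta t}{6h}\left[\left(T^{n+1}_{j+1} - T^{n+1}_{j-1}\right) + 2\left(\tilde{T}^n_{j+1} - \tilde{T}^n_{j-1}\right)\right] = 0 \tag{2.7-214}$$

and

$$\frac{1}{6}\left[\left(T^{n+1}_{j-1} - \tilde{T}^n_{j-1}\right) + 4\left(T^{n+1}_j - \tilde{T}^n_j\right) + \left(T^{n+1}_{j+1} - \tilde{T}^n_{j+1}\right)\right] + \frac{u\Delta t}{6h}\left[2\left(T^{n+1}_{j+1} - T^{n+1}_{j-1}\right) + \left(\tilde{T}^n_{j+1} - \tilde{T}^n_{j-1}\right)\right] = 0, \tag{2.7-215}$$

in which the unknowns are $T_j^{n+1}$, $T_{j\pm1}^{n+1}$, $\tilde{T}_j^n$, and $\tilde{T}_{j\pm1}^n$, and the first equation incorporates
the jump; e.g., $\tilde{T}_j^n - T_j^n$. Rather than attempting a von Neumann analysis, we instead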
quote from Shakib and Hughes (1991), where these equations first appeared, who did: 'A
symbolic manipulator program, SMP, was used to carry out these operations. The resulting
difference stencil is very long and complicated, consequently we will not present it here.'
What is worth noting is that the DG method is quite implicit; it intimately couples the
unknown nodal parameters at the beginning and end of the time interval with the nodal
values at the end of the previous interval, a result that can (as shown above) give quite
accurate results but that doubles both the number of equations and the bandwidth over
competing (conventional?) 'continuous' (single-valued) methods such as TR and BDF3.
See Shakib and Hughes (1991) for further discussion of this method, including some
'specialized' methods for dealing with the associated (and 'large') linear algebra problem.
It is shown there to be unconditionally stable in the two limiting cases of pure advection
and pure diffusion; presumably it is also stable for the general case.
Additional relevant references on this subject may be found in Johnson et al. (1984),
Hughes et al. (1987), and Eriksson and Johnson (1990).
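To make the coupling in the pair (2.7-214) and (2.7-215) concrete, here is a minimal sketch of one time slab on a periodic uniform mesh, using dense NumPy matrices. The periodic boundary, the function name `dg_advection_step`, and the variable names are assumptions made for illustration; the block system simply stacks the two equations with the post-jump values and the end-of-slab values as unknowns, so it has twice as many equations as a single-valued method.

```python
import numpy as np


def dg_advection_step(T_minus, courant):
    """One linear-DG-in-time step for T_t + u T_x = 0.

    Space: linear Galerkin elements on a uniform periodic mesh.
    T_minus : nodal values just before the jump at t_n (known)
    courant : u * dt / h
    Returns (T_plus, T_end): values just after the jump at t_n and at the
    end of the slab, t_{n+1}.
    """
    n = T_minus.size
    eye = np.eye(n)
    shift_up = np.roll(eye, 1, axis=1)     # (shift_up @ T)[j] = T[j+1]
    shift_dn = np.roll(eye, -1, axis=1)    # (shift_dn @ T)[j] = T[j-1]
    M = (shift_dn + 4.0 * eye + shift_up) / 6.0   # consistent mass stencil
    C = shift_up - shift_dn                       # T[j+1] - T[j-1]
    k = courant / 6.0                             # u*dt/(6h)

    # Unknown vector is [T_plus, T_end]; rows follow (2.7-214), (2.7-215).
    A = np.block([[M + 2.0 * k * C, M + k * C],
                  [-M + k * C,      M + 2.0 * k * C]])
    b = np.concatenate([2.0 * (M @ T_minus),
                        np.zeros(n, dtype=T_minus.dtype)])
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n:]


# Demo: advect a Gaussian once around a periodic domain of 64 nodes.
n_nodes, courant = 64, 0.5
x = np.arange(n_nodes)
T = np.exp(-0.5 * ((x - 20.0) / 4.0) ** 2)
T_start = T.copy()
for _ in range(int(n_nodes / courant)):
    _, T = dg_advection_step(T, courant)
print("max after one revolution:", T.max(),
      " L2 ratio:", np.linalg.norm(T) / np.linalg.norm(T_start))
```

For a single Fourier mode the step reduces to the scalar amplification factor quoted earlier, with $\lambda\Delta t$ replaced by the eigenvalue of the semi-discrete advection operator; a production code would of course use a sparse or banded solver rather than dense matrices.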
e. Methods based on least-squares and time-discontinuous ODE's
'Galerkin is accurate but unstable; least squares is stable but not accurate ...'—L. Franca
during his lecture on stabilized methods at the October 1993 Workshop on Numerical
Methods for the Navier-Stokes Equations, at Heidelberg, Germany. If the added stability
(relative to GFEM + ODE) of least squares alone or DG alone is not sufficient for your
'taste' you may follow Hughes et al. and combine the two. Thus, in Hughes et al. (1989) is
presented the 'Galerkin/least-squares' (GLSQ) method for the scalar transport equation.
Whereas they discuss both steady and time-dependent equations, we shall focus only
on the latter—in which the DG method in time is combined with GFEM in space and
augmented by an additional weighted-residual method—in a linear combination—the
least-squares method in both space and time. (At the time of writing, the GLSQ is the
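method of choice by Prof. Hughes.) Thus, the GLSQ on $T_t + \mathbf{u}\cdot\nabla T = \kappa\nabla^2 T + S$ with homogeneous Dirichlet BC's (for simplicity) is the following, for each time 'slab':
$$\int_{t_n}^{t_{n+1}}\Big[\Big(w^h, \frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h\Big) + (\nabla w^h, \kappa\nabla T^h) + \tau\Big(\frac{\partial w^h}{\partial t} + \mathbf{u}\cdot\nabla w^h - \kappa\nabla^2 w^h,\ \frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - \kappa\nabla^2 T^h - S\Big)_e\Big]\,dt + \big(w^h(t_n^+),\ T^h(t_n^+) - T^h(t_n^-)\big) = \int_{t_n}^{t_{n+1}}(w^h, S)\,dt \quad \forall w^h, \tag{2.7-216}$$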
where $\tau$ (with units of 'time') is the LSQ weighting parameter ($\tau = 0$ is DGFEM of the previous section), and $w^h(x,t)$ is the discontinuous-in-time, continuous-in-space test function. Also $(\ ,\ )_e$ denotes the $L_2$ inner product over element interiors (and summed over elements), thus permitting $\nabla^2 w^h$ to 'make sense' with $C^0$ basis functions. [Note that letting $\tau \to \infty$ causes this weighted residual method to be 'purely' least squares (in space and time), and would of course now require $C^1$ basis functions of degree two or more and the usual (global) definition of $(\ ,\ )$ unless $\kappa = 0$.] In fact, however, the DGFEM employs only $C^0$ basis functions (and finite $\tau$) and does not even entertain the pure LSQ limit. This fact, plus the element-based $L_2$ inner product (which gives zero for the Laplacian with linear basis functions), emphasizes the point that the LSQ is to be considered simply as a stabilizing addition to the basic Galerkin FEM, even for the steady-state case. A final remark regarding the above formulation is that omission of the Laplacian term in the least-squares ($\tau$) terms returns us to the SUPG method: 'The Galerkin/least squares method is closely related to SUPG, but represents a conceptually simpler and more general methodology, applicable to a wide variety of problem classes' (Hughes et al., 1989).
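If we use the simplest DG method, then both $T^h$ and $w^h$ are piecewise constant in time ($\partial w^h/\partial t = 0$) over each time-slab and the result, from Shakib and Hughes (1991) is, in 1D on a uniform mesh of linear elements ($\nabla^2 w^h = 0$)
$$\frac{1}{6}\Big(\frac{T_{j-1}^{n+1} - T_{j-1}^n}{\Delta t}\Big) + \frac{2}{3}\Big(\frac{T_j^{n+1} - T_j^n}{\Delta t}\Big) + \frac{1}{6}\Big(\frac{T_{j+1}^{n+1} - T_{j+1}^n}{\Delta t}\Big) + u\,\frac{T_{j+1}^{n+1} - T_{j-1}^{n+1}}{2h} = (\kappa + u^2\tau)\,\frac{T_{j-1}^{n+1} - 2T_j^{n+1} + T_{j+1}^{n+1}}{h^2} + \frac{1}{\Delta t}\int_{t_n}^{t_{n+1}} S\,dt, \tag{2.7-217}$$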
which, like the scalar case of the previous section, is simply BE with a twist: the LSQ portion/addition has added 'BTD' to the BE method. Recall (see Section 2.7.2e) that BTD made sense from the time integration point of view when FE was the time integrator (with $\tau$ replaced by $\Delta t/2$); here it just adds additional diffusion (streamline diffusion in multi-dimensions) to an already 'maximum-dissipation' time integrator, BE. Some might view this as overkill, sacrificing accuracy (take $\kappa = 0$ and watch your initial data 'disappear') for super stability, especially when the arguments applied in Section 2.7.2e to the FE method would lead to a negative BTD for BE; i.e., the diffusion term from the time truncation error argument would be $(\kappa - u^2\Delta t/2)(T_{j-1}^{n+1} - 2T_j^{n+1} + T_{j+1}^{n+1})/h^2$, and would give more accurate results than BE and (especially) BE + BTD; i.e., GLSQ. Thus, this lowest-order method is not recommended for time-accurate simulations, even by its designers; their advice is to use it only to reach steady-state solutions.
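The 'watch your initial data disappear' remark can be quantified with a one-mode von Neumann calculation. The sketch below assumes the BE, consistent-mass, linear-element scheme on a uniform 1D mesh with an effective diffusivity, and compares plain BE ($\kappa = 0$, $\tau = 0$) with BE plus BTD ($\kappa = 0$, $\tau = \Delta t/2$) for a single well-resolved Fourier mode. The function name, the Courant number and the wavelength are illustrative choices, and the check says nothing about other wavelengths.

```python
import numpy as np


def amplification(theta, c, d):
    """BE in time, consistent-mass linear elements, 1D uniform mesh.

    theta : phase angle k*h of the Fourier mode
    c     : Courant number u*dt/h
    d     : (effective diffusivity)*dt/h**2
    """
    mass = (2 + np.cos(theta)) / 3
    return mass / (mass + 1j * c * np.sin(theta) + 2 * d * (1 - np.cos(theta)))


c = 0.5
theta = 2 * np.pi / 20                             # 20 nodes per wavelength
g_be = abs(amplification(theta, c, 0.0))           # plain BE, kappa = 0
g_btd = abs(amplification(theta, c, c**2 / 2))     # BE + BTD: tau = dt/2

steps = 200
print(f"|G| per step:  BE {g_be:.5f}   BE+BTD {g_btd:.5f}")
print(f"amplitude left after {steps} steps:  BE {g_be**steps:.4f}   "
      f"BE+BTD {g_btd**steps:.4f}")
```

The exact solution has unit modulus, so every departure from 1 here is numerical damping; adding BTD to BE roughly doubles the per-step loss for this mode.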
Since it may not be totally obvious what is meant by $T^h$ and $w^h$ in (2.7-216) for the linear DG case, we shall elucidate. First, recalling that this is a Galerkin-based method means that the trial (basis) and test functions are from the same space; they are the following:
$$T^h(x,t) = \sum_j \varphi_j(x)\,[\Pi_1(t)\,T_j^n + \Pi_2(t)\,\tilde{T}_j^{n+1}] \tag{2.7-218}$$
and
$$w^h(x,t) = \sum_j \varphi_j(x)\,[\Pi_1(t)\,w_j^n + \Pi_2(t)\,\tilde{w}_j^{n+1}], \tag{2.7-219}$$
where $\Pi_1(t) = (t_{n+1} - t)/\Delta t$, $\Pi_2(t) = (t - t_n)/\Delta t$, and $\Delta t = t_{n+1} - t_n$; also $T_j^n$ and $\tilde{T}_j^n$ have the same meaning as in the previous section, Section 2.7.7d.
In the implementation of (2.7-216), it suffices to satisfy (2.7-216) for every $w^h$ as follows: first take $w^h = \varphi_i(x)\Pi_1(t)$ for node $i$ and then take $w^h = \varphi_i(x)\Pi_2(t)$, still for node $i$; repeat for every $i$ (all nodes), giving two 'ODE's' per node.
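The 'two equations per node' structure comes entirely from the two linear-in-time functions on each slab. As a minimal sketch, the code below computes the two small time-element matrices by Gauss quadrature and uses them to take one DG step for the scalar model $\dot{y} + \lambda y = 0$ (no least-squares term, i.e. the $\tau = 0$ case). The helper names are hypothetical, and the scalar problem stands in for a single node of the full system.

```python
import numpy as np


def dg_time_matrices(dt):
    """Integrals over one slab of Pi_i*Pi_j and Pi_i*dPi_j/dt (i, j = 1, 2)."""
    g = 0.5 / np.sqrt(3.0)
    s = np.array([0.5 - g, 0.5 + g])             # 2-point Gauss nodes on (0, 1)
    w = np.array([0.5, 0.5]) * dt                # weights scaled to slab length
    Pi = np.array([1.0 - s, s])                  # Pi[i, q]: Pi_1 and Pi_2
    dPi = np.array([[-1.0, -1.0], [1.0, 1.0]]) / dt
    A = np.einsum("iq,jq,q->ij", Pi, Pi, w)      # int Pi_i Pi_j dt
    B = np.einsum("iq,jq,q->ij", Pi, dPi, w)     # int Pi_i dPi_j/dt dt
    return A, B


def dg_scalar_step(y_minus, lam, dt):
    """One linear-DG step for y' + lam*y = 0.

    y_minus is the value just before the jump at t_n; returns the value just
    after the jump and the value at the end of the slab.
    """
    A, B = dg_time_matrices(dt)
    jump = np.array([[1.0, 0.0], [0.0, 0.0]])    # Pi_1(t_n) = 1, Pi_2(t_n) = 0
    lhs = B + lam * A + jump
    rhs = np.array([y_minus, 0.0])
    y_plus, y_end = np.linalg.solve(lhs, rhs)
    return y_plus, y_end


print(dg_scalar_step(1.0, 1.0, 0.1))
```

Taking the test function equal to each of the two time functions in turn gives the two rows of the small system, which is the scalar analogue of the node-by-node procedure described above.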
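To conclude our 'summary,' we return to the pure advection case in 1D presented in the previous section, (2.7-214) and (2.7-215), and see what the LSQ addition looks like. To do so, we turn again to Shakib and Hughes (1991): to the LHS of (2.7-214) are added [from $\Pi_1(t)$] the following three terms:
$$\frac{\tau}{3\Delta t}\Big[(T_{j-1}^n + 4T_j^n + T_{j+1}^n) - (\tilde{T}_{j-1}^{n+1} + 4\tilde{T}_j^{n+1} + \tilde{T}_{j+1}^{n+1})\Big] \quad\text{from}\quad \tau\int_{t_n}^{t_{n+1}}\Big(\frac{\partial w^h}{\partial t}, \frac{\partial T^h}{\partial t}\Big)\,dt,$$
$$\frac{u\tau}{h}\big(\tilde{T}_{j-1}^{n+1} - \tilde{T}_{j+1}^{n+1}\big) \quad\text{from}\quad \tau\int_{t_n}^{t_{n+1}}\Big[\Big(\frac{\partial w^h}{\partial t}, \mathbf{u}\cdot\nabla T^h\Big) + \Big(\mathbf{u}\cdot\nabla w^h, \frac{\partial T^h}{\partial t}\Big)\Big]\,dt,$$
and
$$\frac{u^2\tau\Delta t}{3h^2}\Big[(-\tilde{T}_{j-1}^{n+1} + 2\tilde{T}_j^{n+1} - \tilde{T}_{j+1}^{n+1}) + 2(-T_{j-1}^n + 2T_j^n - T_{j+1}^n)\Big] \quad\text{from}\quad \tau\int_{t_n}^{t_{n+1}}(\mathbf{u}\cdot\nabla w^h, \mathbf{u}\cdot\nabla T^h)\,dt;$$
and, analogously, to the LHS of (2.7-215) is added, from the same three LSQ terms, this time from $\Pi_2(t)$:
$$\frac{\tau}{3\Delta t}\Big[(\tilde{T}_{j-1}^{n+1} + 4\tilde{T}_j^{n+1} + \tilde{T}_{j+1}^{n+1}) - (T_{j-1}^n + 4T_j^n + T_{j+1}^n)\Big] + \frac{u\tau}{h}\big(T_{j+1}^n - T_{j-1}^n\big) + \frac{u^2\tau\Delta t}{3h^2}\Big[2(-\tilde{T}_{j-1}^{n+1} + 2\tilde{T}_j^{n+1} - \tilde{T}_{j+1}^{n+1}) + (-T_{j-1}^n + 2T_j^n - T_{j+1}^n)\Big].$$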
Since only some of the 'correction terms' are amenable to useful physical interpretation,
we shall not bother writing out the full equations and refer the reader to Hughes et al.
(1989) and Shakib and Hughes (1991) for further details. This latter reference also shows
some 'Fourier analysis' results, both for $\tau = 0$ and for
$$\tau = \big[(2/\Delta t)^2 + (2u/h)^2 + 9(4\kappa/h^2)^2\big]^{-1/2}. \tag{2.7-220}$$
It also provides a brief comparison with more conventional methods.
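The weighting parameter in (2.7-220) blends three time scales, and its limiting values are worth seeing explicitly. The small function below (its name is illustrative) evaluates the formula and checks the three limits: $\Delta t/2$ when the timestep is the smallest scale, $h/(2u)$ when advection dominates, and $h^2/(12\kappa)$ when diffusion dominates.

```python
import math


def tau_glsq(dt, u, h, kappa):
    """LSQ weighting parameter of (2.7-220)."""
    return ((2.0 / dt) ** 2
            + (2.0 * u / h) ** 2
            + 9.0 * (4.0 * kappa / h ** 2) ** 2) ** -0.5


h = 0.1
tau_small_dt = tau_glsq(dt=1e-6, u=1.0, h=h, kappa=1e-6)   # -> dt/2
tau_advective = tau_glsq(dt=1e6, u=2.0, h=h, kappa=0.0)    # -> h/(2u)
tau_diffusive = tau_glsq(dt=1e6, u=0.0, h=h, kappa=1.0)    # -> h^2/(12 kappa)

print(tau_small_dt, tau_advective, tau_diffusive)
```

With $\tau \to \Delta t/2$ for small timesteps, the added diffusion $u^2\tau$ reduces to the BTD coefficient $u^2\Delta t/2$ mentioned earlier.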
Related methods applied to quadratic basis functions (in addition to linears) can be
found in Franca et al. (1992) and Khelifa et al. (1993).
To conclude this brief summary, we state our current opinion regarding GLSQ: it is not
felt to be a serious competitor for the simple scalar transport equation that is the subject of
this chapter. It may, however, be a viable competitor for the much more difficult problems
to be addressed in the next chapter—especially for free-surface flows. Finally, it may also
be competitive for that branch of CFD not addressed in this book: compressible flow.
Finally, for some new methods that are a sort of blend of characteristic-based methods
and GLSQ, in which only symmetric linear equations are generated, called the
Characteristic Streamline Diffusion (CSD) method, see Johnson (1992) and Hansbo (1992a).
See too Pironneau et al. (1992) in which the 'characteristic Galerkin' and 'Galerkin/least
squares space-time formulation' were compared.
f. A wave equation method
A very recent result (Wu, 1994) seems to have successfully 'transcribed' the second-
order wave-equation method of Lynch and Gray (1979), which has been demonstrated to
do a very good job for the shallow-water equations, to the advection-diffusion equation.
It is shown, for both FDM in 1D and lumped mass bilinear FEM in 2D, to have no
spurious damping, excellent phase-speed properties that are nearly independent of CFL in
the stable range (explicit formulation), and few or no wiggles. It properly precludes the
second of the two wave solutions generally displayed by the second-order wave equation,
by appending a second, required and proper, IC on the problem.
g. Another combined method: Taylor least squares
As our final 'survey' example, we summarize a method developed by Park and Liggett
(1990, 1991) that is described as 'An extremely accurate and also very flexible method
...' by Donea and Quartapelle (1992). It is restricted to C1 basis functions because it
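is also of high order in time. It combines ideas from the modified equation approach, in which Taylor series are used to obtain a fourth-order accurate time 'integration method' (for $\theta = 1/2$, the only case considered herein), the modified equation, which is then spatially discretized using least squares in space with Hermite cubic polynomials for basis functions. The modified (Taylor portion) equation is, for pure advection in 1D,
$$\Big(1 + \frac{u\Delta t}{2}\frac{\partial}{\partial x} + \frac{u^2\Delta t^2}{12}\frac{\partial^2}{\partial x^2}\Big)\Big(\frac{T^{n+1} - T^n}{\Delta t}\Big) + u\frac{\partial T^n}{\partial x} = 0, \tag{2.7-221}$$
and the weighted residual equation upon applying the least-squares method, is
$$\Big(\Big(1 + \frac{u\Delta t}{2}\frac{\partial}{\partial x} + \frac{u^2\Delta t^2}{12}\frac{\partial^2}{\partial x^2}\Big)\Big(\frac{T_h^{n+1} - T_h^n}{\Delta t}\Big) + u\frac{\partial T_h^n}{\partial x},\ \Big(1 + \frac{u\Delta t}{2}\frac{\partial}{\partial x} + \frac{u^2\Delta t^2}{12}\frac{\partial^2}{\partial x^2}\Big)w^h\Big) = 0, \quad \forall w^h, \tag{2.7-222}$$
where $w^h$ is a test function. Upon representing both $T_h$ and $w^h$ by means of Hermite cubic polynomials, the higher-order TLS scheme of Park and Liggett is obtained.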
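The fourth-order claim for the Taylor portion can be checked for a single Fourier mode, before any spatial discretization. The sketch assumes the time operator has the form $1 + D/2 + D^2/12$ acting on $(T^{n+1} - T^n)$, with $D = u\Delta t\,\partial/\partial x$ becoming the number $D = i\,k\,u\Delta t$ for a mode $e^{ikx}$; the function name is illustrative.

```python
import cmath


def tls_time_factor(D):
    """Per-step factor implied by (1 + D/2 + D^2/12)(T_new - T_old) + D*T_old = 0."""
    P = 1 + D / 2 + D * D / 12
    return 1 - D / P


for a in (0.4, 0.2, 0.1):
    D = 1j * a                                   # pure advection: D is imaginary
    G = tls_time_factor(D)
    err = abs(G - cmath.exp(-D))                 # exact factor is exp(-D)
    print(f"a={a:3.1f}  |G|={abs(G):.15f}  error={err:.3e}")
```

The factor is the (2,2) Pade approximant of $e^{-D}$: it has unit modulus for imaginary $D$, and its error drops by about 32 when $a$ is halved, i.e. a local error of fifth order and hence fourth-order accuracy in time.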
When diffusion is included, it is treated temporally via the trapezoid rule using an
operator-splitting (fractional step) technique, thus accepting only second-order accuracy
for diffusion. It is applied (tested) in 1D and 2D in Park and Liggett (1990) and in 3D in Park and Liggett (1991), wherein also presented is a ray method for obtaining analytical solutions to the transient AD equation in 3D. Impressive results are presented, and the TLS method compares well against other weighted residual methods [the Crank-Nicolson Taylor-Galerkin and Lax-Wendroff Taylor-Galerkin modified equation methods of Donea et al. (1987)] using similar (cubic) basis functions.
Thus we conclude our discussion of other methods on a high note; i.e., a higher order
in space and time method can indeed deliver high-order accuracy on relatively coarse
grids. It remains to be seen if the method can be successfully applied to the more difficult
NS equations.
It should be sufficiently clear by now that the 'simple' AD equation continues to
provide fertile ground for numerical methods developers—the 'sand box' is very large.
2.7.8 Concluding Remarks and Suggestions
It may be interesting, and perhaps even useful, to try to draw a few conclusions and
make some recommendations after such a long discussion of time integration techniques.
They will naturally be rather subjective since we (or anyone else, for that matter) have
only limited personal experience. Thus, the first thing we must do is remain rather silent
on those new techniques that involve least-squares and discontinuous-in-time Galerkin
methods—they are probably excellent, but they look to us rather expensive.
Next, we do believe in the method of characteristics for advection and recommend it
to our readers—although the best of them may still be in the future, and should probably
not use fixed timesteps.
Returning to the more mundane-but-much-more-common subject of ODE methods
applied to the Eulerian version of the AD equation, we can offer more confident advice:
1. Since the GFEM equations are inherently implicit (CM matrix) and since we strongly
believe that implicit methods should almost never be implemented as fixed-step methods,
our first vote goes to the variable-step TR method of Section 2.7.4.
2. If you are uncomfortable with a method that contains no built-in damping, use the
variable-step BDF2 method that we discuss in the next chapter—Section 3.16.4d; simply
simplify it for the AD equation.
3. Please do not use BE, variable step or not; it is too dissipative and requires too small a $\Delta t$ for accuracy.
4. If you insist on explicit, fixed-$\Delta t$ integrators for simplicity (or whatever), yet are
interested in accurate time integration and in advection-dominated flows, then consider
either AB3 or RK4 and use a few DSCG iterations per timestep to reap the benefits of
consistent mass.
5. If you insist on solving only symmetric matrices but want more stability than
explicit methods can provide, consider the CM semi-implicit but unconditionally stable
(we believe—see Bullister, 1986) method of Section 2.7.5—FE/TR/BTD—perhaps even
generalized to variable step sizes, somehow.
6. The lowest-level but cheapest per step method that we would even consider advocating
uses LM quadratic elements and the FE/BTD explicit method of Section 2.7.2e.
2.8 ADDITIONAL NUMERICAL EXAMPLES
We conclude this chapter with just three numerical examples—one that occurred in our
'real world of applications' and two that serve as simple but effective test problems to
demonstrate several aspects of what has been discussed.
2.8.1 Unstable ODE Example
This example came from a real-world application—thermal convection in molten uranium.
During a particular series of simulations (at LLNL), a previously quite useful code kept
blowing up. After many 'diagnostic' runs, including the use of two independent codes,
it was (finally!) determined that the advection-diffusion ODE for the temperature was
often unstable. Yes—the ODE itself. (This is the energy equation in the Boussinesq
model—see Volume II—and the velocity field that drives it is rapidly varying, both
temporally and spatially.) Presented below is a simplified sample of those results in
which the quite complex and always time-dependent velocity field was 'frozen' in the
middle of a run (when all the physics was in full swing) and that field—now only
spatially varying—used as a steady velocity field to drive the linear scalar transport
equation. The flow field was generated from the following thermal convection problem
in a 59.7 x 12.7 horizontal cavity: the bottom and right end are heated ($T_{hot}$), the top is cooled ($T_{cold}$), and the left end is a symmetry boundary. The velocity BC's were: no slip, no penetration on the bottom and right end, no penetration and shear-free on the top, and 'symmetry' on the left end (i.e., again no penetration and no shear). Depending somewhat on $\Delta T$ ($= T_{hot} - T_{cold}$), a multi-cell (four or five, usually) flow pattern resulted, a snapshot of which is shown in Figure 2.8-1. The 'test' problem that finally evolved was to use this flow field (which 'generated' the unstable ODE's) to solve the pure advection equation with a specified, initial Gaussian temperature field ($\sigma = 0.25H = 3.18$) centered near the top of the right-most cell; see Figure 2.8-2(a). [We switched from advection-diffusion to pure advection partly because diffusion (in 'reasonable' amounts) did not stabilize the ODE's and partly to perform other numerical tests, reported below.] The series of experiments described below will demonstrate the following points:
1. It is not too difficult to generate unstable ODE's from the advection term in complex and rapidly varying velocity fields if any but the quadratically conserving formulation ($\beta = 1/2$, skew-symmetric advection matrix) is utilized.
2. The skew-symmetric form will conserve $\int T^2$ but not $\int T$; i.e., it always generates stable ODE's but cannot be guaranteed to conserve $T$, nor be guaranteed to be accurate. Bounded gibberish can occur.
3. The conservation form (divergence form; $\beta = 1$) will indeed conserve $\int T$ but generally not $\int T^2$ and will thus blow up (become unbounded). [N.B. finite volume fans.]
4. The simplest advective form ($\beta = 0$) will conserve neither while blowing up.
5. Owing principally to directional group velocity errors, the passive scalar that should remain in the first two cells crosses not only those dividing streamlines but all the others as well, to eventually fill the whole flow field with spurious numbers. This does not bode well for any of the advection options, and the only fix we know of would be to use one of the better characteristic/trajectory methods for advection instead of Galerkin (or Petrov-Galerkin, or least squares, or Galerkin least squares, or Taylor-Galerkin, or any Eulerian method).
6. The time reversibility of the TR integrator will be dramatically demonstrated, even on unstable ODE's, and compared with the dissipative BE method. TR is an excellent time integrator for non-dissipative systems, and BE is a very poor one, requiring extremely small timesteps for comparable accuracy.
7. Streamline diffusion can be used to generate different wrong answers, and does not guarantee stability.
Fig. 2.8-1 Flow field that generated an unstable ODE: (a) vector field; (b) stream function; (c) vector zoom; (d) stream function zoom.
Fig. 2.8-2 Temperature field at several times (TR, $\beta = 0$): (a) t = 0; (b) t = 1; (c) t = 4; (d) t = 16; (e) t = 32; (f) t = 84.
Figures 2.8-1(a) and (b) show the vector field and stream function for the flow, and Figures 2.8-1(c) and (d) show a 'zoom' (different $\psi$-contours) into the right-most 15% or so of the cavity because this is where the largest 'action' will be seen to occur. Even though the upper small eddy [Figure 2.8-1(d)] appears to be fairly well resolved on this 100 x 24 graded mesh of $Q_1Q_0$ (see Chapter 3) elements in which temperature is piecewise bilinear, it is the 'generator' of the unstable ODE in the following sense: the eigenvector corresponding to the most unstable mode (largest growth rate) has its largest entries at nodes in this region. Thus, ultimately (large $t$) the temperature in this neighborhood dominates that in the rest of the field. The small-but-variable velocities in the upper corner generate (for $\beta = 0$ and $\beta = 1$) a quite unstable ODE from the advection matrix [$M^{-1}N(u)$ to be precise]. The maximum speed in the entire field is ~4.1 (at x = 16.5, y = 3.3) and the maximum in the right-most large eddy of Figure 2.8-1 is ~3.5 (and occurs at x = 59.3, y = 6.2). Based on these speeds, the estimated average eddy turnover time is ~16.
The sequence of figures in Figure 2.8-2 displays the instability qualitatively via a 'base case' (TR with $\beta = 0$, $\Delta t = 0.10$, and $0 \le t \le 84$) beginning with the IC ($T_0$) in Figure 2.8-2(a), in which the value of $T_{max}$ (20.8) was 'arbitrarily' selected to give $\int T_0 = A = 59.7 \times 12.7 = 758$, so that we could easily monitor 'conservation' of $T$: $\int T/A$ should remain unity. By t = 4 [Figure 2.8-2(c)] the simulation is already 'in trouble' in that the Gaussian has both 'broken up' and is about to penetrate the dividing streamline between cells 2 and 3 to show up in the next cell, a clear violation of physics. Also, as shown in the figure, the violation of $T_{max} = 20.8$ and $T_{min} = 0$ has already occurred. Thus the rest of the experiments will focus on numerics rather than physics, which, of course, is the prime purpose of this example. The 'trail of bad numbers' continues in Figures 2.8-2(d) and (e), and we remark that the clear unstable behavior of the dominant mode does not really appear until time 50 or so, as we will see later via some time histories. But by the end of the run, t = 84 in Figure 2.8-2(f), the 'final' pattern is set: the dominant mode is a growing oscillation, $T(x,t) \sim f(x)e^{(i2\pi/\tau + \lambda)t}$ with $\lambda > 0$; in fact, $\lambda = 0.18$ and $\tau = 1.8$, giving $1/\lambda\Delta t = 55$ steps per 'e-folding' and $\tau/\Delta t = 18$ steps per period, showing a 'nearly' sufficiently small $\Delta t$ for TR to 'track' the solution pretty well. (Also, $e^{\lambda\tau} = 1.4$, a 40% growth per period.)
Next, starting from the large $T$ [$O(10^6)$] solution at t = 84, we 'digress' to demonstrate a remarkable property of TR (symmetry) that is shared by only a few time integrators: complete reversibility in time, as discussed in Section 2.7.3, a property that is obviously closely related to its lack of numerical dissipation/lack of spurious damping/neutral stability. Starting at t = 84 and reversing the sign of $\Delta t$ (or, what is equivalent in this case, reversing the sign of the velocity), Figures 2.8-3(a) and (b) show two stages in the reverse integration, the first (after 52 time units) corresponding to t = 32 in the forward integration and the second (at 80 time units) to t = 4. These are to be compared with Figure 2.8-2(e) and (c), respectively. Not shown is the end of the backward integration at t = -84, the IC, which looks just like Figure 2.8-2(a) and agrees with it to 'several' digits. The last of our backward TR integration results is shown in Figure 2.8-3(c), the first 7.5 time units of the backward integration, during which we see, appropriately, a damped sinusoidal oscillation (of the most unstable mode, for small enough time; at 'large' time, other modes become relevant). This is for node '2399,' located one node down from the top and six nodes in from the right (x = 59.38, y = 11.92) and is certainly close to the 'most unstable' node.
Fig. 2.8-3 (a) and (b) Temperature field at two times during backwards integration via TR, at t = 52 (32) and t = 80 (4); (c) a nodal time history during a portion of the backward integration.
The next series of pictures, Figure 2.8-4, are all at t = 32 [cf. Figure 2.8-2(e)] and show how the solutions vary with $\beta$ and the effect of switching from TR to BE. Thus, shown in the figure are: TR for $\beta = 1$ (conserves $T$) and $\beta = 1/2$ (conserves $T^2$) and BE for all three values of $\beta$, with an additional BE result, Figure 2.8-4(d), using a ten-fold smaller timestep (8400 total steps). Associated with all of the above is a summary of extrema, Table 2.8-1, at both t = 32 and at the end of each run (t = 84).
For each value of $\beta$, the BE and TR results should agree, whether stable or unstable. When $\Delta t$ is reduced, the results should not change 'significantly.' Comparing first $\beta = 0$ at t = 32 in the table shows that $\Delta t = 0.01$ for BE is 'almost' good enough [close to TR at $\Delta t = 0.10$; cf. also Figures 2.8-2(e) and 2.8-4(d)] but at t = 84, it is far from good enough, giving extrema that are ~100-fold too small. Worse yet is BE at $\Delta t = 0.10$, being ~600-fold too small at t = 84. Figure 2.8-5(a) shows that, even though two to three orders of magnitude too small, BE gets the solution 'qualitatively correct.' Another indication that TR is 'correct' is given by comparing the theoretical amplification factor, $\xi_T = (1 + \lambda\Delta t/2)/(1 - \lambda\Delta t/2)$, with the observed factor, $\xi_O = (T_n/T_m)^{1/(n-m)}$; the two agree to within about 0.1% for t > ~50, with $\xi = 1.02$. A similar comparison for the BE run gives rather different theoretical vs observed growth rates, suggesting that the severe damping has so strongly distorted the results that the most unstable mode is not yet clearly present. Finally, a factor of two reduction in $\Delta t$ for TR led to a factor of ~2 increase in the extremal values at t = 84, suggesting, as mentioned earlier, that $\Delta t = 0.10$ is 'almost' small enough for the TR integrator. (Recall that extreme values are being compared; better accuracy over most of the domain can be expected.) Also noteworthy from the figures and the table is that $\beta = 1$ is actually more 'unstable' (slightly larger $\lambda$, and much earlier growth) than the simpler advective form, even though it conserves $T$; conservation of linear quantities guarantees neither stability nor accuracy. Finally, $\beta = 1/2$ gives bounded but (also) totally fallacious results; conservation of $T^2$ guarantees stability but not accuracy.
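The quoted numbers for the dominant mode can be reproduced with simple arithmetic. The sketch below takes the growth rate 0.18, the period 1.8 and the timestep 0.10 from the text, forms the complex exponent of the growing oscillation, and evaluates the modulus of the TR and BE amplification factors for that single mode. Treating the mode in isolation is an assumption; the real runs contain many modes.

```python
import cmath
import math

lam, period, dt = 0.18, 1.8, 0.10              # values quoted in the text
z = complex(lam, 2 * math.pi / period) * dt    # (growth + i*frequency) * dt

steps_per_efold = 1 / (lam * dt)               # about 55
steps_per_period = period / dt                 # 18
growth_per_period = math.exp(lam * period)     # about 1.4

xi_exact = abs(cmath.exp(z))                   # exact growth per step
xi_tr = abs((1 + z / 2) / (1 - z / 2))         # trapezoid rule
xi_be = abs(1 / (1 - z))                       # backward Euler

print(f"steps per e-folding {steps_per_efold:.1f}, per period {steps_per_period:.0f}")
print(f"growth per period {growth_per_period:.3f}")
print(f"|xi| per step: exact {xi_exact:.4f}, TR {xi_tr:.4f}, BE {xi_be:.4f}")
```

For this mode TR gives a per-step factor close to the exact one (about 1.02), whereas the BE factor is below one at this timestep, which is in line with the severe damping reported for the BE runs.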
Another measure of BE's quality is how well it does on the reverse integration. Thus,
Figures 2.8-5(b) and (c) are to be compared with the TR counterparts, Figures 2.8-3(a)
and (b); the difference is significant, both qualitatively and quantitatively.
Finally, some relevant time histories are shown in Figure 2.8-6 for TR and BE. Plotted there is $\int T\,dA/A$ and the natural logarithm of $(\int T^2\,dA/A)$, with respective values 1.0 and 2.47 at t = 0. For $\beta = 0$, TR shows an oscillatory blow-up of $\int T$ [Figure 2.8-6(a)] and a pretty clean exponential, and non-oscillatory, blow-up of $\int T^2$ [Figure 2.8-6(d)]. (We offer an explanation for these counter-intuitive phenomena at the end.) For $\beta = 1$ and $\beta = 1/2$, the predicted theoretical results obtain, with the former showing a slightly larger growth rate of $\int T^2$ than did the advective form ($\beta = 0$); the latter is not shown (it is constant at 2.47). The analogous BE results, Figures 2.8-6(c) and (e), agree less well with theory (with the exception that $\beta = 1$ does conserve $\int T$) because $\Delta t$ is too large; e.g., $\int T^2$ is actually decreasing for $\beta = 1/2$. These poor BE results are in complete harmony with those in Marx (1994), who found, for one of Rannacher's test problems (Rannacher, 1989) that a 50-fold smaller $\Delta t$ was needed by BE to get the same accuracy as TR on ODE's displaying a slowly damped, oscillatory solution.
Fig. 2.8-4 Temperature fields at t = 32 for six different integrations: (a) TR, $\beta = 1$ ($T_{max}$ = 831, $T_{min}$ = -1124); (b) TR, $\beta = 1/2$ ($T_{max}$ = 59.5, $T_{min}$ = -70.4); (c) BE, $\beta = 0$; (d) BE, $\beta = 0$, $\Delta t = 0.01$ ($T_{max}$ = 132, $T_{min}$ = -180); (e) BE, $\beta = 1$ ($T_{max}$ = 870, $T_{min}$ = -561); (f) BE, $\beta = 1/2$ ($T_{max}$ = 47.1, $T_{min}$ = -64.6).
Table 2.8-1 Temperature extrema.

                                   Trapezoid rule                 Backward Euler
                                   T_min           T_max          T_min           T_max
t = 32   β = 0                     -182            133            -170            130
         (β = 0, Δt = 0.01)        --              --             (-180)          (132)
         β = 1                     -1124           831            -561            870
         β = 1/2                   -70.4           59.5           -64.6           47.1
t = 84   β = 0                     -3.2 x 10^5     3.3 x 10^5     -523            493
         (β = 0, Δt = 0.01)        --              --             -3.20 x 10^3    3.00 x 10^3
         (β = 0, Δt = 0.05)        (-6.0 x 10^5)   (6.7 x 10^5)   --              --
         β = 1                     -2.4 x 10^7     1.3 x 10^7     -1.65 x 10^4    2.24 x 10^4
         β = 1/2                   -145            90             -80             -58
Fig. 2.8-5 Some more backward Euler results with $\Delta t = 0.1$: (a) $\beta = 0$ at t = 84; (b) and (c) backward integration starting from the result in (a), at t = 52 (32) and t = 80 (4).
Remark:
It may be worthwhile demonstrating (for $\beta = 1/2$) that TR conserves energy for all $\Delta t$ and BE degrades it. Thus, starting from $M\dot{T} + NT = 0$ with $N^T = -N$,
(1) Clearly $\frac{1}{2}\,d(T^TMT)/dt = T^TM\dot{T} = 0$ because $T^TNT = 0$ because $N$ is skew-symmetric; i.e., $T^TMT$ is constant: $\int T^2$ is conserved by the ODE's.
(2) For TR, $M(T_{n+1} - T_n)/\Delta t + N(T_n + T_{n+1})/2 = 0$ yields $(T_{n+1} + T_n)^TM(T_{n+1} - T_n)/\Delta t + (T_{n+1} + T_n)^TN(T_{n+1} + T_n)/2 = 0$, giving the desired conservation: $T_{n+1}^TMT_{n+1} = T_n^TMT_n$.
(3) For BE, $M(T_{n+1} - T_n)/\Delta t + NT_{n+1} = 0$ yields $T_{n+1}^TM(T_{n+1} - T_n)/\Delta t + T_{n+1}^TNT_{n+1} = 0$, giving $T_{n+1}^TMT_{n+1} = T_{n+1}^TMT_n = T_n^TMT_{n+1}$. But from $T_{n+1} = (I + \Delta tM^{-1}N)^{-1}T_n = [I - \Delta tM^{-1}N + \Delta t^2(M^{-1}N)^2 + O(\Delta t^3)]T_n = T_n - \Delta tM^{-1}NT_n + \Delta t^2M^{-1}NM^{-1}NT_n + O(\Delta t^3)$, we obtain $T_{n+1}^TMT_{n+1} = T_n^TMT_n - \Delta t\,T_n^TNT_n + \Delta t^2\,T_n^TNM^{-1}NT_n + O(\Delta t^3) = T_n^TMT_n - \Delta t^2(NT_n)^TM^{-1}(NT_n) + O(\Delta t^3)$; BE monotonically decreases the energy because $(NT_n)^TM^{-1}(NT_n) \ge 0$ since $M$ is positive definite.
Fig. 2.8-6 Time histories of various 'conserved' quantities: (a) TR, $\beta = 0$; (b) TR, $\beta = 1/2$; (c) BE; (d) TR; (e) BE. Horizontal axes: time, 0 to 90.
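The energy argument of the Remark is easy to confirm on a small random system. The sketch below builds an arbitrary symmetric positive-definite matrix standing in for the mass matrix and an arbitrary skew-symmetric matrix standing in for the $\beta = 1/2$ advection matrix (both are synthetic, not taken from the simulation), then integrates with TR and with BE and records the energy at every step.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dt, steps = 8, 0.1, 200

A = rng.standard_normal((n, n))
N = A - A.T                           # skew-symmetric "advection" matrix
B = rng.standard_normal((n, n))
M = B @ B.T + n * np.eye(n)           # symmetric positive-definite "mass" matrix
T0 = rng.standard_normal(n)


def energy(T):
    return T @ M @ T


def run(theta):
    """theta = 1/2 gives TR, theta = 1 gives BE, for M dT/dt + N T = 0."""
    lhs = M + theta * dt * N
    rhs = M - (1.0 - theta) * dt * N
    T, history = T0.copy(), [energy(T0)]
    for _ in range(steps):
        T = np.linalg.solve(lhs, rhs @ T)
        history.append(energy(T))
    return np.array(history)


e_tr, e_be = run(0.5), run(1.0)
print("TR energy drift :", abs(e_tr - e_tr[0]).max() / e_tr[0])
print("BE energy ratio :", e_be[-1] / e_be[0])
```

TR holds the energy to roundoff for any timestep, while BE lowers it at every step, as parts (2) and (3) predict.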
Can we stabilize the ODE by adding dissipation/numerical diffusion? Should we? The answer to the first question is 'yes' if you change your advection "operator" to one that is monotonic, a common "defense" in FDM and FVM, although less common in FEM. As we have little interest and zero capability in monotone FEM schemes, we will present the closest thing that we do have: streamline upwinding. Although our derivation
of it was via BTD in which it was introduced to counter the FE instability of negative
diffusion (see Sections 2.1.It and 2.7.5), here we add it to TR simply to try to gain
stability without, hopefully, degrading accuracy. Thus we added the BTD coefficient, $u_iu_j\Delta t/2$, as a diffusion term, for the $\beta = 0$ case via TR to see 'what happens.' What happened is this: the ODE's appeared to be stable but were tending toward a completely wrong (and ostensibly steady-state) solution. But in fact they were not stable; they were just more stable than those without BTD. Figure 2.8-7 shows the result at 'near' steady state, t = 115. That the 'perfect' BTD solution would lead to a (wrong) steady state for this situation (closed streamlines, steady velocity field) is easily seen by writing the pure advection 'PDE' + BTD along a streamline: $\partial T/\partial t + u_s\,\partial T/\partial s = (\Delta t/2)(u_s\,\partial/\partial s)^2T$, where $s$ represents a streamline coordinate. This 1D advection-diffusion equation causes diffusion along each streamline, with the ultimate result that a steady state could ultimately be attained in which $T$ is constant on streamlines (different constants on different streamlines, of course, depending only on the IC's). But the discrete approximation to this PDE is only approximately representing 1D advection-diffusion along a streamline. Hence, pollution of the other cells still occurs (group velocity errors) and even instability still occurs. Although Figure 2.8-7, and time histories (not shown), show a tendency to display constant-along-streamlines 'temperature,' in fact, the solution eventually became unstable, but with a much lower growth rate ($\lambda = 0.0021$ vis-a-vis $\lambda = 0.18$ without BTD), the instability becoming clearly dominant only at very large $t$ (several thousand); this run was actually accomplished with the variable-step version of TR. (BTD almost stabilized the ODE's for this problem.)
So what else have we learned from this example? What advice can be given to the analyst based on it? Our answers to this type of advection dilemma are the following:
1. In general, stay with the cheapest method ($\beta = 0$) whose accuracy, when stable, is as good as the other two. Also, instabilities of the type shown above are not very common.
2. If an unstable ODE is generated in a particular simulation, first rerun it using $\beta = 1/2$, which should verify that you did have an unstable ODE by being stable. If the associated 'accuracy' looks good, you are lucky.
3. But if your stable results look like 'bounded gibberish,' as they did in our example, go back to your mesh generator and refine the mesh, at least in the region of maximum instability.
4. Suppose you cannot afford to remesh and rerun (you may already be at the computer's capacity in 3D)? We offer three possible solutions to this sad situation: (i) give up! (the stated problem is simply beyond your resources), (ii) find a friend with a bigger computer, (iii) try streamline (or other) upwinding, perhaps multiplying $u_iu_j\Delta t/2$ by a scalar (10? 100?), but be very suspicious of your results.
Fig. 2.8-7 Temperature field at 'nearly' steady state (t = 115) via TR plus BTD.
To conclude, it may be of some interest to show, for a model problem in 1D, that it
is indeed possible (with thanks to A.C. Hindmarsh) for a periodic-in-time (eigenvector)
solution to show a complete lack of periodicity in 'energy.' The dominant eigenvalue for
the 1D AD equation with a Dirichlet BC on the left and a Neumann BC on the right
(outflow) is, from Section 2.6.2d, in the advection-dominated 'limit' ($P \gg 1$),
$$\lambda_1 = \frac{2\kappa}{h^2}\left(1 - iP\cos\frac{\pi}{N}\right) \simeq \frac{2\kappa}{h^2} - i\,\frac{u}{h}, \qquad (2.8\text{-}1)$$
which yields the temporal behavior $e^{-\lambda_1 t} = e^{-2\kappa t/h^2}\, e^{iut/h}$, which, since $P = uh/2\kappa \gg 1$,
corresponds to an oscillating and slowly decaying mode [$-\lambda_1 t = (ut/h)(-1/P + i)$]. The
corresponding eigenvector is
$$v_j = (-i)^j \sin(j\pi/N), \quad j = 1, 2, \ldots, N, \qquad (2.8\text{-}2)$$
and the solution to the semi-discretized AD equation, assuming this to be the only mode
present, is
$$T_j(t) = e^{-\lambda_1 t} v_j + e^{-\bar{\lambda}_1 t}\, \bar{v}_j, \qquad (2.8\text{-}3)$$
which is, as required, real (a real solution requires that the conjugate modes have equal
amplitude). Letting $\mathbf{T}$ and $\mathbf{v}$ be the corresponding N-vectors, we seek the behavior of the
energy, $E = \mathbf{T}^T\mathbf{T}$, and obtain
$$E = 2e^{-2t\,\mathrm{Re}(\lambda_1)}\, \bar{\mathbf{v}}^T\mathbf{v} + 2\,\mathrm{Re}\left(e^{-2\lambda_1 t}\, \mathbf{v}^T\mathbf{v}\right), \qquad (2.8\text{-}4)$$
where the first term represents monotonic (non-oscillatory) decay, and the second describes
oscillatory decay unless $q \equiv \mathbf{v}^T\mathbf{v} = 0$. We leave as a (non-trivial) exercise the proof
that, in fact, q = 0. We presume that an analogous situation has occurred for our TR
integration of the $\beta = 0$ unstable ODE.
For a brief survey of the literature on unstable ODE's, refer to Section 2.3.1, in the
Remark in the Digression following (2.3-9).
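The claim that q = 0 can at least be checked numerically before attempting the proof. The following is a minimal sketch, assuming the eigenvector has the form $v_j = (-i)^j \sin(j\pi/N)$ and the eigenvalue is $2\kappa/h^2 - iu/h$; the parameter values ($\kappa$, u, h, N) are illustrative choices and are not taken from the text.

```python
import cmath
import math

def eigenvector(n):
    """v_j = (-i)^j sin(j*pi/n), j = 1..n (assumed form of the eigenvector)."""
    return [(-1j) ** j * math.sin(j * math.pi / n) for j in range(1, n + 1)]

def q_of(n):
    """Unconjugated product q = v^T v."""
    return sum(vj * vj for vj in eigenvector(n))

def energy(t, n, lam):
    """E(t) = T^T T with T_j = exp(-lam*t) v_j + complex conjugate."""
    a = cmath.exp(-lam * t)
    temps = [2.0 * (a * vj).real for vj in eigenvector(n)]
    return sum(tj * tj for tj in temps)

# Illustrative (hypothetical) parameters: P = u*h/(2*kappa) = 5
kappa, u, h, n = 0.01, 1.0, 0.1, 20
lam = 2.0 * kappa / h**2 - 1j * u / h

worst_q = max(abs(q_of(m)) for m in range(3, 41))
ratio = energy(0.37, n, lam) / energy(0.0, n, lam)
print(worst_q, ratio, math.exp(-2.0 * lam.real * 0.37))
```

With q = 0 to round-off, the energy of this oscillating mode decays as a pure exponential, with no oscillation at all.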
2.8.2 Advection-Diffusion of a Puff (Point Source)
Air pollution studies often involve the elevated release of either a point source or a finite
'puff' of pollutant, the fate of which is of some interest. A point source is, of course, a
mathematical idealization which, with the passage of time, becomes a puff. As a sample
result in this regard, Figure 2.8-8 shows a simple simulation of a puff release at t = 0 and
its subsequent advection and diffusion downwind. Shown are the 33 x 10 mesh of 4-node
bilinear elements and both CM and LM results, as well as the exact result (dotted),
at several times. The IC is a 2D Gaussian puff with $\sigma = 0.25$ released at x = 0.625,
y = 1.0; the wind speed is u = 1, v = 0; and the diffusivity (constant in this case, which is not
meteorologically appropriate) is $\kappa = 0.05$, giving a puff Peclet number, $u\sigma/2\kappa$, of 2.5, which shows
a fairly close balance between advection and diffusion (each is important). The BC's are:
T = 0 at x = 0 and along the top, and $\partial T/\partial n = 0$ at y = 0 and at the outflow boundary.
The analytical solution is that for a point source released earlier (t = -0.625) at x = 0,
y = 1, which generates the IC mentioned above. Even with significant diffusion, the
lumped mass result is noticeably inferior to that from GFEM, a deficiency attributable (as always) to
numerical dispersion. Also noteworthy is (for CM) the utility of the homogeneous NBC
as OBC: $\partial T/\partial x = 0$ at x = L.
Fig. 2.8-8 Advection-diffusion of a puff. (a) 33 x 10 mesh of 4-node elements; (b) CM
results at t = 0, 2.5, 6.5, and 11.5; (c) LM results at the same times.
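The consistency of the stated puff parameters can be checked with a few lines. The sketch below assumes the standard free-space solution for an instantaneous 2D point source in a uniform wind (it ignores the boundaries of the computational domain); the function names and the unit `mass` are hypothetical.

```python
import math

U, KAPPA = 1.0, 0.05      # wind speed and diffusivity (from the text)
T0 = -0.625               # release time of the point source (from the text)
X0, Y0 = 0.0, 1.0         # release point (from the text)

def puff_sigma(t):
    """Standard deviation of the puff: sigma^2 = 2*kappa*(t - T0)."""
    return math.sqrt(2.0 * KAPPA * (t - T0))

def puff_centre(t):
    """Centre of the puff, advected with the wind."""
    return X0 + U * (t - T0), Y0

def puff_value(x, y, t, mass=1.0):
    """Free-space Gaussian puff (assumption: no boundary effects)."""
    s2 = puff_sigma(t) ** 2
    xc, yc = puff_centre(t)
    r2 = (x - xc) ** 2 + (y - yc) ** 2
    return mass / (2.0 * math.pi * s2) * math.exp(-r2 / (2.0 * s2))

sigma0 = puff_sigma(0.0)
print(sigma0, puff_centre(0.0), U * sigma0 / (2.0 * KAPPA))
```

At t = 0 this gives a puff of width 0.25 centred at x = 0.625, y = 1, and a puff Peclet number of 2.5, in agreement with the IC described above.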
Also of interest is $\Delta t$ vs t from a smart integrator (variable step TR), and this is
shown in Figure 2.8-9 for the CM case, for a rather 'tight' $\varepsilon$ ($10^{-5}$, used to generate
Figure 2.8-8) and one order larger, the results of which are not shown but differed little.
In each case, $\Delta t$ increased nearly 20-fold during the simulation as the effect of diffusion
made the simulation progressively easier. The fact that the small $\varepsilon$ run required a little
more than twice the total number of steps as the 10-fold larger $\varepsilon$ case is quite consistent
with the theory: $10^{1/3} = 2.15$.
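The arithmetic behind this remark is short enough to spell out. The sketch below assumes that the local error of TR scales like $\Delta t^3$, so that the step size at fixed error scales like $\varepsilon^{1/3}$, and it takes the step totals (148 and 306) from the legend of Figure 2.8-9; the function name is hypothetical.

```python
def predicted_step_ratio(eps_ratio, local_error_order=3):
    """Ratio of step counts when the error tolerance is tightened by eps_ratio,
    assuming the local error scales like dt**local_error_order."""
    return eps_ratio ** (1.0 / local_error_order)

predicted = predicted_step_ratio(10.0)   # tolerance tightened 10-fold
observed = 306 / 148                     # step totals read from the figure legend
print(round(predicted, 3), round(observed, 3))
```

The observed ratio (about 2.07) is close to, and slightly below, the predicted 2.15.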
2.8.3 The Rotating Cone—A Pure Advection Test Problem
We conclude this chapter by showing one more example in which mass lumping seriously
degrades the otherwise high accuracy attainable from GFEM. The popular test problem
[independently initiated in 1968 by Molenkamp (1968) and Crowley (1968)] known as
the 'rotating cone' begins by placing a cone (2D 'hat function') on the mesh as an IC in a
pure/solid body rotational ($\Omega$ rad/sec) velocity field: $u = -\Omega y$, $v = \Omega x$. The solution is
easy to depict in words: it is as if an (empty!) ice cream cone were placed upside-down on
a turntable/record player (remember record albums?) and the switch turned on. (A low-speed,
e.g., 33 rpm, record player may be needed, to preclude 'slippage' ... Remember
45 rpm? 78 rpm?).
Fig. 2.8-9 Timestep history (TR) for the CM results in Figure 2.8-8 ($\varepsilon = 10^{-5}$) and for $\varepsilon = 10^{-4}$
(not shown in Figure 2.8-8).
[Plot of $\Delta t$ vs t. Legend: $\varepsilon = 10^{-4}$, total steps = 148; $\varepsilon = 10^{-5}$, total steps = 306.
Annotations give the number of steps taken at each $\Delta t$.]
Figure 2.8-10 shows a one-revolution result for both CM (GFEM) and LM, for a cone
whose base diameter is $8\Delta x$ ($= 8\Delta y$); see Figures 2.6-21 and 2.6-27 for 'analogous' 1D
versions. The BC's specify T = 0 at the four inlet portions of the domain (one half of each
boundary) and, of course, no BC at the outflow portions. Time integration was via TR
with a very tight $\varepsilon$, so that all error is spatial. Noteworthy is the accurate GFEM solution with
little error in either phase or group velocity and no artificial dissipation, even though the
advective form ($\beta = 0$) was used.
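For reference when judging such results, the exact solution of the rotating cone problem is simply the IC carried around by the solid-body rotation. The sketch below is a minimal illustration of that statement; the cone centre, radius, height, and the value of $\Omega$ are hypothetical and do not reproduce the mesh or cone of Figure 2.8-10.

```python
import math

OMEGA = 1.0  # rotation rate in rad/sec (illustrative value)

def cone(x, y, xc=0.5, yc=0.0, radius=0.25):
    """Cone ('hat function') of unit height; centre and radius are hypothetical."""
    r = math.hypot(x - xc, y - yc)
    return max(0.0, 1.0 - r / radius)

def exact(x, y, t):
    """Exact solution of pure advection with u = -OMEGA*y, v = OMEGA*x.

    The value at (x, y, t) is the IC at the departure point, found by
    rotating (x, y) backwards through the angle OMEGA*t.
    """
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    x0 = c * x + s * y
    y0 = -s * x + c * y
    return cone(x0, y0)

period = 2.0 * math.pi / OMEGA
print(exact(0.5, 0.0, 0.0), exact(0.0, 0.5, period / 4), exact(0.5, 0.0, period))
```

After one revolution the exact solution coincides with the IC, so any difference seen in a computed result is numerical error.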
For further discussion of these results, in which they are compared with eight- and nine-node
GFEM as well as with the disastrous lumping of the serendipity element discussed
in Section 2.3.4, as well as a comparison with some very good finite difference schemes
(Arakawa's second- and fourth-order methods), as well as some spectral comparisons,
see Gresho et al. (1978). [Arakawa's famous FDM's are presented in Arakawa (1966),
and the equivalence of his second-order method to lumped mass bilinear FEM is shown
in Jespersen (1974). Finally, in Gresho et al. (1978), it is shown that GFEM on bilinears
is more accurate than, and not equivalent to, Arakawa's fourth-order FDM, consistent
with our previous analyses of phase and group velocities.]
Fig. 2.8-10 Pure advection in 2D; bilinear FEM approximation. Contours are for 0.8, 0.6, 0.4,
0.2 (dotted for exact solution, solid for approximate solution) and -0.2 (dashed)
for lumped mass results. Lowest curves are for consistent mass, 1 revolution;
rightmost curves are for lumped mass at 1/2 revolution; leftmost at 1 revolution.
We conclude with a comment on reduced quadrature: if one-point quadrature is used
on the above problem (giving a 2D 'Box' scheme), then large and fast-moving high-
frequency noise emanates ahead of the cone—a result that is probably even worse than
that from mass lumping, and is consistent with the analysis presented in Section 2.6.3a.
The reader interested in other test problems for AD is advised to peruse Baptista et al.
(1995).
The Navier-Stokes Equations
3.1 NOTATIONAL INTRODUCTION
Because of the varied background and experience of the readers, and because there are
several ways to generate weak forms of vector equations, we shall—for the first, and
simplest case—show the details of several equivalent formulations. To do so, it may
first be a good idea to carefully explain our notation and conventions with respect to
vectors, tensors, dyads, and various 'gradients.' We prefer the invariant bold-face vector
notation of Gibbs, and we will use it more often than the index notation (with summation
convention on repeated indices) that is preferred by many. For clarity and reduction of
ambiguity, we present in Tables (3.1-1) and (3.1-2) a summary of the important 'vector
operations' that will be used in the sequel. Vectors are lower case (bold) and second-rank
tensors are upper case (bold). With respect to dot products, our convention is to start at
the dot (or dots in the case of, e.g., A:B), look left and right, then dot the first thing you
see; repeat as necessary. Examples:
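1. $\mathbf{A} : \mathbf{B} = \mathbf{e}_i\mathbf{e}_j A_{ij} : \mathbf{e}_k\mathbf{e}_l B_{kl} = \mathbf{e}_i A_{ij}(\mathbf{e}_j\cdot\mathbf{e}_k)\cdot\mathbf{e}_l B_{kl} = \mathbf{e}_i A_{ij}\delta_{jk}\cdot\mathbf{e}_l B_{kl} = A_{ij}(\mathbf{e}_i\cdot\mathbf{e}_l)B_{jl} = A_{ij}B_{ji}$,
where $\delta_{ij}$ is the Kronecker delta and the $\mathbf{e}_i$'s are cartesian base vectors so that,
for example,
$$\mathbf{u} = \mathbf{e}_i u_i = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} u_1 + \cdots$$
2. $$\mathbf{n}\cdot[(\nabla\mathbf{u})\cdot\mathbf{v}] = \mathbf{e}_i n_i\cdot\left[\mathbf{e}_j\frac{\partial}{\partial x_j}(\mathbf{e}_k u_k)\cdot\mathbf{e}_l v_l\right] = \mathbf{e}_i n_i\cdot\left[\mathbf{e}_j\frac{\partial u_k}{\partial x_j}v_k\right] = n_j\frac{\partial u_k}{\partial x_j}v_k = n_j u_{k,j} v_k = \frac{\partial\mathbf{u}}{\partial n}\cdot\mathbf{v} = \mathbf{v}\cdot\frac{\partial\mathbf{u}}{\partial n}.$$
3. $$\mathbf{n}\cdot[(\nabla\mathbf{u})^T\cdot\mathbf{v}] = \mathbf{e}_i n_i\cdot\left[\mathbf{e}_j\mathbf{e}_k\frac{\partial u_j}{\partial x_k}\cdot\mathbf{e}_l v_l\right] = n_i(\mathbf{e}_i\cdot\mathbf{e}_j)\frac{\partial u_j}{\partial x_k}(\mathbf{e}_k\cdot\mathbf{e}_l)v_l = n_j v_k\frac{\partial u_j}{\partial x_k}$$
$$= \mathbf{n}\cdot[\mathbf{v}\cdot(\nabla\mathbf{u})] = \mathbf{n}\cdot[(\mathbf{v}\cdot\nabla)\mathbf{u}] = \mathbf{v}\cdot\nabla(\mathbf{n}\cdot\mathbf{u}) = \mathbf{v}\cdot\nabla u_n.$$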
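The 'dot the first thing you see' convention is easy to get wrong in code, so a small sketch may help. It assumes tensors are stored as nested lists with `G[i][j]` holding $\partial u_j/\partial x_i$ (the components of $\nabla\mathbf{u}$ in the convention above); the function names and sample numbers are hypothetical.

```python
def ddot(A, B):
    """A : B = A_ij B_ji (the inner pair of base vectors is dotted first)."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

def transpose(A):
    return [list(row) for row in zip(*A)]

def n_gradu_v(n, G, v):
    """n . [(grad u) . v] = n_j u_{k,j} v_k, with G[j][k] = d u_k / d x_j."""
    return sum(n[j] * G[j][k] * v[k] for j in range(len(n)) for k in range(len(v)))

def n_graduT_v(n, G, v):
    """n . [(grad u)^T . v] = n_j v_k u_{j,k}."""
    return sum(n[j] * G[k][j] * v[k] for j in range(len(n)) for k in range(len(v)))

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
print(ddot(A, B), ddot(A, transpose(B)))
```

Note that with this convention $\mathbf{A}:\mathbf{B}$ is the trace of the matrix product, whereas $\mathbf{A}:\mathbf{B}^T$ is the familiar sum of products of like components.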
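Table 3.1-1 Vector conventions.
Invariant (Gibbs) notation | Cartesian base-vector notation | Index notation
$\mathbf{u}$ | $\mathbf{e}_i u_i$ | $u_i$
$\mathbf{A}$ | $\mathbf{e}_i\mathbf{e}_j A_{ij}$ | $A_{ij}$
$\mathbf{A}^T$ | $\mathbf{e}_i\mathbf{e}_j A_{ji}$ | $A_{ji}$
$\mathbf{u}\cdot\mathbf{v}\ \{= \mathbf{v}\cdot\mathbf{u}\}$ | $\mathbf{e}_i u_i\cdot\mathbf{e}_j v_j = \delta_{ij}u_i v_j = u_i v_i$ | $u_i v_i$
$\mathbf{u}\times\mathbf{v}\ \{= -\mathbf{v}\times\mathbf{u}\}$ | $\mathbf{e}_i u_i\times\mathbf{e}_j v_j = \mathbf{e}_k\varepsilon_{ijk}u_i v_j$ (1) | $\varepsilon_{ijk}u_i v_j$
$\mathbf{A}\cdot\mathbf{u}\ \{= \mathbf{u}\cdot\mathbf{A}^T\}$ | $\mathbf{e}_i\mathbf{e}_j A_{ij}\cdot\mathbf{e}_k u_k = \mathbf{e}_i A_{ij}u_j$ | $A_{ij}u_j$
$\mathbf{A}^T\cdot\mathbf{u}\ \{= \mathbf{u}\cdot\mathbf{A}\}$ | $\mathbf{e}_i\mathbf{e}_j A_{ji}\cdot\mathbf{e}_k u_k = \mathbf{e}_i A_{ji}u_j$ | $A_{ji}u_j$
$\mathbf{A}\cdot\mathbf{B}\ \{\neq \mathbf{B}\cdot\mathbf{A}\}$ | $\mathbf{e}_i\mathbf{e}_j A_{ij}\cdot\mathbf{e}_k\mathbf{e}_l B_{kl} = \mathbf{e}_i\mathbf{e}_l A_{ij}B_{jl}$ | $A_{ik}B_{kj}$
$\mathbf{A}:\mathbf{B}\ \{= \mathbf{B}:\mathbf{A}\}$ | $\mathbf{e}_i\mathbf{e}_j A_{ij}:\mathbf{e}_k\mathbf{e}_l B_{kl} = A_{ij}B_{ji}$ | $A_{ij}B_{ji}$
$\mathbf{A}:\mathbf{B}^T\ \{= \mathbf{B}^T:\mathbf{A}\}$ | $\mathbf{e}_i\mathbf{e}_j A_{ij}:\mathbf{e}_k\mathbf{e}_l B_{lk} = A_{ij}B_{ij}$ | $A_{ij}B_{ij}$
$\mathbf{u}\cdot(\mathbf{A}\cdot\mathbf{v}) = (\mathbf{A}\cdot\mathbf{v})\cdot\mathbf{u} = (\mathbf{u}\cdot\mathbf{A})\cdot\mathbf{v} = \mathbf{u}\cdot\mathbf{A}\cdot\mathbf{v}$ | $\mathbf{e}_i u_i\cdot\mathbf{e}_j\mathbf{e}_k A_{jk}\cdot\mathbf{e}_l v_l = u_i A_{ij}v_j$ | $u_i A_{ij}v_j$
$\mathbf{v}\cdot(\mathbf{A}\cdot\mathbf{u}) = \mathbf{v}\cdot\mathbf{A}\cdot\mathbf{u}$ (2) | $v_j A_{jk}u_k\ \{= u_i A_{ki}v_k\}$ | $v_i A_{ij}u_j$
$\mathbf{u}\mathbf{v}$ | $\mathbf{e}_i\mathbf{e}_j u_i v_j$ | $u_i v_j$
$\mathbf{A}\cdot\mathbf{u}\mathbf{v}\ \{= (\mathbf{A}\cdot\mathbf{u})\mathbf{v}\}$ | $\mathbf{e}_i\mathbf{e}_j A_{ij}\cdot\mathbf{e}_k\mathbf{e}_l u_k v_l = \mathbf{e}_i\mathbf{e}_l A_{ij}u_j v_l$ | $A_{ij}u_j v_k$
$\mathbf{A}:\mathbf{u}\mathbf{v}\ \{= (\mathbf{A}\cdot\mathbf{u})\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{A}\cdot\mathbf{u}\}$ | $\mathbf{e}_i\mathbf{e}_j A_{ij}:\mathbf{e}_k\mathbf{e}_l u_k v_l = (\mathbf{e}_i\mathbf{e}_j A_{ij}\cdot\mathbf{e}_k u_k)\cdot\mathbf{e}_l v_l = A_{ij}u_j v_i$ | $A_{ij}u_j v_i$
$\mathbf{u}\mathbf{v}:\mathbf{w}\mathbf{z} = (\mathbf{v}\cdot\mathbf{w})(\mathbf{u}\cdot\mathbf{z})$ | $\mathbf{e}_i\mathbf{e}_j u_i v_j:\mathbf{e}_k\mathbf{e}_l w_k z_l = u_i v_j w_j z_i$ | $u_i z_i v_j w_j$
(1) $\varepsilon_{ijk}$ = alternating tensor: $\varepsilon_{ijk} = 0$ unless i, j, k are all different, in which case $\varepsilon_{ijk} = 1$ or $-1$ according
as i, j, k are or are not in cyclic order.
(2) If $\mathbf{A}^T = \mathbf{A}$, then $\mathbf{u}\cdot\mathbf{A}\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{A}\cdot\mathbf{u}$.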
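Table 3.1-2 Derivative conventions.
Invariant (Gibbs) notation | Cartesian base-vector notation | Index notation
$\nabla(\cdot)$ | $\mathbf{e}_i\,\partial(\cdot)/\partial x_i$ | $(\cdot)_{,i}$
$\nabla\phi$ | $\mathbf{e}_i\,\partial\phi/\partial x_i$ | $\phi_{,i}$
$\nabla\cdot\mathbf{u}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j u_j)/\partial x_i = \partial u_i/\partial x_i$ | $u_{i,i}$
$\nabla\times\mathbf{u}$ | $\mathbf{e}_i\times\partial(\mathbf{e}_j u_j)/\partial x_i = \mathbf{e}_i\times\mathbf{e}_j\,\partial u_j/\partial x_i = \mathbf{e}_k\varepsilon_{ijk}\,\partial u_j/\partial x_i$ | $\varepsilon_{ijk}u_{j,i}$
$\nabla\mathbf{u}$ | $\mathbf{e}_i\,\partial(\mathbf{e}_j u_j)/\partial x_i = \mathbf{e}_i\mathbf{e}_j\,\partial u_j/\partial x_i$ | $u_{j,i}$
$(\nabla\mathbf{u})^T$ | $\mathbf{e}_i\mathbf{e}_j\,\partial u_i/\partial x_j$ | $u_{i,j}$
$\mathbf{u}\cdot\nabla(\cdot)$ | $\mathbf{e}_i u_i\cdot\mathbf{e}_j\,\partial(\cdot)/\partial x_j = u_i\,\partial(\cdot)/\partial x_i$ | $u_i(\cdot)_{,i}$
$\mathbf{u}\cdot\nabla\phi$ | $u_i\,\partial\phi/\partial x_i$ | $u_i\phi_{,i}$
$\mathbf{u}\cdot\nabla\mathbf{v}\ \{= (\mathbf{u}\cdot\nabla)\mathbf{v} = (\nabla\mathbf{v})^T\cdot\mathbf{u}\}$ | $\mathbf{e}_j u_i\,\partial v_j/\partial x_i$ | $u_i v_{j,i}$
$\mathbf{u}\cdot(\nabla\mathbf{v})^T\ \{= (\nabla\mathbf{v})\cdot\mathbf{u}\}$ | $\mathbf{e}_i u_i\cdot\partial(\mathbf{e}_j\mathbf{e}_k v_j)/\partial x_k = \mathbf{e}_k u_i\,\partial v_i/\partial x_k$ | $u_j v_{j,i}$
$\mathbf{u}\cdot(\nabla\mathbf{u})^T = (\nabla\mathbf{u})\cdot\mathbf{u} = \tfrac{1}{2}\nabla(\mathbf{u}\cdot\mathbf{u})$ | $\tfrac{1}{2}\mathbf{e}_i\,\partial(\mathbf{e}_j u_j\cdot\mathbf{e}_k u_k)/\partial x_i = \mathbf{e}_i u_j\,\partial u_j/\partial x_i$ | $u_j u_{j,i}$
$\nabla\cdot\mathbf{A}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j\mathbf{e}_k A_{jk})/\partial x_i = \mathbf{e}_k\,\partial A_{ik}/\partial x_i$ | $A_{ij,i}$
$\nabla\cdot\mathbf{A}^T$ | $\mathbf{e}_k\,\partial A_{kj}/\partial x_j$ | $A_{ij,j}$
$\mathbf{A}:\nabla\mathbf{u} = \nabla\mathbf{u}:\mathbf{A} = \mathbf{A}^T:(\nabla\mathbf{u})^T = (\nabla\mathbf{u})^T:\mathbf{A}^T$ | $\mathbf{e}_i\mathbf{e}_j A_{ij}:\mathbf{e}_k\mathbf{e}_l\,\partial u_l/\partial x_k = A_{ij}\,\partial u_i/\partial x_j$ | $A_{ij}u_{i,j}$
$\mathbf{A}^T:\nabla\mathbf{u} = \mathbf{A}:(\nabla\mathbf{u})^T = \nabla\mathbf{u}:\mathbf{A}^T = (\nabla\mathbf{u})^T:\mathbf{A}$ | $A_{ij}\,\partial u_j/\partial x_i$ | $A_{ij}u_{j,i}$
$\nabla\cdot(\mathbf{u}\mathbf{v})\ \{= (\nabla\cdot\mathbf{u})\mathbf{v} + \mathbf{u}\cdot\nabla\mathbf{v}\}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j\mathbf{e}_k u_j v_k)/\partial x_i = \mathbf{e}_k\,\partial(u_i v_k)/\partial x_i$ | $(u_i v_k)_{,i}$
$\nabla\mathbf{u}:\nabla\mathbf{v}\ \{= \nabla\mathbf{v}:\nabla\mathbf{u}\}$ | $\mathbf{e}_i\mathbf{e}_j\,\partial u_j/\partial x_i:\mathbf{e}_k\mathbf{e}_l\,\partial v_l/\partial x_k = (\partial u_j/\partial x_i)(\partial v_i/\partial x_j)$ | $u_{j,i}v_{i,j}$
$\nabla\mathbf{u}:(\nabla\mathbf{v})^T = \nabla\mathbf{v}:(\nabla\mathbf{u})^T = (\nabla\mathbf{u})^T:\nabla\mathbf{v} = (\nabla\mathbf{v})^T:\nabla\mathbf{u} = \nabla u_k\cdot\nabla v_k$ (sum on k) | $(\partial u_j/\partial x_i)(\partial v_j/\partial x_i)$ | $u_{j,i}v_{j,i}$
$\nabla\cdot(\nabla\phi)\ \{= \nabla^2\phi\}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j\,\partial\phi/\partial x_j)/\partial x_i = \partial^2\phi/\partial x_i^2$ | $\phi_{,ii}$
$\nabla(\nabla\cdot\mathbf{u})$ | $\mathbf{e}_i\,\partial[\mathbf{e}_j\cdot\partial(\mathbf{e}_k u_k)/\partial x_j]/\partial x_i = \mathbf{e}_i\,\partial^2 u_j/\partial x_i\partial x_j$ | $u_{j,ji}$
$\nabla\cdot(\nabla\mathbf{u})\ \{= \nabla^2\mathbf{u}\}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j\mathbf{e}_k\,\partial u_k/\partial x_j)/\partial x_i = \mathbf{e}_k\,\partial^2 u_k/\partial x_i^2$ | $u_{i,jj}$
$\nabla\cdot(\nabla\mathbf{u})^T\ \{= \nabla(\nabla\cdot\mathbf{u})\}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j\mathbf{e}_k\,\partial u_j/\partial x_k)/\partial x_i = \mathbf{e}_k\,\partial^2 u_i/\partial x_k\partial x_i$ | $u_{j,ji}$
$\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{v})\ \{= \mathbf{u}\cdot\nabla(\nabla\cdot\mathbf{v}) + \nabla\mathbf{u}:\nabla\mathbf{v}\}$ | $\mathbf{e}_i\cdot\partial[\mathbf{e}_j u_j\cdot\mathbf{e}_k\,\partial(\mathbf{e}_l v_l)/\partial x_k]/\partial x_i = u_i\,\partial^2 v_k/\partial x_k\partial x_i + (\partial u_j/\partial x_i)(\partial v_i/\partial x_j)$ | $u_i v_{j,ji} + u_{j,i}v_{i,j}$
$\nabla\cdot[\mathbf{u}\cdot(\nabla\mathbf{v})^T]\ \{= \mathbf{u}\cdot\nabla^2\mathbf{v} + \nabla\mathbf{u}:(\nabla\mathbf{v})^T\}$ | $\mathbf{e}_i\cdot\partial[\mathbf{e}_j u_j\cdot(\mathbf{e}_k\mathbf{e}_l\,\partial v_k/\partial x_l)]/\partial x_i = u_i\,\partial^2 v_i/\partial x_j^2 + (\partial u_j/\partial x_i)(\partial v_j/\partial x_i)$ | $u_i v_{i,jj} + u_{i,j}v_{i,j}$
$\nabla\cdot(\mathbf{u}\cdot\mathbf{A})\ \{= \mathbf{u}\cdot(\nabla\cdot\mathbf{A}^T) + \mathbf{A}:\nabla\mathbf{u}\}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j u_j\cdot\mathbf{e}_k\mathbf{e}_l A_{kl})/\partial x_i = u_i\,\partial A_{ij}/\partial x_j + A_{ij}\,\partial u_i/\partial x_j$ | $u_i A_{ij,j} + A_{ij}u_{i,j}$
$\nabla\cdot(\mathbf{A}\cdot\mathbf{u})\ \{= \mathbf{u}\cdot(\nabla\cdot\mathbf{A}) + \mathbf{A}:(\nabla\mathbf{u})^T\}$ | $\mathbf{e}_i\cdot\partial(A_{jk}\mathbf{e}_j\mathbf{e}_k\cdot u_l\mathbf{e}_l)/\partial x_i = u_j\,\partial A_{ij}/\partial x_i + A_{ij}\,\partial u_j/\partial x_i$ | $u_j A_{ij,i} + A_{ij}u_{j,i}$
$\mathbf{n}\cdot[(\nabla\mathbf{u})\cdot\mathbf{v}]\ \{= \mathbf{n}\cdot[\mathbf{v}\cdot(\nabla\mathbf{u})^T] = \mathbf{v}\cdot[(\nabla\mathbf{u})^T\cdot\mathbf{n}] = \mathbf{v}\cdot(\mathbf{n}\cdot\nabla\mathbf{u}) = \mathbf{v}\cdot\partial\mathbf{u}/\partial n\}$ | $\mathbf{e}_i n_i\cdot[\mathbf{e}_j\,\partial(\mathbf{e}_k u_k)/\partial x_j\cdot\mathbf{e}_l v_l] = v_k n_j\,\partial u_k/\partial x_j$ | $n_i v_j u_{j,i}$
$\mathbf{n}\cdot[(\nabla\mathbf{u})^T\cdot\mathbf{v}]\ \{= \mathbf{n}\cdot[\mathbf{v}\cdot(\nabla\mathbf{u})] = \mathbf{n}\cdot(\mathbf{v}\cdot\nabla\mathbf{u}) = \mathbf{v}\cdot\nabla(\mathbf{n}\cdot\mathbf{u}) = \mathbf{v}\cdot[(\nabla\mathbf{u})\cdot\mathbf{n}]\}$ | $\mathbf{e}_i n_i\cdot[\mathbf{e}_j\,\partial(\mathbf{e}_k u_j)/\partial x_k\cdot\mathbf{e}_l v_l] = n_j v_k\,\partial u_j/\partial x_k$ | $n_i v_j u_{i,j}$
$\mathbf{u}\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = \tfrac{1}{2}[\nabla\cdot(q^2\mathbf{u}) - q^2\nabla\cdot\mathbf{u}]$, where $q^2 = \mathbf{u}\cdot\mathbf{u}$ | $\tfrac{1}{2}[\mathbf{e}_i\cdot\partial(q^2\mathbf{e}_j u_j)/\partial x_i - q^2\,\mathbf{e}_i\cdot\partial(\mathbf{e}_j u_j)/\partial x_i]$ | $u_i u_j u_{j,i}$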
Note that, except for the invariant notation, our notation and definitions are restricted
to non-curved boundaries, a convenient restriction that we shall remove when necessary
(e.g., in the treatment of free-surface fluid mechanics).
If, in the sequel, certain derivations/manipulations do not quite seem to
be transparent/obvious, the reader may often refer to the above results for
assistance.
3.2 THE CONTINUUM EQUATIONS (THE PDE'S)
As mentioned in the previous chapter, the Navier-Stokes (NS) equations, which are
the governing PDE's for the motion of many fluids, are somewhat similar to the
advection-diffusion (AD) equation (with the Reynolds number, Re, replacing the Peclet
number, Pe), but are much more complicated and much more difficult to solve. The
reasons for the extra difficulty are related to the following items: (i) they comprise a vector
equation in which the associated scalar equations (one for each direction) are intricately
coupled to each other; (ii) the equation is inherently non-linear in the advection terms;
and (iii) the constraint equation among the velocity components, and the associated pressure
field (div-grad coupling), are not present in advection-diffusion. These additional features
cause numerical (and theoretical) solutions to be much more difficult to obtain, and the
search for 'optimum elements' (and better numerical methods in general, not just FEM)
is a continuing one—as is the search for a global existence theorem.
For the NS equations, even the simple diffusion-dominated limit of so-called Stokes
flow (Re -> 0; analogous to Pe -> 0 in AD) is sometimes very complex even though the
equations are still elliptic—for steady flows—and parabolic (with still an elliptic part) for
time-dependent ones; examples are: (i) reverse flows (separated flows
containing recirculating eddies) occur, and (ii) very large gradients in pressure and velocity are
often generated near sharp 'corners' (singularities). The other limit, advection-dominated
flows ($Re \gg 1$), is especially difficult, as the principal 'actors' in the momentum equation
are then advection and the pressure gradient, with diffusion being nearly negligible except
near solid boundaries or internal shear layers. The final difficulty of the NS equations
worthy of mention in this introduction is their inherent instability: at a sufficiently large
Re, any previously stable laminar flow will undergo 'transition' to turbulence or near-
turbulence (semi-chaotic laminar flow), another consequence of the inherent non-linearity.
Further increases of Re will ultimately lead to fully developed turbulence, in which the
fluid behavior tends to become stochastic (as opposed to deterministic) in nature, with an
extremely wide range of 'characteristic eddy sizes' and concomitant time scales which
essentially defy detailed numerical simulations.
In this chapter we address the more modest goals of solving the NS equations for
laminar flows only. (A chapter in Volume II will address turbulence and methods of
modeling it.) In the remainder of this chapter we will set the stage for FEM approximations
to the NS equations and present some effective solution methods.
There are several ways to express the partial differential equations of motion
(conservation of momentum) and continuity (conservation of mass) for a constant property
Newtonian fluid, in which shear stress is proportional to the rate of strain. The fundamental
formulation is often referred to as the 'primitive variables' equations, which we emphasize
herein—partly owing to space limitations, partly owing to our desire to use a common
approach for 2D and 3D, and partly because of our predisposition and experience.
The NS equations, and the associated (mass) continuity equation, written in terms of
the primitive variables, velocity and pressure, are generally and most efficiently/compactly
expressed as
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) \equiv \rho\frac{D\mathbf{u}}{Dt} = -\nabla P + \mu\nabla^2\mathbf{u}, \qquad (3.2\text{-}1)$$
and
$$\nabla\cdot\mathbf{u} = 0, \qquad (3.2\text{-}2)$$
where $\mathbf{u}$ is the velocity (with cartesian components u, v in 2D and u, v, w in 3D), P is the
pressure deviation from hydrostatic [hence, no gravitational body force term in (3.2-1)], and
$\rho$ and $\mu$ are the fluid density and viscosity, respectively (assumed henceforth to be constant).
Remarks:
(1) Why is (3.2-2) referred to as the 'continuity equation' ? After noting/recalling that its
more general form is $\partial\rho/\partial t + \nabla\cdot(\rho\mathbf{u}) = 0$, we quote first from Batchelor (1969), 'It
has been called the 'equation of continuity' for many years, although not for any evident
good reason.', and then from Panton (1984), 'The equation derived in this section has
been called the continuity equation to emphasize that the continuum assumptions
(the assumption that density and velocity may be defined at every point in space) are
a prerequisite.' But he goes on to say, properly, 'The continuum assumption is, of
course, a foundation for all the basic laws.' Thus, we side with Batchelor.
(2) The implications of the 'simplification' of the mass conservation equation are
actually quite profound—both 'theoretically' (PDE theory, functional analysis) and
numerically; indeed, the simplification comes at a higher cost than many might
imagine, before actually 'diving in.' In many ways, the incompressible NS equations
are more difficult to 'solve' than their compressible progenitors—especially for
'neophytes' who often believe otherwise. For more discussion on the omnipotence
of $\nabla\cdot\mathbf{u} = 0$ and its effects, see Gresho (1991a,b, 1992).
(3) While not intended to be obvious, (3.2-2) is/can be interpreted as 'the equation for
pressure'—and we shall do so, many times in fact.
Except for the essential (and quadratic) non-linearity of the advection term, $\mathbf{u}\cdot\nabla\mathbf{u}$,
and the (implicit) presence of the velocity-pressure coupling through $\nabla\cdot\mathbf{u} = 0$, the NS
equations share many of the features of the AD equation of the previous chapter. While
the additional term, $\nabla P$, appears to be a simple body force (acceleration) in Newton's
second law, it is actually that plus quite a bit more; P also plays the role of a Lagrange
multiplier for the incompressibility constraint by 'adjusting itself,' instantaneously in time
(related to the infinite speed of 'sound' in an incompressible medium) and everywhere
in space so that $\nabla\cdot\mathbf{u} = 0$ everywhere (including the boundary) and for all time. More
details will be given later and additional discussion of P is presented at the very end of
this chapter.
Actually, it is the combination of nonlinearity and the pressure-velocity coupling
that makes the NS equations difficult (if not impossible, in general) to solve. If either is
absent, the equations are much simpler and are known to have solutions (Galdi 1996)—the
limiting cases being Stokes flow (see Section 3.7.1) and the so-called Burgers equation
$\rho(\partial\mathbf{u}/\partial t + \mathbf{u}\cdot\nabla\mathbf{u}) = \mu\nabla^2\mathbf{u}$, respectively. [Another interesting item from these 'Short
course' notes of Galdi (1996) that distinguishes the two extremes of fluid mechanics is
this: 'Fluid Dynamicists are divided into Hydraulic Engineers who observe what cannot
be explained and mathematicians who explain things that cannot be observed'—which he
attributes to Sir Cyril Hinshelwood.]
Similar to the AD equation, there are two useful non-dimensional forms for NS,
depending on how time and pressure are measured, and we emphasize immediately that
pressure must always be measured in units such that the resulting dimensionless equation
can never cause $\nabla P$ to vanish in any of the allowable limiting cases. (If it did, so too
would the constraint $\nabla\cdot\mathbf{u} = 0$.) Introducing $u_0$ as the characteristic velocity, L as the
characteristic length, $\tau$ as the characteristic time, and $P_c$ as the characteristic pressure,
together with the Reynolds number,
$$Re = \rho u_0 L/\mu, \qquad (3.2\text{-}3)$$
the ratio of advective to diffusive momentum transport, leads to
$$\text{(i)}\quad \frac{\partial\mathbf{u}}{\partial t} + Re\,\mathbf{u}\cdot\nabla\mathbf{u} = -\nabla P + \nabla^2\mathbf{u}, \qquad (3.2\text{-}4)$$
for the case where $\tau = \rho L^2/\mu$ (wherein $P_c = \mu u_0/L$) and
$$\text{(ii)}\quad \frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} = -\nabla P + \frac{1}{Re}\nabla^2\mathbf{u}, \qquad (3.2\text{-}5)$$
when $\tau = L/u_0$ (wherein $P_c = \rho u_0^2$). As for AD, (3.2-4) is more appropriate in the low
Re regime ($Re \to 0$ via $u_0 \to 0$) and (3.2-5) is better when dealing with the advection-dominated,
large Re regime (usually 'realized' as $Re \to \infty$ via $\mu \to 0$). For Stokes flow,
Re = 0 in (3.2-4).
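The two scalings can be compared with a short calculation. The sketch below assumes the viscous pressure scale is $\mu u_0/L$ and the advective one is $\rho u_0^2$; the function names and the sample fluid properties (roughly those of water) are illustrative assumptions.

```python
def reynolds(rho, u0, length, mu):
    """Re = rho*u0*L/mu, as in (3.2-3)."""
    return rho * u0 * length / mu

def viscous_scales(rho, u0, length, mu):
    """Time and pressure scales that lead to form (i), suited to low Re."""
    return {"time": rho * length**2 / mu, "pressure": mu * u0 / length}

def advective_scales(rho, u0, length, mu):
    """Time and pressure scales that lead to form (ii), suited to large Re."""
    return {"time": length / u0, "pressure": rho * u0**2}

# Illustrative values: water-like fluid, 1 cm/s, 10 cm length scale
rho, u0, length, mu = 1000.0, 0.01, 0.1, 1.0e-3
re = reynolds(rho, u0, length, mu)
visc = viscous_scales(rho, u0, length, mu)
adv = advective_scales(rho, u0, length, mu)
print(re, visc["time"] / adv["time"], adv["pressure"] / visc["pressure"])
```

Both printed ratios equal Re: the viscous time scale is Re times the advective one, and the advective pressure scale is Re times the viscous one, which is why each form is natural in only one regime.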
Remarks:
(1) In the $Re \to 0$ limit, it may seem awkward that the characteristic pressure 'vanishes'
with $u_0$, and it is. The best 'rationalization' here seems to be that $u_0 \to 0$ should
not be taken all the way to the limit; instead, one argues that the advection term,
varying like $u_0^2$, becomes negligibly small and may be safely neglected when $Re \ll 1$.
(This enigma is probably related to the so-called Stokes paradox; see, for example,
Panton, 1984.)
(2) Most of the 'general characteristics' of AD problems, discussed in Section 2.1.2,
regarding degree of difficulty of a simulation, carry over to the NS equations, except
the last (fifth): pure advection. In the 'hydrodynamics' limit of pure advection, we
have the inviscid ($\mu = 0$) incompressible Euler equations, whose solution (numerical,
at least) is often more than difficult; it is (we believe) virtually impossible
in the general case. For the steady case, these slippery equations admit too many
'solutions.'
Many CFD codes and publications on the numerical solution of the NS equations were
written with less than full understanding of the goal, with the result that many results
fell short of the mark in one way or another (or in many ways). What is the goal? The
first goal in the numerical solution of PDE's in general and CFD in particular (with
incompressible NS equations being more 'particular' yet) is to understand as much as
possible about the PDE's, and their solutions, whose approximate solution is sought. And
these depend, in part, on the answers to the following questions: What are the proper (and
improper) BC's and IC's?; When is the problem well-posed?; and, When ill-posed?
Our approach is to answer many of these questions regarding the first goal before
launching into the task of generating approximate numerical solutions.
3.3 ALTERNATE FORMS OF THE VISCOUS TERM
There are many equivalent ways to represent both the viscous terms and the advection
terms in the NS equations, many of which are based on the omnipotent constraint equation,
$\nabla\cdot\mathbf{u} = 0$. These alternate representations, all of which are equivalent in the continuum,
lead to semi-discrete (continuous time, discrete space) equations that are generally not
equivalent, and which sometimes offer advantages over the simplest (conventional) form
of the momentum equation presented above. In this section we focus on the viscous terms,
and in the next on the advection terms.
3.3.1 Stress-Divergence Form
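This form is considered fundamental by many, especially those who came to FEM in
fluid mechanics with a knowledge of FEM in solid mechanics. Indeed, the form presented
in (3.2-1) is actually derived from the stress-divergence form, which is
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) = \nabla\cdot\boldsymbol{\sigma} = \nabla\cdot(\boldsymbol{\tau} - P\mathbf{I}), \qquad (3.3\text{-}1)$$
where $\boldsymbol{\tau} = \mu[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]$ or, equivalently,
$$\tau_{ij} = \mu\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right) = \mu(u_{i,j} + u_{j,i}) = 2\mu\varepsilon_{ij}, \qquad (3.3\text{-}2)$$
and $\mathbf{I}$ is the identity tensor.
$\boldsymbol{\tau}$ is the viscous stress tensor (or deviatoric stress tensor), whose divergence is
$\mu[\nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u})]$, which is just $\mu\nabla^2\mathbf{u}$ when $\nabla\cdot\mathbf{u} = 0$, thus recovering (3.2-1), and indeed
justifying it, since only (3.3-1) is a true momentum balance equation. $\boldsymbol{\sigma}$ is the total stress
tensor, and $\boldsymbol{\varepsilon}$ is the strain rate tensor. It is worth repeating, however, that the discretized
versions of these two equations are usually not identical.
The reason that (3.3-1) is often preferred to the simpler form, (3.2-1), is related to
weak forms and natural boundary conditions, which we summarize now and derive later:
only the stress-divergence form leads to NBC's that represent true (physical) forces; and
they are (for non-curved boundaries)
$$\mathbf{n}\cdot\boldsymbol{\sigma} = \boldsymbol{\sigma}\cdot\mathbf{n} = \boldsymbol{\tau}\cdot\mathbf{n} - P\mathbf{n} = \mu[\mathbf{n}\cdot\nabla\mathbf{u} + \nabla(\mathbf{n}\cdot\mathbf{u})] - P\mathbf{n} = \mathbf{F}, \qquad (3.3\text{-}3)$$
where $\mathbf{F}$ is the applied force (traction) on/at the boundary. For curved boundaries, the
more general form of the viscous stress must be employed:
$\mu\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] = \mu[\mathbf{n}\cdot\nabla\mathbf{u} + (\nabla\mathbf{u})\cdot\mathbf{n}]$.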
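The step from the stress-divergence form back to the Laplacian form is brief in index notation. The following is a sketch of that step, writing $\boldsymbol{\tau}$ for the viscous stress tensor of (3.3-2), using the divergence convention of Table 3.1-2 (contraction on the first index), and assuming constant $\mu$:

```latex
\begin{aligned}
\left[\nabla\cdot\boldsymbol{\tau}\right]_j
  &= \frac{\partial\tau_{ij}}{\partial x_i}
   = \mu\,\frac{\partial}{\partial x_i}
     \left(\frac{\partial u_j}{\partial x_i} + \frac{\partial u_i}{\partial x_j}\right) \\
  &= \mu\left[\frac{\partial^2 u_j}{\partial x_i\,\partial x_i}
     + \frac{\partial}{\partial x_j}\left(\frac{\partial u_i}{\partial x_i}\right)\right]
   = \mu\left[\nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u})\right]_j .
\end{aligned}
```

The second term vanishes only because the continuum velocity is divergence-free; a discrete velocity field generally satisfies the constraint only approximately, which is one way to see why the two discretized forms differ.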
3.3.2 Div-Curl Form
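Using the vector identity $\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}$, (3.2-1) can be rewritten as
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) = -\nabla P + \mu[\nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}], \qquad (3.3\text{-}4)$$
where we note (again) that $\nabla\cdot\mathbf{u} = 0$ in the continuum and also that the vorticity has
entered the equation; i.e.,
$$\boldsymbol{\omega} = \nabla\times\mathbf{u} \qquad (3.3\text{-}5)$$
is the vorticity.
Again the reason for even considering (3.3-4) is related to NBC's associated with the
weak formulation, and will be further discussed later.
3.3.3 Curl Form
The last variation on the viscous theme is to indeed invoke $\nabla\cdot\mathbf{u} = 0$ in (3.3-4) to obtain
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) = -\nabla P - \mu\nabla\times\nabla\times\mathbf{u}, \qquad (3.3\text{-}6)$$
which leads to yet another NBC, slightly different from that of (3.3-4). Details follow
later.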
3.4 ALTERNATE FORMS OF THE NON-LINEAR TERM
Here we return to (3.2-1) but focus on the advection term, $\mathbf{u}\cdot\nabla\mathbf{u}$, and derive four alternate
forms, each of which displays certain advantages.
3.4.1 Divergence Form
From $\nabla\cdot(\mathbf{u}\mathbf{u}) = \mathbf{u}\cdot\nabla\mathbf{u} + \mathbf{u}(\nabla\cdot\mathbf{u})$ and (3.2-2), where $\mathbf{u}\mathbf{u}$ is the dyadic product (see
Table 3.1-1), (3.2-1) can be rewritten as
$$\rho\left[\frac{\partial\mathbf{u}}{\partial t} + \nabla\cdot(\mathbf{u}\mathbf{u})\right] = -\nabla P + \mu\nabla^2\mathbf{u}, \qquad (3.4\text{-}1)$$
which, like the analogous AD equation of Chapter 2, we call a divergence form (of
advection). The attributes associated with this form are probably slight, if present at all.
If, however, this divergence form is combined with the stress-divergence form, we
obtain the total divergence form,
$$\rho\frac{\partial\mathbf{u}}{\partial t} = \nabla\cdot(\boldsymbol{\sigma} - \rho\mathbf{u}\mathbf{u}), \qquad (3.4\text{-}2)$$
which offers two advantages when discretized via GFEM: (i) it leads easily to the proper
overall/global momentum balance (a global force balance), and (ii) a form of its NBC
can be useful if a BC of specified total momentum flux, $\mathbf{n}\cdot(\rho\mathbf{u}\mathbf{u} - \boldsymbol{\sigma})$, is desired or
appropriate at a boundary.
3.4.2 Rotational Form
Here we begin with the vector identity,
$$\mathbf{u}\cdot\nabla\mathbf{u} = \tfrac{1}{2}\nabla(\mathbf{u}\cdot\mathbf{u}) - \mathbf{u}\times\nabla\times\mathbf{u}, \qquad (3.4\text{-}3)$$
noting that $\tfrac{1}{2}\rho\,\mathbf{u}\cdot\mathbf{u} = \tfrac{1}{2}\rho q^2$ is the dynamic pressure, and insert the results into (3.3-6) to
obtain, using (3.3-5),
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \boldsymbol{\omega}\times\mathbf{u}\right) = -\nabla P_T - \mu\nabla\times\boldsymbol{\omega}, \qquad (3.4\text{-}4)$$
where $P_T = P + \rho q^2/2$ is the total (Bernoulli) pressure. We call (3.4-4) the rotational form
of the NS equations.
Some potentially useful aspects of this formulation, pointed out by Gresho (1991b),
are:
1. If the flow is irrotational ($\boldsymbol{\omega} = 0$) or nearly so, then even if the Reynolds number is large
and advection is 'dominant,' this formulation has subsumed most of the non-linearity into
the (linear) pressure term and leads to a better outflow BC; i.e., $P + \tfrac{1}{2}\rho q^2$ = constant.
Details later.
2. Even if the vorticity is large, it is sometimes (e.g., for coherent structures
in turbulent flow; see Frisch and Orszag, 1990) nearly aligned with the
velocity; i.e., it has close to the same direction as $\mathbf{u}$ so that $\boldsymbol{\omega}\times\mathbf{u}$, the remaining portion
of the advection term, is small.
3. Because the combined system of equations (3.2-2), (3.3-5), and (3.4-4) displays
no derivatives higher than first order, the door is opened to other methods, such as the
method of least squares. More on this later.
4. It seems to offer the possibility of generating a skew-symmetric advection matrix (see
below).
In closing this section, it is worth pointing out that the rotational form also displays a
disadvantage: more degrees of freedom to compute (i.e., velocity, pressure, and vorticity).
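The vector identity underlying the rotational form, $\mathbf{u}\cdot\nabla\mathbf{u} = \tfrac{1}{2}\nabla(\mathbf{u}\cdot\mathbf{u}) + \boldsymbol{\omega}\times\mathbf{u}$, can be spot-checked numerically. The sketch below uses an arbitrary smooth test field and central differences; the field, the sample point, and the function names are hypothetical, and `G[i][j]` stores $\partial u_j/\partial x_i$ as in Table 3.1-2.

```python
import math

def velocity(x, y, z):
    """Arbitrary smooth test field (hypothetical, not from the text)."""
    return (math.sin(y) + z * x, x * y * z, math.cos(x) - y * y)

def grad_u(p, h=1e-5):
    """G[i][j] = d u_j / d x_i by central differences."""
    G = []
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        up, um = velocity(*plus), velocity(*minus)
        G.append([(up[j] - um[j]) / (2.0 * h) for j in range(3)])
    return G

def advective(p):
    """(u . grad u)_j = u_i * d u_j / d x_i."""
    u, G = velocity(*p), grad_u(p)
    return [sum(u[i] * G[i][j] for i in range(3)) for j in range(3)]

def rotational(p):
    """grad(q^2 / 2) + omega x u, with omega = curl u."""
    u, G = velocity(*p), grad_u(p)
    grad_half_q2 = [sum(u[j] * G[i][j] for j in range(3)) for i in range(3)]
    w = (G[1][2] - G[2][1], G[2][0] - G[0][2], G[0][1] - G[1][0])
    w_cross_u = (w[1] * u[2] - w[2] * u[1],
                 w[2] * u[0] - w[0] * u[2],
                 w[0] * u[1] - w[1] * u[0])
    return [grad_half_q2[k] + w_cross_u[k] for k in range(3)]

point = (0.3, -0.7, 1.2)
print(advective(point), rotational(point))
```

The two results agree to round-off because the identity is purely algebraic in the components of $\nabla\mathbf{u}$; the two forms differ only once they are discretized differently.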
3.4.3 Skew-Symmetric Form
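The next 'trick' is to consider rewriting the advection term in (3.2-1) as a linear
combination (average) of the advective form and the divergence form (as already done for the
scalar transport equation in Chapter 2, Section 2.2.4),
$$\tfrac{1}{2}[\nabla\cdot(\mathbf{u}\mathbf{u}) + \mathbf{u}\cdot\nabla\mathbf{u}] = \mathbf{u}\cdot\nabla\mathbf{u} + \tfrac{1}{2}\mathbf{u}(\nabla\cdot\mathbf{u}),$$
which gives the following momentum equation:
$$\rho\left[\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \tfrac{1}{2}\mathbf{u}(\nabla\cdot\mathbf{u})\right] = -\nabla P + \mu\nabla^2\mathbf{u}. \qquad (3.4\text{-}5)$$
This form was introduced by Temam (1968; see also Temam, 1984) in order to be
able to prove that the numerical calculations would be stable. This is related, for NS as it
was for AD, to the fact that this form leads to a skew-symmetric advection matrix, which
guarantees quadratic conservation (of kinetic energy in this case). Again, we will present
more details later.
Another quadratically conserving form is that used by Heywood et al. (1996); it uses
the conventional (Laplacian) form for the viscous term but uses the vector identity
$\boldsymbol{\omega}\times\mathbf{u} = \mathbf{u}\cdot\nabla\mathbf{u} - (\nabla\mathbf{u})\cdot\mathbf{u} = \mathbf{u}\cdot\nabla\mathbf{u} - \mathbf{u}\cdot(\nabla\mathbf{u})^T$ in (3.4-4) to avoid needing a curl
operation. It is
$$\rho\left[\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} - (\nabla\mathbf{u})\cdot\mathbf{u}\right] = -\nabla P_T + \mu\nabla^2\mathbf{u}. \qquad (3.4\text{-}6)$$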
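The link between averaging the two forms and quadratic conservation can be seen in a very small discrete setting. The sketch below is only an analog, not the GFEM discretization discussed in the text: it assumes a 1D periodic grid with second-order central differences, and the sample data and function names are hypothetical.

```python
def advective_form(u, h):
    """u * du/dx with central differences on a periodic grid."""
    n = len(u)
    return [u[i] * (u[(i + 1) % n] - u[i - 1]) / (2.0 * h) for i in range(n)]

def divergence_form(u, h):
    """d(u*u)/dx with central differences on a periodic grid."""
    n = len(u)
    return [(u[(i + 1) % n] ** 2 - u[i - 1] ** 2) / (2.0 * h) for i in range(n)]

def skew_form(u, h):
    """Average of the advective and divergence forms."""
    return [0.5 * (a + d) for a, d in zip(advective_form(u, h), divergence_form(u, h))]

def energy_rate(u, adv):
    """sum_i u_i * N_i: the advection term's contribution to d/dt of sum(u^2)/2, up to sign."""
    return sum(ui * ni for ui, ni in zip(u, adv))

# Deliberately rough (non-smooth) sample data so that discrete errors are visible
u = [1.0, -0.5, 2.0, 0.3, -1.2, 0.7, 1.5, -0.4]
h = 1.0
print(energy_rate(u, advective_form(u, h)),
      energy_rate(u, divergence_form(u, h)),
      energy_rate(u, skew_form(u, h)))
```

The advective and divergence forms each create or destroy discrete 'energy' on their own, while their average contributes nothing (to round-off), which is the discrete statement of skew-symmetry.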
3.4.4 A Symmetric Form
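In a very interesting recent pair of papers (Gellert and Harbord, 1987, and Harbord and
Gellert, 1990), the construction of a symmetric advection operator was put forth. It begins
with the same vector identity used above,
$$\mathbf{u}\cdot\nabla\mathbf{u} = \mathbf{u}\cdot(\nabla\mathbf{u})^T + (\nabla\times\mathbf{u})\times\mathbf{u} = (\nabla\mathbf{u})\cdot\mathbf{u} + \boldsymbol{\omega}\times\mathbf{u}. \qquad (3.4\text{-}7)$$
Using again $\mathbf{u}\cdot\nabla\mathbf{u} = (\nabla\mathbf{u})^T\cdot\mathbf{u}$ and forming the average of these two expressions yields
$$\mathbf{u}\cdot\nabla\mathbf{u} = \tfrac{1}{2}\{[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{u} + \boldsymbol{\omega}\times\mathbf{u}\}, \qquad (3.4\text{-}8)$$
which is at least partially symmetric. But the above authors also show that the following
symmetric form of $\boldsymbol{\omega}\times\mathbf{u}$ exists:
$$\boldsymbol{\omega}\times\mathbf{u} = \mathbf{Z}(\mathbf{u})\cdot\mathbf{u} = \mathbf{Z}^T(\mathbf{u})\cdot\mathbf{u}, \qquad (3.4\text{-}9)$$
where the non-linear operator/matrix $\mathbf{Z}(\mathbf{u})$ is given by
$$\mathbf{Z} = \begin{pmatrix}
\omega_2\sin 2\varphi_2 - \omega_3\sin 2\varphi_3 & \omega_3\cos 2\varphi_3 & \omega_2\cos 2\varphi_2 \\
\omega_3\cos 2\varphi_3 & \omega_3\sin 2\varphi_3 - \omega_1\sin 2\varphi_1 & \omega_1\cos 2\varphi_1 \\
\omega_2\cos 2\varphi_2 & \omega_1\cos 2\varphi_1 & \omega_1\sin 2\varphi_1 - \omega_2\sin 2\varphi_2
\end{pmatrix} \qquad (3.4\text{-}10)$$
and the 'angles' are rather complicated functions of the velocity:
i | $\sin\varphi_i$ | $\cos\varphi_i$ | $\sin 2\varphi_i$ | $\cos 2\varphi_i$
1 | $u_3/\sqrt{u_2^2 + u_3^2}$ | $u_2/\sqrt{u_2^2 + u_3^2}$ | $2u_2u_3/(u_2^2 + u_3^2)$ | $(u_2^2 - u_3^2)/(u_2^2 + u_3^2)$
2 | $u_1/\sqrt{u_1^2 + u_3^2}$ | $u_3/\sqrt{u_1^2 + u_3^2}$ | $2u_1u_3/(u_1^2 + u_3^2)$ | $(u_3^2 - u_1^2)/(u_1^2 + u_3^2)$
3 | $u_2/\sqrt{u_1^2 + u_2^2}$ | $u_1/\sqrt{u_1^2 + u_2^2}$ | $2u_1u_2/(u_1^2 + u_2^2)$ | $(u_1^2 - u_2^2)/(u_1^2 + u_2^2)$
In 2D, this simplifies to (using $\omega = \omega_3 = \partial v/\partial x - \partial u/\partial y$)
$$\mathbf{Z} = \frac{\omega}{u^2 + v^2}\begin{pmatrix} -2uv & u^2 - v^2 \\ u^2 - v^2 & 2uv \end{pmatrix}. \qquad (3.4\text{-}11)$$
Thus, a symmetric advection operator is
$$\mathbf{u}\cdot\nabla\mathbf{u} = \tfrac{1}{2}[\nabla\mathbf{u} + (\nabla\mathbf{u})^T + \mathbf{Z}(\mathbf{u})]\cdot\mathbf{u} \equiv \mathbf{A}(\mathbf{u})\cdot\mathbf{u}, \qquad (3.4\text{-}12)$$
with $\mathbf{A}^T(\mathbf{u}) = \mathbf{A}(\mathbf{u})$, an identity that can be verified by direct calculation.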
It also follows that a GFEM approximation of this symmetric advection form also
generates a symmetric advection matrix, thus opening the door to methods that work well
on (or require) symmetric matrices. But it may be no panacea, considering the following:
Remarks:
(1) While symmetric, A(u) is indefinite—it has both positive and negative eigenvalues.
(2) Conjugate gradient-like methods (especially, perhaps, the minimum residual method)
might work well on this matrix.
(3) Its Jacobian (functional derivative, $\partial\mathbf{A}/\partial\mathbf{u}$) is unsymmetric, which would reduce its
potential effectiveness if a Newton method is invoked after spatial discretization.
Nevertheless, the concept is quite interesting and probably worth further development.
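The 'direct calculation' can be delegated to a few lines of code. The sketch below assumes that the double-angle entries of $\mathbf{Z}$ are expressed through the velocity components as $\sin 2\varphi_3 = 2u_1u_2/(u_1^2+u_2^2)$, $\cos 2\varphi_3 = (u_1^2-u_2^2)/(u_1^2+u_2^2)$, and cyclically for the other two angles; the sample vectors and function names are hypothetical, and no velocity component pair may vanish simultaneously.

```python
def z_matrix(u, w):
    """Symmetric matrix Z(u) such that Z . u = w x u (w plays the role of vorticity)."""
    u1, u2, u3 = u
    w1, w2, w3 = w
    s1, c1 = 2 * u2 * u3 / (u2**2 + u3**2), (u2**2 - u3**2) / (u2**2 + u3**2)
    s2, c2 = 2 * u3 * u1 / (u3**2 + u1**2), (u3**2 - u1**2) / (u3**2 + u1**2)
    s3, c3 = 2 * u1 * u2 / (u1**2 + u2**2), (u1**2 - u2**2) / (u1**2 + u2**2)
    return [[w2 * s2 - w3 * s3, w3 * c3, w2 * c2],
            [w3 * c3, w3 * s3 - w1 * s1, w1 * c1],
            [w2 * c2, w1 * c1, w1 * s1 - w2 * s2]]

def z_matrix_2d(u, v, w):
    """2D special case: only the out-of-plane vorticity w is present."""
    q2 = u * u + v * v
    return [[-2 * u * v * w / q2, (u * u - v * v) * w / q2],
            [(u * u - v * v) * w / q2, 2 * u * v * w / q2]]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

u, w = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)
print(matvec(z_matrix(u, w), u), cross(w, u))
```

For these sample vectors both expressions give (-7, 0.5, 2), and the matrix is symmetric by construction.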
3.5 DERIVED EQUATIONS
The various forms of the NS equations discussed thus far are all primitive variable (u-P)
forms. But there are sometimes good reasons for considering alternate—and, hopefully
and usually—equivalent forms that are less 'primitive'; they are derived from the primitive
variables. The forms to be considered below manage (usually) to bypass the continuity
equation, $\nabla\cdot\mathbf{u} = 0$, although they do (and must) imply the same mass conservation. They
are derived (in part) by differentiation, a process that introduces higher-order equations
and additional problems; the $\nabla\cdot\mathbf{u} = 0$ 'problem' is traded for other problems. They are
also often useful for generating alternative numerical methods of solution. We introduce
these derived equations in this section, deferring discussion of their numerical solution to
later sections.
3.5.1 The Pressure Poisson Equation (PPE)
The PPE is an equation that is implied by the u-P equations and derived therefrom by
taking the divergence of one form or another of the momentum equation and invoking
the (mass) continuity equation. Actually the PPE exists (is implied) only under the
conditions/assumption that the u-P solution is sufficiently smooth so that the divergence of the
momentum equation makes sense—which need not always be true. It thus follows that the
u-P formulation can admit a larger class of solutions (wherein only first derivatives of P
and second derivatives of u need exist). Paradoxically, in Section 3.10.5, we will discuss
a flip-side to this issue wherein the PPE formulation can display more solutions than can
the u-P formulation, but with the following important difference: the extra solutions of
the PPE system are spurious in that they are not solutions of the u-P system.
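Let us begin with the advective/stress-divergence form given in (3.3-1), to which we
add a body force term, $\rho\mathbf{g}$, for generality:
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) + \nabla P = \mu\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \rho\mathbf{g} = \mu[\nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u})] + \rho\mathbf{g}. \tag{3.5-1}$$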
Remark:
g(x, t) is meant to describe any given acceleration (forcing) term, not just
'gravity'—although that is a simple special case.
Assuming sufficient regularity (i.e., that all required derivatives actually exist), the
divergence of this vector equation yields the first form of the (scalar) PPE, namely,
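$$\nabla^2\tilde{P} = \nabla\cdot\left\{\nu[\nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u})] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u} - \frac{\partial\mathbf{u}}{\partial t}\right\},$$
where $\nu = \mu/\rho$ is the kinematic viscosity and $\tilde{P} = P/\rho$ is called the kinematic pressure.
In the best of all worlds, we can invoke $\nabla\cdot\mathbf{u} = 0$ in 'many' places to obtain a useful
'working version' of the PPE. (That above is not useful, because of the presence of the
acceleration.) Thus, (i) $\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u} = -\nabla\times\nabla\times\mathbf{u}$ and $\nabla\cdot(\nabla\times\nabla\times\mathbf{u}) = 0$,
and (ii) $\nabla\cdot\partial\mathbf{u}/\partial t = \partial(\nabla\cdot\mathbf{u})/\partial t$ and, since we want/assume $\nabla\cdot\mathbf{u} = 0$ for all time,
this term also vanishes. The final (first) form of the PPE is thus
$$\nabla^2\tilde{P} = \nabla\cdot(\mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}), \tag{3.5-2}$$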
an equation that, with (3.5-1), constitutes the PPE system and can, with sufficient care,
also be used to solve the NS equations—but not always uniquely (more on this later).
The PPE that 'works best' is that in which the (seemingly zero) viscous term is retained:
$$\nabla^2\tilde{P} = \nabla\cdot(\mathbf{g} + \nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u}), \tag{3.5-3}$$
which we shall bless with the name consistent PPE (CPPE)—for reasons that will be
made clear later.
Remarks:
(1) We shall return to the ill-posedness of (3.5-2) and the well-posedness of (3.5-3)
after completing the discussion of boundary conditions and initial conditions in
Section 3.10.
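(2) Other forms (perhaps slightly more efficient) of the PPE are also possible: (i) via
the identity $\nabla\cdot[\mathbf{u}\cdot\nabla\mathbf{u}] = \nabla\mathbf{u}:\nabla\mathbf{u} + \mathbf{u}\cdot\nabla(\nabla\cdot\mathbf{u})$ from Table (3.1-2), to give, with
$\nabla\cdot\mathbf{u} = 0$,
$$\nabla^2\tilde{P} = \nabla\cdot\mathbf{g} - \nabla\mathbf{u}:\nabla\mathbf{u}, \tag{3.5-4}$$
for which, in 2D cartesian geometry, $\nabla\mathbf{u}:\nabla\mathbf{u} = u_x^2 + 2u_y v_x + v_y^2$; (ii) a further use
of $\nabla\cdot\mathbf{u} = 0$ leads to another simpler form [subtract $(\nabla\cdot\mathbf{u})^2$ from the above result]:
$\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = 2(u_y v_x - v_y u_x)$; as presented in, for example, Roache (1982).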
(3) It will turn out that none of the alternatives that purport to simplify the RHS of the
PPE is advisable when obtaining approximate solutions via the FEM.
(4) We shall often (usually) omit the tilde over the kinematic pressure for simplicity of
notation.
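The two 2D right-hand-side forms quoted in Remark (2) can be compared numerically. The following is a minimal sketch, assuming an illustrative divergence-free field $u = \sin x\cos y$, $v = -\cos x\sin y$ (not taken from the text); the function names are hypothetical.

```python
import math

def velocity_gradient(x, y):
    """Analytic gradient of the assumed field u = sin(x)cos(y), v = -cos(x)sin(y)."""
    ux = math.cos(x) * math.cos(y)
    uy = -math.sin(x) * math.sin(y)
    vx = math.sin(x) * math.sin(y)
    vy = -math.cos(x) * math.cos(y)
    return ux, uy, vx, vy

def ppe_rhs_forms(x, y):
    """Return (grad u : grad u, shortened form, div u) at one point."""
    ux, uy, vx, vy = velocity_gradient(x, y)
    full = ux**2 + 2.0 * uy * vx + vy**2   # 2D expansion used with (3.5-4)
    short = 2.0 * (uy * vx - vy * ux)      # after subtracting (div u)^2
    div = ux + vy                          # zero for this field
    return full, short, div
```

The two forms agree only because the divergence vanishes; for a field that is not solenoidal they differ by exactly $(\nabla\cdot\mathbf{u})^2$.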
When the pair (3.5-1) and (3.5-3) is employed properly (which we will define carefully
in Sections 3.8.2 and 3.9.2), it can be used, rather than (3.5-1) and $\nabla\cdot\mathbf{u} = 0$, to solve the
NS equations. In particular, they will deliver a divergence-free velocity. We will return
to the PPE formulation later; for now, we just make the following additional
Remarks:
(1) The PPE is elliptic and thus shows that the pressure field is always in equilibrium
with the corresponding divergence-free velocity field.
(2) It can be a useful formulation when the solution of the time-dependent NS equations
is the principal goal. (It is not so useful if the steady NS equations are to be attacked.)
(3) PPE formulations and solution methods are generally more 'delicate' than u-P
formulations; the 'cost' of bypassing the explicit solution of $\nabla\cdot\mathbf{u} = 0$ can be higher
than some might initially anticipate—and probably has been, frequently.
3.5.2 The Vorticity Transport Equation (VTE)
Applying curl rather than div to the momentum equation yields the VTE. Starting this
time with the simplest form of the momentum equation (3.2-1), and recalling the vorticity
definition,
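$$\boldsymbol{\omega} = \nabla\times\mathbf{u}, \tag{3.5-5}$$
yields [using curl grad $(\cdot) = 0$ to eliminate the ('God-awful,' in the eyes of $\psi$-$\omega$ fans)
pressure, and $\nabla\cdot\boldsymbol{\omega} = 0$ because div curl $(\cdot) = 0$]
$$\frac{\partial\boldsymbol{\omega}}{\partial t} + \nabla\times(\mathbf{u}\cdot\nabla\mathbf{u}) = \nu\nabla\times\nabla^2\mathbf{u},$$
which simplifies via (i)
$$\nabla\times(\mathbf{u}\cdot\nabla\mathbf{u}) = \nabla\times[\tfrac{1}{2}\nabla q^2 - \mathbf{u}\times\nabla\times\mathbf{u}] = -\nabla\times(\mathbf{u}\times\boldsymbol{\omega}) = \nabla\times(\boldsymbol{\omega}\times\mathbf{u})$$
$$= \boldsymbol{\omega}(\nabla\cdot\mathbf{u}) - \mathbf{u}(\nabla\cdot\boldsymbol{\omega}) - \boldsymbol{\omega}\cdot\nabla\mathbf{u} + \mathbf{u}\cdot\nabla\boldsymbol{\omega} = -\boldsymbol{\omega}\cdot\nabla\mathbf{u} + \mathbf{u}\cdot\nabla\boldsymbol{\omega}$$
and (ii)
$$\nabla\times\nabla^2\mathbf{u} = \nabla\times[\nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}] = -\nabla\times\nabla\times\boldsymbol{\omega} = \nabla^2\boldsymbol{\omega} - \nabla(\nabla\cdot\boldsymbol{\omega}) = \nabla^2\boldsymbol{\omega},$$
to
$$\frac{\partial\boldsymbol{\omega}}{\partial t} + \mathbf{u}\cdot\nabla\boldsymbol{\omega} = \boldsymbol{\omega}\cdot\nabla\mathbf{u} + \nu\nabla^2\boldsymbol{\omega} \tag{3.5-6}$$
in the general (3D) case and to the degenerate/simpler version
$$\frac{\partial\omega}{\partial t} + \mathbf{u}\cdot\nabla\omega = \nu\nabla^2\omega \tag{3.5-7}$$
in the 2D case, wherein $\omega$ is a scalar. (For example, $\omega = \mathbf{k}\cdot\nabla\times\mathbf{u}$ where $\mathbf{u}$ is in the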
xy-plane and k is the unit vector in the z-direction; this formulation is also useful
for axisymmetric problems, using cylindrical coordinates.) In 2D, the VTE is just the
advection-diffusion equation of Chapter 2, a parabolic equation that is one of the pair
that comprise the stream function-vorticity formulation, which we will soon present.
As the VTE will see less 'action' in this text than either the u-P or PPE formulations,
we defer further discussion except to say that:
1. Again, more regularity (than even for the PPE) in u is required for these higher-order
(third) derivatives to exist.
2. The pair (3.5-5) and (3.5-6) can be used—again with proper care and sometimes
with some difficulty—to solve simultaneously for the velocity and the vorticity. [Also,
$\nabla\cdot\mathbf{u} = 0$ may sometimes need to be specifically invoked; it depends on BC's and solution
strategy. See Gresho (1992) for further information and references.]
3. A major 'source' of vorticity occurs at 'no-slip' boundaries—usually via the non-zero
tangential pressure gradient there, but it can also be generated by an accelerating tangential
boundary.
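As a quick sanity check of the 2D VTE (3.5-7), the residual can be evaluated for a known exact solution. This is a minimal sketch, assuming the classical decaying Taylor-Green vortex as the test field (an illustrative choice, not taken from the text) and central differences with a hypothetical step `h`.

```python
import math

def velocity(x, y, t, nu):
    """Assumed test field: 2D Taylor-Green vortex decaying as exp(-2*nu*t)."""
    decay = math.exp(-2.0 * nu * t)
    return (math.sin(x) * math.cos(y) * decay,
            -math.cos(x) * math.sin(y) * decay)

def vorticity(x, y, t, nu):
    """omega = dv/dx - du/dy for the field above."""
    return 2.0 * math.sin(x) * math.sin(y) * math.exp(-2.0 * nu * t)

def vte_residual(x, y, t, nu, h=1e-4):
    """Residual of d(omega)/dt + u.grad(omega) - nu*lap(omega), by central differences."""
    def w(a, b, c):
        return vorticity(a, b, c, nu)
    u, v = velocity(x, y, t, nu)
    w_t = (w(x, y, t + h) - w(x, y, t - h)) / (2.0 * h)
    w_x = (w(x + h, y, t) - w(x - h, y, t)) / (2.0 * h)
    w_y = (w(x, y + h, t) - w(x, y - h, t)) / (2.0 * h)
    lap = (w(x + h, y, t) + w(x - h, y, t) + w(x, y + h, t)
           + w(x, y - h, t) - 4.0 * w(x, y, t)) / h**2
    return w_t + u * w_x + v * w_y - nu * lap
```

For this particular field the advection term vanishes identically, so the residual tests only the balance between the time derivative and diffusion; it should be small, limited by the finite-difference error.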
3.5.3 The Penalized Momentum Equation
A 'slightly compressible' fluid may, intuitively, behave quite like an incompressible
fluid in many situations. That this premise is true has led to a large amount of work
in approximately incompressible fluids, from which we select (at this point) just one type
because it has achieved a large following in some finite element circles.
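Suppose we replace $\nabla\cdot\mathbf{u} = 0$ with
$$\nabla\cdot\mathbf{u} = -\varepsilon P \tag{3.5-8}$$
or its equivalent
$$P = -\lambda\nabla\cdot\mathbf{u}, \tag{3.5-9}$$
where $\lambda = 1/\varepsilon \gg \mu$ has the same units as viscosity, is called the penalty
parameter, and is 'user-selected.' Clearly if $P$ is finite, then $\nabla\cdot\mathbf{u} \to 0$ as $\varepsilon \to 0$ ($\lambda \to \infty$).
Note that if $\int_\Gamma \mathbf{n}\cdot\mathbf{u} = 0$, then the average 'penalty' pressure is zero: $\int_\Omega P = 0$. While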
the name 'penalty parameter' will be discussed further later, for now we just assume
that this new continuity equation approximates well the incompressible one and insert
the above pressure into the momentum equation—say in the stress-divergence form
(3.3-1)—to obtain the penalized momentum equation,
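$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) - (\lambda + \mu)\nabla(\nabla\cdot\mathbf{u}) = \mu\nabla^2\mathbf{u}, \tag{3.5-10}$$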
which contains no pressure and can therefore be solved directly for the velocity field—the
pressure being 'recovered', if and when desired, from (3.5-9). This is a substantial
simplification over the u-P system, and is what accounts for its popularity. But it is no pure
panacea for (at least) the following two (related) reasons: (i) it is not a priori obvious
how large $\lambda$ must be to approximate $\nabla\cdot\mathbf{u} = 0$ with acceptable error, and (ii) clearly if $\lambda$
is 'too large' (related to round-off error when a numerical solution is sought), then the
above equation becomes simply $\nabla(\nabla\cdot\mathbf{u}) = 0$.
It is of interest (and important) to derive the analog of the PPE when the penalty
method is employed because the small compressibility can have a large effect on the
pressure—but only for small time—that is related to what may be called a (spurious)
'transient penalty shock wave.' To this end, we first form the divergence of (3.5-10) after
adding a source term to the RHS:
$$\rho\left[\frac{\partial(\nabla\cdot\mathbf{u})}{\partial t} + \nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u})\right] - (\lambda + \mu)\nabla^2(\nabla\cdot\mathbf{u}) = \mu\nabla^2(\nabla\cdot\mathbf{u}) + \rho\nabla\cdot\mathbf{g},$$
which, with (3.5-8) and (see Table 3.1-2) using $\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = \mathbf{u}\cdot\nabla(\nabla\cdot\mathbf{u}) + \nabla\mathbf{u}:\nabla\mathbf{u}$,
gives
$$\varepsilon\left(\frac{\partial P}{\partial t} + \mathbf{u}\cdot\nabla P\right) = \frac{1}{\rho}\left[(1 + 2\mu\varepsilon)\nabla^2 P\right] - (\nabla\cdot\mathbf{g} - \nabla\mathbf{u}:\nabla\mathbf{u}), \tag{3.5-11}$$
wherein, since $\lambda = 1/\varepsilon$ is usually much larger than $\mu$, the $2\mu\varepsilon$ term is (usually)
negligible. This equation is to be compared with the PPE of incompressible flow, (3.5-4). The
penalty pressure (and thus div $\mathbf{u}$) actually 'dances' to a time-dependent advection-diffusion
equation with a source term; the effective diffusion coefficient is $\lambda/\rho$. But since $\varepsilon$ is
very small, the transient will be very sharp, and short; there is an ephemeral temporal
boundary layer. After this initial (and spurious, relative to either incompressible or
compressible flow) penalty transient ('shock' wave) has passed through the domain (the required
time for which is $O(\rho L^2\varepsilon)$, where $L$ is a characteristic length scale of the domain), the
pressure will be in quasi-steady equilibrium and can respond to the true time-variations of
the flow; i.e., it then satisfies, approximately, $\nabla^2 P = \rho[\nabla\cdot\mathbf{g} - \nabla\mathbf{u}:\nabla\mathbf{u}]$, which is (3.5-4).
It is important to emphasize that even though P is absent from the penalized momentum
equation and (3.5-11) is in fact never formed, the implied pressure [from (3.5-8)] still
satisfies (3.5-11) and this can have a significant effect on the penalty velocity.
Later we shall demonstrate this spurious transient and show that div u can be very
large (and thus u very wrong) during this adjustment phase.
A Final Remark:
The word 'penalty,' and the concept, comes from the variational statement of the Stokes
equations (Section 3.7.1) as follows: 'The term penalty is to be understood in the
framework of optimal control: the cost functional is augmented with $\varepsilon^{-1}\int(\nabla\cdot\mathbf{u})^2\,dx$, so that
diverging velocity fields are strongly penalized.'—Thomasset (1981, p. 81).
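The $O(\rho L^2\varepsilon)$ duration of the penalty transient can be made concrete with a one-dimensional model problem. This is a minimal sketch, assuming that advection, the source term and the $2\mu\varepsilon$ correction in (3.5-11) are dropped, leaving $\varepsilon\,\partial P/\partial t = (1/\rho)\,\partial^2 P/\partial x^2$ on $0 < x < L$ with $P = 0$ at both ends (boundary conditions chosen only for illustration). The slowest mode, $\sin(\pi x/L)$, then decays as $\exp(-t/T)$ with $T = \rho L^2\varepsilon/\pi^2$. The function name is hypothetical.

```python
import math

def penalty_transient_time(rho, length, eps):
    """e-folding time of the slowest mode of eps*dP/dt = (1/rho)*d2P/dx2
    on an interval of the given length with P = 0 at both ends (assumed model)."""
    return rho * length**2 * eps / math.pi**2
```

The estimate scales linearly with $\varepsilon$, so a larger penalty parameter $\lambda = 1/\varepsilon$ shortens the spurious transient in proportion.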
3.6 ALTERNATE STATEMENTS OF THE NS EQUATIONS
The stage is now set, in part at least, for writing the various theoretically equivalent forms
of the NS equations. We do some of this, plus a little more, in this section—some of
which will simply serve as a summary of the preceding discussion. Also, we switch to
the dimensionless form of the equations introduced in (3.2-5)—for advection-dominated
flow.
3.6.1 Velocity-Pressure in Divergence Form
This is simply an expansion of (3.4-2): i.e.,
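$$\frac{\partial\mathbf{u}}{\partial t} + \nabla\cdot(\mathbf{u}\mathbf{u} + P\mathbf{I}) = \mathrm{Re}^{-1}\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0. \tag{3.6-1}$$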
3.6.2 Velocity-Pressure in Rotational Form
This form combines the curl form of the viscous terms with the rotational (curl) form of
advection; i.e.,
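$$\frac{\partial\mathbf{u}}{\partial t} + \boldsymbol{\omega}\times\mathbf{u} + \nabla P_T = -\mathrm{Re}^{-1}\nabla\times\boldsymbol{\omega}, \quad \boldsymbol{\omega} = \nabla\times\mathbf{u}, \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0, \tag{3.6-2}$$
where $P_T = P + \tfrac{1}{2}q^2$ is the total pressure.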
3.6.3 PPE Form
This important form combines (3.2-5) and (3.5-3) with g = 0: i.e.,
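$$\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = \mathrm{Re}^{-1}\nabla^2\mathbf{u}$$
and
$$\nabla^2 P = \nabla\cdot(\mathrm{Re}^{-1}\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u}). \tag{3.6-3}$$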
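3.6.4 The Stream Function-Vorticity ($\psi$-$\omega$) Formulation
This 2D (only) formulation utilizes (3.5-7) and the definition of vorticity, $\omega = \partial v/\partial x - \partial u/\partial y$,
along with the introduction of the stream function ($\psi$) via
$$u = \partial\psi/\partial y \quad\text{and}\quad v = -\partial\psi/\partial x \tag{3.6-4}$$
($\mathbf{u} = \nabla\psi\times\mathbf{e}_3$) to obtain
$$\frac{\partial\omega}{\partial t} + \mathbf{u}\cdot\nabla\omega = \mathrm{Re}^{-1}\nabla^2\omega \tag{3.6-5}$$
and
$$\nabla^2\psi + \omega = 0, \tag{3.6-6}$$
which is the $\psi$-$\omega$ formulation.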
Remarks:
(1) The elliptic equation relating $\psi$ and $\omega$ is somewhat analogous to that relating $P$ and
$\mathbf{u}$ in the PPE formulation.
(2) It is possible to eliminate the elliptic equation by inserting $\omega = -\nabla^2\psi$ into the VTE:
$$\frac{\partial}{\partial t}\nabla^2\psi + \mathbf{u}\cdot\nabla(\nabla^2\psi) = \mathrm{Re}^{-1}\nabla^4\psi, \tag{3.6-7}$$
which, with (3.6-4), can be used to solve directly for the stream function. While this
formulation has actually been implemented via the FEM [see, for example, Olson
and Tuann (1979), Girault and Raviart (1986), and Gunzburger (1989)], we will not
pursue it any further. (A higher degree of regularity is obviously presumed, which
entails the necessity of using basis functions that are of class $C^1$.)
(3) In 3D, the analogous formulation leads to the vector system of velocity (vector)
potential and vorticity, another approach that we believe to be too complicated
and largely unnecessary. See Gunzburger (1989) and references therein for further
discussion.
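The way a stream function builds in mass conservation can be seen in a few lines. This is a minimal sketch, assuming an illustrative stream function $\psi = \sin x\sin y$ and central differences with a hypothetical step `h`; it evaluates the velocity from (3.6-4), then the discrete divergence and the vorticity, which should match $-\nabla^2\psi = 2\sin x\sin y$ as required by (3.6-6).

```python
import math

def psi(x, y):
    """Assumed stream function, chosen only for illustration."""
    return math.sin(x) * math.sin(y)

def velocity(x, y, h=1e-3):
    """u = d(psi)/dy and v = -d(psi)/dx, as in (3.6-4), by central differences."""
    u = (psi(x, y + h) - psi(x, y - h)) / (2.0 * h)
    v = -(psi(x + h, y) - psi(x - h, y)) / (2.0 * h)
    return u, v

def divergence_and_vorticity(x, y, h=1e-3):
    """Discrete du/dx + dv/dy and dv/dx - du/dy at one point."""
    u_e, v_e = velocity(x + h, y, h)
    u_w, v_w = velocity(x - h, y, h)
    u_n, v_n = velocity(x, y + h, h)
    u_s, v_s = velocity(x, y - h, h)
    div = (u_e - u_w) / (2.0 * h) + (v_n - v_s) / (2.0 * h)
    omega = (v_e - v_w) / (2.0 * h) - (u_n - u_s) / (2.0 * h)
    return div, omega
```

Because the central-difference operators in $x$ and $y$ commute, the discrete divergence vanishes to round-off, whereas the vorticity carries the usual $O(h^2)$ truncation error.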
3.6.5 The Velocity-Vorticity Formulation
This formulation can be used in 2D or 3D and combines the kinetic vorticity transport
equation—(3.5-6)—with the two kinematic equations (3.2-2) and (3.5-5), to give the trio
of equations
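$$\frac{\partial\boldsymbol{\omega}}{\partial t} + \mathbf{u}\cdot\nabla\boldsymbol{\omega} = \boldsymbol{\omega}\cdot\nabla\mathbf{u} + \mathrm{Re}^{-1}\nabla^2\boldsymbol{\omega}, \tag{3.6-8}$$
$$\boldsymbol{\omega} = \nabla\times\mathbf{u}, \tag{3.6-9}$$
and
$$\nabla\cdot\mathbf{u} = 0, \tag{3.6-10}$$
where the vortex stretching term, $\boldsymbol{\omega}\cdot\nabla\mathbf{u}$, is absent in 2D, wherein also $\omega$ is a simple
scalar.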
Remarks:
(1) We shall also have little more to say regarding this formulation in the sequel (see,
for example, Gunzburger et al., 1990, and references therein).
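(2) The kinematic equation, $\nabla\times\mathbf{u} = \boldsymbol{\omega}$, is sometimes replaced (or augmented) by a
higher-order one by taking its curl:
$\nabla\times\nabla\times\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla^2\mathbf{u} = \nabla\times\boldsymbol{\omega}$; i.e.,
$$\nabla^2\mathbf{u} = -\nabla\times\boldsymbol{\omega}, \tag{3.6-11}$$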
a vector Poisson equation. See, for example, Hafez et al. (1989) for further discussion
of this formulation.
(3) $\boldsymbol{\omega}$ is also divergence-free; $\nabla\cdot\boldsymbol{\omega} = 0$ from (3.6-9).
3.6.6 Other Formulations
Yes, there are still others, although they seem thus far to have proven more useful
theoretically than computationally. These are obtained from the non-dimensional forms of
(3.3-4) and (3.3-6); i.e.,
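$$\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = \mathrm{Re}^{-1}[\nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}] \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0, \tag{3.6-12}$$
and
$$\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = -\mathrm{Re}^{-1}\nabla\times\nabla\times\mathbf{u} \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0, \tag{3.6-13}$$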
which we previously referred to as div-curl form and curl form, respectively.
The reason for 'belaboring' the issue of equivalent statements of the PDE's is simply
that the various weak formulations derived from these PDE's are not equivalent when it
comes to natural BC's; i.e., while the PDE's above are equivalent, the 'natural' boundary
value problems corresponding to them are not equivalent. Thus, we shall revisit some of
these various alternate statements of the NS equations when we generate the corresponding
weak forms and NBC's.
3.7 SPECIAL CASES OF INTEREST
There are three special cases—subsets, actually—of the NS equations that we wish to
illuminate here and explain some aspects of their utility.
3.7.1 Stokes Flow
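If $\mathrm{Re} \ll 1$ and the non-linear advection terms are therefore neglected/omitted, then the NS
equations become
$$\frac{\partial\mathbf{u}}{\partial t} + \nabla P = \nu\nabla^2\mathbf{u} + \mathbf{g} \tag{3.7-1}$$
and
$$\nabla\cdot\mathbf{u} = 0, \tag{3.7-2}$$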
which are the (linear) equations of Stokes flow—also often called creeping flow. While
they are sometimes appropriate/applicable in the time-dependent form shown (e.g., release
a small pearl in a vat of motor oil or—if your sensibilities prefer it—a glass bead in
glycerin), they are most often used in the steady version by omitting $\partial\mathbf{u}/\partial t$ in (3.7-1); indeed,
there are those who seem to 'not believe' in transient Stokes flow—by using the name
'Stokes flow' (or the more descriptive 'creeping flow') to represent the steady version
of the above equations; see, for example, Langlois 1964 (Slow Viscous Flow, Macmillan
Co., NY; out of print!) A small exception is briefly noted in Happel and Brenner (1965),
wherein they admit to the existence of 'unsteady creeping flows' in which they point out,
properly, that the $\partial\mathbf{u}/\partial t$ term need not be small; and a large one (exception) is the paper
by Maxey and Riley (1983), as is the rather recent and very relevant paper by Lovalenti
and Brady (1993). Indeed, in a later section, we shall show how a transient Stokes flow is,
in some sense, the proper 'precursor' to what is often and erroneously called 'impulsive
starts' in that $\partial\mathbf{u}/\partial t \to \infty$, yet we are nevertheless dealing with Stokes flow.
(Explanation: the acceleration is arbitrarily large for a very short time.) These non-believers would,
ostensibly, also not believe in the following questions: (1) How much time is required
for the pearl to attain 90% of its terminal velocity? (It is clearly not zero.) (2) Ditto in a
fluid with twice the viscosity? It is also important to point out (and emphasize) that one
must realize that the fluid 'goes nowhere' during a typical transient Stokes problem that
attains a steady state; i.e., the viscous time scale is so short that all fluid 'parcels' are
virtually stationary throughout the diffusion-dominated simulation—unless, of course,
the boundary conditions are time-dependent, in which case the Stokes flow equations
only apply if $\mathrm{Re} \ll 1$ for all time (and then fluid parcel displacement can occur). The
dropped pearl will travel only a very small distance before its acceleration becomes
negligible—and its terminal velocity attained. Although Leal (1992) is also 'close' to a
non-believer in transient Stokes flow, his book has much good discussion regarding steady
Stokes flow—for example, it discusses details of our falling pearl experiment above after
the 'acceleration phase' is over. Another example of a transient Stokes flow is afforded
by any 'spindown' experiment: turn off the body force in any contained flow, at any Re,
and let the flow 'spindown'; the advection term will become negligibly small long before
the flow has come to rest. The final portion of the process is transient Stokes flow.
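The two questions posed above about the falling pearl can be given rough numbers under a strong simplification. This is a minimal sketch, assuming quasi-steady Stokes drag $6\pi\mu aV$ on a sphere of radius $a$ and density $\rho_p$, and ignoring the added-mass and history forces that a full transient treatment (e.g., of the Maxey and Riley type) would include. Under that assumption the velocity relaxes as $V/V_t = 1 - \exp(-t/\tau)$ with $\tau = 2\rho_p a^2/(9\mu)$. The function name and parameters are hypothetical.

```python
import math

def time_to_fraction(radius, rho_particle, mu, fraction=0.9):
    """Time for V/V_terminal to reach `fraction`, assuming quasi-steady
    Stokes drag only (no added mass, no history force)."""
    tau = 2.0 * rho_particle * radius**2 / (9.0 * mu)   # relaxation time
    return -tau * math.log(1.0 - fraction)
```

In this simplified model the answer to the first question is $\tau\ln 10$, which is clearly not zero, and doubling the viscosity halves it.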
The transient Stokes equations are useful for a number of reasons, not the least of
which is that they are linear. They thus form a very nice 'test-bed' for numerical methods
(and their detailed theoretical analysis!) whose real goal is usually to solve the full NS
equations. The important issue of 'div-grad coupling' is as delicate and crucial in the
Stokes equations as it is in the NS equations; ditto boundary conditions—for both u
and P. Finally, the steady Stokes equations, which represent a linear algebraic system of
equations when discretized, are often used to generate a simple first guess to a steady
NS flow via what is called the incremental Reynolds number solution method: use the
solution at a lower Re as the first guess to the solution of the non-linear equations that
describe the flow at some non-zero Re. More on this in Volume II.
Note that in the special case when g can be expressed as the gradient of a scalar—such
as gravity—it is then a conservative 'force' field, and it is both possible and advisable
(especially when a numerical solution is sought) to 'absorb' this scalar into the pressure
and drop g from the RHS.
The stationary Stokes equations are also special in that there are powerful variational
statements that relate to them and their solution. We introduce two of these here and
will later refer back to them when seeking approximate solutions. The first is this: the
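minimizing function over all divergence-free vector fields that satisfy $\mathbf{u} = \mathbf{w}$ on $\Gamma$ of the
following functional, called the Dirichlet integral,
$$J_0(\mathbf{u}) = \int_\Omega\left[\tfrac{1}{2}\nu\nabla\mathbf{u}:(\nabla\mathbf{u})^T - \mathbf{u}\cdot\mathbf{g}\right], \tag{3.7-3}$$
satisfies the steady Stokes equations,
$$\nabla P = \nu\nabla^2\mathbf{u} + \mathbf{g} \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0 \ \text{in } \Omega, \tag{3.7-4}$$
with $\mathbf{u} = \mathbf{w}$ on $\Gamma$. [The proof of this statement, provided in Section 3.15, involves the
recognition that if $\nu\nabla^2\mathbf{u} + \mathbf{g}$ is $L^2$-orthogonal to all divergence-free vector fields, then it
must be the gradient of a scalar, which scalar is called $P$. If $\mathbf{u}$ is obtained from (3.7-3),
then $P$ can then be obtained from $\nabla^2 P = \nabla\cdot\mathbf{g}$ in $\Omega$, $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} + \mathbf{g})$ on $\Gamma$.]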
In the space of divergence-free functions, the Stokes solution is a true minimizer.
But suppose we enlarge the space of functions so that divergence-free functions are
only a subset? In this case, the variational problem becomes one of 'minimization plus
constraint':
Find the minimum of
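$$J_1(\mathbf{u}) = \int_\Omega\left[\tfrac{1}{2}\nu\nabla\mathbf{u}:(\nabla\mathbf{u})^T - \mathbf{u}\cdot\mathbf{g}\right] \tag{3.7-5}$$
for every $\mathbf{u}$ that takes on the value $\mathbf{w}$ on $\Gamma$ subject to the constraint
$$\nabla\cdot\mathbf{u} = 0 \ \text{in } \Omega. \tag{3.7-6}$$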
The realization of this (second) variational formulation usually involves the introduction
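of a Lagrange multiplier ($\lambda$) to enforce the constraint:
Find the stationary point of
$$J_2(\mathbf{u}, \lambda) = \int_\Omega\left[\tfrac{1}{2}\nu\nabla\mathbf{u}:(\nabla\mathbf{u})^T - \mathbf{u}\cdot\mathbf{g} - \lambda\nabla\cdot\mathbf{u}\right] \tag{3.7-7}$$
over all vector functions that satisfy $\mathbf{u} = \mathbf{w}$ on $\Gamma$ and over all scalar functions $\lambda$ in
$L^2$. This problem is easier in that the class of vector functions over which the search is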
performed is much less restricted (even though the same u will ultimately be obtained),
but it is harder in at least two ways: (i) the search must simultaneously range over all
$L^2$ scalar functions (the Lagrange multipliers), and (ii) the (more powerful) minimum
has been replaced by a mere extremum (a stationary point); i.e., we have now in fact a
saddle-point problem to deal with. (The introduction of a Lagrange multiplier generally
transforms a minimization problem to a saddle-point problem.) The first Frechet derivative
of (3.7-7) yields—again—the steady Stokes equation (3.7-4); i.e., it turns out that the
Lagrange multiplier, which entered the functional as a mathematical object, exits as the
Stokes pressure; $\lambda = P$. The extremum simultaneously minimizes $J_2(\mathbf{u}, P)$ with respect
to $\mathbf{u}$ [which minimum is clearly the same as that of $J_1(\mathbf{u})$] and maximizes it with respect
to P; it is a minimax problem. (For Re > 0, there is no variational principle, but it may
still be permissible to still call P a Lagrange multiplier in that it still takes on the value
necessary to ensure constraint satisfaction.) Further detailed discussion is presented in
Section 3.15.
The penalty formulation (Section 3.5.3) returns us to a minimization problem:
minimize, for $\lambda > 0$ given and fixed,
$$J_p(\mathbf{u}) = \int_\Omega\left[\tfrac{1}{2}\nu\nabla\mathbf{u}:(\nabla\mathbf{u})^T - \mathbf{u}\cdot\mathbf{g} + \frac{\lambda}{2}(\nabla\cdot\mathbf{u})^2\right] \tag{3.7-8}$$
over all vector functions that satisfy $\mathbf{u} = \mathbf{w}$ on $\Gamma$. The larger the divergence of $\mathbf{u}$, the larger
is $J_p(\mathbf{u})$; thus the term 'penalty': the functional is penalized by non-solenoidal vector
fields. 'The penalty function method reduces problems of conditional (or constrained)
extremum to problems without constraints by the introduction of a penalty on the
infringement of constraints.' (Reddy, 1982). The minimum of $J_p(\mathbf{u})$ is attained when $\delta J_p(\mathbf{u}) = 0$
[and $\delta^2 J_p(\mathbf{u}) > 0$], which requires (leads to)
$$\nu\nabla^2\mathbf{u} + \mathbf{g} = -\lambda\nabla(\nabla\cdot\mathbf{u}), \tag{3.7-9}$$
which are the steady Stokes equations if $\lambda\nabla\cdot\mathbf{u} = -P$ [see (3.5-9)] and gives a velocity
field that is within $\varepsilon = 1/\lambda$ of the Stokes velocity for $\lambda \gg \mu$ in (3.7-9). See Bercovier
(1978) for the theory of penalty methods.
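The relation between the Lagrange-multiplier and penalty formulations can be seen on a two-unknown analogue. This is a minimal sketch, assuming the toy functional $\tfrac{1}{2}\nu(x_1^2 + x_2^2) - g_1x_1 - g_2x_2$ with the single constraint $x_1 + x_2 = 0$ standing in for $\nabla\cdot\mathbf{u} = 0$; the problem, the function names and the closed-form solutions are illustrative and are not taken from the text.

```python
def constrained_minimum(nu, g1, g2):
    """Stationary point of 0.5*nu*(x1^2 + x2^2) - g1*x1 - g2*x2 - lam*(x1 + x2)."""
    lam = -0.5 * (g1 + g2)        # multiplier that enforces x1 + x2 = 0
    x1 = (g1 + lam) / nu
    x2 = (g2 + lam) / nu
    return x1, x2, lam

def penalty_minimum(nu, g1, g2, penalty):
    """Minimum of the same quadratic plus 0.5*penalty*(x1 + x2)^2."""
    s = (g1 + g2) / (nu + 2.0 * penalty)   # residual 'divergence' x1 + x2
    x1 = (g1 - penalty * s) / nu
    x2 = (g2 - penalty * s) / nu
    return x1, x2, -penalty * s            # last entry mimics P = -lambda * div u
```

In this analogue the penalty solution differs from the constrained one by $(g_1 + g_2)/[2(\nu + 2\lambda)]$, i.e., by $O(1/\lambda)$, and the recovered quantity $-\lambda(x_1 + x_2)$ tends to the Lagrange multiplier as $\lambda$ grows, mirroring the roles of (3.5-9) and (3.7-9).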
Finally, in the 2D $\psi$-$\omega$ formulation, the Stokes flow equations are
$$\frac{\partial\omega}{\partial t} = \nu\nabla^2\omega + \mathbf{k}\cdot\mathrm{curl}\,\mathbf{g} \tag{3.7-10}$$
and
$$\nabla^2\psi + \omega = 0, \tag{3.7-11}$$
a transient heat equation for the vorticity and, it would seem, an uncoupled elliptic
equation for the stream function. But, as we shall see later, life is not quite that simple
for the $\psi$-$\omega$ formulation; the reason is, basically, that (3.7-10) comes without BC's and
thus cannot be solved alone for the vorticity; the coupled set must always be solved
because $\psi$ 'contains' all of the BC data.
3.7.2 Inviscid Flow
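If $\nu = 0$, then the NS equations 'simplify' to the incompressible Euler equations, an
especially slippery system for which, when $\partial\mathbf{u}/\partial t = 0$, non-uniqueness is the name of the
game. They are, in the simplest form,
$$\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = \mathbf{g} \tag{3.7-12}$$
and
$$\nabla\cdot\mathbf{u} = 0. \tag{3.7-13}$$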
Although we have not yet addressed the subject of boundary conditions, it is
well known that the no-slip BC goes away with the viscosity, a simplification that
can be a complication. Consider, for example, the following 'expansion' flow in a
channel—also called the backward-facing step—'An ingenious device for generating
Ph.D.s'—F. Habashi (personal communication), a simplified version of which is shown in
Figure 3.7-1. The flow enters at the upper half of the left boundary—and the irrotational
solution is shown (for u = 1 at the inlet).
[Figure: (a) Stream function; (b) Vector field]
Fig. 3.7-1 Potential flow in a channel expansion; the vertical scale is magnified in the vector
plot.
There are (at least) two very different steady Euler flows for this case: (i) potential
flow ($v \neq 0$ at the inlet, shown above and discussed below), and (ii) 'slug' flow in which
$u = 1$ in the entire upper half, $u = 0$ in the entire lower half, and $v = 0 = P$ everywhere;
i.e., jumps in tangential velocity ('vortex sheets') are perfectly admissible for inviscid
flow, an essential complication that (for $t > 0$ at least) is missing for finite $\nu$, no matter
how small.
The rotational form of the Euler equations is interesting; it is, from (3.6-2) with a
source term added,
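$$\frac{\partial\mathbf{u}}{\partial t} + \boldsymbol{\omega}\times\mathbf{u} + \nabla P_T = \mathbf{g}, \tag{3.7-14}$$
$$\nabla\cdot\mathbf{u} = 0, \tag{3.7-15}$$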
where (recall) $P_T = P + \tfrac{1}{2}q^2$ is the Bernoulli pressure. If also the flow is irrotational
($\boldsymbol{\omega} = 0$), the equations describe a (time-varying in general) potential flow, another
special case that we discuss next.
3.7.3 Potential Flow
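For $\boldsymbol{\omega} = 0$ and $\mathbf{g}$ the gradient of a scalar potential, say $\mathbf{g} = -\nabla h$, the rotational form of
the (now irrotational) Euler equations becomes
$$\frac{\partial\mathbf{u}}{\partial t} + \nabla P_T = 0 \tag{3.7-16}$$
and
$$\nabla\cdot\mathbf{u} = 0, \tag{3.7-17}$$
where here $P_T = P + \tfrac{1}{2}q^2 + h$. Since $\nabla\cdot\mathbf{u} = 0$ for all time, the continuity equation can
be written as $\nabla\cdot\partial\mathbf{u}/\partial t = 0$ and the pair rewritten in terms of the acceleration, $\mathbf{a} = \partial\mathbf{u}/\partial t$:
$$\mathbf{a} + \nabla P_T = 0 \tag{3.7-18}$$
and
$$\nabla\cdot\mathbf{a} = 0, \tag{3.7-19}$$
in which $\mathbf{a}$ is clearly a potential divergence-free acceleration (since it is the gradient of
a scalar, $P_T$, the potential in this case).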
A classical potential flow is both divergence-free and curl-free (see, for example,
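Batchelor, 1967) and is thus the gradient of a scalar, say $\phi$ (the velocity potential).
Inserting
$$\mathbf{u} = \nabla\phi \tag{3.7-20}$$
into (3.7-16) and (3.7-17) gives
$$\nabla\left(\frac{\partial\phi}{\partial t} + P_T\right) = 0 \tag{3.7-21}$$
and
$$\nabla^2\phi = 0, \tag{3.7-22}$$
the latter of which yields $\phi$ once the BC's have been specified. The pressure can then be
obtained from
$$P = -\left(\tfrac{1}{2}q^2 + h + \frac{\partial\phi}{\partial t}\right). \tag{3.7-23}$$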
Remarks:
(1) The Euler equations are rarely attacked numerically, especially if the flow is
irrotational; see, however, Bell and Marcus (1992), who claim to at least come close to
solving them.
(2) The potential acceleration will be a useful notion when we discuss IC's and BC's,
and also (later) in the discussion of projection methods.
(3) $\partial\phi/\partial t$ is only non-zero when the BC's are time-varying.
(4) Boundary conditions for potential flow are: $u_n = \mathbf{n}\cdot\nabla\phi$ specified (usually) or $\phi$
specified (rarely, and typically as an OBC). Note that when the velocity at the inlet of
a domain is desired to be specified, only the normal component can be so specified;
the tangential components must be left 'free' and will be such that $\nabla\times\mathbf{u} = 0$ at
the inlet. In Figure 3.7-1 above, the inlet BC was $u = 1 = -u_n = -\partial\phi/\partial n = \partial\phi/\partial x$.
(5) Since $\nabla^2\mathbf{u} = 0$ when $\mathbf{u}$ is a potential flow velocity, it follows that every potential
flow satisfies the NSE's.
Finally, for an extensive discussion of both potential flow and Stokes flow and their
associated variational principles, see Section 3.15.
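The chain from a harmonic potential to the pressure can be checked on a simple case. This is a minimal sketch, assuming the steady potential $\phi = x^2 - y^2$ (a stagnation-point flow chosen only for illustration) with $h = 0$, so that the pressure relation reduces to $P = -\tfrac{1}{2}q^2$; the residual of the steady Euler momentum equation, the divergence and the curl are then evaluated by central differences with a hypothetical step `d`.

```python
def velocity(x, y):
    """u = grad(phi) for the assumed harmonic potential phi = x^2 - y^2."""
    return 2.0 * x, -2.0 * y

def bernoulli_pressure(x, y, h_pot=0.0):
    """P = -(q^2/2 + h) for steady flow, i.e. with d(phi)/dt = 0."""
    u, v = velocity(x, y)
    return -(0.5 * (u * u + v * v) + h_pot)

def euler_residual(x, y, d=1e-4):
    """Return (x-momentum, y-momentum, divergence, curl) residuals for g = 0."""
    u, v = velocity(x, y)
    u_e, v_e = velocity(x + d, y)
    u_w, v_w = velocity(x - d, y)
    u_n, v_n = velocity(x, y + d)
    u_s, v_s = velocity(x, y - d)
    ux, vx = (u_e - u_w) / (2.0 * d), (v_e - v_w) / (2.0 * d)
    uy, vy = (u_n - u_s) / (2.0 * d), (v_n - v_s) / (2.0 * d)
    px = (bernoulli_pressure(x + d, y) - bernoulli_pressure(x - d, y)) / (2.0 * d)
    py = (bernoulli_pressure(x, y + d) - bernoulli_pressure(x, y - d)) / (2.0 * d)
    return u * ux + v * uy + px, u * vx + v * vy + py, ux + vy, vx - uy
```

All four residuals vanish to round-off here because the fields are polynomials of degree two or less, for which central differences are exact.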
3.7.4 Axisymmetric Flow
There are many practical situations in which axisymmetric flow is either present or
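assumed to be present; namely, flow in a tube/pipe in which there is no angular ($\theta$)
variation of any flow quantity, only radial ($r$) and axial ($z$). Thus, the equations of
motion in cylindrical coordinates are of interest. Note that axisymmetry can only exist
(in a bounded domain at least) if the 'geometry' is circular; $r = R(z)$ defines the radial
boundary. In stress-divergence form, calling now $u = u_r$ the radial velocity,
$v = v_\theta$ the tangential/swirl velocity, and $w = u_z$ the axial velocity, we have, in component
form, with $\mathbf{u} = (u, v, w)$,
$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial r} - \frac{v^2}{r} + w\frac{\partial u}{\partial z} = \frac{1}{r}\frac{\partial}{\partial r}(r\sigma_r) - \frac{\sigma_\theta}{r} + \frac{\partial\sigma_{rz}}{\partial z}, \tag{3.7-24}$$
$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial r} + \frac{uv}{r} + w\frac{\partial v}{\partial z} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2\sigma_{r\theta}) + \frac{\partial\sigma_{z\theta}}{\partial z}, \tag{3.7-25}$$
$$\frac{\partial w}{\partial t} + u\frac{\partial w}{\partial r} + w\frac{\partial w}{\partial z} = \frac{1}{r}\frac{\partial}{\partial r}(r\sigma_{rz}) + \frac{\partial\sigma_z}{\partial z}, \tag{3.7-26}$$
and
$$\nabla\cdot\mathbf{u} = \frac{1}{r}\frac{\partial}{\partial r}(ru) + \frac{\partial w}{\partial z} = 0, \tag{3.7-27}$$
where
$$\sigma_r = -P + 2\nu\,\partial u/\partial r, \quad \sigma_\theta = -P + 2\nu u/r,$$
$$\sigma_{rz} = \nu\left(\frac{\partial u}{\partial z} + \frac{\partial w}{\partial r}\right), \quad \sigma_{r\theta} = \nu r\frac{\partial}{\partial r}(v/r),$$
$$\sigma_{z\theta} = \nu\,\partial v/\partial z \quad\text{and}\quad \sigma_z = -P + 2\nu\,\partial w/\partial z.$$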
Remarks:
(1) Simple (and much more common) axisymmetric flow has v = 0 in these equations,
no swirl—and of course (3.7-25) is omitted.
(2) Even when $v \neq 0$, the swirling flow is 2D, although there are three velocity
components and six components of the stress tensor.
(3) P is the kinematic pressure (P/p).
(4) See Stakgold (1979, p. 502) for some useful remarks on such coordinate systems.
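Just as the cartesian stress-divergence form can be reduced to the simpler 'Navier-Stokes'
form [cf. (3.2-1)], so too can $\nabla\cdot\mathbf{u} = 0$ be used to simplify the above equations, to
$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial r} - \frac{v^2}{r} + w\frac{\partial u}{\partial z} + \frac{\partial P}{\partial r} = \nu(\nabla^2)_r u, \tag{3.7-28}$$
$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial r} + \frac{uv}{r} + w\frac{\partial v}{\partial z} = \nu(\nabla^2)_\theta v, \tag{3.7-29}$$
and
$$\frac{\partial w}{\partial t} + u\frac{\partial w}{\partial r} + w\frac{\partial w}{\partial z} + \frac{\partial P}{\partial z} = \nu(\nabla^2)_z w, \tag{3.7-30}$$
where, because of the curvilinear coordinates, the 'Laplacian' is not quite the same in
each direction; we have (since $\partial/\partial\theta = 0$)
$$(\nabla^2)_r u = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) - \frac{u}{r^2} + \frac{\partial^2 u}{\partial z^2}, \tag{3.7-31}$$
$$(\nabla^2)_\theta v = (\nabla^2)_r v, \tag{3.7-32}$$
and
$$(\nabla^2)_z w = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial w}{\partial r}\right) + \frac{\partial^2 w}{\partial z^2}. \tag{3.7-33}$$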
While not appearing to be much simpler, there are fewer terms, and the calculations turn
out to be slightly 'cheaper.'
3.8 BOUNDARY CONDITIONS
The BC issue for the NS equations is larger, and even more confusing, than that of
how to write the NS equations. Also, the jury is still out regarding the full story on
even mathematically permissible BC's, let alone those that are 'best' in some sense.
The simplest and most common BC is well understood, however, although it is often
misnamed: for a viscous fluid, the BC at a solid wall (or object) is 'no-penetration and
no-slip'; i.e., the normal and tangential velocity components must agree with those at
the 'wall,' typically via $\mathbf{u} = \mathbf{w}$ on $\Gamma$, where $\mathbf{w}$ is specified (Dirichlet BC's; note too the
absence of BC's for the pressure for this case). This BC is often simply referred to as the
no-slip BC even though, as we will show, for an incompressible fluid the no-penetration
portion is often much more influential than the no-slip portion. Indeed, if $\nu = 0$, then
the $\mathbf{u} = \mathbf{w}$ BC must be changed to $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$; arbitrary slip is permitted, but not
'penetration.' If also $\mathbf{n}\cdot\mathbf{w} = 0$, then incompressibility requires that the flow must always
be parallel to the boundary, viscous or not. In numerical simulations, the above so-called
'specified velocity' (Dirichlet) BC is also very common for flow-through domains—at the
inlet. The outlet of these domains is another matter entirely, as the quest for better open
boundary conditions (OBC's) is a never-ending one. (Specified velocity, while legitimate,
often is not a good OBC.) In the remainder of this section we review the state of the art
('science'? It is evolving, albeit rather slowly, from the former to the latter) regarding
BC's for the several formulations presented earlier.
The only general statement that can be made with assurance is this: BC's are required
in both the normal and the tangential directions.
We also concur with Kreiss and Lorenz (1989): 'In computations, boundary conditions
cause most of the problems.'—although most of their book covers the 'easiest' case,
periodic BC's, a subject we defer until Section 3.13.9—for reasons explained there.
3.8.1 u-P Equations
a. Traction
In addition to specified velocity, another BC that is appropriate in some branches of fluid
mechanics (typically those dealing with free surfaces) is a force (per unit area) balance
BC, sometimes referred to as specified traction. This BC has already been presented
(Section 3.3.1), but we restate it here:
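$$\boldsymbol{\sigma}\cdot\mathbf{n} = \mu[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - P\mathbf{n} = \mathbf{F}, \tag{3.8-1}$$
where $\mathbf{F}$ (presumed given) is the applied force (traction) on the boundary, i.e., the force
applied by the boundary to the fluid. While (3.8-1) is valid as it stands for any shaped
surface, it can be simplified for planar surfaces (constant curvature) to
$$\mu[\mathbf{n}\cdot\nabla\mathbf{u} + \nabla(\mathbf{n}\cdot\mathbf{u})] - P\mathbf{n} = \mathbf{F}$$
or, equivalently, to
$$\mu\left(\frac{\partial\mathbf{u}}{\partial n} + \nabla u_n\right) - P\mathbf{n} = \mathbf{F}, \tag{3.8-2}$$
where $u_n = \mathbf{n}\cdot\mathbf{u}$ is the (outward) normal velocity.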
It may be useful to clarify the surface stress and traction vector with a sketch—in 2D
for simplicity; see Figure 3.8-1.
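[Figure: sketch of the traction vector $\mathbf{F}$ at a boundary point, with components $F_\tau = \boldsymbol{\tau}\cdot\boldsymbol{\sigma}\cdot\mathbf{n}$ and $F_n = \mathbf{n}\cdot\mathbf{F} = \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{n}$]
Fig. 3.8-1 Traction vector on the boundary.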
382 THE NAVIER-STOKES EQUATIONS
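Also,
$$\mathbf{F} = \begin{bmatrix} F_x \\ F_y \end{bmatrix}
= \begin{bmatrix} \mathbf{e}_1\cdot\boldsymbol{\sigma}\cdot\mathbf{n} \\ \mathbf{e}_2\cdot\boldsymbol{\sigma}\cdot\mathbf{n} \end{bmatrix}
= \begin{bmatrix} n_x(2\mu\,\partial u/\partial x - P) + n_y\mu(\partial u/\partial y + \partial v/\partial x) \\ n_x\mu(\partial u/\partial y + \partial v/\partial x) + n_y(2\mu\,\partial v/\partial y - P) \end{bmatrix} \tag{3.8-3}$$
and $F_n = n_xF_x + n_yF_y$, $F_\tau = \tau_xF_x + \tau_yF_y$ to give (using $\mathbf{n}\cdot\boldsymbol{\tau} = 0$)
$$\begin{bmatrix} F_n \\ F_\tau \end{bmatrix}
= \begin{bmatrix} \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{n} \\ \boldsymbol{\tau}\cdot\boldsymbol{\sigma}\cdot\mathbf{n} \end{bmatrix}
= \begin{bmatrix} 2\mu[n_x^2\,\partial u/\partial x + n_xn_y(\partial u/\partial y + \partial v/\partial x) + n_y^2\,\partial v/\partial y] - P \\ \mu[2\tau_xn_x\,\partial u/\partial x + (\tau_xn_y + \tau_yn_x)(\partial u/\partial y + \partial v/\partial x) + 2\tau_yn_y\,\partial v/\partial y] \end{bmatrix} \tag{3.8-4}$$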
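As a quick numerical cross-check of (3.8-3) and (3.8-4), the following minimal Python sketch evaluates the traction both ways and compares them. The function names, the argument layout, the sample data, and the tangent convention $\tau_x = n_y$, $\tau_y = -n_x$ are assumptions made for illustration only.

```python
import math

def traction_xy(mu, P, grad_u, n):
    """Cartesian traction (F_x, F_y) = sigma . n, written out as in (3.8-3).

    grad_u = (du/dx, du/dy, dv/dx, dv/dy); n = (n_x, n_y) is a unit normal.
    """
    ux, uy, vx, vy = grad_u
    nx, ny = n
    shear = mu * (uy + vx)
    Fx = nx * (2.0 * mu * ux - P) + ny * shear
    Fy = nx * shear + ny * (2.0 * mu * vy - P)
    return Fx, Fy

def traction_nt(mu, P, grad_u, n):
    """Normal and tangential traction (F_n, F_tau), written out as in (3.8-4)."""
    ux, uy, vx, vy = grad_u
    nx, ny = n
    tx, ty = ny, -nx  # assumed tangent convention: tau_x = n_y, tau_y = -n_x
    Fn = 2.0 * mu * (nx * nx * ux + nx * ny * (uy + vx) + ny * ny * vy) - P
    Ft = mu * (2.0 * tx * nx * ux + (tx * ny + ty * nx) * (uy + vx)
               + 2.0 * ty * ny * vy)
    return Fn, Ft

# Hypothetical data: viscosity, pressure, velocity gradient, and a unit normal.
mu, P = 2.0, 3.0
grad_u = (0.5, -1.0, 0.25, -0.5)   # chosen so that du/dx + dv/dy = 0
theta = 0.7
n = (math.cos(theta), math.sin(theta))

Fx, Fy = traction_xy(mu, P, grad_u, n)
Fn, Ft = traction_nt(mu, P, grad_u, n)

# Projecting (F_x, F_y) onto n and tau should reproduce (F_n, F_tau);
# both printed differences should be at round-off level.
print(abs(n[0] * Fx + n[1] * Fy - Fn), abs(n[1] * Fx - n[0] * Fy - Ft))
```

The pressure drops out of the tangential component because $\mathbf{n}\cdot\boldsymbol{\tau} = 0$, which is why $P$ appears only in $F_n$.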
b. Mixed
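At this point, it may be as well to point out that not all components of the full vector
equation need be applied simultaneously on $\Gamma$; e.g., the normal component of (3.8-1)
may be applied on the same portion of the boundary where the tangential velocity is
specified; i.e.,
$$\mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{n} = \mu\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - P = \mathbf{F}\cdot\mathbf{n} = F_n$$
and
$$\mathbf{n}\times\mathbf{u}\times\mathbf{n} = \mathbf{n}\times\mathbf{w}\times\mathbf{n} \tag{3.8-5}$$
may be applied simultaneously.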
Remarks:
(1) Note from (3.8-2) that for planar boundaries, the normal component of the viscous
force simplifies to $2\mu\,\partial u_n/\partial n$. If also $\mathbf{u} = 0$ on $\Gamma$, then $\nabla\cdot\mathbf{u} = 0 \Rightarrow \partial u_n/\partial n = 0$
there; i.e., the normal viscous stress vanishes on a stationary solid boundary, a
relationship that also follows from the alternative form of the traction vector,
$\mathbf{F} = -P\mathbf{n} + \mu(\boldsymbol{\omega}\times\mathbf{n})$ for $\mathbf{u} = 0$ on $\Gamma$; since $\boldsymbol{\omega}\times\mathbf{n}$ lies in the tangent plane, normal
viscous forces are absent. [See also, for example, Serrin (1959, p. 241) or Panton
(1984, 1996; p. 335).] {A related discussion, focusing on the identity
$\nabla\cdot[(\nabla\mathbf{u}) + (\nabla\mathbf{u})^T] = \nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u}) = \nabla^2\mathbf{u} = -\nabla\times\boldsymbol{\omega}$ away from $\Gamma$, is presented by Batchelor
(1967, p. 148).}
(2) The seemingly awkward representation of a vector in the tangent plane, (3.8-5)
compared with $\mathbf{n}\times\mathbf{u} = \mathbf{n}\times\mathbf{w}$, which also relates the two vectors in the tangent
plane, is required because the latter is not the proper projection of $\mathbf{u}$ and $\mathbf{w}$ onto
the tangent plane; the second cross-product returns the result, via a simple
rotation, to the proper projection. It could also be written (more awkwardly yet) as
$\mathbf{u} - \mathbf{n}(\mathbf{u}\cdot\mathbf{n}) = \mathbf{w} - \mathbf{n}(\mathbf{w}\cdot\mathbf{n})$, since any vector, say $\mathbf{v}$, can be expressed as the
additive decomposition, $\mathbf{v} = \mathbf{n}(\mathbf{v}\cdot\mathbf{n}) + \mathbf{n}\times\mathbf{v}\times\mathbf{n}$. See also Gunzburger (1989).
(3) For the coordinate system in Figure 3.8-1, the following relationship exists between
the magnitudes of the components of the unit vectors: $\tau_x = n_y$ and $\tau_y = -n_x$.
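The distinction drawn in Remark (2) between $\mathbf{n}\times\mathbf{v}$ and $\mathbf{n}\times\mathbf{v}\times\mathbf{n}$ can be illustrated with a small sketch. The helper names and the sample vectors below are hypothetical; the double cross product is evaluated as $(\mathbf{n}\times\mathbf{v})\times\mathbf{n}$, which for a unit normal equals $\mathbf{v} - \mathbf{n}(\mathbf{v}\cdot\mathbf{n})$.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def tangential_part(v, n):
    """n x v x n for a unit normal n, evaluated as (n x v) x n."""
    return cross(cross(n, v), n)

n = (0, 0, 1)   # unit normal (hypothetical data)
v = (1, 2, 3)   # arbitrary vector (hypothetical data)

v_tan = tangential_part(v, n)                                # (1, 2, 0)
v_alt = tuple(vi - ni * dot(v, n) for vi, ni in zip(v, n))   # v - n(v . n)
rotated = cross(n, v)                                        # (-2, 1, 0)
normal_part = tuple(ni * dot(v, n) for ni in n)              # (0, 0, 3)
```

Here `rotated` lies in the tangent plane but is turned by a quarter revolution relative to `v_tan`; only the second cross product recovers the projection itself, and `normal_part` plus `v_tan` reassembles `v`.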
In fact, borrowing partly from Gunzburger (1989), we show in Figure 3.8-2 a symbolic
representation of a 2D domain ($\Omega$) and its boundary that shows the many ways in which
the boundary ($\partial\Omega$) may be 'broken up' and BC's applied; in 3D, there are even more
combinations.
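[Figure 3.8-2: symbolic 2D domain whose boundary $\Gamma$ is divided into four parts, labeled as follows:
$F_n$ and $F_\tau$ specified on $\Gamma - \Gamma_n \cup \Gamma_\tau$ (least constrained);
$F_n$ and $u_\tau$ specified on $\Gamma_\tau - \Gamma_n \cap \Gamma_\tau$;
$u_n$ and $F_\tau$ specified on $\Gamma_n - \Gamma_n \cap \Gamma_\tau$;
$u_n$ and $u_\tau$ specified on $\Gamma_n \cap \Gamma_\tau$ (most constrained).]
Fig. 3.8-2 Boundary conditions in 2D.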
Remark:
It is also permissible, and often useful, to replace the above traction BC's with 'pseudo-
traction' BC's, obtained simply by omitting the term $(\nabla\mathbf{u})^T$ from (3.8-1) et seq. Then, of
course, $\mathbf{F}$ cannot be a true physical force; it is not the same $\mathbf{F}$ as in (3.8-1). It will turn
out that this latter form is more natural when the conventional ($\nabla^2$) form of the viscous
term is used when writing the NS equations and is often more useful as an OBC. It is also
noteworthy that some trained more thoroughly in solid mechanics than fluid mechanics
have difficulty accepting the very legitimacy of the pseudo-traction notion, let alone its
utility; we assert that it is both legitimate and useful.
c. Total momentum flux
If the total momentum flux is known at a boundary (typically inflow or outflow), or is
desired to be specified, the advective flux must be 'added' to the traction BC as follows
(see Section 3.4.1):
$$\mathbf{n}\cdot(\rho\mathbf{u}\mathbf{u} - \boldsymbol{\sigma}) = \mathbf{F}_m, \tag{3.8-6}$$
where the vector $\mathbf{F}_m$ (considered given; i.e., specified) is the normal component of the
total (local) flux of momentum on $\Gamma$, the sign change from that in (3.8-1) showing that $\mathbf{F}_m$
is more closely related to the force (including inertial) applied by the fluid to its boundary.
Inserting the stress tensor from (3.3-1) and (3.3-2) yields, at a planar boundary,
$$\rho u_n\mathbf{u} + P\mathbf{n} - \mu\left(\frac{\partial\mathbf{u}}{\partial n} + \nabla u_n\right) = \mathbf{F}_m, \tag{3.8-7}$$
as the specified momentum flux BC.
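The step from (3.8-6) to (3.8-7) can be sketched as follows, assuming a symmetric Newtonian stress tensor and the planar-boundary form of the traction given in (3.8-2):

$$\begin{aligned}
\mathbf{n}\cdot(\rho\mathbf{u}\mathbf{u} - \boldsymbol{\sigma})
&= \rho(\mathbf{n}\cdot\mathbf{u})\mathbf{u} - \boldsymbol{\sigma}\cdot\mathbf{n} \\
&= \rho u_n\mathbf{u} - \left[\mu\left(\frac{\partial\mathbf{u}}{\partial n} + \nabla u_n\right) - P\mathbf{n}\right] \\
&= \rho u_n\mathbf{u} + P\mathbf{n} - \mu\left(\frac{\partial\mathbf{u}}{\partial n} + \nabla u_n\right).
\end{aligned}$$

The stress contribution enters with the opposite sign to the traction in (3.8-1), which is the sign change noted in the text.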
d. Symmetry
Symmetry (of one kind or another) is sometimes present in fluid mechanical systems
(although more so in the research world, probably, than in the 'real' world—owing mostly
to turbulence in the latter), and can be used to significantly reduce the cost of a simulation
via the application of appropriate BC's at the symmetry plane (or line) and solving the
problem in the appropriate fraction of the full domain—typically ^, although there are
some cases where a factor of ^ is appropriate, and others in which ^ is appropriate. The
most common symmetry BC is typified by vanishing normal velocity and vanishing shear
stress:
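$$\mathbf{n}\cdot\mathbf{u} = 0 \tag{3.8-8}$$
and
$$\boldsymbol{\sigma}\cdot\mathbf{n} - \mathbf{n}(\mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{n}) = 0. \tag{3.8-9}$$
But the simplification shown below is worth pointing out for planar boundaries. Using
(3.3-2), (3.8-9) becomes $\mu(\nabla u_n + \partial\mathbf{u}/\partial n - 2\mathbf{n}\,\partial u_n/\partial n) = 0$, which itself can be simplified
via $\mathbf{u} = \mathbf{u}_\tau + \mathbf{n}u_n$, $\nabla = \nabla_\tau + \mathbf{n}(\partial/\partial n)$, where $\mathbf{u}_\tau$ is the component of $\mathbf{u}$ in the tangent plane
and $\nabla_\tau$ is the gradient operator in the tangent plane, to $\mu(\nabla_\tau u_n + \partial\mathbf{u}_\tau/\partial n) = 0$, which
further simplifies (finally) to
$$\partial\mathbf{u}_\tau/\partial n = 0, \tag{3.8-10}$$
since $u_n = 0$ on $\Gamma$ (and thus $\nabla_\tau u_n = 0$).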
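The intermediate algebra between (3.8-9) and (3.8-10) can be written out as follows; this is a sketch assuming a planar boundary, so that $\mathbf{n}$ is constant and commutes with the derivatives:

$$\begin{aligned}
\frac{\partial\mathbf{u}}{\partial n} &= \frac{\partial\mathbf{u}_\tau}{\partial n} + \mathbf{n}\frac{\partial u_n}{\partial n}, \qquad
\nabla u_n = \nabla_\tau u_n + \mathbf{n}\frac{\partial u_n}{\partial n}, \\
\nabla u_n + \frac{\partial\mathbf{u}}{\partial n} - 2\mathbf{n}\frac{\partial u_n}{\partial n}
&= \nabla_\tau u_n + \frac{\partial\mathbf{u}_\tau}{\partial n}.
\end{aligned}$$

The two normal-direction terms cancel exactly against $2\mathbf{n}\,\partial u_n/\partial n$, so the shear-free condition involves only tangent-plane quantities.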
Another type of 2D symmetry BC may also be of interest, at least in special situations:
$$F_n = 0 = 2\mu\frac{\partial u_n}{\partial n} - P \tag{3.8-11}$$
and
$$u_\tau = 0, \tag{3.8-12}$$
which, in the 'Navier-Stokes' form (change two to one) was successfully employed by
Silvester and Kechkar (1990) to solve one half of a Stokes flow in a box (the lid-driven
cavity problem) by applying the above symmetry BC at the vertical 'centerline' of the
box. Noting that $\partial u_\tau/\partial\tau = 0$ along the centerline gives $\partial u_n/\partial n = 0$, and thus $P = 0$, at
least in theory. Such a BC may therefore be limited to steady Stokes flow where such a
symmetry is known to exist.
e. Robin
The Robin BC, in a simpler form than (3.8-7), can also be applied to the NS equations,
and it does have some utility. It is useful in the tangential direction (in 2D for simplicity):
$$\mathbf{u}\cdot\mathbf{n} = w_n \quad\text{and}\quad u_\tau + \beta\frac{\partial u_\tau}{\partial n} = w_\tau, \tag{3.8-13}$$
where $w_n$, $\beta$, and $w_\tau$ are specified (data); we note that this will be an NBC of the
weak formulation only if the '$\nabla^2$-form' of the NS equations is used; i.e., (3.2-1). A typical
and simple application of this BC would have both $w_n$ and $w_\tau$ zero and $\beta < 0$ to describe
a non-penetrable boundary on which the (slip) velocity is proportional to the shear stress
(see, for example, Silliman and Scriven, 1980).
Professor L.E. Scriven and colleagues/students at the University of Minnesota have
used a Robin BC to better match known asymptotic ($x \to \infty$) solutions in a number of
situations in which such analytical results are available, and a sampling of them follows. In
Higgins (1982), a vector Robin OBC with the appropriate 'coupling coefficient' permitted
good results for viscocapillary film flow using shorter domains than those needed for
Dirichlet or Neumann OBC's. Bixler and Scriven (1987) extended this from 2D to 3D
and found it still beneficial. In both cases, the coupling coefficient was obtained by
examining the asymptotic behavior of a relevant eigenvalue problem. In Christodoulou
and Scriven (1989), a Robin BC was used at the inlet of a slide coater by matching the 2D
GFEM solution to an asymptotic, upstream 1D solution. [A related 1D/2D 'matching' BC
was successfully employed in Kistler and Scriven (1994)—although it was not a Robin
condition.] For effective use of the Robin BC in a continuous-flow, chemical reactor
system, see Novy et al. (1990), and for its use in porous media flow, see Novy et al.
(1991), from which we quote their bottom line: 'We believe that Robin-type boundary
conditions deserve more widespread use.' Finally, for a recent use of the momentum
flux OBC that admits velocity reversal, see Carvalho and Scriven (1995), or Carvalho's
Ph.D. Thesis (Department of Chemical Engineering and Materials Science, University of
Minnesota).
A so-called 'filtration BC,'
$$\mathbf{u}\cdot\mathbf{n} + \gamma(\mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{n} + F_n) = 0 \quad\text{and}\quad \mathbf{u}\cdot\boldsymbol{\tau} = 0, \tag{3.8-14}$$
where $\gamma \ge \gamma_0 > 0$, which is a Robin BC in the normal direction, was recently proposed
and briefly tested by Shopov and Iordanov (1994) to model permeable walls: 'the flux
through the boundary is proportional to the pressure drop across it.'
f. OBC's
Finally, we address additional aspects of the most difficult and not-yet-resolved issue of
open boundary conditions (OBC's). Although we will not derive the various BC's to be
stated below until Section 3.12 on weak formulations—wherein only some are derived
via NBC's—we state them now for the sake of completeness. Key to this issue for the
momentum equation vis-a-vis the scalar AD equation of the previous chapter is the tight
coupling (in the normal direction) between velocity and pressure: $\nabla\cdot\mathbf{u} = 0$ at the outlet
must be enforced/respected and causes significant extra troubles (although it may cause
fewer troubles than in the past when it was not fully appreciated how important $\nabla\cdot\mathbf{u} = 0$
in $\Omega$ is). To encompass one of the major difficulties, we address the case in which a
body force is present, often in the form of a buoyancy term in situations in which the
temperature field is intimately coupled with the velocity field, a situation that forms (a
portion of) the subject of Volume II.
The most common OBC is that given by (3.8-1) or its $\nabla^2$-counterpart, the latter, which
is simpler and often better, obtained by omitting the $(\nabla\mathbf{u})^T$ term. Expressed in 2D with
a straight boundary for simplicity, it expands to
$$2\mu\frac{\partial u_n}{\partial n} - P = \mathbf{n}\cdot\mathbf{F} = F_n \tag{3.8-15}$$
in the normal direction, and to
$$\mu\left(\frac{\partial u_\tau}{\partial n} + \frac{\partial u_n}{\partial\tau}\right) = \boldsymbol{\tau}\cdot\mathbf{F} = F_\tau \tag{3.8-16}$$
in the tangential direction. Three noteworthy aspects of these traction OBC's (because
they cause 'problems') are: (i) the proper values of $F_n$ and $F_\tau$ (the required data) are
usually not known, (ii) the pressure appears in the normal OBC, and (iii) the tangential
derivative of the normal velocity (a part of the shear stress) appears in the tangential OBC.
The pressure can sometimes be eliminated (special cases) by starting from a different form
of the momentum equation (which we do below), and the latter ($\partial u_n/\partial\tau$) by omitting the
$(\nabla\mathbf{u})^T$ term from (3.8-1), which generally requires its omission from the momentum
equation as well; i.e., use (3.2-1) rather than (3.6-1). But why should one want to change
from (3.8-15) and (3.8-16) as OBC's? As the first part of the answer to this germane
question, recall the passive and useful OBC discussed and demonstrated in the previous
chapter: $\partial(\;)/\partial n = 0$ usually works better than any of the alternatives. Thus, we presume
that it would (usually) be 'nice' to be able to use both $\partial u_n/\partial n = 0$ and $\partial u_\tau/\partial n = 0$ as
OBC's, neither of which appears to be realizable from (3.8-15) and (3.8-16). But it is
actually quite easy to obtain $\partial u_\tau/\partial n = 0$ as an OBC, since the NBC's associated with the
$\nabla^2$ form/conventional form of the NS equation (3.2-1), are just
$$\mu\frac{\partial u_n}{\partial n} - P = f_n \tag{3.8-17}$$
and
$$\mu\frac{\partial u_\tau}{\partial n} = f_\tau, \tag{3.8-18}$$
where we have switched to lower case $f$'s so they are not confused with the traction
force components in (3.8-15) and (3.8-16).
Remarks:
(1) If the true tractive force were known at the outflow boundary, then (3.8-15) and
(3.8-16) would be a useful and appropriate OBC—as is often the case in solid
mechanics, wherein the term 'outflow' is irrelevant. But the fact is that it is almost
never known in flow problems, thus opening the door for considering alternate
OBC's of which (3.8-17) and (3.8-18) are but one example—and a pretty useful
one at that, as demonstrated, for example, in Heywood et al. (1996). The 'problem'
that can occur using (3.8-15) and (3.8-16) as outflow BC's was also demonstrated
by them—and also, much earlier, by Leone and Gresho (1981), who also explained
the reason for the failure.
(2) There are even recent mathematical analyses of BC's of this type that are shown
to be useful from the point of view of stability; i.e., it can sometimes be shown
that these BC's can not have a destabilizing influence; see, for example, Naughton
(1986) and Hagstrom (1991), who also develop some theory that would apply to
FDM implementation of OBC's. [On this point it is interesting to note that these
BC's are completely natural (as NBC's) when the weak form is discretized via
GFEM.]
(3) If the Euler equations are considered, then it may seem (at first) appropriate to
relinquish BC's at the exit because the equations are 'hyperbolic.' But this is wrong; the
Euler equations are mixed elliptic and hyperbolic, and there is still an implied PPE.
One legitimate OBC follows from (3.8-15) and (3.8-16), with $F_\tau = 0$ or (3.8-17)
and (3.8-18) with $f_\tau = 0$, by simply setting $\mu = 0$; the tangential BC simply
disappears (the $u_\tau$ equation is hyperbolic with $\partial P/\partial\tau$ acting as a given source term), and
the normal BC becomes $P = -f_n$, which is also an appropriate (and Dirichlet) BC
(weakly applied, as an NBC) for the PPE, thus further strengthening the argument
that the pressure gradient must be integrated by parts. (This is how one can 'specify'
$P$ yet not sacrifice $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$.)
(4) A look ahead to Figure 3.13-22 in Section 3.13.5e may be helpful, for OBC's and
other BC's.
Exercise for the reader: Using (3.8-17), show, for $v = 0$ at $y = 0$ and $y = H$ at the outlet, that the average pressure is
constrained; it must be $-(1/H)\int_0^H f_n(y)\,dy$. Hint: use $\nabla\cdot\mathbf{u} = 0$.
Remarks:
(1) It is often the case in practice that $f_n = 0$ (but not $F_\tau = 0$), and it follows that the
average pressure is zero at the outlet.
(2) The same reasoning for the OBC $\partial u_n/\partial n = 0$, often seen in FDM papers, leads to a
different, and stronger, constraint: $u_\tau = 0$. (See too the discussion related to (3.8-30)
below.)
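The average-pressure constraint of the exercise above can be checked numerically with a minimal sketch. Everything in it is an illustrative assumption: the outlet is taken at constant $x$ with $\partial u_n/\partial n = \partial u/\partial x$, the data $f_n(y) = 1 + y$ is invented, and the outlet field is a hypothetical divergence-free one with $v = -\sin^2(\pi y/H)$, which vanishes at $y = 0$ and $y = H$.

```python
import math

def trapezoid_average(func, H, m=2000):
    """Average of func(y) over 0 <= y <= H by the composite trapezoidal rule."""
    h = H / m
    total = 0.5 * (func(0.0) + func(H))
    for j in range(1, m):
        total += func(j * h)
    return total * h / H

# Hypothetical outlet data (names and values are illustrative assumptions).
mu, H = 0.1, 2.0

def f_n(y):
    """Specified NBC data in (3.8-17)."""
    return 1.0 + y

def dudx(y):
    """du/dx = -dv/dy for v = -sin^2(pi y / H), so that div u = 0."""
    return (math.pi / H) * math.sin(2.0 * math.pi * y / H)

def pressure(y):
    """Outlet pressure implied by (3.8-17): P = mu du/dx - f_n."""
    return mu * dudx(y) - f_n(y)

avg_P = trapezoid_average(pressure, H)
avg_f = trapezoid_average(f_n, H)
print(avg_P, -avg_f)   # the two values should agree to round-off
```

The viscous term integrates to zero because $\int_0^H \partial u/\partial x\,dy = -[v]_0^H = 0$, so the average of $P$ is fixed by the data alone.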
Now it is clear that simply setting $f_\tau = 0$ achieves the desired passive OBC for the
tangential velocity. The normal component, though, is abnormal: only if $P = -f_n$ will
we obtain $\partial u_n/\partial n = 0$, a goal that is often not so easy to attain since it presumes some
(too much) a priori knowledge regarding the solution. Note that if $P + f_n$ is 'large,' then
so too is $\mu\,\partial u_n/\partial n$ and, from $\nabla\cdot\mathbf{u} = 0$, so too is $\partial u_\tau/\partial\tau$; such artificially large values of
velocity derivatives can cause a large 'distortion' near the outlet, as we will demonstrate
in Volume II for a Boussinesq fluid.
But now let us return to the momentum equation itself, with a body force,
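$$\rho\frac{D\mathbf{u}}{Dt} + \nabla P = \mu\nabla^2\mathbf{u} + \rho\mathbf{g}. \tag{3.8-19}$$
Sticking to 2D again for simplicity, and even to the simpler x-y cartesian form, the
component equations are
$$\rho\frac{Du}{Dt} + \frac{\partial P}{\partial x} = \mu\nabla^2u + \rho g_x \tag{3.8-20}$$
and
$$\rho\frac{Dv}{Dt} + \frac{\partial P}{\partial y} = \mu\nabla^2v + \rho g_y, \tag{3.8-21}$$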
where we shall also assume for simplicity that the outflow boundary is at x = L and is
parallel to the y-axis. Thus, (3.8-20) is the normal component of the momentum equation,
and (3.8-21) is the tangential component. Next, we introduce the 'experimental fact' that
the pressure field is usually largely dominated by the hydrostatic 'component'; i.e., if
$P = P_H + \delta P$, where $P_H$ (by definition) satisfies $\nabla P_H = \rho\mathbf{g}$, then it is often true that
$\delta P \ll P_H$. (The 'Lagrange multiplier portion' of $P$, required to enforce $\nabla\cdot\mathbf{u} = 0$, is
small, but still crucial!) Returning now to (3.8-17), which becomes
$$\mu\frac{\partial u}{\partial x} = P + f_n = P_H + f_n + \delta P, \tag{3.8-22}$$
we see that we could make $\partial u/\partial x$ 'small' if we could cause $P_H + f_n = 0$, which we
can do by using the tangential component of the hydrostatic equation, $\partial P_H/\partial y = \rho g_y$, to
compute a 'proper' value of $f_n$ to use in the normal component, via
$$f_n(y, t) = -P_H = -\rho\int_0^y g_y(L, y', t)\,dy', \tag{3.8-23}$$
where the constant of integration is taken to be zero (usually permissible, at least in 2D).
While inconvenient at best, such a procedure can and has been made to work; i.e., it can
give good results in cases where the previously discussed OBC's do not (see, for example,
Leone et al., 1983; Lee and Leone, 1988; and Leone and Lee, 1989). This hydrostatic
OBC then reads (at $x = L$)
$$\mu\frac{\partial u}{\partial x} - P = -\rho\int_0^y g_y(L, y', t)\,dy', \tag{3.8-24}$$
which is relatively easy to implement—it is also more 'useful' in a time-dependent
situation (at least if a good initial pressure is available) than for the steady equations, which also
need a good initial guess. (A poor initial guess causes a very slow, linear convergence;
see Leone, 1980.)
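The data for the hydrostatic OBC, (3.8-23), can be assembled by a cumulative quadrature along the outlet. The following is a minimal sketch, assuming nodal values of $g_y$ are available on the outlet column at one time level and that a trapezoidal rule is accurate enough; the function name and the sample data are hypothetical.

```python
def hydrostatic_fn(rho, gy_outlet, y_nodes):
    """Nodal f_n(y_j) = -rho * integral_0^{y_j} g_y(L, y') dy' at one time level.

    gy_outlet[j] is g_y at the outlet node y_nodes[j]; y_nodes must be
    increasing and start at y = 0, where the constant of integration is zero.
    """
    f_n = [0.0]
    integral = 0.0
    for j in range(1, len(y_nodes)):
        dy = y_nodes[j] - y_nodes[j - 1]
        integral += 0.5 * (gy_outlet[j] + gy_outlet[j - 1]) * dy  # trapezoid
        f_n.append(-rho * integral)
    return f_n

# Hypothetical outlet column with uniform downward gravity (SI units).
y_nodes = [0.0, 0.25, 0.5, 1.0]
gy = [-9.81] * len(y_nodes)
f_n = hydrostatic_fn(1000.0, gy, y_nodes)
# For uniform g_y this reduces to f_n = -rho * g_y * y at each node.
```

In a time-dependent buoyancy-driven calculation, where $g_y$ at the outlet changes with the temperature field, this quadrature would presumably be repeated at each step before the NBC data are applied.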
Another method that has proven useful in dealing with certain OBC problems when g
points along only one of the coordinate directions is one that is often employed in
geophysical fluid mechanics: modify the vertical component of the momentum equation as
follows (again for the 2D cartesian case, and with $\mathbf{g} = \mathbf{e}_y g$):
$$\rho\frac{Dv}{Dt} + \frac{\partial P}{\partial y} = \mu\nabla^2v + \rho g - f(y, t), \tag{3.8-25}$$
where it is important that the newly introduced 'forcing function,' $f$, be independent of
$x$. This constraint permits a modification of the pressure as follows: $\nabla P \to \nabla P - \mathbf{f}(y, t)$;
i.e., since $\mathbf{f} = \mathbf{e}_y f(y, t)$ is curl-free, it can be expressed as the gradient of a scalar. Next,
take $f(y, t)$ to be special: $f(y, t) = \rho g(L, y, t)$, the value of the original forcing function
at the outlet plane. Thus we have
$$\rho\frac{D\mathbf{u}}{Dt} + \nabla P = \mu\nabla^2\mathbf{u} + \rho[\mathbf{g}(x, y, t) - \mathbf{g}(L, y, t)], \tag{3.8-26}$$
and the OBC of $\mu\,\partial u/\partial x = P + f_n$ will now give $\partial u/\partial x \approx 0$ simply by setting $f_n = 0$,
because $P = P_H + \delta P$ now has $P_H = 0$ at the outlet (and $\delta P$ is, still, small); i.e., $\nabla P_H =
\rho[\mathbf{g}(x, y, t) - \mathbf{g}(L, y, t)]$ gives $\nabla P_H = 0$ at $x = L$. This OBC is of course much easier
to implement, the extra cost now being that associated with the second 'source' term.
We shall return to this BC, and demonstrate it (as well as others) in the chapter on
'Boussinesq' fluids in Volume II.
An interesting OBC situation arises by considering the rotational form of the NS
equation (3.6-2). The form of (3.8-17) and (3.8-18) that is relevant here is
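$$\mathrm{Re}^{-1}\frac{\partial u_n}{\partial n} - P_T = f_n \tag{3.8-27}$$
and
$$\mathrm{Re}^{-1}\frac{\partial u_\tau}{\partial n} = f_\tau, \tag{3.8-28}$$
where $P_T = P + \tfrac{1}{2}\mathbf{u}\cdot\mathbf{u}$ is the 'Bernoulli' pressure. Consider now the following interesting
situation: $\mathrm{Re} \gg 1$ (common) and the flow is nearly irrotational ($\boldsymbol{\omega} = 0$, less common),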
for which (3.6-2) simplifies to $\partial\mathbf{u}/\partial t + \nabla P_T = 0$, $\nabla\cdot\mathbf{u} = 0$. Setting $f_n$ and $f_\tau$ to zero in
(3.8-27) and (3.8-28) for this case yields $P_T = 0$ and $\partial u_\tau/\partial n = 0$. But $P_T = 0 = P + \tfrac{1}{2}q^2$
is the Bernoulli equation for potential flow; i.e., it is just the right BC for the case
postulated. It appears that only this combination of equation 'form' and OBC (as an NBC
for the weak form, to be derived later) would work well in this case, or in its limiting form
via the following problem: solve a potential flow problem in a 'flow-through' domain,
prescribe the resulting velocity as an initial condition, set $\nu = 0$, and consider the
time-evolution of the resulting Euler equations. It seems that only the above formulation could
even hope to hold the IC as a steady solution (i.e., it would give $\partial\mathbf{u}/\partial t = 0$); any other
OBC will violate the Bernoulli equation, $P + \tfrac{1}{2}q^2 = 0$, by changing the pressure. Then,
according to Kelvin's theorem, this no-longer-potential flow must introduce vorticity at
the inlet region; a bad BC at the outlet will cause an 'error' at the inlet!
Next we mention three cases wherein some experimentally-inspired OBC's that have
proven useful in practice but are difficult or impossible to analyze (rationalize?) have
been demonstrated to work well—in some sense.
In Taylor et al. (1985), an iterative 'bootstrapping' technique was employed in which
the first guess was the 'conventional' GFEM homogeneous NBC as OBC: zero tractions
[$F_n = F_\tau = 0$ in (3.8-15) and (3.8-16)]. Then, using the solution obtained with these
BC's, update the OBC to a non-zero traction NBC using the values of $\mathbf{u}$ and $P$ just
computed to update the 'imposed' tractions. Iterate until convergence. They also applied
it to time-dependent flows by using the previous time-step values to update the tractions
for the next step. Neither this method nor the one to be discussed next has been analyzed
to see what actual PDE BC's these iterative methods converge to.
In the FIDAP code (used for many of the examples in this book), the normal momentum
equation OBC is treated as follows, and is actually very similar to the method of Taylor
et al. just discussed, but simpler. That is, ignore the viscous part of $F_n$ (or $f_n$) and
update the RHS (per iteration or per time step) simply by the approximation $F_n = -P$
(and $f_n = -P$ for pseudo-traction).
The third method has already been introduced and discussed in Chapter 2, at the end
of Section 2.4.1. It is simply this: for nodes on an open boundary, do not integrate by
parts the viscous terms. What the 'free' BC lacks in understanding, it seems to make up
for in performance. We recommend it as probably being the best of the three, and again
[as we did in Sani and Gresho (1994)] implore the mathematics community to (further)
analyze it! It has even been successfully applied as an inlet BC by Carvalho and Scriven
(1996).
To conclude the discussion, recall that in Section 2.6.2c of Chapter 2 we show how a
hard (silly?) Dirichlet OBC can be accommodated for advection-dominated flows without
generating wiggle signals 'simply' by using a fine-enough (graded) mesh at the outlet
so that the BC-induced BL is at least marginally resolved—see Figure 2.6-53 vis-a-vis
Figure 2.6-52. Here we briefly revisit this case for the NSE's, because the same 'solution'
works. Figure 3.8-3 (thanks to S. Chan) shows a snapshot of the OBC region of a vortex
shedding simulation behind a square cylinder at Re = 100 using the hard (silly) OBC
of u = 1, v = 0. Because the OBL is resolved via a graded mesh, the vector field goes
smoothly from what it 'wants' to be to what we have forced it to be. The same calculation
without OBC resolution generated huge wiggles (not shown). The homogeneous OBC's
of (3.8-17) with $f_n = 0$ and (3.8-18) with $f_\tau = 0$ give very nice results, with no wiggles,
on the coarse mesh shown just upstream of the outlet.
Fig. 3.8-3 Snapshot of vortex shedding with Dirichlet BC at outlet.
g. More OBC's
Other OBC's are possible, and we list some more below—but point out that they are thus
far more theoretical than practical; i.e., they are legitimate BC's but too few have tested
them numerically—on a range of problems. In fact, it is probably safe/fair to say that
they were not specifically 'designed' (derived) to solve the 'OBC problem'—rather, just
to show other legitimate and potentially useful BC's for the NS equations that are based
on the fact that well-posed weak formulations suggest legitimate BC's for the strong
form via the associated natural boundary conditions. (Our opinion at this time is more
pragmatic in the following sense: if they are not useful as OBC's, then where would
they be useful?) We present six avant-garde BC's, listed below, some of which we shall
derive when we present the various weak formulations in Section 3.12. For further details
regarding the others, consult the original references, which we also list below.
1. The tangential velocity and the pressure can be specified. This BC could be a useful OBC
if a parallel flow (or nearly so) is known to exist, as it does away with the awkward coupling
between $P$ and $\partial u_n/\partial n$; i.e., set $P = 0$ and $\mathbf{n}\times\mathbf{u}\times\mathbf{n} = \mathbf{u}_\tau = 0$ at the outlet. [O. Pironneau
(personal communication) derived this OBC to solve, for example, the problem of a
bifurcating flow, such as one pipe splitting into two, wherein different downstream pressures are
presumed known. Our belief, however, is that this type of problem could be solved nearly
as well, and more easily, by using (3.8-17) and (3.8-18) and specifying $f_n$ for the desired
pressure at the two outlets—at least in the absence of body force terms.]
2. The tangential vorticity and the pressure can be specified, the formulation of which we shall
derive later (Section 3.12.2). This too could be useful as an OBC if it can be assumed that the
flow is irrotational at the exit, a situation that is often realized in aerodynamics wherein only
the wake region contains significant vorticity. Another application might be to channel flow
wherein a linear variation in vorticity across the channel is often realized (Poiseuille flow).
3. The normal velocity and tangential vorticity can be specified. The utility of this BC as
an OBC is doubtful, we believe, since it is rarely a good idea to use Dirichlet data at an
outflow point. (It is a wiggle-maker in general—as discussed above).
4. The normal velocity, normal vorticity, and normal component of the curl of the vorticity,
$\mathbf{n}\cdot(\nabla\times\boldsymbol{\omega})$, can be specified. Really; see Girault (1988b). If the normal curl of the vorticity
does not appeal to you, then consider the next one:
5. Specified normal velocity, normal vorticity, and normal pressure gradient, where the
last of these is not at your discretion; it must be $\partial P/\partial n = \rho\,\mathbf{n}\cdot\mathbf{g}$, where $\mathbf{g}$ is the applied
body force/acceleration.
6. Finally, it is possible to specify the normal velocity, the tangential vorticity, and (a
particular) normal pressure gradient. This BC is more ticklish yet, as it requires
satisfaction of the following 'constraint' on the data: $\partial P/\partial n = \rho\,\mathbf{n}\cdot\mathbf{g} - \nabla_s\cdot\boldsymbol{\omega}_\tau$, where $\boldsymbol{\omega}_\tau$ is
the specified tangential vorticity and $\nabla_s$ is the surface gradient operator. Again, mainly
because of the need to specify the normal velocity, we see little utility in this BC as
an OBC.
Further details on the above BC's can be found in the following references, where
OP1, 2, 3 = Pironneau (1986, 1987, 1989), HF = Hughes and Franca (1987), VG1, 2 =
Girault (1988a, b), and MG = Gunzburger (1989): for 1, see all of the above; for 2, see
HF and MG; for 3, see OP2, HF, VG1,2, and MG; for 4, see VG2; for 5, see OP3, who
attributes it to Girault (1988a); and for 6, see OP1, 3.
A final remark on BC's: Periodic BC's can be utilized, but their discussion is deferred
until later (Section 3.13.9) for reasons that will be explained there.
h. Penalty method OBC's
What about BC's in the penalty approximation? Since there is no pressure—at least
explicitly—in (3.5-10), it would seem that there might even be a reward connected with
the penalty in that the annoying pressure need not show up in the BC's. The logical
(and indeed, proper and legitimate) BC's for (3.5-10) are simply Robin conditions, which
include the subsets of Dirichlet and Neumann. This would then seem to allow, for a
significant example, a return to the desired passive OBC of $\partial u_n/\partial n = 0$. Does it? Unfortunately,
no; the glimmer of hope for a free lunch is dashed by the realization that the penalty term,
$\lambda\nabla(\nabla\cdot\mathbf{u})$, also shows up in the OBC. For example, if (3.8-15) would have been our OBC
in the u-P formulation, then the penalty approximation to same is (necessarily) obtained
by replacing $P$ by $-\lambda\nabla\cdot\mathbf{u}$ to obtain
$$2\mu\frac{\partial u_n}{\partial n} + \lambda\nabla\cdot\mathbf{u} = F_n; \tag{3.8-29}$$
the penalty version of the OBC is (must be) also penalized in order to keep $\nabla\cdot\mathbf{u}$ small
at the outflow boundary. The pressure is still present, only in a disguised form. Thus, in
practice when using the penalty method, it is usually best to 'think u-P' even though the
actual equations do not contain the pressure.
i. Ill-posed OBC's
Having exposed (overexposed?) the reader to quite a variety of potentially confusing BC
options for the u-P form of the NS equations, we wish now to point out that only a few
of these will actually be carried through the rest of the text. Do not despair. But do be
aware of the fact that the 'best' open BC for these equations is still an open issue.
We conclude this OBC discussion by mentioning some ill-posed OBC's for the normal
velocity that have nevertheless been proposed and used (somehow) in CFD:
$$\partial u_n/\partial n = 0, \tag{3.8-30}$$
$$\partial u_n/\partial n = 0 \ \text{and, at just one point,}\ P = 0, \tag{3.8-31}$$
$$\partial u_n/\partial t + V\,\partial u_n/\partial n = 0 \tag{3.8-32}$$
where $V$ is user-specified, and
$$\partial u_n/\partial n = 0 \ \text{and}\ P = 0. \tag{3.8-33}$$
We mention now and analyze later (Section 3.12.5) the reasons for the ill-posedness: the
first three BC's are ill-posed because they are under-specified, leading to non-uniqueness
(an infinity of solutions), and the last because it is over-specified (no solution exists).
If (3.8-32) were changed to
$$\nu(\partial u_n/\partial t + V\,\partial u_n/\partial n) = VP, \tag{3.8-34}$$
then the ill-posedness would, we assert, be vanquished. As to the utility of this OBC, we
can only conjecture that it might work well for problems with no body force, but perhaps
not otherwise, and perhaps never better than $\nu\,\partial u_n/\partial n = P$, to which it reverts at a steady
state. Its implementation would follow along either of the two lines presented for $\partial T/\partial t +
V\,\partial T/\partial n = 0$ of Chapter 2 (Section 2.4). For further discussion on some OBC issues, see
the paper by Sani and Gresho (1994) that summarizes two OBC mini symposia (see too
Gresho, 1991c, for more on these symposia) and also shows some 'fuzzy' BC's—those
that deliver useful results on 'normal' grids but that are ill-posed in the continuous limit.
See too the OBC benchmark solutions to four test problems, in Volume 11, No. 7 of the
International Journal for Numerical Methods in Fluids (1990).
By way of introduction to BC's for derived equations in the next two sections, we make
the following general (and generally obvious) remark: the BC's must also be 'derived.'
So, if BC's for u-P are still vague/obscure in any way, those for derived equations must
be vaguer/'obscurer' yet.
3.8.2 The Pressure Poisson Equation and Pressure Boundary Conditions
One of the most confusing and misunderstood aspects of incompressible flow has been that
of 'boundary conditions for the pressure.' While it seems to be better understood that one
role of the pressure is to keep the flow divergence-free, it has not—until recently—been
clear just how this phenomenon occurs mathematically. While it is well known that the
PPE (pressure Poisson equation) is an alternate way to state that $\nabla\cdot\mathbf{u} = 0$ in $\Omega$, it is
much less known that the BC's for the PPE are intimately related to the simple fact that
$\nabla\cdot\mathbf{u}$ must also vanish on the boundary of $\Omega$; and even when this is known (or believed),
the translation of $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ to actual and legitimate BC's for the elliptic PPE has
usually not been obvious. Indeed, the very meaning of $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ is somewhat
ambiguous; e.g., consider a straight boundary in 2D: is it simply $\partial u_n/\partial n + \partial u_\tau/\partial\tau = 0$
or could it be something else? While the full answer must await the following section on
initial conditions, we provide an introduction to it now: Except for one very important
special case, the vanishing of div $\mathbf{u}$ can be expressed as $\partial u_n/\partial n + \partial u_\tau/\partial\tau = 0$. The
special case is 'start-up': $t = 0$. The statement 'div $\mathbf{u} = 0$ on $\Gamma$ at $t = 0$' translates, in
practice, to $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$, where $\mathbf{u}_0$ is the initial velocity in $\Omega$ and $\mathbf{w}_0$ is the specified
velocity (at $t = 0$) on $\Gamma$. It is quite permissible, for example, to have $\partial u_n/\partial n = 0$ and
$\partial u_\tau/\partial\tau \ne 0$ at $t = 0$ and yet satisfy $u_{0n} = w_{0n}$, giving the result that $\mathbf{u}_0$ is divergence-free
and legitimate. (For example, the lid-driven cavity could be driven by a 'tent' function
for $u_\tau$, with $\mathbf{u}_0 = 0$ in $\Omega$.) For $t > 0$, the same requirement, $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$, is equivalent
to $\partial u_n/\partial n + \partial u_\tau/\partial\tau = 0$, because any discontinuities in $u_\tau$ (the only velocity component
permitted to be discontinuous at $t = 0$) are 'smoothed by viscosity' for $t > 0$.
So what does this have to do with the pressure and BC's for the PPE? The answer is
this: if and only if the PPE BC's are always derived from the proper statement/realization
of $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ can they be guaranteed to be correct. Here we define 'correct' as
follows: the correct BC's on the PPE will ensure that $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ for all time and
that the normal velocity will be continuous [$\lim_{x\to 0}\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ for all $t$, where $x$ is
here construed to be the distance from $\Gamma$ into $\Omega$ along the direction of $\mathbf{n}$, where $\mathbf{n}$ is
presumed to be uniquely defined on $\Gamma$].
So how does one find these proper BC's for the PPE? After all, it is well known
that a Poisson equation can be solved with (in the most general case) any Robin BC:
$\alpha P + \beta\,\partial P/\partial n = \gamma$ where $\alpha$, $\beta$, and $\gamma$ are in general completely arbitrary (unless $\alpha = 0$ and
$\beta \ne 0$, the Neumann case, in which the Neumann compatibility conditions between the
RHS of the Poisson equation and the boundary data must be respected; i.e., if $\nabla^2P = f$
in $\Omega$, then the divergence theorem tells us that $\int_\Omega f = \int_\Gamma \gamma/\beta$ in order for the problem
to be well-posed). The other special case, of Dirichlet BC's, is realized via $\beta = 0$, $\alpha \ne 0$.
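The Neumann compatibility condition is easy to test before attempting a solve. The sketch below does so for a 1D analogue, $P'' = f$ on $[a, b]$, where the boundary integral reduces to the sum of the two outward normal derivatives; the function name, the quadrature, and the test data are assumptions for illustration.

```python
import math

def neumann_residual(f, dPdn_left, dPdn_right, a=0.0, b=1.0, m=1000):
    """Residual of the 1D compatibility condition for P'' = f on [a, b].

    Returns  int_a^b f dx - (dP/dn at x=a + dP/dn at x=b), where dP/dn is the
    outward normal derivative (-P' at x=a, +P' at x=b). The integral uses the
    composite trapezoidal rule; a zero residual means the data are compatible.
    """
    h = (b - a) / m
    integral = 0.5 * (f(a) + f(b))
    for j in range(1, m):
        integral += f(a + j * h)
    integral *= h
    return integral - (dPdn_left + dPdn_right)

# P = cos(pi x) has P' = 0 at x = 0 and x = 1, so zero Neumann data is
# compatible with f = P'' = -pi^2 cos(pi x).
ok = neumann_residual(lambda x: -math.pi ** 2 * math.cos(math.pi * x), 0.0, 0.0)

# f = 1 with the same zero Neumann data violates the condition (residual 1).
bad = neumann_residual(lambda x: 1.0, 0.0, 0.0)
```

A nonzero residual signals that no solution exists for the given data, which is the discrete analogue of the ill-posedness described in the text.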
The way in which we 'learned how' to apply $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ is detailed in Gresho and
Sani (1987, 1988), and comprises basically two simple steps: (i) define any consistent
discrete approximation to the equations $\mathbf{a} + \nabla P = \mathbf{f}$ and $\nabla\cdot\mathbf{a} = 0$, which $\Rightarrow \nabla^2P = \nabla\cdot\mathbf{f}$
with $\mathbf{f}$ given and with any of the legitimate BC's discussed above for the velocity applied
(differentiated in time) to the vector $\mathbf{a}$ (acceleration); and (ii) solve the discrete vector
equations for all values of $\mathbf{a}$ not specified by the BC's and insert the result into the discrete
form of $\nabla\cdot\mathbf{a} = 0$. The result (when the grid size shrinks to zero and node points at the
boundary are examined) will be the proper BC for the PPE. Some of these operations were
performed by Gresho and Sani, and later in this chapter we shall perform many more. [This
procedure was quickly picked up by Veldman (1990) who, inspired by the above paper,
published another called 'Missing Boundary Conditions? Discretize First, Substitute Next,
and Combine Later,' in which other examples are also included. Unfortunately, Veldman
did not perform the final' step: analyze your final discrete equations to discover the true
BC's inherited by the higher-order PDE's.]
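The two steps can be tried out on a deliberately tiny model. The sketch below is only an illustration under assumptions of our own: a 1D staggered finite-difference grid (not the finite element discretizations used elsewhere in the book), a hypothetical function name `discrete_ppe_1d`, and made-up data. It eliminates the unknown face accelerations from the discrete divergence equation; the rows next to the boundary then contain the specified boundary acceleration, which is the discrete counterpart of the inherited Neumann BC.

```python
import numpy as np

def discrete_ppe_1d(f_faces, a_left, a_right, h):
    """Staggered 1D grid: a and f live on N+1 faces, P on N cell centers.

    Step (i):  a[j] + (P[j] - P[j-1]) / h = f[j] on interior faces j = 1..N-1,
               with a[0] = a_left and a[N] = a_right supplied by the BC's.
    Step (ii): insert a into the discrete divergence (a[i+1] - a[i]) / h = 0.
    """
    f_faces = np.asarray(f_faces, dtype=float)
    n = len(f_faces) - 1                      # number of cells
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    for i in range(n):
        if i + 1 < n:                         # right face is interior: eliminate a
            A[i, i] += 1.0 / h**2
            A[i, i + 1] -= 1.0 / h**2
            rhs[i] -= f_faces[i + 1] / h
        else:                                 # right face carries the BC value
            rhs[i] -= a_right / h
        if i > 0:                             # left face is interior: eliminate a
            A[i, i] += 1.0 / h**2
            A[i, i - 1] -= 1.0 / h**2
            rhs[i] += f_faces[i] / h
        else:                                 # left face carries the BC value
            rhs[i] += a_left / h
    # A is singular (P is fixed only up to a constant); lstsq returns one solution.
    P = np.linalg.lstsq(A, rhs, rcond=None)[0]
    a = np.empty(n + 1)
    a[0], a[n] = a_left, a_right
    a[1:n] = f_faces[1:n] - (P[1:] - P[:-1]) / h
    return P, a

n_cells = 16
h = 1.0 / n_cells
x_faces = np.linspace(0.0, 1.0, n_cells + 1)
# Hypothetical data; a_left == a_right is the 1D solvability condition.
P, a = discrete_ppe_1d(np.sin(2.0 * np.pi * x_faces), 0.3, 0.3, h)
max_div = np.max(np.abs(np.diff(a))) / h      # discrete divergence of a
```

In this toy problem the first row of the system reads $(P_1 - P_0)/h = f_1 - a_{\text{left}}$, i.e., the pressure gradient next to the wall balances the forcing minus the specified boundary acceleration, without any pressure BC having been imposed by hand.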
Below we summarize the results of such an activity and present the proper pressure
BC's for the PPE, after noting that techniques can be devised for explicitly enforcing
$\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ as a BC for the PPE [Canuto et al. (1988a, p. 404 ff), Schuller (1990)], which
BC is equivalent to those presented below. [See also Gresho and Sani (1987, 1988).]
If the normal velocity is specified on $\Gamma$, the proper PPE BC, from Gresho and Sani, is
the Neumann BC obtained simply by applying the normal component of the momentum
equation on $\Gamma$. Thus, recalling (3.5-3), the PPE
    $\nabla^2 P = \rho\,\nabla\cdot(\mathbf{g} + \nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u})$,    (3.8-35)
inherits the Neumann BC,
    $\partial P/\partial n = \rho\,\mathbf{n}\cdot(\mathbf{g} + \nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)$,    (3.8-36)
where, of course, the last term on the RHS of (3.8-36) is replaced by the given data,
$\rho\,\mathbf{n}\cdot\partial\mathbf{w}/\partial t$. [For all $t \ge 0$, the alternate realization of $\nabla\cdot\mathbf{u} = 0$ in $\Omega$ and $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ on
$\Gamma$ (i.e., of $\nabla\cdot\mathbf{u} = 0$ in $\bar{\Omega}$) is just (3.8-35) and (3.8-36), although there are some pitfalls
here that will be described later.]
Remarks:
(1) The pressure from (3.8-35) and (3.8-36) is obtainable only up to an arbitrary additive
constant (the so-called hydrostatic pressure mode, which 'constant' could actually
be an arbitrary function of time). It is thus permissible to set the pressure at any one
point in $\Omega$, at any value, to resolve the ambiguity.
(2) The Neumann compatibility/solvability condition is automatically satisfied in the
(only relevant) case wherein $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ on the whole of $\Gamma$ when, in the only case
of interest, the u-P equations from which the PPE + BC's are derived are well-posed.
The details and proof will be presented later (Section 3.9.2).
(3) For $t > 0$, it turns out (see Gresho and Sani, 1987, 1988) that the tangential
components of the momentum equation also apply on $\Gamma$, and could be used in
principle, as PPE BC's; i.e., for $t > 0$ we have, in addition to (3.8-36), on $\Gamma$,
    $\mathbf{n}\times\nabla P\times\mathbf{n} = \rho\,\mathbf{n}\times(\mathbf{g} + \nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)\times\mathbf{n}$,    (3.8-37)
which, via integration over $\Gamma$, can be converted to/interpreted as a Dirichlet BC
for the pressure, and the so-called overdetermined Neumann problem is then well-
posed [both Neumann and Dirichlet BC's, the latter via integration of (3.8-37) over
$\Gamma$, are satisfied]. But this equation generally does not apply at $t = 0$ because of the
existence (in the general case) of vortex sheets on $\Gamma$; the overdetermined Neumann
problem is generally ill-posed at $t = 0$. (See, for example, Gresho, 1991a, b.) Also,
since it does not ever appear to us to be a useful or easy-to-implement BC, we drop
it from further consideration, and simply mention that it is automatically satisfied
when the BC employed is (3.8-36).
(4) For $t > 0$, (3.8-35) + (3.8-36) $\Leftrightarrow$ (3.8-35) and (3.8-37), and these in turn $\Rightarrow \nabla\cdot\mathbf{u} = 0$
in $\Omega$ and on $\Gamma$. [At $t = 0$, (3.8-37) does not generally apply, and thus $\nabla\cdot\mathbf{u} = 0$,
on $\Gamma$, is not implied. Details to follow in Section 3.9.2.]
(5) For reasons to be discussed later, it is generally not permissible to neglect $\nabla\cdot(\nabla^2\mathbf{u})$
in (3.8-35) via the argument that $\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}$, and therefore
$\nabla\cdot\nabla^2\mathbf{u} = \nabla^2(\nabla\cdot\mathbf{u}) - \nabla\cdot(\nabla\times\nabla\times\mathbf{u}) = 0$ because $\nabla\cdot\mathbf{u} = 0$ and div curl $(\cdot) = 0$.
(6) If only steady solutions of the NS equations are sought via the tactic of setting
$\partial\mathbf{u}/\partial t = 0$ and attacking directly the steady equations, then the PPE 'method' is
only applicable if $\nabla\cdot\mathbf{u} = 0$ is used as a PPE BC, and even then it is not a generally
recommended procedure, although we mention that Schuller (1990) has
successfully employed it as part of a multigrid solution algorithm.
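(7) The symmetry BC (planar) is merely a special case of (3.8-36), which simplifies
to $\partial P/\partial n = \rho\,\mathbf{n}\cdot\mathbf{g}$. {Proof: (i) $\mathbf{n}\cdot\nabla^2\mathbf{u} = \nabla^2 u_n = \partial^2 u_n/\partial n^2 + \partial^2 u_n/\partial\tau^2$; but $u_n = 0$
and therefore $\partial^2 u_n/\partial\tau^2 = 0$ and $\partial^2 u_n/\partial n^2 = -\partial/\partial\tau(\partial u_\tau/\partial n)$ via $\nabla\cdot\mathbf{u} = 0$ on
$\Gamma$ and $\partial u_\tau/\partial n = 0$ because of symmetry [cf. (3.8-9)]; (ii) $\mathbf{n}\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = \mathbf{u}\cdot\nabla u_n
= u_n\,\partial u_n/\partial n + u_\tau\,\partial u_n/\partial\tau = 0$ because $u_n = 0$ and $\partial u_n/\partial\tau = 0$; (iii) $\mathbf{n}\cdot\partial\mathbf{u}/\partial t =
\partial u_n/\partial t = 0$ because $u_n = 0$ for all $t$.}
(8) For steady Stokes flow with no body force, the PPE + BC simplifies to $\nabla^2 P = 0$ in
$\Omega$ and $\partial P/\partial n = \nu\,\mathbf{n}\cdot\nabla^2\mathbf{u}$ on $\Gamma$, showing that the entire pressure field 'comes from'
the inhomogeneous Neumann BC.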
So much for the most common BC—specified (normal) velocity—for now. Let us
move on to the rest, or at least to some others, as it may not be fruitful to attempt to be
exhaustive.
1. If the normal velocity is not specified on the boundary, but rather the normal traction
(or pseudo-traction) is, then it turns out, somewhat paradoxically, that this Neumann BC
for the velocity, a la (3.8-1) or (3.8-2), carries over completely intact, but is a Dirichlet
BC for the PPE. For example, if the normal component of (3.8-1), or its pseudo-traction
counterpart in which the term $(\nabla\mathbf{u})^T$ is omitted, is the BC applied to the momentum
equation, then it is actually inherited by the PPE; i.e., the (Dirichlet) BC for the PPE in
this case is
    $P = \mu\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - \mathbf{n}\cdot\mathbf{F}$,    (3.8-38)
which, for the simpler case of planar boundaries (or straight, in 2D), becomes $P =
2\mu\,\partial u_n/\partial n - F_n$, a normal force balance. [If $(\nabla\mathbf{u})^T$ is omitted, then this becomes $P =
\mu\,\partial u_n/\partial n - f_n$, a pseudo-force balance.]
2. Similarly, the specified momentum flux BC of (3.8-7) carries over, as another Dirichlet
BC (via the normal component) for the PPE.
3. OBC's. The general statement regarding OBC's for the PPE is this: whenever the
normal OBC for the momentum equation is of the form $\alpha(\partial u_n/\partial n) - P = \beta$, the same
BC applies, interpreted as a Dirichlet BC, for the pressure; i.e., $P = \alpha\,\partial u_n/\partial n - \beta$. This
covers (3.8-15), (3.8-17), (3.8-24), and (3.8-27), as well as the BC for (3.8-26) in the
form $P = \mu\,\partial u/\partial x$ (because $f_n$ is set to zero). We also remark that it is often (usually)
the case that the viscous terms are small at such a boundary so that $P = -\beta$ is observed.
We stop here, purposely avoiding the PPE BC's corresponding to the avant-garde
OBC's for the u-P equations. This we leave as an exercise (perhaps difficult) for the
reader. As a final remark, however, we emphasize the close coupling between the normal
velocity (or normal momentum) equation and the pressure at all boundaries, another
consequence of $\nabla\cdot\mathbf{u} = 0$.
3.8.3 The Vorticity Transport Equation and Boundary
Conditions on the Vorticity
a. The 2D stream function-vorticity formulation
If one simply (naively) examines each of the $\psi$-$\omega$ pair of equations (3.6-5) and (3.6-6), in
turn, and applies classical PDE theory to each, one reaches a dilemma that has caused at
least as much confusion (and probably more years of frustrating research) as has been
associated with the subject of boundary conditions for the pressure. The reason for the
confusion is this: each equation involves a Laplacian operator and thus each ostensibly
needs BC's: Dirichlet, Neumann, or Robin. But for the most common BC, $\mathbf{u} = \mathbf{w}$ on
$\Gamma$, which translates to $u_n = w_n = \partial\psi/\partial\tau$ and $u_\tau = w_\tau = -\partial\psi/\partial n$ on $\Gamma$, the dilemma
becomes clear: there are two BC's on $\psi$ and none on $\omega$; $\psi$ has one too many and $\omega$ has
one too few.
Since we will not pursue the $\psi$-$\omega$ formulation in earnest in this book, we simply state
the resolution of this dilemma and refer the interested reader to Gresho (1991a, b, 1992)
for the details. The fallacy in the above application of PDE theory, well understood by
Glowinski and Pironneau (1979) in their important paper on this subject, is rooted in the
specious notion that each equation needs BC's because each contains the $\nabla^2$ operator. In
fact, however, these two equations are very closely coupled (as indeed are u and P in the
primitive variables or PPE formulations) and it is only required that proper BC's exist for
the coupled pair. It turns out, when examined in this light, that two BC's on $\psi$ and zero
on $\omega$ are just fine; there are no BC's for the vorticity in $\psi$-$\omega$ formulations, and none is
needed. This realization of course brings with it some extra cost, however, which helps
to justify the older approaches that attempted to avoid this cost: the coupled system of
(3.6-5) and (3.6-6) must be solved as a coupled system; only then do the two BC's on $\psi$
and none on $\omega$ permit a solution. [In the finite difference world on uniform grids, however,
E and Liu (1996) argue convincingly that some of the older methods using uncoupled
BC's, in which one of the $\psi$ BC's is converted to one for $\omega$ via 'local' formulae, are
virtually equivalent to the more globally coupled 'modern' methods.]
Remark:
The $\partial\psi/\partial\tau = w_n$ (specified penetration) BC is usually converted to a Dirichlet BC by
integrating along the boundary: $\psi(\tau) = \psi_0 + \int_0^{\tau} w_n\,ds'$, which works just fine for simply
connected domains.
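A minimal numerical sketch of this conversion follows. It assumes (our choices, not the text's) that the boundary is sampled at increasing arc-length values and that a cumulative trapezoidal rule is accurate enough; the function name `psi_on_boundary` and the sample data are hypothetical.

```python
import numpy as np

def psi_on_boundary(w_n, s, psi0=0.0):
    """Dirichlet data psi(s) = psi0 + integral of w_n ds' from 0 to s.

    w_n : normal velocity sampled at the arc-length positions s (increasing).
    Uses a cumulative trapezoidal rule.
    """
    w_n = np.asarray(w_n, dtype=float)
    s = np.asarray(s, dtype=float)
    increments = 0.5 * (w_n[1:] + w_n[:-1]) * np.diff(s)
    return psi0 + np.concatenate(([0.0], np.cumsum(increments)))

perimeter = 4.0                                  # e.g., the boundary of a unit square
s = np.linspace(0.0, perimeter, 401)
w_n = np.sin(2.0 * np.pi * s / perimeter)        # hypothetical data with zero net flux
psi = psi_on_boundary(w_n, s, psi0=1.0)
```

Because the sample data carry no net flux through the closed boundary, the integrated stream function returns to its starting value after one circuit, which is what makes the Dirichlet data single-valued on a simply connected domain.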
So much for 'no-slip' (and 'no-penetration') boundaries; what about inflow and
outflow? Well, if $\mathbf{u} = \mathbf{w}$ is the known/desired BC, then there is no choice, no change:
two on $\psi$ and none on $\omega$. [If $\nu = 0$, however, then a value of $\omega$ must be specified at
the inlet (only) and this is proper: the equation is then hyperbolic and the $\partial\psi/\partial n$ BC
must be dropped (e.g., no-slip is no longer possible).] It is also permissible to specify the
vorticity at the inlet, even for $\nu > 0$, unless $\mathbf{u} = \mathbf{w}$ is given (see above). If in addition
$\psi$ is specified (normal velocity), then the inlet tangential velocity is a 'result,' since only
its normal derivative is then effectively specified. If $\partial\psi/\partial n$ (tangential velocity) and $\omega$ are
specified, also legitimate, then the normal velocity is the floater; this is probably not often
useful/desirable. At outflow points, there are several options but, as in u-P formulations,
there is also lots of room for both ambiguity and improvement; i.e., the 'best' choice is
not readily available. If a passive OBC is desired, then $\partial\omega/\partial n = 0$ is usually okay, even
though it does imply the seemingly illegitimate BC a la u-P of $\nabla^2 u_\tau = 0$ at outflow (an
exercise we leave to the reader), which is known not to be a legitimate BC for the u-P
equations. But a second BC is also required and this causes some difficulty: specifying
$\psi$ implies the specification of the normal velocity (usually a poor OBC), and specifying
$\partial\psi/\partial n$ implies the specification of the tangential velocity, also usually not a good idea.
Nevertheless, $\partial\psi/\partial n = 0\ (= u_\tau)$ is the most common (and legal) BC employed in $\psi$-$\omega$
formulations. See Tezduyar et al. (1988, 1991) for some alternative $\psi$-$\omega$ OBC's via FEM,
and both Roache (1982) and Peyret and Taylor (1983) for FDM.
b. The 3D velocity-vorticity formulation
This formulation, while not new, is less developed and still developing, and a review of
the (confusing) literature reveals but one clear fact: those using this formulation do not
agree on the BC's, either with respect to legitimacy or utility. Since we have not been
party to this effort, and will have little need to discuss it further, we refer the reader to
the growing and semi-vast (half-vast) literature, beginning with Gunzburger et al. (1990),
Gresho (1992), and Wu et al. (1996)—and references therein.
3.9 INITIAL CONDITIONS (AND WELL-POSEDNESS)
The last technical 'detail' regarding incompressible flow that needs to be addressed—and
even cleaned-up/clarified relative to much of what exists in the literature, before we
can move on to the subject of FEM approximation methods—is that of the initial data.
Just what is required or permissible regarding initial velocity, pressure, vorticity, stream
function, etc.?
3.9.1 The u-P Formulation
Again the incompressibility constraint makes itself felt (quite strikingly, in fact) in such
a way that the simplicity of the choice of the initial condition (IC) for the scalar
transport equation of the previous chapter (basically any function in $L^2$) is totally
inappropriate. The 'bottom line' can be easily stated, even though it was not an easy
one to obtain; indeed, it is probably not yet well known nor widely appreciated: the
initial velocity must be incompressible everywhere: $\nabla\cdot\mathbf{u} = 0$ in $\bar{\Omega}$. The translation of
these simple (and nearly obvious) words into mathematics is a bit more subtle:
    $\nabla\cdot\mathbf{u}_0 = 0$ in $\Omega$    (3.9-1)
and
    $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$ on $\Gamma_D$,    (3.9-2)
where $\mathbf{u}_0(\mathbf{x})$ is the initial velocity and $\Gamma_D$ is that portion of the boundary (the 'normal'
Dirichlet portion) on which the normal velocity BC is specified ($\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$).
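It may actually be helpful to derive (3.9-2). This can be done by referring to
Figure 3.9-1, where we note the generalization of (3.9-1) and (3.9-2) to $t \ge 0$. It is
accomplished by applying (3.9-1) and the divergence theorem to the thin Gaussian pillbox
(Jackson, 1975) shown there, where $\mathbf{u}$ is the velocity in the fluid and $\mathbf{w}$ is the specified
BC for $\mathbf{u}$: $\nabla\cdot\mathbf{u} = 0$ in $\Omega \Rightarrow \int_V \nabla\cdot\mathbf{u} = \oint_{\partial V}\mathbf{n}\cdot\mathbf{u} = 0$ (with $V$ the pillbox and $\mathbf{n}$ its
outward normal, so that the normals on its two long faces are opposite), which becomes
$0 = \int_0^{l_2}\mathbf{n}\cdot\mathbf{w}\,ds + \int_0^{l_1}\mathbf{n}\cdot\mathbf{u}\,ds + O(\delta)$, which gives, for $\delta \to 0$, $l_1 \to l_2 = l$ and thus
$\int_0^{l}\mathbf{n}\cdot(\mathbf{w} - \mathbf{u})\,ds = 0$ (now with a single $\mathbf{n}$),
from which we conclude that we need $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ on $\Gamma$; i.e., the normal component of
the velocity BC is also the realization of (3.9-1) on $\Gamma$. [The sufficiency of this requirement
is obvious. That it is also a necessary condition can be proved in either of two ways:
(i) since the location (on $\Gamma$) and the size of the pillbox are arbitrary, if $\mathbf{n}\cdot\mathbf{u} \ne \mathbf{n}\cdot\mathbf{w}$
at some point on $l$ yet $\int_0^{l}\mathbf{n}\cdot(\mathbf{u} - \mathbf{w})\,ds = 0$, one merely needs to slide the pillbox to a
new location (and perhaps change $l$) that would give $\int_0^{l}\mathbf{n}\cdot(\mathbf{u} - \mathbf{w})\,ds \ne 0$; and (ii) we
preclude sources and sinks of mass on the boundary.]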
Fig. 3.9-1 A Gaussian pill box in the fluid at the boundary.
One of the most interesting consequences of the IC constraint of (3.9-1) and (3.9-2) is
related to impulsive starts, which we believe merits the following important digression:
the fluid mechanics literature is replete with the notion that impulsive behavior (impulsive
starts in particular, impulsive changes in general) is commonplace, even in the case of
the mathematical model that presumes an incompressible fluid. And this perception is
widely held among experimentalists, theorists, and computationalists. While it is true that
some realize what the mathematical basis for such fluid motion is—as, for example, so
well-described by Batchelor (1967)—it seems all too true that many hold an erroneous
perception. Whereas it is fairly well accepted—at least by those who might be called upon
to design laboratory experiments—that a truly impulsive start of any kind of 'mechanical'
system is not possible to attain owing to inertia, the $\nabla\cdot\mathbf{u} = 0$ constraint/model adds
mathematical muscle to the statement that impulsive changes in the normal direction are
precluded.
Normal impulsive changes in velocity would require a discontinuous normal velocity
that violates $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ and are thus mathematically ill-posed. If the normal
component of velocity applied to the fluid as a boundary condition is different from the initial
normal velocity at the same point of the boundary, then the incompressible flow equations
are ill-posed by violation of (3.9-2), and no solution exists. (At least in the conventional
sense; any solutions that do exist are necessarily in the class called 'generalized' solutions.)
Normal impulsive acceleration, however, is mathematically possible, an example of
which is: $\mathbf{n}\cdot\mathbf{u} = w_0(1 - e^{-t/\tau})$ with $\mathbf{u}_0 = 0$ in $\Omega$ and $\tau$ as small as you like (but not
zero), and we shall present just such an example later (Section 3.19). The acceleration is
$\mathbf{n}\cdot\mathbf{a} = (w_0/\tau)e^{-t/\tau}$, giving $\mathbf{n}\cdot\mathbf{a}_0 = w_0/\tau$. As $\tau \to 0$, $\mathbf{n}\cdot\mathbf{a}_0 \to \infty$ and $\mathbf{n}\cdot\mathbf{u}$ is a smooth
function (continuous in time) that rapidly approaches $w_0$. This situation (or a similar
one, such as a ramp function, $\mathbf{n}\cdot\mathbf{u} = \beta t$ with $\beta$ constant) can be legitimately used to
'model' an impulsive start in the sense that a very large acceleration is applied for a very
short time.
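The exponential start-up can be written down directly. The following is a small sketch only; the function names and the numerical values of $w_0$ and $\tau$ are illustrative assumptions, not data from the text.

```python
import math

def normal_velocity(t, w0, tau):
    """Boundary normal velocity n.u = w0 * (1 - exp(-t / tau)); zero at t = 0."""
    return w0 * (1.0 - math.exp(-t / tau))

def normal_acceleration(t, w0, tau):
    """Time derivative of the above: n.a = (w0 / tau) * exp(-t / tau)."""
    return (w0 / tau) * math.exp(-t / tau)

w0 = 2.0
for tau in (1e-1, 1e-2, 1e-3):
    # The initial acceleration w0 / tau grows without bound as tau -> 0,
    # while the velocity itself starts from zero and stays continuous in time.
    print(tau, normal_velocity(0.0, w0, tau), normal_acceleration(0.0, w0, tau))
```

The boundary velocity is compatible with $\mathbf{u}_0 = 0$ at $t = 0$ for every $\tau > 0$, so (3.9-2) is respected no matter how abrupt the start is made.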
Tangential impulsive changes, on the other hand, are permitted, and are quite common;
the initial conditions need not satisfy the tangential BC's (typically no-slip). The simplest
example of this situation is given by the Stokes/Rayleigh problem of an instantaneous
change in the tangential velocity of an infinitely long, flat plate in a semi-infinite fluid at
rest, with the familiar complementary error function solution:
    $u(x, t) = u_0\,\mathrm{erfc}\left(x/\sqrt{4\nu t}\right)$.    (3.9-3)
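For readers who wish to evaluate (3.9-3), a minimal sketch using only the Python standard library is given below; the function name and the sample parameter values are our own illustrative choices.

```python
import math

def rayleigh_velocity(x, t, u0, nu):
    """Tangential velocity u(x, t) = u0 * erfc(x / sqrt(4 nu t)) of (3.9-3), for t > 0.

    x  : distance from the plate into the fluid
    u0 : plate velocity applied impulsively at t = 0
    nu : kinematic viscosity
    """
    if t <= 0.0:
        raise ValueError("the similarity solution is defined for t > 0")
    return u0 * math.erfc(x / math.sqrt(4.0 * nu * t))

# Profile shortly after the start: no-slip at the plate, fluid at rest far away.
profile = [rayleigh_velocity(x, 0.01, 1.0, 0.1) for x in (0.0, 0.05, 0.1, 0.5)]
```

At the plate the fluid moves with the plate for every $t > 0$, and the layer of sheared fluid thickens like $\sqrt{\nu t}$, which is the viscous smoothing of the initial tangential discontinuity.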
The consequence of impulsive tangential velocity changes is also well
known: a vortex sheet is created in the fluid at the boundary. Of course in the limit of
an inviscid fluid, while such changes are still permissible, they no longer generate vortex
sheets since the fluid has lost its ability to communicate tangentially with its boundary.
(Normal impulsive changes, however, are just as illegal for $\nu = 0$ as they are for viscous
flow.)
It turns out that the normal impulsive start model—the rapid start via exponential,
ramp, or similar functions—does approach a limit that, paradoxically for an impulsive
start, describes potential flow, even though the no-slip BC may have been legitimately
applied; i.e., $Re = \infty$ rather than $Re = 0$ is the effective initial condition, and the fluid
will appear to slip along the boundary. This limiting case will be described below, first
with words and then with mathematics: for t < 0, a steady, inviscid, irrotational flow—i.e.,
a potential flow—exists. At t = 0, a wand is waved that endows the fluid with viscosity,
which enables the no-slip BC to be satisfied (the brakes are applied)—instantaneously; a
vortex sheet now exists on all no-slip surfaces. Also instantaneously, the entire pressure
field snaps from that of potential flow (the 'Bernoulli pressure') to one that feels the
(usually significant) effect of the no-slip BC. Finally, for t > 0, the vortex sheet has
been dissipated by viscosity and a simpler (more regular) time-dependent viscous flow
develops. Mathematically, the foregoing events are:
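1. $t < 0$. The steady potential flow is obtained from $\nabla^2\phi = 0$ in $\Omega$, $\partial\phi/\partial n = \mathbf{n}\cdot\mathbf{w}_0$ on
$\Gamma$, where of course $\int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 = 0$. The resulting velocity is $\mathbf{u}_0 = \nabla\phi$, and the (potential)
pressure ($P_p$) is given by Bernoulli's equation, $P_p + \tfrac{1}{2}q^2 - C = 0$, where $q = |\mathbf{u}_0|$. [As
a check, the insertion of this solution into the Euler equations, $\partial\mathbf{u}_0/\partial t + \mathbf{u}_0\cdot\nabla\mathbf{u}_0 + \nabla P_p =
0$, $\nabla\cdot\mathbf{u}_0 = 0$ in $\Omega$, $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$ on $\Gamma$ gives, using $\mathbf{u}_0\cdot\nabla\mathbf{u}_0 = \tfrac{1}{2}\nabla q^2 - \mathbf{u}_0\times\nabla\times\mathbf{u}_0 =
\tfrac{1}{2}\nabla q^2$, $\nabla^2 P_p = -\tfrac{1}{2}\nabla^2 q^2$ in $\Omega$ with $\partial P_p/\partial n = -\mathbf{n}\cdot(\mathbf{u}_0\cdot\nabla\mathbf{u}_0) - \mathbf{n}\cdot\partial\mathbf{w}_0/\partial t = -\mathbf{n}\cdot(\mathbf{u}_0\cdot\nabla\mathbf{u}_0) =
-\tfrac{1}{2}\mathbf{n}\cdot\nabla q^2$ on $\Gamma$ to give $P_p + \tfrac{1}{2}q^2 = C$ and $\partial\mathbf{u}_0/\partial t = 0$.] Steady potential flows always
satisfy the steady Euler equations (and even the full NS equations).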
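2. $t = 0$. Here we have $\nu > 0$ and no-slip on $\Gamma$. But the velocity field is still $\mathbf{u}_0$ (except
on $\Gamma$), and the NS equations read
    $\partial\mathbf{u}_0/\partial t + \tfrac{1}{2}\nabla q^2 + \nabla P_0 = \nu\nabla^2\mathbf{u}_0$ and $\nabla\cdot\mathbf{u}_0 = 0$ in $\Omega$,
with $\mathbf{u}_0 = \mathbf{w}_0$ on $\Gamma$ (no slip and no penetration). But $\nabla^2\mathbf{u}_0 = \nabla(\nabla\cdot\mathbf{u}_0) - \nabla\times\nabla\times\mathbf{u}_0 = 0$,
since $\mathbf{u}_0$ is both divergence-free and curl-free; the viscous term in the NS momentum
equation is still zero (in $\Omega$). The initial pressure satisfies the following PPE at $t = 0$:
$\nabla^2 P_0 = -\tfrac{1}{2}\nabla^2 q^2$ in $\Omega$ and $\partial P_0/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u}_0 - \tfrac{1}{2}\nabla q^2) = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u}_0 - q\nabla q)$. Here,
even though $\nu\nabla^2\mathbf{u}_0 = 0$ in $\Omega$, $\nu\,\mathbf{n}\cdot\nabla^2\mathbf{u}_0 \ne 0$ on $\Gamma$ because now $\boldsymbol{\tau}\cdot\mathbf{u}_0 = \boldsymbol{\tau}\cdot\mathbf{w}_0$ there rather
than $\boldsymbol{\tau}\cdot\mathbf{u}_0 = \partial\phi/\partial\tau$, and it is just this 'no-slip' (and non-smooth!) viscous term that
causes the step change in pressure. (For a planar stationary boundary, this term is simply
$\nu\,\partial^2 u_{0n}/\partial n^2$.) Finally,
    $\partial\mathbf{u}_0/\partial t = -\nabla\left(P_0 + \tfrac{1}{2}q^2\right) = \nabla(P_p - P_0) \ne 0$;
the step change in pressure caused by the no-slip BC causes the potential flow to change
to a viscous flow with vortex sheets, and a concomitant large acceleration near $\Gamma$. On
$\Gamma$, however, we have (for $\mathbf{n}\cdot\mathbf{w}$ independent of time)
    $\mathbf{n}\cdot\partial\mathbf{u}_0/\partial t = \mathbf{n}\cdot\left[\nu\nabla^2\mathbf{u}_0 - \nabla\left(P_0 + \tfrac{1}{2}q^2\right)\right] = 0$.
Thus, the 'large acceleration near $\Gamma$' is only in the tangential direction. If $Re \ll 1$, then $P_0$
will be dominated by the boundary condition via $\partial P_0/\partial n \approx \nu\,\mathbf{n}\cdot\nabla^2\mathbf{u}_0$ (an Euler velocity
and a Stokes pressure), whereas if $Re \gg 1$, then $P_0$ will look only a little different than
$P_p$ (except very close to $\Gamma$) because it is then dominated by the source term in the PPE,
$-\tfrac{1}{2}\nabla^2 q^2$.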
3. t > 0. After the (large) dose of vorticity has been absorbed by the fluid at the boundary,
the now smoother flow is free to evolve as it may.
Final Remarks:
(1) The above discussion has been based on the requirement that the minimum regularity
(smoothness) condition on the normal velocity is that it be continuous in time and
space, a requirement not always invoked when seeking or discussing certain weak
(ultra-weak?) solutions; see, for example, Hopf (1950/1951), Ladyzhenskaya (1969,
1975), and Temam (1984).
(2) Our principal 'bottom line' on the misnomer called (normal) impulsive starts is this:
the initial velocity is not zero—it is potential flow. A normal impulsive change is
an incompressible impossibility.
(3) An example of such a start-up is given in Gresho (1992, in which the captions for
Figures 7 and 8 were inadvertently switched!); later (Section 3.19) we shall present
another.
(4) There are no initial conditions on the pressure; $P_0$ is always induced by $\mathbf{u}_0$, a
statement that leads us naturally to the next section.
3.9.2 The PPE Formulation
As in the u-P formulation, initial data for the velocity are all that is needed. But the PPE
does apply at t = 0, and it follows that the induced initial pressure field can be computed.
The manner in which this is done is the obvious one: solve the PPE of (3.8-35) under
any of the BC's discussed in Section 3.8.2, with the exception of (3.8-37), because the
tangential momentum equation generally does not apply on $\Gamma$ at $t = 0$, a fact closely
associated with the vortex sheets that are generally present at $t = 0$. [If, however, the
BC's are rather special, then the tangential momentum equation does apply on $\Gamma$ at $t = 0$
and there are no vortex sheets; such BC's are said to satisfy the overdetermined Neumann
problem, a la (3.8-37) and the discussion there; see also Heywood and Rannacher (1982)
and Temam (1982).] Related to this issue is the following very important observation for
the special-but-common-and-important case where the BC is $\mathbf{u} = \mathbf{w}$ on $\Gamma$:
The initial pressure field is always set by the PPE with the Neumann BC coming from
the normal momentum equation applied on $\Gamma$. The resulting pressure field acts like a
given 'source term' in the tangential momentum equation, which responds initially ($t >
0$ but small) and (of course) close to $\Gamma$ as a sort of transient heat equation (parabolic)
that is, in general, also subjected to a step change at the boundary.
The normal momentum equation does not behave like the parabolic heat equation
owing to its intimate connection with the omnipotent constraint $\nabla\cdot\mathbf{u} = 0$.
On the other hand, the normal acceleration and the pressure can (for $t > 0$) be solved
either as a pair via $\mathbf{a} + \nabla P = \mathbf{f}(\mathbf{u})$ and $\nabla\cdot\mathbf{a} = 0$ in $\Omega$ with $\mathbf{n}\cdot\mathbf{a} = \mathbf{n}\cdot\partial\mathbf{w}/\partial t$ on $\Gamma$ or,
equivalently and sequentially, via $\nabla^2 P = \nabla\cdot\mathbf{f}$ in $\Omega$, with $\partial P/\partial n = \mathbf{n}\cdot(\mathbf{f} - \partial\mathbf{w}/\partial t)$ on $\Gamma$,
followed by $\mathbf{a} = \mathbf{f} - \nabla P$.
This should cover the PPE formulation, and it would if it were not for the possibility
of either overlooking or forgetting from whence it came (the momentum equation and
div u = 0). So we now address an issue that is a turn-around from the statements made
in Section 3.5.1, namely that the u-P formulation can have solutions for pressure that lie in a
larger space (only $\nabla P$ need exist) than for the PPE formulation (where $\nabla P$ and $\nabla^2 P$ must
exist, as well as third spatial derivatives of velocity). The issue is this: when PPE solutions exist
(i.e., when P is smooth enough), there can exist more solutions to the PPE formulation
than to the u-P formulation! But it may be better to defer the details of these (important)
'anomalies' until the next section; anyway, that is our plan. Here we merely close with
the blanket statement/bottom line that IC's for the PPE formulation should respect the
same constraint on the IC's, (3.9-1) and (3.9-2), as is required by the u-P formulation.
3.9.3 Vorticity-Based Methods
These methods include $\psi$-$\omega$ and $\mathbf{u}$-$\boldsymbol{\omega}$ methods, which we lump together because our
discussion will be brief, and biased. In fact, we shall focus only on the simpler 2D case
via the $\psi$-$\omega$ formulation as this will suffice to make our point.
The key problem with IC's in this formulation is, of course, related to those with the
previous formulations, but it is also sufficiently different and sufficiently more difficult in
general so as to merit separate and extensive discussion. Ironically, however, it is easier
in the following sense: the initial distribution of vorticity, $\omega_0(\mathbf{x})$, is completely arbitrary;
every $\omega_0(\mathbf{x})$ will correspond to some divergence-free initial velocity. But it is much more
difficult with respect to the common IC-BC combination that admits vortex sheets, a
direct consequence of employing (or trying to solve) a higher-order equation, the VTE.
As with the PPE formulation, we initiate the discussion using the simple and common
BC of $\mathbf{u} = \mathbf{w}$ on $\Gamma$ because, in fact, this BC is not so simple when vorticity is the variable
that is desired to be directly computed. In Section 3.12.4 we will discuss a more general
case.
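We recall the $\psi$-$\omega$ pair, see (3.6-5) and (3.6-6), $\partial\omega/\partial t + \mathbf{u}\cdot\nabla\omega = \nu\nabla^2\omega$ and $\nabla^2\psi +
\omega = 0$, and the velocity equations, $u = \partial\psi/\partial y$ and $v = -\partial\psi/\partial x$, to initiate the discussion
and analysis. These are to be solved with the BC's of Section 3.8.3: $\partial\psi/\partial\tau = w_n$ and
$\partial\psi/\partial n = -w_\tau$ on $\Gamma$, corresponding to $\mathbf{u} = \mathbf{w}$ there. Given $\omega_0(\mathbf{x})$, the stream function
equation is used to obtain the concomitant initial (and divergence-free) velocity field as
follows: solve
    $\nabla^2\psi_0 + \omega_0 = 0$ in $\Omega$    (3.9-4)
subject to the (Dirichlet) BC of
    $\psi_0 = f_0$ on $\Gamma$,    (3.9-5)
where $f_0$ is obtained from the normal velocity BC applied at $t = 0$, $\partial\psi_0/\partial\tau = \partial f_0/\partial\tau =
w_n^0$, via integration over $\Gamma$: $f_0 = \int_\Gamma w_n^0$. The resulting velocity field, $u_0 = \partial\psi_0/\partial y$ and
$v_0 = -\partial\psi_0/\partial x$, will satisfy $\nabla\cdot\mathbf{u}_0 = 0$ in $\Omega$ and on $\Gamma$ by construction (i.e., $\nabla\cdot\mathbf{u}_0 =
\partial u_0/\partial x + \partial v_0/\partial y = \partial^2\psi_0/\partial x\,\partial y - \partial^2\psi_0/\partial y\,\partial x = 0$).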
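The 'by construction' statement has a simple discrete analogue, sketched below under assumptions of our own: a uniform tensor-product grid, finite differences via `numpy.gradient`, and an arbitrary sample stream function (none of which come from the text). Because the difference operators in x and y act on different array axes, they commute, and the discrete divergence of the stream-function velocity vanishes to roundoff.

```python
import numpy as np

nx, ny = 41, 51
x = np.linspace(0.0, 1.0, nx)
y = np.linspace(0.0, 1.0, ny)
dx, dy = x[1] - x[0], y[1] - y[0]
X, Y = np.meshgrid(x, y, indexing="ij")

# Arbitrary (hypothetical) stream function; any smooth psi would do.
psi = np.sin(np.pi * X) * np.cos(2.0 * np.pi * Y) + X * Y**2

# u = d(psi)/dy, v = -d(psi)/dx
dpsi_dx, dpsi_dy = np.gradient(psi, dx, dy)
u, v = dpsi_dy, -dpsi_dx

# Discrete divergence du/dx + dv/dy using the same difference operators.
div = np.gradient(u, dx, axis=0) + np.gradient(v, dy, axis=1)
max_div = np.max(np.abs(div))
```

Nothing here restrains the tangential velocity at the boundary, which is the point taken up next in the text: the velocity obtained this way is divergence-free but will in general slip.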
So far, so good; this was the easy part. But now we turn to the hard part. First we note
that, just like its u-P and PPE counterparts, the initial velocity will generally slip along $\Gamma$
because it was not (and cannot, in the most general case) be restrained from doing so:
$u_\tau^0 = -\partial\psi^0/\partial n$ will not agree with $w_\tau^0$ on $\Gamma$, and we thus introduce the slip velocity, $s$:
    $s(\Gamma) = u_\tau^0 - w_\tau^0$ on $\Gamma$.    (3.9-6)
So what is the problem? The problem is that the now-present vortex sheet on $\Gamma$,
the strength of which is $s$, generates an unbounded vorticity there: while the given $\omega_0(\mathbf{x})$
describes the vorticity in $\Omega$, on $\Gamma$ it is necessarily given by (with apologies for somewhat
imprecise notation)
    $\omega_0(\Gamma) = s(x_\Gamma)\,\delta(x - x_\Gamma)$,    (3.9-7)
where $\delta$ is the Dirac delta function. Integration in the normal direction from $\Gamma$ into $\Omega$ gives
    $\int_0^{\epsilon} s(x_\Gamma)\,\delta(x - x_\Gamma)\,dx_n = s(x_\Gamma)$,    (3.9-8)
where $\epsilon$ is small. The initial 'wall' vorticity is singular (unbounded) but integrable and is
computable in principle (but probably not otherwise) as a function of position on $\Gamma$. The
imposition of the no-slip, vorticity-generating BC is seen to cause a large problem in the
general case. (In the special case wherein no vortex sheet is present, the initial vorticity
would need to be very special, so that $s = 0$.)
The vortex sheet need not be confined to the boundary. In fact, it may be of interest
to present a simple example of initial vorticity and velocity fields in such a case; to that
end, we consider a 2D, solenoidal velocity field of compact support in 2D cylindrical
geometry. It is simple solid-body rotation: $u_\phi = fr$, where $u_\phi\ (= u_\tau)$ is the tangential
velocity and $f$ is the (constant) angular velocity for $r \le R$, and $u_\phi = 0$ for $r > R$. The
jump in $u_\phi$ (slip velocity if the rotating fluid was replaced by a solid cylinder rotating at
the same angular velocity) is $s = fR$. Thus, $\omega = (1/r)[d(ru_\phi)/dr] = 2f$ for $r < R$ and
$\omega = 0$ for $r > R$; at $r = R$, we have a vortex sheet, the description of which is the object
of the exercise. The stream function equation for this case is
    $\frac{1}{r}\frac{d}{dr}\left(r\frac{d\psi}{dr}\right) + \omega = 0$,
with $d\psi/dr = 0$ at $r = 0$ and $\psi = 0$ (no normal velocity) at $r = R$, and its solution
is $\psi = f(R^2 - r^2)/2$. Since $\mathbf{u} = 0$ for $r > R$, $\psi = 0$ there, too. To compute the vortex
sheet, we integrate the stream function equation across the discontinuity (an application
of Stokes theorem in the general case):
    $\int_{R-\epsilon}^{R+\epsilon} \frac{1}{r}\frac{d}{dr}\left(r\frac{d\psi}{dr}\right) r\,dr + \int_{R-\epsilon}^{R+\epsilon} \omega r\,dr = 0$,
and let $\epsilon \to 0$ to obtain
    $\left[r\frac{d\psi}{dr}\right]_{R^-}^{R^+} + \int_{R^-}^{R^+} \omega r\,dr = 0$ or $-\int_{R^-}^{R^+} \omega r\,dr = Ru_\tau = Rs$,
since $u_\tau = u_\phi = fR$ is the jump in the tangential velocity, and the vortex sheet strength.
To go further requires, it seems, the introduction of a model; e.g., suppose $\omega$ is a step
function of amplitude $A/\epsilon$ and width $\epsilon$ (centered about $r = R$). Then,
    $\int_{R-\epsilon/2}^{R+\epsilon/2} \frac{A}{\epsilon}\, r\,dr = AR = -Ru_\tau$;
i.e., $A = -u_\tau$, and the vorticity at $r = R$ is $-u_\tau/\epsilon$ with $\epsilon \to 0$. In summary, $\omega = 2f$ for
$r < R$, $\omega = -\infty$ at $r = R$, and $\omega = 0$ for $r > R$.
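The step-function model of the sheet is easy to check numerically. The sketch below is illustrative only: the function name, the parameter values, and the use of a trapezoidal rule are our assumptions. It confirms that the integral of $\omega r$ across the sheet stays equal to $AR = -Ru_\tau$ as the sheet width shrinks, even though the vorticity amplitude $A/\epsilon$ itself diverges.

```python
import numpy as np

def sheet_integral(A, R, eps, n=2001):
    """Trapezoidal estimate of the integral of (A/eps) * r over [R - eps/2, R + eps/2]."""
    r = np.linspace(R - 0.5 * eps, R + 0.5 * eps, n)
    integrand = (A / eps) * r
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)))

f, R = 3.0, 2.0            # hypothetical angular velocity and radius
u_tau = f * R              # jump in tangential velocity (sheet strength)
A = -u_tau                 # model amplitude from A R = -R u_tau
widths = (1e-1, 1e-2, 1e-3)
integrals = [sheet_integral(A, R, eps) for eps in widths]
amplitudes = [A / eps for eps in widths]   # model vorticity at r = R, unbounded as eps -> 0
```

The integrated quantity is independent of $\epsilon$, which is the sense in which the singular sheet vorticity is 'integrable'.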
The root of this singularity problem can be traced back to the fact that the tangential
momentum equation does not apply on $\Gamma$ at $t = 0$ in the general case. But the VTE
involves (comes from) the normal derivative of the tangential momentum equation. What
is the derivative of an invalid equation? Anyway, it is clear that vorticity methods will
necessarily confront significant difficulty whenever vortex sheets are involved.
3.10 INTERIM SUMMARY
3.10.1 A Well-Posed IBVP for Incompressible Flow, and
the 'Equivalence Theorem'
It may be well to pause, collect some results, and summarize the status of our description
of the incompressible flow problem before forging ahead to discuss approximate solution
methods. Also, partly because we will not need the results, and partly because we are
not fully confident regarding our opinion of what they are, we henceforth omit, except
for one small digression in Section 3.12.4, all vorticity-based methods and zoom in on
what will be the major focus of the remainder of this book: pressure-based methods. [See,
for example, Gresho (1992) for further discussions of well-posedness of vorticity-based
methods.]
We present below two equivalent, and fairly general, initial boundary-value problems
(IBVP's) for the NS equations that are well-posed and form the basis of much of the
rest of the book. For simplicity (i.e., ease of presentation), we stick to 2D, but the 3D
extension is straightforward—really.
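The momentum conservation equation,
    $\partial\mathbf{u}/\partial t + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = Re^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g}$ in $\Omega$,    (3.10-1)
and either form of the mass conservation equation,
    $\nabla\cdot\mathbf{u} = 0$ in $\Omega$    (3.10-2)
or
    $\nabla^2 P = \nabla\cdot\{Re^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}\}$ in $\Omega$,    (3.10-3)
with BC's of
    $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$,    (3.10-4)
    $Re^{-1}[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - P\mathbf{n} = \mathbf{F}$ on $\Gamma_N$,    (3.10-5)
    $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ and $Re^{-1}\,\boldsymbol{\tau}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} = F_\tau$ on $\Gamma_n$,    (3.10-6)
and
    $\boldsymbol{\tau}\cdot\mathbf{u} = \boldsymbol{\tau}\cdot\mathbf{w}$ and $Re^{-1}\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - P = F_n$ on $\Gamma_\tau$,    (3.10-7)
where $\Gamma_D + \Gamma_N + \Gamma_n + \Gamma_\tau = \Gamma = \partial\Omega$, when (3.10-2) is employed, or (3.10-4)
    and $\partial P/\partial n = \mathbf{n}\cdot\{Re^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{w}/\partial t\}$ on $\Gamma_D$,    (3.10-8)
(3.10-5) on $\Gamma_N$, (3.10-6)
    and $\partial P/\partial n = \mathbf{n}\cdot\{Re^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{w}/\partial t\}$ on $\Gamma_n$,    (3.10-9)
and (3.10-7) on $\Gamma_\tau$ when (3.10-3) is employed, with initial conditions of
    $\mathbf{u} = \mathbf{u}_0(\mathbf{x})$ in $\Omega$    (3.10-10)
    with $\nabla\cdot\mathbf{u}_0 = 0$ in $\Omega$    (3.10-11)
    and $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$ on $\Gamma_D$ and on $\Gamma_n$,    (3.10-12)
constitute two equivalent, well-posed problems (called u-P and PPE, respectively) that
can be solved for $\mathbf{u}$ and $P$. If $\Gamma_N$ and $\Gamma_\tau$ are empty (i.e., if $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ is specified on
all of $\Gamma$), then it is also required that
    $\int_\Gamma \mathbf{n}\cdot\mathbf{w} = 0$ for all $t \ge 0$,    (3.10-13)
in which case $P$ is determined only up to an arbitrary additive constant.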
Remarks:
(1) Their equivalence, which applies only to the time-dependent case, is a generalization
of the so-called Equivalence Theorem put forth by Gresho and Sani (1987, 1988).
It obviously presumes sufficient regularity of u and P and is (still) actually more of
an assertion than a theorem since it has not (yet) been proven. The 'equivalence'
is, in some sense, only formal. [(3.10-3) cannot replace (3.10-2) for the steady NS
equations, unless the equation $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ is appended; see Schtiller (1990),
who presents a steady-state equivalence theorem.]
(2) If any of the three constraints on the data, (3.10-11), (3.10-12), and (3.10-13)
when applicable, are violated, then the NS equations are ill-posed, and no
solution of (3.10-1) and (3.10-2) exists. The pair, (3.10-1) and (3.10-3), is more lenient,
however (but see caveat (1) below). It can only be ill-posed when (3.10-13) applies
with a time-varying $\mathbf{w}$ that violates the time derivative of (3.10-13); i.e., PPE
solvability requires only $\int_\Gamma \mathbf{n}\cdot(\partial\mathbf{w}/\partial t) = 0$, and this only when $\mathbf{n}\cdot\mathbf{u}$ is specified on all
of $\Gamma$. More on this later, below and in Section 3.10.5.
(3) If the u-P problem is well-posed and (3.10-13) applies, then it is automatic that the
PPE problem is also well-posed even though the Neumann problem for the pressure
implies a solvability constraint. See below.
(4) If the $(\nabla\mathbf{u})^T$ term is omitted consistently (everywhere, a common occurrence in
practice), then the problems are also well-posed, and we are dealing with the
conventional ($\nabla^2$) form of the viscous term rather than the stress-divergence form,
and the things called '$\mathbf{F}$' are then pseudo-tractions.
(5) Although no initial data on P are supplied (or required), the initial pressure field
is obtainable by solving (3.10-3,5,7b,8 and 9) at $t = 0$. [See Heywood (1980),
Heywood and Rannacher (1982), Gresho and Sani (1987), and Gresho (1991a).]
Also, inserting the resulting pressure into (3.10-1) gives the initial acceleration.
These observations are, in fact, just a special case of the general situation: given
a divergence-free velocity ($\nabla\cdot\mathbf{u} = 0$ in $\Omega$ and $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ on those parts of $\Gamma$
with specified normal velocity), it is always possible to compute, sequentially, the
concomitant pressure and the associated divergence-free acceleration;
every divergence-free velocity implies (induces) both a pressure and a divergence-free
acceleration.
The acceleration is, of course, a measure of unbalanced forces, and will be zero if
all forces balance—this is a steady state.
(6) It is worth admitting that the very existence of a bounded solution for all time in
the 3D case is still an open issue; we offer nothing in this regard except hope (and
belief!).
(7) It is interesting to note that many have used the 'high-Re' approximation to the
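Neumann BC for the pressure, $\partial P/\partial n = 0$ at stationary walls and no source term,
vis-a-vis the correct BC, from (3.10-8), of $\partial P/\partial n = \mathrm{Re}^{-1}\,\mathbf{n}\cdot\nabla^2\mathbf{u} \neq 0$, but that 'on
average' they were correct (even at low Re) because
$$\int_\Gamma \frac{\partial P}{\partial n} = \mathrm{Re}^{-1}\int_\Gamma \mathbf{n}\cdot\nabla^2\mathbf{u} = \mathrm{Re}^{-1}\int_\Omega \nabla\cdot[\nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}] = 0,$$
since $\nabla\cdot\mathbf{u} = 0$ and div curl $(\cdot) = 0$.
(8) If (3.10-4) applies and $\mathbf{n}\cdot\mathbf{w} = 0$, then the flow must be parallel to the boundary.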
(9) The definition of boundary segments is different from that used in Figure 3.8-2 of
Section 3.8.1; both are legitimate, however.
It is not too difficult to provide an heuristic proof of the Equivalence 'Theorem'
for, at least, the special case wherein a unique NS solution is known to exist ('all' 2D
problems and 3D problems for sufficiently small data: Re, $\mathbf{g}$, $\mathbf{u}_0$, $\mathbf{w}$, $\mathbf{F}$, $F_n$, $F_\tau$), so we
do so:
1. Clearly the u-P problem implies the PPE problem; just form the divergence of (3.10-1)
using (3.10-2) and Heywood's (1980) result, (3.10-8) and (3.10-9).
2. The PPE problem implies the u-P problem if it guarantees that $\nabla\cdot\mathbf{u} = 0$. This
(necessary, but perhaps not sufficient, condition) is easy to show: just subtract (3.10-3) from
the divergence of (3.10-1) to obtain $\nabla\cdot(\partial\mathbf{u}/\partial t) = \partial(\nabla\cdot\mathbf{u})/\partial t = 0$ in $\Omega$. But we have
$\nabla\cdot\mathbf{u} = 0$ at $t = 0$ thanks to (3.10-11), and thus $\nabla\cdot\mathbf{u} = 0$ for all $t \ge 0$.
It may also be useful to prove some of the assertions made in Remarks (2) and (3)
above. Suppose $\mathbf{u}\cdot\mathbf{n} = \mathbf{n}\cdot\mathbf{w}$ is specified on all of $\Gamma$. Integrating then (3.10-3) over the
domain and invoking the divergence theorem on both sides yields
$$\int_\Gamma \frac{\partial P}{\partial n} = \int_\Gamma \mathbf{n}\cdot\{\mathrm{Re}^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}\}$$
as the solvability requirement associated with the Neumann problem. But application of
the BC given by (3.10-8) shows that this solvability requirement is nothing more than the
constraint $\int_\Gamma \mathbf{n}\cdot(\partial\mathbf{w}/\partial t) = 0$. So, if (3.10-13) is satisfied for all time, the PPE is always
solvable, and more: the only time there exists a solvability issue for the PPE is when $\mathbf{w}$
varies with time; this implies, for example, that a constant (in time) value of $\mathbf{w}$ on $\Gamma$ that
violates (3.10-13) will solve the PPE system but not the u-P system. Thus, it is vitally
important to emphasize the following two caveats on the Equivalence Theorem:
1. Only if all solvability requirements associated with the u-P system are also respected
by the PPE system will solutions of the latter also be solutions of the former.
2. The theorem is valid only if the solution is sufficiently regular: if the data are such that
$\nabla^2 P$ does not exist or if $\nabla\cdot\{\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\}$ does not exist in $\Omega$, or if $\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]$
does not exist on $\Gamma$, but $\nabla P$ and $\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]$ do exist in $\Omega$, then the u-P system can
have a solution but not the PPE system. (N.B., we are of course discussing only classical
solutions here.)
3.10.2 Some Ill-Posed Problems
If $\nabla\cdot\mathbf{u}_0 \neq 0$ in $\Omega$ and/or if $\mathbf{n}\cdot\mathbf{u}_0 \neq \mathbf{n}\cdot\mathbf{w}_0$ on $\Gamma_D$ and/or on $\Gamma_n$, then the ill-posed problem
can be converted to a well-posed problem by modifying $\mathbf{u}_0$ in such a way that the 'nearest'
(in $L^2$) divergence-free field is found and substituted for $\mathbf{u}_0$. This is accomplished by
a projection: $\mathbf{u}_0$ is projected to the nearest admissible divergence-free subspace (that
satisfies the normal BC on $\Gamma_D$ and $\Gamma_n$), a technique that will also prove useful later when
we discuss projection methods for solving (approximately) the NS equations. It proceeds as follows:
find $\mathbf{v}$ and the associated scalar $\phi$ (a Lagrange multiplier) via the Helmholtz/Weyl additive
decomposition of $\mathbf{u}_0$ into a divergence-free part and a curl-free part (see, for example,
Galdi, 1994):
$$\mathbf{u}_0 = \mathbf{v} + \nabla\phi \tag{3.10-14}$$
and
$$\nabla\cdot\mathbf{v} = 0 \quad \text{in } \Omega, \tag{3.10-15}$$
subject to the BC's
$$\mathbf{n}\cdot\mathbf{v} = \mathbf{n}\cdot\mathbf{w}_0 \quad \text{on } \Gamma_D \text{ and } \Gamma_n \tag{3.10-16}$$
and
$$\phi = 0 \quad \text{on } \Gamma_N \text{ and } \Gamma_\tau. \tag{3.10-17}$$
Remark:
The actual nearest divergence-free velocity is obtained only when $\mathbf{n}\cdot\mathbf{w}_0 = 0$ on $\Gamma_D$ and
$\Gamma_n$, in which case it is an orthogonal projection in that $\mathbf{v}$ is orthogonal to $\nabla\phi$: $\int_\Omega \mathbf{v}\cdot\nabla\phi = 0$.
(See Appendix 3.)
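This projection is realized via the following two steps:
Step 1. Solve
$$\nabla^2\phi = \nabla\cdot\mathbf{u}_0 \quad \text{in } \Omega \tag{3.10-18}$$
subject to the BC's
$$\frac{\partial\phi}{\partial n} = \mathbf{n}\cdot(\mathbf{u}_0 - \mathbf{w}_0) \quad \text{on } \Gamma_D \text{ and } \Gamma_n \tag{3.10-19}$$
and
$$\phi = 0 \quad \text{on } \Gamma_N \text{ and } \Gamma_\tau. \tag{3.10-20}$$
Step 2. Compute $\mathbf{v} = \mathbf{u}_0 - \nabla\phi$ as the new IC.
[Exercise for the reader: Prove that $\mathbf{v}$ is closer to $\mathbf{u}_0$ than is any other divergence-free vector field when $\mathbf{n}\cdot\mathbf{w}_0 = 0$.]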
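For intuition only, the two projection steps can be sketched in the simplest possible setting: a single Fourier mode $\mathbf{u}_0 = \mathbf{a}\,e^{i\mathbf{k}\cdot\mathbf{x}}$ on a periodic domain, where no boundary conditions enter. That periodic setting is an assumption that departs from the bounded-domain problem treated in the text. Step 1 then reduces to $-|\mathbf{k}|^2\hat\phi = i\,\mathbf{k}\cdot\mathbf{a}$, and Step 2 to $\hat{\mathbf{v}} = \mathbf{a} - \mathbf{k}(\mathbf{k}\cdot\mathbf{a})/|\mathbf{k}|^2$. The function name `project_mode` is hypothetical.

```python
def project_mode(a, k):
    """Project one Fourier mode a*exp(i k.x) onto its divergence-free part.

    a, k: 2-tuples (amplitude vector and wave vector); periodic domain assumed.
    Returns (v, grad_phi) with a = v + grad_phi and k.v = 0.
    """
    k2 = k[0] ** 2 + k[1] ** 2
    if k2 == 0.0:
        # The mean flow (k = 0) is already divergence-free.
        return (a[0], a[1]), (0.0, 0.0)
    k_dot_a = k[0] * a[0] + k[1] * a[1]
    # Step 1: -|k|^2 phi_hat = i k.a, so grad(phi) has amplitude k (k.a)/|k|^2.
    grad_phi = (k[0] * k_dot_a / k2, k[1] * k_dot_a / k2)
    # Step 2: v = u0 - grad(phi).
    v = (a[0] - grad_phi[0], a[1] - grad_phi[1])
    return v, grad_phi


v, grad_phi = project_mode(a=(1.0, 2.0), k=(3.0, 4.0))
divergence = 3.0 * v[0] + 4.0 * v[1]                # k.v, should vanish
overlap = v[0] * grad_phi[0] + v[1] * grad_phi[1]   # v.grad(phi), should vanish
print(divergence, overlap)
```

Both printed numbers should be zero to rounding error: the projected mode is divergence-free and orthogonal to the discarded gradient part, which is the periodic analog of the orthogonality noted in the Remark above.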
While some ill-posed problems can be easily converted to well-posed neighbors,
such as those just discussed, those generated by a violation of (3.10-13)—global mass
conservation—cannot. Glowinski (1984) has presented a way to modify the boundary
data so that even this ill-posed problem can be converted to one that is well-posed—and
we shall present another later (Section 3.13.1g).
3.10.3 The Simplified PPE is also Ill-Posed
The simplified PPE (SPPE), obtained from (3.10-3) by assuming $\nabla\cdot\mathbf{u} = 0$ in the viscous
term, i.e.,
$$\nabla^2 P = \nabla\cdot(\mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}), \tag{3.10-21}$$
is also often regarded as the PPE. But we have earlier alluded to the existence of some
potential dangers associated with this seemingly equivalent 'statement of incompressibility'
(Section 3.5.1), and we discuss some of them now. If (3.10-1) and (3.10-21) are
solved together, then the behavior of the velocity divergence ($\nabla\cdot\mathbf{u} \equiv \theta$) can be obtained
by subtracting (3.10-21) from the divergence of (3.10-1) to give, for the simpler case of
straight (planar in 3D) boundaries wherein $\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] = \nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u})$,
$$\frac{\partial\theta}{\partial t} = 2\nu\nabla^2\theta, \tag{3.10-22}$$
where the factor of 2 exists because we used the stress-divergence formulation; the $\nabla^2$
formulation would not have it. But the key issue is the same in either case: the divergence
satisfies a transient 'heat' equation. If the initial velocity is divergence-free, $\theta = 0$ at $t = 0$
and (3.10-22) would ensure that $\theta$ remains zero if either $\theta$ or $\partial\theta/\partial n$ was held at zero on
$\Gamma$. But if we cannot show the existence of either of these BC's, and we cannot from the
problem posed, then Murphy's Law would tell us that divergence might sneak in, which
of course would cause the SPPE solution to be wrong in that it is not then a solution of the
NS equations. This ambiguity seems to be avoidable (for the continuum at least) simply
by replacing $\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]$ by $-\nabla\times\nabla\times\mathbf{u}$ in (3.10-8) because, in addition to implying
(3.10-22), it also implies, in an exercise that we leave to the reader, $\partial\theta/\partial n = 0$ on $\Gamma_D$,
thus assuring that $\theta = 0$ if it started that way.
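To illustrate why the boundary behavior of the divergence matters, here is a minimal sketch that integrates a 1D analog of (3.10-22), $\theta_t = 2\nu\theta_{xx}$, with an explicit finite-difference (FTCS) scheme. The grid size, time step, value of $\nu$, and the function name `diffuse_divergence` are illustrative assumptions, not taken from the text. With $\theta$ held at zero on both ends, $\theta$ stays identically zero; a nonzero boundary value diffuses into the interior.

```python
def diffuse_divergence(theta_left, theta_right, nu=0.01, n=21, steps=200):
    """Explicit (FTCS) integration of theta_t = 2*nu*theta_xx on [0, 1].

    theta starts at zero in the interior; the end values are held fixed
    (Dirichlet data for the divergence, an assumption for illustration).
    """
    dx = 1.0 / (n - 1)
    dt = 0.2 * dx * dx / (2.0 * nu)   # well inside the FTCS stability limit
    r = 2.0 * nu * dt / (dx * dx)     # diffusion number, equals 0.2 here
    theta = [0.0] * n
    theta[0], theta[-1] = theta_left, theta_right
    for _ in range(steps):
        new = theta[:]
        for i in range(1, n - 1):
            new[i] = theta[i] + r * (theta[i + 1] - 2.0 * theta[i] + theta[i - 1])
        theta = new
    return theta


clean = diffuse_divergence(0.0, 0.0)
leaky = diffuse_divergence(1.0, 0.0)
print(max(abs(t) for t in clean[1:-1]))   # zero: no divergence enters
print(max(abs(t) for t in leaky[1:-1]))   # positive: divergence has diffused in
```

The second case mimics a formulation that does not control $\theta$ on the boundary: an initially divergence-free interior does not stay that way.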
We now present a proof of this ill-posedness for the linear case—Stokes flow (thanks
to R. Rannacher), after which we present an example of same (thanks to J. Strikwerda).
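Since the ill-posedness is caused by lack of uniqueness, we begin by testing uniqueness:
suppose we have a solution, ($\mathbf{u}$, $P$), to the PPE problem in which (3.10-21) is used rather
than (3.10-3). Then, add any harmonic function, say $H$, to $P$ and see if the pair ($\mathbf{u} + \mathbf{v}$,
$P + H$) with $\mathbf{v}$ to be determined, has the unique solution $\mathbf{v} = 0$ and $H = 0$; any non-zero
solutions would show non-uniqueness. Inserting this pair into the IBVP and utilizing the
fact that ($\mathbf{u}$, $P$) is a solution of (3.10-1) and (3.10-21), sans $\mathbf{u}\cdot\nabla\mathbf{u}$, leaves us with the
following problem: find $\mathbf{v}$ from
$$\frac{\partial\mathbf{v}}{\partial t} + \nabla H = \mathrm{Re}^{-1}\,\nabla\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T] \ \text{ with } \ \nabla^2 H = 0 \quad \text{in } \Omega, \tag{3.10-23}$$
$$\mathbf{v} = 0 \quad \text{on } \Gamma_D, \tag{3.10-24}$$
$$\mathrm{Re}^{-1}[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\cdot\mathbf{n} = H\mathbf{n} \quad \text{on } \Gamma_N, \tag{3.10-25}$$
$$\mathbf{n}\cdot\mathbf{v} = 0 \ \text{ and } \ \mathrm{Re}^{-1}\,\boldsymbol{\tau}\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\cdot\mathbf{n} = 0 \quad \text{on } \Gamma_n, \tag{3.10-26}$$
and
$$\boldsymbol{\tau}\cdot\mathbf{v} = 0 \ \text{ and } \ \mathrm{Re}^{-1}\,\mathbf{n}\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\cdot\mathbf{n} = H \quad \text{on } \Gamma_\tau, \tag{3.10-27}$$
with
$$\mathbf{v} = 0 \ \text{ at } t = 0 \ \ (\text{and thus } \nabla\cdot\mathbf{v} = 0 \text{ at } t = 0). \tag{3.10-28}$$
It is immediately clear, since $H$ is any given harmonic function, that the above well-posed
problem for $\mathbf{v}$ possesses a non-trivial solution, and one which is generally not
divergence-free. The SPPE problem is ill-posed owing to non-uniqueness.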
Remarks:
(1) The consistent PPE problem, if subjected to the same analysis, easily leads to
$\mathbf{v} = 0$, $H = 0$ because $\nabla^2 H = 0$ is replaced by $\nabla^2 H = \nabla\cdot\{\mathrm{Re}^{-1}\,\nabla\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\}$,
showing that in this case, $H$ is not generally a harmonic function, which then leads
to $\partial(\nabla\cdot\mathbf{v})/\partial t = 0$, and thus to $\mathbf{v} = 0$, with $H = 0$ then following easily. [Take the
inner product of the first of (3.10-23) with $\mathbf{v}$, etc. Or, see below.]
(2) A well-posed SPPE approach can be recovered by adding the additional BC $\nabla\cdot\mathbf{u} = 0$
on $\Gamma$, which we shall show below.
(3) In Gresho and Sani (1987, pp. 1138 and 1141), it was erroneously stated that the
SPPE is well posed if the proper Neumann BC is employed.
An example of the SPPE non-uniqueness is the following: solve for $\mathbf{u}$ and $P$ from
$$\frac{\partial\mathbf{u}}{\partial t} + \nabla P = \nu\nabla^2\mathbf{u} \ \text{ and } \ \nabla^2 P = 0 \quad \text{in } \Omega, \tag{3.10-29}$$
where $\Omega$ is the unit square centered at (1.5, 1.5) with sides parallel to the x- and y-axes.
The initial data are $\mathbf{u} = 0$ and the boundary conditions are
$$u(\mathbf{x}, t) = -\mathrm{erfc}\left(x/\sqrt{4\nu t}\right), \tag{3.10-30}$$
$$v(\mathbf{x}, t) = \mathrm{erfc}\left(y/\sqrt{4\nu t}\right), \tag{3.10-31}$$
and
$$\frac{\partial P}{\partial n} = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} - \partial\mathbf{u}/\partial t), \tag{3.10-32}$$
which satisfy the IC and the mass-conservation condition,
$$\int_\Gamma \mathbf{u}\cdot\mathbf{n} = 0 \quad \text{for } t \ge 0. \tag{3.10-33}$$
Whereas there exists a unique solution to the Stokes equations ($\mathbf{u}$, $P$) or the CPPE version
of same (with same solution) for these data, the SPPE formulation possesses this solution
and many more, one of which is simply $u$ and $v$ from (3.10-30), (3.10-31), and $P = 0$.
This extraneous solution, and most of the others, will not retain $\nabla\cdot\mathbf{u} = 0$; in fact, the
above solution has $\nabla\cdot\mathbf{u} = \left(e^{-x^2/4\nu t} - e^{-y^2/4\nu t}\right)/\sqrt{\pi\nu t}$.
Remarks:
(1) The SPPE solution [and the BC's in (3.10-30) through (3.10-32)] was generated by
seeking solutions to the 1D heat equation.
(2) The reason that the unit square is not centered at (1/2, 1/2), which is its usual
location, is to preclude jumps in the normal velocity at t = 0+; i.e., to have a
well-posed problem.
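The divergence of the extraneous SPPE solution can be checked numerically. The sketch below evaluates the velocity of (3.10-30) and (3.10-31) with the standard-library `math.erfc`, differentiates it by central differences, and compares with the closed form $(e^{-x^2/4\nu t} - e^{-y^2/4\nu t})/\sqrt{\pi\nu t}$. The function names, the sample point, and the values of $\nu$ and $t$ are arbitrary choices made for illustration.

```python
import math


def spurious_velocity(x, y, t, nu):
    """Velocity (u, v) of the extraneous SPPE solution, (3.10-30) and (3.10-31)."""
    s = math.sqrt(4.0 * nu * t)
    return -math.erfc(x / s), math.erfc(y / s)


def divergence_exact(x, y, t, nu):
    """Closed-form divergence of the extraneous solution."""
    a = 4.0 * nu * t
    return (math.exp(-x * x / a) - math.exp(-y * y / a)) / math.sqrt(math.pi * nu * t)


def divergence_fd(x, y, t, nu, h=1e-5):
    """Central-difference estimate of du/dx + dv/dy."""
    dudx = (spurious_velocity(x + h, y, t, nu)[0]
            - spurious_velocity(x - h, y, t, nu)[0]) / (2.0 * h)
    dvdy = (spurious_velocity(x, y + h, t, nu)[1]
            - spurious_velocity(x, y - h, t, nu)[1]) / (2.0 * h)
    return dudx + dvdy


x, y, t, nu = 1.2, 1.7, 0.5, 1.0   # a point inside the square; values are arbitrary
print(divergence_exact(x, y, t, nu), divergence_fd(x, y, t, nu))
```

The two printed values should agree closely and be nonzero; the divergence vanishes only along the diagonal $x = y$.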
Similar remarks apply to related versions of the SPPE of (3.5-2), such as (3.5-4), for
which the implied divergence equation is
$$\frac{\partial\theta}{\partial t} + \mathbf{u}\cdot\nabla\theta = 2\nu\nabla^2\theta,$$
and for the 2D version below (3.5-4), that using $\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = 2(u_y v_x - v_y u_x)$, it
becomes
$$\frac{\partial\theta}{\partial t} + \mathbf{u}\cdot\nabla\theta = 2\nu\nabla^2\theta - \theta^2,$$
the first being an advection-diffusion equation and the second the same except for a
(stabilizing) sink term. But the important conclusion from all of these is that these
simplified forms of the PPE do not, in fact, simplify the analysis and cannot generally assure
that $\theta = 0$ in $\Omega$ for $t > 0$; only the CPPE (or the SPPE plus the BC $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$, as
shown next) can perform this important function.
3.10.4 Fixing the SPPE, and the PPE Paradox
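The SPPE formulation can be rendered unique by the addition,
simple in principle but probably not in practice, of another BC, as mentioned earlier
[Remark (6) in Section 3.8.2]:
$$\nabla\cdot\mathbf{u} = 0 \quad \text{on } \Gamma, \tag{3.10-34}$$
which 'closes' the problem posed in (3.10-22); i.e., we now have that $\nabla\cdot\mathbf{u} = 0$ in $\Omega$ for
$t \ge 0$. After noting that $\nabla\cdot\mathbf{v} = 0$, uniqueness then follows from (3.10-23) via
$$\frac{1}{2}\frac{d}{dt}\int_\Omega \mathbf{v}\cdot\mathbf{v} + \int_\Omega \mathbf{v}\cdot\nabla H = \mathrm{Re}^{-1}\int_\Omega \mathbf{v}\cdot\{\nabla\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\}; \tag{3.10-35}$$
i.e. (see Table 3.1-2),
$$\frac{1}{2}\frac{d}{dt}\int_\Omega \mathbf{v}\cdot\mathbf{v} - \int_\Omega H\,\nabla\cdot\mathbf{v} + \int_\Gamma H\,\mathbf{n}\cdot\mathbf{v} = \mathrm{Re}^{-1}\int_\Gamma \mathbf{n}\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\cdot\mathbf{v} - \mathrm{Re}^{-1}\int_\Omega \nabla\mathbf{v} : [\nabla\mathbf{v} + (\nabla\mathbf{v})^T]$$
to give, noting that $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_D$ and $\Gamma_n$ and accounting for the BC on $\Gamma_N$ and $\Gamma_\tau$,
$$\frac{1}{2}\frac{d}{dt}\int_\Omega \mathbf{v}\cdot\mathbf{v} = -\mathrm{Re}^{-1}\int_\Omega \nabla\mathbf{v} : [\nabla\mathbf{v} + (\nabla\mathbf{v})^T] \le 0, \tag{3.10-36}$$
which, since $\mathbf{v} = 0$ at $t = 0$, gives $\mathbf{v} = 0$ for all time.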
We conclude this portion of the PPE discussion by stating
The PPE Paradox:
If you include it, you do not need it—but if you do not include it, you do need it, where
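'it' refers to the viscous term, either $\nu\nabla\cdot\{\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\} = 2\nu\nabla^2\nabla\cdot\mathbf{u}$ or $\nu\nabla\cdot\nabla^2\mathbf{u} =
\nu\nabla^2\nabla\cdot\mathbf{u}$ on the RHS of the PPE. The former (inclusion) gives $\nabla\cdot\mathbf{u} = 0$ via the CPPE
formulation, and the latter (exclusion), via the SPPE formulation, generally results in
$\nabla\cdot\mathbf{u} \neq 0$. Interesting.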
3.10.5 PPE Solutions that are not NSE Solutions
But even the CPPE is not free of problems, as mentioned earlier. One of them is already
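apparent: if for some reason $\theta_0 = \nabla\cdot\mathbf{u}_0 \neq 0$, via oversight or carelessness or 'dumb/naive
user' (the most common case, perhaps), the CPPE solution will give $\nabla\cdot\mathbf{u}(\mathbf{x}, t) = \nabla\cdot\mathbf{u}_0(\mathbf{x})$;
any initial divergence is frozen (spatially) into the fluid for all time. Another such
violation could occur if $\mathbf{n}\cdot\mathbf{u}_0 \neq \mathbf{n}\cdot\mathbf{w}_0$ at some points of $\Gamma$ with $\mathbf{n}\cdot\mathbf{w}_0$ independent
of time; the PPE would hold $\mathbf{n}\cdot\mathbf{u}(t) = \mathbf{n}\cdot\mathbf{u}_0 \neq \mathbf{n}\cdot\mathbf{w}_0$ at these same points on $\Gamma$ for
all time.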
This and other ill-posed NS problems are summarized in two equivalent ways in
Figures 3.10-1 and 3.10-2, which attempt to show those PPE solutions that are not
solutions of the NS (u-P) equations by virtue of the looser PPE solvability constraints.
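[Recall: The only case in which the data ($\mathbf{u}_0$, $\mathbf{w}$) do not admit a PPE solution is when
$\mathbf{n}\cdot\mathbf{u}$ is specified (as $\mathbf{n}\cdot\mathbf{w}$) on all of $\Gamma$ and $\int_\Gamma \mathbf{n}\cdot(\partial\mathbf{w}/\partial t) \neq 0$. Conversely, the data admit a
solution whenever $\mathbf{n}\cdot\mathbf{u}$ is not specified on all of $\Gamma$ or whenever $\mathbf{n}\cdot\mathbf{u}$ is specified on all
of $\Gamma$ and $\int_\Gamma \mathbf{n}\cdot(\partial\mathbf{w}/\partial t) = 0$ for $t \ge 0$, which latter case constitutes the only solvability
requirement for the PPE. The simplest example of a well-posed PPE problem that is ill-posed
in the u-P formulation is a constant (in time) value of $\mathbf{n}\cdot\mathbf{w}$ that violates $\int_\Gamma \mathbf{n}\cdot\mathbf{w} = 0$.]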
The figures are meant to include all initial data pairs ($\mathbf{u}_0$, $\mathbf{n}\cdot\mathbf{w}_0$) that admit a PPE
solution. Each ellipse in Figure 3.10-1 corresponds to a NS constraint violation: (3.10-13) at
$t = 0$, (3.10-11), or (3.10-12). The union of the three ellipses denotes that subset that does
not admit a solution to the NS (u-P) equations, and the horizontal ellipse (present only
when $\Gamma_N = \emptyset$ and $\Gamma_\tau = \emptyset$) is contained in the union of the other two because $\nabla\cdot\mathbf{u}_0 = 0$
and $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0 \Rightarrow \int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 = 0$ [and therefore $\int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 \neq 0 \Rightarrow$ one of the following:
(i) $\nabla\cdot\mathbf{u}_0 \neq 0$ and $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$, (ii) $\mathbf{n}\cdot\mathbf{u}_0 \neq \mathbf{n}\cdot\mathbf{w}_0$ and $\nabla\cdot\mathbf{u}_0 = 0$, or (iii) $\nabla\cdot\mathbf{u}_0 \neq 0$
and $\mathbf{n}\cdot\mathbf{u}_0 \neq \mathbf{n}\cdot\mathbf{w}_0$]. The little boxes depict computational domains and the vectors depict
BC's (Dirichlet, normal direction) and IC's; e.g., the vector in box (1) implies $\mathbf{u}_0 \neq 0$ and
$\nabla\cdot\mathbf{u}_0 \neq 0$ (and is the case discussed above, which we shall also demonstrate later), and
that in (5) indicates a BC violation of the type $\mathbf{n}\cdot\mathbf{u}_0 \neq \mathbf{n}\cdot\mathbf{w}_0$ on $\Gamma$ (with $\mathbf{u}_0 = 0$ in $\Omega$).
The intended interpretation is then as follows: (1) and (2) each violate only one constraint,
(3) through (5) violate two, and (6) violates all three constraints. Further explanation: the
statement $\mathbf{u}_0 = 0$ in (2) and (5) can also be interpreted as/generalized to: a divergence-free
IC of compact support ($\nabla\cdot\mathbf{u}_0 = 0$ and $\mathbf{u}_0 \to 0$ for $\mathbf{x} \to \Gamma$); also, these cases are probably
ill-posed for another reason than constraint violation (rough data). In Case (2), the BC
vectors are of the same length to indicate the satisfaction of global mass conservation
[(3.10-13)] when $\mathbf{w}$ is time-varying. (If the BC's are time independent, they could be of
different lengths since the PPE formulation does not then see them.) Case (3) is the 'sum'
of (1) and (2), and Case (4) is meant to depict a situation wherein the initial velocity is
smooth and satisfies $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0 = \mathbf{n}\cdot\mathbf{w}(t)$ but does not satisfy $\nabla\cdot\mathbf{u}_0 = 0$. Here too
the initial divergence would remain for all time with, in addition, a violation of global
mass conservation. Finally, (6) is the sum of (1) and (5). If $\mathbf{n}\cdot\mathbf{w}$ is time-dependent and
the condition $\int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 \neq 0$ is generalized to $\int_\Gamma \mathbf{n}\cdot\mathbf{w}(t) \neq 0$, then Cases (4)-(6) are also
ill-posed in the PPE formulation. Hopefully the key point has been made: it is incumbent
upon the user of a PPE method to be sure that s/he is really solving the NS equations.
One can best do this by 'imagining' that one is using the u-P formulation and heeding
all of its associated constraints.
Fig. 3.10-1 A Venn diagram of those PPE solutions that do not solve the NSE.
Fig. 3.10-2 Schematic of those PPE solutions that do not solve the NSE. (Panel labels: $\int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 = 0$ and $\int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 \neq 0$; $\nabla\cdot\mathbf{u}_0 \neq 0$ with $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$; $\mathbf{n}\cdot\mathbf{u}_0 \neq \mathbf{n}\cdot\mathbf{w}_0$ with $\nabla\cdot\mathbf{u}_0 = 0$; $\nabla\cdot\mathbf{u}_0 \neq 0$ and $\mathbf{n}\cdot\mathbf{u}_0 \neq \mathbf{n}\cdot\mathbf{w}_0$.)
3.10.6 A Remark on the Penalty Method
To conclude this section, whose objective was, in part, to reveal and discuss some of the
principal subtleties of incompressible flow that carry over—for the most part—to the FEM
approximate solution methods to be discussed later, we return to the penalized momentum
equation and the associated penalty approximation—(3.5-10) and (3.5-11)—and make the
single remark: there is no such thing as an ill-posed penalty method problem, which at
first blush sounds much better than either the u-P method or the PPE method; 'all' IC's
and BC's generate well-posed problems. But—not all solutions will be well-behaved or
attractive (the lunch is cheap but not free) owing to the spurious penalty transient—an
example of which we will present later.
A sort of 'bottom line' here, and for PPE methods, is this: there is no substitute for a
good understanding of the 'primitive' (u, P) Navier-Stokes equations.
3.10.7 Key Features of Incompressible Flow
Somewhat in the way of a summary of what has been said, and just before getting on
with the task of actually trying to solve the NS equations via GFEM, we present some
key features of incompressible flow that are unique to it, i.e., not present for compressible
flows:
1. The equations contain an elliptic part, which causes instantaneous transmission of
pressure signals throughout the domain. (The sound speed is infinite.)
2. The initial conditions are constrained—they must be divergence-free.
3. The boundary conditions are constrained, sometimes—they must satisfy global mass
conservation.
4. Impulsive starts/changes in the normal direction are forbidden, which implies a need
for some compatibility between BC's and IC's.
5. The 'simple' mass conservation (constraint) equation is omnipotent, all space, all time.
6. The implied Poisson equation for the pressure and its BC's are, when made explicit,
subject to an unbelievably large amount of misunderstanding.
7. Even the nearly hyperbolic Euler equations need an OBC—for the pressure.
8. 'However, the emphasis is very different for high speed and low speed flows and
we shall concentrate on the latter because the burden they place on the design of good
difference schemes is in many ways greater'—Morton (1971).
9. When semi-discretized (in space), the resulting differential-algebraic equations are more
difficult to solve than ordinary differential equations.
3.11 GLOBAL CONSERVATION LAWS
Before returning to the FEM, there is one more set of 'goals' that needs to be discussed.
Inherent in the NS PDE's are certain global conservation laws that should be mimicked by
the discrete solution. We present these here and their discrete analogs later after pointing
out that only some of the continuum conservation laws are also respected by the discrete
solution, and sometimes even these are not obtained without some extra work that usually
is not obvious at the outset. We present next the following results from classical fluid
mechanics: global conservation of mass, momentum, energy, vorticity, and enstrophy—the
latter two, of course, being less relevant to our later needs.
3.11.1 Conservation of Mass
We start easy and repeat what has been said many times before—for emphasis. Local
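mass conservation is described by $\nabla\cdot\mathbf{u} = 0$ and global mass conservation by integrating
over the domain; i.e.,
$$\int_\Omega \nabla\cdot\mathbf{u} = \int_\Gamma \mathbf{n}\cdot\mathbf{u} = 0 \tag{3.11-1}$$
does it: inflow = outflow. Whether $\mathbf{n}\cdot\mathbf{u}$ is part of the solution or is given by BC's,
(3.11-1) must always be satisfied. For example, for the IBVP of the previous section,
(3.10-1) through (3.10-13), this translates to
$$\int_{\Gamma_D} \mathbf{n}\cdot\mathbf{w} + \int_{\Gamma_N} \mathbf{n}\cdot\mathbf{u} + \int_{\Gamma_n} \mathbf{n}\cdot\mathbf{w} + \int_{\Gamma_\tau} \mathbf{n}\cdot\mathbf{u} = 0. \tag{3.11-2}$$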
Global mass conservation is the one requirement that carries over strongly and intact
to the discrete equations. The remainder are both harder to satisfy and somewhat less
important in general.
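As a rough pre-flight check of (3.11-1), the net outflow implied by a set of boundary data can be estimated by quadrature before any flow solve is attempted. The sketch below does this for a rectangular domain with the midpoint rule; the function names, the two sample velocity fields, and the panel count are hypothetical choices for illustration.

```python
def net_outflow(velocity, x0, x1, y0, y1, n=200):
    """Midpoint-rule estimate of the boundary integral of n.u on a rectangle.

    velocity(x, y) returns (u, v); n is the number of panels per side.
    """
    hx = (x1 - x0) / n
    hy = (y1 - y0) / n
    total = 0.0
    for i in range(n):
        xm = x0 + (i + 0.5) * hx
        ym = y0 + (i + 0.5) * hy
        total += velocity(x1, ym)[0] * hy    # right side,  n = (+1, 0)
        total -= velocity(x0, ym)[0] * hy    # left side,   n = (-1, 0)
        total += velocity(xm, y1)[1] * hx    # top side,    n = (0, +1)
        total -= velocity(xm, y0)[1] * hx    # bottom side, n = (0, -1)
    return total


def channel(x, y):
    """Parabolic profile entering at x = 0 and leaving at x = 1 (divergence-free)."""
    return 4.0 * y * (1.0 - y), 0.0


def source_like(x, y):
    """u = (x, y) has div u = 2, so it cannot satisfy (3.11-1)."""
    return x, y


print(net_outflow(channel, 0.0, 1.0, 0.0, 1.0))       # close to zero
print(net_outflow(source_like, 0.0, 1.0, 0.0, 1.0))   # close to 2, the integral of div u
```

Data of the second kind would make the u-P problem ill-posed whenever the normal velocity is specified on all of the boundary.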
3.11.2 Momentum Conservation
The best starting point is, of course, the divergence form of the momentum
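equation (3.6-1). Integration over $\Omega$ and using the divergence theorem gives, directly,
$$\frac{d}{dt}\int_\Omega \mathbf{u} + \int_\Gamma [(\mathbf{n}\cdot\mathbf{u})\mathbf{u} + P\mathbf{n}] = \mathrm{Re}^{-1}\int_\Gamma \mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T], \tag{3.11-3}$$
or, using (3.3-2) and (3.3-3),
$$\frac{d}{dt}\int_\Omega \mathbf{u} = \int_\Gamma [\mathbf{F} - (\mathbf{n}\cdot\mathbf{u})\mathbf{u}], \tag{3.11-4}$$
which can also be interpreted as a global force balance once the momentum flux, $u_n\mathbf{u}$,
is interpreted as a 'force.' This is the general result. Special cases come from specific
problems; e.g., for that described by (3.10-4) through (3.10-7), that portion of the boundary
integral over $\Gamma_N$ would have $\mathrm{Re}^{-1}\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] - P\mathbf{n}$ replaced by $\mathbf{F}$, the applied force
on this portion of $\Gamma$, etc.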
As a final remark we mention that, if any other equivalent PDE were used as the starting
point—such as (3.2-4)—it would need to be manipulated into this form, usually via the
judicious use of V • u = 0, before the 'proper' form of global momentum conservation
could be ascertained.
3.11.3 Kinetic Energy Conservation
This one is trickier, but also important. The derivation below is the most efficient and
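useful that we have found. Defining the kinetic energy (KE) as
$$E \equiv \frac{1}{2}\int_\Omega \rho\,\mathbf{u}\cdot\mathbf{u} = \frac{1}{2}\rho\int_\Omega q^2, \tag{3.11-5}$$
we first form the local KE equation by taking the scalar product of (3.6-1) rewritten in
the following (dimensional) form,
$$\rho\left[\frac{\partial\mathbf{u}}{\partial t} + \nabla\cdot(\mathbf{u}\mathbf{u})\right] = \nabla\cdot\boldsymbol{\sigma}, \tag{3.11-6}$$
with $\mathbf{u}$, to obtain
$$\frac{1}{2}\rho\frac{\partial q^2}{\partial t} + \rho\,\mathbf{u}\cdot[\nabla\cdot(\mathbf{u}\mathbf{u})] = \mathbf{u}\cdot(\nabla\cdot\boldsymbol{\sigma}). \tag{3.11-7}$$
Using now $\mathbf{u}\cdot[\nabla\cdot(\mathbf{u}\mathbf{u})] = (\mathbf{u}\cdot\mathbf{u})\nabla\cdot\mathbf{u} + \tfrac{1}{2}\mathbf{u}\cdot\nabla(\mathbf{u}\cdot\mathbf{u}) = \tfrac{1}{2}\mathbf{u}\cdot\nabla q^2 = \tfrac{1}{2}\nabla\cdot(q^2\mathbf{u})$ and
$\nabla\cdot(\boldsymbol{\sigma}\cdot\mathbf{u}) = \mathbf{u}\cdot(\nabla\cdot\boldsymbol{\sigma}) + \boldsymbol{\sigma} : \nabla\mathbf{u}$ yields
$$\frac{1}{2}\rho\left[\frac{\partial q^2}{\partial t} + \nabla\cdot(q^2\mathbf{u})\right] = \nabla\cdot(\boldsymbol{\sigma}\cdot\mathbf{u}) - \boldsymbol{\sigma} : \nabla\mathbf{u}, \tag{3.11-8}$$
which is in the proper form for the 'integration + divergence theorem,' giving
$$\dot{E} + \frac{1}{2}\rho\int_\Gamma q^2(\mathbf{n}\cdot\mathbf{u}) = \int_\Gamma \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{u} - \int_\Omega \boldsymbol{\sigma} : \nabla\mathbf{u}. \tag{3.11-9}$$
Now we use $\mathbf{n}\cdot\boldsymbol{\sigma} = \mathbf{F}$ and $\boldsymbol{\sigma} : \nabla\mathbf{u} = (\mathbf{d} - P\mathbf{I}) : \nabla\mathbf{u} = \mathbf{d} : \nabla\mathbf{u}$, since $\mathbf{I} : \nabla\mathbf{u} = \nabla\cdot\mathbf{u} = 0$, to
give, semi-finally,
$$\dot{E} + \frac{1}{2}\rho\int_\Gamma q^2\,\mathbf{u}\cdot\mathbf{n} = \int_\Gamma \mathbf{F}\cdot\mathbf{u} - \int_\Omega \mathbf{d} : \nabla\mathbf{u}. \tag{3.11-10}$$
But because $\mathbf{d}^T = \mathbf{d}$,
$$\mathbf{d} : \nabla\mathbf{u} = \frac{1}{2}\,\mathbf{d} : [\nabla\mathbf{u} + (\nabla\mathbf{u})^T] = \frac{1}{2\mu}\,\mathbf{d} : \mathbf{d} \equiv \Phi \ge 0, \tag{3.11-11}$$
where $\Phi$ is the viscous dissipation function (internal friction), the final KE equation is
$$\dot{E} = \int_\Gamma \mathbf{F}\cdot\mathbf{u} - \frac{1}{2}\rho\int_\Gamma q^2\,\mathbf{n}\cdot\mathbf{u} - \int_\Omega \Phi; \tag{3.11-12}$$
the kinetic energy increases owing to work done by the boundary on the fluid and
by net inflow of KE. It always decreases owing to viscous dissipation throughout the
domain. (The kinetic energy lost by $\Phi$ is gained as internal energy/enthalpy, detailed in
Volume II.) Typically, on part of $\Gamma$, $\mathbf{F}$ is an applied traction force while on the rest of $\Gamma$,
it is a reactionary pressure and viscous force applied to the fluid and is to be evaluated
via (3.8-1); i.e., $\mathbf{F} = \mu[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - P\mathbf{n}$. [This same comment applies to the global
momentum equation (3.11-4), where (3.8-1) typically is used to compute lift and drag.]
For an inviscid fluid, the energy balance equation simplifies to
$$\dot{E} = -\int_\Gamma \left(P + \tfrac{1}{2}\rho q^2\right)\mathbf{n}\cdot\mathbf{u}, \tag{3.11-13}$$
which, for example, is consistent with steady potential flow for which $P + \tfrac{1}{2}\rho q^2$ is constant,
and thus $\dot{E} = 0$.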
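The chain of equalities in (3.11-11) is easy to check numerically for a sample velocity gradient. The sketch below assumes the viscous stress is $\mathbf{d} = \mu[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]$ (consistent with the factor $1/2\mu$ in (3.11-11)) and works in 2D; the function name `dissipation` and the sample numbers are hypothetical.

```python
def dissipation(grad_u, mu):
    """Return (d : grad u, d : d / (2 mu)) for a 2x2 velocity gradient.

    grad_u[i][j] holds du_j/dx_i; d = mu*(grad u + grad u^T) is the viscous stress.
    """
    d = [[mu * (grad_u[i][j] + grad_u[j][i]) for j in range(2)] for i in range(2)]
    d_grad_u = sum(d[i][j] * grad_u[i][j] for i in range(2) for j in range(2))
    d_d = sum(d[i][j] * d[i][j] for i in range(2) for j in range(2))
    return d_grad_u, d_d / (2.0 * mu)


# Divergence-free sample gradient (trace zero); the numbers are arbitrary.
phi_a, phi_b = dissipation([[0.3, -1.2], [0.7, -0.3]], mu=2.0)
print(phi_a, phi_b)   # the two values agree and are non-negative

# Rigid rotation has an antisymmetric gradient and dissipates nothing.
print(dissipation([[0.0, 1.0], [-1.0, 0.0]], mu=2.0))
```

The rigid-rotation case shows that only the symmetric (straining) part of the velocity gradient contributes to the dissipation.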
3.11.4 Vorticity Conservation
We limit our consideration to 2D since we do not plan to discuss 3D vorticity methods in
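this text. (Indeed, we give little more than lip service to the 2D $\psi$-$\omega$ formulation, and
that with slightly snarling lips.) So, starting with (3.5-7) with the advection term changed
to divergence form yields
$$\frac{\partial\omega}{\partial t} + \nabla\cdot(\mathbf{u}\omega) = \mathrm{Re}^{-1}\,\nabla^2\omega, \tag{3.11-14}$$
which easily leads to
$$\frac{d}{dt}\int_\Omega \omega = \int_\Gamma \left(\mathrm{Re}^{-1}\frac{\partial\omega}{\partial n} - \omega\,\mathbf{n}\cdot\mathbf{u}\right), \tag{3.11-15}$$
the typical global conservation law for the scalar transport equation.
Another, and simpler, vorticity conservation law is derivable from Stokes'
circulation theorem, $\int_S \boldsymbol{\omega}\cdot\mathbf{n}\,ds = \oint_C \mathbf{u}\cdot d\mathbf{l}$, which, in the special 2D case being considered here
(wherein both $\boldsymbol{\omega}$ and $\mathbf{n}$ point 'out of the page'), is
$$\int_\Omega \omega = \int_\Gamma u_\tau; \tag{3.11-16}$$
the total vorticity is, instantaneously, given by the boundary integral of the tangential
velocity.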
Remarks:
(1) This last result is also obtainable (equivalently) from the elliptic equation portion of
the $\psi$-$\omega$ pair, $\nabla^2\psi + \omega = 0$, via the divergence theorem:
$$\int_\Omega \omega = -\int_\Omega \nabla\cdot(\nabla\psi) = -\int_\Gamma \frac{\partial\psi}{\partial n} = \int_\Gamma u_\tau.$$
(2) The result applies for 'any' values for $\Omega$ and its bounding curve, $\Gamma$; e.g., $\Omega$ need
not be the full domain.
The time derivative of (3.11-16) gives the rate of change of total vorticity,
$$\frac{d}{dt}\int_\Omega \omega = \int_\Gamma \frac{\partial u_\tau}{\partial t}, \tag{3.11-17}$$
at least for a fixed domain and boundary; cf. (3.11-15), whose equivalence we will soon
show.
One interesting application of (3.11-16) that could even be useful in computations is
the popular test problem called the lid-driven cavity (LDC). Here, $\int_\Gamma u_\tau = l u_0$, where $l$
is the width of the cavity and $u_0$ is the driven lid speed. In a typical non-dimensional
application, both $l$ and $u_0$ are unity. So, independent of both time and Reynolds number,
the total vorticity in the LDC is $\int_\Omega \omega = l u_0$. The simplicity of this result might be useful
in computations in the following way (even for a u-P code): if $|l u_0 - \int_\Omega \omega| > \epsilon$, where $\epsilon$
is 'user input,' presumably small, the mesh is too coarse or badly designed (or both).
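A minimal sketch of this check follows. It computes the boundary circulation for a unit cavity by the midpoint rule and compares a total-vorticity value against it. The function names are hypothetical, and the total vorticity passed to `mesh_is_suspect` would in practice come from integrating the discrete vorticity of one's own solution, which is not shown here. Note the sign: with a lid sliding in the +x direction and a counterclockwise traversal, the signed circulation is $-l u_0$; only the magnitude $l u_0$ is convention-independent.

```python
def boundary_circulation(velocity, x0, x1, y0, y1, n=400):
    """Midpoint-rule circulation of u around a rectangle, counterclockwise.

    velocity(x, y) returns (u, v) on the boundary; n is the panel count per side.
    """
    hx = (x1 - x0) / n
    hy = (y1 - y0) / n
    total = 0.0
    for i in range(n):
        xm = x0 + (i + 0.5) * hx
        ym = y0 + (i + 0.5) * hy
        total += velocity(xm, y0)[0] * hx    # bottom, traversed in +x
        total += velocity(x1, ym)[1] * hy    # right,  traversed in +y
        total -= velocity(xm, y1)[0] * hx    # top,    traversed in -x
        total -= velocity(x0, ym)[1] * hy    # left,   traversed in -y
    return total


def cavity_boundary(x, y, lid_speed=1.0):
    """Unit lid-driven cavity: the lid (y = 1) slides in +x; other walls are still."""
    return (lid_speed, 0.0) if y == 1.0 else (0.0, 0.0)


def mesh_is_suspect(total_vorticity, expected, eps):
    """Flag the mesh if the total-vorticity mismatch exceeds the tolerance eps."""
    return abs(expected - total_vorticity) > eps


gamma = boundary_circulation(cavity_boundary, 0.0, 1.0, 0.0, 1.0)
print(gamma)   # magnitude l*u0 = 1; negative for this orientation and lid direction
print(mesh_is_suspect(total_vorticity=-0.9, expected=gamma, eps=0.05))
```

Because the target value is independent of time and Reynolds number, the same comparison can be repeated at any step of a transient run.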
Next, it is a useful exercise to show that (3.11-17) and (3.11-15) are actually equivalent,
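since their origins are so different, one being kinematic and the other kinetic. We begin
by inserting the tangential momentum equation, $\partial u_\tau/\partial t + \mathbf{u}\cdot\nabla u_\tau + \partial P/\partial\tau = \mathrm{Re}^{-1}\,\nabla^2 u_\tau$,
into (3.11-17) to get
$$\frac{d}{dt}\int_\Omega \omega = \int_\Gamma \left(\mathrm{Re}^{-1}\,\nabla^2 u_\tau - \mathbf{u}\cdot\nabla u_\tau\right),$$
since $\int_\Gamma \partial P/\partial\tau = 0$.
Taking now each term on the RHS in turn,
$$\text{(i)} \quad \nabla^2 u_\tau = \frac{\partial^2 u_\tau}{\partial n^2} + \frac{\partial^2 u_\tau}{\partial\tau^2} = \frac{\partial^2 u_\tau}{\partial n^2} - \frac{\partial^2 u_n}{\partial n\,\partial\tau} = \frac{\partial}{\partial n}\left(\frac{\partial u_\tau}{\partial n} - \frac{\partial u_n}{\partial\tau}\right) = \frac{\partial\omega}{\partial n}$$
because $\nabla\cdot\mathbf{u} = \partial u_n/\partial n + \partial u_\tau/\partial\tau = 0$ and $\omega = \partial u_\tau/\partial n - \partial u_n/\partial\tau$, and
$$\text{(ii)} \quad \mathbf{u}\cdot\nabla u_\tau = u_n\frac{\partial u_\tau}{\partial n} + u_\tau\frac{\partial u_\tau}{\partial\tau} = u_n\left(\omega + \frac{\partial u_n}{\partial\tau}\right) + u_\tau\frac{\partial u_\tau}{\partial\tau} = \omega u_n + \frac{1}{2}\left(\frac{\partial u_n^2}{\partial\tau} + \frac{\partial u_\tau^2}{\partial\tau}\right) = \omega u_n + \frac{1}{2}\frac{\partial q^2}{\partial\tau};$$
but $\int_\Gamma \partial q^2/\partial\tau = 0$, and we obtain
$$\frac{d}{dt}\int_\Omega \omega = \int_\Gamma \left(\mathrm{Re}^{-1}\frac{\partial\omega}{\partial n} - \omega\,\mathbf{n}\cdot\mathbf{u}\right),$$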
as required. One additional piece of information, related to the above, that is sometimes
useful is this: for a stationary boundary with no penetration ($\mathbf{u}\cdot\mathbf{n} = 0$) and no slip ($u_\tau = 0$),
the latter giving $\int_\Omega \omega = 0$, results in $\mathrm{Re}^{-1}\int_\Gamma \partial\omega/\partial n = 0$; both the total vorticity in
$\Omega$ and the net viscous flux of vorticity into $\Omega$ through $\Gamma$ are zero for all $t > 0$, regardless
of how $\omega$ may vary in time and space. It is likely that at least sometimes the clever use
of such simple conservation laws could aid the simulator/modeler.
Another informative result is obtained by inserting $\nabla^2 u_\tau = \partial\omega/\partial n$ into the tangential
momentum equation applied on $\Gamma$ in the form
$$\mathrm{Re}^{-1}\frac{\partial\omega}{\partial n} = \frac{Du_\tau}{Dt} + \frac{\partial P}{\partial\tau}, \tag{3.11-18}$$
which permits the following interpretation of viscous flux of vorticity through a bounding
'wall': it is 'caused by' tangential acceleration and/or tangential pressure gradients. (For
some discussion and references related to 'cause and effect' of vorticity vis-a-vis velocity,
see Gresho, 1992. Does velocity cause vorticity via its curl or does vorticity cause/induce
velocity via the Biot-Savart law?)
Our final and very important remark regarding vorticity conservation is this: (3.11-15)
and (3.11-18) are generally not applicable at $t = 0$ because (3.5-7) is not, which in turn is
because the tangential momentum equation is not. They cannot cope with vortex sheets. But (3.11-16)
is applicable, once it is recognized that singular behavior in the integrand of the LHS is
a common occurrence when vortex sheets are present. For example, consider a stationary
fluid in a very long cylindrical container of radius $R$ that undergoes an impulsive rotational
start-up of its boundary; the motionless fluid suddenly contains an amount of vorticity
(per unit length), $\int_\Gamma u_\tau = 2\pi R u_\tau$, initially concentrated in a vortex sheet: $\int_\Omega \omega = 2\pi R u_\tau$
with $\omega = 0$ except at $r = R$, where it is unbounded. See also Section 3.8.3.
[Exercise for the reader: An impulsive rotation cannot be given to any but a circular cylinder. Why?]
To conclude the vorticity discussion, we return to the LDC (for $l u_0 = 1$) and consider
an impulsive start-up from rest and then an impulsive stop and subsequent spindown.
There are three 'phases' to consider, all using the simple result given by (3.11-16): (i) at
t = 0+, the total vorticity in the cavity is 1.0 and is all contained in the vortex sheet
under the lid; (ii) for t > 0, the same total vorticity is diffused and advected throughout
the cavity, approaching a constant-in-time spatial distribution if Re is not too large; and
(iii) upon reducing the lid speed from 1 to zero, the total vorticity instantly drops from
1.0 to zero, this time in a 'negative' vortex sheet at the lid that cancels the entire 'bulk'
vorticity. The spindown, at zero total vorticity, finally obtains a zero pointwise vorticity
as $t \to \infty$.
3.11.5 Enstrophy Conservation
The last fluid mechanical quantity whose global conservation is of some interest is the
enstrophy, a positive measure of rotation that is simply one-half the square of the vorticity,
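$e = \omega^2/2$ (see, for example, Leith, 1969; Lesieur, 1987; and Pedlosky, 1987). Starting
again with (3.5-7), we multiply by $\omega$ to obtain the local (2D) enstrophy equation
$$\frac{\partial e}{\partial t} + \nabla\cdot(e\mathbf{u} - \mathrm{Re}^{-1}\,\nabla e) = -\mathrm{Re}^{-1}\,\nabla\omega\cdot\nabla\omega, \tag{3.11-19}$$
where we have used $\omega\mathbf{u}\cdot\nabla\omega = \tfrac{1}{2}\nabla\cdot(\omega^2\mathbf{u})$ and $\omega\nabla^2\omega = \nabla\cdot\tfrac{1}{2}\nabla\omega^2 - \nabla\omega\cdot\nabla\omega$. Note that
$\nabla\omega\cdot\nabla\omega \ge 0$ so that the RHS is always $\le 0$ (enstrophy dissipation). Integration (etc.)
yields
$$\frac{d}{dt}\int_\Omega e + \int_\Gamma \left(e\,\mathbf{n}\cdot\mathbf{u} - \mathrm{Re}^{-1}\frac{\partial e}{\partial n}\right) = -\mathrm{Re}^{-1}\int_\Omega \nabla\omega\cdot\nabla\omega; \tag{3.11-20}$$
$e$ increases by net transport (advection plus diffusion) into $\Omega$ through $\Gamma$ and is always
decreased by viscosity.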
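The following rearrangement of one of the terms in (3.11-20) is interesting (for $t > 0$):
$$Re^{-1}\frac{\partial e}{\partial n} = Re^{-1}\,\omega\frac{\partial\omega}{\partial n} = \omega\,Re^{-1}\nabla^2 u_\tau = \omega\left(\frac{\partial u_\tau}{\partial t} + \mathbf{u}\cdot\nabla u_\tau + \frac{\partial P}{\partial\tau}\right)$$
to give
$$\frac{d}{dt}\int_\Omega e + \int_\Gamma\left[e\,\mathbf{n}\cdot\mathbf{u} - \omega\left(\frac{Du_\tau}{Dt} + \frac{\partial P}{\partial\tau}\right)\right] = -Re^{-1}\int_\Omega \nabla\omega\cdot\nabla\omega; \tag{3.11-21}$$
in addition to advective transport through $\Gamma$, it is seen that the same terms that 'cause'
vorticity flux, $Du_\tau/Dt + \partial P/\partial\tau$, are also the generators of enstrophy. In fact, at steady
state with no inflow or outflow,
$$\int_\Gamma \omega\frac{\partial}{\partial\tau}\left(P + \tfrac{1}{2}u^2\right) = Re^{-1}\int_\Omega \nabla\omega\cdot\nabla\omega \ge 0, \tag{3.11-22}$$
from which it follows that the 'production' term on the LHS is (at least for this case)
necessarily $\ge 0$, a seemingly non-intuitive result.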
For 3D enstrophy equations, see, for example, Batchelor (1967) and Wu (1995).
3.12 WEAK FORMS OF THE PDE'S/NATURAL
BOUNDARY CONDITIONS (NBC'S)
The review of (a portion of) classical incompressible fluid mechanics is over. We now push
on toward the formulations of the NS equations for which finite element methods are most
applicable. The combination of a vector-valued system and the div u constraint (and the
pressure) renders (not surprisingly) weak formulations of the NS equations considerably
more complicated and difficult, both to generate and to solve, than that of the simple scalar
transport equation of Chapter 2. Accordingly, we believe it profitable to start slowly and
work up, perhaps at the expense of some repetition, but with the hope that the various
formulations seen in the literature will then be much easier to follow and understand.
Part of the complexity is related to notation, which can range from overly cumbersome
to overly terse, with equivalence not always immediately obvious. Thus we will start
with the simplest (hopefully) method of derivation and limit it to 2D cartesian coordinate
systems so that we can more easily work separately with each component of the vector
equation. After this, we will re-do the same example while simultaneously generalizing
it to 3D via the more efficient index notation. Finally, the equivalent formulation will
be presented in the most compact notation of all: Gibbs' vector notation, complete with
vector-valued test functions.
Also, for tractability, we narrow the scope ab initio by presenting and discussing in any
detail only a subset of all the possibilities, emphasizing the more common and preferred
forms of the equations for the most part. We shall further emphasize the forms of the
equations that have been of most interest to us and about which we have real experience,
but will only summarize other forms—especially those more avant garde forms that were
developed, at least in part, because of the continuous search for better OBC's.
We emphasize via repetition at the outset: applied PDE's (i.e., those whose solution is
sought) come with BC's (see Strang, 1986), and this is a significant feature of the various
weak forms that can be developed; i.e., weak formulations of PDE's also come with
(incorporate) BC's. Our terminology—not generally employed by others, but useful in our
opinion and already employed in the previous chapter—is this: by weak formulation, we
mean the sequence of steps needed to go from the given PDE's + BC's to a specific weak
form of them. Before developing weak forms, it may be well to pause and reflect upon
the following quotation from Ladyzhenskaya (1969, p. 142)—also for weak/generalized
solutions:
"Before becoming involved with precise formulations, we call the readers' attention to
the fact that the statement 'it has been proved that the problem has a unique solution'
can have very different meanings depending on the function space in which one looks
for the solution. The form in which the requirements of the problem must be satisfied
is different for different spaces, and different extensions of the concept of a solution of
a problem, i.e. different 'generalized solutions', present themselves. In fact, for every
problem there are infinitely many 'generalized solutions', but they coincide with the
classical solution, if the latter exists."
3.12.1 The Conventional u-P Formulation and the
Stress-Divergence Form Combined
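To begin, we expand the NS equations, in the form $\partial\mathbf{u}/\partial t + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = \nu\nabla\cdot[\nabla\mathbf{u} + \gamma(\nabla\mathbf{u})^T]$
(with $\gamma = 0$ or 1), into 2D cartesians:
$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + \frac{\partial P}{\partial x} = \frac{Du}{Dt} + \frac{\partial P}{\partial x} = \nu\left[\nabla^2 u + \gamma\frac{\partial}{\partial x}\nabla\cdot\mathbf{u}\right], \tag{3.12-1}$$
$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} + \frac{\partial P}{\partial y} = \frac{Dv}{Dt} + \frac{\partial P}{\partial y} = \nu\left[\nabla^2 v + \gamma\frac{\partial}{\partial y}\nabla\cdot\mathbf{u}\right], \tag{3.12-2}$$
and, of course,
$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0. \tag{3.12-3}$$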
Remark:
For $\gamma = 1$, the $\nabla\cdot\mathbf{u}$ term is retained in the stress-divergence form of the momentum
equations, for 'generality' when we introduce/discover the NBC's. For $\gamma = 0$, we are
starting with the 'conventional' ($\nabla^2$) formulation. While equivalent in the continuum,
the discrete equations for $\gamma = 1$ differ from those for $\gamma = 0$, and so do the solutions
(although usually not by much).
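As a rough numerical illustration of the Remark above (a sketch only, not part of the original derivation), the following Python fragment evaluates the x-component of the viscous term as the divergence of the vector with components $(1+\gamma)\nu\,\partial u/\partial x$ and $\nu(\partial u/\partial y + \gamma\,\partial v/\partial x)$, using nested central differences, for $\gamma = 0$ and $\gamma = 1$. The viscosity value, the step size, the sample point, and both velocity fields are assumptions chosen purely for illustration. The gap between the two forms should vanish (to truncation error) for the solenoidal field and equal $\gamma\nu\,\partial(\nabla\cdot\mathbf{u})/\partial x$ otherwise.

```python
import math

NU = 0.1  # hypothetical viscosity, chosen only for illustration


def d(f, x, y, axis, h=1e-3):
    """Central first difference of f(x, y) along axis 0 (x) or axis 1 (y)."""
    if axis == 0:
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def viscous_x(u, v, x, y, gamma):
    """x-component of nu * div[grad u + gamma (grad u)^T], without the pressure."""
    tau1 = lambda a, b: (1 + gamma) * NU * d(u, a, b, 0)
    tau2 = lambda a, b: NU * (d(u, a, b, 1) + gamma * d(v, a, b, 0))
    return d(tau1, x, y, 0) + d(tau2, x, y, 1)


def gap(u, v, x, y):
    """Difference between the gamma = 1 and gamma = 0 viscous terms."""
    return viscous_x(u, v, x, y, 1) - viscous_x(u, v, x, y, 0)


# Assumed test fields: same u, paired with a solenoidal and a non-solenoidal v.
u = lambda x, y: math.sin(x) * math.cos(y)
v_sol = lambda x, y: -math.cos(x) * math.sin(y)   # div u = 0
v_non = lambda x, y: math.cos(x) * math.sin(y)    # div u = 2 cos x cos y

x0, y0 = 0.7, 0.3
gap_sol = gap(u, v_sol, x0, y0)
gap_non = gap(u, v_non, x0, y0)
# Analytic value of nu * d(div u)/dx for the non-solenoidal pair.
expected_non = NU * (-2 * math.sin(x0) * math.cos(y0))
print(gap_sol, gap_non, expected_non)
```

The same comparison on a discrete velocity field that is only weakly divergence-free would generally give a small but non-zero gap, which is the sense in which the two discrete formulations differ.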
Upon setting BC's and IC's, these equations can be transformed from the above strong
form to the more desirable (amenable) weak form. But this time we shall proceed slightly
differently, hopefully profitably; i.e., we shall permit the weak formulation to show us
the way to some legitimate BC's via the NBC's of the weak formulation. (But are they
useful? That's another issue; but it turns out that the answer is, usually, yes.) And we
shall use only the x-component of the NS equations to demonstrate the technique. Further
toward our goal, we first rewrite the x-component of the NS equation (3.12-1), in the
following equivalent form, a divergence form:
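$$\frac{Du}{Dt} = \nabla\cdot\boldsymbol{\tau}_x, \tag{3.12-4}$$
where
$$\boldsymbol{\tau}_x \equiv \begin{pmatrix} (1+\gamma)\nu\,\dfrac{\partial u}{\partial x} - P \\[2mm] \nu\left(\dfrac{\partial u}{\partial y} + \gamma\dfrac{\partial v}{\partial x}\right) \end{pmatrix}, \tag{3.12-5}$$
and we note [see (3.3-1)] for $\gamma = 1$ that the vector $\boldsymbol{\tau}_x = \mathbf{e}_x\cdot\boldsymbol{\sigma}$; i.e., it is the x-component
of the stress tensor. Now we multiply (3.12-4) by a test function, $\phi^{(x)}$ (an x-direction
test function), whose exact 'qualities' (function class) we defer defining until a more
appropriate time, and integrate by parts, etc.: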
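$$\int_\Omega \phi^{(x)}\frac{Du}{Dt} = \int_\Omega \phi^{(x)}\,\nabla\cdot\boldsymbol{\tau}_x = \int_\Omega \nabla\cdot\left(\phi^{(x)}\boldsymbol{\tau}_x\right) - \int_\Omega \nabla\phi^{(x)}\cdot\boldsymbol{\tau}_x = \int_\Gamma \phi^{(x)}\,\mathbf{n}\cdot\boldsymbol{\tau}_x - \int_\Omega \nabla\phi^{(x)}\cdot\boldsymbol{\tau}_x,$$
which we rearrange to
$$\int_\Omega\left(\phi^{(x)}\frac{Du}{Dt} + \nabla\phi^{(x)}\cdot\boldsymbol{\tau}_x\right) = \int_\Gamma \phi^{(x)}\,\mathbf{n}\cdot\boldsymbol{\tau}_x$$
and 'expand' to
$$\int_\Omega\left\{\phi^{(x)}\frac{Du}{Dt} + \frac{\partial\phi^{(x)}}{\partial x}\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + \nu\frac{\partial\phi^{(x)}}{\partial y}\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right)\right\} = \int_\Gamma \phi^{(x)}\left\{n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + n_y\nu\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right)\right\}. \tag{3.12-6}$$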
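A further rearrangement of the LHS (isolating $\gamma$) gives
$$\int_\Omega\left\{\phi^{(x)}\frac{Du}{Dt} + \nu\nabla\phi^{(x)}\cdot\nabla u + \gamma\nu\left(\frac{\partial\phi^{(x)}}{\partial x}\frac{\partial u}{\partial x} + \frac{\partial\phi^{(x)}}{\partial y}\frac{\partial v}{\partial x}\right) - P\frac{\partial\phi^{(x)}}{\partial x}\right\} = \int_\Gamma \phi^{(x)}\left\{n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + n_y\nu\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right)\right\}, \tag{3.12-7}$$
and one more (isolating $u$ and $v$) gives
$$\int_\Omega\left\{\phi^{(x)}\frac{Du}{Dt} + \nu\left[(1+\gamma)\frac{\partial\phi^{(x)}}{\partial x}\frac{\partial u}{\partial x} + \frac{\partial\phi^{(x)}}{\partial y}\frac{\partial u}{\partial y}\right] + \nu\gamma\frac{\partial\phi^{(x)}}{\partial y}\frac{\partial v}{\partial x} - P\frac{\partial\phi^{(x)}}{\partial x}\right\} = \int_\Gamma \phi^{(x)}\left\{n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + n_y\nu\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right)\right\}, \tag{3.12-8}$$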
which leads to the following
Remarks:
(1) If $\gamma = 0$, then the viscous term in (3.12-8) is clearly of the same form as the diffusive
term in the advection-diffusion equation; e.g., see (2.2-2).
(2) The RHS of the above equations shows the way to an appropriate NBC for this form
of the x-momentum equation; namely,
$$\mathbf{n}\cdot\boldsymbol{\tau}_x = n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + n_y\nu\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right) = F_x, \tag{3.12-9}$$
where $F_x$, an applied force (per unit area) in the x-direction, is prescribed input
data.
(The RHS above then becomes, simply, $\int_\Gamma \phi^{(x)}F_x$.) When $\gamma = 1$, $F_x$ is the true x-component
of the applied traction vector; when $\gamma = 0$, we call $F_x$ a pseudo-traction
as it is not then a physical force. But for either value of $\gamma$, (3.12-9) is a legitimate
(and natural) BC, and a solution can be found once $F_x$ is specified. $F_x = 0$ is another
example of a 'do-nothing' BC; no action is required on the part of the code user.
Indeed, no action by the code writer is required if $F_x = 0$ is always the desired BC
on $\Gamma$.
(3) Usually (3.12-9) is a BC that applies only over a portion of $\Gamma$, an issue we will
soon return to as we continue the development of the weak form.
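The integration by parts behind (3.12-6) can be checked numerically. The sketch below, in Python, takes the unit square as $\Omega$ and uses polynomial fields $u$, $v$, $P$ and a polynomial test function (all hypothetical choices, as are the values of $\nu$ and $\gamma$). It drops the $Du/Dt$ term, which the integration by parts does not touch, and compares $\int_\Omega(\phi\,\nabla\cdot\boldsymbol{\tau}_x + \nabla\phi\cdot\boldsymbol{\tau}_x)$ with $\int_\Gamma \phi\,\mathbf{n}\cdot\boldsymbol{\tau}_x$ using a 3-point Gauss rule, which is exact for these polynomials.

```python
import math

# 3-point Gauss-Legendre rule mapped to [0, 1]; exact for polynomials up to degree 5.
_G = math.sqrt(0.6)
PTS = [0.5 * (1 - _G), 0.5, 0.5 * (1 + _G)]
WTS = [5 / 18, 8 / 18, 5 / 18]

NU, GAMMA = 0.1, 1  # hypothetical viscosity and form selector (gamma = 0 or 1)

# Assumed polynomial data: u = x^2 y, v = x y^2, P = x + y^2, phi = 1 + x y.
phi = lambda x, y: 1 + x * y
grad_phi = lambda x, y: (y, x)


def tau_x(x, y):
    """Components of tau_x from (3.12-5), with hand-coded derivatives."""
    du_dx, du_dy, dv_dx = 2 * x * y, x * x, y * y
    pressure = x + y * y
    return ((1 + GAMMA) * NU * du_dx - pressure, NU * (du_dy + GAMMA * dv_dx))


def div_tau_x(x, y):
    """Divergence of tau_x: d/dx of the first component plus d/dy of the second."""
    return (1 + GAMMA) * NU * 2 * y - 1 + NU * GAMMA * 2 * y


def area(f):
    """Integral of f over the unit square."""
    return sum(wx * wy * f(x, y) for x, wx in zip(PTS, WTS) for y, wy in zip(PTS, WTS))


def edge(f):
    """Integral of a one-variable function over [0, 1]."""
    return sum(w * f(s) for s, w in zip(PTS, WTS))


# Domain side: integral of div(phi * tau_x) = phi * div(tau_x) + grad(phi) . tau_x.
domain_side = area(
    lambda x, y: phi(x, y) * div_tau_x(x, y)
    + grad_phi(x, y)[0] * tau_x(x, y)[0]
    + grad_phi(x, y)[1] * tau_x(x, y)[1]
)

# Boundary side: phi * n . tau_x on the four edges (outward normals +-e_x, +-e_y).
boundary_side = (
    edge(lambda y: phi(1, y) * tau_x(1, y)[0])
    - edge(lambda y: phi(0, y) * tau_x(0, y)[0])
    + edge(lambda x: phi(x, 1) * tau_x(x, 1)[1])
    - edge(lambda x: phi(x, 0) * tau_x(x, 0)[1])
)
print(domain_side, boundary_side)
```

If the test function were instead chosen to vanish on part of the boundary, the corresponding edge terms would drop out, which is how the Dirichlet portions are later excluded from the boundary integral.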
Digression: before completing the weak formulation, we show how a simple
rearrangement of the div u term in (3.12-1) can lead to a different weak form. Rewrite
(3.12-1) as
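$$\frac{Du}{Dt} = \nu\nabla^2 u + \frac{\partial}{\partial x}\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right),$$
multiply by $\phi^{(x)}$, and integrate by parts as follows:
$$\int_\Omega \phi^{(x)}\frac{Du}{Dt} = \int_\Omega \nu\phi^{(x)}\nabla^2 u + \int_\Omega \phi^{(x)}\frac{\partial}{\partial x}\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right)$$
$$= \int_\Omega \nu\left[\nabla\cdot\left(\phi^{(x)}\nabla u\right) - \nabla\phi^{(x)}\cdot\nabla u\right] + \int_\Omega\left\{\frac{\partial}{\partial x}\left[\phi^{(x)}\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right)\right] - \frac{\partial\phi^{(x)}}{\partial x}\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right)\right\}$$
$$= \int_\Gamma \nu\phi^{(x)}\frac{\partial u}{\partial n} - \int_\Omega \nu\nabla\phi^{(x)}\cdot\nabla u + \int_\Gamma \phi^{(x)}n_x\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right) - \int_\Omega \frac{\partial\phi^{(x)}}{\partial x}\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right),$$
which can be rearranged to
$$\int_\Omega\left\{\phi^{(x)}\frac{Du}{Dt} + \nu\nabla\phi^{(x)}\cdot\nabla u + \gamma\nu\frac{\partial\phi^{(x)}}{\partial x}\nabla\cdot\mathbf{u} - P\frac{\partial\phi^{(x)}}{\partial x}\right\} = \int_\Gamma \phi^{(x)}\left[\nu\frac{\partial u}{\partial n} + n_x\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right)\right],$$
which can be further rearranged to the final weak form,
$$\int_\Omega\left\{\phi^{(x)}\frac{Du}{Dt} + \nu\left[(1+\gamma)\frac{\partial\phi^{(x)}}{\partial x}\frac{\partial u}{\partial x} + \frac{\partial\phi^{(x)}}{\partial y}\frac{\partial u}{\partial y} + \gamma\frac{\partial\phi^{(x)}}{\partial x}\frac{\partial v}{\partial y}\right] - P\frac{\partial\phi^{(x)}}{\partial x}\right\} = \int_\Gamma \phi^{(x)}\left\{n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P + \gamma\nu\frac{\partial v}{\partial y}\right] + n_y\nu\frac{\partial u}{\partial y}\right\}, \tag{3.12-10}$$
and we discover another NBC; namely,
$$n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P + \gamma\nu\frac{\partial v}{\partial y}\right] + n_y\nu\frac{\partial u}{\partial y} = F_x. \tag{3.12-11}$$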
Remarks:
(1) If $\gamma = 0$ or $\nabla\cdot\mathbf{u} = 0$, then this formulation is identical to (3.12-8) and (3.12-9) with
$\gamma = 0$.
(2) When $\gamma = 1$ and $\nabla\cdot\mathbf{u} \ne 0$ (which is the case for most FEM's; $\nabla\cdot\mathbf{u}$ is only weakly
zero, not pointwise), (3.12-11) might be more useful as an OBC than (3.12-9); it
cannot be used when true traction BC's are required.
(3) This formulation has not been tested in the CFD lab, to our knowledge.
End Digression: We now return to the more conventional weak formulation,
a la (3.12-8), and augment it with the equivalent result from the y-momentum
equation (3.12-2):
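$$\int_\Omega\left\{\phi^{(y)}\frac{Dv}{Dt} + \nu\left[\frac{\partial\phi^{(y)}}{\partial x}\frac{\partial v}{\partial x} + (1+\gamma)\frac{\partial\phi^{(y)}}{\partial y}\frac{\partial v}{\partial y} + \gamma\frac{\partial\phi^{(y)}}{\partial x}\frac{\partial u}{\partial y}\right] - P\frac{\partial\phi^{(y)}}{\partial y}\right\} = \int_\Gamma \phi^{(y)}\left\{n_x\nu\left(\frac{\partial v}{\partial x} + \gamma\frac{\partial u}{\partial y}\right) + n_y\left[(1+\gamma)\nu\frac{\partial v}{\partial y} - P\right]\right\}, \tag{3.12-12}$$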
where $\phi^{(y)}$ is the generic test function for the y-equation. Boundary conditions can now
be addressed, which will also permit the completion of the definition of the test functions,
$\phi^{(x)}$ and $\phi^{(y)}$, and the completion of the weak formulation, yielding a particular weak
form. The most general BC's that are appropriate to this weak form are:
For $u$:
$$u = U \quad\text{on } \Gamma_D^u, \tag{3.12-13}$$
$$n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + n_y\nu\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right) = F_x \quad\text{on } \Gamma_N^u. \tag{3.12-14}$$
For $v$:
$$v = V \quad\text{on } \Gamma_D^v, \tag{3.12-15}$$
$$n_x\nu\left(\frac{\partial v}{\partial x} + \gamma\frac{\partial u}{\partial y}\right) + n_y\left[(1+\gamma)\nu\frac{\partial v}{\partial y} - P\right] = F_y \quad\text{on } \Gamma_N^v, \tag{3.12-16}$$
where $\Gamma_D^u + \Gamma_N^u = \Gamma = \Gamma_D^v + \Gamma_N^v$ and $F_x$, $F_y$ are specified, as are $U$ and $V$. The weak
formulation of the momentum equations is completed by restricting the class of test
functions to vanish on the Dirichlet portions of the boundary, just as we did for the scalar
equation in Chapter 2. Thus we restrict $\phi^{(x)}$ to vanish on $\Gamma_D^u$ and $\phi^{(y)}$ to vanish on $\Gamma_D^v$,
so that the boundary integrals are effectively restricted to $\Gamma_N^u$ for (3.12-8) and to $\Gamma_N^v$ for
(3.12-12).
We now shift the focus to the continuity equation (3.12-3), and generate its weak form
very simply via
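$$\int_\Omega \psi\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) = 0, \tag{3.12-17}$$
where $\psi$ is a test function that is related to the pressure. The (nearly complete) weak form
of the NS equations under the BC's of (3.12-13) through (3.12-16) can now be stated:
Find $u \in H^1_{uE}$, $v \in H^1_{vE}$ and $P \in L^2$ such that
$$\int_\Omega\left\{\phi^{(x)}\frac{Du}{Dt} + \nu\left[(1+\gamma)\frac{\partial\phi^{(x)}}{\partial x}\frac{\partial u}{\partial x} + \frac{\partial\phi^{(x)}}{\partial y}\frac{\partial u}{\partial y} + \gamma\frac{\partial\phi^{(x)}}{\partial y}\frac{\partial v}{\partial x}\right] - P\frac{\partial\phi^{(x)}}{\partial x}\right\} = \int_{\Gamma_N^u}\phi^{(x)}F_x, \quad \forall\phi^{(x)} \in H^1_{u0}, \tag{3.12-18}$$
$$\int_\Omega\left\{\phi^{(y)}\frac{Dv}{Dt} + \nu\left[\frac{\partial\phi^{(y)}}{\partial x}\frac{\partial v}{\partial x} + (1+\gamma)\frac{\partial\phi^{(y)}}{\partial y}\frac{\partial v}{\partial y} + \gamma\frac{\partial\phi^{(y)}}{\partial x}\frac{\partial u}{\partial y}\right] - P\frac{\partial\phi^{(y)}}{\partial y}\right\} = \int_{\Gamma_N^v}\phi^{(y)}F_y, \quad \forall\phi^{(y)} \in H^1_{v0}, \tag{3.12-19}$$
and (3.12-17) $\forall\psi \in L^2$, where $H^1_{uE}$ is that set of piecewise, once-differentiable functions
in $\Omega$ that take on the value $U$ on $\Gamma_D^u$, $H^1_{vE}$ is that similar set that takes on the value $V$
on $\Gamma_D^v$, $H^1_{u0}$ and $H^1_{v0}$ are their homogeneous counterparts ($\phi^{(x)} = 0$ on $\Gamma_D^u$ and $\phi^{(y)} = 0$
on $\Gamma_D^v$), and $L^2$ is the set of square integrable functions on $\Omega$.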
Remarks:
(1) The resulting velocity field is called weakly solenoidal; or, divergence-free almost
everywhere. [$\nabla\cdot\mathbf{u}$ could take the value seven, or any other finite constant, at one
or at a countable infinity of points in $\Omega$, and still satisfy (3.12-17).]
(2) The distinction between $\phi^{(x)}$ and $\phi^{(y)}$ is required because of the need to satisfy
different BC's for $u$ and $v$ in general. Away from $\Gamma$, one may conveniently envision
them as the same functions (as is indeed the case when we introduce the FEM).
(3) The solution space and the trial space (of test functions) for the pressure are identical
because there are no explicit BC's on $P$.
(4) Since there are no spatial derivatives of $P$ in the final weak form, the pressure (and
the test functions, $\{\psi\}$) is not even required to be continuous (thus $P \in L^2 \Rightarrow \nabla P$
need not even exist in the classical sense, as, of course, is indeed also the case for
$\nabla^2 u$).
(5) The incorporation of NBC's into weak forms should now be obvious, as should the
utility of weak formulations.
(6) If $P$ is replaced by $-\lambda\nabla\cdot\mathbf{u}$ and (3.12-17) dropped, then we have the weak form of
the penalized momentum equation; see (3.5-10).
(7) If $\Gamma_N^u$ and $\Gamma_N^v$ are zero (i.e., empty), then a solvability constraint enters (and must be satisfied for
a solution to exist): $\int_\Gamma (n_x U + n_y V) = 0$. In this case the pressure is determined only
up to an arbitrary additive constant.
The problem specification is not actually complete until we specify the IC's, which in
this case are
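$$\mathbf{u}(\mathbf{x}, 0) = \mathbf{u}_0(\mathbf{x}), \tag{3.12-20}$$
where $\mathbf{u}_0(\mathbf{x})$ is subject to some significant constraints:
$$\text{(i)}\quad \nabla\cdot\mathbf{u}_0 = 0 \quad\text{in } \Omega, \tag{3.12-21}$$
$$\text{(ii)}\quad \mathbf{n}\cdot\mathbf{u}_0 = n_x U_0 + n_y V_0 \quad\text{on } \Gamma_D^u \cap \Gamma_D^v, \tag{3.12-22}$$
$$\text{(iii)}\quad \mathbf{n}\cdot\mathbf{u}_0 = n_x U_0 \quad\text{on } \Gamma_D^u \cap \Gamma_N^v, \tag{3.12-23}$$
and
$$\text{(iv)}\quad \mathbf{n}\cdot\mathbf{u}_0 = n_y V_0 \quad\text{on } \Gamma_N^u \cap \Gamma_D^v, \tag{3.12-24}$$
corresponding to (3.9-1), (3.9-2), where $U_0$, $V_0$ are the initial values of $U$ and $V$ of
(3.12-13) and (3.12-15). Note that, as in the classical formulation of Section 3.9, if any of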
(3.12-21) through (3.12-24) are violated, the problem is ill-posed, and no solution exists.
We also remark that applications of this formulation are probably usually limited to cases
wherein nx and ny are either 0 or 1; i.e., the boundaries are aligned with the coordinate
system. More general cases, to be discussed later, will usually have BC's applied in
normal and tangential directions—thus necessitating transformations of the equations via
(local) rotations. Finally, we leave as an exercise the demonstration of reversibility—i.e.,
if u and P are sufficiently smooth, then the solution of (3.12-17) through (3.12-24) is also
a classical solution.
An alternative and shorter derivation will now be presented, which also generalizes (to
3D) the above. It is based on the cartesian index notation, complete with the summation
convention. The NS equations are first rewritten as
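$$\frac{Du_\alpha}{Dt} + \frac{\partial P}{\partial x_\alpha} = \nu\left(\nabla^2 u_\alpha + \gamma\frac{\partial^2 u_\beta}{\partial x_\beta\,\partial x_\alpha}\right), \quad \alpha = 1, 2, \ldots, n_s, \tag{3.12-25}$$
and
$$\frac{\partial u_\alpha}{\partial x_\alpha} = 0, \tag{3.12-26}$$
where $Du_\alpha/Dt = \partial u_\alpha/\partial t + u_\beta(\partial u_\alpha/\partial x_\beta)$, $\nabla^2 u_\alpha = \partial^2 u_\alpha/\partial x_\beta\,\partial x_\beta$, and $n_s$ is the number of
spatial dimensions (two or three). [We use Greek indices to denote spatial vectors (and
directions) because this will help us later when we introduce nodes and finite element
basis functions.]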
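Next, multiply (3.12-25) by the generic $\alpha$-direction test function $\phi^{(\alpha)}$ and integrate by
parts:
$$\int_\Omega\left\{\phi^{(\alpha)}\frac{Du_\alpha}{Dt} + \nu\left(\nabla\phi^{(\alpha)}\cdot\nabla u_\alpha + \gamma\frac{\partial\phi^{(\alpha)}}{\partial x_\beta}\frac{\partial u_\beta}{\partial x_\alpha}\right) - P\frac{\partial\phi^{(\alpha)}}{\partial x_\alpha}\right\} = \int_\Gamma \phi^{(\alpha)}\left[\nu\left(\frac{\partial u_\alpha}{\partial n} + \gamma n_\beta\frac{\partial u_\beta}{\partial x_\alpha}\right) - Pn_\alpha\right], \quad \alpha = 1, 2, \ldots, n_s, \tag{3.12-27}$$
where $\partial u_\alpha/\partial n = n_\beta(\partial u_\alpha/\partial x_\beta)$, and
the parenthetical superscript $(\alpha)$ on $\phi$ denotes the important restriction: 'no sum
on $\alpha$', both here and hereafter.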
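To complete the formulation we simply restrict the test functions as follows: $\phi^{(\alpha)} = 0$
on $\Gamma_D^\alpha$, where $\Gamma_D^\alpha$ denotes that part of $\Gamma$ on which $u_\alpha$ is specified; i.e., corresponding to
(3.12-13) and (3.12-15), we now also have the BC's stated more compactly as
$$u_\alpha = U_\alpha \ \text{on } \Gamma_D^\alpha \quad\text{and}\quad \nu\left(\frac{\partial u_\alpha}{\partial n} + \gamma n_\beta\frac{\partial u_\beta}{\partial x_\alpha}\right) - Pn_\alpha = F_\alpha \ \text{on } \Gamma_N^\alpha, \tag{3.12-28}$$
where $\Gamma_D^\alpha + \Gamma_N^\alpha = \Gamma$; $\alpha = 1, 2, \ldots, n_s$. The statement of the weak form is now: find
$u_\alpha \in H^1_{\alpha E}$ and $P \in L^2$ such that
$$\int_\Omega\left\{\phi^{(\alpha)}\frac{Du_\alpha}{Dt} + \nu\left(\nabla\phi^{(\alpha)}\cdot\nabla u_\alpha + \gamma\frac{\partial\phi^{(\alpha)}}{\partial x_\beta}\frac{\partial u_\beta}{\partial x_\alpha}\right) - P\frac{\partial\phi^{(\alpha)}}{\partial x_\alpha}\right\} = \int_{\Gamma_N^\alpha}\phi^{(\alpha)}F_\alpha, \quad \alpha = 1, 2, \ldots, n_s, \tag{3.12-29}$$
and
$$\int_\Omega \psi\frac{\partial u_\beta}{\partial x_\beta} = 0 \tag{3.12-30}$$
$\forall\phi^{(\alpha)} \in H^1_{\alpha 0}$ and $\forall\psi \in L^2$, subject to appropriate IC's, which we defer, temporarily
[until Section 3.13.1, but they are the same as those in (3.12-20) through (3.12-24)].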
Remarks:
(1) Although the notation is different, it should be clear that (3.12-17) through (3.12-19)
and (3.12-28) through (3.12-30) are describing the very same problem.
(2) This compact notation will prove useful when we push on to find approximate
solutions to the weak form via Galerkin's method.
(3) Since $P$ will eventually become a linear combination of the $\psi$'s, (3.12-30) implies
$\int_\Omega P\,\nabla\cdot\mathbf{u} = 0$, which in turn implies $\int_\Omega \mathbf{u}\cdot\nabla P = \int_\Gamma P\,\mathbf{n}\cdot\mathbf{u}$; thus, if the flow is parallel
to the boundary ($\mathbf{n}\cdot\mathbf{u} = 0$), the velocity is orthogonal to the pressure gradient, an
observation that is also true for the classical solution, and may even be
counterintuitive since it tells us that, on average, the flow is anything but 'down (parallel
to) the pressure gradient.' [Indeed, if $\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma$, then $\mathbf{u}$ is orthogonal to the
gradient of any scalar function in $L^2$, a fact that is more mathematical than physical,
and corresponds to an orthogonal decomposition (Helmholtz/Weyl) of $L^2$ into the
space of divergence-free vector functions and that of gradients of scalars; the pressure
is, in this sense, not so special.]
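The orthogonality stated in Remark (3) can be illustrated with a small quadrature experiment. The sketch below assumes the unit square as $\Omega$, a velocity derived from the hypothetical stream function $\sin^2(\pi x)\sin^2(\pi y)$ (so that $\nabla\cdot\mathbf{u} = 0$ in $\Omega$ and $\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma$), and an arbitrary smooth scalar $P$; none of these choices comes from the text. A composite 3-point Gauss rule then estimates $\int_\Omega \mathbf{u}\cdot\nabla P$, which should be zero to quadrature accuracy even though the velocity itself is far from zero.

```python
import math

# 3-point Gauss-Legendre rule on [0, 1], applied cell by cell (composite rule).
_G = math.sqrt(0.6)
PTS = [0.5 * (1 - _G), 0.5, 0.5 * (1 + _G)]
WTS = [5 / 18, 8 / 18, 5 / 18]


def integrate_unit_square(f, n=20):
    """Composite Gauss quadrature of f(x, y) over the unit square with n x n cells."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for px, wx in zip(PTS, WTS):
                for py, wy in zip(PTS, WTS):
                    total += wx * wy * f((i + px) * h, (j + py) * h)
    return total * h * h


def velocity(x, y):
    """u = ds/dy, v = -ds/dx for the assumed stream function s = sin^2(pi x) sin^2(pi y)."""
    u = math.pi * math.sin(math.pi * x) ** 2 * math.sin(2 * math.pi * y)
    v = -math.pi * math.sin(2 * math.pi * x) * math.sin(math.pi * y) ** 2
    return u, v


def grad_p(x, y):
    """Gradient of the assumed scalar P = exp(x) cos(3 y) + x y."""
    return (math.exp(x) * math.cos(3 * y) + y, -3 * math.exp(x) * math.sin(3 * y) + x)


def u_dot_grad_p(x, y):
    u, v = velocity(x, y)
    px, py = grad_p(x, y)
    return u * px + v * py


inner = integrate_unit_square(u_dot_grad_p)  # should be ~0
speed_sq = integrate_unit_square(lambda x, y: sum(c * c for c in velocity(x, y)))
print(inner, speed_sq)
```

Replacing the scalar field by any other smooth function should leave the first number at zero, which is the sense in which the pressure is 'not so special' here.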
There is a third equivalent way to state the same problem; it is even more
condensed/succinct than that just presented. It retains vector/tensor/dyadic notation
throughout, starting with
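$$\frac{D\mathbf{u}}{Dt} + \nabla P = \nabla\cdot\mathbf{d} \tag{3.12-31}$$
$$\nabla\cdot\mathbf{u} = 0, \tag{3.12-32}$$
where
$$\mathbf{d} = \nu\left[\nabla\mathbf{u} + \gamma(\nabla\mathbf{u})^T\right] \tag{3.12-33}$$
is the (symmetric when $\gamma = 1$) viscous stress tensor (or pseudo-stress tensor if $\gamma = 0$),
and we leave (for the moment, anyway) cartesian coordinates behind; i.e., these are
the coordinate-free NS equations. The next step is to form the scalar product of the
momentum equation with a generic vector test function, $\mathbf{v}$, whose 'properties' will be
defined/discovered along the way, and (of course) perform an integration by parts:
$$\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \mathbf{v}\cdot\nabla P = \mathbf{v}\cdot(\nabla\cdot\mathbf{d})$$
or, equivalently,
$$\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nabla\cdot(P\mathbf{v}) - P\,\nabla\cdot\mathbf{v} = \nabla\cdot(\mathbf{d}\cdot\mathbf{v}) - \mathbf{d}^T : \nabla\mathbf{v}$$
to give
$$\int_\Omega\left\{\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\left[(\nabla\mathbf{u})^T + \gamma\nabla\mathbf{u}\right] : \nabla\mathbf{v} - P\,\nabla\cdot\mathbf{v}\right\} = \int_\Gamma\left(\mathbf{d}^T\cdot\mathbf{n} - P\mathbf{n}\right)\cdot\mathbf{v}, \tag{3.12-34}$$
an equation that (for $\gamma = 0$) at least 'resembles' (3.12-27), an observation that uses (for
$\gamma = 0$) $\mathbf{d}^T\cdot\mathbf{n} = \nu(\nabla\mathbf{u})^T\cdot\mathbf{n} = \nu\,\mathbf{n}\cdot\nabla\mathbf{u} = \nu\,\partial\mathbf{u}/\partial n$.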
But (3.12-34) is just a single scalar equation (for each v), and (3.12-27) is clearly a
vector equation—so something seems to be amiss. To complete the connection, we must
realize that the (vector) test function must ultimately range over all possibilities, some of
which will contain non-zero entries only in one of its components, others of which will
have non-zeros only in another of its components, etc. And there is an infinite number
of each. Thus, (3.12-34) is simply a short-cut notation that actually implies—and should
be interpreted as—a set of vector equations with ns components. In fact, the easiest
way to obtain (3.12-27) from (3.12-34) is to specialize the set of vector test functions;
i.e., set $\mathbf{v} = \mathbf{e}_\alpha\phi^{(\alpha)}$, recalling that there is no summation because of ( ), use $\mathbf{u} = \mathbf{e}_\beta u_\beta$,
$\nabla = \mathbf{e}_\beta\,\partial/\partial x_\beta$, etc., and let $\alpha$ range over 1 to $n_s$, an exercise we leave to the reader.
Remarks:
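(1) A nice feature about this formulation for $\gamma = 1$ is the ease with which the kinetic
energy equation can be obtained: just set $\mathbf{v} = \mathbf{u}$ in (3.12-34), utilize $\int_\Omega P\,\nabla\cdot\mathbf{u} = 0$,
and set $\mathbf{F} = \mathbf{d}^T\cdot\mathbf{n} - P\mathbf{n}$ to get
$$\int_\Omega \mathbf{u}\cdot\frac{D\mathbf{u}}{Dt} = \int_\Gamma \mathbf{F}\cdot\mathbf{u} - \nu\int_\Omega\left[\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right] : \nabla\mathbf{u},$$
which is easily reduced to (3.11-10) by invoking $\nabla\cdot\mathbf{u} = 0$.
(2) The following identities, with mix of notation, are interesting:
$$(\nabla\mathbf{u})^T : \nabla\mathbf{v} = \nabla\mathbf{u} : (\nabla\mathbf{v})^T = \nabla u_i\cdot\nabla v_i,$$
a la Table 3.1-2.
(3) A more compact yet statement of (3.12-34) for $\gamma = 1$ is
$$\int_\Omega\left(\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \boldsymbol{\sigma} : \nabla\mathbf{v}\right) = \int_\Gamma \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{v},$$
where $\boldsymbol{\sigma} = \nu[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] - P\mathbf{I}$ is the total stress tensor, for which the classical
form is $D\mathbf{u}/Dt = \nabla\cdot\boldsymbol{\sigma}$, a la (3.3-1).
(4) Related to Remark (3) following (3.12-30), it is important to emphasize that even
though $\int_\Omega \psi\,\nabla\cdot\mathbf{u} = 0$ and thus $\int_\Omega P\,\nabla\cdot\mathbf{u} = 0$, it is not true that $\int_\Omega P\,\nabla\cdot\mathbf{v} = 0$; this
is because $\int_\Omega \psi\,\nabla\cdot\mathbf{v} \ne 0$: our test functions are not even weakly divergence-free.
Indeed, this is one of the major advantages of the mixed formulation, since useful
divergence-free test and trial functions are hard to find (but not impossible; see
below, Section 3.13.7).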
3.12.2 Other u-P Formulations
Having covered one formulation in detail, we now cover others in more abbreviated form
to conserve space; the missing details should not be hard to fill in. Thus,
a. Full divergence form
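Starting from (3.4-2),
$$\frac{\partial u_\alpha}{\partial t} + \frac{\partial}{\partial x_\beta}\left(u_\alpha u_\beta - \sigma_{\alpha\beta}\right) = 0, \quad \alpha = 1, 2, \ldots, n_s, \tag{3.12-35}$$
where
$$\sigma_{\alpha\beta} = -P\delta_{\alpha\beta} + \nu\left(\frac{\partial u_\alpha}{\partial x_\beta} + \gamma\frac{\partial u_\beta}{\partial x_\alpha}\right),$$
the weak form is generated via
$$\int_\Omega \phi^{(\alpha)}\frac{\partial u_\alpha}{\partial t} + \int_\Omega \frac{\partial}{\partial x_\beta}\left[\phi^{(\alpha)}\left(u_\alpha u_\beta - \sigma_{\alpha\beta}\right)\right] - \int_\Omega \frac{\partial\phi^{(\alpha)}}{\partial x_\beta}\left(u_\alpha u_\beta - \sigma_{\alpha\beta}\right) = 0,$$
which leads directly to
$$\int_\Omega\left\{\phi^{(\alpha)}\frac{\partial u_\alpha}{\partial t} + \frac{\partial\phi^{(\alpha)}}{\partial x_\beta}\left[\nu\left(\frac{\partial u_\alpha}{\partial x_\beta} + \gamma\frac{\partial u_\beta}{\partial x_\alpha}\right) - P\delta_{\alpha\beta} - u_\alpha u_\beta\right]\right\} = \int_\Gamma \phi^{(\alpha)}n_\beta\left[\nu\left(\frac{\partial u_\alpha}{\partial x_\beta} + \gamma\frac{\partial u_\beta}{\partial x_\alpha}\right) - P\delta_{\alpha\beta} - u_\alpha u_\beta\right], \quad \alpha = 1, 2, \ldots, n_s. \tag{3.12-36}$$
Except for the NBC, this form is not much different; the NBC is
$$\nu\left(\frac{\partial u_\alpha}{\partial n} + \gamma n_\beta\frac{\partial u_\beta}{\partial x_\alpha}\right) - \left(Pn_\alpha + u_n u_\alpha\right) = F_\alpha, \tag{3.12-37}$$
where $u_n = \mathbf{n}\cdot\mathbf{u}$. Here $-F_\alpha$ represents the total momentum flux (at least when $\gamma = 1$);
see (3.8-6) and (3.8-7). It would seem rare that one would actually know the total
momentum flux at an inlet, but if one did, then this is the proper weak form. But it is
at the outlet that this NBC could actually cause trouble, probably even more so than it
would for the advection-diffusion equation; cf. (2.4-22) of Chapter 2. The momentum
flux must get out, but its value ($F_\alpha$) is generally not known, yet must be specified (zero
or any constant probably would not work). Advice: do not use this formulation for an
OBC, unless $F_\alpha$ is known, as in some of the Scriven references cited in Section 3.8.1;
but if you do, and $F_\alpha$ is not known, set $-F_\alpha$ to the average inlet advective momentum
flux, $\frac{1}{H}\int_0^H u_n u_\alpha\,dl$ at the inlet, and cross your fingers.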
b. Skew-symmetric form
This form, (3.4-5), will only be interesting when we later return to the subject of
global energy conservation (Section 3.13.8) because it, not surprisingly, leads to a skew-
symmetric advection matrix. Here it is sufficient to state that it is the same as (3.12-29)
with $\gamma = 0$ and $Du_\alpha/Dt$ replaced by $Du_\alpha/Dt + \tfrac{1}{2}u_\alpha\,\partial u_\beta/\partial x_\beta$.
c. Rotational form and other curl forms
These forms are worth developing in more detail because div is 'replaced by' curl, and
the necessary integration by parts formulas are different—and additional vector identities
are needed. It is also more convenient to work in vector notation; thus, (3.4-4) becomes,
for our purposes here,
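$$\frac{\partial\mathbf{u}}{\partial t} + \boldsymbol{\omega}\times\mathbf{u} + \nabla P_T = -\nu\nabla\times\nabla\times\mathbf{u}, \tag{3.12-38}$$
and it is the curl-curl term that needs work in order to be cast into the appropriate
weak form.
Toward this end, then, let us temporarily digress and seek the weak form of
$$\nabla\times\nabla\times\mathbf{u} = \mathbf{f}, \tag{3.12-39}$$
where $\nabla\times\mathbf{u} = \boldsymbol{\omega}$. This is best done using vector test functions; so we start with
$\int_\Omega \mathbf{v}\cdot\nabla\times\nabla\times\mathbf{u} = \int_\Omega \mathbf{v}\cdot\mathbf{f}$. Next, recall the vector identity, $\nabla\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot(\nabla\times\mathbf{A}) - \mathbf{A}\cdot(\nabla\times\mathbf{B})$
to give, with $\mathbf{A} = \nabla\times\mathbf{u}$ and $\mathbf{B} = \mathbf{v}$,
$$\mathbf{v}\cdot\nabla\times\nabla\times\mathbf{u} = \nabla\cdot\left[(\nabla\times\mathbf{u})\times\mathbf{v}\right] + \nabla\times\mathbf{u}\cdot\nabla\times\mathbf{v} = -\nabla\cdot(\mathbf{v}\times\nabla\times\mathbf{u}) + \nabla\times\mathbf{u}\cdot\nabla\times\mathbf{v},$$
so that $\int_\Omega \nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} = \int_\Gamma \mathbf{n}\cdot(\mathbf{v}\times\boldsymbol{\omega}) + \int_\Omega \mathbf{v}\cdot\mathbf{f}$ via the divergence theorem. Now we
work with the boundary term; first, we decompose $\boldsymbol{\omega}$ into its normal and tangential
components, $\boldsymbol{\omega} = (\mathbf{n}\cdot\boldsymbol{\omega})\mathbf{n} + \mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}$ to obtain
$$\mathbf{n}\cdot\mathbf{v}\times\boldsymbol{\omega} = \mathbf{n}\cdot\mathbf{v}\times\left[(\mathbf{n}\cdot\boldsymbol{\omega})\mathbf{n} + \mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}\right] = \mathbf{n}\cdot\mathbf{v}\times(\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}) = (\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n})\cdot(\mathbf{n}\times\mathbf{v})$$
since $\mathbf{n}\cdot\mathbf{v}\times(\mathbf{n}\cdot\boldsymbol{\omega})\mathbf{n} = (\mathbf{n}\cdot\boldsymbol{\omega})(\mathbf{n}\cdot\mathbf{v}\times\mathbf{n}) = 0$, and we have used the triple vector product
identity $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \mathbf{C}\cdot\mathbf{A}\times\mathbf{B}$, where here $\mathbf{A} = \mathbf{n}$, $\mathbf{B} = \mathbf{v}$, and $\mathbf{C} = \mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}$, the
tangential vorticity. The final weak form of (3.12-39) is thus
$$\int_\Omega \nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} = \int_\Omega \mathbf{v}\cdot\mathbf{f} + \int_\Gamma (\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n})\cdot(\mathbf{n}\times\mathbf{v}),$$
and our digression is over. The weak form of (3.12-38) that is of interest can now be
easily developed, and the result is
$$\int_\Omega\left[\mathbf{v}\cdot\left(\frac{\partial\mathbf{u}}{\partial t} + \boldsymbol{\omega}\times\mathbf{u}\right) + \nu\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} - P_T\,\nabla\cdot\mathbf{v}\right] = \int_\Gamma\left[\nu(\mathbf{n}\times\mathbf{v})\cdot(\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}) - \mathbf{n}\cdot\mathbf{v}P_T\right], \tag{3.12-40}$$
which is seen to introduce the following new NBC's:
$$\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n} = \mathbf{f} \quad\text{and}\quad P_T = g; \tag{3.12-41}$$
both the tangential vorticity and the total pressure are specified.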
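The two algebraic facts used in the boundary-term manipulation above, namely the normal/tangential split of $\boldsymbol{\omega}$ and the identity $\mathbf{n}\cdot(\mathbf{v}\times\boldsymbol{\omega}) = (\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n})\cdot(\mathbf{n}\times\mathbf{v})$, are easy to spot-check on arbitrary vectors. The following Python sketch does so with randomly generated data (the seed and the vectors are arbitrary assumptions). It reads $\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}$ as $(\mathbf{n}\times\boldsymbol{\omega})\times\mathbf{n}$; for a unit normal the other grouping gives the same vector.

```python
import math
import random


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def tangential_part(n, w):
    """n x w x n for a unit normal n, i.e. w minus its normal component."""
    return cross(cross(n, w), n)


random.seed(0)  # arbitrary seed so the sketch is repeatable
raw = [random.uniform(-1, 1) for _ in range(3)]
norm = math.sqrt(dot(raw, raw))
n = tuple(c / norm for c in raw)                        # unit 'normal'
omega = tuple(random.uniform(-1, 1) for _ in range(3))  # 'vorticity'
v = tuple(random.uniform(-1, 1) for _ in range(3))      # 'test function' value

omega_t = tangential_part(n, omega)
# omega = (n . omega) n + n x omega x n
recomposed = tuple(dot(n, omega) * ni + ti for ni, ti in zip(n, omega_t))
split_error = max(abs(a - b) for a, b in zip(recomposed, omega))

lhs = dot(n, cross(v, omega))        # boundary integrand n . (v x omega)
rhs = dot(omega_t, cross(n, v))      # (n x omega x n) . (n x v)
print(split_error, lhs, rhs)
```

Only the tangential parts of both the vorticity and the test function survive in the boundary term, which is why the NBC involves the tangential vorticity alone.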
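A closely related result follows easily from the curl form [(3.6-13)] rather than (3.6-2)
or (3.4-4):
$$\int_\Omega\left[\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} - P\,\nabla\cdot\mathbf{v}\right] = \int_\Gamma\left[\nu(\mathbf{n}\times\mathbf{v})\cdot(\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}) - \mathbf{n}\cdot\mathbf{v}P\right], \tag{3.12-42}$$
where $P$ is now the 'usual' (static) pressure, with NBC's of
$$\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n} = \mathbf{f} \quad\text{and}\quad P = g; \tag{3.12-43}$$
the tangential vorticity and the pressure are specified, both weakly.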
Remarks:
(1) This result agrees with that in equation (4.11.3) of Gunzburger (1989) once the
'typos' there are corrected; the sign of $s$ in his equations (4.11.3) and (4.11.4) should
be negated, as should that of $r$ in equation (4.11.4).
(2) If the tangential velocity is specified on $\Gamma$, then the first boundary integral in
(3.12-42) vanishes, and the second involves only $\mathbf{n}\cdot\mathbf{v}P$; thus, if the normal velocity
is not specified, then we have an NBC in terms of pressure alone ($P = g$, say).
(3) Finally, if the normal velocity is specified, $\mathbf{n}\cdot\mathbf{v} = 0$, and if the tangential velocity is
not specified, then we obtain an NBC that specifies only the tangential component
of vorticity.
Another closely related and more stable (Gunzburger, 1989) form starts from the div-curl
form given by (3.6-12) and becomes [integrating also $\mathbf{v}\cdot\nabla(\nabla\cdot\mathbf{u})$ by parts]:
$$\int_\Omega\left[\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} + (\nu\nabla\cdot\mathbf{u} - P)\nabla\cdot\mathbf{v}\right] = \int_\Gamma\left[\nu(\mathbf{n}\times\mathbf{v})\cdot(\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}) + (\mathbf{n}\cdot\mathbf{v})(\nu\nabla\cdot\mathbf{u} - P)\right], \tag{3.12-44}$$
with NBC's of
$$\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n} = \mathbf{f} \quad\text{and}\quad P - \nu\nabla\cdot\mathbf{u} = g, \tag{3.12-45}$$
which is equation (4.11.4) in Gunzburger (1989) once $r$ is changed to $-r$ there.
Remark:
It is probably usually safe to assume that $\nu\nabla\cdot\mathbf{u} \approx 0$ in (3.12-45) and to interpret it as
specifying (weakly) a pressure, as in (3.12-43).
Thus, we see that the curl forms open up three new NBC possibilities (while
simultaneously closing others): (i) specify tangential velocity and (as an NBC) pressure, (ii) specify
tangential vorticity and pressure (both as NBC's), and (iii) specify normal velocity and (as
an NBC) tangential vorticity. All of these, of course, are in addition to the conventional
Dirichlet (essential) BC of specified velocity.
In an attempt to summarize the situation and complete the presentation, let us define
a particular problem based on (3.12-42):
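$$\mathbf{u} = \mathbf{w} \quad\text{on } \Gamma_1, \tag{3.12-46}$$
$$\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n} = \mathbf{f}_2 \ \text{ and } \ \mathbf{u}\cdot\mathbf{n} = w_2 \quad\text{on } \Gamma_2, \tag{3.12-47}$$
$$\mathbf{n}\times\mathbf{u}\times\mathbf{n} = \mathbf{w}_3 \ \text{ and } \ P = P_3 \quad\text{on } \Gamma_3, \tag{3.12-48}$$
$$\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n} = \mathbf{f}_4 \ \text{ and } \ P = P_4 \quad\text{on } \Gamma_4. \tag{3.12-49}$$
The weak form is as follows: find $\mathbf{u}$ that satisfies $\mathbf{u} = \mathbf{w}$ on $\Gamma_1$, $\mathbf{u}\cdot\mathbf{n} = w_2$ on $\Gamma_2$, and
$\mathbf{n}\times\mathbf{u}\times\mathbf{n} = \mathbf{w}_3$ on $\Gamma_3$ from
$$\int_\Omega\left(\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} - P\,\nabla\cdot\mathbf{v}\right) = \int_{\Gamma_2}\nu(\mathbf{n}\times\mathbf{v})\cdot\mathbf{f}_2 - \int_{\Gamma_3}\mathbf{n}\cdot\mathbf{v}P_3 + \int_{\Gamma_4}\left[\nu(\mathbf{n}\times\mathbf{v})\cdot\mathbf{f}_4 - \mathbf{n}\cdot\mathbf{v}P_4\right] \quad \forall\mathbf{v} \tag{3.12-50}$$
that is once-piecewise differentiable and also satisfies: $\mathbf{v} = 0$ on $\Gamma_1$, $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_2$, and
$\mathbf{n}\times\mathbf{v} = 0$ on $\Gamma_3$.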
Remarks:
(1) It is not at all obvious if or when one would actually wish to solve such an IBVP,
but the possibility does exist—as do simpler subsets.
(2) The continuity equation (3.12-30), must of course be solved simultaneously.
To conclude this discussion, and perhaps shed more light on weak forms involving curl,
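let us write the weak div-curl form, (3.12-44), in simple 2D cartesians (via $\mathbf{v} = \mathbf{e}_\alpha\phi^{(\alpha)}$,
etc.), wherein $\omega = \partial v/\partial x - \partial u/\partial y$:
$$\int_\Omega\left\{\phi^{(x)}\frac{Du}{Dt} + \nu\frac{\partial\phi^{(x)}}{\partial y}\left(\frac{\partial u}{\partial y} - \frac{\partial v}{\partial x}\right) + \frac{\partial\phi^{(x)}}{\partial x}\left(\nu\nabla\cdot\mathbf{u} - P\right)\right\} = \int_\Gamma \phi^{(x)}\left[-\nu n_y\omega + n_x\left(\nu\nabla\cdot\mathbf{u} - P\right)\right] \tag{3.12-51}$$
and
$$\int_\Omega\left\{\phi^{(y)}\frac{Dv}{Dt} + \nu\frac{\partial\phi^{(y)}}{\partial x}\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) + \frac{\partial\phi^{(y)}}{\partial y}\left(\nu\nabla\cdot\mathbf{u} - P\right)\right\} = \int_\Gamma \phi^{(y)}\left[\nu n_x\omega + n_y\left(\nu\nabla\cdot\mathbf{u} - P\right)\right], \tag{3.12-52}$$
whose corresponding strong form reads
$$\frac{Du}{Dt} + \nu\frac{\partial}{\partial y}\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) + \frac{\partial}{\partial x}\left(P - \nu\nabla\cdot\mathbf{u}\right) = 0 \tag{3.12-53}$$
and
$$\frac{Dv}{Dt} + \nu\frac{\partial}{\partial x}\left(\frac{\partial u}{\partial y} - \frac{\partial v}{\partial x}\right) + \frac{\partial}{\partial y}\left(P - \nu\nabla\cdot\mathbf{u}\right) = 0, \tag{3.12-54}$$
where the simpler (but less stable computationally, according to Gunzburger, 1989) curl
form is obtained by setting $\nabla\cdot\mathbf{u} = 0$ in each of the above.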
Remarks:
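(1) If these forms were to be used in a problem with outflow in the x-direction (say) via
homogeneous NBC's ($\omega = 0$ and $P = 0$ at outlet), the exit flow would be forced to
satisfy the irrotational constraint $\partial v/\partial x = \partial u/\partial y$ and the 'non-Bernoulli' constraint
$P_T = P + q^2/2 = q^2/2$. Thus, if a near-potential flow is thought or assumed to exist
at the exit plane, the full rotational form of (3.12-38) should be used instead because
then the homogeneous NBC's are compatible with a potential flow: $\omega = 0$ and
$P_T = P + q^2/2 = 0$.
(2) If the full rotational form is used, $\mathbf{u}\cdot\nabla\mathbf{u}$ in (3.12-53) and (3.12-54) is replaced by
$\boldsymbol{\omega}\times\mathbf{u}$; i.e.,
$$\mathbf{u}\cdot\nabla u = u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} \quad\text{goes to}\quad -v\omega = v\left(\frac{\partial u}{\partial y} - \frac{\partial v}{\partial x}\right)$$
and
$$\mathbf{u}\cdot\nabla v = u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} \quad\text{goes to}\quad u\omega = u\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right).$$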
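The replacement in Remark (2) rests on the identity $\mathbf{u}\cdot\nabla\mathbf{u} = \boldsymbol{\omega}\times\mathbf{u} + \nabla(q^2/2)$ with $q^2 = \mathbf{u}\cdot\mathbf{u}$, the gradient part being absorbed into the total pressure $P_T = P + q^2/2$. A minimal pointwise check in 2D is sketched below in Python; the polynomial velocity field and the sample points are assumptions made only for the illustration.

```python
def fields(x, y):
    """Assumed velocity u = x^2 y, v = -x y^2 and its first derivatives."""
    u, v = x * x * y, -x * y * y
    ux, uy = 2 * x * y, x * x
    vx, vy = -y * y, -2 * x * y
    return u, v, ux, uy, vx, vy


def convective(x, y):
    """Components of u . grad u in the conventional (advective) form."""
    u, v, ux, uy, vx, vy = fields(x, y)
    return u * ux + v * uy, u * vx + v * vy


def rotational(x, y):
    """Components of omega x u + grad(q^2 / 2), the rotational-form equivalent."""
    u, v, ux, uy, vx, vy = fields(x, y)
    omega = vx - uy
    # gradient of q^2/2 = (u^2 + v^2)/2, which the rotational form moves into P_T
    gx, gy = u * ux + v * vx, u * uy + v * vy
    return -v * omega + gx, u * omega + gy


samples = [(0.3, 0.7), (1.2, -0.4), (-0.5, 2.0)]
max_gap = max(
    abs(a - b) for p in samples for a, b in zip(convective(*p), rotational(*p))
)
print(max_gap)
```

Because the gradient term is carried by $P_T$, a homogeneous natural condition on the pressure means something different in the two forms, which is the point of Remark (1).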
d. Other recent formulations
Here we simply summarize some of what others have done recently via more avant-garde
weak forms in a weak attempt at 'completeness.' The principal intent is to make the
reader aware of yet other options. Specifically, we show three additional cases [all of
which must be appended by (3.12-30)], the first two from Pironneau (1989) and the last
from Girault (1988b), and all three are our own 'generalizations—interpretations' which
may not be precise.
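1. The BC's $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and $\mathbf{u}\cdot\mathbf{n} = u_\Gamma$, $\mathbf{n}\times\boldsymbol{\omega} = \mathbf{d}$, $\partial P/\partial n = \mathbf{f}\cdot\mathbf{n} - \nabla_\Gamma\cdot\mathbf{d}$ on $\Gamma_N$,
where $\nabla_\Gamma$ is the surface gradient operator, and it is required that $\mathbf{n}\cdot\mathbf{d} = 0$, can be satisfied
via the following weak form: find $\mathbf{u}$ that satisfies $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and $\mathbf{u}\cdot\mathbf{n} = w_\Gamma$ on $\Gamma_N$
from
$$\int_\Omega\left[\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} + (\nu\nabla\cdot\mathbf{u} - P)\nabla\cdot\mathbf{v}\right] = \int_\Omega \mathbf{v}\cdot\mathbf{f} + \int_{\Gamma_N}\mathbf{v}\cdot\mathbf{d} \quad \forall\mathbf{v} \tag{3.12-55}$$
that is once-piecewise differentiable and also satisfies: $\mathbf{v} = 0$ on $\Gamma_D$, $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_N$.
Remarks:
(1) $\mathbf{f}$ is a specified body force (acceleration).
(2) This is a slight (and risky?) generalization (from Stokes to NS) of that presented in
Pironneau (1989).
2. The BC's $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$; on $\Gamma_N$ we have the three BC's: $\mathbf{n}\cdot\mathbf{u} = u_\Gamma$, $\mathbf{n}\cdot\boldsymbol{\omega} = 0$, and
$\partial P/\partial n = \mathbf{n}\cdot\mathbf{f}$, where $\mathbf{f}$ is a body force, can be satisfied using the following weak form:
find $\mathbf{u}$ that satisfies $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and $\mathbf{u}\cdot\mathbf{n} = w_\Gamma$ on $\Gamma_N$ from
$$\int_\Omega\left[\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} + (\nu\nabla\cdot\mathbf{u} - P)\nabla\cdot\mathbf{v}\right] = \int_\Omega \mathbf{v}\cdot\mathbf{f}, \quad \forall\mathbf{v} \tag{3.12-56}$$
that is once-piecewise differentiable and also satisfies: $\mathbf{v} = 0$ on $\Gamma_D$, $\mathbf{n}\cdot\mathbf{v} = 0$ and
$\mathbf{n}\cdot\nabla\times\mathbf{v} = 0$ on $\Gamma_N$, where Remark (2) above applies again.
3. The BC's $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and on $\Gamma_N$ we have the three BC's: $\mathbf{n}\cdot\mathbf{u} = 0$, $\mathbf{n}\cdot\boldsymbol{\omega} = 0$, and
$\mathbf{n}\cdot(\nabla\times\boldsymbol{\omega}) = 0$. These can be satisfied by the weak form: find $\mathbf{u}$ subject to $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$
and $\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma_N$ from
$$\int_\Omega\left\{\mathbf{v}\cdot\frac{\partial\mathbf{u}}{\partial t} + \nu\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} + \mathbf{v}\cdot\left[(\nabla\times\mathbf{u})\times\mathbf{u}\right] + \mathbf{v}\cdot\nabla\left(P + \tfrac{1}{2}\mathbf{u}\cdot\mathbf{u}\right)\right\} = \int_\Omega \mathbf{f}\cdot\mathbf{v} \quad \forall\mathbf{v} \in H^1 \tag{3.12-57}$$
that vanishes on $\Gamma_D$ and has $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_N$.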
Remarks:
(1) Again, this presentation is an 'interpretation' of the presentation in Girault (1988b),
which should be consulted if these BC's are seriously contemplated.
(2) See also Girault and Raviart (1986) for a few other interesting weak formulations.
e. Divergence-free basis functions
It is possible in principle (and in practice), but apparently not very popular, to
actually construct a space of approximating functions such that each member
is solenoidal. The reason that it is seldom used in practice is that the construction is
cumbersome (especially in 3D), and the choice of elements is limited; at least that is
the situation up to now. Although more detailed discussions will be presented in a later
section, below we merely provide some motivation for the search for divergence-free
basis functions/elements. Beginning with (3.12-34): suppose every vector test function,
$\mathbf{v}$, were divergence-free; this would obviously simplify (3.12-34) to
$$\int_\Omega\left\{\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\left[(\nabla\mathbf{u})^T + \gamma\nabla\mathbf{u}\right] : \nabla\mathbf{v}\right\} = \int_\Gamma \mathbf{v}\cdot\left(\mathbf{d}^T\cdot\mathbf{n} - P\mathbf{n}\right). \tag{3.12-58}$$
In addition, since u will be 'generated' via (expressed as a linear combination of) similar
divergence-free functions, V • u = 0 will be satisfied automatically, and (3.12-30) is not
needed. In this case, the resulting scalar equation [for each v, which now can not be
expressed via $\mathbf{v} = \mathbf{e}_\alpha\phi^{(\alpha)}$ since its components are necessarily coupled] is really that;
divergence-free basis functions lead to a scalar system of equations rather than a vector
system—and the pressure is gone! The incentive should now be clear: rather than solve a
system of equations (infinite at the moment, finite when approximated via the FEM) for u,
another for v, another (in 3D) for w, and yet another for P—all of which are coupled—a
divergence-free basis reduces this to a smaller single system of (coupled) equations (for
the scalar amplitude coefficients in the expansion of the vector u), seemingly a very
substantial savings. We will return to this issue after we discretize the weak form in
Section 3.13 (i.e., Section 3.13.7).
3.12.3 Pressure Poisson Equation Formulations
Another way to avoid the explicit use of V • u = 0 is to replace it with the PPE. Here we
develop the obvious weak form of the PPE that could be used to replace the weak form of
the continuity equation (3.12-30), and suggest why the 'obvious' weak formulation may
not be the best one.
The problem to be addressed is that described by (3.10-1) and (3.10-3) through
(3.10-13) in the PPE formulation. But the momentum equation has already been dealt with,
so we focus now on the PPE and seek a good weak form, beginning in the obvious way;
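i.e., multiplying (3.10-3) by the 'generic' scalar test function $\phi$ and performing the usual
steps, after abbreviating by defining $\mathbf{f} = Re^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}$, gives
$$\int \phi\,\nabla^2 P = \int \phi\,\nabla\cdot\mathbf{f} \;\Rightarrow\; \int_\Gamma \phi\,\frac{\partial P}{\partial n} - \int \nabla\phi\cdot\nabla P = \int_\Gamma \phi\,\mathbf{n}\cdot\mathbf{f} - \int \mathbf{f}\cdot\nabla\phi;$$
i.e.,
$$\int \nabla\phi\cdot\nabla P = \int \mathbf{f}\cdot\nabla\phi + \int_\Gamma \phi\left(\frac{\partial P}{\partial n} - \mathbf{n}\cdot\mathbf{f}\right),$$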
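and we pause to reflect upon the BC's, noting first that the normal component of the
momentum equation on $\Gamma$ is $\partial P/\partial n = \mathbf{n}\cdot(\mathbf{f} - \partial\mathbf{u}/\partial t)$, an equation that supplies the
Neumann BC for the PPE when $\mathbf{n}\cdot\mathbf{u}$ is specified (as $\mathbf{n}\cdot\mathbf{w}$) on $\Gamma$, which is here the
case on $\Gamma_D$ and $\Gamma_n$. On the remainder of $\Gamma$ ($\Gamma_N$ and $\Gamma_\tau$), the pressure is related to the
velocity via $P = Re^{-1}\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - \mathbf{n}\cdot\mathbf{F} = P_N$ on $\Gamma_N$, and
$P = Re^{-1}\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - F_n = P_\tau$ on $\Gamma_\tau$; i.e., via Dirichlet BC's on the pressure. So, a weak form of
the PPE is the following:
Find $P \in H^1_{PE}$ such that
$$\int \nabla\phi\cdot\nabla P = \int \mathbf{f}\cdot\nabla\phi - \int_{\Gamma_D\cup\Gamma_n} \phi\,\mathbf{n}\cdot\frac{\partial\mathbf{w}}{\partial t} \quad \forall\phi\in H^1_{P0}, \tag{3.12-59}$$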
where $H^1_{PE}$ is that set of piecewise once-differentiable $H^1$-functions in $\Omega$ that take on
the value $P_N$ on $\Gamma_N$ and the value $P_\tau$ on $\Gamma_\tau$, and $H^1_{P0}$ is that subset that vanishes on $\Gamma_N$
and $\Gamma_\tau$.
Remarks:
(1) Since $\mathbf{f} = Re^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}$, it is clear that this form requires
the existence of second spatial derivatives of the velocity field; i.e., it seems to
demand that $\mathbf{u} \in H^2$ rather than the usual $H^1$, which is at least consistent with our
now requiring $P \in H^1$ rather than the larger space ($L^2$) associated with the u-P
formulation.
(2) The (weak) form of V • u = 0 that is implied by this PPE method is not obvious;
perhaps it would be enough just to know that there is one, but we would rest more
easily if we could identify it. We believe and assert that there is none—but this
alone certainly does not preclude potential utility of the method.
(3) Satisfaction of the essential BC's will be more difficult than usual.
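In response to Remark (1), consider the following: assume $\nabla\cdot\mathbf{u} = 0$ to obtain
$\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] = \nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u}) = \nabla^2\mathbf{u}$ and use
$\nabla\cdot\nabla^2\mathbf{u} = \nabla\cdot[\nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}] = 0$ and
$$\int \nabla\phi\cdot\nabla^2\mathbf{u} = \int_\Gamma \phi\,\mathbf{n}\cdot\nabla^2\mathbf{u} - \int \phi\,\nabla\cdot(\nabla^2\mathbf{u}) = \int_\Gamma \phi\,\mathbf{n}\cdot\nabla^2\mathbf{u}$$
so that another weak form is:
Find $P \in H^1_{PE}$ such that
$$\int \nabla\phi\cdot\nabla P = \int \nabla\phi\cdot(\mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}) + \int_\Gamma \phi\,\mathbf{n}\cdot\left(Re^{-1}\,\nabla^2\mathbf{u} - \frac{\partial\mathbf{w}}{\partial t}\right) \quad \forall\phi\in H^1_{P0}, \tag{3.12-60}$$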
and one may well ask what has been gained since we still need to evaluate V2u—or at
least its normal component—and on the boundary yet! What may have been gained is
a decent approximation that might work for large Reynolds number: simply neglect the
boundary term $Re^{-1}\,\mathbf{n}\cdot\nabla^2\mathbf{u}$ for $Re \gg 1$. That this might work is related to the fact that
$\partial P/\partial n = 0$ on solid, stationary no-slip walls (with no body force) is a good approximation
at large Re (indeed, it is one of the cornerstones of boundary layer theory). This result
also seems to imply that the neglect of the viscous terms on the RHS of (3.12-59) may
also be valid for $Re \gg 1$, with the introduction of (3.12-60) merely serving as a method
of justification. We shall later (Section 3.13.4) return to these issues, and raise new ones,
after we discretize the weak form in Section 3.13. For now we merely state that the above
weak formulations of the PPE are probably not the best ones.
3.12.4 The Stream Function-Vorticity Formulation
Although we will not discuss the finite element implementation of the $\psi$-$\omega$ formulation,
we will discuss proper (and improper) weak formulations, partly for completeness and
partly to show why we remain primitive variable advocates. Similarly, and for similar
reasons, we do not discuss $\mathbf{u}$-$\omega$ methods. See Gresho (1992) for some discussion of
these, and for many references.
Suppose we wish to solve (3.6-5) and (3.6-6), using (3.6-4), in a weak form, for the
situation wherein $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and $\Gamma - \Gamma_D = \Gamma_N$ comprises an outflow boundary, with
the OBC to be determined. Also, we are either given an initial divergence-free velocity [from which
(3.6-4) can be used to compute the initial stream function and then (3.6-6) gives the
initial vorticity], or we are given an initial vorticity, $\omega_0$, from which (3.6-6) gives the
initial stream function and then (3.6-4) gives the initial velocity.
First we show how early investigators fell into the 'weak formulation trap' (a la early
FDM investigators); namely, generate the 'usual' weak forms of (3.6-5) and (3.6-6), using
$\theta$ as the generic test function for the former and $\phi$ for the latter; i.e.,
$$\int \theta\left(\frac{\partial\omega}{\partial t} + \mathbf{u}\cdot\nabla\omega - \nu\nabla^2\omega\right) = 0 \quad\text{and}\quad \int \phi\,(\nabla^2\psi + \omega) = 0,$$
which led to, in the usual way,
$$\int \left[\theta\left(\frac{\partial\omega}{\partial t} + \mathbf{u}\cdot\nabla\omega\right) + \nu\nabla\theta\cdot\nabla\omega\right] = \int_\Gamma \nu\,\theta\,\frac{\partial\omega}{\partial n} \tag{3.12-61}$$
and
$$\int \nabla\phi\cdot\nabla\psi = \int \phi\,\omega + \int_\Gamma \phi\,\frac{\partial\psi}{\partial n}. \tag{3.12-62}$$
The $\partial\psi/\partial n$ term is okay because $\partial\psi/\partial n = u_\tau = \boldsymbol{\tau}\cdot\mathbf{w}$ and $u_\tau$ is known (except at
the exit). The problem is the viscous flux of vorticity, $\nu\,\partial\omega/\partial n$: it is not known on any
part of $\Gamma$ (except perhaps at outflow where it is usually assumed/taken to be zero; i.e.,
'fully developed'). Thus, this weak form is not only useless, it has also been misused, for
example by assuming that a specified (essential) BC for $\omega$ on $\Gamma_D$ could be obtained via the
computation $\omega_\Gamma = -\nabla^2\psi|_\Gamma$. See Stevens (1982), in which these ideas were implemented
and compared with the proper weak formulation (detailed below), which Stevens also
implemented. He showed quite conclusively that the fully coupled method, with both
BC's on $\psi$, is the thing to do.
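The proper weak formulation was discovered by Campion-Renson and Crochet (1978),
and, it seems, simultaneously by Barrett (1978), via variational methods applied to the
steady Stokes equations. It begins with the BC's $\psi = g$ and $\partial\psi/\partial n = u_\tau$ on $\Gamma_D$ and, for
OBC's, we choose $\partial\psi/\partial n = a$ and $\nu\,\partial\omega/\partial n = b$ on $\Gamma_N$, where we will soon set both $a$ and
$b$ to zero. The weak form is then (see also Thomasset, 1981, and Gresho, 1992):
Find $\psi \in H^1_E$ and $\omega \in H^1$ from
$$\int \left[\theta\left(\frac{\partial\omega}{\partial t} + \mathbf{u}\cdot\nabla\omega\right) + \nu\nabla\theta\cdot\nabla\omega\right] = \int_{\Gamma_N} \theta\,b \quad \forall\theta\in H^1_0, \tag{3.12-63}$$
and
$$\int \nabla\phi\cdot\nabla\psi = \int \phi\,\omega + \int_{\Gamma_D} \phi\,u_\tau + \int_{\Gamma_N} \phi\,a \quad \forall\phi\in H^1, \tag{3.12-64}$$
where $H^1$ contains once-piecewise differentiable functions on $\Omega$, $H^1_E$ is that subset that
takes the value $g$ on $\Gamma_D$, and $H^1_0$ is that subset that vanishes on $\Gamma_D$.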
Remarks:
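(1) The vorticity is computed everywhere, in $\Omega$ and on $\Gamma$, because the test functions, $\phi$, do
not vanish on $\Gamma$. (The no-slip BC, $\partial\psi/\partial n = \boldsymbol{\tau}\cdot\mathbf{w}$ on $\Gamma_D$, is implicitly/automatically
realized because the proper value of $\omega$ on $\Gamma_D$ is computed.)
(2) $a = b = 0$ is the usual choice for an OBC.
(3) The $\psi$-$\omega$ system is tightly coupled, in proper analogy to u-P formulations, wherein
$\mathbf{u}$ and $P$ are tightly coupled; this is another consequence of $\nabla\cdot\mathbf{u} = 0$ in $\Omega$.
(4) Assuming $\psi$ and $\omega$ are smooth enough, the weak form may be reversed via
$$\int \nabla\theta\cdot\nabla\omega = \int \nabla\cdot(\theta\nabla\omega) - \int \theta\,\nabla^2\omega = \int_{\Gamma_N} \theta\,\frac{\partial\omega}{\partial n} - \int \theta\,\nabla^2\omega$$
since $\theta = 0$ on $\Gamma_D$;
i.e., (3.12-63) yields $\int \theta\,(D\omega/Dt - \nu\nabla^2\omega) = \int_{\Gamma_N} \theta\,(b - \nu\,\partial\omega/\partial n)$ $\Rightarrow$ $D\omega/Dt = \nu\nabla^2\omega$ in $\Omega$ and $\nu\,\partial\omega/\partial n = b$ on $\Gamma_N$. Next,
$\int \nabla\phi\cdot\nabla\psi = \int \nabla\cdot(\phi\nabla\psi) - \int \phi\,\nabla^2\psi = \int_\Gamma \phi\,(\partial\psi/\partial n) - \int \phi\,\nabla^2\psi$, so that (3.12-64) yields
$\int \phi\,(\nabla^2\psi + \omega) = \int_{\Gamma_D} \phi\,(\partial\psi/\partial n - u_\tau) + \int_{\Gamma_N} \phi\,(\partial\psi/\partial n - a)$, which $\Rightarrow$ $\nabla^2\psi + \omega = 0$ in $\Omega$, $\partial\psi/\partial n = u_\tau$ on $\Gamma_D$, and
$\partial\psi/\partial n = a$ on $\Gamma_N$.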
(5) Multiply-connected domains cause additional difficulties. See the above references
for details.
(6) For $\Gamma_N = \emptyset$, the above formulation satisfies Quartapelle's projection theorem
(Quartapelle and Valz-Gris, 1981; and Quartapelle, 1981): 'A function $\omega$ is such that
$\omega = -\nabla^2\psi$ in $\Omega$, where $\psi = g$ and $\partial\psi/\partial n = u_\tau$ on $\Gamma$ if, and only if,
$\int \omega\,\phi = \int_\Gamma (g\,\partial\phi/\partial n - \phi\,u_\tau)$ for every $\phi$ satisfying $\nabla^2\phi = 0$.' When the requirements of
this theorem are satisfied, then the problem: $\nabla^2\psi + \omega = 0$ and $\nabla^2\omega = 0$ in $\Omega$ with
either $\psi = g$ or $\partial\psi/\partial n = u_\tau$ on $\Gamma$ has a unique solution that also solves the implied
biharmonic problem: $\nabla^4\psi = 0$ in $\Omega$, $\psi = g$ and $\partial\psi/\partial n = u_\tau$ on $\Gamma$.
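The 'only if' direction of the projection condition can be checked by hand in a one-dimensional analogue, which is an assumption made here for illustration and is not taken from the text: on the interval (0, 1) the harmonic test functions are the linear polynomials, the boundary consists of the two end points with outward normals -1 and +1, and polynomials are stored as coefficient lists. A minimal sketch in exact rational arithmetic:

```python
from fractions import Fraction

def deriv(p):
    """Derivative of a polynomial given as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def evaluate(p, x):
    return sum(Fraction(c) * Fraction(x) ** k for k, c in enumerate(p))

def multiply(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * b
    return out

def integrate_01(p):
    """Exact integral of the polynomial over [0, 1]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def projection_gap(psi, phi):
    """Left side minus right side of the projection condition on (0, 1).

    omega = -psi''; g and u_tau are the boundary values of psi and dpsi/dn.
    The gap vanishes when phi is harmonic (linear in 1D).
    """
    omega = [-c for c in deriv(deriv(psi))]
    lhs = integrate_01(multiply(omega, phi))
    dpsi, dphi = deriv(psi), deriv(phi)
    rhs = Fraction(0)
    for x, n in ((0, -1), (1, 1)):          # end point and its outward normal
        g = evaluate(psi, x)
        u_tau = n * evaluate(dpsi, x)
        rhs += g * n * evaluate(dphi, x) - evaluate(phi, x) * u_tau
    return lhs - rhs
```

For example, with psi = 1 + 2x + x^3 the gap is zero for the linear test function 2 + 3x, whereas the non-harmonic test function x^2 leaves a nonzero gap.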
Enough on $\psi$-$\omega$. Now we must get down to brass tacks and discuss finite element
methods for solving the u-P and PPE weak formulations, after one small digression.
3.12.5 Some Ill-posed Formulations
Not all formulations lead to well-posed problems. In this section we discuss some
formulations that are mathematically ill-posed but which have also been 'coded up' and
used, seemingly with success (!), mostly in the finite difference literature, but occasionally
in the finite element literature. The ill-posedness shows up only in those situations in which
an NBC is 'active'—notably as OBC's, as mentioned already in Section 3.8.1—and is
most easily presented/analyzed in the simple but relevant case of steady Stokes flow, so
that is what we shall do. Specifically, we consider three formulations, differing only in
their treatment of the NBC/OBC, and the proofs we present (for the ill-posed cases) are
based on those supplied to us by V. Girault, whom we gratefully acknowledge.
Thus, consider the following three problems, all of which strive to find u and P from
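$$-\nabla^2\mathbf{u} + \nabla P = \mathbf{f} \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0 \ \text{in}\ \Omega \tag{3.12-65}$$
with
$$\mathbf{u} = \mathbf{w} \ \text{on}\ \Gamma_D, \tag{3.12-66}$$
where $\mathbf{f}$ and $\mathbf{w}$ are data and unit viscosity has been assumed for simplicity, with
no loss of generality. They differ only in their treatment of BC's on $\Gamma_N = \Gamma - \Gamma_D$, the
'outflow' (or open) portion of $\Gamma$, as follows:
(i) $\partial\mathbf{u}/\partial n - \mathbf{n}P = 0$ (3.12-67)
(ii) $\partial\mathbf{u}/\partial n = 0$ (3.12-68)
(iii) $\partial\mathbf{u}/\partial n = 0$ and $P = 0$. (3.12-69)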
Remarks:
(1) It will suffice to consider only homogeneous BC's; generalization to the
inhomogeneous case is immediate.
(2) It is also sufficient to consider just the case in which the NBC is simultaneously
applied to both normal and tangential momentum equations, as above. Generalization
is again straightforward.
We state now and prove below (but not in great detail) the following results: only the
first formulation is well-posed.
Problem 1.
Actually, we will not delve deeply into the well-posedness issue for this problem
(existence, uniqueness, continuous dependence on the data), since the theory is deep
and presented well elsewhere (see, for example, Ladyzhenskaya, 1969; and Girault and
Raviart, 1986). We will merely state that Problem 1 is well-posed and show its weak
form, which is easily obtained from (3.12-34) after adding a body force term and setting
K = 0:
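$$\int \nabla\mathbf{v}:(\nabla\mathbf{u})^T - \int P\,\nabla\cdot\mathbf{v} = \int \mathbf{v}\cdot\mathbf{f} + \int_{\Gamma_N} \mathbf{v}\cdot(\partial\mathbf{u}/\partial n - \mathbf{n}P) \quad \forall\mathbf{v}\in H^1_0, \tag{3.12-70}$$
which is to be solved, along with $\int \psi\,\nabla\cdot\mathbf{u} = 0$, $\forall\psi\in L^2$, after dropping the boundary
integral term, which of course forces the solution to satisfy the homogeneous NBC of
(3.12-67).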
Problem 2.
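The key thing to show here is that if a solution exists, it is not unique, thus proving
ill-posedness. To try to do this in the 'standard way,' we form the $L^2$-inner product of the
momentum equation with $\mathbf{u}$ and similarly for the continuity equation with $P$, where now
$\mathbf{u} = \mathbf{u}_1 - \mathbf{u}_2$ and $P = P_1 - P_2$, the difference of two alleged solutions, which satisfies $-\nabla^2\mathbf{u} + \nabla P = 0$,
$\nabla\cdot\mathbf{u} = 0$ in $\Omega$ with $\mathbf{u} = 0$ on $\Gamma_D$, and $\partial\mathbf{u}/\partial n = 0$ on $\Gamma_N$. Thus,
$$-\int \mathbf{u}\cdot\nabla^2\mathbf{u} + \int \mathbf{u}\cdot\nabla P = 0 \quad\text{and}\quad \int P\,\nabla\cdot\mathbf{u} = 0,$$
and we integrate by parts the momentum equation to obtain
$$-\int \nabla\cdot[\mathbf{u}\cdot(\nabla\mathbf{u})^T] + \int \nabla\mathbf{u}:(\nabla\mathbf{u})^T + \int \nabla\cdot(\mathbf{u}P) - \int P\,\nabla\cdot\mathbf{u} = 0,$$
or
$$\int \nabla\mathbf{u}:(\nabla\mathbf{u})^T + \int_\Gamma (P\,\mathbf{n}\cdot\mathbf{u} - \mathbf{u}\cdot\partial\mathbf{u}/\partial n) = 0.$$
Now $\mathbf{u} = 0$ and $\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma_D$, and $\partial\mathbf{u}/\partial n = 0$ on $\Gamma_N$, so we are left with
$$\int \nabla\mathbf{u}:(\nabla\mathbf{u})^T + \int_{\Gamma_N} P\,\mathbf{n}\cdot\mathbf{u} = 0. \tag{3.12-71}$$
But since we know nothing about either $P$ or $\mathbf{n}\cdot\mathbf{u}$ on $\Gamma_N$, we cannot prove uniqueness
in the 'usual way.' In contrast, the usual way applied to Problem 1 gives, instead of
(3.12-71),
$$\int \nabla\mathbf{u}:(\nabla\mathbf{u})^T + \int_{\Gamma_N} (\mathbf{n}P - \partial\mathbf{u}/\partial n)\cdot\mathbf{u} = 0, \tag{3.12-72}$$
which is obtained in just the same way. But $\mathbf{n}P = \partial\mathbf{u}/\partial n$ on $\Gamma_N$ by (3.12-67), so the boundary
integrand vanishes and we get that $\int \nabla\mathbf{u}:(\nabla\mathbf{u})^T = 0$, which can only be satisfied by $\mathbf{u} = 0$, thus proving uniqueness (of
velocity) in Problem 1. [Uniqueness of pressure follows from $\mathbf{u} = 0 \Rightarrow \nabla P = 0 \Rightarrow P = C$,
but $C = 0$ because $P = 0$ on $\Gamma_N$ from (3.12-67).]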
Indeed, we show in Figure 3.12-1 an example of this non-uniqueness, also from
V. Girault, and one whose non-zero result does indeed satisfy (3.12-71); with $\mathbf{f} = 0$
and $\mathbf{w} = 0$, the desired Stokes solution is $\mathbf{u} = 0$, $P = 0$, which is realized by Problem 1.
But Problem 2 has, in addition to this, the solutions
$$u = a(y^2 - 1), \quad v = 0, \quad P = 2ax + b \tag{3.12-73}$$
for arbitrary values of $a$ and $b$: we find a single infinity of velocity solutions
and a double infinity of pressures! Here $\nabla\mathbf{u}:(\nabla\mathbf{u})^T = u_x^2 + u_y^2 + v_x^2 + v_y^2 = u_y^2 = (2ay)^2$,
so that $\int \nabla\mathbf{u}:(\nabla\mathbf{u})^T = 4a^2/3$ and $\int_{\Gamma_N} P\,\mathbf{n}\cdot\mathbf{u} = \int_0^1 (Pu|_{x=1} - Pu|_{x=0})\,dy = \int_0^1 [(2a + b) - b]\,a(y^2 - 1)\,dy = -4a^2/3$,
thus properly satisfying (3.12-71). [The solution of this same
problem with, in addition, $P = 0$ at just one point, as mentioned in (3.8-31), is as above
except $P = 2a(x - x_0)$; the double infinity is reduced to a single one.]
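The arithmetic of this counterexample is easy to reproduce. The sketch below is an illustration only (the function names are hypothetical, and exact rational arithmetic is a convenience choice); it evaluates the volume and boundary terms of (3.12-71) for the fields in (3.12-73) on the unit square and confirms that they cancel for any choice of a and b.

```python
from fractions import Fraction

def integrate_poly_01(coeffs):
    """Exact integral over [0, 1] of sum(coeffs[k] * y**k)."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(coeffs))

def energy_identity_terms(a, b):
    """Return (volume term, boundary term) of (3.12-71) for
    u = a*(y**2 - 1), v = 0, P = 2*a*x + b on the unit square."""
    a, b = Fraction(a), Fraction(b)
    # grad u : (grad u)^T reduces to (du/dy)^2 = 4 a^2 y^2; the x-integral gives a factor 1.
    volume = integrate_poly_01([0, 0, 4 * a * a])
    # Following the text, only x = 1 (n.u = +u) and x = 0 (n.u = -u) contribute.
    u_integral = integrate_poly_01([-a, 0, a])   # integral of a*(y^2 - 1) over 0 <= y <= 1
    boundary = (2 * a + b) * u_integral - b * u_integral
    return volume, boundary

def momentum_residual(a):
    """x-momentum residual -d2u/dy2 + dP/dx for the same fields (exact derivatives)."""
    a = Fraction(a)
    return -(2 * a) + 2 * a
```

Calling `energy_identity_terms(a, b)` with any rational `a` and `b` returns two values that sum to zero, which is the statement that this non-trivial pair (u, P) cannot be excluded by the energy argument.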
Although this single and simple counterexample is sufficient to show that Problem 2
is ill-posed by virtue of non-uniqueness, we present a second: consider 2D steady flow in
a channel of height $H$ and length $L$ under the BC's $P - \partial u/\partial x = 3$ and $v = 0$ at $x = 0$,
$\partial u/\partial x = 0$ and $v = 0$ at $x = L$ (the outlet), and $u = v = 0$ at $y = 0$ and $y = H$. This
problem has (at least) two solutions and is thus ill-posed. They are:
1. $u = v = 0$, $P = 3$;
2. $u = (3H^2/2L)\,(y/H)\,(1 - y/H)$, $v = 0$, and $P = 3(1 - x/L)$, which is Poiseuille flow,
and which also satisfies (3.12-71).
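Both candidate solutions can be checked against the unit-viscosity Stokes equations and the stated boundary conditions. The following sketch is illustrative: the helper names are hypothetical, and central differences are used only because they are exact for the quadratic velocity and linear pressure involved here.

```python
from fractions import Fraction

def poiseuille(H, L):
    """Solution 2: u = 3H^2/(2L) (y/H)(1 - y/H), P = 3(1 - x/L)."""
    H, L = Fraction(H), Fraction(L)
    u = lambda x, y: 3 * H**2 / (2 * L) * (y / H) * (1 - y / H)
    p = lambda x, y: 3 * (1 - x / L)
    return u, p

def trivial():
    """Solution 1: u = 0, P = 3."""
    return (lambda x, y: Fraction(0)), (lambda x, y: Fraction(3))

def x_momentum_residual(u, p, x, y, h=Fraction(1, 8)):
    """-laplacian(u) + dP/dx via central differences (exact for these fields)."""
    lap_u = (u(x + h, y) - 2 * u(x, y) + u(x - h, y)
             + u(x, y + h) - 2 * u(x, y) + u(x, y - h)) / h**2
    dpdx = (p(x + h, y) - p(x - h, y)) / (2 * h)
    return -lap_u + dpdx

def inlet_condition(u, p, y, h=Fraction(1, 8)):
    """Value of P - du/dx at x = 0, which the BC requires to equal 3."""
    dudx = (u(h, y) - u(-h, y)) / (2 * h)
    return p(Fraction(0), y) - dudx
```

Both `poiseuille(H, L)` and `trivial()` give a zero residual and an inlet value of 3, so the data do not select between them.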
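Fig. 3.12-1 A simple domain for the Stokes equations. [Figure not reproduced; recoverable labels: axis $y$, origin 0, corner (1,1), domain $\Omega$, boundary segments $\Gamma_D$ and $\Gamma_N$.]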
If (3.12-68) were also applied at the inlet of the Poiseuille flow channel, an even worse
redundancy obtains: $u = aH^2(1 - y/H)(y/H)/2\mu L$, $v = 0$, and $P = b - ax/L$ for all $a$
and $b$.
Problem 3.
We present two proofs that this problem is ill-posed, the first from D. Arnold and the
second from V. Girault (personal communications):
1. Since Problem 1 is well-posed and satisfies $\partial\mathbf{u}/\partial n - \mathbf{n}P = 0$ on $\Gamma_N$, with in general
$P \neq 0$ on $\Gamma_N$, it follows that Problem 3, which sets both $\partial\mathbf{u}/\partial n = 0$ and $P = 0$ on $\Gamma_N$, is
ill-posed via overspecification.
2. Assume that Problem 3 has a solution. It is straightforward, as for Problem 1, to show
that it too satisfies (3.12-70) with, again, the boundary integral omitted because here
$P = 0$ and $\partial\mathbf{u}/\partial n = 0$ on $\Gamma_N$. But we already have that the solution to (3.12-70) satisfies
(3.12-67), whose unique solution does not generally give $\partial\mathbf{u}/\partial n = 0$ and $P = 0$. Thus,
(3.12-69) is overspecified, and Problem 3 is thus ill-posed. If Problem 1 is well-posed
(and it is), then Problem 3 cannot be (and it is not).
While we do not profess to understand the many and varied 'schemes' used by many
finite difference (and finite volume) code writers, we assert that they are often solving
problems with OBC's that in the continuum are ill-posed (typically du/dn = 0). We called
these 'fuzzy' BC's in Sani and Gresho (1994) and implored/challenged the CFD numerical
analysts to try to explain them. (A partial response to this challenge, for the AD equation,
has recently been provided by Griffiths, 1997, and Renardy, 1997.)
Later, we will again address the ill-posedness of Problem 2 when we discuss the
pressure Poisson equation (PPE) for time-dependent flow, but we state the bottom line
here: the PPE cannot be solved (uniquely, at least) because it has no BC on $\Gamma_N$. Also
later [(3.13-33)], we will show ill-posedness for the discrete Stokes equations—again for
Problem 2.
A final comment on the time-dependent version of (3.12-68), in the form $\partial u_n/\partial t + V\,\partial u_n/\partial n = 0$,
which is (3.8-32): integration over $\Gamma$ and application of the constraint
$\int_\Gamma \mathbf{n}\cdot\mathbf{u} = 0$, and thus $\int_\Gamma \mathbf{n}\cdot\partial\mathbf{u}/\partial t = 0$, yields a constraint on (the up-until-now seemingly
arbitrary) advecting velocity $V$: $\int_{\Gamma_{OBC}} V(\partial u_n/\partial n) = \int_{\Gamma_D} \mathbf{n}\cdot\partial\mathbf{w}/\partial t$, where here $\Gamma = \Gamma_{OBC} + \Gamma_D$,
and $\mathbf{w}$ is the specified velocity on $\Gamma_D$. Clearly this would be difficult to satisfy
in the general case, thus posing another reason not to try it. [See, however, the end
of Section 3.8.1 for a generalization that could be made to work by reintroducing the
pressure; i.e., (3.8-34). See, too, Lee and Leone (1988), who managed to make it work
(without the pressure!) in an application involving mountain lee waves.]
3.13 THE FINITE ELEMENT EQUATIONS/DISCRETIZATION OF THE WEAK FORM
Henceforth, we will focus almost exclusively on Galerkin's method for generating the
FEM equations corresponding to either the u-P or the PPE formulations and, while
each set of weak forms discussed above leads to a different set of GFEM equations,
we shall focus on—and develop in some detail—only one, albeit one that is sufficiently
general and useful. Later, we will get even more specific and present some detailed nodal
equations, including NBC's; this time (virtually) only for one particular element, the
paradoxical $Q_1P_0 = Q_1Q_0$, an element that is simultaneously very popular in practice and
very unpopular in theory (even though lots of theory, much more than its 'fair share,' has
been devoted to it). And we shall attempt to explain the paradox. [The designation $Q_mP_n$
denotes the following polynomial approximation over a quadrilateral (2D) or hexahedral
(3D) element: the velocity is approximated by $m$-th degree polynomials in each direction
and the pressure is approximated by an $n$-th degree polynomial; details later.]
3.13.1 Detailed derivation of one u-P formulation
a. Continuum formulation
We begin by restating a particular IBVP in the weak form [(3.12-27) through (3.12-30)
with the body 'force,' a la (3.10-1), re-introduced], which we shall use to launch our
GFEM adventure:
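Find $u_\alpha \in H^1_{\alpha,E}$ and $P \in L^2$ such that
$$\int \left[\phi^{(\alpha)}\frac{Du_\alpha}{Dt} + \nu\nabla\phi^{(\alpha)}\cdot\nabla u_\alpha + \gamma\nu\,\frac{\partial\phi^{(\alpha)}}{\partial x_\beta}\frac{\partial u_\beta}{\partial x_\alpha} - P\,\frac{\partial\phi^{(\alpha)}}{\partial x_\alpha}\right] = \int \phi^{(\alpha)} g_\alpha + \int_{\Gamma_{N\alpha}} \phi^{(\alpha)} F_\alpha, \quad \alpha = 1, 2, \ldots, n_s, \tag{3.13-1}$$
and
$$\int \psi\,\frac{\partial u_\beta}{\partial x_\beta} = 0, \tag{3.13-2}$$
$\forall\phi^{(\alpha)} \in H^1_{\alpha,0}$ and $\forall\psi \in L^2$, where $H^1_{\alpha,E}$ are those once continuously differentiable
functions in $\Omega$ that take the value $U_\alpha$ on $\Gamma_{D\alpha}$, subject to the IC's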
$$\mathbf{u}(\mathbf{x}, 0) = \mathbf{u}^0(\mathbf{x}) \tag{3.13-3}$$
and the following constraints on the data:
(i) $\nabla\cdot\mathbf{u}^0 = 0$ in $\Omega$, (3.13-4)
(ii) $n_\beta u^0_\beta = n_\beta U^0_\beta$ on $\Gamma^{(n)}$, (3.13-5)
where $\Gamma^{(n)}$ is that portion of $\Gamma$ on which the normal component of velocity is specified
with the understanding that $U^0_\beta$ is taken to be zero on any portion of $\Gamma$ on which $U_\beta$ is
not specified as an essential BC; cf. (3.12-22) through (3.12-24).
Remarks:
(1) If $\Gamma_{N\alpha} = \emptyset$ $\forall\alpha$ [i.e., if $\Gamma^{(n)} = \Gamma$], or if the normal component of velocity is specified
everywhere, then the following additional constraint must also be satisfied for $t \geq 0$:
$$\int_\Gamma n_\beta U_\beta = \int_\Gamma \mathbf{n}\cdot\mathbf{U} = 0. \tag{3.13-6}$$
(2) The NBC's associated with the specified values of $F_\alpha$ in (3.13-1) are
$$\nu\,n_\beta\left(\frac{\partial u_\alpha}{\partial x_\beta} + \gamma\,\frac{\partial u_\beta}{\partial x_\alpha}\right) - P\,n_\alpha = F_\alpha \ \text{on}\ \Gamma_{N\alpha}, \quad \alpha = 1, 2, \ldots, n_s. \tag{3.13-7}$$
(3) $\gamma$ plays its usual role: that of combining two weak forms within one set of equations.
(4) This generalizes (to 3D) the presentation in (3.10-1) through (3.10-13) and that in
(3.12-12) through (3.12-24).
(5) The classical version of this weak form is, essentially: find $\mathbf{u}$ and $P$ in $\Omega$ for $t > 0$
from
$$\frac{D\mathbf{u}}{Dt} + \nabla P = \nu[\nabla^2\mathbf{u} + \gamma\,\nabla\cdot(\nabla\mathbf{u})^T] + \mathbf{g} \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0 \ \text{in}\ \Omega,$$
subject to the essential BC's given by (3.10-4) through (3.10-7), the NBC's of
(3.13-7), the IC's of (3.13-3) through (3.13-5), and the constraint (when applicable)
(3.13-6) where, of course, $\nabla\cdot(\nabla\mathbf{u})^T = \nabla(\nabla\cdot\mathbf{u}) = 0$.
(6) Noteworthy is the fact that, while we begin our search for a weak solution over all
of $H^1_{\alpha,E}$, we conclude it by finding a velocity field that is actually in a subset of
$H^1_{\alpha,E}$, i.e., $J^1_E$, the set of weakly solenoidal vector fields that satisfy the essential
BC's, thanks to the constraint (3.13-2). See also Appendix 3.
b. GFEM equations
We now move on to the approximate solution of the above IBVP by seeking a solution
in the appropriate finite dimensional subspaces of those spaces in which (ua, P) above
reside. Before beginning, we caution the reader (well, some readers) that there is some
tough sledding ahead, mostly for the following reasons, some of which we paraphrase
from Gunzburger (1989):
1. In all of our GFEM approximations associated with the scalar equations of Chapter 2,
stability and convergence were in a sense 'automatic' once the finite dimensional spaces in
which the GFEM solution was sought were established to be subspaces of the appropriate
infinite dimensional spaces in which the weak continuum solution lay. (Well—at least for
diffusion-dominated flow, since GFEM is not always as stable as one would like when
$Pe \gg 1$.) This is no longer true for NS owing to the $\nabla\cdot\mathbf{u} = 0$ constraint; it introduces
a serious set of compatibility conditions/problems between the velocity space and the
pressure space. (As if it did not already cause enough problems in the strong formulation!)
2. Thus, 'We find ourselves in the realm of what are known as mixed finite element
methods'—Gunzburger (1989); both velocity and pressure must be approximated.
Anyway, we now move on toward a finite element approximate solution of (3.13-1)
through (3.13-7). Suppose $\Omega$ has been discretized ('tessellated,' 'triangulated') via a mesh
of finite elements in which there are $N_\alpha + M_\alpha = N_t$ total velocity nodes, where $N_\alpha$
comprises those nodes in $\Omega$ and on $\Gamma_{N\alpha}$ and $M_\alpha$ are those nodes on $\Gamma_{D\alpha}$ (clearly $M_\alpha \ll N_\alpha$,
at least in 2D), and there are $N_P$ total pressure nodes (in $\Omega$ and on $\Gamma$). Associated with
each node in $N_\alpha$ is a velocity test function (in the $\alpha$-direction), $\phi_i^{(\alpha)}$, and with each node
in $N_P$ is a pressure test function, $\psi_i$; this is the finite dimensional equivalent of 'for every $\phi^{(\alpha)}$
contained in $H^1_{\alpha,0}$ and for every $\psi$ contained in $L^2$,' above. Also, as with the corresponding
scalar transport equation in the previous chapter, we expand $u_\alpha$ via a linear combination
of the same nodal functions (now called basis functions) over the $N_\alpha$ nodes where $u_\alpha$
needs to be determined, and we use the analogous $M_\alpha$ basis functions to interpolate the
Dirichlet BC's on $u_\alpha$. The pressure is also approximated as a linear combination of the
$N_P$ pressure test functions (i.e., basis functions, since we are using Galerkin's method in
which the test and basis functions are members of the same family). Thus,
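$$u_\alpha = \tilde u_\alpha + \sum_{j=1}^{N_\alpha} u_{\alpha j}\,\phi_j^{(\alpha)}, \tag{3.13-8}$$
where
$$\tilde u_\alpha = \sum_{j=1}^{M_\alpha} U_{\alpha j}\,\tilde\phi_j^{(\alpha)}, \tag{3.13-9}$$
and
$$P = \sum_{j=1}^{N_P} P_j\,\psi_j, \tag{3.13-10}$$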
where $u_{\alpha j}$ is the nodal value of $u_\alpha$ at the $j$-th node (etc.), and $\tilde u_\alpha$ approximates (interpolates,
typically, at least until we reach the important discussion in Section 3.13.1d) the specified
value $U_\alpha$ on $\Gamma_{D\alpha}$, in which $\tilde\phi_j^{(\alpha)}$ is used to denote a velocity 'basis' function on $\Gamma_{D\alpha}$, and
we repeat the
Important convention: when the spatial index, $\alpha$ or $\beta$, appears in parentheses, as $(\alpha)$
or $(\beta)$, the summation convention is not in force: no summation.
Another notational shortcut is the omission of the superscript $h$, commonly used to
emphasize the approximate solution; i.e., we 'should' use $u^h_\alpha$ instead of $u_\alpha$ in (3.13-8) and $P^h$
rather than $P$ in (3.13-10), but we shall simply let $h$ be implied. Note that, in contrast to
the presentation in Chapter 2, cf. (2.2-1) through (2.2-3), we are numbering the velocity
nodes with Dirichlet data separately from the others. Again, though, this is (probably, but
not necessarily) more for expositional convenience than coding efficiency, although it
does require the tilde on the velocity basis function, for clarity. To facilitate comprehension of
the GFEM equations to follow, we note that the substantial derivative, $Du_\alpha/Dt$ in (3.13-1),
is actually quite a bit more involved than it appears there, owing to (3.13-8) and (3.13-9).
Thus,
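$$\frac{Du_\alpha}{Dt} = \frac{\partial u_\alpha}{\partial t} + u_\beta\frac{\partial u_\alpha}{\partial x_\beta} = \frac{\partial}{\partial t}\left(\tilde u_\alpha + \sum_{j=1}^{N_\alpha} u_{\alpha j}\,\phi_j^{(\alpha)}\right) + \left(\tilde u_\beta + \sum_{k=1}^{N_\beta} u_{\beta k}\,\phi_k^{(\beta)}\right)\frac{\partial}{\partial x_\beta}\left(\tilde u_\alpha + \sum_{j=1}^{N_\alpha} u_{\alpha j}\,\phi_j^{(\alpha)}\right)$$
$$= \left(\frac{\partial\tilde u_\alpha}{\partial t} + \tilde u_\beta\frac{\partial\tilde u_\alpha}{\partial x_\beta}\right) + \sum_{j=1}^{N_\alpha}\left[\dot u_{\alpha j}\,\phi_j^{(\alpha)} + u_{\alpha j}\left(\tilde u_\beta + \sum_{k=1}^{N_\beta} u_{\beta k}\,\phi_k^{(\beta)}\right)\frac{\partial\phi_j^{(\alpha)}}{\partial x_\beta}\right] + \sum_{k=1}^{N_\beta} u_{\beta k}\,\phi_k^{(\beta)}\,\frac{\partial\tilde u_\alpha}{\partial x_\beta},$$
where $\tilde u_\alpha$, $\tilde u_\beta$ are given by (3.13-9), and we see that $Du_\alpha/Dt$ generates six parts: three
linear parts, one non-linear part, and two known parts that will be sent to the RHS. To
be absolutely sure that our summation convention is understood, or to further clarify
it (since the only free index at the end of the day is $\alpha$), we expand (in 2D) the last term
above (as an example):
$$\sum_{k=1}^{N_\beta} u_{\beta k}\,\phi_k^{(\beta)}\,\frac{\partial\tilde u_\alpha}{\partial x_\beta} = \sum_{k=1}^{N_1} u_{1k}\,\phi_k^{(1)}\,\frac{\partial\tilde u_\alpha}{\partial x_1} + \sum_{k=1}^{N_2} u_{2k}\,\phi_k^{(2)}\,\frac{\partial\tilde u_\alpha}{\partial x_2},$$
showing, in a sense, that there is indeed a 'summation' over $\beta$.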
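Inserting (3.13-8) through (3.13-10) into the finite dimensional version of (3.13-1) and
(3.13-2) gives
$$\sum_{j=1}^{N_\alpha}\left\{\dot u_{\alpha j}\int \phi_i^{(\alpha)}\phi_j^{(\alpha)} + u_{\alpha j}\left[\int \phi_i^{(\alpha)}\left(\tilde u_\beta + \sum_{k=1}^{N_\beta} u_{\beta k}\,\phi_k^{(\beta)}\right)\frac{\partial\phi_j^{(\alpha)}}{\partial x_\beta} + \nu\int \nabla\phi_i^{(\alpha)}\cdot\nabla\phi_j^{(\alpha)}\right]\right\}$$
$$+ \sum_{j=1}^{N_\beta} u_{\beta j}\left[\int \phi_i^{(\alpha)}\phi_j^{(\beta)}\,\frac{\partial\tilde u_\alpha}{\partial x_\beta} + \gamma\nu\int \frac{\partial\phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial\phi_j^{(\beta)}}{\partial x_\alpha}\right] - \sum_{j=1}^{N_P} P_j\int \psi_j\,\frac{\partial\phi_i^{(\alpha)}}{\partial x_\alpha}$$
$$= \int \phi_i^{(\alpha)} g_\alpha + \int_{\Gamma_{N\alpha}} \phi_i^{(\alpha)} F_\alpha - \int\left[\phi_i^{(\alpha)}\left(\frac{\partial\tilde u_\alpha}{\partial t} + \tilde u_\beta\frac{\partial\tilde u_\alpha}{\partial x_\beta}\right) + \nu\,\frac{\partial\phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial\tilde u_\alpha}{\partial x_\beta} + \gamma\nu\,\frac{\partial\phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial\tilde u_\beta}{\partial x_\alpha}\right],$$
$$\alpha = 1, 2, \ldots, n_s; \quad i = 1, 2, \ldots, N_\alpha, \tag{3.13-11}$$
and
$$\sum_{j=1}^{N_\beta} u_{\beta j}\int \psi_i\,\frac{\partial\phi_j^{(\beta)}}{\partial x_\beta} = -\int \psi_i\,\frac{\partial\tilde u_\beta}{\partial x_\beta}, \quad i = 1, 2, \ldots, N_P, \tag{3.13-12}$$
which have been written (as usual) so that the unknowns are on the LHS's and the RHS's
represent given forcing terms and, to repeat for emphasis and to further define/clarify,
summation over $\beta$ (in the sense defined above) but not over $\alpha$ is implied in the momentum
equations (only), and $\tilde u_\alpha$ and $\tilde u_\beta$ are a shortcut notation for the expansions (interpolations)
given in (3.13-9). In the mass conservation equation (3.13-12), 'conventional' summation
over $\beta$ on the RHS is implied because $\beta$ appears twice with no parentheses.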
Remarks:
(1) The non-linear advection term causes an awkward (and often expensive) 'triple
product' of basis functions, in the term $\int \phi_i^{(\alpha)}\phi_k^{(\beta)}\,\partial\phi_j^{(\alpha)}/\partial x_\beta$, as well as two linear
parts via the coupling with the Dirichlet BC, $\tilde u_\alpha$; whereas the triple product coupling
terms are pervasive, the two linear parts are not, since most of the terms are zero because
$\tilde u_\alpha$ 'acts' only at nodes contiguous to $\Gamma_{D\alpha}$.
(2) $\gamma = 1$ causes an awkward viscous coupling between velocity components that also
engenders additional computational expense (clearly $\gamma = 0$ generates less work).
(3) The RHS of the continuity equation (3.13-12) corresponds to given data from those
values of velocity components that are specified on $\Gamma$.
(4) While perhaps the GFEM momentum equations appear to be overly 'complex,'
they are actually quite compact when it is realized that they describe (almost) the
entire approximate solution—albeit in the form of a large system of non-linear
differential-algebraic equations (DAE's) that is not inexpensive to solve. [The part
of the solution that they do not describe is the interpolation between node points;
for this one uses (3.13-8) through (3.13-10).]
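The 'triple product' of Remark (1) is easiest to see in one dimension. The sketch below is an assumption-laden illustration, not the book's element: it uses a single two-node linear element on the reference interval [0, 1], with the same basis for the advecting and advected velocity, and builds the element tensor whose contraction with the nodal velocities gives the element advection matrix.

```python
from fractions import Fraction

# Linear basis functions on the reference element [0, 1], as coefficient lists:
# phi_0 = 1 - x, phi_1 = x; their derivatives are the constants -1 and +1.
PHI = ([Fraction(1), Fraction(-1)], [Fraction(0), Fraction(1)])
DPHI = (Fraction(-1), Fraction(1))

def integrate_product_01(p, q):
    """Exact integral over [0, 1] of the product of two polynomials."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            total += a * b / (i + j + 1)
    return total

def triple_product_tensor():
    """T[i][k][j] = integral of phi_i * phi_k * dphi_j/dx over the element."""
    return [[[integrate_product_01(PHI[i], PHI[k]) * DPHI[j] for j in range(2)]
             for k in range(2)] for i in range(2)]

def advection_matrix(u_nodes):
    """N[i][j] = sum_k u_k * T[i][k][j]; depends linearly on the nodal velocities."""
    T = triple_product_tensor()
    return [[sum(Fraction(u_nodes[k]) * T[i][k][j] for k in range(2))
             for j in range(2)] for i in range(2)]
```

The tensor is computed once per element shape, while `advection_matrix` must be re-evaluated whenever the nodal velocities change, which is the source of the expense noted in the remark.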
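An alternate (but equivalent) representation that is also popular is based on the identity
$\mathbf{a}(\mathbf{b}^T\mathbf{c}) = (\mathbf{a}\mathbf{b}^T)\mathbf{c}$, where $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are $n$-vectors, $\mathbf{b}^T\mathbf{c}$ is the vector inner product, and
$\mathbf{a}\mathbf{b}^T$ is the vector outer product ($\mathbf{b}^T\mathbf{c}$ is a scalar and $\mathbf{a}\mathbf{b}^T$ is an $n \times n$ matrix). Thus,
replacing (3.13-8) through (3.13-10) with the equivalent versions
$$u_\alpha = \tilde u_\alpha + \boldsymbol{\varphi}_{(\alpha)}^T\mathbf{u}_\alpha, \tag{3.13-13}$$
$$\tilde u_\alpha = \tilde{\boldsymbol{\varphi}}_{(\alpha)}^T\tilde{\mathbf{u}}_\alpha, \tag{3.13-14}$$
and
$$P = \boldsymbol{\psi}^T\mathbf{P}, \tag{3.13-15}$$
with the obvious identifications [e.g., $\boldsymbol{\varphi}_{(\alpha)}$ is an $N_\alpha$-vector of basis functions for the $\alpha$-direction,
$\mathbf{u}_\alpha$ is an $N_\alpha$-vector of nodal values of $u_\alpha$, etc.], leads to the following equivalent
statement of the GFEM equations, in terms of vector products:
$$\left[\int \boldsymbol{\varphi}_{(\alpha)}\boldsymbol{\varphi}_{(\alpha)}^T\right]\dot{\mathbf{u}}_\alpha + \left[\int \boldsymbol{\varphi}_{(\alpha)}\left(\tilde u_\beta + \boldsymbol{\varphi}_{(\beta)}^T\mathbf{u}_\beta\right)\frac{\partial\boldsymbol{\varphi}_{(\alpha)}^T}{\partial x_\beta} + \nu\int \frac{\partial\boldsymbol{\varphi}_{(\alpha)}}{\partial x_\beta}\frac{\partial\boldsymbol{\varphi}_{(\alpha)}^T}{\partial x_\beta}\right]\mathbf{u}_\alpha$$
$$+ \left[\int \boldsymbol{\varphi}_{(\alpha)}\boldsymbol{\varphi}_{(\beta)}^T\,\frac{\partial\tilde u_\alpha}{\partial x_\beta} + \gamma\nu\int \frac{\partial\boldsymbol{\varphi}_{(\alpha)}}{\partial x_\beta}\frac{\partial\boldsymbol{\varphi}_{(\beta)}^T}{\partial x_\alpha}\right]\mathbf{u}_\beta - \left[\int \frac{\partial\boldsymbol{\varphi}_{(\alpha)}}{\partial x_\alpha}\,\boldsymbol{\psi}^T\right]\mathbf{P}$$
$$= \int \boldsymbol{\varphi}_{(\alpha)}\,g_\alpha + \int_{\Gamma_{N\alpha}} \boldsymbol{\varphi}_{(\alpha)}\,F_\alpha - \int\left[\boldsymbol{\varphi}_{(\alpha)}\left(\frac{\partial\tilde u_\alpha}{\partial t} + \tilde u_\beta\frac{\partial\tilde u_\alpha}{\partial x_\beta}\right) + \nu\,\frac{\partial\boldsymbol{\varphi}_{(\alpha)}}{\partial x_\beta}\frac{\partial\tilde u_\alpha}{\partial x_\beta} + \gamma\nu\,\frac{\partial\boldsymbol{\varphi}_{(\alpha)}}{\partial x_\beta}\frac{\partial\tilde u_\beta}{\partial x_\alpha}\right],$$
$$\alpha = 1, 2, \ldots, n_s, \tag{3.13-16}$$
and
$$-\left[\int \boldsymbol{\psi}\,\frac{\partial\boldsymbol{\varphi}_{(\beta)}^T}{\partial x_\beta}\right]\mathbf{u}_\beta = \int \boldsymbol{\psi}\,\frac{\partial\tilde u_\beta}{\partial x_\beta}, \tag{3.13-17}$$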
where we have changed the sign of the continuity equation to recover the (skew) symmetry
between div and grad ['grad$^T$ = $-$div'; see, for example, Strang (1986)], a change that
can (should) also be made in (3.13-12). As before, summation over $\beta$ but not over $\alpha$ is
implied, and $\tilde u_\alpha$, $\tilde u_\beta$ are given by (3.13-14).
c. Matrix-vector representation
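We can now introduce the global matrices from either (3.13-11) and (3.13-12) or (3.13-16)
and (3.13-17); i.e., these equations can be written in yet another equivalent form that
may be more amenable to interpretation as finite element equations vis-a-vis Galerkin
equations, even though it requires that we specify the spatial dimensionality of the
problem, $n_s$, which we take for now as 2:
$$\begin{bmatrix} M_1 & 0 & 0 \\ 0 & M_2 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} \dot{\mathbf{u}}_1 \\ \dot{\mathbf{u}}_2 \\ \dot{\mathbf{P}} \end{bmatrix} + \begin{bmatrix} A_1 + B_{11} + N_1(\mathbf{u}) + (1+\gamma)K_1 & B_{12} + \gamma K_{12} & C_1 \\ B_{21} + \gamma K_{12}^T & A_2 + B_{22} + N_2(\mathbf{u}) + (1+\gamma)K_2 & C_2 \\ C_1^T & C_2^T & 0 \end{bmatrix}\begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \\ \mathbf{P} \end{bmatrix} = \begin{bmatrix} \mathbf{f}_1 \\ \mathbf{f}_2 \\ \mathbf{g} \end{bmatrix}, \tag{3.13-18}$$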
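where $\mathbf{u}_1$ is an $N_1$-vector of nodal velocities in the $x_1$-direction (etc.), and the matrix
definitions are:
$$M_\alpha = \int \boldsymbol{\varphi}_{(\alpha)}\boldsymbol{\varphi}_{(\alpha)}^T \quad\text{or}\quad M_{\alpha ij} = \int \phi_i^{(\alpha)}\phi_j^{(\alpha)}, \tag{3.13-19}$$
$$A_\alpha = \int \boldsymbol{\varphi}_{(\alpha)}\left(\tilde u_1\frac{\partial\boldsymbol{\varphi}_{(\alpha)}^T}{\partial x_1} + \tilde u_2\frac{\partial\boldsymbol{\varphi}_{(\alpha)}^T}{\partial x_2}\right) \quad\text{or}\quad A_{\alpha ij} = \int \phi_i^{(\alpha)}\,\tilde u_\beta\,\frac{\partial\phi_j^{(\alpha)}}{\partial x_\beta}, \tag{3.13-20}$$
$$N_\alpha(\mathbf{u}) = \int \boldsymbol{\varphi}_{(\alpha)}\left(\boldsymbol{\varphi}_{(1)}^T\mathbf{u}_1\frac{\partial\boldsymbol{\varphi}_{(\alpha)}^T}{\partial x_1} + \boldsymbol{\varphi}_{(2)}^T\mathbf{u}_2\frac{\partial\boldsymbol{\varphi}_{(\alpha)}^T}{\partial x_2}\right) \quad\text{or}$$
$$N_{\alpha ij}(\mathbf{u}) = \sum_{k=1}^{N_1} u_{1k}\int \phi_i^{(\alpha)}\phi_k^{(1)}\frac{\partial\phi_j^{(\alpha)}}{\partial x_1} + \sum_{k=1}^{N_2} u_{2k}\int \phi_i^{(\alpha)}\phi_k^{(2)}\frac{\partial\phi_j^{(\alpha)}}{\partial x_2}, \tag{3.13-21}$$
$$K_\alpha = \nu\int \frac{\partial\boldsymbol{\varphi}_{(\alpha)}}{\partial x_\beta}\frac{\partial\boldsymbol{\varphi}_{(\alpha)}^T}{\partial x_\beta} \quad\text{or}\quad K_{\alpha ij} = \nu\int \nabla\phi_i^{(\alpha)}\cdot\nabla\phi_j^{(\alpha)}, \tag{3.13-22}$$
$$B_{\alpha\beta} = \int \boldsymbol{\varphi}_{(\alpha)}\boldsymbol{\varphi}_{(\beta)}^T\,\frac{\partial\tilde u_\alpha}{\partial x_\beta} \quad\text{or}\quad B_{\alpha\beta ij} = \int \phi_i^{(\alpha)}\phi_j^{(\beta)}\,\frac{\partial\tilde u_\alpha}{\partial x_\beta}, \tag{3.13-23}$$
$$K_{\alpha\beta} = \nu\int \frac{\partial\boldsymbol{\varphi}_{(\alpha)}}{\partial x_\beta}\frac{\partial\boldsymbol{\varphi}_{(\beta)}^T}{\partial x_\alpha}, \ \text{with}\ (K_{\alpha\beta})^T = K_{\beta\alpha}, \quad\text{or}\quad K_{\alpha\beta ij} = \nu\int \frac{\partial\phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial\phi_j^{(\beta)}}{\partial x_\alpha}, \tag{3.13-24}$$
$$C_\alpha = -\int \frac{\partial\boldsymbol{\varphi}_{(\alpha)}}{\partial x_\alpha}\,\boldsymbol{\psi}^T \quad\text{or}\quad C_{\alpha ij} = -\int \frac{\partial\phi_i^{(\alpha)}}{\partial x_\alpha}\,\psi_j, \tag{3.13-25}$$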
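with dimensions as follows:
$M_\alpha$: $(N_\alpha \times N_\alpha)$,
$A_\alpha$: $(N_\alpha \times N_\alpha)$,
$K_\alpha$: $(N_\alpha \times N_\alpha)$,
$B_{\alpha\beta}$: $(N_\alpha \times N_\beta)$,
$K_{\alpha\beta}$: $(N_\alpha \times N_\beta)$,
$C_\alpha$: $(N_\alpha \times N_P)$.
The RHS $N_\alpha$-vector is
$$\mathbf{f}_\alpha = \int \boldsymbol{\varphi}_{(\alpha)}\,g_\alpha + \int_{\Gamma_{N\alpha}} \boldsymbol{\varphi}_{(\alpha)}\,F_\alpha - \int\left[\boldsymbol{\varphi}_{(\alpha)}\left(\frac{\partial\tilde u_\alpha}{\partial t} + \tilde u_\beta\frac{\partial\tilde u_\alpha}{\partial x_\beta}\right) + \nu\,\frac{\partial\boldsymbol{\varphi}_{(\alpha)}}{\partial x_\beta}\frac{\partial\tilde u_\alpha}{\partial x_\beta} + \gamma\nu\,\frac{\partial\boldsymbol{\varphi}_{(\alpha)}}{\partial x_\beta}\frac{\partial\tilde u_\beta}{\partial x_\alpha}\right]$$
or
$$f_{\alpha i} = \int \phi_i^{(\alpha)} g_\alpha + \int_{\Gamma_{N\alpha}} \phi_i^{(\alpha)} F_\alpha - \int\left[\phi_i^{(\alpha)}\left(\frac{\partial\tilde u_\alpha}{\partial t} + \tilde u_\beta\frac{\partial\tilde u_\alpha}{\partial x_\beta}\right) + \nu\,\frac{\partial\phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial\tilde u_\alpha}{\partial x_\beta} + \gamma\nu\,\frac{\partial\phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial\tilde u_\beta}{\partial x_\alpha}\right] \tag{3.13-26}$$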
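and represents the forcing caused by, respectively: the body force term, the natural
(traction) boundary conditions, and the essential (Dirichlet) BC's. Finally, the RHS $N_P$-vector is
$$\mathbf{g} = \int \boldsymbol{\psi}\,\frac{\partial\tilde u_\beta}{\partial x_\beta} \quad\text{or}\quad g_i = \int \psi_i\,\frac{\partial\tilde u_\beta}{\partial x_\beta}, \tag{3.13-27}$$
which, of course, should not be confused with the acceleration $N_\alpha$-vector, $g_\alpha$, in (3.13-26).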
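To make the definitions of the mass and viscous matrices concrete, here is a minimal one-dimensional sketch. Its assumptions are not from the text: a uniform mesh of two-node linear elements, all nodes treated as unknowns (no separate Dirichlet numbering), and the standard element matrices for this element; `assemble_1d` is a hypothetical helper name. The global matrices are built by the usual element-by-element summation.

```python
from fractions import Fraction

def assemble_1d(n_elements, length=1, nu=1):
    """Assemble M_ij = int(phi_i phi_j) and K_ij = nu * int(phi_i' phi_j')
    for linear elements on a uniform 1D mesh (all nodes unknown)."""
    h = Fraction(length) / n_elements
    nu = Fraction(nu)
    n_nodes = n_elements + 1
    M = [[Fraction(0)] * n_nodes for _ in range(n_nodes)]
    K = [[Fraction(0)] * n_nodes for _ in range(n_nodes)]
    # Element-level contributions for a two-node linear element of length h.
    m_e = [[h / 3, h / 6], [h / 6, h / 3]]
    k_e = [[nu / h, -nu / h], [-nu / h, nu / h]]
    for e in range(n_elements):            # element e joins nodes e and e + 1
        for a in range(2):
            for b in range(2):
                M[e + a][e + b] += m_e[a][b]
                K[e + a][e + b] += k_e[a][b]
    return M, K
```

Two quick sanity checks follow from the basis functions summing to one: the entries of M add up to the domain length, and every row of K sums to zero.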
Remarks:
(1) The LHS of the continuity equations, $C_1^T\mathbf{u}_1 + C_2^T\mathbf{u}_2$, when examined at a pressure
'node' at or near $\Gamma_D$, will look somewhat strange (incomplete); this is because part
of '$\nabla\cdot\mathbf{u} = 0$' is on the RHS, in $\mathbf{g}$, where it is very important to realize that the
$N_P$-vector $\mathbf{g}$ lives only on the boundary and is thus quite sparse. (Otherwise, we
would be dealing with $\nabla\cdot\mathbf{u} = S$, where $S$ represents mass sources/sinks in $\Omega$.)
(2) The coefficient matrix of $(\dot{\mathbf{u}}_1^T\ \dot{\mathbf{u}}_2^T\ \dot{\mathbf{P}}^T)^T$ is singular; this is called the time-singular
representation (the coefficient of $\dot{\mathbf{P}}$ is zero; see for example, Campbell, 1980) and
emphasizes the point that we are dealing with DAE's and not simply ODE's. More
on this later.
(3) In evaluating the boundary integrals in (3.13-26), it is sometimes better (but not
necessary), when $\alpha$ corresponds to a normal direction on $\Gamma_{N\alpha}$, to expand $F_\alpha$ into the
pressure basis functions rather than those for velocity (because $P$ usually plays a
larger role than the normal viscous stress in the NBC force balance), unless $F_\alpha$ is
given analytically and numerical quadrature is adopted.
(4) A simple, but not cost-effective, way to solve the Stokes equations is to take $\nu$ very
very large, say $10^6$ or $10^8$ in the viscous matrix; the advection terms, but not $\partial\mathbf{u}/\partial t$
and $\nabla P$, will 'automatically' shrink in importance. [This presumes, of course, that all
other 'characteristic quantities,' such as length scale and velocity scale, are $O(1)$.]
There is one more level of 'condensation' that will be of much use in the sequel;
i.e., even though (3.13-18) through (3.13-27) carefully and fully define the DAE's that
need to be solved—and are in the form that is appropriate for their construction via the
element-level matrix and vector contributions, and code writing—this representation is
still too cumbersome for purposes of further discussion. Hence, we introduce (nearly) the
most compact matrix-vector representation of the DAE's:
$$M\dot{\mathbf{u}} + [K + N(\mathbf{u})]\mathbf{u} + C\mathbf{P} = \mathbf{f} \tag{3.13-28}$$
$$C^T\mathbf{u} = \mathbf{g}, \tag{3.13-29}$$
where the partitioned matrices are defined as follows:
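$$M = \begin{bmatrix} M_1 & 0 \\ 0 & M_2 \end{bmatrix}, \quad K = \begin{bmatrix} A_1 + B_{11} + (1+\gamma)K_1 & B_{12} + \gamma K_{12} \\ B_{21} + \gamma K_{12}^T & A_2 + B_{22} + (1+\gamma)K_2 \end{bmatrix}, \quad N(\mathbf{u}) = \begin{bmatrix} N_1(\mathbf{u}) & 0 \\ 0 & N_2(\mathbf{u}) \end{bmatrix},$$
$$C = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}, \quad C^T = \begin{bmatrix} C_1^T & C_2^T \end{bmatrix}, \quad \mathbf{u} = \begin{bmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{bmatrix}, \quad \mathbf{f} = \begin{bmatrix} \mathbf{f}_1 \\ \mathbf{f}_2 \end{bmatrix},$$
and $\mathbf{g}$ is unchanged. $M$, $K$, and $N(\mathbf{u})$ are of dimension $(N_1 + N_2) \times (N_1 + N_2)$, and
$C$ is of dimension $(N_1 + N_2) \times N_P$. The vectors $\mathbf{u}$, $\mathbf{f}$, and $\mathbf{g}$ are of length $(N_1 + N_2)$,
$(N_1 + N_2)$, and $N_P$, respectively. [Hopefully, $N_1(\mathbf{u})$ for the advection matrix will not be
confused with the number of unknown velocities in the $x_1$-direction, $N_1$; etc.]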
The names sometimes associated with these matrices, while not always 'accurate,' are
these: $M$ is the mass matrix, $K$ is the viscous or diffusion matrix (ignoring, 'conveniently,'
the $B$ portion), $N(\mathbf{u})$ is the non-linear advection matrix (or just advection matrix), and
$C$ is the coupling matrix (it couples $\mathbf{u}$ and $P$), or the constraint matrix (it constrains
$\mathbf{u}$, via the Lagrange multiplier, $P$, to be discretely divergence-free) or, finally, it is the
compromise matrix (a 'compromise' between div and grad). Actually, of course, $M^{-1}$ (or
perhaps $M_L^{-1}$, the inverse of the lumped mass matrix) should first multiply each matrix
and only then should it be 'named'; e.g., $M^{-1}C$ is the (weak) gradient operator, etc.
Also noteworthy is that $C^T$ is the negative of the divergence operator, and it follows that
('grad')$^T$ = $-$'div'; see Strang (1986).
Digression
A small digression related to $C$ and $C^T$ may be useful, beginning with an outline of
their shapes, noting that typically $N_\alpha > N_p$, and always that $N_1 + N_2 > N_p$ (2D) and
$N_1 + N_2 + N_3 > N_p$ (3D). Thus, schematically we have, with $n > m$, that $C$ is $n \times m$
(tall) and $C^T$ is $m \times n$ (wide).
There are $n = N_1 + N_2$ or $N_1 + N_2 + N_3$ velocity equations and $m = N_p$ constraint
equations among these $n$ velocities; and we next note that, if $r$ is the rank of $C$ (and $C^T$),
the dimensions of the respective null spaces are: $\dim N(C) = m - r$ and $\dim N(C^T) =
n - r$. If $C$ is of full rank, then $r = m$ (all constraint equations are independent), and we
have that $\dim N(C) = 0$ and $\dim N(C^T) = n - m$; this is the common case (no so-called
'pressure modes,' spurious or otherwise, which we shall soon carefully explain): the
divergence matrix has a large null space (the field of discretely divergence-free vectors),
and the gradient matrix has no null space. Pressure modes, when present ($r < m$), increase
the null space dimension of both $C$ and $C^T$; each pressure mode (an $m$-vector in the null
space of $C$) reduces by one the number of linearly independent constraints and increases
by one the number of divergence-free vectors. It is also noteworthy that of the $n$ total
momentum equations, the $m$ constraints among the velocities leave only $n - m$ 'effective'
momentum equations. Finally, it is important to point out that $C$ is a 'grad' except at
nodes on $\Gamma$ at which a natural BC is employed in the normal direction (which BC 'acts
like' a Dirichlet BC for pressure; details later) and that $C^T$ is a 'div' except at nodes on $\Gamma$
at which a Dirichlet BC is employed for either velocity component. (Actually, because of
the sign change, $C^T$ is a convergence matrix. Actually, $Q^{-1}C^T$ or perhaps $Q_L^{-1}C^T$ is the
true convergence matrix, where $Q$ is the pressure mass matrix: $Q_{ij} = \int \psi_i \psi_j$. Actually,
we, like others, will often maintain the sloppy-but-convenient terminology that calls $C^T$
a 'div.')
End digression
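The rank and null-space bookkeeping in the digression can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the two small matrices are hypothetical stand-ins for $C$ (they are not assembled from any finite element mesh), chosen so that one has full column rank and the other contains a constant 'pressure mode'.

```python
import numpy as np


def null_space_dims(C, tol=1e-10):
    """Return (r, dim N(C), dim N(C^T)) for an n x m matrix C."""
    n, m = C.shape
    r = int(np.linalg.matrix_rank(C, tol=tol))
    return r, m - r, n - r


# Hypothetical 'gradient-like' matrix: n = 5 velocity rows, m = 3 pressure columns.
C_full = np.array([[-1.0,  0.0,  0.0],
                   [ 1.0, -1.0,  0.0],
                   [ 0.0,  1.0, -1.0],
                   [ 0.0,  0.0,  1.0],
                   [ 0.0,  0.0,  0.0]])

# Hypothetical rank-deficient variant: every row sums to zero, so the
# constant vector (1, 1, 1) lies in N(C) and plays the role of a pressure mode.
C_mode = np.array([[ 1.0, -1.0,  0.0],
                   [ 0.0,  1.0, -1.0],
                   [-1.0,  0.0,  1.0],
                   [ 1.0,  0.0, -1.0],
                   [ 0.0,  0.0,  0.0]])

for name, C in (("full rank", C_full), ("one pressure mode", C_mode)):
    r, dim_NC, dim_NCT = null_space_dims(C)
    print(f"{name}: r = {r}, dim N(C) = {dim_NC}, dim N(C^T) = {dim_NCT}")
```

In this toy case the single mode raises both null-space dimensions by one, which is the pattern described above.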
The DAE's in (3.13-28) and (3.13-29) can be 'solved' (integrated forward in time) only
after appropriate (well-posed) IC's are stated; from (3.13-3) through (3.13-5), these are:
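$$u(0) = u_0 \quad \text{with} \quad C^T u_0 = g(0) = g_0,$$ (3.13-30)
and, in addition, when $n \cdot u$ is specified on all of $\Gamma$, the constraint
$$\sum_{i=1}^{N_p} g_i(t) = 0 \quad \text{for } t \ge 0,$$ (3.13-31)
the discrete analog of (3.13-6) that requires global mass conservation from the specified
normal velocity. One way to prove this is to sum (3.13-12) over $i$, using (3.13-27), and
to realize/utilize that $\sum_{i=1}^{N_p} \psi_i = 1$; the resulting sum, (3.13-32), involves the velocity
basis functions only through integrals of the form $\int \partial\varphi_j^{(\beta)}/\partial x_\beta$.
But $\int \partial\varphi_j^{(\beta)}/\partial x_\beta = \oint_\Gamma n_\beta \varphi_j^{(\beta)}$ from Green's theorem, so the LHS becomes
$\sum_j \int_\Gamma n_\beta u_{\beta j} \varphi_j^{(\beta)} = \int_\Gamma n \cdot u^h = 0$ because of (3.13-6).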
Remarks:
(1) If (3.13-30), or (3.13-31) when applicable, are violated, then the DAE's are ill-
posed and no solution exists. This is, of course, the discrete counterpart of (3.10-11)
through (3.10-13).
(2) If the steady NS equations are being addressed, then $\dot{u}$ is set to zero in (3.13-28),
and one is faced with solving a non-linear algebraic system for $u$ and $P$. In this case
the only solvability constraint is (3.13-31), and that only when $n \cdot u$ is specified on
all of $\Gamma$.
(3) Additional (extraneous and spurious) solvability constraints, for transient or steady
flow, enter when certain 'elements' (combinations of $\{\varphi_i\}$ and $\{\psi_i\}$) are employed,
a situation that will be addressed in more detail below when we discuss 'pressure
modes.'
(4) The solution of these DAE's will satisfy the BC's given in (3.12-28) — which
actually encompasses those given in (3.10-4) through (3.10-7) and in Figure 3.8-2. The
formulation of the GFEM DAE's for any other of the permissible BC's discussed
in Section 3.8.1, some of which require the generation and use of different weak
formulations, and in Section 3.12.2, should now be fairly straightforward.
(5) As already mentioned once, the matrix $C$ (or $M^{-1}C$) corresponds to 'grad' in $\Omega$ but
not on all of $\Gamma$; only on $\Gamma_D$. It corresponds to a pressure force on $\Gamma_N$; the important
details behind this remark will be presented later.
(6) Another way to derive (3.13-31) is simply to sum each of the Np discrete mass
conservation equations of (3.13-29); all internal nodes will 'cancel out', with the
result that the summations on the LHS will give zero.
(7) The vector g is always 'generated' by inhomogeneous Dirichlet BC's and, because
we preclude 'volumetric' sources (sinks) of mass, it contains mostly zeros; for
homogeneous Dirichlet BC's, $C^T u = 0$, which describes contained flow within stationary
boundaries. For most practical problems of interest, 'significant' non-zero values
in g will be generated by Dirichlet BC's in the normal direction. For tangentially
specified velocity, g will be either 'small' or zero—the latter for constant tangential
velocity and a uniform mesh. More details on this 'boundary' vector will be presented
in Section 3.13.51—and in the two examples presented in Section 3.13.2b.
(8) The matrix-vector notation clearly implies a particular ordering and arrangement of
the discrete equations—for 'talking purposes' only, not necessarily for code writing.
d. Ill-posed equations
First we point out that if the DAE's are ill-posed because the constraint in (3.13-30) is not
satisfied, a nearby, discretely divergence-free velocity field can be obtained by performing
the discrete version of the $L_2$-projection previously discussed for the continuum case,
(3.10-14) through (3.10-20), as follows: (i) define $v = u_0 - M^{-1}C\lambda$, where the $N$-vector
$v$ and the $M$-vector $\lambda$ are to be found (see too Appendix 3) such that (ii) $C^T v = g_0$. This
projection is realized via the two-step procedure: (1) solve $(C^T M^{-1} C)\lambda = C^T u_0 - g_0$
for $\lambda$, and (2) compute $v = u_0 - M^{-1}C\lambda$, the adjusted and mass-consistent velocity that
replaces $u_0$ as the IC. This 'mass adjustment'/projection is, of course, most conveniently
performed (when 'legitimate') after lumping the mass ($M \to M_L$); otherwise, the fully
coupled system given by (i) and (ii) needs to be solved, and only the consistent mass
matrix (CM) gives a true $L_2$-projection.
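The two-step mass adjustment can be sketched in a few lines. This is an illustrative sketch only, assuming NumPy, a lumped (diagonal) mass matrix stored as a vector, and a full-rank $C$; the function name and the small test data are hypothetical and do not come from any mesh.

```python
import numpy as np


def project_to_mass_consistent(u0, C, g0, m_lumped):
    """Return (v, lam) with v = u0 - M_L^{-1} C lam and C^T v = g0."""
    minv_C = C / m_lumped[:, None]                 # M_L^{-1} C (row scaling)
    # Step (1): solve (C^T M_L^{-1} C) lam = C^T u0 - g0.
    lam = np.linalg.solve(C.T @ minv_C, C.T @ u0 - g0)
    # Step (2): adjust the velocity.
    v = u0 - minv_C @ lam
    return v, lam


# Hypothetical data: 5 velocity unknowns, 3 constraints, full-rank C.
C = np.array([[-1.0,  0.0,  0.0],
              [ 1.0, -1.0,  0.0],
              [ 0.0,  1.0, -1.0],
              [ 0.0,  0.0,  1.0],
              [ 0.0,  0.0,  0.0]])
m_lumped = np.array([0.5, 1.0, 1.0, 1.0, 0.5])
u0 = np.array([1.0, -2.0, 0.5, 3.0, 0.0])          # violates C^T u0 = g0
g0 = np.zeros(3)

v, lam = project_to_mass_consistent(u0, C, g0, m_lumped)
print("constraint residual before:", np.abs(C.T @ u0 - g0).max())
print("constraint residual after: ", np.abs(C.T @ v - g0).max())
```

With a consistent (non-diagonal) mass matrix the row scaling would be replaced by a linear solve with $M$, which is the costlier coupled case mentioned above.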
If violation of (3.13-31) at $t = 0$ is the cause of the ill-posedness, which will often
occur if the normal velocity is specified on all of $\Gamma$ and interpolation is employed as
in (3.15-9), and will be demonstrated later, then the above technique will not work. In
that case, it is the applied BC rather than the IC that is the cause of the ill-posedness.
Glowinski (1984) shows one way to fix this problem, and we, at the end of this section
(Section 3.13.1g), show another; both involve changing the BC to recover discrete, global
mass conservation.
Next, we note that the (skew) symmetry between div and grad (i.e., $C$ in the momentum
equation and $C^T$ in the continuity equation) is, of course, a consequence of integrating
$\int \varphi \nabla P$ by parts to generate the weak form. Suppose we do not do so? This question
is of more than just academic interest because, recalling the discussion of OBC's in
Sections 3.8.1 and 3.12.5, it has the potential of removing $P$ from the (normal) OBC, as
we show next.
To focus in on this issue as efficiently as possible, we consider the simplest relevant
case: steady Stokes flow, and we take $\gamma = 0$ and omit the body force term. Thus, we return
to (3.13-11) and set $\dot{u}_{\alpha j} = 0$, omit advection, and re-integrate the term $\int \psi_j\,(\partial\varphi_i/\partial x_\alpha)$
by parts, to obtain
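$$\sum_{j=1}^{N_p} \left( \int \varphi_i^{(\alpha)} \frac{\partial \psi_j}{\partial x_\alpha} \right) P_j + \nu \sum_{j=1}^{N_\alpha} \left( \int \nabla\varphi_i^{(\alpha)} \cdot \nabla\varphi_j^{(\alpha)} \right) u_{\alpha j}
= \int_{\Gamma_N} \varphi_i^{(\alpha)} \tilde{F}_\alpha - \nu \int \nabla\varphi_i^{(\alpha)} \cdot \nabla w_\alpha;$$
$$\alpha = 1, 2, \ldots, n_{sd}; \quad i = 1, 2, \ldots, N_\alpha,$$ (3.13-33)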
and (3.13-12). The NBC, of course, is now different, as acknowledged by the tilde
over $F_\alpha$; it is
$$\tilde{F}_\alpha = \nu n_\beta \frac{\partial u_\alpha}{\partial x_\beta} = \nu \frac{\partial u_\alpha}{\partial n},$$ (3.13-34)
and the hope for a better NBC for use as an OBC, $\partial u_\alpha/\partial n = 0$, is apparent. Before dashing
this hope, let us write the matrix-vector form of this result:
$$Ku + GP = f,$$ (3.13-35)
$$C^T u = g,$$ (3.13-36)
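where the new gradient matrix, $G$, is given by (in 2D)
$$G_{ij} = \begin{bmatrix} \int \varphi_i^{(1)}\, \partial\psi_j/\partial x_1 \\ \int \varphi_i^{(2)}\, \partial\psi_j/\partial x_2 \end{bmatrix}$$ (3.13-37)
and the following remarks are relevant: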
1. The lowest-order discontinuous pressure approximations, $P_1P_0$ and $Q_1P_0$, later called
$Q_1Q_0$, are precluded; pressure must be at least linear (this element 'terminology' is
explained in Section 3.13.2a).
2. Div and grad are no longer (skew) symmetric.
3. $G$ is always a 'grad', even on boundaries with NBC's, such as outflow boundaries.
4. Since $G$ is always a grad, the hydrostatic pressure vector, $P_H = (1, \ldots, 1)^T$, is always in
its null space; i.e., an eigenvector of the matrix $\begin{bmatrix} K & G \\ C^T & 0 \end{bmatrix}$ is the vector $\begin{pmatrix} 0 \\ P_H \end{pmatrix}$, and the
corresponding eigenvalue is zero. This puts the following new solvability constraint (proven
below, in Section 3.13.2b) on the system (3.13-35) and (3.13-36): $u^T f + P^T g = 0$, where
$\begin{pmatrix} u \\ P \end{pmatrix}$ is the null vector of $\begin{bmatrix} K & C \\ G^T & 0 \end{bmatrix}$; i.e., the data ($b$ in $Ax = b$) must be orthogonal to the null
vector of the transposed system ($z^T b = 0$, where $A^T z = 0$). (See Section 3.13.2b if the
above solvability condition is not sufficiently clear.) But $P \ne P_H$ and $u \ne 0$ in general,
and we see that the loss of symmetry associated with the ostensibly legitimate notion of
not 'integrating the pressure gradient around by parts' could lead to significant difficulty
and may even be fatal. A final damning feature of this notion, which we further explore in
Section 3.13.2b below, is this: the associated/implied PPE (for the time-dependent case,
in general) has no BC on $\Gamma_N$, with the result that the pressure is underdetermined. [This
also applies to steady Stokes: the equation $(C^T K^{-1} G)P = C^T K^{-1} f - g$ that is implied
by (3.13-35) and (3.13-36) would be found to be lacking in BC's on $\Gamma_N$; also, directly
related to the lack of a pressure BC is the fact that $(C^T K^{-1} G)$ is singular, with null vector
$P_H$, but $G^T P_H \ne 0$.] These observations are probably related to some of the difficulties
experienced by some FDM codes when $G$ is a grad and $G^T \ne C$.
5. Finally, recall that in Section 3.12.5, this formulation was shown to be ill-posed
(underdetermined) in the continuous case.
[Exercise for the reader: Consider integrating the continuity equation by parts, $\int \psi \nabla \cdot u = 0 = \int_\Gamma \psi\, u \cdot n - \int u \cdot \nabla\psi$.
Discuss known and possible consequences:
1. $\nabla P$ integrated by parts.
2. $\nabla P$ not integrated by parts.]
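The solvability condition quoted in remark 4 ($z^T b = 0$, where $A^T z = 0$) is a general fact about singular linear systems and can be illustrated on a tiny example. The sketch below assumes NumPy and a one-dimensional left null space; the 2 x 2 matrix is hypothetical and deliberately non-symmetric, so that its right and left null vectors differ, loosely mirroring the loss of div-grad symmetry.

```python
import numpy as np


def left_null_vector(A):
    """Unit vector z with A^T z = 0, assuming a one-dimensional left null space."""
    U, s, Vt = np.linalg.svd(A)
    return U[:, -1]          # left singular vector of the smallest singular value


def is_solvable(A, b, tol=1e-10):
    """Check the compatibility condition z^T b = 0 for the singular system A x = b."""
    z = left_null_vector(A)
    return bool(abs(z @ b) <= tol * max(1.0, np.linalg.norm(b)))


A = np.array([[1.0, 1.0],
              [2.0, 2.0]])            # singular: right null vector (1, -1), left (2, -1)
b_good = np.array([1.0, 2.0])         # orthogonal to (2, -1): compatible data
b_bad = np.array([1.0, 0.0])          # not orthogonal: no solution exists

print("b_good solvable:", is_solvable(A, b_good))
print("b_bad  solvable:", is_solvable(A, b_bad))
```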
e. Normal and tangential BC's
The full generality of the finite element method requires that we be able to apply any
of the legitimate BC's to domains of any shape. This can lead to some awkward cases
if we 'stay the course' with isoparametric mappings and cartesian velocities. As first
pointed out by Engelman et al. (1982a) for the incompressible NS equations, there are
situations in which the cartesian directions are not appropriate for applying BC's; a (local)
coordinate transformation (rotation) to normal and tangential direction(s, in 3D) is required.
[Earlier, Pinder and Gray (1977) performed a similar function, 'a rational method,' for the
shallow water equations.] Examples: (i) at a free surface, we often require $n \cdot u = 0$ and
$\tau \cdot u$ 'free' ($\tau$ is the unit tangent vector, typically oriented so that the domain is on your
left); (ii) in a turbulent boundary layer in which so-called 'wall functions' ['numerical
graffiti', F. Habashi, personal communication (to a large audience!)] are employed (see
the chapter on 'Turbulent Flow' in Volume II), the BC's are $n \cdot u = 0$ and specified
shear stress; (iii) an inlet region wherein parallel flow ($\tau \cdot u = 0$) and a normal force
BC is desired; (iv) both normal and tangential tractions are specified; and (v) a problem
stated in polar coordinates. The best way to implement such BC's is via local rotation
(at each node needing it) so that the momentum equation is expressed in normal and
tangential coordinates. Engelman et al. (1982a) showed how to perform this rotation and,
importantly, how to properly (and uniquely) define the normal direction. We summarize
(and paraphrase) their key results here, first in 2D, using Figure 3.13-1, in which $t = \tau$.
The general technique involves both a rotation of the momentum equations at node $i$
from cartesian to $(n, \tau)$ and a change of variables from $(u, v)$ to $(u_n, u_\tau)$, the latter using
$$u_n = n \cdot u = n_x u + n_y v,$$ (3.13-38)
$$u_\tau = \tau \cdot u = \tau_x u + \tau_y v,$$ (3.13-39)
or, in terms of global discrete velocity vectors,
$$u_R = R u,$$ (3.13-40)
where $u^T = (\cdots u_i \cdots v_i \cdots)$, $u_R^T = (\cdots u_{ni} \cdots u_{\tau i} \cdots)$ is the rotated velocity vector (at
node $i$), and $R$ is the (orthogonal, $R^{-1} = R^T$) rotation matrix; i.e., a matrix with ones
on the diagonal and zeros elsewhere except for the four entries that transform $u_i$ to $u_{ni}$
and $v_i$ to $u_{\tau i}$; i.e., the transformation puts $u_n$ into $u$ and $u_\tau$ into $v$ at node $i$ in the global
arrays. The exact location of these entries in $R$ depends on the global node and equation
numbering schemes used, but in general we can call them $j$ and $k$; i.e., $u_i$ (and thus $u_{ni}$)
is at location $j$ in the global $n$-vector, and $v_i$ (and thus $u_{\tau i}$) is at location $k$.
[Fig. 3.13-1 Unit normal and tangent vectors; boundary-node labels: ($u_\tau$ specified, $u_n$ free) and ($u_\tau$ free, $u_n$ specified).]
The rotation matrix thus looks like:
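$$R = \begin{bmatrix}
1 & & & & & & \\
 & \ddots & & & & & \\
 & & n_x^{(i)} & \cdots & n_y^{(i)} & & \\
 & & \vdots & \ddots & \vdots & & \\
 & & \tau_x^{(i)} & \cdots & \tau_y^{(i)} & & \\
 & & & & & \ddots & \\
 & & & & & & 1
\end{bmatrix},$$ (3.13-41)
with $n_x^{(i)}$, $n_y^{(i)}$ in row $j$ and $\tau_x^{(i)}$, $\tau_y^{(i)}$ in row $k$, each pair occupying columns $j$ and $k$,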
and it is clear(?) that $R^T R = I$, as desired. Inserting the rotated velocities into the DAE's
of (3.13-28) and (3.13-29) gives, using $u = R^{-1} u_R = R^T u_R$,
$$M R^T \dot{u}_R + [K + N(R^T u_R)] R^T u_R + CP = f,$$ (3.13-42)
$$C^T R^T u_R = g.$$ (3.13-43)
To finish, we also rotate the momentum equations (still at node $i$), which is accomplished
simply by multiplication by $R$ [cf. (3.13-40)]:
$$(R M R^T)\dot{u}_R + [R K R^T + R N(R^T u_R) R^T] u_R + R C P = R f,$$ (3.13-44)
a procedure that is easily done at element level. We now have both momentum equations
and velocities in terms of normal and tangential components, and it is a simple matter to
apply either essential or natural BC's to either component of node $i$, in the 'usual' way.
Note that it is only the 'all essential' BC case ($u_n$, $u_\tau$ given) that does not need to be
rotated (although this case could be done in rotated mode), because we could then use
the inverse of (3.13-38) and (3.13-39) to obtain ($u = R^T u_R$)
$$u = n_x u_n + \tau_x u_\tau,$$ (3.13-45)
$$v = n_y u_n + \tau_y u_\tau,$$ (3.13-46)
a pair of equations that can also be used to transform back to cartesians after the $(n, \tau)$
BC's have been applied and the boundary velocity computed.
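The rotation at a single node can be sketched as follows. This is a hedged illustration, assuming NumPy and a global vector in which the node's $u$ and $v$ sit at (hypothetical) positions `j` and `k`; the 4 x 4 matrix `K` is a made-up symmetric stand-in for an assembled matrix.

```python
import numpy as np


def rotation_matrix(n_dof, j, k, normal):
    """Identity except for the 2 x 2 block at rows/columns j, k (cf. (3.13-41))."""
    nx, ny = np.asarray(normal, dtype=float) / np.linalg.norm(normal)
    tx, ty = -ny, nx                      # tau = k x n
    R = np.eye(n_dof)
    R[j, j], R[j, k] = nx, ny             # row j: u_n   = n_x u + n_y v
    R[k, j], R[k, k] = tx, ty             # row k: u_tau = tau_x u + tau_y v
    return R


# Hypothetical layout: 4 unknowns, the rotated node has u at index 1, v at index 3.
R = rotation_matrix(4, 1, 3, normal=(1.0, 1.0))

K = np.array([[4.0, 1.0, 0.0, 0.0],
              [1.0, 4.0, 1.0, 0.0],
              [0.0, 1.0, 4.0, 1.0],
              [0.0, 0.0, 1.0, 4.0]])
K_R = R @ K @ R.T                          # rotated matrix, as in (3.13-44)

u = np.array([0.0, 2.0, 0.0, 0.0])         # node moves purely in x
u_R = R @ u                                # (u_n, u_tau) stored at positions 1 and 3
print("u_n, u_tau =", u_R[1], u_R[3])
print("back to cartesian:", R.T @ u_R)
```

Because only one 2 x 2 block differs from the identity, the same operation can be applied to element-level matrices before assembly, as the text notes.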
We are nearly finished. The remaining (and crucial) step is the proper computation of
the normal vector at node $i$. Noting first from Figure 3.13-1 that the geometric normal is
not even well-defined at node $i$ (because of the $C^0$ boundary shape), the final task is to find
an appropriate and unique normal (and tangent) vector. This problem was also solved by
Engelman et al. (1982a), and we repeat the solution here, with slight variations; it turns
out that the 'omnipotent' incompressibility condition once again plays a major role, as
follows: starting from $\int \psi_i \nabla \cdot u^h = 0$ [cf. (3.13-2)], we sum over $i$ and use $\sum_i \psi_i = 1$
to obtain $\int \nabla \cdot u^h = 0$ where, from (3.13-8) with $\mathbf{u}_j = (u_j, v_j)^T$
denoting the nodal velocity,
$$u^h = \sum_{j=1}^{N_T} \mathbf{u}_j \varphi_j,$$ (3.13-47)
where $N_T = M_\alpha + N_\alpha$ there, because here we need not distinguish between
specified/given values of $\mathbf{u}_j$ and those to be determined, as we shall see. Thus,
$$0 = \int \nabla \cdot u^h = \int \sum_j \nabla \cdot (\mathbf{u}_j \varphi_j) = \int \sum_j \mathbf{u}_j \cdot \nabla\varphi_j = \sum_j \mathbf{u}_j \cdot \int \nabla\varphi_j.$$ (3.13-48)
Now Green's Theorem in the form $\int \nabla\varphi_j = \int_\Gamma n \varphi_j$ yields
$$\sum_j \mathbf{u}_j \cdot \int_\Gamma n \varphi_j = \sum_j \left( u_j \int_\Gamma n_x \varphi_j + v_j \int_\Gamma n_y \varphi_j \right) = 0.$$ (3.13-49)
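The next key observation is that $\varphi_j|_\Gamma = 0$ for all internal nodes, so that the summation
from one to $N_T$ in (3.13-49) effectively collapses to one over only the $N_B$ (say) boundary
nodes. Thus, (3.13-48) becomes, effectively,
$$\sum_{j=1}^{N_B} \mathbf{u}_j \cdot \int \nabla\varphi_j = 0 = \sum_{j=1}^{N_B} \left( u_j \int \frac{\partial\varphi_j}{\partial x} + v_j \int \frac{\partial\varphi_j}{\partial y} \right),$$ (3.13-50)
which gives, using (3.13-45) and (3.13-46), but now evaluated at node $j$ on the assumption
that $n_j$ and $\tau_j$ are uniquely defined,
$$\sum_{j=1}^{N_B} \left[ (n_{xj} u_{nj} + \tau_{xj} u_{\tau j}) \int \frac{\partial\varphi_j}{\partial x} + (n_{yj} u_{nj} + \tau_{yj} u_{\tau j}) \int \frac{\partial\varphi_j}{\partial y} \right] = 0,$$ (3.13-51)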
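and we are almost finished. Rearranging to
$$\sum_{j=1}^{N_B} \left[ u_{nj} \left( n_{xj} \int \frac{\partial\varphi_j}{\partial x} + n_{yj} \int \frac{\partial\varphi_j}{\partial y} \right) + u_{\tau j} \left( \tau_{xj} \int \frac{\partial\varphi_j}{\partial x} + \tau_{yj} \int \frac{\partial\varphi_j}{\partial y} \right) \right] = 0$$ (3.13-52)
yields the next key result: since (3.13-52) is still a statement of global mass conservation,
it can only depend on $\{u_{nj}\}$; i.e., it must be totally independent of the values of the
tangential velocity, $u_{\tau j}$, which can only be true if
$$\tau_{xj} \int \frac{\partial\varphi_j}{\partial x} + \tau_{yj} \int \frac{\partial\varphi_j}{\partial y} = \tau_j \cdot \int \nabla\varphi_j = \tau_j \cdot \int_\Gamma n \varphi_j = 0,$$ (3.13-53)
a relation that gives the ratio of the two components of $\tau$. To finish, we simply add the
normalization requirement, $\tau \cdot \tau = 1$, which permits the unique (up to a sign) solution,
$$\tau_{xj} = \pm \frac{\int \partial\varphi_j/\partial y}{\left| \int \nabla\varphi_j \right|} = \pm \frac{\int_\Gamma n_y \varphi_j}{\left| \int_\Gamma n \varphi_j \right|},$$ (3.13-54)
$$\tau_{yj} = \mp \frac{\int \partial\varphi_j/\partial x}{\left| \int \nabla\varphi_j \right|} = \mp \frac{\int_\Gamma n_x \varphi_j}{\left| \int_\Gamma n \varphi_j \right|}.$$ (3.13-55)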
Now Figure 3.13-1 shows that $\tau = k \times n$, where $k$ is the unit vector in the $z$-direction (out
of the plane), giving $\tau_{xj} = -n_{yj}$ and $\tau_{yj} = n_{xj}$, which, with (3.13-54), (3.13-55) gives
$$n_j = \frac{\int \nabla\varphi_j}{\left| \int \nabla\varphi_j \right|}$$ (3.13-56)
as the final result, in which the proper 'sign selection' has taken place [$-$ in (3.13-54),
$+$ in (3.13-55)]. Note, of course, that the global integration effectively collapses to the
area defined by the support of $\varphi_j$. An alternative form of this final result that makes
good physical sense but is probably not the preferred way in practice, obtains via another
application of Green's Theorem in (3.13-56) [and is already in (3.13-54), (3.13-55)]:
$$n_j = \frac{\int_\Gamma n \varphi_j}{\left| \int_\Gamma n \varphi_j \right|},$$ (3.13-57)
a form presented in Lynch and Gray (1980); the mass-consistent unit normal at node
$j$ is a basis function-weighted geometric normal. [See (3.13-353) et seq. for a specific
example.]
This latter result also suggests that a 'simple,' and unique, geometric average value,
via $n_j = \int_{\Gamma_j} n / |\int_{\Gamma_j} n|$, where $\Gamma_j$ means 'integrate over that portion of $\Gamma$ containing node
$j$,' may not be mass-consistent, and this is true. $n_j$ as computed from (3.13-56) or
(3.13-57) is the only normal vector that assures that 'flow in = flow out' of the element
pair meeting at node $j$, an interpretation employed by Gray (1977) in his original (and
simpler) derivation of the consistent normal, and in Pinder and Gray (1977). Finally, if
the normal velocity is specified on all of $\Gamma$, then only the consistent normal will assure
discrete global mass conservation and $\sum_i g_i = 0$ in (3.13-31); i.e., only then are the data
orthogonal to the hydrostatic null vector, and only then is the problem well-posed.
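As an illustration of the boundary form (3.13-57), the sketch below evaluates the consistent normal at a node shared by two straight boundary edges. It assumes a piecewise-linear boundary basis (so that $\int n \varphi_j$ over each adjacent edge equals half the edge length times the edge normal) and a counter-clockwise boundary ordering; both are assumptions of this example, and the function names are hypothetical.

```python
import math


def edge_normal(p, q):
    """Outward unit normal and length of the straight edge p -> q (boundary traversed counter-clockwise)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    length = math.hypot(dx, dy)
    return (dy / length, -dx / length), length


def consistent_normal(prev_pt, node, next_pt):
    """Basis-weighted normal at `node`, assuming a piecewise-linear phi_j on straight edges."""
    (n1, len1) = edge_normal(prev_pt, node)
    (n2, len2) = edge_normal(node, next_pt)
    # Integral of n * phi_j over each adjacent edge is (length / 2) * n_edge.
    sx = 0.5 * (len1 * n1[0] + len2 * n2[0])
    sy = 0.5 * (len1 * n1[1] + len2 * n2[1])
    norm = math.hypot(sx, sy)
    return (sx / norm, sy / norm)


# Right-angle corner with equal edge lengths: the normal bisects the corner.
print(consistent_normal((0.0, 0.0), (1.0, 0.0), (1.0, 1.0)))
# Same corner with unequal edge lengths: the longer edge dominates.
print(consistent_normal((-1.0, 0.0), (1.0, 0.0), (1.0, 1.0)))
```

For curved (isoparametric) edges or higher-order bases the per-edge integrals would need quadrature instead of the closed form used here.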
The consistent normal at a corner of the domain is interesting. It can easily be shown
to look like that in Figure 3.13-2 for any rectangular-shaped element. While perhaps
awkward geometrically in some cases (see Engelman et al., 1982a, for further details), here
we present an example of the positive side of the story: if one is using the $Q_1Q_0$ element
with the BC of specified traction in the corner of the domain, any normal except that
shown in Figure 3.13-2 will violate the two following (traction BC) requirements: (i) the
normal force balance involves the centroid (element) pressure, and (ii) the tangential force
balance must not involve the centroid pressure; mass consistency is here a prerequisite
in order to obtain momentum consistency. See Section 3.13.5f for details.
[Fig. 3.13-2 Corner normal vector.]
To conclude the consistent normal discussion, we summarize the 3D situation, which
contains no surprises—only the need to deal with a tangent plane (with two tangent
vectors) rather than a tangent line. The consistent normal is still that from (3.13-56), with
the integrations now occurring over the 3D volume supported by $\varphi_j$. The only 'hitch' in 3D
is that, once a normal direction has been determined, there is an infinite number of choices
for the two tangential directions. While this does not present a problem from a theoretical
point of view, in practice it does. For example, consider the simple case where we wish
to specify a zero normal velocity component and known (non-zero) tangential velocities
on a surface such as a cylinder which is not aligned with any coordinate direction. The
procedure advocated in Engelman et al. (1982a) will result in a different set of tangential
directions at each node on the cylinder—clearly an undesirable situation. Or consider
turbulent flows using 'wall functions' (see Volume II) in which an applied shear stress
is specified in a particular tangential direction together with a zero normal velocity. One
solution to these 'problems', where again we stress that it is a practical implementation
problem rather than a theoretical one, is to allow the user, once the normal direction has
been consistently derived, to specify explicitly one of the tangential directions, say $\tau_1$.
The remaining tangential direction can then be computed using a simple vector cross
product ($n \times \tau_1$).
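A small sketch of this user-specified tangent approach is given below. The cross product $n \times \tau_1$ follows the text; the preliminary step that removes any normal component from the user's direction (a Gram-Schmidt projection) is an added assumption of this example, as are the function names.

```python
import math


def _normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / norm for c in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def tangent_frame(normal, tangent_hint):
    """Return unit vectors (n, tau1, tau2) with tau2 = n x tau1.

    `tangent_hint` is the user-specified direction; it is first projected onto
    the tangent plane (an assumption of this sketch) and must not be parallel to n.
    """
    n = _normalize(normal)
    dot = sum(a * b for a, b in zip(n, tangent_hint))
    tau1 = _normalize(tuple(h - dot * c for h, c in zip(tangent_hint, n)))
    tau2 = _cross(n, tau1)
    return n, tau1, tau2


n, tau1, tau2 = tangent_frame((0.0, 0.0, 2.0), (1.0, 0.0, 0.3))
print("n    =", n)
print("tau1 =", tau1)
print("tau2 =", tau2)
```

Supplying the same hint (for example, a cylinder's axis direction) at every node of a surface gives a tangent frame that varies smoothly from node to node.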
f. Axisymmetric case
Recalling the axisymmetric version of the NS equations of Section 3.6.4, it is of some
interest to discuss/present their weak formulation, especially since the axis itself ($r = 0$)
has caused certain difficulties to non-FEM CFDer's, who often 'drill a hole' through
the center of the mesh to avoid placing nodes along $r = 0$; see e.g. de Vahl Davis
(1979) and Smutek et al. (1985). The important 'saving grace' for GFEM begins with
the realization that the 'volume element' for integration begins as $2\pi r\,dr\,dz$ rather than
$dx\,dy\,dz$ of cartesians, and we drop the common factor $2\pi$. Thus, for example, the
$uv/r$ term in (3.7-29) looks like $\int \varphi_i uv\,dr\,dz$ in the weak form; i.e., 'easy.' There is,
however, one term that retains $r$ in the denominator: the $u/r^2$ term in (3.7-31) goes
over to $\int \varphi_i u\,dr\,dz/r$, which still does not cause problems; the term is integrable, and
appropriate numerical integration (Gauss-Legendre) keeps us away from $r = 0$. Next, the
viscous term like $(1/r)\,\partial(r\,\partial u/\partial r)/\partial r$ in (3.7-31) integrates by parts to
$\int r\varphi_i\,\partial u/\partial r|_{r=R}\,dz - \int r\,(\partial\varphi_i/\partial r)(\partial u/\partial r)\,dr\,dz$, with the boundary integral ultimately 'showing up' as part of the
normal viscous (pseudo) traction force. The final 'interesting' term is $\partial P/\partial r$ in (3.7-28);
here, integration by parts recovers the appropriate (div-grad) symmetry:
$$\int \varphi_i \frac{\partial P}{\partial r}\,r\,dr\,dz = \int \frac{\partial}{\partial r}(r\varphi_i P)\,dr\,dz - \int P \frac{\partial}{\partial r}(r\varphi_i)\,dr\,dz
= \int r\varphi_i P|_{r=R}\,dz - \int P \frac{\partial(r\varphi_i)}{\partial r}\,dr\,dz,$$
the second of which looks like its symmetric counterpart from the first term of
(3.7-27), $\int \psi\,\partial(ru)/\partial r$, once the appropriate 'expansions' are made. [The first term
is, of course, the pressure contribution to the normal force (traction) at the tube wall
and will be part of the NBC unless the tube wall is a no-flow boundary (the 99.99%
case), in which case $\varphi_i = 0$ at $r = R$, and the term vanishes.] After $u = \sum_j u_j \varphi_j$ and
$P = \sum_j P_j \psi_j$ are performed, we recover the required symmetry: the $C$-matrix contribution
is $-\int (\partial\varphi_i/\partial r + \varphi_i/r)\,\psi_j\,r\,dr\,dz$ and that for $C^T$ is $-\int \psi_i\,(\partial\varphi_j/\partial r + \varphi_j/r)\,r\,dr\,dz$, and we
are done. The bottom line is simply the following [cf. (3.13-18) through (3.13-27)], for
the simpler 2D case [no swirl: omit (3.7-25) and set $v = 0$], leaving the 2.5D case (2D
equations, three components, with swirl) to the reader:
1. Identify $x_1 = x$ with $r$ and $x_2 = y$ with $z$.
2. Replace $dx\,dy$ by $r\,dr\,dz$ in all 'bulk' integrals, even though we derived some terms
in which we cancelled the $r$'s, for expository purposes.
3. Set $\gamma = 0$; we are dealing with the simpler ($\nabla^2$) form, leaving the more complex
stress-divergence form as another exercise.
4. Augment the viscous matrix, $K_\alpha$, by
$$K_\alpha \to K_\alpha + \nu \delta_{\alpha 1} \int \frac{\varphi_i \varphi_j}{r^2}\,r\,dr\,dz,$$
where $\delta_{\alpha 1}$ is the Kronecker delta.
5. Augment the $C$-matrix by
$$C_\alpha \to C_\alpha - \delta_{\alpha 1} \int \frac{1}{r}\,\varphi_i^{(\alpha)} \psi_j\,r\,dr\,dz.$$
Done.
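To illustrate the remark that Gauss-Legendre quadrature 'keeps us away from $r = 0$', the sketch below integrates the radial part of the extra viscous term in item 4, $\int (\varphi_i \varphi_j / r^2)\,r\,dr$, over a single linear element touching the axis. It is a one-dimensional simplification (the $z$-integration and $\nu$ are omitted), and it assumes that the radial velocity at the axis node is fixed at zero by symmetry, so only entries involving the off-axis basis function are needed; the function name is hypothetical.

```python
import math

# Two-point Gauss-Legendre rule on [-1, 1]: (abscissa, weight) pairs.
GAUSS_2PT = ((-1.0 / math.sqrt(3.0), 1.0), (1.0 / math.sqrt(3.0), 1.0))


def hoop_term(r_a, r_b, i, j):
    """Integrate phi_i * phi_j / r over [r_a, r_b] for linear phi (0: node at r_a, 1: node at r_b)."""
    half = 0.5 * (r_b - r_a)
    mid = 0.5 * (r_a + r_b)
    total = 0.0
    for xi, weight in GAUSS_2PT:
        r = mid + half * xi                      # strictly inside the element, never r = 0
        phi = ((r_b - r) / (r_b - r_a), (r - r_a) / (r_b - r_a))
        total += weight * half * phi[i] * phi[j] / r
    return total


h = 0.2                                          # element spans the axis: [0, h]
k_bb = hoop_term(0.0, h, 1, 1)                   # exact value: integral of r / h^2 = 1/2
k_ab = hoop_term(0.0, h, 0, 1)                   # exact value: integral of (1 - r/h) / h = 1/2
print(k_bb, k_ab)
```

Both integrands reduce to linear functions of $r$, so the two-point rule reproduces them exactly; the entry coupling the axis node to itself is not integrable, which is harmless only because that node carries the Dirichlet value.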
g. Fixing ill-posed Dirichlet BC's
A common situation is that where the normal velocity is specified on all of $\Gamma$ ($n \cdot u = n \cdot
w = f$), and it is a simple fact that the finite element interpolant of this function, say
$\Pi f$, generally does not satisfy $\int_\Gamma \Pi f = 0$ even if the continuum problem is well-posed
($\int_\Gamma f = 0$). This ill-posed problem must be converted to one that is well-posed if we are
to make any progress with GFEM.
Remark:
As noted earlier (Section 3.10.5), if $n \cdot w$ is time-independent and the time-dependent
NS equations are being solved in the PPE formulation, then this ill-posedness is not
recognized by the mathematics; it is then up to the analyst to recognize the problem.
One way to 'fix' the data, as mentioned earlier, is given by Glowinski (1984), in
which a two-step procedure is employed: (i) modify the unit normal vector on all of $\Gamma$
by projecting it (in the $L_2$ sense) onto the velocity basis; (ii) subtract off (pointwise) an
appropriate fraction of the global mass imbalance to regain global conservation. See his
book for details.
Here we provide an alternative and, we believe, simpler way to get the job done:
we modify the normal velocity in a least-squares sense with no need to modify the
normal vector on $\Gamma$. Like so: given $\Pi f = \sum_j f_j \varphi_j$ with $f_j = f(x_j)$ and $\int_\Gamma \Pi f \ne 0$,
perform a least-squares adjustment, from $\Pi f$ to $f^h$, via: minimize $\int_\Gamma (f^h - \Pi f)^2$ subject to
$\int_\Gamma f^h = 0$. Converting this constrained extremal problem to a saddle-point problem via
the introduction of a Lagrange multiplier, $\lambda$, to satisfy the constraint, yields: extremize
$F(f^h, \lambda) = \tfrac{1}{2}\int_\Gamma (f^h - \Pi f)^2 + \lambda \int_\Gamma f^h$, which in turn leads to (i) $f^h - \Pi f + \lambda = 0$ and
(ii) $\int_\Gamma f^h = 0$, to give $\lambda = \Gamma^{-1} \int_\Gamma \Pi f$, where $\Gamma = \int_\Gamma d\Gamma$ is the boundary measure 'size.'
The Lagrange multiplier is a constant that is proportional to the global mass imbalance.
The final result is simply
$$f^h(x) = \Pi f(x) - \Gamma^{-1} \int_\Gamma \Pi f = \sum_j f_j \varphi_j - \Gamma^{-1} \int_\Gamma \sum_j f_j \varphi_j = \sum_j f_j [\varphi_j(x) - \bar{\varphi}_j] = \sum_j f^h_j \varphi_j(x),$$
where $\bar{\varphi}_j = \Gamma^{-1} \int_\Gamma \varphi_j$ is the average value of $\varphi_j(x)$
over all of $\Gamma$.
Remarks:
1. The nodal values are simpler yet: $f^h_i = f_i - \lambda$; the pointwise values are all adjusted
by the same amount. Clearly, if $\Pi f$ is already mass-consistent, then $\lambda = 0$ and no
change is made.
2. The mass-inconsistent interpolant is modified (slightly, at least when $\int_\Gamma f = 0$) at
each node in such a way that the new contribution to $f^h$ from each node is itself
'mass-consistent' in that $\int_\Gamma (\varphi_j - \bar{\varphi}_j) = 0\ \forall j$. [Note that $\bar{\varphi}_j \ll 1$, usually, whereas
$\varphi_j(x) = O(1)$.]
3. It is not even required that the continuum problem be well-posed; i.e., the modified
discrete problem is well-posed even if $\int_\Gamma f \ne 0$. (If a wildly non-physical problem,
with $\int_\Gamma f$ 'large,' has been posed, then the 'adjustment' may be also large.)
4. If part of $\Gamma$ is truly impenetrable so that $f = 0$ there, before and after the adjustment,
then simply omit this part of $\Gamma$ in the above calculations.
5. The necessity or desirability of employing consistent normals, a la Section 3.13.1e,
should also be considered if this method is to be implemented. [We derived this
method while writing this book, and have not (yet) implemented it ourselves.]
6. Similar procedures have been employed in certain FDM's to adjust the mass
imbalance (ill-posedness) that comes from trying to use $\partial u_n/\partial n = 0$ as an OBC; see, for
example, Schutt (1991) and Sani and Gresho (1994).
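The least-squares fix reduces to subtracting one constant from every nodal value, which the following sketch demonstrates. It assumes a closed polygonal boundary with piecewise-linear interpolation of the nodal data (so each edge contributes its length times the mean of its two end values); the geometry, the data and the function names are hypothetical.

```python
import math


def boundary_integral(points, f):
    """Return (integral of the linear interpolant of f, boundary length) on a closed polygon."""
    total, length = 0.0, 0.0
    count = len(points)
    for a in range(count):
        b = (a + 1) % count                     # closed boundary: last node connects to first
        edge = math.dist(points[a], points[b])
        total += 0.5 * edge * (f[a] + f[b])     # exact for a linear interpolant on the edge
        length += edge
    return total, length


def fix_normal_velocity(points, f):
    """Shift all nodal values by lam = (1 / |Gamma|) * integral of the interpolant."""
    total, length = boundary_integral(points, f)
    lam = total / length
    return [fj - lam for fj in f], lam


# Hypothetical 2 x 1 rectangle with mass-inconsistent nodal normal velocities.
pts = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
f = [1.0, 0.2, -0.9, 0.1]

fh, lam = fix_normal_velocity(pts, f)
print("imbalance before:", boundary_integral(pts, f)[0])
print("lambda          :", lam)
print("imbalance after :", boundary_integral(pts, fh)[0])
```

If part of the boundary is impenetrable, remark 4 suggests restricting both the integral and the length to the remaining part, which would require passing only those edges to `boundary_integral`.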
3.13.2 The Choice of Elements
a. Introduction and summary tables
We are about to embark on one of the most difficult and dangerous of finite element
trails as we attempt to explain some of the nitty-gritty details behind the long (and still
developing) history that finally leads up to not only the simple question, 'Which element(s)
do you prefer and why?' but to much more basic and difficult ones: 'Which elements
work, why do some not, and why do some appear to work but perhaps should not?'
Or even, 'Why do some advocate the use of an element that others claim is doomed to
fail?' or 'Why are there so d... many choices?' Why cannot reasonable people agree
on such ostensibly simple issues; especially those with firm mathematical underpinnings?
Partly, it must be that the issues are actually far from simple. Naturally, we shall need
to try to clarify what it means for an element to 'work' or to not work, or even to fail.
Perhaps W. Habashi put it best when he said, 'Convergence is in the eye of the beholder'
(personal communication—to a large audience, many persons).
To begin, we emphasize that most of this deep and troubled and muddy water
came to be because of the single simplification (!) of the mass conservation equation;
i.e., the fluid will be treated as, or assumed to be, incompressible. We also note that
this alleged simplification has also taken its toll in the finite difference world, where
numerical solutions of the incompressible NS equations began; many in this world are
also quite confused even today. A relevant comment on this situation was made recently by
M. Rose—'.. .because the treatment of incompressible flow is so unforgiving of imprecise
ideas, such flows still remain a fertile ground.' (personal communication, 1990). On the
other hand, formulations of the fully compressible equations can—especially in regions
in which the flow is 'behaving' incompressibly (V • u is 'small')—also 'act up' (e.g.,
Pironneau, 1989; and Fortin and Pierre, 1992).
In contrast to the scalar transport equation, the choice of elements for the NS equations
is far from simple. Mixed methods and saddle-point problems is the name of the game—or
at least part of it. The new issues include: div-grad symmetry, compatible function spaces,
null spaces, spurious modes, stability (with mesh refinement), and element-level mass
conservation, as well as the 'simpler' ones of accuracy, simplicity, and cost-effectiveness.
All of these issues will be discussed further below.
A perusal of the literature reveals more than two dozen of either triangular or
quadrilateral elements for 2D flows. In 3D, the corresponding numbers are smaller—but there is
still a plethora of possibilities. The ease with which FEM researchers can generate various
'higher-order' approximations, relative to those in FDM or control volume methods, is
probably as much of a curse as it is a blessing—especially when it is acknowledged that
not all seemingly reasonable approximations deliver useful and/or cost-effective results.
The number of element 'combos' (velocity and pressure approximating functions) that
have been analyzed is very large; so too is the number that have been coded up and tested
in the CFD laboratory. Add to this list the concept of 'macro elements' and other tricks
to make 'stable' elements out of 'unstable' ones, and the situation becomes even more
complex—perhaps even scary, daunting. ... It is therefore especially easy to understand
how 'outsiders' or 'newcomers' scouting the field might view the FEM for
incompressible flow with some skepticism, perhaps wondering, 'When are they going to get their
act together?', and asking, 'Why should I jump into this clearly confused and frustrating
fray?' And indeed these are valid concerns—and we only wish that we could address
them more adequately than we do below. Perhaps, though, it is just another (annoying?)
manifestation of the general fact that the FEM offers many, many choices of basis
functions. (How many 'higher-order' finite volume methods are there? Lack of choice is also
not best.)
The field is definitely 'richer' for the finite element mathematician (or 'mathematical
engineer') than for the average CFD practitioner who is mainly interested in obtaining
good/useful results fairly cost effectively. But the truth (as we know it) is, unfortunately,
that there is no unequivocally 'best' element. ...
In this section we shall attempt to summarize the state of confusion (a moving target)
regarding element choices, focus on those subsets of elements that we advocate (partly,
of course, because of our own experience), and still try to present a reasonably balanced
presentation. That this is not entirely possible is probably obvious, since there often seems
to be a fairly large increase in adrenalin flow whenever the subject of 'element choices'
is discussed. Our discussion will probably also create a few new enemies—a plight we
could bear if in addition it attracts enough outsiders and newcomers to give finite elements
a try—so that, on balance, the FEM might move forward faster. Our general philosophy
will be based on the premise that simplicity is still beautiful, and on the fact that the theory
THE FINITE ELEMENT EQUATIONS/DISCRETIZATION OF THE WEAK FORM 459
is too often silent. ... A colleague recently opined that the entire field would still be in
the Stone Age if practitioners had waited for the theorists to prove 'consistency, stability,
accuracy, convergence, etc.' We are already clearly in violation of (our understanding
of) the 'French school'—for example—which usually seems to require some minimum
number of proved theorems before any computer programming and subsequent numerical
experiments are permitted. But even they manage to 'ignore' the unfortunate fact that
no one has yet been able to prove global existence of solutions to the subject of this
book—the NS equations. ... C'est la vie.
After presenting a few tables of 'elements'—good and bad—we will briefly summarize,
as we see/understand it, the key state of the art in some of the finite element selection
criteria—namely, stability and how to analyze it.
We begin by defining a few broad categories of elements, some of which we will need
and others we shall drop:
1. Equal-order vs mixed-interpolation. This refers to the basis functions used for velocity
(components) vis-a-vis those for pressure ($\varphi$ vs $\psi$ in the previous section). The former is
obvious (the same for both), and the latter usually means that a higher-degree piecewise
polynomial is used for the velocity than for the pressure [which is at least partially related
to the following two issues: (i) $\nabla^2 u$ involves a higher-order operator than does $\nabla P$; and
(ii) (roughly) the divergence of the (vector) 'velocity space' is a scalar space that is close
to being (and sometimes is) the 'pressure space']. It is worth noting that a stable method
can always be obtained by sufficiently enriching the velocity space for a fixed pressure
space—but such a stable space might not be very accurate in the sense that a high-order
polynomial basis function for velocity (e.g., quadratic) may not give high-order accuracy
if the accuracy of the pressure space is low (e.g., piecewise constant), and few would
opt for an expensive, inaccurate element (cost effectiveness in reverse!). It is also worth
noting that the choice of a pressure space (and, of course, the velocity space) implies the
choice of the divergence operator. The most popular mixed interpolation elements employ
one-order-lower basis functions for pressure than for velocity—although also popular are
'stabilized' equal-order elements, to be addressed by D. Silvester in the next section
(Section 3.13.3).
2. Continuous vs discontinuous pressure. Since (or when) the weak formulation of the
momentum equation involves integration by parts of $\nabla P$, the resulting weak form contains
no derivatives of pressure, thus introducing the possibility of approximating it by functions
(piecewise polynomials, of course) that are not $C^0$-continuous; and indeed, this has been
done and is quite popular/useful. But it is not necessary; hence, continuous approximations
for P are also much used—with or without integration by parts, with the latter generating
unsymmetric div and grad matrices and 'problems' with NBC's and well-posedness.
Note that discontinuous pressure elements do not possess uniquely defined pressure on
the element boundaries; they are dual-valued there—and often multi-valued at certain
velocity nodes. Note too that only discontinuous pressure elements assure an element-level
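mass balance. Proof: for $\psi_i$ = piecewise-constant on element $e$ (one on $e$, zero elsewhere),
$$0 = \int_\Omega \psi_i \, \nabla \cdot u^h = \int_e \nabla \cdot u^h = \oint_{\partial e} n \cdot u^h ,$$
and only discontinuous pressure elements contain this element-level test function. QED.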
460 THE NAVIER-STOKES EQUATIONS
3. Conforming vs non-conforming. Conforming velocity elements are those for which the
basis functions form a subset of $H^1$ for the continuous problem; i.e., the first derivatives
(and their squares) are integrable in $\Omega$. The simplest non-conforming element is a linear
triangle with the nodes placed at the three midsides; it 'conforms' with the velocity
in each neighboring triangle at just one point. Following Girault and Raviart (1986) and
Gunzburger (1989), we shall mostly neglect these little-used elements—but we do cite the
classic reference; it is Crouzeix and Raviart (1973), who also introduced some important
new concepts regarding conforming elements. See also Thomasset (1981). Also, the
nonconforming quadrilateral element of Rannacher and Turek (1992), a sort of 'rotated' (and
LBB-stable) version of Q1Q0, should be mentioned here; Turek (1994, 1996a) has shown
many good results with it.
Next, we introduce some terminology for efficient element descriptions—mostly
borrowed, but with a little bit that is new:
1. For triangles/tetrahedra, the designation PmPn means that the velocity (each
component) is approximated by continuous piecewise complete Polynomials of degree m and
pressure by continuous piecewise complete Polynomials of degree n. (For example, P2P1
in 2D means $u \approx a_1 + a_2 x + a_3 y + a_4 xy + a_5 x^2 + a_6 y^2$ with a similar approximation for
$v$, and $P \approx A_1 + A_2 x + A_3 y$.) Both velocity and pressure are continuous across element
boundaries, and each (triangular) element contains six velocity nodes and three pressure
nodes. The 3D (tetrahedron) version of this element contains 10 velocity nodes and four
pressure nodes.
2. For the same families, PmP-n is as above, except that pressure is approximated via
piecewise-discontinuous polynomials ($C^{-1}$) of degree n; e.g., P2P-1 is the same as P2P1
except that the pressure is now an independent linear function in each element—it is
therefore discontinuous at element boundaries.
3. For quadrilaterals/hexahedra, the designation QmQn means that the velocity (each
component) is approximated by a continuous piecewise polynomial of degree m in each
direction on the Quadrilateral and likewise for the pressure, except that the polynomial
degree is n. [For example, Q2Q1 is like P2P1 above, with the addition of $a_7 x^2 y + a_8 x y^2 + a_9 x^2 y^2$
to $u$ and $A_4 xy$ to $P$. Each element contains nine velocity nodes ($3^2$) and four
pressure nodes ($2^2$); the 3D (brick) version has, of course, 27 velocity nodes ($3^3$) and eight
pressure nodes ($2^3$).]
4. For these same families, QmQ-n is as above, except that the pressure approximation
is not continuous at element boundaries.
5. Again for the same families, QmP-n indicates the same velocity approximation with a
pressure approximation that is a discontinuous complete piecewise Polynomial of degree
n (not of degree n in each direction—it is as if the pressure was to be represented on a
triangle within the quadrilateral, with 'extrapolation' as necessary).
6. The designation P+ or Q+ adds some sort of 'bubble function' to the polynomial
approximation for the velocity. These are sometimes called 'enriched' elements (Arnold
et al., 1984).
7. Finally, for n = 0, we have piecewise-constant pressure, and we omit the minus sign
for simplicity.
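The node counts quoted in items 1 and 3 follow directly from the dimensions of the polynomial spaces. The short sketch below checks them; the function names are illustrative and not from the text. It assumes only that a complete polynomial of degree m in d variables has C(m + d, d) coefficients and that a polynomial of degree m in each of d directions has (m + 1)^d.

```python
from math import comb


def simplex_nodes(degree, dim):
    """Nodes per element for complete polynomials of the given degree
    on a triangle (dim=2) or tetrahedron (dim=3): C(degree + dim, dim)."""
    return comb(degree + dim, dim)


def tensor_nodes(degree, dim):
    """Nodes per element for polynomials of the given degree in each
    direction on a quadrilateral (dim=2) or hexahedron (dim=3)."""
    return (degree + 1) ** dim


if __name__ == "__main__":
    for dim in (2, 3):
        print(f"{dim}D P2P1 : {simplex_nodes(2, dim)} velocity, "
              f"{simplex_nodes(1, dim)} pressure nodes")
        print(f"{dim}D Q2Q1 : {tensor_nodes(2, dim)} velocity, "
              f"{tensor_nodes(1, dim)} pressure nodes")
        # QmP-n: tensor-product velocity, complete-polynomial pressure.
        print(f"{dim}D Q2P-1: {tensor_nodes(2, dim)} velocity nodes, "
              f"{simplex_nodes(1, dim)} pressure unknowns")
```

For Q2P-1 in 2D this gives nine velocity nodes and three pressure unknowns per element, which is consistent with the pressure being 'represented on a triangle within the quadrilateral'.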
Before presenting a summary of most of the known 'incompressible elements,' we
provide a hopefully-useful 'heuristic' to perhaps show how these mixed-interpolation
elements might have come about. And this we do via one example (with six 'variations'):
suppose you 'like' the nine-node quad (Q2) for scalar problems and are interested in
generalizing it to a 2D vector problem for the incompressible NSE's. To make such a 'velocity'
element incompressible, you might first consider 'average' incompressibility; i.e. request
mass conservation at only the element level, leading to the weak form $\int_e \nabla \cdot u^h = 0$ for
each element, $e$. A second idea might be to request a weak form that approximates
$\nabla \cdot u = 0$ at every velocity node, which would lead to $\int \varphi_i \nabla \cdot u^h = 0$ for $i = 1, 2, \ldots, n$.
A third method would request the same weak form, but only at the corner nodes of the
9-node quad, $\int \psi_i \nabla \cdot u^h = 0$, where $\psi_i$ is a bilinear basis function and $i$ ranges over only
the corner nodes of the mesh. A fourth might be a combination of the first and third;
i.e., require both $\int_e \nabla \cdot u^h = 0$ and $\int \psi_i \nabla \cdot u^h = 0$. A fifth idea would be to strengthen the
first via 'moment' equations; i.e., in addition to $\int_e \nabla \cdot u^h = 0$, also require $\int_e \xi \, \nabla \cdot u^h = 0$
and $\int_e \eta \, \nabla \cdot u^h = 0$ on each element, where $(\xi, \eta)$ are the local coordinates. Finally, a sixth
variation would add the cross-moment, $\int_e \xi\eta \, \nabla \cdot u^h = 0$, to the fifth. That's enough; all
we need say in addition is that these six elements have the names Q2Q0, Q2Q2, Q2Q1,
Q2(Q1 + Q0), Q2P-1, and Q2Q-1, all of which (save one, Q2Q2, since equal-order
interpolation is definitely not very viable) are listed in one of the tables below, and each of
which implies/generates the concomitant pressure approximation for the mixed method
in question.
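To make the 'average', 'moment' and 'cross-moment' constraints concrete, the following sketch evaluates them on a single element for two sample velocity fields. It is only an illustration under stated assumptions: the element is taken to coincide with the reference square [-1, 1] x [-1, 1] (unit Jacobian), the sample fields are hypothetical members of Q2, and the names `moment`, `TEST_FUNCTIONS` and `residuals` are invented here.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 5.
GAUSS_PTS = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
GAUSS_WTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def moment(weight, div_u):
    """Integrate weight(xi, eta) * div_u(xi, eta) over the reference square."""
    total = 0.0
    for xi, wx in zip(GAUSS_PTS, GAUSS_WTS):
        for eta, wy in zip(GAUSS_PTS, GAUSS_WTS):
            total += wx * wy * weight(xi, eta) * div_u(xi, eta)
    return total


# Element-level test functions of the three discontinuous-pressure variants:
# the average (Q2Q0), plus two moments (Q2P-1), plus the cross-moment (Q2Q-1).
TEST_FUNCTIONS = {
    "Q2Q0": [lambda xi, eta: 1.0],
    "Q2P-1": [lambda xi, eta: 1.0, lambda xi, eta: xi, lambda xi, eta: eta],
    "Q2Q-1": [lambda xi, eta: 1.0, lambda xi, eta: xi, lambda xi, eta: eta,
              lambda xi, eta: xi * eta],
}


def residuals(element, div_u):
    """Values of the weak continuity constraints of `element` for a given div u."""
    return [moment(w, div_u) for w in TEST_FUNCTIONS[element]]


# Hypothetical sample 1: u = (xi**2, 0), so div u = 2*xi.
def div_linear(xi, eta):
    return 2.0 * xi


# Hypothetical sample 2: u = (xi**2 * eta / 2, 0), so div u = xi*eta.
def div_cross(xi, eta):
    return xi * eta


if __name__ == "__main__":
    for name in TEST_FUNCTIONS:
        print(name, residuals(name, div_linear), residuals(name, div_cross))
```

The first field satisfies the element-average constraint but violates the xi-moment constraint; the second satisfies the average and both moments and is caught only by the cross-moment. This is the sense in which each successive variation strengthens the approximation to incompressibility.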
With these new names in hand, we present in Tables 3.13-1 and 3.13-2 a summary
description of some of the triangular and rectangular elements that have been and/or
are used today; a full list is not necessary, we believe (consult the references for those
not listed). Also, and importantly: as we have only very limited personal experience
with the elements in these tables, much of the associated qualitative discussion is simply
based on our perception of the issues. The designation 'LBB-stable' refers to the three
mathematicians who made important contributions to the analysis of stability; they are
Ladyzhenskaya (1969), Babuska (1971, 1973), and Brezzi (1974). We will later
summarize what is meant by LBB stability—and its several aliases: inf-sup condition, BB-
condition, consistency condition, and div-stability condition. Briefly, any element passing
this stability test will converge 'optimally' (in the sense of approximation theory—details
later) and without spurious pressure behavior, and those failing the test may not (not will
not); they may converge, and may even converge optimally, but this theory does not assure
it—it becomes 'silent.' Indeed, from one of the leading 'stability experts' (M. Fortin), we
have: 'Knowing which elements are stable is not, however, by far, a complete picture of
the situation.' (Fortin and Fortin, 1985a.) Any reference to 'accuracy' in Tables 3.13-1
and 3.13-2 (e.g., first-order) refers to velocity error in $H^1$ and pressure error in $L^2$. (Rough
rule of thumb: most if not all velocity errors can be restated in $L^2$ by 'adding one' to the
$H^1$ error estimate; $h^k \to h^{k+1}$.)
Tables 3.13-3 and 3.13-4 present a similar summary for the less developed 3D case.
Finally, below Tables 3.13-3 and 3.13-4 we offer some general comments—some objective
and some subjective; hopefully some are valid. A general remark pertaining to Tables 3.13-1
through 3.13-4 is this: M. Fortin seems to be the clear leader when it comes to both
the creation/design of new incompressible elements and in their stability analysis—even
if the effects of this leadership cannot be clearly discerned in our tables. The citations
include (at least): Fortin (1977, 1981, 1983, 1985), and with some help: Fortin and
Table 3.13-1 Summary of (useful?) 2D triangular elements
Sketch legend: • velocity; o velocity and continuous pressure; x discontinuous pressure.
Name                                      LBB stable?
P1P0                                      N
P1+P1 (MINI)                              Y
P1P0, criss-cross on a 4-patch (macro)    N
P1P1, on a 4-patch (macro)                Y
P2P0                                      Y
P2P1 (Taylor-Hood)(4)                     Y
P2+P1                                     Y
P2(P1 + P0)                               Y
P2P-1                                     N
Advantages, Disadvantages, Other:
Simple
—Sometimes 'locks' (u = 0)
—Rarely if ever usable(1)
—Simple
-CAC(2) stable
—Pointwise
divergence-free
—Best element
with linear
velocity
(Gunzburger,
1989)
—See too
Glowinski (1984)
—Simple
—Simplest
second-order
triangle
—Better than
P2P1
—Element mass
balance
—Pointwise
divergence-free
—More work
than dPo but
no more
accurate
—Only
1st-order
accurate
—More work
than P2P1
—2 hydrostatic
modes
—'Variable'
spurious null
space(1)
—Can be less
accurate than
PzP^
— First-order
—Cubic bubble
— First-order
—N/2 local
CB's(3)
— Penalty
method should
be used
—First-order
—Also called
iso P2 - Pi
—Also called
P^ isoP2-Pi
—Beat P2P-1 in
several tests
(Thompson,
1975)
—An early
favorite
—Second-order
—Cubic bubble
-'Good
element'
(M. Fortin)
—Second-order
—Second-order
—Can give
good results for
relaxed (natural)
BC's; also, see
Section 3.13.7
Table 3.13-1 (continued).
Name: P2+P-1 (Crouzeix-Raviart)
LBB stable? Y
Advantages: Stabilizes P2P-1
Disadvantages: More work than P2P1 [but see too the 'modified' version, discussed in Cuvelier et al. (1986), which is more economical]
Other: Second-order; cubic bubble; 'Good element' (M. Fortin)
(1) But see Qin (1994).
(2) CAC: constrained approximation condition (see Malkus and Olsen, 1984).
(3) See Section 3.13.2b for discussion of checkerboard modes (CB's); N = number of macro-elements.
(4) Taylor and Hood (1973).
(5) Thompson (1975).
Table 3.13-2 Summary of (useful?) 2D quadrilateral elements.
Name
Sketch
LBB Advantages
stable?
Disadvantages Other
Q1Q0 (Q1P0)
N
Q^Q^, on a
4-patch
(macro)
Q^P-1
Y
N
P2+P-i
Y
—Simplicity
— Penalty
method works
—See
Section 3.13.5
—Less
sensitive to
mesh
distortion than
Q^Qo
—Mathematicians hate it
—See
Section 3.13.5
—Less
accurate than
Q2Q1 on same
grid
— More work
than Q1Q0, but
also more
accurate
—Awkward?
— First-order,
usually
— Pressure is
constant over
the element
—See
Section 3.13.5
— First-order
— Little or no
demonstrated
utility
— First-order
—Quadratic
bubble
-P =
Po+xPx+yPy
is equivalent
representation
—First-order
—Quadratic
bubble
—Quadratic
Table 3.13-2 (continued).
Name                      LBB stable?
Q2Q0 (Q2P0)               Y
Q2Q1 (Taylor-Hood)(1)     Y
Q2(Q1 + P0)               Y
Q2P-1                     Y
Q2Q-1                     N
Advantages, Disadvantages, Other:
—Few, except
stable
—Simplest
higher-order
C°-pressure
quadrilateral
—Better
approximation
to V- u = 0
than Q2Q1
—Element
mass balance
— Probably
the most
accurate 2D
element
—First-order
accurate
—div u = 0 is
often not
strong enough
(see
Volume II)
—2
hydrostatic
modes
—Consistent(2,3)
penalty works
—Penalty
works
—One
CB-mode
normal velocity
at mid-sides
(linear tangential
velocity)
—R means
'Restricted'
—Momentum
equation for
central mode
sees no
pressures!
—Second-order
— More accurate
than Q^Q2
— (Much) less
accurate than
Q2P-1
—Second-order
Introduced in
Gresho et al.
(1980b) to
improve on
Q2Q1
—Second-order
— First
introduced—it
seems—in Sani
et al. (1981a), it
is also referred
to as the 9/3
element
—Second-order
—See
Section 3.13.6b
(1) Hood and Taylor (1974) actually used the eight-node serendipity (remove central node), Q2^(8)Q1,
which is also LBB stable and second-order accurate. See too Taylor and Hood (1973) for earlier
'equal order' experiments.
(2) See Section 3.13.2e (penalty).
(3) It is permissible/viable to use either local ($P = a + b\xi + c\eta$) or global ($P = A + Bx + Cy$) pressure
approximation; Shopov and Iordanov (1994) prefer the former (more accurate on distorted iso-P's),
whereas only the global representation, to our knowledge, has been shown to possess the optimal
error estimate (D. Arnold and F. Brezzi, personal communication).
Table 3.13-3 Summary of usable/useful (?) 3D tetrahedral elements.
Name
P1+P1 (MINI)
Iso P2 - P^
Pt+P^
P2P1
(Taylor-Hood)
P2+P-1
(Crouzeix-
Raviart)
P2(Pi+Po)
LBB
stable?
Y
Y
Y
Y
Y
Y
Advantages
—Simple
—'Best linear
tetrahedral
element'(1)
—Simplest
higher-order
tetrahedron
—Cost
effective(3)
—Better mass
conservation than
P2P^]
Disadvantages
—Not very
accurate(1)(2)
—Awkward (?)
—div u = 0 may
be too 'weak'
—2 hydrostatic
modes
Other
—First-order
—Quartic bubble
—See
Table 3.13-1 for 2D
version
—Add mid-face
and centroid
velocities to PiP_i
—Second-order
—Second-order
—Quartic bubble
—Second-order
(1) Soulaimani et al. (1987).
(2) Parre (1992).
(3) Bertrand et al. (1992).
(4) T\66etal. (1988).
Soulie (1983), Fortin and Fortin (1985a, b), and Soulaimani et al. (1987). [It seems that
his interest in finding stable elements was stimulated/increased when Sani et al. (1981)
showed that a particular element that Fortin had used and studied (Q1P0) could lead to
ill-posed algebraic problems if the imposed velocity BC's did not satisfy a certain spurious
constraint equation. Details later.]
Additional Remarks:
(1) Mixed-interpolation can give 'stability' and optimal accuracy, but requires
additional bookkeeping—both in the 'main' code and in post-processing/graphics.
(2) Equal-order interpolation can only give 'stability' (and optimal accuracy) if
$\nabla \cdot u = 0$ is modified/weakened, typically to $\nabla \cdot u = \epsilon f(P)$ with $\epsilon$ 'small'; see
Section 3.13.3 for a discussion on 'stabilization.' If not stabilized, spurious pressure
modes (which we shall discuss later) result.
(3) 'LBB stable' elements assure the existence of a unique solution (Stokes flow) and
assure convergence at the optimal rate (i.e., as good as that from approximation
theory).
(4) 'LBB unstable' elements may not converge, and if they do, they may not do so at
the optimal rate. But the theory is mostly silent—thus far; i.e., they may converge
(and even at the optimal rate) in some cases/grids, as we will show later for Q1Q0.
(5) Continuous pressure approximation cannot deliver element-level mass balances,
nor can the penalty method be efficiently implemented. Discontinuous pressures
Table 3.13-4 Summary of usable/useful (?) 3D hexahedral elements.
Name
Q1Q0
(Q1P0)
Q2Q1
(Taylor-Hood)(*)
Q2(Q1+P0)
Q2P-1
Q2Q-1
Linear triangular
prism
Quadratic
triangular prism
LBB
stable?
N
Y
Y(?)
Y
N
?
?
Advantages
—Simplicity
— Penalty works
—Simplest
higher-order
C°-pressure brick
—Same as 2D
—Element mass
balance
—Consistent
penalty works
—May give better
div u = 0 than
Q2P-1
— Penalty works
—Transition
elements
—Transition
elements
Disadvantages
—Multiple modes
—div u = 0 not
strong enough
—Same as 2D
—May not be as
good as 2D
analog
—Multiple modes
—Awkward
—Awkward
Other
—First-order
—Second-order
—Second-order
—Second-order
—Second-order
(**)
(**)
(*) The 20-node serendipity element, Q2^(20)Q1, popular in solid mechanics, is obtained by omitting all
midface nodes and the center node from Q2Q1.
(**)The linear case would typically use constant pressure and the quadratic case would use either
continuous or discontinuous linear pressure. A growing use of these elements is in boundary layers close to
no-slip boundaries when unstructured tetrahedra are used in the bulk of the domain. It is conjectured that
the 3D element is stable if the two 2D elements comprising it are.
sidestep/skirt/obviate each of these disadvantages. [The lowest-order continuous
pressure on quadrilaterals (Q1Q1) corresponds to 'unstaggered grids' in finite
difference methods, and the lowest-order discontinuous pressure (Q1Q0) (roughly) to
'staggered grids.']
(6) Quadrilateral elements are usually more accurate than triangular elements, all else
being equal; the latter often display mesh orientation effects—at least when using
regular/structured triangulations. For example, Zhu and Zienkiewicz (1988) saw a
need for ~2.5 times as many nodes using P2 as when using Q2, for equal accuracy.
Triangles should, we believe, be 'mixed up' by the mesh generator—to reduce grid
orientation error. See also Shubin and Bell (1984) for some FDM 'grid orientation'
effects.
(7) Triangular elements are usually more useful for describing truly complex
geometry. [A common(?) and good policy is to use triangles only where necessary,
with quadrilaterals used where feasible—clearly a challenging problem for mesh
generation—and one not yet 'solved' in 3D.]
(8) Quadratic approximation has (de facto?) been judged to be of 'high-enough'
order; these elements also do a fairly good job of matching/describing curved
boundaries via isoparametric approximation, and cubic elements are deemed (it
seems—'implicitly') to be just too complex/expensive.
(9) Bubble functions (and other internal nodes) can often be profitably 'condensed
out'/eliminated at element level. Indeed, Fortin has even opined 'that it is
worthwhile adding internal nodes to some elements, even if the order of convergence is
not increased,' to enhance stability (Fortin and Fortin, 1985a)—although this does
make the element construction more expensive.
(10) The Q1Q0 element is probably the most controversial and least well-understood of
all elements (but see Section 3.13.5), in spite of the fact that it has been the center
of focus of very many analyses; paradoxically, in 3D it may also be the element
most frequently employed (most 'element-hours' logged on computers). Note too
that an equally good name for it is Q1P0, which is indeed often used. A comment
of Ciarlet (1978) seems particularly relevant here: '... we shall simply emphasize
the fact that, for all practical purposes, nothing replaces the numerical experience
accumulated over the years by engineers.'
(11) Some of the elements listed can be generalized to arbitrary order; e.g., P^P^^-i)
for k > 2.
(12) A 'reduced' quadratic element, the Q2^(8)Q1, eliminating the center node ($x^2 y^2$), was
in fact the first of the higher-order quadrilateral elements. It was introduced by
Taylor and Hood (1973) and Hood and Taylor (1974)—rather than the Q2Q1, which
also often bears their names—because the eight-node quadratic element (called
'serendipity' by some—e.g., Zienkiewicz and Taylor, 1989) was then popular
in solid mechanics. C. Taylor later teamed up with one of the present authors
(PMG) and two others to show that the full quadratic element was worthy of
more serious consideration (Huyakorn et al., 1978)—a conclusion later supported
mathematically in Bercovier and Pironneau (1979). It is probably safe to say today
that the serendipity element should be passe in CFD—at least in 2D. And we now
feel the same about Q2Q1 when measured against Q2P-1. (See Section 3.13.6a.)
(13) Stability of higher-order 'Taylor-Hood' elements, $Q_k Q_{k-1}$ for $k \ge 3$, is proven in
Brezzi and Falk (1991).
(14) 'The Hexahedron. The element having six quadrilateral faces is in general a better
element in three dimensions than the tetrahedron'—Wait and Mitchell (1985)—a
'general' remark, not explicitly addressed to the NS equations.
(15) If one is (only) interested in solving potential flows via the mixed FEM, the
elements in the larger space, $H(\mathrm{div})$, could perhaps be used effectively. For a
discussion of these 'Raviart-Thomas' or 'Brezzi-Douglas-Marini' et al. elements,
see Brezzi and Fortin (1991). See too the numerical example at the end of this
chapter.
(16) For some recent nice results using an improved MINI element (rescaled bubble
function) in 2D, see Simo et al. (1995)—a paper which also discusses/summarizes
some modern 'alternative' approaches that we do not discuss (much) in this text:
SUPG, Galerkin-least squares, 'optimally dissipative' methods, etc.
(17) From Malkus (Appendix 4.II in Hughes, 1987) we quote, 'The reason that the
standard error estimates for incompressible elements steer the practitioner toward the
underconstrained and inconvenient elements is that the standard estimates demand
too much from a Lagrange multiplier (or related penalty pressure). The role of the
Lagrange multiplier has been seen to be a two-fold role of enforcer of the constraint
and of pressure solution. The choice made in all five of the safe elements is to
choose elements in which the role of enforcer has to some extent been sacrificed
to avoid pressure modes.' The five 'safe elements' are: P2P0, P2P1, Q2Q0, Q2P-1,
and Q2Q1.
(18) Attempts have been made to quantitatively measure an element's quality by a
seemingly appropriate counting of constraints. Based on the fact that in the continuum
there exists one (vector) momentum conservation equation and one (scalar) mass
conservation equation at every point in the fluid, a similar 'counting' for the discrete
case may be made, with a qualitative judgment then following by asserting that
an element is 'good' if the constraint ratio (mass/momentum) is close to unity and
'bad' if far from it. (See, for example, Gresho et al., 1980b, and Hughes, 1987, and
references therein.) For example, Q1Q0 has, on average, one momentum equation
and one continuity equation per element, so that in this sense it is perfect. But so
too is Q1Q1, an element not even listed in the tables because of its plethora of
spurious pressure modes (defined in the next section). One more example: Q2Q1 has
a constraint ratio of only 1/4 in 2D and 1/8 in 3D, whereas Q2P-1, a 'better' element
in most practitioners' opinion (we believe), has 3/4 in 2D and 1/2 in 3D, which is
also better. Our current position on the constraint ratio notion is that it provides,
at best, a first-order feel regarding the potential utility of an element—and for this
reason, we leave to the reader the preparation of a detailed 'comparison' table. We
do believe, however, that 'good' elements will have a ratio not too far from unity
and certainly not too large compared with unity because of possible 'locking';
i.e., if the ratio exceeds two in 2D and three in 3D, there will be more constraint
equations than can be satisfied by the available momentum equations.
(19) For all elements, the GFEM generates basically centered difference approximations
to the (nonlinear) advection term which, as for AD in the previous chapter, can be
wiggle-prone. As in the linear case of AD, we are still believers in 'wiggle signals'
over 'smooth is beautiful.' A re-read of the wiggle Section (2.6.1a) may be useful
at this point, just before we offer a supporting opinion from the FDM side of the
house—in which a dissipative and 'monotone' Godunov advection scheme and a
virtually non-dissipative centered-difference scheme are compared and discussed.
In Brown and Minion (1995) we find the following: (1) '... our computations cast
doubt on the validity of the proposition that the numerical dissipation mechanisms
in Godunov-projection methods mimic the physical dissipation'—they show that
some physically-reasonable-looking vortices on grids that are relatively too coarse
are spurious; (2) 'Whether or not under-resolved Godunov-projection computations
are useful is certain to be a controversial issue'; (3) 'In addition, since the centered
methods fail rather badly in the under-resolved case, it is somewhat easier to know
when one is properly resolving the computed solutions for those methods'—an
understatement that we freely translate as 'ye Olde Wiggle Signal'.
(20) For some new ideas related to LBB-stable low-order elements, via a macro-element
approach, see Nafa and Thatcher (1993).
(21) For some new ideas for 'ranking' elements, see Section A3.3.7 of Appendix 3.
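The constraint ratios quoted in remark (18) can be reproduced with a few lines of counting. The sketch below is a rough illustration, not a procedure from the text: it assumes a large structured mesh of quadrilaterals or bricks with boundary effects ignored, counts the vector momentum equation once per velocity node (as the text does), and uses function names invented here.

```python
from fractions import Fraction
from math import comb


def continuous_q_unknowns(degree, dim):
    """Average nodes per element of a continuous Q_degree field on a large
    structured mesh (shared nodes counted once, boundaries ignored)."""
    return degree ** dim


def discontinuous_q_unknowns(degree, dim):
    """Unknowns per element of a discontinuous Q_degree field."""
    return (degree + 1) ** dim


def discontinuous_p_unknowns(degree, dim):
    """Unknowns per element of a discontinuous complete polynomial P_degree."""
    return comb(degree + dim, dim)


def constraint_ratio(pressure_unknowns, velocity_degree, dim):
    """Continuity equations per (vector) momentum equation."""
    return Fraction(pressure_unknowns, continuous_q_unknowns(velocity_degree, dim))


def may_lock(ratio, dim):
    """More scalar constraints than scalar velocity unknowns (ratio > dim)."""
    return ratio > dim


def ratios(dim):
    return {
        "Q1Q0": constraint_ratio(discontinuous_q_unknowns(0, dim), 1, dim),
        "Q1Q1": constraint_ratio(continuous_q_unknowns(1, dim), 1, dim),
        "Q2Q1": constraint_ratio(continuous_q_unknowns(1, dim), 2, dim),
        "Q2P-1": constraint_ratio(discontinuous_p_unknowns(1, dim), 2, dim),
    }


if __name__ == "__main__":
    for dim in (2, 3):
        for name, ratio in ratios(dim).items():
            print(f"{dim}D {name}: ratio {ratio}, may lock: {may_lock(ratio, dim)}")
```

The output matches the values in remark (18): unity for Q1Q0 and Q1Q1, 1/4 and 1/8 for Q2Q1, and 3/4 and 1/2 for Q2P-1. It also makes plain why the ratio is only a first-order indicator, since Q1Q1 scores as well as Q1Q0.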
After reading all of this, the reader may/should (?) ask, 'So what? What are the answers
to the questions raised at the beginning of this section?' Somewhat apologetically, we
answer, 'We wish we knew!!' There is definitely more work to be done.
We do believe, however, that we can provide some guidance for new code writers who
do not wish to generate and provide huge 'element libraries': In 2D, the triangular elements
P2+P1 and P2+P-1 are very good, as is the Q2P-1 isoparametric quadrilateral, which
is probably the most accurate 2D element. If you wish to avoid bubble functions on
triangular elements, P2P1 is not bad, and P2(P1 + P0) is even better, although somewhat
more complicated. In 3D, the situation is (still) much less clear, partly because not many
3D simulations have been made with a wide variety of elements (such testing is difficult
and expensive, but nevertheless needed). Also, mesh generation capabilities can and do
enter in, often in a big way. If truly unstructured meshes are to be used, then tetrahedral
elements are a near necessity, for which the low-order MINI element is not too bad;
but the second-order elements, P2P1 and P2(P1 + P0), are better. For those who prefer
hexahedra, perhaps reverting to 'tets' or wedges where needed for complex geometry, try
Q2P-1 if you need LBB stability and/or want second-order accuracy. [If you're not afraid
of pressure modes and your geometry is always sufficiently complex that the probability
of their occurrence is small, you might also want to try Q2Q-1.] Finally, we opine that
Q1Q0 is still a competitive element in both 2D and 3D and, at least after assimilating our
many discussions about it in this book, the new code developer should probably put it in
his/her library.
*b. Null spaces and their effects; pressure modes
o Introduction. Another 'negative' virtue of incompressible flows stems from the facts
that the velocity part of the solution lives in the (large) null space (also called the kernel)
of a differential operator—the divergence—and the pressure part, depending on BC's,
can contain a component from the small (one-dimensional at most) null space of another
differential operator, the gradient. These properties of the solution should be, but are not
always, mimicked properly by the corresponding matrices (discrete operators) of the
approximate solution. It turns out that the discrete approximations are up against two
serious problems: (i) the approximate velocity, while (ultimately) living in a discretely
divergence-free space, is in a space that is not a subspace of the continuous divergence-
free space, and (ii) the null space dimension of the approximate pressure gradient is
often larger (sometimes much larger) than it should be; it should have dimension 0 or
1 depending on BC's. These two issues/facts permeate completely the entire subject of
approximate solutions for incompressible flow, and their effects are powerful, profound,
often not well-understood, and sometimes devastating; e.g., not all seemingly reasonable
approximations 'work'—a statement that is not limited to FEM, applying as it does to
virtually every approximation method that has ever been tried. These issues also limit
our understanding of approximation methods. But we shall, for the most part, stick to
the FEM version of these issues, and begin this by noting that the problems are 'caused'
(if indeed blame can be placed anywhere) by the use of 'mixed interpolation' (or mixed
method) in a different sense than before; i.e., both velocity and pressure are to be
approximated, because our velocity basis functions are generally (but see Section 3.13.7) not
discretely divergence-free. In an attribution (we believe) to G. Strang, we may say that
'mixed interpolation brings mixed blessings': the approximation of P as well as u permits
us to approximate the latter using basis functions that are not (discretely) divergence-
free, which provides much (too much?) more latitude than otherwise. But this additional
'breathing room' brings with it a plethora of new problems that is the subject of this
section: spurious/extraneous null spaces that are filled with spurious pressure modes that
introduce spurious/extraneous/redundant solvability constraints and sometimes reduce the
convergence rate (when solutions exist!). [The finite difference version of 'selection of
elements/basis functions' goes (roughly) over to, 'How "large" a stencil (how many node
points) and of what type (staggered, colocated, etc.) should I use for velocity and ditto
pressure in order to preclude odd-even decoupling?' But our discussion of the issues via
the language of linear algebra, below, surely covers the finite difference method as well
as the finite element method; we simply omit details regarding the former.] The choice
for velocity-pressure 'pairs' is far from arbitrary.
We start simple—with linear equations—and in some sense end there, because these
issues are, thankfully, independent of Reynolds number/advection. The discrete forms of
the potential flow equations, $Mu + CP = f$ and $C^T u = g$, or the transient NS equations,
written as $M\dot{u} + CP = f(u)$ and $C^T \dot{u} = g$, or the steady Stokes equations, $Ku + CP = f$
and $C^T u = g$, all present a linear algebra (saddle-point) problem (see Section 3.15) of
the form
\[ \begin{pmatrix} B & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} u \\ P \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}, \tag{3.13-58} \]
where B is either M or K (or, when time-marching via implicit treatment of the viscous
terms, a linear combination of the two) and is thus $n \times n$ and SPD; C is $n \times m$ with
$n > m$; $u, \dot{u}, f \in \mathbb{R}^n$; and $P, g \in \mathbb{R}^m$.
o A digression. It may be well to show why we need n > m, which will be an
algebraic proof that we must choose our velocity and pressure basis functions so that there
are more momentum equations than continuity equations—thus precluding us from even
considering 'bizarre'/silly cases such as bilinear velocity and biquadratic pressure ($Q_1Q_2$).
First, in words: if m > n, then there are more constraints on the velocities than there are
velocities. In mathematics: m > n implies that the number of vectors in the column space
of C is larger than the dimension of the space, which implies linear dependence among
the columns. The result would be that the matrix $A = \begin{pmatrix} B & C \\ C^T & 0 \end{pmatrix}$ is singular and that the
system (3.13-58) is generally inconsistent. Thus we henceforth assume that m < n. End
digression.
Since B is non-singular, the above system of linear algebraic equations is of full rank,
which ensures the existence of an inverse and thereby a unique solution to (3.13-58), if
and only if C is of full rank—which is true if the rank of C is m. If C has full rank,
then Cq = 0 has but one solution: q = 0. But, unfortunately, there are many situations in
which the rank of C is less than m (C is then said to be rank-deficient), and this causes
A to be singular (having one or more zero eigenvalues) and the solution to Ax = b, if it
exists, to be non-unique. [Here, $x^T = (u^T, P^T)$ and $b^T = (f^T, g^T)$.] We consider some of
these situations, and their effects, in this section.
Before getting any deeper into these 'null space issues,' let us digress briefly to note
a general property of the spectrum of the A-matrix above (see, for example, Bank et al.,
1990). This 'property' is obtained in two steps. The first step is to note that A can be
rewritten, via a so-called congruence transformation, as follows:
\[ A = \begin{pmatrix} B & 0 \\ C^T & I \end{pmatrix} \begin{pmatrix} B^{-1} & 0 \\ 0 & -C^T B^{-1} C \end{pmatrix} \begin{pmatrix} B & C \\ 0 & I \end{pmatrix}, \tag{3.13-59} \]
and the second is an application of Sylvester's 'law of inertia' (e.g., Strang, 1976,
p. 259): the matrix in the middle, $\begin{pmatrix} B^{-1} & 0 \\ 0 & -C^T B^{-1} C \end{pmatrix}$, has the same number of positive
eigenvalues, the same number of zero eigenvalues, and the same number of negative
eigenvalues, as does A. Thus, since we have a block-diagonal matrix, all we need to
know are the spectra of $B^{-1}$ and $C^T B^{-1} C$. (The matrix $-C^T B^{-1} C$ is referred to as
the Schur complement of B in the matrix A; see, for example, p. 58 of Golub and Van
Loan, 1983.) The former is easy since B (and thus $B^{-1}$) is SPD: it has exactly n positive
eigenvalues. The second part is a little harder, and we rely in part on some results due to
D. Malkus (1992, personal communication; but see too his 1981 paper): if C is of full
rank, then $C^T B^{-1} C$ has exactly m positive eigenvalues; but if C is rank deficient, say
rank $C = r < m$, then $C^T B^{-1} C$ has exactly r positive eigenvalues and exactly $m - r$
zero eigenvalues. Thus, we have the general result for matrix A: it has exactly n positive
eigenvalues, exactly $m - r$ zero eigenvalues, and exactly r negative eigenvalues. (Note
that the above characterization applies in general to symmetric saddle-point problems.)
And it is, of course, the case when $r < m$ that is of interest herein, because that is when
A exhibits a non-trivial null space. In all cases, A is definitely indefinite. A 'look ahead'
to Figure 3.13-13 in the next section shows a 'picture' of this eigenvalue distribution
(wherein $m - r = k$, the dimension of the null space of C).
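The eigenvalue count stated above can be checked numerically. The following is a minimal sketch, assuming small random stand-ins for B and C (the sizes, the random seed, and the helper name `saddle_point_inertia` are illustrative choices, not anything taken from the text):

```python
import numpy as np

def saddle_point_inertia(B, C, tol=1e-10):
    """Return (#positive, #zero, #negative) eigenvalues of A = [[B, C], [C^T, 0]]."""
    m = C.shape[1]
    A = np.block([[B, C], [C.T, np.zeros((m, m))]])
    eig = np.linalg.eigvalsh(A)              # A is symmetric: real eigenvalues
    scale = np.abs(eig).max()
    pos = int(np.sum(eig > tol * scale))
    neg = int(np.sum(eig < -tol * scale))
    return pos, len(eig) - pos - neg, neg

rng = np.random.default_rng(0)
n, m = 6, 3
X = rng.standard_normal((n, n))
B = X @ X.T + n * np.eye(n)                  # SPD stand-in for M or K
C_full = rng.standard_normal((n, m))         # full rank, r = m = 3
C_def = C_full.copy()
C_def[:, 2] = C_def[:, 0] - 2.0 * C_def[:, 1]    # force rank r = 2 < m

print(saddle_point_inertia(B, C_full))       # theory: (n, m - r, r) = (6, 0, 3)
print(saddle_point_inertia(B, C_def))        # theory: (n, m - r, r) = (6, 1, 2)
```

The relative tolerance used to classify an eigenvalue as zero is an arbitrary choice; on badly scaled matrices it would need adjusting.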
o Another digression. One more small digression is in order before we leap into the
subject of pressure modes—both pure and impure; and it too is simply a 'review' of (and
application of) linear algebra:
Theorem:
The linear system Ax = b where A is n x n (and real) and b a given n -vector has
a solution if and only if b is orthogonal to all vectors in the null space of AT, and
when the solution does exist, it is only unique if A (and thus AT) has no null space.
A proof of this theorem is both useful and reasonably simple; thus we present it, with
due thanks to A. Hindmarsh for the second part. (1) The vectors in the null space of $A^T$,
say $z_i$, $i = 1, 2, \ldots, k$, where k is the dimension of the null space (which is $n - r$ where
r is the rank of A), each satisfy (by definition) $A^T z_i = 0$. (2) Multiply $Ax = b$ by $z_i^T$ to
obtain $z_i^T A x = z_i^T b = x^T A^T z_i = 0$, and thus a necessary condition for Ax to equal b is
$z_i^T b = 0$ for $i = 1, 2, \ldots, k$. If $z_i^T b \ne 0$, then b is not in the range of A; b cannot be
'reached' by the operation of A on any vector in its domain ($\mathbb{R}^n$). (3) For sufficiency,
we start with $z_i^T b = 0$, and form the residual, $r_e = Ax - b$. (4) Consider the quadratic
form/functional, $J = r_e^T r_e$, and consider its minimum: $\partial J/\partial x = 0 \Rightarrow 2A^T r_e = 0$, which
implies that $r_e \in N(A^T)$, which implies that $r_e$ is necessarily a linear combination of the
$\{z_i\}$. (5) Also, $z_i^T r_e = z_i^T A x - z_i^T b = 0$ because $A^T z_i = 0$ and $z_i^T b = 0$. (6) Hence, $r_e \perp z_i$
and thus $r_e \perp r_e$, and we obtain $r_e = 0$ because only the zero vector is orthogonal to itself.
This proves sufficiency. (7) Thus there then exists a solution to $Ax = b$. It is given by
$x = x_p + \sum_{j=1}^{k} a_j y_j$, where $x_p$ is a particular solution and each $y_j$ is a null vector for A (i.e.,
$A y_j = 0$, $j = 1, 2, \ldots, k$), and the scalar coefficients $\{a_j\}$ are completely arbitrary, and
the proof is complete. Note that while our original A was symmetric, the more general
theory (for unsymmetric A) was presented because we shall soon (and temporarily!?)
come across situations where A is of the form $A = \begin{pmatrix} B & G \\ D & 0 \end{pmatrix}$, where G approximates grad,
D approximates -div, and $D^T \ne G$. But for our current case, (3.13-59), $A = A^T$, and thus
the simpler result, $z_i = y_i$, $i = 1, 2, \ldots, k$, obtains.
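The theorem lends itself to a direct numerical check. The sketch below is illustrative only: the 3 x 3 matrix, the right-hand sides, and the helper names `null_space` and `is_solvable` are assumptions made for the example, and the null space is obtained from the SVD rather than by any method described in the text.

```python
import numpy as np

def null_space(A, tol=1e-10):
    """Orthonormal basis (columns) of the null space of A, from the SVD."""
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol * s.max()))
    return vt[rank:].T

def is_solvable(A, b, tol=1e-8):
    """True if b is orthogonal to every null vector z_i of A^T."""
    Z = null_space(A.T)
    return bool(np.all(np.abs(Z.T @ b) <= tol * max(1.0, np.linalg.norm(b))))

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0],
              [1.0, 3.0, 4.0]])              # row 3 = row 1 + row 2, so rank 2
b_good = A @ np.array([1.0, -1.0, 2.0])      # lies in the range of A
b_bad = np.array([1.0, 0.0, 0.0])            # not orthogonal to z = (1, 1, -1)

# A particular solution plus any multiple of a null vector y of A also solves Ax = b.
x_p = np.linalg.lstsq(A, b_good, rcond=1e-10)[0]
y = null_space(A)[:, 0]
print(is_solvable(A, b_good), is_solvable(A, b_bad))
print(np.allclose(A @ (x_p + 3.0 * y), b_good))
```

Because this A is unsymmetric, the null vectors of A and of its transpose differ, which is the distinction the proof keeps track of.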
o The unsymmetric case. We are now ready to apply this linear algebra theory to
(3.13-58)—but first let us do so for the more general (and troublesome, and not often
used—at least in FEM) unsymmetric version,
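\[ \begin{pmatrix} B & G \\ D & 0 \end{pmatrix} \begin{pmatrix} u \\ P \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}, \tag{3.13-60} \]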
before which we note that we cannot even find a congruence transformation—and that
does not even matter because even if we could, we could not apply Sylvester's law of
inertia because it only applies to symmetric matrices. So we do not even have a good
handle on the spectrum of A in this unsymmetric case—although it is probably a safe bet
to assert that there are still at least n positive eigenvalues because of B.
Because B is invertible, it is possible to eliminate u in favor of a single equation for P,
\[ (DB^{-1}G)P = DB^{-1}f - g, \tag{3.13-61} \]
an equation that is also useful, although usually only for theoretical purposes (unless
$D = C^T$, $G = C$, $B = M$, and the mass is lumped, in which case $B^{-1}$ is no longer dense,
and (3.13-61) corresponds to a Poisson equation that is also useful in computations, as we will
see soon). The symmetric version of this equation, from (3.13-58), is
\[ (C^T B^{-1} C)P = C^T B^{-1} f - g. \tag{3.13-62} \]
The dimension of the null spaces of the matrices $DB^{-1}G$ or $C^T B^{-1} C$ is the same as
that of A, namely, $k = m - r$; that of the former is at least as large as that of G, and for
the symmetric case, see Remark (5) below.
Application of the existence/uniqueness theory presented above to (3.13-60) leads to
the following:
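1. Existence. A solution exists if and only if $w_i^T f + r_i^T g = 0$, where $(w_i^T, r_i^T)^T$ is the i-th null
vector of $A^T$: $Bw_i + D^T r_i = 0$ and $G^T w_i = 0$, $i = 1, 2, \ldots, k$ [which $\Rightarrow G^T B^{-1} D^T r_i = 0$;
cf. (3.13-61)].
2. Uniqueness. When a solution exists, it is not unique when $k > 0$, being given by
\[ \begin{pmatrix} u \\ P \end{pmatrix} = \begin{pmatrix} u_p \\ P_p \end{pmatrix} + \sum_{i=1}^{k} a_i \begin{pmatrix} v_i \\ q_i \end{pmatrix} \]
for arbitrary $a_i$, where $(u_p^T, P_p^T)^T$ is a particular solution of (3.13-60), and $(v_i^T, q_i^T)^T$ is the i-th null
vector of A: $Bv_i + Gq_i = 0$ and $Dv_i = 0$, $i = 1, 2, \ldots, k$.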
Remarks:
(1) $v_i = -B^{-1}Gq_i$ ($q_i \ne 0$, necessarily), and if we have the special case of $q_i \ne 0$ yet
$Gq_i = 0$, we see that $v_i = 0$ and, by definition, we have a pure pressure mode, or,
more simply and hereafter, a pressure mode; i.e., $q_i$ is then called a pressure mode, a
mode (eigenvector) that corresponds to a zero eigenvalue but has no component in the
velocity 'portion.' Pressure modes thus affect only the pressure part of the solution of
(3.13-60), when such solutions exist, which, from above, still requires $w_i^T f + r_i^T g = 0$:
the data must be orthogonal to the corresponding adjoint eigenvector, which vector
will generally not be an analogous 'pure pressure mode'; i.e., $w_i \ne 0$ in general (but
it could be zero).
(2) If $v_i \ne 0$ and if a solution exists, then the velocity is also polluted, and it is probably
the case that (3.13-60) does not represent a valid approximation to the original
continuum equations. An example of this type will be presented at the end of this
section.
(3) The only physically relevant/meaningful (non-spurious) null vector is the pressure
mode called the hydrostatic pressure mode, $q_i = P_H$: $P_H^T = (1, 1, \ldots, 1)$, a constant
vector. The existence of $P_H$ is assured whenever G properly approximates the
gradient operator, both in $\Omega$ and on $\Gamma$. If (and only if) the corresponding adjoint
eigenvector, $r_H$, is also a constant vector and if $w_H = 0$ and $D^T r_H = 0$ (which
would, of course, be true if it too were a pure pressure mode, which seems to
also require that $D^T$ also approximate the gradient operator, or at least
annihilate constant vectors), then a meaningful approximate solution could be obtained. If
$P_H$ exists (i.e., if G is a gradient everywhere) and an NBC is used as an OBC (i.e.,
$u \cdot n$ is not specified on the OBC portion of $\Gamma$), then $P_H$ itself is (usually) a spurious
hydrostatic pressure mode, a subject we shall return to below.
(4) All null vectors except $P_H$ are spurious numerical artifacts; they are extraneous and
have no analogs in the continuous case.
(5) The null space of $C^T B^{-1} C$ is the same as that of C. (Proof: $C^T B^{-1} C x = 0 \Rightarrow$
$x^T C^T B^{-1} C x = 0 \Rightarrow z^T B^{-1} z = 0$, where $z = Cx$. But $B^{-1}$ is SPD and thus $z = 0$.)
(6) The corresponding solvability constraint in the pressure only, a la (3.13-61), is easily
seen to be $r_i^T(DB^{-1}f - g) = 0$.
(7) The null vectors of the adjoint system affect the existence of a solution, while those
of the original system affect the 'quality' of the solution.
(8) An example of such an unsymmetric system is presented later in this section, and
another is in Section 3.13.4b.
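Remark (5) can be illustrated with a small numerical experiment. This is a minimal sketch under stated assumptions: B and C are random stand-ins (not finite element matrices), the rank deficiency is forced by hand, and `numerical_rank` is a hypothetical helper with an arbitrary relative tolerance.

```python
import numpy as np

def numerical_rank(M, rel_tol=1e-10):
    """Number of singular values above rel_tol times the largest one."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(1)
n, m = 7, 4
X = rng.standard_normal((n, n))
B = X @ X.T + n * np.eye(n)              # SPD stand-in for M or K
C = rng.standard_normal((n, m))
C[:, 3] = C[:, 0] + C[:, 1]              # rank 3: C q = 0 for q = (1, 1, 0, -1)
q = np.array([1.0, 1.0, 0.0, -1.0])

S = C.T @ np.linalg.solve(B, C)          # C^T B^{-1} C, as in (3.13-62)

print(numerical_rank(C), numerical_rank(S))
print(np.allclose(C @ q, 0.0), np.allclose(S @ q, 0.0, atol=1e-10))
```

Both ranks agree and the same vector q annihilates C and the pressure matrix, as the short proof in Remark (5) requires.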
o The symmetric case. Let us now apply the same theory to the symmetric case, (3.13-58),
to see, in part, how much more 'attractive' it is. And it makes more sense to reverse the
order of presentation—uniqueness first.
1. Uniqueness. Suppose that $(v^T, q^T)^T$ is a null vector of (3.13-58):
\[ \begin{pmatrix} B & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} v \\ q \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]
Take the inner product of this equation with $(v^T, q^T)^T$ to obtain $v^T B v + v^T C q + q^T C^T v = 0$.
But $C^T v = 0$, and thus $v^T C q = q^T C^T v = 0$, and we are left with $v^T B v = 0$ and $Cq = 0$.
But since B is SPD, we are led to the important result that $v = 0$; any and all null vectors
in the symmetric case are (pure) pressure modes. Thus, the velocity solution will always
be unique, if a solution exists at all, which we address next.
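2. Existence. Since only pressure modes can be present when C is rank deficient, the
solvability condition is simpler:
\[ q_i^T g = 0, \quad i = 1, 2, \ldots, k = m - r. \tag{3.13-64} \]
When these constraints are satisfied, the solution of (3.13-58) is
\[ \begin{pmatrix} u \\ P \end{pmatrix} = \begin{pmatrix} u_p \\ P_p \end{pmatrix} + \sum_{i=1}^{k} a_i \begin{pmatrix} 0 \\ q_i \end{pmatrix}, \]
where $Cq_i = 0$ and the $\{a_i\}$ are arbitrary.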
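Both results for the symmetric case (null vectors are pure pressure modes; the velocity is unique when the data satisfy the constraint) can be seen in a small experiment. The sketch assumes random stand-ins for B and C with one hand-made pressure mode q; the minimum-norm least-squares solve is simply a convenient way to pick one particular solution and is not a method taken from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 3
X = rng.standard_normal((n, n))
B = X @ X.T + n * np.eye(n)                  # SPD
C = rng.standard_normal((n, m))
C[:, 2] = C[:, 0] - C[:, 1]                  # C q = 0 for q = (1, -1, -1)
q = np.array([1.0, -1.0, -1.0])
A = np.block([[B, C], [C.T, np.zeros((m, m))]])

# Null vector of A: right singular vector belonging to the zero singular value.
z = np.linalg.svd(A)[2][-1]
v_part, q_part = z[:n], z[n:]                # v_part should vanish

# Consistent data: g = C^T u_true automatically satisfies q^T g = 0.
u_true = rng.standard_normal(n)
P_true = rng.standard_normal(m)
f = B @ u_true + C @ P_true
g = C.T @ u_true
x = np.linalg.lstsq(A, np.concatenate([f, g]), rcond=1e-10)[0]
u, P = x[:n], x[n:]
dP = P - P_true                              # differs only by a multiple of q
dP_off_mode = dP - (dP @ q) / (q @ q) * q

g_bad = g + q                                # violates (3.13-64): q^T g_bad = 3
print(np.abs(v_part).max(), np.abs(u - u_true).max(), np.abs(dP_off_mode).max())
print(q @ g, q @ g_bad)
```

With `g_bad` the system has no solution at all; a least-squares solver would still return numbers, which is one reason to test the constraint explicitly.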
Remarks:
(1) Recall the origin of g: it came from the weak form of $\nabla \cdot u = 0$ applied to the
specified boundary velocities, (3.13-27).
(2) Again, only the hydrostatic pressure mode is 'physical'; in this case, (3.13-64) is the
same constraint given in our earlier GFEM formulation of the NS equation (3.13-31),
and corresponds to/represents the requirement that a discrete global mass balance
be assured by the Dirichlet BC's on the normal velocity (the only BC for which PH
exists). Associated with the hydrostatic mode is the fact that one (any one) of the m
continuity equations is redundant; each equation of CTu = g could be represented as
a linear combination of all (m — 1) others. The existence of PH in this situation, and
the existence of the associated redundancy (by one) in the continuity equations, is
completely proper and physical—and it has a simple (and also physical) analog in,
for example, heat transfer for the Poisson equation that may be useful to state: if a
steady temperature field is sought from $\nabla^2 T = -Q$ in $\Omega$, $\partial T/\partial n = -q$ on $\Gamma$, where Q
is an internal heat source, and q is the specified heat flux removed from $\Omega$ through
$\Gamma$, then it is well known that: (i) a solution exists if and only if the data satisfy
the following solvability condition (heat balance): $\int_\Omega Q = \oint_\Gamma q$; (ii) the temperature
level is arbitrary, up to an arbitrary additive constant; and (iii) the GFEM analog
generates $KT = b$ with $\det K = 0$ and $KT_H = 0$ where $T_H$ is any constant vector
(the 'hydrostatic'/thermostatic temperature?), so that a solution exists if and only if
$T_H^T b = 0$ (the discrete global heat balance), and there is one redundant 'heat balance'
equation because the applied BC's (properly) duplicate the global heat balance that
is obtained by summing all of the discrete equations. It is also well known that,
when solvable, the arbitrariness can be removed by setting the temperature at any
node to any desired value and omitting the corresponding equation in KT = b.
(3) All other pressure modes, and their concomitant solvability constraints a la (3.13-64),
are spurious—and, of course, are not constant vectors. This is the more serious side
of pressure modes; i.e., whereas some could be content with good velocities and
good pressure gradients [arguing (meekly?/weakly?) that only gradients are needed
anyway—usually], few if any would like to be saddled with the extra (and spurious,
non-physical) constraints on allowable velocity BC's engendered by (3.13-64).
(4) As is the case for the hydrostatic pressure mode, each spurious mode also implies a
redundancy in the continuity equations; presumably again each continuity equation
could be obtained as a linear combination of all of the others. This of course says
that there is a k-fold redundancy in the continuity equations; we have k more than
we need. Each pressure mode (spurious or not) decreases the number of independent
constraints by one.
(5) The corresponding solvability constraint from the (symmetric) pressure
equation (3.13-62), is
\[ q_i^T(C^T B^{-1} f - g) = 0, \]
which (appropriately) is just (3.13-64) again because $q_i^T C^T = 0$, by definition.
(6) Besides depending on the BC's employed, pressure modes are 'mesh-dependent'; i.e.,
their occurrence/existence is sometimes obviated simply by changing the mesh from
'too regular' to less regular, a result that seems to properly suggest that redundancy
of continuity equations is more likely to occur on regular, structured meshes than on
general meshes of distorted isoparametric elements. This of course is another feature
that tends to lead to the conclusion/suggestion that only stable elements should be
used. But the fact is that some unstable elements are used daily, and with success;
thus, we shall not take the easy way out and restrict further consideration to
elements that never display even a single spurious mode.
(7) The pressure modes considered herein are so-called 'global' modes. There also exist
'local' pressure modes (redundant constraints on the element, or macro-element,
level; e.g., criss-cross $P_1P_0$ or $P_2P_{-1}$ of Table 13.3-1); for discussion of these local
modes, see Malkus's Appendix 4.II in Hughes (1987), Brezzi and Fortin (1991), and
Qin (1994).
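The heat-conduction analog in Remark (2) is easy to reproduce in one dimension. The sketch below assumes linear elements on the unit interval, a constant source Q0, and fluxes q_left and q_right removed at the two ends; these names, the mesh, and the numbers are illustrative assumptions.

```python
import numpy as np

def neumann_system(n_el, Q0, q_left, q_right):
    """Assemble K T = b for -T'' = Q0 on (0, 1) with dT/dn = -q at both ends."""
    h = 1.0 / n_el
    K = np.zeros((n_el + 1, n_el + 1))
    b = np.zeros(n_el + 1)
    for e in range(n_el):
        K[e:e + 2, e:e + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        b[e:e + 2] += Q0 * h / 2.0           # consistent load for constant Q0
    b[0] -= q_left                           # heat removed through the boundary
    b[-1] -= q_right
    return K, b

K, b = neumann_system(n_el=8, Q0=2.0, q_left=0.5, q_right=1.5)
T_H = np.ones(K.shape[0])                    # the constant ('thermostatic') mode

heat_balance = T_H @ b                       # zero when source equals flux removed

# Remove the arbitrariness: pin T at node 0 to zero and drop equation 0.
T = np.zeros_like(b)
T[1:] = np.linalg.solve(K[1:, 1:], b[1:])
residual = K @ T - b                         # the dropped equation holds as well

print(np.abs(K @ T_H).max(), heat_balance, np.abs(residual).max())
```

If the data violated the heat balance, the pinned system would still be solvable, but the omitted equation would then be violated by exactly the imbalance.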
So which elements display spurious pressure modes and why, and what do the little
devils, and their constraint equations, look like? Also, should spurious modes preclude an
element from being used? We can only partially answer these questions, which may be
just as well, and this we try to do in the remainder of this section—incompletely, partly
intentionally.
First, we point out that all equal-order elements generate 'too many' (i.e., redundant)
mass balance constraints and thus possess spurious modes, whose descriptions are only
known for a very few—and this accounts for the predominance of 'mixed interpolation'
elements in which the pressure basis functions are polynomials of one (or more) degree(s)
lower than those for velocity—a 'trick' which reduces (as desired/necessary) the number
of constraint equations. Such elements have either few or no spurious modes, in contrast
to equal-order elements that have (too) many.
Next, we mention [again; see Remark (4) following (3.13-37)] that the unsymmetric
case, if generated by the method referred to above (of not integrating $\nabla P$ by parts),
produces a hydrostatic pressure mode even at an outflow boundary (or wherever
NBC's are applied). This hydrostatic mode is spurious, as is its associated solvability
condition, even if the element in question is free of pressure modes when $\nabla P$ is integrated
by parts, as was mentioned earlier and will be demonstrated later.
o Bilinear velocity. We now present an extended discussion on the derivation of pressure
modes for the bilinear element, for both piecewise discontinuous pressure ($Q_1Q_0$) and
continuous bilinear pressure ($Q_1Q_1$), the latter being one of the simplest examples of an
equal-order element and presented in just enough detail (which, unfortunately, is a lot of
detail) to clearly make our point that equal order is rarely if ever viable. We shall show
one way to find the null vectors called pressure modes, from the symmetric formulation,
(3.13-58). In contrast to the original presentation (which delivered only partial results for
$Q_1Q_1$) in Sani et al. (1981a) employing more 'local' analysis techniques, which were then
'translated through the mesh' to obtain the needed 'global' results, here we shall derive
(some of) these global results directly, using simple linear algebra—on simple meshes
(another severe restriction needed to obtain analytical results) with only simple BC's
(Dirichlet). But simple cases are often good enough to elucidate the issues—simply. The
analysis is in fact so simple that it is (usually) restricted to meshes of uniform rectangles.
Our technique may thus also be directly applicable to finite difference or finite volume
methods.
o Piecewise-constant pressure. The methodology to be used here, which was first presented
in summary form in Shih and Gresho (1985), starts with the definition of a pressure
mode, $CP = 0$, and seeks non-trivial null vectors {P} that satisfy both components of this
equation. For a mesh of $l \times h$ rectangles, the $Q_1Q_0$ pressure mode equations at node i, j
of Figure 3.13-3 are (see Appendix 1)
\[ C_x P = 0 = -\frac{h}{2}\left[(P_{i+1,j+1} - P_{i,j+1}) + (P_{i+1,j} - P_{i,j})\right] \]
and
\[ C_y P = 0 = -\frac{l}{2}\left[(P_{i,j+1} - P_{i,j}) + (P_{i+1,j+1} - P_{i+1,j})\right], \]
which are obviously satisfied by $P_H = (1, 1, \ldots, 1)^T$, the hydrostatic pressure mode. But
another way to satisfy these equations is to have both $P_{i,j} + P_{i,j+1} = 0$ from the first
equation and $P_{i,j} + P_{i+1,j} = 0$ from the second. Seeking a solution to this set of difference
equations via $P_{i,j} = a^i b^j$ leads easily to $a = b = -1$, and we obtain
\[ P_{i,j} = (-1)^{i+j} \tag{3.13-66} \]
as the simplest representation of the most (in)famous spurious pressure mode of them all:
the checkerboard (CB)-mode, taking on as it does the value of +1 on red (say) elements
and -1 on black. We defer to, for example, Stephens et al. (1984) to pronounce that there
are no other pressure modes for this element. We also defer, briefly, the interaction of
this pressure mode with velocity BC's.
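The two difference equations above can be assembled for every interior node of a small mesh and their null space inspected directly. The following is a sketch under stated assumptions: a uniform mesh of nx by ny elements with velocity fixed on the whole boundary (so only interior nodes contribute rows), row scalings of h/2 and l/2 that do not affect the null space, and hypothetical helper names.

```python
import numpy as np

def q1q0_mode_operator(nx, ny, l=1.0, h=1.0):
    """Rows C_x and C_y at each interior node; columns are element pressures P[i, j]."""
    def idx(i, j):
        return i * ny + j
    rows = []
    for i in range(nx - 1):                  # interior node shared by elements
        for j in range(ny - 1):              # (i, j), (i+1, j), (i, j+1), (i+1, j+1)
            cx = np.zeros(nx * ny)
            cy = np.zeros(nx * ny)
            cx[idx(i + 1, j + 1)] = cx[idx(i + 1, j)] = -h / 2
            cx[idx(i, j + 1)] = cx[idx(i, j)] = h / 2
            cy[idx(i, j + 1)] = cy[idx(i + 1, j + 1)] = -l / 2
            cy[idx(i, j)] = cy[idx(i + 1, j)] = l / 2
            rows += [cx, cy]
    return np.array(rows)

def numerical_rank(M, rel_tol=1e-10):
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

nx, ny = 5, 4
Cmode = q1q0_mode_operator(nx, ny, l=2.0, h=0.5)
i, j = np.meshgrid(np.arange(nx), np.arange(ny), indexing="ij")
hydrostatic = np.ones(nx * ny)
checkerboard = ((-1.0) ** (i + j)).ravel()   # (3.13-66)

nullity = nx * ny - numerical_rank(Cmode)
print(nullity)
print(np.abs(Cmode @ hydrostatic).max(), np.abs(Cmode @ checkerboard).max())
```

On this mesh the computed nullity is two, consistent with the statement that the hydrostatic and checkerboard modes are the only pressure modes of the element.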
[Fig. 3.13-3 A 4-patch of $Q_1Q_0$ elements, with element pressures $P_{i,j}$, $P_{i+1,j}$, $P_{i,j+1}$, $P_{i+1,j+1}$.]
[Fig. 3.13-4 A 4-patch of $Q_1Q_1$ elements, with nodes $(i-1, j-1)$ through $(i+1, j+1)$.]
o Bilinear pressure, too. Turning now to $Q_1Q_1$ on the mesh shown in Figure 3.13-4, the
same appendix allows us to write the following pressure mode equations at node i, j:
\[ C_x P = 0 = -\frac{h}{12}\left[(P_{i+1,j-1} - P_{i-1,j-1}) + 4(P_{i+1,j} - P_{i-1,j}) + (P_{i+1,j+1} - P_{i-1,j+1})\right] \]
and
\[ C_y P = 0 = -\frac{l}{12}\left[(P_{i-1,j+1} - P_{i-1,j-1}) + 4(P_{i,j+1} - P_{i,j-1}) + (P_{i+1,j+1} - P_{i+1,j-1})\right], \]
which are obviously satisfied by $P_H$, the hydrostatic mode, as they should be. The next
three modes are also obviously pressure modes, and were derived previously by Sani
et al. (1981a):
1. The '$\xi$-wave,' a '$2\Delta x$' wave: $P_{i,j} = (-1)^i$.
2. The '$\eta$-wave,' a '$2\Delta y$' wave: $P_{i,j} = (-1)^j$.
3. The CB-wave, a '$2\Delta x \times 2\Delta y$' (or $\xi$-$\eta$) wave: $P_{i,j} = (-1)^{i+j}$.
Remark:
These three spurious modes are also present in a simple, second-order, centered finite
difference approximation (five-point stencil) using 'non-staggered'/'co-located' grids.
To find the others, we proceed as above for $Q_1Q_0$; i.e., seek solutions of
$P_{i,j-1} + 4P_{i,j} + P_{i,j+1} = 0$ from the x-equation and of $P_{i-1,j} + 4P_{i,j} + P_{i+1,j} = 0$ from the
y-equation, which would satisfy $CP = 0$. (Note that more coupling can also introduce more
modes!) The solution of these homogeneous difference equations can again be obtained
by trying $P_{i,j} = a^i b^j$, which results in the following quadratic equation for both a and b:
\[ x^2 + 4x + 1 = 0, \tag{3.13-67} \]
where $x = a$ or $b$, with solutions $x = -2 \pm \sqrt{3}$. Noting that the product of the two roots
is unity leads to the following general solution to the pair of difference equations, where
$\xi = -2 + \sqrt{3}$:
\[ P_{i,j} = A_1 \xi^{i+j} + A_2 \xi^{i-j} + A_3 \xi^{j-i} + A_4 \xi^{-(i+j)}, \tag{3.13-68} \]
which describes four additional (and linearly independent) pressure modes, since the $A_k$'s
are arbitrary coefficients. So, we are up to eight modes for $Q_1Q_1$, seven of them spurious.
We can find no more, nor could our computer; thus, we assert that the dimension of the
null space is eight. [We have also used a (numerical) Gram-Schmidt orthogonalization
routine on these eight vectors for several simple (e.g., 6 x 6, 7 x 6, 7 x 7) meshes and
found, in all cases examined, that the orthogonalization 'succeeded', which proves linear
independence of the original vectors: linear dependence would lead to one or more zero
vectors via the Gram-Schmidt procedure.]
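The count of eight modes can be reproduced on the 6 x 6 element (7 x 7 node) mesh with a few lines of linear algebra. This is a sketch under stated assumptions: velocity is fixed on the whole boundary, so the two difference equations are imposed at interior nodes only; the operators are built as Kronecker products of one-dimensional stencils; constant row scalings are dropped; and a rank computation replaces the Gram-Schmidt test mentioned above.

```python
import numpy as np

def interior_stencil(n_nodes, coeffs):
    """(n_nodes - 2) x n_nodes matrix applying a 3-point stencil at interior nodes."""
    S = np.zeros((n_nodes - 2, n_nodes))
    for a in range(n_nodes - 2):
        S[a, a:a + 3] = coeffs
    return S

def numerical_rank(M, rel_tol=1e-10):
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

n = 7                                              # nodes per direction
D = interior_stencil(n, (-1.0, 0.0, 1.0))          # centered difference
W = interior_stencil(n, (1.0, 4.0, 1.0))           # (1, 4, 1) weighting
# With P stored as P[i, j] and flattened row-major:
Cmode = np.vstack([np.kron(D, W), np.kron(W, D)])  # C_x rows, then C_y rows

k = np.arange(-3, 4)
xi = -2.0 + np.sqrt(3.0)
one = np.ones(n)
alt = (-1.0) ** k
even = (xi ** k + xi ** (-k)) / 2.0
odd = (xi ** k - xi ** (-k)) / (2.0 * np.sqrt(3.0))
modes = [np.outer(a, b).ravel() for a, b in
         [(one, one), (alt, one), (one, alt), (alt, alt),
          (even, even), (even, odd), (odd, even), (odd, odd)]]

nullity = n * n - numerical_rank(Cmode)
independent = numerical_rank(np.array(modes).T)
worst = max(np.abs(Cmode @ p).max() for p in modes)
print(nullity, independent, worst)
```

The eight vectors are the hydrostatic mode, the two one-directional waves, the CB-wave, and the four products built from (3.13-68).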
It is convenient to re-define the four new pressure modes in the following
rearrangements, because then certain symmetries are displayed:
1. Even-even mode ($P_{i,j} = P_{-i,j} = P_{i,-j}$) via $A_k = 1/4$ for all k to give
\[ P^{(ee)}_{i,j} = (\xi^i + \xi^{-i})(\xi^j + \xi^{-j})/4, \tag{3.13-69} \]
which gives $P^{(ee)}_{0,0} = 1$.
2. Even-odd mode ($P_{i,j} = P_{-i,j} = -P_{i,-j}$) via $A_1 = A_3 = -A_2 = -A_4 = 1/[2(\xi - \xi^{-1})] = 1/4\sqrt{3}$ to give
\[ P^{(eo)}_{i,j} = (\xi^i + \xi^{-i})(\xi^j - \xi^{-j})/4\sqrt{3}, \tag{3.13-70} \]
giving $P^{(eo)}_{0,1} = 1$.
3. Odd-even mode ($P_{i,j} = -P_{-i,j} = P_{i,-j}$), which is a 90° rotation of even-odd:
\[ P^{(oe)}_{i,j} = (\xi^i - \xi^{-i})(\xi^j + \xi^{-j})/4\sqrt{3}, \tag{3.13-71} \]
giving $P^{(oe)}_{1,0} = 1$.
4. Odd-odd mode ($P_{i,j} = -P_{-i,j} = -P_{i,-j}$) via $A_1 = A_4 = -A_2 = -A_3 = (\xi - \xi^{-1})^{-2} = 1/12$ to give
\[ P^{(oo)}_{i,j} = (\xi^i - \xi^{-i})(\xi^j - \xi^{-j})/12, \tag{3.13-72} \]
giving $P^{(oo)}_{1,1} = 1$.
If node (0,0) is chosen to be the node at the geometric center of a rectangular domain
containing an even number of elements in each direction, then the above four modes are
mutually orthogonal. In other cases they are still linearly independent but generally not
orthogonal. In 'pictures,' these four pressure modes look as follows—here on a mesh of
36 elements (6 x 6) and 49 nodes, $-3 \le i, j \le 3$:
(i) The even-even mode:
 676  -182    52   -26    52  -182   676
-182    49   -14     7   -14    49  -182
  52   -14     4    -2     4   -14    52
 -26     7    -2     1    -2     7   -26
  52   -14     4    -2     4   -14    52
-182    49   -14     7   -14    49  -182
 676  -182    52   -26    52  -182   676
(ii) The even-odd mode:
-390   105   -30    15   -30   105  -390
 104   -28     8    -4     8   -28   104
 -26     7    -2     1    -2     7   -26
   0     0     0     0     0     0     0
  26    -7     2    -1     2    -7    26
-104    28    -8     4    -8    28  -104
 390  -105    30   -15    30  -105   390
(iii) The odd-even mode:
 390  -104    26     0   -26   104  -390
-105    28    -7     0     7   -28   105
  30    -8     2     0    -2     8   -30
 -15     4    -1     0     1    -4    15
  30    -8     2     0    -2     8   -30
-105    28    -7     0     7   -28   105
 390  -104    26     0   -26   104  -390
(iv) The odd-odd mode:
-225    60   -15     0    15   -60   225
  60   -16     4     0    -4    16   -60
 -15     4    -1     0     1    -4    15
   0     0     0     0     0     0     0
  15    -4     1     0    -1     4   -15
 -60    16    -4     0     4   -16    60
 225   -60    15     0   -15    60  -225
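The integer entries of these four tables follow from the three-term recurrence implied by (3.13-67). The sketch below regenerates them under stated assumptions: rows are printed from j = 3 (top) down to j = -3, columns run from i = -3 to i = 3, and the one-dimensional factors `even` and `odd` are hypothetical names for $(\xi^k + \xi^{-k})/2$ and $(\xi^k - \xi^{-k})/2\sqrt{3}$.

```python
import numpy as np

def mode_factors(half_width):
    """Integer 1-D factors for indices -half_width..half_width.

    Both sequences satisfy f[k+1] = -4 f[k] - f[k-1], because xi and 1/xi
    are the roots of x^2 + 4x + 1 = 0.
    """
    c = {0: 1, 1: -2}                    # even factor: 1, -2, 7, -26, ...
    d = {0: 0, 1: 1}                     # odd factor:  0, 1, -4, 15, ...
    for k in range(1, half_width):
        c[k + 1] = -4 * c[k] - c[k - 1]
        d[k + 1] = -4 * d[k] - d[k - 1]
    idx = range(-half_width, half_width + 1)
    even = np.array([c[abs(i)] for i in idx])
    odd = np.array([(1 if i > 0 else -1) * d[abs(i)] for i in idx])
    return even, odd

even, odd = mode_factors(3)
# table[row, col]: row factor depends on j (reversed so j = +3 is on top),
# column factor depends on i.
ee = np.outer(even[::-1], even)          # (3.13-69)
eo = np.outer(odd[::-1], even)           # (3.13-70): even in i, odd in j
oe = np.outer(even[::-1], odd)           # (3.13-71): odd in i, even in j
oo = np.outer(odd[::-1], odd)            # (3.13-72)

for name, table in (("ee", ee), ("eo", eo), ("oe", oe), ("oo", oo)):
    print(name)
    print(table)
```

Reading the boundary entries of `ee` counter-clockwise from the lower left corner gives the weights that multiply the boundary data in the even-even solvability constraint.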
We now consider the solvability issue; for u and v specified on all of $\Gamma$, the solvability
constraints, (3.13-64), would generally lead to eight constraints on the data, $g_i$, $i = 1$,
2, ..., 24, where we take $i = 1$ as the lower left node, and we loop counter-clockwise
around $\Gamma$ to number the $g_i$ (which, recall, live only on $\Gamma$, since we have no sources/sinks
of mass); we shall elucidate this constraint for one spurious mode, the even-even mode
(one example is enough!):
\[ 676(g_1 + g_7 + g_{13} + g_{19}) - 182(g_2 + g_6 + g_8 + g_{12} + g_{14} + g_{18} + g_{20} + g_{24}) \]
\[ {} + 52(g_3 + g_5 + g_9 + g_{11} + g_{15} + g_{17} + g_{21} + g_{23}) - 26(g_4 + g_{10} + g_{16} + g_{22}) = 0, \tag{3.13-73} \]
a spurious constraint that precludes the existence of a solution to (3.13-58) unless the
imposed BC's satisfy it. And there are six others! (Not seven because it turns out that
the CB-mode is innocuous in that its solvability constraint equation is always and
automatically satisfied for all possible BC's. On the other side of the ledger, though, is the
concomitant fact that there are no BC's that can preclude the CB-mode's existence, and
the A-matrix is thus always singular.) All are spurious except that from the global mass
balance constraint, $\sum_{j=1}^{24} g_j = 0$. {From Appendix 1, 'Element Matrices,' we select two
entries from the g-vector for the above 6 x 6 mesh for display, and further elucidation: $g_4$ (at
the bottom center) and $g_8$ (on the right side, one node up): $g_4 = [2h(u_5 - u_3) - l(v_3 +
4v_4 + v_5)]/12$ and $g_8 = [h(u_7 + 4u_8 + u_9) + 2l(v_9 - v_7)]/12$. The combinations of sums
of normal velocities and differences of tangential velocities in $P_H^T g = 0$ are evident; the
normals will 'sum up,' and the 'tangentials' will cancel in $\sum g_i$ to yield the appropriate
global mass balance constraint.} We leave the explicit construction of the six spurious
constraint equations to the interested reader. It thus becomes abundantly clear why equal-
order interpolation has not been very popular; in addition to a multi-dimensional null
space and polluted pressures, very few otherwise well-posed problems with inhomoge-
neous Dirichlet BC's would even have solutions.
Additional remarks:
(1) If other BC's (NBC's) are employed on some parts of the domain with Dirichlet
on the remainder, the number of spurious modes is reduced. Dirichlet data are the
worst case, unless of course they are homogeneous; u = 0 on $\Gamma$ obviously satisfies
all of the pressure mode constraints.
(2) For a vortex shedding simulation (Gresho et al., 1984a) with the following BC's, $f_n$
and v specified at inlet, $f_n$ and $f_\tau$ specified at outlet, and u specified laterally, the
number of pressure modes (all spurious since $P_H$ cannot exist with $f_n$ specified on
any portion of $\Gamma$) observed was only three [and this was only determined indirectly
by monitoring pivots; a zero eigenvalue generates a (machine) zero pivot during
Gaussian elimination].
(3) In case one wishes to compute with $Q_1Q_1$ anyway, we list below the BC's, applied
on some part of $\Gamma$, that will eliminate each (except the omnipresent-but-innocuous
CB-mode) of the spurious pressure modes (of course, as already mentioned,
restriction to contained flows with u = 0 on $\Gamma$ will always cause all constraint equations
to be satisfied):
(i) The $\xi$-wave is eliminated via a normal NBC on part of the x-boundary (e.g.,
along x = L).
(ii) The $\eta$-wave is similarly precluded by a normal NBC somewhere along y = H
(for example).
(iii) All four modes from (3.13-68) are eliminated via application of a tangential
NBC on a portion of $\Gamma$ (along an x- or a y-boundary).
[Thus, the three modes remaining in the vortex shedding simulation referred
to above are: (a) the $\eta$-wave, (b) the CB-mode, (c) a mystery mode; after all,
the theory above is limited to rectangular elements, not distorted isoparametric
ones. Perhaps we mis-identified one via numerical pivot monitoring.]
(4) It is interesting (and coincidental) that certain spectral method approximations are
also cursed with a seven-dimensional spurious null space (also in 2D); see, for
example, Bernardi et al. (1990), Canuto et al. (1988b), and Schumack et al. (1991).
They somehow seem to cope with it in ways that we in finite elements have not
discovered; i.e., they use it anyway, whereas we run away from it.
(5) It is sad but true that perturbing a single node in the above example can change the
results (number of modes, etc.) drastically—further emphasizing the futility of the
situation.
o Discontinuous pressure. Having over-exposed (perhaps) the pressure mode problem with
equal-order interpolation, let us return to the 'single CB' (2D) element, $Q_1Q_0$, which we
still use and advocate, and attempt to make it more 'palatable' than we did in our original
publication on the subject (Sani et al., 1981a). There we summarized a several-year effort
that probably exposed more problems than solutions (with $Q_1Q_0$ especially) with the not
surprising result that that paper turned out to be somewhat morose and thus helped form
a 'doom and gloom' attitude among many who were already or who might have become
'friends of FEM.' We apologize for this. The state of the art back then was still somewhat
primitive—and changing rapidly. Since that time, much has been learned—and while there
are still a few cautionary measures that need to be understood by the user of this element,
they are not all that difficult to master (we show you how) nor are they overly restrictive.
In fact, at LLNL, all of our new codes since the 'CB paper' have been built around $Q_1Q_0$
(both 2D and, mostly, 3D). The current atmosphere surrounding this element is (or, at
least, should be), in our opinion, much more upbeat than it was back then and than it was
even in the recent book by Gunzburger (1987); and we also have a new reason for this
optimism (in addition to mega computer minutes of good experience): in Section 3.13.5
we present a new convergence proof—the velocity converges at the optimal rate under
'all conditions' (not just 'special' meshes), and, although only 'proven' experimentally,
so do the (filtered when necessary) pressures.
We also take this opportunity to correct some 'typos' in Sani et al. (1981a). Part 1:
(i) $AM^{-1}g$ should be $ACM^{-1}g$ in (4d), (ii) the last + sign in (24) should be -, (iii) $C^T$
and C should be switched in the equation below (A8); Part 2: (i) $D_1 = D_4$, $D_3 = D_2$ on
p. 175 should be $D_1 = D_3$, $D_4 = D_2$, (ii) the 80 spurious modes on p. 175 should be
8, (iii) insert / before lN in the last equation on p. 178, (iv) the x in (58) should be —,
(v) remove lim^o from (65), (vi) the last 0.625 should be 0.626 in Table III, (vii) remove
the extra ) in the ^-equation in the middle of p. 194, (viii) on p. 199, if the inconsistent
case is suspected (by the pivot test) in item 3, print WARNING—MAY BE ILL-POSED
before replacing the pivot by a 'suitably large value ($10^{10}$, say).'
The principal issue for $Q_1Q_0$ in 2D is the sometime presence of a CB-mode (both pure
and impure, which latter we shall define later) and its effect on solvability of the problem,
the velocity accuracy, the pressure accuracy, and the effects of mesh refinement. There is
also—on simple meshes—the issue of other 'modes' besides the CB that are more trouble
theoretically than practically, and we shall also return to these later (Section 3.13.5k). But
before launching into the negative aspects associated with CB-mode (modes in 3D), let us
remind the reader (again) that the $Q_1Q_0$ has a particularly intuitively attractive property,
the results of which seem to show up in practice: just as the continuum equations satisfy
one (vector) momentum equation and one divergence-free constraint equation at each
point in the domain, so too does this element satisfy one momentum equation and one
divergence-free constraint 'at' each element in the domain; i.e., in spite of CB's, there
is a nearly optimal balance of equations and constraints, including element-level mass
conservation.
The first thing to realize about Q1Q0 is the generalization of the simple CB-equation,
$P_{ij} = (-1)^{i+j}$, discovered by Fortin (1972a,b) and analyzed by Sani et al. (1981a), and
its 3D extension: on a mesh of rectangular elements (or even parallelograms, actually),
the 2D CB of (3.13-66) generalizes to
$$P_{ij} = (-1)^{i+j}/A_{ij}, \qquad (3.13\text{-}74)$$
where $A_{ij}$ is the area of element $i, j$, and we are using the somewhat unconventional
notation (in the FEM world) that associates $i$ with element 'rows' and $j$ with element
'columns' in the mesh. An equivalent alternate description of the CB-eigenvector is
$$P_{CB}|_k = \pm 1/A_k, \qquad (3.13\text{-}75)$$
where $P_{CB}|_k$ is the k-th element of the CB-null vector, $A_k$ is the area of element k (the
elements now being numbered more conventionally, from 1 to m), and the + sign applies
to (say) a 'black' element and the - sign to a 'red'. The important thing is that $C_x P = 0$
and $C_y P = 0$, where $P$ is the CB-eigenvector of (3.13-75).
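The null-vector property can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a tensor-product mesh of rectangles with all-Dirichlet velocity BC's (so only interior nodes carry unknowns), and the helper name `divergence_matrix` and the mesh sizes are illustrative choices. It assembles the Q1Q0 discrete divergence $D = C^T$ from the exact edge integrals of a bilinear velocity, then applies $C = D^T$ to the area-scaled checkerboard of (3.13-75), to the hydrostatic mode, and to the unscaled checkerboard.

```python
import numpy as np

def divergence_matrix(hx, hy):
    """Q1Q0 discrete divergence D = C^T on a tensor-product mesh of rectangles.

    Rows: elements, e = ej*nx + ei. Columns: u then v at interior nodes only
    (all-Dirichlet boundary, an assumption of this sketch)."""
    nx, ny = len(hx), len(hy)
    n_int = (nx - 1) * (ny - 1)

    def node(i, j):                      # interior node index, 1 <= i < nx, 1 <= j < ny
        return (j - 1) * (nx - 1) + (i - 1)

    D = np.zeros((nx * ny, 2 * n_int))
    for ej in range(ny):
        for ei in range(nx):
            e = ej * nx + ei
            # corner offsets (a, b) give node (ei + a, ej + b); a bilinear field is
            # integrated exactly by the trapezoidal rule along each element edge
            for a in (0, 1):
                for b in (0, 1):
                    i, j = ei + a, ej + b
                    if 0 < i < nx and 0 < j < ny:
                        D[e, node(i, j)] += (2 * a - 1) * hy[ej] / 2          # du/dx part
                        D[e, n_int + node(i, j)] += (2 * b - 1) * hx[ei] / 2  # dv/dy part
    return D

hx = np.array([1.0, 0.5, 2.0, 0.7])      # element widths (illustrative)
hy = np.array([0.3, 1.2, 0.9])           # element heights (illustrative)
D = divergence_matrix(hx, hy)

area = np.outer(hy, hx).ravel()          # A_k, ordered like the rows of D
sign = np.array([(-1.0) ** (ei + ej) for ej in range(len(hy)) for ei in range(len(hx))])

cb_residual = np.abs(D.T @ (sign / area)).max()        # scaled checkerboard, (3.13-75)
hydro_residual = np.abs(D.T @ np.ones(len(area))).max()
plain_residual = np.abs(D.T @ sign).max()              # unscaled checkerboard
print(cb_residual, hydro_residual, plain_residual)
```

On this non-uniform mesh the first two residuals should vanish to rounding error while the third does not, which is the reason for the $1/A_k$ scaling.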
The 3D extension of the CB-mode leads to CB-modes (many), at least for simple,
brick-shaped elements; there can be (depending, as always, on BC's) a 2D mode
(nonzero entries only in a particular 'plane' of elements) in each 2D plane of elements (except
one) for each of the three 2D planes (x-y, x-z, y-z) and one fully 3D CB-mode (a
$2\Delta x \times 2\Delta y \times 2\Delta z$ 'wave'). The 'cartesian' description of these, à la (3.13-66), is
$$P_{i,j,k} = (-1)^{i+j} \quad \text{for } k = 1, 2, \ldots, N_z - 1, \qquad (3.13\text{-}76)$$
$$P_{i,j,k} = (-1)^{i+k} \quad \text{for } j = 1, 2, \ldots, N_y - 1, \qquad (3.13\text{-}77)$$
$$P_{i,j,k} = (-1)^{j+k} \quad \text{for } i = 1, 2, \ldots, N_x - 1, \qquad (3.13\text{-}78)$$
and
$$P_{i,j,k} = (-1)^{i+j+k}, \qquad (3.13\text{-}79)$$
which applies (only) to a mesh of uniform bricks with $N_x \times N_y \times N_z$ elements. This
mesh can support a maximum of $N_x + N_y + N_z - 3$ 2D CB-modes, one 3D CB-mode,
and one hydrostatic mode, for a total null space dimension of $N_x + N_y + N_z - 1$, with an
equal number of extraneous/redundant continuity equations and an equal number of BC
constraint equations, all but one of which are spurious. As in 2D, if the 3D mesh is not
geometrically regular, it is usually the case that all spurious redundancies vanish because
all of the continuity equations are then required. The generalization to a mesh of
variable-sized bricks is as follows, after mentally 'coloring' the elements 'red/black' on the 3D
mesh: (3.13-76) is replaced by $P_{CB}|_k = \pm 1/(\Delta z A_k)$, where $A_k$ is the area-projection of the
volume, $V_k = \Delta z A_k$, $k = 1, 2, \ldots, N_z$, of the k-th element onto the xy-plane, and $\Delta z$ is
the thickness of this plane of elements; analogously, (3.13-77) and (3.13-78) are replaced
by the other 2D 'area' eigenvectors; finally, (3.13-79) goes over to $P_{CB}|_k = \pm 1/V_k$, where
$V_k$ is the volume of element k, $k = 1, 2, \ldots, m = N_x \times N_y \times N_z$.
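The mode count for uniform bricks can be reproduced numerically. The sketch below is illustrative only: it assumes unit bricks, all-Dirichlet BC's (interior velocity nodes only), and a hypothetical helper `divergence_matrix_3d`; for a trilinear velocity each face integral reduces to the face area times the mean of the four face-corner values. It computes the dimension of the null space of $C = D^T$ and checks that the fully 3D checkerboard of (3.13-79) and a single-plane 2D checkerboard both lie in it.

```python
import itertools
import numpy as np

def divergence_matrix_3d(nx, ny, nz):
    """Q1Q0 discrete divergence on a box of unit bricks, interior velocity nodes only.
    Rows: elements, e = (ek*ny + ej)*nx + ei. Columns: u, v, w blocks."""
    n_int = (nx - 1) * (ny - 1) * (nz - 1)

    def node(i, j, k):
        return ((k - 1) * (ny - 1) + (j - 1)) * (nx - 1) + (i - 1)

    D = np.zeros((nx * ny * nz, 3 * n_int))
    for ek, ej, ei in itertools.product(range(nz), range(ny), range(nx)):
        e = (ek * ny + ej) * nx + ei
        for a, b, c in itertools.product((0, 1), repeat=3):   # the eight corners
            i, j, k = ei + a, ej + b, ek + c
            if 0 < i < nx and 0 < j < ny and 0 < k < nz:
                n = node(i, j, k)
                D[e, n] += (2 * a - 1) / 4                # x-faces
                D[e, n_int + n] += (2 * b - 1) / 4        # y-faces
                D[e, 2 * n_int + n] += (2 * c - 1) / 4    # z-faces
    return D

nx, ny, nz = 3, 3, 2                                      # illustrative mesh size
D = divergence_matrix_3d(nx, ny, nz)
nullity = D.shape[0] - np.linalg.matrix_rank(D)           # dimension of null(C)

# element indices in the same order as the rows of D
ek, ej, ei = np.array(list(itertools.product(range(nz), range(ny), range(nx)))).T
p_3d = (-1.0) ** (ei + ej + ek)                           # (3.13-79)
p_plane = np.where(ek == 0, (-1.0) ** (ei + ej), 0.0)     # 2D mode in one plane of elements
res_3d = np.abs(D.T @ p_3d).max()
res_plane = np.abs(D.T @ p_plane).max()
print(nullity, nx + ny + nz - 1, res_3d, res_plane)
```

The single-plane checkerboards are not all independent of the 3D mode, which is why the count of 2D modes is $N_x + N_y + N_z - 3$ rather than $N_x + N_y + N_z$.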
To close this portion of the CB-mode description, we mention that D. Griffiths has
managed to find the most general 'CB-mesh' (although somewhat 'esoteric' in that the
probability of generating such a mesh with conventional mesh-generating programs is
probably nearly zero), which can support the 2D CB-mode (3D is still 'open'), of which
the rectangular case discussed herein is a special (degenerate) case. This general case is
described in Sani et al. (1981a), to which we refer the interested reader.
It is unfortunate, of course, to be saddled with such a massive mess of mesh-dependent
spurious pressure modes—and perhaps one would be well advised to utilize other
elements, as indeed many have done. But, as we will try to demonstrate, there are enough
advantages to the Q1Q0 element (at least in 3D) to justify its sustained use. Also, we
will show that it is not really too difficult to adapt to 'life with modes,' especially after
we show how easy it is to accommodate them and to filter them.
Fig. 3.13-5 The CB mode affects the velocities.
o Solvability issues, symmetric case. Having described the CB-eigenvectors, let us now
demonstrate some of their effects for Q1Q0, simultaneously with a demonstration of
the effects of the physical mode, $P_H$, by examining a small 'piece' of a 2D mesh;
shown in Figure 3.13-5 is an 11-element (five black, six red) segment of what can be
construed to be a much larger mesh of Q1Q0 elements, in which the arrows 'represent' the
CB-constraint equation ('out of the black and into the red': an accountant's nightmare,
although T.J. Hughes has called Q1Q0 'a million dollar element' in a personal
communication) to be derived below. We begin with the important observation that the solvability
conditions represented by (3.13-64) are, in fact, also derivable by forming the appropriate
linear combinations of continuity equations, from (3.13-58), such that all coefficients of
internal velocities (in from the boundary) cancel identically. It is no accident that the
'appropriate' linear combination is just that obtained by setting the scalar product of the
null eigenvector with the vector of the continuity equation LHS's, $C^T u$, to zero. But the
additional important observation is that these 'inner products' and related results apply
equally to any internal 'boundary,' or cut, through the same mesh used to obtain (3.13-64);
i.e., for any subset of $m_s$ elements, the sum
$$\sum_{j=1}^{m_s} (C^T u)_j \, q_j = 0,$$
where $q_j$ is any pressure mode (here $P_H$ or $P_{CB}$), generates an equation (satisfied by
the solution of (3.13-58) whenever a solution exists) in which any nodes that can be
identified as 'internal' are no longer present. (Obviously, in order to have internal nodes,
the set $m_s$ must form a closed curve, following element boundaries of course, within
the domain.) Returning now to the above sketch, we first apply the above equation with
$q = P_H$ to obtain $\sum_{j=1}^{11} (C^T u)_j = 0$, which leads to
$$\begin{aligned}
&(u_5 - u_1)h_1 + (u_{10} - u_6)(h_1 + h_2) \\
&\quad + u_{15}(h_2 + h_3) - u_{11}h_2 + (u_{19} - u_{16})h_3 \\
&\quad + (v_{11} - v_1)l_1 + v_{16}l_2 - v_2(l_1 + l_2) \\
&\quad + (v_{17} - v_3)(l_2 + l_3) + (v_{18} - v_4)(l_3 + l_4) + (v_{19} - v_5)l_4 = 0,
\end{aligned} \qquad (3.13\text{-}80)$$
which, more or less obviously, is a statement of 'global' mass conservation over the
selected subdomain, $\oint_\Gamma n \cdot u = 0$. Similarly, but with $q = P_{CB}$, is obtained
$$\sum_{\mathrm{RED}} (C^T u)_j / A_j = \sum_{\mathrm{BLACK}} (C^T u)_i / A_i, \qquad (3.13\text{-}81)$$
whose generality and importance we wish to emphasize—even though it is 'but another'
linear combination of continuity equations, which always eliminates internal nodes
(providing ms is as above, which of course it always is if ms = M, the total number
of elements). For the selected subdomain, (3.13-81) leads to:
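$$\begin{aligned}
&-\left(\frac{u_1}{l_1} + \frac{v_1}{h_1}\right) + u_2\left(\frac{1}{l_1} + \frac{1}{l_2}\right) - u_3\left(\frac{1}{l_2} + \frac{1}{l_3}\right) + u_4\left(\frac{1}{l_3} + \frac{1}{l_4}\right) - \left(\frac{u_5}{l_4} - \frac{v_5}{h_1}\right) \\
&\quad - v_{10}\left(\frac{1}{h_1} + \frac{1}{h_2}\right) + v_{15}\left(\frac{1}{h_2} + \frac{1}{h_3}\right) - \left(\frac{u_{19}}{l_4} + \frac{v_{19}}{h_3}\right) + u_{18}\left(\frac{1}{l_3} + \frac{1}{l_4}\right) - u_{17}\left(\frac{1}{l_2} + \frac{1}{l_3}\right) \\
&\quad + \left(\frac{u_{16}}{l_2} - \frac{v_{16}}{h_3}\right) - \left(\frac{u_{12}}{l_1} - \frac{v_{12}}{h_3}\right) + \frac{u_{11}}{l_1} - \frac{v_{11}}{h_2} + v_6\left(\frac{1}{h_1} + \frac{1}{h_2}\right) = 0,
\end{aligned} \qquad (3.13\text{-}82)$$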
which is, again more or less obviously, an equation relating the tangential velocity
components on the boundary of the subdomain. Again, for emphasis, any solution of (3.13-58)
will (must) satisfy both (3.13-80) and (3.13-82). To help see what the CB-constraint means
(it clearly does not approximate $\oint_\Gamma \tau \cdot u = 0$), let us simplify to the case of a uniform
mesh to obtain
$$\begin{aligned}
&(u_2 - u_1) - (u_3 - u_2) + (u_4 - u_3) - (u_5 - u_4) \\
&\quad - (v_{10} - v_5) + (v_{15} - v_{10}) - (v_{19} - v_{15}) \\
&\quad - (u_{19} - u_{18}) + (u_{18} - u_{17}) - (u_{17} - u_{16}) - (u_{12} - u_{11}) \\
&\quad - (v_{16} - v_{12}) + (v_6 - v_1) - (v_{11} - v_6) = 0,
\end{aligned}$$
which, if Taylor series analysis dare be applied, generates nothing but $\partial u/\partial x + \partial v/\partial y = 0$;
no surprise, really, since each of the 11 continuity equations also represents $\nabla \cdot u = 0$.
Thus, it is clear—hopefully—that this CB-'constraint' equation (and the many more in
3D) are not really terribly 'evil' (although spurious) and pose absolutely no barrier to
convergence with mesh refinement. What they do pose, when applied to the full mesh
with inhomogeneous Dirichlet BC's, are spurious constraint equations on the allowable,
specified, tangential velocities—but even then they cannot be too 'deleterious'; i.e., their
satisfaction is easy to obtain and does not really do much 'damage' to the specification
of any particular problem—as we shall soon show. As a final remark relating to the
above grid (which, of course, generalizes easily), we leave the following exercises to
the reader: (i) sum the continuity equations for those elements immediately surrounding
element 7, (ii) apply the CB-constraint equation to these same elements, then in each
resulting equation, set to zero all velocities except those on element 7; the result from
(i) will be $(C^T u)_7 = 0$, the continuity equation for element 7, and that from (ii) will be
$(C^T u)_7 / A_7 = 0$, where $A_7$ is the area of element 7.
We now move on to consider other BC's than Dirichlet. (Recall that the 'worst case'
for pressure modes is that with inhomogeneous Dirichlet BC's.) Focusing first on normal
BC's, we note that if normal 'traction' BC's, à la (3.8-17) or (3.10-7), are employed on
any portion of $\Gamma$ (even at just a single point) then the C-matrix no longer contains $P_H$
in its null space, and there is no longer a redundant continuity equation, a statement that
is true for any element. This is because a normal traction BC 'looks like' a Dirichlet BC
for the pressure, a fact also reflected in the continuum equations, cf. (3.8-38). Only when
$n \cdot u$ is specified on all of $\Gamma$ is $P_H$ present (and the pressure then only determinable up
to an additive constant, which 'constant' can actually be an arbitrary function of time).
So much for the physical pressure mode. For the non-physical modes, it is probably
no surprise by now that their presence (or absence) is related to tangential BC's on
velocity, and the first important point to be made is this: when (3.13-58) corresponds to
either potential flow or to the acceleration and pressure for NS, there are no spurious
pressure modes because there are no tangential Dirichlet BC's! Only the Stokes (or NS)
equations legitimately permit the application of essential tangential BC's, and only they
can lead to pressure modes. (Later, however, we shall admit to applying tangential BC's
anyway—sometimes illegitimately and sometimes not.) So—for the time being—suppose
we are dealing with steady Stokes flow, which definitely always requires the application of
tangential BC's. For the Q1Q0 element in 2D, it follows (like that in the normal direction
for $P_H$) that the use of a natural/'traction' BC (à la (3.8-18) or (3.10-6), a shear stress
BC) on any portion of $\Gamma$, again even at a single node, precludes the existence of $P_{CB}$
in the null space of C. And with $P_{CB}$ goes the redundancy; i.e., when $P_{CB}$ is precluded
by BC's, then there is no redundancy in the continuity equations. The 3D case, with its
many modes, follows easily from 2D, at least for 'simple' domains: (i) if a tangential
BC of the traction type is applied (in the proper direction) at any point on the boundary
of the 'ring' of elements comprising one of the 2D pressure modes of which there are
$N_x + N_y + N_z - 3$, then this mode will no longer exist; (ii) for the single, 3D mode, a
tangential traction BC (in both directions) at one or more points on $\Gamma$ will preclude it.
What this boils down to in practice is this: NBC's on one of the two bounding planes
in two of the three cartesian directions will preclude all 'CB'-pressure modes; i.e., the
maximum, tangential, Dirichlet case without CB-modes is that with tangential velocity
specified on four of six bounding planes—any more makes modes, mostly the 2D type.
We leave as an exercise the generalization of the CB-constraint equation to a segment
of a 3D domain, and simply state that in our experience, it has never been necessary to
actually explicate these relations. Another situation which is more or less guaranteed to
preclude all 3D pressure modes, is that wherein complex geometry and/or unstructured
meshes of distorted isoparametric elements are employed; i.e., for just those problems
that are the 'raison d'etre' of the FEM.
We now summarize some additional linear algebra issues related to pressure modes
(for any element) and their redundant constraints:
1. If the applied velocity BC's do not duplicate a pressure-mode boundary-constraint
equation, then this mode (its null vector, and its zero eigenvalue) will be absent—as are
related solvability issues. The velocity solution will then satisfy the constraint equation.
2. If the applied BC's do duplicate the constraint equation for a particular pressure mode,
then this mode will be present (in the null space of C) as will its associated zero eigenvalue
and resulting non-unique pressure solution and redundant continuity equation. The system
can be said to be (and is) over-specified but consistent (consistent singular).
3. If the applied BC's violate the boundary constraint equation associated with/implied
by any particular pressure mode—spurious or not—then the problem is ill-posed, and no
solution exists.
Next we present two simple examples of Q1Q0 pressure modes and their interactions
with BC's, first for $P_H$ and then for $P_{CB}$. Consider the mesh and BC's in Figure 3.13-6, a
sort of forced transition from slippery plug flow to Poiseuille flow (not a recommended
problem in practice, except as a sort of 'wiggle experiment' if $Re \gg 1$ and the grid is
not 'refined' near x = L).
This mesh will display a hydrostatic mode and a CB-mode, with the constraint
equation from the hydrostatic mode [(3.13-31)] being $6u_0 = u_1 + u_2 + u_3 + u_4 + u_5$, and
no constraint from $P_{CB}$ (why?). Any values of $u_1$ through $u_5$ (at the five 'exit' nodes)
that sum to $6u_0$ generate a well-posed problem; any that violate it do not. If we want
a parabolic (fully developed) exit flow, we might try $f(y) = Ay(6 - y)$, where we have
assumed a channel of height six, and A is to be determined. Thus, $u_1 = u_5 = f(1)$,
$u_2 = u_4 = f(2)$, and $u_3 = f(3)$ to give $6u_0 = A(2 \times 1 \times 5 + 2 \times 2 \times 4 + 3 \times 3)$, which
gives a well-posed problem only for $A = 6u_0/35$, not $A = u_0/6$, which is the appropriate
parabola amplitude for the continuous problem. This is but a simple example of a general
and very important result: the application of the interpolated value of $n \cdot u$ on $\Gamma$ from a
well-posed continuous problem generally does not give a well-posed discrete problem; it is the
discrete mass balance condition that must prevail. (Because the integral of a
piecewise-linear interpolant of a parabola is less than that of the integrated parabola, the discrete
parabola must be scaled up accordingly, in this case by the factor 36/35.)
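The arithmetic of this example is easy to reproduce exactly. The short sketch below uses rational arithmetic; the variable names and the normalization $u_0 = 1$ are illustrative assumptions, and unit node spacing is taken from the channel height of six with five interior exit nodes.

```python
from fractions import Fraction

height = 6                       # channel height; unit node spacing, as in the text
u0 = Fraction(1)                 # plug-flow inlet speed (normalized, an assumption)
interior_y = range(1, height)    # the five exit nodes y = 1..5; wall nodes carry u = 0

# Discrete balance: 6*u0 = sum of f(y_i), with f(y) = A*y*(6 - y)
shape_sum = sum(y * (height - y) for y in interior_y)     # 5 + 8 + 9 + 8 + 5
A_discrete = height * u0 / shape_sum

# Continuous balance: 6*u0 = integral over 0..6 of A*y*(6 - y) dy = A * 6**3 / 6
A_continuous = height * u0 / Fraction(height**3, 6)

scale = A_discrete / A_continuous
print(A_discrete, A_continuous, scale)   # 6/35 1/6 36/35
```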
The next example involves $P_{CB}$, but not $P_H$, and is the (in)famous lid-driven cavity
(LDC), as first discussed in Sani et al. (1981a). Application of the CB-constraint
equation (3.13-81), to the problem described in Figure 3.13-7 (with u = v = 0 everywhere on
$\Gamma$ not on the top lid except at one node where $f_n = 0$ is applied, to preclude $P_H$) yields
different results for an even or odd number of nodes across the top of the cavity.
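Fig. 3.13-6 From plug flow to Poiseuille flow. (Boundary labels in the figure: u = v = 0; u = u_0, v = 0; u = f(y), v = 0.)
Fig. 3.13-7 The top of a lid-driven cavity.
For N even, the CB-constraint equation gives
$$\frac{u_1}{l_1} - u_2\left(\frac{1}{l_1} + \frac{1}{l_2}\right) + u_3\left(\frac{1}{l_2} + \frac{1}{l_3}\right) - \cdots + u_{N-1}\left(\frac{1}{l_{N-2}} + \frac{1}{l_{N-1}}\right) - \frac{u_N}{l_{N-1}} = 0, \qquad (3.13\text{-}83)$$
whereas N odd leads to
$$\frac{u_1}{l_1} - u_2\left(\frac{1}{l_1} + \frac{1}{l_2}\right) + u_3\left(\frac{1}{l_2} + \frac{1}{l_3}\right) - \cdots - u_{N-1}\left(\frac{1}{l_{N-2}} + \frac{1}{l_{N-1}}\right) + \frac{u_N}{l_{N-1}} = 0. \qquad (3.13\text{-}84)$$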
The applied tangential velocity must satisfy (3.13-83) or (3.13-84) in order that the
algebraic system be well-posed. If the simulation is the (simpler) 'leaky' LDC, then
typically $u_i = u_0$ = constant, for all i, and there is 'no problem': each $1/l_i$ term in the
above equations cancels with its neighbor (in pairs) with the result that the LHS = 0,
too, giving well-posed problems [g = 0 in (3.13-64)]. But now consider the tougher
problem of a non-leaky ('water-tight') cavity that is realized by setting $u_1$ and $u_N$ to zero,
and the solvability issue is no longer trivial. To get specific, take N = 6 in (3.13-83) and
N = 7 in (3.13-84); the former gives
$$-u_2\left(\frac{1}{l_1} + \frac{1}{l_2}\right) + u_3\left(\frac{1}{l_2} + \frac{1}{l_3}\right) - u_4\left(\frac{1}{l_3} + \frac{1}{l_4}\right) + u_5\left(\frac{1}{l_4} + \frac{1}{l_5}\right) = 0,$$
and the latter gives
$$-u_2\left(\frac{1}{l_1} + \frac{1}{l_2}\right) + u_3\left(\frac{1}{l_2} + \frac{1}{l_3}\right) - u_4\left(\frac{1}{l_3} + \frac{1}{l_4}\right) + u_5\left(\frac{1}{l_4} + \frac{1}{l_5}\right) - u_6\left(\frac{1}{l_5} + \frac{1}{l_6}\right) = 0.$$
If now we set $u_i = u_0$ in the above two equations, which is the most common LDC BC,
they simplify to $u_0(-1/l_1 + 1/l_5) = 0$ in the even case and to $-u_0(1/l_1 + 1/l_6) = 0$ in the
odd case. [The general results, obviously, are $-u_0(1/l_1 - 1/l_{N-1}) = 0$ for the first case,
and $-u_0(1/l_1 + 1/l_{N-1}) = 0$ for the second.] Now we see before us the first real problem
with the spurious CB-constraint: the ramp-up-over-one-element BC cannot be used when
the number of nodes across the top is odd; the CB-constraint cannot then be satisfied,
and the linear system is inconsistent. (In the even case, we can solve the problem if
and only if we take $l_1 = l_{N-1}$.) But forewarned is forearmed; i.e., our knowledge of the
CB-constraint equation permits us to deal with it, easily and effectively, when it poses
a potential stumbling block. In this case, the solution is either: (i) stay with N even or
(ii) ramp up over two elements (the latter solution being left as an exercise), or see
Malkus's discussion in the Appendix to Chapter 4 in Hughes (1987). Also left to the
reader is the analogous situation for the 3D LDC.
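The lid constraint is simple enough to evaluate directly for a candidate BC before running a simulation. The sketch below is an illustration under stated assumptions, not the authors' procedure: `cb_constraint_lhs` evaluates the alternating sum of (3.13-83)/(3.13-84) for any number of lid nodes, and the hypothetical helper `lid_velocity` builds a non-leaky lid profile that ramps linearly in x over a chosen number of elements at each end (one possible reading of 'ramp up over two elements'). The element lengths are arbitrary illustrative values.

```python
import numpy as np

def cb_constraint_lhs(u, l):
    """Left-hand side of the lid CB-constraint, cf. (3.13-83)/(3.13-84).
    u: lid tangential velocities at the N nodes; l: the N-1 element lengths."""
    u = np.asarray(u, dtype=float)
    inv = 1.0 / np.asarray(l, dtype=float)
    # coefficient of u_i is 1/l_{i-1} + 1/l_i; the corner nodes have one neighbor only
    coef = np.concatenate(([0.0], inv)) + np.concatenate((inv, [0.0]))
    signs = (-1.0) ** np.arange(len(u))          # +, -, +, ... starting at node 1
    return float(np.sum(signs * coef * u))

def lid_velocity(l, u0=1.0, ramp_elements=1):
    """Non-leaky lid: u = 0 at both corners, rising linearly in x to u0 over
    `ramp_elements` elements at each end (1 is the usual LDC BC)."""
    l = np.asarray(l, dtype=float)
    x = np.concatenate(([0.0], np.cumsum(l)))
    left = x / x[ramp_elements]
    right = (x[-1] - x) / (x[-1] - x[-1 - ramp_elements])
    return u0 * np.minimum(1.0, np.minimum(left, right))

l_odd = [1.0, 0.8, 1.2, 0.9, 1.1, 0.7]       # 7 nodes across the top
l_even = [1.0, 0.8, 1.2, 0.9, 1.0]           # 6 nodes, l_1 = l_5
l_even_bad = [1.0, 0.8, 1.2, 0.9, 0.5]       # 6 nodes, l_1 != l_5

r_odd_one = cb_constraint_lhs(lid_velocity(l_odd), l_odd)            # -u0*(1/l_1 + 1/l_6)
r_even_one = cb_constraint_lhs(lid_velocity(l_even), l_even)         # u0*(-1/l_1 + 1/l_5)
r_even_bad = cb_constraint_lhs(lid_velocity(l_even_bad), l_even_bad)
r_odd_two = cb_constraint_lhs(lid_velocity(l_odd, ramp_elements=2), l_odd)
print(r_odd_one, r_even_one, r_even_bad, r_odd_two)
```

In this sketch the two-element ramp that is linear in x satisfies the constraint for the odd case even on the non-uniform mesh, whereas the one-element ramp does not.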
It may occur to some (as it did to us) to attempt to 'solve' the ill-posed problem in
the following way: simply peg two pressures (one on a black element and one on a red);
i.e., set the pressures as Dirichlet BC's (value immaterial) and omit the corresponding
two continuity equations, with the result being that the matrix is no longer singular, and
thus the solvability equation, $u_0(1/l_1 + 1/l_{N-1}) = 0$, is vanquished, on the premise that
we had two redundant continuity equations anyway so that all will now be well. That
such a trick is invalid follows from the realization that the two constraint equations (one
from $P_H$ and the other from $P_{CB}$) still apply to the remaining elements in the mesh with
results that look as follows: (i) $a_1 (C^T u)_1 + a_2 (C^T u)_2 = 0$ from the hydrostatic constraint,
where $(C^T u)_1$ is the continuity equation on the first omitted element, $(C^T u)_2$ is that on the
second, and $a_1$ and $a_2$ are known scalars; (ii) $b_1 (C^T u)_1 + b_2 (C^T u)_2 = u_0(1/l_1 + 1/l_{N-1})$
from the CB-constraint equation, where $b_1$ and $b_2$ are known scalars. Thus, the violation
of the original CB-solvability equation when no pressures are pegged shows up as a loss of
mass conservation on the two selected elements. It thus also follows that in a well-posed
problem, such a procedure is perfectly legal—the RHS of the CB-constraint equation
above would then be zero with the result that both elements will display conservation
of mass.
We are almost, but not quite, finished with our discussion of (pure) pressure
modes for Q1Q0. Still to come are discussions of the following: filtering the spurious
pressures—including smoothing and grid smoothing, pressure pegging, node-freeing, and,
finally, the nasty and hard-to-predict/analyze impure CB-modes.
o Filtering the checkerboard mode. While the final details and suggested procedures are
deferred to Chapter 4, we state here for the record the simple filter that works well: if
n elements share velocity node i, then the CB-filter equation below produces a smooth
pressure at this same node, which could then, if desired, be confidently interpolated via
the velocity shape functions:
$$P_i = \sum_{j=1}^{n} P_j^e \, \Omega_j^e \Big/ \sum_{j=1}^{n} \Omega_j^e, \qquad (3.13\text{-}85)$$
where $P_j^e$ is the (piecewise-constant) pressure on element j and $\Omega_j^e$ is the size of the same
element ($\Omega_j^e = A_j^e$ in 2D and $\Omega_j^e = V_j^e$ in 3D). This filter was, of course, derived via our
knowledge about $P_{CB}$; i.e., from (3.13-74) through (3.13-79) and the related discussion.
If the grid (internal node locations) is 'smoothed' in the 'same' way—by moving the
nodes according to (3.13-85), a more-accurate-yet pressure is obtained; for details, see
Section 4.2.9, and for an example, see Figure 17 of Sani et al. (1981a). Finally we remark
that the post-processed pressure from a penalty method approximation will also need the
same filter.
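A minimal sketch of the filter (3.13-85) follows, assuming a tensor-product mesh of rectangles; the function name `cb_filter`, the array layout, and the test pressure field are illustrative assumptions. It adds a large multiple of the checkerboard $\pm 1/A_k$ to a smooth element pressure and compares the filtered nodal values with and without the pollution.

```python
import numpy as np

def cb_filter(p_elem, hx, hy):
    """Area-weighted average of element pressures at each velocity node, cf. (3.13-85).
    p_elem[ej, ei] is the piecewise-constant pressure; returns nodal values
    on the (ny + 1) x (nx + 1) node grid."""
    nx, ny = len(hx), len(hy)
    area = np.outer(hy, hx)
    num = np.zeros((ny + 1, nx + 1))
    den = np.zeros_like(num)
    for dj in (0, 1):
        for di in (0, 1):
            # element (ej, ei) contributes to its corner node (ej + dj, ei + di)
            num[dj:dj + ny, di:di + nx] += p_elem * area
            den[dj:dj + ny, di:di + nx] += area
    return num / den

hx = np.array([1.0, 0.5, 2.0, 0.7])              # illustrative element widths
hy = np.array([0.3, 1.2, 0.9, 0.6])              # illustrative element heights
xc = np.cumsum(hx) - hx / 2                      # element centroids
yc = np.cumsum(hy) - hy / 2
smooth = 1.0 + 2.0 * xc[None, :] - yc[:, None]   # a smooth pressure (assumed)
sign = (-1.0) ** np.add.outer(np.arange(len(hy)), np.arange(len(hx)))
polluted = smooth + 50.0 * sign / np.outer(hy, hx)

diff = np.abs(cb_filter(polluted, hx, hy) - cb_filter(smooth, hx, hy))
interior_diff = diff[1:-1, 1:-1].max()           # nodes shared by four elements
print(interior_diff, diff.max())
```

At nodes shared by four (or two) elements the checkerboard cancels in the numerator; the four corner nodes of the domain, which belong to a single element, are not cleaned by this averaging.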
o Impure pressure modes. Thus far we have discussed pressure modes on 'finite difference'
meshes: uniform rectangles (bricks in 3D). This is a worst case with respect to
pressure modes, and it turns out that for practical/real-world/complex-geometry simulations
(i.e., for those cases where GFEM can really 'shine') the CB-modes are generally
absent, fortunately. There is (usually) no redundant continuity equation (equations in
3D), and thus no pure CB-modes and no zero eigenvalues. So, are we now 'home' with
respect to the CB-mode(s)? Close, but not quite; there may exist what were called 'impure
modes' by Sani et al. (1981a). Since the time of that publication, which raised a flag of
fear regarding the possible consequences of these impure modes, we (and many others, we
believe) have accumulated many years of successful experience with Q1Q0 on a variety of
meshes and never 'worry' about impure pressure modes. We simply apply the 'standard'
filter, (3.13-85), regardless of the possible existence of pure or impure modes—and are
virtually always happy with our pressures. Nevertheless, the impure modes can perhaps
cause an occasional difficulty/surprise, so we shall briefly summarize them. At their very
worst, they correspond to a singular perturbation problem; i.e., there may be a mesh that
supports a pure CB that is only an 'ε-away' (nodal locations) from the chosen mesh. If
this happens, the following bizarre behavior could occur: whereas the CB-eigenvalue is
only slightly perturbed from zero ($\lambda_{CB} = O(\epsilon^2)$ in fact), the solution (velocity portion, in
fact) may be perturbed in a major way, to order 1 in ε, and thus be rather inaccurate. The
CB-eigenvector is still present, and with a computable (not arbitrary) and large
magnitude, $P_{CB} = O(1/\epsilon)$; see Sani et al. (1981a) for details. Also, $P_{CB}$ need no longer be
orthogonal to g; the algebraic problem is well-posed for all g. Although the probability of
encountering such a solution is probably very low, it is not zero—and a way to reckon with
it would be nice. We summarize two ways in the next two subsections, after mentioning
that the conventional filter seems to work well for any impure modes that may exist, and
that mesh refinement will always reduce the problem (convergence is still assured).
o The pesky modes of Q1Q0. If one studies the literature on Q1Q0, namely, (at least)
Boland and Nicolaides (1984), Girault and Raviart (1986), Brezzi and Fortin (1991), one
will see that there are other 'unstable' and 'CB-like' modes besides the pure CB-mode
with its zero eigenvalue and the impure modes discussed above—modes with an LBB
'constant' (see next section) that is not a constant, but rather $O(h)$. Although these too
are impure modes, they are rather different than those just discussed in that they are
present in perfectly regular meshes of square (or rectangular) elements. For this reason,
and because they are rarely big enough to really cause trouble, we shall adopt the name
coined by D. Silvester and PMG when they first stumbled upon them in 1991-1992:
'pesky modes.' How many there are and how much harm they can do has been, and
perhaps still is, an open issue—although some seem to believe that as many as one fourth
of the total pressures are in this category.
One 'measure' of them, more or less in retrospect, nearly surfaced in our first
publications on the subject (Sani et al., 1981a), but did not; i.e., when we tested a reasonably
obvious method for removal of the pure/global CB-mode by making it orthogonal to the
final (filtered) pressure, it did not 'work' as well as we had hoped—small amounts of
'unexplainable' pressure mode seemed to be still present. The method that should have
worked but did not is this: if $P_N$ is the numerical pressure from the NS code and $P_F$ is
the to-be-determined filtered pressure in which the CB-mode ($P_{CB}$) is removed, then we
can hypothesize that the following procedure would work:
$$P_F = P_N - \beta P_{CB}, \qquad (3.13\text{-}86)$$
where $\beta$ is determined so that $P_F$ lies in the orthogonal complement of $P_{CB}$; i.e., setting
$P_{CB}^T P_F = 0$ yields
$$\beta = P_{CB}^T P_N / P_{CB}^T P_{CB}, \qquad (3.13\text{-}87)$$
and (3.13-86) then ostensibly gives a CB-less pressure (still at element centroids).
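The projection of (3.13-86) and (3.13-87) is a single line of linear algebra. The sketch below is illustrative only: the function name `remove_cb_component`, the 4 x 4 uniform mesh, and the random stand-in for the 'physical' pressure are assumptions made for the example.

```python
import numpy as np

def remove_cb_component(p_n, p_cb):
    """Subtract from p_n its projection onto p_cb, cf. (3.13-86) and (3.13-87)."""
    beta = (p_cb @ p_n) / (p_cb @ p_cb)
    return p_n - beta * p_cb, beta

n = 4                                                   # 4 x 4 uniform mesh (assumed)
p_cb = np.array([(-1.0) ** (i + j) for j in range(n) for i in range(n)])
rng = np.random.default_rng(0)
p_physical = rng.standard_normal(n * n)                 # stand-in for the true pressure
p_n = p_physical + 3.0 * p_cb                           # polluted numerical pressure

p_f, beta = remove_cb_component(p_n, p_cb)
print(beta, p_cb @ p_f)
```

Note that the projection also removes whatever component the physical pressure itself has along $P_{CB}$, so $\beta$ need not equal the injected amplitude exactly.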
The reason that it left lingering, pesky modes is that there indeed are additional CB-like
modes, recently quantified (finally!) by D. Griffiths (see Griffiths and Silvester, 1994, and
Griffiths, 1996), which modes (i) are not pure (non-zero eigenvalues, and the velocity
is also oscillatory); and (ii) are not vanquished via 'shear stress' BC's on a portion of
$\Gamma$. We shall describe these modes in more detail later (Section 3.13.5k); suffice it to say
here that they are rather smooth modes that are modulated by the roughest of modes—the
pure CB-mode—and that the same filter designed for the pure CB-mode also works very
well on these pesky modes, which are also, fortunately, rather hard to 'excite' (it seems
to require 'rough' data); i.e., even before filtering, their amplitudes are usually small.
o Pressure pegging. Although we are only tentative regarding this trick in 3D, we know
it can work well in 2D, and it is this: if you detect an anomalous velocity solution and/or
an inordinately large CB-polluted pressure solution that might be caused by an impure
mode, try the following: (i) select a region of your problem where you expect $\nabla P$ to be
small, (ii) pick one 'red' element and its black neighbor, and peg these pressures at zero
for your next run, which (of course) removes the two associated constraint equations
from the system. This will trade off very slight [$O(\epsilon)$] mass imbalances on these two
elements for a nice regularization of the matrix and thus eliminate any $O(1)$ errors in
velocity that may have been present. The magnitude of the CB-part of the pressure will
also be small. We conclude by mentioning that this trick could even work well for a
pure CB-grid; the velocity solution will be unaffected and the CB-pressure more-or-less
minimized. Finally, if the BC's do not support a hydrostatic mode, peg only one pressure
rather than two, preferably at zero in a region of the domain where the pressure is close
to zero.
Pressure pegging, in general, may sometimes be convenient, and can be invoked in the
following ways, in 2D and 3D, after ensuring that the problem is solvable [$q_i^T g = 0$
à la (3.13-64)]: in 2D, peg any red element and one of its black neighbors in the general
case (both modes present), and any single element when only one mode is present. In 3D,
we consider only the worst case, 'all Dirichlet' (e.g., flow in a 'box' with brick-shaped
elements); an easy way to preclude the $N_x + N_y + N_z - 1$ pressure modes is to peg the
three columns of elements extending in the three coordinate directions, starting from one
corner; i.e., along three orthogonal edges of the box (see Figure 8 in Sani et al., 1981a)
and one other (anywhere else).
o Node-freeing. The final CB-trick that we have derived goes in just the other direction.
Rather than reducing the size of the system, we increase it, by freeing-up previously
specified velocities at Dirichlet boundary nodes—and we again present only the 2D case.
We begin by noting that a simulation on any mesh (with any element, in fact) that supports
a pure hydrostatic mode can (if well-posed) be solved with no hydrostatic mode and no
adverse effects simply by releasing (freeing) the normal component of velocity at any
node on $\Gamma$; this causes the NBC to be activated with the result that P will be 'set' to ≈ 0
at this node, and the hydrostatic constraint equation (3.13-64), will assure that the velocity
solution is unchanged. Transferring this 'theory' to the (pure) CB-mode leads to the same
result except that it is one tangential velocity BC that must be released. If it is an impure
CB-mode, the same trick should both regularize the matrix [thus precluding $O(1)$ errors
in velocity] and cause no more than $O(\epsilon)$ mass imbalance on the selected element. We
conclude by noting that releasing $u_\tau$ in favor of setting $f_\tau = 0$ is legitimate with or
without an accompanying hydrostatic mode. Finally, we re-emphasize: pressure pegging
or node freeing should rarely be required, in 2D or 3D—we believe.
Table 3.13-5 Spurious modes in two 9-node elements.
Element | Spurious modes on rectangles | Comments
Q2Q-1 | One spurious $2\Delta x \times 2\Delta y$ mode ($\xi\eta$-wave) per element; $P_{CB}|_k = \pm 1/A_k$ on element k. | (i) It is tolerable and filterable; see below. (ii) The CB-mode can be said to 'live' at the four 2 x 2 Gaussian points within each element.
$Q_2^{(8)}Q_{-1}$ | One $\xi\eta$-wave; one $\xi$-wave; one $\eta$-wave. | Do not use this element.
o Higher-order elements. So much for Q1Q0, finally. We now switch gears to consider
higher-order quadrilateral elements with spurious pressure modes, but only briefly. In
Sani et al. (1981a), the eight-node (serendipity) element and the nine-node (Lagrangian
biquadratic) element with discontinuous bilinear pressure were considered and analyzed,
with results that are summarized in Table 3.13-5.
In Jackson and Cliffe (1981), another 'pressure mode' paper contemporaneous with our
own, were reported the following additional numbers of spurious modes for some higher-
order elements: Q2 Q2 has five, Q2Q2 has seven, and the Q2Q9 element has two.
By adding two more velocity degrees of freedom (x- and y-derivatives at the centroid),
they enhanced the Q2Q2 ' to what we would call Q{2 ]Q(2 ] with no spurious modes.
[They called it an 11-8 element and advocated it at that time; since then, however, they
came to prefer the Q2P-1 element, as do we, and many others (K.A. Cliffe, personal
communication).]
Returning to the non-leaky LDC and the Q2Q-1 element, the simplest relevant sample
problem for which the CB-constraint can 'act up,' we note that the constraint equation is
basically the same as that for Q1Q0; see (3.13-81) and its nine-node analog in Sani et al.
(1981a). For a mesh with N elements (even or odd) across the top and $u_1 = u_{2N+1} = 0$, we find
$(u_0 - 2u_2)/l_1 = (u_0 - 2u_{2N})/l_N$, where $l_1$ and $l_N$ are the lengths of the first and last
elements. Thus, the fix is similar to that of ramping over two elements for Q1Q0: set
$u_2 = u_{2N} = u_0/2$. [It would not work to ramp up to $u_0$ at the first node from the corner;
nor would it work to use the 'smooth' parabola discussed, for example, by Carey and
McLay (1986) wherein $u_2 = u_{2N} = 0.75 u_0$.]
In 3D, the Q2Q-1 element has not, to our knowledge, been analyzed—but based on
our 2D knowledge and of how the single CB-mode in the 2D Q1Q0 element expands to
many in 3D, we are fairly confident that the following remarks are true (for all-Dirichlet
BC's, of course):
1. As with Q1Q0, each 2D wave extends into the third dimension as a constant, leading
to the same number of 2D modes as the Q1Q0 displays when each 3D quadratic element
(27-node brick) is replaced by the eight analogous 2D elements (eight-node bricks).
2. There is one fully 3D pressure mode; a $\xi\eta\zeta$ wave $(-1)^{i+j+k}$.
3. Tangential velocity BC's remove pressure modes in the same ways they do for Q1Q0.
Finally, we address the subject of filtering the 2D CB-mode for/from Q2Q-1, but not
smoothing, as for Q1Q0, since the filtered results still apply at the same pressure nodes
(not at the velocity nodes a la Q\Qo)- There are two ways to derive the filter for this
element, and each relies, as for Q\Qo, on a knowledge of the CB-eigenvector: (i) assume
that the true (physical) pressure is L2-orthogonal to the xy mode on each element (i.e.,
assume that the true pressure has no projection onto the element-level CB-mode), and
(ii) simply subtract off the xy-portion of the computed pressure in each element. To derive
these, we first present the pressure basis functions for the / x /j-rectangular element shown
in Figure 3.13-8 (P\ through P4 are at the 2 x 2 Gaussian points). The element pressure
can be expressed in the usual way, P(x, y) = Ylj=\ Pji^jO0^ )0> or m me equivalent way,
P(x, y) = P0 + xPx + yPv + xyP
jtyi
where
P^ =
Px =
py =
\(Pl+P2+P3+P4),
V3
2/
2h
(P2-Pl+P3-P4),
(P4-P1+P3-P2),
and
/>=-(/>, -p2+p3-p4).
Assume
Ih
P = PP+aPCB,
(3.13-88)
(3.13-89)
(3.13-90)
(3.13-91)
(3.13-92)
(3.13-93)
where $P_P(x, y)$ is the (unknown) physical pressure, and $P_{CB} = xy$ is the local CB-pressure.
To obtain $a$, and thus $P_P = P - aP_{CB}$ as the filtered pressure, we assert/assume
that $\int_e P_P P_{CB} = 0$, which yields $\int_e xyP = a\int_e P_{CB}^2$, where $P$ is available and given by
(3.13-88). Thus, both integrals can be evaluated (quickly and easily when it is realized
that all integrands except that involving $P_{xy}$ are odd and integrate to zero), giving $a = P_{xy}$,
and thus
$$P_P = P - \frac{3xy}{lh}(P_1 - P_2 + P_3 - P_4) = P_0 + xP_x + yP_y, \tag{3.13-94}$$
showing that the filtered pressure is actually linear, à la the stable and accurate $Q_2P_{-1}$.
It is also clear that simply subtracting the $xy$ portion of the original pressure gives the
same result; namely, the physical pressure lies in the orthogonal complement of the null
space. In 3D, the analogous 3D 'CB' is that described by the $xyz$ portion of the pressure,
whose subtraction should leave a linear pressure on each element and no 3D pressure
mode. The many 2D modes in 3D are presumably similarly removed in each plane of
elements, by subtracting off the offending 2D CB-mode. ... We have done neither the
complete analysis nor numerical experiments.
Fig. 3.13-8 One $Q_2Q_{-1}$ element.
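The element-level filter of (3.13-88) through (3.13-94) can be sketched in a few lines. This is only an illustrative sketch, not code from the text: the function name `cb_filter`, the argument order, and the assumption that $P_1$ through $P_4$ sit counterclockwise from the lower-left Gauss point of an $l \times h$ element centered at the origin are ours.

```python
import numpy as np

def cb_filter(p, l, h):
    """Remove the xy (checkerboard) part of a bilinear element pressure.

    p : the four Gauss-point pressures (P1, P2, P3, P4), assumed to be
        ordered counterclockwise starting at the lower-left Gauss point.
    l, h : element width and height (hypothetical argument names).
    Returns the filtered Gauss-point pressures and (P0, Px, Py, Pxy).
    """
    p1, p2, p3, p4 = p
    p0 = 0.25 * (p1 + p2 + p3 + p4)
    px = np.sqrt(3.0) / (2.0 * l) * (p2 - p1 + p3 - p4)
    py = np.sqrt(3.0) / (2.0 * h) * (p4 - p1 + p3 - p2)
    pxy = 3.0 / (l * h) * (p1 - p2 + p3 - p4)
    # Gauss-point coordinates for an element centered at the origin.
    gx, gy = l / (2.0 * np.sqrt(3.0)), h / (2.0 * np.sqrt(3.0))
    xs = np.array([-gx, gx, gx, -gx])
    ys = np.array([-gy, -gy, gy, gy])
    filtered = p0 + xs * px + ys * py  # the linear part only, as in (3.13-94)
    return filtered, (p0, px, py, pxy)

# A pure checkerboard pattern is removed entirely ...
cb_only, _ = cb_filter([1.0, -1.0, 1.0, -1.0], l=2.0, h=1.0)
# ... while a linear field sampled at the Gauss points is left untouched.
gx, gy = 2.0 / (2 * np.sqrt(3)), 1.0 / (2 * np.sqrt(3))
xs = np.array([-gx, gx, gx, -gx])
ys = np.array([-gy, -gy, gy, gy])
linear = 2.0 + 3.0 * xs - ys
linear_filtered, coeffs = cb_filter(linear, l=2.0, h=1.0)
```

Both derivations in the text lead to the same operation, which is why the sketch needs only the four coefficients and no element integrals.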
Final 3D, 27-node-brick remark: it is still not firmly resolved as to whether the Q2Q-1
element with eight continuity equations per element and spurious modes will give a better
solution (when, of course, a solution exists!) than the Q2P-1 element with its ostensible
'shortage' of continuity equations (only four per element) but no pressure modes. (We
believe that it just might.)
More work, and lots of numerical experiments, are still required.
o The unsymmetric case, revisited. We now turn to the unsymmetric case. The simplest way
(via finite elements) to generate the unsymmetric system, (3.13-60), introduced already
in (3.13-33), which also more closely mimics many finite difference methods, is to avoid
integrating by parts the $\nabla P$ term. This, of course, precludes the lowest-order
quadrilateral element, $Q_1Q_0$, from consideration; only elements with the ability to generate a valid
approximation to $\nabla P$ can be employed. (The only pressure gradient that exists for $Q_1Q_0$ is
the weak one; ditto $P_1P_0$.) The reason, or at least one reason, for doing so is related to
outflow boundary conditions (OBC's), a subject we take up in greater depth in Volume II.
Thus, for example, in Jackson and Cliffe (1981), Sani et al. (1984), and Eaton (1983), the
gain (or potential gain) in OBC flexibility was the reason for heading down (or almost
heading down) the path of ill-posed formulations à la Section 3.12.5, although it was
not really known then (but suspected, by some) that the formulation was (or could be)
ill-posed. Thus, we return to the weak formulation of the equations in Section 3.12.1; if the
pressure gradient term is not integrated by parts (see (3.12-4) through (3.12-19)), there
are two changes: (i) the pressure portion of (3.12-14) and (3.12-16) is absent (it is no
longer part of the NBC), and (ii) the derivatives are not shifted to the velocity test
functions; $\int(-P\,\partial\varphi/\partial x)$ in (3.12-18) becomes $\int\varphi\,\partial P/\partial x$, with an analogous change in
(3.12-19). The net result, in the final GFEM equations given by (3.13-28) and (3.13-29),
is that we lose the 'div-grad symmetry' present there because '$C$' is no longer the
transpose of $C^T$ (which remains unchanged); rather, $C$ is replaced by $G$ (gradient matrix)
where, cf. (3.13-25), $G_\alpha = \int\varphi\,\partial\psi^T/\partial x_\alpha$. This formulation leads to, for the special cases
under consideration in this section, (3.13-60), where $D = C^T$.
Remarks:
(1) It may be worth mentioning that the difference between $C$ and $G$ is in some sense
quite small; indeed, it is zero if all BC's are Dirichlet. Thus, they agree in all of $\Omega$
and only differ on those parts of $\Gamma$ on which the normal NBC is applied. To see
this, simply note that
$$\int_\Omega \mathbf{v}\cdot\nabla P = -\int_\Omega P\,\nabla\cdot\mathbf{v} + \int_\Gamma P\,\mathbf{n}\cdot\mathbf{v},$$
where $\mathbf{v}$ is a vector test function à la (3.12-34), which leads to $(G - C)P|_i =
\int_\Gamma P\,(\mathbf{n}\cdot\mathbf{v})_i$ for all $i$, and we know that $\mathbf{n}\cdot\mathbf{v}|_\Gamma = 0$ except on $\Gamma_N$; thus, $G = C$ nearly
everywhere.
(2) But, on $\Gamma_N$, the difference between $G$ and $C$ is profound; $GP$ is a 'gradient' ($\nabla P$),
and $CP$ is a 'force.' These differences, and their effects, are demonstrated below.
(3) $GP_H = 0$, always; yet, if the problem has inflow and outflow with an NBC as OBC,
there is no redundant continuity equation associated with this singularity; $P_H$ is
then a spurious hydrostatic mode.
*o A simple example. We conclude this section with a simple but carefully developed
and hopefully useful (but long) example, with the intent of demonstrating the theory
discussed above. Specifically, we consider the following exact solution to the 2D steady
Stokes equations in an unbounded domain: $u = x^2$, $v = -2xy$, $P = 2x + ay$, where $a$
corresponds to a body force (like gravity). Figure 3.13-9 depicts the streamlines for
this solution [and, for $a = 0$, the (vertical) isobars]. The boxed subdomain is the one
chosen for our example. And, for the ultimate in simplicity, we cover this domain by a
single element: the $Q_2^{(8)}Q_1$ 'serendipity' element, because it is the simplest
(quadrilateral) element that is capable of representing the exact solution. We shall specify
Dirichlet/essential BC's along the axes and NBC's along the other two boundaries, per
Figure 3.13-10 below, in which there are only 10 degrees of freedom: three $u$'s, three
$v$'s, and four $P$'s (• is velocity, □ represents pressure).
Fig. 3.13-9 Streamlines and isobars (- - -) for a Stokes test problem.
Fig. 3.13-10 One serendipity element.
We shall 'solve' this problem in several ways:
1. Conventional GFEM.
2. Unconventional FEM, in which the pressure gradient is not 'integrated around by parts.'
3. Variations on 2, in which the pressure is specified at a point.
The reason that 3 is even considered is related to the simple fact that 2 is generally ill-
posed: the Stokes matrix is singular (and unsymmetric), and the RHS vector is (generally)
not orthogonal to the null vector of the transpose matrix. Before embarking on the details
(in which the Devil resides), let us state the bottom line: the Galerkin FEM is always
well-posed and for the example here can (given exact data) obtain the exact solution,
whereas the unconventional FEM is generally ill-posed (exception: exact data) unless
global mass conservation is forsaken via pressure specification—and, of course, cannot
thereby obtain the exact solution.
The GFEM condensed statement of the problem is
$$Ku + CP = f, \tag{3.13-95}$$
$$C^Tu = g, \tag{3.13-96}$$
where we use the simplest (y = 0) option so that these equations mimic $-\nabla^2\mathbf{u} + \nabla P =
a\,\mathbf{e}_y$ and $\nabla\cdot\mathbf{u} = 0$. The unsymmetric version is
$$Ku + GP = f^G, \tag{3.13-97}$$
$$C^Tu = g, \tag{3.13-98}$$
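where $G$ replaces $C$ and, because of different NBC's, $f^G$ replaces $f$. The first job at
hand is to construct the new gradient matrix, $G$, which we do starting from the available
($16 \times 4$) $C$-matrix of Appendix 1. We will detail the construction for the $x$-momentum
equation only, leaving $y$ to the reader: from (3.13-25), with $\alpha$ switched to a superscript for
convenience, $C^x_{ij} = -\int(\partial\varphi_i/\partial x)\,\psi_j$, and from the Stokes equations with no integration
by parts of $\nabla P$, we have the obvious definition, $G^x_{ij} = \int\varphi_i\,\partial\psi_j/\partial x$, so that
$$G^x_{ij} - C^x_{ij} = \int_\Omega \frac{\partial}{\partial x}(\varphi_i\psi_j) = \int_\Gamma \varphi_i\psi_j n_x. \tag{3.13-99}$$
Defining
$$B^x_{ij} \equiv \int_\Gamma \varphi_i\psi_j n_x$$
gives the following ($8 \times 4$) boundary element matrix on the 'usual' $l \times h$ rectangular
domain (all integrals are over $-1 \le \eta \le 1$):
$$B^x = \frac{h}{2}\begin{bmatrix}
-\int\varphi_1\psi_1\,d\eta & 0 & 0 & -\int\varphi_1\psi_4\,d\eta \\
0 & \int\varphi_2\psi_2\,d\eta & \int\varphi_2\psi_3\,d\eta & 0 \\
0 & \int\varphi_3\psi_2\,d\eta & \int\varphi_3\psi_3\,d\eta & 0 \\
-\int\varphi_4\psi_1\,d\eta & 0 & 0 & -\int\varphi_4\psi_4\,d\eta \\
0 & 0 & 0 & 0 \\
0 & \int\varphi_6\psi_2\,d\eta & \int\varphi_6\psi_3\,d\eta & 0 \\
0 & 0 & 0 & 0 \\
-\int\varphi_8\psi_1\,d\eta & 0 & 0 & -\int\varphi_8\psi_4\,d\eta
\end{bmatrix},$$
which, using the serendipity element basis functions, leads to
$$B^x = \frac{h}{6}\begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 \\
0 & 2 & 2 & 0 \\
0 & 0 & 0 & 0 \\
-2 & 0 & 0 & -2
\end{bmatrix}$$
and, from $G^x_{ij} = C^x_{ij} + B^x_{ij}$,
$$G^x = \frac{h}{36}\begin{bmatrix}
1 & -1 & -2 & 2 \\
1 & -1 & -2 & 2 \\
2 & -2 & -1 & 1 \\
2 & -2 & -1 & 1 \\
-8 & 8 & 4 & -4 \\
-6 & 6 & 6 & -6 \\
-4 & 4 & 8 & -8 \\
-6 & 6 & 6 & -6
\end{bmatrix},$$
in which the entries at the non-zero positions of the boundary matrix are those that differ
from the corresponding entries in $C^x_{ij}$.
This is the full element-matrix; we need only rows 3, 6, and 7 for our purposes, since
we have only 10 degrees of freedom (unknowns) in our test problem; namely, $u_3$, $u_6$, $u_7$,
$v_3$, $v_6$, $v_7$, and $P_1$, $P_2$, $P_3$, $P_4$.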
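Omitting the details, we also present the y-version of the new gradient matrix, 'for the
record':
$$G^y = \frac{l}{36}\begin{bmatrix}
1 & 2 & -2 & -1 \\
2 & 1 & -1 & -2 \\
2 & 1 & -1 & -2 \\
1 & 2 & -2 & -1 \\
-6 & -6 & 6 & 6 \\
-4 & -8 & 8 & 4 \\
-6 & -6 & 6 & 6 \\
-8 & -4 & 4 & 8
\end{bmatrix},$$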
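and again, only rows 3, 6, and 7 are needed here. By referring to Appendix 1, and
grabbing rows 3, 6, and 7 of the appropriate $K$ ($-\nabla^2$) and $C$ matrices, we can now form
the LHS of (3.13-95) and (3.13-96) and of (3.13-97) and (3.13-98), where, for the
case of interest, $l = h = 2$. First, for GFEM, the LHS is
$$\begin{bmatrix}
\dfrac{1}{45}\begin{pmatrix} 52 & -37 & -37 \\ -37 & 104 & 0 \\ -37 & 0 & 104 \end{pmatrix} & 0 &
\dfrac{1}{18}\begin{pmatrix} 2 & -2 & -7 & 1 \\ -6 & -6 & -6 & -6 \\ -4 & 4 & 8 & -8 \end{pmatrix} \\[3ex]
0 & \dfrac{1}{45}\begin{pmatrix} 52 & -37 & -37 \\ -37 & 104 & 0 \\ -37 & 0 & 104 \end{pmatrix} &
\dfrac{1}{18}\begin{pmatrix} 2 & 1 & -7 & -2 \\ -4 & -8 & 8 & 4 \\ -6 & -6 & -6 & -6 \end{pmatrix} \\[3ex]
\dfrac{1}{18}\begin{pmatrix} 2 & -6 & -4 \\ -2 & -6 & 4 \\ -7 & -6 & 8 \\ 1 & -6 & -8 \end{pmatrix} &
\dfrac{1}{18}\begin{pmatrix} 2 & -4 & -6 \\ 1 & -8 & -6 \\ -7 & 8 & -6 \\ -2 & 4 & -6 \end{pmatrix} & 0
\end{bmatrix}
\begin{bmatrix} u_3 \\ u_6 \\ u_7 \\ v_3 \\ v_6 \\ v_7 \\ P_1 \\ P_2 \\ P_3 \\ P_4 \end{bmatrix};$$
for the unsymmetric version, only the upper $6 \times 4$ matrix changes ($C \to G$); it is then
$$G = \frac{1}{18}\begin{bmatrix}
2 & -2 & -1 & 1 \\
-6 & 6 & 6 & -6 \\
-4 & 4 & 8 & -8 \\
2 & 1 & -1 & -2 \\
-4 & -8 & 8 & 4 \\
-6 & -6 & 6 & 6
\end{bmatrix},$$
which, it is important to note and in contrast to $C$, annihilates a constant (pressure)
vector; $GP_H = 0$.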
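The two statements just made (that $G$ annihilates a constant pressure vector while $C$ does not, and that $G$ and $C$ differ only in rows touched by the NBC boundaries) are easy to check numerically. The following is a minimal sketch, assuming the $6 \times 4$ blocks for $l = h = 2$ as transcribed here, with rows ordered $u_3, u_6, u_7, v_3, v_6, v_7$; the array names are ours.

```python
import numpy as np

# 6 x 4 coupling blocks for l = h = 2 (rows: u3, u6, u7, v3, v6, v7;
# columns: P1..P4), including the common factor 1/18.
C = np.array([[ 2, -2, -7,  1],
              [-6, -6, -6, -6],
              [-4,  4,  8, -8],
              [ 2,  1, -7, -2],
              [-4, -8,  8,  4],
              [-6, -6, -6, -6]]) / 18.0
G = np.array([[ 2, -2, -1,  1],
              [-6,  6,  6, -6],
              [-4,  4,  8, -8],
              [ 2,  1, -1, -2],
              [-4, -8,  8,  4],
              [-6, -6,  6,  6]]) / 18.0

p_hydrostatic = np.ones(4)          # constant pressure, P_H
G_on_constant = G @ p_hydrostatic   # the gradient of a constant vanishes
C_on_constant = C @ p_hydrostatic   # a constant pressure still exerts a boundary 'force'
boundary_part = 18.0 * (G - C)      # the boundary-integral difference, scaled by 18

print(G_on_constant)
print(C_on_constant)
print(boundary_part)
```

The non-zero entries of `boundary_part` sit only in the rows for $u_3$, $u_6$ (right boundary) and $v_3$, $v_7$ (top boundary), which is the discrete counterpart of Remark (1).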
We now focus our attention on the RHS vectors, in which $f$ and $f^G$ receive
contributions from all three 'sources': (i) inhomogeneous Dirichlet velocity BC's,
(ii) inhomogeneous NBC's, and (iii) body forces (accelerations): $f = f_D + f_{NBC} + f_{BF}$,
and we construct each in turn (and later repeat it for $f^G$).
For $f_D$, we start with the eight-node element matrix for $-\nabla^2$ from Appendix 1, and
transpose to the RHS the Dirichlet data (nodes 1, 2, 4, 5, and 8), first for $u$ and then for
$v$, to obtain
$$f_D = \frac{1}{90}\begin{bmatrix}
-46u_1 - 45u_2 - 45u_4 + 46u_5 + 46u_8 \\
46u_1 + 74u_2 + 46u_4 - 32u_8 \\
46u_1 + 46u_2 + 74u_4 - 32u_5 \\
-46v_1 - 45v_2 - 45v_4 + 46v_5 + 46v_8 \\
46v_1 + 74v_2 + 46v_4 - 32v_8 \\
46v_1 + 46v_2 + 74v_4 - 32v_5
\end{bmatrix} = \frac{1}{45}\begin{bmatrix} -67 \\ 148 \\ 76 \\ 0 \\ 0 \\ 0 \end{bmatrix},$$
where we have employed the exact solution, $u = x^2$, $v = -2xy$.
Fig. 3.13-11 The natural boundary conditions. Top boundary: $f_x = -f_\tau = \partial u/\partial y = 0$,
$f_y = f_n = -P + \partial v/\partial y = -4x - 2a$. Right boundary: $f_x = f_n = -P + \partial u/\partial x = -ay$,
$f_y = f_\tau = \partial v/\partial x = -2y$.
For $f_{NBC}$, we first refer to Figure 3.13-11, in which the exact solution is inserted into
the 'traction' terms. The NBC contribution, $f_{NBC}$, is thus comprised of the following
boundary integrals, wherein we invoke the shortcut notation that $\int_a^b(\cdot)$ denotes the
boundary integral between nodes $a$ and $b$:
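$$f_{NBC} = \begin{bmatrix}
\int_2^3\varphi_3 f_x + \int_3^4\varphi_3 f_x \\
\int_2^3\varphi_6 f_x \\
\int_3^4\varphi_7 f_x \\
\int_2^3\varphi_3 f_y + \int_3^4\varphi_3 f_y \\
\int_2^3\varphi_6 f_y \\
\int_3^4\varphi_7 f_y
\end{bmatrix} = \begin{bmatrix}
\int_2^3\varphi_3(-ay) + \int_3^4 0 \\
\int_2^3\varphi_6(-ay) \\
0 \\
\int_2^3\varphi_3(-2y) + \int_3^4\varphi_3(-4x - 2a) \\
\int_2^3\varphi_6(-2y) \\
\int_3^4\varphi_7(-4x - 2a)
\end{bmatrix} = \begin{bmatrix}
-2a/3 \\ -4a/3 \\ 0 \\ -4 - 2a/3 \\ -8/3 \\ -16/3 - 8a/3
\end{bmatrix},$$
wherein we observe that part of $f_{NBC}$ actually comes from the body force term, $a\mathbf{e}_y$, via
the pressure force.
Finally, the body force contribution from the 'bulk' integrals is
$$f_{BF} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \int\varphi_3 a \\ \int\varphi_6 a \\ \int\varphi_7 a \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \\ -a/3 \\ 4a/3 \\ 4a/3 \end{bmatrix},$$
so that the total $f$-vector is
$$f = \begin{bmatrix}
-67/45 - 2a/3 \\ 148/45 - 4a/3 \\ 76/45 \\ -4 - a \\ -8/3 + 4a/3 \\ -16/3 - 4a/3
\end{bmatrix}.$$
We now turn to the $g$-vector, from the mass conservation equations.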
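Before moving on to $g$, the boundary integrals behind $f_{NBC}$ can be reproduced by quadrature. The sketch below is ours, not the authors': it assumes the one-element domain $0 \le x, y \le 2$ with node 3 at the corner $(2, 2)$, node 6 at $(2, 1)$ and node 7 at $(1, 2)$, so that along each NBC edge the serendipity functions reduce to the 1D quadratics $s(s+1)/2$ (end node 3) and $1 - s^2$ (midside node), with $s \in [-1, 1]$. The helper names are hypothetical.

```python
import numpy as np

def edge_integral(shape, traction, npts=4):
    """Integrate shape(s) * traction(t) along an edge of length 2.

    s in [-1, 1] is the edge parameter and t = s + 1 the physical
    coordinate (x or y) along the edge, so the Jacobian is 1.
    """
    s, w = np.polynomial.legendre.leggauss(npts)
    return float(np.sum(w * shape(s) * traction(s + 1.0)))

corner = lambda s: 0.5 * s * (s + 1.0)   # node 3, at the s = +1 end of each edge
midside = lambda s: 1.0 - s**2           # node 6 (right edge) or node 7 (top edge)

def f_nbc(a, with_pressure=True):
    """Traction load vector in the order (u3, u6, u7, v3, v6, v7)."""
    if with_pressure:                    # Galerkin (C-matrix) NBC data
        right_fx = lambda y: -a * y
        top_fy = lambda x: -4.0 * x - 2.0 * a
    else:                                # pressure-free (G-matrix) NBC data
        right_fx = lambda y: 4.0 + 0.0 * y
        top_fy = lambda x: -2.0 * x
    right_fy = lambda y: -2.0 * y
    # The top-edge f_x (= du/dy) is identically zero and is omitted.
    return np.array([
        edge_integral(corner, right_fx),
        edge_integral(midside, right_fx),
        0.0,
        edge_integral(corner, right_fy) + edge_integral(corner, top_fy),
        edge_integral(midside, right_fy),
        edge_integral(midside, top_fy),
    ])

a = 3.0
f_c = f_nbc(a, with_pressure=True)
f_g = f_nbc(a, with_pressure=False)
print(f_c)
print(f_g)
```

Four Gauss points are more than enough here, since every integrand is at most cubic in $s$.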
It is worthwhile to start by writing out the four full continuity equations, corresponding
to $P_1$ through $P_4$, again using Appendix 1 for the $C^T$-matrix; here for $l = h = 2$ and
dropping the common factor 1/18:
$$\begin{aligned}
(7u_1 + u_2 + 2u_3 + 2u_4 - 8u_5 - 6u_6 - 4u_7 + 6u_8)
 + (7v_1 + 2v_2 + 2v_3 + v_4 + 6v_5 - 4v_6 - 6v_7 - 8v_8) &= 0, \\
(-u_1 - 7u_2 - 2u_3 - 2u_4 + 8u_5 - 6u_6 + 4u_7 + 6u_8)
 + (2v_1 + 7v_2 + v_3 + 2v_4 + 6v_5 - 8v_6 - 6v_7 - 4v_8) &= 0, \\
(-2u_1 - 2u_2 - 7u_3 - u_4 + 4u_5 - 6u_6 + 8u_7 + 6u_8)
 + (-2v_1 - v_2 - 7v_3 - 2v_4 + 6v_5 + 8v_6 - 6v_7 + 4v_8) &= 0, \\
(2u_1 + 2u_2 + u_3 + 7u_4 - 4u_5 - 6u_6 - 8u_7 + 6u_8)
 + (-v_1 - 2v_2 - 2v_3 - 7v_4 + 6v_5 + 4v_6 - 6v_7 + 8v_8) &= 0.
\end{aligned}$$
Before forming the $g$-vector, it is interesting to add the four equations to obtain the
following global mass balance:
$$6[(u_1 - u_2) + 4(u_8 - u_6) + (u_4 - u_3)] + 6[(v_1 - v_4) + 4(v_5 - v_7) + (v_2 - v_3)] = 0,$$
which is interesting because it is also an element-level mass balance, a balance not
normally achieved when using continuous pressure basis functions. (But it is also true
that one-element domains are not common.)
Then, transposing all Dirichlet data to the RHS gives the $C^T$-matrix already displayed
and gives $g$ on the RHS (after reinstating the 1/18 factor):
$$g = \frac{1}{18}\begin{bmatrix}
(-7u_1 - u_2 - 2u_4 + 8u_5 - 6u_8) + (-7v_1 - 2v_2 - v_4 - 6v_5 + 8v_8) \\
(u_1 + 7u_2 + 2u_4 - 8u_5 - 6u_8) + (-2v_1 - 7v_2 - 2v_4 - 6v_5 + 4v_8) \\
(2u_1 + 2u_2 + u_4 - 4u_5 - 6u_8) + (2v_1 + v_2 + 2v_4 - 6v_5 - 4v_8) \\
(-2u_1 - 2u_2 - 7u_4 + 4u_5 - 6u_8) + (v_1 + 2v_2 + 7v_4 - 6v_5 - 8v_8)
\end{bmatrix} = \frac{1}{18}\begin{bmatrix} 4 \\ 20 \\ 4 \\ -4 \end{bmatrix},$$
which comes from the only non-zero values: $u_2 = 4$ and $u_5 = 1$ (see Figure 3.13-10).
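The continuity bookkeeping above can be replayed numerically. The sketch below is illustrative only; the node coordinates are an assumption inferred from the nodal values quoted in the text (corners 1 to 4 at $(0,0)$, $(2,0)$, $(2,2)$, $(0,2)$ and midsides 5 to 8 at $(1,0)$, $(2,1)$, $(1,2)$, $(0,1)$), and the variable names are ours.

```python
import numpy as np

# The four continuity equations (factor 1/18 dropped); columns are
# u1..u8 followed by v1..v8, rows correspond to P1..P4.
CT = np.array([
    [ 7,  1,  2,  2, -8, -6, -4, 6,   7,  2,  2,  1, 6, -4, -6, -8],
    [-1, -7, -2, -2,  8, -6,  4, 6,   2,  7,  1,  2, 6, -8, -6, -4],
    [-2, -2, -7, -1,  4, -6,  8, 6,  -2, -1, -7, -2, 6,  8, -6,  4],
    [ 2,  2,  1,  7, -4, -6, -8, 6,  -1, -2, -2, -7, 6,  4, -6,  8],
], dtype=float)

# Assumed node coordinates for the single serendipity element.
x = np.array([0, 2, 2, 0, 1, 2, 1, 0], dtype=float)
y = np.array([0, 0, 2, 2, 0, 1, 2, 1], dtype=float)
nodal = np.concatenate([x**2, -2.0 * x * y])   # exact u = x^2, v = -2xy

residual = CT @ nodal        # each discrete continuity equation, exact data
balance = CT.sum(axis=0)     # coefficients of the global mass balance

# Move the Dirichlet data (nodes 1, 2, 4, 5, 8 for both u and v) to the RHS.
dirichlet = [0, 1, 3, 4, 7, 8, 9, 11, 12, 15]
g = -CT[:, dirichlet] @ nodal[dirichlet] / 18.0

print(residual, balance, 18.0 * g)
```

Summing the rows leaves only boundary-flux weights (1, 4, 1 per edge, Simpson-like), which is the element-level mass balance noted above.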
All of the data are now available to obtain the GFEM solution of (3.13-95) and
(3.13-96), which solution gives the exact results ($u = x^2$, $v = -2xy$, $P = 2x + ay$), as
can be easily verified by substitution:
$$(u_3, u_6, u_7, v_3, v_6, v_7, P_1, P_2, P_3, P_4)^T = (4, 4, 1, -8, -4, -4, 0, 4, 4 + 2a, 2a)^T.$$
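The 'verification by substitution' can also be done by simply solving the $10 \times 10$ system. The following is a minimal sketch under the assumption that the blocks below are transcribed correctly from the displayed matrices ($l = h = 2$); the function names and the sample value of $a$ are ours.

```python
import numpy as np

# Unknown ordering: u3, u6, u7, v3, v6, v7, P1, P2, P3, P4.
K3 = np.array([[52, -37, -37], [-37, 104, 0], [-37, 0, 104]]) / 45.0
K = np.block([[K3, np.zeros((3, 3))], [np.zeros((3, 3)), K3]])
C = np.array([[ 2, -2, -7,  1],
              [-6, -6, -6, -6],
              [-4,  4,  8, -8],
              [ 2,  1, -7, -2],
              [-4, -8,  8,  4],
              [-6, -6, -6, -6]]) / 18.0

def gfem_system(a):
    """Assemble the symmetric (Galerkin) system of (3.13-95) and (3.13-96)."""
    A = np.block([[K, C], [C.T, np.zeros((4, 4))]])
    f = np.array([-67/45 - 2*a/3, 148/45 - 4*a/3, 76/45,
                  -4 - a, -8/3 + 4*a/3, -16/3 - 4*a/3])
    g = np.array([4, 20, 4, -4]) / 18.0
    return A, np.concatenate([f, g])

def exact_solution(a):
    """Nodal values of u = x^2, v = -2xy, P = 2x + a*y at the free nodes."""
    return np.array([4, 4, 1, -8, -4, -4, 0, 4, 4 + 2*a, 2*a], dtype=float)

a = 0.7                      # arbitrary sample body-force coefficient
A, b = gfem_system(a)
x = np.linalg.solve(A, b)
print(np.allclose(x, exact_solution(a)))
```

Because the matrix is non-singular, the pressure level is fixed by the NBC data; no pressure needs to be pegged.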
We now turn to the G-matrix version of the same problem, which requires the
construction of $f^G$. To do this, we first repeat in Figure 3.13-12 the 'NBC' sketch shown earlier
(Figure 3.13-11), wherein the pressure terms are now absent. Building $f^G_{NBC}$ in the same
way we did for $f_{NBC}$ (i.e., using the exact solution) yields
$$f^G_{NBC} = \begin{bmatrix} 4/3 \\ 16/3 \\ 0 \\ -8/3 \\ -8/3 \\ -8/3 \end{bmatrix}.$$
Fig. 3.13-12 Natural boundary conditions sans pressure. Top boundary: $f_x = -f_\tau = \partial u/\partial y = 0$,
$f_y = f_n = \partial v/\partial y = -2x$. Right boundary: $f_x = f_n = \partial u/\partial x = 4$,
$f_y = f_\tau = \partial v/\partial x = -2y$.
Since the other two parts of the $f$-vector remain unchanged, the total $f^G$-vector is
$$f^G = \begin{bmatrix}
-67/45 + 4/3 \\ 148/45 + 16/3 \\ 76/45 + 0 \\ -a/3 - 8/3 \\ 4a/3 - 8/3 \\ 4a/3 - 8/3
\end{bmatrix} = \begin{bmatrix}
-7/45 \\ 388/45 \\ 76/45 \\ -a/3 - 8/3 \\ 4a/3 - 8/3 \\ 4a/3 - 8/3
\end{bmatrix}.$$
The g-vector is the same for the 'G-problem' as it is for the 'C-problem' since both
use $C^T$; thus, we have completed the definition of A and b in Ax = b for each. For the
symmetric case (C-matrix), A is non-singular, and the solution of (3.13-95) and (3.13-96)
gives the exact solution—'as advertised' [i.e., if the exact solution is contained in the
grab bag (the trial space), GFEM will find it]. For the unsymmetric case (G-matrix), A
is singular, and the solution of (3.13-97) and (3.13-98) is not unique—when it exists. It
turns out that it does exist for the above data, and it is unique (and exact) only up to an
additive multiple of $P_H$, the hydrostatic pressure mode; i.e., we have the special case of a
consistent singular system. This case is very special in that the use of the exact solution
to build $f^G_{NBC}$ yields a consistent system; virtually any $f^G_{NBC}$ other than that shown above
leads to an ill-posed problem, a point that we shall prove below. (Indeed, part of the
reason for defining such a 'small' problem is so that the linear algebra issues are easy to
illustrate.)
Since (3.13-97) and (3.13-98) have a singular matrix, we are led to seek the null vector
(corresponding to the pure pressure mode, $P_H$) for the adjoint/transpose problem and then
test for solvability. Thus, we form
$$Kw + Cr = 0, \tag{3.13-100}$$
$$G^Tw = 0, \tag{3.13-101}$$
from the given matrices and seek the solution of the resulting 10 × 10 homogeneous linear
system, which is easily found to be
$$w^T = (w^x_3, w^x_6, w^x_7, w^y_3, w^y_6, w^y_7) = (2, 1, -1/2, 2, -1/2, 1)$$
and
$$r^T = (r_1, r_2, r_3, r_4) = (9/5, -14/5, 29/5, -14/5),$$
where we 'normalized' the eigenvector by setting $w^y_7 = 1$. Recalling that the solvability
constraint is $w^Tf^G + r^Tg = 0$, where $f^G = (f^x_3, f^x_6, f^x_7, f^y_3, f^y_6, f^y_7)^T$ and $g = (g_1, g_2,
g_3, g_4)^T$, leads directly to the solvability condition
$$2f^x_3 + f^x_6 - 0.5f^x_7 + 2f^y_3 - 0.5f^y_6 + f^y_7 + 1.8g_1 - 2.8g_2 + 5.8g_3 - 2.8g_4 = 0. \tag{3.13-102}$$
If this equation is not satisfied by the given data ($f$, $g$), then the problem is ill-posed
and no solution exists. Inserting the $f^G$ and $g$ results above into this constraint equation
yields
$$-14/45 + 388/45 - 38/45 - 2a/3 - 16/3 - 2a/3 + 4/3 + 4a/3 - 8/3
+ (1.8 \times 4 - 2.8 \times 20 + 5.8 \times 4 + 2.8 \times 4)/18 = 0,$$
or 0 = 0; i.e., it is satisfied identically! This rare result is obviously related to the
'simplicity' of the test problem and to our knowledge and utilization of the exact solution
to form $f^G$ ($f^G_{NBC}$ in particular). In the general case (many elements, no exact solution),
a solvability condition similar to (3.13-102), but with many more terms, will always exist
for the unsymmetric problem—but its satisfaction will rarely if ever be attained. Thus, the
general case is ill-posed. (Even though the A-matrix displays only a pure pressure mode
for its null vector, the null vector of AT will generally contain some non-zero entries in
the velocity portion of its null vector—as above—and these will usually assure that the
linear system Ax = b has no solution.)
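The linear algebra of the last few paragraphs can be checked directly. The sketch below is ours and rests on the matrices as transcribed for $l = h = 2$: it confirms that the stated $(w, r)$ is a left null vector of the unsymmetric matrix, that $P_H$ is a right null vector, and that the solvability condition (3.13-102) holds for the exact data but fails as soon as one traction entry is altered (here $f^y_6$ is set to zero, for $a = 0$).

```python
import numpy as np

# Unknown ordering: u3, u6, u7, v3, v6, v7, P1, P2, P3, P4.
K3 = np.array([[52, -37, -37], [-37, 104, 0], [-37, 0, 104]]) / 45.0
K = np.block([[K3, np.zeros((3, 3))], [np.zeros((3, 3)), K3]])
C = np.array([[2, -2, -7, 1], [-6, -6, -6, -6], [-4, 4, 8, -8],
              [2, 1, -7, -2], [-4, -8, 8, 4], [-6, -6, -6, -6]]) / 18.0
G = np.array([[2, -2, -1, 1], [-6, 6, 6, -6], [-4, 4, 8, -8],
              [2, 1, -1, -2], [-4, -8, 8, 4], [-6, -6, 6, 6]]) / 18.0

A_G = np.block([[K, G], [C.T, np.zeros((4, 4))]])   # unsymmetric system matrix

# Left null vector (w, r) quoted in the text, normalized so that w^y_7 = 1.
z = np.array([2, 1, -0.5, 2, -0.5, 1, 1.8, -2.8, 5.8, -2.8])
p_h = np.concatenate([np.zeros(6), np.ones(4)])     # hydrostatic pressure mode

def rhs_G(a):
    """Right-hand side (f^G, g) built from the exact solution."""
    fG = np.array([-7/45, 388/45, 76/45,
                   -a/3 - 8/3, 4*a/3 - 8/3, 4*a/3 - 8/3])
    g = np.array([4, 20, 4, -4]) / 18.0
    return np.concatenate([fG, g])

b_exact = rhs_G(0.0)
b_perturbed = b_exact.copy()
b_perturbed[4] = 0.0            # replace f^y_6 = -8/3 by zero

left_null_residual = z @ A_G    # should vanish: z is a null vector of A_G^T
right_null_residual = A_G @ p_h # should vanish: P_H is a null vector of A_G
solv_exact = z @ b_exact        # (3.13-102) for consistent data
solv_perturbed = z @ b_perturbed

print(solv_exact, solv_perturbed)
```

The perturbed value equals $w^y_6$ times the change in $f^y_6$, i.e. $-0.5 \times 8/3$, so the system is then inconsistent.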
Before pushing on, let us be sure to appreciate the current situation in which we have
constructed a consistent singular system. Since in this particular case the null vector of
$A^T$ is fully populated (no zeros in either $w$ or $r$), it follows that each of the 10 matrix
rows is linearly dependent on the other nine [simply apply (3.13-102) to the matrix rows,
with $f^x_3$ replaced by Row 1 ... and $g_4$ replaced by Row 10]; since also the system is
consistent, with the RHS satisfying (3.13-102), it follows that any of the 10 equations
could be dropped after pegging (specifying) any one of the four pressures and transposing
the corresponding column data to the RHS, thus reducing the problem to a nine-equation
one with a 9 × 9 matrix that is not singular. Its solution would agree with that from the
C-matrix up to a constant multiple of $P_H$, the hydrostatic pressure mode. For example,
let us omit the third equation (momentum equation for $u_7$) and specify the first pressure
($P_1$) to be $P_1 = 10$; the remaining nine equations will give the proper solution once the
RHS of each is modified by subtracting $10\,G_{i1}$, $i$ = 1, 2, 4, 5, and 6, from the RHS vector,
where $G_{i1}$ is the first column in the original (6 × 4) G-matrix; it is the seventh column in
the A-matrix. Since the exact solution has $P_1 = 0$, the 9 × 9 solution will be 10 units too
large in pressure and exact in the six velocities. [In the general case of an $N \times N$ system,
it may not be the case that the null vector is fully populated with non-zeros; any equation
corresponding to a zero entry in the null vector of $A^T$ is not linearly dependent on the other
$N - 1$ and must be retained. It is probably safe to say, though, that any of the continuity
equations in the consistent (!) singular system could be safely jettisoned ($r_i \neq 0$ for all $i$ is
presumed) and its pressure specified, which is the most 'familiar' case/situation. But,
as already mentioned, it is probably also safe to say that the general case will not be
consistent, so that the point is somewhat moot.]
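The worked example in the preceding paragraph (drop the $u_7$ momentum equation, peg $P_1 = 10$) can be sketched as follows. The helper `peg_and_solve` and the sample value of $a$ are ours; the matrices are those transcribed above for $l = h = 2$, and the text's statement that the reduced 9 × 9 matrix is non-singular is taken on trust.

```python
import numpy as np

# Unknown ordering: u3, u6, u7, v3, v6, v7, P1, P2, P3, P4.
K3 = np.array([[52, -37, -37], [-37, 104, 0], [-37, 0, 104]]) / 45.0
K = np.block([[K3, np.zeros((3, 3))], [np.zeros((3, 3)), K3]])
C = np.array([[2, -2, -7, 1], [-6, -6, -6, -6], [-4, 4, 8, -8],
              [2, 1, -7, -2], [-4, -8, 8, 4], [-6, -6, -6, -6]]) / 18.0
G = np.array([[2, -2, -1, 1], [-6, 6, 6, -6], [-4, 4, 8, -8],
              [2, 1, -1, -2], [-4, -8, 8, 4], [-6, -6, 6, 6]]) / 18.0
A_G = np.block([[K, G], [C.T, np.zeros((4, 4))]])

def peg_and_solve(A, b, drop_row, peg_col, peg_value):
    """Drop one equation, fix one unknown, and solve the reduced system."""
    rows = [i for i in range(A.shape[0]) if i != drop_row]
    cols = [j for j in range(A.shape[1]) if j != peg_col]
    reduced = A[np.ix_(rows, cols)]
    rhs = b[rows] - peg_value * A[rows, peg_col]   # move the pegged column to the RHS
    y = np.linalg.solve(reduced, rhs)
    return np.insert(y, peg_col, peg_value)        # put the pegged value back in place

a = 0.3                                            # arbitrary sample value
b = np.concatenate([
    [-7/45, 388/45, 76/45, -a/3 - 8/3, 4*a/3 - 8/3, 4*a/3 - 8/3],  # f^G (exact data)
    np.array([4, 20, 4, -4]) / 18.0,                               # g
])
exact = np.array([4, 4, 1, -8, -4, -4, 0, 4, 4 + 2*a, 2*a], dtype=float)

# Omit the third equation (momentum for u7) and peg P1 (seventh unknown) at 10.
x = peg_and_solve(A_G, b, drop_row=2, peg_col=6, peg_value=10.0)
print(x[:6] - exact[:6])   # velocity errors
print(x[6:] - exact[6:])   # pressure offsets
```

This works only because the data are consistent; the same pegging applied to inconsistent data is the subject of what follows.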
So now let us see what happens if we employ the 'trick' used by many FDM
practitioners; i.e., they often argue as follows: 'Since the pressure is never determinable except
up to an additive constant, let us remove this indeterminacy by setting the value of P
at a single point.' What they seemingly do not realize is that they do not—unless they
are solving the (too) rare case with a consistent singular system—really have a redundant
'continuity' equation (when OBC's are present) even though they do have a hydrostatic
pressure mode—because the linear system is not consistent. (While the matrix rows still
do display some linear dependence, the corresponding equations do not because the RHS
is not consistent.) Thus, once a pressure is specified, the corresponding mass conservation
equation is necessarily lost; removing the singularity by setting the pressure at one node
sacrifices the continuity equation at that node—a procedure that may or may not cause
serious 'damage.' It must be the case that serious damage is not the common
consequence, judging by the plethora of 'successful' simulations done this way. Perhaps it is
the case that the problems are otherwise well designed, so that the local loss of mass
conservation is hardly noticeable. Perhaps it is also the case that 'pressure-pegging' is
done selectively, cleverly; e.g., if it is done in a region of low velocity, the local loss of
conservation may well go unnoticed. [Note that the location of the pressure specification
point (node) is completely arbitrary; it could, but need not be, chosen to be on the outlet
boundary. It could be far away from any boundary—preferably, as stated above, in a
region of very low velocity.] But it will always be the case that changing the point of
pressure specification will change the entire solution.
Toward this end, then, we show below the results of selective pressure
specification for our simple test problem, for $a = 0$. To both generate an inconsistent G-matrix
problem, and to permit some sort of comparison with the C-matrix result when the proper
value of $f$ is not known, we replaced the specified shear stress ($-2y$) at node six (the
momentum equation for $v_6$) by zero in the $f^y_6$-location, for both the G and C
problems. For the G-problem, we must also specify one pressure and omit one of the 10
equations to obtain a non-singular 9 × 9 system, as described above. For simplicity, and
to correspond more closely to what is done 'in practice,' we simply omitted the mass
conservation equation corresponding to the specified pressure. (For the C-problem, we
retain the always-consistent 10 × 10 system; only the RHS vector is changed: $f^y_6$ goes
from −8/3 to 0.) To seek the 'best possible' solution, we first peg the chosen pressure at
the exact value; but then, to better correspond to reality, we also repeated the two cases
wherein the exact pressure is non-zero ($P_2 = P_3 = 4$) with that pressure pegged at zero
instead, the usual choice when no knowledge is available. The results are shown in
Table 3.13-6, with the last row showing the residual from the mass conservation equations.
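The procedure just described can be sketched as below. This is our own illustration under the stated assumptions ($a = 0$, $f^y_6$ set to zero, matrices as transcribed for $l = h = 2$); the function name `g_problem` is hypothetical. The printed values depend on normalizations that are not restated here, so no expected numbers are quoted; what the sketch does demonstrate is that the pegged G-problem necessarily violates the omitted continuity equation, by an amount fixed by the solvability defect.

```python
import numpy as np

# Unknown ordering: u3, u6, u7, v3, v6, v7, P1, P2, P3, P4.
K3 = np.array([[52, -37, -37], [-37, 104, 0], [-37, 0, 104]]) / 45.0
K = np.block([[K3, np.zeros((3, 3))], [np.zeros((3, 3)), K3]])
C = np.array([[2, -2, -7, 1], [-6, -6, -6, -6], [-4, 4, 8, -8],
              [2, 1, -7, -2], [-4, -8, 8, 4], [-6, -6, -6, -6]]) / 18.0
G = np.array([[2, -2, -1, 1], [-6, 6, 6, -6], [-4, 4, 8, -8],
              [2, 1, -1, -2], [-4, -8, 8, 4], [-6, -6, 6, 6]]) / 18.0
g = np.array([4, 20, 4, -4]) / 18.0
zero4 = np.zeros((4, 4))
A_C = np.block([[K, C], [C.T, zero4]])
A_G = np.block([[K, G], [C.T, zero4]])

# a = 0 data with the shear-stress entry f^y_6 replaced by zero.
f_C = np.array([-67/45, 148/45, 76/45, -4.0, 0.0, -16/3])
f_G = np.array([-7/45, 388/45, 76/45, -8/3, 0.0, -8/3])
b_C = np.concatenate([f_C, g])
b_G = np.concatenate([f_G, g])

# C-problem: the full 10 x 10 system remains solvable.
x_C = np.linalg.solve(A_C, b_C)
res_C = C.T @ x_C[:6] - g

def g_problem(peg_index, peg_value):
    """Peg pressure P_(peg_index+1) and omit its own continuity equation."""
    k = 6 + peg_index
    keep = [i for i in range(10) if i != k]
    reduced = A_G[np.ix_(keep, keep)]
    rhs = b_G[keep] - peg_value * A_G[keep, k]
    x = np.insert(np.linalg.solve(reduced, rhs), k, peg_value)
    return x, C.T @ x[:6] - g          # solution and continuity residuals

r_null = np.array([1.8, -2.8, 5.8, -2.8])   # pressure part of the left null vector
x_p1, res_p1 = g_problem(0, 0.0)            # P1 pegged at its exact value, 0
x_p3, res_p3 = g_problem(2, 4.0)            # P3 pegged at its exact value, 4
x_p3z, res_p3z = g_problem(2, 0.0)          # P3 pegged at zero instead

print(x_C, res_C)
print(x_p1, res_p1)
print(x_p3, res_p3)
```

Since $G$ annihilates constants, changing the pegged value only shifts the pressures; changing the pegged node changes the whole solution.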
What can be said about these results? Well,
1. The pressures are all bad (the notorious 'over-reactors').
2. The $P_2 = 4$ case is exceptionally good.
3. The $P_1 = 0$ case is exceptionally bad.
4. It may be worthwhile to replace C by G, peg a pressure, and live with the
result—especially for the case of stratified (Boussinesq) flows that we shall discuss in
Vol. II, wherein the homogeneous OBC of the C-matrix often does a lousy job.
Table 3.13-6 Test problem results.

                                   G-matrix solutions for various specified pressures
Variable    Exact     C-matrix    P1 = 0    P2 = 4    P3 = 4    P4 = 0    P2 = 0    P3 = 0
            solution  solution
u3            4         3.44       9.76      3.40      4.55      6.91      5.79      4.55
u6            4         3.80       2.96      4.08      4.46      4.77      3.78      4.46
u7            1         0.86       4.66      0.61      1.48      3.16      2.16      1.48
v3           -8        -7.44      -2.65     -8.83     -7.86     -6.62     -5.51     -6.62
v6           -4        -3.86      -0.44     -4.68     -3.62     -2.94     -1.95     -3.62
v7           -4        -3.80      -6.30     -3.84     -4.80     -5.48     -4.49     -4.80
P1            0        13.00       0         5.81     -2.05    -21.3     -21.4      -6.05
P2            4        -7.21      63.6       4         1.33     -1.92      0        -2.67
P3            4         9.00      29.3       7.22      4        -9.51     -9.66      0
P4            0        -6.80      63.4      -0.59      1.18      0        -2.22     -2.82
C^T u - g     0         0          2.22      0.48      0.69     -1.43     -1.43      0.69
Perhaps the main thing to conclude from this prolonged example is that the semi-
discrete equations, via linear algebra, lead to the same conclusion stated several times
already for the continuous case: integration by parts of the pressure gradient term is
required in order to have a well-posed problem for any but Dirichlet BC's.
*c. LBB-stability/div-stability
In Chapter 2 we mentioned the fundamental theorem that provided sufficient conditions
for the existence of a unique solution of the steady advection-diffusion equation in
the continuum; namely, the Lax-Milgram theorem. We saw that when the GFEM was
applied on the appropriate finite-dimensional subspaces (i.e., via conforming elements),
the resulting matrices were guaranteed to be invertible if the conditions of the theorems
were satisfied. [We also saw cases in which the theory was silent—i.e., the conditions
of the theorems were not satisfied—some of which were nevertheless shown to deliver
what appeared to be very reasonable (sometimes even surprisingly good) solutions on the
finite-dimensional subspaces even though it is (ostensibly) not known that the problem
remains well-posed as h → 0.]
If this scalar situation may be termed 'bad' or 'scary,' then we are about to confront
a situation in which we go from bad to much worse. And this even for the simplest of
all cases for the vector-valued systems of interest in this book—the steady (and even
self-adjoint/symmetric) Stokes equations. It begins by recognizing that we must leave the
simpler situation wherein a single type of basis function was adequate to represent the
single unknown variable and enter the larger and more difficult realm of mixed methods
and 'mixed'-finite elements, wherein different unknown variables are (or, at least, may
be) represented by different types of basis functions. In our case, of course, we encounter
the option of using different basis functions for pressure than for (each component of)
velocity. 'Loosely speaking, we want to choose our velocity space and our pressure space
so that the resulting method is both accurate and stable. These demands are in some sense
conflicting and one has to find a reasonable compromise'—Johnson (1987). And from
Arnold et al. (1984), we add, '.. .This space is chosen so that the approximate solution
is easily computable...'; i.e., implementational convenience/efficiency should also play
a major role. As an indication of the magnitude of the problems ahead, both in theory
and in practice (computations), we note that an entire book on the subject has recently
appeared: Mixed and Hybrid Finite Element Methods, by Brezzi and Fortin (1991), and
we are very thankful because it both reduces the magnitude of our task and permits 'an
appeal to authority.' (As we have no need for 'hybrid' elements in our book, we shall
simply leave the curious reader dangling with respect to the term 'hybrid.') Depending
on the reader's mathematical background, it may be the case that there is some 'tough
sledding' ahead—although we shall endeavor to display and discuss what we consider to
be close to the minimum amount of advanced mathematics (functional analysis, mainly)
needed to adequately appreciate the magnitude and scope of the issues involved. In this
regard, we make two remarks: (i) the Brezzi/Fortin book is written at a much higher
mathematical level than ours—as is another important precursor to our work: Finite
Element Methods for Navier-Stokes Equations, by Girault and Raviart (1986); and (ii) it
is the very existence of this higher level of mathematics that turns off (scares) many
potential 'CFD engineers' (and physical scientists) from the finite element method; they
are much more comfortable with Taylor series and related divided differences—and are
usually comfortable in their assumed world of finding classical solutions. [They—or at
least a large subset of them—of course live under a false sense of security most of
the time, since most of their results represent/approximate 'only' weak solutions, not
classical ones—since the latter are often non-existent.] As stated by Mason in Methods of
Functional Analysis for Application in Solid Mechanics (1985), 'The subject of Functional
Analysis, with its abstract character and sweeping generalizations, is not easy for untrained
minds to master, since it departs considerably from the usual offhand engineering approach
to mathematics, but once one succeeds in learning some of it, the dividends are very
rewarding.' This statement could serve equally well as a warning and an inducement! It
is also true, unfortunately for those with narrow/applied interests, that functional analysis
covers much more than just weak solutions to 'our' PDE's. But there are some texts that
try to focus on the applied side; in addition to Mason (1985) referred to above, there is
Rektorys (1980) and Reddy (1986), to name a few. From Rektorys, we quote, and concur
with(!), 'Functional analysis is a difficult subject for a non-mathematician. It is rich in
abstract concepts which cannot be absorbed with haste. That is why I have advanced very
cautiously, in an inductive rather than a deductive manner, from the simpler to the more
complicated.' And from Reddy, 'An increased interest is seen in recent years in the study
of functional analysis among engineers and physicists who are theoretically inclined. This
is because it is now widely accepted that functional analysis is a powerful tool in the
solution of mathematical problems arising from physical situations.' Finally, from Ortega
(1990), 'The most important tool in many areas of numerical analysis is linear algebra and
matrix theory. .. .In more advanced work, infinite-dimensional linear algebra—functional
analysis—plays an analogous role.' Finally, a statement from O. Pironneau (personal
communication) is relevant here: 'Functional analysis is really only required in finite
elements if you want to do error analysis' —a subject we mostly wish to avoid in this
text; but not totally.
For the Stokes equations, unfortunately, Lax-Milgram cannot help us much. We are
forced to look to more general/powerful theory—especially when we seek the approximate
solution via the GFEM. In the words of Gunzburger (1989), another important reference
for our subject, 'In the positive-definite case... the mere inclusion of the finite element
spaces within the underlying function spaces is essentially sufficient to assure that the
approximate solution is well-defined and is, as far as the rate of convergence is concerned,
as accurate as possible for the type of finite element functions being used. Here, for the
Navier-Stokes equations, the inclusions... are not by themselves sufficient to produce
stable, meaningful approximations. We find ourselves in the realm of what are known
as mixed finite element methods.' Another good reference for this (and other) material,
is Arnold (1990), who states, 'A key point, which is characteristic of mixed variational
principles, is that the pair.. . is not an extremum point.... It is a saddle point.' That
is to say the steady Stokes equations, in u and P, correspond not to the minimizing
function of a quadratic functional, but to a saddle-point in that the solution corresponds
simultaneously to a minimum with respect to the velocity and a maximum with respect to
the pressure; see (3.7-7) and Section 3.15. If we were to use (discretely) divergence-free
basis functions, a possibility that we discuss in Section 3.13.7, much of the difficulty
associated with mixed methods would vanish; but the fact is that the great majority of
approximation methods use the mixed approximation, necessitating (at least) the solution
of saddle-point problems. [It is an interesting 'aside' in that if one were only and always
interested in solving time-dependent non-linear problems (i.e., the NS equations), the
issue of minimizing a functional over a divergence-free subspace in which the associated
Lagrange multiplier 'turns out' to be the pressure would ostensibly never arise; one would
then a priori give a more 'physical' interpretation to the pressure (it is no longer a mere
mathematical adjunct), although it would still need the interpretation that part of its role
is to keep u divergence-free. It is still true, however, that weak solutions would/could
benefit from the use of divergence-free basis functions, and the pressure would 'magically'
disappear from the final equations.]
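The saddle-point character described above has a simple discrete signature: the mixed Stokes matrix is symmetric but indefinite, with as many negative eigenvalues as there are pressure unknowns. As a small illustration (ours, not the authors'), the sketch below reuses the one-element Galerkin matrix of the example earlier in this section, transcribed for $l = h = 2$, and counts eigenvalue signs.

```python
import numpy as np

# One-element Galerkin Stokes matrix (unknowns u3, u6, u7, v3, v6, v7, P1..P4).
K3 = np.array([[52, -37, -37], [-37, 104, 0], [-37, 0, 104]]) / 45.0
K = np.block([[K3, np.zeros((3, 3))], [np.zeros((3, 3)), K3]])
C = np.array([[2, -2, -7, 1], [-6, -6, -6, -6], [-4, 4, 8, -8],
              [2, 1, -7, -2], [-4, -8, 8, 4], [-6, -6, -6, -6]]) / 18.0
A = np.block([[K, C], [C.T, np.zeros((4, 4))]])

eigenvalues = np.linalg.eigvalsh(A)         # A is symmetric, so eigvalsh applies
n_positive = int(np.sum(eigenvalues > 0))   # 'minimum' directions (velocity)
n_negative = int(np.sum(eigenvalues < 0))   # 'maximum' directions (pressure)
k_min = float(np.linalg.eigvalsh(K).min())  # the velocity block alone is positive definite

print(n_positive, n_negative, k_min)
```

A positive-definite problem of the Lax-Milgram type would show no negative eigenvalues at all; here the velocity block is positive definite, but the coupled matrix is not.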
In most FEM codes, both u and P are approximated—because the velocity basis
functions are not divergence-free, and this leads one to the subject of 'approximate (FEM)
solution of saddle-point problems'—their existence, uniqueness, and accuracy (error
estimates). The general case is of no interest here since our PDE's are specific; for more
general cases than Stokes flow, see, for example, Oden and Carey (1983); Arnold (1990);
the FEM Handbook (Kardestuncer and Norrie, 1987); and Brezzi and Fortin (1991). The
general case (and associated 'abstract theory') of the type of saddle-point problems of
interest herein was first directly addressed by Brezzi (1974), even though this turns out
to be a 'special case' of the earlier Babuska theory (1971; see also the article (book!) by
Babuska and Aziz in Aziz, 1972; and Babuska, 1973). Brezzi developed independently
his own version of the theory—and it is his version that most closely describes the weak
formulation of the Stokes equations in function space. In any event, the two theories
agree and, importantly, they paved the way for much subsequent finite element analysis.
The theory is considerably more powerful than that of Lax-Milgram in that it provides
necessary and sufficient conditions for a well-posed continuous problem; it also supplies
sufficient conditions for the approximate (discrete) problem that are also necessary in
the following sense: only if they are satisfied is the approximate solution guaranteed
to be stable (as $h \to 0$) and of at least quasi-optimal (best power of $h$) accuracy. The
'joint' theory was apparently first noticed by Bercovier (e.g., 1977; and in Bercovier and
Pironneau, 1978) who combined the credit and led the way to the current nomenclature
of 'BB theory/BB condition' (which condition we present below). But two names were
not enough, apparently. Oden et al. (1982) noted a connection between this theory and
earlier theory on the NS equations by Ladyzhenskaya (1969), and coined the triple crown
appellation, LBB. Finally, in an attempt to be more objective/descriptive, Gunzburger
(1987) suggested the term, first used in Boland and Nicolaides (1983), div-stability—for
reasons that we hope to clarify below. Another descriptive name, used (e.g.) by Carey and
Oden (1986) is a 'consistency condition' (between the two function spaces). It 'tests the
consistency of the approximation of derivatives...'—Thomasset (1981, p. 32). Related
to this is yet another appellation: 'compatibility condition.' Finally, the name most loved
(it seems) by mathematicians (but feared by many engineers) is the term 'inf-sup
condition.' All of these names are referring to the same 'problem.' [For a generalization of
this theory to unsymmetric saddle-point problems, and a simplified/alternate treatment of
Babuska's theory—without functional analysis—see Nicolaides (1982), and see Bernardi
et al. (1990) for an application of this theory to spectral methods for NSE.]
Let us get on with it; consider the inhomogeneous Stokes equations (with body force)
and inhomogeneous BC's:
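$$\nabla P = \nu\nabla^2\mathbf{u} + \mathbf{g}, \quad \nabla\cdot\mathbf{u} = 0 \ \text{in } \Omega, \quad \mathbf{u} = \mathbf{w} \ \text{on } \Gamma. \tag{3.13-103}$$
The weak form is [see, for example (3.12-34)]: find $\mathbf{u} \in H^1$ and $P \in L_0^2$ such that
$$\nu\int(\nabla\mathbf{u})^T : \nabla\mathbf{v} - \int P\,\nabla\cdot\mathbf{v} = \int\mathbf{g}\cdot\mathbf{v} \quad \forall\,\mathbf{v} \in H_0^1, \tag{3.13-104}$$
and
$$-\int q\,\nabla\cdot\mathbf{u} = 0 \quad \forall\, q \in L_0^2, \tag{3.13-105}$$
where, to conform to (more or less) standard terminology, we restate it (abstractly) as
$$a(\mathbf{u},\mathbf{v}) + b(\mathbf{v},P) = f(\mathbf{v}) \tag{3.13-106}$$
and
$$b(\mathbf{u},q) = g(q), \tag{3.13-107}$$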
where the definition of the two bilinear forms $a(\,,)$ and $b(\,,)$, as well as that of the
linear form, $f(\mathbf{v})$, is obvious, and $L_0^2$ is that subspace of $L^2$ that has the hydrostatic
mode removed (e.g., via $\int P = 0$). The linear form $g(q)$ is not obvious, nor need it
be explicated carefully; suffice it to say that it is a boundary term resulting from the
inhomogeneous BC. The B, BB, LBB, div-stability theory provides the necessary and
sufficient conditions for the existence and uniqueness (up to an additive constant for $P$)
of a solution to the weak form of the Stokes (and other) equations (the general theory
is more general; we specialize here to our needs). Since both theories are very
abstract (read 'difficult'), we present only the seemingly simpler of the two, that due to
Brezzi, and this in only a very brief summary form, following Brezzi and Fortin (1991)
to which the (mathematically literate) reader is referred for details.
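Given (assuming, which is true in our case) that both $a(\,,)$ and $b(\,,)$ are bounded, i.e.,
for all permissible $\mathbf{u}$, $\mathbf{v}$, and $q$,
$$|a(\mathbf{u},\mathbf{v})| \le \|a\| \cdot \|\mathbf{u}\|_1 \cdot \|\mathbf{v}\|_1 \quad \text{and} \quad |b(\mathbf{u},q)| \le \|b\| \cdot \|\mathbf{u}\|_1 \cdot \|q\|_0,$$
where
$$\|a\| = \sup_{0 \ne \mathbf{u},\mathbf{v} \in H_0^1} \frac{|a(\mathbf{u},\mathbf{v})|}{\|\mathbf{u}\|_1 \cdot \|\mathbf{v}\|_1} \quad \text{and} \quad \|b\| = \sup_{0 \ne \mathbf{v} \in H_0^1,\ 0 \ne q \in L_0^2} \frac{|b(\mathbf{v},q)|}{\|\mathbf{v}\|_1 \cdot \|q\|_0},$$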
the necessary and sufficient conditions for the existence of a unique (up to an additive
constant for P) solution of (3.13-106) and (3.13-107) are two:
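1. The ellipticity condition on $a(\,,)$,
$$a(\mathbf{v},\mathbf{v}) \ge \alpha\|\mathbf{v}\|_1^2 \quad \forall\,\mathbf{v} \in H_0^1, \tag{3.13-108}$$
and $\alpha > 0$ is a constant; i.e., $a(\,,)$ is coercive as well as bounded.
2.
$$\sup_{\mathbf{v} \in H_0^1} \frac{\int q\,\nabla\cdot\mathbf{v}}{\|\mathbf{v}\|_1} \ge \beta\|q\|_0 \quad \forall\, q \in L_0^2, \tag{3.13-109}$$
where $\beta > 0$, and $L_0^2$ denotes $L^2$ modulo constants because $q$ = constant is disallowed
(it gives a LHS of zero; $\int\nabla\cdot\mathbf{v} = \int_\Gamma \mathbf{n}\cdot\mathbf{v} = 0$ because $\mathbf{v} = 0$ on $\Gamma$), and the supremum
(least upper bound) also precludes $\mathbf{v}$ from residing in $J_0$; i.e., $\nabla\cdot\mathbf{v} \ne 0$. In fact, $\mathbf{v}$ resides
in the orthogonal complement to the divergence-free subspace of $H_0^1$ (see, for example,
Girault and Raviart, 1986).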
Before pressing on, it may be useful to attempt to relate (3.13-109)—which soon will
be shown to be equivalent to the famous (infamous?) inf-sup condition, a main product of
the 'modern' (BB) theory—to the 'older' theory developed by Ladyzhenskaya and others,
so that all will feel more comfortable with the term 'LBB condition.' (We are indebted to
J.T. Oden for helping us track some of the history described below—especially since we
were initially tempted to get the L out of LBB. E. Olsen also helped us in this regard.) This
is 'required' because the earlier theorists seemingly did not need the inf-sup condition to
obtain the same well-posedness results. [Remember that the Stokes equations are but one
special (symmetric) case of the modern abstract theory. It is also noteworthy that many
modern analysts do not seem to specifically state the need to satisfy the inf-sup condition
in order to have a well-posed (continuous) problem; some examples: Temam (1984),
Constantin and Foias (1989), and Kreiss and Lorenz (1989).] Specifically, Ladyzhenskaya
(et al.) were concerned primarily with the decomposition of the space of L2-vector-valued
functions into the sum of divergence-free vectors and gradients of scalars—each from a
different subspace of $L^2$. In her classic textbook (Ladyzhenskaya, 1969), she proves the
existence of a divergence-free vector field ($\in H^1$) that satisfies given Dirichlet BC's,
which vector field can be used to show that for every $q \in L_0^2$, there exists a vector $\mathbf{v} \in H_0^1$
satisfying
$$\nabla\cdot\mathbf{v} = q \quad \text{and} \quad \|\mathbf{v}\|_1 \le \gamma\|q\|_0, \tag{3.13-110}$$
where $\gamma$ is a positive constant. The fact that (3.13-110) implies (3.13-109), which gets
the L into LBB, goes as follows:
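1. Given $q \in L_0^2$, define
$$\lambda(q) = \sup_{\mathbf{v}} \frac{\int q\,\nabla\cdot\mathbf{v}}{\|\mathbf{v}\|_1},$$
so that $\lambda(q) \ge \int q\,\nabla\cdot\mathbf{u}/\|\mathbf{u}\|_1$ for general (arbitrary) $\mathbf{u}$.
2. Pick that $\mathbf{u}$ satisfying $\nabla\cdot\mathbf{u} = q$ to give
$$\lambda(q) \ge \frac{\int q^2}{\|\mathbf{u}\|_1} = \frac{\|q\|_0^2}{\|\mathbf{u}\|_1}.$$
3. Rearrange and use (3.13-110) as follows:
$$\lambda(q) = \sup_{\mathbf{v}} \frac{\int q\,\nabla\cdot\mathbf{v}}{\|\mathbf{v}\|_1} \ge \frac{\|q\|_0^2}{\|\mathbf{u}\|_1} \ge \frac{\|q\|_0}{\gamma},$$
which is just (3.13-109) with $\beta = 1/\gamma$. Thus, the 'L' (et al.) approach is equivalent, for
the Stokes equations, to the BB approach, and shows that (3.13-109) is indeed satisfied.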
Remarks:
(1) The second part of (3.13-110) is not to be found in Ladyzhenskaya (1969); it is
probably in one of the references cited in her appendix called 'Comments.'
(2) For additional, detailed discussion of the problem V • v = q, see Galdi (1994, Vol. 1).
To finish, we restate (3.13-109) in the equivalent inf-sup form:
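$$\inf_{q \in L_0^2}\ \sup_{\mathbf{v} \in H_0^1} \frac{\int q\,\nabla\cdot\mathbf{v}}{\|\mathbf{v}\|_1 \cdot \|q\|_0} \ge \beta. \tag{3.13-111}$$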
To recapitulate: the satisfaction of (3.13-108) and (3.13-109) or (3.13-111), the last of
which is the LBB condition, assures the existence of a unique solution to the (weak form
of the) Stokes equations.
To conclude our summary of the continuous case, we present (for $\mathbf{w} = 0$) the following
bounds on the Stokes solution in terms of the constants $\alpha$ and $\beta$ above, from Brezzi and
Fortin (1991):
$$\|\mathbf{u}\|_1 \le \frac{1}{\alpha}\|f\| + \frac{1}{\beta}\left(1 + \frac{1}{\alpha}\|a\|\right)\|g\|, \tag{3.13-112}$$
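and, once the average pressure has been subtracted from the pressure solution to give $P \in L_0^2$,
$$\|P\|_0 \le \frac{1}{\beta}\left(1 + \frac{1}{\alpha}\|a\|\right)\|f\| + \frac{\|a\|}{\beta^2}\left(1 + \frac{1}{\alpha}\|a\|\right)\|g\|, \tag{3.13-113}$$
where
$$\|f\| = \sup_{\mathbf{v} \in H_0^1} \frac{\int \mathbf{v}\cdot\mathbf{g}}{\|\mathbf{v}\|_1} \quad \text{and} \quad \|g\| = \sup_{q \in L_0^2} \frac{g(q)}{\|q\|_0},$$
where $\mathbf{g}$ is the 'body force' in (3.13-103), and $g$ is the linear form/functional in (3.13-107).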
This was the easy part. Now we move on to the hard part: discrete (via FEM)
approximations to the Stokes equations, which leads to the 'discrete LBB condition.'
'Fortunately,' however, there are only two aspects of this condition that are difficult:
(i) understanding it, and (ii) applying (verifying) it—especially on general meshes.
Unfortunately, however, both aspects are very difficult; experts are few. The recent Ph.D. Thesis
by Qin (1994) summarizes some of the available techniques.
We begin by writing the approximation problem in the same abstract formulation as
the continuum problem, (3.13-106) and (3.13-107),
$$a(u^h, v^h) + b(v^h, P^h) = f(v^h) \tag{3.13-114}$$
and
$$b(u^h, q^h) = g(q^h), \tag{3.13-115}$$
where we a priori assume $a(\,,)$ is properly defined (coercive, bounded), which is true
in all realistic cases. Thus, the discrete LBB/inf-sup condition is the following analog of
the continuous version: for every $q^h \in Q_h$,
$$\sup_{v^h \in V_h} \frac{|b(v^h, q^h)|}{\|v^h\|_1} = \sup_{v^h \in V_h} \frac{\left|\int q^h\,\nabla\cdot v^h\right|}{\|v^h\|_1} \ge k_h\|q^h\|_0, \tag{3.13-116}$$
where $k_h \ge k_0 > 0$ ($k_0$ is the $h \to 0$ limit of $k_h$, the value of the constant on a mesh of
'size' $h$), $V_h$ is the discrete velocity space and $Q_h$ is the discrete pressure space modulo
any pressure modes (the null space of the discrete gradient is 'out of bounds').
The 'matrix' realization of (3.13-114) and (3.13-115) is
$$Ku + CP = f, \tag{3.13-117}$$
$$C^T u = g, \tag{3.13-118}$$
and the corresponding realization of (3.13-116) is
$$\max_{v \in V_h} \frac{v^T C q}{\sqrt{v^T K v}} \ge k_h\sqrt{q^T Q q} \quad \forall\, q \in Q_h, \tag{3.13-119}$$
where $Q = \int \psi_i\psi_j$ is the pressure 'mass' matrix; or, equivalently,
$$\min_{q \in Q_h}\ \max_{v \in V_h} \frac{|v^T C q|}{\sqrt{v^T K v} \cdot \sqrt{q^T Q q}} \ge k_h, \tag{3.13-120}$$
where $k_h > 0$, which is basically a stability condition on the $C$-matrix.
If $k_0 > 0$ is fixed independent of $h$ (i.e., if $k_h$ becomes independent of $h$ for $h \to 0$),
then the LBB (inf-sup) condition is satisfied, the unique discrete solution, $u^h$ and $P^h$ (up
to the elements in the null space of $C$), exists, and $u^h$ and $P^h$ are of optimal (or at least
quasi-optimal) accuracy: the convergence rate as $h \to 0$ is the best possible from the
approximating space. In fact, Brezzi and Fortin (1991, p. 56) give the following error
estimates for LBB-stable elements:
$$\|u - u^h\|_1 \le c_1 \inf_{v^h \in V_h}\|u - v^h\|_1 + c_2 \inf_{q^h \in Q_h}\|P - q^h\|_0, \tag{3.13-121}$$
where
$$c_1 \le \left(1 + \frac{1}{\alpha}\|a\|\right) \cdot \left(1 + \frac{1}{k_h}\|b\|\right) \quad \text{and} \quad c_2 \le \frac{1}{\alpha}\|b\|, \tag{3.13-122}$$
and, modulo pressure modes,
$$\|P - P^h\|_0 \le \left(1 + \frac{1}{k_h}\|b\|\right) \inf_{q^h \in Q_h}\|P - q^h\|_0 + \frac{1}{k_h}\|a\| \cdot \|u - u^h\|_1, \tag{3.13-123}$$
and we note that, with respect to $k_h$, we have
$$\|u - u^h\|_1 \le O(1) + O(1/k_h)$$
and
$$\|P - P^h\|_0 \le O(1) + O(1/k_h) + O(1/k_h^2);$$
i.e., pressure is less stable than velocity if $k_h$ is badly behaved.
Important Remarks:
(1) If the discrete LBB condition is not satisfied (e.g., if $k_h = ch^\beta$ for positive constants
$c$ and $\beta$), then the theory becomes silent; or at least nearly so. Whereas LBB
satisfaction was necessary and sufficient for the continuous case, it is subordinated to a
sufficient condition in the discrete case.
(2) A solution to (3.13-117) and (3.13-118) can only exist if the g-vector is orthogonal
to all null vectors of the C-matrix, a solvability condition discussed in the previous
section that we assume to be satisfied at this point.
Actually, there is somewhat more that can be said when a particular element fails LBB:
(i) convergence may still occur, but the rate may be suboptimal (lower power of $h$); and
(ii) there exist data for which convergence does not occur. It 'is in a sense necessary
if we want a reasonable behavior of the discrete problem'—Brezzi and Fortin (1991,
p. 59). (Our 'reaction' to this issue will be presented in both the following section and in
Section 3.13.5J, and is related to the 'reasonableness' of the data.)
A good part of the problem in the finite-dimensional case is caused by the fact that the
space of discretely divergence-free velocities is generally not a subspace of the
continuous divergence-free vector space—rather, it is an 'external' approximation (Brezzi and
Fortin, 1991). This precludes the establishment of relationships like (3.13-110), which
would, as it does in the continuous case, lead directly to a satisfaction of the LBB
condition. This shortcoming leaves us with another problem as well: as we search for useful
elements, we must always beware lest the pressure (constraint) space become (relatively)
too large, and we must seek elements whose discretely divergence-free velocity space
still permits good approximation capability...; see Malkus's Appendix to Chapter 4 in
Hughes (1987).
d. Bringing LBB to the rest of the people
o Stability analysis. More light (for some) might be shed on the above matters if we
review some linear algebra and then restate the discrete div-stability criteria. Toward
this end, first recall that if we are given a linear system, $Ax = b$, where $A$ is $N \times N$
and SPD, we can write $x = A^{-1}b$ because we know that $A^{-1}$ exists. Next, we use the
property of a compatible matrix norm to write $\|x\| \le \|A^{-1}\| \cdot \|b\|$, which bounds (perhaps
pessimistically) the solution in terms of the data, and is thus a 'stability' statement.
Choosing now the discrete $L^2$ norm (also called the Euclidean vector norm, $\|x\|_2^2 = x^T x$,
which induces the spectral matrix norm), which is appropriate for our purposes, we have
$\|A\|_2 = \max_{z \ne 0} \|Az\|_2/\|z\|_2 = \lambda_{\max}(A)$, where $\lambda_{\max}(A)$ is the maximum eigenvalue of $A$.
It then follows that $\|A^{-1}\|_2 = \lambda_{\max}(A^{-1}) = 1/\lambda_{\min}(A)$ so that $\|x\|_2 \le \|b\|_2/\lambda_{\min}(A)$. If $x$
is to remain bounded as $N$ (the vector length) $\to \infty$, it is required (assuming $\|b\|_2$ stays
bounded) that $\lambda_{\min}(A)$ stay bounded away from zero; $\lambda_{\min}^{-1}(A)$ could be called the stability
constant in that the solution remains bounded for all $N$ (stable) as long as $\lambda_{\min}(A)$ remains
bounded away from zero, and a 'large' stability constant implies a less 'stable' problem
(see also Arnold, 1990).
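The bound $\|x\|_2 \le \|b\|_2/\lambda_{\min}(A)$ can be checked numerically. The following is only a minimal sketch under stated assumptions: the model matrix tridiag(-1, 2, -1) (whose smallest eigenvalue is known in closed form), the data $b = (1, \ldots, 1)$, and the helper names are all illustrative choices that do not come from the text.

```python
import math

def solve_model_matrix(b):
    """Solve tridiag(-1, 2, -1) x = b by the Thomas algorithm (assumed model matrix)."""
    n = len(b)
    c = [0.0] * n  # modified super-diagonal entries
    d = [0.0] * n  # modified right-hand side
    c[0], d[0] = -0.5, b[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (b[i] + d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def norm2(v):
    """Euclidean vector norm."""
    return math.sqrt(sum(t * t for t in v))

def stability_bound(n):
    """Return (||x||_2, ||b||_2 / lambda_min) for the data b = (1, ..., 1)."""
    b = [1.0] * n
    x = solve_model_matrix(b)
    # Smallest eigenvalue of tridiag(-1, 2, -1) of order n, known in closed form.
    lam_min = 2.0 - 2.0 * math.cos(math.pi / (n + 1))
    return norm2(x), norm2(b) / lam_min

for n in (4, 16, 64):
    lhs, rhs = stability_bound(n)
    print(n, lhs, rhs)  # the first number never exceeds the second
```

For this unscaled matrix the smallest eigenvalue decays like $1/N^2$, so both sides grow with $N$; the bound holds but says nothing yet about whether a properly scaled problem is stable.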
It is possible, and useful, to take this concept one step further, by actually constructing
the solution to $Ax = b$ in terms of the full spectrum (eigenvalues) and corresponding
eigenvectors of $A$; i.e., instead of merely bounding the solution, we shall represent it in
'closed' form. To this end, recall that an SPD matrix has a complete set of orthogonal
eigenvectors, say $\{z_i\}$, which form a basis for $R^N$ and satisfy $Az_i = \lambda_i z_i$ with $0 < \lambda_1 \le
\lambda_2 \le \ldots \le \lambda_N < \infty$ and (after normalization, which is always possible) $z_i^T z_j = \delta_{ij}$, the
Kronecker delta. Then (recall) the solution of $Ax = b$ can be obtained via the eigenvector
expansion, $x = \sum_{j=1}^N a_j z_j$, as follows: find $a_i$ by inserting the expansion for $x$ into $Ax = b$
and forming $z_i^T Ax = z_i^T b$; i.e., $\sum_{j=1}^N a_j z_i^T A z_j = z_i^T b$, which implies that $\sum_{j=1}^N a_j \lambda_j z_i^T z_j =
z_i^T b$, or $a_i = (z_i^T b)/\lambda_i$, so that $x = \sum_{j=1}^N (z_j^T b) z_j/\lambda_j$ is the solution to $Ax = b$. (Note that
the previous bound, $\|x\|_2 \le \|b\|_2/\lambda_1$, is also derivable from this complete solution.) The
key object of this exercise is to point out, and then emphasize, the fact that it is more
than just the behavior of $\lambda_{\min}(A) = \lambda_1$ that is important in the analysis of stability: the
projection of the data ($b$) onto the eigenvectors $\{z_j\}$ is also quite relevant, at least in
'practice'; i.e., the amplitude coefficients, $\{z_j^T b\}$. In particular, if $b$ 'just happened' to be
orthogonal to the first eigenvector, then $z_1$ and $\lambda_1$ would play no role in determining the
stability of the solution to this particular problem (except in the face of round-off error,
which we ignore at this point). More relevant yet is the possibility that while $z_1^T b$ is not
zero, it may (and generally will) vary with $N$ (just as $\lambda_1$ may do). Thus, suppose $\lambda_1 =
C_1/N^\alpha$ for large $N$ where $\alpha > 0$. Then, if $\beta_1 = z_1^T b$ remains bounded as $N$ increases, or
if it decreases with $N$, via $\beta_1 = C_2/N^\beta$, but too slowly ($0 < \beta < \alpha$), the solution will
indeed become unbounded as $N \to \infty$; i.e., be 'unstable' via $(z_1^T b)/\lambda_1 = O(N^{\alpha-\beta})$. But
the other side of the coin is also relevant, and possibly important; namely, if $\beta > \alpha$, then
even though $\lambda_1 \to 0$ as $N \to \infty$, the solution may not 'blow up' because $(z_1^T b)/\lambda_1 \to 0$
as $N \to \infty$. (For simplicity, and focus, we assume that only $\lambda_1$ might be a 'bad actor';
generalization to $\lambda_2$, etc., is virtually immediate.)
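The eigenvector expansion $x = \sum_j (z_j^T b) z_j/\lambda_j$ can be exercised on a matrix whose eigenpairs are known in closed form. The sketch below is illustrative only: it assumes the model matrix tridiag(-1, 2, -1), whose orthonormal eigenvectors are discrete sine vectors, and the function names are hypothetical.

```python
import math

def eigenpair(j, n):
    """j-th eigenvalue and orthonormal eigenvector of tridiag(-1, 2, -1), order n."""
    lam = 2.0 - 2.0 * math.cos(j * math.pi / (n + 1))
    scale = math.sqrt(2.0 / (n + 1))
    z = [scale * math.sin(i * j * math.pi / (n + 1)) for i in range(1, n + 1)]
    return lam, z

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve_by_expansion(b):
    """Return x = sum_j (z_j^T b) z_j / lambda_j and the amplitudes (z_j^T b) / lambda_j."""
    n = len(b)
    x = [0.0] * n
    amplitudes = []
    for j in range(1, n + 1):
        lam, z = eigenpair(j, n)
        a = dot(z, b) / lam
        amplitudes.append(a)
        for i in range(n):
            x[i] += a * z[i]
    return x, amplitudes

def apply_model_matrix(x):
    """Multiply by tridiag(-1, 2, -1)."""
    n = len(x)
    return [2.0 * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0) for i in range(n)]

b = [1.0, 2.0, 3.0, 4.0, 5.0]
x, amps = solve_by_expansion(b)
residual = max(abs(r - s) for r, s in zip(apply_model_matrix(x), b))
print(residual)  # round-off level

# Data orthogonal to z_1: the smallest eigenvalue then plays no role.
_, z2 = eigenpair(2, 5)
_, amps2 = solve_by_expansion(z2)
print(amps2[0])  # essentially zero
```

The second experiment mirrors the remark about data orthogonal to the first eigenvector: the amplitude attached to $\lambda_1$ vanishes (up to round-off), whatever the size of $\lambda_1$.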
This is as far as we need take this discussion at this point, ending with the
realization that if matrix $A$ 'appears' to be unstable because $\lambda_1 \approx C_1/N^\alpha$, then it is only
truly unstable for the specific problem at hand if $\beta_1 \sim C_2/N^\beta$ and $\beta < \alpha$. As we shall
see later, for the Stokes equations, even though situations occur for which $\lambda_1 \sim C_1/N^\alpha$
with $\alpha = 2$ (for example), there are no 'reasonable data' (i.e., data for problems
of interest, those whose solution makes sense to consider) for which $\beta_1 \sim C_2/N^\beta$ with
$\beta < \alpha$. For all reasonable data, it will be the case that $\beta > \alpha$. [This is, of course, related to
the purely mathematical definition of stability, 'solution bounded for all data,' vis-a-vis
the more practical, but perhaps occasionally risky, definition, 'solution bounded for all
data that make sense to consider.' A mathematician may be perfectly content to consider
solving (or, more likely, contemplate solving) a problem that most 'engineers' (i.e.,
applied physical scientists) would walk away from. See too the end of Section 3.13.5J.]
We now apply this analysis—and reasoning—to the Stokes equations,
$$Ku + CP = f, \quad C^T u = g, \tag{3.13-124}$$
at least 'formally,' to obtain
$$\text{(i) Solve} \quad (C^T K^{-1} C)P = C^T K^{-1} f - g \tag{3.13-125}$$
for $P$, where $u \in R^n$ and $P \in R^m$, etc., and the total dimension of the (product) space is
$n + m = N$. [Alternatively, and equivalently, and in a form which will later prove useful,
solve $(Q^{-1/2} C^T K^{-1} C Q^{-1/2})(Q^{1/2}P) = Q^{-1/2}(C^T K^{-1} f - g)$ for $Q^{1/2}P$, where $Q$ is the
(SPD) pressure 'mass matrix,' $Q_{ij} = \int \psi_i\psi_j$.]
$$\text{(ii) Solve} \quad Ku = f - CP \quad \text{for } u. \tag{3.13-126}$$
Remark:
The solution procedure is only 'formal' because $K^{-1}$ is a dense matrix, and one would
be foolish to actually construct it; the coupled (and sparse) system of (3.13-124) is what
is actually solved on the computer. The point is that the two solution methods are totally
equivalent, algebraically—and the above formalism will help us to better understand
inf-sup/LBB.
Continuing the formal solution procedure, we first obtain the m-vector P via
$$P = (C^T K^{-1} C)^{-1}(C^T K^{-1} f - g) \tag{3.13-127}$$
and then the $n$-vector $u$ from
$$u = K^{-1}(f - CP), \tag{3.13-128}$$
and we are ready to apply classical linear algebra theory to the results. First, clearly
there are two necessary conditions for solvability: (i) $K$ is non-singular, and (ii) $C^T K^{-1} C$
is non-singular. It turns out that (i) is virtually always satisfied (for all physically
reasonable problems—those requiring Dirichlet data for velocity on at least a portion
of the boundary) and (ii) is satisfied up to the existence of 'pure pressure modes'—see
Section 3.13.2b—which we preclude from the present discussion. Thus, by fiat, we have
a solvable system.
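The formal two-step procedure, (3.13-127) followed by (3.13-128), can be traced on a tiny system. The sketch below is illustrative only and is not how a production code would proceed (the text itself warns against forming $K^{-1}$); the 3 x 3 matrix $K$, the 3 x 2 matrix $C$, the data, and all function names are assumptions made up for the example, and each application of $K^{-1}$ is carried out as a linear solve.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting; A is a list of rows."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

def matvec(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def stokes_schur_solve(K, C, f, g):
    """Formal solution: P from (C^T K^-1 C) P = C^T K^-1 f - g, then K u = f - C P."""
    Ct = transpose(C)                      # rows of Ct are the columns of C
    m = len(Ct)
    KinvC = [solve(K, col) for col in Ct]  # K^-1 applied to each column of C
    S = [[sum(ci * kj for ci, kj in zip(Ct[i], KinvC[j])) for j in range(m)]
         for i in range(m)]                # S = C^T K^-1 C
    Kinv_f = solve(K, f)
    rhs = [ctu - gi for ctu, gi in zip(matvec(Ct, Kinv_f), g)]
    P = solve(S, rhs)
    u = solve(K, [fi - cpi for fi, cpi in zip(f, matvec(C, P))])
    return u, P

# Hypothetical small data: K is SPD, C has full column rank.
K = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
C = [[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]]
f = [1.0, 0.0, 0.0]
g = [0.5, 0.0]
u, P = stokes_schur_solve(K, C, f, g)
print(u, P)
```

Substituting the result back into the coupled system (3.13-124) gives residuals at round-off level, which is the algebraic equivalence the Remark above refers to.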
Next, for stability, we would like to have both $\|K^{-1}\|$ and $\|(C^T K^{-1} C)^{-1}\|$ bounded, in
appropriate norms, and we start with the (ostensibly) easy part by showing that $\|K^{-1}\|$ is
bounded (and show the bounds) in both the spectral norm and in the $K$-norm, which latter
norm is the discrete version of the $H^1$-(semi-)norm. (It is easy because $K$ is SPD.) We
begin by showing that, in fact, $\|K^{-1}\|_K = \|K^{-1}\|_2$ where $\|x\|_K = \sqrt{x^T K x}$ is the $K$-norm
of $x$: for a general but SPD matrix $A$, we seek $\|A^{-1}\|_A$ as follows:
$$\begin{aligned}
\|A^{-1}\|_A &= \max_x \frac{\|A^{-1}x\|_A}{\|x\|_A} = \max_x \sqrt{\frac{(A^{-1}x)^T A (A^{-1}x)}{x^T A x}} = \max_x \sqrt{\frac{x^T A^{-1} x}{x^T A x}} \\
&= \max_x \sqrt{\frac{x^T A^{-1} x}{(A^{1/2}x)^T(A^{1/2}x)}} = \max_{y = A^{1/2}x} \sqrt{\frac{(A^{-1/2}y)^T A^{-1} (A^{-1/2}y)}{y^T y}} \\
&= \max_y \sqrt{\frac{y^T A^{-2} y}{y^T y}} = \sqrt{\lambda_{\max}(A^{-2})} = \sqrt{1/\lambda_{\min}(A^2)} \\
&= 1/\lambda_{\min}(A) = \lambda_{\max}(A^{-1}) = \|A^{-1}\|_2,
\end{aligned} \tag{3.13-129}$$
a result that actually applies to any power of $A$; i.e., $\|A^\alpha\|_A = \|A^\alpha\|_2$.
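As a quick sanity check of (3.13-129), consider the simplest non-trivial case. The worked example below is only an illustration under the assumption of a diagonal matrix $A = \mathrm{diag}(\lambda_1, \lambda_2)$ with $0 < \lambda_1 \le \lambda_2$; it is not part of the general argument.

```latex
% Assumed example: A = diag(lambda_1, lambda_2), 0 < lambda_1 <= lambda_2, x = (x_1, x_2)^T.
\[
\frac{\|A^{-1}x\|_A^2}{\|x\|_A^2}
  = \frac{x_1^2/\lambda_1 + x_2^2/\lambda_2}{\lambda_1 x_1^2 + \lambda_2 x_2^2}
  \le \max\!\left(\frac{1}{\lambda_1^2}, \frac{1}{\lambda_2^2}\right)
  = \frac{1}{\lambda_1^2},
\]
% with equality for x = (1, 0)^T, so that
\[
\|A^{-1}\|_A = \frac{1}{\lambda_1} = \frac{1}{\lambda_{\min}(A)} = \|A^{-1}\|_2 .
\]
```

The inequality is the elementary fact that a ratio of sums of positive terms cannot exceed the largest of the term-by-term ratios.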
Thus, $\|K^{-1}\|_K = \|K^{-1}\|_2 = 1/\lambda_{\min}(K)$, and we have (from Axelsson and Barker, 1984,
for example),
$$\lambda_{\min}(K) \sim Ch^{n_s}, \quad (C = \text{constant}), \tag{3.13-130}$$
where $n_s$ is the spatial dimension ($n_s$ = 1, 2, or 3), $h$ is the maximum element size
(length), and the meaning of (3.13-130) is that there are constants $C_1$ and $C_2$ independent
of $h$ such that $C_1 h^{n_s} \le \lambda_{\min} \le C_2 h^{n_s}$. Thus,
$$\|K^{-1}\|_K \sim C/h^{n_s}, \tag{3.13-131}$$
and we obtain our first 'surprise': $\|K^{-1}\|_2$ is not uniformly bounded! The resolution of
this dilemma is first to realize that $K$ does not in fact correspond to the (unbounded)
operator $-\nabla^2$, and thus $K^{-1}$ does not correspond to its (bounded) inverse; it is, in
fact, $M^{-1}K$ that corresponds to $-\nabla^2$ (see p. 213 of Strang and Fix, 1973), and it
follows that $\|M^{-1}K\|_2 \le \|M^{-1}\|_2 \cdot \|K\|_2 \le \lambda_{\max}(K)/\lambda_{\min}(M)$ and that $\|(M^{-1}K)^{-1}\|_2 \le
\lambda_{\max}(M)/\lambda_{\min}(K)$. It is easily shown (see, for example, Axelsson and Barker, 1984,
again, or Strang and Fix) that $\lambda_{\max}(M) \sim Ch^{n_s}$ so that $\|(M^{-1}K)^{-1}\|_2 = O(1)$; i.e., the
true inverse 'Laplacian' is bounded.
But we have $K$ in our Stokes equations, not $M^{-1}K$, which forces us to return to
the fact that $\|K^{-1}\|_2 \sim O(1/h^{n_s})$. The 'final' resolution of the paradox is obtained by
looking at the entire equation; in particular, at the magnitude of the RHS vector, $f$
($CP$ will be discussed later), and this we do for a scalar (Poisson-type) problem for
simplicity of presentation (the true, vector case, follows easily, we assert). The problem
$-\nabla^2 u = S$ in $\Omega$ with $u = 0$ on $\Gamma$ leads to $Ku = f$ via the GFEM, where $f_i = \int \phi_i S$, and
we assume $S$ to be 'well-behaved' (e.g., $S \in L^2$). Considering the compact support of the
basis functions (or, with similar although slightly less accurate results, interpolate $S$ via
the basis functions to obtain a mass matrix on the RHS), it follows easily that $f_i = \bar{S}_i h^{n_s}$
where $\bar{S}_i$ is the basis-function-weighted average value of $S(x)$ over the support of $\phi_i$ [which
is $O(h^{n_s})$]. Thus, $\|u\|_2 \le \|f\|_2/\lambda_{\min}(K) = h^{n_s/2}\sqrt{\Omega\,\overline{S^2}}/\lambda_{\min}(K)$ where $\overline{S^2} = (1/N)\sum_{i=1}^N \bar{S}_i^2$
is the (discrete) domain average of $S^2$ and $\Omega$ is the domain size. Using (3.13-130) then
leads to $\|u\|_2 \le c'\sqrt{\overline{S^2}}/h^{n_s/2}$, which looks ominous until it is realized that
$$\|u\|_2 = \sqrt{\sum_{i=1}^N u_i^2} = \sqrt{N\,\overline{u^2}} = c''\sqrt{\overline{u^2}}/h^{n_s/2}$$
to give, finally (in the RMS norm),
$$\sqrt{\overline{u^2}} \le c\sqrt{\overline{S^2}}; \tag{3.13-132}$$
the solution is indeed 'stable' (bounded by the data independently of $h$) even though
$\|K^{-1}\|_2 \to \infty$ as $h \to 0$, and is our first example of a stable result from $Ax = b$ even
though $\lambda_{\min} \to 0$ as $N \to \infty$; both $z_j^T b$ and $\lambda_j$ vary with $h$ as $h^{n_s}$.
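The scaling argument leading to (3.13-132) can be observed in one dimension ($n_s = 1$). The sketch below rests on assumptions chosen purely for illustration: linear elements on a uniform mesh of the unit interval, the source $S = 1$, and hypothetical function names. In that setting $K$ is tridiag(-1, 2, -1)/$h$ and $f_i = h$.

```python
import math

def thomas_model(b):
    """Solve tridiag(-1, 2, -1) x = b by the Thomas algorithm."""
    n = len(b)
    c = [0.0] * n
    d = [0.0] * n
    c[0], d[0] = -0.5, b[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (b[i] + d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def poisson_1d(n):
    """Return (lambda_min(K), ||u||_2, RMS of u) for -u'' = 1, u(0) = u(1) = 0."""
    h = 1.0 / (n + 1)
    f = [h] * n  # f_i = integral of phi_i * S with S = 1 (assumed source)
    # K u = f with K = tridiag(-1, 2, -1) / h  is  tridiag(-1, 2, -1) u = h f.
    u = thomas_model([h * fi for fi in f])
    lam_min = (2.0 - 2.0 * math.cos(math.pi * h)) / h  # behaves like pi^2 h
    norm_u = math.sqrt(sum(t * t for t in u))
    rms_u = norm_u / math.sqrt(n)
    return lam_min, norm_u, rms_u

for n in (15, 63, 255):
    print(n, poisson_1d(n))
```

As the mesh is refined, the smallest eigenvalue of $K$ shrinks in proportion to $h$ and the Euclidean norm of $u$ grows like $h^{-1/2}$, while the RMS value of $u$ settles near a constant, as (3.13-132) predicts.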
We now turn our attention to $C^T K^{-1} C$ and the issue of its (potential) bounded inverse.
In this regard it is first relevant to point out that Malkus (1981) showed long ago that the
following eigenvalue problem,
$$(C^T K^{-1} C)q_i = \sigma_i Q q_i, \quad i = 1, 2, \ldots, m, \tag{3.13-133}$$
which he called the second adjoint LBB eigenproblem, has the (real) spectrum $0 \le \sigma_1 \le
\sigma_2 \le \ldots \le \sigma_m < \infty$. In fact, if the continuous version of this eigenproblem can be
interpreted as
$$[\nabla\cdot(\nabla^2)^{-1}\nabla]q_i = \sigma_i q_i, \tag{3.13-134}$$
and we believe that it can, at least up to BC's, and if $\nabla\cdot(\nabla^2)^{-1}\nabla$ approximates the identity
operator [see also Karniadakis et al. (1993)], then we would obviously have $\{\sigma_i\} = 1$
(and $q_i$ arbitrary!), and this does seem to be approximately true, numerically, at least
for the 'good' part of the spectrum. That is, at least for LBB-stable elements, it does
seem to be true that the eigenvalues of (3.13-133) satisfy $-\delta_1 \le \sigma_i - 1 \le \epsilon_1$ where $0 <
\delta_1, \epsilon_1 < 1$. [Also, the eigenvalues of (3.13-134) are bounded, above and below, because
$\nabla\cdot(\nabla^2)^{-1}\nabla$ is a bounded operator; there are not any derivatives left.] The 'velocity-mode'
that is associated with (3.13-134) for $\sigma = 1$ is given by $u_i = [2/(1 \pm \sqrt{5})](\nabla^2)^{-1}\nabla q_i$ and
is referred to as an 'irrotational' (curl-free) mode in Griffiths and Silvester (1994) and
Griffiths (1996). Thus, we might expect the eigenvalues of $Q^{-1}C^T K^{-1} C$ [or, equivalently,
those of $Q^{-1/2}(C^T K^{-1} C)Q^{-1/2}$], to be $O(1)$, at least in the best of cases. But for
'LBB-unstable' elements, such as $Q_1Q_0$, we are not so fortunate. But this is getting ahead
of the story, and we return to a consideration of the stability of (3.13-127) and (3.13-128)
after making a useful change of variable: define $\tilde{u}$ via
$$\tilde{u} = K^{-1}f, \tag{3.13-135}$$
after which the Stokes equations become $Ku + CP = K\tilde{u}$ and $C^T u = g$; i.e., they
correspond to a discrete $H^1$-projection (see Appendix 3) of $\tilde{u}$ to the discretely divergence-free
subspace. Then (3.13-127) becomes simply
$$P = (C^T K^{-1} C)^{-1}(C^T\tilde{u} - g), \tag{3.13-136}$$
and (3.13-128) becomes
$$u = \tilde{u} - K^{-1}CP, \tag{3.13-137}$$
where we remark that, from our earlier discussion, we know that $\tilde{u}$ is well-behaved
(bounded independent of $h$).
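Beginning with $P$ from (3.13-136), which is appropriately measured in the discrete
version of the $L^2$-norm, via $P^h(x) = \sum_{j=1}^m P_j\psi_j(x)$, we have
$$\|P^h\|_0^2 = \int [P^h(x)]^2 = \int \Big[\sum_{j=1}^m P_j\psi_j(x)\Big]^2 = \sum_j\sum_k P_j P_k \int \psi_j\psi_k = P^T Q P = \|P\|_Q^2 = \|Q^{1/2}P\|_2^2 \tag{3.13-138}$$
and (3.13-136) yields
$$\begin{aligned}
\|P^h\|_0^2 &= [(C^T K^{-1} C)^{-1}(C^T\tilde{u} - g)]^T Q\,[(C^T K^{-1} C)^{-1}(C^T\tilde{u} - g)] \\
&= [Q^{-1/2}(C^T\tilde{u} - g)]^T [Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}]^2 [Q^{-1/2}(C^T\tilde{u} - g)] \\
&= \|[Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}][Q^{-1/2}(C^T\tilde{u} - g)]\|_2^2
\end{aligned}$$
and thus
$$\begin{aligned}
\|P^h\|_0 &= \|[Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}] \cdot [Q^{-1/2}(C^T\tilde{u} - g)]\|_2 \\
&\le \|Q^{-1/2}(C^T\tilde{u} - g)\|_2 \cdot \lambda_{\max}(Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}) \\
&= \|Q^{-1/2}(C^T\tilde{u} - g)\|_2 / \lambda_{\min}(Q^{-1/2}(C^T K^{-1} C)Q^{-1/2}),
\end{aligned} \tag{3.13-139}$$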
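and we wish to know if and how $\|P^h\|_0$ varies with $h$. The numerator (the easy part) is
estimated first: to begin, we recognize that there exists a continuous 'velocity' field, $\tilde{u}^h(x)$,
that satisfies the BC's of the Stokes velocity, and that its discrete, weak divergence (at
pressure nodes) is $Q^{-1}(C^T\tilde{u} - g)$; i.e., $\nabla_h\cdot\tilde{u}^h = \sum_{j=1}^m [Q^{-1}(C^T\tilde{u} - g)]_j\psi_j(x)$ so that
$$\begin{aligned}
\Omega\,\overline{(\nabla_h\cdot\tilde{u}^h)^2} &= \int (\nabla_h\cdot\tilde{u}^h)^2 = \|\nabla_h\cdot\tilde{u}^h\|_0^2 \\
&= \sum_j\sum_k [Q^{-1}(C^T\tilde{u} - g)]_j [Q^{-1}(C^T\tilde{u} - g)]_k \int \psi_j\psi_k \\
&= [Q^{-1}(C^T\tilde{u} - g)]^T Q\,[Q^{-1}(C^T\tilde{u} - g)] \\
&= \|Q^{-1}(C^T\tilde{u} - g)\|_Q^2 \\
&= \|Q^{-1/2}(C^T\tilde{u} - g)\|_2^2,
\end{aligned} \tag{3.13-140}$$
where $\overline{(\nabla_h\cdot\tilde{u}^h)^2}$ is the average value of $(\nabla_h\cdot\tilde{u}^h)^2$ over the domain, $\Omega$, and is obviously
independent of $h$. Thus,
$$\|Q^{-1/2}(C^T\tilde{u} - g)\|_2 = \sqrt{\Omega\,\overline{(\nabla_h\cdot\tilde{u}^h)^2}} = \|Q^{-1/2}(C^T K^{-1}f - g)\|_2 \tag{3.13-141}$$
is independent of $h$.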
We are thus left with $\|P^h\|_0 \le c/\lambda_{\min}[Q^{-1/2}(C^T K^{-1} C)Q^{-1/2}]$, where $c$ is independent
of $h$, and we now turn our attention to the denominator and show that it is intimately
related to the LBB 'constant,' $k_h$, that is defined by [cf. (3.13-120)]
$$k_h = \min_P\ \max_u \frac{|u^T C P|}{\sqrt{u^T K u} \cdot \sqrt{P^T Q P}}, \tag{3.13-142}$$
wherein we note that the 'appropriate' norms have been utilized: $H^1$ for velocity and $L^2$
for pressure. This is the finite-dimensional version of the inf-sup condition. By actually
evaluating $k_h$ from the above equation, we shall see how it is connected to
$\lambda_{\min}[Q^{-1/2}(C^T K^{-1} C)Q^{-1/2}]$. The evaluation below follows that of Stenberg (1991; personal
communication), although it has been known much longer (e.g., Malkus, 1981). For a given $P$, we
must first compute
$$\begin{aligned}
\alpha(P) &= \max_u \frac{u^T C P}{\sqrt{u^T K u}} = \max_u \frac{(K^{1/2}u)^T K^{-1/2} C P}{\sqrt{(K^{1/2}u)^T (K^{1/2}u)}} \\
&= \max_{v = K^{1/2}u} \frac{v^T K^{-1/2} C P}{\sqrt{v^T v}} = \max_v \frac{v^T K^{-1/2} C P}{\|v\|_2} \\
&= \max_v \frac{v^T w}{\|v\|_2} \quad \text{where } w = K^{-1/2} C P \\
&= \max_\theta \|w\|_2\cos\theta,
\end{aligned}$$
where $\theta$ is the angle between $v$ and $w$, from the definition of the inner product: $v^T w =
\|v\|_2 \cdot \|w\|_2\cos\theta$. Clearly, the maximum is attained when $\theta = 0$; i.e., when $v$ and $w$ are
parallel. This says $v = \beta w$ (and thus $u = \beta K^{-1/2}w = \beta K^{-1}CP$) for an arbitrary scalar $\beta$
and yields
$$\alpha(P) = \|w\|_2 = \|K^{-1/2}CP\|_2. \tag{3.13-143}$$
We now insert (3.13-143) into (3.13-142) and vary $P$:
$$\begin{aligned}
k_h^2 &= \min_P \frac{\alpha^2(P)}{P^T Q P} = \min_P \frac{P^T C^T K^{-1} C P}{P^T Q P} \\
&= \min_P \frac{(Q^{1/2}P)^T (Q^{-1/2} C^T K^{-1} C Q^{-1/2})(Q^{1/2}P)}{(Q^{1/2}P)^T (Q^{1/2}P)} \\
&= \min_{q = Q^{1/2}P} \frac{q^T (Q^{-1/2} C^T K^{-1} C Q^{-1/2}) q}{q^T q}.
\end{aligned}$$
But by Rayleigh's quotient, the RHS is just the minimum eigenvalue ($\lambda_{\min}$) of the matrix
$Q^{-1/2} C^T K^{-1} C Q^{-1/2}$, and we have the (important) result that
$$k_h^2 = \lambda_{\min}(Q^{-1/2} C^T K^{-1} C Q^{-1/2}), \tag{3.13-144}$$
from which follows
$$\|P^h\|_0 \le c/k_h^2, \tag{3.13-145}$$
where (recall) $c$ is independent of $h$.
Remarks:
(1) Whereas the LBB stability constant, $k_h$, was obtained/derived by studying the
coupled/mixed problem, the 'same' result, namely that $\lambda_{\min}[Q^{-1/2}(C^T K^{-1} C)Q^{-1/2}]$
should be bounded (away from zero) independent of $h$, was derived by isolating $P$ and
studying its stability, via the minimum eigenvalue of (3.13-133); i.e., these are two equivalent
ways to analyze the saddle-point problem.
(2) $k_h^2$ from (3.13-144) is the same eigenvalue as $\sigma_1$ from (3.13-133); $k_h = \sqrt{\sigma_1}$.
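The identity (3.13-144) can be tried on a very small example. The sketch below is illustrative only: it assumes two pressure unknowns and a diagonal mass matrix $Q$ (so that $Q^{-1/2}$ is trivial and the eigenvalues of the scaled 2 x 2 matrix follow from the quadratic formula), and the matrices and function names are hypothetical choices, not taken from any particular element.

```python
import math

def solve(A, b):
    """Gaussian elimination with partial pivoting; A is a list of rows."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def schur(K, C):
    """S = C^T K^-1 C, built column by column with linear solves."""
    cols = [list(col) for col in zip(*C)]
    KinvC = [solve(K, col) for col in cols]
    return [[dot(ci, kj) for kj in KinvC] for ci in cols]

def lbb_constant(K, C, q_diag):
    """k_h = sqrt(lambda_min(Q^-1/2 S Q^-1/2)) for two pressures and diagonal Q."""
    S = schur(K, C)
    T = [[S[i][j] / math.sqrt(q_diag[i] * q_diag[j]) for j in range(2)]
         for i in range(2)]
    half_trace = 0.5 * (T[0][0] + T[1][1])
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    lam_min = half_trace - math.sqrt(max(half_trace ** 2 - det, 0.0))
    return math.sqrt(lam_min)

def infsup_ratio(K, C, q_diag, P):
    """max over u of |u^T C P| / (||u||_K ||P||_Q), attained at u = K^-1 C P."""
    S = schur(K, C)
    alpha_sq = dot(P, [dot(row, P) for row in S])  # alpha(P)^2 = P^T S P
    return math.sqrt(alpha_sq / sum(q * p * p for q, p in zip(q_diag, P)))

# Hypothetical small data.
K = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
C = [[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]]
q_diag = [0.5, 0.5]
k_h = lbb_constant(K, C, q_diag)
ratios = [infsup_ratio(K, C, q_diag, [math.cos(t), math.sin(t)])
          for t in [i * math.pi / 180.0 for i in range(180)]]
print(k_h, min(ratios))
```

Sampling pressure directions and taking the smallest ratio reproduces the value obtained from the eigenvalue, which is the content of the two equivalent viewpoints in Remark (1).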
Before looking further at $k_h$ and how it might vary with $h$, we turn to the velocity
solution; we wish to evaluate the size of the velocity portion of the solution, again in
the appropriate semi-norm, $|u^h(x)|_1 = \|u\|_K = \sqrt{u^T K u}$. We have, from (3.13-136) and
(3.13-137),
$$\begin{aligned}
u &= \tilde{u} - K^{-1}C(C^T K^{-1} C)^{-1}(C^T\tilde{u} - g) \\
&= [I - K^{-1}C(C^T K^{-1} C)^{-1}C^T]\tilde{u} + K^{-1}C(C^T K^{-1} C)^{-1}g \\
&= B\tilde{u} + K^{-1}CA^{-1}g,
\end{aligned} \tag{3.13-146}$$
where $A = C^T K^{-1} C$, and $B = I - K^{-1}CA^{-1}C^T$ is a projection matrix (see Appendix 3,
where it appears under a different symbol) that is associated with the $H^1$-projection of $\tilde{u}$ to the
discretely divergence-free subspace; $B^2 = B$ and $C^T B = O$ ($B$ projects into the null space of $C^T$;
i.e., into the discretely divergence-free subspace). Thus,
$$\begin{aligned}
\|u\|_K^2 = u^T K u &= (B\tilde{u} + K^{-1}CA^{-1}g)^T K (B\tilde{u} + K^{-1}CA^{-1}g) \\
&= \tilde{u}^T B^T K B\tilde{u} + 2g^T A^{-1}C^T B\tilde{u} + g^T A^{-1}C^T K^{-1}CA^{-1}g \\
&= \tilde{u}^T K\tilde{u} - (C^T\tilde{u} + g)^T A^{-1}(C^T\tilde{u} - g) \\
&= \|\tilde{u}\|_K^2 - (C^T\tilde{u} + g)^T (C^T K^{-1} C)^{-1}(C^T\tilde{u} - g),
\end{aligned} \tag{3.13-147}$$
where we note that this would properly simplify in the event that $\tilde{u}$ happened to be
properly divergence-free; i.e., if $C^T\tilde{u} = g$, then we get $P = 0$ and $u = \tilde{u}$.
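The projection properties $B^2 = B$ and $C^T B = O$, and the energy identity (3.13-147), can be checked numerically without ever forming $B$. The sketch below is a hedged illustration: the small matrices, the vectors $\tilde{u}$ and $g$, and the function names are assumptions invented for the example, and $B$ is applied as an operator through linear solves.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting; A is a list of rows."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def schur(K, C):
    """A = C^T K^-1 C."""
    cols = transpose(C)
    KinvC = [solve(K, col) for col in cols]
    return [[dot(ci, kj) for kj in KinvC] for ci in cols]

def project(K, C, v):
    """Apply B = I - K^-1 C A^-1 C^T to v."""
    lam = solve(schur(K, C), matvec(transpose(C), v))  # A^-1 C^T v
    corr = solve(K, matvec(C, lam))                    # K^-1 C A^-1 C^T v
    return [vi - ci for vi, ci in zip(v, corr)]

def stokes_velocity(K, C, u_tilde, g):
    """u = B u_tilde + K^-1 C A^-1 g, as in (3.13-146)."""
    lift = solve(K, matvec(C, solve(schur(K, C), g)))
    return [b + l for b, l in zip(project(K, C, u_tilde), lift)]

# Hypothetical small data.
K = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
C = [[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]]
u_tilde = [1.0, 2.0, -1.0]
g = [0.5, -0.25]

u = stokes_velocity(K, C, u_tilde, g)
Ct_ut = matvec(transpose(C), u_tilde)
plus = [a + b for a, b in zip(Ct_ut, g)]
minus = [a - b for a, b in zip(Ct_ut, g)]
lhs = dot(u, matvec(K, u))                            # ||u||_K^2
rhs = dot(u_tilde, matvec(K, u_tilde)) - dot(plus, solve(schur(K, C), minus))
print(lhs, rhs)  # the two sides of (3.13-147)
```

Applying `project` twice gives the same vector as applying it once, and the projected vector is annihilated by $C^T$, which is the discrete statement that it is divergence-free.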
Pushing on, we rewrite (3.13-147) as
$$\|u\|_K^2 = \|\tilde{u}\|_K^2 - [Q^{-1/2}(C^T\tilde{u} + g)]^T [Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}] [Q^{-1/2}(C^T\tilde{u} - g)]$$
and thus, the obvious inequality,
$$\|u\|_K \le \|\tilde{u}\|_K + \left[\|Q^{-1/2}(C^T\tilde{u} + g)\|_2 \cdot \|Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}\|_2 \cdot \|Q^{-1/2}(C^T\tilde{u} - g)\|_2\right]^{1/2}$$
follows. Noting now that $\|Q^{-1/2}q\|_2 = \|Q^{-1}q\|_Q$ and that
$$\begin{aligned}
\|Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}\|_2 &= \lambda_{\max}[Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}] \\
&= 1/\lambda_{\min}(Q^{-1/2} C^T K^{-1} C Q^{-1/2}) \\
&= k_h^{-2}
\end{aligned}$$
from (3.13-144) yields
$$\|u\|_K \le \|\tilde{u}\|_K + \left[\|Q^{-1}(C^T\tilde{u} + g)\|_Q \cdot \|Q^{-1}(C^T\tilde{u} - g)\|_Q\right]^{1/2}/k_h. \tag{3.13-148}$$
Recalling now the result in (3.13-141), it is clear that the numerator in the second term
of (3.13-148) is independent of $h$ to give
$$\|u\|_K \le \|\tilde{u}\|_K + c'(\tilde{u})/k_h, \tag{3.13-149}$$
where the only possible dependence on $h$ could be in $k_h$.
So, our final stability bounds are that $\|u\|_K \le c_0 + c'/k_h$ and $\|P^h\|_0 \le c/k_h^2$, and further
progress rests on being able to estimate the LBB stability constant, $k_h$, which estimate
(i) is not easy and (ii) depends on the element under consideration; thus, we have reached
the end of 'general' results.
Remarks:
(1) The fact that $P$ varies quadratically with $k_h^{-1}$ and $u$ only linearly has important
ramifications; namely, if $k_h = O(h^\alpha)$ for $\alpha > 0$, the pressure 'bound' is 'lost' (becomes
poor) 'sooner' than that for velocity, an observation previously made by Brezzi and
Fortin (1991, p. 57).
(2) Several papers that also approach the stability issue mainly from the pure linear
algebra approach are: (i) Brezzi and Bathe (1990); (ii) Fortin and Pierre (1992);
(iii) Chapelle and Bathe (1993); (iv) Wathen and Silvester (1993); and (v) Nicolaides
(1982).
o Eigenvector expansion. To conclude our LBB discussion, we replay (with a twist) the
eigenvector expansion technique on/for the Stokes equations—the results of which add
further to our overall understanding of stability. To this end, we construct—in principle,
at least—the analytical solution of (3.13-124). But to do so profitably—i.e., to build upon
previous knowledge garnered by others—we consider a non-standard eigenvalue problem
corresponding to (3.13-124): rather than considering the conventional eigenproblem, Kv +
Cq = kv and CTv = kq, we address the 'scaled' eigenproblem (fi = eigenvalue)
Kv + Cq = nKv (3.13-150)
and
CTv = iiQq, (3.13-151)
which was first considered by Malkus (1981), who referred to it as the 'convergence'
eigenproblem. [For a more 'modern' approach, and some new results, see Griffiths (1996),
portions of which we will summarize/utilize.] It also leads directly to (3.13-133), with
$\sigma = \mu(\mu - 1)$. This corresponds to a generalized eigenproblem of the type $Az = \lambda Bz$,
where A is non-singular (which we assume initially to be the case) and B is SPD, thus
assuring a complete set of B-orthogonal eigenvectors $\{z_i\}$, which we take to be normalized;
$z_i^T B z_j = \delta_{ij}$. The solution of Ax = b in terms of this set of basis vectors is done as
follows: $x = \sum_i \alpha_i z_i$, $b = \sum_i \beta_i B z_i$, the latter expansion employed because it leads to a
particularly efficient analytical solution. [It is a legitimate expansion because $\{z_i\}$ is a basis,
and B is SPD, which makes $\{Bz_i\}$ a basis too.] The final result is easily found to be
$x = \sum_i \lambda_i^{-1} (z_i^T b)\, z_i$, as for the conventional eigenproblem. Applied to (3.13-124), we first have
$\begin{pmatrix} f \\ g \end{pmatrix} = \sum_{i=1}^{N} (v_i^T f + q_i^T g) \begin{pmatrix} K v_i \\ Q q_i \end{pmatrix}$, (3.13-152)
in which, as an aside, it is interesting to note that if g = 0 (a common situation in
practice—such as thermal convection in a contained flow), then (3.13-152) necessarily
implies $\sum_i (v_i^T f)\, Q q_i = 0 \;\; \forall f$. That this is 'reasonable' follows from the rearrangement
to $Q \sum_i q_i v_i^T f = 0$ and the realization (thanks to A. Hindmarsh) that the m x n 'outer
product' matrix, $\sum_i q_i v_i^T$, is actually the zero matrix, as follows from the orthonormality condition,
$v_i^T K v_j + q_i^T Q q_j = \delta_{ij}$; (3.13-153)
it is the lower left partition (m x n) of the identity matrix. (End of 'aside.') The final
solution of (3.13-124) is then
$\begin{pmatrix} u \\ P \end{pmatrix} = \sum_{i=1}^{N} \frac{v_i^T f + q_i^T g}{\mu_i} \begin{pmatrix} v_i \\ q_i \end{pmatrix}$, (3.13-154)
which produces a stable solution to (3.13-124), with (graph) norm (squared) of
$\|u\|_K^2 + \|P\|_Q^2 = u^T K u + P^T Q P = \sum_{j=1}^{N} (v_j^T f + q_j^T g)^2/\mu_j^2$, as long as the amplitude coefficients,
$(v_i^T f + q_i^T g)/\mu_i$, i = 1, 2, ..., N, remain bounded as $h \to 0$ ($N \to \infty$). Again, we point
out that unbounded growth (instability) can only occur if $\mu_i \to 0$ faster than does the
projection of the data, $(v_i^T f + q_i^T g)$, as $h \to 0$, in those cases wherein $v_i^T f + q_i^T g \to 0$
as $h \to 0$. The concern (mostly by mathematicians) over just $\mu_i \to 0$ as $h \to 0$ in the
general case is, however, warranted, for the following reason: stability in the strictest
sense means a bounded solution as $h \to 0$ for all possible (bounded) data (f and g).
If $\mu_i \to 0$, then there must be some data for which the solution will blow up rather
than converge. And this is true. But what this general and cautious/conservative stability
definition does not take into account is any consideration of the 'reasonableness' of such
data and its relationship to the continuous problem. (Mathematicians are prohibited from
enforcing 'reasonableness' on the data—partly, and certainly justifiably, because the term
cannot be well-defined.) Suppose, for example, that $\mu_k \to 0$ as $h \to 0$ for some k (and,
for simplicity, that this is the only eigenvalue that does so). Suppose further that the
data were carefully selected to be (as a 'worst case') the corresponding eigenvector, i.e.,
$(f, g) = (v_k, q_k)$ to give the very simple solution [from (3.13-154)]
$\begin{pmatrix} u \\ P \end{pmatrix} = \frac{1}{\mu_k} \begin{pmatrix} v_k \\ q_k \end{pmatrix}$; (3.13-155)
the point of this extreme example (unstable for $h \to 0$) is to emphasize the fact that the
eigenvectors are important too (not just the eigenvalues), and it may just turn out (as
we have discovered at least for some special cases) that the 'unstable' eigenvector is so
'bizarre' that it makes absolutely no sense in the continuum (h —> 0); it might even be
the case that it does not—for our case—lie in the range of the Stokes operator, which
here would mean that the data in the continuous problem are not in the dual space of
$H^1$, viz., $H^{-1}$. As we will show later, in Section 3.13.5k for the (allegedly) 'unstable'
Q1Q0 element, the eigenvectors corresponding to the unstable eigenvalues are indeed
rather 'bizarre'; they are very highly oscillatory (similar to a '$2\Delta x$ wave') and, as a
consequence, are nearly orthogonal to 'smooth' data (i.e., to reasonable data), and thus
the numerators of the amplitude coefficients do decrease as $h \to 0$, and they do so faster
than does the denominator, $\mu_i$, thus permitting both stability and convergence. It is this
simple fact that accounts for the major 'success' of Q1Q0.
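The expansion (3.13-152) through (3.13-154) can be checked numerically on a small system. The following is a minimal sketch, assuming NumPy and made-up random matrices (they are not a finite element discretization); the function name `solve_by_eigen_expansion` and the Cholesky reduction of the scaled eigenproblem to a standard one are illustrative choices, not something taken from the text.

```python
import numpy as np

def solve_by_eigen_expansion(K, C, Q, f, g):
    """Solve [[K, C], [C^T, 0]] [u; P] = [f; g] via the scaled eigenproblem
    A z = mu B z with B = diag(K, Q); cf. (3.13-150) to (3.13-154)."""
    n, m = C.shape
    A = np.block([[K, C], [C.T, np.zeros((m, m))]])
    B = np.block([[K, np.zeros((n, m))], [np.zeros((m, n)), Q]])
    b = np.concatenate([f, g])
    # With B = L L^T, A z = mu B z becomes the standard symmetric problem
    # (L^-1 A L^-T) y = mu y, where z = L^-T y.
    L = np.linalg.cholesky(B)
    Linv = np.linalg.inv(L)
    mu, Y = np.linalg.eigh(Linv @ A @ Linv.T)
    Z = Linv.T @ Y              # columns z_i = (v_i; q_i), with Z^T B Z = I
    amp = (Z.T @ b) / mu        # amplitude coefficients (v_i^T f + q_i^T g)/mu_i
    x = Z @ amp                 # x = sum_i amp_i z_i
    return x[:n], x[n:], mu, amp

# Made-up test data (assumption: any SPD K, Q and full-rank C will do).
rng = np.random.default_rng(0)
n, m = 6, 3
G = rng.standard_normal((n, n))
K = G @ G.T + n * np.eye(n)              # SPD "viscous" matrix
Q = np.diag(rng.uniform(0.5, 1.5, m))    # SPD (lumped) pressure mass matrix
C = rng.standard_normal((n, m))          # full column rank: no pressure modes
f = rng.standard_normal(n)
g = rng.standard_normal(m)

u, P, mu, amp = solve_by_eigen_expansion(K, C, Q, f, g)
print("eigenvalues mu_i:", np.round(mu, 4))
print("graph norm squared:", u @ K @ u + P @ Q @ P, "vs sum of amp^2:", np.sum(amp**2))
```

The printed eigenvalues should fall into three groups (m negative, n - m equal to one, m greater than one), and the two graph-norm values should agree, which is the content of the stability statement following (3.13-154).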
It may be of interest to reveal a few more of the known results regarding the eigensystem
[(3.13-150), (3.13-151)], results derived originally by Malkus (1981), and relate
them to the LBB stability constant given in (3.13-142) and (3.13-144). To start, we
eliminate v, for $\mu \ne 1$, from [(3.13-150), (3.13-151)] to recover (3.13-133):
$(C^T K^{-1} C)\, q = \mu(\mu - 1)\, Q q = \sigma Q q$, (3.13-156)
which we convert, via the change of variable, $r = Q^{1/2} q$, to
$[Q^{-1/2}(C^T K^{-1} C) Q^{-1/2}]\, r = \sigma r$, (3.13-157)
a conventional eigenproblem of 'size' m; i.e., $r \in R^m$, and the matrix is m x m. Malkus
(1981) has shown that this 'second adjoint LBB eigenproblem' has a discrete set of
positive eigenvalues (in the absence of pressure modes, which we assume to be the case
at this point),
$0 < \sigma_1 \le \sigma_2 \le \sigma_3 \le \ldots \le \sigma_m$, (3.13-158)
and, from (3.13-144) we see that the LBB constant is just
$k_h = \sqrt{\sigma_1}$. (3.13-159)
Remark:
It seems to be the case that $\sigma_m \le 1$, although we know of no proof.
Then, since $\mu_i(\mu_i - 1) = \sigma_i$, each of the m values of $\sigma_i$ produces two values of
$\mu_i$ (one < 0 and the other > 1),
$\mu_i^{\pm} = \frac{1}{2}\left(1 \pm \sqrt{1 + 4\sigma_i}\right)$, (3.13-160)
and we have 2m of the desired n + m (= N) eigenvalues with corresponding eigenvectors
$\begin{pmatrix} v_i^{\pm} \\ q_i \end{pmatrix} = \begin{pmatrix} \frac{1}{\mu_i^{\pm} - 1} K^{-1} C q_i \\ q_i \end{pmatrix}$, (3.13-161)
where $q_i$ and $\sigma_i$ come from (3.13-156), and the q's satisfy $q_i^T Q q_j = \delta_{ij}$. Also, the LBB
constant is now expressible as
$k_h = \sqrt{\mu_1^{\pm}(\mu_1^{\pm} - 1)}$. (3.13-162)
From (3.13-160) we see that one half of the $\mu$'s ($\mu_i^-$) are < 0, and the other half
($\mu_i^+$) are > 1; and it is the negative roots that can be dangerous (if $\sigma_i \to 0$). The remaining
n − m eigenvectors have $\mu = 1$ and q = 0; i.e., they are the divergence-free subset of
('velocity-only') eigenvectors [cf. (3.13-151)] of the form
$\begin{pmatrix} v_i \\ q_i \end{pmatrix} = \begin{pmatrix} v_i \\ 0 \end{pmatrix}$, (3.13-163)
where $C^T v_i = 0$, and $\mu_i = 1$. Note that the 2m 'velocity' eigenvectors of (3.13-161) are
not divergence-free; they are dilatational, with
$C^T v_i^{\pm} = \frac{1}{\mu_i^{\pm} - 1} C^T K^{-1} C q_i = \mu_i^{\pm} Q q_i$. (3.13-164)
These are the vectors that permit satisfaction of the inhomogeneous constraint equation,
$C^T u = g$, in (3.13-124). They are referred to as 'discretely irrotational vectors' (curl-free)
by Griffiths (1996), because they are (K-) orthogonal to the divergence-free vectors. If
we now order the eigenvectors according to the size of the corresponding eigenvalue, we
can represent the full solution of (3.13-124) via (3.13-161) and (3.13-163) as
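$\begin{pmatrix} u \\ P \end{pmatrix} = \sum_{i=1}^{m} a_i^{-} \begin{pmatrix} \frac{1}{\mu_i^{-} - 1} K^{-1} C q_i \\ q_i \end{pmatrix} + \sum_{i=m+1}^{n} (v_i^T f) \begin{pmatrix} v_i \\ 0 \end{pmatrix} + \sum_{i=n+1}^{N} a_i^{+} \begin{pmatrix} \frac{1}{\mu_i^{+} - 1} K^{-1} C q_i \\ q_i \end{pmatrix}$, (3.13-165)
with amplitudes $a_i^{\pm} = \frac{\mu_i^{\pm} - 1}{\mu_i^{\pm}(2\mu_i^{\pm} - 1)} \left[ \frac{q_i^T C^T K^{-1} f}{\mu_i^{\pm} - 1} + q_i^T g \right]$, $2\mu_i^{\pm} - 1 = \pm\sqrt{1 + 4\sigma_i}$, and $v_i^T K v_i = 1$ in the middle sum,
where $q_{i+n} = q_i$, and $\sigma_{i+n} = \sigma_i$ for i = 1, 2, ..., m; and (recall) N = n + m.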
Returning now to stability and LBB from an alternate viewpoint, we focus on
the smallest eigenvalue ($\sigma_1$); i.e., we have $\sigma_1 = \sigma_{\min}(Q^{-1/2} C^T K^{-1} C\, Q^{-1/2}) = k_h^2$ from
(3.13-144), wherein $\sigma$ was called $\lambda$. Suppose now that $\sigma_1 = ch^2$, as indeed it is for some
'LBB-unstable' elements such as Q1Q0 in 2D (see Sections 3.13.5j and k), giving $k_h = O(h)$
and an amplitude coefficient for the first eigenvector, using $\sqrt{1 + 4\sigma_1} \approx 1 + 2ch^2$,
of
$\frac{1}{\mu_1}(v_1^T f + q_1^T g) = -(q_1^T C^T K^{-1} f + q_1^T g)/ch^2$, (3.13-166)
and the potential instability with mesh refinement is clearly evident. Only if
$q_1^T(C^T K^{-1} f + g) = O(h^{2+\varepsilon})$ for $\varepsilon \ge 0$ will the solution not blow up as $h \to 0$. And this can be the case,
as we show in Section 3.13.5j for Q1Q0, because $q_1$ is very 'checkerboardish', and thus
nearly orthogonal to reasonably smooth (not checkerboardish!) data. We shall return to
this issue in Section 3.13.5k.
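The relations (3.13-157) through (3.13-162) lend themselves to a quick numerical cross-check. The sketch below assumes NumPy and uses made-up random matrices rather than a real discretization; `lbb_spectrum` is a hypothetical helper name. A Cholesky factor of Q is used in place of $Q^{1/2}$, which is an implementation convenience that leaves the eigenvalues $\sigma_i$ unchanged.

```python
import numpy as np

def lbb_spectrum(K, C, Q):
    """Return sigma_i of (3.13-157), k_h of (3.13-159), and mu_i^-, mu_i^+ of (3.13-160)."""
    S = C.T @ np.linalg.solve(K, C)          # C^T K^-1 C
    Lq_inv = np.linalg.inv(np.linalg.cholesky(Q))
    # Lq^-1 S Lq^-T has the same eigenvalues as Q^-1/2 S Q^-1/2.
    sigma = np.linalg.eigvalsh(Lq_inv @ S @ Lq_inv.T)   # ascending order
    k_h = np.sqrt(sigma[0])
    root = np.sqrt(1.0 + 4.0 * sigma)
    return sigma, k_h, 0.5 * (1.0 - root), 0.5 * (1.0 + root)

# Made-up test data (assumption: SPD K and Q, full-rank C).
rng = np.random.default_rng(1)
n, m = 7, 3
G = rng.standard_normal((n, n))
K = G @ G.T + n * np.eye(n)
Q = np.diag(rng.uniform(0.5, 1.5, m))
C = rng.standard_normal((n, m))

sigma, k_h, mu_minus, mu_plus = lbb_spectrum(K, C, Q)

# Cross-check against the full (n + m) scaled eigenproblem A z = mu B z.
A = np.block([[K, C], [C.T, np.zeros((m, m))]])
B = np.block([[K, np.zeros((n, m))], [np.zeros((m, n)), Q]])
Lb_inv = np.linalg.inv(np.linalg.cholesky(B))
mu_full = np.linalg.eigvalsh(Lb_inv @ A @ Lb_inv.T)
mu_pred = np.sort(np.concatenate([mu_minus, np.ones(n - m), mu_plus]))

print("k_h =", k_h)
print("max |mu_full - mu_pred| =", np.max(np.abs(mu_full - mu_pred)))
```

For an actual element family one would repeat this on a sequence of refined meshes and watch whether $k_h$ stays bounded away from zero; that experiment is not attempted here.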
Remarks:
(1) If a non-trivial null space (dimension k) of pressure modes is present, (i) the first and
last summation in (3.13-165) is each reduced by k, (ii) the middle sum is increased
by k (each pressure mode 'generates' another divergence-free velocity mode), and
(iii) there is added a fourth sum, $\sum_{i=n+m+1-k}^{n+m} \alpha_i \begin{pmatrix} 0 \\ P_i \end{pmatrix}$, where $P_i$ is one of the k pressure
modes ($CP_i = 0$) and $\{\alpha_i\}$ are arbitrary scalars. These modes correspond to $\sigma = 0$
in (3.13-156) and have $\mu^+ = 1$, $\mu^- = 0$. When pressure modes are present, the
'effective' number of pressures (and concomitant constraint/continuity) equations is
reduced—one per mode. [The rank (r) of C is m — k; cf. the discussion following
(3.13-58).]
(2) A similar eigenvector expansion for the transient Stokes equations, which has more
physical relevance, will be presented later—Section 3.16.3.
(3) A clever re-scaling of the above eigenvectors, due to Griffiths (1996), leads to a
much more compact representation:
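$\begin{pmatrix} u \\ P \end{pmatrix} = \sum_{i=1}^{m} \frac{1}{\sigma_i\, q_i^T Q q_i} \begin{pmatrix} (q_i^T g)\, w_i \\ (w_i^T f - q_i^T g)\, q_i \end{pmatrix} + \sum_{i=m+1}^{n} \frac{v_i^T f}{v_i^T K v_i} \begin{pmatrix} v_i \\ 0 \end{pmatrix}$, (3.13-167)
where
$w_i = K^{-1} C q_i$ (3.13-168)
is called 'discretely irrotational' because it is (K-)orthogonal to the discretely
divergence-free eigenvectors ($v_i$); the representation was obtained (in part) by exploiting the
(unnormalized) orthogonality between the two sets of dilatational m-vectors; namely,
[cf. (3.13-156)],
$w_i^T K w_j = \sigma_i\, q_i^T Q q_j$, (3.13-169)
where we note that the new eigenvectors are actually $[w_i, (\mu_i^{\pm} - 1) q_i]^T =
[K^{-1} C q_i, (\mu_i^{\pm} - 1) q_i]^T$. Additional relationships that led to the more streamlined
expansion are: $\mu_i^+ + \mu_i^- = 1$ and $\mu_i^+ \cdot \mu_i^- = -\sigma_i$.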
To conclude our discussion of LBB, we show 'qualitatively' the spectrum $\{\mu_j\}$ in
Figure 3.13-13, which might be useful. The eigenvalues are imagined to be distributed
along the curve from #1 to #8. For an element-specific version of this figure, see Griffiths
(1996). The figure is qualitative in that it applies to any element (probably even to
other spatial discretizations than FEM) and the curved portions are only 'suggestive,'
and the two 'limit points,' $\mu_j = (1 \pm \sqrt{5})/2$ that obtain when $\sigma_j = 1$, are probably only
approached for large N.
Additional remarks related to the circled numbers:
1. Regions #1 and #8 show the regions of 'good' (smooth) modes that are trying to mimic
the curl-free modes with $\sigma = 1$ mentioned below (3.13-134).
2. Regions #2 and #7 are 'transition' regions and may vary in shape from element to
element; i.e., with the discretization.
3. Regions #3 and #6 are 'bad' regions with rather oscillatory modes that seem to have
no counterparts in the continuum, which would never deviate from the $(1 \pm \sqrt{5})/2$
'asymptotes.'
4. Another numerical artifact is Region #4, at least if k (the dimension of the null space
of C) exceeds unity; these are pure pressure modes.
5. Region #5 is the 'clean' region: the space of truly discretely divergence-free
eigenvectors, the null space of $C^T$; and it is often smaller than we would like. We would like
(we believe) a larger Region #5 and smaller Regions #3 and #6, and no Region #4.
6. Finally, we return to the 'danger zone,' Region #3, and remark that the difference
between a stable and an unstable element shows up at the right end of #3; for a stable
element the smallest $|\mu_j|$'s approach a constant independent of N as N is
increased, and for unstable ones they approach zero.
[Fig. 3.13-13 A spectral picture of the discretized Stokes equations. Vertical axis: $\mu_j$, between the limits $(1 - \sqrt{5})/2$ and $(1 + \sqrt{5})/2$; horizontal axis: j = 1, ..., m − k, m, n + k, ..., n + m = N. Labelled groups: m − k dilatational modes, k pure pressure modes [dim N(C)], n − m + k divergence-free modes [dim N($C^T$)], m − k dilatational modes.]
A final eigenvector remark, until we revisit the analogous problem for the transient
Stokes equations in Section 3.16.3, is this: the 2(m − k) dilatational modes
are those needed to satisfy the following two sets of m − k equations:
1. $C^T u = g$ and
2. $(C^T K^{-1} C) P = C^T K^{-1} f - g$ [cf. (3.13-124) and (3.13-125)].
The former is in a sense a boundary equation in that g is populated only at
pressures corresponding to velocity nodes on the boundary with (inhomogeneous) Dirichlet
BC's—and the latter is a 'full domain' equation.
This concludes our version/description of LBB stability, and related matters. Hopefully,
it adds something positive to the total picture.
e. Penalty methods
We have already briefly introduced 'the' penalty method—via the penalty 'equation of
state' in Section 3.5.3 and variationally in Section 3.7.1, showing how the penalty
parameter, the pressure (Lagrange multiplier), and the velocity divergence are related. See also
Section 3.10.6. Here we further discuss the penalty method, extend it to the full NS
equations, and mention two other penalty-like methods. First, however, we shall cite
some of the relevant references on the subject (there are too many to list 'all'—see the
references' references) to which we refer the reader for the 'heavy' theory associated
with the continuous penalty method (PDE's), as our approach will be more 'practical'
in that we will begin with the discrete (FEM) equations. For the continuous case, see,
for example, Bercovier (1978), Temam (1984), Reddy (1978, 1982), Oden et al. (1982),
Carey and Oden (1983), and Girault and Raviart (1986). Some relevant publications more
on the 'applied' side are: Zienkiewicz and Godbole (1975), Hughes et al. (1976, 1978),
Malkus and Hughes (1978), Bercovier and Engelman (1979), Hughes et al. (1979a), Oden
and Jacquotte (1984), Kheshgi and Scriven (1982, 1984, 1985), and Reddy et al. (1992).
The demarcations of theoretical vs applied in listing the above references is, of course,
subject to 'interpretation.' Finally, an entire ASME conference on the subject is described
in the proceedings edited by Reddy (1982).
o Model problem. To further motivate the method, and indeed to aid in understanding it,
we present first a three-equation 'model' of the Stokes equations, the steady version of
the model discussed later in some detail (Section 3.16.2),
$k u + c_1 P = f_1$, (3.13-170)
$k v + c_2 P = f_2$, (3.13-171)
and
$c_1 u + c_2 v = g$, (3.13-172)
with solution
$u = \frac{c_2^2 f_1 - c_1 c_2 f_2 + c_1 k g}{k(c_1^2 + c_2^2)}$, (3.13-173)
$v = \frac{c_1^2 f_2 - c_1 c_2 f_1 + c_2 k g}{k(c_1^2 + c_2^2)}$, (3.13-174)
and
$P = \frac{c_1 f_1 + c_2 f_2 - k g}{c_1^2 + c_2^2}$. (3.13-175)
This was the 'mixed-interpolation/Lagrange multiplier' approach. The penalty
approximation replaces (3.13-172) by
$c_1 u + c_2 v - g = \varepsilon P$, (3.13-176)
where $\varepsilon$ is 'small' ($1/\varepsilon = \lambda$ is the 'penalty parameter'). The penalty solution is also easily
found:
$u = \frac{c_2^2 f_1 - c_1 c_2 f_2 + c_1 k g + \varepsilon k f_1}{k(c_1^2 + c_2^2 + \varepsilon k)}$, (3.13-177)
$v = \frac{c_1^2 f_2 - c_1 c_2 f_1 + c_2 k g + \varepsilon k f_2}{k(c_1^2 + c_2^2 + \varepsilon k)}$, (3.13-178)
and
$P = \frac{c_1 f_1 + c_2 f_2 - k g}{c_1^2 + c_2^2 + \varepsilon k}$. (3.13-179)
But the 'raison d'etre' of the penalty method is to eliminate P a priori and only calculate
it at the 'end of the day' (if ever). Thus, inserting (3.13-176) into (3.13-170), (3.13-171)
yields the penalized momentum equations sans pressure:
$(k + \lambda c_1^2)\, u + \lambda c_1 c_2\, v = f_1 + \lambda c_1 g$, (3.13-180)
$\lambda c_1 c_2\, u + (k + \lambda c_2^2)\, v = f_2 + \lambda c_2 g$. (3.13-181)
It is clear from (3.13-173) through (3.13-179) that the penalty solution is 'close to'
the 'mixed' solution; clearly $O(\varepsilon)$ away, in fact. And this is 'exactly' what happens in
the full-blown GFEM case. So much for 'theory,' for now. Except for the following
remarks, which also generalize to the 'many'-degrees-of-freedom case:
1. The 'penalty matrix,' $B = \begin{pmatrix} c_1^2 & c_1 c_2 \\ c_1 c_2 & c_2^2 \end{pmatrix}$, is singular, a requirement first noticed, we believe,
by Fried (1974); a non-singular B would, from (3.13-180) and (3.13-181), drive u and v
to zero ('locking') for $\lambda \to \infty$ (at least for the g = 0 case), which is not a good approximation
to (3.13-173) through (3.13-175).
2. The full system matrix,
$A = K + \lambda B = \begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix} + \lambda \begin{pmatrix} c_1^2 & c_1 c_2 \\ c_1 c_2 & c_2^2 \end{pmatrix} = \begin{pmatrix} k + \lambda c_1^2 & \lambda c_1 c_2 \\ \lambda c_1 c_2 & k + \lambda c_2^2 \end{pmatrix}$, (3.13-182)
has eigenvalues k and $k + \lambda(c_1^2 + c_2^2)$ and consequent large condition number, $O(\lambda)$. The
concomitant loss of accuracy in solving the penalty equations, (3.13-180) and (3.13-181),
via Gaussian elimination is thus also 'large,' and this sets an upper bound on $\lambda$ depending
on your computer's accuracy; a 14 digit machine will lose about $10^{-14}\lambda$ significant digits,
thus limiting $\lambda$ to values less than, say, $10^9$. This is the principal problem with the penalty
method: $\lambda$ too small gives too large a divergence (and other errors), $\lambda$ too big loses
significant digits. There is a bathtub curve measuring 'penalty error,' whose 'bottom' is
nearly flat (and on which the penalty method works well) only over a somewhat limited
range of $\lambda$, typically $10^4$ to $10^9$. Iterative solution methods 'like' large $\lambda$ even less than
do direct methods, because the convergence rate becomes very small.
3. For steady-state problems (only), the sign of $\varepsilon$ is immaterial, a statement that may not
sit too well with our friends from solid mechanics, who often like to liken the penalty
parameter to a physical one (a Lamé parameter) that should be of proper sign when dealing
with compressible elasticity. But we are dealing with incompressible fluid mechanics and
thus feel free to simply regard $\lambda$ as a penalty parameter (math, not physics).
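The three-equation model is small enough to explore directly. The following sketch, assuming NumPy and arbitrarily chosen coefficient values (k, c1, c2, f1, f2, g below are illustrative only), compares the mixed solution (3.13-173) to (3.13-175) with the penalty solution obtained from (3.13-180), (3.13-181) and the post-processed pressure from (3.13-176), and reports the condition number of the penalized matrix (3.13-182).

```python
import numpy as np

def mixed_solution(k, c1, c2, f1, f2, g):
    """Closed-form mixed (Lagrange multiplier) solution, (3.13-173) to (3.13-175)."""
    d = c1**2 + c2**2
    u = (c2**2 * f1 - c1 * c2 * f2 + c1 * k * g) / (k * d)
    v = (c1**2 * f2 - c1 * c2 * f1 + c2 * k * g) / (k * d)
    P = (c1 * f1 + c2 * f2 - k * g) / d
    return u, v, P

def penalty_solution(k, c1, c2, f1, f2, g, lam):
    """Solve the penalized momentum equations (3.13-180), (3.13-181), then
    recover P from the penalty equation (3.13-176) with eps = 1/lam."""
    A = np.array([[k + lam * c1**2, lam * c1 * c2],
                  [lam * c1 * c2, k + lam * c2**2]])
    rhs = np.array([f1 + lam * c1 * g, f2 + lam * c2 * g])
    u, v = np.linalg.solve(A, rhs)
    P = lam * (c1 * u + c2 * v - g)
    return u, v, P, np.linalg.cond(A)

# Illustrative values only (not from the text).
k, c1, c2, f1, f2, g = 1.0, 1.0, 2.0, 1.0, -1.0, 0.5
u_m, v_m, P_m = mixed_solution(k, c1, c2, f1, f2, g)

for lam in (1e2, 1e4, 1e6, 1e8):
    u_p, v_p, P_p, cond = penalty_solution(k, c1, c2, f1, f2, g, lam)
    print(f"lam={lam:.0e}  |u err|={abs(u_p - u_m):.2e}  "
          f"|P err|={abs(P_p - P_m):.2e}  cond(A)={cond:.2e}")
```

The velocity error should shrink in proportion to $\varepsilon = 1/\lambda$ while the condition number grows in proportion to $\lambda$, which is the trade-off behind the 'bathtub curve' of remark 2.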
o The real problem. We now apply the penalty method to the NS equations. Starting
from (3.13-28) and (3.13-29), we obtain the penalty version of them by first replacing the
latter by
$C^T u - g = \varepsilon Q P$, (3.13-183)
where Q is the pressure 'mass matrix' (or 'basis function overlap' matrix; Kheshgi and
Scriven, 1985):
$Q_{ij} = \int \psi_i \psi_j$, (3.13-184)
and eliminating the pressure in (3.13-28) via (3.13-183); i.e., the penalized momentum
equations are
$M\dot{u} + [K + N(u) + \lambda C Q^{-1} C^T]\, u = f + \lambda C Q^{-1} g$, (3.13-185)
and the pressure is gone... and our problems are thus gone!? Not quite. While the penalty
method is often very useful, it is, unfortunately, no panacea—for at least the following
reasons, some of which we shall further elucidate and/or demonstrate later:
1. The selection of $\lambda$ is not always as 'easy' as implied above.
2. The matrix $Q^{-1}$ is generally dense, effectively precluding practical utility. But consistent
mass is not a requirement for getting good penalty results (cf. Engelman et al., 1982b, and
Fortin, 1983); for those elements for which mass lumping is legitimate, Q may be lumped
and thus rendered diagonal. For other elements, the replacement of Q by (average) element
size (area or volume) would probably work fine—although we have not tested this.
3. Unless element-contained, and thus discontinuous, pressures are employed, the
(consistent—and 'highly' singular) penalty matrix,
$B = C Q^{-1} C^T$, (3.13-186)
is global and must be so constructed—a feature that makes the method unattractive in
practice. This helps explain why virtually all applied penalty methods have employed
discontinuous pressures.
4. The penalty matrix intensely couples the velocities, a fact that affects/limits the ensuing
numerical solution procedures.
5. The necessarily large penalty parameter makes the problem rather ill-conditioned, thus
again restricting the choice of numerical solution procedures and (usually) adversely
affecting their performance.
6. The spurious penalty start-up transient—Section 3.5.3—can be 'annoying.' (See
Section 3.16.2e for more on this.)
7. If pressure modes plagued the mixed interpolation 'progenitor,' they will usually also
make their presence felt in the penalty version—when (and if) the pressure is obtained
via post-processing, from (3.13-183).
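For the steady Stokes case (no time derivative, N(u) = 0), the consistent penalty equations reduce to a single linear solve followed by pressure recovery from (3.13-183). The sketch below assumes NumPy and a tiny hand-made system (the matrices K, C, Q and the data f, g are illustrative, not a real discretization); the function names are hypothetical.

```python
import numpy as np

def mixed_stokes(K, C, f, g):
    """Reference 'mixed' solve of K u + C P = f, C^T u = g."""
    n, m = C.shape
    A = np.block([[K, C], [C.T, np.zeros((m, m))]])
    x = np.linalg.solve(A, np.concatenate([f, g]))
    return x[:n], x[n:]

def consistent_penalty_stokes(K, C, Q, f, g, lam):
    """Steady form of (3.13-185): (K + lam*B) u = f + lam*C Q^-1 g,
    with B = C Q^-1 C^T as in (3.13-186)."""
    B = C @ np.linalg.solve(Q, C.T)
    rhs = f + lam * (C @ np.linalg.solve(Q, g))
    u = np.linalg.solve(K + lam * B, rhs)
    P = lam * np.linalg.solve(Q, C.T @ u - g)   # post-processing via (3.13-183)
    return u, P

# Tiny illustrative system: 4 velocities, 2 pressures, lumped (diagonal) Q.
K = 2.0 * np.eye(4)
Q = np.eye(2)
C = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, -1.0]])
f = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, -0.5])

u_mix, P_mix = mixed_stokes(K, C, f, g)
for lam in (1e2, 1e4, 1e6):
    u_pen, P_pen = consistent_penalty_stokes(K, C, Q, f, g, lam)
    print(f"lam={lam:.0e}  ||u_pen - u_mix||={np.linalg.norm(u_pen - u_mix):.2e}  "
          f"||P_pen - P_mix||={np.linalg.norm(P_pen - P_mix):.2e}")
```

With a diagonal Q the product $C Q^{-1} C^T$ is cheap to form, which is the practical point of item 2 in the list above; a dense $Q^{-1}$ would make the same line far more expensive.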
o Reduced quadrature. Thus far, we have been discussing what we call 'consistent
penalty,' a term introduced in Engelman et al. (1982b); the above equations resulted from
a consistent GFEM applied to the continuum momentum equation and to the continuum
penalty equation (3.5-9), in the conventional 'mixed-interpolation mode' (e.g., P is one
order lower than u). Historically, and still alternatively today, there is another way to
obtain the penalized discretized momentum equation; namely, apply GFEM to (3.5-10),
the continuum penalized momentum equation. This approach results in a generally (but
not always, see below) different B-matrix and a somewhat different set of 'problems.'
[The viscosity is completely negligible next to $\lambda$ in (3.5-10), or should be if the penalty
method is to succeed, and can/should be dropped from that term.] The name of this
historical game is 'reduced integration' or 'reduced integration penalty' (RIP) or
'selective reduced integration,' and it came about by starting 'wrong,' where by 'wrong'
we simply (naively?) mean starting from the weak form of the continuum-derived penalty
momentum equation; i.e., from (3.5-10), which when discretized a la GFEM, reads
$M\dot{u} + (K + N(u) + \lambda B)\, u = f$, (3.13-187)
where all terms are as before [cf. (3.13-18) et seq.] except the penalty matrix, which
is now
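$B = \int_\Omega (\nabla \cdot \varphi)(\nabla \cdot \varphi)^T$, (3.13-188)
or
$B_{ij}^{\alpha\beta} = \int_\Omega \frac{\partial \varphi_i}{\partial x_\alpha} \frac{\partial \varphi_j}{\partial x_\beta}$. (3.13-189)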
The only problem with this formulation is that it does not work. Not as stated, at least;
because as presented thus far, the B-matrix is not singular, giving for the simplest case
of steady Stokes flow, for $\lambda \to \infty$, $u = \lambda^{-1} B^{-1} f \approx 0$, the locking problem mentioned
above. How to fix it? Well, what began as a 'trick,' reduced integration (to render B 'less
accurate,' and singular) was later elevated to a legitimate methodology by Malkus and
Hughes (1978), who put all of the previous clues together and came up with their famous
equivalence theorem, which we loosely state in words as it applies to the case of interest
herein (it covers other mixed method applications in addition to incompressible flows),
albeit 'only' for discontinuous pressures (nearly the only case of practical interest): the
B-matrix of (3.13-187) and that of (3.13-186) are the same (at least under certain conditions,
such as straight-sided quadrilaterals) if and only if (3.13-189) is under-integrated in just
the right way, that being the Gauss-Legendre rule whose integration points correspond
to the pressure 'nodes' for the corresponding mixed method. Examples:
1. The Q1Q0 element displays the equivalence when one-point quadrature is used for B
in (3.13-189). Full quadrature requires 2 x 2 and also has an equivalence: Q1Q-1, an
element that is no good because there are more constraint equations than velocities to
satisfy them; ergo, locking.
2. The Q2Q-1 element displays the equivalence when the B-matrix is integrated via
2 x 2 Gaussian quadrature. Again, full quadrature, here 3 x 3, also has an equivalent
mixed method element: Q2Q-2, another loser. Both of these equivalences apply only to
straight-sided elements.
3. Any other higher-order Lagrange element, QmQ(1-m).
But the reduced integration method has enough significant 'problems' associated with
it that we recommend only the consistent penalty method: $B = C Q^{-1} C^T$, with
discontinuous pressures so that element-level construction of B is possible. Some of the reduced
integration 'problems,' per Engelman et al. (1982b), are: (i) equivalence does not obtain
for Q2Q-1 if the sides are curved—i.e., via fully isoparametric element simulations;
(ii) there is no known reduced quadrature method to do the penalty version of Q2P-1, a
leading contender for the 'best' 2D element; (iii) equivalence in 3D for Q1Q0 and Q2Q-1
only occurs for simple bricks, not for distorted ones; and (iv) when consistent penalty is
pitted against reduced integration penalty under non-equivalent conditions (which is still
legitimate for both methods), the consistent penalty method is more accurate. Finally,
only consistent penalty is applicable (although not recommended) to all elements for
which mixed interpolation is successful, even those with $C^0$ pressure approximation. We
conclude this comparison with an important quotation from the Malkus/Hughes
equivalence paper: 'We believe the practical computing consequences of these results are
significant. Namely, the accuracy of the mixed formulation in constrained media situations can
be obtained with a displacement [read velocity] formulation, completely eliminating the
additional computational expense engendered by the auxiliary field of the mixed
formulation.' A solid vote for the consistent penalty method from the solid mechanics community
is the following, from Lee and Dawson (1989): 'It has been found that consistent penalty
techniques give more accurate pressure and velocity fields with less computation time. The
approach suggested by Engelman and co-workers uses a linear, discontinuous interpolation
function for pressure to avoid spurious modes with quadratic velocity approximation.'
Finally, we remark that the reader interested in 'Some Historical Remarks on Mixed
and Reduced and Selective Integration Methods' should see p. 226 of Hughes (1987).
o Transient penalty. Turning now fully to the time-dependent case, we form the PPE
analog a la penalty by inserting $\dot{u}$ from (3.13-185) into the time derivative of (3.13-183)
and use (3.13-183) again to obtain
$\frac{1}{\lambda} Q \dot{P} + (C^T M^{-1} C) P = C^T M^{-1}[f - K u - N(u) u] - \dot{g}$, (3.13-190)
which, for $\lambda \to \infty$, recovers the conventional PPE, derived in Section 3.13.4; see
(3.13-242). But it is precisely the $\dot{P}$ term for finite $\lambda$ that makes (3.13-190) look more like
the transient heat equation than the desired elliptic equation. [The advection-diffusion
equation, derived in (3.5-11) for the continuum case, could (and perhaps should) also
be rearranged and then 're-interpreted' as the transient heat equation, since advection,
$\varepsilon u \cdot \nabla P$, is negligibly small.] The implied PPE/heat equation for P is indeed implied
by the penalty velocity solution and is what we called a 'transient penalty shock wave'
in Section 3.5.3. It corresponds to intentionally introduced stiffness (see Section 2.7.2c)
in the ODE sense in that, once the spurious 'wave' has diffused through the mesh,
the heat equation 'portion' is finished, and the PPE 'portion' takes over; i.e., P is in
quasi-equilibrium after a time of order $\tau \sim 1/\lambda$. Three additional remarks: (i) although
indeed an implied equation that is always satisfied by the pressure, it of course need
never be formed in a computer code; P also satisfies (3.13-183), which is a much more
convenient way to 'retrieve' P, if and when desired; (ii) the index (see Section 3.16.1) of
the penalty method is, of course, zero: there are no more algebraic equations, just (stiff)
ODE's; and (iii) the divergence-free constraint on the initial velocity field is ostensibly
no longer present; arbitrary $u_0$ fields, however, are still physically meaningless and will
be converted to a (hopefully reasonable) divergence-free [to $O(\varepsilon)$] velocity in a time of
$O(\varepsilon)$ via what is effectively an $L_2$-projection, the details of which will be presented later
(Sections 3.16.2e and 3.16.4f).
The above discussion also clearly shows that 'stiff integrators' are required in order
to solve the time-dependent penalized momentum ODE's; explicit methods are OUT. In
Sani et al. (1981b), this spurious penalty transient was introduced and demonstrated. Also
shown was the important trick of 'bypassing' this transient with a dissipative integrator;
one step of BE or BDF2, for example, of size $\Delta t \gg 1/\lambda$ will put (3.13-190) into the
'PPE mode,' after which the (stable) implicit time integrator (BE, TR, BDF2, etc.) could
take over and the smart timestepper turned on. That is, the short transient associated with
penalty need not be accurately integrated—and generally should not, being totally non-
physical. The only time this sidestepping trick might fail would be a situation in which
the true physics occurs on a time scale so short that $\Delta t = O(1/\lambda)$ or less is required for
accuracy. This would represent a penalty method failure, unless $\lambda$ could be increased
sufficiently without running out of digits on the computer.
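The 'bypassing' trick can be illustrated on a scalar caricature of (3.13-190), $\varepsilon \dot{P} + sP = r$, in which the constants s and r stand in for $C^T M^{-1} C$ and the right-hand side (an assumption made purely for illustration; the values below are arbitrary). One backward Euler (BE) step of size $\Delta t \gg \varepsilon$ lands essentially on the quasi-equilibrium value r/s, whereas one trapezoid rule (TR) step of the same size merely reflects the initial error about the equilibrium, since TR is not dissipative for stiff components.

```python
def backward_euler_step(P0, dt, eps, s, r):
    # Solves eps*(P1 - P0)/dt + s*P1 = r for P1.
    return (eps * P0 + dt * r) / (eps + dt * s)

def trapezoid_step(P0, dt, eps, s, r):
    # Solves eps*(P1 - P0)/dt + s*(P0 + P1)/2 = r for P1.
    return ((eps - 0.5 * dt * s) * P0 + dt * r) / (eps + 0.5 * dt * s)

eps = 1.0e-6          # 1/lambda, the spurious transient's time scale
s, r = 1.0, 2.0       # illustrative stand-ins for C^T M^-1 C and the RHS
P_eq = r / s          # quasi-equilibrium ('PPE mode') pressure
P0 = 0.0              # start far from equilibrium
dt = 1.0e-2           # one large step, dt >> eps

P_be = backward_euler_step(P0, dt, eps, s, r)
P_tr = trapezoid_step(P0, dt, eps, s, r)
print("equilibrium:", P_eq)
print("after one BE step:", P_be, " remaining error:", abs(P_be - P_eq))
print("after one TR step:", P_tr, " remaining error:", abs(P_tr - P_eq))
```

This is why a single dissipative step is suggested before handing over to the regular integrator; the sketch says nothing about accuracy when the physics itself evolves on the $O(1/\lambda)$ time scale.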
In Figures 3.13-14 and 3.13-15 we show some aspects of the spurious penalty transient
for a simple Poiseuille flow start-up via the imposition of a pressure drop over the length of
the channel, using the variable-step BE method, described in detail later (Section 3.16.4).
Here, $\lambda = 10^6$, and the Q1Q0 element was employed with a local time truncation error
tolerance ($\varepsilon$) of 0.001. Whereas the true (or mixed interpolation) solution displays a
constant-in-time (and y), linearly decreasing in x (from four to zero) pressure profile
and a uniformly increasing u(y) in the y-direction with v = 0, the penalty method does
not attain this situation until $t = O(10^{-5})$, the time it takes for the penalty transient
to equilibrate/dissipate. Note too that the non-zero vertical velocities are also spurious,
albeit they are $O(\varepsilon)$. Only 80 timesteps were required to track this penalty transient,
during which $\Delta t$ grew by seven orders of magnitude, from ~$10^{-11}$ to $10^{-4}$; see
Figure 3.13-15. For $t > 10^{-5}$, the 'true' transient Poiseuille flow 'starts' for the penalty
approach; see Sani et al. (1981b) for further details, and also a non-physical penalty
transient in which the incompressibly illegal BC of $u = 2y(1 - y/2)$ with zero initial
velocity was also run with the penalty method, both in the stiff and non-stiff modes, the
former taking advantage of the L-stability of BE via a starting $\Delta t$ of 0.01, which stepped
right over the spurious transient. (The non-stiff integration follows the transient accurately
in time via local error control and variable timesteps, which will be carefully described in
Section 3.16.4.) In this latter case, the spurious compression wave generated pressures of
size $10^5$ to $10^7$ for $0 < t < O(10^{-5})$, after which they returned to O(1). [Here TR required
only 43 time steps and BE required 94, with $\varepsilon = 0.001$, using (of course) the 'smart'
(variable-step) integrators described in Section 3.16.4.]
[Fig. 3.13-14 Mesh and boundary conditions for transient Poiseuille flow; the inlet BC is $f_n = 4$, v = 0. The channel spans x = 0 to 2.0, with u = v = 0 on the wall; labelled pressure locations include P5, P10, P20, P30, P40, and P50.]
[Fig. 3.13-15 The specious penalty transient. (a) Pressure P (0 to 4.0) at selected locations versus time; (b) vertical velocities (v29, v53, v59) and timestep $\Delta t$ versus time. Time axis from $10^{-11}$ to $10^{-5}$.]
Besides the spurious penalty transient, the penalty method is (for $\varepsilon > 0$) artificially
dissipative, although only slightly. Neglecting the viscous term in (3.13-187) yields the
penalized Euler equations, whose kinetic energy 'conservation' law reads, for f = 0 and
N(u) skew-symmetric,
$\frac{1}{2} \frac{d}{dt}(u^T M u) = -\lambda u^T B u$, (3.13-191)
which is dissipative if $u^T B u > 0$. From (3.13-186) we obtain
$\frac{1}{2} \frac{d}{dt}(u^T M u) = -\lambda (C^T u)^T Q^{-1} (C^T u)$ (3.13-192)
and, since $Q^{-1}$ is SPD, dissipation has been proven. But it is not badly dissipative, even
though $\lambda$ appears on the RHS, because $C^T u$ is small. In fact, the use of (3.13-183), for
g = 0, yields
$\frac{1}{2} \frac{d}{dt}(u^T M u) = -\varepsilon P^T Q P$; (3.13-193)
the kinetic energy slowly decays (for $\varepsilon > 0$ and small) at a rate proportional to the
pressure's 'kinetic energy.' Perhaps this energy 'argument' is, in fact, sufficient to suggest
that only positive values of $\lambda$ should be employed, but see below. For an example of
penalty's dissipation, see Sasaki and Reddy (1980).
o Pressure modes. Next we present a short discussion on 'pressure modes a la penalty,'
beginning with the obvious observation that even if the corresponding mixed-mode
element would display one or more pure pressure modes (zero eigenvalue, $CP_m = 0$,
recall), including the physical hydrostatic mode ($n \cdot u$ specified on all of $\Gamma$, recall), the
penalty version of it will not be singular. In this sense, the penalty method is also a
regularization method; any eigenvalue that would be zero is then $O(\varepsilon)$. Presuming the
existence of a (mixed-mode) pressure mode, $P_m$, (3.13-183) yields
$P_m^T C^T u - P_m^T g = u^T C P_m - P_m^T g = -P_m^T g = \varepsilon P_m^T Q P$, (3.13-194)
which brings up two points: (i) if the 'mixed' problem is well-posed ($P_m^T g = 0$, recall),
then the penalty pressure is Q-orthogonal to that mode, $P_m^T Q P = 0$, which, as shown by
Sani et al. (1981a), tends [cf. (3.13-87)] to act like a 'filter' if the mode is
checkerboard-like; and (ii) if the mixed problem is ill-posed ($P_m^T g \ne 0$), the corresponding penalty
pressure will be very large [$O(\lambda)$], and the associated velocities not 'physical.' Do not
use 'penalty' to try to solve otherwise ill-posed problems.
o Variable penalty parameter. To conclude our discussion of 'the' penalty method, we
return briefly to the subject of 'selecting $\lambda$' for steady flows. We have already asserted that
penalty 'works' [gives u and P that are $O(\varepsilon)$ from those obtained with the 'mixed' analog]
for both positive and negative values of $\lambda$. Here we further strengthen this position by
pointing to the papers by Kheshgi et al.; see, for example, Kheshgi and Scriven (1985) for
'applications' and Kheshgi and Luskin (1985) for theory, and references therein. Called
the 'variable penalty method,' the penalty parameter is chosen (2D) to alternate in sign
in a checkerboard manner and to vary in magnitude in a certain way, the combination
having been shown to increase the 'quality' of penalty results, at least in some cases.
For the $Q_1Q_0$ element, take
$$\lambda_i = \pm c_\lambda\, a_i / \langle a \rangle,$$ (3.13-195)
where $c_\lambda$ is the 'conventional' penalty parameter (e.g., $10^7$), $a_i$ is the area of element i, $\langle a \rangle$
is the number average of all element areas, and the signs alternate in CB-fashion. When
it 'works' (explained below), this trick extends the utility of the penalty method by both
reducing compressibility error and permitting a wider range of $c_\lambda$ (several more decades
of utility). There are, however, two classes of BC's for which it does not work well: (i)
normal traction applied on a portion of $\Gamma$, and (ii) tangential stress applied on a portion
of $\Gamma$. In both cases, Dirichlet BC's are applicable on the rest of $\Gamma$. For all Dirichlet or for
partly Dirichlet and partly normal and tangential stress BC's, the variable penalty method
works well; i.e., better than the conventional method. The extension of these results to
other elements or to 3D, however, has yet to be accomplished, to our knowledge.
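As an illustration of (3.13-195), the element-wise penalty parameters on a logically rectangular 2D mesh might be generated as follows. This is only a sketch: the array layout (element areas stored on an ny-by-nx grid), the function name, and the choice of which element receives the plus sign are assumptions made here, not part of the cited method.

```python
import numpy as np

def variable_penalty(areas, c_lambda=1.0e7):
    """Return lambda_i = +/- c_lambda * a_i / <a> with checkerboard signs.

    areas    : (ny, nx) array of element areas a_i (layout assumed for this sketch)
    c_lambda : the 'conventional' penalty parameter
    """
    areas = np.asarray(areas, dtype=float)
    j, i = np.indices(areas.shape)
    signs = np.where((i + j) % 2 == 0, 1.0, -1.0)   # + on 'black' squares, - on 'white'
    return signs * c_lambda * areas / areas.mean()  # <a> is the number average of areas

# Uniform 2 x 3 patch: the magnitudes all equal c_lambda and only the signs alternate.
lam_demo = variable_penalty(np.ones((2, 3)))
```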
o Closure. Final remarks on the penalty method:
1. It is effective, especially in the 'consistent' method (no reduced integration), for any
element in which the pressure is element-contained ($C^{-1}$).
2. It fully and tightly couples the velocities, which can have serious implications in
3D; e.g., segregated/uncoupled solution methods (discussed in Section 3.16.7) are not
'available.' It is more 'viable' in 2D.
3. Only fully implicit time-integration methods should be employed—except perhaps for
advection (semi-implicit)—thus again coupling all velocity components in the 'large'
matrix. See Section 3.16.4f for implicit time-integration discussion.
4. The first timestep (or first several) should generally be made using a strongly A-stable
(or even L-stable) integrator with step sizes large compared with $\varepsilon$.
5. If a wide variety of element sizes is present in the mesh for steady-state simulations,
then the variable penalty method of Kheshgi et al. may be worthwhile—BC's permitting.
It may also be a good idea in general, at least in 2D for 'proper' BC's, except for
time-dependent flows.
o Another penalty-like method: Augmented Lagrangian. In order to both reduce the need
for such a large penalty parameter and to more closely approximate exact enforcement
of the divergence-free constraint, the method called 'augmented Lagrangian' has been
employed and advocated by some [here are three: Fortin and Glowinski (1983), in which
the Navier-Stokes problem is but one of many addressed, Fortin and Fortin (1985c), and
Simo and Armero (1994)]. Unlike the classical/conventional penalty method, this method
does not try to reduce the size of the problem by eliminating the pressure; rather, it tries
to decouple the velocity and pressure. Thus, rather than (3.13-183) and (3.13-185), the
problem addressed is
$$M\dot{u} + [K + N(u) + \lambda C Q^{-1} C^T]\, u + C P = f + \lambda C Q^{-1} g$$ (3.13-196)
and
$$C^T u = g;$$ (3.13-197)
i.e., the pressure is retained and exact mass conservation recovered. But as the astute
reader has no doubt already noticed, (3.13-197) implies that the penalty terms drop out in
(3.13-196), bringing us back to 'square one.' The augmented Lagrangian 'trick' is to apply
the so-called Uzawa method to a 'split-up' iterative form of the above equations—and we
'demonstrate' this only for the simple special case of steady Stokes flow, for simplicity
(see the references for the rest);
$$[K + \lambda C Q^{-1} C^T]\, u_{k+1} = f + \lambda C Q^{-1} g - C P_k$$ (3.13-198)
and
$$P_{k+1} = P_k + \lambda\, (C^T u_{k+1} - g),$$ (3.13-199)
which system often converges 'quickly' even for not-so-large values of $\lambda$ (say $10^3$). We
conclude this very brief summary with a few
Remarks:
(1) In Fortin and Glowinski (1983), the possibility of finding improved convergence
rates by using different $\lambda$'s in (3.13-198) and (3.13-199) is discussed.
(2) In Simo and Armero (1994), the algorithm is applied as part of a time-marching
method with only two iterations per timestep and, for reasons not obvious to us,
they omitted the penalty terms in the momentum equations.
(3) Fortin and Fortin (1985c) apply the method in conjunction with a Newton method
for solving the steady Navier-Stokes equations.
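The Uzawa iteration (3.13-198) and (3.13-199) for steady Stokes flow can be sketched in a few lines. The version below is a minimal dense-matrix illustration, not a production solver: the function name, the stopping test on the constraint residual, and the small made-up test problem (with Q taken as the identity) are all assumptions introduced here.

```python
import numpy as np

def uzawa_augmented_lagrangian(K, C, Q, f, g, lam=1.0e3, tol=1.0e-10, max_iter=50):
    """Iterate (3.13-198)-(3.13-199); return (u, P, number of iterations used)."""
    A = K + lam * C @ np.linalg.solve(Q, C.T)       # K + lambda C Q^{-1} C^T
    rhs0 = f + lam * C @ np.linalg.solve(Q, g)      # f + lambda C Q^{-1} g
    P = np.zeros(C.shape[1])                        # start from zero pressure
    for k in range(1, max_iter + 1):
        u = np.linalg.solve(A, rhs0 - C @ P)        # velocity update, (3.13-198)
        r = C.T @ u - g                             # constraint residual
        P = P + lam * r                             # pressure update, (3.13-199)
        if np.linalg.norm(r) <= tol:
            break
    return u, P, k

# Made-up 4-velocity, 2-pressure example (assumption: Q = I).
K_demo = np.diag([2.0, 3.0, 4.0, 5.0])
C_demo = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
f_demo = np.array([1.0, 2.0, 3.0, 4.0])
g_demo = np.array([0.5, -0.25])
u_demo, P_demo, its = uzawa_augmented_lagrangian(K_demo, C_demo, np.eye(2), f_demo, g_demo)
```

In this example the constraint residual drops below the tolerance after only a handful of iterations with $\lambda = 10^3$, which is the behaviour described above; a real code would of course factor the augmented matrix once rather than re-solving from scratch.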
o Another penalty-like method: PALM. In a completely different approach and for
completely different reasons, Hutton and Smith (1981) and Smith (1985) invented the
penalty-augmented Lagrangian multiplier (PALM) method in order to correct a deficiency
when using biquadratic velocity and continuous bilinear pressure on isoparametric
quadrilaterals (typically the serendipity element). The 'deficiency' is a not very accurate
representation of pressure and (concomitantly) a not very accurate approximation to
$\nabla \cdot u = 0$. The 'fix' is to 'augment' the Lagrange multiplier (pressure) with a penalty term
that tends to return element-level mass conservation in a way that seems to be closely
related to that using the $Q_2(Q_1 + P_0)$ element of Table 3.13-2; but in PALM the
element-level mass balance is only approximately achieved; it is a sort of penalty approximation
to $Q_2(Q_1 + P_0)$. The PALM equations are basically (3.13-196) and (3.13-197) applied to
$Q_2^{(8)}Q_1$ (or $Q_2Q_1$), in which (only) the penalty matrix is different; C there is replaced by
$C_0$, where here $C_0$ refers to the C-matrix of a $Q_2Q_0$ element, which would produce the
following element-level mass balance (and the penalty approximation gets it close), à la
Gresho et al. (1980b), for the lower left element in Figure 3.13-36:
$$\frac{h}{6}\left[(u_{SSWW} + 4u_{SWW} + u_{WW}) - (u_{SS} + 4u_{S} + u_{0})\right] + \frac{\ell}{6}\left[(v_{SSWW} + 4v_{SSW} + v_{SS}) - (v_{WW} + 4v_{W} + v_{0})\right] = 0.$$ (3.13-200)
Also, the 'true' pressure from the PALM method is obtained as
$$P(x) = \sum_j P_j\, \psi_j(x) + \sum_e P_e\, \psi^e(x),$$ (3.13-201)
where $\psi_j(x)$ is the $C^0$-bilinear basis function, $P_j$ is the nodal pressure corresponding to
(3.13-196) and (3.13-197), $\psi^e(x)$ is the piecewise-constant basis function on element e,
and
$$P_e = \frac{\lambda}{A_e}\, (C_0^T u - g)$$ (3.13-202)
is the 'augmentation' of the Lagrange multiplier. Final remarks on PALM:
1. Hutton and Smith also apply the method to the $P_2P_1$ element.
2. The larger is $\lambda$, the closer is (3.13-200) satisfied; $\lambda = 10^5$ seems 'typical.'
Final Remark:
From Fortin and Glowinski (1983), we end our augmented penalty presentation: 'In
summary, the penalisation is indissociable from a mixed (velocity-pressure) method, and
must be considered as a solution technique for this latter method, and not as an
approximation technique in itself. In this sense the use of augmented Lagrangian methods is quite
natural and the techniques of Chapter I provide some advance on the more usual methods,
since several iterations actually enable the error due to the penalisation to be eliminated.
We do not therefore have to choose values of r as large as in a pure penalisation method.
This possibility allows an improvement in the conditioning of the problems in $u_h$, and
this is particularly useful if one is unable to use double precision, or if the problem in
$u_h$ is to be solved by an iterative method...,' where their r is our $\lambda$.
f. Some 2D vs 3D considerations
While there may never be even close to consensus on the 'best' element, especially in
this day of rapid growth of parallel computers, we dare offer a few opinions for a few
special cases; cf. Tables 3.13-1 to 3.13-4 in Section 3.13.2a.
1. If 2D is the name of your game, then the element of choice is the $Q_2P_{-1}$, usually
via the (consistent) penalty method, at least if you favor quadrilaterals. If you are a
'triangleperson,' $P_2P_1$ is good, as is the Crouzeix-Raviart element ($P_2^+P_{-1}$); it permits
the 'elimination' of u and v at the center node and two pressures at element level, thus
making it cost-effective (D. Pelletier, 1993, personal communication); the cost is that of
$P_2P_0$ (not recommended), but the accuracy is much higher.
2. If all of your work is in 3D, then the best is less clear, but some probably competitive
choices are:
(i) $Q_1Q_0$, perhaps in stabilized form (see next section)
(ii) Stabilized $Q_1Q_1$ (see next section)
(iii) $Q_2P_{-1}$
(iv) $P_2P_1$ or $P_2(P_1 + P_0)$
(v) Mini ($P_1^+P_1$)
3. See Thatcher (1993) in Gunzburger and Nicolaides (1993) for some suggestions and
opinions—some of which are surely unjustified and probably wrong—such as 'In
practice, the only elements that are widely used are based on quadratic velocities and linear
pressures on tetrahedra and triquadratic velocities and trilinear pressures on hexahedra,
and it is the latter that are most often used for 3D flow.'
3.13.3 Stabilization [D. J. Silvester]
First, we review some basic definitions and set up notation that is used subsequently. Our
starting point is a conventional Galerkin formulation of the incompressible steady-state
Stokes equations. Our aim is to find the velocity-pressure pair $u_h \in X_h$ and $p_h \in M_h$
satisfying:
$$(\mathrm{grad}\, u_h, \mathrm{grad}\, v) - (p_h, \mathrm{div}\, v) = (f, v) \quad \forall v \in X_h,$$
$$-(q, \mathrm{div}\, u_h) = 0 \quad \forall q \in M_h.$$ (3.13-203)
Here, h is a representative grid parameter associated with some (quasi-uniform)
subdivision $\mathcal{C}_h$ of the flow domain $\Omega$. $X_h \subset X$ and $M_h \subset M$ are finite-dimensional subspaces
of the underlying function spaces: $X = (H_0^1(\Omega))^d$ and $M = L^2(\Omega)$ with d = 2 or 3. Any
non-uniqueness of the pressure solution $p_h$ is associated with the space of pressure modes:
$$Q_h = \{\, q \in M_h \mid (q, \mathrm{div}\, v) = 0 \ \ \forall v \in X_h \,\}$$ (3.13-204)
being non-trivial.
If finite element spaces $X_h$ and $M_h$ are constructed so that the stability condition:
$$\inf_{q \in M_h \setminus Q_h}\ \sup_{v \in X_h}\ \frac{|(\mathrm{div}\, v, q)|}{\|v\|_X\, \|q\|_M} \ge \gamma$$ (3.13-205)
is satisfied with the stability constant $\gamma > 0$ and independent of h, then (3.13-203) is
well-posed since the velocity $u_h$ is unique, and the pressure $p_h$ is unique in $M_h \setminus Q_h$.
a. Stable vs. stabilized methods
Let us start with an innocent-looking question, namely: Is stability essential? The issue
is surprisingly contentious. Although stability is fundamental in a mathematical sense—it
ensures good approximation properties on any conceivable mesh—specialists in solving
practical incompressible flow problems often argue otherwise. Their point is that
reasonable numerical solutions are often computed using supposedly unstable approximation
methods, particularly low-order finite element methods like $Q_1Q_0$. Obviously, a different
definition of stability is needed in such cases.
It is clearly possible to argue that discretization methods exhibiting pathologies only on
certain types of grids are 'semi-stable' in reality. Moreover 'stabilizing' such methods is
straightforward in principle; all one needs to do is to restrict the class of allowable grids.
The most stunning example of sensitivity to grid design is the $P_2P_1$ triangular element,
with a 'non-conforming' pressure approximation defined by the values at the mid-edge
points (so that the pressure is not continuous across the edge except at the mid-point).
On the one hand, if uniform grids are constructed by triangulating square elements into
'union-jack' patches, then the non-conforming $P_2P_1$ approximation is 'stable.' Yet, if the
direction of triangulation is changed to give a 'diagonal grid,' stability is lost and solution
accuracy is immediately reduced by one order of h. Similar sensitivity to the triangulation
direction is observed in the case of the fully discontinuous $P_2P_{-1}$ triangular element; see
Qin (1994) for details.
One thing that makes restricting grids difficult in general is the desirability of adaptive
refinement as a means of error control. In particular, using $Q_1Q_0$ as a discretization method
it is impossible to categorize stable/unstable meshes in advance. The adaptivity feature
is what would seem to make stability really essential. It has certainly prompted the rapid
development of universally stabilized formulations in recent years. It also leads us to the
main issue addressed here: Is stabilization the way forward?
Those opposed to the principle of stabilization will argue that ensuring stability is
not really difficult. For example, using standard finite elements, enriching the velocity
approximation space will always do the trick (either adding 'bubble functions,' or else
adding velocity degrees of freedom to inter-element edges). Alternatively, working in a
finite difference or finite volume setting, a staggered grid (sometimes referred to as the
MAC scheme), with normal velocities defined on cell edges is always stable.
On the other hand, stabilization of equal order interpolation elements often
looks appealing because it is computationally convenient, especially in a parallel
processing/multigrid context. The main drawback, however, is that stabilization always
introduces (regularization) parameters, either explicitly or implicitly. Thus, insensitivity
to such parameter values is important if the methodology is to be competitive. In many
cases, estimates of good/optimal parameter values can be deduced a priori. In other cases,
however, an appropriate selection of parameters is not obvious, and in this instance the
advantage of stabilization must be questionable. We give some personal recommendations
after the discussion of particular element combinations which follow.
b. Equal order interpolation via stabilization
Using the same basis functions to approximate velocity and pressure is almost always
unstable*. For convenience, the lowest and the higher-order cases are discussed separately
below. We consider the simplest approximation methods first.
o $P_1P_1$ and $Q_1Q_1$. The inherent instability of low order interpolation methods is well
known. For example, solving an enclosed flow problem using a grid of $Q_1Q_1$ rectangular
elements, the space $Q_h$ in (3.13-204) is eight dimensional (see Section 3.13-2b). This
means that extreme care is required in imposing non-homogeneous velocity boundary
conditions in order to ensure that the singular system is consistent. If the pressure modes
are filtered a priori (for example, by relaxing the boundary conditions), the approximation
is still likely to be unstable because of the presence of 'pesky modes' (see Section 3.13-2b)
associated with the inf-sup constant in (3.13-205) being O(h). An illustration is given later
in this section.
In the $P_1P_1$ case, there is a simple way to stabilize the approximation without a
formal loss of accuracy. The idea (originally presented in Brezzi and Pitkaranta, 1984) is
to 'regularize' (3.13-203) via a pressure Laplacian perturbation of the incompressibility
constraint:
$$(\mathrm{grad}\, u_h, \mathrm{grad}\, v) - (p_h, \mathrm{div}\, v) = (f, v) \quad \forall v \in X_h,$$
$$-(q, \mathrm{div}\, u_h) - \beta \sum_{K \in \mathcal{C}_h} h_K^2\, (\mathrm{grad}\, p_h, \mathrm{grad}\, q)_K = 0 \quad \forall q \in M_h.$$ (3.13-206)
Here, $h_K$ is the diameter of the Kth element in the subdivision $\mathcal{C}_h$, $(\cdot, \cdot)_K$ denotes the
element-wise $L^2$ inner product, and the regularization parameter $\beta$ is strictly positive. The
formulation (3.13-206) is also useful in the $Q_1Q_1$ case, although the original analysis
(Hughes et al., 1986) indicated that additional 'consistency' terms should be added to
(3.13-206) if the quadrilateral grid is non-cartesian. (These extra terms are usually omitted
in practice, see Hughes et al. 1986, p. 96).
Because of the $O(h^2)$ perturbation, the method (3.13-206) is first-order convergent
at best: if the exact Stokes solution (u, p) is sufficiently smooth, and if the hydrostatic
pressure is set appropriately, then an a priori error estimate is satisfied:
$$\|u - u_h\|_1 + \|p - p_h\|_0 \le C\,(h\,|u|_2 + h^2\,|p|_2).$$ (3.13-207)
Note that the order of the pressure approximation in (3.13-207) is limited by the velocity
error, so that using either $P_1P_1$ or $Q_1Q_1$, the O(h) pressure error is the best that one can
expect. It must be stressed here that the 'weakening' of the incompressibility constraint
does not destroy global conservation of mass (because the hydrostatic pressure is in the
nullspace of the perturbation operator). This is in contrast to penalty methods whereby
global incompressibility is sacrificed, the degree of compressibility being proportional to
the size of the penalty parameter.
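The element contribution to the stabilization term in (3.13-206) is just a scaled linear-triangle stiffness matrix, and the nullspace property invoked above is easy to confirm. The sketch below assumes a $P_1$ pressure on a single triangle and takes $h_K$ to be the longest edge; the function name and the test triangle are hypothetical.

```python
import numpy as np

def p1_pressure_stabilization(xy):
    """Element matrix h_K^2 (grad psi_i, grad psi_j)_K for a linear triangle.

    xy : (3, 2) array of vertex coordinates; h_K is taken as the longest edge.
    """
    xy = np.asarray(xy, dtype=float)
    edges = xy[[2, 0, 1]] - xy[[1, 2, 0]]           # edge vector opposite each vertex
    d1, d2 = xy[1] - xy[0], xy[2] - xy[0]
    area = 0.5 * abs(d1[0] * d2[1] - d1[1] * d2[0])
    h_K = np.sqrt((edges ** 2).sum(axis=1)).max()
    # For linear basis functions, (grad psi_i, grad psi_j)_K = e_i . e_j / (4 * area).
    return h_K ** 2 * (edges @ edges.T) / (4.0 * area)

S_elem = p1_pressure_stabilization([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

Every row of `S_elem` sums to zero, so a constant (hydrostatic) pressure lies in the nullspace of the perturbation, which is why global mass conservation survives the 'weakening' of the constraint.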
* The exception is the case of fully periodic boundary conditions; see, for example, Dean and Glowinski
(1993, p. 49).
The popularity of the approach (3.13-206) is largely due to computational convenience:
implementation is relatively trivial since the stabilization matrix is a standard element
'stiffness matrix.' The difficulty of finding a 'good' choice of $\beta$ in (3.13-206) is the only
limiting factor. Unfortunately, as illustrated below, the quality of the approximation is very
dependent on the magnitude of $\beta$. Furthermore, we show later on that an inappropriate
choice not only leads to inaccurate solutions, but also adversely affects the convergence
of iterative solvers applied to (3.13-206). Perhaps the biggest problem with (3.13-206) is
that it is very easy to 'over-stabilize' by using a parameter which is too large, in which
case the quality of the divergence-free approximation must inevitably deteriorate. Indeed,
in the limiting case of infinite $\beta$ the solution of (3.13-206) is a constant pressure together
with a velocity which is virtually unconstrained (it is only divergence-free globally). The
important point is that stability does not guarantee accuracy.
The issue of a small parameter value is quite subtle since (3.13-206) is theoretically
stable for all positive values of $\beta$. The proof is trivial enough; defining the bilinear form:
$$B_h(w, r; v, q) = (\nabla w, \nabla v) - (\mathrm{div}\, v, r) - (\mathrm{div}\, w, q) - \beta \sum_{K \in \mathcal{C}_h} h_K^2\, (\mathrm{grad}\, r, \mathrm{grad}\, q)_K,$$ (3.13-208)
and the mesh-dependent norm:
$$|||(u, p)||| = \Big( \|u\|_1^2 + \sum_{K \in \mathcal{C}_h} h_K^2\, \|\nabla p\|_{0,K}^2 \Big)^{1/2},$$ (3.13-209)
the approximation (3.13-206) is coercive over $X_h \times M_h$ (and thus stable). That is, for all
$\beta > 0$ a positive constant $\alpha$ exists such that
$$B_h(v, q; v, -q) \ge \alpha\, |||(v, q)|||^2.$$ (3.13-210)
(Note that this is a slightly unusual definition of coercivity, see Franca et al. 1993, p. 97
for further details.) The loophole in the theory is the fact that the constant $\alpha$ in (3.13-210)
behaves like $\beta$ for $\beta < 1$. Intuitively, if the parameter is small then we are essentially
solving the original problem and the usual symptoms of instability will be apparent. A
set of numerical experiments illustrating this is given in Pierre (1988).
To gain further understanding it is useful to look at the issue of 'small' $\beta$ in a spectral
setting. To this end, consider the general matrix eigenproblem:
$$\begin{bmatrix} K & C \\ C^T & -\beta S \end{bmatrix} \begin{bmatrix} u \\ p \end{bmatrix} = \mu \begin{bmatrix} K & 0 \\ 0 & Q \end{bmatrix} \begin{bmatrix} u \\ p \end{bmatrix},$$ (3.13-211)
where $\mu$ is the eigenvalue, Q is the pressure mass matrix and K, C and S are defined in
the usual way from (3.13-206). The first thing to point out (or, see below) is that a sharp
upper bound on the largest negative eigenvalue $\mu_{-1}$ is given by:
$$\mu_{-1} \le \frac{1}{2} - \sqrt{\frac{1}{4} + \gamma_s^2},$$ (3.13-212)
where the 'inf-sup like' constant $\gamma_s$ is defined via:
$$\gamma_s^2 = \min_{p \in M_h} \frac{p^T (C^T K^{-1} C + \beta S)\, p}{p^T Q p}.$$ (3.13-213)
Note that without stabilization ($\beta = 0$) (3.13-213) is algebraically equivalent to (3.13-205)
(for details see Brezzi and Fortin, 1991, pp. 73-78). This suggests that the role of
stabilization is to ensure that $\gamma_s$ is independent of h in cases when $\gamma$ in (3.13-205) is O(h). The
derivation of (3.13-212) is satisfyingly neat, and is included for completeness. Expanding
(3.13-211):
$$-C p = (1 - \mu)\, K u,$$ (3.13-214)
$$C^T u - \beta S p = \mu\, Q p,$$ (3.13-215)
we consider $\mu < 0$ and note that in this case $p \neq 0$ since otherwise (3.13-214) implies
u = 0, contradicting the definition of an eigenvector. Then, taking the scalar product of
(3.13-215) with p and substituting for u from (3.13-214) gives:
$$(1 - \mu)^{-1}\, p^T C^T K^{-1} C p + \beta\, p^T S p = -\mu\, p^T Q p,$$
and since $0 \le (1 - \mu)^{-1} \le 1$ it follows that
$$(1 - \mu)^{-1}\, p^T (C^T K^{-1} C + \beta S)\, p \le -\mu\, p^T Q p.$$
Using the definition (3.13-213), and the fact that the mass matrix is positive definite
leads to:
$$\gamma_s^2\, (1 - \mu)^{-1} \le -\mu;$$
thus
$$0 \le \mu^2 - \mu - \gamma_s^2,$$
from which (3.13-212) easily follows. Note that the argument above is very general:
(3.13-212) holds independently of the choice of stabilization matrix S.
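The bound (3.13-212) is easy to confirm numerically on small matrices. The following sketch assumes dense symmetric matrices with K and Q positive definite and a full-rank C, and uses SciPy's generalized symmetric eigensolver; the function name and the test matrices are invented for illustration only.

```python
import numpy as np
from scipy.linalg import eigh

def stabilized_spectrum_check(K, C, S, Q, beta):
    """Return (mu_minus1, bound): largest negative eigenvalue of (3.13-211) and the
    right-hand side of (3.13-212), 1/2 - sqrt(1/4 + gamma_s^2)."""
    n, m = C.shape
    A = np.block([[K, C], [C.T, -beta * S]])
    B = np.block([[K, np.zeros((n, m))], [np.zeros((m, n)), Q]])
    mu = eigh(A, B, eigvals_only=True)                   # generalized eigenvalues
    mu_minus1 = mu[mu < 0].max()
    schur = C.T @ np.linalg.solve(K, C) + beta * S       # C^T K^{-1} C + beta S
    gamma_s2 = eigh(schur, Q, eigvals_only=True).min()   # definition (3.13-213)
    return mu_minus1, 0.5 - np.sqrt(0.25 + gamma_s2)

# Made-up test matrices (4 velocities, 2 pressures).
K_t = np.diag([2.0, 3.0, 4.0, 5.0])
C_t = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
S_t = np.array([[1.0, -1.0], [-1.0, 1.0]])
Q_t = np.diag([1.0, 2.0])
mu0, bound0 = stabilized_spectrum_check(K_t, C_t, S_t, Q_t, 0.0)
mu1, bound1 = stabilized_spectrum_check(K_t, C_t, S_t, Q_t, 0.3)
```

For $\beta = 0$ the inequality in the derivation becomes an equality, so `mu0` and `bound0` coincide; for $\beta > 0$ the computed eigenvalue lies at or below the bound.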
The next step is to quantify the relationship between $\gamma_s$ and $\beta$ as described in Silvester
(1994). Starting from (3.13-213), a crude bound is obviously
$$\gamma_s^2 \le \min_{q \in M_h} \frac{q^T C^T K^{-1} C q}{q^T Q q} + \beta \max_{q \in M_h} \frac{q^T S q}{q^T Q q}.$$ (3.13-216)
Thus, introducing the constant
$$\Theta^2 = \max_{q \in M_h} \frac{q^T S q}{q^T Q q}$$ (3.13-217)
and noting that the definition of $Q_h$ (3.13-204) implies that
$$0 = \min_{q \in M_h} \frac{q^T C^T K^{-1} C q}{q^T Q q},$$ (3.13-218)
we deduce that $\gamma_s^2 \le \beta \Theta^2$. An analogous argument gives the alternative bound
$$\gamma_s^2 \le \max_{q \in M_h} \frac{q^T C^T K^{-1} C q}{q^T Q q} + \beta \min_{q \in M_h} \frac{q^T S q}{q^T Q q}.$$ (3.13-219)
Thus if we define an upper bound
$$\Gamma^2 = \max_{q \in M_h} \frac{q^T C^T K^{-1} C q}{q^T Q q},$$ (3.13-220)
and note that in the case of $P_1P_1$ or $Q_1Q_1$ stabilized via (3.13-206)
$$0 = \min_{q \in M_h} \frac{q^T S q}{q^T Q q}$$ (3.13-221)
(because a constant pressure is in the nullspace of the stabilization operator by
construction), we deduce that $\gamma_s^2 \le \Gamma^2$, and hence that
$$\gamma_s^2 \le \min(\beta \Theta^2, \Gamma^2).$$ (3.13-222)
This implies that the stability constant tends to zero as $\beta \to 0$. Moreover, if the stability
parameter is small, then (3.13-212) and (3.13-222) imply that:
$$\mu_{-1} = -\beta \Theta^2 + O(\beta^2),$$
which means that the matrix system associated with (3.13-206) becomes increasingly
ill-conditioned as $\beta \to 0$.
The result (3.13-222) also points to the existence of a critical ('optimal') parameter
$\beta_c = \Gamma^2/\Theta^2$. The implication of (3.13-222) is that the stability constant $\gamma_s$ (and thus $\mu_{-1}$)
is essentially independent of $\beta$ as long as $\beta$ is large enough, i.e. $\beta \ge \beta_c$. One way of
using this result is to estimate constants $\Gamma^*$ and $\Theta^*$ which are independent of the mesh
so that $\Gamma^* \ge \Gamma$ and $\Theta^* \le \Theta$, and then to make the specific choice $\beta = (\Gamma^*)^2/(\Theta^*)^2 \ge \beta_c$. Note
however, $\beta_c$ is optimal in the sense that if $\beta > \beta_c$ then the maximum negative eigenvalue
of (3.13-211) varies like $O(\beta)$ (so that the system associated with (3.13-206) also
becomes increasingly ill-conditioned as $\beta \to \infty$; see Silvester, 1994). Thus it pays off to
estimate $\beta_c$ as accurately as possible.
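For a given (small) mesh, the constants $\Gamma^2$, $\Theta^2$ and the critical parameter $\beta_c$ can be computed directly from two generalized eigenproblems. The sketch below assumes dense matrices and uses a deliberately tiny made-up example in which both $C^T K^{-1} C$ and S are singular, so that (3.13-218) and (3.13-221) hold; the function names are hypothetical.

```python
import numpy as np
from scipy.linalg import eigh

def critical_beta(K, C, S, Q):
    """Return (Gamma^2, Theta^2, beta_c) as in (3.13-220), (3.13-217) and beta_c = Gamma^2/Theta^2."""
    schur0 = C.T @ np.linalg.solve(K, C)                 # C^T K^{-1} C
    gamma2 = eigh(schur0, Q, eigvals_only=True).max()    # Gamma^2
    theta2 = eigh(S, Q, eigvals_only=True).max()         # Theta^2
    return gamma2, theta2, gamma2 / theta2

def gamma_s_squared(K, C, S, Q, beta):
    """Smallest eigenvalue of (C^T K^{-1} C + beta S) p = gamma_s^2 Q p, cf. (3.13-213)."""
    schur = C.T @ np.linalg.solve(K, C) + beta * S
    return eigh(schur, Q, eigvals_only=True).min()

# Made-up example: one pure pressure mode (1, -1), and constants in the nullspace of S.
K_e, Q_e = np.eye(3), np.eye(2)
C_e = np.array([[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
S_e = np.array([[1.0, -1.0], [-1.0, 1.0]])
G2, T2, beta_c = critical_beta(K_e, C_e, S_e, Q_e)
```

In this toy case the bound (3.13-222) is attained exactly: $\gamma_s^2$ grows linearly with $\beta$ up to $\beta_c$ and is constant thereafter, which is the saturation behaviour described in the text.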
A simple estimate of $\Gamma^*$ is well known (see Fortin and Pierre, 1992): a Cauchy-Schwarz
argument yields
$$\frac{|(\mathrm{div}\, v, p)|^2}{\|v\|_X^2\, \|p\|_M^2} \le \frac{\|\mathrm{div}\, v\|^2}{\|\nabla v\|^2} \le d,$$ (3.13-223)
so for example in $\mathbb{R}^2$ we have $\sqrt{2} \ge \Gamma$. In practice, this estimate (which holds for all
mixed approximations) seems to be pessimistic. In particular, in the case of the $Q_1Q_1$
approximation, numerical computations on quasi-uniform cartesian grids of rectangular
elements suggest that $\Gamma \to 1$ from below, as $h \to 0$. Hence a better choice would be
$\Gamma^* = 1$ in this case.
Estimating $\Theta^2$ a priori is more problematical, at least in the case of the scaled Laplacian
stabilization operator in (3.13-206). (Using a local stabilization based on macro-elements
there is no problem, see the next section.) The obvious way of proceeding is to try to use
a finite element inverse estimate
$$\|\nabla p\|^2 \le C_I\, h^{-2}\, \|p\|^2,$$ (3.13-224)
but this only gives limited information. Specifically, (3.13-224) implies that for a
quasi-uniform sequence of grids, $\Theta^2$ is bounded above by the inverse constant $C_I$. In general, it
seems that direct computation of the largest eigenvalue of the (scaled) stabilization matrix
using representative grids is a better way of getting a good estimate for $\Theta^*$.
The determination of the optimal parameter in (3.13-206) is the key to assessing the
quality of implicitly stabilized methods. The simplest method in this category is the
'mini-element' discretization (see Brezzi and Fortin, 1991, p. 213), where the basic $P_1P_1$
method is 'stabilized' by augmenting the velocity in each element with 'bubble' degrees
of freedom (one for each component). Since (by definition) a typical bubble function $\phi_K$
is zero on the boundary of the element (ensuring that the enriched velocity is continuous
across element edges or faces), the associated degrees of freedom may be removed using
the standard process of static condensation; that is, each bubble degree of freedom may be
eliminated before element assembly. If this is done, then the reduced system also looks
like (3.13-206) with a perturbed incompressibility constraint of the form
$$-(q, \mathrm{div}\, u_h) - \sum_{K \in \mathcal{C}_h} \frac{\left(\int_K \phi_K\right)^2}{A_K\, (\mathrm{grad}\, \phi_K, \mathrm{grad}\, \phi_K)_K}\, (\mathrm{grad}\, p_h, \mathrm{grad}\, q)_K = 0 \quad \forall q \in M_h,$$ (3.13-225)
where $A_K$ is the element area/volume. For a proof, see Pierre (1995). This interpretation
is appealing theoretically since it facilitates the construction of the 'optimal bubble' such
that after normalizing (for example, setting the centroid value of $\phi_K$ to unity), the inf-sup
constant (3.13-205) is maximized. One of the products is that for a grid of equilateral
triangles, the standard cubic bubble element can be shown to be the 'best' possible
conforming mini-element.
In practice, it is important to note that the cubic bubble mini-element often gives a poor
approximation (Lohner, 1993; Pierre, 1988); for example, local O(h) pressure oscillations
are often observed near boundaries, suggesting that the implicitly defined stabilization
parameter in (3.13-225) is too small. The above discussion of the optimal parameter $\beta_c$
sheds a little light on this: specifically, if we have a uniform grid of right-angled triangles
with short sides parallel to the coordinate axes, then the cubic bubble mini-element solution
satisfies (3.13-206) with:
$$\beta h_K^2 = \frac{1}{40}\, \frac{\Delta x^2\, \Delta y^2}{\Delta x^2 + \Delta y^2}.$$ (3.13-226)
Thus, putting $\Delta x = \Delta y = h_K$ we have a stabilization parameter of $\beta = 1/80$. Moreover,
if a distortion parameter a is introduced so that $\Delta x = a h_K$ and $\Delta y = h_K$, then
P= —=-, (3.13-227)
40(1+a2)
which shows that $\beta$ tends to zero in the limit of highly stretched triangles with $a \to 0$
or $a \to \infty$. Further discussion of this issue is given in Becker and Rannacher (1994). In
Lohner (1993) the suggested 'fix' is to multiply the natural mini-element parameter by
an order of magnitude for elements within boundary layers!
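The implied stabilization coefficient of the cubic bubble can be evaluated by quadrature, which gives an independent check of the value $\beta = 1/80$ quoted above. The sketch below assumes that the element coefficient is $(\int_K \phi_K)^2 / (A_K (\mathrm{grad}\, \phi_K, \mathrm{grad}\, \phi_K)_K)$ with $\phi_K = 27 \lambda_1 \lambda_2 \lambda_3$ on a right-angled triangle with legs dx and dy; the function name and the Duffy-type quadrature are choices made here for illustration.

```python
import numpy as np

def mini_element_beta_h2(dx, dy, n=6):
    """Return (int phi)^2 / (A * int |grad phi|^2) for the cubic bubble phi = 27 l1 l2 l3
    on the right triangle with vertices (0, 0), (dx, 0), (0, dy)."""
    t, w = np.polynomial.legendre.leggauss(n)
    t, w = 0.5 * (t + 1.0), 0.5 * w                  # Gauss rule mapped to [0, 1]
    a, b = np.meshgrid(t, t, indexing="ij")
    wa, wb = np.meshgrid(w, w, indexing="ij")
    xi, eta = a, b * (1.0 - a)                       # collapse the unit square onto the triangle
    weight = wa * wb * (1.0 - a) * dx * dy           # quadrature weight times Jacobian
    phi = 27.0 * xi * eta * (1.0 - xi - eta)
    dphi_dx = 27.0 * eta * (1.0 - 2.0 * xi - eta) / dx
    dphi_dy = 27.0 * xi * (1.0 - xi - 2.0 * eta) / dy
    area = 0.5 * dx * dy
    int_phi = np.sum(phi * weight)
    int_grad2 = np.sum((dphi_dx ** 2 + dphi_dy ** 2) * weight)
    return int_phi ** 2 / (area * int_grad2)

beta_uniform = mini_element_beta_h2(1.0, 1.0)        # dx = dy = h_K = 1
```

With dx = dy = 1 the quadrature reproduces 1/80, and for other aspect ratios the result can be compared against whatever closed form is adopted for the stretched case.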
Implicitly stabilized methods also arise in a natural way if standard (semi-)explicit time
stepping schemes are applied to the 'slightly compressible' Stokes system:
$$(\mathrm{grad}\, u_h, \mathrm{grad}\, v) - (p_h, \mathrm{div}\, v) = (f, v) \quad \forall v \in X_h,$$
$$-(q, \mathrm{div}\, u_h) = \frac{1}{c^2}\left(q, \frac{\partial p_h}{\partial t}\right) \quad \forall q \in M_h,$$ (3.13-228)
where c is the speed of 'sound'. See Zienkiewicz and Wu (1991) for a complete list
of possibilities, which include simple predictor-corrector methods (with the momentum
equation treated explicitly), Taylor-Galerkin type methods, and explicit Runge-Kutta
methods. In the simplest case of a fixed time-step $\Delta t$, the steady-state solution to
(3.13-228) satisfies a system containing an $O(\Delta t)$ perturbation term. For example, if
(3.13-228) is discretized using a uniform grid of $P_1P_1$ elements, with a time-step
$\Delta t = h^2/4$ determined by the local stability limit, then the solution ultimately obtained
using a Taylor-Galerkin approach (see Lohner, 1993), will satisfy a system which is
essentially (3.13-206) with a (large) parameter $\beta = 1/2$. In general, the idea of viewing
pseudo time-stepping as a mechanism for enforcing stability is likely to be fraught
with peril. Nevertheless, good results are certainly possible in the hands of experts, see
Zienkiewicz and Wu (1992).
o $P_kP_k$ and $Q_kQ_k$ for $k \ge 2$. The instability of equal-order interpolation is intrinsic.
Unfortunately, stabilization in the case $k \ge 2$ appears to be considerably more complicated than
in the low order case. A variety of practical computations have been done nevertheless,
using the methodology described below (for details see Tezduyar, 1992).
The key idea was suggested by Hughes and Franca (1987), and involves adding
mesh-dependent Galerkin-least-squares perturbation terms to the discrete formulation
(3.13-203):
$$(\mathrm{grad}\, u_h, \mathrm{grad}\, v) - (p_h, \mathrm{div}\, v) - \alpha \sum_{K \in \mathcal{C}_h} h_K^2\, (-\nabla^2 u_h + \mathrm{grad}\, p_h - f,\ \pm\nabla^2 v)_K = (f, v) \quad \forall v \in X_h,$$
$$-(q, \mathrm{div}\, u_h) - \alpha \sum_{K \in \mathcal{C}_h} h_K^2\, (-\nabla^2 u_h + \mathrm{grad}\, p_h - f,\ \mathrm{grad}\, q)_K = 0 \quad \forall q \in M_h.$$ (3.13-229)
In this case $\alpha > 0$ is the regularization parameter. The 'trick' is to add the stabilization
terms in an element-by-element fashion, circumventing the more demanding continuity
requirements associated with a conventional least-squares formulation. There are
obviously two possibilities above, depending on the choice of sign in (3.13-229). Both
possibilities are consistent in the sense that the solution of the underlying continuous
Stokes problem also satisfies (3.13-229).
The symmetric (minus) formulation is that given in Hughes and Franca (1987). It is
stable only if $0 < \alpha < \alpha^*$, where $\alpha^*$ is defined via the inverse estimate (cf. (3.13-224)):
$$\alpha^* = \max_{v \in X_h} \frac{\|\mathrm{grad}\, v\|_0^2}{\sum_{K \in \mathcal{C}_h} h_K^2\, \|\nabla^2 v\|_{0,K}^2}.$$ (3.13-230)
A systematic way of computing $\alpha^*$ (solving a local eigenvalue problem on each element)
which is applicable to highly non-uniform grids, is given in Franca and Madureira (1993).
The unsymmetric (plus) formulation is more recent, see Douglas and Wang (1989). The
motivation for its introduction is that stability is ensured for all values of $\alpha$ in this case.
In the symmetric case, if $\alpha < \alpha^*$, and assuming that the exact solutions are sufficiently
smooth, the following error estimate is established in Franca and Stenberg (1991):
$$\|u - u_h\|_1 + \|p - p_h\|_0 \le C\,(h^k\, |u|_{k+1} + h^{k+1}\, |p|_{k+1}).$$ (3.13-231)
The same estimate holds for the unsymmetric formulation with no restriction on $\alpha$ (note
that C in (3.13-231) depends on $\alpha$, however). A numerical performance comparison of
the plus/minus formulations in the case k = 2 can be found in Franca and Frey (1992).
The bottom line seems to be that the unsymmetric (plus) formulation is the better method
because of its relative insensitivity to the choice of parameter, although a clean mechanism
for determining $\alpha$ in practice is not obvious in either case.
c. Stabilized approximations using discontinuous pressure
Apart from the case of equal-order interpolation, the need for stabilization is most apparent
using mixed methods based on discontinuous pressure. Discontinuous pressures (or
alternatively control-volume finite element methods) are to be preferred in cases where mass
conservation at the element level is important. Noting the typical form of the mixed
approximation error estimate (for example, (3.13-231)), it is clear that methods with a
pressure approximation which is one degree less than the velocity are of prime interest.
Again, the lowest order case is special and will be considered in detail.
o $P_1P_0$ and $Q_1Q_0$. The unstable $P_1P_0$ triangular/tetrahedral and $Q_1Q_0$ quadrilateral/brick
elements are very natural mixed methods; moreover their inherent simplicity makes them
computationally attractive. Two ways of stabilizing these methods are discussed here.
Each of these has advantages/disadvantages which are described below. Unfortunately, as
we shall demonstrate, both types of stabilization tend to destroy the intrinsic simplicity
of the underlying approximations.
The first approach we consider is the so-called global stabilization approach, which is
essentially a special case of the formulation in Hughes and Franca (1987), and corresponds
to controlling the jumps in pressure across element boundaries:
$$(\mathrm{grad}\, u_h, \mathrm{grad}\, v) - (p_h, \mathrm{div}\, v) = (f, v) \quad \forall v \in X_h,$$
$$-(q, \mathrm{div}\, u_h) - \beta \sum_{e \in \Gamma_h} h_e \int_e [\![p_h]\!]_e\, [\![q]\!]_e\, ds = 0 \quad \forall q \in M_h.$$ (3.13-232)
Here, $h_e$ is the length of the element edge (in $\mathbb{R}^2$) or the diameter of the element face (in
$\mathbb{R}^3$), $[\![\cdot]\!]_e$ is the jump operator, and $\Gamma_h$ is the set of all interior inter-element edges/faces.
The stabilization is global in the sense that the eigenfunctions of the perturbation operator
in (3.13-232) are all global functions; they cannot be constructed using a local (element
or macro-element) basis.
It is interesting to observe that the stabilized system (3.13-232) is closely related to
the stabilized $Q_1Q_1$ method above. To illustrate, consider a uniform grid of square $Q_1Q_0$
elements of side $h$. Assuming the usual pointwise interpretation of the constant pressure
(i.e., the value at the centroid), the stabilization term corresponding to the piecewise-constant
pressure test function $q_0$ at the centre of the patch of nine elements illustrated in
Figure 3.13-16 is given by:
$$
\begin{aligned}
\sum_{e \in \Gamma_h} h_e \int_e [[p_h]]_e \, [[q_0]]_e \, ds
&= h^2 \{ (p_0 - p_E) + (p_0 - p_N) + (p_0 - p_W) + (p_0 - p_S) \} \\
&= h^2 (4p_0 - p_E - p_N - p_W - p_S) \\
&= -h^4 \nabla^2 p_h + O(h^6),
\end{aligned}
$$
that is, the stabilization term is just a scaled discrete Laplacian defined on the 'dual' grid
obtained by joining the element centroids. Since the approach (3.13-232) is clearly another
'pressure Laplacian' stabilization, poor accuracy is to be expected if $\beta$ is too large. (In
the limiting case of infinite $\beta$ the solution of (3.13-232) is again a constant pressure.)

Fig. 3.13-16 Typical patch of $Q_1Q_0$ elements (centre element 0 with neighbours N, E, S and W).
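The stencil above can be checked numerically. The following Python sketch is not from the original text: the function name `global_jump_matrix` and the row-major element numbering are illustrative assumptions, and $\beta$ is factored out. It assembles the jump term of (3.13-232) on a uniform $J \times J$ grid of squares and applies it to centroid values of $p = x^2 + y^2$, for which the five-point stencil is exact.

```python
import numpy as np

def global_jump_matrix(J, h):
    """Matrix S with p @ S @ q = sum over interior edges of h_e * int_e [[p]][[q]] ds.

    Uniform J x J grid of square elements of side h, piecewise-constant pressure,
    elements numbered row-major (an assumption made for this sketch).
    """
    S = np.zeros((J * J, J * J))
    for i in range(J):
        for j in range(J):
            a = i * J + j
            # visit each interior edge exactly once (neighbour "above" and "to the right")
            for di, dj in ((1, 0), (0, 1)):
                ii, jj = i + di, j + dj
                if ii < J and jj < J:
                    b = ii * J + jj
                    w = h * h          # h_e times edge length, both equal to h
                    S[a, a] += w
                    S[b, b] += w
                    S[a, b] -= w
                    S[b, a] -= w
    return S

J, h = 5, 0.2
S = global_jump_matrix(J, h)
centre = 2 * J + 2                      # an interior element with four neighbours

# centroid samples of p = x^2 + y^2, whose Laplacian is exactly 4
xc = (np.arange(J) + 0.5) * h
p = (xc[:, None] ** 2 + xc[None, :] ** 2).ravel()
stab_term = (S @ p)[centre]             # should equal -h^4 * laplacian(p)
print(S[centre, centre] / h**2, stab_term, -4 * h**4)
```

The zero row sums show that constant pressures lie in the nullspace of the stabilization matrix, which is the property behind global mass conservation.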
The analysis of (3.13-232) is a straightforward generalisation of that above; coercivity
in the mesh-dependent norm:
$$
|||(u, p)||| = \Big( \|u\|_1^2 + \sum_{e \in \Gamma_h} h_e \int_e [[p]]_e^2 \, ds \Big)^{1/2}, \tag{3.13-233}
$$
over $X_h \times M_h$, is the key to deriving the desired error estimate:
$$
\|u - u_h\|_1 + \|p - p_h\|_0 \le Ch. \tag{3.13-234}
$$
The spectral analysis leading to the 'best' choice of $\beta$ also applies here. For example,
computations on uniform square grids in Silvester (1994) suggest that $\Gamma^* = 1$ is a good
estimate in this case. Estimating $\Theta^*$ is more tricky, however; see the discussion of $Q_1Q_1$
above.
A relevant observation here is that explicitly perturbing the incompressibility constraint
by a suitably scaled pressure Laplacian operator is also a relatively clean way of stabilizing
'non-staggered' finite volume (and centered finite difference) methods. Furthermore, in
the case of a uniform grid, a local Fourier analysis of the non-staggered finite difference
operators suggests that $\beta = 1/16$ is an intelligent choice to make in such cases.
Specifically, $\beta = 1/16$ is the smallest possible value that maximizes the local ellipticity measure
which determines smoothing of high frequency error components. Hence an optimally
convergent multigrid solver for the underlying discrete Stokes system is obtained in this
case. See Linden et al. (1988) for further details.
In practice however, the globally stabilized $Q_1Q_0$ and $P_1P_0$ methods have limited
appeal. There are two main reasons for this: first, the jump terms make life awkward in the
sense of ease of implementation into existing codes based on $Q_1Q_0$, and secondly, the fact
that in gaining stability, the local incompressibility and the simplicity of the underlying
approximation are sacrificed. (Note that mass is still conserved globally since the nullspace
of the stabilization matrix contains constant vectors.) Both of these limitations may be
overcome by making a subtle modification, however. The idea is simply to control jumps
in pressure locally within macro-elements instead of globally across the whole grid.
Starting from the global approach (3.13-232) the idea is to aggregate adjoining elements
into an appropriate macro-element partitioning $\mathcal{M}$, and then omit those jumps in pressure
across the macro-element boundaries. This gives the following local stabilization method
[Kechkar and Silvester (1992)]:
$$
\begin{aligned}
(\operatorname{grad} u_h, \operatorname{grad} v) - (p_h, \operatorname{div} v) &= (f, v) && \forall v \in X_h, \\
-(q, \operatorname{div} u_h) - \beta \sum_{m \in \mathcal{M}} h_m \sum_{e \in \Gamma_m} \int_e [[p_h]]_e \, [[q]]_e \, ds &= 0 && \forall q \in M_h,
\end{aligned}
\tag{3.13-235}
$$
where $\Gamma_m$ is the set of all edges/faces in the interior of the $m$th macro-element, and $h_m$ is
a measure of the macro-element's size, see below. Of course, if stability is to be retained
then the number of elements in each macro-element must be sufficiently large: if every
macro-element contained just one element (i.e. $\mathcal{M} = \mathcal{C}_h$) there are no internal jump terms
(i.e. $\Gamma_m = \emptyset$), and (3.13-235) degenerates to the unstabilized formulation. In the
motivating paper [Kechkar and Silvester (1992)] it is rigorously established that as long as
$\mathcal{M}$ is constructed so that each macro-element is topologically equivalent to a reference
macro-element having a velocity node on every edge (or every face in three dimensions),
then there exists a minimal parameter value $\beta_0 > 0$ such that the formulation (3.13-235)
is stable ($\gamma_s$ in (3.13-213) is independent of $h$), and the optimal error estimate (3.13-234)
holds (with a constant $C$ independent of $\beta$). Note also that the globally stabilized
formulation (3.13-232) corresponds to the extreme case of a local stabilization based on a single
macro-element.
One of the features of (3.13-235) is that if the discrete incompressibility constraints are
added together then the jump terms sum to zero in each macro-element (a specific example
is given below). This is crucially important to the success of the method since it implies
that the local incompressibility of the $Q_1Q_0$ or $P_1P_0$ method is retained after stabilization
(albeit over macro-elements). It also suggests that a good strategy when constructing M
is to form macro-elements containing as few elements as possible. Indeed, given some
arbitrary grid, an 'optimal' partitioning M may be constructed by a simple adaptive
process: successively subdividing large patches into smaller ones until further subdivision
cannot be done without violating the connectivity constraint on the macro-elements. As an
illustration, the patch of seven quadrilaterals in Figure 3.13-17 can obviously be split into
two macro-elements which are topologically equivalent to the reference macro-elements
illustrated in Figure 3.13-18.
Fig. 3.13-17 Typical patch of seven $Q_1Q_0$ elements.
Fig. 3.13-18 Reference $Q_1Q_0$ macro-elements.

Once a suitable macro-element partitioning has been formed, the local stabilization
matrices can be calculated by running through the component elements, summing jump
contributions corresponding to the internal edges. For example, in the case of the four
element macro-element in Figure 3.13-17, each element has two internal (dotted) edges
and the local stabilization matrix implied by (3.13-235) is given by:
$$
S_m = h_m
\begin{pmatrix}
l_{41} + l_{12} & -l_{12} & 0 & -l_{41} \\
-l_{12} & l_{12} + l_{23} & -l_{23} & 0 \\
0 & -l_{23} & l_{23} + l_{34} & -l_{34} \\
-l_{41} & 0 & -l_{34} & l_{34} + l_{41}
\end{pmatrix}.
\tag{3.13-236}
$$
Here $l_{ij}$ is the length of the edge between elements $i$ and $j$. The reference length $h_m$ may
be computed by simply defining it to be the average diameter of the constituent elements.
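As a small illustration of (3.13-236), the sketch below builds the $4 \times 4$ local matrix from four internal edge lengths and confirms that its rows and columns sum to zero, which is the algebraic form of the statement that the jump terms cancel when the constraints within a macro-element are added together. The function name `local_stab_matrix` and the sample edge lengths are hypothetical, chosen only for the demonstration.

```python
import numpy as np

def local_stab_matrix(l12, l23, l34, l41, h_m):
    """Local stabilization matrix of (3.13-236) for a four-element macro-element.

    l_ij is the length of the internal edge shared by elements i and j;
    h_m is the reference length of the macro-element.
    """
    L = np.array([
        [l41 + l12, -l12,       0.0,       -l41],
        [-l12,      l12 + l23,  -l23,      0.0],
        [0.0,       -l23,       l23 + l34, -l34],
        [-l41,      0.0,        -l34,      l34 + l41],
    ])
    return h_m * L

# illustrative (made-up) edge lengths for a mildly distorted macro-element
S_m = local_stab_matrix(0.5, 0.4, 0.6, 0.3, h_m=0.45)
column_sums = S_m.sum(axis=0)            # summing the four constraints
eigenvalues = np.linalg.eigvalsh(S_m)    # symmetric, so eigvalsh applies
print(column_sums, eigenvalues)
```

The single zero eigenvalue corresponds to the constant pressure on the macro-element; the remaining eigenvalues are positive, so the perturbation penalizes every non-constant pressure pattern inside the macro-element.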
In two dimensions a convenient way of constructing a 'legitimate' $\mathcal{M}$ is to take a
coarse subdivision of quadrilaterals and triangles and then to uniformly refine it once by
joining the mid-edge points. This gives a macro-element partitioning with each
macro-element consisting of precisely four elements. The quadrilateral macro-elements are thus
topologically equivalent to the left hand reference macro-element in Figure 3.13-18, and
the triangular macro-elements are all topologically equivalent to the reference triangle
in Figure 3.13-19. In three dimensions the $2 \times 2 \times 2$ block is the obvious starting point
for stabilizing $Q_1Q_0$. Similarly, the basic reference $P_1P_0$ tetrahedral macro-element has
each face built up from three adjoining sub-tetrahedra, as illustrated in Figure 3.13-19.
(The crucial point here is that each face of the macro-element tetrahedron must have a
'centre-node' in order to satisfy the connectivity condition.)

Fig. 3.13-19 Reference $P_1P_0$ macro-elements.
Perhaps the most serious potential drawback of the local framework (3.13-235) is that
stability is only guaranteed if the stabilization parameter $\beta$ is bigger than some critical
value $\beta_0$, which needs to be estimated (cf. the globally stabilized case where stability is
built in). Fortunately, this does not cause any real difficulty in practice since it is known
theoretically that the value $\beta_c$ defined via the spectral analysis is always big enough (since
$\beta_0 \le \beta_c$), see Silvester (1994). This is important in the framework (3.13-235) because the
determination of an estimate for $\beta_c$ is a simple piece of local analysis. To illustrate this
point, consider the case of a uniform grid of $J \times J$ square $Q_1Q_0$ elements of side $h$. In
constructing (3.13-235) there are two possibilities: if $J$ is even, then local stabilization
can be based on the $2 \times 2$ macro-element illustrated in Figure 3.13-18, whereas if $J$ is
odd, then a single layer of larger $2 \times 3$ macro-elements needs to be appended around the
boundary. Restricting attention to the even case for simplicity, and setting $h_m = l_{ij} = h$
in (3.13-236), we see that the stabilization matrix $S$ in (3.13-211) is block diagonal with
identical $4 \times 4$ blocks of the form:
$$
S_m = h^2
\begin{pmatrix}
2 & -1 & 0 & -1 \\
-1 & 2 & -1 & 0 \\
0 & -1 & 2 & -1 \\
-1 & 0 & -1 & 2
\end{pmatrix}.
\tag{3.13-237}
$$
A simple calculation then shows that the eigenvalues of the stabilization matrix $S$ are
simply $0$, $2h^2$, $2h^2$, $4h^2$ (each with multiplicity equal to $J^2/4$). Furthermore, since the
pressure mass matrix $Q$ is diagonal in this case, with entries $Q_{ii} = h^2$, we immediately see that
$\Theta^* = \Theta^2 = 4$ in (3.13-217) independently of the grid. If this is combined with the
numerically computed upper bound $\Gamma^2 = 1$ (see Table 3.13-8), then a 'good' parameter value
is easily deduced, namely $\beta = 1/4$. Similar considerations apply in three dimensions, for
example using the $2 \times 2 \times 2$ brick as the $Q_1Q_0$ building block, each element has three
'internal' faces; thus the local stabilization matrix has maximal eigenvalue equal to $6h^3$,
and hence $\Theta^2 = 6$ in this case.
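The eigenvalues quoted above are easy to confirm. In the sketch below (an illustration, not the authors' code) the elements of a $2 \times 2$ or $2 \times 2 \times 2$ macro-element are identified with the vertices of a square or cube, two elements being neighbours when their binary labels differ in one bit; this labelling and the function name `macro_stab_matrix` are assumptions made for the example. The last line assumes that the parameter rule implied by the text is $\beta = \Gamma^2/\Theta^2$, which reproduces the quoted value $1/4$ in two dimensions.

```python
import numpy as np

def macro_stab_matrix(dim, h):
    """Local stabilization matrix for a uniform 2^dim macro-element of side-h elements.

    Each pair of neighbouring elements shares one internal edge/face of measure
    h^(dim-1); with h_m = h the weight per shared edge/face is h^dim.
    """
    n = 2 ** dim
    S = np.zeros((n, n))
    for a in range(n):
        for bit in range(dim):
            b = a ^ (1 << bit)       # neighbour across one internal edge/face
            S[a, a] += 1.0
            S[a, b] -= 1.0
    return h ** dim * S

h = 0.1
eig_2d = np.linalg.eigvalsh(macro_stab_matrix(2, h)) / h**2   # scaled by Q_ii = h^2
eig_3d = np.linalg.eigvalsh(macro_stab_matrix(3, h)) / h**3   # scaled by Q_ii = h^3
print(np.round(eig_2d, 12))   # expected pattern: 0, 2, 2, 4
print(np.round(eig_3d, 12))   # largest value expected to be 6

gamma_sq = 1.0                        # upper bound quoted in the text
beta_2d = gamma_sq / eig_2d.max()     # assumed rule beta = Gamma^2 / Theta^2
```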
To complete the picture a brief discussion of the higher order versions of the $Q_1Q_0$
and $P_1P_0$ methods is appropriate here.
o $P_{k+1}P_{-k}$ and $Q_{k+1}P_{-k}$ for $k \ge 1$. The triangular/tetrahedral case is discussed first. The
$P_{k+1}P_{-k}$ approximation is very special since the divergence of the velocity approximation
is contained in the pressure space, for all $k$. The upshot is that the discrete velocity field
is divergence-free everywhere in the flow domain:
$$
(q, \operatorname{div} u_h) = 0 \quad \forall q \in M_h \quad \text{implies} \quad \operatorname{div} u_h = 0 \ \text{in } \Omega. \tag{3.13-238}
$$
Whilst the need for such a strong enforcement of incompressibility is not obvious in all
cases, it may be highly desirable for certain types of flows, and the method then gives
an alternative to working with a streamfunction formulation. With the incompressibility
so strongly enforced, the stability of the approximation is bound to be problematic. What
is surprising is that whilst the lowest order methods $P_2P_{-1}$, $P_3P_{-2}$ are unstable (with the
inf-sup constant in (3.13-205) behaving like $O(h)$ on certain types of grids), for $k \ge 3$, the
methods are not unstable as long as a technical condition associated with (near-)singular
vertices is satisfied, see Brezzi and Fortin (1991, pp. 227-228). The price to pay in this
case is that whilst the inf-sup constant is independent of $h$, it is not independent of $k$. Thus
it is difficult to use the family as a basis for adaptivity via $p$-refinement (see Jensen and
Vogelius, 1990 for details). Stabilization of such $p$-refinement methods remains an active
research area, partly because similar difficulties arise when using spectral approximations,
see Canuto et al. (1988a, pp. 394-406).
Returning to the $h$-refinement setting, the $P_2P_{-1}$ and $P_3P_{-2}$ methods both need to be
stabilized if they are to work on all possible grids. This can be done either by adding
velocity bubble functions, leading to the Crouzeix-Raviart family of 'bubble' elements,
or else working within a globally stabilized formulation like (3.13-229). Either way,
stabilization is relatively straightforward, although the alluring property (3.13-238) is
lost. Stabilizing in a global framework via (3.13-229), the 'difficult to handle' pressure
jump terms in (3.13-232) are not needed for $k \ge 2$ (Franca et al., 1993), nor in the case
of the $P_2P_{-1}$ triangle which does have the crucial mid-edge velocity node. On the other
hand, some recent results (Qin, 1994) reinforce the basic point made at the outset: the
underlying methods can be made to work without stabilization by restricting attention to
carefully selected grids. For example, using (Clough-Tocher) macro-elements with each
triangle divided into three, both the $P_2P_{-1}$ and $P_3P_{-2}$ methods can be shown to be stable
and thus give optimal rates of convergence. See Qin (1994) for details.
Finally, it should perhaps be stressed that the $Q_{k+1}P_{-k}$ family of methods are ($h$-)stable
for $k \ge 1$, see Girault and Raviart (1986, pp. 156-157). Indeed, this class of methods is
one of the more attractive starting points for both $p$ and $h$-$p$ refinement strategies. A
more complete analysis is given in Stenberg and Suri (1994). As a result, the unstable
$Q_{k+1}Q_{-k}$ family are clearly of limited interest.
d. Impact on iterative solvers
The recent development of high-performance computing architectures, and the ability to
render three-dimensional solution information, has led to an increased emphasis on iterative
linear equation solvers. The aim of this section is to assess the impact of instability on
the convergence rate of state-of-the-art iterative solvers when applied to the linearized
equations that arise (at each time-level) when solving the time-dependent Navier-Stokes
equations. For ease of exposition, we restrict ourselves to operator splitting methods
which involve a Stokes (or $H^1$)-projection onto a discretely divergence-free subspace at
every step (e.g., the second-order $\theta$-scheme; Dean and Glowinski, 1993, p. 27). If an
$L^2$-projection is done instead, then the ramifications of instability are not nearly so evident,
see Griffiths and Silvester (1994).
Using a stabilized formulation of the form (3.13-235) [or (3.13-232)] the Stokes
projection step requires the solution of a generalized Stokes system of the form:
where $K$, $C$, and $S$ are as defined previously, $\nu$ is the inverse of the Reynolds number,
$M$ is the velocity mass matrix, and $f$ depends on the velocity and pressure solution at the
previous time-step.
Recently, 'fast iteration' methods for solving linear equation systems have been
developed. These typically incorporate multigrid or multilevel strategies, and are often optimally
efficient in the sense that a solution can be generated in $O(N)$ floating point
operations (where $N$ is the number of degrees of freedom), cf. direct methods for dense
linear equations which require $O(N^3)$ flops. One iteration method that is uniquely
appropriate for indefinite systems like (3.13-239) is the minimal residual method (MINRES),
see Silvester and Wathen (1996). Alternatively, the mathematically equivalent conjugate
residual (CR) algorithm (Elman, 1994) may be used. In practice, our experience is that
MINRES has better stability properties than CR; furthermore, it is relatively cheap to
implement, requiring only one matrix-vector product, two dot products and six 'AXPY'
operations per iteration. The key point here is that if the basic iteration is
preconditioned by velocity and pressure operators which are spectrally equivalent to the primal
operator $K$, and the dual operator $C^T K^{-1} C + \beta S$, respectively, then a mesh independent
convergence rate is assured (see Silvester and Wathen, 1996). Furthermore, using such
a preconditioner the 'energy' of the error is spectrally equivalent to the preconditioned
residual, and hence it is strictly minimized at every step. (To get a convergence rate which
is independent of $\nu$ and $\Delta t$ requires a more sophisticated preconditioning approach; see
Bramble and Pasciak, 1995 for details.)
Note that in the limiting case of an arbitrarily large timestep the system (3.13-239)
reduces to a standard Stokes problem. In any case, (3.13-205) is a necessary and sufficient
condition for the pressure mass matrix $Q$ to be spectrally equivalent to the dual operator. In
fact, the inf-sup constant $\gamma$ is the lower bound on the equivalence relation (in the stabilized
case, the condition (3.13-213) plays the same role). The crucial point is this: if a 'fast'
method is applied to a system corresponding to an unstabilized $Q_1Q_0$ discretization, then
the ratio of the equivalence constants corresponding to the dual operator may blow up as
$h \to 0$. In this case, what will happen in practice is that the rate of convergence of the
iteration will deteriorate under mesh refinement.
* Two symmetric matrix operators $S$ and $T$ are said to be spectrally equivalent if there exist constants $a$ and
$b$ independent of $h$ such that $a \le x^T S x / x^T T x \le b$, for all vectors $x$.
A numerical experiment will hopefully reinforce this point. We solve an enclosed
Stokes flow problem, with Dirichlet velocity conditions on all boundaries. Note that,
ignoring the effect of the starting guess, the convergence of iterative solvers applied to
symmetric systems of equations is completely determined by the systems' eigenvalues,
and hence is essentially independent of the actual boundary values imposed. We discretize
using a sequence of uniform grids of square $Q_1Q_0$ elements, each grid obtained by uniform
refinement of the previous one. The velocity components are preconditioned using a multigrid
solver applied separately to each of the discrete Laplacian blocks. In particular, we
perform the simplest multigrid smoothing strategy available: one V-cycle of optimal point-Jacobi
relaxation with just one smoothing step before and after transferring to the next
grid. Bilinear prolongation is used, and restriction is via 'full weighting' to ensure that the
preconditioner is symmetric. The discrete pressure is preconditioned by the pressure mass
matrix, that is by a simple diagonal scaling. The iteration is stopped when the preconditioned
residual has been reduced by a factor of $10^{-6}$, and the iteration counts are recorded
in Table 3.13-7. Stabilization is enforced via (3.13-235) using the $2 \times 2$ macro-element
construction analyzed above. Note that in all cases the computed velocity solutions were
identical in the 'eyeball norm'.
Iteration and operation counts (in megaflops) for three particular values of $\beta$ are shown:
the unstabilized case $\beta = 0$, the value $\beta = 1/4$ discussed above, and the value $\beta = 0.058$
(see Silvester, 1994) which minimizes the condition number of the dual operator which
determines the speed of convergence. This 'perfect' value is hard to estimate in general, although it is easily
determined (by numerical experiment) in the case of uniform grids, see Vincent (1995).
What is observed is that iteration counts are only independent of the grid in the
stabilized cases: using the raw $Q_1Q_0$ method the iteration counts significantly increase with
decreasing $h$. Note that exactly the same picture would be observed if we had used the
global formulation (3.13-232) instead of (3.13-235) above. The only difference is that the
computed solutions are much more sensitive to the choice of $\beta$ in this case, see Silvester
and Kechkar (1990) for details.
The poor performance in the case of unstabilized $Q_1Q_0$ is easily explained theoretically.
Indeed an asymptotic (worst-case) analysis of the instability of $Q_1Q_0$ suggests that the
number of iterations will double for every uniform refinement, independently of the fast
iteration actually used. To illustrate the difference in behavior, the extremal eigenvalues
of the preconditioned dual operator $(C^T K^{-1} C + \beta S)/h^2$ are given in Table 3.13-8 (cf.
Section 3.13.5k).
Table 3.13-7 MINRES iteration counts (operations).

Grid      $\beta = 0$      $\beta = 0.058$   $\beta = 0.25$
8x8       46 (1.88)      32 (1.35)       35 (1.48)
16x16     76 (13.05)     34 (6.04)       42 (7.44)
32x32     128 (91.06)    34 (25.03)      41 (30.13)
64x64     213 (619.3)    34 (102.4)      43 (129.2)

Table 3.13-8 Extremal eigenvalues of the preconditioned dual operator.

          $\beta = 0$                      $\beta = 0.058$
Grid      $\sigma_{\min}$    $\sigma_{\max}$      $\sigma_{\min}$    $\sigma_{\max}$
8x8       4.661E-2    0.9764        0.2320      1.1009
16x16     1.318E-2    0.9941        0.2312      1.1121
32x32     3.465E-3    0.9984        0.2200      1.1147

In the unstabilized case there are two zero eigenvalues corresponding to the hydrostatic
and the 'pure' chequerboard mode. The minimal eigenvalue clearly decreases with $h$; in
fact it is known that $\sigma_{\min} \to 3h^2\pi^2/8$ as $h \to 0$, see Griffiths and Silvester (1994). In
addition, the maximal eigenvalue clearly tends to unity, hence the condition number
of the preconditioned dual operator becomes unbounded as $h \to 0$. In contrast, in the
stabilized case, there is one zero eigenvalue, the minimal eigenvalue is bounded away
from zero, and the condition number is uniformly bounded by a small constant ($\approx 5$).
Looking at Table 3.13-8 it is obvious that the instability of $Q_1Q_0$ is fundamental; any
iterative solver that relies on a well conditioned dual operator, e.g., any Uzawa method,
is bound to converge arbitrarily slowly in the limit $h \to 0$.
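The trends described above can be read off the tabulated $\beta = 0$ data with a few lines of Python. The sketch below only post-processes the numbers in Tables 3.13-7 and 3.13-8; the mesh size $h = 1/J$ (i.e. a unit-square domain) is an assumption made here, since the domain size is not stated in this passage, and the function names are illustrative.

```python
import math

# beta = 0 columns of Tables 3.13-7 and 3.13-8, keyed by J for a J x J grid
sigma_min = {8: 4.661e-2, 16: 1.318e-2, 32: 3.465e-3}
sigma_max = {8: 0.9764, 16: 0.9941, 32: 0.9984}
iterations = {8: 46, 16: 76, 32: 128, 64: 213}

def condition_numbers():
    """Condition number sigma_max / sigma_min of the preconditioned dual operator."""
    return {J: sigma_max[J] / sigma_min[J] for J in sigma_min}

def ratio_to_asymptote():
    """Tabulated sigma_min divided by 3 h^2 pi^2 / 8, assuming h = 1/J."""
    return {J: sigma_min[J] / (3.0 * math.pi**2 / (8.0 * J**2)) for J in sigma_min}

def iteration_growth():
    """Factor by which the MINRES count grows at each uniform refinement."""
    sizes = sorted(iterations)
    return [iterations[b] / iterations[a] for a, b in zip(sizes, sizes[1:])]

kappa = condition_numbers()
ratio = ratio_to_asymptote()
growth = iteration_growth()
for J in sorted(kappa):
    print(f"{J}x{J}: kappa = {kappa[J]:.1f}, sigma_min / asymptote = {ratio[J]:.3f}")
print("iteration growth per refinement:", [round(g, 2) for g in growth])
```

Under the stated assumption the tabulated $\sigma_{\min}$ approaches the asymptotic value from below as the grid is refined, and the condition number grows by a factor approaching four per refinement. The observed iteration counts grow by a factor of roughly 1.65 on these grids, still short of the asymptotic doubling predicted by the worst-case analysis.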
In three dimensions, the situation is even worse since the analysis in Griffiths and Silvester
(1994) indicates that using a uniform grid of $Q_1Q_0$ bricks, $\sigma_{\min}$ behaves like $O(h^4)$ as
$h \to 0$. Thus if a 'fast' iteration is employed and the method is not stabilized, the number
of iterations required to satisfy a fixed tolerance will ultimately increase by a factor of
four with each uniform grid refinement.
A 3D comparison, kindly supplied by S. T. Chan, for potential flow ($L^2$-projection)
over a hemisphere, is shown in Figure 3.13-19A, using an incomplete Cholesky
conjugate gradient iterative solver and $\beta = 0.10$. The payoff is quite considerable, although
not as dramatic as for the 2D Stokes example ($H^1$-projection) shown in Table 3.13-7.
e. Recommendations
• It is important to be aware that stabilization does not come for free. The general
price that must be paid is that the regularization parameter has to be appropriately
chosen. Ignoring the lowest order conforming methods (i.e. with $Q_1$ or $P_1$ velocity),
efficient mixed approximation methods exist like $Q_{k+1}P_{-k}$ which are intrinsically
stable. Hence, stabilized higher-order methods seem to have limited attraction.
• The stabilized formulation (3.13-206) is a clean way of implementing $P_1P_1$
approximation on triangles and tetrahedra, which is intrinsically superior to the alternative
mini-element discretization. Equation (3.13-206) is also a good starting point for equal-order
$Q_1Q_1$ approximation on grids of rectangles and bricks, but is less attractive in
the case of distorted grids (because the estimation of a good parameter choice is more
difficult). Care must be taken, however: it is easy to over-stabilize, in which case
the quality of the divergence-free approximation is compromised, adversely affecting
accuracy.
• The local stabilization approach provides a convenient and efficient way of getting
$Q_1Q_0$ and $P_1P_0$ methods to work with minimal restrictions on the grid used. In such
cases a priori estimates of an 'optimal parameter' are easily computed. However,
the need for a macro-element data structure is an unavoidable consequence. In three
dimensions, the necessity for iterative solvers makes the case for stabilizing these
methods compelling, especially if Uzawa-type iteration methods based on a Stokes
projection step are used.

Fig. 3.13-19A A 3D example (horizontal axis: ICCG iterations); solid curves show stabilized results, dashed curves unstabilized.
3.13.4 The Discrete Pressure Poisson Equations (PPE's)
We have previously discussed the PPE in the continuum—see Sections 3.5.1, 3.8.2,
3.10.3-3.10.5, 3.12.3, and 3.12.5; here we address the discrete analog, in several
versions, and point out relative advantages and disadvantages—vis-a-vis (u, P). Later
(Sections 3.16.1 and 3.16.5), we will show how some discrete PPE formulations can be
used to write code and generate numbers.
a. The consistent PPE
Just as the continuous equations (PDE's) of mass and momentum conservation (plus BC's)
imply (induce) the existence of an associated pressure Poisson equation (PPE), complete
with appropriate BC's, so too do the semi-discrete forms of these equations—most easily
derived from the DAE's presented in their most compact form in (3.13-28) and (3.13-29).
The PPE derivation (but not all of its consequences) is simple:
1. Since (3.13-29) applies for all $t \ge 0$, its time derivative exists,
$$
C^T \dot{u} = \dot{g}, \tag{3.13-240}
$$
which simply states that the acceleration is also divergence-free; we remark that
(3.13-240) is only valid when the domain shape is time-independent.
2. Since $M$ is non-singular, (3.13-28) can be uniquely solved for the acceleration,
$$
\dot{u} = M^{-1}[f - Ku - N(u)u - CP]; \tag{3.13-241}
$$
i.e., if $u$ and $P$ are known, the acceleration is computable.
3. To ensure that the velocity remains divergence-free, the acceleration is inserted into
(3.13-240) and the result rearranged to give
$$
(C^T M^{-1} C)P = C^T M^{-1}[f - Ku - N(u)u] - \dot{g}, \tag{3.13-242}
$$
the analog of (3.5-3), which is the consistent (discrete) PPE and, given $u$ (such that
$C^T u = g$), generates the unique pressure that assures that $u$ remains divergence-free.
(The PPE formulation generally assures only a divergence-free acceleration.)
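The three steps above can be sketched with dense linear algebra. The Python fragment below is a toy illustration only: the matrices are random stand-ins rather than finite element matrices, the function name `consistent_ppe` is hypothetical, and the optional convection matrix `N` (representing $N(u)$ evaluated at the current $u$) is an assumption about how one might pass that term. It solves (3.13-242) for $P$, recovers the acceleration from (3.13-241), and checks that (3.13-240) is satisfied.

```python
import numpy as np

def consistent_ppe(M, K, C, f, u, gdot, N=None):
    """Solve (C^T M^-1 C) P = C^T M^-1 [f - K u - N(u) u] - gdot, then return (P, udot)."""
    conv = np.zeros_like(u) if N is None else N @ u
    r = f - K @ u - conv                     # momentum residual without the pressure term
    Minv_r = np.linalg.solve(M, r)
    Minv_C = np.linalg.solve(M, C)
    P = np.linalg.solve(C.T @ Minv_C, C.T @ Minv_r - gdot)
    udot = Minv_r - Minv_C @ P               # acceleration from (3.13-241)
    return P, udot

# random stand-ins (NOT assembled finite element matrices)
rng = np.random.default_rng(1)
n, m = 10, 4                                 # velocity and pressure unknowns
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)                  # symmetric positive definite "mass" matrix
A = rng.standard_normal((n, n))
K = A @ A.T                                  # symmetric positive semi-definite "viscous" matrix
C = rng.standard_normal((n, m))              # full column rank, so no pressure modes
f = rng.standard_normal(n)
u = rng.standard_normal(n)
gdot = rng.standard_normal(m)

P, udot = consistent_ppe(M, K, C, f, u, gdot)
residual = np.linalg.norm(C.T @ udot - gdot) # (3.13-240) should hold to round-off
print(residual)
```

Forming $C^T M^{-1} C$ explicitly is only sensible for such a small dense example; it is used here purely to mirror the algebra of the derivation.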
So why do we care about the discrete PPE and what do we mean by 'consistent'?
Simply put, the answers are these: we care about the PPE because it can replace the mass
conservation equation and be used, with the momentum equation, to sometimes solve more
efficiently the time-dependent, semi-discrete NS equations. It is consistent in that it can
generate exactly the same solution—u(t) and P(t)—as that obtained by time integration
of the 'primitive' equations (3.13-28) and (3.13-29). It also has the appropriate BC's
for the pressure 'built-in'—consistently. These statements will be seen to be especially
relevant when we discuss alternatives—i.e., inconsistent discrete PPE methods—in the
next section.
Remarks
(1) The matrix $C^T M^{-1} C$ usually 'corresponds to' $(-\nabla^2)$ and is thus (up to pressure
modes) SPD; and the RHS, appropriately, corresponds to $(-\operatorname{div})$ on the 'data.' But
see Remarks (4) and (5) below.
(2) Whereas $C^T M^{-1} K u$ approximates $\nabla \cdot (\nu \nabla^2 u)$, which is zero in the continuum, it
is not zero in the discrete world. Whereas it will be 'small' (and appropriately
unimportant) away from solid boundaries, it is 'large', and very important, at these
boundaries (unless $Re \gg 1$); it is the term that puts the viscous portion of the
Neumann BC, $\nu\, n \cdot \nabla^2 u$, into the PPE [see (3.8-36)]. In fact, for Stokes flow with
BC's that are independent of time, it is the only term that 'establishes' the pressure
field, per Remark (8) below (3.8-37).
(3) The fact that $M^{-1}$ is a dense matrix will cause 'problems,' which we discuss later.
The 'solution,' mass lumping, also 'weakens' our claim of 'consistency' for the PPE
in that LM is no longer 'honest,' no longer GFEM, and not always possible.
(4) Examining the consistent PPE at any portion of $\Gamma$ on which $n \cdot u$ is specified (which
we do later for $Q_1Q_0$) and letting the mesh size tend to zero, recovers the appropriate
Neumann BC for the PPE, e.g., (3.8-36). It is important to point out that here
$C^T M^{-1} C$ corresponds not to $-\nabla^2$, but to the normal derivative. (See Section 3.13.5f
for a demonstration of this fact.) The tangential momentum equation (3.8-37) is not
obviously satisfied on $\Gamma$, even for $t > 0$; its satisfaction is and must be 'implicit': a
spatially converged solution properly mimics the continuum.
(5) Examining this consistent PPE at any portion of $\Gamma$ on which an NBC is applied
in the normal direction (which we do later for $Q_1Q_0$) and letting the mesh
size tend to zero, recovers the appropriate (Dirichlet) boundary condition for the
PPE, equation (3.8-38). Again, $C^T M^{-1} C$ then does not correspond to $(-\nabla^2)$; it
corresponds to a 'constant' operator, a scalar. Again, see Section 3.13.5f for a
demonstration.
(6) It is noteworthy that solving the pair (3.13-241) and (3.13-242) for $u$ and $P$ does
not imply $C^T u = g$; it only implies (3.13-240). More on this issue in Section 3.16
on time integration.
(7) If the problem is one in which the rotated equations (at a boundary node) are
required in order to supply proper normal and/or tangential BC's, à la (3.13-42)
through (3.13-44), then it is an easy matter to show that the appropriate CPPE is
$$
(C^T M^{-1} C)P = C^T M^{-1}[f - (K + N(R^T u_R))R^T u_R] - \dot{g}; \tag{3.13-243}
$$
the (scalar) equation for $P$ is affected only by the change of (vector) variable from
$u$ to $u_R$.
Since some of our mathematician friends tend to 'cringe' at the thought of even
mentioning a PPE when/since the pressure generally 'lives' in $L^2$ rather than $H^1$
(especially true for those elements employing discontinuous pressure), let alone considering
using one to obtain an (alleged) solution, a few words of 'justification' (vindication?) may
be in order (with thanks here to both R. Rannacher and D. Griffiths, personal
communications). Suppose $P \in L^2$ (e.g., $Q_1Q_0$), yet we propose to both discuss and utilize (3.13-242);
what then does the vector $(C^T M^{-1} C)P$ really mean? It means this: if the pressure from
the NSE's is 'sufficiently smooth' (typically 'valid' in much of $\Omega$ and away from
singularities), then this vector really does approximate $-\nabla^2 P$, per Remark (1) above. If, however,
$P(x)$ is not sufficiently smooth (say near a 'corner'), then the above vector cannot
correspond to $-\nabla^2 P$; rather, it should then be interpreted precisely for what it is: an algebraic
'rearrangement' of the momentum and mass conservation equations, with $P \in L^2$. After
all, the numbers coming out the other end are the same whether the ($u$-$P$) formulation
or the CPPE formulation is used, at least when both are done properly. Thus, our 'final'
and, in our opinion, justifiable, position on the matter, which applies equally well to
FDM's and FVM's, is this: the CPPE approach gives the same results as the primitive
equation approach and, since $P \in L^2$ in general, we must simply admit that $C^T M^{-1} C$ is
generally not a Laplacian. (Note that this same reasoning might suggest that the use of
continuous pressure elements, such as $Q_2Q_1$, is generally ill-advised; probably not an
unreasonable suggestion.)
b. Some inconsistent (approximate) PPE's
Of the many ways to generate an inconsistent PPE, we will present just two, each
displaying its own form of 'inconsistency' and each precluding the use of the two lowest-order
elements, $P_1P_0$ and $Q_1Q_0$. For the first, we simply discretize the continuum PPE
given in (3.12-59), presuming that we can somehow generate a decent approximation to
the viscous term, to obtain [omitting the $(\nabla u)^T$ term for simplicity ($\gamma = 0$)]
$$
LP = [K - N(u)]u + \tilde{f} - \tilde{g}, \tag{3.13-244}
$$
where $L_{ij} = \int \nabla\psi_i \cdot \nabla\psi_j$, etc. Note that $L$, like $C^T M^{-1} C$, approximates $(-\nabla^2)$ for
sufficiently smooth pressure, at least up to BC's.
The many attempts to make such a method work are so numerous and so varied—and
often rather vague—that we cannot bring ourselves to discuss any of them—even though
most seem to 'succeed.' What we will do is refer the reader to much of the relevant
literature by requesting that s/he consult a recent paper by Gresho et al. (1995, references
#3 through #19 in particular); less important in this paper are two attempts to make the
Q\Q\ element work (which we too seemingly attained—until the algorithms were placed
under close scrutiny) via one form or another of ad hoc stabilization tricks.
These remarks are not intended to denigrate the above references; rather, they are
meant to imply that our understanding of them is less than perfect. Some of them
in fact probably do not belong on our list because they are addressing stabilization
of $Q_1Q_1$ (et al.), à la Section 3.13.3, rather than inconsistent PPE's. We simply tend
to group them together, with apologies for any errors. But there was one result from
the above paper that is worth mentioning here: the simple replacement of $C^T M^{-1} C$ in
(3.13-242) by $L$ (ostensibly legitimate/interesting because each represents $(-\nabla^2)$, at least
for Dirichlet BC's on velocity, which we shall presume here) is ill-advised. Why? Well,
simply because $LP = C^T M^{-1}[f - Ku - N(u)u] - \dot g$, combined with
$C^T M^{-1}[f - Ku - N(u)u] = C^T \dot u + C^T M^{-1} C P$ from (3.13-241), implies that
$LP + \dot g = C^T M^{-1} C P + C^T \dot u$, which integrates to
$$C^T u(t) - g(t) = C^T u_0 - g_0 + (L - C^T M^{-1} C) \int_0^t P(\tau)\,d\tau,$$
showing a clear loss of control of 'div.' See the above paper for an example of a steady-state result
of a well-posed problem that satisfies $LP = C^T M^{-1} C P$ with a very large divergence error
and a totally non-physical solution.
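The loss of control of 'div' can be seen in a deliberately tiny toy problem. The sketch below is not taken from the paper cited above; it assumes $M = I$, $K = N = 0$, $g = 0$, a constant $f$, two velocity unknowns and one pressure unknown, with the hypothetical values $C = [1, 1]^T$ and $f = [3, 3]^T$. The function name `divergence_history` is likewise only illustrative. With the consistent operator ($L = C^T M^{-1} C = 2$) the discrete divergence stays at zero; with any other $L$ it drifts linearly in time, exactly as the integrated identity predicts.

```python
from fractions import Fraction


def divergence_history(L, steps=5, dt=Fraction(1, 10)):
    """Integrate M du/dt = f - C P with P taken from L P = C^T M^-1 f.

    Toy assumptions (not from the source): M = I, K = N = 0, g = 0,
    C = [1, 1]^T, f = [3, 3]^T, and a divergence-free start u0 = 0.
    Returns the pressure, the consistent operator A = C^T M^-1 C and the
    history of the discrete divergence C^T u after each time step.
    """
    C = [Fraction(1), Fraction(1)]
    f = [Fraction(3), Fraction(3)]
    A = sum(c * c for c in C)                  # C^T M^-1 C (equals 2 here)
    rhs = sum(c * fi for c, fi in zip(C, f))   # C^T M^-1 f
    P = rhs / Fraction(L)                      # pressure from the chosen 'Laplacian' L
    u = [Fraction(0), Fraction(0)]
    div = []
    for _ in range(steps):
        # forward Euler; exact here because the acceleration is constant
        u = [ui + dt * (fi - c * P) for ui, fi, c in zip(u, f, C)]
        div.append(sum(c * ui for c, ui in zip(C, u)))
    return P, A, div


if __name__ == "__main__":
    for L in (2, 3):
        P, A, div = divergence_history(L)
        print("L =", L, "P =", P, "C^T u history:", [str(d) for d in div])
```

In this toy the drift after time $t$ equals $(L - C^T M^{-1} C)\,P\,t$, which is the integral term of the identity above evaluated for a constant pressure.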
The second way is as discussed in Section 3.13.2b: do not integrate $\nabla P$ by parts when
generating the weak form. This makes the primitive equations look like
$$M\dot u + [K + N(u)]u + GP = f^{(G)}, \tag{3.13-245}$$
$$C^T u = g, \tag{3.13-246}$$
where only $G$ and $f^{(G)}$ are different. They are ($x$-components)
$$G_{x_{ij}} = \int \varphi_i\,\frac{\partial \psi_j}{\partial x}; \tag{3.13-247}$$
f(G) is given by (3.13-26) except that now Fx (on T^) does not contain the pressure
contribution to the normal force; drop — Pna from (3.12-28). The NBC associated with
this formulation is $\nu\,\partial u/\partial x = F_x$ and, from the discussion following (3.12-69), is ill-posed
in the continuum. Nevertheless, we shall form the associated PPE in the usual way
[insert $\dot u$ into the time derivative of (3.13-246)], to get
$$(C^T M^{-1} G)P = C^T M^{-1}[f^{(G)} - Ku - N(u)u] - \dot g, \tag{3.13-248}$$
which, while not inconsistent with (3.13-245) and (3.13-246) [i.e., (3.13-245) and
(3.13-248) can give the same solution as (3.13-245) and (3.13-246)], has a different sort
of inconsistency; namely, if one were to examine this discrete PPE on any portion of $\Gamma$
where the NBC in the normal direction is being used, and let the mesh size tend to zero,
one would not recover the proper Dirichlet BC for the PPE given by (3.8-38). What one
would recover is a repeat of the normal momentum equation NBC, $\partial u_n/\partial n = F_n$! This is
another (and much longer) way to discover the ill-posedness of the associated continuum
equations: the PPE has no BC, as mentioned earlier, in Section 3.12.5. For the 'finite
$h$' case and assuming the case of a 'stable' element (no spurious pressure modes), say
$Q_2Q_1$, there is another aspect of (3.13-248) that should arouse suspicion, and it is this:
there is always a hydrostatic mode ($P_h$ is always a null vector of $G$ because $G$ is always
a gradient), yet there are no redundant continuity equations. This is a spurious hydrostatic
mode, another manifestation of the 'inconsistency.'
3.13.5 Additional Detailed Discussion of the Slightly
UNSTABLE but Highly USABLE $Q_1Q_0$ Element
a. Introduction
While it may be true that the successful use of $Q_1Q_0$ over a wide range of problems
(especially in 3D) requires a bit more expertise (and perhaps a bit of faith!) on the part of
the user than does an LBB-stable element, one of our goals is to turn the serious reader
of this book into one of those 'expert users' who can apply this often-excellent (and
cost-effective) element confidently and successfully.
After respectfully recalling the following remarks, (i) 'Whenever ... the power of
the computer is needed not only to solve a system of equations, but also to formulate
and assemble the discrete approximation in the first place—the finite element method
has something to offer'—Strang and Fix (1973); and (ii) 'I should like to say briefly
why I am a finite element man, and why I have adopted numerical integration almost
exclusively. Firstly, I am lazy, secondly, I do not enjoy mathematics, and thirdly, I make
mistakes'—Irons (1970); we put forth the somewhat counter viewpoint—more in concert
with Aris (1978) who said, 'Though a model may have been formulated with perfect
propriety and perspicacity, it is almost always a mistake to jump in with an extensive
set of computations. It is better to live with it for a bit, to view it from different angles,
to shape and mold it more justly. If the analogy may be permitted, there is a need for
mathematical foreplay if the model is to be fully responsive and the ultimate knowledge
is to be satisfactory'—that you should also study your discrete equations to at least some
extent, to help to understand both your own algorithms and the PDE's and BC's that they
allegedly represent. To this end, then, we present the following 'details' for the simplest
quadrilateral element, the $Q_1Q_0$, the last of which is a new convergence proof.
b. General problem statement
It is an interesting and useful, although tedious, exercise to actually assemble and study
the full NS equations in semi-discrete form—especially for 'higher-order' elements. The
result is worthwhile, however, in several respects—not the least of which is to learn how
the GFEM actually generates boundary equations when non-'essential' BC's are employed,
as we have done in the previous chapter. It will also prove fruitful to see first-hand how the
GFEM automatically generates proper BC's for the pressure Poisson equation.
The analysis will be limited, of course. We will use the simplest, 2D, quadrilateral
element, $Q_1Q_0$ (bilinear approximations for velocity and temperature, and
piecewise-constant approximation for pressure), and study only a mesh of uniform rectangles, as
shown in Figure 3.13-20, in which the right side is a boundary.

Fig. 3.13-20 A patch of $Q_1Q_0$ elements.

While we shall initially
develop the equations via the 'honest' GFEM, we will be forced to compromise this
later in the interest of being able to directly generate nodal (or elemental) equations, a
convenience that is lost when the consistent mass matrix is retained; i.e., then we do
accede to Strang and Fix, Irons, et al., and let the computer do most of the work: the
discrete equations that we present will use the lumped mass approximation. (We also
leave non-uniform grids to the computer, or to the reader!)
The continuum equations that we choose to study are the 2D Boussinesq equations,
$$\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u = -\frac{\partial P}{\partial x} + \nu\nabla^2 u + \alpha T, \tag{3.13-249}$$
$$\frac{\partial v}{\partial t} + \mathbf{u}\cdot\nabla v = -\frac{\partial P}{\partial y} + \nu\nabla^2 v + \beta T, \tag{3.13-250}$$
$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0, \tag{3.13-251}$$
and
$$\nabla^2 P = -\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) + \alpha\frac{\partial T}{\partial x} + \beta\frac{\partial T}{\partial y}, \tag{3.13-252}$$
where $\alpha$, $\beta$ are the x- and y-components of the gravitational buoyancy force ($\alpha = \gamma g_x$, $\beta =
\gamma g_y$, where $\gamma$ is the volumetric expansion coefficient). The energy equation (AD equation
for $T$) should also be appended to close the system, but it is not needed for our limited
purposes here. That we even include the buoyancy term relates to some 'difficulties' that
it can cause at outflow boundaries, a subject taken up in more detail in Volume II, as
are the Boussinesq equations themselves.
The GFEM form of these equations is
$$M\dot u + Ku + N(u)u + C_x P = f_x + \alpha MT, \tag{3.13-253}$$
$$M\dot v + Kv + N(u)v + C_y P = f_y + \beta MT, \tag{3.13-254}$$
$$C_x^T u + C_y^T v = g, \tag{3.13-255}$$
and
$$(C^T M^{-1} C)P = (C_x^T M^{-1} C_x + C_y^T M^{-1} C_y)P = C_x^T M^{-1}[f_x + \alpha MT - Ku - N(u)u] + C_y^T M^{-1}[f_y + \beta MT - Kv - N(u)v] - \dot g, \tag{3.13-256}$$
where we have omitted the subscripts x and y on M, K, and N for simplicity; this slight
laxity of notation should cause no confusion and will cause no harm for our current
purpose. Finally, $f_x$ and $f_y$ come from applied BC's.
The plan is as follows:
1. We shall explicitly construct the nodal equations for momentum conservation at a
typical interior node—node 6.
2. We shall then construct the explicit mass-conservation equation for a typical
interior element—element 5 (the only available fully interior element for this patch of nine
elements).
3. The discrete pressure Poisson equation (PPE) will then be formed at element 5—a
procedure that requires a departure from true GFEM by approximating M by a lumped
mass (i.e., diagonal) matrix.
After studying crudely the asymptotic behavior of these typical interior equations via
Taylor series analysis ($l, h \to 0$), we will move on to examine some boundary equations
and related BC's.
4. The momentum equations associated with natural boundary conditions (NBC's) will
then be constructed at a typical boundary node—node 2.
5. Then the same equations and BC's will be applied to a less typical boundary
node—node 1 (construed to form the lower right corner of the domain).
6. The effect of both Dirichlet and Neumann velocity BC's will then be examined for the
PPE, by studying its discrete form on element 2, obtained via the continuity equation on
element 2.
7. Finally, several PPE BC equations (16, actually!) will be presented for a corner element
when various velocity BC's are employed—e.g., on element 1 wherein the bottom line
in Figure 3.13-20 (x-axis) is construed to be the lower boundary of the domain, or on
element 3 wherein the top horizontal line is construed to be a boundary.
It is worth repeating that the only BC's applied to the NS equations are associated
with the velocity. Thus, it is of some interest to see what BC's are selected by the GFEM
for the PPE, and how, especially since (we assert) these are the only proper BC's
(for this weak form of) the continuum PPE; i.e., for any particular (and well-posed) weak
form, pressure BC's are always induced by those on the velocity and the requirement
that $\nabla \cdot u = 0$ on $\Gamma$.
In order to begin the analysis, we will need the 'element matrices' for one $l \times h$
$Q_1Q_0$ element. Except for the non-linear advection matrix, we present these below for
'convenience,' even though they are also listed in Appendix 1. (The very complicated
advection matrix, for the general case, is not needed for our purposes. Later we will
just carry it along 'symbolically.')
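1. Mass matrix: $M^e_{ij} = \int_e \varphi_i \varphi_j \,dx\,dy$,
$$M^e = \frac{lh}{36}\begin{bmatrix} 4 & 2 & 1 & 2 \\ 2 & 4 & 2 & 1 \\ 1 & 2 & 4 & 2 \\ 2 & 1 & 2 & 4 \end{bmatrix} \tag{3.13-257}$$
or, if lumped,
$$M^e_{L,ij} = \delta_{ij}\int_e \varphi_i, \qquad M^e_L = \frac{lh}{4}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \tag{3.13-258}$$
2. Diffusion matrix: $K^e_{ij} = \int_e \nabla\varphi_i \cdot \nabla\varphi_j$,
$$K^e = \frac{h}{6l}\begin{bmatrix} 2 & -2 & -1 & 1 \\ -2 & 2 & 1 & -1 \\ -1 & 1 & 2 & -2 \\ 1 & -1 & -2 & 2 \end{bmatrix} + \frac{l}{6h}\begin{bmatrix} 2 & 1 & -1 & -2 \\ 1 & 2 & -2 & -1 \\ -1 & -2 & 2 & 1 \\ -2 & -1 & 1 & 2 \end{bmatrix}. \tag{3.13-259}$$
3. Advection matrix: the element expression (3.13-260) is not reproduced here; as noted above, it is carried along symbolically.
4. Gradient matrices: $C^e_{x_j} = -\int_e \psi\,(\partial\varphi_j/\partial x)$, $C^e_{y_j} = -\int_e \psi\,(\partial\varphi_j/\partial y)$,
$$C^e_x = \frac{h}{2}\begin{bmatrix} 1 \\ -1 \\ -1 \\ 1 \end{bmatrix} \quad\text{and}\quad C^e_y = \frac{l}{2}\begin{bmatrix} 1 \\ 1 \\ -1 \\ -1 \end{bmatrix}. \tag{3.13-261}$$
5. Divergence matrices (the transposes of the gradient matrices):
$$C^{eT}_x = \frac{h}{2}[1, -1, -1, 1], \qquad C^{eT}_y = \frac{l}{2}[1, 1, -1, -1]. \tag{3.13-262}$$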
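A few properties of these element matrices are easy to confirm by machine before any assembly is attempted. The sketch below is only an illustration under stated assumptions: the function name `q1q0_element_matrices` is hypothetical, exact rational arithmetic is used for convenience, and the local node ordering (lower-left, lower-right, upper-right, upper-left) is inferred from the signs in the gradient matrices and from the way global nodes are mapped to local rows in the text that follows.

```python
from fractions import Fraction


def q1q0_element_matrices(l, h):
    """Element matrices for one l x h Q1Q0 rectangle.

    Assumed local node order (inferred, not stated in the text):
    1 = lower-left, 2 = lower-right, 3 = upper-right, 4 = upper-left.
    Returns (M, ML, K, Cx, Cy): consistent mass, lumped mass, diffusion
    and the two gradient vectors for the single element pressure.
    """
    l, h = Fraction(l), Fraction(h)
    pattern = ([4, 2, 1, 2], [2, 4, 2, 1], [1, 2, 4, 2], [2, 1, 2, 4])
    M = [[l * h / 36 * m for m in row] for row in pattern]
    ML = [[l * h / 4 if i == j else Fraction(0) for j in range(4)]
          for i in range(4)]
    KX = ([2, -2, -1, 1], [-2, 2, 1, -1], [-1, 1, 2, -2], [1, -1, -2, 2])
    KY = ([2, 1, -1, -2], [1, 2, -2, -1], [-1, -2, 2, 1], [-2, -1, 1, 2])
    K = [[h / (6 * l) * KX[i][j] + l / (6 * h) * KY[i][j] for j in range(4)]
         for i in range(4)]
    Cx = [h / 2 * s for s in (1, -1, -1, 1)]
    Cy = [l / 2 * s for s in (1, 1, -1, -1)]
    return M, ML, K, Cx, Cy


if __name__ == "__main__":
    M, ML, K, Cx, Cy = q1q0_element_matrices(2, 3)
    # Row-sum lumping: each row of the consistent mass matrix sums to lh/4.
    print([sum(row) == ML[i][i] for i, row in enumerate(M)])
    # A constant field produces no diffusion and no discrete divergence.
    print([sum(row) for row in K], sum(Cx), sum(Cy))
```

The row-sum check shows that the lumped matrix is simply the consistent one with each row collapsed onto its diagonal, and the zero row sums of $K^e$, $C^e_x$ and $C^e_y$ show that a constant velocity generates neither diffusion nor divergence.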
c. Interior momentum equation
o Momentum equations at node 6. To form the global equations at any node, we first
determine, in turn, the individual contributions from each element containing that particular
(global) node; i.e., the 'standard' FEM assembly process. We will carefully (step-by-step)
generate the global x-momentum equation for node 6. After that, we will skip details
(which should be easy to fill in) and move more hastily toward our final goal. We now
use the element matrices to determine the contributions to each of the six terms from
each of the four elements sharing node 6, for the x-momentum equation (only):
1. Element 1. Global node 6 is local node 4 in element 1; thus, we use the fourth row of
each element matrix to give first,
$$M\dot u|_6 \sim \frac{lh}{36}(2\dot u_5 + \dot u_1 + 2\dot u_2 + 4\dot u_6),$$
where $\sim$ is to be read, 'the contribution to the global node in question'; it is not an
equation and should not be so written (even though some authors have presented
element-level contributions as element-level equations).
If lumped mass is employed,
$$M\dot u|_6 \sim \frac{lh}{4}\dot u_6.$$
Also,
$$Ku|_6 \sim \frac{\nu h}{6l}[2(u_6 - u_2) + (u_5 - u_1)] + \frac{\nu l}{6h}[2(u_6 - u_5) + (u_2 - u_1)],$$
and
$$N(u)u|_6 \sim \frac{lh}{4}\cdot{}^1\!A^u_6,$$
where we have introduced the following shortcut notation:
${}^1\!A^u_6$ = discrete (GFEM) approximation to $\mathbf{u}\cdot\nabla u$ at node 6 from element 1.
Next, $C_xP|_6 \sim \frac{h}{2}P_1$, where $C_xP|_6$ is the nodal contribution to node 6, and $P_1$ is the
pressure in element 1.
Finally,
$f_x|_6 \sim 0$ (we are not at a boundary), and
$$\alpha MT|_6 \sim \frac{\alpha lh}{36}(2T_5 + T_1 + 2T_2 + 4T_6) \quad\text{or, if lumped,}\quad \alpha MT|_6 \sim \frac{\alpha lh}{4}T_6.$$
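2. Element 2. Here, global node 6 is local node 1, and we get, from the first row of the
element matrices,
$$M\dot u|_6 \sim \frac{lh}{36}(4\dot u_6 + 2\dot u_2 + \dot u_3 + 2\dot u_7) \quad\text{or, if lumped,}\quad M\dot u|_6 \sim \frac{lh}{4}\dot u_6.$$
Also,
$$Ku|_6 \sim \frac{\nu h}{6l}[2(u_6 - u_2) + (u_7 - u_3)] + \frac{\nu l}{6h}[2(u_6 - u_7) + (u_2 - u_3)],$$
$$N(u)u|_6 \sim \frac{lh}{4}\cdot{}^2\!A^u_6,$$
$$C_xP|_6 \sim \frac{h}{2}P_2,$$
$$f_x|_6 \sim 0,$$
$$\alpha MT|_6 \sim \frac{\alpha lh}{36}(4T_6 + 2T_2 + T_3 + 2T_7) \quad\text{or, if lumped,}\quad \alpha MT|_6 \sim \frac{\alpha lh}{4}T_6.$$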
3. Element 4. Here, global node 6 is local node 3; the third row of the element matrices
gives
$$M\dot u|_6 \sim \frac{lh}{36}[\dot u_9 + 2\dot u_5 + 4\dot u_6 + 2\dot u_{10}] \quad\text{or, if lumped,}\quad M\dot u|_6 \sim \frac{lh}{4}\dot u_6.$$
Also,
$$Ku|_6 \sim \frac{\nu h}{6l}[(u_5 - u_9) + 2(u_6 - u_{10})] + \frac{\nu l}{6h}[2(u_6 - u_5) + (u_{10} - u_9)],$$
$$N(u)u|_6 \sim \frac{lh}{4}\cdot{}^4\!A^u_6,$$
$$C_xP|_6 \sim -\frac{h}{2}P_4,$$
$$f_x|_6 \sim 0,$$
$$\alpha MT|_6 \sim \frac{\alpha lh}{36}(T_9 + 2T_5 + 4T_6 + 2T_{10}) \quad\text{or, if lumped,}\quad \alpha MT|_6 \sim \frac{\alpha lh}{4}T_6.$$
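4. Element 5. Finally, global node 6 here corresponds to local node 2, giving
$$M\dot u|_6 \sim \frac{lh}{36}(2\dot u_{10} + 4\dot u_6 + 2\dot u_7 + \dot u_{11}) \quad\text{or, if lumped,}\quad M\dot u|_6 \sim \frac{lh}{4}\dot u_6.$$
Also,
$$Ku|_6 \sim \frac{\nu h}{6l}[2(u_6 - u_{10}) + (u_7 - u_{11})] + \frac{\nu l}{6h}[2(u_6 - u_7) + (u_{10} - u_{11})],$$
$$N(u)u|_6 \sim \frac{lh}{4}\cdot{}^5\!A^u_6,$$
$$C_xP|_6 \sim -\frac{h}{2}P_5,$$
$$f_x|_6 \sim 0,$$
$$\alpha MT|_6 \sim \frac{\alpha lh}{36}(2T_{10} + 4T_6 + 2T_7 + T_{11}) \quad\text{or, if lumped,}\quad \alpha MT|_6 \sim \frac{\alpha lh}{4}T_6.$$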
We are now ready to assemble the x-momentum equation for node 6. Collection of all
elemental contributions and placement into the x-momentum equation (3.13-253), yields,
for the simpler case of lumped mass,
$$lh\,\dot u_6 - \frac{\nu h}{6l}[(u_1 - 2u_5 + u_9) + 4(u_2 - 2u_6 + u_{10}) + (u_3 - 2u_7 + u_{11})] - \frac{\nu l}{6h}[(u_1 - 2u_2 + u_3) + 4(u_5 - 2u_6 + u_7) + (u_9 - 2u_{10} + u_{11})] + \frac{lh}{4}[{}^1\!A^u_6 + {}^2\!A^u_6 + {}^4\!A^u_6 + {}^5\!A^u_6] + \frac{h}{2}(P_1 - P_4 + P_2 - P_5) - \alpha lh\,T_6 = 0. \tag{3.13-263}$$
If we use consistent mass, then the terms in $\dot u_6$ and $T_6$ are more complicated; e.g., $lh\,\dot u_6$
would be replaced by
$$\frac{lh}{36}[(\dot u_1 + \dot u_3 + \dot u_9 + \dot u_{11}) + 4(\dot u_2 + \dot u_5 + \dot u_7 + \dot u_{10}) + 16\dot u_6],$$
with a similar expression replacing $\alpha lh\,T_6$. But in this example we must use lumped mass
in order to later invert M algebraically (analytically) and thus form explicitly the PPE.
Remark:
If $u_2$ and $u_3$ are specified (given), then their contributions to (3.13-263) would actually
wind up on the RHS as part of $f_x$ in (3.13-253), as would their time derivatives when
CM is employed.
To proceed more 'efficiently,' we now introduce some further shortcuts in terminology,
using the compass point notation (except for pressure) shown in Figure 3.13-21.
1. $A^u_i = \frac{1}{4}({}^1\!A^u_i + {}^2\!A^u_i + {}^3\!A^u_i + {}^4\!A^u_i)$, and represents $\mathbf{u}\cdot\nabla u$ at node $i$.
2. $D_{xx}u_i = \frac{1}{6}[(u_{SW} - 2u_S + u_{SE}) + 4(u_W - 2u_i + u_E) + (u_{NW} - 2u_N + u_{NE})]$, with an
obvious analog in the y-direction, $D_{yy}u_i$.
3. $D_x u_i = \frac{1}{12}[(u_{SE} - u_{SW}) + 4(u_E - u_W) + (u_{NE} - u_{NW})]$, and its obvious y-analog, $D_y u_i$.
We will also need boundary terms later, so we now add the following:
4. $A_E = \frac{1}{2}({}^2\!A_E + {}^3\!A_E)$, which represents $\mathbf{u}\cdot\nabla u$ at node E and is needed for the case
where node E lies on the domain boundary.
5. $D_x u_E = \frac{1}{6}[(u_{SE} - u_S) + 4(u_E - u_i) + (u_{NE} - u_N)]$.
6. $D_{yy}u_E = \frac{1}{3}[2(u_{SE} - 2u_E + u_{NE}) + (u_S - 2u_i + u_N)]$.
The x-momentum equation at node 6 now reads
$$lh\,\dot u_6 - \nu\left(\frac{h}{l}D_{xx} + \frac{l}{h}D_{yy}\right)u_6 + lh\,A^u_6 + \frac{h}{2}(P_1 - P_4 + P_2 - P_5) - \alpha lh\,T_6 = 0. \tag{3.13-264}$$
Since $lh = \int \varphi_6$ for this case, it is not surprising (since $\varphi_i$ is the general weighting
function) that we must divide by $lh$ to 'unweight' the weighted residual equation and
thus see the final version of the discrete momentum equation at node 6 that looks more
like the PDE (and a finite difference approximation thereto); namely,
$$\dot u_6 - \nu\left(\frac{D_{xx}}{l^2} + \frac{D_{yy}}{h^2}\right)u_6 + A^u_6 + \frac{1}{2}\left(\frac{P_1 - P_4}{l} + \frac{P_2 - P_5}{l}\right) - \alpha T_6 = 0. \tag{3.13-265}$$

Fig. 3.13-21 Another 4-patch.

The one-to-one correspondence of each term with the corresponding term in (3.13-249)
is now quite obvious. In fact, a Taylor series expansion about node 6 could be
performed, an exercise we leave to the reader. The results would show second-order
accurate approximations to each of the derivative terms; i.e., the $Q_1Q_0$ element generates
a second-order accurate (Taylor series) scheme on a uniform mesh of rectangles. (Of
course, the lazy presentation of the advection terms precludes such an analysis, but the
details could be worked out.)
As a final notational shortcut, we let $D_{xx}/l^2 + D_{yy}/h^2 = \nabla_h^2$, the bilinear, discrete
Laplacian operator, the expanded version of which can be seen in the previous chapter;
(2.3-24).
Clearly a similar analysis applies to the y-momentum equation at node 6, the final
result of which is
$$\dot v_6 - \nu\nabla_h^2 v_6 + A^v_6 + \frac{1}{2}\left(\frac{P_2 - P_1}{h} + \frac{P_5 - P_4}{h}\right) - \beta T_6 = 0. \tag{3.13-266}$$
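The hand assembly of the diffusion term at node 6 can be cross-checked by letting the computer do it. The following sketch makes several assumptions that are not stated explicitly in the text: the patch of Figure 3.13-20 is taken to have 16 nodes numbered bottom-to-top in columns starting from the right-hand boundary (nodes 1 to 4), elements are numbered likewise (1 to 3 in the right-hand column), and the local node order is lower-left, lower-right, upper-right, upper-left. These conventions are inferred from the local/global node pairings quoted above, and the helper names are hypothetical. The check compares the assembled row of $K$ with the coefficients read off the lumped-mass equation (3.13-263), taking $\nu = 1$.

```python
from fractions import Fraction

# Integer patterns of the two parts of the element diffusion matrix (3.13-259).
KX = ([2, -2, -1, 1], [-2, 2, 1, -1], [-1, 1, 2, -2], [1, -1, -2, 2])
KY = ([2, 1, -1, -2], [1, 2, -2, -1], [-1, -2, 2, 1], [-2, -1, 1, 2])


def element_nodes(e):
    """Global nodes (LL, LR, UR, UL) of element e = 1..9 (assumed numbering)."""
    col, row = divmod(e - 1, 3)          # col 0 is the right-hand column

    def node(c, r):
        return 4 * c + r + 1

    return [node(col + 1, row), node(col, row),
            node(col, row + 1), node(col + 1, row + 1)]


def assembled_diffusion_row(target, l, h):
    """Row of the assembled K (nu = 1) belonging to global node `target`."""
    l, h = Fraction(l), Fraction(h)
    row = {}
    for e in range(1, 10):
        nodes = element_nodes(e)
        if target not in nodes:
            continue
        i = nodes.index(target)
        for j, n in enumerate(nodes):
            k_ij = h / (6 * l) * KX[i][j] + l / (6 * h) * KY[i][j]
            row[n] = row.get(n, Fraction(0)) + k_ij
    return row


def stencil_row_node6(l, h):
    """Coefficients of -(h/l) Dxx - (l/h) Dyy at node 6, read off (3.13-263)."""
    l, h = Fraction(l), Fraction(h)
    row = {}
    x_lines = ((1, (1, 5, 9)), (4, (2, 6, 10)), (1, (3, 7, 11)))
    y_lines = ((1, (1, 2, 3)), (4, (5, 6, 7)), (1, (9, 10, 11)))
    for factor, lines in ((h / (6 * l), x_lines), (l / (6 * h), y_lines)):
        for weight, triple in lines:
            for n, s in zip(triple, (1, -2, 1)):
                row[n] = row.get(n, Fraction(0)) - factor * weight * s
    return row


if __name__ == "__main__":
    print(assembled_diffusion_row(6, 1, 1) == stencil_row_node6(1, 1))
```

For $l = h$ the assembled row reduces to the familiar bilinear nine-point pattern with $8/3$ at the center and $-1/3$ at each of the eight neighbors.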
d. Interior PPE
o Continuity equation ($\nabla\cdot\mathbf{u} = 0$) at element 5. Here we have an easier construction since,
for $Q_1Q_0$, each element directly yields the discrete equation of interest. From (3.13-262),
we have
$$C^{eT}_x \dot u + C^{eT}_y \dot v = 0;$$
i.e.,
$$\frac{h}{2}(\dot u_{10} - \dot u_6 + \dot u_{11} - \dot u_7) + \frac{l}{2}(\dot v_{10} + \dot v_6 - \dot v_7 - \dot v_{11}) = 0, \tag{3.13-267}$$
which is a second-order accurate approximation to $-\nabla\cdot\dot{\mathbf{u}} = 0$ at the centroid of element 5
in Figure 3.13-20; here we have employed the time-derivative of (3.13-262) to obtain
(3.13-267), a needed step in the derivation of the PPE.
o Pressure Poisson equation at element 5. While the $C^T M^{-1} C$ matrix is global and must
be generally so-constructed, it is easy to generate one row of this matrix (and one of the
set of discrete PPE's), when the mass matrix is lumped, by simply inserting the resulting
accelerations from the associated momentum equations into the corresponding
(element-level) continuity equation; and this we shall do after noting that we can do no such thing
when the more-accurate consistent mass matrix is employed, in which case the PPE (and
the 'Laplacian' matrix, $C^T M^{-1} C$) is both global and has a fully populated band structure
because $M^{-1}$ is dense; all nodes in the mesh are then involved in forming this elliptic
equation for pressure, both in the matrix and in the RHS vector. While it is, of course,
true that the $C^T$ operation, even if applied to a fully populated vector, will 'localize' the
result to one element at a time (corresponding to element-level divergence), it is also
true that the final (more local) result cannot be obtained until the fully populated global
vector is constructed first. Hence, we leave consistent mass to the computer (where it
probably belongs anyway, although we hasten to state that the actual implementation of
CM and the PPE method are generally not recommended) and focus on the lumped mass
approximation in the remainder of this discussion.
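Since the continuity equation for element 5 requires the acceleration at several nodes
in addition to that at node 6 presented thus far, we first write (by inspection, hopefully)
the remaining momentum equations:
$$\dot u_7 - \nu\nabla_h^2 u_7 + A^u_7 + \frac{1}{2}\left(\frac{P_3 - P_6}{l} + \frac{P_2 - P_5}{l}\right) - \alpha T_7 = 0, \tag{3.13-268}$$
$$\dot v_7 - \nu\nabla_h^2 v_7 + A^v_7 + \frac{1}{2}\left(\frac{P_3 - P_2}{h} + \frac{P_6 - P_5}{h}\right) - \beta T_7 = 0, \tag{3.13-269}$$
$$\dot u_{10} - \nu\nabla_h^2 u_{10} + A^u_{10} + \frac{1}{2}\left(\frac{P_4 - P_7}{l} + \frac{P_5 - P_8}{l}\right) - \alpha T_{10} = 0, \tag{3.13-270}$$
$$\dot v_{10} - \nu\nabla_h^2 v_{10} + A^v_{10} + \frac{1}{2}\left(\frac{P_5 - P_4}{h} + \frac{P_8 - P_7}{h}\right) - \beta T_{10} = 0, \tag{3.13-271}$$
and
$$\dot u_{11} - \nu\nabla_h^2 u_{11} + A^u_{11} + \frac{1}{2}\left(\frac{P_5 - P_8}{l} + \frac{P_6 - P_9}{l}\right) - \alpha T_{11} = 0, \tag{3.13-272}$$
$$\dot v_{11} - \nu\nabla_h^2 v_{11} + A^v_{11} + \frac{1}{2}\left(\frac{P_6 - P_5}{h} + \frac{P_9 - P_8}{h}\right) - \beta T_{11} = 0. \tag{3.13-273}$$
Next, we rewrite the continuity equation in a form that looks more like $\nabla\cdot\partial\mathbf{u}/\partial t = 0$
by dividing it by $(-lh)$:
$$\frac{1}{2}\left(\frac{\dot u_6 - \dot u_{10}}{l} + \frac{\dot u_7 - \dot u_{11}}{l}\right) + \frac{1}{2}\left(\frac{\dot v_7 - \dot v_6}{h} + \frac{\dot v_{11} - \dot v_{10}}{h}\right) = 0. \tag{3.13-274}$$
The eight momentum equations are now rearranged and combined (as needed) to give
$$\frac{\dot u_6 - \dot u_{10}}{l} = \nu\nabla_h^2\left(\frac{u_6 - u_{10}}{l}\right) - \left(\frac{A^u_6 - A^u_{10}}{l}\right) - \frac{1}{2l^2}[(P_1 - 2P_4 + P_7) + (P_2 - 2P_5 + P_8)] + \alpha\frac{(T_6 - T_{10})}{l}, \tag{3.13-275}$$
$$\frac{\dot u_7 - \dot u_{11}}{l} = \nu\nabla_h^2\left(\frac{u_7 - u_{11}}{l}\right) - \left(\frac{A^u_7 - A^u_{11}}{l}\right) - \frac{1}{2l^2}[(P_2 - 2P_5 + P_8) + (P_3 - 2P_6 + P_9)] + \alpha\frac{(T_7 - T_{11})}{l}, \tag{3.13-276}$$
$$\frac{\dot v_7 - \dot v_6}{h} = \nu\nabla_h^2\left(\frac{v_7 - v_6}{h}\right) - \left(\frac{A^v_7 - A^v_6}{h}\right) - \frac{1}{2h^2}[(P_1 - 2P_2 + P_3) + (P_4 - 2P_5 + P_6)] + \beta\frac{(T_7 - T_6)}{h}, \tag{3.13-277}$$
and
$$\frac{\dot v_{11} - \dot v_{10}}{h} = \nu\nabla_h^2\left(\frac{v_{11} - v_{10}}{h}\right) - \left(\frac{A^v_{11} - A^v_{10}}{h}\right) - \frac{1}{2h^2}[(P_4 - 2P_5 + P_6) + (P_7 - 2P_8 + P_9)] + \beta\frac{(T_{11} - T_{10})}{h}. \tag{3.13-278}$$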
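Finally, inserting these results into (3.13-274) and rearranging yields the CPPE for
element 5:
$$\frac{1}{4l^2}[(P_1 - 2P_4 + P_7) + 2(P_2 - 2P_5 + P_8) + (P_3 - 2P_6 + P_9)] + \frac{1}{4h^2}[(P_1 - 2P_2 + P_3) + 2(P_4 - 2P_5 + P_6) + (P_7 - 2P_8 + P_9)]$$
$$= -\frac{1}{2}\left[\left(\frac{A^u_6 - A^u_{10}}{l} + \frac{A^u_7 - A^u_{11}}{l}\right) + \left(\frac{A^v_7 - A^v_6}{h} + \frac{A^v_{11} - A^v_{10}}{h}\right)\right] + \frac{\alpha}{2}\left(\frac{T_6 - T_{10}}{l} + \frac{T_7 - T_{11}}{l}\right) + \frac{\beta}{2}\left(\frac{T_7 - T_6}{h} + \frac{T_{11} - T_{10}}{h}\right)$$
$$\quad + \frac{\nu}{2}\nabla_h^2\left[\left(\frac{u_6 - u_{10}}{l} + \frac{u_7 - u_{11}}{l}\right) + \left(\frac{v_7 - v_6}{h} + \frac{v_{11} - v_{10}}{h}\right)\right]. \tag{3.13-279}$$
[Multiplication of the LHS by $-lh$ gives $C^T M^{-1} C P$; i.e., $(C^T M^{-1} C)P \approx lh(-\nabla^2 P)$; $lh$
is the pressure mass matrix, $Q$.] Hopefully, the term-by-term identification is obvious; for
$l, h \to 0$, it gives
$$\frac{\partial^2 P}{\partial x^2} + \frac{\partial^2 P}{\partial y^2} = -\frac{\partial}{\partial x}(\mathbf{u}\cdot\nabla u) - \frac{\partial}{\partial y}(\mathbf{u}\cdot\nabla v) + \alpha\frac{\partial T}{\partial x} + \beta\frac{\partial T}{\partial y} + \nu\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) + O(l^2, h^2). \tag{3.13-280}$$
Clearly, if $\nabla\cdot\mathbf{u} = 0$, which is the $l, h \to 0$ limit of the continuity equation, then we
recover (3.13-252), the desired PPE (at the centroid of element 5, to be precise).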
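Remarks:
(1) It may not be obvious, but we have just obtained, in (3.13-279), one row of the
$C^T M^{-1} C$ matrix of (3.13-256), whose nine-element stencil, per Figure 3.13-20, we
elucidate below, using $C^T M^{-1} C = C_x^T M^{-1} C_x + C_y^T M^{-1} C_y$:
$$\frac{1}{lh}C_x^T M_L^{-1} C_x = \frac{1}{4l^2}\begin{bmatrix} -1 & 2 & -1 \\ -2 & 4 & -2 \\ -1 & 2 & -1 \end{bmatrix} \tag{3.13-281}$$
and
$$\frac{1}{lh}C_y^T M_L^{-1} C_y = \frac{1}{4h^2}\begin{bmatrix} -1 & -2 & -1 \\ 2 & 4 & 2 \\ -1 & -2 & -1 \end{bmatrix}, \tag{3.13-282}$$
which clearly approximate $-\partial^2/\partial x^2$ and $-\partial^2/\partial y^2$, respectively; and their sum is
$(1/lh)\,C^T M^{-1} C \approx -\nabla^2$.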
(2) If $l = h$, the Laplacian is seen to simplify to
$$\frac{1}{2l^2}(P_1 + P_3 + P_7 + P_9 - 4P_5) \tag{3.13-283}$$
rather than to the simplest finite difference stencil,
$$\frac{1}{l^2}(P_2 + P_4 + P_6 + P_8 - 4P_5); \tag{3.13-284}$$
i.e., it is a 'rotated Laplacian' that involves the 'corner' elements rather than the
nearest neighbors in the x- and y-directions. Although both are second-order accurate,
the GFEM stencil here displays its CB-(checkerboard) pressure mode in the simplest
way; i.e., the above stencil clearly annihilates a pressure that takes the value +1 on
red elements and -1 on black ones.
(3) If the elements are $l \times h$ rectangles, then the following stencil describes
$-Q^{-1}C^T M^{-1} C \approx \nabla^2$ for the central element of a 9-patch:
$$\frac{1}{4}\begin{bmatrix} 1/l^2 + 1/h^2 & 2(1/h^2 - 1/l^2) & 1/l^2 + 1/h^2 \\ 2(1/l^2 - 1/h^2) & -4(1/l^2 + 1/h^2) & 2(1/l^2 - 1/h^2) \\ 1/l^2 + 1/h^2 & 2(1/h^2 - 1/l^2) & 1/l^2 + 1/h^2 \end{bmatrix}$$
(4) If CM had been used, then all we need realize/recall [see then (2.3-26) in Chapter 2]
is that, for example, $M\dot u = M_L\dot u + lh \cdot O(l^2, h^2)$, where now $M$ refers to CM and
$M_L$ to LM, to 'see' the same consistent results from consistent mass for $l, h \to 0$.
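Remarks (1) through (3) can be verified mechanically by forming one row of $C^T M_L^{-1} C$ on the nine-element patch. The sketch below is illustrative only. It assumes the same node and element numbering as inferred from the text (nodes 1 to 4 on the right-hand boundary, element 5 in the center with nodes 10, 6, 7, 11), the local order lower-left, lower-right, upper-right, upper-left, and row-sum lumping of the assembled mass matrix; the function names are hypothetical. Only the row for element 5 is meaningful for the comparison, because its four nodes are interior and carry the full lumped mass $lh$.

```python
from fractions import Fraction


def element_nodes(e):
    """Global nodes (LL, LR, UR, UL) of element e = 1..9 (assumed numbering)."""
    col, row = divmod(e - 1, 3)          # col 0 is the right-hand column

    def node(c, r):
        return 4 * c + r + 1

    return [node(col + 1, row), node(col, row),
            node(col, row + 1), node(col + 1, row + 1)]


def ppe_row(elem, l, h):
    """Row `elem` of C^T M_L^{-1} C on the 9-element patch (lumped mass)."""
    l, h = Fraction(l), Fraction(h)
    sx, sy = (1, -1, -1, 1), (1, 1, -1, -1)    # sign patterns of (3.13-261)
    Cx, Cy, mass = {}, {}, {}
    for e in range(1, 10):
        for k, n in enumerate(element_nodes(e)):
            Cx[(n, e)] = h / 2 * sx[k]
            Cy[(n, e)] = l / 2 * sy[k]
            mass[n] = mass.get(n, Fraction(0)) + l * h / 4
    row = {}
    for n in element_nodes(elem):
        for e in range(1, 10):
            if (n, e) in Cx:
                term = Cx[(n, elem)] * Cx[(n, e)] + Cy[(n, elem)] * Cy[(n, e)]
                row[e] = row.get(e, Fraction(0)) + term / mass[n]
    return row


def laplacian_stencil(l, h):
    """Stencil of remark (3) for the central element, keyed by element number."""
    l, h = Fraction(l), Fraction(h)
    a, b = 1 / l**2, 1 / h**2
    stencil = {5: -(a + b)}
    for e in (1, 3, 7, 9):               # corner elements
        stencil[e] = (a + b) / 4
    for e in (2, 8):                     # neighbors in the x-direction
        stencil[e] = (a - b) / 2
    for e in (4, 6):                     # neighbors in the y-direction
        stencil[e] = (b - a) / 2
    return stencil


def checkerboard(e):
    """+1 / -1 coloring of the nine elements."""
    col, row = divmod(e - 1, 3)
    return 1 if (col + row) % 2 == 0 else -1


if __name__ == "__main__":
    row = ppe_row(5, 1, 1)
    print({e: str(v) for e, v in sorted(row.items())})
    print(sum(v * checkerboard(e) for e, v in row.items()))
```

For $l = h$ the four nearest-neighbor entries vanish and only the corners and the center survive, which is the rotated Laplacian of (3.13-283); the final sum shows that the row annihilates the checkerboard pressure.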
So much for interior nodes; we now move on to the more interesting case of boundary
nodes and BC's.
e. Boundary momentum equations/NBC's
o Momentum equations at a typical boundary node. Since there are no momentum
equations at boundary nodes when the velocities are specified, we are naturally interested
here in other BC's and related equations. Thus, assume that node 2 is on a boundary for
which both $f_n$ and $f_\tau$ have been specified. While we are ultimately more interested in the
PPE at the boundaries (and its built-in BC's, to be discovered), the momentum equations
are an important necessary step along the way. Besides, the analysis will shed some light
on the often somewhat mysterious NBC's. So let us return to the element matrices and
construct the x- and y-momentum equations at node 2—first on the assumption that no
(explicit) BC's are applied on this surface.
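x-momentum equation.
1. Element 1. Global node 2 is local node 3, and we get (LM approximation), from the
same element matrices,
$$M\dot u|_2 \sim \frac{lh}{4}\dot u_2,$$
$$Ku|_2 \sim \frac{\nu h}{6l}[2(u_2 - u_6) + (u_1 - u_5)] + \frac{\nu l}{6h}[2(u_2 - u_1) + (u_6 - u_5)],$$
$$N(u)u|_2 \sim \frac{lh}{4}\cdot{}^1\!A^u_2,$$
$$C_xP|_2 \sim -\frac{h}{2}P_1,$$
$$f_x|_2 \sim \int \varphi_2 f_n\,dy = \frac{h}{2}f_n^{(1)},$$
$$\alpha MT|_2 \sim \frac{\alpha lh}{4}T_2.$$
2. Element 2. Global node 2 is also local node 2, and we get
$$M\dot u|_2 \sim \frac{lh}{4}\dot u_2,$$
$$Ku|_2 \sim \frac{\nu h}{6l}[2(u_2 - u_6) + (u_3 - u_7)] + \frac{\nu l}{6h}[2(u_2 - u_3) + (u_6 - u_7)],$$
$$N(u)u|_2 \sim \frac{lh}{4}\cdot{}^2\!A^u_2,$$
$$C_xP|_2 \sim -\frac{h}{2}P_2,$$
$$f_x|_2 \sim \int \varphi_2 f_n\,dy = \frac{h}{2}f_n^{(2)},$$
$$\alpha MT|_2 \sim \frac{\alpha lh}{4}T_2.$$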
The equation for $u_2$ is now 'fully summed,' and by combining the contributions to the
momentum equation at node 2 from elements 1 and 2, we obtain, from (3.13-253), an
NBC in the guise of an ODE for $u_2$:
$$\frac{lh}{2}\dot u_2 + \frac{\nu h}{6l}[(u_1 - u_5) + 4(u_2 - u_6) + (u_3 - u_7)] - \frac{\nu l}{6h}[2(u_1 - 2u_2 + u_3) + (u_5 - 2u_6 + u_7)] + \frac{lh}{4}({}^1\!A^u_2 + {}^2\!A^u_2) - \frac{h}{2}(P_1 + P_2) = \frac{h}{2}[f_n^{(1)} + f_n^{(2)}] + \alpha\frac{lh}{2}T_2, \tag{3.13-285}$$
where it is interesting and important to note that there are no pressure differences in the
equation; clearly the approximate equation does not contain the x-component of $\nabla P$. But
what does it contain? Recalling and returning to the shortcut notation introduced earlier,
we also (logically but, as it turns out, naively) divide the equation by $\int \varphi_2 = lh/2$ and,
defining $f_{n2} = \frac{1}{2}[f_n^{(1)} + f_n^{(2)}]$, obtain
$$\dot u_2 + 2\nu\frac{D_x}{l^2}u_2 - \nu\frac{D_{yy}}{h^2}u_2 + \frac{1}{2}({}^1\!A^u_2 + {}^2\!A^u_2) - \frac{P_1 + P_2}{l} = \frac{2f_{n2}}{l} + \alpha T_2, \tag{3.13-286}$$
the interpretation of which as a 'momentum equation' is not at all obvious; it is even
suspicious-looking: it sure doesn't look like a momentum equation! And this is true. To
see what it does represent, and what the GFEM is telling us, we multiply by $l/2$ and
regroup:
$$\frac{l}{2}\left(\dot u_2 - \nu\frac{D_{yy}}{h^2}u_2 + A^u_2 - \alpha T_2\right) + \left(\nu\frac{D_x u_2}{l} - \frac{P_1 + P_2}{2}\right) = f_{n2}. \tag{3.13-287}$$
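The terms within the first parentheses, for $l, h \to 0$, correspond (at node 2) to
$$\left(\frac{\partial u}{\partial t} - \nu\frac{\partial^2 u}{\partial y^2} + \mathbf{u}\cdot\nabla u - \alpha T\right) + O(l, h),$$
and those in the second parentheses to
$$\nu\left(\frac{\partial u}{\partial x} - \frac{l}{2}\frac{\partial^2 u}{\partial x^2}\right) - \left(P - \frac{l}{2}\frac{\partial P}{\partial x}\right) + O(l^2, h^2),$$
so that the equation, for small $l, h$, becomes
$$\frac{l}{2}\left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u - \nu\nabla^2 u - \alpha T + \frac{\partial P}{\partial x}\right) + \left(\nu\frac{\partial u}{\partial x} - P\right) = f_n + O(l^2, lh, h^2). \tag{3.13-288}$$
But the coefficient of $l/2$ is zero, since the x-momentum equation is (we assume) satisfied.
Thus, the alleged momentum equation for node 2 actually converges to
$$\nu\frac{\partial u}{\partial x} - P = f_n, \tag{3.13-289}$$
the natural BC for the x-momentum equation at $\Gamma$, and it does so to second-order
accuracy in the Taylor series sense even though simple one-sided derivatives are present.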
Remarks:
(1) It was the assumption of x-momentum equation satisfaction that led to a second-order
accurate Taylor series result. This assumption is, of course, not necessary; i.e.,
the final result is still $\nu(\partial u/\partial x) - P = f_n$, the (pseudo)-traction NBC.
(2) This is, in fact, the natural BC associated with the terms $\nu\nabla^2 u - \partial P/\partial x$ in the
x-momentum equation that were integrated by parts to obtain the weak formulation.
(3) The initial GFEM equation for node 2 was first divided by $hl/2$ ($= \int \varphi_2$) and then
multiplied by $l/2$ to obtain the final equation. It turns out that the net result,
division by $h$, is proper since the equation is (properly) dominated by the boundary
terms, and $h = \int_\Gamma \varphi_2$. Henceforth, we will account for such boundary behavior in a
more satisfactory way, the above being presented as it was for (hopefully) tutorial
purposes.
(4) If we had begun the analysis with the stress-divergence form of the NS equations,
the final result would be only slightly different; i.e., the NBC is then
$$2\nu\frac{\partial u}{\partial x} - P = f_n, \tag{3.13-290}$$
which small difference promotes $f_n$ to the status of a true boundary normal force (or
traction) per unit area; i.e., $2\mu\,\partial u/\partial x$ is the normal component of the viscous stress
at the boundary, and $f_n$ is the normal force applied to the fluid at the boundary.
Only in this way could we actually apply a force to the fluid, a result that is often
more prominent and important in solid mechanics than fluid mechanics, because in
the latter the $f_n$ NBC is usually (but not always; e.g., for free surface flows) a
convenience at an (artificial) outflow 'boundary.' The NBC then corresponds to the
terms
$$\nu\left(\nabla^2 u + \frac{\partial^2 u}{\partial x^2}\right) - \frac{\partial P}{\partial x}$$
of the stress-divergence form.
(5) The non-dimensional form of the original ('$\nabla^2$-form') NBC can be written
$$\frac{1}{Re}\frac{\partial u_n}{\partial n} - P = f_n, \tag{3.13-291}$$
which can be (properly) interpreted to suggest that $P = -f_n$ for large Re
simulations; i.e., this BC approximates a prescribed pressure and it does so too when the
stress-divergence form is utilized. (In actual fact, however, numerical results often
give $P \approx -f_n$ even for low Re simulations.)
(6) The original discrete $\dot u_2$ equation (3.13-285) is still the one 'seen' by the computer;
it is actually an ODE that approximates the PDE's NBC.(!)
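The cancellation that produces second-order accuracy in the normal NBC can be made concrete with a manufactured field. The sketch below is an illustration under loudly stated assumptions, not part of the original analysis: the flow is taken to be steady Stokes flow (no time derivative, no advection, $\alpha = 0$), the boundary is the line $x = 0$, and the hypothetical fields $u = x^2 + y^2$, $P = 4\nu x + c$ are chosen only because they satisfy $\partial P/\partial x = \nu\nabla^2 u$ exactly. The function name `nbc_residual` and the node height `y0` are likewise arbitrary. The one-sided terms alone miss $f_n$ by $\nu l$, whereas the full regrouped equation (3.13-287) reproduces $f_n = \nu\,\partial u/\partial x - P$ exactly for this quadratic field.

```python
from fractions import Fraction


def nbc_residual(l, h, nu, c):
    """Errors in f_n at boundary node 2 for a manufactured steady Stokes field.

    Assumed fields (illustrative only): u = x^2 + y^2, P = 4*nu*x + c, with
    the boundary at x = 0.  Returns (error of the one-sided terms alone,
    error of the full regrouped equation (3.13-287)).
    """
    l, h, nu, c = (Fraction(v) for v in (l, h, nu, c))

    def u(x, y):
        return x * x + y * y

    def P(x):
        return 4 * nu * x + c

    y0 = Fraction(1)                     # arbitrary height of node 2
    # boundary column (nodes 1, 2, 3) and first interior column (nodes 5, 6, 7)
    u1, u2, u3 = u(0, y0 - h), u(0, y0), u(0, y0 + h)
    u5, u6, u7 = u(-l, y0 - h), u(-l, y0), u(-l, y0 + h)
    Dx = Fraction(1, 6) * ((u1 - u5) + 4 * (u2 - u6) + (u3 - u7))
    Dyy = Fraction(1, 3) * (2 * (u1 - 2 * u2 + u3) + (u5 - 2 * u6 + u7))
    P_avg = P(-l / 2)                    # P1 = P2 (centroid values; P has no y-dependence)
    f_n = nu * 0 - P(Fraction(0))        # nu du/dx - P evaluated at x = 0
    one_sided = nu * Dx / l - P_avg
    full = l / 2 * (-nu * Dyy / h**2) + one_sided
    return one_sided - f_n, full - f_n


if __name__ == "__main__":
    for l in (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)):
        print(l, nbc_residual(l, Fraction(1, 3), 2, 5))
```

The first entry of each printed pair shrinks only linearly with $l$, while the second vanishes, which is the role played by the $l/2$ multiple of the momentum equation in (3.13-288).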
y-momentum equation.
By repeating the analogous steps as for $\dot u_2$, the 'momentum' equation (ODE, actually)
for $\dot v_2$ can be obtained:
$$\frac{lh}{2}\dot v_2 + \frac{\nu h}{6l}[(v_1 - v_5) + 4(v_2 - v_6) + (v_3 - v_7)] - \frac{\nu l}{6h}[2(v_1 - 2v_2 + v_3) + (v_5 - 2v_6 + v_7)] + \frac{lh}{4}({}^1\!A^v_2 + {}^2\!A^v_2) + \frac{l}{2}(P_2 - P_1) - \beta\frac{lh}{2}T_2 = \int_{\text{el. 1}} \varphi_2 f_\tau\,dy + \int_{\text{el. 2}} \varphi_2 f_\tau\,dy = h f_{\tau 2}. \tag{3.13-292}$$
This time, suspecting that a boundary equation will survive at the end of the day, we
divide by $h = \int_\Gamma \varphi_2$ and rearrange to get
$$\frac{l}{2}\left(\dot v_2 - \nu\frac{D_{yy}}{h^2}v_2 + A^v_2 + \frac{P_2 - P_1}{h} - \beta T_2\right) + \nu\frac{D_x}{l}v_2 = f_{\tau 2}. \tag{3.13-293}$$
Again, while each of the terms multiplied by $l/2$ converges to a well-defined quantity,
the key observation is that this whole term vanishes when $l, h \to 0$ because it represents
(after combining it with the $l/2$ term from a Taylor series expansion of $D_x v_2/l$) the
y-momentum equation, and we are simply left with
$$\nu\frac{\partial v}{\partial x} = f_\tau + O(l^2, lh, h^2); \tag{3.13-294}$$
i.e., a (Taylor series) second-order accurate approximation to $\nu\,\partial v/\partial x = f_\tau$.
Remarks:
(1) Again, as for the normal momentum equation, this NBC does not generally represent
a force balance. If the stress-divergence form of the momentum equations had been
used, then the corresponding NBC would turn out to be
$$\nu\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) = f_\tau, \tag{3.13-295}$$
which is a true force (per unit area) balance: a shear stress balance on the boundary.
So only here does $f_\tau$ represent a true traction.
(2) The simple original form presented, $\nu\,\partial v/\partial x = f_\tau$, corresponding to the $\nabla^2$-form of
the equations is actually more useful for the purpose of a good 'outflow' BC via
setting $f_\tau$ to zero: it usually causes minimal perturbation to the flow as it leaves
the domain, and this is often the name of the game (or at least part of it) in CFD.
As a final remark on the implementation of non-zero values for $f_n$ and $f_\tau$, we have
found (for the $Q_1Q_0$ element) it better to actually employ piecewise-constant (average)
values at the (line) centroid of each element when actually computing $\int_\Gamma \varphi f_n$ and $\int_\Gamma \varphi f_\tau$.
The use of this seemingly lower-order-than-necessary approximation (i.e., why not use
linear interpolation?) is related to two items:
1. In the normal momentum equation, it is a better match to the pressure, which is a
centroid quantity and usually more important than the viscous term.
2. In the tangential momentum equation, we merely state the following experimental fact:
even though $f_\tau$ is linear (in y, say) for steady Poiseuille flow, the GFEM can only obtain
the nodally exact solution on a mesh composed of rectangular elements of varying size
if the discontinuous, stepwise-changing approximation is employed.
o NBC's in general. Figure 3.13-22 helps to relate x- and y-directions to (local) normal
and tangential directions, and shows how the NBC's are stated in both $n$-$\tau$ and $x$-$y$
coordinate systems, limited for convenience and simplicity to straight boundaries aligned
with the $x$-$y$ coordinate system. Note that $f_n > 0$ is an applied (normal) tensile stress,
and $f_n < 0$ is an applied (normal) compressive stress.
o Momentum equations at a corner node. Continuing on, it is of interest to inquire,
'Just what "boundary equations" are contained in the momentum equations at a node in
the corner of a computational domain?' So we focus now on node 1 of Figure 3.13-20
and suppose that nodes 13-9-5-1 form a bottom boundary, just as nodes 1-2-3-4 form a
side boundary. The first thing we observe is the following: since Dirichlet data (specified
velocity, essential BC) 'prevails,' we can only write momentum equations at node 1 if both
boundaries (bottom and side) are 'free,' or 'natural,' or 'flow-through' (i.e., open)—so
this is the case of interest.
Here we are dealing with only one element, and we can easily construct the 'global
momentum equation.' Thus, from row two of the element matrices, we obtain directly
the equation for global node 1 — where it will suffice to consider just one of the two
equations: i.e., either one will show us all there is to learn about such a corner node. So,
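considering the x-equation, we have
$$\begin{aligned}
&\frac{lh}{4}\dot u_1 + \frac{\nu h}{6l}\left[2(u_1 - u_5) + (u_2 - u_6)\right] - \frac{\nu l}{6h}\left[2(u_2 - u_1) + (u_6 - u_5)\right] + \frac{lh}{4}A^u_1 - \frac{h}{2}P_1 - \frac{\alpha lh}{4}T_1 \\
&\qquad = \int_\Gamma \varphi_1 f_x = \int_{\text{Node 1}}^{\text{Node 2}} \varphi_1 f_n\,dy + \int_{\text{Node 5}}^{\text{Node 1}} \varphi_1 f_\tau\,dx,
\end{aligned} \tag{3.13-296}$$
where $f_n$ is the applied normal 'force' on the vertical boundary and $f_\tau$ the applied
tangential 'force' on the horizontal boundary. For the reasons discussed above, we represent these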
Fig. 3.13-22 Natural boundary conditions and force balances; for the conventional ($\nabla^2$)
formulation, omit the underlined terms and all factors of 2, to obtain pseudo-tractions.
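applied (pseudo-) 'forces' by their average values, so that the RHS becomes
$$\frac{h}{2}\bar f_n + \frac{l}{2}\bar f_\tau.$$
To obtain the 'limit equation' for this case, we might divide the equation by $\int_\Gamma \varphi_1 = (l + h)/2$.
But the result would be awkward. So we merely multiply by two and rearrange
it to
$$\frac{lh}{2}\left(\dot u_1 + A^u_1 - \alpha T_1\right) + h\left\{\frac{\nu}{3l}\left[2(u_1 - u_5) + (u_2 - u_6)\right] - P_1 - \bar f_n\right\} - l\left\{\frac{\nu}{3h}\left[2(u_2 - u_1) + (u_6 - u_5)\right] + \bar f_\tau\right\} = 0, \tag{3.13-297}$$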
where it is noteworthy that this is the 'x-equation' when $u_1$ is 'free,' regardless of the BC
employed (essential or natural) in the y-direction. Clearly, the first term 'goes away' (it
is higher-order) as $l, h \to 0$. So in the limit (or near-limit) for $l \ne h$, we obtain a special
case of the following general result:
$$h\left[f_x - n_x\left(\nu\frac{\partial u}{\partial x} - P\right)\right] + l\left[f_x - n_y\,\nu\frac{\partial u}{\partial y}\right] = 0 \quad \text{in the x-direction.} \tag{3.13-298}$$
Similarly, from the nodal y-momentum equation (not presented), we obtain
$$h\left[f_y - n_x\,\nu\frac{\partial v}{\partial x}\right] + l\left[f_y - n_y\left(\nu\frac{\partial v}{\partial y} - P\right)\right] = 0 \quad \text{in the y-direction.} \tag{3.13-299}$$
(In the case under consideration, $n_x = 1$ and $n_y = -1$.) These equations can be interpreted
simply and properly as follows: the GFEM 'converts' the momentum equations at a corner
node when NBC's are applied there to 'surface force' balances on the corner element. For
example, (3.13-298) says: $h \times$ (x-component of net normal stress) $+\ l \times$ (x-component of
net tangential stress) $= 0$, à la $\sum F_x = 0$ from your first course in 'statics.'
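The force-balance reading of (3.13-298) and (3.13-299) can be checked numerically. The following is a minimal sketch, assuming the balances are read as $h[f_x - n_x(\nu\,\partial u/\partial x - P)] + l[f_x - n_y\nu\,\partial u/\partial y] = 0$ and its y-counterpart; the function names and the split of the applied 'forces' into a vertical-face pair and a horizontal-face pair are illustrative choices, not notation from the text.

```python
def corner_force_balance(h, l, nx, ny, nu, P, grad_u, f_vertical, f_horizontal):
    """Residuals (rx, ry) of the corner pseudo-force balances.

    grad_u       = ((du/dx, du/dy), (dv/dx, dv/dy)) evaluated at the corner.
    f_vertical   = (fx, fy) applied on the vertical face (length h, normal nx).
    f_horizontal = (fx, fy) applied on the horizontal face (length l, normal ny).
    """
    (ux, uy), (vx, vy) = grad_u
    fx_v, fy_v = f_vertical
    fx_h, fy_h = f_horizontal
    # x-direction: h * (net normal stress) + l * (net tangential stress)
    rx = h * (fx_v - nx * (nu * ux - P)) + l * (fx_h - ny * nu * uy)
    # y-direction: h * (net tangential stress) + l * (net normal stress)
    ry = h * (fy_v - nx * nu * vx) + l * (fy_h - ny * (nu * vy - P))
    return rx, ry


def consistent_tractions(nx, ny, nu, P, grad_u):
    """Pseudo-tractions that exactly balance a given corner state (hypothetical helper)."""
    (ux, uy), (vx, vy) = grad_u
    f_vertical = (nx * (nu * ux - P), nx * nu * vx)
    f_horizontal = (ny * nu * uy, ny * (nu * vy - P))
    return f_vertical, f_horizontal


# Corner with n_x = 1, n_y = -1, as in the case under consideration.
grad_u = ((0.3, -0.1), (0.2, -0.3))
f_v, f_h = consistent_tractions(1.0, -1.0, 0.01, 2.5, grad_u)
rx, ry = corner_force_balance(0.1, 0.2, 1.0, -1.0, 0.01, 2.5, grad_u, f_v, f_h)
```

With tractions built from the same velocity gradient and pressure, both residuals vanish; any mismatch between applied and internal (pseudo-)stress shows up as a nonzero residual weighted by the face lengths $h$ and $l$.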
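Remark:
A true force balance, of course, is
$$h\left[f_x - n_x\left(2\nu\frac{\partial u}{\partial x} - P\right)\right] + l\left[f_x - n_y\,\nu\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right)\right] = 0 \tag{3.13-300}$$
and
$$h\left[f_y - n_x\,\nu\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right)\right] + l\left[f_y - n_y\left(2\nu\frac{\partial v}{\partial y} - P\right)\right] = 0, \tag{3.13-301}$$
which is only obtained if the stress-divergence form of the NS equations is used. The result
presented is the proper pseudo-force balance that is appropriate to the $\nabla^2\mathbf{u}$ (conventional)
form of the equations.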
This concludes our elaboration of the momentum equations—for now. Now we shift
back to the PPE.
f. The PPE at boundaries.
o Boundary conditions for the PPE. Consider the PPE for element 2 in Figure 3.13-20,
which is a typical boundary element. It is of interest to determine just what BC's are
built into the PPE when any of the following velocity BC's are imposed (on the vertical
boundary for this case—generalization is straightforward):
(a) Specified $u$, $v$: Dirichlet BC's (e.g., solid wall, or specified inflow).
(b) Specified $f_n$, $v$: sometimes useful for inflow or outflow.
(c) Specified $f_n$, $f_\tau$: the general NBC, often useful for outflow.
(d) Specified $u$, $f_\tau$: most commonly used (with $u = 0$ and $f_\tau = 0$) for a symmetry BC.
The common starting point, of course, is the continuity equation for element 2:
$$\frac{h}{2}(\dot u_6 - \dot u_2 + \dot u_7 - \dot u_3) + \frac{l}{2}(\dot v_2 - \dot v_3 + \dot v_6 - \dot v_7) = 0, \tag{3.13-302}$$
where the values of the component accelerations at nodes 6 and 7 have already been
presented.
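The structure of this element-level continuity equation is easy to reproduce. The sketch below evaluates the integral of the divergence over one $l \times h$ bilinear rectangle from its four corner values; the corner labels ('sw', 'se', 'nw', 'ne') and the function name are assumptions introduced here for illustration, and the result agrees with (3.13-302) only up to the overall sign and the node numbering of Figure 3.13-20.

```python
def element_divergence(l, h, u, v):
    """Integral of div(u) over an l-by-h bilinear (Q1) rectangle.

    u, v: dicts of nodal values keyed by corner: 'sw', 'se', 'nw', 'ne'.
    """
    # integral of du/dx: h times the mean east-minus-west difference
    du = (u["se"] - u["sw"]) + (u["ne"] - u["nw"])
    # integral of dv/dy: l times the mean north-minus-south difference
    dv = (v["nw"] - v["sw"]) + (v["ne"] - v["se"])
    return 0.5 * h * du + 0.5 * l * dv


def sample(field, l, h):
    """Sample a function field(x, y) at the four corners (hypothetical helper)."""
    return {"sw": field(0.0, 0.0), "se": field(l, 0.0),
            "nw": field(0.0, h), "ne": field(l, h)}


l, h, a, b, c = 0.5, 0.25, 2.0, 3.0, -1.0
# u = a*x + b*y, v = c*x - a*y is divergence-free
u = sample(lambda x, y: a * x + b * y, l, h)
v = sample(lambda x, y: c * x - a * y, l, h)
residual = element_divergence(l, h, u, v)
```

The same function applies unchanged to nodal accelerations, which is how the continuity constraint is used in the derivations of this section.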
We now examine each case in turn, since the same continuity equation will 'generate'
different pressure equations:
Specified u, v.
This is the simplest case, since all needed information is already available. In
anticipation of the final result (a boundary equation, recall), we divide the continuity equation
by $h$ before substituting the accelerations: thus,
$$\frac{1}{2}(\dot u_6 - \dot u_2 + \dot u_7 - \dot u_3) + \frac{l}{2h}(\dot v_2 - \dot v_3 + \dot v_6 - \dot v_7) = 0, \tag{3.13-303}$$
into which we insert the previous $\dot u$ and $\dot v$ equations for nodes 6 and 7, (3.13-265) through
(3.13-269). There are no equations for $\dot u$ and $\dot v$ at nodes 2 and 3; they lie on a Dirichlet
boundary and therefore $\dot u_2$, $\dot v_2$, $\dot u_3$, and $\dot v_3$ are given 'data' and thus show up in $g$ in
(3.13-256). The result is
1
1
2vVi(u6 + «7) - -{Al + A") - -
1 fP\ ~ Pa + Pi - P5 , P3-P6 + P2- P5
I
+
+ t(76 + T7) - -(«2 + u3) + ^l(v6 ~ t>i) -—(Al- Av7)
2 2 2n 2h
I (P2-P\+P5-P4
4h\ h
PI I . .
+ ^-(T6-T7)+— (v2-v3) = 0,
2h 2h
P2 + Pe
h
(3.13-304)
which is a mess. To proceed, we transpose all but the terms in Pt/l to the RHS, change
the sign, and regroup to obtain
(/>, -P4) + 2(P2-P5) + (P3-P6)
4/
2 / U6 + W7 \
= VH 2 J"
/
+ 2
-^
U2 + U3 { Au6+A"~
2 ' 2 j
Nv)+V+t*
r7 -T6) V2
h
~V3~
h
1
, (^6 + ^7)
+ a 2
- 2P2 + P3) + (P4 -
2/z2
-2/>5+/>6)
(3.13
an equation that is to be interpreted as applying at the centroid of element 2. Letting $l$
and $h$ go toward zero, we obtain, in a two-step procedure that is not really necessary,
— = vV2w - [ — + u • Vw + aT
dx \dt
I 3
29^
9 dP dv
3y 3r
+ 0(l,h), (3.13-306)
in which only the first four of the RHS terms survive the limit process; i.e., the final
'PPE' for element 2 is actually a boundary equation (again, no surprise anymore); it is,
in fact, the (Neumann) boundary condition for the PPE,
$$\frac{\partial P}{\partial x} = \nu\nabla^2 u - \left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u\right) + \alpha T, \tag{3.13-307}$$
which is just the normal component of the NS equations applied on $\Gamma$. The actual
equation, (3.13-305), is, of course, an ODE that is actually a first-order accurate
approximation to the normal momentum equation, applied at the centroid of element 2, which
itself approaches the wall as $l \to 0$. (This ostensibly 'first-order' approximation, however,
in no way vitiates the second-order accuracy of the overall solution.)
Remarks:
(1) The 'suspicious sign' of $\partial v/\partial t$ in the bracketed term above is just one example
of many in which the GFEM will do whatever is necessary, with respect to
tweaking/twiddling the truncation error terms, so as to attain the key objective for
incompressible flow: $\nabla_h \cdot \mathbf{u}_h = 0$.
(2) The term $-(h/2)(\dot u_2 + \dot u_3) + (l/2)(\dot v_2 - \dot v_3)$ in the original statement of the
continuity equation corresponds to (a portion of) $g$ in (3.13-256).
(3) This case (Dirichlet BC's on velocity and Neumann BC's on pressure) is further
discussed in great detail in Gresho and Sani (1987).
Specified $f_n$, $v$.
Here we need the full momentum equation for nodes 2 and 3 in the x-direction. That
for node 2 has already been presented, see (3.13-285) and (3.13-286), and that for node 3
follows 'by inspection'; it is
lh . vh — vl — lh 9 „ , „
-«3 + —Dxu3 - -zrDyyUi + -(2AU3 +3 Au3)
L I Lh 4
--(P2 + P3)-oc-T3=hJny (3.13-308)
Starting afresh, we rewrite the continuity equation for element 2 in the form
$$\frac{h}{2}\left[(\dot u_6 + \dot u_7) - (\dot u_2 + \dot u_3)\right] + \frac{l}{2}\left[(\dot v_2 - \dot v_3) + (\dot v_6 - \dot v_7)\right] = 0 \tag{3.13-309}$$
to give, after inserting the relevant momentum equations,
vh (Au + A") h
-V\{ub + «7) - hK 62 V - -[(/>, - P4) + 2(/>2 -P5) + (P3- p6)]
ah vh— v— h , 0 0 ,
+ y (^6 + T7) + -J2DAU2 + m) - -Dyy(u2 + w3) + 4(^2 +2 A\ +2A% +3 A\)
h ah h — —
- -(/», + 2P2 + Pi) - ~(T2 + T3)- -{fni + /„3)
+ ^ {(^2 - v3) + vV2h(v6 - vj) - (Al-Aw7)
1
+ 2
(P, - 2P2 + P3) + (P4 - 2P5 + P6)
another mess.
+ £(76-:T7U=0; (3.13-310)
To proceed, note that several terms are divided by $l$ and most are multiplied by $h$.
Thus, we multiply by $l/(2h)$ and judiciously rearrange to arrive at
Dx(u2 + u3) />, + 2P2 + P3 7„2 + In,
v -
/ 2 4 2
/
2
fu6+u7\ Al+A" (Pl-P4) + 2(P2-P5) + (P,-P6)
(T6 + T7) Dyy lAu2+2Au2+2Au3+3Au3 (T2 + T3)
- a + v-y(u2 + u3) - "- +a
2 h 4 2
I2
+ 4
V3 ~V2 2 (I)-; - V6) (AV7 - AD
h " h h
(Tn - 7^1
(3.13-311)
(Pl-2P2 + P3) + (P4-2P5+P6) (r7-r6)
_ +p
2hz h
As $l, h \to 0$, the only remaining terms are those on the LHS; i.e., we obtain
$$\nu\frac{\partial u}{\partial x} - P - f_n = 0, \tag{3.13-312}$$
the same BC satisfied by the x-momentum equation at node 2; cf. (3.13-289). Here though,
the BC acts more like a Dirichlet condition for the PPE; while we do not specifically set
$P$ at the boundary, it is inherently (and approximately for finite $l, h$) set via this '$f_n$'
BC as
$$P = \nu\frac{\partial u}{\partial x} - f_n, \tag{3.13-313}$$
and is done in such a way that $\nabla_h \cdot \mathbf{u}_h = 0$ on $\Gamma$ is assured.
Specified $f_n$, $f_\tau$.
The full equations for this case are almost too long to be useful; the final equation is
similar to the above case except that the y-momentum equations must be employed for
$\dot v_2$ and $\dot v_3$, since these velocities are no longer specified. We shall be content to leave the
details to those interested and merely state what is nearly obvious: the end result is the
same. As $l, h \to 0$, the pressure (continuity) equation becomes the same BC for the PPE,
$$P = \nu\frac{\partial u}{\partial x} - f_n; \tag{3.13-314}$$
the y-direction (tangential) BC is again waived in favor of the normal 'momentum
equation' (i.e., the 'force' balance in the x-direction) because this is the equation that
keeps the solution divergence-free.
Specified $u$, $f_\tau$.
For this case we need the momentum equations for nodes 2 and 3 in the y-direction.
That for node 2 has already been presented, (3.13-292), and that for node 3 follows from
it by 'inspection':
$$\begin{aligned}
&\frac{lh}{2}\dot v_3 + \frac{\nu h}{6l}\left[(v_2 - v_6) + 4(v_3 - v_7) + (v_4 - v_8)\right] - \frac{\nu l}{6h}\left[2(v_2 - 2v_3 + v_4) + (v_6 - 2v_7 + v_8)\right] \\
&\qquad + \frac{lh}{4}\left({}^2A^v_3 + {}^3A^v_3\right) + \frac{l}{2}(P_3 - P_2) - \frac{\beta lh}{2}\bar T_3 = h\bar f_{\tau 3},
\end{aligned} \tag{3.13-315}$$
or, dividing by h and using the shortcut notation,
Pi-Pi
D_2
ti
DY
- ( V3 ~ v-gvs +A\ + ' J , ' * - 0T3 1 + v-^u3 = /
/*
/
*y
(3.13-316)
Inserting all momentum equations (except those for which the velocity is specified, $u_2$
and $u_3$) into the (same) continuity equation, this time in the form
$\frac{1}{2}[(\dot u_2 + \dot u_3) - (\dot u_6 + \dot u_7)] + \frac{l}{2h}[(\dot v_3 - \dot v_2) + (\dot v_7 - \dot v_6)] = 0$, gives
U2 + U3 V
1
2 --(V>6 + V^7) + -(A^+A^)
+ -[(/>! -P4 + P2- P5) + (P2 - ^5 + P3 - Pe)]
ain + T^
+
(fT-fZ7) Dx(v3-v2)
h
— v-
l
h
+
+
Dyv (v3-v2) , Q(T3 -T2) A\-A\ Px- 2P2 + P3
V ^ ; V P~ —-'
uV
hA h ' '' h h
2(v7-v6) , .(7-7-7-6) (A*-AD
hl
h
+ P-
h
h
\_ /P1-2P2 + P3 P4-2P5+P6
2 I h2 h2
0.
(3.13-317)
Clearly this equation approximates—at the centroid of element 2—
fdu 9 dP \ d ( dv\
uV2w + u • Vw H aT\-\ I fz - v—\
\dt dx J dy V ar/
/ 3 / d2v
2Yy ["dy2
+ ^— I v^~2 + vV2v - 2u ■ Vv - 2— + 2pT ) =0(l,h), (3.13-318)
dP
Jy
which, since the tangential BC, $\nu(\partial v/\partial x) = f_\tau$ from (3.13-294), is valid for all $y$, will
converge to the Neumann BC for the PPE,
$$\frac{\partial P}{\partial x} = \nu\nabla^2 u + \alpha T - \left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u\right), \tag{3.13-319}$$
the normal component of the momentum equation.
In a common application of this BC, $u = 0$ and $f_\tau = 0$ (i.e., a symmetry BC), we get
$$\frac{\partial P}{\partial x} = \alpha T; \tag{3.13-320}$$
the normal pressure gradient balances the 'body force.'
Remarks:
(1) The viscous term vanishes (for the symmetry BC) because $u = 0$ for all $y$, which
leaves $\nu\,\partial^2 u/\partial x^2$. But
$$\nabla\cdot\mathbf{u} = 0 \;\Rightarrow\; \frac{\partial^2 u}{\partial x^2} = -\frac{\partial}{\partial x}\left(\frac{\partial v}{\partial y}\right) = -\frac{\partial}{\partial y}\left(\frac{\partial v}{\partial x}\right) \quad\text{and}\quad \nu\frac{\partial v}{\partial x} = f_\tau = 0$$
for all $y$.
(2) In summary, and generalizing slightly, the PPE BC's are:
(i) For either $\mathbf{u}$ specified or $\mathbf{u}\cdot\mathbf{n}$ and tangential 'traction' specified,
$\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} + \gamma\mathbf{g}T - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)$,
a Neumann BC.
(ii) For either normal 'traction' and tangential velocity specified or total 'traction'
specified,
$P = \nu\,\partial u_n/\partial n - f_n$,
a Dirichlet BC.
If the stress-divergence form is used, the term $\partial u_n/\partial n$ is multiplied by two.
In general, specified normal velocity implies that the BC for the PPE is the normal
component of the momentum equation, at least for domains with a fixed boundary.
The overriding issue related to pressure BC's is that they be such as to ensure that the
discrete version of $\nabla_h\cdot\mathbf{u}_h = 0$ is satisfied on the boundary [just as the (consistent)
PPE itself ensures that the discrete version of $\nabla_h\cdot\mathbf{u}_h = 0$ is satisfied in the domain].
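The two kinds of PPE boundary data summarized in Remark (2) can be written as two small functions. This is only a sketch under stated assumptions: the function names, the argument layout (`grad_u[i][j]` holding $\partial u_i/\partial x_j$), and the `stress_divergence` flag are illustrative, and the Dirichlet formula is the generalization $P = \nu\,\partial u_n/\partial n - f_n$ of (3.13-313) and (3.13-314).

```python
import numpy as np


def ppe_neumann_bc(n, nu, lap_u, buoyancy, u, grad_u, dudt):
    """dP/dn = n . (nu*lap(u) + gamma*g*T - u.grad(u) - du/dt).

    buoyancy is the vector gamma*g*T; grad_u[i][j] = d u_i / d x_j.
    """
    n = np.asarray(n, dtype=float)
    advection = np.asarray(grad_u, dtype=float) @ np.asarray(u, dtype=float)
    rhs = (nu * np.asarray(lap_u, dtype=float)
           + np.asarray(buoyancy, dtype=float)
           - advection
           - np.asarray(dudt, dtype=float))
    return float(n @ rhs)


def ppe_dirichlet_bc(nu, dun_dn, f_n, stress_divergence=False):
    """P = nu * du_n/dn - f_n (the du_n/dn term is doubled in stress-divergence form)."""
    factor = 2.0 if stress_divergence else 1.0
    return factor * nu * dun_dn - f_n


# Symmetry plane with normal along x: u = 0, no acceleration, no viscous term.
dPdn = ppe_neumann_bc(n=(1.0, 0.0), nu=0.1, lap_u=(0.0, 0.0),
                      buoyancy=(2.0, 0.0), u=(0.0, 0.0),
                      grad_u=((0.0, 0.0), (0.0, 0.0)), dudt=(0.0, 0.0))
```

For the symmetry case the call returns the body-force component alone, which is the content of (3.13-320).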
Corner boundary conditions for the PPE.
The last situation to examine, and in some sense the most involved, is the 'pressure
equations' that result from applying the various velocity BC's at a domain corner. For
variety and simplicity, we let this corner be that denoted by element 3; i.e., the domain
boundary is now the line connecting nodes 1, 2, 3, 4, 8, 12, and 16.
Since we have two equations (with BC's) on each of two surfaces (horizontal, say at
$y = H$, and vertical, say at $x = L$) and four possible velocity BC's at each (specified:
$u_n$ and $u_\tau$, $u_n$ and $f_\tau$, $f_n$ and $u_\tau$, and $f_n$ and $f_\tau$), there are $4^2 = 16$ possibilities, and
this is just 2D! (In 3D, there are, by our count, $9^3 = 729$, all of which will be left to the
computer as far as we are concerned; i.e., the knowledge gained from the 2D analysis
should be sufficient.)
We shall develop only two of the 16 in any detail, but will present a few of the others
in abbreviated form. First, a summary of the results, in the form of a table (Table 3.13-9).
We note several things from Table 3.13-9:
1. The specified $f_n$ BC is 'dominant,' and provides a Dirichlet BC for the pressure.
2. Only 'specified $\mathbf{u}$' returns the Neumann BC for $P$.
3. The loss of a pressure BC in three cases with $u_n$ and $f_\tau$ specified on part or all of $\Gamma_e$
is surprising, especially since 'specified $u_n$' usually delivers the Neumann BC for $P$.
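Table 3.13-9 Boundary conditions for a corner element.

Case   BC's at x = L                   BC's at y = H                   Resulting pressure BC's
 1     $u, v$ ($u_n, u_\tau$)          $u, v$ ($u_n, u_\tau$)          $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} + \gamma\mathbf{g}T - \partial\mathbf{u}/\partial t - \mathbf{u}\cdot\nabla\mathbf{u})$
 2     $u, v$ ($u_n, u_\tau$)          $u, f_y$ ($f_n, u_\tau$)        $P = \nu\,\partial u_n/\partial n - f_n$
 3     $u, v$ ($u_n, u_\tau$)          $f_x, v$ ($u_n, f_\tau$)        $f_\tau = \nu\,\partial u_\tau/\partial n$ (1)
 4     $u, v$ ($u_n, u_\tau$)          $f_x, f_y$ ($f_n, f_\tau$)      $P = \nu\,\partial u_n/\partial n - f_n$
 5     $u, f_y$ ($u_n, f_\tau$)        $u, v$ ($u_n, u_\tau$)          $f_\tau = \nu\,\partial u_\tau/\partial n$ (1)
 6     $u, f_y$ ($u_n, f_\tau$)        $u, f_y$ ($f_n, u_\tau$)        $P = \nu\,\partial u_n/\partial n - f_n$
 7     $u, f_y$ ($u_n, f_\tau$)        $f_x, v$ ($u_n, f_\tau$)        $(f_x - \nu\,\partial u/\partial y)_{y=H} + (f_y - \nu\,\partial v/\partial x)_{x=L} = 0$ (1)
 8     $u, f_y$ ($u_n, f_\tau$)        $f_x, f_y$ ($f_n, f_\tau$)      $P = \nu\,\partial u_n/\partial n - f_n$
 9     $f_x, v$ ($f_n, u_\tau$)        $u, v$ ($u_n, u_\tau$)          $P = \nu\,\partial u_n/\partial n - f_n$
10     $f_x, v$ ($f_n, u_\tau$)        $u, f_y$ ($f_n, u_\tau$)        $P = \nu\,\partial u_n/\partial n - f_n$
11     $f_x, v$ ($f_n, u_\tau$)        $f_x, v$ ($u_n, f_\tau$)        $P = \nu\,\partial u_n/\partial n - f_n$
12     $f_x, v$ ($f_n, u_\tau$)        $f_x, f_y$ ($f_n, f_\tau$)      $P = \nu\,\partial u_n/\partial n - f_n$
13     $f_x, f_y$ ($f_n, f_\tau$)      $u, v$ ($u_n, u_\tau$)          $P = \nu\,\partial u_n/\partial n - f_n$
14     $f_x, f_y$ ($f_n, f_\tau$)      $u, f_y$ ($f_n, u_\tau$)        $P = \nu\,\partial u_n/\partial n - f_n$
15     $f_x, f_y$ ($f_n, f_\tau$)      $f_x, v$ ($u_n, f_\tau$)        $P = \nu\,\partial u_n/\partial n - f_n$
16     $f_x, f_y$ ($f_n, f_\tau$)      $f_x, f_y$ ($f_n, f_\tau$)      $P = \nu\,\partial u_n/\partial n - f_n$

(1) A loss of a pressure BC! It is only a pointwise loss, however, and is mathematically legitimate
(D. Griffiths, personal communication).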
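The pattern in Table 3.13-9 can be generated mechanically. The sketch below enumerates the 16 combinations in the table's order and applies a classification rule inferred from the observations listed before the table (a specified $f_n$ on either face gives a Dirichlet BC, fully specified velocity on both faces gives the Neumann BC, and anything else loses the pressure BC at the corner); the rule, the names, and the string labels are assumptions made for illustration rather than a statement from the text.

```python
from itertools import product

# Face BC types as (normal, tangential) pairs; "u" = velocity, "f" = 'traction'.
X_FACE_ORDER = [("u_n", "u_t"), ("u_n", "f_t"), ("f_n", "u_t"), ("f_n", "f_t")]
Y_FACE_ORDER = [("u_n", "u_t"), ("f_n", "u_t"), ("u_n", "f_t"), ("f_n", "f_t")]


def pressure_bc(bc_x, bc_y):
    """Classify the corner pressure BC from the two face BC's (inferred rule)."""
    if "f_n" in bc_x or "f_n" in bc_y:
        return "Dirichlet"          # P = nu * du_n/dn - f_n
    if bc_x == ("u_n", "u_t") and bc_y == ("u_n", "u_t"):
        return "Neumann"            # normal momentum equation
    return "none"                   # pointwise loss of a pressure BC


cases = {
    number: (bc_x, bc_y, pressure_bc(bc_x, bc_y))
    for number, (bc_x, bc_y) in enumerate(product(X_FACE_ORDER, Y_FACE_ORDER), start=1)
}
lost = [k for k, (_, _, kind) in cases.items() if kind == "none"]
neumann = [k for k, (_, _, kind) in cases.items() if kind == "Neumann"]
```

The two face orderings differ because the table cycles the $y = H$ column as $(u,v)$, $(u,f_y)$, $(f_x,v)$, $(f_x,f_y)$, and on that face $v$ is the normal component.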
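We now present details of these derivations for Cases 1 and 16 (the two extremes)
in Table 3.13-9, after emphasizing that all of the pressure BC's come from a single
continuity equation, i.e.,
$$\frac{h}{2}(\dot u_3 - \dot u_7 + \dot u_4 - \dot u_8) + \frac{l}{2}(\dot v_4 - \dot v_3 + \dot v_8 - \dot v_7) = 0, \tag{3.13-321}$$
and pointing out that not all terms in each discrete momentum equation are required
to be carried along, since they vanish with mesh refinement. Thus, we first present the
(only) form of the momentum equations needed for the analysis of the most general
case, Case 16 [where we distinguish between normal and tangential applied stresses and
let HOT = Higher-Order Terms; actually, they are $O(1)$ in $h$ and $l$]:
$$\begin{aligned}
\dot u_3 &= \frac{2}{l}\left(f_x^n + P - \nu\frac{\partial u}{\partial x}\right)\Big|_3 + \text{HOT}, \\
\dot v_3 &= \frac{2}{l}\left(f_y^\tau - \nu\frac{\partial v}{\partial x}\right)\Big|_3 + \text{HOT}, \\
\dot u_4 &= \frac{2}{l}\left(f_x^n + P - \nu\frac{\partial u}{\partial x}\right)\Big|_4 + \frac{2}{h}\left(f_x^\tau - \nu\frac{\partial u}{\partial y}\right)\Big|_4 + \text{HOT}, \\
\dot v_4 &= \frac{2}{l}\left(f_y^\tau - \nu\frac{\partial v}{\partial x}\right)\Big|_4 + \frac{2}{h}\left(f_y^n + P - \nu\frac{\partial v}{\partial y}\right)\Big|_4 + \text{HOT}, \\
\dot u_7 &= \left(\nu\nabla_h^2 u + \alpha T - \mathbf{u}\cdot\nabla u - \partial P/\partial x\right)\big|_7, \\
\dot v_7 &= \left(\nu\nabla_h^2 v + \beta T - \mathbf{u}\cdot\nabla v - \partial P/\partial y\right)\big|_7, \\
\dot u_8 &= \frac{2}{h}\left(f_x^\tau - \nu\frac{\partial u}{\partial y}\right)\Big|_8 + \text{HOT}, \quad\text{and} \\
\dot v_8 &= \frac{2}{h}\left(f_y^n + P - \nu\frac{\partial v}{\partial y}\right)\Big|_8 + \text{HOT},
\end{aligned} \tag{3.13-322}$$
where we point out that in all cases save one (Case 1), the $\dot u_7$ and $\dot v_7$ contributions will
vanish with mesh refinement.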
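Case 1: $u$ and $v$ specified.
Here, only node 7 is 'free,' and the continuity equation becomes
$$\begin{aligned}
&\frac{h}{2}\left[\dot u_3 - \left(\nu\nabla_h^2 u + \alpha T - \mathbf{u}\cdot\nabla u - \partial P/\partial x\right)\big|_7 + \dot u_4 - \dot u_8\right] \\
&\quad + \frac{l}{2}\left[\dot v_4 - \dot v_3 + \dot v_8 - \left(\nu\nabla_h^2 v + \beta T - \mathbf{u}\cdot\nabla v - \partial P/\partial y\right)\big|_7\right] = 0.
\end{aligned} \tag{3.13-323}$$
Now as $l, h \to 0$, we note that $\dot u_3 + \dot u_4 - \dot u_8 = \partial u/\partial t|_7$ + HOT and $\dot v_4 - \dot v_3 + \dot v_8 = \partial v/\partial t|_7$
+ HOT to yield, at node 7,
$$h\left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u + \frac{\partial P}{\partial x} - \nu\nabla^2 u - \alpha T\right) + l\left(\frac{\partial v}{\partial t} + \mathbf{u}\cdot\nabla v + \frac{\partial P}{\partial y} - \nu\nabla^2 v - \beta T\right) = 0, \tag{3.13-324}$$
a (particular) linear combination of the x- and y-momentum equations.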
But if we introduce the so-called 'mass-consistent' normal vector (with concomitant
tangent), from (3.13-56), a more appropriate (mass-consistent) interpretation is possible.
For corner node 4, the consistent normal vector is given by $n_x = h/\sqrt{h^2 + l^2}$ and $n_y =
l/\sqrt{h^2 + l^2}$ (and the concomitant tangent vector). Thus, dividing the
above linear combination of momentum equations by $\sqrt{h^2 + l^2}$ gives, with $M_x = 0$ and
$M_y = 0$ used as shortcut designations of the x- and y-momentum equations, respectively,
$n_x M_x + n_y M_y = 0$. But this is just $\mathbf{n}\cdot\mathbf{M} = M_n = 0$, the normal momentum equation in
the corner element; i.e., the continuity-equation-derived PPE BC is simply (again)
$$\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} + \gamma\mathbf{g}T - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t); \tag{3.13-325}$$
the normal component of the momentum equation is the (Neumann) BC for pressure.
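Case 16: $f_n$ and $f_\tau$ specified.
This is the opposite end of the spectrum of cases, in which all momentum equations
partake, and Figure 3.13-23 may be useful.
Fig. 3.13-23 A corner element.
Thus, inserting six of the eight momentum equations from (3.13-322) into (3.13-321)
gives, neglecting HOT,
$$\begin{aligned}
&\frac{h}{2}\left[\frac{2}{l}\left(f_x^n + P - \nu\frac{\partial u}{\partial x}\right)_3 - \dot u_7 + \frac{2}{l}\left(f_x^n + P - \nu\frac{\partial u}{\partial x}\right)_4 + \frac{2}{h}\left(f_x^\tau - \nu\frac{\partial u}{\partial y}\right)_4 - \frac{2}{h}\left(f_x^\tau - \nu\frac{\partial u}{\partial y}\right)_8\right] \\
&\quad + \frac{l}{2}\left[\frac{2}{l}\left(f_y^\tau - \nu\frac{\partial v}{\partial x}\right)_4 + \frac{2}{h}\left(f_y^n + P - \nu\frac{\partial v}{\partial y}\right)_4 - \frac{2}{l}\left(f_y^\tau - \nu\frac{\partial v}{\partial x}\right)_3 + \frac{2}{h}\left(f_y^n + P - \nu\frac{\partial v}{\partial y}\right)_8 - \dot v_7\right] = 0,
\end{aligned} \tag{3.13-326}$$
where we refrain from explicating $\dot u_7$ and $\dot v_7$ because, as will be seen below, their
contributions vanish in the limit of mesh refinement. In fact, letting $l, h \to 0$ in the above
equation yields
$$\frac{2h}{l}\left(f_x^n + P - \nu\frac{\partial u}{\partial x}\right) + l\frac{\partial}{\partial x}\left(f_x^\tau - \nu\frac{\partial u}{\partial y}\right) + h\frac{\partial}{\partial y}\left(f_y^\tau - \nu\frac{\partial v}{\partial x}\right) + \frac{2l}{h}\left(f_y^n + P - \nu\frac{\partial v}{\partial y}\right) = O(l, h), \tag{3.13-327}$$
and multiplication by $lh/2$ gives
$$h^2\left(f_x^n + P - \nu\frac{\partial u}{\partial x}\right) + l^2\left(f_y^n + P - \nu\frac{\partial v}{\partial y}\right) = O(l^2h, lh^2). \tag{3.13-328}$$
Introducing again the consistent normal definition gives, upon division by $(h^2 + l^2)$,
$$P = n_x^2\left(\nu\frac{\partial u}{\partial x} - f_x^n\right) + n_y^2\left(\nu\frac{\partial v}{\partial y} - f_y^n\right) + O(l, h). \tag{3.13-329}$$
If we now invoke the interpretation that each term in parentheses is simply $(\nu\,\partial u_n/\partial n -
f_n)$, even though the first applies to the vertical boundary and the second to the horizontal
boundary, and use $n_x^2 + n_y^2 = 1$, we obtain
$$P = \nu\,\partial u_n/\partial n - f_n; \tag{3.13-330}$$
the normal 'traction' BC on velocity serves (again) as a Dirichlet BC for the pressure.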
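The weighting in this step is simple enough to check numerically. A minimal sketch, assuming the limit equation is read as $P = n_x^2(\nu\,\partial u/\partial x - f_x^n) + n_y^2(\nu\,\partial v/\partial y - f_y^n)$ with $n_x^2 = h^2/(h^2+l^2)$ and $n_y^2 = l^2/(h^2+l^2)$; the function name and arguments are illustrative.

```python
def corner_pressure(h, l, nu, dudx, dvdy, fn_x, fn_y):
    """Pressure implied at a traction corner by the weighted normal balances."""
    s2 = h * h + l * l
    nx2 = h * h / s2          # square of the consistent normal's x-component
    ny2 = l * l / s2          # square of the consistent normal's y-component
    return nx2 * (nu * dudx - fn_x) + ny2 * (nu * dvdy - fn_y)


# Both faces carry the same value of (nu * du_n/dn - f_n) = 0.75 ...
p_equal = corner_pressure(h=1.0, l=2.0, nu=0.5, dudx=2.0, dvdy=4.0, fn_x=0.25, fn_y=1.25)
# ... so the result does not depend on the element aspect ratio.
p_other = corner_pressure(h=3.0, l=0.5, nu=0.5, dudx=2.0, dvdy=4.0, fn_x=0.25, fn_y=1.25)
```

When the two face values differ, the corner pressure is their $n_x^2$, $n_y^2$ weighted average, so the aspect ratio of the corner element decides which face dominates.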
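Before leaving this case, however, it is of some interest to return to the actual
momentum equations (surface force balances) at node 4; i.e., (for $l, h \to 0$) [cf. (3.13-298)
and (3.13-299)]
$$h\left(f_x^n + P - \nu\frac{\partial u}{\partial x}\right) + l\left(f_x^\tau - \nu\frac{\partial u}{\partial y}\right) = 0 \quad\text{from the x-equation,}$$
and
$$h\left(f_y^\tau - \nu\frac{\partial v}{\partial x}\right) + l\left(f_y^n + P - \nu\frac{\partial v}{\partial y}\right) = 0 \quad\text{from the y-equation.}$$
Rotating these equations from $x$-$y$ to $n$-$\tau$, as before, gives
$$n_x\left[h\left(f_x^n + P - \nu\frac{\partial u}{\partial x}\right) + l\left(f_x^\tau - \nu\frac{\partial u}{\partial y}\right)\right] + n_y\left[h\left(f_y^\tau - \nu\frac{\partial v}{\partial x}\right) + l\left(f_y^n + P - \nu\frac{\partial v}{\partial y}\right)\right] = 0 = M_n \tag{3.13-331}$$
and
$$\tau_x\left[h\left(f_x^n + P - \nu\frac{\partial u}{\partial x}\right) + l\left(f_x^\tau - \nu\frac{\partial u}{\partial y}\right)\right] + \tau_y\left[h\left(f_y^\tau - \nu\frac{\partial v}{\partial x}\right) + l\left(f_y^n + P - \nu\frac{\partial v}{\partial y}\right)\right] = 0 = M_\tau. \tag{3.13-332}$$
Using now $h = n_x\sqrt{l^2 + h^2}$ and $l = n_y\sqrt{l^2 + h^2}$ gives
$$P = n_x^2\left(\nu\frac{\partial u}{\partial x} - f_x^n\right) + n_x n_y\left(\nu\frac{\partial u}{\partial y} - f_x^\tau + \nu\frac{\partial v}{\partial x} - f_y^\tau\right) + n_y^2\left(\nu\frac{\partial v}{\partial y} - f_y^n\right) \tag{3.13-333}$$
from $M_n = 0$ and
$$n_x\tau_x\left(\nu\frac{\partial u}{\partial x} - f_x^n\right) + n_y\tau_x\left(\nu\frac{\partial u}{\partial y} - f_x^\tau\right) + n_x\tau_y\left(\nu\frac{\partial v}{\partial x} - f_y^\tau\right) + n_y\tau_y\left(\nu\frac{\partial v}{\partial y} - f_y^n\right) = 0 \tag{3.13-334}$$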
from $M_\tau = 0$, where we have used $\tau_x n_x + \tau_y n_y = \boldsymbol{\tau}\cdot\mathbf{n} = 0$. Noteworthy here are the
following:
1. The consistent (normal) rotation has removed $P$ from the tangential momentum
equation, an absolute 'must' when the $Q_1P_0$ element is used, since it is unable
to represent a pressure gradient within a single element (see also the discussion
surrounding Figure 3.13-2. in Section 3.13.1e).
2. In the $M_\tau = 0$ equation, the first and last terms sum to zero, after again invoking the
interpretation that $\nu(\partial u/\partial x) - f_x^n = \nu(\partial v/\partial y) - f_y^n = \nu(\partial u_n/\partial n) - f_n$. This leads to
the final result, $\nu(\partial u_\tau/\partial n) = f_\tau$, after invoking a similar interpretation of the shear
terms.
3. Comparing the $M_n = 0$ result with that earlier from the mass conservation
requirement, (3.13-330), and requiring consistency between the two resulting Dirichlet BC
equations for the pressure implies that $n_x n_y(\nu(\partial u/\partial y) - f_x^\tau + \nu(\partial v/\partial x) - f_y^\tau) = 0$ is
required. But this is identically true from the same shear stress BC, $f_\tau = \nu\,\partial u_\tau/\partial n$;
i.e., the offered interpretation is compatible/consistent.
4. Using (for straight boundaries)
$$\frac{\partial u_n}{\partial n} = \left(n_x\frac{\partial}{\partial x} + n_y\frac{\partial}{\partial y}\right)(n_x u + n_y v)$$
and
$$\frac{\partial u_\tau}{\partial n} = \left(n_x\frac{\partial}{\partial x} + n_y\frac{\partial}{\partial y}\right)(\tau_x u + \tau_y v)$$
permits the following alternate representation of the rotated momentum equations:
(i) $\nu(\partial u_n/\partial n) - P = n_x f_x^{\text{eff}} + n_y f_y^{\text{eff}} = f_n^{\text{eff}}$ and
(ii) $\nu\,\partial u_\tau/\partial n = \tau_x f_x^{\text{eff}} + \tau_y f_y^{\text{eff}} = f_\tau^{\text{eff}}$, where
$$f_x^{\text{eff}} = n_x f_x^n + n_y f_x^\tau = \frac{h f_x^n + l f_x^\tau}{\sqrt{h^2 + l^2}}$$
and
$$f_y^{\text{eff}} = n_x f_y^\tau + n_y f_y^n = \frac{h f_y^\tau + l f_y^n}{\sqrt{h^2 + l^2}}$$
are the effective surface forces caused by the applied surface tractions.
5. If the stress-divergence form of the equations was used, then each normal velocity
derivative above would be doubled and $\partial u_n/\partial\tau$ added to every appearance of $\partial u_\tau/\partial n$.
6. If $\nu = 0$, then all tangential 'traction' terms are dropped (replaced by the
original tangential momentum equations, actually); the pressure BC remains unchanged
($P = -f_n$), as does the normal momentum equation; and the tangential momentum
equation becomes $-l(\partial u/\partial t + \mathbf{u}\cdot\nabla u) + h(\partial v/\partial t + \mathbf{u}\cdot\nabla v) = 0$; i.e., $\boldsymbol{\tau}\cdot(\partial\mathbf{u}/\partial t + \mathbf{u}\cdot
\nabla\mathbf{u}) = 0$; no pressure gradient is present.
This completes the top and bottom of the entries in Table 3.13-9. Only a few of the
remaining 14 cases will now be presented, and in a much-abbreviated form; in all cases
we just write $\dot u_7$, $\dot v_7$ for the internal node momentum equation.
Case 3: $u$, $v$ specified at $x = L$; $f_x$ and $v$ specified at $y = H$.
$$\frac{h}{2}\left[\dot u_3 - \dot u_7 + \dot u_4 - \frac{2}{h}\left(f_x - \nu\frac{\partial u}{\partial y}\right)_8\right] + \frac{l}{2}\left[\dot v_4 - \dot v_3 + \dot v_8 - \dot v_7\right] = 0; \tag{3.13-335}$$
i.e., the $l, h \to 0$ result is simply
$$f_x = \nu\frac{\partial u}{\partial y} \quad\text{or}\quad f_\tau = \nu\frac{\partial u_\tau}{\partial n}, \tag{3.13-336}$$
a 'repeat' of the only NBC in the momentum equations, and no BC for pressure.
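Case 4: $u$, $v$ specified at $x = L$; $f_x$, $f_y$ specified at $y = H$.
$$\frac{h}{2}\left[\dot u_3 - \dot u_7 + \dot u_4 - \frac{2}{h}\left(f_x - \nu\frac{\partial u}{\partial y}\right)_8\right] + \frac{l}{2}\left[\dot v_4 - \dot v_3 + \frac{2}{h}\left(f_y + P - \nu\frac{\partial v}{\partial y}\right)_8 - \dot v_7\right] = 0, \tag{3.13-337}$$
which $\Rightarrow$ $l(f_y + P - \nu(\partial v/\partial y)) - h(f_x - \nu(\partial u/\partial y)) = 0$, which, upon invoking the
tangential BC from the momentum equation, gives $P = \nu\,\partial v/\partial y - f_y = \nu\,\partial u_n/\partial n - f_n$
as the BC for the PPE.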
Case 7: $u_n$ and $f_\tau$ specified at $x = L$ and at $y = H$.
$$\frac{h}{2}\left[\dot u_3 - \dot u_7 + \dot u_4 - \frac{2}{h}\left(f_x - \nu\frac{\partial u}{\partial y}\right)_8\right] + \frac{l}{2}\left[\dot v_4 - \frac{2}{l}\left(f_y - \nu\frac{\partial v}{\partial x}\right)_3 + \dot v_8 - \dot v_7\right] = 0, \tag{3.13-338}$$
to give $(f_x - \nu(\partial u/\partial y))|_{y=H} + (f_y - \nu(\partial v/\partial x))|_{x=L} = 0$; both shear-stress NBC's are
inherited by the PPE, with the pressure absent. It must be the case that the contiguous
boundary nodes, which supply Neumann BC's for the PDE, also set the pressure in such
a corner.
Remark:
It is interesting to examine the inviscid version of this BC: $u_n$ specified at $x = L$ and
at $y = H$. The mass consistent normal $\Rightarrow$ $\dot u_n = n_x\dot u_4 + n_y\dot v_4$, which, when placed in
the (original) continuity equation, gives $n_x(\dot u_3 - \dot u_7 - \dot u_8) + \dot u_n + n_y(\dot v_8 - \dot v_7 - \dot v_3) = 0$,
which becomes $n_x(l(\partial/\partial x)(\partial u/\partial t) - \dot u_8) + \dot u_n + n_y(h(\partial/\partial y)(\partial v/\partial t) - \dot v_3) = 0$, or $\dot u_n =
n_x\dot u_8 + n_y\dot v_3$, wherein $\dot u_n$ is given. Finally, inserting the (tangential) momentum equations
and rearranging gives $\partial P/\partial n = -\mathbf{n}\cdot(\partial\mathbf{u}/\partial t + \mathbf{u}\cdot\nabla\mathbf{u})$, a return to the Neumann BC.
(A similar result obtains for Cases 3 and 5.) So we see that the 'strange' result for the
viscous case is indeed a viscous effect.
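Case 8: $u$, $f_y$ specified at $x = L$; $f_x$, $f_y$ specified at $y = H$.
$$\begin{aligned}
&\frac{h}{2}\left[\dot u_3 - \dot u_7 + \dot u_4 - \frac{2}{h}\left(f_x - \nu\frac{\partial u}{\partial y}\right)_8\right] \\
&\quad + \frac{l}{2}\left[\frac{2}{l}\left(f_y - \nu\frac{\partial v}{\partial x}\right)_4 + \frac{2}{h}\left(f_y + P - \nu\frac{\partial v}{\partial y}\right)_4 - \frac{2}{l}\left(f_y - \nu\frac{\partial v}{\partial x}\right)_3 + \frac{2}{h}\left(f_y + P - \nu\frac{\partial v}{\partial y}\right)_8 - \dot v_7\right] = 0,
\end{aligned} \tag{3.13-339}$$
to give
$$2l\left(f_y + P - \nu\frac{\partial v}{\partial y}\right) - h\left(f_x - \nu\frac{\partial u}{\partial y}\right) = 0; \tag{3.13-340}$$
after invoking the x-equation BC, we obtain $P = \nu\,\partial v/\partial y - f_y = \nu\,\partial u_n/\partial n - f_n$. And we
stop here.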
g. Flow past a flat plate.
While the equations and methodology are still fresh in our minds, it is interesting to
examine another common situation in which the PPE is deprived of a BC at a single
point—at least under one set of velocity BC's. To this end, suppose now that the original
patch of nine elements in Figure 3.13-20 describes flow near the leading edge of a flat
plate at zero angle of attack. The bottom surface is thus a symmetry line until the plate
begins at node 5; i.e., the flow is left to right and the BC at nodes 13 and 9 is fx = 0,
v = 0, and at nodes 5 and 1 (comprising the leading portion of the plate), we have
u = v = 0.
The continuity equation for element 4 is the focal point:
$$\frac{h}{2}\left[\dot u_5 - \dot u_9 + \dot u_6 - \dot u_{10}\right] + \frac{l}{2}\left[\dot v_6 - \dot v_5 + \dot v_{10} - \dot v_9\right] = 0, \tag{3.13-341}$$
where, with our newly gained experience, we know that the key player will be the
x-momentum equation at node 9. Also, even though node 5 has $\dot u_5 = \dot v_5 = 0$, we shall
(as usual) carry these (specified) accelerations for consistency of interpretation. Thus,
neglecting HOT ab initio,
$$\frac{h}{2}\left[\dot u_5 - \frac{2}{h}\left(f_x - \nu\frac{\partial u}{\partial y}\right)_9 + \dot u_6 - \dot u_{10}\right] + \frac{l}{2}\left[\dot v_6 - \dot v_5 + \dot v_{10} - \dot v_9\right] = 0, \tag{3.13-342}$$
which gives the no-longer-surprising lack of a PPE BC equation,
$$f_x = \nu\frac{\partial u}{\partial y} = 0. \tag{3.13-343}$$
Now, by the same token, and from earlier results, we find that on all upstream elements,
like #7, the same procedure (symmetry BC) gives (for buoyancy absent) $\partial P/\partial y = 0$,
and for all downstream elements ($u_n = u_\tau = 0$), such as #1, we obtain the Neumann
BC, $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)$ or, specialized to the case of interest, $\partial P/\partial y =
\nu\,\partial^2 v/\partial y^2$. Interesting. The 'transition' element, #4, 'sees' $\partial P/\partial y = 0$ to its left, $\partial u/\partial y = 0$
thereon, and $\partial P/\partial y = \nu\,\partial^2 v/\partial y^2$ to its right for the pressure/continuity equation; i.e., there
is actually no pressure BC from nodes '9 til 5.' What is the effect of this? It seems to
translate to the loss of a specific BC for the pressure at the leading (and trailing) edge of
the plate, perhaps related to the singularity there in which the pressure is allowed some
'freedom.'
On the other hand, a similar analysis in which symmetry is not invoked a priori yields
a dissimilar result: it regains the Neumann BC for pressure. To see this, suppose the plate
to begin at node 6 (and extending to node 2 and beyond) so that there is flow on both
sides, and the velocity is specified at nodes 6 and 2, but symmetry on 14-10-6 is not
invoked. There are now four continuity equations that are relevant: those on elements 1,
2, 4, and 5. But we need to examine only two of them in any detail (say #1 and #4), the
others following from symmetry. Element 1 has
$$\frac{h}{2}(\dot u_1 - \dot u_5 + \dot u_2 - \dot u_6) + \frac{l}{2}(\dot v_2 - \dot v_1 + \dot v_6 - \dot v_5) = 0 \tag{3.13-344}$$
to give (omitting the buoyancy term for simplicity)
$$\begin{aligned}
&\frac{h}{2}\left[(\nu\nabla^2 u - \mathbf{u}\cdot\nabla u - \partial P/\partial x)_1 - (\nu\nabla^2 u - \mathbf{u}\cdot\nabla u - \partial P/\partial x)_5 + \dot u_2 - \dot u_6\right] \\
&\quad + \frac{l}{2}\left[\dot v_2 - (\nu\nabla^2 v - \mathbf{u}\cdot\nabla v - \partial P/\partial y)_1 + \dot v_6 - (\nu\nabla^2 v - \mathbf{u}\cdot\nabla v - \partial P/\partial y)_5\right] = 0;
\end{aligned} \tag{3.13-345}$$
i.e.,
hi d (du 9 \ fdv -, \
-— l — +u-Vu + dP/dx-vV2u) +1 I — + u-Vv + dP/dy-vV2v\ =0(l,h),
(3.13-346)
which finally converges to the Neumann BC, $\partial P/\partial y = \nu\nabla^2 v - \mathbf{u}\cdot\nabla v - \partial v/\partial t$ in general,
and $\partial P/\partial y = \nu\,\partial^2 v/\partial y^2$ in particular. No surprise.
Remark
If the flow is inviscid, then $u_2$ and $u_6$ are free, and the BC is $v = 0$ at nodes 2 and 6. The
result is $\partial P/\partial y = 0$ in elements 1, 2, 4, and 5. Also no surprise.
Switching now to element 4,
$$\frac{h}{2}(\dot u_5 - \dot u_9 + \dot u_6 - \dot u_{10}) + \frac{l}{2}(\dot v_6 - \dot v_5 + \dot v_{10} - \dot v_9) = 0 \tag{3.13-347}$$
is the continuity equation, and $u_6 = v_6 = 0$ is the BC. This leads to
d du fdu
I h —
dx dt V dt
vVzM + u • Vw + dP/dx
I
+ 2
which converges to
h
du
Tt
fdv 9 \ 3 dv
vV~v + u • Vv + dP/dy) +h
\dt J dydt
dv
0(1, h), (3.13-348)
+ u • Vw + dP/dx - vVzu 1 + / I — + u • Vv + dP/dy - vS/lv
= 0,
(3.13-349)
and further interpretation seems to rely on the recognition that $n_x = h/\sqrt{l^2 + h^2}$ and
$n_y = l/\sqrt{l^2 + h^2}$ define the consistent normal for node 6 in element 4; i.e., even though
the global mass-consistent unit normal would be $\mathbf{n} = -\mathbf{i}$ (obtained from the pair of
elements, 1 and 2, that contain the domain's boundary), the element-consistent normal is
needed to interpret the above equation as the usual Neumann BC for the PPE, $\partial P/\partial n =
\mathbf{n}\cdot(\nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)$ in general, and $\partial P/\partial n = \nu\,\partial^2 v/\partial y^2$ in particular.
Clearly, analysis of element 2 (respectively, 4) will give results that basically replicate
those from element 1 (respectively, 3); and, not clearly, we have no explanation for
this paradoxical behavior, other than 'singularities can cause strange reactions.' We do
point out, however, that a symmetric solution is 'compatible' with this result; all terms
in the y-equation vanish, and (3.13-349) becomes $\partial P/\partial x = \nu\nabla^2 u - u\,\partial u/\partial x - \partial u/\partial t$.
h. Flow past a corner
Again, because of the resulting/induced forced interpretation that seems to be required
to understand discrete equations, we present one more example. And to make it more
interesting, we will use different-sized rectangular elements. Consider flow past the corner
shown in Figure 3.13-24, with $\mathbf{u} = 0$ on the boundary: 4-5-8. Whereas previous analyses
apply more or less directly to elements 1 and 3 (giving $\partial P/\partial x = \nu\nabla^2 u - \mathbf{u}\cdot\nabla u - \partial u/\partial t =
\nu\,\partial^2 u/\partial x^2$ and $\partial P/\partial y = \nu\nabla^2 v - \mathbf{u}\cdot\nabla v - \partial v/\partial t = \nu\,\partial^2 v/\partial y^2$, respectively), it is the corner
element that is more interesting.
Fig. 3.13-24 Flow past a corner (boundary label in the figure: $\mathbf{u} = \mathbf{w}$, or $u_n = 0$).
Thus, for element 2 we have
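$$\frac{h_2}{2}(\dot u_5 - \dot u_2 + \dot u_6 - \dot u_3) + \frac{l_1}{2}(\dot v_3 - \dot v_2 + \dot v_6 - \dot v_5) = 0, \tag{3.13-350}$$
where $\dot u_5$ is specified. Thus,
$$\frac{h_2}{2}\left[\frac{\partial u}{\partial t}\Big|_5 - (\nu\nabla^2 u - \mathbf{u}\cdot\nabla u - \partial P/\partial x)_2 + l_1\frac{\partial}{\partial x}\frac{\partial u}{\partial t}\right] + \frac{l_1}{2}\left[h_2\frac{\partial}{\partial y}\frac{\partial v}{\partial t} + (\nu\nabla^2 v - \mathbf{u}\cdot\nabla v - \partial P/\partial y)_6 - \frac{\partial v}{\partial t}\Big|_5\right] = O(l, h), \tag{3.13-351}$$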
which ostensibly 'converges' to
$$h_2\left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u + \partial P/\partial x - \nu\nabla^2 u\right) - l_1\left(\frac{\partial v}{\partial t} + \mathbf{u}\cdot\nabla v + \partial P/\partial y - \nu\nabla^2 v\right) = 0, \tag{3.13-352}$$
a linear combination of the two momentum equations that is, in fact, the element-level
consistent normal momentum equation at node 5; i.e., $n_x = h_2/\sqrt{h_2^2 + l_1^2}$ and $n_y =
-l_1/\sqrt{h_2^2 + l_1^2}$ is the consistent normal at node 5 when 'viewed from element 2'; it is
called $\mathbf{n}^L$ in the sketch. (We will see below that this local unit normal is generally different
from the global unit normal.) Thus, we seem to have arrived at the common and agreeable
result, $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t) = \nu\nabla^2 u_n$ at node 5.
But now consider the inviscid case, for which we must introduce the globally
mass-consistent unit normal in order to both permit slip and satisfy the appropriate
'no-penetration' BC, $\mathbf{u}\cdot\mathbf{n} = 0$. This unit normal vector is calculated from (3.13-56):
n
V05
V05
(3.13-353)
'n
'Q
where
V05
905
+
f d^l
P
-2
as follows (note that all three elements sharing node 5 are involved):
•1 /•! A,//3,l) U. r\ r\
(3.13-354)
9^(2,2) ^(1,3)
^
^
d£dr?,
(3.13-355)
where $\psi_k^{(m)}$ is the local basis function at (local) node $k$ in element $m$. Since $\psi_k =
(1 \pm \xi)(1 \pm \eta)/4$, the computation is easy and gives $n_x = \beta h_1/2$. Similarly,
"> ~ P J If ~ fi2
c /.: t *^' '"•
+ °'iLL
1 /•! fl,/,C3)
di/[
dr)
d^d»7
-Phjl. (3.13-356)
Thus, from $n_x^2 + n_y^2 = 1$, $\beta^{-2} = (h_1^2 + l_2^2)/4$ to give, finally,
$$n_x = h_1/\sqrt{h_1^2 + l_2^2}, \qquad n_y = -l_2/\sqrt{h_1^2 + l_2^2}, \tag{3.13-357}$$
which is, interestingly, the negative of the local consistent normal for node 5 when
viewed from the 'phantom' element 4 in the sketch, and called $\mathbf{n}^G$ there. It then follows
that $\mathbf{n}\cdot\mathbf{u} = n_x u_5 + n_y v_5$, and the $\mathbf{n}\cdot\mathbf{u} = 0$ slippery BC is thus $h_1 u_5 = l_2 v_5$.
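The global and local consistent normals at node 5 can be compared with a few lines of code. The sketch below sums the element integrals of $\nabla\varphi_5$ over the three fluid elements that share node 5; the element layout (element 1 of size $l_1 \times h_1$ with node 5 at its north-east corner, element 2 of size $l_1 \times h_2$ with node 5 at its south-east corner, element 3 of size $l_2 \times h_2$ with node 5 at its south-west corner) is an assumption inferred from the node numbers appearing in the equations, and all function names are illustrative.

```python
import math


def grad_phi_integral(l, h, corner):
    """Integral of grad(phi) over an l-by-h Q1 rectangle, phi = 1 at `corner`.

    corner is 'ne', 'nw', 'se' or 'sw'; the x-part is +-h/2, the y-part +-l/2.
    """
    sx = 1.0 if corner[1] == "e" else -1.0
    sy = 1.0 if corner[0] == "n" else -1.0
    return (sx * h / 2.0, sy * l / 2.0)


def unit(vec):
    norm = math.hypot(vec[0], vec[1])
    return (vec[0] / norm, vec[1] / norm)


def corner_normals(l1, l2, h1, h2):
    """Return (n_global, n_local) at node 5 for the assumed three-element layout."""
    parts = [
        grad_phi_integral(l1, h1, "ne"),   # element 1
        grad_phi_integral(l1, h2, "se"),   # element 2
        grad_phi_integral(l2, h2, "sw"),   # element 3
    ]
    total = (sum(p[0] for p in parts), sum(p[1] for p in parts))
    n_global = unit(total)                                # based on all three elements
    n_local = unit(grad_phi_integral(l1, h2, "se"))       # element 2 alone
    return n_global, n_local


n_g, n_l = corner_normals(l1=1.0, l2=2.0, h1=3.0, h2=4.0)
n_g_uniform, n_l_uniform = corner_normals(1.0, 1.0, 1.0, 1.0)
```

On a nonuniform grid the two normals differ, $(h_1, -l_2)$ versus $(h_2, -l_1)$ before normalization, and they coincide when $l_1 = l_2$ and $h_1 = h_2$.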
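Now let us return to the 'continuity' equation for element 2 for $\nu = 0$: (3.13-352) gives
$$h_2\left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u + \partial P/\partial x\right) - l_1\left(\frac{\partial v}{\partial t} + \mathbf{u}\cdot\nabla v + \partial P/\partial y\right) = 0, \tag{3.13-358}$$
which is the local (element-level) version of $\mathbf{n}\cdot(\partial\mathbf{u}/\partial t + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P) = 0$, except here
$$n_x = h_2/\sqrt{h_2^2 + l_1^2}, \qquad n_y = -l_1/\sqrt{h_2^2 + l_1^2}; \tag{3.13-359}$$
i.e., we have two versions of a unit normal vector: one based on local mass conservation
and one based on global mass conservation. Denoting these as $\mathbf{n}^L$ and $\mathbf{n}^G$, respectively,
the inviscid flow past the corner seems to satisfy
$$\mathbf{n}^L\cdot\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P\right) = 0, \tag{3.13-360}$$
which serves as a 'mass-conserving' PPE BC, and
$$\mathbf{n}^G\cdot\frac{\partial\mathbf{u}}{\partial t} = 0, \tag{3.13-361}$$
which is the no-normal-flow-and-mass-conserving momentum equation BC. It is
(3.13-361), $h_1\dot u_5 = l_2\dot v_5$, that will be satisfied by the discrete equations.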
The Neumann BC for the PPE is thus $\partial P/\partial n = -\mathbf{n}^L\cdot[(\mathbf{u}\cdot\nabla\mathbf{u}) + \partial\mathbf{u}/\partial t]$, with $\mathbf{n}^L\cdot\dot{\mathbf{u}} =
(h_2 - h_1(l_1/l_2))\dot u_5 \ne 0$ rather than the 'expected' result, $\partial P/\partial n = -\mathbf{n}\cdot(\mathbf{u}\cdot\nabla\mathbf{u})$. Again,
it seems that the GFEM does what it must to enforce the discrete version of $\nabla\cdot\mathbf{u} = 0$;
i.e., it can apparently be somewhat indiscreet in its discrete enforcement.
If $l_2 = l_1$ and $h_2 = h_1$, then $\mathbf{n}^G = \mathbf{n}^L$, and the discrepancy disappears; uniform grids
do have certain advantages.
i. Div u = 0 as a PPE BC
In Section 3.10.4 we discussed the ill-posedness of the SPPE (as first presented) and
mentioned that it could lead to a well-posed problem if appended with the additional BC,
$\nabla\cdot\mathbf{u} = 0$ on $\Gamma$. Here we present a weak formulation of this approach and show how it
might discretize for one simple case. Consider first the problem
$$\partial\mathbf{u}/\partial t + \nabla P = \nu\nabla^2\mathbf{u} + \mathbf{f} \tag{3.13-362}$$
and
$$\nabla^2 P = \nabla\cdot\mathbf{f} \quad\text{in } \Omega, \tag{3.13-363}$$
with BC's $\mathbf{u} = \mathbf{w}$ and $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} + \mathbf{f} - \partial\mathbf{u}/\partial t)$ on $\Gamma$, and IC $\mathbf{u} = \mathbf{u}_0$, with $\nabla\cdot
\mathbf{u}_0 = 0$. But as shown in Section 3.10.4, this problem is not well-posed (it has an
infinite number of solutions), many of which do not satisfy $\nabla\cdot\mathbf{u} = 0$. So, here we recover
uniqueness, and divergence-free-ness, by explicitly adding another BC: $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$.
A weak formulation of the latter SPPE problem might go roughly like so:
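Find $\mathbf{u} \in H^1$ and $P \in H^1/\mathbb{R}$ via
$\int \mathbf{v}\cdot\partial\mathbf{u}/\partial t + \nu\int \nabla\mathbf{v} : (\nabla\mathbf{u})^T + \int \mathbf{v}\cdot\nabla P = \int \mathbf{v}\cdot\mathbf{f} \quad \forall\,\mathbf{v} \in H_0^1,$
(3.13-364)
and
$\int \nabla\phi\cdot\nabla P = \int \mathbf{f}\cdot\nabla\phi \quad \forall\,\phi \in H_0^1,$
(3.13-365)
$\int_\Gamma \psi\,\nabla\cdot\mathbf{u} = 0 \quad \forall\,\psi \in L^2,$
(3.13-366)
where (3.13-365) is obtained from (3.13-363) via the usual integration by parts and
noting that the boundary integrals vanish because $\phi = 0$ on $\Gamma$. The functions $\{\psi\}$ could,
in practice, be just those $\phi$'s 'omitted from' (3.13-365) by the restriction $\phi \in H_0^1$; i.e., we
replace BC's on (3.13-363) by $\nabla\cdot\mathbf{u} = 0$ there.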
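There are three new features when discretization of these equations is contemplated:
(i) we cannot use $Q_1Q_0$, since (3.13-365) requires $P \in H^1$, not just $L^2$; (ii) the
boundary evaluation of $\nabla\cdot\mathbf{u}$; and (iii) the boundary evaluation of $\nu\mathbf{n}\cdot\nabla^2\mathbf{u}$. So, using
Figure 3.13-25 and bilinear functions for both $\mathbf{u}$ and $\psi$, i.e., the $Q_1Q_1$ element, leads
to, for node 0 (with $y$ measured from the midpoint of each boundary segment):
$\int_\Gamma \psi_0\,\nabla\cdot\mathbf{u} = \int_S^0 \frac{h/2 + y}{h}\left[(u_S - u_{SW})\frac{(h/2 - y)}{lh} + (u_0 - u_W)\frac{(h/2 + y)}{lh} + \frac{v_0 - v_S}{h}\right]dy$
$\qquad + \int_0^N \frac{h/2 - y}{h}\left[(u_0 - u_W)\frac{(h/2 - y)}{lh} + (u_N - u_{NW})\frac{(h/2 + y)}{lh} + \frac{v_N - v_0}{h}\right]dy,$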
Fig. 3.13-25 A boundary 2-patch.
which finally leads to
$\frac{1}{6l}\left[u_S - u_{SW} + 4(u_0 - u_W) + u_N - u_{NW}\right] + \frac{v_N - v_S}{2h} = 0$
(3.13-367)
as a typical 'boundary' statement of $\nabla\cdot\mathbf{u} = 0$, and seemingly somewhat 'defective'
(although still legitimate) in that no internal $v$-nodes appear. Also, of course, $\mathbf{u}$ on $\Gamma$ is
replaced by $\mathbf{w}$ there. If the 'bulk' continuity equation were applied at node 0 instead, the
result would be [from the element matrices in (3.13-262)]
$\frac{1}{6l}\left[u_S - u_{SW} + 4(u_0 - u_W) + u_N - u_{NW}\right] + \frac{1}{3}\,\frac{\left[2(v_N - v_S) + (v_{NW} - v_{SW})\right]}{2h}$
(3.13-368)
a somewhat 'better looking' approximation.
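The boundary statement (3.13-367) can be checked by carrying out the two edge integrals numerically. The Python sketch below is a hedged illustration, not the authors' procedure: it assumes elements of width $l$ (in $x$) and height $h$ (in $y$), a linear boundary 'hat' function at node 0, and bilinear velocities, and it reads (3.13-367) as $[u_S - u_{SW} + 4(u_0 - u_W) + u_N - u_{NW}]/(6l) + (v_N - v_S)/(2h)$. The dictionary keys and function names are ours.

```python
def simpson(f, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree three."""
    mid = 0.5 * (a + b)
    return (b - a) / 6.0 * (f(a) + 4.0 * f(mid) + f(b))

def boundary_divergence(u, v, h, l):
    """Integral of psi_0 * div(u) over the boundary 2-patch, divided by h.

    u needs keys 'S', '0', 'N', 'SW', 'W', 'NW'; v needs 'S', '0', 'N'.
    y is a local coordinate on each edge, running from -h/2 to h/2.
    """
    def lower(y):  # edge from node S up to node 0
        lo, hi = (h / 2 - y) / h, (h / 2 + y) / h
        dudx = ((u['S'] - u['SW']) * lo + (u['0'] - u['W']) * hi) / l
        dvdy = (v['0'] - v['S']) / h
        return hi * (dudx + dvdy)  # psi_0 rises from 0 to 1 on this edge

    def upper(y):  # edge from node 0 up to node N
        lo, hi = (h / 2 - y) / h, (h / 2 + y) / h
        dudx = ((u['0'] - u['W']) * lo + (u['N'] - u['NW']) * hi) / l
        dvdy = (v['N'] - v['0']) / h
        return lo * (dudx + dvdy)  # psi_0 falls from 1 to 0 on this edge

    total = simpson(lower, -h / 2, h / 2) + simpson(upper, -h / 2, h / 2)
    return total / h

def formula_367(u, v, h, l):
    """Closed form corresponding to the assumed reading of (3.13-367)."""
    return ((u['S'] - u['SW'] + 4 * (u['0'] - u['W']) + u['N'] - u['NW']) / (6 * l)
            + (v['N'] - v['S']) / (2 * h))
```

Because the integrands are quadratic in $y$, Simpson's rule is exact here, so the two functions should agree to rounding error for any nodal data.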
This entire formulation, while feasible, does not seem highly advisable for at least two
reasons: (i) there will be no discrete version of $\nabla\cdot\mathbf{u} = 0$ except that in (3.13-366) on
$\Gamma$, and (ii) the boundary coupling is awkward. But it may have an additional advantage
that is worth exploring (not yet done, to our knowledge): it may stabilize (and probably
does stabilize) $Q_1Q_1$, or any(!) equal-order element. Thus we conclude this little section in
a rather speculative manner, but do point out that a successful FDM version of the idea
seems to have been demonstrated by Schüller (1990).
j. $Q_1Q_0$ convergence proof
The analysis below relies heavily on the concepts shown in Figure 3.13-26, and it may
be useful (for some) to first review the stability discussion in Section 3.13.2d, as well as
Appendix 3.
Synopsis/Roadmap:
The proof goes roughly like so, referring to the circled 'distances' in the figure:
(1) $1 \le 2 + 3$.
(2) $3 \le 4 \le 5 + 6$.
(3) Thus, $1 \le 2 + 5 + 6$.
(4) Show that each of 2, 5, 6 is $O(h)$. Done; proof complete. [Showing $6 = O(h)$ is easy;
2 and 5 are not.]
Fig. 3.13-26 Sketch for proving $Q_1Q_0$ convergence.
Introductory remarks, per Figure 3.13-26:
(1) $\mathbf{u}$ is the exact Stokes velocity (from the weak solution of $\nabla P - \nu\nabla^2\mathbf{u} = \mathbf{f}$, $\nabla\cdot\mathbf{u} = 0$
in $\Omega$, $\mathbf{u} = 0$ on $\Gamma$) and $\mathbf{u}^h \in J^h$ is the $Q_1Q_0$ numerical approximation to $\mathbf{u}$. The former
is divergence-free but not discretely divergence-free, and the latter is discretely
divergence-free but not divergence-free.
(2) The plane outside of $V^h$ is the divergence-free subspace of $H^1$ that we call $J$.
($V^h$ is a subspace of $H^1$, but it is not a subspace of $J$. Thus, unfortunately, $\mathbf{u}^h$
is not obtainable as a divergence-free projection of $\mathbf{u}$. It is, however, a discretely
divergence-free projection of $\mathbf{u}$ and $\nabla P$; see Appendix 3.)
(3) $p^h_I$ is the interpolation projection from $H^1$ to $V^h$.
(4) $p^h_{V1}$ is the $H^1$-orthogonal projection from $H^1$ to $V^h$.
(5) $p^h_{J1}$ is the $H^1$-orthogonal projection from $H^1$ to $J^h$ (the discretely divergence-free
subspace of $V^h$).
(6) The $p^h_{V1}$-projection is not essential (because, as shown in Appendix 3, $p^h_{J1} = p^h_{J1}\,p^h_{V1}$,
where $p^h_{J1}$ is essential), but it does serve to point out some interesting facts:
(i) $\|\mathbf{u} - p^h_{V1}\mathbf{u}\|_1 \le \|\mathbf{u} - p^h_{J1}\mathbf{u}\|_1 = \|\mathbf{u} - p^h_{J1}\,p^h_{V1}\mathbf{u}\|_1$, and (ii) $\|\mathbf{u} - p^h_{V1}\mathbf{u}\|_1 \le \|\mathbf{u} - \mathbf{u}^h\|_1$.
The closest vector (in the $H^1$-norm) in $V^h$ to the Stokes velocity is not in $J^h$; also,
it is neither divergence-free nor discretely divergence-free. Finally, the approximate
solution is not the closest discretely divergence-free vector to the Stokes velocity;
$\|\mathbf{u} - p^h_{J1}\mathbf{u}\|_1 < \|\mathbf{u} - \mathbf{u}^h\|_1$.
Background material:
(1) The important but 'technical' (read 'difficult') 1981 paper by Malkus.
(2) The important but 'technical' 1984 paper by Malkus and Olsen [see too Malkus and
Olsen (1982) for a less technical discussion].
(3) Appendix 3 describes the above projections in great detail.
We now address one of the most difficult tasks facing authors on the subject of
this book: explaining (or perhaps, rationalizing) one of the most popular (perhaps
arguably) finite elements extant for incompressible flow (and incompressible elasticity?).
We are referring, of course, to the $Q_1Q_0$ element: the bilinear velocity, piecewise-constant
pressure element on isoparametric quadrilaterals, and its 3D analog, trilinear
velocity, piecewise-constant pressure on isoparametric bricks. (It may be the case that its
greatest relative popularity is in 3D.) We now embark on this task, hopefully profitably;
i.e., we shall make a serious attempt at explaining the ostensibly paradoxical success of
this slightly unstable but highly usable element. Worth first recalling, though, from Brezzi
and Fortin (1991, p. 215) is, 'However simple it may look, the $Q_1Q_0$ element is one of
the hardest elements to analyze, and many questions are still open about its properties.'
And this, more than 20 years after Fortin (1972) first discovered some of these properties!
Remark
Note that many others call it $Q_1P_0$ rather than $Q_1Q_0$, but since $Q_1P_0$ has, in many
circles (see below), generated a bad reputation, we have changed its name; $Q_0$ means
'constant on the Quadrilateral.' We also hope to change its reputation.
Before presenting our convergence proof for $Q_1Q_0$, which may be interpreted as our
version/interpretation of the Malkus-Olsen theory (or at least a piece of that theory, since
we focus on a single element and they studied several), we make some introductory and
motivational remarks:
1. The Malkus-Olsen theory uses the LBB theory, in conjunction with some results of
Mercier, to prove some more general results by showing that the error in the $Q_1Q_0$
velocity (not pressure) is bounded by the sum of the smallest possible $L^2$-error in the
pressure and a computable error bound, using the $H^1$-projection of the interpolant to the
same discretely divergence-free subspace of $H^1$ in which the GFEM velocity resides, $J^h$.
2. The goal (obviously) is to estimate the velocity error, $\|\mathbf{u}^h - \mathbf{u}\|_1$, which will, of course,
imply stability if convergence can be proven. (A sort of 'inverse' Lax theorem???)
3. The $H^1$-projection of $\mathbf{u}$ to the discretely divergence-free subspace will be seen to play
an important role; see Appendix 3 for details of this projection.
4. Recall (or see, for example, Strang and Fix, 1973, or Appendix 3) that for 'simple'
elliptic problems, the GFEM solution is, because it is the best approximation, more
accurate than is the interpolant of the exact solution, and that both converge at the same rate.
Here, because we are using a mixed method, we lose the best approximation property, but
we will still find a good use for the interpolant, $\mathbf{u}^h_I$, by showing that $\mathbf{u}^h$ still converges
at the same rate as $\mathbf{u}^h_I$, where it is important to realize/note that although $\mathbf{u}^h_I$ is neither
divergence-free nor discretely divergence-free, it nevertheless converges to the proper
divergence-free result as $h \to 0$.
5. Just as the Malkus-Olsen theory (Malkus and Olsen, 1984) 'follows in the spirit of
Mercier,' by showing that 'when a "good" approximation exists in the null space of
the discrete divergence operator, velocity convergence can take place without the LBB
condition,' so does ours follow in the spirit of Malkus-Olsen; i.e., it is not simply a
re-statement of their analysis.
To be 'fair,' we should at least briefly turn to the other side of the ledger and make
some de-motivating remarks, most of which have appeared in the literature:
1. 'Moreover, they have shown (Boland and Nicolaides, 1985) that there exist data f for
which the pressure approximations do not converge and that it is also possible to set up
problems for which the velocity approximations do not converge as well'—Gunzburger
(1989, p. 24). Two remarks on this remark: (i) we believe that the 'data' referred to above
are far beyond what reasonable people would deem reasonable; and (ii) a useful summary
of the Boland-Nicolaides theory ('patch test') is available in Thatcher (1993). See too
Stenberg (1984) who, according to Thatcher (1990), 'extended and simplified' the method.
2. 'Only elements satisfying the consistency condition are recommended in a finite
element program for viscous flow computation using a mixed velocity/pressure
formulation'—Carey and Oden (1986, p. 138).
3. 'On the other hand, one might hesitate to disqualify elements, like the $Q_1P_0$ element,
which do not satisfy the inf-sup condition, on the argument that the condition may be too
strong. These doubts are not confirmed by experience'—Chapelle and Bathe (1993).
4. 'Given any constant c, however large, a flow (u, P) (dependent on c) exists such
that the ratio of the norm of the finite element error to the norm of the best
approximation error of this flow exceeds c for some h. A stronger statement is also true, namely
that a flow (u, P) exists for which the finite element scheme produces nonconvergent
approximations.'—Boland and Nicolaides (1984). [See Remark (1).]
5. 'The $Q_1P_0$ element was introduced more than a decade ago and then one did not know
so much about the troubles with mixed finite elements. Now the situation is different
and in view of what we know, the element cannot be recommended.' And, 'The $Q_1P_0$
element has done a good job (when in skilled hands), but every product has its
lifespan.' (R. Stenberg, personal communication, 1991.)
So, returning to Figure 3.13-26, we present our version of a convergence theory for
the $Q_1Q_0$ element in a baker's dozen (or so) steps:
1. To begin, we introduce and utilize the $H^1$-orthogonal projection (with projection
operator $p^h_{J1}$; see Appendix 3) of the exact velocity to the discretely divergence-free subspace
$J^h$ via $\mathbf{u}^h - \mathbf{u} = \mathbf{u}^h - p^h_{J1}\mathbf{u} + p^h_{J1}\mathbf{u} - \mathbf{u}$ and the triangle inequality to get
$\|\mathbf{u}^h - \mathbf{u}\|_1 \le \|\mathbf{u}^h - p^h_{J1}\mathbf{u}\|_1 + \|p^h_{J1}\mathbf{u} - \mathbf{u}\|_1$, and we shall estimate (bound) separately both members of
the RHS, with the goal being to prove that $\|\mathbf{u}^h - \mathbf{u}\|_1 = O(h)$; i.e., that velocity
convergence occurs when the mesh is refined, and that it occurs at the optimal rate.
2. The following theorem is due to Mercier (1979a, b) and is crucial to our
proof: $\|\mathbf{u}^h - p^h_{J1}\mathbf{u}\|_1 \le c_1\|p^h_0 P - P\|_0$, where $P$ is the Stokes pressure, and $p^h_0 P$ is
its $L^2$-orthogonal projection (the average pressure over each element; not shown in
Figure 3.13-26) to the discrete pressure space, see Appendix 3; i.e., the distance between
the numerical solution and the $H^1$-orthogonal projection of the exact solution to the
discretely divergence-free subspace depends (only) on the quality of the pressure space
approximability or, equivalently, the numerical solution approximates the discretely
divergence-free $H^1$-projection of the exact velocity as well as the pressure basis functions
can approximate the pressure. Since this assertion (theorem) is surely not obvious, we
prove it below, following Olsen (1983):
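(i) The weak form of the Stokes equation is
$\int \nabla\mathbf{u} : (\nabla\mathbf{v})^T - \int P\,\nabla\cdot\mathbf{v} = \int \mathbf{f}\cdot\mathbf{v} \quad \forall\,\mathbf{v} \in H_0^1.$
(ii) The weak form of the numerical (GFEM) solution is
$\int \nabla\mathbf{u}^h : (\nabla\mathbf{v}^h)^T - \int P^h\,\nabla\cdot\mathbf{v}^h = \int \mathbf{f}\cdot\mathbf{v}^h \quad \forall\,\mathbf{v}^h \in V^h \subset H_0^1.$
(iii) Setting $\mathbf{v} = \mathbf{v}^h$ in (i), i.e., using only those test functions of the
finite-dimensional approximation, and subtracting generates the following 'principal
error equation,'
$\int \nabla(\mathbf{u} - \mathbf{u}^h) : (\nabla\mathbf{v}^h)^T = \int (P - P^h)\,\nabla\cdot\mathbf{v}^h \quad \forall\,\mathbf{v}^h \in V^h.$
(iv) Selecting a particularly useful $\mathbf{v}^h \in J^h \subset V^h$, namely, $\mathbf{v}^h = p^h_{J1}\mathbf{u} - \mathbf{u}^h = p^h_{J1}(\mathbf{u} - \mathbf{u}^h)$
because (see Appendix 3) $p^h_{J1}\mathbf{u}^h = \mathbf{u}^h$, gives
$\int \nabla(\mathbf{u} - \mathbf{u}^h) : [\nabla p^h_{J1}(\mathbf{u} - \mathbf{u}^h)]^T = \int (P - P^h)\,\nabla\cdot p^h_{J1}(\mathbf{u} - \mathbf{u}^h).$
(v) Using $\mathbf{u} - \mathbf{u}^h = p^h_{J1}(\mathbf{u} - \mathbf{u}^h) + Q^h_{J1}(\mathbf{u} - \mathbf{u}^h)$, where $Q^h_{J1} = I - p^h_{J1}$, and the fact
(see Appendix 3) that $p^h_{J1}$ is $H^1$-orthogonal to $Q^h_{J1}$, $\int \nabla p^h_{J1}(\,) : [\nabla Q^h_{J1}(\,)]^T = 0$,
gives
$\int \nabla(\mathbf{u} - \mathbf{u}^h) : [\nabla p^h_{J1}(\mathbf{u} - \mathbf{u}^h)]^T = \int [\nabla p^h_{J1}(\mathbf{u} - \mathbf{u}^h) + \nabla Q^h_{J1}(\mathbf{u} - \mathbf{u}^h)] : [\nabla p^h_{J1}(\mathbf{u} - \mathbf{u}^h)]^T$
$\qquad = \int \nabla p^h_{J1}(\mathbf{u} - \mathbf{u}^h) : [\nabla p^h_{J1}(\mathbf{u} - \mathbf{u}^h)]^T$
$\qquad = \|p^h_{J1}\mathbf{u} - \mathbf{u}^h\|_1^2,$
and thus
$\|p^h_{J1}\mathbf{u} - \mathbf{u}^h\|_1^2 = \int (P - P^h)\,\nabla\cdot p^h_{J1}(\mathbf{u} - \mathbf{u}^h).$
(vi) We now turn to the RHS; since $P^h \in S^h$ and $p^h_{J1}(\mathbf{u} - \mathbf{u}^h) \in J^h$, and by the
definition of $J^h$: $\int q^h\,\nabla\cdot\mathbf{v}^h = 0\ \forall\,q^h \in S^h$ and $\forall\,\mathbf{v}^h \in J^h$, the second term on the RHS
of the above equation is identically zero. But another (and useful) term that is
also identically zero for the same reason is $\int (p^h_0 P)\,\nabla\cdot p^h_{J1}(\mathbf{u} - \mathbf{u}^h)$. Thus, we can
replace $P^h$ above by the $L^2$-projection of $P$ (replacing one zero term by another)
to obtain
$\|\mathbf{u}^h - p^h_{J1}\mathbf{u}\|_1^2 = \int (P - p^h_0 P)\,\nabla\cdot p^h_{J1}(\mathbf{u} - \mathbf{u}^h)$
and then use the Cauchy-Schwarz inequality to obtain
$\|\mathbf{u}^h - p^h_{J1}\mathbf{u}\|_1^2 \le \|P - p^h_0 P\|_0 \cdot \|\nabla\cdot p^h_{J1}(\mathbf{u} - \mathbf{u}^h)\|_0.$
(vii) Now consider $\|\nabla\cdot\mathbf{u}\|_0^2 = \int \left(\sum_{i=1}^d u_{i,i}\right)^2 \le d\int \sum_{i=1}^d u_{i,i}^2$, obtained by using
(again) the Cauchy-Schwarz inequality, this time in the form
$\left(\sum_{i=1}^d a_i b_i\right)^2 \le \sum_{i=1}^d a_i^2 \sum_{i=1}^d b_i^2$ with $b_i = 1$.
(viii) Next, use the obvious inequality $\int \sum_{i=1}^d u_{i,i}^2 \le \int \sum_{i=1}^d\sum_{j=1}^d u_{i,j}^2 = \|\mathbf{u}\|_1^2$ to obtain
$\|\nabla\cdot\mathbf{u}\|_0 \le \sqrt{d}\,\|\mathbf{u}\|_1$, and thus
$\|\nabla\cdot p^h_{J1}(\mathbf{u} - \mathbf{u}^h)\|_0 \le \sqrt{d}\,\|p^h_{J1}(\mathbf{u} - \mathbf{u}^h)\|_1 = \sqrt{d}\,\|\mathbf{u}^h - p^h_{J1}\mathbf{u}\|_1.$
(ix) Thus, finally,
$\|\mathbf{u}^h - p^h_{J1}\mathbf{u}\|_1^2 \le \|p^h_0 P - P\|_0 \cdot \sqrt{d}\,\|\mathbf{u}^h - p^h_{J1}\mathbf{u}\|_1;$
i.e.,
$\|\mathbf{u}^h - p^h_{J1}\mathbf{u}\|_1 \le c_1\|p^h_0 P - P\|_0$ where $c_1 = \sqrt{d}$. QED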
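The only 'analytic' ingredient in the constant $c_1 = \sqrt{d}$ is the pointwise inequality $(\sum_i u_{i,i})^2 \le d\sum_{i,j} u_{i,j}^2$ behind steps (vii) and (viii). The Python sketch below simply spot-checks that inequality on random $d \times d$ 'velocity gradient' matrices; it is an illustration under our own naming (the helper functions are hypothetical), not part of the proof.

```python
import random

def divergence_sides(grad):
    """Return (trace^2, d * squared Frobenius norm) for a d-by-d gradient."""
    d = len(grad)
    trace = sum(grad[i][i] for i in range(d))
    frob_sq = sum(grad[i][j] ** 2 for i in range(d) for j in range(d))
    return trace ** 2, d * frob_sq

def divergence_bound_holds(grad, tol=1e-12):
    """True when (div u)^2 <= d * |grad u|^2 at this point."""
    lhs, rhs = divergence_sides(grad)
    return lhs <= rhs + tol

def random_gradient(d, rng):
    """A random d-by-d matrix standing in for the gradient of u at a point."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(d)]

rng = random.Random(0)
print(all(divergence_bound_holds(random_gradient(d, rng))
          for d in (2, 3) for _ in range(1000)))
```

Equality is attained for a pure dilatation (gradient equal to the identity), so $\sqrt{d}$ cannot be improved by this argument alone.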
3. Thus, we arrive at [see step (1)]
$\|\mathbf{u}^h - \mathbf{u}\|_1 \le c_1\|p^h_0 P - P\|_0 + \|\mathbf{u} - p^h_{J1}\mathbf{u}\|_1,$
a result that is similar to (different constants) equation (1.11) of Girault and Raviart (1986),
to equation (2.18) of Gunzburger (1986), and to equation (2.11) of Brezzi and Fortin
(1991), to name three recent texts. But we shall take this result 'farther' than did any of
the above authors, starting with the approximation theory result, $\|p^h_0 P - P\|_0 = c_2 h$ [on
uniform grids, the error is actually smaller yet, $O(h^2)$], to write
$\|\mathbf{u}^h - \mathbf{u}\|_1 \le c_1 c_2 h + \|\mathbf{u} - p^h_{J1}\mathbf{u}\|_1.$
Remark:
If the difference between a divergence-free velocity and its discretely divergence-free
$H^1$-projection can be shown to approach zero like $O(h^k)$, then the Malkus-Olsen (1984)
Constrained Approximation Condition (CAC) will be satisfied to $O(h^k)$.
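4. To deal with the second term on the RHS, we begin with the obvious inequality
$\|\mathbf{u} - p^h_{J1}\mathbf{u}\|_1 \le \|\mathbf{v}^h - \mathbf{u}\|_1, \quad \forall\,\mathbf{v}^h \in J^h,$
because $p^h_{J1}\mathbf{u}$ is the closest function in $J^h$ to $\mathbf{u}$.
5. Next, take a particularly useful $\mathbf{v}^h$, namely, $\mathbf{v}^h = p^h_{J1}\mathbf{u}^h_I = p^h_{J1}\,p^h_I\mathbf{u}$, so that
$\mathbf{v}^h - \mathbf{u} = p^h_{J1}\mathbf{u}^h_I - \mathbf{u}^h_I + \mathbf{u}^h_I - \mathbf{u}$ to give, via the triangle inequality,
$\|\mathbf{v}^h - \mathbf{u}\|_1 = \|p^h_{J1}\mathbf{u}^h_I - \mathbf{u}\|_1 \le \|p^h_{J1}\mathbf{u}^h_I - \mathbf{u}^h_I\|_1 + \|\mathbf{u}^h_I - \mathbf{u}\|_1$
and thus
$\|\mathbf{u}^h - \mathbf{u}\|_1 \le c_1 c_2 h + \|p^h_{J1}\mathbf{u}^h_I - \mathbf{u}^h_I\|_1 + \|\mathbf{u}^h_I - \mathbf{u}\|_1.$
6. Another application of approximation theory yields $\|\mathbf{u}^h_I - \mathbf{u}\|_1 = c_3 h$ to give, with
$c_4 = c_1 c_2 + c_3$,
$\|\mathbf{u}^h - \mathbf{u}\|_1 \le c_4 h + \|p^h_{J1}\mathbf{u}^h_I - \mathbf{u}^h_I\|_1.$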
7. All we need to show now is that the interpolant of $\mathbf{u}$ into the bilinear basis, $\mathbf{u}^h_I$, is
close enough, $O(h)$ or better, to its discretely divergence-free projection, a seemingly
trivial task. (But it is actually not very trivial, at least 'our' way.) We now seek
$\mathbf{w}^h = p^h_{J1}\mathbf{u}^h_I - \mathbf{u}^h_I$ and its magnitude, $\|\mathbf{w}^h\|_1$. This requires finding the discretely
divergence-free $H^1$-projection of $\mathbf{u}^h_I$ onto $J^h$; i.e., $p^h_{J1}\mathbf{u}^h_I$. From Appendix 3, we know that $p^h_{J1}\mathbf{u}^h_I$ is
obtained by solving the following algebraic system: $Kv + Cq = K\tilde{u}$ and $C^Tv = 0$ for $q$
(the Lagrange multiplier, at element centroids, associated with the projection) and $v$
[the nodal values of the projection, $p^h_{J1}\mathbf{u}^h_I = \sum_{j=1}^n v_j\phi_j(\mathbf{x})$], where $\tilde{u}$ is the vector of nodal
values of the interpolant, $\mathbf{u}^h_I(\mathbf{x}) = \sum_{j=1}^n \tilde{u}_j\phi_j(\mathbf{x})$, where $\tilde{u}_j = \mathbf{u}(\mathbf{x}_j)$. The nodal values
of $\mathbf{w}^h(\mathbf{x}) = \sum_{j=1}^n w_j\phi_j(\mathbf{x})$ are then $w_j = v_j - \tilde{u}_j$, $j = 1, 2, \ldots, n$, and we seek
$\|\mathbf{w}^h\|_1 = (w^TKw)^{1/2} = \|p^h_{J1}\mathbf{u}^h_I - \mathbf{u}^h_I\|_1$. Solving the algebraic system of the projection yields
$q = (C^TK^{-1}C)^{-1}C^T\tilde{u}$ and $w = -K^{-1}Cq = -K^{-1}C(C^TK^{-1}C)^{-1}C^T\tilde{u}$, and thus
$w^TKw = [-K^{-1}C(C^TK^{-1}C)^{-1}C^T\tilde{u}]^T K\,[-K^{-1}C(C^TK^{-1}C)^{-1}C^T\tilde{u}]$
$\qquad = (C^T\tilde{u})^T(C^TK^{-1}C)^{-1}(C^T\tilde{u}).$
Remark:
$w$ is the minimum norm solution of $C^Tw = -C^T\tilde{u}$, i.e., $w^TKw$ is a minimum. $w$ has
the same discrete divergence as $\tilde{u}$, but $w$ is smaller than $\tilde{u}$ in $H^1$. [It is in fact much
smaller, $\mathbf{w}^h = O(h)$, as we shall see; and, of course, $\tilde{u} \sim O(1)$.]
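The algebra of step 7 can be checked on a small random system. The NumPy sketch below is a hedged illustration only: `K` is a random symmetric positive-definite stand-in for the stiffness matrix, `C` a random full-rank stand-in for the gradient matrix, and the function names are ours; nothing here is assembled from an actual $Q_1Q_0$ mesh.

```python
import numpy as np

def discretely_div_free_projection(K, C, u):
    """Solve K v + C q = K u together with C^T v = 0 for (v, q)."""
    n, m = C.shape
    saddle = np.block([[K, C], [C.T, np.zeros((m, m))]])
    rhs = np.concatenate([K @ u, np.zeros(m)])
    sol = np.linalg.solve(saddle, rhs)
    return sol[:n], sol[n:]

def make_test_problem(n=12, m=5, seed=0):
    """Random SPD 'stiffness' K, full-rank 'gradient' C, and nodal vector u."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((n, n))
    K = B @ B.T + n * np.eye(n)      # symmetric positive definite
    C = rng.standard_normal((n, m))  # full column rank with probability one
    u = rng.standard_normal(n)
    return K, C, u

K, C, u = make_test_problem()
v, q = discretely_div_free_projection(K, C, u)
w = v - u
S = C.T @ np.linalg.solve(K, C)      # C^T K^{-1} C
g = C.T @ u
print(np.allclose(C.T @ v, 0.0))                          # projection is discretely div-free
print(np.isclose(w @ K @ w, g @ np.linalg.solve(S, g)))   # energy identity of step 7
```

Solving the saddle-point system directly, rather than forming $K^{-1}$, mirrors how the projection would be computed in practice; the closed-form expression is used only as a check.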
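8. Introducing now (for reasons soon to become clear) the mass matrix on $S^h \subset L^2$, $Q_{ij} = \int \psi_i\psi_j$, yields
$w^TKw = (Q^{-1/2}C^T\tilde{u})^T\,[Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}]\,(Q^{-1/2}C^T\tilde{u})$
$\qquad \le \|Q^{-1/2}C^T\tilde{u}\|_2^2 \cdot \|Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}\|_2$
from $x^Ty \le \|x\| \cdot \|y\|$, via Cauchy-Schwarz, and from $\|Ax\| \le \|A\| \cdot \|x\|$ (a property of
an induced matrix norm) with $y = Ax$.
9. Now recall that the $L^2$-norm of a (symmetric positive-definite) matrix $A$ is given by its
largest eigenvalue, $\|A\|_2 = \lambda_{max}(A)$. So, we have
$\|Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}\|_2 = \lambda_{max}[Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}]$
$\qquad = 1/\lambda_{min}(Q^{-1/2}C^TK^{-1}CQ^{-1/2})$
because $\lambda_{max}(A) = 1/\lambda_{min}(A^{-1})$. Thus, we have that
$\|\mathbf{w}^h\|_1^2 = w^TKw \le \|Q^{-1/2}C^T\tilde{u}\|_2^2/\lambda_{min}(Q^{-1/2}C^TK^{-1}CQ^{-1/2}).$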
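The bound of steps 8 and 9 can likewise be spot-checked numerically. In the NumPy sketch below the pressure mass matrix is taken to be diagonal (as it is for piecewise-constant pressures, where each entry is an element area), passed as the vector `q_diag`; the matrices themselves are random stand-ins and the function name is hypothetical.

```python
import numpy as np

def projection_energy_and_bound(K, C, q_diag, u):
    """Return (w^T K w, upper bound from steps 8-9, lambda_min)."""
    g = C.T @ u
    S = C.T @ np.linalg.solve(K, C)        # C^T K^{-1} C
    energy = g @ np.linalg.solve(S, g)     # w^T K w, from step 7
    q_inv_sqrt = 1.0 / np.sqrt(q_diag)     # Q^{-1/2} for a diagonal Q
    S_scaled = q_inv_sqrt[:, None] * S * q_inv_sqrt[None, :]
    lam_min = np.linalg.eigvalsh(S_scaled)[0]   # eigenvalues come back ascending
    bound = np.sum((q_inv_sqrt * g) ** 2) / lam_min
    return energy, bound, lam_min

rng = np.random.default_rng(1)
n, m = 10, 4
B = rng.standard_normal((n, n))
K = B @ B.T + n * np.eye(n)          # SPD stand-in for the stiffness matrix
C = rng.standard_normal((n, m))      # stand-in for the gradient matrix
q_diag = rng.uniform(0.5, 2.0, m)    # stand-in element areas
u = rng.standard_normal(n)
energy, bound, lam_min = projection_energy_and_bound(K, C, q_diag, u)
print(energy <= bound, lam_min > 0.0)
```

The bound is attained only when $Q^{-1/2}C^T\tilde{u}$ lies along the eigenvector belonging to $\lambda_{min}$, which is one way to see why 'reasonable' data rarely feel the worst-case constant.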
10. But $Q^{-1/2}C^TK^{-1}CQ^{-1/2}$ is, or is equivalent to, an 'LBB stability matrix' (see
Section 3.13.2d) with the following known results:
(i) $\lambda_{min} = k_h^2$,
where $k_h$ is the LBB stability constant;
(ii) $k_h = O(h)$
for a sequence of meshes of uniform rectangles or for quasi-uniform meshes generated by
uniform refinement of an initially 'arbitrary' mesh of quadrilaterals (Chapelle and Bathe,
1993; and Griffiths and Silvester, 1994, some of whose results we discuss in the next
section); and, importantly,
(iii) $k_h = O(1)$
for most sequences of 'general' meshes [Malkus (1981); Brezzi and Fortin (1991, p. 244),
who also stated that 'This last fact is still resisting analysis'; Malkus and Olsen (1984);
Silvester and Gresho (1992, unpublished experiments); and Griffiths and Silvester (1994)].
Admittedly, the need to lean on some experimental results weakens (negates, in fact, for
most mathematicians) the alleged 'proof.' But we, and ostensibly many others, can live
with this 'problem.'
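11. $\|Q^{-1/2}C^T\tilde{u}\|_2^2 = (Q^{-1}C^T\tilde{u})^T Q\,(Q^{-1}C^T\tilde{u}) = \|Q^{-1}C^T\tilde{u}\|_Q^2$, where (recall, or see
Appendix 3) the matrix $-Q^{-1}C^T$ corresponds to the weak divergence operator: i.e.,
$\nabla_h\cdot\mathbf{u}(\mathbf{x}) = -\sum_{j=1}^N (Q^{-1}C^Tu)_j\,\psi_j(\mathbf{x})$, where $u_j$ is the value of $\mathbf{u}(\mathbf{x})$ at node $j$, and thus [see too
(3.13-141)]
$\|\nabla_h\cdot\mathbf{u}\|_0^2 = \sum_j\sum_k (Q^{-1}C^Tu)_j\,(Q^{-1}C^Tu)_k \int \psi_j\psi_k = \|Q^{-1}C^Tu\|_Q^2,$
to give
$\|Q^{-1/2}C^T\tilde{u}\|_2 = \|\nabla_h\cdot\mathbf{u}^h_I\|_0 = \|Q^{-1}C^T\tilde{u}\|_Q.$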
12. But from Johnson and Pitkäranta (1982) and Oden et al. (1982) (see also Olsen,
1983) we have the following 'superapproximation' result on a mesh of rectangles for
$\mathbf{u}^h_I$: $\|Q^{-1}C^T\tilde{u}\|_Q = O(h^2)$ rather than the expected $O(h)$; the discrete divergence of the
(bilinear) interpolant of a divergence-free vector is 'extra small', another attribute of the
$Q_1Q_0$ element. [For a general mesh of quadrilaterals, the normal approximation theory
result, $\|Q^{-1}C^T\tilde{u}\|_Q = O(h)$, obtains.]
13. Thus, finally, we have
$\|\mathbf{w}^h\|_1 \le \|Q^{-1}C^T\tilde{u}\|_Q \big/ \sqrt{\lambda_{min}(Q^{-1/2}C^TK^{-1}CQ^{-1/2})} = \|Q^{-1}C^T\tilde{u}\|_Q/k_h;$
and we (finally) see why the discrete projection of the interpolant to the discretely
divergence-free subspace [a la step (7)] is perhaps not so trivial. It is, or rather it could
be, amplified by a badly behaved LBB 'constant'. From steps (8)-(11), we obtain $\|w\|_1 \le O(h^2)/O(h)$
for rectangles (or quasi-uniform mesh refinements) and $\|w\|_1 \le O(h)/O(1)$
for a general mesh where, in the latter case, we have no superapproximation of $(\nabla_h\cdot)$
to aid us; and, fortunately, we do not need it.
14. Finally, inserting these last results, $\|w\|_1 \le c_5 h$, into step (6) gives the final result:
$\|\mathbf{u}^h - \mathbf{u}\|_1 \le ch;$
the LBB-unstable $Q_1Q_0$ element converges at the optimal rate.
Finally, lest we be accused of the worst form of CFD (Charlatan Fluid Dynamics;
another form of CFD, not quite so bad, is Colorful Fluid Dynamics, in which GREAT
GRAFIX can replace/displace serious analysis of numerical results), we hasten to add
the following remarks, partly to somewhat temper the above conclusion (but in no way
negating it):
1. A wealth of laboratory experience seems to support this conclusion—or at least does
not refute it.
2. Convergence of the pressure is in no way implied in the above—all we can offer here
is the combination of experience and hope, the former of course being more useful; we
have never seen divergent pressures, and our filtered pressures, from 'proper' meshes, are
virtually always acceptable if not excellent.
3. 'Bad data'—i.e., the forcing function, f(x)—could conceivably weaken the above
conclusion; e.g., if the 'crazy' mesh-dependent data discussed by (or implied by) Boland
and Nicolaides (1985) were imposed, the $Q_1Q_0$ element might fail to converge, which
failure must (we believe) show up in an 'unstable' interpolant in the above proof. See
also Arnold et al. (1984), who offer—among many other things—the following advice:
'Certainly formal insistance that the (LBB) stability condition is satisfied is
inappropriate.' For counter arguments to our counter arguments, see the article by Nicolaides in
Gunzburger and Nicolaides (1993).
4. We conjecture, but cannot prove, that the same convergence rate obtains for more
general, and inhomogeneous, BC's than just $\mathbf{u} = 0$ on $\Gamma$, at least as long as the (spurious)
CB-constraint equation is satisfied (or absent).
5. If the LBB 'constant' was more badly behaved, say $k_h = O(h^3)$ on rectangles, then
the result of this theory, $\|w\|_1 \le O(h^{-1})$, would be that it too (like LBB theory) is simply
not good enough. That is to say, it is very important to realize that the theory does not
say that $\|w\|_1 = \|\nabla_h\cdot\mathbf{u}^h_I\|_0/k_h$, only that $\|w\|_1$ is bounded by $\|\nabla_h\cdot\mathbf{u}^h_I\|_0/k_h$; if this term
$\to \infty$ for $h \to 0$, then the theory simply becomes silent/useless. Both theories can, if
conditions are 'right,' be used to prove convergence. But, when conditions are wrong,
they do not prove divergence! (in spite of numerous implied statements in the literature).
6. A quotation from Olsen (1983), to help wrap it up, is relevant here: 'This generally
favorable computational experience, together with the fact that the LBB condition is only
a sufficient condition for convergence, suggests that if the bilinear rectangle and crossed
triangle elements fail to satisfy the LBB condition, it is as much the fault of the LBB
condition as of the elements.' And another, from Malkus and Olsen (1984), '... when a
"good" approximation exists in the null space of the discrete divergence operator, velocity
convergence can take place without the LBB condition.'
7. Even Girault and Raviart (1986) seem to believe that the LBB condition is not so
sacrosanct: 'On the other hand, quadrilateral elements (more precisely, rectangular elements)
provide excellent examples of schemes which do not satisfy the inf-sup condition and
yet can be proved to converge with optimal accuracy.'
Fig. 3.13-27 An example of a CB-precluding 'mesh'; each 4-patch is converted to a 5-patch.
8. For further discussion of results using the technique from which the above analysis
was derived, see Malkus and Olsen (1984) and Appendix 4.II (by D. Malkus) in Hughes
(1987), which, incidentally, is independently recommended reading. For example, the
crossed linear triangle element (a macro element) is also 'optimally constrained' yet does
not pass the LBB stability test.
9. For fans of $Q_1Q_0$ who want guaranteed optimal convergence of both $\mathbf{u}$ and $P$ (with,
however, larger error constants caused by the distorted shapes?), one way to assure this
is to discretize via the macro elements of Figure 3.13-27, each composed of five $Q_1Q_0$
quadrilaterals. These meshes can be made, e.g., from a mesh of nine-node elements by
'splitting in two' the central node in each element. Such 'CB-killer' meshes have been
employed in practice by (at least) J. Bathe (see, for example, Chapelle and Bathe, 1993)
and J. Schutt (personal communication). Both the macro-element and the proof are due
to Stenberg (1984).
10. A recent relevant reference that supports our position that an unstable element will
behave nicely for a wide range of input data (but not for all data) and that demonstrates
the notion in 1D is Babuska and Narisimhan (1997).
k. Quantitative description of some unstable modes
In Section 3.13.2b on filtering $Q_1Q_0$ pressure modes, we introduced the concept of 'pesky
modes' which are, in fact, just those most unstable LBB modes, where 'most unstable' =>
smallest eigenvalue—though it turns out that there are many others that are also unstable,
at least on uniform meshes. In this section, we quantify some of these modes, following
Griffiths and Silvester (1994—'GS')—but generalizing (somewhat) their results—after
noting that these modes were actually 'discovered' earlier (1992) in the CFD laboratory
by Silvester and Gresho, using MATLAB.
Using the method of modified equations, GS obtained the following results for the
eigenproblems of (3.13-150), (3.13-151), and (3.13-156) for a uniform mesh of square
($h \times h$) elements for the 'Dirichlet problem' on the unit square:
$\mu_{m,n} = -\tfrac{3}{8}\pi^2h^2(m^2 + n^2)$ (3.13-369)
for $m, n = 0, 1, 2, \ldots$ but 'small' relative to the number of elements in each direction;
i.e., it is an asymptotic result and applies for $mh \ll 1$ and $nh \ll 1$. The corresponding
('LBB') eigenvectors, for the $m, n$ mode, are smooth modes that are modulated by the
infamous CB-mode, and are thus no longer smooth:
u
ij
Jtj
■Qi+l/2j+l/2 J
(-1)
i+j
— ^7t2h2 cos imnh sin jnnh
— 4 7t2h2 sin imjth cos jn nh
7rcos(/ — l/2)mnhcos(j — l/2)nnh_
(3.13-370)
where $x_i = ih$ and $y_j = jh$, $i, j = 1, 2, \ldots, 1/h - 1$ are the internal nodal locations, and
we presume $1/h$ to be an integer, for simplicity. Note that $u_{ij}$ vanishes smoothly at
the top and bottom, but roughly (in a boundary layer of thickness $h$) at the two sides
(because $u = v = 0$ on $\Gamma$). Also noteworthy is that the 'velocity modes' are quite small
in amplitude relative to the pressure mode, owing to the $h^2$ factor, which hopefully helps
explain why these pesky modes are rarely seen in computations, although another reason
that they are small or absent is that the 'CB-factor,' $(-1)^{i+j}$, will make these modes
nearly orthogonal to virtually all 'reasonable' (not CB-ish) data. A final reason that the
pesky velocity modes are sometimes absent is explained by the expansion (3.13-167),
and the realization that the unstable modes are represented by the first 'few' terms of the
first summation, and noting that $g = 0$ suppresses all of the velocity portions; i.e., only
$g \ne 0$ can excite the pesky velocities.
Fig. 3.13-28 First of two most unstable LBB modes, 16 x 16 mesh. Panels: (a) raw pressure, contours; (b) velocity; (c) smooth pressure, contours; (d) smooth pressure, mesh surface.
Setting $m = 1$ and $n = 0$ or $m = 0$ and $n = 1$ yields the lowest mode with nonzero $\mu$,
and with it a quantitative value for the LBB constant [cf. (3.13-159) and (3.13-162)]:
kh = V^oT = vWi(Mo,i-D = ^\/i + 0(/*2)- (3.13-371)
Figures 3.13-28 through 3.13-35, kindly supplied by D. Silvester, display the first
few unstable modes pictorially on two meshes (to show 'mode similarity' with mesh
refinement). They were obtained numerically, via MATLAB, not via (3.13-370); and
were done, in part, to verify (3.13-370). Shown are the first four LBB unstable modes
corresponding to (3.13-369) and (3.13-370); namely, for (m,n) = (1,0), (0, 1), (1, 1),
and (2, 0) + (0, 2)—the last one being 'MATLAB's choice' since both modes have the
same eigenvalue. In each plot, (a) shows the computed (and polluted) pressure part,
(b) shows the corresponding velocities, (c) is the filtered pressure, and (d) is a 3D
perspective/isometric plot of the smoothed pressures, sometimes rotated for better viewing.
Even though the pictures speak well for themselves, a few remarks may be in order:
1. Clearly the same modes are present on both meshes.
2. Clearly the mode shapes are well-described by (3.13-370).
Fig. 3.13-29 Second of two most unstable LBB modes, 16 x 16 mesh. Panels: (a) raw pressure, contours; (b) velocity; (c) smooth pressure, contours; (d) smooth pressure, mesh surface.
Fig. 3.13-30 Next unstable LBB mode, 16 x 16 mesh. Panels: (a) raw pressure, contours; (b) velocity; (c) smooth pressure, contours; (d) smooth pressure, mesh surface.
3. MATLAB sometimes 'chooses' a + sign, sometimes a — sign.
4. The smoothed pressures were not obtained with our 'standard' CB filter to the nodes.
Rather, they are still centroid pressures, obtained by multiplying the computed ('raw')
pressures by $(-1)^{i+j}$.
5. Since both $\mathbf{u}$ and $P$ oscillate as fast as possible, with the 'frequency' (wave number)
increasing with increasing number of elements, it is clear from (3.13-167) that both $(w_j^T f)$
and $(q_j^T g)$ will be small for reasonable (smooth) data ($f$ and $g$) and will $\to 0$ as $h \to 0$.
We conclude by comparing the corresponding numerical eigenvalues with those from
the new theory, (3.13-369): the entries in Table 3.13-10 are $(-8\mu_{m,n}/3\pi^2h^2)$.
Additional Remarks
(1) For a more general grid of rectangular elements, the CB-factor in (3.13-370)
probably becomes $(-1)^{i+j}/A_{i-1/2,j-1/2}$, where $A_{i-1/2,j-1/2}$ is the area of the associated
element, and the simple trigonometric parts of the eigenvector are no longer so
simple, and not available in closed form. The eigenvalues, likewise, are not simply
given by (3.13-369). But we (and D. Griffiths, personal communication) believe that
the 'general' picture remains much the same.
Fig. 3.13-31 Sum of LBB modes (2,0) and (0,2), 16 x 16 mesh. Panels: (a) raw pressure, contours; (b) velocity; (c) smooth pressure, contours; (d) smooth pressure, mesh surface.
(2) The area-weighted pressure filter applied to a typical 4-patch generated by (3.13-370)
gives, at node $ij$,
$P_{ij} = \frac{\pi}{4}(-1)^{i+j}\,[\cos(i + 1/2)m\pi h - \cos(i - 1/2)m\pi h] \times [\cos(j + 1/2)n\pi h - \cos(j - 1/2)n\pi h]$
$\qquad = \pi(-1)^{i+j}\,\sin im\pi h\,\sin\frac{m\pi h}{2}\,\sin jn\pi h\,\sin\frac{n\pi h}{2},$
(3.13-372)
which is clearly $O(h^2)$. The 'standard' CB filter + smoother is thus not perfect for
the pesky modes, but it is quite good on good meshes (small $h$).
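The $O(h^2)$ behaviour of the filtered pesky mode can be confirmed with a few lines of Python. This sketch rests on several assumptions of ours: the pressure part of (3.13-370) is taken as $(-1)^{i+j}\pi\cos(i - 1/2)m\pi h\,\cos(j - 1/2)n\pi h$ for the element whose upper-right node is $(i, j)$, the mesh is uniform (so the area weights are all 1/4), and the function names are hypothetical.

```python
import math

def raw_pressure(i, j, m, n, h):
    """Assumed centroid pressure of mode (m, n) in the element with upper-right node (i, j)."""
    return ((-1) ** (i + j) * math.pi
            * math.cos((i - 0.5) * m * math.pi * h)
            * math.cos((j - 0.5) * n * math.pi * h))

def filtered_pressure(i, j, m, n, h):
    """Equal-area average of the four centroid pressures surrounding node (i, j)."""
    return 0.25 * sum(raw_pressure(a, b, m, n, h)
                      for a in (i, i + 1) for b in (j, j + 1))

def closed_form(i, j, m, n, h):
    """Product-of-sines form of the filtered value."""
    return ((-1) ** (i + j) * math.pi
            * math.sin(i * m * math.pi * h) * math.sin(0.5 * m * math.pi * h)
            * math.sin(j * n * math.pi * h) * math.sin(0.5 * n * math.pi * h))

def max_filtered(m, n, h):
    """Largest filtered amplitude over the interior nodes of the unit square."""
    count = round(1.0 / h)
    return max(abs(filtered_pressure(i, j, m, n, h))
               for i in range(1, count) for j in range(1, count))

# Halving h should reduce the filtered amplitude by roughly a factor of four.
print(max_filtered(1, 1, 1 / 16) / max_filtered(1, 1, 1 / 32))
```

The raw mode has amplitude $\pi$ regardless of $h$, so a ratio near four under mesh halving is the numerical signature of the $O(h^2)$ residual left by the filter.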
Fig. 3.13-32 Same as Figure 3.13-28 except 32 x 32 mesh.
(3) GS also found the 3D LBB constant:
(4)
kh
V3
,1,3-
TZl\t + 0(h").
(3.13-373)
They also showed, for the leaky lid-driven cavity problem in both 2D and 3D (when
well-posed) that the amplitude coefficient of the most unstable mode, see (3.13-165)
through (3.13-168), is 0(h) even though the minimum eigenvalue is 0(h2) in 2D and
0(h4) in 3D. This is an example of the contention we have made several times: the
bad modes are nearly orthogonal to 'reasonable' data and thus offer no impediment
to obtaining good results on good grids, and convergent results as h —► 0.
A final remark, which applies to both stable and unstable elements, is this: if a domain
has aspect ratio $l/h$ and is uniformly discretized into rectangular elements (with of course
the same aspect ratio), then the LBB constant goes to zero like $O((l/h)^2)$ for $l/h \to 0$ or
like $O((h/l)^2)$ for $l/h \to \infty$; from D. Silvester (personal communication, 1995).
/. The boundary vector, g
We mentioned earlier [Remark (7) below (3.13-32)] that g is a sparse vector that accounts
for inhomogeneous Dirichlet BC's in the mass conservation equation, $C^T u = g$. In this
little section we shall explicitly construct g for a special-but-general case. It is special
because we do it only for $Q_1Q_0$, and it is general because we consider an arbitrary
('isoparametric') mesh. Generalization to other elements is left as an exercise.
[Figure panels: (a) Raw pressure: contours; (b) Velocity; (c) Smooth pressure: contours; (d) Smooth pressure: mesh surface]
Fig. 3.13-33 Same as Figure 3.13-29 except 32 x 32 mesh.
Let us return to Figure 3.13-23, but imagine it to be generalized in the following
way: nodes 3 and 7 are no longer constrained to form a rectangle; i.e., the element is
envisioned to be an isoparametric quadrilateral with the only restriction being that nodes
4 and 8 still define a line in the x-direction. This restriction, too, could be removed if
we were willing to work in the transformed coordinate system (normal and tangential, a
la Section 3.13.1e), a complication we prefer to avoid for present purposes. Consulting
the element matrices of Appendix 1, it is straightforward to obtain the following mass
conservation equation, $C^T u = 0$, a 'preprocessor' step prior to enforcing Dirichlet data
upon boundary nodes 4 and 8 (nodes 3 and 7 are here both internal nodes, not on $\Gamma$):
$$2C^T u = (y_8 - y_3)(u_7 - u_4) + (y_4 - y_7)(u_8 - u_3)
+ (x_3 - x_8)(v_7 - v_4) + (x_4 - x_7)(v_3 - v_8) = 0. \tag{3.13-374}$$
[Figure panels: (a) Raw pressure: contours; (b) Velocity; (c) Smooth pressure: contours; (d) Smooth pressure: mesh surface]
Fig. 3.13-34 Same as Figure 3.13-30 except 32 x 32 mesh.
Dividing by two and transposing Dirichlet data to the RHS gives the 'true' $C^T$-matrix
and 'true' (unknown) velocity vector (u) as
$$C^T u = \tfrac12(y_8 - y_3)u_7 - \tfrac12(y_4 - y_7)u_3 + \tfrac12(x_3 - x_8)v_7 + \tfrac12(x_4 - x_7)v_3
= \tfrac12(y_8 - y_3)u_4 - \tfrac12(y_4 - y_7)u_8 + \tfrac12(x_3 - x_8)v_4 + \tfrac12(x_4 - x_7)v_8 = g, \tag{3.13-375}$$
and we see that the element contribution to the global g-vector is made up of the difference
between two neighboring tangential velocities ($u_4$ and $u_8$) and the sum of two neighboring
normal velocities ($v_4$ and $v_8$). The main thing to note is that for 'smooth' data [$u_4 =
u_8 + O(x_4 - x_8)$, etc.] the 'tangential' contribution to g is small compared with the normal
component, as also mentioned in the above Remark. Indeed, if we specialize to a simple
$l \times h$ rectangular element as in Figure 3.13-23, we obtain a simpler version of g:
$$g = \frac{h}{2}(u_4 - u_8) + \frac{l}{2}(v_4 + v_8), \tag{3.13-376}$$
from which it is obvious that normal contributions are generally much more 'important'
than tangential ones, the latter vanishing totally for the case of uniform shear velocity, all
of which is a simple consequence of (global) mass conservation. It is also clear that only
normal velocities remain in g as $l$ and $h \to 0$.
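As an illustration of (3.13-375) and its rectangular special case (3.13-376), here is a minimal Python sketch. The function name and the dictionary-based interface are ours; the node coordinates used in the usage note are an assumed layout (boundary nodes 4 and 8 on the top edge, node 4 to the right of node 8), chosen so that (3.13-376) is recovered, since Figure 3.13-23 itself is not reproduced here.

```python
def boundary_g(x, y, u, v):
    """Element contribution to g from the right-hand side of (3.13-375).

    x, y: dicts of nodal coordinates keyed by the node numbers 3, 4, 7, 8.
    u, v: dicts of prescribed (Dirichlet) velocities keyed by nodes 4 and 8.
    """
    return 0.5 * ((y[8] - y[3]) * u[4] - (y[4] - y[7]) * u[8]
                  + (x[3] - x[8]) * v[4] + (x[4] - x[7]) * v[8])

def rectangle_nodes(l, h):
    """Assumed l x h rectangle: nodes 4, 8 on the boundary (y = 0), nodes 3, 7 inside."""
    x = {4: l, 8: 0.0, 3: l, 7: 0.0}
    y = {4: 0.0, 8: 0.0, 3: -h, 7: -h}
    return x, y
```

With `x, y = rectangle_nodes(l, h)`, the value returned by `boundary_g` should agree with $(h/2)(u_4 - u_8) + (l/2)(v_4 + v_8)$, and the tangential part drops out whenever $u_4 = u_8$.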
[Figure panels: (a) Raw pressure: contours; (b) Velocity; (c) Smooth pressure: contours; (d) Smooth pressure: mesh surface]
Fig. 3.13-35 Same as Figure 3.13-31 except 32 x 32 mesh.
Table 3.13-10
             (m,n) = (1,0) = (0,1)    (m,n) = (1,1)       (m,n) = (2,0) = (0,2)
Mesh         Expt      Theory         Expt      Theory    Expt      Theory
8 x 8        0.772     1              1.188     2         2.162     4
16 x 16      0.900     1              1.613     2         3.278     4
32 x 32      0.955     1              1.822     2         3.732     4
The details, but not the concepts, will vary if one examines other elements than $Q_1Q_0$.
Indeed, if one has studied Section 3.13.2b on 'pressure modes,' s/he will know that the
explicit construction of g for two other elements, $Q_1Q_1$ and $Q_2Q_1$, has already been
presented in the two example problems discussed there.
3.13.6 Higher-Order Elements
We shall be fairly brief here—and only present a sample of 'final' results, because they
are mostly less than enlightening.
Nearly everything needed for building a typical GFEM equation has already been
presented in Chapter 2. Here we provide, along with Appendix 1, just enough that is not
in Chapter 2 to permit the interested reader to finish the equations, namely $\nabla P$ and
$\nabla\cdot u$; and we will provide only the $\partial/\partial x$ portions of these stencils for the quadrilateral
elements $Q_2Q_{-1}$, $Q_2Q_1$, and $Q_2P_{-1}$, leaving the rest as an exercise.
The key differences in the first-derivative operators from those of advection (with
constant velocity) shown in Chapter 2 are, of course, a consequence of 'mixed
interpolation' and, since P is only linear or bilinear, it is to be expected that the resulting
first derivatives will be both simpler and perhaps less accurate than the 'full' quadratic
approximation (trial and test functions) that is applied to advection.
a. $Q_2Q_1$
We begin with the 4-patch of Figure 3.13-36. By consulting the element matrices in
Appendix 1 one can easily construct the following operators:
$$\partial u/\partial x|_0 = -Q_L^{-1} C_x^T u|_0 \tag{3.13-377}$$
and
$$\partial P/\partial x|_0 = M_L^{-1} C_x P|_0, \tag{3.13-378}$$
where $Q_L$ and $M_L$ are the lumped mass matrices for pressure (bilinear basis functions) and
velocity (biquadratic basis functions) respectively; specifically, $Q_L = lh$ and $M_L = lh/9$
for the 4-patch.
Since P 'lives' only at corner nodes, so too does $\nabla\cdot u$, so all we need to investigate
is $\partial u/\partial x$ for the 4-patch shown, for which we obtain
$$\partial u/\partial x|_0 = \tfrac19\{2[(u_{SE} - u_{SW})/l + (u_E - u_W)/l + (u_{NE} - u_{NW})/l]
+ (u_{SEE} - u_{SWW})/2l + (u_{EE} - u_{WW})/2l + (u_{NEE} - u_{NWW})/2l\}, \tag{3.13-379}$$
which is indeed much simpler than $\partial u/\partial x$ when the test functions are also biquadratic.
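The weights in (3.13-379) can be recovered from two sets of one-dimensional integrals: a linear 'hat' against the quadratic basis functions (the y-direction) and against their derivatives (the x-direction). The sketch below is ours, not the Appendix 1 element matrices; it evaluates $\int \psi_0\,\partial u/\partial x$ directly and divides by $Q_L = lh$, rather than forming $C_x^T$, and it assumes a uniform 4-patch.

```python
from numpy.polynomial import Polynomial as Poly

def integral01(p):
    """Exact integral of a polynomial over [0, 1]."""
    q = p.integ()
    return q(1.0) - q(0.0)

xi = Poly([0.0, 1.0])
# 1D quadratic Lagrange basis on [0, 1] with nodes at xi = 0, 1/2, 1
quad = [(1 - xi) * (1 - 2 * xi), 4 * xi * (1 - xi), xi * (2 * xi - 1)]
hat = 1 - xi  # one half of the linear pressure 'hat' centred at xi = 0

def q2q1_dudx_weights():
    """Return w with du/dx|_0 = (1/l) * sum_{r,c} w[r][c] * u[r][c].

    Rows run SS, S, 0, N, NN; columns run WW, W, 0, E, EE.
    """
    m = [integral01(hat * N) for N in quad]          # 1/6, 1/3, 0
    d = [integral01(hat * N.deriv()) for N in quad]  # -5/6, 2/3, 1/6
    # Mirror the one-element integrals to cover both elements meeting at node 0.
    wy = [m[2], m[1], 2 * m[0], m[1], m[2]]
    wx = [-d[2], -d[1], 0.0, d[1], d[2]]
    return [[a * b for b in wx] for a in wy]
```

In this sketch the entry for node E in row 0 is 2/9 and the entry for node EE is 1/18, i.e. $(1/9)\cdot 2/l$ and $(1/9)\cdot 1/(2l)$ after the factor $1/l$ is restored, and rows SS and NN carry no weight.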
Turning now to $\partial P/\partial x$, we must address the three different types of nodes: corner,
midside, and center. We begin with the corner node; for the uniform grid 4-patch, the
result is strikingly simple:
$$\partial P/\partial x|_0 = (P_{EE} - P_{WW})/2l, \tag{3.13-380}$$
really. Even a variable-rectangular grid brings just a little more coupling: only node 0
gets into the act/stencil, a fact clearly revealed by perusing the element matrix, $C_x^T$, in
Appendix 1.
[Figure: 5 x 5 array of velocity nodes labeled NNWW through SSEE around the central node 0; lower-case labels (nnww, nnw, nww, nw) mark the pressure nodes of the top-left element]
Fig. 3.13-36 A 4-patch of biquadratic elements.
Turning now to the 2-patch containing midside node E, we obtain another surprising
lack of coupling and resulting simple result, here using $M_L = 2hl/9$:
$$\partial P/\partial x|_E = (P_{EE} - P_0)/l. \tag{3.13-381}$$
The 2-patch for midside node N is only slightly more involved:
$$\partial P/\partial x|_N = \tfrac12[(P_{EE} - P_{WW})/2l + (P_{NNEE} - P_{NNWW})/2l]. \tag{3.13-382}$$
Finally, the center node equation, with $M_L = 4lh/9$, is, for node NE,
$$\partial P/\partial x|_{NE} = \tfrac12[(P_{EE} - P_0)/l + (P_{NNEE} - P_{NN})/l], \tag{3.13-383}$$
and we seem to be led to an obvious overall 'conclusion' that is at least sometimes borne
out in practice: the element may not be very accurate in $\nabla\cdot u$ and $\nabla P$, at least relative
to the other terms in the NSE's that benefit from 'full quadratic' approximation. This
probably also helps to explain why $Q_2P_{-1}$ and $Q_2Q_{-1}$ outperform $Q_2Q_1$, and we state the
additional fact that helps to reinforce our assessment: because $Q_2Q_1$ uses a $C^0$ pressure
approximation, element mass balances do not exist, whereas the $C^{-1}$-pressures in $Q_2Q_{-1}$
and $Q_2P_{-1}$ do generate element-level mass balances.
To complete our brief analysis of $Q_2Q_1$, we shall present the (lumped mass) 'Laplacian',
$C^T M_L^{-1} C$, which would be used for explicit time integration of the PPE version of the
semi-discrete NSE's. Appendix 1 shows the stencils for both $C_x^T M_L^{-1} C_x$ and $C_y^T M_L^{-1} C_y$.
Summing these gives $C^T M_L^{-1} C$ and multiplication by $-Q_L^{-1}$ gives, upon rearrangement,
the 'familiar' (finite difference) representation of the Laplacian. We present the result in
several steps, using Figure 3.13-36, but this time the velocity nodes are to be interpreted
as pressure nodes; i.e. the sketch now represents a 16-patch (4 x 4) of 9-node elements
containing 25 pressures (the size of the patch is now $4l \times 4h$).
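(1)
$$\partial^2 P/\partial x^2|_0 = -Q_L^{-1} C_x^T M_L^{-1} C_x P|_0 = \tfrac{1}{18}[(P_{SWW} - 2P_S + P_{SEE})/(2l)^2
+ 2(P_{SW} - 2P_S + P_{SE})/l^2 + 4(P_{WW} - 2P_0 + P_{EE})/(2l)^2
+ 8(P_W - 2P_0 + P_E)/l^2 + (P_{NWW} - 2P_N + P_{NEE})/(2l)^2
+ 2(P_{NW} - 2P_N + P_{NE})/l^2]. \tag{3.13-384}$$
(2)
$$\partial^2 P/\partial y^2|_0 = -Q_L^{-1} C_y^T M_L^{-1} C_y P|_0 = \tfrac{1}{18}[(P_{NNW} - 2P_W + P_{SSW})/(2h)^2
+ 2(P_{NW} - 2P_W + P_{SW})/h^2 + 4(P_{NN} - 2P_0 + P_{SS})/(2h)^2
+ 8(P_N - 2P_0 + P_S)/h^2 + (P_{NNE} - 2P_E + P_{SSE})/(2h)^2
+ 2(P_{NE} - 2P_E + P_{SE})/h^2]. \tag{3.13-385}$$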
(3) Adding these two gives $\nabla_h^2$. We present only the simpler result for a 'square' mesh,
$l = h$:
$$\nabla_h^2 P|_0 = -Q_L^{-1} C^T M_L^{-1} C P|_0 = \frac{1}{72h^2}[(P_{SWW} + P_{SEE} + P_{NWW} + P_{NEE} + P_{NNW}
+ P_{SSW} + P_{NNE} + P_{SSE}) + 4(P_{NN} + P_{SS} + P_{EE} + P_{WW}) + 14(P_N + P_S
+ P_E + P_W) + 16(P_{SW} + P_{SE} + P_{NW} + P_{NE}) - 144P_0], \tag{3.13-386}$$
which would probably better be presented in 'stencil' form, an exercise we leave to the
reader.
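One way to carry out the 'stencil form' exercise for (3.13-386) is sketched below in Python. The array layout and function names are our own choices, and the weights are simply those of (3.13-386) multiplied by $72h^2$; nodes that do not appear in the formula (the four far corners) receive zero weight.

```python
import numpy as np

# Weights of (3.13-386) times 72*h**2.
# Rows run NN, N, 0, S, SS (top to bottom); columns run WW, W, 0, E, EE.
STENCIL = np.array([
    [0,  1,    4,  1, 0],
    [1, 16,   14, 16, 1],
    [4, 14, -144, 14, 4],
    [1, 16,   14, 16, 1],
    [0,  1,    4,  1, 0],
], dtype=float)

def q2q1_laplacian(P, h):
    """Apply the stencil to a 5x5 array of pressures on a square mesh of spacing h."""
    return float(np.sum(STENCIL * P)) / (72.0 * h * h)

def sample(f, h):
    """Sample f(x, y) on the 5x5 patch centred at the origin; first row is y = +2h."""
    offs = np.arange(-2, 3) * h
    return np.array([[f(x, y) for x in offs] for y in offs[::-1]])
```

The weights sum to zero, so constants are annihilated, and the stencil should reproduce the Laplacian of any quadratic exactly (for instance, it should return 8 for $x^2 + 3y^2$).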
b. $Q_2Q_{-1}$
Here, in the first case (Case 1 in Appendix 1) we shall be even more brief, mostly because
the results are not especially 'revealing', a fact that is related to the computationally-convenient
and 'historically'-determined choice of pressure basis functions: they are
Lagrangian basis functions whose nodes are at the four 2 x 2 Gauss points. Referring
back to Figure 3.13-36, we show the 4 pressure nodes in one of the elements (top left)
via the lower case letters; since the figure would be too 'busy' if we had explicated them
everywhere, we let the reader fill in the other 3 in the obvious way. The distance between
these Gauss-point nodes is $l/\sqrt3$ in x and $h/\sqrt3$ in y. We shall present only two of the
$\nabla P$ equations and none of the $\nabla\cdot u$ equations, as they reveal little but confusion. But
we do show the sum of the four $C^T u = 0$ equations, because they do display something
useful, namely an element-level mass balance, which is (for the top left element)
$$\frac{h}{6}[(u_{WW} - u_0) + 4(u_{NWW} - u_N) + (u_{NNWW} - u_{NN})]
+ \frac{l}{6}[(v_{WW} - v_{NNWW}) + 4(v_W - v_{NNW}) + (v_0 - v_{NN})] = 0. \tag{3.13-387}$$
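Equation (3.13-387) is Simpson's rule applied to the normal velocity on each of the four edges of the element. A minimal Python sketch follows; the function name and the choice of origin (node 0 at the origin, so the top-left element occupies $-l \le x \le 0$, $0 \le y \le h$) are ours, and the velocity is passed in as two callables purely for illustration.

```python
def element_mass_balance(u, v, l, h):
    """Left-hand side of (3.13-387) for the top-left element of the 4-patch.

    u(x, y), v(x, y): velocity components; node 0 is placed at the origin.
    """
    x_ww, x_w, x_0 = -l, -0.5 * l, 0.0
    y_0, y_n, y_nn = 0.0, 0.5 * h, h
    flux_x = ((u(x_ww, y_0) - u(x_0, y_0))
              + 4.0 * (u(x_ww, y_n) - u(x_0, y_n))
              + (u(x_ww, y_nn) - u(x_0, y_nn)))
    flux_y = ((v(x_ww, y_0) - v(x_ww, y_nn))
              + 4.0 * (v(x_w, y_0) - v(x_w, y_nn))
              + (v(x_0, y_0) - v(x_0, y_nn)))
    return h / 6.0 * flux_x + l / 6.0 * flux_y
```

For a solenoidal biquadratic field such as $u = 2x^2y$, $v = -2xy^2$ (derived from the stream function $x^2y^2$), the balance should vanish to rounding error; for $u = x$, $v = 0$ it should return $-lh$, i.e. minus the net outflow.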
Next we report $M_L^{-1} C_x P$ for node 0 in the 4-patch, a la (3.13-378), from the element
matrices in Appendix 1:
$$\partial P/\partial x|_0 = \frac{1}{8l}[(9 + 5\sqrt3)(P_{ne} - P_{nw} + P_{se} - P_{sw})
- (3 + \sqrt3)(P_{nne} - P_{nnw} + P_{sse} - P_{ssw}) - (3 - \sqrt3)(P_{nee} - P_{nww}
+ P_{see} - P_{sww}) + (9 - 5\sqrt3)(P_{nnee} - P_{nnww} + P_{ssee} - P_{ssww})], \tag{3.13-388}$$
and it may be obvious why we present only one more, and the only one that is intuitively-appealing:
that for a center node. It is
$$\partial P/\partial x|_{NW} = \frac12\left[\frac{P_{nnw} - P_{nnww}}{l/\sqrt3} + \frac{P_{nw} - P_{nww}}{l/\sqrt3}\right]. \tag{3.13-389}$$
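The irrational weights in the corner-node equation come from integrating the quadratic velocity basis against Lagrangian pressure functions anchored at the Gauss points. The sketch below is an independent check, not a transcription of Appendix 1: it assumes the gradient matrix entries are $-\int (\partial\phi_0/\partial x)\,\psi_j$ over one element, a corner lumped mass $M_L = lh/9$, and our own key names 'near'/'far' for the Gauss point closer to or farther from node 0 in each direction.

```python
import math
from numpy.polynomial import Polynomial as Poly

def integral01(p):
    """Exact integral of a polynomial over [0, 1]."""
    q = p.integ()
    return q(1.0) - q(0.0)

xi = Poly([0.0, 1.0])
corner = (1 - xi) * (1 - 2 * xi)               # quadratic velocity basis, node at xi = 0
g_near = 0.5 * (1.0 - 1.0 / math.sqrt(3.0))    # Gauss points mapped to [0, 1]
g_far = 0.5 * (1.0 + 1.0 / math.sqrt(3.0))
lagrange = {
    "near": (g_far - xi) * math.sqrt(3.0),     # equals 1 at g_near, 0 at g_far
    "far": (xi - g_near) * math.sqrt(3.0),     # equals 1 at g_far, 0 at g_near
}

def corner_gradient_weights():
    """8*l times the weight of each Gauss-point pressure of the north-east element
    in the lumped-mass dP/dx at node 0, keyed by (x position, y position)."""
    weights = {}
    for kx, Lx in lagrange.items():
        for ky, Ly in lagrange.items():
            ax = integral01(corner.deriv() * Lx)   # x-direction integral
            by = integral01(corner * Ly)           # y-direction integral
            # entry = -h*ax*by; divide by M_L = l*h/9, then scale by 8*l
            weights[(kx, ky)] = -72.0 * ax * by
    return weights
```

Under these assumptions the four weights come out as $9 + 5\sqrt3$, $-(3 + \sqrt3)$, $-(3 - \sqrt3)$ and $9 - 5\sqrt3$, and they sum to 12, which is what a constant pressure in a single element must contribute.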
More 'useful' equations result using the alternative-but-equivalent pressure basis
referred to as Case 2 in Appendix 1. Now the pressures 'live' at the 4 corner nodes of
each element, with the somewhat inconvenient consequence that pressure is now a
multivalued quantity; for example, there are 4 different pressures at node 0 of the 4-patch. Note
that this is really no different than the 2 x 2 Gauss point pressure basis, which are also
quadruple-valued at node 0. (Note too that the numerical results using either equivalent
basis will be the same.) We simply need to introduce some new names/nomenclature;
and, rather than further cluttering up an already busy figure, we ask the reader to help us
by returning to Figure 3.13-36 and 'mentally' adding 3 rows of node numbers for the 16
pressures. Thus, 1 through 4 lie (left-to-right) on the bottom, 5 through 8 lie on the line
connecting WW and EE, as do 9 through 12 (the former living 'just' below the line and the
latter just above). Finally, nodes 13 through 16 are on the top row (NNWW to NNEE).
Thus node SS contains $P_2$ and $P_3$, node WW contains $P_5$ and $P_9$, node 0 contains $P_6$,
$P_7$, $P_{10}$, and $P_{11}$, etc. We can now go to the same $C^T$ matrices in Appendix 1 that were
used earlier for $Q_2Q_1$ and now re-use them for $Q_2Q_{-1}$, with the following sampling of
results (the lumped mass matrices, Q and M, are unchanged):
(1) $\nabla\cdot u$ at node 6 (1 of 4 continuity equations, in the lower left element, that at node 0):
$$\partial u/\partial x|_0 = -\frac{1}{9l}[(u_{WW} + 4u_W - 5u_0) + 2(u_{SWW} + 4u_{SW} - 5u_S)], \tag{3.13-390}$$
which is 'representative'. Viewed 'alone'/in isolation, it is easy to believe that the
$Q_2Q_{-1}$ approximation to $\nabla\cdot u$ will not be very accurate (the above is first-order, if
TS applies, which it does not). But note/recall that the sum of all four equations in an
element gives $\int_e \nabla\cdot u = \int_{\Gamma_e} n\cdot u = 0$, an element mass balance; see (3.13-387).
(2) $M_L^{-1} C_x P|_0$ for the 4-patch:
$$\partial P/\partial x|_0 = \frac{1}{4l}[(P_8 - P_5 + P_{12} - P_9) + 5(P_7 - P_6 + P_{11} - P_{10})], \tag{3.13-391}$$
which degenerates to that for $Q_2Q_1$ in (3.13-380) if $P_5 = P_9$ and $P_{12} = P_8$ and $P_6 =
P_7 = P_{10} = P_{11}$; i.e. if the pressure were continuous. But it is generally discontinuous,
the extra degrees of freedom accounting for its superior performance (up to the CB mode!)
compared to $Q_2Q_1$. It is also tempting to suggest that the term in the second parentheses
(the jump term) will, for smooth pressures, tend to zero with mesh refinement and the first
term tending to $(P_{EE} - P_{WW})/2l = \partial P/\partial x|_0$; i.e., for $l, h \to 0$, $Q_2Q_{-1} \to Q_2Q_1$.
(3) $M_L^{-1} C_x P|_S$ for a 'horizontal' 2-patch:
$$\partial P/\partial x|_S = \frac{1}{4l}[(P_4 - P_1 + P_8 - P_5) + 5(P_3 - P_2 + P_7 - P_6)], \tag{3.13-392}$$
the second group of terms again describing a jump.
(4) $M_L^{-1} C_x P|_W$ for a 'vertical' 2-patch:
$$\partial P/\partial x|_W = \frac{1}{2l}[P_6 - P_5 + P_{10} - P_9], \tag{3.13-393}$$
with no jump terms.
(5) $M_L^{-1} C_x P|_{NE}$ for a center node: see (3.13-383).
Note for both cases that all pressure equations annihilate the CB pressure mode, a
reflection of the LBB instability for $Q_2Q_{-1}$.
This concludes our 'sampling' for Q2Q-1, about which we remark: if you think this
element is slightly confusing, turn the 'page'.
c. $Q_2P_{-1}$
Unlike $Q_2Q_{-1}$, this element, from the viewpoint of studying the discrete equations, truly
'suffers' from the need to put a 'square peg' (triangular 'element') into a 'round hole'
(rectangular element). For $\nabla\cdot u$ we merely mention that the sum of the three continuity
equations also generates (3.13-387). Whereas $Q_2Q_{-1}$ at least has some symmetry that
helps a little to interpret the discrete equations, even this is lost with the 'great' $Q_2P_{-1}$
element. As we use the first three Gauss points in each element to define the pressure
nodes and concomitant Lagrangian basis functions, we must omit the upper left node
(Gauss point) in each element (node nnww in Figure 3.13-36). We report, for what it's
worth, the $M_L^{-1} C_x P|_0$ result for the above four patch, and let the reader both interpret the
result and build other discrete equations from the element matrices listed in Appendix 1:
dP/dx\0 = -[(2V3 + 3XPne+Psse) + V3>
nee T' ssee ww + Pssw)
+ (3^3 - \)(Pnnw + Psw - Pnnee - Psee) - 5V3(Pnw + />,,„)], (3.13-394)
about which the only remark we care to make is that the apparently wrong-signed term (the
third) is a common feature of higher-order approximations; recall that it even occurred
in 1D for the AD equation; cf. Section 2.3.1b.
To 'finish,' we show again the center node equation, since, especially for this fun-to-use
but difficult-to-'analyze' element, it is the only one that is reasonably simple to
understand:
$$\partial P/\partial x|_{NW} = \frac{P_{nw} - P_{nww}}{l/\sqrt3}, \tag{3.13-395}$$
where we recall that there is no node corresponding to $P_{nnww}$ for this element (thus it
'cleverly' neglects $P_{nnw}$ in the $\partial P/\partial x$ approximation).
To conclude our discussion of 9-node elements (27-node in 3D), we speculate on
$Q_2Q_{-1}$ vs. $Q_2P_{-1}$, the latter having become the favorite element in many codes besides
ours, in both 2D and 3D, partly ('slightly' is a better word) as a result of contemplating
the above (and other) stencils: in the cases of most interest for the FEM (complex
geometry in 3D) there will seldom, if ever, appear spurious pressure modes for $Q_2Q_{-1}$,
for which applications it just might be superior to $Q_2P_{-1}$ which, especially in 3D, seems to
be a little 'short' in pressure degrees of freedom. 3D numerical comparisons are strongly
recommended between these two elements. (For finite difference geometries, like boxes
and cubes, $Q_2Q_{-1}$ is often somewhat hampered by spurious modes 'because' the geometry
is so simple that not quite all of the continuity equations are actually required. We even
believe that finite difference methods are more appropriate, more cost-effective, than
finite elements on 'finite difference geometries'.)
Our final remark on 'higher-order elements' for NSE's is this: they probably are better
left to the computer (in practice) and to the finite element mathematician (in theory).
3.13.7 Divergence-Free Elements (and Methods)
Before diving into this section, it may be a good idea to take a look at our brief introduction
to this subject in Section 3.12.2e.
The 'idea' behind a divergence-free basis is best initiated via the conventional weak
form of the continuum Stokes problem, following Griffiths (1979a,b): find $u \in H_0^1$ and
$P \in L^2$ from
$$a(u, v) - (\operatorname{div} v, P) = (f, v) \quad \forall v \in H_0^1, \tag{3.13-396}$$
and
$$(\operatorname{div} u, q) = 0 \quad \forall q \in L^2, \tag{3.13-397}$$
which, by decomposing the Hilbert space $H_0^1$ into the direct sum of a divergence-free
subspace (D) and its orthogonal complement, a curl-free subspace (C), reduces the
above pair of coupled equations to the following sequence of equations, with the second
step 'optional':
1. Find $u \in D$ from
$$a(u, v) = (f, v) \quad \forall v \in D. \tag{3.13-398}$$
2. Find $P \in L^2$ from
$$(\operatorname{div} v, P) = a(u, v) - (f, v) \quad \forall v \in C, \tag{3.13-399}$$
where $H_0^1 = D \oplus C$.
Whereas the (orthogonal) decomposition is always theoretically possible, the challenge
is to repeat the problem formulation for the finite-dimensional subspaces associated with
the FEM (the easy part) and then find suitable (local) basis functions for these subspaces
(the hard part). It is also relevant to point out that the pressure 'recovery' in Step 2, which
must use $C^{-1}$-approximations for P if the divergence-free basis is to remain local, is not
a 'conventional' linear system of algebraic equations (as is Step 1); rather, it is solved by
looping through the elements, thereby determining the jumps across element boundaries
of the pressure 'parameters' ($P$, $\partial P/\partial x$, etc.); see Griffiths (1981) for details.
The divergence-free approach has had a somewhat checkered history, beginning (we
believe) with one of the early pioneers of 'FEM in Fluids,' M. Fortin. About a quarter
of a century ago, he showed in his Ph.D. thesis (Fortin, 1972a) how a divergence-free
basis could be employed, although perhaps not gainfully: 'Indeed, one can construct finite
element methods where the incompressibility condition is exactly satisfied [cf. Fortin (8),
(9)], but this leads to the use of complex elements of limited applicability.' (Crouzeix and
Raviart, 1973, an important paper that introduced discontinuous pressure on triangles.)
See also Thomasset (1981) and Hecht (1984), who not only talked about divergence-free
bases, but constructed and used them (for $Q_2P_{-1}$). Our brief attempt at a summary
of this history, most assuredly with numerous errors of omission (at least), continues
with the early contributions of E. Thompson and students. Although not employing a
divergence-free basis, Thompson (Thompson and Hague, 1973, and Thompson, 1975)
did experiment with a pointwise (exactly) divergence-free element, $P_2P_{-1}$, happily and
innocently unaware (then) of the sometimes serious 'LBB stability' problems displayed
by this element (there can be many spurious pressure modes, depending on mesh design
and BC's; details later), because they always used NBC's on large segments of the
boundary and apparently never encountered insoluble problems. (A very neat example of
a 'drooping candle' is presented in Thompson and Hague, 1973: a sequence of steady
Stokes 'flows' in a Lagrangian formulation.) It may be interesting to demonstrate the
pointwise incompressibility of this element, so we do so, beginning with the
observation (required for exactly divergence-free elements) that the pressure space is precisely
the divergence of the velocity space (see, for example, the Appendix to Chapter 4 by
D. Malkus in Hughes, 1987), so that
$$\int \psi_i \nabla\cdot u^h = 0 \quad \forall i, \tag{3.13-400}$$
becomes, because of quadratic velocity ($P_2$) and linear (and discontinuous) pressure ($P_{-1}$),
for each element (e),
$$\int_e (a + bx + cy)(A + Bx + Cy) = 0, \tag{3.13-401}$$
because $\psi = a + bx + cy$ and $\nabla\cdot u^h = A + Bx + Cy$. Since this must hold for all a, b, and
c, simply select a = A, b = B, and c = C to obtain $\int_e (\nabla\cdot u^h)^2 = 0$, and thus $\nabla\cdot u^h = 0$.
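The argument around (3.13-401) can be restated as a linear-algebra fact: the three conditions, one for each of the test functions 1, x, y, form a system $G\,(A, B, C)^T = 0$ in which $G$ is the Gram matrix of $\{1, x, y\}$ over the element, and $G$ is positive definite. The Python sketch below illustrates this on the reference triangle with vertices (0,0), (1,0), (0,1); the function names are ours, and the reference triangle is an assumption made for convenience (a straight-sided triangle is an affine image of it).

```python
from math import factorial
import numpy as np

def monomial_integral(a, b):
    """Integral of x**a * y**b over the reference triangle (0,0), (1,0), (0,1)."""
    return factorial(a) * factorial(b) / factorial(a + b + 2)

def linear_gram_matrix():
    """Gram matrix of the linear pressure basis {1, x, y} on the reference triangle."""
    basis = [(0, 0), (1, 0), (0, 1)]  # exponent pairs for 1, x, y
    return np.array([[monomial_integral(p[0] + q[0], p[1] + q[1]) for q in basis]
                     for p in basis])

def weak_divergence_residual(coeffs):
    """Integrals of psi_k * (A + B*x + C*y) for psi_k in {1, x, y}, coeffs = (A, B, C)."""
    return linear_gram_matrix() @ np.asarray(coeffs, dtype=float)
```

Because all eigenvalues of the Gram matrix are positive, the residual can vanish only when $A = B = C = 0$, which is the pointwise statement $\nabla\cdot u^h = 0$.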
The $P_2P_{-1}$ element is an example of a mixed interpolation method whose computed
pressures 'cause' pointwise incompressibility, for straight-sided triangles only (E. Thompson,
1991, personal communication). Thus, Thompson et al. did not even attempt to save
computer time by seeking out the inherent divergence-free velocity basis functions; they
just used the element 'as is' (u and P are computed) because they liked it, at least for
a while. In fact, however, and perhaps somewhat surprisingly, Thompson (1975) found
this element to usually be actually less accurate than $P_2P_0$ in several test problems; thus,
'divergence-freeness' does not in and of itself imply accuracy. [Currently, Thompson
(personal communication, 1991) prefers $Q_1Q_0$ for both 2D and 3D simulations.] For
other applications of $P_2P_{-1}$, see, for example: Dawson and McTigue (1985), Dawson
(1987), and Mathur and Dawson (1987), in which it was found to be inferior to the
quadrilateral element $Q_2Q_{-1}$, at least for Eulerian formulations. [For Lagrangian formulations,
the $P_2P_{-1}$ is generally more robust than $Q_2P_{-1}$, which 'crashes' more often (P. Dawson,
1991, personal communication).] Another paper on $P_2P_{-1}$ (plus others) is Malkus and
Olsen (1984), in which they identified the LBB-instability and gave the dimension of the
null space as six (correct as far as it went, but see below!).
It took D. Griffiths to actually generate the divergence-free basis for the $P_2P_{-1}$
element, and many more, some in the pointwise sense and others in the weak sense,
via pressure test functions. In a series of papers over a six-year period (Griffiths, 1977,
1979a, b, 1981, 1982), he, inspired and challenged by the important paper of Crouzeix
and Raviart (1973), single-handedly generated analytically the divergence-free bases for
virtually 'all' elements then in use. Not all are 'local,' however, and none of them
were extended to iso-P elements with curved sides. Even straight-sided elements possess
(at least some do) rather involved divergence-free bases. And they seem (thus far, at
least) to be sufficiently complex that they appear not to have 'caught on,' i.e., shown
up in subsequent papers, except for some recent work by Shopov et al. (see Shopov
et al., 1992, and Shopov and Iordanov, 1994), who use the $Q_2P_{-1}$ element in (weakly)
divergence-free form (no pressures), and on isoparametric elements yet. Perhaps one
reason that not many codes have been written is due to Griffiths himself; in Griffiths
(1981) is: 'It is not, as yet, clear whether the computational procedures based on these basis
functions would be more cost-effective than the orthodox Lagrange multiplier method (or
indeed, penalty methods), although both methods would give identical results when used
with the same underlying functions spaces.' On the other hand, on the same page (342) is a
statement that big code developers should have but (it seems) did not take rather seriously:
'The potential savings brought about by using these new basis functions are much greater
for time-dependent problems, since the pressure does not need to be computed at each
step.' A final quotation is useful in that it helps to better understand the somewhat
distinct roles played by the different types of nodal velocities for a nine-node quad:
'normal components of velocities at midside nodes control the flux across element edges
(also the discontinuity/jump in pressure across a boundary), internal nodes control the
creation/destruction of mass within an element (also the gradient of pressure in the
element), and the remaining nodes are free to approximate the momentum equations.'
Thus, the four corner nodes and the four tangential components at midside nodes are
those 'available' to satisfy Newton's second law. Interesting. For more recent work in
this regard, see Shopov and Iordanov (1994).
In another sequence of papers, Gustafson and Hartman attacked the divergence-free
'challenge' as posed in the book by Temam (1984); in Hartman and Gustafson (1981), and
in Gustafson and Hartman (1983, 1985), graph-theoretic methods were used to explicate
the underlying discretely divergence-free bases of the elements discussed by Temam. Like
the Griffiths' work, however, the results seem not to have attracted much attention by
'code-builders'—at least to our knowledge.
In a recent Ph.D. thesis under the supervision of D. Arnold, Qin (1994) studied
theoretically and numerically the divergence-free $P_2P_{-1}$ element (in 'mixed mode') and its
two 'neighbors,' $P_1P_0$ and $P_3P_{-2}$. We summarize here only a few of his salient results for
$P_2P_{-1}$ on the unit square: (i) on a mesh oriented so that all hypotenuses go in the same
direction (45°) with Dirichlet BC's, the dimension of the null space of the gradient
operator is six, the 'reduced' (after removal of the zero eigenvalues) inf-sup constant is $O(h)$,
and optimal convergence with mesh refinement occurs for the velocity (only; pressure
does not converge); i.e., the velocity error is $O(h^3)$ in $L^2$ and $O(h^2)$ in $H^1$; (ii) ditto
except NBC's, which remove the entire six-dimensional null space; (iii) on 'many' other
meshes, such as one composed of criss-cross triangles, four to a square, the dimension
of the null space (of C) is huge(!), unbounded like $O(h^{-1})$ for either essential or natural
BC's, yet velocity converges optimally as does pressure [$O(h^2)$ in $L^2$], and the reduced
inf-sup constant is good/stable, $O(1)$; and (iv) for certain very special meshes, optimal
convergence of both u and P occurs with no instability and no spurious null space. Very
interesting news, but perhaps not to the 'applications engineer', unless he is prepared
to generate his mesh as follows: (i) start with squares (presumably rectangles are also
okay, and presumably of various sizes, to permit graded grids); (ii) form the triangles by
criss-crossing each rectangle; and (iii) move the center node in each rectangle off-center
'a fixed distance, for instance h/4', such a mesh being called a distorted 'criss-cross
subdivision.'
Returning now to a low-order element and a divergence-free basis for velocity (no
pressure), we mention the work of Rannacher and Turek (1992) and Turek (1994, 1996, 1997)
[see too Heywood et al. (1996) for an interesting 'variationally based' analysis of 'built-in'
NBC's/OBC's when divergence-free basis functions are employed], one of the most
impressive-in-practice uses of such an element for time-dependent flows, with but one
small 'hitch': it is non-conforming, defined as it is on a so-called 'rotated bilinear' element
with the bilinear velocities defined at element midsides rather than at the corners. (The
xy part of the bilinear basis function goes over to $x^2 - y^2$.) This trick, however, permits
the efficient definition of a more local divergence-free basis (element-contained even for
distorted quadrilaterals) than is possible with the standard/conforming $Q_1$ element, whose
divergence-free basis requires the use of 4-patches of macro elements (Griffiths, 1981).
(Note too that even Griffiths did not find a local basis for $Q_1Q_0$ on isoparametric elements.)
But the biggest key to the good performance realized in Turek's code is the effective use
of 'multigrid' [which is good in that even though the divergence-free 'stiffness' matrix
has a condition number of $O(h^{-4})$ and the divergence-free mass matrix has condition
number of $O(h^{-2})$, Turek's multigrid method has a convergence rate that is independent
of the condition numbers] to solve the associated linear systems (S. Turek, 1992, personal
communication) because, in fact, the number of unknowns with the discretely divergence-free
basis on the rotated $Q_1Q_0$ element is virtually the same (not substantially less, as
perhaps intuitively expected) as that on the conforming $Q_1Q_0$ element with mixed
interpolation! (And in 3D, the mixed method actually involves fewer total degrees of freedom
than the div-free method, 'because' the stream 'function' is a vector quantity in 3D.) It
turns out that the divergence-free basis necessarily (see too the Griffiths' papers)
introduces a sort of stream function (at the corners) which, along with tangential velocities
at the midside nodes, results in an element with an average of three degrees of freedom
per element, just like $Q_1Q_0$ (conforming) with bilinear velocity and constant pressure.
The 'reason' that there are so many nodal parameters/degrees of freedom is that
nonconforming elements always have more. In contrast, the $Q_1Q_0$ basis derived by Griffiths
(1981) has only one-third the nodal parameters as when using mixed interpolation. In
fact, even though a very impressive 2D code has been generated with the divergence-free
basis, the Turek-Rannacher 'team' seems to be switching toward projection methods (see
Section 3.16.6); i.e., back to mixed interpolation (with a projection method, described
later), especially in 3D.
We would also like to point out that a finite difference, divergence-free method that is
effectively the 'conventional' $Q_1Q_0$ element, has also been invented and used; in Stephens
et al. (1984, which paper also proves that there can be at most two pressure modes for
$Q_1Q_0$) and Bell et al. (1989), a 'finite difference Galerkin formulation' was employed, in
which a set of discretely divergence-free velocity 'mesh functions' was utilized to preclude
the pressure, on a 9-patch (see, too, p. 172 of Girault and Raviart, 1986) rather than on
the 4-patch a la Griffiths. [Fortin (1981) also used a nine-patch to show the divergence-free
'Vortex' for $Q_1Q_0$.] However, like Turek and Rannacher, Bell et al. have returned to
conventional 'mixed-interpolation' (in finite difference 'garb'), for both 2D (Bell et al.,
1991) and 3D simulations, and have even switched over to 'approximate projections' (the
discrete velocity is only 'close' to being discretely divergence-free; see Section 3.16.6d).
In the first of these (steady flow), the fully coupled, divergence-free system of size N
(N = total number of nodes) was solved via banded solvers (and Newton's method). In
their extension to time-dependent equations, they 'split out' the projection, with the result
that in a sequential manner, they returned to 3N equations (N each for u, v, P). The main
reason for their switch was to more efficiently invoke a higher-order Godunov method
for advection. In 3D, Bell et al. returned to the 'PPE' approach, in part because the 3D
'basis' is very complicated and also because the relative savings is then not so large
(J. Bell, 1992, personal communication).
Another time-dependent, divergence-free-basis FDM approach is discussed in Goodrich
and Soh (1989), and a strong connection revealed between that approach and a stream
function only approach. In fact, it was the stream-function-only code that was used for the
computations presented in that paper (J. Goodrich, personal communication, 1994). They
also showed the 'equivalence' between their (and Stephens, Bell, et al.) 'finite difference
Galerkin' method and the 'dual variable' method of Amit et al. (1981)—another approach
that uses graph theory. In fact, Goodrich and Soh state, 'The next section will show that the
dual variable or finite difference Galerkin algorithms can actually be interpreted as stream
614 THE NAVIER-STOKES EQUATIONS
function algorithms. This discovery resulted from trying to understand and simplify the
product terms in the FDG algorithms.' It is, in fact, this very 'equivalence' (discretely
divergence-free bases seem always to introduce a stream function) that might help to
explain why the time-dependent case has not received much attention in the fully coupled
(N equations only, one per node) divergence-free approach: rather than simply $\partial u/\partial t$,
the acceleration becomes, at least partially, converted to an equation for $\partial\omega/\partial t$ (there
is an implied curl operation) or, 'worse yet,' an equation for $\partial(\nabla^2\psi)/\partial t$. For a more
recent application of Goodrich's $\psi$-only approach for time-dependent flow, see Gresho
et al. (1993). The method has, however, thus far only been applied on uniform grids (no
'geometry').
To conclude this discussion, we make two observations:
1. Divergence-free elements have the added advantage that they cannot generate unstable
DAE's via the advection term (recall the example in Chapter 2 wherein any but skew-
symmetric advection caused the ODE's to be unstable; and see Section 3.16.4 in this
chapter, wherein the possibility of unstable DAE's is discussed). This is simply because
divergence-free basis vectors necessarily (at least via GFEM) generate skew-symmetric
advection matrices—at least up to outflow BC's; see Remark (1) following (2.2-24) in
Chapter 2.
2. The world is still in need of a truly cost-effective divergence-free basis for 3D GFEM
simulations in which complex geometries are to be tackled. The mixed blessings of mixed
methods seem currently to be on top, even though the 'best' 3D element is also not yet
'obvious.' (In 2D, it seems that there are now several 'best' elements: $Q_2P_{-1}$, $Q_1Q_0$, or
any of several triangular elements—it all depends on who is calling it best.)
3.13.8 Conservation Laws Revisited
Recalling the discussion in Sections 2.2.3 and 2.2.4 of the previous chapter, we now repeat
the analogous steps for the NS equations, except that we are now smart enough to do it
in the 'efficient' way right away; i.e., we shall work directly with the semi-discretized
equations in matrix-vector form, to study conservation of momentum and conservation of
kinetic energy. To this end, we first rewrite them in the augmented form corresponding
to (2.2-15) through (2.2-27), starting from the condensed form in (3.13-28), (3.13-29):
$M\dot{u} + [K + N(u) + \beta D(u)]u + CP = f$   (3.13-402)
and
$C^T u = 0$,   (3.13-403)
where
$D_{ij}(u) = \int_\Omega \varphi_i \varphi_j \nabla\cdot u^h$   (3.13-404)
is another 'divergence' matrix, $\beta$ is a scalar to be determined later, the RHS of (3.13-403)
is zero because u now contains all velocities [including those on $\Gamma_D$; cf. (3.13-13) and
(3.13-14)], and f is the 'augmented' forcing vector that includes the (as yet unknown)
force applied by $\Gamma_D$ to the fluid, but no longer contains the u terms in (3.13-26):
$f_{\alpha i} = \int_\Omega \varphi_i g_\alpha + \int_{\Gamma_N^\alpha} \varphi_i \bar{F}_\alpha + \int_{\Gamma_D^\alpha} \varphi_i F_\alpha$,   (3.13-405)
where $F_\alpha$ is the new (reaction-force) term that, analogous to the 'Dirichlet' heat flux in
(2.2-16), is to be determined once the velocity and pressure are known; i.e., as for the
scalar transport equation, we have effectively defined a two-step procedure: (1) solve the
DAE's of (3.13-28) and (3.13-29), which are 'contained in' (3.13-402) and (3.13-403),
by omitting the test functions (and equations) that correspond to nodes on $\Gamma_D^\alpha$, and (2)
solve for the 'reaction' force, $F_\alpha$, via
$F_\alpha = \sum_{j=N_\alpha+1}^{N_\alpha^T} F_{\alpha j}\,\varphi_j$   (3.13-406)
with u and P available (known), analogous to (2.2-14) et seq.; i.e., (3.13-402), (3.13-405),
and (3.13-406) are to be considered a linear system of $N_\alpha^T - N_\alpha$ equations in the unknown
nodal force components $\{F_{\alpha j}\}$, $\alpha = 1, 2$ or $\alpha = 1, 2, 3$. The details of this force calculation
will be explained in Chapter 4. Step (2) is, of course, optional and can only be performed
after Step (1) is completed.
With the efficient 'nomenclature' now in place, it is a relatively simple matter to study
conservation of momentum and kinetic energy, beginning with the former. Since in this
augmented form all basis functions sum to unity, it is now a simple matter to sum all of
the momentum equations in (3.13-402), separately for each component ($u_\alpha$), to obtain
$\frac{d}{dt}\int_\Omega u^h + \int_\Omega u^h\cdot\nabla u^h + \beta\int_\Omega u^h\,\nabla\cdot u^h = \int_\Omega g + \int_\Gamma F^h$,   (3.13-407)
where $u^h$ is the approximate solution given in (3.13-13) and where, on $\Gamma_N^\alpha$, the applied
traction vector, $F_\alpha^h$, is (in 2D) that given by (3.12-14) and (3.12-16), whereas on $\Gamma_D^\alpha$ it is
the reaction force given by (3.13-406), and we of course realize that true traction forces
(vis-a-vis pseudotraction forces) are only obtained via the full stress-divergence form of
the momentum equation: $\gamma = 1$ in, for example, (3.13-18) through (3.13-26). If $\gamma = 0$,
then we only have portions of the full momentum balance. The final step in obtaining
a true force balance that will make (3.13-407) look just like the appropriate continuum
balance equation (3.11-3) (after adding the body 'force' term, g, to that equation), is
generally only possible if $\beta = 1$ because
$\int_\Omega u^h\cdot\nabla u^h = \int_\Omega \nabla\cdot(u^h u^h) - \int_\Omega u^h\,\nabla\cdot u^h = \int_\Gamma (n\cdot u^h)u^h - \int_\Omega u^h\,\nabla\cdot u^h$,
and (3.13-407) then becomes
$\frac{d}{dt}\int_\Omega u^h = \int_\Omega g + \int_\Gamma [F^h - (n\cdot u^h)u^h]$,   (3.13-408)
a true global momentum balance; cf. (3.11-4). Thus, rigorous conservation of momentum
requires the divergence form of both the viscous stress term and the advection term [recall
that $\beta = 1$ is equivalent to replacing $u^h\cdot\nabla u^h$ by $\nabla\cdot(u^h u^h)$], but we hasten to add that
'just fine' solutions to the NS equations can be obtained with the simpler (and thus less
expensive) versions via the $\nabla^2$-form ($\gamma = 0$) and advective form ($\beta = 0$). As was the case
for the scalar transport equation, reversibility in the sense that replacing Dirichlet data on
$\Gamma_D$ by Neumann data there and achieving the same ($u^h$, $P^h$) solution is only achievable
via the 'consistent force' formulation (for $\gamma = 0$ or 1 in fact, with only the latter giving a
true force); i.e., if $F_\alpha$ is determined in any way other than via 'Step (2)' above, then the
resulting velocities and pressures will not be the same as those obtained with Dirichlet
BC's. For added clarification here, we explicitly describe the x-component of the reaction
force calculation on $\Gamma_D^x$:
$\int_{\Gamma_D^x} \varphi_i F_x^h = \{M\dot{u} + [K + N(u) + \beta D(u)]u + CP\}_i - \int_\Omega \varphi_i g_x - \int_{\Gamma_N^x} \varphi_i \bar{F}_x$,   (3.13-409)
where $\{\ \}_i$ denotes the i-th row of the LHS of the x-momentum equation [see (3.13-28)
and below it], and
$F_x^h = \sum_{j=N_x+1}^{N_x^T} F_{x_j}\,\varphi_j$.   (3.13-410)
The entire RHS is known, and the nodal values, $\{F_{x_j}\}$, which represent a true x-direction
force component if $\gamma = 1$, can be computed, and the boundary mass matrix,
$\int_{\Gamma_D^x} \varphi_i\varphi_j$,
may be lumped (when lumping is feasible) if desired, as discussed in the previous chapter.
This is the 'consistent' force and, at least when the consistent mass matrix is used, is
'exceptionally accurate' (details later, in Chapter 4).
Finally, we turn to kinetic energy conservation, wherein (3.11-12) is our goal. To that
end, we simply take the scalar product of (3.13-402) and the (full) velocity vector u,
use (3.13-403) to see that $u^TCP = P^TC^Tu = 0$, and obtain
$\frac{1}{2}\frac{d}{dt}u^TMu + u^T[N(u) + \beta D(u)]u = u^Tf - u^TKu$,   (3.13-411)
where, a la (2.2-26) and (2.2-27), we have [for $\gamma = 1$ in the (augmented) K-matrix]
$u^TMu = \int u^h\cdot u^h = \int q_h^2$,
$u^T[N(u) + \beta D(u)]u = \int u^h\cdot(u^h\cdot\nabla u^h) + \beta\int (u^h\cdot u^h)\,\nabla\cdot u^h = \frac{1}{2}\int[\nabla\cdot(q_h^2 u^h) - q_h^2\,\nabla\cdot u^h] + \beta\int q_h^2\,\nabla\cdot u^h$,
which, if and only if $\beta = 1/2$, becomes $\frac{1}{2}\int_\Gamma (n\cdot u^h)q_h^2$; also,
$u^Tf = \int u^h\cdot g + \int_\Gamma F^h\cdot u^h$, and $u^TKu = \int \Phi_h = \frac{\nu}{2}\int[\nabla u^h + (\nabla u^h)^T]^2$,
which we do not claim to be 'obvious.' Thus, for $\beta = 1/2$, we have
$\dot{E}_h = \int_\Gamma F^h\cdot u^h - \frac{1}{2}\int_\Gamma q_h^2(n\cdot u^h) + \int_\Omega (u^h\cdot g - \Phi_h)$,   (3.13-412)
where $E_h = \frac{1}{2}\int q_h^2$, which is (3.11-12) after dividing by $\rho$ and adding the body force
term, g, to (3.11-6) et seq.
So we are done; conservation of energy can be assured, but to do so requires, as did
the quadratic conservation 'law' in the previous chapter, $\beta = 1/2$, thus sacrificing global
conservation of momentum.
Final Remarks:
(1) Conservation of energy is, of course, more important than conservation of momentum
if guaranteed stable DAE's are desired. Recall too that only $\beta = 1/2$ gives a
skew-symmetric advection matrix when $n\cdot u^h = 0$ on $\Gamma$; i.e., we then have
$u^T[N(u) + \beta D(u)]u = 0$. If $\beta = 0$ or 1, then the DAE's are 'indefinite': they may be stable
or unstable, although for well-designed problems and grids, instability will be rare
(unless $\nu = 0$, which is not recommended in general).
(2) $\beta = 1/2$ has long been used by many 'theorists'; it was introduced by Temam (1966,
1968) in order to assure 'well-behaved' equations, and the analysis above shows
why.
(3) Further detailed discussions of issues of momentum conservation and consistent
force calculations are presented in Gresho et al. (1987), albeit in a manner that is
somewhat obfuscated by 'clumsy' notation.
(4) The 'sacrifice' of global momentum conservation when $\beta = 1/2$ is usually a small
one because $\nabla\cdot u^h$ is, for 'reasonable' simulations, 'small.' (Ditto the conservation
of energy when $\beta = 1$, at least if the DAE's are stable; and ditto both when $\beta = 0$.)
Finally, $\gamma = 0$ is also usually quite 'acceptable', for any $\beta$.
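The skew-symmetry claim in Final Remark (1) can be checked numerically. What follows is a minimal sketch, not the authors' code, under assumptions of our own: a 1D periodic mesh of linear elements with unit spacing (so there is no boundary and the analog of $n\cdot u^h = 0$ holds trivially), an arbitrary nodal velocity, and hypothetical helper names (`assemble_advection`, `skew_defect`). It assembles the 1D analogs of N(u) and D(u) and measures how far $N + \beta D$ is from being skew-symmetric for $\beta$ = 0, 1/2, and 1.

```python
import math


def assemble_advection(u):
    """Return (N, D) for periodic 1D linear elements with unit spacing.

    N[i][j] = integral(phi_i * u_h * dphi_j/dx)
    D[i][j] = integral(phi_i * phi_j * du_h/dx)
    """
    n = len(u)
    N = [[0.0] * n for _ in range(n)]
    D = [[0.0] * n for _ in range(n)]
    g = 0.5 / math.sqrt(3.0)
    gauss = (0.5 - g, 0.5 + g)  # 2-point Gauss on [0, 1], weight 1/2 each
    for e in range(n):
        nodes = (e, (e + 1) % n)        # periodic wrap-around
        dphi = (-1.0, 1.0)              # shape-function slopes for h = 1
        du = u[nodes[1]] - u[nodes[0]]  # du_h/dx, constant on the element
        for s in gauss:
            phi = (1.0 - s, s)
            uh = phi[0] * u[nodes[0]] + phi[1] * u[nodes[1]]
            for a in range(2):
                for b in range(2):
                    N[nodes[a]][nodes[b]] += 0.5 * phi[a] * uh * dphi[b]
                    D[nodes[a]][nodes[b]] += 0.5 * phi[a] * phi[b] * du
    return N, D


def skew_defect(N, D, beta):
    """Largest entry of |A + A^T| for A = N + beta * D (zero if A is skew)."""
    n = len(N)
    return max(
        abs(N[i][j] + N[j][i] + beta * (D[i][j] + D[j][i]))
        for i in range(n)
        for j in range(n)
    )


# Arbitrary (assumed) nodal velocities; the discrete field is not divergence-free.
u = [1.0, 2.5, -0.5, 0.3, 1.7, -1.2, 0.8, 0.1]
N, D = assemble_advection(u)
defects = {beta: skew_defect(N, D, beta) for beta in (0.0, 0.5, 1.0)}
for beta, value in defects.items():
    print(f"beta = {beta}: max|A + A^T| = {value:.3e}")
```

Only $\beta = 1/2$ should give a defect at round-off level; for $\beta = 0$ or 1 the symmetric part of the matrix is $\mp D(u)$, which vanishes only when the discrete divergence does.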
3.13.9 Periodic Boundary Conditions
Having shown how the GFEM can be used to compute consistent forces on boundaries,
we are finally ready to return to the 'BC Section' (3.8) and complete it—ironically by
removing the boundaries. Recalling first the discussion of periodic BC's for the scalar
transport equation (Section 2.6.3d, a review of which might be helpful) and the discussion
of internal line heat sources and internal heat fluxes in Section 2.3.2d, we extend these
concepts in the appropriate (if not obvious) way to the NS equations. The less obvious
part is related—of course—to the pressure.
We deal with the easy part first: tangential velocity (velocities in 3D). They are easy
because they are a direct analog of the scalar case; thus, the periodic BC is simply a
matter of 'node numbering': there must be a one-for-one nodal identity for each node on
the periodic boundary. That's all there is to it.
But the normal direction, that which involves the pressure in the 'force balance,' is
another matter. The first thing that must be addressed is 'frictional pressure drop' and
the associated lack of a periodic pressure BC. Since the pressure usually, at least 'on
average', decreases in the flow direction, it is clear that P cannot be the same at the
exit as at the inlet of such a 'periodic' domain, a simple example being flow in a long
pipe with venturi meters (of the same type) inserted every l units, for which we choose a
domain of length l. What to do? Well, the only way the 'modeler' can force the normal
velocity to be periodic yet permit 'pressure drop' in the computational domain is to add
a 'pump' at the periodic boundary, to cause a jump in P from the lower exit value to
the higher inlet value. Note too that if the physical flow were truly periodic, in that flow
through a closed loop were being addressed, then said closed loop could also not operate
without a pump. Thus, we have physical justification for adding a mathematical pump to
make the pressure jump. [Actually, the pump jump is modeled as a normal traction (or
pseudo-traction) jump, as we shall see.]
The velocity portion is (again) easy—and 'standard': just give the appropriate inlet
and exit nodes (degree of freedom, to be more precise) the same 'name.' Then, just as
we allowed the possibility of either adding a line (plane) heat source or specifying the
temperature in the scalar case, Section 2.3.2d, we now have the option of either adding
a pump (line/plane 'source' for normal momentum) or specifying the desired normal
velocity along the periodic boundary and determining the required 'pump characteristics'
(which could be rather strange, depending on the normal velocity profile imposed). To
do the former requires the use of the 'augmented' set of momentum equations—those
just developed in the previous Section [(3.13-402) through (3.13-406)], which obviously
explains why we waited until now to discuss the periodic case. We need the so-called
'reaction force,' called $F_\alpha$ in (3.13-405) of Section 3.13.8, to introduce our
mathematical pump.
Remark:
Actually, the $\alpha$ in $F_\alpha$ must correspond to the normal direction at the periodicity line
(plane), which could also be the x-, or y-, or z-direction, but need not be. If it is 'none
of the above,' then the momentum equations rotated to normal and tangent directions, as
described in Section 3.13.1e [equations (3.13-38) through (3.13-57)], must be employed.
o A digression. Before actually addressing the periodic case, let us analyze the situation in
which a 'pump' is inserted along a line of nodes internal to the domain; i.e., we consider
a 'line source of normal momentum.' As in Section 2.3.2d, there are two ways, nearly
equivalent, to insert a pump: (i) add a line momentum source along a line of velocity
nodes, or (ii) use the NBC approach to specify the total jump in momentum flux. We shall
present the second form because, while requiring more effort to derive, it is slightly more
general—it permits the calculation of the split in 'pump work,' by separate calculation of
'suction side' and 'pressure side'; details to follow, at least for one type of element (with
two types of pressure approximation). To this end, consider the 4-patch in Figure 3.13-37,
which we shall employ simultaneously in two ways, $Q_1Q_0$ and $Q_1Q_1$, partly to show
how much simpler is the case of discontinuous pressure.
The pump is located between the two center columns of nodes and will inject x-momentum
only. Also, the separation is figurative only: nodes $0_L$ and $0_R$ (et al.) reside at
the same x-location. The reasons for the 'duality' are two: (i) it is needed for $Q_1Q_1$ because
we need two pressures at the same location in order to permit a discontinuity/jump, and
(ii) it will make the periodic BC case easier.
To make the analysis tractable, we shall consider only the transient Stokes equations
(or steady, by dropping the acceleration terms). Also, as in Section 2.3.2d, we begin the
analysis by 'decoupling' the left and right pairs of elements, as if $S_L - 0_L - N_L$ were
the right boundary of the 'left' domain and $S_R - 0_R - N_R$ the left boundary of the 'right'
domain. The weak form of interest is
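$\int_\Gamma \varphi_i F^h = \nu\int_\Omega \nabla\varphi_i\cdot\nabla u^h - \int_\Omega P^h\,\nabla\varphi_i + \int_\Omega \varphi_i\,\frac{\partial u^h}{\partial t}$,   (3.13-413)
which we apply, sequentially, to nodes $0_L$ and $0_R$, via $u^h = \sum_j u_j\varphi_j$ and $P^h = \sum_j P_j\psi_j$
and LM for simplicity:
$\int_{y_S}^{y_N} \varphi_{0_L} F^h_{x_L}(y)\,dy = \frac{\nu h}{6l}[u_{S_L} - u_{SW} + 4(u_{0_L} - u_W) + u_{N_L} - u_{NW}] - \frac{\nu l}{6h}[(u_{SW} - 2u_W + u_{NW}) + 2(u_{S_L} - 2u_{0_L} + u_{N_L})] - f_L(P) + \frac{lh}{2}\dot{u}_{0_L}$,   (3.13-414)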
Fig. 3.13-37 A 4-patch for analyzing periodic BC's. (Node columns, left to right: SW, W, NW; $S_L$, $0_L$, $N_L$; $S_R$, $0_R$, $N_R$; SE, E, NE.)
where $f_L(P)$ is the pressure term, which is different for the two elements:
$Q_1Q_0$: $f_L(P) = \frac{h}{2}(P_{SW} + P_{NW})$ with P at centroids,
$Q_1Q_1$: $f_L(P) = \frac{h}{12}[(P_{SW} + 4P_W + P_{NW}) + (P_{S_L} + 4P_{0_L} + P_{N_L})]$ with P at nodes.
The analogous equation for the right node is
$\int_{y_S}^{y_N} \varphi_{0_R} F^h_{x_R}(y)\,dy = \frac{\nu h}{6l}[u_{S_R} - u_{SE} + 4(u_{0_R} - u_E) + u_{N_R} - u_{NE}] - \frac{\nu l}{6h}[2(u_{S_R} - 2u_{0_R} + u_{N_R}) + (u_{SE} - 2u_E + u_{NE})] + f_R(P) + \frac{lh}{2}\dot{u}_{0_R}$,   (3.13-415)
where $f_R(P)$ is the pressure term:
$Q_1Q_0$: $f_R(P) = \frac{h}{2}(P_{SE} + P_{NE})$ with P at centroids,
$Q_1Q_1$: $f_R(P) = \frac{h}{12}[(P_{S_R} + 4P_{0_R} + P_{N_R}) + (P_{SE} + 4P_E + P_{NE})]$ with P at nodes.
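The viscous coefficients in the left-node equation can be cross-checked by brute-force assembly. The sketch below is our own illustration, not part of the original text: it assumes the two left elements are l-by-h rectangles, takes $\nu = 1$, integrates the bilinear ($Q_1$) stiffness with 2x2 Gauss quadrature, and compares the assembled row for node $0_L$ with the coefficients read off the two bracketed viscous terms. The function names and the node labels used as dictionary keys are hypothetical.

```python
import math


def q1_stiffness(l, h):
    """4x4 matrix of integral(grad(phi_a) . grad(phi_b)) on an l-by-h rectangle.

    Local node order is counterclockwise from the lower-left corner.
    """
    g = 1.0 / math.sqrt(3.0)
    corners = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
    K = [[0.0] * 4 for _ in range(4)]
    for xi in (-g, g):
        for eta in (-g, g):
            # Physical gradients of phi = (1 + cx*xi)(1 + cy*eta)/4
            grads = [
                (cx * (1 + cy * eta) / 4 * 2 / l, cy * (1 + cx * xi) / 4 * 2 / h)
                for cx, cy in corners
            ]
            w = (l / 2) * (h / 2)  # Jacobian times unit Gauss weights
            for a in range(4):
                for b in range(4):
                    K[a][b] += w * (
                        grads[a][0] * grads[b][0] + grads[a][1] * grads[b][1]
                    )
    return K


def assembled_row_at_0L(l, h):
    """Row of the assembled viscous matrix for node 0L (left domain only)."""
    K = q1_stiffness(l, h)
    lower = ["SW", "SL", "0L", "W"]   # element below node 0L
    upper = ["W", "0L", "NL", "NW"]   # element above node 0L
    row = {}
    for elem in (lower, upper):
        a = elem.index("0L")
        for b, name in enumerate(elem):
            row[name] = row.get(name, 0.0) + K[a][b]
    return row


def stencil_row(l, h):
    """Coefficients read off the viscous terms of the left-node equation (nu = 1)."""
    cx, cy = h / (6 * l), l / (6 * h)
    return {
        "SW": -cx - cy, "W": -4 * cx + 2 * cy, "NW": -cx - cy,
        "SL": cx - 2 * cy, "0L": 4 * cx + 4 * cy, "NL": cx - 2 * cy,
    }


l, h = 0.7, 1.3  # arbitrary element dimensions
row = assembled_row_at_0L(l, h)
stencil = stencil_row(l, h)
mismatch = max(abs(row[name] - stencil[name]) for name in stencil)
row_sum = sum(row.values())
print("max coefficient mismatch:", mismatch)
print("row sum (should vanish for a constant field):", row_sum)
```

The same check applies to the right node by mirroring the patch; only the sign of the pressure term differs.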
To obtain the final GFEM x-momentum equation at node 0, we sum the two equations to
get the total applied force at node 0, and we merge the L and R nodes for velocity (only)
to get, after dividing by h,
$\frac{1}{h}\int_{y_S}^{y_N} \varphi_0[F^h_{x_L}(y) + F^h_{x_R}(y)]\,dy = \frac{\nu}{6l}\{(u_S - u_{SW}) - (u_{SE} - u_S) + 4[(u_0 - u_W) - (u_E - u_0)] + (u_N - u_{NW}) - (u_{NE} - u_N)\} - \frac{\nu l}{6h^2}[(u_{SW} - 2u_W + u_{NW}) + 4(u_S - 2u_0 + u_N) + (u_{SE} - 2u_E + u_{NE})] + \frac{1}{h}[f_R(P) - f_L(P)] + l\dot{u}_0$,   (3.13-416)
where, for $Q_1Q_0$,
$\frac{1}{h}[f_R(P) - f_L(P)] = \frac{P_{SE} + P_{NE}}{2} - \frac{P_{SW} + P_{NW}}{2}$
with P at centroids, and for $Q_1Q_1$,
$\frac{1}{h}[f_R(P) - f_L(P)] = \frac{1}{12}\{[P_{SE} + P_{S_R} + 4(P_E + P_{0_R}) + P_{NE} + P_{N_R}] - [P_{SW} + P_{S_L} + 4(P_W + P_{0_L}) + P_{NW} + P_{N_L}]\}$,
with P at nodes (and two P's at three of the nodes; see Remark (5) below). Letting
$F^h_{x_L}(y) + F^h_{x_R}(y) = F^h_x(y)$, the total applied force in the x-direction (at y), and letting
l, h → 0, it is hopefully clear that the viscous 'y-terms' ($\sim \nu l\,u_{yy}$) and the acceleration
terms vanish, and the remaining terms converge (we hope) to
$F_x(y) = \nu\left(\frac{\partial u}{\partial x}\Big|_L - \frac{\partial u}{\partial x}\Big|_R\right) - (P|_L - P|_R)$,   (3.13-417)
showing the jump in (pseudo) traction force caused by the pump.
Remarks:
(1) Probably most of the jump given in (3.13-417) will usually be taken up by the
pressure change with, as usual, the viscous contribution being small.
(2) When the total pump jump, $F_x(y)$, is specified, (3.13-416) is the GFEM equation
that determines $u_0$ ($\dot{u}_0$ for the transient case); i.e., this case shows how to put a pump
into the system.
(3) If u(y) is prescribed along the pump line [and $\dot{u}(y)$], an internal Dirichlet BC, then
(3.13-416) is the equation for the total jump, in the form $F^h_x(y) = \varphi_S F_{x_S} + \varphi_0 F_{x_0} +
\varphi_N F_{x_N}$. In this case, it is possible (although not necessary) to 'post-process' via
(3.13-414) and (3.13-415), with $F^h_{x_L}(y)$ and $F^h_{x_R}(y)$ each expanded in $\varphi_S$, $\varphi_0$, $\varphi_N$
in the same way, to determine how much of the total jump is on the suction side ($F^h_{x_L}$) and how
much on the pressure side ($F^h_{x_R}$), for what it is worth.
(4) If we take $F_x = 0$, we recover from (3.13-416) the conventional GFEM equation for
an interior node and no pump. For this case, it would (or at least should) turn out
that $P_{0_L} = P_{0_R}$, etc., for $Q_1Q_1$, thus converting $(1/h)[f_R(P) - f_L(P)]$ back (after
dividing the entire equation by l) to a term that approximates $\partial P/\partial x$.
(5) For $Q_1Q_1$, we are not yet done: we must account for the extra pressures via
appropriate (extra) continuity equations. This follows naturally once we realize
that the pressure basis function for $0_L$ spans only the two left elements and that
for $0_R$ spans only the two on the right. (No such concern exists for $Q_1Q_0$, or
any element with discontinuous pressure, because the pressure jump is
accommodated 'naturally,' and the continuity equations are exactly the same as those
without a pump.) Thus, rather than the single continuity equation for node 0,
$(h/12)[(u_{SE} - u_{SW}) + 4(u_E - u_W) + (u_{NE} - u_{NW})] + (l/12)[(v_{NE} - v_{SE}) + 4(v_N - v_S) + (v_{NW} - v_{SW})] = 0$,
we get instead the pair:
$(h/12)[(u_S - u_{SW}) + 4(u_0 - u_W) + (u_N - u_{NW})] + (l/12)[(v_{NW} - v_{SW}) + 2(v_N - v_S)] = 0$ for $0_L$, and
$(h/12)[(u_{SE} - u_S) + 4(u_E - u_0) + (u_{NE} - u_N)] + (l/12)[2(v_N - v_S) + (v_{NE} - v_{SE})] = 0$ for $0_R$.
(Note that the sum of the last two equations is the first equation, so that it too
is satisfied in the 'pump' case.)
(6) If pressure modes (spurious or hydrostatic) are permitted by the BC's on the
non-periodic portion of the domain, or if the entire domain is endowed with periodic
BC's, then they will appear in full measure and the associated matrix will be singular.
They are, however, innocuous in that the associated solvability conditions [$P_m^T g = 0$
or $P_m^T g(t) = 0$] are automatically satisfied because g = 0 at all periodic boundaries.
The matrix singularity in these cases can be avoided by appropriate specification of
pressures, as discussed earlier (Section 3.13.2b).
Digression
Following on from Remark (2), another possibility (see, for example, Fortin, 1988) is to
require that the total flow rate, $\int_{\Gamma_P} u\cdot n$, be specified along the 'pump line,' $\Gamma_P$, rather than
the pointwise value of $u\cdot n$. In this case, rather than a variable $F_x(y)$ in (3.13-416), we
are restricted to a constant value—the additional single constraint equation permits only
a single extra degree of freedom, and u(y), from (3.13-416), will vary along the pump
line (plane in 3D). One does, however, have the freedom/flexibility to apply this constant
force/pressure drop at only some elements/nodes (even one!) and not others—as long as
Fx is the same wherever it is applied. Probably a constant Fx along the entire pump line
would make the most sense in most cases. (Note too that a total flow constraint is more
or less 'global' in that all nodes along the pump line are coupled.)
End digression
o Back to periodic. We are finally ready to complete the discussion of periodic BC's
for the NSE. And the discussion will be brief because all of the hard work is behind
us. For the periodic case, the 'left' nodes in Figure 3.13-37 are those at the outlet
(assuming the conventional left-to-right flow convention); i.e., the left nodes (and their
equations) are located at the right boundary of the periodic domain, and the 'right' nodes in
Figure 3.13-37 are at the left domain boundary. Thus, in the periodic case, the x-locations
of the nodes (but not the y-locations) are truly different, even though the velocities (but
not the pressures) are still tied by periodicity, $u_{0_L} = u_{0_R} = u_0$, etc. (They are given the
same node number.)
Finally, we hope and believe that this extended discussion of $Q_1Q_0$ and $Q_1Q_1$ will
permit the reader to implement pumps and/or periodic BC's for other elements, with
either continuous or discontinuous pressure.
If the expected solution is truly periodic (velocity and pressure), nothing else need
be done. If, however, the domain is a 'flow-through' type that suffers a pressure drop in
the flow direction, then further input is required: either u(y) or $F_x(y)$ must be specified.
In either case, where continuous pressure elements are employed, the pressure nodes
along the periodic boundary are not tied—they are separate; and there are separate (not
coupled) continuity equations at inlet and outlet pressure nodes. Discontinuous pressure
elements need no special consideration. Finally, for continuous pressure elements and true
periodicity, the pressure nodes may be tied together—though it is not then a requirement.
Remark:
We are speculating more than we like on this last point—and in Remark (4) above: i.e.,
we believe but cannot prove that the pressure jump would be zero if either Fx = 0 or
true periodicity were computed without tying the pressure nodes together.
For the PPE formulation, we actually proceed in much the same way, at least for
'Case 1' (specified pressure drop): (i) use the 'standard' method (proper node numbering)
for the velocity, (ii) use the velocity normal traction NBC as the PPE Dirichlet BC (still
applied weakly, of course), to add the desired pressure drop; i.e., we proceed to set
the problem up exactly as if we were solving the primitive variable formulation. The
consistent construction of the consistent PPE will take care of the rest. For 'Case 2'
(specified $u_n$), the PPE will again take care of itself, automatically, but differently: the
inherited, inhomogeneous Neumann BC that is associated with specified normal velocity
still applies here. But note that the 'inlet' nodes for P see a different RHS than do those
at the outlet, and therefore $\partial P/\partial n$ may differ at the 'interface'; the jump in pressure will
be a natural consequence of these Neumann BC's and is, of course, that produced by the
implied pump. So, it turns out, once again, that the PPE 'method' should be treated, with
respect to IC's and BC's, just as if it were the u-P method. Also similar to the u-P
formulation, post-processing for the $\Delta f_n$ could be applied if desired. Finally, related to
these issues is one more: we believe that it matters little (at least for a Newtonian fluid
away from the Stokes limit) whether the stress divergence form ($\gamma = 1$, true traction
vector) or the simpler form ($\gamma = 0$, pseudo-traction) is employed in the above equations.
For some nice examples of 2D periodic flow past arrays of cylinders, both in one
direction and two, see Tezduyar and Liou (1990), who used the $\psi$-$\omega$ formulation. For some
primitive variables results, including alternative methods of implementing the periodic
BC's, see Fortin (1988) and Segal et al. (1994).
3.14 A CONTROL VOLUME FINITE ELEMENT METHOD
As promised in Chapter 2, we will extend the CVFEM discussion there to the NS
equations. But since we now know that (most) FVM's are inherently low-order methods
(first- and second-order), we limit our scope ab initio and present the CVFEM version
of only $Q_1Q_0$. First we give an executive summary: since all terms except $\nabla P$ and
$\nabla\cdot u$ are obviously the same (for each velocity component) as for the scalar problem of
Section 2.5.3, and since the discrete approximations to div and grad turn out to be identical
to those of GFEM on $Q_1Q_0$ (really), we are basically done; i.e., the DAE's for CVFEM
are quite close (identical for $CP$ and $C^Tu$) to those for GFEM presented in Section 3.13.5.
All that really remains is to show that GFEM = CVFEM for div and grad, and this
we do next, using the sketch in Figure 3.14-1. In the sketch, $n_G$ is a unit normal on an
element boundary, $n_{CV}$ is a Control Volume (FEM) unit normal, and we shall focus on
element 3 to present our story.
The subdomain/CVFEM begins (necessarily) with the divergence form of the NS
equations, a la (3.4-2), to obtain
$\rho\frac{\partial u}{\partial t} = \nabla\cdot(\sigma - \rho uu) = \mu\nabla^2 u - \nabla\cdot(PI) - \nabla\cdot(\rho uu)$,   (3.14-1)
Fig. 3.14-1 A control volume 4-patch.
which is equivalent (for straight/planar boundaries) to
$\int_{\Omega_i} \rho\frac{\partial u}{\partial t} = \int_{\Gamma_i} n\cdot(\mu\nabla u - PI - \rho uu) = \int_{\Gamma_i} n\cdot(\mu\nabla u - \rho uu) - \int_{\Gamma_i} nP = \int_{\Gamma_i}\left(\mu\frac{\partial u}{\partial n} - \rho u_n u\right) - \int_{\Gamma_i} nP$,   (3.14-2)
where we have used $\nabla\cdot u = 0$ to obtain the simplified form of the viscous term. If the
acceleration, viscous term, and advective term are broken down into individual cartesian
components, it is clear that each component 'looks like' the analogous scalar terms in
(2.2-37) and (2.2-38); thus, we need now only focus on the new term, the pressure
gradient:
$\int_{\Omega_i} \nabla P = \int_{\Gamma_i} nP$   (3.14-3)
is the CVFEM version of $\nabla P$. Recall now the GFEM form of $\nabla P$:
$\int_\Omega \varphi_i\nabla P = -\int_\Omega P\,\nabla\varphi_i + \int_\Gamma n\varphi_i P = -\int_\Omega P\,\nabla\varphi_i$   (3.14-4)
for an interior 4-patch because there is then no boundary integral. To show the equivalence,
we use $P = \sum_j P_j\psi_j(x)$, where $\{\psi_j\}$ are the piecewise-constant pressure basis functions
for $Q_1Q_0$, in both (3.14-3) and (3.14-4) to give
$\int_{\Omega_i} \nabla P = \sum_j P_j\int_{\Gamma_{CV_j}} n$   (3.14-5)
and
$\int_\Omega \varphi_i\nabla P = -\sum_j P_j\int_{\Omega_j} \nabla\varphi_i$,   (3.14-6)
respectively, where, in each case, the sum over j is effectively a sum over the 4-patch.
Finally, we invoke Green's theorem in (3.14-6) to obtain
$\int_\Omega \varphi_i\nabla P = -\sum_j P_j\int_{\Gamma_j} n\varphi_i$,   (3.14-7)
where $\Gamma_j$ is the boundary of element j. Now note that, for each element, $\varphi_i$ is zero on
those two sides that are 'opposite' node i. (For example, in Figure 3.14-1, $\varphi_i$ is zero on
sides 6 → 9 and 8 → 9 in element 3.) Considering now each element in turn, we have
equivalence of the two pressure gradients if
$\int_{\Gamma_{CV}} n_{CV} = -\int_{\Gamma_G} \varphi_i n_G$,   (3.14-8)
where $\Gamma_{CV}$ and $\Gamma_G$ are the appropriate control-volume boundary segments and element
boundary segments, respectively. For example, for element 3 we need
$\int_{a\to b} n_{CV} + \int_{b\to c} n_{CV} = -\int_{5\to 6} \varphi_i n_G - \int_{5\to 8} \varphi_i n_G$   (3.14-9)
in order that the matrix coefficient of $P_3$ be the same for each. We shall prove that (3.14-9)
is true by direct construction, after rewriting it in local coordinates:
$\int_{-1}^{0} n_{CV}\,d\eta + \int_{-1}^{0} n_{CV}\,d\xi = -\int_{-1}^{1} n_G\varphi_i\big|_{\eta=-1}\,d\xi - \int_{-1}^{1} n_G\varphi_i\big|_{\xi=-1}\,d\eta$.   (3.14-10)
Noting that each unit normal vector is constant in the integrand and that the boundary
integral of $\varphi_i$ is simply half the length of the element side, reduces the problem to showing
that
$L_{a\to b}\,n(0, -1/2) + L_{b\to c}\,n(-1/2, 0) = -\frac{1}{2}L_{5\to 6}\,n(0, -1) - \frac{1}{2}L_{5\to 8}\,n(-1, 0)$,   (3.14-11)
where we have 'evaluated' the unit normal vectors at the midpoint of each line segment
for convenience. To finish, we use Figure 3.14-2, which will give us the various normal
vectors. The equation of the normal vectors is then
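$n_L = -n_R = \frac{1}{\sqrt{1+m^2}}(-m,\ 1)^T = \frac{1}{\sqrt{1+(y')^2}}(-y',\ 1)^T = \frac{1}{L}(-\Delta y,\ \Delta x)^T$,
and we can now evaluate each term in (3.14-11):
1. $L_{a\to b}\,n(0,-1/2) = (\Delta y,\ -\Delta x)^T = \left(\frac{y_5+y_6+y_8+y_9}{4} - \frac{y_5+y_6}{2},\ -\frac{x_5+x_6+x_8+x_9}{4} + \frac{x_5+x_6}{2}\right)^T = \frac{1}{4}(y_8+y_9-y_5-y_6,\ x_5+x_6-x_8-x_9)^T$,
2. $L_{b\to c}\,n(-1/2,0) = (-\Delta y,\ \Delta x)^T = \frac{1}{4}(y_5+y_8-y_6-y_9,\ x_6+x_9-x_5-x_8)^T$,
3. $-\frac{1}{2}L_{5\to 6}\,n(0,-1) = \frac{1}{2}(-\Delta y,\ \Delta x)^T = \frac{1}{2}(-(y_6-y_5),\ x_6-x_5)^T$,
and
4. $-\frac{1}{2}L_{5\to 8}\,n(-1,0) = \frac{1}{2}(\Delta y,\ -\Delta x)^T = \frac{1}{2}(y_8-y_5,\ -(x_8-x_5))^T$,   (3.14-12)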
Fig. 3.14-2 Unit normal vectors. (The sketch shows a line segment y = mx + b with slope $m = (y_2 - y_1)/(x_2 - x_1)$ and length $L = \sqrt{\Delta x^2 + \Delta y^2}$.)
and we are finished. Both sides of (3.14-11) give $\frac{1}{2}(y_8 - y_6,\ x_6 - x_8)^T$, which is just the 'C-matrix'
coefficient of $Q_1Q_0$ (see Appendix 1). Generalization is immediate, and we see that the
discrete gradient operators for GFEM and CVFEM are identical.
Remarks:
(1) Noting that the integrals of both GFEM and CVFEM test functions are the same
(1/4 of the relevant area) makes the equivalence more believable —intuitively.
(2) Clearly, the discrete divergence operators are also identical in the two cases, since
both test and basis functions are identical in $\int \psi_i\,\nabla\cdot u^h = 0$.
(3) The above construction has generated an alternate, but not necessarily useful, way
to compute the C-matrix.
(4) The absence of node 9 in the gradient evaluation at node 5 makes the 'bent element
blues' discussed in Gresho and Leone (1984) particularly obvious; the pressure
gradient at node 5 is completely independent of/oblivious to the location of node
9; it could be on the moon and make no difference. (This remark, of course, also
applies to nodes 1, 3, and 7 when 'fully assembled.') The gradient is also independent
of node 5's location.
(5) For an element that is a simple rectangle, the equivalence of the two gradients is
fairly obvious.
(6) The extension to 3D is not fairly obvious, unless the elements are simple bricks.
Isoparametric elements with planar faces/sides are also straightforward, but those
with non-planar sides are not.
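The equivalence in (3.14-11), and the independence from node 9 noted in Remark (4), can be confirmed numerically on a distorted element. The following is a sketch under our own assumptions, not the authors' construction: element 3 is taken as a convex quadrilateral with nodes 5, 6, 9, 8 numbered counterclockwise, the control-volume corner b is the image of $\xi = \eta = 0$, and the GFEM coefficient is obtained independently as $-\int \nabla\varphi_5$ by 2x2 Gauss quadrature. Function names and coordinates are hypothetical.

```python
import math


def cvfem_coefficient(p5, p6, p9, p8):
    """Sum of (length * outward normal) over control-volume segments a->b and b->c."""
    a = ((p5[0] + p6[0]) / 2, (p5[1] + p6[1]) / 2)  # midpoint of side 5-6
    b = tuple(sum(p[k] for p in (p5, p6, p9, p8)) / 4 for k in (0, 1))  # image of (0, 0)
    c = ((p5[0] + p8[0]) / 2, (p5[1] + p8[1]) / 2)  # midpoint of side 5-8
    total = [0.0, 0.0]
    for start, end in ((a, b), (b, c)):
        dx, dy = end[0] - start[0], end[1] - start[1]
        total[0] += dy    # L * n = (dy, -dx) for counterclockwise traversal
        total[1] += -dx
    return tuple(total)


def gfem_coefficient(p5, p6, p9, p8):
    """-integral over the element of grad(phi_5), by 2x2 Gauss quadrature."""
    pts = (p5, p6, p9, p8)
    corners = ((-1, -1), (1, -1), (1, 1), (-1, 1))
    g = 1 / math.sqrt(3)
    gx = gy = 0.0
    for xi in (-g, g):
        for eta in (-g, g):
            dxi = [cx * (1 + cy * eta) / 4 for cx, cy in corners]
            deta = [cy * (1 + cx * xi) / 4 for cx, cy in corners]
            x_xi = sum(d * p[0] for d, p in zip(dxi, pts))
            x_eta = sum(d * p[0] for d, p in zip(deta, pts))
            y_xi = sum(d * p[1] for d, p in zip(dxi, pts))
            y_eta = sum(d * p[1] for d, p in zip(deta, pts))
            # det(J) * dphi/dx and det(J) * dphi/dy for node 5 (local index 0)
            gx += dxi[0] * y_eta - deta[0] * y_xi
            gy += -dxi[0] * x_eta + deta[0] * x_xi
    return (-gx, -gy)


p5, p6, p9, p8 = (0.0, 0.0), (2.0, 0.3), (2.5, 1.9), (-0.2, 1.2)
cv = cvfem_coefficient(p5, p6, p9, p8)
gf = gfem_coefficient(p5, p6, p9, p8)
closed = ((p8[1] - p6[1]) / 2, (p6[0] - p8[0]) / 2)  # (1/2)(y8 - y6, x6 - x8)
cv_far = cvfem_coefficient(p5, p6, (10.0, 7.0), p8)  # node 9 moved far away
print("CVFEM:", cv)
print("GFEM: ", gf)
print("closed form:", closed)
```

Moving node 9 changes the element shape but not the coefficient, which is the 'bent element blues' of Remark (4) in numerical form.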
Now that we have shown div and grad equivalence, there is one final aspect of CVFEM
that needs to be addressed: open (or outflow) boundary conditions. What is the CVFEM
equation for a node at which the velocity is not specified? To answer this important
question, we examine the following two-patch in Figure 3.14-3 at the right edge of a
computational domain. In generating the discrete momentum equation for node 0, we are
led to consider
$\int_{\Omega_0} \nabla\cdot(IP) = \int_{\Gamma_0} nP$,   (3.14-13)
Fig. 3.14-3 A boundary 2-patch.
which leads to the question, 'Is that portion of $\Gamma_0$ that comprises a-0-b to be included
or not?' Our answer is the following: yes for the tangential component, but no for
the normal component, the second part of which may be surprising to some CVFEM
practitioners. The reason is this: if the entire CV boundary were included in the
normal momentum equation, the result would be zero, because P is piecewise-constant.
[It is, of course, not zero for the tangential equation; there it is simply and
appropriately $H(P_N - P_S)/2$.] What is needed is the realization that a normal force balance
is needed on the open boundary, and this leads to the CVFEM version of the NBC of
GFEM; namely,
$-P + \mu\frac{\partial u_n}{\partial n} = f_n$,   (3.14-14)
where, as with GFEM, the missing factor of two in the viscous term is a result of dropping
the $(\nabla u)^T$ portion of the viscous stress in (3.14-1), so that, as with GFEM, we really have
a pseudo-traction BC. But the key point is that the pressure must show up in the OBC
so that, when considering the boundary integral at node 0, the portion of it on a → b
must be replaced by the above BC; i.e., the combined viscous and pressure term [see
(3.14-2)], $\int_\Gamma(\mu\,\partial u/\partial n - nP)$, is replaced by F, a given (pseudo) traction vector. [In the
above sketch, the pressure term is then $-(P_N + P_S)h/2$, which is '$C_xP$' at node 0 and is
again identical to the GFEM result, which came about somewhat more 'naturally'
as a natural boundary condition.]
So we have come to the end of our CVFEM presentation for the NS equations. For
convenience, however, we summarize the key results below, since some of them are lifted
from the previous chapter:
1. For an interior four-patch, (2.5-4) can be used to obtain the CVFEM version of du/dt,
u • Vu, and V2u by replacing T by u and v, respectively.
2. The div and grad terms are identical to those from GFEM, a la Section 3.13.5.
3. To construct the CVFEM equations at an open boundary, use (2.5-8) for a 2-patch
and (2.5-10) for a 1-patch in the same way as above, and use the GFEM results in
Section 3.13.5 to get div and grad and the OBC's.
4. As for the scalar case, the characteristic GFEM averaging of (1 4 1)/6 changes to
(1 6 1)/8.
5. The consistently derived CVFEM has more similarities than differences from GFEM,
and it is the authors' opinion that every difference but one is in favor of GFEM. That
'one' is: 'flux in = flux out.' But see Appendix 2.
6. The nascent theory of FVM's has been 'covered' briefly via some of the citations of
Section 1.7.
*3.15 VARIATIONAL PRINCIPLES FOR POTENTIAL AND
STOKES FLOW
3.15.1 Introduction
The steady Stokes equations—and their seemingly totally unrelated but simpler cousin, the
potential flow equations—provide a rich setting for mathematical analysis, mostly related
to or caused by the incompressibility constraint. These equations can be formulated via
minimum principles or via maximum principles (which also introduces a least-squares
solution) or by 'both' (saddle-point principles); or by projections. We shall do all of
the above below, and in an order that moves more or less monotonically up the scale
of difficulty, beginning with the discrete equations for which linear algebra provides
the bulk of the analytical tools—as well described by Strang (1986; see also his SIAM
review article—Strang, 1988), who refers to the subject as 'duality.' After developing
and presenting this theory for the Stokes equations, we will digress to show how easily it
also applies to the discrete equations of potential flow. Then we leave the comforts of the
finite dimensional world to address the continuous analogs—first for potential flow, and
finally concluding with steady Stokes flow. This path may seem to some a digression from
the goal of solving the NS equations via GFEM, and indeed it is—and can be skipped
over without loss of continuity. It is much more aimed at the 'incompressible flow' part
of our title.
For 'instructions' (or, at least, some guidance) on how to generate appropriate
functionals, see Kardestuncer and Norrie (1987) or Sewell (1987); see also Finlayson (1972).
3.15.2 Discrete Stokes
The discrete steady Stokes equations are contained in (3.13-28) and (3.13-29): omit Mu
and N(u)u, and add a body force (in general) to obtain
Ku + CP = f, (3.15-1)
$C^T u = g$, (3.15-2)
which can be rewritten as $Ax = b$ with $x^T = (u^T, P^T)$, $b^T = (f^T, g^T)$, and
$A = \begin{pmatrix} K & C \\ C^T & 0 \end{pmatrix}$, (3.15-3)
where K is SPD (and n × n), and C (n × m, n > m) has full rank (no pure pressure
modes permitted, for the time being); thus, K is invertible, and C has no null space. It
is noteworthy that C has more rows than columns, and $C^T$ has more columns than rows.
Thus, when C (and therefore $C^T$) has full rank, the rank is m, and since m < n, it follows
that $C^T$ has a non-trivial null space even though C does not: there are n − m linearly
independent vectors {u} for which $C^T u = 0$; this null space is called the divergence-free
subspace of $R^n$ corresponding to/generated by $C^T$.
Formally (but never computationally), the solution to this linear system can be obtained
in two steps:
(i) $P = (C^T K^{-1} C)^{-1} [C^T K^{-1} f - g]$, (3.15-4)
(ii) $u = K^{-1}(f - CP)$, (3.15-5)
showing also that A is invertible.
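The two-step formula can be checked numerically on a small random system. The following is only a minimal sketch, not the authors' code: the function names, the problem sizes, and the random test matrices are assumptions made for illustration, and (as the text stresses) forming $C^T K^{-1} C$ explicitly is something one would do only for a toy check of this kind.

```python
import numpy as np


def make_test_problem(n=8, m=3, seed=0):
    """Hypothetical test data: K is SPD (n x n), C is n x m with full column rank."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((n, n))
    K = B @ B.T + n * np.eye(n)        # symmetric positive definite by construction
    C = rng.standard_normal((n, m))    # a Gaussian matrix has full rank almost surely
    assert np.linalg.matrix_rank(C) == m
    f = rng.standard_normal(n)
    g = rng.standard_normal(m)
    return K, C, f, g


def solve_two_step(K, C, f, g):
    """Formal two-step solution: (3.15-4) for P, then (3.15-5) for u."""
    S = C.T @ np.linalg.solve(K, C)                          # C^T K^{-1} C
    P = np.linalg.solve(S, C.T @ np.linalg.solve(K, f) - g)  # (3.15-4)
    u = np.linalg.solve(K, f - C @ P)                        # (3.15-5)
    return u, P


K, C, f, g = make_test_problem()
u, P = solve_two_step(K, C, f, g)

# Direct solve of the assembled system A x = b of (3.15-3), for comparison.
n, m = C.shape
A = np.block([[K, C], [C.T, np.zeros((m, m))]])
x = np.linalg.solve(A, np.concatenate([f, g]))

print("max |x - (u, P)|      :", np.max(np.abs(x - np.concatenate([u, P]))))
print("constraint residual   :", np.max(np.abs(C.T @ u - g)))
```

Both printed numbers should be at rounding level, which is consistent with the claim that A is invertible whenever K is SPD and C has full rank.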
To put the above solution into a variational setting, we introduce three functionals:
(i) $J(v) = v^T(\tfrac12 Kv - f)$, (3.15-6)
(ii) $I(q) = -\tfrac12 (Cq - f)^T K^{-1} (Cq - f) - q^T g$, (3.15-7)
and
(iii) $L(v, q) = J(v) + q^T (C^T v - g)$. (3.15-8)
J(v) is called the primal functional, I(q) the dual (or reciprocal) functional, and L(v, q)
the Lagrangian functional. We shall show that the solution of (3.15-1) and (3.15-2),
i.e., (3.15-4) and (3.15-5), can also be obtained in three other ways—one from each
functional: (i) minimize $J(v)$ subject to the constraint $C^T v = g$, (ii) maximize $I(q)$ with
no constraints, and (iii) find the saddle-point of $L(v, q)$. Note the asymptotic behaviors:
$J(v) \to \infty$ for $\|v\| \to \infty$, $I(q) \to -\infty$ for $\|q\| \to \infty$, and $L(v, q) \to \infty$ for $\|v\| \to \infty$.
If also $L(v, q) \to -\infty$ for $\|q\| \to \infty$, we would be assured that $L(v, q)$ is a saddle-point
functional. The last condition is, however, not always realized, and the sufficient
conditions for the existence of a saddle point may not be satisfied. It is, however, not
always necessary, as we shall see. See, for example, Carey and Oden (1983, Volume II)
for further discussion.
1. Minimize J. The first step is easy; the first variation of $J(v)$ is simply
$\delta J(v) = \delta v^T (Kv - f)$, (3.15-9)
but the second step is more subtle because of the constraint $C^T v = g$. Thus, attempting
to find $\delta J(v) = 0$ via $v = K^{-1} f$ is not allowed because this v is generally not in the
admissible set of functions: it does not satisfy $C^T v = g$. The admissible functions do satisfy
$C^T v = g$, and thus their first variations are necessarily discretely divergence-free: $C^T \delta v = 0$,
which leads us to consider an n-vector, say w, that is generated by an m-vector, say q,
via $w = Cq$ because all such vectors are orthogonal to $\delta v$; $\delta v^T w = w^T \delta v = q^T C^T \delta v = 0$.
This leads to the proper conclusion that if $Kv - f$ in (3.15-9) were one of these w-vectors,
we would have $\delta J(v) = 0$ and be respecting the constraint. Thus, $Kv - f$ is the 'gradient'
of some scalar, say $w = -Cq$, and we have $\delta J(v) = 0$ if $Kv + Cq = f$, where $C^T v = g$;
i.e., the extremum of $J(v)$ is attained at the Stokes solution: $v = u$ and $q = P$.
To finish, we must show that the extremum is indeed a minimum. This is easy; the
second variation of $J(v)$, from (3.15-9), is simply
$\delta^2 J(v) = \delta v^T K\, \delta v$, (3.15-10)
which is a positive definite quadratic form because K is SPD, which proves that the
extremum is a minimum ($\delta^2 J > 0$).
2. Maximize I(q). There are fewer subtleties here because the variational problem is
unconstrained. The first variation of I(q) in (3.15-7) gives
$\delta I(q) = -\delta q^T [C^T K^{-1}(Cq - f) + g]$, (3.15-11)
and since $\delta q$ is a completely arbitrary m-vector, we obtain $\delta I(q) = 0$ when
$C^T K^{-1}(Cq - f) + g = 0$; i.e., when $q = P$ from (3.15-4). This 'dual' variable formulation thus leads
directly to the correct value of the dual variable, the pressure.
To show that this solution maximizes $I(q)$, simply form $\delta^2 I(q) = -\delta q^T C^T K^{-1} C\, \delta q =
-x^T K^{-1} x < 0$, where $x = C\, \delta q$ is nonzero because C has full rank and $K^{-1}$ is SPD because K is. Done.
To finish, we simply return to (3.15-5) to recover the primal variable, u.
Digression 1:
Least-squares solutions and projections. As a small aside, it is noteworthy that for the
special case of g = 0 in (3.15-2), corresponding to homogeneous BC's on velocity, the
above maximization (pressure solution) is related to a least-squares solution of $CP = f$,
and the full solution (u and P) is related to an orthogonal projection of a 'velocity,'
say w, defined by $Kw = f$ in (3.15-1), from the given space ($R^n$) to the divergence-free
subspace of $R^n$. Let us prove these assertions: suppose we seek a 'solution' to $CP = f$, a
system with more equations (n) than unknowns (m). A classical technique is the method
of (weighted) least squares: find a P that minimizes $\|CP - f\|_{K^{-1}}$; i.e., minimize the
residual of $CP - f$ in the $K^{-1}$-norm. Defining
$R(q) = \|Cq - f\|^2_{K^{-1}} = (Cq - f)^T K^{-1} (Cq - f)$, (3.15-12)
and setting $\delta R = 2\, \delta q^T C^T K^{-1}(Cq - f) = 0$ which, since $\delta q$ is arbitrary, means that the
m-vector $C^T K^{-1}(Cq - f)$ must be orthogonal to all m-vectors in $R^m$, and thus it must
be the zero vector; $(C^T K^{-1} C) q = C^T K^{-1} f$, and we have proven our first point: when
g = 0, the maximization of $I(q)$ is equivalent to finding a q that is the least-squares solution
(in the $K^{-1}$-norm) of $Cq = f$, which we call P. Actually, the proof that $\delta R = 0$ yields
a minimum (least-squares residual) relies on the fact that $\delta^2 R = 2\, \delta q^T C^T K^{-1} C\, \delta q > 0$,
which we have already shown. Stated differently: the vector $CP - f$ is the smallest
vector in the (large; dimension n − m) null space of $C^T K^{-1}$ (smallest in the $K^{-1}$-norm,
that is); and it is (of course) unique. Stated differently yet: the pressure is the vector
that minimizes the 'related' vector, $K^{-1}(Cq - f)$, in the K-norm over all divergence-free
vectors, an interpretation that is more 'consistent' with the continuum analog to be
discussed below, wherein the relevant norm is the $H^1$-norm.
To see the projection connection, we rewrite the Stokes equations (again for g = 0)
as $Ku + CP = f = Kw$ and $C^T u = 0$ with solution $P = (C^T K^{-1} C)^{-1} C^T w$ and
$u = w - K^{-1} C P = [I - K^{-1} C (C^T K^{-1} C)^{-1} C^T] w = \mathcal{P}_K w$, where $\mathcal{P}_K$ is a projection matrix
($\mathcal{P}_K^2 = \mathcal{P}_K$) that projects a given n-vector, here w, to the divergence-free subspace that
is the null space of $C^T$; $C^T \mathcal{P}_K w = C^T u = 0$ for all w because $C^T \mathcal{P}_K = 0$. That the
projection is K-orthogonal, $(\mathcal{P}_K u)^T K (\mathcal{Q}_K u) = 0$ for every u, where $\mathcal{Q}_K = I - \mathcal{P}_K$ is the
'residual' projection matrix, follows immediately by noting that
$\mathcal{P}_K^T K \mathcal{Q}_K = [I - C (C^T K^{-1} C)^{-1} C^T K^{-1}]\, C (C^T K^{-1} C)^{-1} C^T = 0$. (3.15-13)
For more projection discussion, see Appendix 3, in which Pk is called p\.
To conclude this portion of the digression, we note that it is the presence of K that
caused that least-squares solution to be a weighted (via $K^{-1}$) least-squares solution and
caused the (non-symmetric, $\mathcal{P}_K^T \neq \mathcal{P}_K$) projection to be K-orthogonal rather than 'simply'
(Euclidean) orthogonal. A change of variable would change both of these results to their
'simpler' interpretations; i.e., $\tilde{u} = K^{1/2} u$, $\tilde{f} = K^{-1/2} f$, and $\tilde{C} = K^{-1/2} C$ yields a system
with K replaced by the identity: $\tilde{u} + \tilde{C} P = \tilde{f}$ and $\tilde{C}^T \tilde{u} = 0$ and (i) the (unweighted) least-squares
solution of $\tilde{C} P = \tilde{f}$; namely, $(\tilde{C}^T \tilde{C}) P = \tilde{C}^T \tilde{f}$, gives P and (ii) the new projection
matrix, $\mathcal{P}$, is $\mathcal{P} = I - \tilde{C} (\tilde{C}^T \tilde{C})^{-1} \tilde{C}^T$, which is symmetric and generates a 'conventionally
orthogonal' projection of $\tilde{f}$; $\tilde{u} = \mathcal{P} \tilde{f}$ with $(\mathcal{P} \tilde{f})^T (\mathcal{Q} \tilde{f}) = 0$, where $\mathcal{Q} = I - \mathcal{P}$, because $\mathcal{P}$
and $\mathcal{Q}$ are 'conventionally' orthogonal: $\mathcal{P} \mathcal{Q} = 0$.
End Digression
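The claims of this digression lend themselves to a quick numerical check. The sketch below is illustrative only: the variable names (`PK`, `QK`, `P_ls`) and the random test matrices are assumptions, and the weighted least-squares problem is solved by whitening with a Cholesky factor of K, which is a substitute for (not the same as) the symmetric square root $K^{1/2}$ used in the text; both give the same minimizer.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 7, 3
B = rng.standard_normal((n, n))
K = B @ B.T + n * np.eye(n)      # SPD test matrix (assumed, for illustration)
C = rng.standard_normal((n, m))  # full column rank almost surely
w = rng.standard_normal(n)       # the 'velocity' to be projected
f = K @ w                        # so that K w = f, with g = 0

Kinv_C = np.linalg.solve(K, C)
S = C.T @ Kinv_C                                     # C^T K^{-1} C
PK = np.eye(n) - Kinv_C @ np.linalg.solve(S, C.T)    # projection onto null(C^T)
QK = np.eye(n) - PK                                  # 'residual' projection

u = PK @ w                              # projected (discretely divergence-free) velocity
P = np.linalg.solve(S, C.T @ w)         # pressure from the g = 0 Stokes solution

# Weighted least squares: minimize ||C q - f|| in the K^{-1}-norm.
# With K = Lc Lc^T, this equals the ordinary 2-norm of Lc^{-1}(C q - f).
Lc = np.linalg.cholesky(K)
P_ls, *_ = np.linalg.lstsq(np.linalg.solve(Lc, C), np.linalg.solve(Lc, f), rcond=None)

print("idempotent, PK^2 = PK   :", np.allclose(PK @ PK, PK))
print("C^T PK = 0              :", np.allclose(C.T @ PK, 0.0))
print("K-orthogonal, (3.15-13) :", np.allclose(PK.T @ K @ QK, 0.0))
print("least-squares P matches :", np.allclose(P_ls, P))
print("PK symmetric?           :", np.allclose(PK, PK.T))
```

For a generic K the last line is expected to report that the projection matrix is not symmetric, in line with the remark that it is K-orthogonal rather than Euclidean-orthogonal.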
Digression 2:
Show that $J_{\min} = I_{\max}$. It is probably not intuitively obvious that if $J_{\min}$ and $I_{\max}$ describe
the same solution, then it follows that they are equal. To show that this is indeed the case
'is just a medley of matrix algebra' (Strang, 1986, p. 101):
$J(v) - I(q) = \tfrac12 v^T K v - v^T f + \tfrac12 (Cq - f)^T K^{-1} (Cq - f) + q^T g$
$\quad = \tfrac12 [v^T K v + (Cq - f)^T K^{-1} (Cq - f)] + q^T g - v^T f$
$\quad = \tfrac12 [(Kv + Cq - f)^T K^{-1} (Kv + Cq - f)] - v^T (Cq - f) + q^T g - v^T f$
$\quad = \tfrac12 (Kv + Cq - f)^T K^{-1} (Kv + Cq - f) + q^T (g - C^T v)$ (3.15-14)
is the general result. Now if v is divergence-free, then we have $C^T v = g$ and, because $K^{-1}$
is SPD, we then obtain $J(v) - I(q) = 0$ if and only if $Kv + Cq - f = 0$; at the solution,
we have $Ku + CP = f$ and $C^T u = g$. We then have $J(u) = J_{\min}(v) = I_{\max}(q) = I(P)$.
Finally, it is also noteworthy from above that $J(v) \geq I(q)$ for all admissible v in the
minimization problem (i.e., those satisfying $C^T v = g$), again because $K^{-1}$ is SPD.
End Digression
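The identity (3.15-14) and its consequences can be spot-checked numerically. This is a minimal sketch under assumed random data (the helper names `J`, `I`, and `gap_formula` are ours, not the book's); admissible vectors are generated by adding to u an element of the null space of $C^T$, obtained here from the SVD of C.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 7, 3
B = rng.standard_normal((n, n))
K = B @ B.T + n * np.eye(n)      # SPD test matrix (assumed)
C = rng.standard_normal((n, m))  # full column rank almost surely
f = rng.standard_normal(n)
g = rng.standard_normal(m)


def J(v):
    """Primal functional (3.15-6)."""
    return v @ (0.5 * K @ v - f)


def I(q):
    """Dual functional (3.15-7)."""
    r = C @ q - f
    return -0.5 * r @ np.linalg.solve(K, r) - q @ g


def gap_formula(v, q):
    """Right-hand side of (3.15-14)."""
    r = K @ v + C @ q - f
    return 0.5 * r @ np.linalg.solve(K, r) + q @ (g - C.T @ v)


# 1. The identity holds for arbitrary (not necessarily admissible) v and q.
v = rng.standard_normal(n)
q = rng.standard_normal(m)
identity_error = abs((J(v) - I(q)) - gap_formula(v, q))

# 2. Stokes solution via (3.15-4) and (3.15-5).
S = C.T @ np.linalg.solve(K, C)
P = np.linalg.solve(S, C.T @ np.linalg.solve(K, f) - g)
u = np.linalg.solve(K, f - C @ P)

# 3. An admissible v: u plus any vector in null(C^T).
U, _, _ = np.linalg.svd(C, full_matrices=True)
Z = U[:, m:]                               # columns span the null space of C^T
v_adm = u + Z @ rng.standard_normal(n - m)

print("identity error        :", identity_error)
print("J(u) - I(P)           :", J(u) - I(P))
print("J(v_adm) - I(q) >= 0  :", J(v_adm) - I(q))
```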
3. Find the saddle-point. The last item on our list is to study the (alleged) saddle-point
problem associated with the Lagrangian functional, (3.15-8). We will show that the Stokes
solution minimizes L(v, q) with respect to v and maximizes it with respect to q. We begin
by seeking the extremal/stationary points of L(v, q) via
$\delta L = \delta J + \delta q^T (C^T v - g) + q^T C^T \delta v$
$\quad = \delta v^T (Kv + Cq - f) + \delta q^T (C^T v - g)$, (3.15-15)
and we emphasize that we are now not in the space of divergence-free vectors ($C^T \delta v \neq 0$)
because the introduction of the Lagrange multiplier variable (q) has permitted a
relaxation of this constraint. The extremum of L is given by $\delta L = 0$ and yields, since $\delta q$ and
$\delta v$ are independent and arbitrary variations, $Kv + Cq = f$ and $C^T v = g$, thus recovering
the discrete Stokes equations: u and P from (3.15-1) and (3.15-2) are a stationary point
of the Lagrangian.
To show that the stationary point is a saddle-point, we start with the easy part; at the
solution, we have $L(u, P) = J(u)$, and we have already shown that this is the minimum
J. But to show that $L(u, P)$ is also a minimum with respect to v at fixed q (= P), we must
examine
$L(u + \epsilon v, P) = J(u + \epsilon v) + P^T [C^T (u + \epsilon v) - g]$
$\quad = J(u + \epsilon v) + \epsilon P^T C^T v$
$\quad = \tfrac12 (u + \epsilon v)^T K (u + \epsilon v) - (u + \epsilon v)^T f + \epsilon P^T C^T v$
$\quad = \tfrac12 u^T K u - u^T f + \epsilon [u^T K v - v^T f + P^T C^T v] + \tfrac12 \epsilon^2 v^T K v$
$\quad = L(u, P) + \epsilon v^T [Ku + CP - f] + \tfrac12 \epsilon^2 v^T K v$
$\quad = L(u, P) + \tfrac12 \epsilon^2 v^T K v$, (3.15-16)
and we have that $L(u + \epsilon v, P) > L(u, P)$ because K is SPD.
Turning to the other side, the analysis of the maximum proceeds as follows: pick a
q, any q but fixed (so that $\delta q = 0$) and find the v that makes $\delta L = 0$ for this q. This v
comes from (3.15-15) with $\delta q = 0$ and is $v = K^{-1}(f - Cq)$; i.e., for fixed q, this v gives
$\delta L = 0$. Now insert this $v(q)$ back into $L(v, q)$ to find its value at $\delta L = 0$:
$L(v(q), q) = \tfrac12 [K^{-1}(f - Cq)]^T K [K^{-1}(f - Cq)] - f^T K^{-1}(f - Cq) + q^T [C^T K^{-1}(f - Cq) - g]$
which, after simplification, becomes
$L(v(q), q) = -\tfrac12 [q^T C^T K^{-1} C q + f^T K^{-1} f] + q^T [C^T K^{-1} f - g]$, (3.15-17)
which is just $I(q)$ in (3.15-7), and we already know that maximizing $I(q)$ yields the
Stokes solution. [Indeed, the above observation leads to a useful way to construct $I(q)$
from $J(v)$ and $L(v, q)$, and we shall soon use this fact to our benefit.] Thus, we have
shown that $L(u, P)$ is a minimum over v and a maximum over q; i.e., we have just shown
that
$L(u, q) \leq L(u, P) \leq L(v, P)$, (3.15-18)
which is the definition of a saddle point. Finally, we close with the observation that
$J_{\min} = I_{\max} = L_{\text{saddle-point}}$, the value of which we leave to the reader to work out as we
have not found the exercise productive.
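A small numerical experiment makes the two halves of the saddle-point argument concrete. The sketch below uses assumed random data and our own helper names; it checks (3.15-16), the reduction of $L(v(q), q)$ to $I(q)$ in (3.15-17), and the resulting chain of inequalities. One observation that the code makes visible (our remark, not a statement from the text): because $C^T u = g$, the value $L(u, q)$ does not depend on q at all, so the left inequality in (3.15-18) holds with equality.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 7, 3
B = rng.standard_normal((n, n))
K = B @ B.T + n * np.eye(n)      # SPD test matrix (assumed)
C = rng.standard_normal((n, m))  # full column rank almost surely
f = rng.standard_normal(n)
g = rng.standard_normal(m)


def J(v):
    return v @ (0.5 * K @ v - f)                      # (3.15-6)


def I(q):
    r = C @ q - f
    return -0.5 * r @ np.linalg.solve(K, r) - q @ g   # (3.15-7)


def L(v, q):
    return J(v) + q @ (C.T @ v - g)                   # (3.15-8)


# Stokes solution via (3.15-4) and (3.15-5).
S = C.T @ np.linalg.solve(K, C)
P = np.linalg.solve(S, C.T @ np.linalg.solve(K, f) - g)
u = np.linalg.solve(K, f - C @ P)

v = rng.standard_normal(n)   # arbitrary velocity perturbation (not divergence-free)
q = rng.standard_normal(m)   # arbitrary pressure
eps = 0.3

# Minimum in v at fixed P: (3.15-16).
increase = L(u + eps * v, P) - L(u, P)
predicted = 0.5 * eps**2 * (v @ K @ v)

# Maximum in q: eliminate v via v(q) = K^{-1}(f - C q); then L(v(q), q) = I(q).
v_of_q = np.linalg.solve(K, f - C @ q)

print("increase vs. predicted :", increase, predicted)
print("L(v(q), q) - I(q)      :", L(v_of_q, q) - I(q))
print("I(q) <= L(u, P)        :", I(q) <= L(u, P))
print("L(u, q) - L(u, P)      :", L(u, q) - L(u, P))
```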
3.15.3 Discrete Potential
Now we switch gears and show the power and ease of linear algebra by making an instant
change from (discrete) Stokes (rotational) flow to (discrete) potential/irrotational flow. It
is easy in the discrete case, but not in the continuous case, thus perhaps serving as an
example of the rather significant difference between finite and infinite dimensional spaces.
The easy (and indeed somewhat remarkable) part is this: changing K to M (the velocity
mass matrix) at every occurrence above converts every statement about Stokes flow to
an equivalent one about potential flow. Thus, changing (3.15-1) gives
$Mu + CP = f$, (3.15-19)
$C^T u = g$, (3.15-20)
which describes discrete potential flow in which f is now normally relegated to describing
boundary condition forcing, whereas for Stokes flow it was this plus a 'body force,' but
the key point regarding linear algebra is that they are both 'merely n-vectors.' And there
is also a change in g, in general, because Stokes flow needs both normal and
tangential Dirichlet BC's on velocity, whereas (slippery) potential flow permits specification
of only the normal velocity. One noteworthy difference between M and K is that M
can, sometimes—depending on the element, be 'lumped' without destroying the
potential flow approximation, but K cannot—and the reason is simple: M (and its diagonal
lumped version, Ml) both approximate the identity operator of the continuum (which
is very 'local'), whereas K approximates —V2, the Laplacian (an elliptic operator whose
inverse 'fills the domain'). The continuum analog is that potential flow moves us from the
$H^1$-norm to the simpler $L^2$-norm.
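The practical appeal of lumping can be seen in a few lines: with a diagonal $M_L$, every application of $M_L^{-1}$ is an elementwise division. The following is a sketch with made-up data (a random positive diagonal stands in for a lumped mass matrix and a random C for the gradient matrix; neither comes from an actual finite element mesh), taking g = 0 for simplicity.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 9, 4
m_lumped = rng.uniform(0.5, 2.0, size=n)  # diagonal of a hypothetical lumped mass matrix M_L
C = rng.standard_normal((n, m))           # stand-in gradient matrix, full rank almost surely
f = rng.standard_normal(n)
g = np.zeros(m)

# K -> M_L in (3.15-4) and (3.15-5); M_L^{-1} is applied by elementwise division.
Minv_C = C / m_lumped[:, None]
S = C.T @ Minv_C                                   # C^T M_L^{-1} C
P = np.linalg.solve(S, C.T @ (f / m_lumped) - g)
u = (f - C @ P) / m_lumped

# With g = 0, u is M_L-orthogonal to the discrete 'gradient' part M_L^{-1} C P.
grad_part = (C @ P) / m_lumped
m_inner = u @ (m_lumped * grad_part)

print("constraint residual :", np.max(np.abs(C.T @ u - g)))
print("M_L-inner product   :", m_inner)
```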
3.15.4 Continuous Potential
So let us now state the continuum version of potential flow described by (3.15-19) and
(3.15-20) and try to find the analogous variational 'consequences.' It is
$u + \nabla P = f$ and $\nabla \cdot u = 0$ in $\Omega$ (3.15-21)
with
$u \cdot n = u_n$ on $\Gamma_D$ and $P = P_N$ on $\Gamma_N$, (3.15-22)
where f, $u_n$, and $P_N$ comprise the data, we retain a body force (with $\nabla \times f = 0$ because
potential flow is, by definition, irrotational) for generality (even though f = 0 for
'conventional' potential flow), and we retain the symbol P for the velocity potential for
'convenience.' These equations imply, for sufficiently smooth solutions,
$\nabla^2 P = \nabla \cdot f$ in $\Omega$, (3.15-23)
with
$\partial P/\partial n = n \cdot f - u_n$ on $\Gamma_D$ and $P = P_N$ on $\Gamma_N$, (3.15-24)
wherein we note (again) the 'inversion' of essential and natural BC's; $\Gamma_N$ 'looks like' a
Dirichlet boundary for P, whereas $\Gamma_D$ looks like a Neumann boundary, and indeed this
would be the case if the solution of (3.15-23) and (3.15-24) were to be attacked directly
(but weakly, of course). But we are more interested in the mixed (primitive) formulation
of (3.15-21) and (3.15-22), usually, in which our appellation is the proper one.
The relevant/corresponding functionals for this case are again the primal, the dual, and
the Lagrangian, respectively:
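(i) $J(v) = \tfrac12 \int v \cdot v - \int v \cdot f + \int_{\Gamma_N} P_N\, n \cdot v$, (3.15-25)
where every v must satisfy $n \cdot v = u_n$ on $\Gamma_D$ and be divergence-free: $\nabla \cdot v = 0$.
(ii) $I(q) = -\tfrac12 \int (\nabla q - f) \cdot (\nabla q - f) - \int_{\Gamma_D} u_n q$, (3.15-26)
where here every q must agree with $P_N$ on $\Gamma_N$.
(iii) $L(v, q) = J(v) - \int q\, \nabla \cdot v$, (3.15-27)
where here $n \cdot v = u_n$ on $\Gamma_D$ and $q = P_N$ on $\Gamma_N$ are required. It is noteworthy that the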
primal functional contains a boundary integral over its Neumann portion, and its dual
displays a boundary integral over its Neumann portion.
We now repeat the variational analyses presented above for the continuous case: min
then max then saddle.
1. Minimize $J(v)$. Again, the first step is easy; the first variation of $J(v)$ is
$\delta J = \int (v - f) \cdot \delta v + \int_{\Gamma_N} P_N\, n \cdot \delta v$, (3.15-28)
and we now utilize the fact that v is divergence-free (so that $\nabla \cdot \delta v = 0$) and that $n \cdot v = u_n$
on $\Gamma_D$ (so that $n \cdot \delta v = 0$ there) to write, for any scalar, q,
$0 = \int q\, \nabla \cdot \delta v = \int_{\Gamma_N} q\, n \cdot \delta v - \int \delta v \cdot \nabla q$. (3.15-29)
Thus, comparing (3.15-28) and (3.15-29), we see that in order to have $\delta J = 0$ in the
space of divergence-free functions satisfying $n \cdot v = u_n$ on $\Gamma_D$, we need that $v - f$ is the
gradient of a particular scalar, via
$v - f = -\nabla q$ in $\Omega$ and $q = P_N$ on $\Gamma_N$. (3.15-30)
Comparing this result with the potential flow equations (3.15-21) and (3.15-22), shows
that $v = u$ and $q = P$. The final step is to show that this solution minimizes $J(v)$, and
this follows easily by taking the second variation of $J(v)$ via (3.15-28):
$\delta^2 J = \int \delta v \cdot \delta v$, (3.15-31)
which is positive, and we are finished.
Remark:
We have just rediscovered (as a special case) the Kelvin principle for inviscid flow: among
all incompressible flows satisfying (3.15-21) with f = 0 (or conservative, $\nabla \times f = 0$) and
either $P_N = 0$ or $\Gamma_N = \emptyset$, the one that minimizes the kinetic energy is irrotational.
2. Maximize I(q) with no constraints. Again, the no-strings-attached variational problem
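is simpler, although $q = P_N$ on $\Gamma_N$ (and thus $\delta q = 0$ there) is still at least an attached
'thread.' We obtain
$\delta I = -\int (\nabla q - f) \cdot \nabla \delta q - \int_{\Gamma_D} u_n\, \delta q$
$\quad = -\int \nabla \cdot [\delta q (\nabla q - f)] + \int \delta q\, \nabla \cdot (\nabla q - f) - \int_{\Gamma_D} u_n\, \delta q$
$\quad = \int \delta q (\nabla^2 q - \nabla \cdot f) - \int_{\Gamma_D} \delta q [n \cdot (\nabla q - f) + u_n]$, (3.15-32)
which vanishes if $\nabla^2 q = \nabla \cdot f$ in $\Omega$, $\partial q/\partial n = n \cdot f - u_n$ on $\Gamma_D$, and (of course) $q = P_N$ on
$\Gamma_N$ which, via comparison with (3.15-23) and (3.15-24), shows that $q = P$, our potential
function. Returning to $\delta I = -\int (\nabla q - f) \cdot \nabla \delta q - \int_{\Gamma_D} u_n\, \delta q$ and taking its first variation
yields $\delta^2 I = -\int \nabla \delta q \cdot \nabla \delta q$, a negative-definite quantity, thus assuring that the pressure
maximizes $I(q)$: $I(q) \leq I(P)$ for all q, a restatement (for f = 0, or conservative, and
$\Gamma_N = \emptyset$) of the Dirichlet principle for inviscid flow: among all irrotational flows satisfying
$n \cdot u = u_n$ on $\Gamma_D$, the one that minimizes the kinetic energy is divergence-free (Fix et al.,
1981); maximizing I minimizes $KE = \tfrac12 \int u \cdot u$ because $-u = \nabla q = \nabla P$.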
Now, to finish, we note that even though $\nabla \cdot (\nabla P - f) = 0$, it is definitely not the case
that $\nabla P = f$. What is true is that, because the vector $\nabla P - f$ is divergence-free, it is the
curl of some other vector (because div curl (·) = 0); i.e., we have
$\nabla P - f = \nabla \times v = -u$, (3.15-33)
where u is both divergence-free and curl-free; finally, (3.15-24) and (3.15-33) show that
u will satisfy $n \cdot u = u_n$ on $\Gamma_D$, and we are finished; maximizing $I(q)$ solves the potential
flow equations.
Finally, in analogy with the discrete case, we note that for un = 0, the potential is a
least-squares solution to Vq = f, and the potential velocity is an L2-orthogonal projection
of f to the appropriate divergence-free subspace. Again, see Appendix 3.
Digression 3:
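Show that $J_{\min} = I_{\max}$. Again, as in the discrete case, it is worth noting that the solution
of (3.15-25) and (3.15-26) causes these two functionals to 'touch':
$J(v) - I(q) = \tfrac12 \int v \cdot v - \int v \cdot f + \int_{\Gamma_N} P_N\, n \cdot v + \tfrac12 \int (\nabla q - f) \cdot (\nabla q - f) + \int_{\Gamma_D} u_n q$
$\quad = \tfrac12 \int (v + \nabla q - f) \cdot (v + \nabla q - f) - \int v \cdot (\nabla q - f) - \int v \cdot f + \int_{\Gamma_N} P_N\, n \cdot v + \int_{\Gamma_D} u_n q$
$\quad = \tfrac12 \int (v + \nabla q - f)^2 + \int q\, \nabla \cdot v - \int_\Gamma q\, n \cdot v + \int_{\Gamma_N} P_N\, n \cdot v + \int_{\Gamma_D} u_n q$
$\quad = \tfrac12 \int (v + \nabla q - f)^2 + \int q\, \nabla \cdot v$ (3.15-34)
because $q = P_N$ on $\Gamma_N$ and $n \cdot v = u_n$ on $\Gamma_D$. Thus, when v is divergence-free, $J(v) = I(q)$
if and only if $v + \nabla q - f = 0$, and we are finished: the potential flow solution makes
$J_{\min} = I_{\max}$.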
End Digression
3. Find the saddle-point. We now demonstrate that (3.15-27) has a saddle-point at the
potential flow solution:
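$\delta L = \delta J - \int q\, \nabla \cdot \delta v - \int \delta q\, \nabla \cdot v$
$\quad = \delta J - \int_\Gamma q\, n \cdot \delta v + \int \delta v \cdot \nabla q - \int \delta q\, \nabla \cdot v$
$\quad = \int (v - f + \nabla q) \cdot \delta v + \int_{\Gamma_N} P_N\, n \cdot \delta v - \int_{\Gamma_N} P_N\, n \cdot \delta v - \int \delta q\, \nabla \cdot v$, (3.15-35)
and we obtain $\delta L = 0$ if $v + \nabla q = f$ and $\nabla \cdot v = 0$ in $\Omega$ with $n \cdot v = u_n$ on $\Gamma_D$ and $q = P_N$
on $\Gamma_N$; the stationary point of the Lagrangian is obtained at the potential flow solution.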
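To show that it is a saddle-point, we proceed as for the discrete case, by first varying
v about u at the fixed value of the potential, P:
$L(u + \epsilon v, P) = J(u + \epsilon v) - \int P\, \nabla \cdot (u + \epsilon v)$
$\quad = \tfrac12 \int (u + \epsilon v) \cdot (u + \epsilon v) - \int f \cdot (u + \epsilon v) + \int_{\Gamma_N} P_N\, n \cdot (u + \epsilon v) - \epsilon \int P\, \nabla \cdot v$
$\quad = \tfrac12 \int u \cdot u - \int f \cdot u + \int_{\Gamma_N} P_N\, n \cdot u + \epsilon \left[ \int (u \cdot v - f \cdot v - P\, \nabla \cdot v) + \int_{\Gamma_N} P_N\, n \cdot v \right] + \tfrac12 \epsilon^2 \int v \cdot v$, (3.15-36)
giving
$L(u + \epsilon v, P) - L(u, P) = \epsilon \left[ \int v \cdot (u - f + \nabla P) - \int_\Gamma P\, n \cdot v + \int_{\Gamma_N} P_N\, n \cdot v \right] + \tfrac12 \epsilon^2 \int v \cdot v$. (3.15-37)
But $L(u, P) = J(u)$ and $\int_\Gamma P\, n \cdot v = \int_{\Gamma_N} P_N\, n \cdot v$ because $n \cdot v = 0$ on $\Gamma_D$ (here, v is
a variation from u which satisfies $n \cdot u = u_n$ on $\Gamma_D$) to give, noting that $u - f + \nabla P = 0$ from (3.15-21),
$L(u + \epsilon v, P) - J(u) = \tfrac12 \epsilon^2 \int v \cdot v$, (3.15-38)
showing that $L(u, P)$ is a minimum with respect to velocity.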
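To show the 'other half' (that $L(u, P)$ is a maximum with respect to P) we proceed
as follows: pick a q, any q (but fixed: $\delta q = 0$) and seek $\delta L = 0$ for that q. Thus, setting
$\delta q = 0$ in (3.15-35), $v = f - \nabla q$ makes $\delta L = 0$, which result we place back into (3.15-27):
$L(v(q), q) = J(v) - \int q\, \nabla \cdot v$
$\quad = \tfrac12 \int (f - \nabla q) \cdot (f - \nabla q) - \int (f - \nabla q) \cdot f + \int_{\Gamma_N} P_N\, n \cdot (f - \nabla q) - \int q\, \nabla \cdot (f - \nabla q)$. (3.15-39)
Integrating the last term by parts gives $\int q\, \nabla \cdot (f - \nabla q) = \int_\Gamma q\, n \cdot (f - \nabla q) - \int (f - \nabla q) \cdot \nabla q$
and $\int_\Gamma q\, n \cdot (f - \nabla q) = \int_{\Gamma_D} q\, n \cdot v + \int_{\Gamma_N} P_N\, n \cdot (f - \nabla q) = \int_{\Gamma_D} q\, u_n + \int_{\Gamma_N} P_N\, n \cdot (f - \nabla q)$
to yield
$L(v(q), q) = \tfrac12 \int (f - \nabla q) \cdot (f - \nabla q) - \int (f - \nabla q) \cdot f + \int (f - \nabla q) \cdot \nabla q - \int_{\Gamma_D} u_n q$
$\quad = -\tfrac12 \int (\nabla q - f) \cdot (\nabla q - f) - \int_{\Gamma_D} u_n q$
$\quad = I(q)$, (3.15-40)
and we are done because we have already shown that the solution, $q = P$, maximizes the
dual functional, $I(q)$. Thus, we have our saddle-point and the final result that
$L(u, q) \leq L(u, P) \leq L(v, P)$ (3.15-41)
and, of course, $J_{\min} = I_{\max} = L_{\text{saddle-point}}$.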
3.15.5 Continuous Stokes
This completes the potential flow analysis. Now—finally—we come to the subject that
started this whole section: steady Stokes flow. We have followed the chosen path because
the Stokes equations are significantly more difficult with respect to variational principles,
and in fact we shall use the general result that the maximum of the Lagrangian with
respect to P agrees with I(q) to obtain this latter functional. Even at that, the procedure
is much more 'formal'—involving implicit Green's functions and integral operators—and
is (in fact) less useful, in some sense... explaining, or rationalizing, the fact that we will
fall a bit short of our goal.
So we begin by introducing only two functionals—the 'easy ones,' primal and
Lagrangian:
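(i) $J(v) = \tfrac12 \int \nabla v : (\nabla v)^T - \int v \cdot f - \int_{\Gamma_N} v \cdot F$ (3.15-42)
and
(ii) $L(v, q) = J(v) - \int q\, \nabla \cdot v$, (3.15-43)
which are 'related' to the following Stokes problem:
$-\nabla^2 u + \nabla P = f$ and $\nabla \cdot u = 0$ in $\Omega$, (3.15-44)
with
$u = w$ on $\Gamma_D$ and $\partial u/\partial n - nP = F$ on $\Gamma_N$, (3.15-45)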
where here f, w, and F are the data. These are in fact the PDE's and BC's described by
the opening equations of this section, (3.15-1) and (3.15-2).
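1. Minimize $J(v)$ subject to $\nabla \cdot v = 0$ and $v = w$ on $\Gamma_D$. The first variation of (3.15-42)
gives
$\delta J = \int \nabla v : (\nabla \delta v)^T - \int \delta v \cdot f - \int_{\Gamma_N} \delta v \cdot F$. (3.15-46)
Recalling that (or referring to Table 3.1-2),
$\nabla u : (\nabla v)^T = \nabla v : (\nabla u)^T = \nabla \cdot [v \cdot (\nabla u)^T] - v \cdot \nabla^2 u$
gives
$\delta J = \int_\Gamma n \cdot [(\nabla v) \cdot \delta v] - \int \delta v \cdot (\nabla^2 v + f) - \int_{\Gamma_N} \delta v \cdot F$. (3.15-47)
But since $v = w$ on $\Gamma_D$, $\delta v = 0$ there and we get
$\delta J = \int_{\Gamma_N} \delta v \cdot (\partial v/\partial n - F) - \int \delta v \cdot (\nabla^2 v + f)$. (3.15-48)
Utilizing the constraints that $\nabla \cdot v = 0$ in $\Omega$ and $v = w$ on $\Gamma_D$ implies that $\nabla \cdot \delta v = 0$ in
$\Omega$ and $\delta v = 0$ on $\Gamma_D$, which leads to
$0 = \int q\, \nabla \cdot \delta v = \int_{\Gamma_N} q\, n \cdot \delta v - \int \delta v \cdot \nabla q$ (3.15-49)
which, when compared with (3.15-48), shows that $\delta J = 0$ when
$\nabla^2 v + f = \nabla q$ in $\Omega$ and $\partial v/\partial n - F = q n$ on $\Gamma_N$, (3.15-50)
which, with $\nabla \cdot v = 0$ in $\Omega$ and $v = w$ on $\Gamma_D$, shows that the constrained extremum of
$J(v)$ gives the Stokes solution: $v = u$, $q = P$. That it is minimum follows easily from
(3.15-47), the first variation of which gives
$\delta^2 J = \int \nabla \delta v : (\nabla \delta v)^T$, (3.15-51)
which is positive-definite, and thus $J(u)$ is a minimum.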
Remark:
For f = 0 (or conservative, $f = \nabla \lambda$) and $\Gamma_N = \emptyset$, this result is virtually the same as the
Helmholtz dissipation theorem (Serrin, 1959): the solution of the (steady) Stokes equations
minimizes the viscous dissipation.
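2. Find the saddle-point of $L(v, q)$. We now vary both v and q in (3.15-43) while
respecting $v = w$ on $\Gamma_D$ and $\partial v/\partial n - q n = F$ on $\Gamma_N$ (but $\nabla \cdot v \neq 0$ in general):
$\delta L = \delta J - \int \delta q\, \nabla \cdot v - \int q\, \nabla \cdot \delta v$
$\quad = \delta J - \int \delta q\, \nabla \cdot v - \int_{\Gamma_N} q\, n \cdot \delta v + \int \delta v \cdot \nabla q$
$\quad = \int_{\Gamma_N} \delta v \cdot (\partial v/\partial n - F - q n) - \int \delta v \cdot (\nabla^2 v + f - \nabla q) - \int \delta q\, \nabla \cdot v$, (3.15-52)
which has $\delta L = 0$ when $\partial v/\partial n - q n = F$ on $\Gamma_N$ and $-\nabla^2 v + \nabla q = f$ in $\Omega$ and, finally,
when also $\nabla \cdot v = 0$ in $\Omega$; i.e., when v and q solve the Stokes equations: $v = u$, $q = P$.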
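As usual, we now seek to show that $\delta L = 0$ is a minimum with respect to velocity,
which we do (again, as usual) by holding P and varying u via $u + \epsilon v$, where we note
that, while v is not required to be divergence-free, it is required to vanish on $\Gamma_D$ (so that
$u + \epsilon v = w$ there). Thus, we form
$L(u + \epsilon v, P) = J(u + \epsilon v) - \int P\, \nabla \cdot (u + \epsilon v) = J(u + \epsilon v) - \epsilon \int P\, \nabla \cdot v$
$\quad = \tfrac12 \int \nabla (u + \epsilon v) : [\nabla (u + \epsilon v)]^T - \int (u + \epsilon v) \cdot f - \int_{\Gamma_N} (u + \epsilon v) \cdot F - \epsilon \int P\, \nabla \cdot v$
$\quad = \tfrac12 \int \nabla u : (\nabla u)^T - \int u \cdot f - \int_{\Gamma_N} u \cdot F + \epsilon \left[ \int \nabla u : (\nabla v)^T - \int v \cdot f - \int_{\Gamma_N} v \cdot F - \int P\, \nabla \cdot v \right] + \tfrac12 \epsilon^2 \int \nabla v : (\nabla v)^T$
$\quad = J(u) + \epsilon \left\{ \int [\nabla u : (\nabla v)^T - v \cdot f - P\, \nabla \cdot v] - \int_{\Gamma_N} v \cdot F \right\} + \tfrac12 \epsilon^2 \int \nabla v : (\nabla v)^T$. (3.15-53)
But $\int \nabla u : (\nabla v)^T = \int_\Gamma v \cdot \partial u/\partial n - \int v \cdot \nabla^2 u = \int_{\Gamma_N} v \cdot \partial u/\partial n - \int v \cdot \nabla^2 u$ and
$\int P\, \nabla \cdot v = \int_\Gamma P\, n \cdot v - \int v \cdot \nabla P$ (in which only $\Gamma_N$ contributes, because v vanishes on $\Gamma_D$) to give
$L(u + \epsilon v, P) - J(u) = \epsilon \left[ \int_{\Gamma_N} v \cdot (\partial u/\partial n - nP - F) - \int v \cdot (\nabla^2 u + f - \nabla P) \right] + \tfrac12 \epsilon^2 \int \nabla v : (\nabla v)^T$
$\quad = \tfrac12 \epsilon^2 \int \nabla v : (\nabla v)^T > 0$ (3.15-54)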
because u and P satisfy the Stokes equations. Hence, we do have that L(u, P) is a
minimum with respect to velocity.
Now the final, and hardest, part: show that 8L = 0 is a maximum with respect to P.
Again, as usual, pick a q, any q, but fixed ($\delta q = 0$) and place it into $L(v, q)$ and seek
$\delta L = 0$ by finding just the right $v(q)$. Then we will vary q, if possible/lucky. Thus, we
simply omit the $\delta q$-term in (3.15-52) to obtain
$\delta L = \int_{\Gamma_N} \delta v \cdot (\partial v/\partial n - F - q n) - \int \delta v \cdot (\nabla^2 v + f - \nabla q)$, (3.15-55)
and it is clear that $\delta L = 0 \Rightarrow$
$-\nabla^2 v + \nabla q = f$ in $\Omega$ and $\partial v/\partial n - q n = F$ on $\Gamma_N$, (3.15-56)
which, along with $v = w$ on $\Gamma_D$, provides a well-posed problem for $v(q)$. Switching
notation via $\nabla^2 = \Delta$ for convenience, we formally write the solution for $v(q)$ as
$v = \Delta^{-1}(\nabla q - f)$. (3.15-57)
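We shall place this result into (3.15-43) after rearranging the Lagrangian to a more
convenient form, first via integration by parts of the last term:
$\int q\, \nabla \cdot v = \int_\Gamma q\, n \cdot v - \int v \cdot \nabla q = \int_{\Gamma_D} q\, n \cdot w + \int_{\Gamma_N} q\, n \cdot v - \int v \cdot \nabla q$,
and (3.15-43) becomes
$L(v, q) = \tfrac12 \int \nabla v : (\nabla v)^T - \int v \cdot f - \int_{\Gamma_D} q\, n \cdot w - \int_{\Gamma_N} v \cdot (F + q n) + \int v \cdot \nabla q$,
which we rearrange by using $\nabla q - f = \nabla^2 v$ to obtain
$L(v, q) = \tfrac12 \int \nabla v : (\nabla v)^T + \int v \cdot \nabla^2 v - \int_{\Gamma_D} q\, n \cdot w - \int_{\Gamma_N} v \cdot (F + q n)$,
which we further rearrange by (yep, you guessed it) integration by parts; this time
of $\int v \cdot \nabla^2 v$, and utilizing the BC $F + q n = \partial v/\partial n$ on $\Gamma_N$:
$\int v \cdot \nabla^2 v = \int_\Gamma v \cdot \partial v/\partial n - \int \nabla v : (\nabla v)^T = \int_{\Gamma_D} w \cdot \partial v/\partial n + \int_{\Gamma_N} v \cdot (F + q n) - \int \nabla v : (\nabla v)^T$
to give
$L(v, q) = \int_{\Gamma_D} w \cdot (\partial v/\partial n - q n) - \tfrac12 \int \nabla v : (\nabla v)^T$,
and we are 'finished' after 'simply' replacing $v(q)$ from (3.15-57) to obtain [cf.
(3.15-17)]
$L[v(q), q] = -\tfrac12 \int [\nabla \Delta^{-1} (\nabla q - f)] : [\nabla \Delta^{-1} (\nabla q - f)]^T + \int_{\Gamma_D} w \cdot \left[ \dfrac{\partial}{\partial n} \Delta^{-1} (\nabla q - f) - q n \right]$
$\quad = I(q)$, (3.15-58)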
and we leave it to the (talented) reader to verify the rest: the extremum of I(q) exists and
is a maximum. [An alternative, and probably better, functional to be maximized, is given
by Finlayson (1972, p. 2.27) in Kardestuncer and Norrie (1987).]
Thus we conclude our (condensed, yet protracted) odyssey on 'variational fluid
mechanics' —a journey that we hope has been useful to some.
3.16 SOLUTION METHODS FOR THE SEMI-DISCRETIZED
TIME-DEPENDENT (AND STEADY) EQUATIONS
We have now arrived at what is arguably the most important part of the book: solving
the time-dependent incompressible Navier-Stokes equations via the GFEM. There are
those who would argue that these are the only codes that should be written, since one
rarely if ever knows that a stable steady state exists. But there are also those who would
argue—perhaps at least after 'modifying' the equations to include a turbulence model (see
Volume II)—that most flows of interest are turbulent and 'steady,' at least statistically.
Both arguments have merit—and while the bulk of this section is focused upon the (much
more complex) time-dependent case, the steady case will also occasionally be addressed
separately, albeit usually as a special case/subset of the time-dependent case.
Anyway, we now address numerical time integration of the DAE's that comprise the
semi-discretized NS equations, and we begin by pointing out an early and important paper
on the subject with a very cogent title, 'DAE's Are Not ODE's' (Petzold, 1982). So what
are they and why do they need a special name? They are ODE's subjected to algebraic
constraints, and they need a special name because they have, in the last 15 years or so,
spawned a new, and growing, branch of applied mathematics that attempts to properly
account for the often-significant additional difficulties, both theoretical and applied, when
ODE's are constrained. We are fortunate not to need more than the tip of the iceberg
in this field because our DAE's are of a special class (semi-explicit) and the 'index' (a
measure of the 'degree of difficulty,' defined below) is not too high. Suffice it to say in
this introduction that 'simple' ODE-thinking can lead to trouble, and we thus introduce
the reader (or some readers) to the proper approach to the problem.
After introducing some of the concepts and ideas behind DAE's, we shall show how to
apply several ODE methods to both types of DAE's that comprise the semi-discrete NS
equations; namely, those involving the primitive variables, (3.13-28) and (3.13-29), and
those involving the PPE—(3.13-28) and (3.13-242). The first and foremost new concept
is the notion of the index of the DAE system:
The minimum number of times that all or part of the constraints of a DAE system must
be differentiated with respect to t in order to obtain an ODE system in the original
variables is called the index of the DAE's. 'The index is a measure of the singularity
of a system' (Petzold and Lötstedt, 1986). And, '... the more singular a DAE system
is, the more difficult it is to solve numerically' (Hindmarsh and Petzold, 1988).
Note that we have already performed one constraint differentiation, (3.13-240), to
obtain the PPE—another algebraic constraint equation [(3.13-29) was the first constraint
equation]. If we perform one more, this time on the PPE itself, (3.13-242), to obtain
$(C^T M^{-1} C)\, \dot{P} = C^T M^{-1} \dfrac{d}{dt} [f - Ku - N(u)u] - \ddot{g}$, (3.16-1)
we have a system of ODE's; namely, (3.13-28) and (3.16-1), and we have thus discovered
the indices: since it required two differentiations of the algebraic constraint equations to
arrive at our (index 0) ODE system in u and P, the index of the primitive variable DAE's is
two. Similarly it follows that the index of the PPE formulation, (3.13-28) and (3.13-242),
is one. We remark that (3.16-1) has no other use than 'index-determination.'
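The index count can be laid out as a chain of differentiations. The sketch below assumes that (3.13-28) has the form $M\dot{u} + Ku + N(u)u + CP = f$ (consistent with the terms that Section 3.15.2 says to omit from it) and that M and C are constant in time; the identification of the intermediate lines with (3.13-240) and (3.13-242) is our reading of the cross-references, not a quotation.

```latex
\begin{align*}
\text{constraint (index 2):}\quad
  & C^T u = g \\
\text{differentiate once:}\quad
  & C^T \dot{u} = \dot{g} \\
\text{insert } \dot{u} = M^{-1}[f - Ku - N(u)u - CP]:\quad
  & (C^T M^{-1} C)\,P = C^T M^{-1}[f - Ku - N(u)u] - \dot{g}
  && \text{(PPE, index 1)} \\
\text{differentiate again:}\quad
  & (C^T M^{-1} C)\,\dot{P} = C^T M^{-1}\frac{d}{dt}[f - Ku - N(u)u] - \ddot{g}
  && \text{(ODE, index 0)}
\end{align*}
```

Each differentiation lowers the index by one, so the primitive-variable system is two steps, and the PPE system one step, away from an ODE system in both u and P.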
Next, we paraphrase a few important remarks from the recent text on the subject by
Brenan et al. (1996):
1. The higher the index, the more difficult is the DAE system to solve. (This statement
refers, in particular, to ODE methods in which automatic error control of the non-linear
DAE's is desired.)
2. DAE's with index two or more are called 'higher index' systems.
3. The solution of higher index systems can involve derivatives of order k − 1 of the
forcing function (k is the index). ['Since numerical differentiation is notoriously ill-conditioned
(sensitive to small errors), difficulties for numerical ODE methods can be
expected' (Hindmarsh and Petzold, 1988).]
4. Not all initial conditions admit a smooth solution if $k \geq 1$. (Consistent IC's generally
lead to smooth solutions.)
5. Higher-order DAE's can have hidden algebraic constraints.
6. Lowered-index DAE's (derived by differentiation) have more solutions than the original
DAE's. Only some of these solutions are solutions of the original DAE's.
7. Often the most difficult part of solving a DAE system in applications is to determine a
consistent set of initial conditions. Fortunately, this problem need not plague us because
we know how to determine consistent IC's; details later. (It has, however, plagued others
in the past—with the most common offenders being 'impulsive starts.')
8. This one is our own: we presume that the solution 'most desired' is that satisfying
the original DAE system—that of highest index. [This is because the implication is
'one-way'; solutions of the highest index system will always satisfy all (derived) lower index
systems.]
It may be helpful to present a simple, but meaningful example (because generalization
is possible) of one of the difficulties that can occur with DAE's. It has to do with local
truncation error estimates, and the lesson to be learned is this: 'ordinary' error estimates,
from ODE theory, may be wrong. The following example is from Petzold (1982):
$$ \dot{y}_2 = y_1, $$ (3.16-2)
$$ y_2 = g(t), $$ (3.16-3)
an innocuous-looking index 2 system in two degrees of freedom. To prove that the index
is 2, differentiate the constraint to get $\dot{y}_2 = \dot{g}$, which, when inserted into (3.16-2), gives
the 'hidden' constraint $y_1 = \dot{g}$, which, when differentiated once more, yields the pair
of ODE's for $y_1$ and $y_2$: $\dot{y}_2 = y_1$ and $\dot{y}_1 = \ddot{g}$. Hence, index 2. As an 'aside,' note that
the general solution to these ODE's is $y_1 = y_1(0) + \dot{g}(t) - \dot{g}(0)$ and
$y_2 = y_2(0) + g(t) - g(0) + t[y_1(0) - \dot{g}(0)]$, whereas that of the original (index 2) DAE's is simply $y_1 = \dot{g}(t)$
and $y_2 = g(t)$; also, that of the intermediate, index 1, system is $y_1 = \dot{g}(t)$ and
$y_2 = y_2(0) + g(t) - g(0)$. The index 1 solution (the solution of the index 1 system) is only correct if
the particular IC $y_2(0) = g(0)$ is prescribed, and the index 0 solution (the solution of the
index 0, ODE, system) is only correct if, in addition, the particular IC $y_1(0) = \dot{g}(0)$ is
prescribed, showing clearly the existence of extraneous solutions from the derived lower
index problems.
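The three solution families above can be compared numerically. The following is a minimal sketch only; the choice $g(t) = \sin t$ and all function names are assumptions made for illustration and do not come from the text.

```python
import math

# Assumed forcing for illustration only: g(t) = sin(t), so g'(t) = cos(t).
def g(t):
    return math.sin(t)

def gdot(t):
    return math.cos(t)

def index2_solution(t):
    """Original DAE (3.16-2), (3.16-3): no free initial data."""
    return gdot(t), g(t)

def index1_solution(t, y2_0):
    """Intermediate system: y2' = y1 with y1 = g'(t)."""
    return gdot(t), y2_0 + g(t) - g(0.0)

def index0_solution(t, y1_0, y2_0):
    """ODE system: y2' = y1, y1' = g''(t)."""
    y1 = y1_0 + gdot(t) - gdot(0.0)
    y2 = y2_0 + g(t) - g(0.0) + t * (y1_0 - gdot(0.0))
    return y1, y2

if __name__ == "__main__":
    t = 2.0
    print(index2_solution(t))
    print(index0_solution(t, gdot(0.0), g(0.0)))   # consistent ICs
    print(index0_solution(t, 1.5, 0.0))            # inconsistent y1(0)
```

With consistent initial data the index 0 solution reproduces the DAE solution; with an inconsistent $y_1(0)$ the second component drifts away linearly in $t$, which is the extraneous solution described above.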
Let us consider the simplest and most 'robust' time integrator, backward Euler. As we
learned in the previous chapter, the local truncation error for BE (on ODE's) is $O(\Delta t^2)$,
a result that was used to generate a variable-step method based on local error control.
Does this estimate carry over to (index 2) DAE's? To help answer this, let us compute
the exact local truncation error for the above example. BE, starting at $t_n$ (with $y_{2,n} = g_n$), gives
$$ y_{1,n+1} = \frac{y_{2,n+1} - y_{2,n}}{\Delta t} = \frac{g_{n+1} - g_n}{\Delta t}, $$ (3.16-4)
$$ y_{2,n+1} = g(t_{n+1}) = g_{n+1}, $$ (3.16-5)
whereas the exact solution has $y_1(t_{n+1}) = \dot{g}(t_{n+1}) = \dot{g}_{n+1}$ and $y_2(t_{n+1}) = g_{n+1}$. Thus, the
error is zero in the second component; but in the first, it is
$$ e_{1,n+1} = y_{1,n+1} - y_1(t_{n+1}) = \frac{g_{n+1} - g_n}{\Delta t} - \dot{g}_{n+1} = \frac{\Delta t\,\dot{g}_{n+1} - (\Delta t^2/2)\,\ddot{g}(\xi)}{\Delta t} - \dot{g}_{n+1} = -\frac{\Delta t}{2}\,\ddot{g}(\xi), $$ (3.16-6)
where $t_n \le \xi \le t_{n+1}$. Thus, the surprising result is that one order of accuracy (for $y_1$) has
been lost by applying BE to the simplest index 2 problem.
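The loss of one order can be observed directly. The sketch below evaluates the one-step error of (3.16-4) under the assumed choice $g(t) = \sin t$ (the function name and test values are illustrative); halving $\Delta t$ should roughly halve the error, rather than quarter it as an $O(\Delta t^2)$ local error would.

```python
import math

def be_error_in_y1(t_n, dt, g=math.sin, gdot=math.cos):
    """Error of backward Euler in y1 after one step from exact data at t_n."""
    y1_be = (g(t_n + dt) - g(t_n)) / dt      # (3.16-4)
    return y1_be - gdot(t_n + dt)            # subtract exact y1 = g'(t_{n+1})

if __name__ == "__main__":
    for dt in (0.1, 0.05, 0.025):
        print(dt, be_error_in_y1(1.0, dt))
```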
'The situation... is obviously enough to wreak havoc with any step size selection
algorithm which assumes that errors are $O(\Delta t^{k+1})$, where $k$ is the order of the
method' (Petzold, 1982). If all local errors were this bad, and if they accumulated, then
the global error would be $O(1)$ in $\Delta t$! Fortunately, things are not quite that bad; in this
example, $y_1$ is the algebraic variable (and $y_2$ is the ODE variable), and the general theory
for BE on index 2 DAE's (Brenan et al., 1996) states that these larger local errors in the
algebraic variable do not accumulate, with the result that $y_1$ above is first-order accurate
both locally and globally. This will extend to our Navier-Stokes index 2 DAE's in the
obvious way: BE will be (globally) first-order accurate for both $u$ and $P$ even though
the local truncation errors are $O(\Delta t^2)$ and (generally) $O(\Delta t)$, respectively. The 'general'
advice that comes out of DAE theory, for our purposes (our DAE's) is this, roughly (but
not uniformly for all DAE's): base your error estimates and timestep control strategies
on the differential variable. The general 'lesson,' which is also true for our index 2 NS
DAE's, is this: when g(t) is time-dependent, the numerical solution will be more difficult
(and often less accurate—in pressure) because the integration process actually involves
an implied numerical differentiation of $g(t)$. [The index 1 problem, in contrast, is easier in
principle because it involves a numerical integration of $\dot{g}(t)$, a statement that presumes
both $g(t)$ and $\dot{g}(t)$ are given continuous functions of $t$.]
Another intuitively useful concept when 'thinking about' DAE's is that they can be
thought of as stiff differential equations in the limit of infinite stiffness. Suppose, for
example, that we changed the algebraic constraint equations, $C^T u = g$, to a set of
differential equations via $\tau\dot{P} + C^T u = g$, which we could integrate simultaneously with the
momentum equations. This technique was introduced by Chorin (1967b) to efficiently
'time-march' to steady solutions, and has been recently utilized by Kwak et al. (1986).
If we then let $\tau \to 0$, we approach infinite stiffness, i.e., DAE's. Thus, it is no accident that a
very recent book on the subject (Hairer and Wanner, 1991) bears the telling title, Solving
Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems.
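The same idea can be tried on the two-variable example (3.16-2), (3.16-3). In the sketch below, which is an illustration rather than the method of the text, the constraint $y_2 = g$ is relaxed to $\tau\dot{y}_1 = g - y_2$ (the analogue of $\tau\dot{P} + C^T u = g$) and one backward Euler step is taken; $g(t) = \sin t$, the step size, and the function names are assumptions. As $\tau \to 0$ the relaxed step approaches the backward Euler step on the DAE itself.

```python
import math

def be_dae_step(y2, t_next, dt, g=math.sin):
    """Backward Euler on the index 2 pair y2' = y1, y2 = g(t)."""
    y2_new = g(t_next)
    y1_new = (y2_new - y2) / dt
    return y1_new, y2_new

def be_relaxed_step(y1, y2, t_next, dt, tau, g=math.sin):
    """Backward Euler on y2' = y1, tau*y1' = g(t) - y2 (constraint relaxed)."""
    # Eliminate y2_new = y2 + dt*y1_new from the relaxed constraint equation.
    y1_new = (g(t_next) - y2 + tau * y1 / dt) / (tau / dt + dt)
    y2_new = y2 + dt * y1_new
    return y1_new, y2_new

if __name__ == "__main__":
    dt = 0.1
    print("DAE limit:", be_dae_step(0.0, dt, dt))
    for tau in (1e-2, 1e-4, 1e-8):
        print(tau, be_relaxed_step(1.0, 0.0, dt, dt, tau))
```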
Remark 6 above should not be too surprising in light of what has already been discussed
in Section 3.9.2 regarding extraneous solutions of the PDE's when the PPE is used to
replace $\nabla \cdot u = 0$, a discussion we now revisit by asking, 'When will the index 1
solution (PPE) agree with that from the index 2 (primitive equation) formulation?' The answer,
prior to introducing time integration error, is, just as it was for the continuum: when
the same solvability conditions that apply to the index 2 formulation are satisfied; here
$C^T u_0 = g_0$ and (when $n \cdot u$ is specified on all of $\Gamma$) $P_H^T g(t) = 0$ for $t \ge 0$, where $P_H$ is the
hydrostatic pressure mode. But there are many additional solutions to the PPE formulation
that are not solutions of the index 2 formulation, even for the time-continuous
'theoretical' solutions; e.g., if $g(t) = g_0$ is time-independent and $C^T u_0 \ne g_0$, the PPE formulation
has no solvability constraints so that a solution always exists, but that solution will not
be a solution to the original (index 2) DAE's, which are ill-posed and have no solution.
From (3.13-240), which is easily seen to be implied by (3.13-241) and (3.13-242), it is
easy to see that the PPE solution, rather than satisfying the proper mass conservation
equation, $C^T u = g(t)$, will instead satisfy
$$ C^T u(t) = g(t) + C^T u_0 - g_0; $$ (3.16-7)
any initial divergence error will linger forever.
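A small experiment makes (3.16-7) concrete. The sketch below is a toy model, not a Navier-Stokes discretization: two velocity unknowns, $D = [1\ 1]$, $G = D^T$ (so $L = DG = 2$), a made-up linear $f(u)$, and $g \equiv 0$; forward Euler is chosen only for brevity. All of these are assumptions for illustration.

```python
def f(u):
    # Hypothetical linear right-hand side standing in for f(u).
    return [1.0 - u[0], -2.0 * u[1]]

def div(u):
    # D = [1 1]: the discrete "divergence" of the two-component toy velocity.
    return u[0] + u[1]

def ppe_forward_euler(u0, dt, n_steps, gdot=0.0):
    """Forward Euler on the index 1 (PPE) form with G = D^T, so L = DG = 2."""
    u = list(u0)
    for _ in range(n_steps):
        fu = f(u)
        P = (div(fu) - gdot) / 2.0                       # L P = D f - g'
        u = [u[0] + dt * (fu[0] - P), u[1] + dt * (fu[1] - P)]
    return u

if __name__ == "__main__":
    u = ppe_forward_euler([0.3, 0.0], dt=0.01, n_steps=500)
    print(div(u))   # stays at the initial value 0.3, although g = 0
```

Starting instead from a $u_0$ with $Du_0 = 0$ keeps the divergence at zero, which is the only case in which the PPE solution also solves the index 2 system.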
The PPE does, however, sometimes have a solvability constraint; namely, when (and
only when, in the absence of spurious pressure modes, which we assume herein) the
hydrostatic pressure mode exists (i.e., only when $u \cdot n$ is prescribed on all of $\Gamma$), the
scalar product of the PPE, (3.13-242), with $P_H$ yields, using the fact that $P_H^T C^T = 0$,
$$ P_H^T \dot{g}(t) = 0, $$ (3.16-8)
which is the time derivative of the constraint given earlier for the index 2 formulation,
(3.13-31), and corresponds to the time derivative of (3.10-13) in the continuum; see the
discussion on 'solvability' following Remark (9) after (3.10-13). So, only if the normal
velocity is specified on all of $\Gamma$ and is time-varying on some portion of $\Gamma$ does the PPE
method 'retain' the original solvability constraint related to global mass conservation.
Then, if $\sum_i \dot{g}_i(0) \ne 0$, the index 1 system is also ill-posed (even if $C^T u_0 = g_0$, actually)
and has no solution because the PPE for $P$ at $t = 0$ is ill-posed. Except for cases where
$g(t)$ is time-dependent and $P_H^T \dot{g}(t) \ne 0$, the violation of $C^T u_0 = g_0$ is also depicted in
Figure 3.10-1 of Section 3.10, to which we now return. In Figure 3.10-1, the union of the
two larger ellipses is now described by $C^T u_0 \ne g_0$, and the horizontal ellipse describes
$P_H^T g_0 \ne 0$, where $\dot{g}(t) = 0$. ($C^T u = g \Rightarrow P_H^T g = 0$ and thus $P_H^T g \ne 0 \Rightarrow C^T u \ne g$; but
$C^T u \ne g$ does not $\Rightarrow P_H^T g \ne 0$.) In Figure 3.10-2, as in Figure 3.10-1, all six cases
correspond to $C^T u_0 \ne g_0$; here, though, if we generalize our interpretation and allow $g(t)$ to
be time-dependent and interpret Cases 4-6 as a violation of $P_H^T \dot{g}(t) = 0$, then these three
cases are also ill-posed via the PPE formulation. (Recall that two procedures for fixing
this problem were presented in Section 3.13.1g.) Finally, now the vectors in both figures
might make more sense intuitively since each vector could correspond to a velocity at just
one node. One final remark on solvability: if inflow = outflow in (special) Case 2, this
case solves the index 1 DAE's (but not those of index 2) even when the normal velocity
on $\Gamma$ is time-varying.
We close this introductory discussion with a quotation from two leading general-
purpose code developers before we present a few methods: 'The development of codes
for DAE's is not a straightforward task because of the difficulties in the computation
arising from the singular part of the system and the coupling to the differential part,
which do not occur for ODE systems. In particular, starting, error estimation, and solving
the nonlinear system all present difficulties—even for index 1 systems.' — (Hindmarsh
and Petzold, 1988).
3.16.1 Some Time-Integration Methods for the DAE's
In this section we shall show how to apply several of the ODE methods discussed in the
previous chapter to both index 2 and index 1 formulations. In Volume II, we shall discuss
how to solve the resulting equations for a few of the methods. Before this, however,
we will first form another index 0 formulation (ODE's), one that is useful only in that
it shows how to properly integrate the index 1 DAE's; it is not useful for 'generating'
code. But before we do even this, we introduce a new, highly condensed notation, to save
'ink,' because we will be rewriting these DAE's many times. Thus, we rewrite the index
2 and index 1 formulations, respectively, in the following two ways:
(i) $$ \dot{u} + GP = f(u), $$ (3.16-9)
$$ Du = g(t) $$ (3.16-10)
and, by differentiation of the constraint,
(ii) $$ \dot{u} + GP = f(u), $$ (3.16-11)
$$ LP = Df(u) - \dot{g}(t), $$ (3.16-12)
where (obviously) $G = M^{-1}C$, $f(u) = M^{-1}[f - Ku - N(u)u]$ (and we apologize if the
dual use of $f$ causes any confusion), $D = C^T$, and $L = DG = L^T$ is the Laplacian. It is
noteworthy that many simple FDM formulations already display this concise description,
in which the new terms really are simple because $M = I$. But it is also important to
remember that we are dealing with $M \ne I$ and, in the general case, $M^{-1}$ is dense so
that these new definitions are more 'formal.' Also slightly noteworthy is that many FDM
formulations do not generate a symmetric $L$. Finally, we point out that the stability of the
DAE's can only be assured if the advection matrix is skew-symmetric.
Some remarks on initial conditions: whereas (3.16-9) and (3.16-10) are well-posed
given any initial velocity, $u_0$, satisfying $Du_0 = g_0$ (and not otherwise), (3.16-11) and
(3.16-12) are well-posed for any $u_0$. Thus, index 1 solutions are actually wrong/spurious
if $Du_0 \ne g_0$. Also the proper/appropriate/consistent initial pressure, $P_0$, cannot be chosen
arbitrarily; it must be derived, and this can be done in either of two equivalent ways:
(i) solve (3.16-9) and the time-derivative of (3.16-10) at $t = 0$ for $\dot{u}_0$ and $P_0$; (ii) solve
(3.16-12) at $t = 0$ for $P_0$.
Now we find our alternate index 0 (ODE) formulation by solving (3.16-12) for $P$ and
placing the result in (3.16-11) to get
$$ \dot{u} = (I - GL^{-1}D)f(u) + GL^{-1}\dot{g} = p f(u) + GL^{-1}\dot{g} = p f(u) + \dot{v}, $$ (3.16-13)
where
$$ v(t) = GL^{-1} g(t) $$ (3.16-14)
(so that $\dot{v} = GL^{-1}\dot{g}$) and $p = I - GL^{-1}D$ is a projection matrix/operator. See Appendix 3. Note that these
ODE's $\Rightarrow D\dot{u} = \dot{g}$, since $Dp = 0$, $DG = L$ and $D\dot{v} = \dot{g}$. These ODE's are not in the
original variables (P is gone), and (3.16-13) is also 'formal' in that it is not a useful
representation from which to launch code writing. But it is, or can be, useful as a
canonical approach for the application of ODE methods to DAE's (A.C. Hindmarsh, personal
communication); it is also useful when performing theoretical analysis, as we shall soon
see. In fact, however, if (3.16-13) was construed as a legitimate index-determining ODE
system, then the erroneous conclusion that the index of (3.16-9) and (3.16-10) is 1 would
obtain.
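The two properties of the projector used above, $Dp = 0$ and $p^2 = p$, are easy to confirm on a small case. The sketch below assumes a single constraint row $D$ and $G = D^T$ (i.e., $M = I$), so that $L = DG$ is a scalar; the particular $D$ and the helper names are illustrative only.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def projector(D):
    """p = I - G L^{-1} D for a single constraint row D and G = D^T (M = I)."""
    n = len(D)
    L = sum(d * d for d in D)                 # L = D G is a scalar here
    return [[(1.0 if i == j else 0.0) - D[i] * D[j] / L for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    D = [1.0, 2.0, -1.0]
    p = projector(D)
    print(matmul([D], p))        # D p: a row of zeros (to roundoff)
    pp = matmul(p, p)
    print(max(abs(pp[i][j] - p[i][j]) for i in range(3) for j in range(3)))
```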
Remarks:
(1) Here and hereafter, unless otherwise specified, we presume, when writing $\dot{g}_n = \dot{g}(t_n)$,
that we have a given continuous function of time, just as we presume for $g_n = g(t_n)$.
(2) The penalty method (see Section 3.13.2e), $P = \lambda(Du - g)$ for $\lambda$ 'large,' generates
a more useful (albeit quite stiff) index 0 system, $\dot{u} + \lambda GDu = f(u) + \lambda Gg$, that we
shall return to later (at the end of Section 3.16.4).
(3) Here and hereafter we presume that spurious pressure modes are either absent or
have been properly accommodated (see Section 3.13.2) so that our matrices are
either non-singular or, at worst, our problems are 'consistent singular.'
(4) If the IC for the NSE's is the most general possible, there will be an ephemeral
vortex sheet on that portion of $\Gamma$ on which the tangential velocity is specified. If
this sheet is important, then a fine spatial mesh near $\Gamma$ will be required as well as
small timesteps at small time, regardless of the time integration method selected.
a. Primitive equations/index 2
We shall start at the 'top'—index 2. We shall present two explicit and four implicit
methods, and then argue that, for GFEM at least, only implicit methods make much sense
for this formulation. [What we call 'explicit' some (more careful) authors—e.g., Hairer
et al. (1989), call 'half explicit'—because P is always implicit.]
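1. Forward Euler. Applied to (3.16-9), (3.16-10), this simplest of methods goes like so:
given $u_n$ with $Du_n = g_n$, solve
$$ \frac{u_{n+1} - u_n}{\Delta t} + GP_n = f_n \equiv f(u_n) $$ (3.16-15)
and
$$ Du_{n+1} = g_{n+1} $$ (3.16-16)
for $u_{n+1}$ and $P_n$. As a coupled linear system, this looks like
$$ \begin{pmatrix} I & \Delta t\,G \\ D & 0 \end{pmatrix} \begin{pmatrix} u_{n+1} \\ P_n \end{pmatrix} = \begin{pmatrix} u_n + \Delta t\,f_n \\ g_{n+1} \end{pmatrix}, $$ (3.16-17)
which we immediately 'expand,' using our previous definitions, to
$$ \begin{pmatrix} M & \Delta t\,C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} u_{n+1} \\ P_n \end{pmatrix} = \begin{pmatrix} Mu_n + \Delta t\,[f_n - Ku_n - N(u_n)u_n] \\ g_{n+1} \end{pmatrix} $$ (3.16-18)
(with a different $f_n$, of course), which leads to the following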
Remarks:
(1) The time 'index' on $P$ is perfectly proper; it should not be $P_{n+1}$ as many believe, a
statement we shall later prove.
(2) If the mass is lumped (or if FDM is used), then the system in (3.16-17) easily
uncouples to the following less expensive sequence [solve for $u_{n+1}$ in (3.16-15) and
put the result in (3.16-16)]:
(i) solve $LP_n = Df_n - (g_{n+1} - g_n)/\Delta t$;
(ii) compute $u_{n+1} = u_n + \Delta t(f_n - GP_n)$,
which turns out to be exactly the same as applying forward Euler to the index 1
DAE's after approximating $\dot{g}$ by $(g_{n+1} - g_n)/\Delta t$, as we later show. The above PPE
is an example of the 'hidden constraint equation' with higher-order DAE's, referred
to in the introduction, and of the implied numerical differentiation of the constraint
in the index 2 solution, a fact that will be seen to sometimes adversely affect the
local accuracy of the pressure.
(3) If the mass is not lumped (GFEM), then (3.16-17) does not (easily) uncouple, and
we have an expensive explicit method—and one that is not recommended.
(4) If $P_H$ exists ($n \cdot u$ specified on all of $\Gamma$), then (3.16-15) and (3.16-16) are solvable if
and only if $P_H^T g_{n+1} = 0$, $n = 0, 1, \ldots$, and the resulting matrix singularity can then
be removed by specifying any one of the elements of the pressure vector at any
value. This remark applies to all time-integration methods.
(5) If $Du_0 \ne g_0$, then $P_0$ will behave like $O[(Du_0 - g_0)/\Delta t]$ and $u_1 - u_0 = O(1)$ in $\Delta t$
with $Du_1 = g_1$; this is an example of IC's that give non-smooth solutions because the
problem is ill-posed. In fact, $LP_0 = Df_0 + (Du_0 - g_1)/\Delta t = Df_0 - \dot{g}_0 + (Du_0 - g_0)/\Delta t + O(\Delta t)$
and $u_1 = p(u_0 + \Delta t f_0) + v_1 = pu_0 + v_1 + O(\Delta t)$. In fact, $\Delta t P_0$
approximates [to $O(\Delta t)$] the Lagrange multiplier of the $L_2$-projection discussed in
Sections 3.10.2 and 3.13.1d, and $u_1 - v_1$ is clearly $O(\Delta t)$ away from the
corresponding projected velocity.
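The uncoupled sequence of Remark (2) and the $1/\Delta t$ pressure growth of Remark (5) can both be seen on a toy problem. The sketch below is illustrative only: it assumes two velocity unknowns, $D = [1\ 1]$, $G = D^T$ ($L = 2$), a made-up linear $f(u)$, and $g(t) = \sin t$. To allow an inconsistent $u_0$, step (i) is written with $Du_n$ in place of $g_n$; the two coincide whenever $Du_n = g_n$.

```python
import math

def f(u):
    # Hypothetical linear right-hand side standing in for f(u).
    return [1.0 - u[0], -2.0 * u[1]]

def div(u):
    return u[0] + u[1]            # D = [1 1], G = D^T, L = DG = 2

def fe_index2(u0, dt, n_steps, g=math.sin):
    """Forward Euler on (3.16-9), (3.16-10) via the uncoupled sequence."""
    u, pressures = list(u0), []
    for n in range(n_steps):
        fu = f(u)
        # Step (i): L P_n = D f_n - (g_{n+1} - D u_n)/dt; D u_n is used in
        # place of g_n so that an inconsistent u0 can also be tried.
        P = (div(fu) - (g((n + 1) * dt) - div(u)) / dt) / 2.0
        # Step (ii): u_{n+1} = u_n + dt (f_n - G P_n)
        u = [u[0] + dt * (fu[0] - P), u[1] + dt * (fu[1] - P)]
        pressures.append(P)
    return u, pressures

if __name__ == "__main__":
    u, _ = fe_index2([0.0, 0.0], 0.01, 50)
    print(div(u) - math.sin(50 * 0.01))          # constraint residual
    for dt in (0.01, 0.005):
        _, P = fe_index2([0.3, 0.0], dt, 1)       # D u0 = 0.3 != g(0) = 0
        print(dt, P[0])                           # grows like 1/dt
```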
2. Second-order Adams-Bashforth. AB2 applied to (3.16-9) and (3.16-10) gives, given
$u_n$ with $Du_n = g_n$, $u_{n-1}$ with $Du_{n-1} = g_{n-1}$, and $P_{n-1}$,
$$ \frac{u_{n+1} - u_n}{\Delta t} = \frac{1}{2}\left[3(f_n - GP_n) - (f_{n-1} - GP_{n-1})\right] $$ (3.16-19)
and
$$ Du_{n+1} = g_{n+1}, \quad \text{for } n = 1, 2, \ldots, $$ (3.16-20)
which again are to be solved for $u_{n+1}$ and $P_n$ and, like forward Euler, is only a
feasible method for FDM or lumped mass FEM, because the equations then uncouple to
permit the sequential solution steps (after inserting $u_{n+1}$ from (3.16-19) into (3.16-20)):
(i) compute $P_n$ from the implied PPE, $L[(3P_n - P_{n-1})/2] = D[(3f_n - f_{n-1})/2] - (g_{n+1} - g_n)/\Delta t$;
(ii) compute $u_{n+1}$ from (3.16-19). Consistent mass 'demands' implicit
methods for these DAE's even more than it did in the case of the scalar transport equation
of the previous chapter. Finally, we mention that:
(i) Start-up is, as for ODE's, 'special'; typically forward Euler would be used to take
the first (and smaller) step.
(ii) In order to see what approximation to $\dot{g}$ is implied by AB2, simply insert
$Df(u) - LP = \dot{g}$ from (3.16-12) into the above PPE to obtain the implied differentiation of
the constraint,
$$ \dot{g}_n = \frac{1}{3}\left[\dot{g}_{n-1} + 2\,\frac{g_{n+1} - g_n}{\Delta t}\right] \approx \left.\frac{dg}{dt}\right|_{t_n}, $$
which rearranges to $g_{n+1} = g_n + (\Delta t/2)(3\dot{g}_n - \dot{g}_{n-1})$, the AB2 integration of a
system of ODE's for $g$. This observation will later be seen to be quite 'general'
and quite significant.
3. Backward Euler. The simplest implicit method gives
$$ \frac{u_{n+1} - u_n}{\Delta t} + GP_{n+1} = f_{n+1} $$ (3.16-21)
and
$$ Du_{n+1} = g_{n+1}, $$ (3.16-22)
a fully coupled non-linear system for $(u_{n+1}, P_{n+1})$.
Remarks:
(1) The advanced time-level index on both $u$ and $P$ is related to the fact that BE cares
not whether $u_0$ is discretely divergence-free; it thus also has no need for the initial
pressure, $P_0$. Thus, this most robust of all implicit schemes can, like the less robust,
simplest explicit scheme, even solve ill-posed index 2 DAE's, and is related to the
introductory remark that 'Not all IC's admit a smooth solution'; a la Remark (5)
under FE.
(2) The discrete PPE satisfied (implied) by (3.16-21) and (3.16-22) is
$LP_{n+1} = Df_{n+1} - (g_{n+1} - Du_n)/\Delta t$, which, if $Du_0 = g_0$, becomes $LP_{n+1} = Df_{n+1} - (g_{n+1} - g_n)/\Delta t$,
which approximates (3.16-12) and which, if $Du_0 \ne g_0$, gives, using
$g_{n+1} = g_n + \Delta t\,\dot{g}_n + O(\Delta t^2)$ at $n = 0$, $LP_1 = Df_1 - \dot{g}_0 - (g_0 - Du_0)/\Delta t + O(\Delta t)$, like FE on
$P_0$; i.e., $P_1 \to \infty$ as $\Delta t \to 0$. Another consequence of this ill-posedness is (again)
that $u_1 - u_0$ is $O(1)$ in $\Delta t$ (not smooth) rather than $O(\Delta t)$ as when well-posed;
$u_1 = pu_0 + v_1 + O(\Delta t)$.
(3) It turns out (see Gresho (1991b) and Section 3.16.1d) that $u_1$ is, for $Du_0 \ne g_0$ and
$\Delta t \to 0$, an $L_2$-projection of $u_0$ onto the discretely divergence-free subspace, and
$\Delta t P_1$ is the associated Lagrange multiplier, a statement that we have already applied
to FE with $\Delta t P_1$ replaced by $\Delta t P_0$; BE (or FE) applied to an ill-posed problem
'changes the data' so that after one timestep, it becomes well-posed, at least for
those cases violating $Du_0 = g_0$ but not $P_H^T g_0 = 0$ (problems violating $P_H^T g_0 = 0$
when $P_H$ exists are ill-posed because of bad BC's, not bad IC's).
(4) As for ODE's, setting $\Delta t = \infty$ recovers the steady equations; i.e., BE can, in
principle, obtain the (or a) steady state in one step, assuming one exists, which solution
may not be (temporally) stable.
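Remarks (1) to (3) can be illustrated on a toy problem. The sketch below is not a Navier-Stokes solver: it assumes two velocity unknowns, $D = [1\ 1]$, $G = D^T$, $g(t) = \sin t$, and a made-up linear $f(u)_i = S_i - A_i u_i$, so that the coupled BE system can be solved in closed form. Starting from $Du_0 = 0.3 \ne g_0$, the step needs no $P_0$, returns $Du_1 = g_1$, and for small $\Delta t$ gives $\Delta t P_1 \approx (Du_0 - g_1)/L = 0.15$ with $u_1 \approx (0.15, -0.15)$, the $L_2$-projection of $u_0$.

```python
import math

A = (1.0, 2.0)      # hypothetical linear model: f(u)_i = S_i - A_i * u_i
S = (1.0, 0.0)

def be_index2_step(u, t_next, dt, g=math.sin):
    """One backward Euler step of (3.16-21), (3.16-22) with D = [1 1], G = D^T."""
    # Row i of (3.16-21): (1/dt + A_i) u_i' = u_i/dt + S_i - P.
    c = [1.0 / (1.0 / dt + a) for a in A]
    r = [u[i] / dt + S[i] for i in range(2)]
    # Insert u_i' = c_i (r_i - P) into D u' = g_{n+1} and solve for P_{n+1}.
    P = (c[0] * r[0] + c[1] * r[1] - g(t_next)) / (c[0] + c[1])
    u_new = [c[i] * (r[i] - P) for i in range(2)]
    return u_new, P

if __name__ == "__main__":
    for dt in (1e-2, 1e-4, 1e-6):
        u1, P1 = be_index2_step([0.3, 0.0], dt, dt)   # D u0 = 0.3 != g(0)
        print(dt, u1, dt * P1)
```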
4. Trapezoid rule. A single step up from BE gives the next member of the Adams family:
given $u_n$ with $Du_n = g_n$ and $P_n$, solve
$$ \frac{u_{n+1} - u_n}{\Delta t} + \frac{1}{2}G(P_n + P_{n+1}) = \frac{f_n + f_{n+1}}{2}, $$ (3.16-23)
$$ \frac{\Delta t}{2}D(u_n + u_{n+1}) = \frac{\Delta t}{2}(g_n + g_{n+1}) $$ (3.16-24)
for $(u_{n+1}, P_{n+1})$, which prompts us to mention how (3.16-24), rather than $Du_{n+1} = g_{n+1}$,
which we shall soon discuss, came about. A general/canonical method of discovering how
an implicit ODE method can be applied to a DAE system is to first write the latter in
time-singular form (Campbell, 1980), $A\dot{y} = b(y)$, where $A$ is singular and $b$ contains the
algebraic constraints (which, of course, can be done for our two DAE systems), and then
'multiply' the selected ODE method by $A$ and replace $A\dot{y}$ by $b$; i.e., for TR applied to $\dot{y} = b$,
we have $y_{n+1} = y_n + \Delta t(\dot{y}_n + \dot{y}_{n+1})/2$, and thus $Ay_{n+1} = Ay_n + \Delta t(b_n + b_{n+1})/2$.
This is how (3.16-23), (3.16-24) was obtained from (3.16-9), (3.16-10), with
$$ y = \begin{pmatrix} u \\ P \end{pmatrix}, \quad A = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}, \quad b = \begin{pmatrix} f - GP \\ g - Du \end{pmatrix}. $$
[Another way to arrive at the same result is to replace $Du = g$ by the (stiff) ODE,
$\tau\dot{P} = g - Du$, apply the implicit ODE method, and then let $\tau \to 0$.]
But it is easy to see, via induction, since we assumed that $Du_n = g_n$ for all $n$ (in
particular for $n = 0$), that (3.16-24) quickly simplifies to
$$ Du_{n+1} = g_{n+1}, $$ (3.16-25)
which we shall call shortened TR (STR); and this is the equation that would probably be
used in most codes.
But the above difference in the treatment of $Du = g$ leads to a useful digression in
which we briefly return to the subject of wiggles and wiggle signals. It turns out that
the 'long' version of TR, referred to as the 'direct approach' by Hairer and Wanner
(1991), using (3.16-24), can alert the code user to some difficulties that would mostly be
overlooked by the 'short' version, (3.16-25), the 'indirect approach' (Hairer and Wanner,
1991). To see this distinction, we first examine the two PPE's that are implied by the two
methods. Writing $P_{n+1/2} = (P_n + P_{n+1})/2$ and $f_{n+1/2} = (f_n + f_{n+1})/2$, (3.16-23) and
(3.16-24) give, in the 'general' case ($Du_0 \ne g_0$), for $n = 0, 1, \ldots$,
$$ LP_{n+1/2} = Df_{n+1/2} - (g_{n+1} - g_n)/\Delta t + 2(-1)^n (Du_0 - g_0)/\Delta t, $$ (3.16-26)
and the velocity would satisfy [see (3.16-13)]
$$ u_{n+1} = u_n + \Delta t\,p f_{n+1/2} + GL^{-1}(g_{n+1} - g_n) - 2(-1)^n GL^{-1}(Du_0 - g_0) $$ (3.16-27)
and
$$ Du_n - g_n = (-1)^n (Du_0 - g_0). $$ (3.16-28)
Thus, there is a persistent $2\Delta t$ oscillation/wiggle in the solution, and it is larger the larger
is the initial divergence error. Especially noteworthy is the pressure oscillation; contrary
to 'conventional' large-$\Delta t$ TR oscillations, these oscillations increase as $\Delta t$ is decreased.
We can and shall regard this as a wiggle signal: a clear message to the user that s/he
should re-examine the input data. [A recent example of this very thing is contained in
Marx (1994), except that the signal was either not 'received' or not heeded; rather, TR
was wrongly condemned, again.] On the other hand, the shortened TR, (3.16-23) and
(3.16-25), yields the following:
$$ u_1 = u_0 + \Delta t\,p f_{1/2} + GL^{-1}(g_1 - g_0) - GL^{-1}(Du_0 - g_0) = pu_0 + v_1 + O(\Delta t), $$ (3.16-29)
which satisfies $Du_1 = g_1$ and, for $n = 1, 2, \ldots$,
$$ u_{n+1} = u_n + \Delta t\,p f_{n+1/2} + GL^{-1}(g_{n+1} - g_n), $$ (3.16-30)
which satisfies $Du_{n+1} = g_{n+1}$; for the pressure,
$$ LP_{1/2} = Df_{1/2} - (g_1 - g_0)/\Delta t + (Du_0 - g_0)/\Delta t, $$ (3.16-31)
and, for $n \ge 1$,
$$ LP_{n+1/2} = Df_{n+1/2} - (g_{n+1} - g_n)/\Delta t. $$ (3.16-32)
Thus, except for the first step, the shortened version (like the Euler methods) is, provided
that only $P_{n+1/2}$ is reported, oblivious to ill-posedness; bad initial data will be changed
(via an appropriate $L_2$-projection) during the first timestep, and a different-but-legitimate
(consistent) problem solved thereafter. But, as for FE and BE, the code may be solving
a different problem than the code user thinks is being solved.
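The contrast between the two treatments of the constraint can be reproduced on a toy problem. The sketch below assumes two velocity unknowns, $D = [1\ 1]$, $G = D^T$, $g(t) = \sin t$, and a made-up linear $f(u)_i = S_i - A_i u_i$ (all illustrative, chosen so that the implicit step has a closed form). It records $Du_n - g_n$ for the 'long' TR constraint (3.16-24) and for STR (3.16-25), starting from $Du_0 - g_0 = 0.3$.

```python
import math

A = (1.0, 2.0)      # hypothetical linear model: f(u)_i = S_i - A_i * u_i
S = (1.0, 0.0)

def tr_divergence_errors(u0, dt, n_steps, shortened, g=math.sin):
    """Return [D u_n - g_n] for TR with constraint (3.16-24) or STR (3.16-25)."""
    u = list(u0)
    errors = [u[0] + u[1] - g(0.0)]
    for n in range(n_steps):
        g_old, g_new = g(n * dt), g((n + 1) * dt)
        # Target value of D u_{n+1} under each treatment of the constraint.
        target = g_new if shortened else g_old + g_new - (u[0] + u[1])
        # Row i of (3.16-23): (1/dt + A_i/2) u_i' = (1/dt - A_i/2) u_i + S_i - Pbar,
        # where Pbar = (P_n + P_{n+1})/2.
        c = [1.0 / (1.0 / dt + a / 2.0) for a in A]
        r = [(1.0 / dt - A[i] / 2.0) * u[i] + S[i] for i in range(2)]
        p_bar = (c[0] * r[0] + c[1] * r[1] - target) / (c[0] + c[1])
        u = [c[i] * (r[i] - p_bar) for i in range(2)]
        errors.append(u[0] + u[1] - g_new)
    return errors

if __name__ == "__main__":
    print(tr_divergence_errors([0.3, 0.0], 0.01, 4, shortened=False))
    print(tr_divergence_errors([0.3, 0.0], 0.01, 4, shortened=True))
```

The long version returns an error that alternates in sign with undiminished amplitude, as in (3.16-28), whereas STR removes the error in the first step and is silent thereafter.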
Unless one computes and reports only $P_{n+1/2}$ (a viable option, actually), the initial
pressure field is needed in (3.16-23) for $n = 0$, and it comes (only; there is no other choice, which
extends our definition of 'honest GFEM' further into the time domain) from the DAE's
applied, with differentiated constraint equation, at $t = 0$; i.e., given $u_0$ with $Du_0 = g_0$,
solve
$$ \dot{u}_0 + GP_0 = f_0 $$ (3.16-33)
and
$$ D\dot{u}_0 = \dot{g}_0, $$ (3.16-34)
a linear system with matrix $\begin{pmatrix} I & G \\ D & 0 \end{pmatrix}$ or, in GFEM form, $\begin{pmatrix} M & C \\ C^T & 0 \end{pmatrix}$, for the initial acceleration
and pressure, $(\dot{u}_0, P_0)$. From the above, we observe the following general result:
every divergence-free velocity field induces a concomitant pressure and a divergence-free
acceleration,
an observation that is also true for the PDE's; recall Remark (5) below (3.10-13). The
initial pressure is thus seen to satisfy the appropriate PPE at $t = 0$,
$$ LP_0 = Df_0 - \dot{g}_0. $$ (3.16-35)
Note that even when (3.16-33) and (3.16-34) [and thus (3.16-35)] are solvable, both $\dot{u}_0$
and $P_0$ make little sense if $Du_0 \ne g_0$. Note too that if the hydrostatic pressure mode, $P_H$,
is present and $P_H^T \dot{g}_0 \ne 0$, (3.16-35) has no solution, even if $Du_0 = g_0$. If $P_H$ is present
and $P_H^T \dot{g}_0 = 0$, the singularity in $L$ can be removed by 'pegging' a pressure (as stated
earlier) with no adverse effects.
Returning, however, to the ill-posed case ($Du_0 \ne g_0$), we wish to point out the pressure
behaviour for both TR and STR. Rather than the average pressure behaviour discussed
above, we get, with $P_0$ from (3.16-35), for $n \ge 1$,
$$ LP_n = Df_n + (-1)^n \left[ 2\sum_{j=0}^{n-1} (-1)^j\,\frac{g_{j+1} - g_j}{\Delta t} - \dot{g}_0 - 2\beta_n\,\frac{Du_0 - g_0}{\Delta t} \right], $$ (3.16-36)
where $\beta_n = 2n$ for TR and $\beta_n = 1$ for STR. Thus, a 'true' TR simulation with $Du_0 \ne g_0$
will show a linearly growing oscillatory pressure, whereas STR will merely oscillate;
each, of course, is a wiggle signal (especially for small $\Delta t$) that begs for attention.
Finally, for the well-posed case ($Du_0 = g_0$), it is interesting to find the implied PPE
for $P_{n+1}$ rather than that for $(P_n + P_{n+1})/2$ as done thus far. From (3.16-26) applied at
$n = 0$ and (3.16-35), we obtain, by induction,
$$ LP_{n+1} = Df_{n+1} - \left[ \frac{2(g_{n+1} - g_n)}{\Delta t} - \dot{g}_n \right] $$ (3.16-37)
(where it is important to point out that $\dot{g}_n$ is a TR approximation to $dg/dt$ at $t_n$, but
$\dot{g}_n \ne dg/dt$ at $t_n$) which is (3.16-12) applied at $t = t_{n+1}$ with a 'special' (appropriate)
approximation to $\dot{g}_{n+1}$; namely, $\dot{g}_{n+1} = 2(g_{n+1} - g_n)/\Delta t - \dot{g}_n$, which is just (the
inversion of) TR applied to an ODE system for $g$.
5. Second-order backward-differentiation method. BDF2, like BE, needs neither pressure
nor a divergence-free IC to advance the solution; i.e., given $u_n$ and $u_{n-1}$, from (2.7-72)
of the previous chapter, we get
$$ \frac{u_{n+1} - u_n}{\Delta t} = \frac{1}{3}\,\frac{u_n - u_{n-1}}{\Delta t} + \frac{2}{3}\,(f_{n+1} - GP_{n+1}) $$ (3.16-38)
and
$$ Du_{n+1} = g_{n+1}, \quad n = 0, 1, 2, \ldots, $$ (3.16-39)
a non-linear system in $(u_{n+1}, P_{n+1})$. Note that all BDF methods apply the algebraic
constraint equation only at the advanced time so that, like the BE case already discussed,
an infinite timestep recovers the steady-state NS equations (but not necessarily their
solution, even if one exists). If $Du_0 = g_0$, the implied PPE and concomitant approximation to
$\dot{g}(t)$ is easily found to be
$$ LP_{n+1} = Df_{n+1} - (3g_{n+1} - 4g_n + g_{n-1})/(2\Delta t), $$ (3.16-40)
which is (again) the appropriate PPE with $\dot{g}_{n+1}$ consistently (a la BDF2) approximated;
cf. (2.7-73).
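The implied PPE can be checked in a few lines. The following sketch writes BDF2 in the increment form $u_{n+1} - u_n = \tfrac{1}{3}(u_n - u_{n-1}) + \tfrac{2}{3}\Delta t\,(f_{n+1} - GP_{n+1})$, assumes $Du_k = g_k$ at the two previous levels, and uses $DG = L$:

```latex
\begin{align*}
D\,\frac{u_{n+1} - u_n}{\Delta t}
  &= \frac{1}{3}\,D\,\frac{u_n - u_{n-1}}{\Delta t}
     + \frac{2}{3}\left(Df_{n+1} - LP_{n+1}\right)
  && \text{[apply $D$ to the BDF2 step]} \\
\frac{g_{n+1} - g_n}{\Delta t}
  &= \frac{g_n - g_{n-1}}{3\,\Delta t}
     + \frac{2}{3}\left(Df_{n+1} - LP_{n+1}\right)
  && \text{[use $Du_k = g_k$]} \\
LP_{n+1}
  &= Df_{n+1} - \frac{3g_{n+1} - 4g_n + g_{n-1}}{2\,\Delta t}.
\end{align*}
```

The last line is the BDF2 backward-difference approximation to $\dot{g}$ at $t_{n+1}$ inserted into the PPE.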
Since BDF2 is not self-starting, it may be well to address this issue here. There are two
choices, basically, both viable: (i) the first step could be made with the first-order member
of the same BDF family; namely, BDF1, which of course is backward Euler; because
it is only first-order accurate, the $\Delta t$ should be smaller (by a factor of approximately five
to ten) than that for BDF2; (ii) the TR, via (3.16-23) and (3.16-25) with $Du_0 = g_0$, could
also be used to obtain $u_1$, and at the same $\Delta t$ contemplated for BDF2.
6. Implicit midpoint rule. The last implicit method we consider, on the index 2 system,
and the third of the 'second-order-accurate' family, is IMR:
$$ \frac{u_{n+1} - u_n}{\Delta t} + G\left(\frac{P_n + P_{n+1}}{2}\right) = f\left(\frac{u_n + u_{n+1}}{2}\right) $$ (3.16-41)
and
$$ \frac{\Delta t}{2}\,D(u_n + u_{n+1}) = \Delta t\,g\left(\frac{t_n + t_{n+1}}{2}\right) \quad \text{for } n = 0, 1, 2, \ldots, $$ (3.16-42)
which, unless $g(t)$ is either constant or a linear function of time, introduces a slight
peculiarity because the divergence-free constraint applies only at the midpoint; i.e., $Du_n \ne g_n$
in the general case, even if $Du_0 = g_0$. [$Du_1 = g_1 + O(\Delta t^3)$.] IMR is also a member
of the family of implicit Runge-Kutta methods, all of which share this same 'difficulty.'
For the (common) case in which $g$ is independent of time, IMR will also satisfy the
constraint equation at endpoints as well as midpoints, if $Du_0 = g$.
The implied PPE is also different:
$$ L(P_n + P_{n+1})/2 = Df\big((u_{n+1} + u_n)/2\big) - D(u_{n+1} - u_n)/\Delta t, $$ (3.16-43)
which implies a differentiation of $Du$ rather than $g$.
Another good candidate (better, probably) would be a modified IMR in which the
constraint is enforced at the 'end-of-step'; i.e., use $Du_{n+1} = g_{n+1}$ rather than (3.16-42).
This implies the following PPE and concomitant $\dot{g}$ approximation, replacing (3.16-43):
$$ L(P_n + P_{n+1})/2 = Df\big((u_n + u_{n+1})/2\big) - (g_{n+1} - g_n)/\Delta t. $$ (3.16-44)
These are all of the methods that we wish to discuss for now, concluding with some
final remarks on index 2 time-marching methods:
1. Unless the mass is lumped (or unless a simple finite-difference method is used to
generate the DAE's), only implicit methods should be considered. If lumped mass is
employed, then the index 1/PPE 'version' (which 'falls out' of the index 2 equations)
of any explicit integration method should be used—because cheaper sequential solutions
then obtain.
2. The first-order BE method should generally only be used if a rapid approach (via fairly
'large' timesteps) to a steady state—assuming one exists—is desired. For this reason, we
will later present a 'smart' BE integrator—even though it is rather inefficient for obtaining
'time-accurate' transient solutions (too many small steps are needed).
3. Among the three second-order methods,
(i) TR is not recommended by some ODE/DAE experts (e.g., Brenan et al., 1996)
because of its neutral stability. We, however, have had much success with
TR, especially when used with variable step size via error control, as we present
later, and are not afraid of it. [Neither is Gartling, in his NACHOS II code
(1987), who has told us that TR (with error control) 'always' works well (personal
communication).]
(ii) IMR may be a good choice if a fixed-$\Delta t$ code is written because it conserves
'energy' ($u^T u$ or $u^T M u$) regardless of step size, and because variable $\Delta t$ methods
are not easy to derive for the one-leg family. (TR is indefinite with respect to energy
conservation, and BDF2 removes energy.) But we do not believe in fixed-$\Delta t$ implicit
integrators.
(iii) BDF2 may be the best choice of all because of its (extra) stability (L-stable rather
than just A-stable) and because (like TR) implementation of error control and variable $\Delta t$ is
easy; its disadvantage is artificial dissipation (the price for L-stability). Thus,
we will later also present a robust, variable-step BDF2 integration technique.
(iv) If unconditional stability is desired or required, then only implicit methods should
be used, regardless of mass lumping issues.
b. PPE methods/index 1
For the index 1 system, (3.16-11) and (3.16-12), we shall consider four explicit methods
(FE, AB2, RK2 and LF) but only one implicit method, and the latter only to make the
point that index 1 and implicit methods are not a good combination (implicit methods
should be applied directly to the index 2 DAE's). Of the explicit methods considered,
only one, AB2, will be described in any detail. Another important point is that consistent
mass virtually precludes all index 1 solution methods; we will therefore soon limit our
discussion to lumped mass FEM or FDM.
1. Backward Euler. Let us start with the (simplest) implicit example, backward Euler;
applied to (3.16-11) and (3.16-12), it is, simply,
$$ u_{n+1} = u_n + \Delta t\,(f_{n+1} - GP_{n+1}) $$ (3.16-45)
and
$$ LP_{n+1} = Df_{n+1} - \dot{g}_{n+1}, $$ (3.16-46)
a fully coupled non-linear system in $(u_{n+1}, P_{n+1})$. Since $L = DG = C^T M^{-1} C$ and
$Df = C^T M^{-1}[f - Ku - N(u)u]$, and $M^{-1}$ is dense, it is clear that GFEM is just not meant to be integrated with
implicit methods in index 1 form. Even if the mass were lumped (or FDM used), the
system is still fully coupled, and, as with all ODE methods applied to the true index 1
DAE's, the resulting solution will generally not even be discretely divergence-free; from
(3.16-45) and (3.16-46), it is easy to derive that $Du_{n+1} = Du_n + \Delta t\,\dot{g}_{n+1} \ne g_{n+1}$, even
when $Du_0 = g_0$.
We now go to the explicit methods, beginning with AB2.
2. Second-order Adams-Bashforth. Because we advocate higher than first-order methods,
we begin with AB2, which is best applied first (and formally) to the second index 0 form,
(3.16-13), to give
$u_{n+1} = u_n + \frac{\Delta t}{2}\left[3(\wp f_n + G L^{-1}\dot{g}_n) - (\wp f_{n-1} + G L^{-1}\dot{g}_{n-1})\right]$.   (3.16-47)
Next, to recover the appropriate index 1 version that is useful, we explicitly construct
the projection of $f$, after rearranging the above result to
$(u_{n+1} - u_n)/\Delta t = \frac{3}{2}[f_n - G L^{-1}(D f_n - \dot{g}_n)] - \frac{1}{2}[f_{n-1} - G L^{-1}(D f_{n-1} - \dot{g}_{n-1})]$,
which, noting that $L^{-1}(Df - \dot{g})$ is an $m$-vector called $P$, yields
$\frac{u_{n+1} - u_n}{\Delta t} = \frac{3}{2}(f_n - G P_n) - \frac{1}{2}(f_{n-1} - G P_{n-1})$,   (3.16-48)
with
$L P_n = D f_n - \dot{g}_n$ and $L P_{n-1} = D f_{n-1} - \dot{g}_{n-1}$   (3.16-49)
as the appropriate 'defining' equations for the pressure. Thus, presuming $P_{n-1}$ to be
available (we will return to 'start-up' later), the AB2 method, as an 'algorithm,' is
Step 1. solve $L P_n = D f_n - \dot{g}_n$ for $P_n$;
Step 2. update $u_{n+1}$ from (3.16-48). Done.
We see already that index 1 (with LM) and explicit methods are a good match since
the (sole?) advantage of explicit methods is realized: there are no algebraic equations to
solve for the 'ODE variable,' u. (There will always be algebraic equations to solve for
P, because these are DAE's, not ODE's.)
So how close to divergence-free is the resulting velocity? This is most easily answered
by applying $D$ to (3.16-47) and recalling that $D\wp = 0$ to give
$\frac{D(u_{n+1} - u_n)}{\Delta t} = \frac{1}{2}(3\dot{g}_n - \dot{g}_{n-1})$,   (3.16-50)
which is now a second-order accurate approximation (AB2 in fact) to $D\dot{u} = \dot{g}$; index 1
methods preserve the discrete divergence of the acceleration, but only up to the LTE, and
they do not, in general, yield $D u_n = g_n$. If, however, the Dirichlet BC's were time-independent,
we would have $\dot{g} = 0$, and then AB2 would be maintaining $D u_n = g$ if it
started that way ($D u_0 = g$).
But we can do better yet when $\dot{g} \ne 0$ by constructing a special-but-appropriate
approximation to $\dot{g}_n$; i.e., one that keeps the velocity discretely divergence-free regardless of
step size. We shall call this approach a modified index 1 method, vis-a-vis 'true' index
1. Setting $D u_{n+1} = g_{n+1}$ and $D u_n = g_n$ in (3.16-50) yields the following approximation
to $\dot{g}_n$ that is to be used in (3.16-49):
$\dot{g}_n \approx \frac{2}{3}\,\frac{g_{n+1} - g_n}{\Delta t} + \frac{1}{3}\,\dot{g}_{n-1}$,   (3.16-51)
which is simply the AB2 ODE method 'inverted.' The 'improved' (modified) AB2 index 1
algorithm is then
Step 1. solve $L P_n = D f_n - \frac{2}{3}(g_{n+1} - g_n)/\Delta t - \frac{1}{3}\dot{g}_{n-1}$;
Step 2. update $u_{n+1}$ from (3.16-48) and update $\dot{g}_n$ from (3.16-51). Done, and the result
will satisfy $D u_{n+1} = g_{n+1}$, the index 2 constraint equation (iff $D u_0 = g_0$).
So, if the mass is lumped, we have a good way to solve the modified index 1 DAE's.
But, as the astute reader will have already realized, comparing this 'new' result to the
index 2 integration with AB2, (3.16-19), (3.16-20) and the Poisson equations below that
equation, we see ... equivalence! We have found the way to start with index 1 yet obtain
a solution to index 2—at least with perfect 'arithmetic' (no roundoff, no iteration errors,
etc.). But in the next section, we will discuss the (minor) 'cost' of this 'improvement'
(reduced local pressure accuracy).
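The two AB2 variants can be compared on a small made-up system. The following is only a minimal sketch under stated assumptions: the mass is lumped to the identity, so that $G$ is a random stand-in matrix, $D = G^T$ and $L = DG$; the functions `f`, `g` and `gdot` are invented for illustration and do not come from any flow problem; and the forward Euler start-up step uses the same step size as the AB2 steps, purely for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 12, 4                          # illustrative numbers of velocity / pressure unknowns
G = rng.standard_normal((n, m))       # stand-in gradient matrix (lumped mass, M = I)
D = G.T                               # stand-in divergence matrix
L = D @ G                             # L = DG

a = rng.standard_normal(m)
g = lambda t: a * np.sin(t + 0.3)     # made-up constraint data g(t)
gdot = lambda t: a * np.cos(t + 0.3)  # its exact time derivative
f = lambda u: -u + 0.1 * np.sin(np.roll(u, 1))   # made-up smooth right-hand side

def ab2_index1(dt, nsteps, modified):
    """Return ||D u - g|| after nsteps (the first step is forward Euler)."""
    t = 0.0
    u = G @ np.linalg.solve(L, g(t))              # D u_0 = g_0 by construction
    gd_old = (g(dt) - g(0.0)) / dt if modified else gdot(0.0)
    r_old = f(u) - G @ np.linalg.solve(L, D @ f(u) - gd_old)   # f_0 - G P_0
    u = u + dt * r_old                            # forward Euler start-up step
    t += dt
    for _ in range(nsteps - 1):
        if modified:
            # inverted AB2 formula, cf. (3.16-51)
            gd = (2.0 / 3.0) * (g(t + dt) - g(t)) / dt + gd_old / 3.0
        else:
            gd = gdot(t)                          # true index 1, cf. (3.16-49)
        P = np.linalg.solve(L, D @ f(u) - gd)     # Step 1: pressure solve
        r = f(u) - G @ P
        u = u + dt * (1.5 * r - 0.5 * r_old)      # Step 2: AB2 update, cf. (3.16-48)
        r_old, gd_old = r, gd
        t += dt
    return np.linalg.norm(D @ u - g(t))

err_true = ab2_index1(0.01, 100, modified=False)
err_mod = ab2_index1(0.01, 100, modified=True)
print("true index 1:     ||Du - g|| =", err_true)
print("modified index 1: ||Du - g|| =", err_mod)
```

In this sketch the modified variant should hold the constraint to roundoff, while the true index 1 variant should drift by an amount that shrinks with the step size.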
Final remarks on AB2:
(i) Use forward Euler for the (smaller) first step, in the 'modified' manner to be
described below.
(ii) Before taking that first step, solve $L P_0 = D f_0 - \dot{g}_0$ for $P_0$.
3. Forward Euler. Applied to (3.16-11) and (3.16-12), with the help of (3.16-13) if
necessary [via $u_{n+1} = u_n + \Delta t(\wp f_n + \dot{v}_n)$ and $\wp f = f - GP - \dot{v}$] to get the right time
index on $GP$, it is
$u_{n+1} = u_n + \Delta t\,(f_n - G P_n)$   (3.16-52)
and
$L P_n = D f_n - \dot{g}_n$,   (3.16-53)
in which (3.16-53) is solved first. Equation (3.16-53) is the best equation to use to prove
the assertion made earlier regarding the time index on $P$; i.e., the index is not $n + 1$. We
offer two proofs, the second from J. Leone (personal communication, 1989):
(i) The elliptic (algebraic) PPE always enforces the pressure to be in equilibrium with
the current velocity field; since the RHS of (3.16-53) shows a velocity field at time
$t_n$, the LHS must agree: $P_n$ is correct, not $P_{n+1}$. QED.
(ii) If $P_{n+1}$ were correct, then the PPE would read $L P_{n+1} = D f_n - \dot{g}_n$, which makes the
value of $P_{n+1} = P(t_n + \Delta t)$ completely independent of $\Delta t$! QED.
This (true) index 1 solution would satisfy $D(u_{n+1} - u_n)/\Delta t = \dot{g}_n$, a first-order
approximation to $D\dot{u} = \dot{g}$, and thus $D u_n = g_n + O(\Delta t)$. Again, however, the simple and expedient
use of the 'appropriate' approximation to $\dot{g}_n$ in (3.16-53), i.e., $\dot{g}_n \approx (g_{n+1} - g_n)/\Delta t$,
recovers the index 2 result via our modified index-1 method; i.e., the resulting solution
would satisfy (3.16-15) and (3.16-16).
Remark:
We mentioned in Section 3.13.4a that PPE methods only enforce the weaker constraint
$C^T\dot{u} = \dot{g}$. Using $\dot{g}_n = (g_{n+1} - g_n)/\Delta t$ in (3.16-53), it is easy to show that
$D u_{n+1} = g_{n+1} + (D u_n - g_n)$. Any initial divergence error ($D u_0 - g_0 \ne 0$) will forever be 'frozen'
in the fluid; two examples of this are presented in Gresho (1991b), and we shall present
another at the end of this Section (3.16.1d).
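The 'frozen' divergence error of the remark above can be reproduced on a small made-up system. This is a sketch only, assuming lumped mass ($M = I$), a random stand-in matrix $G$ with $D = G^T$, and invented functions `f` and `g`; the initial velocity is deliberately chosen so that $D u_0 \ne g_0$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 10, 3
G = rng.standard_normal((n, m))     # stand-in gradient (lumped mass, M = I)
D = G.T                             # stand-in divergence
L = D @ G
a = rng.standard_normal(m)
g = lambda t: a * np.cos(t)         # made-up constraint data
f = lambda u: -u + 0.1 * np.sin(np.roll(u, 1))   # made-up right-hand side

dt, t = 0.02, 0.0
u = rng.standard_normal(n)          # deliberately NOT satisfying D u0 = g(0)
defect0 = D @ u - g(t)              # initial divergence error

for _ in range(200):
    gdot_n = (g(t + dt) - g(t)) / dt             # modified index 1 choice
    P = np.linalg.solve(L, D @ f(u) - gdot_n)    # PPE, cf. (3.16-53)
    u = u + dt * (f(u) - G @ P)                  # forward Euler, cf. (3.16-52)
    t += dt

defect = D @ u - g(t)
print("initial defect:", defect0)
print("final defect:  ", defect)
```

Up to roundoff, the two printed vectors should agree: the scheme neither removes nor amplifies the initial divergence error.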
Another way to interpret (and generalize) these results is that the index 2 formulation,
via its implied ('hidden,' initially) PPE, which itself implies an approximation to $\dot{g}$, tells
the index 1 formulation the best way to approximate $\dot{g}$. We believe this interpretation
is useful and general [it applies to any ODE method, although only explicit ones, with
lumped mass (or FDM), make sense for code writing]. It is also restricted to the situation
in which the matrix $D$ is time-independent, which is usually the case. The exception is
free-surface flows in which all matrices are time-dependent because of a continuously
changing domain shape (see Volume II). As mentioned above, however, both index 2
and modified index 1 solutions give a less accurate pressure (but only locally) when $g$ is
time-dependent; the details are presented in the next section.
With this new interpretation, we now present the final results for two more second-order
explicit methods, leapfrog and Runge-Kutta, in both 'true' and modified formulations.
4. Leapfrog. The algorithm is as simple as FE (which it needs for the first step), but more
accurate:
$u_{n+1} = u_{n-1} + 2\Delta t\,(f_n - G P_n)$,   (3.16-54)
where
$L P_n = D f_n - \dot{g}_n$   (3.16-55)
for true index 1, or
$L P_n = D f_n - (g_{n+1} - g_{n-1})/2\Delta t$   (3.16-56)
for the modified version. But because LF2 is unstable for diffusion (cf. Chapter 2), its
use for NS is probably ill-advised.
5. Second-order Runge-Kutta. Strictly speaking, RK2 requires two PPE 'solves' per
timestep (once per stage), but see also Le and Moin (1991), in which a way was found to
do just one. Also, like leapfrog (but in the other extreme), it is unstable for pure advection
(Euler's equations) and must be used 'carefully', like FE. The general algorithm [see
(2.7-12)] goes as follows:
Stage 1:
$u_{n+1/2\gamma} = u_n + \Delta t\,(f_n - G P_n)/2\gamma$,   (3.16-57)
where
$L P_n = D f_n - 2\gamma\,(g_{n+1/2\gamma} - g_n)/\Delta t$   (3.16-58)
is solved first.
Stage 2:
$u_{n+1} = u_n + \Delta t\,[(1 - \gamma)(f_n - G P_n) + \gamma(f_{n+1/2\gamma} - G P_{n+1/2\gamma})]$,   (3.16-59)
where
$L P_{n+1/2\gamma} = D f_{n+1/2\gamma} - [(g_{n+1} - g_n)/\gamma\Delta t - 2(1 - \gamma)(g_{n+1/2\gamma} - g_n)/\Delta t]$   (3.16-60)
is solved first, with the result that
$D u_{n+1} = g_{n+1}$.   (3.16-61)
The way is now clear how to apply higher-order RK formulas to the DAE's; these are exercises
that we leave to the reader.
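The two-stage algorithm can be checked numerically on a toy problem. The sketch below makes several assumptions that are not from the text: lumped mass ($M = I$), a random stand-in $G$ with $D = G^T$, invented functions `f` and `g`, and the reading of the subscript $n+1/2\gamma$ as evaluation at $t_n + \Delta t/2\gamma$. It verifies only the constraint (3.16-61), not accuracy or stability.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 10, 3
G = rng.standard_normal((n, m))     # stand-in gradient (lumped mass, M = I)
D = G.T
L = D @ G
a = rng.standard_normal(m)
g = lambda t: a * np.sin(2.0 * t + 0.1)          # made-up constraint data
f = lambda u: -u + 0.1 * np.sin(np.roll(u, 1))   # made-up right-hand side

def rk2_step(u, t, dt, gamma):
    th = t + dt / (2.0 * gamma)                  # intermediate time level
    # Stage 1: pressure solve (3.16-58), then partial update (3.16-57)
    P_n = np.linalg.solve(L, D @ f(u) - 2.0 * gamma * (g(th) - g(t)) / dt)
    r_n = f(u) - G @ P_n
    u_h = u + dt * r_n / (2.0 * gamma)
    # Stage 2: pressure solve (3.16-60), then full update (3.16-59)
    gd_h = ((g(t + dt) - g(t)) / (gamma * dt)
            - 2.0 * (1.0 - gamma) * (g(th) - g(t)) / dt)
    P_h = np.linalg.solve(L, D @ f(u_h) - gd_h)
    r_h = f(u_h) - G @ P_h
    return u + dt * ((1.0 - gamma) * r_n + gamma * r_h)

def divergence_defect(gamma, dt=0.02, nsteps=50):
    t = 0.0
    u = G @ np.linalg.solve(L, g(t))             # start with D u_0 = g_0
    for _ in range(nsteps):
        u = rk2_step(u, t, dt, gamma)
        t += dt
    return np.linalg.norm(D @ u - g(t))

defects = {gamma: divergence_defect(gamma) for gamma in (0.5, 1.0)}
print(defects)
```

Here `gamma = 1` corresponds to a midpoint-type stage and `gamma = 0.5` to a stage at the full step; in both cases the defect $\|Du - g\|$ should remain at roundoff level.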
Final remarks on index 1/PPE methods:
(i) FE is generally not recommended.
(ii) AB2 (or even AB3, not shown) may be preferable to RK2 because it needs to solve
half as many PPE's.
(iii) Use the modified index-1 method rather than true index 1.
c. Error analysis for index 1 and 2
We now present a brief analysis of local error for the two Euler methods applied to index
2 (and modified index 1, since they are the same algorithm) and (true) index 1. We will
present enough detail (we believe) so that the reader may do the 'same' (perhaps with
some effort) for higher-order methods, and we explicitly re-introduce the viscous terms.
o We begin with index 2 and forward Euler; FE applied to the index 2 (= FE applied to
modified index 1) DAE's is
$\frac{u_{n+1} - u_n}{\Delta t} + K u_n + G P_n = f(u_n) \equiv f_n$   (3.16-62)
and
$D u_{n+1} = g_{n+1}$,   (3.16-63)
which is index 2, or, equivalently, (3.16-62) and
$L P_n = D(f_n - K u_n) - (g_{n+1} - g_n)/\Delta t$,   (3.16-64)
which is modified index 1. Eliminating the pressure via the projection matrix, $\wp = I - G L^{-1} D$,
permits an efficient analysis of the velocity error; i.e.,
$\frac{u_{n+1} - u_n}{\Delta t} = \wp(f_n - K u_n) + G L^{-1}\,\frac{g_{n+1} - g_n}{\Delta t}$
$\quad = \wp(f_n - K u_n) + (v_{n+1} - v_n)/\Delta t$,   (3.16-65)
which is to be compared to (3.16-13), due account being taken of the different
definition of $f(u)$ there. In order to compute the local error, $l_n = u_{n+1} - u(t_{n+1}) = d_n$ for FE [cf. (2.7-7) in
Chapter 2, wherein we showed, for FE, that the LTE is also the local error], where $u(t_{n+1})$
is the solution of (3.16-13), we use Taylor series:
$u(t_{n+1}) = u(t_n) + \Delta t\,\dot{u}(t_n) + \frac{\Delta t^2}{2}\ddot{u}(t_n) + O(\Delta t^3)$   (3.16-66)
$\quad = u_n + \Delta t\,\dot{u}_n + \frac{\Delta t^2}{2}\ddot{u}_n + O(\Delta t^3)$.   (3.16-67)
Using (3.16-13) gives
$u(t_{n+1}) = u_n + \Delta t\,[\wp(f_n - K u_n) + \dot{v}_n] + \frac{\Delta t^2}{2}\ddot{u}_n + O(\Delta t^3)$,   (3.16-68)
which, when subtracted from $u_{n+1}$ in (3.16-65), yields
$d_n^{FE} = v_{n+1} - v_n - \Delta t\,\dot{v}_n - \frac{\Delta t^2}{2}\ddot{u}_n + O(\Delta t^3)$
$\quad = \left[v_n + \Delta t\,\dot{v}_n + \frac{\Delta t^2}{2}\ddot{v}_n + O(\Delta t^3)\right] - v_n - \Delta t\,\dot{v}_n - \frac{\Delta t^2}{2}\ddot{u}_n + O(\Delta t^3)$
$\quad = \frac{\Delta t^2}{2}(\ddot{v}_n - \ddot{u}_n) + O(\Delta t^3)$;   (3.16-69)
the velocity is locally second-order accurate.
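For the pressure, we must compare $P_{n+1}$ and $P(t_{n+1})$, since $P_n = P(t_n)$ was presumed
exact. Thus,
$L P_{n+1} = D(f_{n+1} - K u_{n+1}) - (g_{n+2} - g_{n+1})/\Delta t$   (3.16-70)
and
$L P(t_{n+1}) = D\{f[u(t_{n+1})] - K u(t_{n+1})\} - \dot{g}_{n+1}$,   (3.16-71)
where
$f_{n+1} = f(u_{n+1}) = f[u(t_{n+1}) + d_n^{FE}] = f[u(t_{n+1})] + \left.\frac{\partial f}{\partial u}\right|_{t_{n+1}} d_n^{FE} + O(\|d_n^{FE}\|^2)$,   (3.16-72)
where $\partial f/\partial u|_{t_{n+1}} \equiv J$ is a Jacobian matrix. Then, defining $e_n = P_{n+1} - P(t_{n+1})$ yields,
using (3.16-69),
$L e_n = D(J - K)\,d_n^{FE} + \dot{g}_{n+1} - (g_{n+2} - g_{n+1})/\Delta t + O(\Delta t^4)$.   (3.16-73)
Now use TS on $g_{n+2}$:
$g_{n+2} = g_{n+1} + \Delta t\,\dot{g}_{n+1} + \frac{\Delta t^2}{2}\ddot{g}_{n+1} + \frac{\Delta t^3}{6}\dddot{g}_{n+1} + O(\Delta t^4)$
to give, using (3.16-69) again,
$L e_n = \frac{\Delta t^2}{2} D(J - K)(\ddot{v}_n - \ddot{u}_n) - \frac{\Delta t}{2}\ddot{g}_{n+1} - \frac{\Delta t^2}{6}\dddot{g}_{n+1} + O(\Delta t^3)$,   (3.16-74)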
and we have discovered, in $(\Delta t/2)\ddot{g}_{n+1}$, another DAE problem: when $g(t)$ is time-dependent,
the numerical differentiation of the constraint equation (either implicitly via
index 2 or explicitly via modified index 1) causes a loss of local accuracy in $P$, by one
order, over that for velocity (and for FE on ODE's). If and only if $g$ is either independent
of time or linear in $t$, (3.16-74) gives
$L e_n = -\frac{\Delta t^2}{2} D(J - K)\ddot{u}_n + O(\Delta t^3)$;   (3.16-75)
for the general case, however, $e_n = O(\Delta t)$, while $d^{FE} = O(\Delta t^2)$.
o Next we apply FE to the true index-1 DAE's:
$\frac{u_{n+1} - u_n}{\Delta t} + K u_n + G P_n = f_n$   (3.16-76)
and
$L P_n = D(f_n - K u_n) - \dot{g}_n$,   (3.16-77)
which, upon elimination of $P_n$, is
$\frac{u_{n+1} - u_n}{\Delta t} = \wp(f_n - K u_n) + G L^{-1}\dot{g}_n$
$\quad = \wp(f_n - K u_n) + \dot{v}_n$   (3.16-78)
rather than (3.16-65), a simple change with non-simple consequences. (Recall/note that
$D u_{n+1} = D u_n + \Delta t\,\dot{g}_n = g_n + \Delta t\,\dot{g}_n \ne g_{n+1}$.) Subtracting $u(t_{n+1})$ in (3.16-68) from $u_{n+1}$
in (3.16-78) yields
$d_n^{FE} = -\frac{\Delta t^2}{2}\ddot{u}_n + O(\Delta t^3)$,   (3.16-79)
a la ODE local error. What about $P$? Well,
$L P_{n+1} = D[f_{n+1} - K u_{n+1}] - \dot{g}_{n+1}$   (3.16-80)
and
$L P(t_{n+1}) = D\{f[u(t_{n+1})] - K u(t_{n+1})\} - \dot{g}_{n+1}$,   (3.16-81)
to give
$L e_n = L[P_{n+1} - P(t_{n+1})]$
$\quad = D\{f_{n+1} - f[u(t_{n+1})] - K[u_{n+1} - u(t_{n+1})]\}$   (3.16-82)
which, using (3.16-70) through (3.16-73), yields
$L e_n = D(J - K)\,d_n^{FE} + O(\|d_n^{FE}\|^2)$
$\quad = -\frac{\Delta t^2}{2} D(J - K)\ddot{u}_n + O(\Delta t^3)$;   (3.16-83)
no loss of accuracy relative to the velocity! We see here a special instance of a general
fact:
The index 1 DAE's 'trade' a loss of discrete divergence ($D u_n \ne g_n$) for a (local) gain
in pressure accuracy, in the general case (time-dependent $g$).
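The first-order term that degrades the index 2 pressure comes entirely from replacing $\dot{g}_{n+1}$ by a one-sided difference of $g$. As a small numerical illustration (assuming, purely for the example, the scalar choice $g(t) = \sin t$ and the evaluation point $t = 0.7$), the sketch below compares $\dot{g} - [g(t+\Delta t) - g(t)]/\Delta t$ with the two leading Taylor terms $-(\Delta t/2)\ddot{g} - (\Delta t^2/6)\dddot{g}$.

```python
import math

def one_sided_defect(dt, t=0.7):
    """gdot(t) - [g(t + dt) - g(t)]/dt for the made-up choice g(t) = sin t."""
    return math.cos(t) - (math.sin(t + dt) - math.sin(t)) / dt

def predicted(dt, t=0.7):
    """-(dt/2) g''(t) - (dt^2/6) g'''(t) for g(t) = sin t."""
    return 0.5 * dt * math.sin(t) + dt**2 / 6.0 * math.cos(t)

for dt in (0.1, 0.05, 0.025):
    print(dt, one_sided_defect(dt), predicted(dt))

# Halving dt should roughly halve the defect (first-order behaviour).
ratio = one_sided_defect(0.05) / one_sided_defect(0.025)
print("ratio on halving dt:", ratio)
```

The ratio close to 2 reflects the $O(\Delta t)$ behaviour; if $g$ were constant or linear in $t$, the defect would vanish identically.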
o We are now ready to analyze BE, first on index 2 (= modified index 1):
$\frac{u_{n+1} - u_n}{\Delta t} + K u_{n+1} + G P_{n+1} = f_{n+1}$   (3.16-84)
and
$D u_{n+1} = g_{n+1}$,   (3.16-85)
or (3.16-84) and
$L P_{n+1} = D(f_{n+1} - K u_{n+1}) - (g_{n+1} - g_n)/\Delta t$,   (3.16-86)
both of which yield the following equation in velocity only, upon elimination of
$P_{n+1}$ and using (3.16-14):
$\frac{u_{n+1} - u_n}{\Delta t} = \wp(f_{n+1} - K u_{n+1}) + \frac{v_{n+1} - v_n}{\Delta t}$.   (3.16-87)
The LTE for BE [cf. (2.7-8) in the previous chapter] is obtained by inserting the exact
solution into (3.16-87) and evaluating the residual. Thus,
$d_n^{BE} = u(t_n) + \Delta t\,\wp\{f[u(t_{n+1})] - K u(t_{n+1})\} + (v_{n+1} - v_n) - u(t_{n+1})$,   (3.16-88)
and the TS is now taken backwards from $t_{n+1}$:
$u(t_n) = u(t_{n+1}) - \Delta t\,\dot{u}(t_{n+1}) + \frac{\Delta t^2}{2}\ddot{u}(t_{n+1}) + O(\Delta t^3)$
$\quad = u(t_{n+1}) - \Delta t\,\{\wp(f[u(t_{n+1})] - K u(t_{n+1})) + \dot{v}_{n+1}\} + \frac{\Delta t^2}{2}\ddot{u}_{n+1} + O(\Delta t^3)$,   (3.16-89)
and we obtain
$d_n^{BE} = v_{n+1} - v_n - \Delta t\,\dot{v}_{n+1} + \frac{\Delta t^2}{2}\ddot{u}_{n+1} + O(\Delta t^3)$
$\quad = \frac{\Delta t^2}{2}(\ddot{u}_{n+1} - \ddot{v}_{n+1}) + O(\Delta t^3)$
$\quad = \frac{\Delta t^2}{2}(\ddot{u}_n - \ddot{v}_n) + O(\Delta t^3)$;   (3.16-90)
as for ODE's, BE reverses the sign of the LTE for FE [cf. (3.16-69)].
For the pressure, the analysis follows precisely as that for FE, given by (3.16-70)
through (3.16-74), except that $g_{n+2} - g_{n+1}$ is now $g_{n+1} - g_n$, with the 'same' result:
first-order accurate.
o Finally, to finish, we examine BE applied to the true index-1 system:
$\frac{u_{n+1} - u_n}{\Delta t} + K u_{n+1} + G P_{n+1} = f_{n+1}$,   (3.16-91)
$L P_{n+1} = D(f_{n+1} - K u_{n+1}) - \dot{g}_{n+1}$,   (3.16-92)
which, assuming $D u_n = g_n$, implies $D u_{n+1} = D u_n + \Delta t\,\dot{g}_{n+1} \ne g_{n+1}$ in general. Eliminating
$P_{n+1}$ gives
$u_{n+1} = u_n + \Delta t\,\wp(f_{n+1} - K u_{n+1}) + \Delta t\,\dot{v}_{n+1}$,   (3.16-93)
the BE analog of (3.16-78). As above, we compute the LTE as
$d_n^{BE} = u(t_n) + \Delta t\,\wp\{f[u(t_{n+1})] - K u(t_{n+1})\} + \Delta t\,\dot{v}_{n+1} - u(t_{n+1})$   (3.16-94)
and use (3.16-89) to obtain
$d_n^{BE} = \frac{\Delta t^2}{2}\ddot{u}_{n+1} + O(\Delta t^3)$
$\quad = \frac{\Delta t^2}{2}\ddot{u}_n + O(\Delta t^3)$,   (3.16-95)
with no error contribution from $g(t)$. For the pressure, the analysis proceeds just as for
FE, see (3.16-80) through (3.16-83), with the final result that
$L e_n = \frac{\Delta t^2}{2} D(J - K)\ddot{u}_n + O(\Delta t^3)$;   (3.16-96)
as for FE on index 1, the pressure error is also locally $O(\Delta t^2)$.
This concludes our error analysis. We now appeal to authority (the DAE experts,
several of whom have already been cited) to state the following facts: in all four cases
studied above the local velocity errors, all of which are $O(\Delta t^2)$, accumulate during the
integration to give global errors one order lower, $O(\Delta t)$, as with ODE's. For pressure
(the algebraic variable), it turns out, luckily we add, that the $O(\Delta t)$ local pressure errors
for index 2 (= modified index 1) do not accumulate, so that pressure is also globally
first-order accurate. For the true index-1 systems, the pressure errors do accumulate, so
that the locally $O(\Delta t^2)$ accuracy degrades to first-order globally. Since also the discrete
continuity equation is not exactly satisfied for the true index 1 system (unless $\dot{g} = 0$), it
would seem, on balance, that the modified index 1 approach (= index 2) is the
way to go.
Finally, we summarize these results and those for some other methods in Table 3.16-1,
obtained mostly from Hairer et al. (1989) and Hairer and Wanner (1991), where Euler
refers to FE and BE (the latter going by the name Radau IA in the above citations). Also,
TR has the alias Lobatto IIIA (s = 2) and IMR is also called a Gaussian method with
s = 2; these alternate names come from 'quadrature rules.'
Final remarks:
(1) Some implicit Runge-Kutta (IRK) methods may also be useful for these DAE's, but
we refer the reader to the literature for details—particularly Brenan et al. (1996) and
Hairer and Wanner (1991). See also Arnold (1993) to see how error amplification
in the algebraic variable can occur when IRK is applied to index 2 DAE's.
(2) A perhaps useful intuitive notion to 'explain' the less-accurate local truncation error
for pressure in the index 2 formulation is that the momentum equation causes
pressure to 'look like' a sort of time-derivative of the velocity.
(3) If time-dependent free-surface fluid mechanics is contemplated, the matrices become
time-dependent—which introduces additional DAE difficulties—and the 'naive'
DAE solution procedures discussed herein may often experience difficulties. See
Ascher and Petzold (1991) and Brenan et al. (1996) for relevant theory and solution
methods.
Table 3.16-1 Errors in velocity/pressure.

Index 2 (or modified index 1)
Method      LTE(1)                          Global Error
Euler(2)    $\Delta t^2$/$\Delta t$         $\Delta t$/$\Delta t$
TR          $\Delta t^3$/$\Delta t^2$       $\Delta t^2$/$\Delta t^2$
BDF2        $\Delta t^3$/$\Delta t^2$       $\Delta t^2$/$\Delta t^2$
IMR(3)      $\Delta t^3$/$\Delta t$         $\Delta t^2$/1
AB2(4)      $\Delta t^3$/$\Delta t^2$       $\Delta t^2$/$\Delta t^2$
RK2(4)      $\Delta t^3$/$\Delta t^2$       $\Delta t^2$/$\Delta t^2$

Index 1
Method      LTE                             Global Error
Euler(5)    $\Delta t^2$/$\Delta t^2$       $\Delta t$/$\Delta t$
AB2         $\Delta t^3$/$\Delta t^3$       $\Delta t^2$/$\Delta t^2$
RK2         $\Delta t^3$/$\Delta t^3$       $\Delta t^2$/$\Delta t^2$
LF2(6)      ?                               ?

(1) If $\dot{g} = 0$, the pressure error is the same as that for velocity, both locally and globally.
(2) If BE is selected, please also select TR (and/or BDF2).
(3) The pessimistic result for pressure probably does not apply in our easier situation (linear, constant
coefficient constraint matrix; $DG = L$); from L. Petzold (personal communication).
(4) Not recommended for index 2.
(5) If FE is selected, please also select AB2 (and/or AB3).
(6) Not available in the DAE literature (we believe). It is probably the same as AB2.
d. Some numerical results (Taylor vortex)
With a large amount of help from D. Veyret, we verify some of the DAE solution methods
in this section via a pretty popular (and rare) analytical 2D solution of the NSE's. (We will
also include results using two algorithms that are fully-described in a later section—with
some apology.) Although restricted to periodic BC's, it nevertheless provides a useful
family of exact solutions—originated by G.I. Taylor (1923) and recently generalized by
Walsh (1991). After presenting the original solution and Walsh's significant
generalizations, we shall focus on the simplest member of the family in order to test some time
integrators. (We shall also, as a useful 'aside', obtain some results on spatial accuracy for
the 7 elements examined: namely $Q_1Q_0$, $Q_2P_{-1}$, $Q_2Q_1$, $Q_2Q_{-1}$, $P_2^+P_1$, $P_2^+P_{-1}$ and $P_2P_1$.)
The 'Taylor vortex' solution, as well as its generalizations by Walsh, is 'simply'
obtained via stream functions that are also eigenfunctions of the Laplacian operator in a
periodic domain (either $2\pi$-periodic or unit-square 'half-periodic'; we chose the latter):
$\nabla^2\psi = -\lambda\psi$ on $-\frac{1}{2} \le x, y \le \frac{1}{2}$,   (3.16-97)
with period 2. The eigenvalues are all of the simple form
$\lambda_{m,n} = \pi^2(m^2 + n^2)$   (3.16-98)
and the concomitant eigenfunctions are linear combinations of sums and products of
cosines and sines of $m\pi x$ and $n\pi y$ in the x- and y-directions. For example, the simplest
of them (in 2D, and our choice for a test problem) has $m = n = 1$ ($\lambda = 2\pi^2$) and the
single eigenfunction comprising cosines:
$\psi = -\cos\pi x\,\cos\pi y$.   (3.16-99)
Moving toward the 'other end of the spectrum', we present an eigenfunction derived by
Walsh (1991), with $\lambda = 625\pi^2$:
$\psi = \sin 25\pi x + \cos 25\pi y - \sin 24\pi x\,\cos 7\pi y + \cos 15\pi x\,\cos 20\pi y - \cos 7\pi x\,\sin 24\pi y$,   (3.16-100)
whose shape is so complex that only a fraction of the full cell can be easily shown; thus
Figure 3.16-1, from Walsh (1991), shows one-eighth of a full cell in each direction ($0 \le x, y \le 1/8$).
Clearly Walsh's generalization of the simpler Taylor 'vortex cells' provides
an incredibly rich and challenging family of test problems (!).
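That every term of Walsh's stream function shares the eigenvalue $625\pi^2$ follows from $25^2 = 24^2 + 7^2 = 15^2 + 20^2$. The short sketch below checks this arithmetic and, as an independent spot check, compares a finite-difference Laplacian of $\psi$ with $-\lambda\psi$ at one arbitrarily chosen point; the point, the step `h`, and the helper names are choices made only for this illustration.

```python
import math

PI = math.pi
LAM = 625.0 * PI**2

def psi(x, y):
    """Walsh's eigenfunction as written in (3.16-100)."""
    return (math.sin(25 * PI * x) + math.cos(25 * PI * y)
            - math.sin(24 * PI * x) * math.cos(7 * PI * y)
            + math.cos(15 * PI * x) * math.cos(20 * PI * y)
            - math.cos(7 * PI * x) * math.sin(24 * PI * y))

# (m, n) wavenumber pairs of the five terms, in the order they appear
wavenumbers = [(25, 0), (0, 25), (24, 7), (15, 20), (7, 24)]
sums = [m * m + n * n for m, n in wavenumbers]
print(sums)

def laplacian_fd(func, x, y, h=1e-4):
    """Second-order central-difference approximation of the Laplacian."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / h**2

x0, y0 = 0.031, 0.087              # arbitrary sample point
residual = laplacian_fd(psi, x0, y0) + LAM * psi(x0, y0)
print("residual of (3.16-97):", residual, " (lambda =", LAM, ")")
```

The residual is not exactly zero because of the finite-difference truncation error, but it should be small relative to $\lambda$.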
The rest of the solution is as follows (the stream function, $\psi_0$, represents initial data,
and is given, for example, by (3.16-99) or (3.16-100)):
$u = u_0 e^{-\lambda\nu t}$,   (3.16-101)
where
$u_0 = (\partial\psi_0/\partial y,\ -\partial\psi_0/\partial x)^T$,   (3.16-102)
$\psi = \psi_0 e^{-\lambda\nu t}$,   (3.16-103)
and, finally,
$P = P_0 e^{-2\lambda\nu t}$,   (3.16-104)
where $P_0$ comes from
$\nabla^2 P_0 = -\nabla\cdot(u_0\cdot\nabla u_0)$.   (3.16-105)
Fig. 3.16-1 $\psi(x, y)$ for 1/64 of a cell for $\lambda = 625\pi^2$.
This is the 'general' solution, given $\psi$ [from (3.16-97)], and it may already be evident
that this 'general' solution family is not quite as 'general' as one would desire; i.e. it is
actually rather special in several ways, thus perhaps somewhat reducing its viability as
an 'ideal' test problem.
1. The shape of the solution is time-independent; it merely decays in place—as do all
eigenfunctions of the 'heat equation'.
2. The pressure is not needed (in the continuum only) to keep $\nabla\cdot u = 0$; it is only
needed to (exactly) balance advection (a sort of 'Bernoulli' pressure), and is thus not
needed at all for Stokes flows.
3. Related to point 2, the acceleration and viscous terms are also in perfect balance for
all time; i.e., we really do have simple viscous decay (albeit div-free) solutions of the
(vector) heat equation.
4. The vorticity is proportional, via the eigenvalue, to the stream function:
$\omega = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = -\nabla^2\psi = \lambda\psi$.   (3.16-106)
We now return to the simplest version of this family of exact solutions, (3.16-99), and
$u = \pi\cos\pi x\,\sin\pi y\;e^{-2\pi^2\nu t}$,   (3.16-107)
$v = -\pi\sin\pi x\,\cos\pi y\;e^{-2\pi^2\nu t}$,   (3.16-108)
$p = \frac{\pi^2}{2}(\sin^2\pi x + \sin^2\pi y)\,e^{-4\pi^2\nu t}$,   (3.16-109)
with
$\omega = -2\pi^2\cos\pi x\,\cos\pi y\;e^{-2\pi^2\nu t}$,
where we note that, consistent with (3.16-104) and (3.16-105), the pressure decays at
twice the rate that velocity does. This is also a special case: the normal component of
both acceleration and viscous terms, from the general PPE BC of (3.10-8), are exactly
zero. In more general cases, we might expect to see the temporal decay rate of $P$ to be
twice that of $u$ (from advection in the PPE, which is 'quadratic' in velocity) away from
$\Gamma$, but (sometimes at least) approaching the same rate of decay as $u$ near or on $\Gamma$ because
of the linear (viscous) coupling there.
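As an independent check on the signs and decay rates of the simplest solution, the sketch below evaluates the continuity and momentum residuals by central differences at one arbitrary point. It assumes unit density, $\nu = 0.01$, and a pressure of the form $\frac{\pi^2}{2}(\sin^2\pi x + \sin^2\pi y)e^{-4\pi^2\nu t}$ (defined up to an additive constant); the sample point and difference step are arbitrary choices for the example.

```python
import math

PI, NU = math.pi, 0.01

def u(x, y, t):
    return PI * math.cos(PI * x) * math.sin(PI * y) * math.exp(-2 * PI**2 * NU * t)

def v(x, y, t):
    return -PI * math.sin(PI * x) * math.cos(PI * y) * math.exp(-2 * PI**2 * NU * t)

def p(x, y, t):
    return (0.5 * PI**2 * (math.sin(PI * x)**2 + math.sin(PI * y)**2)
            * math.exp(-4 * PI**2 * NU * t))

H = 1e-4   # finite-difference step (arbitrary)
def dx(q, x, y, t): return (q(x + H, y, t) - q(x - H, y, t)) / (2 * H)
def dy(q, x, y, t): return (q(x, y + H, t) - q(x, y - H, t)) / (2 * H)
def dt(q, x, y, t): return (q(x, y, t + H) - q(x, y, t - H)) / (2 * H)
def lap(q, x, y, t):
    return (q(x + H, y, t) + q(x - H, y, t) + q(x, y + H, t) + q(x, y - H, t)
            - 4 * q(x, y, t)) / H**2

def residuals(x, y, t):
    """Continuity and the two momentum residuals at one point."""
    div = dx(u, x, y, t) + dy(v, x, y, t)
    mom_x = (dt(u, x, y, t) + u(x, y, t) * dx(u, x, y, t)
             + v(x, y, t) * dy(u, x, y, t) + dx(p, x, y, t) - NU * lap(u, x, y, t))
    mom_y = (dt(v, x, y, t) + u(x, y, t) * dx(v, x, y, t)
             + v(x, y, t) * dy(v, x, y, t) + dy(p, x, y, t) - NU * lap(v, x, y, t))
    return div, mom_x, mom_y

res = residuals(0.2, -0.35, 1.3)
print("div, x-momentum, y-momentum residuals:", res)

# Pressure range (max - min over the cell) at t = 0 and t = 5
range_t0 = PI**2
range_t5 = PI**2 * math.exp(-4 * PI**2 * NU * 5.0)
print(range_t0, range_t5)
```

All three residuals should be at the level of the finite-difference error, and the pressure range decays from $\pi^2$ to roughly 1.37 at $t = 5$.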
Before embarking on our own numerical experiments, for which we use (only)
(3.16-106) through (3.16-109) on the unit span, we digress briefly to summarize what a
few others have done with this (and similar) 'Taylor vortex' problem(s). While (probably)
numerous investigators have taken a 'quick' look at this problem in order to verify, for
example, the alleged accuracy of their chosen time integration method [for example,
Kim and Moin (1985), Tau (1994)—and we make no attempts to be complete here],
D. Valentine and colleagues have pursued the problem in rather more depth, beginning
(we believe) with Mohamed et al. (1991), submitted in July 1988, in which they tested
a new code against a 16-cell (4 x 4) solution of square cells—using collocation on high-
order finite elements (Hermite bicubics) and TR ('Crank-Nicolson') for time integration
(plus Richardson extrapolation). They even used a 'smart' integrator and varied the
time step during the simulation—albeit not in the most cost-effective way. [They were
probably not aware of the much more efficient predictor-corrector method for TR (i.e.,
AB/TR), and employed a 'step-doubling' method that is more appropriate for other
ODE methods, such as Runge-Kutta.] In Valentine and Mohamed (1989), they proposed
the Taylor vortex problem (this time as a 2 x 2 array) as a numerical test problem
and performed some limited (numerical) hydrodynamic stability analyses by introducing
certain perturbations to the IC's. Finally (or, perhaps, more recently), in Valentine (1995)
a more serious set of numerical stability analyses was performed on a 4 x 4 array of
Taylor vortices—Valentine also introduced a more efficient finite difference method, which
produced vortex merging/capturing under certain conditions, depending on the form of
the initial perturbation. For related (classical) linear stability analyses, see Lin and Tobak
(1986), Thess (1992), and other references in Valentine (1995).
Returning now to our 'simple' version of the test problem, (3.16-106) through
(3.16-109), we begin by mentioning that we, like Valentine et al., did not invoke periodic
BC's in our simulation (but see Veyret et al., 1999); rather, in order to have time-varying
$g(t)$ in $Du = g$, we applied the analytic solution as time-dependent Dirichlet data on the
boundary of our domain, which is a more difficult BC, in fact.
This approach (simplification?), as well as that of comparing numerical solutions with
an analytical solution in general, is not without its own 'problems'; so we begin by
addressing them.
1. Since the GFEM interpolant of the IC given by (3.16-102) is not discretely divergence-free,
the interpolated initial data must be 'adjusted' in order to have a well-posed index 2
DAE system.
2. Whereas in the general case the GFEM interpolant of the normal component of the
specified velocity will not satisfy the hydrostatic pressure mode's solvability constraint
(3.13-31) even when the specified normal velocity does ($\int_\Gamma n\cdot u = 0$), in this case we
are 'lucky' because $n\cdot u = 0$ on $\Gamma$. Thus, simply maintaining $n\cdot u = 0$ on $\Gamma$ gives a
well-posed problem.
3. But the 'tangential consequence' of holding the analytic Dirichlet BC's is also
significant, though not fatal (the problem is still well-posed). It turns out that the largest
error between the adjusted (via an $L_2$-projection, typically) and interpolated velocity occurs
'one-node-in' from $\Gamma$, a consequence of our not permitting adjustment of the tangential
BC, a constraint (along with the normal constraint) that we chose to impose but didn't
need to (at least for an $L_2$-projection).
In every case except that using forward Euler (FE), we first adjusted the interpolated
data via an $L_2$-projection to the div-free subspace, per Appendix 3, which, it is
important to realize, injects an error at $t = 0$ when comparing the results with the analytic
solution, one of the 'problems' referred to earlier. But since the code employed (FIDAP;
see Fluid Dynamics International (1993)), probably like many others, does not have the
$L_2$-projection 'machinery' built-in (a la Section 3.10.2), we explain next how such a
projection can be well-approximated via BE, as mentioned in the Remarks following
(3.16-21), which is what we did. Following on from Remark (2) there, we have
$L P_{n+1} = D f_{n+1} - (g_{n+1} - D u_n)/\Delta t$,
which we re-write, for $n = 0$, as
$L(P_1^p + \phi/\Delta t) = D f_1 - (g_1 - g_0)/\Delta t + (D u_0 - g_0)/\Delta t$,
in which we have defined $P_1 = P_1^p + \phi/\Delta t$, where $P_1^p$ corresponds to the 'physical' (true)
pressure and $\phi$ is the Lagrange multiplier associated with the $L_2$-projection; i.e. we
'interpret' the first time-step result via the two equations
$L P_1^p = D f_1 - (g_1 - g_0)/\Delta t$,   (3.16-110)
the PPE at $t_1 = \Delta t$, and
$L\phi = D u_0 - g_0$,   (3.16-111)
whose separation is not generally available from the $P_1$ actually computed (but could be).
For small-enough $\Delta t$, what the code calls $P_1$ we must interpret as $\phi/\Delta t$. The resulting
velocity is (see too Appendix 3)
$u_1 = \wp(u_0 + \Delta t f_1) + G L^{-1} g_1$,   (3.16-112)
where $\wp = I - G L^{-1} D$, which, of course, satisfies $D u_1 = g_1$ for any value of $\Delta t$, but it is
only a good approximation to the $L_2$-projection of $u_0$ for $\Delta t \to 0$ (specifically, we need
$\Delta t\,\|f_1\| \ll \|u_0\|$), for which (3.16-112) is equivalent to (A3.3-43) in Appendix 3. Note
too that a second small BE step is required in order to obtain a true/physical pressure, and
this is what we do; all (implicit) runs begin after two small BE steps of size $\Delta t_0 = 10^{-5}$.
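The role of a small BE step as an approximate projection can be illustrated with a toy linear-algebra sketch. Everything here is an assumption made for the example: lumped mass ($M = I$), a random stand-in $G$ with $D = G^T$ (so that the projection is the ordinary Euclidean one), and $f_1$ treated as a known vector rather than a function of $u_1$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 12, 4
G = rng.standard_normal((n, m))     # stand-in gradient (lumped mass, M = I)
D = G.T
L = D @ G
proj = np.eye(n) - G @ np.linalg.solve(L, D)     # projection matrix I - G L^{-1} D

u0 = rng.standard_normal(n)         # 'interpolated' data; D u0 != g in general
g1 = rng.standard_normal(m)         # made-up constraint data at t_1
f1 = rng.standard_normal(n)         # f at t_1, treated here as a known vector

# Closest vector to u0 (Euclidean norm) that satisfies D u = g1
u_l2 = u0 - G @ np.linalg.solve(L, D @ u0 - g1)

def be_step(dt):
    """One BE step using L P_1 = D f_1 - (g_1 - D u_0)/dt."""
    P1 = np.linalg.solve(L, D @ f1 - (g1 - D @ u0) / dt)
    return u0 + dt * (f1 - G @ P1)

for dt in (1e-1, 1e-3, 1e-5):
    u1 = be_step(dt)
    print(dt, np.linalg.norm(D @ u1 - g1), np.linalg.norm(u1 - u_l2))
```

The constraint is met for every step size, while the distance to the projected vector shrinks in proportion to `dt`, which is the behaviour described for (3.16-112).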
All results herein are performed for $\nu = 0.01$, giving a Reynolds number of O(100);
$Re_{max} = u_{max}H/\nu = \pi \times 1.0/0.01 = 314$ at $t = 0$. For results at other values of $\nu$, as well
as variations in other problem parameters, see Veyret et al. (1998).
Figure 3.16-2a shows a typical mesh used for all simulations to follow. It has
$2 \times 24^2 = 1152$ quadratic triangular elements, $24^2 = 576$ biquadratic quadrilateral elements
(obtained by removing the 'diagonals'), and the equivalent (and nearly equal, but not
shown) bilinear element mesh has $48^2 = 2304$ elements. The range of $\Delta x$ and $\Delta y$ is
from 0.0083 (corners) to 0.0415 (center). All but the $P_2^+P_{-1}$ and $P_2^+P_1$ meshes have
$49^2 = 2401$ total velocity nodes; the two remaining quadratic 'triangles' have 1152 ($2 \times 24^2$)
more, because of the bubble functions. The number of pressures is as follows: $Q_1Q_0$
and $Q_2Q_{-1}$ have $4 \times 576 = 2304$, $Q_2P_{-1}$ has $3 \times 576 = 1728$, $Q_2Q_1$ has $25^2 = 625$,
$P_2^+P_{-1}$ has $3 \times 2 \times 24^2 = 3456$, $P_2^+P_1$ and $P_2P_1$ have $25^2 = 625$. (Note the relative
'paucity' of pressures for the two Taylor-Hood elements and the relative 'abundance' for
$P_2^+P_{-1}$.) The total number of degrees of freedom (ignoring BC's) thus covers the range
of $2 \times 2401 + 625 = 5427$ for $Q_2Q_1$ and $P_2P_1$ (Taylor-Hood elements) to
$2 \times (2401 + 2 \times 24^2) + 3456 = 10562$ for $P_2^+P_{-1}$. (Our favorite 2D element, $Q_2P_{-1}$, lies toward the
low end, with $2 \times 2401 + 1728 = 6530$.) These differences also show up in CPU costs
and should be borne in mind when comparing results; indeed, $P_2^+P_{-1}$ was about three
times as costly as $Q_1Q_0$ and about twice as costly as $Q_2P_{-1}$, $P_2P_1$, and $Q_2Q_1$. Also, when
$Q_1Q_0$ (with $2 \times 2401 + 48^2 = 7106$ unknowns) is run in 'penalty mode', there are only
$2 \times 2401 = 4802$ degrees of freedom, and this was indeed the least costly in CPU time
(when measured against other implicit ODE methods).
Fig. 3.16-2 A single Taylor vortex. (a) The mesh for triangular elements. (b) Interpolated analytic velocity at t = 0. (c) Analytic vorticity at t = 0 (contour separation = 1.406). (d) Analytic pressure at t = 0 (contour separation = 0.705). (e) Initial pressure for $Q_1Q_0$ (contour separation = 0.705). (f) Pressure from $Q_1Q_0$ at t = 5 (contour separation = 0.097).
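The degree-of-freedom totals quoted above are simple arithmetic on the mesh counts, and can be reproduced as follows. The dictionary keys are plain-text labels for the elements, and the variable names are chosen only for this sketch.

```python
nodes = 49**2                 # velocity nodes shared by most of the meshes
quads = 24**2                 # biquadratic quadrilateral elements
tris = 2 * 24**2              # quadratic triangular elements
bilinear = 48**2              # bilinear quadrilateral elements
vertex_pressures = 25**2      # continuous (vertex) pressures, Taylor-Hood type

dof = {
    "Q2Q1": 2 * nodes + vertex_pressures,
    "P2P1": 2 * nodes + vertex_pressures,
    "Q2P-1": 2 * nodes + 3 * quads,
    "Q1Q0": 2 * nodes + bilinear,
    "Q1Q0 (penalty)": 2 * nodes,
    "P2+P-1": 2 * (nodes + tris) + 3 * tris,   # one bubble node per triangle
}

for name, count in dof.items():
    print(f"{name:15s} {count}")
```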
Figures 3.16-2(b), (c) and (d), respectively, show the IC from the exact solution
[(3.16-99) and (3.16-109)] for velocity, vorticity [stream function, too; see (3.16-106)],
and pressure. Figure 3.16-2(e) shows the 'projected' initial pressure (2 BE steps at
$\Delta t = 0.00001$, the second step required to get a true pressure) for $Q_1Q_0$ (representative),
showing very small differences from the exact pressure; for example, whereas the range of
$P_0$ is $\pi^2 = 9.8696$ from the exact solution, it is ~9.8743 in Figure 3.16-2(e). The
simulations were run to $t = 5$, for which the $Q_1Q_0$ pressure result is shown in Figure 3.16-2(f).
Clearly the shape has been properly retained, and the range of pressure is ~1.359
vis-a-vis the exact: $\pi^2 e^{-4\pi^2 \times 0.01 \times 5} \approx 1.371$. Again this is 'representative' of all other results,
and thus we show no more [until our planned later publication on the subject, Veyret
et al. (1999)]. It seems that, in general, all elements performed quite well on this (easy?)
problem, at least in the augen-norm. We will see some differences later.
We now turn to a detailed comparison of several time integrators and a less detailed, but
still useful, comparison of several 'elements' (spatial accuracy). We measured the error
in two norms, the first of which we call the PDE error; for the velocity, it is given by
$e_v(t) = \sqrt{\frac{1}{N}\sum_j [u_j(t) - u(x_j, t)]^2 + [v_j(t) - v(x_j, t)]^2}$,   (3.16-113)
where $u(x_j, t)$ is the x-component of the exact solution at node $j$, etc., and $N$ is the
number of (internal) nodes in the domain. For pressure, we use the same type of RMS
norm after forming an (element) area-weighted average of the 'raw' pressures at the same
velocity nodes and after 'normalizing' the results via a reference pressure, defined to be
the pressure at the node closest to $x = y = \frac{1}{4}$, which is one-half of the distance between
the pressure extrema of the analytic solution; i.e. we have
$e_p(t) = \sqrt{\frac{1}{N}\sum_j [p_j(t) - p(x_j, t) - p_J(t) + p(x_J, t)]^2}$,   (3.16-114)
where $J$ is the node located closest to $(\frac{1}{4}, \frac{1}{4})$.
Remark:
These PDE norms are perhaps better called 'lazy-person's' (or discrete) PDE norms, as
the true $L_2$-norm would involve integrals of the squares of the differences rather than sums.
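A minimal sketch of these discrete norms, assuming nodal values are held in flat NumPy arrays and that the reference node index is supplied by the caller, might look as follows. The function names, the uniform sample grid, and the synthetic 'numerical' data are all hypothetical and serve only to exercise the formulas.

```python
import numpy as np

def velocity_error(u_h, v_h, u_ex, v_ex):
    """Discrete RMS velocity error in the form of (3.16-113)."""
    return np.sqrt(np.mean((u_h - u_ex)**2 + (v_h - v_ex)**2))

def pressure_error(p_h, p_ex, ref):
    """Discrete RMS pressure error in the form of (3.16-114).

    Both fields are first shifted by their value at the reference node `ref`,
    so that an arbitrary additive pressure constant does not count as error.
    """
    diff = (p_h - p_h[ref]) - (p_ex - p_ex[ref])
    return np.sqrt(np.mean(diff**2))

# Synthetic usage on a uniform grid (illustration only)
x, y = np.meshgrid(np.linspace(-0.5, 0.5, 21), np.linspace(-0.5, 0.5, 21))
x, y = x.ravel(), y.ravel()
u_ex = np.pi * np.cos(np.pi * x) * np.sin(np.pi * y)
v_ex = -np.pi * np.sin(np.pi * x) * np.cos(np.pi * y)
p_ex = 0.5 * np.pi**2 * (np.sin(np.pi * x)**2 + np.sin(np.pi * y)**2)

ref = int(np.argmin((x - 0.25)**2 + (y - 0.25)**2))   # node nearest (1/4, 1/4)
e_v = velocity_error(u_ex + 0.01, v_ex, u_ex, v_ex)   # uniform offset in u only
e_p = pressure_error(p_ex + 3.0, p_ex, ref)           # constant pressure shift
print(e_v, e_p)
```

A uniform offset of 0.01 in one velocity component gives an error of 0.01, and a constant pressure shift gives zero error, as the normalization intends.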
For the second norm, called the ODE norm, we discard the exact solution and define
'truth' as the solution at the smallest $\Delta t$ employed, usually 0.001. Thus,
$d_v(\Delta t) = \sqrt{\frac{1}{N}\sum_j [u_j(\Delta t) - u_j(\Delta t_s)]^2 + [v_j(\Delta t) - v_j(\Delta t_s)]^2}$,   (3.16-115)
where $\Delta t_s$ is the smallest time step, with a similar equation for the ODE pressure error
norm. (Actually, because of symmetry, we only used $u_j$ in the ODE velocity norm
definition, thus giving a $d_v$ that is low by a factor of $\sqrt{2}$.) This approach should allow us
to verify the ODE/DAE theory if two criteria are satisfied: (i) $\Delta t_s \ll \Delta t$ and (ii) $\Delta t$ is
'sufficiently small'. As will be seen, this approach generally works quite well, at least
for velocity.
All linear systems were solved with one or another form of Gaussian elimination and
the fixed-point iterative method of Picard [functional iteration/successive substitution; see
Vol. II or Reddy and Gartling (1994)] was used for the nonlinear algebraic systems, using
a reasonably tight convergence criterion ($\epsilon = 0.001$, where we enforce both
$\|\Delta x\|/\|x_i\| \le \epsilon$ and $\|f(x_i)\|/\|f_0\| \le \epsilon$ when solving $f(x) = 0$. It turns out that most of the runs needed
but one iteration; only those with large $\Delta t$ required more, two or three, suggesting
perhaps again that the 'fluid dynamics' is not particularly difficult.)
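The double convergence test can be sketched on a toy nonlinear system written as $A(x)x = b$, with each Picard sweep solving the linear system frozen at the previous iterate. The model problem (a Burgers-like implicit step with $A(x) = I + \Delta t\,\mathrm{diag}(x)$), the function names, and the iteration cap are assumptions made for this illustration, not the code used for the results reported here.

```python
import numpy as np

def picard(A_of, b, x0, eps=1e-3, max_iter=50):
    """Successive substitution for F(x) = A(x) x - b = 0.

    Stops when both ||x_new - x|| / ||x_new|| <= eps and
    ||F(x_new)|| / ||F(x0)|| <= eps, mirroring the two criteria in the text.
    Returns the converged iterate and the number of sweeps used.
    """
    x = np.asarray(x0, dtype=float).copy()
    r0 = np.linalg.norm(A_of(x) @ x - b)
    if r0 == 0.0:
        return x, 0                       # initial guess already solves the system
    for it in range(1, max_iter + 1):
        x_new = np.linalg.solve(A_of(x), b)           # matrix frozen at old iterate
        rel_change = np.linalg.norm(x_new - x) / np.linalg.norm(x_new)
        rel_resid = np.linalg.norm(A_of(x_new) @ x_new - b) / r0
        x = x_new
        if rel_change <= eps and rel_resid <= eps:
            return x, it
    raise RuntimeError("Picard iteration did not converge")

# Toy implicit step: x + dt * x**2 = x_old, i.e. A(x) = I + dt * diag(x)
dt_step = 0.01
x_old = np.array([1.0, 0.5, 2.0])
A_of = lambda x: np.eye(x.size) + dt_step * np.diag(x)

x_sol, sweeps = picard(A_of, x_old, x_old)
print(x_sol, sweeps)
```

For small step sizes the frozen matrix changes little between sweeps, so only a few sweeps are needed, in line with the observation in the text.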
We begin by comparing three quadrilateral elements for both BE and TR time
integrators, for velocity, in the PDE norm in Figure 3.16-3 for three different times:
$t = 1, 3, 5$, where, from (3.16-107), we note that the decay time constant, $\tau = 1/(2\pi^2\nu)$, is
very close to 5, so that the simulation stops after one time constant [for the pressure the
time constant is one-half of this, a la (3.16-109)]. For 'comparison' purposes, the RMS (nodal)
norm of the exact velocity at these three times is 1.72, 1.16, and 0.78, respectively; it
is 2.09 at $t = 0$. The penalty method (ODE's, penalty parameter = $10^9$) is shown for
$Q_1Q_0$, whereas fully-coupled mixed interpolation (DAE's) is used for the rest of the
presentation. This is just to make the point that penalty 'works': the error graphs for
the fully-coupled $Q_1Q_0$ result (not shown) are virtually the same in
the 'augen-norm' (i.e. by visual comparison), except for the smaller values of $\Delta t$, where
the combination of small $\Delta t$ and large penalty parameter seems to sometimes cause
differences (in the errors) of ~10%. In these figures, and (most of) those to follow, we
placed a line with slope = 1 on all BE results and one with slope = 2 for the TR results
to test agreement with theory. Also noteworthy from Figure 3.16-3 are the following.
1. When $\Delta t$ is sufficiently small (a mesh-dependent determination/definition, naturally),
the error curves flatten out, indicating that the error is spatial only, a situation that is
realized much sooner (larger $\Delta t$) for TR than for BE; typically, for the mesh chosen,
this 'convergence' $\Delta t$ is ~0.10 or so for TR (~50 steps per time constant) and more
like 0.01 (500 steps per time constant) for BE. This 10-fold increase in cost is truly that
because the CPU costs per time-step for TR and BE are within a percent or so of being
equal. (Later, when we present variable-step/smart integration results, we will see similar
accuracy for significantly fewer time steps to get to $t = 5$: 15-20 for TR and about twice
that for BE.)
2. By the same token, only when $\Delta t$ is sufficiently large can we hope to see temporal
error only. For $Q_1Q_0$, for example, this appears to be $\Delta t > \sim 0.03$ for BE and $\Delta t > \sim 0.3$
for TR. Only when temporal error 'totally' dominates spatial error can a PDE error norm
be compared to DAE theory.
[Fig. 3.16-3 PDE velocity error $e_v$ versus $\Delta t$ for 3 quadrilateral elements: (a) $Q_1Q_0$ via BE (penalty); (b) ditto (a) except TR; (c) $Q_2P_{-1}$ via BE; (d) ditto (c) except TR; (e) $Q_2Q_1$ via BE; (f) ditto (e) except TR. Here, and in the figures to follow, □ denotes $t = 1$, • denotes $t = 3$, and △ denotes $t = 5$.]
3. The 'flat' portion of the curves (spatial error only) shows that, for this problem at
least, $Q_2P_{-1}$ [and $Q_2Q_{-1}$, which gave nearly identical results (3-4 significant figures),
not shown] is significantly more accurate than $Q_1Q_0$, which itself is significantly more
accurate than $Q_2Q_1$. [This latter result may surprise a few readers, but we have become
convinced that this is just another example of a true thing, or two, actually: $Q_1Q_0$
is often 'very' accurate and $Q_2Q_1$ is often very inaccurate. Even though the 9-node
element is generally much more accurate than the 4-node for 'simple' (scalar) PDE's, the
mixed-interpolation requirement of the NSE's drags down the 9-node element's
('deliverable') accuracy (for $Q_2Q_1$) because of the poor job it does on the pressure (and $\nabla \cdot u$).]
Note that a $\Delta t$ large enough that TR needs only 17 steps to reach $t = 5$ has virtually removed all
temporal error for $Q_2Q_1$ (Figure 3.16-3(f)), and that, in marked contrast, BE needs a $\Delta t$
of 0.001 (5000 steps to $t = 5$!) to get to the 'same' point when using the good 9-node
element, $Q_2P_{-1}$, which 'point' has more than an order of magnitude smaller error than
$Q_2Q_1$. The 'best' accuracy attainable on the chosen mesh is that using $Q_2P_{-1}$, and its
most cost-effective attainment (for fixed $\Delta t$ integration and mixed interpolation) is via
TR with $\Delta t = 0.1$ ($\lambda\Delta t = 0.02$), until we apply the 'smart' version of TR; see below.
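The slope-1 and slope-2 reference lines used in these figures can be illustrated on the scalar model $\dot y = -\lambda y$. The sketch below is an assumption-laden stand-in for the real study: it uses $\lambda = 0.2$ (so that $\lambda\Delta t = 0.02$ at $\Delta t = 0.1$, as quoted above), has no spatial error at all, and the helper names are hypothetical.

```python
import math

def integrate(lam, dt, t_end, theta):
    """Theta-method for y' = -lam*y, y(0) = 1.

    theta = 1.0 is backward Euler (BE); theta = 0.5 is the trapezoid rule (TR).
    """
    n = round(t_end / dt)
    amp = (1.0 - (1.0 - theta) * lam * dt) / (1.0 + theta * lam * dt)
    return amp ** n

def observed_order(e_coarse, e_fine, dt_coarse, dt_fine):
    """Slope of the error curve on a log-log plot between two step sizes."""
    return math.log(e_coarse / e_fine) / math.log(dt_coarse / dt_fine)

lam, t_end = 0.2, 5.0
exact = math.exp(-lam * t_end)
orders = {}
for name, theta in (("BE", 1.0), ("TR", 0.5)):
    errs = [abs(integrate(lam, dt, t_end, theta) - exact) for dt in (0.1, 0.05)]
    orders[name] = observed_order(errs[0], errs[1], 0.1, 0.05)
```

In a PDE norm the same slope estimate is only meaningful where temporal error dominates; once the curves flatten, the computed 'order' drops toward zero.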
Moving now to Figure 3.16-4, we first discuss the pressure errors for $Q_2P_{-1}$, for BE
(Figure 3.16-4(a)) and TR (Figure 3.16-4(b)). There are two things worthy of discussion,
the first of which is obvious.
(i) The pressure error is much larger than the velocity error, such that even BE has
'converged' for $\Delta t$ just less than 0.10 (and TR by the same large $\Delta t$ as for velocity).
(ii) These pressure errors are rather 'representative' of those for all elements tested; i.e.
the large spatial errors are quite close to each other. (And for this reason we will
not show the others.)
The reason for this behavior, we believe, is that the 'DAE pressure' dances to a
different drummer than does the PDE pressure, beginning with the fact that the largest
error occurs at $t = 0^+$; i.e. the pressure error associated with the projected velocity is
both large (relative to that for velocity) and nearly element-independent. We are not so
surprised by the former result, since the pressure is well-known to be a rather 'sensitive'
variable, responding with relatively large changes to relatively small changes in velocity.
But the latter (element 'independence') is still rather puzzling, an issue we plan to return
to in Veyret et al. (1999).
The rest of Figure 3.16-4 shows the velocity error for two of the better triangular
elements, and we see that $P_2^+P_{-1}$ is very similar in accuracy to $Q_1Q_0$ (but at several
times the cost) and that $P_2^+P_1$ is close to $Q_2Q_1$ (also in cost), the latter result being less
surprising than the former. [We have also run the $P_2P_1$ (Taylor-Hood) element, with the
following results (not shown): the PDE velocity error looked a lot like that from either
$Q_2Q_1$ or $P_2^+P_1$; the remaining three types of error curves looked very much like those
from any other element.]
In Figure 3.16-5 we show the PDE error for 'Projection 2' with semi-consistent mass
($Q_1Q_0$) (see Section 3.16.6c) and the first of the ODE errors. Figures 3.16-5(a) and (b)
present us, and perhaps the rest of the world, with our biggest surprises in this study:
[Fig. 3.16-5 PDE velocity error for Projection 2, and several ODE errors, all plotted versus $\Delta t$: (a) $Q_1Q_0$ via BE and Projection 2; (b) ditto (a) except TR; (c) ODE velocity error, $Q_2P_{-1}$ and BE; (d) ditto (c) except TR; (e) ODE pressure error, $Q_2P_{-1}$ and BE; (f) ditto (e) except TR.]
1. Both BE and TR behave very nearly the same, clearly violating any remaining notions
one might have of applying ODE or DAE theory to projection methods of this type.
2. The errors are large relative to those from the more 'honest' methods (by factors of
5-10), i.e. time integration of the DAE's (cf. Figures 3.16-3(a) and (b)). We believe that
we can explain the relatively large spatial error, and we do not believe it is a code 'bug'
because of the following: whereas the results in Figures 3.16-3(a) and (b) were obtained
with a code (PASTIS) written by H. Daniels while spending a year at LLNL, these results
were closely duplicated (at least in the range $0.01 \le \Delta t \le 0.10$) by running the FIDAP
code in the manner discussed in Section 3.16.7; i.e., using the segregated solver with
just 'one pass' (no iterations) is an 'algorithm' that is (at least for sufficiently small $\Delta t$)
very close to that of Projection 2. One of the obvious differences is that the
Projection 2 code uses 1-point quadrature on the element matrices, which is less accurate (but
perhaps more cost-effective). Another difference is in the treatment of advection: whereas
FIDAP does it 'honestly' (triple-product integrals, etc.), PASTIS uses the 'centroid
advection' approximation (with only double-product integrals) discussed in Section 3.16.5. This
combination of differences 'explains' the larger error. [Note too that when $\Delta t$ was
sufficiently small with FIDAP (in the 1-pass mode), the 'spatial only' error, attained for
$\Delta t \le \sim 0.001$, agreed 'completely' with that using BE or TR in Figure 3.16-3.] A related
conclusion from this comparison is thus this: at least the extra effort of employing 'honest
GFEM' does deliver greater accuracy.
3. The two straight lines shown in both of the figures have slopes of 1.5 and 2.0 because
it appears possible that $e_v \sim \Delta t^{1.5}$ for early time ($t = 1$) and $e_v \sim \Delta t^2$ for 'late' time
($t = 5$), for both TR and BE. We are just not sure.
4. The $\Delta t$ range is one order of magnitude smaller than all the others, partly because
larger $\Delta t$ caused much larger errors (and convergence difficulties), presumably caused by
the spurious boundary layer (e.g. $\sqrt{\nu\Delta t} = 0.1$ for $\Delta t = 1$; see Section 3.16.6c) of this
projection method. (Clearly the results presented for $\Delta t < 0.001$ are superfluous, until
we examine the ODE errors.)
The rest of Figure 3.16-5 shows the ODE error for $Q_2P_{-1}$, for both BE and TR for
both velocity and pressure. We point out that, just as pressure PDE errors are nearly
element-independent, so too are the ODE errors, but this time for both $P$ and $u$ (and even
more so for $u$, for which they are virtually identical for all elements examined). Again,
except to conjecture that the 'spatial' differences are 'removed' (hidden?) by our
'normalization' ('truth' $\Delta t = 0.001$, one for each element), we are at a loss to explain such close
behavior, since each element employed generates a different set of DAE's and because the
differences when compared to the PDE solution are quite noticeable. Especially confusing
is the ODE pressure error for TR at small $\Delta t$: besides being 'inverted' from the rest of
the results (largest errors at large time rather than smallest errors at large time), they
display a nearly $\Delta t$-independent behavior over much of the range. We did verify that the
$d_p$ results do (finally) bend over and go to zero at the 'truth' $\Delta t_s$ of 0.001, although
the error is still between $10^{-6}$ and $10^{-5}$ for $\Delta t$ as small as 0.0012; the turn-down is
very steep. Also interesting (and also not understood) is why the BE velocity errors do
not spread out/separate with time (smaller error at larger $t$) as the solution decays; TR is
much more reasonable in this regard. Finally, we mention that, as a check, we defined
the 'truth' solution for BE as the (more accurate) TR solution at $\Delta t = 0.001$; the result
(not shown) would be three more points on the curve in Figure 3.16-5(c) at $\Delta t = 0.001$,
with values between 1 and $2 \times 10^{-5}$, which is reassuring: it reinforces the 'assumption'
that virtually all of the error is spatial for both TR and BE for $\Delta t = 0.001$.
Finally, Figure 3.16-6 shows the ODE errors from 'Projection 2' and the two PDE
velocity errors for FE (see Section 3.16.5), all with $Q_1Q_0$. Whereas the ODE velocity
errors for Projection 2 using TR seem to be in accord with what we believe to be
known theoretically, those from BE are, like the PDE errors, rather surprising. For
large $\Delta t$, they look very much like those from TR (slope 2 and similar magnitude), and
display a painfully slow transition toward a slope of unity (the smallest 'local' slope in
Figure 3.16-5(a) is ~1.2, and the two straight lines have slopes 1 and 2), where we point
out that the 'truth' $\Delta t$ for the BE result is very very small (1/30000), needed to convince
us that the asymptote, at least, does appear to agree with BE theory. ['Projection 1' (see
Section 3.16.6b) behaves more like expectations, and its slope is 1 for both BE and TR;
see Veyret et al. (1998).] Turning to the ODE pressure errors, in Figures 3.16-6(c) and
3.16-6(d), we seem to see the following behavior: BE has a slope 'near' 2 for 'large' $\Delta t$
and 'near' 1 for small $\Delta t$ (the two lines shown have slopes 1 and 1.5); TR may have a
slope near 2 for large $\Delta t$, it seems to show $O(\Delta t^{1.5})$ for some ill-defined 'intermediate'
range of $\Delta t$, and may be heading toward a slope of 1 for small $\Delta t$. The 3 curves in
Figure 3.16-6(d) have slopes of 1, 1.5, and 2. Note that Shen (1996) also showed some
experimental results that seemed to have a slope of ~1.5. Note also that Van Kan (1986),
after demonstrating second-order accuracy for $u$, had this to say about $P$: 'For the pressure
the results appeared to be less trustworthy.' All-in-all, it seems that Projection 2 does not
fit comfortably into any 'niche' regarding temporal accuracy.
The last 2 curves in this figure show some FE results (with a line of slope 1
drawn on Figure 3.16-6(f)), which, unlike the BE and TR results presented earlier,
are obtained from the index 1 DAE's (PPE 'method'). (We did not invoke BTD, see
Section 3.16.5, in order to keep the DAE system 'honest'.) The semi-theoretical stability
limits for FE, based on the FDM version of the ODE's and given in (2.7-28) and
(2.7-29) for constant velocity on a uniform mesh, are approximately as follows:
(i) $\Delta t \le \Delta x_{\min}^2/4\nu = (0.0083)^2/0.04 \approx 0.0017$ and (ii) $\Delta t \le 0.02/(\pi^2 + 0) = 0.0020$. The
empirical/experimental result is $\Delta t_{\max} = 0.0045$-$0.0050$. But the important result is that
all stable time steps are necessarily so small that spatial error always dominates. (BTD
would help, but not much: $\Delta t_{\max}$ from (2.1-41) is ~0.01.) Also worth noting, by comparing
Figure 3.16-6(e) to Figure 3.16-3(a) or (b), is that mass lumping ('required' for explicit
time integration) has increased the spatial error several-fold. Finally, note that the ODE
error agrees well with its implicit counterpart: Figure 3.16-6(f) vis-a-vis Figure 3.16-5(c).
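The arithmetic behind the two quoted stability limits can be checked directly. In this sketch $\nu = 0.01$ is inferred from the quoted factors $4\nu = 0.04$ and $2\nu = 0.02$, and reading the second bound as $2\nu/(u^2 + v^2)$ with $u = \pi$, $v = 0$ is an assumption based on the printed denominator $(\pi^2 + 0)$; (2.7-28) and (2.7-29) themselves are not reproduced here.

```python
import math

nu = 0.01            # viscosity inferred from 4*nu = 0.04 and 2*nu = 0.02
dx_min = 0.0083      # smallest mesh spacing quoted in the text
u, v = math.pi, 0.0  # velocity components matching the quoted (pi^2 + 0)

# (i) diffusive limit: dt <= dx_min^2 / (4 nu)
dt_diffusive = dx_min ** 2 / (4.0 * nu)

# (ii) advective-diffusive limit, read here as dt <= 2 nu / (u^2 + v^2)
dt_advective = 2.0 * nu / (u ** 2 + v ** 2)

# The more restrictive of the two semi-theoretical bounds
dt_limit = min(dt_diffusive, dt_advective)
```

Both bounds are well below the empirically observed limit of 0.0045 to 0.0050 reported in the text, so they act as conservative estimates for this mesh.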
Figure 3.16-7 shows the divergence error (contours of $C^T u$) that goes with the
interpolated (but not projected, which is not necessary) IC used in the FE runs. It is actually quite
small and, of course, retains the values shown in the figure during the entire integration.
[The solution (vectors, etc.) from the FE runs were visually as good as those that are
div-free.] We conclude the FE discussion with the following remark: it is interesting to
realize that the index 1 'property' of preserving the divergence error (see Sections 3.10.1,
3.13.4, 3.16.1b, 3.16.2c, and 3.16.2g) will actually preclude the FE time integration from
attaining the final steady-state of no-flow. For the case above and a long time
integration, the FE PDE error, which starts at $\sim 1.3 \times 10^{-4}$, increases for a while, peaks at
$t = 3$ at $\sim 6 \times 10^{-4}$, then decreases monotonically until $t = 30$ or so (at which time the
true velocity is $\approx \pi e^{-2\pi^2\nu t} = 0.008$), after which it levels off at $\sim 6 \times 10^{-6}$, because
$C^T u(t) - g(t) = C^T u_0 - g_0 \ne 0$, per (3.16-7).
[Fig. 3.16-6 Several other errors using $Q_1Q_0$, all plotted versus $\Delta t$: (a) ODE velocity error, Projection 2, BE; (b) ditto (a) except TR; (c) ODE pressure error, Projection 2, BE; (d) ditto (c) except TR; (e) PDE velocity error via FE; (f) ditto (e) except ODE error.]
[Fig. 3.16-7 Divergence error (contours of $C^T u$) for the interpolated IC; the labeled contour level is $-1.44 \times 10^{-6}$ (contour interval $= 0.288 \times 10^{-6}$).]
To close our discussion of 'spatial only' errors, we show in Table 3.16-2 the projection
(PDE) errors for velocity and pressure (2 BE steps at $\Delta t = 10^{-5}$, called 'Initial Velocity
Error') and the range of errors from the ($\Delta t$-converged) results at $t = 1$, 3, and 5.
From these results, we seem to see the following.
1. The pressure error is virtually element-independent. We have no good explanation
for this, nor for the anomalous behaviour of $P_2^+P_{-1}$, which, at $t = 0^+$, simultaneously
displays the largest velocity error and the smallest pressure error.
2. The pressure errors are much larger than those for velocity, and are largest initially.
3. Elements $Q_1Q_0$ and $P_2^+P_{-1}$, which generally have performed rather similarly, display
nearly the same projection error as the $\Delta t$-converged velocity error, and both (at $t = 0^+$)
are relatively rather large.
Table 3.16-2 Some element errors.

Element                        10^4 x Initial    Initial           10^4 x Range of   Range of
                               Velocity Error    Pressure Error    Velocity Error    Pressure Error
$Q_1Q_0$                       1.3               0.026             1-2               0.003-0.017
$Q_2P_{-1}$                    0.056             0.020             0.2-0.9           0.003-0.014
$Q_2Q_{-1}$                    0.056             0.020             0.2-0.9           0.003-0.014
$Q_2Q_1$                       0.092             0.020             5-20              0.003-0.014
$P_2^+P_{-1}$                  3.2               0.012             1-3               0.002-0.008
$P_2^+P_1$                     0.12              0.021             9-30              0.003-0.012
$P_2P_1$                       0.14              0.020             6-22              0.002-0.012
$Q_1Q_0$ via Projection 2      1.3               0.026             5-20              0.003-0.015

4. $Q_2Q_1$, $P_2^+P_1$, and $P_2P_1$ all show 'small' initial velocity error but 'large' $\Delta t$-converged
errors.
5. $Q_2P_{-1}$ and $Q_2Q_{-1}$ are the most accurate elements and, for the mesh chosen, seem to
show that the potential extra accuracy offered by the $xy$ term is not really needed for this
'easy' problem. See Veyret et al. (1998) for more on this issue.
To wrap it up, we 'advance' from fixed-step to two variable-$\Delta t$ (smart) integrators.
The results of one such pair of simulations (for $\Delta t_0 = 10^{-3}$ and $\epsilon = 10^{-3}$ via $Q_1Q_0$),
one for BE and the other for TR, are shown in Figure 3.16-8. In Figures 3.16-8(a) and (b)
are shown the $\Delta t$ histories as the solid lines (the first three steps are below 0.01 and are
not shown), the PDE velocity errors as the solid lines + dots with every time step plotted
(1 per dot), and the 'theoretical' (scalar ODE) curve as a dashed line [$\lambda\Delta t = (2\epsilon e^{\lambda t})^{1/2}$
for BE and $\lambda\Delta t = (12\epsilon e^{\lambda t})^{1/3}$ for TR with $\lambda = 2\pi^2\nu$; see Section 2.7.4c]. The BE run
required 36 steps to get to $t = 5$ and TR needed 19. Figures 3.16-8(c) and (d) show
(again) the $\Delta t$ vs. $t$ curve plus the PDE pressure error. We conclude from these results
the following:
1. Variable time-stepping is much more efficient.
2. For TR, all of the error is spatial for all $t$ [cf. Figure 3.16-3(b)]; for this mesh a larger
$\epsilon$ could be used for $Q_1Q_0$.
3. Whereas the TR curve of $\Delta t$ vs. $t$ is somewhat conservative relative to the simple ODE
theoretical curve, the BE result is not. The BE time steps appear to have grown a bit 'too
fast', and the 'increasing' PDE global error reflects this. TR did a much better job of
maintaining the error in the desired range.
4. The PDE pressure errors are virtually identical for BE and TR and are decaying in
proportion to the pressure, like $e^{-4\pi^2\nu t}$. Recall too that the pressure error is greatest at
$t = 0$.
5. When we tightened $\epsilon$ to $10^{-4}$ (not shown), BE kept the velocity error below $10^{-3}$
and required 106 time steps to $t = 5$, whereas TR maintained the error at about $2 \times 10^{-4}$
and needed only 33 steps; as noted above, even $\epsilon = 10^{-3}$ is 'overkill' for TR. Both of
these results are in fair agreement with simple asymptotic ODE theory ($36\sqrt{10} \approx 114$
and $19\sqrt[3]{10} \approx 41$).
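The 'theoretical' step-size curves and the step-count scaling quoted in item 5 can be evaluated as follows. This is a sketch of the scalar-ODE estimates only; $\nu = 0.01$ is an assumption consistent with $\tau \approx 5$, and the function names are hypothetical.

```python
import math

nu = 0.01                      # assumed viscosity (gives tau close to 5)
lam = 2.0 * math.pi ** 2 * nu  # lambda = 2 pi^2 nu

def dt_theory_be(eps, t):
    """BE: lambda*dt = (2 eps e^{lambda t})^{1/2}."""
    return math.sqrt(2.0 * eps * math.exp(lam * t)) / lam

def dt_theory_tr(eps, t):
    """TR: lambda*dt = (12 eps e^{lambda t})^{1/3}."""
    return (12.0 * eps * math.exp(lam * t)) ** (1.0 / 3.0) / lam

# Since dt scales like eps^(1/2) for BE and eps^(1/3) for TR, tightening eps
# by a factor of 10 should multiply the step counts by sqrt(10) and 10^(1/3).
steps_be_predicted = 36 * math.sqrt(10.0)
steps_tr_predicted = 19 * 10.0 ** (1.0 / 3.0)
```

The predicted counts (about 114 and 41) compare with the 106 and 33 steps actually reported, which is the 'fair agreement' referred to in the text.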
[Fig. 3.16-8 Sample performance of two variable-step integrators (4/1 element), plotted versus $t$: (a) $\Delta t$ and PDE velocity error for BE; (b) $\Delta t$ and PDE velocity error for TR; (c) $\Delta t$ and PDE pressure error for BE; (d) $\Delta t$ and PDE pressure error for TR.]
Finally, we report the results when using the usually-cost-effective (in 2D at least)
penalty method, an index 0 DAE system; i.e. ODE's. Our view of the (usually) most
efficient 2D transient simulation is that using the $Q_2P_{-1}$ (9/3) element in the consistent
penalty mode (see Section 3.13.2e) integrated via the variable-step TR. The results, using
a penalty parameter (see (3.16.254)) of $10^7$ and $\Delta t_0$ (still) $= 10^{-3}$, which (via the first two
BE steps) suppresses (skips over) the 'penalty transient', are as follows: BE needed 36 time
steps and TR needed 18, the same as mixed interpolation on $Q_1Q_0$, a la Figure 3.16-8.
The penalty method can be quite cost-effective.
These extensive simulations, more of which will be reported in Veyret et al.
(1999), have led us to the following conclusions:
1. While the Taylor-vortex problem is indeed worthwhile as a test problem, it is not
perfect.
2. The most accurate elements are $Q_2P_{-1}$ and $Q_2Q_{-1}$, with the former being both stable
and more cost-effective.
3. The Projection 2 method has a few remaining nagging problems (high spatial error,
strange $\Delta t$ behaviour) that are 'calling out' to the numerical analysis community.
4. Perhaps the Taylor-Hood elements (continuous pressure) have outlived their usefulness.
5. Variable-step TR is a good way to integrate the index 2 DAE's; even variable-step BE
is a better choice than any fixed-$\Delta t$ integration.
6. Whereas we saw mostly excellent agreement with DAE theory, some of the results
(e.g. ODE errors) need further error analysis.
3.16.2 A Model DAE Problem
a. Introduction
The following (ostensibly simple) model problem for the incompressible Stokes equations
is introduced and developed for several reasons: (i) it provides insight into the
mathematics and behaviour of DAE's; (ii) analytical solutions are reasonably easy to obtain, and
these reveal some of the potentially bizarre behavior of DAE's; (iii) the model mimics
remarkably well the true semi-discretized DAE's generated by the Stokes equations; and
(iv) it sheds some light on the behavior of the penalty method. It is, however, diversionary
and is, like the section to follow, not absolutely essential; skip to Section 3.16.4 if you
only wish to see how to write code.
To start, we review the goal: find $u$ and $P$ from
\[
\partial u/\partial t + \nabla P = \nu\nabla^2 u, \qquad \nabla \cdot u = 0 \quad \text{in } \Omega, \tag{3.16-116}
\]
with BC $u = w(t)$ on $\Gamma$, and with IC $u(x, 0) = u_0(x)$. If $u_0$ is divergence-free, i.e., if
$\nabla \cdot u_0 = 0$ in $\Omega$ and $n \cdot u_0 = n \cdot w(0)$ on $\Gamma$, then the IC's are said to be compatible. If
$u_0$ is not divergence-free, a continuous-in-time solution does not exist, and the problem
is generally regarded as ill-posed. A nearby well-posed problem can then be obtained as
discussed in Section 3.10.2, and it is worth mentioning now that the analytical solution
of the ill-posed DAE analog of (3.16-116) that we are about to present will automatically
'correct itself' to generate the analogous well-posed problem.
The simple DAE model for (3.16-116) is given by the following three-degree-of-freedom
system:
\[ \dot u + k_1 u + c_1 P = f_1(t), \qquad u(0) = u_0, \tag{3.16-117} \]
\[ \dot v + k_2 v + c_2 P = f_2(t), \qquad v(0) = v_0, \tag{3.16-118} \]
and
\[ c_1 u + c_2 v = g(t), \qquad P(0) = P_0, \tag{3.16-119} \]
where $c_1 \ne 0$, $c_2 \ne 0$, $k_1 > 0$, and $k_2 > 0$ are all constant. (An anisotropic 'viscosity,'
$k_1 \ne k_2$, while not necessary, is utilized for generality. The model is still relevant, and in
some sense more appropriate, for $k_1 = k_2$. Also, $c_1 \ne c_2$ is most appropriate.) We initially
proceed naively in that (i) we presume that an initial pressure is required and given, and
(ii) we presume that $u_0$ and $v_0$ are arbitrary.
This is the index 2 (primitive variable) version of the model problem, whose index
we shall now verify. Just as the time derivative of $\nabla \cdot u = 0$ leads to the PPE in the
continuum, so too does
\[ c_1\dot u + c_2\dot v = \dot g \tag{3.16-120} \]
lead to the semi-discrete PPE and an index 1 DAE; i.e., (3.16-117), (3.16-118) and
(3.16-120) constitute an index 1 DAE system. To obtain the PPE version, substitute the
accelerations from (3.16-117), (3.16-118) into (3.16-120) to get
$c_1\dot u + c_2\dot v = \dot g = c_1(f_1 - k_1 u - c_1 P) + c_2(f_2 - k_2 v - c_2 P)$, which we rearrange to
\[ (c_1^2 + c_2^2)P + c_1 k_1 u + c_2 k_2 v = c_1 f_1 + c_2 f_2 - \dot g, \tag{3.16-121} \]
and denote (3.16-117), (3.16-118) and (3.16-121) as the PPE/index 1 formulation; note
that $(c_1^2 + c_2^2)$ is our mimic of the Laplacian. It is this latter version of the index 1 DAE's
that corresponds to the way some computer codes are written to solve the NS equations.
Finally, one more differentiation of the constraint leads to an index 0 DAE system, i.e.,
an ODE system: (3.16-117), (3.16-118) and
\[ (c_1^2 + c_2^2)\dot P + c_1 k_1\dot u + c_2 k_2\dot v = c_1\dot f_1 + c_2\dot f_2 - \ddot g. \tag{3.16-122} \]
We can now claim that the stated 'indices' are correct since
it required two differentiations of the constraint to obtain an ODE system in the original
variables; the original DAE's were indeed index 2.
Remarks:
(1) Another (and not equivalent) index 0 system could be obtained by solving (3.16-121)
for P and placing the result in (3.16-117) and (3.16-118) to obtain a pair of ODE's
in u and v only.
(2) Since neither index 0 system is used to write computer codes, we will say little more
about them.
b. Index 2
Returning now to the index 2 formulation, (3.16-117) through (3.16-119), we proceed to
find an analytical solution in terms of eigenvalues and eigenvectors. To start, we write it
as a general system of singular ODE's (see, for example, Campbell, 1980, or Hairer and
Wanner, 1991) via
\[ B\dot y + Ay = F(t), \qquad y(0) = y_0, \tag{3.16-123} \]
where $y = (u, v, P)^T$, $F = (f_1, f_2, g)^T$,
\[
B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \text{and} \quad
A = \begin{pmatrix} k_1 & 0 & c_1 \\ 0 & k_2 & c_2 \\ c_1 & c_2 & 0 \end{pmatrix},
\]
where it is clear that $B$ is singular. It is also clear that $A$ is symmetric and non-singular
($|A| = -c_1^2 k_2 - c_2^2 k_1$); it is less clear but true that $A$ is indefinite. To solve (3.16-117)
through (3.16-119) via an eigenvector expansion, we must first find the proper
eigenvectors. We start 'conventionally,' naively; i.e., as if $B$ were not singular, and pretend that we
have a simple system of linear ODE's. The conventional ODE method is the following:
Step 1. Set $F(t) = 0$ and study the homogeneous problem.
Step 2. Seek a solution of the form $y(t) = x e^{-\lambda t}$, which generates the following
generalized eigenproblem:
\[ Ax = \lambda Bx. \tag{3.16-124} \]
Step 3. Solve the eigenproblem to obtain (hopefully) a set of basis vectors, the first
key objective.
Step 4. Return to the inhomogeneous problem, (3.16-123), and expand both $y(t)$ and
$F(t)$ into linear combinations of the basis vectors. The amplitude coefficients
for $F(t)$ are obtained by solving a linear algebraic system and those for $y(t)$ by
solving (integrating) a set of (usually) uncoupled ODE's.
Step 5. Once the IC vector, $y_0$, is also expanded into the basis set, the solution is
complete.
So let us start along this conventional path and see what happens. To solve (3.16-124),
we first must form the characteristic equation,
\[ |A - \lambda B| = 0, \tag{3.16-125} \]
which gives
\[ \lambda = (c_1^2 k_2 + c_2^2 k_1)/(c_1^2 + c_2^2), \tag{3.16-126} \]
rather than the expected cubic equation for the desired three eigenvalues of the 3 x 3
system. This is the first DAE 'surprise'; we find only one eigenvalue rather than three.
The eigenvector, from (3.16-124) and (3.16-126), is easily found to be
\[ x = [c_2, -c_1, c_1 c_2(k_2 - k_1)/(c_1^2 + c_2^2)]^T, \tag{3.16-127} \]
which is 'divergence-free': $c_1 x_1 + c_2 x_2 = 0$. What next?
Guided by, for example, Cook, et al. (1989), or by Wilkinson (1978), we seek the rest
of our 'solution' via the inverse eigenproblem,
\[ Bx = \mu Ax, \tag{3.16-128} \]
which displays the same eigenvectors as (3.16-124) and, importantly, $\mu = 1/\lambda$. The
characteristic equation here is $|B - \mu A| = 0$, which gives
\[ \mu^2[(c_1^2 k_2 + c_2^2 k_1)\mu - (c_1^2 + c_2^2)] = 0; \tag{3.16-129} \]
i.e., the three roots are $\mu = \{0, 0, 1/\lambda\}$. The two missing original eigenvalues are infinite!
Another DAE surprise.
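The two characteristic equations can be checked numerically. The sketch below uses hypothetical sample constants ($k_1 = 1$, $k_2 = 3$, $c_1 = 2$, $c_2 = 1$, chosen only for illustration) and a hand-written 3 x 3 determinant; it confirms that $|A - \lambda B|$ is merely linear in $\lambda$ with the single root (3.16-126), and that $|B - \mu A|$ matches the cubic (3.16-129).

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

k1, k2, c1, c2 = 1.0, 3.0, 2.0, 1.0   # hypothetical sample constants
A = [[k1, 0.0, c1], [0.0, k2, c2], [c1, c2, 0.0]]
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

def char_forward(lam):
    """|A - lam*B|, the characteristic function of (3.16-125)."""
    return det3([[A[i][j] - lam * B[i][j] for j in range(3)] for i in range(3)])

def char_inverse(mu):
    """|B - mu*A|, the characteristic function of the inverse eigenproblem."""
    return det3([[B[i][j] - mu * A[i][j] for j in range(3)] for i in range(3)])

def cubic_3_16_129(mu):
    """Closed form quoted in (3.16-129)."""
    return mu ** 2 * ((c1 ** 2 * k2 + c2 ** 2 * k1) * mu - (c1 ** 2 + c2 ** 2))

lam_finite = (c1 ** 2 * k2 + c2 ** 2 * k1) / (c1 ** 2 + c2 ** 2)   # (3.16-126)

# A vanishing second difference shows |A - lam*B| is only linear in lam.
second_difference = char_forward(0.0) - 2.0 * char_forward(1.0) + char_forward(2.0)
```

Because $B$ is singular, the degree of $|A - \lambda B|$ drops from three to one, which is the numerical face of the 'missing' (infinite) eigenvalues.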
Inserting $\mu = 0$ into (3.16-128) yields the eigenvector
\[ x = x_0 = (0, 0, 1)^T, \tag{3.16-130} \]
and we come up against the third DAE surprise: the matrix $A^{-1}B$ is defective; there is
but a single eigenvector corresponding to the repeated eigenvalue $\mu = 0$. The resolution
of this dilemma was provided by Jordan, and we thus seek a generalized eigenvector ($x_1$)
corresponding to the repeated eigenvalue ($\mu = \mu_0$) via the application of Jordan theory
(e.g., Noble, 1969):
\[ (B - \mu_0 A)x_1 = Ax_0, \tag{3.16-131} \]
where $\mu_0 = 0$, but we will not utilize this fact until later, for reasons that will become
clear. The solution of (3.16-131) is easily found to be
\[
x_1 = \begin{pmatrix} c_1 \\ c_2 \\ \gamma \end{pmatrix}
    = \begin{pmatrix} c_1 \\ c_2 \\ 0 \end{pmatrix} + \gamma\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
    = \begin{pmatrix} c_1 \\ c_2 \\ 0 \end{pmatrix} + \gamma x_0, \tag{3.16-132}
\]
where $\gamma$ is arbitrary, and it is obviously expedient (and legitimate) to take $\gamma = 0$. Note
that the generalized eigenvector is not divergence-free; it is dilatational, with div $x_1 =
c_1 x_{11} + c_2 x_{12} = (c_1^2 + c_2^2)$; it will turn out that its 'job' is to enforce $c_1 u + c_2 v = g$ for
$t > 0$, and that of $x_0$ is to enforce the PPE, (3.16-121), for $t > 0$.
We finally have a basis for $R^n (= R^3)$: $x_0 = (0, 0, 1)^T$ with $\mu_0 = 0$ ($\lambda_0 = \infty$);
$x_1 = (c_1, c_2, 0)^T$, also with $\mu_0 = 0$ ($\lambda_0 = \infty$); and the first one obtained, $x_2 =
[c_2, -c_1, c_1 c_2(k_2 - k_1)/(c_1^2 + c_2^2)]^T$, with $\lambda_2 = (c_1^2 k_2 + c_2^2 k_1)/(c_1^2 + c_2^2)$.
{Proof of basis:
\[
\begin{vmatrix}
0 & c_1 & c_2 \\
0 & c_2 & -c_1 \\
1 & 0 & \dfrac{c_1 c_2(k_2 - k_1)}{c_1^2 + c_2^2}
\end{vmatrix} = -(c_1^2 + c_2^2) \ne 0,
\]
showing that the three vectors are linearly independent.}
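A quick numerical check of the Jordan chain and of the basis determinant, again with hypothetical sample constants ($k_1 = 1$, $k_2 = 3$, $c_1 = 2$, $c_2 = 1$) and helper names that are not from the text:

```python
def matvec(M, x):
    """Product of a 3x3 matrix (list of rows) with a 3-vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

k1, k2, c1, c2 = 1.0, 3.0, 2.0, 1.0   # hypothetical sample constants
A = [[k1, 0.0, c1], [0.0, k2, c2], [c1, c2, 0.0]]
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
s = c1 ** 2 + c2 ** 2
lam2 = (c1 ** 2 * k2 + c2 ** 2 * k1) / s

x0 = [0.0, 0.0, 1.0]                          # eigenvector for mu = 0
x1 = [c1, c2, 0.0]                            # generalized eigenvector (gamma = 0)
x2 = [c2, -c1, c1 * c2 * (k2 - k1) / s]       # eigenvector for finite lambda

Bx0 = matvec(B, x0)                           # should vanish: B x0 = 0
jordan_residual = [p - q for p, q in zip(matvec(B, x1), matvec(A, x0))]
eigen_residual = [p - lam2 * q for p, q in zip(matvec(A, x2), matvec(B, x2))]

# Columns x0, x1, x2 assembled row by row, as in the 'Proof of basis'.
basis_det = det3([[x0[r], x1[r], x2[r]] for r in range(3)])
```

With $\mu_0 = 0$ the chain relation (3.16-131) reduces to $Bx_1 = Ax_0$, which is what `jordan_residual` tests.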
Given a basis, we can return to (3.16-123) and express the solution as
\[ y(t) = \sum_{j=0}^{2} a_j(t)\,x_j, \tag{3.16-133} \]
\[ y_0 = \sum_{j=0}^{2} \gamma_j\,x_j, \tag{3.16-134} \]
and
\[ F(t) = \sum_{j=0}^{2} b_j(t)\,Ax_j, \tag{3.16-135} \]
where it is convenient to represent $F(t)$ via the modified basis, $\{Ax_j\}$, for reasons that will
soon become clear. (Since $\{x_j\}$ form a basis, it follows that $\{Ax_j\}$ also form a basis for
any non-singular matrix $A$.) The modified basis is easily found to be $Ax_0 = (c_1, c_2, 0)^T$,
$Ax_1 = (c_1 k_1, c_2 k_2, c_1^2 + c_2^2)^T$, and $Ax_2 = (\lambda_2 c_2, -\lambda_2 c_1, 0)^T$.
Proceeding in reverse order for the amplitude coefficients, we first find the three $b_j$'s.
Expanding (3.16-135) gives
\[
\begin{aligned}
c_1 b_0 + c_1 k_1 b_1 + \lambda_2 c_2 b_2 &= f_1(t), \\
c_2 b_0 + c_2 k_2 b_1 - \lambda_2 c_1 b_2 &= f_2(t), \\
0 \cdot b_0 + (c_1^2 + c_2^2) b_1 + 0 \cdot b_2 &= g(t),
\end{aligned} \tag{3.16-136}
\]
with solution
\[
\begin{aligned}
b_0(t) &= \frac{c_1 f_1(t) + c_2 f_2(t) - \dfrac{c_1^2 k_1 + c_2^2 k_2}{c_1^2 + c_2^2}\,g(t)}{c_1^2 + c_2^2}, \\
b_1(t) &= g(t)/(c_1^2 + c_2^2), \\
b_2(t) &= \frac{c_2 f_1(t) - c_1 f_2(t) + \dfrac{c_1 c_2(k_2 - k_1)}{c_1^2 + c_2^2}\,g(t)}{c_1^2 k_2 + c_2^2 k_1}.
\end{aligned} \tag{3.16-137}
\]
Similarly, solving the 3 x 3 system from (3.16-134) gives the $\{\gamma_j\}$, where $y_0 =
(u_0, v_0, P_0)^T$:
\[
\begin{aligned}
\gamma_0 &= P_0 - \frac{c_1 c_2(k_2 - k_1)(c_2 u_0 - c_1 v_0)}{(c_1^2 + c_2^2)^2}, \\
\gamma_1 &= (c_1 u_0 + c_2 v_0)/(c_1^2 + c_2^2), \\
\gamma_2 &= (c_2 u_0 - c_1 v_0)/(c_1^2 + c_2^2),
\end{aligned} \tag{3.16-138}
\]
where we still (naively) presume the initial data to be arbitrary. Finally, we insert
(3.16-133) and (3.16-135) into (3.16-123) to obtain
\[ \sum_{j=0}^{2}\bigl[B\dot a_j(t)x_j + A a_j(t)x_j - b_j(t)Ax_j\bigr] = 0, \]
which, using (3.16-128) for $j = 0, 2$ and (3.16-131) for $j = 1$ (the generalized
eigenvector), gives
\[
[\mu_0\dot a_0(t) + a_0(t) - b_0(t)]Ax_0 + (\mu_0 Ax_1 + Ax_0)\dot a_1(t) + [a_1(t) - b_1(t)]Ax_1 + [\mu_2\dot a_2(t) + a_2(t) - b_2(t)]Ax_2 = 0,
\]
which we rearrange to
\[
[\mu_0\dot a_0(t) + \dot a_1(t) + a_0(t) - b_0(t)]Ax_0 + [\mu_0\dot a_1(t) + a_1(t) - b_1(t)]Ax_1 + [\mu_2\dot a_2(t) + a_2(t) - b_2(t)]Ax_2 = 0, \tag{3.16-139}
\]
which shows why our choice of the modified basis for $F(t)$ is 'convenient': the vectors
$\{Ax_j\}$ are linearly independent, so that we finally obtain the nearly uncoupled ODE's, i.e.,
uncoupled in the sense of Jordan,
\[
\begin{aligned}
\mu_0\dot a_0(t) + \dot a_1(t) + a_0(t) &= b_0(t), \\
\mu_0\dot a_1(t) + a_1(t) &= b_1(t), \\
\mu_2\dot a_2(t) + a_2(t) &= b_2(t),
\end{aligned} \tag{3.16-140}
\]
with IC's given by (3.16-134) and (3.16-138).
While the solution of (3.16-140) is particularly simple if we set $\mu_0 = 0$, we shall proceed somewhat differently, to highlight some additional 'properties' of DAE's. Thus, suppose for the moment that $\mu_0 > 0$; we shall solve the resulting ODE's and then see what happens as $\mu_0 \to 0$. These ODE's are solved in the usual way (integrating factor) to give (using $\lambda_j = 1/\mu_j$):
$$a_0(t) = e^{-\lambda_0t}\left[\gamma_0 + \lambda_0\int_0^t h(\tau)\,e^{\lambda_0\tau}\,d\tau\right],$$
$$a_1(t) = e^{-\lambda_0t}\left[\gamma_1 + \lambda_0\int_0^t b_1(\tau)\,e^{\lambda_0\tau}\,d\tau\right],$$
$$a_2(t) = e^{-\lambda_2t}\left[\gamma_2 + \lambda_2\int_0^t b_2(\tau)\,e^{\lambda_2\tau}\,d\tau\right], \tag{3.16-141}$$
where $h(t) = b_0(t) - \lambda_0\left\{b_1(t) - e^{-\lambda_0t}\left[a_1(0) + \lambda_0\int_0^t b_1(\tau)\,e^{\lambda_0\tau}\,d\tau\right]\right\}$.
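The full solution, from (3.16-133), is then
$$\begin{pmatrix} u \\ v \\ P \end{pmatrix} = a_0(t)\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} + a_1(t)\begin{pmatrix} c_1 \\ c_2 \\ 0 \end{pmatrix} + a_2(t)\begin{pmatrix} c_2 \\ -c_1 \\ \dfrac{c_1c_2(k_2 - k_1)}{c_1^2 + c_2^2} \end{pmatrix}; \tag{3.16-142}$$
i.e.,
$$u(t) = c_1a_1(t) + c_2a_2(t),$$
$$v(t) = c_2a_1(t) - c_1a_2(t),$$
and
$$P(t) = a_0(t) + c_1c_2(k_2 - k_1)\,a_2(t)/(c_1^2 + c_2^2). \tag{3.16-143}$$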
But what about $t \to 0$ and IC's? Recall that we presumably selected $u_0$, $v_0$, and $P_0$ arbitrarily. ... Also recall that we are ultimately interested in $\mu_0 = 0$ ($\lambda_0 = \infty$). Well, it turns out to matter in which order we take the two limits, and the difference is interesting and illuminating. If we let $t \to 0$ first, (3.16-141) through (3.16-143) yield
$$u(0) = c_1\gamma_1 + c_2\gamma_2,$$
$$v(0) = c_2\gamma_1 - c_1\gamma_2,$$
and
$$P(0) = \gamma_0 + c_1c_2(k_2 - k_1)\,\gamma_2/(c_1^2 + c_2^2),$$
or using (3.16-138), $u(0) = u_0$, $v(0) = v_0$, and $P(0) = P_0$, as required; i.e., by construction. But if we let $\mu_0 \to 0$ first (the real case of interest), the results are vastly different.
Thus, we finally focus on the solution of real interest for the DAE's: $\lambda_0 = \infty$ ($\mu_0 = 0$), for which we need, in order to use (3.16-141),
1. $$\lim_{\lambda_0 \to \infty} \lambda_0\,e^{-\lambda_0t}\int_0^t G(\tau)\,e^{\lambda_0\tau}\,d\tau = G(t) \tag{3.16-144}$$
and
2. $$\lim_{\lambda_0 \to \infty} h(t) = b_0(t) - \lim_{\lambda_0 \to \infty}\left[\lambda_0b_1(t) - \lambda_0^2\,e^{-\lambda_0t}\int_0^t b_1(\tau)\,e^{\lambda_0\tau}\,d\tau\right] = b_0(t) - \dot b_1(t). \tag{3.16-145}$$
The final ($\mu_0 = 0$) solution is therefore
$$a_0(t) = b_0(t) - \dot b_1(t),$$
$$a_1(t) = b_1(t),$$
and
$$a_2(t) = e^{-\lambda_2t}\left[\gamma_2 + \lambda_2\int_0^t b_2(\tau)\,e^{\lambda_2\tau}\,d\tau\right], \tag{3.16-146}$$
inserted into (3.16-143).
Remarks:
(1) This solution is, of course, (much) easier to obtain by setting $\mu_0 = 0$ and solving (3.16-140), now a system of DAE's, directly. We proceeded as we did for pedagogical reasons, hopefully profitably and not pedantically.
(2) It is noteworthy that the solution involves a time derivative of the data, a situation that does not occur with ODE's.
(3) $\gamma_0$ and $\gamma_1$ are totally irrelevant.
Now for our final(!) DAE surprise: if we now evaluate (3.16-143) at $t = 0$, using (3.16-146) and (3.16-138), we obtain
$$u(0) = \frac{c_2(c_2u_0 - c_1v_0) + c_1g(0)}{c_1^2 + c_2^2} \neq u_0,$$
$$v(0) = \frac{c_1(c_1v_0 - c_2u_0) + c_2g(0)}{c_1^2 + c_2^2} \neq v_0,$$
and
$$P(0) = \frac{c_1f_1(0) + c_2f_2(0) - \dot g(0) - [c_1k_1u(0) + c_2k_2v(0)]}{c_1^2 + c_2^2} \neq P_0; \tag{3.16-147}$$
i.e., these results do not (in general) satisfy the chosen IC's! But they do satisfy both the continuity equation and the PPE; i.e., $c_1u(0) + c_2v(0) = g(0)$ and $(c_1^2 + c_2^2)P(0) + c_1k_1u(0) + c_2k_2v(0) = c_1f_1(0) + c_2f_2(0) - \dot g(0)$, the two algebraic constraint equations.
What has happened is that the DAE solution is smarter than we are in the following sense:
if we were foolish enough to select an initial velocity field that is not divergence-free
and/or foolish enough to believe that we could select an initial condition for the pressure,
the solution will correct our errors by (via the two infinite eigenvalues) changing the IC's!
Another way to view it is that the solution will initially be discontinuous in time owing to a lack of compatible initial data: at $t = 0^+$, the solution will be $u(0)$, $v(0)$, and $P(0)$, rather than $u_0$, $v_0$, and $P_0$. For $t > 0$, the solution will be continuous, divergence-free, and will always satisfy the PPE.
Remarks:
(1) Not all numerical integration methods will be as smart as the analytical solution,
with the result (both here and in general) that DAE's can cause integration schemes
to go crazy.
(2) Note that $a_0(t)$ is used only for pressure and that $a_1(t)$, with the generalized eigenvector, is used only for velocity, a situation that will be seen to carry over to the full Stokes equations.
(3) If $c_1u_0 + c_2v_0 = g(0) \equiv g_0$, then the velocity solution (but generally not the pressure) will satisfy the IC's.
Before we conclude the index 2 presentation, we point out that it is highly
significant—because it too is not limited to this model problem—that the 'adjusted'
IC's are in fact identical to those obtained by the L2-projection of the initial data to the
nearest divergence-free subspace, as discussed in Section 3.10.2 for the continuous case.
To prove this important assertion, we pose the following problem: given $u_0$ and $v_0$ ($P_0$ is irrelevant), find the closest divergence-free velocity, $u$ and $v$. The solution of this problem is: find $u$ and $v$ such that $J(u, v) = \frac{1}{2}[(u - u_0)^2 + (v - v_0)^2]$ is minimized over all $(u, v)$ satisfying $c_1u + c_2v = g_0$. Introducing a Lagrange multiplier, $\varphi$, to satisfy the constraint, an equivalent statement of the problem is:
Find the stationary point of $\hat J(u, v; \varphi) = J(u, v) + \varphi(c_1u + c_2v - g_0)$ over all $u$, $v$, $\varphi$.
This leads to $\partial\hat J/\partial u = 0$, $\partial\hat J/\partial v = 0$, and $\partial\hat J/\partial\varphi = 0$; i.e., to $u = u_0 - c_1\varphi$, $v = v_0 - c_2\varphi$, and $c_1u + c_2v = g_0$, with solution
$$u = \frac{c_2(c_2u_0 - c_1v_0) + c_1g_0}{c_1^2 + c_2^2},$$
$$v = \frac{c_1(c_1v_0 - c_2u_0) + c_2g_0}{c_1^2 + c_2^2},$$
$$\varphi = \frac{c_1u_0 + c_2v_0 - g_0}{c_1^2 + c_2^2}.$$
Clearly $u = u(0)$ and $v = v(0)$ from (3.16-147). QED.
If we had been smarter a priori, we would not violate div $u_0 = 0$, nor would we select a $P_0$. If we constrain $u_0$ and $v_0$ to satisfy $c_1u_0 + c_2v_0 = g_0$, (3.16-147) 'agreeably' gives $u(0) = u_0$ and $v(0) = v_0$; and $P(0)$ is the appropriate pressure corresponding to $u_0$ and $v_0$ and satisfies the PPE, (3.16-121), at $t = 0$.
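The equivalence between the 'adjusted' IC's of (3.16-147) and the constrained least-squares projection can be illustrated numerically. The following is a minimal sketch with arbitrary sample values (assumptions, not data from the text); the function names are hypothetical.

```python
from fractions import Fraction as Fr

def project_div_free(u0, v0, c1, c2, g0):
    """Closest (u, v) to (u0, v0) subject to c1*u + c2*v = g0.

    Uses the Lagrange-multiplier solution: u = u0 - c1*phi, v = v0 - c2*phi.
    """
    L = c1**2 + c2**2
    phi = (c1 * u0 + c2 * v0 - g0) / L  # Lagrange multiplier
    return u0 - c1 * phi, v0 - c2 * phi, phi

def adjusted_ic(u0, v0, c1, c2, g0):
    """Velocity at t = 0+ according to (3.16-147)."""
    L = c1**2 + c2**2
    u = (c2 * (c2 * u0 - c1 * v0) + c1 * g0) / L
    v = (c1 * (c1 * v0 - c2 * u0) + c2 * g0) / L
    return u, v

# Sample data (assumed): the initial velocity violates the constraint.
c1, c2, g0 = Fr(1), Fr(2), Fr(1)
u0, v0 = Fr(3), Fr(4)          # c1*u0 + c2*v0 = 11, not g0 = 1
u, v, phi = project_div_free(u0, v0, c1, c2, g0)
u_adj, v_adj = adjusted_ic(u0, v0, c1, c2, g0)
```

For these sample values both routes give the same velocity, and the projected field satisfies the constraint exactly.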
To conclude the index 2 discussion, we specialize to the simpler case of constant forcing to glean some additional insight; the above solution [with $c_1u_0 + c_2v_0 = g$ and $P_0$ obtained from (3.16-121)] then 'agreeably' simplifies to
$$u(t) = u_0\,e^{-\lambda_2t} + u_{ss}(1 - e^{-\lambda_2t}),$$
$$v(t) = v_0\,e^{-\lambda_2t} + v_{ss}(1 - e^{-\lambda_2t}),$$
and
$$P(t) = P_0\,e^{-\lambda_2t} + P_{ss}(1 - e^{-\lambda_2t}), \tag{3.16-148}$$
where
$$u_{ss} = [c_2(c_2f_1 - c_1f_2) + c_1k_2g]/(c_1^2k_2 + c_2^2k_1),$$
$$v_{ss} = [c_1(c_1f_2 - c_2f_1) + c_2k_1g]/(c_1^2k_2 + c_2^2k_1),$$
$$P_0 = [c_1f_1 + c_2f_2 - (c_1k_1u_0 + c_2k_2v_0)]/(c_1^2 + c_2^2),$$
and
$$P_{ss} = \frac{c_1k_2f_1 + c_2k_1f_2 - k_1k_2g}{c_1^2k_2 + c_2^2k_1} = \frac{c_1f_1 + c_2f_2 - (c_1k_1u_{ss} + c_2k_2v_{ss})}{c_1^2 + c_2^2}. \tag{3.16-149}$$
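As a quick consistency check, the steady-state values in (3.16-149) should satisfy the time-independent form of the model equations, $k_1u + c_1P = f_1$, $k_2v + c_2P = f_2$, and $c_1u + c_2v = g$. The sketch below evaluates them for arbitrary sample data (assumed values); `steady_state` is a hypothetical helper name.

```python
from fractions import Fraction as Fr

def steady_state(c1, c2, k1, k2, f1, f2, g):
    """Steady-state (u, v, P) for constant forcing, per (3.16-149)."""
    D = c1**2 * k2 + c2**2 * k1
    u_ss = (c2 * (c2 * f1 - c1 * f2) + c1 * k2 * g) / D
    v_ss = (c1 * (c1 * f2 - c2 * f1) + c2 * k1 * g) / D
    p_ss = (c1 * k2 * f1 + c2 * k1 * f2 - k1 * k2 * g) / D
    return u_ss, v_ss, p_ss

# Arbitrary sample data (assumed, for illustration only).
c1, c2, k1, k2 = Fr(1), Fr(2), Fr(3), Fr(5)
f1, f2, g = Fr(7), Fr(-2), Fr(4)
u_ss, v_ss, p_ss = steady_state(c1, c2, k1, k2, f1, f2, g)

# Residuals of the steady momentum and continuity equations.
res_u = k1 * u_ss + c1 * p_ss - f1
res_v = k2 * v_ss + c2 * p_ss - f2
res_div = c1 * u_ss + c2 * v_ss - g
# Second form of P_ss in (3.16-149), i.e. the steady PPE.
p_ss_ppe = (c1 * f1 + c2 * f2 - (c1 * k1 * u_ss + c2 * k2 * v_ss)) / (c1**2 + c2**2)
```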
[Exercise for the reader: Add a third momentum equation (for $w$) to the model system of (3.16-117) through (3.16-119), change (3.16-119) to $c_1u + c_2v + c_3w = g$, and solve as above. Show that the only significant difference is the addition of a second divergence-free viscous decay mode; i.e., the index of the system is still 2, and there are still two infinite eigenvalues. (The two 2's are coincidental.)]
c. Index 1
The first thing to note about the lower index version of the model problem, given by
(3.16-117), (3.16-118) and (3.16-121), is that these equations imply (3.16-120) rather
than (3.16-119), which itself implies
$$c_1u(t) + c_2v(t) - g(t) = c_1u_0 + c_2v_0 - g_0; \tag{3.16-150}$$
i.e., any initial violation of div u = 0, which is in fact mathematically permissible in
the index 1 formulation, will remain for all time. Next we note that both B and A of
(3.16-123) are singular, where now
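$$A = \begin{pmatrix} k_1 & 0 & c_1 \\ 0 & k_2 & c_2 \\ c_1k_1 & c_2k_2 & c_1^2 + c_2^2 \end{pmatrix}, \qquad F(t) = (f_1, f_2, c_1f_1 + c_2f_2 - \dot g)^T,$$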
and B is unchanged from its index 2 definition. (Note that the third row of A is a linear
combination of the first two rows; hence, A must be singular.)
The eigenproblem (3.16-124) now yields two roots rather than just one, and they are $\lambda = \{0, \lambda_2 = (c_1^2k_2 + c_2^2k_1)/(c_1^2 + c_2^2)\}$. The eigenvector corresponding to $\lambda = \lambda_1 = 0$ is $x_1 = (c_1k_2, c_2k_1, -k_1k_2)^T$ and that corresponding to $\lambda_2$ is given by (3.16-127). Now the inverse eigenproblem, (3.16-128), is again needed to complete the vector space. It yields $\mu = \{0, 1/\lambda_2\}$, and it is the first root, $\mu = \mu_0 = 0$ ($\lambda_0 = \infty$), that is now of interest; its eigenvector is $x_0 = (0, 0, 1)^T$, as for the index 2 problem. Thus, one of the infinite eigenvalues from the index 2 formulation has been converted/inverted to zero, a rather significant change and another DAE surprise.
It follows easily, as with index 2, that the three eigenvectors are linearly independent. Given a basis, we now return to (3.16-133) through (3.16-135), except with a new twist: the 'efficient' expansion of $F(t)$ this time is as follows:
$$F(t) = b_0(t)\,Ax_0 + b_1(t)\,Bx_1 + b_2(t)\,Bx_2; \tag{3.16-151}$$
i.e., a mixture of modified basis vectors is utilized. We remark that an equally successful expansion would replace $b_2Bx_2$ by $b_2Ax_2$, and we leave as an exercise the proof that these modified vectors are indeed linearly independent even though both $A$ and $B$ are singular. (Hint: form their determinant.) Solving (3.16-151) yields
$$b_0(t) = \frac{c_1f_1(t) + c_2f_2(t) - \dot g(t)}{c_1^2 + c_2^2},$$
$$b_1(t) = \frac{\dot g(t)}{c_1^2k_2 + c_2^2k_1},$$
$$b_2(t) = \frac{c_2f_1(t) - c_1f_2(t) - \dfrac{c_1c_2(k_2 - k_1)}{c_1^2k_2 + c_2^2k_1}\,\dot g(t)}{c_1^2 + c_2^2}. \tag{3.16-152}$$
Inserting (3.16-133) and (3.16-151) into (3.16-123) and utilizing the results from the eigenproblems, $Bx_0 = 0$, $Ax_1 = 0$, and $Ax_2 = \lambda_2Bx_2$, yields
$$(a_0 - b_0)\,Ax_0 + (\dot a_1 - b_1)\,Bx_1 + (\dot a_2 + \lambda_2a_2 - b_2)\,Bx_2 = 0. \tag{3.16-153}$$
Again, since the vectors $\{Ax_0, Bx_1, Bx_2\}$ are linearly independent, this is necessarily an expansion of the zero vector; i.e., we have
$$a_0(t) = b_0(t),$$
$$\dot a_1(t) = b_1(t),$$
and
$$\dot a_2(t) + \lambda_2a_2(t) = b_2(t). \tag{3.16-154}$$
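Next, noting that $u$ and $v$ depend only on $x_1$ and $x_2$, the IC's for $a_1(t)$ and $a_2(t)$ are obtained from $u_0 = \gamma_1x_{11} + \gamma_2x_{21}$ and $v_0 = \gamma_1x_{12} + \gamma_2x_{22}$ to give
$$\gamma_1 = (c_1u_0 + c_2v_0)/(c_1^2k_2 + c_2^2k_1),$$
$$\gamma_2 = (c_2k_1u_0 - c_1k_2v_0)/(c_1^2k_2 + c_2^2k_1), \tag{3.16-155}$$
so that (3.16-154) yields, using (3.16-152),
$$a_1(t) = \gamma_1 + \int_0^t b_1(\tau)\,d\tau = \frac{c_1u_0 + c_2v_0 + g(t) - g_0}{c_1^2k_2 + c_2^2k_1}$$
and
$$a_2(t) = e^{-\lambda_2t}\left[\gamma_2 + \int_0^t b_2(\tau)\,e^{\lambda_2\tau}\,d\tau\right], \tag{3.16-156}$$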
and the full solution is now at hand:
$$u(t) = c_1k_2a_1(t) + c_2a_2(t),$$
$$v(t) = c_2k_1a_1(t) - c_1a_2(t),$$
$$P(t) = b_0(t) - k_1k_2a_1(t) + c_1c_2(k_2 - k_1)\,a_2(t)/(c_1^2 + c_2^2), \tag{3.16-157}$$
from which we can easily verify that
(i) $c_1u + c_2v = g(t) + (c_1u_0 + c_2v_0 - g_0)$ and
(ii) $(c_1^2 + c_2^2)P + c_1k_1u + c_2k_2v = c_1f_1 + c_2f_2 - \dot g$;
the PPE is always satisfied, but the solution is divergence-free only if it began that way. And only then will it agree with the index 2 solution. The loss of div $u = 0$, and the concomitant admission of more solutions, is a direct consequence of the eigenvalue that went from $\infty$ in the index 2 formulation to 0 in the index 1 formulation, the latter not recognizing the possibility of using incompatible IC's. Finally, if some arbitrary $P_0$ were chosen as a pressure initial condition, then $P$ would be non-smooth (discontinuous at $t = 0$).
[Exercise for the reader: Show that the index 1 and index 2 solutions agree when $c_1u_0 + c_2v_0 = g_0$. (Hint: an integration by parts is required.)]
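Property (i), that any initial divergence defect persists for all time in the index 1 formulation, can be observed by integrating the two momentum equations with the pressure eliminated via the PPE. The sketch below uses a classical fourth-order Runge-Kutta integrator, constant $f_1$, $f_2$, and a linear $g(t) = g_0 + \dot g\,t$; all parameter values and function names are illustrative assumptions.

```python
def rhs(u, v, prm):
    """Index 1 right-hand side: momentum equations with P taken from the PPE."""
    c1, c2, k1, k2, f1, f2, gdot = prm
    L = c1 * c1 + c2 * c2
    P = (c1 * f1 + c2 * f2 - gdot - c1 * k1 * u - c2 * k2 * v) / L
    return f1 - k1 * u - c1 * P, f2 - k2 * v - c2 * P

def rk4(u, v, prm, dt, nsteps):
    """Classical RK4 time stepping for the two-equation ODE system."""
    for _ in range(nsteps):
        a1, b1 = rhs(u, v, prm)
        a2, b2 = rhs(u + 0.5 * dt * a1, v + 0.5 * dt * b1, prm)
        a3, b3 = rhs(u + 0.5 * dt * a2, v + 0.5 * dt * b2, prm)
        a4, b4 = rhs(u + dt * a3, v + dt * b3, prm)
        u += dt / 6.0 * (a1 + 2 * a2 + 2 * a3 + a4)
        v += dt / 6.0 * (b1 + 2 * b2 + 2 * b3 + b4)
    return u, v

# (c1, c2, k1, k2, f1, f2, gdot): assumed sample values.
prm = (1.0, 2.0, 3.0, 5.0, 1.0, -1.0, 0.5)
c1, c2, gdot = prm[0], prm[1], prm[6]
u0, v0, g0 = 1.0, 1.0, 0.0      # incompatible: c1*u0 + c2*v0 - g0 = 3
dt, nsteps = 0.01, 200
u, v = rk4(u0, v0, prm, dt, nsteps)

t_end = dt * nsteps
defect_start = c1 * u0 + c2 * v0 - g0
defect_end = c1 * u + c2 * v - (g0 + gdot * t_end)
```

The defect at the end of the run matches the initial one to round-off, in line with (3.16-150): the index 1 system neither removes nor amplifies an initial violation of the continuity equation.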
d. Index 0.
Although we will not provide the details, we discuss this case for completeness and, more
importantly perhaps, to shed yet more light on the behavior of DAE's. The salient features
of the ODE system and its (more conventional) solution via the eigenvectors are:
1. Two of the eigenvalues ($\lambda$) are zero and one is non-zero (and the same as $\lambda_2$ above, corresponding to the solenoidal viscous decay mode).
2. IC's are now required for $P$, too.
3. All IC's are 'arbitrary'; i.e., they are necessary, and a solution always exists.
4. The only divergence condition that is guaranteed is
$$c_1\ddot u + c_2\ddot v = \ddot g;$$
the jerk (rate of change of acceleration) is divergence-free.
5. Only the time derivative of the PPE is guaranteed to be satisfied.
6. Last but not least: if and only if the IC's are compatible will the solution be correct; i.e., agree with the index 2 solution. Compatible IC's are the following: (i) divergence-free velocity and (ii) pressure that satisfies the PPE, (3.16-121), at $t = 0$.
e. Penalty
The penalty method neatly converts the DAE's to ODE's, so that it would seem a
simple—even desirable—ploy, since we now know that DAE's are trickier to solve than
ODE's. As we shall see, this is a good ploy except for two things:
1. The results are not uniformly valid in time—there exists a sharp 'penalty transient' (a
boundary layer in time) that is spurious from the viewpoint of the index 2 DAE's that it
attempts to solve.
2. Non-trivial asymptotic analysis is often required if a full understanding of the penalty
method is desired.
The penalty method begins as an index 1 DAE system:
$$\dot u + k_1u + c_1P = f_1(t), \qquad u(0) = u_0,$$
$$\dot v + k_2v + c_2P = f_2(t), \qquad v(0) = v_0,$$
and
$$c_1u + c_2v - g(t) = \varepsilon P, \qquad P(0) = (c_1u_0 + c_2v_0 - g_0)/\varepsilon, \tag{3.16-158}$$
where $\varepsilon$ is 'small' in an appropriate sense, and positive. Elimination of $P$ generates the index 0 penalty ODE's,
$$\dot u + (k_1 + c_1^2/\varepsilon)\,u + c_1c_2v/\varepsilon = f_1 + c_1g/\varepsilon,$$
and
$$\dot v + (k_2 + c_2^2/\varepsilon)\,v + c_1c_2u/\varepsilon = f_2 + c_2g/\varepsilon, \tag{3.16-159}$$
a system we shall solve and then use to attempt to evaluate the penalty premise: for $\varepsilon \to 0$, the penalty solution is close, to within $O(\varepsilon)$, to the index 2 solution.
Remark:
In addition to satisfying (3.16-158), the penalty pressure also satisfies the following 'PPE':
$$\varepsilon\dot P + (c_1^2 + c_2^2)P + c_1k_1u + c_2k_2v = c_1f_1 + c_2f_2 - \dot g, \tag{3.16-160}$$
which of course is the true PPE iff $\varepsilon\dot P$ is 'sufficiently small'.
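In (3.16-123), we now have $B = I$,
$$A = \begin{pmatrix} k_1 + c_1^2/\varepsilon & c_1c_2/\varepsilon \\ c_1c_2/\varepsilon & k_2 + c_2^2/\varepsilon \end{pmatrix} \quad\text{and}\quad F(t) = \begin{pmatrix} f_1(t) + c_1g(t)/\varepsilon \\ f_2(t) + c_2g(t)/\varepsilon \end{pmatrix}.$$
Solving (3.16-125) then gives the two eigenvalues ($\lambda \neq 1/\varepsilon$; the penalty parameter here is simply $1/\varepsilon$):
$$\lambda = \frac{1}{2}\left[k_1 + k_2 + \frac{c_1^2 + c_2^2}{\varepsilon} \pm \sqrt{(k_2 - k_1)^2 + \frac{2(k_2 - k_1)(c_2^2 - c_1^2)}{\varepsilon} + \frac{(c_1^2 + c_2^2)^2}{\varepsilon^2}}\,\right], \tag{3.16-161}$$
whose asymptotic behavior is of particular interest: as $\varepsilon \to 0$, (3.16-161) gives one very large and one finite eigenvalue,
$$\lambda_+ = \frac{c_1^2 + c_2^2}{\varepsilon} + \frac{c_1^2k_1 + c_2^2k_2}{c_1^2 + c_2^2} + O(\varepsilon)$$
and
$$\lambda_- = \frac{c_1^2k_2 + c_2^2k_1}{c_1^2 + c_2^2} + O(\varepsilon), \tag{3.16-162}$$
the former tending to infinity and the latter to the proper physical value, $\lambda_2$. It is $\lambda_+$ that will be responsible for the spurious penalty transient and $\lambda_-$ for the close approximation to the desired solution once this transient has expired, more or less.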
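The asymptotic forms in (3.16-162) can be compared against the exact roots of (3.16-161) for a small but finite $\varepsilon$. The sketch below is illustrative, with assumed sample coefficients; to limit cancellation error, the small root is obtained from the determinant of the 2×2 matrix rather than from the difference of two large numbers.

```python
import math

def penalty_eigs(c1, c2, k1, k2, eps):
    """Exact eigenvalues of the 2x2 penalty matrix A, per (3.16-161)."""
    L = c1 * c1 + c2 * c2
    dk = k2 - k1
    trace = k1 + k2 + L / eps
    disc = dk * dk + 2.0 * dk * (c2 * c2 - c1 * c1) / eps + (L / eps) ** 2
    lam_plus = 0.5 * (trace + math.sqrt(disc))
    # det(A) = k1*k2 + (k1*c2^2 + k2*c1^2)/eps; lam_minus = det / lam_plus.
    det = k1 * k2 + (k1 * c2 * c2 + k2 * c1 * c1) / eps
    return lam_plus, det / lam_plus

def penalty_eigs_asymptotic(c1, c2, k1, k2, eps):
    """Leading-order behaviour for eps -> 0, per (3.16-162)."""
    L = c1 * c1 + c2 * c2
    lam_plus = L / eps + (c1 * c1 * k1 + c2 * c2 * k2) / L
    lam_minus = (c1 * c1 * k2 + c2 * c2 * k1) / L
    return lam_plus, lam_minus

# Assumed sample values.
c1, c2, k1, k2, eps = 1.0, 2.0, 3.0, 5.0, 1e-4
lp, lm = penalty_eigs(c1, c2, k1, k2, eps)
lp_a, lm_a = penalty_eigs_asymptotic(c1, c2, k1, k2, eps)
```

For these values the two roots differ from their asymptotic estimates by an amount of order $\varepsilon$, while their ratio $\lambda_+/\lambda_-$ is of order $1/\varepsilon$, which is the stiffness behind the penalty transient.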
The corresponding eigenvectors, from (3.16-124), are found to be
£{h - X)
2c2 +Cl
g(A.-*i) r
2c, Cl
in which X+ [from (3.16-161)] gives one of the eigenvectors, and X_ gives the other [which
is divergence-free to 0(e)]. Expanding both y = (u, v)T and F(t) into these eigenvectors
and solving (3.16-123) yields
e(k2 - X+)
x =
(3.16-163)
u
V
= e
-k+t
«.(0)+ / £,(r)e^rdr
Jo
s(k2 -A._)
+ e
-X_r
fl2(0)+ / /32(T)ex~Tdz
Jo
+ C2
— C\
, (3.16-164)
where
3cj - c] + e(Aik + #) c| - 3c? + e(Ak - R)
fli(O) = c\v0 c2u0,
_, -_. . _v— . _., 3cl - c\ + s(Ak - R)
a2(0) = c2u0 c\vQ,
Pi =
c\~
3c22
D
3c\ + s(Ak + R)
D
- c\ + s(Ak + R)
D
c\(fi + c2g/s)-
D
c\ - 3c\ + s(Ak - R)
D
ci(f\ +cig/e),
and
c\ - 3c] + s(Ak + R) r . ,
3c? - c? + e(AA: - R)
where
D
Ak = k2 — k[, R =
D
c\(f2 + c2g/e),
\
iAk)2+2Ak(4-cl)+(cU4
and
D = eR[eAk + 2{c\ - c\)],
which is to be compared with the index 2 result (velocity part), (3.16-142) through
(3.16-146).
The 'comparison' is clearly not easy, and probably not worthwhile, at least in any detail. What is worthwhile is to note that it can be shown, from (3.16-164) for $\varepsilon \to 0$ and using (3.16-144), that while $u(0) = u_0$ and $v(0) = v_0$ where $u_0$ and $v_0$ are arbitrary, the spurious penalty transient does a similar job as the index 2 solution if $c_1u_0 + c_2v_0 \neq g_0$; namely, it performs the 'projection' to the nearest divergence-free subspace. The difference is that it takes longer to do so via penalty [the penalty time constant is $\tau = \varepsilon/(c_1^2 + c_2^2)$ vis-a-vis $\tau = 0$ for index 2], and the ('slow') projection is only correct to $O(\varepsilon)$. It is also noteworthy that there is a non-physical penalty transient even if $c_1u_0 + c_2v_0 = g_0$ (for which $P_0 = 0$!); it is just not so spurious in this case, because during (and after) the transient, the velocity will satisfy $c_1u(t) + c_2v(t) = g(t) + O(\varepsilon)$. ($P_0$ is also spurious.)
We will now perform a detailed comparison, albeit only for a rather simpler particular case: time-independent forcing and 'isotropic' viscosity, $k_1 = k_2 = k$; cf. Section 3.13.2e for the steady case. The index 2 solution is, from (3.16-148) and (3.16-149),
$$\begin{pmatrix} u \\ v \end{pmatrix} = e^{-kt}\begin{pmatrix} u_0 \\ v_0 \end{pmatrix} + \frac{1 - e^{-kt}}{k(c_1^2 + c_2^2)}\begin{pmatrix} c_2(c_2f_1 - c_1f_2) + kc_1g \\ c_1(c_1f_2 - c_2f_1) + kc_2g \end{pmatrix} \tag{3.16-165}$$
and
$$P = P_0 = P_{ss} = (c_1f_1 + c_2f_2 - kg)/(c_1^2 + c_2^2); \tag{3.16-166}$$
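i.e., we have a particularly simple special case (because $k_1 = k_2$): no transient viscous decay terms in the pressure. The initially divergence-free velocity ($c_1u_0 + c_2v_0 = g$) remains that way with no 'need' for the pressure. [If the IC is not div-free, (3.16-165) still applies, but with $u_0$ and $v_0$ replaced by $u(0)$ and $v(0)$ in (3.16-147).] The corresponding penalty solution, however, with eigenvalues [from (3.16-161)] of $\lambda_+ = k + (c_1^2 + c_2^2)/\varepsilon \approx (c_1^2 + c_2^2)/\varepsilon$ and $\lambda_- = k$, is not quite so simple: from (3.16-164) it is
$$\begin{pmatrix} u \\ v \end{pmatrix} = \frac{e^{-\lambda_+t}}{c_1^2 + c_2^2}\begin{pmatrix} c_1(c_1u_0 + c_2v_0) \\ c_2(c_1u_0 + c_2v_0) \end{pmatrix} + \frac{1 - e^{-\lambda_+t}}{(c_1^2 + c_2^2)(\varepsilon k + c_1^2 + c_2^2)}\begin{pmatrix} \varepsilon c_1(c_1f_1 + c_2f_2) + (c_1^2 + c_2^2)c_1g \\ \varepsilon c_2(c_1f_1 + c_2f_2) + (c_1^2 + c_2^2)c_2g \end{pmatrix}$$
$$\qquad + \frac{e^{-kt}}{c_1^2 + c_2^2}\begin{pmatrix} c_2(c_2u_0 - c_1v_0) \\ c_1(c_1v_0 - c_2u_0) \end{pmatrix} + \frac{1 - e^{-kt}}{k(c_1^2 + c_2^2)}\begin{pmatrix} c_2(c_2f_1 - c_1f_2) \\ c_1(c_1f_2 - c_2f_1) \end{pmatrix}, \tag{3.16-167}$$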
where $u_0$ and $v_0$ are (thus far) arbitrary, and we note that the physical decaying portion is exactly divergence-free and thus will make no contribution to the penalty pressure. We shall soon specialize to the case $c_1u_0 + c_2v_0 = g$ (a la the index 2 solution), but first let us show the penalty pressure from (3.16-158), and then study the entire result:
$$P(t) = e^{-\lambda_+t}(c_1u_0 + c_2v_0 - g)/\varepsilon + (1 - e^{-\lambda_+t})\,\frac{c_1f_1 + c_2f_2 - kg}{\varepsilon k + c_1^2 + c_2^2}, \tag{3.16-168}$$
where we imagine $\varepsilon \to 0$. The first thing we observe is that the penalty transient is the only transient and that $P(0)$ is very large unless the initial velocity is divergence-free, for which case $P(0) = 0$. But very shortly after this bad start, both velocity and pressure from the penalty approximation become quite good. After the very fast penalty transient is 'finished' ($e^{-\lambda_+t}/\varepsilon \ll 1$), the pressure is clearly the same as that from index 2, to $O(\varepsilon)$ of course. The corresponding penalty velocity, after the penalty transient, is the following:
$$\begin{pmatrix} u \\ v \end{pmatrix} = \frac{1}{\varepsilon k + c_1^2 + c_2^2}\begin{pmatrix} \dfrac{\varepsilon c_1(c_1f_1 + c_2f_2)}{c_1^2 + c_2^2} + c_1g \\ \dfrac{\varepsilon c_2(c_1f_1 + c_2f_2)}{c_1^2 + c_2^2} + c_2g \end{pmatrix} + \frac{1 - e^{-kt}}{k(c_1^2 + c_2^2)}\begin{pmatrix} c_2(c_2f_1 - c_1f_2) \\ c_1(c_1f_2 - c_2f_1) \end{pmatrix} + \frac{e^{-kt}}{c_1^2 + c_2^2}\begin{pmatrix} c_2(c_2u_0 - c_1v_0) \\ c_1(c_1v_0 - c_2u_0) \end{pmatrix}, \tag{3.16-169}$$
which is $O(\varepsilon)$ away from the index 2 velocity. But for the more reasonable penalty simulation, we would start with a divergence-free (or close to divergence-free) velocity. Thus if we invoke $c_1u_0 + c_2v_0 = g$ in (3.16-167), we obtain, after some algebra, the pleasant result,
$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u \\ v \end{pmatrix}_{\text{index 2}} + \frac{\varepsilon P(t)}{c_1^2 + c_2^2}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}, \tag{3.16-170}$$
where now
$$P(t) = (1 - e^{-\lambda_+t})\left(\frac{c_1f_1 + c_2f_2 - kg}{\varepsilon k + c_1^2 + c_2^2}\right), \tag{3.16-171}$$
which, while starting at zero, approximates (3.16-166) to $O(\varepsilon)$ after completion of the penalty transient, and completes our 'comparison.'
We conclude the penalty discussion by noting that in general the pressure is recovered from (3.16-158) via $P = (c_1u + c_2v - g)/\varepsilon$, and is different from that in (3.16-143) by $O(\varepsilon)$ for $t \gtrsim O(1/\lambda_+)$. Note too that the initial penalty pressure is never correct, the simplest example being $P(0) = 0$ when the IC is divergence-free. The pressure rapidly adjusts, and often can change its amplitude by a very large [$O(\varepsilon^{-1})$] amount, during the penalty transient, an example of which we have already discussed for the full Stokes equations in Section 3.13.2e.
[Exercise for the reader, perhaps important: Show that the penalty transient can be bypassed and an $L_2$-projection of any initially 'divergent' velocity field to the discretely divergence-free subspace simultaneously obtained in just a single BE timestep by choosing $\Delta t$ such that $\lambda\Delta t \gg 1$. Show too that the corresponding pressure is really not a pressure at all, but $1/\Delta t$ times the potential function associated with the projection. The 'true' pressure is obtained at step 2, which need not have $\lambda\Delta t \gg 1$ (but could, as long as $\Delta t$ is still small enough for accuracy). Fortunately, as demonstrated by Gresho and Sani (1998), these results carry over intact to the full NS penalty ODE's, thus showing the 'proper' way to use the penalty method.]
f. Energetics
To conclude, we briefly study the 'kinetic energy,' $KE = \frac{1}{2}(u^2 + v^2)$, for the homogeneous DAE's ($f_1 = f_2 = g = 0$) and each of the above formulations:
1. Index 2. From (3.16-117) through (3.16-119) it is easily seen that
$$\frac{1}{2}\frac{d}{dt}(u^2 + v^2) + k_1u^2 + k_2v^2 = 0;$$
the KE is independent of the pressure and, since $k_1 > 0$ and $k_2 > 0$, it decays monotonically. This behavior properly simulates that of unforced Stokes flow.
2. Index 1. From (3.16-117), (3.16-118) and (3.16-121) can be derived
$$\frac{d}{dt}KE + \left[k_1u^2 + k_2v^2 - (c_1u + c_2v)(c_1k_1u + c_2k_2v)/(c_1^2 + c_2^2)\right] = 0,$$
which can be shown to also decay, but at the wrong rate unless $c_1u_0 + c_2v_0 = 0$.
3. Index 0. Here we combine (3.16-117), (3.16-118) with the second integral of $c_1\ddot u + c_2\ddot v = 0$ (i.e., $c_1u + c_2v = \alpha + \beta t$, where $\alpha$ and $\beta$ are constants that depend on the IC's) to obtain
$$\frac{d}{dt}KE + k_1u^2 + k_2v^2 + (\alpha + \beta t)P = 0,$$
where, from (3.16-122), $P = \gamma - (c_1k_1u + c_2k_2v)/(c_1^2 + c_2^2)$, where $\gamma$ is another constant. Since it is also true (and not hard to show) that $P(t)$ contains, in the general case (arbitrary IC's), a term linear in $t$, it follows that KE will contain terms up through $t^3$; it could become very large in magnitude, and be either positive or negative. Again, only if the proper IC's are chosen (divergence-free and $P_0$ satisfies the PPE at $t = 0$), will $\alpha$, $\beta$, and $\gamma$ vanish and the index 0 solution agree with that of index 2.
4. Penalty. Equations (3.16-158) yield
$$\frac{d}{dt}KE + k_1u^2 + k_2v^2 + \varepsilon P^2 = 0;$$
the KE will decay even faster than that for index 2; the penalty ODE's are (slightly) 'over-stable.'
g. Numerical integration
Another interesting and useful exercise that is (with the help of A.C. Hindmarsh) not too difficult to perform with this three-equation DAE system is to represent the numerical solution exactly, in terms of the eigenvalues and eigenvectors (and generalized eigenvectors), when a particular ODE time integration method is selected. Thus, in this section we shall present some closed form solutions when FE, BE, TR, and shortened TR (STR) are employed on both the index 2 and index 1 DAE's. But, since little more is to be learned for the inhomogeneous case with lots more effort, we shall present and discuss only the homogeneous case ($f_1 = f_2 = 0$, $g = 0$). Finally, as the details behind all eight methods would probably be more burdensome than interesting, we shall only present 'details' for two of them (FE and TR), leaving the rest as exercises, although we will present final results for all of them.
All begin with (3.16-123) with $F(t) = 0$. But we shall re-define both $A$ and $B$ to be method-dependent, so that each can be expressed in the form
$$By_{n+1} = Ay_n. \tag{3.16-172}$$
We shall let the reader determine $A$ and $B$ (the easy part) for all but one method; we show them for one method to 'set the stage': if FE is applied to the index 2 DAE's of (3.16-123), we easily obtain, with $y_n = (u_n, v_n, P_{n-1})^T$,
$$B = \begin{pmatrix} 1 & 0 & c_1\Delta t \\ 0 & 1 & c_2\Delta t \\ c_1\Delta t & c_2\Delta t & 0 \end{pmatrix} \tag{3.16-173}$$
and
$$A = \begin{pmatrix} 1 - k_1\Delta t & 0 & 0 \\ 0 & 1 - k_2\Delta t & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{3.16-174}$$
where we have multiplied the 'continuity' equation (3.16-119) by $\Delta t$ to obtain a useful symmetry. [For all of the remaining cases, $y_n = (u_n, v_n, P_n)^T$.]
Returning now to the general case, the next matrix of interest, and the one that will lead to the analytical solution, we shall call the decay matrix, $D$:
$$D = B^{-1}A, \tag{3.16-175}$$
because, once we find its spectrum and corresponding eigenvectors (and generalized eigenvectors when necessary/appropriate), say $\{\lambda_j, X_j\}$, we can obtain the analytical (and decaying, when solved properly) solution by first expressing $y_0$ in terms of the $\{X_j\}$,
$$y_0 = \sum_{j=1}^{3} a_jX_j, \tag{3.16-176}$$
which gives $a_1$, $a_2$, and $a_3$, and then using (3.16-172) to obtain
$$y_n = D^ny_0 = \sum_{j=1}^{3} a_jD^nX_j; \tag{3.16-177}$$
i.e., the solution is expressible in powers of the decay matrix operating on the basis vectors. If we were dealing with simple ODE's, the rest of the analysis would also be simple, because then
$$DX_j = \lambda_jX_j \tag{3.16-178}$$
easily leads to the final solution
$$y_n = \sum_{j=1}^{3} a_j\lambda_j^nX_j. \tag{3.16-179}$$
But we are dealing with DAE's, not ODE's, and the procedure is not quite so simple, because:
(i) $D$ may be singular (FE, BE and STR on index 2; FE and BE on index 1);
(ii) $D$ may be defective (FE, BE, and TR on index 2);
(iii) both may be true (FE and BE on index 2);
(iv) the IC's cannot (or, at least, should not) be freely chosen, even though (3.16-176) seems to suggest otherwise (more later).
In our small three-equation system, we will always obtain a (the) physically-decaying mode, whose eigenvalue and eigenvector approximate those of the continuous DAE solution; namely, (3.16-126) and (3.16-127). Corresponding, however, to the two infinite eigenvalues for index 2 and one infinite and one zero eigenvalue for index 1, is a method-dependent range of behavior, which is part of the reason that numerical time integration of DAE's is not quite so straightforward. In our model problem, those cases with defective matrices (repeated eigenvalues with less than the desired number of linearly-independent eigenvectors) are of the simplest type: one repeated eigenvalue (a double root) with but one eigenvector. So this is the (only) case that we will present, and we shall let $(\lambda_1, X_1)$ denote the single corresponding pair, even though $\lambda_2 = \lambda_1$ (i.e., $\lambda_1$ occurs twice). The (Jordan) theory of generalized eigenvectors then generates the second independent vector, $X_2$, according to
$$DX_1 = \lambda_1X_1 \tag{3.16-180}$$
and
$$DX_2 = \lambda_1X_2 + X_1. \tag{3.16-181}$$
This is the required generalization: $X_2$ is the generalized eigenvector corresponding to the repeated eigenvalue $\lambda_1$. In this case, it is not hard to obtain the proper generalization of (3.16-177) through (3.16-179), starting with (3.16-176):
$$y_n = D^ny_0 = \sum_{j=1}^{3} a_jD^nX_j = a_1\lambda_1^nX_1 + a_2(\lambda_1^nX_2 + n\lambda_1^{n-1}X_1) + a_3\lambda_3^nX_3 \tag{3.16-182}$$
is the exact solution to the discretized version of the DAE's when $D$ is defective of degree 1. (It will turn out that $\lambda_1 = 0$ for the two Euler methods and $\lambda_1 = -1$ for TR.)
We are now ready to apply this theory to the several chosen time-marching methods (with fixed $\Delta t$).
o Index 2. As mentioned earlier, we shall show $D$ for only FE and TR. Thus,
$$D_{FE} = B_{FE}^{-1}A_{FE} = \frac{1}{L}\begin{pmatrix} c_2^2(1 - k_1\Delta t) & -c_1c_2(1 - k_2\Delta t) & 0 \\ -c_1c_2(1 - k_1\Delta t) & c_1^2(1 - k_2\Delta t) & 0 \\ c_1(1 - k_1\Delta t)/\Delta t & c_2(1 - k_2\Delta t)/\Delta t & 0 \end{pmatrix}, \tag{3.16-183}$$
where $L = c_1^2 + c_2^2$ ('Laplacian'); and
$$D_{TR} = \frac{1}{L(1 + \lambda_0\Delta t/2)}\begin{pmatrix} c_2^2 - c_1^2 - L\lambda_0\Delta t/2 & -2c_1c_2 & 0 \\ -2c_1c_2 & c_1^2 - c_2^2 - L\lambda_0\Delta t/2 & 0 \\ \dfrac{4(1 + k_2\Delta t/2)c_1}{\Delta t} & \dfrac{4(1 + k_1\Delta t/2)c_2}{\Delta t} & -L(1 + \lambda_0\Delta t/2) \end{pmatrix}, \tag{3.16-184}$$
where the ominous factor of $1/\Delta t$ is related to the possibility of employing inconsistent IC's, $\lambda_0$ is the viscous decay rate, given by (3.16-126), and we remark that only TR and STR have non-zero entries in the third column of $D$. Next, we present $\{\lambda_j, X_j\}$ for the four selected methods, all obtained by solving (3.16-180), (3.16-181) and, for $j = 3$, (3.16-178). In tabular form with $\Delta k = k_2 - k_1$, we obtain:
( c2 \
FE:X{ =
X2 =
(A, = 0)
cxAt/{\ -k{At)
c2At/{\ -k2At)
(X2 = A, = 0)
X, =
c2
-c\
C]C2Ak
\L(\ -X0At)/
(X3 = 1 - XQAt)
BE :Xl =
X2 =
(A, =0)
TR:XX =
(A, =-1)
X2 =
STR : X\ =
(A, =-1)
(1 -k2At/2)c{At/2\
X2= ( (1 -k{At/2)c2At/2 )
(A.2 = 0)
(X3,X3) asforTR,
where
_ (1 - k2At/2)c\yx + (1 -kxAt!2)c\y2
Z~ L(\+X0At/2)
(3.16-185)
with
K, = 1 + AkAt/2 - k{k2At2/4 (3.16-186)
and
K2 = 1 - AkAt/2 - k{k2At2/A (3.16-187)
In addition to noting that $X_1$ is common to all methods and that STR is rather more 'complicated' than the others, we offer the following
Remarks:
(1) For FE, BE, and TR, $X_2$ is a generalized eigenvector. The fact that $X_2$ 'vanishes' as $\Delta t \to 0$ for all methods is related to the fact that its sole purpose is to force the solution to satisfy any non-div-free IC, a statement that will be more clear when we present the full solutions.
(2) $\lambda_j = 0$ in the discrete equations corresponds to $\lambda_j = \infty$ in the DAE's; both are 'active' only initially, and serve to 'adjust' any incompatible initial data.
(3) Both BE and FE will quickly ($n \geq 1$) cause satisfaction of the div-free constraint, $c_1u_n + c_2v_n = 0$, and of the (implied) PPE, $LP_n + c_1k_1u_n + c_2k_2v_n = 0$, per (3.16-121).
(4) TR, with $\lambda_1 = \lambda_2 = -1$, will not change the data; rather, it will generate wiggle signals for incompatible IC's.
(5) STR will quickly enforce the continuity equation (via $\lambda_2 = 0$), but not the PPE.
(6) The isolated singularities for FE at particular values of At are easily explained,
(i) At —► \/k\ or \/k2 causes a significant change in D; either the first or second
column becomes zero [cf. (3.16-174)] thus necessitating a special case analysis—with
the result that X = (0, 0, — c\c\IXqL}) with three (new) linearly independent
eigenvectors, (ii) For At = \/X0, D has the spectrum (0, 0, 0), and the eigenspace
is completed via the construction of two generalized eigenvectors; i.e., DX\ = 0,
DX2 = Xx, and DX3 = X2. Similarly, STR with At = 2/X0 becomes defective (A.3 =
X2 = 0) and requires separate analysis with another generalized eigenvector.
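Finally, we are ready to present the four analytical solutions in terms of the $\{\lambda_j\}$, $\{X_j\}$, and $\{a_j\}$.
1. FE. A three stage presentation is required for FE; i.e., it is most convenient to consider separately the cases $n = 0$, $n = 1$, and $n \geq 2$.
(i) $n = 0$:
$$\begin{pmatrix} u_0 \\ v_0 \\ P_{-1} \end{pmatrix} = a_1X_1 + a_2X_2 + a_3X_3 \tag{3.16-188}$$
gives $a_2$ and $a_3$, and we note that both $P_{-1}$ and $a_1$ are irrelevant:
$$a_2 = \frac{(c_1u_0 + c_2v_0)(1 - k_1\Delta t)(1 - k_2\Delta t)}{\Delta t\,L(1 - \lambda_0\Delta t)},$$
$$a_3 = \frac{1}{c_2}\left(u_0 - \frac{\Delta t\,c_1a_2}{1 - k_1\Delta t}\right).$$
(ii) $n = 1$:
$$\begin{pmatrix} u_1 \\ v_1 \\ P_0 \end{pmatrix} = a_2X_1 + a_3(1 - \lambda_0\Delta t)X_3, \tag{3.16-189}$$
which used $DX_2 = X_1$, and gives $P_0$ (and, of course, $u_1$ and $v_1$), which initial pressure satisfies $LP_0 + k_1c_1u_0 + k_2c_2v_0 = (c_1u_0 + c_2v_0)/\Delta t$.
(iii) $n \geq 2$:
$$\begin{pmatrix} u_n \\ v_n \\ P_{n-1} \end{pmatrix} = a_3(1 - \lambda_0\Delta t)^nX_3, \tag{3.16-190}$$
which satisfies both $c_1u_n + c_2v_n = 0$ and the PPE, $LP_n + k_1c_1u_n + k_2c_2v_n = 0$.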
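The three-stage FE behavior can be reproduced by stepping the homogeneous index 2 system directly. The sketch below solves $By_{n+1} = Ay_n$ from (3.16-173) and (3.16-174) in closed form at each step, using exact rationals; the coefficient values and the incompatible initial velocity are assumed for illustration, and `fe_step` is a hypothetical helper name.

```python
from fractions import Fraction as Fr

def fe_step(u, v, c1, c2, k1, k2, dt):
    """One forward Euler step of the homogeneous index 2 model.

    Returns (u_{n+1}, v_{n+1}, P_n); P_n is the pressure that makes
    the new velocity satisfy c1*u + c2*v = 0.
    """
    L = c1**2 + c2**2
    r1, r2 = (1 - k1 * dt) * u, (1 - k2 * dt) * v
    P = (c1 * r1 + c2 * r2) / (L * dt)
    return r1 - c1 * dt * P, r2 - c2 * dt * P, P

# Assumed sample values; (u0, v0) is deliberately not divergence-free.
c1, c2, k1, k2, dt = Fr(1), Fr(2), Fr(3), Fr(5), Fr(1, 100)
L = c1**2 + c2**2
lam0 = (c1**2 * k2 + c2**2 * k1) / L   # viscous decay rate
u0, v0 = Fr(3), Fr(4)

u1, v1, P0 = fe_step(u0, v0, c1, c2, k1, k2, dt)
u2, v2, P1 = fe_step(u1, v1, c1, c2, k1, k2, dt)

div1 = c1 * u1 + c2 * v1                       # zero after one step
ppe0 = L * P0 + k1 * c1 * u0 + k2 * c2 * v0    # equals div0 / dt
ppe1 = L * P1 + k1 * c1 * u1 + k2 * c2 * v1    # zero from n = 1 onward
```

After the first step the velocity is divergence-free, the first pressure carries the $1/\Delta t$ imprint of the incompatible data, and from then on each step simply multiplies the velocity by $1 - \lambda_0\Delta t$.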
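2. BE. Again a three stage approach seems most useful, even though now the pressure index agrees with that for velocity.
(i) $n = 0$:
$$\begin{pmatrix} u_0 \\ v_0 \\ P_0 \end{pmatrix} = a_1X_1 + a_2X_2 + a_3X_3 \tag{3.16-191}$$
gives the three $a$'s:
$$a_1 = P_0 - c_1c_2\Delta k\,a_3/L,$$
$$a_2 = (c_1u_0 + c_2v_0)/\Delta t\,L,$$
$$a_3 = (u_0 - \Delta t\,c_1a_2)/c_2.$$
(ii) $n = 1$:
$$\begin{pmatrix} u_1 \\ v_1 \\ P_1 \end{pmatrix} = a_2X_1 + \frac{a_3}{1 + \lambda_0\Delta t}X_3, \tag{3.16-192}$$
using $DX_2 = X_1$, which satisfies $c_1u_1 + c_2v_1 = 0$ and gives a $P_1$ that is independent of $P_0$ and satisfies
$$LP_1 + k_1c_1u_1 + k_2c_2v_1 = (c_1u_0 + c_2v_0)/\Delta t.$$
(iii) $n \geq 2$:
$$\begin{pmatrix} u_n \\ v_n \\ P_n \end{pmatrix} = \frac{a_3}{(1 + \lambda_0\Delta t)^n}X_3, \tag{3.16-193}$$
which satisfies $c_1u_n + c_2v_n = 0$ and the PPE.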
3. TR. Here a 'single-stage' presentation 'works', but a different set of 'complications' arises. For $n \geq 0$,
$$\begin{pmatrix} u_n \\ v_n \\ P_n \end{pmatrix} = a_1(-1)^nX_1 + a_2(-1)^n(X_2 - nX_1) + a_3\left(\frac{1 - \lambda_0\Delta t/2}{1 + \lambda_0\Delta t/2}\right)^nX_3, \tag{3.16-194}$$
where we have used the fact that $DX_2 = X_1 - X_2$. Again, $n = 0$ gives the three $a$'s:
$$a_1 = P_0 - c_1c_2\Delta k\,a_3/L,$$
$$a_2 = 4(c_1u_0 + c_2v_0)/\Delta t\,L,$$
$$a_3 = (u_0 - \Delta t\,c_1a_2/4)/c_2,$$
and we have
(i) $c_1u_n + c_2v_n = (-1)^n(c_1u_0 + c_2v_0)$
(ii) LPn+k[c[un+k2c2vn = {-\)n
(Hi) LPn +Pn + \ + klC{ ^n +Un + X ^ + ^ (Vn +Vn + , ^ = (_ { )n QUo+^O,
4. STR. For this last case, a two stage presentation seems best.
(i) $n = 0$:
$$\begin{pmatrix} u_0 \\ v_0 \\ P_0 \end{pmatrix} = a_1X_1 + a_2X_2 + a_3X_3 \tag{3.16-195}$$
^,-(n-A0A^/4)(c^0A+^)
o + c:
At/2
Vr,
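gives the three $a$'s:
$$a_1 = P_0 - c_1c_2\Delta k\,a_3/L - z\,a_2,$$
$$a_2 = (c_1u_0 + c_2v_0)\Big/\left[\frac{\Delta t}{2}\,L(1 - \lambda_0\Delta t/2)\right],$$
$$a_3 = \frac{1}{c_2}\left[u_0 - \frac{\Delta t}{2}\,c_1(1 - k_2\Delta t/2)\,a_2\right].$$
(ii) $n \geq 1$:
$$\begin{pmatrix} u_n \\ v_n \\ P_n \end{pmatrix} = a_1(-1)^nX_1 + a_3\left(\frac{1 - \lambda_0\Delta t/2}{1 + \lambda_0\Delta t/2}\right)^nX_3, \tag{3.16-196}$$
which solution satisfies $c_1u_n + c_2v_n = 0$,
$$LP_n + k_1c_1u_n + k_2c_2v_n = (-1)^nLa_1,$$
and
$$L\left(\frac{P_n + P_{n+1}}{2}\right) + k_1c_1\left(\frac{u_n + u_{n+1}}{2}\right) + k_2c_2\left(\frac{v_n + v_{n+1}}{2}\right) = 0.$$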
Remarks:
(1) Only the first-order (Euler) methods mimic the DAE solution when the initial conditions are not compatible; i.e., they change the 'data' at step 1 to satisfy the algebraic constraints.
(2) In all of the cases, if and only if $u_0$ and $v_0$ are selected properly ($c_1u_0 + c_2v_0 = 0$), will $a_2$ be zero. If, in addition (and only if) $P_0$ for the three implicit methods is selected properly ($LP_0 + k_1c_1u_0 + k_2c_2v_0 = 0$), will $a_1$ also be zero.
(3) For the general (ill-posed) case, three of the four methods (FE, BE, and STR) will, for $\Delta t \to 0$, generate at the first time step the same velocity solution as do the DAE's in (3.16-147), corresponding to an $L_2$-projection to the divergence-free subspace. TR, however, will 'preserve the div', up to the $(-1)^n$ wiggle-signal factor. The corresponding three pressures ($P_0$ for FE, $P_1$ for BE and STR), multiplied by $\Delta t$, are actually the Lagrange multipliers associated with the projection.
(4) For the ill-posed case ($c_1u_0 + c_2v_0 \neq 0$), TR will send out a strong wiggle-signal in the pressure, the $n(-1)^n$ term a la (3.16-36), even though the average pressure, $(P_n + P_{n+1})/2$, does not grow, a la STR's pressure (whose average $P$ still satisfies the PPE).
(5) Even if the initial velocity is div-free, the pressures from TR and STR will still oscillate unless $P_0$ is compatible (making $a_1 = 0$).
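o Index 1. As for index 2, we begin by displaying the decay matrix for FE and TR:
$$D_{FE} = \frac{1}{L}\begin{pmatrix} L - c_2^2k_1\Delta t & c_1c_2k_2\Delta t & 0\\ c_1c_2k_1\Delta t & L - c_1^2k_2\Delta t & 0\\ -k_1c_1 & -k_2c_2 & 0\end{pmatrix} \tag{3.16-197}$$
and
$$D_{TR} = \frac{1}{L[1+\lambda_0\Delta t/2]}\begin{pmatrix} L + \frac{\Delta t}{2}(c_1^2k_2 - c_2^2k_1) & c_1c_2k_2\Delta t & 0\\ c_1c_2k_1\Delta t & L + \frac{\Delta t}{2}(c_2^2k_1 - c_1^2k_2) & 0\\ -k_1c_1(2+k_2\Delta t) & -k_2c_2(2+k_1\Delta t) & -L(1+\lambda_0\Delta t/2)\end{pmatrix} \tag{3.16-198}$$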
The results of the eigensystem analysis are presented next for FE, BE, and TR
(leaving the more complex STR as an exercise). It turns out that all three have (almost) the
same eigenvectors, and no generalized eigenvectors are required since the eigenvalues
are distinct. (Lower index systems are always simpler than higher ones.) The common
eigenvectors are as follows:
$$X_1 = \begin{pmatrix}0\\ 0\\ 1\end{pmatrix},\quad X_2 = \begin{pmatrix}c_1k_2\\ c_2k_1\\ -k_1k_2\end{pmatrix},\quad X_3 = \begin{pmatrix}c_2\\ -c_1\\ \rho\,c_1c_2\Delta k/L\end{pmatrix} \tag{3.16-199}$$
where $\rho = 1$ for BE and TR, and $\rho = 1/(1 - \lambda_0\Delta t)$ for FE, owing to the index 'shift'
between velocity and pressure. Note that (up to the $\rho$-factor) these are also the eigenvectors
of the continuous DAE problem derived in Section 3.16.2c.
The eigenvalues of D are as follows:

Method   $\lambda_1$   $\lambda_2$   $\lambda_3$
FE       0             1             $1 - \lambda_0\Delta t$
BE       0             1             $1/(1 + \lambda_0\Delta t)$
TR       $-1$          1             $(1 - \lambda_0\Delta t/2)/(1 + \lambda_0\Delta t/2)$

where we point out that the 'job' of $\lambda_2$ is to preserve the div; it corresponds to the zero
eigenvalue in the continuous DAE system. The zero eigenvalues for the Euler methods
correspond to the infinite eigenvalue for the index 1 DAE's, and their job is to enforce
the PPE for $t > 0$ even if it is not initially satisfied.
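The eigenpairs quoted above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the parameter values are hypothetical, the FE state vector is taken to be $(u_n, v_n, P_{n-1})$, and $\Delta k$ is assumed to mean $k_2 - k_1$ (the sign that makes $X_3$ an eigenvector of the matrices as written here).

```python
import numpy as np

# Hypothetical parameter values for the three-equation model problem.
k1, k2, c1, c2, dt = 1.0, 3.0, 0.6, 0.8, 0.1
L = c1**2 + c2**2
lam0 = (c1**2 * k2 + c2**2 * k1) / L   # physical decay rate
dk = k2 - k1                           # assumed meaning of Delta k

def decay_matrix_fe():
    """Index 1 forward Euler decay matrix; state is (u_n, v_n, P_{n-1})."""
    return np.array([
        [L - c2**2 * k1 * dt, c1 * c2 * k2 * dt, 0.0],
        [c1 * c2 * k1 * dt, L - c1**2 * k2 * dt, 0.0],
        [-k1 * c1, -k2 * c2, 0.0],
    ]) / L

def decay_matrix_tr():
    """Index 1 trapezoid rule decay matrix; state is (u_n, v_n, P_n)."""
    s = 1.0 + lam0 * dt / 2.0
    return np.array([
        [L + 0.5 * dt * (c1**2 * k2 - c2**2 * k1), c1 * c2 * k2 * dt, 0.0],
        [c1 * c2 * k1 * dt, L + 0.5 * dt * (c2**2 * k1 - c1**2 * k2), 0.0],
        [-k1 * c1 * (2 + k2 * dt), -k2 * c2 * (2 + k1 * dt), -L * s],
    ]) / (L * s)

def eigenvectors(rho):
    """The common eigenvectors X1, X2, X3 for a given rho-factor."""
    X1 = np.array([0.0, 0.0, 1.0])
    X2 = np.array([c1 * k2, c2 * k1, -k1 * k2])
    X3 = np.array([c2, -c1, rho * c1 * c2 * dk / L])
    return X1, X2, X3

def residuals(D, eigenvalues, rho):
    """Norms of D X - lambda X for each claimed eigenpair."""
    return [np.linalg.norm(D @ X - lam * X)
            for lam, X in zip(eigenvalues, eigenvectors(rho))]

fe_res = residuals(decay_matrix_fe(),
                   (0.0, 1.0, 1.0 - lam0 * dt),
                   1.0 / (1.0 - lam0 * dt))
tr_res = residuals(decay_matrix_tr(),
                   (-1.0, 1.0, (1 - lam0 * dt / 2) / (1 + lam0 * dt / 2)),
                   1.0)
print("FE residuals:", fe_res)
print("TR residuals:", tr_res)
```

All six residuals should be at round-off level if the matrices and eigenvectors are consistent with one another.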
Finally, we present the analytical solutions, which, like the eigenvectors, are very
similar.
1. FE.
(i) $n = 0$:
$$\begin{pmatrix}u_0\\ v_0\\ P_{-1}\end{pmatrix} = a_1X_1 + a_2X_2 + a_3X_3 \tag{3.16-200}$$
gives $a_2$ and $a_3$, and again we note that $a_1$ and $P_{-1}$ are irrelevant:
$$a_2 = (c_1u_0 + c_2v_0)/(\lambda_0L),$$
$$a_3 = (u_0 - c_1k_2a_2)/c_2.$$
(ii) $n \ge 1$:
$$\begin{pmatrix}u_n\\ v_n\\ P_{n-1}\end{pmatrix} = a_2X_2 + a_3(1 - \lambda_0\Delta t)^nX_3 \tag{3.16-201}$$
which, for $n = 1$, gives $P_0$ satisfying the PPE. Also, for all $n \ge 0$ the solution
satisfies the PPE and
$$c_1u_n + c_2v_n = c_1u_0 + c_2v_0; \tag{3.16-202}$$
the index 1 solution preserves the initial divergence, as noted many times
earlier.
2. BE. For all $n \ge 0$,
$$\begin{pmatrix}u_n\\ v_n\\ P_n\end{pmatrix} = a_1(0)^nX_1 + a_2X_2 + \frac{a_3}{(1+\lambda_0\Delta t)^n}\,X_3, \tag{3.16-203}$$
where $n = 0$ gives, using $(0)^0 = 1$, the $a$'s:
$$a_1 = P_0 + k_1k_2a_2 - c_1c_2\Delta k\,a_3/L,$$
with $a_2$ and $a_3$ the same as those above for FE. This solution also satisfies (3.16-202)
for all n and satisfies the PPE for $n > 0$. (The value of $P_0$ for $n = 0$ is actually quite
irrelevant.)
3. TR. For $n \ge 0$, the solution is
$$\begin{pmatrix}u_n\\ v_n\\ P_n\end{pmatrix} = a_1(-1)^nX_1 + a_2X_2 + a_3\left(\frac{1-\lambda_0\Delta t/2}{1+\lambda_0\Delta t/2}\right)^n X_3 \tag{3.16-204}$$
with $n = 0$ giving the same three $a$'s as for BE. Also, the divergence is preserved and
the pressure satisfies
$$LP_n + k_1c_1u_n + k_2c_2v_n = La_1(-1)^n;$$
i.e. it 'rings' unless $a_1 = 0$, for which case it satisfies the PPE for all n, like the two
Euler schemes, even when the solution is not div-free.
Final remarks:
(1) If (and only if) the two compatibility conditions are satisfied by the initial data,
then each ODE method will produce identical solutions for index 1 and index 2
(PPE) formulations, and only the physical eigenvalue, $\lambda_3$, which approximates $\lambda_0$,
is needed.
(2) Noting that the correct final ($t \to \infty$) solution is zero makes it clear that the index 1
system will generally attain a spurious non-zero final state; only $a_2 = 0$ (and $a_1 = 0$
for TR) will lead to the correct steady state (and transient state, of course).
[Exercise for the reader: Obtain the projected ODE's by eliminating P, which introduces the projection matrix (see
Appendix 3)
$$\mathcal{P} = \frac{1}{L}\begin{pmatrix} c_2^2 & -c_1c_2\\ -c_1c_2 & c_1^2\end{pmatrix}.$$
Show that this ostensibly simpler approach (two equations vs three) is equivalent (only) to the index 1/PPE approach
in that it will preserve the initial divergence. [We stayed with the full system for yet another reason: in the next section we
generalize this approach from a (2 + 1)-system to an (n + m)-system.]]
As a concluding remark, we believe that a clear understanding of the model problem
(plus the next section, 3.16.3) and how the various ODE methods 'solve' the DAE's is
bound to be helpful as one applies, or just contemplates applying, the same method to
the full DAE's that approximate the NSE's.
h. Final exercise
To close this portion of the extended introductory discussion of DAE's, let us show how
easy it is to go astray by not being sufficiently careful. Since (3.16-121) was obtained
from (3.16-119), let us see what happens if we use this pair to eliminate P from the
index 1 system. Inserting P from (3.16-121) into (3.16-117), (3.16-118) yields
$$\dot u + \frac{k_1c_2^2u - k_2c_1c_2v}{c_1^2 + c_2^2} = [c_2(c_2f_1 - c_1f_2) + c_1\dot g]/(c_1^2 + c_2^2)$$
and
$$\dot v + \frac{k_2c_1^2v - k_1c_2c_1u}{c_1^2 + c_2^2} = [c_1(c_1f_2 - c_2f_1) + c_2\dot g]/(c_1^2 + c_2^2). \tag{3.16-205}$$
Invoking (3.16-119) in these equations leads to the simplified set,
$$\dot u + \lambda u = [c_2(c_2f_1 - c_1f_2) + c_1(\dot g + k_2g)]/(c_1^2 + c_2^2)$$
and
$$\dot v + \lambda v = [c_1(c_1f_2 - c_2f_1) + c_2(\dot g + k_1g)]/(c_1^2 + c_2^2),$$
where
$$\lambda = (c_1^2k_2 + c_2^2k_1)/(c_1^2 + c_2^2) \tag{3.16-206}$$
is the physical decay rate. This result leads to two interesting observations:
1. u and v have become uncoupled.
2. The divergence, $c_1u + c_2v - g = w(t)$, satisfies $dw/dt + \lambda w = 0$, and thus $w(t) = w_0e^{-\lambda t}$; any initial divergence will decay toward zero at the viscous decay rate.
Both of these results are spurious in general: u and v are correct as given above only
if $w_0 = 0$, and the correct general solution of the index 1 DAE has $w(t) = w_0$ with no
decay. The error was caused by assuming the validity of (3.16-119) 'just because' it was
used to obtain (3.16-121). One must be very careful about 'going backwards.'
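The pitfall can be seen numerically. The sketch below (hypothetical parameter values, homogeneous data $f = g = 0$) propagates the same incompatible initial velocity with the honest index 1 system, obtained by eliminating P via the PPE, and with the 'simplified' uncoupled system. The matrix names and the helper `propagate` are illustrative only.

```python
import numpy as np

# Hypothetical parameter values; f = g = 0 throughout.
k1, k2, c1, c2 = 1.0, 3.0, 0.6, 0.8
L = c1**2 + c2**2
lam = (c1**2 * k2 + c2**2 * k1) / L          # physical decay rate

# Index 1 system with P eliminated via the PPE: d(u,v)/dt = -A_true (u,v).
A_true = np.array([[k1 * c2**2, -k2 * c1 * c2],
                   [-k1 * c1 * c2, k2 * c1**2]]) / L
# "Simplified" system obtained by re-invoking c1*u + c2*v = g (the error).
A_spur = lam * np.eye(2)

def propagate(A, w0, t):
    """Exact solution of dw/dt = -A w using an eigendecomposition of A."""
    vals, vecs = np.linalg.eig(A)
    return (vecs @ np.diag(np.exp(-vals * t)) @ np.linalg.inv(vecs) @ w0).real

def divergence(w):
    return c1 * w[0] + c2 * w[1]

w0 = np.array([1.0, 0.5])                    # deliberately not divergence-free
t = 2.0
div0 = divergence(w0)
div_true = divergence(propagate(A_true, w0, t))
div_spur = divergence(propagate(A_spur, w0, t))
print("initial div:", div0)
print("index 1 div at t:", div_true)         # held at its initial value
print("'simplified' div at t:", div_spur)    # decays like exp(-lam * t)
```

The honest index 1 system holds the divergence at its initial value, whereas the 'simplified' system lets it decay at the viscous rate.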
3.16.3 Analytical Solution of the Stokes Equations
a. Introduction
In a continuation of our diversion into the mathematics of DAE's, we generalize here
from the three-equation model problem to the general n + m equation system describing
Stokes flow. It is 'analytical' only in the sense that a closed-form solution in terms of
the appropriate eigenvectors is developed; in no sense are these eigenvectors 'analytical.'
It is useful only in that it may provide (as did the previous section, hopefully) a deeper
insight into some of the subtleties/difficulties of incompressible flow, and is similar in
spirit to the eigenvector expansion presented earlier, near the end of Section 3.13.2d, for
the steady Stokes equations.
The details behind the summary of results to be presented below are too many to be
useful, we believe, and are thus left as exercises for the reader. (But see Malkus, 1981,
for some guidance.) Here we just present 'final' results, for both index 1 and index 2,
and point out their similarity with the model problem solutions.
b. Index 2
We start with
$$M\dot u + Ku + CP = f(t),\quad u(0) = u_0, \tag{3.16-207}$$
$$C^Tu = g(t),\quad C^Tu_0 = g(0), \tag{3.16-208}$$
where u is an n-vector and P is an m-vector; or, expressed in singular ODE form,
$$B\dot y + Ay = F(t),\quad y(0) = y_0, \tag{3.16-209}$$
where $y = \binom{u}{P}$, $F = \binom{f}{g}$, $B = \begin{pmatrix}M & 0\\ 0 & 0\end{pmatrix}$ is singular, and $A = \begin{pmatrix}K & C\\ C^T & 0\end{pmatrix}$ is not (we preclude
pressure modes for 'convenience', until the end). For $y_0 = (u_0, P_0)^T$, simply take $P_0 = 0$
because any IC's attempted to be imposed on P will be ignored; this applies also to the
index 1 formulation, considered next. (It does not apply to the index 0 formulation, which
we do not consider anyway.)
In order to obtain our 'analytical' solution, we begin as we did with the three-equation
model problem of the previous section; i.e., we first seek the homogeneous ($f = g = 0$)
solution in the form $y = xe^{-\lambda t}$ to obtain the following generalized symmetric (n + m)-dimensional eigenproblem [$x^T = (v^T, q^T)$]:
$$Kv_j + Cq_j = \lambda_jMv_j, \tag{3.16-210}$$
$$C^Tv_j = 0, \tag{3.16-211}$$
or, equivalently, a la (3.16-124),
$$Ax_i = \lambda_iBx_i. \tag{3.16-212}$$
This eigenproblem, which turns out to be defective (there are repeated eigenvalues and
fewer than n + m independent eigenvectors), was 'solved' by Malkus (1981), who called
it the 'natural modes' eigenproblem and who also did consider pressure modes (which
he called 'ill-disposed' modes); results:
1. There are only n − m finite (and positive) eigenvalues, and the associated (divergence-free
and linearly independent) eigenvectors $(v_j^T, q_j^T)^T$ are M-orthogonal in the velocity
part (and we take $v_i^TMv_i = 1$; i.e., we presume that they have been normalized). That these
are positive follows easily from (3.16-210) and (3.16-211): $v_j^TKv_j + v_j^TCq_j = \lambda_jv_j^TMv_j$
and $v_j^TCq_j = q_j^TC^Tv_j = 0$; since both K and M are SPD, $\lambda_j = v_j^TKv_j/v_j^TMv_j > 0$.
2. There are m repeated infinite eigenvalues with a corresponding linearly independent
set of Q-orthogonal eigenvectors in the pressure only [Q is the pressure mass matrix, and
(3.16-212) becomes $Bx_j = 0$]:
$$\begin{pmatrix}v_j\\ q_j\end{pmatrix} = \begin{pmatrix}0\\ \bar q_j\end{pmatrix}, \tag{3.16-213}$$
where the $\{\bar q_j\}$ can be obtained from the subsidiary (and smaller, dimension m)
eigenproblem
$$(C^TM^{-1}C)\bar q_j = \nu_jQ\bar q_j, \tag{3.16-214}$$
called the 'Adjoint LBB eigenproblem' by Malkus, in which the m eigenvalues $\{\nu_j\}$ are
positive (and we take $\bar q_j^TQ\bar q_j = 1$).
3. The remaining m basis vectors (also with $\lambda = \infty$) are generalized eigenvectors [and
therefore need not/do not satisfy (3.16-212)] in the velocity only (and can be obtained
by the application of Jordan theory to the inverse eigenproblem, $Bx_j = \mu_jAx_j$, where
$\lambda_j = 1/\mu_j$), one for each $\bar q_j$, and are given by [see (3.16-131) for guidance]
$$\begin{pmatrix}v_j\\ q_j\end{pmatrix} = \begin{pmatrix}M^{-1}C\bar q_j\\ 0\end{pmatrix}. \tag{3.16-215}$$
These vectors are not divergence-free ($C^Tv_j = C^TM^{-1}C\bar q_j = \nu_jQ\bar q_j$), nor are they
required to be [see (3.16-131) and (3.16-132) and below] and also 'belong to' the m
infinite values of $\lambda_j$. They also correspond to 'gradients,' and are annihilated by the
projection matrix (see Appendix 3), $\mathcal{P} = I - M^{-1}C(C^TM^{-1}C)^{-1}C^T$, because they are
(M-) orthogonal to divergence-free vectors. In fact, they are therefore regarded as curl-free
('discretely irrotational') vectors (Griffiths, 1996).
Remarks:
(1) There is an error in the first sentence of Theorem 3 in Malkus (1981), from which
the above results were obtained: n — m should be just n.
(2) Equation (3.16-214) approximates the continuous eigenproblem for the Laplacian
operator, which of course implies a wide range of eigenvalues, from $O(1)$ to
$O(1/h^2)$, where h is a linear measure of element size.
The (n + m)-vector space basis is now complete and we can return to (3.16-207),
(3.16-208) and expand both the data and the solution in terms of these basis vectors.
Some 'tricks' are required—and they are mostly analogous to those used in the previous
analysis of the model problem; the results are
ft)-^tt)+i.H-^C)+^p)}
,/=l ' j=n-m+\
(3.16-216)
where
ai{t) = v]MuQ + Xi I b^{x)Qk'T dx; i = \,2, ... ,n; (3.16-217)
Jo
b\i{t) = qTig{t)/vl, i = n-m+\,...,n\ (3.16-218)
1 1 "
MO = T-vJf(t) - - J2 bij(t)vjK(M-lCqj),
Xj *"i ■ , ,
j=n—m+\
i = 1,2 n-m\ (3.16-219)
1 1 "
bQi(t) = -(M-lCqff(t) - - V bXj{t){M-yCqjjIK{M-xCq.\
i = n-m+\,...,n. (3.16-220)
Surely it would be less work to simply numerically integrate the Stokes DAE's! But
it may be useful to at least realize what the numerical integrations are striving for.
Remarks:
(1) The first n — m modes correspond to divergence-free viscous decay.
(2) The first of the m 'equilibrium' modes [those with $(v_j, q_j) = (0, \bar q_j)$ and $\lambda_j = \infty$]
'cause' satisfaction of the PPE, $(C^TM^{-1}C)P = C^TM^{-1}(f - Ku) - \dot g$, the 'hidden'
algebraic constraint, and, of course, also involve differentiation of the data, which
can help to explain why ODE methods sometimes deliver lower local accuracy for
P than u: numerical differentiation. The PPE is satisfied for $t \ge 0$ if and only if
$C^Tu_0 = g(0)$; otherwise it is only satisfied for $t > 0$.
(3) The second of the m equilibrium modes (those with generalized eigenvectors and
$\lambda_j = \infty$) 'cause' satisfaction of the original constraint, $C^Tu = g$, for $t > 0$. These
equations actually give the solution for arbitrary $u_0$, but only if $C^Tu_0 = g(0)$ will
$C^Tu = g$ for $t \ge 0$; otherwise the solutions will be non-smooth and suffer a jump at
$t = 0^+$ via an $L_2$-projection, as discussed in the previous Section (3.16.2b).
(4) Recalling that g(t) corresponds to specified BC's and is thus very sparsely populated,
it seems that lots of basis vectors (all m generalized eigenvectors) are 'used up' just
to enforce a boundary condition! And frequently the BC has g = 0! A partial
explanation of this situation is this: the general solution does not specifically recognize
that g(t) is sparse and therefore could also solve the more general (and non-physical)
problem, in which g(t) could also correspond to sources and sinks of mass in $\Omega$, and
thus be non-sparse. (See, for example Strikwerda, 1984.)
(5) If k pressure modes are present, the solution changes in the following ways: (i) the
first summation is increased to n − m + k, (ii) the second summation is reduced;
it now goes from n − m + k + 1 to n, (iii) the pressure modes are tacked on to
the end via $\sum_{j=n+1}^{n+k}\alpha_j\binom{0}{p_j}$, where $p_j$ is a pressure mode ($Cp_j = 0$) and the $\{\alpha_j\}$
are arbitrary scalars. Explanation: with k pressure modes, there are only m − k
linearly independent continuity equations (constraints); the 'effective' number of
pressures is reduced from m to m − k; thus there are n − (m − k) divergence-free
modes, m − k Q-orthogonal eigenvectors in the pressure only, each of which (still)
'generates' a generalized eigenvector, and k pure pressure eigenvectors, each with
$\lambda = 0$; cf. Figure 3.13-13.
(6) If a divergence-free basis was employed, then the problem would simplify
considerably; the pressure term is gone in (3.16-207), and (3.16-208) is not needed; all n
modes would be divergence-free viscous decay modes.
(7) While the velocity parts of the n − m divergence-free eigenvectors are the same
as those discussed earlier for the steady Stokes equations (Section 3.13.2d and
Figure 3.13-13), the remainder of these transient Stokes eigenvectors are different
from the 'convergence' eigenvectors there. Since, however, each eigensystem
forms a basis, any one of the dilatational eigenvectors from one system can be
represented as a linear combination of those from the other. Finally, the pressure
parts of the viscous decay modes are obtainable from the velocity parts via
$q_j = -(C^TM^{-1}C)^{-1}C^TM^{-1}Kv_j$, from (3.16-210) and (3.16-211), which is the discrete
version of the continuum equations, $\nabla^2q_i = 0$ in $\Omega$, with BC $\partial q_i/\partial n = \nu\,n\cdot\nabla^2v_i$ on
$\Gamma_D$, $q_i = \nu\,\partial(n\cdot v_i)/\partial n$ on $\Gamma_N$.
(8) Recalling the previous analysis of the three-equation model problem, it is clear that
there n = 2 and m = 1.
c. Index 1
The analogous PPE formulation starts with
$$M\dot u + Ku + CP = f(t),\quad u(0) = u_0, \tag{3.16-221}$$
$$(C^TM^{-1}C)P = C^TM^{-1}(f - Ku) - \dot g, \tag{3.16-222}$$
whose associated eigenproblem is in the form (3.16-212) with now
$$A = \begin{pmatrix}K & C\\ C^TM^{-1}K & C^TM^{-1}C\end{pmatrix},$$
which is both singular (because the last m rows are obviously just linear combinations of
the first n rows, owing to the factor $C^TM^{-1}$) and non-symmetric; B is unchanged from
its index 2 form. Here, as in the model problem above, both A and B are singular, thus
necessitating a somewhat different set of tricks than for index 2. But we first mention that
the n — m div-free modes are the same as those from the index 2 formulation—as are the
first m modes with $\lambda = \infty$. What happens is that, rather than generalized eigenvectors
a la Jordan, the second set of m eigenvectors comes from a third eigenproblem; i.e., in
addition to the obvious eigenproblem from the above equations,
$$Kv_j + Cq_j = \lambda_jMv_j \tag{3.16-223}$$
and
$$C^TM^{-1}Cq_j + C^TM^{-1}Kv_j = 0; \tag{3.16-224}$$
we must now invoke Malkus' 'second adjoint LBB eigenproblem' [$M \to K$ in (3.16-214);
see also (3.13-156)],
$$(C^TK^{-1}C)\tilde q_j = \sigma_jQ\tilde q_j,\quad j = n - m + 1, \ldots, n, \tag{3.16-225}$$
where the m values of $\{\sigma_j\}$ are (in the absence of pressure modes) positive, and
the corresponding linearly independent eigenvectors are Q-orthogonal (and we take
$\tilde q_i^TQ\tilde q_i = 1$). Noting now that the operation $C^TM^{-1}$ on (3.16-223) yields, considering
(3.16-224), $\lambda_jC^Tv_j = 0$, we must have either $\lambda_j = 0$ or $C^Tv_j = 0$. The latter case has
already been accounted for; thus it must be the former case that is associated with
(3.16-225), for which (3.16-223) yields
$$v_j = -K^{-1}Cq_j,\quad j = n - m + 1, \ldots, n, \tag{3.16-226}$$
where the $\{q_j\}$ are those from (3.16-225); $q_j = \tilde q_j\ \forall j$. The second set of m equilibrium
modes from index 2 ($\lambda_j = \infty$) has been inverted; the 'physics' of this transformation
is this: rather than enforcing $C^Tu = g$ for index 2 (even if $C^Tu_0 \ne g_0$), these m non-decaying
modes ($\lambda_j = 0$, a direct result of the fact that the last m rows of A are linear
combinations of the first n rows) for the index 1 formulation enforce the weaker constraint,
$C^T\dot u = \dot g$; they merely 'hold the div.'
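A short worked step may make the weaker constraint explicit. This is a sketch using only (3.16-221) and (3.16-222): premultiplying the momentum equation by $C^TM^{-1}$ and then substituting the PPE gives

$$C^T\dot u + C^TM^{-1}Ku + (C^TM^{-1}C)P = C^TM^{-1}f \quad\Longrightarrow\quad C^T\dot u = \dot g,$$

so that, integrating from 0 to t,

$$C^Tu(t) - g(t) = C^Tu_0 - g(0).$$

Any initial constraint violation is therefore carried along unchanged, which is what the m zero eigenvalues express.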
The analytical solution in terms of the n + m basis vectors for index 1 turns out to be
7=1 x J / j=n-m+\
M>U)-M)(-^
(3.16-227)
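where, a la index 2, $(v_j^T, q_j^T)^T$ come from (3.16-210) and (3.16-211), $(0^T, \bar q_j^T)^T$ come
from (3.16-214) with $\lambda_j = \infty$, and the rest come from (3.16-225) and (3.16-226) with
$\lambda_j = 0$; also
$$a_i(t) = v_i^TMu_0 - \sum_{j=n-m+1}^{n}(\tilde q_j^TC^Tu_0)(v_i^TMK^{-1}C\tilde q_j)/\sigma_j + \int_0^t\beta_i(\tau)e^{\lambda_i\tau}\,d\tau \tag{3.16-228}$$
and
$$\beta_i(t) = v_i^Tf(t) - \sum_{j=n-m+1}^{n}(\tilde q_j^T\dot g(t))(v_i^TMK^{-1}C\tilde q_j)/\sigma_j,\quad i = 1, 2, \ldots, n - m, \tag{3.16-229}$$
and
$$b_{0i}(t) = \bar q_i^T[C^TM^{-1}f(t) - \dot g(t)]/\nu_i, \tag{3.16-230}$$
$$b_{1i}(t) = \tilde q_i^T[g(0) - g(t) - C^Tu_0]/\sigma_i,\quad i = n - m + 1, \ldots, n. \tag{3.16-231}$$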
And we are done. But, as for index 2, one would probably be foolish to even
contemplate actually computing the transient Stokes solution in this way.
Remarks:
(1) In marked contrast to index 2, the index 1 velocity is divergence-free only if it
started that way; the set of m zero eigenvalues cause the initial divergence, whatever
it may be, to be retained for all time (from $C^T\dot u = \dot g$).
(2) If and only if $C^Tu_0 = g_0$ in both cases will the index 1 solution agree with that
from index 2.
(3) As for index 2, only the first n − m divergence-free modes describe viscous decay.
(4) The m eigenvectors with infinite eigenvalues again 'cause' satisfaction of the PPE
for all $t \ge 0$.
(5) These index 1 eigenproblem results are not explicitly in the 1981 Malkus paper,
but he has since then (personal communication, 1991) generalized them to cover
the new results.
(6) As for index 2, the 'analytical' results represent an appropriate generalization of
those from the model problem in the previous section.
(7) Noteworthy is that $g(t)$ is differentiated in the index 2 solution and that $\dot g(t)$ is
integrated in the index 1 solution.
(8) If we were to bother studying the index 0 (ODE) formulation, we would find 2m
zero eigenvalues, corresponding to zero time derivatives of both continuity and
pressure Poisson equations—and the usual n — m divergence-free viscous decay
modes.
(9) Remark (5) on pressure modes, at the end of the index 2 discussion, applies here
as well.
(10) The book by Cook et al. (1989) contains a nice summary of FEM-related eigen-
problems.
Final remark for index 2 and index 1:
For the finite $\lambda$ cases, (3.16-212) can be 'rearranged' to give $[I - M^{-1}C(C^TM^{-1}C)^{-1}C^T]M^{-1}Kv_j = \mathcal{P}M^{-1}Kv_j = \lambda_jv_j$;
the projection matrix, $\mathcal{P}$, reduces the size of the solution
space from that of $Kv_j = \lambda_jMv_j$ (size n) to an (n − m)-dimensional subspace of div-free
vectors, since $C^T\mathcal{P} = 0$ (see too Appendix 3).
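This final remark is easy to test on a small random example. The sketch below is an illustration only: the matrices are random stand-ins (SPD `M` and `K`, full-rank `C`, so no pressure modes), and the sizes `n`, `m` are arbitrary. It checks that $\mathcal{P}M^{-1}K$ has m zero eigenvalues and n − m positive ones, that the associated eigenvectors are discretely divergence-free, and that each $\lambda_j$ equals the Rayleigh quotient $v_j^TKv_j/v_j^TMv_j$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 2                                   # arbitrary illustrative sizes

def random_spd(size):
    """Random symmetric positive definite matrix (stand-in for M or K)."""
    G = rng.standard_normal((size, size))
    return G @ G.T + size * np.eye(size)

M, K = random_spd(n), random_spd(n)
C = rng.standard_normal((n, m))               # assumed to have full column rank

Minv = np.linalg.inv(M)
# Projection matrix I - M^{-1} C (C^T M^{-1} C)^{-1} C^T.
proj = np.eye(n) - Minv @ C @ np.linalg.solve(C.T @ Minv @ C, C.T)

vals, vecs = np.linalg.eig(proj @ Minv @ K)
vals, vecs = vals.real, vecs.real             # spectrum is real for SPD M, K
order = np.argsort(np.abs(vals))
vals, vecs = vals[order], vecs[:, order]

n_zero = int(np.sum(np.abs(vals) < 1e-9))     # expected: m
finite = vals[m:]                             # the n - m decay rates
div_of_modes = np.abs(C.T @ vecs[:, m:]).max()
rayleigh = np.array([v @ K @ v / (v @ M @ v) for v in vecs[:, m:].T])

print("zero eigenvalues:", n_zero)
print("decay rates:", finite)
print("max |C^T v_j| over decay modes:", div_of_modes)
```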
d. Linear stability theory
To conclude this lengthy DAE diversion, we consider briefly the application of discrete
(via GFEM) linearized stability theory, which may serve as a mini introduction to a chapter
in Volume II. Only the index 2 formulation is useful in this context, so that is what we
will present. Classical stability analysis seeks to determine if a particular steady solution
of the NS equations, $Ku + Re\,N(u)u + CP = f$ and $C^Tu = g$, where Re is the Reynolds
number (displayed here for 'emphasis'), say $(u_s, P_s)$, is stable to small perturbations. The
results of such an analysis lead to the following generalized eigenproblem (see Volume II
for details)
$$[K + Re\,\bar N(u_s)]v_j + Cq_j = \lambda_jMv_j \tag{3.16-232}$$
and
$$C^Tv_j = 0, \tag{3.16-233}$$
which is obviously related to the index 2 'natural modes' (Malkus, 1981) eigenproblem,
(3.16-210) and (3.16-211). Here
$$\bar N(u_s) = \frac{\partial}{\partial u}[N(u)u]\Big|_{u = u_s} \tag{3.16-234}$$
is a modified (augmented) advection matrix that corresponds to the operator $u_s\cdot\nabla + (\nabla u_s)^T$
in the continuum. The difference, of course, is that (3.16-234) is unsymmetric
and so therefore is the eigenproblem (3.16-232) and (3.16-233), which may now display a
complex solution. Whereas the details of the stability eigenproblem are certainly different
from those of the Stokes problem, some similarities remain; in particular, there are still
2m infinite eigenvalues [2m zero eigenvalues for the inverse ($\mu = 1/\lambda$) eigenproblem].
Further details will be presented in Volume II, the above intended only to begin to bridge
the gap between DAE's and stability analysis.
3.16.4 Three Variable-Step Implicit (Index 2) Methods—and
Some Steady-State Methods
a. Introduction
In this section we return to numerical time integration and generalize the variable-step
methods introduced in the previous chapter—and we add one more. We shall construct
smart time integrators for TR, BE, and BDF2, all applied to the index 2 DAE's; starting
with TR, the history of which goes back to 1978 (Gresho et al., 1978b, 1979, 1980a). In
each of the three, the related solution method for the steady form of the NS equations is
'available'—as a subset.
b. Trapezoid rule
Analogous to (2.7-92), application of TR to the DAE's of (3.13-28) and (3.13-29) yields
$$\begin{pmatrix}\frac{2}{\Delta t_n}M + K + N(u_{n+1}) & C\\ C^T & 0\end{pmatrix}\begin{pmatrix}u_{n+1}\\ P_{n+1}\end{pmatrix} = \begin{pmatrix}M\left(\frac{2}{\Delta t_n}u_n + \dot u_n\right) + f_{n+1}\\ g_{n+1}\end{pmatrix}, \tag{3.16-235}$$
a fully coupled non-linear system of equations in $(u_{n+1}, P_{n+1})$, wherein we have invoked
the 'shortened' form of TR a la (3.16-25), which is valid if $C^Tu_0 = g_0$, and it 'adjusts
the data' in the first step if $C^Tu_0 \ne g_0$; see (3.16-29). Before discussing methods for
actually solving this nonlinear system [and a legitimately linearized version that is always
stable when N(u) is skew-symmetric, discussed below], we describe the entire AB2/TR
algorithm, wherein it is important to note that the predictor and concomitant error control
are based solely on velocity; pressure just goes along for the ride, being the algebraic
variable in the DAE's.
Start-up. Given $u_0$ satisfying $C^Tu_0 = g_0$:
Step 1. Solve
$$\begin{pmatrix}M & C\\ C^T & 0\end{pmatrix}\begin{pmatrix}\dot u_0\\ P_0\end{pmatrix} = \begin{pmatrix}f_0 - [K + N(u_0)]u_0\\ \dot g_0\end{pmatrix} \tag{3.16-236}$$
for $(\dot u_0, P_0)$.
Step 2. Select $\Delta t_0$ as discussed in Chapter 2 (Sections 2.7.4a and e) for T; replace T
by u in the start-up algorithm below (2.7-95).
Step 3. Take the first TR step; solve (3.16-235) for n = 0 to get $(u_1, P_1)$, using
$u_1^p = u_0 + \Delta t_0\dot u_0$ as a first guess for $u_1$ [e.g., $N(u_1^p)$].
Step 4. Invert TR to get the required AB2 data for velocity:
$$\dot u_1 = 2(u_1 - u_0)/\Delta t_0 - \dot u_0.$$
Remarks:
(1) Solving (3.16-236) is the best (most trustworthy) way to get started, even if $P_0$ is
not actually needed (or wanted). Approximations that estimate $\dot u_0$ by less rigorous
methods are not as robust. Note that, in some sense, TR is not quite as 'self-starting'
for these DAE's as it is for ODE's.
(2) Even though the more efficient form of the RHS of the momentum equation is used
in (3.16-235), vis-a-vis that in (3.16-23), with the result that $P_0$ is not actually
required, do not make the mistake of assuming that you can select $P_0$ arbitrarily to
obtain $\dot u_0$, as is occasionally seen in the literature (e.g., Eguchi et al., 1988; see also
Gresho, 1990a). Such a procedure will cause the 'cursed' TR $2\Delta t$ oscillations: a
wiggle signal.
General step. With $\Delta t_1 = \Delta t_0$, for n = 1, 2, ..., do:
Step 1. Predict the velocity with AB2:
$$u^p_{n+1} = u_n + \frac{\Delta t_n}{2}\left[(2 + \Delta t_n/\Delta t_{n-1})\dot u_n - (\Delta t_n/\Delta t_{n-1})\dot u_{n-1}\right].$$
Step 2. Solve (3.16-235) for $(u_{n+1}, P_{n+1})$ using $u^p_{n+1}$ as the first guess for $u_{n+1}$ (details
later).
Step 3. Invert TR to get AB2 data for the next step and update t:
$$\dot u_{n+1} = 2(u_{n+1} - u_n)/\Delta t_n - \dot u_n,$$
$$t_{n+1} = t_n + \Delta t_n.$$
Step 4. Compute the LTE estimate based on the velocity:
$$d_n = (u_{n+1} - u^p_{n+1})/[3(1 + \Delta t_{n-1}/\Delta t_n)].$$
Step 5. Compute the (potential) next step size:
$$\Delta t_{n+1} = \Delta t_n(\varepsilon/\|d_n\|)^{1/3},\quad\text{where}\quad \|d_n\|^2 = d_{n+1}^T\cdot d_{n+1}/(N_T\,u_{max}^2),$$
where $N_T$ is the total number of (variable) velocity nodes, and $u_{max}$ is an
appropriate measure of the maximum speed in the domain. A better-yet weighted
RMS-norm might be the following (in 2D, for simplicity):
$$\|d_n\|^2 = \frac{1}{N_u + N_v}\left[\sum_{j=1}^{N_u}\left(\frac{d_{n+1,j}}{|u_{n+1,j}| + U_0}\right)^2 + \sum_{j=1}^{N_v}\left(\frac{d_{n+1,j}}{|v_{n+1,j}| + V_0}\right)^2\right],$$
where $U_0$ and $V_0$ are user-specified 'characteristic' velocities ('floor' values, in
case $|\cdot|$ is 'too close' to zero), and $N_u + N_v = N_T$.
Step 6. Bump n and go to Step 1 unless the final time has been reached.
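The start-up and general steps can be sketched in a few lines for the linear (Stokes) case. This is an illustration under stated assumptions, not the production algorithm: N = 0, f = 0 and g = 0, dense solves, the simple RMS norm with `u_scale` standing in for $u_{max}$, and accept/reject thresholds taken from the rules of thumb given later in this section. The function name `ab2_tr_stokes` and all test matrices are hypothetical.

```python
import numpy as np

def ab2_tr_stokes(M, K, C, u0, t_end, eps=1e-5, dt0=1e-3, gamma=0.8):
    """Variable-step AB2/TR for M u' + K u + C P = 0, C^T u = 0 (a sketch)."""
    n, m = C.shape
    u_scale = max(np.abs(u0).max(), 1e-30)    # stand-in for u_max

    def saddle_solve(A11, rhs_u):
        S = np.block([[A11, C], [C.T, np.zeros((m, m))]])
        sol = np.linalg.solve(S, np.concatenate([rhs_u, np.zeros(m)]))
        return sol[:n], sol[n:]

    def tr_solve(u, udot, dt):
        # Shortened TR system (3.16-235) with N = 0, f = 0, g = 0.
        return saddle_solve(2.0 / dt * M + K, M @ (2.0 / dt * u + udot))

    # Start-up: consistent initial acceleration, first TR step, inverted TR.
    udot, _ = saddle_solve(M, -K @ u0)
    u_new, P = tr_solve(u0, udot, dt0)
    udot_new = 2.0 * (u_new - u0) / dt0 - udot
    u, udot_old, udot = u_new, udot, udot_new
    t, dt_old, dt = dt0, dt0, dt0
    accepted = 1

    # General step.
    while t < t_end * (1.0 - 1e-12):
        dt = min(dt, t_end - t)
        r = dt / dt_old
        u_pred = u + 0.5 * dt * ((2.0 + r) * udot - r * udot_old)   # AB2
        u_new, P = tr_solve(u, udot, dt)                            # TR
        d = (u_new - u_pred) / (3.0 * (1.0 + dt_old / dt))          # LTE estimate
        err = np.sqrt(d @ d / n) / u_scale
        dtsf = (eps / max(err, 1e-300)) ** (1.0 / 3.0)
        if dtsf < gamma:          # reject and repeat with the smaller step
            dt *= dtsf
            continue
        udot_old, udot = udot, 2.0 * (u_new - u) / dt - udot        # invert TR
        u, t, dt_old = u_new, t + dt, dt
        accepted += 1
        if dtsf > 1.0:            # accept the increase
            dt *= dtsf
    return t, u, P, accepted

# Small hypothetical test problem with M = I and a divergence-free u0.
rng = np.random.default_rng(1)
n, m = 6, 2
G = rng.standard_normal((n, n))
K = G @ G.T / n + np.eye(n)
M = np.eye(n)
C = rng.standard_normal((n, m))
Pi = np.eye(n) - C @ np.linalg.solve(C.T @ C, C.T)
u0 = Pi @ rng.standard_normal(n)

t, u, P, steps = ab2_tr_stokes(M, K, C, u0, 1.0)
w, V = np.linalg.eigh(Pi @ K @ Pi)            # exact solution for M = I
u_exact = V @ (np.exp(-w * t) * (V.T @ u0))
print("accepted steps:", steps)
print("relative error:", np.linalg.norm(u - u_exact) / np.linalg.norm(u0))
```

Note that the error control looks only at the velocity; the pressure is simply returned by each saddle-point solve.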
Now, as for the scalar problem of Chapter 2, there are a few practical matters to deal
with—more in fact since the TR equations are now non-linear.
Digression
But before biting the non-linear bullet, we digress briefly to point out a useful and cost-
effective way to retain the stability of TR without solving non-linear equations! It is
borrowed from Simo and Armero (1994) and only applies rigorously when N(u) is strictly
skew-symmetric; and our version is this: 'linearize' the advection term in TR so that the
momentum equation reads
$$M\frac{u_{n+1} - u_n}{\Delta t_n} + K\frac{u_n + u_{n+1}}{2} + N(u^*)\frac{u_n + u_{n+1}}{2} + C\frac{P_n + P_{n+1}}{2} = \frac{f_n + f_{n+1}}{2}, \tag{3.16-237}$$
where $u^*$ is given but is thus far 'arbitrary', except that it is not a function of $u_{n+1}$, so that
(3.16-237) remains linear. The first thing we note is that for $N(\cdot) = -N^T(\cdot)$, we get
energy conservation (in the absence of forcing terms) for all $\Delta t$; i.e., the inner product
of (3.16-237) with $(u_n + u_{n+1})/2$ yields, utilizing $C^T(u_n + u_{n+1}) = 0$, the symmetry of
M and K, and the skew symmetry of N,
$$\frac{1}{2}\,\frac{u_{n+1}^TMu_{n+1} - u_n^TMu_n}{\Delta t_n} = -\left(\frac{u_n + u_{n+1}}{2}\right)^TK\left(\frac{u_n + u_{n+1}}{2}\right), \tag{3.16-238}$$
which (i) proves stability, $u_{n+1}^TMu_{n+1} \le u_n^TMu_n$, because K is SPD, and (ii) approximates
the appropriate ODE conservation law, $\frac{1}{2}\,d(u^TMu)/dt = -u^TKu$ (viscous dissipation). This
is the reason that this linearization is 'interesting': guaranteed stability with no non-linear
equations to solve.
Remarks:
(1) This 'trick' was well advertised by Simo and Armero (1994), but it was not
discovered by them. Although we do not know who actually discovered it, we
do know that it was discussed as far back as 1972 by Lions (1972) and Temam
(1972)—who probably discovered it—in Temam (1966), in which is presented a
stability proof—probably a much more elegant one than we have just presented.
(2) The skew-symmetric form is only strictly attainable for Dirichlet BC's; NBC's as
OBC's can cause trouble (see the discussion of fully implicit methods at the end of
Section 3.16.6c); in Simo and Armero (1994), the theory was 'all Dirichlet'—yet
flows with outflows, nice-looking and stable, were also presented, perhaps somewhat
misleadingly. They actually modified the advection operator at the outflow to obtain
'reasonable' solutions; i.e., they sacrificed skew-symmetry there to obtain useful
results for which they had no theory (F. Armero, personal communication).
Rearranging (3.16-237), and then invoking the ODE at $t_n$ yields
$$\left[\frac{2}{\Delta t_n}M + K + N(u^*)\right]u_{n+1} + CP_{n+1} = \frac{2}{\Delta t_n}Mu_n + f_n - Ku_n - N(u^*)u_n - CP_n + f_{n+1}$$
$$= M\left(\frac{2}{\Delta t_n}u_n + \dot u_n\right) + f_{n+1} + [N(u_n) - N(u^*)]u_n, \tag{3.16-239}$$
and the full 'TR' system becomes
$$\begin{pmatrix}\frac{2}{\Delta t_n}M + K + N(u^*) & C\\ C^T & 0\end{pmatrix}\begin{pmatrix}u_{n+1}\\ P_{n+1}\end{pmatrix} = \begin{pmatrix}M\left(\frac{2}{\Delta t_n}u_n + \dot u_n\right) + f_{n+1} + [N(u_n) - N(u^*)]u_n\\ g_{n+1}\end{pmatrix}, \tag{3.16-240}$$
an unsymmetric linear system in $(u_{n+1}, P_{n+1})$. We now address some options for
choosing $u^*$:
1. Choose $u^* = u_n$ to gain efficiency (the $\Delta N$ term, $[N(u_n) - N(u^*)]u_n$, on the RHS vanishes) while losing
advection accuracy.
2. Set $u^* = (u_n + u^p_{n+1})/2$, where $u^p_{n+1}$ is the predictor from AB2, to gain advection
accuracy but at higher cost ($\Delta N \ne 0$ on the RHS). Note that this causes the advection
term to look a lot like an explicit midpoint rule.
3. Ditto 2, but drop $\Delta N$ from the RHS to save cost, and take your chances on stability.
We have not tested any of these choices, but would probably start with 1 or 3. We
would also use the same variable-$\Delta t$ algorithm as presented above for the honest
(nonlinear) TR, including all the 'rules of thumb;' it would probably work quite well and is
recommended.
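The energy identity (3.16-238) for the linearized scheme can be checked on a single step. The following is a minimal sketch with hypothetical data: `N_star` is just a random skew-symmetric matrix standing in for $N(u^*)$, the forcing is zero, and $P_n$ is chosen arbitrarily since it drops out of the identity.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, dt = 6, 2, 0.5                          # deliberately large time step

G = rng.standard_normal((n, n))
M = np.eye(n)
K = G @ G.T / n + np.eye(n)                   # SPD stand-in for the viscous matrix
C = rng.standard_normal((n, m))
S = rng.standard_normal((n, n))
N_star = 50.0 * (S - S.T)                     # skew-symmetric stand-in for N(u*)

# Divergence-free u_n (C^T u_n = 0) and an arbitrary P_n.
Pi = np.eye(n) - C @ np.linalg.solve(C.T @ C, C.T)
u_n = Pi @ rng.standard_normal(n)
P_n = rng.standard_normal(m)

# One step of (3.16-237) with f = 0, unknowns (u_{n+1}, P_{n+1}),
# together with the constraint C^T u_{n+1} = 0.
A11 = M / dt + 0.5 * (K + N_star)
rhs_u = (M / dt - 0.5 * (K + N_star)) @ u_n - 0.5 * C @ P_n
S_mat = np.block([[A11, 0.5 * C], [C.T, np.zeros((m, m))]])
sol = np.linalg.solve(S_mat, np.concatenate([rhs_u, np.zeros(m)]))
u_np1 = sol[:n]

u_mid = 0.5 * (u_n + u_np1)
energy_rate = 0.5 * (u_np1 @ M @ u_np1 - u_n @ M @ u_n) / dt   # LHS of (3.16-238)
dissipation = -u_mid @ K @ u_mid                               # RHS of (3.16-238)
print(energy_rate, dissipation)
```

The two printed numbers should agree to round-off regardless of the size of `dt` or of the advection stand-in, which is the point of the skew-symmetric linearization.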
End Digression
Returning now to the (more general) non-linear and non-skew-symmetric case, we show
how the non-linear system of (3.16-235) can be solved pretty effectively using one or
another variant of Newton's method. Applied to (3.16-235), Newton's method gives the
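following sequence of linear systems: for $m = 0, 1, \ldots$ with $u^0_{n+1} = u^p_{n+1}$, solve
$$\begin{pmatrix}\frac{2}{\Delta t_n}M + K + N(u^m_{n+1}) + N'(u^m_{n+1}) & C\\ C^T & 0\end{pmatrix}\begin{pmatrix}u^{m+1}_{n+1} - u^m_{n+1}\\ P^{m+1}_{n+1}\end{pmatrix} = \begin{pmatrix}M\left(\frac{2}{\Delta t_n}u_n + \dot u_n\right) + f_{n+1} - \left[\frac{2}{\Delta t_n}M + K + N(u^m_{n+1})\right]u^m_{n+1}\\ g_{n+1} - C^Tu^m_{n+1}\end{pmatrix}, \tag{3.16-241}$$
where we have simplified the Newton system because $P_{n+1}$ appears only linearly; also
$N'(u)$ denotes the matrix whose (i,j) element is given by
$$N'_{ij}(u) = \frac{\partial N_{ik}(u)}{\partial u_j}\,u_k \tag{3.16-242}$$
or, equivalently,
$$N'(u) = \frac{\partial[N(u)v]}{\partial u}\bigg|_{v=u} \tag{3.16-243}$$
and is the 'non-linear' portion of the Jacobian matrix of Newton's method. (See Gresho
et al., 1980a, for explicit expressions for $N'_{ij}$.)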
Remarks:
(1) In the (most common) case that g = 0, the RHS can be simplified by omitting $C^Tu^m_{n+1}$
and $g_{n+1}$ since then even the predicted velocity is divergence-free (if $C^Tu_0 = g_0 = g$).
This simplification could, however, lead to the accumulation of round-off errors.
(2) The 'safety valve' that precludes illegal IC's has been bypassed ('shortened form');
to reset it, add $g_n - C^Tu_n$ to the RHS of (3.16-241). [Best advice for code writers:
use shortened TR but test on $C^Tu_0 - g_0 = e_0$ and print an 'appropriate' norm of $e_0$
so that the user can ascertain the quality of his/her IC. (Recall that $u_1$ will always
be discretely divergence-free.)]
(3) If iterative methods are used to solve the linear systems, then the solution will always
contain some iteration error that should be appropriately controlled.
The iterations in the Newton system, (3.16-241), are terminated when $\Delta u$ is 'sufficiently
small'; i.e., small enough that the resulting error does not contaminate the estimation of
the local time truncation error. A reasonable criterion is that $\|\Delta u\| \le 0.1\varepsilon$, which will
typically be attained in very few iterations: one, two, or three. In fact, however, full
Newton iterations are probably seldom cost-effective for several reasons:
1. The cost of updating the Jacobian is significant and, if direct methods are employed,
the cost of solving the linear system at each iteration is very high.
2. The AB2 predictor is often close enough to un+\ to accept just one iteration as
converged. This is called one-step Newton; it was suggested by Gresho et al. (1979,
1980a) and is used, for example, by Gartling (1987) and in FIDAP (Fluid Dynamics
International, 1993).
3. Other approximations, such as the chord method (outdated Jacobian; see, for example,
Gresho et al., 1980a) or quasi-Newton methods (FIDAP, 1993) may also often work
well—experiment.
Final 'rules of thumb' on the time-stepping algorithm—similar but not identical, to
those in Chapter 2 (Section 2.7.4) for the scalar problem:
1. If DTSF $= \Delta t_{n+1}/\Delta t_n > 1$, accept the increase.
2. If $\gamma <$ DTSF $< 1$, where $\gamma$ is user-specified, but a value like 0.8 is recommended,
accept the solution but do not change $\Delta t$.
3. If DTSF $< \gamma < 1$, reject the current solution and repeat the step using $\Delta t_{n+1}$.
4. If DTSF $\ll 1$ or if more than two step reductions occur within one timestep, then it
may be a good idea to stop the integration and print a warning message so that the cause
can be studied.
5. Depending on the strategy used in approximating Newton's method, the Jacobian may
or may not be updated in cases 1 and 3. In case 2, an update is probably a good idea.
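Rules 1 to 3 amount to a small decision function. The sketch below is one possible reading; the function name `step_decision` is hypothetical, and the treatment of the boundary cases (DTSF exactly equal to 1 or to $\gamma$) is an assumption, since the rules above leave them unspecified. Rule 4 (stopping with a warning) is not included.

```python
def step_decision(dt_n, dt_next, gamma=0.8):
    """Return (action, dt_to_use) following rules 1-3 above.

    action is "accept" or "reject"; dt_to_use is the step size for the next
    attempt (the next step if accepted, the repeated step if rejected).
    """
    dtsf = dt_next / dt_n
    if dtsf >= 1.0:               # rule 1: accept the solution and the increase
        return "accept", dt_next
    if dtsf >= gamma:             # rule 2: accept the solution, keep the step
        return "accept", dt_n
    return "reject", dt_next      # rule 3: repeat the step with the smaller dt

print(step_decision(0.1, 0.15))
print(step_decision(0.1, 0.09))
print(step_decision(0.1, 0.05))
```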
As discussed in Chapter 2 [following (2.7-97)], an implementation of TR that reduces
potential problems with round-off error may be a good idea (we have not tested it, but
should). It goes like so:
1. Perform Step (1) of 'General step'—below (3.16-236)—as usual.
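2. Compute the predictor acceleration from [see too (2.7-82) and (2.7-83)]
$$\dot u^p_{n+1} = (1 + \Delta t_n/\Delta t_{n-1})\,\dot u_n - (\Delta t_n/\Delta t_{n-1})\,\dot u_{n-1}.$$
3. Replace Step (2) [solve (3.16-235)] by: solve for $\delta u \equiv u_{n+1} - u^p_{n+1}$ and $P_{n+1}$ from
$$\begin{bmatrix} \frac{2}{\Delta t_n}M + K + N(u^p_{n+1}+\delta u) & C \\ C^T & 0 \end{bmatrix} \begin{bmatrix} \delta u \\ P_{n+1} \end{bmatrix} = \begin{bmatrix} f_{n+1} - (K + N(u^p_{n+1}+\delta u))\,u^p_{n+1} - M\dot u^p_{n+1} \\ g_{n+1} - C^T u^p_{n+1} \end{bmatrix}.$$
4. Replace Step 3 by
$$\dot u_{n+1} = \dot u^p_{n+1} + \frac{2}{\Delta t_n}\,\delta u.$$
5. Set $u_{n+1} = u^p_{n+1} + \delta u$.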
6. Return to Step (1).
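As a quick consistency check of the increment form, the sketch below applies both variants of TR to a scalar linear model (M = 1, K = k, N = 0). The scalar model, the function names, and the use of the standard variable-step AB2 formula for the predicted value are assumptions made for illustration; the two results should agree to round-off.

```python
def tr_standard(u_n, udot_n, dt, k, f_next):
    """One TR step for udot + k*u = f, solved directly for u_{n+1}."""
    return (f_next + 2.0 * u_n / dt + udot_n) / (2.0 / dt + k)

def tr_increment(u_n, udot_n, udot_nm1, dt, dt_prev, k, f_next):
    """Same step, solved for the increment du = u_{n+1} - u_pred."""
    r = dt / dt_prev
    # variable-step AB2 predictor (assumed form) and predicted acceleration
    u_pred = u_n + 0.5 * dt * ((2.0 + r) * udot_n - r * udot_nm1)
    udot_pred = (1.0 + r) * udot_n - r * udot_nm1
    du = (f_next - k * u_pred - udot_pred) / (2.0 / dt + k)
    udot_next = udot_pred + 2.0 * du / dt      # replaces the usual TR inversion
    return u_pred + du, udot_next
```

The increment `du` is small when the predictor is good, which is the reason this form may lose fewer digits to round-off.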
Suppose you wish to continue a run after studying the results, which will probably
occur frequently. There are two options here, which we call 'smooth' and 'fresh' restarts.
The former is the simplest and should normally be employed; it does require, however,
that more data be saved than with a fresh restart: $u_n$, $u_{n+1}$, $\dot u_n$, $\dot u_{n+1}$, $\Delta t_n$, and $\Delta t_{n+1}$. The
smooth restart then starts with the general AB predictor equation, and the continuation
run should be as smooth as if you had not stopped. The fresh restart, on the other hand,
ignores all history data and saves only $u_{n+1}$, which is then treated like $u_0$ in the start-up
phase. This type of continuation run may not be as smooth, in the sense that the
computed $\Delta t_0$ (or user-selected if that is your choice) will generally differ from that of
a smooth restart. The fresh restart is needed if some parameter is to be changed (such as
$\varepsilon$) and is (only) recommended if you are not 'happy' with the current $\Delta t$ vs $t$ behavior
of the algorithm, which can occur, for example, if error accumulation (iteration error,
too large an $\varepsilon$, etc.) has 'polluted' the history vector. (Even 'smart' integrators are not
perfect.)
Final remark:
If the skew-symmetric and a priori linearized form, (3.16-240), is employed, then the
same timestepping strategy as above may, except for option (1), be safely used because
the (second-order accurate) linearization does not change (pollute) the LTE estimates—an
assertion whose proof we leave to the reader.
A few words on the steady NS equations are appropriate here—and similar words
apply to each of the following two time-marching methods (because all collapse to the
same equations):
1. The Newton system given by (3.16-241) applies equally well to the steady-state NS
equations; simply set $M = 0$ or $\Delta t = \infty$ and omit the time-level indices, with or
without 'explicit relaxation' [see 3 below].
2. It may often be the case that a good first guess is hard to come by (the ball of
attraction/radius of convergence of Newton's method is not large; bad initial guesses
result in divergence of the iterations); methods for getting a good guess are described in
Volume II (the chapter on continuation methods).
3. An alternative solution strategy is to begin the steady-state iterations with a simpler
method that: (i) may have a larger radius of convergence but (ii) converges more slowly
(linearly vis-a-vis quadratically for Newton's method). Called successive substitution
or functional iteration, or Picard iteration, the recommended alternate strategy uses the
following fixed-point iterative method (see too Reddy and Gartling, 1994):
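$$\begin{bmatrix} K + N(u^m) & C \\ C^T & 0 \end{bmatrix} \begin{bmatrix} u^{m+1} \\ P^{m+1} \end{bmatrix} = \begin{bmatrix} f \\ g \end{bmatrix}, \quad m = 0, 1, \ldots, \qquad (3.16\text{-}244)$$
with perhaps a relaxation factor applied: $\alpha u^m + (1 - \alpha)u^{m+1} \to u^{m+1}$, where $0 < \alpha < 1$,
with a similar 'trick' for $P$, but only for getting started. Once the solution is 'close
enough' (the hard part, and left to the reader), the switch to Newton's method should be
made.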
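The 'Picard first, Newton later' strategy can be illustrated on a scalar stand-in for the steady system, $(k + u)u = f$, in which $N(u) = u$ plays the role of the advection matrix. The model problem, the function name, and the switching tolerance are assumptions chosen only to make the sketch self-contained.

```python
def picard_then_newton(k=1.0, f=6.0, u=0.0, alpha=0.5, switch_tol=1e-2, tol=1e-12):
    """Solve (k + u) * u = f: relaxed Picard until 'close', then Newton."""
    for _ in range(100):
        u_new = f / (k + u)                        # solve (K + N(u^m)) u^{m+1} = f
        u_new = alpha * u + (1.0 - alpha) * u_new  # relaxation
        change, u = abs(u_new - u), u_new
        if change < switch_tol:                    # hypothetical 'close enough' test
            break
    for _ in range(20):
        residual = f - (k + u) * u
        du = residual / (k + 2.0 * u)              # Jacobian of (k + u) u is k + 2u
        u += du
        if abs(du) < tol:
            break
    return u
```

With the default data the positive root is $u = 2$; the Picard phase converges linearly and the Newton phase finishes in a few iterations.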
c. Backward Euler
We now repeat the exercise with BE, and repeat the admonition: generally only use BE
when seeking a time-marching approach to a steady solution (and then be 'generous' with
$\varepsilon$; try to use 'large' timesteps), or use BE if TR 'crashes' and a 'sanity check' is needed.
(Robustness caused by numerical damping is sometimes useful, although too large an $\varepsilon$
can preclude convergence of the non-linear iterations.) If BE is being used to get 'quickly'
to a steady state, then it may also be a good idea to monitor $(u_{n+1} - u_n) \equiv \Delta u_n$ and switch
to a steady-state solver (probably using Newton's method) when $\Delta u_n$ is 'small enough.'
(See too Section 3.16.10 below.) The BE method of (3.13-28) and (3.13-29) gives
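$$\begin{bmatrix} \frac{1}{\Delta t_n}M + K + N(u_{n+1}) & C \\ C^T & 0 \end{bmatrix} \begin{bmatrix} u_{n+1} \\ P_{n+1} \end{bmatrix} = \begin{bmatrix} \frac{1}{\Delta t_n}Mu_n + f_{n+1} \\ g_{n+1} \end{bmatrix} \qquad (3.16\text{-}245)$$
and the Newton system for solving it is [cf. (3.16-241)]
$$\begin{bmatrix} \frac{1}{\Delta t_n}M + K + N(u^m_{n+1}) + N'(u^m_{n+1}) & C \\ C^T & 0 \end{bmatrix} \begin{bmatrix} u^{m+1}_{n+1} - u^m_{n+1} \\ P^{m+1}_{n+1} \end{bmatrix} = \begin{bmatrix} f_{n+1} + \frac{1}{\Delta t_n}Mu_n - \left[\frac{1}{\Delta t_n}M + K + N(u^m_{n+1})\right]u^m_{n+1} \\ g_{n+1} - C^T u^m_{n+1} \end{bmatrix}, \quad m = 0, 1, \ldots, \qquad (3.16\text{-}246)$$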
and we cannot help but point out again the obvious very slight cost reduction over
TR—with a significant reduction in accuracy. As an algorithm, the BE/FE method is the
following:
Start-up.
Step 1. Select $\Delta t_0$ as suggested in Sections 2.7.4b and 2.7.4e.
Step 2. Solve for $\dot u_0$ as in (3.16-236); $P_0$ is a bonus.
General Step.
For n = 0, 1, 2,..., do:
Step 1. Compute $u^p_{n+1} = u_n + \Delta t_n \dot u_n$.
Step 2. Solve (3.16-245) or iterate on (3.16-246) for $u_{n+1}$, using $u^p_{n+1}$ as the first guess.
Step 3. Invert BE via $\dot u_{n+1} = (u_{n+1} - u_n)/\Delta t_n$.
Step 4. Compute the LTE based on velocity from $d_n = (u_{n+1} - u^p_{n+1})/2$.
Step 5. Compute the next (potential) step size from $\Delta t_{n+1} = \Delta t_n(\varepsilon/\|d_n\|)^{1/2}$.
Step 6. Bump n and go to (1) unless the final time has been reached.
Similar rules of thumb as for TR are also suggested here.
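A minimal sketch of the BE/FE loop, applied to the scalar decay problem $\dot u = -ku$ so that the BE 'solve' is a single division. The scalar model, the function name, and the decision to accept every step (the rules of thumb are omitted) are simplifying assumptions, not the full algorithm.

```python
import math

def be_fe_integrate(k, u0, dt0, eps, t_end):
    """Variable-step BE with an FE predictor for udot = -k*u."""
    t, u, dt = 0.0, u0, dt0
    udot = -k * u                                  # start-up acceleration
    history = []
    while t < t_end - 1e-12:
        dt = min(dt, t_end - t)                    # do not overshoot the final time
        u_pred = u + dt * udot                     # Step 1: FE predictor
        u_new = u / (1.0 + k * dt)                 # Step 2: BE corrector (linear solve)
        udot = (u_new - u) / dt                    # Step 3: invert BE
        d = 0.5 * (u_new - u_pred)                 # Step 4: LTE estimate
        dt_next = dt * math.sqrt(eps / max(abs(d), 1e-300))   # Step 5
        history.append(dt)
        t, u, dt = t + dt, u_new, dt_next          # Step 6
    return u, history
```

For example, `be_fe_integrate(1.0, 1.0, 0.01, 1e-4, 5.0)` lets the step size grow as the solution decays.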
Remark:
Warning: The 'rush' to an alleged steady state via BE using large $\Delta t$ (large $\varepsilon$) may not
yield a stable and/or unique steady state, even when it 'works' (converges). [If a time-accurate
solution from well-posed initial data attains a steady state, then that steady
state is unique for the given IC's. The 'robust' BE method can still sometimes be a
useful alternative to attacking the SS equations directly; see, for example, Reddy and
Gartling (1994) for further discussion, wherein they consider mainly non-linear ODE's
via a 'semi'-time-accurate BE method, with 'judicious' selection of $\Delta t$ and $\varepsilon$.]
We follow up on this last remark with a few more that may actually be more important.
Yee and colleagues have spent much effort in studying the 'dynamics of numerics' for
various fixed-step (usually) time integration methods applied to 'model' non-linear ODE
problems that are presumed to mimic at least portions of the behavior of the non-linear
DAE's (and ODE's) of CFD. The emphasis is usually on the following issue/question:
For generally unknown IC's, do time-marching schemes reliably find stable steady states
when they exist? By 'unknown' we mean that the original PDE may have come from
the steady NSE's, whose associated BVP is converted to an IBVP via a time-integration
approach. What IC's should be selected? 'The phenomenon that a non-linear differential
equation and its discretized counterpart can have different dynamical behavior (asymptotic
behavior) was not uncovered fully until recently' (Yee and Sweby, 1995a). And
'strong dependence on initial data means that for a finite time step $\Delta t$ that is not sufficiently
small, the asymptotic numerical solutions and the associated, numerical basins of attraction
depend continuously on the initial data' (Yee and Sweby, 1994), where the 'basins of
attraction' refers to that set of IC's whose solution curves all approach the same asymptote.
(Thus, 'large' $\Delta t$ integrations can give a different steady solution for each different IC.)
One of their points of emphasis is that implicit methods (usually using fixed $\Delta t$ that is
'too large') can stabilize unstable steady states, a conclusion that is valid for some, but
not necessarily all, IC's. Finally, they also point out that the different methods for solving
approximately the resulting non-linear algebraic equations can also affect the dynamics of
the numerics—all-in-all a sobering set of 'fears' that we would all do well to appreciate.
For the most recent 'summary' of these efforts, see Yee and Sweby (1996).
While we still believe that smart, variable-step, integrators will rarely fall prey to such
spurious behavior and thus further justify their use, it may often be the case that the
number of 'small' timesteps required to find a putative steady state is too large to be
deemed affordable—thus leaving the analyst in a quandary: either attack the problem
with fixed, 'large' $\Delta t$ implicit integrations and take your chances on obtaining 'correct'
results, or 'blow' your computer budget on a smart, variable-but-presumably-'small' $\Delta t$,
implicit integration method. We still tend to side with the latter approach for three reasons:
(i) reliability, (ii) smart integrators will properly and automatically use 'large' $\Delta t$ if and
only if a stable SS is approached ($\Delta t$ is only small when necessary), and (iii) if $\Delta t$ stays
'small' to follow a time-dependent solution that does not go to SS, so be it; you
have found the solution appropriate to your selected IC's.
d. BDF2
The last smart integrator that we consider is another 'predictor-corrector' linear multistep
method, with a second-order predictor that turns out to be leapfrog (explicit midpoint rule)
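when the step size is constant. First, however, we present our variable-$\Delta t$ version of BDF2
on ODE's, $\dot y = f(y)$:
$$\frac{y_{n+1} - y_n}{\Delta t_n} = \frac{\Delta t_n}{2\Delta t_n + \Delta t_{n-1}} \cdot \frac{y_n - y_{n-1}}{\Delta t_{n-1}} + \frac{\Delta t_n + \Delta t_{n-1}}{2\Delta t_n + \Delta t_{n-1}} \cdot \dot y_{n+1}, \qquad (3.16\text{-}247)$$
which agrees with the result in Hairer et al. (1987, p. 351). The appropriate predictor
equation is
$$y^p_{n+1} = y_n + \left(1 + \frac{\Delta t_n}{\Delta t_{n-1}}\right)\Delta t_n\,\dot y_n - \left(\frac{\Delta t_n}{\Delta t_{n-1}}\right)^2 (y_n - y_{n-1}). \qquad (3.16\text{-}248)$$
The LTE of (3.16-247) is found to be
$$d_n \equiv y_{n+1} - y(t_{n+1}) = \frac{(\Delta t_n + \Delta t_{n-1})^2}{\Delta t_n(2\Delta t_n + \Delta t_{n-1})} \cdot \frac{\Delta t_n^3\,\dddot y_n}{6} + O(\Delta t^4) \qquad (3.16\text{-}249)$$
and that of (3.16-248) is
$$y^p_{n+1} - y(t_{n+1}) = -\left(1 + \frac{\Delta t_{n-1}}{\Delta t_n}\right) \cdot \frac{\Delta t_n^3\,\dddot y_n}{6} + O(\Delta t^4), \qquad (3.16\text{-}250)$$
from which $d_n$ can be obtained by solving the above two equations for $y(t_{n+1})$ and $\dddot y_n$, to
$O(\Delta t^4)$:
$$d_n = \frac{(1 + \Delta t_{n-1}/\Delta t_n)^2}{1 + 3(\Delta t_{n-1}/\Delta t_n) + 4(\Delta t_{n-1}/\Delta t_n)^2 + 2(\Delta t_{n-1}/\Delta t_n)^3}\,(y_{n+1} - y^p_{n+1}) + O(\Delta t_n^4). \qquad (3.16\text{-}251)$$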
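The leading error terms in (3.16-249) and (3.16-250) can be checked numerically on the cubic $y = t^3$, for which $\dddot y/6 = 1$ and all higher terms vanish. The sketch below is an assumption-laden illustration: it treats the ODE as a pure quadrature ($\dot y$ known as a function of $t$), places $t_n$ at zero, and uses a hypothetical function name. It returns the two errors so that their ratio can be formed directly and compared with the coefficient in (3.16-251); for equal steps that ratio is 2/5.

```python
def bdf2_and_predictor_errors(h, k):
    """Errors of variable-step BDF2 and generalized leapfrog on y = t**3.

    h = dt_n, k = dt_{n-1}; time levels are t_{n-1} = -k, t_n = 0, t_{n+1} = h.
    """
    y = lambda t: t ** 3
    ydot = lambda t: 3.0 * t ** 2
    y_n, y_nm1 = y(0.0), y(-k)
    a = h / (2.0 * h + k)
    b = (h + k) / (2.0 * h + k)
    # BDF2 corrector, (3.16-247), with ydot_{n+1} known exactly
    y_new = y_n + h * (a * (y_n - y_nm1) / k + b * ydot(h))
    # generalized leapfrog predictor, (3.16-248)
    y_pred = y_n + (1.0 + h / k) * h * ydot(0.0) - (h / k) ** 2 * (y_n - y_nm1)
    return y_new - y(h), y_pred - y(h)
```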
With the variable-step ODE preliminaries out of the way (which results could, of
course, be applied to the ODE's of the previous chapter), we can now formulate the
smart DAE integrator using BDF2:
Start-up.
Step 1. Solve (3.16-236) for $(\dot u_0, P_0)$.
Step 2. Select $\Delta t_0$ as per TR.
Step 3. Take the first timestep with TR; solve (3.16-235) for n = 0 to get $(u_1, P_1)$.
Step 4. Invert TR to get the acceleration: $\dot u_1 = 2(u_1 - u_0)/\Delta t_0 - \dot u_0$.
General Step.
With $\Delta t_1 = \Delta t_0$, for n = 1,2,..., do:
Step 1. Predict the velocity via 'generalized leapfrog'; i.e., (3.16-248).
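Step 2. Solve for $(u_{n+1}, P_{n+1})$ from the Newton system derived from
$$\begin{bmatrix} \frac{1 + 2\Delta t_n/\Delta t_{n-1}}{\Delta t_n(1 + \Delta t_n/\Delta t_{n-1})}M + K + N(u_{n+1}) & C \\ C^T & 0 \end{bmatrix} \begin{bmatrix} u_{n+1} \\ P_{n+1} \end{bmatrix} = \begin{bmatrix} M\left[\frac{1 + \Delta t_n/\Delta t_{n-1}}{\Delta t_n}u_n - \frac{(\Delta t_n/\Delta t_{n-1})^2}{\Delta t_n(1 + \Delta t_n/\Delta t_{n-1})}u_{n-1}\right] + f_{n+1} \\ g_{n+1} \end{bmatrix}, \qquad (3.16\text{-}252)$$
using $u^p_{n+1}$ as the first guess.
Step 3. Invert BDF2 to get the required predictor data for the next step; $\dot u_{n+1} = (3u_{n+1} - 4u_n + u_{n-1})/2\Delta t_n$ (the constant-step form; for unequal steps, solve (3.16-247) for $\dot u_{n+1}$).
Step 4. Compute the LTE based on velocity from (3.16-251) with $y \to u$ and $O(\Delta t_n^4)$
dropped.
Step 5. Compute the next (potential) step size:
$$t_{n+1} = t_n + \Delta t_n, \qquad \Delta t_{n+1} = \Delta t_n(\varepsilon/\|d_n\|)^{1/3}.$$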
Step 6. Bump n and go to (1) unless the final time has been reached.
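The mass-matrix coefficients in (3.16-252) follow from solving (3.16-247) for $\dot y_{n+1}$. A small helper (the name and return convention are assumptions for illustration) makes them explicit and shows that they collapse to the familiar 3/2, 2, 1/2 weights (divided by $\Delta t$) when the step is constant.

```python
def bdf2_rate_coefficients(dt, dt_prev):
    """Weights (c1, c0, cm1) such that
    ydot_{n+1} = c1*y_{n+1} + c0*y_n + cm1*y_{n-1} for variable-step BDF2."""
    r = dt / dt_prev
    c1 = (1.0 + 2.0 * r) / (dt * (1.0 + r))   # multiplies M on the left of (3.16-252)
    c0 = -(1.0 + r) / dt                      # moved to the right-hand side
    cm1 = r * r / (dt * (1.0 + r))            # moved to the right-hand side
    return c1, c0, cm1
```

The same three weights can be used in Step 3 to recover $\dot u_{n+1}$ when the step size varies.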
Remarks:
(1) (3.16-252) can of course be solved by other than Newton's method.
(2) We have not tested this algorithm in the laboratory, but nevertheless recommend
it, especially as an alternative to TR when and if TR 'acts up', and as a better
dissipative algorithm than BE.
(3) The same 'rules of thumb' and solution methods for the corrector equation discussed
for TR apply here as well.
(4) As for BE and TR, the predicted velocity is generally not divergence-free.
(5) An admonition from Kheshgi and Scriven (1984), who used the variable-step TR
method: 'The scheme is started with a backward-difference corrector to avoid the
oscillations that certain initial conditions set off in the TR corrector: higher-order
backward-differentiation methods were rejected because their numerical dissipation
can make an unstable flow appear to be stable.' [They used a penalty method and
wished to preclude spurious penalty transients, discussed below. Also, and related to
the Kheshgi-Scriven comment, the BDF methods (especially BE!) have the property
that they will actually damp an unstable ODE solution, $e^{(i\omega + \lambda)t}$, for $\lambda > 0$ but not
'too large' and $\omega$ sufficiently large; i.e., slowly growing oscillatory solutions. See,
for example, Gear (1971).]
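The damping property mentioned in the last remark can be seen from the amplification factors of BE and TR applied to the scalar test equation $\dot y = zy$ with $z = \lambda + i\omega$. The function names below are assumptions; the formulas are the standard one-step factors for constant $\Delta t$.

```python
def be_factor(z, dt):
    """Amplification factor of backward Euler for ydot = z*y."""
    return 1.0 / (1.0 - z * dt)

def tr_factor(z, dt):
    """Amplification factor of the trapezoid rule for ydot = z*y."""
    return (1.0 + 0.5 * z * dt) / (1.0 - 0.5 * z * dt)
```

For a slowly growing oscillation such as `z = complex(0.1, 10.0)` with `dt = 0.1`, the magnitude of `be_factor` is below one (the growing mode is damped), whereas that of `tr_factor` exceeds one, as does the exact factor.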
Final remark on Index 2 DAE's:
In 3D, these fully coupled solution methods are still generally considered by most
practitioners to be too costly, even with smart integrators—a situation that we hope will not
last forever.
This remark in a sense 'justifies' most of the following subsections on time integration.
Final remark on Smart Integrators:
Although they usually do the 'advertised' job (follow the physics), they are not
infallible—although we strongly believe that they are always superior to fixed-step
implicit integration methods. We also believe that there is room for further improvement
in this area of CFD; the graduation from 'smart' to brilliant integrators is a noble goal.
Some of its attributes would probably be (at least) the following: (1) it never 'stalls' when
$\Delta t$ should be increasing, (2) it always selects the proper norm in which to measure the
error estimate, (3) it never needs to repeat a step, (4) it never misses any important small
scale action, (5) it ... .
e. Discussion
In an important series of papers, Heywood and Rannacher have done much to realistically
clarify the picture when the NS equations are solved approximately via GFEM + TR;
see Heywood and Rannacher (1982, 1986a, 1988, 1990) and Rannacher (1990) for a
'summary'. They put forth a convergence theory and error estimates for virtually all finite
elements (second- through fifth-order) used (and some not used) for the time-dependent
index 2 NS equations via TR. The word 'realistically' refers to the fact that, prior to
their work, the error analyses seen in the literature 'conveniently' assumed more
smoothness/regularity than is realistic for the NS equations; in particular, vortex sheets had been
prohibited by assuming that the tangential momentum equation(s) was (were) valid at
$t = 0$ on $\Gamma$. As is now well known, and was the only 'regularity' assumption utilized by
Heywood and Rannacher, only the normal component of the momentum equation applies
on $\Gamma$ for all $t \ge 0$, and is the BC for the PPE. The tangential equation(s) apply only for
$t > 0$ and, as stated in Section 3.9.2, behave like the transient heat equations for small
time (with $\nabla P$ known from the PPE at $t = 0$). This behavior returns us to the
discussion presented for the heat equation (with advection, usually) in Section 2.7.4f. Whereas
the normal momentum equation satisfies both zeroth- and first-order compatibility
conditions [$n \cdot u_0 = n \cdot w(0)$ and the normal momentum equation itself, respectively] and the
initial velocity field must be divergence-free (no jumps permitted in the direction along
streamlines), the tangential component(s) can tolerate much less regularity; it (they) can
reside in $L^2$ or, more precisely, the initial velocity can be in $H(\mathrm{div}; \Omega)$; see Girault and
Raviart (1986). A simple example of this situation would be an IC comprising a cylinder
of fluid (of finite length if 3D—with ends orthogonal to the axis) in solid body rotation
in an otherwise quiescent fluid—an example of a divergence-free IC of compact support.
Another example is a box of fluid with an initial velocity determined/derived from an
arbitrary vorticity field (arbitrary only up to the necessary constraint that $n \cdot u_0 = 0$ on
$\Gamma$), which will generally display vortex sheets ($\omega = \infty$) on $\Gamma$ (at least); see Gresho
(1992). Heywood and Rannacher advocate, similar to the 'Rannacher philosophy' for
the transient heat equation (and AD equation) espoused in the previous chapter (see
Section 2.7.4f), the use of a 'smoother' in order to compensate for TR's lack of
dissipation so that meaningful 'smoothing estimates' can be made. (This means, roughly, that
the smooth solution for $t > 0$ is generally not smooth for $t \downarrow 0$; regularity is lost, in
general.) Using only the above regularity assumption, and a quadratically conservative
FEM formulation, they derive the following (L2) error estimates for fixed step TR and a
stable NS solution:
$$\|u_n - u(t_n)\| \le c\,(h^m/t_n^{m/2-1} + \Delta t^2/t_n), \qquad (3.16\text{-}253)$$
where $t_n = n\Delta t > 0$ and $\Delta t \sim h^2$. Here $m$ ($2 \le m \le 6$) is the order of the FEM basis
functions ($m = 2$ for linear approximation, etc.) and these are called 'smoothing' estimates
for the 'singular cases': $m \ge 3$. For a detailed discussion of the various notions and
definitions regarding 'stability' of the solution, see Heywood and Rannacher (1986b). This
result accounts for the general case in which the overdetermined Neumann problem (see
Heywood and Rannacher, 1982) is not satisfied at $t = 0$; i.e., only the normal momentum
equation applies on $\Gamma$ at $t = 0$, with the general result that $\|\nabla \partial u/\partial t\|_0$, $\|\partial u/\partial t\|_{H^1}$, and
$\|u\|_{H^3}$ are unbounded as $t \downarrow 0$.
If the TR is 'stabilized' via, for example, the appropriate sprinkling of a few BE steps
(or BDF2), as discussed at the end of Section 2.7.4, then the stringent restriction that
$\Delta t \sim h^2$ can be removed and (3.16-253) can still apply. Our position here, however, is
basically the same as it is for the transient heat equation: a smart, careful application
of the variable step TR (or, perhaps, BDF2) applied to the NS DAE's will also yield
'optimal' accuracy, albeit with $\Delta t \sim h^2$ for small $t$, in general (as selected by the smart
integrator, not the analyst!). This approach will, we believe, generally be more
cost-effective than a 'fixed $\Delta t$ TR + smoothing' approach; i.e., the total number of steps for a
given simulation will be a minimum for a given, specified accuracy—and said accuracy
will be achieved even at small time (where, at least in some cases, it may not be fully
warranted).
f. Penalty method
We conclude this section with some remarks on the penalty method. If the penalty method
is utilized to eliminate the pressure, then we return to an index 0 formulation—albeit a
mighty stiff one, owing to the (necessarily) large penalty parameter. These ODE's look
like [cf. (3.13-185)]
$$M\dot u + [K + N(u) + \lambda B]u = f + \lambda C Q^{-1} g, \qquad (3.16\text{-}254)$$
where $\lambda \gg \nu$, $B \equiv C Q^{-1} C^T$ is the penalty matrix (cf. Section 3.13.2e), and $Q$ is the
pressure mass matrix. We mentioned in Section 3.5.3 that the penalized momentum equations,
while reducing the system size by eliminating P, also 'penalized' the ODE's via the
introduction of an extraneous and spurious penalty transient; i.e., a compression 'wave' that
travels quickly through the domain until the penalty stiffness becomes virtually
equilibrated. This extra stiff behavior effectively rules out all explicit time-integration methods,
but the three implicit methods discussed above can easily deal with the problem—provided
certain simple modifications are made to the algorithms. Since the spurious transient is
of no physical significance, it can be quickly eliminated (overlooked) by taking
advantage of the stiff stability of BDF1 and BDF2. The required changes (besides, of course,
never seeing the pressure or the continuity equation in the system) are as follows, for all
three: for the first timestep (only), use BE and a $\Delta t$ that is large relative to the penalty
time constant, $\tau_P \equiv L^2/\lambda$, where $L$ is an appropriate characteristic length, yet small
relative to the physical time scales of interest, $\tau_A \equiv L/u_0$ and $\tau_D \equiv L^2/\nu$, where $u_0$ is a
characteristic velocity. Note that $\tau_P/\tau_D = \nu/\lambda \ll 1$ and that $\tau_P/\tau_A = Re \cdot \nu/\lambda$, which is
also $\ll 1$ for moderate Reynolds numbers so that such a $\Delta t$ selection should be feasible.
Presuming that the initial velocity field is close to satisfying $C^T u_0 = g_0$; i.e., that one
is trying to solve a physical (nearly incompressible) problem via 'penalty,' the velocity,
$u_1 \equiv \tilde u_0$ (the new/effective IC) from the above step will still be nearly divergence-free,
and the penalty transient [especially in the implied pressure field; see (3.5-11)] will have
been vanquished. If this step is regarded as a pre-start-up process, which is probably the
best way to view it, then simply replace $u_0$ by $\tilde u_0$ and return to the general variable-step
method for non-linear ODE's—obtained from the above three DAE index 2 implicit
integrators simply by deleting the C-matrix, the pressure, and the continuity equation; and
using (3.16-254) as the momentum equation.
Remarks:
(1) If $C^T u_0 - g_0$ is not small, then the resulting, nearly divergence-free adjusted velocity,
$\tilde u_0$, may be 'strange' looking even though it satisfies $C^T \tilde u_0 - g_0 = O(1/\lambda)$; an
example of which is presented in Sani, et al. (1981b). (Even if $\tilde u_0$ is 'strange looking',
it is still quite close to a discrete $L^2$-projection of $u_0$.)
(2) If $\Delta t$ cannot be large enough to fully kill the penalty transient in a single step, a
situation that can be measured by examining the size of $C^T u_1 - g_1$ compared to
$1/\lambda$, then it may be necessary to take several BE steps. (One properly selected step
should generally suffice, however.)
(3) Another time scale of interest is $\tau_h \equiv h^2/\nu$, where $h$ is a measure of the grid
spacing, typically for the smallest element. Another constraint on $\lambda$ is thus $L^2/\lambda \ll
h^2/\nu$, or $\lambda \gg \nu(L/h)^2$, which is usually not so difficult to attain.
(4) In practice, $10^5 \le \lambda \le 10^9$ is fairly common and useful.
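The time-scale bookkeeping in these remarks is easy to automate. The sketch below computes $\tau_P$, $\tau_A$, $\tau_D$, and $\tau_h$ and proposes a first BE step between the penalty scale and the fastest physical scale. The function names are hypothetical, and the geometric mean is only one plausible choice of 'large relative to $\tau_P$ yet small relative to the physics'; it is not prescribed by the text.

```python
import math

def penalty_time_scales(L, u0, nu, lam, h):
    """Return (tau_P, tau_A, tau_D, tau_h) for penalty parameter lam."""
    tau_p = L ** 2 / lam      # penalty time constant
    tau_a = L / u0            # advection time scale
    tau_d = L ** 2 / nu       # diffusion time scale
    tau_h = h ** 2 / nu       # grid-scale diffusion time
    return tau_p, tau_a, tau_d, tau_h

def first_be_step(L, u0, nu, lam, h):
    """Suggest dt for the single pre-start-up BE step (assumed heuristic)."""
    tau_p, tau_a, tau_d, tau_h = penalty_time_scales(L, u0, nu, lam, h)
    fastest_physical = min(tau_a, tau_d, tau_h)
    if tau_p >= fastest_physical:
        raise ValueError("penalty parameter too small for this mesh")
    return math.sqrt(tau_p * fastest_physical)   # geometric mean of the two scales
```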
We conclude this penalty discussion by returning to the 'spirit' of the eigenvector
expansions of Section 3.13.2d and 3.16.3 and briefly consider some analogous results a la
penalty; i.e, we embark on a somewhat 'academic' diversion, which we hope is justifiable.
And it is better to back up a bit and start with (3.13-183) before eliminating $P$; i.e. we
first consider the index 1 system
$$M\dot u + Ku + CP = f \qquad (3.16\text{-}255)$$
and
$$C^T u = g + \varepsilon Q P, \qquad (3.16\text{-}256)$$
where $\varepsilon \equiv 1/\lambda$, and we have reverted to Stokes flow because $N(u)u$ is not 'receptive' to simple
linear analysis. The associated $(n + m)$-dimensional eigenproblem is
$$K v_j + C q_j = \lambda_j M v_j \qquad (3.16\text{-}257)$$
and
$$C^T v_j - \varepsilon Q q_j = 0, \qquad (3.16\text{-}258)$$
rather than (3.16-210) and (3.16-211). The qualitative 'solution' of this eigenproblem,
which is clearly an $O(\varepsilon)$ perturbation from the index 2 eigenproblem, is as follows (and
we assume, for simplicity, no 'pressure modes').
1. There will be $n - m$ 'div-free' modes [$C^T v_j = O(\varepsilon)$] with finite $\lambda_j$; both eigenvalues
and eigenvectors will be $O(\varepsilon)$ from those of index 2.
2. There will be $2m$ values of $\lambda_j$ that are $O(1/\varepsilon)$; i.e., very large. To find the eigenvectors,
we note that (3.16-257) and (3.16-258) imply
$$C^T(\lambda_j M - K)^{-1} C q_j = \varepsilon Q q_j \qquad (3.16\text{-}259)$$
in general, and, for sufficiently large $\lambda_j$,
$$C^T M^{-1} C q_j = \varepsilon \lambda_j Q q_j \qquad (3.16\text{-}260)$$
in particular. Since $\varepsilon\lambda_j = O(1)$, it seems clear that (3.16-260) is an approximation to
(3.16-214) with $\varepsilon\lambda_j = \nu_j$. Thus, the first set of $m$ eigenvectors is, considering again
(3.16-257) with $\lambda_j = \nu_j/\varepsilon$,
fv]\ = feM^Cq+0(e^)\
\qjj V 1j + 0{e) J
where cjj are the eigenvectors of (3.16-214). The second set of m eigenvectors correspond
to the generalized eigenvectors of the index 2 result; i.e., to (3.16-215):
(;;>rc^0(£))
and we are done. The 'mixed' (index 1) penalty method approximates well, to $O(\varepsilon)$, the
index 2 results.
But, you may object, the real (index 0) penalty method of interest contains no pressure.
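Thus we now return to (3.16-254), drop $N(u)u$, and consider the concomitant eigenproblem:
$$\left(K + \frac{1}{\varepsilon}\,C Q^{-1} C^T\right) v_j = \lambda_j M v_j, \qquad (3.16\text{-}263)$$
which is of 'size' n. The solution of this eigenproblem will, we assert, display the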
following properties.
1. There will be the same $n - m$ modes as from 'mixed penalty' with finite $\lambda_j$ that are nearly
div-free; $C^T v_j = O(\varepsilon)$. These $\{v_j\}$ are $O(\varepsilon)$ from the $v_j$-portion of the index 2
eigenvectors from (3.16-212). (There is no analogous pressure-portion; pressure eigenvectors
do not exist in this 'velocity-only' formulation.)
2. There are $m$ eigenvalues having $\lambda_j = O(1/\varepsilon)$ with dilatational eigenvectors. From
(3.16-263), we have $(\varepsilon K + C Q^{-1} C^T)v_j = \varepsilon\lambda_j M v_j$ in general and
$$C Q^{-1} C^T v_j = \varepsilon\lambda_j M v_j \qquad (3.16\text{-}264)$$
in particular (for sufficiently small $\varepsilon$) and $C^T v_j \ne O(\varepsilon)$; i.e. the discrete divergence is
finite. [Also noteworthy is that (3.16-264) also follows from eliminating $q_j$ from the pair
(3.16-257) and (3.16-258).] In addition to the $n - m$ nearly div-free modes discussed
above, (3.16-263) and (3.16-264) will admit $m$ modes with finite divergence, which will
be $O(\varepsilon)$ from the index 2 modes given by (3.16-215) and (3.16-214), here with $\lambda_j = \nu_j/\varepsilon$;
i.e., large but not infinite, as with the second set of $m$ modes from index 1 above.
Thus, the index 0 eigensystem mimics both the div-free modes and the dilatational
generalized eigenvector modes of the index 2 eigensystem, the latter being those that
cause satisfaction of $C^T u = g$. It does not (apparently/ostensibly) mimic the $m$ infinite
eigenvalues of index 2 that cause satisfaction of the PPE; PPE satisfaction via penalty
would need to come 'after the fact.'
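The qualitative spectrum of the index 0 (penalty) eigenproblem can be observed on a tiny example. The sketch below uses NumPy with hand-picked matrices ($n = 4$, $m = 2$, and $M = Q = I$), all of which are assumptions made purely for illustration. It returns the eigenpairs of $K + \varepsilon^{-1} C C^T$ together with the eigenvalues $\nu_j$ of $C^T C$, so one can see $n - m$ finite, nearly div-free modes and $m$ eigenvalues close to $\nu_j/\varepsilon$.

```python
import numpy as np

def penalty_spectrum(eps=1e-6):
    """Eigenpairs of (K + C C^T / eps) v = lam v for a 4x4 toy system."""
    K = np.diag([1.0, 2.0, 3.0, 4.0])                   # toy 'viscous' matrix
    C = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0],
                  [0.0, 0.0]])                          # toy gradient matrix (n x m)
    lam, V = np.linalg.eigh(K + (C @ C.T) / eps)        # ascending eigenvalues
    nu = np.linalg.eigvalsh(C.T @ C)                    # limiting values of eps*lam
    return lam, V, C, nu
```

With the default `eps`, the two smallest eigenvalues stay O(1) and their eigenvectors satisfy `C.T @ v` of order `eps`, while `eps * lam` for the two largest approaches `nu`.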
But what we do learn from the above 'analysis' is the following important result
(and one that we have verified numerically): since the penalty eigenproblem results are
$O(\varepsilon)$ from those of index 2 (at least up to the PPE satisfaction), so too will be the
penalty transient if the initial velocity field is far from div-free (or close, of course);
i.e., the penalty transient will actually mimic the $L^2$-projection of index 2 to the (nearly)
discretely div-free subspace, a result we have actually already seen in the three-equation
model problem of Section 3.16.2e. (It is demonstrated again in Gresho and Sani, 1998.)
With these results more-or-less in hand, we now pose two interesting and related
problems for the reader—part of which has already been presented in Gresho and Sani
(1987). Consider a bounded domain containing a motionless fluid everywhere except in
a (small) subdomain in which a non-zero velocity field is given by the stream function
$$\psi(x, y) = \sin^2[\pi(x - x_0)/l]\,\sin^2[\pi(y - y_0)/h], \qquad (3.16\text{-}265)$$
for $0 \le x - x_0 \le l$ and $0 \le y - y_0 \le h$, a 'cellular' divergence-free IC contained in an $l \times h$
subdomain whose lower left corner is located at $(x_0, y_0)$ that is away from the boundaries of
the full domain. This is an example of an IC with compact support ($u = 0$ at and near $\Gamma$).
The two 'interesting' problems are these, on a set of node points that 'discretizes' $\Omega$.
1. Take as discrete initial data the interpolant of the velocity from (3.16-265); i.e., the
exact velocity field is evaluated at the nodes in the subdomain. Noting/recalling that a
div-free velocity is generally not discretely div-free leads to the second problem:
2. Take the discrete L2 projection of the velocity from problem 1 as the IC.
Consider now 'solving' (time-integrating) these two problems (with advection terms
re-instated) in three different ways: (i) the index 2 DAE's ($u$-$P$ formulation), (ii) the
PPE/index 1 DAE's, and (iii) the index 0 penalty DAE's (ODE's). After realizing
that problem 1 is ill-posed in the strongest (index 2) formulation, we are down to
five computable solutions, whose qualitative behavior we shall predict and leave the
quantitative results to the interested reader (with the warning that 'the Devil is in the
details'):
(1) Interpolated IC via index 1. This problem is solvable, but the solution will be, at
least in part, non-physical; the initial non-zero divergence, $C^T u_0 \ne 0$, will be
present for all time [$C^T u(t) = C^T u_0$] and, thus, the concomitant part of the pressure
field will also be spurious. (If the mesh is sufficiently fine, the non-physical portion
of the solution may be difficult to detect.)
(2) Interpolated IC via penalty. Here the initial pressure ($\varepsilon^{-1} Q^{-1} C^T u_0$) may be quite
large, but by time $O(\varepsilon)$ the system would have recovered to be $O(\varepsilon)$ away from
the $L^2$-projected IC of problem 2.
(3) Projected IC via index 2. A physical, domain-filling initial pressure for $t \ge 0$ (that
is independent of viscosity and has $\partial P/\partial n = 0$ on $\Gamma$) will occur (i.e., not just in
the IC's subdomain) and a physical flow would ensue.
(4) Projected IC via index 1. Ditto index 2; i.e., the results would be identical, and
this is the example presented, but without any discussion of time integration, in
Gresho and Sani (1987).
(5) Projected IC via penalty. This case is similar to that presented in Section 3.16.2e
for the three-equation model in that the initial pressure is zero (and thus wrong),
but will undergo a penalty transient that will recover the physical pressure, to
$O(\varepsilon)$, while only very slightly changing the velocity field, again to $O(\varepsilon)$.
(6,7) Yes, there are actually two more cases that should be considered: each IC via
penalty but not 'tracking' the penalty transient (e.g. just take one step with $\lambda\Delta t_0 =
100$ via BE). After a single time step, both solutions should be $O(\varepsilon)$ from the $L^2$
projected IC, and variable-step time integration may now commence. And this
is the way the penalty method should be employed in practice. The penalty
transients from cases (2) and (5) would be interesting to compare in that they are
very different (in the pressure) yet would both lead to the same result, to $O(\varepsilon)$ in
time $O(\varepsilon)$: the $L^2$ projected velocity field and its corresponding pressure. A final
exercise is to reconsider the above problem for Stokes flow, for which the exact
initial pressure is zero. (Why zero?)
We are now finished with 'honest' (rigorous) implicit methods that choose At based
on the physics and have no need for mass lumping, but generate the largest possible sets
of coupled equations; i.e., robustness exacts its price. In the rest of this section, therefore,
we investigate some cost-cutting alternatives, beginning with an explicit method applied
to the index 1 DAE's.
3.16.5 An Explicit (Index 1) Method, Plus a Few Tricks
The tricks will be mainly applied to one particular element, $Q_1Q_0$, but the explicit
method to be described next can be applied to any element for which mass lumping
makes sense. It was broached in Gresho et al. (1980a), first implemented in Gresho et al.
(1981), with an additional simplification called 'centroid advection,' and described in
detail in Gresho et al. (1984b, c). We begin with the 'general' element and, at the end,
specialize to $Q_1Q_0$. Also, whereas we will describe only the simplest explicit method,
there seems to be no reason that others cannot be used. The forward Euler method is
described only because it is still quite popular and it is the simplest possible way to make
a code.
We return to (3.16-52) and (3.16-53) but this time in full dress form (FEM): given $u_n$
with $C^T u_n = g_n$, the algorithm is simply as follows, for n = 0, 1, ...:
Step 1. Solve
$$(C^T M_L^{-1} C)P_n = C^T M_L^{-1}[f_n - Ku_n - N(u_n)u_n] - (g_{n+1} - g_n)/\Delta t \equiv C^T \tilde a_n - (g_{n+1} - g_n)/\Delta t \qquad (3.16\text{-}266)$$
for $P_n$, where $\tilde a_n$ is a partial acceleration (sans $\nabla P$).
Step 2. Update the velocity via
$$u_{n+1} = u_n + \Delta t(\tilde a_n - M_L^{-1} C P_n) \equiv u_n + \Delta t\,a_n, \qquad (3.16\text{-}267)$$
where $a_n$ is the total acceleration vector.
Step 3. Bump n and go to Step 1 unless it is time to stop.
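One pass through Steps 1 and 2 can be sketched with NumPy on a toy system. Everything concrete here is an assumption for illustration: the function name, the diagonal stand-in `np.diag(u)` for the advection matrix $N(u)$, and storing the lumped mass matrix as a vector of its diagonal entries. The point of the sketch is only that the updated velocity satisfies the discrete continuity equation to round-off when the PPE is solved accurately.

```python
import numpy as np

def explicit_step(u, dt, ML_diag, K, C, f, g_now, g_next):
    """One forward Euler step of the index 1 (PPE) form with lumped mass."""
    N_u = np.diag(u)                                   # toy stand-in for N(u)
    a_partial = (f - K @ u - N_u @ u) / ML_diag        # acceleration sans pressure gradient
    A = C.T @ (C / ML_diag[:, None])                   # C^T ML^{-1} C
    rhs = C.T @ a_partial - (g_next - g_now) / dt
    P = np.linalg.solve(A, rhs)                        # Step 1: discrete PPE
    a_total = a_partial - (C @ P) / ML_diag            # Step 2: total acceleration
    return u + dt * a_total, P
```

In a real code the PPE matrix would be assembled and factored once, since it does not change from step to step.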
Remarks:
(1) ML is a diagonal, lumped mass matrix—and we no longer have a GFEM (or at least
an 'honest' GFEM); our FEM has moved a step toward FDM. Mass lumping also,
of course, restricts the choice of elements (to those for which lumping is 'allowed').
(2) The final velocity is discretely divergence-free: CTun+\ = gn+\—unless Ctuq^
go and PHg(t) = 0, in which case the initial divergence error is forever present:
CTun+l = gn + CTu0 - go, a la (3.16-7).
(3) If (3.16-266) is not solved sufficiently accurately, then spurious divergence may
accumulate. There are two known ways to remedy this problem: (i) 'penalize' the
PPE by replacing $g_n$ by $C^T u_n$ on the RHS of (3.16-266), (ii) periodically project
the velocity to the divergence-free subspace (see next section, and Appendix 3); an
example of the latter is given in Dupont and Marchal (1988).
The price paid for the simplicity is instability: only $\Delta t$ less than some stability-limited
$\Delta t_{max}$ will preclude blow up. And this is over and above the price, in loss of accuracy,
already paid for mass lumping. And $\Delta t_{max}$ is, in general, not known! Thus, rather than
addressing the issues covered in the previous section on implicit integration (efficiency
issues mainly, but not simple ones), the main issue for explicit methods, besides the
single linear algebra issue of how best to solve (3.16-266), is how to 'guesstimate' $\Delta t_{max}$.
Another issue, rarely addressed in the literature (see Paolucci, 1990, for an exception),
is related to accuracy: even if I can find $\Delta t_{max}$, how do I know it is small enough to be
'sufficiently' accurate? Most users of explicit methods seem to believe (at least implicitly)
that $\Delta t_{max}$ is so small (ridiculously so, sometimes) that the results must be accurate;
indeed, more accurate than warranted for the given mesh. Without further ado and totally
without justification, we shall henceforth make the same (religious?) assumption and only
justify it by offering the following advice: just as a single-grid simulation (one FEM
discretization) is not a good simulation practice (see Volume II for more on this subject),
so too is a single $\Delta t$ simulation not generally tenable. If you perform a simulation at or
near $\Delta t_{max}$ (which is frequently found experimentally, being guided by those estimates to
be soon described), perform another one at one-half this step size; large differences tell
you to 'keep going,' small ones are a sign of relief.
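The step-halving advice can be made concrete with a small helper. The sketch below is an assumption-laden illustration on a generic ODE system $\dot u = \mathrm{rhs}(u)$, not on the NS equations themselves; `forward_euler` and `halving_check` are hypothetical names, and the relative Euclidean difference is just one possible measure of "large" versus "small."

```python
import numpy as np

def forward_euler(rhs, u0, dt, t_end):
    """Integrate du/dt = rhs(u) from 0 to t_end with fixed-step forward Euler."""
    u = np.array(u0, dtype=float)
    for _ in range(int(round(t_end / dt))):
        u = u + dt * rhs(u)
    return u

def halving_check(rhs, u0, dt, t_end):
    """Relative difference between runs at dt and dt/2 (small is reassuring)."""
    coarse = forward_euler(rhs, u0, dt, t_end)
    fine = forward_euler(rhs, u0, dt / 2.0, t_end)
    return np.linalg.norm(coarse - fine) / max(np.linalg.norm(fine), 1e-300)
```

For the scalar decay problem `rhs = lambda u: -u`, the returned difference shrinks roughly in proportion to the step size, as expected for a first-order method; if it does not shrink, the advice in the text is to keep halving.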
Rather than advocating the FE algorithm just described, we advocate (but not too
strongly) the BTD improvement described in Section 2.7.2e for the scalar transport
equation, because the stability limits for NS are at least as stringent as those for AD
(FE still generates negative diffusion). But we also include another warning: the simple
replacement of the viscosity, $\nu$, by a balancing tensor viscosity, $\nu I + \mathbf{u}\mathbf{u}\Delta t/2$, does
not have quite as simple a consequence for the vector-valued PDE's as it did for the
simple scalar equation. In particular, the term $\nabla \cdot (\mathbf{u}\mathbf{u} \cdot \nabla \mathbf{u})$, when represented in Cartesian
coordinates (as in conventional FEM codes), does not 'transform' properly. For example,
transformation to 'intrinsic coordinates' (parallel and normal to streamlines) of the 2D
Euler equations ($\nu = 0$, for simplicity) with BTD, $\partial \mathbf{u}/\partial t + \mathbf{u} \cdot \nabla \mathbf{u} + \nabla P = \tfrac{1}{2}\Delta t\,(\mathbf{u} \cdot \nabla)^2 \mathbf{u}$,
yields
$$\partial q/\partial t + q\,\partial q/\partial s + \partial P/\partial s = \Delta t\, q^2(\partial^2 q/\partial s^2 - \kappa^2 q)/2 \tag{3.16-268}$$
for the streamline component of the velocity ($q = |\mathbf{u}|$), where $s$ is the streamline coordinate
and $\kappa$ is the principal curvature, and is the culprit; rather than simply $(q^2 \Delta t/2)\,\partial^2 q/\partial s^2$,
which is streamline diffusion, the BTD 'curvature crisis' (a la Gresho and Chan, 1990)
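shows the undesirable introduction of a damping term. Another way to see the 'problem'
is to seek an 'equivalent operator' (as in Section 2.7.2e) which, when integrated via FE,
would give the right result to $O(\Delta t^2)$. This yields, again for the inviscid case,
$$\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} + \nabla P = \frac{\Delta t}{2}\left[\nabla \cdot (\mathbf{u}\mathbf{u} \cdot \nabla \mathbf{u}) + \nabla P \cdot \nabla \mathbf{u} + (\mathbf{u} \cdot \nabla)\nabla P + (\mathbf{u} \cdot \nabla \mathbf{u}) \cdot \nabla \mathbf{u} - \frac{\partial \nabla P}{\partial t}\right], \tag{3.16-269}$$
in which only the first term on the RHS is 'BTD.' If one cared to and was able to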
incorporate the other four correction terms, perhaps the forward Euler fix would really
be proper. Rather than try or suggest such silliness, we recommend switching at least
the advection terms—from FE to one of the higher-order explicit methods discussed
earlier; AB2, AB3, or RK4. We present and further discuss FE/BTD only because we
and others have quite a lot of experience with it—mostly good, in spite of the theoretical
shortcomings just exposed.
Then, because we know no better and because it works reasonably well, we assume
that using BTD as in the scalar equation leads to the same estimates for $\Delta t_{\max}$; i.e., see
(2.7-47) and related discussion. But on the assumption (!) that both the CFL and the
diffusional stability limit cause $\Delta t_{\max}$ to be much smaller than necessary for a sufficiently
accurate integration, we now discuss the concept of 'subcycling,' a cost-effective trick
first introduced in Chan et al. (1981; and in detail in Gresho et al., 1984b), and one
that frequently works well. The object is to further reduce cost by updating the pressure
less frequently than the once per timestep of the algorithm just presented. And here we do
reintroduce some accuracy concepts based on the LTE, and utilize the fact that (at least
in the absence of forcing) the pressure-velocity coupling via mass conservation has no
impact on the stability (since $u_{n+1}^T C P_n = P_n^T C^T u_{n+1} = 0$, the pressure term drops out of
the kinetic energy equation). One of the goals of the method is to perform the expensive
part of the problem (solving the PPE) no more frequently than would a BE method with
error control. A four-step summary of the idea is given first, followed by the details (in
which the Devil resides):
1. The minor (smaller, and fixed) timestep based on stability estimates is used to compute,
presumably very accurately, the advection and diffusion terms (these processes are
'subcycled' for a predetermined number of steps), and the pressure gradient is approximated
(guessed) via linear extrapolation. Hence, the PPE (or, equivalently, the mass conservation
equation) is not needed during subcycling; it is ignored.
2. Project the now non-divergence-free velocity to the divergence-free subspace and take
the result as the velocity solution. (Because we do not use the consistent mass matrix for
the projection, we do not have an L2-orthogonal projection to the nearest divergence-free
velocity—but it is close; see Appendix 3.)
3. Solve the PPE for the concomitant pressure field.
4. Compute the next major timestep (the timestep for the pressure) via local error estimates.
Figure 3.16-9 shows the process schematically; $u_n$ is the divergence-free velocity, $P_n$
is its corresponding pressure, $\tilde{u}_{n+1}$ and $\tilde{P}_{n+1}$ are the 'intermediate' velocity and pressure
at the end of subcycling, $\Delta t_s$ is the minor (subcycle) timestep based on stability, and
$\Delta t_n$ is the major (and variable) timestep. (As seen below, $\tilde{P}_{n+1}$ is not actually
needed, nor is it computed.)
Fig. 3.16-9 A schematic of the subcycling process.
The details of the subcycling method are as follows:
1. The subcycling process. Given $u_n$ at $t_n$ with $C^T u_n = g_n$ and $S$ ($= \Delta t_{n+1}/\Delta t_s$), the
subcycle ratio (whose value will be discussed later), the following ODE's are solved
via FE:
$$M_L \dot{\tilde{u}} + K\tilde{u} + N(\tilde{u})\tilde{u} + C\tilde{P}(t) = f; \tag{3.16-270}$$
i.e., for $m = 0, 1, \ldots, S - 1$, with $u_0 = u_n$,
$$u_{m+1} = u_m + \Delta t_s M_L^{-1}[f_m - K u_m - N(u_m) u_m - C\tilde{P}_m], \tag{3.16-271}$$
where
$$\tilde{P}_m = P_n + (P_n - P_{n-1})(m \Delta t_s)/\Delta t_{n+1}.$$
2. Project $u_S$ ($= \tilde{u}_{n+1}$) to the divergence-free subspace and call the result $u_{n+1}$:
$$M_L(u_S - u_{n+1}) = C\lambda \quad \text{and} \quad C^T u_{n+1} = g_{n+1}; \tag{3.16-272}$$
i.e., solve $(C^T M_L^{-1} C)\lambda = C^T u_S - g_{n+1}$ for $\lambda$ (the Lagrange multiplier of the projection),
and then compute $u_{n+1} = u_S - M_L^{-1} C\lambda$ and discard $\lambda$. (See Appendix 3 for 'projection
theory' and details.)
3. Solve the PPE for $P_{n+1}$,
$$(C^T M_L^{-1} C) P_{n+1} = C^T M_L^{-1}[f_{n+1} - K u_{n+1} - N(u_{n+1}) u_{n+1}] - (g_{n+1} - g_n)/\Delta t_{n+1}. \tag{3.16-273}$$
4. Determine the next major step size, $\Delta t_{n+2}$: the heart of the matter, and the most
tenuous part. We would like to base the major step size on specified local accuracy
(as usual), but we have corrupted the usual ODE (or DAE) problems with our ad hoc
procedure of extrapolate, subcycle, project, and the local error estimate is thus trickier
than usual. In fact, we have two of them, one more conservative than the other and neither
really rigorous. The conservative one simply ignores the fact that the modified ODE is
solved with small minor steps and bases the local error on major steps only, as if full
FE were used at the major step sizes. It is thus simply
$$d_{n+1} = u_{n+1} - u(t_{n+1}) = -\Delta t_{n+1}^2\, \ddot{u}_n/2 \tag{3.16-274}$$
a la (2.7-7), and we approximate $\ddot{u}_n$ by $(\dot{u}_{n+1} - \dot{u}_n)/\Delta t_{n+1}$, where both $\dot{u}_{n+1}$ and $\dot{u}_n$
are computed directly from the ODE's, $\dot{u}_k = M_L^{-1}[f_k - K u_k - N(u_k) u_k - C P_k]$, $k = n$
or $n + 1$, to give $d_{n+1} = -\Delta t_{n+1}(\dot{u}_{n+1} - \dot{u}_n)/2$. The standard ODE theory then gives the
next major step size as
$$\Delta t_{n+2} = \Delta t_{n+1}(\varepsilon/\|d_{n+1}\|)^{1/2}, \tag{3.16-275}$$
where $\varepsilon$ and $\|\cdot\|$ are as before (for implicit solvers).
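The less conservative estimate attempts to account for the subcycling process (at $\Delta t_s$)
and the pressure extrapolation during same. It is also less easy to derive, beginning with
$$e_{n+1} = u_{n+1} - u(t_{n+1}) = \mathcal{P} u_S - u(t_{n+1}) \tag{3.16-276}$$
as the local error, where $\mathcal{P} = I - M_L^{-1} C (C^T M_L^{-1} C)^{-1} C^T$ is the projection matrix [see
(3.16-13) and Appendix 3]; $e_{n+1}$ is the local error on the assumption that $u_0 = u_n = u(t_n)$.
{Recall that, via $\mathcal{P}$, the original DAE's can be expressed as
$\dot{u} = \mathcal{P}[M_L^{-1}(f - Ku - N(u)u)] + M_L^{-1} C (C^T M_L^{-1} C)^{-1} \dot{g}$; see (3.16-13).}
The analysis proceeds by expressing $e_{n+1}$
as the sum of a Subcycle (plus projection) error and an Extrapolation error,
$$e_{n+1} = e^S_{n+1} + e^E_{n+1}, \tag{3.16-277}$$
where
$$e^S_{n+1} = \mathcal{P}[u_S - \tilde{u}(t_{n+1})]$$
and $\tilde{u}(t_{n+1})$ is the exact solution of the subcycled ODE, (3.16-270), and
$$e^E_{n+1} = \mathcal{P}\tilde{u}(t_{n+1}) - u(t_{n+1}) \tag{3.16-278}$$
is the extrapolation error [the error remaining if $\Delta t_s \to 0$ (with $\Delta t_{n+1}$ fixed) and thus
$e^S_{n+1} \to 0$]. We treat these two error contributions in turn, starting with $e^S_{n+1}$, whose local
(single subcycle step) error is obviously $-\Delta t_s^2\, \ddot{\tilde{u}}_m/2$. Next, from the theory of global error
for ODE's, the accumulated local error at time $\Delta t_{n+1}$ can be shown to be, approximately,
$-\Delta t_{n+1} \Delta t_s\, \ddot{\tilde{u}}_n/2$ (A.C. Hindmarsh, personal communication); i.e., we have
$$u_S - \tilde{u}(t_{n+1}) = -\Delta t_{n+1} \Delta t_s\, \ddot{\tilde{u}}_n/2$$
and thus
$$e^S_{n+1} = \mathcal{P}[u_S - \tilde{u}(t_{n+1})] = -\Delta t_{n+1} \Delta t_s\, \mathcal{P}\ddot{\tilde{u}}_n/2. \tag{3.16-279}$$
To simplify the ensuing analysis, we take $g = 0$ (the most common situation in practice)
and define $h(u) = M_L^{-1}(f - Ku - N(u)u)$ so that the NS ODE reads $\dot{u} = \mathcal{P}h(u)$, and the
subcycle ODE reads $\dot{\tilde{u}} = h(\tilde{u}) - M_L^{-1} C \tilde{P}(t)$. Thus, $\ddot{\tilde{u}} = h_u(\tilde{u})\dot{\tilde{u}} - M_L^{-1} C \dot{\tilde{P}}$, where $h_u(u) = \partial h(u)/\partial u$
is a Jacobian matrix. We also have $\ddot{u} = \mathcal{P}h_u(u)\dot{u}$ in general and $\ddot{u}_n = \mathcal{P}h_u(u_n)\dot{u}_n$
in particular. Thus, using $\mathcal{P}M_L^{-1}C = 0$ gives $\mathcal{P}\ddot{\tilde{u}} = \mathcal{P}h_u(\tilde{u})\dot{\tilde{u}}$ and $\mathcal{P}\ddot{\tilde{u}}_n = \mathcal{P}h_u(\tilde{u}_n)\dot{\tilde{u}}_n$. But
we have $\tilde{u}_n = u_n$, and it follows easily that $\dot{\tilde{u}}_n = \dot{u}_n$, so that $\mathcal{P}\ddot{\tilde{u}}_n = \mathcal{P}\ddot{u}_n = \ddot{u}_n$ since $\ddot{u}_n$ is
divergence-free, and we are done, once we approximate $\ddot{u}_n$ by $(\dot{u}_{n+1} - \dot{u}_n)/\Delta t_{n+1}$; i.e.,
we have
$$e^S_{n+1} = -\Delta t_{n+1} \Delta t_s\, \mathcal{P}\ddot{\tilde{u}}_n/2 = -\Delta t_{n+1} \Delta t_s\, \ddot{u}_n/2 = -\Delta t_s(\dot{u}_{n+1} - \dot{u}_n)/2. \tag{3.16-280}$$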
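Switching now to $e^E_{n+1}$, we start by utilizing the fact that it is a continuous function of
time to write
$$\begin{aligned}
e^E_{n+1} &= e^E_n + \Delta t_{n+1}\,\dot{e}^E_n + \frac{\Delta t_{n+1}^2}{2}\,\ddot{e}^E_n + \frac{\Delta t_{n+1}^3}{6}\,\dddot{e}^E_n + O(\Delta t_{n+1}^4) \\
&= \mathcal{P}\tilde{u}_n - u_n + \Delta t_{n+1}(\mathcal{P}\dot{\tilde{u}}_n - \dot{u}_n) + \frac{\Delta t_{n+1}^2}{2}(\mathcal{P}\ddot{\tilde{u}}_n - \ddot{u}_n) + \frac{\Delta t_{n+1}^3}{6}(\mathcal{P}\dddot{\tilde{u}}_n - \dddot{u}_n) + O(\Delta t_{n+1}^4) \\
&= \frac{\Delta t_{n+1}^3}{6}(\mathcal{P}\dddot{\tilde{u}}_n - \dddot{u}_n) + O(\Delta t_{n+1}^4).
\end{aligned} \tag{3.16-281}$$
Further effort would show that $\mathcal{P}\dddot{\tilde{u}}_n - \dddot{u}_n \sim (\dot{\tilde{P}}_n - \dot{P}_n) = O(\Delta t_n)$ to give $e^E_{n+1} = O(\Delta t_{n+1}^4)$
for the linearly extrapolated $\tilde{P}(t)$, but the details are not necessary because even if we
used constant extrapolation, $\tilde{P}(t) = P_n$, the result would be $e^E_{n+1} = O(\Delta t_{n+1}^3)$, which is
high-order relative to the $O(\Delta t_{n+1} \Delta t_s)$ from the subcycle plus projection error. Thus we
can neglect $e^E_{n+1}$ relative to $e^S_{n+1}$ to arrive at our second error estimate,
$$e_{n+1} = -\Delta t_s(\dot{u}_{n+1} - \dot{u}_n)/2 \tag{3.16-282}$$
and the concomitant timestepping strategy
$$\Delta t_{n+2} = \Delta t_{n+1}(\varepsilon/\|e_{n+1}\|), \tag{3.16-283}$$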
a 'linear' (in $1/\|e_{n+1}\|$) relationship and thus faster changing than the inverse square-root
relationship given by the more conservative estimate, (3.16-275). The ratio of the
two estimates, from (3.16-275) and (3.16-283), $d_{n+1}/e_{n+1} = \Delta t_{n+1}/\Delta t_s$, is in fact $S$, the
subcycle ratio, which can be significant [O(10) or more].
Thus, using either (3.16-275) or (3.16-283), the next subcycle ratio,
$$S = \Delta t_{n+2}/\Delta t_s, \tag{3.16-284}$$
which is naturally rounded down to the nearest integer, is available. The subcycling
method via FE plus BTD is complete.
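One major cycle of the procedure (subcycle, project, solve the PPE, choose the next step) might be sketched as follows. This is a hedged illustration rather than the published implementation: all names are hypothetical, matrices are dense, $f$ is constant, BTD is omitted, the Euclidean norm stands in for the norm "as before," and the extrapolation slope uses the current major step as in (3.16-271), which is ordinary linear extrapolation from the two stored pressures when the major step is constant.

```python
import numpy as np

def major_cycle(u_n, P_n, P_nm1, dt_major, S, ML_inv, C, K, N, f, g_n, g_np1):
    """Subcycle S minor steps, project, then solve the PPE (steps 1 to 3)."""
    dt_s = dt_major / S
    A = C.T @ (ML_inv[:, None] * C)              # C^T M_L^{-1} C
    u = u_n.copy()
    for m in range(S):                            # step 1, cf. (3.16-271)
        P_m = P_n + (P_n - P_nm1) * (m * dt_s) / dt_major
        u = u + dt_s * ML_inv * (f - K @ u - N(u) @ u - C @ P_m)
    lam = np.linalg.solve(A, C.T @ u - g_np1)     # step 2, cf. (3.16-272)
    u_np1 = u - ML_inv * (C @ lam)
    a_partial = ML_inv * (f - K @ u_np1 - N(u_np1) @ u_np1)
    rhs = C.T @ a_partial - (g_np1 - g_n) / dt_major
    P_np1 = np.linalg.solve(A, rhs)               # step 3, cf. (3.16-273)
    return u_np1, P_np1

def next_major_step(dt_major, dt_s, udot_n, udot_np1, eps, conservative=True):
    """Step 4: next major step from (3.16-275) or (3.16-283)."""
    diff = max(np.linalg.norm(udot_np1 - udot_n), 1e-300)  # avoid 0-division
    if conservative:
        d = 0.5 * dt_major * diff                 # norm of (3.16-274)
        return dt_major * np.sqrt(eps / d)
    e = 0.5 * dt_s * diff                         # norm of (3.16-282)
    return dt_major * eps / e

def subcycle_ratio(dt_next, dt_s):
    """(3.16-284), rounded down, and never less than one."""
    return max(1, int(np.floor(dt_next / dt_s)))
```

A simple way to act on the ad hoc compromise between the two estimates is to call `next_major_step` twice and combine the results, for example geometrically with `np.sqrt(dt_a * dt_b)`.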
Remarks:
(1) For start-up, set $\Delta t_1 = \Delta t_s$ and solve (3.16-273) with $n = -1$ and $(g_0 - g_{-1})/\Delta t$
replaced by $\dot{g}_0$ to get $P_0$. Take one FE step to get $u_1$. Solve (3.16-273) with $n = 0$
to get $P_1$. Compute $\Delta t_2$ from (3.16-275) or (3.16-283) with $n = 0$ and then $S$ from
(3.16-284). Set $n = 1$ and go to (3.16-271).
(2) Subcycling obviously only makes sense when S > 2, since two Poisson equations
are solved for each cycle (major time step). If $S \gg 2$ (e.g., 10, 20, 50), then the
procedure can be quite cost-effective.
(3) In our experience it seems that often the conservative estimate is too
conservative, and the other estimate is too 'jumpy' (rapid adjustments to S). A reasonable
compromise, although totally ad hoc, might be to average the two estimates—either
arithmetically or, perhaps, geometrically.
(4) If an iterative method is used to solve the PPE without subcycling, then it may be a
good idea to add a 'penalizing' term to its RHS to help control possible error buildup
in the velocity divergence; i.e., replace $g_n$ on the RHS of (3.16-266) by $C^T u_n$. [Then,
the operation $C^T u_{n+1}$ in (3.16-267) will give $g_{n+1}$ rather than $C^T u_n + (g_{n+1} - g_n)$ on
the RHS.] This procedure has been called 'divergence cleaning' in electromagnetics
(e.g., Ramshaw, 1983). Another possibility here is the periodic projection (e.g.,
every 20 timesteps) of the current velocity to the divergence-free subspace, a la, for
example, Dupont and Marchal (1988), as mentioned earlier.
We now specialize to the $Q_1Q_0$ element and offer an additional cost-savings
device, also first published in detail in Gresho (1984b), but exposed earlier in Chan
et al. (1981) and Chan and Gresho (1982). This additional short cut, which further
converts/subverts(?) the GFEM to what might be called an 'isoparametric FDM,' is not
without some pain, however, which requires an additional and inconvenient 'patch job.'
Borrowed from the solid mechanics community, this short cut is most simply described as
one-point quadrature; rather than paying the cost of 'full quadrature' to get the Galerkin
integrals (the coefficient matrices) right (or close to right for distorted elements and the
K-matrix), the simple expedient of centroid evaluation (approximation) of all integrals
can significantly reduce the cost of running a code. It of course also reduces the accuracy;
but, on balance, the technique has often proven to be cost-effective, at least when
measured against the explicit method just described when $2 \times 2 \times 2$ Gaussian quadrature
is employed (eight times as many quadrature points).
The method to be described here is an evolution of that proposed and tested in Gresho
et al. (1984c), in $M_L$ and C. In particular, the one-point scheme we currently advocate
is the following, first in an abbreviated form:
1. $M_L$ is generated by full ($3 \times 3 \times 3$) quadrature on M, which is then row-summed.
2. The C-matrix is generated via $2 \times 2 \times 2$ quadrature for both pressure gradient and
velocity divergence. (Earlier experience with one-point on C caused noticeable loss of
mass in 3D with distorted elements. If your code is 2D only or 3D on pure bricks,
one-point on C is fine since it is then exact.)
3. A second C-matrix, say $\tilde{C}$, is generated via one-point quadrature because, as shown
below, it can be efficiently utilized to generate K, N(u), and the BTD matrix 'on the fly.'
Additional details and remarks:
1. One-point quadrature on the non-linear advection terms simplifies them considerably
and, at least on reasonable grids, gives the same result as full quadrature after the 'centroid
advection' approximation is made; i.e., for the $N^e(u)$-matrix, $\int_e \varphi_i\, \mathbf{u}^h \cdot \nabla \varphi_j$, take $\mathbf{u}^h$ at the
element centroid as representative of that everywhere in the element to obtain
$N^e_{ij}(u) = \bar{u}_k \int_e \varphi_i\, \partial \varphi_j/\partial x_k$, in which $\bar{u}_k$ is the average of all nodal values of the k-th component of
velocity. This serious simplification, reducing otherwise triple product integrals to double,
is of course related to the group finite element (product approximation) advocated by
Fletcher and others; see Fletcher (1991). This approximation, as already discussed in the
previous chapter, modifies the Galerkin weighting/averaging from (1 4 1)/6 to (1 2 1)/4
and is quite easily stated in words: the average (centroid) velocity in each element is
multiplied by the average gradient of the advected variable in that element and the result
averaged (volume-weighted) over those elements sharing the node in question.
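2. The one-point C-matrix is defined, on element e, by
$$\tilde{C}^e_{ik} = -(\partial \varphi_i/\partial x_k)_0\, \Omega_e, \tag{3.16-285}$$
where $(\;)_0$ indicates centroid evaluation, and $\Omega_e$ is the element 'size' (correctly integrated,
$2 \times 2 \times 2$ for general 3D elements, one-point otherwise). The anti-symmetric nature of
$(\partial \varphi_i/\partial x_k)_0$ allows for the computation (and storage, if stored) of only one-half of the full
C-matrix (i.e., in 3D, $8 \times 3$ goes to $4 \times 3$).
3. The one-point advection 'matrix' ('matrix'-valued vector product, actually) is easily
formed from $\tilde{C}$ as follows, using T as the typical advected variable (it could also be u
or v or w):
$$N^e_{ij} T_j = \bar{u}_k T_j \int_e \varphi_i\, \partial \varphi_j/\partial x_k\, d\Omega = \bar{u}_k T_j\, \varphi_i(0)\, (\partial \varphi_j/\partial x_k)_0\, \Omega_e = -\varphi_i(0)\, \bar{u}_k \tilde{C}^e_{jk} T_j, \tag{3.16-286}$$
which, since $\varphi_i(0) = 1/4$ in 2D and 1/8 in 3D for all i, is seen to be a simple scalar, to
be distributed equally to each node in $\Omega_e$.
4. Similarly for the diffusion matrix, we obtain (using T again to represent any diffused
quantity), for element e,
$$K^e_{ij} T_j = T_j \int_e (\partial \varphi_i/\partial x_k)(\partial \varphi_j/\partial x_k)\, d\Omega = T_j (\partial \varphi_i/\partial x_k)_0 (\partial \varphi_j/\partial x_k)_0\, \Omega_e = T_j (\tilde{C}^e_{ik}/\Omega_e)\, \tilde{C}^e_{jk} \ \text{(no sum on e)} = -\tilde{C}^e_{ik} (\partial T/\partial x_k)_0. \tag{3.16-287}$$
5. The one-point quadrature implementation of the BTD matrix, say B(u), is also rather
efficient (again on T);
$$B^e_{ij} T_j = \bar{u}_k \bar{u}_l T_j \int_e \frac{\partial \varphi_i}{\partial x_k} \frac{\partial \varphi_j}{\partial x_l}\, d\Omega = \bar{u}_k \bar{u}_l\, \tilde{C}^e_{ik} \tilde{C}^e_{jl} T_j/\Omega_e \ \text{(no sum on e)}, \tag{3.16-288}$$
and 'looks like' $\int (\mathbf{u} \cdot \nabla \varphi_i)(\mathbf{u} \cdot \nabla T)$. Adding (3.16-288) to (3.16-287) is the BTD
improvement over 'straight' FE.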
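For a single 2D bilinear quadrilateral, the element quantities in items 2 to 5 can be assembled from the centroid gradients alone. The sketch below is illustrative only: the names are hypothetical, the node ordering is assumed counterclockwise, unit diffusivity is assumed in K, and no $\Delta t/2$ factor is applied to B.

```python
import numpy as np

# Reference-square corner signs (xi_i, eta_i), counterclockwise node ordering.
XI = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])

def one_point_matrices(X, u_nodes):
    """One-point element matrices for a 2D bilinear quadrilateral.

    X       : (4, 2) nodal coordinates
    u_nodes : (4, 2) nodal velocities
    Returns (C1, K, N, B), cf. (3.16-285) through (3.16-288).
    """
    dN_dxi = XI / 4.0                        # reference gradients at centroid
    J = dN_dxi.T @ X                         # J[a, b] = d x_b / d xi_a
    area = 4.0 * np.linalg.det(J)            # exact for a bilinear quad in 2D
    dN_dx = np.linalg.solve(J, dN_dxi.T).T   # physical gradients at centroid
    C1 = -dN_dx * area                       # (3.16-285)
    u_bar = u_nodes.mean(axis=0)             # centroid (average) velocity
    w = C1 @ u_bar                           # w_j = u_bar_k * C1[j, k]
    K = C1 @ C1.T / area                     # (3.16-287)
    N = -0.25 * np.outer(np.ones(4), w)      # (3.16-286), phi_i(0) = 1/4
    B = np.outer(w, w) / area                # (3.16-288)
    return C1, K, N, B
```

On any such element, the one-point K annihilates both the constant vector and the alternating 'hourglass' vector (1, -1, 1, -1), which is exactly the diffusional deficiency taken up next in the text.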
These describe how a 'streamlined' code can be generated via one-point quadrature.
Now we turn to the other side of the ledger and admit to a diffusional deficiency. (The
advectional deficiency, both for LM and for one-point quadrature, has been thoroughly
discussed already—in Chapter 2.) Just as the simplification of forward Euler integration
on the hyperbolic terms required a truncation error 'correction' term, so too does the
one-point approximation to $\nabla^2$ terms require some correction terms, described in the
solid mechanics literature (see, for example, Hughes, 1987) as 'hourglass correction' or
'hourglass control,' since the Lagrangian-based elements can deform into zero-energy
modes that resemble an hourglass. In our Eulerian framework, of course, the element
shape stays constant, and our manifestation of these 'zero energy' modes shows up as
a singular K-matrix. The amount of work that has gone into patching this problem is
impressive, and nearly all of it comes from the solid mechanics community (see, for
example, Liu et al., 1985, and references therein), although there are exceptions (e.g.,
Mallet et al., 1992).
The singularity in the K-matrix, which actually is truly singular only for Neumann or
periodic BC's (Dirichlet BC's add a stabilizing influence, but not enough to sufficiently
suppress the deleterious effects), shows up as oscillatory null vectors (and near null
vectors, i.e., eigenvectors with small eigenvalues), which make wiggles when excited by
the 'data,' especially short wave data. Since advection cannot move these modes very
well [not at all for the null vectors ('$2\Delta x$'), which have zero phase speed], they must
be dealt with explicitly, by adding truncation error correction terms to the one-point
('under-integrated') K-matrix. But since we have probably already given too much space
to the explicit Euler plus 'fixes' method, relative to its importance in today's computing
environment, we merely state that both K and B(u) are augmented, at element level,
by a correction matrix that is proportional to the outer product of the corresponding
null vectors—and refer the reader to the original reference, Gresho et al. (1984c) for
details. Suffice it to say here that this final patch job does do a fair job of restoring a
better simulation of diffusion. One final remark on the hourglass correction that is not
discussed in Gresho et al. (1984c): the simplest technique, devised by Goudreau and
Hallquist (1982) on the basis of explicit knowledge of the form of the null vectors for
simply shaped elements (rectangles/bricks), has worked fairly well—even though many
solid mechanics codes employ the more rigorous but more expensive method of Flanagan
and Belytschko (1981). This technique is more appropriate for distorted (iso-P) elements,
for which the 'h-stabilization' of Goudreau and Hallquist is only approximately correct
(G. Goudreau, personal communication). The idea of this simpler stabilization scheme
is to render the K-matrix non-singular by adding a rank-one matrix to it for each null
vector; e.g., if x is a null vector (with eigenvalue $\lambda = 0$) of K, then the modified matrix
is $\hat{K} = K + xx^T$, which raises the rank of K by one; i.e., $\hat{K}x = Kx + xx^Tx = \hat{\lambda}x$ gives
$\hat{\lambda} = x^Tx \neq 0$ because $Kx = \lambda x = 0$.
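The rank-one argument is easy to check numerically. The following minimal sketch uses the one-point diffusion matrix of a unit-square bilinear element as the singular K; the helper name `stabilize` and the unscaled correction $xx^T$ are assumptions made for illustration (the text notes that in practice the correction is only proportional to this outer product).

```python
import numpy as np

def stabilize(K, null_vectors):
    """Return K plus one rank-one term x x^T for each supplied null vector x."""
    K_hat = np.array(K, dtype=float, copy=True)
    for x in null_vectors:
        K_hat += np.outer(x, x)
    return K_hat

# One-point diffusion matrix of a unit-square bilinear element: the centroid
# gradients are XI/2 and the area is 1, so K = XI XI^T / 4.
XI = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])
K = XI @ XI.T / 4.0
hourglass = np.array([1.0, -1.0, 1.0, -1.0])   # null vector of K
K_hat = stabilize(K, [hourglass])
```

Here `K @ hourglass` vanishes, while `K_hat @ hourglass` equals `(hourglass @ hourglass) * hourglass`, so the formerly null mode acquires the eigenvalue $x^Tx = 4$ and the rank rises from 2 to 3. The constant vector remains a null vector, as it should for a pure diffusion operator.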
Digression on Quadratic Elements:
Since the biquadratic (nine-node) element is still reasonably accurate for advection-dominated
flows when mass lumping is invoked (see the discussion and figures in
Section 2.6.3b), we believe that explicit integration using it may be a viable alternative
to $Q_1Q_0$, probably with linear pressure ($Q_2P_{-1}$). But it should probably be done with
fewer tricks; in particular, we would recommend: full quadrature, AB3 or RK4 rather
than FE + BTD, subcycling, and perhaps (we are not sure) some simplified advection
approximation that would be the nine-node analog of centroid advection (if possible), to
avoid the 'triple-product integrals.' In fact, even without considering the extra phase
speed accuracy of lumped quadratics over lumped linears, it can be shown, at least with respect to
the relative costs of both forming and using the matrix $C^T M_L^{-1} C$, that quadratics are more
cost-effective than linears (M. Engelman, personal communication).
End Digression
We conclude this section with a brief example showing how subcycling works.
Figures 3.16-10 through 3.16-14 are results from an early 2D LNG (Liquefied Natural Gas)
simulation (with $Q_1Q_0$) run three different ways and called (a),
(b), and (c) in the first four figures: (a) no subcycling, (b) subcycling with linear pressure
extrapolation, and (c) subcycling with constant pressure extrapolation (used to convince
us that linear extrapolation is worth the effort). Time is in seconds, horizontal (u) and
vertical (v) velocities are in meters/second, temperature is in °C, and concentration is
in volume fraction of LNG. The time histories shown are for a node near the surface
of the 'spill pond' (see Koopman et al., 1989, and Chan, 1992, for details of these and
other LNG safety study simulations). LNG 'injection' stops at t = 40 and is the cause
of the discontinuity in the velocity plots. Only the vertical velocity is noticeably affected
by subcycling. More remarkable yet is the vertical velocity in Figure 3.16-14 at a node
closer to the surface. The case with constant pressure extrapolation (a) shows extremely
variable velocity during subcycling as compared with that using linear extrapolation (b).
But the key (and perhaps slightly remarkable) point is that the projected
velocity at the end of each round of subcycling is nearly the same in both cases, and it
is also very close to the 'right' answer (that with no subcycling): nearly zero for t > 60.
The small (stability-limited) timestep was 0.2, and the typical subcycle ratio (S) was about
12. Clearly one should not generally 'believe' any velocities but those at the conclusion
of each cycle (the projected velocities).
Fig. 3.16-10 Horizontal velocity for three cases. (Three panels; horizontal axis: time, 0 to 100.)
Fig. 3.16-11 Vertical velocity for three cases. (Three panels; horizontal axis: time, 0 to 100.)
Fig. 3.16-12 Temperature for three cases. (Three panels; horizontal axis: time, 0 to 100.)
Fig. 3.16-13 Concentration for three cases. (Three panels; horizontal axis: time, 0 to 100.)
Fig. 3.16-14 Vertical velocity for two types of subcycling. (Vertical axis: v; horizontal axis: time, 10 to 100.)
Having exposed the reader to a bag of tricks applied to a particular element, we
conclude this section with the general advice that there are usually better ways to solve
the DAE's—some of which have yet to be presented. The mass-lumped explicit integration
method does 'work,' however, and is still sometimes useful—even though it looks too
much like an FDM, which is why we call it an 'isoparametric finite difference method.'
3.16.6 Semi-Implicit Projection Methods
a. Introduction
Having described fully implicit and fully 'explicit' methods (except of course for pressure),
we now turn to an increasingly popular class of 'compromise' methods—semi-implicit—in
which some of the advantages of each of the two 'pure' methods are realized and
their disadvantages reduced, if not eliminated. Goodrich and Soh (1989) said it best:
'The diffusion terms are treated implicitly to avoid the numerical stability restrictions
from the viscous terms, and the convection terms are lagged to avoid the computational
effort of solving a nonlinear system at each timestep.' While explicit-advection, implicit-
diffusion is simple to implement (and often effective) for the scalar transport equation, cf.
Section 2.7.5, the vector system of NS equations and the $\nabla \cdot \mathbf{u} = 0$ constraint preclude,
in significant ways, such simplicity. But effective methods have been devised, if not
always understood. Relevant here, from E and Liu (1995) is: 'The numerical phenomena
involved in the projection method are sufficiently complex that soft arguments can hardly
touch the heart of the matter; neither does a simple convergence theorem or crude error
estimates.'—and, from J. Shen (personal communication, 1996), after noting that a
projection method can be interpreted as a temporal discretization of a singularly perturbed
equation:
'Unlike the usual cases where the error of a fully discretized scheme is simply the sum
of the temporal discretization error and the spatial discretization error, the error of a full
discretization for singularly perturbed Navier-Stokes equations is much more involved,
since the a priori estimates for its solution may depend on the perturbation parameter
... On the other hand, this type of analysis is important, because it is not obvious,
without a thorough understanding of the error behavior, how to properly match the
temporal discretization parameter $\Delta t$ with the spatial discretization parameter $h$.'
Finally, some related remarks from the projection pioneer: 'In ending, the author would
like to make some comments on the preceding proofs. First of all, he would like to state
his belief that the value of a scheme such as (14) lies in its practical usefulness, not in
the possibility of a convergence proof. The value of the convergence proofs lies in the
fact that they contribute to the understanding of the numerical processes performed on the
computer'—Chorin (1969), wherein (14) referred to what we will later call 'projection 1'
for the simplest case—periodic BC's.
Herein we provide both some useful semi-implicit projection methods and further (but
not complete) understanding (via, unfortunately, somewhat 'soft arguments') regarding
how and why they work. In the semi-implicit projection methods that we have
in mind, which carry several aliases (fractional step, splitting, pressure correction,
predictor-corrector), unconditional stability of the viscous term is attained, as with implicit
methods; also, the decoupling of equations is attained, as with explicit methods—almost.
The cost of these gains is that in addition to solving a PPE at each timestep (a la explicit
methods), additional but symmetric linear systems, one per velocity component, must be
solved at each timestep. Another disadvantage is that, unlike the semi-implicit treatment
of the scalar transport equation in which (using forward Euler for advection) the implicit
treatment of BTD stabilized the scheme unconditionally, for the NS equations the
non-linearity of the (explicitly treated) advection terms limits us to a CFL stability limit
of roughly 1 to 10 (an experimental result). One final advantage is also realized, although
obtained in a somewhat ad hoc manner: the consistent mass matrix is retained even though
the PPE associated with the projection is done with the lumped mass matrix ($C^T M_L^{-1} C$).
An additional 'cost'—as will soon be seen—is the notable extra effort required to derive
and justify the methods! To further punctuate this last sentence, we quote again from E and
Liu (1995), 'It has been a mystery for twenty-five years that the projection method seems
to perform better than expected.' On balance, however, one or more of the semi-implicit
projection methods to be described below often provides a cost-effective compromise
between 'costly' implicit methods and 'unstable' explicit methods—although in the next
section we shall describe a competitive method that is both uncoupled and fully implicit.
Finally, we emphasize at the outset—and demonstrate later—that these projection methods
are intended only for 'time-accurate' simulations, not for 'quickly' time-marching to
steady-state solutions.
The plan for this section is the following: after deriving and discussing optimally
(or near optimally) accurate projection methods for the PDE's, which require special
(and awkward) BC's for the so-called 'intermediate velocity,' we will fall back to less
optimal but (probably) more cost-effective methods that use only the simplest BC's for the
intermediate velocity, discuss semi-discrete methods in which only the time is discretized
(in marked contrast to the DAE approach), and, finally, discuss fully discrete semi-implicit
projection methods, in which we show how to maintain the extra accuracy associated
with the consistent mass matrix. We mention now and try to explain later that all of
these projection methods are approximations to 'legitimate' methods for solving both the
PDE's and the resulting DAE's (if we may call them that)—a statement that would still be
true even if the (intermediate velocity) equations were solved fully implicitly because the
principal 'problem' with the projection method (implicit or semi-implicit, but not explicit)
is that the viscous term causes the occurrence of a numerical and completely spurious
boundary layer (wherever the tangential velocity is specified a la Dirichlet) of thickness
$O(\sqrt{\nu \Delta t})$, within which certain quantities (starting with the pressure) are generally in
error, and at their worst on $\Gamma_D$ [the pressure by $O(\sqrt{\nu \Delta t})$, and its normal gradient by
$O(1)$!]. This behavior, which is part of the 'mystique' of these methods, is still only
partially understood, as is the near-miraculous recovery (usually) to the 'right' answer
beyond $O(\sqrt{\nu \Delta t})$ from $\Gamma_D$.
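Before embarking on the approximation to a projection method for solving the NS
equations, let us briefly visit the real projection that is, after all, our goal; i.e., we shall
'view' the NS equations as a projection: since $\partial \mathbf{u}/\partial t + \nabla P = \nu \nabla^2 \mathbf{u} + \mathbf{g} - \mathbf{u} \cdot \nabla \mathbf{u} \equiv \mathbf{f}$ and
$\nabla \cdot \mathbf{u} = 0 \Rightarrow \nabla^2 P = \nabla \cdot \mathbf{f}$, we have (formally at least, and introducing $\Delta \equiv \nabla^2$),
$$P = \Delta^{-1} \nabla \cdot \mathbf{f}, \tag{3.16-289}$$
$$\nabla P = \nabla \Delta^{-1} \nabla \cdot \mathbf{f}, \tag{3.16-290}$$
and
$$\frac{\partial \mathbf{u}}{\partial t} = (I - \nabla \Delta^{-1} \nabla \cdot)\, \mathbf{f}, \tag{3.16-291}$$
which has introduced the projection operators (see also Appendix 3, which includes some
discussion of concomitant BC's that we mostly avoid here)
$$\mathcal{P} \equiv I - \nabla \Delta^{-1} \nabla \cdot \tag{3.16-292}$$
and
$$\mathcal{Q} \equiv I - \mathcal{P} = \nabla \Delta^{-1} \nabla \cdot \tag{3.16-293}$$
so that the NS equations become, simply,
$$\frac{\partial \mathbf{u}}{\partial t} = \mathcal{P}\mathbf{f}(\mathbf{u}) \tag{3.16-294}$$
$$\phantom{\frac{\partial \mathbf{u}}{\partial t}} = \mathbf{f}(\mathbf{u}) - \nabla P, \tag{3.16-295}$$
which shows that $\mathcal{P}$ 'strips off' the gradient part of $\mathbf{f}$ to reveal its divergence-free part, the
acceleration. Also, $\mathcal{Q}\mathbf{f} = \nabla P$. These orthogonal ($\mathcal{Q}\mathcal{P} = \mathcal{P}\mathcal{Q} = 0$) projection ($\mathcal{P}^2 = \mathcal{P}$, $\mathcal{Q}^2 = \mathcal{Q}$)
operators, which also come with BC's 'built-in,' display the following additional
properties: $\nabla \cdot \mathcal{P} = 0$ and $\nabla \times \mathcal{Q} = 0$, showing that $\mathcal{P}$ projects onto the null space of div,
and $\mathcal{Q}$ projects onto the null space of curl; finally, $\mathbf{f} = \mathcal{P}\mathbf{f} + \mathcal{Q}\mathbf{f}$ is the orthogonal
decomposition of the vector $\mathbf{f}$ into its divergence-free and curl-free components. Returning to
(3.16-294), it is clear that the time integral of $\mathcal{P}\mathbf{f}(\mathbf{u})$ is the desired velocity solution of
the NS equations, and this is the goal of the projection method: to strip off the gradient
part of $\mathbf{f}$, a process that is easier said than done. But the basic idea is easy: guess $\nabla P$,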
subtract it from f(u) and integrate the result for some length of time—the length of
which is proportional, in some sense, to the quality of your guess and the tolerance level
you place on the divergence (a perfect guess remains divergence-free, while an
imperfect guess generates spurious divergence)—and then project the result back down to the
divergence-free subspace. The Devilish details follow.
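The operator algebra above can be checked numerically in the one setting where the projection carries no boundary conditions at all: a doubly periodic box. The following is only a minimal sketch under that assumption (the grid size, the test field, and the helper names `divergence` and `project` are illustrative choices, not part of the original text); there the inverse Laplacian is a division by $-|\mathbf{k}|^2$ in Fourier space.

```python
import numpy as np

n = 32                                   # grid points per direction (assumed)
x = 2.0 * np.pi * np.arange(n) / n       # periodic box [0, 2*pi)^2
X, Y = np.meshgrid(x, x, indexing="ij")
k = np.fft.fftfreq(n, d=1.0 / n)         # integer wavenumbers
KX, KY = np.meshgrid(k, k, indexing="ij")
K2 = KX**2 + KY**2
K2_safe = np.where(K2 == 0.0, 1.0, K2)   # avoid 0/0 for the mean mode


def divergence(fx, fy):
    """Spectral divergence of the periodic vector field (fx, fy)."""
    dh = 1j * (KX * np.fft.fft2(fx) + KY * np.fft.fft2(fy))
    return np.fft.ifft2(dh).real


def project(fx, fy):
    """Return P f = f - grad(phi), where Laplacian(phi) = div f."""
    fxh, fyh = np.fft.fft2(fx), np.fft.fft2(fy)
    phi_h = -1j * (KX * fxh + KY * fyh) / K2_safe   # -k^2 phi_h = (div f)_h
    phi_h[0, 0] = 0.0                               # phi is defined up to a constant
    px = np.fft.ifft2(fxh - 1j * KX * phi_h).real
    py = np.fft.ifft2(fyh - 1j * KY * phi_h).real
    return px, py


# Hand-built test field: f = (divergence-free part) + (gradient part)
sol_x, sol_y = np.sin(Y), np.sin(X)                       # div = 0
grad_x = -np.sin(X) * np.cos(2.0 * Y)                     # grad of cos(x)cos(2y)
grad_y = -2.0 * np.cos(X) * np.sin(2.0 * Y)
fx, fy = sol_x + grad_x, sol_y + grad_y

px, py = project(fx, fy)          # P f: should recover the divergence-free part
qx, qy = fx - px, fy - py         # Q f = (I - P) f: should recover the gradient part
ppx, ppy = project(px, py)        # P(P f): should equal P f (idempotence)
inner = np.sum(px * qx + py * qy) # discrete L2 inner product of P f and Q f
```

In this sketch `px, py` reproduce the solenoidal part, `qx, qy` the gradient part, and `inner` vanishes to rounding error, which is the orthogonal decomposition $\mathbf{f} = \mathcal{P}\mathbf{f} + \mathcal{Q}\mathbf{f}$ in discrete form. Bounded domains, where the BC's are 'built in' to the operators, are not covered by it.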
b. Derivation of an 'optimal' projection method, simplifications thereto,
and analysis thereof
Before doing mathematics, we do English; i.e., we will introduce the projection method in
words, beginning with these: if we were somehow provided with the (God-given?) proper
pressure, $P(\mathbf{x}, t)$, we note that the NS equations represent nothing more than a coupled
system of vector AD equations (or Burgers' equations, as some may prefer to call them)
that, because of $\nabla P(\mathbf{x}, t)$, would remain divergence-free with no need to carry $\nabla\cdot\mathbf{u} = 0$
along as a constraint. Projection methods try to do the same thing, as follows:
Given an incompressible velocity field, say at t = 0 for convenience, that satisfies the
appropriate BC's, perform the following steps:
Step 1. Guess $\nabla P(\mathbf{x}, t)$ for $t \ge 0$; i.e., approximate it, somehow.
Step 2. Solve the momentum equations alone, with $\nabla P(\mathbf{x}, t)$ simply acting as another
given body force 'on the RHS,' up to what we shall call the projection time ($t = T$),
which could either be set a priori (somehow) or, better yet, be defined as
that time at which an appropriate norm of $\nabla\cdot\tilde{\mathbf{u}}(\mathbf{x}, t)$, where $\tilde{\mathbf{u}}$ is the intermediate
velocity (from the momentum equations), which does not remain divergence-free
because our guess at $\nabla P(\mathbf{x}, t)$ was imperfect, reaches some predetermined
maximum-acceptable value.
Step 3. Project the intermediate velocity, $\tilde{\mathbf{u}}(\mathbf{x}, T)$, to the nearest (in $L_2$) divergence-free
subspace and (boldly?) report it as the NS velocity; i.e., $\mathbf{u}(\mathbf{x}, T) = \mathcal{P}\tilde{\mathbf{u}}(\mathbf{x}, T)$.
This completes one projection cycle; reset the clock and go to Step 1.
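The three steps can be sketched in code for a deliberately easy setting. Everything specific here is an assumption made for illustration: a doubly periodic box (so no BC questions arise), Stokes flow with a steady body force, forward Euler time stepping, the crudest possible pressure guess ($\nabla P = 0$), and an RMS norm of the divergence with a hypothetical tolerance `tol` that triggers the projection.

```python
import numpy as np

n, nu, dt, n_steps, tol = 32, 0.05, 1.0e-3, 200, 1.0e-2   # all assumed values
x = 2.0 * np.pi * np.arange(n) / n
X, Y = np.meshgrid(x, x, indexing="ij")
k = np.fft.fftfreq(n, d=1.0 / n)
KX, KY = np.meshgrid(k, k, indexing="ij")
K2 = KX**2 + KY**2
K2_safe = np.where(K2 == 0.0, 1.0, K2)


def div_rms(uh, vh):
    """Root-mean-square of div(u) for a field given by its Fourier coefficients."""
    d = np.fft.ifft2(1j * (KX * uh + KY * vh)).real
    return np.sqrt(np.mean(d**2))


def project_hat(uh, vh):
    """Step 3: remove the gradient part, v = u_tilde - grad(phi)."""
    phi_h = -1j * (KX * uh + KY * vh) / K2_safe
    phi_h[0, 0] = 0.0
    return uh - 1j * KX * phi_h, vh - 1j * KY * phi_h


# Divergence-free initial velocity and a steady body force with div(g) != 0
uh, vh = np.fft.fft2(np.sin(Y)), np.fft.fft2(np.sin(X))
gxh = np.fft.fft2(np.sin(Y) + np.cos(X))
gyh = np.fft.fft2(np.sin(X))

cycle_lengths, steps_in_cycle = [], 0
for _ in range(n_steps):
    # Step 2: momentum equation alone, with the guessed grad P (Step 1) equal to zero
    uh = uh + dt * (-nu * K2 * uh + gxh)
    vh = vh + dt * (-nu * K2 * vh + gyh)
    steps_in_cycle += 1
    if div_rms(uh, vh) > tol:
        # Step 3: divergence has reached the tolerance, so project and reset the clock
        uh, vh = project_hat(uh, vh)
        cycle_lengths.append(steps_in_cycle)
        steps_in_cycle = 0

uh, vh = project_hat(uh, vh)      # final projection before reporting the velocity
final_div = div_rms(uh, vh)
```

Here `cycle_lengths` records how many time steps each cycle ran before the divergence norm reached `tol`; a better pressure guess would lengthen the cycles. None of the boundary-related questions raised next are touched by this periodic sketch.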
Immediate questions/issues are the following:
1. How do we guess the pressure (gradient)?
2. How do we select $T$? Or, what is a 'maximum acceptable value' of $\nabla\cdot\tilde{\mathbf{u}}(\mathbf{x}, T)$?
3. Are there better BC's for $\tilde{\mathbf{u}}$ than simply those given for $\mathbf{u}$?
4. What are the BC's associated with the projection? Is there a choice?
5. How do we know that the (projected) result is 'close' to the true NS solution?
These are hard questions. But clearly their answers are both necessary and important,
so we must address them. We can say this much for sure: the answers to questions 1 and
3 actually 'define' a particular projection method, and the answer to 2 is 'cut and try.'
For question 4, we again first quote E and Liu (1995): (i) 'There are still controversies
with regard to the optimal choice of boundary conditions at the projection step.' And,
(ii) 'The agonizing decision to be made is the BC for (2.7)'—the projection step; and,
there is a choice, they assert. We say no—no choice; only the proper (normal direction)
BC will retain a well-posed problem and a divergence-free projected velocity. More
controversy—c'est la projection. The answer to question 5 will come later.
Let us begin by deriving a particular projection method that we called 'projection 2' in
our first publication on the subject (Gresho and Chan, 1988) because it is intended to be
second-order accurate in some sense; it begins with a pressure guess that is time-invariant
and often assumes more regularity than is really necessary in practice:
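1. Given $\mathbf{u}_0$ that satisfies 'appropriate' BC's (details later) and $\nabla\cdot\mathbf{u}_0 = 0$, and given the
associated $P_0 = P(\mathbf{x}, 0)$, solve for $\tilde{\mathbf{u}}(\mathbf{x}, t)$ on $0 < t \le T$ from
$$\frac{\partial\tilde{\mathbf{u}}}{\partial t} = \nu\nabla^2\tilde{\mathbf{u}} + \mathbf{g} - \tilde{\mathbf{u}}\cdot\nabla\tilde{\mathbf{u}} - \nabla P_0
 = \nu\nabla^2\tilde{\mathbf{u}} + \mathbf{f}(\tilde{\mathbf{u}}) - \nabla P_0 \quad\text{in } \Omega \tag{3.16-296}$$
with
$$\tilde{\mathbf{u}} = \tilde{\mathbf{w}}(t) \quad\text{on } \Gamma_D \tag{3.16-297}$$
and
$$\nu\,\partial\tilde{\mathbf{u}}/\partial n = \mathbf{F}(t) + \mathbf{n}P_0 \quad\text{on } \Gamma_N, \tag{3.16-298}$$
where $\tilde{\mathbf{w}}$ is a to-be-determined 'intermediate' Dirichlet BC (recall that $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ is
the proper NS BC) and $\mathbf{F}(t)$ is the given 'traction' force. We lump the body force and
advection terms into $\mathbf{f}(\tilde{\mathbf{u}})$ because most of the 'interesting' behavior in projection methods
comes from the viscous term.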
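2. At $t = T$, project $\tilde{\mathbf{u}}$ to the divergence-free subspace as follows:
$$\mathbf{v}(\mathbf{x}, T) = \mathcal{P}\tilde{\mathbf{u}}(\mathbf{x}, T), \tag{3.16-299}$$
which is restated less formally as follows: solve for $\mathbf{v}$ and $\varphi$ from the additive
decomposition of $\tilde{\mathbf{u}}$ into a divergence-free vector field and a curl-free vector field:
$$\tilde{\mathbf{u}} = \mathbf{v} + \nabla\varphi \quad\text{and}\quad \nabla\cdot\mathbf{v} = 0 \quad\text{in } \Omega, \tag{3.16-300}$$
with
$$\mathbf{n}\cdot\mathbf{v} = \mathbf{n}\cdot\mathbf{w}(T) \text{ on } \Gamma_D, \qquad \varphi = \tilde{\varphi} \text{ on } \Gamma_N, \tag{3.16-301}$$
where $\tilde{\varphi}$ is, like $\tilde{\mathbf{w}}$, to be determined. This projection is, usually, realized as follows:
(i) Solve, for $\varphi$,
$$\nabla^2\varphi = \nabla\cdot\tilde{\mathbf{u}} \quad\text{in } \Omega, \tag{3.16-302}$$
with
$$\partial\varphi/\partial n = \mathbf{n}\cdot(\tilde{\mathbf{u}} - \mathbf{w}) = \mathbf{n}\cdot(\tilde{\mathbf{w}} - \mathbf{w}) \quad\text{on } \Gamma_D, \tag{3.16-303}$$
and
$$\varphi = \tilde{\varphi} \quad\text{on } \Gamma_N. \tag{3.16-304}$$
(ii) Compute
$$\mathbf{v} = \tilde{\mathbf{u}} - \nabla\varphi \quad\text{in } \Omega. \tag{3.16-305}$$
[Comparing (3.16-299) and (3.16-305) yields $\mathcal{P} = I - \operatorname{grad}(\nabla^2)^{-1}\operatorname{div}$, as before. See
also Appendix 3.]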
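3. Update $P(\mathbf{x}, t) = P(\mathbf{x}, T)$ via the 'usual' PPE:
$$\nabla^2 P = \nabla\cdot\mathbf{f}(\mathbf{v}) \quad\text{in } \Omega, \tag{3.16-306}$$
$$\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{v} + \mathbf{g} - \mathbf{v}\cdot\nabla\mathbf{v} - \partial\mathbf{v}/\partial t) \quad\text{on } \Gamma_D, \tag{3.16-307}$$
$$P = \nu\,\partial v_n/\partial n - \mathbf{n}\cdot\mathbf{F}(T) \quad\text{on } \Gamma_N. \tag{3.16-308}$$
4. $\mathbf{v}$ is called $\mathbf{u}_0$, and $P$ is called $P_0$, and the next projection cycle can begin.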
Remarks:
(1) It may be preferable to set $\mathbf{v} = \mathbf{w}$ on $\Gamma_D$ because, in general, (3.16-305) will produce
some slip there. More on this key issue later.
(2) Note that $P$ is generally not a continuous function of time in a projection method.
(3) Since $\nabla\times\nabla\varphi = 0$, we see from (3.16-305) that the projection operator preserves
the vorticity present in $\tilde{\mathbf{u}}$; the projection is a potential flow adjustment, at least until
the no-slip BC is re-introduced; see Remark (1).
o Boundary conditions. To obtain the 'proper' Dirichlet BC data for $\tilde{\mathbf{u}}$ and $\varphi$, we resort
to a Taylor series expansion, in time, of both $\tilde{\mathbf{u}}$ and $\mathbf{u}$ (the true NS velocity) about time
'zero':
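$$\mathbf{u}(t) = \mathbf{u}_0 + t\dot{\mathbf{u}}_0 + \frac{t^2}{2}\ddot{\mathbf{u}}_0 + O(t^3), \tag{3.16-309}$$
$$\tilde{\mathbf{u}}(t) = \tilde{\mathbf{u}}_0 + t\dot{\tilde{\mathbf{u}}}_0 + \frac{t^2}{2}\ddot{\tilde{\mathbf{u}}}_0 + O(t^3), \tag{3.16-310}$$
where $\tilde{\mathbf{u}}_0 = \mathbf{u}_0$. Now we invoke the PDE's for $\mathbf{u}$ and $\tilde{\mathbf{u}}$, which of course implies certain
smoothness assumptions, to obtain
$$\dot{\mathbf{u}}_0 = \dot{\tilde{\mathbf{u}}}_0 = \nu\nabla^2\mathbf{u}_0 + \mathbf{f}(\mathbf{u}_0) - \nabla P_0, \tag{3.16-311}$$
$$\ddot{\mathbf{u}}_0 = \frac{\partial}{\partial t}\left[\nu\nabla^2\mathbf{u} + \mathbf{f}(\mathbf{u}) - \nabla P\right]_{t=0} = \nu\nabla^2\dot{\mathbf{u}}_0 + \dot{\mathbf{f}}(\mathbf{u}_0) - \nabla\dot{P}_0, \tag{3.16-312}$$
$$\ddot{\tilde{\mathbf{u}}}_0 = \frac{\partial}{\partial t}\left[\nu\nabla^2\tilde{\mathbf{u}} + \mathbf{f}(\tilde{\mathbf{u}}) - \nabla P_0\right]_{t=0} = \nu\nabla^2\dot{\tilde{\mathbf{u}}}_0 + \dot{\mathbf{f}}(\tilde{\mathbf{u}}_0), \tag{3.16-313}$$
where $\dot{\mathbf{f}} = (\partial\mathbf{f}/\partial\mathbf{u})\cdot\dot{\mathbf{u}}$. Clearly $\dot{\mathbf{f}}(\mathbf{u}_0) = \dot{\mathbf{f}}(\tilde{\mathbf{u}}_0) = (\partial\mathbf{f}/\partial\mathbf{u})_0\cdot\dot{\mathbf{u}}_0$. Subtracting (3.16-309) from
(3.16-310) then gives
$$\tilde{\mathbf{u}}(t) = \mathbf{u}(t) + \frac{t^2}{2}\nabla\dot{P}_0 + O(t^3), \tag{3.16-314}$$
which we boldly 'push to the wall' and neglect HOT to find our BC for $\tilde{\mathbf{u}}$:
$$\tilde{\mathbf{w}} = \mathbf{w} + \frac{t^2}{2}\nabla\dot{P}_0\Big|_{\Gamma_D}. \tag{3.16-315}$$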
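Inserting (3.16-314) and (3.16-315) into the projection Poisson equation (3.16-302)
through (3.16-304), yields (to second-order)
$$\nabla^2\varphi = \frac{T^2}{2}\nabla^2\dot{P}_0 \quad\text{in } \Omega, \tag{3.16-316}$$
$$\frac{\partial\varphi}{\partial n} = \frac{T^2}{2}\,\mathbf{n}\cdot\nabla\dot{P}_0 \quad\text{on } \Gamma_D, \tag{3.16-317}$$
and
$$\varphi = \tilde{\varphi} \quad\text{on } \Gamma_N, \tag{3.16-318}$$
and the desired Dirichlet BC for $\varphi$ 'drops out'; i.e., if we take
$$\tilde{\varphi} = \frac{T^2}{2}\dot{P}_0, \tag{3.16-319}$$
it is clear that the 'solution' of BVP (3.16-316) through (3.16-318) is simply
$$\varphi = \frac{T^2}{2}\dot{P}_0, \tag{3.16-320}$$
at least through $O(T^3)$. The 'optimal' BC's, (3.16-315) and (3.16-320), are now known.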
o Simpler projection method. But how do we get $\dot{P}_0$? This is the question that causes us
to cheat even further, since it is either impossible or highly inconvenient to try to solve
for $\dot{P}_0$ by solving for the time-derivative of the PPE plus BC's. Thus, even though we
have found our 'second-order' BC's, we will in fact not use them. What we will use is
what we called 'simpler schemes' in Gresho (1990b); i.e., (i) just use the physical BC for
$\tilde{\mathbf{u}}$ on $\Gamma_D$:
$$\tilde{\mathbf{w}} = \mathbf{w} \quad\text{on } \Gamma_D; \tag{3.16-321}$$
(ii) assume that $\varphi \approx T^2\dot{P}_0/2$ still, approximate $\dot{P}_0$ by $(P(T) - P_0)/T$ and approximate
the normal component of (3.16-298) by $F_n(T) + P_0 \approx F_n(T) + P(T) \approx 0$, which neglects
the viscous contribution [the latter a usually valid approximation, especially for 'large'
Re; see point 3 following (3.8-38)] to obtain
$$\varphi = \tilde{\varphi} = -T[F_n(T) + P_0]/2 \quad\text{on } \Gamma_N. \tag{3.16-322}$$
Later we shall attempt to justify some of these approximations, which are clearly only
'reasonable' for $T$ 'sufficiently small.' For now, we simply use them and rewrite the entire
(simpler) projection 2 cycle as an algorithm: given a divergence-free $\mathbf{u}_0(\mathbf{x})$ and $P_0(\mathbf{x})$,
Step 1. Solve (3.16-296) through (3.16-298), and (3.16-321), for $0 < t \le T$ to get $\tilde{\mathbf{u}}(T)$.
Step 2. Solve (3.16-302) through (3.16-304) for $\varphi$, where (3.16-303) now reads $\partial\varphi/\partial n = 0$
on $\Gamma_D$, and $\tilde{\varphi}$ for (3.16-304) comes from (3.16-322).
Step 3. Compute the new pressure from
$$P(T) = P_0 + 2\varphi/T. \tag{3.16-323}$$
Step 4. Compute $\mathbf{v}(T)$ from (3.16-305). Report $\mathbf{v}$ and $P$ as the (alleged) NS velocity
and pressure.
Step 5. Reset the variables for the next projection cycle as follows: $t = 0$, $P_0 = P(T)$,
and $\mathbf{u}_0 = \mathbf{v}$ except on $\Gamma_D$, where it is set to $\mathbf{u}_0 = \mathbf{w}(T)$.
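One cycle of Steps 1 through 4 can be sketched for a case in which the exact pressure is known, so that the update (3.16-323) can be checked. The setting is an assumption chosen for convenience, not the general bounded-domain algorithm: a doubly periodic box (hence no $\Gamma_D$ or $\Gamma_N$), Stokes flow, forward Euler substeps, and a hypothetical body force $(1 + t)\,\mathbf{g}(\mathbf{x})$ whose gradient part is $\nabla\psi$, so that the true pressure is $P = (1 + t)\psi$.

```python
import numpy as np

n, nu, T, m = 32, 0.01, 0.01, 100        # grid, viscosity, projection time, substeps (assumed)
dt = T / m
x = 2.0 * np.pi * np.arange(n) / n
X, Y = np.meshgrid(x, x, indexing="ij")
k = np.fft.fftfreq(n, d=1.0 / n)
KX, KY = np.meshgrid(k, k, indexing="ij")
K2 = KX**2 + KY**2
K2_safe = np.where(K2 == 0.0, 1.0, K2)

psi = np.cos(X) * np.cos(2.0 * Y)                      # potential of the gradient part of g
gxh = np.fft.fft2(np.sin(Y) - np.sin(X) * np.cos(2.0 * Y))
gyh = np.fft.fft2(np.sin(X) - 2.0 * np.cos(X) * np.sin(2.0 * Y))


def exact_pressure(t):
    """Zero-mean solution of Laplacian(P) = div f for f = (1 + t) g."""
    return (1.0 + t) * psi


P0h = np.fft.fft2(exact_pressure(0.0))                 # proper initial pressure
uh = np.fft.fft2(0.5 * np.sin(Y))                      # divergence-free u0
vh = np.fft.fft2(0.5 * np.sin(X))

# Step 1: momentum equation with grad(P0) frozen as a body force
for j in range(m):
    amp = 1.0 + j * dt
    uh = uh + dt * (-nu * K2 * uh + amp * gxh - 1j * KX * P0h)
    vh = vh + dt * (-nu * K2 * vh + amp * gyh - 1j * KY * P0h)

# Step 2: Laplacian(phi) = div(u_tilde)
phi_h = -1j * (KX * uh + KY * vh) / K2_safe
phi_h[0, 0] = 0.0

# Step 3: pressure update (3.16-323)
P_new = np.fft.ifft2(P0h + 2.0 * phi_h / T).real

# Step 4: projected velocity (3.16-305)
vxh, vyh = uh - 1j * KX * phi_h, vh - 1j * KY * phi_h
div_v = np.fft.ifft2(1j * (KX * vxh + KY * vyh)).real

err_new = np.max(np.abs(P_new - exact_pressure(T)))    # error after the update
err_old = np.max(np.abs(exact_pressure(0.0) - exact_pressure(T)))  # error if P0 were kept
# Step 5 would set P0 <- P_new and u0 <- v before the next cycle.
```

In this sketch `err_new` comes out well below `err_old`, i.e., $2\varphi/T$ recovers most of the pressure change over the cycle; the remainder is dominated by the forward Euler substepping. The boundary-layer effects discussed in the Remarks cannot appear here because there is no wall.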
Remarks:
(1) At the true beginning of a simulation, the proper PPE and BC's must be solved to
get the proper initial pressure ($P_0$) that is induced by the initial (divergence-free)
velocity.
(2) The Neumann BC for the Lagrange multiplier, $\partial\varphi/\partial n = 0$ on $\Gamma_D$, implies, using the
approximation $\varphi = T(P(T) - P_0)/2$ that is an inherent part of the method/algorithm,
that $\partial P(T)/\partial n = \partial P_0/\partial n$ on $\Gamma_D$, rather than the correct inhomogeneous Neumann
BC from the correct PPE; i.e., the pressure gradient on $\Gamma_D$ never changes throughout
the simulation! While true, and seemingly stupid/wrong, it will be (largely)
'justified' later.
(3) The projected velocity, $\mathbf{v}(T)$, from (3.16-305) will, unfortunately, not satisfy the
no-slip BC; i.e., $\boldsymbol{\tau}\cdot\mathbf{v} \ne \boldsymbol{\tau}\cdot\mathbf{w}$ on $\Gamma_D$; the so-called overdetermined Neumann problem
is not satisfied [recall Remark (3) following (3.8-36)]. Rather, it will slip, with a
slip velocity, $s$, given by $s = \boldsymbol{\tau}\cdot\mathbf{v} - \boldsymbol{\tau}\cdot\mathbf{w} = \boldsymbol{\tau}\cdot(\tilde{\mathbf{u}}(T) - \nabla\varphi - \mathbf{w}) = -\boldsymbol{\tau}\cdot\nabla\varphi$. The
(necessary?) resetting of $\mathbf{u}_0$ to $\mathbf{w}(T)$ on $\Gamma_D$ [rather than to $\mathbf{u}_0 = \mathbf{v}(T)$] introduces
a vortex sheet, of strength $s$, on $\Gamma_D$, a phenomenon that we shall discuss in more
detail later. [Note that there is no jump in the normal velocity on $\Gamma_D$ (such a jump would of
course be illegal for an incompressible flow); i.e., $\mathbf{n}\cdot\mathbf{v}(T) = \mathbf{n}\cdot\mathbf{w}(T)$ comes from the
projection, by construction.]
(4) If $\Gamma_N = \emptyset$, then the resulting Neumann problem for $\varphi$ is 'consistent singular'; it
is singular because $\partial\varphi/\partial n = 0$ on $\Gamma$ leaves $\varphi$ = constant as a function in the null
space, and it is consistent because, from (3.16-302), we need
$\int_\Omega \nabla\cdot\tilde{\mathbf{u}} = \int_\Gamma \mathbf{n}\cdot\tilde{\mathbf{u}} = \int_\Gamma \mathbf{n}\cdot\mathbf{w} = 0$,
which is satisfied because $\mathbf{w}$ is properly constrained. $\varphi$ is then obtained
only up to an irrelevant additive constant.
(5) It turns out, as part of the 'mystique' associated with this projection method, that
the coefficient 2 in (3.16-323) can be replaced by $\gamma$ for $0 < \gamma \le 2$ and 'success' (of
some sort) still be achieved, a fact discovered by Gresho and Chan (1990) and
subsequently analyzed by Shen (1992, 1996), although $\gamma \to 0$ only 'works' if $\Delta t$ does
too. We recommend using either $\gamma = 2$ (somewhat more attractive, theoretically) or
$\gamma = 1$ (somewhat more robust, practically; $\gamma = 2$ sometimes makes wiggles, i.e., $2\Delta t$
oscillations in $P$).
(6) Whereas $\tilde{\mathbf{u}}$ must have sufficient regularity to reside in $H^1$, $\mathbf{v}$ can be rather rougher;
$\mathbf{v} \in H(\mathrm{div})$ is sufficient [J.-L. Guermond, personal communication; a function
$\mathbf{v} \in H(\mathrm{div}) \Rightarrow \mathbf{v} \in L_2$ and $\nabla\cdot\mathbf{v} \in L_2$].
(7) Additional BC discussions, in the framework of spectral element methods, are
presented in Karniadakis et al. (1993).
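Remark (4) can be made concrete with a small linear-algebra sketch. It assumes a one-dimensional, cell-centred finite-difference Laplacian with $\partial\varphi/\partial n = 0$ at both ends as a stand-in for the all-Neumann problem; the grid size, the right-hand sides, and the use of a least-squares solve to select the zero-mean solution are illustrative choices only.

```python
import numpy as np

n = 50                      # number of cells on [0, 1] (assumed)
h = 1.0 / n
xc = (np.arange(n) + 0.5) * h

# Cell-centred 1D Laplacian with d(phi)/dn = 0 at both ends
A = np.zeros((n, n))
for i in range(n):
    if i > 0:
        A[i, i - 1] += 1.0
        A[i, i] -= 1.0
    if i < n - 1:
        A[i, i + 1] += 1.0
        A[i, i] -= 1.0
A /= h**2

# Singular: constants lie in the null space
null_residual = np.max(np.abs(A @ np.ones(n)))

# Consistent: the right-hand side (standing in for div u_tilde) integrates to zero
rhs = np.cos(np.pi * xc)
phi, *_ = np.linalg.lstsq(A, rhs, rcond=1.0e-10)    # minimum-norm, zero-mean solution
res_consistent = np.max(np.abs(A @ phi - rhs))
res_shifted = np.max(np.abs(A @ (phi + 3.7) - rhs)) # adding a constant changes nothing
err_exact = np.max(np.abs(phi + np.cos(np.pi * xc) / np.pi**2))

# Inconsistent: a right-hand side with non-zero mean cannot be matched
bad_rhs = rhs + 1.0
bad_phi, *_ = np.linalg.lstsq(A, bad_rhs, rcond=1.0e-10)
res_inconsistent = np.max(np.abs(A @ bad_phi - bad_rhs))
```

With the compatible right-hand side the residual is at rounding level and any added constant is invisible; with the incompatible one the residual stays of order one, which is the discrete counterpart of the constraint on $\mathbf{w}$.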
o Oldest projection method. Thus it appears that the seemingly simplifying approximations
associated with the above 'projection method' are not without significant cost. (There is
just no free lunch when it comes to 'solving' accurately the incompressible NS equations.)
But before trying to justify the above technique, which we state now and demonstrate later
(and earlier, in Section 3.16.1d) that it really does 'work,' we digress briefly to go a few
years backward in time to discuss a simpler-yet projection method (the original one—by
Chorin, 1967a, 1968a,b, 1969) that also works but looks even more like it should not. [We
credit Chorin with the invention of the first semi-implicit projection method. A similar,
but explicit, projection method that was contemporaneous with that of Chorin is that of
Temam (1966, 1969a,b). Note though that we have chosen to classify this latter method
as simply explicit Euler (see Section 3.18); i.e., the original Temam et al. projection
method is simply forward Euler—although Temam did also study implicit and semi-
implicit projection methods; see Marion and Temam (1996) for relevant references.] Called
'projection 1' in Gresho (1990) and Gresho and Chan (1990), we present it simultaneously
in two forms, the first (a) corresponding to that using the 'optimal' BC's for $\tilde{\mathbf{u}}$ and $\varphi$
(to assure, more or less, a consistent first-order method) and the second (b) using the
simpler BC's. It is characterized, in its simplest form, by its rather simple guess for
the pressure while solving for the intermediate velocity: $P = 0$! Really; zero (not zero
factorial, although that would work, too). It reads: given a divergence-free velocity field,
$\mathbf{u}_0(\mathbf{x})$, and $P_0$ in the 'optimal' method, for each cycle, do:
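Step 1. Solve for $\tilde{\mathbf{u}}$ for $0 < t \le T$ from
$$\partial\tilde{\mathbf{u}}/\partial t = \nu\nabla^2\tilde{\mathbf{u}} + \mathbf{g} - \tilde{\mathbf{u}}\cdot\nabla\tilde{\mathbf{u}} = \mathbf{f}(\tilde{\mathbf{u}}) \quad\text{in } \Omega, \tag{3.16-324}$$
with either
$$\text{(a)}\quad \tilde{\mathbf{u}} = \tilde{\mathbf{w}} = \mathbf{w} + t\nabla P_0 \tag{3.16-325}$$
or
$$\text{(b)}\quad \tilde{\mathbf{u}} = \mathbf{w} \quad\text{on } \Gamma_D \tag{3.16-326}$$
and
$$\nu\,\partial\tilde{\mathbf{u}}/\partial n = \mathbf{F}(t) \quad\text{on } \Gamma_N. \tag{3.16-327}$$
Step 2. Perform the projection:
(i) Solve for $\varphi$ from
$$\nabla^2\varphi = \nabla\cdot\tilde{\mathbf{u}}(T) \quad\text{in } \Omega, \tag{3.16-328}$$
with either
$$\text{(a)}\quad \partial\varphi/\partial n = T\,\partial P_0/\partial n \tag{3.16-329}$$
or
$$\text{(b)}\quad \partial\varphi/\partial n = 0 \quad\text{on } \Gamma_D \tag{3.16-330}$$
and
$$\varphi = -TF_n(T) \quad\text{on } \Gamma_N. \tag{3.16-331}$$
(ii) Compute
$$\mathbf{v} = \tilde{\mathbf{u}} - \nabla\varphi \quad\text{in } \Omega. \tag{3.16-332}$$
Step 3. Compute the new pressure from either the expensive way,
$$\nabla^2 P = \nabla\cdot\mathbf{f}(\mathbf{v}) \quad\text{in } \Omega, \text{ with} \tag{3.16-333}$$
$$\partial P/\partial n = \mathbf{n}\cdot[\mathbf{f}(\mathbf{v}) - \partial\mathbf{v}/\partial t] \quad\text{on } \Gamma_D \text{ and} \tag{3.16-334}$$
$$P = \nu\,\partial v_n/\partial n - F_n(T) \quad\text{on } \Gamma_N, \tag{3.16-335}$$
or the 'cheap' way,
$$P = \varphi/T \quad\text{in } \Omega. \tag{3.16-336}$$
Step 4. Reset the variables, as for 'projection 2' above, for the next cycle.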
Remarks:
(1) The 'optimal' method (a) came from a similar Taylor series analysis to seek the
best BC's for $\tilde{\mathbf{u}}$ and $\varphi$ (an exercise we leave to the reader), for which the relationship
between $\varphi$ and $P$ turns out to be $\varphi = TP_0$, a result that was already used to obtain
(3.16-331) and (3.16-336).
(2) The simpler method (b) implies that $\partial P/\partial n = 0$ on $\Gamma_D$ for all time, an implication
that appears to scuttle the scheme, at least for 'boundary-driven' Stokes flow for
which the true pressure obeys $\nabla^2 P = 0$ in $\Omega$, with $\partial P/\partial n = \nu\,\partial^2 u_n/\partial n^2$ on $\Gamma$; i.e.,
it is only the non-zero value of $\partial P/\partial n$ on $\Gamma$ that generates a non-trivial $P$. We will
explain later why this scheme does work, even for Stokes flow.
(3) In his original projection method, Chorin combined the options as follows: he used
(3.16-336) to update the pressure, which he employed in (3.16-325) and (3.16-329).
(4) All of the boundary-produced vorticity that comes from the no-slip BC and the
tangential pressure gradient is injected rather roughly, as a vortex sheet upon
reducing the post-projection slip velocity to zero in preparation for the next cycle.
(In contrast to 'projection 2', no vorticity is introduced into the fluid by the
tangential pressure gradient during the intermediate velocity portion of the cycle because
$\partial P/\partial\tau$ is absent.)
(5) Clearly, $\tilde{\mathbf{u}}(t)$ will stray from the divergence-free subspace more quickly (much more
quickly if $\nabla P_0$ is large) than does the corresponding $\tilde{\mathbf{u}}(t)$ from projection 2. Thus,
we do not advocate this 'projection 1' method since 'projection 2' provides more
accuracy (at least in theory) for virtually no more effort.
(6) A steady state, if attained, is a function of $\Delta t$, not a desirable attribute; the pressure
gradient is then multiplied by $(I - \nu\Delta t\nabla^2)$.
o Justification, vorticity production. Before moving on to analyze and at least partially
justify the above projection methods, it may be wise to note that even though a 'projection
3' method was proposed (but not tested) in Gresho (1990b), by the 'obvious' extension
of projection 2 (guess the pressure as $P = P_0 + t\dot{P}_0$, etc.), it is not to be recommended
because it is actually unstable, a result determined theoretically by Shen (1993) and
experimentally (and earlier) by J. Schutt (personal communication).
To justify the simpler projection methods, we must first discuss the 'discovery'
announced in Gresho and Sani (1987) and further elucidated in Gresho (1990a, 1991a):
whereas the normal component of the momentum equation applies even on $\Gamma_D$ at
$t = 0$, and is thus used to set the Neumann BC for the PPE [this portion of the discovery
having been borrowed from Heywood (1980) and Heywood and Rannacher (1982)], the
tangential component does not (in general); rather, at least for small time and close to
$\Gamma_D$, the tangential component of $\nabla P_0$ acts like a given 'source' (body force) term with $P_0$
coming from the PPE plus Neumann BC at $t = 0$. The resulting vortex sheet (present in
general, because the no-slip BC is not required to be satisfied by the initial velocity field)
quickly diffuses from $\Gamma_D$ into $\Omega$ as the solution to a 'heat equation' with a step change at
the boundary (and a given source term near $\Gamma_D$, $\partial P/\partial\tau$); i.e., the 'action' occurs near $\Gamma_D$
in a boundary layer of thickness $\delta \approx \sqrt{\nu t}$. While this behavior usually occurs only once
(at start-up) for the true solution of the NS equations, it occurs once per projection cycle
when a projection approximation to the NS equations is utilized. What is happening is this:
the projection of $\tilde{\mathbf{u}}$ to $\mathbf{v}$ at the end of each cycle introduces a slip velocity on $\Gamma_D$ of size
$s = (-1/2)T^2\,\partial\dot{P}_0/\partial\tau$ for projection 2 ($s = -T\,\partial P_0/\partial\tau$ for projection 1) and a resulting
diffusional boundary layer of thickness $\sim\sqrt{\nu t}$, $0 < t \le T$, when $\tilde{u}_\tau$ is (necessarily, in this
'simple' approximation) forced to satisfy the no-slip BC during the next cycle. The effect
of these processes is to cause a significant 'loss of regularity' within the layer $\delta$, with one
result being that the Taylor series analyses presented above are probably not valid within
this 'projection' boundary layer.
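The $\sqrt{\nu t}$ scaling of this layer can be illustrated with the classical solution for a suddenly imposed no-slip condition on a flat wall (a one-dimensional 'heat equation' with a step change at the boundary). The sketch below assumes that idealization: constant $\nu$, a flat wall, no tangential pressure-gradient source, and a hypothetical 1% threshold for defining the edge of the layer; the numerical values of `nu` and `T` are arbitrary.

```python
import math


def defect(y, t, nu, s=1.0):
    """Part of the slip velocity s removed at distance y from the wall, time t after
    the no-slip condition is imposed: s * erfc(y / (2 * sqrt(nu * t)))."""
    return s * math.erfc(y / (2.0 * math.sqrt(nu * t)))


def layer_thickness(t, nu, fraction=0.01):
    """Distance from the wall at which the defect has dropped to `fraction` of s."""
    lo, hi = 0.0, 10.0 * math.sqrt(nu * t)
    for _ in range(80):                      # bisection; the defect decreases with y
        mid = 0.5 * (lo + hi)
        if defect(mid, t, nu) > fraction:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


nu, T = 1.0e-3, 1.0e-2                       # assumed viscosity and projection time
delta_T = layer_thickness(T, nu)             # layer reached by the end of one cycle
ratio = delta_T / math.sqrt(nu * T)          # thickness in units of sqrt(nu * T)
growth = layer_thickness(4.0 * T, nu) / delta_T
```

Under these assumptions the 1% thickness is a fixed multiple (between 3 and 4) of $\sqrt{\nu T}$, and quadrupling the cycle length doubles it, so shortening the projection time thins the polluted layer only as the square root of $T$.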
The whole concept of vorticity 'production' at a no-slip boundary is subtle, difficult, and
perhaps not even yet fully understood. For some 'classical' discussions on the subject, see,
for example, Batchelor (1967) or Panton (1984, 1996). For some fairly recent controversy
on the subject, see Gresho (1992) and the references therein—and also E and Liu (1996).
For a modern viewpoint with additional in-depth analysis, see Wu and Wu (1993, 1995,
1996) and Wu (1995). We assert that one must deal with this issue, successfully or not,
when designing, discussing, and (especially) analyzing projection methods. We start with
two questions:
1. Must we re-set the slip velocity, computed via $s = -\boldsymbol{\tau}\cdot\nabla\varphi$, to zero to start the next
projection cycle? We presumed yes in our stated algorithm via Remark (3) following
(3.16-323), but others seem to obtain good results by not doing so; just setting $\tilde{\mathbf{u}} = \mathbf{w}$ at
the start of the next cycle seems to be good enough. (More on this later, at the end of
this section.)
2. How is vorticity production at the wall related to the choice of BC?
While we have no 'final' answers, we currently believe that the answer to Question 1
is 'no,' and we try next to answer 2, somewhat heuristically.
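Starting in 2D for simplicity, the (scalar) vorticity on $\Gamma$ is defined by
$$\omega = \partial u_\tau/\partial n - \partial u_n/\partial\tau \tag{3.16-337}$$
and its flux into $\Omega$ at $\Gamma$ (whatever that means) by
$$Q = -\nu\,\partial\omega/\partial n. \tag{3.16-338}$$
From $\nabla\cdot\mathbf{u} = 0 = \partial u_n/\partial n + \partial u_\tau/\partial\tau$ and (3.16-337) follows
$$Q = -\nu\nabla^2 u_\tau;$$
the flux of $\omega$ is related to the tangential viscous term, which at least suggests the attempted
employment of the tangential momentum equation on $\Gamma_D$, in the following way:
$$Q = -\nu\nabla^2 u_\tau = \boldsymbol{\tau}\cdot(\mathbf{g} - D\mathbf{u}/Dt - \nabla P); \tag{3.16-339}$$
vorticity flux is 'caused by' tangential body forces, tangential acceleration, and (last but
not least) tangential pressure gradients. Since it is the latter 'source' term that is of
interest herein, we abbreviate $\boldsymbol{\tau}\cdot(\mathbf{g} - D\mathbf{u}/Dt)$ by $f$ to obtain, integrating in time over
one projection cycle,
$$\begin{aligned}
\int_0^T Q\,dt &= \int_0^T f\,dt - \int_0^T (\partial P/\partial\tau)\,dt \\
 &= \int_0^T f\,dt - \int_0^T \left[\frac{\partial P_0}{\partial\tau} + t\frac{\partial\dot{P}_0}{\partial\tau} + \frac{t^2}{2}\frac{\partial\ddot{P}_0}{\partial\tau} + \cdots\right] dt \\
 &= \int_0^T f\,dt - \left(T\frac{\partial P_0}{\partial\tau} + \frac{T^2}{2!}\frac{\partial\dot{P}_0}{\partial\tau} + \frac{T^3}{3!}\frac{\partial\ddot{P}_0}{\partial\tau} + \cdots\right).
\end{aligned} \tag{3.16-340}$$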
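The step from (3.16-337) and (3.16-338) to the unnumbered flux relation can be spelled out as follows. This is a sketch that assumes a locally flat boundary, so that the $n$ and $\tau$ derivatives commute and no curvature terms appear; the second line is simply the term-by-term time integral of the Taylor series of $\partial P/\partial\tau$ about $t = 0$.

```latex
\begin{aligned}
\frac{\partial\omega}{\partial n}
  &= \frac{\partial^2 u_\tau}{\partial n^2}
     - \frac{\partial}{\partial\tau}\!\left(\frac{\partial u_n}{\partial n}\right)
   = \frac{\partial^2 u_\tau}{\partial n^2} + \frac{\partial^2 u_\tau}{\partial\tau^2}
   = \nabla^2 u_\tau
  \quad\Longrightarrow\quad
  Q = -\nu\frac{\partial\omega}{\partial n} = -\nu\nabla^2 u_\tau, \\[4pt]
\int_0^T \frac{\partial P}{\partial\tau}\,dt
  &= \int_0^T \sum_{m\ge 0} \frac{t^m}{m!}\,
       \frac{\partial}{\partial\tau}\!\left(\frac{\partial^m P}{\partial t^m}\right)_{\!0} dt
   = \sum_{m\ge 0} \frac{T^{m+1}}{(m+1)!}\,
       \frac{\partial}{\partial\tau}\!\left(\frac{\partial^m P}{\partial t^m}\right)_{\!0}.
\end{aligned}
```

The first line uses continuity in the form $\partial u_n/\partial n = -\partial u_\tau/\partial\tau$; the $m = 0$ and $m = 1$ terms of the second line are the $T\,\partial P_0/\partial\tau$ and $(T^2/2)\,\partial\dot{P}_0/\partial\tau$ contributions.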
Recalling now the slip velocities associated with the projections via the 'simpler' BC of
$\tilde{\mathbf{u}} = \mathbf{w}$ on $\Gamma_D$, $s = -T\,\partial P_0/\partial\tau$ for projection 1 and $s = -(T^2/2)\,\partial\dot{P}_0/\partial\tau$ for projection
2, we see that the end-of-cycle vortex sheet that results from resetting the tangential
velocity from $\boldsymbol{\tau}\cdot(\mathbf{w} - \nabla\varphi)$ to zero corresponds to the appropriate term in the Taylor
series expansion of $\partial P/\partial\tau$. And this result can also be used to ascertain that the 'better'
BC's of (3.16-325) for projection 1 and (3.16-315) for projection 2 'merely' increase by
one power of $T$ ($\Delta t$ in the real case) the accuracy of vorticity injection. The important
point is that the reduction of the slippery post-projection velocity to the no-slip value is a
clear and important part of the projection method; what is not as clear is whether no-slip
needs to be applied to both $\tilde{\mathbf{u}}$ and $\mathbf{u}$, or whether application to just $\tilde{\mathbf{u}}$ is sufficient. (This
latter seems to be the case ... in which case our previous use of the word 'necessary'
was unnecessary.)
In 3D, the situation is similar, just more complicated for a general, curved boundary.
The starting point is the (no body force) curl form of the momentum equation (3.3-6),
whose tangential components can be written as
$$\nu\,\mathbf{n}\times(\nabla\times\boldsymbol{\omega})\times\mathbf{n} = -\mathbf{n}\times\left(\frac{D\mathbf{u}}{Dt} + \nabla P\right)\times\mathbf{n}, \tag{3.16-341}$$
where $\boldsymbol{\omega} = \nabla\times\mathbf{u}$. Combining this with the equation for the normal flux of tangential
vorticity,
$$\mathbf{Q} = -\nu\,\mathbf{n}\times\partial\boldsymbol{\omega}/\partial n, \tag{3.16-342}$$
yields, with the help of J.-Z. Wu (personal communication, 1995), finally,
$$\mathbf{Q} = \mathbf{n}\times\left(\frac{D\mathbf{u}}{Dt} + \nabla P\right)\times\mathbf{n} + \nu\,\mathbf{n}\times(\nabla_2\mathbf{n})\cdot\boldsymbol{\omega}, \tag{3.16-343}$$
where $\nabla_2$ is the (2D) surface gradient operator. While surface curvature ($\nabla_2\mathbf{n} \ne 0$) causes
additional flux of vorticity, the tangential pressure gradient plays a role similar to that in
2D, and that is the major point.
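o Further analysis, justification. With this new proviso and new understanding [i.e., be
careful about assuming too much 'smoothness' within $\delta(t)$], we now take the next (and
last) step toward justifying projection methods, which begins by considering the
hypothetical but do-able (in principle) projection cycle in which a continuous projection for
all $t$ between 0 and $T$ is performed, for reasons soon to become clear (hopefully). Thus,
we now consider both $\mathbf{v}$ and $\varphi$ to also be continuous functions of time and consider the
following cycle, $0 \le t \le T$:
Given $P_0(\mathbf{x})$ and $\mathbf{u}_0$ with $\nabla\cdot\mathbf{u}_0 = 0$,
Step 1. Solve for $\tilde{\mathbf{u}}$, with $\tilde{\mathbf{u}}_0 = \mathbf{u}_0$ at $t = 0$, from
$$\begin{aligned}
\partial\tilde{\mathbf{u}}/\partial t &= \nu\nabla^2\tilde{\mathbf{u}} + \mathbf{f}(\tilde{\mathbf{u}}) - \nabla P_0 \quad\text{in } \Omega, \\
\tilde{\mathbf{u}} &= \mathbf{w}(t) \quad\text{on } \Gamma_D, \text{ and} \\
\nu\,\partial\tilde{\mathbf{u}}/\partial n &= \mathbf{F}(t) + \mathbf{n}P_0 \quad\text{on } \Gamma_N.
\end{aligned} \tag{3.16-344}$$
Step 2. Perform the continuous projection; i.e.,
(a) Solve
$$\begin{aligned}
\nabla^2\varphi &= \nabla\cdot\tilde{\mathbf{u}}(t) \quad\text{in } \Omega, \\
\partial\varphi/\partial n &= 0 \quad\text{on } \Gamma_D, \\
\varphi &= -[F_n(t) + P_0]\,t/2 \quad\text{on } \Gamma_N.
\end{aligned} \tag{3.16-345}$$
(b) Compute
$$\mathbf{v}(t) = \tilde{\mathbf{u}}(t) - \nabla\varphi(t) \quad\text{in } \Omega. \tag{3.16-346}$$
Step 3. Compute the pressure from
$$P(t) = P_0 + 2\varphi/t. \tag{3.16-347}$$
Step 4. At $t = T$, set $\mathbf{u}_0 = \mathbf{v}(T)$ in $\Omega$ and on $\Gamma_N$, set $\mathbf{u}_0 = \mathbf{w}(T)$ on $\Gamma_D$, set $P_0 = P(T)$
in $\Omega$, set $t = 0$ (reset the clock), and go to Step 1.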
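The reason for considering a continuous projection is so that we can 'analyze' (partially)
the results via Taylor series expansions, which we will do soon. But to begin the analysis,
we first insert $\tilde{\mathbf{u}} = \mathbf{v} + \nabla\varphi$ into the intermediate velocity PDE, (3.16-344), to obtain
$$(\partial/\partial t - \nu\nabla^2)(\mathbf{v} + \nabla\varphi) = \mathbf{f}(\mathbf{v} + \nabla\varphi) - \nabla P_0 \quad\text{in } \Omega, \tag{3.16-348}$$
with BC's
$$\mathbf{v} + \nabla\varphi = \mathbf{w} \quad\text{on } \Gamma_D \tag{3.16-349}$$
and
$$\nu\frac{\partial}{\partial n}(\mathbf{v} + \nabla\varphi) = \mathbf{F}(t) + \mathbf{n}P_0 \quad\text{on } \Gamma_N, \tag{3.16-350}$$
which, when augmented by the IC $\mathbf{v} + \nabla\varphi = \mathbf{u}_0$ at $t = 0$, is a well-posed problem for
the linear combination of the two vector fields, $\mathbf{v}$ and $\nabla\varphi$. Rearrangement of (3.16-348)
identifies it as what we shall refer to as a modified/perturbed momentum equation:
$$\frac{\partial\mathbf{v}}{\partial t} + \nabla\left[P_0 + \left(\frac{\partial}{\partial t} - \nu\nabla^2\right)\varphi\right] = \nu\nabla^2\mathbf{v} + \mathbf{f}(\tilde{\mathbf{u}}), \tag{3.16-351}$$
where we have reinstated $\tilde{\mathbf{u}}$ in $\mathbf{f}$, for 'convenience.' Note that to the extent that
$P_0 + (\partial/\partial t - \nu\nabla^2)\varphi$ 'looks like' $P$, (3.16-351) 'looks like' the true momentum equation (at
least if $\nabla\varphi$ is small compared with $\mathbf{v}$ in $\mathbf{f}(\tilde{\mathbf{u}})$ or, simpler yet, for Stokes flow). In fact, let
us now subtract the true NS momentum equation from the $\mathbf{v}$ equation above to obtain
$$\frac{\partial}{\partial t}(\mathbf{v} - \mathbf{u}) + \nabla\left[(P_0 - P) + \left(\frac{\partial}{\partial t} - \nu\nabla^2\right)\varphi\right] = \nu\nabla^2(\mathbf{v} - \mathbf{u}) + \mathbf{f}(\tilde{\mathbf{u}}) - \mathbf{f}(\mathbf{u}), \tag{3.16-352}$$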
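which we analyze as follows, with slightly circuitous, but reasonably rigorous, logic,
starting with the assumption [cf. (3.16-319)] that $\varphi(t) = t^2\dot{P}_0/2 + O(t^3)$:
1. From (3.16-314) and $\mathbf{v}(t) = \tilde{\mathbf{u}}(t) - \nabla\varphi(t)$, we see that
$$\mathbf{v} - \mathbf{u} = \left[\mathbf{u}(t) + \frac{t^2}{2}\nabla\dot{P}_0 + O(t^3)\right] - \frac{t^2}{2}\nabla\dot{P}_0 - \mathbf{u}(t) = O(t^3), \tag{3.16-353}$$
and thus $\partial(\mathbf{v} - \mathbf{u})/\partial t = O(t^2)$.
2.
$$P_0 - P = P_0 - \left(P_0 + t\dot{P}_0 + \frac{t^2}{2}\ddot{P}_0 + \cdots\right) = -t\dot{P}_0 + O(t^2). \tag{3.16-354}$$
3.
$$\mathbf{f}(\tilde{\mathbf{u}}) - \mathbf{f}(\mathbf{u}) = \mathbf{f}\left[\mathbf{u} + \frac{t^2}{2}\nabla\dot{P}_0 + O(t^3)\right] - \mathbf{f}(\mathbf{u})
 = \frac{t^2}{2}\frac{\partial\mathbf{f}}{\partial\mathbf{u}}\cdot\nabla\dot{P}_0 + O(t^3) = O(t^2). \tag{3.16-355}$$
Inserting these asymptotic results into (3.16-352) gives
$\nabla[(\partial/\partial t - \nu\nabla^2)\varphi - t\dot{P}_0] = O(t^2)$, which 'integrates' to
$(\partial/\partial t - \nu\nabla^2)\varphi - t\dot{P}_0 = \text{constant (in space)} + O(t^2)$. Taking
the constant to be zero leads to the ostensible 'governing' PDE for $\varphi$, at least for small
time:
$$\left(\frac{\partial}{\partial t} - \nu\nabla^2\right)\varphi = t\dot{P}_0 \quad\text{in } \Omega, \tag{3.16-356}$$
with BC's
$$\partial\varphi/\partial n = 0 \quad\text{on } \Gamma_D \tag{3.16-357}$$
and
$$\varphi = -[F_n(t) + P_0]\,t/2 \quad\text{on } \Gamma_N, \tag{3.16-358}$$
and IC
$$\varphi = 0 \quad\text{in } \Omega, \tag{3.16-359}$$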
a parabolic PDE that supplements the 'real' (and elliptic) one, $\nabla^2\varphi = \nabla\cdot\tilde{\mathbf{u}}$; i.e., the $\varphi$
that comes from the elliptic equation also, to the order stated, satisfies the transient heat
equation. Considering now that $\varphi = 0$ at $t = 0$, and that we are really only interested in the
solution to this transient heat equation outside of the spurious projection boundary layer,
suggests that the 'outer' solution of (3.16-356) through (3.16-359) in the neighborhood
of $\Gamma_D$ (not $\Gamma_N$) can be obtained by neglecting the diffusion term to give
$$\varphi(\mathbf{x}, t) \approx \frac{t^2}{2}\dot{P}_0(\mathbf{x}), \tag{3.16-360}$$
at least up to $O(t^3)$ and outside of $\delta = O(\sqrt{\nu t})$, which we happily note is consistent with
the assumption made at the beginning of the analysis.
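The claimed 'outer' behavior can be probed with a one-dimensional model of (3.16-356), (3.16-357), and (3.16-359). The sketch below is built on assumptions that are not in the original: a wall at $x = 0$ playing the role of $\Gamma_D$, a hypothetical $\dot{P}_0 = 1 + x$ (linear, so that the outer solution satisfies the PDE exactly and only the wall BC is violated), the outer solution imposed as a Dirichlet condition at the far end in place of $\Gamma_N$, and a backward Euler finite-difference scheme with the source evaluated at mid-step.

```python
import numpy as np

nu, T, n_steps = 1.0e-3, 0.1, 100        # assumed viscosity, cycle length, time steps
L, n = 0.5, 500                          # wall at x = 0, far boundary at x = L
dx, dt = L / n, T / n_steps
x = np.linspace(0.0, L, n + 1)
Pdot0 = 1.0 + x                          # hypothetical dP/dt at t = 0 (non-zero normal gradient)

# Backward Euler matrix for (d/dt - nu d2/dx2) phi = t * Pdot0
lam = nu * dt / dx**2
A = np.zeros((n + 1, n + 1))
for i in range(1, n):
    A[i, i - 1], A[i, i], A[i, i + 1] = -lam, 1.0 + 2.0 * lam, -lam
A[0, 0], A[0, 1] = 1.0 + 2.0 * lam, -2.0 * lam     # d(phi)/dn = 0 at the wall (ghost node)
A[n, n] = 1.0                                       # Dirichlet value far from the wall
A_inv = np.linalg.inv(A)

phi = np.zeros(n + 1)                               # IC: phi = 0
for step in range(n_steps):
    t_mid = (step + 0.5) * dt
    t_new = (step + 1) * dt
    rhs = phi + dt * t_mid * Pdot0
    rhs[n] = 0.5 * t_new**2 * Pdot0[n]              # outer solution imposed at x = L
    phi = A_inv @ rhs

outer = 0.5 * T**2 * Pdot0                          # outer solution, t^2 * Pdot0 / 2
err = np.abs(phi - outer)
delta = np.sqrt(nu * T)
err_wall = err[0]                                   # deviation on the wall
err_far = err[x > 10.0 * delta].max()               # deviation well outside the layer
rel_wall = err_wall / outer[0]
```

In this model the deviation from $t^2\dot{P}_0/2$ is confined to a few multiples of $\sqrt{\nu T}$ from the wall, where it is a fraction of a percent of $\varphi$ (comparable to $\sqrt{\nu T}$ in these units), and it is at rounding level farther out. This supports, but of course does not prove, the assertion for the full problem.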
Remark:
We realize that the stated form of $\varphi$ is not correct near $\Gamma_N$, but we are not really interested
in the solution there since it is near $\Gamma_D$ where the problems lie.
Noting now that $\varphi = t^2\dot{P}_0/2$ is also the solution [through $O(t^3)$] that was obtained
in the 'hypothetical' case of 'optimal' BC's, we now believe and assert that the simple
BC's only cause (portions of) our projected solution to be spurious within $O(\delta)$ of the
Dirichlet boundary and that the actual use of optimal BC's would 'merely' mean that
(3.16-360) would apply all the way to the wall. [The pollution from projection would
then occur in the $O(t^3)$ terms; but it would occur.] We note, finally, that the pressure
update, $P = P_0 + 2\varphi/t = P_0 + t\dot{P}_0$, is consistent with the above solution.
Then, to the extent that (3.16-360) does apply, the modified momentum
equation (3.16-351) becomes
$$\frac{\partial\mathbf{v}}{\partial t} + \nabla(P_0 + t\dot{P}_0) = \nu\nabla^2\mathbf{v} + \mathbf{f}(\tilde{\mathbf{u}}) + O(t^2), \tag{3.16-361}$$
which now looks a lot more like the NS momentum equation.
o Special cases. To help appreciate the importance and extra difficulty caused by no-slip
walls, and to see a really great use of the projection method, consider the following two
special classes of problems for Stokes flow:
1. Periodic domain, periodic forcing function, periodic BC's.
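2. Unbounded domain, IC of compact support, and body force that goes to zero as $\mathbf{x} \to \infty$.
These can be described by the following IBVP or IVP:
$$\partial\mathbf{u}/\partial t + \nabla P = \nu\nabla^2\mathbf{u} + \mathbf{f}(\mathbf{x}, t) \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega, \tag{3.16-362}$$
with divergence-free initial velocity ($\mathbf{u}_0$), which is periodic for Class 1 and with $\mathbf{u}_0(\mathbf{x}) \to 0$
for $\mathbf{x} \to \infty$ for Class 2, periodic BC's for Class 1, and BC of $\mathbf{u} \to 0$ for $\mathbf{x} \to \infty$ for
Class 2. Derived from (3.16-362) are the PPE,
$$\nabla^2 P = \nabla\cdot\mathbf{f} \quad\text{in } \Omega, \tag{3.16-363}$$
and the VTE
$$\partial\boldsymbol{\omega}/\partial t = \nu\nabla^2\boldsymbol{\omega} + \nabla\times\mathbf{f}. \tag{3.16-364}$$
The BC's for (3.16-363) are (i) periodic or (ii) $P \to 0$ as $\mathbf{x} \to \infty$, and those for $\boldsymbol{\omega}$ are
the 'same.'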
The first observation is that the pressure 'stands alone' in that we do not need to know
the velocity field to solve (3.16-363). This leads to the following 'idea': omit $\nabla P$ from
(3.16-362), call the resulting (and generally non-solenoidal) velocity field $\tilde{\mathbf{u}}$, and consider
finding $\tilde{\mathbf{u}}$ and later projecting it ...; i.e., solve
$$\partial\tilde{\mathbf{u}}/\partial t = \nu\nabla^2\tilde{\mathbf{u}} + \mathbf{f} \tag{3.16-365}$$
with $\tilde{\mathbf{u}}_0 = \mathbf{u}_0$ and the same BC's as for $\mathbf{u}$. Clearly this vector 'heat' equation (with
uncoupled components, yet) is much easier to solve than (3.16-362). The curl of it yields
$$\partial\tilde{\boldsymbol{\omega}}/\partial t = \nu\nabla^2\tilde{\boldsymbol{\omega}} + \nabla\times\mathbf{f}, \tag{3.16-366}$$
with $\tilde{\boldsymbol{\omega}}_0 = \boldsymbol{\omega}_0$ and the same BC's as on $\boldsymbol{\omega}$; note that both $\tilde{\boldsymbol{\omega}}$ and $\boldsymbol{\omega}$ are necessarily
divergence-free.
Our next observation is that $\tilde{\boldsymbol{\omega}} = \boldsymbol{\omega}$ even though only the latter is derived from a
divergence-free vector field, suggesting that the two velocities could only differ by the
gradient of a scalar.
Next, suppose we have solved (3.16-362) and (3.16-365) from $t = 0$ to $t = t_F$ (not
necessarily small) and we ask: How would the projection of $\tilde{\mathbf{u}}$ to the divergence-free
subspace compare with $\mathbf{u}$? We obtain the answer by 'construction': compute $\mathbf{v}$ and $\varphi$
from
$$\tilde{\mathbf{u}} = \mathbf{v} + \nabla\varphi \quad\text{and}\quad \nabla\cdot\mathbf{v} = 0 \tag{3.16-367}$$
with either periodic (Class 1) or no BC's (Class 2; i.e., $\mathbf{v} \to 0$ for $\mathbf{x} \to \infty$). This of course
involves first solving the Poisson equation,
$$\nabla^2\varphi = \nabla\cdot\tilde{\mathbf{u}}, \tag{3.16-368}$$
and then computing
$$\mathbf{v} = \tilde{\mathbf{u}} - \nabla\varphi.$$
To answer the above question, we invoke the following (Helmholtz) theorem: a vector
field is uniquely defined if and only if both its curl and divergence are known, as well as
its value at one space point. Thus, since $\mathbf{v}$ and $\mathbf{u}$ have both the same curl ($\tilde{\boldsymbol{\omega}} = \boldsymbol{\omega}$) and
the same divergence (zero), it follows that $\mathbf{v} = \mathbf{u}$ and we are done; for the special classes
of problems defined above, the transient Stokes equations can be solved by omitting the
pressure and solving the vector heat equation of (3.16-365) and projecting the resulting
vector field to the divergence-free subspace at whatever times $\mathbf{u}(\mathbf{x}, t)$ is desired, and only
at those times. The pressure, if desired, is always obtainable from (3.16-363).
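For Class 1 the argument is easy to test numerically. The sketch below assumes a $2\pi$-periodic box, a hypothetical time-dependent body force $\cos(t)\,\mathbf{g}(\mathbf{x})$ with $\nabla\cdot\mathbf{g} \ne 0$, and forward Euler stepping in Fourier space. The reference Stokes solution subtracts $\nabla P$, with $P$ from the PPE (3.16-363), at every step; the 'heat equation' solution omits the pressure entirely and is projected once, at $t_F$.

```python
import numpy as np

n, nu, dt, n_steps = 32, 0.1, 1.0e-3, 500     # assumed values; t_F = n_steps * dt = 0.5
x = 2.0 * np.pi * np.arange(n) / n
X, Y = np.meshgrid(x, x, indexing="ij")
k = np.fft.fftfreq(n, d=1.0 / n)
KX, KY = np.meshgrid(k, k, indexing="ij")
K2 = KX**2 + KY**2
K2_safe = np.where(K2 == 0.0, 1.0, K2)


def poisson_hat(axh, ayh):
    """Fourier coefficients of q solving Laplacian(q) = div(a), with zero mean."""
    qh = -1j * (KX * axh + KY * ayh) / K2_safe
    qh[0, 0] = 0.0
    return qh


def to_real(ah):
    return np.fft.ifft2(ah).real


gxh = np.fft.fft2(np.cos(X) * np.sin(Y) + np.sin(2.0 * Y))   # div(g) = -sin(x) sin(y)
gyh = np.fft.fft2(np.sin(X))

uh, vh = np.fft.fft2(np.sin(Y)), np.fft.fft2(np.sin(X))      # Stokes velocity, div-free IC
uth, vth = uh.copy(), vh.copy()                              # heat-equation velocity u_tilde

for step in range(n_steps):
    c = np.cos(step * dt)
    fxh, fyh = c * gxh, c * gyh
    Ph = poisson_hat(fxh, fyh)                               # PPE (3.16-363)
    # Stokes equations (3.16-362): pressure gradient subtracted at every step
    uh = uh + dt * (-nu * K2 * uh + fxh - 1j * KX * Ph)
    vh = vh + dt * (-nu * K2 * vh + fyh - 1j * KY * Ph)
    # Vector heat equation (3.16-365): pressure omitted
    uth = uth + dt * (-nu * K2 * uth + fxh)
    vth = vth + dt * (-nu * K2 * vth + fyh)

# Single projection at t_F, (3.16-367) and (3.16-368)
phi_h = poisson_hat(uth, vth)
vxh, vyh = uth - 1j * KX * phi_h, vth - 1j * KY * phi_h

mismatch = max(np.max(np.abs(to_real(vxh) - to_real(uh))),
               np.max(np.abs(to_real(vyh) - to_real(vh))))
div_tilde = np.max(np.abs(to_real(1j * (KX * uth + KY * vth))))
```

Here `div_tilde` is of order one, so $\tilde{\mathbf{u}}$ has strayed far from the divergence-free subspace, yet `mismatch` is at rounding level: the single end-of-run projection recovers the Stokes velocity. The agreement relies on the projection commuting with the (linear, constant-coefficient) viscous operator in the periodic case, which is exactly what no-slip walls and the advection terms destroy.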
We conclude this interesting 'digression' with three
Remarks:
(1) If $\mathbf{f} = 0$, then we have $P = 0$ (or constant) and $\tilde{\mathbf{u}} = \mathbf{u}$: the decaying solution remains
divergence-free with no 'need' for a pressure; $\mathbf{u}(\mathbf{x})$ is then (necessarily) a linear
combination of (decaying) Stokes eigenfunctions, each of which has $P = 0$ (see
Walsh, 1991).
(2) If $\mathbf{f}$ is independent of time such that a steady solution, say $\mathbf{u}_s$, will ultimately obtain,
then a similar trick can be applied: simply solve for $\mathbf{v} = \mathbf{u} - \mathbf{u}_s$ from $\partial\mathbf{v}/\partial t = \nu\nabla^2\mathbf{v}$
with $\mathbf{v}_0 = \mathbf{u}_0 - \mathbf{u}_s$. There is no more need for $P$ because $\mathbf{v}$ (and thus $\mathbf{u}$) will remain
divergence-free.
Proof: The pressure is given by (3.16-363) and is independent of time, and the
divergence-free steady-state velocity is obtained by solving $\nu\nabla^2\mathbf{u}_s = \nabla P - \mathbf{f}$, which
permits (3.16-362) to be rewritten as $\partial\mathbf{u}/\partial t = \nu\nabla^2(\mathbf{u} - \mathbf{u}_s)$ and thus as
$\partial(\mathbf{u} - \mathbf{u}_s)/\partial t = \nu\nabla^2(\mathbf{u} - \mathbf{u}_s)$. [Note that both $\mathbf{u}_s = 0$ and $P$ = const if $\mathbf{f} = 0$, consistent
with Remark (1).] Defining now $D = \nabla\cdot\mathbf{u}$ leads to $\partial D/\partial t = \nu\nabla^2 D$ with $D = 0$ at
$t = 0$. Finally, since the BC's cannot cause non-zero $D$, we have $D = 0$, and thus
$\mathbf{u} = \mathbf{v} + \mathbf{u}_s$ solves (3.16-362). QED.
In this case, the solution is also representable as a linear combination of decaying
Stokes eigenfunctions, again with $P = 0$ for each:
$$\mathbf{u} = \sum_{n=1}^{\infty} \mathbf{a}_n e^{-\lambda_n t} + \mathbf{b}_n\left(1 - e^{-\lambda_n t}\right),$$
where $\mathbf{a}_n$ is the projection onto the $n$-th eigenfunction of $\mathbf{u}_0(\mathbf{x})$, and $\mathbf{b}_n$ is that of
$\mathbf{u}_s(\mathbf{x})$.
(3) It is sad but true that the (non-linear) advection terms preclude such a streamlined
solution procedure; for the NSE, the pressure is always needed. Damn!
o Yet another equation for $\varphi$, the BHE. One may still wonder (legitimately, we add) if
and why the bad/polluted portions of the solution actually recover (as we have merely
asserted) outside of the (putative) projection boundary layer. To begin to answer the
question, we derive yet a third(!) PDE for $\varphi$, obtained by operating on (3.16-351) with
the divergence operator and using $\nabla\cdot\mathbf{v} = 0$:
$$\left(\frac{\partial}{\partial t} - \nu\nabla^2\right)\nabla^2\varphi = \nabla\cdot[\mathbf{f}(\tilde{\mathbf{u}}) - \nabla P_0]. \tag{3.16-369}$$
If $\mathbf{f}(\tilde{\mathbf{u}})$ was known (e.g., for Stokes flow), (3.16-369) might be solved for $\varphi$ if appropriate
IC's and BC's were specified, and if the implied regularity/smoothness was 'true.' [As
this assumption is probably violated in the majority of actual simulations, we are
admittedly treading on thin ice. Also, some of our more mathematical colleagues have not and
presumably will not 'buy' it; in particular J. Shen and R. Rannacher, both of whom have
made serious and important contributions to 'projection theory.' On the other side of
the coin, though, we have concurrence by the mathematician who first employed
projection 2—J. van Kan (personal communication), and a very recent contribution by another
mathematician, Schwab (1995), in which he shows, for the time-discretized case that we
consider below, that the pressure does indeed (when sufficiently regular, of course) satisfy
a biharmonic equation (BHE). We will thus present it because we believe it does lead to
a useful and perhaps deeper understanding of the projection boundary layer; it also leads
to what we call the Biharmonic Miracle (BHM).] We know the IC, $\varphi = 0$, and we know
some BC's [(3.16-357), (3.16-358)]. A completely posed problem may be obtained (we
assert) by applying, on the boundary, the normal component of the modified momentum
equation from which this higher-order equation was derived; namely (3.16-351):
$$\left(\frac{\partial}{\partial t} - \nu\nabla^2\right)\frac{\partial\varphi}{\partial n} = \mathbf{n}\cdot\left[\nu\nabla^2\mathbf{v} + \mathbf{f}(\mathbf{u}) - \nabla P_0 - \partial\mathbf{v}/\partial t\right] \quad \text{on } \Gamma_D + \Gamma_N$$ (3.16-370)
This equation is analogous to the inhomogeneous Neumann BC (normal momentum
equation on $\Gamma_D$) needed for the PPE and, we believe, for the same reason: to assure
$\nabla\cdot\mathbf{v} = 0$ on $\Gamma$. [Alternatively, one could argue the other way: $\nabla\cdot\mathbf{v} = 0$ on $\Gamma$ causes
(3.16-370) to apply.] In any event, the above parabolic problem is now well-posed and
should, under perhaps some stringent regularity assumptions (probably even requiring
very smooth boundaries), admit a solution, at least in principle. We also believe and
assert that the solution to the above IBVP will also (for $t$ small) display a boundary layer
behavior such that, outside of $\delta = O(\sqrt{\nu t})$, the $\nabla^4\varphi$ term will become negligible, and the
previous approximate solution, $\varphi = t^2\dot{P}_0/2$, will apply, for which (3.16-369) reduces to
$\nabla^2(P_0 + t\dot{P}_0) = \nabla\cdot\mathbf{f}(\mathbf{u}) + O(t^2)$, the 'PPE' for small $t$; similarly, (3.16-370) then
simplifies to $\mathbf{n}\cdot\nabla(P_0 + t\dot{P}_0) = \mathbf{n}\cdot[\nu\nabla^2\mathbf{v} + \mathbf{f}(\mathbf{u}) - \partial\mathbf{v}/\partial t]$, the appropriate (on $\Gamma_D$ at least) BC
for the PPE.
o Semi-discrete BHE. To make further progress, we have found it fruitful, in order to
complete the (already too-long?) analysis of projection methods, to shift gears and go
to a semi-discrete projection method—in which only time is discretized. In so doing we
shall make the same assumption that all 'practitioners' have made (whether or not they
realized it): that the projection should be performed at each timestep of the selected
time integration method for u. That this is silly and expensive, at least for any flow
that is approaching a steady state with $\Delta t$ fixed, is obvious. How to do better is less
obvious—and few have tried. [We—PMG and H. Daniels—made a brief attempt in
1992, but were not so successful. We also have more recently learned—see Gresho et al.
(1995)—that projection methods are not so efficient for finding steady solutions; they are
'inherently' time-accurate methods—a conclusion that we shall validate later.] Anyway,
we now present one semi-implicit algorithm that could be useful in practice (for simplicity,
we drop the body force term and assume F to be time-independent); use AB2 (or even
AB3) for advection and TR for diffusion. The general step (ignoring start-up issues for
now) is as follows:
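1. Given $P_m$ and $\mathbf{u}_m$ with $\nabla\cdot\mathbf{u}_m = 0$, and $\mathbf{u}_{m-1}$ with $\nabla\cdot\mathbf{u}_{m-1} = 0$, solve for $\tilde{\mathbf{u}}_{m+1}$ from
$$\frac{\tilde{\mathbf{u}}_{m+1} - \mathbf{u}_m}{\Delta t} + \frac{1}{2}\left(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}\right) + \nabla P_m = \frac{\nu}{2}\nabla^2\left(\tilde{\mathbf{u}}_{m+1} + \mathbf{u}_m\right) \quad \text{in } \Omega,$$ (3.16-371)
with BC's
$$\tilde{\mathbf{u}}_{m+1} = \mathbf{w}_{m+1} \quad \text{on } \Gamma_D$$ (3.16-372)
and
$$\nu\,\partial\tilde{\mathbf{u}}_{m+1}/\partial n = \mathbf{F} + \mathbf{n}P_m \quad \text{on } \Gamma_N,$$ (3.16-373)
a 'modified' Helmholtz equation [with operator $(I - (\Delta t/2)\nu\nabla^2)$ vis-a-vis $(\nabla^2 + k^2)$].
2. Solve for $\varphi$ from
$$\nabla^2\varphi = \nabla\cdot\tilde{\mathbf{u}}_{m+1} \quad \text{in } \Omega$$ (3.16-374)
$$\partial\varphi/\partial n = 0 \quad \text{on } \Gamma_D$$ (3.16-375)
$$\varphi = -\frac{\Delta t}{2}\left(F_n + P_m\right) \quad \text{on } \Gamma_N$$ (3.16-376)
3. Update the velocity via
$$\mathbf{u}_{m+1} = \tilde{\mathbf{u}}_{m+1} - \nabla\varphi \quad \text{in } \Omega.$$ (3.16-377)
4. Update the pressure from
$$P_{m+1} = P_m + 2\varphi/\Delta t.$$ (3.16-378)
5. Reset variables: $P_{m+1} \to P_m$, $\mathbf{u}_{m+1} \to \mathbf{u}_m$ except on $\Gamma_D$, where $\mathbf{w}_{m+1} \to \mathbf{u}_m$. Bump $m$
and go to 1.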
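We can now complete our projection analysis, such as it is; insert $\tilde{\mathbf{u}}_{m+1} = \mathbf{u}_{m+1} + \nabla\varphi$
into (3.16-371) to obtain the semi-discrete version of the modified momentum equation:
$$\frac{\mathbf{u}_{m+1} - \mathbf{u}_m}{\Delta t} + \frac{1}{2}\left[3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}\right] + \nabla\left[P_m + \frac{1}{\Delta t}\left(I - \frac{\nu\Delta t}{2}\nabla^2\right)\varphi\right] = \frac{\nu}{2}\nabla^2\left(\mathbf{u}_m + \mathbf{u}_{m+1}\right),$$ (3.16-379)
whose divergence is also of interest here:
$$\left(I - \delta^2\nabla^2\right)\nabla^2\varphi/\Delta t = -\nabla\cdot\left[\nabla P_m + \frac{1}{2}\left(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}\right)\right],$$ (3.16-380)
where now $\delta = \sqrt{\nu\Delta t/2}$ is the projection BLT, and it is worth noting that this result
looks like (3.16-369) after a single timestep, starting from $\varphi = 0$. (Indeed, it could have
been so derived, and perhaps should have been.) Anyway, we now have a biharmonic
equation: the third PDE satisfied by that slippery variable called $\varphi$. The final (?) step in
the analysis is to replace $\varphi$ by $(P_{m+1} - P_m)\Delta t/2$ a la (3.16-378) to obtain the BHE for
the pressure difference:
$$\left(I - \delta^2\nabla^2\right)\nabla^2\,\frac{P_{m+1} - P_m}{2} = -\nabla\cdot\left[\nabla P_m + \frac{1}{2}\left(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}\right)\right]$$ (3.16-381)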
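whose rearrangement gives the BHE satisfied by the pressure itself:
$$\left(I - \delta^2\nabla^2\right)\nabla^2 P_{m+1} = -\nabla\cdot\left(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}\right) - \left(I + \delta^2\nabla^2\right)\nabla^2 P_m.$$ (3.16-382)
Thus, rather than the desired PPE, the 'projection pressure' satisfies a (singularly
perturbed) BHE that approximates the PPE, and apparently does it well if $\delta^2\nabla^2 P$ is
'small' compared with $P$. (This is also essentially the BHE derived recently by Schwab,
1995.) A different rearrangement will make the perturbation to the PPE more clear:
$$\nabla^2\left(\frac{P_m + P_{m+1}}{2}\right) = -\nabla\cdot\left[\frac{1}{2}\left(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}\right)\right] + \delta^2\nabla^4\left(\frac{P_{m+1} - P_m}{2}\right)$$ (3.16-383)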
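which, since $P_{m+1} - P_m$ is $O(\Delta t)$ and $\delta^2 = \nu\Delta t/2$, is clearly an $O(\Delta t^2)$ perturbation from
the 'good' PPE. The BC's for (3.16-381) are (from the earlier $\varphi$ equations):
$\partial(P_{m+1} - P_m)/\partial n = 0$ on $\Gamma_D$, $(P_{m+1} - P_m) = -(F_n + P_m)$ on $\Gamma_N$, and the normal component of
(3.16-379) on $\Gamma_D$ and $\Gamma_N$; i.e.,
$$\left(I - \frac{\nu\Delta t}{2}\nabla^2\right)\frac{\partial}{\partial n}\left(P_{m+1} - P_m\right) = \mathbf{n}\cdot\left[\nu\nabla^2\left(\mathbf{u}_m + \mathbf{u}_{m+1}\right) - \left(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}\right) - 2\nabla P_m - 2\,\frac{\mathbf{u}_{m+1} - \mathbf{u}_m}{\Delta t}\right]$$ (3.16-384)
These BC's permit, in principle, the solution of the BHE for $(P_{m+1} - P_m)$; they also
'cause' a recovery of the normal pressure gradient from the bad value on $\Gamma_D$ (zero) to
the proper PPE value because the term $\nu\Delta t\,\nabla^2(P_{m+1} - P_m)/2$ becomes small compared
with $(P_{m+1} - P_m)$ once outside of the projection BL; i.e., for $x_n > O(\sqrt{\nu\Delta t})$, where $x_n$
denotes the normal distance from $\Gamma$ into $\Omega$.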
o 1D model problem, BHM. Now, because we probably cannot actually solve the BHE
BVP for any real case of interest, we switch to a 1D model problem, introduced in Gresho
(1990b), which we believe mimics at least some of the important parts of the solution of
(3.16-383).
1. The model PPE ($P \to u$) is given by
$$-u'' = S \quad \text{on } 0 < x < 1,$$
$$u' = a \quad \text{at } x = 0,$$
$$u' = a - S \quad \text{at } x = 1,$$ (3.16-385)
where $S$ and $a$ are constant, and we note that the solvability condition $\int_0^1 S\,dx = u'(0) - u'(1)$
is satisfied. Thus, (3.16-385) has a solution, up to an arbitrary additive constant, which
we take to be zero.
2. The model BHE problem is given by
$$\delta^2 u'''' - u'' = S \quad \text{in } 0 < x < 1,$$
$$u' = q_0 \quad \text{at } x = 0,$$
$$u' = q_1 \quad \text{at } x = 1, \text{ and}$$
$$-\delta^2 u''' + u' = a \quad \text{at } x = 0,$$
$$-\delta^2 u''' + u' = a - S \quad \text{at } x = 1,$$ (3.16-386)
where $q_0$ and $q_1$ are arbitrary (although they should both be taken as zero to look more
like the pressure conditions, we include the more general case for more 'punch').
Remarks:
(1) We shall regard the solution of the BHE as 'spurious' to the extent that it disagrees
with the PPE solution.
(2) The solvability condition is now $(-\delta^2 u''' + u')|_{x=0} - (-\delta^2 u''' + u')|_{x=1} = \int_0^1 S\,dx$, which is
again satisfied, for arbitrary $q_1$ and $q_0$. Again we take the resulting arbitrary additive
constant to be zero.
(3) A similar 'boundary layer' problem is discussed by Bender and Orszag
(1978), Example 2 on p. 449.
The PPE solution is simply
$$u(x) = u_{PPE} = ax - Sx^2/2,$$ (3.16-387)
and that of the BHE is (less simply, of course)
$$u(x) = u_{PPE} + \frac{\delta}{e^{1/\delta} - e^{-1/\delta}}\left\{\left[S - a\left(1 - e^{-1/\delta}\right) + \left(q_1 - q_0 e^{-1/\delta}\right)\right]e^{x/\delta} + \left[S + a\left(e^{1/\delta} - 1\right) + \left(q_1 - q_0 e^{1/\delta}\right)\right]e^{-x/\delta}\right\},$$ (3.16-388)
which is well approximated when $\delta \ll 1$ (the case of most interest) by
$$u = u_{PPE} + \delta\left[(S - a + q_1)e^{-(1-x)/\delta} + (a - q_0)e^{-x/\delta}\right].$$ (3.16-389)
The 'normal' derivatives are also of interest:
$$u'_{PPE} = a - Sx$$ (3.16-390)
and, for the BHE solution,
$$u' = u'_{PPE} + (S - a + q_1)e^{-(1-x)/\delta} - (a - q_0)e^{-x/\delta},$$ (3.16-391)
which satisfies $u' = q_0$ at $x = 0$ and $u' = q_1$ at $x = 1$, as it must. (The exact solution
satisfies these BC's exactly, of course.) Thus, we see that:
1. The error in 'pressure' is $O(\delta)$ at the wall and very small outside the BL, $O(\delta e^{-1/\delta})$.
2. The error in the normal gradient is very large, $O(1)$, at the wall, but very small outside
the BL. In fact, (3.16-391) shows that $u' = u'_{PPE} + O(e^{-1/\delta})$ for $O(\delta) < x < 1 - O(\delta)$; the
large error at the wall 'vanishes' as the BL is traversed.
3. This is the biharmonic miracle in 1D: a 'rough' solution with a tendency to show a loss
of regularity because of the very large higher derivatives at the wall ($d^m u/dx^m \approx 1/\delta^{m-1}$)
smooths out nicely while traversing the BL to recover to virtually the proper PPE results.
To the extent that this behavior carries over to the full NS equations, we have explained
and rationalized some of the disconcerting features of projection methods.
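The three observations above are easy to check numerically. The following is a minimal sketch, not code from the original analysis: it evaluates the exact model solutions (3.16-387) and (3.16-388) and their derivatives. The function names are illustrative, and the exponentials are rewritten in terms of $E = e^{-1/\delta}$ (an algebraically equivalent form, chosen here only to avoid overflow for small $\delta$).

```python
import math

def u_ppe(x, a, S):
    """Model PPE solution (3.16-387): u = a x - S x^2 / 2."""
    return a * x - 0.5 * S * x * x

def du_ppe(x, a, S):
    """Derivative of the model PPE solution (3.16-390)."""
    return a - S * x

def layer_amplitudes(a, S, q0, q1, delta):
    """Amplitudes of the two boundary-layer exponentials in (3.16-388).

    Rewritten with E = exp(-1/delta) so nothing overflows for small delta.
    Returns (c_left, c_right), multiplying exp(-x/delta) and
    exp(-(1-x)/delta) respectively.
    """
    E = math.exp(-1.0 / delta)
    denom = 1.0 - E * E
    c_right = delta * (S - a * (1.0 - E) + q1 - q0 * E) / denom
    c_left = delta * ((S + q1) * E + a * (1.0 - E) - q0) / denom
    return c_left, c_right

def u_bhe(x, a, S, q0, q1, delta):
    """Exact model BHE solution (3.16-388), additive constant taken as zero."""
    c_left, c_right = layer_amplitudes(a, S, q0, q1, delta)
    return (u_ppe(x, a, S)
            + c_right * math.exp(-(1.0 - x) / delta)
            + c_left * math.exp(-x / delta))

def du_bhe(x, a, S, q0, q1, delta):
    """Derivative of the exact model BHE solution."""
    c_left, c_right = layer_amplitudes(a, S, q0, q1, delta)
    return (du_ppe(x, a, S)
            + (c_right / delta) * math.exp(-(1.0 - x) / delta)
            - (c_left / delta) * math.exp(-x / delta))

if __name__ == "__main__":
    a, S, q0, q1 = 4.0, 6.0, 0.0, 0.0   # the case plotted in the figures
    for delta in (0.1, 0.01, 0.001):
        wall_err = u_bhe(0.0, a, S, q0, q1, delta) - u_ppe(0.0, a, S)
        mid_err = du_bhe(0.5, a, S, q0, q1, delta) - du_ppe(0.5, a, S)
        print(delta, wall_err, du_bhe(0.0, a, S, q0, q1, delta), mid_err)
```

With $a = 4$, $S = 6$, $q_0 = q_1 = 0$, the printed wall error in $u$ should scale like $\delta$, the wall gradient should be $q_0$ rather than the PPE value $a$, and the mid-domain gradient error should be exponentially small.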
To further elucidate the BHM and, importantly, to be sure that discrete approximations
also 'recover,' we present, with thanks to S. Chan, in Figures 3.16-15 through 3.16-17
a summary of results of exact and finite difference approximate solutions (second-order,
centered) to both PPE and BHE, for $a = 4$, $S = 6$, $q_0 = q_1 = 0$, for which
$u_{PPE}(x) = 4x - 3x^2$ and $u(x) = 4x - 3x^2 + \delta[2e^{-(1-x)/\delta} + 4e^{-x/\delta}]$; $u'_{PPE}(x) = 4 - 6x$, and
$u'(x) = 4 - 6x + 2e^{-(1-x)/\delta} - 4e^{-x/\delta}$. In each figure, we plot the BHE solution, $u(x)$, as
solid curves, the upper one being the FDM solution (101 grid points) and the lower
one the analytic solution. The solid dots describe the FDM solution of the PPE and the
open dots the analytic solution of same. Finally, the first derivative of the BHE solution
is also plotted ('normal pressure gradient'): dashed for the FDM solution and solid for
the analytic solution. Figure 3.16-15 shows the case wherein the mesh is too coarse to
resolve the spurious BL or, stated differently, the timestep is small enough that the BL
is very thin; here, $\delta = \sqrt{\nu\Delta t} = 0.001$ and $h = 0.01$, giving $h = 10\delta$. Note first that the
analytic PPE results (open circles) lie right on top of the analytic BHE curve; the only
discrepancy, of $O(\delta)$ at $x = 0, 1$, is graphically invisible. But the differences do show up
in $du/dx$, wherein the BHE shows the $O(1)$ error at the walls; the correct slopes are 4
and $-2$ at $x = 0, 1$, vis-a-vis zero for the BHE. The FDM results, while not particularly
accurate, do show a similar behavior with the important result that the approximate BHE
agrees with the approximate PPE even though the spurious BL is too thin to be resolved.
The other extreme, a fat BL, is shown in Figure 3.16-16; here $\delta = 0.1 = 10h$ so that
the BL is well-resolved in this case. Here even the analytic results for PPE and BHE show
large disagreement near the walls. But the key point of this result is that even though the
BHE easily resolves the (bad) solution within the spurious BL, once outside of it (say
$2\delta$ from the walls), the BHE solution recovers to the desired PPE solution, both for the
analytic and FDM cases.

Fig. 3.16-15 PPE/BHE results; small timestep yields unresolved boundary layer ($\delta = 0.1h$).
Finally, Figure 3.16-17 shows the case in between: $\delta = h = 0.01$. No more surprises.
Not shown are cases with better spatial resolution (up to 401 points, with 'convergence'
observed) and even more extreme ratios of $\delta/h$ (from 0.01 to 100). The inevitable
conclusions from the model problem are these:
1. In all cases, the solution of the discrete PPE was close to that of the discrete BHE once
outside of the BL region.
2. For $\delta \ll h$, the discrete BHE solution 'tactfully' ignores a BL that it cannot see; i.e., at
the first node away from the wall (and all others), the agreement between BHE and PPE
was good.
3. For $\delta \gg h$, the BHE solution within the BL is 'poor' (far from the PPE solution) for
both analytic and FDM, but once outside the BL, a near-miraculous recovery occurs.
4. To the extent that this discrete Biharmonic Miracle extends to the multi-dimensional
case associated with the projection method, the 'success' of the method is, at least partially,
explained, even for Stokes flow.
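Conclusions 2 and 3 can be reproduced with a few lines of code. The sketch below is not the finite difference scheme behind Figures 3.16-15 through 3.16-17. As a simplifying assumption it works with the gradient $w = u'$: integrating (3.16-386) once and using its last two BC's gives $-\delta^2 w'' + w = a - Sx$ with $w(0) = q_0$, $w(1) = q_1$, whose PPE counterpart is simply $w = a - Sx$. The function name and the tridiagonal (Thomas) solver are illustrative.

```python
def solve_gradient_bvp(a, S, q0, q1, delta, n):
    """Centered FD solution of -delta^2 w'' + w = a - S x on [0, 1],
    with w(0) = q0 and w(1) = q1, using n intervals (n + 1 nodes).

    Here w stands for u' of the model BHE; the reduction to a
    second-order problem is an assumption of this sketch.
    """
    h = 1.0 / n
    r = (delta / h) ** 2
    m = n - 1                              # number of interior unknowns
    lower = [-r] * m
    diag = [1.0 + 2.0 * r] * m
    upper = [-r] * m
    rhs = [a - S * (i * h) for i in range(1, n)]
    rhs[0] += r * q0                       # Dirichlet data moved to the RHS
    rhs[-1] += r * q1

    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, m):
        factor = lower[i] / diag[i - 1]
        diag[i] -= factor * upper[i - 1]
        rhs[i] -= factor * rhs[i - 1]
    w = [0.0] * m
    w[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        w[i] = (rhs[i] - upper[i] * w[i + 1]) / diag[i]
    return [q0] + w + [q1]

if __name__ == "__main__":
    a, S, n = 4.0, 6.0, 100                # h = 0.01, as in the figures
    for delta in (0.001, 0.01, 0.1):       # delta/h = 0.1, 1, 10
        w = solve_gradient_bvp(a, S, 0.0, 0.0, delta, n)
        first_node = w[1] - (a - S / n)    # deviation from PPE gradient
        midpoint = w[n // 2] - (a - S * 0.5)
        print(delta, first_node, midpoint)
```

For $\delta = 0.1h$ the first interior node should already sit close to the PPE gradient, while for $\delta = 10h$ the near-wall values are far off but the mid-domain values recover.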
That was the good news. The bad news is this: suppose you need accurate results right
up to and at the wall and you accordingly use a graded and extremely-fine-near-the-wall
mesh? In spite of the fact that your implicit treatment of the viscous terms has removed
the onerous diffusional stability limit, $\Delta t < O(h^2/\nu)$, the requirement of high accuracy at
the wall seems to say that $\delta = \sqrt{\nu\Delta t}$ should be small relative to the mesh spacing, $h$, so
that the specious BL is thin enough not to matter. But this brings us back to $\delta < h$ (or
$\delta \ll h$), which translates to $\sqrt{\nu\Delta t} < h$ or $\Delta t < h^2/\nu$, and we are (or might as well be)
back to an explicit treatment of diffusion! That is to say, if it is really true that accurate
results near the wall require $\Delta t < O(h^2/\nu)$ for $h$ 'very' small, then the projection method
loses much of its appeal for at least a certain class of problems, since forward Euler could
do the same job at lower cost.

Fig. 3.16-16 PPE/BHE results; large timestep yields a thick boundary layer ($\delta = 10h$).
We shall return to this issue after a brief excursion to the 'other end'—relatively large
timesteps, which we do next.
Fig. 3.16-17 PPE/BHE results; 'balanced' case ($\delta = h$).

o Steady-state paradox. We have mentioned more than once that projection methods
should be used in the 'time-accurate' mode and not be used as steady-state seekers, at
least not 'by design'; if a time-dependent flow attains a steady state via 'accurate' time-
marching, that is another matter. And we are not alone; cf. Simo and Armero (1994) and
Turek (1997). Here we will explain why large timesteps should not be used and do so
using a most robust ODE method, BE. To simplify our task, we first rewrite the DAE's
in the condensed notation used previously (Section 3.16.1), and we do it for Stokes flow,
for simplicity. (Recall that BE applied to the index 2 Stokes DAE's will give the proper
steady-state solution in one step if $\Delta t$ is set to infinity.) To solve
$$\dot{u} + Ku + GP = f \neq f(t)$$ (3.16-392)
and
$$Du = g \neq g(t)$$ (3.16-393)
via the BE projection method, we do (given $P_n$ and $u_n$ with $Du_n = g$ for $n = 0, 1, \ldots$):
Step 1. Solve $(\tilde{u}_{n+1} - u_n)/\Delta t + K\tilde{u}_{n+1} + GP_n = f$ for $\tilde{u}_{n+1}$.
Step 2. Solve $DG(P_{n+1} - P_n) = (D\tilde{u}_{n+1} - g)/\Delta t$ for $\Delta t(P_{n+1} - P_n)$.
Step 3. Compute $u_{n+1} = \tilde{u}_{n+1} - \Delta t\,G(P_{n+1} - P_n)$.
Step 4. Update $P$, bump $n$, and go to 1. (3.16-394)
Fig. 3.16-18 Backward Euler projections for several $\Delta t$'s. [Figure labels: the line $\nabla\cdot u = 0$ and the iterates $u_0, u_1, u_2, u_3$.]

We now show that algorithm (3.16-394) is very badly-behaved for large $\Delta t$ with the
result that, unlike true BE applied to the Stokes DAE's, a steady result is not attainable
simply by taking a few very large steps. Quite the contrary, in fact; the larger is $\Delta t$,
the more steps does (3.16-394) require to attain steady state! A qualitative picture of
this disaster is shown in Figure 3.16-18, first shown in Gresho et al. (1995), in which
the horizontal line represents the manifold of all divergence-free velocities, which is the
range of the projection matrix,
$$\wp = I - G(DG)^{-1}D$$ (3.16-395)
inherent in the above algorithm, in which we have taken $g = 0$ in order to obtain
orthogonal projections (see Appendix 3). The figure depicts the fact that Step 1 of
(3.16-394) lifts us out of the divergence-free subspace, and farther from it for larger
$\Delta t$ ($\Delta t_1 < \Delta t_2 < \Delta t_3, \ldots$), and also depicts Steps 2 and 3, which bring us back. The
limiting case, $\Delta t = \infty$, is particularly easy to examine, so we do so. Take $n = 0$ and
$\Delta t = \infty$ to obtain
1. $\tilde{u}_1 = K^{-1}(f - GP_0)$, where (recall) $P_0$ came from $DGP_0 = D(f - Ku_0)$.
2. Solve for $\lambda \equiv \Delta t(P_1 - P_0)$ from $DG\lambda = D\tilde{u}_1 - g$, where we note that $\Delta t\,\Delta P$ is
perfectly well-defined even for $\Delta t \to \infty$; i.e., the Lagrange multiplier ($\lambda$) is finite and
thus $P_1 \to P_0$.
3. Compute
$$u_1 = \tilde{u}_1 - G\lambda = \wp\tilde{u}_1 + G(DG)^{-1}g = \wp K^{-1}(f - GP_0) + G(DG)^{-1}g,$$
and we are done with the first timestep. Now set $n = 1$ for the second timestep; we easily obtain
$$\tilde{u}_2 = K^{-1}(f - GP_1) = K^{-1}(f - GP_0) = \tilde{u}_1$$ (3.16-396)
and we see the 'stall' shown in Figure 3.16-18: the 'velocity' simply bounces up and down
between $\tilde{u}_\infty$ and $u_\infty$. Only for small $\Delta t$ (e.g., $\Delta t = \Delta t_1$ in the figure) is the projection
method useful. (Figure 3.16-19 shows the analogous, and even more bizarre, behavior
when TR is used for large $\Delta t$; we omit the analysis because TR is well-known
not to be recommended with large $\Delta t$.)
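The claim that larger timesteps need more steps can be observed on a toy problem. What follows is a minimal sketch of algorithm (3.16-394), assuming NumPy is available; the $6 \times 6$ matrix $K$, the $6 \times 2$ matrix $G$ (with $D = G^T$), the load $f$, the tolerance, and the function name are all arbitrary illustrative choices, not a discretized Stokes problem from the text.

```python
import numpy as np

def be_projection_steps(dt, tol=1e-8, max_steps=200_000):
    """Count BE-projection steps (3.16-394) needed to reach steady state.

    Toy data (assumptions of this sketch): K is SPD, G has full column
    rank, D = G^T, g = 0, and the initial velocity u_0 = 0 satisfies D u = g.
    """
    K = np.diag([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    G = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
                  [1.0, -1.0], [2.0, 0.0], [0.0, 2.0]])
    D = G.T
    f = np.arange(1.0, 7.0)
    g = np.zeros(2)
    n, m = G.shape

    # Reference steady state from the coupled (saddle-point) system.
    saddle = np.block([[K, G], [D, np.zeros((m, m))]])
    steady = np.linalg.solve(saddle, np.concatenate([f, g]))
    u_s, P_s = steady[:n], steady[n:]

    DG = D @ G
    u = np.zeros(n)
    P = np.linalg.solve(DG, D @ (f - K @ u))      # start-up pressure P_0
    H = np.eye(n) / dt + K

    for step in range(1, max_steps + 1):
        u_tilde = np.linalg.solve(H, u / dt + f - G @ P)        # Step 1
        dP = np.linalg.solve(DG, (D @ u_tilde - g) / dt)        # Step 2
        u = u_tilde - dt * (G @ dP)                             # Step 3
        P = P + dP                                              # Step 4
        if np.linalg.norm(u - u_s) + np.linalg.norm(P - P_s) < tol:
            return step
    return max_steps

if __name__ == "__main__":
    for dt in (1.0, 10.0, 100.0):
        print(dt, be_projection_steps(dt))
```

On this toy problem the step count is expected to grow roughly in proportion to $\Delta t$ once $\Delta t$ exceeds the viscous timescale, because the pressure increment per step scales like $1/\Delta t$.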
Fig. 3.16-19 TR projections for several $\Delta t$'s. [Figure labels: the line $\nabla\cdot u = 0$; $u_0 = u_2 = u_4$; $u_1 = u_3$.]
We end this discussion by noting that 'small' or 'large' At with respect to the behavior
depicted in Figures 3.16-18 and 3.16-19 is a strong function of the timescales and the
temporal behavior inherent in the flow being simulated; e.g., during sharp transients, the
At needed to stay 'sufficiently close' to the divergence-free manifold will be much smaller
than that for a slowly changing flow or one that is approaching a steady state. Perhaps
$\Delta t\,\|\dot{u}\| < \varepsilon U_{\max}$ would be a good guide, for 'properly selected' $\varepsilon$ and $U_{\max}$.
o Biharmonic catastrophe. We now return to the notion that an accurate solution within
the spurious BL is not readily achievable and that an accurate solution within any physical
BL (or near any boundary with u = w, actually) requires that $\delta = \sqrt{\nu\Delta t}$ should be small
relative to h, which itself should be small relative to any physical BLT. For example, in
Gresho and Chan (1995) was presented a case in which a very fine mesh was employed
near a geometric singularity (re-entrant corner) in which no physical BL was present
(Stokes flow, in fact)—a presumably easier case than one at large Re. What they observed
was a disastrous manifestation of the BHE (quite the opposite of a BHM) which they
called a BHC (Biharmonic Catastrophe) and it is this: if $\Delta t$ is not sufficiently small (read
'very, very small'), then the well-resolved-but-spurious BL in the vicinity of a singularity
can generate garbage by 'finding' a singular solution to the BHE that dominates that of the
desired PPE—at least near the singularity, giving a region of totally spurious velocity and
pressure. And the solution appears to attain a steady state, a manifestation of the behavior
discussed regarding Figure 3.16-10 above. In reality, a steady solution—for which there
is no more BHE and therefore no BHM or BHC—would have required an unthinkable
number of (too large) steps to follow a non-physical 'transient.' If the spurious BL was
to be made sufficiently small so as not to cause a BHC, the needed timestep would be
close to that needed for stability if the simple FE method was employed—in their case
requiring ~$10^9$ steps, either via FE or a BE version of projection 2, either of which
would deliver an accurate solution, with FE costing much less (and each unaffordable).
See Minev and Gresho (1998) for a possible 'fix' for the BHC.
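o Approximate projection. We conclude the semi-discrete discussion by noting that the
other side of the coin is also interesting; i.e., suppose we decided to report $\tilde{\mathbf{u}}_m$ rather
than $\mathbf{u}_m$ as the NS velocity, a decision that (we believe) has actually been made by
several investigators (cited later), especially for 'projection 1.' Inserting $\varphi$ from (3.16-378)
into (3.16-377) and placing the result into (3.16-371), after lowering the $m$-index by
one, yields, reverting to Stokes flow for simplicity,
$$\frac{\tilde{\mathbf{u}}_{m+1} - \tilde{\mathbf{u}}_m}{\Delta t} + \nabla\left(\frac{3P_m - P_{m-1}}{2}\right) = \frac{\nu}{2}\nabla^2\left(\tilde{\mathbf{u}}_{m+1} + \tilde{\mathbf{u}}_m\right) - \frac{\nu\Delta t}{4}\nabla^2\nabla\left(P_m - P_{m-1}\right).$$ (3.16-397)
We now boldly drop the 'spurious' $O(\Delta t^2)$ term, both here and in the advection term for
the full NSE, and consider the following algorithm, which clearly mixes AB2 and TR:
Step 1. Given $\tilde{\mathbf{u}}_m$, $P_m$, and $P_{m-1}$, solve for $\tilde{\mathbf{u}}_{m+1}$ from
$$\frac{\tilde{\mathbf{u}}_{m+1} - \tilde{\mathbf{u}}_m}{\Delta t} + \nabla\left(\frac{3P_m - P_{m-1}}{2}\right) = \frac{\nu}{2}\nabla^2\left(\tilde{\mathbf{u}}_{m+1} + \tilde{\mathbf{u}}_m\right)$$ (3.16-398)
with the same BC's as before, (3.16-372) and (3.16-373).
Step 2. Solve for $P_{m+1}$ from (3.16-374) through (3.16-376); i.e.,
$$\nabla^2\left(P_{m+1} - P_m\right) = \frac{2}{\Delta t}\nabla\cdot\tilde{\mathbf{u}}_{m+1} \quad \text{in } \Omega,$$
$$\partial\left(P_{m+1} - P_m\right)/\partial n = 0 \quad \text{on } \Gamma_D,$$
and
$$P_{m+1} = -F_n \quad \text{on } \Gamma_N.$$
Done. Go to Step 1.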
This algorithm obviously merits the following remark, since we omitted the step that
strips off the gradient part of u: we have lost incompressibility! True, but before becoming
overly concerned, let us see what we have gained—ostensibly (also, as mentioned at the
outset, if the pressure 'guess' is good, the velocity will be close to divergence-free):
(i) no BC 'problems,'
(ii) no BHE, and thus
(iii) no spurious BL and
(iv) no vortex sheets.
As we have not tested this idea, we shall drop it here—but retrieve it later when
discussing projection methods used by others, in Section 3.16.6d. [We did test two
methods of 'stabilizing' the $Q_1Q_1$ element in Gresho et al. (1995), one of them a
projection method; but neither was deemed to be really worthwhile.]
c. A GFEM (almost) implementation of the second-order projection
method—projection 2
One of the discoveries of Gresho and Chan (1990) during their initial investigations on
projection methods via $Q_1Q_0$ was a 'trick' that permitted the introduction of a (semi-)
consistent mass matrix into the method in spite of performing the projection with the
only viable Laplacian, that using lumped mass: $C^TM_L^{-1}C$. They introduced a modified
semi-discrete momentum equation in an ad hoc manner (the pressure gradient term, $CP$,
was premultiplied by $MM_L^{-1}$, where $M$ is consistent and $M_L$ its lumped approximation)
and used it to help derive 'discrete projection 2.' They also presented a GFEM version
of 'projection 1,' in which the consistent mass matrix was (naturally) employed for the
intermediate velocity step (which uses no pressure gradient), and the lumped mass matrix
(necessarily) was used for the projection step. A consistency analysis of the projection 1
scheme revealed that the method 'automatically' inserted the $MM_L^{-1}$ factor in front of the
'normal' pressure gradient term. Based on that result, and others to follow, here we take
the stand that this is the 'proper' way to incorporate the beneficial effects of consistent
mass (low dispersion error, mainly) into a finite element projection method.
Thus, we begin by stating the projection method DAE's:
$$M\dot{u} + [K + N(u)]u + MM_L^{-1}CP = f,$$
$$C^Tu = g,$$ (3.16-399)
wherein we point out that the modified pressure term is really not very far from the
original GFEM pressure term because $MM_L^{-1}$ is not far from the identity matrix. In
fact, on a uniform 2D mesh of bilinear elements, it is not hard to show that
$MM_L^{-1}u = u + (h^2/6)\nabla^2u + O(h^4)$ for a smooth function, $u$. (The second-order truncation error terms
may drop to first-order on a general mesh of distorted quadrilaterals.) Thus, since the factor
$MM_L^{-1}$ hardly hurts the momentum equation (probably/usually, at least on good grids),
yet permits both consistent mass treatment of advection and lumped mass during the
projection (see below), it is a significant improvement over either of the two 'same' mass
matrix approaches (lumped mass is inaccurate and consistent mass is unaffordable).
Next we note that the implied PPE of (3.16-399) is also a mixed mass matrix result:
$$\left(C^TM_L^{-1}C\right)P = C^TM^{-1}\left[f - Ku - N(u)u\right] - \dot{g},$$ (3.16-400)
wherein we point out that even though $M^{-1}$ appears on the RHS, our projection method
will bypass this 'inconvenience.'
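The expansion $MM_L^{-1}u = u + (h^2/6)\nabla^2u + O(h^4)$ can be checked directly. The sketch below assumes the standard interior stencil of the bilinear consistent mass matrix on a uniform square grid, $(h^2/36)\,[1\;4\;1]\otimes[1\;4\;1]$, with lumped mass $h^2$; the test function $\sin x \sin y$, the sample point, and the function names are illustrative choices.

```python
import math

def mass_ratio_stencil(u, x, y, h):
    """Apply the interior stencil of M M_L^{-1} for bilinear elements on a
    uniform grid of spacing h: (1/36) * [1 4 1] (x) [1 4 1]."""
    weights = ((-1, 1.0), (0, 4.0), (1, 1.0))
    total = 0.0
    for i, wx in weights:
        for j, wy in weights:
            total += wx * wy * u(x + i * h, y + j * h)
    return total / 36.0

def truncation_error(h, x=0.7, y=0.4):
    """Stencil result minus u + (h^2/6) * Laplacian(u) for u = sin x sin y."""
    def u(s, t):
        return math.sin(s) * math.sin(t)
    laplacian = -2.0 * u(x, y)          # exact Laplacian of sin x sin y
    return mass_ratio_stencil(u, x, y, h) - (u(x, y) + h * h / 6.0 * laplacian)

if __name__ == "__main__":
    e_coarse = truncation_error(0.1)
    e_fine = truncation_error(0.05)
    # Halving h should reduce an O(h^4) remainder by about 2^4.
    print(e_coarse, e_fine, e_coarse / e_fine)
```

A ratio near 16 when $h$ is halved is consistent with the stated $O(h^4)$ remainder; on distorted meshes, as the text warns, this would not be expected to hold.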
But before describing the projection method, we make some remarks about these
DAE's—mostly discouraging:
Remarks:
(1) Even the steady Stokes equations no longer display a symmetric matrix.
(2) Steady state results are not independent of the mass matrix (unless lumping is
employed, which is of course still permissible—even advisable for steady solutions).
In fact, we emphasize that the sole purpose of the mass matrix trick is, as is that
of the projection method in general, to obtain more accurate transient solutions, as
demonstrated in Gresho and Chan (1990) and from which we show a sample result
in Figure 3.16-20, in which the tick marks between the two figures show the nodal
spacing in the x-direction. The significant improvement is rather obvious.
(3) Even the linear stability [$N(u) = 0$] of the DAE's, observed experimentally, is not
provable, at least not by us.
(4) Another useful interpretation of the modified momentum equation is obtained by
rewriting (3.16-399) as $\dot{u} + M^{-1}[K + N(u)]u + M_L^{-1}CP = M^{-1}f$, which, since $M^{-1}$
is dense, shows that the pressure gradient, and only the pressure gradient, is
approximated locally (a la finite differences) in order to compute the acceleration.
(5) As for the original GFEM DAE's, it is easy to show that the first of (3.16-399)
and (3.16-400) imply the second of (3.16-399) if and only if $C^Tu_0 = g_0$, a simple
exercise we leave for the reader.

Fig. 3.16-20 Snapshot of streamlines for vortex shedding at Re = 250; f = Q, 0.50, 0.75,
1.0. (a): Semi-consistent mass, (b) lumped mass. Tick marks denote element
lengths.
It may be worthwhile to return to Remark (3) above and point out that in spite of our
inability to perform the stability analysis, experimental results have never (yet)
generated unstable Stokes solutions for either TR or BE. This stability result can perhaps be
better appreciated, even for the fully non-linear case, if we examine the evolution of
the kinetic energy in the absence of forcing, obtained by forming the scalar product of
(3.16-399) with the velocity vector and utilizing $u^TN(u)u = 0$ for $N$ skew-symmetric
(which we here assume) to obtain
$$\dot{E} \equiv \frac{1}{2}\frac{d}{dt}\left(u^TMu\right) = -u^TKu - u^TMM_L^{-1}CP = -u^TKu - P^TC^TM_L^{-1}Mu$$ (3.16-401)
vis-a-vis the true GFEM result
$$\dot{E} = -u^TKu \le 0$$ (3.16-402)
since $K$ is positive-definite and, thus, (3.16-402) properly describes 'viscous decay.' The
remaining term on the RHS of (3.16-401) is indefinite (precluding a definite stability
assessment) but small, because $M_L^{-1}M \approx I + O(h^2)$ in the operator sense as shown
earlier for $MM_L^{-1}$, and $C^Tu = 0$, thus at least telling us that the indefinite term is small
in some sense. As a final 'persuasion,' the further rearrangement of (3.16-399) via
multiplication by $M_LM^{-1}$ leads to the following 'energy' equation:
$$\frac{1}{2}\frac{d}{dt}\left(u^TM_Lu\right) = -u^TM_LM^{-1}[K + N(u)]u = -u^TKu + O(h^2);$$ (3.16-403)
i.e., the pressure term is gone if we use the lumped mass kinetic energy, but now the
viscous term no longer guarantees decay. So, we have guaranteed viscous dissipation in
the natural (CM) norm with an indefinite pressure contribution on the one hand, and no
pressure contribution (as desired) but 'slightly' indefinite viscous 'dissipation' (for finite $h$)
in the LM norm, on the other hand. However, since in a finite-dimensional vector space all
norms are equivalent (i.e., $\alpha u^TMu \le u^TM_Lu \le \beta u^TMu$ and $a\,u^TM_Lu \le u^TMu \le b\,u^TM_Lu$
for some finite $\alpha$, $\beta$, $a$, $b$), we are not surprised that the DAE's have been observed to be
stable.
Now we can present the semi-implicit projection 2 algorithm. For simplicity of
presentation, we will utilize only the simplest (first-order) explicit and implicit ODE
methods, FE (with BTD in the $K$-matrix, per Section 3.16.5) and BE on diffusion, and
remark that one would be better advised to use AB2 or AB3 for advection and TR for
diffusion. We do what we do just to save 'ink.'
Given $P_n$ and $u_n$ with $C^Tu_n = g_n$, projection 2 starts at $n = 0$ and does:
Step 1. Solve
$$M\,\frac{\tilde{u}_{n+1} - u_n}{\Delta t} + K\tilde{u}_{n+1} + N(u_n)u_n + MM_L^{-1}CP_n = f_n$$
for the intermediate velocity, $\tilde{u}_{n+1}$; i.e., solve
$$(M + \Delta tK)\tilde{u}_{n+1} = Mu_n + \Delta t\left[f_n - N(u_n)u_n - MM_L^{-1}CP_n\right],$$ (3.16-404)
which is done quite efficiently using, for example, DSCG.
Step 2. Project $\tilde{u}_{n+1}$ to the divergence-free subspace;
$$u_{n+1} = \wp\tilde{u}_{n+1} + M_L^{-1}CA^{-1}g_{n+1},$$
or, recalling (3.16-14),
$$u_{n+1} = \wp\tilde{u}_{n+1} + v_{n+1},$$ (3.16-405)
where $\wp = I - M_L^{-1}CA^{-1}C^T$ and $A = C^TM_L^{-1}C$, which assures that $C^Tu_{n+1} = g_{n+1}$.
(See Appendix 3 for a detailed discussion of projections, wherein $\wp$ is
called $\wp_j$, the interpolation projection, below (A3.3-38).) This projection is
realized by
$$M_L\,\frac{u_{n+1} - \tilde{u}_{n+1}}{\Delta t} + C\,\frac{P_{n+1} - P_n}{2} = 0 \quad \text{and} \quad C^Tu_{n+1} = g_{n+1},$$ (3.16-406)
which itself is realized by the following sequential steps:
(i) Solve
$$A(P_{n+1} - P_n) = 2\left(C^T\tilde{u}_{n+1} - g_{n+1}\right)/\Delta t;$$ (3.16-407)
(ii) Compute the final velocity from
$$u_{n+1} = \tilde{u}_{n+1} - \frac{\Delta t}{2}M_L^{-1}C(P_{n+1} - P_n);$$ (3.16-408)
(iii) Update the pressure;
add $(P_{n+1} - P_n)$ to $P_n$. (3.16-409)
Step 3. Bump $n$ and go to (1).
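The three steps above can be sketched in a few lines of dense linear algebra. This is an illustrative sketch assuming NumPy, not the authors' implementation (which uses DSCG or a stored factorization of $A$): the function names, the optional `advect` callable standing in for $N(u_n)u_n$, and the small 1D-style demo matrices are all assumptions made here.

```python
import numpy as np

def projection2_step(M, ML, K, C, u, P, f, g_next, dt, advect=None):
    """One FE/BE step of discrete projection 2, (3.16-404) to (3.16-409).

    ML holds the diagonal of the lumped mass matrix as a 1D array;
    advect(u), if given, returns the vector N(u) u.
    """
    MLinv_C = C / ML[:, None]                  # M_L^{-1} C
    A = C.T @ MLinv_C                          # A = C^T M_L^{-1} C
    adv = advect(u) if advect is not None else np.zeros_like(u)
    rhs = M @ u + dt * (f - adv - M @ (MLinv_C @ P))
    u_tilde = np.linalg.solve(M + dt * K, rhs)                    # (3.16-404)
    dP = np.linalg.solve(A, 2.0 * (C.T @ u_tilde - g_next) / dt)  # (3.16-407)
    u_new = u_tilde - 0.5 * dt * (MLinv_C @ dP)                   # (3.16-408)
    return u_new, P + dP                                          # (3.16-409)

def demo_matrices(n=5, h=0.25):
    """Small illustrative matrices (not a real Stokes discretization)."""
    M = np.zeros((n, n))
    for e in range(n - 1):                     # 1D linear-element mass
        M[e:e + 2, e:e + 2] += (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])
    ML = M.sum(axis=1)                         # row-sum lumping
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    C = np.zeros((n, 2))
    C[0, 0], C[1, 0], C[1, 1], C[2, 1] = 1.0, -1.0, 1.0, -1.0
    return M, ML, K, C

if __name__ == "__main__":
    M, ML, K, C = demo_matrices()
    u, P = np.zeros(5), np.zeros(2)
    g = np.array([0.1, -0.2])
    u, P = projection2_step(M, ML, K, C, u, P, np.ones(5), g, dt=0.1)
    print(C.T @ u - g)                         # constraint residual
```

Whatever the intermediate velocity is, the returned velocity satisfies $C^Tu_{n+1} = g_{n+1}$ to round-off, since $C^T$ applied to (3.16-408) reproduces (3.16-407).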
Remarks:
(1) The sequential solution procedure is obvious—once it is realized that (3.16-404)
can be solved separately for each velocity component, or 'in parallel' on a more
modern computer.
(2) The scheme can be interpreted as a CM predictor and an LM corrector.
(3) In spite of our earlier discussions regarding slippery projections in the continuum,
it is clear from the above algorithm that we are enforcing the same BC's on $u$
as on $\tilde{u}$; i.e., we overspecify the 'no-slip' BC during the projection. That it is
'convenient' is obvious; that it is (usually) innocuous will be demonstrated later.
What happens is this: the code is smart enough to ignore the overspecified velocity
in the following sense: the first node in from $\Gamma_D$ will look like a slip velocity, and
there is no visible (resolved) BL.
(4) The projection in (3.16-405) is more properly described as an affine
transformation on $\tilde{u}_{n+1}$. It is a true projection when $g_{n+1} = 0$. But the affine transformation
could also be regarded as a projection in the sense that $u_{n+1} = \wp\tilde{u}_{n+1} + v_{n+1}$,
where $v_{n+1} = M_L^{-1}CA^{-1}g_{n+1}$, gives $\wp u_{n+1} = \wp(\wp\tilde{u}_{n+1} + v_{n+1}) = \wp\tilde{u}_{n+1} = u_{n+1} - v_{n+1}$
because $\wp v_{n+1} = 0$ and $\wp^2 = \wp$; thus, finally, $u_{n+1} = \wp u_{n+1} + v_{n+1}$.
(5) The projection method does not need $M^{-1}$ per the RHS of the PPE in (3.16-400), a
very important point; yet $P$ satisfies (3.16-400) (also important) to $O(\Delta t)$; see
Gresho and Chan (1990) for a proof.
(6) Start-up ($n = 0$) requires $P_0$, which must be obtained from the PPE given by
(3.16-400) at $t = 0$, which does involve $M^{-1}$. This is easily done as follows:
(i) Solve $Ma = f_0 - Ku_0 - N(u_0)u_0$ for $a$ via, for example, DSCG.
(ii) Solve $AP_0 = C^Ta - \dot{g}_0$ for $P_0$ using your favorite method.
[Another useful way to solve $Mx = b$ is via the iteration $M_Lx_{k+1} = b + (M_L - M)x_k$,
with $x_0 = 0$; convergence is usually adequate after several
iterations; see, e.g., Wathen (1991).]
(7) If $MM_L^{-1}$ is omitted, then the resulting method is unconditionally unstable, unless,
of course, $M$ is lumped. (The 'cheat' is really required!) This instability arises
because the projection does then not annihilate the previous pressure gradient,
which is part of $\tilde{u}_{n+1}$, as we show below.
(8) For that part of $f_n$ in (3.16-404) that corresponds to the normal force BC (NBC)
on open boundaries, it is usually a good idea to multiply this vector by $MM_L^{-1}$ to
better balance the (dominant) pressure portion of the normal force balance.
(9) Related to earlier discussion, the 2 in (3.16-406) through (3.16-408) can be (perhaps
should be, especially if the Euler methods are utilized) replaced by unity.
(10) Restating our earlier advice, for emphasis: use higher-order methods than Euler's.
(11) If this method was to be used as a time-marching-to-steady-state method, then we
recommend three changes: (i) lump the mass, (ii) use BE exclusively, and (iii) use
a better method.
(12) Solving the same problem on the same mesh two times—once with CM and once
with LM—can sometimes provide a simple test for a grid-converged solution; i.e.
they will then agree. This remark also applies to fully implicit methods, and even
to the transient AD equation of the previous chapter.
(13) Actually, (3.16-407) is solved most efficiently via multigrid (S. Turek and
L. Howell, personal communication), although our implementation has thus far
used only direct methods or DSCG—the former (with A stored in factored form)
always winning when main memory is large enough to store (the factored) A.
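The lumped-mass iteration quoted in Remark (6), $M_Lx_{k+1} = b + (M_L - M)x_k$ with $x_0 = 0$, is simple enough to sketch without any library. The example below assumes the consistent mass matrix of 1D linear elements on a uniform grid (a stand-in chosen for brevity, not the 2D matrices of the text); the function names are illustrative. For this particular matrix the error contracts by at least a factor of 2/3 per sweep in the maximum norm.

```python
def mass_matvec(x, h):
    """y = M x for the consistent mass matrix of 1D linear elements
    (uniform spacing h, no boundary conditions applied)."""
    n = len(x)
    y = [0.0] * n
    for e in range(n - 1):                   # element-by-element assembly
        y[e] += h / 6.0 * (2.0 * x[e] + x[e + 1])
        y[e + 1] += h / 6.0 * (x[e] + 2.0 * x[e + 1])
    return y

def solve_mass_by_lumping(b, h, iterations):
    """Approximate the solution of M x = b by the iteration
    M_L x_{k+1} = b + (M_L - M) x_k, starting from x_0 = 0."""
    n = len(b)
    ml = [h / 2.0] + [h] * (n - 2) + [h / 2.0]   # row sums of M
    x = [0.0] * n
    for _ in range(iterations):
        mx = mass_matvec(x, h)
        # Equivalent form: x <- x + M_L^{-1} (b - M x)
        x = [x[i] + (b[i] - mx[i]) / ml[i] for i in range(n)]
    return x

if __name__ == "__main__":
    h = 0.1
    x_true = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]
    b = mass_matvec(x_true, h)
    for k in (1, 5, 20):
        x = solve_mass_by_lumping(b, h, k)
        print(k, max(abs(xi - ti) for xi, ti in zip(x, x_true)))
```

Only diagonal solves and matrix-vector products are needed, which is why the iteration is attractive when $M$ is never assembled in factored form.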
o Convergence analysis. The discrete projection algorithm described above can be shown
to converge to the DAE's of (3.16-399) and (3.16-400) as $\Delta t \to 0$. We will need, in
addition to $pu_n = u_n - v_n$, which was derived in Remark (4) above,
$(M + \Delta t K)^{-1} = [M(I + \Delta t M^{-1}K)]^{-1}$
$= (I + \Delta t M^{-1}K)^{-1}M^{-1}$
$= [I - \Delta t M^{-1}K + \Delta t^2(M^{-1}K)^2]M^{-1} + O(\Delta t^3)$. (3.16-410)
Inserting $\tilde u_{n+1}$ from (3.16-404) into (3.16-405) gives
$u_{n+1} = p(M + \Delta t K)^{-1}[Mu_n + \Delta t(f_n - N(u_n)u_n - MM_L^{-1}CP_n)] + v_{n+1}$
$= p[u_n + \Delta t M^{-1}(f_n - Ku_n - N(u_n)u_n - MM_L^{-1}CP_n)] + v_{n+1} + O(\Delta t^2)$
$= u_n - v_n + \Delta t\,pM^{-1}(f_n - Ku_n - N(u_n)u_n) - \Delta t\,pM_L^{-1}CP_n + v_{n+1} + O(\Delta t^2)$
$= u_n + \Delta t\,pM^{-1}(f_n - Ku_n - N(u_n)u_n) + v_{n+1} - v_n + O(\Delta t^2)$ (3.16-411)
because $pM_L^{-1}C = 0$. Dividing by $\Delta t$ and passing to the limit gives
$\dot u = pM^{-1}(f - Ku - N(u)u) + \dot v$, (3.16-412)
which, we assert, is (3.16-399). To verify the assertion, we place P from (3.16-400) into
(3.16-399) to obtain
$M\dot u + [K + N(u)]u + MM_L^{-1}C(C^TM_L^{-1}C)^{-1}[C^TM^{-1}(f - Ku - N(u)u) - \dot g] = f$,
SOLUTION METHODS FOR THE SEMI-DISCRETIZED 765
which, using $p = I - M_L^{-1}C(C^TM_L^{-1}C)^{-1}C^T$ and $M_L^{-1}C(C^TM_L^{-1}C)^{-1}g = v$, rearranges to
(3.16-412), and we are done.
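The two properties of the discrete projector used in this argument, $pM_L^{-1}C = 0$ and $pu = u - v$ for any $u$ with $C^Tu = g$, are easy to check numerically. The sketch below is only an illustration under stated assumptions: random stand-in matrices replace the FEM gradient matrix and lumped mass matrix, and all sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_p = 8, 3                        # illustrative sizes, not from the text

C = rng.standard_normal((n_u, n_p))    # stand-in gradient matrix (full column rank assumed)
m_l = rng.uniform(0.5, 1.5, n_u)       # diagonal of a stand-in lumped mass matrix M_L
ML_inv = np.diag(1.0 / m_l)

S = C.T @ ML_inv @ C                   # C^T M_L^{-1} C
p = np.eye(n_u) - ML_inv @ C @ np.linalg.solve(S, C.T)

g = rng.standard_normal(n_p)
v = ML_inv @ C @ np.linalg.solve(S, g) # v = M_L^{-1} C (C^T M_L^{-1} C)^{-1} g

u = p @ rng.standard_normal(n_u) + v   # a vector satisfying C^T u = g

annihilates_gradient = np.allclose(p @ ML_inv @ C, 0.0)   # p M_L^{-1} C = 0
is_idempotent = np.allclose(p @ p, p)                     # p is a projector
projects_u = np.allclose(p @ u, u - v)                    # p u = u - v
```

If $M_L^{-1}$ is replaced by a different matrix in the gradient term only, the first check fails, which is the inconsistency noted in Remark (1) below.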
Remarks:
(1) If $MM_L^{-1}$ is omitted, then there will remain the non-zero term $-pM^{-1}CP$ on the RHS
of (3.16-412), thus showing inconsistency and probably explaining the instability
referred to above; the pressure gradient should project to zero.
(2) For further analysis and discussion of this and other projection methods, see Gresho
and Chan (1990) and Shen (1993), wherein a method called 'Projection 3' in Gresho
(1990) was shown to be unconditionally unstable.
o Overspecified BC's. It is both important and easy to demonstrate the utility of
overspecifying the projection by enforcing the 'no-slip' (i.e., specified tangential velocity) BC in
addition to the proper BC of specified normal velocity, and we show this in two ways:
1. Suppose we did permit slip during the projection. A possible and divergence-free
result of the projection might look like that shown ($Q_1Q_0$ element) on element (e) in
Figure 3.16-21. If, before taking the next step for $\tilde u_{n+1}$, we enforce the (required, for
proper vorticity generation) no-slip velocity, the above picture (for $u_n$) changes to that in
Figure 3.16-22, which is clearly not 'mass-consistent.' End of 'first way.'
2. Write the mass conservation equation for the 'same' element (e) in Figure 3.16-23,
both with slip and no-slip (and $n\cdot u = 0$ on $\Gamma$):
(a) Slip: $h(u_3 - u_4 + u_2 - u_1) + l(v_3 + v_4) = 0$;
(b) No slip: $h(u_3 - u_4) + l(v_3 + v_4) = 0$,
Fig. 3.16-21 Slippery but mass-consistent.
Fig. 3.16-22 No-slip but mass-inconsistent.
from which it is seen that, except in the improbable event that $u_2 = u_1$, a 'mass adjustment'
would be necessary when switching BC's from slip (the projection) to no-slip
(post-projection). End of second way.
It is also worthwhile demonstrating that such overspecification is innocuous in that
the projected (divergence-free) velocity fields for the legitimate (slip) and overspecified
(no-slip) BC's are the same, within the 'truncation' error of the method. The 'proper'
projection of $\tilde u$ is given by $\tilde u = u + \nabla\lambda$ and $\nabla\cdot u = 0$ in $\Omega$, $u\cdot n = w\cdot n$ on $\Gamma$, and
the overspecified projection replaces the BC by $u = w$ on $\Gamma$. The discrete realization of
these two projections will be presented for element (e) of Figure 3.16-24. The continuity
equation for element (e) is
$h(u_2 - u_1 + u_4 - u_3) + l(v_1 - v_3 + v_2 - v_4) = 0$, (3.16-413)
where in both cases we have $v_1 = w_1$ and $v_2 = w_2$ (the specified normal velocities). In the
second case we also have $u_1$ and $u_2$ specified (the tangential components of $w$ at nodes 1
and 2). For the first (slippery) case we have $u_1 = \tilde u_1 - (\lambda_O - \lambda_W)/l$ and
$u_2 = \tilde u_2 - (\lambda_E - \lambda_O)/l$, so that $u_2 - u_1 = \tilde u_2 - \tilde u_1 - (\lambda_E - 2\lambda_O + \lambda_W)/l$.
But we will just carry them as $u_1$ and $u_2$ for the time being, and specialize later. The
remaining equations needed are:
$u_3 = \tilde u_3 - [(\lambda_O - \lambda_W) + (\lambda_S - \lambda_{SW})]/2l$,
$u_4 = \tilde u_4 - [(\lambda_E - \lambda_O) + (\lambda_{SE} - \lambda_S)]/2l$,
$v_3 = \tilde v_3 - [(\lambda_W - \lambda_{SW}) + (\lambda_O - \lambda_S)]/2h$,
$v_4 = \tilde v_4 - [(\lambda_O - \lambda_S) + (\lambda_E - \lambda_{SE})]/2h$.
Fig. 3.16-23 One $Q_1Q_0$ element.
Fig. 3.16-24 A boundary 6-patch.
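Inserting these results into (3.16-413) yields
$h\left[u_2 - u_1 + \tilde u_4 - \tilde u_3 - \frac{(\lambda_E - 2\lambda_O + \lambda_W) + (\lambda_{SE} - 2\lambda_S + \lambda_{SW})}{2l}\right]$
$+ l\left[w_1 + w_2 - \tilde v_3 - \tilde v_4 + \frac{(\lambda_W - \lambda_{SW}) + 2(\lambda_O - \lambda_S) + (\lambda_E - \lambda_{SE})}{2h}\right] = 0$, (3.16-414)
which we rearrange and divide by $2l$ to get
$\frac{\lambda_W - \lambda_{SW} + 2(\lambda_O - \lambda_S) + (\lambda_E - \lambda_{SE})}{4h} = \frac{\tilde v_3 - w_1 + \tilde v_4 - w_2}{2}$
$+ \frac{h}{2}\,\frac{(\lambda_E - 2\lambda_O + \lambda_W) + (\lambda_{SE} - 2\lambda_S + \lambda_{SW})}{2l^2} + \frac{h}{2}\,\frac{\tilde u_3 - \tilde u_4 + u_1 - u_2}{l}$. (3.16-415)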
First note that the LHS approximates (and should converge to) $\partial\lambda/\partial y$ ($= \partial\lambda/\partial n$).
Next, recalling the (pre-projection) BC $\tilde u = w$ shows that the first term on the RHS
approximates $-h\,\partial\tilde v/\partial y$. Finally, it is clear that the remaining terms approximate,
respectively, $(h/2)\partial^2\lambda/\partial x^2$, $-(h/2)\partial\tilde u/\partial x$, and [from $(u_1 - u_2)/l$] either $-(h/2)\partial w_x/\partial x$
or $-(h/2)(\partial\tilde u/\partial x - \partial^2\lambda/\partial x^2) = -(h/2)(\partial w_x/\partial x - \partial^2\lambda/\partial x^2)$ for the no-slip (overspecified)
or slippery case, respectively. Thus the only difference between the two is the term
$(h/2)\partial^2\lambda/\partial x^2$, which, along with every other term on the RHS, vanishes with mesh
refinement so that in either case, the BC for the Lagrange multiplier associated with the
projection is $\partial\lambda/\partial n = 0$, and the highly 'convenient' overspecification of the Dirichlet
BC in the tangential direction during the projection is justified/vindicated. Finally, the
(important) vorticity 'injection' via a vortex sheet is well-approximated in either case.
Three final remarks on these BC's:
(1) One could justifiably promote the argument that the 'proper' tangential BC (i.e.,
none) is proper in that the subsequent process of reducing the resulting slip velocity
to zero is also perfectly legitimate because it is actually being applied to u, which
is not mass-consistent anyway.
(2) For $Q_1Q_0$, the legitimate (slippery) projection would reap an additional and quite
attractive benefit—the legitimate elimination of all spurious pressure modes; i.e.,
they are precluded (cf. Section 3.13.2b). Two 'wrongs' (projection method, slippery
walls) could indeed make a 'right'—and has (J. Schutt, personal communication).
The 'slippery walls', of course, are 'wrong' only if really needed for vorticity
production; they are right during the projection.
(3) Although $Q_1Q_0$ was used to 'prove' our assertions, we believe that most of the
results generalize to all other elements.
o Return to the BHE. It is of some interest to duplicate the analysis performed earlier
on the semi-discrete equations to see 'how' the FEM equations represent the biharmonic
pressure equation. To this end, we substitute $\tilde u_{n+1}$ from (3.16-408) into (3.16-404) and
multiply by $C^TM^{-1}$ to get, upon rearrangement, and replacing the factor of two by $\gamma$ in
(3.16-408), where $0 < \gamma \le 2$ per Remark (5) following (3.16-323), for generality:
$[C^TM_L^{-1}C + \Delta t\,C^TM^{-1}KM_L^{-1}C]P_{n+1}$
$= \gamma[C^TM^{-1}(f_n - Ku_{n+1} - N(u_n)u_n) - (g_{n+1} - g_n)/\Delta t]$
$- [(\gamma - 1)C^TM_L^{-1}C - \Delta t\,C^TM^{-1}KM_L^{-1}C]P_n$, (3.16-416)
wherein we note the approximations: $-Q^{-1}C^TM_L^{-1}C \sim \nabla^2$, $-Q^{-1}C^T \sim \mathrm{div}$, $-M^{-1}K \sim
\nu\nabla^2$, and $M_L^{-1}C \sim \nabla$, where Q is the pressure mass matrix. Thus, recalling that
$\nabla\cdot\nabla^2(\nabla P) = \nabla\cdot[\nabla\nabla\cdot(\nabla P) - \nabla\times\nabla\times(\nabla P)] = \nabla^4P$, $Q^{-1}C^TM^{-1}KM_L^{-1}C \sim \nu\nabla^4$, and we
see (up to the factor $Q^{-1}$) that (3.16-416) does correspond to (3.16-382), at least if we take
$\gamma = 2$. This result also shows why $\gamma = 1$ is more robust: it kills the term $-C^TM_L^{-1}CP_n$
on the RHS, and with it the tendency to make $2\Delta t$-oscillations.
o Temporal accuracy of projection 2. Before mentioning some numerical results, we opine
that it is generally not easy to perform a simple, short set of numerical experiments to
'cleanly' validate/support some convergence estimates in the field of numerical solutions
of PDE's. Many 'surprises' lurk in the CFD laboratory, and often extensive runs and
re-runs are required—and even then truly conclusive and general results are not easy to
obtain. And this opinion is probably a large understatement when it comes to projection
methods and their numerical evaluation; confusion still reigns. The 'numerical results'
referred to have already been discussed in Section 3.16.1d; there we showed 'good'
results for TR (second-order in $\Delta t$) and not so good for BE in the sense that the approach
to the theoretical behavior (first order) was attained painfully slowly (very small $\Delta t$
needed). But the good news from BE was second-order accuracy (close to TR in fact)
for 'larger' $\Delta t$, more like those used in practice.
o The pesky modes of $Q_1Q_0$. Since we have produced and promoted 'projection 2 via
$Q_1Q_0$,' it is natural to wonder if or how the LBB instability tendency for this element
manifests itself. The answer is simple: no problem. This fact has been known
experimentally for some time and has recently been explained theoretically by Griffiths and
Silvester (1994); in an extension of their LBB-mode analysis via the method of modified
equations that we have already summarized (Section 3.13.5k), they studied the projection
step of projection methods, with the following results:
1. The projection is stable in that the associated eigenproblem on the unit square, $Mu +
CP = \lambda M_Lu$ and $C^Tu = \lambda QP$, has $a_{mn} = \sqrt{\lambda_{m,n}(\lambda_{m,n} - 1)} = (m^2 + n^2)\pi^2$, which is
now independent of $h$ with, for each $(m, n)$, a pair of eigenvectors, one of which is
smooth (the 'physical' mode) and the other of which is oscillatory: $(-1)^{m+n}$ multiplying
the smooth mode, for $mh \ll 1$ and $nh \ll 1$.
2. The oscillatory (spurious) eigenvectors are [cf. (3.13-370)]
$(u_{ij}, v_{ij}, P_{ij})^T = (-1)^{i+j}\,(m\pi\sin im\pi h\cos jn\pi h,\; n\pi\cos im\pi h\sin jn\pi h,\; (1 - \lambda_{m,n})\cos(i - 1/2)m\pi h\cos(j - 1/2)n\pi h)^T$,
for $m, n = 0, 1, 2, \ldots$. (3.16-417)
The good news is stability (no eigenvalues go like $O(h)$), and the (slightly) bad news is
that the velocity parts are not, as for the Stokes LBB problem, $O(h^2)$ smaller than the
pressure parts. Our final remark is this: it must be the CB-part, $(-1)^{i+j}$, that (still) causes
the projection of $\tilde u$ to the divergence-free subspace, $u$, to occur without wiggles because
this has been our numerical experience; i.e., the intermediate velocity, $\tilde u$, is apparently
'sufficiently smooth' that its projection onto the 'bad' modes is very small.
o A fully implicit projection method. We (RLS and D. Veyret) have experimented with
a fully implicit second-order projection method with mostly favorable results. We report
that and planned future progress here, in summary form—using our modification of the
code PASTIS (Daniels, 1992, 1993). We begin with the linearized TR equations shown in
(3.16-237), with the skew-symmetric advection matrix, $N(u^*)$, being explicitly constructed
from the 'advective form' matrix:
$N(u^*) = \frac{1}{2}[N(u) - N^T(u)]$, (3.16-418)
from which the 'mostly favorable' descriptor originated. The results were favorable for all
'internal' flows tested (Dirichlet BC's all around), but unfavorable for a flow with an NBC
at outflow (vortex shedding); there were large spurious $2\Delta x$ 'waves' at and near the outlet.
The reason is that $N(u^*)$ involves velocities at nodes on the outflow boundary and the
boundary integral, per the $\beta = 1/2$ discussion in Chapter 2 (Sections 2.2.3 and 2.2.4), is
not properly accounted for. A 'fix' that is ad hoc and loses skew symmetry, but has worked
in practice, is to revert to the simple advective form at outflow boundary nodes, as was
done in Simo and Armero (1994; R. Taylor, personal communication) for their vortex
shedding simulations. The linearized $N(u^*)$ was then generated using $u^*_{n+1}$ from AB2
or AB3. Finally, the 'standard' projection 2 algorithm was implemented, with the only
change being that the intermediate velocity solution now generates an unsymmetric matrix;
$M + \Delta tK$ changes to $M + \Delta t[K + N(u^*)]$ in (3.16-404), except that the ODE integrator
was TR (as it should be!) rather than BE.
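The point of the skew-symmetric construction in (3.16-418) is that a skew-symmetric matrix contributes nothing to the discrete kinetic energy, since $u^TNu = 0$ for every $u$. A minimal sketch of that check follows; the random matrix is only a placeholder for an assembled 'advective form' matrix, and the helper name `skew_part` is hypothetical.

```python
import numpy as np

def skew_part(N):
    """Return (N - N^T)/2, the skew-symmetric part of a square matrix."""
    return 0.5 * (N - N.T)

rng = np.random.default_rng(1)
N_adv = rng.standard_normal((5, 5))    # placeholder for an advective-form matrix
N_skew = skew_part(N_adv)

u = rng.standard_normal(5)
energy_production = u @ N_skew @ u     # vanishes up to round-off
```

The ad hoc outflow fix described above replaces some rows of this matrix by their advective-form counterparts, so the identity no longer holds exactly there.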
What needs yet to be done to really get a good 'solver' is to take advantage of the
unconditional stability engendered by N(u*) and design a smart integrator—part of which
could involve the further potentially cost-effective feature: do not project every timestep;
do it only when 'needed.' Three obvious ideas are the following:
1. Noting from (3.16-314) that $\nabla\cdot\tilde u \sim \Delta t^2$ in the continuous projection method, vary the
timestep via
$\Delta t_{n+1} = \Delta t_n(\varepsilon/\|\nabla\cdot\tilde u_n\|)^{1/2}$, (3.16-419)
where $\varepsilon$ is a user-specified maximum allowable divergence.
2. Try an AB predictor-TR corrector scheme as done with the fully coupled system;
although theoretically 'lacking,' it may still give reasonable results, at least if At is not
too large.
3. Combine 1 and 2 in some clever fashion, and figure out a smart way to avoid the
projection step at each timestep; do it less frequently—especially for flows tending to a
steady state.
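Idea 1 amounts to a one-line timestep controller. The sketch below simply evaluates (3.16-419); the function name, the argument names, and the handling of a zero divergence norm are assumptions made for illustration and are not part of the text.

```python
import math

def next_timestep(dt_n, div_norm, eps):
    """Evaluate dt_{n+1} = dt_n * sqrt(eps / ||div u~_n||), cf. (3.16-419)."""
    if div_norm <= 0.0:
        # Assumption: with no measurable divergence, leave the step unchanged.
        return dt_n
    return dt_n * math.sqrt(eps / div_norm)

# Divergence four times the tolerance halves the step.
dt_new = next_timestep(0.1, 4.0e-4, 1.0e-4)
```

In practice one would probably also bound the growth of the step from above, given the warning about too-large $\Delta t$ in the final remark below.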
Final Remark on Smart Integrators Using the Projection Method:
Beware the biharmonic catastrophe for too-large $\Delta t$ selection; it could turn 'smart' to
stupid.
d. A sampling of projection methods used by others.
To 'prove' that projection methods are both 'attractive' and not easy to
understand—regardless of how easy they are to program—we conclude this section with a
sampling of the literature on this subject, spanning FEM, FDM, and spectral methods.
There is a seemingly endless string of papers on the subject and even as we write
this down, we are aware of more coming—from Heidelberg, in particular (A. Prohl and
R. Rannacher). The more mathematical of the publications will typically begin by making
one or another set of regularity assumptions and proceed from there to prove one or another
convergence result. The problem is that the results often, but not always, disagree with
those of others and with those from one or another numerical experiment. Anyway, we list
below, and comment upon, enough of these so that the interested reader may quickly(?)
catch up on the literature.
We start with J. Shen, who probably leads the pack in number of publications on the
subject, and we cite only some of them. In Shen (1992) he showed ('weakly') first-order
global accuracy (in $\Delta t$) for u and 'weakly 1/2' for P for projection 1 (Chorin's
method). He also showed 'strongly first-order' for u and 'weakly first-order' for P for
projection 2 via BE, the latter being close to the best you can hope for, since BE on
the full index 2 coupled system is first-order in u and P. (See, however, our numerical
results in Section 3.16.1d.) In Shen (1996) he addresses higher-order schemes (projection
2 and variants) and presents some numerical results using a (Legendre-Galerkin) spectral
method. He proves second-order for u, but only first-order for P for projection 2 and TR.
We shall return to this paper after citing a few others, because some of the issues are
related, but we also now mention his 'first attempt,' in Shen (1992).
Rannacher (1992) has gone further with projection 1, 'the classical projection method';
by reinterpreting the intermediate velocity ($\tilde u$) as the final, reported velocity, he placed the
method in the category of 'pressure stabilization methods' (see Section 3.13.3) and then
proved the optimal result: first-order for both u and P, the latter holding only away from
$\Gamma_D$. Near the boundary, convergence deteriorates to $O(\sqrt{\Delta t})$, à la Shen (R. Rannacher,
personal communication), owing to the spurious BL there.
In a series of papers, J.-L. Guermond has addressed some projection issues and obtained
some interesting results, using both finite elements and finite differences (in different
papers). One of the latest is also interesting; in Guermond and Quartapelle (1995) they
apply BE to projection 2, but then omit the last step (stripping off the gradient) to obtain
an approximate projection method. See also Guermond and Quartapelle (1996). Recalling
(3.16-398) and the discussion there, we repeat this 'analysis' for projection 2 via BE,
starting from
$\frac{\tilde u_{m+1} - u_m}{\Delta t} + \nabla P_m = \nu\nabla^2\tilde u_{m+1}$, (3.16-420)
which is solved with the BC's given in (3.16-372) and (3.16-373). The 'conventional'
projection step is then: find $u_{m+1}$ and $P_{m+1}$ from
$\tilde u_{m+1} = u_{m+1} + \Delta t\nabla(P_{m+1} - P_m)$ and $\nabla\cdot u_{m+1} = 0$, (3.16-421)
with $n\cdot u_{m+1} = n\cdot w_{m+1}$ on $\Gamma_D$ and $P_{m+1} = -F_n$ on $\Gamma_N$. This of course yields
$\nabla^2(P_{m+1} - P_m) = \nabla\cdot\tilde u_{m+1}/\Delta t$ in $\Omega$, (3.16-422)
$\partial(P_{m+1} - P_m)/\partial n = 0$ on $\Gamma_D$, and $P_{m+1} = -F_n$ on $\Gamma_N$. This is projection 2 via BE. The
'trick' played by Guermond is then to 'pretend' (3.16-421) does not exist after using it
to eliminate the divergence-free velocity in (3.16-420) and then just use (3.16-420) for
the (reported) velocity and (3.16-422) for the pressure update; they state: 'In this way the
end-of-step velocity is made to disappear from the algorithm, thus eliminating the weird
velocity space $Y_{0,h}$ from practical calculations.' The approximate projection algorithm is
then, given $\tilde u_m$ and $P_m$,
Step 1. Solve for $\tilde u_{m+1}$ from
$\frac{\tilde u_{m+1} - \tilde u_m}{\Delta t} + \nabla(2P_m - P_{m-1}) = \nu\nabla^2\tilde u_{m+1}$, (3.16-423)
which is the BE analog of (3.16-398) shown earlier for TR.
Step 2. Solve for $P_{m+1}$ from (3.16-422). Done.
Then they showed (stable, obviously) results for the $P_1P_1$ triangle, i.e., equal-order
interpolation, using the 'inconsistent' Laplacian ($\int\nabla\phi_i\cdot\nabla\phi_j$) for the Poisson equation.
Related (and earlier) papers are Guermond and Tenaud (1994) and Guermond (1994).
It now seems appropriate to revisit (3.16-398) and the discussion there, in light of
the contributions by Shen, Rannacher (et al.), and Guermond (et al.). They all seem to
opt for an approximate projection by reporting what we have been calling the
intermediate velocity. We now return to Shen (1996), who computed and compared (i) with
our version of projection 2, (ii) with the approximate version given in (3.16-398), and
with a third: change the Adams-Bashforth pressure integration to a forward Euler one;
i.e., replace $(3P_m - P_{m-1})/2$ by $P_m$. The reason he did this is that (3.16-398) gave only
first-order results for P (and second-order for u), which he blamed on the homogeneous
Neumann BC for P, and its associated boundary layer. His numerical results showed
essentially second-order for u and something like $O(\Delta t^{1.5\pm})$ for P, for all three methods!
These results also seem to justify the seemingly cavalier neglect of the divergence-free
velocity—the divergences were small enough not to be harmful, and they apparently did
not accumulate. These results, taken with the above, seem to imply that the ostensible
'gains' listed below (3.16-398) are not all realized; whereas there is no BHE and no vortex
sheets, there is still a BL and still BC 'problems.' We note in passing that Shen's pressure
accuracy away from $\Gamma_D$ went up to full second-order for projection 2 and for approximate
projection 2, (3.16-398), but not for the 'FE version' of the pressure. Interesting. The last
Shen paper we cite is Shen (1993), in which he too examines (briefly) projection 2 via
BE a la Guermond and Quartapelle (1995), although the purpose of the paper was to
prove that a higher-order-yet projection method, called 'projection 3' in Gresho (1990b),
is a loser—it is unconditionally unstable.
Moving on, we shall more quickly list the remaining, relevant projection
contributions that we are aware of, beginning with the first higher-order finite-difference method
(projection 2), of Van Kan (1986); this paper, and that of Bell et al. (1989) are
probably the two key finite-difference papers on projection 2. Both predict and demonstrate
second-order convergence for velocity; both did not report pressure accuracy.
The next important paper we mention (again) is that by E and Liu (1995); they even
address the intermediate BC 'issue' for projection 1, including what we earlier called
optimal BC's on u; cf. (3.16-325). They study Chorin's method ['classical' projection 1
(BE) with optimal BC] and Kim and Moin's (1985) method, which is Chorin's except for
an elevation from BE to TR. Their results are as follows: (i) for semi-discrete projection
1 with 'smooth initial data,' the velocity error ($\infty$-norm in time, $L^2$ in space) is $O(\Delta t)$,
and that for pressure is $O(\Delta t^{1/2})$; (ii) for semi-discrete projection 2, the analogous results
are $O(\Delta t^2)$ and $O(\Delta t)$, respectively; (iii) if, however, some non-local and generally
non-realizable-in-practice initial (t = 0) regularity assumptions are satisfied, such as (3.8-37),
then additional results are available (especially) for the pressure, outside of the spurious
numerical BL, improving the pressure error (pointwise in this case; i.e., stronger) for
projection 1 to $O(\Delta t)$ and that for projection 2 to $O(\Delta t^2)$. See the original paper for
details. See too their most recent paper [E and Liu (1996)] in which the velocity accuracy
applies right up to the boundary—in apparent disagreement with Shen and Rannacher.
The 'approximate factorization' approach taken by Dukowicz and Dvinsky (1992) to
obtain/derive higher-order projection methods (projection 2 and others) is novel; the paper
has a number of interesting ideas.
The finite-difference version of 'approximate projections' seems to have originated with
the paper by Almgren et al. (1996), although the earlier one by Dvinsky and Dukowicz
(1993) may have coined the phrase first. A related paper by Rider (1994) is interesting
in that it includes a large number of numerical experiments.
Another alternative method for deriving projection methods is shown by Perot (1993);
the result is similar to projection 2. It is based on approximate factorization like those
of Dukowicz and Dvinsky (1992), not approximate projection, as the final velocity is
discretely divergence-free; and it involves (implicitly, perhaps) a discrete BHE. Applied
to the Stokes system, $\dot u + Ku + GP = f$, $Du = g$, it gives, for constant $f$ and $g$ for
simplicity, the algorithm below:
Step 1. Solve
$\frac{\tilde u_{n+1} - u_n}{\Delta t} + K\left(\frac{\tilde u_{n+1} + u_n}{2}\right) = f$,
with P dropped, like projection 1.
Step 2. Solve $[DG + (\Delta t/2)DKG]P_{n+1} = (D\tilde u_{n+1} - g)/\Delta t$.
Step 3. Obtain the divergence-free velocity,
$u_{n+1} = \tilde u_{n+1} - \Delta t\left(I + \frac{\Delta t}{2}K\right)GP_{n+1}$,
which satisfies $Du_{n+1} = g$. The perturbation to the pressure gradient term,
a consequence of the approximate factorization, elevates the method to
second-order in velocity, and first-order in pressure, which Perot demonstrates
numerically.
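The three steps above can be exercised on a small algebraic stand-in for the Stokes system. The sketch below is an illustration only: K, G and D are random dense matrices (with $D = G^T$ assumed and K made symmetric positive definite), not assembled operators, and the sizes, timestep and variable names are hypothetical. It checks only the algebraic property stated in the text, $Du_{n+1} = g$, and says nothing about accuracy.

```python
import numpy as np

rng = np.random.default_rng(2)
n_u, n_p, dt = 10, 4, 0.05             # illustrative sizes and timestep

B = rng.standard_normal((n_u, n_u))
K = B @ B.T + n_u * np.eye(n_u)        # stand-in SPD viscous matrix
G = rng.standard_normal((n_u, n_p))    # stand-in discrete gradient
D = G.T                                # stand-in discrete divergence (D = G^T assumed)
f = rng.standard_normal(n_u)
g = np.zeros(n_p)
u_n = rng.standard_normal(n_u)
I = np.eye(n_u)

# Step 1: (u~ - u_n)/dt + K (u~ + u_n)/2 = f, with the pressure dropped
u_tilde = np.linalg.solve(I / dt + 0.5 * K, u_n / dt - 0.5 * (K @ u_n) + f)

# Step 2: [DG + (dt/2) DKG] P = (D u~ - g)/dt
P = np.linalg.solve(D @ G + 0.5 * dt * (D @ K @ G), (D @ u_tilde - g) / dt)

# Step 3: u_{n+1} = u~ - dt (I + (dt/2) K) G P
u_next = u_tilde - dt * ((I + 0.5 * dt * K) @ (G @ P))

constraint_residual = np.linalg.norm(D @ u_next - g)
```

The constraint holds because the operator applied to $P_{n+1}$ in Step 3, premultiplied by D, is exactly the matrix solved in Step 2.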
A collective account may be in order for both the Dukowicz and Dvinsky paper
mentioned earlier and the one by Perot: they all assert that the issues associated with
BC's for the intermediate velocity and for the pressure are obviated by their derivations
via 'matrix manipulations' of the discrete equations. We believe that these assertions, while
ostensibly true, are misleading in that the final discrete equations for both cases can be
studied near $\Gamma$ to see what BC's are 'built-in'; we believe that they will turn out to be the
same ones that we use in the simpler projection methods. Another interesting 'reaction' to
these papers is that of Rosenfeld (1996): 'Note that the pressure always converges with
second-order accuracy as well, contrary to the predictions of Dukowicz and Dvinsky and
Perot.'
Returning now to FEM, we mention two recent papers by Turek (1996, 1997), in which
many numerical comparisons are made and compared also with some fully coupled (DAE)
methods—and even a combination of the two in which the projection method is used as
a sort of 'preconditioner' for the fully coupled method.
We conclude this brief(?) review with a higher-order, spectral-element projection
method that introduces a new wrinkle: in Timmermans et al. (1996), the 'pressure
correction term,' $P_{m+1} - P_m$, in, for example, (3.16-378), was replaced by $P_{m+1} - P_m +
\nu\nabla\cdot\tilde u_{m+1}$; i.e., $P_{m+1} = P_m + 2\varphi/\Delta t - \nu\nabla\cdot\tilde u_{m+1}$. This 'improvement' does seem to help
the accuracy even though it was not rigorously justified theoretically. We should add that
Timmermans et al. did not use either TR or BE for the u step: they used BDF2 (see
too Minev and Gresho, 1998). Look for some recent additional work on this and related
methods by Prohl and Rannacher which will also improve the results of E and Liu (1995)
[R. Rannacher, personal communication]; see, in particular, Prohl (1996, 1997) and Prohl
and Rannacher (1997)—in which also a new variant, called 'Chorin-Uzawa,' is introduced
to virtually preclude the spurious boundary layers.
This concludes our excursion into 'work by others' on projection methods; hopefully
some of it is useful, perhaps leading some future genius to the 'ultimate projection' and
its unerring analysis.
3.16.7 Fully-Implicit Segregated Solution Methods—Transient
and Steady-State
The desirability of sequential solution methods with the concomitant significantly smaller
matrices is obvious from the viewpoint of computational cost—if not simplicity—at least
for large 3D problems. The desirability of fully implicit time-integration techniques, with
$\Delta t$ based on the 'physics,' is also (now) obvious. The method to be described below is
predicated (in part) on these features, but in the main part, on obtaining a good steady-
state 'solver'. It resembles the semi-implicit projection methods discussed above, but
generalizes them in several significant ways: (i) time integration is 'elevated' from semi-
to fully-implicit so that the CFL restriction is bypassed and smart timestep control is
possible; (ii) the ad hoc approximation of semi-consistent mass is obviated via the honest
use of fully consistent mass; (iii) there is no projection boundary layer and no spurious slip
velocity; and, finally, (iv) the method is, as mentioned above, most useful for attacking
the steady-state form of the equations ($\Delta t = \infty$ in what follows). While the method to
be described below finds its greatest utility in 3D simulations involving lots of coupling
(e.g., Boussinesq equations via both thermal and solutal natural convection problems—see
Volume II), wherein full/honest coupling of all conservation equations (especially in the
index 2 formulation) can easily overload even the biggest computers, we believe it
appropriate to introduce them here and, for ease of presentation, in 2D only. The extension to
3D is 'obvious' and that to coupled systems not difficult—and will be done in Volume II.
Besides, the intentional lack of coupling can cause convergence 'problems' when the
tight coupling should be respected owing to its importance (e.g., thermal convection; see
Volume II). Thus, while the presentation to follow 'merely' uncouples u, v, and P, it is
worth emphasizing that its rewards are better realized in 3D and when additional
transport equations are present (including turbulence 'transport' equations)—all of which are
segregated/uncoupled and solved sequentially—but repeatedly, via a new 'iteration loop,'
even for linear problems. One further simplification that we invoke below, again merely
to simplify the basic/conceptual ideas, is to employ implicit Euler (BE) for the time
integrations. Surely by now the reader realizes that: (i) it is not (usually not) the method
of choice and (ii) it is easy to convert the final equations from BE to, for example, TR
or BDF2. But the equations to follow are, necessarily, quite long even when written via
BE—so we hope the reader will: (i) forgive us, and (ii) not 'write code' using (only) BE.
The technique described below was devised by Haroutunian et al. (1993), partly to help
iterative solvers cope with unsymmetric matrices and partly to help 'stabilize' the overall
solution of non-linear algebraic equation systems. It involves the use of so-called 'implicit
relaxation' procedures on every transport equation, in which the 'relaxation factor' is an
inherent part of the solution procedure rather than only being explicitly applied at the end;
i.e., explicit relaxation means that $x^{k+1} = \omega x^k + (1 - \omega)x^{k+1/2}$, where $x^{k+1/2}$ represents
an 'intermediate' (temporary) update of x, and $\omega$ is the relaxation factor ($0 < \omega < 1$).
The implicit procedure is, in fact, an adaptation of old and successful FDM strategies
('SIMPLE,' 'TEACH,' etc.) in which diagonal dominance was the coveted attribute, and
implicit relaxation was employed during the iterative solution process.
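For contrast with the implicit procedure, the explicit relaxation formula quoted above is just a weighted average applied after the intermediate update has been computed. A minimal sketch, in which the function name and the sample vectors are hypothetical:

```python
def relax_explicit(x_old, x_half, omega):
    """Return omega * x^k + (1 - omega) * x^{k+1/2}, entry by entry."""
    return [omega * a + (1.0 - omega) * b for a, b in zip(x_old, x_half)]

# With omega = 0.5 the new iterate is the midpoint of the old and intermediate values.
x_new = relax_explicit([1.0, 2.0], [3.0, 6.0], 0.5)
```

Implicit relaxation instead builds the damping into the matrix being solved, which is what increases diagonal dominance for the iterative solver.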
The starting point of the segregated solution method is an iterative solution of the BE
equation (3.16-245):
$\left[\frac{1}{\Delta t}M + K + N(u^k_{n+1})\right]u^{k+1}_{n+1} + CP^{k+1}_{n+1} = \frac{1}{\Delta t}Mu_n + f_{n+1} = b_{n+1}$, (3.16-424)
$C^Tu^{k+1}_{n+1} = g_{n+1}$, (3.16-425)
which could (and should) also be construed as a solution method for the steady equations
by setting $\Delta t = \infty$ and dropping the n + 1 subscripts, a property not shared by TR. But
these linear equations are still fully coupled, just what we wanted to avoid, which we
now do, beginning with an explication of the implied (but not used!) equation for the
pressure,
$[C^TA^{-1}(u^k_{n+1})C]P^{k+1}_{n+1} = C^TA^{-1}(u^k_{n+1})b_{n+1} - g_{n+1}$, (3.16-426)
where $A(u) = \Delta t^{-1}M + K + N(u)$. Henceforth, we shall suppress the temporal indices,
n and n + 1, for notational simplicity. Also for simplicity, we shall denote $A(u^k)$ by $A_k$.
The basic idea will be presented first (frills later) and it is this: noting that (3.16-424)
and (3.16-425) imply (3.16-426), replace (3.16-425) by an approximation to (3.16-426)
(via $A \to \tilde A$, where $\tilde A$ approximates $A$ and is easy to invert, and will be discussed below)
and iterate between (3.16-424) and the approximation to (3.16-426) in such a way that
convergence 'rapidly' occurs and in such a way that (3.16-425) will still be satisfied.
One way to assure a divergence-free result is to perform a projection during each
iteration, leading to the following algorithm, called the 'pressure projection' algorithm in
Haroutunian et al. (1993): given $u^0$, do for k = 0, 1, 2, ...:
Step 1. Solve
$(C^T\tilde A_k^{-1}C)P^{k+1} = C^T\tilde A_k^{-1}(b - A_ku^k)$, (3.16-427)
and note that (3.16-426) is recovered if $\tilde A = A$ and if $C^Tu^k = g$.
Step 2. Solve
$A_ku^{k+1/2} = b - CP^{k+1}$. (3.16-428)
Step 3. Project: i.e.,
(a) Solve $(C^T\tilde A_k^{-1}C)\lambda = C^Tu^{k+1/2} - g$. (3.16-429)
(b) Compute $u^{k+1} = u^{k+1/2} - \tilde A_k^{-1}C\lambda$, (3.16-430)
where $\lambda$ is the Lagrange multiplier associated with the projection (see
Appendix 3); it is zero upon convergence.
Step 4. If $\|\lambda\| < \varepsilon$, stop; else update $A_k$ and $\tilde A_k$ and go to Step 1.
Note that, upon convergence, both $C^Tu = g$ and $A(u)u + CP = b$, as desired; also,
(3.16-427) is (still) satisfied, for any $\tilde A(u)$, but it is no longer relevant. Note too that
there are two Poisson-like equations per (non-linear) iteration, one for $P^{k+1}$ and one for
$\lambda$; clearly the method will only be cost-effective if a 'small' number of iterations is needed.
It is also noteworthy that even steady Stokes flow ($A = K$) requires iterations (two PPE's
plus one Poisson equation for each velocity component, per iteration), whereas, if $\tilde A = A$,
one iteration would suffice, so that the decoupling is not without some added cost. (It
is also worth mentioning that the algorithm was not designed with linear problems in
mind.)
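One pass through Steps 1 to 3 of the 'pressure projection' algorithm can be sketched for a fixed (linear) A as follows. This is a sketch under stated assumptions, not the implementation of Haroutunian et al.: the matrices are small random stand-ins, $\tilde A$ is taken as the diagonal of absolute row sums, dense solves are used throughout, and the function name `pressure_projection_step` is hypothetical. Only two algebraic facts are illustrated: the projected velocity satisfies $C^Tu = g$ after every pass, and the solution of the coupled system is a fixed point. Convergence of the iteration itself is not claimed.

```python
import numpy as np

def pressure_projection_step(A, A_tilde_diag, C, b, g, u):
    """One pass of Steps 1-3 for fixed A; returns (u_new, P, lam)."""
    At_inv_C = C / A_tilde_diag[:, None]       # A~^{-1} C for a diagonal A~
    S = C.T @ At_inv_C                         # C^T A~^{-1} C
    # Step 1: pressure from the approximate pressure equation (3.16-427)
    P = np.linalg.solve(S, C.T @ ((b - A @ u) / A_tilde_diag))
    # Step 2: momentum solve with the new pressure (3.16-428)
    u_half = np.linalg.solve(A, b - C @ P)
    # Step 3: project back onto C^T u = g (3.16-429), (3.16-430)
    lam = np.linalg.solve(S, C.T @ u_half - g)
    u_new = u_half - At_inv_C @ lam
    return u_new, P, lam

rng = np.random.default_rng(3)
n_u, n_p = 8, 3                                # illustrative sizes
B = rng.standard_normal((n_u, n_u))
A = B @ B.T + n_u * np.eye(n_u)                # stand-in for A_k
C = rng.standard_normal((n_u, n_p))
b = rng.standard_normal(n_u)
g = rng.standard_normal(n_p)
A_tilde_diag = np.abs(A).sum(axis=1)           # absolute row sums, one entry per row

u1, P1, lam1 = pressure_projection_step(A, A_tilde_diag, C, b, g, np.zeros(n_u))
```

In a non-linear setting A and its diagonal approximation would be rebuilt from the new velocity before the next pass, as in Step 4.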
What can be said about convergence of the iterations? Is convergence guaranteed?
What is the convergence rate? We shall brush these aside—at least for now—to get on
to an improved algorithm that is even harder to analyze (yet usually performs better).
Besides, until $\tilde A$ is defined, it is impossible to answer these questions.
The 'final' algorithm introduces the implicit relaxation referred to above (via the
$\alpha$-terms, with $\alpha$ and $\tilde A$ defined below) and further decouples the equations by clearly
segregating the u and v momentum equations. It is this: for k = 0, 1, 2, ..., do [with $u^0$
and $v^0$ given, and $A_{xk} = A_x(u^k)$, etc.]:
Step 1. Solve
$(C_x^T\tilde A_{xk}^{-1}C_x + C_y^T\tilde A_{yk}^{-1}C_y)P^{k+1} = C_x^T\tilde A_{xk}^{-1}(b_x - A_{xk}u^k) + C_y^T\tilde A_{yk}^{-1}(b_y - A_{yk}v^k)$. (3.16-431)
Step 2. Solve
$\left(\frac{\alpha_u}{1 - \alpha_u}\tilde A_{xk} + A_{xk}\right)u^{k+1/2} = b_x - C_xP^{k+1} + \frac{\alpha_u}{1 - \alpha_u}\tilde A_{xk}u^k$. (3.16-432)
Step 3. Solve
$\left(\frac{\alpha_v}{1 - \alpha_v}\tilde A_{yk} + A_{yk}\right)v^{k+1/2} = b_y - C_yP^{k+1} + \frac{\alpha_v}{1 - \alpha_v}\tilde A_{yk}v^k$. (3.16-433)
Step 4. Project: i.e.,
(a) Solve (3.16-429):
$(C_x^T\tilde A_{xk}^{-1}C_x + C_y^T\tilde A_{yk}^{-1}C_y)\lambda = C_x^Tu^{k+1/2} + C_y^Tv^{k+1/2} - g$. (3.16-434)
(b) Compute
$u^{k+1} = u^{k+1/2} - \tilde A_{xk}^{-1}C_x\lambda$. (3.16-435)
(c) Compute
$v^{k+1} = v^{k+1/2} - \tilde A_{yk}^{-1}C_y\lambda$. (3.16-436)
Step 5. If $\|\lambda\| \le \varepsilon$, stop; else update $A_k$ and $\tilde A_k$, increment k, and go to Step 1.
Remarks:
(1) Upon convergence, the desired equations (3.16-424) and (3.16-425), have been
solved, regardless of the choice of α's or Ã, as required.
(2) Any or all of Steps 1, 2, 3, and 4(a) could also define 'inner iteration' steps if solved
by an iterative method.
(3) The α's are between 0 and 1, the lower bound removing relaxation and the upper
bound giving no change; hopefully there is an optimum value.
(4) The algorithm can also be used to attack the steady equations; simply drop the mass
matrix terms.
(5) The (transient) algorithm can also utilize the predictor-corrector-variable-Δt
techniques discussed earlier for the fully coupled solution methods.
(6) The 'proper' α can be very useful to 'stabilize' (read: make converge) the non-linear
equations themselves.
(7) When the convergence test is passed (||λ|| < ε), either the timestep has been
completed or the steady solution (no mass matrix) has been found.
(8) The stopping criterion could just as well be done using ||δu|| and ||δv|| or
||C^T u − g||, and probably better using 'relative' norms; see Section 3.16.4. Safest,
of course, is to 'test on all.'
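Steps 1 through 4 and Remark (1) can be sketched numerically as follows. This is only a minimal illustration, assuming a small dense linear test problem (A independent of u and v) with randomly generated matrices; the names `segregated_pass`, `abs_row_sum` and `random_spd` are hypothetical, and the diagonal approximation is the absolute row sum described in this section. The check is that the coupled solution is a fixed point of one pass for arbitrary α's, and that the projected velocities satisfy the constraint after any pass.

```python
import numpy as np

def abs_row_sum(A):
    """Diagonal entries of the approximate matrix: row sums of |A_ij|."""
    return np.abs(A).sum(axis=1)

def segregated_pass(Ax, Ay, Cx, Cy, bx, by, g, u, v, alpha_u, alpha_v):
    """One pass through Steps 1-4 with Ax, Ay frozen at the current iterate."""
    dx, dy = abs_row_sum(Ax), abs_row_sum(Ay)
    # Poisson-like matrix C^T A~^{-1} C, shared by Step 1 and Step 4(a).
    L = Cx.T @ (Cx / dx[:, None]) + Cy.T @ (Cy / dy[:, None])
    # Step 1: pressure equation.
    rhs = Cx.T @ ((bx - Ax @ u) / dx) + Cy.T @ ((by - Ay @ v) / dy)
    P = np.linalg.solve(L, rhs)
    # Steps 2 and 3: implicitly relaxed, segregated momentum equations.
    ru, rv = alpha_u / (1.0 - alpha_u), alpha_v / (1.0 - alpha_v)
    u_half = np.linalg.solve(ru * np.diag(dx) + Ax, bx - Cx @ P + ru * dx * u)
    v_half = np.linalg.solve(rv * np.diag(dy) + Ay, by - Cy @ P + rv * dy * v)
    # Step 4: projection of the intermediate velocities.
    lam = np.linalg.solve(L, Cx.T @ u_half + Cy.T @ v_half - g)
    return u_half - (Cx @ lam) / dx, v_half - (Cy @ lam) / dy, P, lam

# Hypothetical small linear test problem (assumption: random data, fixed seed).
rng = np.random.default_rng(0)
n, m = 6, 3

def random_spd(k):
    B = rng.standard_normal((k, k))
    return B @ B.T + k * np.eye(k)

Ax, Ay = random_spd(n), random_spd(n)
Cx, Cy = rng.standard_normal((n, m)), rng.standard_normal((n, m))
bx, by, g = rng.standard_normal(n), rng.standard_normal(n), rng.standard_normal(m)

# Reference: direct solve of the fully coupled saddle-point system.
Z = np.zeros((n, n))
K = np.block([[Ax, Z, Cx], [Z, Ay, Cy], [Cx.T, Cy.T, np.zeros((m, m))]])
ref = np.linalg.solve(K, np.concatenate([bx, by, g]))
u_ref, v_ref, P_ref = ref[:n], ref[n:2 * n], ref[2 * n:]

# Starting from the coupled solution, one pass should return it unchanged.
u1, v1, P1, lam1 = segregated_pass(Ax, Ay, Cx, Cy, bx, by, g, u_ref, v_ref, 0.5, 0.7)

# Starting from zero, the projected velocities already satisfy the constraint.
u2, v2, _, _ = segregated_pass(Ax, Ay, Cx, Cy, bx, by, g,
                               np.zeros(n), np.zeros(n), 0.5, 0.7)
residual = Cx.T @ u2 + Cy.T @ v2 - g
```

The sketch says nothing about whether or how fast the iteration converges from an arbitrary starting guess; it only illustrates that the converged state does not depend on the α's or on the diagonal approximation.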
It remains to define the approximations to the A-matrix and the relaxation factors. The
A-matrix is approximated by—no surprise—a diagonal matrix that is (surprise?) formed
by summing the absolute values of each row, the hope being that at least some semblance
of the actual A-matrix will be contained in the result. (It will at least be dimensionally
correct!) Thus,
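\tilde{A}_{xk} = \tilde{A}_x(u^k_{n+1}, v^k_{n+1}) = \mathrm{diag}\Big\{ \sum_j \big| (A_x(u^k_{n+1}, v^k_{n+1}))_{ij} \big| : i = 1, 2, \ldots \Big\}, (3.16-437)
\tilde{A}_{yk} = \tilde{A}_y(u^k_{n+1}, v^k_{n+1}) = \mathrm{diag}\Big\{ \sum_j \big| (A_y(u^k_{n+1}, v^k_{n+1}))_{ij} \big| : i = 1, 2, \ldots \Big\}, (3.16-438)
and we note the following features: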
1. In the Δt → 0 limit, it becomes a sort of row-sum mass lumping, except for the absolute
values. In contrast to conventional mass lumping, this 'row-sum' lumping has no effect
on the accuracy of either transient or steady-state solutions—at least when iterating to
convergence.
2. In a 'worst case' of large Re and steady flow, A(u) is nearly skew-symmetric, and it
is not obvious how any diagonal matrix could approximate it.
3. The Ã described here is only one of an infinite number of possible choices. We implore
that any reader who finds a (significantly) better Ã contact the authors immediately!
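The absolute-value row-sum construction, and the 'worst case' noted in feature 2, can be illustrated with a short sketch. The two 3 x 3 matrices below are made-up examples (a diffusion-like symmetric matrix and an advection-like skew-symmetric one), and `row_sum_abs_diagonal` is a hypothetical helper name.

```python
import numpy as np

def row_sum_abs_diagonal(A):
    """Diagonal matrix whose i-th entry is the sum over j of |A_ij|."""
    return np.diag(np.abs(A).sum(axis=1))

# Illustrative (assumed) matrices, not taken from any particular discretization.
A_diffusion = np.array([[2.0, -1.0, 0.0],
                        [-1.0, 2.0, -1.0],
                        [0.0, -1.0, 2.0]])
A_advection = np.array([[0.0, 1.0, 0.0],
                        [-1.0, 0.0, 1.0],
                        [0.0, -1.0, 0.0]])   # skew-symmetric: zero diagonal

D_diffusion = row_sum_abs_diagonal(A_diffusion)
D_advection = row_sum_abs_diagonal(A_advection)
```

For the skew-symmetric example the true diagonal is zero while the approximation is strictly positive, so the approximation is invertible but bears little resemblance to the matrix it replaces.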
There are several ways to select the implicit relaxation factors, α_u and α_v:
1. Simply choose the optimal constant values for each. Since, however, these values are
usually not known, other methods are also listed.
2. A local Reynolds-number-based scheme. Here the α's are functions of element (grid)
Reynolds number, Re_e = u_e l_e/ν, where u_e is the average velocity in element e and l_e =
√A_e approximates the element 'size' (A_e is element area). The α's are then computed
locally after each iteration via α_u = (α_min + Re_e α_max)/(1 + Re_e), where again α_min and α_max
are user-specified. This method uses a 'large' α for advection-dominated regions and a
small one where the flow is diffusion-dominated.
3. Recent experience with the implicit relaxation scheme suggests that a blend of dynamic
element-matrix-based implicit relaxation and explicit relaxation of the non-linear iterations
may be an optimal strategy .... This (at the time of writing) is current research!!
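The local Reynolds-number-based rule of item 2 is simple enough to sketch directly. The function name `local_relaxation` and the sample numbers are assumptions made for illustration only; the formula itself is the one quoted above.

```python
import numpy as np

def local_relaxation(u_e, area_e, nu, alpha_min, alpha_max):
    """Element-wise relaxation factor from the element Reynolds number."""
    l_e = np.sqrt(area_e)                 # element 'size' from element area
    re_e = np.abs(u_e) * l_e / nu         # element (grid) Reynolds number
    return (alpha_min + re_e * alpha_max) / (1.0 + re_e)

# Hypothetical element data: stagnant, moderate and advection-dominated elements.
u_e = np.array([0.0, 1.0, 1000.0])
alpha = local_relaxation(u_e, area_e=0.01, nu=0.01, alpha_min=0.1, alpha_max=0.9)
```

With these sample values the element Reynolds numbers are 0, 10 and 10^4, so the factor moves from α_min in the stagnant element toward α_max in the advection-dominated one.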
More remarks on segregated solvers:
Remarks:
(1) It may be useful, sometimes, to employ some explicit relaxation on the pressure
update, via α_p P^k + (1 − α_p)P^{k+1} → P^{k+1}; sometimes this will help the 'non-linear'
iterations converge.
(2) Another segregated solution method that sometimes works well, called 'pressure
update' (PU) by Haroutunian et al. (1993), from which most of this section was
obtained, can be derived via a few simple modifications to the projection method
described above:
(i) Add a 'penalizing' term to the RHS of the pressure equation (3.16-431), in the
form λ_p(C^T u^k_{n+1} − g_{n+1}). Experience indicates that λ_p = 0.15 is a reasonable
value.
(ii) Omit the projection step.
(iii) Test for convergence via ||C^T u^{k+1}_{n+1} − g_{n+1}|| ≤ ε.
(3) The SIMPLE algorithm (PC; pressure correction) in Haroutunian et al. (1993) is no
longer recommended.
Remarks on Remark (2):
(1) Since the pressure and λ equations are usually the most time-consuming part of each
iteration, this method has the potential for being much more cost-effective, and is
if not too many more iterations are required.
(2) The resemblance of this scheme to the semi-implicit PPE scheme of Gresho and Chan
(1990) is interesting; it generalizes and improves that one via penalizing any spurious
divergence—a technique similar to that called 'divergence cleaning' (Ramshaw,
1983).
Final Remark:
Recalling that iteration to completion (convergence) yields the associated underlying DAE
method (e.g., TR or BE), it is also worth pointing out that the segregated solution method
sometimes approaches another popular method in the other limit, for a time-dependent
simulation: one pass through the algorithm gives a result that, at least for small Δt
(time-accurate) simulations, looks (and behaves) very much like 'projection 2.' This
assertion/observation requires two important 'facts' [k = 0 in (3.16-431) through (3.16-438)]:
(i) ||A_{x0} u^{1/2}|| ≫ α_u/(1 − α_u) ||Ã_{x0}(u^{1/2} − u^0)||, where u^0 = u_n, u^{1/2} is taken to be ũ_{n+1}, and
u^1 is taken to be u_{n+1}; and ditto for v; (ii) the pressure from the 'PPE' [(3.16-431)], with
P^1 taken to be P_n, is 'the same as' the pressure from the projection 2 Lagrange multiplier;
namely, (3.16-407).
3.16.8 A Fractional-Step (Index 2) Method
In a recent series of papers (Rannacher, 1989, 1993; Müller et al., 1995), Rannacher
et al. argue convincingly that (his version/adaptation of) a special scheme designed by
R. Glowinski, which is called the fractional-step θ-scheme (FSθ), is one that merits
serious consideration by the CFD community. This method, '... which seems to have
the potential to become the winner in this race...' (Rannacher, 1993), is second-order
accurate, like TR and BDF2 but, unlike TR, it exhibits strong A-stability and, unlike
BDF2, it is only lightly dissipative for purely hyperbolic problems. He argues against TR
because any high-frequency 'noise' that is injected into the solution (e.g., via too crude
a solution of the equations, linear or non-linear, within a timestep) will be too slowly
damped. But we note in passing that we know of few complaints by those who have
employed our variable-step TR method [e.g., Crochet et al. (1983, 1985); Kheshgi and
Scriven (1984); Bixler and Benner (1985); Keunings (1986); Gartling (1987); Derby et al.
(1987); Ladeinde and Torrence (1990); FIDAP (Fluid Dynamics International, 1993); and
Basaran and De Paoli (1994)]; in fact, we cite D. Gartling (1994, personal
communication), who wrote the NACHOS code and other codes that use the variable-step TR
integrator (only NACHOS solves the NSE's), which has numerous users: 'The algorithm
works well—we are happy with it. We seldom if ever run the BE version.'
Nevertheless, R. Rannacher (1994, personal communication) argues that FSθ may still beat TR in
cost-effectiveness because its good damping properties will permit equivalent accuracy
using larger timesteps. We shall return to this issue after describing the method, which
we do largely following Rannacher rather than Glowinski in that Rannacher considers it
as another ODE method. Rannacher attributes the FSθ scheme to Glowinski via Bristeau
et al. (1987), but Glowinski himself (Glowinski, 1991) nicely summarizes the scheme as
'a variant of the Peaceman-Rachford scheme,' and refers to even earlier references, e.g.,
Bristeau et al. (1985); the earliest being Glowinski (1984), and the one with the most
appropriate title is Glowinski (1986): 'Splitting Methods for the Numerical Solution of
the Incompressible Navier-Stokes Equations.' For a recent analysis of the method, see
Kloucek and Rys (1994). It resembles a diagonally implicit Runge-Kutta method in that
there are three (implicit) 'stages' (fractional steps), all of which taken together constitute
a 'timestep.' Applied first to the (somewhat special) non-linear ODE system described by
\dot{y} + A(y) y = f(t), (3.16-439)
the 'general' FSθ scheme is:
1. Solve for y_{n+θ} from
[1 + \alpha\theta\Delta t A(y_{n+\theta})] y_{n+\theta} = [1 - (1-\alpha)\theta\Delta t A(y_n)] y_n + \theta\Delta t f_n. (3.16-440)
2. Solve for y_{n+1-θ} from
[1 + (1-\alpha)(1-2\theta)\Delta t A(y_{n+1-\theta})] y_{n+1-\theta} = [1 - \alpha(1-2\theta)\Delta t A(y_{n+\theta})] y_{n+\theta} + (1-2\theta)\Delta t f_{n+1-\theta}. (3.16-441)
3. Solve for y_{n+1} from
[1 + \alpha\theta\Delta t A(y_{n+1})] y_{n+1} = [1 - (1-\alpha)\theta\Delta t A(y_{n+1-\theta})] y_{n+1-\theta} + \theta\Delta t f_{n+1-\theta}, (3.16-442)
where 0 < θ < 1 and 0 < α < 1 are parameters. Glowinski (1991) shows, for the linear
case (A is a constant and SPD matrix), by comparing the overall amplification factor
(replace A by λ, an eigenvalue of A, and set f to zero), ξ = y_{n+1}/y_n, to e^{−λΔt}, where
\xi = \frac{[1 - (1-\alpha)\theta\lambda\Delta t]^2 [1 - \alpha(1-2\theta)\lambda\Delta t]}{[1 + \alpha\theta\lambda\Delta t]^2 [1 + (1-\alpha)(1-2\theta)\lambda\Delta t]}, (3.16-443)
that the scheme is second-order accurate if either
\theta = 1 - 1/\sqrt{2} = 0.2929 (3.16-444)
or
\alpha = 1/2, (3.16-445)
the first of which is the 'interesting' one. (For α = 1/2, the scheme approaches TR in
either the θ → 0 or θ → 1 limit, and is TR at Δt/2 if θ = 1/2 and is TR at Δt/3 if
θ = 1/3; none of these special cases, all TR, is worthwhile. In fact, α = 1/2 is always
a variant of TR.)
If neither is true, then the scheme is doomed to first-order accuracy. But the choice
α = 1/2 also gives lim_{λΔt→∞} ξ = −(1 − α)/α = −1, which is A-stability (like TR, a
'disadvantage'). So the choice of θ in (3.16-444) is made. Next, it is observed that only if
\alpha = (1 - 2\theta)/(1 - \theta) = 2 - \sqrt{2} = 2\theta = 0.5858 (3.16-446)
will the coefficient matrix in (3.16-440) through (3.16-442) for the special-but-important
case of a constant matrix, A, give the same linear system to solve in each of the three
steps. Since this choice also gives lim_{λΔt→∞} ξ = −1/√2 = −0.7071, which means that
the ODE method is then strongly A-stable (but not L-stable or even stiffly stable), this
is deemed to be a good choice, especially by Rannacher. [Large ωΔt computations of
dy/dt = iωy will give y_{n+1} = (−0.7071)^{n+1} y_0 rather than y_{n+1} = (−1)^{n+1} y_0 of TR and y_{n+1} = 0
of BDF2.]
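The properties claimed for the amplification factor (3.16-443) are easy to check numerically. The sketch below is a hedged illustration: the function name `xi` and the comparison pair (θ, α) = (0.25, 0.6) are arbitrary choices made here, not taken from the references.

```python
import numpy as np

def xi(z, theta, alpha):
    """Amplification factor (3.16-443) with z standing for lambda*dt."""
    num = (1 - (1 - alpha) * theta * z) ** 2 * (1 - alpha * (1 - 2 * theta) * z)
    den = (1 + alpha * theta * z) ** 2 * (1 + (1 - alpha) * (1 - 2 * theta) * z)
    return num / den

theta_opt = 1.0 - 1.0 / np.sqrt(2.0)                 # (3.16-444)
alpha_opt = (1 - 2 * theta_opt) / (1 - theta_opt)    # (3.16-446)

z_small = 1.0e-3
err_optimal = abs(xi(z_small, theta_opt, alpha_opt) - np.exp(-z_small))
err_generic = abs(xi(z_small, 0.25, 0.6) - np.exp(-z_small))   # neither condition met

z_large = 1.0e12
limit_optimal = xi(z_large, theta_opt, alpha_opt)    # expected near -1/sqrt(2)
limit_half = xi(z_large, theta_opt, 0.5)             # expected near -1
```

For small λΔt the optimal pair leaves only a third-order discrepancy from the exponential, whereas the generic pair leaves a second-order one; for large λΔt the limits are the two values quoted in the text.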
Remarks:
(1) Glowinski and Rannacher do not seem to agree on the sampling points for f(t)
when solving (3.16-440) through (3.16-442)—which is the R2 method; Glowinski
uses f_{n+θ} in the first stage and f_{n+1} in the third.
(2) The amplification factor for the (constant coefficient) FSθ scheme (for optimal θ, α)
is easily found to be (to four digits)
\xi = \frac{(1 - 0.1213\lambda\Delta t)^2 (1 - 0.2426\lambda\Delta t)}{(1 + 0.1716\lambda\Delta t)^3}, (3.16-447)
which, for λh → 0, gives ξ = 1 − λh + (λh)^2/2 − 0.1759(λh)^3 + O((λh)^4), which has a
very small 'local error' term. (Recall that for TR, 0.1759 is replaced by 1/4, which,
compared with the 'target' of 1/6, has an error about nine times larger; FSθ is indeed
accurate.)
The fractional steps for dy/dt = −λ(t)y + f(t) corresponding to the optimal (θ, α), and to
(3.16-447) for constant λ, are (again to four digits):
1. (1 + 0.1716\lambda_{n+\theta}\Delta t) y_{n+\theta} = (1 - 0.1213\lambda_n\Delta t) y_n + \theta\Delta t f_n; (3.16-448)
2. (1 + 0.1716\lambda_{n+1-\theta}\Delta t) y_{n+1-\theta} = (1 - 0.2426\lambda_{n+\theta}\Delta t) y_{n+\theta} + (1 - 2\theta)\Delta t f_{n+1-\theta}; (3.16-449)
3. (1 + 0.1716\lambda_{n+1}\Delta t) y_{n+1} = (1 - 0.1213\lambda_{n+1-\theta}\Delta t) y_{n+1-\theta} + \theta\Delta t f_{n+1-\theta}, (3.16-450)
and the suspicious-looking index on f can be justified by simply requiring second-order
accuracy when λ = 0, which means no error for dy/dt = t with θ a free parameter. The
result is θ = 1 − 1/√2; the indices are right as written.
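The three fractional steps (3.16-448) through (3.16-450) can be tried on a scalar test problem. The following is a minimal sketch, assuming a constant λ = 2, forcing f(t) = cos t and y(0) = 1 (all chosen here purely for illustration, with the exact solution written out by hand); `fs_theta_step` and `integrate` are hypothetical names. Exact coefficients are used in place of the four-digit values.

```python
import numpy as np

THETA = 1.0 - 1.0 / np.sqrt(2.0)
ALPHA = 2.0 - np.sqrt(2.0)

def fs_theta_step(y, t, dt, lam, f):
    """One FS-theta timestep for dy/dt = -lam(t)*y + f(t), in three stages."""
    th, al = THETA, ALPHA
    t_a, t_b, t_c = t + th * dt, t + (1 - th) * dt, t + dt
    # Stage 1: advance to t + theta*dt, forcing sampled at t.
    y = ((1 - (1 - al) * th * dt * lam(t)) * y + th * dt * f(t)) \
        / (1 + al * th * dt * lam(t_a))
    # Stage 2: advance to t + (1 - theta)*dt, forcing sampled at the new time.
    y = ((1 - al * (1 - 2 * th) * dt * lam(t_a)) * y + (1 - 2 * th) * dt * f(t_b)) \
        / (1 + (1 - al) * (1 - 2 * th) * dt * lam(t_b))
    # Stage 3: advance to t + dt, forcing again sampled at t + (1 - theta)*dt.
    y = ((1 - (1 - al) * th * dt * lam(t_b)) * y + th * dt * f(t_b)) \
        / (1 + al * th * dt * lam(t_c))
    return y

def integrate(y0, t_end, n_steps, lam, f):
    dt, y = t_end / n_steps, y0
    for k in range(n_steps):
        y = fs_theta_step(y, k * dt, dt, lam, f)
    return y

lam = lambda t: 2.0                       # assumed constant decay rate
f = np.cos                                # assumed forcing
exact = lambda t: (2 * np.cos(t) + np.sin(t)) / 5 + 0.6 * np.exp(-2 * t)

errors = [abs(integrate(1.0, 1.0, n, lam, f) - exact(1.0)) for n in (50, 100)]
observed_order = np.log2(errors[0] / errors[1])
```

Halving the timestep should reduce the error by about a factor of four if the sampling points for f are as written; this sketch does not address the variable-λ or non-linear cases.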
In Rannacher (1989) are shown some very interesting comparisons of FSθ with TR,
BDF2, BE, FE, and a second-order DIRK (diagonally implicit Runge-Kutta) method (see,
too, Marx, 1994). He showed, in the context of fixed-Δt algorithms, that FSθ is the clear
winner for an ODE system displaying some damped and some undamped oscillations:
(i) in the non-stiff cases it displayed about the same accuracy as TR using three times
the Δt (which is the proper measure since it takes three times the effort of TR); (ii) in
the stiff case (stiffness ratio of 10^4), TR oscillations polluted the early time solution (the
Δt used was quite accurate for tracing the oscillatory mode, ~126 steps/cycle, but not
small enough to damp the 'stiff' mode), so that FSθ gave better results at (again) three times
the Δt of TR (~42 steps/cycle). Impressive as these results are, we still believe that
a 'smart' TR integrator would fare much better, taking very small Δt only initially to
follow the stiff components and growing monotonically toward that needed to follow
the free oscillations. [Part of the 'problem' is philosophical: we believe stiffness should
be accurately resolved because it is (usually at least) physical whereas Rannacher is
more concerned with non-physical stiffness encountered in the form of 'noise.'] And we
believe it could be done more cost-effectively in the general case even if the FSθ scheme
were improved to include error control and automatic Δt selection [and it has; see Turek
(1996)] because of the following very important fact: local error estimation and resulting
Δt control are virtually free (very low in cost) for TR (or LMM's in general) because it is
done with an explicit method, whereas the relatively expensive (implicit) technique called
'step doubling' (common, for example, with RK methods) is required for FSθ because it
is a so-called 'composite method.' For every two 'real steps,' a single step at twice the
size is required in order to estimate the local error and thus invoke Δt control [see Gear
(1971) for details]. Thus, to a first approximation, the overhead cost is a full 50%.
We leave as an exercise the application of the FSθ scheme to both the scalar transport
equation and the NSE's, but we do show the result for phase speed (u_p) and amplitude
coefficient (|ξ|) for 1D pure advection using linear elements (recall Section 2.7.6): it is
\frac{u_p}{u} = -\frac{1}{c\theta} \tan^{-1}\frac{y}{x} (3.16-451)
and
\xi = x + iy = \frac{[m(\theta) - i\alpha c\sin\theta]^2 [m(\theta) - 2i\alpha c\sin\theta]}{[m(\theta) + i\beta c\sin\theta]^3} (3.16-452)
with due account taken of the quadrant in which ξ lies.
Here, m(θ) = (2 + cos θ)/3 is the mass matrix symbol, α = 0.1213, and β = 0.1716.
Figure 3.16-25 shows the result in terms of relative phase speed (u_p/u) and
amplification coefficient (|ξ|), both of which should be 1.0, for several values of c. It is seen
that the phase speed is as good as or better than its implicit 'competitors,' TR and BDF2
(cf. Figures 2.7-15 and 2.7-17 for P = 100). The amplitude comparison, however, is
not so easy; it is not available by comparing Figure 3.16-25(b) with Figure 2.7-15(b)
or Figure 2.7-17(b), since the former is |ξ| for P = ∞ and the latter are |ξ|e^{cθ²/P}
for P = 100. We can compare a random point, however, such as θ = π/2 for c = 3.
The results are (for pure advection): |ξ| = 1.0, 0.495, and 0.953 for TR, BDF2, and
FSθ, respectively, which suggests at least that FSθ will be much better than BDF2 for
advection-dominated flows.
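The spot values quoted above for FSθ and TR can be reproduced with a few lines. This is a sketch under stated assumptions: the FSθ factor is (3.16-452) with exact rather than four-digit coefficients, the TR factor is written here as (m − ic sin θ/2)/(m + ic sin θ/2) for the same linear-element semi-discretization, and the function names are hypothetical. BDF2 is not included.

```python
import numpy as np

THETA_S = 1.0 - 1.0 / np.sqrt(2.0)        # scheme parameter (not the wavenumber)
ALPHA_S = 2.0 - np.sqrt(2.0)
A_COEF = (1.0 - ALPHA_S) * THETA_S        # 0.1213 to four digits
B_COEF = ALPHA_S * THETA_S                # 0.1716 to four digits

def xi_fs_theta(theta, c):
    """FS-theta amplification factor for pure advection, linear elements."""
    m = (2.0 + np.cos(theta)) / 3.0       # mass matrix symbol
    s = 1j * c * np.sin(theta)
    return (m - A_COEF * s) ** 2 * (m - 2.0 * A_COEF * s) / (m + B_COEF * s) ** 3

def xi_tr(theta, c):
    """Trapezoid-rule amplification factor for the same semi-discretization."""
    m = (2.0 + np.cos(theta)) / 3.0
    s = 0.5j * c * np.sin(theta)
    return (m - s) / (m + s)

amp_fs = abs(xi_fs_theta(np.pi / 2, 3.0))
amp_tr = abs(xi_tr(np.pi / 2, 3.0))

# Strongest damping over 0 <= theta <= pi for c = 10.
grid = np.linspace(0.0, np.pi, 501)
amp_min_c10 = np.abs(xi_fs_theta(grid, 10.0)).min()
```

At θ = π/2 and c = 3 the sketch gives about 0.953 for FSθ and exactly unit modulus for TR; for c = 10 the strongest damping over the grid is about 0.76.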
Because it is rather interesting, and was not easy for us (PMG and J. Leone, Jr.) to
obtain, we show for one case how ξ varies in the complex plane as θ ranges from 0 to π.
The spiral shown in Figure 3.16-26 for c = 10, which begins on the unit circle at θ = 0
and proceeds in a clockwise (decreasing argument) direction, has a maximum damping
(|ξ| = 0.76) at θ = 2.07 (a nearly 3Δx wave) for which the argument is ~ −60° (from
the x-axis). As θ increases from 2.07, the spiral reverses its path and retraces virtually
the same spiral to return to its starting point at θ = π. The straight line segments (chords)
in the figure represent the return path, plotted with increment Δθ = π/500 (≈ 0.36°); the
smooth curve outside the chords shows the 'outward' path (0 < θ < ~2.07). The resulting
|ξ(θ)| curve is that labeled c = 10 in Figure 3.16-25(b). In order to obtain the associated
phase speed curve using either a hand calculator or a software package on a computer,
both of which usually deal only with the principal branch in the complex plane, one
must add or subtract the proper multiple of π from tan^{-1}(y/x), where ξ = x + iy. For the
curve shown, there were two subtractions of π needed on the way out (one when the
spiral crossed the imaginary axis into the left-half plane and the other when it crossed the
real axis from above); these two increments needed to be 'added back' during the return
portion of the spiral.
Fig. 3.16-25 Phase speed and amplitude for pure advection via FSθ: (a) phase speed and (b) amplitude, each plotted against θ with curves labeled by c.
Fig. 3.16-26 Argand diagram of ξ(θ) for c = 10 (0 ≤ θ ≤ π, Δθ = π/250).
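The branch bookkeeping described above can be delegated to a phase-unwrapping routine. The sketch below is one possible approach rather than the procedure the authors used: it takes the argument with `np.angle` (a four-quadrant arctangent, so the jumps are multiples of 2π rather than π) and removes the jumps with `np.unwrap`. The function name and the grid of 500 points are assumptions.

```python
import numpy as np

THETA_S = 1.0 - 1.0 / np.sqrt(2.0)
ALPHA_S = 2.0 - np.sqrt(2.0)
A_COEF = (1.0 - ALPHA_S) * THETA_S
B_COEF = ALPHA_S * THETA_S

def relative_phase_speed(c, n=500):
    """Return theta, u_p/u and the continuous argument of xi for 0 < theta <= pi."""
    theta = np.pi * np.arange(1, n + 1) / n      # skip theta = 0 (division by zero)
    m = (2.0 + np.cos(theta)) / 3.0
    s = 1j * c * np.sin(theta)
    xi = (m - A_COEF * s) ** 2 * (m - 2.0 * A_COEF * s) / (m + B_COEF * s) ** 3
    phase = np.unwrap(np.angle(xi))              # continuous argument along the spiral
    return theta, -phase / (c * theta), phase

theta, speed, phase = relative_phase_speed(10.0)
most_negative_deg = np.degrees(phase.min())
```

For c = 10 the continuous argument bottoms out near −420°, which is the ~ −60° quoted above once a full turn is added back, and it returns to zero at θ = π, so the relative phase speed starts at 1 for long waves and falls to zero for the 2Δx wave.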
Anyway, the FSθ scheme does appear to be a viable contender for CFD; whether it is
a good way to solve ODE's in general is another matter.
We also leave for the reader the details of applying FSθ to the NS equations, or see
the above Glowinski/Rannacher references. We do point out, however, that no one has,
to our knowledge, applied the scheme to any but the index 2 DAE's (u-P), where, in
Turek (1994, 1996, 1997), it has been applied quite effectively.
3.16.9 Other Methods (Used by Others)
As in the previous chapter, we would be remiss if we did not also mention that there are
yet other ways to solve both the steady and time-dependent NS equations—still using
FEM but differing in many 'details,' sometimes in major ways. Thus, tending toward
'completeness,' we cover (only) some of these below.
a. Methods based on trajectories/characteristics
The most impressive applied ('industrial') use of the (backward) method of characteristics
(BMOC; see Section 2.7.7a) in real-world (3D) geometries (reactor vessels, automobiles,
etc.) that we have seen takes place at Électricité de France (EDF), in which tetrahedral
elements permit the effective use of (Lagrangian) particle tracking (dx/dt = u), made all
the more efficient when thermal, species, and turbulence fields must also be computed
because one trajectory calculation applies to all (u, v, w, T, c_i, k, ε, etc.). This in fact
appears to be an example that strongly favors simple tetrahedra over bricks because the
latter, in the isoparametric (distorted) shapes that are necessary to model complex 3D
geometries, do not have flat faces (each of the six faces of a distorted brick is a curved
surface), and the associated particle tracking is a very difficult task; the QiQ\ element
has been dropped from the EDF codes (J.-P. Chabard, personal communication, 1994).
They in fact use either the f^i element or the iso P\ — P\, and prefer the latter for big
3D problems in complex geometry.
While they began their BMOC using first-order methods (BE for the ODE's and for
the characteristic curves), they are currently leaning toward second-order methods (BDF2
for the ODE's, and a BE with extrapolation for the characteristic curves)—at least for
time-accurate unsteady simulations. (They still use BE for obtaining steady solutions.)
See, for example, Janvier et al. (1992) and Boukir et al. (1996). For a sample of results
involving a nuclear reactor pressure vessel, see Alvarez et al. (1992), and for flow past
an automobile, see Bidot et al. (1992). Finally, for flow through and around a fully 3D
(but not rotating) drill bit for the petroleum industry, see King et al. (1990) or Chabard
and King (1991). To conclude our brief discussion related to the impressive EDF Navier-
Stokes code (N3S), we note that while these folks have long been fond of (and used) the
so-called Uzawa method (see, for example, Cahouet and Chabard, 1988) for solving the
pressure-velocity (viscous part) coupling problem, they seem lately to be more interested
in the (second-order) projection method—at least for 'large' Re (not Stokes): Ng et al.
(1993) and B. Nitrosso (personal communication). They still use 'Uzawa' for laminar
flow simulations.
For other representative efforts in this and related areas, see the citations in
Section 2.7.7a in Chapter 2 and the following, which were not listed there because they
focus exclusively on the NS equations: Süli (1988), Süli and Ware (1989), and Hansbo
(1992b).
b. Methods based on least squares (LSFEM)
Another method that is increasing in popularity is the LSFEM (least-squares FEM),
wherein many of the problems of GFEM using mixed interpolation are avoided at the
cost of solving many more equations—all first order in space. A leading proponent of this
technique is B-n. Jiang, from whom we quote, 'It is well known that the Galerkin mixed
method leads to a saddle point problem, thus the sophisticated LBB condition is invoked
to guarantee the existence of a solution. It is notoriously difficult to verify and satisfy
the LBB condition. From a numerical point of view, the most difficult problem
associated with the Galerkin mixed method is that the resulting discretized algebraic equations
are nonsymmetric and non-positive-definite, which are hard to deal with for large
problems. All these difficulties motivated us to apply the least squares method.'—Jiang et al.
(1994). These are indeed powerful arguments against GFEM, and it might just turn out that
LSFEM is the 'right' way to go—or at least a good way. By representing the NS equations
as a coupled system of (many) first-order PDE's, as discussed in the previous chapter for
the scalar transport equation, low-order C° finite element basis functions—with equal-
order interpolation in fact (for all variables)—can be profitably employed. The curl form
[(3.3-6)] or the rotational form [(3.4-4)] are appropriate starting points. But there are many
more equations and many more unknowns, thus exhibiting the 'down side' of the method:
many coupled equations (but always with symmetric matrices). By including the definition
of vorticity as an additional (vector) equation and the constraint that it be divergence-free
as another (scalar), a system of eight equations in seven unknowns (three velocities,
three vorticities, and pressure), which is not overdetermined (see below), is the first-order
system of NS equations to which LSFEM is applied. For the steady NS equations, the large
system of non-linear PDE's is first linearized (Newton's method, usually) and then the
LSFEM applied. The resulting SPD systems are then solved via DSCG (also called Jacobi
preconditioning). While the matrices are sparse and SPD, and while only matrix-vector
products are required during the CG iterations thus generating a method with minimal
memory requirements (minimal for the nearly twice as many equations as for GFEM), the
solver typically requires many thousands of iterations for a non-trivial 3D problem (B-n.
Jiang, personal communication). Perhaps the planned (potential) switch over to multigrid
will further increase the computational efficiency of their LSFEM.
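The flavor of the solver described above (a least-squares problem leading to an SPD system, solved by diagonally scaled CG using only matrix-vector products) can be sketched on a small algebraic example. This is not the LSFEM itself: the matrix below is a random overdetermined system standing in for the discretized first-order equations, and the function name is hypothetical.

```python
import numpy as np

def diagonally_scaled_cg(A, b, tol=1e-12, max_iter=500):
    """Minimize ||A x - b|| by CG on A^T A x = A^T b with Jacobi scaling.

    Only products with A and A^T are formed; A^T A is never assembled.
    """
    d = np.einsum('ij,ij->j', A, A)       # diagonal of A^T A (column norms squared)
    x = np.zeros(A.shape[1])
    r = A.T @ (b - A @ x)                 # residual of the normal equations
    z = r / d
    p = z.copy()
    rz = r @ z
    r0 = np.linalg.norm(r)
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol * r0:
            break
        q = A.T @ (A @ p)
        step = rz / (p @ q)
        x += step * p
        r -= step * q
        z = r / d
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Stand-in problem (assumption): more equations than unknowns, random entries.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
x_cg = diagonally_scaled_cg(A, b)
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
```

On this well-conditioned toy problem a handful of iterations suffices; the many thousands of iterations quoted in the text reflect the far worse conditioning of realistic 3D discretizations.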
We now briefly return to the issue of more equations than unknowns for this LSFEM.
The addition of seemingly redundant constraint equations such as curl v = 0 when the
problem ∇²w = f is solved via v = ∇w and div v = f, or ∇·ω = 0 when the steady Stokes
equations, −∇P + ν∇²u = f, ∇·u = 0, are solved via ∇P + ν∇×ω = −f, ∇·u = 0, and
ω = ∇×u (ditto transient NS equation), was discovered by C. Chang (see Chang, 1992,
and Chang and Gunzburger, 1987), in which a so-called 'slack variable' that turns out to
be identically zero was also introduced—needed for convergence proofs and error analysis
but not needed in the 'codes'—a fact originally discovered and utilized in both 2D and
3D (but not analyzed), by B-n Jiang (personal communication). The first of the additional
constraint equations (but not the additional slack variable—a quantity not needed in the
computations) used in computations appears to be in Chang (1992), but see also Jiang and
Povinelli (1993), Jiang et al. (1994), and Bochev and Gunzburger (1993). This simple(?)
trick is what allows LSFEM to 'work' in general—and precluded (finally) the serious
constraints that augured against the method that had been previously promulgated in
a series of papers by Fix et al. (1979a, b, 1981) in which 'success' was only assured
if certain constraints were imposed; principally the so-called GDP (grid decomposition
principle) or DDP (discrete decomposition principle), in which some severe constraints
were placed on both the basic approximating spaces and on the nature of the grids
employed. These worries are no more when the discrete forms of the continuum-redundant
equations are added to the system—the otherwise limited or lost stability is then assured.
The price for the simplicity (equal-order approximation and stability) is, 'simply,' more
equations.
For some recent theoretical results on LSFEM in the manner of the above discussion,
which also 'corrects' some earlier theoretical works, see Bochev and Gunzburger (1994,
1996) and Jiang et al. (1994), the last of which also introduces some new ('non-standard')
BC's for the NS equations. For a recent 'down-side' report on LSFEM, see Chang (1996)
and Nelson and Chang (1995), in which it is shown that some 'problems' can arise
in which the discrete, divergence-free constraint equation is too weakly enforced. (The
proposed fix, which does work, reintroduces a Lagrange multiplier and an associated
saddle-point problem!)
In the arena of time-dependent NS equations, the team under T. Tsang seems to be
leading the pack. In Tang and Tsang (1993), they used BE (linearized) then LSFEM on
both isothermal NS equations and Boussinesq equations (see Volume II). It may be worth
pointing out the following experience (T. Tsang, personal communication): the use of BE
plus LSFEM to simulate Karman vortex shedding past a cylinder 'failed' in the following
sense: even with a small At, the correct shedding frequency (Strouhal number) could not
be obtained. Perhaps this is a manifestation of combining two dissipative methods, one
in space and one in time. More recently (Tang et al., 1995), they have gone to fully
3D, time-dependent simulations, using the Q1 element (eight-node brick) for all seven
variables, (fixed step) TR for time integration, Newton linearization followed by LSFEM
with the SPD systems solved via matrix-free DSCG, in what appears to be a cost-effective
method. (All they need now is a smart time integrator.) More recent yet is Tang
et al. (1998).
To conclude the LSFEM discussion, we simply point out a few others who have recently
tried it, but not in the same way as above: Harbord and Gellert (1991), Winderscheidt
and Surana (1994), and Bell and Surana (1994).
c. Methods based on Galerkin least squares
The previous section discussed 'pure' least squares in which the least-squares criterion
was the only (weighted residual) principle invoked to obtain the final FEM equations—in
space; time was discretized via 'conventional' (C°) ODE methods. In this last little section,
least squares forms only a portion of the methodology, and often the ODE portion
is done via discontinuous methods (Galerkin—as discussed in the previous chapter,
Section 2.1.Id, e). Another difference is that the methods to be summarized below (via
citations) do not reduce second-order operators to first order by introducing new
variables; rather, they retain the second-order operators, but only apply least squares on
element interiors, wherein the required differentiability is present (the C° basis
functions of FEM are infinitely differentiable inside each element, with most higher-order
derivatives being zero). A sample of recent publications in this area includes: Hansbo
and Szepessy (1990), Hauke and Hughes (1994), and some involving in addition free and
moving surfaces: Tezduyar (1992), Tezduyar et al. (1992a, b), and Hansbo (1992b). These
advanced methodologies also employ (usually) the discontinuous-in-time Galerkin method
for the ODE integration. Finally, for a brief comparison of characteristic methods and
GLSQ methods, see Pironneau et al. (1992)—in which it was determined (O. Pironneau
and T. Tezduyar, personal communication) that the latter is easier to formulate but the
former is less expensive. For some more recent GLSQ results, see Behr and Tezduyar
(1994) and Mittal and Tezduyar (1994).
3.16.10 A Strategy for Hastening Steady Solutions
To conclude our discussion of both time-dependent (mostly) and steady methods for the
NS equations, we combine them via a strategy that employs (almost) 'any' time-marching
method but switches, cleverly we hope, to a steady solution method when it is deemed
likely that a steady state exists—and one is not using a smart implicit integrator so that
quite a few more timesteps would be required to actually attain it. The method to be
discussed below is related to one that was partially, but successfully, tested by McCallen
(1993).
Suppose one is integrating in time using a fixed-step ODE method and that it is
suspected that a stable steady state exists—or the algorithm can be designed to test for
'approach to steady state' in any of several ways. Here is one: monitor the time-history
of several key velocity 'nodes' and/or the kinetic energy (less sensitive) of the flow,
u^T M u, and test for steady state 'once-in-a-while' by monitoring Δu_i vs time, where
Δu_i is the change in u at node i per timestep. If Δu_i is of constant sign (Δu_i > 0 or
Δu_i < 0) for 'many' steps and decreasing monotonically in absolute value, then it may be
asymptotically approaching a steady state. When these tests are passed and an appropriate
norm of the velocity (or KE) increment drops below a user-specified tolerance (ε), one
could try the following algorithm, first in words: guess via exponential extrapolation the
steady-state solution by assuming that only the smallest eigenvalue is still 'active' and
use this guess as a first guess in a steady-state solution algorithm.
As an algorithm it might be the following:
Step 1. For each variable monitored, x_i, use three successive (or three not
successive, perhaps 10Δt apart) computed values of the solution at three known
times to fit the coefficients of the assumed exponential behavior,
x_i = a_i + b_i e^{-c_i t}.
Step 2. Solve for a_i, b_i, c_i from the three equations (see below).
Step 3. Use x_i(t → ∞) = a_i as the first guess in the steady-state iterative solver.
Remarks:
(1) If ε is sufficiently small, then the first guess, a_i, may actually be good enough, and
could, as a first cut, be used to report the steady-state solution.
(2) The 3 x 3 system leads to a nonlinear equation for c:
R \equiv \frac{x_2 - x_1}{x_3 - x_2} = \frac{e^{-ct_2} - e^{-ct_1}}{e^{-ct_3} - e^{-ct_2}},
where the indices now apply to the three time levels, and we have omitted the nodal
index i, for simplicity. The Newton method should be effective for finding c:
c_{m+1} = c_m + \frac{(R + 1) e^{-c_m t_2} - (R e^{-c_m t_3} + e^{-c_m t_1})}{t_2 (R + 1) e^{-c_m t_2} - (t_3 R e^{-c_m t_3} + t_1 e^{-c_m t_1})}.
Once this has converged, compute a from (e.g.)
a = x_2 - b e^{-ct_2} = x_2 - \frac{x_3 - x_2}{e^{-ct_3} - e^{-ct_2}} e^{-ct_2}.
(3) A much simpler method, thanks to S. Chan (personal communication), is this:
take t_2 = t_1 + δt and t_3 = t_2 + δt, where δt is the chosen time interval (for
example, an integral number of timesteps). This equal-interval approach causes
the above equation to simplify considerably: R = e^{cδt}, giving c = (ln R)/δt, b =
(x_2 − x_1)e^{ct_1}/(R^{-1} − 1), and the final desired result is a = (x_1 − Rx_2)/(1 − R) =
x(t = ∞).
(4) We have not tested this idea, but nevertheless advocate it.
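The equal-interval formulas of Remark (3) amount to a three-point exponential extrapolation, which can be sketched as below. The function name and the synthetic data x(t) = 2 + 3e^{−0.5t} are assumptions used only to check the algebra; in practice the three samples would come from the time-marching solution at a monitored node.

```python
import math

def extrapolate_steady_state(x1, x2, x3, t1, dt):
    """Fit x = a + b*exp(-c*t) to samples at t1, t1 + dt, t1 + 2*dt.

    Returns (a, b, c); a is the extrapolated value at t = infinity.
    Assumes the three samples are monotone and converging, so that R > 1.
    """
    R = (x2 - x1) / (x3 - x2)
    c = math.log(R) / dt
    b = (x2 - x1) * math.exp(c * t1) / (1.0 / R - 1.0)
    a = (x1 - R * x2) / (1.0 - R)
    return a, b, c

# Synthetic samples from a known exponential approach to the value 2.
sample = lambda t: 2.0 + 3.0 * math.exp(-0.5 * t)
a, b, c = extrapolate_steady_state(sample(1.0), sample(2.0), sample(3.0), t1=1.0, dt=1.0)
```

With exact exponential data the three coefficients are recovered to round-off; with real data, in which more than one mode is still active, the value of a is only a first guess for the steady solver, as the text says.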
3.17 ALIASING AND ALIASING INSTABILITY, LINEAR
AND NON-LINEAR
The literature on 'aliasing' in finite element CFD is rather sparse and, in our opinion,
rather confusing; the latter may help to explain the former. The concept of aliasing is
fairly clear, but its consequences are not. If one attempts to place on a mesh (as in IC,
say) a waveform with a spatial frequency content that cannot be 'resolved' by the grid
(λ < 2Δx), or if such a wave tends to be 'generated' by the numerical solution procedure
(typically via a product, such as u · ∇T, of two short-but-resolvable waves), the grid will
misinterpret the (too) short wave as a longer wave that it can resolve. This is aliasing.
If aliasing occurs and if the resulting time-integration (with a stable marching scheme,
or with $\Delta t \to 0$) becomes unstable (solution grows without bound), then the result is
often called an aliasing instability. If the aliasing is caused by a non-linear product (such
as $\mathbf{u}\cdot\nabla\mathbf{u}$) and if the resulting ODE (or DAE) behavior becomes unstable, then the result
is often called non-linear aliasing instability. Finally, if the ODE's are linear (e.g., $\mathbf{u}$
given when solving the scalar transport equation) and the product $\mathbf{u}\cdot\nabla T$ is still deemed
the generator of aliased modes, and if the resulting ODE becomes unstable, then the term
'linear aliasing instability' is sometimes used.
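The grid's misinterpretation of a too-short wave can be made concrete with a small sketch. It assumes a periodic grid of $N$ equally spaced points $x_n = 2\pi n/N$ and complex exponential modes; the helper names `sample_mode` and `resolved_alias` are hypothetical and are not taken from any of the cited works.

```python
import cmath
import math

def sample_mode(k, N):
    """Samples of exp(i*k*x) at the N grid points x_n = 2*pi*n/N."""
    return [cmath.exp(1j * k * 2.0 * math.pi * n / N) for n in range(N)]

def resolved_alias(k, N):
    """Wavenumber in [-N/2, N/2) that the N-point grid cannot tell apart from k."""
    return (k + N // 2) % N - N // 2

N = 8
short_wave = sample_mode(11, N)                   # |k| > N/2: unresolvable
long_wave = sample_mode(resolved_alias(11, N), N)  # what the grid 'sees'
max_diff = max(abs(s - w) for s, w in zip(short_wave, long_wave))
```

On this 8-point grid the mode $k = 11$ is sampled identically (to roundoff) to $k = 3$, so `max_diff` is at roundoff level. Likewise, the point-wise product of two resolvable $k = 3$ modes has wavenumber 6, which the grid would interpret as `resolved_alias(6, 8)`, i.e. $k = -2$.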
Before delving further into this issue, we ask the reader to refer back to our first
meeting with it, via the Remark in the previous chapter, in Section 2.3.1 between (2.3-9)
and (2.3-10), wherein some concepts, and some previous work, were discussed.
Before getting any deeper into the alias issue, which itself seems to arise under aliases,
we present a brief sample of the literature, in the form of interesting quotations—after
mentioning that Canuto et al. (1988a) is both a useful general reference and a particular
reference on aliasing in spectral methods.
1. In a truncated Fourier expansion (spectral, $e^{ikx}$) method, Orszag (1972) states,
'... where the summation terms are referred to as 'aliasing' terms; the 'aliases' $k_m =
k + mN$ of $k$ satisfy $\exp(ik_m x_n) = \exp(ikx_n)$ so that they are indistinguishable from $k$ on
a discrete grid.'
2. 'The confusion of frequencies is an inevitable consequence of discretization'—Roache
(1982). He, of course, is discussing FDM's and their associated, discrete grid point values.
3. In discussing some FDM's, Orszag (1971) states, 'Schemes with no quadratic semi-
conservation properties may be unstable due to aliasing errors.' Here, quadratic semi-
conservation means stable ODE's.
4. In the same paper, 'The energy-conserving finite difference schemes discussed (above)
have aliasing errors (Lilly, 1965; Grammeltvedt, 1969), but they are not susceptible to
aliasing instability.'
5. In Gary (1979) appears, 'The analysis of Richtmyer and Phillips indicates that the
non-linear instability may be due to a distortion of the non-linear interaction between
wave numbers caused by the discrete mesh.' This is what is usually meant by non-linear
aliasing instability.
6. Again from Roache (1982): 'Since it conserves $\zeta^2$ (vorticity squared: enstrophy), it
is not subject to the non-linear instability of Phillips (1959), which arises from aliasing
errors.' (Aliasing errors are present but remain bounded, since $\zeta^2$ remains bounded.) He
is here discussing Arakawa's famous method (Arakawa, 1966), which later just happened
to also be 'a finite element method'; see Jespersen (1974).
7. 'When enstrophy is not conserved, the average (spatial) frequency $\bar{k}$ will in general
not be constant, and in such cases Phillips (1959) has shown that numerical instabilities
can be created by high-frequency noise cascading down into the low frequencies. Phillips
calls these aliasing errors'—Fix (1975). He then goes on to show that his finite element
method of 'ocean circulation' assures that enstrophy is conserved: 'We thus conclude in
this case that the semi-discrete finite element model will have no aliasing errors.' This
conclusion follows after his proof of conservation of mean wave number, $\bar{k}$.
8. 'The difference in the accuracy of finite element and finite difference methods is
analyzed to illustrate the removal of "aliasing" by the Galerkin approach,' and 'In practice,
the grid-point projection is known to be unsatisfactory for non-linear analysis if
propagation is possible between points. The errors usually called "aliasing" errors are exactly
the spatial evolutionary errors in evaluating products as in (6) by direct multiplication at
grid points ...'; both in Cullen (1976).
9. In the same WMO publication containing Gary's article on 'Non-linear Instability,'
Cullen (1979) has one in which he states, 'In a numerical method where a function is
defined over the whole domain by finite elements or other means, there can be no aliasing
because there is no ambiguity.' He means, of course, that a term such as $\mathbf{u}_h\cdot\nabla T_h$ is, via
the basis function expansions, a well-defined quantity at all points in space—both at mesh
points and in between.
10. 'For the same number of degrees of freedom as a finite difference scheme, the
finite element method is more accurate and eliminates the possibility of aliasing
energy cascading onto the truncation scale and then back into the larger resolvable
scales'—Wyngaard et al. (1984). Note too the use of 'scheme' vis-a-vis 'method'—not
the only place these identifiers have been employed (cf. Strikwerda, 1989).
11. Stopping short of a dozen, we end with a statement from the monograph by Gottlieb
and Orszag (1977), 'However, Galerkin approximation is sometimes very attractive
because it gives approximations that are conservative and have no so-called aliasing
errors.'
So, it looks like all is right with the world if you employ GFEM—no aliasing and
therefore no aliasing instability. Or is it really that simple? What of the several unstable
ODE discussions (plus one example) in the previous chapter? These were true Galerkin,
true (honest) FEM. Could it be that the 'conventional wisdom,' if our sample above
suffices to summarize it, is not always correct? We definitely believe that to be the
case; whether or not GFEM is free of aliasing, we know for a fact that it is not always
free of instability. In fact, we now know rather well that the most important asset for
assuring stability of the ODE's and DAE's is the assurance of a skew-symmetric advection
matrix, a coveted attribute that is easiest to obtain for contained flow ($\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma$)
or periodic BC's when we take $\beta = 1/2$ (velocity or temperature). It is not so easy
otherwise.
Also, did we see aliasing, stable or not, in our numerical example in Section 2.8.1?
Probably not; what we saw in the unstable ODE case in Chapter 2 was one or more
eigenvectors of $M^{-1}N(u)$ whose eigenvalues had positive real parts, thus ultimately causing unbounded
growth for virtually any IC. What we did have there is a condition that is now fairly
well recognized: rapidly varying coefficients in $u(x)$ that, if $\beta \neq 1/2$, generated unstable
ODE's. That is to say, it is recognized in at least Kreiss and Oliger (1973) and Roache (1982)
(plus the references mentioned in Chapter 2) that linear equations with variable coefficients
can cause unstable behavior.
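We will conclude this section with a discussion of the 1D, inviscid Burgers' equation,
$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = 0, \qquad 0 < x \le 1, \tag{3.17-1}$$
a non-linear hyperbolic PDE, which we shall semi-discretize in several ways, for both
periodic and Dirichlet BC's on a uniform mesh.
1. FDM1
$$\dot{u}_i + u_i(u_{i+1} - u_{i-1})/2l = 0; \tag{3.17-2}$$
2. FDM2
$$\dot{u}_i + \tfrac{1}{3}\left[u_i(u_{i+1} - u_{i-1})/2l + (u_{i+1}^2 - u_{i-1}^2)/2l\right] = 0, \tag{3.17-3}$$
which might have come about by first rewriting (3.17-1) in the equivalent form
$$\frac{\partial u}{\partial t} + \frac{1}{3}\left(u\frac{\partial u}{\partial x} + \frac{\partial u^2}{\partial x}\right) = 0. \tag{3.17-4}$$
3. FEM1 (advective form: $\beta = 0$). From (2.3-9) et seq. in the previous chapter (let $T_j \to u_j$), we get
$$\frac{1}{6}(\dot{u}_{i-1} + 4\dot{u}_i + \dot{u}_{i+1}) + \frac{1}{6l}\left[(2u_i + u_{i+1})u_{i+1} + (u_{i-1} - u_{i+1})u_i - (2u_i + u_{i-1})u_{i-1}\right] = 0, \tag{3.17-5}$$
which rearranges to
$$\frac{1}{6}(\dot{u}_{i-1} + 4\dot{u}_i + \dot{u}_{i+1}) + \frac{1}{3}(u_{i-1} + u_i + u_{i+1})(u_{i+1} - u_{i-1})/2l = 0. \tag{3.17-6}$$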
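4. FEM2 ($\beta = 1/2$). From the same section of Chapter 2,
$$\frac{1}{6}(\dot{u}_{i-1} + 4\dot{u}_i + \dot{u}_{i+1}) + \frac{1}{2l}\left(\frac{u_i + u_{i+1}}{2}\,u_{i+1} - \frac{u_i + u_{i-1}}{2}\,u_{i-1}\right) = 0, \tag{3.17-7}$$
and we omit the $\beta = 1$ case.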
Remarks:
(1) (3.17-2) is often unstable (Fornberg, 1973; Kreiss and Oliger, 1973; and Gary, 1979).
(2) (3.17-3) is a stable ODE under periodic BC's (loc. cit.).
(3) (3.17-3) and (3.17-6) are equivalent when mass lumping is invoked; i.e., the stable
FDM and the LMFEM with linear basis functions are the same.
(4) (3.17-7) shows a skew-symmetric advection matrix and is therefore—up to BC's at
least—also a stable ODE. (It is clearly skew-symmetric, and therefore stable, for
periodic BC's.)
(5) The earliest citation that we know of that discusses (3.17-6) is that of Swartz and
Wendroff (1969), a paper that also introduces (we believe) the 'product
approximation'; for the latter, see also Christie et al. (1981) and Fletcher (1991b) and
references therein.
It is interesting, and probably rare, that the simple advective form ($\beta = 0$) of the GFEM
equations above is guaranteed to be stable (at least for periodic BC's); i.e., even in the
absence of a skew-symmetric matrix. Proof of stability: rewrite (3.17-6) as
$$M\dot{u} + N(u)u = 0 \tag{3.17-8}$$
and form $u^TN(u)u$; if it vanishes, the ODE is stable. We have
$$u^TN(u)u = \sum_{ijk} u_i \int \varphi_i\,(u_j\varphi_j)\left(u_k\frac{\partial\varphi_k}{\partial x}\right)dx = \int (u^h)^2\,\frac{\partial u^h}{\partial x}\,dx = \frac{1}{3}(u^h)^3\Big|_0^1 = 0$$
because of periodicity.
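The role of the quadratic form $u^TN(u)u$ can be checked numerically for the two finite difference forms. The sketch below assumes periodic BC's, a uniform mesh of spacing $l$, and lumped mass (so that the FDM2 advection term coincides with that of (3.17-6)); the function names are hypothetical.

```python
def fdm1_advection(u, l):
    """Advection term of (3.17-2): u_i (u_{i+1} - u_{i-1}) / (2 l), periodic."""
    n = len(u)
    return [u[i] * (u[(i + 1) % n] - u[i - 1]) / (2.0 * l) for i in range(n)]

def fdm2_advection(u, l):
    """Advection term of (3.17-3), periodic; same as (3.17-6) with lumped mass."""
    n = len(u)
    out = []
    for i in range(n):
        up, um = u[(i + 1) % n], u[i - 1]   # u[-1] wraps to the last node
        out.append((u[i] * (up - um) + up * up - um * um) / (6.0 * l))
    return out

def quadratic_form(u, advection):
    """sum_i u_i A_i(u); d/dt of (1/2) sum u_i^2 is minus this (lumped mass)."""
    return sum(ui * ai for ui, ai in zip(u, advection))

u = [1.0, 2.0, 0.0, -1.0, 3.0]
q1 = quadratic_form(u, fdm1_advection(u, 1.0))   # generally non-zero
q2 = quadratic_form(u, fdm2_advection(u, 1.0))   # vanishes identically
```

For this arbitrary test vector the FDM1 form gives `q1 = 8.0`, so the discrete 'energy' is not conserved, whereas `q2` vanishes, consistent with Remarks (2) and (3) above.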
Finally, we point out, following on from Cullen (1979), that advection products, like
$\mathbf{u}\cdot\nabla T$ or $\mathbf{u}\cdot\nabla\mathbf{u}$, cannot generate 'aliases' of longer wavelength even if both $\mathbf{u}$ and $T$
are varying as fast as possible; i.e., even if $u(x)$ is a $2\Delta x$ wave (1D for simplicity) and
$T(x)$ another $2\Delta x$ wave, the product of $u$ and $\partial T/\partial x$ is also simply a third $2\Delta x$ wave, a
statement whose proof we leave as an exercise.
We believe that the following summary of aliasing and GFEM is reasonably accurate:
1. GFEM is not susceptible to aliasing, except in the sense discussed in
Section 2.6: unresolvable initial data.
2. Hence, GFEM is not susceptible to aliasing instability, whether linear or non-linear.
3. GFEM is, however, susceptible (if $\beta \neq 1/2$) to generating unstable ODE's, both linear
and non-linear.
4. Stable GFEM ODE's can always be assured (at least when $\mathbf{n}\cdot\mathbf{u} = 0$ or periodic BC's
are employed) simply by using the $\beta = 1/2$ form of the advection operator.
*3.18 A NEW LOOK AT TWO OLD FINITE DIFFERENCE
METHODS
Anyone who has the interest or need to go back in history to see the origins
of some methods might be misled because it is a common occurrence that those
discovering/inventing new methods do not always fully understand them when testing
them and writing about them. (And the authors of this book are surely no exception!)
Such is the case in several papers to be described here: the first, evolving from the
MAC (Welch and Harlow, 1965) and SMAC (Amsden and Harlow, 1970) methods at Los
Alamos Scientific Laboratory (LASL) and the second from France. The former include
the SMAC (simplified marker and cell) 'improvement' over the original MAC method of
Harlow et al. [See, for example, Harlow and Welch (1965), which in many ways was more
'correct' than its SMAC successors. Also, C.W. Hirt has opined that SMAC was never
'needed'—personal communication, with which we concur.] The SMAC 'improvements'
appeared, e.g., in Amsden and Harlow (1970) and Easton (1972). [Then there is the
simpler-yet and less confused manuscript by Hirt et al. describing SOLA (SOLution
Algorithm?); in Hirt et al. (1975) is described a 'proper' approach: rather than trying
to apply the PPE on the boundary, which, as we shall soon see, is at the root of
most of the problems, Hirt et al. applied V • u = 0 there, which, as we have seen, is
the proper approach.] From the French group, we have Fortin et al. (1971), recently
summarized and somewhat reinterpreted by Peyret and Taylor (1983). Also, the French
(and others!) are sometimes guilty of employing a notational convenience that is strictly
illegal and thus valid only when properly interpreted; namely, they write semi-discretized
equations that are discrete in time and continuous in space, with the time discretization
being explicit (FE in fact), which is illegal for two reasons, even for the simple
'model' equation, the transient heat equation; i.e., the semi-discretization of $u_t = \nu\nabla^2u$
via $(u_{n+1} - u_n)/\Delta t = \nu\nabla^2u_n$ gives (at least away from $\Gamma$) $u_n = (1 + \nu\Delta t\nabla^2)^nu_0$, which
is unbounded as $n \to \infty$ for all $\Delta t > 0$. This was the first reason. The second occurs at
the boundary; for $u = w(t)$ as the Dirichlet BC, the equation $u_{n+1} = (1 + \nu\Delta t\nabla^2)u_n$ for
$x \to \Gamma$ is compatible with the BC $u_{n+1} = w_{n+1}$ on $\Gamma$ if and only if $w(t)$ is sufficiently
smooth [so that $w_{n+1} = w_n + \Delta t\,\dot{w}_n + O(\Delta t^2)$ is true], which unnecessarily restricts the
boundary function, $w(t)$. These results are of course completely compatible with the well-known
stability limit of the full discretization, $\Delta t \le C\Delta x^2/\nu$, which tells us that stability
requires $\Delta t \to 0$ for the continuum limit(!). Thus, in what follows we must interpret the
semi-discrete equations as actually fully discrete ($\nabla = \nabla_h$, etc.).
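The stability limit quoted above can be illustrated for the simplest fully discrete case. The sketch assumes 1D, a uniform mesh, and the standard second-order central difference for $\nabla^2$, for which von Neumann analysis gives the amplification factor $g = 1 - 4r\sin^2(\theta/2)$ with $r = \nu\Delta t/\Delta x^2$ and hence $C = 1/2$; that value of $C$ and the function names are our assumptions, since the text leaves $C$ unspecified.

```python
import math

def fe_amplification(r, theta):
    """Forward Euler amplification factor for the 1D heat equation.

    r = nu*dt/dx**2; theta = k*dx is the mode's phase change per mesh interval.
    """
    return 1.0 - 4.0 * r * math.sin(0.5 * theta) ** 2

def max_amplification(r, n_modes=64):
    """Largest |g| over the modes the grid can carry (0 <= theta <= pi)."""
    thetas = [math.pi * j / n_modes for j in range(n_modes + 1)]
    return max(abs(fe_amplification(r, th)) for th in thetas)

def max_stable_dt(dx, nu):
    """dt <= C*dx**2/nu with C = 1/2 (1D, central differences assumed)."""
    return 0.5 * dx * dx / nu
```

With these definitions `max_amplification(0.5)` does not exceed 1 while `max_amplification(0.51)` does, and `max_stable_dt` shrinks by a factor of four each time `dx` is halved, which is the sense in which stability requires $\Delta t \to 0$ in the continuum limit.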
In this section we shall lift the veil of confusion that has surrounded these two methods
for so many years, and reveal them for what they are—forward Euler. Yes, they are not
really two distinct methods—each is a confused presentation of the simplest possible NS
algorithm.
Our explanation will utilize the original (and still popular/useful) MAC grid, as
shown below at a solid boundary—in which fictitious/phantom cells are (unnecessarily)
introduced as part of the method. All subsequent effort is directed toward the cell labeled
$P_0$ in Figure 3.18-1, a representative 'boundary' cell.
To show that the fictitious cells, velocities, and pressures (and all that they entail) are
truly unnecessary, we begin by applying FE to
$$\partial\mathbf{u}/\partial t + \nabla P = \nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} \equiv \mathbf{f}(\mathbf{u}) \tag{3.18-1}$$
Fig. 3.18-1 MAC-Type (staggered) mesh near a wall. [Figure not reproduced: staggered cells of width $l$ and height $h$, with a 'Fluid' region containing cell 0 and its neighbors and a 'Fiction' region of phantom cells on the other side of the wall.]
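and
$$\nabla\cdot\mathbf{u} = 0 \quad \text{in } \Omega, \tag{3.18-2}$$
with $\mathbf{u} = \mathbf{w}(t)$ on $\Gamma$. Recall first the FE algorithm; given $\mathbf{u}_m = \mathbf{u}(t_m)$ with $\nabla\cdot\mathbf{u}_m = 0$:
$$(\mathbf{u}_{m+1} - \mathbf{u}_m)/\Delta t + \nabla P_m = \mathbf{f}_m \equiv \mathbf{f}(\mathbf{u}_m) \tag{3.18-3}$$
and
$$\nabla\cdot\mathbf{u}_{m+1} = 0. \tag{3.18-4}$$
Focusing on 'cell 0,' we begin with the discretized version of (3.18-4):
$$\frac{u_E^{m+1} - u_W^{m+1}}{l} + \frac{v_N^{m+1} - v_S^{m+1}}{h} = 0, \tag{3.18-5}$$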
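wherein the velocities are obtained from the discretized momentum equations, unless
they are known through the BC:
$$u_E^{m+1} = u_E^{m+1}(\Gamma) = w_E^{m+1} \ \text{is given}, \tag{3.18-6}$$
$$u_W^{m+1} = u_W^m + \Delta t\left[f_W^x(\mathbf{u}_m) - (P_0^m - P_W^m)/l\right], \tag{3.18-7}$$
$$v_N^{m+1} = v_N^m + \Delta t\left[f_N^y(\mathbf{u}_m) - (P_N^m - P_0^m)/h\right], \tag{3.18-8}$$
and
$$v_S^{m+1} = v_S^m + \Delta t\left[f_S^y(\mathbf{u}_m) - (P_0^m - P_S^m)/h\right]. \tag{3.18-9}$$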
Now use the FE-compatible time variation of $\mathbf{w}(t)$ on $\Gamma$,
$$u_E^{m+1}(\Gamma) = u_E^m(\Gamma) + \Delta t\,\dot{u}_E^m(\Gamma) \tag{3.18-10}$$
and the discrete continuity equation (3.18-5) at time $t_m$ to obtain, upon inserting the above
velocities into the above continuity equation at $t_{m+1}$, the simple result
$$\left[\dot{u}_E^m + (P_0^m - P_W^m)/l - f_W^x(\mathbf{u}_m)\right]/l + \left[f_N^y(\mathbf{u}_m) - f_S^y(\mathbf{u}_m)\right]/h - (P_N^m - 2P_0^m + P_S^m)/h^2 = 0, \tag{3.18-11}$$
which is the consistent PPE at the boundary. It is quite clear that multiplication by $l$ and
letting $l$, $h$ 'shrink' yields
$$\dot{u}_E + \partial P/\partial x|_0 - f_W^x(\mathbf{u}_m) = l\,\partial/\partial y\left[\partial P/\partial y - f^y\right] + O(l, h), \tag{3.18-12}$$
with the limit ($l, h \to 0$) giving
$$\partial P/\partial x|_\Gamma = (f^x - \partial u/\partial t)|_\Gamma, \tag{3.18-13}$$
which is the proper Neumann BC for the PPE.
Remark:
In practice, the FE equation for $u_E^{m+1}(\Gamma)$ would be simply replaced (usually) by $w_E^{m+1}$;
the above description is needed only to help understand the actual algorithm.
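A quick numerical check of the boundary-cell result may help. The sketch below solves the consistent PPE (3.18-11) for $P_0$, given the neighboring pressures, and then applies the FE updates (3.18-7) through (3.18-10) to confirm that the new cell divergence vanishes when the old one does. The function names and the scalar, single-cell setting are illustrative assumptions; in an actual code $P_0$ is one unknown in a coupled system.

```python
def boundary_cell_pressure(P_W, P_N, P_S, f_W, f_N, f_S, dudt_E, l, h):
    """Solve the boundary-cell PPE (3.18-11) for P_0, neighbors held fixed."""
    rhs = (P_W / l**2 + (P_N + P_S) / h**2
           - (dudt_E - f_W) / l - (f_N - f_S) / h)
    return rhs / (1.0 / l**2 + 2.0 / h**2)

def fe_step_divergence(u_E, u_W, v_N, v_S, P_0, P_W, P_N, P_S,
                       f_W, f_N, f_S, dudt_E, l, h, dt):
    """Cell-0 divergence after one FE step, using (3.18-7) to (3.18-10)."""
    u_E_new = u_E + dt * dudt_E                      # (3.18-10), wall value
    u_W_new = u_W + dt * (f_W - (P_0 - P_W) / l)     # (3.18-7)
    v_N_new = v_N + dt * (f_N - (P_N - P_0) / h)     # (3.18-8)
    v_S_new = v_S + dt * (f_S - (P_0 - P_S) / h)     # (3.18-9)
    return (u_E_new - u_W_new) / l + (v_N_new - v_S_new) / h
```

For any data with zero divergence at $t_m$, e.g. $l = 0.5$, $h = 0.25$, $u_W = 1$, $u_E = 0.8$, $v_N = 0.3$, $v_S = 0.2$, the divergence returned by `fe_step_divergence` is zero to roundoff once `P_0` is taken from `boundary_cell_pressure`; no fictitious cell enters anywhere.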
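So much for the simple and straightforward way. Now let us invoke the SMAC and
FP (French Projection, explicit version) algorithms and, later, try to understand them;
given $\mathbf{u}_m$ with $\nabla\cdot\mathbf{u}_m = 0$, they are:
Step 1. Omit $\nabla P$ from the momentum equation and compute a provisional (or fractional
step) velocity field, $\tilde{\mathbf{u}}$, from
$$(\tilde{\mathbf{u}} - \mathbf{u}_m)/\Delta t = \nu\nabla^2\mathbf{u}_m - \mathbf{u}_m\cdot\nabla\mathbf{u}_m \equiv \mathbf{f}_m. \tag{3.18-14}$$
Step 2. Solve
$$(\mathbf{u}_{m+1} - \tilde{\mathbf{u}})/\Delta t + \nabla P_m = 0 \tag{3.18-15}$$
and
$$\nabla\cdot\mathbf{u}_{m+1} = 0 \tag{3.18-16}$$
for $\mathbf{u}_{m+1}$ and $P_m$ via formation of the implied PPE,
$$\nabla^2P_m = \nabla\cdot\tilde{\mathbf{u}}/\Delta t. \tag{3.18-17}$$
Then compute $\mathbf{u}_{m+1} = \tilde{\mathbf{u}} - \Delta t\nabla P_m$.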
Remarks:
(1) We shall discuss BC's below.
(2) In SMAC, $P_m$ is called $\psi/\Delta t$, and $\psi$ is called a 'potential function.'
(3) In the FP, $P_m$ is called $P_{m+1}$, and the second step is called a projection.
(4) Both $\psi/\Delta t$ and $P_{m+1}$ are really $P_m$. (This, of course, was part of the confusion.)
Returning to Figure 3.18-1, this time using also the fictitious quantities on the other
side of $\Gamma$, the SMAC and FP approaches write a discrete PPE (rather than $\nabla\cdot\mathbf{u} = 0$) for
cell '0'; i.e.,
$$\frac{P_E^m - 2P_0^m + P_W^m}{l^2} + \frac{P_N^m - 2P_0^m + P_S^m}{h^2} = \frac{1}{\Delta t}\left[\frac{\tilde{u}_E - \tilde{u}_W}{l} + \frac{\tilde{v}_N - \tilde{v}_S}{h}\right], \tag{3.18-18}$$
where, from (3.18-14),
$$\tilde{u}_W = u_W^m + \Delta t\,f_W^x(\mathbf{u}_m), \tag{3.18-19}$$
$$\tilde{v}_N = v_N^m + \Delta t\,f_N^y(\mathbf{u}_m), \tag{3.18-20}$$
$$\tilde{v}_S = v_S^m + \Delta t\,f_S^y(\mathbf{u}_m), \tag{3.18-21}$$
and
1. SMAC:
$$\tilde{u}_E = u_E^{m+1}(\Gamma); \tag{3.18-22}$$
2. FP: Same as SMAC in Fortin et al. (1971), but Peyret and Taylor (1983) say that it
does not matter: 'The essential feature of the projection method is that the numerical
solution is independent of $\tilde{\mathbf{u}}(\Gamma)$.' Interesting ....
But $P_E$ is also fictitious, and the following choices were made:
1. SMAC: $P_E = P_0$, which seems to be related to statements about 'homogeneous' BC's
for the PPE; i.e., if $P_E = P_0$, it would seem that $\partial P/\partial n = 0$ on $\Gamma$.
2. FP: the Neumann BC inherent in the projection step, $\partial P_m/\partial n = \mathbf{n}\cdot[\tilde{\mathbf{u}} - \mathbf{u}_{m+1}(\Gamma)]/\Delta t$
from (3.18-15), or, in discrete form for cell '0,'
$$\frac{P_E^m - P_0^m}{l} = \frac{\tilde{u}_E - u_E^{m+1}(\Gamma)}{\Delta t}, \tag{3.18-23}$$
an equation whose interpretation is vague at best. If we then choose $\tilde{u}_E = u_E^{m+1}(\Gamma)$, as did
Fortin et al. (1971), we obtain $P_E = P_0$, a la SMAC. If we do not choose either $P_E$ or
$\tilde{u}_E$, per Peyret and Taylor (1983), but simply insert (3.18-23) into (3.18-16), it follows
immediately that both $P_E$ and $\tilde{u}_E$ 'vanish' from the algorithm, as stated above. [Note that
this vanishing only occurs if the approximation to (3.18-15) is that given by (3.18-23).]
Thus, in all cases, placing the fictitious velocity and pressure ($\tilde{u}_E$ and $P_E$) into (3.18-18)
gives
$$\frac{P_W^m - P_0^m}{l^2} + \frac{P_N^m - 2P_0^m + P_S^m}{h^2} = \frac{1}{\Delta t}\left[\frac{u_E^{m+1}(\Gamma) - \tilde{u}_W}{l} + \frac{\tilde{v}_N - \tilde{v}_S}{h}\right], \tag{3.18-24}$$
an equation whose misinterpretation has (we believe) led to the mistaken belief that it
converges to $\partial P/\partial x = 0$ (multiply by $l$ and then let $l, h \to 0$). But the proper interpretation
comes about as follows: insert (3.18-10) and (3.18-19) through (3.18-21) into (3.18-24)
to obtain
$$\frac{P_W^m - P_0^m}{l^2} + \frac{P_N^m - 2P_0^m + P_S^m}{h^2} = \frac{1}{\Delta t}\left[\frac{u_E^m(\Gamma) + \Delta t\,\dot{u}_E^m(\Gamma) - u_W^m - \Delta t\,f_W^x(\mathbf{u}_m)}{l} + \frac{v_N^m + \Delta t\,f_N^y(\mathbf{u}_m) - v_S^m - \Delta t\,f_S^y(\mathbf{u}_m)}{h}\right], \tag{3.18-25}$$
giving
$$\frac{P_W^m - P_0^m}{l^2} + \frac{P_N^m - 2P_0^m + P_S^m}{h^2} = \left[\dot{u}_E - f_W^x(\mathbf{u}_m)\right]/l + \left[f_N^y(\mathbf{u}_m) - f_S^y(\mathbf{u}_m)\right]/h \tag{3.18-26}$$
because $[u_E^m(\Gamma) - u_W^m]/l + (v_N^m - v_S^m)/h = 0$. Multiplication by $l$ and letting $l, h \to 0$ now
clearly gives (3.18-13), the proper PPE BC. In fact, it is also clear that (3.18-26) is a
'repeat' of (3.18-11) and we are done: both SMAC and FP are nothing other than FE
applied properly; the fictitious variables really are fictitious and totally unnecessary, as
is the 'PPE on $\Gamma$' approach.
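The equivalence just derived can also be confirmed numerically for a single boundary cell. The sketch below computes $P_0$ once by the SMAC route (provisional velocities, $\tilde{u}_E = u_E^{m+1}(\Gamma)$, $P_E = P_0$, discrete PPE (3.18-24)) and once from the consistent FE form (3.18-11). Both function names are hypothetical, the neighboring pressures are treated as known, and the starting velocities are assumed to be discretely divergence-free.

```python
def fe_boundary_cell_pressure(P_W, P_N, P_S, f_W, f_N, f_S, dudt_E, l, h):
    """P_0 from the consistent boundary PPE (3.18-11)."""
    rhs = (P_W / l**2 + (P_N + P_S) / h**2
           - (dudt_E - f_W) / l - (f_N - f_S) / h)
    return rhs / (1.0 / l**2 + 2.0 / h**2)

def smac_boundary_cell_pressure(u_E_wall_new, u_W, v_N, v_S, P_W, P_N, P_S,
                                f_W, f_N, f_S, l, h, dt):
    """P_0 from the SMAC form (3.18-24), i.e. (3.18-18) with P_E = P_0."""
    u_W_t = u_W + dt * f_W            # (3.18-19), provisional velocities
    v_N_t = v_N + dt * f_N            # (3.18-20)
    v_S_t = v_S + dt * f_S            # (3.18-21)
    u_E_t = u_E_wall_new              # (3.18-22): the known wall value
    div_t = (u_E_t - u_W_t) / l + (v_N_t - v_S_t) / h
    rhs = P_W / l**2 + (P_N + P_S) / h**2 - div_t / dt
    return rhs / (1.0 / l**2 + 2.0 / h**2)
```

With `u_E_wall_new = u_E + dt*dudt_E`, per (3.18-10), the two functions return the same pressure to roundoff for any $\Delta t > 0$, which is the single-cell content of the statement that (3.18-26) repeats (3.18-11).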
3.19 NUMERICAL EXAMPLE—IMPULSIVE START
3.19.1 Introduction
We conclude this overly-long chapter with but a single example (to help compensate?):
flow past a circular cylinder via an 'impulsive' start from rest. But it will turn out to be a
long example because of the many different issues and problems that accompany it, the
first one being this: an impulsive start is a mathematical impossibility for incompressible
flow—it is an ill-posed IBVP. Please refer back to Section 3.9.1 for further important
details, and for an introduction to the problems to be 'solved' below.
So why then is an ill-posed problem our single selected example? There are several
reasons:
1. It is a very common 'misconception' in much of the fluid mechanics literature that is
in need of further clarification.
2. We want to show how and why some who thought an impulsive start from a fluid
at rest was actually that (perhaps by not looking carefully enough at the results after a
single time step)—and to note that those in the know knew it for what it was: a potential
flow initial condition—rather far from a fluid at rest!
3. We want to show how to 'legally' and easily (in principle at least) approximate the so-called
impulsive start (nearly) as closely as you like (or can afford), via an inlet BC of,
for example, $u = u_0(1 - e^{-\lambda t})$ for 'large' $\lambda$. Note that a nearly-equivalent startup could be
obtained with a normal traction BC of the type $f_n = \lambda Fe^{-\lambda t}$ at the inlet, where $F$ is the
desired 'impulsive' stress ($\int_0^\infty f_n\,dt$); i.e. we apply a large but rapidly-decaying 'force' at
the inlet. We chose the Dirichlet BC so as to get a better handle on the Reynolds number.
See Gresho et al. (1980a) for such a 'normal-force' startup (albeit non-impulsive). Note
too that the chosen 'exponential' start is only one of an infinite number of time-dependent
BC's that could approximate an impulsive start from rest.
4. We want to 'show off' a smart integrator (variable $\Delta t$) by the existence of widely
separated time scales; i.e., the very small acceleration time constant, $\tau_{ac} = 1/\lambda$, required
to get close to an impulse, would severely challenge any fixed-$\Delta t$ integration method.
'Our' integrator just takes it in its stride.
5. Another feature, related to the chosen problem, is the calculation of potential flow
with a Navier-Stokes 'solver', which requires that the code be able to rotate the $x$-
and $y$-momentum equations to the normal and tangential momentum equations a la
Section 3.13.1e, in order to apply the appropriate inviscid BC's: $\mathbf{n}\cdot\mathbf{u} = 0$ and free slip
(zero shear), which we will also demonstrate.
6. The next reason was actually unplanned: it is a simulation that makes the simulator
realize that s/he, and the code employed, has been severely 'numerically challenged'. We
will show what appear to be mesh-converged pressure solutions that are really not—a
consequence of not being able to properly simulate a vortex sheet. The easiest way to
get into this 'pickle' (which we shall later demonstrate) is to try the impulsive start in
the manner discussed in Section 3.9.1; namely, take a potential flow and apply to it the
no slip BC on the cylinder—'add viscosity and apply the brakes'. Also responding to the
numerical challenge was our friend and colleague, David Gartling, who, at our request,
provided some significant help early on by verifying some of our early 'strange' results
with his own code (NACHOS); see Gartling (1987). (He was also smart enough to 'bail
out' early!)
7. The final 'reason' was also unplanned, but worthwhile: to show how badly a great
'Stokes element' can perform when trying to solve problems that are closer to potential
flow than to Stokes flow. In fact our favorite 2D viscous flow element, $Q_2P_{-1}$, let us
down (via wiggles) during the phase of startup in which viscous effects were either negligible or too
'localized' to capture, a result that also beckons the FEM theorists to get
more 'active' in the time-dependent 'arena'. We also mention 'up-front', and show later,
that the presumably less stable $Q_1Q_0$ element beats $Q_2P_{-1}$ badly, at least with respect
to wiggles.
It is not our purpose either to provide an extensive bibliography on this much-studied
problem, or to attempt serious comparison of our results with those of others; the
latter partly because we employed a rather restrictive (small) and bounded domain that
in no way is meant to approximate the unbounded situation. Rather, we will be closer
to following Sarpkaya's suggestion (1989 personal communication) that fully numerical
simulations would do well to attack truly accelerating flows rather than just the limiting case
of a potential flow IC. [See too Sarpkaya (1996) for more recent discussion and citations,
and Telionis (1989) for transient incompressible flow in general, and impulsive starts in
particular.] He also (in Sarpkaya, 1992) comments on the impossibility of generating a
'truly impulsive flow either numerically or experimentally,' the latter needing to
additionally deal with compressibility and cavitation effects in liquids (even if the required infinite
force were available). We too will have much to say about its numerical impossibility.
A very recent reference covering virtually all aspects of flow past circular cylinders is
Zdravkovich (1997). Finally, as we are really mainly interested in 'small' time behavior,
we shall invoke symmetry and save a factor of 2 or so in each run which is justifiable
since the full flow is symmetric for any Re for t sufficiently small.
We shall investigate two types of 'impulsive' starts (potential flow IC and no-flow
IC with large acceleration) for two values of the Reynolds number (1000 and 0) for
two elements ($Q_2P_{-1}$, which we shall call 9/3 for short, and $Q_1Q_0$, which we shall call
4/1) using two time integrators (BE and TR), both in the 'smart' (variable-step) mode.
Fear not, however: we shall only discuss one of these eight combinations in any detail;
namely 9/3 via TR on the no-flow IC at Re = 1000. (We do want the reader to know,
however, that we spent many months examining all eight possibilities, plus others!)
3.19.2 Domain, mesh, BC's, IC's
After several 'mesh iterations', we settled on that shown in Figure 3.19-1 for most of
our 'runs' (for the 9/3 element; for 4/1 divide by 2 in each direction, roughly, since
9-node meshes are graded differently than 4-node in order to keep mid-side nodes at
midsides, a la our discussion on Good Simulation Practices in Volume II). At the 'end
of the (long!) day', we realized and admit that our mesh grading was not 'optimal'; we
should have graded more 'aggressively'; and we do, briefly, at the end. The mesh contains
4290 elements and 16965 nodes, with the unit-radius cylinder located at the center of a
domain that covers $-3 \le x \le 3$ and $-2 \le y \le 2$, although the symmetry BC allows us
to use $0 \le y \le 2$, which we do. The BC's are as follows: $u = u_0(1 - e^{-\lambda t})$ and $v = 0$
at the inlet ($x = -3$) with $u_0 = 0.1$ and $\lambda = 100 = 1/\tau_{ac}$; homogeneous NBC's at the
outlet ($x = 3$) as OBC's ($\nu\,\partial u/\partial x - P = 0 = \nu\,\partial v/\partial x$); $\mathbf{u} = 0$ on the cylinder ($\mathbf{n}\cdot\mathbf{u} = 0$
for the inviscid cases); and symmetry elsewhere ($y = 0$ and $y = 2$) via $v = \partial u/\partial y = 0$
(which also implies $\partial P/\partial y = 0$, a la Section 3.8.2). This was for the no-flow IC. For the
'impulsive'/potential flow version of the problem, the inlet BC (only) was changed to
$u = u_0 = 0.1$ and the resulting 'illegal' problem (no initial flow) was made legitimate
via an $L_2$-projection (see Appendix 3) to a discretely div-free potential flow + vortex
sheet (caused by the no-slip BC) at $t = 0^+$ using the very small BE time-step trick
already utilized and explained in Section 3.16.1d. (The first step gives the appropriate
velocity field and the second gives the concomitant pressure, both results being
virtually independent of Re and $\Delta t$, for $\Delta t$ sufficiently small.) The cylinder diameter ($D$)
is 2 and is the characteristic length for defining the Reynolds number; $\mathrm{Re} = u_0D/\nu$
gives, for Re = 1000, $\nu = 0.0002$, which value we also used for the Stokes flow
simulations. The resulting additional physical time scales (in addition to $\tau_{ac} = 0.01$) are:
$\tau_A = D/u_0 = 20$ (advection) and $\tau_D = D^2/\nu = 20000$ (diffusion), both of which are large
relative to $\tau_{ac}$.
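The numbers quoted in this paragraph follow from a few lines of arithmetic, sketched here in Python. The constants are those given in the text ($u_0 = 0.1$, $\lambda = 100$, $D = 2$, Re = 1000); the variable and function names, and the '99% of $u_0$' milestone, are our own illustrative choices.

```python
import math

U0 = 0.1        # asymptotic inlet velocity
LAMBDA = 100.0  # startup rate, 1/tau_ac
D = 2.0         # cylinder diameter (characteristic length)
RE = 1000.0     # Reynolds number

def inlet_velocity(t, u0=U0, lam=LAMBDA):
    """Exponential 'nearly impulsive' inlet BC: u = u0 * (1 - exp(-lam*t))."""
    return u0 * (1.0 - math.exp(-lam * t))

nu = U0 * D / RE                 # kinematic viscosity from Re = u0*D/nu
tau_ac = 1.0 / LAMBDA            # acceleration time constant
tau_A = D / U0                   # advection time scale
tau_D = D * D / nu               # diffusion time scale
t_99 = math.log(100.0) / LAMBDA  # time for the inlet to reach 99% of u0
```

This gives $\nu = 0.0002$, $\tau_{ac} = 0.01$, $\tau_A = 20$ and $\tau_D = 20000$, so the ratio $\tau_D/\tau_{ac}$ is $2 \times 10^6$; the inlet reaches 99% of $u_0$ at $t \approx 0.046$, a small fraction of one advection time.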
(a) The full domain.
(b) Zoom near the cylinder.
Fig. 3.19-1 The mesh of 4290 $Q_2P_{-1}$ (9/3) elements; 16 965 nodes.
3.19.3 Two steady-state results ($\nu = 0, \infty$)
Before launching into the transient simulations, we show and discuss two limiting steady-state
results: potential flow ($\mathrm{Re} = \infty$, via the 2-step trick using BE with $\nu = 0$) and
steady Stokes flow (Re = 0), in Figures 3.19-2 to 3.19-4. Whereas the Stokes result looks
(and is) reasonable and accurate, that for potential flow (realized, of course, by rotating
the momentum equations a la Section 3.13.1e and releasing the no-slip BC to use the
$\mathbf{n}\cdot\mathbf{u} = 0$ BC on the cylinder) is highly irregular/suspicious because of the WIGGLES in
Figures 3.19-3 and 3.19-4 (the discussion of which we defer temporarily). The reason
we can say the Stokes results are 'accurate' is the following, which is always a useful
adjunct to mesh refinement studies: the 4/1 element delivered virtually the same solution
as the 9/3 (cf. Figures 3.19-2(e) and 3.19-2(f)), which leads to the perhaps somewhat
promiscuous conclusion that the much-more-accurate-in-general 9/3 element is virtually
'mesh converged'.
(a) Potential flow streamlines ($\Delta\psi = 0.01$)
(b) Potential flow pressure; $P_{max} = 0.00505$, $P_{min} = -0.0284$ ($\Delta P = 0.00167$)
(c) Potential; $\phi_{max} = 80356$, $\phi_{min} = -0.1$ ($\Delta\phi = 4018$)
(d) Stokes flow streamlines ($\Delta\psi = 0.01$)
(e) Stokes flow pressure; $P_{max} = 3.73 \times 10^{-4}$, $P_{min} = -0.5 \times 10^{-4}$ ($\Delta P = 2.14 \times 10^{-5}$)
(f) 4/1 versions of Stokes flow pressure; $P_{max} = 3.41 \times 10^{-4}$, $P_{min} = -0.5 \times 10^{-4}$ ($\Delta P = 1.95 \times 10^{-5}$)
Fig. 3.19-2 Potential flow (left) and steady Stokes flow (right); 9/3 elements (mostly).
The streamlines in Figure 3.19-2 show the rather large 'displacement' thickness for
(sticky) Stokes flow, vis-a-vis (slippery) potential flow. Except for the first ones in from
the boundaries, the streamlines are equally-spaced ($\Delta\psi = 0.01$); the lowest value is
0.005 and the largest shown is 0.195 (the top of the domain has $\psi = 0.2 = \int_0^2 u_0\,dy$).
Turning now to the pressure (Figures 3.19-2(b) and 3.19-2(e)), the only thing the two have in
common is some symmetry about $x = 0$, but even these are different; the potential flow
has $P_{max} = 0.0051$ at both upstream and downstream stagnation points (even symmetry
about $x = 0$), whereas the Stokes pressure field displays odd symmetry, with $P_{max} =
0.000373$ at the forward stagnation point and $P_{min} = -0.000055$ at the rear. (Recall that
the pressure scales with $\nu$ for Stokes flow.) The pressure at the exit is close to zero for
both cases (and at the inlet for potential flow), and the minimum pressure for potential
flow occurs at the top of the cylinder; it is −0.0284, and we note that the analytic solution
[see, for example, Batchelor (1967)] for unbounded potential flow,
$$P = \frac{1}{2}u_\infty^2\left(\frac{a}{r}\right)^2\left[2\cos 2\theta - \left(\frac{a}{r}\right)^2\right] \tag{3.19-1}$$
in polar coordinates ($a$ is the radius), has $P_{max} = \frac{1}{2}u_\infty^2 = (0.1)^2/2 = 0.005$ at $\theta = 0$ and
$\theta = \pi$ and $P_{min} = -\frac{3}{2}u_\infty^2 = -0.015$ at $\theta = \pi/2$; both at $r = a$. Thus, our bounded domain
has captured well the stagnation pressures, but the additional flow forced past the top of
the cylinder has further reduced the 'Bernoulli' pressure minimum (to −0.028). Also,
rather than $u_{max} = 2u_\infty$ at the top of the cylinder for the unbounded case, we see $u_{max} =
2.75u_\infty$, a somewhat uncertain value because of the wiggles. (Bernoulli's theorem would
say that it is $\sqrt{2(0.005 + 0.028)} = 0.257 = 2.57u_\infty$.)
Fig. 3.19-3 Potential flow: 9/3 results on the left, 4/1 on the right. [Plots not reproduced. Panels (a), (b); (c), (d): $u$ vs $x$ at $y = 1$; (e), (f): $u$ vs $y$ at $x = 0$.]
Fig. 3.19-4 Mesh refinement results for 9/3, potential flow; 4323 node mesh on left, 1267
node mesh on right. [Plots not reproduced. Panels (a), (b): $u$ vs $x$ at $y = 2$; (c), (d): $u$ vs $x$ at $y = 1$; (e), (f): $u$ vs $y$ at $x = 0$.]
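The pressure and Bernoulli figures quoted above are easy to reproduce. The sketch below assumes unit density and the classical unbounded potential-flow solution for a circular cylinder, written as $P = \frac{1}{2}u_\infty^2(a/r)^2[2\cos 2\theta - (a/r)^2]$; the function names are hypothetical.

```python
import math

def cylinder_pressure(r, theta, u_inf, a=1.0):
    """Unbounded potential-flow pressure about a cylinder (unit density)."""
    s = (a / r) ** 2
    return 0.5 * u_inf ** 2 * s * (2.0 * math.cos(2.0 * theta) - s)

def bernoulli_speed(p_stagnation, p):
    """Speed implied by Bernoulli's theorem, p + u**2/2 = p_stagnation."""
    return math.sqrt(2.0 * (p_stagnation - p))

u_inf = 0.1
p_front = cylinder_pressure(1.0, 0.0, u_inf)          # stagnation point
p_top = cylinder_pressure(1.0, 0.5 * math.pi, u_inf)  # top of the cylinder
u_top_unbounded = bernoulli_speed(p_front, p_top)     # 2*u_inf
u_top_bounded = bernoulli_speed(0.005, -0.028)        # computed pressures
```

The analytic values are `p_front = 0.005` and `p_top = -0.015`, giving $u_{max} = 0.2 = 2u_\infty$ for the unbounded case; inserting the computed pressure minimum of −0.028 instead gives about 0.257, i.e. $2.57u_\infty$, as stated in the text.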
Finally we remark that the 'potential' field shown in Figure 3.19-2(c) is actually the
pressure field after the first small ($\Delta t = 10^{-5}$) BE step on the ill-posed problem
associated with zero initial velocity and $u = u_0 = 0.1$ as the inlet BC, per the discussion in
Section 3.16.1d for the Taylor vortex problem. A further digression on this potential field
may be worthwhile since the associated potential flow problem,
$$ \tilde{\mathbf{u}} = \mathbf{u} + \nabla\phi \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0 \ \text{in}\ \Omega, \qquad (3.19\text{-}2) $$
with BC's of
$$ \mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\tilde{\mathbf{u}} \ \text{on}\ \Gamma_D \qquad (3.19\text{-}3) $$
and
$$ \phi = 0 \ \text{on}\ \Gamma_N, \qquad (3.19\text{-}4) $$
where $\Gamma_N$ is the outflow portion of $\partial\Omega$ and $\Gamma_D$ is all the rest, where $\tilde{\mathbf{u}} = 0$ in $\Omega$ and
$\mathbf{n}\cdot\tilde{\mathbf{u}} = u_0$ at the inlet (which describes an $L^2$-projection (see Appendix 3) of the
non-smooth function $\tilde{\mathbf{u}}$), does not seem to be an easy one. Harder yet is the derived Poisson
equation for the potential,
$$ \nabla^2\phi = \nabla\cdot\tilde{\mathbf{u}} \ \text{in}\ \Omega, \qquad (3.19\text{-}5) $$
with
$$ \frac{\partial\phi}{\partial n} = 0 \ \text{on}\ \Gamma_D \qquad (3.19\text{-}6) $$
and
$$ \phi = 0 \ \text{on}\ \Gamma_N \qquad (3.19\text{-}7) $$
in which $\nabla\cdot\tilde{\mathbf{u}}$ is a sort of Dirac delta function, being zero everywhere except at the
inlet where it is unbounded. That the problem is not really all that difficult may perhaps
be better appreciated by showing the exact solution for the 1D analog to this 'Green's
function' problem (for $u_0 = 1$):
$$ \phi(x, \xi) = -(\xi - x)H(\xi - x) + L - x, \qquad (3.19\text{-}8) $$
where $\xi$ is the location of the source term (delta function), $L$ is the domain length, and $H(\cdot)$
is the Heaviside unit step function. It looks as in Figure 3.19-5. It has slope = 0 for $x < \xi$
and $\phi = 0$ at $x = L$, and the solution of interest is that with $\xi = 0$. [See too Hughes (1987),
p. 26.] The 2D analog of this exact solution is the potential function in Figure 3.19-2(c).
[Note that the potential flow solution is insensitive to the over-specification at the inlet;
v = 0 or v 'free' (a la true potential flow) look just the same.] Further explanation of the
potential function in this figure, whose maximum is $\approx 8.0 \times 10^4$ at the inlet (and whose
minimum is $\approx -0.06$ at the exit, close to the $\phi = 0$ NBC imposed) is as follows. For a
channel of length 6 with no cylinder, the linear potential field would have an inlet value
of 0.6 (for $u_0 = 0.1$) in order to give $u = \partial\phi/\partial x = 0.1$; the increase from 0.6 to 0.8 is
caused by the cylinder and the factor of $10^5$ is $1/\Delta t$, where we recall that the code thinks
that this is a pressure, whereas it is really $\phi/\Delta t$, per the discussion in Section 3.16.1d.
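The properties claimed for the 1D Green's function (3.19-8) are easy to check numerically. The following is a minimal sketch, not the authors' code; the helper names (`heaviside`, `green_1d`, `slope`) and the sample source location are illustrative assumptions.

```python
def heaviside(s):
    """Unit step H(s): 1 for s > 0, 0 otherwise (convention at s = 0 is an assumption)."""
    return 1.0 if s > 0.0 else 0.0


def green_1d(x, xi, length):
    """1D analog (3.19-8): phi(x, xi) = -(xi - x) H(xi - x) + L - x."""
    return -(xi - x) * heaviside(xi - x) + length - x


def slope(x, xi, length, eps=1e-6):
    """Central-difference estimate of d(phi)/dx at x."""
    return (green_1d(x + eps, xi, length) - green_1d(x - eps, xi, length)) / (2.0 * eps)


L = 6.0   # channel length quoted in the text
xi = 2.0  # hypothetical source location, chosen only to show both branches

left_slope = slope(1.0, xi, L)    # x < xi: flat
right_slope = slope(4.0, xi, L)   # x > xi: unit (negative) slope
exit_value = green_1d(L, xi, L)   # phi = 0 at x = L

# Case of interest, xi = 0, scaled by the inlet velocity u0 = 0.1:
inlet_potential = 0.1 * green_1d(0.0, 0.0, L)

print(left_slope, right_slope, exit_value, inlet_potential)
```

With the source at the inlet and $u_0 = 0.1$, the inlet value is the 0.6 quoted above for the channel without a cylinder.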
Fig. 3.19-5 1D Green's function ($\phi$ vs $x$ on $0 \le x \le L$, with the source at $x = \xi$).
We now return to the wiggles in Figures 3.19-3 and 3.19-4 for potential flow and
point out four things: (i) the (LBB-stable) 9/3 element is (ironically?) much noisier
than the (LBB-unstable) 4/1; (ii) the clear reduction in wiggle magnitude with mesh
refinement suggests that convergence will indeed occur; (iii) part of the wiggles are
caused by the C-matrix (even when operating on what appears to be a smooth 'pressure'
field) and part are caused by the consistent mass matrix (in its usual mode of 'wiggle
signal announcer'); a lumped mass version of this potential flow yields smaller wiggles
and a lumped mass matrix on an Re = 1000 simulation generated no wiggles; and (iv) for the
Re = 1000 transient case to follow, the wiggles are only present at early time, before
viscous effects obliterate them. Recall that the discrete version of this 'second-order mixed
elliptic' problem is, via the first (very small) time step of a BE integration,
$$ \frac{1}{\Delta t} M (u_1 - u_0) + C P_1 = 0 \qquad (3.19\text{-}9) $$
and
$$ C^T u_1 = g, \qquad (3.19\text{-}10) $$
giving
$$ (C^T M^{-1} C) P_1 = (C^T u_0 - g)/\Delta t \qquad (3.19\text{-}11) $$
and
$$ u_1 = u_0 - \Delta t\, M^{-1} C P_1 \qquad (3.19\text{-}12) $$
for the ($L^2$-projected) potential velocity, where $u_0$ corresponds to $\tilde{\mathbf{u}}$ in the above discussion
for the continuum, and $\Delta t P_1 = \phi$. If $P_1$ is smooth (which usually seems to be the case)
yet $u_1$ is rough (wiggly), the 'cause' can be in either $C P_1$ or $M^{-1} C P_1$ (or both), and we
believe that we have some of each. For example, the wiggles at y = 1 in Figures 3.19-3
and 3.19-4 are much the same when M is lumped, suggesting that the culprit is C.
Similarly, let us focus on the 4/1 element wiggles at the upper corners of the mesh (the
y = 2 plot in Figure 3.19-3(b)), which wiggles are present for CM and LM on all three
meshes and even on a fourth (with over 67 000 nodes). We offer the following additional
remarks for Q1Q0:
1. This element often performs poorly at or near mesh lines that point into a corner of
the domain (for reasons that we do not understand).
2. It also responds badly when only two pressures are available in the 'nodal' momentum
equation (2-patches) and when the elements are large and/or distorted, all of which
occur along the y = 2 line. (In fact, 2-patch equations often do not have coefficients of
the $\nabla P$ approximation (the C-matrix) that sum to zero, thus causing trouble even if P
is constant!)
3. We have sung the 'Bent Element Blues' before [in Gresho and Leone (1984)].
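The discrete projection step (3.19-9) to (3.19-12) recalled above can be sketched on a toy system. This is only an illustration under stated assumptions: the 3-by-2 gradient matrix `C`, the lumped (diagonal) mass matrix `M_DIAG`, and the data `u0`, `g` are invented for the example and do not come from the 9/3 or 4/1 discretizations in the text.

```python
def project(u0, g, dt, m_diag, c):
    """One BE 'projection' step: solve (3.19-11) for P1, then (3.19-12) for u1.

    m_diag: diagonal of a lumped mass matrix (assumption); c: 3x2 gradient matrix.
    """
    n_u, n_p = len(c), len(c[0])
    minv_c = [[c[i][j] / m_diag[i] for j in range(n_p)] for i in range(n_u)]
    # Schur complement S = C^T M^{-1} C (2x2 here).
    s = [[sum(c[k][i] * minv_c[k][j] for k in range(n_u)) for j in range(n_p)]
         for i in range(n_p)]
    ct_u0 = [sum(c[k][i] * u0[k] for k in range(n_u)) for i in range(n_p)]
    rhs = [(ct_u0[i] - g[i]) / dt for i in range(n_p)]
    # Cramer's rule is enough for a 2x2 system.
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    p1 = [(rhs[0] * s[1][1] - s[0][1] * rhs[1]) / det,
          (s[0][0] * rhs[1] - s[1][0] * rhs[0]) / det]
    u1 = [u0[i] - dt * sum(minv_c[i][j] * p1[j] for j in range(n_p)) for i in range(n_u)]
    return p1, u1


M_DIAG = [2.0, 1.0, 4.0]                       # hypothetical lumped masses
C = [[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]]     # hypothetical discrete gradient
u0 = [0.1, 0.0, 0.0]                           # non-solenoidal starting velocity
g = [0.0, 0.0]

p1_a, u1_a = project(u0, g, 1e-5, M_DIAG, C)
p1_b, u1_b = project(u0, g, 1e-3, M_DIAG, C)

# Discrete divergence of the projected velocity, C^T u1 - g.
residual = [sum(C[k][i] * u1_a[k] for k in range(3)) - g[i] for i in range(2)]
phi_a = [1e-5 * p for p in p1_a]   # dt * P1 plays the role of the potential
phi_b = [1e-3 * p for p in p1_b]
print(residual, phi_a, phi_b)
```

In this sketch the 'pressure' $P_1$ scales like $1/\Delta t$ while $\Delta t P_1$ and $u_1$ do not depend on $\Delta t$, which is the behavior described for the potential field of Figure 3.19-2(c).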
In Figure 3.19-6 we add yet another piece to the wiggle puzzle. Shown is the velocity
field in the upper right-hand corner of the mesh (similar behavior occurs in the upper
left-hand corner) from $Q_2P_{-1}$ and potential flow for two different pressure approximations.
The rough field in Figure 3.19-6(a) is from the local approximation on the element
($P = a + b\xi + c\eta$) and the smooth field [Figure 3.19-6(b)] is from the global approximation
($P = A + Bx + Cy$), and we remark that (i) both of the solutions are smooth and good
for steady Stokes flow, (ii) we also do not understand the cause of these wiggles, and
(iii) only the global approximation is used henceforth.
After mentioning that (Stokes) triangular elements (e.g. $P_2P_0$) also wiggle for potential
flow, we assert that we have done our best to unravel these element 'stability'
problems, but have found the literature to be rather sparse when 'Stokes elements' are used
for potential flow or related simulations that also involve $L^2$-projections [for example,
projection methods and even explicit time integration methods (see below)]. What we
seem to have here is a lack of satisfaction of the so-called 'first Brezzi stability
condition' (D. Arnold, personal communication), that of coercivity/ellipticity. An intuitive
'feel' for the problem is related to the 'inconsistency' in the mixed approximation in
that the velocity is in $H^1$ and the (lower-order) pressure/potential is in the larger and
less smooth space $L^2$, yet the only spatial derivatives [cf. (3.19-2)] are on the latter. The
'second Brezzi condition' (the LBB condition) is indeed satisfied for the more wiggly of
our two elements, for Stokes or potential flow. While we know of no proofs that certain
(all?) stable Stokes elements either pass or fail the coercivity condition when applied to
potential flow, we suspect (and this suspicion is reinforced by our discussions with a
fair number of 'finite element mathematicians') that they fail. [This stability condition
is related to the stability of the inverse of the matrix $\begin{pmatrix} A & C \\ C^T & 0 \end{pmatrix}$, where $A = K$ for Stokes
flow and $A = M$ (or $I$ for FDM) for potential flow.] [It is quite noteworthy that the simple
FE time marching method for the NSE's involves virtually the same matrix as potential
flow, with $A = (1/\Delta t)M$; yet we, and many others, have computed quite successfully via
FE and 'Stokes' elements.] Another 'intuitive feeling' as to why it may fail the ellipticity
test is that, noting first that the inverse of the above matrix also involves the inverse of
the Schur complement, $C^T A^{-1} C$, $K^{-1}$ is a 'smoothing' operator (the inverse Laplacian)
whereas $M^{-1}$ (or $I$) is not, so that if C tends to be 'unstable' in some sense, $K^{-1}$ (only) can
help it to recover. We believe that when Stokes elements fail the first Brezzi condition
for potential flow (or for explicit time marching methods) but nevertheless perform
satisfactorily (if not optimally) in practice, this is a consequence of one or both of the following:
(i) the stability condition is a sufficient but not a necessary condition for convergence,
and (ii) the RHS's that are encountered 'in practice' are special, not general, and the
study of stability necessarily considers all possible 'data' (RHS's). Indeed, in the new
'Bible' on mixed finite element methods, Brezzi and Fortin (1991) do not even consider
these 'Stokes elements' for the solution of such a (potential flow) problem. Rather, they
seek elements in the larger space than $H^1$ called $H(\mathrm{div})$: vector fields in this space need
not (and generally do not) have continuous tangential components; the vectors and their
divergences are all that must be in $L^2$. But these elements, good as they might be for the
inviscid case, are useless as soon as we introduce viscosity and the no-slip BC; thus, we go
no further in the direction of $H(\mathrm{div})$ elements. [For an alternative approach to computing
potential flow via a mixed FEM, see Carey and Oden (1986), in which integration by parts
is applied to the continuity equation rather than to the pressure (potential) gradient with
the result that 'pressure' (potential) is the 'higher-order' variable. In Pironneau (1989),
potential flow is discussed in the context of the Laplace equation for the potential. See
too, Lee et al. (1982), in which the Laplace equation approach was surprisingly better
than the mixed method approach.]
Fig. 3.19-6 Potential flow velocity field from $Q_2P_{-1}$ in upper right corner of the mesh shown
in Figure 3.19-1. Panels: (a) local pressure approximation; (b) global pressure approximation.
We conclude our discussion of the steady cases by further verifying that the 9/3 element
behaves 'perfectly' for steady Stokes flow, as is of course well known. Figure 3.19-7
shows the line plots that were wiggly for potential flow. Also shown is the vorticity, and
we remark that results from $Q_1Q_0$ are very close to these, within a few percent.
Fig. 3.19-7 Further results from $Q_2P_{-1}$ for steady Stokes flow. Panels: (a) u vs x at y = 2;
(b) u vs x at y = 1; (c) u vs y at x = 0; (d) vorticity ($\omega_{max} = 0.075$, $\omega_{min} = -0.799$).
3.19.4 Pressure impulse
Enough on the two steady solutions; now we will show results from a transient case
integrated with TR (in the 'smart' mode): Re = 1000 with a horizontal inlet velocity BC
of $u(t) = 0.1(1 - e^{-100t})$, zero initial velocity, with the 9/3 element. This is the simulation
that does start from rest, albeit with large acceleration and pressure at small time (from
$\dot{u}(t)|_{inlet} = 10e^{-100t}$ and the PPE's BC, $\partial P/\partial x = -\dot{u}(t)$, at the inlet). It also provides a
good example of what Batchelor (1967) has called the 'pressure impulse':
$$ \pi = \int p\, dt, \qquad (3.19\text{-}13) $$
and, as we shall see, for sufficiently large $\lambda$ and sufficiently small time, the pressure goes
like $\phi(x, y)\lambda e^{-\lambda t}$ (from the $\mathbf{n}\cdot\dot{\mathbf{u}}$ term in the PPE's BC, (3.8-36)) so that
$$ \pi = \phi(x, y)(1 - e^{-\lambda t}), \qquad (3.19\text{-}14) $$
which is close to $\phi(x, y)$, the initial potential field, if $\lambda t \gg 1$ yet t is still small enough
that the other physical processes (advection and diffusion) have not yet had a chance
to 'react'; the initial pressure is just $\lambda$ times the potential function. Thus, as stated by
Batchelor (1967), 'This relation provides us with a physical interpretation of the velocity
potential. The potential $\phi$ of a given irrotational velocity distribution may be interpreted as
$(-1/\rho)$ times the pressure impulse required to set up the flow from rest, or, alternatively,
as $(1/\rho)$ times the pressure impulse required to reduce the given motion to rest.' In fact,
the initial pressure field for this problem is exactly 1000-fold smaller than the 'potential'
function shown in Figure 3.19-2(c); multiplication by $\Delta t$ reduces it by a factor of $10^5$,
and multiplying it by $\lambda$ increases it 100-fold; the initial $\Delta P$ over the domain is $\approx 80$. This
shape of the pressure field remains virtually unchanged during the transient for times up to
almost $5\tau_{ac}$, while it decays in magnitude according to $P_0(x, y)e^{-\lambda t}$, and the flow during
this same time is, in fact, merely an accelerating potential flow (basically, except very
close to the cylinder), until either advection or diffusion has had time to react/wake up; the
idea here being to have $\lambda$ large enough that a true potential flow (not merely a potential
acceleration with a feeble flow) exists before the fluid even knows what hit it. (It may
be worthwhile to point out/emphasize that, even though we envision $\lambda \to \infty$, which
generates unbounded initial pressure and acceleration, the initial velocity is always zero.)
Thus we need (at least) $\tau_{ac} \ll t_a = D/u_\infty = 20$ and $\tau_{ac} \ll t_d = D^2/\nu = 20000$, which,
as already mentioned, is easily satisfied for our $\tau_{ac}$ of 0.01.
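As a check on (3.19-13) and (3.19-14), the pressure impulse can be integrated numerically for a unit potential. This is a hedged sketch only: it assumes $\tau_{ac} = 1/\lambda$ (consistent with the quoted $\lambda = 100$ and $\tau_{ac} = 0.01$), takes $D = 2$ and $\nu = 0.0002$ from the values used elsewhere in this section, and uses illustrative function names.

```python
import math

LAM = 100.0            # startup rate lambda in u(t) = 0.1 (1 - exp(-lambda t))
TAU_AC = 1.0 / LAM     # acceleration time scale (assumed equal to 1/lambda)
D, U_INF, NU = 2.0, 0.1, 2.0e-4   # diameter, free-stream speed, viscosity


def pressure(t, phi=1.0):
    """Early-time pressure, phi * lambda * exp(-lambda t)."""
    return phi * LAM * math.exp(-LAM * t)


def impulse_numeric(t_end, phi=1.0, n=20000):
    """Trapezoidal approximation of the pressure impulse, integral of p dt."""
    dt = t_end / n
    total = 0.5 * (pressure(0.0, phi) + pressure(t_end, phi))
    for k in range(1, n):
        total += pressure(k * dt, phi)
    return total * dt


def impulse_exact(t, phi=1.0):
    """Closed form (3.19-14): phi * (1 - exp(-lambda t))."""
    return phi * (1.0 - math.exp(-LAM * t))


t5 = 5.0 * TAU_AC
pi_num = impulse_numeric(t5)
pi_ex = impulse_exact(t5)

t_a = D / U_INF      # advection time scale
t_d = D * D / NU     # diffusion time scale
print(pi_num, pi_ex, TAU_AC / t_a, TAU_AC / t_d)
```

By $t = 5\tau_{ac}$ the impulse is already within about one percent of the potential itself, while $\tau_{ac}$ remains far smaller than both $t_a$ and $t_d$.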
3.19.5 Minimum Time of Believability; Re = 1000 Results
What is not so easy to obtain relates to the vortex sheet on the cylinder and its diffusion
by viscosity. Defining
$$ \tau_{MTB} = h^2/4\nu \qquad (3.19\text{-}15) $$
as the fastest diffusional response time of the mesh (which we call the Minimum Time of
Believability for reasons that will become quite clear later), where h is the distance to the
closest node off the cylinder surface (h = 0.022 for our mesh), gives $\tau_{MTB} \approx 0.60$, or
$60\tau_{ac}$ (see too Section 2.4.1). Thus, we are not surprised that the acceleration transient can
achieve a virtually steady potential flow before the fluid reacts in other ways; in fact, we
planned it that way. A most sensitive measure of this 'reaction' is the maximum magnitude
of the vorticity, which of course is that at the top surface of the cylinder: it is 23.6 at
$t = 0^+$ for our version of an impulsive start (potential flow + no-slip BC) and is the best
approximation we have to the true value ($\infty$). For later times, it is as follows, where
the result from the $e^{-\lambda t}$ startup is shown in parentheses: 22.1 (22.2) at $5\tau_{ac}$, 20.8 (21.1)
at $10\tau_{ac}$ (t = 0.10), and 19.9 at $15\tau_{ac}$; the corresponding contour plots (not shown)
look much like those in Figure 3.19-8, the vortex sheet as 'seen' by our (definitely finite)
mesh. Thus, the velocity/vorticity field is indeed changing slowly at the 'completion'
of the acceleration transient (say $10\tau_{ac}$). But the pressure, as usual, is another matter
entirely; it always 'reacts' first and fast to any change in 'external' conditions/forcing.
This is clearly seen in Figure 3.19-9, showing the isobars between $t = 5\tau_{ac}$ and $t = 10\tau_{ac}$,
which appears to be the transition period: transition from acceleration-dominated potential
flow + vortex sheet to a flow in which both advection and diffusion are 'effective'. [At
$t = 5\tau_{ac}$, $P_{0(max)}e^{-\lambda t} = 80e^{-5}$ is 98.5% of $P_{max}$ in Figure 3.19-9(a); at $t = 10\tau_{ac}$, $80e^{-10}$
is only 23% of $P_{max}$ in Figure 3.19-9(f).] By $t = 10\tau_{ac} = 0.10$, the pressure already looks
a lot like that for potential flow (Figure 3.19-2), but it is definitely 'viscosity-affected',
the range now being from $\approx -0.026$ near the top of the cylinder (vis-a-vis $-0.028$ for
potential flow) to $-0.016$ at the forward stagnation point (vis-a-vis $-0.005$ for potential
flow). It thus appears that it is the 'source' term [$-\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u})$] on the RHS of the PPE
(which is still very close to 'potential', $-\tfrac{1}{2}\nabla^2 q^2$) that determines the pressure minimum
(and most of the 'shape' of the field), but that the 'Stokes' boundary term ($\nu\mathbf{n}\cdot\nabla^2\mathbf{u}$ in
the Neumann BC for the PPE) has a significant effect on the maximum. (The fact that
this BC is singular when a vortex sheet is present causes significant difficulty.)
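The numbers quoted in this paragraph follow from a few lines of arithmetic. The sketch below simply re-evaluates them; it assumes $\nu = 0.0002$ (the value used later in this section) and takes $P_{max}$ from the captions of Figure 3.19-9(a) and (f).

```python
import math

NU = 2.0e-4      # kinematic viscosity (assumed from later in the section)
H = 0.022        # distance to the first node off the cylinder
TAU_AC = 0.01    # acceleration time scale

# Minimum Time of Believability, (3.19-15).
tau_mtb = H * H / (4.0 * NU)
ratio_to_tau_ac = tau_mtb / TAU_AC

# Decaying acceleration pressure, 80 exp(-lambda t), against the computed maxima.
P0_MAX = 80.0
frac_at_5 = P0_MAX * math.exp(-5.0) / 0.547    # P_max at t = 5 tau_ac
frac_at_10 = P0_MAX * math.exp(-10.0) / 0.016  # P_max at t = 10 tau_ac

print(tau_mtb, ratio_to_tau_ac, frac_at_5, frac_at_10)
```

The second fraction makes the point of the bracketed remark: by $t = 10\tau_{ac}$ less than a quarter of the pressure maximum is attributable to the decaying acceleration field.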
Shown next are $\psi$, P, and $\omega$ at t = 5 and t = 10, in Figure 3.19-10, and we note from
the minimum value of $\psi$ ($\approx -8 \times 10^{-6}$ at t = 5 and $\approx -0.037$ at t = 10) that separation
has occurred ($\psi_{min} = 0$ prior to separation). Boundary layer theory (Re = $\infty$) predicts, for
the unbounded domain, a separation time of $\approx 3.22$ and a good prediction at Re = 1000
is $\approx 3.71$ [both from Collins and Dennis (1973a)]. Vorticity advection is clearly evident
at these two times. Further snapshots at t = 15 and 25 are shown in Figure 3.19-11, the
latter showing two secondary eddies. In this figure, and the next two, the $\Delta\psi$ for the
contours is definitely not constant; the flow in the small eddies is very weak.
Finally, Figures 3.19-12 through 3.19-14 show the results at 'late' time: t = 40 (also
shown in the frontispiece) and t = 95, the latter of which is surely a good 'test' of the
OBC, and is the end of our simulation. Similar results (but definitely not the same, owing
at least to our 'small' domain) are available in (at least) Ta Phuoc Loc (1980), Chou and
Huang (1996), and Chu et al. (1996). Related results at larger Re (3000 and 9500) are
presented in Ta Phuoc Loc and Bouard (1985), in which the two small secondary eddies
near the top of the cylinder are called 'phenomenon $\alpha$' after Bouard and Coutanceau (1980).
[For a fairly recent review of the literature on unsteady flows, see Sarpkaya (1996).]
Fig. 3.19-8 The vortex 'sheet' on the mesh of Figure 3.19-1; potential flow with no-slip
boundary condition ($\omega_{max} \approx 4.9$; $\omega_{min} = -23.6$).
Fig. 3.19-9 Isobars during the transition phase, Re = 1000. Panels:
(a) $t = 5\tau_{ac} = 0.05$; Pmax = 0.547, Pmin = 0 ($\Delta P$ = 0.027)
(b) $t = 6\tau_{ac}$; Pmax = 0.211, Pmin = 0 ($\Delta P$ = 0.011)
(c) $t = 7\tau_{ac}$; Pmax = 0.081, Pmin = 0 ($\Delta P$ = 0.0041)
(d) $t = 8\tau_{ac}$; Pmax = 0.037, Pmin = -0.013 ($\Delta P$ = 0.0025)
(e) $t = 9\tau_{ac}$; Pmax = 0.024, Pmin = -0.021 ($\Delta P$ = 0.0023)
(f) $t = 10\tau_{ac}$; Pmax = 0.016, Pmin = 0. ($\Delta P$ = 0.0021)
Fig. 3.19-10 Solution at t = 5 (left) and t = 10 (right). Panels:
(a) Stream function
(b) Pressure; Pmax = 0.0079, Pmin = -0.0256 ($\Delta P$ = 0.00167)
(c) Vorticity; $\omega_{max}$ = 0.67, $\omega_{min}$ = -7.53 ($\Delta\omega$ = 0.41)
(d) Stream function
(e) Pressure; Pmax 0.0091, Pmin 0.0256
(f) Vorticity; $\omega_{max}$ = 3.20, $\omega_{min}$ = -7.20 ($\Delta\omega$ = 0.52)
Fig. 3.19-11 Solution at t = 15 (left) and t = 25 (right). Panels:
(a) Stream function
(b) Pressure; Pmax = 0.0119, Pmin = -0.020 ($\Delta P$ = 0.0016)
(c) Vorticity; $\omega_{max}$ = 5.80, $\omega_{min}$ = -7.09 ($\Delta\omega$ = 0.64)
(d) Stream function
(e) Pressure; Pmax = 0.014, Pmin = -0.034 ($\Delta P$ = 0.0032)
(f) Vorticity; $\omega_{max}$ = 4.62, $\omega_{min}$ = -7.12 ($\Delta\omega$ = 0.78)
3.19.6 Transient Stokes Flow
We shall return to this more interesting case after a brief look at the 'boring' case,
Re = 0, because we later (must) consider both. (Re = 0 is not as easy as it might seem.)
Figure 3.19-15 shows snapshots at three times (t = 1, 10 and 100), and we point out that
either IC gives the same results (for $t \gg \tau_{ac}$). The symmetric diffusion of vorticity and
the nearly-constant shape of the pressure field are obvious; the latter is a subject that we
wish to say a lot more about, since virtually every Stokes pressure field that we have
examined (even those for the impulsive start: potential flow + no-slip BC) has virtually
the same shape for all time. The velocity field for (unbounded) potential flow is [in polar
coordinates; Batchelor (1967)]
$$ u_r = u_\infty (1 - a^2/r^2)\cos\theta \qquad (3.19\text{-}16) $$
and
$$ u_\theta = -u_\infty (1 + a^2/r^2)\sin\theta \qquad (3.19\text{-}17) $$
and the normal component of $\nabla^2\mathbf{u}$, needed for the PPE's Neumann BC, is ($\mathbf{n}$ points in
the $-r$ direction)
$$ \mathbf{n}\cdot\nabla^2\mathbf{u} = -\left\{ \frac{\partial}{\partial r}\left[\frac{1}{r}\frac{\partial}{\partial r}(r u_r)\right] + \frac{1}{r^2}\frac{\partial^2 u_r}{\partial\theta^2} - \frac{2}{r^2}\frac{\partial u_\theta}{\partial\theta} \right\}, \qquad (3.19\text{-}18) $$
which, for potential flow, vanishes identically [$\nabla^2\mathbf{u} = 0$ for every solenoidal irrotational
vector field, since $\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}$]. But if we impose the no-slip BC upon
this otherwise potential (curl-free) velocity, the last term in (3.19-18) vanishes (at r = a)
with the result that
$$ \mathbf{n}\cdot\nabla^2\mathbf{u}|_\Gamma = -\frac{\partial}{\partial r}\left[\frac{1}{r}\frac{\partial}{\partial r}(r u_r)\right]_{r=a} = \frac{4u_\infty}{a^2}\cos\theta, \qquad (3.19\text{-}19) $$
so that the PPE BC (at t = 0 only) for the impulsive start is (or seems to be)
$$ \frac{\partial P}{\partial n} = -\frac{\partial P}{\partial r} = \nu\mathbf{n}\cdot\nabla^2\mathbf{u}|_\Gamma = \frac{4\nu u_\infty}{a^2}\cos\theta, \qquad (3.19\text{-}20) $$
which of course is the only driving function for the Stokes pressure field. [It is also
impossible to compute accurately on any mesh, an important point to which we will
soon return.] With the BC given by (3.19-20), it is easy to show that the (unbounded
domain) solution to $\nabla^2 P = 0$ is
$$ P(r, \theta) = \frac{4\nu u_\infty \cos\theta}{r} \qquad (3.19\text{-}21) $$
for $P \to 0$ as $r \to \infty$. For steady solutions, multiply (3.19-21) by $-c$ with $c > 0$, which
constant will be discussed below, and we note that $P \sim -\cos\theta/r$ is the 'normal' case: high
pressure at the inlet ($\cos\theta < 0$) and low pressure at the outlet ($\cos\theta > 0$). The 'wrong'
sign for the 'proposed' t = 0 potential flow plus vortex sheet solution (which, if it is ever
valid, applies only at t = 0; not for t < 0 and not for t > 0) seems to be so because the
no-slip BC wants to slow the flow. (Later, we shall present a proposed pressure solution
for t > 0.) Our bounded-domain solutions (when Stokes-like) are actually not too different
from $P(r, \theta) = (A\cos\theta)/r$ for some A, even though we have different BC's [$\partial P/\partial y = 0$
at y = 0 and y = 2, $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)$ at the inlet, and P = 0 at the
outlet].
Fig. 3.19-12 Solution at t = 40. Panels: (a) Stream function; (b) Pressure (Pmax = 0.014,
Pmin = -0.040; $\Delta P$ = 0.0027); (c) Vorticity ($\omega_{max}$ = 2.09, $\omega_{min}$ = -7.01; $\Delta\omega$ = 0.35).
Fig. 3.19-13 Solution at t = 95. Panels: (a) Stream function; (b) Pressure (Pmax = 0.043,
Pmin = -0.010; $\Delta P$ = 0.0026); (c) Vorticity ($\omega_{max}$ = 2.10, $\omega_{min}$ = -6.99; $\Delta\omega$ = 0.45).
Fig. 3.19-14 The small eddies at t = 40 and 95. Panels: (a) t = 40; see also Figure 3.19-12;
(b) t = 95; see also Figure 3.19-13.
Fig. 3.19-15 Pressure and vorticity during developing Stokes flow, starting from potential
flow or from rest with large acceleration. Panels:
(a) P at t = 1; Pmax = 0.00473, Pmin = -0.00092 ($\Delta P$ = 0.00028)
(b) $\omega$ at t = 1; $\omega_{max}$ = 0.20, $\omega_{min}$ = -11.34 ($\Delta\omega$ = 0.58)
(c) P at t = 10; Pmax = 0.00159, Pmin = -0.00031 ($\Delta P$ = 0.000095)
(d) $\omega$ at t = 10; $\omega_{max}$ = 0.099, $\omega_{min}$ = -3.680 ($\Delta\omega$ = 0.19)
(e) P at t = 100; Pmax = 0.000633, Pmin = -0.000119 ($\Delta P$ = 0.000038)
(f) $\omega$ at t = 100; $\omega_{max}$ = 0.052, $\omega_{min}$ = -1.375 ($\Delta\omega$ = 0.071)
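The chain (3.19-16) to (3.19-21) can be checked numerically with simple central differences. The sketch below is an independent check, not the authors' procedure; the values $\nu = 0.0002$, $u_\infty = 0.1$, $a = 1$ and the sample angle are assumptions taken from or consistent with this section, and all function names are illustrative.

```python
import math

NU, U_INF, A = 2.0e-4, 0.1, 1.0   # viscosity, free-stream speed, cylinder radius


def u_r(r, th):
    """Radial potential-flow velocity, (3.19-16)."""
    return U_INF * (1.0 - A**2 / r**2) * math.cos(th)


def u_th(r, th):
    """Tangential potential-flow velocity, (3.19-17)."""
    return -U_INF * (1.0 + A**2 / r**2) * math.sin(th)


def pressure(r, th):
    """Candidate t = 0 Stokes pressure, (3.19-21)."""
    return 4.0 * NU * U_INF * math.cos(th) / r


def d_dr(f, r, th, e=1e-3):
    return (f(r + e, th) - f(r - e, th)) / (2.0 * e)


def d_dth(f, r, th, e=1e-3):
    return (f(r, th + e) - f(r, th - e)) / (2.0 * e)


def d2_dr2(f, r, th, e=1e-3):
    return (f(r + e, th) - 2.0 * f(r, th) + f(r - e, th)) / e**2


def d2_dth2(f, r, th, e=1e-3):
    return (f(r, th + e) - 2.0 * f(r, th) + f(r, th - e)) / e**2


def radial_laplacian_terms(r, th):
    """The three terms inside the braces of (3.19-18)."""
    def inner(s, t):
        return d_dr(lambda q, w: q * u_r(q, w), s, t) / s
    t1 = d_dr(inner, r, th)
    t2 = d2_dth2(u_r, r, th) / r**2
    t3 = -2.0 * d_dth(u_th, r, th) / r**2
    return t1, t2, t3


def laplacian(f, r, th):
    """Scalar Laplacian in polar coordinates."""
    return d2_dr2(f, r, th) + d_dr(f, r, th) / r + d2_dth2(f, r, th) / r**2


theta = 0.3   # arbitrary sample angle
t1, t2, t3 = radial_laplacian_terms(A, theta)
n_lap_potential = -(t1 + t2 + t3)   # pure potential flow: should vanish
n_lap_noslip = -(t1 + t2)           # no-slip kills the u_theta term, (3.19-19)
dp_dn = -d_dr(pressure, A, theta)   # Neumann data implied by (3.19-21)
lap_p = laplacian(pressure, 1.5, theta)
print(n_lap_potential, n_lap_noslip, dp_dn, lap_p)
```

Dropping only the $u_\theta$ term changes the result from zero to $(4u_\infty/a^2)\cos\theta$, and the pressure (3.19-21) is harmonic with exactly that Neumann data scaled by $\nu$.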
Remark:
As promised in Section 3.8.2 we have here an example of an incompressible flow that
violates $\nabla\cdot\mathbf{u} = 0$ on $\Gamma_D$: for potential flow + no-slip, $\nabla\cdot\mathbf{u} = (1/r)[\partial(r u_r)/\partial r + \partial u_\theta/\partial\theta]$
applied at r = a, gives $\nabla\cdot\mathbf{u} = (1/a)[2u_\infty\cos\theta + 0] \ne 0$.
It is also worth pointing out that even though $\nabla^2\mathbf{u} = 0$ for potential flow, the normal
pressure gradient at r = a is still non-zero; it is, from (3.19-1),
$$ \frac{\partial P}{\partial r} = \frac{4u_\infty^2}{a}\sin^2\theta, \qquad (3.19\text{-}22) $$
which is also $-\mathbf{n}\cdot(\mathbf{u}\cdot\nabla\mathbf{u})$ at r = a. [This result also corrects a small error in Gresho and
Sani (1987), Remark (iv) on p. 1119.] True potential flow has a maximum in $\partial P/\partial r$ at
$\theta = \pi/2$ rather than the 'Stokes flow' result, of $\partial P/\partial r = 0$ there. The solution given by
(3.19-21) is what we refer to, perhaps paradoxically, as an Euler-Stokes flow [see also
Gresho (1991a,b, 1992)]: an Euler velocity field (except on r = a) and a Stokes pressure
field, even though we may not be able to find the latter field numerically.
Next we note that, although not valid for $r \to \infty$, the steady Stokes equations have
the 'near-field' solution [see, for example, Panton (1984)]
$$ \psi = c u_\infty a \left( \frac{a}{r} - \frac{r}{a} + 2\frac{r}{a}\ln\frac{r}{a} \right)\sin\theta, \qquad (3.19\text{-}23) $$
giving
$$ u_r = c u_\infty \left( \frac{a^2}{r^2} - 1 + 2\ln\frac{r}{a} \right)\cos\theta, \qquad (3.19\text{-}24) $$
$$ u_\theta = c u_\infty \left( \frac{a^2}{r^2} - 1 - 2\ln\frac{r}{a} \right)\sin\theta, \qquad (3.19\text{-}25) $$
and
$$ \mathbf{n}\cdot\nabla^2\mathbf{u}|_\Gamma = -(4cu_\infty/a^2)\cos\theta, \qquad (3.19\text{-}26) $$
where c is 'arbitrary' (and related to the Stokes paradox; c = 2/ln(7.4/Re) for Re < |,
a la Oseen, is a good choice (Batchelor, 1967)). The important point is that the
$\theta$-variation of $\partial P/\partial n$ at r = a is the same in both potential flow with no-slip and steady
Stokes flow, and our numerical simulations have told us that it is also the same for
transient Stokes flow that 'began' as a potential flow. [We have also solved the following
Neumann problem on the mesh in Figure 3.19-1: $\nabla^2 T = 0$ in $\Omega$, $\partial T/\partial n = \cos\theta$ on the
cylinder, and $\partial T/\partial n = 0$ elsewhere; the solution looks just like all of the Stokes pressure
fields. In fact, the 'empirical' result on our domain leads to the following relationship
between $P_{max} - P_{min} = \Delta P$ and the $\partial P/\partial n$ BC on the cylinder: $\Delta P = (3/\cos\theta)\,\partial P/\partial r$,
which can be used, along with the steady Stokes flow result [see Figure 3.19-2] that
$\Delta P = 4.2 \times 10^{-4}$, to obtain an estimate of the 'bounded domain factor', $\beta$, in the PPE BC
equation $\partial P/\partial r = \beta(4\nu u_\infty\cos\theta)/a^2$, to give $\beta = 4.2 \times 10^{-4}/(3 \times 4 \times 0.0002 \times 0.1) =
1.75$, vis-a-vis c from (3.19-26) and 1 from (3.19-20). Another measure of $\beta$ comes from
the exact solution to $\nabla^2 T = 0$ in $\Omega$ with $\partial T/\partial r = A\cos\theta/a^2$ on r = a = 1 with $T \to 0$ for
$r \to \infty$; viz. $T = -A\cos\theta/r$ giving $\Delta T_{max} = 2A$ vis-a-vis $3A$ on our bounded domain.
Thus, the bounded domain seems to increase the 'magnitude' of the solution by 50-75%.]
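The two estimates of the 'bounded domain factor' quoted above are simple arithmetic; the following sketch re-derives them. The variable names and the sample angles used to scan the unbounded solution are illustrative assumptions.

```python
import math

NU, U_INF, A = 2.0e-4, 0.1, 1.0

# Estimate 1: from the steady Stokes Delta P = 4.2e-4 and Delta P = (3 / cos(theta)) dP/dr,
# with dP/dr = beta * 4 nu u_inf cos(theta) / a^2.
delta_p_stokes = 4.2e-4
beta = delta_p_stokes / (3.0 * 4.0 * NU * U_INF / A**2)

# Estimate 2: unbounded solution T = -A_coef cos(theta) / r on r = a = 1.
A_coef = 1.0
angles = [k * math.pi / 180.0 for k in range(0, 361)]
t_on_cylinder = [-A_coef * math.cos(th) / A for th in angles]
delta_t_unbounded = max(t_on_cylinder) - min(t_on_cylinder)   # should be 2 * A_coef
delta_t_bounded = 3.0 * A_coef                                # empirical value from the text
ratio = delta_t_bounded / delta_t_unbounded

print(beta, delta_t_unbounded, ratio)
```

The two measures bracket the 50-75% increase attributed to the bounded domain.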
We stated above for Stokes flow that either IC gives the same results for $t \gg \tau_{ac}$ and
$t > \tau_{MTB}$. Well, it is also true for Re = 1000, thus justifying our assertion that the 'real'
transient, via the $e^{-\lambda t}$ BC starting from rest, is indeed a valid and legitimate (physically
and mathematically) technique for obtaining an impulsive start in the CFD laboratory.
Another thing that is virtually the same for any Re is the accelerating potential flow field
($\mathbf{u}_p$) for the $e^{-\lambda t}$-case; $\mathbf{u}(\mathbf{x}, t) = \mathbf{u}_p(\mathbf{x})(1 - e^{-\lambda t})$ is independent of Re, except very close
to the cylinder.
3.19.7 Divergence for h → 0
But for $t < O(10\tau_{ac})$, the results from the two start-up methods are, of course, not the
same. They are also, unfortunately, not accurately computable for at least one case of
interest: $\tau_{ac} \ll t < \tau_{MTB}$; i.e., the inability of any mesh to properly simulate what is
effectively a step change in the tangential velocity BC precludes the attainment of an accurate
solution (especially for the pressure) for at least $10\tau_{ac} < t < \tau_{MTB}$. Thus, for example,
the pressures that we have shown during the transition phase in Figure 3.19-9 are only
'nearly' correct. Subsequent finer mesh calculation (not shown) with h = 0.0024 (rather
than 0.022) and thus $\tau_{MTB} = 0.0072$, which is less than $\tau_{ac} = 0.01$, showed, for example,
that $P_{max}$ in Figure 3.19-9 is in error ranging from $\approx 3\%$ low at $t = 5\tau_{ac}$ to $\approx 35\%$ low
at $t = 10\tau_{ac}$; $\Delta P = P_{max} - P_{min}$ was less in error (3-10%), and the isobars in Figure 3.19-9
are qualitatively correct. We shall return to and explain these errors later; for now, let us
just report the empirical result (for the impulsive start in general; and for $10\tau_{ac} < t < \tau_{MTB}$
for the rapid start) that, for h sufficiently small, the initial $\Delta P \approx 4\nu u_\infty/h$, with smaller
h needed for smaller $\nu$ (larger Re) to see $O(1/h)$. This $1/h$ behavior was also displayed
by $P_{min}$ and $P_{max}$, and leads to the observation that the pressure behaves like a fractal
dimension: the more closely you examine/measure it, the larger it becomes! (Divergence
as $h \to 0$!).
Why is the pressure field so wrong (in magnitude, but ostensibly not in 'shape'), how
wrong is it, and how wrong is the associated velocity field? These are good questions,
whose answers we can only partially supply, beginning with an examination of the
numerical solution, vis-a-vis what we think we know about the PDE solution, very close
to the cylinder, just after the 'impulsive' start [$t = 0^+$ for the potential flow + no-slip
IC, and $t = 10\tau_{ac}$ (say) for the no-flow IC]. Consider the 'worst case' (maximum
amplitude of the vortex sheet, which is also the maximum slip velocity for the related potential
flow): the top of the cylinder (although the analysis to follow applies just as well at other
values of $\theta$). It seems clear that, sufficiently close to the cylinder and for sufficiently small
time, the dominant terms in the tangential momentum equation are the acceleration term
and the 'friction' term; i.e., the normal 'component' of the viscous term. All other terms
are negligibly small in comparison. It is also clear for the conditions stated that the
curvature of the cylinder can be neglected (close enough to the surface, the radius appears to
be infinite), thus permitting a simple model that is nothing more than the 1D transient
heat equation [see also Gresho (1991a, 1992)],
$$ \frac{\partial u}{\partial t} = \nu\frac{\partial^2 u}{\partial y^2}, \qquad (3.19\text{-}27) $$
where $u = -u_\theta$ and y is the distance from the surface. The solution to this equation,
with an IC of $u = u_0$ = constant (another valid approximation close to the surface;
$u_0 = (1/a)\,\partial\phi/\partial\theta$, the slip velocity) and a BC of u = 0 at y = 0 [see too Collins and
Dennis (1973b)] is
$$ u = u_0\,\mathrm{erf}\,(y/\sqrt{4\nu t}), \qquad (3.19\text{-}28) $$
which we assert is a reasonably close approximation to the true solution (for the
conditions stated). Later, we will present an even better model that, while somewhat more
complicated, leads to virtually the same results for our present purposes. Now please take
a look at Figure 3.19-16. Plotted there are three curves corresponding to the first node
up from the cylinder top (at y = h = 0.022) and three to the second (at y = 2h = 0.044).
The two solid curves in each figure are from the numerical solutions: one for the 'true'
transient startup ($e^{-\lambda t}$) and the other for the impulsive start. (The dotted curves are
'theoretical', and are explained below.) Two things are especially noteworthy when comparing
these curves.
1. The transient case catches up to the potential flow IC case by t = 0.05 ($= 5\tau_{ac}$) or so,
and for $t \gtrsim 10\tau_{ac}$ the results, as mentioned earlier, are virtually identical.
2. For the impulsive start, the second node off the surface clearly starts off in the wrong
direction (!).
For the impulsive start, the first node also starts in the wrong direction, a fact ('proven'
shortly) that is much less obvious than that for the second node, $u_2$. (The $e^{-\lambda t}$ startup is
thus also in error.) The spurious initial acceleration for $u_2$ [$= u_x(653)$] is a clear wiggle
signal, thanks to the consistent mass matrix, a signal first identified and utilized in Gresho
and Lee (1980) (see also Gresho, 1979). [A lumped mass simulation, or most FVM or
FDM simulations, would not be alerted to this dilemma; both nodes would show a (not
so unreasonable) decelerating flow for all $t \ge 0$, and the analyst might just believe the
results, and thus remain befuddled as to the pressure's behavior with mesh refinement.]
To see quantitatively the errors in the simulation, let us first return to (3.19-28) and
evaluate the acceleration:
$$ \frac{\partial u}{\partial t} = -\frac{u_0\, y\, e^{-y^2/4\nu t}}{2\sqrt{\pi\nu}\; t^{3/2}} = \nu\frac{\partial^2 u}{\partial y^2}, \qquad (3.19\text{-}29) $$
which we wish to examine at y = h and y = 2h. In each case, the point of overriding
importance is that $\partial u/\partial t = 0$ at t = 0; the initial acceleration is neither negative (cf.
$u_1$ [$= u_x(392)$] in Figure 3.19-16(a)) nor positive (a la $u_2$ in Figure 3.19-16(b)); it is
zero. The dotted curve in these two figures is a plot of (3.19-28) for y = h and 2h,
respectively, with an IC that approximates that from potential flow [cf. Figure 3.19-3(e)
and (f)]. These ostensibly small differences are the cause of the big problem, where
we point out that recovery to the proper solution does occur for times $> O(\tau_{MTB})$, as
suggested in Figure 3.19-16, in which [see (3.19-15)] $\tau_{MTB} = 0.6$ for $u_1$ and $\approx 2.4$ for $u_2$.
[It is now clear that our definition of $\tau_{MTB}$ is closely related to the error function solution:
for a given distance (h) from the surface, it is the time at which the argument in the error
function is unity. It is also clear that we are not quite consistent in our definition; cf.
Section 2.6.2c.]
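The error-function model (3.19-28) and its acceleration (3.19-29) can be evaluated directly. This is a small sketch under stated assumptions: $\nu = 0.0002$ and $h = 0.022$ come from the text, the slip velocity `U0 = 0.2` is only a placeholder (the unbounded-flow value $2u_\infty$), and the function names are illustrative.

```python
import math

NU = 2.0e-4   # kinematic viscosity
H = 0.022     # first node off the surface
U0 = 0.2      # placeholder slip velocity (assumption: 2 * u_inf for unbounded flow)


def u_model(y, t):
    """Error-function solution (3.19-28)."""
    return U0 * math.erf(y / math.sqrt(4.0 * NU * t))


def accel_model(y, t):
    """Acceleration (3.19-29), du/dt of the error-function solution."""
    return -U0 * y * math.exp(-y * y / (4.0 * NU * t)) / (
        2.0 * math.sqrt(math.pi * NU) * t**1.5)


def accel_fd(y, t, e=1e-5):
    """Central-difference check of du/dt."""
    return (u_model(y, t + e) - u_model(y, t - e)) / (2.0 * e)


tau_mtb_1 = H * H / (4.0 * NU)             # first node
tau_mtb_2 = (2.0 * H) ** 2 / (4.0 * NU)    # second node

arg_at_mtb = H / math.sqrt(4.0 * NU * tau_mtb_1)   # argument of erf at t = tau_MTB
early_accel = accel_model(H, 1.0e-4)               # essentially zero at very small t
check = abs(accel_model(H, 0.3) - accel_fd(H, 0.3))

print(tau_mtb_1, tau_mtb_2, arg_at_mtb, early_accel, check)
```

The exponential factor drives the model acceleration to zero as $t \to 0$ at any fixed $y > 0$, which is the property the finite-mesh solution fails to reproduce.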
Fig. 3.19-16 Time histories of u for 2 nodes (392 and 653) just above the cylinder, and 2
error function curves (dotted). Panels: (a) first node up (y = 1.022), $u_x(392)$, impulsive and
transient; (b) second node up (y = 1.044), $u_x(653)$, impulsive and transient.
The reason that a 'finite h' calculation cannot get it 'right' for small time is
actually rather easy to understand, and 'leads to' the less-easy-to-understand '1/h' pressure
behavior, which, we emphatically point out, is observed for both 9/3 and 4/1 elements.
It is simply a reflection of the fact that we cannot compute $\partial^2 u/\partial y^2$ in (3.19-29) with
any accuracy; that the step change is beyond our reach is perhaps best realized via the
simplest approximation, second-order FDM:
$$ \frac{\partial^2 u}{\partial y^2} \approx \frac{1}{h^2}(u_{j-1} - 2u_j + u_{j+1}), \qquad (3.19\text{-}30) $$
where, for our IC's and j = 1 at y = h gives, roughly,
$$ \nu\frac{\partial^2 u}{\partial y^2} \approx \frac{\nu}{h^2}(0 - 2u_0 + u_0) = -\frac{\nu u_0}{h^2} \qquad (3.19\text{-}31) $$
at t = 0. The finite mesh will always give a spurious second derivative close to the
'wall', and thus spurious acceleration there. [Note that quadratic basis functions do not
'help' at all; they too give $O(\nu u_0/h^2)$.] Also, mesh refinement experiments verified the
$O(1/h^2)$ initial acceleration. This is the beginning of the 'problem'. Next, we must connect
this bad behavior to the pressure field, where the errors are perhaps more noticeable. And
this must be done by first turning to the normal velocity, since it is the component that
is closely coupled to the pressure (owing, of course, to $\nabla\cdot\mathbf{u} = 0$). We begin this part of
the analysis by recalling the PPE's 'viscous' BC at the cylinder:
$$ \frac{\partial P}{\partial n} = \nu\mathbf{n}\cdot\nabla^2\mathbf{u} = \nu\frac{\partial^2 u_n}{\partial n^2} \qquad (3.19\text{-}32) $$
or, in our 'local' coordinate system at the top of the cylinder,
$$ \frac{\partial P}{\partial y} = \nu\frac{\partial^2 v}{\partial y^2}. \qquad (3.19\text{-}33) $$
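The spurious acceleration (3.19-31) and its $1/h^2$ growth can be illustrated with the second-order difference (3.19-30) applied to the step-like initial profile. This is a sketch only; the slip velocity `U0 = 0.2` is a placeholder assumption, and the two mesh spacings are the coarse and fine values quoted in this section.

```python
NU = 2.0e-4   # kinematic viscosity
U0 = 0.2      # placeholder slip velocity (assumption)


def fdm_second_derivative(u_below, u_here, u_above, h):
    """Second-order central difference (3.19-30)."""
    return (u_below - 2.0 * u_here + u_above) / h**2


def spurious_acceleration(h):
    """nu * d2u/dy2 at the first node off the wall for the IC u = 0 (wall), u0 elsewhere."""
    return NU * fdm_second_derivative(0.0, U0, U0, h)


coarse = spurious_acceleration(0.022)    # mesh used for the figures
fine = spurious_acceleration(0.0024)     # finer mesh mentioned in the text
growth = fine / coarse                   # expected (0.022 / 0.0024)^2

print(coarse, fine, growth)
```

The exact model acceleration at t = 0 is zero, so refining the mesh makes this initial value worse rather than better, in line with the $O(1/h^2)$ behavior reported above.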
But, we note, after 'filtering' the wiggles [or from a lumped mass (or FDM/FVM) result
which is smooth in v(y) but still has bad accelerations and pressure], that $\partial^2 v/\partial y^2$ is
'smooth' (no step change in the normal direction!) and thus easy to compute well. So we
are led to repeat the question: Why is the pressure field so wrong? To answer this nagging
but important question, we must examine the discrete PPE at and near the cylinder's
surface. (We limit this analysis to the 4/1 element, but assert that the results are quite
general.) To this end, please return to Section 3.13.5f and review the analysis leading up
to (3.13-306), because it provides the vitally important missing link. And this 'link' is
in one of the 'truncation error' terms; namely, $(h/2)\,\partial(\nu\nabla^2 v)/\partial y$. In terms of our local
coordinate system with u = 0 on $\Gamma$, the remaining important terms are these:
$$\frac{\partial P}{\partial n} = \nu\nabla^2 u_n - \frac{h}{2}\frac{\partial}{\partial\tau}\left(\nu\nabla^2 u_\tau\right), \tag{3.19-34}$$
and the 'bad guy' is the normal 'component' of $\nabla^2 u_\tau$: $\partial^2 u_\tau/\partial n^2 = \partial^2 u/\partial y^2$ in (3.19-34).
(It is noteworthy that the error in the normal momentum equation is caused by the
tangential velocity, which is much larger than the normal velocity.) We cannot make h
small enough to make this error term negligible because $\partial^2 u/\partial y^2 \sim 1/h^2$, a la (3.19-31).
Thus, $\partial P/\partial n$ becomes dominated by the error term, giving the spurious result that
$\partial P/\partial n = O(1/h)$ and therefore $P = O(1/h)$ as $h \to 0$, which is just what we saw in
our numerical results. Although this is really the 'bottom line', it turns out that we can
actually be somewhat more quantitative in the analysis, by returning again to (3.19-34):
(i) $\nabla^2 u_n = 4u_\infty\cos\theta/a^2$ from (3.19-19).
(ii) $\left.\frac{\partial}{\partial\tau}\nabla^2 u_\tau\right|_j = \left(\nabla^2 u_\tau|_{j+1} - \nabla^2 u_\tau|_j\right)/l$,
where j is any node on the surface of the cylinder (increasing j is in the same
direction as increasing $\theta$), and $l = a\Delta\theta$.
(iii) As in (3.19-31),
$$\nabla^2 u_\tau|_j \cong \frac{1}{h^2}\left(0 - 2u_{S_j} + u_{S_j}\right) = -\frac{u_{S_j}}{h^2},$$
where $u_{S_j}$ is the original slip velocity on the cylinder at node j (from the potential
flow solution). This, of course, is the term that is in error.
(iv) Thus,
$$-\frac{h}{2}\left.\frac{\partial}{\partial\tau}\nabla^2 u_\tau\right|_j \cong \frac{1}{2h}\,\frac{u_{S_{j+1}} - u_{S_j}}{l}.$$
(v) But the $\theta$-variation of the slip velocity is smooth and therefore easily
approximated. This means that the term $(u_{S_{j+1}} - u_{S_j})/l$ will be close to $(1/a)\,du_\theta/d\theta = -2u_\infty\cos\theta/a$
from the potential flow solution (3.19-17), at least for the unbounded domain. Hence,
$$-\frac{h}{2}\left.\frac{\partial}{\partial\tau}\nabla^2 u_\tau\right|_j \approx -\frac{u_\infty\cos\theta}{ah},$$
and we are basically finished.
(vi) Thus,
$$\frac{\partial P}{\partial n} \cong \frac{\nu u_\infty\cos\theta}{a^2}\left(4 - \frac{a}{h}\right),$$
the first term ostensibly coming from 'physics' and the second from 'numerics'
(truncation error), the key observation being that the two terms have the same
$\theta$-dependent coefficient. (This is the 'good' news and causes the subtle result that
the shape is correct; the bad news is that the 'error' term swamps the 'physics'
term.)
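
Steps (i) to (vi) can be traced numerically. The following is only a minimal sketch, not the authors' code: it assumes the potential-flow slip velocity $u_S = -2u_\infty\sin\theta$, a hypothetical uniform spacing of surface nodes, and the parameter values a = 1, h = 0.022 and ν = 0.0002 quoted in this section; u∞ = 0.1 is an assumption inferred from Re = 1000, and all function names are invented for the illustration.

```python
import math

def truncation_term(theta, a, h, u_inf, n_nodes=720):
    """-(h/2) d/dtau (lap u_tau) at a surface node, following steps (ii)-(iv)."""
    dtheta = 2.0 * math.pi / n_nodes       # hypothetical uniform node spacing
    l = a * dtheta                         # arc length between surface nodes
    slip = lambda th: -2.0 * u_inf * math.sin(th)   # assumed slip velocity u_S
    # Step (iii): second difference at the first node, with u = 0 on the wall.
    lap_j = (0.0 - 2.0 * slip(theta) + slip(theta)) / h**2
    lap_j1 = (0.0 - 2.0 * slip(theta + dtheta) + slip(theta + dtheta)) / h**2
    # Step (ii): tangential difference of the (spurious) Laplacian.
    d_tau = (lap_j1 - lap_j) / l
    return -0.5 * h * d_tau

def ppe_bc_terms(theta, a, h, u_inf, nu):
    """Return the 'physics' and 'numerics' parts of dP/dn in step (vi)."""
    physics = nu * 4.0 * u_inf * math.cos(theta) / a**2     # step (i)
    numerics = nu * truncation_term(theta, a, h, u_inf)     # steps (ii)-(v)
    return physics, numerics

# Values quoted in the text, except u_inf, which is an assumption.
a, h, u_inf, nu = 1.0, 0.022, 0.1, 2.0e-4
physics, numerics = ppe_bc_terms(0.0, a, h, u_inf, nu)
ratio = abs(numerics / physics)            # close to a / (4 h)
```

With these values the 'numerics' term has the opposite sign to the 'physics' term and is roughly a/(4h), i.e. about eleven, times larger in magnitude.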
So, for the impulsive start, and any Reynolds number, for a fine enough mesh, the PPE
BC for $t < t_{mtb}$ will be
$$\frac{\partial P}{\partial n} \cong -\frac{\nu u_\infty\cos\theta}{ah}, \tag{3.19-35}$$
which causes all of the bad behavior:
(i) The pressure diverges like O(1/h) for $h \to 0$.
(ii) This boundary term will even swamp the source term on the RHS of the PPE
[$-\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u})$] with the result that, for a sufficiently fine mesh, even large-Re
simulations will generate an initial pressure field corresponding to steady Stokes
flow (in 'shape' only), which is thus totally spurious. (The 'fine mesh' solution at
and near $\theta = \pi/2$ may show the source term effect, however.)
(iii) Bad early-time drag coefficients, discussed below.
Thus the inability to get a step change right on any mesh for small time actually
causes double damage for a NS simulation: both tangential acceleration and pressure
are not accurately computable. Hence, we cannot (at least for the impulsive start) go to
the $t \searrow 0$ limit and obtain anything useful on any mesh. Only for $t > O(t_{mtb})$ does the
numerical solution recover and become useful/believable. In fact, for the (more regular)
$e^{-\lambda t}$ startup, it is interesting and ironic that the solution's accuracy starts 'good' and
ends 'good', but is bad during a portion of the transient. (The potential flow IC, in
contrast, is 'consistently' bad for all $t < {\sim}t_{mtb}$.) From the above analysis, we can argue
that the numerical BC for the PPE on the cylinder during the early transient is given
approximately by
$$\frac{\partial P}{\partial n} \cong \frac{\nu u_\infty\cos\theta}{a^2}\left(4 - \frac{a}{h}\right)\left(1 - e^{-\lambda t}\right), \tag{3.19-36}$$
whereas that at the domain's inlet is
$$-\frac{\partial P}{\partial x} = \frac{\partial P}{\partial n} = -\mathbf{n}\cdot\dot{\mathbf{u}} = \lambda u_\infty e^{-\lambda t}, \tag{3.19-37}$$
where $u_\infty = w_0$. The numerical solution can thus be accurate for 'small t' if
$|(1 - e^{-\lambda t})\,\nu u_\infty\cos\theta/ah|$ is sufficiently small compared to $\lambda u_\infty e^{-\lambda t}$ which, at the stagnation
points, for example, translates to
$$\frac{t}{\tau_{ac}} \equiv \lambda t \ll \ln\left(\frac{\lambda a h}{\nu} + 1\right), \tag{3.19-38}$$
where '$\ll$' will be subject to 'interpretation'. For our parameters, we get ln((100 x 1 x
0.022)/0.0002 + 1) = 9.3, which puts us right near the end of the transition period
shown in Figure 3.19-9, and thus, as already mentioned, casts doubt on the validity of
these pressures. However, because the logarithm with large argument is a slowly-varying
function, it doesn't require much of a reduction below $\lambda t = 9.3$ to recover 'validity';
for example, at $\lambda t = 9.3$ we have $\lambda u_\infty e^{-\lambda t} = \nu u_\infty(1 - e^{-\lambda t})/ah = 9.1 \times 10^{-4}$, but at just
one-half of this time, $\lambda t = 4.65$, we get $\lambda u_\infty e^{-\lambda t} = 0.096$, whereas $\nu u_\infty(1 - e^{-\lambda t})/ah$
is (still) only 0.00090, over two decades smaller than the 'competition.' The transition
from acceleration-dominated to viscous-dominated occurs on a very short time scale, as
we have seen in Figure 3.19-9.
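
The two numbers compared above are easy to reproduce. The short sketch below evaluates the inlet acceleration term $\lambda u_\infty e^{-\lambda t}$ and the cylinder truncation-error term $\nu u_\infty(1 - e^{-\lambda t})/ah$ at the stagnation point; it uses λ = 100, a = 1, h = 0.022 and ν = 0.0002 from the text, while u∞ = 0.1 is an assumption (inferred from Re = 1000) and the function names are hypothetical.

```python
import math

# lam, a, h, nu are quoted in the text; u_inf = 0.1 is an assumption.
lam, a, h, nu, u_inf = 100.0, 1.0, 0.022, 2.0e-4, 0.1

def inlet_term(lam_t):
    """Acceleration term at the inlet, lambda * u_inf * exp(-lambda t)."""
    return lam * u_inf * math.exp(-lam_t)

def cylinder_error_term(lam_t):
    """Truncation-error term on the cylinder at theta = 0."""
    return nu * u_inf * (1.0 - math.exp(-lam_t)) / (a * h)

# Time at which the two terms are equal, cf. (3.19-38).
lam_t_equal = math.log(lam * a * h / nu + 1.0)

for lam_t in (lam_t_equal, 0.5 * lam_t_equal):
    print(f"lambda*t = {lam_t:5.2f}: inlet = {inlet_term(lam_t):.3e}, "
          f"cylinder error = {cylinder_error_term(lam_t):.3e}")
```

Halving the time changes the acceleration term by about two orders of magnitude while leaving the error term essentially unchanged, which is the slow variation of the logarithm referred to above.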
Also noteworthy is a comparison of the true (unbounded domain) viscous BC term,
$4\nu u_\infty\cos\theta/a^2$, with the acceleration term (at $\theta = 0$); equating them (at $\theta = 0$) gives $\lambda t =
\ln(\lambda a^2/4\nu) = 11.7$, which is not much larger than 9.3. Thus, the deleterious truncation
error effect 'merely' hastens (by a small amount here) the time at which the transition
from potential flow to viscous flow occurs, though it is still true that the magnitude of
the pressure will be in error for roughly $10 < \lambda t < \lambda t_{mtb}$, which we shall define as the
window of nonbelievability; i.e., for
$$\tau_{ac}\ln\left(\frac{ah}{\nu\tau_{ac}}\right) < t < t_{mtb} \equiv \frac{h^2}{4\nu}, \tag{3.19-39}$$
which is 0.10 < t < 0.60 on this mesh, a rather small range. For $t > t_{mtb}$, the results
from either method of startup are believable, and only for $t > t_{mtb}$ are the results
from the potential flow + no-slip IC believable (i.e., accurate). Of course, if $t_{mtb} <
\tau_{ac}\ln(ah/\nu\tau_{ac} + 1)$, the above argument breaks down. (This 'crossover' occurs at $h \approx
0.00815$, giving $t_{mtb} \approx 0.083$ for our parameters.) We also believe and assert that these
results are general: any element, any numerical method, even spectral.
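
As a check on the arithmetic, the window (3.19-39) and the 'crossover' mesh size can be computed directly. This is a sketch under the parameter values quoted in the text (a = 1, ν = 0.0002, τ_ac = 0.01, h = 0.022); the bisection bracket and the function names are assumptions made for the illustration.

```python
import math

def window(a, h, nu, tau_ac):
    """Lower and upper limits of the 'window of nonbelievability' (3.19-39)."""
    t_lower = tau_ac * math.log(a * h / (nu * tau_ac))
    t_mtb = h * h / (4.0 * nu)
    return t_lower, t_mtb

def crossover_h(a, nu, tau_ac, lo=1.0e-4, hi=0.1, iters=100):
    """Mesh size at which t_mtb = tau_ac * ln(a h / (nu tau_ac) + 1), by bisection.

    The bracket [lo, hi] is an assumption chosen to contain a single sign change.
    """
    f = lambda h: h * h / (4.0 * nu) - tau_ac * math.log(a * h / (nu * tau_ac) + 1.0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a, nu, tau_ac = 1.0, 2.0e-4, 0.01
t_lower, t_mtb = window(a, 0.022, nu, tau_ac)
h_cross = crossover_h(a, nu, tau_ac)
t_mtb_cross = h_cross**2 / (4.0 * nu)
print(t_lower, t_mtb, h_cross, t_mtb_cross)
```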
At this point it seems reasonable to ask: Do these results generalize in any useful way?
I.e., you've beaten to death a simple geometry with some known analytical solutions; what
about arbitrary geometries for which even the potential flow solution is not known? And
what about 3D? Our answers are as follows:
1. We believe that the error function analysis/argument extends 'point-wise' to other, more
general geometries with curved streamlines (although we are unsure of the extra-hard
effects caused by geometric singularities such as sharp corners).
2. We will present below a better (more general) model for the cylinder that leads to
essentially the same results.
3. We therefore believe that the error term, $(h/2)(\partial/\partial\tau)(\nabla^2 u_\tau)$, can always (for smooth
geometries) be approximated by $(h/2)(\partial/\partial\tau)(u_S/h^2)$, where $u_S$ is the slip velocity of the
associated potential flow solution; thus
4. The PPE BC is, for small t,
$$\frac{\partial P}{\partial n} = \nu\left[\mathbf{n}\cdot\nabla^2\mathbf{u}_P - \frac{1}{2h}\frac{\partial u_S}{\partial\tau}\right]\left(1 - e^{-\lambda t}\right), \tag{3.19-40}$$
where $\mathbf{u}_P$ is the potential flow solution.
And this is as far as we can take it. Since both $\mathbf{u}_P$ and $u_S$ are problem-specific and
generally not known in closed form, all we can say in general is that pressure will behave
somewhat like 1/h for times between about $\tau_{ac}\ln(ah/\nu\tau_{ac})$ and $t_{mtb} = h^2/4\nu$. We believe
too that if the potential flow is accurately computable on a fine enough mesh (as for the
circular cylinder), then the shape of the two terms in (3.19-40) will be much the same, thus
causing only the magnitude of the pressure (and, of course, the concomitant tangential
acceleration, and the drag coefficient, presented and discussed later) to be spurious during
the above time range. Finally, while we have not seriously addressed the 3D situation,
we believe that it will not differ much.
3.19.8 A Better Model
The 'better' model for the cylinder case, alluded to above, was obtained from some of
the (more general) results in Wang (1967) and Bar-Lev and Yang (1975), and which
we generated in order to help explain some portions of the drag coefficient curves to be
presented next. Using the method of matched asymptotic expansions, the above authors
derived an equation for $u_\theta$, whose leading term (dominant for $t \searrow 0$) is the one we need;
namely,
$$u_\theta = -u_\infty\left[1 + \left(\frac{a}{r}\right)^2 - 2\,\mathrm{erfc}\left(\frac{r - a}{\sqrt{4\nu t}}\right)\right]\sin\theta, \tag{3.19-41}$$
from which, using $\nabla\cdot\mathbf{u} = 0$, we obtain the needed normal component:
$$u_r = u_\infty\left[1 - \left(\frac{a}{r}\right)^2 - \frac{2(r - a)}{r}\,\mathrm{erfc}\left(\frac{r - a}{\sqrt{4\nu t}}\right) - \frac{2}{r}\sqrt{\frac{4\nu t}{\pi}}\left(1 - e^{-(r-a)^2/4\nu t}\right)\right]\cos\theta, \tag{3.19-42}$$
and we note the reassuring result (not obtainable from the simpler model presented earlier)
that the potential flow solution is obtained for r > a and t = 0; cf. (3.19-16) and (3.19-17).
Evaluating
$$\mathbf{n}\cdot\nabla^2\mathbf{u} = -\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial u_\theta}{\partial\theta}\right)$$
at r = a gives the PPE BC:
$$\frac{\partial P}{\partial r} = \frac{2\nu u_\infty\cos\theta}{a^2}\left(\frac{a}{\sqrt{\pi\nu t}} - 1\right), \tag{3.19-43}$$
in which the second term is negligible for small t, and it is interesting to note that
this second term is the only difference between this 'better' model and the simple one
obtainable from (3.19-28) as $u_\theta = -2u_\infty\sin\theta\cdot\mathrm{erf}[(r - a)/\sqrt{4\nu t}]$.
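
The step from the tangential velocity to the PPE BC can be checked by finite differences. The sketch below assumes that the leading term (3.19-41) reads $u_\theta = -u_\infty[1 + (a/r)^2 - 2\,\mathrm{erfc}((r-a)/\sqrt{4\nu t})]\sin\theta$ and that the BC is $\partial P/\partial r = -\nu\,\partial[(1/r)\,\partial u_\theta/\partial\theta]/\partial r$ at r = a; both forms, the parameter values and the function names are assumptions of this illustration rather than statements about the original derivation.

```python
import math

def u_theta_over_sin(r, t, a, nu, u_inf):
    """u_theta / sin(theta) for the assumed leading-term model."""
    eta = (r - a) / math.sqrt(4.0 * nu * t)
    return -u_inf * (1.0 + (a / r) ** 2 - 2.0 * math.erfc(eta))

def dPdr_numeric(theta, t, a, nu, u_inf, dr=1.0e-6):
    """-nu * d/dr[(1/r) du_theta/dtheta] at r = a, by a central difference."""
    # du_theta/dtheta = cos(theta) * (u_theta / sin(theta))
    g = lambda r: math.cos(theta) * u_theta_over_sin(r, t, a, nu, u_inf) / r
    return -nu * (g(a + dr) - g(a - dr)) / (2.0 * dr)

def dPdr_closed(theta, t, a, nu, u_inf):
    """Closed form with the 1/sqrt(pi nu t) term and the small correction."""
    return (2.0 * nu * u_inf * math.cos(theta) / a**2
            * (a / math.sqrt(math.pi * nu * t) - 1.0))

a, nu, u_inf = 1.0, 2.0e-4, 0.1      # u_inf is an assumed value
theta, t = 0.0, 0.05
numeric = dPdr_numeric(theta, t, a, nu, u_inf)
closed = dPdr_closed(theta, t, a, nu, u_inf)
print(numeric, closed)
```

The central difference steps slightly inside the cylinder, which is harmless here because the assumed expression is smooth across r = a.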
It is now interesting to compare the PPE BC's for the impulsive start (potential flow +
no slip) for t < 0, t = 0, and t > 0. Collecting previous results yields
(i) $\partial P/\partial r = 4u_\infty^2\sin^2\theta/a$ for t < 0 from (3.19-22);
(ii) $\partial P/\partial r = -4\nu u_\infty\cos\theta/a^2$ for t = 0 from (3.19-20); and
(iii) $\partial P/\partial r = 2\nu u_\infty\cos\theta/(a\sqrt{\pi\nu t})$ for t > 0 but small, from (3.19-43).
The existence and diffusion of the vortex sheet wreaks havoc with the pressure
field, and it is interesting and significant to note that the unbounded pressure at $t = 0^+$ is
'caused' by the combination of a step change in tangential velocity, the div-free constraint
with curved streamlines, and the PPE's BC, which is the normal momentum equation
applied on $\Gamma_{cyl}$. [If an impulsive start were applied to a geometry for which the potential
flow is basically unidirectional, as in the classical Rayleigh-Stokes problem or for a long
triangular wedge with very small angle, the pressure field would not become unbounded
as $t \searrow 0$. Finally, we mention that a brief numerical experiment for flow past a square
cylinder seemed to give $O(1/h^{0.2})$ behavior for small time.]
3.19.9 Drag Coefficients
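For more 'surprises', we turn now to a brief(?) look at the drag coefficient (for Re = 1000),
$$C_D \equiv F_x^T \Big/ \tfrac{1}{2}\rho u_0^2 D, \tag{3.19-44}$$
where $F_x^T$ is the total x-direction force on the (full) cylinder and is comprised of a pressure
part (form drag) and a viscous part (friction drag); $F_x^T = F_x^P + F_x^V$, where
$$F_x^P = -\int_{\Gamma_{cyl}} n_x P\,dl \tag{3.19-45}$$
and
$$F_x^V = \nu\int_{\Gamma_{cyl}}\left[2n_x\frac{\partial u}{\partial x} + n_y\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right)\right]dl, \tag{3.19-46}$$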
where $n_x = -\cos\theta$ and $n_y = -\sin\theta$. [Note that $\rho = 1$ in our case, and recall that P
is the kinematic pressure.] The first thing to point out is that, just as the pressure is
necessarily in error at small time for an impulsive start, so too is the viscous part, and
nearly for the same reason: the step change in tangential velocity. The vortex sheet causes
$F_x^V \sim \nu/h$ in (3.19-46), again for both 9/3 and 4/1 elements. Figure 3.19-17 shows some
$C_D$ vs. time results for Re = 1000, which we present first and analyze later. The $C_D$
in Figure 3.19-17(a) is clearly dominated by the acceleration transient: it behaves like
$C_D(t) = C_D(0)e^{-\lambda t}$, and it is virtually all 'pressure'. On the 'large' time scale (t > 0.07),
it behaves as shown in Figure 3.19-17(b), about which we make three remarks:
1. The minimum value of ~1.0 at t = 4.5 is 'too large' relative to the unbounded domain
case [$C_{D,min}$ = 0.5-0.6 at t = 6-7 from Chou and Huang (1996) and Chu et al. (1996)].
2. When we stretched the domain's top from $y_{max} = 2$ to $y_{max} = 3$ (thus doubling the
constricted flow passage), our $C_{D,min}$ decreased to about 0.75 at t = 6, thus showing the
proper trend.
3. The impulsive start produces results that would lie atop those shown, the difference
being that it starts at $C_D = 3.5$ at t = 0 rather than ~8000.
Next, Figure 3.19-17(c) shows another small-time result, up to $20\tau_{ac}$, this time on two
meshes and with both viscous and pressure contributions plotted separately. Fx denotes the
total drag coefficient, Px the pressure contribution, and Fx - Px the viscous contribution;
16965 denotes the mesh (number of velocity nodes). We see the following behavior:
(i) the viscous contributions grow (for small t) from zero at t = 0 approximately like
$(1 - e^{-\lambda t})$;
(ii) the viscous contribution varies, unfortunately, approximately like O(1/h) and so
does the pressure part, but only for $t > {\sim}0.10 = 10\tau_{ac}$;
(iii) the smart integrator seems to be using rather large time steps, although the
piecewise-linear plot in Figure 3.19-17(c) should really be 'faired in' with quadratic
interpolation because we are using TR, which would make the curves look more
presentable.

Fig. 3.19-17 Drag coefficients. (a) Small time behavior. (b) Long time behavior (t > 0.07).
(c) More small time behavior, with more detail, on two grids. (d) Impulsive start results on
two grids. [Each panel plots $C_D$ against t; the legends in (c) and (d) distinguish Fx, Px and
(Fx - Px) on the 16965-node and 4323-node meshes.]

Finally, Figure 3.19-17(d) shows the same contributions to $C_D$, and $C_D$ itself, for
the impulsive start case, from which we see: (i) ~1/h behavior for small t, and (ii)
convergence (recovery) for $t > {\sim}2\tau_2$ or so, where $\tau_2 \equiv h^2/4\nu = 2.4$ on the coarse mesh
($\tau_1 = 0.6$ on the fine mesh). Note too that, for $t > {\sim}0.1 = 10\tau_{ac}$, these curves also closely
describe the $e^{-\lambda t}$ startup.
Another empirical result from our many 'runs' is this: the initial (and spurious) drag
coefficient for the impulsive start behaves like $C_D \cong 38\nu/u_\infty h$. And another is this: for
$t > {\sim}t_{mtb}$, $C_D \cong (12/u_0)\sqrt{\nu/t}$, which can be compared with the theoretical result in an
unbounded domain [see, for example, Collins and Dennis (1973b)], $C_D = (4\sqrt{\pi}/u_0)\sqrt{\nu/t}$;
our bounded domain has increased the coefficient from $4\sqrt{\pi} = 7.1$ to 12. Also, in our case
the pressure contributed ~59% of the total, whereas the theory for the unbounded case
(which we shall soon present) predicts exactly 50%. Both of the above results are valid
for any Re, and are shown graphically in Figure 3.19-18, in which the 'new' meshes will
be discussed further below, as will the $C_D$ curves themselves. The reason that the results
are independent of Re (i.e., for a given $\nu$, it matters not whether the nonlinear advection
terms are included) is as follows: the only difference between Stokes and Navier-Stokes
is in the pressure (the initial velocity being that from potential flow for each); NS has a
source term on the RHS of the PPE and Stokes does not. But the contribution of this
source term is (basically) to add to the Stokes pressure field the pressure field for potential
flow, which, of course, causes no drag. (In fact, when we subtracted the initial Stokes
pressure from the initial NS pressure the result was indeed the potential flow pressure
field.) This observation is true for both the continuum (at least up to the no-slip BC
'problem') and the discrete/FEM case, with the latter, however, being a 'victim' of the
mostly spurious Stokes pressure (for $h/a \ll 1$), as shown so clearly in Figure 3.19-18.
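
The empirical fits quoted above are simple enough to tabulate. The sketch below evaluates them with ν = 0.0002 and h = 0.022 from the text; u∞ = u0 = 0.1 is an assumption (inferred from Re = 1000), and the function names are hypothetical.

```python
import math

def cd_initial_spurious(nu, u_inf, h):
    """Empirical initial drag coefficient for the impulsive start, 38 nu / (u_inf h)."""
    return 38.0 * nu / (u_inf * h)

def cd_small_time(nu, u0, t, coeff):
    """Small-time drag coefficient of the form (coeff / u0) * sqrt(nu / t)."""
    return coeff / u0 * math.sqrt(nu / t)

nu, u0, h = 2.0e-4, 0.1, 0.022          # u0 = 0.1 is an assumed value
coeff_theory = 4.0 * math.sqrt(math.pi)  # unbounded-domain coefficient
coeff_fit = 12.0                         # bounded-domain fit quoted in the text

cd0 = cd_initial_spurious(nu, u0, h)
t = 1.0                                  # hypothetical sample time
print(cd0, cd_small_time(nu, u0, t, coeff_theory), cd_small_time(nu, u0, t, coeff_fit))
```

On the fine mesh the first expression gives a value close to the $C_D = 3.5$ at t = 0 mentioned earlier for the impulsive start.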
Fig. 3.19-18 More drag coefficients; impulsive start, small time, four grids. [$C_D$ against t,
both on logarithmic axes.]

Now that we have presented our sometimes-good, sometimes-bad numerical results, let
us sit back and try to explain them. We begin with an explanation of $C_D(0)$ for the rapid-
start case, which, with special thanks to Renwei Mei, we can adequately explain with
a fairly simple analytical model. Although our computed results employed a 'cartesian'
domain, we could have used a 'polar' domain, enclosing our cylinder inside a larger
and concentric cylinder that defines the outer domain boundary; and this comprises our
'model', for which we can find analytical solutions for limiting cases. Thus, imagine that
our domain is a < r < R and consider first the rapid start-up of an inviscid flow in this
domain. The plan is to find the pressure field associated with an accelerating potential
flow and then to determine the resulting 'form' drag caused by this pressure. To this end,
we first solve (with $\mathbf{u} = \nabla\phi$)
$$\nabla^2\phi = 0 \quad\text{in}\quad a < r < R, \tag{3.19-47}$$
subject to
$$\partial\phi/\partial r = 0 \quad\text{at}\quad r = a \tag{3.19-48}$$
and
$$\partial\phi/\partial r = w_0\left(1 - e^{-\lambda t}\right)\cos\theta \quad\text{at}\quad r = R, \tag{3.19-49}$$
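the latter converting our rectilinear accelerating flow BC to polar coordinates. The solution
to (3.19-47) to (3.19-49) is
$$\phi = w_0\left(1 - e^{-\lambda t}\right)\cos\theta\cdot\frac{r + a^2/r}{1 - a^2/R^2}, \tag{3.19-50}$$
and we note (for $t = \infty$) that $R \to \infty$ recovers the appropriate potential function for the
unbounded domain, from which (3.19-16) and (3.19-17) were obtained. The Bernoulli
equation,
$$\frac{\partial\phi}{\partial t} + \frac{1}{2}q^2 + P = \frac{1}{2}u_\infty^2, \tag{3.19-51}$$
where $q \equiv |\nabla\phi| = \sqrt{(\partial\phi/\partial r)^2 + ((1/r)\,\partial\phi/\partial\theta)^2}$ and $u_\infty \equiv w_0(1 - e^{-\lambda t})$, gives the
pressure:
$$P = \frac{1}{2}u_\infty^2(t)\,\frac{\dfrac{a^2}{r^2}\left(2\cos 2\theta - \dfrac{a^2}{r^2}\right) - \dfrac{a^2}{R^2}\left(2 - \dfrac{a^2}{R^2}\right)}{\left(1 - a^2/R^2\right)^2} - \lambda w_0 e^{-\lambda t}\,\frac{\left(r + a^2/r\right)\cos\theta}{1 - a^2/R^2}, \tag{3.19-52}$$
which, for both $t \to \infty$ and $R \to \infty$, recovers (appropriately) the pressure field for steady
unbounded potential flow past a cylinder; cf. (3.19-1). The resulting drag force is given
by ($n_x = -\cos\theta$)
$$F_x^P = 2\int_0^\pi P|_{r=a}\cos\theta\,a\,d\theta \tag{3.19-53}$$
and the drag coefficient by $C_D^P = F_x^P/\tfrac{1}{2}w_0^2\cdot 2a$. Since the first term in (3.19-52) is a
potential flow pressure, it contributes naught to the drag, a la d'Alembert. But the second
term gives
$$C_D^P = \frac{\pi(a + a)\lambda e^{-\lambda t}}{w_0\left(1 - a^2/R^2\right)} \equiv C_D(0)\,e^{-\lambda t}, \tag{3.19-54}$$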
and we have obtained an analytical estimate of $C_D(0)$, and we shall 'explain' the factor
'a + a', which came from $r + a^2/r$ in (3.19-50), later. Recall that, experimentally, we
obtained $C_D(0) \approx 8000$ [see Figure 3.19-17(a)] so that, presuming that it were obtained
on our new, annular, domain, we can determine the requisite size of this domain; i.e.
solving (3.19-54) for R gives
$$R = \frac{a}{\sqrt{1 - 2\pi a\lambda/w_0 C_D(0)}} = \frac{1}{\sqrt{1 - 2\pi \times 100/800}} \approx 2.16,$$
which seems quite reasonable; e.g. it gives, via (3.19-50), a maximum velocity (at $\theta = \pi/2$,
r = a and $t = \infty$) of $\approx 2.53w_0$ and a total flux ($\int_a^R u_\theta(r, \pi/2)\,dr$) of $\approx 2.17w_0$, vis-a-vis
our computed values on the 'cartesian' mesh of $\approx 2.75w_0$ and $2w_0$, respectively. (For the
'record', an unbounded domain gives $C_D(0) = 2\pi a\lambda/w_0$, or 6283 for our parameters.)
Later, we shall further interpret these results in terms of 'added mass', but for now we
stop, with the following
[Exercise for the reader: Re-derive the above results in a different-but-equivalent way, using the PPE and the appropriate
Neumann BC's]
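
The estimate of R and the two consistency checks quoted above follow from (3.19-50) and (3.19-54) with a few lines of arithmetic. The sketch below uses a = 1, λ = 100 and $C_D(0) \approx 8000$ from the text; $w_0 = 0.1$ is an assumption (it reproduces the quoted unbounded-domain value of 6283), and the function names are invented for the illustration.

```python
import math

def annulus_radius(a, lam, w0, cd0):
    """Outer radius R obtained by solving (3.19-54) at t = 0 for R."""
    return a / math.sqrt(1.0 - 2.0 * math.pi * a * lam / (w0 * cd0))

def max_speed_over_w0(a, R):
    """|u_theta| / w0 at theta = pi/2, r = a, t = infinity, from (3.19-50)."""
    return 2.0 / (1.0 - (a / R) ** 2)

def flux_over_w0(a, R):
    """Integral of |u_theta| / w0 over a < r < R at theta = pi/2, t = infinity."""
    return ((R - a) + a * a * (1.0 / a - 1.0 / R)) / (1.0 - (a / R) ** 2)

a, lam, w0, cd0 = 1.0, 100.0, 0.1, 8000.0     # w0 = 0.1 is an assumed value
R = annulus_radius(a, lam, w0, cd0)
cd0_unbounded = 2.0 * math.pi * a * lam / w0
print(R, max_speed_over_w0(a, R), flux_over_w0(a, R), cd0_unbounded)
```

In this sketch the flux integral reduces algebraically to $R\,w_0$, so it comes out at about 2.16 $w_0$ and the maximum speed at about 2.55 $w_0$; the small differences from the rounded figures quoted in the text are presumably a matter of rounding.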
Before we actually 'stop', we wish to note that (3.19-52), rewritten as
$$P = -\Phi(r, \theta)\,\lambda e^{-\lambda t} + P_{pot}(r, \theta)\left(1 - e^{-\lambda t}\right)^2, \tag{3.19-55}$$
where now
$$\Phi(r, \theta) \equiv \frac{w_0\left(r + a^2/r\right)\cos\theta}{1 - a^2/R^2} \tag{3.19-56}$$
and
$$P_{pot}(r, \theta) \equiv \frac{w_0^2}{2}\,\frac{\dfrac{a^2}{r^2}\left(2\cos 2\theta - \dfrac{a^2}{r^2}\right) - \dfrac{a^2}{R^2}\left(2 - \dfrac{a^2}{R^2}\right)}{\left(1 - a^2/R^2\right)^2}, \tag{3.19-57}$$
which is uniformly valid in time, can actually be taken to the limit of $\lambda \to \infty$ (an
impulsive start-up from rest of an inviscid fluid):
$$P = -\Phi(r, \theta)\,\delta(t) + P_{pot}(r, \theta)\cdot H(t), \tag{3.19-58}$$
where $\delta(t) \equiv \lim_{\lambda\to\infty}\lambda e^{-\lambda t}$ is a 'generalized' function that we shall call a Dirac delta
function ($\int_0^\infty \delta(t)\,dt = 1$ and $\int_0^\infty f(t)\lambda e^{-\lambda t}\,dt = f(0)$ for $\lambda \to \infty$) and H(t) is a version
of the Heaviside step function [H(t) = 0 for $t \le 0$ and H(t) = 1 for t > 0]. Thus, it is
now even more clear that the role played by the accelerating inlet flow, $w_0(1 - e^{-\lambda t})$,
is to quickly accelerate the flow from rest, via the potential function, to a potential
flow, in 'ein Augenblick'. To do so in 'zero time' of course requires an infinite pressure,
briefly. Note too that $\mathbf{n}\cdot\mathbf{u}$ is not continuous in this limiting case, thus 'violating'
incompressibility (again, briefly), which 'explains' the ill-posedness that we mentioned at
the beginning.
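
The two properties claimed for $\lambda e^{-\lambda t}$ can be seen in one line each. The following is a brief sketch, assuming only that f is bounded and continuous at t = 0; the substitution variable s is introduced here for the illustration.

```latex
% Unit area for every lambda > 0:
\int_0^\infty \lambda e^{-\lambda t}\,dt = \Bigl[-e^{-\lambda t}\Bigr]_0^\infty = 1 .

% Sifting property, using the substitution s = \lambda t:
\int_0^\infty f(t)\,\lambda e^{-\lambda t}\,dt
  = \int_0^\infty f\!\left(\frac{s}{\lambda}\right) e^{-s}\,ds
  \;\longrightarrow\; f(0)\int_0^\infty e^{-s}\,ds = f(0)
  \qquad (\lambda \to \infty).
```

All of the weight of $\lambda e^{-\lambda t}$ therefore concentrates at $t = 0^+$ as $\lambda$ grows, which is the sense in which the acceleration term becomes an impulse.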
3.19.10 A New Analytical Model
Much of what has just been said applies equally well (or at least nearly equally well) to
a viscous, no-slip fluid, to which we now turn our attention. We have also generated a
useful model for the 'real' situation. It begins with a 1D model for the viscous boundary
layer (VBL) that resides beneath an accelerating potential flow (the 'outer' solution),
and concludes with a 2D representation of the velocity and pressure, in an unbounded
domain, for 'simplicity'. Just as we invoked a 1D model via the transient heat equation
for the impulsive start case, cf. (3.19-27) et seq., so too we begin with one appropriate
to the $e^{-\lambda t}$ case (with $y \to x$); viz., solve
$$\frac{\partial\theta}{\partial t} = \nu\frac{\partial^2\theta}{\partial x^2} + T_0\lambda e^{-\lambda t} \quad\text{for}\quad x > 0, \tag{3.19-59}$$
with
$$\theta(0, t) = \theta(x, 0) = 0, \tag{3.19-60}$$
which models, via the source term, an accelerating 'flow' field above the VBL (i.e. for
$x \gg \Delta(t) \equiv \sqrt{4\nu t}$) and a viscous, no-slip flow beneath it. Clearly for $x \gg \Delta$, the solution
to (3.19-59, 60) is simply $\theta = T_0(1 - e^{-\lambda t})$. But to obtain the full solution, we transform
this problem to one that has already been 'solved' (p. 64 of Carslaw and Jaeger 1959).
Letting $\theta = T_0(1 - e^{-\lambda t}) - T$ yields the following IBVP for T:
$$\frac{\partial T}{\partial t} = \nu\frac{\partial^2 T}{\partial x^2} \quad\text{for}\quad x > 0, \tag{3.19-61}$$
$$T(x, 0) = 0, \tag{3.19-62}$$
and
$$T(0, t) = T_0\left(1 - e^{-\lambda t}\right). \tag{3.19-63}$$
The solution to this problem and (especially) the ensuing asymptotic analysis was
obtained by A.C. Hindmarsh (whom we thank again; he is as much at home in the
complex plane as is PMG in his own backyard); in terms of $\theta$ it is
$$\theta = T_0\left\{\left(1 - e^{-\lambda t}\right) - \mathrm{erfc}\left(x/\sqrt{4\nu t}\right) + e^{-x^2/4\nu t}\cdot\mathrm{Re}[w(z)]\right\}, \tag{3.19-64}$$
where
$$z \equiv \sqrt{\lambda t} + ix/\sqrt{4\nu t} \tag{3.19-65}$$
and the complex function w(z) is given by
$$w(z) \equiv e^{-z^2}\left(1 + \frac{2i}{\sqrt{\pi}}\int_0^z e^{s^2}\,ds\right). \tag{3.19-66}$$
We shall make good use of some asymptotic approximations to this solution (viz., one
for $x \gg \sqrt{4\nu t}$, one for $\lambda t \ll 1$, and yet another for $x \ll \sqrt{4\nu t}$) after converting it to the
form needed for flow past a cylinder.
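
A direct numerical check of the closed-form solution is possible with the standard library alone. The sketch below is an illustration under stated assumptions, not the authors' procedure: it takes the solution in the form $\theta = T_0\{(1 - e^{-\lambda t}) - \mathrm{erfc}(\xi) + e^{-\xi^2}\mathrm{Re}[w(z)]\}$ with $\xi = x/\sqrt{4\nu t}$, evaluates w(z) by Simpson quadrature along a straight path from 0 to z, and compares the result with an explicit (FTCS) finite-difference solution of the 1D problem with the source term. The grid sizes, the truncated domain length and all function names are hypothetical choices.

```python
import cmath
import math

def w_of_z(z, panels=2000):
    """w(z) = exp(-z^2) (1 + 2i/sqrt(pi) int_0^z exp(s^2) ds), Simpson rule."""
    h = 1.0 / panels
    total = cmath.exp(0j) + cmath.exp(z * z)        # end points u = 0 and u = 1
    for k in range(1, panels):
        total += (4.0 if k % 2 else 2.0) * cmath.exp((z * k * h) ** 2)
    integral = z * total * h / 3.0                  # path s = z u, 0 <= u <= 1
    return cmath.exp(-z * z) * (1.0 + 2j / math.sqrt(math.pi) * integral)

def theta_closed(x, t, nu, lam, T0=1.0):
    """Assumed closed-form solution of the 1D model."""
    xi = x / math.sqrt(4.0 * nu * t)
    z = complex(math.sqrt(lam * t), xi)
    return T0 * ((1.0 - math.exp(-lam * t)) - math.erfc(xi)
                 + math.exp(-xi * xi) * w_of_z(z).real)

def theta_ftcs(t_end, nu, lam, T0=1.0, nx=200, length=0.04, steps=600):
    """Explicit finite differences for theta_t = nu theta_xx + T0 lam exp(-lam t)."""
    dx, dt = length / nx, t_end / steps
    r = nu * dt / dx**2
    assert r <= 0.5, "explicit scheme would be unstable"
    th = [0.0] * (nx + 1)
    for n in range(steps):
        t0, t1 = n * dt, (n + 1) * dt
        src = T0 * (math.exp(-lam * t0) - math.exp(-lam * t1))  # integrated source
        new = th[:]
        for i in range(1, nx):
            new[i] = th[i] + r * (th[i - 1] - 2.0 * th[i] + th[i + 1]) + src
        new[0] = 0.0                                   # no-slip wall
        new[nx] = T0 * (1.0 - math.exp(-lam * t1))     # outer 'potential' value
        th = new
    return dx, th

nu, lam, t_end = 2.0e-4, 100.0, 0.02
dx, th = theta_ftcs(t_end, nu, lam)
errors = [abs(th[i] - theta_closed(i * dx, t_end, nu, lam)) for i in (10, 20, 40)]
wall_value = theta_closed(0.0, t_end, nu, lam)
print(errors, wall_value)
```

The outer boundary of the finite-difference grid is placed about ten boundary-layer thicknesses from the wall, where the solution is taken to equal $T_0(1 - e^{-\lambda t})$.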
But before doing even this, we digress briefly to note the existence of a common
solution to two different IBVP's, the first given by (3.19-59) and (3.19-60), for $\lambda \to \infty$
[i.e. for $\lambda e^{-\lambda t} \to \delta(t)$, the source term becoming a Dirac 'burst'], and the second by
$$\frac{\partial\Theta}{\partial t} = \nu\frac{\partial^2\Theta}{\partial x^2} \quad\text{for}\quad x > 0, \tag{3.19-67}$$
$$\Theta(x, 0) = T_0, \tag{3.19-68}$$
$$\Theta(0, t) = 0; \tag{3.19-69}$$
i.e., a simple step change. The 'common solution' (an interesting result in its own right) is
$$\theta(x, t) = \Theta(x, t) = T_0\,\mathrm{erf}\left(x/\sqrt{4\nu t}\right), \tag{3.19-70}$$
which could prove useful (it did for us) when trying to model (and understand) an
impulsive start via $\lambda \to \infty$.
Now the true fluid-mechanical model that we propose to better understand the rapid-
start case is simply the following:
$$u_\theta = -2w_0\sin\theta\left\{\left(1 - e^{-\lambda t}\right) - \mathrm{erfc}\left(\frac{r - a}{\sqrt{4\nu t}}\right) + e^{-(r-a)^2/4\nu t}\cdot\mathrm{Re}[w(z)]\right\}, \tag{3.19-71}$$
where now $z \equiv \sqrt{\lambda t} + i(r - a)/\sqrt{4\nu t}$; i.e. we have 'borrowed' the 1D solution and
converted it to one which, for $\nu = 0$ and $r = a + \varepsilon$ where we must have $\varepsilon \ll a$, describes
the tangential component of a growing potential flow (small $\varepsilon$ is required so that a
'cartesian' solution can apply to a cylindrical geometry). Note too that (3.19-71) gives $u_\theta = 0$
at r = a and at t = 0, as desired. After exploring the asymptotics of this solution, we will
invoke $\nabla\cdot\mathbf{u} = 0$ to obtain the corresponding radial velocity derivative which in turn will
be used to study the PPE's BC on the cylinder. Ultimately, we shall present an analytical
solution for the pressure field, and the drag.
For $a \gg r - a \gg \sqrt{4\nu t}$ and t > 0, (3.19-71) becomes, approximately,
$$u_\theta = -2w_0\left(1 - e^{-\lambda t}\right)\sin\theta, \tag{3.19-72}$$
which describes the accelerating potential flow just outside of the VBL, yet still close to
the cylinder. Next, we need a 'small time' approximation; for $\lambda t \ll 1$, (3.19-71) becomes
$$u_\theta \cong -2w_0\lambda t\sin\theta\left\{1 - \left[1 + \frac{(r - a)^2}{2\nu t}\right]\mathrm{erfc}\left(\frac{r - a}{\sqrt{4\nu t}}\right) + \frac{r - a}{\sqrt{\pi\nu t}}\cdot e^{-(r-a)^2/4\nu t}\right\}, \tag{3.19-73}$$
which displays a 'reduced' boundary layer thickness, $\lambda t\sqrt{4\nu t}$, and is valid for all r,
t (though we still want r - a small). Finally, for $r - a \ll \sqrt{4\nu t}$, 'we' (ACH) obtain from
(3.19-71) the all-important approximation deep within the VBL:
$$u_\theta \cong -4w_0\sin\theta\cdot\frac{r - a}{\sqrt{\pi\nu t}}\cdot\sqrt{\lambda t}\,D\left(\sqrt{\lambda t}\right) = -4w_0\sin\theta\cdot\frac{r - a}{\sqrt{\pi\nu\tau_{ac}}}\,D\left(\sqrt{\lambda t}\right), \tag{3.19-74}$$
valid for all $\lambda t$, where $D(\cdot)$ is Dawson's integral, $D(y) \equiv e^{-y^2}\int_0^y e^{s^2}ds$ (see, e.g.,
Abramowitz and Stegun 1965, p. 297), whose properties of most interest here are these: (i) for
$y \ll 1$, $D(y) \approx y$, (ii) for $y \gg 1$, $D(y) \approx 1/2y$, and (iii) for $y \approx 0.92$, D(y) attains a
maximum of $D(0.92) \approx 0.54$. (In fact, D(y) somewhat resembles a more familiar function,
$ye^{-y^2}$.) Thus, for $\lambda t \ll 1$, (3.19-74) describes an accelerating velocity in the VBL,
$$u_\theta \cong -4w_0\sin\theta\,\frac{r - a}{\sqrt{\pi\nu t}}\,\lambda t = -4w_0\sin\theta\,\frac{r - a}{\sqrt{\pi\nu\tau_{ac}}}\,\sqrt{\lambda t}, \tag{3.19-75}$$
and for $\lambda t \gg 1$ it gives a decelerating one:
$$u_\theta \cong -2w_0\sin\theta\,\frac{r - a}{\sqrt{\pi\nu t}}, \tag{3.19-76}$$
which, it should be noted, is both independent of $\lambda$ and is an appropriate approximation to
the step function solution utilized earlier, in (3.19-41), for $r - a \ll \sqrt{4\nu t}$ and $r - a \ll a$
there. Finally, at $\sqrt{\lambda t} \approx 0.92$, at which point $e^{-\lambda t}$ is only about 43% of its starting value,
$u_\theta$ within the VBL peaks (in time) at
$$u_\theta \cong -2.16\,w_0\sin\theta\,\frac{r - a}{\sqrt{\pi\nu\tau_{ac}}}, \tag{3.19-77}$$
after which time viscosity 'wins out' over acceleration. (Note that since this equation only
applies for $r - a \ll \sqrt{\pi\nu\tau_{ac}}$, $\tau_{ac} \to 0$ ($\lambda \to \infty$) $\Rightarrow r \to a$.)
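
The three properties of Dawson's integral used above can be confirmed with a few lines of quadrature. This is a minimal sketch using composite Simpson integration of the definition $D(y) = e^{-y^2}\int_0^y e^{s^2}ds$; the panel counts and the scan range for the maximum are arbitrary choices, and the function name is hypothetical.

```python
import math

def dawson(y, panels=2000):
    """Dawson's integral D(y) = exp(-y^2) * int_0^y exp(s^2) ds (Simpson rule)."""
    if y == 0.0:
        return 0.0
    h = y / panels                       # panels must be even
    total = 1.0 + math.exp(y * y)        # integrand at s = 0 and s = y
    for k in range(1, panels):
        s = k * h
        total += (4.0 if k % 2 else 2.0) * math.exp(s * s)
    return math.exp(-y * y) * total * h / 3.0

# (i) small argument, (ii) large argument, (iii) location and size of the maximum
small = dawson(0.01)
large = dawson(10.0)
ys = [0.5 + 0.001 * k for k in range(1001)]
y_max = max(ys, key=lambda y: dawson(y, panels=400))
d_max = dawson(y_max, panels=400)
peak_factor = 4.0 * d_max                # coefficient appearing at the velocity peak
print(small, large, y_max, d_max, peak_factor)
```

The factor 4 D(0.92) is the origin of the coefficient 2.16 in the peak velocity, and $e^{-\lambda t}$ at $\sqrt{\lambda t} \approx 0.92$ is indeed a little over 0.4.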
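To obtain the associated BC for the PPE, we need only the VBL approximation,
(3.19-74). Thus, we insert $u_\theta$ from (3.19-74) into $\nabla\cdot\mathbf{u} = \frac{1}{r}[\partial(ru_r)/\partial r + \partial u_\theta/\partial\theta] = 0$ to
obtain $\partial(ru_r)/\partial r = -\partial u_\theta/\partial\theta$, which we insert into the Neumann BC on the cylinder, cf.
(3.19-19), to obtain
$$\frac{\partial P}{\partial r} = -\nu\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial u_\theta}{\partial\theta}\right) = \frac{4\nu w_0\cos\theta}{a^2}\,\frac{a}{\sqrt{\pi\nu\tau_{ac}}}\,D\left(\sqrt{\lambda t}\right), \tag{3.19-78}$$
which, for small time ($\lambda t \ll 1$), becomes
$$\frac{\partial P}{\partial r} \cong \frac{4\nu w_0\cos\theta}{a^2}\,\frac{a}{\sqrt{\pi\nu\tau_{ac}}}\,\sqrt{\lambda t}, \tag{3.19-79}$$
and for large time ($\lambda t \gg 1$) becomes
$$\frac{\partial P}{\partial r} \cong \frac{2\nu w_0\cos\theta}{a^2}\,\frac{a}{\sqrt{\pi\nu t}}, \tag{3.19-80}$$
and, recalling the additional restriction, $a \gg \sqrt{\pi\nu t}$, we see that, in some sense, we are
still restricted to 'small' time even when $\lambda t \gg 1$.
Finally, for $\sqrt{\lambda t} \cong 0.92$ ($t \approx 0.85\tau_{ac}$), we obtain the maximum value of $\partial P/\partial r$:
$$\frac{\partial P}{\partial r} \cong \frac{2.16\,\nu w_0\cos\theta}{a^2}\,\frac{a}{\sqrt{\pi\nu\tau_{ac}}}, \tag{3.19-81}$$
which becomes unbounded as $\lambda \to \infty$ ($\tau_{ac} \to 0$). The viscous BC for the PPE will cause
unbounded pressure in the limit $\lambda \to \infty$, but only at $t = 0^+$; it (the 'viscous' pressure)
is still zero at t = 0 because u = 0 then.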
Considering, in addition to the above model and PPE BC, the accelerating potential flow
away from the cylinder, leads us to propose the following small-time pressure solution
to the problem of accelerating flow from rest past a circular cylinder:
$$P = -\Phi(r, \theta)\,\lambda e^{-\lambda t} + \frac{w_0^2}{2}\,\frac{a^2}{r^2}\left[2\cos 2\theta - \frac{a^2}{r^2}\right]\left(1 - e^{-\lambda t}\right)^2 - \frac{4\nu w_0\cos\theta}{r}\,\frac{a}{\sqrt{\pi\nu\tau_{ac}}}\,D\left(\sqrt{\lambda t}\right), \tag{3.19-82}$$
where $\Phi \equiv w_0(r + a^2/r)\cos\theta$ is the potential function, and the second term accounts for
the source term in the PPE ($\nabla^2 P = -\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = -\tfrac{1}{2}\nabla^2 q^2$) that is given by the
accelerating potential flow [$\mathbf{u} = \nabla\Phi\cdot(1 - e^{-\lambda t})$ and $q \equiv |\mathbf{u}|$]. The only hesitancy/uncertainty
that we still harbor regarding this putative solution is that it seems to imply a slippery
solution on the cylinder; i.e.
$$\left.\frac{\partial P}{\partial r}\right|_a = \frac{u_\theta^2}{a} + \frac{4\nu w_0\cos\theta}{a^2}\,\frac{a}{\sqrt{\pi\nu\tau_{ac}}}\cdot D\left(\sqrt{\lambda t}\right), \tag{3.19-83}$$
where $u_\theta^2/a = -\tfrac{1}{2}\,\partial(q^2)/\partial r = 4w_0^2\sin^2\theta/a$ at r = a from (3.19-17) rather than $u_\theta = 0$ there.
Perhaps it is best to suggest that the potential flow part of the pressure be only applied
'outside' the VBL, though it is noteworthy that others have also included the potential
flow part on the cylinder [e.g. C-Y Wang (1968), Bar-Lev and Yang (1975), Bentwich
and Miloh (1982)]. It is clear that (3.19-83) is valid in the two limits, $\nu \to 0$ and $\nu \to \infty$
(transient Stokes flow, for which we probably need to have $\lambda \sim \nu$; or else just drop the
advection term). In any event, we believe that it is important to emphasise that the last term
in (3.19-82) is a 'viscous-generated' pressure that is realized (only) for incompressible
flow by the need to respect the omnipotent constraint, $\nabla\cdot\mathbf{u} = 0$, via the Neumann BC
for the PPE.
The $\lambda \to \infty$ limit of (3.19-82) is obviously also of much interest; it is
$$P = -\Phi(r, \theta)\,\delta(t) + \frac{w_0^2}{2}\,\frac{a^2}{r^2}\left(2\cos 2\theta - \frac{a^2}{r^2}\right)\cdot H(t) - \frac{2\nu w_0\cos\theta}{r}\,\frac{a}{\sqrt{\pi\nu t}}, \tag{3.19-84}$$
showing that, in spite of viscosity, the potential 'burst' at t = 0 still generates a potential
flow (with vortex sheet) at $t = 0^+$. Note that, like the potential flow portion, the viscous
portion only applies for t > 0 [recalling the shape of $D(\sqrt{\lambda t})$]; it is still zero at t = 0.
Note too that, unlike (3.19-55) which applies uniformly in time, (3.19-84) is still restricted
to small time, via $\sqrt{\pi\nu t} \ll a$.
The pressure portion of the drag coefficient from (3.19-82) is
$$C_D^P = \frac{2\pi a}{w_0}\,\lambda e^{-\lambda t} + \frac{4}{w_0}\sqrt{\frac{\pi\nu}{\tau_{ac}}}\,D\left(\sqrt{\frac{t}{\tau_{ac}}}\right), \tag{3.19-85}$$
and comes from the first and last terms, the middle term contributing nothing.
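The viscous component of the drag is also computable from our model as follows:
$$F_x^V = -2\int_0^\pi\left(\tau_{rr}\cos\theta + \tau_{r\theta}\sin\theta\right)a\,d\theta \tag{3.19-86}$$
$$\phantom{F_x^V} = -2\nu\int_0^\pi\left[2\frac{\partial u_r}{\partial r}\cos\theta + \left(\frac{\partial u_\theta}{\partial r} - \frac{u_\theta}{r} + \frac{1}{r}\frac{\partial u_r}{\partial\theta}\right)\sin\theta\right]a\,d\theta \tag{3.19-87}$$
evaluated at r = a, for which it simplifies to
$$F_x^V = -2a\nu\int_0^\pi\frac{\partial u_\theta}{\partial r}\sin\theta\,d\theta \tag{3.19-88}$$
since $\partial u_r/\partial r$, $\partial u_r/\partial\theta$, and $u_\theta$ all vanish at r = a. Inserting $u_\theta$ from (3.19-74) gives
$$F_x^V \cong \frac{8a\nu w_0}{\sqrt{\pi\nu\tau_{ac}}}\,D\left(\sqrt{\lambda t}\right)\int_0^\pi\sin^2\theta\,d\theta \tag{3.19-89}$$
and thus to a viscous drag coefficient of
$$C_D^V \cong \frac{4}{w_0}\sqrt{\frac{\pi\nu}{\tau_{ac}}}\,D\left(\sqrt{\lambda t}\right), \tag{3.19-90}$$
which is clearly identical to the viscous (Stokes) part of the pressure drag coefficient,
from (3.19-85), and we are not the first ones to notice this equality, at least for the limit
case of $\lambda \to \infty$ for which our results agree with those of previous investigators. (We may
be the first to find it for the $e^{-\lambda t}$ startup case.) Anyway, for small time ($0 < \lambda t \ll 1$), we
find
$$C_D^P - \frac{2\pi a\lambda}{w_0} \cong C_D^V \cong 4\lambda\,\frac{\sqrt{\pi\nu t}}{w_0}, \tag{3.19-91}$$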
corresponding to a constant acceleration of the far-field flow and an accelerating flow in
the VBL. For large $\lambda t$, we obtain
$$C_D^P \cong C_D^V \cong \frac{2}{w_0}\sqrt{\frac{\pi\nu}{t}}, \tag{3.19-92}$$
which corresponds to a decelerating flow in the VBL and agrees with that from many
earlier investigators for the impulsive start case, beginning (probably) with Blasius (1908);
e.g. Goldstein & Rosenhead (1938), who erroneously argued that the pressure drag is zero,
Collins and Dennis (1973), Wang (1968), Bar-Lev and Yang (1975), and Bentwich and
Miloh (1982), although their result (perhaps a 'typo') is four-fold smaller. The two drag
coefficients peak at $\sqrt{\lambda t} \cong 0.92$, at
$$C_D^P(\max) = C_D^V(\max) \cong \frac{2.16}{w_0}\sqrt{\frac{\pi\nu}{\tau_{ac}}}, \tag{3.19-93}$$
which clearly become unbounded (at $t = 0^+$) for $\lambda \to \infty$. Again, however, we point out
that $C_D$ from both the viscous part of the pressure and shear is zero at t = 0. Only the
acceleration drag is present at t = 0, impulsively for $\lambda \to \infty$. Returning to (3.19-85),
it is interesting to compare the acceleration drag to the viscous/pressure drag. Thus,
equating $2\pi a\lambda e^{-\lambda t}/w_0$ and $8\sqrt{\pi\nu/\tau_{ac}}\,D(\sqrt{\lambda t})/w_0$ gives, for $\lambda t \gg 1$, the time of equality:
$\lambda t e^{-\lambda t} = 2\sqrt{\nu t/\pi}/a$; note that the RHS is a sort of 'reduced' boundary layer thickness,
and thus both sides must be small. For our parameters, this equation yields $\lambda t \approx 7.44$
(7+ time constants), which makes us recall the rapid transition observed between ~$5\tau_{ac}$
and $10\tau_{ac}$; see Figure 3.19-9, equation (3.19-38), and related discussion, including that
following (3.19-39). This really is a transition time! [Increasing $\lambda$ to $10^4$ changes the
'crossover' time only slightly, to $\lambda t = 9.9$, and $\lambda = 10^8$ only increases it to $\lambda t = 14.7$.]
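
The quoted crossover times follow from solving the transcendental equation $\lambda t\,e^{-\lambda t} = 2\sqrt{\nu t/\pi}/a$ for $\lambda t$. The sketch below does this by bisection with a = 1 and ν = 0.0002 from the text; the bracket [1, 50] for the root and the function name are assumptions of the illustration.

```python
import math

def crossover_lambda_t(lam, nu, a, lo=1.0, hi=50.0, iters=100):
    """Solve x exp(-x) = 2 sqrt(nu x / (lam pi)) / a for x = lam * t > 1.

    For x > 1 the left side decreases and the right side increases, so the
    assumed bracket [lo, hi] contains exactly one root.
    """
    f = lambda x: x * math.exp(-x) - 2.0 * math.sqrt(nu * x / (lam * math.pi)) / a
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

nu, a = 2.0e-4, 1.0
results = {lam: crossover_lambda_t(lam, nu, a) for lam in (1.0e2, 1.0e4, 1.0e8)}
for lam, x in results.items():
    print(f"lambda = {lam:.0e}: crossover at lambda*t = {x:.2f}")
```

Raising λ by six orders of magnitude roughly doubles the crossover value of $\lambda t$, which illustrates how weakly the transition depends on the start-up rate.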
To finish our analysis, which we believe has already tied up most of the loose ends
with respect to our numerical results, we return to the drag coefficient for the bounded
domain inviscid model, (3.19-54), and (especially) to the pressure from which it came,
(3.19-52), and we remark that the following discussion applies as well to the unbounded
($R = \infty$) domain. The last term in (3.19-52) can be rewritten (generalized) as
$P_{accel} = -\rho\,\dot{u}_\infty(t)\,(r + a^2/r)\cos\theta/(1 - a^2/R^2)$; (3.19-94)
and we have converted from kinematic pressure to true pressure, for reasons that will
become clear soon. $P_{accel}$ is the pressure field generated by the accelerating potential
flow, and it comprises two separately identifiable terms: $P_{accel} = P_{FS} + P_{AM}$, where
$P_{FS} = -\rho\,\dot{u}_\infty(t)\,r\cos\theta/(1 - a^2/R^2) = -\rho\,\dot{u}_\infty(t)\,x/(1 - a^2/R^2)$, (3.19-95)
i.e.
$\partial P_{FS}/\partial x = -\rho\,\dot{u}_\infty(t)/(1 - a^2/R^2)$ (3.19-96)
is the (uniform) pressure gradient caused by an accelerating free stream (FS) velocity.
[After setting $R = \infty$, (3.19-95) also describes the pressure distribution in an accelerating
incompressible solid.] The remaining term
$P_{AM} = -\rho\,\dfrac{\dot{u}_\infty(t)\,a^2\cos\theta}{r\,(1 - a^2/R^2)}$ (3.19-97)
is the pressure generated by the so-called added mass effect; it accounts for the acceleration
of that mass of fluid caused by the need to 'flow around' the cylinder. [Lovalenti and
Brady (1993), who also present a most detailed study of particle dynamics, state the
following about added mass: 'It represents the additional mass the particle appears to
have due to the resistance to acceleration of the surrounding fluid'.] The added mass will
be made more clear after we point out the following fact: It turns out that the circular
cylinder is a sort of special case with respect to 'added mass' in that both $P_{FS}$ and $P_{AM}$
are identical on the cylinder ($r = a$). For a sphere, $P_{AM} = \tfrac{1}{2}P_{FS}$, and for other shapes the
ratio ($P_{AM}/P_{FS}$) can vary from near zero (very thin elliptical cylinders or very prolate
spheroids) to very large values (very thick elliptical cylinders or very oblate spheroids);
see, e.g., Figure 16.1 in Vogel (1994), Clift et al. (1978) and Sarpkaya and Isaacson (1981).
In any case, however, the added mass factor ($\gamma = P_{AM}/P_{FS}$) is no harder to compute than
the potential flow field; i.e., it is determined by the shape of the object (with $\partial\phi/\partial n = 0$
on the surface) and $\nabla^2\phi = 0$, with $\partial\phi/\partial x \to u_\infty(t)$ for $r \to \infty$.
The drag force from the accelerating potential flow is obtained as usual, via $F_x =
-2\int_0^{\pi} P\cos\theta\,a\,d\theta$ where $P = P_{accel}(a, \theta)$, and we see from (3.19-94) that $r + a^2/r = a + a$
for $r = a$, as noted earlier; and we note another 'equality' for a cylinder, the first being
the early time equality of viscous and pressure drag. (For other shapes, this factor would
probably become something like $r + \gamma l^2/r$ (in 2D), at least for large $r$, where $l$ is a
characteristic dimension of the obstacle and $\gamma$ is a scalar. See, e.g., Batchelor (1967), in
which the added mass factor is generalized from a scalar to a second rank tensor.) Thus
$F_x = 2\pi a^2\rho\,\dot{u}_\infty/(1 - a^2/R^2)$, (3.19-98)
where (for our case) $\dot{u}_\infty = \lambda u_0 e^{-\lambda t}$, is the force required to hold the cylinder in place
against the accelerating potential flow.
To conclude our added mass 'digression', we ask the seemingly simple question, which
will also show why we needed to re-introduce the fluid's density: Suppose we release the
cylinder in this 2D flow field (in zero gravity); what is its initial acceleration? The naive
response is $F_x/\rho_s\pi a^2$ where $\rho_s$ is the (solid) cylinder's density and $\pi a^2$ is its volume (per
unit length). The correct response, however, is
$(\rho_s + \rho)\,\pi a^2\,\dot{u}_s = F_x = 2\pi a^2\rho\,\dot{u}_\infty/(1 - a^2/R^2)$,
or
$(\rho_s/\rho + 1)\,\dot{u}_s = 2\dot{u}_\infty/(1 - a^2/R^2)$ (3.19-99)
because the accelerating cylinder also 'causes' acceleration of surrounding fluid; in this
case a mass of fluid that is contained in the 'volume' of the cylinder, because $\gamma = 1$. For
a more general geometry, (3.19-99) generalizes (for $R = \infty$) to
$(\rho_s/\rho + \gamma)\,\dot{u}_s = (1 + \gamma)\,\dot{u}_\infty$; (3.19-100)
there is an added mass effect on both sides of $f = ma$. (Note the 'compatibility' of this
result if $\rho_s = \rho$, and note too, per Landau and Lifshitz (1959) who also generalize the
result to arbitrary bodies, that time integration gives the cylinder's velocity for all time.)
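To make (3.19-100) concrete, the short sketch below evaluates the ratio of the body's initial acceleration to that of the free stream, and contrasts it with the naive estimate that omits the body's own added mass. The function names and the sample density ratios are ours, chosen purely for illustration; γ = 1 (circular cylinder) and γ = 1/2 (sphere) are the values quoted above, and the unbounded domain (R = ∞) is assumed.

```python
def initial_acceleration_ratio(density_ratio, gamma):
    """(du_s/dt) / (du_inf/dt) from (rho_s/rho + gamma) du_s = (1 + gamma) du_inf."""
    return (1.0 + gamma) / (density_ratio + gamma)

def naive_acceleration_ratio(density_ratio, gamma):
    """Naive estimate F_x / (rho_s * volume), which ignores the body's own added mass."""
    return (1.0 + gamma) / density_ratio

if __name__ == "__main__":
    # Illustrative density ratios rho_s/rho: light body, neutrally buoyant, heavy body.
    for name, gamma in (("cylinder", 1.0), ("sphere", 0.5)):
        for ratio in (0.01, 1.0, 10.0):
            correct = initial_acceleration_ratio(ratio, gamma)
            naive = naive_acceleration_ratio(ratio, gamma)
            print(f"{name:8s} rho_s/rho = {ratio:5.2f}: correct {correct:7.3f}, naive {naive:8.3f}")
```

Note that a neutrally buoyant body simply moves with the fluid (ratio 1) for any γ, while a very light body is limited to (1 + γ)/γ times the free-stream acceleration, where the naive estimate would diverge.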
A practical and real life example of added mass effects on a moving object is provided
by Vogel (1994): 'Perhaps the most extreme case so far uncovered occurs in the escape
response of a crayfish. It flexes tail and abdomen and goes rearward with a maximum
acceleration of 51 m/sec$^2$. Drag turns out to be only around 10% of the resistance, with
90% caused by the masses of crayfish and water—as is reasonable for a high acceleration
to a fairly low final speed.'
And this is as far as we wish to take it, except to state that these inviscid results
carry over without change to viscous flows. But for readers who wish to dig deeper into
these concepts—which are quite prevalent when studying the motion of small particles,
or drops or bubbles, in an incompressible fluid—we provide a 'short list' of some useful
recent references, in addition to those already cited: Maxey and Riley (1983), Chang and
Maxey (1995), Mei (1993), Mei and Lawrence (1996), Mei (1996), Lovalenti and Brady
(1993, 1995), and Panton (1996).
3.19.11 A Better Mesh
While we argued earlier, and (we assert) correctly, that the computed magnitude of the
pressure field should be pretty accurate when the acceleration Neumann BC for the PPE
is much larger than that from the viscous term (i.e., when $\lambda t \ll 9.3$), we must also admit
that the pressure distribution at and near the cylinder cannot be accurately computed
for $\tau_{ac}\ln(ah/\nu\tau_{ac}) < t < t_{MTB}$; hence, the spurious $O(1/h)$ form drag coefficient for small
time. We also said that we cannot say how $C_D$ should behave during the 'blind spot'.
What we meant is that we cannot, on this mesh and with the chosen acceleration rate
($\lambda = 100$), answer that question.
In order to generate 'believable' results for nearly all $t \ge 0$, and still approximate well
an impulsive start via the $e^{-\lambda t}$ start up, it is now clear that the 'effective' acceleration
rate should not exceed the mesh response time; i.e., a minimum requirement for good
accuracy for all time is that the 'window of non-believability' be closed, realized via
$\tau_{ac}\ln\left(\dfrac{ah}{\nu\tau_{ac}}\right) > t_{MTB} = h^2/4\nu$ (3.19-101)
from (3.19-39), and already reported as $h \le 0.00815$ for our parameters ($t_{MTB} = 0.083$).
[Fixing the mesh at $h = 0.022$ with $\nu = 0.0002$, gives $\tau_{ac} = 0.084$ or $\lambda = 11.9$, which is
not a very rapid start.]
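Criterion (3.19-101) is easy to evaluate for any candidate mesh, and a sketch of such a check follows. It assumes a = 1 for the cylinder radius (our inference; it reproduces the numbers quoted above) and ν = 0.0002; all function names are ours. The routine min_tau_ac finds, by bisection, the smallest startup time constant for which the window is just closed on a given mesh.

```python
import math

NU = 0.0002   # kinematic viscosity used in the text
A = 1.0       # cylinder radius: an assumption, inferred from the quoted numbers

def t_mtb(h, nu=NU):
    """Minimum time of believability, t_MTB = h^2 / (4 nu)."""
    return h * h / (4.0 * nu)

def window_start(tau_ac, h, nu=NU, a=A):
    """Left-hand side of (3.19-101): tau_ac * ln(a h / (nu tau_ac))."""
    return tau_ac * math.log(a * h / (nu * tau_ac))

def window_is_closed(tau_ac, h, nu=NU, a=A):
    """True when no 'window of non-believability' exists."""
    return window_start(tau_ac, h, nu, a) > t_mtb(h, nu)

def min_tau_ac(h, nu=NU, a=A):
    """Smallest tau_ac that just closes the window on a mesh of size h."""
    target = t_mtb(h, nu)
    # window_start increases with tau_ac on (0, a h / (nu e)], so bisect there.
    lo, hi = 1e-12, a * h / (nu * math.e)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if window_start(mid, h, nu, a) < target:
            lo = mid
        else:
            hi = mid
    return hi

if __name__ == "__main__":
    print("t_MTB for h = 0.00815:", round(t_mtb(0.00815), 4))
    tau = min_tau_ac(0.022)
    print("h = 0.022: tau_ac >=", round(tau, 4), "i.e. lambda <=", round(1.0 / tau, 1))
    print("lambda = 100 on h = 0.022 closed?", window_is_closed(0.01, 0.022))
```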
So, to see if our 'methodology' could yield better insight into the small-time behavior,
we performed a brief study with modified (improved) versions of the three meshes
presented earlier: a coarse mesh with 4323 nodes and $h = 0.0049$ ($t_{MTB} = 0.030$), a
medium mesh with 16965 nodes (cf. Figure 3.19-1) and $h = 0.0024$ ($t_{MTB} = 0.0073$),
and a fine mesh with 67 209 nodes and $h = 0.0012$ ($t_{MTB} = 0.0018$); each differs from the
next by close to a factor of 2 in each direction, and each is more aggressively graded
(Figure 3.19-19 shows the 4323 mesh), thus permitting us to get closer to the cylinder and
closer to $t = 0$ believability. Shown first, in Figure 3.19-20, are the initial numerical
pressure fields for the impulsive start case (potential flow + no slip) and Re = 1000
($\nu = 0.0002$), in order to clearly demonstrate the nearly 'disastrous' effect of mesh
refinement for this 'impossible' problem. Whereas the coarse mesh seems to give a
'reasonable' result for this Re (cf. Figure 3.19-9), the two finer meshes quickly dispel any
notion regarding mesh convergence, with the 67K mesh clearly showing a return to the
truncation-error-dominated Stokes-like pressure field, à la (3.19-35). Also noteworthy is
that both pressure and vorticity (in the vortex sheet) are varying approximately like
$O(1/h)$ in magnitude. The coarse mesh, while still giving a spurious initial pressure with
respect to the viscous BC for the PPE, is not so dominated by this error term that it
cannot 'see' some of the advection 'source' term in the PPE.
We return now to Figure 3.19-18, which shows, this time on a log-log plot, the
impulsive start drag coefficients for small $t$, for both Re = 1000 and Re = 0 (both with
$\nu = 0.0002$), on several grids: the three new grids mentioned above as well as the first
(old) 17K grid used for all previous results. Noteworthy is the $O(1/h)$ specious behavior
of $C_D$ for $t < t_{MTB}$ for each, as well as the convergence (for all grids) to a $k/\sqrt{t}$ behavior
for $t > O(t_{MTB})$, where $k \simeq 0.69$ for the viscous drag, $k \simeq 1$ for the pressure drag and,
of course, $k = 1.69$ for the total drag. (The theoretical value of $k$, for both viscosity and
pressure, is $2\sqrt{\pi\nu}/u_0 = 0.50$ for the unbounded case, as pointed out earlier; we are 38%
high for the viscous part, 100% high for the pressure part, and 69% high for the total, all
of which are explained by our too-tight domain.)
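The comparison of the computed coefficients $k$ with the unbounded-domain theory is simple arithmetic, sketched below. The value u_0 = 0.1 is our assumption (it is the value that makes $2\sqrt{\pi\nu}/u_0$ come out at 0.50 for ν = 0.0002); the helper names are likewise ours.

```python
import math

NU = 0.0002   # kinematic viscosity from the text
U0 = 0.1      # free-stream speed: assumed, chosen to reproduce k = 0.50

def k_theory(nu=NU, u0=U0):
    """Coefficient of 1/sqrt(t) for either the viscous or the pressure drag."""
    return 2.0 * math.sqrt(math.pi * nu) / u0

def percent_high(computed, theory):
    """Percentage by which a computed coefficient exceeds the theoretical one."""
    return 100.0 * (computed / theory - 1.0)

if __name__ == "__main__":
    k = k_theory()
    print(f"theoretical k (each part): {k:.3f}")
    print(f"viscous  k = 0.69: {percent_high(0.69, k):5.1f}% high")
    print(f"pressure k = 1.00: {percent_high(1.00, k):5.1f}% high")
    print(f"total    k = 1.69: {percent_high(1.69, 2.0 * k):5.1f}% high")
```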
Fig. 3.19-19 A 4323 node mesh of 9/3 elements with improved 'grading': (a) the full domain; (b) zoom near the cylinder.

Our final simulations, on the 'new' 4323 and 16965 node meshes for the $e^{-\lambda t}$ startup
at Re = 1000, are summarized by the drag coefficients shown in Figure 3.19-21, on a
log-log scale. The first 3-4 decades show mainly the $e^{-\lambda t}$ decay of the acceleration
drag (all pressure) and we note with pleasure that both meshes give the same result (to
graphical accuracy), i.e., the curves for Fx (total drag) and Px (pressure portion) from the
two meshes are indistinguishable, a consequence of which is that we seem to be
mesh-converged. That the viscous portions (Fx − Px) do not agree for $t < t_{MTB}$ [and, in fact,
vary like $O(1/h)$], tells us, of course, that the viscous drag (and the viscous portion of
the pressure drag) are still not accurate at small time. Fortunately, they are very small for
$t < t_{MTB}$ and thus, the total drag is believable for all $t$; and we note, for both meshes,
that there is no window of non-believability, since $\tau_{ac}\ln(ah/\nu\tau_{ac}) > t_{MTB}$ for each; see
(3.19-101). For $t \lesssim 5\tau_{ac}$ or so, $C_D(t) = C_D(0)e^{-\lambda t}$ and is, of course, all pressure drag.
For $t > 10\tau_{ac}$ or so, the transition from acceleration-dominated to 'inertia + viscous' is
complete, with the curves now agreeing completely with those in Figure 3.19-18 when
$t > t_{MTB}$ there. The rise of the viscous part, which follows closely the equation $C_D^V =
10(1 - e^{-\lambda t}) \simeq 1000t$, at least up to $t \simeq 0.01$, will now be explained. First, however,
we explain what it is not; it is not a good approximation to the small time (small $\lambda t$,
actually) analytical solution, $C_D^V \simeq 4\lambda\sqrt{\pi\nu t}/u_0 = 100\sqrt{t}$ from (3.19-91). The computed
viscous drag, for $t < t_{MTB}$, is simply another victim of the small time error on any finite
mesh, even though the small $\lambda t$ analytical solution is describing a simpler transient:
constant acceleration. But for $t < h^2/4\nu = t_{MTB}$ the mesh still cannot 'see' the proper
viscous effect; it 'sees' only the outer solution (potential flow), like $u_\theta = -2u_0\lambda t\sin\theta$
when $\lambda t \ll 1$, from (3.19-72). Thus, in the attempt to compute $C_D^V$ [cf. (3.19-86) to (3.19-91)],
the code 'sees' $\partial u_\theta/\partial r = -(2u_0\lambda t\sin\theta - 0)/h$ to give
$C_D^V = \dfrac{2\pi\nu\lambda t}{u_0 h}$, (3.19-102)
another spurious result that gives, for our parameters on the '17k' mesh ($h = 0.0024$),
$C_D^V \simeq 524t$, within a factor of $\sim 2$ of the numerical result on the truncated domain and
thus satisfactorily explaining the observed behavior. (To get closer yet, recall that our
peak speed across the top of the cylinder is closer to 3 than to 2).
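The two small-time estimates of the viscous drag coefficient discussed above can be compared directly. The sketch below simply codes both formulas; it assumes u_0 = 0.1 (our inference from the quoted values $100\sqrt{t}$ and $524t$), with λ = 100 and ν = 0.0002 as in the text, and the function names are ours.

```python
import math

def cvd_analytic_small_time(t, lam=100.0, nu=0.0002, u0=0.1):
    """Small-(lambda t) analytical viscous drag coefficient, 4 lam sqrt(pi nu t) / u0."""
    return 4.0 * lam * math.sqrt(math.pi * nu * t) / u0

def cvd_mesh_limited(t, h, lam=100.0, nu=0.0002, u0=0.1):
    """Spurious mesh-limited estimate (3.19-102), 2 pi nu lam t / (u0 h)."""
    return 2.0 * math.pi * nu * lam * t / (u0 * h)

if __name__ == "__main__":
    h = 0.0024
    print("slope of mesh-limited C_D^V vs t:", round(cvd_mesh_limited(1.0, h), 1))
    for t in (1e-5, 1e-4, 1e-3):
        print(f"t = {t:g}: analytic {cvd_analytic_small_time(t):.4f}, "
              f"mesh-limited {cvd_mesh_limited(t, h):.4f}")
```

For very small $t$ the linear (mesh-limited) form lies well below the $\sqrt{t}$ form, which is one way to see that the computed rise is not an approximation to the analytical solution.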
For our final drag figure, we show in Figure 3.19-22 a portion of the $C_D$ vs. $t$ curve for
$\tau_{ac} \le t < 100\tau_{ac}$ for both the original and the improved 16 965 node mesh for Re = 1000.
Recalling our earlier discussion regarding the questionable accuracy of the pressure field
in the window of non-believability (blind spot; cf. Figure 3.19-9 and discussion), we
assert that the new mesh shows the correct result and the old mesh does indeed display
noticeable error in the range $O(\tau_{ac} = 0.01) < t < O(t_{MTB} = 0.61)$.
Fig. 3.19-20 Initial pressure field and range of vorticity for the impulsive start on 3 meshes (Re = 1000):
(a) 4323 mesh; $P_{max} = 0.051$, $P_{min} = -0.016$, $\omega_{max} = 17.1$, $\omega_{min} = -97.2$
(b) 16965 mesh; $P_{max} = 0.097$, $P_{min} = -0.014$, $\omega_{max} = 34.5$, $\omega_{min} = -218$
(c) 67209 mesh; $P_{max} = 0.196$, $P_{min} = -0.034$, $\omega_{max} = 73.0$, $\omega_{min}$ = ^11
Fig. 3.19-21 Early-time drag coefficients for rapid starts on two 'better' meshes. [Log-log plot vs. $t$; curves: Fx, Px and (Fx − Px) on the 4k and 16k meshes.]
Fig. 3.19-22 The 'window of error' on the original mesh is approximately 0.04 < t < 0.8. [Plot of $C_D$ vs. $t$; curves: original mesh, improved mesh.]
To conclude our drag discussion, we return to a small sampling of the literature [Dennis
and Collins (1973b); Chou and Wang (1996)] and note that they obtained the
asymptotic ($t \to 0$) result that both $C_D$ (pressure) and $C_D$ (viscous) go like $(2/u_\infty)\sqrt{\pi\nu/t}$.
Attempting to calibrate our $e^{-\lambda t}$ startup with these results would suggest, since the friction
drag is necessarily zero at $t = 0$, a really rapid growth (achievable of course for $\lambda \to \infty$)
in order to approach this asymptote.
To conclude our (numerical) drag discussion, we wish to state that it appears to us that
the stream function-vorticity ($\psi$-$\omega$) method should have significantly more difficulty than
the primitive-variable (u-p) method that we utilize: for example, in Collins and Dennis
(1973b) are (after fixing a misprint in the pressure term) the following equations for the
drag coefficient:
4 [*
Co (friction) = — / co smOdO
and
Co (pressure)
cos#d#.
r=a
The vortex sheet of course has $\omega = \infty$ at $r = a$, and we will not even 'speculate' as to
the type of singularity that is $\partial\omega/\partial r$ at $r = a$. The vortex sheet is a real code breaker.
This is probably a good place to relate some opinions of someone else who is and
has been very interested in transient fluid dynamics: 'The impulsive start is a man-made
problem; Nature has no impulsive starts.' And: 'Any paradoxes at the end were put in at
the beginning, by our assumptions.'—T. Sarpkaya (personal communications).
3.19.12 $\Delta t$ vs. $t$
To conclude on an upbeat note, we now turn to the behavior of our (smart, via local error
control) time integrator. We shall present and discuss $\Delta t$ vs. $t$ curves for, unless otherwise
indicated, TR applied to the Re = 1000 case via the 9/3 element on the mesh shown in
Figure 3.19-1 with $\varepsilon = 10^{-4}$. But because we used a code (FIDAP) that does not initialize
TR 'properly,' à la the discussion surrounding (3.16-236), we first describe how this code
starts up, and mention up front that, while not theoretically 'perfect,' it usually works
quite well in practice; i.e., the short cut is viable. It is also simple: start the integration
with 2 (or 3, or 4) fixed, small-$\Delta t$ BE steps (here $10^{-5}$), thus precluding a need for $P_0$, $\dot{u}_0$,
and a div-free IC (the first step is generally to be considered as a 'projection step'). After
step 2, the quantity $\dot{u}_2 = (u_2 - u_1)/\Delta t_0$ is used as the acceleration vector on the RHS of
the general TR algorithm, (3.16-235), and the switch to TR is made. After one TR step,
still at the conservatively-small initial step size, we also have $\dot{u}_3 = 2(u_3 - u_2)/\Delta t_0 - \dot{u}_2$,
so that error control via AB2 can commence with step 4, and $\Delta t$ changed beginning at
step 5. (If the IC is known to satisfy $C^T u_0 = g_0$, the switches can be made one step
sooner.)
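The start-up sequence just described can be illustrated on a scalar stand-in. The sketch below is not the FIDAP implementation; it applies the same bookkeeping (two fixed BE steps, an acceleration estimate from the BE difference, then one TR step written in terms of that acceleration) to the model problem $\dot{y} = -\lambda y$, with names and default values chosen by us for illustration.

```python
def startup_be_then_tr(lam=100.0, y0=1.0, dt0=1e-5):
    """Two BE steps then one TR step for the scalar model problem y' = -lam*y.

    Returns the three new solution values and the two acceleration estimates.
    """
    # Steps 1 and 2: backward Euler, y_{n+1} = y_n + dt0 * (-lam * y_{n+1}).
    y1 = y0 / (1.0 + lam * dt0)
    y2 = y1 / (1.0 + lam * dt0)

    # Acceleration handed to TR: ydot_2 = (y_2 - y_1) / dt0.
    yd2 = (y2 - y1) / dt0

    # Step 3: trapezoid rule, y_3 = y_2 + (dt0 / 2) * (ydot_2 + (-lam * y_3)).
    y3 = (y2 + 0.5 * dt0 * yd2) / (1.0 + 0.5 * lam * dt0)

    # Updated acceleration: ydot_3 = 2 (y_3 - y_2) / dt0 - ydot_2.
    yd3 = 2.0 * (y3 - y2) / dt0 - yd2
    return (y1, y2, y3), (yd2, yd3)

if __name__ == "__main__":
    (y1, y2, y3), (yd2, yd3) = startup_be_then_tr()
    print("ydot_2 vs -lam*y_2:", yd2, -100.0 * y2)
    print("ydot_3 vs -lam*y_3:", yd3, -100.0 * y3)
```

For this linear problem both acceleration estimates coincide (to round-off) with the ODE right-hand side at the new time level, which is why no separate initial acceleration is needed.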
Figure 3.19-23 shows the variable-step integrator results at early time, both to show
how well it deals with the imposed $e^{-\lambda t}$ transient, and to compare this result with the more
conventional impulsive start (the potential flow IC). Here, and in those to follow, all time
steps are plotted. The solid line in Figure 3.19-23(a) is the simple theoretical $\Delta t$ vs. $t$ result
for the scalar ODE $\dot{y} = -\lambda y$ [see (2.7-87)]: $\Delta t = (12\varepsilon e^{\lambda t})^{1/3}/\lambda = 0.00106\,e^{33.3t}$. It is seen
to describe well the full Navier-Stokes time integration during the acceleration-dominated
period, say $t \lesssim 0.10$ or so. Note that only a few steps are needed to 'recover' from
the conservatively small $\Delta t_0$ of $10^{-5}$ to the 'theoretically correct' value of $\sim 10^{-3}$. For
$t > 0.1$ or so, the physics of advection and diffusion (and spatial numerical error!) take
over the job of 'determining $\Delta t$.' Figure 3.19-23(b) shows three items of interest for the
impulsive start:
Fig. 3.19-23 Short-time performance of TR integrator (Re = 1000): (a) $e^{-\lambda t}$ case (40 steps); (b) impulsive start (16 steps). [Plots of $\Delta t$ vs. $t$.]
Fig. 3.19-24 Performance of TR and BE for the $e^{-\lambda t}$ case (Re = 1000): (a) the full TR run, with a restart at $t = 50$ (162 steps); (b) Backward Euler, shorter run (200 steps). [Plots of $\Delta t$ vs. $t$.]
1. $\Delta t$ quickly reaches a value appropriate to the problem; in this case, the mesh 'response
time', or $t_{MTB}$, is $\sim 0.6$, and TR ODE theory now says, for $\varepsilon = 10^{-4}$, that $\lambda\Delta t$ (different $\lambda$!)
should initially be $(12\varepsilon)^{1/3} = 0.1$, so that $\Delta t$ should be $\sim 0.1/\lambda = 0.1\,t_{MTB} = 0.06$, which
is close enough to the values in the plot.
2. Beyond $t = 0.2$, the $\Delta t$ vs. $t$ behavior is very close to that in Figure 3.19-23(a), another
desirable property of a 'smart' integrator.
3. The $e^{-\lambda t}$ transient cost only 24 extra time steps for an accurate integration, even
though its time scale ($\tau_{ac} = 0.01$) is much shorter than those for advection and diffusion.
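The step sizes quoted in this section follow from the scalar TR error-control estimate $\Delta t = (12\varepsilon e^{\lambda t})^{1/3}/\lambda$, and a brief sketch reproduces them. The function names are ours; λ = 100, ε = 10⁻⁴ and a response time of about 0.6 are the values used in the text, and taking the effective rate to be 1/t_MTB for the impulsive start follows the reasoning of item 1.

```python
import math

def tr_step_exponential(t, lam=100.0, eps=1e-4):
    """Theoretical TR step for y' = -lam*y at relative error eps."""
    return (12.0 * eps * math.exp(lam * t)) ** (1.0 / 3.0) / lam

def tr_initial_step_from_response_time(t_response, eps=1e-4):
    """Initial step when the effective rate is 1/t_response: (12 eps)^(1/3) * t_response."""
    return (12.0 * eps) ** (1.0 / 3.0) * t_response

if __name__ == "__main__":
    print("dt at t = 0      :", round(tr_step_exponential(0.0), 5))
    print("dt at t = 0.05   :", round(tr_step_exponential(0.05), 5))
    print("impulsive start  :", round(tr_initial_step_from_response_time(0.6), 3))
```

The growth rate of the step, $e^{\lambda t/3}$, is where the exponent 33.3 in the quoted formula comes from.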
In Figure 3.19-24(a) we show $\Delta t$ vs. $t$ for the entire simulation, including an appropriate
time step reduction at $t = 50$, caused by a restart which includes a switch back to BE for
2 or 3 (necessarily smaller) steps. Note too the rapid recovery to what would have been a
smoother curve via non-stop integration to $t = 95$. It turns out that the 'rough' restart only
'costs' about 5-10 steps; the non-stop integration used 156 time steps. The reduction in
$\Delta t$ at $t = 15$ is apparently required to follow the separation phenomena, although it does
grow again beyond $t = 25$. For comparison, we show in Figure 3.19-24(b) the typically
inferior performance of BE on (a portion of) the same problem; it took 200 time steps
just to reach a time of $\sim 17$.
Moving to the Stokes flow simulations, Figure 3.19-25 shows the $\Delta t$ behavior (TR)
for both types of startup, about which we note the following:
1. Again, the $e^{-\lambda t}$ transient required only 24 extra time steps over the potential flow IC.
2. Again the $\Delta t$ selection mechanism caused virtually identical behavior for $t \gtrsim 0.2$.
3. Stokes flow is 'easier' to integrate and $\Delta t$ grows monotonically owing, in part, to the
linearity of the DAEs.
3.19.13 A Deficient Mesh Design
We conclude (finally!) this example with a small excursion/digression related to mesh
'response' (i.e., $t_{MTB}$) to show how much more important is a 'good' mesh for parabolic
problems than for the more forgiving (easier) elliptic problems. We mentioned earlier that
the $\theta$-variation around the cylinder is, in some sense, the 'easy' part of the simulation. Let
us return briefly to that issue to point out that a non-uniform mesh size as a function of $\theta$
can cause some significant problems, at least for the transient part of the simulation and
for small time close to the cylinder. Figure 3.19-26(a) shows a portion of a highly
nonuniform mesh (in the $r$-direction), constructed 'by mistake' early-on in our investigation.
(The domain is the same as in Figure 3.19-1). Suffice it to say (but not show) that
the transient pressure field that resulted from either type of startup produced lots of
interesting-but-spurious dynamics. Rather, we shall briefly demonstrate related behavior
for the much-simpler transient heat equation. We solved $\partial T/\partial t = 0.0002\,\nabla^2 T$ on this mesh
with an IC of $T = 1$ and BCs of $\partial T/\partial n = 0$ except on the cylinder, where we used $T = 0$;
i.e., we have another step change at the boundary.

Fig. 3.19-25 Performance of TR for Re = 0: (a) no-flow BC, $e^{-\lambda t}$ case (60 steps); (b) potential flow/impulsive start case (36 steps). [Plots of $\Delta t$ vs. $t$.]
Fig. 3.19-26 Demonstration of spurious dynamics on a non-uniform mesh: (a) non-uniform, non-optimal mesh, $\Delta r_{min} = 0.0023$, $\Delta r_{max} = 0.022$; (b) $T$ at $t = 0.05$ ($T_{max} = 1.037$); (c) $T$ at $t = 1.0$ ($T_{max} = 1.008$).

Figure 3.19-26(b) shows the resulting
solution near the cylinder at early time and Figure 3.19-26(c) shows it at a later time. We
summarize the discussion of this, and other simulations, with the following
Remarks:
(1) The spurious non-concentric isotherms are a direct consequence of the range in
mesh response times as a function of $\theta$: $t_{MTB}$ ranges from 0.0070 ($h_1 = 0.00236$)
to 0.613 ($h_1 = 0.02215$) for the first nodes off of the cylinder, the minimum occurring
at $\theta = \pi/2$, the maximum at about halfway down.
(2) The temperature overshoot is 'caused' by the CM matrix; it is a wiggle signal. (A
LM result has no overshoot, but also has spurious, non-concentric isotherms for
small time.)
(3) Even by $t = 1.0$, the solution has not totally recovered, even though the isotherms
are now properly concentric.
(4) A uniform-in-$\theta$ $t_{MTB}$ at least generates concentric isotherms for all $t$, and the only
remaining problem relates then to the difficulty of a step change.
(5) The solution of the steady Stokes equations on this mesh looks little different than
those on our good mesh (Figure 3.19-1), thus demonstrating the extreme relative
lack of sensitivity to mesh non-uniformity for elliptic problems.
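The spread of mesh response times cited in Remark (1) can be checked with a few lines, given here as a sketch (ν = 0.0002 as in the heat equation above; the function name is ours). Because $t_{MTB}$ scales like $h^2$, a roughly nine-fold variation of the first-node spacing around the cylinder translates into a response time that varies by nearly two orders of magnitude.

```python
NU = 0.0002  # diffusivity used in the transient heat equation of the text

def t_mtb(h1, nu=NU):
    """Mesh response time for a first-node spacing h1: h1^2 / (4 nu)."""
    return h1 ** 2 / (4.0 * nu)

if __name__ == "__main__":
    h_fine, h_coarse = 0.00236, 0.02215   # first-node spacings quoted in Remark (1)
    t_min, t_max = t_mtb(h_fine), t_mtb(h_coarse)
    print(f"t_MTB at finest spacing  : {t_min:.4f}")
    print(f"t_MTB at coarsest spacing: {t_max:.3f}")
    print(f"ratio of response times  : {t_max / t_min:.0f}")
```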
Returning to the transient case, we (nearly) conclude this overlong example with the
obvious 'word to the wise:' If you solve a NS problem with time-dependent forcing that
acts over a time scale $\tau$ in which viscous (and pressure) effects are important near a no-slip
boundary, you need $t_{MTB} \ll \tau$ for believable results; i.e., be sure that $h \ll \sqrt{4\nu\tau}$, the
same result that 'comes from' the simple 1D transient heat equation.
3.19.14 Concluding Remarks
We end this discussion by attempting to summarize what we know (or think we
know) about rapid starts from rest and 'impulsive' starts from potential flow, beginning
with the former. There are three competing processes for establishing the velocity and
(especially) the pressure fields, and we assume below that $\lambda$ is sufficiently large that
'process 1' dominates for small time:
(i) the acceleration of the inlet flow, in our case via the Dirichlet BC, $u(t) = u_0(1 - e^{-\lambda t})$;
(ii) the viscous diffusion of momentum (and vorticity, as well as its generation) via
the no-slip BC on the cylinder;
(iii) advection of momentum and vorticity by the velocity field.
Correspondingly/concomitantly are three 'driving forces' for setting the
always-in-equilibrium pressure:
(i) the Neumann BC at the inlet, $\partial P/\partial x = -\lambda u_0 e^{-\lambda t}$;
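(ii) the Neumann BC at the cylinder,
$\dfrac{\partial P}{\partial r} = \nu\,\dfrac{\partial}{\partial r}\left[\dfrac{1}{r}\dfrac{\partial}{\partial r}(r u_r)\right]$ at $r = a$;
(iii) the source term on the RHS of the PPE,
$\nabla^2 P = -\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u})$.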
Remark:
The velocity at $t = 0$ is zero, but the $t = 0$ pressure is very large, approaching infinity as
$\lambda \to \infty$, owing to process 1.
Since $\lambda$ is 'sufficiently large,' the very-early-time solution is one of accelerating
potential flow, albeit with a vorticity-producing no-slip BC on the cylinder; process 1 is
dominant, but process 2 is also active very close to the cylinder. Process 3 is too small
to be seen, i.e., the early time response is 'independent' of Re. Next, depending on the
Reynolds number (or $1/\nu$), the decaying 'acceleration BC' gives way to advection and
diffusion, a transition that is most prominent near the cylinder. If Re is sufficiently small,
the viscous effects (process 2) will dominate the transition from a mostly potential flow,
and process 3 is still less important (and separation will not occur); advection is totally
absent for Stokes flow (very large $\nu$). If Re is sufficiently large, process 3 will dominate the
transition, which causes the flow to remain mostly curl-free (potential) except very close
to the cylinder where, especially near the two stagnation points, viscosity (process 2) will
(slightly) affect the solution (e.g., the upstream stagnation point pressure will exceed $\tfrac{1}{2}u_\infty^2$).
Shortly into this transition phase will occur another transition: boundary layer separation
and downstream advection of the separated flow, ultimately leading to vortex shedding
or even turbulence. For intermediate values of $\nu$ (Re = 10-100?) both processes 2 and 3
will be important as the acceleration phase winds down, and their interaction will often
lead to boundary layer separation and downstream advection of momentum and vorticity
in a still-laminar flow which may even ultimately become steady.
For the impulsive start case, achievable in principle via $\lambda \to \infty$ in the rapid-start case,
process 1 is absent and the effective initial condition is one of potential flow everywhere
except at the cylinder, upon which resides a vortex sheet (an integrable singularity of
infinite vorticity, whose integral is the slip/potential velocity just off the surface, $u_\theta =
-2u_\infty\sin\theta$). The corresponding initial pressure field is Reynolds-number-dependent and
(for us at least) very difficult to describe quantitatively. If $\nu$ is sufficiently large (Stokes
flow in the limit), process 2 will totally dominate and the advection source term in the
PPE will be completely unimportant (Euler velocity, Stokes pressure). If $\nu$ is sufficiently
small, the opposite situation will exist; process 2 will be mostly unimportant and the
initial pressure will basically correspond to that of simple potential flow ($P + \tfrac{1}{2}q^2 =
\tfrac{1}{2}u_\infty^2$), except that the no-slip BC on the cylinder must still be respected. The PPE will
not see the potential flow BC, $\partial P/\partial r = -\mathbf{n}\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = u_\theta^2/a = 4u_\infty^2\sin^2\theta/a$ because this
term is now zero on the cylinder; rather, it will (we believe!) see
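$\dfrac{\partial P}{\partial r} = \nu\,\dfrac{\partial}{\partial r}\left[\dfrac{1}{r}\dfrac{\partial}{\partial r}(r u_r)\right] = -\dfrac{4\nu u_\infty\cos\theta}{a^2}$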
with very small $\nu$, thus giving $\partial P/\partial r \to 0$ in the limit of $\nu \to 0$. For $t > 0$, however,
the $1/\sqrt{t}$-like pressure behavior will agree with the rapid-start case for $\lambda \to \infty$ there,
and is caused by the step change in tangential velocity, which velocity jump 'generates'
a concomitant pressure via the 'omnipotent' divergence-free constraint. The impulsive
start generates a temporal pressure discontinuity in response to a spatial velocity
discontinuity, unless one chooses the probably legitimate position that the velocity
change is also temporal.
When one attempts to invoke these three processes via approximate/numerical
solution of the NSEs (via FEM in our case, but surely no other numerical method would
behave much differently), a fourth 'process' enters the picture. It is both spurious and
insidious, and caused by the combination of incompressibility (and all that it entails)
and by the inability to numerically simulate properly a step change for the transient
'heat' equation for the tangential velocity, with the net effect that the viscous BC for
the PPE (process 2) becomes 'augmented' by another Neumann BC of the form $\partial P/\partial r \sim
\nu u_\infty\cos\theta/(ah)$ which, for a seemingly well-resolved flow for which $h/a \ll 1$ is clearly
necessary, can generate a totally spurious pressure field that, while showing the same
qualitative shape as a Stokes pressure, is quantitatively totally spurious, and even diverges
(becomes unbounded) as $h \to 0$. For the impulsive start, this bad solution will dominate
the numerical results until the Minimum Time of Believability of the given mesh, $t_{MTB} =
h^2/4\nu$, has been passed, after which the pressure 'recovers' to the proper $O(1/\sqrt{t})$
behaviour and the numerical solution actually returns to believability, fortunately. For the
rapid-start case, the numerical behaviour can be even more bizarre, depending as it does on
the magnitude of $t_{MTB}$ relative to the time constant of the startup phase, $\tau_{ac} = 1/\lambda$. If $\tau_{ac} \ll
t_{MTB}$, there will exist a 'window' of non-believability during which the solution is
dominated by the same spurious Neumann BC for the PPE as for the impulsive start case. Prior
to entering this window, $t < O(\tau_{ac})$, and after leaving it, $t > O(t_{MTB})$, the solution can be
relied upon, assuming 'all else' is done well. If, on the other hand, $t_{MTB} \ll \tau_{ac}$, the
rapid-start problem is capable of delivering good results for all $t \ge 0$. The only problem in this
desirable case is this: it is quite impossible to let $\lambda$ become arbitrarily large since $t_{MTB} \ll
\tau_{ac} \Rightarrow h \ll \sqrt{4\nu/\lambda}$. But at least this case is superior to the impulsive start/potential flow
startup case in that one can select $\lambda$ based on the mesh that is deemed affordable (and
graded meshes are extremely useful and important here) via, say, $\lambda = 0.1\,(4\nu/h^2)$, and
then proceed to perform a believable simulation of a 'rapid' startup from rest.
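The suggested rule of thumb, $\lambda = 0.1\,(4\nu/h^2)$, amounts to setting the startup time constant to ten mesh response times. A sketch of how one might tabulate the 'affordable' acceleration rate for a few meshes follows; the function name and the safety parameter are ours (the factor 0.1 is the one suggested above), and the mesh sizes are those of the three improved meshes with ν = 0.0002.

```python
def affordable_lambda(h, nu=0.0002, safety=0.1):
    """Largest 'believable' startup rate for near-wall mesh size h: safety * 4 nu / h^2."""
    return safety * 4.0 * nu / h ** 2

if __name__ == "__main__":
    for h in (0.0049, 0.0024, 0.0012):
        lam = affordable_lambda(h)
        t_mtb = h ** 2 / (4.0 * 0.0002)
        print(f"h = {h:.4f}: lambda <= {lam:6.1f}, tau_ac = {1.0 / lam:.4f}, "
              f"tau_ac / t_MTB = {(1.0 / lam) / t_mtb:.1f}")
```

Halving $h$ near the wall buys a four-fold faster believable startup, which is one reason graded meshes are so useful here.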
Finally, we wish to point out that in spite of what many have said regarding impulsive
starts from rest, there is a subtle-but-important (and large!) difference between a flow
which truly starts from rest and one which starts from a potential flow. In the former case
there exists, for $\lambda \to \infty$, a $\delta(t)$ impulse via a potential function and associated infinite
pressure that accelerates the fluid to the potential flow that exists at $t = 0^+$ but not at
$t = 0$. In the latter case this effect is missing, showing again that this is definitely not
an impulsive start from rest. Also for this case, the pressure at $t = 0$ is bounded, not
infinite. It is only 'nearly' unbounded at $t = 0^+$ via the weaker singularity, $O(1/\sqrt{t})$, that
is also present for the $\lambda \to \infty$ impulsive start from rest.
Having said this, we realize that an alternative interpretation of our results may also
be viable—and that is that we have 'merely' justified the potential flow + vortex sheet
startup by showing how it comes about.
Our final remarks on this difficult problem (except to note that we describe the
opposite case, impulsive and sudden stops, in another publication (Gresho & Sani 1998)) are
that we now agree more strongly than ever with the adage regarding the NSEs:
— Easy for fluids
— Difficult for people
— Impossible for computers,
first seen by some on a tee shirt!
3.20 CLOSURE: SOME ADDITIONAL REMARKS ON THE PRESSURE
A recent paper on 'projection' methods (Perot, 1993) contains the following statement:
'The pressure is a very interesting variable in the context of numerical discretizations of
the incompressible Navier-Stokes equations.' This understatement prompts a digression
to review and augment portions of a lecture given at one of the FEM in Fluids
conferences (Antibes, France, 1984) by PMG, updated slightly, entitled, 'Some Remarks on the
Pressure ...':
— In the stress tensor, P is clearly a compressive stress. The mechanical pressure is
proportional to the trace of the stress tensor.
— In hydrostatics ($\mathbf{u} = 0$), $\nabla P$ is a force per unit volume; it balances 'body forces', and
$P$ itself has less meaning than pressure differences. ($\nabla P$ is, of course, also a force
per unit volume in hydrodynamics; its role there is simply less obvious.)
— In steady Stokes flow, it is a Lagrange multiplier (that enforces the divergence-free
constraint)—a mathematical entity. Is it not also a physical entity, helping 'guide'
the flow around 'obstacles,' etc.? Yes, it is that, too.
— At the fluid's boundaries, it (not its gradient) is an important part of a normal force
balance.
— When body forces are present, a 'portion' of the pressure is used to balance them.
— For ideal flow ($\nu = 0$, $\nabla\times\mathbf{u} = 0$), the pressure is an energy per unit volume in the
Bernoulli equation. After solving for the velocity from Laplace's equation, only the
pressure remains to be determined: '... in an ideal flow, the pressure adjusts itself
according to the Bernoulli equation so that the fluid is accelerated to those values
of velocity dictated by the geometry of the boundaries'—Panton (1984, p. 452). Is
it then also a Lagrange multiplier? Yes; P is a multi-purpose variable.
— For certain more general (and non-Newtonian) incompressible fluids, $\sigma_{ij} = -P\delta_{ij} +
f(u_{i,j} + u_{j,i})$, where $f(\cdot)$ is a nonlinear function of the strain rate, $P$ even loses
its physical interpretation as a normal stress; it is then 'merely' a Lagrange
multiplier, the slow 'recognition' of which caused some confusion in the past
(K. Rajagopal, personal communication).
— What makes the fluid flow around a corner, or a circular cylinder when viscosity is
present? Or when it isn't? It is, again, $\nabla P$.
— Finally, for time-varying incompressible viscous flows, it seems like the pressure is
'all of the above', and more: it must adjust itself, instantaneously, so that $\nabla \cdot u = 0$
at all points in x and t. Or is this latter simply another manifestation of its role as a
Lagrange multiplier? Probably.
— In time-dependent flows, the pressure often varies, in both time and space, in wild
and wondrous ways—only some of which are easy to understand; even the range
of magnitudes is often quite mysterious. This is apparently related to the many jobs
it has to do and that it is an elliptic variable embedded in an otherwise parabolic
system.
— It is by far the most 'sensitive' variable to any change in any 'parameter'; e.g., $\nu$,
h, $\Delta t$, IC, BC, $\Omega$, $\partial\Omega$, ....
— See Remark 17 following Table 3.13-4 (Section 3.13.2a).
— We conclude these remarks by recalling one thing, in marked contrast to the situation
with compressible flow, that the pressure is not: it is not a thermodynamic variable;
there is no equation of state.
4 Derived Quantities
4.1 INTRODUCTION
In this chapter we shall discuss a number of issues that are important after a numerical
solution has been obtained. (Actually, for time-dependent problems, some of what we
shall discuss takes place during the calculation.) Often called 'post-processing,' the 'data
manipulations' of the primary variables—velocity, pressure, temperature (etc.)—that are
needed to obtain such things as streamlines, heat flux, and forces and moments, are what
we loosely refer to as 'derived quantities.' It is often the case, however, that one or more
of these 'secondary' quantities is actually the primary quantity of interest; examples:
(i) the drag force on an automobile, and (ii) the heat transfer rate in a thermal convection
simulation or in a heat exchanger.
There is often the possibility of deriving the quantity of interest in more than one
way, which means that choices must be made. Herein we shall describe some of these
options and suggest—when we know the answer—which of the available options is 'best,'
a term whose definition will often depend on (at least) the problem, the element used,
the quantity desired, and the graphics package available—the latter of which can be so
important as to be overriding, an example of which would be the vorticity computed
from the velocity field obtained with the $Q_1Q_0$ element: clearly
$\omega = \partial v/\partial x - \partial u/\partial y = \sum_{j=1}^{4}(v_j\,\partial\phi_j/\partial x - u_j\,\partial\phi_j/\partial y)$
should be computed, reported, and plotted (via contours of constant $\omega$) at the centroid
of each element, to take advantage of the superconvergence phenomenon, e.g., Zienkiewicz
and Taylor (1989): derivatives at appropriate Gaussian points are more accurate than
elsewhere; bilinears, for example, give second-order accuracy for first derivatives at the
centroids, at least for rectangular elements. (For general quadrilaterals, the 'order' is
hard to define, but it is always true that the most accurate value will be at the centroid;
J.Z. Zhu, personal communication.) For a good bibliography on superconvergence in general,
see Krizek and Neittaanmaki (1987). But more often than not, the available graphics package
will only accept data at the element nodes, thus requiring further action than simply
evaluating $\omega$ at the centroids. How best
to bring these data to the nodes is the question forced upon us by the plotting package.
This is an example of what this small chapter is all about.
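As a small illustration of the centroid evaluation just described, the following sketch computes the vorticity at the centroid of a single bilinear quadrilateral through the isoparametric map. It is only a sketch under stated assumptions: the function name `centroid_vorticity`, the counterclockwise node ordering, and the test geometry are hypothetical choices made here, not taken from the text.

```python
import numpy as np

def centroid_vorticity(xy, uv):
    """Vorticity dv/dx - du/dy at the centroid (xi = eta = 0) of a bilinear quad.

    xy: 4x2 nodal coordinates, ordered counterclockwise (assumed convention).
    uv: 4x2 nodal velocities (u, v) in the same order.
    """
    xy = np.asarray(xy, dtype=float)
    uv = np.asarray(uv, dtype=float)
    # Rows: d(phi_j)/d(xi) and d(phi_j)/d(eta), evaluated at the centroid.
    dphi_ref = 0.25 * np.array([[-1.0, 1.0, 1.0, -1.0],
                                [-1.0, -1.0, 1.0, 1.0]])
    jac = dphi_ref @ xy                    # [[x_xi, y_xi], [x_eta, y_eta]]
    dphi = np.linalg.solve(jac, dphi_ref)  # rows: d(phi_j)/dx, d(phi_j)/dy
    return float(dphi[0] @ uv[:, 1] - dphi[1] @ uv[:, 0])

# Rigid rotation u = -y, v = x has vorticity 2 everywhere; a bilinear
# isoparametric element reproduces this linear field exactly.
quad = [(0.0, 0.0), (2.0, 0.0), (2.5, 1.5), (0.2, 1.0)]
rotation = [(-y, x) for x, y in quad]
omega_c = centroid_vorticity(quad, rotation)
```

The result is one value per element; bringing such values to the nodes is the separate smoothing question taken up in Section 4.2.1.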
We shall first describe, in a fair amount of detail, most items of interest from 2D
simulations, after which we shall extend the discussion to 3D, which both complicates
some of the issues and introduces more derived variables (e.g., vorticity is then a vector
quantity).
4.2 TWO DIMENSIONS
The 2D derived quantities that we shall discuss below are: vorticity, stream function,
particle paths (tracer), heat flux, forces and moments, and global Peclet (Reynolds)
numbers, after discussing methods of smoothing discontinuities at element boundaries
and—especially—at nodes.
4.2.1 Smoothing in General
Since we use at most $C^0$ basis functions, whose gradients are only in $C^{-1}$, and since
many pressure basis functions are in $C^{-1}$, it is of some interest to review some of the
methods that have been used to smooth discontinuities at the global nodes of a finite
element mesh. For a simple motivational example, suppose we want the pressure and
the vorticity at the 'central' node of a 4-patch of distorted $Q_1Q_0$ elements. Since each
of the four contributing elements will present its own value (and, for vorticity, not even
a 'unique' value for each of the four, as we shall see), we begin with four different
values, and it is clear that a unique nodal value can only be obtained by some sort
of averaging procedure—apparently necessarily ad hoc. We will consider four ways of
local smoothing/averaging to obtain a unique nodal result, and one global method; in
each case, of course, the averaging is performed over the number of elements sharing
the node:
1. Simple arithmetic averaging.
2. Area-weighted averaging.
3. Inverse area-weighted averaging.
4. Basis-function-weighted averaging.
The first is certainly the simplest, and this indeed has much to recommend
it—especially on well-designed meshes that have gradual grading. The second and fourth,
which are rather closely related (the latter being a lumped mass shortcut of the global
method to be discussed below), have little to recommend them in most cases because they
are actually somewhat illogical—the 'definition' of which follows after we describe one of
the other recommended schemes: inverse area weighting. Inverse (element) area weighting
is logical in the following sense: the smallest element presumably has the most accurate
value (all else being equal) and thus should get more weight than larger elements. While
this seems, in some sense, to fly in the face of the very GFEM that we are espousing
in this book (wherein basis function weighting—at least in simple cases—looks more
like area weighting than inverse area weighting), we take the position that the averaging
(weighting) arguments for these post-processed quantities lie outside of the Galerkin
framework. Indeed, recall that in Section 2.6.3, we argued (and demonstrated) that mass
lumping (basis function weighting of the time derivatives) was quite inaccurate in many
cases—worse even than the 'analogous' FDM.
Thus, our general recommendation for local smoothing is to either use simple
arithmetic averaging or, if you want higher accuracy and/or frequently employ somewhat
'coarse' meshes that may contain a few 'bad' elements, use inverse area-weighted
averaging.
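The local averaging options can be sketched in a few lines. This is a minimal illustration, assuming one value per element (as for a piecewise-constant pressure) and a mesh given as tuples of node numbers; the function name `smooth_to_nodes` and the data layout are hypothetical.

```python
from collections import defaultdict

def smooth_to_nodes(elements, values, areas, weighting="arithmetic"):
    """Average element values to nodes using one of three local weightings."""
    weight_of = {
        "arithmetic": lambda area: 1.0,
        "area": lambda area: area,
        "inverse_area": lambda area: 1.0 / area,
    }[weighting]
    num = defaultdict(float)
    den = defaultdict(float)
    for nodes, value, area in zip(elements, values, areas):
        w = weight_of(area)
        for node in nodes:
            num[node] += w * value   # weighted sum of contributions
            den[node] += w           # sum of weights at this node
    return {node: num[node] / den[node] for node in num}

# Two elements sharing node 1: a small one (area 1) and a large one (area 3).
elements = [(0, 1), (1, 2)]
values = [1.0, 3.0]
areas = [1.0, 3.0]
```

At the shared node, inverse area weighting pulls the result toward the value from the smaller element, which is the behavior argued for above; area weighting does the opposite.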
4.2.2 Vorticity
We begin with a global smoothing that is also a best least-squares fit ($L_2$-projection, see
Appendix 3) via the global basis functions being used for velocity, following Lee et al.
(1979):
$$\sum_{j=1}^{N} \omega_j \int_\Omega \phi_i\phi_j = \sum_{k=1}^{N} \int_\Omega \phi_i\left(v_k\frac{\partial\phi_k}{\partial x} - u_k\frac{\partial\phi_k}{\partial y}\right), \quad i = 1, 2, \ldots, N,$$ (4.2-1)
which is, of course, a mass-matrix problem, Mx = b, whose solution can be rather quickly
obtained, e.g., via the diagonally scaled conjugate-gradient method (see Volume II). While
not regarded as very viable by Lee et al., we nevertheless present it—and even recommend
it—because it is based on firm FEM theory: the integral of the square of the error between
the raw discontinuous field and the smoothed $C^0$ field is a minimum. A final feature of this
method relates to wiggle signals; because of the propensity of $L_2$ best fits to wiggle when
the function being approximated changes too rapidly for the given grid, this technique
could perhaps also be used as a guide to better mesh design. It is, of course, more
expensive than any of the local methods to be described below, but it is also much more
consistent than its lumped mass counterpart (or area-weighted averaging).
Alternatively, and perhaps even more consistently, a best fit to the vorticity field may
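be accomplished in the space of pressure basis functions ($L_2$) rather than the somewhat
unnatural space ($H^1$) in (4.2-1), which is from the projection into the velocity 'space'.
Thus, rather than $\omega^h = \sum_j \omega_j\phi_j(x)$, we use $\omega^h = \sum_j \omega_j\psi_j(x)$, where $\psi_j$ is a 'pressure'
basis function, leading directly to
$$\sum_{j=1}^{M} \omega_j \int_\Omega \psi_i\psi_j = \sum_{k=1}^{N} \int_\Omega \psi_i\left(v_k\frac{\partial\phi_k}{\partial x} - u_k\frac{\partial\phi_k}{\partial y}\right), \quad i = 1, 2, \ldots, M,$$ (4.2-2)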
where M is the total number of pressure (and now, vorticity) nodes in the mesh (same as
the number of elements for $Q_1Q_0$, although the notions generalize). A further 'economy'
results upon recognizing the coefficient matrices on the RHS as the components of the
divergence matrix, $C^T$, in the GFEM NS equations, at least when the BC's for NS are
'appropriate' (no Dirichlet BC's on velocity); i.e., the 'full' $C^T$-matrix is needed, and
(thus) N above includes every node in the mesh, as indeed it does in (4.2-1). Thus,
recalling the definition of $C^T$ [(3.13-25)],
$$Q\omega = C_y^T u - C_x^T v,$$ (4.2-3)
where Q is the 'pressure' mass matrix. Of course, if a $C^{-1}$ pressure approximation had
been employed (e.g., $Q_1Q_0$ or $Q_2P_{-1}$), then the nodes for vorticity are internal to the
element and, depending on the graphics package used, may need to be transferred to the
velocity nodes. $C^{-1}$ (element-contained) pressure basis functions have, of course, an
additional significant advantage: Q is also element-contained and the inverse is thus easy
to compute, and the vorticity is obtained simply by 'looping through the elements.' In
contrast, the vorticity from (4.2-1) is directly available at the velocity nodes—after solving
a more expensive linear algebra problem. We even believe but cannot prove that the
accuracy at the pressure (vorticity) nodes from (4.2-3) is higher than that from the more
expensive (4.2-1) at the velocity nodes. It is probably less accurate if it too must be
'transferred' to the velocity nodes.
For a method of determining an accurate boundary vorticity when a stream function-
vorticity method is used, see MacKinnon et al. (1990), which uses some of the 'consistent
flux' concepts discussed in the next section.
We now consider simpler, but less consistent, schemes. It seems that simplicity usually
dominates over consistency in that not many CFD packages that we are aware of go the
consistent route. For the simpler methods that need results at the (velocity) nodes, we
must first compute the 'best' estimate from each element sharing that node. Within an
element, we can obviously compute $\omega(x, y)$ at any desired location from
$$\omega = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = \sum_{j=1}^{n_e}\left(v_j\frac{\partial\phi_j}{\partial x} - u_j\frac{\partial\phi_j}{\partial y}\right),$$ (4.2-4)
where element e contains $n_e$ nodes. What to do next is the key question, and the
(non-unique) answer is quite 'element-dependent.' Zienkiewicz and Taylor (1989, p. 349) show
some 'optimal sampling points' (for stresses, which are also basically first derivatives)
for several elements: centroid for $Q_1$ and $P_1$, the four 'Gaussian' points for $P_2$, and
the 2 x 2 Gaussian points for $Q_2^{(8)}$ and $Q_2$. Whereas these points give 'extra' accuracy
owing to super-convergence, this (local) accuracy is usually lost (returning us to 'normal'
convergence) when 'extrapolation' of any type to the nodes is performed; such is the
cost of bowing to the contour package programmer. In the event you must go to the
nodes, we present the 'winner' (but not by much) from the brief study performed by Lee
et al. (1979) for two Lagrange elements: (i) for $Q_1$, evaluate $\omega$ at the 2 x 2 Gaussian
points and the centroid, then linearly extrapolate to each of the four nodes; (ii) for $Q_2$ [or
$Q_2^{(8)}$], evaluate $\omega$ at the 3 x 3 Gaussian points (one of which is the center node for $Q_2$)
and linearly extrapolate to the eight 'boundary' nodes. For triangles, an analogous
procedure can be used via the appropriate integration points. We close this discussion with a
useful warning (implicitly contained in our discussions up to now) from Zienkiewicz and
Taylor (1989): '... in quadratic $C^0$ elements, whether 2D or 3D, the stresses (or similar
quantities) should never be calculated at nodes.'
In concluding our vorticity discussion, we point out (again—see too Section 3.11.4)
some simple, but probably not obvious, facts related to vorticity—that could prove useful
in CFD simulations. Starting from the equation relating stream function to vorticity (see
Section 3.6.4),
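$$\nabla^2\psi + \omega = 0,$$ (4.2-5)
integrate over the domain to obtain
$$\omega_T \equiv \int_\Omega \omega = -\oint_\Gamma \frac{\partial\psi}{\partial n} = \oint_\Gamma u_\tau;$$ (4.2-6)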
the total vorticity in the domain, even in a time-dependent flow with time-dependent BC's,
is (for simply-connected domains) always the line integral of the tangential velocity over
the boundary. This is Stokes' theorem in 2D. For but a single simple example of the
potential utility of this result, consider a common CFD test problem, the lid-driven cavity
(LDC); here the flow in a box (unit size, usually) is at rest, and the top lid is impulsively
moved to induce a shearing force (and a vortex sheet). Since (if) both lid speed and box
size are unity, (4.2-6) tells us that $\omega_T = 1$ for all $t > 0$ (and for any value of Re). To the
extent that your mesh does not deliver unit total vorticity, it is deficient. While this is a
rather easy diagnostic to compute, it does not—unfortunately—tell the analyst where the
mesh is deficient. Note too that (4.2-6) tells us, for any contained flow with stationary,
no-slip boundaries, that $\int_\Omega \omega = 0$. Finally, we mention that the above test can equally be
applied to steady-state simulations.
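The boundary side of this diagnostic is cheap to evaluate. The sketch below approximates the line integral of the tangential velocity around a closed polygonal boundary with the trapezoidal rule; the function name `boundary_circulation`, the counterclockwise point ordering, and the rigid-rotation test field are assumptions made for illustration. The sign of the result depends on the direction of traversal, so it should be compared with the domain integral of the computed vorticity using the same convention.

```python
import numpy as np

def boundary_circulation(points, velocity):
    """Trapezoidal estimate of the closed line integral of u . dl.

    points: boundary nodes ordered counterclockwise (assumed convention).
    velocity: callable (x, y) -> (u, v) giving the boundary velocity.
    """
    pts = np.asarray(points, dtype=float)
    nxt = np.roll(pts, -1, axis=0)                    # end point of each segment
    u0 = np.array([velocity(x, y) for x, y in pts])   # velocity at segment starts
    u1 = np.roll(u0, -1, axis=0)                      # velocity at segment ends
    return float(np.sum(0.5 * (u0 + u1) * (nxt - pts)))

# Rigid rotation u = -y, v = x has vorticity 2, so the total vorticity on the
# unit square is 2 and the boundary integral must match it.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
total = boundary_circulation(square, lambda x, y: (-y, x))
```

In a simulation, the same number would be compared against the sum over elements of the computed vorticity times element area; any mismatch flags a deficient mesh without locating the deficiency.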
4.2.3 Stream Function
Besides vector plots, the stream function is the most popular way to depict 2D
(Cartesian or axi-symmetric) flow. Here again, a choice needs to be made—at least for elements
that are mass-conserving (those with discontinuous pressures). For such elements, we can
profitably use the definition of $\psi$ that relates a change in $\psi$ to the flow rate between the
two values (see, for example, Batchelor, 1967):
$$\psi(x, y) - \psi_0 = \int (u\,dy - v\,dx),$$ (4.2-7)
where the line integral is along any curve connecting $\psi_0$ to $\psi(x, y)$. For mass-conserving
finite elements, it is natural (Gartling, 1987) to take the curve(s) to be element boundaries;
i.e., integration around an element gives the incremental values (node-to-node) of $\psi$
for that element. The entire procedure can be gleaned from the sketch in Figure 4.2-1,
in which each node is visited only once (even if theoretically unnecessary owing to
specified BC's).
In practice, the line integrals are performed via the appropriate basis functions and
associated isoparametric coordinates; i.e., $-1 \le \xi \le 1$, and we show only the results
(details left as an exercise) for both linears and quads, both from applying (4.2-7):
Fig. 4.2-1 Stream function calculation via boundary integrals.
1. Linears. Node n to node n + 1:
$$\psi_{n+1} - \psi_n = \tfrac{1}{2}(y_{n+1} - y_n)(u_n + u_{n+1}) - \tfrac{1}{2}(x_{n+1} - x_n)(v_n + v_{n+1}).$$ (4.2-8)
2. Quads. Node n (edge) to node n + 1 (center) to node n + 2 (edge):
$$\psi_{n+1} - \psi_n = \tfrac{1}{12}\big[(-6y_n + 7y_{n+1} - y_{n+2})u_n + (-7y_n + 6y_{n+1} + y_{n+2})u_{n+1} + (y_n - y_{n+1})u_{n+2} - (-6x_n + 7x_{n+1} - x_{n+2})v_n - (-7x_n + 6x_{n+1} + x_{n+2})v_{n+1} - (x_n - x_{n+1})v_{n+2}\big],$$ (4.2-9)
$$\psi_{n+2} - \psi_{n+1} = \tfrac{1}{12}\big[(y_{n+1} - y_{n+2})u_n + (-y_n - 6y_{n+1} + 7y_{n+2})u_{n+1} + (y_n - 7y_{n+1} + 6y_{n+2})u_{n+2} - (x_{n+1} - x_{n+2})v_n - (-x_n - 6x_{n+1} + 7x_{n+2})v_{n+1} - (x_n - 7x_{n+1} + 6x_{n+2})v_{n+2}\big].$$ (4.2-10)
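The node-to-node increments (4.2-8) through (4.2-10) translate directly into code. The sketch below is illustrative only: the function names and argument layout are hypothetical, and the 1/12 prefactor in the quadratic formulas is the value obtained by integrating products of the quadratic edge basis functions and their derivatives over each half of the edge.

```python
def dpsi_linear(p0, p1, uv0, uv1):
    """Stream function increment from node n to n+1 on a linear edge, (4.2-8)."""
    (x0, y0), (x1, y1) = p0, p1
    (u0, v0), (u1, v1) = uv0, uv1
    return 0.5 * (y1 - y0) * (u0 + u1) - 0.5 * (x1 - x0) * (v0 + v1)

def dpsi_quadratic(p, uv):
    """Increments n -> n+1 and n+1 -> n+2 on a quadratic edge, (4.2-9), (4.2-10)."""
    (x0, y0), (x1, y1), (x2, y2) = p
    (u0, v0), (u1, v1), (u2, v2) = uv
    first = ((-6*y0 + 7*y1 - y2) * u0 + (-7*y0 + 6*y1 + y2) * u1 + (y0 - y1) * u2
             - (-6*x0 + 7*x1 - x2) * v0 - (-7*x0 + 6*x1 + x2) * v1
             - (x0 - x1) * v2) / 12.0
    second = ((y1 - y2) * u0 + (-y0 - 6*y1 + 7*y2) * u1 + (y0 - 7*y1 + 6*y2) * u2
              - (x1 - x2) * v0 - (-x0 - 6*x1 + 7*x2) * v1
              - (x0 - 7*x1 + 6*x2) * v2) / 12.0
    return first, second

# Check against psi = (x^2 + y^2)/2, for which u = y and v = -x.
edge = [(0.0, 0.0), (0.5, 1.0), (1.0, 2.0)]
vel = [(y, -x) for x, y in edge]
lin = dpsi_linear(edge[0], edge[2], vel[0], vel[2])
quad_first, quad_second = dpsi_quadratic(edge, vel)
```

Marching these increments around element boundaries, visiting each node once, accumulates the nodal stream function from an arbitrary starting value.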
If $C^0$ pressure basis functions are employed (e.g., $Q_2Q_1$), then the above procedure
will not work so well because $\int_\Gamma n \cdot u \neq 0$ in general for these non-element-mass-balance
elements. An alternate procedure, which in some sense hides its own errors, uses the
stream function-vorticity equation,
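$$\nabla^2\psi = -\omega = \partial u/\partial y - \partial v/\partial x,$$ (4.2-11)
because [from (4.2-7)]
$$u = \partial\psi/\partial y,$$ (4.2-12)
$$v = -\partial\psi/\partial x.$$ (4.2-13)
The weak form is, of course, desired; it is, via GFEM,
$$\sum_j \psi_j \int_\Omega \nabla\phi_i \cdot \nabla\phi_j = \int_\Omega \phi_i\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) - \int_{\Gamma_\tau} \phi_i u_\tau - \int_\Omega \nabla\phi_i \cdot \nabla\psi_D, \quad i = 1, 2, \ldots, N,$$ (4.2-14)
where $u_\tau = -\partial\psi/\partial n$ is the specified value of the tangential velocity on $\Gamma_\tau$. Note that
$\phi_i = 0$ on any part of $\Gamma$ on which $u_\tau$ is not specified, because $\psi$ itself is there specified by
$$\psi(x \in \Gamma) = \psi_0 + \int n \cdot u,$$ (4.2-15)
where $\psi_0$ (typically 0) is selected (at $x_0$, $y_0$) arbitrarily. Also, N is the total number
of nodes in $\Omega$ and on $\Gamma_\tau$, $\partial v/\partial x = \sum_j v_j\,\partial\phi_j/\partial x$, and similarly for $\partial u/\partial y$. After solving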
(4.2-14), the contour plotting package may be called—for any type of element. This
method can, of course, also be used for 'mass-consistent' elements—trading the coding
logic of boundary integral tracing for solving linear systems, and generating
generally different $\psi$ fields, but hopefully not too different, which would indicate a too
coarse mesh.
This was for the simple case of simply connected domains. If the domain is multi-
connected (flow past a cylinder, for example), a 'branch cut' must be taken from the
external boundary (with known $\psi$ at that point) to each internal object and a line integral
like (4.2-15) performed, to complete the Dirichlet data—a necessary complication that
many might wish to avoid.
4.2.4 Heat Flux
Again, there are multiple methods available for this post-processing procedure. The
simplest is to just estimate the normal derivative at the boundary points wherever
$q_n = K\,\partial T/\partial n$ is desired ($n \cdot K \cdot \nabla T$ in the most general case), and the most complex is the
so-called consistent flux method that is built into the GFEM. Since the latter is
demonstrably more accurate and uses basically the same 'data' that was used to solve for T
via the GFEM, it is the preferred method—but we shall describe both methods. First we
remark that while again we need to approximate the gradient of a computed quantity, it is
typically the case that this gradient, and more often its normal component, is only needed
at the boundary of the domain, vis-a-vis the vorticity.
We begin with the best method and end with older and more common (but simpler
and less accurate) methods. Called the 'consistent (flux) method' in Gresho et al. (1981b),
where our version of it was first derived and discussed, and in Gresho et al. (1987), which
also included the NS equations, the 'new' method is in fact one that has been visited
several times already in this text. Thus, much of our work has already been done, and
all we need do here is to tie up some loose ends and to present it as the preferred
postprocessing method for computing the heat flux through $\Gamma$ into $\Omega$. Note first that, consistent
with this consistent flux 'philosophy,' on $\Gamma_N$, where q was specified as data for the
problem, we are done; i.e., regardless of the computed GFEM solution for $T^h$, no
postprocessing need be or should be done anywhere q was specified: it is the flux on $\Gamma_N$. In
this sense, 'closure' has been reached in that applied fluxes are incorporated via NBC's
and 'reactive' fluxes are virtually the same things (another aspect of 'consistency'); i.e.,
they are computed from the very same type of equations that occur at NBC's, although
in this case the primary variable (here, $T^h$) must have been computed first.
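The consistent flux equation that we need has already been presented as part of
(2.2-16) in Chapter 2. Here we recall and rearrange it:
$$\int_{\Gamma_D} \phi_i q_D^h = \int_\Omega \left\{\phi_i\left[\frac{\partial T^h}{\partial t} + u \cdot \nabla T^h + \beta T^h \nabla \cdot u - S\right] + \nabla\phi_i \cdot (K \cdot \nabla T^h)\right\} + \int_{\Gamma_N} \phi_i\left[H(T^h - f) - q\right] \quad \text{for } i = N+1, N+2, \ldots, N_T,$$ (4.2-16)
where there are N nodes at which $T^h = f + \sum_{j=1}^{N} T_j(t)\phi_j(x)$ has already been computed
[cf. (2.2-1) and (2.2-3) in Chapter 2], $N_T$ is the total number of nodes in $\bar\Omega = \Omega + \Gamma$,
and the reactive heat flux is given by the expansion [cf. (2.2-15) in Chapter 2]
$$q_D^h = \sum_{j=N+1}^{N_T} q_{Dj}\phi_j(x) \quad \text{for } x \in \Gamma_D;$$ (4.2-17)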
i.e., the consistent flux is expressed in the same $C^0$ basis functions used for $T^h$ (here
restricted to $\Gamma_D$), and we have generated another 'mass-matrix problem.' (Again, our node
numbering convention is solely for our expository convenience, not for writing code.)
Now for some
Remarks:
(1) $\beta = 1$ assures global energy conservation even when $\nabla \cdot u \neq 0$, as discussed in
Section 2.2.3; $\beta \neq 1$ often does not. The simpler methods, discussed below, will
never (in general) generate global energy balances.
(2) $\beta = 0$ is still permissible regarding consistent heat flux, even if $\nabla \cdot u \neq 0$, in that it
would deliver the same temperatures on $\Gamma_D$ as given by (2.2-3) in Chapter 2 if the
problem was recomputed using our applied flux BC on all of $\Gamma$ (as long as $\beta = 0$
is also used in the recomputation).
(3) As it stands, (4.2-16) and (4.2-17) represent a linear algebraic system of size
$N_T - N$, with the consistent (boundary) mass matrix; solution via DSCG would
be quite effective—as would a skyline-based direct method. Note too that the
'dimensionality' of the post-processing linear systems is one lower than the
original problem, thus hopefully parrying 'computer cost' as an argument for not doing
it right.
(4) If the boundary mass matrix is lumped (when 'permissible'), the solution becomes
much simpler (and generally somewhat less accurate), since then
$\int_\Gamma \phi_i q_D^h = q_{Di}\int_\Gamma \phi_i$, and the solution of (4.2-16) is 'free' relative to the cost of forming
the RHS.
(5) For $h \to 0$, only one term on the RHS will remain (the others vanishing with h)
to give $q_D = n \cdot K \cdot \nabla T^h$.
(6) For h not $\to 0$, the consistent flux depends on much more data than just the normal
component of $K \cdot \nabla T^h$, and this is precisely what makes the result more accurate.
(7) The method (magically?) computes a 'normal' flux at boundary corners (or other
sharp changes in shape), even though the normal direction is not uniquely defined.
[See the discussion following (2.3-33) in Chapter 2, wherein an attempt at defining
a unique normal direction, which applies here as well, failed.]
(8) The boundary integral term over $\Gamma_N$ will only be present at any intersections of
$\Gamma_D$ and $\Gamma_N$; usually $\phi_i$ will be zero on $\Gamma_N$, as indeed it will in the 'bulk' integrals
except in the single layer of elements that make up $\Gamma_D$.
(9) The consistent heat flux, from (4.2-16), could be used to re-solve the original
(primary variable) problem via specified flux on $\Gamma_D$ rather than specified T and
give the same solution. [Here, of course, consistency in mass lumping, or not, will
be required; i.e., if the mass is lumped when solving for $q_D$ from (4.2-16), then it
must be applied in the lumped manner if used as an applied flux to determine T,
and is another reason that the term consistent is consistent.] No other method of
flux calculation would be 'reversible'/consistent.
(10) A short historical list of some other contributors to these ideas must be presented,
some focusing on practical applications and others on the super-convergence issue,
lest we leave the (wrong) impression that we discovered all of these good things:
see, for example J. Wheeler (1973, 1978), M. Wheeler (1974), Douglas et al.
(1974), Larock and Herrmann (1977), Marshall et al. (1978), Kjaran and Sigurdson
(1981), Carey (1981), and Mizukami (1986). The consistent flux 'ideas' are called
'extraction methods' by Babuska and Suri (1994), Carey (1982), Carey et al. (1985),
and Lynch (1984, 1985a, 1985b). Finally, MacKinnon and Carey (1990) even show
the finite difference community how to obtain super-convergent fluxes.
An algorithmic and heuristic way in which to both appreciate and perhaps program the
consistent flux method is as follows:
1. Initially, form all of the boundary nodal equations as if there were to be imposed
the most general type of natural boundary condition (consistent with the weak form
employed, of course) at each node [for the Laplace operator considered thus far, it is
$n \cdot K \cdot \nabla T + H(T - f) = q$].
2. Modify the boundary node equations for the particular problem at hand, e.g., for
Dirichlet data, the nodal equation can be omitted entirely (after transposing the appropriate
coupling information to the RHS), although it should also be 'saved' (e.g., on a disk file)
for later use in Step (4). For simpler natural boundary conditions, the proper deletions are
made (e.g., H, To, or q in the current problem).
3. Assemble and solve the conventional GFEM equations for the primary variables.
4. Recall each nodal equation for which Dirichlet data were employed and solve for the
consistently derived flux—i.e., solve (4.2-16) and (4.2-17).
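The four steps can be seen at work on a one-dimensional model problem. The sketch below is an illustration under stated assumptions, not the authors' code: it solves $T'' = -S$ on (0, 1) with homogeneous Dirichlet data using linear elements, keeps the boundary-node equation at x = 0 that Step 2 would have set aside, and then recovers the flux from it as in Step 4. The function name `flux_at_left_end` and the choice of problem are hypothetical.

```python
import numpy as np

def flux_at_left_end(n_elem=4, S=1.0):
    """Consistent flux vs. one-sided difference at x = 0 for T'' = -S, T(0)=T(1)=0."""
    h = 1.0 / n_elem
    n = n_elem + 1
    K = np.zeros((n, n))
    b = np.zeros(n)
    # Step 1: form every nodal equation, boundary nodes included.
    for e in range(n_elem):
        K[e:e + 2, e:e + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        b[e:e + 2] += S * h / 2.0
    # Step 2: save the boundary equation before it is replaced by Dirichlet data.
    saved_row, saved_rhs = K[0].copy(), b[0]
    # Step 3: solve for the interior unknowns (boundary values are zero here).
    T = np.zeros(n)
    T[1:-1] = np.linalg.solve(K[1:-1, 1:-1], b[1:-1])
    # Step 4: the residual of the saved equation is the reactive flux dT/dn.
    q_consistent = saved_row @ T - saved_rhs
    # 'Engineering' estimate: outward normal derivative from the first element.
    q_difference = -(T[1] - T[0]) / h
    return float(q_consistent), float(q_difference)

q_c, q_d = flux_at_left_end()
```

For this problem the exact outward normal derivative at x = 0 is $-S/2$. The consistent flux reproduces it on any mesh, because it uses the source term as well as the temperature gradient, while the one-sided difference is in error by $Sh/2$.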
We now switch to simpler but less consistent methods for flux estimation. Actually,
before describing simpler methods, we mention one that appears to be more costly but,
we believe, also less accurate than the consistent flux method just discussed: the method
of Lagrange multipliers [e.g., Babuska (1973); Strang and Fix (1973, p. 133); Carey
and Oden (1983, p. 108)]. It begins with the following 'standard' identity ($K \to \kappa$ for
simplicity, not necessity):
$$\int_\Omega \kappa\phi_i\nabla^2 T^h = \int_{\Gamma_N} \phi_i\kappa\,\frac{\partial T^h}{\partial n} + \int_{\Gamma_D} \phi_i\kappa\,\frac{\partial T^h}{\partial n} - \int_\Omega \kappa\nabla\phi_i \cdot \nabla T^h = \int_{\Gamma_N} \phi_i q_a + \int_{\Gamma_D} \phi_i q_r - \int_\Omega \kappa\nabla\phi_i \cdot \nabla T^h,$$ (4.2-18)
where $\Gamma_N + \Gamma_D = \Gamma$, $q_a$ is the applied flux on $\Gamma_N$, and $q_r$ is the reactive flux on $\Gamma_D$. (In
our 'old' terminology, $q_a = q$ and $q_r = q_D$.) Now, $q_a$ is given and $q_r$ is not. The Lagrange
multiplier method involves using (4.2-18) for the diffusive term in the appropriate GFEM
equations and, to introduce the required additional equations in order to 'balance' the
$N_T - N$ new unknowns [$q_r = \sum_{j=N+1}^{N_T} q_j\phi_j(x \in \Gamma_D)$], the Dirichlet BC is applied only
weakly:
$$\int_{\Gamma_D} \phi_i(T^h - T_D) = 0, \quad i = N+1, N+2, \ldots, N_T.$$ (4.2-19)
The solution of (4.2-19) simultaneously with the nodal temperature equations is the
Lagrange multiplier method ($q_r$ is the Lagrange multiplier). Clearly it has turned an
ODE problem of size N followed by a linear algebra problem of size $N_T - N$ of the
consistent flux method, into a DAE problem of size $N_T$, an increase in difficulty that is
probably unwarranted; the resulting solution will, we believe, not be more accurate than
that from the simpler consistent flux method. If the steady-state problem is being solved,
then the consistent flux method involves the sequential solution of two linear algebraic
systems, one of size N and the other of size $N_T - N$; whereas the Lagrange multiplier
method solves one larger system, of size $N_T$. Finally, if u = 0 and a steady solution is
sought, the Lagrange multiplier method converts a nice minimization problem (the size
N linear system) into a less desirable saddle-point problem of greater size. We (and, for
example, Carey, 1982) see no reason to use the Lagrange multiplier method.
Finally, we arrive at simpler, more 'obvious' methods, which we might (somewhat
dangerously) call 'engineering' methods: since the diffusive heat flux is defined to be
$q = K\,\partial T/\partial n$, why not simply try to estimate the normal derivative of T on $\Gamma_D$? The
first answer (and one surely known even by engineers) is that differentiation is a 'noisy'
process, and the second is that our $C^0$ temperature approximation necessarily causes the
gradient to be discontinuous. To combat the latter, if the plotting package will permit
it, the best advice is to compute $\partial T^h/\partial n$ (from the basis functions) only on the boundary
of an element [which is, of course, also (generally) an approximation of the domain's
boundary], not at nodes shared by two (or more) elements. This avoids the non-uniqueness
of $\partial T/\partial n$ at shared nodes and the non-uniqueness (in general) of the normal direction.
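So, if you have the proper type of plotting package (or if you care not about plotting the
results), compute the heat flux as follows (on $\Gamma$):
$$\begin{aligned}
q_n(x \in \Gamma_D) &= \kappa\, n \cdot \nabla T^h = \kappa\left(n_x\frac{\partial T}{\partial x} + n_y\frac{\partial T}{\partial y}\right) \\
&= \frac{\kappa}{\sqrt{(\partial x/\partial s)^2 + (\partial y/\partial s)^2}}\left(\frac{\partial y}{\partial s}\frac{\partial T}{\partial x} - \frac{\partial x}{\partial s}\frac{\partial T}{\partial y}\right) \\
&= \frac{\kappa}{\sqrt{(\partial x/\partial s)^2 + (\partial y/\partial s)^2}}\left[\frac{\partial y}{\partial s}\left(\frac{\partial T}{\partial \xi}\frac{\partial \xi}{\partial x} + \frac{\partial T}{\partial \eta}\frac{\partial \eta}{\partial x}\right) - \frac{\partial x}{\partial s}\left(\frac{\partial T}{\partial \xi}\frac{\partial \xi}{\partial y} + \frac{\partial T}{\partial \eta}\frac{\partial \eta}{\partial y}\right)\right] \\
&= \frac{\kappa}{J\sqrt{(\partial x/\partial s)^2 + (\partial y/\partial s)^2}}\left[\frac{\partial y}{\partial s}\left(\frac{\partial T}{\partial \xi}\frac{\partial y}{\partial \eta} - \frac{\partial T}{\partial \eta}\frac{\partial y}{\partial \xi}\right) - \frac{\partial x}{\partial s}\left(-\frac{\partial T}{\partial \xi}\frac{\partial x}{\partial \eta} + \frac{\partial T}{\partial \eta}\frac{\partial x}{\partial \xi}\right)\right],
\end{aligned}$$ (4.2-20)
where $\xi$ and $\eta$ are either s or ±1, depending on which (isoparametric) element side we are
on (s is the parameter defining the element side in space and ranges between −1 and +1),
$x = \sum_{j=1}^{n_s} x_j\psi_j(s)$, $y = \sum_{j=1}^{n_s} y_j\psi_j(s)$, and $T = \sum_{j=1}^{n_e} T_j\phi_j(\xi, \eta)$, where $n_s$ is the number
of nodes on the side of an element, $J = (\partial x/\partial\xi)(\partial y/\partial\eta) - (\partial x/\partial\eta)(\partial y/\partial\xi)$, $n_e$ is the number
of nodes in (and on) the element, $\phi_j$ is the local basis function, and $\psi_j$ is $\phi_j$ restricted
to $\Gamma$. It is important to realize that the value of $q_n$ from (4.2-20) is more accurate at
certain points on $\Gamma$ than at others; the phenomenon of superconvergence suggests that the
midpoint (s = 0) be used for $Q_1$ elements and the two Gaussian points ($s = \pm 1/\sqrt{3}$) be
used for $Q_2$ elements. For triangles, the procedure is analogous: just replace 'Gaussian
integration points' by those associated with the appropriate integration rules for triangles;
for example, 1-, 4-, or 7-point rules.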
If you must bow to the graphics programmer and report your heat flux at nodes so that
the plotter can give you q vs. $x \in \Gamma$, there are several options available, all of which are
probably in current use somewhere:
1. Compute qn from (4.2-20) at the element nodes and, where shared by another element,
just report the arithmetic average.
2. The nearest Gaussian point flux is computed and assumed to apply at the node, and
simple (arithmetic) averaging performed.
3. The Gaussian point values may be extrapolated 'appropriately' to the nodes and
averaging performed. 'Appropriate' for $Q_1$ is piecewise-constant extrapolation [giving the
same result as in 2 above]; for Q2 it is linear extrapolation from the two Gaussian point
values, etc.
4. As in 1 through 3, except use a better averaging method in each; inverse element side
length seems appropriate.
5. Use a least-squares fit via the boundary basis functions and boundary mass matrix; i.e.,
solve
$$\sum_{j=1}^{N_B} q_j \int_\Gamma \phi_i\phi_j = \int_\Gamma \phi_i q_n,$$ (4.2-21)
where $q_n$ are the 'best' Gaussian point values extrapolated (but not averaged) as per 3
above. It is not advisable to lump the mass; better would be to simply return to 3 above.
There are probably yet other ways to try to salvage a heat flux from the temperature
field, some perhaps better than those listed above. Nevertheless, we stop here with our
repeated recommendation: use the available GFEM machinery to get the best heat flux
obtainable—the consistent flux. If, however, you insist on 'simpler' methods, a good
description of the details of doing so (for anisotropic materials yet, and for 3D and for
components of the flux vector other than normal) is available in Gartling and Hogan
(1994). See also Reddy and Gartling (1994).
We conclude this section with but a single, simple-but-effective example—for others,
see Gresho et al. (1987) and Thornton (1982), who puts forth an argument for lumping
the mass (for the $Q_1$-element) and also demonstrates the utility of the consistent flux
method for computing inter-element fluxes. The exact solution to the Poisson equation,
$$\nabla^2 T = -S = 5/2,$$ (4.2-22)
on ($0 \le x \le 2$, $0 \le y \le 1$) with $T = y^2$ on x = 0, $T = 1 + y^2$ on x = 2, $T = x^2/4$ on
y = 0, and $T = x^2/4 + 1$ on y = 1, is
$$T = x^2/4 + y^2.$$ (4.2-23)
Figure 4.2-2 shows the solution and a simple 4-patch of bilinear elements for use in testing
the flux approximations. Since the bilinear element will give a nodally exact solution for
this simple problem, we can use (4.2-23) to evaluate the various heat flux 'post-processors'
at node S:
[Figure: 4-patch of bilinear elements on $0 \le x \le 2$, $0 \le y \le 1$, with element height h = 1/2 and nodes labeled SW, S, SE, W, 0, E, NW, N, NE; the elements adjoining node S are numbered 3 and 4.]
Fig. 4.2-2 Solution of a Poisson equation.
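1. Consistent flux, in the lumped mass approximation: (4.2-16) here simplifies to
$q_{Di}\int_\Gamma \phi_i = \int_\Omega \nabla\phi_i \cdot \nabla T - \int_\Omega \phi_i S$; i.e., with $\ell = 1$ the element width and h = 1/2 the element height,
$$\ell q_{DS} = -\frac{h}{6\ell}\left[2(T_{SW} - 2T_S + T_{SE}) + (T_W - 2T_0 + T_E)\right] - \frac{\ell}{6h}\left[(T_W - T_{SW}) + 4(T_0 - T_S) + (T_E - T_{SE})\right] - S\ell h/2,$$
or
$$\begin{aligned}
q_{DS} &= -\frac{h}{6\ell^2}\left[2(T_{SW} - 2T_S + T_{SE}) + (T_W - 2T_0 + T_E)\right] - \frac{1}{6h}\left[(T_W - T_{SW}) + 4(T_0 - T_S) + (T_E - T_{SE})\right] - Sh/2 \\
&= -\tfrac{1}{12}\left[2(0 - 2/4 + 1) + (1/4 - 2/2 + 1.25)\right] - \tfrac{1}{3}\left[(1/4 - 0) + 4(1/2 - 1/4) + (1.25 - 1)\right] + 5/8 \\
&= -0.125 - 0.5 + 0.625 = 0.
\end{aligned}$$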
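2. Average of $\partial T/\partial n$ from each element at node S:
$$q_S = -\frac{1}{2}\left[\left.\frac{\partial T}{\partial y}\right|_{3,S} + \left.\frac{\partial T}{\partial y}\right|_{4,S}\right] = -\frac{1}{2}\left[\frac{T_0 - T_S}{h} + \frac{T_0 - T_S}{h}\right] = -\left[2(1/2 - 1/4)\right] = -0.5.$$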
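3. Average of the two Gaussian point values:
$$\begin{aligned}
q_S &= -\frac{1}{2}\left[\left.\frac{\partial T}{\partial y}\right|_{3,(SW+S)/2} + \left.\frac{\partial T}{\partial y}\right|_{4,(S+SE)/2}\right] \\
&= -\frac{1}{2}\left[\frac{(T_W - T_{SW}) + (T_0 - T_S)}{2h} + \frac{(T_0 - T_S) + (T_E - T_{SE})}{2h}\right] \\
&= -\tfrac{1}{2}\left[(1/4 - 0) + (1/2 - 1/4) + (1/2 - 1/4) + (1.25 - 1)\right] \\
&= -\tfrac{1}{2}\left[1/2 + 1/2\right] = -0.5.
\end{aligned}$$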
It is, of course, just a coincidence that methods 2 and 3 give the same result—they
would usually differ. But it is no coincidence that method 1 produced the exact result:
$q_S = \partial T/\partial n|_S = -\partial T/\partial y|_S = -2y|_S = 0$. Note how the consistent flux combines the
three pieces of available 'data' (x-direction 'heat flow,' y-direction flux, and local heat
generation) to come up with the right answer. Note too that only the y-direction flux
would 'survive' in passing to the limit; the consistent flux does the best job possible
with the given data on the finite element mesh—consistent with the general GFEM. The
final worthwhile comment is this: the consistent flux method would give exact results
for this problem even for a mesh comprising various-sized rectangular elements if the
consistent (boundary) mass matrix is used in (4.2-16) (LM is only exact for equal-sized
rectangles)—for which methods 2 and 3 would still give erroneous results (and the two
wrong results would differ in the general case).
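The three post-processors compared above can be checked with a few lines of exact rational arithmetic. The following is only a sketch: the function and variable names are ours, the element width and height ($\ell = 1$, $h = 1/2$) are read off Figure 4.2-2, and the nodal temperatures are taken from (4.2-23) because the bilinear solution is nodally exact here.

```python
from fractions import Fraction as Fr

# Element width, element height and source term for the test problem (4.2-22).
ELL, H, S = Fr(1), Fr(1, 2), Fr(-5, 2)

def exact_T(x, y):
    """Exact solution (4.2-23), used here as the nodally exact FEM solution."""
    return Fr(x) ** 2 / 4 + Fr(y) ** 2

# Temperatures at the six nodes of the two elements that touch node S.
T = {
    "SW": exact_T(0, 0), "S": exact_T(1, 0), "SE": exact_T(2, 0),
    "W": exact_T(0, H), "O": exact_T(1, H), "E": exact_T(2, H),
}

def consistent_flux_at_S(T):
    """Method 1: lumped-mass consistent flux (three contributions summed)."""
    x_heat_flow = -(H / (6 * ELL ** 2)) * (
        2 * (T["SW"] - 2 * T["S"] + T["SE"]) + (T["W"] - 2 * T["O"] + T["E"]))
    y_flux = -(1 / (6 * H)) * (
        (T["W"] - T["SW"]) + 4 * (T["O"] - T["S"]) + (T["E"] - T["SE"]))
    generation = -S * H / 2
    return x_heat_flow + y_flux + generation

def element_average_at_S(T):
    """Method 2: average of the two element values of -dT/dy at node S."""
    return -Fr(1, 2) * ((T["O"] - T["S"]) / H + (T["O"] - T["S"]) / H)

def gauss_average_at_S(T):
    """Method 3: average of -dT/dy at the two boundary-edge midpoints."""
    left = ((T["W"] - T["SW"]) + (T["O"] - T["S"])) / (2 * H)
    right = ((T["O"] - T["S"]) + (T["E"] - T["SE"])) / (2 * H)
    return -Fr(1, 2) * (left + right)

q_consistent = consistent_flux_at_S(T)
q_element = element_average_at_S(T)
q_gauss = gauss_average_at_S(T)
print(q_consistent, q_element, q_gauss)
```

Using `Fraction` rather than floating point keeps the cancellation in method 1 exact, so the zero flux at S is not obscured by round-off.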
If the same exercise were repeated at node E, at which the exact flux is $\partial T/\partial n|_E =
x/2|_E = 1$ (an exercise we leave to the reader), the results are 'the same'; i.e., consistent
flux gets $q_E = 1$, and the two simpler methods give $q_E = 0.75$. [The consistent flux
method gives 0.75 ($x$-flux) $-$ 1.0 ($y$-'heat flow') $+$ 1.25 (local heat generation).] Finally,
only the consistent flux method applied at all eight boundary nodes would generate fluxes
that satisfy the global energy balance, $\int_\Gamma q = -\int_\Omega S = 5$, and (only) it would give the
'proper' flux on each face (proper flux at a corner needs to be 'properly' interpreted);
in fact, the full results are these: $q_{SW} = q_S = q_W = 0$, $q_{SE} = 1/3$, $q_E = 1$, $q_{NE} = 5/3$,
$q_N = 2$, and $q_{NW} = 4/3$, giving $\int_\Gamma q = \sum_j q_j \int_\Gamma \phi_j = 5$.
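The global balance quoted above is easy to confirm. In this sketch (names are ours) each nodal flux is weighted by the boundary integral of its bilinear basis function, which for the mesh of Figure 4.2-2 is $\ell$ at S and N, $h$ at W and E, and $(\ell + h)/2$ at the four corners; these weights are our reading of the mesh, not values quoted in the text.

```python
from fractions import Fraction as Fr

ELL, H, S = Fr(1), Fr(1, 2), Fr(-5, 2)   # element width, height, source term

# Consistent nodal fluxes at the eight boundary nodes, as listed in the text.
q = {
    "SW": Fr(0), "S": Fr(0), "W": Fr(0), "SE": Fr(1, 3),
    "E": Fr(1), "NE": Fr(5, 3), "N": Fr(2), "NW": Fr(4, 3),
}

def boundary_weight(node):
    """Integral of the node's basis function along the domain boundary."""
    if node in ("S", "N"):        # mid-edge nodes on the horizontal faces
        return ELL
    if node in ("W", "E"):        # mid-edge nodes on the vertical faces
        return H
    return (ELL + H) / 2          # corner nodes see half of each adjacent edge

total_flux = sum(q[node] * boundary_weight(node) for node in q)
total_generation = -S * (2 * ELL) * (2 * H)   # -S times the domain area
print(total_flux, total_generation)
```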
The consistent flux method always utilizes whatever data are available in order to
provide fluxes (secondary variable) of equivalent accuracy as are the temperatures (primary
variable)—at least if consistent mass is used throughout; this is the real bottom line.
A final derogatory remark on any of the simpler methods: if you were to 'revisit' $\Gamma_N$
during the post-processing phase, on which a given (applied) q was part of the problem's
data, you would oftentimes be disappointed in the lack of agreement between the applied
q and your post-processing estimate of same.
4.2.5 Forces and Moments
Just as the consistent flux method follows from and is intimately connected with GFEM
done properly, so too is the case with 'consistent force,' a global force/momentum
balance. Just as the 'conservation form' ($\beta = 1$) of the AD equation was required in
order to realize a global energy balance via consistent flux calculations, so too is that
form required for the NS equations to realize a global force balance via the
consistent force method. But, as there, the conservation form is not required for accurate
local forces—the consistent force method is all that is required. From (3.13-402) in
Section 3.13.8 of Chapter 3,
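$$\sum_{j=N_\alpha+1}^{N_{T\alpha}} F_{\alpha j} \int_\Gamma \phi_i \phi_j = \left[ M\dot{u} + [K + N(u) + \beta D(u)]u + CP - g \right]_i^\alpha \equiv [\;]_i^\alpha, \tag{4.2-24}$$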
with $\alpha = 1, 2$ or $\alpha = 1, 2, 3$ and $i = N_\alpha + 1, N_\alpha + 2, \ldots, N_{T\alpha}$, where the terms $[\;]_i$
represent the $i$-th row of the LHS of the $\alpha$-component of the momentum equation. This
lazy way to write the momentum equation at node $i$ is actually justifiable if the 'algorithmic
way' described earlier (for heat flux) is implemented; i.e., just 'haul out' the previously
saved equation for node $i$.
Whether or not the code is written that way, (4.2-24) is the suggested way to compute
the forces exerted on the fluid by the boundary at all locations that used Dirichlet BC's for
the velocity; it is the consistent force method and will generate forces (e.g., lift and drag)
whose accuracy is commensurate with that of the primary variables. Of course, just as the
heat flux from (4.2-16) 'finally' (on a fine-enough mesh) simply represents $n \cdot K \cdot \nabla T^h$
from the term $\int \nabla\phi_i \cdot K \cdot \nabla T^h$, so too does the consistent force from (4.2-24) finally 'see'
only the term on the RHS given by $(Ku + CP)_i$, the viscous plus pressure contributions
of the traction vector, the former being correct only if the $\gamma = 1$ form (stress-divergence
form) is employed. But since $h \to 0$ is rarely seen in GFEM calculations, the 'other'
terms in the 'force balance' equation represented by (4.2-24) will often be significant and
should not be neglected—at least if you wish to wring the last drop of accuracy from
your simulation.
Finally, we direct the reader's attention to the Remarks and discussion following
(4.2-17) because, with the proper (and easy, we assert) reinterpretation, they apply equally
well for the force components, Fx and Fy.
If, however, you insist on using simpler (and less consistent!) methods of estimating
reaction forces, the methods discussed relative to the simpler heat flux equation (4.2-20),
can be easily adapted to the force calculations—with the 'exception' that the heat flux is
a vector quantity whose normal component is the natural desired result, whereas with the
force vector, it is usually the x- and y-components that are the desired results, although
sometimes normal and tangential components of the reaction traction vector may be
needed. Here we shall merely present the continuum equations for the boundary forces;
details are left as exercises—or see Gartling (1987):
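$$F_x = n_x\left(2\mu\,\partial u/\partial x - P\right) + n_y\,\mu\left(\partial u/\partial y + \partial v/\partial x\right) \tag{4.2-25}$$
and
$$F_y = n_x\,\mu\left(\partial u/\partial y + \partial v/\partial x\right) + n_y\left(2\mu\,\partial v/\partial y - P\right). \tag{4.2-26}$$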
Another advantage (besides the alleged simplicity) of this more 'conventional' approach
is that the pressure part of the 'lift and drag' could, if desired, be easily separated out
from the viscous part. Also, if normal and tangential forces are desired, then
$$F_n = n_x F_x + n_y F_y \tag{4.2-27}$$
and
$$F_\tau = \tau_x F_x + \tau_y F_y, \tag{4.2-28}$$
where it may be important to use the appropriate definition of $n_x$ and $n_y$ (and thus of
$\tau_x$, $\tau_y$). Recall that in Section 3.13.1e, we showed how to compute a (mass-) consistent
normal vector at each node on $\Gamma$ in conjunction with applying certain BC's. Although not
as serious in this post-processing stage, we believe that the same definition of n presented
there is appropriate here. Alternatively, of course, n could be simply computed from the
equation for an element side [see (4.2-20)], preferably at the right points on $\Gamma$ (typically
Gaussian points), with $F_x$, $F_y$ (or $F_n$, $F_\tau$) computed there as well, followed, perhaps, by
extrapolation and averaging to nodes.
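As an illustration of the 'conventional' route, the continuum expressions (4.2-25) to (4.2-28) translate directly into a small pointwise routine. This is a sketch only: the function name and argument list are ours, the velocity gradients are assumed to have been evaluated already at the point of interest, and the tangent is taken as $\tau = (-n_y, n_x)$, which is one possible orientation and an assumption on our part.

```python
def boundary_traction(mu, P, du_dx, du_dy, dv_dx, dv_dy, normal):
    """Pointwise traction from (4.2-25)-(4.2-28).

    Returns (Fx, Fy, Fn, Ftau) for a unit outward normal (nx, ny).
    The tangent orientation (-ny, nx) is an assumed convention.
    """
    nx, ny = normal
    shear = mu * (du_dy + dv_dx)
    Fx = nx * (2.0 * mu * du_dx - P) + ny * shear
    Fy = nx * shear + ny * (2.0 * mu * dv_dy - P)
    tx, ty = -ny, nx
    Fn = nx * Fx + ny * Fy
    Ftau = tx * Fx + ty * Fy
    return Fx, Fy, Fn, Ftau

# Pressure part only: no velocity gradients, P = 2, face with normal +x.
pressure_only = boundary_traction(1.0, 2.0, 0.0, 0.0, 0.0, 0.0, (1.0, 0.0))

# Viscous part only: simple shear du/dy = 1, mu = 3, face with normal +y.
shear_only = boundary_traction(3.0, 0.0, 0.0, 1.0, 0.0, 0.0, (0.0, 1.0))
print(pressure_only, shear_only)
```

Calling the routine once with the velocity gradients set to zero and once with `P = 0` separates the pressure and viscous parts of the force, which is the advantage of this approach noted in the text.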
Turning briefly to the computation of moments/torques, we refer to Figure 4.2-3 and
assume that we want the turning moment about the point O—which could be, for example,
the center of gravity of a 2D tumbling spacecraft. The moment of the traction vector at
the point P is
$$M_P = F \times r = F_x r_y - F_y r_x = |r| \cdot |F_\tau| \tag{4.2-29}$$
and is directed into the paper. The total moment is thus
$$M = \int_\Gamma M_P = \int_\Gamma \left(F_x r_y - F_y r_x\right), \tag{4.2-30}$$
and this torque (per unit length, into the paper) can then be computed 'consistently,' via
(4.2-24) or otherwise, via (4.2-25) and (4.2-26).
To conclude this section, we remark: for a recent and striking demonstration of the
much higher accuracy from the consistent force method vis-a-vis the 'simpler' method
for computing the drag coefficient for axisymmetric flow past a sphere, see Tabata and
Itakura (1995).
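Fig. 4.2-3 The moment about O is $F \times r$. [Figure: the traction $F = \sigma \cdot n$ acting at boundary point P, with $r$ the vector between P and O.]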
4.2.6 A Recommended Method for Computing First
Derivatives at Nodes
In what is possibly the best way to compute derivatives at nodes, advantage is taken
of the known points (in most cases) of super-convergence/extraordinary accuracy, by
O. Zienkiewicz and J. Zhu. In a series of papers, beginning with Zhu and Zienkiewicz
(1988), and culminating in what appears to be a significant breakthrough, these authors
showed in a pair of papers [Zienkiewicz and Zhu, 'Parts One and Two' (1992), pp. 1331
and 1365], not only how to obtain accurate gradients (first derivatives) at nodes, but
also how to use the same 'recovered' information to provide a cost-effective error
estimator for mesh redesign. Since adaptive meshing is one of the subjects not covered
in this text, we shall focus on the first of their good results (but will summarize the
second).
In order to set the stage, we present a few quotations that let the reader know that they
surely like the 'Z2 local L2-recovery method,' the first and last from Part One above and
the middle from Zhu (1991):
1. After summarizing what is wrong with all of the older methods, they state, 'In this paper
we therefore propose a new procedure in which a single and continuous polynomial
expansion of the function describing the derivatives is used on an element patch surrounding
the node at which recovery is desired. This expansion can be made to fit locally the
superconvergent points in a least-squares manner or simply be an L2-projection of the
consistent finite element derivatives. The first of these will be shown always to lead to
superconvergent recovery of nodal derivatives. ...' Thus, it is only this 'first' method that
we shall examine in any detail.
2. 'It has been demonstrated, by numerical experiments, that the recovered nodal values of
the derivatives by the discrete superconvergent recovery procedure are superconvergent.
One-order-higher accuracy is achieved by the procedure for the derivatives of linear and
cubic elements. Two-orders-higher accuracy is achieved for the derivatives with quadratic
elements. In particular, $O(h^4)$ convergence of the nodal values of the derivatives for the
quadratic triangular element is reported for the first time.' Finally, the bottom line:
3. 'The results presented in this part of the paper indicate clearly that a new, powerful
and economical process is now available, which should supersede the currently used
post-processing procedures applied in most codes.' (!)
They do show lots of convincing evidence and the method is immediately intuitively
appealing and so obviously excellent that it is a little bit surprising that it took so long
to discover ... 'Why didn't I think of that?'
It is appealing because it is simple, easy to apply, and easy to say in words: put a
polynomial through the super-convergent values at the appropriate points surrounding the
node in question and evaluate this polynomial at that node. What polynomial? Where are
the appropriate points? How should the appropriate polynomial be obtained? These are
the questions answered in their papers. Here, we merely provide a summary, first via a
1D example using quads that was presented in yet another of their papers, Zienkiewicz
and Zhu (1991): consider the 2-patch in Figure 4.2-4, with two center nodes (•), three
edge nodes (x), and four Gaussian points (O); Figure 4.2-4 is used as follows:
1. The derivatives of the finite element solution are evaluated at the four superconvergent
Gaussian points (D -> Qj).
Fig. 4.2-4 A 1D 2-patch of quads.
Fig. 4.2-5 A partial mesh of bilinear elements. [Figure: a corner of a mesh with numbered interior nodes, boundary nodes A, B, C, and D, and element centroids marked ×.]
2. A least-squares fit through these four points is made with a quadratic polynomial.
3. The new nodal values are obtained by evaluating the resulting polynomial at the nodes
(• ->, x ->).
4. For the central edge node, we are done. For the two midside nodes, an arithmetic
average will be made with the analogous result obtained by applying the same procedure
at the next edge node. (Since both values are superconvergent, so too is their average.)
The mathematics of this process goes as follows:
1. Evaluate $u^h(x) = \sum_j F_j\, d\phi_j(x)/dx$ at the four Gaussian points, $x_i$ ($i = 1, 2, 3, 4$), from
the given finite element solution, $F^h(x) = \sum_j F_j \phi_j(x)$, where $F$ is the 'field' variable in
question, and $u^h(x)$ is its conventional first derivative (discontinuous at element edges).
2. Seek a new $u(x) = a + bx + cx^2$ such that the sum of the squares of $u(x_i) - u^h(x_i)$
over the four Gaussian points is a minimum; i.e., minimize
$$J = \frac{1}{2} \sum_i \left[(a + b x_i + c x_i^2) - u^h(x_i)\right]^2 \tag{4.2-31}$$
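with respect to the three unknown coefficients; namely,
$$\partial J/\partial a = 0 = \sum_i \left[(a + b x_i + c x_i^2) - u^h(x_i)\right] \tag{4.2-32}$$
$$\partial J/\partial b = 0 = \sum_i \left[(a + b x_i + c x_i^2) - u^h(x_i)\right] x_i \tag{4.2-33}$$
$$\partial J/\partial c = 0 = \sum_i \left[(a + b x_i + c x_i^2) - u^h(x_i)\right] x_i^2. \tag{4.2-34}$$
This 3x3 linear system can be rewritten as
$$\sum_i \begin{bmatrix} 1 & x_i & x_i^2 \\ x_i & x_i^2 & x_i^3 \\ x_i^2 & x_i^3 & x_i^4 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \sum_i \begin{bmatrix} u^h(x_i) \\ x_i\, u^h(x_i) \\ x_i^2\, u^h(x_i) \end{bmatrix}. \tag{4.2-35}$$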
3. Once a, b, and c are available, the nodal values are computed from
$$u(x_j) = a + b x_j + c x_j^2, \quad j = 1, 2, 3, \tag{4.2-36}$$
for the three nodes at which 'recovery' is desired. The value so obtained at the 'central'
edge node is final; the two midside node results will later be averaged with their like
values from adjacent 2-patches.
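The three steps above can be sketched in plain Python for a single 2-patch of 1D quadratic elements. Everything named here is illustrative rather than taken from the cited papers: the helper names are ours, midside nodes are assumed to sit at element centres, and the test field $F = x^3$ is chosen only because its quadratic interpolant has an exact derivative at the two Gauss points of each element, which makes the recovered nodal values easy to check.

```python
import math

def gauss_point_slopes(x_nodes, F_nodes):
    """Conventional derivative u^h at the two Gauss points of one quadratic element."""
    x_left, x_mid, x_right = x_nodes           # midside node assumed at the centre
    half = 0.5 * (x_right - x_left)            # dx/dxi for the reference map
    samples = []
    for xi in (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)):
        dphi = (xi - 0.5, -2.0 * xi, xi + 0.5)  # d(phi_k)/d(xi), nodes at xi = -1, 0, 1
        u_h = sum(F * d for F, d in zip(F_nodes, dphi)) / half
        samples.append((x_mid + half * xi, u_h))
    return samples

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_quadratic(samples):
    """Solve the normal equations (4.2-35) for the coefficients a, b, c."""
    moments = [sum(x ** k for x, _ in samples) for k in range(5)]
    A = [[moments[r + c] for c in range(3)] for r in range(3)]
    rhs = [sum(u * x ** r for x, u in samples) for r in range(3)]
    return solve(A, rhs)

# Two quadratic elements, [0, 1] and [1, 2], carrying nodal values of F = x^3.
x = [0.0, 0.5, 1.0, 1.5, 2.0]
F = [xn ** 3 for xn in x]
samples = gauss_point_slopes(x[0:3], F[0:3]) + gauss_point_slopes(x[2:5], F[2:5])
a, b, c = fit_quadratic(samples)

# Recovered derivatives at the two midside nodes and the central edge node (4.2-36).
recovered = [a + b * xj + c * xj ** 2 for xj in (x[1], x[2], x[3])]
print(recovered)
```

For comparison, the conventional element derivative at the shared edge node $x = 1$ is 2.5 from either side for this data, whereas the exact value is 3. In a full mesh the two midside values would still be averaged with those from the neighbouring 2-patches, as described in step 4.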
Now that we see the concept, the rest should be easy—up to a few details. But we
present one more example—2D bilinear elements on a 3, 4, or 5-patch, near the corner of
a domain in which some corner refinement is employed—nodes 1, 2, and 3, respectively,
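in Figure 4.2-5. In each patch we seek a best fit to the first derivative of $F^h(x, y)$, $\partial F^h/\partial x$,
evaluated at the appropriate centroids over each patch, via a bilinear function,
$$u(x, y) = a + bx + cy + dxy, \tag{4.2-37}$$
using the method of least-squares:
$$\text{minimize } J = \frac{1}{2} \sum_{i=1}^{n} \left[a + b x_i + c y_i + d x_i y_i - u^h(x_i, y_i)\right]^2, \tag{4.2-38}$$
where $(x_i, y_i)$ define the centroid locations of the $n$ elements (3, 4, or 5) in the patch,
and $u^h(x_i, y_i)$ is the conventional, basis-function evaluation of $\partial F^h/\partial x$ at centroid $i$. The
minimization in each case leads to the 4 x 4 linear system ($\sum \equiv \sum_{i=1}^{n}$),
$$\begin{bmatrix} n & \sum x_i & \sum y_i & \sum x_i y_i \\ \sum x_i & \sum x_i^2 & \sum x_i y_i & \sum x_i^2 y_i \\ \sum y_i & \sum x_i y_i & \sum y_i^2 & \sum x_i y_i^2 \\ \sum x_i y_i & \sum x_i^2 y_i & \sum x_i y_i^2 & \sum x_i^2 y_i^2 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} \sum u^h(x_i, y_i) \\ \sum x_i\, u^h(x_i, y_i) \\ \sum y_i\, u^h(x_i, y_i) \\ \sum x_i y_i\, u^h(x_i, y_i) \end{bmatrix}. \tag{4.2-39}$$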
Once these three linear systems have been solved, the best-fit expansion (4.2-37) is
evaluated, respectively, at nodes 1,2, and 3.
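The 2D patch fit follows the same pattern. The sketch below assembles the normal equations (4.2-39) from a list of centroid coordinates and centroid derivative values and evaluates the fitted bilinear at a node. The centroid coordinates, the sampled field, and all helper names are hypothetical; the samples are taken from a bilinear function so that the fit can be checked exactly.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_bilinear(centroids, u_h):
    """Solve the 4x4 normal equations (4.2-39) for a, b, c, d."""
    basis = [(1.0, x, y, x * y) for x, y in centroids]
    A = [[sum(p[r] * p[c] for p in basis) for c in range(4)] for r in range(4)]
    rhs = [sum(p[r] * u for p, u in zip(basis, u_h)) for r in range(4)]
    return solve(A, rhs)

def evaluate(coeffs, x, y):
    """Evaluate the recovered bilinear (4.2-37) at a node."""
    a, b, c, d = coeffs
    return a + b * x + c * y + d * x * y

def sampled_derivative(x, y):
    """Stand-in for the centroid values of dF^h/dx (hypothetical test data)."""
    return 1.0 + 2.0 * x - y + 0.5 * x * y

# A 4-patch and a 5-patch of hypothetical element centroids around the node (1, 1).
patch4 = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (0.5, 1.5)]
patch5 = patch4 + [(1.0, 2.5)]

value4 = evaluate(fit_bilinear(patch4, [sampled_derivative(x, y) for x, y in patch4]), 1.0, 1.0)
value5 = evaluate(fit_bilinear(patch5, [sampled_derivative(x, y) for x, y in patch5]), 1.0, 1.0)
print(value4, value5)
```

This sketch is exercised only on 4- and 5-element patches. With just three centroids the four-coefficient system as assembled here has more unknowns than data points, so a direct solve of this kind would need additional treatment.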
Remarks:
(1) The recovery polynomial in all cases is the same polynomial used to represent the
primary field variable.
(2) Boundary nodal values are also easily evaluated from the corresponding interior
patch that contains that node—and is the recommended procedure; e.g., nodes A
and B come from node 1's 3-patch, node C from node 3's 5-patch, and node D
from node 2's 4-patch.
(3) Another, somewhat simpler, but certainly less general method for 4-patches that
should display equivalent accuracy at lower cost is available simply from
interpolating (using the 'conventional' FEM bilinear basis functions) the super-convergent
centroid data to the node in the 'center' of the 4-patch—after finding the 'central'
node's location relative to the new centroid—which is just the average of the four
surrounding centroids.
(4) A second recovery technique was also discussed by Zienkiewicz and Zhu (1992,
Part 1) and tested by Zhu (1991), in which the sums in (4.2-38) and (4.2-39) are
replaced by integrals; i.e., the new result minimizes the integral of the difference of
the squares. It seems to us, however (and J. Zhu agrees—personal communication)
that it must be somewhat inferior, because it does not so 'strongly' utilize the
original super-convergent information—it tends to smear it out. In fact, it does not
yield super-convergent results for one of the elements examined: quadratic.
(5) The application to triangular elements, although more tenuous owing to the lack
of clearly super-convergent evaluation points, does indeed deliver super-convergent
results at nodes if the centroid is used for the linear element and the three midside
nodes for the quadratic element—Zienkiewicz and Zhu (Part 1). The latter, in fact,
is 'ultra convergent'; error ~ $O(h^4)$. (Cubic triangles remain to be studied.)
(6) The error estimator follows simply once the 'super-convergent' nodal values are
available: assume these to be exact, form the 'conventional' (inaccurate) derivatives
using the basis function derivatives, and compute the energy norm of the difference
over each patch. Large errors suggest mesh refinement.
(7) For additional details regarding the methods and many experimental results, see the
Zienkiewicz and Zhu references, as well as a more recent paper that further improves
the method: Labbe and Garon (1995).
(8) For some recent detailed analysis of the Z2-method see Babuska et al. (1997) and,
for a recent comparison of several methods, see Zhu (1997).
(9) For another recent superconvergence result on bilinears, see Zheng and Li (1996).
4.2.7 Particle Paths
Three basic types of particle path plots can be created from the velocity field computed
by a CFD simulation: particle path plots, dye plots, and material line plots. In a particle
path plot, a particle is introduced at a point (or points) in the flow domain, and the path
of motion of the particle is tracked based on the computed flow field. In a dye plot, as
for the particle path plot, a massless particle is introduced at a point (or points) in the
flow domain, and the path of motion of the particle is tracked based on the computed
flow field. The difference between a particle path plot and a dye plot is that in a dye plot
a new particle is introduced at the same position as the original particle at a specified
time increment. In each of these plots, the position of the particles can be plotted at
discrete time increments or the subsequent positions of a particle can be joined by line
segments creating a continuous line plot of the particle's motion. The continuous dye plot
corresponds exactly to a flow visualization experiment in which tracer dye is introduced
into the flow at some point. Note that for a steady-state simulation, particle path and dye
plots are identical; they will only differ in the case of a transient simulation. In a material
line plot, a sequence of positions in the flow field at some initial time are connected with
straight line segments. The deformation of this 'line', the material line, is then followed
in time with the position of the material line being plotted at specified time increments.
All of the above plots require the computation of the trajectory of a massless particle
introduced into the flow; i.e., into the computed flow solution. The basic problem can be
stated as follows: Given a particle at position $P(x_1, x_2, x_3)$ in the flow domain at time $t_n$,
find the position of the particle at time $t_n + \Delta t = t_{n+1}$. (These times and the ensuing time
steps are totally independent of the time-integrator used for the momentum equation.) The
particle trajectories are obtained by the solution of the equation:
$$\frac{dx_i^p}{dt} = u_i^p, \tag{4.2-40}$$
where $x_i^p$ is the position coordinate of the particle at time $t$. Using the finite element
representation for the velocity $u_i$, namely
$$u_i = \sum_k U_i^k(t)\,\phi_k(\xi_j),$$
where $\xi_j$ are the local coordinates, $\phi_k$ are interpolation functions on the reference element,
and $U_i^k$ are the nodal values of the velocity components, we can write (4.2-40) as
$$\frac{dx_i^p}{dt} = \sum_k U_i^k(t)\,\phi_k(\xi_j).$$
The relationship between the local coordinates $\xi_j$ and the global coordinates $x_i$ is given
by the isoparametric mapping
$$x_i(\xi_j) = \sum_k x_i^k\,\phi_k(\xi_j).$$
It follows that for the reference element, (4.2-40) can be written as
$$\frac{dx_i}{dt} = \frac{\partial x_i}{\partial \xi_j}\frac{d\xi_j}{dt} = u_i,$$
or, equivalently, in matrix form, as
$$\frac{dx}{dt} = J\,\frac{d\xi}{dt} = u,$$
where J is the Jacobian of the parametric transformation. Using this formulation, we can
transform the solution of the original problem to the solution of the equivalent problem
on the reference element:
$$\frac{d\xi}{dt} = J^{-1} u,$$
with $u = u(\xi)$.
Once the particle (i.e. fluid) velocities are available at time $t_{n+1}$, the position of the
particle at time $t_{n+1}$ in the reference coordinate system is given by, for example, the
trapezoid rule:
$$\xi_j(t_{n+1}) = \xi_j(t_n) + \sum_k \beta_k\,\Delta t\;\frac{\dot{\xi}_j(t_n) + \dot{\xi}_j(t_{n+1})}{2},$$
where $\beta_k$ is an interpolation factor used to compute intermediate positions when the path
traverses several elements in the time step $\Delta t$. Formally,
Pi =
A,
Am =
a7~
A =
A,
where $\tau_1$, $\tau_k$, and $\tau_m$ are the time at which the particle enters the first element, the time
at which the particle enters the $k$'th element along the path (during the time interval $\Delta t$),
and the time at which the particle exits the last element in the path; see Figure 4.2-6.
Clearly, if during the time interval $\Delta t$ the particle moves within the same element, then
$\beta = 1$. (Note that $\Delta t$ here is a user-specified quantity.)
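To make the basic step concrete, here is a deliberately simplified sketch of particle tracking. It is not the element-by-element procedure in reference coordinates described above: it integrates (4.2-40) directly in global coordinates, uses an Euler predictor followed by a trapezoid corrector in place of an implicit trapezoid solve, and takes the velocity from an analytic function (solid-body rotation) rather than from a finite element interpolant. All names are ours.

```python
import math

def advance_particle(position, t, dt, velocity):
    """One step of dx/dt = u(x, t): Euler predictor, then trapezoid corrector."""
    u_old = velocity(position, t)
    predicted = [x + dt * u for x, u in zip(position, u_old)]
    u_new = velocity(predicted, t + dt)
    return [x + 0.5 * dt * (u0 + u1) for x, u0, u1 in zip(position, u_old, u_new)]

def particle_path(start, t0, dt, n_steps, velocity):
    """Positions of one massless particle at t0, t0 + dt, ..., t0 + n_steps * dt."""
    path = [list(start)]
    t = t0
    for _ in range(n_steps):
        path.append(advance_particle(path[-1], t, dt, velocity))
        t += dt
    return path

def solid_body_rotation(position, t):
    """Steady test field u = (-y, x); particle paths are circles about the origin."""
    x, y = position
    return [-y, x]

n_steps = 1000
path = particle_path([1.0, 0.0], 0.0, 2.0 * math.pi / n_steps, n_steps, solid_body_rotation)
end_x, end_y = path[-1]
print(end_x, end_y)
```

After one full revolution the particle should return close to its starting point, which gives a quick check on the step size. A dye plot would call `particle_path` repeatedly, releasing a new particle from the same start position at each specified time increment.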
4.2.8 Effective Peclet (Reynolds) Number
A potentially useful diagnostic that has not, to our knowledge, been tested, is a global
measure of artificial/numerical dissipation. For advection-diffusion, we call it an effective
Peclet number:
$$\mathrm{Pe(eff)} = \frac{\mathrm{Pe}}{1 + T^T N(u) T / \left(\kappa\, T^T K T\right)}, \tag{4.2-41}$$
where Pe ($= u_0 L/\kappa$) is the given Peclet number, $N(u)$ is the advection matrix, $K$ is the
diffusion matrix (sans diffusivity coefficient), and $T^T$ is the transpose of the nodal
temperature vector. For the NS equations, simply replace Pe by Re after replacing $T$ by $u$ and $\kappa$
by $\nu$. The first thing to notice is that Pe(eff) = Pe (presumably the goal!) if $N(u)$ is
skew-symmetric, and the next is that Pe(eff) < Pe if $N(u)$ is dissipative. In our GFEM using
either advective [$\beta = 0$ in (2.2-16)] or conservation (flux) form ($\beta = 1$), the advection
matrix is indefinite and thus Pe(eff) may be bigger or smaller than Pe.
Fig. 4.2-6 A particle path. [Figure: a particle path crossing Element 1, Element 2, ..., Element k-1, Element k, ..., Element m-1 during one time step ending at $t_{n+1}$.]
For those to whom the above definition is not quite obvious, we now present the
derivation. Recall the semi-discretized AD ODE from Chapter 2 [(2.2-7)], here with no
source term and no BC forcing:
$$M\dot{T} + N(u)T + \kappa KT = 0; \tag{4.2-42}$$
i.e., we are considering a 'spindown' problem—and we have intentionally modified the
definition of the diffusion matrix by factoring out the (constant) diffusion coefficient. The
resulting 'energy' balance equation is obtained by taking the scalar product of (4.2-42)
with the T vector:
$$\frac{1}{2}\frac{d}{dt}\left(T^T M T\right) = -T^T N(u) T - \kappa\, T^T K T = -\kappa_{\mathrm{eff}}\, T^T K T, \tag{4.2-43}$$
where clearly $\kappa_{\mathrm{eff}} = \kappa$ if $T^T N(u) T = 0$, which is (only) true if $N(u)$ is skew-symmetric
[$\beta = 1/2$ in (2.2-16)]. Otherwise, we see an 'effective' diffusivity given by
$$\kappa_{\mathrm{eff}} = \kappa + T^T N(u) T / T^T K T, \tag{4.2-44}$$
leading to (4.2-41) via Pe(eff) $= u_0 L/\kappa_{\mathrm{eff}}$. To the extent that Pe(eff)/Pe differs from unity
in the general case (not just the spindown model used for the derivation), the simulation is,
in this measure, defective. For example, for simple first-order upwinding in 1D, (4.2-44)
gives $\kappa_{\mathrm{eff}} = \kappa + u\Delta x/2$.
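The upwinding example can be reproduced numerically. The sketch below evaluates (4.2-44) on a periodic, uniform 1D grid, applying the matrices row by row instead of storing them. The periodic grid, the lumped finite-difference-like rows, the sample temperature vector, and the function name are all our assumptions, chosen so that the centered advection operator is exactly skew-symmetric.

```python
def effective_diffusivity(T, kappa, u, dx, upwind):
    """kappa_eff = kappa + T^T N(u) T / T^T K T, (4.2-44), on a periodic 1D grid."""
    n = len(T)
    TNT = 0.0
    TKT = 0.0
    for i in range(n):
        left, right = T[i - 1], T[(i + 1) % n]    # periodic neighbours
        if upwind:
            NT_i = u * (T[i] - left)              # first-order upwind row (u > 0 assumed)
        else:
            NT_i = 0.5 * u * (right - left)       # centered, skew-symmetric row
        KT_i = (2.0 * T[i] - left - right) / dx   # diffusion row without kappa
        TNT += T[i] * NT_i
        TKT += T[i] * KT_i
    return kappa + TNT / TKT

T = [0.0, 1.0, 3.0, 2.0, -1.0, 0.5]               # arbitrary non-constant nodal values
kappa, u, dx = 0.01, 1.0, 0.1

kappa_upwind = effective_diffusivity(T, kappa, u, dx, upwind=True)
kappa_centered = effective_diffusivity(T, kappa, u, dx, upwind=False)
print(kappa_upwind, kappa_centered)
```

With these numbers the upwind scheme returns $\kappa + u\Delta x/2 = 0.06$ regardless of the temperature vector, while the centered scheme returns $\kappa$ itself, so that Pe(eff)/Pe is 1/6 and 1, respectively.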
Since we have no actual experience with this diagnostic, all we can do is put it out
on the 'table' for consideration—as was done in Baptista et al. (1995). It just might,
at least sometimes, be a simple-but-effective way to measure one aspect of simulation
'quality.' Unfortunately, it, like the global vorticity measure put forth in Section 4.2.2,
does not tell you how to improve your results—only that they might need it. Even so, these
sorts of diagnostics may eventually also find good use in adaptive mesh algorithms—as
independent 'quality' measures.
4.2.9 Pressure Smoothing and Node Moving for Q1Q0
The infamous checkerboard (CB) pressure mode (modes in 3D) that afflicts the Q1Q0
element is easily dealt with—because we know (at least for 'simple' grids) the form of
the CB-eigenvector. But even for non-simple grids and even when BC's have precluded
CB-mode(s), the procedures described below are useful and generally recommended. If,
however, there are no spurious modes and if your graphics package will plot centroid
values, that too is a good choice—perhaps better.
We shall treat the 2D case in sufficient detail that the extension to 3D will be fairly
obvious. For the most rigorous possible filter for the pure CB-mode, see 'Scheme 1' in
Sani et al. (1981a). For a mesh comprising rectangles, the filter to be described is also
rigorous—and, since it also works well on general meshes and is much simpler than
'Scheme 1,' we do not hesitate to advocate it 'universally.'
The filtered pressures are computed at the velocity nodes (and thus could then be
considered 'smooth' ($C^0$) and interpolated bilinearly), typically (but not necessarily) at
the 'central' node of a '4-patch.' If there are $n$ elements sharing the node in question, say
node $i$, the new nodal pressure is simply
$$P_i = \sum_{j=1}^{n} P_{ej} A_{ej} \Big/ \sum_{j=1}^{n} A_{ej}, \tag{4.2-45}$$
where $P_{ej}$ is the pressure on element $j$ and $A_{ej}$ is its area. This area-weighted filter
works because, recall, the CB-pressure mode eigenvector is $P_{CB} = \pm 1/A_e$, and if the
element pressure is interpreted as representing the sum of a physical pressure and the
CB-mode, it is clear that (at least on a 'nice' 4-patch or 2-patch, the most likely
candidates for containing the CB-mode) the above filter precisely annihilates the CB-mode,
leaving an area-weighted physical pressure. We regret that we do not know how to
inverse-area-weight the physical pressure (for more accuracy) while simultaneously
area-weighting (removing) the CB-mode. But, thanks to D. Griffiths, we also have a fix for this
'problem,' also from Sani et al. (1981a): simply move the nodes by the same
prescription. When (4.2-45) is applied to both the x- and y-components of node $i$'s location,
the pressure at this new location (for 'pressure purposes' only, of course) will be more
accurate: second-order vs first-order if not done. This can be seen by considering the
4-patch shown in Figure 4.2-7. Since (4.2-45) has removed the non-smooth portion of
the pressure, we can consider the filtered result to be smooth; i.e., susceptible to a Taylor
series expansion. Thus, rather than applying (4.2-45) at node $i$, let us apply it at
location $(x_0, y_0)$, whose position is to be determined. If $P_s(x, y)$ denotes the general, smooth
pressure field, and $(x_i, y_i)$ the location of element $i$'s centroid, then we have
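$$P_0(x_0, y_0) = \sum_{j=1}^{4} P_j A_j \Big/ \sum_{j=1}^{4} A_j = \Big\{ \Big[ P_s|_0 + (x_1 - x_0)\left.\frac{\partial P_s}{\partial x}\right|_0 + (y_1 - y_0)\left.\frac{\partial P_s}{\partial y}\right|_0 + O(\Delta x^2) + O(\Delta y^2) \Big] A_1$$
$$+ \Big[ P_s|_0 + (x_2 - x_0)\left.\frac{\partial P_s}{\partial x}\right|_0 + (y_2 - y_0)\left.\frac{\partial P_s}{\partial y}\right|_0 + O(\Delta x^2) + O(\Delta y^2) \Big] A_2 + [\;]A_3 + [\;]A_4 \Big\} \Big/ (A_1 + A_2 + A_3 + A_4). \tag{4.2-46}$$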
Fig. 4.2-7 Nodal 'smoothing' for better pressure. [Figure: a 4-patch with element centroids 1 to 4 marked ×, the central node $i$ and its shifted location 0, and the boundary node B with its shifted counterpart B'.]
Can we find an $(x_0, y_0)$ such that $P_0(x_0, y_0)$ is more accurate than for any other location?
That the answer is 'yes' follows by requesting the coefficients of $\nabla P_s|_0$ in (4.2-46) to
vanish, and the result is, simply
$$x_0 = \sum_{j=1}^{4} x_j A_j \Big/ \sum_{j=1}^{4} A_j, \tag{4.2-47}$$
with a similar equation for $y_0$ (and $z_0$ for that matter in the 3D case wherein, of course,
$A_j \to V_j$, etc.), giving $P_0(x_0, y_0) = P_s(x_0, y_0) + O(\Delta x^2) + O(\Delta y^2)$, and we can
apparently do no better. This prescription, suggested in the figure, moves node $i$ to the geometric
center of a rectangular 4-patch. That this node smoothing/moving works is demonstrated
in Sani et al. (1981a). Thus, if you want the best interior pressures attainable from Q1Q0,
use both (4.2-45) and (4.2-47), the latter using the original centroid coordinates. (We do
not propose moving boundary nodes, and we leave as an exercise how to communicate
these results to your plotting package.)
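Both halves of the prescription, filtering by (4.2-45) and node moving by (4.2-47), can be demonstrated on a small rectangular 4-patch. The patch geometry, the linear 'physical' pressure, the checkerboard amplitude, and the helper names below are all hypothetical; exact rational arithmetic is used so that the annihilation of the CB-mode is visible without round-off.

```python
from fractions import Fraction as Fr

def area_weighted(values, areas):
    """Area-weighted average, the form shared by (4.2-45) and (4.2-47)."""
    return sum(v * a for v, a in zip(values, areas)) / sum(areas)

# Hypothetical rectangular 4-patch: grid lines x = 0, 1, 3 and y = 0, 1/2, 2,
# so the shared (central) node i sits at (1, 1/2). Elements are listed in
# order around that node.
centroids = [(Fr(1, 2), Fr(1, 4)), (Fr(2), Fr(1, 4)),
             (Fr(2), Fr(5, 4)), (Fr(1, 2), Fr(5, 4))]
areas = [Fr(1, 2), Fr(1), Fr(3), Fr(3, 2)]

def smooth_pressure(x, y):
    """Assumed smooth (here linear) physical pressure field."""
    return 1 + 2 * x + 3 * y

# Element pressures = smooth part + checkerboard mode (+/- amplitude / area).
cb_amplitude = Fr(7)
signs = [1, -1, 1, -1]
element_pressure = [smooth_pressure(x, y) + s * cb_amplitude / a
                    for (x, y), s, a in zip(centroids, signs, areas)]

filtered = area_weighted(element_pressure, areas)        # (4.2-45)
x0 = area_weighted([x for x, _ in centroids], areas)     # (4.2-47)
y0 = area_weighted([y for _, y in centroids], areas)
print(filtered, (x0, y0), smooth_pressure(Fr(1), Fr(1, 2)))
```

The filtered value equals the smooth pressure at the moved location $(x_0, y_0) = (3/2, 1)$, the geometric center of the patch, and differs from the smooth pressure at the original node position $(1, 1/2)$.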
This takes care of interior nodes, but not boundary nodes (and especially not corner
nodes). These require special treatment and are to be done after processing the internal
nodes—with or without nodal smoothing. We (J.M. Leone, Jr. and PMG) discovered how
to effectively 'process' the boundary and explained our technique to D. Malkus, who further
analyzed and published it in Yao and Malkus (1990). It is based upon two simple
observations: (i) the filtering and smoothing of a boundary 2-patch gives a result that applies
most properly at a distance of $\Delta x/2$ (or $\Delta y/2$) from the boundary, corresponding to the
centroid distance from same; and (ii) boundary corner nodes 'see' only one element
pressure, which is generally CB-polluted, and no filtering operation is apparent. To deal with
these two minor issues, return to Figure 4.2-7 and focus first on node B (not B' because
we do not shift boundary nodes). The 2-patch filter there, $(P_1 A_1 + P_2 A_2)/(A_1 + A_2) = \bar{P}$,
kills the CB-mode and most properly applies on the line joining $i$ to B, halfway in between.
Thus, a better boundary value is obtained by simple linear extrapolation:
$$P_B = 2\bar{P} - P_i, \tag{4.2-48}$$
a result that presumably could be improved further by utilizing $P_0$ instead of $P_i$ and
devising a more elaborate interpolation/extrapolation scheme.
Finally, assume that node C defines a corner of the domain. The best way that we (and
Malkus) have found to obtain $P_C$ is via simple linear extrapolation in the master element
($\xi$-$\eta$ coordinates). Thus, presuming $P_A$ and $P_B$ to have been properly obtained, the value
at the corner is simply
$$P_C = P_A + P_B - P_i. \tag{4.2-49}$$
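The two extrapolation formulas are simple enough to check against a linear pressure field, for which both should be exact. In this sketch the node layout (interior node $i$ at (1, 1), boundary nodes B at (1, 0) and A at (0, 1), corner C at (0, 0)) and the pressure field are hypothetical, and the filtered 2-patch values are simply sampled halfway between $i$ and the boundary node, where the text says they most properly apply.

```python
def boundary_value(P_bar, P_i):
    """Linear extrapolation to a boundary node, (4.2-48)."""
    return 2.0 * P_bar - P_i

def corner_value(P_A, P_B, P_i):
    """Linear extrapolation to a corner node, (4.2-49)."""
    return P_A + P_B - P_i

def smooth_pressure(x, y):
    """Assumed linear physical pressure used to test the extrapolations."""
    return 4.0 - 1.5 * x + 0.5 * y

P_i = smooth_pressure(1.0, 1.0)          # filtered interior value at node i
P_bar_B = smooth_pressure(1.0, 0.5)      # 2-patch filter value, halfway from i to B
P_bar_A = smooth_pressure(0.5, 1.0)      # 2-patch filter value, halfway from i to A

P_B = boundary_value(P_bar_B, P_i)
P_A = boundary_value(P_bar_A, P_i)
P_C = corner_value(P_A, P_B, P_i)
print(P_A, P_B, P_C)
```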
Except for a few finishing remarks, we are done:
1. For internal corners, (4.2-49) also applies.
2. It is possible (cf. Hughes et al., 1979a, and Yao and Malkus, 1990) but not recommended, to
perform linear extrapolation to corner nodes via x-y coordinates in distorted elements rather
than in the simpler $\xi$-$\eta$ system. It is both more cumbersome to implement and, according to
both J. Leone (personal communication) and Yao and Malkus (1990), less accurate.
3. It is regrettable that, if node smoothing is to be employed, both $P_i$ and $P_0$ need to be
available. Perhaps further effort would result in further improvement, such as permitting
the motion of boundary nodes via sliding on $\Gamma$....
4. Extension of the remaining details to 3D is left as a not-very-difficult exercise.
4.3 THREE DIMENSIONS
In the 'real world' we must accept the (expensive) fact that CFD 'post-processing' is
an even more important aspect of the simulation than in 2D—including the intelligent
use (still to come, hopefully) of color graphics for other purposes than Colorful Fluid
Dynamics (CFD's alias, sometimes). While more difficult to actually code up, most (not
all) of the 3D derived quantities are conceptually pretty analogous to those in 2D. We
shall therefore highlight 'differences' and discuss some 3D 'only' quantities.
4.3.1 Vorticity
Other than the fact that $\omega = \nabla \times u$ is a vector quantity, the same issues as in 2D are
present. Simply repeat for the two other dimensions that which was done in 2D for the
'vertical' vorticity component; e.g., extending what may be the most straightforward of
those methods, we present the other two components, $\omega_x$ and $\omega_y$, from (4.2-3), which
describes the z-vorticity:
$$Q\omega_x = C_z^T v - C_y^T w \tag{4.3-1}$$
and
$$Q\omega_y = C_x^T w - C_z^T u, \tag{4.3-2}$$
giving the vector at the vorticity (pressure) nodes. Note that this approach, for $C^{-1}$
pressure, is still element-by-element. An analogous extension of (4.2-1) to 3D would
give two more 'similar' equations—all with the same global mass matrix—all giving
L2-projections to the velocity basis. Also, the simpler methods using (4.2-4) can also be
extended in generally obvious ways, although there may be more room for 'ad hocness'
in 3D. Finally, the best method is probably the Z2-method described in Section 4.2.6 for
2D; simply extend it to 3D.
4.3.2 Helicity Density
One quantity that has no 2D counterpart is the scalar product of the velocity and vorticity
vectors, called helicity density,
$$he = u \cdot \omega = u \cdot (\nabla \times u), \tag{4.3-3}$$
and its global integral,
$$He = \int_\Omega he, \tag{4.3-4}$$
the helicity (see, for example, Pelz et al., 1985, and Mobbs, 1981). As mentioned in
Chapter 3 (Section 3.3.2), if $he$ is everywhere large, then $u \times \omega$, the non-linear (advection)
term, is necessarily small, and vice versa: 'nonlinearity is depleted if $\omega$ tends to
align with $u$' (Frisch and Orszag, 1990). Our goal here is, however, only to see if we
can compute it, not necessarily understand it. At least it is simply a scalar 'at the end of
the day.' Expanded, it is
$$he = u\,\omega_x + v\,\omega_y + w\,\omega_z. \tag{4.3-5}$$
The expensive/Galerkin way would be to first obtain the vorticity (discussed above) either
in terms of the velocity ($H^1$) or pressure ($L^2$) basis functions, and then perform a 'Galerkin
product' ($L^2$-projection):
$$\sum_j he_j \int \phi_i \phi_j = \int \phi_i\, u^h \cdot \omega^h, \quad i = 1, 2, \ldots, N, \tag{4.3-6}$$
where we have chosen the 'velocity' basis for $he$. (If $\omega^h$ were expressed in the 'pressure'
basis, it might make sense for $he$ to be, too.) But the double summation and the integral of
triple products of basis function terms on the RHS, followed by a 3D mass-matrix problem,
leads us to also consider approximations that are perhaps more cost-effective. The first
of these that comes to mind, and probably the simplest, is pointwise multiplication of
Gaussian point (where applicable) values; i.e., evaluate both $u^h$ and $\omega^h$ at the 'appropriate'
Gaussian points (centroid for Q1Q0, 2 x 2 for Q2, etc.) and multiply them to obtain $he$
there. Nodal quantities, if needed, could then be obtained in any of the 'usual' ways. If
you have already computed nodal values of $\omega^h$, however, and if you believe they are
sufficiently accurate, then simple nodal multiplication ($u \cdot \omega$) is obviously called for.
We close with the suggestion that if helicity is really important to you, you should try
several methods in the 'research version' of your code and select the most 'appropriate'
version after completing a small research program aimed at a comparison.
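As a minimal sketch of the cheapest option above (simple nodal multiplication), the following assumes that nodal velocity and vorticity vectors are already available as plain 3-tuples; the function names and the test field (rigid rotation about the z-axis plus a unit axial flow, for which the vorticity is (0, 0, 2)) are illustrative choices, not part of the original text.

```python
def helicity_density(u, omega):
    """Pointwise helicity density h_e = u . omega at one node or Gauss point."""
    return sum(ui * wi for ui, wi in zip(u, omega))


def nodal_helicity(velocities, vorticities):
    """Simple nodal multiplication: one value of h_e per node."""
    return [helicity_density(u, w) for u, w in zip(velocities, vorticities)]


# Illustrative data: u = (-y, x, 1), whose curl is (0, 0, 2) everywhere.
nodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]
velocities = [(-y, x, 1.0) for (x, y, z) in nodes]
vorticities = [(0.0, 0.0, 2.0) for _ in nodes]

h = nodal_helicity(velocities, vorticities)
```

For this field the helicity density is the same at every node, because only the axial velocity component is aligned with the vorticity.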
The forces and moments in 3D are probably worth a few words, especially the latter,
called pitch, yaw, and roll by the experts. The forces should be done consistently (and
thus, accurately) using (4.2-24). Consider the x-axis as defining the direction of travel, so
that it is also the roll axis. At any given point (P) on the surface of our moving 'vehicle'
(say), it is an 'easy' matter to compute the traction force vector, $\mathbf{F} = \boldsymbol{\sigma}\mathbf{n}$, by methods
discussed above. To compute the roll moment requires 'dropping a perpendicular' from
P to the x-axis and calling the resulting distance vector $\mathbf{r}$. Next, the projection of $\mathbf{F}$ onto
the plane containing $\mathbf{r}$ and normal to the x-axis is required. Calling this vector $\mathbf{F}_R$ (roll
vector), the pointwise roll moment is computed as $\mathbf{M}_x = \mathbf{F}_R \times \mathbf{r}$, and its integral over the
entire boundary (surface area) of the vehicle gives the total roll moment. Similarly, if the
y-axis points up, it is used in an analogous way to compute the pitching moment. The
remaining axis, z, is then used to compute the yaw moment. The actual details of such a
computation are left as an exercise.
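One possible reading of the roll-moment recipe is sketched below. It assumes that r points from P toward the x-axis (the text does not fix the direction), in which case the x-component of F_R x r coincides with the familiar y F_z - z F_y. The function names and the simple area-weighted sum standing in for the surface integral are assumptions made for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pointwise_roll_moment(p, f):
    """Roll moment (about the x-axis) of traction f acting at surface point p."""
    r = (0.0, -p[1], -p[2])      # perpendicular from P to the x-axis (assumed direction)
    f_r = (0.0, f[1], f[2])      # projection of F onto the plane normal to the x-axis
    return cross(f_r, r)[0]      # only the x-component is non-zero


def total_roll_moment(points, tractions, areas):
    """Area-weighted sum approximating the surface integral of the pointwise moment."""
    return sum(pointwise_roll_moment(p, f) * a
               for p, f, a in zip(points, tractions, areas))
```

A traction parallel to the x-axis contributes nothing, as expected, since its projection onto the plane normal to the roll axis vanishes.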
The remaining derived quantities are simple (Ha!) extensions of what we have already
covered in 2D. If we are wrong, then each 'error' is an exercise for the reader (!). Anyhow,
we are, we believe, done.
Appendix 1 Some Element Matrices
In this appendix we list some results that are useful for performing old-fashioned pencil-and-paper
analysis and for code debugging. In 1D and 2D we present many of the
(GFEM) element matrices for fluid mechanics that are 'normally' generated in a computer
subroutine. We will show mass, diffusion, and advection matrices in 1D, and these
plus divergence and gradient matrices (and their 'product' that generates an uncommon
Laplacian matrix for the pressure) in 2D. The 1D results are useful only for AD, whereas
the 2D results are also useful for the NSE's. At the end, we even show some element
matrices for a CVFEM.
Figure A-1 shows the generic elements examined, in both 1D and 2D, and forms the
basis for understanding the matrices to follow. (Sorry, no triangles.)
A.1.1 ADVECTION-DIFFUSION MATRICES
\[ m_{ij} = \int_e \phi_i \phi_j \quad \text{(mass matrix)}, \]
\[ k_{ij} = \int_e \nabla\phi_i \cdot \nabla\phi_j = \int_e \left( \frac{\partial\phi_i}{\partial x}\frac{\partial\phi_j}{\partial x} + \frac{\partial\phi_i}{\partial y}\frac{\partial\phi_j}{\partial y} \right) \quad \text{(diffusion matrix)}, \]
\[ n_{ij} = \int_e \phi_i\, \mathbf{u} \cdot \nabla\phi_j = \int_e \left( u\phi_i\frac{\partial\phi_j}{\partial x} + v\phi_i\frac{\partial\phi_j}{\partial y} \right) \quad \text{(advection matrix)}. \]
Notes:
(1) In 1D, e is element length; in 2D it is element area.
(2) In 1D, drop the y-terms.
(3) Lower cases are used to denote element (vis-a-vis global) matrices.
A.1.2 ONE-DIMENSIONAL ELEMENT MATRICES
(1) Linear
\[ m_{ij} = \frac{l}{6}\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
k_{ij} = \frac{1}{l}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \qquad
n_{ij} = \frac{u}{2}\begin{bmatrix} -1 & 1 \\ -1 & 1 \end{bmatrix}. \]
Fig. A-1 Linear and quadratic elements for 1D and 2D. The four internal nodes at the 2 x 2
Gauss points are for the pressure in $Q_2Q_{-1}$, with the fourth one omitted for $Q_2P_{-1}$.
All are omitted for $Q_2Q_1$ and for bilinear elements, for which nodes 5 through 9 are
also omitted.
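(2) Quadratic
\[ m_{ij} = \frac{l}{30}\begin{bmatrix} 4 & 2 & -1 \\ 2 & 16 & 2 \\ -1 & 2 & 4 \end{bmatrix}, \qquad
k_{ij} = \frac{1}{3l}\begin{bmatrix} 7 & -8 & 1 \\ -8 & 16 & -8 \\ 1 & -8 & 7 \end{bmatrix}, \qquad
n_{ij} = \frac{u}{6}\begin{bmatrix} -3 & 4 & -1 \\ -4 & 0 & 4 \\ 1 & -4 & 3 \end{bmatrix}. \]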
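Since these tables are meant for code debugging, the 1D entries can be re-derived by exact integration. The sketch below is an illustrative check, not part of the original text: it represents each basis function on a unit reference element (l = 1, u = 1) as a list of polynomial coefficients in increasing powers, and assumes the quadratic nodes are ordered left, center, right. The physical matrices follow by scaling the mass by l, the diffusion by 1/l, and the advection by u.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_der(a):
    """Derivative of a polynomial."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def poly_int01(a):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(a))


def element_matrices(basis):
    """Mass, diffusion and advection matrices on the unit element."""
    n = range(len(basis))
    m = [[poly_int01(poly_mul(basis[i], basis[j])) for j in n] for i in n]
    k = [[poly_int01(poly_mul(poly_der(basis[i]), poly_der(basis[j]))) for j in n] for i in n]
    a = [[poly_int01(poly_mul(basis[i], poly_der(basis[j]))) for j in n] for i in n]
    return m, k, a


LINEAR = [[1, -1], [0, 1]]                        # 1 - s, s
QUADRATIC = [[1, -3, 2], [0, 4, -4], [0, -1, 2]]  # left, center, right nodes

m_lin, k_lin, n_lin = element_matrices(LINEAR)
m_quad, k_quad, n_quad = element_matrices(QUADRATIC)
```

Multiplying `m_quad` by 30, `k_quad` by 3 and `n_quad` by 6 gives the integer arrays tabulated for the quadratic element.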
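A.1.3 TWO-DIMENSIONAL ELEMENT MATRICES
(1) Bilinear (full quadrature and 1-point quadrature)
\[ m_{ij} = \frac{lh}{36}\begin{bmatrix} 4 & 2 & 1 & 2 \\ 2 & 4 & 2 & 1 \\ 1 & 2 & 4 & 2 \\ 2 & 1 & 2 & 4 \end{bmatrix}, \qquad
\text{1-point: } \frac{lh}{16}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \]
\[ k_{ij} = k^x_{ij} + k^y_{ij}, \]
where
\[ k^x_{ij} = \frac{h}{6l}\begin{bmatrix} 2 & -2 & -1 & 1 \\ -2 & 2 & 1 & -1 \\ -1 & 1 & 2 & -2 \\ 1 & -1 & -2 & 2 \end{bmatrix}, \qquad
\text{1-point: } \frac{h}{4l}\begin{bmatrix} 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}, \]
and
\[ k^y_{ij} = \frac{l}{6h}\begin{bmatrix} 2 & 1 & -1 & -2 \\ 1 & 2 & -2 & -1 \\ -1 & -2 & 2 & 1 \\ -2 & -1 & 1 & 2 \end{bmatrix}, \qquad
\text{1-point: } \frac{l}{4h}\begin{bmatrix} 1 & 1 & -1 & -1 \\ 1 & 1 & -1 & -1 \\ -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 \end{bmatrix}. \]
For constant velocity,
\[ n_{ij} = \frac{uh}{12}\begin{bmatrix} -2 & 2 & 1 & -1 \\ -2 & 2 & 1 & -1 \\ -1 & 1 & 2 & -2 \\ -1 & 1 & 2 & -2 \end{bmatrix}
+ \frac{vl}{12}\begin{bmatrix} -2 & -1 & 1 & 2 \\ -1 & -2 & 2 & 1 \\ -1 & -2 & 2 & 1 \\ -2 & -1 & 1 & 2 \end{bmatrix}, \]
\[ n_{ij}\ \text{(1-point)} = \frac{uh}{8}\begin{bmatrix} -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \end{bmatrix}
+ \frac{vl}{8}\begin{bmatrix} -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 \end{bmatrix}. \]
For $u = \sum_j u_j\phi_j$ and $v = \sum_j v_j\phi_j$, the 'nonlinear' advection matrices become
\[ n_{ij} = n^x_{ij} + n^y_{ij}, \]
where
\[ n^x_{ij} = \frac{h}{72}\begin{bmatrix}
(-6u_1 - 3u_2 - u_3 - 2u_4) & (6u_1 + 3u_2 + u_3 + 2u_4) & (2u_1 + u_2 + u_3 + 2u_4) & (-2u_1 - u_2 - u_3 - 2u_4) \\
(-3u_1 - 6u_2 - 2u_3 - u_4) & (3u_1 + 6u_2 + 2u_3 + u_4) & (u_1 + 2u_2 + 2u_3 + u_4) & (-u_1 - 2u_2 - 2u_3 - u_4) \\
(-u_1 - 2u_2 - 2u_3 - u_4) & (u_1 + 2u_2 + 2u_3 + u_4) & (u_1 + 2u_2 + 6u_3 + 3u_4) & (-u_1 - 2u_2 - 6u_3 - 3u_4) \\
(-2u_1 - u_2 - u_3 - 2u_4) & (2u_1 + u_2 + u_3 + 2u_4) & (2u_1 + u_2 + 3u_3 + 6u_4) & (-2u_1 - u_2 - 3u_3 - 6u_4)
\end{bmatrix} \]
and
\[ n^y_{ij} = \frac{l}{72}\begin{bmatrix}
(-6v_1 - 2v_2 - v_3 - 3v_4) & (-2v_1 - 2v_2 - v_3 - v_4) & (2v_1 + 2v_2 + v_3 + v_4) & (6v_1 + 2v_2 + v_3 + 3v_4) \\
(-2v_1 - 2v_2 - v_3 - v_4) & (-2v_1 - 6v_2 - 3v_3 - v_4) & (2v_1 + 6v_2 + 3v_3 + v_4) & (2v_1 + 2v_2 + v_3 + v_4) \\
(-v_1 - v_2 - 2v_3 - 2v_4) & (-v_1 - 3v_2 - 6v_3 - 2v_4) & (v_1 + 3v_2 + 6v_3 + 2v_4) & (v_1 + v_2 + 2v_3 + 2v_4) \\
(-3v_1 - v_2 - 2v_3 - 6v_4) & (-v_1 - v_2 - 2v_3 - 2v_4) & (v_1 + v_2 + 2v_3 + 2v_4) & (3v_1 + v_2 + 2v_3 + 6v_4)
\end{bmatrix}. \]
For 1-point quadrature and variable velocity, use the 1-point results above for constant
velocity but replace u and v by their average values at the centroid.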
Related results for bilinears on a distorted element
(1) Area: $A = \tfrac{1}{2}[(x_3 - x_1)(y_4 - y_2) + (x_2 - x_4)(y_3 - y_1)]$
(2) Lumped mass matrix $m^L_{ij} = \delta_{ij}\, m^L_i / 36$,
where
\[ m^L_1 = [(y_3 - y_2) + 2(y_4 - y_1)][2(x_2 - x_1) + (x_3 - x_4)] - [(x_3 - x_2) + 2(x_4 - x_1)][2(y_2 - y_1) + (y_3 - y_4)], \]
\[ m^L_2 = [2(y_3 - y_2) + (y_4 - y_1)][2(x_2 - x_1) + (x_3 - x_4)] - [2(x_3 - x_2) + (x_4 - x_1)][2(y_2 - y_1) + (y_3 - y_4)], \]
\[ m^L_3 = [2(y_3 - y_2) + (y_4 - y_1)][(x_2 - x_1) + 2(x_3 - x_4)] - [2(x_3 - x_2) + (x_4 - x_1)][(y_2 - y_1) + 2(y_3 - y_4)], \]
\[ m^L_4 = [(y_3 - y_2) + 2(y_4 - y_1)][(x_2 - x_1) + 2(x_3 - x_4)] - [(x_3 - x_2) + 2(x_4 - x_1)][(y_2 - y_1) + 2(y_3 - y_4)]. \]
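A quick way to exercise the distorted-element formulas is to check that the four lumped masses add up to the element area, which must hold because the bilinear basis functions sum to one. The sketch below assumes counterclockwise node numbering starting at the lower-left corner; the function names and the sample quadrilateral are illustrative only.

```python
from fractions import Fraction


def quad_area(x, y):
    """Area of a straight-sided quadrilateral with nodes 1..4 numbered counterclockwise."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return Fraction((x3 - x1) * (y4 - y2) + (x2 - x4) * (y3 - y1)) / 2


def lumped_mass(x, y):
    """Diagonal of the lumped bilinear mass matrix on a distorted element."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    m1 = (((y3 - y2) + 2 * (y4 - y1)) * (2 * (x2 - x1) + (x3 - x4))
          - ((x3 - x2) + 2 * (x4 - x1)) * (2 * (y2 - y1) + (y3 - y4)))
    m2 = ((2 * (y3 - y2) + (y4 - y1)) * (2 * (x2 - x1) + (x3 - x4))
          - (2 * (x3 - x2) + (x4 - x1)) * (2 * (y2 - y1) + (y3 - y4)))
    m3 = ((2 * (y3 - y2) + (y4 - y1)) * ((x2 - x1) + 2 * (x3 - x4))
          - (2 * (x3 - x2) + (x4 - x1)) * ((y2 - y1) + 2 * (y3 - y4)))
    m4 = (((y3 - y2) + 2 * (y4 - y1)) * ((x2 - x1) + 2 * (x3 - x4))
          - ((x3 - x2) + 2 * (x4 - x1)) * ((y2 - y1) + 2 * (y3 - y4)))
    return [Fraction(m) / 36 for m in (m1, m2, m3, m4)]


# Unit square: every node receives one quarter of the area.
square = lumped_mass((0, 1, 1, 0), (0, 0, 1, 1))

# A distorted (but convex) quadrilateral chosen for illustration.
xq, yq = (0, 2, 3, 0), (0, 0, 2, 1)
distorted = lumped_mass(xq, yq)
```

On the unit square the result reduces to the row sums of the rectangular mass matrix with l = h = 1.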
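(2) Biquadratic
\[ m_{ij} = \frac{lh}{900}\begin{bmatrix}
16 & -4 & 1 & -4 & 8 & -2 & -2 & 8 & 4 \\
-4 & 16 & -4 & 1 & 8 & 8 & -2 & -2 & 4 \\
1 & -4 & 16 & -4 & -2 & 8 & 8 & -2 & 4 \\
-4 & 1 & -4 & 16 & -2 & -2 & 8 & 8 & 4 \\
8 & 8 & -2 & -2 & 64 & 4 & -16 & 4 & 32 \\
-2 & 8 & 8 & -2 & 4 & 64 & 4 & -16 & 32 \\
-2 & -2 & 8 & 8 & -16 & 4 & 64 & 4 & 32 \\
8 & -2 & -2 & 8 & 4 & -16 & 4 & 64 & 32 \\
4 & 4 & 4 & 4 & 32 & 32 & 32 & 32 & 256
\end{bmatrix}, \]
\[ k_{ij} = k^x_{ij} + k^y_{ij}, \]
where
\[ k^x_{ij} = \frac{h}{90l}\begin{bmatrix}
28 & 4 & -1 & -7 & -32 & 2 & 8 & 14 & -16 \\
4 & 28 & -7 & -1 & -32 & 14 & 8 & 2 & -16 \\
-1 & -7 & 28 & 4 & 8 & 14 & -32 & 2 & -16 \\
-7 & -1 & 4 & 28 & 8 & 2 & -32 & 14 & -16 \\
-32 & -32 & 8 & 8 & 64 & -16 & -16 & -16 & 32 \\
2 & 14 & 14 & 2 & -16 & 112 & -16 & 16 & -128 \\
8 & 8 & -32 & -32 & -16 & -16 & 64 & -16 & 32 \\
14 & 2 & 2 & 14 & -16 & 16 & -16 & 112 & -128 \\
-16 & -16 & -16 & -16 & 32 & -128 & 32 & -128 & 256
\end{bmatrix} \]
and
\[ k^y_{ij} = \frac{l}{90h}\begin{bmatrix}
28 & -7 & -1 & 4 & 14 & 8 & 2 & -32 & -16 \\
-7 & 28 & 4 & -1 & 14 & -32 & 2 & 8 & -16 \\
-1 & 4 & 28 & -7 & 2 & -32 & 14 & 8 & -16 \\
4 & -1 & -7 & 28 & 2 & 8 & 14 & -32 & -16 \\
14 & 14 & 2 & 2 & 112 & -16 & 16 & -16 & -128 \\
8 & -32 & -32 & 8 & -16 & 64 & -16 & -16 & 32 \\
2 & 2 & 14 & 14 & 16 & -16 & 112 & -16 & -128 \\
-32 & 8 & 8 & -32 & -16 & -16 & -16 & 64 & 32 \\
-16 & -16 & -16 & -16 & -128 & 32 & -128 & 32 & 256
\end{bmatrix}. \]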
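On a rectangle the biquadratic matrices are tensor products of the 1D quadratic ones, which gives an independent check of every entry. The sketch below assumes the node numbering implied by Figure A-1 (corners 1 to 4 counterclockwise from the lower left, midsides 5 to 8 counterclockwise from the bottom edge, node 9 at the center); the names `NODE_IJ` and `tensor` are illustrative.

```python
# Integer parts of the 1D quadratic matrices (nodes ordered left, center, right).
M1 = [[4, 2, -1], [2, 16, 2], [-1, 2, 4]]      # mass      x 30/l
K1 = [[7, -8, 1], [-8, 16, -8], [1, -8, 7]]    # diffusion x 3l
N1 = [[-3, 4, -1], [-4, 0, 4], [1, -4, 3]]     # advection x 6/u

# 2D node -> (x-index, y-index), with 0 = left/bottom, 1 = middle, 2 = right/top.
NODE_IJ = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 0), (2, 1), (1, 2), (0, 1), (1, 1)]


def tensor(ax, ay):
    """9 x 9 element matrix built from a 1D x-matrix and a 1D y-matrix."""
    return [[ax[ia][ja] * ay[ib][jb] for (ja, jb) in NODE_IJ] for (ia, ib) in NODE_IJ]


mass = tensor(M1, M1)   # times lh/900
kx = tensor(K1, M1)     # times h/(90 l)
ky = tensor(M1, K1)     # times l/(90 h)
nx = tensor(N1, M1)     # times uh/180
ny = tensor(M1, N1)     # times vl/180
```

The same construction with the 1D linear matrices would produce the bilinear tables, so a single routine can be used to debug both element types.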
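\[ n_{ij} = n^x_{ij} + n^y_{ij}, \]
where
\[ n^x_{ij} = \frac{uh}{180}\begin{bmatrix}
-12 & -4 & 1 & 3 & 16 & -2 & -4 & -6 & 8 \\
4 & 12 & -3 & -1 & -16 & 6 & 4 & 2 & -8 \\
-1 & -3 & 12 & 4 & 4 & 6 & -16 & 2 & -8 \\
3 & 1 & -4 & -12 & -4 & -2 & 16 & -6 & 8 \\
-16 & 16 & -4 & 4 & 0 & 8 & 0 & -8 & 0 \\
2 & 6 & 6 & 2 & -8 & 48 & -8 & 16 & -64 \\
4 & -4 & 16 & -16 & 0 & 8 & 0 & -8 & 0 \\
-6 & -2 & -2 & -6 & 8 & -16 & 8 & -48 & 64 \\
-8 & 8 & 8 & -8 & 0 & 64 & 0 & -64 & 0
\end{bmatrix} \]
and
\[ n^y_{ij} = \frac{vl}{180}\begin{bmatrix}
-12 & 3 & 1 & -4 & -6 & -4 & -2 & 16 & 8 \\
3 & -12 & -4 & 1 & -6 & 16 & -2 & -4 & 8 \\
-1 & 4 & 12 & -3 & 2 & -16 & 6 & 4 & -8 \\
4 & -1 & -3 & 12 & 2 & 4 & 6 & -16 & -8 \\
-6 & -6 & -2 & -2 & -48 & 8 & -16 & 8 & 64 \\
4 & -16 & 16 & -4 & -8 & 0 & 8 & 0 & 0 \\
2 & 2 & 6 & 6 & 16 & -8 & 48 & -8 & -64 \\
-16 & 4 & -4 & 16 & -8 & 0 & 8 & 0 & 0 \\
-8 & -8 & 8 & 8 & -64 & 0 & 64 & 0 & 0
\end{bmatrix}. \]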
Remark:
Variable velocity biquadratic matrices, a la those shown earlier for bilinear elements, are
available; however, they required many pages (via Mathematica, by J. Derby), and are
not presented. The interested reader can contact PMG to get a copy.
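(3) Serendipity (8-node quadratic; omit node 9)
\[ m_{ij} = \frac{lh}{180}\begin{bmatrix}
6 & 2 & 3 & 2 & -6 & -8 & -8 & -6 \\
2 & 6 & 2 & 3 & -6 & -6 & -8 & -8 \\
3 & 2 & 6 & 2 & -8 & -6 & -6 & -8 \\
2 & 3 & 2 & 6 & -8 & -8 & -6 & -6 \\
-6 & -6 & -8 & -8 & 32 & 20 & 16 & 20 \\
-8 & -6 & -6 & -8 & 20 & 32 & 20 & 16 \\
-8 & -8 & -6 & -6 & 16 & 20 & 32 & 20 \\
-6 & -8 & -8 & -6 & 20 & 16 & 20 & 32
\end{bmatrix}, \]
\[ k_{ij} = k^x_{ij} + k^y_{ij} \quad \text{and} \quad n_{ij} = n^x_{ij} + n^y_{ij}, \]
where
\[ k^x_{ij} = \frac{h}{90l}\begin{bmatrix}
52 & 28 & 23 & 17 & -80 & -6 & -40 & 6 \\
28 & 52 & 17 & 23 & -80 & 6 & -40 & -6 \\
23 & 17 & 52 & 28 & -40 & 6 & -80 & -6 \\
17 & 23 & 28 & 52 & -40 & -6 & -80 & 6 \\
-80 & -80 & -40 & -40 & 160 & 0 & 80 & 0 \\
-6 & 6 & 6 & -6 & 0 & 48 & 0 & -48 \\
-40 & -40 & -80 & -80 & 80 & 0 & 160 & 0 \\
6 & -6 & -6 & 6 & 0 & -48 & 0 & 48
\end{bmatrix} \]
and
\[ k^y_{ij} = \frac{l}{90h}\begin{bmatrix}
52 & 17 & 23 & 28 & 6 & -40 & -6 & -80 \\
17 & 52 & 28 & 23 & 6 & -80 & -6 & -40 \\
23 & 28 & 52 & 17 & -6 & -80 & 6 & -40 \\
28 & 23 & 17 & 52 & -6 & -40 & 6 & -80 \\
6 & 6 & -6 & -6 & 48 & 0 & -48 & 0 \\
-40 & -80 & -80 & -40 & 0 & 160 & 0 & 80 \\
-6 & -6 & 6 & 6 & -48 & 0 & 48 & 0 \\
-80 & -40 & -40 & -80 & 0 & 80 & 0 & 160
\end{bmatrix}, \]
and where
\[ n^x_{ij} = \frac{uh}{180}\begin{bmatrix}
-12 & -8 & -3 & 3 & 20 & -14 & 0 & 14 \\
8 & 12 & -3 & 3 & -20 & -14 & 0 & 14 \\
3 & -3 & 12 & 8 & 0 & -14 & -20 & 14 \\
3 & -3 & -8 & -12 & 0 & -14 & 20 & 14 \\
-20 & 20 & 0 & 0 & 0 & 40 & 0 & -40 \\
14 & 26 & 26 & 14 & -40 & 48 & -40 & -48 \\
0 & 0 & 20 & -20 & 0 & 40 & 0 & -40 \\
-26 & -14 & -14 & -26 & 40 & 48 & 40 & -48
\end{bmatrix} \]
and
\[ n^y_{ij} = \frac{vl}{180}\begin{bmatrix}
-12 & 3 & -3 & -8 & 14 & 0 & -14 & 20 \\
3 & -12 & -8 & -3 & 14 & 20 & -14 & 0 \\
3 & 8 & 12 & -3 & 14 & -20 & -14 & 0 \\
8 & 3 & -3 & 12 & 14 & 0 & -14 & -20 \\
-26 & -26 & -14 & -14 & -48 & 40 & 48 & 40 \\
0 & -20 & 20 & 0 & -40 & 0 & 40 & 0 \\
14 & 14 & 26 & 26 & -48 & -40 & 48 & -40 \\
-20 & 0 & 0 & 20 & -40 & 0 & 40 & 0
\end{bmatrix}. \]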
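A.1.4 NAVIER-STOKES; ADDITIONAL MATRICES
A.1.4.1 Gradient Matrix
\[ c_{ij} = \begin{bmatrix} c^x_{ij} \\ c^y_{ij} \end{bmatrix}, \]
where
\[ c^x_{ij} = -\int_e \frac{\partial\phi_i}{\partial x}\psi_j \quad \text{and} \quad c^y_{ij} = -\int_e \frac{\partial\phi_i}{\partial y}\psi_j, \]
where $\phi_i$ is a velocity basis function and $\psi_j$ is a pressure basis function.
A.1.4.2 Divergence Matrix
\[ d_{ij} = (d^x_{ij},\ d^y_{ij}), \]
where
\[ d^x_{ij} = \int_e \psi_i\frac{\partial\phi_j}{\partial x} \quad \text{and} \quad d^y_{ij} = \int_e \psi_i\frac{\partial\phi_j}{\partial y}. \]
(Note that $d = -c^T$.)
(1) Bilinear velocity, piecewise-constant pressure ($Q_1Q_0$)
This case is carefully covered in Section 3.13.5, including construction of the global
GFEM equations from the element matrices.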
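(2) Biquadratic velocity, continuous bilinear pressure ($Q_2Q_1$)
\[ (c^T)_{ij} = [(c_x^T)_{ij} \quad (c_y^T)_{ij}], \]
where
\[ (c_x^T)_{ij} = \frac{h}{36}\begin{bmatrix}
5 & -1 & 0 & 0 & -4 & -2 & 0 & 10 & -8 \\
1 & -5 & 0 & 0 & 4 & -10 & 0 & 2 & 8 \\
0 & 0 & -5 & 1 & 0 & -10 & 4 & 2 & 8 \\
0 & 0 & -1 & 5 & 0 & -2 & -4 & 10 & -8
\end{bmatrix} \]
and
\[ (c_y^T)_{ij} = \frac{l}{36}\begin{bmatrix}
5 & 0 & 0 & -1 & 10 & 0 & -2 & -4 & -8 \\
0 & 5 & -1 & 0 & 10 & -4 & -2 & 0 & -8 \\
0 & 1 & -5 & 0 & 2 & 4 & -10 & 0 & 8 \\
1 & 0 & 0 & -5 & 2 & 0 & -10 & 4 & 8
\end{bmatrix}. \]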
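(3) Serendipity ($Q_2^{(8)}Q_1$).
Remove node 9 from $Q_2Q_1$:
\[ (c_x^T)_{ij} = \frac{h}{36}\begin{bmatrix}
7 & 1 & 2 & 2 & -8 & -6 & -4 & 6 \\
-1 & -7 & -2 & -2 & 8 & -6 & 4 & 6 \\
-2 & -2 & -7 & -1 & 4 & -6 & 8 & 6 \\
2 & 2 & 1 & 7 & -4 & -6 & -8 & 6
\end{bmatrix} \]
and
\[ (c_y^T)_{ij} = \frac{l}{36}\begin{bmatrix}
7 & 2 & 2 & 1 & 6 & -4 & -6 & -8 \\
2 & 7 & 1 & 2 & 6 & -8 & -6 & -4 \\
-2 & -1 & -7 & -2 & 6 & 8 & -6 & 4 \\
-1 & -2 & -2 & -7 & 6 & 4 & -6 & 8
\end{bmatrix}. \]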
(4) Biquadratic velocity, discontinuous bilinear pressure ($Q_2Q_{-1}$)
o Case 1. The pressure basis functions are centered at the 2 x 2 Gauss points, which helps
explain the appearance of $\sqrt{3}$.
\[ (c_x^T)_{ij} = \frac{h}{72} \times \begin{bmatrix}
9+5\sqrt3 & 3-\sqrt3 & -(9-5\sqrt3) & -(3+\sqrt3) & -4(3+\sqrt3) & 4(2\sqrt3-3) & 4(3-\sqrt3) & 4(2\sqrt3+3) & -16\sqrt3 \\
-(3-\sqrt3) & -(9+5\sqrt3) & 3+\sqrt3 & 9-5\sqrt3 & 4(3+\sqrt3) & -4(2\sqrt3+3) & -4(3-\sqrt3) & -4(2\sqrt3-3) & 16\sqrt3 \\
9-5\sqrt3 & 3+\sqrt3 & -(9+5\sqrt3) & -(3-\sqrt3) & -4(3-\sqrt3) & -4(2\sqrt3+3) & 4(3+\sqrt3) & -4(2\sqrt3-3) & 16\sqrt3 \\
-(3+\sqrt3) & -(9-5\sqrt3) & 3-\sqrt3 & 9+5\sqrt3 & 4(3-\sqrt3) & 4(2\sqrt3-3) & -4(3+\sqrt3) & 4(2\sqrt3+3) & -16\sqrt3
\end{bmatrix} \]
and
\[ (c_y^T)_{ij} = \frac{l}{72} \times \begin{bmatrix}
9+5\sqrt3 & -(3+\sqrt3) & -(9-5\sqrt3) & 3-\sqrt3 & 4(2\sqrt3+3) & 4(3-\sqrt3) & 4(2\sqrt3-3) & -4(3+\sqrt3) & -16\sqrt3 \\
-(3+\sqrt3) & 9+5\sqrt3 & 3-\sqrt3 & -(9-5\sqrt3) & 4(2\sqrt3+3) & -4(3+\sqrt3) & 4(2\sqrt3-3) & 4(3-\sqrt3) & -16\sqrt3 \\
9-5\sqrt3 & -(3-\sqrt3) & -(9+5\sqrt3) & 3+\sqrt3 & -4(2\sqrt3-3) & 4(3+\sqrt3) & -4(2\sqrt3+3) & -4(3-\sqrt3) & 16\sqrt3 \\
-(3-\sqrt3) & 9-5\sqrt3 & 3+\sqrt3 & -(9+5\sqrt3) & -4(2\sqrt3-3) & -4(3-\sqrt3) & -4(2\sqrt3+3) & 4(3+\sqrt3) & 16\sqrt3
\end{bmatrix}. \]
o Case 2. The pressure basis functions are centered at the four corner nodes; this basis
is equivalent to that in case 1, and both will thus give the same numerical results. The
'new' one is simply simpler. The element matrix has, in fact, already been presented: it
is simply the $c^T$ matrix of $Q_2Q_1$. The difference is in how the matrix is used to construct
the global equations; for $Q_2Q_1$ there is only one pressure per global node ($C^0$ pressure),
whereas for $Q_2Q_{-1}$ there are as many pressures at each global node as there are elements
sharing that node ($C^{-1}$ pressure). See main text for further details (Section 3.13.6).
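The equivalence of the two bases can be made concrete: a bilinear function that equals one at a single 2 x 2 Gauss point is a fixed linear combination of the four corner-centered bilinear functions, with 1D weights (1 + sqrt 3)/2 and (1 - sqrt 3)/2. The sketch below is an illustrative derivation rather than a procedure from the text; it assumes the Gauss points are numbered like the corner nodes (counterclockwise from the lower left) and maps one column of a corner-based matrix (in units of h/36) to the corresponding Gauss-based column (in units of h/72).

```python
import math

S3 = math.sqrt(3.0)


def corner_to_gauss(col):
    """Map a corner-pressure column (x h/36) to a Gauss-pressure column (x h/72).

    col holds the entries for pressure nodes 1..4 at the element corners.
    """
    a1, a2, a3, a4 = col
    return [
        (2 + S3) * a1 - a2 + (2 - S3) * a3 - a4,
        -a1 + (2 + S3) * a2 - a3 + (2 - S3) * a4,
        (2 - S3) * a1 - a2 + (2 + S3) * a3 - a4,
        -a1 + (2 - S3) * a2 - a3 + (2 + S3) * a4,
    ]


# Column for velocity node 1 and for the center node 9 of the Q2Q1 x-matrix.
gauss_col_1 = corner_to_gauss([5, 1, 0, 0])
gauss_col_9 = corner_to_gauss([-8, 8, 8, -8])
```

The first result reproduces 9 + 5 sqrt 3, -(3 - sqrt 3), 9 - 5 sqrt 3, -(3 + sqrt 3), and the second gives -16 sqrt 3, 16 sqrt 3, 16 sqrt 3, -16 sqrt 3.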
(5) Biquadratic velocity, discontinuous linear pressure ($Q_2P_{-1}$)
The pressure basis functions are centered at the first three of the four Gauss points. (Any
three points in the plane suffice to define the linear pressure.)
\[ (c_x^T)_{ij} = \frac{h}{36}\begin{bmatrix}
2\sqrt3+3 & 2\sqrt3-3 & 2\sqrt3-3 & 2\sqrt3+3 & -4\sqrt3 & 4(2\sqrt3-3) & -4\sqrt3 & 4(2\sqrt3+3) & -16\sqrt3 \\
\sqrt3 & -5\sqrt3 & \sqrt3 & -5\sqrt3 & 4\sqrt3 & -8\sqrt3 & 4\sqrt3 & -8\sqrt3 & 16\sqrt3 \\
-3(\sqrt3-1) & 3(\sqrt3-1) & -3(\sqrt3+1) & 3(\sqrt3+1) & 0 & -12 & 0 & 12 & 0
\end{bmatrix} \]
and
\[ (c_y^T)_{ij} = \frac{l}{36}\begin{bmatrix}
3(\sqrt3+1) & -3(\sqrt3-1) & 3(\sqrt3-1) & -3(\sqrt3+1) & 12 & 0 & -12 & 0 & 0 \\
-\sqrt3 & 5\sqrt3 & -\sqrt3 & 5\sqrt3 & 8\sqrt3 & -4\sqrt3 & 8\sqrt3 & -4\sqrt3 & -16\sqrt3 \\
-(2\sqrt3-3) & -(2\sqrt3-3) & -(2\sqrt3+3) & -(2\sqrt3+3) & -4(2\sqrt3-3) & 4\sqrt3 & -4(2\sqrt3+3) & 4\sqrt3 & 16\sqrt3
\end{bmatrix}. \]
A.1.4.3 Consistent Laplacian Matrix
\[ (C^T M_L^{-1} C)_{ij} = \sum_k C^T_{ik}\,(M_L^{-1})_{kk}\,C_{kj}, \]
where $M_L$ is the lumped mass matrix (diagonal), and it is important to note that this is an
exceptional case: we are no longer dealing with an element matrix; the summation over
k ranges over all velocity nodes in the mesh. $C^T M_L^{-1} C$ is a (sparse) global matrix.
The following 16-element patch is required for $C^T M_L^{-1} C$, where • corresponds to the
x-portion and x to the y-portion:
[Figure: 16-element patch of nodes surrounding node 0.]
The two stencils (not matrices!) that comprise $C^T M_L^{-1} C$ corresponding to node 0 are
\[ C_x^T M_L^{-1} C_x = \frac{h}{12l}\begin{bmatrix}
-1 & -8 & 18 & -8 & -1 \\
-4 & -32 & 72 & -32 & -4 \\
-1 & -8 & 18 & -8 & -1
\end{bmatrix} \]
and
\[ C_y^T M_L^{-1} C_y = \frac{l}{12h}\begin{bmatrix}
-1 & -4 & -1 \\
-8 & -32 & -8 \\
18 & 72 & 18 \\
-8 & -32 & -8 \\
-1 & -4 & -1
\end{bmatrix}. \]
A.1.5 TWO-DIMENSIONAL CONTROL VOLUME FINITE ELEMENT MATRICES
\[ m_{ij} = \frac{lh}{64}\begin{bmatrix} 9 & 3 & 1 & 3 \\ 3 & 9 & 3 & 1 \\ 1 & 3 & 9 & 3 \\ 3 & 1 & 3 & 9 \end{bmatrix} \]
Remark:
Unfortunately, the symmetry is not preserved if the element shape is non-rectangular.
\[ k_{ij} = \frac{h}{8l}\begin{bmatrix} 3 & -3 & -1 & 1 \\ -3 & 3 & 1 & -1 \\ -1 & 1 & 3 & -3 \\ 1 & -1 & -3 & 3 \end{bmatrix}
+ \frac{l}{8h}\begin{bmatrix} 3 & 1 & -1 & -3 \\ 1 & 3 & -3 & -1 \\ -1 & -3 & 3 & 1 \\ -3 & -1 & 1 & 3 \end{bmatrix} \]
Remark:
Unfortunately, the symmetry is not preserved if the element shape is non-rectangular.
\[ n_{ij} = \frac{uh}{16}\begin{bmatrix} 3 & 3 & 1 & 1 \\ -3 & -3 & -1 & -1 \\ -1 & -1 & -3 & -3 \\ 1 & 1 & 3 & 3 \end{bmatrix}
+ \frac{vl}{16}\begin{bmatrix} 3 & 1 & 1 & 3 \\ 1 & 3 & 3 & 1 \\ -1 & -3 & -3 & -1 \\ -3 & -1 & -1 & -3 \end{bmatrix} \]
Finally, if $u = \sum_j u_j\phi_j$ and $v = \sum_j v_j\phi_j$, the 'non-linear' matrices become
\[ n_{ij} = n^x_{ij} + n^y_{ij}, \]
where
\[ n^x_{ij} = \frac{h}{96}\begin{bmatrix}
7(u_1+u_2) + 2(u_3+u_4) & 7(u_1+u_2) + 2(u_3+u_4) & 2(u_1+u_2) + u_3 + u_4 & 2(u_1+u_2) + u_3 + u_4 \\
-7(u_1+u_2) - 2(u_3+u_4) & -7(u_1+u_2) - 2(u_3+u_4) & -2(u_1+u_2) - u_3 - u_4 & -2(u_1+u_2) - u_3 - u_4 \\
-u_1 - u_2 - 2(u_3+u_4) & -u_1 - u_2 - 2(u_3+u_4) & -2(u_1+u_2) - 7(u_3+u_4) & -2(u_1+u_2) - 7(u_3+u_4) \\
u_1 + u_2 + 2(u_3+u_4) & u_1 + u_2 + 2(u_3+u_4) & 2(u_1+u_2) + 7(u_3+u_4) & 2(u_1+u_2) + 7(u_3+u_4)
\end{bmatrix} \]
and
\[ n^y_{ij} = \frac{l}{96}\begin{bmatrix}
7(v_1+v_4) + 2(v_2+v_3) & 2(v_1+v_4) + v_2 + v_3 & 2(v_1+v_4) + v_2 + v_3 & 7(v_1+v_4) + 2(v_2+v_3) \\
2(v_2+v_3) + v_1 + v_4 & 7(v_2+v_3) + 2(v_1+v_4) & 7(v_2+v_3) + 2(v_1+v_4) & 2(v_2+v_3) + v_1 + v_4 \\
-2(v_2+v_3) - v_1 - v_4 & -7(v_2+v_3) - 2(v_1+v_4) & -7(v_2+v_3) - 2(v_1+v_4) & -2(v_2+v_3) - v_1 - v_4 \\
-7(v_1+v_4) - 2(v_2+v_3) & -2(v_1+v_4) - v_2 - v_3 & -2(v_1+v_4) - v_2 - v_3 & -7(v_1+v_4) - 2(v_2+v_3)
\end{bmatrix}. \]
Appendix 2 Further Comparison of Finite Elements and Finite Volumes
A.2.1 INTRODUCTION
Since, by some measures, recent trends seem to favor the more physically appealing finite
volume methods (there is not just one) over the GFEM (and other FEM's), it seems
interesting to offer another comparison of the two. This we now do by comparing both GFEM
and CVFEM (see Sections 2.2.6 and 2.5.3 for the latter) via bilinear approximations on
rectangles. In so doing, we will also digress somewhat to offer some subjective remarks on
much (most) of the previous FEM textbook literature and the manner in which the discrete
equations are derived. And to make it perfectly clear that we are embarking on a path that
is clearly not clear, we shall present two ostensibly rather 'opposite' viewpoints regarding
the important-but-ambiguous subject of 'local conservation' via GFEM. In the former, we
shall take the position that, alas and alack, the 'poor' GFEM is simply devoid of the often
highly coveted local conservation property. Then we shall turn the tables and explain our
version of the local conservation properties that, when properly interpreted/understood,
assert quite the opposite—GFEM does display local conservation, both at the nodal level
and at element level. This latter viewpoint was pioneered in modern Italy by Comini
et al. (1991, 1992) and has been further 'interpreted' via some personal communications
with G. Comini—to whom we remain grateful. Herein, we present our version of their
arguments. At the end, we expect that the reader will either be more confused than ever or
will 'choose' one or the other side! In either case, it may be worthwhile reiterating one of
the unwritten 'laws' pertaining to numerical simulation: both the final discrete equations,
and their numerical solution, are indifferent to the manner of interpretation.
A.2.2 VIEWPOINT ONE
Let us begin on the negative side of the argument.... We have seen very few FEM
textbooks that do not (we assert) confuse the newcomer by carrying the finite element
method a bit too far by writing so-called element-level equations—which erroneously
imply, in most cases, element-level 'balances' (momentum/force, heat, mass, etc.). Many
peer-reviewed research papers, unfortunately, also promulgate this confusion. Chapter 5
in the book by Burnett (1987) is one of the clearest and most elaborate treatments of 'element
equations' (and with no erroneous implications) that we have seen—even though we still
believe that such an approach is unnecessary.
We believe that the proper way to generate the GFEM equations is just the way we
did it in Sections 2.2-2.4 and 3.13.5. It is then sad but true that the GFEM lacks (usually)
that single attractive property of many finite volume/control volume methods: simple-to-
interpret, local-level 'physical' balances. But to help relieve the consistent confusion that
still abounds in even the most recent of FEM texts, we shall have a go at a proper
description of element equations and element assembly. Before doing so, however, we admit that
the confusing concepts are rather subtle, thus rationalizing the rampant confusion, which
confusion seems to be more 'prevalent' among engineers than mathematicians.
Consider for simplicity the 1D AD equation,
\[ \frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} = \kappa\frac{\partial^2 T}{\partial x^2} + S(x) \quad \text{on } 0 \le x \le L, \tag{A.2.2-1} \]
where $u$ and $\kappa$ are constant. We shall discretize the weak form of this equation with
both linear and quadratic basis functions in the manner of many FEM texts written by
engineers ('free body diagrams'); i.e., we will actually write equations for each node of
each element, as if the element were not connected to its neighbors, but we will make
every effort to make the presentation both palatable and rigorous (no 'arm waving'). To
this end, we first write the finite element approximation to the weak form of (A.2.2-1)
over an arbitrary interval $(A, B)$ located somewhere in $(0, L)$, with $0 \le A < B \le L$:
\[ \int_A^B \left[ \phi_i\left( \frac{\partial T^h}{\partial t} + u\frac{\partial T^h}{\partial x} \right) + \kappa\frac{\partial\phi_i}{\partial x}\frac{\partial T^h}{\partial x} \right] = \int_A^B \phi_i S + \kappa\phi_i\frac{\partial T^h}{\partial x}\bigg|_A^B, \tag{A.2.2-2} \]
where $T^h = \sum_j T_j\phi_j(x)$. Next, an energy balance on the interval $(A, B)$ is, from (A.2.2-1),
\[ \frac{d}{dt}\int_A^B T = \int_A^B S + \left( \kappa\frac{\partial T}{\partial x} - uT \right)\bigg|_A^B, \tag{A.2.2-3} \]
and is one 'goal' of our GFEM approximate solution.
Consider now the (typical) 3-patch in Figure A.2.2-1 using linear elements. Next we
write two equations for element $i+1$, one at the left end and one at the right, using
(A.2.2-2); i.e., $A = x_i$, $B = x_{i+1}$, starting at the left, node $i$:
\[ \frac{l_{i+1}}{6}(2\dot T_i + \dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_i) + \kappa\frac{(T_i - T_{i+1})}{l_{i+1}} = \int_{l_{i+1}} \phi_i S + \kappa\phi_i\frac{\partial T^h}{\partial x}\bigg|_{x_i}^{x_{i+1}}. \tag{A.2.2-4} \]
Fig. A.2.2-1 Local GFEM linear element solution.
The key issue before us lies in the interpretation of the last term: what it is and what
it is not. First the easy part: since $\phi_i = 1$ at $x_i$ and $\phi_i = 0$ at $x_{i+1}$, the term simplifies
to $-\kappa(\partial T^h/\partial x)|_{x_i}$. Now, in spite of what may seem obvious, what it definitely is not is
$\kappa(d/dx)\sum_j T_j\phi_j(x)|_{x_i}$, since the first derivative of the $C^0$ function, $T^h(x)$, is not even
uniquely defined at $x_i$. [Also foolish, and fatal, would be to attempt to approximate
$\partial T^h/\partial x|_{x_i}$ by $(T_{i+1} - T_i)/l_{i+1}$, which would cancel the diffusion term on the LHS and
is equivalent to not having integrated by parts.] What it is is an unknown, but consistent
GFEM diffusional flux at node $i$ (sometimes called a secondary variable, to distinguish
it from the primary variable, $T^h$; see, for example, Reddy (1993); see also Chapter 4 for
further discussion of 'consistent flux'). We shall even give this new unknown a name:
$-\kappa(\partial T^h/\partial x)|_{x_i^+} = q_i^+$, which means the (consistent) flux at $x = x_i^+$, with the + sign ($x_i^+ =
x_i + \varepsilon$ for $\varepsilon > 0$, where $\varepsilon \to 0$) indicating that we are quite prepared for a discontinuous
flux at $x_i$. (This quantity also goes under the name of 'generalized nodal flux', a name
that might make more sense in multi-dimensions; see below.) Thus, the first 'element'
equation is
\[ \frac{l_{i+1}}{6}(2\dot T_i + \dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_i) + \kappa\frac{(T_i - T_{i+1})}{l_{i+1}} = \int_{l_{i+1}} \phi_i S + q_i^+, \tag{A.2.2-5} \]
which, as it will turn out, is actually nothing but the defining equation for $q_i^+$! And this is
precisely what causes the confusion. If we had the GFEM solution, then (A.2.2-5) could
be used to compute the consistent flux through the left end of $l_{i+1}$. Similarly, at the right
end of element $i+1$, we have
\[ \frac{l_{i+1}}{6}(\dot T_i + 2\dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_i) + \kappa\frac{(T_{i+1} - T_i)}{l_{i+1}} = \int_{l_{i+1}} \phi_{i+1} S - q_{i+1}^-, \tag{A.2.2-6} \]
where $q_{i+1}^- = -\kappa(\partial T^h/\partial x)|_{x_{i+1}^-}$ is the (unknown, but consistent) heat flux through the right
end of $l_{i+1}$. ($q_i^+ > 0 \Rightarrow$ net influx to element $i+1$ at its left end and $q_{i+1}^- > 0 \Rightarrow$ net
outflux from element $i+1$ at its right end, because $-\kappa(\partial T^h/\partial x)$ describes flux in the
positive x-direction.)
Important Observation:
Whereas we have written two equations per element, which would lead to $2N$ total
equations with only $N$ unknown nodal temperatures, we also have introduced two
additional unknowns per element so that the total number of unknowns is now $3N$.
What to do? Well, since $\phi_i + \phi_{i+1} = 1$ in $l_{i+1}$, it is interesting first to sum the two
element equations (in three unknowns); this gives
\[ \frac{l_{i+1}}{2}(\dot T_i + \dot T_{i+1}) + u(T_{i+1} - T_i) + 0 = \int_{l_{i+1}} S + q_i^+ - q_{i+1}^-, \tag{A.2.2-7} \]
which, if the heat fluxes on the RHS were 'just right,' would represent an element-level
energy balance. And this is a good place to 'prove' again that $\kappa(\partial T^h/\partial x)$ cannot be
expressed via $T^h = \sum_j T_j\phi_j$ because that would yield $q_i^+ - q_{i+1}^- = 0$ in (A.2.2-7), in
clear violation of an element energy balance via the total absence of the conduction term.
To make further progress, we write (A.2.2-2) for node $i$ in element $i$, and that for node
$i+1$ in element $i+2$:
\[ \frac{l_i}{6}(\dot T_{i-1} + 2\dot T_i) + \frac{u}{2}(T_i - T_{i-1}) + \kappa\frac{(T_i - T_{i-1})}{l_i} = \int_{l_i} \phi_i S - q_i^- \tag{A.2.2-8} \]
and
\[ \frac{l_{i+2}}{6}(2\dot T_{i+1} + \dot T_{i+2}) + \frac{u}{2}(T_{i+2} - T_{i+1}) + \kappa\frac{(T_{i+1} - T_{i+2})}{l_{i+2}} = \int_{l_{i+2}} \phi_{i+1} S + q_{i+1}^+. \tag{A.2.2-9} \]
The next, and crucial, step is to require/invoke flux continuity at each node
(yes, even though our $C^0$ temperature field seems to preclude same) via
\[ q_i^+ = q_i^- \quad \forall i, \tag{A.2.2-10} \]
which closes the system (we now have $3N$ equations in $3N$ unknowns). Although $T^h \in
C^0 \Rightarrow \partial T^h/\partial x \in C^{-1}$, the consistent flux, $q_i^- = q_i^+$, is continuous; $q_i \in C^0$ by (required)
definition. (This is an example of why these concepts are subtle/slippery.)
Now we return to the ostensible energy balance given by (A.2.2-7), and use (A.2.2-8)
and (A.2.2-10) to eliminate $q_i^+$ and (A.2.2-9) and (A.2.2-10) to eliminate $q_{i+1}^-$, to give
\[ \frac{l_i}{6}(\dot T_{i-1} + 2\dot T_i) + \frac{l_{i+1}}{2}(\dot T_i + \dot T_{i+1}) + \frac{l_{i+2}}{6}(2\dot T_{i+1} + \dot T_{i+2})
+ \frac{u}{2}(T_{i+1} - T_i + T_{i+2} - T_{i-1}) + \kappa\left( \frac{T_i - T_{i-1}}{l_i} + \frac{T_{i+1} - T_{i+2}}{l_{i+2}} \right)
= \int_{l_i} \phi_i S + \int_{l_{i+1}} S + \int_{l_{i+2}} \phi_{i+1} S, \tag{A.2.2-11} \]
which is a confusing mess. But is it also an energy balance? That is to say, is there some
sort of local energy balance lurking in these equations? Could it be an energy balance
over element $i+1$ and over one-half of each of its neighbors? Unfortunately, no. What it is
is simply the sum of two GFEM nodal equations; i.e., the sum of two global equations
with no local energy balance.
Proof: the GFEM equations for nodes $i$ and $i+1$, obtained (of course) by applying
(A.2.2-2) with $A = 0$, $B = L$ so that the last term is zero, are, respectively,
\[ \frac{l_i}{6}(\dot T_{i-1} + 2\dot T_i) + \frac{l_{i+1}}{6}(2\dot T_i + \dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_{i-1})
+ \kappa\left( \frac{T_i - T_{i-1}}{l_i} + \frac{T_i - T_{i+1}}{l_{i+1}} \right) = \int_{l_i + l_{i+1}} \phi_i S \tag{A.2.2-12} \]
and
\[ \frac{l_{i+1}}{6}(\dot T_i + 2\dot T_{i+1}) + \frac{l_{i+2}}{6}(2\dot T_{i+1} + \dot T_{i+2}) + \frac{u}{2}(T_{i+2} - T_i)
+ \kappa\left( \frac{T_{i+1} - T_i}{l_{i+1}} + \frac{T_{i+1} - T_{i+2}}{l_{i+2}} \right) = \int_{l_{i+1} + l_{i+2}} \phi_{i+1} S, \tag{A.2.2-13} \]
whose sum is easily seen to be (A.2.2-11). Thus, the potential element-level energy balance
suggested by (A.2.2-7) is a red herring. Only piecewise-constant test functions, such as
those often employed in FVM's, can generate local balances.
To be absolutely sure that our analysis is both clear and useful, we reiterate: once the
GFEM solution has been found, the flux at node $i$ computed as
\[ q_i^+ = \frac{l_{i+1}}{6}(2\dot T_i + \dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_i) + \kappa\frac{(T_i - T_{i+1})}{l_{i+1}} - \int_{l_{i+1}} \phi_i S \]
from (A.2.2-5) and that computed as
\[ q_i^- = \int_{l_i} \phi_i S - \frac{l_i}{6}(\dot T_{i-1} + 2\dot T_i) - \frac{u}{2}(T_i - T_{i-1}) - \kappa\frac{(T_i - T_{i-1})}{l_i} \]
from (A.2.2-8) are precisely the same quantities even though they are obtained from very
different equations. [Setting $q_i^+ = q_i^-$ above simply returns the GFEM equation (A.2.2-12),
for node $i$.] These are the consistent (internal) fluxes that are, of course, computable only
after having solved the GFEM equations for $T_i(t)$, $i = 1, 2, \ldots, N$. And, even though
the diffusive portion of the flux suffers a jump at the element interface (node $i$ above), the
complete consistent flux does not. (Note, however, that continuous consistent flux is a
different issue than element-level energy balances.)
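The claim that the two flux formulas return the same number can be checked numerically. The following is a minimal sketch under simplifying assumptions that are not in the text: steady state (all time derivatives dropped), a uniform mesh, constant source S, and Dirichlet values at both ends. The function names are illustrative, and the tridiagonal solver is a plain Thomas algorithm.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    d, b = diag[:], rhs[:]
    for i in range(1, n):
        w = lower[i] / d[i - 1]
        d[i] -= w * upper[i - 1]
        b[i] -= w * b[i - 1]
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - upper[i] * x[i + 1]) / d[i]
    return x


def steady_gfem(n_el, u, kappa, s, t_left, t_right, length=1.0):
    """Steady form of the nodal GFEM equation on a uniform mesh of linear elements."""
    l = length / n_el
    n_int = n_el - 1
    lower = [-u / 2 - kappa / l] * n_int
    diag = [2 * kappa / l] * n_int
    upper = [u / 2 - kappa / l] * n_int
    rhs = [s * l] * n_int
    rhs[0] -= lower[0] * t_left       # move known boundary values to the RHS
    rhs[-1] -= upper[-1] * t_right
    return [t_left] + solve_tridiagonal(lower, diag, upper, rhs) + [t_right], l


def flux_plus(t, i, l, u, kappa, s):
    """Consistent flux at node i from the element to its right (steady form)."""
    return u / 2 * (t[i + 1] - t[i]) + kappa * (t[i] - t[i + 1]) / l - s * l / 2


def flux_minus(t, i, l, u, kappa, s):
    """Consistent flux at node i from the element to its left (steady form)."""
    return s * l / 2 - u / 2 * (t[i] - t[i - 1]) - kappa * (t[i] - t[i - 1]) / l


t, l = steady_gfem(8, 1.0, 0.1, 2.0, 0.0, 1.0)
jumps = [flux_plus(t, i, l, 1.0, 0.1, 2.0) - flux_minus(t, i, l, 1.0, 0.1, 2.0)
         for i in range(1, 8)]
```

Every entry of `jumps` is the residual of the nodal equation at that node, so it vanishes to round-off once the GFEM system has been solved.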
To finish the story with linears, let us connect what we have presented above to
that in many FEM texts. To obtain the assembled equation for node $i$, using the above,
cumbersome (?), element equation-based approach, simply form (A.2.2-5) and (A.2.2-8),
then add them, using (A.2.2-10); the result is (A.2.2-12).
For quadratic basis functions, we merely outline the steps to obtain the analog of
(A.2.2-11), leaving the details to the interested reader:
1. Form the element equation for the left node, introducing the additional unknown,
say, $q_i^+$.
2. Form the element (and global) equation for the center node, which introduces no
unknown fluxes because $\phi_{i+1} = 0$ at the two ends of the element.
3. Form the element equation for the right node, introducing $q_{i+2}^-$.
4. Sum the three equations, giving a potential element-level balance.
5. Eliminate $q_i^+$ and $q_{i+2}^-$ from (1) and (3) in favor of the left and right neighboring
element equations.
6. The final result in Step (5) is the same as summing all three GFEM equations
corresponding to the given element, and there is (again) no local energy balance.
Moving to 2D, we consider (only) the bilinear element and address the equally
confusing concept of '...upon assembly, all element boundary integrals will cancel...
because the path (element boundary segment) is traversed once in each direction.' The
4-patch (Figure A.2.2-2) below will be all we need to perform the analysis. The six 'paths'
shown relate to the boundary (line) integrals below; e.g., $\Gamma_0^{(4)}$ is the boundary integral path
associated with node 0 in element 4. [The paths traverse only one-half of each element's
perimeter because the basis (test) function associated with the node in question is zero
on the remaining portion of the element boundary.]
Fig. A.2.2-2 Element boundary integral paths.
The 'plan' is three-fold: (i) to show
how to form the proper assembled GFEM equation 'one element at a time'; (ii) to show
that the thus-introduced unknown inter-element flux terms are not very useful in general;
and (iii) to use the results to discuss the concept of an element-level energy balance.
The element equation for node 0 from element 1's 'viewpoint' is
\[ \int_{A_1} \left[ \phi_0\left( \frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h \right) + \kappa\nabla\phi_0\cdot\nabla T^h \right] = \int_{A_1} \phi_0 S + \int_{\Gamma_0^{(1)}} \phi_0\kappa\frac{\partial\bar T^h}{\partial n}, \tag{A.2.2-14} \]
which is the 2D analog of (A.2.2-2), and, as an alternate to introducing boundary q's
as above, we have placed a bar over $T^h$ in the boundary integral to distinguish it from
$T^h = \sum_j T_j\phi_j$ on the LHS; $\partial\bar T^h/\partial n$ is an additional unknown quantity.
Remark:
Note again that, as in 1D, the attempted identification of $\partial\bar T^h/\partial n = \sum_j T_j\,\partial\phi_j/\partial n$ would
lead to the total absence of any diffusion terms in an attempted element-level energy
balance that could be attempted by repeating (A.2.2-14) for each of the other three nodes
in element 1, summing the four equations, and using $\sum_j \phi_j = 1$. This result can also be
obtained by realizing (i) that the above procedure is completely equivalent to not having
integrated $\phi\nabla^2 T$ by parts to obtain the weak form, and (ii) that $\nabla^2 T^h = 0$ for bilinear basis
functions. Higher-order elements would not give a null result, but they would also not
give a correct result, an exercise we leave for the reader.
If we write three more element equations for node 0, one from each element, and sum
all four, we obtain, with $\Omega_0 = A_1 + A_2 + A_3 + A_4$,
\[ \int_{\Omega_0} \left[ \phi_0\left( \frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h \right) + \kappa\nabla\phi_0\cdot\nabla T^h \right] = \int_{\Omega_0} \phi_0 S + \sum_{e=1}^{4} \int_{\Gamma_0^{(e)}} \phi_0\kappa\frac{\partial\bar T^h}{\partial n}, \tag{A.2.2-15} \]
which becomes the GFEM equation for node 0 after imposing 'flux-continuity' by setting
the sum of the boundary integrals to zero; i.e., we must define the boundary integrals to
cancel in pairs, the 2D equivalent of the flux-continuity statement in (A.2.2-10). So, with
more effort than is needed, we can derive the assembled GFEM equations for the full
mesh by repeating the above exercise. (It is more effort than needed because we wrote
element-level equations with the necessary introduction of the element boundary integral
terms, which later drop out.)
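Suppose now that we have solved the GFEM equations so that $T^h(\mathbf{x}, t)$ is available.
Is there any utility in returning to element equations and the associated inter-element
flux-like terms? Well, let us try first to compute the diffusive flux between elements 1
and 2; i.e., through 0-E. We begin by returning to (A.2.2-14), rearranged and expanded;
and using
\[ \int_{\Gamma_0^{(1)}} \phi_0\kappa\frac{\partial\bar T^h}{\partial n} = -\int_0^N \phi_0\kappa\frac{\partial\bar T^h}{\partial x}\,dy - \int_0^E \phi_0\kappa\frac{\partial\bar T^h}{\partial y}\,dx, \]
we obtain
\[ -\int_0^N \phi_0\kappa\frac{\partial\bar T^h}{\partial x}\,dy - \int_0^E \phi_0\kappa\frac{\partial\bar T^h}{\partial y}\,dx
= \int_{A_1} \left[ \phi_0\left( \frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S \right) + \kappa\nabla\phi_0\cdot\nabla T^h \right] \equiv \mathrm{RHS}_0^{(1)}, \tag{A.2.2-16} \]
where $\mathrm{RHS}_0^{(1)}$ denotes a known RHS associated with node 0 and element 1. Thus, whatever
the two terms on the LHS really mean, their sum is now available. But we are interested
in only the second of them (the heat flux between elements 1 and 2). What to do? Well,
we might try writing the analogous equation for element 2, but it is obvious from the
above sketch that this would introduce another boundary integral, $\kappa\int_0^S \phi_0(\partial\bar T^h/\partial x)\,dy$,
which we do not care about. Going to node E will not help either because it brings in
the two boundaries E-NE and E-SE. So let us return to the other three equations for
node 0 and write out each, taking due account (as we did above) of the outward pointing
normal on each element:
\[ +\int_0^E \phi_0\kappa\frac{\partial\bar T^h}{\partial y}\,dx - \int_0^S \phi_0\kappa\frac{\partial\bar T^h}{\partial x}\,dy = \mathrm{RHS}_0^{(2)}, \tag{A.2.2-17} \]
\[ +\int_0^S \phi_0\kappa\frac{\partial\bar T^h}{\partial x}\,dy + \int_0^W \phi_0\kappa\frac{\partial\bar T^h}{\partial y}\,dx = \mathrm{RHS}_0^{(3)}, \tag{A.2.2-18} \]
and
\[ -\int_0^W \phi_0\kappa\frac{\partial\bar T^h}{\partial y}\,dx + \int_0^N \phi_0\kappa\frac{\partial\bar T^h}{\partial x}\,dy = \mathrm{RHS}_0^{(4)}, \tag{A.2.2-19} \]
where all RHS's are known. Now we might hope to get somewhere because we
have four equations in the four unknown 'generalized' heat fluxes. Thus, writing these
four equations as $Ax = b$, with $x_1 = \int_0^N \phi_0\kappa(\partial\bar T^h/\partial x)\,dy$, $x_2 = \int_0^E \phi_0\kappa(\partial\bar T^h/\partial y)\,dx$, $x_3 =
\int_0^S \phi_0\kappa(\partial\bar T^h/\partial x)\,dy$, and $x_4 = \int_0^W \phi_0\kappa(\partial\bar T^h/\partial y)\,dx$ gives
\[ A = \begin{bmatrix} -1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & -1 \end{bmatrix}, \]
which, unfortunately, is singular (the four equations are not linearly independent), and
our hopes are dashed: we cannot extricate the individual element-side heat fluxes from
the given sums of them. So far, no 'utility'!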
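The singularity is easy to confirm by hand or by machine. The short check below is an illustrative sketch: it writes the coefficient matrix with the signs implied by the outward normals in (A.2.2-16) through (A.2.2-19) (rows are elements 1 to 4, columns are the four edge integrals through N, E, S and W), an ordering chosen here for the example.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


# Each edge integral appears once with each sign, so the four rows sum to zero.
A = [[-1, -1, 0, 0],
     [0, 1, -1, 0],
     [0, 0, 1, 1],
     [1, 0, 0, -1]]

determinant = det(A)
column_sums = [sum(row[j] for row in A) for j in range(4)]
```

The vanishing column sums express the same fact as the GFEM equation for node 0: adding the four element equations cancels every edge integral.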
We can, however, get some information on heat flux, although it will be nodal rather than
elemental. Suppose we want the total flux through W-0-E that is associated with node
0; i.e., that between elements 1 and 4 above and elements 2 and 3 below? This is do-able,
and we shall finally see some utility of these secondary variables. If we sum (A.2.2-16)
and (A.2.2-19), we get
\[ -\int_0^E \phi_0\kappa\frac{\partial\bar T^h}{\partial y}\,dx - \int_0^W \phi_0\kappa\frac{\partial\bar T^h}{\partial y}\,dx = \mathrm{RHS}_0^{(1)} + \mathrm{RHS}_0^{(4)}, \tag{A.2.2-20} \]
where the LHS looks like $\int_W^E \phi_0 q\,dx = [(l_1 + l_4)/2]\,q_0$ to give
\[ q_0 = \frac{2}{l_1 + l_4}\left[ \mathrm{RHS}_0^{(1)} + \mathrm{RHS}_0^{(4)} \right] \tag{A.2.2-21} \]
as an 'average' vertical flux 'through' (associated with) node 0. Note too that this same
(consistent) flux could also be obtained 'from below'; i.e., via elements 2 and 3. Just add
(A.2.2-17) and (A.2.2-18) to get
\[ \int_0^E \phi_0\kappa\frac{\partial\bar T^h}{\partial y}\,dx + \int_0^W \phi_0\kappa\frac{\partial\bar T^h}{\partial y}\,dx = \mathrm{RHS}_0^{(2)} + \mathrm{RHS}_0^{(3)}, \tag{A.2.2-22} \]
the LHS of which is just the negative of that in (A.2.2-20). That they agree (up to a
sign) is a simple consequence of the fact that $\sum_{j=1}^{4} \mathrm{RHS}_0^{(j)} = 0$, which is (of course) the
GFEM equation for node 0.
To get the flux through a single element side, say 0-E, one would first compute $q_E$
in an analogous way to obtaining (A.2.2-21), which, of course, requires the introduction
of the element to the right of element 1. Then $q_0\phi_0(x) + q_E\phi_E(x)$ can be used to describe,
pointwise, the ($C^0$) heat flux through edge 0-E for any x between 0 and E.
While certainly legitimate and probably sometimes useful, these consistent flux
calculations are usually more useful at true domain boundaries, a subject we shall return to
in Chapter 4.
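To conclude our element-equation analysis, let us examine the possibility of element-level
energy balances. Thus, we focus on a single element, say number 3, in the above
mesh and write all four element equations:
$$-\int_{SW}^{W} \phi_{SW}\,\kappa \frac{\partial T^h}{\partial x}\,dy - \int_{SW}^{S} \phi_{SW}\,\kappa \frac{\partial T^h}{\partial y}\,dx = \int_{A_3}\left[\phi_{SW}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right) + \kappa\nabla\phi_{SW}\cdot\nabla T^h\right], \tag{A.2.2-23}$$
$$-\int_{SW}^{S} \phi_{S}\,\kappa \frac{\partial T^h}{\partial y}\,dx + \int_{S}^{0} \phi_{S}\,\kappa \frac{\partial T^h}{\partial x}\,dy = \int_{A_3}\left[\phi_{S}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right) + \kappa\nabla\phi_{S}\cdot\nabla T^h\right], \tag{A.2.2-24}$$
$$\int_{S}^{0} \phi_{0}\,\kappa \frac{\partial T^h}{\partial x}\,dy + \int_{W}^{0} \phi_{0}\,\kappa \frac{\partial T^h}{\partial y}\,dx = \int_{A_3}\left[\phi_{0}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right) + \kappa\nabla\phi_{0}\cdot\nabla T^h\right], \tag{A.2.2-25}$$
and
$$\int_{W}^{0} \phi_{W}\,\kappa \frac{\partial T^h}{\partial y}\,dx - \int_{SW}^{W} \phi_{W}\,\kappa \frac{\partial T^h}{\partial x}\,dy = \int_{A_3}\left[\phi_{W}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right) + \kappa\nabla\phi_{W}\cdot\nabla T^h\right], \tag{A.2.2-26}$$
which sum to
$$\int_{S}^{0} \kappa \frac{\partial T^h}{\partial x}\,dy - \int_{SW}^{W} \kappa \frac{\partial T^h}{\partial x}\,dy + \int_{W}^{0} \kappa \frac{\partial T^h}{\partial y}\,dx - \int_{SW}^{S} \kappa \frac{\partial T^h}{\partial y}\,dx = \int_{A_3}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right). \tag{A.2.2-27}$$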
Now, except for the 'elusive' nature of the terms on the LHS, it is clear that (A.2.2-27) is
at least an attempt at an element energy balance. In fact, the LHS can be rewritten
as $\int_{\Gamma_3} \kappa\,(\partial T^h/\partial n)$, which, from the divergence theorem, is $\int_{A_3} \kappa\nabla^2 T^h$. Unfortunately,
however, the element energy balance is just as elusive as $\nabla^2 T^h$ (or $\partial T^h/\partial n$); perhaps
the best interpretation of (A.2.2-27) is to consider it as the definition of the sum of the
diffusive flux terms that would give an element-level balance. In fact, these terms are
simply the residual of the known terms on the RHS.
Our 'bottom line' on this issue is simply the following advice: do not form your
GFEM equations via element-level equations; use the proper (global) support of the test
functions, and you will not get confused. Although this approach implies the
construction of nodal (global) equations spanning more than one element, and thus seems to
ignore/bypass/minimize/preclude 'classical' finite element thinking/methodology, it really
does not—it simply admits (perhaps even emphasizes) that 'looping through the elements,'
or 'global assembly,' is merely a bookkeeping procedure in which element contributions
to global equations is the proper name of the game. Finally, with regard to the lack of
element balances: sometimes one has no choice but to simply let the mathematics speak
for itself and let the physics take a back seat. But as we have shown, finite volume
methods with their true local conservation usually do not produce as accurate an
approximation to the PDE solution as does GFEM! The GFEM trades element (control volume)
balances for higher accuracy.
A.2.3 VIEWPOINT TWO
The starting point is the same; write (A.2.2-5) as
$$q_i^+ = -\kappa\,\frac{T_{i+1} - T_i}{l_{i+1}} + \frac{l_{i+1}}{6}\left(2\dot T_i + \dot T_{i+1}\right) + \frac{u}{2}\left(T_{i+1} - T_i\right) - \int_{x_i}^{x_{i+1}} \phi_i S, \tag{A.2.3-1}$$
where $q_i^+$ is the (consistent) diffusional flux (in the x-direction; $q_i^+ > 0 \Rightarrow$ flow in the
x-direction) through node i as seen from element i + 1; $q_i^+ > 0 \Rightarrow$ flux is into the element.
Yes, in spite of the 'extra' terms on the RHS, $q_i^+$ is a diffusional flux, a concept that will
become more clear after we write the analogous equation for node i from the 'viewpoint'
of element i:
$$q_i^- = -\kappa\,\frac{T_i - T_{i-1}}{l_i} - \frac{l_i}{6}\left(2\dot T_i + \dot T_{i-1}\right) - \frac{u}{2}\left(T_i - T_{i-1}\right) + \int_{x_{i-1}}^{x_i} \phi_i S, \tag{A.2.3-2}$$
wherein $q_i^-$ is the diffusional flux through node i as seen from element i, and wherein the
sign reversals are interesting, and significant. Both (A.2.3-1) and (A.2.3-2) should be
interpreted in the following sense: the RHS comprises a 'first cut' (the first term) and
a set of 'correction' terms for the diffusive flux at node i. This is the key to finding
local GFEM balances, and is rationalized/justified, at least in part, by realizing that mesh
refinement will leave only the first term on the RHS's, $-\kappa\,\partial T/\partial x$; the other three terms
shrink to zero (no more corrections needed).
'Nodal' conservation is now invoked by the requirement that the (consistent) diffusional
flux be continuous; i.e., we enforce $q_i^+ = q_i^-$, which is just (A.2.2-10). The result, of
course, is simply the GFEM equation for node i. Once the full set of GFEM equations
is solved for $T_i(t)$, i = 1, 2, ..., N, either (A.2.3-1) or (A.2.3-2) can be used to compute
the diffusional flux at node i. Thus, although the GFEM equation for node i does not
describe a nodal energy balance per se, the 'post-processing' equations (A.2.3-1) and
(A.2.3-2), do.
The element energy balance (for $I_{i+1}$) begins by writing the analog to (A.2.3-1) for
node i + 1; namely, (A.2.2-6):
$$-q_{i+1}^- = \kappa\,\frac{T_{i+1} - T_i}{l_{i+1}} + \frac{l_{i+1}}{6}\left(\dot T_i + 2\dot T_{i+1}\right) + \frac{u}{2}\left(T_{i+1} - T_i\right) - \int_{x_i}^{x_{i+1}} \phi_{i+1} S, \tag{A.2.3-3}$$
which (for $-q_{i+1}^- > 0$) describes the diffusional flux into $I_{i+1}$ from the right. Clearly the
sum of (A.2.3-1) and (A.2.3-3) represents the total diffusional flux into $I_{i+1}$; it yields
$$l_{i+1}\,\frac{\dot T_i + \dot T_{i+1}}{2} + u\left(T_{i+1} - T_i\right) = \int_{x_i}^{x_{i+1}} S + q_i^+ - q_{i+1}^- \tag{A.2.3-4}$$
as the total energy balance for element i + 1. Comparing this with (A.2.2-3), it becomes
(again) clear that the term $\kappa(\partial T/\partial x)\big|_{x_i}^{x_{i+1}}$ is represented by $q_i^+ - q_{i+1}^-$, the total diffusional
flux entering element $I_{i+1}$. This is the GFEM element energy balance. All that is needed
to 'accept' it is the previously stated key interpretation: the q's as given by (A.2.3-1) and
(A.2.3-3) are the 'proper' diffusional flux terms, comprising the sum of a simple first
approximation and appropriate (and not so simple) correction terms.
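The 1D post-processing described above can be tried on a small test problem. The following is a minimal sketch, not the authors' code: it assumes steady flow, a uniform mesh of linear elements on [0, 1], constant $\kappa$, u, and S, and homogeneous Dirichlet data; the function names are hypothetical. It solves the nodal GFEM equations and then evaluates the steady forms of (A.2.3-1) and (A.2.3-2).

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[k] multiplies x[k-1], upper[k] multiplies x[k+1]."""
    n = len(diag)
    d, r = list(diag), list(rhs)
    for k in range(1, n):
        m = lower[k] / d[k - 1]
        d[k] -= m * upper[k - 1]
        r[k] -= m * r[k - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (r[k] - upper[k] * x[k + 1]) / d[k]
    return x


def consistent_fluxes(n_elem=8, kappa=1.0, u=0.0, source=1.0):
    """Steady 1D GFEM on [0, 1] with T(0) = T(1) = 0, then consistent flux recovery."""
    l = 1.0 / n_elem                      # uniform element length (assumption)
    n_int = n_elem - 1                    # number of interior nodes
    # Interior nodal equations: q_i^+ = q_i^- written out for linear elements.
    lower = [-kappa / l - 0.5 * u] * n_int
    diag = [2.0 * kappa / l] * n_int
    upper = [-kappa / l + 0.5 * u] * n_int
    rhs = [source * l] * n_int
    T = [0.0] + solve_tridiagonal(lower, diag, upper, rhs) + [0.0]
    half_src = 0.5 * source * l           # integral of phi_i * S over one element
    # q_plus[i]: flux at node i seen from the element on its right, i = 0..n_elem-1
    q_plus = [-kappa * (T[i + 1] - T[i]) / l + 0.5 * u * (T[i + 1] - T[i]) - half_src
              for i in range(n_elem)]
    # q_minus[i-1]: flux at node i seen from the element on its left, i = 1..n_elem
    q_minus = [-kappa * (T[i] - T[i - 1]) / l - 0.5 * u * (T[i] - T[i - 1]) + half_src
               for i in range(1, n_elem + 1)]
    return T, q_plus, q_minus


T, q_plus, q_minus = consistent_fluxes(n_elem=8)
# Nodal conservation: both viewpoints give the same flux at every interior node.
jumps = [q_plus[i] - q_minus[i - 1] for i in range(1, 8)]
# Element balance (steady form of A.2.3-4): q_i^+ - q_{i+1}^- = u*(T_{i+1} - T_i) - S*l.
balances = [q_plus[i] - q_minus[i] for i in range(8)]
```

For the pure-diffusion defaults (u = 0, $\kappa$ = S = 1) the exact solution is T = x(1 - x)/2 with flux $-\partial T/\partial x$ = x - 1/2, and the recovered boundary fluxes reproduce the exact end values even though the 'first cut' difference quotient alone would not.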
Moving to 2D, we begin with (A.2.2-14), where $T^h = \sum_j T_j\phi_j$, refer back to
Figure A.2.2-2, and introduce the additional defining equation for the nodal diffusive
(and consistent) flux:
$$\int_{\Gamma_0^{(i)}} \phi_0\,\kappa\,\frac{\partial T^h}{\partial n} = -\int_{\Gamma_0^{(i)}} \phi_0\, q_0^{(i)} = -\frac{l+h}{2}\,\bar q_0^{(i)}, \tag{A.2.3-5}$$
wherein $q_0^{(i)}$ is the (pointwise) normal diffusional flux leaving [for $q_0^{(i)} > 0$] element i
through $\Gamma_0^{(i)}$ ('through' node 0) and $\bar q_0^{(i)}$ is the average outward (when positive) normal
diffusional flux through $\Gamma_0^{(i)}$. Both $q_0^{(i)}$ and $\bar q_0^{(i)}$ are unknown quantities until the GFEM
equations are solved, after which the second term on the RHS of (A.2.2-14) is available.
Remark:
The term $\int_{\Gamma_0^{(i)}} \phi_0\,\kappa\,(\partial T^h/\partial n)$ is often labeled a 'generalized flux' or some other such mostly
meaningless term, and contributes to the extant confusion.
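Equations (A.2.2-14) and (A.2.3-5) represent/describe the consistent diffusional flux
through $\Gamma_0^{(1)}$ as a 'first approximation,' $-\int_{A_1} \kappa\nabla\phi_0\cdot\nabla T^h$, plus several correction terms, as
$$\frac{l+h}{2}\,\bar q_0^{(1)} = -\int_{A_1} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_1} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-6}$$
wherein the second integral vanishes as $A_1 \to 0$; i.e., for $l, h \to 0$, this equation becomes
$$\bar q_0^{(1)} = \frac{\kappa}{3(l+h)}\left\{ h\,\frac{2(T_E - T_0) + (T_{NE} - T_N)}{l} + l\,\frac{2(T_N - T_0) + (T_{NE} - T_E)}{h} \right\} + O(l,h), \tag{A.2.3-7}$$
which further leads to
$$\bar q_0^{(1)} = \frac{\kappa}{l+h}\left[ h\,\frac{\partial T}{\partial x}\bigg|_0 + l\,\frac{\partial T}{\partial y}\bigg|_0 \right] + O(l,h), \tag{A.2.3-8}$$
a consistent description of a particular heat flux vector, including direction. (We presume
here that the limit is taken with l/h fixed.)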
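The step from the 'first approximation' integral to its nodal-value form can be checked numerically. The sketch below is illustrative only and rests on stated assumptions: element 1 is taken to be the rectangle with node 0 at its lower-left corner, side l along x and side h along y, with bilinear basis functions; the function names are hypothetical. It compares a closed-form nodal expression with a direct 2x2 Gauss quadrature of $-\kappa\int_{A_1}\nabla\phi_0\cdot\nabla T^h$, scaled by 2/(l + h).

```python
import math

GAUSS2 = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))  # 2-point Gauss nodes, unit weights


def flux_closed_form(T0, TE, TNE, TN, l, h, kappa=1.0):
    """Average flux at node 0 of element 1 from nodal temperatures (first approximation)."""
    x_part = h * (2.0 * (TE - T0) + (TNE - TN)) / l
    y_part = l * (2.0 * (TN - T0) + (TNE - TE)) / h
    return kappa * (x_part + y_part) / (3.0 * (l + h))


def flux_by_quadrature(T0, TE, TNE, TN, l, h, kappa=1.0):
    """Same quantity, integrating -kappa * grad(phi_0) . grad(T^h) over the element."""
    total = 0.0
    for gx in GAUSS2:
        for gy in GAUSS2:
            s, t = 0.5 * (gx + 1.0), 0.5 * (gy + 1.0)      # x/l and y/h in [0, 1]
            dphi_dx, dphi_dy = -(1.0 - t) / l, -(1.0 - s) / h
            dT_dx = ((TE - T0) * (1.0 - t) + (TNE - TN) * t) / l
            dT_dy = ((TN - T0) * (1.0 - s) + (TNE - TE) * s) / h
            total += (dphi_dx * dT_dx + dphi_dy * dT_dy) * 0.25 * l * h
    return -kappa * total * 2.0 / (l + h)


# Arbitrary nodal values: the two evaluations should agree to round-off.
args = (1.0, 1.7, 2.9, 0.4, 0.5, 0.25, 3.0)
closed, numeric = flux_closed_form(*args), flux_by_quadrature(*args)

# Linear field T = a*x + b*y: the result should be kappa*(h*a + l*b)/(l + h).
a, b, l, h, kappa = 2.0, 1.0, 0.5, 0.25, 3.0
linear_flux = flux_closed_form(0.0, a * l, a * l + b * h, b * h, l, h, kappa)
```

For a linear temperature field the O(l, h) remainder vanishes, so the nodal formula returns the weighted combination of $\partial T/\partial x$ and $\partial T/\partial y$ exactly.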
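Similar equations apply at node 0 for the other three elements:
$$\int_{\Gamma_0^{(2)}} \phi_0\, q_0^{(2)} = -\int_{A_2} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_2} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-9}$$
$$\int_{\Gamma_0^{(3)}} \phi_0\, q_0^{(3)} = -\int_{A_3} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_3} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-10}$$
and
$$\int_{\Gamma_0^{(4)}} \phi_0\, q_0^{(4)} = -\int_{A_4} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_4} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right). \tag{A.2.3-11}$$
The sum of all four nodal equations for node 0 yields, upon invoking/enforcing flux
continuity at node 0,
$$\sum_{i=1}^{4} \int_{\Gamma_0^{(i)}} \phi_0\, q_0^{(i)} = 0, \tag{A.2.3-12}$$
that is, (A.2.2-15) with the second term on the RHS dropped, which is (of course) the GFEM
equation (fully 'summed') for node 0.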
Summarizing, once the full GFEM set of equations is solved, one can return to the
above nodal equations and compute the appropriate (and consistent) diffusional fluxes at
each node. This is what Comini et al. describe as a nodal energy balance.
Moving now toward an element energy balance, we begin by writing all four of the
above type nodal equations, but this time for a single element, say number 3, as follows:
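$$\int_{\Gamma_0^{(3)}} \phi_0\, q_0^{(3)} = -\int_{A_3} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_3} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-13}$$
$$\int_{\Gamma_W^{(3)}} \phi_W\, q_W^{(3)} = -\int_{A_3} \kappa\nabla\phi_W\cdot\nabla T^h - \int_{A_3} \phi_W\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-14}$$
$$\int_{\Gamma_{SW}^{(3)}} \phi_{SW}\, q_{SW}^{(3)} = -\int_{A_3} \kappa\nabla\phi_{SW}\cdot\nabla T^h - \int_{A_3} \phi_{SW}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-15}$$
and
$$\int_{\Gamma_S^{(3)}} \phi_S\, q_S^{(3)} = -\int_{A_3} \kappa\nabla\phi_S\cdot\nabla T^h - \int_{A_3} \phi_S\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right). \tag{A.2.3-16}$$
Simple summation of these four equations yields
$$\int_{\Gamma_0^{(3)}} \phi_0\, q_0^{(3)} + \int_{\Gamma_W^{(3)}} \phi_W\, q_W^{(3)} + \int_{\Gamma_{SW}^{(3)}} \phi_{SW}\, q_{SW}^{(3)} + \int_{\Gamma_S^{(3)}} \phi_S\, q_S^{(3)} = \int_{A_3}\left(S - \frac{\partial T^h}{\partial t} - \mathbf{u}\cdot\nabla T^h\right), \tag{A.2.3-17}$$
wherein the LHS describes the total energy leaving element 3 via diffusion, and the
equation can be rewritten, using (A.2.3-5), as
$$\bar q_0^{(3)} + \bar q_W^{(3)} + \bar q_{SW}^{(3)} + \bar q_S^{(3)} = \frac{2}{l+h}\int_{A_3}\left(S - \frac{\partial T^h}{\partial t} - \mathbf{u}\cdot\nabla T^h\right). \tag{A.2.3-18}$$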
It is thus clear that (A.2.3-17) and (A.2.3-18) are statements of local energy conservation
at the element level. While not quite as straightforward to obtain as when a CVFEM
procedure is employed, the resulting GFEM equation does indeed describe an equivalent
local conservation law.
Remark:
The 'completion' of the energy balance actually requires either that $\nabla\cdot\mathbf{u} = 0$ or that the
flux-divergence form of advection be employed, so that the RHS of (A.2.3-17) can be
rewritten as
$$\int_{A_3} S - \frac{d}{dt}\int_{A_3} T^h - \int_{\Gamma_3} T^h\,\mathbf{n}\cdot\mathbf{u}.$$
As an 'aside,' it may be worth pointing out that the above discussion/derivations
involved a 'mass lumping' approximation that is actually not required. By doing more
work, via consistent mass, it may be possible to generate more accurate 'first estimates'
of the local diffusional (and still consistent) heat fluxes. This can be done by expanding
the pointwise flux, $q_i^{(e)}$, the flux associated with node i in element e, via the (in this case)
linear basis functions on the element's boundary [$\Gamma_i^{(e)}$]:
$$q_i^{(e)} = \sum_j \hat q_j^{(e)}\phi_j \tag{A.2.3-19}$$
on $\Gamma_i^{(e)}$, where only three of the four element's nodes make contributions; e.g., for node
0 in element 3, (A.2.3-19) gives
$$q_0^{(3)} = \hat q_W^{(3)}\phi_W + \hat q_0^{(3)}\phi_0 + \hat q_S^{(3)}\phi_S, \tag{A.2.3-20}$$
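because the basis function for node SW, $\phi_{SW}$, is zero on $\Gamma_0^{(3)}$. Repeating this procedure
for the other three nodes in element 3 (S, SW, and W) and performing the integrations on
the LHS yields, rather than (A.2.3-13) through (A.2.3-16),
$$\frac{1}{6}\left[l\,\hat q_W^{(3)} + 2(l+h)\,\hat q_0^{(3)} + h\,\hat q_S^{(3)}\right] = -\int_{A_3} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_3} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-21}$$
$$\frac{1}{6}\left[2(l+h)\,\hat q_W^{(3)} + h\,\hat q_{SW}^{(3)} + l\,\hat q_0^{(3)}\right] = -\int_{A_3} \kappa\nabla\phi_W\cdot\nabla T^h - \int_{A_3} \phi_W\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-22}$$
$$\frac{1}{6}\left[h\,\hat q_W^{(3)} + 2(l+h)\,\hat q_{SW}^{(3)} + l\,\hat q_S^{(3)}\right] = -\int_{A_3} \kappa\nabla\phi_{SW}\cdot\nabla T^h - \int_{A_3} \phi_{SW}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-23}$$
and
$$\frac{1}{6}\left[l\,\hat q_{SW}^{(3)} + 2(l+h)\,\hat q_S^{(3)} + h\,\hat q_0^{(3)}\right] = -\int_{A_3} \kappa\nabla\phi_S\cdot\nabla T^h - \int_{A_3} \phi_S\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-24}$$
whose sum is again the element-level energy balance given by (A.2.3-18), appropriately,
with $\bar q_j$ replaced by $\hat q_j$. This 4x4 system can be solved for the consistent mass
version of the consistent flux equations on element 3, and the procedure can be repeated
for every element in the domain, wherein we point out that the nodal equations on a
boundary element in which the flux is specified need not be written.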
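As a rough illustration of this 4x4 solve, the sketch below assembles the boundary 'mass' matrix implied by (A.2.3-21) through (A.2.3-24), with nodes ordered (0, W, SW, S), and compares the consistent-mass fluxes with the lumped averages of (A.2.3-18). The right-hand-side numbers are made-up placeholders standing in for the area integrals, and the function names are hypothetical.

```python
def boundary_mass_matrix(l, h):
    """4x4 matrix of (A.2.3-21)-(A.2.3-24); rows and columns ordered (0, W, SW, S)."""
    d = 2.0 * (l + h)
    return [[d / 6, l / 6, 0.0, h / 6],
            [l / 6, d / 6, h / 6, 0.0],
            [0.0, h / 6, d / 6, l / 6],
            [h / 6, 0.0, l / 6, d / 6]]


def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


l, h = 0.2, 0.1
rhs = [0.03, -0.01, 0.02, 0.005]          # placeholder values of the RHS area integrals
B = boundary_mass_matrix(l, h)
q_hat = solve_dense(B, rhs)                # consistent-mass nodal fluxes
q_bar = [2.0 * r / (l + h) for r in rhs]   # lumped (average) fluxes, as in (A.2.3-5)
residual = [sum(B[i][j] * q_hat[j] for j in range(4)) - rhs[i] for i in range(4)]
```

Because every column of the matrix sums to (l + h)/2, the two sets of fluxes have the same total, which is the statement that both satisfy the same element-level balance.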
Exercises for the reader:
1. Verify that the 'consistent mass' approach in the formulation of the nodal equations
such as (A.2.3-6) is also consistent/legitimate. (Hint: in addition to flux continuity at node
0, similar enforcement is required at nodes N, S, E, and W.)
2. Show how the element-level matrices can be used to finally implement the (optional)
flux calculations.
3. Extend the analysis to arbitrary meshes and to higher-order elements.
If 'Viewpoint Two' is accepted and 'Viewpoint One' is rejected, then the principal
argument favoring FVM over FEM has been vanquished.
Appendix 3 Projections,
Orthogonal and Not—and
Projection Methods
A.3.1 INTRODUCTION
* [Warning to some readers: you have really got to want to understand the nitty-gritty
about projections to justify the time and effort required to assimilate this appendix!]
This appendix is intended to supplement Chapters 2 and 3 and, accordingly, begins
with scalar systems and ends with divergence-free vector systems. It may be devoured
whole or piecemeal, the latter by reading only the scalar portions of it when Chapter 2
refers to this appendix and skipping the vector portion until referred to in Chapter 3. It
is a selective segment of a very general concept and has been designed to focus on the
applications associated with this book. For further background and more information, the
reader may consult, among others, the following references: Mikhlin (1964), Bronshstein
and Semendyayev (1985).
Galerkin's method is often called a projection method, and one of the key goals of
this appendix is to explain why. This will obviously entail a fairly careful definition and
description of 'a projection', as well as that of a projection method.
In the various linear vector spaces (of functions, or vectors, as they are also called)
associated with the branch of mathematics known as functional analysis, are a variety of
so-called projections. These are abstract generalizations of familiar Euclidean projections
and, by construction/definition, share some of their properties. Three of these familiar
projections are: (i) the projection of a 2D vector in the plane onto a line in the plane;
(ii) the projection of a vector in 3-space ($R^3$) onto a plane ($R^2$); and (iii) the projection of
a vector in $R^3$ onto a line ($R^1$), which could also be realized by first applying the second
of the above projections and then applying the first. The key point is that a projection
is a representation of some 'quantity' in a subspace of the original space (sometimes
called a proper subspace). This means that it is never a complete or total representation
in that some information is necessarily lost; i.e., you cannot go backwards. If/when the
projection is considered as an 'operation,' it is one-way: the inverse operation does not
exist. Consider, for example, the two (orthogonal) projections of vectors from $R^2$ to $R^1$
shown in Figure A.3.1-1. The result of projecting either $u_1$ or $u_2$ to the line represented
by the x-axis is u. While u is the unique projection of both $u_1$ and $u_2$, there is clearly no
unique inverse projection.
Fig. A.3.1-1 Two simple sample projections.
If the projection is represented by the operator symbol, p (whether explicitly or, as is
often the case, implicitly) in a vector space W, the subspace onto which the projection
occurs is called the range of the projection and is denoted by R(p). Associated with p is
another subspace, N(p), called the nullspace of the projection; any quantity in N(p) gets
projected to zero by p. Similarly, associated with the projection operator, p, is another
projection operator, Q, via Q = I - p. If $u \in R(p)$, then pu = u and Qu = 0. Similarly,
if $v \in N(p)$, then Qv = v and pv = 0. Also, we use the symbol x, as in u(x), to represent
the total dimensionality of the spaces involved; e.g., if we are in $R^3$, x means x, y, z.
Consider now the more general sketch shown in Figure A.3.1-2, with three
vectors u, v, and w; we remark that these sketches in the plane are at best 'schematics'
when considering higher-dimensional spaces. (Note that the intersection of N and R
'passes through'/defines the only vector that lives in both subspaces: the zero vector.)
Fig. A.3.1-2 General (non-orthogonal) projections.
The projection operator p is said to project 'down to' R(p) in the direction of
(parallel to) N(p); analogously, Q projects 'down to' N(p) in the direction of R(p).
Note that the notions of both distance (norm) and angle (inner product) are implied in
the above sketch, as in Euclidean geometry and as in a Hilbert (function) space, and,
hence, also the notion of orthogonality ($\perp$): if $N \perp R$, then the projection is said to be
orthogonal and vice versa: if we have an orthogonal projection, then $N \perp R$. If
orthogonality exists, then the distance between a given vector u and any vector in R(p) is a
minimum for pu, where distances are necessarily measured in the norm appropriate to the
projection in question. With or without orthogonality, the decomposition u = pu + Qu is
unique.
Remarks:
(1) Some of the projections discussed below are not of the 'usual' type in that the
associated subspaces are not linear.
(2) More general definitions of projections exist in which the vector space need not
possess either an inner product or a norm. These are not of interest herein.
Denoting the inner product between two vectors (functions) by (u, v) and the induced
norm by $\|\cdot\|$ leads to the following 'algorithm' for constructing the above diagram: given
u and v, each as a point lying in the plane of the paper (W):
Step 0. Draw a horizontal line; call it R(p), by definition, and place a point at the zero
vector.
Step 1. Compute the magnitude of u, $\|u\| = \sqrt{(u, u)}$, and its projection onto the range,
pu, and onto the null space, Qu = u - pu.
Step 2. Compute the angle from the R(p)-axis to u via $\cos\theta = (u, pu)/(\|u\|\,\|pu\|)$.
Step 3. Compute the angle between R and N via $\cos\psi = (pu, Qu)/(\|pu\|\,\|Qu\|)$.
Draw the line N(p).
Step 4. Plot u, pu, and Qu.
Step 5. Compute the magnitude of v, $\|v\|$. Plot v; pv and Qv can then be obtained (and
plotted) either directly (via 'computation,' as for u) or indirectly (graphically).
Step 6. Compute the angle between u and v via $\cos\phi = (u, v)/(\|u\|\,\|v\|)$.
Remarks:
(1) Note that $\|u\|^2 = \|pu + Qu\|^2 = \|pu\|^2 + \|Qu\|^2 + 2\|pu\|\,\|Qu\|\cos\psi$. Only if
(pu, Qu) = 0 do we have orthogonality, and the concomitant satisfaction of the
'Pythagorean theorem.'
(2) Another property of the projection is that the angle $\psi$ is the same for all admissible
functions (those in W), a requirement that is clearly (and most easily) satisfied for
orthogonal projections; $\psi = 90°$ via (pu, Qu) = 0.
(3) The dimension of R (hence N) may be the same as that of W, or it may be less than
that of W.
(4) Only orthogonal projections are norm-reducing, $\|pu\| \le \|u\|\ \forall u$. In the
non-orthogonal projection depicted in the sketch, it is clear that $\|pw\| > \|w\|$ for the w shown.
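The construction steps and remarks above can be exercised in the plane with a few lines of code. This is a minimal sketch under stated assumptions: W is taken to be $R^2$ with the Euclidean inner product, R(p) is the x-axis, N(p) is a line inclined at an arbitrarily chosen 60 degrees, and the helper names are hypothetical.

```python
import math


def oblique_projector(r, n):
    """2x2 matrix P with P r = r (range direction) and P n = 0 (nullspace direction)."""
    det = r[0] * n[1] - r[1] * n[0]
    if abs(det) < 1e-14:
        raise ValueError("range and nullspace directions must be linearly independent")
    # Writing u = a*r + b*n gives a = (u0*n1 - u1*n0)/det, and P u = a*r.
    return [[r[0] * n[1] / det, -r[0] * n[0] / det],
            [r[1] * n[1] / det, -r[1] * n[0] / det]]


def apply(P, u):
    return (P[0][0] * u[0] + P[0][1] * u[1], P[1][0] * u[0] + P[1][1] * u[1])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def norm(a):
    return math.sqrt(dot(a, a))


psi = math.radians(60.0)                       # angle between R and N (assumed value)
r = (1.0, 0.0)                                 # Step 0: R(p) is the horizontal axis
n = (math.cos(psi), math.sin(psi))
P = oblique_projector(r, n)

u = (2.0 * r[0] + n[0], 2.0 * r[1] + n[1])     # u = 2 r + n
Pu = apply(P, u)                               # Step 1: projection onto the range
Qu = (u[0] - Pu[0], u[1] - Pu[1])              #         and onto the nullspace
cos_psi = dot(Pu, Qu) / (norm(Pu) * norm(Qu))  # Step 3
law_of_cosines = norm(Pu) ** 2 + norm(Qu) ** 2 + 2.0 * norm(Pu) * norm(Qu) * cos_psi

w = (2.0 * r[0] - n[0], 2.0 * r[1] - n[1])     # a vector whose projection is longer than itself
Pw = apply(P, w)
PPu = apply(P, Pu)                             # projecting twice changes nothing
```

Here `law_of_cosines` reproduces $\|u\|^2$ as in Remark (1), and `w` illustrates Remark (4): with a non-orthogonal projection, $\|pw\|$ can exceed $\|w\|$.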
And this brings us to the notion that there is usually a variational
statement/interpretation of a projection: the projected quantity is the function in R(p) that is 'as close
as possible,' in some sense, to the original quantity; and this is, of course, related to the
concept of orthogonality. In this context, Qu = u — pu is often called the error associated
with the projection. The projections that we consider do indeed also have some of their
roots in the calculus of variations.
An essential property of a projection (indeed, part of its definition) is that a second
projection changes nothing [you are already in the subspace associated with p; i.e., in
R(p)]: $p^2 = p$ (p is idempotent) is the symbolic statement of this fact/requirement. (Or,
$p^2u = pu$.) Note too that $p^2 = p \Rightarrow Q^2 = Q$ and pQ = Qp = 0.
The larger (original) space and the subspace may be ∞-dimensional or
finite-dimensional (of course we cannot project from the latter to the former!). We will be primarily
interested in the case in which the original space is ∞-dimensional and the subspace
finite-dimensional, although projections from one finite-dimensional space to another (a
subspace of it) will also occasionally be of interest. Projections may also be (loosely)
categorized as continuous or discrete, the former sometimes involving differential operators
and the latter usually involving matrices.
A simple example of a finite-dimensional function projection is the finite Fourier
series representation (or any other truncated eigenfunction expansion) of a given
function; namely, the amplitude coefficients: the coefficients of the expansion represent the
projection of the given function onto the set of trigonometric (or other) functions. (In this
case, the projection is an L2-projection, as will soon become clear.)
An example of an ∞-dimensional function projection is a partial Fourier series
representation of a given function on the interval [-1, 1] by the sine series, $\{\sin n\pi x,\ n = 1, 2, \ldots, \infty\}$.
For example, since $e^x$ is not an odd function about the origin, the Fourier
sine series can only approximate it; the sine sequence is (and spans) only a subspace of $L^2$
on [-1, 1], even though it spans $L^2$ on [0, 1]. The (orthogonal in this case) complement
to the sine series, $\{\cos n\pi x,\ n = 0, 1, \ldots, \infty\}$, would be required, in addition to the sine
series, in order to exactly represent $e^x$ on [-1, 1]; the combination of the two infinite
sets of functions spans $L^2[-1, 1]$. In fact, the sine series representation/approximation of
$e^x$ will describe exactly the odd part of it, $\sinh x$. And, equivalently, $\sinh x$ is the best
least-squares fit to $e^x$ using the functions $\{\sin n\pi x\}$.
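A quick numerical check of this example is sketched below. It assumes composite Simpson quadrature (a choice made here for self-containment, not taken from the text) and uses hypothetical helper names; the closed-form coefficient follows from integrating $e^x \sin n\pi x$ by parts.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def sine_coefficient(f, n):
    """(f, sin(n pi x)) on [-1, 1]; the sines have unit L2-norm on this interval."""
    return simpson(lambda x: f(x) * math.sin(n * math.pi * x), -1.0, 1.0)


def closed_form(n):
    """Exact value of the integral of exp(x) * sin(n pi x) over [-1, 1]."""
    return 2.0 * n * math.pi * (-1.0) ** (n + 1) * math.sinh(1.0) / (1.0 + (n * math.pi) ** 2)


modes = range(1, 6)
coeff_exp = [sine_coefficient(math.exp, n) for n in modes]
coeff_sinh = [sine_coefficient(math.sinh, n) for n in modes]
coeff_cosh = [sine_coefficient(math.cosh, n) for n in modes]
```

The sine coefficients of $e^x$ and of $\sinh x$ coincide, while those of the even part, $\cosh x$, vanish: the projection onto the sine subspace discards exactly the even part of the function.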
Remark:
The simplest finite element projection occurs when the basis functions are interpolating
(Lagrange polynomials); i.e., when $\phi_j(x_i) = \delta_{ij}$, and it is this: the simple interpolation of
a given function, f(x), via the basis functions, $f^I(x) = \sum_{j=1}^{N} f(x_j)\phi_j(x)$, is a projection
that we shall call $p_I$; i.e., $p_I f(x) = f^I(x)$.
While a particular projection of a given function to a given subspace is a well-defined
process that is independent of the source/origin of the given function, it will often be
the case that the given function represents the solution to some BVP, in which case the
projection turns out to be the approximate (GFEM) solution to the same BVP; indeed,
this is why Galerkin's method is a projection method—and we shall demonstrate this
projection connection. For example, the projection of a solution to Poisson's equation
onto the finite element subspace (mesh, nodes, and basis functions) is one from infinite
dimensions to finite dimensions. If the (weak) gradient of this projected function was then
further projected to a (finite-dimensional, necessarily!) subspace of discretely divergence-
free functions, we would be projecting from one finite-dimensional space to another. We
shall demonstrate these projections in what follows.
Another noteworthy (and general) property of projection operators is that their
eigenvalues are either zero or one. [Proof: $px = \lambda x \Rightarrow p^2x = px = \lambda px \Rightarrow (1 - \lambda)px = 0 =
(1 - \lambda)\lambda x$.]
In closing this introduction, we point out that there are only two basic types of
projections that are of interest herein; one is called the $L^2$-projection (from the Lebesgue norm,
$\|u\|_{L^p} = (\int |u|^p)^{1/p}$ for p = 2), wherein we shall convert to the common and simpler
name, $\|u\|_0$, because $H^0$ is another name for $L^2$ (the function, but none of its
derivatives, must be square integrable); i.e., $\|u\|_0 = \|u\|_{L^2}$. The other is the $H^1$-projection, in
the (Hilbert?) norm $\|u\|_1 = (\int \nabla u\cdot\nabla u)^{1/2}$, where we neglect the '$L^2$-portion' of the
conventional $H^1$-norm, $\|u\|^2_{H^1} = \int(\nabla u\cdot\nabla u + u^2)$, because our semi-norm will actually
qualify as a true norm (value zero if and only if u = 0) in virtually all cases of interest,
because our associated BC will preclude u = constant, which would give $\|u\|_1 = 0$ for
u = constant $\ne 0$ and thus be illegal as a norm. Also, while not explicitly stated each time,
associated with each ∞-dimensional projection is a spatial domain, $\Omega$ (in $R^{n_s}$; $n_s$ = 1, 2,
or 3), with boundary $\Gamma$.
In the remainder of this appendix we shall attempt to remove the thus-far
qualitative and therefore somewhat vague interpretations of projections, and replace/augment
them with quantitative discussions including explicit (when possible) definitions of the
projection operators for the two norms mentioned above and for two classes of problems:
scalar and vector, with the former serving partly as a stepping stone to the latter—and
with an important additional projection regarding the latter: the projection to a discretely
divergence-free subspace. The terminology we shall employ utilizes the following (ten!)
projection definitions:
$p_0$ : Infinite dimensional $L^2$($H^0$)-projection
$p_0^h$ : Finite dimensional $L^2$($H^0$)-projection
$p_1$ : Infinite dimensional $H^1$-projection
$p_1^h$ : Finite dimensional $H^1$-projection
$p_0^J$ : Infinite dimensional $L^2$-projection to the divergence-free subspace, J
$p_0^{J^h}$ : Finite dimensional $L^2$-projection to the weakly divergence-free subspace, $J^h$
$P_0$ : Projection matrix; discrete $L^2$-projection to the discretely divergence-free subspace
$p_1^J$ : Infinite dimensional $H^1$-projection to the divergence-free subspace, J
$p_1^{J^h}$ : Finite dimensional $H^1$-projection to the weakly divergence-free subspace, $J^h$
$P_1$ : Projection matrix; discrete $H^1$-projection to the discretely divergence-free subspace
Thus, we have five variations on each of two themes; the first four are for scalar fields,
and the remaining six are for vector fields. We shall define and describe them in the above
order.
A.3.2 SCALAR PROJECTIONS
Here we are principally interested in the projections of scalar-valued functions to the
finite-dimensional subspace spanned by the basis functions of the FEM, via both $L^2$- and
$H^1$-norms, the former (and simpler) of which always delivers an orthogonal projection,
and the latter of which sometimes does. But we will open the discussion with the
∞-dimensional case, partly for completeness, in some sense.
A.3.2.1 The $L^2$-Projection, $p_0$
Suppose we are given a function u(x) in $L^2$ and are asked to find the closest function to
it within some (given) linear space of functions that is a subset of $L^2$, say S, with $u \notin S$.
Calling such a function $u_0(x)$ (assuming such a function exists) leads to the following
equivalent statements (choose the one that suits/pleases you):
(i) Find $\inf_{v \in S} \|u - v\|_0$
(ii) $\|u - u_0\|_0 = \inf_{v \in S} \|u - v\|_0$
(iii) $\|u - u_0\|_0 \le \|u - v\|_0\ \ \forall v \in S$. (A.3.2-1)
Thus, to find $u_0$, let $v \in S$, introduce the functional
$$F_0(v) = \frac{1}{2}\int (v - u)^2, \tag{A.3.2-2}$$
and try to minimize this functional by varying v within S:
$$\delta F_0(v) = 0 = \int (v - u)\,\delta v. \tag{A.3.2-3}$$
(Note that $\delta^2 F_0(v) = \int \delta v\,\delta v > 0$, and thus the extremum is indeed a minimum; i.e., the
greatest lower bound.) Now suppose that we have a basis for S; i.e., a set of linearly
independent functions, $\{v_n,\ n = 1, 2, \ldots\}$, that spans the space and thus permits us to solve
(A.3.2-3); any function in S can be represented by a linear combination of the $\{v_n\}$. Thus,
we can represent v and $\delta v$ (an arbitrary variation of v) as follows: $v = \sum_{n=1}^{\infty} a_n v_n(x)$ and
$\delta v = \sum_{m=1}^{\infty} b_m v_m$, from which (A.3.2-3) gives
$$\sum_{m=1}^{\infty} b_m \int \left(\sum_{n=1}^{\infty} a_n v_n - u\right) v_m = 0, \tag{A.3.2-4}$$
a result that can only hold for all $\delta v$ if each coefficient of $b_m$ vanishes (the values of $\{b_m\}$
are arbitrary):
$$\int \left(\sum_{n=1}^{\infty} a_n v_n - u\right) v_m = 0, \quad m = 1, 2, \ldots, \infty, \tag{A.3.2-5}$$
showing that the error (u - v) is orthogonal (in $L^2$) to the basis (the projection of the
error is zero), which actually defines Galerkin's method. Rearrangement gives
$$\sum_{n=1}^{\infty} a_n \int v_m v_n = \int u\,v_m, \quad m = 1, 2, \ldots, \infty, \tag{A.3.2-6}$$
an infinite set of linear equations for the amplitude coefficients, $\{a_n\}$.
[Exercise for the reader: Show that the solution of (A.3.2-6), and therefore that of (A.3.2-3), minimizes (A.3.2-2).]
To make further progress, to simplify the notation, and to identify our first projection,
we invoke the fact that any linearly independent set of functions can be ortho-normalized
(via the Gram-Schmidt procedure, for example), and we suppose this to have been done
to the v's; i.e., we have $\int v_m v_n = \delta_{mn}$ so that (A.3.2-6) becomes, simply, $a_n = \int u v_n =
(u, v_n)$, and our solution, $v = u_0$, is then just
$$u_0(x) = \sum_{n=1}^{\infty} (u, v_n)\,v_n(x) = p_0 u(x), \tag{A.3.2-7}$$
where $(u, v) = \int uv$ is the $L^2$-inner product: the closest function to u(x) in S, $u_0(x)$, is
the $L^2$-projection of the given function u(x) onto the basis functions spanning S. [The
amplitude coefficient, $(u, v_n)$, is the projection of u(x) in the direction of (onto) $v_n(x)$.]
To prove that (A.3.2-7) defines (implicitly) a projection, we simply project again to see
if $p_0^2 = p_0$. Thus,
$$p_0 u_0(x) = p_0^2 u(x) = \sum_{n=1}^{\infty}\left(\sum_{m=1}^{\infty} (u, v_m)\,v_m(x),\ v_n(x)\right) v_n(x)
= \sum_{m,n=1}^{\infty} (u, v_m)(v_m, v_n)\,v_n(x)
= \sum_{m,n=1}^{\infty} (u, v_m)\,\delta_{mn}\,v_n(x)
= \sum_{n=1}^{\infty} (u, v_n)\,v_n(x) = u_0(x) = p_0 u(x); \ \text{QED}. \tag{A.3.2-8}$$
Finally, to complete our introductory example, we test orthogonality to see if our
$L^2$-projection is indeed 'closest': since $u_0 = p_0 u$ and $Q_0 = I - p_0$, the 'remainder' of the
projection is $Q_0 u$; i.e., $u = p_0 u + Q_0 u$, and we now wish to see if the Pythagorean theorem
is satisfied in the sense of vectors in a vector space: does $\|u\|_0^2 = \|p_0 u\|_0^2 + \|Q_0 u\|_0^2 =
\|u_0\|_0^2 + \|u - u_0\|_0^2$? A direct calculation yields
$$\int u^2 = \int (p_0 u + Q_0 u)^2 = \int \left[(p_0 u)^2 + (Q_0 u)^2\right] + 2\int p_0 u\, Q_0 u
= \|p_0 u\|_0^2 + \|Q_0 u\|_0^2 + 2(p_0 u, Q_0 u)
= \|p_0 u\|_0^2 + \|Q_0 u\|_0^2 + 2\int p_0 u\,(I - p_0)u; \tag{A.3.2-9}$$
i.e.,
$$\|u\|_0^2 = \|p_0 u\|_0^2 + \|Q_0 u\|_0^2 + 2\int \left[u\,p_0 u - (p_0 u)^2\right], \tag{A.3.2-10}$$
and we do have $\perp_0$ (read $\perp_0$ as 'orthogonality in $L^2$') because $\int u\,p_0 u = (u, u_0) =
(u_0, u) = \sum_n (u, v_n)(v_n, u) = \sum_n (u, v_n)^2$ and $\int (p_0 u)^2 = (u_0, u_0) = \sum_{m,n} ((u, v_m)v_m,
(u, v_n)v_n) = \sum_{m,n} (u, v_m)(u, v_n)(v_m, v_n) = \sum_n (u, v_n)^2$; i.e., we have shown that $p_0 u \perp_0
Q_0 u$, or, equivalently, $(p_0 u, Q_0 u) = 0$.
Fig. A.3.2-1 An $L^2$-orthogonal projection. [Sketch: u decomposed into $u_0 = p_0 u$ along $R(p_0)$, of length $\|u_0\|_0$, and $Q_0 u$ along $N(p_0)$, of length $\|Q_0 u\|_0$.]
The sketch in Figure A.3.2-1 thus applies ($N \perp R$).
Final remarks:
(1) All of the above discussion still applies if the space S is finite-dimensional; just
change the upper limit on the sums from ∞ to N.
(2) $u_0 = p_0 u$ is the closest function to u (in the $L^2$-norm) in the subspace $S \subset L^2$ spanned
by $\{v_n\}$; a best fit in $L^2$.
(3) Recalling (A.3.2-2) makes it clear that this $L^2$-projection is also a least-squares best
fit to u(x) via $\{v_n(x)\}$; the $L^2$ best fit is also the least-squares best fit.
(4) The 'error,' $u - p_0 u = Q_0 u$, is $\perp_0$ ('$L^2$-orthogonal') to the subspace, $(u - u_0, v_n) = 0\ \forall n$,
which is another way to say that $u_0$ is a best approximation.
(5) For most cases of interest herein, the constant functions belong to S, from which it
follows that the total 'mass' of u (or, equivalently, its average value) is preserved
by this projection: $\int u_0 = \int u$. (Take $v = 1 = \sum_m c_m v_m$ as the test function.)
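The properties collected in these remarks can be observed on a small finite-dimensional case. The sketch below is an illustration under assumptions not taken from the text: S is spanned by the first three orthonormalized Legendre polynomials on [-1, 1], the function projected is $e^x$, integrals are approximated by composite Simpson quadrature, and all names are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


# Orthonormal basis of S on [-1, 1]: scaled Legendre polynomials of degree 0, 1, 2.
BASIS = [
    lambda x: math.sqrt(0.5),
    lambda x: math.sqrt(1.5) * x,
    lambda x: math.sqrt(2.5) * 0.5 * (3.0 * x * x - 1.0),
]


def inner(f, g):
    """L2 inner product (f, g) on [-1, 1]."""
    return simpson(lambda x: f(x) * g(x), -1.0, 1.0)


def project(u):
    """Return the amplitude coefficients (u, v_n) and the projected function u_0."""
    coeffs = [inner(u, v) for v in BASIS]
    return coeffs, (lambda x: sum(c * v(x) for c, v in zip(coeffs, BASIS)))


u = math.exp
coeffs, u0 = project(u)
coeffs_again, _ = project(u0)                  # project a second time


def error(x):
    return u(x) - u0(x)                        # Q_0 u


norm_u_sq = inner(u, u)
norm_p_sq = inner(u0, u0)
norm_q_sq = inner(error, error)
cross_term = inner(u0, error)                  # (p_0 u, Q_0 u)
mass_u = simpson(u, -1.0, 1.0)
mass_u0 = simpson(u0, -1.0, 1.0)
```

The second projection returns the same coefficients, the cross term vanishes, the Pythagorean relation holds, and, because the constant function lies in S, the integral of $u_0$ matches that of u.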
A.3.2.2 The $L^2$-projection, $p_0^h$
It is a simple matter to specialize the above (somewhat abstract) $L^2$-projection to a specific
finite-dimensional version via the FEM basis functions, $\{\phi_i,\ i = 1, 2, \ldots, N\} \in S^h \subset L^2$:
return to (A.3.2-3), replace v(x) by $u_0^h(x) = \sum_{j=1}^{N} u_j\phi_j(x)$ and $\delta v$ by $\phi_i$, where i varies
from one to N, to obtain
$$\delta F_0 = 0 = \int \left(\sum_{j=1}^{N} u_j\phi_j - u\right)\phi_i, \tag{A.3.2-11}$$
or
$$\sum_{j=1}^{N} u_j \int \phi_j\phi_i = \int u\,\phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.2-12}$$
or, introducing an efficient matrix-vector terminology that will be more useful later,
$$M\mathbf{u} = \mathbf{b}, \tag{A.3.2-13}$$
where $M = \int \boldsymbol{\phi}\boldsymbol{\phi}^T$, or $M_{ij} = \int \phi_i\phi_j$, is the (SPD) N x N mass matrix, $\mathbf{u} = (u_1 \ldots u_N)^T$ is
the N-vector of nodal coefficients (to be determined), and $\mathbf{b} = \mathbf{b}(u) = \int u\,\boldsymbol{\phi}$, or $b_j(u) =
\int u\,\phi_j = (u, \phi_j)$, where $\boldsymbol{\phi}(x)$ is an N-vector of the basis functions, $\{\phi_i\}$.
Solving the above linear system yields the amplitude coefficients of the projection, $\{u_i\}$.
The projection is completed via the basis function expansion:
$$u_0^h(x) = \boldsymbol{\phi}^T(x)\mathbf{u} = \boldsymbol{\phi}^T(x)M^{-1}\mathbf{b}(u) \equiv p_0^h u(x) = \sum_j u_j\phi_j(x) = \sum_j \left[M^{-1}\mathbf{b}(u)\right]_j \phi_j(x). \tag{A.3.2-14}$$
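Thus, we have, ostensibly, derived and described the $L^2$-projection of the GFEM. All
that remains is to prove that it is an orthogonal projection; and we begin by showing that
$$p_0^h u_0^h(x) = (p_0^h)^2 u(x) = \boldsymbol{\phi}^T(x)M^{-1}\mathbf{b}[u_0^h(x)]
= \boldsymbol{\phi}^T(x)M^{-1}\mathbf{b}[\boldsymbol{\phi}^T(x)M^{-1}\mathbf{b}(u)]
= \boldsymbol{\phi}^T(x)M^{-1}\int [\boldsymbol{\phi}^T(x)M^{-1}\mathbf{b}(u)]\,\boldsymbol{\phi}(x). \tag{A.3.2-15}$$
Noting that $[\boldsymbol{\phi}^T(x)M^{-1}\mathbf{b}(u)]$ is a scalar leads to
$$p_0^h u_0^h(x) = \boldsymbol{\phi}^T(x)M^{-1}\int \boldsymbol{\phi}(x)[\boldsymbol{\phi}^T(x)M^{-1}\mathbf{b}(u)]
= \boldsymbol{\phi}^T(x)M^{-1}\left[\int \boldsymbol{\phi}(x)\boldsymbol{\phi}^T(x)\right]M^{-1}\mathbf{b}(u)
= \boldsymbol{\phi}^T(x)M^{-1}\mathbf{b}(u) = u_0^h(x); \tag{A.3.2-16}$$
$p_0^h$ is a projection operator.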
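To prove that the projection is $\perp_0$, we need to show that $D = \int p_0^h u\,(I - p_0^h)u =
(p_0^h u, Q_0^h u) = 0$, and, for 'variety', we will do it slightly differently: we have $p_0^h u =
\boldsymbol{\phi}^T(x)M^{-1}\mathbf{b}(u) = \mathbf{b}^T(u)M^{-1}\boldsymbol{\phi}(x)$, and thus
$$D = \int u\,p_0^h u - \int p_0^h u\,p_0^h u
= \int u\,\mathbf{b}^T(u)M^{-1}\boldsymbol{\phi}(x) - \int \mathbf{b}^T(u)M^{-1}\boldsymbol{\phi}(x)\boldsymbol{\phi}^T(x)M^{-1}\mathbf{b}(u)
= \mathbf{b}^T(u)M^{-1}\int u\,\boldsymbol{\phi} - \mathbf{b}^T(u)M^{-1}\left[\int \boldsymbol{\phi}(x)\boldsymbol{\phi}^T(x)\right]M^{-1}\mathbf{b}(u)
= \mathbf{b}^T(u)M^{-1}\mathbf{b}(u) - \mathbf{b}^T(u)M^{-1}MM^{-1}\mathbf{b}(u) = 0. \ \text{QED}. \tag{A.3.2-17}$$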
Remarks:
(1) Note that here we obtained an orthogonal projection with basis functions that are
merely linearly independent but are not themselves orthogonal.
906 PROJECTIONS, ORTHOGONAL AND NOT-AND PROJECTION METHODS
(2) If the mass is lumped in (A.3.2-13), a common trick in some FEM applications, the solution is particularly simple, $u_i = \int u\,\phi_i \big/ \int \phi_i$, a basis function-weighted average value of $u(x)$ at node $i$; it is important to realize, however, that the result is no longer a projection. But it is at least 'close' to a projection. To see this, project again to obtain
$$p_0^h u_0^h(x) = \boldsymbol{\varphi}^T(x)\,M_L^{-1}MM_L^{-1}\mathbf{b}(u) = \boldsymbol{\varphi}^T(x)\,M_L^{-1}\mathbf{b}(u) + O(h^p) = u_0^h(x) + O(h^p),$$
where $M_L$ is the lumped mass matrix, and we have used the 'sloppy' interpretation that says $MM_L^{-1} = I + O(h^p)$, where $h$ is element 'length' and $p > 0$ depends on mesh and basis functions. [In fact, $MM_L^{-1}\mathbf{u} = \mathbf{u} + O(h^p)$ for a sufficiently smooth $u$.] The lumped mass 'projection' is only as close to a true projection as $M_L$ is to $M$.
(3) Lumping loses the 'best approximation' property as well—obviously.
(4) If $u(x)$ is replaced/approximated by its interpolant into the basis set, $\{\phi_i\}$, in (A.3.2-11) through (A.3.2-13), then the projection degenerates to the interpolation projection introduced previously (Section A.3.1), which is indeed a projection; it is just not an $L^2$-projection (of $u$; $p_I u$ is an $L^2$-projection of the interpolant of $u$, which is again the interpolant because it is already in the subspace). With finite elements, $L^2$-projections always involve the consistent mass matrix.
(5) If the approximations in both (2) and (4) above are invoked and then the mass lumped also on the RHS, the simple interpolation projection, $p_I u$, is again obtained.
(6) In practice, it is often (usually, probably) the case that $u(x) \in H^1$, a subspace of $L^2$.
(7) The simplest possible $L^2$-projection via FEM is the expansion of $u^h$ via piecewise-constant basis functions; in this case, $M$ is diagonal and $u_0^h(x)$ corresponds to element average values of $u(x)$.
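Remark (2) can be illustrated with a short experiment. This is a sketch only, assuming uniform 1D linear elements on (0, 1), the smooth test function cos(pi x), and Simpson's rule for the load vector; `mass_matvec` and `lumped_gap` are hypothetical helper names. It applies the lumped-mass 'projection' twice and records how much the nodal amplitudes change on the second pass.

```python
import math

def mass_matvec(a, h):
    """Apply the consistent mass matrix of uniform 1D linear elements to a."""
    n = len(a) - 1
    out = [0.0] * (n + 1)
    for e in range(n):
        out[e] += h * (a[e] / 3.0 + a[e + 1] / 6.0)
        out[e + 1] += h * (a[e] / 6.0 + a[e + 1] / 3.0)
    return out

def lumped_gap(n, u=lambda x: math.cos(math.pi * x)):
    """Max nodal change when the lumped-mass 'projection' is applied a second time."""
    h = 1.0 / n
    lumped = [h / 2.0] + [h] * (n - 1) + [h / 2.0]   # row sums of M, i.e. int phi_i
    b = [0.0] * (n + 1)                              # b_j = int u phi_j (Simpson)
    for e in range(n):
        xl = e * h
        ul, um, ur = u(xl), u(xl + 0.5 * h), u(xl + h)
        b[e] += h / 6.0 * (ul + 2.0 * um)
        b[e + 1] += h / 6.0 * (2.0 * um + ur)
    a1 = [bj / mj for bj, mj in zip(b, lumped)]                    # first pass
    a2 = [bj / mj for bj, mj in zip(mass_matvec(a1, h), lumped)]   # second pass
    return max(abs(p - q) for p, q in zip(a1, a2))

gaps = [lumped_gap(n) for n in (8, 16, 32)]
```

The second pass does change the result (so the operator is not idempotent), but the change shrinks as the mesh is refined, which is the sense in which the lumped result is 'close' to a projection.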
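We conclude this section by restating this $L^2$-projection in a different way:
$$u_0^h(x) = \boldsymbol{\varphi}^T(x)\,M^{-1}\mathbf{b}(u) = \mathbf{b}^T(u)\,M^{-1}\boldsymbol{\varphi}(x) = \mathbf{b}^T(u)\,\tilde{\boldsymbol{\varphi}}(x) = \tilde{\boldsymbol{\varphi}}^T(x)\,\mathbf{b}(u),$$
where
$$\tilde{\boldsymbol{\varphi}}(x) = M^{-1}\boldsymbol{\varphi}(x), \tag{A.3.2-18}$$
or
$$\tilde{\phi}_i(x) = \sum_j (M^{-1})_{ij}\,\phi_j(x), \quad i = 1, 2, \ldots, N, \tag{A.3.2-19}$$
are the so-called conjugate (or dual) basis functions; the nodal values of the $i$-th element of $\tilde{\boldsymbol{\varphi}}(x)$, $\tilde{\phi}_i(x_k)$ for $k = 1, 2, \ldots, N$, are just the $i$-th row (or column) of $M^{-1}$ (none of which is zero in general). Each dual basis function, $\tilde{\phi}_i(x)$, is a linear combination of all of the FEM basis functions and is therefore a truly global basis function (i.e., a basis function with global support) that is $L^2$-orthogonal to the conventional FEM basis functions;
$$\int \tilde{\boldsymbol{\varphi}}\,\boldsymbol{\varphi}^T = I. \tag{A.3.2-20}$$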
(The conventional basis functions are not an orthogonal set—hence, the non-diagonal
mass matrix; the basis functions and the conjugate basis functions form a bi-orthogonal
set.) See Oden (1972) for a detailed discussion of these conjugate functions—including
pictures; and further discussion is presented in Oden and Reddy (1976). Thus, recalling
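(A.3.2-14), the simplest 'projection-looking' representation of $u_0^h(x)$ is either
$$\begin{aligned}
\text{(i)}\quad u_0^h(x) &= \boldsymbol{\varphi}^T(x)\,M^{-1}\mathbf{b}(u) = \boldsymbol{\varphi}^T(x)\,M^{-1}(u, \boldsymbol{\varphi}) = \boldsymbol{\varphi}^T(x)\,(u, M^{-1}\boldsymbol{\varphi}) \\
&= \boldsymbol{\varphi}^T(x)\,(u, \tilde{\boldsymbol{\varphi}}) = \sum_j (u, \tilde{\phi}_j)\,\phi_j(x) = \sum_j u_j\,\phi_j(x),
\end{aligned} \tag{A.3.2-21}$$
or
$$\text{(ii)}\quad u_0^h(x) = \tilde{\boldsymbol{\varphi}}^T(x)\,\mathbf{b}(u) = \tilde{\boldsymbol{\varphi}}^T(u, \boldsymbol{\varphi}) = \sum_j (u, \phi_j)\,\tilde{\phi}_j(x), \tag{A.3.2-22}$$
where bi-orthogonality yields $(u_0^h, \tilde{\phi}_i) = (u, \tilde{\phi}_i)$ from (i) and $(u_0^h, \phi_i) = (u, \phi_i)$ from (ii); thus, the error $(u - p_0^h u)$ is, appropriately, orthogonal to both sets of basis functions:
$$(u - u_0^h, \phi_i) = 0 = (u - u_0^h, \tilde{\phi}_i) \quad \forall i.$$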
Although Oden (1972) has already plotted some linear conjugate basis functions for both 1D and 2D, we shall show a few more here, including quadratics. To use (A.3.2-18), we first multiply through by $M$ and then pick a mesh node, say $k$, to get
$$M\tilde{\boldsymbol{\varphi}}(x_k) = \boldsymbol{\varphi}(x_k) \tag{A.3.2-23}$$
or
$$\sum_{j=1}^{N} M_{ij}\,\tilde{\phi}_j(x_k) = \phi_i(x_k) = \delta_{ik}, \quad i = 1, 2, \ldots, N, \tag{A.3.2-24}$$
where $\delta_{ik}$ is the Kronecker delta. Letting $k$ range over all nodes gives the matrix equation
$$M\Phi = I, \tag{A.3.2-25}$$
where each matrix is $N \times N$. Each column in the (symmetric) $\Phi$-matrix is a vector of nodal values for the corresponding conjugate basis function, and we see that $\Phi$ is just $M^{-1}$, as stated previously. Equation (A.3.2-25) is, of course, solved (for $N$ not too large) by 'factoring' $M$ and performing back-substitution against $\mathbf{e}_k = (0, \ldots, 1, 0, \ldots)^T$ for each $k$, where the one is in the $k$-th position of the $N$-vector $\mathbf{e}_k$. Once the $N^2$ values of $\Phi_{ij}$ are available, we can return to (A.3.2-19) to obtain
$$\tilde{\phi}_i(x) = \sum_{j=1}^{N} \Phi_{ij}\,\phi_j(x), \tag{A.3.2-26}$$
which we plot for several values of $i$ in Figure A.3.2-2 for linears and in Figure A.3.2-3 for quads, each normalized to 1.0 for plotting convenience. The actual peak nodal amplitudes are $\tilde{\phi}_k(x_k) = 41.6$, 22.3, 20.8, and 20.8 for $k = 1, 2, 6, 9$ for linears, and $\tilde{\phi}_k(x_k) = 50.9$, 12.7, 12.2, and 25.5 for quads. [The asymptotic peak amplitude ($h \to 0$) for linears is $\sqrt{3}/h$ (Oden, 1972) in general, and 20.785 for this example, for internal nodes. It is twice that value for nodes 1 and $N$.] The piecewise parabolas in Figure A.3.2-3
were obtained element-by-element with 20 plotting increments per element, using the
conventional quadratic basis functions for interpolation.
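The recipe in (A.3.2-23) through (A.3.2-26) is easy to reproduce for the linear case. The sketch below assumes the 12-uniform-element mesh on (0, 1) of Figure A.3.2-2, with nodes numbered 1 to 13 from left to right; the helper names are illustrative. It solves $M\Phi = I$ one column at a time and reads off the peak nodal amplitudes on the diagonal.

```python
import math

def mass_matrix(n):
    """Consistent mass matrix for n uniform linear elements on (0, 1)."""
    h = 1.0 / n
    M = [[0.0] * (n + 1) for _ in range(n + 1)]
    for e in range(n):
        M[e][e] += h / 3.0
        M[e + 1][e + 1] += h / 3.0
        M[e][e + 1] += h / 6.0
        M[e + 1][e] += h / 6.0
    return M

def solve(A, rhs):
    """Gaussian elimination with partial pivoting; returns x such that A x = rhs."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

n_elem = 12
N = n_elem + 1
M = mass_matrix(n_elem)

# Column k of Phi holds the nodal values of the k-th conjugate basis function.
columns = [solve(M, [1.0 if i == k else 0.0 for i in range(N)]) for k in range(N)]
Phi = [[columns[k][i] for k in range(N)] for i in range(N)]

# Peak nodal amplitudes for nodes 1, 2, 6, 9 (1-based numbering, as in the text).
peaks = {node: Phi[node - 1][node - 1] for node in (1, 2, 6, 9)}
asymptotic_internal = math.sqrt(3.0) * n_elem   # sqrt(3)/h with h = 1/12

# Bi-orthogonality check: M Phi should be the identity.
biorth_error = max(
    abs(sum(M[i][j] * Phi[j][k] for j in range(N)) - (1.0 if i == k else 0.0))
    for i in range(N) for k in range(N)
)
```

The entries of `peaks` can be compared with the amplitudes quoted above, and `asymptotic_internal` with the $\sqrt{3}/h$ estimate.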
Fig. A.3.2-2 M-conjugate linear basis functions for nodes 1, 2, 6, and 9 in a 12-uniform-element mesh.
Fig. A.3.2-3 Same as Figure A.3.2-2 except for six quadratic elements.
Moving to 2D, Figures A.3.2-4 and A.3.2-5 show the bilinear basis functions at five 'different types' of nodes on a $13 \times 13 = 169$ node mesh, and their corresponding duals. They are plotted on an $85 \times 85$ node mesh via bilinear basis function interpolation to provide better clarity, and they are normalized to unit amplitude. In these plots and in those to follow, the height of the 'base' is 10% of the full range of the plotted function plus the absolute value of the function's minimum value. [The true (asymptotic) amplitudes of the dual functions are $3/lh$ for internal nodes (4-patch), $6/lh$ for non-corner boundary nodes, and $12/lh$ for corner nodes, where $l \times h$ is the element dimension.] Figure A.3.2-6 shows the biquadratic basis functions and their conjugates for the three types of internal nodes, on the same $13 \times 13$ mesh plotted (via biquadratic basis function interpolation) on an $85 \times 85$ mesh. To complete the pictures, Figures A.3.2-7 and A.3.2-8 show the corresponding biquadratic functions for nodes at or near a boundary and nodes at or near a corner, respectively. We have no more to say except that the local minima for quads are generally not at nodes and that the biorthogonality property now seems (more-or-less) 'obvious' (!).
Fig. A.3.2-4 Basis functions and dual (in $L^2$) basis functions for bilinear elements (4-patches). (a) $\phi(x)$ for a typical internal node; (b) $\tilde{\phi}(x)$ corresponding to (a), maximum = 432; (c) $\phi(x)$ for one node in from boundary; (d) $\tilde{\phi}(x)$ corresponding to (c), maximum = 463; (e) $\phi(x)$ for one node in from corner; (f) $\tilde{\phi}(x)$ corresponding to (e), maximum = 496.
Fig. A.3.2-5 Same as Figure A.3.2-4, but for two boundary nodes. (a) $\phi(x)$ for a typical boundary node (2-patch); (b) $\tilde{\phi}(x)$ corresponding to (a), maximum = 864; (c) $\phi(x)$ for a corner node (1-patch); (d) $\tilde{\phi}(x)$ corresponding to (c), maximum = 1728.
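The maxima quoted for the bilinear duals can be recovered from 1D data alone. The sketch below rests on two assumptions that the quoted values are consistent with but that are stated here only for illustration: the mesh is a uniform $12 \times 12$ element mesh of the unit square, and the bilinear mass matrix is the Kronecker product of two 1D linear mass matrices, so that each diagonal entry of its inverse is a product of two 1D diagonal entries. Helper names are hypothetical.

```python
def mass_matrix(n):
    """Consistent 1D mass matrix for n uniform linear elements on (0, 1)."""
    h = 1.0 / n
    M = [[0.0] * (n + 1) for _ in range(n + 1)]
    for e in range(n):
        M[e][e] += h / 3.0
        M[e + 1][e + 1] += h / 3.0
        M[e][e + 1] += h / 6.0
        M[e + 1][e] += h / 6.0
    return M

def solve(A, rhs):
    """Gaussian elimination with partial pivoting; returns x such that A x = rhs."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

n_elem = 12
N = n_elem + 1
M = mass_matrix(n_elem)

# d[k] = (M^{-1})_kk, the 1D peak amplitude at node k (0-based).
d = [solve(M, [1.0 if i == k else 0.0 for i in range(N)])[k] for k in range(N)]

mid = N // 2   # a node far from both ends
peaks_2d = {
    "internal (4-patch)": d[mid] * d[mid],
    "one node in from boundary": d[1] * d[mid],
    "one node in from corner": d[1] * d[1],
    "boundary (2-patch)": d[0] * d[mid],
    "corner (1-patch)": d[0] * d[0],
}
```

Under these assumptions the five products can be compared with the maxima listed in the captions of Figures A.3.2-4 and A.3.2-5, and with the asymptotic values $3/lh$, $6/lh$, and $12/lh$ for $l = h = 1/12$.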
To conclude this section, we examine $u(x) = \delta(x - x_k)$, the Dirac delta function 'centered' at node $k$. (A.3.2-12) then gives
$$\sum_j u_j^{(k)} \int \phi_i\,\phi_j = \int \phi_i\,\delta(x - x_k) = \phi_i(x_k) = \delta_{ik}, \tag{A.3.2-27}$$
the Kronecker delta, which, upon comparison with (A.3.2-24), shows that $\tilde{\phi}_j(x_k) = u_j^{(k)}$; the dual basis functions are also the $L^2$-projections of the Dirac delta function.
Fig. A.3.2-6 Basis functions and their $L^2$-duals for quads. (a) $\phi(x)$ for a 4-patch internal node; (b) $\tilde{\phi}(x)$ corresponding to (a), maximum = 648; (c) $\phi(x)$ for a 2-patch internal node; (d) $\tilde{\phi}(x)$ corresponding to (c), maximum = 310; (e) $\phi(x)$ for a central node (1-patch); (f) $\tilde{\phi}(x)$ corresponding to (e), maximum = 148.
A.3.2.3 The $H^1$-Projection, $p_1$
Instead of the given function being merely square-integrable [i.e., $u(x) \in L^2$], we now consider a given function $u(x)$ that is smoother; its gradient is also in $L^2$. Thus, we now consider $u(x) \in H^1$, a smaller space than $L^2$ (a subspace of it, actually), so that both $\int u^2 < \infty$ and $\int \nabla u \cdot \nabla u = \|u\|_1^2 = \|\nabla u\|_0^2 < \infty$. (We shall adopt the common-but-sloppy terminology of the $H^1$-norm, a semi-norm actually, by precluding constant functions from our $H^1$-subspace, except in special cases; see below.) This additional a priori smoothness allows one to seek a 'nearby' function in a space of functions that is a subspace of $H^1$ such that its gradient is as close as possible, in some sense, to the gradient of $u(x)$.
Fig. A.3.2-7 Same as Figure A.3.2-6, but for nodes at or near a boundary (2-patches). (a) $\phi(x)$ for a boundary node; (b) $\tilde{\phi}(x)$ corresponding to (a), maximum = 1296; (c) $\phi(x)$ for 1 node in from boundary; (d) $\tilde{\phi}(x)$ corresponding to (c), maximum = 342.
Rather than considering 'general' cases beyond our interest, we shall ab initio restrict our attention to subspaces of functions in $\Omega$ that take on a specified (boundary) value on a portion of $\partial\Omega = \Gamma$, say $\Gamma_D$; and this value will be $u(x)|_{\Gamma_D} = u_D \neq 0$ in general, the value of the given function evaluated on $\Gamma_D$ (called 'the trace' of the function in the functional analysis literature). We do this because these projections and the related projection methods are related to elliptic BVP's, in which at least a portion of the BC is usually of the Dirichlet type.
Next, we introduce the subspace of admissible functions, a constrained and non-linear subspace of $H^1$ that, by fiat, does not contain $u$:
$$V_E = w + V_0, \tag{A.3.2-28}$$
Fig. A.3.2-8 Same as Figure A.3.2-6, but for nodes at or near a corner (1-patch). (a) $\phi(x)$ for a corner node; (b) $\tilde{\phi}(x)$ corresponding to (a), maximum = 2592; (c) $\phi(x)$ for 1 node in from corner; (d) $\tilde{\phi}(x)$ corresponding to (c), maximum = 162.
where $V_0$ is a linear vector space of functions that vanish on $\Gamma_D$, and $w$ is any $H^1$-function (other than $u$!) that takes on the value $u_D$ on $\Gamma_D$, the essential 'boundary condition.' $w$ is called an $H^1$-extension of $u_D$ into $\Omega$. The space $V_E$ is constrained because every function therein must agree with $u_D$ on $\Gamma_D$, and it is non-linear because the sum of two functions in the subspace, on $\Gamma_D$, is $2u_D$ and is thus no longer in the subspace. If $v \in V_E$, then $v - w \in V_0$; the difference between any admissible function and the chosen $H^1$-extension of $u_D$ does lie in a linear vector space. Note that the 'large' (possibly $\infty$-dimensional) subspace $V_E$ is constructed by adding a single function to another large subspace. Note too that changing $w$ to some other $H^1$-extension of $u_D$ changes $V_E$.
In this $H^1$-case, we call the objective function $u_1(x)$ [rather than $u_0(x)$ for the $H^0$-norm] and assert that the following are equivalent statements of the appropriate variational problem, which will lead us to the $H^1$-projection:
$$\begin{aligned}
&\text{(i)}\quad \text{Find } \inf_{v \in V_E} \|u - v\|_1 \\
&\text{(ii)}\quad \|u - u_1\|_1 = \inf_{v \in V_E} \|u - v\|_1 \\
&\text{(iii)}\quad \|u - u_1\|_1 \le \|u - v\|_1 \quad \forall v \in V_E,
\end{aligned} \tag{A.3.2-29}$$
where $V_E \subset H^1$ such that if $v \in V_E$, then $v \in H^1$ and $v = u_D$ on $\Gamma_D$; it satisfies the Essential BC.
To find $u_1(x)$, let us introduce the functional [cf. (A.3.2-2)]
$$F_1(v) = \tfrac{1}{2}\int \nabla(v - u) \cdot \nabla(v - u) \tag{A.3.2-30}$$
and seek a minimum by varying $v \in V_E$:
$$\delta F_1(v) = 0 = \int \nabla(v - u) \cdot \nabla\delta v. \tag{A.3.2-31}$$
Remarks:
(1) It is important to remember that this is a constrained variational problem, the constraint being that the admissible class of functions over which the minimization is sought is constrained by the requirement that each must take on a particular value on $\Gamma_D$ [that of $u(x)$ evaluated there]; and this fact will be seen to affect ('mess up' a bit, actually) our concept of orthogonality.
(2) $\delta^2 F_1(v) = \int \nabla\delta v \cdot \nabla\delta v > 0$, and therefore our stationary point is, as desired, a minimum.
To make further progress, we proceed (somewhat) as in the $L^2$-case: we first imagine that we have a basis (orthonormal in $H^1$, for convenience) for $V_0$, say $\{v_n, n = 1, 2, \ldots\}$. Thus, we can perform the following expansions for $v \in V_E$ and $\delta v \in V_0$: (i) $v(x) = w(x) + \sum_{n=1}^{\infty} a_n v_n(x)$ with $\{a_n\}$ to be determined, and (ii) $\delta v(x) = \sum_{m=1}^{\infty} b_m v_m(x)$ with $\{b_m\}$ arbitrary.
Proceeding again as in the $L^2$-case easily leads to, from (A.3.2-31),
$$\sum_{m=1}^{\infty} b_m \int \nabla\left(\sum_{n=1}^{\infty} a_n v_n + w - u\right) \cdot \nabla v_m = 0$$
and then to
$$\sum_{n=1}^{\infty} a_n \int \nabla v_n \cdot \nabla v_m = \int \nabla(u - w) \cdot \nabla v_m, \quad m = 1, 2, \ldots, \infty, \tag{A.3.2-32}$$
an $\infty$-dimensional linear system for the $a_n$'s. Introducing the $H^1$-inner product
$$[u, v] = \int \nabla u \cdot \nabla v \tag{A.3.2-33}$$
and utilizing the assumed ($H^1$) orthogonality of the basis, which we shall refer to as $\perp_1$, gives $[v_n, v_m] = \delta_{mn}$ and thus $a_n = [u - w, v_n]$, and the final solution to (A.3.2-31) is
$$u_1(x) = v(x) = \sum_{n=1}^{\infty} [u - w, v_n]\,v_n(x) + w(x) \equiv p_1^w u(x), \tag{A.3.2-34}$$
and we have (we assert now and prove below) our $H^1$-projection, with $[u_1, v_n] = [u, v_n]$, or $[u - u_1, v_n] = 0\ \forall n$; the error is $\perp_1$ ($H^1$-orthogonal) to the basis, a la Galerkin. We have also found the associated (implicit) projection operator ($p_1^w$), for which we note the obvious interesting aspect: the Dirichlet BC is first 'subtracted off' and then 'added back,' a feature that is an essential part of the projection but which will be seen to cause $p_1^w$ to be 'suboptimal' in that it is not an $H^1$-orthogonal projection, even though the error is $\perp_1$ to the basis. Different $w$'s give different $u_1$'s, for the same $u$; hence, the superscript $w$. [See also p. 70 of Strang and Fix (1973).]
We now address the two questions: (i) is it really a projection?; and (ii) is it $\perp_1$?
The first question is answered, as usual, by applying the (alleged) projection a second time: from (A.3.2-34), it follows that
$$\begin{aligned}
p_1^w u_1(x) = (p_1^w)^2 u(x) &= \sum_{n=1}^{\infty} [u_1 - w, v_n]\,v_n(x) + w(x) \\
&= \sum_{n=1}^{\infty} \left[\sum_{m=1}^{\infty} [u - w, v_m]\,v_m(x),\ v_n(x)\right] v_n(x) + w(x) \\
&= \sum_{m,n=1}^{\infty} [u - w, v_m]\,[v_m, v_n]\,v_n(x) + w(x) \\
&= \sum_{m,n} [u - w, v_m]\,\delta_{mn}\,v_n(x) + w(x) \\
&= \sum_{n} [u - w, v_n]\,v_n(x) + w(x) = u_1(x),
\end{aligned} \tag{A.3.2-35}$$
and we do have a projection: $(p_1^w)^2 = p_1^w$.
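The second question is also answered by construction, testing ($H^1$) orthogonality a la Pythagoras: letting $Q_1^w u = (I - p_1^w)u$ leads (as usual) to $u = p_1^w u + Q_1^w u$, and $\perp_1$ would mean that $\|u\|_1^2 = \|p_1^w u\|_1^2 + \|Q_1^w u\|_1^2$. We obtain, however, the following:
$$\begin{aligned}
\|u\|_1^2 &= \|p_1^w u + Q_1^w u\|_1^2 \\
&= \int (\nabla p_1^w u + \nabla Q_1^w u) \cdot (\nabla p_1^w u + \nabla Q_1^w u) \\
&= \int \nabla p_1^w u \cdot \nabla p_1^w u + \int \nabla Q_1^w u \cdot \nabla Q_1^w u + 2\int \nabla p_1^w u \cdot \nabla Q_1^w u \\
&= \|p_1^w u\|_1^2 + \|Q_1^w u\|_1^2 + 2\int \nabla p_1^w u \cdot \nabla Q_1^w u,
\end{aligned} \tag{A.3.2-36}$$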
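from which $\perp_1 \Rightarrow$ the necessary condition, $\int \nabla p_1^w u \cdot \nabla(u - p_1^w u) = 0$, or $\int \nabla u \cdot \nabla p_1^w u = \int \nabla p_1^w u \cdot \nabla p_1^w u$ by setting $[p_1^w u, Q_1^w u] = 0$. To test this, we simply evaluate both sides:
$$\begin{aligned}
\text{LHS} &= \int \nabla u \cdot \nabla\left[\sum_{n=1}^{\infty} [u - w, v_n]\,v_n(x) + w(x)\right] \\
&= \int \left(\sum_{n=1}^{\infty} [u - w, v_n]\,\nabla u \cdot \nabla v_n + \nabla u \cdot \nabla w\right) \\
&= \sum_{n=1}^{\infty} [u - w, v_n]\,[u, v_n] + [u, w].
\end{aligned} \tag{A.3.2-37}$$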
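Next,
$$\begin{aligned}
\text{RHS} &= \int \nabla\left[\sum_{n=1}^{\infty} [u - w, v_n]\,v_n(x) + w(x)\right] \cdot \nabla\left[\sum_{m=1}^{\infty} [u - w, v_m]\,v_m(x) + w(x)\right] \\
&= \sum_{m,n=1}^{\infty} [u - w, v_n]\,[u - w, v_m]\,[v_n, v_m] + 2\int \nabla w \cdot \sum_{n=1}^{\infty} [u - w, v_n]\,\nabla v_n + \|w\|_1^2 \\
&= \sum_{n=1}^{\infty} [u - w, v_n]^2 + 2\sum_{n=1}^{\infty} [u - w, v_n]\,[w, v_n] + \|w\|_1^2.
\end{aligned} \tag{A.3.2-38}$$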
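Then, finally,
$$\begin{aligned}
\text{LHS} - \text{RHS} &= \left[u - w - \sum_{n=1}^{\infty} [u - w, v_n]\,v_n,\ w\right] \\
&= [u - w - (u_1 - w), w] = [u - u_1, w] = [u - p_1^w u, w] = [Q_1^w u, w]
\end{aligned} \tag{A.3.2-39}$$
to give
$$[p_1^w u, Q_1^w u] = \int \nabla p_1^w u \cdot \nabla Q_1^w u = [Q_1^w u, w], \tag{A.3.2-40}$$
and we are left with
$$\|u\|_1^2 = \|p_1^w u\|_1^2 + \|Q_1^w u\|_1^2 + 2[Q_1^w u, w] = \|u_1\|_1^2 + \|u - u_1\|_1^2 + 2[u - u_1, w]. \tag{A.3.2-41}$$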
We have $\perp_1$ in general only if $w = 0$; i.e., if $u = 0$ on $\Gamma_D$. More precisely, we also have $\perp_1$ if $Q_1^w u \perp_1 w$ or if $Q_1^w u = 0$ or if $\nabla Q_1^w u = 0$ or if $\nabla w = 0$, none of which warrants very serious consideration, except perhaps the last one ($u$ = constant on $\Gamma_D$) and one more: $\Gamma_D = \emptyset$. In the general case, however, $u \neq 0$ on $\Gamma_D$ (thus, $w \neq 0$), and we do not obtain a best approximation in $H^1$ even though our projection is a solution to the variational problem [(A.3.2-31)]; i.e., unless $u = 0$ on $\Gamma_D$, the solution $u_1(x)$ is not the closest function to $u(x)$; nor is it unique, varying as it does if we change $w$, even though the error is $H^1$-orthogonal to the basis. [The error is not $\perp_1$ to $w(x)$.]
Why is this? What is the closest function to $u(x)$? Is there a different projection (still in $H^1$) that is orthogonal? We will answer the second of these good questions second, after answering the last first: yes (see below). The closest function to $u(x)$ would be that which minimizes the functional in (A.3.2-30) without considering the constraint; $v(x)$ would be allowed to vary on $\Gamma_D$ just as it is on $\Gamma_N = \Gamma - \Gamma_D$. [Actually, however, this case is not quite well-posed in that then $F_1(v + c) = F_1(v)$: any constant could be added to $v$ without affecting the value of the functional. A 'standard' way around this non-uniqueness issue that is, in fact, analogous to the non-uniqueness associated with the classical Neumann BVP is to subtract the average value (over $\Omega$) of each function in the subspace from itself before 'using' it, a procedure that makes each resulting function $L^2$-orthogonal to all constants.] Thus, if $\Gamma_D = \emptyset$, the unconstrained projection will both truly minimize $F_1(u_1)$ and be $\perp_1$.
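Noteworthy is the fact that it is actually the constraint of an inhomogeneous Dirichlet BC that is the cause of loss of $\perp_1$; if $u(x)$ were zero on $\Gamma_D$, so too would be $w(x)$, and then the projection, $u_1(x)$, would be closest in $H^1$ even in the presence of the constraint $u = 0$ on $\Gamma_D$, a result that is obviously related to the fact that all of our 'test functions' also vanish on $\Gamma_D$. And this leads to the $\perp_1$-projection alluded to above that was, in fact, already hinted at [after (A.3.2-34)]: subtract off $w(x)$ before doing anything else. Thus, we now consider the modified problem: let $\tilde{u} = u - w$ (giving $\tilde{u} = 0$ on $\Gamma_D$) and seek $\tilde{v}(x) \in V_0$ from
$$\begin{aligned}
&\text{(i)}\quad \text{Find } \inf_{v \in V_0} \|\tilde{u} - v\|_1, \text{ or} \\
&\text{(ii)}\quad \|\tilde{u} - \tilde{v}\|_1 = \inf_{v \in V_0} \|\tilde{u} - v\|_1, \text{ or} \\
&\text{(iii)}\quad \|\tilde{u} - \tilde{v}\|_1 \le \|\tilde{u} - v\|_1 \quad \forall v \in V_0,
\end{aligned} \tag{A.3.2-42}$$
all of which are solved by finding the stationary (and minimum) point of
$$F_1(v) = \tfrac{1}{2}\int \nabla(v - \tilde{u}) \cdot \nabla(v - \tilde{u}) \tag{A.3.2-43}$$
over all $v \in V_0$ via $\delta F_1 = 0$, or
$$\int \nabla(\tilde{v} - \tilde{u}) \cdot \nabla\delta v = 0; \tag{A.3.2-44}$$
i.e., we seek the closest function to $u(x) - w(x)$ rather than the closest to $u(x)$. Proceeding as before leads to
$$\tilde{v}(x) = \sum_{n=1}^{\infty} [\tilde{u}, v_n]\,v_n(x) \equiv p_1\tilde{u}(x) = u_1 - w = p_1^w u - w, \tag{A.3.2-45}$$
and we have our 'different' projection. That $p_1^2 = p_1$ follows easily, and the next step, as usual, is to test for $\perp_1$, which is true if $\int \nabla\tilde{u} \cdot \nabla p_1\tilde{u} = \int \nabla p_1\tilde{u} \cdot \nabla p_1\tilde{u}$, which in this case yields:
$$\text{LHS} = \int \nabla\tilde{u} \cdot \nabla\left(\sum_n [\tilde{u}, v_n]\,v_n(x)\right) = \sum_n [\tilde{u}, v_n] \int \nabla\tilde{u} \cdot \nabla v_n = \sum_n [\tilde{u}, v_n]^2$$
and
$$\begin{aligned}
\text{RHS} &= \int \nabla\left(\sum_m [\tilde{u}, v_m]\,v_m\right) \cdot \nabla\left(\sum_n [\tilde{u}, v_n]\,v_n\right) \\
&= \sum_{m,n} [\tilde{u}, v_m]\,[\tilde{u}, v_n] \int \nabla v_m \cdot \nabla v_n \\
&= \sum_{m,n} [\tilde{u}, v_m]\,[\tilde{u}, v_n]\,\delta_{mn} = \sum_n [\tilde{u}, v_n]^2 = \text{LHS},
\end{aligned} \tag{A.3.2-46}$$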
and we do have $\perp_1$: $\tilde{v}(x)$ is an orthogonal $H^1$-projection of $\tilde{u}(x) = u(x) - w(x)$ in a Hilbert space; it minimizes the quadratic functional and is the closest function (in $H^1$) to $\tilde{u}(x)$. We shall even emphasize this fact by changing the name of $p_1$ to $p_1^\perp$, and note that the gradient of $\tilde{v}(x)$ is the closest (in $L^2$) to the gradient of $\tilde{u}(x)$.
Now, since $\tilde{v}(x)$ minimizes $\int \nabla(v + w - u) \cdot \nabla(v + w - u)$ and is the closest function to $(u - w)$, so it must follow, since $\tilde{v} = u_1 - w$, that $(u_1 - w)$ minimizes the same functional and is the closest function to $u - w$, giving $u_1 - w \perp_1 u_1 - u$; i.e., we have
$$u_1(x) = p_1^w u(x) = \tilde{v}(x) + w(x) = p_1^\perp\tilde{u}(x) + w(x) = p_1^\perp[u(x) - w(x)] + w(x), \tag{A.3.2-47}$$
and we see that our original $H^1$-projection, thanks to inhomogeneous Dirichlet BC's, is actually probably better described as an affine (but idempotent) transformation since, as noted earlier, different choices for $w(x)$ give different values of $u_1(x)$. But we shall be content to stay with the terminology of a non-orthogonal projection.
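A small exact computation may help make (A.3.2-40), (A.3.2-41), and (A.3.2-47) concrete. Everything in this sketch is an illustrative assumption rather than an example from the text: the domain is (0, 1) with Dirichlet data at both ends, $u = x^3$, the extension is $w = x^4$, and $V_0$ is truncated to the single (unnormalized) basis function $v = x(1 - x)$, so the coefficient is $[u - w, v]/[v, v]$. Polynomials are stored as coefficient lists and integrated exactly with rational arithmetic.

```python
from fractions import Fraction as Fr

def poly(*coeffs):
    """Polynomial c0 + c1 x + c2 x^2 + ... stored as a list of Fractions."""
    return [Fr(c) for c in coeffs]

def add(p, q, sign=1):
    """Return p + sign * q."""
    n = max(len(p), len(q))
    p = p + [Fr(0)] * (n - len(p))
    q = q + [Fr(0)] * (n - len(q))
    return [a + sign * b for a, b in zip(p, q)]

def scale(c, p):
    return [c * a for a in p]

def deriv(p):
    return [k * p[k] for k in range(1, len(p))] or [Fr(0)]

def h1(p, q):
    """H1 semi-inner product [p, q] = int_0^1 p' q' dx, evaluated exactly."""
    total = Fr(0)
    for i, a in enumerate(deriv(p)):
        for j, b in enumerate(deriv(q)):
            total += Fr(a) * Fr(b) / (i + j + 1)
    return total

u = poly(0, 0, 0, 1)        # u = x^3, so u = 0 at x = 0 and u = 1 at x = 1
w = poly(0, 0, 0, 0, 1)     # w = x^4, one (assumed) H1-extension of that data
v = poly(0, 1, -1)          # v = x(1 - x), vanishes on the Dirichlet boundary

u_tilde = add(u, w, -1)                 # u - w
a = h1(u_tilde, v) / h1(v, v)           # coefficient of the one-term expansion
v_tilde = scale(a, v)                   # orthogonal projection of u - w
u1 = add(w, v_tilde)                    # p_1^w u = w + projection of (u - w)
Qu = add(u, u1, -1)                     # Q_1^w u = u - u1

cross = h1(u1, Qu)                      # [p u, Q u]
cross_via_w = h1(Qu, w)                 # [Q u, w], cf. (A.3.2-40)
pythagoras_defect = h1(u, u) - h1(u1, u1) - h1(Qu, Qu)   # equals 2 [Q u, w]
orthogonal_case = h1(v_tilde, add(u_tilde, v_tilde, -1)) # [p(u-w), (u-w) - p(u-w)]
```

For this choice `cross` is non-zero and equals `cross_via_w`, while `orthogonal_case` vanishes: the projection of $u$ itself is not $\perp_1$, but the projection of $u - w$ is.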
We can summarize our results in the following two sketches, first for the original (Figure A.3.2-9) and non-$\perp_1$-projection, and then for the $\perp_1$-projection (Figure A.3.2-10), wherein we note the following:
1. The zero function is not in the range of $p_1^w$ (unless $w = 0$) and thus the domain of $p_1^w$ has no null space; $p_1^w$ is not a projection in a linear vector space.
2. $p_1^w u$ is the closest function to $u$ in $R(p_1^w)$: $\|u - u_1\|_1$ is a minimum and $u - u_1 \perp_1 u_1 - w$.
3. Changing $w$ changes $p_1^w$ and $R(p_1^w)$, and thus $u_1$.
4. $p_1^w u = u_1$ is not $\perp_1$ to $Q_1^w u = u - u_1$, and $p_1^w$ cannot operate on $Q_1^w u$; $p_1^w Q_1^w$ is undefined. But $p_1^\perp Q_1^\perp = 0$; i.e., $p_1^\perp(I - p_1^\perp)\tilde{u} = p_1^\perp[u - w - p_1^\perp(u - w)] = 0$.
Fig. A.3.2-9 A non-orthogonal $H^1$-projection. [Sketch labels: $R(P_1^w)$, origin.]
Fig. A.3.2-10 An $H^1$-orthogonal projection. [Sketch labels: $u - u_1$; $N(P_1^\perp)$; $(I - P_1^\perp)(u - w)$; $u - w$; $u_1 - w = P_1^\perp(u - w)$.]
Comparing the two figures shows that the non-$\perp_1$-projection of $u(x)$ to $u_1(x)$ is equivalent to the $\perp_1$-projection of $u(x) - w(x)$ to $u_1(x) - w(x)$. Rather than $u_1(x)$ being the closest possible function to $u(x)$, we have that $u_1(x) - w(x) = \tilde{v}(x)$ is the closest possible function to $u(x) - w(x) = \tilde{u}(x)$. Note too that the 'error,' $\|u - u_1\|_1$, is the same size in both depictions, and that it changes when $w(x)$ does, and this is what really matters. Clearly, there is room to at least seek a best $w$ by trying to minimize $\|u - u_1(w)\|_1$, but we will drop it here (after mentioning that $w = u$ is clearly a minimizer, but not a very interesting one). Finally, if $u(x) = 0$ on $\Gamma_D$, we have $w = 0$ and $p_1^w = p_1^0 = p_1^\perp$; this is the 'clean' case, with a simple $H^1$-orthogonal projection, similar to the unconstrained projection that results when $\Gamma_D = \emptyset$. But the $w = 0$ constrained case is different, as are the solutions, from the unconstrained ($\Gamma_D = \emptyset$) case, the latter generating a smaller $F_1(v)$ than the former; constrained minima are never as small as the unconstrained case, which, in some sense, has the largest 'grab bag' of admissible functions. This point will be further clarified when we consider the FEM version in Section A.3.2.4. So much for orthogonality for now, except to mention that a final pair of comparison sketches that are intended to be self-explanatory, shown in Figure A.3.2-11, may be useful; in these, $V_w$ is the subset of all $H^1$-extensions of $u_D$, and $R(p_1^w)$ is that subspace of $V_E$ generated by $p_1^w$:
We said earlier that $H^1$-projections can be associated with a BVP involving the Laplacian operator, at least if $u(x)$ and $v(x)$ are sufficiently smooth, an assertion that we now demonstrate. Returning to (A.3.2-31), we perform an integration by parts to give
$$0 = \int_\Omega \nabla \cdot [\delta v\,\nabla(v - u)] - \int_\Omega \delta v\,\nabla^2(v - u) = \int_\Gamma \delta v\,\mathbf{n} \cdot \nabla(v - u) - \int_\Omega \delta v\,\nabla^2(v - u). \tag{A.3.2-48}$$
But $\delta v = 0$ on $\Gamma_D$ because $v = u$ there and, using the fact that $\delta v$ is an arbitrary variation in $\Omega$ and on $\Gamma_N$, leads to the Euler-Lagrange equations associated with the minimization of the given functional:
$$\nabla^2(v - u) = 0 \quad \text{in } \Omega, \tag{A.3.2-49}$$
$$\frac{\partial}{\partial n}(v - u) = 0 \quad \text{on } \Gamma_N, \tag{A.3.2-50}$$
and, of course,
$$v - u = 0 \quad \text{on } \Gamma_D, \tag{A.3.2-51}$$
which is an elliptic BVP for $v$ since $u$ is given, with the apparent obvious solution, $v = u$ in $\Omega$! What sort of projection is this? It is, of course, the ultimate/perfect 'projection' and represents a limiting case (the subspace is the same as the original space). We shall return to notions related to this when we discuss the projection method.
Fig. A.3.2-11 Interpreting the two $H^1$-projections [N.B. Changing $w$ (to $w'$) changes $P_1^w$ and
A.3.2.4 The $H^1$-projection, $p_1^h$
As for the $L^2$-case, it is now reasonably easy to specialize the $H^1$-projection to the finite-dimensional version via the ($C^0$ or better) FEM basis functions, $\{\phi_i, i = 1, 2, \ldots, N\}$, where $N$ is the total number of nodes in $\Omega$ and on $\Gamma_N$, which functions are required to be in the space $V^h \subset V_0$; i.e., $\phi_i = 0$ on $\Gamma_D$. Also, as usual, we shall implement the $H^1$-extension of $u$ into $\Omega$ from $\Gamma_D$ via the basis function interpolant of $u$ on $\Gamma_D$; i.e., $w$ is represented/replaced by
$$u^I(x) = \sum_{j=N+1}^{N_T} u(x_j \in \Gamma_D)\,\phi_j(x), \tag{A.3.2-52}$$
which describes $u^I$ as an interpolant of $u(x)$ (another projection!) on $\Gamma_D$, described by the rest of the nodes ($j = N + 1, N + 2, \ldots, N_T$ on $\Gamma_D$) and is the required $H^1$-extension of $u(x \in \Gamma_D)$ into $\Omega$. Then,
$$v(x) = u_1^h(x) = p_1^h u(x) = u^I(x) + \sum_{j=1}^{N} u_j\,\phi_j(x) = u^I(x) + u^h(x) \tag{A.3.2-53}$$
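is utilized in (A.3.2-31) along with $\delta v = \phi_i$ to obtain the discrete form of the finite-dimensional $H^1$-projection,
$$0 = \int \nabla\left(\sum_{j=1}^{N} u_j\,\phi_j + u^I - u\right) \cdot \nabla\phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.2-54}$$
or
$$\int \nabla u^h \cdot \nabla\phi_i = \sum_{j=1}^{N} u_j \int \nabla\phi_j \cdot \nabla\phi_i = \int \nabla(u - u^I) \cdot \nabla\phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.2-55}$$
which can be compared with its $L^2$-counterpart, (A.3.2-11) through (A.3.2-13). Again, it is expedient to utilize matrix-vector notation to rewrite (A.3.2-55) as
$$K\mathbf{u} = \mathbf{b}, \tag{A.3.2-56}$$
where $K = \int \nabla\boldsymbol{\varphi} \cdot \nabla\boldsymbol{\varphi}^T$ and $\mathbf{b} = \mathbf{b}(u - u^I) = \int \nabla(u - u^I) \cdot \nabla\boldsymbol{\varphi}$; or $K_{ij} = \int \nabla\phi_i \cdot \nabla\phi_j$ and $b_i = \int \nabla(u - u^I) \cdot \nabla\phi_i$. Thus, $\boldsymbol{\varphi}$ is an $N$-vector (as is $\mathbf{u}$) and $K$ is a symmetric (and positive-definite when $\Gamma_D \neq \emptyset$) $N \times N$ matrix.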
The solution of (A.3.2-56) gives the nodal amplitude coefficients of the $H^1$-projection, and (A.3.2-53) gives the projection:
$$\begin{aligned}
p_1^h u(x) = u_1^h(x) &= \boldsymbol{\varphi}^T\mathbf{u} + u^I(x) \\
&= \boldsymbol{\varphi}^T(x)\,K^{-1}\mathbf{b}(u - u^I) + u^I \\
&= \sum_j \left[K^{-1}\mathbf{b}(u - u^I)\right]_j \phi_j(x) + u^I,
\end{aligned} \tag{A.3.2-57}$$
and the proof that $(p_1^h)^2 = p_1^h$ is now left as an exercise. It also follows easily that the finite-dimensional version of (A.3.2-41) obtains ($p_1^w \to p_1^h$, $Q_1^w \to Q_1^h$, $w \to u^I$) and, as there, $H^1$-orthogonality is generally only achieved for homogeneous Dirichlet BC's ($u^I = 0$) or $\Gamma_D = \emptyset$. Also, however, $u_1^h - u^I = u^h$ is an $\perp_1$-projection of $u - u^I$; $u^h = \sum_j u_j\,\phi_j$ is the closest function in $V^h$ to $u - u^I$.
Finally, it is noteworthy that this $H^1$-projection has introduced another set of (global) conjugate basis functions, another dual basis. We shall introduce them in the same way as before (for $L^2$), but it is noteworthy that the 'bottom line' is obtainable simply by replacing $M$ by $K$ in (A.3.2-18). From (A.3.2-57), rewritten as
$$u_1^h(x) = \mathbf{b}^T(u - u^I)\,K^{-1}\boldsymbol{\varphi}(x) + u^I(x),$$
we introduce the new $N$-vector
$$\tilde{\boldsymbol{\varphi}}(x) = K^{-1}\boldsymbol{\varphi}(x), \tag{A.3.2-58}$$
to obtain
$$u_1^h(x) = \tilde{\boldsymbol{\varphi}}^T(x)\,\mathbf{b}(u - u^I) + u^I(x) = \sum_{j=1}^{N} \left[\int \nabla(u - u^I) \cdot \nabla\phi_j\right] \tilde{\phi}_j(x) + u^I(x). \tag{A.3.2-59}$$
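The global basis functions, $\{\tilde{\phi}_i\}$, are $\perp_1$ to the conventional basis functions; in fact, they form a bi-orthonormal set:
$$\int \nabla\tilde{\boldsymbol{\varphi}} \cdot \nabla\boldsymbol{\varphi}^T = \int \nabla(K^{-1}\boldsymbol{\varphi}) \cdot \nabla\boldsymbol{\varphi}^T = K^{-1}\int \nabla\boldsymbol{\varphi} \cdot \nabla\boldsymbol{\varphi}^T = I, \tag{A.3.2-60}$$
or
$$\int \nabla\tilde{\phi}_i \cdot \nabla\phi_j = \delta_{ij},$$
which permits the following 'streamlined' representations of the projection:
$$\begin{aligned}
\text{(i)}\quad u_1^h(x) &= \boldsymbol{\varphi}^T(x)\,K^{-1}\mathbf{b}(u - u^I) + u^I \\
&= \boldsymbol{\varphi}^T(x)\,K^{-1}\int \nabla(u - u^I) \cdot \nabla\boldsymbol{\varphi} + u^I \\
&= \boldsymbol{\varphi}^T(x)\int \nabla(u - u^I) \cdot \nabla(K^{-1}\boldsymbol{\varphi}) + u^I \\
&= \sum_{j=1}^{N} [u - u^I, \tilde{\phi}_j]\,\phi_j(x) + u^I(x),
\end{aligned} \tag{A.3.2-61}$$
with $[u_1^h(x), \tilde{\phi}_i] = [u(x), \tilde{\phi}_i]$; i.e., $[u_1^h - u, \tilde{\phi}_i] = 0$. For the simple case of $u = 0$ on $\Gamma_D$, (A.3.2-61) becomes
$$u_1^h = \sum_j u_j\,\phi_j(x), \quad \text{with } u_j = \int \nabla u \cdot \nabla\tilde{\phi}_j.$$
$$\begin{aligned}
\text{(ii)}\quad u_1^h(x) &= \tilde{\boldsymbol{\varphi}}^T(x)\,\mathbf{b}(u - u^I) + u^I \\
&= \tilde{\boldsymbol{\varphi}}^T(x)\int \nabla(u - u^I) \cdot \nabla\boldsymbol{\varphi} + u^I \\
&= \sum_{j=1}^{N} [u - u^I, \phi_j]\,\tilde{\phi}_j(x) + u^I(x),
\end{aligned} \tag{A.3.2-62}$$
with $[u_1^h(x), \phi_i] = [u, \phi_i]$; i.e., $[u_1^h - u, \phi_i] = 0$.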
The error is //'-orthogonal to the original basis and to the conjugate basis—as for the
L2-version, and as expected since both sets of basis functions span Vh.
As we did for the mass matrix, we shall compute and plot a few of the 1D $K$-matrix conjugate basis functions, from (A.3.2-58), here for $\boldsymbol{\varphi}(x) = 0$ at $x = 0, 1$. Recalling (A.3.2-24) through (A.3.2-26), a one-for-one replacement of $M$ by $K$ yields the matrix of nodal values for the $K$-conjugate basis functions,
$$\Phi = K^{-1}, \tag{A.3.2-63}$$
and
$$\tilde{\phi}_i(x) = \sum_{j=1}^{N} \Phi_{ij}\,\phi_j(x), \tag{A.3.2-64}$$
values of which are plotted for several nodes in Figure A.3.2-12 for linears (normalized to unity) and Figure A.3.2-13 for quads, the latter kindly supplied by D.F. Griffiths and showing the quadratic 'Bubbles' for center nodes.
Fig. A.3.2-12 K-conjugate linear basis functions for nodes 2, 6, and 9 in a 12-uniform-element mesh. [Axes: $\tilde{\phi}_i(x)$ versus $x$ on (0, 1).]
Fig. A.3.2-13 K-conjugate quadratic basis functions for center nodes 2, 4, 6, and edge nodes 7, 9, 11.
We conclude our 1D discussion with the following
Remarks:
(1) The $K$-conjugate linear basis functions and the quads at edge nodes are in fact the Green's function for the 1D Laplacian operator:
$$-\frac{d^2 u}{dx^2} = \delta(x - x_k) \quad \text{with} \quad u^h(x) = \sum_j u_j^{(k)}\,\phi_j(x),$$
gives
$$\sum_j u_j^{(k)} \int \phi_i'\,\phi_j' = \int \phi_i(x)\,\delta(x - x_k) = \phi_i(x_k) = \delta_{ik},$$
which is just (A.3.2-63) in disguise ($u_j^{(k)} = \Phi_{jk}$), obtained exactly because of two facts: (i) the exact Green's function is piecewise linear, which is a function spanned by our basis functions, and (ii) the source term for the Green's function was placed at the nodes.
(2) The center node conjugate functions for quads are not true Green's functions because they are $C^\infty$-functions within the element; a jump in the first derivative is required to obtain a true Green's function. They are approximate Green's functions.
(3) The functions shown in Figure A.3.2-12 are again normalized, for plotting, by the maximum nodal value. The true peak nodal amplitudes are $\tilde{\phi}_k(x_k) = x_k(1 - x_k)$ for linears and for quads at edge nodes, showing small amplitudes near the ends and largest amplitude in the center, per Figure A.3.2-13; this is related to the fact (for the Green's function analog at least) that the same total 'heat' flux must be removed no matter where the 'heat' source is placed.
(4) For the (strange-looking) midside node basis functions, the peak value generally does not occur at the node. The nodal value for these is $3h/16$ above the average value of the two neighbouring edge nodes (D.F. Griffiths, private communication).
(5) These functions really do span the space of quadratic functions on (0,1), even though every other basis function is piecewise linear.
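The Green's function property in remarks (1) and (3) can be checked directly for linears. The following sketch assumes 12 uniform linear elements on (0, 1) with homogeneous Dirichlet conditions at both ends, so that $K$ is the familiar tridiagonal matrix over the 11 interior nodes; the helper names are illustrative. It compares $\Phi = K^{-1}$ with the nodal values of the exact 1D Green's function, $G(x_i, x_j) = \min(x_i, x_j)\,[1 - \max(x_i, x_j)]$.

```python
def solve(A, rhs):
    """Gaussian elimination with partial pivoting; returns x such that A x = rhs."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

n_elem = 12
h = 1.0 / n_elem
N = n_elem - 1                                   # interior nodes only
xs = [(i + 1) * h for i in range(N)]             # interior node coordinates

# Stiffness matrix K_ij = int phi_i' phi_j' for linear elements.
K = [[0.0] * N for _ in range(N)]
for i in range(N):
    K[i][i] = 2.0 / h
    if i + 1 < N:
        K[i][i + 1] = K[i + 1][i] = -1.0 / h

# Column k of Phi = K^{-1}: nodal values of the k-th K-conjugate basis function.
columns = [solve(K, [1.0 if i == k else 0.0 for i in range(N)]) for k in range(N)]

green_error = max(
    abs(columns[k][i] - min(xs[i], xs[k]) * (1.0 - max(xs[i], xs[k])))
    for i in range(N) for k in range(N)
)
peak_error = max(abs(columns[k][k] - xs[k] * (1.0 - xs[k])) for k in range(N))
```

Both `green_error` and `peak_error` should be at roundoff level, which is the nodal exactness described in remark (1) and the peak formula $x_k(1 - x_k)$ of remark (3).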
Moving to 2D, we repeat what we did earlier for the $L^2$-conjugate functions: plot some of them on a $13 \times 13 = 169$ node mesh interpolated to $85 \times 85$. Figure A.3.2-14 shows some bilinear $H^1$-conjugate functions, and Figures A.3.2-15 and A.3.2-16 show some biquadratics. There are no pictures for nodes on $\Gamma$ since all of the conjugate basis functions are zero there. The left side of the previous figures (Figures A.3.2-4 through A.3.2-8) show the basis functions to which these are bi-orthonormal in the $H^1$ sense (their gradients are bi-orthonormal). The little 'bumps' on the otherwise smooth biquadratic functions are really there, and they are really small; they are a simple consequence of quadratic interpolation. As we shall later show, every one of these dual functions is an approximation of the 2D Green's function, $\ln r$, where $r = \sqrt{(x - x_k)^2 + (y - y_k)^2}$; they are rather poor approximations, of course, on this coarse mesh.
Fig. A.3.2-14 Dual (in $H^1$) basis functions for bilinear elements. (a) $\tilde{\phi}(x)$ for a typical internal node (4-patch), maximum = 0.642; (b) $\tilde{\phi}(x)$ for 1 node in from boundary, maximum = 0.454; (c) $\tilde{\phi}(x)$ for 1 node in from a corner, maximum = 0.408.
Fig. A.3.2-15 Dual (in $H^1$) basis functions for biquadratic elements at three internal nodes. (a) $\tilde{\phi}(x)$ for a 4-patch, maximum = 0.698; (b) $\tilde{\phi}(x)$ for a 2-patch, maximum = 0.539; (c) $\tilde{\phi}(x)$ for a central node, maximum = 0.446.
Fig. A.3.2-16 Dual (in $H^1$) basis functions for biquadratic elements for nodes near the boundary. (a) $\tilde{\phi}(x)$ for 1 node in from a corner, maximum = 0.214; (b) $\tilde{\phi}(x)$ for 1 node in from a boundary, maximum = 0.354.
A.3.2.5 The Projection Method
The principal purpose of this section is to describe and demonstrate the following important, and somewhat remarkable, fact: the GFEM solution to a second-order elliptic BVP is precisely the $H^1$-projection of the exact solution onto the finite element subspace (basis functions), and this in spite of the fact that the exact solution is generally 'not available.' We shall demonstrate this remarkable fact for one simple example, but the result is much more general and applies to any elliptic problem in which the operator is self-adjoint in $H_0^1$, although sometimes the most 'natural' norm, such as the so-called energy norm, is different from the simple $H^1$-(semi-)norm we have been using; see the literature for other examples.
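Thus, we consider the weak form of the following simple BVP:
$$\nabla^2 u + f = 0 \quad \text{in } \Omega \tag{A.3.2-65}$$
$$u = u_D \quad \text{on } \Gamma_D \tag{A.3.2-66}$$
and
$$\partial u/\partial n = g \quad \text{on } \Gamma_N = \Gamma - \Gamma_D; \tag{A.3.2-67}$$
i.e., find $u \in H_E^1$ such that
$$\int \nabla u \cdot \nabla v = \int f v + \int_{\Gamma_N} g v \quad \forall v \in H_0^1, \tag{A.3.2-68}$$
where $H_E^1$ is that subspace of $H^1$ (not a linear subspace) in which all functions take on the value $u_D$ on $\Gamma_D$, and $H_0^1$ is the linear subspace of $H^1$-functions that vanish on $\Gamma_D$. An alternate statement of the weak formulation that may actually be more useful is: find $\tilde{u} = u - w \in H_0^1$, where $w$ is an $H^1$-extension of $u_D$ from $\Gamma_D$ into $\Omega$, by solving
$$\int \nabla(\tilde{u} + w) \cdot \nabla v = \int f v + \int_{\Gamma_N} g v \quad \forall v \in H_0^1, \tag{A.3.2-69}$$
or
$$\int \nabla\tilde{u} \cdot \nabla v = \int f v + \int_{\Gamma_N} g v - \int \nabla w \cdot \nabla v \quad \forall v \in H_0^1; \tag{A.3.2-70}$$
i.e., the problem is now placed in a linear vector space setting. The corresponding approximate (GFEM) formulation is
$$\int \nabla\left(\sum_{j} u_j\,\phi_j + u^I\right) \cdot \nabla\phi_i = \int f\phi_i + \int_{\Gamma_N} g\,\phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.2-71}$$
and $u_1^h(x) = u^I(x) + \sum_j u_j\,\phi_j(x) = u^I + u^h$, where $u^I(x)$ is the interpolant of $u_D$ on $\Gamma_D$.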
Now comes the first of two key observations: since (A.3.2-69) is valid for every $v \in H_0^1$, it
is surely valid for the finite-dimensional subset, $v = \phi_i$ for $i = 1, 2, \ldots, N$, which implies
that
    $\int \nabla(\tilde{u} + w) \cdot \nabla\phi_i = \int f \phi_i + \int_{\Gamma_N} g \phi_i$    (A.3.2-72)
for every $i$; i.e., the exact (weak) solution satisfies this finite set of equations (and, of
course, many others) with the same RHS as does the approximate solution. This fact
permits us to subtract (A.3.2-72) from (A.3.2-71), which 'eliminates' the data ($f$ and $g$),
giving
    $\sum_j u_j \int \nabla\phi_j \cdot \nabla\phi_i = \int \nabla(\tilde{u} + w - u^I) \cdot \nabla\phi_i, \quad i = 1, 2, \ldots, N$.    (A.3.2-73)
The second key observation is this: (A.3.2-73) is a repeat of (A.3.2-55), the discrete
$H^1$-projection equation, after replacing $u$ by $\tilde{u} + w$ there. So, we have achieved one of
our goals:
The GFEM approximate solution to the BVP given by (A.3.2-65) through (A.3.2-67)
is identical to the $H^1$-projection of that solution onto the subspace spanned by the
GFEM basis functions.
Galerkin's method is indeed a projection method. More precisely, perhaps: the GFEM
solution, $u_1^h - u^I$, is a $\perp_1$-projection of $(u - u^I)$ onto the finite element basis.
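The statement above can be checked numerically in 1D. The following is only a minimal sketch under stated assumptions: a uniform 12-element mesh of linear elements on $0 \le x \le 1$, $\Gamma_N = \emptyset$, $u_D = 0$ (so $w = u^I = 0$), and the hypothetical test problem $u = \sin \pi x$, $f = \pi^2 \sin \pi x$; none of these choices comes from the text. It assembles the GFEM right-hand side $\int f \phi_i$ and the projection right-hand side $\int u' \phi_i'$ with an 8-point Gauss rule and compares the two solutions.

```python
import numpy as np

# Assumed test problem (not from the text): u'' + f = 0 on (0, 1), u(0) = u(1) = 0.
u_exact = lambda s: np.sin(np.pi * s)
du_exact = lambda s: np.pi * np.cos(np.pi * s)
f = lambda s: np.pi**2 * np.sin(np.pi * s)

n_el = 12
x = np.linspace(0.0, 1.0, n_el + 1)
h = x[1] - x[0]
gp, gw = np.polynomial.legendre.leggauss(8)
xq = x[:-1, None] + 0.5 * h * (gp[None, :] + 1.0)  # quadrature points, one row per element
wq = 0.5 * h * gw
phi_left = (x[1:, None] - xq) / h                  # hat function of the left node
phi_right = (xq - x[:-1, None]) / h                # hat function of the right node

# GFEM right-hand side: b_i = int f * phi_i
b_gfem = np.zeros(n_el + 1)
b_gfem[:-1] += (f(xq) * phi_left) @ wq
b_gfem[1:] += (f(xq) * phi_right) @ wq

# H^1-projection right-hand side: b_i = int u' * phi_i'  (phi' = -1/h or +1/h)
b_proj = np.zeros(n_el + 1)
grad_int = du_exact(xq) @ wq
b_proj[:-1] -= grad_int / h
b_proj[1:] += grad_int / h

# Stiffness matrix for the interior nodes (Dirichlet rows removed).
n = n_el - 1
K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
u_gfem = np.linalg.solve(K, b_gfem[1:-1])
u_proj = np.linalg.solve(K, b_proj[1:-1])

gap = np.max(np.abs(u_gfem - u_proj))
nodal_error = np.max(np.abs(u_gfem - u_exact(x[1:-1])))
print(gap, nodal_error)
```

Both printed numbers should be at round-off level here: the two right-hand sides agree after integration by parts, and in 1D with linear elements the projection coincides with the nodal interpolant.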
Remarks:
(1) A crucial piece of the above analysis is that the finite elements be of the 'conforming'
type; i.e., the discrete space is really a subspace of the $\infty$-dimensional problem. The
GFEM applied with non-conforming elements is not a projection method, a fact
that will later be seen to lessen the 'quality' of incompressible flow simulations.
(2) The $H^1$-projection to the GFEM basis functions is thus seen to satisfy $\partial u_1^h/\partial n =
\partial u/\partial n$ on $\Gamma_N$, weakly; it is an NBC that comes with the projection.
(3) If $\Gamma_D = \emptyset$, (i) the problem is only solvable if $\int f + \int_\Gamma g = 0$, (ii) it is then solvable
only up to an arbitrary additive constant, (iii) the projected solution will try to
maintain $\partial u/\partial n$ on all of $\Gamma$, and (iv) the projection is then an $H^1$-orthogonal ($\perp_1$)
projection of the exact solution onto the finite element subspace: the gradients of
the two solutions are as close as possible (in $L^2$).
(4) Recalling that $u_1 = w + p_1(u - w)$, see (A.3.2-47), we observe the following: while
it is true that $u_1 - w$ is the closest (in $H^1$) function to $u - w$ in the $\infty$-dimensional
subspace $H_0^1$ and that $u_1^h - u^I$ is the closest (in $H^1$) function to $u - u^I$ in the
finite-dimensional subspace of $H_0^1$, we have, in general, no idea of how close $u_1^h$ is to $u_1$.
(5) The $p_0$- and $p_0^h$-projections can also be related in a somewhat similar manner, as
follows: (i) rewrite (A.3.2-6) as $\int u_0 v_m = \int u v_m$ and (A.3.2-12) as $\int u_0^h \phi_j = \int u \phi_j$
and take $S^h \subset S$; (ii) restrict the $\{v_m\}$ to the subset $\{\phi_j\}$ to obtain $\int u_0 \phi_j = \int u \phi_j =
\int u_0^h \phi_j$, which shows that $u_0^h$ is also a $p_0^h$-projection of $u_0$, which itself is a
$p_0$-projection of $u$: $u_0^h = p_0^h u_0 = p_0^h p_0 u = p_0^h u$; i.e., $p_0^h(u - p_0 u) = 0$, which is similar
to the familiar Euclidean projection of a 3D vector onto a line, in which an
intermediate projection might be to a plane containing the line. Perhaps a somewhat
more relevant example would be the projection of $u(x)$ onto a subspace spanned
by, say, 30 continuous piecewise, cubic polynomials (a subspace of dimension 120),
followed by a projection to a subspace spanned by, say, 30 piecewise, linear FEM
basis functions at the same set of nodes. The result would be the same as projecting
$u(x)$ directly to the 30-dimensional FEM subspace.
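The nested-projection property in Remark (5) has a simple Euclidean analogue that can be tried directly. The sketch below is an illustration only: the random matrices `A` and `B` are hypothetical stand-ins for bases of a larger subspace $S$ and a smaller subspace $S^h \subset S$, and ordinary least squares plays the role of the $L^2$-projection.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins: columns of A span S, columns of B span S^h inside S.
A = rng.standard_normal((40, 8))
B = A @ rng.standard_normal((8, 3))   # every column of B is a combination of columns of A

def project(basis, v):
    """Orthogonal (least-squares) projection of v onto the column space of basis."""
    coeffs, *_ = np.linalg.lstsq(basis, v, rcond=None)
    return basis @ coeffs

u = rng.standard_normal(40)
direct = project(B, u)                  # analogue of p_0^h u
two_step = project(B, project(A, u))    # analogue of p_0^h p_0 u
difference = np.max(np.abs(direct - two_step))
print(difference)
```

The printed difference should be at round-off level, mirroring $p_0^h p_0 u = p_0^h u$: the part of $u$ discarded by the first projection is orthogonal to $S$, and therefore also to $S^h$.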
We conclude the $H^1$-projection discussion with a return to an issue raised in
Section 3.2.3, between (A.3.2-47) and (A.3.2-48); namely, just what is the difference
between the following two $\perp_1$-projections: (i) $\Gamma_D = \emptyset$ (unconstrained) and (ii) $u_D = 0$
(constrained, but with $u = 0$ on $\Gamma_D \ne \emptyset$)? The difference, which is small but 'finite' when
$u(x) = 0$ on $\Gamma_D$, is rather easier to understand for the finite-dimensional FEM projections,
beginning with the observation that, all else being the same [same domain, same number
of nodes, same basis functions, and same $u(x)$], the two FEM subspaces have different
dimensions, with that of the first projection being larger [$N_T$; see (A.3.2-52)]. First we
note intuitively that the projection with more functions in the 'grab bag' should have a
better chance at minimizing the quadratic functional. Thus, the unconstrained projection,
which will allow $u_1^h(x)$ to be different from zero on $\Gamma$, will have a smaller $F_1(u_1^h)$ than that
which, while still a $\perp_1$-projection, constrains $u_1^h(x)$ to be zero on $\Gamma_D$. Next we mention an
intuitive reason for this difference: the unconstrained solution will more closely match the
normal gradient of $u(x)$ on $\Gamma$ [see (A.3.2-67)], which, after all, is part of the goal of an
$H^1$-projection; the constrained case, on the other hand, sacrifices the ability to maintain $\partial u/\partial n$
on $\Gamma_D$ by requiring $u_1^h(x) = 0$ there, and thus cannot find the truly smallest $F_1(u_1^h)$, even
though it is the smallest in the lower-dimensional subspace of dimension $N$. Finally, we
remark that the unconstrained projection (which, of course, must deal with the fact that
$K$ is singular, probably by setting $u_j = 0$ on some boundary node) will also be $\perp_1$ for
$u(x)$ that is not zero on $\Gamma$; the special case was selected just to compare and contrast the
two projections.
A.3.2.6 Brief Discussion of GFEM Errors on Elliptic BVP's
To conclude this discussion, we return to the GFEM approximation of the Green's function
for the Laplacian, because it is important in its own right and not just for discussing the
$H^1$-conjugate functions of Section 2.6.4, and we thank D. Arnold and R. Rannacher for
helping us here.
They are used by FEM numerical analysts to make error estimates, a deep subject
that we shall merely touch upon, beginning with the following quotation from Brenner
and Scott (1994), p. 170: 'The finite element approximation is essentially defined by a
mean square projection of the gradient. [As we saw in the previous section.] Thus it is
natural that error estimates for the gradient of the error directly follow in the $L^2$ norm.
It is interesting to ask whether such a gradient-projection would also be of optimal order
in some other norm, for example $L^\infty$. We prove here that this is the case.' We shall
summarize how this works, using the $H^1$-dual functions. Consider solving the following
'Green's function' problem:
    $-\nabla^2 G^{(k)} = \delta(x - x_k)$ in $\Omega$,    (A.3.2-74)
    $G^{(k)} = 0$ on $\partial\Omega$,    (A.3.2-75)
where $x_k$ denotes the node at which the Dirac delta (source) function 'lives' (although
the theory does not require a nodal source). The weak form (of course) is the equation of
interest; namely
    $\int \nabla v \cdot \nabla G^{(k)} = \int v\,\delta(x - x_k) = v(x_k) \quad \forall v \in H_0^1$.    (A.3.2-76)
The GFEM approximation to $G^{(k)}$, $g^h(x) = \sum_j g_j^{(k)} \phi_j(x)$, is given by
    $\int \nabla\phi_i \cdot \nabla g^h = \int \phi_i(x)\,\delta(x - x_k) = \phi_i(x_k) \quad \forall i$,    (A.3.2-77)
i.e.,
    $\sum_j g_j^{(k)} \int \nabla\phi_i \cdot \nabla\phi_j = \delta_{ik}, \quad i = 1, 2, \ldots, N$,    (A.3.2-78)
which is
    $\sum_j K_{ij}\, g_j^{(k)} = \delta_{ik}$,    (A.3.2-79)
giving
    $g_i^{(k)} = \sum_j (K^{-1})_{ij}\,\delta_{jk} = (K^{-1})_{ik} = \phi_i^{(k)}$;    (A.3.2-80)
the K-conjugate basis functions are the GFEM approximations to the Green's functions
of the Laplacian—and we are now ready to (imprecisely) summarize a particular GFEM
error estimate.
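A 1D illustration of (A.3.2-79) and (A.3.2-80) is sketched below. It rests on assumptions that are not in the text: a uniform mesh of linear elements on $0 \le x \le 1$ with $G = 0$ at both ends, for which the exact Green's function of $-d^2/dx^2$ is $G(x) = x(1 - x_k)$ for $x \le x_k$ and $x_k(1 - x)$ for $x \ge x_k$. Because this $G$ is itself piecewise linear with its kink at a node, the column of $K^{-1}$ reproduces it exactly at the nodes; that is a special feature of 1D and does not carry over to the logarithmic singularity in 2D.

```python
import numpy as np

n_el = 12
x = np.linspace(0.0, 1.0, n_el + 1)
h = x[1] - x[0]

# Stiffness matrix K_ij = int phi_i' phi_j' for the interior nodes (linear elements).
n = n_el - 1
K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
K_inv = np.linalg.inv(K)

k = 4                       # interior node carrying the point source (arbitrary choice)
xk = x[k]

# Discrete Green's function: column of K^{-1}, padded with the zero boundary values.
g_h = np.zeros(n_el + 1)
g_h[1:-1] = K_inv[:, k - 1]

# Exact 1D Green's function of -d^2/dx^2 with homogeneous Dirichlet conditions.
G_exact = np.where(x <= xk, x * (1.0 - xk), xk * (1.0 - x))

max_diff = np.max(np.abs(g_h - G_exact))
print(max_diff)
```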
Since (A.3.2-76) holds for all $v \in H_0^1$, let us take
    $v = u - u^h$,    (A.3.2-81)
where $u$ is the solution to the BVP,
    $\nabla^2 u = -S(x)$ in $\Omega$,    (A.3.2-82)
    $u = 0$ on $\partial\Omega$,    (A.3.2-83)
and $u^h$ is the GFEM approximation to the same problem (thus $u - u^h$ is the error) and on
the same mesh used to obtain $g^h$, the approximate Green's function. Thus, from (A.3.2-76)
and (A.3.2-81), with the nodal identifier, $k$, suppressed in the sequel (for convenience),
    $u - u^h = \int \nabla G \cdot \nabla(u - u^h)$    (A.3.2-84)
    $\phantom{u - u^h} = \int \nabla(G - g^h) \cdot \nabla(u - u^h)$,    (A.3.2-85)
because $u - u^h \perp_1 g^h$, a fact that is contained in (A.3.2-73) of the previous section, for
$w = u^I = 0$ (the error in a GFEM solution to an elliptic BVP is $H^1$-orthogonal to the
basis). By the same reasoning, the error in the Green's function, $G - g^h$, is $H^1$-orthogonal
to $u^h$, and also to $u^I$, the interpolant (projection) of the exact solution. Thus we can replace
$u^h$ by $u^I$,
    $u - u^h = \int \nabla(G - g^h) \cdot \nabla(u - u^I)$,    (A.3.2-86)
to obtain, via the Schwarz inequality,
    $|u - u^h| \le \int |\nabla(G - g^h)| \cdot \int |\nabla(u - u^I)|$,    (A.3.2-87)
where $|u - u^h|$ is the absolute value of the error at the (suppressed) node, $x_k$.
Next it is clear that
    $\int |\nabla(u - u^I)| \le \mathrm{meas}(\Omega)\, \max_\Omega |\nabla(u - u^I)|$,
where meas($\Omega$) is the 'size' of the domain. Standard approximation theory [see, for
example, Strang and Fix (1973) or Ciarlet (1978)] then gives
    $\int |\nabla(u - u^I)| \le c h^r$,    (A.3.2-88)
where $c$ is proportional to the second-derivatives of $u$; and $r = 1$ for linears, 2 for
quadratics, etc. Thus,
    $|u - u^h| \le c h^r \int |\nabla(G - g^h)|$,    (A.3.2-89)
and we now appeal to authority [for example, the Brenner and Scott book, and Rannacher
and Scott (1982) and references therein, in which paper they successfully 'removed' a
previously-present $\ln|h|$ factor for linear (only) basis functions] to state that the (gradient)
error in the Green's function approximation is also $O(h)$ for all $r$ (owing to the
singularity in $G$), and we are done: now letting $k$ range over all the nodes in the mesh (and
even over all of $\Omega$) gives the maximum ($L^\infty$) estimate:
    $\|u - u^h\|_\infty \le c h^{r+1}$.
This concludes our brief excursion into GFEM error analysis. Hopefully it is useful to
some of our readers.
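The $h^{r+1}$ behaviour in the maximum norm can be observed in a small experiment. The sketch below is an assumption-laden illustration, not the analysis above: it uses 1D linear elements ($r = 1$), the hypothetical test problem $u = \sin \pi x$ with $S = \pi^2 \sin \pi x$, a 4-point Gauss rule for the load, and a 2001-point sampling grid to approximate the maximum norm.

```python
import numpy as np

def max_error(n_el):
    """Max-norm error of the linear-element GFEM solution of -u'' = pi^2 sin(pi x)."""
    x = np.linspace(0.0, 1.0, n_el + 1)
    h = x[1] - x[0]
    gp, gw = np.polynomial.legendre.leggauss(4)
    xq = x[:-1, None] + 0.5 * h * (gp[None, :] + 1.0)
    wq = 0.5 * h * gw
    source = np.pi**2 * np.sin(np.pi * xq)
    b = np.zeros(n_el + 1)
    b[:-1] += (source * (x[1:, None] - xq) / h) @ wq   # left hat function
    b[1:] += (source * (xq - x[:-1, None]) / h) @ wq   # right hat function
    n = n_el - 1
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    u = np.zeros(n_el + 1)
    u[1:-1] = np.linalg.solve(K, b[1:-1])
    xs = np.linspace(0.0, 1.0, 2001)                   # sampling grid for the max norm
    return np.max(np.abs(np.interp(xs, x, u) - np.sin(np.pi * xs)))

errors = [max_error(n) for n in (8, 16, 32)]
orders = [np.log2(errors[i] / errors[i + 1]) for i in range(2)]
print(errors, orders)
```

The observed orders should be close to 2, i.e. $r + 1$ for linears. In 1D the nodal values are exact, so what is measured here is essentially the interpolation error between nodes.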
A.3.2.7 Numerical Examples
Before moving on to the more difficult subject of 'vector projections,' we shall show a
few 1D examples and one 2D example of scalar projections, some with linear basis
functions, some with quadratic, some with both, starting with a very smooth ($C^\infty$)
function, a Gaussian. Figures A.3.2-17 and A.3.2-18 show the $L^2$- and $H^1$-projections,
Fig. A.3.2-17 $L^2$-projection of a Gaussian via quadratic basis functions (dashed) on a 6-element mesh. [Plot of $u_0^h(x)$ versus $x$.]
Fig. A.3.2-18 $H^1$-projection of a Gaussian via quadratic basis functions (solid) on a 6-element mesh. [Plot of $u_1^h(x)$ versus $x$.]
respectively, of
    $u(x) = f(x) = e^{-(x - x_0)^2 / 2\sigma^2}$    (A.3.2-90)
for $0 \le x \le 1$ with $x_0 = 1/2$ and $\sigma = \Delta x = 1/12$ onto an $N = 12 \pm 1$-dimensional
subspace via quadratic basis functions (six elements) and three-point Gaussian quadrature,
which we supplement with the following
Remarks:
(1) $N = 13$ for $L^2$ and $N = 11$ for $H^1$ (because $u^h = 0$ at $x = 0, 1$ are BC's).
(2) In these and subsequent figures the plots are 'lifted off' the origin by adding 0.1 to
the true results. Also we remark that the plots from quads 'suffer' somewhat because
linear interpolation is used by the graphics routine.
(3) Higher-order quadrature rules give the 'same' result (graphically identical).
(4) Linear basis functions give virtually the same graphical results, with a two-point
(or greater) Gaussian rule.
(5) The $H^1$-projection looks much like an interpolant projection, and it is (for the edge
nodes in 1D and for all linear basis functions), which we show below.
(6) The true $H^1$-projection, via $\|u\|_1^2 = \|u\|_0^2 + \|\nabla u\|_0^2$, gives, for these cases,
$(M + K)u = b$, where $b_i = \int (u\phi_i + u'\phi_i')$. Limited experimentation here showed that there
is little difference between the true $H^1$-norm and the $H^1$-semi-norm ($K$-matrix)
results.
That the 1D $H^1$-projection, $p_1^h$, yields the interpolant for linear basis functions is
probably worthwhile demonstrating, even though the result does not extend to
multi-dimensions, and we do so in two ways.
First way: Recall (A.3.2-55) for $u = f$ and $u^I = 0$ in 1D; i.e.,
    $\int_0^1 \phi_i' (u^h)' = \int_0^1 \phi_i' f', \quad i = 1, 2, \ldots, N$,    (A.3.2-91)
or, introducing the 'error' function $e(x) = u^h(x) - u(x)$,
    $\int_0^1 \phi_i' e' = \int_0^1 \phi_i' (u^h - f)' = 0 \quad \forall i$;    (A.3.2-92)
the gradient of the error is $H^1$-orthogonal to the basis. For $f(x)$ sufficiently smooth, we
can integrate by parts to obtain, using $e(0) = e(1) = 0$,
    $\int_0^1 e(x)\,\phi_i''(x) = 0 \quad \forall i$.    (A.3.2-93)
But $\phi_i''(x)$ is a Dirac delta function centered at $x = x_i$, and thus (A.3.2-93) gives
$e(x_i) = 0 = u^h(x_i) - u(x_i)$.
Second way: Starting again from (A.3.2-55), this time in the form
    $\sum_{j=1}^{N} u_j \int_0^1 \phi_i' \phi_j' = \int_0^1 \phi_i' f'$,    (A.3.2-94)
we 'realize' (or can compute) that the LHS is $(u_i - u_{i-1})/h_L + (u_i - u_{i+1})/h_R$. The
analogous ($i$-th row of the) RHS is also easy to evaluate:
    RHS $= \int_{x_{i-1}}^{x_i} \frac{d}{d\xi}\Big(\frac{\xi}{h_L}\Big) \frac{df}{d\xi}\, d\xi + \int_{x_i}^{x_{i+1}} \frac{d}{d\xi}\Big(1 - \frac{\xi}{h_R}\Big) \frac{df}{d\xi}\, d\xi$
    $\phantom{RHS} = \frac{1}{h_L}[f(x_i) - f(x_{i-1})] + \frac{1}{h_R}[f(x_i) - f(x_{i+1})]$.
Recalling the non-singular $K$-matrix then leads to $K(u - f) = 0$, where $f_i = f(x_i)$; and
thus $u = f$.
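The 'second way' can be reproduced numerically. The sketch below makes several assumptions that are not in the text: a mildly non-uniform 12-element mesh (to exercise the separate $h_L$ and $h_R$), an 8-point Gauss rule accurate enough to stand in for exact integration, and homogeneous end values even though the Gaussian of (A.3.2-90) is about $1.5 \times 10^{-8}$ rather than exactly zero at $x = 0, 1$.

```python
import numpy as np

sigma, x0 = 1.0 / 12.0, 0.5
f = lambda x: np.exp(-(x - x0) ** 2 / (2.0 * sigma**2))   # Gaussian of (A.3.2-90)
df = lambda x: -(x - x0) / sigma**2 * f(x)                # its derivative

# Assumed mesh: 13 nodes, slightly perturbed from uniform spacing.
s = np.linspace(0.0, 1.0, 13)
x = s + 0.02 * np.sin(2.0 * np.pi * s)
n_nodes = x.size
gp, gw = np.polynomial.legendre.leggauss(8)

K = np.zeros((n_nodes, n_nodes))
b = np.zeros(n_nodes)
for e in range(n_nodes - 1):
    h = x[e + 1] - x[e]
    xq = x[e] + 0.5 * h * (gp + 1.0)
    int_df = np.sum(0.5 * h * gw * df(xq))     # integral of f' over the element
    dphi = np.array([-1.0 / h, 1.0 / h])       # constant slopes of the two hat functions
    K[e:e + 2, e:e + 2] += np.outer(dphi, dphi) * h
    b[e:e + 2] += dphi * int_df                # b_i = int phi_i' f'

# Impose u = 0 at both ends and solve for the interior nodal values.
u = np.zeros(n_nodes)
u[1:-1] = np.linalg.solve(K[1:-1, 1:-1], b[1:-1])
max_nodal_error = np.max(np.abs(u - f(x)))
print(max_nodal_error)
```

The printed error should be no larger than roughly the neglected end values of the Gaussian, which is consistent with $K(u - f) = 0$ when the integration is (nearly) exact.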
We have thus proven (twice) that linear basis functions cause the $H^1$-projection to
be the same as the interpolation projection. What about quads? Here we use a two-step
procedure, as follows (with thanks to D. Arnold):
Step 1. The fact that the linear basis functions are a subset of the quadratics (using only
the edge nodes) yields immediately that the edge nodes of the $H^1$-projection via
quads are also the interpolant.
Step 2. Here we focus on the center nodes and return to (A.3.2-93). But this time $\phi_i''$ is
a constant (on the element containing center node $i$), and we obtain
    $\int e(x) = 0 \quad \forall i$;    (A.3.2-95)
i.e., the average error, $e(x) = u^h(x) - f(x)$, is zero over each element, which
is consistent with minimizing the error in the gradient, and we are done. For
quads, the edge nodes are the interpolant and the center nodes take on the value
that gives zero average error over each element; we leave generalization
to higher-order elements to the reader.
We now move to our second, more difficult, example ($C^0$ function), in which quadrature
error precludes obtaining the interpolant for the $H^1$-projections. We chose a portion
of a nearly half ellipse to see how well the projections could deal with a large slope
discontinuity (and we were pleasantly surprised). [In what follows we discuss, for
simplicity, only a single ellipse, the left one in the figures below, whereas the actual
calculations and figures show a pair of them, for 'variety,' and to see how well a very
large change in slope is accommodated (which is rather well).] In the next four figures
are shown some $N = 24 \pm 1$ projection results for the function
    $u(x) = f(x) = c + b\sqrt{1 - \Big(\frac{x - x_0}{a}\Big)^2}$ on $x_1 \le x \le x_2$,    (A.3.2-96)
with horizontal center location $x_0 = 3/8$, semi-major axis $b = 1$, semi-minor axis $a = 1/8$,
and vertical center location $c = -0.1$, giving the intercepts [$f(x) = 0$] $x_1 = x_0 -
a\sqrt{1 - c^2/b^2} = 0.2506$, and $x_2 = x_0 + a\sqrt{1 - c^2/b^2} = 0.4994$. The $L^2$-projections have
$N = 25$ (24 linear elements or 12 quadratic elements) and, because we set $f$ and $u^h = 0$
at $x = 0$ and 1, the $H^1$-projections have $N = 23$. Figures A.3.2-19 and A.3.2-20 show the
$L^2$-projections for linears (two-point rule) and quads (three-point rule), respectively. In
both cases, higher-order quadrature did not visibly change the pictures. On the other hand,
the $H^1$-projections were much more sensitive to the 'Gaussian rule'; Figures A.3.2-21 and
A.3.2-22 show the linear basis function results for a two-point and a seven-point rule,
respectively. (For quads, not shown, a three-point rule produced a picture much like
that in Figure A.3.2-21, and the seven-point rule result was close to, but slightly lower
Fig. A.3.2-19 $L^2$-projection of two ellipses via linear basis functions on a 24-element mesh (2-point Gaussian quadrature). [Plot of $u_0^h(x)$ versus $x$.]
Fig. A.3.2-20 Same as Figure A.3.2-19 except via 12 quadratic elements (and 3-point quadrature).
Fig. A.3.2-21 $H^1$-projection of two ellipses via linear basis functions on a 24-element mesh (2-point quadrature). [Plot of $u_1^h(x)$ versus $x$.]
than, that in Figure A.3.2-22, showing that the quads are harder to get right.) Since exact
integration yields the interpolant, all of the error at the nodes in these last two figures is
quadrature error. We also mention that in these and other cases, the results are less 'nice'
looking if node points are not located at points of discontinuity—owing to additional
quadrature error.
Fig. A.3.2-22 Same as Figure A.3.2-21 except using a 7-point quadrature rule.
Let us now recall the projection connection: the continuum results in Figures A.3.2-19
through A.3.2-22 (the ellipses) also correspond to the (weak) solution of the following
BVP:
    $\frac{d^2 u}{dx^2} = -S(x) = \frac{d^2 f}{dx^2}$ on $0 \le x \le 1$ with $u(0) = u(1) = 0$,    (A.3.2-97)
where, from (A.3.2-96), the 'source' term is given by
    $S(x) = -\frac{d^2 f}{dx^2} = \frac{b/a^2}{\Big[1 - \Big(\frac{x - x_0}{a}\Big)^2\Big]^{3/2}}$ on $x_1 < x < x_2$,    (A.3.2-98)
with point heat sinks at $x = x_1$ and $x = x_2$, and $S(x) = 0$ for $x < x_1$ and $x > x_2$. The point
heat sinks are Dirac delta functions, needed to compensate for the discontinuity in $f'(x)$
at $x = x_1$ and $x_2$, and are given by the 'flux' jump there; namely,
    $S(x) = \big[b^2\sqrt{1 - c^2/b^2}\,/\,ac\big]\,\delta(x - x_i)$    (A.3.2-99)
for $i = 1, 2$; here $b^2\sqrt{1 - c^2/b^2}/ac = -79.6$. Note that the ellipse center, $c$, must be $< 0$
in order for the solution to make sense (reside in $H^1$); for $c \to 0$, the flux at the two
'edges' of the ellipse $\to \pm\infty$ and the BVP becomes ill-posed. The approximate (GFEM)
solution of the above BVP is given by
    $\sum_j u_j \int \phi_i' \phi_j' = \int \phi_i S(x) = -\int \phi_i u''(x) \quad \forall i$,    (A.3.2-100)
which will clearly not give the same result as the $H^1$-projected result, (A.3.2-94), with
$f'$ replaced by $u'$, when numerical integration is employed in both cases; the RHS's will
generally differ and, thus, the solutions. Galerkin's method via FEM is a true projection
method only in the absence of quadrature error. In the general case, we can call it an
approximate projection method.
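The role of quadrature error can be made concrete for the single-ellipse function of (A.3.2-96). The following sketch is illustrative only and assumes a uniform 24-element mesh of linear elements, a single ellipse (the figures show a pair), and a helper `nodal_error` that is hypothetical, not from the text. It computes the $H^1$-projection right-hand side $\int \phi_i' f'$ three ways (exactly, via element end-point differences of $f$, and with 2-point and 7-point Gauss rules) and reports the largest nodal deviation from the interpolant.

```python
import numpy as np

# Ellipse parameters quoted in the text for (A.3.2-96).
a, b_axis, c, x0 = 1.0 / 8.0, 1.0, -0.1, 3.0 / 8.0
half_width = a * np.sqrt(1.0 - c**2 / b_axis**2)
x1, x2 = x0 - half_width, x0 + half_width

def f(x):
    x = np.asarray(x, dtype=float)
    arg = np.clip(1.0 - ((x - x0) / a) ** 2, 0.0, None)
    return np.where((x > x1) & (x < x2), c + b_axis * np.sqrt(arg), 0.0)

def df(x):
    x = np.asarray(x, dtype=float)
    s = (x - x0) / a
    # The clip only guards points outside the ellipse, where the result is masked to zero.
    root = np.sqrt(np.clip(1.0 - s**2, 1e-12, None))
    return np.where((x > x1) & (x < x2), -b_axis * s / (a * root), 0.0)

def nodal_error(n_gauss, n_el=24):
    """Max nodal deviation of the H^1-projection from the interpolant.

    n_gauss=None uses the exact element integrals of f' (end-point differences).
    """
    x = np.linspace(0.0, 1.0, n_el + 1)
    h = x[1] - x[0]
    if n_gauss is None:
        elem_int = f(x[1:]) - f(x[:-1])
    else:
        gp, gw = np.polynomial.legendre.leggauss(n_gauss)
        xq = x[:-1, None] + 0.5 * h * (gp[None, :] + 1.0)
        elem_int = 0.5 * h * (df(xq) @ gw)
    b = np.zeros(n_el + 1)
    b[:-1] -= elem_int / h          # left hat function has slope -1/h
    b[1:] += elem_int / h           # right hat function has slope +1/h
    n = n_el - 1
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    u = np.zeros(n_el + 1)
    u[1:-1] = np.linalg.solve(K, b[1:-1])
    return np.max(np.abs(u - f(x)))

err_exact = nodal_error(None)
err_2pt = nodal_error(2)
err_7pt = nodal_error(7)
print(err_exact, err_2pt, err_7pt)
```

With exact element integrals the nodal values reproduce the interpolant to round-off, whereas the 2-point rule leaves a clearly visible nodal error; in this sketch that error comes mainly from the two elements that contain the slope discontinuities.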
Our last 1D example is a $C^{-1}$-function and thus can only be studied in $L^2$. It, and its
projection, are shown in Figure A.3.2-23 for a 50-linear-element mesh ($N = 51$) and a
two-point Gaussian rule, which merits the following
Remarks:
(1) The discontinuity at $x = 0.7$ causes a classic Gibbs jump because our basis functions
are continuous.
(2) Quadratic basis functions with a three-point rule and an edge node located at the
discontinuity look much like the linears, but see below.
(3) The near-perfect result for the ramp portion is deceptive; i.e., there are wiggles,
decreasing in amplitude away from the discontinuity; they are just too small to see
because, at least in part, there is a node at the jump in $u'(x)$.
The Gibbs jump can, with thanks to D. Griffiths, be studied analytically, at least for
the case of a single discontinuity, for both linears and quads. In both cases we wish
to solve $Mu = b$, where $b_i = \frac{1}{2} \int \phi_i$ for $i = 0$ (node at discontinuity), $b_i = 1 \cdot \int \phi_i$ for
$i < 0$, and $b_i = 0$ for $i > 0$. Application of the theory of difference equations leads first
to the homogeneous solution,
    $u_i = a\,\xi_+^i + b\,\xi_-^i$,    (A.3.2-101)
where $\xi_\pm = (-2 \pm \sqrt{3})$ for linear elements and $\xi_\pm = (3 \pm 2\sqrt{2})$ for the edge nodes
(only) for quads, the latter having been obtained by first eliminating the midside
(1/2-integer) nodes in terms of edge nodes. Next, in order to obtain bounded solutions for
$i > 0$, we need $b = 0$ for linears and $a = 0$ for quads; conversely, for $i < 0$ we need
Fig. A.3.2-23 $L^2$-projection of a $C^{-1}$ function via linear basis functions on a 50-element mesh. [Plot of $u_0^h(x)$ versus $x$.]
$a = 0$ for linears and $b = 0$ for quads. Finally, to obtain the non-zero values for $a$ and
$b$, two equations are needed: (i) the matchup (continuity) condition at $i = 0$ to obtain the
particular solution $u_i = 1$ for $i < 0$ and $u_i = 0$ for $i > 0$; and (ii) the 'special' equation,
with $b_0 = \frac{1}{2} \int \phi_0$ and $u_0 = 1/2$, must be satisfied for $i = 0$. The final results are:
1. Linears:
    $u_i = 1 - \frac{1}{2}\xi_-^i$ for $i \le 0$,
    $u_i = \frac{1}{2}\xi_+^i$ for $i \ge 0$,
giving a Gibbs jump of magnitude
    $|u_1| = \Big|\frac{-2 + \sqrt{3}}{2}\Big| = 0.134$ or, equivalently, $|u_{-1}| - 1 = \Big|1 - \frac{1}{2}(-2 - \sqrt{3})^{-1}\Big| - 1$,
independent of grid size.
2. Quadratics
(i) Edge nodes (integer):
    $u_i = 1 - \frac{1}{2}\xi_+^i$ for $i \le 0$,
    $u_i = \frac{1}{2}\xi_-^i$ for $i \ge 0$.
Here, though, there is no Gibbs jump because both $\xi$'s are positive. The Gibbs jump
(and oscillatory decay) shows up in the center nodes.
(ii) Center nodes (1/2-integer):
    $u_{i-1/2} = [10 - (u_i + u_{i-1})]/8$ for $i \le 0$,
    $u_{i+1/2} = -(u_i + u_{i+1})/8$ for $i \ge 0$,
showing here a Gibbs jump of magnitude $|u_{1/2}| = (u_0 + u_1)/8 = [1/2 + \frac{1}{2}(3 -
2\sqrt{2})]/8 = 0.0732$, which is about one-half of that from linears. This does, however,
correspond to a sort of 'best case' in that a larger jump is observed if a center node
is placed at the discontinuity or, worse yet, if no node is there (additional quadrature
error), a situation that also applies to linear elements (the jump can then be more
like 20%).
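The linear-element Gibbs jump can be checked with a few lines of code. This is a sketch under assumptions not taken from the text: a unit step ($f = 1$ to the left, $f = 0$ to the right) on $0 \le x \le 1$, 50 equal linear elements with a node placed exactly at the discontinuity ($x = 0.5$, rather than the $x = 0.7$ of the figure), and right-hand-side integrals evaluated exactly.

```python
import numpy as np

n_el = 50
k = 25                      # index of the node sitting on the discontinuity (x = 0.5)
h = 1.0 / n_el
n = n_el + 1

# Consistent mass matrix for linear elements: (h/6) * tridiag(1, 4, 1), halved at the ends.
M = (h / 6.0) * (4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1))
M[0, 0] = M[-1, -1] = h / 3.0

# b_i = int f * phi_i for the unit step: full hat (h) left of the jump,
# half a hat (h/2) at the jump node and at the left boundary node, zero to the right.
b = np.zeros(n)
b[:k] = h
b[0] = h / 2.0
b[k] = h / 2.0

u = np.linalg.solve(M, b)
overshoot = u[k - 1] - 1.0
undershoot = u[k + 1]
predicted = (2.0 - np.sqrt(3.0)) / 2.0   # |(-2 + sqrt(3)) / 2| from the analysis above
print(u[k], overshoot, undershoot, predicted)
```

The computed overshoot and undershoot should both match the predicted magnitude of about 0.134, with the value 1/2 at the jump node. Changing `n_el` (keeping a node at the jump) leaves the result unchanged, consistent with 'independent of grid size.'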
For our single 2D example, we shall show five projections of the 2D Gaussian shown
in Figure A.3.2-24(a), which has
$\sigma_x = \Delta x = \Delta y = 1/12$ and $\sigma_y = 0.3\sigma_x$, via
    $u(x, y) = e^{-[(x - x_0)^2/2\sigma_x^2 + (y - y_0)^2/2\sigma_y^2]}$,    (A.3.2-102)
with $x_0 = y_0 = 1/2$ (center of the unit square domain).
As with our conjugate basis function examples presented earlier, we use a $13 \times 13 =
169$ node mesh and interpolate all results via the appropriate FEM basis functions onto an
$85 \times 85$ mesh. Figure A.3.2-24(b) shows the bilinear interpolant projection; the rounded
cap becomes a spike on this coarse mesh. Figure A.3.2-24(c) is from a $3 \times 3$ (Gauss points)
quadrature on bilinear elements, and the extrema (max/min) are (0.981/−0.110), and
higher order quadrature looks much the same (for example $7 \times 7$ gives extrema of
0.981/−0.106). Figure A.3.2-24(d) is from a $7 \times 7$ quadrature on biquadratics and has
Fig. A.3.2-24 A 2D Gaussian and five projections: (a) the exact 2D Gaussian; (b) its interpolant via bilinear elements; (c) $L^2$-projection via bilinears; (d) $L^2$-projection via biquadratics; (e) $H^1$-projection via bilinears; (f) $H^1$-projection via biquadratics.
extrema of (1.061/−0.102), whereas a $3 \times 3$ rule here, while looking very much the
same, gave extrema of (1.177/−0.108). Moving now to the $H^1$ results, Figure A.3.2-24(e)
shows a bilinear $3 \times 3$ result with extrema (0.970/−0.052), whereas $7 \times 7$ gave the 'same'
picture with extrema (0.987/−0.054). Note the similarity to the interpolant, and recall
that in 1D these two projections are identical. Finally, a $7 \times 7$ rule for biquadratics, shown
in Figure A.3.2-24(f), has extrema (1.024/−0.110), whereas a $3 \times 3$ rule gave the 'same'
picture with extrema of (1.161/−0.167). The interpolant projection (not shown) looks
much the same, and has extrema of (1.00/−0.120). Clearly the $H^1$ projections 'look' a
little bit smoother, and virtually all of the 'wiggles' (for both projections) are in the more
poorly resolved $y$-direction. [We purposely used a coarse mesh to reveal the projection
errors, which virtually vanish (in the augen norm at least) on 'normal' meshes; i.e., both
projections then 'look like' the interpolant.]
One of the lessons that has been learned from these examples is this: whereas the $L^2$
(best least square fit to the function) projection necessarily 'wiggles', the $H^1$-projection
can be 'mentally approximated' by the interpolant (projection), which only 'wiggles' (and
not very much, usually) for higher-order elements. Finally, it is useful to recall (see, e.g.,
Strang and Fix 1973) that the solution to the associated elliptic BVP (Poisson equation)
via GFEM is always more accurate (in $H^1$) than is the interpolant of the exact solution.
A.3.3 VECTOR PROJECTIONS
We now move up in complexity by considering both $L^2$- and $H^1$-projections of
vector-valued functions. The subspaces of interest are both the same finite element spaces
as above (subsets of $L^2$ and $H^1$, respectively) and some new ones: the subspaces of
divergence-free vector fields. In fact, our main goal here is to explicate the projection of
vectors (divergence-free or not) from either $\infty$-dimensional or finite-dimensional spaces
to a finite-dimensional subspace in which the vector is discretely divergence-free. Along
the way we will discover that the projection of the $\infty$-dimensional space to the discretely
divergence-free subspace can also be represented as two sequential projections: the first
to a non-divergence-free finite-dimensional subspace and the second from there down to
the final subspace. Also, as in the scalar case, we will lead up to the case of interest by first
examining the case of an $\infty$-dimensional projection of a vector onto its $\infty$-dimensional
subspace of vectors that have zero divergence. We shall again start with the simpler
$L^2$-case and conclude with the $H^1$-case, and we note now the interesting 'final' result for the
finite-dimensional versions: simply swap $M$ with $K$ to convert from one to the other.
A.3.3.1 The $p_{J_0}$-Projection
The proper (minimum allowable smoothness/largest possible space) space of vector
functions that can be considered here is called $H$(Div), the space of vectors in $L^2$ whose
divergence is also in $L^2$; this space was apparently introduced by P. Raviart (see Thomasset,
1981, p. 94); see also, for example, Temam (1984, p. 5), Girault and Raviart (1986,
p. 26), Arnold (1990), or Brezzi and Fortin (1991, p. 18) for detailed descriptions of this
space that contains $H^1$ as a subspace. Such spaces permit the use of non-conforming
elements because (viewed from $H^1$) the tangential velocity is permitted to be
discontinuous across element boundaries. But, even though some mathematicians may cringe at
the notion, we shall only permit our 'velocities' to lie in (be restricted to) the smaller
space $H^1$; i.e., the space of conforming elements in which all first-derivatives are in $L^2$.
They may cringe because the function-analytic stability properties of the divergence-free
subspaces of this space (velocity-pressure combinations, in simple language) are not fully
understood (or at least that is our reading of the situation). We use $H^1$ because we, and
many others, use this space daily in computations, and mostly with success. Thus, while
it may have some theoretical deficiencies (as might its finite-dimensional subspace in
the limit $h \to 0$), we opt for the practical side and permit, for example, the $Q_1Q_0$ finite
element basis (and avoid the $h \to 0$ cases!).
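So, given a bounded domain $\Omega$ containing a vector field $\mathbf{u}(\mathbf{x}) \in H^1$, we are interested
in finding the closest (in the $L^2$-norm) divergence-free function to this vector field, say $\mathbf{u}_0$,
subject to the constraint that $\mathbf{n} \cdot \mathbf{u}_0 = \mathbf{n} \cdot \mathbf{w}$ on $\Gamma_D$, where $\mathbf{n} \cdot \mathbf{w}$ is a given function on $\Gamma_D$,
which itself is a portion of $\partial\Omega = \Gamma$. We remark that whereas a scalar $L^2$-projection need
not be subjected to any 'boundary' conditions, the divergence-free constraint associated
with the vector $L^2$-projections introduces a differential operator and, with it, the necessity
of finding and using 'appropriate' BC's. That this is reasonable for us is related to the
fact that our final 'product' will be an incompressible flow in $\Omega$ whose normal component
is 'controlled' on at least a portion of $\partial\Omega$. We also remark that if $\Gamma_D$ is all of $\Gamma$, then
$\mathbf{n} \cdot \mathbf{w}$ must satisfy $\int_\Gamma \mathbf{n} \cdot \mathbf{w} = 0$ and that $\mathbf{n} \cdot \mathbf{w} \ne \mathbf{n} \cdot \mathbf{u}$ in general. Rather than considering
the elusive-in-practice divergence-free spaces of vector fields, we a priori introduce a
Lagrange multiplier ($\lambda$) and seek a constrained extremum via a saddle-point problem,
consistent with our 'theme' of mixed interpolation. Thus, we introduce the (non-quadratic)
functional
    $F(\mathbf{v}, \lambda) = \frac{1}{2} \int (\mathbf{v} - \mathbf{u}) \cdot (\mathbf{v} - \mathbf{u}) - \int \lambda \nabla \cdot \mathbf{v}$,    (A.3.3-1)
where $\mathbf{v} \in H_E^1$ and $\lambda \in L^2$; $H_E^1$ is that subset of $H^1$ whose functions satisfy $\mathbf{n} \cdot \mathbf{v} = \mathbf{n} \cdot \mathbf{w}$
on $\Gamma_D$, and $H_0^1$ is the sub-space of $H^1$ with $\mathbf{n} \cdot \mathbf{v} = 0$ on $\Gamma_D$. (The minus sign is chosen
for 'convenience' only; for either choice of sign, the functional takes on both positive and
negative values.) Seeking a stationary point (critical point) of $F(\mathbf{v}, \lambda)$ via $\delta F(\mathbf{v}, \lambda) = 0$
and calling the critical values of $\mathbf{v}$ and $\lambda$, $\mathbf{u}_0$ and $\lambda_0$, respectively, gives
    $0 = \int (\mathbf{u}_0 - \mathbf{u}) \cdot \delta\mathbf{v} - \int \lambda_0 \nabla \cdot \delta\mathbf{v} - \int \delta\lambda\, \nabla \cdot \mathbf{u}_0$
    $\phantom{0} = \int (\mathbf{u}_0 - \mathbf{u}) \cdot \delta\mathbf{v} - \int \nabla \cdot (\lambda_0 \delta\mathbf{v}) + \int \delta\mathbf{v} \cdot \nabla\lambda_0 - \int \delta\lambda\, \nabla \cdot \mathbf{u}_0$
    $\phantom{0} = \int (\mathbf{u}_0 - \mathbf{u} + \nabla\lambda_0) \cdot \delta\mathbf{v} - \int_{\Gamma_N} \lambda_0 \mathbf{n} \cdot \delta\mathbf{v} - \int_{\Gamma_D} \lambda_0 \mathbf{n} \cdot \delta\mathbf{v} - \int \delta\lambda\, \nabla \cdot \mathbf{u}_0$,    (A.3.3-2)
where $\Gamma_N = \Gamma - \Gamma_D$. Now, since $\mathbf{n} \cdot \mathbf{u}_0 = \mathbf{n} \cdot \mathbf{w}$ on $\Gamma_D$, $\mathbf{n} \cdot \delta\mathbf{v} = 0$ there. Also, realizing
that $\delta\mathbf{v}$ and $\delta\lambda$ are independent variations gives the pair of variational equations,
    $\int (\mathbf{u}_0 - \mathbf{u} + \nabla\lambda_0) \cdot \delta\mathbf{v} - \int_{\Gamma_N} \lambda_0 \mathbf{n} \cdot \delta\mathbf{v} = 0$    (A.3.3-3)
and
    $-\int \delta\lambda\, \nabla \cdot \mathbf{u}_0 = 0$,    (A.3.3-4)
which is the final statement of the saddle-point problem. The arbitrariness of the variations
finally leads to the so-called Euler-Lagrange equations for $\mathbf{u}_0$ and $\lambda_0$ that describe the
projection (proven below) of $\mathbf{u}$ to the closest (in $L^2$) divergence-free subspace,
    $\mathbf{u} = \mathbf{u}_0 + \nabla\lambda_0$ and $\nabla \cdot \mathbf{u}_0 = 0$ in $\Omega$,    (A.3.3-5)
    $\lambda_0 = 0$ on $\Gamma_N$,    (A.3.3-6)
and, of course,
    $\mathbf{n} \cdot \mathbf{u}_0 = \mathbf{n} \cdot \mathbf{w}$ on $\Gamma_D$.    (A.3.3-7)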
Remarks:
(1) This is sometimes called a Helmholtz-Weyl decomposition of a given vector field
into the sum of its divergence-free and curl-free parts; see, for example, Galdi (1994).
(2) The BC $\lambda_0 = 0$ on $\Gamma_N$ is actually an NBC. Also on $\Gamma_N$, there is a 'built-in' constraint
on $\mathbf{u}_0$; namely, whereas the normal component of $\mathbf{u}_0$ is free to change, $\boldsymbol{\tau} \cdot \mathbf{u}_0 = \boldsymbol{\tau} \cdot \mathbf{u}$
because $\boldsymbol{\tau} \cdot \nabla\lambda_0 = 0$ there; the tangential components of velocity are not allowed to
change on $\Gamma_N$. [This restriction could be removed by specifying $\lambda_0$ on $\Gamma_N$ such that
$\partial\lambda_0/\partial\tau = \boldsymbol{\tau} \cdot \mathbf{u}_0 = g$, where $g$ is given; i.e., replacing $\lambda_0 = 0$ by $\lambda_0 = \lambda_E = \int_\Gamma g$ on
$\Gamma_N$ puts a weak Dirichlet BC on $\lambda_0$ and results in a projected velocity that satisfies a
specified tangential velocity on $\Gamma_N$. But we shall usually opt for the simpler situation
in the sequel, leaving the more general case as an exercise. And with this exercise,
we pose another, to the FDM expert who is able to solve the NSE with $\nabla P$ being
a gradient everywhere (even at 'outflows'): how would you solve the above system
for $\mathbf{u}$ and $\lambda$ wherein $\mathbf{u} \cdot \boldsymbol{\tau}$ is specified on $\Gamma_N$ and (of course) the solution is discretely
divergence-free everywhere?]
(3) The solution to (A.3.3-5) through (A.3.3-7) is a saddle point of (A.3.3-1); it
minimizes the kinetic energy of the difference between $\mathbf{u}$ and $\mathbf{u}_0$, and $\lambda_0$ maximizes
$F(\mathbf{v}, \lambda_0)$ at $\mathbf{v} = \mathbf{u}_0$.
(4) If $\nabla \cdot \mathbf{u} = 0$ in $\Omega$ and $\mathbf{n} \cdot \mathbf{u} = \mathbf{n} \cdot \mathbf{w}$ on $\Gamma_D$, the solution turns out to be $\lambda_0 = 0$ and
$\mathbf{u}_0 = \mathbf{u}$ (see below); the function $\mathbf{u}$ is already in the subspace.
(5) If $\mathbf{u} = 0$, then we have a saddle-point formulation of potential flow; see Section 3.15.
(6) Clearly we have violated the original assertion that $\lambda \in L^2$; we will correct this
'carelessness' soon, after briefly investigating the situation in which even more
regularity is required.
If we demand or assume additional smoothness, we obtain a classical
formulation, which 'formal' representation will also be useful in our search for the
projection operator; i.e., from (A.3.3-5) through (A.3.3-7) follows
    $\nabla^2\lambda_0 = \nabla \cdot \mathbf{u}$ in $\Omega$,    (A.3.3-8)
    $\lambda_0 = 0$ on $\Gamma_N$,    (A.3.3-9)
    $\partial\lambda_0/\partial n = \mathbf{n} \cdot (\mathbf{u} - \mathbf{w})$ on $\Gamma_D$,    (A.3.3-10)
which can be solved for the Lagrange multiplier, from which the desired vector field,
    $\mathbf{u}_0 = \mathbf{u} - \nabla\lambda_0$ in $\Omega$,    (A.3.3-11)
is finally obtained. (Note the sequential solution procedure; first $\lambda_0$, then $\mathbf{u}_0$.)
The formal solution of (A.3.3-8) through (A.3.3-11) is obtained by inverting the
Laplacian in (A.3.3-8) through (A.3.3-10) and placing the result in (A.3.3-11):
    $\lambda_0 = (\nabla^2)^{-1}\nabla \cdot \mathbf{u} = \Delta^{-1}\nabla \cdot \mathbf{u}$,    (A.3.3-12)
where we have switched notation ($\Delta = \nabla^2$) for notational convenience, and we remark
that $\lambda_0$ from (A.3.3-12) comes with the BC's [(A.3.3-9) and (A.3.3-10)] 'built-in' (again,
formally). Then,
    $\mathbf{u}_0 = \mathbf{u} - \nabla\Delta^{-1}\nabla \cdot \mathbf{u} = (I - \nabla\Delta^{-1}\nabla\cdot)\,\mathbf{u} = p_{J_0}\mathbf{u}$,    (A.3.3-13)
where $p_{J_0}$ signifies an $L^2$($H^0$)-projection onto the divergence-free subspace $J$; i.e., the
subscript on $J$ refers to the type of projection used to get there ($J_1$ will then be an
$H^1$-projection onto $J$). Thus, we assert that we have found our (first) projection
operator: $p_{J_0} = I - \mathrm{grad}\,\Delta^{-1}\,\mathrm{div}$ projects (in $L^2$) a given vector field onto the divergence-free
subspace [and the result satisfies (A.3.3-7)]. $p_{J_0}$ is an operator that, when applied to $\mathbf{u}$,
'subtracts off' that portion of $\mathbf{u}$ that is not divergence-free; i.e., it subtracts the curl-free
portion, a gradient. Defining additionally
    $Q_{J_0} = I - p_{J_0} = \nabla\Delta^{-1}\nabla\cdot$,    (A.3.3-14)
leads to $\nabla\lambda_0 = Q_{J_0}\mathbf{u}$, the gradient part of the decomposition, so that we have
    $\mathbf{u} = \mathbf{u}_0 + \nabla\lambda_0 = p_{J_0}\mathbf{u} + Q_{J_0}\mathbf{u}$.    (A.3.3-15)
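It is immediate to demonstrate that $p_{J_0}^2 = p_{J_0}$ and $Q_{J_0}^2 = Q_{J_0}$ as well as $\nabla\cdot\mathbf{u}_0 = 0$;
what is more interesting is to examine orthogonality: is $p_{J_0}\mathbf{u} \perp_0 Q_{J_0}\mathbf{u}$? Or, perhaps more
clumsily: does $\|\mathbf{u}\|_0^2 = \|p_{J_0}\mathbf{u}\|_0^2 + \|Q_{J_0}\mathbf{u}\|_0^2 = \|\mathbf{u}_0\|_0^2 + \|\mathbf{u}-\mathbf{u}_0\|_0^2$? The answer is: only
if $\mathbf{n}\cdot\mathbf{w} = 0$ (i.e., only when the subset of $\mathbf{H}^1$ becomes also a subspace of $\mathbf{H}^1$). Proof:
$$\|\mathbf{u}\|_0^2 = \int \mathbf{u}\cdot\mathbf{u} = \int (\mathbf{u}_0 + \nabla\lambda_0)\cdot(\mathbf{u}_0 + \nabla\lambda_0) = \int \mathbf{u}_0\cdot\mathbf{u}_0 + \int \nabla\lambda_0\cdot\nabla\lambda_0 + 2\int \mathbf{u}_0\cdot\nabla\lambda_0.$$
But $\nabla\cdot\mathbf{u}_0 = 0$ and thus
$$\int_\Omega \mathbf{u}_0\cdot\nabla\lambda_0 = \int_\Omega \nabla\cdot(\lambda_0\mathbf{u}_0) = \int_{\Gamma_D} \lambda_0\,\mathbf{n}\cdot\mathbf{w};$$
i.e.,
$$\|\mathbf{u}\|_0^2 = \|\mathbf{u}_0\|_0^2 + \|\nabla\lambda_0\|_0^2 + 2\int_{\Gamma_D}\lambda_0\,\mathbf{n}\cdot\mathbf{w} = \|p_{J_0}\mathbf{u}\|_0^2 + \|Q_{J_0}\mathbf{u}\|_0^2 + 2\int_{\Gamma_D}\lambda_0\,\mathbf{n}\cdot\mathbf{w}. \tag{A.3.3-16}$$
The $L^2$-projection is only $\perp_0$ when $\mathbf{u}_0$ is parallel to $\Gamma_D$, or if $\Gamma_D = \emptyset$ [no essential BC, a
situation that is interesting in that the $\lambda_0 = 0$ BC constrains the tangential component(s) of
$\mathbf{u}_0$ on $\Gamma$; see Remark (2) below (A.3.3-7)]. [See Chorin and Marsden (1992) for additional
discussion.]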
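The 'immediate' idempotency claims can be written out in a few lines. The following is only a formal sketch, under the same assumption the text makes: $\Delta^{-1}$ is the inverse of $\nabla\cdot\nabla$ with the BC's (A.3.3-9) and (A.3.3-10) built in.

```latex
\begin{aligned}
p_{J_0}^2 &= (I - \nabla\Delta^{-1}\nabla\cdot)(I - \nabla\Delta^{-1}\nabla\cdot) \\
          &= I - 2\,\nabla\Delta^{-1}\nabla\cdot
             + \nabla\Delta^{-1}(\nabla\cdot\nabla)\Delta^{-1}\nabla\cdot \\
          &= I - \nabla\Delta^{-1}\nabla\cdot = p_{J_0}, \\[4pt]
Q_{J_0}^2 &= \nabla\Delta^{-1}(\nabla\cdot\nabla)\Delta^{-1}\nabla\cdot
           = \nabla\Delta^{-1}\nabla\cdot = Q_{J_0}, \\[4pt]
\nabla\cdot(p_{J_0}\mathbf{u}) &= \nabla\cdot\mathbf{u}
           - (\nabla\cdot\nabla)\Delta^{-1}\nabla\cdot\mathbf{u} = 0 .
\end{aligned}
```

Each line uses only $(\nabla\cdot\nabla)\Delta^{-1} = I$; the boundary data enter solely through the definition of $\Delta^{-1}$, which is why these identities say nothing about orthogonality.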
Now that we have identified the projection, and its 'strong' formulation [(A.3.3-8)
through (A.3.3-11)], let us get more realistic and obtain the weak formulation of the
projection. It is important to point out that, while either approach is seemingly
legitimate, the most 'appropriate' weak formulation begins with the 'strong' formulation given
by (A.3.3-5) through (A.3.3-7) and not the stronger(?) one given by (A.3.3-8) through
(A.3.3-11). The reason for this is that the finite-dimensional approximation to the weak
form of (A.3.3-8) through (A.3.3-11) would, in a sense, 'lose track' of the $\nabla\cdot\mathbf{u}_0 = 0$
constraint, with the result that a $\lambda_0$ obtained from the weak formulation of (A.3.3-8)
through (A.3.3-10), which reads: 'Find $\lambda_0 \in H_0^1$ such that
$$\int \nabla\lambda_0\cdot\nabla\phi = -\int \phi\,\nabla\cdot\mathbf{u} \quad \forall\phi \in H_0^1, \tag{A.3.3-17}$$
where functions in $H_0^1 \subset H^1$ vanish on $\Gamma_N$,' will not lead, in the finite-dimensional case,
to a projected velocity, from (A.3.3-11), that will satisfy any readily identifiable version
of $\nabla\cdot\mathbf{u}_0 = 0$.
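So, we return to the 'primitive' equations in the 'mixed' formulation, (A.3.3-5) through
(A.3.3-7), this time in the weak formulation. Let $\mathbf{H}_0$ be a subspace of $\mathbf{H}^1$ such that
$\mathbf{v} \in \mathbf{H}_0 \Rightarrow \mathbf{v} \in \mathbf{H}^1$ and $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_D$. Next, let $\mathbf{w}$ be a given $\mathbf{H}^1$-extension of a vector
field from $\Gamma_D$ into $\Omega$ whose normal component on $\Gamma_D$ equals the prescribed value of $\mathbf{n}\cdot\mathbf{w}$ there. Then, to find our second
$L^2$-projection, $\mathbf{u}_0 = p_{J_0}\mathbf{u}$, we proceed as follows: Find $\mathbf{u}_0^0 \in \mathbf{H}_0$ and $\lambda_0 \in L^2$ from
$$\int (\mathbf{u}_0^0 + \mathbf{w} - \mathbf{u})\cdot\mathbf{v} - \int \lambda_0\,\nabla\cdot\mathbf{v} = 0 \quad \forall\mathbf{v} \in \mathbf{H}_0 \tag{A.3.3-18}$$
and
$$-\int q\,\nabla\cdot(\mathbf{u}_0^0 + \mathbf{w}) = 0 \quad \forall q \in L^2, \tag{A.3.3-19}$$
where integration by parts of $\mathbf{v}\cdot\nabla\lambda_0$ (along with $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_D$ and $\lambda_0 = 0$ on $\Gamma_N$) has
been employed to reduce the differentiability requirements on $\lambda_0$ to those of great interest
in the GFEM: namely, none. Note, too, that just as in the scalar $H^1$-case, inhomogeneous
Dirichlet BC's (the 'essential' variety) must be given special attention in order to invoke
the powerful machinery inherent in linear vector spaces. We remark that $\mathbf{w}$ need not
be divergence-free, and it generally is not, a feature that tends to 'obscure' the best
approximation property of the projection, as we shall see. Rearrangement of (A.3.3-18)
and (A.3.3-19) to place the data on the RHS leads to a nicer form from which to launch
our GFEM approximation: Find $\mathbf{u}_0^0 \in \mathbf{H}_0$ and $\lambda_0 \in L^2$ from
$$\int \mathbf{u}_0^0\cdot\mathbf{v} - \int \lambda_0\,\nabla\cdot\mathbf{v} = \int (\mathbf{u} - \mathbf{w})\cdot\mathbf{v} \quad \forall\mathbf{v} \in \mathbf{H}_0 \tag{A.3.3-20}$$
and
$$-\int q\,\nabla\cdot\mathbf{u}_0^0 = \int q\,\nabla\cdot\mathbf{w} \quad \forall q \in L^2. \tag{A.3.3-21}$$
Remark: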
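Recall that this solution gives $\boldsymbol{\tau}\cdot\mathbf{u}_0 = \partial\lambda_0/\partial\tau = 0$ on $\Gamma_N$. If instead $\Gamma_N$ has a
nonzero specified tangential velocity, say $\boldsymbol{\tau}\cdot\mathbf{u}_0 = g$, there needs to be added to the RHS of
(A.3.3-20) the following boundary integral: $-\int_{\Gamma_N}(\mathbf{n}\cdot\mathbf{v})\lambda_E$, where $\lambda_E = \int_\Gamma g$ and comes
from $\partial\lambda_E/\partial\tau = g$. This is the weak form of a Dirichlet BC referred to earlier, in Remark
(2) below (A.3.3-7).
Once $\mathbf{u}_0^0$ is available, we have
$$\mathbf{u}_0 = \mathbf{u}_0^0 + \mathbf{w} = p_{J_0}\mathbf{u}, \tag{A.3.3-22}$$
and the projection proof ($p_{J_0}^2 = p_{J_0}$) follows easily: place $\mathbf{u}_0$ into (A.3.3-20) and realize
that $\mathbf{u}_0^0$ and $\lambda_0 = 0$ solve the pair; thus, $p_{J_0}\mathbf{u}_0 = \mathbf{u}_0 = p_{J_0}^2\mathbf{u}$.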
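Before leaving the $\infty$-dimensional case, let us again examine orthogonality, or lack
thereof. We have (weakly)
$$\mathbf{u} = \mathbf{u}_0^0 + \mathbf{w} + \nabla\lambda_0 = \mathbf{u}_0 + \nabla\lambda_0 = p_{J_0}\mathbf{u} + Q_{J_0}\mathbf{u}, \tag{A.3.3-23}$$
which leads to
$$\int \mathbf{u}\cdot\mathbf{u} = \int \mathbf{u}_0\cdot\mathbf{u}_0 + \int \nabla\lambda_0\cdot\nabla\lambda_0 + 2\int \mathbf{u}_0\cdot\nabla\lambda_0,$$
which, using $\int \lambda_0\nabla\cdot\mathbf{u}_0 = 0$ from (A.3.3-21) with $q = \lambda_0$, $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}$ on $\Gamma_D$, and
$\lambda_0 = 0$ on $\Gamma_N$ yields, for the last term on the RHS,
$$\int \mathbf{u}_0\cdot\nabla\lambda_0 = \int_\Gamma \lambda_0\,\mathbf{n}\cdot\mathbf{u}_0 - \int \lambda_0\nabla\cdot\mathbf{u}_0 = \int_{\Gamma_D}\lambda_0\,\mathbf{n}\cdot\mathbf{w},$$
which leads to
$$\|\mathbf{u}\|_0^2 = \|\mathbf{u}_0\|_0^2 + \|\nabla\lambda_0\|_0^2 + 2\int_{\Gamma_D}\lambda_0\,\mathbf{n}\cdot\mathbf{w}, \tag{A.3.3-24}$$
showing that, analogous to (A.3.3-16), $\mathbf{u}_0 \perp_0 \nabla\lambda_0$ if and only if $\mathbf{n}\cdot\mathbf{w} = 0$; i.e., we (again)
need $\mathbf{n}\cdot\mathbf{w} = 0$ on $\Gamma_D$ in order to have $\perp_0$; i.e., if $\Gamma_D \neq \emptyset$, then the flow must be parallel
to $\Gamma_D$.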
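But, as in the scalar $H^1$-case, an alternate version of orthogonality is also of interest:
start from $\mathbf{u} - \mathbf{w} = \mathbf{u}_0^0 + \nabla\lambda_0$ and form $\int(\mathbf{u} - \mathbf{w})^2$ to obtain
$$\|\mathbf{u} - \mathbf{w}\|_0^2 = \|\mathbf{u}_0^0\|_0^2 + \|\nabla\lambda_0\|_0^2 + 2\int \mathbf{u}_0^0\cdot\nabla\lambda_0,$$
where
$$\int \mathbf{u}_0^0\cdot\nabla\lambda_0 = \int_\Gamma \lambda_0\,\mathbf{n}\cdot\mathbf{u}_0^0 - \int \lambda_0\nabla\cdot\mathbf{u}_0^0 = \int \lambda_0\nabla\cdot\mathbf{w},$$
to give
$$\|\mathbf{u} - \mathbf{w}\|_0^2 = \|\mathbf{u}_0 - \mathbf{w}\|_0^2 + \|\nabla\lambda_0\|_0^2 + 2\int \lambda_0\nabla\cdot\mathbf{w}, \tag{A.3.3-25}$$
and we obtain an orthogonal decomposition/projection of $\mathbf{u} - \mathbf{w}$ to $\mathbf{u}_0 - \mathbf{w} + \nabla\lambda_0$ if and
only if $\mathbf{w}$ is divergence-free; i.e., a divergence-free $\mathbf{H}^1$-extension of $\mathbf{n}\cdot\mathbf{w}$ from $\Gamma_D$ into $\Omega$
will generate a $\perp_0$-projection (of $\mathbf{u} - \mathbf{w}$), even when $\mathbf{n}\cdot\mathbf{w} \neq 0$. With that we have reached
the end of the $p_{J_0}$-projection discussion.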
A.3.3.2 The $p^h_{J_0}$-Projection
It is now relatively easy, as usual, to specialize from an $\infty$-dimensional setting to the
finite-dimensional one via our FEM basis functions, which we present, for 'convenience,'
in 2D only. And here again, somewhat in violation of the advice provided by many of our
FEM mathematician friends, we permit (and indeed compute with) ostensibly unstable
'element pairs,' such as $Q_1Q_0$, $Q_1Q_{-1}$, etc. But this issue is of minor concern relative to
our current discussion in that our presentation would not be affected in any appreciable
way if we restricted attention to only 'stable' elements, which the reader is welcome
to do.
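So, consider now the finite-dimensional subspaces $\mathbf{V}^h \subset \mathbf{H}^1$ [or $\mathbf{H}(\mathrm{div})$] for velocity
and $S^h \subset L^2$ for pressure as well as the additional subspace, $J^h \subset \mathbf{V}^h$, of discretely
divergence-free vectors; and seek a GFEM solution to (A.3.3-20) and (A.3.3-21) via
$$\mathbf{u}_0^h = \mathbf{u}^h + \mathbf{u}^I \in J^h, \tag{A.3.3-26}$$
where $\mathbf{u}_0^h(\mathbf{x})$ is the projected (weakly) divergence-free velocity,
$$\mathbf{u}^h = \begin{pmatrix} u^h \\ v^h \end{pmatrix} = \sum_{j=1}^{N}\begin{pmatrix} u_j \\ v_j \end{pmatrix}\phi_j = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix}, \tag{A.3.3-27}$$
where $N$ is the total number of nodes in $\Omega$ and on $\Gamma_N$, and $\mathbf{u}^I$ is the $\mathbf{H}^1$-extension
(corresponding to $\mathbf{w}$) given by interpolation (another projection):
$$\mathbf{u}^I = \sum_{j \in \Gamma_D} (\mathbf{n}\cdot\mathbf{w})_j\,\mathbf{n}_j\,\phi_j \in \mathbf{H}^1, \tag{A.3.3-28}$$
where $(\mathbf{n}\cdot\mathbf{w})_j$ is the (specified) value of $\mathbf{n}\cdot\mathbf{w}$ at node $j$, and
$$\mathbf{n}_j = \int \nabla\phi_j \Big/ \Big|\int \nabla\phi_j\Big| \tag{A.3.3-29}$$
is the mass-consistent unit normal at node $j$ (see Section 3.13.1e). Note that $\mathbf{u}^I$ is generally
not divergence-free because the velocity basis functions are not. Note also that the 'proper'
(mass-consistent) value of the normal component of $\mathbf{u}^I$ is only obtained via $\mathbf{n}_j\cdot\mathbf{u}^I$ with
$\mathbf{n}_j$ from (A.3.3-29). And note, finally, that (A.3.3-29) is also required in order to assure
that $\mathbf{u}^h \in \mathbf{H}_0$; i.e., $\mathbf{n}\cdot\mathbf{u}^h = 0$ on $\Gamma_D$ can only be assured using the consistent normal.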
We also need
$$\lambda_0^h = \sum_{j=1}^{N_p}\lambda_j\psi_j \in S^h, \tag{A.3.3-30}$$
where $N_p$ is the total number of 'pressure' nodes, after which the combination of (A.3.3-20)
and (A.3.3-21) with $\mathbf{v} = (\phi_i, \phi_i)^T$, $i = 1, 2, \ldots, N$ and $q = \psi_i$, $i = 1, 2, \ldots, N_p$ there, and
equations (A.3.3-26) through (A.3.3-29), give
$$\sum_{j=1}^{N} u_j\int \phi_i\phi_j - \sum_{j=1}^{N_p}\lambda_j\int \psi_j\frac{\partial\phi_i}{\partial x} = \int (u - u^I)\phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.3-31}$$
$$\sum_{j=1}^{N} v_j\int \phi_i\phi_j - \sum_{j=1}^{N_p}\lambda_j\int \psi_j\frac{\partial\phi_i}{\partial y} = \int (v - v^I)\phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.3-32}$$
and
$$-\sum_{j=1}^{N}\left(u_j\int \psi_i\frac{\partial\phi_j}{\partial x} + v_j\int \psi_i\frac{\partial\phi_j}{\partial y}\right) = \int \psi_i\,\nabla\cdot\mathbf{u}^I, \quad i = 1, 2, \ldots, N_p, \tag{A.3.3-33}$$
the set of $2N + N_p$ GFEM equations of the $L^2$-projection, and this may be a good time to
make the following important remark: even if $\mathbf{u}$ is 'perfectly' divergence-free ($\nabla\cdot\mathbf{u} = 0$),
its projection ($\mathbf{u}_0^h$) will not generally be, which leads to two further remarks: (i) this
projection is also a transformation (mapping) in that it transforms a divergence-free vector
to a discretely divergence-free vector, and (more importantly) (ii) a (seemingly
'superior') perfectly divergence-free vector field must be projected to its ('inferior') discretely
divergence-free counterpart in order to provide a legitimate IC for time-integration of the
GFEM NS equations. This is because $J^h$ is not a subset of $J$, even though $J^h \subset \mathbf{V}^h$ and
$\mathbf{V}^h \subset \mathbf{H}^1$ and $J \subset \mathbf{H}^1$.
Before proceeding, it is worthwhile noting that omitting the Lagrange multiplier terms
in (A.3.3-31) and (A.3.3-32) and dropping (A.3.3-33) recovers the familiar $L^2$-projections
of the scalar case, one for $u$ and one for $v$, uncoupled [albeit with some unnecessary
(but permissible) BC's/constraints]; i.e., the RHS's of (A.3.3-31) and (A.3.3-32) could
be rewritten in terms of a (non-divergence-free) discrete $L^2$-projection with nodal values
$(\tilde u_j, \tilde v_j)$ via
$$\begin{pmatrix} M_x & 0 \\ 0 & M_y \end{pmatrix}\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} = \begin{pmatrix} M_x\tilde u \\ M_y\tilde v \end{pmatrix} = \begin{pmatrix} \int (u - u^I)\varphi \\ \int (v - v^I)\varphi \end{pmatrix}, \tag{A.3.3-34}$$
which here defines $(\tilde u, \tilde v)$ and can be compared to (A.3.2-11) through (A.3.2-13).
The Venn/potato diagram in Figure A.3.3-1 depicts some of the preceding observations,
showing some projections of two functions in $\mathbf{H}^1$, one divergence-free ($\mathbf{u}_1$) and the other
not ($\mathbf{u}_2$); the 'suggestion' that $p^h_{J_0} = p^h_{J_0}p^h_0$ will be proven later, in Section A.3.3.5.
Fig. A.3.3-1 Some $L^2$-projections (non-orthogonal in general).
The condensed form of (A.3.3-31) through (A.3.3-33) will also be useful:
$$M_x u + C_x\lambda = b_x = M_x\tilde u, \tag{A.3.3-35}$$
$$M_y v + C_y\lambda = b_y = M_y\tilde v, \tag{A.3.3-36}$$
$$C_x^T u + C_y^T v = g, \tag{A.3.3-37}$$
with the obvious definitions of matrices and vectors; because of our simplified choice
of BC's, $M_x = M_y$. Solving (A.3.3-35) and (A.3.3-36) for $u$ and $v$ and placing the results
into (A.3.3-37) yields the discrete equation for the Lagrange multiplier
$$(C_x^T M_x^{-1} C_x + C_y^T M_y^{-1} C_y)\lambda = C^T M^{-1} C\lambda = A\lambda = C_x^T\tilde u + C_y^T\tilde v - g, \tag{A.3.3-38}$$
which is the discrete analog of (A.3.3-12); $A \sim -\nabla^2$ and $M^{-1}C \sim \nabla$.
Remarks:
(1) If u is interpolated into the basis functions and the mass then lumped on both sides of
(A.3.3-35) and (A.3.3-36), we would obtain the (relatively inexpensive via
sequential solutions) discretely divergence-free interpolation projection, say p1} , which is
related to a commonly used projection method when solving the time-dependent
Navier-Stokes equations (see Section 3.16.6c). [If the mass is not lumped after the
interpolation, then a different (and more expensive) divergence-free projection is
obtained.]
(2) As in the continuous case, setting u = 0 yields a mixed interpolation approximation
to potential flow.
Solving for $\lambda$ and placing the result in (A.3.3-35) and (A.3.3-36) gives
$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} I_x - M_x^{-1}C_xA^{-1}C_x^T & -M_x^{-1}C_xA^{-1}C_y^T \\ -M_y^{-1}C_yA^{-1}C_x^T & I_y - M_y^{-1}C_yA^{-1}C_y^T \end{pmatrix}\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} + \begin{pmatrix} M_x^{-1}C_xA^{-1}g \\ M_y^{-1}C_yA^{-1}g \end{pmatrix} \tag{A.3.3-39}$$
or, in further condensed form,
$$A\lambda = C^T\tilde u - g, \tag{A.3.3-40}$$
$$u = \tilde u - M^{-1}CA^{-1}(C^T\tilde u - g), \tag{A.3.3-41}$$
or
$$u = (I - M^{-1}CA^{-1}C^T)\tilde u + M^{-1}CA^{-1}g \tag{A.3.3-42}$$
$$\;\; = P_0\tilde u + M^{-1}CA^{-1}g, \tag{A.3.3-43}$$
where $P_0$ is the $L^2$-projection matrix; $P_0^2 = P_0$ ($P_0$ is a projection), $C^TP_0 = 0$ ($P_0$ is also a
divergence-free projection), and $P_0M^{-1}C = 0$ (gradients project to zero). As noted earlier,
the eigenvalues of any projection operator are either zero or one. In the case of $P_0$, the
zero eigenvalues (of which there are $N_p$ when $C$ has full rank) correspond to eigenvectors
that are gradients of scalars (the null space of $P_0$: $x = M^{-1}Cq$ for some (non-trivial) $q$
has $P_0x = 0$), and the unit eigenvalues (of which there are $2N - N_p$) correspond to
divergence-free eigenvectors (at least when $C$ has full rank; see Section 3.13.2b). Finally,
we note that $P_0$ is an orthogonal projection via the discrete $L^2$-inner product, $(x, y) =
x^TMy$; i.e., with $Q_0 = I - P_0$, $(P_0\tilde u, Q_0\tilde u) = (P_0\tilde u)^TM(Q_0\tilde u) = (P_0\tilde u)^TMM^{-1}CA^{-1}C^T\tilde u =
(C^TP_0\tilde u)^TA^{-1}C^T\tilde u = 0$ because $C^TP_0 = 0$. Thus, $P_0$ is an '$L^2$-orthogonal divergence-free
projection' matrix.
Remark:
This $L^2$-projection can be closely realized via the BE or STR time integration method by
taking one very small time step in which $\lambda = \Delta t\,P$; see Section 3.16.1d.
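The stated properties of $P_0$ are easy to check numerically. The sketch below is illustrative only: `M` and `C` are random stand-ins (an SPD matrix and a full-rank rectangular matrix), not matrices assembled from a finite element mesh, and all variable names are assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vel, n_pres = 8, 3   # toy sizes standing in for 2N and N_p

# Assumed stand-ins, not assembled from a mesh: a random SPD "mass" matrix M
# and a random full-rank "gradient" matrix C.
R = rng.standard_normal((n_vel, n_vel))
M = R @ R.T + n_vel * np.eye(n_vel)
C = rng.standard_normal((n_vel, n_pres))

Minv_C = np.linalg.solve(M, C)                          # M^{-1} C
A = C.T @ Minv_C                                        # A = C^T M^{-1} C
P0 = np.eye(n_vel) - Minv_C @ np.linalg.solve(A, C.T)   # cf. (A.3.3-42)
Q0 = np.eye(n_vel) - P0

u_tilde = rng.standard_normal(n_vel)   # pre-projection nodal values
g = rng.standard_normal(n_pres)        # constraint data, C^T u = g

lam = np.linalg.solve(A, C.T @ u_tilde - g)   # (A.3.3-40)
u = u_tilde - Minv_C @ lam                    # (A.3.3-41)

def m_inner(x, y):
    """Discrete L2 inner product (x, y) = x^T M y."""
    return float(x @ M @ y)

print("P0 is idempotent:          ", np.allclose(P0 @ P0, P0))
print("C^T P0 vanishes:           ", np.allclose(C.T @ P0, 0.0, atol=1e-9))
print("gradients project to zero: ", np.allclose(P0 @ Minv_C, 0.0, atol=1e-9))
print("constraint C^T u = g holds:", np.allclose(C.T @ u, g))
print("number of unit eigenvalues:", round(float(np.trace(P0))))
print("(P0 u~, Q0 u~)_M:          ", m_inner(P0 @ u_tilde, Q0 @ u_tilde))
```

The trace is used to count unit eigenvalues because, for any projector, the trace equals the rank; this avoids an eigenvalue solve on a non-symmetric matrix.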
Solving (A.3.3-35) through (A.3.3-37) is equivalent to applying (A.3.3-43) to $\tilde u$ with the
result that $C^Tu = g$; the resulting (discrete) vector field is discretely divergence-free. Also,
$Q_0$ yields the gradient part: $M^{-1}C\lambda = Q_0\tilde u - M^{-1}CA^{-1}g = M^{-1}CA^{-1}(C^T\tilde u - g)$, and we
have
$$\tilde u = u + M^{-1}C\lambda = P_0\tilde u + Q_0\tilde u, \tag{A.3.3-44}$$
which mimics (A.3.3-15), except that here the (discrete) projection is $\perp_0$. Note that,
especially clearly from (A.3.3-41), if the $\tilde u$-field is already (discretely) divergence-free, then
the projection does nothing; $u = \tilde u$, and (A.3.3-44) shows that, in general, the projection
of $\tilde u$ 'subtracts out' the gradient portion to leave the divergence-free portion.
Testing orthogonality of $u$ and $M^{-1}C\lambda$, however, leads to
$$\|\tilde u\|_0^2 = (u + M^{-1}C\lambda)^TM(u + M^{-1}C\lambda) = u^TMu + (M^{-1}C\lambda)^TM(M^{-1}C\lambda) + 2u^TC\lambda;$$
but
$$u^TC\lambda = \lambda^TC^Tu = \lambda^Tg,$$
and we thus obtain
$$\|\tilde u\|_0^2 = \|u\|_0^2 + \|M^{-1}C\lambda\|_0^2 + 2\lambda^Tg, \tag{A.3.3-45}$$
à la (A.3.3-25); $L^2$-orthogonality between $u$ and $M^{-1}C\lambda$ obtains only for $g = 0$ which, in
this case, would require $\mathbf{n}\cdot\mathbf{w} = 0$ (or $\Gamma_D = \emptyset$), since $\mathbf{u}^I$ is not a divergence-free extension.
The comparison of (A.3.3-45) with (A.3.3-25) rather than with (A.3.3-24) is appropriate
because the discrete vector, $\tilde u$, of (A.3.3-44), corresponds to $\mathbf{u} - \mathbf{w}$, $u$ to $\mathbf{u}_0^0$, and $g$ to
$\nabla\cdot\mathbf{w}$ in the $\infty$-dimensional case.
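A small numerical check of the norm identity (A.3.3-45) may help fix ideas. This is a sketch under assumptions: the matrices are random stand-ins rather than finite element matrices, and the helper names `project` and `m_norm_sq` are hypothetical.

```python
import numpy as np

def m_norm_sq(x, M):
    """Squared discrete L2 norm, x^T M x."""
    return float(x @ M @ x)

def project(M, C, u_tilde, g):
    """Return (u, M^{-1} C lam, lam) from (A.3.3-40) and (A.3.3-41)."""
    Minv_C = np.linalg.solve(M, C)
    lam = np.linalg.solve(C.T @ Minv_C, C.T @ u_tilde - g)
    grad = Minv_C @ lam
    return u_tilde - grad, grad, lam

rng = np.random.default_rng(1)
n_vel, n_pres = 10, 4
R = rng.standard_normal((n_vel, n_vel))
M = R @ R.T + n_vel * np.eye(n_vel)        # assumed SPD stand-in for the mass matrix
C = rng.standard_normal((n_vel, n_pres))   # assumed full-rank stand-in for C
u_tilde = rng.standard_normal(n_vel)

results = {}
for label, g in (("homogeneous", np.zeros(n_pres)),
                 ("inhomogeneous", rng.standard_normal(n_pres))):
    u, grad, lam = project(M, C, u_tilde, g)
    lhs = m_norm_sq(u_tilde, M)
    cross = 2.0 * float(lam @ g)            # the term that spoils orthogonality
    rhs = m_norm_sq(u, M) + m_norm_sq(grad, M) + cross
    results[label] = (lhs, rhs, cross)
    print(label, "lhs - rhs =", lhs - rhs, " cross term =", cross)
```

With $g = 0$ the cross term vanishes identically and the decomposition is orthogonal in the $M$-inner product; with $g \neq 0$ the identity still holds but only with the extra $2\lambda^Tg$ term.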
This seems like a good place to quote (with a few necessary notational changes) from a
nice paper of Strang (1988, p. 283), for the case $g = 0$: 'The first two equations [(A.3.3-35)
and (A.3.3-36)] separate $\tilde u$ (and $\tilde v$) into its components $u$ (and $v$) and $M^{-1}C_x\lambda$ (and
$M^{-1}C_y\lambda$). The third equation [(A.3.3-37)] makes these components perpendicular. There
is only one solution: one choice from each subspace that will combine to give $\tilde u$ (and
$\tilde v$). It is found from (A.3.3-38).'
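Finally, we consider/introduce the continuous projection operator, $p^h_{J_0}$, that is induced
by the matrix projection, $P_0$; $p^h_{J_0}$ is a finite-dimensional $L^2$ ($H^0$)-projection operator onto
the discretely divergence-free subspace, $J^h$. To do this, we first state the procedure in
words, for clarity:
1. Given $\mathbf{u}(\mathbf{x}) = (u(\mathbf{x}), v(\mathbf{x}))^T$ and $\mathbf{n}\cdot\mathbf{w}$ on $\Gamma_D$, make a first approximation to the former via
$\tilde{\mathbf{u}}^h(\mathbf{x}) = \sum_{j=1}^{N}(\tilde u_j, \tilde v_j)^T\phi_j(\mathbf{x})$
and approximate the latter by $\mathbf{u}^I$ via (A.3.3-28).
2. To obtain $(\tilde u_j, \tilde v_j)$, perform the (non-divergence-free) $L^2$-projection by solving
(A.3.3-34).
3. Project $(\tilde u_j, \tilde v_j)$ to the divergence-free subspace via (A.3.3-39), giving $(u_j, v_j)$.
4. Insert the result into (A.3.3-26) and (A.3.3-27); done. We have the projection, which is:
$$\mathbf{u}_0^h(\mathbf{x}) = p^h_{J_0}\mathbf{u} = \mathbf{u}^h + \mathbf{u}^I = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} + \mathbf{u}^I; \tag{A.3.3-46}$$
i.e.,
$$\begin{pmatrix} u_0^h \\ v_0^h \end{pmatrix} = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{\begin{pmatrix} I_x - M_x^{-1}C_xA^{-1}C_x^T & -M_x^{-1}C_xA^{-1}C_y^T \\ -M_y^{-1}C_yA^{-1}C_x^T & I_y - M_y^{-1}C_yA^{-1}C_y^T \end{pmatrix}\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} + \begin{pmatrix} M_x^{-1}C_xA^{-1}g \\ M_y^{-1}C_yA^{-1}g \end{pmatrix}\right\} + \begin{pmatrix} u^I \\ v^I \end{pmatrix} \tag{A.3.3-47}$$
$$\;\; = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \begin{pmatrix} u^I \\ v^I \end{pmatrix}, \tag{A.3.3-48}$$
where
$$\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} = \begin{pmatrix} M_x^{-1}\int (u - u^I)\varphi \\ M_y^{-1}\int (v - v^I)\varphi \end{pmatrix}. \tag{A.3.3-49}$$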
Remark:
In practice, of course, the (numerical) solution would be done all-at-once by solving the
coupled system (A.3.3-31) through (A.3.3-33), since $M^{-1}$ is dense. Unless we lump the
mass, in which case the sequential procedure, (A.3.3-38), (A.3.3-35) and (A.3.3-36),
is viable; but then the $L^2$-projection is sacrificed, although we still do have a discretely
divergence-free (interpolation) projection. In this sense, the 'small-$\Delta t$' BE
approximation/trick mentioned in the Remark below (A.3.3-43) is closer to a true $L^2$-projection than
is the lumped mass 'sequential procedure'.
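The three options in this Remark can be compared on a toy problem. The sketch below rests on stated assumptions: `M` is a 1D linear-element mass matrix (unit mesh size) used purely as a convenient SPD matrix with positive entries, `C` is a random stand-in, lumping is done by row sums, and the helper `sequential` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(2)
n_vel, n_pres = 6, 2

# Assumed stand-in: consistent mass matrix of 1D linear elements with h = 1.
diag = np.array([2.0, 4.0, 4.0, 4.0, 4.0, 2.0])
M = (np.diag(diag) + np.eye(n_vel, k=1) + np.eye(n_vel, k=-1)) / 6.0
C = rng.standard_normal((n_vel, n_pres))   # assumed full-rank stand-in
u_tilde = rng.standard_normal(n_vel)
g = rng.standard_normal(n_pres)

# (a) All-at-once: solve the coupled saddle-point system (A.3.3-35)-(A.3.3-37).
saddle = np.block([[M, C], [C.T, np.zeros((n_pres, n_pres))]])
sol = np.linalg.solve(saddle, np.concatenate([M @ u_tilde, g]))
u_coupled, lam_coupled = sol[:n_vel], sol[n_vel:]

def sequential(mass, u_pre):
    """Multiplier first, cf. (A.3.3-38), then velocity from (A.3.3-35)/(A.3.3-36)."""
    minv_c = np.linalg.solve(mass, C)
    lam = np.linalg.solve(C.T @ minv_c, C.T @ u_pre - g)
    return u_pre - minv_c @ lam, lam

# (b) Sequential with the consistent mass: same answer, but needs M^{-1}.
u_seq, lam_seq = sequential(M, u_tilde)

# (c) Sequential with row-sum lumped mass: cheap, still satisfies C^T u = g,
#     but it is no longer the L2-projection.
M_lumped = np.diag(M.sum(axis=1))
u_lumped, lam_lumped = sequential(M_lumped, u_tilde)

print("coupled == sequential (consistent):", np.allclose(u_coupled, u_seq))
print("lumped result constrained:         ", np.allclose(C.T @ u_lumped, g))
print("lumped == consistent:              ", np.allclose(u_lumped, u_seq))
```

For a diagonal lumped mass the two solves inside `sequential` reduce to scalings, which is what makes the sequential route attractive in practice.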
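To verify (!) the alleged projection (the last one for us), we project again; i.e., $p^h_{J_0}\mathbf{u}_0^h =
(p^h_{J_0})^2\mathbf{u}$ gives
$$p^h_{J_0}\mathbf{u}_0^h = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0M^{-1}\int\begin{pmatrix} \varphi\,(u_0^h - u^I) \\ \varphi\,(v_0^h - v^I) \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \begin{pmatrix} u^I \\ v^I \end{pmatrix},$$
which, using (A.3.3-47) and (A.3.3-48), gives
$$p^h_{J_0}\mathbf{u}_0^h = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0M^{-1}\int\begin{pmatrix} \varphi & 0 \\ 0 & \varphi \end{pmatrix}\begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left[P_0\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} + M^{-1}CA^{-1}g\right] + M^{-1}CA^{-1}g\right\} + \begin{pmatrix} u^I \\ v^I \end{pmatrix}$$
$$\;\; = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \begin{pmatrix} u^I \\ v^I \end{pmatrix} = \mathbf{u}_0^h, \tag{A.3.3-50}$$
because
$$M^{-1}\int\begin{pmatrix} \varphi\varphi^T & 0 \\ 0 & \varphi\varphi^T \end{pmatrix} = I,$$
$P_0^2 = P_0$, and $P_0M^{-1}C = 0$. QED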
In concluding this section, we assert the truth of the following claim and leave the proof
as an exercise: if and only if $\mathbf{n}\cdot\mathbf{w} = 0$ (giving $\mathbf{u}^I = 0$ and therefore $g = 0$) in (A.3.3-47)
and (A.3.3-48), it follows that $\|\mathbf{u}\|_0^2 = \|p^h_{J_0}\mathbf{u}\|_0^2 + \|Q^h_{J_0}\mathbf{u}\|_0^2$, where $Q^h_{J_0} \equiv I - p^h_{J_0}$; i.e., the
projection is $\perp_0$ if the Dirichlet BC is homogeneous [$\mathbf{u}$ is parallel to $\Gamma_D$, à la (A.3.3-16)].
A.3.3.3 The $p_{J_1}$-Projection
In this final case of interest for the continuous case there is no ambiguity regarding the
smoothness/regularity that is required; the velocities are in $\mathbf{H}^1$, and the 'pressures' (i.e.,
Lagrange multipliers) can reside in $L^2$; both, of course, when the appropriate equations
are put in the appropriate weak form. As for the scalar case, usually we presume that
there is always a Dirichlet portion of $\Gamma$ ($\Gamma_D \neq \emptyset$). Thus, given a vector field $\mathbf{u}(\mathbf{x}) \in \mathbf{H}^1$,
we seek its closest neighbor in $\mathbf{H}^1$, say $\mathbf{u}_1(\mathbf{x})$, that is both divergence-free and retains the
value of $\mathbf{u}$ on $\Gamma_D$; $\mathbf{u}_1 = \mathbf{u}$ on $\Gamma_D$.
Combining our knowledge from the $H^1$-scalar case and the $L^2$-vector case just
examined permits a rather more expeditious treatment of this case, and that is good because
this is the most difficult case. (And, we are all tired.)
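Again, we begin with a saddle-point problem via the introduction of an associated
Lagrange multiplier, say $\lambda(\mathbf{x})$, via the following functional:
$$F(\mathbf{v}, \lambda) = \tfrac{1}{2}\int \nabla(\mathbf{v} - \mathbf{u}) : [\nabla(\mathbf{v} - \mathbf{u})]^T - \int \lambda\nabla\cdot\mathbf{v} = \tfrac{1}{2}\int \nabla(v_\alpha - u_\alpha)\cdot\nabla(v_\alpha - u_\alpha) - \int \lambda\nabla\cdot\mathbf{v}, \tag{A.3.3-51}$$
where the Greek index implies summation over the dimension of the underlying Euclidean
space; e.g., in 2D, $\nabla u_\alpha\cdot\nabla v_\alpha = \nabla u_1\cdot\nabla v_1 + \nabla u_2\cdot\nabla v_2 = \nabla\mathbf{u} : (\nabla\mathbf{v})^T$. In (A.3.3-51), $\mathbf{v} \in
\mathbf{H}^1_E$; i.e., $\mathbf{v} \in \mathbf{H}^1$ with $\mathbf{v} = \mathbf{u}$ on $\Gamma_D$; also, $\mathbf{H}_0 \subset \mathbf{H}^1$ has $\mathbf{v} = 0$ on $\Gamma_D$, and $\lambda \in L^2$. Taking
the first variation of $\mathbf{v}$ and $\lambda$ to give $\delta F$ and setting $\delta F = 0$ gives an alleged stationary
point with $\mathbf{v} = \mathbf{u}_1$ and $\lambda = \lambda_1$ there; i.e.,
$$0 = \int \nabla(u_{1\alpha} - u_\alpha)\cdot\nabla\delta v_\alpha - \int \lambda_1\nabla\cdot\delta\mathbf{v} - \int \delta\lambda\,\nabla\cdot\mathbf{u}_1, \tag{A.3.3-52}$$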
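which is the starting point for obtaining each of two results: (i) the Euler-Lagrange
equations of the stationary point and (ii) the weak formulation of same. To obtain the
former (and the implied/formal projection operator), we use integration by parts on the
first two terms to get
$$0 = \int \nabla\cdot[\delta v_\alpha\nabla(u_{1\alpha} - u_\alpha)] - \int \delta v_\alpha\nabla^2(u_{1\alpha} - u_\alpha) - \int \nabla\cdot(\lambda_1\delta\mathbf{v}) + \int \delta\mathbf{v}\cdot\nabla\lambda_1 - \int \delta\lambda\,\nabla\cdot\mathbf{u}_1. \tag{A.3.3-53}$$
But $\delta v_\alpha = 0$ on $\Gamma_D$ and thus
$$0 = \int_{\Gamma_N}\left[\frac{\partial}{\partial n}(u_{1\alpha} - u_\alpha) - \lambda_1n_\alpha\right]\delta v_\alpha - \int [\nabla^2(\mathbf{u}_1 - \mathbf{u}) - \nabla\lambda_1]\cdot\delta\mathbf{v} - \int \delta\lambda\,\nabla\cdot\mathbf{u}_1, \tag{A.3.3-54}$$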
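which leads to the Euler-Lagrange equations,
$$\nabla^2\mathbf{u}_1 - \nabla\lambda_1 = \nabla^2\mathbf{u} \quad\text{and}\quad \nabla\cdot\mathbf{u}_1 = 0 \quad\text{in } \Omega, \tag{A.3.3-55}$$
$$\frac{\partial\mathbf{u}_1}{\partial n} - \lambda_1\mathbf{n} = \frac{\partial\mathbf{u}}{\partial n} \quad\text{on } \Gamma_N, \tag{A.3.3-56}$$
and, of course,
$$\mathbf{u}_1 = \mathbf{u} \quad\text{on } \Gamma_D. \tag{A.3.3-57}$$
Remarks:
(1) These 'look like' the steady Stokes equations with 'body force' $\nabla^2\mathbf{u}$.
(2) The BC on $\Gamma_N$ is an NBC.
(3) The solution of (A.3.3-55) through (A.3.3-57) is a saddle-point of (A.3.3-51).
(4) If $\nabla\cdot\mathbf{u} = 0$, then the solution turns out to be $\lambda_1 = 0$ and $\mathbf{u}_1 = \mathbf{u}$; i.e., $\mathbf{u}$ was already
in the subspace.
(5) If $\mathbf{u} = 0$ in $\Omega$ and $\partial\mathbf{u}/\partial n = \mathbf{f}$ on $\Gamma_N$ and $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$, we have steady Stokes flow.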
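Now we will use these PDE's to derive the projection operator; simply 'solve'
(A.3.3-55) through (A.3.3-57) for $\mathbf{u}_1$ (which of course includes the BC's) and place
the result into $\nabla\cdot\mathbf{u}_1 = 0$:
$$\mathbf{u}_1 = \mathbf{u} + \Delta^{-1}\nabla\lambda_1, \tag{A.3.3-58}$$
giving
$$-\nabla\cdot\Delta^{-1}\nabla\lambda_1 = \nabla\cdot\mathbf{u}, \tag{A.3.3-59}$$
which is 'solved' for $\lambda_1$,
$$\lambda_1 = -(\nabla\cdot\Delta^{-1}\nabla)^{-1}\nabla\cdot\mathbf{u}, \tag{A.3.3-60}$$
and the result placed in (A.3.3-58) to give
$$\mathbf{u}_1 = \mathbf{u} - \Delta^{-1}\nabla(\nabla\cdot\Delta^{-1}\nabla)^{-1}\nabla\cdot\mathbf{u} = [I - \Delta^{-1}\nabla(\nabla\cdot\Delta^{-1}\nabla)^{-1}\nabla\cdot]\,\mathbf{u} = p_{J_1}\mathbf{u}, \tag{A.3.3-61}$$
a rather formidable formal operator! Nevertheless, it is a projection operator ($p_{J_1}^2 = p_{J_1}$)
as is $Q_{J_1} = I - p_{J_1}$, which is related to $\nabla\lambda_1$ via
$$-\nabla\lambda_1 = \Delta Q_{J_1}\mathbf{u}; \tag{A.3.3-62}$$
i.e., we have obtained the decomposition
$$\mathbf{u} = \mathbf{u}_1 - \Delta^{-1}\nabla\lambda_1 = p_{J_1}\mathbf{u} + Q_{J_1}\mathbf{u}, \tag{A.3.3-63}$$
with $\nabla\cdot p_{J_1} = 0$ (the projected vector lies in the null space of div) and $\nabla\cdot Q_{J_1} = \nabla\cdot$; i.e.,
$Q_{J_1}$ has no effect on the divergence. $p_{J_1}$ is an $H^1$-projector onto $J$.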
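The projection property of this 'formidable' operator can be verified formally by abbreviating its pieces. The symbols $G$, $D$ and $S$ below are shorthand introduced only for this sketch (they are not the text's notation), and the inverses are assumed to exist with the BC's built in.

```latex
% Shorthand (assumed here): G = \Delta^{-1}\nabla,\quad D = \nabla\cdot,\quad S = DG = \nabla\cdot\Delta^{-1}\nabla
\begin{aligned}
p_{J_1} &= I - G S^{-1} D, \qquad Q_{J_1} = G S^{-1} D, \\[4pt]
p_{J_1}^2 &= I - 2\,G S^{-1} D + G S^{-1}(D G) S^{-1} D
           = I - G S^{-1} D = p_{J_1}, \\[4pt]
D\,p_{J_1} &= D - (D G) S^{-1} D = D - D = 0, \\[4pt]
D\,Q_{J_1} &= (D G) S^{-1} D = D .
\end{aligned}
```

The last two lines are the statements $\nabla\cdot p_{J_1} = 0$ and $\nabla\cdot Q_{J_1} = \nabla\cdot$ quoted above.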
Remark:
Rewriting $Q_{J_1}$ as $\Delta^{-1}\,\mathrm{grad}\,[\mathrm{div}\,\Delta^{-1}\,\mathrm{grad}]^{-1}\,\mathrm{div}$ and recalling that $\mathrm{div}\,\mathrm{grad} = \Delta$, it is
tempting to suggest that $\mathrm{div}\,\Delta^{-1}\,\mathrm{grad}$ approximates the identity operator so that $Q_{J_1} \approx
\Delta^{-1}\,\mathrm{grad}\,\mathrm{div}$, etc.; but to yield to this temptation would probably be counterproductive.
We merely refer the reader to (3.13-134) in Chapter 3, where a modicum of support for
this notion was presented.
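Next (of course), we test $\perp_1$; we obtain
$$\|\mathbf{u}\|_1^2 = \int \nabla u_\alpha\cdot\nabla u_\alpha = \int \nabla(p_{J_1}u_\alpha + Q_{J_1}u_\alpha)\cdot\nabla(p_{J_1}u_\alpha + Q_{J_1}u_\alpha)$$
$$\;\; = \|p_{J_1}\mathbf{u}\|_1^2 + \|Q_{J_1}\mathbf{u}\|_1^2 + 2\int (\nabla p_{J_1}u_\alpha)\cdot(\nabla Q_{J_1}u_\alpha)$$
$$\;\; = \|\mathbf{u}_1\|_1^2 + \|\Delta^{-1}\nabla\lambda_1\|_1^2 - 2\int \nabla u_{1\alpha}\cdot\nabla(\Delta^{-1}\nabla\lambda_1)_\alpha. \tag{A.3.3-64}$$
To make further progress, we must integrate the last term by parts; first, letting $\mathbf{v} =
\Delta^{-1}\nabla\lambda_1$ yields
$$\int \nabla u_{1\alpha}\cdot\nabla v_\alpha = \int \nabla\cdot(u_{1\alpha}\nabla v_\alpha) - \int u_{1\alpha}\Delta v_\alpha = \int_\Gamma u_{1\alpha}\frac{\partial}{\partial n}(\Delta^{-1}\nabla\lambda_1)_\alpha - \int u_{1\alpha}(\nabla\lambda_1)_\alpha. \tag{A.3.3-65}$$
But $\Delta^{-1}\nabla\lambda_1 = \mathbf{u}_1 - \mathbf{u}$ from (A.3.3-58), and $\int \mathbf{u}_1\cdot\nabla\lambda_1 = \int_\Gamma \lambda_1\mathbf{n}\cdot\mathbf{u}_1$ via the
divergence theorem to give
$$\int \nabla u_{1\alpha}\cdot\nabla(\Delta^{-1}\nabla\lambda_1)_\alpha = \int_\Gamma \left[u_{1\alpha}\frac{\partial}{\partial n}(u_{1\alpha} - u_\alpha) - \lambda_1\mathbf{n}\cdot\mathbf{u}_1\right] = \int_{\Gamma_D}\mathbf{u}\cdot\left[\frac{\partial}{\partial n}(\mathbf{u}_1 - \mathbf{u}) - \lambda_1\mathbf{n}\right], \tag{A.3.3-66}$$
because $\partial(\mathbf{u}_1 - \mathbf{u})/\partial n - \lambda_1\mathbf{n} = 0$ on $\Gamma_N$, and $\mathbf{u}_1 = \mathbf{u}$ on $\Gamma_D$. Thus, finally,
$$\|\mathbf{u}\|_1^2 = \|p_{J_1}\mathbf{u}\|_1^2 + \|Q_{J_1}\mathbf{u}\|_1^2 - 2\int_{\Gamma_D}\mathbf{u}\cdot\left[\frac{\partial}{\partial n}(\mathbf{u}_1 - \mathbf{u}) - \lambda_1\mathbf{n}\right], \tag{A.3.3-67}$$
and we see that the projection is $\perp_1$ if $\mathbf{u} = 0$ on $\Gamma_D$, or if $\Gamma_D = \emptyset$; i.e., we have the
'usual result': inhomogeneous Dirichlet BC's cause loss of orthogonality. [Cf. (A.3.2-41)
and the discussion following it, to dispense with unimportant special cases.]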
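Now we derive the weak formulation for this projection, in preparation for the GFEM
approximation. Returning to (A.3.3-52), we first introduce $\mathbf{H}_0$, a subspace of $\mathbf{H}^1$ such that
$\mathbf{v} \in \mathbf{H}_0 \Rightarrow \mathbf{v} \in \mathbf{H}^1$ and $\mathbf{v} = 0$ on $\Gamma_D$. Next, let $\mathbf{w}$ be a given $\mathbf{H}^1$-extension of a vector field
from $\Gamma_D$ into $\Omega$ that satisfies $\mathbf{w} = \mathbf{u}$ on $\Gamma_D$. Then, to find our desired $H^1$-projection,
$\mathbf{u}_1 = p_{J_1}\mathbf{u}$, we proceed as follows: set $\mathbf{u}_1 = \mathbf{u}_1^0 + \mathbf{w}$, and replace $\delta v_\alpha$ by a test function
$\mathbf{v} \in \mathbf{H}_0$ and $\delta\lambda$ by a test function $q \in L^2$ to obtain the variational problem: find $\mathbf{u}_1^0 \in \mathbf{H}_0$
and $\lambda_1 \in L^2$ from
$$\int \nabla(u^0_{1\alpha} + w_\alpha - u_\alpha)\cdot\nabla v_\alpha - \int \lambda_1\nabla\cdot\mathbf{v} = 0 \quad \forall\mathbf{v} \in \mathbf{H}_0, \tag{A.3.3-68}$$
and
$$-\int q\,\nabla\cdot(\mathbf{u}_1^0 + \mathbf{w}) = 0 \quad \forall q \in L^2, \tag{A.3.3-69}$$
where $\mathbf{w}$ is generally not divergence-free. As before, we rearrange by placing the data on
the RHS before 'going Galerkin': Find $\mathbf{u}_1^0 \in \mathbf{H}_0$ and $\lambda_1 \in L^2$ from
$$\int \nabla u^0_{1\alpha}\cdot\nabla v_\alpha - \int \lambda_1\nabla\cdot\mathbf{v} = \int \nabla(u_\alpha - w_\alpha)\cdot\nabla v_\alpha \quad \forall\mathbf{v} \in \mathbf{H}_0, \tag{A.3.3-70}$$
$$-\int q\,\nabla\cdot\mathbf{u}_1^0 = \int q\,\nabla\cdot\mathbf{w} \quad \forall q \in L^2, \tag{A.3.3-71}$$
then set
$$\mathbf{u}_1 = \mathbf{u}_1^0 + \mathbf{w} = p_{J_1}\mathbf{u}. \tag{A.3.3-72}$$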
Now, analogous to the scalar $H^1$-projection, we can find a $\perp_1$-projection by first
'subtracting' the inhomogeneous Dirichlet data, then performing the projection, and finally
'adding' the data back in. Thus, defining $\hat{\mathbf{u}} = \mathbf{u} - \mathbf{w}$ and following the same steps as
before leads to the same Euler-Lagrange equations as (A.3.3-55) through (A.3.3-57) except
that now $\hat{\mathbf{u}}_1 = 0$ on $\Gamma_D$; we have homogeneous Dirichlet data. Here, $\hat{\mathbf{u}}_1 = p_{J_1}\hat{\mathbf{u}}$ and
$\hat{\mathbf{u}}_1 + \mathbf{w}$ is the final result which, because $\mathbf{u} \to \hat{\mathbf{u}} = 0$ on $\Gamma_D$ in (A.3.3-67), gives the
desired(?) $H^1$-orthogonal projection; i.e., we have $\|\hat{\mathbf{u}}\|_1^2 = \|p_{J_1}\hat{\mathbf{u}}\|_1^2 + \|Q_{J_1}\hat{\mathbf{u}}\|_1^2$.
A.3.3.4 The $p^h_{J_1}$-Projection
We have finally arrived at our last pair of projections ($p^h_{J_1}$ and $P_1$), giving a total of
10, as advertised. We consider now the finite-dimensional subspaces: $\mathbf{V}^h \subset \mathbf{H}^1$, $S^h \subset L^2$,
and $J^h \subset \mathbf{V}^h$, the discretely divergence-free subspace in which our GFEM solutions lie.
Proceeding as in the $p^h_{J_0}$ case, we invoke (A.3.3-26), (A.3.3-27), and (A.3.3-30), but
replace (A.3.3-28) by the full interpolation
$$\mathbf{u}^I = \sum_{j \in \Gamma_D}\mathbf{u}_j\phi_j \in \mathbf{H}^1, \tag{A.3.3-73}$$
where $\mathbf{u}_j$ is the value of the given velocity at node $j$ on $\Gamma_D$. Thus, rather than (A.3.3-31)
through (A.3.3-33), the weak form [(A.3.3-70), (A.3.3-71)] leads to the analogous
$H^1$-version of it, using also (A.3.3-26) and (A.3.3-27); i.e.,
$$\sum_{j=1}^{N} u_j\int \nabla\phi_i\cdot\nabla\phi_j - \sum_{j=1}^{N_p}\lambda_j\int \psi_j\frac{\partial\phi_i}{\partial x} = \int \nabla(u - u^I)\cdot\nabla\phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.3-74}$$
$$\sum_{j=1}^{N} v_j\int \nabla\phi_i\cdot\nabla\phi_j - \sum_{j=1}^{N_p}\lambda_j\int \psi_j\frac{\partial\phi_i}{\partial y} = \int \nabla(v - v^I)\cdot\nabla\phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.3-75}$$
and
$$-\sum_{j=1}^{N}\left(u_j\int \psi_i\frac{\partial\phi_j}{\partial x} + v_j\int \psi_i\frac{\partial\phi_j}{\partial y}\right) = \int \psi_i\,\nabla\cdot\mathbf{u}^I, \quad i = 1, 2, \ldots, N_p, \tag{A.3.3-76}$$
are the $2N + N_p$ Galerkin equations of the $H^1$-projection. Comparing these equations
with those of the $L^2$-projection given by (A.3.3-31) through (A.3.3-33) makes it clear
that there is a perfect one-to-one analogy between the mass matrix ($M$) and the Laplacian
matrix ($K$), a fact that allows us to abbreviate the rest of this discussion.
The 'pre-projection' velocity, corresponding to (A.3.3-34), the non-divergence-free
$H^1$-projection, has as its nodal amplitude coefficients, $(\tilde u, \tilde v)^T$, and is obtained via solving
$$K_x\tilde u = \int \nabla(u - u^I)\cdot\nabla\varphi \tag{A.3.3-77}$$
and
$$K_y\tilde v = \int \nabla(v - v^I)\cdot\nabla\varphi, \tag{A.3.3-78}$$
after which the final projection takes the form
$$K_xu + C_x\lambda = K_x\tilde u, \tag{A.3.3-79}$$
$$K_yv + C_y\lambda = K_y\tilde v, \tag{A.3.3-80}$$
and
$$C_x^Tu + C_y^Tv = g, \tag{A.3.3-81}$$
where here, $K_x = K_y$, because of our somewhat restrictive BC's.
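Next, the analog of (A.3.3-38) and the discretization of (A.3.3-59) is easily obtained:
$$(C_x^TK_x^{-1}C_x + C_y^TK_y^{-1}C_y)\lambda = C^TK^{-1}C\lambda = B\lambda = C_x^T\tilde u + C_y^T\tilde v - g. \tag{A.3.3-82}$$
So too are the analogs of (A.3.3-39) and the GFEM analog of (A.3.3-61):
$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} I_x - K_x^{-1}C_xB^{-1}C_x^T & -K_x^{-1}C_xB^{-1}C_y^T \\ -K_y^{-1}C_yB^{-1}C_x^T & I_y - K_y^{-1}C_yB^{-1}C_y^T \end{pmatrix}\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} + \begin{pmatrix} K_x^{-1}C_xB^{-1}g \\ K_y^{-1}C_yB^{-1}g \end{pmatrix}, \tag{A.3.3-83}$$
which has introduced the $H^1$-projection matrix, $P_1$; i.e., in condensed form,
$$u = (I - K^{-1}CB^{-1}C^T)\tilde u + K^{-1}CB^{-1}g = P_1\tilde u + K^{-1}CB^{-1}g, \tag{A.3.3-84}$$
à la (A.3.3-43), which properly shows the (comforting) lack of change if $\tilde u$ is divergence-free
via the rearrangement to
$$u = \tilde u - K^{-1}CB^{-1}(C^T\tilde u - g). \tag{A.3.3-85}$$
Also, in analogy with (A.3.3-44), with $Q_1 = I - P_1 = K^{-1}CB^{-1}C^T$,
$$\tilde u = u + K^{-1}C\lambda = P_1\tilde u + Q_1\tilde u, \tag{A.3.3-86}$$
and the inner product, $[\mathbf{u}, \mathbf{v}] = \int \nabla u_\alpha\cdot\nabla v_\alpha$, goes to
$$[u, v] = u^TKv \tag{A.3.3-87}$$
and leads to the following $\perp_1$-condition, analogous to (A.3.3-45):
$$\|\tilde u\|_1^2 = \|u\|_1^2 + \|K^{-1}C\lambda\|_1^2 + 2\lambda^Tg; \tag{A.3.3-88}$$
$\perp_1$ is obtained only if $\mathbf{u} = 0$ on $\Gamma_D$ or if $\Gamma_D = \emptyset$. As with the $L^2$-case [see (A.3.3-45) and
related discussion], $P_1\tilde u \perp_1 Q_1\tilde u$ but $u$ is not $\perp_1$ to $K^{-1}C\lambda$ even though $\tilde u = P_1\tilde u + Q_1\tilde u =
u + K^{-1}C\lambda$, 'because' $P_1\tilde u = u - K^{-1}CB^{-1}g$ and $Q_1\tilde u = K^{-1}C\lambda + K^{-1}CB^{-1}g$.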
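The same bookkeeping can be checked numerically for the $H^1$ case by swapping the mass matrix for a stiffness-like matrix. In this sketch, `K` is a 1D Dirichlet Laplacian (tridiagonal 2, -1) chosen only because it is SPD, `C` is a random stand-in, and the names `shift` and `k_inner` are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_vel, n_pres = 7, 3

# Assumed stand-ins: an SPD "Laplacian" matrix K and a full-rank matrix C.
K = 2.0 * np.eye(n_vel) - np.eye(n_vel, k=1) - np.eye(n_vel, k=-1)
C = rng.standard_normal((n_vel, n_pres))

Kinv_C = np.linalg.solve(K, C)
B = C.T @ Kinv_C                                        # B = C^T K^{-1} C, cf. (A.3.3-82)
P1 = np.eye(n_vel) - Kinv_C @ np.linalg.solve(B, C.T)   # cf. (A.3.3-84)
Q1 = np.eye(n_vel) - P1

u_tilde = rng.standard_normal(n_vel)
g = rng.standard_normal(n_pres)

lam = np.linalg.solve(B, C.T @ u_tilde - g)
u = u_tilde - Kinv_C @ lam                              # (A.3.3-85)
grad = Kinv_C @ lam                                     # K^{-1} C lam
shift = Kinv_C @ np.linalg.solve(B, g)                  # K^{-1} C B^{-1} g

def k_inner(x, y):
    """Discrete H1 inner product [x, y] = x^T K y, cf. (A.3.3-87)."""
    return float(x @ K @ y)

print("[P1 u~, Q1 u~]  =", k_inner(P1 @ u_tilde, Q1 @ u_tilde))   # orthogonal pair
print("[u, K^-1 C lam] =", k_inner(u, grad), " vs lam.g =", float(lam @ g))
print("identity (A.3.3-88) residual:",
      k_inner(u_tilde, u_tilde) - k_inner(u, u) - k_inner(grad, grad) - 2.0 * float(lam @ g))
```

The two pairs $(P_1\tilde u, Q_1\tilde u)$ and $(u, K^{-1}C\lambda)$ differ by the same vector `shift`, added to one member and subtracted from the other, which is why only the first pair is orthogonal when $g \neq 0$.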
Finally, to introduce the continuous projection operator, $p^h_{J_1}$, that is induced by
the discrete projection operator, $P_1$, we repeat the summary given for $p^h_{J_0}$ just above
(A.3.3-46):
1. Given $\mathbf{u}(\mathbf{x}) = (u(\mathbf{x}), v(\mathbf{x}))^T$, make a first approximation to it via
$\tilde{\mathbf{u}}^h(\mathbf{x}) = \sum_{j=1}^{N}(\tilde u_j, \tilde v_j)^T\phi_j(\mathbf{x})$,
and approximate its constrained value on $\Gamma_D$ by (A.3.3-73).
2. To obtain $(\tilde u_j, \tilde v_j)$, perform the (non-divergence-free) $H^1$-projection by solving
(A.3.3-77) and (A.3.3-78).
3. Project $(\tilde u_j, \tilde v_j)$ to the divergence-free subspace via (A.3.3-83), giving $(u_j, v_j)$.
4. Insert the result into (A.3.3-26) and (A.3.3-27), and we are done; we have the projection,
which is (A.3.3-46) through (A.3.3-49): the finite-dimensional $H^1$-projection operator
onto the discretely divergence-free subspace, $J^h$, with (of course) $M$ replaced by $K$, $P_0$
by $P_1$, and $p^h_{J_0}$ by $p^h_{J_1}$.
A.3.3.5 Sequential Projections
Before relating the divergence-free projections to BVP's and their solution, we need to
point out an interesting fact that answers the following question: Does the direct projection
from the $\infty$-dimensional space to the discretely divergence-free subspace generate the
same vector as results by first applying the non-divergence-free projection operator to the
same vector to bring it down to the finite-dimensional subspace and then applying the
discretely divergence-free projection to this vector ($L^2$ and/or $H^1$)? In fewer words, we
ask: Does
$$\mathbf{u}_i^h = p^h_{J_i}\mathbf{u} = p^h_{J_i}p^h_i\mathbf{u} = p^h_{J_i}\tilde{\mathbf{u}}^h \quad\text{for } i = 0, 1?$$
Or, pictorially, do we have (a) or (b) in Figure A.3.3-2?
We shall demonstrate that the answer is 'yes' [Figure A.3.3-2(b)] for the $L^2$-projection
(as before, replace $M$ by $K$ to get the $H^1$-analog): from (A.3.3-34), (A.3.3-35) through
(A.3.3-37), and (A.3.3-48) we have
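$$\mathbf{u}_0^h = p^h_{J_0}\mathbf{u} = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \mathbf{u}^I = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0M^{-1}\int\begin{pmatrix} \varphi\,(u - u^I) \\ \varphi\,(v - v^I) \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \mathbf{u}^I. \tag{A.3.3-89}$$
Fig. A.3.3-2 Sequential projections.
Then, from (A.3.3-34) again, we have
$$p^h_0\mathbf{u} = \tilde{\mathbf{u}}^h = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} + \mathbf{u}^I = \begin{pmatrix} \varphi^T\tilde u + u^I \\ \varphi^T\tilde v + v^I \end{pmatrix}. \tag{A.3.3-90}$$
Finally,
$$p^h_{J_0}\tilde{\mathbf{u}}^h = p^h_{J_0}p^h_0\mathbf{u} = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0M^{-1}\int\begin{pmatrix} \varphi\,(\tilde u^h - u^I) \\ \varphi\,(\tilde v^h - v^I) \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \mathbf{u}^I$$
$$\;\; = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0M^{-1}\int\begin{pmatrix} \varphi\varphi^T\tilde u \\ \varphi\varphi^T\tilde v \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \mathbf{u}^I$$
$$\;\; = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0M^{-1}M\begin{pmatrix} \tilde u \\ \tilde v \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \mathbf{u}^I = p^h_{J_0}\mathbf{u}; \quad\text{QED: } p^h_{J_0} = p^h_{J_0}p^h_0. \tag{A.3.3-91}$$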
A.3.3.6 The Projection Method
To (nearly) conclude this extended discussion on projections, we seek BVP's that
correspond to the two divergence-free projections. These correspond to potential flow in the
first case and Stokes flow in the second. See also Section 3.15.
Recalling first the $L^2$-projection, we consider the following BVP:
$$\mathbf{u} + \nabla P = \mathbf{f}, \quad \nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega, \tag{A.3.3-92}$$
with
$$\mathbf{n}\cdot\mathbf{u} = u_n \text{ on } \Gamma_D, \quad\text{and}\quad P = P_N \text{ on } \Gamma_N, \tag{A.3.3-93}$$
which, with $P$ identified as a velocity potential, is recognized as the potential flow
equations with a (necessarily curl-free) source term, $\mathbf{f}$ (which we could omit but retain
for generality/consistency).
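The 'analogous' Stokes problem is
$$-\nabla^2\mathbf{u} + \nabla P = \mathbf{f}, \quad \nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega, \tag{A.3.3-94}$$
with
$$\mathbf{u} = \mathbf{u}_D \text{ on } \Gamma_D \quad\text{and}\quad \partial\mathbf{u}/\partial n - \mathbf{n}P = \mathbf{F} \text{ on } \Gamma_N, \tag{A.3.3-95}$$
in which $\mathbf{f}$ may, but need not, be the same source term as in (A.3.3-92), ignoring
'regularity issues.' Also, $\mathbf{f}$ need not be curl-free for Stokes flow.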
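To see the projection connection, we first rewrite the above BVP's in their appropriate
weak formulations and then compare them with the GFEM approximations of same:
Potential: Find $\mathbf{u} \in \mathbf{H}_0$ and $P \in L^2$ from
$$\int (\mathbf{u} + \mathbf{w})\cdot\mathbf{v} - \int P\,\nabla\cdot\mathbf{v} = \int \mathbf{f}\cdot\mathbf{v} - \int_{\Gamma_N}P_N\,\mathbf{n}\cdot\mathbf{v} \quad \forall\mathbf{v} \in \mathbf{H}_0, \tag{A.3.3-96}$$
and
$$-\int q\,\nabla\cdot(\mathbf{u} + \mathbf{w}) = 0 \quad \forall q \in L^2, \tag{A.3.3-97}$$
where $\mathbf{w}$ is an $\mathbf{H}^1$-extension of a vector field from $\Gamma_D$ into $\Omega$ that satisfies $\mathbf{n}\cdot\mathbf{w} = u_n$ on
$\Gamma_D$.
Stokes: Find $\mathbf{u} \in \mathbf{H}_0$ and $P \in L^2$ from
$$\int \nabla(\mathbf{u} + \mathbf{w}) : (\nabla\mathbf{v})^T - \int P\,\nabla\cdot\mathbf{v} = \int \mathbf{f}\cdot\mathbf{v} + \int_{\Gamma_N}\mathbf{F}\cdot\mathbf{v} \quad \forall\mathbf{v} \in \mathbf{H}_0, \tag{A.3.3-98}$$
and
$$-\int q\,\nabla\cdot(\mathbf{u} + \mathbf{w}) = 0 \quad \forall q \in L^2, \tag{A.3.3-99}$$
where $\mathbf{w}$ is an $\mathbf{H}^1$-extension of a vector field from $\Gamma_D$ into $\Omega$ that satisfies $\mathbf{w} = \mathbf{u}_D$ on
$\Gamma_D$.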
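The GFEM version of these follows easily wherein $\mathbf{v} \to \mathbf{v}^h \to (\phi_i, \phi_i)^T$, $q \to q^h \to \psi_i$,
and $\mathbf{w} \to \mathbf{u}^I$; the test functions become the (subset of) GFEM basis/test functions, and
the interpolant is used as the $\mathbf{H}^1$-extension of Dirichlet data. Thus:
Potential: Find $\mathbf{u}^h = \sum_{j=1}^{N}\mathbf{u}_j\phi_j$ and $P^h = \sum_{j=1}^{N_p}P_j\psi_j$ from
$$\int (\mathbf{u}^h + \mathbf{u}^I)\cdot\mathbf{v}^h - \int P^h\,\nabla\cdot\mathbf{v}^h = \int \mathbf{f}\cdot\mathbf{v}^h - \int_{\Gamma_N}P_N\,\mathbf{n}\cdot\mathbf{v}^h \quad \forall\mathbf{v}^h \in \mathbf{V}^h \subset \mathbf{H}_0, \tag{A.3.3-100}$$
and
$$-\int q^h\,\nabla\cdot(\mathbf{u}^h + \mathbf{u}^I) = 0 \quad \forall q^h \in Q^h \subset L^2, \tag{A.3.3-101}$$
and
Stokes: Find $\mathbf{u}^h = \sum_{j=1}^{N}\mathbf{u}_j\phi_j$ and $P^h = \sum_{j=1}^{N_p}P_j\psi_j$ from
$$\int \nabla(\mathbf{u}^h + \mathbf{u}^I) : (\nabla\mathbf{v}^h)^T - \int P^h\,\nabla\cdot\mathbf{v}^h = \int \mathbf{f}\cdot\mathbf{v}^h + \int_{\Gamma_N}\mathbf{F}\cdot\mathbf{v}^h \quad \forall\mathbf{v}^h \in \mathbf{V}^h \subset \mathbf{H}_0 \tag{A.3.3-102}$$
and
$$-\int q^h\,\nabla\cdot(\mathbf{u}^h + \mathbf{u}^I) = 0 \quad \forall q^h \in Q^h \subset L^2. \tag{A.3.3-103}$$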
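Now, as we did earlier for the scalar case, we note that since $\mathbf{V}^h \subset \mathbf{H}_0$ and $Q^h \subset L^2$,
it follows that (A.3.3-96) through (A.3.3-99) are also valid when $\mathbf{v}$ is replaced
by $\mathbf{v}^h$ and $q$ by $q^h$, thus generating finite-dimensional subsets of the $\infty$-dimensional
equations of the weak formulations. We then also note that the RHS's (data) for the
resulting sets are the same as the RHS's of (A.3.3-100), (A.3.3-101) and (A.3.3-102),
(A.3.3-103), respectively, which leads directly to the following results:
Potential:
$$\int \mathbf{u}^h\cdot\mathbf{v}^h - \int P^h\,\nabla\cdot\mathbf{v}^h = \int (\mathbf{u} + \mathbf{w} - \mathbf{u}^I)\cdot\mathbf{v}^h - \int P\,\nabla\cdot\mathbf{v}^h = \int \tilde{\mathbf{u}}^h\cdot\mathbf{v}^h \tag{A.3.3-104}$$
and
$$-\int q^h\,\nabla\cdot\mathbf{u}^h = \int q^h\,\nabla\cdot\mathbf{u}^I, \tag{A.3.3-105}$$
where the introduction of $\tilde{\mathbf{u}}^h$ (which is computable if $\mathbf{u}$ and $P$ are available), a
non-divergence-free vector field, is merely (partly) for 'convenience.'
Stokes:
$$\int \nabla\mathbf{u}^h : (\nabla\mathbf{v}^h)^T - \int P^h\,\nabla\cdot\mathbf{v}^h = \int \nabla(\mathbf{u} + \mathbf{w} - \mathbf{u}^I) : (\nabla\mathbf{v}^h)^T - \int P\,\nabla\cdot\mathbf{v}^h = \int \nabla\tilde{\mathbf{u}}^h : (\nabla\mathbf{v}^h)^T \tag{A.3.3-106}$$
and
$$-\int q^h\,\nabla\cdot\mathbf{u}^h = \int q^h\,\nabla\cdot\mathbf{u}^I, \tag{A.3.3-107}$$
where again, $\tilde{\mathbf{u}}^h$ is a computable (and non-solenoidal) vector field, again assuming $\mathbf{u}$ and
$P$ are available (which we do).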
To finish, all we need do is recognize that (A.3.3-104) and (A.3.3-105) are nothing
other than equations (A.3.3-31) through (A.3.3-37), and that (A.3.3-106) and (A.3.3-107) are
nothing other than equations (A.3.3-74) through (A.3.3-81), after replacing $\mathbf{u}$ by $\mathbf{u} + \mathbf{w}$
in the former equations and accepting the 'new' definition of $\mathbf{u}^h = \sum_j \mathbf{u}_j \phi_j$. Thus, at least
up to the slightly ambiguous $H^1$-extension functions, $\mathbf{w}$, we clearly (and finally!) see the
projection connections for incompressible flow: the GFEM solution of the potential/Stokes
flow equations is the same as the projections ($L^2$ in one case, $H^1$ in the other) of the
exact solutions to the appropriate finite element subspaces. These projections are 'best
fits'/closest approximations/orthogonal projections in the case wherein the Dirichlet data
are homogeneous (as usual). [The functions $\mathbf{u}^h$ (partial/'internal' GFEM solutions) are
always orthogonal projections of $\mathbf{u} + \mathbf{w} - \mathbf{u}^I$, but the full solutions $(\mathbf{u}^h + \mathbf{u}^I)$ are
non-orthogonal projections of $\mathbf{u} + \mathbf{w}$.]
Final remarks—for homogeneous BC's:
1. The potential flow projection can also be interpreted as a projection of the source
vector to the discretely divergence-free subspace, since (A.3.3-104) implies (weakly) that
the computable field satisfies $\tilde{\mathbf{u}}^h = \mathbf{u} + \nabla P = \mathbf{f}$, with $\nabla\cdot\mathbf{u} = 0$.
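2. The Stokes flow projection can also be re-interpreted via (A.3.3-106), this time via
$-\nabla^2\tilde{\mathbf{u}}^h = -\nabla^2\mathbf{u} + \nabla P = \mathbf{f}$; i.e., $\tilde{\mathbf{u}}^h = -\Delta^{-1}\mathbf{f}$, with $\nabla\cdot\mathbf{u} = 0$.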
Thus, the GFEM solution is also just a 'different' decomposition of the forcing function,
and is also related to the fact, mentioned earlier, that divergence-free vector fields are
(generally) not discretely divergence-free.
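The two 'weak' identities in these remarks each follow from one integration by parts. The lines below are only a sketch under stated assumptions: homogeneous data ($\mathbf{w} = \mathbf{u}^I = 0$), test functions $\mathbf{v}^h$ that vanish on the boundary $\Gamma$ (for the potential case $\mathbf{n}\cdot\mathbf{v}^h = 0$ is enough), fields smooth enough for the second derivatives to make sense, and the reading $\nabla\mathbf{a} : (\nabla\mathbf{b})^T = (\partial a_i/\partial x_j)(\partial b_i/\partial x_j)$. Here $\tilde{\mathbf{u}}^h$ denotes the computable field on the right-hand sides of (A.3.3-104) and (A.3.3-106); the tilde is a notation chosen for this sketch.

```latex
\begin{align*}
\int \tilde{\mathbf{u}}^h\cdot\mathbf{v}^h
  &= \int \mathbf{u}\cdot\mathbf{v}^h - \int P\,\nabla\cdot\mathbf{v}^h
   = \int (\mathbf{u} + \nabla P)\cdot\mathbf{v}^h
     - \oint_\Gamma P\,\mathbf{n}\cdot\mathbf{v}^h , \\
\int \nabla\tilde{\mathbf{u}}^h : (\nabla\mathbf{v}^h)^T
  &= \int \nabla\mathbf{u} : (\nabla\mathbf{v}^h)^T - \int P\,\nabla\cdot\mathbf{v}^h
   = \int (-\nabla^2\mathbf{u} + \nabla P)\cdot\mathbf{v}^h
     + \oint_\Gamma \Bigl(\frac{\partial\mathbf{u}}{\partial n} - P\,\mathbf{n}\Bigr)\cdot\mathbf{v}^h , \\
\int \nabla\tilde{\mathbf{u}}^h : (\nabla\mathbf{v}^h)^T
  &= \int (-\nabla^2\tilde{\mathbf{u}}^h)\cdot\mathbf{v}^h
     + \oint_\Gamma \frac{\partial\tilde{\mathbf{u}}^h}{\partial n}\cdot\mathbf{v}^h .
\end{align*}
```

With the boundary integrals dropped, the first line gives $\tilde{\mathbf{u}}^h = \mathbf{u} + \nabla P$ and the last two give $-\nabla^2\tilde{\mathbf{u}}^h = -\nabla^2\mathbf{u} + \nabla P$, both only in the sense of testing against every $\mathbf{v}^h$ in the finite element space.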
A.3.3.7 Ranking Elements via Projections
As another potentially useful product of this prolonged discussion on projections, we offer
(perhaps with tongue-in-cheek, since we have not tested it) the following suggestions as
one (fairly easy) way to compare 'element A' to 'element B' (to 'element C' to ...) for
incompressible flow:
1. Pick an analytical function for the stream function (vector potential in 3D).
2. Take its curl to get a div-free vector field, u.
3. Design a 'reasonable' finite element mesh (or, perhaps better yet—a sequence of them).
4. Interpolate $\mathbf{u}$ onto the mesh; call it $\mathbf{u}^I$.
5. Project (in $L^2$ or $H^1$ or, perhaps preferably, both) $\mathbf{u}^I$ onto the discretely div-free
subspace. This gives $(\lambda, \mathbf{u}^h)$, with $C^T u = g$, where $u$ is the vector of nodal values of $\mathbf{u}^h$.
6. Compute appropriate norms of $\lambda$ and $(\mathbf{u} - \mathbf{u}^h)$.
7. Rank the elements according to the size of these norms; smallest 'wins'—for this test
case.
8. Goto 1.
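The core of steps 5 to 7 can be sketched in a few lines of linear algebra. This is only a minimal illustration under assumptions that are not in the text: each candidate element is represented by matrices `M` (a symmetric positive definite mass matrix for the L2 projection, or a stiffness-type matrix for an H1-like one) and `C` (a discrete gradient, so that `C.T` is the discrete divergence), both supplied by whatever finite element code assembles them; the function names and the particular stream function are hypothetical; and the error is measured against the nodal interpolant rather than the exact field.

```python
import numpy as np


def curl_of_stream_function(x, y):
    """Steps 1-2 for one illustrative choice, psi = sin(pi x) sin(pi y).

    Returns (u, v) = (d psi/dy, -d psi/dx), which is exactly divergence-free.
    """
    u = np.pi * np.sin(np.pi * x) * np.cos(np.pi * y)
    v = -np.pi * np.cos(np.pi * x) * np.sin(np.pi * y)
    return u, v


def project_div_free(M, C, u_I, g=None):
    """Step 5: M-weighted projection of the nodal vector u_I onto {u : C^T u = g}.

    Solves the saddle-point system
        [ M    C ] [ u   ]   [ M u_I ]
        [ C^T  0 ] [ lam ] = [ g     ]
    and returns (u, lam).
    """
    n, m = C.shape
    if g is None:
        g = np.zeros(m)
    K = np.block([[M, C], [C.T, np.zeros((m, m))]])
    rhs = np.concatenate([M @ u_I, g])
    # Least squares tolerates a rank-deficient C (e.g. spurious pressure
    # modes) and then picks the minimum-norm multiplier.
    sol = np.linalg.lstsq(K, rhs, rcond=None)[0]
    return sol[:n], sol[n:]


def rank_elements(candidates):
    """Steps 6-7: candidates maps a name to (M, C, u_I); smallest norms first."""
    scores = {}
    for name, (M, C, u_I) in candidates.items():
        u, lam = project_div_free(M, C, u_I)
        d = u - u_I
        err = float(np.sqrt(max(d @ M @ d, 0.0)))  # M-norm of the change
        scores[name] = (err, float(np.linalg.norm(lam)))
    return sorted(scores.items(), key=lambda kv: kv[1])
```

In use, one would evaluate `curl_of_stream_function` at the velocity nodes of each mesh to build `u_I`, pass the assembled `(M, C, u_I)` triples to `rank_elements`, and repeat for further stream functions and meshes.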
Remarks:
(1) The $\infty$ Do-loop can be truncated when you've 'had enough'.
(2) This of course tests the 'quality' only of the discrete divergence (and, by implication,
it seems, that of the pressure)—but this is just the additional quantity that is needed.
(3) The cost of each projection might also be factored in, somehow, to also rank an
element's cost-effectiveness.
(4) A small sample of such a comparison is shown in the table near the end of
Section 3.16.1d.
References
N.N. Abboud and P.M. Pinsky. Finite element solution and dispersion analysis for the
transient structural acoustics problem. Appl. Mech. Rev., 43(5):S381-S388, Part 2,
1990.
N.N. Abboud and P.M. Pinsky. Finite element dispersion analysis for the three-
dimensional second-order scalar wave equation. Int. J. Numer. Meth. Eng.,
35:1183-1218, 1992.
L. Abia and J.M. Sanz-Serna. On the use of the product approximation technique in
nonlinear Galerkin methods. Int. J. Numer. Meth. Eng., 20:778-779, 1984.
M. Abramowitz and I.A. Stegun (Eds.). Handbook of Mathematical Functions, National
Bureau of Standards, US Dept. of Commerce, Washington, D.C., USA, 1964. NBS
Applied Mathematics Series, Vol. 55.
J.H. Ahlberg, E.N. Nilson and J.L. Walsh. Theory of Splines and Their Applications.
Academic Press, New York, USA, 1967.
J.E. Aiken. Stiff Computations. Oxford University Press, New York, USA, 1985.
M. Ainsworth and J.T. Oden. A procedure for a posteriori error estimation for h-p finite
element methods. Comput. Meth. Appl. Mech. Eng., 101:73-96, 1991.
D. Alvarez, O. Daubert, L. Janvier and J.P. Schneider. Proc. NURETH 5, 1992. Chap.
"Three dimensional calculations and experimental investigations of the primary coolant
flow in a 900 MW PWR vessel"; Salt Lake City, Utah, USA.
A.S. Almgren, J.B. Bell and W.G. Szymczak. A numerical method for the incompressible
Navier-Stokes equations based on an approximate projection. SIAM J. Sci. Comput.,
17:358-369, 1996.
R. Amit, C.A. Hall and T.A. Porsching. An application of network theory to the solution
of implicit Navier-Stokes difference equations. J. Comput. Phys., 40:183-201, 1981.
A.A. Amsden and F.H. Harlow. A simplified MAC technique for incompressible fluid
flow calculations. J. Comput. Phys., 6:322-325, 1970.
A.A. Amsden and F.H. Harlow. The SMAC Method: A Numerical Technique for
Calculating Incompressible Fluid Flows. Los Alamos Scientific Laboratory, Los
Alamos, New Mexico, USA, LA-4370; UC-32, mathematics and computers; TID-4500
edition, 1970.
A. Arakawa. Computational design for long-term numerical integration of the equations
of fluid motion: Two-dimensional incompressible flow. Part 1. J. Comput. Phys.,
1:119-143, 1966.
T. Arbogast and M.F. Wheeler. A characteristics-mixed finite element method for
advection-dominated transport problems. SINUM, 32(2):404-424, 1995.
J.S. Archer. Consistent mass matrix for distributed systems. Proc. Am. Soc. Civ. Eng.,
89(ST4):161, 1963.
R. Aris. Mathematical Modeling Techniques. Pitman, London, England, UK, 1978.
D.N. Arnold, I. Babuska, and J. Osborn. Finite element methods: principles for their
selection. Comput. Meth. Appl. Mech. Eng., 45:57-96, 1984.
D.N. Arnold. Mixed finite element methods for elliptic problems. Comput. Meth. Appl.
Mech. Eng., 82:281-300, 1990.
M. Arnold. Stability of numerical methods for differential-algebraic equations of higher
index. Appl. Numer. Math., 13:5-14, 1993.
W. Arter and J.W. Eastwood. Particle-mesh schemes for advection dominated flows. J.
Comput. Phys., 117:194-204, 1995.
U.M. Ascher and L.R. Petzold. Projected implicit Runge-Kutta methods for
differential-algebraic equations. SIAM J. Numer. Anal., 28(4):1097-1120, 1991.
U.M. Ascher, S.J. Ruuth and B.T.R. Wetton. Implicit-explicit methods for time-
dependent partial differential equations. SIAM J. Numer. Anal., 32(3):797-823, 1995.
O. Axelsson and V.A. Barker. Finite Element Solution of Boundary Value Problems.
Theory and Computation. Academic Press, Inc., Orlando, Florida, USA, 1984.
A.K. Aziz (Ed.). The Mathematical Foundations of the Finite Element Method with
Applications to Partial Differential Equations. Academic Press, New York, New York, USA,
1972.
I. Babuska. Error-bounds for finite element method. Numer. Math., 16:322-333, 1971.
I. Babuska. The finite element method with Lagrangian multipliers. Numer. Math.,
20:179-192, 1973.
I. Babuska and M. Suri. The p and h-p versions of the finite element method, basic
principles and properties. SIAM Rev., 36(4):578-632, 1994.
I. Babuska. Courant Element: Before and After. 1994.
I. Babuska and J.T. Oden. Preface. Comput. Meth. Appl. Mech. Eng., 133:xi-xii, 1996.
I. Babuska and R. Narasimhan. The Babuska-Brezzi condition and the patch test: an
example. Comput. Meth. Appl. Mech. Eng., 140:183-199, 1997.
I. Babuska, T. Strouboulis, S.K. Gangaraj and C.S. Upadhyay. Pollution error in the h-
version of the finite element method and the quality of the recovered derivatives.
Comput. Meth. Appl. Mech. Eng., 140:1-37, 1997.
A.J. Baker. Finite Element Computational Fluid Mechanics. Hemisphere Publishing
Corporation/McGraw-Hill Book Company, Washington/New York, USA, 1983.
A.J. Baker and J.W. Kim. A Taylor weak-statement algorithm for hyperbolic conservation
laws. Int. J. Numer. Meth. Fluids, 7:489-520, 1987.
A.J. Baker and D.W. Pepper. Finite Elements 1-2-3. McGraw-Hill, Inc., New York, New
York, USA, 1991.
B.R. Baliga and S.V. Patankar. A new finite-element formulation for convection-diffusion
problems. Numer. Heat Transfer, 3:393-409, 1980.
R.E. Bank, B.D. Welfert and H. Yserentant. A class of iterative methods for solving saddle
point problems. Numer. Math., 56:645-666, 1990.
A.M. Baptista, E.E. Adams and P. Gresho. Benchmarks for the transport equation: The
convection-diffusion forum and beyond. Quantitative Skill Assessment for Coastal Ocean
Models, Coastal and Estuarine Studies, 47:241-268, American Geophysical Union, 1995.
C. Bardos, M. Bercovier and O. Pironneau. The vortex method with finite elements. Math.
Comput., 36(153):119-136, 1981.
M. Bar-Lev and H.T. Yang. Initial flow field over an impulsively started circular cylinder.
J. Fluid Mech., 72(4):625-647, 1975.
K.E. Barrett. A variational principle for the stream function-vorticity formulation of
the Navier-Stokes equations incorporating no-slip conditions. J. Comput. Phys.,
26:153-161, 1978.
O.A. Basaran and D.W. DePaoli. Phys. Fluids, 6(9):2923, September 1994.
G.K. Batchelor. An Introduction to Fluid Dynamics. Cambridge University Press, London,
England, UK, 1967.
S. Bates and B. Cathers. Analysis of spurious eigenmodes in finite element equations. Int.
J. Numer. Meth. Fluids, 23:1131-1143, 1986.
K.J. Bathe. Finite Element Procedures. Prentice-Hall, Englewood Cliffs, New Jersey, USA,
1996.
R.M. Beam and R.F. Warming. Lecture Notes, 1981-82 Lecture Series Programme,
Computational Fluid Dynamics, von Karman Institute for Fluid Dynamics, UK, 1982.
Chap. "Implicit numerical methods for the compressible Navier-Stokes and Euler
equations"; Waterloo, Belgium, March 29-April 2; J.A. Essers (Ed.).
E.B. Becker, G.F. Carey and J.T. Oden. Finite Elements: An Introduction. Vol. I.
Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1981.
R. Becker and R. Rannacher. Finite element solution of the incompressible Navier-
Stokes equations on anisotropically refined meshes. Technical Report 94-31, Universitat
Heidelberg, Germany, 1994.
M. Behr and T.E. Tezduyar. Finite element solution strategies for large-scale flow
simulations. Comput. Meth. Appl. Mech. Eng., 112:3-24, 1994.
B.C. Bell and K.S. Surana. p-version least squares finite element formulation for
two-dimensional, incompressible, non-Newtonian isothermal and non-isothermal fluid flow.
Int. J. Numer. Meth. Fluids, 18:127-162, 1994.
J.B. Bell, P. Colella and H.M. Glaz. A second-order projection method for the
incompressible Navier-Stokes equations. J. Comput. Phys., 85(2):257-283, 1989.
J.B. Bell, P. Colella and L.H. Howell. Proc. AIAA 10th Computational Fluid Dynamics
Conf. American Institute of Aeronautics and Astronautics (AIAA), New York, USA,
1991. Chap. "An efficient second-order projection method for viscous incompressible
flow"; Honolulu, Hawaii, June 24-27, 1991.
J.B. Bell and D.L. Marcus. Vorticity intensification and transition to turbulence in the
three-dimensional Euler equations. Commun. Math. Phys., 147:371-394, 1992.
J.B. Bell, A.S. Almgren and W.G. Szymczak. A numerical method for the incompressible
Navier-Stokes equations based on an approximate projection. SIAM J. Sci. Comput.,
17(2):358-369, March 1996.
T. Belytschko and R. Mullen. On Dispersive Properties of Finite Element Solutions. John
Wiley and Sons, Inc., New York, New York, USA, 1978, in Modern Problems in Elastic
Wave Propagation; J. Miklowitz et al. (Eds.).
C.M. Bender and S.A. Orszag. Advanced Mathematical Methods for Scientists and
Engineers. McGraw-Hill, New York, New York, USA, 1978.
J.P. Benque, B. Ibler, A. Keramsi and G. Labadie. Proc. 3rd. Int. Conf. Finite Elements
in Flow Problems, Vol. I, 1980. Chap. "A finite element method for Navier-Stokes
equations"; p. 110-120; Banff, Alberta, Canada, June 10-13, 1980.
J.P. Benque, G. Labadie and J. Ronat. Finite Element Flow Analysis: Proc. 4th. Int. Symp.
Finite Element Methods in Flow Problems, 1982.
M. Bentwich and T. Miloh. Low Reynolds number flow due to impulsively started circular
cylinder. J. Eng. Math., 16(1):1-21, 1982.
M. Bercovier. Information Processing 77. North-Holland, 1977. Chap. "A family of finite
elements with penalisation for the numerical solution of Stokes and Navier-Stokes
equations"; pp. 97-101; B. Gilchrist (Ed.).
M. Bercovier and O. Pironneau. Proc. Numerical Methods in Laminar and Turbulent Flow.
NUL, 1978. Chap. "Comparisons and error estimates for several finite elements for the
numerical simulation of incompressible viscous flows"; Swansea, Wales, UK, July
18-21, 1978.
M. Bercovier. Perturbation of Mixed Variational Problems. Application to mixed finite
element methods. R.A.I.R.O. Analyse numérique/Numer. Anal., 12(3):211-236, 1978.
M. Bercovier and M. Engelman. A finite element for the numerical solution of viscous
incompressible flows. J. Comput. Phys., 30:181-201, 1979.
M. Bercovier and O. Pironneau. Error estimates for finite element method solution of the
Stokes problem in the primitive variables. Numer. Math., 33:211-224, 1979.
M. Bercovier and O. Pironneau. Finite Element Flow Analysis: Proc. 4th. Int. Symp. Finite
Element Methods in Flow Problems. Chap. "Characteristics and the finite element
method"; pp. 67-73; Chuo University, Tokyo, Japan, July 26-29, 1982.
M. Bercovier, O. Pironneau and V. Sastri. Finite elements and characteristics for some
parabolic-hyperbolic problems. Appl. Math. Modelling, 7:89-96, April 1983.
R. Bermejo. On the equivalence of semi-Lagrangian schemes and particle-in-cell finite
element methods. Mon. Weather Rev., 118:979-987, April 1990.
R. Bermejo. Analysis of an algorithm for the Galerkin-characteristic method. Numer.
Math., 60:163-194, 1991.
R. Bermejo and A. Staniforth. The conversion of semi-Lagrangian advection schemes to
quasi-monotone schemes. Mon. Weather Rev., 120:2622-2632, November 1992.
C. Bernardi, C. Canuto and Y. Maday. Generalized Inf-Sup Conditions for Chebyshev
Spectral Approximation of the Stokes Problem. SIAM J. Numer. Anal.,
25(6):1237-1271, December 1988.
C. Bernardi, C. Canuto, Y. Maday and B. Metivet. Single-grid spectral collocation for
the Navier-Stokes equations. IMA J. Numer. Anal., 10:253-297, 1990.
F.H. Bertrand, M.R. Gadbois and P.A. Tanguy. Tetrahedral elements for fluids flow. Int.
J. Numer. Meth. Eng., 33:1251-1267, 1992.
T. Bidot, S. Delaroff, J.M. Vanel, G. Monville and G. Pot. Proc. Basel World User's Day
CFD. Chap. "Application of the N3S finite element code to numerical simulation
around a peugeot car 405"; May, 1992.
R.B. Bird, W.E. Stewart and E.N. Lightfoot. Transport Phenomena. John Wiley and Sons,
Inc., New York, New York, USA, 1960.
N.E. Bixler and R.E. Benner. Proc. Fourth Int. Conf. Numer. Meth. in Laminar and
Turbulent Flow. Pineridge Press, Ltd, Swansea, Wales, UK, 1985. Chap. "Finite element
analysis of axisymmetric oscillations of sessile liquid drops"; pp. 1325-1335; Swansea,
Wales, UK, July 9-12, 1985.
N.E. Bixler and L.E. Scriven. Downstream development of three-dimensional viscocap-
illary film flow. Ind. Eng. Chem. Res., 26:475-483, 1987.
H. Blasius. Grenzschichten in Flüssigkeiten mit kleiner Reibung. Zeit. Math. Phys.,
56:1-37, 1908.
P.B. Bochev and M.D. Gunzburger. Accuracy of least-squares methods for the
Navier-Stokes equations. Comput. Fluids, 22(4/5):549-563, 1993.
P.B. Bochev and M.D. Gunzburger. Least-squares methods for the velocity-pressure-
stress formulation of the Stokes equations. Comput. Meth. Appl. Mech. Eng., 114:213,
1994.
P.B. Bochev and M.D. Gunzburger. Analysis of least-squares finite element methods
for the Stokes equations. Math. Comput., to appear; also Virginia Tech., Department
of Mathematics and Interdisciplinary Center for Applied Mathematics, Blacksburg,
Virginia 24061-0531, USA, 1996.
J.M. Boland and R.A. Nicolaides. Stability of finite elements under divergence constraints.
SIAM J. Numer. Anal., 20(4):722-731, 1983.
J.M. Boland and R.A. Nicolaides. On the stability of bilinear velocity-constant pressure
finite elements. Numer. Math., 44:219-222, 1984.
J.M. Boland and R.A. Nicolaides. Stable and semistable low order finite elements for
viscous flows. SIAM J. Numer. Anal., 22(3):474-492, 1985.
R. Bouard and M. Coutanceau. The early stage of development of the wake behind an
impulsively started cylinder for 40 < Re < 10^4. J. Fluid Mech., 101(3):583-607, 1980.
J. Bramble and J. Pasciak. Iterative techniques for time dependent Stokes problems. In
W. Habashi, editor. Solution Techniques for Large-Scale CFD Problems, pp. 201-216.
John Wiley, 1995.
K. Boukir, Y. Maday, B. Metivet and E. Razafindrakoto. A high order
characteristics/finite element method for the incompressible Navier-Stokes equations. Int. J.
Numer. Meth. Fluids, 1996.
K.E. Brenan, S.L. Campbell, and L.R. Petzold. Numerical Solution of Initial-Value
Problems in Differential-Algebraic Equations. North-Holland, Elsevier Science Publishing
Co., Inc., New York, New York, USA, 1989.
K.E. Brenan, S.L. Campbell and L.R. Petzold. Numerical Solution of Initial-Value
Problems in Differential-Algebraic Equations. Society for Industrial and Applied
Mathematics, Philadelphia, Pennsylvania, USA, 1996.
S.C. Brenner and L.R. Scott. The Mathematical Theory of Finite Element Methods.
Springer-Verlag, Berlin, Germany, 1994.
F. Brezzi. On the existence, uniqueness and approximation of saddle-point problems
arising from Lagrangian multipliers. Revue Francaise d'Automatique, Informatique
et Recherche Operationnelle (R.A.I.R.O.), R-2:129-151, 1974.
F. Brezzi and J. Pitkaranta. On the Stabilisation of Finite Element Approximations of
the Stokes Problem. Vieweg, Braunschweig, Germany, 1984, in Efficient Solutions of
elliptic Systems; pp. 11-19; W. Hackbusch (Ed.).
F. Brezzi and K.-J. Bathe. A discourse on the stability conditions for mixed finite element
formulations. Comput. Meth. Appl. Mech. Eng., 82:27-57, 1990.
F. Brezzi and M. Fortin. Mixed and Hybrid Finite Element Methods. Springer-Verlag, Inc.,
New York, New York, USA, 1991.
F. Brezzi and R.S. Falk. Stability of higher-order Hood-Taylor methods. SIAM J. Numer.
Anal., 28(3):581-590, 1991.
M.O. Bristeau, R. Glowinski and J. Periaux. Numerical methods for the Navier-Stokes
equations. Applications to the simulation of compressible and incompressible viscous
flows. Comput. Phys. Reports, 6:73-187, 1987.
M.O. Bristeau, R. Glowinski, B. Mantel, J. Periaux and P. Perrier. Finite Elements in
Fluids—Vol. 6: Finite Elements and Flow Problems. John Wiley and Sons, Chichester,
England, UK, 1985. Chap. 1, "Numerical methods for incompressible and
compressible Navier-Stokes problems"; pp. 1-40; R.H. Gallagher, G.F. Carey, J.T. Oden and
O.C. Zienkiewicz (Eds.).
I.N. Bronshtein and K.A. Semendyayev. Handbook of Mathematics. Van Nostrand
Reinhold Company, New York, New York, USA, 1985.
A. Brooks and T.J.R. Hughes. Proc. Third Int. Conf. Finite Element Methods in Fluid
Flow. 1980. Chap. "Streamline-upwind/Petrov-Galerkin methods for advection
dominated flows"; also available from Division of Engineering and Applied Science,
California Institute of Technology, Pasadena, California 91125, USA.
A.N. Brooks and T.J.R. Hughes. Streamline upwind/Petrov-Galerkin formulations for convection
dominated flows with particular emphasis on the incompressible Navier-Stokes
equations. Comput. Meth. Appl. Mech. Eng., 32:199-259, 1982.
D.L. Brown and M.L. Minion. Performance of under-resolved two-dimensional
incompressible flow simulations. J. Comput. Phys., 122:165-183, 1995.
P.N. Brown, G.D. Byrne and A.C. Hindmarsh. VODE: A Variable-Coefficient ODE
Solver. SIAM J. Sci. Stat. Comput., 10(5):1038-1051, 1989.
E.T. Bullister, G.E. Karniadakis, E.M. Ronquist and A.T. Patera. Proc. Sixth Int. Symp.
Finite Element Methods in Flow Problems, Antibes, France, 1986. Chap. "Solution of
the unsteady Navier-Stokes equations by spectral element methods"; pp. 225-230.
E.T. Bullister. Development and Application of High Order Numerical Methods
for Solution of the Three-dimensional Navier-Stokes Equations. Ph.D. Thesis,
Massachusetts Institute of Technology, Department of Mechanical Engineering,
Cambridge, Massachusetts, USA, 1986.
D.S. Burnett. Finite Element Analysis: From Concepts to Application. Addison-Wesley
Publishing Company, Reading, Massachusetts, USA, 1987.
K. Burrage, J.C. Butcher and F.H. Chipman. An implementation of singly-implicit
Runge-Kutta Methods. BIT, 20:326-340, 1980.
J.C. Butcher. The Numerical Analysis of Ordinary Differential Equations: Runge-Kutta
and General Linear Methods. John Wiley and Sons, Chichester, England, 1987.
J. Cahouet and J.-P. Chabard. Some fast 3d finite element solvers for the generalized
Stokes problem. Int. J. Numer. Meth. Fluids, 8:869-895, 1988.
Z. Cai. On the finite volume element method. Numer. Math., 58:713-735, 1991.
Z. Cai, J. Mandel and S. McCormick. The finite volume element method for diffusion
equations on general triangulations. SIAM J. Numer. Anal., 28(2):392-402, 1991.
S.L. Campbell. Singular Systems of Differential Equations. Pitman Advanced Publishing
Program, San Francisco, California, USA, 1980.
A. Campion-Renson and M.J. Crochet. On the stream function-vorticity finite element
solutions of Navier-Stokes equations. Int. J. Numer. Meth. Fluids, 12:1809, 1978.
C. Canuto, M.Y. Hussaini, A. Quarteroni, and T.A. Zang. Spectral Methods in Fluid
Dynamics. Springer-Verlag, Inc., New York, New York, USA, 1988a.
C. Canuto, C. Bernardi and Y. Maday. Generalized inf-sup conditions for Chebyshev
spectral approximation of the Stokes problem. SIAM J. Numer. Anal., 25(6):1237-1271,
December 1988b.
G.F. Carey. An analysis of finite element equations and mesh subdivision. Comput. Meth.
Appl. Mech. Eng., 9:165-179, 1976.
G.F. Carey. Derivative calculation from finite element solutions. Comput. Meth. Appl.
Mech. Eng., 35:1-14, 1982.
G.F. Carey and J.T. Oden. Finite Elements: A Second Course Vol. II. Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey, USA, 1983.
G. Carey and J.T. Oden. Finite Elements: Computational Aspects Vol. III. Prentice-Hall,
Inc., Englewood Cliffs, New Jersey, USA, 1984.
G.F. Carey, S.S. Chow and M.K. Seager. Approximate boundary-flux calculations.
Comput. Meth. Appl. Mech. Eng., 50:107-120, 1985.
G.F. Carey and J.T. Oden. Finite Elements: Fluid Mechanics. Vol. VI. Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey, USA, 1986.
G.F. Carey and R. McLay. Local pressure oscillation and boundary treatment for the
8-node quadrilateral. Int. J. Numer. Meth. Fluids, 6:165-172, 1986.
G.F. Carey and B.N. Jiang. Least-squares finite elements for first-order hyperbolic
systems. Int. J. Numer. Meth. Eng., 26:81-93, 1988.
G.F. Carey and Y. Shen. Convergence studies of least-squares finite elements for first-
order systems. Commun. Appl. Numer. Methods, 5:427-434, 1989.
H.S. Carslaw and J.C. Jaeger. Conduction of Heat in Solids. Clarendon Press, Oxford,
England, UK; 2nd edition, 1959.
M.S. Carvalho and L.E. Scriven. Numerical Methods in Laminar and Turbulent Flow,
Pineridge Press, Swansea, Wales, UK, 1995, Vol. 9, Part II. Chap. "Flows between rigid
and deformable rotating cylinders with free surfaces, inflow and outflow"; pp. 972-983;
C. Taylor and P. Durbetaki (Eds.).
M.S. Carvalho and L.E. Scriven. Multiple states of a viscous free surface flow: transition
from a pre-metered to a metering in flow. 24:813.
B. Cathers and B.A. O'Connor. The group velocity of some numerical schemes.
5:201-224, 1985.
M.A. Celia, T.F. Russell, I. Herrera and R.E. Ewing. An Eulerian-Lagrangian localized
adjoint method for the advection-diffusion equation. Adv. Water Res., 13(4):187, 1990.
M.A. Celia, I. Herrera, R.E. Ewing and T.F. Russell. Eulerian-Lagrangian localized
adjoint method: The theoretical framework. Numer. Meth. Partial Differential
Equations, 9:431-457, 1993.
J.P. Chabard and I. King. Proc. XXIVth Biennial Congress Int. Assoc. for Hydraulic
Research (AIRH). Chap. "An industrial application of the N3S finite element code";
Madrid, Spain, September 9-13, 1991.
D.J. Chaffin and A.J. Baker. On Taylor weak statement finite element methods for
computational fluid dynamics. Int. J. Numer. Meth. Fluids, 21:273-294, 1995.
S.T. Chan, P.M. Gresho, R.L. Lee and C.D. Upson. Proc. AIAA, 5th Comp. Fluid
Dynamics Conf., USA, 1981. Chap. "Simulation of three-dimensional, time-dependent,
incompressible flows by a finite element method"; pp. 354-364; Palo Alto, California;
also Lawrence Livermore National Laboratory, Livermore, California, UCRL-85226.
S.T. Chan and P.M. Gresho. Proc. 4th Int. Symp. Finite Element Methods in Flow
Problems, Finite Element Flow Analysis. University of Tokyo Press, Tokyo, Japan,
1982. Chap. "Solution of the multi-dimensional, incompressible Navier-Stokes
equations using low-order finite elements and one-point quadrature"; pp. 201-210;
T. Kawai (Ed.).
S.T. Chan. Numerical simulations of LNG vapor dispersion from a fenced storage area.
J. Hazard. Mater., 30:195-224, 1992.
C.-L. Chang and M.D. Gunzburger. A finite element method for first order elliptic systems
in three dimensions. Appl. Math. Comput., 23:171-184, 1987.
C.L. Chang. Finite element approximation for grad-div type systems in the plane. SIAM
J. Numer. Anal., 29(2):452-461, 1992.
C.L. Chang. Least-squares finite element methods for incompressible flow with zero
residual for mass conservative law. SIAM J. Numer. Anal., 1996, to appear; also
Cleveland State University, Department of Mathematics Research Report 94-50 (September
1994).
E.J. Chang and M.R. Maxey. Unsteady flow about a sphere at low to moderate Reynolds
number. Part 2. Accelerated motion. J. Fluid Mech., 303:133-153, 1995.
M.W. Chang and B.A. Finlayson. On the proper boundary conditions for the thermal entry
problem. Int. J. Numer. Meth. Eng., 15:935-942, 1980.
D. Chapelle and K.J. Bathe. The inf-sup test. Comput. Struct., 47(4/5):537-545, 1993.
R.T.S. Cheng. Numerical solution of the Navier-Stokes equations by the finite element
method. Phys. Fluids, 15(12):2098, 1972.
R.C.Y. Chin, G.W. Hedstrom and K.E. Karlsson. A simplified Galerkin method of
hyperbolic equations. Math. Comput., 33(146):571-586, 1979.
J.C. Chien. A general finite-difference formulation with application to Navier-Stokes
equations. Comput. Fluids, 5:15-31, 1977.
P.N. Childs and K.W. Morton. Characteristic Galerkin methods for scalar conservation
laws in one dimension. SIAM J. Numer. Anal., 27(3):553-594, 1990.
D.P. Chock and A.M. Dunker. A comparison of numerical methods for solving the
advection equation. Atmos. Environ., 17(1):11-24, 1983.
D.P. Chock. A comparison of numerical methods for solving the advection equation—II.
Atmos. Environ., 19(4):571-586, 1985.
A.J. Chorin. The numerical solution of the Navier-Stokes equations for an incompressible
fluid. Bull. Am. Math. Soc., 73(6):928, 1967a.
A.J. Chorin. A numerical method for solving incompressible viscous flow problems. J.
Comput. Phys., 2:12-26, 1967b.
A.J. Chorin. Numerical solution of incompressible flow problems. Stud. Numer. Anal.,
2:64-71, 1968a.
A.J. Chorin. Numerical solution of the Navier-Stokes equations. Math. Comput., 22:745,
1968b.
A.J. Chorin. On the convergence of discrete approximations to the Navier-Stokes
equations. Math. Comput., 23(106):341, 1969.
M.H. Chou and W. Huang. Numerical study of high-Reynolds-number flow past a bluff
object. Int. J. Numer. Meth. Fluids, 23:711-732, 1996.
I. Christie, D.F. Griffiths, A.R. Mitchell and J.M. Sanz-Serna. Product Approximation for
Non-linear Problems in the Finite Element Method. IMA J. Numer. Anal., 1:253-266,
1981.
K.N. Christodoulou and L.E. Scriven. The fluid mechanics of slide coating. J. Fluid
Mech., 208:321-354, 1989.
C.-C. Chu, C.-C. Chang, C.-C. Liu, and R.L. Chang. Suction effect on an impulsively
started circular cylinder: Vortex structure. Phys. Fluids, 8(11):2995, 1996.
P.G. Ciarlet. The Finite Element Method for Elliptic Problems. North-Holland, Amsterdam,
The Netherlands, 1978.
P.G. Ciarlet and J.L. Lions (Eds.). Handbook of Numerical Analysis, Vol I. Finite
Difference Methods (Part I), and Solution of Equations in R^n (Part I). North-Holland,
Amsterdam, The Netherlands, 1990.
K.A. Cliffe. On conservative finite element formulations of the inviscid Boussinesq
equations. Int. J. Numer. Meth. Fluids, 1(2):117, 1981.
R. Clift, J.R. Grace and M.E. Weber. Bubbles, Drops and Particles. Academic Press, New
York, New York, USA, 1978.
W.M. Collins and S.C.R. Dennis. Flow past an impulsively started circular cylinder. J.
Fluid Mech., 60(1):105-127, 1973a.
W.M. Collins and S.C.R. Dennis. The initial flow past an impulsively started circular
cylinder. Quart. Journ. Mech. Appl. Math., XXVI, 1973b.
G. Comini and S. Del Giudice. A physical interpretation of conventional finite element
formulations of conduction-type problems. Int. J. Numer. Meth. Eng., 32:559-569,
1991.
G. Comini, S. Del Giudice, and C. Nonino. Energy balances in CVFEM and GFEM
formulations of convection-type problems. Int. J. Numer. Meth. Eng., 35:709, 1992.
G. Comini, M. Malisan, and M. Manzan. Accuracy comparison of control-volume and
Galerkin finite-element methods for heat conduction problems. Numer. Heat Trans.
Part B, 1996, (in press).
P. Constantin and C. Foias. Navier-Stokes Equations. The University of Chicago Press,
Chicago, USA, 1989.
R.D. Cook, D.S. Malkus and M.E. Plesha. Concepts and Applications of Finite Element
Analysis. John Wiley and Sons, Inc., New York, New York, USA, 3rd edition, 1989.
R. Courant, K.O. Friedrichs, and H. Lewy. Über die partiellen Differenzengleichungen
der mathematischen Physik. Mathematische Annalen, 100:32-74, 1928.
S.H. Crandall. Engineering Analysis: A Survey of Numerical Procedures. McGraw-Hill
Book Company, Inc., New York, New York, USA, 1956.
M.J. Crochet, F.T. Geyling and J.J. Van Schaftingen. Numerical simulation of the
horizontal Bridgman growth of a gallium arsenide crystal. J. Crystal Growth, 65:166-172,
1983.
M.J. Crochet, F.T. Geyling and J.J. Van Schaftingen. Finite element method for
calculating the horizontal Bridgman growth of semiconductor crystals. J. Crystal Growth,
65:166-172, 1983.
M.J. Crochet, F.T. Geyling and J.J. Van Schaftingen. Finite Element Method for
Calculating the Horizontal Bridgman Growth of Semiconductor Crystals, in Finite
Elements in Fluids—Volume 6, John Wiley & Sons Ltd, Chichester, England, UK, 1985;
pp. 321-339; R.H. Gallagher, G.F. Carey, J.T. Oden and O.C. Zienkiewicz (Eds.).
M. Crouzeix and P.-A. Raviart. Conforming and nonconforming finite element methods
for solving the stationary Stokes equations I. Revue Francaise d'Automatique,
Informatique et Recherche Operationnelle (R.A.I.R.O.), R-3:33-76, December 1973.
W.P. Crowley. Numerical advection experiments. Mon. Weather Rev., 96(1), January
1968.
M.J.P. Cullen. On the use of artificial smoothing in Galerkin and finite difference solutions
of the primitive equations. Q. J. R. Meteorol. Soc., 102:77-93, 1976.
M.J.P. Cullen. Numerical Methods Used in Atmospheric Models. World Meteorological
Organization, Bracknell, England, UK, Vol. 2, 1979. GARP Publication Series
No. 17.
M.J.P. Cullen and K.W. Morton. Analysis of evolutionary error in finite element and other
methods. J. Comput. Phys., 34(2):245-267, 1980.
M.J.P. Cullen. The use of quadratic finite element methods and irregular grids in the
solution of hyperbolic problems. J. Comput. Phys., 45:221-245, 1982.
E.L. Cussler. Diffusion: Mass Transfer in Fluid Systems. Press Syndicate of the University
of Cambridge, New York, New York, USA, 1984.
C. Cuvelier, A. Segal and A. van Steenhoven. Finite Element Methods and Navier-Stokes
Equations. D. Reidel Publishing Company, Dordrecht, The Netherlands, 1986.
G. Dahlquist and Å. Björck. Numerical Methods. Prentice-Hall, Inc., Englewood Cliffs,
New Jersey, USA, 1974. Translated by Ned Anderson.
G. Dahlquist. On one-leg multistep methods. SIAM J. Numer. Anal., 20(6):1130-1138,
1983.
H. Daniels. PASTIS-3D: Finite Element Projection Algorithm Solver for Transient
Incompressible Flow Simulations. Implementation Aspects and User's Manual, Version 1.0.
Lawrence Livermore National Laboratory, Livermore, California, USA, UCRL-MA-
111833, 1992.
H. Daniels. Proc. Finite Elements in Fluids. Pineridge Press Ltd., Swansea, Wales,
UK, 1993. Chap. "PASTIS-3D—A new generation finite element code for the
incompressible Navier-Stokes equations"; p. 338; K. Morgan, E. Oñate, J. Periaux,
J. Peraire and O.C. Zienkiewicz (Eds.).
H.T. Davis, R.A. Novy, and L.E. Scriven. A comparison of synthetic boundary conditions
for continuous-flow systems. Chem. Eng. Sci., 46(1):57-68, 1991.
P.R. Dawson and D.F. McTigue. A numerical model for natural convection in fluid-
saturated creeping porous media. Numer. Heat Transfer, 8:45-63, 1985.
P.R. Dawson. On modeling of mechanical property changes during flat rolling of
aluminum. Int. J. Solids Struct., 23(7):947-968, 1987.
C. De Boor. Practical Guide to Splines. Springer-Verlag, New York, New York, USA,
1978. Applied Mathematical Sciences, Vol. 27.
Baptista de Melo A.E. Solution of Advection-Dominated Transport by
Eulerian-Lagrangian Methods Using the Backwards Method of Characteristics. Ph.D. Thesis,
Massachusetts Institute of Technology, Cambridge, Massachusetts, USA, 1987.
G. de Vahl Davis. A note on a mesh for use with polar coordinates. Numerical Heat
Transfer, 2:261-266, 1979.
E. Dean and R. Glowinski. On some Finite Element Methods for the Numerical Solution
of Incompressible Viscous Flow. Cambridge University Press, 1993, in
Incompressible Computational Fluid Dynamics; pp. 17-65; M. Gunzburger and R. Nicolaides
(Eds.).
M. Delfour, W. Hager and F. Trochu. Discontinuous Galerkin methods for ordinary
differential equations. Math. Comput., 36(154):455-473, 1981.
M. Delfour, W. Hager and F. Trochu. Discontinuous Galerkin Methods for Ordinary and
Non-Conservative Finite Element Formulations of Convection-Type Problems. Int. J.
Numer. Meth. Eng., 35:709-727, 1992.
J.E. Dendy. Two methods of Galerkin type achieving optimum L2 rates of convergence
for first-order hyperbolics. SIAM J. Numer. Anal., 11:637, 1974.
J.J. Derby, L.J. Atherton, P.D. Thomas and R.A. Brown. Finite-Element Methods for
Analysis of the Dynamics and Control of Czochralski Crystal Growth. J. Sci. Comput.,
2(4):297, 1987.
J. Donea. A Taylor-Galerkin method for convective transport problems. Int. J. Numer.
Meth. Eng., 20:101-119, 1984.
J. Donea, S. Giuliani, H. Laval, and L. Quartapelle. Time-accurate solution of
advection-diffusion problems by finite elements. 45: 1984.
J. Donea, L. Quartapelle and V. Selmin. An analysis of time discretization in the finite
element solution of hyperbolic problems. J. Comput. Phys., 70:463-499, 1987.
J. Donea and L. Quartapelle. An introduction to finite element methods for transient
advection problems. Comput. Meth. Appl. Mech. Eng., 95:169-203, 1992.
J. Douglas Jr, T. Dupont and M.F. Wheeler. A Galerkin procedure for approximating the
flux on the boundary for elliptic and parabolic boundary value problems. Revue Française
d'Automatique, Informatique et Recherche Opérationnelle (R.A.I.R.O.), August 1974.
J. Douglas Jr and T.F. Russell. Numerical methods for convection-dominated diffusion
problems based on combining the method of characteristics with finite element or finite
difference procedures. SIAM J. Numer. Anal., 19(5):871-885, 1982.
J. Douglas and J. Wang. An absolutely stabilised finite element method for the Stokes
problem. Math. Comp., 52:495-508, 1989.
J.K. Dukowicz and A.S. Dvinsky. Approximate factorization as a high order splitting for
the implicit incompressible flow equations. J. Comput. Phys., 102:336-347, 1992.
S. Dupont and J.M. Marchal. Preconditioned conjugate gradients for solving the transient
Boussinesq equations in three-dimensional geometries. Int. J. Numer. Meth. Fluids,
8:283-303, 1988.
D.R. Durran. The third-order Adams-Bashforth method: An attractive alternative to
leapfrog time differencing. Mon. Weather Rev., 119:702-720, 1991.
A.S. Dvinsky and J.K. Dukowicz. Null-space-free methods for the incompressible
Navier-Stokes equations on non-staggered curvilinear grids. Comput. Fluids,
22(6):685-696, 1993.
W.E and J.-G. Liu. Projection method I: Convergence and numerical boundary layers.
SIAM J. Numer. Anal., 32(4):1017-1057, 1995.
W.E and J.-G. Liu. Vorticity boundary conditions and related issues for finite difference
schemes. J. Comput. Phys., 124:368-382, 1996.
W.E and J.-G. Liu. Projection method II: Godunov-Ryabenki analysis. SIAM J. Numer.
Anal., 33(4):1597-1621, 1996.
E.D. Eason. A review of least-squares methods for solving partial differential equations.
Int. J. Numer. Meth. Eng., 10:1021-1046, 1976.
R. Easton. Homogeneous boundary conditions for pressure in the MAC method. J.
Comput. Phys., 9:375-379, 1972.
J.W. Eastwood. The stability and accuracy of EPIC algorithms. Comput. Phys. Commun.,
44:73-82, 1987.
B.E. Eaton. The Galerkin Finite Element Method Applied to Viscous Incompressible Flows.
Ph.D. Thesis, University of Colorado, Department of Chemical Engineering, Boulder,
Colorado, USA, 1983.
Y. Eguchi, G. Yagawa and L. Fuchs. A conjugate-residual-FEM for incompressible
viscous flow analysis. Comput. Mech., 3:59-72, 1988.
H. Elman. Multigrid and Krylov subspace methods for the discrete Stokes equations.
Technical Report UMIACS-TR-94-76, Institute for Advanced Computer Studies, University
of Maryland, 1994.
M.S. Engelman, R.L. Sani and P.M. Gresho. The implementation of normal and/or
tangential boundary conditions in finite element codes for incompressible fluid flow.
Int. J. Numer. Meth. Fluids, 2:225-238, 1982a.
M.S. Engelman, R. Sani, P.M. Gresho and M. Bercovier. Consistent vs. reduced
integration penalty methods for incompressible media using several old and new elements.
Int. J. Numer. Meth. Fluids, 2:25-42, 1982b.
M.S. Engelman. Incompressible Computational Fluid Dynamics: Trends and Advances.
Cambridge University Press, Cambridge, England, UK, 1993. Chap. 3, "CFD—An
Industrial Perspective"; pp. 67-86; M.D. Gunzburger and R.A. Nicolaides (Eds.).
W.H. Enright and T.E. Hull. Numerical Methods for Differential Systems: Recent
Developments in Algorithms, Software, and Applications. Academic Press, Inc., New York,
New York, USA, 1976. Chap. "Comparing numerical methods for the solution of stiff
systems of ODE's"; pp. 45-66; L. Lapidus and W.E. Schiesser (Eds.).
N. Ericsson. On the Stability of Pipe Flow. Master of Science Thesis. Chalmers University
of Technology, Göteborg, Sweden, 1993.
K. Eriksson and C. Johnson. Adaptive Streamline Diffusion Finite Element Methods for
Convection-Diffusion Problems. Department of Mathematics, Chalmers University of
Technology, Göteborg, Sweden, No. 1990-18/ISSN 0347-2809, 18:1990.
D. Estep. A posteriori error bounds and global error control for approximation of ordinary
differential equations. SIAM J. Numer. Anal., 32(1):1-48, 1995.
R.E. Ewing and T.F. Russell. Advances in Computer Methods for Partial Differential
Equations—IV. IMACS, Rutgers University, New Brunswick, New Jersey, USA, 1981.
Chap. "Multistep Galerkin methods along characteristics for convection-diffusion
problems"; pp. 28-36; R. Vichnevetsky and R.S. Stepleman (Eds.).
R.E. Ewing, T.F. Russell and M.F. Wheeler. Simulation of miscible displacement using
mixed methods and a modified method of characteristics. Society of Petroleum
Engineers of AIME, Dallas, Texas, USA, SPE 12241, 1983.
M. Feistauer. Mathematical Methods in Fluid Dynamics. John Wiley and Sons, Inc., New
York, New York, USA, 1993.
J.H. Ferziger and M. Perić. Computational Methods for Fluid Dynamics. Springer-Verlag,
Berlin, Germany, 1996.
B.A. Finlayson and L.E. Scriven. The method of weighted residuals—A review. Appl.
Mech. Rev., 19(9):735-748, 1966.
B.A. Finlayson. The Method of Weighted Residuals and Variational Principles, with
Application in Fluid Mechanics, Heat and Mass Transfer. Academic Press, Inc., New York, New
York, USA, 1972. Vol. 87 in Mathematics in Science and Engineering, R. Bellman (Ed.).
B.A. Finlayson. Stiff Computation. Oxford University Press, New York, New York, USA,
1985. Sec. 3.4, "Solution of stiff equations resulting from partial differential equations,"
pp. 124-139; R.C. Aiken (Ed.).
B.A. Finlayson. Numerical Methods for Problems with Moving Fronts. Ravenna Park
Publishing, Inc., Seattle, Washington, USA, 1992.
R.S. Fisk. On an oscillation phenomenon in the numerical solution of the
diffusion-convection equations. SIAM J. Numer. Anal., 19(4):721-724, 1982.
G.J. Fix. Finite element models for ocean circulation problems. SIAM J. Appl. Math.,
29(3):371-387, 1975.
G.J. Fix, M.D. Gunzburger and R.A. Nicolaides. Constructive Approaches to
Mathematical Models. Academic Press, Inc., New York, New York, USA, 1979a. Chap. "Theory
and applications of mixed finite element methods"; pp. 375-393.
G.J. Fix, M.D. Gunzburger and R.A. Nicolaides. On finite element methods of the least
squares type. Comput. Math. Appl., 5:87-98, 1979b.
G.J. Fix, M.D. Gunzburger and R.A. Nicolaides. On mixed finite element methods for
first order elliptic systems. Numer. Math., 37:29-48, 1981.
G.J. Fix, M.D. Gunzburger and J.S. Peterson. On finite element approximations of
problems having inhomogeneous essential boundary conditions. Comput. Math. Appl.,
9(5):687-700, 1983.
D.P. Flanagan and T. Belytschko. A uniform strain hexahedron and quadrilateral with
orthogonal hourglass control. Int. J. Numer. Meth. Eng., 17:679-706, 1981.
C.A.J. Fletcher and K. Srinivas. Stream function vorticity revisited. Comput. Meth. Appl.
Mech. Eng., 41:297-322, 1983.
C.A.J. Fletcher. Computational Techniques for Fluid Dynamics 1. Fundamental and
General Techniques. Springer-Verlag, Berlin, Germany, 2nd edition, 1991a. Series:
Springer Series in Computational Physics; R. Glowinski, M. Holt, P. Hut, H.B. Keller,
J. Killeen, S.S. Orszag, and V.V. Rusanov (Eds.).
C.A.J. Fletcher. Computational Techniques for Fluid Dynamics 2. Specific Techniques for
Different Flow Categories. Springer-Verlag, Berlin, Germany, 2nd edition, 1991b.
Series: Springer Series in Computational Physics; R. Glowinski, M. Holt, P. Hut,
H.B. Keller, J. Killeen, S.S. Orszag, and V.V. Rusanov (Eds.).
R. Fletcher and D.F. Griffiths. The generalized eigenvalue problem for certain
unsymmetric band matrices. Linear Algebra Appl., 29:139-149, 1980.
Fluid Dynamics International, Inc., FIDAP 7.0: Fluid Dynamics Analysis Package: Theory
Manual. Fluid Dynamics International, Inc., Evanston, Illinois, USA, revision 7.0, 1st
edition, 1993.
B. Fornberg. On the instability of leap-frog and Crank-Nicolson approximations of a
nonlinear partial differential equation. Math. Comput., 27(121):45-57, 1973.
B. Fornberg. A Practical Guide to Pseudospectral Methods. Cambridge University Press,
Cambridge, England, UK, 1996.
M. Fortin, R. Peyret and R. Temam. Résolution numérique des équations de
Navier-Stokes pour un fluide incompressible. J. Méc., 10(3):357-390, 1971.
M. Fortin. Calcul Numérique des Écoulements des Fluides de Bingham et des Fluides
Newtoniens Incompressibles par la Méthode des Éléments Finis. Ph.D. Thesis,
Université de Paris VI, Paris, France, 1972a.
M. Fortin. Numerical Methods in Fluid Dynamics. 1972b. Chap. "Numerical solution of
steady state Navier-Stokes equations"; J.J. Smolderen (Ed.); AGARD Lecture Series
No. 48, AGARD-LS-48.
M. Fortin. An analysis of the convergence of mixed finite element methods. R.A.I.R.O.
Analyse numérique/Numer. Anal., 11(4):341-354, 1977.
M. Fortin. Old and new finite elements for incompressible flows. Int. J. Numer. Meth.
Fluids, 1:347-364, 1981.
M. Fortin. Short communication: Two comments on: Consistent vs reduced integration
penalty methods for incompressible media using several old and new elements. Int. J.
Numer. Meth. Fluids, 3:93-98, 1983.
M. Fortin and M. Soulie. A non-conforming piecewise quadratic finite element on
triangles. Int. J. Numer. Meth. Eng., 19:505-520, 1983.
M. Fortin and R. Glowinski. Augmented Lagrangian Methods: Applications to the
Numerical Solution of Boundary-Value Problems. North-Holland, Amsterdam, The
Netherlands, 1983. Series: Studies in Mathematics and Its Applications, Vol. 15;
J.L. Lions, G. Papanicolaou, R.T. Rockafellar and H. Fujita (Eds.).
M. Fortin. A three-dimensional quadratic nonconforming element. Numer. Math.,
46:269-279, 1985.
M. Fortin and A. Fortin. Finite Elements in Fluids. John Wiley and Sons, Inc., Chichester,
England, UK. Vol. 6, 1985a. Chap. 7, "Newer and newer elements for incompressible
flow"; pp. 171-187; R.H. Gallagher, G.F. Carey, J.T. Oden and O.C. Zienkiewicz
(Eds.).
M. Fortin and A. Fortin. Experiments with several elements for viscous incompressible
flows. Int. J. Numer. Meth. Fluids, 5:911-928, 1985b.
M. Fortin and A. Fortin. A generalization of Uzawa's algorithm for the solution of the
Navier-Stokes equations. Commun. Appl. Numer. Meth., 1:205-208, 1985c.
A. Fortin. On the imposition of a flowrate by an augmented Lagrangian method.
Commun. Appl. Numer. Meth., 4:835-841, 1988.
M. Fortin and R. Pierre. Stability analysis of discrete generalised Stokes problems. Numer.
Meth. Partial Differential Equations, 8:303-323, 1992.
L. Franca and R. Stenberg. Error analysis of some Galerkin least square methods for the
elasticity equations. SIAM J. Numer. Anal., 28:1680-1697, 1991.
L.P. Franca, S.L. Frey and T.J.R. Hughes. Stabilized finite element methods: I. application
to the advective-diffusive model. Comput. Meth. Appl. Mech. Eng., 95:253-276, 1992.
L. Franca and S. Frey. Stabilised finite element methods: II. the incompressible Navier-
Stokes equations. Comput. Meth. Appl. Mech. Eng., 99:209-233, 1992.
L. Franca and A. Madureira. Element diameter free stability parameters for stabilized
methods applied to fluids. Comput. Meth. Appl. Mech. Eng., 105:395-403, 1993.
L. Franca, T.J.R. Hughes and R. Stenberg. Stabilized finite element methods. Cambridge
University Press, 1993, in Incompressible Computational Fluid Dynamics; pp. 87-107;
M. Gunzburger and R. Nicolaides, (Eds.).
I. Fried. Finite element analysis of incompressible material by residual energy balancing.
Int. J. Solids Struct., 10:993-1002, 1974.
I. Fried and D.S. Malkus. Finite element mass matrix lumping by numerical integration
with no convergence rate loss. Int. J. Solids Struct., 11:461-466, 1975.
I. Fried. On a deficiency in unconditionally stable explicit time-integration methods in
elastodynamics and heat transfer. Comput. Meth. Appl. Mech. Eng., 46:195-200, 1984.
E.O. Frind. Solution of the Advection-Dispersion Equation with Free Exit Boundary.
Numer. Meth. Partial Diff. Equations, 4:301-313, 1988.
U. Frisch and S.A. Orszag. Turbulence: challenges for theory and experiment. Phys.
Today, p. 24, January 1990.
G.P. Galdi. An Introduction to the Mathematical Theory of the Navier-Stokes Equations,
Vol. I. Linearized Steady Problems. Springer-Verlag, New York, New York, USA,
1994a.
G.P. Galdi. An Introduction to the Mathematical Theory of the Navier-Stokes Equations,
Vol. II. Nonlinear Steady Problems. Springer-Verlag, New York, New York, USA, 1994b.
R.H. Gallagher, G.F. Carey, J.T. Oden and O.C. Zienkiewicz (Eds.). Finite elements in
fluids—Vol. 6: finite elements and flow problems, John Wiley and Sons Ltd, Chichester,
England, UK, 1985.
J. Gary. Nonlinear Instability. World Meteorological Organization, 1979. Chap. 10 of
Numerical Methods Used in Atmospheric Models. Vol. II.
D.K. Gartling. Some comments on the paper by Heinrich, Huyakorn, Zienkiewicz and
Mitchell. Int. J. Numer. Meth. Eng., 12:187-191, 1978.
D.K. Gartling. NACHOS II—A Finite Element Computer Program for Incompressible
Flow Problems, Part I. Theoretical Background. Sandia National Laboratories,
Albuquerque, New Mexico, USA, SAND86-1816; UC-32 edition, 1987.
D.K. Gartling and R.E. Hogan. Coyote II —A Finite Element Computer Program for
Nonlinear Heat Conduction Problems Part I —Theoretical Background. Sandia Report,
SAND 94-1173-uc-905, 1994.
C.W. Gear. Numerical Initial Value Problems in Ordinary Differential Equations. Prentice-
Hall, Inc., Englewood Cliffs, New Jersey, USA, 1971.
M. Gellert and R. Harbord. Symmetric forms for finite element analysis of the
Navier-Stokes problem. Comput. Fluids, 15(4):379-389, 1987.
J.P. Gerrity Jr. A note on the Computational Stability of the Two-Step Lax-Wendroff
Form of the Advection Equation. Mon. Weather Rev., 100(1):72-73, 1972.
V. Girault and P.-A. Raviart. Finite Element Methods for Navier-Stokes Equations. Theory
and Algorithms. Springer-Verlag, Berlin, Germany, 1986.
V. Girault. Incompressible finite element methods for Navier-Stokes equations with
nonstandard boundary conditions in R3. Math. Comput., 51(183):55-74, July 1988a.
V. Girault. Curl-conforming Finite Element Methods for Navier-Stokes Equations with
Non-standard Boundary Conditions in R3. Université Pierre et Marie Curie, Centre
National de la Recherche Scientifique, 1988b.
R. Glowinski and O. Pironneau. Numerical methods for the first biharmonic equation and
for the two-dimensional Stokes problem. SIAM Rev., 21(2):167, April 1979.
R. Glowinski. Numerical Methods for Nonlinear Variational Problems. Springer-Verlag,
New York, New York, USA, 1984.
R. Glowinski. Vistas in Applied Mathematics. Optimization Software, New York,
New York, USA, 1986. Chap. "Splitting methods for the numerical solution
of the incompressible Navier-Stokes equations"; p. 57; A.V. Balakrishnan,
A.A. Dorodnitsyn and J.L. Lions (Eds.).
R. Glowinski. In Vortex Dynamics and Vortex Methods. American Meteorological Society,
Providence, Rhode Island, USA, 1991. Chap. "Finite element methods for the
numerical simulation of incompressible viscous flow. Introduction to the control of the
Navier-Stokes equations"; pp. 219-301; Lectures in Appl. Math., Vol. 28; C. Anderson
and C. Greengard (Eds.).
M.B. Goldschmit and E.N. Dvorkin. On the solution of the steady convection-diffusion
equation using quadratic elements: A generalized Galerkin technique also reliable with
distorted meshes. Eng. Comput., 11:565-573, 1994.
S. Goldstein and L. Rosenhead. Boundary layer growth. Proc. Cambridge Phil. Soc.,
32:392-401, 1936.
G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins University
Press, Baltimore, Maryland, USA, 1983.
J.W. Goodrich and W.Y. Soh. Time-dependent viscous incompressible Navier-Stokes
equations: The finite difference Galerkin formulation and streamfunction algorithms.
J. Comput. Phys., 84:207-241, 1989.
D. Gottlieb and S.A. Orszag. Numerical Analysis of Spectral Methods: Theory and
Applications. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania,
USA, 1977.
G. Goudreau and J. Hallquist. Recent developments in large-scale finite element Lagrangian
hydrocode technology. Comput. Meth. Appl. Mech. Eng., 33(1-3):725, 1982.
A.R. Gourlay. A note on trapezoidal methods for the solution of initial value problems.
Math. Comput., 24(111):629-633, 1970.
A. Grammeltvedt. A survey of finite-difference schemes for the primitive equations for a
barotropic fluid. Mon. Weather Rev., 97(5):384-404, 1969.
W.G. Gray and G.F. Pinder. On the relationship between the finite element and finite
difference methods. Int. J. Numer. Meth. Eng., 10:893-923, 1976.
W.G. Gray. Proc. Finite Elements in Water Resources. Pentech Press, London, England,
UK, 1977. Chap. "An efficient finite element scheme for two-dimensional surface water
computation"; pp. 4-33; W. Gray, G. Pinder and C. Brebbia (Eds.).
P.M. Gresho and S. Chan. Solving the incompressible Navier-Stokes equations using
consistent mass and a pressure Poisson equation/UCRL-99406. ASME Symposium on
Recent Developments in CFD, Chicago, 95:51-75, August 1988.
P.M. Gresho, R.L. Lee and R.L. Sani. Proc. 2nd. Int. Symp. Finite Element Methods in
Flow Problems. International Centre for Computer Aided Design (ICCAD), 1976, Chap.
"Advection-dominated flows, with emphasis on the consequences of mass lumping";
pp. 745-756; Santa Margherita Ligure, Italy, June 14-18, 1976.
P.M. Gresho, R.L. Lee and R.L. Sani. Finite Elements in Fluids. John Wiley and Sons,
Inc., New York, USA, Vol. 3, 1978a. Chap. 19, "Advection-dominated flows, with
emphasis on the consequences of mass lumping"; p. 335.
P.M. Gresho, R.L. Lee, R.L. Sani and T.W. Stullich. On the Time-dependent FEM
Solution of the Incompressible Navier-Stokes Equations in Two- and Three-Dimensions.
Preprint. Lawrence Livermore National Laboratory, Livermore, California, USA;
UCRL-81323, 1978b.
P.M. Gresho. Comments on a Recent Paper by Emery et al.: A Comparison of Some
of the Thermal Characteristics of Finite-element and Finite-difference Calculations of
Transient Problems. Numer. Heat Trans., 2:519-520, 1979.
P.M. Gresho and R.L. Lee. Finite Element Methods for Convection Dominated Flows.
AMD, The American Society of Mechanical Engineers, New York, New York,
USA, 1979. Chap. 3, "Don't suppress the wiggles—they're telling you something!";
pp. 37-61; T.J.R. Hughes (Ed.).
P.M. Gresho, R.L. Lee, S.T. Chan and J.M. Leone Jr. A New Finite Element for
Boussinesq Fluids. Preprint. Lawrence Livermore National Laboratory, Livermore, California,
USA; UCRL-82842, 1979.
P.M. Gresho, R.L. Lee and R.L. Sani. Recent Advances in Numerical Methods in Fluids.
Pineridge Press Ltd, Swansea, Wales, UK. Vol. 1, 1980a. Chap. 2, "On the
time-dependent solution of the incompressible Navier-Stokes equations in two and three
dimensions"; pp. 27-79; C. Taylor and K. Morgan (Eds.).
P.M. Gresho, R.L. Lee, S.T. Chan and R.L. Sani. Solution of the time-dependent Navier-
Stokes and Boussinesq equations using the Galerkin finite element method, in
Approximation Methods for Navier-Stokes Problems, Proceedings of the Symposium Held
by IUTAM at the University of Paderborn, Germany, September 9-15, 1979. Springer-
Verlag, Berlin, Germany, 1980b; pp. 203-222; R. Rautmann (Ed.). Series: Lecture Notes
in Mathematics, Vol. 771. A. Dold and B. Eckmann (Eds.).
P.M. Gresho and R.L. Lee. Don't suppress the wiggles—they're telling you something!
Comput. Fluids, 9:223-253, 1981.
P.M. Gresho, S.T. Chan, R.L. Lee and C.D. Upson. Proc. 22nd Num. Meth. Laminar and
Turbulent Flow, Pineridge Press Ltd., Swansea, Wales, UK, 1981a. Chap. "Solution
of the time-dependent, three-dimensional incompressible Navier-Stokes equations via
FEM"; pp. 27-39; C. Taylor and B. Schrefler (Eds.); Venice, Italy.
P.M. Gresho, R.L. Lee and R.L. Sani. The Consistent Method for Computing Derived
Boundary Quantities when the Galerkin FEM is used to Solve Thermal and/or Fluids
Problems. Preprint. Lawrence Livermore National Laboratory, Livermore, California,
USA; UCRL-85366, 1981b.
P.M. Gresho and J.M. Leone Jr. Proc. 5th Int. Conf. on Finite Element Methods. Springer-
Verlag, Burlington, Vermont, USA, 1984. Chap. "Another attempt to overcome the
bent element blues"; pp. 667-683; June, 1984; also Lawrence Livermore National
Laboratory, Livermore, California, USA, UCRL-90449.
P.M. Gresho, R.L. Lee and R.L. Sani. Proc. 5th Int. Symposium on Finite Elements in
Flow Problems, TICOM, USA, 1984a. Chap. "Further studies on equal-order
interpolation for Navier-Stokes"; pp. 143-148; Austin, Texas, USA; also Lawrence Livermore
National Laboratory, Livermore, California, UCRL-89094.
P.M. Gresho, S.T. Chan, R.L. Lee and C.D. Upson. A modified finite element method for
solving the time-dependent, incompressible Navier-Stokes equations, Part 1: Theory.
Int. J. Numer. Meth. Fluids, 4:557-598, 1984b.
P.M. Gresho, S.T. Chan, R.L. Lee and C.D. Upson. A Modified Finite Element Method
for Solving the Time-Dependent, Incompressible Navier-Stokes Equations, Part 2:
Applications. Int. J. Numer. Meth. Fluids, 4:619-640, 1984c.
P.M. Gresho and S.T. Chan. Proc. Int. Conf. Num. Meth. in Laminar and Turbulent
Flow. Pineridge Press Ltd, Swansea, Wales, UK, 1985. Chap. "A new semi-implicit
method for solving the time-dependent conservation equations for incompressible
flow"; pp. 3-21; Swansea, Wales, UK, July 9-12, 1985.
P.M. Gresho, C. Taylor, M.D. Olson and W.G. Habashi. Proc. Int. Conf. Numer. Meth. in
Laminar and Turbulent Flow. Part 2. Pineridge Press, Swansea, Wales, UK, 1985.
P.M. Gresho and R.L. Lee. Comments on 'the group velocity of some numerical schemes'.
Int. J. Numer. Meth. Fluids, 7:1357-1362, 1987.
P.M. Gresho, R.L. Lee, R.L. Sani, M.K. Maslanik and B.E. Eaton. The consistent
Galerkin FEM for computing derived boundary quantities in thermal and/or fluids
problems. Int. J. Numer. Meth. Fluids, 7:371-394, 1987.
P.M. Gresho and R.L. Sani. On pressure boundary conditions for the incompressible
Navier-Stokes equations. Int. J. Numer. Meth. Fluids, 7:1111-1145, 1987.
P.M. Gresho and R.L. Sani. Finite Elements in Fluids. John Wiley and Sons Ltd,
Chichester, England, UK. Vol. 7, 1988. Chap. 7, "On pressure boundary conditions
for the incompressible Navier-Stokes equations"; pp. 123-157; R.H. Gallagher,
R. Glowinski, P.M. Gresho, J.T. Oden, and O.C. Zienkiewicz (Eds.).
P.M. Gresho and S.T. Chan. On the theory of semi-implicit projection methods for viscous
incompressible flow and its implementation via a finite element method that also
introduces a nearly consistent mass matrix. Part 2: Implementation. Int. J. Numer. Meth.
Fluids, 11(5):621-660, 1990.
P.M. Gresho. Comments on 'a conjugate-residual-FEM for incompressible viscous flow
analysis' by Y. Eguchi, G. Yagawa and L. Fuchs. Comput. Mech., 6:203-204, 1990a.
P.M. Gresho. On the theory of semi-implicit projection methods for viscous
incompressible flow and its implementation via a finite element method that also
introduces a nearly consistent mass matrix, Part 1: Theory. Int. J. Numer. Meth. Fluids,
11(5):587-620, 1990b.
P.M. Gresho. Annual Review of Fluid Mechanics, Vol. 23. Annual Reviews, Inc., Palo
Alto, California, USA, 1991a. Chap. "Incompressible fluid dynamics: some
fundamental formulation issues"; pp. 413-453.
P.M. Gresho. Proc. Fourth Int. Symp. Computational Fluid Dynamics: A Collection of
Technical Papers, Vol. I. University of California, Davis, Davis, California, USA,
1991b. Chap. "A summary report on the 14 July 91 minisymposium on outflow
boundary conditions for incompressible flow"; pp. 436-442.
P.M. Gresho. Some current CFD issues relevant to the incompressible Navier-Stokes
equations. Comput. Meth. Appl. Mech. Eng., 87:201-252, 1991c.
P.M. Gresho. Advances in Applied Mechanics. Academic Press, Inc., New York, New
York, USA, Vol. 28, 1992. Chap. "Some interesting issues in incompressible fluid
dynamics, both in the continuum and in numerical simulation," pp. 45-140.
P.M. Gresho, D.K. Gartling, J.R. Torczynski, K.A. Cliffe, K.H. Winters, T.J. Garratt,
A. Spence and J.W. Goodrich. Is the steady viscous incompressible two-dimensional
flow over a backward-facing step at Re = 800 stable? Int. J. Numer. Meth. Fluids,
17:501-541, 1993.
P.M. Gresho, S.T. Chan, M.A. Christon, and A.C. Hindmarsh. A little more on stabilized
Q1Q1 for transient viscous incompressible flow. Int. J. Numer. Meth. Fluids,
21:837-856, 1995.
P.M. Gresho and S.T. Chan. Proc. Sixth Int. Symp. Comput. Fluid Dyn. Chap. "An Update
on Projection Methods for Transient Incompressible Viscous Flow"; Vol. I, p. 389;
Lake Tahoe, Nevada, USA, September 4-8, 1995.
P.M. Gresho and R.L. Sani. Problems and solutions (generalized and FEM) related
to rapid and impulsive changes for incompressible flows, in Computational Fluid
Dynamics Review 1997. John Wiley and Sons, Inc., New York, New York, USA,
1998; M. Hafez and K. Oshima (Eds.).
D.F. Griffiths. Proc. Sixth Canadian Congress of Applied Mechanics. Chap. "The
construction of approximately divergence-free finite elements"; Vancouver, British
Columbia, Canada, May 29-June 3, 1977.
D.F. Griffiths and J. Lorenz. An analysis of the Petrov-Galerkin finite element method.
Comput. Meth. Appl. Mech. Eng., 1977.
D.F. Griffiths. Mathematics of Finite Elements and Applications III. Academic Press,
1979a. Chap. "The construction of approximately divergence-free finite elements";
pp. 237-245; J.W. Whiteman (Ed.).
D.F. Griffiths. Finite elements for incompressible flow. Math. Meth. Appl. Sci., 1:16-31,
1979b.
D.F. Griffiths, I. Christie and A.R. Mitchell. Analysis of error growth for explicit
difference schemes in conduction-convection problems. Int. J. Numer. Meth. Eng.,
15:1075-1081, 1980.
D.F. Griffiths. An approximately divergence-free 9-node velocity element (with
variations) for incompressible flows. Int. J. Numer. Meth. Fluids, 1:323-346, 1981.
D.F. Griffiths. Numerical Methods for Fluid Dynamics. Academic Press, London,
England, UK, 1982. Chap. "The effect of pressure approximations on finite element
calculations of incompressible flows"; pp. 359-374; K.W. Morton and M.J. Baines
(Eds.).
D.F. Griffiths and J.M. Sanz-Serna. On the scope of the method of modified equations.
SIAM J. Sci. Stat. Comput., 7(3):994-1008, 1986.
D.F. Griffiths. Numerical Analysis 1987. Longman Scientific & Technical, Pitman
Research Notes in Mathematics. Vol. 170, 1988. Chap. "The dynamics of some
linear multistep methods with step-size control"; pp. 115-134; D.F. Griffiths and
G.A. Watson (Eds.).
D.F. Griffiths. Discretised Eigenvalue Problems, LBB Constants and Stabilization.
Longman Scientific & Technical, Pitman Research Notes in Mathematics. Vol. 334,
1996. D.F. Griffiths and G.A. Watson (Eds.).
D.F. Griffiths. The 'no boundary condition' outflow boundary condition. Int. J. Numer.
Meth. Fluids, 24:393-412, 1997.
D. Griffiths and D. Silvester. Unstable Modes of the Q1-P0 Element. University of
Manchester, Manchester, England, UK, technical report NA-257, 1994.
W.D. Gropp and D.E. Keyes. Domain decomposition methods in computational fluid
dynamics. Int. J. Numer. Meth. Fluids, 14:147-165, 1992.
J.-L. Guermond and C. Tenaud. Proc. ECCOMAS 94. John Wiley and Sons Ltd,
Chichester, England, UK, 1994. Chap. "Error analysis and numerical tests for the
approximation of unsteady incompressible viscous flow by means of projection
methods".
J.-L. Guermond. Sur l'approximation des équations de Navier-Stokes instationnaires par
une méthode de projection. C.R. Acad. Sci. Paris, 319:887-892, 1994. Série I.
J.-L. Guermond and L. Quartapelle. Proc. Ninth Int. Conf. Finite Elements in Fluids: New
Trends and Applications, Part 1. 1995. Chap. "Unconditionally stable finite-element
method for the unsteady Navier-Stokes equations"; pp. 367-376; M.M. Cecchi,
K. Morgan, J. Periaux, B.A. Schrefler and O.C. Zienkiewicz (Eds.); Venice, Italy,
October 15-21, 1995.
J.-L. Guermond and L. Quartapelle. Calculation of incompressible viscous flows by
an unconditionally stable projection FEM. J. Comput. Phys., 1996. submitted; also
Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (Notes et
Documents LIMSI) No. 95-06 and No. 95-14; Orsay, France, May, 1995.
D. Gunzburger and A. Nicolaides. Incompressible Computational Fluid Dynamics Trends
and Advances. Cambridge University Press, Cambridge, UK, 1993.
M.D. Gunzburger. Finite Element Methods for Viscous Incompressible Flows: A Guide to
Theory, Practice, and Algorithms. Academic Press, Inc., Boston, Massachusetts, USA,
1989.
M.D. Gunzburger, M. Mundt and J.S. Peterson. Computational methods for viscous
flows, Vol. 4, 1990. Chap. "Experiences with finite element methods for the
velocity-vorticity formulation of three-dimensional viscous incompressible flows";
pp. 231-271; C.A. Brebbia (Ed.).
M.D. Gunzburger and R.A. Nicolaides. Incompressible Computational Fluid Dynamics
Trends and Advances. Cambridge University Press, Cambridge, UK, 1993.
K.K. Gupta and J.L. Meek. A brief history of the beginning of the finite element method.
Int. J. Numer. Meth. Eng., 39:3761-3774, 1996.
K. Gustafson and R. Hartman. Divergence-free bases for finite element schemes in
hydrodynamics. SIAM J. Numer. Anal., 20(4):697-721, 1983.
K. Gustafson and R. Hartman. Graph theory and fluid dynamics. SIAM J. Alg. Disc.
Methods, 6(4):643-656, 1985.
W.G. Habashi and G.G. Youngson. Letter to the editor: Discussion on article by
S. Ramadhyani and S.V. Patankar. Int. J. Numer. Meth. Eng., 15:1740-1742, 1980.
M. Hafez, J. Dacles and M. Soliman. Proc. 11th Int. Conf. Num. Methods in Fluid
Dynamics, Springer-Verlag, Berlin, Germany, 1989. Chap. "A velocity/vorticity
method for viscous incompressible flow calculations"; p. 288; Series: Lecture Notes
in Physics; Vol. 323; D.L. Dwoyer and M.Y. Hussaini (Eds.).
T. Hagstrom. Conditions at the downstream boundary for simulations of viscous,
incompressible flow. SIAM J. Sci. Stat. Comput., 12(4):843-858, 1991.
E. Hairer. Unconditionally stable explicit methods for parabolic equations. Numer. Math.,
35:57-68, 1980.
E. Hairer, S.P. Nørsett and G. Wanner. Solving Ordinary Differential Equations I: Nonstiff
Problems. Springer-Verlag, Berlin, Germany, 1987.
E. Hairer, C. Lubich and M. Roche. The Numerical Solution of Differential-Algebraic
Systems by Runge-Kutta Methods. Springer-Verlag, Berlin, Germany, 1989. Series:
Lecture Notes in Mathematics; Vol. 1409; A. Dold, B. Eckmann and F. Takens (Eds.).
E. Hairer and G. Wanner. Solving Ordinary Differential Equations II: Stiff and
Differential-Algebraic Problems. Springer-Verlag, Berlin, Germany, 1991.
P. Hansbo and A. Szepessy. A velocity-pressure streamline diffusion finite element
method for the incompressible Navier-Stokes equations. Comput. Meth. Appl. Mech.
Eng., 84:175-192, 1990.
P. Hansbo. The characteristic streamline diffusion method for convection-diffusion
problems. Comput. Meth. Appl. Mech. Eng., 96:239, 1992a.
P. Hansbo. The characteristic streamline diffusion method for the time-dependent
incompressible Navier-Stokes equations. Comput. Meth. Appl. Mech. Eng.,
99:171-186, 1992b.
J. Happel and H. Brenner. Low Reynolds Number Hydrodynamics: with Special Applications
to Particulate Media. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1965.
R. Harbord and M. Gellert. Progress in symmetric formulation of the incompressible
Navier-Stokes equations. Comput. Meth. Appl. Mech. Eng., 83:201-209, 1990.
R. Harbord and M. Gellert. A simple least-squares method for FE analysis of the
Navier-Stokes problem. Comput. Mech., 8:19-24, 1991.
F.H. Harlow and J.E. Welch. Numerical calculation of time-dependent viscous
incompressible flow of fluid with free surface. Phys. Fluids, 8(12):2182, 1965.
V. Haroutunian, M.S. Engelman and I. Hasbani. Segregated finite element algorithms for
the numerical solution of large-scale incompressible flow problems. Int. J. Numer.
Meth. Fluids, 17:323-348, 1993.
R.L. Hartman and K. Gustafson. Quantum Mechanics in Mathematics, Chemistry, and
Physics. Plenum Publishing Corporation, USA, 1981. Chap. "On the dimension
of a finite difference approximation to divergence-free vectors"; pp. 125-131;
K.E. Gustafson and W.P. Reinhardt (Eds.).
Y. Hasbani, E. Livne, and M. Bercovier. Finite elements and characteristics applied to
advection-diffusion equation. Comput. Fluids, 11(2):71-83, 1983.
G. Hauke and T.J.R. Hughes. A unified approach to compressible and incompressible
flows. Comput. Meth. Appl. Mech. Eng., 1994, 113:389-395.
L.J. Hayes, S.V. Krishnamachari, and T.F. Russell. A finite element alternating-direction
method combined with a modified method of characteristics for convection-diffusion
problems. SIAM J. Numer. Anal., 26(6):1462-1473, 1989.
F. Hecht. Analysis of Laminar Flow over a Backward Facing Step, A GAMM-Workshop.
Friedr. Vieweg and Sohn, Braunschweig, Germany, 1984. Notes on numerical fluid
mechanics; Vol. 9. Chap. "Use of divergence free basis in finite elements methods";
pp. 290-316; K. Morgan, J. Periaux, and F. Thomasset (Eds.).
G.W. Hedstrom. The Galerkin method based on Hermite cubics. SIAM J. Numer. Anal.,
16(3):385-393, 1979.
A.F. Hegarty, J.J.H. Miller, E. O'Riordan and G.I. Shishkin. Special Meshes for Finite
Difference Approximations to an Advection-Diffusion Equation with Parabolic Layers.
J. Comput. Phys., 117:47-54, 1995.
J.G. Heywood. The Navier-Stokes equations: On the existence, regularity, and decay of
solutions. Indiana Univ. Math. J., 29(4):639-681, 1980.
J.G. Heywood and R. Rannacher. Finite element approximation of the nonstationary
Navier-Stokes problem. I. Regularity of solutions and second-order error estimates
for spatial discretization. SIAM J. Numer. Anal., 19(2):275-311, 1982.
J.G. Heywood and R. Rannacher. Finite element approximation of the nonstationary
Navier-Stokes problem, Part II: Stability of solutions and error estimates uniform in
time. SIAM J. Numer. Anal., 23(4):750-777, 1986a.
J.G. Heywood and R. Rannacher. An analysis of stability concepts for the Navier-Stokes
equations. Journal für die reine und angewandte Mathematik (Crelles Journal), Band
372, 1986b.
J.G. Heywood and R. Rannacher. Finite element approximation of the nonstationary
Navier-Stokes problem, Part III. smoothing property and higher-order error estimates
for spatial discretization. SIAM J. Numer. Anal., 25(3):489-512, 1988.
J.G. Heywood and R. Rannacher. Finite-element approximation of the nonstationary
Navier-Stokes problem. Part IV: Error analysis for second-order time discretization.
SIAM J. Numer. Anal., 27(2):353-384, 1990.
J.G. Heywood, R. Rannacher and S. Turek. Artificial boundaries and flux and pressure
conditions for the incompressible Navier-Stokes equations. Int. J. Numer. Meth. Fluids,
22:325-352, 1996.
B.G. Higgins. Downstream development of two-dimensional viscocapillary film flow.
Ind. Eng. Chem., Fundam., 21:168-173, 1982.
A.C. Hindmarsh. GEAR: Ordinary Differential Equation System Solver. Lawrence Livermore
National Laboratory, Livermore, California, USA, UCID-30001, rev. 1, computer
documentation, 1972.
A.C. Hindmarsh. Numerical Solution of ODE's; Lecture Notes. Lawrence Livermore
National Laboratory, Livermore, California, USA, UCID-16558, 1974.
A.C. Hindmarsh. On a Finite Element Algorithm for Ordinary Differential Equations.
Numerical Mathematics Group, Lawrence Livermore National Laboratory, Livermore,
California, USA, technical memorandum No. 75-4, 1975.
A.C. Hindmarsh. On Numerical Methods for Stiff Differential Equations—Getting the
Power to the People. Lawrence Livermore National Laboratory, Livermore, California,
USA, UCRL-83259, 1979.
A.C. Hindmarsh. Scientific Computing. Vol. I of IMACS Transactions on Scientific
Computation. North-Holland, Amsterdam, The Netherlands, 1983. Chap. "ODEPACK,
a systematized collection of ODE solvers," pp. 55-64; R.S. Stepleman et al.
(Eds.).
A.C. Hindmarsh, P.M. Gresho and D.F. Griffiths. The stability of explicit Euler time-
integration for certain finite difference approximations of the multi-dimensional
advection-diffusion equation. Int. J. Numer. Meth. Fluids, 4:853-897, 1984.
A.C. Hindmarsh and L.R. Petzold. Numerical Methods for Solving Ordinary Differential
Equations and Differential/Algebraic Equations. Energy Tech. Rev., September:23-36,
1988.
A.C. Hindmarsh and L.R. Petzold. Algorithms and software for ordinary differential
equations and differential-algebraic equations, Part I: Euler methods and error
estimation. Comput. Phys., 9(1):34-41, Jan./Feb. 1995a.
A.C. Hindmarsh and L.R. Petzold. Algorithms and software for ordinary differential
equations and differential-algebraic equations, Part II: Higher-order methods and
software packages. Comput. Phys., 9(2):148-155, Mar./Apr. 1995b.
E. Hinton, T. Rock and O.C. Zienkiewicz. A note on mass lumping and related processes
in the finite element method. Earthquake Eng. Struct. Dyn., 1976, 4:245-249.
C. Hirsch. Numerical Computation of Internal and External Flows. Vol. I: Fundamentals
of Numerical Discretization. John Wiley and Sons Ltd., Chichester, England, UK, 1988.
C.W. Hirt, B.D. Nichols and N.C. Romero, SOLA—A numerical solution algorithm for
transient flux flows, Los Alamos Scientific Laboratory, Los Alamos, New Mexico,
USA, 1975, UC-34 and UC-79d.
L.-W. Ho. A Legendre Spectral Element Method for Simulation of Incompressible
Unsteady Viscous Free-Surface Flows. Massachusetts Institute of Technology,
Department of Mechanical Engineering, Cambridge, Massachusetts, USA, 1989.
Ph.D. Thesis.
L.-W. Ho and A.T. Patera. A Legendre spectral element method for simulation of unsteady
incompressible viscous free-surface flows. CMAME, 80:355-366, 1990.
L.-W. Ho, Y. Maday, A.T. Patera and E.M. Ronquist, "A high-order Lagrangian-
decoupling method for the incompressible Navier-Stokes equations," CMAME,
Vol. 80, pp. 65-90, 1990.
P. Hood and C. Taylor. Proc. Finite Element Methods in Flow Problems. University of
Alabama Press, Alabama, USA, 1974. Chap. "Navier-Stokes equations using mixed
interpolation"; J.T. Oden, R.H. Gallagher, O.C. Zienkiewicz, and C. Taylor (Eds.);
Swansea, Wales (January 1974).
N.A. Hookey, B.R. Baliga, and C. Prakash. Evaluation and enhancements of some control
volume finite-element methods—Part 1. convection-diffusion problems. Numer. Heat
Trans., 14:255-272, 1988.
N.A. Hookey and B.R. Baliga. Evaluation and enhancements of some control volume
finite-element methods—Part 2. incompressible fluid flow problems. Numer. Heat
Trans., 14:273-293, 1988.
E. Hopf. Über die Anfangswertaufgabe für die hydrodynamischen Grundgleichungen.
Mathematische Nachrichten, 4, Sept. 1950/1951. also available as 'On the Initial
Value Problem for the Fundamental Equations of Hydrodynamics,' translated by
P.P. Weidhaas, Lawrence Livermore National Laboratory, Livermore, California, USA,
UCRL-Trans-12144 (December 1986).
T.J.R. Hughes. Unconditionally stable algorithms for nonlinear heat conduction. CMAME,
1977, 10:135-139.
T.J.R. Hughes and A. Brooks. Finite Element Methods for Convection Dominated Flows.
AMD, The American Society of Mechanical Engineers, New York, New York, USA,
1979. Chap. 2, "A multi-dimensional upwind scheme with no crosswind diffusion";
pp. 19-35; T.J.R. Hughes (Ed.).
T.J.R. Hughes, W.K. Liu and A. Brooks, Finite Element Analysis of Incompressible
Viscous Flows by the Penalty Function Formulation, J. Comput. Phys., 1979a, 30:1-60,
January.
T.J.R. Hughes, K.S. Pister and R.L. Taylor. Implicit-explicit finite elements in
nonlinear transient analysis. Comput. Meth. Appl. Mech. Eng., 17/18:159-182, 1979b.
T.J.R. Hughes and A. Brooks. Finite Elements in Fluids. John Wiley & Sons Ltd,
Chichester, England, UK. Vol. 4, 1982. Chap. "A theoretical framework for
Petrov-Galerkin methods with discontinuous weighting functions: application to the
streamline-upwind procedure"; p. 47; R.H. Gallagher, D.H. Norrie, J.T. Oden and
O.C. Zienkiewicz (Eds.).
T.J.R. Hughes. Analysis of Transient Algorithms with Particular Reference to Stability
Behavior, in Computational Methods for Transient Analysis. North-Holland,
Amsterdam, The Netherlands, 1983; pp. 67-156; T. Belytschko and T.J.R. Hughes
(Eds.).
T.J.R. Hughes, L. Franca and M. Balestra. A new finite element formulation for CFD:
V. Circumventing the Babuska-Brezzi condition: a stable Petrov-Galerkin formulation
of the Stokes problem accommodating equal-order interpolations. Comput. Meth. Appl.
Mech. Eng., 59:85-99, 1986.
T.J.R. Hughes. The Finite Element Method: Linear Static and Dynamic Finite Element
Analysis. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1987.
T.J.R. Hughes and L. Franca. A new finite element formulation for CFD: VII. The
Stokes problem with various well-posed boundary conditions: Symmetric formulations
that converge for all velocity/pressure spaces. Comput. Meth. Appl. Mech. Eng.,
65:85-96, 1987.
T.J.R. Hughes and L.P. Franca. A new finite element formulation for computational fluid
dynamics: VII. The Stokes problem with various well-posed boundary conditions:
Symmetric formulations that converge for all velocity/pressure spaces. Comput. Meth.
Appl. Mech. Eng., 65:85-96, 1987.
T.J.R. Hughes, L.P. Franca and M. Mallet. A new finite element formulation for
computational fluid dynamics. VI. Convergence analysis for linear time-dependent
multidimensional advective-diffusive systems. Comput. Meth. Appl. Mech. Eng., 63:97-112,
1987.
T.J.R. Hughes. Finite Elements in Fluids, Vol. 7. John Wiley and Sons Ltd, Chichester,
England, UK, 1988.
T.J.R. Hughes, R.L. Taylor and J.F. Levy. Proc. 2nd Int. Symp. Finite Element Methods in
Flow Problems. International Center for Computer Aided Design ICCAD. 1976. Chap.
"A finite element method for incompressible viscous flows"; p. 3; Santa Margherita
Ligure, Italy, June 14-18, 1976, Conference Series No. 2/76.
T.J.R. Hughes, L.P. Franca, and G.M. Hulbert. A new finite element formulation for
computational fluid dynamics: VIII. The Galerkin/least-squares method for advective-
diffusive equations. Comput. Meth. Appl. Mech. Eng., 73:173-189, 1989.
A.G. Hutton and R.M. Smith. Numerical Methods in Laminar and Turbulent Flow. Pineridge
Press, Swansea, Wales, UK, 1981. Chap. "On the finite element simulation of
incompressible turbulent flow in general two-dimensional geometries"; C. Taylor and
B.A. Schrefler (Eds.); also CEGB Report RD/B/5010N81.
P.S. Huyakorn, C. Taylor, R.L. Lee and P.M. Gresho. A comparison of various mixed-
interpolation finite elements in the velocity-pressure formulation of the Navier-Stokes
equations. Comput. Fluids, 6:25-35, 1978.
S.R. Idelsohn and E. Oñate. Finite volumes and finite elements: Two 'good friends'. Int.
J. Numer. Meth. Eng., 37:3323-3341, 1994.
A.M. Il'in. Differencing scheme for a differential equation with a small parameter
affecting the highest derivative. Math. Notes Acad. Sci. USSR, 6:596, 1969.
B. Irons. Finite Element Techniques, Proceedings of a Seminar at the University of
Southampton, April 1970. p. 328; H. Tottenham and C. Brebbia (Eds.). Southampton,
England, UK, 1970.
B. Irons and S. Ahmad. Techniques of Finite Elements. Ellis Horwood Ltd, Chichester,
England, UK, 1980.
C.P. Jackson and K.A. Cliffe. Mixed interpolation in primitive variable finite element
formulations for incompressible flow. Int. J. Numer. Meth. Eng., 17:1659-1688, 1981.
J.D. Jackson. Classical Electrodynamics. John Wiley and Sons, Inc., New York, New
York, USA, 2nd. edition, 1975.
G. James and R.C. James (Eds.). Mathematics Dictionary. D. Van Nostrand Company,
Inc., Princeton, New Jersey, USA, multilingual edition, 1959.
L. Janvier, B. Metivet, R. Mgouni, G. Pot and E. Razafindrakoto. Numerical Simulation
of 3-D Incompressible Unsteady Viscous Laminar Flows: A GAMM-Workshop. Friedr.
Vieweg und Sohn, Braunschweig, Germany, 1992. Chap. "A 3-D driven cavity flow
simulation with N3S code"; pp. 67-78; Series: Notes on Numerical Fluid Mechanics,
Vol. 36; M. Deville, T.-H. Le and Y. Morchoisne (Eds.).
O.K. Jensen. An automatic timestep selection scheme for reservoir simulation. Proc. 55th
Annual Fall Technical Conf. and Exhibition of SPE of AIME, SPE 9373, Dallas, Texas,
USA, September 21-24, 1980.
O.K. Jensen and B.A. Finlayson. A numerical technique for tracking sharp fronts in
studies of tertiary oil-recovery pilots. SPE Reservoir Eng., 1:194-202, March 1986.
S. Jensen and M. Vogelius. Divergence stability in connection with the p-version of the
finite element method. M2AN, 24:737-764, 1990.
D.C. Jespersen. Arakawa's method is a finite-element method. J. Comput. Phys.,
16:383-390, 1974.
B.-N. Jiang and L.A. Povinelli. Least-squares finite element method for fluid dynamics.
Comput. Meth. Appl. Mech. Eng., 81:13-37, 1990.
B.-N. Jiang and L.A. Povinelli. Optimal least-squares finite element method for elliptic
problems. Comput. Meth. Appl. Mech. Eng., 1993, 102:199-212.
B.-N. Jiang. Non-oscillatory and non-diffusive solution of convection problems by
the iteratively reweighted least-squares finite element method. J. Comput. Phys.,
105(1):108-121, 1993.
B.-N. Jiang, C.Y. Loh, and L.A. Povinelli. Theoretical Study of the Incompressible Navier-
Stokes Equations by the Least-Squares Method. Lewis Research Center, Cleveland, Ohio,
USA, NASA technical memorandum 106535 and ICOMP-94-04, March 1994.
B.-N. Jiang, T.L. Lin and L.A. Povinelli, Large-Scale Computation of Incompressible
Viscous Flow by Least-Squares Finite Element Method, Comput. Meth. Appl. Mech.
Eng. 114:213-231, 1994.
C. Johnson and J. Pitkäranta. Analysis of some mixed finite element methods related to
reduced integration. Math. Comput., 38(158):375-400, 1982.
C. Johnson, U. Nävert, and J. Pitkäranta. Finite element methods for linear hyperbolic
problems. Comput. Meth. Appl. Mech. Eng., 45:285-312, 1984.
C. Johnson. Numerical Solution of Partial Differential Equations by the Finite Element
Method. Cambridge University Press, Cambridge, England, UK, 1987.
C. Johnson. Error estimates and adaptive time-step control for a class of one-step methods
for stiff ordinary differential equations. SIAM J. Numer. Anal., 25(4):908-926, 1988.
C. Johnson. A new approach to algorithms for convection problems which are based on
exact transport + projection. Comput. Meth. Appl. Mech. Eng., 100:45-62, 1992.
C. Johnson, R. Rannacher and M. Boman. Numerics and hydrodynamic stability: Toward
error control in computational fluid dynamics. SIAM J. Numer. Anal., 32(4):1058-1079,
August 1995.
S.L. Josse and B.A. Finlayson. Reflections on the numerical viscoelastic flow problem.
J. Non-Newtonian Fluid Mech., 16:13-36, 1984.
J.P. Gerrity Jr. A note on the computational stability of the two-step Lax-Wendroff form
of the advection equation. Mon. Weather Rev., 100(1):72-73, 1972.
H. Kardestuncer and D.H. Norrie (Eds.). Finite Element Handbook. McGraw-Hill, Inc.,
New York, New York, USA, 1987.
G. Em Karniadakis, S.A. Orszag, E.M. Rønquist and A.T. Patera. Spectral Element and
Lattice Gas Methods for Incompressible Fluid Dynamics, in Incompressible
Computational Fluid Dynamics Trends and Advances. Cambridge University Press, Cambridge,
England, UK, 1993. p. 203; M.D. Gunzburger and R.A. Nicolaides (Eds.).
G.E. Karniadakis, M. Israeli and S.A. Orszag. High-order splitting methods for the
incompressible Navier-Stokes equations. J. Comput. Phys., 97(2):414-443, 1991.
N. Kechkar and D. Silvester. Analysis of locally stabilised mixed finite element methods
for the Stokes problem. Math. Comp., 58:1-10, 1992.
H. Keller. Numerical Solution of Partial Differential Equations II. Academic Press, Inc.,
New York, New York, USA, 1971. Chap. "A new finite difference scheme for the
incompressible advection-diffusion equation"; p. 327; B. Hubbard (Ed.).
R. Keunings. An algorithm for the simulation of transient viscoelastic flows with free
surfaces. Journal of Computational Physics, 62:199-220, 1986.
A. Khelifa, J.-L. Robert and Y. Ouellet. A Douglas-Wang finite element approach
for transient advection-diffusion problems, Comput. Meth. Appl. Mech. Eng.,
110:113-129, 1993.
H. Kheshgi and M. Luskin. Analysis of the finite element variable penalty method for
Stokes equations. Math. Comput., 45(172):347-363, 1985.
H.S. Kheshgi and L.E. Scriven. Penalty-Finite Element Methods in Mechanics. AMD,
The American Society of Mechanical Engineers, New York, New York, USA, 1982.
Chap. "Finite element analysis of incompressible viscous flow by a variable penalty
function method"; pp. 67-74; AMD-Vol. 51; J.N. Reddy (Ed.).
H.S. Kheshgi and L.E. Scriven. Finite Elements in Fluids. John Wiley and Sons
Ltd, Chichester, England, UK. Vol. 5, 1984. Chap. 19, "Penalty finite element
analysis of unsteady free surface flows"; pp. 393-434; R.H. Gallagher, J.T. Oden,
O.C. Zienkiewicz, T. Kawai and M. Kawahara (Eds.).
H.S. Kheshgi and L.E. Scriven. Variable penalty method for finite element analysis of
incompressible flow. Int. J. Numer. Meth. Fluids, 1985, pp. 785-803.
N. Kikuchi, J.T. Oden and Y.J. Song. Penalty-finite element methods for the analysis of
Stokesian flows. Comput. Meth. Appl. Mech. Eng., 31:297-329, 1982.
J. Kim and P. Moin. Application of a fractional-step method to incompressible
Navier-Stokes equations. J. Comput. Phys., 59:308-323, 1985.
I. King, C. Bratu, B. Delbast, A. Besson and J.P. Chabard. Proc. Europec 90. Society of
Petroleum Engineers, 1990. Chap. "Hydraulic optimization of PDC bits"; The Hague,
The Netherlands, October 22-24, 1990.
I. Kinnmark. The Shallow Water Wave Equations: Formulation, Analysis and Application.
Springer-Verlag, Berlin, Germany, 1986.
I.P.E. Kinnmark and W.G. Gray. Stability and accuracy of spatial approximations for
wave equation tidal models. J. Comput. Phys., 60:447-466, 1985.
S.F. Kistler and L.E. Scriven. The teapot effect—sheet-forming flows with deflection,
wetting and hysteresis. J. Fluid Mech., 263:19-62, March 1994.
S.P. Kjaran and S.T. Sigurdsson. Treatment of time derivative and calculation of flow
when solving groundwater flow problems by Galerkin finite element methods. Adv.
Water Resources, 4, March 1981.
P. Kloucek and F.S. Rys. Stability of the fractional step θ-scheme for the nonstationary
Navier-Stokes equations. SIAM J. Numer. Anal., 31(5):1312-1335, 1994.
J.B. Knox. Numerical errors in the time integration of advective processes. J. Geophys.
Res., 66(12):4177-4186, 1961.
R.P. Koopman, D.L. Ermak and S.T. Chan. A review of recent field tests and
mathematical modelling of atmospheric dispersion of large spills of denser-than-air gases.
Atmos. Environ., 23(4):731-745, 1989.
H. Kreiss and J. Oliger. Methods for the Approximate Solution of Time Dependent
Problems. World Meteorological Organization, 1973. Vol. 136 in Pure and Applied
Mathematics.
H.-O. Kreiss and J. Lorenz. Initial-Boundary Value Problems and the Navier-Stokes
Equations. Academic Press, Inc., Boston, Massachusetts, USA, 1989.
S.V. Krishnamachari, L.J. Hayes and T.F. Russell. A finite element alternating-direction
method combined with a modified method of characteristics for convection-diffusion
problems. SIAM J. Numer. Anal., 26(6):1462-1473, 1989.
D. Kwak et al. A three-dimensional incompressible Navier-Stokes flow solver using
primitive variables. AIAA Journal, 24(3):390-396, 1986.
Y.-K. Kwok and K.-K. Tam. Stability analysis of three-level difference schemes
for initial-boundary problems for multidimensional convective-diffusion equations.
Commun. Numer. Methods Eng., 9:595-605, 1993.
M. Křížek and P. Neittaanmäki. On superconvergence techniques. Acta Appl. Math.,
9(3):175-198, 1987.
P. Labbe and A. Garon. A robust implementation of Zienkiewicz and Zhu's local patch
recovery method. Commun. Numer. Methods Eng., 11:427-434, 1995.
F. Ladeinde and K.E. Torrance. Galerkin finite element simulations of convection driven
by rotation and gravitation. Int. J. Numer. Meth. Fluids, 10(1), January 1990.
O.A. Ladyshenskaya. The Mathematical Theory of Viscous Incompressible Flow. Gordon
and Breach Science Publishers, Inc., New York, New York, USA, 2nd edition, 1969.
O.A. Ladyshenskaya. Annual Review of Fluid Mechanics. 7, 1975. Chap. "Mathematical
analysis of Navier-Stokes equations for incompressible fluids"; p. 249; M. Van Dyke,
W. Vincenti and J. Wehausen (Eds.).
A. Lafon and H.C. Yee. Dynamical approach study of spurious steady-state numerical
solutions for nonlinear differential equations. Part III: The effects of nonlinear source
terms and boundary conditions in reaction-convection equations. Int. J. Comput. Fluid
Dyn., 6:1, 1996.
J.D. Lambert. Numerical Methods for Ordinary Differential Systems: The Initial Value
Problem. John Wiley and Sons Ltd, Chichester, England, UK, 1991.
L.D. Landau and E.M. Lifshitz. Fluid Mechanics. Pergamon Press, Oxford, England, UK,
1959.
W.E. Langlois. Slow Viscous Flow. The Macmillan Company/Collier-Macmillan Limited,
New York/London, 1964.
L. Lapidus and W.E. Schiesser. Numerical Methods for Differential Systems. Academic
Press Inc., New York, New York, USA, 1976.
B.E. Larock and L.R. Herrmann. Proc. First Int. Conf. Finite Elements in Water Resour.,
Pentech Press, London, England, UK, 1977. Chap. "Improved Flux Prediction Using
Low Order Finite Elements"; pp. 1.103-1.114; Princeton, New Jersey, USA, July,
1976.
P. Lax and B. Wendroff. Systems of conservation laws. Commun. Pure Appl. Math.,
XIII:217-237, 1960.
P.D. Lax and B. Wendroff. On the stability of difference schemes with variable
coefficients. Commun. Pure Appl. Math., 15:363-371, 1962.
P.D. Lax and B. Wendroff. Difference schemes for hyperbolic equations with high order
of accuracy. Commun. Pure Appl. Math., XVII:381-398, 1964.
H. Le and P. Moin. An improvement of fractional step methods for the incompressible
Navier-Stokes equations. J. Comput. Phys., 92:369-379, 1991.
L.G. Leal. Laminar Flow and Convective Transport Processes: Scaling Principles and
Asymptotic Analysis. Butterworth-Heinemann, Boston, Massachusetts, USA, 1992.
R.L. Lee, P.M. Gresho and R.L. Sani. Smoothing techniques for certain primitive
variable solutions of the Navier-Stokes equations. Int. J. Numer. Meth. Eng.,
14:1785-1804, 1979.
R.L. Lee, S.T. Chan, P.M. Gresho and C.D. Upson. Proc. AIAA, 5th Comp. Fluid
Dynamics Conf., USA, 1981. Chap. "Simulation of three-dimensional, time-dependent,
incompressible flows by a finite element method"; pp. 354-364; Palo Alto, California;
also Lawrence Livermore National Laboratory, Livermore, California, UCRL-85226.
R.L. Lee, P.M. Gresho, S.T. Chan, R.L. Sani, and M.J.P. Cullen. Finite Elements
in Fluids. John Wiley and Sons Ltd, Chichester, England, UK. Vol. 4, 1982.
Chap. 2, "Conservation laws for primitive variable formulations of the incompressible
flow equations using the Galerkin finite element method," p. 21; R.H. Gallagher,
D.H. Norrie, J.T. Oden, and O.C. Zienkiewicz (Eds.).
R.L. Lee, P.M. Gresho and R.L. Sani. An Exploratory Study on the Application of an
Existing Finite Element Navier-Stokes Code to Compute Potential Flows. Preprint.
Lawrence Livermore National Laboratory, Livermore, California, USA; UCRL-86684,
1982.
R.L. Lee and J.M. Leone Jr. A modified finite element model for mesoscale flows over
complex terrain. Comput. Math. Appl., 16(12):41-56, 1988.
R.L. Lee, P.M. Gresho, S.T. Chan and C.D.Y. Upson. A Three-dimensional, Finite
Element Method for Simulating Heavier-than-air Gaseous Releases over Variable
Terrain, in Air Pollution Modeling and Its Application II. 1983; pp. 555-573;
C. deWispelaere (Ed.).
R.L. Lee, P.M. Gresho and R.L. Sani. Proc. Summer Computer Simulation Conference.
Chap. "A comparative study of certain finite-element and finite-difference methods in
advection-diffusion simulations"; pp. 37-42; Washington, D.C., USA; April 1976.
Y.-S. Lee and P.R. Dawson. Obtaining residual stresses in metal forming after neglecting
elasticity on loading. J. Appl. Mech., 111, June 1989.
C.E. Leith. Methods in Computational Physics: Advances in Research and Applications.
Academic Press, New York, New York, USA. Applications in hydrodynamics, Vol. 4,
1965. Chap. "Numerical simulation of the earth's atmosphere"; pp. 1-28; B. Alder,
S. Fernbach and M. Rotenberg (Eds.).
C.E. Leith. Diffusion approximation for two-dimensional turbulence. Phys. Fluids,
11:671-673, 1969.
B.P. Leonard. A stable and accurate convective modelling procedure based on quadratic
upstream interpolation. Comput. Meth. Appl. Mech. Eng., 1979a, 19:59-98.
B.P. Leonard. Finite Element Methods for Convection Dominated Flows. AMD, The
American Society of Mechanical Engineers, New York, New York, USA, 1979b.
Chap. 1, "A survey of finite differences of opinion on numerical muddling of the
incomprehensible defective confusion equation"; pp. 1-17; T.J.R. Hughes (Ed.).
B.P. Leonard and S. Mokhtari. Beyond first-order upwinding: The ultra-sharp alternative
for non-oscillatory steady-state simulation of convection. Int. J. Numer. Meth. Eng.,
30:729-766, 1990.
B.P. Leonard and S. Mokhtari. ULTRA-SHARP solution of the Smith-Hutton problem.
Int. J. Numer. Meth. Heat Fluid Flow, 2:407-427, 1992.
J.M. Leone Jr., P.M. Gresho, S.T. Chan and L. Lee. A note on the accuracy of
Gauss-Legendre quadrature in the finite element method. Int. J. Numer. Meth. Eng.,
14:769-773, 1979.
J.M. Leone Jr. Finite Element Simulations of Stratified Flow over Simple Geometrical
Obstructions and Arbitrarily Complex Terrain. Iowa State University, Ames, Iowa,
USA, 1980. Ph.D. Thesis.
J.M. Leone Jr. and P.M. Gresho. Finite Element Simulations of Steady, Two-Dimensional
Viscous Incompressible Flow over a Step. J. Comput. Phys., 41(1):167, 1981.
J.M. Leone Jr., P.M. Gresho, R.L. Lee and R.L. Sani. Proc. Int. Conf. Numer. Meth.
Laminar and Turbulent Flow. Pineridge Press Ltd, Swansea, Wales, UK, 1983. Chap.
"Flow-Through Boundary Conditions for Time-Dependent, Buoyancy-Influenced Flow
Simulations Using Low Order Finite Elements"; pp. 1-13; Swansea, Wales, UK,
August 8-11, 1983.
J.M. Leone Jr and R.L. Lee. Numerical simulation of drainage flow in Brush Creek,
Colorado. J. Appl. Meteorol., 28(6):530-542, 1989.
M. Lesieur. Turbulence in Fluids: Stochastic and Numerical Modelling. Kluwer Academic,
Boston, Massachusetts, USA, 1987. Series: Mechanics of Fluids and Transport
Processes.
F.S. Lien and M.A. Leschziner. A general non-orthogonal collocated finite volume
algorithm for turbulent flow at all speeds incorporating second-moment turbulence-
transport closure, Part 1: Computational implementation. Comput. Meth. Appl. Mech.
Eng., 114:123-148, 1994.
M.J. Lighthill. Group velocity. J. Inst. Math. Appl., 1:1-28, 1965.
D.K. Lilly. On the computational stability of numerical solutions of time-dependent
nonlinear geophysical fluid dynamics problems. Mon. Weather Rev., 93(1):11-25, 1965.
S.P. Lin and M. Tobak. Spectral stability of Taylor's vortex array. Phys. Fluids,
29(10):3477, October 1986.
J. Linden, G. Lonsdale, B. Steckel and K. Stüben. Multigrid for the steady-state
incompressible Navier-Stokes equations: a survey. Technical Report 322, GMD, 1988.
J.L. Lions. AGARD Lecture Series No. 48: Numerical Methods in Fluid Dynamics. North
Atlantic Treaty Organization (NATO) Advisory Group for Aerospace Research and
Development (AGARD), 1972. Chap. "On the numerical approximation of some
equations arising in hydrodynamics"; pp. 9-21; J.J. Smolderen (Ed.).
J. Liou, O. Pironneau and T. Tezduyar. Characteristic-Galerkin and Galerkin/least-squares
space-time formulations for the advection-diffusion equation with time-dependent
domains. Comput. Meth. Appl. Mech. Eng., 100:117-141, 1992.
W.K. Liu and Y.F. Zhang. Unconditionally stable implicit-explicit algorithms for coupled
thermal stress waves. Comput. Struct., 17(3):371-374, 1983; see also Int. J. Numer.
Meth. Eng. 20(9):1581, 1984.
W.K. Liu, T. Belytschko and Y.F. Zhang. Partitioned rational Runge Kutta for parabolic
systems. Int. J. Numer. Meth. Eng., 20:1581-1597, 1984.
W.K. Liu, J.S.-J. Ong and R.A. Uras. Finite element stabilization matrices-a unification
approach. Comput. Meth. Appl. Mech. Eng., 53:13-46, 1985.
T.P. Loc. Numerical analysis of unsteady secondary vortices generated by an impulsively
started circular cylinder. J. Fluid Mech., 100(1):111-128, 1980.
T.P. Loc and R. Bouard. Numerical solution of the early stage of the unsteady viscous
flow around a circular cylinder. A comparison with experimental visualization and
measurements. J. Fluid Mech., 160:93-117, 1985.
P.M. Lovalenti and J.F. Brady. The hydrodynamic force on a rigid particle undergoing
arbitrary time-dependent motion at small Reynolds number. J. Fluid Mech.,
256:561-605, 1993.
P.M. Lovalenti and J.F. Brady. The temporal behaviour of the hydrodynamic force on a
body in response to an abrupt change in velocity at small but finite Reynolds number.
J. Fluid Mech., 293:35-46, 1995.
C. Lubich, E. Hairer and M. Roche. The Numerical Solution of
Differential-Algebraic Systems by Runge-Kutta Methods. Springer-Verlag, Berlin, Germany, 1989.
Series: Lecture Notes in Mathematics, Vol. 1409; A. Dold, B. Eckmann and F. Takens
(Eds.).
H.J. Lugt. Vortex Flow in Nature and Technology. John Wiley & Sons, Inc., New York,
New York, USA, 1983.
M. Luskin and R. Rannacher. On the smoothing property of the Crank-Nicolson scheme.
Appl. Anal., 14:117-135, 1982.
D.R. Lynch and W.G. Gray. A wave equation model for finite element tidal computations.
Comput. Fluids, 7:207-228, 1979.
D.R. Lynch and W.G. Gray. Finite element simulation of flow in deforming regions.
J. Comput. Phys., 36:135-153, 1980.
D.R. Lynch. Mass conservation in finite element groundwater models. Advances in Water
Resources, 7:67, 1984.
D.R. Lynch. Heat conservation in deforming element phase change simulation. J. Comput.
Phys., 57(2):303, January, 1985a.
D.R. Lynch. Mass balance in shallow water simulations. Commun. Appl. Numer. Meth.,
1:153-159, 1985b.
R. Löhner. Design of incompressible flow solvers: practical aspects, Cambridge University
Press, 1993, in Incompressible Computational Fluid Dynamics; pp. 267-293;
M. Gunzburger and R. Nicolaides (Eds.).
M. Morandi, K. Morgan, J. Periaux, B.A. Schrefler and O.C. Zienkiewicz. Proc. Int. Conf.
Finite Elements in Fluids: New Trends and Applications. Venezia, Italy, 1995.
R.J. MacKinnon and G.F. Carey. Superconvergent derivatives: A Taylor series analysis.
Int. J. Numer. Meth. Eng., 28:489-509, 1989.
R.J. MacKinnon and G.F. Carey. Nodal superconvergence and solution enhancement for
a class of finite-element and finite-difference methods. SIAM J. Sci. Stat. Comput.,
11(2):343-353, 1990.
R.J. MacKinnon, G.F. Carey and P. Murray. A procedure for calculating vorticity
boundary conditions in the stream-function—vorticity method. Commun. Appl. Numer.
Meth., 6:47-48, 1990.
R.H. MacNeal. An asymmetrical finite difference network. Q. Appl. Math.,
11(3):295-310, 1953.
Y. Maday, A.T. Patera and E.M. Ronquist. An operator-integration-factor splitting method
for time-dependent problems: application to incompressible fluid flow. J. Sci. Comput.,
5(4):263-292, December 1990.
A.V. Malevsky and D.A. Yuen. Characteristics-based methods applied to infinite
Prandtl number thermal convection in the hard turbulent regime. Phys. Fluids A,
3(9):2105-2115, 1991.
A.V. Malevsky. Spline-Characteristic Method for Simulation of Convective Turbulence.
University of Minnesota, Army High Performance Computing Research Center,
Minneapolis, Minnesota, USA, AHPCRC 93-059, 1993.
D.S. Malkus and T.J.R. Hughes. Mixed finite element methods—reduced and selective
integration techniques: A unification of concepts. Comput. Meth. Appl. Mech. Eng.,
15:63-81, 1978.
D.S. Malkus. Eigenproblems associated with the discrete LBB condition for
incompressible finite elements. Int. J. Eng. Sci., 19:1299-1310, 1981.
D.S. Malkus and E.T. Olsen. Penalty-Finite Element Methods in Mechanics. AMD, The
American Society of Mechanical Engineers, New York, New York, USA, 1982. Chap.
"Incompressible finite elements which fail the discrete LBB condition"; pp. 33-50;
AMD-Vol. 51; J.N. Reddy (Ed.).
D.S. Malkus and E.T. Olsen. Obtaining error estimates for optimally constrained
incompressible finite elements. Comput. Meth. Appl. Mech. Eng., 45:331-353, 1984.
D.S. Malkus and M.E. Plesha. Zero and negative masses in finite element vibration and
transient analysis. Comput. Meth. Appl. Mech. Eng., 59:281-306, 1986.
D.S. Malkus, R.D. Cook and M.E. Plesha. Concepts and Applications of Finite Element
Analysis. John Wiley and Sons, Inc., New York, New York, USA, 3rd edition, 1989.
M. Mallet, C. Poirier and F. Shakib. A new finite element formulation for computational
fluid dynamics: development of an hourglass control operator for multidimensional
advective-diffusive systems, Comput. Meth. Appl. Mech. Eng., 94:429-442, 1992.
R.S. Marshall, J.C. Heinrich and O.C. Zienkiewicz. Natural convection in a square
enclosure by a finite-element, penalty function method using primitive fluid variables.
Numerical Heat Transfer, 1:315-330, 1978.
M. Marion and R. Temam. Navier-Stokes Equations Theory and Approximation.
Handbook Numer. Anal., to appear, 1996.
Y.P. Marx. Time integration schemes for the unsteady incompressible Navier-Stokes
equations. J. Comput. Phys., 112:182-209, 1994.
J. Mason. Methods of Functional Analysis for Application in Solid Mechanics. Elsevier,
Amsterdam, The Netherlands, 1985. Series: Studies in Applied Mechanics, Vol. 9.
K.K. Mathur and P.R. Dawson. On modeling damage evolution during the drawing of
metals. Mech. Mater., 6:179-196, 1987.
M.R. Maxey and J.J. Riley. Equation of motion for a small rigid sphere in a nonuniform
flow. Phys. Fluids, 26(4):883, 1983.
R.C. McCallen. An Investigation of the Finite Element Incompressible Flow Code FEM3:
The Model, Some of Its Options, A Guide on How To Obtain and Run the Code, and
Its Performance on Some Chosen Problems. Lawrence Livermore National Laboratory,
Livermore, California, USA, UCID-21527 edition, 1988.
R.C. McCallen. Large-Eddy Simulation of Turbulent Flow Using the Finite Element
Method. Ph.D. Thesis, University of California, Davis, Department of Mechanical
Engineering, Davis, California, USA, 1993.
R. Mei. History force on a sphere due to a step change in the free-stream velocity. Int. J.
Multiphase Flow, 19(3):505-525, 1993.
R. Mei. Velocity fidelity of flow tracer particles. Exp. Fluids, 22:1-13, 1996.
R. Mei and C.J. Lawrence. The flow field due to a body in impulsive motion. J. Fluid
Mech., 325:79-111, 1996.
M.C. Melaaen. Calculation of fluid flows with staggered and nonstaggered curvilinear
nonorthogonal grids—the theory. Numer. Heat Transfer, Part B, 21:1-19, 1992.
B. Mercier. Topics in Finite Element Solution of Elliptic Problems. Springer-Verlag, Berlin,
Germany, 1979a.
B. Mercier. A conforming finite element method for two-dimensional incompressible
elasticity. Int. J. Numer. Meth. Eng., 14:942-945, 1979b.
S.G. Mikhlin. Variational Methods in Mathematical Physics. Pergamon Press, Oxford,
England, UK, 1964.
J.J.H. Miller, E. O'Riordan and G.I. Shishkin. On piecewise-uniform meshes for upwind-
and central-difference operators for solving singularly perturbed problems. IMA J.
Numer. Anal., 15:89-99, 1995.
J.J.H. Miller, E. O'Riordan and G.I. Shishkin. Solution of singular perturbation
problems with e-uniform numerical methods—introduction to the theory. World Scientific
Publishers, Singapore, 1996, to appear.
R.H. Miller. A horror story about integration methods. J. Comput. Phys., 93:469-476, 1991.
K. Millsaps. Karl Pohlhausen, as I remember him. Ann. Rev. Fluid Mech., 16:1-10, 1984.
P. Minev and P.M. Gresho. A remark on pressure correction schemes for transient viscous
incompressible flow. Commun. Numer. Meth. Eng., in press.
W.J. Minkowycz, E.M. Sparrow, G.E. Schneider and R.H. Pletcher. Handbook of
Numerical Heat Transfer. John Wiley and Sons, Inc., New York, New York, USA, 1988.
A.R. Mitchell and D.F. Griffiths. Semi-Discrete Generalised Galerkin Methods for Time-
Dependent Conduction-Convection Problems. Academic Press Ltd, London, England,
UK, 1979. Chap. 2 in MAFELAP III.
A.R. Mitchell. Recent Developments in the Finite Element Method. North-Holland,
Amsterdam, The Netherlands, 1984, in Computational Techniques and Applications,
J. Noye and C. Fletcher (Eds.).
S. Mittal and T.E. Tezduyar. Massively parallel finite element computation of
incompressible flows involving fluid-body interactions. Comput. Meth. Appl. Mech. Eng.,
112:253-282, 1994.
K. Miyakoda. Contribution to the numerical weather prediction: Computation with finite
difference. Jpn. J. Geophys., 3(1):75-190, 1962.
A. Mizukami. A mixed finite element method for boundary flux computation. Comput.
Meth. Appl. Mech. Eng., 57:239-243, 1986.
A.G. Mohamed, D.T. Valentine and R.E. Hassel. Numerical study of laminar separation
over an annular backstep. Comput. Fluids, 20(2):121-143, 1991.
C.R. Molenkamp. Accuracy of finite-difference methods applied to the advection equation.
J. Appl. Meteorol., 7(2):160-167, 1968.
K.W. Morton. Stability and convergence in fluid flow problems. Proc. R. Soc. Lond. A.,
323:237-253, 1971.
K.W. Morton. Stability of finite difference approximations to a diffusion-convection
equation. Int. J. Numer. Meth. Eng., 15:677-683, 1980.
K.W. Morton and A. Stokes. The Mathematics of Finite Elements and Applications IV
MAFELAP 1981. Academic Press, New York, New York, USA, 1982. Chap. "Generalised
Galerkin methods for hyperbolic equations"; pp. 421-431; J.R. Whiteman (Ed.).
K.W. Morton. Proc. IMA Conf. Numerical Methods for Fluid Dynamics. 1982. Chap.
"Generalised Galerkin methods for steady and unsteady problems"; pp. 1-32;
K.W. Morton and M.J. Baines (Eds.).
K.W. Morton. Proc. Fifth GAMM Conf. Numerical Methods in Fluid Mechanics. Friedr.
Vieweg und Sohn, Braunschweig, Germany, 1983. Chap. "Characteristic Galerkin
methods for hyperbolic problems"; pp. 243-250; M. Pandolfi and R. Piva (Eds.);
Rome, Italy, October 5-7, 1983.
K.W. Morton. Generalised Galerkin methods for hyperbolic problems. Comput. Meth.
Appl. Mech. Eng., 52:847-871, 1985.
K.W. Morton, A. Priestley and E. Süli. Stability of the Lagrange-Galerkin method with
non-exact integration. Math. Model. Numer. Anal., 22(4):625-653, 1988.
K.W. Morton. Proc. Third Int. Conf. Hyperbolic Problems. Studentlitteratur, Uppsala,
Sweden, 1990. Chap. "Lagrange-Galerkin and Characteristic-Galerkin methods and their
applications"; pp. 742-755; B. Engquist and B. Gustafsson (Eds.); Uppsala, Sweden,
June 11-15, 1990.
K.W. Morton and E. Süli. Finite volume methods and their analysis. IMA J. Numer. Anal.,
11:241-260, 1991.
K.W. Morton. Numerical Solution of Convection-Diffusion Problems. Chapman & Hall,
London, UK, 1996.
R. Mullen and T. Belytschko. Dispersion analysis of finite element semidiscretizations of
the two-dimensional wave equation. Int. J. Numer. Meth. Eng., 18:11-29, 1982.
S. Müller, A. Prohl, R. Rannacher and S. Turek. Fast solvers for flow problems, Friedr.
Vieweg und Sohn, Heidelberg, Germany, 1995. Chap. "Implicit time-discretization
of the nonstationary incompressible Navier-Stokes equations"; p. 175; Notes on CFD,
Vol. 49; W. Hackbusch and G. Wittum (Eds.).
B. Metivet, K. Boukir, Y. Maday and E. Razafindrakoto. A high order
characteristics/finite element method for the incompressible Navier-Stokes equations. Int. J.
Numer. Meth. Fluids, 1996, to appear.
K. Nafa and R.W. Thatcher. Low-order macroelements for two- and three-dimensional
Stokes flow. Numerical Methods for Partial Differential Equations, 9:579-591, 1993.
M.J. Naughton. On Numerical Boundary Conditions for the Navier-Stokes Equations.
Ph.D. Thesis, California Institute of Technology, Pasadena, California, USA, 1986.
J. Nelson and C.L. Chang. A mass conservative least-squares finite element method for
the Stokes problem. Commun. Numer. Meth. Eng., 11:965-970, 1995.
B. Neta and R.T. Williams. Stability and phase speed for various finite element
formulations of the advection equation. Comput. Fluids, 14(4):393-410, 1986.
O. Nevanlinna and W. Liniger. Contractive methods for stiff differential equations. Part I.
BIT, 18:457-474, 1978.
O. Nevanlinna and W. Liniger. Contractive methods for stiff differential equations. Part II.
BIT, 19:53-72, 1979.
E. Ng, B. Pegtor and B. Nitrosso. On the solution of Stokes's system within N3S using
supernodal Cholesky factorization, in Finite Elements in Fluids, Part I, Pineridge Press
Ltd, Swansea, Wales, UK, 1993; pp. 76-84; K. Morgan, E. Onate, J. Periaux, J. Peraire
and O.C. Zienkiewicz (Eds.).
R.A. Nicolaides. Existence, uniqueness and approximation for generalized saddle point
problems. SIAM J. Numer. Anal., 19(2):349-357, 1982.
R.A. Nicolaides, T.A. Porsching and C.A. Hall. Computational Fluid Dynamics Review.
John Wiley and Sons, Inc., New York, New York, USA, 1995. Chap. "Covolume
methods in computational fluid dynamics"; M. Hafez and K. Oshima (Eds.).
B. Noble. Applied Linear Algebra. Prentice-Hall, Inc., Englewood Cliffs, New Jersey,
USA, 1969.
R.A. Novy, H.T. Davis and L.E. Scriven. Upstream and downstream boundary conditions
for continuous-flow systems. Chem. Eng. Sci., 45(6):1515-1524, 1990.
R.A. Novy, H.T. Davis and L.E. Scriven. A comparison of synthetic boundary conditions
for continuous-flow systems. Chem. Eng. Sci., 46(1):57-68, 1991.
J.T. Oden. Finite-Element Method. Finite-Element Analogue of Navier-Stokes Equation,
J. Eng. Mech. Div. ASCE, 96(4):529, 1970.
J.T. Oden. Finite Elements of Nonlinear Continua. McGraw-Hill Book Company, New
York, New York, USA, 1972.
J.T. Oden and J.N. Reddy. An Introduction to the Mathematical Theory of Finite Elements.
John Wiley and Sons, New York, New York, USA, 1976.
J.T. Oden, N. Kikuchi and Y.J. Song. Penalty-finite element methods for the analysis of
Stokesian flows. Comput. Meth. Appl. Mech. Eng., 31:297-329, 1982.
J.T. Oden and G.F. Carey. Finite Elements: Mathematical Aspects Vol. IV. Prentice-Hall,
Inc., Englewood Cliffs, New Jersey, USA, 1983.
J.T. Oden and O.-P. Jacquotte. Stability of some mixed finite element methods for
Stokesian flows. Comput. Meth. Appl. Mech. Eng., 43:231-247, 1984.
J.T. Oden. The best FEM. Finite Elements Anal. Des., 7:103-114, 1990.
J.T. Oden. Finite Elements: An Introduction. Handbook of Numerical Analysis, Vol. II,
North-Holland, Amsterdam, The Netherlands, 1991.
J.T. Oden, A. Patra and Y. Feng. An hp adaptive strategy, AMD, ASME, 1992, 157:23-46.
Adaptive, Multilevel and Hierarchical Computational Strategies; A.K. Noor (Ed.).
J.T. Oden, W. Wu and M. Ainsworth. Three-step h-p adaptive strategy for the
incompressible Navier-Stokes equations, IMA summer program on modeling, mesh
generation, and adaptive numer. Methods for partial differential equations, 1993.
E.T. Olsen. Stable Finite Elements for Non-Newtonian Flows: First Order Elements Which
Fail the LBB Condition. Ph.D. Thesis, Illinois Institute of Technology, Chicago, Illinois,
USA, 1983.
M.D. Olson and S.-Y. Tuann. New finite element results for the square cavity. Comput.
Fluids, 7:123-135, 1979.
I. Orlanski. A simple boundary condition for unbounded hyperbolic flows. J. Comput.
Phys., 21:251-269, 1976.
S.A. Orszag. Numerical simulation of incompressible flows within simple boundaries:
Accuracy. J. Fluid Mech., 49(1):75-112, 1971.
S.A. Orszag. Comparison of pseudospectral and spectral approximation. Stud. Appl.
Math., 51:253-259, 1972.
J.M. Ortega. Numerical Analysis: A Second Course. SIAM, Philadelphia, Pennsylvania,
USA, 1990. Series: Classics in Applied Mathematics; R.E. O'Malley Jr. (Ed.).
R.L. Panton. Incompressible Flow. John Wiley and Sons, Inc., New York, New York,
USA, 2nd edition, 1996.
S. Paolucci. Direct numerical simulation of two-dimensional turbulent natural convection
in an enclosed cavity. J. Fluid Mech., 215:229-262, 1990.
S. Paolucci and D. Chenoweth. A note on the stability of the explicit finite differenced
transport equation. J. Comput. Phys., 47:489, 1982.
T.C. Papanastasiou, N. Malamataris and K. Ellwood. A new outflow boundary condition.
Int. J. Numer. Meth. Fluids, 14:587-608, 1992.
N.-S. Park and J.A. Liggett. Taylor-least-squares finite element for two-dimensional
advection-dominated unsteady advection-diffusion problems. Int. J. Numer. Meth.
Fluids, 11:21-38, 1990.
N.-S. Park and J.A. Liggett. Application of Taylor-least squares finite element to
the three-dimensional advection-diffusion equation. Int. J. Numer. Meth. Fluids,
13:759-773, 1991.
J. Pedlosky. Geophysical Fluid Dynamics. Springer-Verlag, New York, New York, USA,
2nd edition, 1987.
A. Peirce and J.H. Prevost. On the lack of convergence of unconditionally stable explicit
rational Runge-Kutta schemes. Comput. Meth. Appl. Mech. Eng., 57:171-180, 1986.
R.B. Pelz, V. Yakhot, and S.A. Orszag. Velocity-Vorticity patterns in turbulent flow.
Phys. Rev. Letters, 54(23):2505, 1985.
D.W. Pepper and J.C. Heinrich. The Finite Element Method: Basic Concepts and
Application. Taylor & Francis, Basingstoke, England, UK, 1992.
J.B. Perot. An analysis of the fractional step method. J. Comput. Phys., 108:51-58, 1993.
L. Petzold and P. Lotstedt. Numerical solution of nonlinear differential equations
with algebraic constraints II: Practical implications. SIAM J. Sci. Stat. Comput.,
7(3):720-733, 1986.
L. Petzold. Differential/algebraic equations are not ODE's. SIAM J. Sci. Stat. Comput.,
3(3):367-384, 1982.
R. Peyret, M. Fortin and R. Temam. Resolution numerique des equations de
Navier-Stokes pour un fluide incompressible. J. Mec., 10(3):357-390, 1971.
N.A. Phillips. The Atmosphere and the Sea in Motion. The Rockefeller Institute Press,
New York, New York, USA, 1959. Chap. "An Example of Non-Linear Computational
Instability"; B. Bolin (Ed.).
S.A. Piacsek and G.P. Williams. Conservation properties of convection difference
schemes. J. Comput. Phys., 6:392-405, 1970.
R. Pierre. Simple C⁰ approximations for the computation of incompressible flows.
Computer Methods in Applied Mechanics and Engineering, 68:205-227, 1988.
R. Pierre. Optimal selection of the bubble function in the stabilization of the P1-P1 element
for the Stokes problem. SIAM J. Numer. Anal., 32:1210, 1995.
G.F. Pinder and W.G. Gray. Finite Element Simulation in Surface and Subsurface
Hydrology. Academic Press, New York, New York, USA, 1977.
O. Pironneau. On the transport-diffusion algorithm and its application to the
Navier-Stokes equations. Numer. Math., 38:309-332, 1982.
O. Pironneau. Equations aux derivees partielles (analyse numerique). Conditions aux
limites sur la pression pour les equations de Stokes et de Navier-Stokes. C. R. Acad.
Sc. Paris, Ser. I, 303(9):403, 1986.
O. Pironneau, C. Bègue, C. Conca and F. Murat. Problemes Mathematiques de la
Mecanique. A nouveau sur les equations de Stokes et de Navier-Stokes avec des
conditions aux limites sur la pression. C. R. Acad. Sc. Paris, Ser. I, 304(1):23, 1987.
O. Pironneau. Finite Element Methods for Fluids. John Wiley and Sons Ltd, Chichester,
England, UK, 1989.
O. Pironneau, J. Liou and T. Tezduyar. Characteristic-Galerkin and Galerkin/Least-
Squares Space-Time Formulations for the Advection-Diffusion Equation with
Time-Dependent Domains. Comput. Meth. Appl. Mech. Eng., 100:117-141, 1992.
C. Prakash. Examination of the upwind (donor-cell) formulation in control volume finite-
element methods for fluid flow and heat transfer. Numer. Heat Transfer, 11:401-416,
1987.
C. Prakash and B.R. Baliga. Finite Element Analysis in Fluids: Proc. 7th Int. Conf.
Finite Element Methods in Flow Problems. UAH Press, 1989. Chap. "Control-volume-
based numerical methods for fluid flow: similarities and differences"; pp. 397-404;
Huntsville, Alabama, USA, April 3-7, 1989.
H.S. Price, R.S. Varga and J.E. Warren. Application of oscillation matrices to diffusion
convection equations. J. Math. Phys., 45:301-311, 1966.
A. Priestley, K.W. Morton and E. Süli. Stability of the Lagrange-Galerkin method with
non-exact integration. Math. Model. Numer. Anal., 22(4):625-653, 1988.
A. Priestley. A quasi-conservative version of the semi-Lagrangian advection scheme. Mon.
Weather Rev., 121:621-629, February 1993.
A. Priestley. Exact projections and the Lagrange-Galerkin method: A realistic alternative
to quadrature. J. Comput. Phys., 112:316-333, 1994.
A. Prohl. Analysis of Chorin's Projection Method for Solving the Incompressible
Navier-Stokes Equations. Universität Heidelberg, Institut für Angewandte Mathematik,
INF 294, D-69120 Heidelberg, Germany, 1996.
A. Prohl. Projection and Quasi-Compressibility Methods for Solving the Incompressible
Navier-Stokes Equations. Wiley-Teubner, Chichester, England, UK/Germany, 1997.
A. Prohl and R. Rannacher. An analysis of Chorin's projection method for the
incompressible Navier-Stokes equations. Submitted to SIAM J. Numer. Anal., 1997.
R.J. Purser and L.M. Leslie. An efficient semi-Lagrangian scheme using third-order semi-
implicit time integration and forward trajectories. Mon. Weather Rev., 122:745-756,
April 1994.
J. Qin. On the Convergence of Some Low Order Mixed Finite Elements for
Incompressible Fluids. Ph.D. Thesis, The Pennsylvania State University, University
Park, Pennsylvania, USA, 1994.
L. Quartapelle and F. Valz-Gris. Projection conditions on the vorticity in viscous
incompressible flows. Int. J. Numer. Meth. Fluids, 1:129, 1981.
L. Quartapelle. Vorticity conditioning in the computation of two-dimensional viscous
flows. J. Comput. Phys., 40:453, 1981.
K. Radhakrishnan. New integration techniques for chemical kinetic rate equations. I.
Efficiency comparison. Combust. Sci. Technol., 46:59-81, 1986.
K. Radhakrishnan and A.C. Hindmarsh. Description and Use of LSODE, the Livermore
Solver for Ordinary Differential Equations. National Aeronautics and Space
Administration, Washington, D.C., USA, NASA Reference Publication 1327, 1993;
also available as Lawrence Livermore National Laboratory Report UCRL-ID-113855,
Livermore, California.
S. Ramadhyani and S.V. Patankar. Solution of the Poisson equation: Comparison of
the Galerkin and control-volume methods. Int. J. Numer. Meth. Eng., 15:1395-1418,
1980.
S. Ramadhyani and S.V. Patankar. Solution of the convection-diffusion equation by a
finite-element method using quadrilateral elements. Numer. Heat Transfer, 8:595-612,
1985.
J.D. Ramshaw. A method for enforcing the solenoidal condition on magnetic field in
numerical calculations. J. Comput. Phys., 52(3):592-596, 1983.
J.D. Ramshaw. Numerical viscosities of difference schemes. Commun. Numer. Methods
Eng., 10:927-931, 1994.
A. Randriamampianina, P. Bontoux and B. Roux. Ecoulements Induits par la force
gravifique dans une cavite cylindrique en rotation. Int. J. Heat Mass Transfer,
30(7):1275-1292, 1987.
R. Rannacher. Discretization of the heat equation with singular initial data. Z. Angew.
Math. Mech., 62:T346-T348, 1982.
R. Rannacher and R. Scott. Some Optimal Error Estimates for Piecewise Linear Finite
Element Approximations. Math. Comput., 38(158):437-445, 1982.
R. Rannacher. Finite element solution of diffusion problems with irregular data.
Numer. Math., 43:309-327, 1984.
R. Rannacher. Applications of Mathematics in Industry and Technology. B.G. Teubner,
Stuttgart, Germany, 1989. Chap. "Numerical analysis of nonstationary fluid flow (a
survey)"; pp. 34-53; V.C. Boffi and H. Neunzert (Eds.).
R. Rannacher. Navier-Stokes Equations: Theory and Numerical Methods. Springer-
Verlag, Berlin, Germany, 1990. Chap. "On the numerical analysis of the nonstationary
Navier-Stokes equations"; pp. 180-193; J. Heywood et al. (Eds.).
R. Rannacher and S. Turek. Simple nonconforming quadrilateral Stokes element. Numer.
Methods PDE's, 8(2):97-111, 1992.
R. Rannacher. The Navier-Stokes Equations II: Theory and Numerical Methods. Springer-
Verlag, Berlin, Germany, 1992. Chap. "On Chorin's projection method for the
incompressible Navier-Stokes equations"; pp. 167-183; Lecture Notes in Mathematics,
Vol. 1530.
R. Rannacher. On the numerical solution of the incompressible Navier-Stokes equations.
Z. Angew. Math. Mech., 73:203-216, 1993.
R. Rautmann (Ed.). Approximation Methods for Navier-Stokes Problems. Springer-Verlag,
Berlin, Germany, 1979. Proceedings of the Symposium Held by International Union of
Theoretical and Applied Mechanics (IUTAM) at the University of Paderborn, Germany,
September 9-15, 1979.
R. Rautmann, J.G. Heywood, K. Masuda and V.A. Solonnikov (Eds.). The Navier-Stokes
Equations II: Theory and Numerical Methods. Springer-Verlag, Berlin, Germany, 1992.
Proceedings of a Conference held in Oberwolfach, Germany, August 18-24, 1991.
M.J. Raw, G.E. Schneider and V. Hassani. Proc. AIAA 22nd Aerospace Sciences
Meeting, McGraw-Hill, Inc., New York, USA, 1987.
W.H. Raymond and A. Garder. Selective damping in a Galerkin method for solving wave
problems with variable grids. Mon. Weather Rev., 104:1583-1590, Dec 1976.
J.N. Reddy. On the accuracy and existence of solutions to primitive variable models of
viscous incompressible fluids. Int. J. Eng. Sci., 16(12-A):921-929, 1978.
J.N. Reddy. On penalty function methods in the finite-element analysis of flow problems.
Int. J. Numer. Meth. Fluids, 2:151-171, 1982.
J.N. Reddy. An Introduction to the Finite Element Method. McGraw-Hill Book Company,
New York, New York, USA, 1984; also 2nd edition, 1993.
J.N. Reddy. Applied Functional Analysis and Variational Methods in Engineering.
McGraw-Hill, Inc., New York, New York, USA, 1986.
J.N. Reddy, M.P. Reddy and H.U. Akay. Penalty finite element analysis of incompressible
flows using element by element solution algorithms. Comput. Meth. Appl. Mech. Eng.,
100:169-205, 1992.
J.N. Reddy and D.K. Gartling. The Finite Element Method in Heat Transfer and Fluid
Dynamics. CRC Press, Inc., Boca Raton, Florida, USA, 1994.
S.C. Reddy and L.N. Trefethen. Pseudospectra of the convection-diffusion operator.
SIAM J. Appl. Math., 54(6):1634-1649, 1994.
K. Rektorys. Variational Methods in Mathematics, Science and Engineering. D. Reidel
Publishing Company, Dordrecht, The Netherlands, 2nd edition, 1980.
M. Renardy. Imposing 'no' boundary condition at outflow: Why does it work? 1997,
24:413-418.
R.D. Richtmyer and K.W. Morton. Difference Methods for Initial-Value Problems.
Interscience Publishers, a Division of John Wiley and Sons, Inc., New York, New York,
USA, 2nd edition, 1967.
W.J. Rider. Approximate Projection Methods for Incompressible Flow: Implementation,
Variants and Robustness. Los Alamos National Laboratory, Los Alamos, New Mexico,
USA, 1994, Technical Report LA-UR-2000.
P.J. Roache. Computational Fluid Dynamics. Hermosa Publishers, Albuquerque, New
Mexico, USA, 1982.
P.J. Roache. A flux-based modified method of characteristics. Int. J. Numer. Meth. Fluids,
15:1259-1275, 1992a.
P.J. Roache. Proc. Computational Methods in Water Resources, Vol. I: Numerical Methods
in Water Resources. Chap. "Validation exercises of a one-dimensional flux-based
modified method of characteristics"; pp. 69-76; T.F. Russell et al. (Eds.); June 9-12, 1992b.
P.J. Roache and P.M. Knupp. Completed Richardson extrapolation. Commun. Numer.
Methods Eng., 9:365-374, 1993.
R.S. Rogallo and P. Moin. Annual Review of Fluid Mechanics. Annual Reviews Inc., Palo
Alto, California, USA. Vol. 16, 1984. Chap. "Numerical simulation of turbulent flows";
pp. 99-137.
E.M. Ronquist. Convection Treatment Using Spectral Elements of Different Order. Int. J.
Numer. Meth. Fluids, 22:241-264, 1996.
R.B. Rood. Numerical advection algorithms and their role in atmospheric transport and
chemistry models. Rev. Geophys., 25(1):71-100, 1987.
M. Rosenfeld. Uncoupled temporally second-order accurate implicit solver of
incompressible Navier-Stokes equations. AIAA Journal, 34(9):1829, September 1996.
J.E. Rowley and P.M. Gresho. Proc. Sixth IMACS Int. Symp. Computer Methods for
PDE's. Chap. "Some New Results Using Quadratic Finite Elements for Pure
Advection"; pp. 202-209; Lehigh University, Bethlehem, Pennsylvania, June 23-27,
1987; also Lawrence Livermore National Laboratory, Livermore, California, Report
UCRL-96615 (May, 1987).
T.F. Russell. Proc. Seventh Int. Conf. Finite Element Meth. in Flow Problems. UAH Press,
Huntsville, Alabama, USA, 1989; p. 538; Huntsville, Alabama, USA, 1989.
T.F. Russell and R.V. Trujillo. Computational Methods in Surface Hydrology,
Proc. Eighth Int. Conf. Computational Methods in Water Resources. Computational
Mechanics Publications, Southampton, England, UK, 1990. Chap. "Eulerian-Lagrangian
localized adjoint methods with variable coefficients in multiple dimensions";
pp. 357-363; G. Gambolati, A. Rinaldo, C.A. Brebbia, W.G. Gray and G.F. Pinder
(Eds.).
T.F. Russell, R.E. Ewing and L.C. Young. An anisotropic coarse-grid dispersion model of
heterogeneity and viscous fingering in five-spot miscible displacement that matches
experiments and fine-grid simulations. 1989.
T.F. Russell. Numerical Analysis 1989. Longman Science and Technical, Harlow,
England, UK. Pitman research notes in mathematics series, Vol. 228, 1990.
Chap. "Eulerian-Lagrangian Localized Adjoint Methods for Advection-Dominated
Problems"; pp. 206-228; D.F. Griffiths and G.A. Watson (Eds.).
R.L. Sani, P.M. Gresho, R.L. Lee and D.F. Griffiths. On the cause and cure (?) of
the spurious pressures generated by certain FEM solutions of the incompressible
Navier-Stokes equations: Parts 1 and 2, Int. J. Numer. Meth. Fluids, 1:17-43 for
Part 1, 171-204 for Part 2, 1981a.
R.L. Sani, B.E. Eaton, P.M. Gresho, R.L. Lee, and S.T. Chan. Proc. 2nd Num. Meth.
Laminar and Turbulent Flow. Pineridge Press Ltd, Swansea, Wales, UK, 1981b.
Chap. "On the solutions of the time-dependent incompressible Navier-Stokes equations
via a Penalty Galerkin finite element method"; pp. 41-51; C. Taylor and B.A. Schrefler
(Eds.); Venice, Italy.
R.L. Sani, B.E. Eaton, P.M. Gresho, C.D. Upson and M.S. Engelman. Proc. 5th Int. Symp.
Finite Elements in Flow Problems. TICOM, USA, 1984. Chap. "On outflow boundary
conditions for stratified and/or rotating flows"; pp. 85-90; Austin, Texas, USA, January
1984.
R.L. Sani and P.M. Gresho. Resume and remarks on the open boundary condition
minisymposium. Int. J. Numer. Meth. Fluids, 18:983-1008, 1994.
J.M. Sanz-Serna and L. Abia. Interpolation of the coefficients in nonlinear elliptic
Galerkin procedures. SIAM J. Numer. Anal., 21(1):77-83, 1984.
J.M. Sanz-Serna. Studies in numerical nonlinear instability I. Why do leapfrog schemes
go unstable? SIAM J. Sci. Stat. Comput., 6(4):923-938, 1985.
T. Sarpkaya and M. Isaacson. Mechanics of Wave Forces on Offshore Structures. Van
Nostrand Reinhold Company, New York, New York, USA, 1981.
T. Sarpkaya. Brief reviews of some time-dependent flows. J. Fluids Eng., 114:283,
September 1992.
T. Sarpkaya. Unsteady Flows. John Wiley and Sons, New York, New York, USA, 1996, in
Handbook of Fluid Dynamics and Fluid Machinery; Joseph A. Schetz and Allen E. Fuhs
(Eds.).
Y.K. Sasaki and J.N. Reddy. A comparison of stability and accuracy of some
numerical models of two-dimensional circulation. Int. J. Numer. Meth. Eng., 16:149-170,
1980.
A.L. Schafer-Perini and J.L. Wilson. Efficient and accurate front tracking for two-
dimensional groundwater flow models. Water Resour. Res., 27(7):1471-1485, 1991.
H. Schlichting. Boundary Layer Theory. McGraw-Hill, New York, New York, USA, 1979.
G.E. Schneider and M.J. Raw. A skewed, positive influence coefficient upwinding
procedure for control-volume-based finite-element convection-diffusion computation.
Numer. Heat Transfer, 9:1-26, 1986.
H.L. Schreyer. Dispersion of Semidiscretized and Fully Discretized Systems. North-
Holland, Amsterdam, The Netherlands, 1983. Chap. 6 in Computational Methods for
Transient Analysis; T. Belytschko and T.J.R. Hughes (Eds.).
M. Schumack, W. Schultz and J. Boyd. Spectral method of the Stokes equations on
nonstaggered grids. J. Comput. Phys., 94:30, 1991.
L.L. Schumaker. Spline Functions. Wiley-Interscience, New York, USA, 1981.
J.A. Schutt. ZEPHYR 30: A Finite Difference Computer Program for 3D, Transient
Incompressible Flow Problems. Sandia National Laboratories, USA, SAND 91-0350-UC-705,
1991.
C. Schwab. Proc. Int. Conf. Finite Elements in Fluids: New Trends and Applications. Chap.
"Remarks on pressure approximation in projection methods for viscous incompressible
flow"; Venezia, Italia, October 15-21, 1995.
A. Schüller. Numerical Treatment of the Navier-Stokes Equations. Proc. Fifth GAMM-
Seminar. Friedr. Vieweg und Sohn, Braunschweig, Germany, 1990. Chap. "A Multigrid
algorithm for the incompressible Navier-Stokes equations"; pp. 124-133; Series: Notes
on Numerical Fluid Mechanics, Vol. 30; W. Hackbusch and R. Rannacher (Eds.).
A. Segal. Aspects of numerical methods for elliptic singular perturbation problems. SIAM
J. Sci. Stat. Comput., 3(3):327-349, 1982.
A. Segal, P. Wesseling, J. Van Kan, C.W. Oosterlee and K. Kassels. Invariant
discretization of the incompressible Navier-Stokes equations in boundary fitted co-ordinates.
Int. J. Numer. Meth. Fluids, 15:411-426, 1992.
G. Segal, K. Vuik and K. Kassels. On the implementation of symmetric and antisymmetric
periodic boundary conditions for incompressible flow. Int. J. Numer. Meth. Fluids,
18:1153-1165, 1994.
V. Selmin, J. Donea and L. Quartapelle. Finite element methods for nonlinear advection.
Comput. Meth. Appl. Mech. Eng., 817-845, 1985.
K. Sepehrnoori and G.F. Carey. Numerical integration of semidiscrete evolution systems.
Comput. Meth. Appl. Mech. Eng., 27:45-61, 1981.
J. Serrin. Handbuch der Physik. Springer-Verlag, Berlin, Germany, 1959. Chap.
"Mathematical principles of classical fluid mechanics"; pp. 125-263.
M.J. Sewell. Maximum and Minimum Principles: A Unified Approach, with Applications.
Cambridge University Press, Cambridge, England, UK, 1987. Series: Cambridge Texts
in Applied Mathematics.
F. Shakib and T.J.R. Hughes. A new finite element formulation for computational
fluid dynamics: IX. Fourier analysis of space-time Galerkin/least-squares algorithms.
Comput. Meth. Appl. Mech. Eng., 87:35-58, 1991.
L.F. Shampine and M.K. Gordon. Computer Solution of Ordinary Differential Equations:
The Initial Value Problem. W.H. Freeman and Company, San Francisco, California,
USA, 1975.
L.F. Shampine and C.W. Gear. A user's view of solving stiff ordinary differential
equations. SIAM Rev., 21(1), 1979.
J. Shen. On error estimates of some higher order projection and penalty-projection
methods for Navier-Stokes equations. Numer. Math., 62:49-73, 1992.
J. Shen. A remark on the projection-3 method. Int. J. Numer. Meth. Fluids, 16:249-253,
1993.
J. Shen. On error estimates of the projection methods for the Navier-Stokes equations:
Second-order schemes. Math. Comput., 65(215):1039-1065, 1996.
T.-M. Shih and P.M. Gresho. Pressure Modes for Galerkin Finite Element Method Using
Equal-Interpolation Bilinear Elements. Lawrence Livermore National Laboratory,
Livermore, California, USA, UCRL-92045, extended abstract, 1985.
D. Shin and J.C. Strikwerda. Inf-sup conditions for finite difference approximations of
the Stokes equations. J. Austral. Math. Soc., Series B, 39, August 1997.
P.J. Shopov, P.D. Minev and I.B. Bazhlekov. Numerical method for unsteady viscous
hydrodynamical problem with free boundaries. Int. J. Numer. Meth. Fluids,
14:681-705, 1992.
P.J. Shopov and Y.I. Iordanov. Numerical solution of Stokes equations with pressure and
filtration boundary conditions. J. Comput. Phys., 112(1):12-23, 1994.
G.R. Shubin and J.B. Bell. An analysis of the grid orientation effect in numerical
simulation of miscible displacement. Comput. Meth. Appl. Mech. Eng., 47:47-71, 1984.
W.J. Silliman and L.E. Scriven. Separating flow near a static contact line: Slip at a wall
and shape of a free surface. J. Comput. Phys., 34:287-313, 1980.
D. Silvester and N. Kechkar. Stabilised bilinear-constant velocity-pressure finite elements
for the conjugate gradient solution of the Stokes problem. Comput. Meth. Appl. Mech.
Eng., 79:71-86, 1990.
D. Silvester. Optimal low order finite element methods for incompressible flow. Comput.
Meth. Appl. Mech. Eng., 111:357-368, 1994.
D. Silvester and A. Wathen. Fast & robust solvers for time-discretised incompressible
Navier-Stokes equations, in Numerical Analysis 1995. Pitman Research Notes in
Mathematics Series, 1996; D.F. Griffiths (Ed.).
J.C. Simo and F. Armero. Unconditional stability and long-term behavior of transient
algorithms for the incompressible Navier-Stokes and Euler equations. Comput. Meth.
Appl. Mech. Eng., 111:111-154, 1994.
J.C. Simo, F. Armero and C.A. Taylor. Stable and time-dissipative finite element methods
for the incompressible Navier-Stokes equations in advection dominated flows. Int. J.
Numer. Meth. Eng., 38:1475-1506, 1995.
R.F. Sincovec. Some Projection Methods in Atmospheric Simulation. Lawrence Livermore
National Laboratory, Livermore, California, USA, UCID-16186 edition, 1972.
F. Singer. Ozone, skin cancer, and the SST. Aerosp. Am., pages 22-26, July 1994.
R.M. Smith. Finite element solutions of the energy equation at high Peclet number.
Comput. Fluids, 8:335-350, 1980.
R.M. Smith. The Current Status of Turbulence Modelling in the Fluid Flow Code
FEATT. Central Electricity Generating Board, Berkeley Nuclear Laboratories, Berkeley,
Gloucestershire, England, UK, TPRD/B/0591/N85, 1985.
P.K. Smolarkiewicz and P.J. Rasch. Monotone advection on the sphere: An Eulerian
versus semi-Lagrangian approach. J. Atmos. Sci., 48(6):793-810, 1991.
P.K. Smolarkiewicz and J.A. Pudykiewicz. A class of semi-Lagrangian approximations
for fluids. J. Atmos. Sci., 49(22):2082-2096, 1992.
C. Smutek, P. Bontoux, B. Roux, G.H. Schiroky, A.C. Hurford, F. Rosenberger and
G. de Vahl Davis. Three-dimensional convection in horizontal cylinders: Numerical
solutions and comparison with experimental and analytical results. Numerical Heat
Transfer, 8:613-631, 1985.
G. Sottas. Rational Runge-Kutta methods are not suitable for stiff systems of ODEs.
J. Comput. Appl. Math., 10:169-174, 1984.
A. Soulaimani, M. Fortin, Y. Ouellet, G. Dhatt and F. Bertrand. Simple continuous
pressure elements for two-and three-dimensional incompressible flows. Comput. Meth.
Appl. Mech. Eng., 62:47-69, 1987.
I. Stakgold. Green's Functions and Boundary Value Problems. John Wiley and Sons, Inc.,
New York, New York, USA, 1979.
A. Staniforth and J. Cote. Semi-Lagrangian integration schemes for atmospheric models-a
review. Mon. Weather Rev., 119(9):2206-2223, 1991.
R. Stenberg. Analysis of mixed finite element methods for the Stokes problem: A unified
approach. Math. Comput., 42(165):9-23, 1984.
R. Stenberg and M. Suri. Mixed hp finite element methods for problems in elasticity and
Stokes flow. Technical Report 18, Helsinki University of Technology, 1994.
A.B. Stephens, J.B. Bell, J.M. Solomon and L.B. Hackerman. A finite difference Galerkin
formulation of the incompressible Navier-Stokes equations. J. Comput. Phys.,
53:152-172, 1984.
W.N.R. Stevens. Finite element stream function vorticity solution of steady laminar
natural convection. Int. J. Numer. Meth. Fluids, 2:349, 1982.
G. Strang and G.J. Fix. An Analysis of the Finite Element Method. Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey, USA, 1973.
G. Strang. Linear Algebra and Its Applications. Academic Press, New York, New York,
USA, 1976.
G. Strang. Introduction to Applied Mathematics. Wellesley-Cambridge Press, Wellesley,
Massachusetts, USA, 1986.
G. Strang. A framework for equilibrium equations. SIAM Rev., 30(2):283-297, 1988.
J.C. Strikwerda. Finite difference methods for the Stokes and Navier-Stokes equations.
SIAM J. Sci. Stat. Comput., 5(1):56-68, 1984.
J.C. Strikwerda. Finite Difference Schemes and Partial Differential Equations. Wadsworth
and Brooks/Cole, Pacific Grove, California, USA, 1989.
C.R. Swaminathan and V.R. Voller. Streamline upwind scheme for control-volume finite
elements, Part I. formulations. Numer. Heat Transfer, Part B, 22:95-107, 1992a.
C.R. Swaminathan and V.R. Voller. Streamline upwind scheme for control-volume finite
elements, Part II. implementation and comparison with the SUPG finite-element
scheme. Numer. Heat Transfer, Part B, 22:109-124, 1992b.
B. Swartz and B. Wendroff. Generalized finite-difference schemes. Math. Comput.,
23:37-49, 1969.
B. Swartz and B. Wendroff. The relative efficiency of finite difference and finite element
methods. I: Hyperbolic problems and splines. SIAM J. Numer. Anal., 11(5):979-993,
1974.
E. Süli. Convergence and nonlinear stability of the Lagrange-Galerkin method for the
Navier-Stokes equations. Numer. Math., 53:459-483, 1988.
E. Süli and A.F. Ware. The Spectral Lagrange-Galerkin Method for the Navier-Stokes
Equations: Convergence and Non-Linear Stability. Oxford University Computing
Laboratory, Numerical Analysis Group, 11 Keble Road, Oxford, England OX1 3QD, 89/10
(November 1989) edition, 1989.
E. Süli. The Mathematics of Finite Elements and Applications VII. Academic Press
Ltd, 1991. Chap. "The accuracy of finite volume methods on distorted partitions";
J. Whiteman (Ed.).
M. Tabata and K. Itakura. Precise Computation of Drag Coefficients of the Sphere.
Hiroshima University, Higashi-Hiroshima, 739 Japan, INSAM report no. 12 (95-07)
edition, 1995. This paper is also scheduled to appear in Int. J. Numer. Meth. Fluids,
from the Proceedings of the 3rd. US-Japan Symposium on Large-Scale FEM in CFD.
L.Q. Tang and T.T.H. Tsang. A least-squares finite element method for time-dependent
incompressible flows with thermal convection. Int. J. Numer. Meth. Fluids, 17:271-289,
1993.
L.Q. Tang, T. Cheng, and T.T.H. Tsang. Transient solutions for three-dimensional lid-
driven cavity flows by a least-squares finite element method. Int. J. Numer. Meth.
Fluids, 21:413-432, 1995.
L.Q. Tang, J.L. Wright and T.T.H. Tsang. Simulations of 2-D and 3-D Thermocapillary
Flows by a Least-Squares Finite Element Method. Int. J. Numer. Meth. Fluids, in press.
E.Y. Tau. A second-order projection method for the incompressible Navier-Stokes
equations in arbitrary domains. J. Comput. Phys., 115:147-152, 1994.
C. Taylor and P. Hood. A numerical solution of the Navier-Stokes equations using the
finite element technique. Comput. Fluids, 1:73-100, 1973.
C. Taylor, J. Rance and J.O. Midwell. A note on the imposition of traction boundary
conditions when using the FEM for solving incompressible flow problems. Comm.
Applied Num. Meth., 1:113-121, 1985.
G.I. Taylor. On the Decay of Vortices in a Viscous Fluid. Phil. Mag. S. 6, 46(271), Oct.
1923.
D.P. Telionis. Unsteady Viscous Flows. Springer-Verlag, New York, New York, USA,
1989.
R. Temam. Analyse Mathematique. Sur l'approximation des solutions des equations de
Navier-Stokes. C. R. Acad. Sci. Paris, Ser. A, 262:219-221, 24 January 1966.
R. Temam. Une methode d'approximation de la solution des equations de Navier-Stokes.
Bull. Soc. Math. France, 1968:115-152, 1968.
R. Temam. Sur l'approximation de la solution des equations de Navier-Stokes par la
methode des pas fractionnaires I. Arch. Rat. Mech. Anal., 32(2):135, 1969a.
R. Temam. Sur l'approximation de la solution des equations de Navier-Stokes par la
methode des pas fractionnaires II. Arch. Rat. Mech. Anal., 33:377, 1969b.
R. Temam. AGARD Lecture Series No. 48: Numerical Methods in Fluid Dynamics. North
Atlantic Treaty Organization (NATO) Advisory Group for Aerospace Research and
Development (AGARD), 1972. Chap. "Approximation of Navier-Stokes equations";
pp. 22-27; J.J. Smolderen (Ed.).
R. Temam. Behaviour at time t = 0 of the solutions of semi-linear evolution equations.
J. Diff. Equations, 43:73-92, 1982.
R. Temam. Navier-Stokes Equations. North-Holland, Amsterdam, The Netherlands, 3rd
edition, 1984.
T.E. Tezduyar, R. Glowinski and J. Liou. Petrov-Galerkin methods on multiply-
connected domains for the vorticity-stream function formulation of the incompressible
Navier-Stokes equations. Int. J. Numer. Meth. Fluids, 8:1269-1290, 1988.
T.E. Tezduyar and J. Liou. Computation of spatially periodic flows based on the
vorticity-stream function formulation. Comput. Meth. Appl. Mech. Eng., 83:121-142,
1990.
T.E. Tezduyar and J. Liou. On the downstream boundary conditions for the
vorticity-stream function formulation of two-dimensional incompressible flows.
Comput. Meth. Appl. Mech. Eng., 85:207-217, 1991.
T.E. Tezduyar. Advances in Applied Mechanics. Academic Press, Inc., Boston,
Massachusetts, USA. Vol. 28, 1992. Chap. 1, "Stabilized finite element formulations for
incompressible flow computations"; pp. 1-45; J.W. Hutchinson and T.Y. Wu (Eds.).
T.E. Tezduyar, M. Behr and J. Liou. A new strategy for finite element computations
involving moving boundaries and interfaces—the deforming-spatial-domain/space-time
procedure: I. the concept and the preliminary numerical tests. Comput. Meth.
Appl. Mech. Eng., 94:339-351, 1992a.
T.E. Tezduyar, M. Behr, S. Mittal, and J. Liou. A new strategy for finite element
computations involving moving boundaries and interfaces—the deforming-spatial-
domain/space-time procedure: II. computation of free-surface flows, two-liquid flows,
and flows with drifting cylinders. Comput. Meth. Appl. Mech. Eng., 94:353-371,
1992b.
R.W. Thatcher. Locally mass-conserving Taylor-Hood elements for two- and three-
dimensional flow. Int. J. Numer. Meth. Fluids, 11:341-353, 1990.
R.W. Thatcher. Incompressible Computational Fluid Dynamics Trends and Advances.
Cambridge University Press, Cambridge, England, UK, 1993. Chap. 13, "The
finite element method for three dimensional incompressible flow"; pp. 427-445;
M.D. Gunzburger and R.A. Nicolaides (Eds.).
A. Thess. Instabilities in two-dimensional spatially periodic flows, Part II: Square eddy
lattice. Phys. Fluids, 4(7), July 1992.
F. Thomasset. Implementation of Finite Element Methods for Navier-Stokes Equations.
Springer-Verlag, Inc., New York, New York, USA, 1981. Series: Springer Series
in Computational Physics; W. Beiglbock, H. Cabannes, H.B. Keller, J. Killeen and
S.A. Orszag (Eds.).
E.G. Thompson, L.R. Mack and F.-S. Lin. Finite Element Method for Incompressible Slow
Viscous Flow with a Free Surface, in Developments in Mechanics, The Iowa State
University Press, Ames, Iowa, USA, Vol. 5, 1969. p. 93; H.J. Weiss, D.F. Young,
W.F. Riley and T.R. Rogge (Eds.).
E.G. Thompson and M.I. Haque. A high order finite element for completely
incompressible creeping flow. Int. J. Numer. Meth. Eng., 6:315-321, 1973.
E.G. Thompson. Average and complete incompressibility in the finite element method.
Int. J. Numer. Meth. Eng., 9:925-932, 1975.
H.D. Thompson, B.W. Webb and J.D. Hoffman. The cell Reynolds number myth. Int. J.
Numer. Meth. Fluids, 5:305-310, 1985.
J.F. Thompson. Convection schemes for use with curvilinear coordinate systems—a
survey. Mississippi State University Department of Aerospace Engineering, June 1984.
V. Thomee. Galerkin Finite Element Methods for Parabolic Problems. Springer-Verlag, Berlin,
Germany. Lecture notes in mathematics, vol. 1054, 1984.
E.A. Thornton. Finite Element Flow Analysis. University of Tokyo Press, Tokyo, Japan,
1982. Chap. "Computation of consistent boundary quantities in finite element thermal-
fluid solutions"; pp. 263-270; T. Kawai (Ed.).
D.M. Tidd, R.W. Thatcher and A. Kaye. The free surface flow of Newtonian and non-
Newtonian fluids trapped by surface tension. International Journal for Numerical
Methods in Fluids, 8:1011-1027, 1988.
L.J.P. Timmermans, F.N. Van De Vosse, and P.D. Minev. Taylor-Galerkin-based
spectral element methods for convection-diffusion problems. Int. J. Numer. Meth. Fluids,
18:853-870, 1994.
L.J.P. Timmermans, P.D. Minev, and F.N. Van De Vosse. An approximate projection
scheme for incompressible flow using spectral elements. Int. J. Numer. Meth. Fluids,
22:673, 1996.
A.F.B. Tompson and L.W. Gelhar. Numerical simulation of solute transport in
three-dimensional, randomly heterogeneous porous media. Water Resour. Res.,
26(10):2541-2562, 1990.
L.N. Trefethen. Group velocity in finite difference schemes. SIAM Rev., 24(2):113-136,
1982.
L.N. Trefethen. Numerical Analysis 1991: Proc. 14th Dundee Conf., June 1991. Longman
Scientific and Technical, with John Wiley and Sons, Inc., New York, New
York, USA, 1991. Chap. "Pseudospectra of matrices"; pp. 234; D.F. Griffiths and
G.A. Watson (Eds.).
L.N. Trefethen, A.E. Trefethen, S.C. Reddy and T.A. Driscoll. Hydrodynamic stability
without eigenvalues. Science, 261:578, July 1993.
L.N. Trefethen. ICIAM'95: Proc. Third Int. Congress on Industrial and Applied
Mathematics. Akademie-Verlag, Berlin, Germany, 1995. Chap. "Pseudospectra of linear
operators".
C.J. Tremback, W.R. Cotton, J. Powell, and R.A. Pielke. The forward-in-time upstream
advection scheme: Extension to higher orders. Mon. Weather Rev., 115:540-555,
February 1987.
D.J. Tritton. Physical Fluid Dynamics. Van Nostrand Reinhold Company, New York, New
York, USA.
S. Turek. Tools for simulating non-stationary incompressible flow via discretely
divergence-free finite element models. Int. J. Numer. Meth. Fluids, 18:71-105,
1994.
S. Turek. A comparative study of some time-stepping techniques for the
incompressible Navier-Stokes equations: From fully implicit nonlinear schemes to semi-implicit
projection methods. Int. J. Numer. Meth. Fluids, 22(10):987-1012, 1996.
S. Turek. On discrete projection methods for the incompressible Navier-Stokes equations:
An algorithmical approach. Comput. Meth. Appl. Mech. Eng., 143:271-288, 1997.
C.D. Upson, P.M. Gresho, R.L. Sani, S.T. Chan and R.L. Lee. Numerical Properties and
Methodologies in Heat Transfer: Proc. 2nd Nat. Symp. Hemisphere Publishing Corp.,
1983. Chap. "A Thermal Convection Simulation in Three Dimensions by a Modified
Finite Element Method"; pp. 245-259; T. Shih (Ed.).
D.T. Valentine and A.G. Mohamed. Taylor's vortex array: a new test problem
for Navier-Stokes solution procedures, in Solution of Superlarge Problems in
Computational Mechanics, Plenum, New York, New York, USA, 1989,
pp. 167-181; J. Kane and A. Carlson (Eds.).
D.T. Valentine. Decay of confined, two-dimensional spatially periodic arrays of vortices:
A numerical investigation. Int. J. Numer. Meth. Fluids, 21:155-180, 1995.
J. Van Kan. A second-order accurate pressure-correction scheme for viscous
incompressible flow. SIAM J. Sci. Stat. Comput., 7(3):870-891, 1986.
M. Van Dyke. An Album of Fluid Motion. The Parabolic Press, Stanford, California, USA,
1982.
John Milton van Dyke, J. V. Wehausen and L. Lumley. Annual review of fluid mechanics.
Annual Reviews Inc., Palo Alto, California, USA; Vol. 16, 1984.
R.S. Varga. Matrix Iterative Analysis. Prentice-Hall, Inc., Englewood Cliffs, New Jersey,
USA, 1962.
R.S. Varga, H.S. Price and J.E. Warren. Application of oscillation matrices to diffusion
convection equations. J. Math. Phys., 45:301-311, 1966.
A.E.P. Veldman. 'Missing' boundary conditions? Discretize first, substitute next, and
combine later. SIAM J. Sci. Stat. Comput., 11(1):82-91, 1990.
A.E.P. Veldman and K. Rinzema. Playing with nonuniform grids. J. Eng. Math.,
26:119-130, 1992.
D. Veyret, P. Gresho and R. Sani, 1998 (in preparation).
R. Vichnevetsky and F. De Schutter. A Frequency Analysis of Finite Difference and Finite
Element Methods for Initial Value Problems. Proceedings of the AICA International
Symposium on Computer Methods for Partial Differential Equations held at Lehigh
University, Pennsylvania, 1975.
R. Vichnevetsky and J.B. Bowles. Fourier Analysis of Numerical Approximations of
Hyperbolic Equations. Society for Industrial and Applied Mathematics, Philadelphia,
Pennsylvania, USA, 1982.
R. Vichnevetsky. Propagation and spurious reflection in finite-element approximations of
hyperbolic equations. Comput. Math. Appl., 11(7/8):733-746, 1985.
R. Vichnevetsky. Wave propagation analysis of difference schemes for hyperbolic
equations: A review. Int. J. Numer. Meth. Fluids, 7:409-452, 1987.
C. Vincent. The influence of the stabilization parameter on the convergence factor of
iterative methods for the solution of the discretized Stokes problem. Int. J. Numer.
Methods Fluids, 20:1237-1252, 1995.
S. Vogel. Life in Moving Fluids: The Physical Biology of Flow. Princeton University Press,
Princeton, New Jersey, USA; 2nd edition, 1994.
L.B. Wahlbin. A remark on parabolic smoothing and the finite element method. SIAM
J. Numer. Anal., 17(1):33-38, 1980.
R. Wait and A.R. Mitchell. Finite Element Analysis and Applications. John Wiley & Sons,
Chichester, UK, 1985.
O. Walsh. The Navier-Stokes Equations II—Theory and Numerical Methods: Conference
Proceedings. Springer-Verlag, Berlin, Germany, 1991. Chap. "Eddy solutions
of the Navier-Stokes equations"; J.G. Heywood, K. Masuda, R. Rautmann and
V.A. Solonnikov (Eds.); Oberwolfach, Germany, August 18-24, 1991.
C.-Y. Wang. The flow past a circular cylinder which is started impulsively from rest. J.
Math. Phys., XLVI(2):195, 1967.
C.-Y. Wang. A note on the drag of an impulsively started circular cylinder. J. Math. Phys.,
47:451, 1968.
A. Wambecq. Rational Runge-Kutta methods for solving systems of ordinary differential
equations. Computing, 20:333-342, 1978.
R.F. Warming and B.J. Hyett. The modified equation approach to the stability and
accuracy analysis of finite-difference methods. J. Comput. Phys., 14:159-179, 1974.
R.F. Warming and R.M. Beam. Proc. 1979 SIGNUM Meeting on Numerical ODEs. 1979a.
Chap. "Factored, A-Stable, linear multistep methods—an alternative to the method of
lines for multidimensions"; Champaign, Illinois, April 3-5, 1979; also available from
Computational Fluid Dynamics Branch, Ames Research Center, NASA, Moffett Field,
California 94035, USA.
R.F. Warming and R.M. Beam. An extension of A-stability to alternating direction implicit
methods. BIT, pages 395-417, 1979b.
A.J. Wathen. On relaxation of Jacobi iteration for consistent and generalized mass
matrices. Commun. Appl. Numer. Methods, 7:93-102, 1991.
J.E. Welch and F.H. Harlow. The MAC Method: A Computing Technique for Solving
Viscous, Incompressible, Transient Fluid-Flow Problems Involving Free Surfaces. Los
Alamos Scientific Laboratory, Los Alamos, New Mexico, USA, LA-3425 edition, 1965.
J. Wheeler. Simulation of heat transfer from a warm pipeline buried in permafrost. 74th
Nat. Mtg. Am. Inst. Chem. Engrg.; New Orleans, Louisiana, USA, 1973.
J.A. Wheeler. Permafrost Thermal Design for the Trans-Alaska Pipeline, in Moving
Boundary Problems, Academic Press, New York, New York, USA, 1978. p. 267;
D.G. Wilson, A.D. Solomon and P.T. Boggs (Eds.).
M.F. Wheeler. A Galerkin procedure for estimating the flux for a two point boundary
problem. SIAM J. Numer. Anal., 11:764, 1974.
J.R. Whiteman. The Mathematics of Finite Elements and Applications II: MAFELAP 1975.
Academic Press, London, England, UK, 1976.
J.R. Whiteman. The Mathematics of Finite Elements and Applications IV: MAFELAP 1981.
Academic Press, London, England, UK, 1982.
G.B. Whitham. Linear and Nonlinear Waves. John Wiley and Sons, Inc., New York, New
York, USA, 1974.
J.H. Wilkinson. Recent Advances in Numerical Analysis. Academic Press, Inc., New York,
New York, USA, 1978. Chap. "Linear differential equations and Kronecker's canonical
form"; pp. 231-265; C. de Boor and G. Golub (Eds.).
P.T. Williams and A.J. Baker. Incompressible Computational Fluid Dynamics and the
Continuity Constraint Method for the Three-Dimensional Navier-Stokes equations.
Numer. Heat Trans., 29(2):137, 1996.
A.M. Winslow. Internal Memorandum. Lawrence Livermore National Laboratory,
Livermore, California, USA, 1967.
D. Winterscheidt and K.S. Surana. p-version least squares finite element formulation for
two-dimensional, incompressible fluid flow. Int. J. Numer. Meth. Fluids, 18(1):43-70,
1994.
W.L. Wood. Practical Time-Stepping Schemes. Oxford University Press, New York, New
York, USA, 1990. Series: Oxford Applied Mathematics and Computing Science
Series.
J. Wu. Wave equation model for solving advection-diffusion equation. Int. J. Numer.
Meth. Eng., 37(16):2717-2734, 1994.
J.-Z. Wu and J.-M. Wu. Interactions between a solid surface and a viscous compressible
flow field. J. Fluid Mech., 254:183-211, 1993.
J.-Z. Wu. A theory of three-dimensional interfacial vorticity dynamics. Phys. Fluids,
7(10):2375-2395, 1995.
J.Z. Wu and J.M. Wu. Advances in Applied Mechanics. Academic Press, New York, New
York, USA, 1996. Chap. "Vorticity dynamics on boundaries"; Vol. 32.
X.H. Wu, J.Z. Wu and J.M. Wu. Effective vorticity-velocity formulations for three-
dimensional incompressible viscous flows. J. Comput. Phys., 122:68-82, 1995.
M.G. Wurtele. On the problem of truncation error. Tellus XIII, 3:379-391, 1961.
J.C. Wyngaard, W.D. Bach Jr., S. Burk, W.R. Cotton, J.H. Ferziger, S.R. Hanna, P. Moin,
W. Ohmstede and J.C. Weil. Large-Eddy Simulation: Guidelines for Its Application to
Planetary Boundary Layer Research. Michaels Communications, Boulder, Colorado,
USA, 1984. Final Report from The Working Group on Large-Eddy Simulation;
J.C. Wyngaard (Ed.).
M. Yao and D.S. Malkus. Boundary node correction and superconvergence in the FEM.
Int. J. Numer. Meth. Fluids, 10:713-721, 1990.
H.C. Yee, P.K. Sweby, and D.F. Griffiths. Dynamical approach study of spurious steady-
state numerical solutions for nonlinear differential equations. Part 1: The Dynamics of
Time Discretizations and Its Implications for Algorithm Development in Computational
Fluid Dynamics. J. Comput. Phys., 97:249-310, 1991.
H.C. Yee and P.K. Sweby. Global asymptotic behavior of some iterative implicit schemes.
Int. J. Bifur. Chaos, 4(6):1579-1611, 1994.
H.C. Yee and P.K. Sweby. Dynamical approach study of spurious steady-state numerical
solutions for nonlinear differential equations. Part II: The dynamics of numerics
of systems of 2 x 2 ODEs and its connections to finite discretizations of PDEs.
Int. J. Comput. Fluid Dyn., 4:219-283, 1995a.
H.C. Yee and P.K. Sweby. Proc. Conf. Numerical Methods for the Euler and
Navier-Stokes Equations. 1995b. Chap. "On super-stable implicit methods and
time-marching approaches"; Montreal, Canada, September 14-16, 1995; to appear
Int. J. CFD; also RIACS Technical Report 95.12 (July, 1995).
H.C. Yee and P.K. Sweby. Nonlinear Dynamics & Numerical Uncertainties in CFD.
National Aeronautics and Space Administration, USA, April 1996. NASA Technical
Memorandum 110398; also submitted to J. Comput. Phys.
S.T. Zalesak. Fully multidimensional flux-corrected transport algorithms for fluids. J.
Comput. Phys., 31:335-362, 1979.
M.M. Zdravkovich. Flow Around Circular Cylinders. Oxford University Press, Oxford,
England, UK. Vol. 1, 1997.
L. Zhang and L. Li. On superconvergence of isoparametric bilinear finite elements.
Commun. Numer. Meth. Eng., 12:849-862, 1996.
J.Z. Zhu and O.C. Zienkiewicz. Adaptive techniques in the finite element method.
Commun. Appl. Numer. Methods, 4:197-204, 1988.
O. Zienkiewicz and J. Wu. Incompressibility without tears—how to avoid restrictions of
mixed formulations. Int. J. Numer. Meth. Eng., 32:1189-1203, 1991.
O. Zienkiewicz and J. Wu. A general explicit or semi-explicit algorithm for compressible
and incompressible flows. Int. J. Numer. Methods Eng., 35:457-479, 1992.
O.C. Zienkiewicz and P.N. Godbole. Finite Elements in Fluids, Vol. I: Viscous Flow and
Hydrodynamics. John Wiley and Sons, Ltd, London, England, UK, 1975. Chap. 2,
"Viscous, incompressible flow with special reference to non-Newtonian (plastic)
fluids"; pp. 25-55; R.H. Gallagher, J.T. Oden, C. Taylor and O.C. Zienkiewicz (Eds.).
O.C. Zienkiewicz and K. Morgan. Finite Elements and Approximation. John Wiley and
Sons, Inc., New York, New York, USA, 1983.
O.C. Zienkiewicz and R.L. Taylor. The Finite Element Method. Vol. 1: Basic Formulation
and Linear Problems. McGraw-Hill Book Company (UK) Ltd, London, England, UK,
4th edition, 1989.
O.C. Zienkiewicz and E. Onate. Nonlinear Computational Mechanics: State of the Art.
Springer-Verlag, Berlin, Germany, 1991. Chap. "Finite Volume vs Finite Elements. Is
There Really a Choice?"; pp. 240-254; P. Wriggers and W. Wagner (Eds.).
O.C. Zienkiewicz and R.L. Taylor. The Finite Element Method. Vol. 2: Solid and Fluid
Mechanics, Dynamics, and Non-Linearity. McGraw-Hill Book Company (UK) Ltd,
London, England, UK, 4th edition, 1991.
O.C. Zienkiewicz and J.Z. Zhu. The superconvergent patch recovery and a posteriori error
estimates. Part 1: the recovery technique. Int. J. Numer. Meth. Eng., 33:1331-1364,
1992a.
O.C. Zienkiewicz and J.Z. Zhu. The superconvergent patch recovery and a posteriori
error estimates. Part 2: error estimates and adaptivity. Int. J. Numer. Meth. Eng.,
33:1365-1382, 1992b.
J.Z. Zhu. Further tests on the derivate recovery technique and a posteriori error estimator,
in Finite Elements in the 90's. Springer-Verlag/CIMNE, Barcelona, 1991; E. Onate,
J. Periaux and A. Samuelsson (Eds.).
J.Z. Zhu. A posteriori error estimation—the relationship between different procedures.
Comput. Meth. Appl. Mech. Eng., 150:411-422, 1997.