Cover illustration: Snapshot of streamlines and pressure (in color) shortly after an 'impulsive' start for Re = 1000. See Section 3.19 for an explanation.
Incompressible Flow and the Finite Element Method

Advection-Diffusion and Isothermal Laminar Flow

P. M. Gresho
Lawrence Livermore National Laboratory

R. L. Sani
University of Colorado

in collaboration with

M. S. Engelman
Fluid Dynamics International, Evanston

JOHN WILEY AND SONS
Chichester · New York · Weinheim · Brisbane · Singapore · Toronto
Copyright © 1998 John Wiley & Sons Ltd, Baffins Lane, Chichester, West Sussex PO19 1UD, England

National 01243 779777
International (+44) 1243 779777
e-mail (for orders and customer service enquiries): cs-books@wiley.co.uk
Visit our Home Page on http://www.wiley.co.uk or http://www.wiley.com

Reprinted with corrections January 1999

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency, 90 Tottenham Court Road, London, UK, W1P 9HE, without the permission in writing of the Publisher.

Other Wiley Editorial Offices
John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012, USA
WILEY-VCH Verlag GmbH, Pappelallee 3, D-69469 Weinheim, Germany
Jacaranda Wiley Ltd, 33 Park Road, Milton, Queensland 4064, Australia
John Wiley & Sons (Asia) Pte Ltd, Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809
John Wiley & Sons (Canada) Ltd, 22 Worcester Road, Rexdale, Ontario M9W 1L1, Canada

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN 0 471 96789 0

Typeset in 10/12pt Times from the authors' disks by Laser Words, Madras, India
Printed and bound in Great Britain by Bookcraft (Bath) Ltd.
This book is printed on acid-free paper responsibly manufactured from sustainable forestation, for which at least two trees are planted for each one used for paper production.
CONTENTS

A * indicates advanced or peripheral material.

Preface
Glossary of Abbreviations

1 Introduction
  1.1 Introduction
  1.2 Incompressible Flow
  1.3 The Finite Element Method
  1.4 Incompressible Flow and the Finite Element Method
  1.5 Overview of this Book
  1.6 Some Subjective Discussion
  1.7 Why Finite Elements? Why Not Finite Volumes?

2 The Advection-Diffusion Equation
  2.1 The Continuum Equation
    2.1.1 The Advective (Convective) Form
    2.1.2 Dimensionless Forms and Limiting Cases of the Equation
    2.1.3 The Divergence (Conservation) Form
    2.1.4 Conservation Laws
    2.1.5 Weak Forms of the PDE's/Natural Boundary Conditions
  2.2 The Finite Element Equations/Discretization of the Weak Form
    2.2.1 Advective Form
    2.2.2 Divergence Form
    2.2.3 Conservation Laws
    2.2.4 An Absolutely Conserving Form
    2.2.5 A Finite Difference Interpretation
    2.2.6 A Control Volume FEM
  2.3 Some Semi-Discrete Equations
    2.3.1 One Dimension
      a. Linear elements
      b. Quadratic elements
    2.3.2 Two Dimensions with Bilinear Elements
      a. An Interior 4-Patch
      b. A Boundary 2-Patch
      c. A Boundary Corner
      d. An Internal Line Heat Source
    2.3.3 Two Dimensions with Biquadratic Elements
      a. An Interior 4-Patch
      b. An Interior 2-Patch
      c. An Interior 1-Patch
    2.3.4 Two Dimensions with Serendipity Elements
      a. An Interior 4-Patch
      b. An Interior 2-Patch
      c. Another Interior 2-Patch
  2.4 Open Boundary Conditions (OBC's)
    2.4.1 One Dimension
    2.4.2 Two Dimensions
  2.5 Some Non-Galerkin Results
    2.5.1 The Lumped Mass Approximation
    2.5.2 One-point Quadrature
      a. An Interior 4-Patch of Uniform Rectangles
      b. A Boundary 2-Patch
      c. A Boundary Corner
    2.5.3 Control Volume Finite Element (CVFEM)
      a. An Interior 4-Patch
      b. A Boundary 2-Patch
      c. A Boundary Corner
      d. OBC's
      e. A Nine-Node CVFEM
    2.5.4 The Group FEM/Product Approximation
    2.5.5 The Petrov-Galerkin FEM
  2.6 Dispersion, Dissipation, Phase Speed, Group Velocity, Mesh Design, and Wiggles
    2.6.1 Qualitative Discussion
      a. Wiggles
      b. Dispersion
      c. Dissipation
      d. Phase Speed
      e. Group Velocity
      f. Mesh Design
    2.6.2 Quantitative Discussion for Some 1D Problems
      a. Pure Advection with Periodic BC's
      b. Advection-Diffusion with Periodic BC's
      c. Advection-Diffusion with Dirichlet BC's
      d. Advection-Diffusion with Dirichlet/Neumann BC's
      e. Advection-Diffusion with Neumann BC's at Both Ends
      f. Advection-Diffusion with Dirichlet/Robin BC's
      g. The Advective-Diffusive Time Scale
      h. Final Remarks on 1D Advection-Diffusion
    2.6.3 Extension to 2D
      a. Pure Advection with Periodic BC's
      b. Pure Advection with Dirichlet BC's (Inlet Only)
      c. Advection-Diffusion with Dirichlet BC's
      d. Advection-Diffusion with Periodic BC's
      e. Advection-Diffusion with OBC's
      f. Final Remarks on Advection-Diffusion via GFEM
  2.7 Time Integration
    2.7.1 Some Explicit ODE Methods
      a. Second-Order Adams-Bashforth (AB2), an 'Explicit Multi-Step Method'
      b. Third-Order Adams-Bashforth (AB3), Another 'Explicit Multi-Step Method'
      c. Runge-Kutta Methods (RK2,4)
      d. Leapfrog (Another Explicit Midpoint Rule)
      e. Rational Runge-Kutta (RRK)
    2.7.2 Application to Advection-Diffusion (Scalar Transport)
      a. Generalities
      b. Lumping the Mass
      c. Stability Estimates and the Case for Implicit Methods
      d. Matrix Method of Stability Analysis
      e. Balancing Tensor Diffusivity (BTD)
    2.7.3 Some Implicit ODE Methods
      a. The Trapezoid Rule (TR)
      b. Implicit Midpoint Rule (IMR)
      c. Backward Differentiation Formulae (BDF)
    2.7.4 Variable-Step Implicit Methods
      a. Variable-Step Trapezoid Rule
      b. Variable-Step Backward Euler
      c. A Model Problem
      d. An Aerospace Version of TR
      e. TR on Advection-Diffusion
      f. The Smoothing Property
    2.7.5 A Semi-Implicit Method
    2.7.6 Dispersion (et al.) Errors For Some Fully Discrete Methods
      a. Introduction
      b. Semi-Discrete the Other Way
      c. Fully Discrete
      d. Generalizations and Extensions
    2.7.7 Other (Different) Methods Used by Others
      a. Methods Based on Trajectories/Characteristics
      b. Methods Based on Modified Equations
      c. Some Least-Squares Finite Element Methods (LSFEM)
      d. Methods Based on a Discontinuous-in-Time Galerkin ODE Technique
      e. Methods Based on Least-Squares and Time-Discontinuous ODE's
      f. A Wave Equation Method
      g. Another Combined Method: Taylor Least Squares
    2.7.8 Concluding Remarks and Suggestions
  2.8 Additional Numerical Examples
    2.8.1 Unstable ODE Example
    2.8.2 Advection-Diffusion of a Puff (Point Source)
    2.8.3 The Rotating Cone: A Pure Advection Test Problem

3 The Navier-Stokes Equations
  3.1 Notational Introduction
  3.2 The Continuum Equations (The PDE's)
  3.3 Alternate Forms of the Viscous Term
    3.3.1 Stress-Divergence Form
    3.3.2 Div-Curl Form
    3.3.3 Curl Form
  3.4 Alternate Forms of the Non-Linear Term
    3.4.1 Divergence Form
    3.4.2 Rotational Form
    3.4.3 Skew-Symmetric Form
    3.4.4 A Symmetric Form
  3.5 Derived Equations
    3.5.1 The Pressure Poisson Equation (PPE)
    3.5.2 The Vorticity Transport Equation (VTE)
    3.5.3 The Penalized Momentum Equation
  3.6 Alternate Statements of the NS Equations
    3.6.1 Velocity-Pressure in Divergence Form
    3.6.2 Velocity-Pressure in Rotational Form
    3.6.3 PPE Form
    3.6.4 The Stream Function-Vorticity (ψ-ω) Formulation
    3.6.5 The Velocity-Vorticity Formulation
    3.6.6 Other Formulations
  3.7 Special Cases of Interest
    3.7.1 Stokes Flow
    3.7.2 Inviscid Flow
    3.7.3 Potential Flow
    3.7.4 Axisymmetric Flow
  3.8 Boundary Conditions
    3.8.1 u-P Equations
      a. Traction
      b. Mixed
      c. Total Momentum Flux
      d. Symmetry
      e. Robin
      f. OBC's
      g. More OBC's
      h. Penalty method OBC's
      i. Ill-posed OBC's
    3.8.2 The Pressure Poisson Equation and Pressure Boundary Conditions
    3.8.3 The Vorticity Transport Equation and Boundary Conditions on the Vorticity
      a. The 2D Stream Function-Vorticity Formulation
      b. The 3D Velocity-Vorticity Formulation
  3.9 Initial Conditions (and Well-Posedness)
    3.9.1 The u-P Formulation
    3.9.2 The PPE Formulation
    3.9.3 Vorticity-Based Methods
  3.10 Interim Summary
    3.10.1 A Well-Posed IBVP for Incompressible Flow, and the Equivalence 'Theorem'
    3.10.2 Some Ill-Posed Problems
    3.10.3 The Simplified PPE Is Also Ill-Posed
    3.10.4 Fixing the SPPE, and the PPE Paradox
    3.10.5 PPE Solutions That Are Not NSE Solutions
    3.10.6 A Remark on the Penalty Method
    3.10.7 Key Features of Incompressible Flow
  3.11 Global Conservation Laws
    3.11.1 Conservation of Mass
    3.11.2 Momentum Conservation
    3.11.3 Kinetic Energy Conservation
    3.11.4 Vorticity Conservation
    3.11.5 Enstrophy Conservation
  3.12 Weak Forms of the PDE's/Natural Boundary Conditions (NBC's)
    3.12.1 The Conventional u-P Formulation and the Stress-Divergence Form Combined
    3.12.2 Other u-P Formulations
      a. Full Divergence Form
      b. Skew-Symmetric Form
      c. Rotational Form and Other Recent Forms
      d. Other Recent Formulations
      e. Divergence-Free Basis Functions
    3.12.3 Pressure Poisson Equation Formulations
    3.12.4 The Stream Function-Vorticity Formulation
    3.12.5 Some Ill-Posed Formulations
  3.13 The Finite Element Equations/Discretization of the Weak Form
    3.13.1 Detailed Derivation of One u-P Formulation
      a. Continuum Formulation
      b. GFEM Equations
      c. Matrix-Vector Representation
      d. Ill-Posed Equations
      e. Normal and Tangential BC's
      f. Axisymmetric Case
      g. Fixing Ill-Posed Dirichlet BC's
    3.13.2 The Choice of Elements
      a. Introduction and Summary Tables
      *b. Null Spaces and Their Effects; Pressure Modes
      *c. LBB-Stability/Div-Stability
      d. Bringing LBB to the Rest of the People
      e. Penalty Methods
      f. Some 2D vs 3D Considerations
    3.13.3 Stabilization (D. J. Silvester)
      a. Stable vs. Stabilized Methods
      b. Equal Order Interpolation via Stabilization
      c. Stabilized Approximations Using Discontinuous Pressure
      d. Impact on Iterative Solvers
      e. Recommendations
    3.13.4 The Discrete Pressure Poisson Equations (PPE's)
      a. The Consistent PPE
      b. Some Inconsistent (Approximate) PPE's
    3.13.5 Additional Detailed Discussion of the Slightly UNSTABLE but Highly USABLE Q1Q0 Element
      a. Introduction
      b. General Problem Statement
      c. Interior Momentum Equation
      d. Interior PPE
      e. Boundary Momentum Equations/NBC's
      f. The PPE at Boundaries
      g. Flow Past a Flat Plate
      h. Flow Past a Corner
      i. Div u = 0 as a PPE BC
      j. Q1Q0 Convergence Proof
      k. Quantitative Description of Some Unstable Modes
      l. The Boundary Vector, g
    3.13.6 Higher-Order Elements
      a. Q2Q1
      b. Q2Q1
      c. Q2P1
    3.13.7 Divergence-Free Elements (and Methods)
    3.13.8 Conservation Laws Revisited
    3.13.9 Periodic Boundary Conditions
  3.14 A Control Volume Finite Element Method
  *3.15 Variational Principles for Potential and Stokes Flow
    3.15.1 Introduction
    3.15.2 Discrete Stokes
    3.15.3 Discrete Potential
    3.15.4 Continuous Potential
    3.15.5 Continuous Stokes
  3.16 Solution Methods for the Semi-Discretized Time-Dependent (and Steady) Equations
    3.16.1 Some Time-Integration Methods for the DAE's
      a. Primitive Equations/Index 2
      b. PPE Methods/Index 1
      c. Error Analysis for Index 1 and 2
      d. Some Numerical Results (Taylor Vortex)
    *3.16.2 A Model DAE Problem
      a. Introduction
      b. Index 2
      c. Index 1
      d. Index 0
      e. Penalty
      f. Energetics
      g. Numerical Integration
      h. Final Exercise
    *3.16.3 Analytical Solution of the Stokes Equations
      a. Introduction
      b. Index 2
      c. Index 1
      d. Linear Stability Theory
    3.16.4 Three Variable-Step Implicit (Index 2) Methods, and Some Steady-State Methods
      a. Introduction
      b. Trapezoid Rule
      c. Backward Euler
      d. BDF2
      e. Discussion
      f. Penalty Method
    3.16.5 An Explicit (Index 1) Method, Plus a Few Tricks
    3.16.6 Semi-Implicit Projection Methods
      a. Introduction
      b. Derivation of an 'Optimal' Projection Method, Simplifications Thereto, and Analysis Thereof
      c. A GFEM (Almost) Implementation of the Second-Order Projection Method: Projection 2
      d. A Sampling of Projection Methods Used by Others
    3.16.7 Fully-Implicit Segregated Solution Methods: Transient and Steady-State
    3.16.8 A Fractional-Step (Index 2) Method
    3.16.9 Other Methods (Used by Others)
      a. Methods Based on Trajectories/Characteristics
      b. Methods Based on Least Squares (LSFEM)
      c. Methods Based on Galerkin Least Squares
    3.16.10 A Strategy for Hastening Steady Solutions
  3.17 Aliasing and Aliasing Instability, Linear and Non-Linear
  *3.18 A New Look at Two Old Finite Difference Methods
  3.19 Numerical Example: Impulsive Start
    3.19.1 Introduction
    3.19.2 Domain, Mesh, BC's, IC's
    3.19.3 Two Steady-state Results (ν = 0, ∞)
    3.19.4 Pressure Impulse
    3.19.5 Minimum Time of Believability, Re = 1000 Results
    3.19.6 Transient Stokes Flow
    3.19.7 Divergence for h → 0
    3.19.8 A Better Model
    3.19.9 Drag Coefficients
    3.19.10 A New Analytical Model
    3.19.11 A Better Mesh
    3.19.12 Δt vs. t
    3.19.13 A Deficient Mesh Design
    3.19.14 Concluding Remarks
  3.20 Closure: Some Additional Remarks on the Pressure

4 Derived Quantities
  4.1 Introduction
  4.2 Two Dimensions
    4.2.1 Smoothing in General
    4.2.2 Vorticity
    4.2.3 Stream Function
    4.2.4 Heat Flux
    4.2.5 Forces and Moments
    4.2.6 A Recommended Method for Computing First Derivatives at Nodes
    4.2.7 Particle Paths
    4.2.8 Effective Peclet (Reynolds) Number
    *4.2.9 Pressure Smoothing and Node Moving for Q1Q0
  4.3 Three Dimensions
    4.3.1 Vorticity
    4.3.2 Helicity Density

Appendix 1 Some Element Matrices
  A.1.1 Advection-Diffusion Matrices
  A.1.2 One-Dimensional Element Matrices
  A.1.3 Two-Dimensional Element Matrices
  A.1.4 Navier-Stokes; Additional Matrices
    A.1.4.1 Gradient Matrix
    A.1.4.2 Divergence Matrix
    A.1.4.3 Consistent Laplacian Matrix
  A.1.5 Two-Dimensional Control Volume Finite Element Matrices

Appendix 2 Further Comparison of Finite Elements and Finite Volumes
  A.2.1 Introduction
  A.2.2 Viewpoint One
  A.2.3 Viewpoint Two

Appendix 3 Projections, Orthogonal and Not, and Projection Methods
  A.3.1 Introduction
  A.3.2 Scalar Projections
    A.3.2.1 The L2-Projection, [...]
    A.3.2.2 The L2-projection, [...]
    A.3.2.3 The H1-Projection, [...]
    A.3.2.4 The H1-projection, [...]
    A.3.2.5 The Projection Method
    A.3.2.6 Brief Discussion of GFEM Errors on Elliptic BVP's
    A.3.2.7 Numerical Examples
  A.3.3 Vector Projections
    A.3.3.1 The [...]-Projection
    A.3.3.2 The [...]-Projection
    A.3.3.3 The [...]-Projection
    A.3.3.4 The [...]-Projection
    A.3.3.5 Sequential Projections
    A.3.3.6 The Projection Method
    A.3.3.7 Ranking Elements via Projections

References
Author Index
Subject Index
PREFACE

There are many ways to 'do' CFD (computational fluid dynamics) today, and there will undoubtedly be more rather than fewer 'tomorrow'. C'est la vie of CFD. There are also several (not 'many') ways for one to become acquainted with the need or desire to actually do CFD. The latter, of course, can have a major impact on the former.

Rather than attempting to elaborate on either of the above subjects (a probably pedantic, difficult, and mostly useless exercise), we shall simply state our own opinion up-front (and 'opinion' it must be, as the jury is still out, and likely to remain so, regarding 'How best to do CFD'): the Galerkin finite element method (GFEM) is one of the good ways to 'do CFD', especially when flows in or around 'real world' (complex) geometry are of principal interest. Note that 'good' does not necessarily imply easy, or robust. It does, in our view, imply accuracy and generality and, in some sense, 'honesty'. It is an objective and honest method that tries to remain true to the underlying PDE's (partial differential equations). Hopefully there is still a market for a method that displays these characteristics.

There is also a significant market for what we perceive to be less honest methods; namely, those modifying the Galerkin principle in various seemingly ad hoc ways such as 'upwinding' and related stabilizing and artificially dissipative methods. Such methods would be acceptable to us (and, we believe, to many others) if, in addition to the continual demonstration of their more-or-less acknowledged robustness, they would always be used in conjunction with appropriate mesh refinement efforts that would convince both giver and receiver that their final results do represent an accurate approximate solution to the stated problem.
If these 'final' results require a significantly finer mesh than does a (frequently harder-to-obtain) wiggle-free GFEM solution, then there is clearly room for 'compromise' methods that cleverly combine robustness and accuracy in an efficient way.

This book was actually 'conceived' way back in 1981. That it did not actually get written until 'now' (1989-1997) is probably a good thing, as a book back then would have been premature. (We, at least, have learned many important aspects of the subject, FEM in fluids, that were opaque back then, as most of the world was still in the Dark Ages relative to the present.) Some may say, perhaps rightly, that we are still premature. Others, however, will (we hope!) say 'It's about time! Promises, promises!'

Finite element methods (FEM's) are not as easy to understand as finite difference methods (FDM's) or even finite volume methods (FVM's). The incompressible Navier-Stokes equations (NSE's) are also not easy to understand. Finite elements applied to incompressible flows can be especially hard to understand. One of the major goals of this book is to significantly clarify the use of FEM for discretizing the incompressible NSE's.
Another major goal has been to make the FEM more accessible/understandable (even desirable!) to the many engineers and applied scientists who might use it, but who are put off by, perhaps even frustrated by, some of the heavy mathematical machinery that is often associated with the subject. Thus, this book is also intended to place a readable finite element text in the hands of would-be FEM practitioners who might otherwise select an alternative method. It does so by presenting a lucid and reasonably uncluttered explanation of the FEM and by presenting and interpreting the final discrete equations in the form of simple finite difference equations. We even occasionally digress to derive and discuss a particular finite volume method, and compare it with the GFEM.

On the other hand we intend, as the title suggests, that this book also be useful if only the first part of the title is of interest to the potential reader; for example, for those who prefer or are already committed to numerical methods other than the FEM and/or those who are not even interested in numerical computations but only wish to learn more about incompressible fluid dynamics. In contrast to some texts that provide only broad brush introductory information on many topics, this text is clearly focused on providing the important details of only a few useful-but-important topics, including many previously unpublished related research results and many illuminating discussions of one of the most mysterious parts of incompressible fluid dynamics: the pressure. Also, after careful derivation of the semi-discretized NSE's via the GFEM, the resulting differential-algebraic equations (DAE's) are, for the first time, addressed in great detail.
This book puts modern theory and methods of ordinary differential equations (ODE's—for the advection-diffusion equation) and DAE's into the hands of the CFD practitioner, complete with robust algorithms that automatically select the proper time step size (small enough to 'follow the physics' but no smaller—and large when possible). The time integration methods presented in this book are also virtually independent of the underlying spatial approximation method—and thus will also be useful to those using FDM, FVM, or spectral methods (another Galerkin method). The FEM enjoys and utilizes, like spectral methods but not like FDM's, the particularly useful aspect of providing a final solution in a well-defined functional form; no ad hoc interpolation procedures are needed to determine the solution away from a node point. Another advantage is the systematic and consistent generation (via weighted residual/error distribution principles) of the appropriate ODE's, DAE's, and/or linear and nonlinear algebraic equations—with no need of 'intervention' by the analyst to consider, for example, how to treat a certain derivative at or near a boundary. In addition to removing much of the mystique related to 'natural boundary conditions' (NBC's), we also explain and emphasize consistency in (at least) the following six areas that even now are not as widely known and appreciated as they should be: (i) consistent mass matrices, (ii) consistent heat flux, (iii) consistent penalty methods, (iv) consistent PPE (Pressure Poisson Equation), (v) consistent normal direction, (vi) consistent forces and moments. We hope that this book will be useful to each of the following four 'types': graduate students, researchers, code developers, and code users. Although we are definitely not writing the book for mathematicians, we hope and believe that they too will find much that is of interest. 
We are writing the book, at least in part, for that class of people whom Strang and Fix (1973) have referred to as 'mathematical engineers' in the following sense: (i) we assume that the (average) reader does not have a PhD in applied mathematics/numerical analysis; (ii) we do assume that the reader has an advanced degree in applied physical science or engineering and an interest in the mathematics of CFD; (iii) we do assume a 'reasonable' background/facility with some relatively advanced mathematics (ODE's, linear algebra, PDE's, and numerical analysis); and, finally, (iv) we do assume the reader has a reasonable background in fluid mechanics.

The task of writing a book soon brings the putative author face-to-face with a stark and embarrassing reality: he knows a bit less about 'his' subject than he had thought. Even though one may like to think that writing a book turns one into an authority of sorts, this is unfortunately not the case. Rather, it helps to define/reveal the author's level of ignorance. Also, we realize well the bane of frustrated researchers trying to learn from the literature: a confusing misprint, or, worse yet, a misleading misprint or statement. So let us profusely apologize in advance for errors of this type (which there will inevitably be), as well as those that are even more serious/harmful: errors in conceptual understanding that show up in the form of promulgating our own misunderstanding.

The scope of this text is both narrow and broad; it is narrow in that it covers only advection-diffusion and isothermal laminar flow, and it is broad because these important 'basic' topics are covered in much detail.
Looking ahead to a second volume, with the tentative subtitle, 'Additional and Advanced Topics', we plan to further broaden the scope via (at least) the following chapters: Flows with coupled transport (for example, the Boussinesq equations for both stratified flows and buoyancy-driven flows); stability, continuation, and bifurcation (advanced analysis of steady flows); free and/or moving boundaries/interfaces; turbulent flows and turbulence modeling; other specialized applications; good simulation practices—with appendices that discuss non-dimensionalization, solution methods for algebraic systems (linear and nonlinear), and (perhaps) weak operators. Before acknowledging the assistance provided by others in this endeavour, we wish to echo the too-true words of Telionis (1981) in his preface: 'At the end, one realizes that a manuscript is never finished. It is simply abandoned.' The first person owed thanks is Dr. Joseph B. Knox, who provided both intellectual support and an excellent research environment at LLNL during the 'early' years (circa 1975-1982 or so). Next, we opine that the ultimate portion of the CFD learning curve is that traversed by writing code and making runs. Special thanks are offered here to several important contributing LLNL colleagues: Bob Lee (an early and able 'partner'), Steve Chan (a later and longer partner), and John Leone, Jr (last but not least—he also provided much help with many numerical results that are presented for the first time in this text). Thanks also to the Institute Universitaire des Systems Thermique Industriels in Marseille, directed by J. Pantoloni and J. Martin, for providing computational resources and a stimulating environment for RLS during the final stages of this book. 
RLS would also like to acknowledge his family and the family Aubert in Marseille, France, for encouragement and support during the project, as well as the Council on Research and Creative Work at the University of Colorado for providing financial support via a faculty fellowship. PMG also acknowledges the sabbatical-like environment provided by the University of Minnesota in the form of a George T. Piercy Distinguished Visiting Professorship in the fall of 1989. And special thanks to Damien Veyret of the CNRS-UMR 6595 groups, who provided invaluable assistance in the generation of some of the new numerical results.

The following alphabetical list, probably incomplete, of professional colleagues who have contributed indirectly (usually) but significantly to this product is presented with a general but sincere 'Thank You': Doug Arnold, Alex Chorin, Peter Brown, Michel Fortin, Dave Gartling, Vivette Girault, Max Gunzburger, John Heywood, Dave Malkus, Ty Olson, 'Nick' Nicolaides, Tinsley Oden, Olivier Pironneau, Rolf Rannacher, Rolf Stenberg, and David Silvester.

Finally, no mere words of thanks can repay the active and able assistance provided by two mathematicians/friends who have helped us to remain 'honest' over many years and, most importantly for this project, during the entire preparation of this book: David Griffiths and Alan Hindmarsh. And to those whose help we have forgotten to acknowledge explicitly: please accept both our thanks and our apologies.

This book is dedicated to the person who, via the significant labor of 'true' TeX (not LaTeX) and the ability to improve many (but not all!) clumsy sentences, worked as hard as anyone: Doris Getsla Gresho.

PMG
RLS
January 1998
GLOSSARY OF ABBREVIATIONS

AB      Adams-Bashforth
AD      advection-diffusion
BC      boundary condition
BDF     backward differentiation formula
BE      backward Euler
BHC     biharmonic catastrophe
BHE     biharmonic equation
BHM     biharmonic miracle
BL      boundary layer
BLT     boundary layer thickness
BTD     balancing tensor diffusivity
BVP     boundary value problem
CB      checkerboard
CFD     computational fluid dynamics
CG      conjugate gradient
CM      consistent mass
CPPE    consistent pressure Poisson equation
CVFEM   control volume finite element method
DAE     differential-algebraic equation
DSCG    diagonally-scaled conjugate gradient
DTSF    delta t scale factor
FDM     finite difference method
FE      forward Euler
FEM     finite element method
FVM     finite volume method
GFEM    Galerkin finite element method
GFEMIA  Galerkin finite element method intelligently applied
HOT     higher-order terms
IBVP    initial boundary value problem
IC      initial condition
IMR     implicit midpoint rule
IRK     implicit Runge-Kutta
IVP     initial value problem
LBB     Ladyzhenskaya-Babuška-Brezzi
LDC     lid-driven cavity
LHS     left-hand side
LM      lumped mass
LMM     linear-multistep method
LTE     local truncation error
NR      Newton-Raphson
NS      Navier-Stokes
NSE's   Navier-Stokes equations
OBC     outflow/open boundary condition
ODE     ordinary differential equation
PDE     partial differential equation
PPE     pressure Poisson equation
RHS     right-hand side
RK      Runge-Kutta
RMS     root mean square
SPD     symmetric positive definite
SPPE    simplified pressure Poisson equation
SS      steady state
STR     shortened trapezoidal rule
TBD     to be determined
TR      trapezoid rule
TS      Taylor series
VBL     viscous boundary layer
VTE     vorticity transport equation
1 INTRODUCTION

1.1 INTRODUCTION

This short chapter is intended primarily to help bring the 'novice' up-to-speed by supplying a sufficiently-generous (but by no means complete) sampling of the relevant introductory and contemporary related literature. A secondary purpose is to provide a sort of 'road-map' to chart our plans for the rest of the book. A final purpose is to state 'up-front' what is not covered in this book.

We assume a certain level of 'sophistication' on the part of the reader, which we now attempt to define.

1. The basic concepts ('physics') and equations of fluid mechanics, especially those for incompressible flow, should be reasonably well-known and understood. (We will further your understanding.)
2. The (Galerkin) finite element method (GFEM) should also be in the reader's sphere of knowledge, as we begin at well-above an introductory level (although we often 'drop back down' to generate discrete equations).
3. We hope that an 'intermediate' level of mathematical sophistication, such as that which a 'second' course in finite elements would have provided, is all that is required.

Related to this last point, we quote from John Whiteman (1975) the following very-relevant, but perhaps (for some 'engineers') somewhat frightening summary of FEM and mathematics:

    It is perhaps part of the fascination of the subject that so many branches of mathematics are involved in the theory of finite elements. As can be seen from the above, this draws on such areas as functional analysis, the theory of differential and integral equations, variational principles, optimisation, interpolation, approximation and the solution of linear and nonlinear systems. The task of becoming conversant with this wide spectrum of knowledge is indeed a challenge.
But we will not spend much time on 'the theory of finite elements' (there are quite enough publications on this subject), as we try to get the subject matter of our book into the hands of 'practitioners'—code writers and code users. But it is a fact of life that the FEM is somewhat unique in the sense that it is a 'perfect' subject for heavy involvement by engineers and applied scientists on the one hand and by a wide range of applied mathematicians on the other. We present in this text a product of our experience in which we have strongly attempted to keep 'one foot in each camp'. While we try to keep
2 INTRODUCTION 'advanced' mathematics to a minimum (for example, we invoke very little 'functional analysis,' as we mostly avoid the sophisticated subject of error analysis), we also hope that the less sophisticated reader who masters the material herein will be able to proceed more confidently into much of the FEM literature. In their excellent and timely text, Strang and Fix (1973) stated in their preface, ... this book is absolutely not intended for the exclusive use of specialists in numerical analysis. On the contrary, we hope it may help establish closer communication between the mathematical engineer and the mathematical analyst. In our opinion, they succeeded well—and we would be quite pleased if our book is judged in a similar vein, but with a slightly different slant: we would like the not- necessarily-mathematical engineer to learn, appreciate, and use effectively the FEM—as we attempt to further bridge the gap between mathematicians and 'engineers'. Two related historical facts/events that we believe worth reporting here are the following, the first from Whiteman again (1981): Over a decade ago I reached the conclusion that there existed a very definite and detrimental lack of communication between those mathematicians, who were at that time working in increasing numbers on the mathematical theory of finite elements, and engineers who were routinely using finite element methods to solve practical problems. The second is from Tinsley Oden (1991): One unfamiliar with aspects of the history of finite elements may be led to the erroneous conclusion that the method of finite elements emerged from the growing wealth of information on partial differential equations, weak solutions of boundary value problems, Sobolev spaces, and the associated approximation theory for elliptic boundary value problems. 
This is a natural mistake, because the seeds for the modern theory of partial differential equations were sown about the same time as those for the development of modern finite element methods, but in an entirely different garden. ... the rich mathematical theory of partial differential equations, which began in the 1940's and 1950's, blossomed in the 1960's, and is now an integral part of the foundations of not only partial differential equations but also approximation theory, grew independently and parallel to the development of finite element methods for almost two decades. ... It was, perhaps, an unavoidable occurrence, that in the late 1960's these two independent subjects, finite element methodology and the theory of approximation of partial differential equations via functional analysis methods, united in an inseparable way, so much so that it is difficult to appreciate the fact that they were ever separate.

Next, we summarize our 'interpretation' of the reason that the Galerkin Finite Element Method is so useful and powerful (cf. Fourier series and other eigenfunction expansion methods).

1. The approximate solution is represented as an expansion in a convenient (and finite) set of linearly-independent basis functions (piecewise-polynomials) with coefficients to be determined.
2. Placing the expansion into the governing PDE results in an error, called the residual.
3. Application of the 'Galerkin principle' yields a set of equations for the undetermined coefficients as follows. Since the basis functions form a complete set of functions and since the only function that is orthogonal to a complete set of functions is the zero function, it makes very good sense to orthogonalize the residual function against the same basis functions.
4. The choice of piecewise-polynomials and the 'Galerkin principle' more or less 'define' the GFEM: the former leads to sparse matrices and the latter (often) to optimal accuracy.

Of course there are other linearly-independent sets of functions against which one might orthogonalize the error, leading to, for example, Petrov-Galerkin methods. More general yet is the so-called 'method of weighted residuals', of which the Galerkin method is one member; for background here, see Crandall (1956), Finlayson and Scriven (1966), and Finlayson (1972). But we prefer the 'Galerkin method' for reasons that will be clarified throughout the book.

We conclude this introductory discussion with the following remark: we hope, as the title suggests, that this book will also be useful if only the first part of the title is of interest; for example, for those who prefer numerical methods other than the FEM or for those who are not at all interested in numerical computations (CFD) but only in Incompressible Fluid Dynamics.

1.2 INCOMPRESSIBLE FLOW

As stated earlier, we must assume that the reader is at least somewhat versed in this subject, but we will offer a short list of references that we have found useful; hopefully some on our list will actually be 'new' to the reader. But before doing this, let us help to 'define'/review the subject via some citations from both a classic text and a neo-classic (?) one:

A fluid is said to be incompressible when the density of an element of fluid is not affected by changes in the pressure. [Batchelor (1967, p. 75)]

The layman is usually surprised to learn that the pattern of the flow of air can be similar to that of water. From a thermodynamic standpoint, gases and liquids have quite different characteristics. As we know, liquids are often modeled as incompressible fluids. However, 'incompressible fluid' is a thermodynamic term whereas 'incompressible flow' is a fluid-mechanical term. We can have incompressible flow of a compressible fluid. [Panton (1984, p. 228)]

I regard the flow of an incompressible viscous fluid as being at the centre of fluid dynamics by virtue of its fundamental nature and its practical importance. [Batchelor (1967, p. xiv)]

In the remainder of this book we therefore concentrate on the flow of a fluid which possesses inertia and viscosity but which is effectively incompressible. This programme might appear to be modest, but it lies at the center of fluid mechanics and both deserves and requires serious study. [Batchelor (1967, p. 173)]

Following on from these relevant words, we add just a few of our own: While it is well and good to remember that incompressible flow ($\nabla \cdot \mathbf{u} = 0$) is always only an approximate conservation of mass equation ($\partial\rho/\partial t + \mathbf{u} \cdot \nabla\rho = -\rho\,\nabla \cdot \mathbf{u}$ being the exact equation), it is also always true that the exact mathematical consequences of this 'approximate' conservation of mass equation are both significant and profound, as we shall see. Returning for a minute to real fluids, please consult (at least) both Batchelor's and Panton's books to see under what conditions it is a 'good' approximation to model a compressible fluid via the incompressible assumption; perhaps the most important/common condition is related to the Mach number, $Ma = q/c$, where $q$ is
the fluid's speed and $c$ is the speed of sound in the fluid: if $Ma^2 \ll 1$ everywhere (say $Ma^2 < 0.1$, that is, $Ma \lesssim 0.3$), the flow will behave basically incompressibly. (Note that $\nabla \cdot \mathbf{u} = 0$ implies that $Ma = 0$.)

The rest of our brief 'survey' of incompressible flow is presented in the form of a briefly annotated table, with some relevant educational quotations, in (roughly) chronological order (Table 1.1), about which we make three remarks: (i) The books listed cover an incredibly large range of mathematical tools/approaches for examining/studying the NSE's. (ii) Our 'sampling' is, of course, incomplete. (iii) Abbreviations are defined in the Glossary of Abbreviations.

Table 1.1 Some books on incompressible flow.

Author | Title (year) | Comments
J. Serrin | Mathematical Principles of Classical Fluid Mechanics (1959) | Classic; careful discussion, via vector and tensor calculus, of both compressible and incompressible flows.
L. Landau and E. Lifshitz | Fluid Mechanics (1959) | Much emphasis on 'basics'/physics; very broad coverage but mostly compressible.
H. Schlichting | Boundary Layer Theory (1955-1979) | The classic text on the subject; it covers laminar, turbulent, and thermal boundary layers, mostly incompressible.
R. Bird, W. Stewart and E. Lightfoot | Transport Phenomena (1960) | Classic introductory text on momentum, heat, and mass transfer; many practical/tutorial examples.
G. Batchelor | An Introduction to Fluid Dynamics (1967) | Another classic, at an intermediate level, focussing on incompressible flow; a 'must' for the serious student.
O. Ladyzhenskaya | The Mathematical Theory of Viscous Incompressible Flow (1963-1969) | The title tells all for this classical mathematical text. Most of the discussion is in Sobolev spaces (functional analysis methods). See footnotes 1-3.
D. Tritton | Physical Fluid Dynamics (1977) | The focus is on physics, not mathematics; broad coverage of incompressible fluid dynamics including rotational and thermal effects.
R. Temam | Navier-Stokes Equations (1979, 1984) | From the preface: 'This book stands at the boundary between computational fluid dynamics and mathematical analysis to which CFD is firmly tied'. The 'analysis' is in function space, but some FDM's (and FEM's) are also discussed.
M. Van Dyke | An Album of Fluid Motion (1982) | Beautiful photographs of 'all' of fluid mechanics; 'a picture is worth a thousand words', and 'some' equations.
H. Lugt | Vortex Flow in Nature and Technology (1983) | A unique and mostly qualitative approach (with some mathematics) in which the theory of vorticity dynamics is used to illustrate many fluid-mechanical phenomena.
R. Panton | Incompressible Flow (1984, 1996) | A good beginning-to-intermediate-level text; the mathematics is 'classical'; broad range of topics.
H-O. Kreiss and J. Lorenz | Initial-Boundary Value Problems and the Navier-Stokes Equations (1989) | Much hyperbolic and parabolic PDE analysis (linear and non-linear) in 1 to 3 dimensions; mostly periodic BC's for NSE's; the depth is in IBVP's, less extensive is NSE's; see footnotes 4 and 5.
P. Constantin and C. Foias | Navier-Stokes Equations (1989) | A short treatise using modern techniques in PDE's; also discusses attractors, fractal dimensions, and inertial manifolds. See footnotes 6-8.
G. Galdi | An Introduction to the Mathematical Theory of the Navier-Stokes Equations. Volume I: Linearized Steady Problems. Volume II: Nonlinear Steady Problems (1994) | Modern, comprehensive, and thorough (over 750 pages), and mathematical (see footnote 9); the time-dependent case (Volume III) is anxiously awaited.

1. 'However, the only way to verify what the Navier-Stokes equations really have to say about the motion of actual fluids is first to carry out a rigorous mathematical analysis of the solution of boundary-value problems for the Navier-Stokes equations, corresponding to the actual hydrodynamical situations.' (p. 4)
2. 'Before studying the nonlinear Navier-Stokes equations, we investigate various linearized versions of the equations. These studies show that the boundary-value problems for the linearized equations always have unique solutions, and that properties of the operators corresponding to stationary problems are very much like those of the Laplace operator, while the properties of the operators corresponding to nonstationary problems resemble those of the heat-conduction operator but have some distinction.' (p. 5) (See too footnote 7.)
3. 'Finally, we warn the reader who is accustomed to the classical methods of mathematical physics that the interpretations given here of what is understood by the solution of a problem and what it means to solve a problem differ from those with which he is familiar. To a large extent, a precise analysis of these matters is responsible for the success of the investigations reported here.' (p. 6)
4. 'Existence and regularity questions play a fundamental role in computations because the resolution required depends on the smoothness of the solution, and there is always the danger that one tries to compute things that do not exist.' (p. ix)
5. 'There is no existence proof except for small time intervals. Thus it has been questioned whether the N-S equations really describe general flows. ... Possibly a lack of mathematical ingenuity is the reason for the missing existence proof, and the N-S equations are physically correct.' (p. 1)
6. 'Questions regarding the notions of weak and strong solutions and their relations to classical solutions are studied in some detail.' (p. viii)
7. 'The difference between the Laplacian for the Dirichlet problem and the Stokes operator originates from the fact that Leray's projector $P$ and $(-\Delta)$ do not commute in general.' (p. 41) (See too footnote 2.)
8. 'The need to study weak solutions arises mainly for $d = 3$ because even if $u_0$ and $f$ are very nice functions, in this case the existence of a classical solution of the Navier-Stokes equations is known, in general, only for short time intervals.' (p. 63)
9. 'The book is essentially mathematically self-contained: the knowledge of Banach spaces and their basic properties (completeness, separability, reflexivity) along with some classical results in operator theory are the only necessary prerequisites to reading this book, which is devoted to students (graduate and undergraduate) and those mathematicians and applied mathematicians who wish to become acquainted with the subject.' (p. vii)

1.3 THE FINITE ELEMENT METHOD

As in the previous section, we shall mainly provide the interested reader with a reasonably complete shopping list of books on the FEM. We wait until the next section to cite some of those who, like us, connected the FEM with fluid mechanics (mostly incompressible). Our list will range from beginner's books to those requiring a rather more solid background in advanced mathematics. But before we do this, we wish to get three things out of the way: (1) the history of FEM, (2) FEM error analysis via Taylor series, and (3) some brief discussion of 'function spaces'.

1. For the interested reader, we offer a small sample of papers that trace one or another aspect of FEM history. We begin by returning to a reference already cited: Oden (1991). Slightly more recent is Babuska (1994), and more recent yet is the paper by Gupta and Meek (1996). Additionally, some of the books referred to below trace the early development of the FEM.
2. Even though the discrete equations generated by the FEM are often 'amenable to' Taylor series analysis, especially when uniform meshes are employed, we caution the reader not to draw 'final' conclusions regarding the accuracy of GFEM from such a (necessarily) local analysis, especially when the 'nodal equations' are not all of the same 'type' [see, for example, Carey (1976)]. For but a single example: the discrete momentum equation for the center node of a 9-node (biquadratic) velocity / piecewise-constant pressure approximation to the incompressible NSE's contains no pressure gradient term. (!) This is a Taylor-series-inconsistent equation, yet the 'element' converges nicely, though 'slowly' (it is consistent).
3. We now turn to the subject of function spaces, wherein all the GFEM solutions reside. So as not to lose a large fraction of our planned audience, we cover the subject as follows: nearly all that we wish to say about the subject is contained in the 'graphic' shown in Figure 1.1.
Fig. 1.1 Our GFEM lives in a Banach space that is also in the intersection of two subspaces of it: a Sobolev space and a Hilbert space.

With function spaces out of the way, we now present another briefly annotated table, this time on the FEM, in approximately chronological order, and again with no intention of 'completeness'. (For example, we list none that are too 'solid-mechanics-oriented'.) The three remarks regarding Table 1.1 apply equally well to Table 1.2.

We conclude this FEM section with a short discussion on 'element matrices' and 'assembly of equations', two common and important aspects of the FEM. While we generally refer the novice to the (more basic, introductory) texts in Table 1.2 for learning about these things, we believe that there is one 'area' that is not 'completely' covered there, and that is what we wish to contribute here: element matrices for several common quadrilateral elements for incompressible fluid mechanics. And this we do only for the simplest shapes, rectangles; isoparametric element matrices are better left to the computer. With some apologies for getting a little bit ahead of ourselves in terminology, we present in Appendix 1 the following element matrices in one and two dimensions: $M$, $K$, and $N(u)$ (all defined in Chapter 2) for both linear and quadratic basis functions, and $C^T$ (defined in Chapter 3) for 2D. Also included there are $M$, $K$, and $N(u)$ for the control-volume-finite-element method (CVFEM; defined in Chapters 2 and 3) and, for one higher-order element (QjQx), we even show $C^T M^{-1} C$ (Chapter 3). We shall refer to this appendix frequently in the following chapters; we introduce it here because it is FEM. Two related remarks:

1. We will actually present the 'nitty-gritty' details of assembling the global semi-discrete NSE's from the element matrices for one special case ($Q_1Q_0$) in Section 3.13.5 of Chapter 3.
2. For some advice on numerical integration 'quadrature rules' (required for the isoparametric case, in which only approximate results are generally attainable), see Leone et al. (1979).
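As a rough illustration of 'element matrices' and 'assembly of equations', and of the Galerkin steps summarized earlier in this chapter, the following Python sketch treats the simplest case: linear (two-node) elements on a uniform 1D mesh. It assumes that M denotes the mass matrix and K the diffusion (stiffness) matrix, as is conventional. The model problem $-u'' = f$ with $u(0) = u(1) = 0$, the function names, and the nodal interpolation of $f$ are choices made here for illustration only; they are not taken from Appendix 1.

```python
import numpy as np


def element_matrices(h):
    """Mass and stiffness matrices of one linear (two-node) element of length h."""
    M_e = (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])
    K_e = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    return M_e, K_e


def assemble(n_elements, length=1.0):
    """Add each element matrix into the global M and K (dense storage for brevity)."""
    h = length / n_elements
    n_nodes = n_elements + 1
    M = np.zeros((n_nodes, n_nodes))
    K = np.zeros((n_nodes, n_nodes))
    M_e, K_e = element_matrices(h)
    for e in range(n_elements):
        nodes = [e, e + 1]  # global node numbers of element e
        M[np.ix_(nodes, nodes)] += M_e
        K[np.ix_(nodes, nodes)] += K_e
    return M, K


def solve_poisson(n_elements, f):
    """Galerkin solution of -u'' = f on (0, 1) with u(0) = u(1) = 0.

    Assumption: f is replaced by its nodal interpolant, so the load vector is M @ f(x).
    """
    x = np.linspace(0.0, 1.0, n_elements + 1)
    M, K = assemble(n_elements)
    b = M @ f(x)
    interior = np.arange(1, n_elements)  # drop the two Dirichlet nodes
    u = np.zeros_like(x)
    u[interior] = np.linalg.solve(K[np.ix_(interior, interior)], b[interior])
    return x, u


def max_nodal_error(n_elements):
    """Largest nodal error for f = pi^2 sin(pi x), whose exact solution is sin(pi x)."""
    f = lambda x: np.pi**2 * np.sin(np.pi * x)
    x, u = solve_poisson(n_elements, f)
    return np.max(np.abs(u - np.sin(np.pi * x)))
```

Halving the mesh spacing, for example comparing `max_nodal_error(16)` with `max_nodal_error(32)`, should reduce the nodal error by a factor of roughly four for this smooth problem. Dense matrices are used only to keep the sketch short; the sparsity produced by piecewise-polynomial bases would normally be exploited.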
Table 1.2 Some books on finite elements.

Author | Title (year) | Comments
O. Zienkiewicz and R. Taylor | The Finite Element Method. Volume 1: Basic Formulation and Linear Problems (1989) (1st Edition: 1967) | Latest edition of one of the 'standards' on the subject; while each edition tends to be more general and less solid mechanics, the latter still dominates; a basic book that ranges from introductory material on through to some advanced topics.
G. Strang and G. Fix | An Analysis of the Finite Element Method (1973) | A well-deserved classic; an excellent and readable book for the 'mathematical engineer' to learn about error estimates, convergence rate, and more, at an 'intermediate' level of mathematics. See footnotes 1 and 2.
P. Ciarlet | The Finite Element Method for Elliptic Problems (1978) | This is the 'standard' reference for elliptic problems; the mathematical level is fairly high (functional analysis, etc.); includes theorems, proofs, and much error analysis. See footnote 3.
E. Becker*, G. Carey and J.T. Oden (*Volume I only) | The Texas Finite Element series. I. Finite Elements: An Introduction (1981); II. Finite Elements: A Second Course (1983); III. Finite Elements: Computational Aspects (1984); IV. Finite Elements: Mathematical Aspects (1983); VI. Fluid Mechanics (1984) | A series of 'small' books that range from undergraduate level to graduate and postgraduate level; good introductions to the various 'spaces' of solutions; very wide coverage of the subject.
O. Zienkiewicz and K. Morgan | Finite Elements and Approximation (1983) | A sort of 'special-purpose' book that is appropriate for a beginner and for one trained in FDM, for example. See footnote 4.
J.N. Reddy | An Introduction to the Finite Element Method (1984, 1993) | One of the first 'basic' texts that addresses the 'rest' of the science and engineering community. See footnotes 5 and 6.
O. Axelsson and V. Barker | Finite Element Solution of Boundary Value Problems (1984) | Written by mathematicians, this is another text addressed to non-solid-mechanics readers; it emphasizes second-order BVP's and methods of solving the resulting linear algebra problems, and the mathematical level is 'intermediate'.
V. Thomee | Galerkin Finite Element Methods for Parabolic Problems (1984) | Very thorough, very mathematical; much error analysis (with proofs), and in several norms, including $L_\infty$; also full discretizations, including BE and TR; smoothing properties discussed and analyzed.
R. Wait and A. Mitchell | Finite Element Analysis and Applications (1985) | Another book by mathematicians that avoids structural mechanics; intermediate level of mathematics; includes time-dependent problems; concisely introduces function spaces. See footnote 7.
D. Burnett | Finite Element Analysis (1987) | A long, but thorough and useful introductory text that is directed toward the advanced undergraduate in non-solid mechanics curricula. See footnotes 8-10.
T. Hughes | The Finite Element Method: Linear Static and Dynamic Analysis (1987) | Although written by an expert in (at least) solid mechanics, this first level text is an excellent starting point for other branches of engineering and applied science; mixed methods (Stokes flow) are also addressed. See footnote 11.
H. Kardestuncer (Editor) | Finite Element Handbook (1987) | This large four-part volume begins with 'FEM mathematics', written by mathematicians at a high/advanced level; next, 'FEM Fundamentals', is written for and by solid mechanicers; 'FEM Applications' does contain some fluid mechanics; the last section, 'FEM computations', again addresses mostly issues in solid mechanics. (C'est la vie.)
C. Johnson | Numerical Solution of Partial Differential Equations by the Finite Element Method (1987) | Addressed to advanced undergraduates and first-year graduate students, this text does a very good job of introducing the necessary mathematics (but no more) and shows how it applies to a range of problems. See footnote 12.
R. Cook, D. Malkus and M. Plesha | Concepts and Applications of Finite Element Analysis (1989) | While mainly a 'solid mechanics' FEM text, we include it for two reasons: (i) the chapter on constraints, (ii) the Appendix on eigenvalues and eigenvectors is unique and useful.
O. Zienkiewicz and R. Taylor | The Finite Element Method, 4th Edition. Volume 2: Solid and Fluid Mechanics, Dynamics and Non-Linearity (1991) | This is 'Part 2' of the first-listed book; it is more advanced with the first third emphasizing 'solids'; the remainder brings in the other physics, including fluid mechanics (both compressible and incompressible).
A. Baker and D. Pepper | Finite Elements 1-2-3 (1991) | A rather different (unique) approach to introduce the FEM, mostly via heat transfer and advection-diffusion examples, is presented; a disk-supplied code is an integral part of the text; it alleges to streamline the analysis via unconventional hypermatrix terminology. See footnote 13.
D. Pepper and J. Heinrich | The Finite Element Method: Basic Concepts and Applications (1992) | Another introductory text that emphasises heat transfer and not structural mechanics. See footnotes 14-16.
S. Brenner and R. Scott | The Mathematical Theory of Finite Element Methods (1994) | This useful text, written by mathematicians, seems to be a sort of 'modern-day Strang and Fix (1973)', though perhaps more advanced in the mathematics. See footnote 17.
K.J. Bathe | Finite Element Procedures (1996) | Although solid mechanics-oriented, this new text covers several other relevant issues: mixed methods for incompressible behavior, new ways to analyze and understand the inf-sup condition, and incompressible flow. See footnotes 18 and 19.

1. 'Whenever flexibility in the geometry is important, and the power of the computer is needed not only to solve a system of equations, but also to formulate and assemble the discrete approximation in the first place, the finite element method has something to contribute.' (p. ix)
2. 'This completes the technical error estimates for parabolic problems; there are no surprises in the results. Our impression is that just as in static problems, finite elements are particularly effective in coarse mesh calculations, with a large value of h. In this situation, the physics is often more adequately represented by Galerkin's principle, on which the finite element method is based, than by supposing difference quotients to be close to the derivative.' (p. 251)
3. 'Although the emphasis is mathematical, it is one of the author's wishes that some parts of the book will be of some value to engineers, whose familiar objects are perhaps seen from a different view point. Indeed, in the selection of topics, we have been careful in selecting only actual problems and we have likewise restricted ourselves to finite element methods which are actually used in contemporary engineering applications.' (p. vii)
4. 'Many alternative numerical approximation processes existed before the advent of the finite element method. Here boundary solution techniques and finite difference methods have established their own useful existence, and proponents of these have at times crossed swords with those advocating finite element methods in claiming particular superiority. Today some of us see the essential unity of all approximation processes used in the solution of problems defined by differential equations and in this book we stress this throughout. We endeavour to show that a 'generalized finite element method' can be defined embracing all the alternative variants, thus leaving scope for choosing the 'optimal approximation' to the user.' (p. vii)
5. 'The many discussions I have had with students who had no background in solids and structural mechanics gave rise to my writing a book that should fill the rather unfortunate gap in the literature.' (p. xi)
6. 'In introducing the finite element method in Chapters 3 and 4, the traditional solid mechanics approach is avoided in favor of the "differential equation" approach, which has broader interpretations than a single special case.' (p. xii)
7. 'In this book, the authors have attempted to provide an introduction to the method by considering both the theory and the practice. The book is aimed at final-year undergraduate and first-year postgraduate students in mathematical sciences and engineering. No specialized mathematical knowledge beyond a familiarity with calculus and elementary differential equations is assumed. The applications are drawn from many areas and no knowledge of structural mechanics, or any other branch of engineering science, is assumed. The abstract mathematics is kept to a minimum and is concentrated in a single chapter.' (p. vii)
8. 'Because of these origins, most of the FE literature is permeated with mechanical concepts such as forces, moments, displacements, rotations, masses, dampers, springs, rods, beams, plates, and shells. Understandably, many physicists, applied mathematicians, and non-mechanical engineers have concluded that this field is not for them.' (p. vi)
9. 'Of course, the only technical language understandable to all fields is mathematics, so that is the approach used here. This, however, presented a challenge: how to avoid a lot of the sophisticated mathematical concepts and specialized mathematical jargon prevalent in much of the mathematically-oriented FE literature, while preserving only those concepts necessary for intelligent application of the FEM.' (p. vi)
10. 'Judging from my own experience and that of my colleagues, such topics as functional analysis and variational calculus are definitely not needed to understand and use the FEM.' (p. ix)
11. 'Background in structural mechanics (i.e., the theory of beams, plates, and shells) is certainly an asset when it comes to studying this book but is not essential. Only Chapters 5 and 6 deal exclusively with this subject, and these chapters may be ignored by students whose interest lies elsewhere. ... In this spirit the book emphasises fundamental finite element concepts and techniques applicable to a very broad range of problems and thus constitutes a suitable text for most students in the physical sciences.' (p. xvii)
12. 'The purpose of this book is to give an easily accessible introduction to the finite element method as a general method for the numerical solution of partial differential equations in mechanics and physics covering all the three main types of equations, namely elliptic, parabolic, and hyperbolic equations.' (p. 7)
13. 'The intrinsic beauty of the method is that this theory completely accounts for each and every critical decision that must be made in the design of a numerical algorithm. No other approximation method in use today is so complete in giving firm and accurate guidance on algorithm construction, consistent boundary condition implementation, direct adjustment of spatial order of accuracy, and geometric flexibility.' (p. xiii)
14. 'Most practitioners of the finite element now employ Galerkin's method to establish the approximations to the governing equations. The underlying theme in this book likewise follows Galerkin's method. The simplicity and richness of the method pays for itself as the user progresses into more complicated and demanding types of problems. Once this fundamental concept is grasped, application of the finite element method unfolds quickly.' (p. 2)
15. 'The finite element method is rapidly becoming the de facto standard for numerical approximation of the partial differential equations which define engineering and scientific problems. Many of the commercial computer codes currently available are finite element based, especially in the structural and heat transfer areas.' (p. 3)
16. 'The theory of the finite element method is found in variational calculus, and it is this mathematical basis that allowed it to be developed in a very short time and makes it the powerful tool for engineers that it is today. However, this has also created the misconception that a strong mathematical background is essential in order to understand the finite element method. Here, we will indeed show that this is not the case and that all of the finite element methodology can be developed utilizing the theorems of advanced calculus and basic physical principles.' (p. 5) (The reference to variational calculus for the theory is, of course, mainly addressed to 'classical' FEM and elliptic BVP's.)
17. 'This book develops the basic mathematical theory of the finite element method, the most widely used technique for engineering design and analysis. One purpose of this book is to formalize basic tools that are commonly used by researchers in the field but never published. It is intended primarily for mathematics graduate students and mathematically sophisticated engineers and scientists.' (p. vii)
18. 'My objective in writing this book was to provide a text for upper-level undergraduate and graduate courses on finite element analysis and to provide a book for self-study by engineers and scientists.' (p. xiii)
19. 'This text does not present a survey of finite element methods. ... Instead, this book concentrates on only certain finite element procedures, namely, on techniques that I consider very useful in engineering practice and that will probably be employed for many years to come.' (p. xiii)
1.4 INCOMPRESSIBLE FLOW AND THE FINITE ELEMENT METHOD

We are certainly not the first to try to tie these two important subjects together, and we will not be the last (we hope!). We mentioned earlier that the 'simplification' of the mass conservation equation leads to some significant and profound consequences. Well, such is the case when one attacks the incompressible NSE's via the FEM (or any other spatial discretization method, we hasten to add). The 'divergence-free constraint' that is placed on the vector field called velocity causes nearly no end of difficulties, mostly mathematical. It is no wonder, as anyone who is familiar with the historical development of CFD is no doubt aware, that many 'slightly compressible' approximations to truly incompressible flow have been devised. Even today, many 'incompressible' discretizations are not really that; discrete incompressibility, and all of its associated 'stability' problems, are often side-stepped by invoking 'stabilized' formulations that permit the discrete divergence to be proportional to some power (usually the first) of the mesh spacing, rather than strictly zero. The history of the subject, revealed by a perusal of the literature over the last 25 years [30 years if the FDM history is also included, which it is in the recent paper by Williams and Baker (1996)], will reveal many, many (too many, some detractors might suggest) publications that 'test' or analyze (or both) various types of 'discretizations' (velocity and pressure basis functions, basically) for both accuracy and for that truly annoying mathematical concept called 'stability': uniform invertibility of the operator (roughly). This is all we care to reveal at this time; the remainder of the 'iceberg' will be detailed in Chapter 3 and in many if not all of the citations listed in our final historical introduction table (Table 1.3), about which the same remarks made regarding Table 1.1 again apply.
In addition to the references cited above, useful material can also be found in (at least) those listed below, some of which 'occur' periodically:

(i) the FEM in Fluids Series, John Wiley and Sons, in seven volumes, from 1974 through 1988; also, the FEM in Fluids Conferences, from 1974 through the present (the next is in January 1998);
(ii) the Numerical Methods in Laminar and Turbulent Flow series, Pineridge Press, in nine volumes from 1978 through 1995;
(iii) the 'Water Resources Conference' series, called Finite Elements and Water Resources for the first 10 or so years, beginning in 1976, until 'external influences' (apparently) caused a name change to Computational Methods in Water Resources;
(iv) various ASME Special Meetings Proceedings, especially the Division of Applied Mechanics and the Division of Fluids Engineering;
(v) various other 'CFD' conferences, too many to list;
(vi) Annual Review of Fluid Mechanics, annually;
(vii) all journals in which CFD is one of the main subjects.

We conclude this section with a very brief listing of some of the 'pioneering' papers in which the FEM was applied to incompressible flow: Thompson et al. (1969), Oden (1970), Cheng (1972), Taylor and Hood (1973), and Hood and Taylor (1974), from each of which we learned something when we began our FEM effort in 1975.
Table 1.3 Some books on finite elements and fluid mechanics.

Author: A. Baker
Title (year): Finite Element Computational Fluid Dynamics (1983)
Comments: Although 'dated' with respect to 'FEM in fluids,' this early text is still useful, both to introduce FEM as well as presenting parabolic (boundary layer) methods; the language of 'hypermatrix' formulation is, however, rather unconventional. See footnote 1.

Author: G. Carey and J.T. Oden
Title (year): Finite Elements. Volume VI: Fluid Mechanics (1986)
Comments: One of the 'Texas series'; another early text that recognized, and responded to, the need; still a useful book. See footnote 2.

Author: V. Girault and P.A. Raviart
Title (year): Finite Element Methods for Navier-Stokes Equations (1986)
Comments: This mathematical text, written by mathematicians, is a 'standard' (advanced) reference for the theory of FEM and incompressible flows, as far as it goes (it does not address the time-dependent case).

Author: C. Cuvelier, A. Segal, and A. Van Steenhoven
Title (year): Finite Element Methods and Navier-Stokes Equations (1986)
Comments: Another 'early' text written by mathematicians, but for engineers (no functional analysis); includes discussions of div-free bases; includes some error analysis and convergence proofs.

Author: M. Gunzburger
Title (year): Finite Element Methods for Viscous Incompressible Flows (1989)
Comments: Although written at a higher level of mathematics (by a mathematician), this text probably comes closer to our own than any other; it is complementary; it covers also stream function-related methods and a few others that we do not. See footnote 3.

Author: O. Pironneau
Title (year): Finite Element Methods for Fluids (1989)
Comments: A small-but-useful book that touches on a wide range of flows: irrotational, incompressible Stokes and NS, compressible Euler, and shallow water equations, plus advection-diffusion. See footnotes 4 and 5.

Author: O. Zienkiewicz and R. Taylor
Title (year): The Finite Element Method, 4th Edition. Volume 2: Solid and Fluid Mechanics, Dynamics and Non-linearity (1991)
Comments: Begun in Volume 1 (mixed formulations), this text devotes about 15% to the subject matter of our book; it is thus somewhat complementary. See footnotes 6 and 7.

Author: F. Brezzi and M. Fortin
Title (year): Mixed and Hybrid Finite Element Methods (1991)
Comments: While incompressible NSE occupies only about 20% of this important mathematical text, it is the 'standard reference' for the analysis of mixed methods and their stability; we appeal to it as the 'authority'. See footnote 8.

Author: M. Gunzburger and R. Nicolaides (Editors)
Title (year): Incompressible Computational Fluid Dynamics: Trends and Advances (1993)
Comments: While not restricted to FEM, this book is still timely and relevant; the state-of-the-art circa 1992, or a segment thereof.

Author: J.N. Reddy and D. Gartling
Title (year): The Finite Element Method in Heat Transfer and Fluid Dynamics (1994)
Comments: Nearly at the opposite end of the 'spectrum' from Girault and Raviart (1986), this text is focussed on applied FEM and shows many results; solution methods (linear and non-linear equations) are also discussed. See footnotes 9 and 10.

Footnotes to Table 1.3:

1. 'For engineers and scientists whose expertise lies generally outside mathematics or structural analysis, and in fluid mechanics in particular, these approaches [mathematical or structural] are probably confusing if not rather incomprehensible and frustrating [Amen;] ... This text addresses this dilemma.' (p. xiii)

2. 'The main framework of the subject for flow problems has now been established. Our objective in this volume is to develop and discuss the method in this context. Accordingly, the scope has been limited to the following fundamental classes of flow problems: linear potential flow, compressible inviscid flow, viscous flow, and transport processes [advection-diffusion].' (p. ix)

3. 'A principal goal is to present some of the important mathematical results that are relevant to practical computations. In so doing, useful algorithms are also discussed. Although rigorous results are stated, no detailed proofs are supplied; rather, the intention is to present these results so that they can serve as a guide for the selection and, in certain respects, the implementation of the algorithms.' (p. xv)

4. 'This course addresses students having a good knowledge of basic numerical analysis, a general idea about variational techniques and finite element methods for partial differential equations and if possible a little knowledge of fluid mechanics; its purpose is to prepare them to do research in numerical analysis applied to problems in fluid mechanics.' (p. 7)

5. 'Computational fluid dynamics (CFD) is in a fair way to becoming an important engineering tool like wind tunnels. For Dassault Industries, 1986 was the year when the numerical budget overtook the budget for experimentation in wind tunnels. In other domains, like nuclear security and aerospace, experiments are difficult if not impossible to make.' (p. 11)

6. 'We were tempted to publish this section [fluid mechanics] as a separate volume. This is not only because it deals with a field of application of its own wide interest but also because it extends the field of finite element applications to a difficult area in which 'variational principles' do not exist naturally.' (p. xiv)

7. 'The whole field of computational fluid mechanics in which finite difference approximation has been the mainstay is today in a transition stage in which the advantages of finite elements are being realized.' (p. xiv) [optimists!]

8. 'One, therefore, sees that we must keep a delicate balance between coerciveness on the kernel of B and the inf-sup condition which are in a sense conflicting conditions with respect to the choice of spaces.' (p. 198)

9. 'Though finite difference methods have been [playing] and will continue to play a major role in computational fluid dynamics (CFD) and heat transfer, finite element techniques have spurred the explosive development of 'general purpose' methods and the growth of commercial software. The inherent strengths of the finite element method such as unstructured meshes, element-by-element formulation and processing, and the simplicity and rigor of boundary condition application are being coupled with modern developments in automatic mesh generation, adaptive meshing, and improved solution techniques to produce accurate and reliable simulation packages that are widely accessible.' (p. iii)

10. 'As in any rapidly developing field, the education of the nonexpert user community is of primary importance. The present text is an attempt to fill a need for those interested in using the finite element method in the study of fluid mechanics and heat transfer. It is a pragmatic book that views numerical computations as means to an end; we do not dwell on theory or proof.' (p. iii)
1.5 OVERVIEW OF THIS BOOK

We now briefly summarize what lies ahead and what does not; i.e., we also list as many relevant issues/items as we can think of that, for one reason or another, are not discussed in this book, although many of these omitted subjects will be taken up in Volume II.

In Chapter 2 we develop an appropriate weak formulation of the advection-diffusion (AD) equations after appropriate discussion of the underlying partial differential equation (PDE). The equations governing the approximate solution of the weak form are then carefully developed, including both boundary conditions (BC's) and initial conditions (IC's). After displaying in 'full glory' (probably for the first time in many cases) the semi-discrete form (the 'method of lines') of these equations (time remaining continuous) generated by the Galerkin finite element method (GFEM) in both 1D and 2D for several 'elements', we discuss outflow/open boundary conditions (OBC's) in general, and then present some non-Galerkin semi-discrete equations, including a finite volume method. After an extensive discussion of the 'behaviour' of the 'GFEM-generated' ordinary differential equations (ODE's), including such things as dispersion, phase and group velocity, and wiggles as a warning signal, we present a major discussion of time-integrating these ODE's to get the 'final' results (numbers) from the computer. Besides reviewing explicit and implicit classical ODE methods, we discuss in detail one of our most important contributions: how to employ local error control to vary the integration step size in such a way that a specified level of overall accuracy is attained without 'wasting' time steps; i.e., small steps are used only when required by the physics, and not by either stability limits or the analyst's arbitrary 'guesses'. After returning to a discussion of dispersion (etc.) for fully-discretized systems, we summarize other methods of solving the GFEM equations, and conclude with a few numerical examples that are tutorial in nature; i.e., they demonstrate one or more of the characteristics described earlier. Finally, an appendix presents further comparisons of finite elements to finite volumes.

Chapter 3 is clearly the most important chapter in the book; in a sense, the other chapters are 'adjuncts' to this one. After some careful discussion of the myriad ways to express the continuum PDE's that are the Navier-Stokes equations (NSE's), and a brief introduction to 'related' equations (for vorticity, for pressure) and limiting cases, we present extensive discussion of the wide variety of permissible BC's, concluding the 'introduction' portion with a discussion of IC's and well-posedness followed by a summary section for the PDE's. After a brief discussion of associated global conservation laws we embark on a detailed derivation of the various weak forms of the NSE's, followed by an even more detailed exposition of a particular weak form (the most common one). Next is a lengthy discussion related to element 'choice' (still a 'non-converged' process, worldwide), including the most difficult of the criteria for making a selection (div-stability) and some consequences of making 'wrong' choices (pressure modes). Following a 'contributed' section (by D. J. Silvester) on 'Stabilizing' the GFEM and another discussion about the pressure Poisson equation (PPE), we offer a very detailed section which, from the element matrices onward, shows 'how' the GFEM 'works': how both natural boundary conditions (NBC's) and the always elusive-in-the-past pressure BC's really come about. Also included is a new convergence proof for the controversial Q1Q0 element. Following discussions of some higher-order elements and global conservation laws for the GFEM equations, a digression to show a finite volume method is permitted.
The new 'meat' of the chapter begins with some tutorial information on various ways to integrate
the differential-algebraic equations (DAE's) that are the GFEM equations. After a useful digression on a model DAE system with an analytic solution, and another on the Stokes equation via eigenvectors, we discuss a variety of methods used to solve (integrate in time) the DAE's, beginning with our favorite and strongly-recommended method: trapezoid rule (TR) with intelligently selected ('smart') variable step-sizes. Methods also included are: explicit via forward Euler (FE), semi-implicit via a popular projection method, other projection-related methods, a fractional step method, methods based on characteristics, methods based on least squares, and methods based on Galerkin least squares. After a small section on aliasing and one that clarifies some old finite difference methods, the chapter concludes with but one numerical example, treated in some detail: impulsive start of flow past a circular cylinder. One of the objectives of this chapter that we think we have achieved is to remove most of the cloak of mystery surrounding the pressure.

The last chapter addresses the various issues of 'what to do with the results', and how to derive other 'secondary' quantities from the 'primary' (computed) variables. We show several ways to compute vorticity, heat flux, and forces (and moments), and recommend some over others. The book concludes with an extensive appendix on 'projections' related to the GFEM, and on projection methods, again presenting some quite new material that we believe is illuminating and useful.

What (else) did we leave out? Lots, as the reader has by now already ascertained. Probably more than we know. Below we list some of the important items, and hasten to point out that many of these will be covered in Volume II.

1. Local error estimates and adaptive meshing; error estimates in general. This is a serious defect, but perhaps our treatment of the 'same' subject in the time domain will help compensate. Besides, serious error estimates require too much functional analysis. For a fairly recent 'update' on this important-and-still-developing area, see Zienkiewicz and Taylor (1991). (We do provide a few error analyses, and summarize a few others.)

2. Detailed discussion/analysis of triangular/tetrahedral elements.

3. Petrov-Galerkin and related schemes. We simply 'apologize' for our 'naivete' and lack of experience.

4. Free surface flows: see Volume II.

5. Thermal problems, buoyancy-induced flows (Boussinesq equations), stratified flows: see Volume II.

6. Turbulence; turbulence modeling; turbulent flows: see Volume II, but note also, for example: 'Laminar flows are not just of academic interest. They are of considerable practical importance to the designer of forced convection heating and cooling devices used in the electronics, biomechanics, and aerospace industries, among others.' [Mohammed et al. (1991)]

7. Solution methods for linear and nonlinear algebraic equations: this very important practical aspect of the GFEM unfortunately also had to be deferred to Volume II.

8. p-methods and h-p methods (p: polynomial, h: element 'size'); spectral methods. We cover only h-methods. For an introduction to the first of these higher-order methods, see, for example, Oden (1990), Ainsworth and Oden (1991), Babuska and Suri (1994), Oden et al. (1993), and Babuska and Oden (1996). For spectral methods, start with the excellent text by Canuto et al. (1988a).

We hope to make up for many, but definitely not all, of these deficiencies in Volume II.
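To make the idea of 'smart' variable step-sizes mentioned in the overview more tangible, here is a minimal sketch of local error control for the trapezoid rule on the scalar model problem y' = -lam*y. It is only an illustration under stated assumptions, not the algorithm developed later in the book: the function name `tr_variable_step`, the second-order Adams-Bashforth predictor, the error-estimate scaling, the cube-root step update, the growth limits 0.5 and 2.0, and the absence of step rejection are all choices made here for brevity.

```python
import math

def tr_variable_step(lam, y0, t_end, h0, eps):
    """Integrate y' = -lam*y from t = 0 to t_end with the trapezoid rule (TR),
    choosing each new step size from a local error estimate (hypothetical sketch)."""
    t, y = 0.0, y0
    h, h_prev = h0, h0
    f_prev = -lam * y            # start-up assumption: pretend f_{n-1} = f_n
    steps = []
    while t < t_end - 1e-12:
        h = min(h, t_end - t)    # do not step past t_end
        f_n = -lam * y
        r = h / h_prev
        # Explicit predictor: variable-step second-order Adams-Bashforth.
        y_pred = y + 0.5 * h * ((2.0 + r) * f_n - r * f_prev)
        # Implicit TR corrector; available in closed form because the ODE is linear.
        y_new = y * (1.0 - 0.5 * h * lam) / (1.0 + 0.5 * h * lam)
        # Estimate of the TR local error from the predictor-corrector difference.
        d = abs(y_new - y_pred) / (3.0 * (1.0 + h_prev / h))
        t += h
        steps.append(h)
        y, f_prev, h_prev = y_new, f_n, h
        # TR local error scales like h**3, hence the cube root; growth is capped.
        factor = (eps / max(d, 1e-300)) ** (1.0 / 3.0)
        h = h * min(2.0, max(0.5, factor))
    return t, y, steps

t, y, steps = tr_variable_step(lam=1.0, y0=1.0, t_end=5.0, h0=0.01, eps=1e-5)
print(len(steps), max(steps), abs(y - math.exp(-5.0)))
```

In this sketch the step size grows as the solution decays, so far fewer steps are taken than a fixed step of h0 would require, which is the sense in which time steps are not 'wasted'.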
1.6 SOME SUBJECTIVE DISCUSSION

In this short section, we wish to present a potpourri of items and issues that we believe are useful, poignant, relevant, or simply 'interesting', and that fit well nowhere else. They are rather subjective, but are hopefully useful to some of our readers.

We begin with a few references that we believe are useful reading for the reasons stated. The recent book by Morton (1996) covers many aspects of 'our Chapter 2' from a variety of alternative analytical viewpoints, with (unlike our presentation) an emphasis on various Petrov-Galerkin methods. He is, ultimately, more interested in compressible CFD, in which shocks can wreak havoc with 'centered difference' methods like GFEM; nevertheless, the book is a good adjunct to ours. The recent contribution to the Handbook of Numerical Analysis by Marion and Temam (1996) both complements and supplements our treatment of the NSE's in Chapter 3; it also includes a concise and readable summary of what is and what is not known (3D global existence) regarding existence and uniqueness of solutions in both 2D and 3D; it is recommended reading. On a rather lighter note, the short paper by Russell (1989), whose title tells all, is also recommended: 'Finite Elements and Finite Differences: Are they Really Different, and Does it Matter?' Returning to a much heavier subject, the paper by Johnson et al. (1985) is a piercing critical review of most of the 'classical' methods of performing 'stability analysis' of a given steady solution of the NSE's; also critically reviewed and allegedly improved are the related error analyses in CFD, including error control, based on a combination of 'strong stability and Galerkin orthogonality.' Related to this, from an M.Sc. thesis, is:

Mathematically there are two reasons for the partial failure of classical stability (eigenvalue) analysis. First, as already indicated, the classical analysis is purely qualitative, investigating the behaviour of [infinitesimal] perturbations as time tends to infinity. This does not properly describe cases where the operators occurring are non-normal; e.g., the Orr-Sommerfeld operator in plane Couette and Poiseuille flows. Non-normality means that the eigenfunctions of the O-S operator, though forming a complete set, are not orthogonal. In this case a standard eigenvalue analysis fails, since it does not account for the large transient perturbation growth that can arise when the eigenfunctions are nearly linearly-dependent. The second reason is the restriction to 2D flow. The operators occurring in two- and three-dimensional Couette and Poiseuille flow are both non-normal, but only the 3D case appears to have the potential of large, transient perturbation growth. [Ericsson (1993)]

[See also the seminal works in this area by L.N. Trefethen; for example, Trefethen et al. (1993) and Reddy and Trefethen (1994).] For a 'new' approach to the larger problem of fluid dynamics from a mathematical viewpoint, see Feistauer (1992), for both incompressible and (mostly) compressible flow. Numerous additional relevant references that we have 'overlooked' are also cited.

We now cite a few (additional) quotations that simply seem relevant:

'The FEM is, first, a systematic and powerful method of interpolation' [Oden and Reddy (1976, p. 197)].

... there is also a great controversy surrounding them [numerical methods], focusing on the trade off between ease of obtaining a solution and its accuracy. Solution of the Navier-Stokes equations is much more difficult than that of the linear equations governing most solid mechanics problems; hence computational methods for fluid dynamics are much more involved and require greater expertise on the part of the user than stress or thermal analysis by finite-element methods. [SAE Automotive Engineering Staff (1995)]
At the conclusion of an article in the Annual Review in Fluid Mechanics eulogizing Karl Pohlhausen, Millsaps (1984) states,

... by a legendary figure on the ascent to the summit of human intellectual activity: fluid mechanics.

Finally getting back down to Earth: in an unpublished set of notes that was a precursor to their 1973 text (see their preface), Strang and Fix state:

The development of the method has led naturally from piecewise linear functions to splines and other piecewise polynomials of fixed degree p: each increase in p adds both to the accuracy and to the complexity of the method. As usual, the extra accuracy is initially worth the price, but just as Newton's method is more popular than its higher-order analogues, questions of convenience soon become paramount. In applications to second-order equations, cubic approximants (p = 3) are apparently close to the turning point.

This is probably still true today for fluid mechanics via h-methods, although it seems that 'cubic' should be replaced by 'quadratic'.

We conclude this brief section with the following remark: As have most if not all previous books of this type, we shall probably be found guilty of spending too much time discussing our own previous work relative to that of others; we, too, are human.

1.7 WHY FINITE ELEMENTS? WHY NOT FINITE VOLUMES?

In some CFD circles the finite element method has not yet achieved the same level of acceptance as other numerical solution methods. Part of this is probably a backlash of relief by those who feared the worst some 10 or so years ago when it had become pretty clear that the FEM had few serious competitors in its field of origin: solid mechanics computations in arbitrarily complex geometry. The fact that FEM is clearly 'best' for elliptic problems on arbitrary domains led some FEM optimists/advocates/zealots to believe (and espouse) that other applied fields of computational physics, fluid mechanics in particular, would also be 'easy pickings'.
However, the fact that the Navier-Stokes equations, for both compressible and incompressible flows, are much more difficult to solve, let alone solve efficiently, combined with oversell by some and dazzling displays of advanced and often obfuscating mathematics by others, seems to have caused the FEM bandwagon to become less attractive to jump on during the 80's and 90's. This state of affairs was furthered by the simultaneous development and near-maturation of two finite-difference-related areas of CFD: (i) body-fitted (and related) coordinate transformations for reasonably complex geometries, and (ii) finite/control volume methods (FVM) for spatial discretization, the former being a competitor to the inherent geometric flexibility via the isoparametric mappings of FEM and the latter offering a more readily understood (low-order) weighted residual method for approximating the solution of PDE's via 'local conservation.' Additionally, the FVM, developed and applied mostly by mechanical engineers, was not much 'burdened' by the oft-times frightening mathematical analyses that accompanied (or at least lurked in the background of) applied FEM code development ... such as mixed interpolation, LBB stability (inf-sup condition), Hilbert spaces, Sobolev spaces, etc. ... not to mention the seemingly never-ending quest for the best (or optimal) element. Even the experts do not agree.

Much of this 'trend' is, it should be noted, attributable to the historical development of both FEM and FVM in CFD, the former spinning off from solid mechanics, where fully-coupled systems solved via the Newton-Raphson method and 'Gaussian elimination' were the rule, and the latter evolving from the 'simple geometry' FDM's (of an earlier computer era) in which uncoupled equations and iterative solution methods dominated the solution algorithms. The advantages of the uncoupled/iterative methods are two: (i) reduced computational cost (memory and CPU, ignoring accuracy considerations) and (ii) a significantly larger radius of convergence. The disadvantage is a significant increase in the number of iterations to achieve convergence, and in some cases even the loss of the ability to converge to tight tolerances. It is thus interesting to note that a current trend in the finite volume community is toward coupled iterative solution methods, because of the disadvantage noted above. Also we note that very often the methods are compared simply on a 'CPU work per node' basis. However, this is a very misleading way of comparing the relative costs, particularly in the (most common) case of nonlinear problems in which additional issues such as accuracy and convergence rates should be factored into the 'cost equation'.

All of the above issues have, we believe, conspired to cause the FEM, which is really just a 'generalized' FVM (because a consistently-formulated FVM has more similarities than differences when compared with a low-order FEM, as we shall demonstrate), to either lose followers/advocates or at least to not gain new ones at a rate proportional to the total rate of growth of CFD.
But the facts are that the situation has changed rather significantly in the recent past, on both sides of the fence: FEM has, in some circles at least, been switching to simple (low-order) elements and to iterative solution methods on uncoupled (or less coupled) equations, and FVM is beginning to become 'saddled' with mathematical analysis related to stability and convergence. Three examples of the latter are: Cai et al. (1991), Cai (1991), and Shin and Strikwerda (1997).

We believe that the FEM, using low-order elements (basically linear and piecewise-constant), does represent a viable version of generalized FVM, especially now that the CFD FEM community (or at least some of it) has 'caught on' regarding solution strategy, and that it can and will compete cost-effectively in the arena of large CFD simulations involving millions of unknowns. We are less certain regarding the 'viability' of higher-order elements, although this may in fact be less important since there are no (or very few) higher-order FVM's. (As we shall demonstrate in the next chapter, the FVM is inherently a low-order method.)

We now present a comparison of FEM and FVM in several areas, beginning with the principal and most-oft-stated advantage of the latter over the former: the linear conservation laws implied by the governing PDE's are always and inherently satisfied locally (at control volume level) and thus globally (via simple summation). Hence, the discrete equations are always amenable to simple physical interpretation because the resulting stencils are also simple, at least on simple grids. In fact, the above is stated rather well by some users:

One reason for this (the increasing popularity of FVM's) is that they combine the intrinsic geometric flexibility of FEM together with the desirable, direct physical invocation (sic) of a conservation principle to clearly identified and delineated control volumes comprising the domain. [Schneider and Raw (1986)]
While there is no doubt that the above statements are true and that local conservation is a decided asset, it is also true that the stencils are far from simple and not so easy to interpret 'physically' when the FVM is applied to complex geometry with its mappings, Jacobians, metrics, equation transformations, and sometimes even (God-awful) Christoffel symbols, and that local conservation does not necessarily imply local accuracy. [It is also true that virtually all FVM's we have seen employ a lumped (diagonal) mass matrix, thus vitiating the claim of local conservation, at least for time-dependent situations.]

Now we would like to list some definite advantages of GFEM over FVM.

1. The inherent (built-in) geometric flexibility permits the easy use of simple Cartesian velocity components on unstructured meshes for arbitrarily complex geometry. There is absolutely no need for 'add-on' global mappings, global transformation of equations to covariant (or contravariant) components and thus no need for Christoffel symbols (etc.). This FEM simplification is really significant, further appreciation of which could be obtained by reading a carefully written paper on FVM by a team of mathematicians; see Segal et al. (1992). In fact, however, many 'modern' FVM methods are also based on Cartesian velocities, as are many commercial codes; see Ferziger and Peric (1996).

2. The inherent ability to easily and accurately apply the appropriate (physical) BC's on complex domains, especially the Neumann type, and especially at outflow regions, is a real asset.

3. Global physical (linear) conservation laws are either satisfied automatically (à la FVM's) or can be made to do so with a slight change in formulation; the conservation of linear momentum (global 'force balance') is obtained simply by using the divergence form for advection, $\nabla \cdot (\mathbf{u}\mathbf{u})$, as do FVM's. In addition, global energy conservation (quadratic) is guaranteed by writing the advection terms as $\tfrac{1}{2}[\nabla \cdot (\mathbf{u}\mathbf{u}) + \mathbf{u} \cdot \nabla \mathbf{u}]$; these are issues we shall elaborate upon in due course. On this point we note that many (us included) believe that the most important reason to seek global conservation of quadratic quantities (such as kinetic energy) is that it assures stability (boundedness) of the numerical results. Again, however, we remark that stability does not imply accuracy, although it is true (obviously) that instability implies inaccuracy. Finally, we point out that FVM's also generally do not conserve (locally or globally) quadratic quantities; the only quantities guaranteed to be conserved are those for which the divergence theorem applies.

4. If the original differential operator is symmetric (self-adjoint), so too are the FEM discretizations of the operator, but not (in general, on non-rectangular grids, etc.) FVM. Examples: (i) the Laplacian; (ii) the divergence and gradient operators are adjoint to each other in the continuum and in the GFEM, but not in the FVM.

5. The phase speed of an FEM is always more accurate than that of the 'corresponding' FVM.

6. For elliptic problems, the GFEM is always more accurate than the 'corresponding' FVM.

Related to this last item, we would like to react to the following 'related' assertion by some FVM advocates: 'FEM is great for elliptic problems, but it may be out of its realm for Navier-Stokes.' We respond thus: This best approximation property for elliptic problems should not even be surfaced when discussing CFD; but if it is, it should be interpreted as a bonus, an extra. This is just another example, like that of phase speed already mentioned, wherein GFEM trades ease of 'interpretation' for increased accuracy.
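The phase-speed claim in item 5 can be illustrated in the simplest setting: pure advection at constant speed on a uniform 1D grid, semi-discretized in space only. The sketch below assumes that the 'corresponding' FVM reduces, in this setting, to the lumped-mass centered scheme, while the GFEM with linear elements keeps the consistent mass matrix; that identification, and the function names, are assumptions made for illustration. The two ratios are the standard results of a Fourier analysis with theta = k*h (wavenumber times grid spacing).

```python
import math

def phase_ratio_consistent(theta):
    """Numerical/exact phase speed, linear elements with consistent mass matrix."""
    return 3.0 * math.sin(theta) / (theta * (2.0 + math.cos(theta)))

def phase_ratio_lumped(theta):
    """Numerical/exact phase speed, lumped (diagonal) mass, centered differences."""
    return math.sin(theta) / theta

# Compare the two schemes for waves resolved by 4 to 32 nodes per wavelength.
for nodes_per_wavelength in (4, 8, 16, 32):
    theta = 2.0 * math.pi / nodes_per_wavelength
    print(nodes_per_wavelength,
          phase_ratio_consistent(theta),
          phase_ratio_lumped(theta))
```

For every resolved wavelength the consistent-mass ratio lies closer to 1; both schemes lag the exact wave, but the lumped scheme lags far more at coarse resolution.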
Another 'excuse,' we believe, for going with FVM rather than FEM is this: 'The mathematics is too difficult and the concepts of weak formulation, weighted residuals, etc. are non-physical and non-intuitive.' We have a three-part response to this one.

1. The FVM is (often if not always) also a weighted residual method; only the weighting functions are different. They are simpler to be sure (piecewise-constant), and this is the reason we refer to GFEM as a generalized finite volume method (GFVM). We shall, in fact, derive, describe and demonstrate (partially) a particular FVM in the next two chapters.

2. Most of the difficult mathematics can be bypassed by practitioners and even code builders, just as it is and has been in FVM; i.e., there is a large body of (difficult) mathematics in the numerical analysis literature that is largely ignored by applied scientists writing finite difference codes (finite volume, too, but to a lesser extent, as the difficult mathematics is still in its infancy).

3. Finally, the same criticism should be leveled against all elementary Fourier-series methods (and related eigenfunction expansion methods), as they too are certainly 'non-physical and non-intuitive'; in fact they are all Galerkin methods. But we believe that even finite volume advocates appreciate the 'power' (and magic?) of Fourier series.

We conclude by emphasizing that there are actually more similarities than differences between these two methods: (i) they are both members of the family of weighted residual methods (although FVM methods need not be; there is a wide variety of methods called FVM); (ii) they both rely more on integration than divided differences to generate the discrete equations; and (iii) they both treat complex geometry via a 'mapping'. We strongly hope that enough 'mechanical engineers' (et alii) will be sufficiently swayed by this book that some good fraction of this large group of practitioners will help the FEM to grow and improve.
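The statement that both methods are weighted residual methods, differing only in the weighting functions, can be written compactly. The notation below is introduced only for this illustration and is an assumption, not the book's own: $T^h$ is the approximate solution, $R$ the residual of a PDE in conservation form with flux $\mathbf{F}$ and source $S$, $w_i$ the weighting functions, $\varphi_i$ the basis functions, and $\Omega_i$ the control volume associated with node $i$.

```latex
% Generic weighted residual statement, one equation per unknown:
\int_{\Omega} w_i \, R(T^h) \, d\Omega = 0, \qquad i = 1, \dots, N,
\qquad R(T^h) = \frac{\partial T^h}{\partial t} + \nabla \cdot \mathbf{F}(T^h) - S .

% Galerkin FEM: the weights are the basis functions themselves.
w_i = \varphi_i .

% FVM: the weights are piecewise constant.
w_i = \begin{cases} 1 & \text{in } \Omega_i, \\ 0 & \text{elsewhere.} \end{cases}

% With the FVM weights, the divergence theorem turns the statement into
% a flux balance over each control volume (local conservation):
\frac{d}{dt} \int_{\Omega_i} T^h \, d\Omega
  + \oint_{\partial \Omega_i} \mathbf{n} \cdot \mathbf{F}(T^h) \, d\Gamma
  = \int_{\Omega_i} S \, d\Omega .
```

In the Galerkin case the flux term is usually integrated by parts as well, which is where the natural boundary conditions enter.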
qi The Advection-Diffusion ^ Equation An appropriate starting point for the study and numerical simulation of incompressible fluid flow is that of the simpler but important linear equation of advection-diffusion (AD), in which the velocity field is presumed known. Indeed, many fluid flow simulations are primarily (or ultimately) concerned with the transport and diffusion of scalar quantities such as 'heat' (temperature) or concentration (e.g. air pollution). Unfortunately, even in these cases, the more-difficult-to-obtain velocity field must usually be computed first. Here, however, we shall assume that the velocity is known, either analytically or from a numerical solution of the incompressible NSE's; in the next chapter we will turn to the problem of computing the velocity field itself. Finally, since the advection- diffusion equation is, in many ways, prototypical of the (much more difficult) NSE's, it is useful to study it first—and we remark that it is often called the convection-diffusion equation. 2.1 THE CONTINUUM EQUATION 2.1.1 The Advective (Convective) Form The conservation principle for energy or chemical species (mass) can often be well- approximated by the following partial differential equation (PDE)—the scalar transport equation—written here in terms of temperature, T(x, t), where x is a short-cut notation for all spatial directions and applies to one, two, or three dimensions: — +u- V7 = V- (K- VT) + S, (2.1-1) dt where the velocity field, u(x, t), is given and satisfies V • u = 0, as is the diffusivity tensor, K, and the source term, S. {For S(x, t) > 0[< 0], it is a source [sink] term}. In this chapter we will consider the simple case wherein the equation is linear (i.e. no temperature-dependent terms); a further simplification, which we also use for the most part, is that the coefficients are invariant in time. 
The term $\mathbf{u}\cdot\nabla T$ represents advection (temperature is 'carried' by the velocity field), which is also often called convection; the term $\partial T/\partial t$ represents 'accumulation' for non-steady processes; and finally, the term $\nabla\cdot(\mathbf{K}\cdot\nabla T)$ is, of course, the diffusion term; if $\mathbf{K}$ is a scalar ($\kappa$) and constant, then this term becomes noticeably simpler: $\kappa\nabla^2 T$, where $\kappa = k/\rho C_p$ (thermal conductivity divided by density and heat capacity) is the thermal diffusivity. The solution to (2.1-1) will generally be sought within a bounded domain, $\Omega$, with boundary $\partial\Omega$, also called $\Gamma$. Given an initial distribution of temperature, (2.1-1) can, in
principle, be solved subject to an appropriate set of boundary conditions (BC's), which typically are:

\[ T = T_D \quad \text{on } \Gamma_D \tag{2.1-2} \]

and

\[ \mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) + H(T - \hat{T}) = q \quad \text{on } \Gamma_N, \tag{2.1-3} \]

where $\partial\Omega\ (= \Gamma_D + \Gamma_N)$ is composed of the two non-overlapping segments, $\Gamma_D$ and $\Gamma_N$; also $T_D$, $\hat{T}$, $H \ge 0$ (heat transfer coefficient, albeit slightly redefined, with units of velocity, from that used in conventional heat transfer analysis), and $q$ (specified normal heat flux into $\Omega$, again slightly redefined) are given functions (time-dependent, in general, and often simply called 'data' by mathematicians) on the appropriate portion of the boundary, and $\mathbf{n}$ is the outward pointing unit normal vector. The BC given by (2.1-2) is called a Dirichlet BC, or a BC of the first kind (an 'essential' BC in the weak formulations to follow); if $H = 0$ in (2.1-3), then the resulting BC is called a Neumann BC or a BC of the second kind; finally, for $H \ne 0$, we have a BC of the third kind, or a Robin BC; see, for example, James and James (1959). [The Robin BC is, for a heat conduction problem, also Newton's law of cooling; hence, the Robin BC is sometimes referred to as the Newton BC (e.g. Bird et al. 1960; Rektorys, 1980; Reddy, 1993). Both Neumann and Robin BC's will later also be referred to as 'natural' BC's in the weak formulations to follow.]

While $\mathbf{K}$ can, in general, be a full (but symmetric and positive-definite) second-order tensor (a 2 x 2 matrix of coefficients in 2D, and 3 x 3 in 3D) representing anisotropic diffusion, it is usually much simpler; e.g., a diagonal matrix or even a scalar. Since this presentation is largely introductory, we shall usually consider the simplest case of a scalar (and constant) diffusion coefficient $\kappa$. Finally, the statement of the scalar transport problem (abbreviated henceforth by AD; advection-diffusion) is completed by specifying an initial condition (IC):

\[ T(\mathbf{x}, 0) = T_0(\mathbf{x}) \quad \text{in } \bar{\Omega}, \tag{2.1-4} \]

where $T_0$ is a given function of position and $\bar{\Omega} = \Omega + \partial\Omega$.
Before continuing, we make several Remarks:

(1) The above BC's are the most general linear BC's that can be applied to (2.1-1); if $\mathbf{K}$ or $H$ or $q$ are temperature-dependent, a 'family' of non-linear BC's unfolds. {Actually, to incorporate a BC that will be discussed later, which may be even more 'general,' a convective transport term $-\mathbf{n}\cdot\mathbf{u}T$ should be added to the LHS of (2.1-3) [cf. (2.1-25)].}

(2) The IC need not (and generally does not) satisfy the BC's, but if it does, the resulting solution will be smoother, i.e., possess higher-order derivatives, especially if (2.1-4) satisfies (2.1-2), the Dirichlet BC. (This 'flexibility' regarding IC's and BC's will be partially lost when we advance to the Navier-Stokes equations in the next chapter.)

(3) A practical application of a BC (2.1-3) occurs when $\Gamma_N$ is a wall (at which $\mathbf{u} = 0$, usually) containing a heater (for $q > 0$) and on the other side of which flows a fluid at temperature $\hat{T}$.

(4) Another practical and very common use of (2.1-3) occurs when $\Gamma_N$ represents an 'outflow' ($\mathbf{n}\cdot\mathbf{u} > 0$) boundary that is usually artificial/synthetic in the real world but
very real in the mathematical modeling world. Here the use of $H = 0$ and $q = 0$ is often effective as an approximation to the true coupling with the rest of the universe.

(5) It is impractical, and generally ill-advised, to apply (2.1-3) at inflow ($\mathbf{n}\cdot\mathbf{u} < 0$) portions of the boundary because it could lead to an ill-posed problem, or at least to a problem whose existence of a unique solution cannot be proven, as we shall demonstrate.

(6) If $\mathbf{K} = 0$, we have the limiting case of pure advection, a hyperbolic equation for which no BC is permitted at outflow; i.e., BC (2.1-3) must be dropped in this situation, because the theory of characteristics tells us that $T$ must be specified at inlet points on $\Gamma$ ($\mathbf{n}\cdot\mathbf{u} < 0$), but that there is no BC at outlet points; at these points, the PDE itself prevails.

(7) Consider the 1D AD equation with constant coefficients. To help appreciate the tremendous difference between advection and diffusion, we point out that discontinuities in the initial data, $T_0(x)$, are transported unaltered ($C^{-1}$ initial data remains $C^{-1}$, i.e. rough) under pure advection but are instantaneously smoothed (to $C^{\infty}$ for $t = 0^{+}$) when diffusion is present.

(8) There will generally occur a singularity at the junction of $\Gamma_D$ and $\Gamma_N$, at which certain derivatives of $T$ (e.g., diffusive heat flux) will fail to exist (be unbounded).

(9) Periodic (or cyclic) BC's are sometimes useful/appropriate, for which both (2.1-2) and (2.1-3) are replaced by the requirement that the solution at one part of the boundary must be equal to that at another.

Given sufficiently smooth data ($\mathbf{u}$, $\mathbf{K}$, $T_D$, $\hat{T}$, $q$, $T_0$, and $\partial\Omega$), the solution of (2.1-1) will possess two continuous spatial derivatives and (for $t > 0$) one continuous time derivative, as implied by (2.1-1). Such solutions are called classical solutions, which distinguishes them from the weak solutions to be discussed later (and which 'dominate' the solution space in the 'applied' world).
Given that a classical solution to (2.1-1) exists, its uniqueness (for $\mathbf{n}\cdot\mathbf{u} \ge 0$ on $\Gamma_N$) follows easily in the usual way; i.e., insert the difference of two putative/alleged solutions into (2.1-1), multiply the equation by this difference, and integrate the equation over the domain. [See Remark (6) following (2.1-18) for help.]

2.1.2 Dimensionless Forms and Limiting Cases of the Equation

It is often useful, both physically and mathematically, to recast a given PDE into one or another dimensionless form to develop/improve one's intuition regarding the solution's behavior. We provide two such forms below. Suppose that a given problem is characterized by a particular (characteristic) length scale, $L$, and a particular (characteristic) velocity scale, $u_0$. We now consider two useful non-dimensional forms of (2.1-1), for the case $\mathbf{K} \to \kappa$, where $\kappa$ could represent the average of the $K_{ij}$'s, for example, by representing distance ($\mathbf{x}$) in terms of $L$ and velocity in terms of $u_0$. In the first form, we will assume that the time scale is 'set' by diffusion, an assumption that must often be verified a posteriori; i.e., we will non-dimensionalize time via $L^2/\kappa$, a diffusion time constant.
Equation (2.1-1) then becomes

\[ \frac{\kappa}{L^2}\left(\frac{\partial T}{\partial t}\right) + \frac{u_0}{L}(\mathbf{u}\cdot\nabla T) = \frac{\kappa}{L^2}(\nabla^2 T) + S, \tag{2.1-5} \]

where the terms in parentheses all have the units of temperature (it is not necessary for our purposes here to use dimensionless temperature). Multiplication of (2.1-5) by $L^2/\kappa$ gives the first dimensionless form,

\[ \frac{\partial T}{\partial t} + Pe\,\mathbf{u}\cdot\nabla T = \nabla^2 T + Q_1, \tag{2.1-6} \]

where $Q_1 = L^2 S/\kappa$ and the Peclet number,

\[ Pe = u_0 L/\kappa, \tag{2.1-7} \]

has been introduced. Each term in $T$ is now presumably (at least 'globally', on average) of order unity in $\Delta T$, the characteristic driving temperature difference: i.e., $\partial T/\partial t = O(1)$, $\mathbf{u}\cdot\nabla T = O(1)$, and $\nabla^2 T = O(1)$. The Peclet number represents a ratio between the 'strength' of the advective and diffusive processes. In fact, an alternative derivation of the Peclet number considers it as the ratio that estimates the relative magnitude of advection to that of diffusion: $\mathbf{u}\cdot\nabla T/\kappa\nabla^2 T$; it then approximates $\mathbf{u}\cdot\nabla T$ by $u_0\Delta T/L$, and approximates $\kappa\nabla^2 T$ by $\kappa\Delta T/L^2$; the result is $u_0 L/\kappa$. If $Pe \ll 1$, then advection is unimportant (almost everywhere, usually) and the process is said to be diffusion-dominated. If $Pe \gg 1$, then diffusion is secondary (again, almost everywhere, usually) and the process is called advection-dominated. Finally, of course, when $Pe = O(1)$, both processes are important. In practice, it is often the case that both transport terms are on nearly equal footing over most of $\Omega$ if $0.1 < Pe < 10$, say.

To obtain the second non-dimensional form, we assume, in contrast to the above, that the time scale is set by advection; in this case, the appropriate measure of time is $L/u_0$, a transport time, and (2.1-1) yields

\[ \frac{u_0}{L}\left(\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T\right) = \frac{\kappa}{L^2}(\nabla^2 T) + S, \tag{2.1-8} \]

which, when multiplied by $L/u_0$, gives the second non-dimensional form,

\[ \frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T = \frac{1}{Pe}\nabla^2 T + Q_2, \tag{2.1-9} \]

where $Q_2 = LS/u_0$, and $Pe$ is still given by (2.1-7). Again, $\partial T/\partial t$, $\mathbf{u}\cdot\nabla T$, and $\nabla^2 T$ are presumably $O(1)$ in $\Delta T$.
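As a small numerical companion to (2.1-7), the following Python sketch evaluates the Peclet number and the two time scales used above. The function names and the sample values are illustrative assumptions, not taken from the text; the thresholds 0.1 and 10 merely echo the rough range quoted above and are not sharp limits.

```python
def peclet_number(u0, length, kappa):
    """Pe = u0 * L / kappa, following (2.1-7)."""
    return u0 * length / kappa


def time_scales(u0, length, kappa):
    """Return (diffusion time L^2/kappa, advection time L/u0)."""
    return length ** 2 / kappa, length / u0


def regime(pe, low=0.1, high=10.0):
    """Rough classification; 'low' and 'high' are assumed cut-offs."""
    if pe < low:
        return "diffusion-dominated"
    if pe > high:
        return "advection-dominated"
    return "both processes important"


# Hypothetical data: u0 in m/s, L in m, kappa in m^2/s.
u0, L, kappa = 2.0, 0.5, 0.25
pe = peclet_number(u0, L, kappa)
t_diff, t_adv = time_scales(u0, L, kappa)
print(pe, t_diff / t_adv, regime(pe))
```

The ratio of the diffusion time to the advection time reproduces the same number, since $(L^2/\kappa)/(L/u_0) = u_0 L/\kappa$.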
Remark: It is sometimes useful to note that the Peclet number (or Reynolds number for momentum transport, next chapter) is also the ratio of the diffusion time scale (or time constant) to the advection time scale (or time constant); e.g., $Pe \gg 1$ means that the 'response' of the advection process to a 'perturbation' is much faster than that of the diffusion process.

It is also of interest to write the Robin BC, (2.1-3), with $\mathbf{K} \to \kappa$, in dimensionless form. Noting first that $\mathbf{n}\cdot\nabla T = n_x(\partial T/\partial x) + n_y(\partial T/\partial y)$ (2D), we see that only
length need be re-scaled, and the result is

\[ \frac{\kappa}{L}\,\mathbf{n}\cdot\nabla T + H(T - \hat{T}) = q, \]

or, using the short-cut notation $\mathbf{n}\cdot\nabla T = \partial T/\partial n$,

\[ \frac{\partial T}{\partial n} + \frac{HL}{\kappa}(T - \hat{T}) = \bar{q}, \tag{2.1-10} \]

where $\bar{q} = qL/\kappa$, and the dimensionless group $HL/\kappa$ has two names, at least in thermal analysis, noting that $h$, the conventional heat transfer coefficient, is $h = \rho C_p H$:

1. $HL/\kappa = Nu$, the Nusselt number (convective heat transfer), and
2. $HL/\kappa = Bi$, the Biot number (transient heat conduction).

Returning now to the dimensionless PDE, it turns out that (2.1-6) is the appropriate form to consider for diffusion-dominated situations in that, as $u_0 \to 0$, $Pe \to 0$, and (2.1-6) becomes the appropriate and familiar 'transient heat equation'; i.e., it describes pure diffusion. On the other hand, (2.1-6) is generally inappropriate for studying advection-dominated cases; here we should use (2.1-9) and thus obtain the appropriate hyperbolic limit, via $Pe \to \infty$ as $\kappa \to 0$, of pure advection (with, in general, a source term). Equation (2.1-9) is also the proper one for studying the effects of boundary layers when $Pe \gg 1$. Thus, (2.1-6) is often more appropriate if $Pe \lesssim 1$, and (2.1-9) is often better if $Pe \gtrsim 1$. It is to be emphasized, however, that either (2.1-6) or (2.1-9) may be used for any value of $Pe$ other than the asymptotic limits of 0 or $\infty$; the forms presented simply better place the weaker of advection or diffusion into the more appropriate setting. The different forms can also have important numerical ramifications, which will be discussed later. In many cases of interest here we will be concerned with the (more difficult) case of advection-dominated flow, $Pe \gg 1$.

We next discuss some general characteristics of a given problem that can cause the numerical solution of the advection-diffusion equation (and the NS equations, for that matter) to be (relatively) either easy or difficult to obtain (the latter requiring fine zoning in at least some portions of the domain):

1.
Since diffusion is a smoothing process, diffusion-dominated flows ($Pe \ll 1$), for which the PDE is predominantly parabolic in character, are generally easier than advection-dominated flows ($Pe \gg 1$), which are more hyperbolic in character and do not smooth the solution. Exceptions occur, of course, but are usually associated with early time (small $t$); e.g., a sharp change in a BC and/or a very non-smooth source function ($S$) will make the solution more difficult, even if $Pe \ll 1$, at least during the initial transient period. Steady-state simulations ($\partial T/\partial t = 0$) with $Pe \ll 1$ are generally the easiest.

2. Smooth wave forms (spatial via IC's, or temporal via inflow BC's) are easy compared with rapidly changing ones. The combination of non-smooth wave forms, which contain large amounts of 'energy' in the short wavelengths (in the context of Fourier analysis), and large $Pe$, in which case the difficult wave form must be translated with little change of shape, is a difficult problem. This situation is usually only encountered in time-dependent simulations wherein, for example, the advection of a 'square wave,' or a sequence of them via time-varying inlet BC's, is very difficult, as we will demonstrate.
3. For problems in which the flow must, usually for reasons of computational feasibility, actually leave the computational domain, the simulation can be either easy or difficult, depending on the form of the outflow boundary condition (OBC) employed. Usually in these cases, the outflow 'boundary' ($\mathbf{n}\cdot\mathbf{u} > 0$) is artificial in that the true domain does not end at this location, and thus the true (or physical) 'BC' is not known (there is no boundary and therefore no BC); nevertheless, the PDE, e.g., (2.1-1), cannot generally be solved unless some (mathematically allowed) BC is employed. [An exception occurs if the problem is truly hyperbolic ($\kappa = 0$), in which case the PDE does not require an OBC, and none should be imposed.] These simulations are especially difficult if $Pe \gg 1$ (but bounded; i.e., $\kappa \ne 0$) and (2.1-2) is employed at an outflow boundary, as already mentioned. The reason for the difficulty is that a thin, outflow boundary layer will form, and the interior solution, $T$, which is generally not close to $T_D$ as the flow approaches $\Gamma_D$ from the interior, must rapidly change to be equal to $T_D$ in a very short (non-dimensional) distance, of $O(1/Pe)$, from (2.1-9). (This distance, say, $\delta$, in the flow direction can be derived by 'equating' advection to diffusion in this boundary layer: $u_0\Delta T/\delta = \kappa\Delta T/\delta^2$, or $\delta = \kappa/u_0$, which in non-dimensional form is $\delta/L = \kappa/u_0 L = 1/Pe$.) The problem is much easier, on the other hand, if an appropriate form of (2.1-3) is employed; if $H$ and $q$ are zero, then the BC given by $\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) = 0$ does not cause a boundary layer phenomenon (see also Gartling, 1978; Chang and Finlayson, 1980; Gresho and Lee, 1981) and is usually a good OBC for a 'computationally truncated' domain. For inflow boundaries ($\mathbf{n}\cdot\mathbf{u} < 0$), this problem does not arise, even for $Pe \gg 1$, and either (2.1-2) or (2.1-3) can be utilized.
[The former is often more appropriate, especially for large $Pe$; indeed, if $\mathbf{K} = 0$, then (2.1-2) is the only allowable BC at inflow boundaries.] Finally, these comments apply to both time-dependent and steady-state simulations.

4. Flows around obstacles can be difficult if $Pe \gg 1$, again for both steady-state and time-dependent situations; i.e., the advection-diffusion equation can be difficult to solve, even if the flow field is 'simple,' like potential flow or Stokes flow. The reason is, again, the formation of thin boundary layers, especially when (2.1-2) is applied on the 'upwind' portion of the obstacle (where fine zoning will be required). This type of situation is frequently encountered in heat and/or mass-transfer analysis.

5. Pure advection ($\kappa = 0$, $Pe = \infty$) simulations are often difficult, but are sometimes actually easier than large $Pe$ (e.g., $10^4$) problems, owing to the simpler BC's and absence of boundary layers for the hyperbolic equation. For these situations, generally the only appropriate BC for (2.1-1) is (2.1-2), applied at inflow boundaries only. Again, exceptions are possible; e.g., in the case of a steady flow with an internal (closed) recirculation region (attached eddy), the steady, pure advection equation ($\mathbf{u}\cdot\nabla T = 0$) has no solution unless $T$ is specified at some point on each streamline in the eddy. A general solution to $\mathbf{u}\cdot\nabla T = 0$ for constant $\mathbf{u} = (u, v)$ is $T(x, y) = f(vx - uy)$ for any $f(\cdot)$. For the time-dependent case, on the other hand, the solution is completely determined from the initial conditions, and there is, in general, no steady state. The numerical simulation of such flows can be difficult.
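The $O(1/Pe)$ outflow boundary layer described in item 3 can be made concrete with a standard 1D model problem, which is an illustrative assumption and not an example from the text: the steady form of (2.1-9) with unit velocity and no source, $dT/d\xi = (1/Pe)\,d^2T/d\xi^2$ on $0 \le \xi \le 1$, with Dirichlet data $T(0) = 0$ at inflow and $T(1) = 1$ at outflow. Its exact solution is $T = (e^{Pe\,\xi} - 1)/(e^{Pe} - 1)$, evaluated below in an overflow-safe form. The function name is hypothetical.

```python
import math


def outflow_layer_profile(xi, pe):
    """Exact solution of dT/dxi = (1/Pe) d2T/dxi2 with T(0)=0, T(1)=1.

    Algebraically equal to (exp(Pe*xi) - 1) / (exp(Pe) - 1), rewritten
    so that large Pe does not overflow. Assumes pe >= 0.
    """
    if pe == 0.0:
        return xi  # pure diffusion limit: linear profile
    decay = math.exp(-pe * (1.0 - xi))            # boundary-layer factor
    ratio = math.expm1(-pe * xi) / math.expm1(-pe)  # close to 1 away from inflow
    return decay * ratio


# One layer thickness (1/Pe) upstream of the outlet, the solution has
# already dropped to about exp(-1) of the outflow Dirichlet value.
for pe in (10.0, 100.0, 1000.0):
    xi = 1.0 - 1.0 / pe
    print(pe, outflow_layer_profile(xi, pe))
```

For large $Pe$ the profile is essentially zero over most of the domain and rises to the imposed outflow value within a distance of about $1/Pe$, which is why fine zoning is needed there when (2.1-2) is imposed at outflow.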
One of the advantages of the FEM (a property also shared with most spectral methods but not by many finite difference methods) is that the approximate solutions are always obtained subject to the same BC's as are appropriate to the continuum equations—no more, no less; e.g., (i) when Neumann or mixed BC's are involved, they are 'automatically,' unambiguously, and consistently (although approximately) incorporated (as we
shall demonstrate) into the discretized equations; (ii) in the hyperbolic limit, $\kappa = 0$, the PDE requires no BC's at any outflow region ($\mathbf{n}\cdot\mathbf{u} > 0$) of $\partial\Omega$, a situation that is again 'built-in' to the final FEM equations, as long as it is remembered that both $H$ and $q$ must then also be zero. Hence, an appreciation and understanding of the right (and wrong) BC's for the PDE's is especially useful in FEM approximations. In fact, the motivation behind many of the above remarks is related to these issues, and more. If the analyst (or 'modeler') possesses a good understanding of the qualitative behavior of the continuum equations and BC's, including limiting cases, s/he is in a much better position to plan the experiments, create a good mesh, and finally, to understand and interpret the numerical results, especially with regard to the frequently asked question, 'What went wrong?'

2.1.3 The Divergence (Conservation) Form

Since $\nabla\cdot\mathbf{u} = 0$, an equivalent form of (2.1-1) is (for $\mathbf{K} \to \kappa$ = constant, for simplicity)

\[ \frac{\partial T}{\partial t} + \nabla\cdot(\mathbf{u}T) = \kappa\nabla^2 T + S \]

or

\[ \frac{\partial T}{\partial t} + \nabla\cdot(\mathbf{u}T - \kappa\nabla T) = S, \tag{2.1-11} \]

which is called the (flux-) divergence form, since

\[ \mathbf{q}_A = \mathbf{u}T \tag{2.1-12} \]

is the advective flux vector and

\[ \mathbf{q}_D = -\kappa\nabla T \tag{2.1-13} \]

is the diffusive flux vector. That is, with $\mathbf{q}_T = \mathbf{q}_A + \mathbf{q}_D$, the total flux vector, (2.1-11) is clearly

\[ \frac{\partial T}{\partial t} + \nabla\cdot\mathbf{q}_T = S, \tag{2.1-14} \]

which is called a conservation form because integration over $\Omega$ gives directly, via the divergence theorem, the following global conservation law:

\[ \frac{d}{dt}\int T = \int S - \int_\Gamma \mathbf{n}\cdot\mathbf{q}_T; \tag{2.1-15} \]

i.e., the total energy (or mass if $T$ represents a concentration or mass fraction) changes (decreases) only by the net flux of $T$ out of the domain through the boundary, except, of course, for the source term.

Remark: Here and hereafter, we often employ the abbreviated but convenient notation that $\int(\cdot)$ means integration of $(\cdot)$ over $\Omega$ and $\int_\Gamma(\cdot)$ denotes integration of $(\cdot)$ over the boundary of $\Omega$.
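The equivalence of the advective term $\mathbf{u}\cdot\nabla T$ and the divergence term $\nabla\cdot(\mathbf{u}T)$ when $\nabla\cdot\mathbf{u} = 0$ can be checked numerically at a point. The sketch below is only an illustration under assumed data: the velocity field (rigid rotation, $\mathbf{u} = (y, -x)$), the scalar field $T = x^2 y$, and all function names are hypothetical choices, and derivatives are approximated by central differences.

```python
def velocity(x, y):
    """Assumed divergence-free field: d(y)/dx + d(-x)/dy = 0."""
    return y, -x


def temperature(x, y):
    """Assumed smooth scalar field."""
    return x * x * y


def d_dx(f, x, y, h=1e-5):
    """Central-difference approximation of df/dx."""
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)


def d_dy(f, x, y, h=1e-5):
    """Central-difference approximation of df/dy."""
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)


def advective_term(x, y):
    """u . grad(T), as in the advective form (2.1-1)."""
    u, v = velocity(x, y)
    return u * d_dx(temperature, x, y) + v * d_dy(temperature, x, y)


def divergence_term(x, y):
    """div(u T), as in the divergence form (2.1-11)."""
    def flux_x(a, b):
        return velocity(a, b)[0] * temperature(a, b)

    def flux_y(a, b):
        return velocity(a, b)[1] * temperature(a, b)

    return d_dx(flux_x, x, y) + d_dy(flux_y, x, y)


x0, y0 = 0.7, -0.4
print(advective_term(x0, y0), divergence_term(x0, y0))
```

For these fields both expressions equal $2xy^2 - x^3$ analytically. If the velocity were not divergence-free, the two would differ by $T\,\nabla\cdot\mathbf{u}$.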
Now it is clear that the same global conservation law could also have been derived from (2.1-1) because $\nabla\cdot\mathbf{u} = 0$. So, one may reasonably ask: What is the reason for
discussing the divergence form? The detailed answer will come later and is in two parts, which we merely hint at now: (1) in the weak formulations of the transport equation, the two forms thus far discussed (advective form and divergence form) can differ owing to different natural BC's; and (2) in the spatially discretized equations, we generally do not obtain $\nabla\cdot\mathbf{u} = 0$ pointwise, with the result that only the divergence form can assure global conservation, an assertion we shall later prove. And this leads naturally to the subject of the next section.

2.1.4 Conservation Laws

Often, one of the goals of approximate solutions to PDE's, in addition to the principal goal of finding a cost-effective approximate solution that is close to the continuum solution, is the assurance that the approximate solution will satisfy discrete approximations to certain global conservation laws that are satisfied by the continuum solution and that are basically independent of the 'local error'; i.e., they are satisfied on the coarsest of meshes. The principal reason for this goal is the desire to attain stable and bounded numerical solutions, independently of the issue of accuracy. This presumes, of course, that the PDE solution is itself stable and bounded. Toward this end then, we present next a brief discussion of the relevant conservation laws for the AD equation, so that we can set our sights toward the proper goals when later generating numerical approximations.

The first of these, global conservation of $T$, has already been derived in (2.1-15), which we restate in expanded form:

\[ \frac{d}{dt}\int T = \int S - \int_\Gamma \mathbf{n}\cdot(\mathbf{u}T - \kappa\nabla T), \tag{2.1-16} \]

showing that internal transport (i.e., within $\Omega$) of $T$ via the principal transport processes (advection and diffusion) makes no contribution to the global change of $T$; it merely redistributes it within $\Omega$.
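A discrete analogue of (2.1-16) can be seen in a very small experiment. The sketch below is not the finite element method of this book; it is an assumed 1D explicit scheme written in flux form (cell averages updated by differences of face fluxes), with $\mathbf{u} = 0$ and zero diffusive flux at both ends, so that the boundary integral in (2.1-16) vanishes. All names and parameter values are illustrative.

```python
def diffuse_insulated(T, kappa, dx, dt, source=0.0):
    """One explicit step of dT/dt = kappa*T_xx + S with zero-flux ends.

    Face fluxes are stored once and shared by neighbouring cells, so
    whatever leaves one cell enters the next (interior redistribution).
    """
    n = len(T)
    flux = [0.0] * (n + 1)            # flux[0] = flux[n] = 0: insulated ends
    for i in range(1, n):
        flux[i] = -kappa * (T[i] - T[i - 1]) / dx
    return [T[i] + dt * (source - (flux[i + 1] - flux[i]) / dx)
            for i in range(n)]


def total(T, dx):
    """Discrete approximation of the integral of T over the domain."""
    return sum(T) * dx


n, kappa = 50, 1.0
dx = 1.0 / n
dt = 0.25 * dx * dx                    # within the explicit stability limit
T0 = [1.0 if 20 <= i < 30 else 0.0 for i in range(n)]   # square pulse

T = list(T0)
for _ in range(200):
    T = diffuse_insulated(T, kappa, dx, dt)
print(total(T0, dx), total(T, dx))     # unchanged up to round-off

Ts = list(T0)
for _ in range(200):
    Ts = diffuse_insulated(Ts, kappa, dx, dt, source=2.0)
print(total(Ts, dx) - total(T0, dx), 2.0 * 1.0 * 200 * dt)  # growth = S * L * t
```

Without a source the discrete total is unchanged while the pulse spreads; with a uniform source the total grows by exactly the integrated source, mirroring the two surviving terms of (2.1-16) in this insulated, no-flow setting.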
Invoking BC (2.1-3) in (2.1-16) yields another equivalent form of the global energy (enthalpy) conservation statement:

\[ \frac{d}{dt}\int T = \int S + \int_{\Gamma_N}\left[q + H(\hat{T} - T)\right] + \int_{\Gamma_D}\kappa\frac{\partial T}{\partial n} - \int_\Gamma \mathbf{n}\cdot\mathbf{u}T, \tag{2.1-17} \]

in which the individual boundary contributions are more clearly displayed. The global energy increases owing to (1) the internal heat source, (2) the applied heat flux on $\Gamma_N$, (3) the convective heat flow, also on $\Gamma_N$, (4) inflow (for $\partial T/\partial n > 0$) of diffusive heat flux owing to the specified value of $T$ on $\Gamma_D$, and (5) net inflow of advective flux owing to the velocity field (recall that $\mathbf{u}\cdot\mathbf{n} < 0$ for inflow).

Remark: If a steady solution is sought for the somewhat special case of $\Gamma_D = \emptyset$ (no Dirichlet boundary), $H = 0$, and $\mathbf{u}\cdot\mathbf{n} = 0$ on $\Gamma$, (2.1-17) yields a constraint on the data; i.e., it states that $0 = \int S + \int_\Gamma q$. If this solvability condition is not satisfied, then the problem is ill-posed, and no solution exists because the given data preclude a global balance and are thus inconsistent.

Another energy-like quantity that is often of interest is a quadratic one: How does $E = \int T^2$, a positive-definite quantity, behave? (Note that $\int T$ could be well-behaved
even if $T$ is locally 'poorly-behaved'; e.g., small regions of large negative $T$ could be cancelled by small regions of large positive $T$.) To answer this question, we first multiply (2.1-11) by $T$ and integrate over $\Omega$:

\[ \int T\frac{\partial T}{\partial t} + \int T\,\nabla\cdot(\mathbf{u}T - \kappa\nabla T) = \int ST. \]

Application of the divergence theorem after an integration by parts of the two transport terms yields, with $\nabla\cdot\mathbf{u} = 0$,

\[ \frac{1}{2}\frac{d}{dt}\int T^2 = \int ST - \kappa\int\nabla T\cdot\nabla T - \frac{1}{2}\int_\Gamma \mathbf{n}\cdot(\mathbf{u}T^2 - \kappa\nabla T^2), \tag{2.1-18} \]

which merits the following Remarks:

(1) If $S > 0$, then the source term will act to increase (decrease) $E$ if $T > 0$ ($< 0$).

(2) Dissipation, the second term on the RHS, will (try to) decrease $E$ monotonically (because $\int\nabla T\cdot\nabla T \ge 0$) and is the reason that diffusional processes are called dissipative. It is noteworthy that this type of 'damping' is present in the $T^2$ equation, but not in the $T$ equation; internal diffusion acts to equalize $T$, conserve its integral, and decrease $\int T^2$, consistent with thermodynamics.

(3) The boundary terms show that $T^2$, like $T$, is subject to inflow/outflow along $\Gamma$ by (again, like $T$) both transport processes.

(4) If $\mathbf{n}\cdot\mathbf{u} = 0$ (contained flow) and $\mathbf{n}\cdot\nabla T = 0$ on $\Gamma$ ('insulated' container), then we have $d\int T/dt = \int S$ and $\frac{1}{2}\,d\int T^2/dt = \int ST - \kappa\int\nabla T\cdot\nabla T$. In a situation with no source term, $\int T = \int T_0$, where $T_0(\mathbf{x})$ is the initial temperature, and $E$ decays monotonically, showing that $dE/dt \to 0$ and $T \to$ constant as $t \to \infty$; i.e., a steady state will be attained in which the constant final temperature is the same as the average initial temperature. For $S \ne 0$, a steady solution can clearly only be attained if $\int S = 0$; sources and sinks must balance.

(5) If $\kappa = 0$ (pure advection) and $\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma$, then the sourceless situation will conserve all powers of $T$; i.e., it then follows that $\int T^m = \int T_0^m$, $m = 1, 2, \ldots$.
(6) (2.1-18) can be used to prove uniqueness as follows: denote $T$ as the difference between two alleged solutions, whence the source term vanishes and the BC's (and IC) are homogeneous: $T = 0$ on $\Gamma_D$ and $\kappa\,\partial T/\partial n + HT = 0$ on $\Gamma_N$; (2.1-18) then becomes

\[ \frac{1}{2}\frac{d}{dt}\int T^2 = -\kappa\int\nabla T\cdot\nabla T - \frac{1}{2}\int_\Gamma \mathbf{n}\cdot\mathbf{u}\,T^2 + \kappa\int_{\Gamma_D} T\frac{\partial T}{\partial n} - \int_{\Gamma_N} HT^2; \]

but $T = 0$ on $\Gamma_D$ and thus, if $\mathbf{n}\cdot\mathbf{u} \ge 0$ on $\Gamma_N$, $d\int T^2/dt \le 0$; finally, since $T = 0$ at $t = 0$, so too is $\int T^2$; thus, $\int T^2 = 0$ is the only possible solution, which $\Rightarrow T = 0$. QED. Note that any inflow on $\Gamma_N$ ($\mathbf{n}\cdot\mathbf{u} < 0$) loses the negative semi-definiteness of the RHS and precludes a simple uniqueness proof; there could presumably exist two (or more) solutions, differing because the temperature at the inflow portions of $\Gamma_N$
could differ; it is 'uncontrolled' there. [In practice, however, the non-unique case is sometimes 'solved' in the CFD laboratory, uniquely or not. In fact, even the full Navier-Stokes equations and Boussinesq equations are often 'solved' in cases wherein there is some inflow at an ostensibly outflow portion of the domain (an eddy gets 'chopped in half' at $\Gamma_N$), thus bringing into $\Omega$ 'unknown' values of both velocity and temperature, and the solutions usually look quite believable. C'est la vie.]

These results can be regarded as some goals for the approximate (numerical) solutions. We will later return to these conservation issues after deriving the numerical approximations, both semi-discrete, which lead to a set of ordinary differential equations (ODE's) in time, and fully discrete, in which a time-marching method has been selected.

2.1.5 Weak Forms of the PDE's/Natural Boundary Conditions

The next step toward a Galerkin FEM solution is to recast the governing PDE, either (2.1-1) or (2.1-11), into the weak (or Galerkin) form, sometimes also referred to as a variational form. Here and hereafter, when we speak (loosely, sometimes) of the weak form of an equation (PDE), we are usually referring to the final result of a weak formulation of the spatial part of the problem (PDE plus BC's); i.e., weak forms generally come with BC's. The weak form can be derived in several ways, but below we offer mainly one, which is usefully heuristic even if not totally rigorous in all situations. (The 'end of the day'/'bottom line' results, however, are rigorous.) We also state at the outset that while the classical statement of a problem (PDE + BC's + IC's, also often referred to as an IBVP, Initial Boundary Value Problem) is generally unique and unambiguous, there is usually no unique weak statement of the same problem.
But while there may exist alternate weak formulations of a given problem, they are actually equivalent, at least when a classical solution exists, in which case the solution is said to be 'sufficiently smooth.' Some weak formulations, however, are more useful than others because (at least) they more efficiently and more 'naturally' take account of the BC's. Part of the 'game,' therefore, is to find the most appropriate weak form, a task that is often non-obvious and non-trivial, especially when we consider the NS equations in the next chapter. (Thus, the FDM problem of 'how to discretize each operator and how to treat each term at a boundary?' is replaced by the FEM problem of 'selection of the weak form.') Another, and very important, attribute of weak solutions is this: a great many physically interesting and seemingly well-posed problems possess weak solutions but do not possess classical solutions.

Beginning with (2.1-1) then, we first suppose that we have a solution, $T(\mathbf{x}, t)$, that satisfies this PDE. It is then clear that $w[\partial T/\partial t + \mathbf{u}\cdot\nabla T - S - \nabla\cdot(\mathbf{K}\cdot\nabla T)] = 0$, and therefore that

\[ \int w\left(\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T\right) = \int w\left[S + \nabla\cdot(\mathbf{K}\cdot\nabla T)\right] \tag{2.1-19} \]

is also satisfied for all (bounded, which we assume) functions, $w(\mathbf{x})$. The next step is to immediately narrow the class of so-called test functions, $\{w\}$, so that $w$ is at least once-differentiable. This restriction on $w$ permits the use of the following identity:

\[ \nabla\cdot[w(\mathbf{K}\cdot\nabla T)] = w\nabla\cdot(\mathbf{K}\cdot\nabla T) + \nabla w\cdot(\mathbf{K}\cdot\nabla T), \]
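which we integrate over $\Omega$ and apply the divergence theorem to the LHS to obtain

\[ \int\nabla\cdot[w(\mathbf{K}\cdot\nabla T)] = \int_\Gamma w\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) = \int w\nabla\cdot(\mathbf{K}\cdot\nabla T) + \int\nabla w\cdot(\mathbf{K}\cdot\nabla T); \]

i.e., we have obtained, via integration-by-parts and application of the divergence theorem, the following important result:

\[ \int w\nabla\cdot(\mathbf{K}\cdot\nabla T) = \int_\Gamma w\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) - \int\nabla w\cdot(\mathbf{K}\cdot\nabla T), \tag{2.1-20} \]

whose 'importance' will soon become clear. Using (2.1-20), (2.1-19) becomes

\[ \int\left[w\left(\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T\right) + \nabla w\cdot(\mathbf{K}\cdot\nabla T)\right] = \int wS + \int_\Gamma w\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T), \tag{2.1-21} \]

in which the diffusive normal boundary flux is now prominent, and is one reason that (2.1-20) is important. Namely, recalling now the Neumann (Robin) BC, given by (2.1-3), leads to

\[ \int\left[w\left(\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T\right) + \nabla w\cdot(\mathbf{K}\cdot\nabla T)\right] = \int wS + \int_{\Gamma_D} w\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) + \int_{\Gamma_N} w\left[q - H(T - \hat{T})\right], \tag{2.1-22} \]

where we have separated the boundary integral into two parts, one over $\Gamma_D$ and the other over $\Gamma_N$, in order to (naturally) incorporate the Neumann BC; recall that $\Gamma_D + \Gamma_N = \Gamma$. We are almost, but not quite, to the desired weak form. To finish, we now further restrict the class of test functions, $\{w\}$, of which there is an infinite number, to those that vanish on the Dirichlet portion of $\partial\Omega$; i.e., we now require $w = 0$ on $\Gamma_D$. Calling this class of functions $H_0^1$, our final weak form of the advective form of the AD equation is obtained (and this is important) by dropping the assumption that $T(\mathbf{x}, t)$ satisfies (2.1-1), instead considering it as an unknown function that need only be once piecewise-differentiable [i.e., it no longer need satisfy (2.1-1)], and requiring (2.1-22) to hold for every function $w(\mathbf{x})$ in $H_0^1$; i.e., find $T(\mathbf{x}, t) \in H_E^1$ such that

\[ \int\left[w\left(\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T\right) + \nabla w\cdot(\mathbf{K}\cdot\nabla T)\right] = \int wS + \int_{\Gamma_N} w\left[q - H(T - \hat{T})\right] \quad \forall w \in H_0^1, \]

which we rearrange to place the unknown boundary temperature on the LHS and the data on the RHS: find $T(\mathbf{x}, t)$ in $H_E^1$ such that

\[ \int\left[w\left(\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T\right) + \nabla w\cdot(\mathbf{K}\cdot\nabla T)\right] + \int_{\Gamma_N} wHT = \int wS + \int_{\Gamma_N} w(q + H\hat{T}) \quad \forall w \in H_0^1, \tag{2.1-23} \]

where $H_E^1$ is that set of once piecewise-differentiable functions in $\Omega$ that satisfy the essential BC, (2.1-2), on $\Gamma_D$.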
This is the final weak form of (2.1-1); it incorporates automatically BC (2.1-3) and can be solved, in principle at least, for T(x, t) once the initial data, (2.1-4), are supplied
at $t = 0$. In fact, we now discard (2.1-1) through (2.1-3) and regard (2.1-23) as the 'God-given' form of the problem; and also note that $T(\mathbf{x}, t)$, the weak solution, can (but need not) now reside in a larger function space than do solutions of (2.1-1), since the weak solution need not even possess second spatial derivatives, at least in the classical sense. A final comment on this weak formulation is that the BC (2.1-3) has been incorporated into the solution in a (relatively) natural way and is the reason (or one reason, at least) that such a BC is called a natural boundary condition (NBC); it (the Neumann BC for the Laplacian operator) is 'natural' to this weak formulation, and (2.1-23) is the 'natural' weak form of (2.1-1). See also Strang and Fix (1973) for further useful elucidation, including the concept of 'completion' of the function space. [The actual origin of the term is in the calculus of variations; e.g., 'Any BC in the boundary value problem which need not be imposed on the set of admissible functions in the variational principle is said to be a natural boundary condition. Other BC's are essential' (Stakgold, 1979).]

It is interesting and instructive to reverse the procedure that 'generated' this weak form, at least when this is permissible; i.e., when $T(\mathbf{x}, t)$ is sufficiently smooth. To this end, we assume sufficient regularity and manipulate, a la (2.1-20), the diffusion term as follows (the last step uses $w = 0$ on $\Gamma_D$):

\[ \int\nabla w\cdot(\mathbf{K}\cdot\nabla T) = \int\nabla\cdot(w\mathbf{K}\cdot\nabla T) - \int w\nabla\cdot(\mathbf{K}\cdot\nabla T) = \int_\Gamma w\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) - \int w\nabla\cdot(\mathbf{K}\cdot\nabla T) = \int_{\Gamma_N} w\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) - \int w\nabla\cdot(\mathbf{K}\cdot\nabla T), \]

so that (2.1-23) becomes, after rearrangement,

\[ \int w\left[\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T - \nabla\cdot(\mathbf{K}\cdot\nabla T) - S\right] = \int_{\Gamma_N} w\left[q - H(T - \hat{T}) - \mathbf{n}\cdot(\mathbf{K}\cdot\nabla T)\right] \quad \forall w \in H_0^1. \tag{2.1-24} \]

Since this equation holds for all $w \in H_0^1$, it follows that it holds for that subset (say $H_{00}^1$) that vanishes on $\Gamma_N$ (as well as on $\Gamma_D$); i.e., for this subset, we have

\[ \int w\left[\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T - \nabla\cdot(\mathbf{K}\cdot\nabla T) - S\right] = 0 \quad \forall w \in H_{00}^1. \]

But since even this subset contains an infinite number of functions (i.e., $w$ is an arbitrary function in $H_{00}^1$), it follows that

\[ \frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T = \nabla\cdot(\mathbf{K}\cdot\nabla T) + S \quad \text{in } \Omega, \]

and we see that $T(\mathbf{x}, t)$ satisfies the original PDE. And this fact, with (2.1-24), leads directly to

\[ \int_{\Gamma_N} w\left[q - H(T - \hat{T}) - \mathbf{n}\cdot(\mathbf{K}\cdot\nabla T)\right] = 0 \quad \forall w \in H_0^1. \]

But again the set of test functions is of infinite dimension, and thus

\[ \mathbf{n}\cdot(\mathbf{K}\cdot\nabla T) + H(T - \hat{T}) = q \quad \text{on } \Gamma_N \]
is necessarily true, and we see also that $T(\mathbf{x}, t)$ satisfies the Neumann/Robin BC. But it also satisfies the Dirichlet BC, (2.1-2), and the IC, (2.1-4), both by construction; i.e., $T(\mathbf{x}, t)$ satisfies the original IBVP. Hence, we have just proven a special case of the following general result: if the solution of a weak form of the problem is sufficiently smooth, then that solution is also a classical solution of the same problem. The key word is IF, a word that is missing when going the other direction; i.e., a classical solution is always also a weak solution. This distinction is actually rather important in practice because most (or at least many) problems posed do not satisfy all of the smoothness requirements in order that a classical solution exists. (It is perhaps also worth pointing out that conventional finite-difference approximate solutions to such problems can also converge only to a weak or generalized solution in such cases.) As stated by Rektorys (1980, p. 377), 'The weak solution ... represents a substantial generalization of the concept of a classical solution of a differential equation with boundary conditions. However, the weak solution is a generalization considerably more expressive as regards the range of problems, on the one hand, and the assumptions imposed on the given data of the problem on the other hand.' Finally, for an interesting 'formal' proof of the equivalence, at least for the steady case, see Hughes (1987, pp. 4 and 60).

Weak solutions are also referred to as generalized solutions or solutions in a distributional sense. If the classical solution exists, which requires (at least) smooth data (e.g., no delta functions in $S$ or jumps in $\mathbf{K}$) and a domain with a sufficiently smooth boundary (e.g., no L-shaped domains are allowed), then the weak solution will also be a classical solution, as shown above.
If, however, the problem is not smooth enough, a strictly classical solution (e.g., one for which $\nabla^2 T$ exists everywhere) will not exist, whereas a weak solution usually will. For further discussion regarding technical definitions of 'sufficiently smooth,' refer to, for example, Strang and Fix (1973). We conclude by reiterating: classical solutions are subsets of weak solutions; classical solutions [i.e., the solution to (2.1-1)] will always satisfy (2.1-23), but solutions of (2.1-23) will not always satisfy (2.1-1).

Before taking the next step toward a finite element solution, we digress to consider another weak formulation: if we start with the conservation form of the PDE, (2.1-11), to generate the weak form, it may seem natural (although it is not necessary) to also integrate the other divergence term, $w\nabla\cdot(\mathbf{u}T)$, by parts, so that the total flux, $\mathbf{u}T - \mathbf{K}\cdot\nabla T$, is thus so treated. The result is

$$\int_\Omega \left[w\frac{\partial T}{\partial t} + \nabla w\cdot(\mathbf{K}\cdot\nabla T - \mathbf{u}T)\right] = \int_\Omega wS + \int_\Gamma w\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T - \mathbf{u}T),$$

in which the total flux appears in both domain and boundary integrals. This suggests, properly, that if the Neumann/Robin BC was

$$\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T - \mathbf{u}T) + H(T - \hat{T}) = q \quad \text{on } \Gamma_N \tag{2.1-25}$$

instead of (2.1-6), then the appropriate weak formulation, generated from the divergence form (2.1-11), would lead to: Find $T(\mathbf{x}, t)$ in $H_E^1$ such that

$$\int_\Omega \left[w\frac{\partial T}{\partial t} + \nabla w\cdot(\mathbf{K}\cdot\nabla T - \mathbf{u}T)\right] + \int_{\Gamma_N} wHT = \int_\Omega wS + \int_{\Gamma_N} w(q + H\hat{T}) \quad \forall w \in H_0^1, \tag{2.1-26}$$

rather than that given by (2.1-23).
Remarks:

(1) If the advection term (in the flux-divergence form) was not integrated by parts, then the resulting weak form would be equivalent to that derived earlier, with $\mathbf{u}\cdot\nabla T$ replaced by $\nabla\cdot(\mathbf{u}T)$ in (2.1-23), and would satisfy the natural BC implied by it; i.e., (2.1-3) rather than (2.1-25) above.

(2) While either form of the PDE (advective or flux divergence) could actually be used to solve the AD equation with either (natural) BC, (2.1-3) or (2.1-25), in the weak form, the former is a more natural choice if the BC is (2.1-3) and the latter if it is (2.1-25).

(3) We will have more to say regarding the choice of weak form and associated BC's in Section 2.4 after we 'discretize the weak form' via the finite element method.

(4) The weak forms of the scalar transport equation presented above contain several important and (usually) simpler special cases; e.g., (i) $\mathbf{u} = 0$ gives the transient heat equation (parabolic equation) in a weak form, (ii) $\mathbf{u} = 0$ and $\partial T/\partial t = 0$ gives a weak form of a Poisson problem (elliptic equation), which, in the special case of $\Gamma_N = \Gamma$ and $H = 0$, leads to the previously-stated solvability condition, $\int_\Omega S + \int_\Gamma q = 0$, a global heat balance requirement that is obtained directly from (2.1-23) by setting $w = 1$ there, and (iii) $\mathbf{K} = 0$ gives the pure advection equation (hyperbolic) in a weak form, which also requires $q = 0$ and $H = 0$.

(5) We shall concentrate mainly on the first weak form, (2.1-23), for reasons that will be explained later.

(6) Alternate weak forms could be obtained by either not integrating the diffusion term by parts or integrating it by parts a second time, thereby shifting all of the derivatives to test functions. These options will not be examined herein since they are less relevant to our purpose; see, for example, Hopf (1950), Ladyzhenskaya (1969).
(7) If both $q$ and $H$ are zero, then the boundary integral vanishes, and the 'data preparation' involves 'no action' (no data input) on $\Gamma_N$, resulting in the sometimes-used jargon of a 'do-nothing' BC.

(8) See also Strang and Fix (1973, p. 70) and Carey and Oden (1983, p. 4) for further discussion of the non-uniqueness of weak formulations.

A final remark on existence and uniqueness for the steady-state AD equation: if and only if $\Gamma_N = \emptyset$ or if $\mathbf{n}\cdot\mathbf{u} \ge 0$ on $\Gamma_N$, some powerful 'machinery' from functional analysis (the Lax-Milgram theory in particular; e.g., Axelsson and Barker (1984, p. 158) and Johnson (1987)) applies to (2.1-23): it then assures us that a solution exists and that it is unique. Powerful and useful as this may be (and is), it does not say, as indeed the classical uniqueness 'proof' presented in Remark (6) following (2.1-18) (for the time-dependent case) did not say, that if $\mathbf{n}\cdot\mathbf{u} < 0$ on some portion of $\Gamma_N$ a unique solution does not exist; it simply becomes silent in that case, being 'merely' a sufficient condition. All we can say for sure when portions of $\Gamma_N$ display inflow is that we may be flirting with danger and should proceed cautiously and suspiciously. However, to end this section on a relatively high note, in spite of these theoretical 'deficiencies' for our non-self-adjoint problems in fluid mechanics, we remark that when the Lax-Milgram theory applies, our GFEM solution will possess existence and uniqueness because it
resides in a proper subspace of the continuum solution; i.e., the matrix will be guaranteed invertible.

2.2 THE FINITE ELEMENT EQUATIONS/DISCRETIZATION OF THE WEAK FORM

2.2.1 Advective Form

We now address the issue of 'solving' (albeit approximately) the weak form of the problem, (2.1-23), and mention up front that thus far it is far from obvious that solving (2.1-23) is any easier than solving (2.1-1). But it is the weak form upon which the FEM, a weighted residual method (see Finlayson and Scriven, 1966) or a projection method, is based. The FEM is a general and systematic technique for obtaining approximate solutions of weak forms; i.e., it is a method of 'discretizing' the weak form with the result that the underlying function spaces become finite, and thus amenable to representation via computer. Of course, the resulting solution in this 'truncated' function space is only an approximation to the true weak solution; i.e., to the solution of (2.1-23) in this case. In addition, the approximate solution is based on the approximation of functions, be they given or to be found, via (usually) piecewise polynomials defined on the spatial domain of the problem. Different FEM solutions (by which here and hereafter we always mean approximate solutions to a given continuous problem) arise from the use of different piecewise polynomials and/or different weak forms, all of which are ostensibly trying to solve the same IBVP. But once the choice of weak form is made and once the choice of piecewise polynomial (e.g., linear, quadratic, or cubic) is made, there are virtually no more choices available to the analyst, except, of course, the difficult and important one of how many and what distribution of piecewise polynomials are to be used; i.e., the number and distribution of nodes/elements in the mesh.
Bothersome issues, such as 'How should I treat this term or that term near and/or at this curved boundary, or at that corner?' are simply not present. They are part of the 'package deal' mentioned by Strang and Fix (1973); the rest of the 'recipe' is well defined: just turn the crank (which crank admittedly may sometimes be somewhat resistant, and other times generate less-than-ideal results) to generate the complete, and usually quite good, spatial approximation.

The piecewise polynomials of the FEM are also called basis functions or shape functions and are said to 'span the space': any function in this finite-dimensional subspace is presumed to be representable by an appropriate linear combination of these basis functions. When the test functions, $\{w\}$, are also represented by a linear combination of the same basis functions used to approximate the solution, which we assume to be the case herein, the FEM that evolves is called the Galerkin FEM or GFEM. If the test functions differ from the basis functions, we have a so-called Petrov-Galerkin method, which leads to different numerical approximations; we will say a little more about some of these methods later.

So we now have our next task: to apply the GFEM to the weak form of the AD equation given by (2.1-23). Since the integrals in (2.1-23) involve no derivatives of higher order than one, we can (and do) employ the simplest class of piecewise polynomials called $C^0$ functions (zeroth derivatives are continuous; $C^1$ functions also have continuous first derivatives, etc.); the basis functions are piecewise continuous and linearly independent, and their first derivatives, while discontinuous (they typically suffer jumps at node points), are square integrable, and that is all that is needed for the
terms (namely, the diffusion term) in (2.1-23) to 'make sense'; i.e., they can be evaluated. Second- and higher-order derivatives are not required to 'make sense' nor even to exist.

The next step toward a GFEM solution is to represent the unknown function, $T(\mathbf{x}, t)$, in (2.1-23) as a linear combination of (known) basis functions (piecewise polynomials) with unknown amplitude coefficients that are to be determined in such a way that the resulting approximate solution function, which we call $T^h(\mathbf{x}, t)$, represents $T(\mathbf{x}, t)$ from (2.1-23) in a reasonable way. The generic symbol $h$ is used both to represent a typical (or maximum) element size (length) on the discrete mesh and to remind us that we are henceforth dealing with an approximate solution: $T^h(\mathbf{x}, t) \ne T(\mathbf{x}, t)$ from (2.1-23), but we hope that $T^h(\mathbf{x}, t) - T(\mathbf{x}, t)$ is 'small.' Thus, we write

$$T^h(\mathbf{x}, t) = \tilde{T}(\mathbf{x}, t) + \sum_{j=1}^{N} T_j(t)\,\phi_j(\mathbf{x}), \tag{2.2-1}$$

where $\phi_j$ is the $j$-th (global) finite element basis function (with, however, compact support), $T_j(t)$ is the $j$-th unknown (to-be-determined) amplitude coefficient, $N$ is the number of nodes (in $\Omega$ and on $\Gamma_N$) at which $T^h$ is to be determined, and $\tilde{T}(\mathbf{x}, t)$ is a given function (to be discussed in detail below) whose purpose is to ensure that $T^h(\mathbf{x}, t)$ satisfies the Dirichlet (essential) BC, (2.1-2), since a property of the $\{\phi_j\}$, inherited from that of $\{w\}$, is that $\phi_j = 0$ for $\mathbf{x} \in \Gamma_D$; i.e., for points located on $\Gamma_D$, (2.2-1) gives $T^h(\mathbf{x}, t) = \tilde{T}(\mathbf{x}, t) \approx T_D$ of (2.1-2). (It may be worthwhile to emphasize that $N$ is not the total number of nodes in the mesh; it does not include those on $\Gamma_D$.) Another useful and important property of the $\{\phi_j\}$ is that they are a Lagrange interpolating basis of $C^0$ piecewise polynomials; i.e., $\phi_j(\mathbf{x}_i) = \delta_{ij}$, the Kronecker delta, where $\mathbf{x}_i$ is the location of node $i$.
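The representation (2.2-1) and the Kronecker-delta property can be sketched in one dimension with piecewise-linear 'hat' functions. The following is only an illustrative sketch under stated assumptions: the mesh, the Dirichlet values, the amplitude coefficients, and the names `hat`, `T_D`, `T_nodal`, and `T_h` are all hypothetical and are not taken from the text.

```python
import numpy as np

def hat(j, nodes, x):
    """Piecewise-linear (C0) Lagrange basis function phi_j on a 1D mesh."""
    x = np.asarray(x, dtype=float)
    phi = np.zeros_like(x)
    xj = nodes[j]
    if j > 0:                          # rising part on the element to the left
        xl = nodes[j - 1]
        m = (x >= xl) & (x <= xj)
        phi[m] = (x[m] - xl) / (xj - xl)
    if j < len(nodes) - 1:             # falling part on the element to the right
        xr = nodes[j + 1]
        m = (x >= xj) & (x <= xr)
        phi[m] = (xr - x[m]) / (xr - xj)
    return phi

# Hypothetical non-uniform mesh with N_T = 5 nodes; both end nodes lie on Gamma_D.
nodes = np.array([0.0, 0.3, 0.5, 0.9, 1.0])
n_total = len(nodes)

# Row j holds phi_j evaluated at every node: should be the identity matrix.
delta = np.array([hat(j, nodes, nodes) for j in range(n_total)])

T_D = {0: 2.0, n_total - 1: -1.0}            # assumed Dirichlet data at the two ends
interior = [j for j in range(n_total) if j not in T_D]
T_nodal = np.array([0.7, 0.1, -0.4])         # assumed amplitude coefficients T_j

def T_h(x):
    """Evaluate T^h = (Dirichlet interpolant) + sum_j T_j phi_j, as in (2.2-1)."""
    lift = sum(val * hat(j, nodes, x) for j, val in T_D.items())
    return lift + sum(Tj * hat(j, nodes, x) for Tj, j in zip(T_nodal, interior))
```

Evaluating `T_h(nodes)` returns the Dirichlet values at the end nodes and the amplitude coefficients at the interior nodes, which is the 'convenience feature' of a Lagrange basis.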
This property, combined with a similar representation of $\tilde{T}(\mathbf{x}, t)$ which we shall discuss below, endows the discrete system of equations with the convenient property (as with finite difference methods) that the numerical value of the amplitude coefficient, $T_j(t)$, is also the value of $T^h(\mathbf{x}, t)$ at $\mathbf{x} = \mathbf{x}_j$; i.e., the numbers that come out at the other end are the values of the approximate solution at the nodal points. (This 'convenience feature' is not an essential part of the FEM and will, in some cases to be discussed later when dealing with the NS equations, be waived.)

Next we deal with the 'function space issue' represented by the expression '$\forall w \in H_0^1$' in (2.1-23). First, we note that since our approximation lives in a finite-dimensional subspace of $H^1$, of dimension $N$, we need only (can only, in fact) force (2.1-23) to be satisfied for an analogous finite-dimensional subspace of $H_0^1$, also of dimension $N$. This is accomplished as follows: since each $w^h(\mathbf{x})$, a subset of $\{w\}$, is representable as a linear combination of the $N$ basis functions, $\{\phi_j\}$, it suffices to enforce (2.1-23) only for each of these particular test functions; i.e., '$\forall w \in H_0^1$' is equivalent to and replaced by the finite-dimensional version, 'for $w^h = \phi_i$, $i = 1, 2, \ldots, N$.' [See Becker et al. (1981) and Hughes (1987) for alternative derivations/explanations. See also Remark (6) below.] Thus, the finite-dimensional/GFEM statement of (2.1-23) is obtained by inserting (2.2-1) into the finite-dimensional analog of (2.1-23) to obtain the following set of ordinary differential equations (ODE's) for the amplitude coefficients (nodal values of $T$):

$$\sum_{j=1}^{N}\left\{\left(\int_\Omega \phi_i\phi_j\right)\dot{T}_j + \left[\int_\Omega\left(\phi_i\,\mathbf{u}\cdot\nabla\phi_j + \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla\phi_j)\right) + \int_{\Gamma_N} H\phi_i\phi_j\right]T_j\right\}$$
$$= \int_\Omega \phi_i S + \int_{\Gamma_N}\phi_i(q + H\hat{T}) - \left\{\int_\Omega\left[\phi_i\left(\frac{\partial\tilde{T}}{\partial t} + \mathbf{u}\cdot\nabla\tilde{T}\right) + \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla\tilde{T})\right] + \int_{\Gamma_N}\phi_i H\tilde{T}\right\} \quad \text{for } i = 1, 2, \ldots, N, \tag{2.2-2}$$

where $\dot{T}_j = dT_j/dt$, and we note that the entire term in curly brackets on the RHS is actually a formal method of enforcing the essential BC, (2.1-2), and is not nearly so cumbersome in practice as it appears at first glance.

Further Remarks:

(1) The solution of (2.2-2) approximates that of (2.1-23), which represents a generalized/weak solution of (2.1-1) through (2.1-4).

(2) The entire approximate solution (once IC's are set) of the scalar transport problem is contained in this set of equations, both at all points in $\Omega$ [via solving (2.2-2) for nodal values and by using (2.2-1) elsewhere] and at all points on $\Gamma_N$ [where $T$ is also an unknown function owing to the derivative BC, (2.1-3)].

(3) Hopefully, the dual use of the symbol $N$, for Neumann ($\Gamma_N$) and for the number of nodal unknowns, will not cause a problem.

(4) It is noteworthy (and significant, and perhaps even somewhat amazing) that none of the individual basis functions satisfies the NBC of (2.1-3), yet the solution of (2.2-2) will do so, albeit approximately (as indeed the entire solution is an approximate one) and more closely as $N$ is increased, even when $\Gamma_N$ has a complicated shape. This is, in fact, one of the major advantages of approximating the weak form rather than the strong form. [See Strang and Fix (1973), for more detailed discussions of the theory behind such 'unstable' BC's.]

(5) The ODE's become algebraic equations ($\dot{T}_j = 0$) if the steady AD equation is being solved via the GFEM: a linear system of $N$ equations in $N$ unknowns.
(6) An 'implicit' method of obtaining (2.2-2) that is sometimes used goes as follows: in the finite-dimensional subspace associated with (2.1-23), the generic test function, $w^h$, can be represented as $w^h = \sum_{i=1}^{N} a_i\phi_i = \sum_{i=1}^{N} w^h(\mathbf{x}_i)\phi_i(\mathbf{x})$, and the statement, 'for every $w^h \in H_0^1$' is replaced by 'where the $a_i$ are arbitrary,' which leads to the following version of (2.2-2): $\sum_{i=1}^{N} a_i\{\text{LHS} - \text{RHS}\} = 0$, where LHS is the left-hand side of (2.2-2), etc.; and (2.2-2) then follows immediately since the $a_i$'s are arbitrary coefficients.

(7) For elucidation of our opinion on the subject of the GFEM equation formulation via the process of 'element assembly', or looping through the elements, and the frequent confusion that it has sometimes caused, see the first part of Appendix 2.

It is interesting and perhaps fruitful to show how the $C^0$ approximation actually 'tries' to satisfy both the original PDE and flux-continuity between elements. To do this expeditiously and without loss of generality, we consider the simpler (homogeneous) version
of (2.2-2) in which $\tilde{T}$, $H$, and $q$ are all zero, so that (2.2-2) becomes simply

$$\int_\Omega\left[\phi_i\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h\right) + \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla T^h)\right] = \int_\Omega \phi_i S \quad \forall i.$$

Then, following Hughes (1987, p. 68), we break up the global integral into a sum over elements (which is, in fact, the way most codes are actually written), focus on one of the elements containing node $i$, and integrate the diffusion term by parts as follows:

$$\int_{\Omega_e}\nabla\phi_i\cdot\mathbf{K}\cdot\nabla T^h = \int_{\Gamma_e}\phi_i\,\mathbf{n}\cdot(\mathbf{K}\cdot\nabla T^h) - \int_{\Omega_e}\phi_i\,\nabla\cdot(\mathbf{K}\cdot\nabla T^h),$$

which is legitimate even for a $C^0$ approximation because our basis functions are smooth within $\Omega_e$. (They are, in fact, $C^\infty$ there.) Next we realize that for each element boundary, $\Gamma_e$, that is internal to the domain (not on $\Gamma$, the boundary of the global domain, $\Omega$), the boundary integral will be generated twice as we loop through the elements, once from each side of $\Gamma_e$. But both $\mathbf{n}$ and $T^h$ (and thus $\nabla T^h$) are different in each of these two boundary integrals (but not $\phi_i$), the former merely in sign ($\mathbf{n}$ is outward-pointing in each element) and the latter because $\nabla T^h$ is computed from different 'data.' The net result from each pair of boundary integrals is, upon summation over elements, a jump in the heat flux across each element boundary since, after all, it is precisely on $\Gamma_e$ where the (normal) derivative of $T^h$ is discontinuous. Denoting each jump by $[[\mathbf{n}\cdot\mathbf{K}\cdot\nabla T^h]]$ gives, for each $i$ (node),

$$\sum\int_{\Gamma_e}\phi_i\,[[\mathbf{n}\cdot\mathbf{K}\cdot\nabla T^h]] + \int_\Omega\phi_i\left[\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - \nabla\cdot(\mathbf{K}\cdot\nabla T^h) - S\right] = 0,$$

where the summation is, in effect, only over those elements containing node $i$ (typically a four-patch in 2D). The interpretation of this result is that, at every node, $T^h$ from the GFEM is satisfying both the original PDE and flux continuity, both weakly. [The Euler-Lagrange equations associated with the above 'variational' statement are (2.1-1) and (2.1-3), the latter applying on all internal boundaries (with $q = H = 0$) as well as on the domain boundary. See Hughes (1987) for further discussion.]
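The balance between the flux jump and the element-interior residual can be checked on a very small example. The sketch below assumes a steady 1D problem with $\mathbf{u} = 0$, constant $K$ and $S$, homogeneous Dirichlet data at both ends, and uniform linear elements; these choices and all variable names are illustrative assumptions, not the book's test case.

```python
import numpy as np

# Assumed data: -K T'' = S on (0, 1), T(0) = T(1) = 0, linear elements.
K, S, n_el = 2.0, 3.0, 8
h = 1.0 / n_el
n_int = n_el - 1                          # number of interior (unknown) nodes

# GFEM equations for the interior nodes: (K/h)(-T[i-1] + 2T[i] - T[i+1]) = S h
A = (K / h) * (2 * np.eye(n_int) - np.eye(n_int, k=1) - np.eye(n_int, k=-1))
b = S * h * np.ones(n_int)
T = np.zeros(n_el + 1)
T[1:-1] = np.linalg.solve(A, b)

grad = np.diff(T) / h                     # constant gradient inside each element

# At interior node i the two element-boundary terms combine into the jump:
# (+1) * K * grad(left element) + (-1) * K * grad(right element).
jump = K * (grad[:-1] - grad[1:])

# Inside each element (T^h)'' = 0, so the interior residual term is int phi_i (0 - S).
interior_residual = -S * h * np.ones(n_int)
```

In this sketch each jump equals $Sh$, which is non-zero on any finite mesh; it is the sum of the jump and the interior residual that vanishes at every node.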
Remarks:

(1) It is remarkable how many misleading papers have appeared in the finite element literature in which the jump term was claimed to be zero, typically via a statement like '... and all interior boundary integral terms cancel upon summation ...'

(2) We shall return to this issue, and generalize it, in Chapter 4.

Before moving on to the discussion of the solution of (2.2-2), we must address two more issues: (i) the function $\tilde{T}(\mathbf{x}, t)$, and (ii) IC's. The main job of $\tilde{T}(\mathbf{x}, t)$, as alluded to earlier, is to ensure that the approximate solution satisfies (closely if not exactly) the essential/Dirichlet/stable BC of (2.1-23); there is no 'free lunch' for these BC's. Wait and Mitchell (1985, pp. 88-91) present an interesting sample problem in which a comparison is made of 'blending functions' (which exactly satisfy the essential BC's) and finite element, piecewise-polynomial basis functions (which interpolate the BC's and are therefore exact only at the nodes). The result is that both
'work' quite well, and neither is clearly superior; both preserve the overall 'order' of accuracy. Another analysis, with a similar conclusion (at least for polygonal domains), was done by Fix et al. (1983); they compared the boundary interpolant with a least-squares fit ($L_2$ projection; see Appendix 3) using the same basis functions. We shall follow common practice and use the same class of piecewise polynomials to interpolate $\tilde{T}(\mathbf{x}, t)$ for $\mathbf{x} \in \Gamma_D$ that are used to approximate the solution (and the test functions, in the Galerkin method), a procedure that may enforce a certain degree of regularity upon $T_D(\mathbf{x} \in \Gamma_D)$. That is, we take

$$\tilde{T}(\mathbf{x}, t) = \sum_{j=N+1}^{N_T} T_D(\mathbf{x}_j, t)\,\phi_j(\mathbf{x}) \quad \text{for } \mathbf{x}_j \in \Gamma_D, \tag{2.2-3}$$

where $N_T$ is the total number of nodes in the finite element mesh, and we quickly note that the implied ordering/numbering of the nodes (i.e., $j = 1, 2, \ldots, N, N+1, N+2, \ldots, N_T$) is definitely not appropriate for solution by the computer; it merely simplifies the presentation of the 'theory.' The advantages of this choice for the interpolation of $T_D$ are several:

1. Simplicity; it is 'natural' to the FEM technique, and code writing is much easier.
2. All of the amplitude coefficients, $\{T_j(t)\}$, those in $\Omega$ and those on $\partial\Omega = \Gamma_D + \Gamma_N$, represent the value of the function $T^h(\mathbf{x}, t)$ at the nodes. (This is not true if blending or other functional forms are employed.)
3. The function $\tilde{T}(\mathbf{x}, t)$ is of compact support; it is non-zero only on those elements that are contiguous to $\Gamma_D$ and zero elsewhere. The bracketed term on the RHS of (2.2-2) is thus zero over most of the domain.

For other methods (not recommended by us) of enforcing Dirichlet BC's (Lagrange multipliers, penalty methods, least-squares methods) see Strang and Fix (1973).
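In code, interpolating the Dirichlet data with the same basis functions amounts to assembling the equations for all $N_T$ nodes and moving the columns that multiply the known boundary values to the right-hand side. The sketch below assumes a steady 1D advection-diffusion problem with constant $u$ and $k$, uniform linear elements, and $S = q = H = 0$; the function name `assemble_1d` and all parameter values are hypothetical.

```python
import numpy as np

def assemble_1d(n_el, u, k):
    """Full (N_T x N_T) steady matrix N(u) + K for linear elements on [0, 1]."""
    h = 1.0 / n_el
    Ke = (k / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])    # int phi_i' k phi_j'
    Ne = (u / 2.0) * np.array([[-1.0, 1.0], [-1.0, 1.0]])  # int phi_i u phi_j'
    A = np.zeros((n_el + 1, n_el + 1))
    for e in range(n_el):
        A[e:e + 2, e:e + 2] += Ke + Ne
    return A

# Assumed problem: u T' = k T'' on (0, 1) with T(0) = 0 and T(1) = 1.
n_el, u, k = 20, 1.0, 1.0
A = assemble_1d(n_el, u, k)

dirichlet = np.array([0, n_el])           # nodes on Gamma_D
T_D = np.array([0.0, 1.0])                # prescribed nodal values
free = np.arange(1, n_el)                 # the N unknown nodes

# Known boundary values enter only through the right-hand side of the free rows.
rhs = -A[np.ix_(free, dirichlet)] @ T_D

T = np.empty(n_el + 1)
T[dirichlet] = T_D
T[free] = np.linalg.solve(A[np.ix_(free, free)], rhs)

x = np.linspace(0.0, 1.0, n_el + 1)
T_exact = np.expm1(u / k * x) / np.expm1(u / k)   # exact solution for comparison
```

Only the rows for nodes adjacent to the Dirichlet nodes receive a non-zero right-hand side here, which reflects the compact support of the boundary interpolant.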
Turning finally to the subject of initial conditions, we mention that again at least one alternative to 'interpolation via the basis functions' exists, but that there is usually not a sufficiently compelling reason to introduce this more complicated technique, which is: Compute the 'consistent' IC's by setting $T^h(\mathbf{x}, 0) = T_0(\mathbf{x})$ weakly; i.e., from (2.2-1) we obtain

$$\int_\Omega T^h(\mathbf{x}, 0)\,\phi_i = \int_\Omega \tilde{T}(\mathbf{x}, 0)\,\phi_i + \int_\Omega \sum_{j=1}^{N} T_j(0)\,\phi_j\phi_i = \int_\Omega T_0(\mathbf{x})\,\phi_i \quad \text{for } i = 1, 2, \ldots, N, \tag{2.2-4}$$

which is an $N \times N$ linear system for $\{T_j(0)\}$. We leave as an exercise the proof that this is the same $T^h(\mathbf{x}, 0)$ that minimizes the following functional,

$$Q = \int_\Omega \left[T^h(\mathbf{x}, 0) - T_0(\mathbf{x})\right]^2, \tag{2.2-5}$$

where $T^h(\mathbf{x}, 0)$ is again expressed via (2.2-1) and (2.2-3). The initial values thus obtained will generally not agree with $T_0(\mathbf{x})$ at the nodal points, but the resulting $T^h(\mathbf{x}, 0)$ will be as close as possible, in the least squares sense, to $T_0(\mathbf{x})$ in $\Omega$; $T^h(\mathbf{x}, 0)$ is an $L_2$-projection of $T_0(\mathbf{x})$ (see also Appendix 3). While this IC computation is indeed more consistent,
we shall generally again follow precedent/common practice, and simply interpolate the initial data via

$$T_j(0) = T_0(\mathbf{x}_j), \quad j = 1, 2, \ldots, N, \tag{2.2-6}$$

which again simplifies code writing and is usually sufficiently accurate (indeed, the error is zero at each node, so that the only error is that caused by interpolation). Note too that (2.2-6) also obtains from (2.2-4) simply by approximating $T_0(\mathbf{x})$ itself via the interpolant, usually a quite reasonable procedure; the best fit to the interpolant is the interpolant.

Final Remarks on IC's:

(1) Only the $L_2$-projected IC of (2.2-4) can always [and easily; see, for example, Johnson (1987, p. 151)] be shown, at least for the heat equation ($\mathbf{u} = 0$), to satisfy the following stability 'condition' (which is also satisfied by the PDE for $S = 0$, $q = 0$, and $H = 0$): $\|T^h(t)\| \le \|T^h(0)\| \le \|T_0(\mathbf{x})\|$, properly reflecting its dissipative behavior.

(2) Only the $L_2$-projected IC can successfully (and easily) deal with arbitrarily rough data ($T_0(\mathbf{x}) \in L_2$) such as: $T_0(\mathbf{x})$ is a smooth function (say $C^0$ or $C^1$) except at a finite number (or a countably infinite number) of points in $\Omega$, where it takes the value of, say, 1000. Whereas the $L_2$-projection does not even 'see' these points [they live in a set of measure zero in the term $\int_\Omega \phi_i T_0(\mathbf{x})$], the interpolated IC, which would presumably require locating a node at each point, would definitely see them, and generate a very 'bad' solution, since the weak solution of the PDE would also [necessarily, at least via the form presented in (2.2-2)] ignore them. Thus, in using (2.2-6) in the sequel, we must implicitly agree to preclude such irregular initial data, or else explicitly revert to (2.2-4).

(3) If $T_0(\mathbf{x})$ and $T_D(\mathbf{x}, 0)$ disagree at any nodal points on $\Gamma_D$, then the BC must prevail; i.e., it is necessary that $T_j(0) = T_D(\mathbf{x}_j, 0)$ for all nodes on $\Gamma_D$. (If the IC and BC are the same on $\Gamma_D$, then there is no jump there at $t = 0$, and the resulting solution will be smoother.)
(4) If $T_0(\mathbf{x})$ is sufficiently smooth, yet another projected initial condition is also possible (but not used in practice, to our knowledge): the $H^1$ projection; see, e.g., Thomee (1984) and Appendix 3.

The total GFEM problem has now been posed; namely, using (2.2-3), solve (2.2-2) for $T_j(t)$ with IC's obtained from (2.2-6), and (optionally, and rarely done in practice) use (2.2-1) to obtain $T^h(\mathbf{x}, t)$, the full finite-element solution. But 'solve (2.2-2)' is easier said than done, even though we now have only a finite number of unknowns. We will thus later devote a fair amount of attention to methods for solving the ODE's of (2.2-2), but first we shall spend some time studying the ODE system that has been generated. To begin, we rewrite the GFEM problem in the more compact matrix-vector form

$$\mathbf{M}\dot{\mathbf{T}} + [\mathbf{N}(\mathbf{u}) + \mathbf{K}]\mathbf{T} = \mathbf{f} \quad \text{for } t > 0, \tag{2.2-7}$$

where $\mathbf{T} = (T_1, T_2, \ldots, T_N)^T$ is an $N$-vector of the nodal values, $T_j(t)$, which satisfy $T_j(0) = T_0(\mathbf{x}_j)$ at $t = 0$. Also, $\mathbf{M}$, $\mathbf{N}(\mathbf{u})$, and $\mathbf{K}$ are sparse $N \times N$ matrices ($i, j = 1, 2, \ldots, N$):

$$M_{ij} = \int_\Omega \phi_i\phi_j \tag{2.2-8}$$
is the mass matrix,

$$N_{ij}(\mathbf{u}) = \int_\Omega \phi_i\,\mathbf{u}(\mathbf{x}, t)\cdot\nabla\phi_j \tag{2.2-9}$$

is the advection matrix,

$$K_{ij} = K_{ij}^{D} + K_{ij}^{B}, \tag{2.2-10}$$

where

$$K_{ij}^{D} = \int_\Omega \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla\phi_j) \tag{2.2-11}$$

is the diffusion matrix, and

$$K_{ij}^{B} \equiv \int_{\Gamma_N} H\phi_i\phi_j \tag{2.2-12}$$

is the boundary matrix representing the contribution of the Robin BC. Finally, $\mathbf{f}$ is an $N$-vector that comprises the entire RHS of (2.2-2); i.e., it incorporates the internal source term, the specified boundary heat flux, the remainder (specified portion) of the Robin BC, and it contains information that couples the Dirichlet BC (including the time derivative) to the rest of the problem (the term in curly brackets).

Remarks:

(1) $K_{ij}^{B}$ is zero for most $i, j$; it is only non-zero for those nodes $\{i\}$ on $\Gamma_N$ that 'see' node $j$ (via the support of the basis function).

(2) $\mathbf{M}$ is symmetric and positive-definite (SPD), and causes $(\partial T^h/\partial t)(\mathbf{x}, t)$ to be a best (least-squares) fit to the data: $\nabla\cdot(\mathbf{K}\cdot\nabla T^h) + S - \mathbf{u}\cdot\nabla T^h$ and the NBC of (2.1-3). It is sometimes referred to as the finite-element version of the identity matrix; see, for example, Wathen (1991), or 'the variational equivalent of the identity operator', in Karniadakis et al. (1993).

(3) $\mathbf{K}$ is always symmetric; it is SPD unless $\Gamma_D = \emptyset$ and $H = 0$; i.e., $\mathbf{K}$ is symmetric but singular if Neumann data prevail on all of $\partial\Omega$, a rare occurrence in practice.

(4) Both $\mathbf{M}$ and $\mathbf{K}$ (when SPD) possess all positive eigenvalues, because all positive-definite matrices do.

(5) $\mathbf{N}(\mathbf{u})$ is unsymmetric and indefinite, and its eigenvalues are complex in general, and purely imaginary if $\mathbf{N}$ is skew-symmetric; it is also time-dependent when $\mathbf{u}$ is. $\mathbf{N}(\mathbf{u})$ is always 'close,' in some sense, to being skew-symmetric, 'because' $\mathbf{u}\cdot\nabla$ is a skew-symmetric operator.

(6) Variable coefficients, especially $\mathbf{u}(\mathbf{x}, t)$ or, perhaps more commonly, $\mathbf{u}(\mathbf{x})$, are usually interpolated via the basis functions before performing the integrations in (2.2-9).
(7) We will often, with apologies, use the bad notations that $\mathbf{T}^T$ is the transpose of the temperature vector, that $N$ is both the advection matrix and the length of the $T$-vector, and that $N_T$ refers to the total number of nodes.

(8) Some of these matrices are tabulated, at element level, in Appendix 1.

This may be a good time to attempt to define what we like to refer to as 'honest GFEM':
1. Perform the integrals in the above matrix definitions 'as accurately as possible'; do not use so-called reduced quadrature (typically with Gauss-Legendre quadrature, use a higher than 'minimum' rule; see Leone et al. (1979));
2. Do not cheat on the mass matrix via lumping (we shall later address 'mass lumping' in some detail);
3. When it comes to time integration (Section 2.7), use a non-dissipative method, at least for advection-dominated cases.

2.2.2 Divergence Form

It is now a very simple matter to write the GFEM equations in flux-divergence form, à la (2.1-11); just change the definition of the advection matrix, from (2.2-9) to

$$N_{ij}(\mathbf{u}) = \int_\Omega \phi_i\,\nabla\cdot[\phi_j\,\mathbf{u}(\mathbf{x}, t)]. \tag{2.2-13}$$

But since $\nabla\cdot(\phi_j\mathbf{u}) = \mathbf{u}\cdot\nabla\phi_j + \phi_j\nabla\cdot\mathbf{u}$ and the velocity field is allegedly divergence-free, one may properly ask, 'Why bother with the divergence form since the results are the same?' While the detailed answer can only be provided after we have discussed the GFEM solution of the NS equations in the next chapter, it is appropriate to point out here that $\nabla\cdot\mathbf{u} \ne 0$ when $\mathbf{u}$ is obtained from the approximate (GFEM) solution of the NS equations, and that the velocity field that drives the scalar transport equation often, if not usually, is obtained from just these equations. So we must face the case where the velocity divergence is small but not zero. (The velocity is generally only discretely divergence-free.) Hence, we do not require that $N_{ij}$ from (2.2-13) be the same as $N_{ij}$ from (2.2-9). The consequence of this is that only the use of (2.2-13) can assure global conservation of $T$ in the GFEM solution, an assertion that we shall soon prove.

If, of course, the Robin BC was (2.1-25) rather than (2.1-3), then the GFEM would (or at least, should) be based on the weak form given by (2.1-26) rather than on that given by (2.1-23). The resulting semi-discretized equations would differ from those in (2.2-2) in the following ways:

1. $\int_\Omega \phi_i\,\mathbf{u}\cdot\nabla\phi_j$ is replaced by $-\int_\Omega \nabla\phi_i\cdot(\phi_j\mathbf{u})$; i.e., in this case, $N_{ij}(\mathbf{u}) = -\int_\Omega \phi_j\,\mathbf{u}\cdot\nabla\phi_i$, vis-a-vis (2.2-9) and (2.2-13).
2. The same replacement must be made in the advection part of the Dirichlet BC term on the RHS; i.e., $\int_\Omega \phi_i\,\mathbf{u}\cdot\nabla\tilde{T}$ is replaced by $-\int_\Omega \tilde{T}\,\mathbf{u}\cdot\nabla\phi_i$.

2.2.3 Conservation Laws

We now attempt to mimic the analyses presented in Section 2.1.4, this time for the semi-discrete system of GFEM equations. But before we can do so conveniently, we will modify/generalize/augment our GFEM in the following way (see, for example, Mizukami, 1986; Gresho et al., 1987): rather than stating the problem à la (2.2-1), (2.2-2), and (2.2-3); i.e., find $T^h(\mathbf{x}, t)$ in $\Omega$ and on $\Gamma_N$ from

$$\int_\Omega\left[\phi_i\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h\right) + \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla T^h)\right] + \int_{\Gamma_N}\phi_i H T^h = \int_\Omega \phi_i S + \int_{\Gamma_N}\phi_i(q + H\hat{T}) \quad \text{for } i = 1, 2, \ldots, N, \tag{2.2-14}$$

we generalize this weak form in three ways:

1. Replace $\mathbf{u}\cdot\nabla T^h$ by $\mathbf{u}\cdot\nabla T^h + \beta T^h\nabla\cdot\mathbf{u}$, where the scalar $\beta$ will be defined (and discussed) below.
2. Introduce a new unknown, the diffusive heat flux into $\Omega$ through $\Gamma_D$ (on which $T^h$ is specified, and which is $\mathbf{n}\cdot\mathbf{K}\cdot\nabla T$ in the continuum), as follows:

$$q_D^h(\mathbf{x}, t) = \sum_{j=N+1}^{N_T} q_{D_j}(t)\,\phi_j(\mathbf{x}) \quad \text{on } \Gamma_D, \tag{2.2-15}$$

where the $\{q_{D_j}\}$ are to be determined. Note that $q_D^h$ is a continuous function, whereas $\mathbf{n}\cdot\mathbf{K}\cdot\nabla T^h$ is not.
3. Increase the size of the space of test functions, from those in $\Omega$ and on $\Gamma_N$ to those in $\Omega$ and on $\Gamma = \Gamma_D + \Gamma_N = \partial\Omega$.

The generalized/augmented weak form is then: find $T^h(\mathbf{x}, t)$ in $\Omega$ and on $\Gamma_N$ and find $q_D^h$ on $\Gamma_D$ from

$$\int_\Omega\left[\phi_i\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h + \beta T^h\nabla\cdot\mathbf{u}\right) + \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla T^h)\right] + \int_{\Gamma_N}\phi_i H T^h = \int_\Omega \phi_i S + \int_{\Gamma_N}\phi_i(q + H\hat{T}) + \int_{\Gamma_D}\phi_i q_D^h \quad \text{for } i = 1, 2, \ldots, N_T, \tag{2.2-16}$$

where (still) $T^h$ is given by (2.2-1) and $\tilde{T}$ by (2.2-3), and we immediately point out that (2.2-16) naturally 'decomposes' into two sets of equations: the first set given by (2.2-14), which (as before) can be used to solve the $N$ ODE's for $T^h(\mathbf{x}, t)$ from the first $N$ equations; and the second set by the last $N_T - N$ algebraic equations of (2.2-16), which can be used to solve for the $N_T - N$ values of $q_{D_j}$ (with $T^h$ known). The reason this decomposition occurs is that the first $N$ equations are independent of the rest (the converse, of course, is not true).
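The two-stage decomposition can be sketched for a steady 1D diffusion problem in which both end nodes are Dirichlet nodes. The data below (`k`, `S`, `T_a`, `T_b`, the mesh size) and the variable names are assumptions chosen for illustration; in 1D the boundary integral of $\phi_i q_D^h$ reduces to the nodal flux value itself.

```python
import numpy as np

# Assumed problem: -k T'' = S on (0, 1), T(0) = T_a, T(1) = T_b, linear elements.
k, S, n_el = 1.5, 4.0, 10
T_a, T_b = 1.0, 3.0
h = 1.0 / n_el
n_tot = n_el + 1

# Assemble the equations for ALL N_T nodes (test functions on Gamma_D included).
Kfull = np.zeros((n_tot, n_tot))
f = np.zeros(n_tot)
for e in range(n_el):
    Kfull[e:e + 2, e:e + 2] += (k / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    f[e:e + 2] += S * h / 2.0             # int phi_i S over one element

dir_nodes = np.array([0, n_el])
free = np.arange(1, n_el)

# Stage 1: the first N equations give T^h at the free nodes.
T = np.zeros(n_tot)
T[dir_nodes] = [T_a, T_b]
rhs = f[free] - Kfull[np.ix_(free, dir_nodes)] @ T[dir_nodes]
T[free] = np.linalg.solve(Kfull[np.ix_(free, free)], rhs)

# Stage 2: the remaining N_T - N equations give the consistent flux into the domain.
q_D = Kfull[dir_nodes] @ T - f[dir_nodes]
```

For these assumed data the consistent fluxes sum to minus the total source, so the computed heat entering through $\Gamma_D$ balances the source exactly; in this simple 1D case they also coincide with $\mathbf{n}\cdot k\nabla T$ of the exact solution.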
The reason we introduced this additional complexity is that it is a nice way to ensure that the total GFEM solution ($T^h$ in $\Omega$ and on $\Gamma_N$, and $q_D^h$ on $\Gamma_D$) can be made to satisfy a global energy balance, as we demonstrate below, after making some additional

Remarks:

(1) $q_D^h$ from (2.2-15) and (2.2-16) is called the consistent heat flux because, in addition to yielding global energy conservation (shown below), it is the only heat flux that permits reversibility; i.e., if the Dirichlet BC, $T = T_D$ on $\Gamma_D$, were to be replaced by a Neumann BC, $\mathbf{n}\cdot\mathbf{K}\cdot\nabla T = q_D$ with $q_D$ specified on $\Gamma_D$, then only $q_D^h$ as computed from (2.2-15) and (2.2-16) would produce the same $T^h$ as did the original problem. This is an important point even if somewhat 'theoretical' in that the proper $q_D$ can only be determined (for the continuum case as well) by first solving the problem with $T_D$ specified.

(2) The actual value of $q_D^h$ is seen to depend on much more than just the normal component of $\mathbf{K}\cdot\nabla T$ on $\Gamma_D$, at least on a finite mesh; but in the limit of $h \to 0$ ($N \to \infty$),
all of the terms in the last $N_T - N$ equations of (2.2-16) would vanish ($\to 0$) except $\int_\Omega \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla T^h)$ on the LHS and $\int_{\Gamma_D}\phi_i q_D^h$ on the RHS, with the final result that $q_D = \mathbf{n}\cdot\mathbf{K}\cdot\nabla T$.

(3) We shall have much more to say about the consistent heat flux (and other 'boundary' quantities) in Chapter 4.

We will return to these (non-obvious) issues later; for now we just allege their veracity so that we can get on with the problem at hand: the derivation of global conservation laws. To this end, we now note the final reason for introducing the generalized problem of (2.2-16): the sum of all $N_T$ basis functions is unity,

$$\sum_{i=1}^{N_T}\phi_i(\mathbf{x}) = 1.0 \quad \forall\,\mathbf{x}, \tag{2.2-17}$$

a result that is crucial to the establishment of global conservation statements. (Note that $\sum_{i=1}^{N}\phi_i(\mathbf{x}) \ne 1$ near $\Gamma_D$.) The important property (2.2-17), which can also be stated in the more 'elegant' form that 'the constant function is a member of the test space,' leads easily to the following result when all $N_T$ equations of (2.2-16) are summed (equivalently, set $\phi_i = 1$):

$$\int_\Omega\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h + \beta T^h\nabla\cdot\mathbf{u}\right) + \int_{\Gamma_N} H T^h = \int_\Omega S + \int_{\Gamma_N}(q + H\hat{T}) + \int_{\Gamma_D} q_D^h,$$

which we rearrange to

$$\frac{d}{dt}\int_\Omega T^h = \int_\Omega S + \int_{\Gamma_N} H(\hat{T} - T^h) + \int_{\Gamma_D} q_D^h + \int_{\Gamma_N} q - \int_\Omega\left(\mathbf{u}\cdot\nabla T^h + \beta T^h\nabla\cdot\mathbf{u}\right), \tag{2.2-18}$$

and note that all but the last term on the RHS are in the desired form [cf. (2.1-17)]; namely, the second term is the heat input from Newton's 'law of cooling,' the third is the heat flux into $\Gamma_D$ that results from the specified temperature there, and the fourth term is the original applied heat flux on $\Gamma_N$. To finish, we use $\int_\Omega \mathbf{u}\cdot\nabla T^h = \int_\Omega \nabla\cdot(\mathbf{u}T^h) - \int_\Omega T^h\nabla\cdot\mathbf{u} = \int_\Gamma \mathbf{n}\cdot\mathbf{u}T^h - \int_\Omega T^h\nabla\cdot\mathbf{u}$ to obtain, finally,

$$\frac{d}{dt}\int_\Omega T^h = \int_\Omega S + \int_{\Gamma_N}\left[q + H(\hat{T} - T^h)\right] + \int_{\Gamma_D} q_D^h - \int_\Gamma \mathbf{n}\cdot\mathbf{u}T^h + (1 - \beta)\int_\Omega T^h\nabla\cdot\mathbf{u}, \tag{2.2-19}$$

which, upon identifying $q_D^h$ as $K\,\partial T/\partial n$, the diffusive (only) flux through $\Gamma_D$, now properly mimics (2.1-17) except for the last term which should, but does not, vanish, unless $\nabla\cdot\mathbf{u} = 0$ or $\beta = 1$.
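The role of $\beta$ in (2.2-19) can be checked numerically in 1D by summing the advection terms of all $N_T$ equations, once with the advective matrix (2.2-9) and once with the divergence matrix (2.2-13). The sketch below assumes linear elements, a nodally interpolated velocity that is deliberately not divergence-free, and two-point Gauss-Legendre quadrature (exact for these integrands); the function name `advection_matrices` and the nodal data are hypothetical.

```python
import numpy as np

def advection_matrices(x, u_nodes):
    """Return (N_adv, N_div) over all nodes for linear elements, u interpolated."""
    n = len(x)
    N_adv = np.zeros((n, n))
    N_div = np.zeros((n, n))
    gauss_pts = np.array([-1.0, 1.0]) / np.sqrt(3.0)   # 2-point rule, weights = 1
    for e in range(n - 1):
        h = x[e + 1] - x[e]
        dphi = np.array([-1.0, 1.0]) / h               # d(phi)/dx on the element
        du = (u_nodes[e + 1] - u_nodes[e]) / h         # du/dx on the element
        for xi in gauss_pts:
            phi = np.array([1.0 - xi, 1.0 + xi]) / 2.0
            u = phi @ u_nodes[e:e + 2]
            w = h / 2.0
            # advective form: phi_i * u * d(phi_j)/dx
            N_adv[e:e + 2, e:e + 2] += w * np.outer(phi, u * dphi)
            # divergence form: phi_i * d(phi_j * u)/dx
            N_div[e:e + 2, e:e + 2] += w * np.outer(phi, u * dphi + phi * du)
    return N_adv, N_div

x = np.linspace(0.0, 1.0, 9)
u_nodes = 1.0 + 0.3 * np.sin(2.0 * np.pi * x) + 0.5 * x   # assumed, du/dx != 0
T = 1.0 + x**2                                            # assumed nodal values of T^h

N_adv, N_div = advection_matrices(x, u_nodes)
ones = np.ones(len(x))                    # summing all N_T equations (phi_i -> 1)

boundary_flux = u_nodes[-1] * T[-1] - u_nodes[0] * T[0]   # n.u T^h summed over Gamma
total_div = ones @ N_div @ T
total_adv = ones @ N_adv @ T
```

With these assumed data, `total_div` reproduces the boundary flux to round-off, whereas `total_adv` differs from it by the discrete value of $\int_\Omega T^h\,\nabla\cdot\mathbf{u}$, the spurious source term in (2.2-19) with $\beta = 0$.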
Since, as mentioned earlier, we often must solve the scalar transport equation using velocity fields that have small (hopefully) but indefinite divergence, we conclude that for these cases it is necessary to set $\beta = 1$ if we wish to assure global conservation of our scalar field, $T^h$. But since $\nabla \cdot (u T^h) = u \cdot \nabla T^h + T^h \nabla \cdot u$, we see from (2.2-16) that $\beta = 1$ is nothing but the flux-divergence form of the advective term ($\beta = 0$ being the advective form). Thus, while the advective form cannot assure (and will not attain) global conservation of energy/enthalpy when $\nabla \cdot u \neq 0$, the divergence form can and will. See Lee et al.
(1982) for some demonstrations of these facts in the case of an ideal fluid (zero diffusion coefficient for the AD equation and zero viscosity in the Boussinesq equations, whose computed velocity field drives the $T$-field).

So at this point it seems clear that the 'proper' GFEM for scalar transport that is driven by a GFEM-computed velocity field (or any other for which $\nabla \cdot u \neq 0$) should not use the simpler advective form; the flux divergence (conservation) form ($\beta = 1$) is clearly preferred. Or is it? What about quadratic conservation, $E^h = \int (T^h)^2$, and the associated stability/boundedness that conservation of $E^h$ would guarantee? To answer this question, we must attempt to duplicate the steps that led to (2.1-18) for the continuum.

We begin with (2.2-16) again, and with the observation that $T^h$ is a linear combination of all ($N_T$) basis functions. Thus, we form (in principle) that same linear combination of the $N_T$ equations of (2.2-16) to obtain (in fact, just replace $\phi_i$ by $T^h$, a perfectly legitimate test function)

$$\int T^h \left( \frac{\partial T^h}{\partial t} + u \cdot \nabla T^h + \beta T^h \nabla \cdot u \right) + \int \nabla T^h \cdot (K \cdot \nabla T^h) + \int_{\Gamma_N} H (T^h)^2 = \int T^h S + \int_{\Gamma_N} T^h (q + H\bar{T}) + \int_{\Gamma_D} T^h q_D^h,$$

an equation that is also satisfied if (2.2-16) is [it is implied by (2.2-16)]. Next, recall BC (2.1-3) and rearrange the above equation to obtain

$$\frac{1}{2} \frac{d}{dt} \int (T^h)^2 = \int T^h S - \int \nabla T^h \cdot (K \cdot \nabla T^h) + \int_{\Gamma_N} T^h n \cdot (K \cdot \nabla T^h) + \int_{\Gamma_D} T^h q_D^h - \int T^h (u \cdot \nabla T^h + \beta T^h \nabla \cdot u),$$

which is seen to agree, except for the advection term, with the continuum version, (2.1-18), once we generalize ($\kappa \to K$) and then replace $\frac{1}{2} \int_\Gamma n \cdot (K \cdot \nabla T^2)$ by $\int_{\Gamma_N} T n \cdot (K \cdot \nabla T) + \int_{\Gamma_D} T q_D$ there. The final step is

$$\int T^h u \cdot \nabla T^h = \frac{1}{2} \int u \cdot \nabla (T^h)^2 = \frac{1}{2} \int_\Gamma n \cdot u (T^h)^2 - \frac{1}{2} \int (T^h)^2 \nabla \cdot u$$

to obtain

$$\frac{1}{2} \frac{d}{dt} \int (T^h)^2 = \int S T^h - \int \nabla T^h \cdot (K \cdot \nabla T^h) - \frac{1}{2} \int_\Gamma n \cdot \left[ u (T^h)^2 - K \cdot \nabla (T^h)^2 \right] + \left( \frac{1}{2} - \beta \right) \int (T^h)^2 \nabla \cdot u, \tag{2.2-20}$$

in which agreement with (2.1-18), and the assurance of global conservation of $\int (T^h)^2$ (and, when 'appropriate,' the associated assurance of a bounded solution), can now only be obtained (when $\nabla \cdot u \neq 0$) by choosing $\beta = \frac{1}{2}$!
A dilemma, to be sure: for $\nabla \cdot u \neq 0$, $\beta = 0$ conserves 'nothing,' $\beta = \frac{1}{2}$ conserves $T^2$ but not $T$, and $\beta = 1$ conserves $T$ but not $T^2$. Which $\beta$ (and associated form) should we choose, and why?
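The dilemma can be checked numerically on a small example. The sketch below is an illustration only and is not taken from the text: it assumes 1D linear elements on a non-uniform mesh, an invented velocity $u(x) = x(1 - x)$ (so that $n \cdot u = 0$ at both ends while $du/dx \neq 0$), and a hypothetical helper `advection_matrix`. It assembles $N_{ij} = \int \phi_i (u\,\phi_j' + \beta u' \phi_j)$ over all nodes and tests the two discrete properties discussed above: zero column sums (conservation of $\int T^h$, because the basis functions sum to unity) and skew-symmetry (conservation of $\int (T^h)^2$).

```python
import numpy as np

def advection_matrix(x, u, du, beta):
    """Assemble N_ij = int phi_i (u phi_j' + beta u' phi_j) dx for 1D linear elements.

    x     : sorted node coordinates (all nodes kept, as in the generalized problem)
    u, du : callables for the velocity and its derivative (illustrative choices)
    beta  : 0 (advective), 1/2 (quadratically conserving), 1 (divergence form)
    """
    n = len(x)
    N = np.zeros((n, n))
    xi, w = np.polynomial.legendre.leggauss(3)   # exact for the cubic integrands here
    for e in range(n - 1):
        h = x[e + 1] - x[e]
        xq = x[e] + 0.5 * h * (xi + 1.0)          # quadrature points in element e
        wq = 0.5 * h * w
        phi = np.array([(x[e + 1] - xq) / h, (xq - x[e]) / h])  # the two local hats
        dphi = np.array([-1.0 / h, 1.0 / h])
        for a in range(2):
            for b in range(2):
                f = phi[a] * (u(xq) * dphi[b] + beta * du(xq) * phi[b])
                N[e + a, e + b] += np.sum(wq * f)
    return N

# Assumed test data: non-uniform mesh, u = x(1 - x) vanishes at both boundaries.
x = np.array([0.0, 0.1, 0.25, 0.45, 0.6, 0.8, 1.0])
u = lambda s: s * (1.0 - s)
du = lambda s: 1.0 - 2.0 * s

betas = (0.0, 0.5, 1.0)
mats = {b: advection_matrix(x, u, du, b) for b in betas}

# Column sums vanish  <=>  1^T N T = 0 for every T  <=>  int T^h is conserved.
conserves_T = {b: bool(np.allclose(mats[b].sum(axis=0), 0.0)) for b in betas}
# Skew-symmetry       <=>  T^T N T = 0 for every T  <=>  int (T^h)^2 is conserved.
conserves_T2 = {b: bool(np.allclose(mats[b] + mats[b].T, 0.0)) for b in betas}

print("conserves T  :", conserves_T)
print("conserves T^2:", conserves_T2)
```

With this setup only $\beta = 1$ passes the first check and only $\beta = \frac{1}{2}$ passes the second, mirroring the statement above for a velocity field with non-zero divergence.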
In the experiments performed by Lee et al. (1982), these discouraging aspects of global conservation when $\nabla \cdot u \neq 0$ were indeed verified; see also Cliffe (1981). But they also reported that they would still not switch from the simpler advective form ($\beta = 0$). Why is this? Besides computer costs (the conservative form is slightly more expensive than the advective form) and laziness, the reasons are basically these:

1. Any decent (weak) solution of the NS equations, in which $\int \psi \nabla \cdot u = 0 \;\; \forall \psi$, where $\psi$ is a pressure test function (Chapter 3), will generate only small (and of variable sign, generally) values of $\nabla \cdot u$, so that the offending terms are probably always pretty small, although they may cause instability if diffusion is small or absent.

2. Any real (physical) solution, i.e., one with non-zero diffusion coefficients, should provide sufficient physical dissipation to control most potential instabilities related to the (indefinite) term $\int (T^h)^2 \nabla \cdot u$. Experience, both our own and that of many others, suggests that this is indeed true, usually.

So, for now at least, we shall leave the issue of '$\beta$-selection' open, except perhaps for the rare hyperbolic ($K = 0$) case wherein $\beta = 1/2$ is to be preferred to ensure stability of the ODE's, an example of which we shall later demonstrate in Section 2.8.1.

2.2.4 An Absolutely Conserving Form

In this section we examine the case $\beta = 1/2$ in a little more detail. We first mention that the term 'absolutely conserving' was apparently introduced by Piacsek and Williams (1970) in a finite difference context, and refers to what we have called quadratic conservation: in the absence of diffusion and sources, and with no inflow or outflow ($n \cdot u = 0$), (2.2-20) shows that $\dot{E}^h \equiv d \int (T^h)^2 / dt = 0$ if and only if $\beta = 1/2$ when $\nabla \cdot u \neq 0$. If $\beta \neq 1/2$, $\dot{E}^h$ behaves in an indefinite (i.e., unpredictable) manner so that boundedness of the approximate solution cannot be a priori guaranteed.
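It is interesting to first note that $\beta = 1/2$ is actually the average of the advective and divergence forms; i.e.,

$$u \cdot \nabla T^h + \tfrac{1}{2} T^h \nabla \cdot u = \tfrac{1}{2} \left[ u \cdot \nabla T^h + \nabla \cdot (u T^h) \right]. \tag{2.2-21}$$

Next, consider the matrix representation of this form; the advection matrix becomes, from (2.2-9) and (2.2-13),

$$N^Q_{ij}(u) = \tfrac{1}{2} \int \phi_i \left[ u \cdot \nabla \phi_j + \nabla \cdot (u \phi_j) \right] = \tfrac{1}{2} \int \left( \phi_i u \cdot \nabla \phi_j + \phi_i u \cdot \nabla \phi_j + \phi_i \phi_j \nabla \cdot u \right) = \int \phi_i u \cdot \nabla \phi_j + \tfrac{1}{2} \int \phi_i \phi_j \nabla \cdot u, \tag{2.2-22}$$

vis-a-vis (2.2-9) or (2.2-13); i.e., if the matrix of (2.2-9) is (after increasing its dimension from $N \times N$ to $N_T \times N_T$) called $N_A$ (for advection form), and that of (2.2-22) is called $N_Q$ (for quadratically conserving form), we have

$$N_Q = N_A + B, \tag{2.2-23}$$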
where

$$B_{ij} = \tfrac{1}{2} \int \phi_i \phi_j \nabla \cdot u. \tag{2.2-24}$$

Remarks:

(1) It is interesting to first form the transpose of $N_Q$, $N_Q^T = N_A^T + B$, and then the sum, $S = N_Q + N_Q^T = N_A + N_A^T + 2B$; i.e.,

$$S_{ij} = \int \phi_i u \cdot \nabla \phi_j + \int \phi_j u \cdot \nabla \phi_i + \int \phi_i \phi_j \nabla \cdot u = \int \nabla \cdot (\phi_i \phi_j u) - \int \phi_i \phi_j \nabla \cdot u + \int \phi_i \phi_j \nabla \cdot u = \int_\Gamma \phi_i \phi_j\, n \cdot u.$$

Thus, if $n \cdot u = 0$ on $\Gamma$, then $N_Q$ is skew-symmetric, $N_Q^T = -N_Q$; a useful observation, and one that does not exist for $\beta = 0$ or $\beta = 1$ when $\nabla \cdot u \neq 0$. Thus, noting first from (2.1-18) or (2.2-20) for $\beta = 1/2$ that when there is no inflow or outflow the advection process has no effect on $\int T^2$, we see that the skew-symmetry of $N_Q$ assures the same since $x^T A x = 0$ for every vector $x$ when $A = -A^T$ [proven below in Remark (4)]. Here, $A = N_Q$ and $x = T$. Another property of skew-symmetric matrices that is useful to know for CFD is that their spectra (eigenvalues) are purely imaginary. Finally, it is important to note that $S_{ij} = \int_\Gamma \phi_i \phi_j\, n \cdot u$ obtains even if $\nabla \cdot u = 0$; i.e., even a solenoidal velocity field will only yield a skew-symmetric advection matrix if either $\phi_i = 0$ (e.g., specified $T$) or $n \cdot u = 0$ on $\Gamma$.

(2) In matrix language, and (again/still) for the case with no inflow/outflow ($u \cdot n = 0$ on $\Gamma$), we have that $N_Q = (N_A - N_A^T)/2$ is the skew-symmetric part of $N_A$ and that $-B = (N_A + N_A^T)/2$ is the symmetric part of $N_A$, and the decomposition ($N_A = N_Q - B$) is unique. (Note/recall also that the diagonal entries of a skew-symmetric matrix are zero.)

(3) These results generalize as follows: any discrete (matrix) 'centered-difference-like' approximation to $u \cdot \nabla(\cdot)$ when $u$ is spatially varying and $n \cdot u = 0$ on $\Gamma$, say $N_G(u)$, can be uniquely decomposed into the sum of a skew-symmetric part, $(N_G - N_G^T)/2$, and a symmetric part, $(N_G + N_G^T)/2$, the former 'truly' (more properly) representing advection and the latter corresponding to (representing) $-\frac{1}{2}(\cdot) \nabla \cdot u$.
If, however, the discrete approximation to $u \cdot \nabla(\cdot)$ is dissipative, then the symmetric part will always contain a term that looks like diffusion, and may or may not display a $-\frac{1}{2}(\cdot) \nabla \cdot u$ term; e.g., for pure upwinded advection in 1D, the 'usual' result obtains: (i) the skew-symmetric part looks like centered advection and (ii) the symmetric part looks like centered diffusion with diffusivity $u \Delta x / 2$. The eigenvalues of the 'true' advection matrix are purely imaginary (as they should be when approximating the hyperbolic operator), and those of the symmetric matrix are indefinite (because $\nabla \cdot u$ is). Discarding the symmetric part guarantees that the
advection terms will not destabilize the ODE's because $T^T N_Q T = 0$. It would also convert any dissipative advection approximation into a non-dissipative one.

(4) Theorem: For any real matrix, $A$, a necessary and sufficient condition that the quadratic form $q = x^T A x$ vanish for all $x$ is that $A$ be skew-symmetric. Proof: First note that $q = \frac{1}{2} x^T (A + A^T) x$ too, because the quadratic form of the skew-symmetric part of a matrix, $\frac{1}{2}(A - A^T)$, is always zero. (i) Sufficiency. If $A$ is skew-symmetric, $A^T + A = 0$. (ii) Necessity. If $q = 0$, let $B = (A + A^T)/2$, the symmetric part of $A$, and consider the eigenproblem $Bz = \lambda z$. Since $q = 0$, $z^T B z = \lambda z^T z = 0$ for every eigenvector, which $\Rightarrow$ all of the eigenvalues of $B$ are zero. But a symmetric matrix with all zero eigenvalues must be the zero matrix. QED.

(5) The above theorem does not preclude the following possibility: for some non-trivial $x$, $x^T A x = 0$ with $A$ not skew-symmetric, a situation that can actually occur in practice. [See, for example, Gary (1979), who studied $u_t = u u_x$ and obtained $u^T N(u) u = 0$ using GFEM and linear basis functions, thus conserving energy but with $N^T(u) \neq -N(u)$. We shall return to this point in the next chapter (Section 3.16).]

(6) For a recent 'spectral element' discussion of skew-symmetric advection, see Ronquist (1996).

The generalized matrix-vector form of (2.2-7) is now

$$M \dot{T} + [N_Q(u) + K] T = f, \tag{2.2-25}$$

where now $T$ is an $N_T$-vector that represents all of the nodes: $T = (T_1, T_2, \ldots, T_N, T_{N+1}, T_{N+2}, \ldots, T_{N_T})^T$, and the matrices $M$, $K$, and $N_Q$ are of size $N_T \times N_T$. Also, consistent with this formulation, the $N_T$-vector $f$ is different: it does not contain the terms in curly brackets of (2.2-2) because the Dirichlet part of the solution, $\tilde{T}$, has been absorbed into $T^h$ (the $T$-vector 'contains' it), but it does contain an additional term, namely that corresponding to the 'Dirichlet' heat flux term on the RHS of (2.2-16). In fact, $f$ is the RHS of (2.2-16).
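Remarks (3) and (4) above can be illustrated with a few lines of linear algebra. The following is a minimal sketch under assumed conditions that are not from the text: a periodic 1D grid (so there is no inflow/outflow boundary), a constant velocity $u > 0$, and first-order upwind differencing. It splits the upwind matrix into its skew-symmetric and symmetric parts and compares them with centered advection and with centered diffusion of diffusivity $u \Delta x / 2$.

```python
import numpy as np

n, dx, u = 8, 0.125, 1.0             # illustrative periodic grid and constant speed u > 0
I = np.eye(n)
shift_m = np.roll(I, -1, axis=1)     # (shift_m @ T)[i] = T[i-1], periodic wrap-around
shift_p = shift_m.T                  # (shift_p @ T)[i] = T[i+1]

upwind = (u / dx) * (I - shift_m)                     # u (T_i - T_{i-1}) / dx
centered = (u / (2.0 * dx)) * (shift_p - shift_m)     # u (T_{i+1} - T_{i-1}) / (2 dx)
kappa_num = 0.5 * u * dx                              # numerical diffusivity u dx / 2
diffusion = -kappa_num * (shift_p - 2.0 * I + shift_m) / dx**2

skew_part = 0.5 * (upwind - upwind.T)
sym_part = 0.5 * (upwind + upwind.T)

rng = np.random.default_rng(0)
T = rng.standard_normal(n)
quad_skew = T @ skew_part @ T        # vanishes for every T (Remark 4)
quad_sym = T @ sym_part @ T          # non-negative here: pure dissipation

eig_skew = np.linalg.eigvals(skew_part)   # purely imaginary spectrum
print(np.allclose(skew_part, centered), np.allclose(sym_part, diffusion))
```

Because $u$ is constant in this sketch, $\nabla \cdot u = 0$ and the symmetric part is pure numerical diffusion; with a spatially varying $u$ it would also pick up a contribution resembling $-\frac{1}{2}(\cdot)\nabla \cdot u$, as stated in Remark (3).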
We will now generate the $\beta = 1/2$ version of (2.2-20) from the matrix-vector form of the equations, a procedure that may be regarded as an exercise, but hopefully a useful one. Multiplication of (2.2-25) by $T^T$ (the transpose of the nodal temperature vector) yields a scalar equation that is a combination of quadratic forms and one linear form:

$$\frac{1}{2} \frac{d}{dt} \left( T^T M T \right) + T^T N_Q T + T^T K T = T^T f, \tag{2.2-26}$$

and we examine each term in turn:

1. $\displaystyle T^T M T = \sum_{i,j=1}^{N_T} T_i \left( \int \phi_i \phi_j \right) T_j = \int \Big( \sum_{i=1}^{N_T} T_i \phi_i \Big) \Big( \sum_{j=1}^{N_T} T_j \phi_j \Big) = \int (T^h)^2.$
2. $\displaystyle T^T N_Q T = \sum_{i,j} T_i \left( \int \phi_i u \cdot \nabla \phi_j \right) T_j + \frac{1}{2} \sum_{i,j} T_i \left( \int \phi_i \phi_j \nabla \cdot u \right) T_j = \frac{1}{2} \int u \cdot \nabla (T^h)^2 + \frac{1}{2} \int (T^h)^2 \nabla \cdot u = \frac{1}{2} \int \nabla \cdot \left[ u (T^h)^2 \right] - \frac{1}{2} \int (T^h)^2 \nabla \cdot u + \frac{1}{2} \int (T^h)^2 \nabla \cdot u = \frac{1}{2} \int_\Gamma (T^h)^2\, n \cdot u.$

3. $\displaystyle T^T K T = \sum_{i,j} T_i \left( \int \nabla \phi_i \cdot (K \cdot \nabla \phi_j) \right) T_j + \sum_{i,j} T_i \left( \int_{\Gamma_N} H \phi_i \phi_j \right) T_j = \int \nabla T^h \cdot (K \cdot \nabla T^h) + \int_{\Gamma_N} H (T^h)^2.$

4. $\displaystyle T^T f = \sum_{i} T_i \left[ \int \phi_i S + \int_{\Gamma_N} \phi_i (q + H\bar{T}) + \int_{\Gamma_D} \phi_i q_D^h \right] = \int T^h S + \int_{\Gamma_N} T^h (q + H\bar{T}) + \int_{\Gamma_D} T^h q_D^h.$

Thus, (2.2-26) becomes

$$\frac{1}{2} \frac{d}{dt} \int (T^h)^2 + \frac{1}{2} \int_\Gamma (T^h)^2\, n \cdot u + \int \nabla T^h \cdot (K \cdot \nabla T^h) + \int_{\Gamma_N} H (T^h)^2 = \int S T^h + \int_{\Gamma_N} T^h (q + H\bar{T}) + \int_{\Gamma_D} T^h q_D^h, \tag{2.2-27}$$

which, using $n \cdot K \cdot \nabla T^h + H(T^h - \bar{T}) = q$ on $\Gamma_N$, reproduces (2.2-20) when $\beta = 1/2$ there. Since this is also the proper conservation law, we see (again) that the absolutely conserving form does indeed properly 'conserve' $E^h$. It also follows that the quadratic forms associated with the $N$-matrix for either the advective form ($\beta = 0$) or the divergence form ($\beta = 1$) lead to indefinite forms, and global conservation of $E^h$ cannot then be assured unless $\nabla \cdot u = 0$.

The 'bottom line' from these results is as follows: if $\nabla \cdot u \neq 0$,

1. It is not possible, in general, to conserve both $T$ and $T^2$.

2. If the problem is diffusion-dominated and/or good boundary heat fluxes and/or an exact overall heat balance is important, $\beta = 1$ is the recommended choice (unless, of course, it 'blows up,' which is probably unlikely under these conditions).

3. If the problem is advection-dominated (or, worse yet, the limiting case of a hyperbolic problem), then $\beta = 1/2$ is advisable since a bounded solution may be difficult to obtain otherwise.

4. For many (if not most) practical problems, however, $\beta = 0$ may be the best choice, since it is simpler, and it does not necessarily follow that conservation forms lead to more
accurate approximate solutions of the PDE, at least when $\beta = 0$ and $\beta = 1$ generate stable results. While instability $\Rightarrow$ inaccuracy, it does not follow that stability $\Rightarrow$ accuracy.

5. We have occasionally been 'burned' via the GFEM generation of unstable ODE's when using the advective form, an example of which we shall present in Section 2.8.1; we thus warn the reader that only $(N - N^T)/2$ can assure that the FEM ODE's will be stable, a result that applies to all other spatial discretizations as well.

6. Although various upwind and/or streamline diffusion advocates of advection seem never to worry about the fact that their ODE's too are not guaranteed to be stable because their advection matrix is also indefinite, it is probably the case that instability never occurs simply because the extra dose of diffusion (numerical), with its positive (decaying) eigenvalues, swamps the indefinite contribution from $-\frac{1}{2}(\cdot) \nabla \cdot u$.

As a final remark, we mention that we will return to these issues (in Volume II) after studying the NS and Boussinesq equations, and thereby muddy the waters of conservation even further.

2.2.5 A Finite Difference Interpretation

Since the GFEM equations are in 'weighted residual' form, it is of some interest to 'undo' the Galerkin weighting so that the equations can be more readily interpreted as finite difference equations, a procedure that can unfortunately lead also to misinterpretations, as we shall demonstrate. In this section we will convert (2.2-2), or (2.2-7), to an equivalent form that more readily permits such an interpretation.

Since the GFEM equations are formed by the process 'multiply by each basis (test) function and integrate the result over the domain,' we now consider the effect of dividing the final results by the same test functions integrated over the domain; i.e., by $\int \phi_i$. For reasons that will become more clear later, we define

$$(M_L)_{ij} = \delta_{ij} \int \phi_i, \tag{2.2-28}$$

where $\delta_{ij}$ is the Kronecker delta.
Whereas $\int \phi_i$, $i = 1, 2, \ldots, N_T$, is, of course, a vector, this diagonal matrix representation is more convenient for our purposes, because often $M_L$ is also the so-called lumped mass matrix, about which we will say more later. Noting that $M_L^{-1}$ is trivial to compute, we multiply (2.2-7) by $M_L^{-1}$ to get

$$A \dot{T} + M_L^{-1} [N(u) + K] T = M_L^{-1} f, \tag{2.2-29}$$

where

$$A = M_L^{-1} M \tag{2.2-30}$$

is, in fact, an averaging matrix; i.e.,

$$A_{ij} = \sum_{k} \left( \delta_{ik} \Big/ \int \phi_i \right) \int \phi_k \phi_j = \int \phi_i \phi_j \Big/ \int \phi_i \tag{2.2-31}$$

is dimensionless and has the property that the sum over each row is unity (since $\sum_j \phi_j = 1.0$), so that each element of the vector $A \dot{T}$ represents a particular weighted average of
the elements of the $N$-vector $\dot{T}$. In fact, the $i$-th component of $A \dot{T}$ is

$$(A \dot{T})_i = \int \phi_i \Big( \sum_j \dot{T}_j \phi_j \Big) \Big/ \int \phi_i = \int \phi_i \frac{\partial T^h(x, t)}{\partial t} \Big/ \int \phi_i, \tag{2.2-32}$$

and it is now clear that we did indeed undo the Galerkin weighting, at least for the time derivative. Next, note that $M_L^{-1} N(u)$ corresponds to (represents) a weak advection operator and $(-) M_L^{-1} K$ corresponds, with the exception noted below, to a weak Laplacian operator [at least when the heat transfer coefficient, see (2.2-12), is zero]. Next,

$$\left( M_L^{-1} N(u) T \right)_i = \sum_j \left( \int \phi_i u \cdot \nabla \phi_j \right) T_j \Big/ \int \phi_i = \int \phi_i u \cdot \nabla T^h \Big/ \int \phi_i \tag{2.2-33}$$

and

$$\left( M_L^{-1} K_D T \right)_i = \int \nabla \phi_i \cdot K \cdot \nabla T^h \Big/ \int \phi_i. \tag{2.2-34}$$

Similarly, $M_L^{-1} f$, see (2.2-16), is

$$\left( M_L^{-1} f \right)_i = \left[ \int \phi_i S + \int_{\Gamma_N} \phi_i (q + H\bar{T}) + \int_{\Gamma_D} \phi_i q_D^h \right] \Big/ \int \phi_i, \tag{2.2-35}$$

and the finite difference interpretation is now available; namely, except for the time-derivative term, each of the other terms now corresponds (in some sense) to a point-wise approximation of the corresponding term in the original PDE, because $\phi_i$ is the 'proper' piecewise polynomial of the FEM, the number of neighboring nodes ($j$) that couple with the node in question ($i$) being only a function of the support of the basis function, $\phi_i$. For example, a bilinear approximation in 2D will couple (generally, and away from $\partial\Omega$) eight neighbors to each node. The time derivative term, though, is 'special' in the sense that the GFEM performs a weighted average of all the $\dot{T}_j$'s in the neighborhood of node $i$ to approximate $\partial T/\partial t$ at $x = x_i$. Again the details depend on the support of the basis functions, but the key point to note is that only $(A \dot{T})_i$ is not a pointwise approximation in (2.2-29); a pointwise approximation of $\partial T/\partial t$ would be simply $\dot{T}_i$, as with lumped mass.

A final remark: if $\phi_i$ belongs to a node on $\Gamma$, then the interpretation is somewhat trickier.
It turns out that if $x_i \in \Gamma_N$, then the nodal equation is actually (as for the original GFEM) an approximation to the Neumann (Robin) BC, (2.1-3), and will approach this exactly as the mesh is refined; i.e., all other terms will $\to 0$ as $N_T \to \infty$. Finally, if $x_i \in \Gamma_D$, a similar result is obtained, with only $M_L^{-1} K_D T$ and $\int_{\Gamma_D} \phi_i q_D^h / \int \phi_i$ remaining significant as $N_T \to \infty$, wherein they give $n \cdot K \cdot \nabla T = q_D$. If some of the above assertions are not obvious, they will become more transparent soon, when we show some actual nodal equations.
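The averaging property of $A = M_L^{-1} M$ is easy to confirm numerically. The sketch below is only an illustration under stated assumptions: 1D linear elements on a uniform mesh of our choosing, with a hypothetical helper `mass_matrices` that assembles the consistent mass matrix from the standard two-node element matrix and computes $\int \phi_i$ independently (half of each adjacent element length).

```python
import numpy as np

def mass_matrices(x):
    """Consistent mass matrix M and diagonal M_L = diag(int phi_i) for 1D linear elements."""
    n = len(x)
    M = np.zeros((n, n))
    lump = np.zeros(n)
    for e in range(n - 1):
        h = x[e + 1] - x[e]
        # Standard linear-element mass matrix: (h/6) [[2, 1], [1, 2]]
        M[e:e + 2, e:e + 2] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        # int phi_i picks up h/2 from each element touching node i
        lump[e] += 0.5 * h
        lump[e + 1] += 0.5 * h
    return M, np.diag(lump)

x = np.linspace(0.0, 1.0, 6)          # assumed uniform mesh with 5 elements
M, ML = mass_matrices(x)
A = np.linalg.solve(ML, M)            # averaging matrix A = M_L^{-1} M

row_sums = A.sum(axis=1)              # each row sums to one
interior_row = A[2, 1:4]              # weights applied to (dT_{i-1}, dT_i, dT_{i+1})/dt
print(row_sums)
print(interior_row)
```

On this uniform mesh the interior weights are $(1/6,\ 2/3,\ 1/6)$, which is the weighted average of neighboring time derivatives that replaces the pointwise value $\dot{T}_i$ obtained with a lumped mass matrix.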
2.2.6 A Control Volume FEM

In this section we develop one form of a non-Galerkin weighted residual method that has been gaining in popularity, partly via religious beliefs (via the God of 'local conservation'), and probably at the 'expense' of both the GFEM and FDM's. Called the control volume finite element method (CVFEM), it is a subdomain method of weighted residuals (see Crandall, 1956; Finlayson and Scriven, 1966), and seems to have been spearheaded by, among others, Professor S. Patankar and colleagues, at least for incompressible flow. For elliptic BVP's, it was employed at least as far back as 1952 (MacNeal, 1953); see Varga (1962), who presents (but does not 'name') both GFEM, in the guise of a Ritz method, and a finite volume method, and attributes the latter to MacNeal (1953). While most of the recent papers we have seen involve more than a simple change in the test function (such as directional upwinding and mass lumping), herein we develop and present the CVFEM as a fully legitimate (no cheating) alternate finite element technique, beginning with the appropriate weak formulation and introducing the CVFEM version of natural boundary conditions (NBC's). Its extension from the advection-diffusion equation to the NS equations will be considered in the next chapter.

The 'Patankar-family' and related publications are now summarized (there are many more than those cited here), beginning with Baliga and Patankar (1980), which we believe to be the first in the 'series,' using triangular elements and 'exponential upwinding.' Further efforts/improvements, still on triangles, followed in Hookey et al. (1988) for advection-diffusion and Hookey and Baliga (1988) for Navier-Stokes. Quadrilateral CVFEM seems to have begun with Ramadhyani and Patankar (1985), again using 'upwinded' shape functions.
Prakash (1987) shows that CVFEM, like FDM, suffers from excess diffusion if simple 1D 'smart upwinding' (see Section 2.6.2c) methods are employed. Schneider and Raw (1986) develop a type of skew upwinding on quadrilateral elements, and Prakash and Baliga (1989) compare FDM, FEM, and CVFEM, concluding with the hope that CVFEM will catch up with FEM for problems with 'truly complex shapes' via unstructured grids. A sort of state-of-the-art summary, as of about 1987, plus references, is available in Minkowycz et al. (1988). Finally, in a recent pair of papers, Swaminathan and Voller (1992a,b) add SUPG [Streamline-Upwind Petrov-Galerkin; see, for example, Hughes and Brooks (1982) and Brooks and Hughes (1982)] to CVFEM to minimize the up-to-then excessive cross-wind diffusion via extant upwind methods. It compared favorably with 'GFEM/SUPG' in most respects, although its phase speed errors were noticeably larger (which we explain later: Section 2.6.2a), thus bringing CVFEM nearly up to date with one popular type of FEM.

Outside of the 'Patankar family,' we mention just two recent contributions, Zienkiewicz and Onate (1991) and Idelsohn and Onate (1994), and a quotation: 'The FVM is a poor man's FEM; it's an FDM moved over half-way' (O.C. Zienkiewicz, invited lecture, 8th Int. Conf. on Finite Elements in Fluids, Barcelona, Spain, September 1993).

A final 'justification' for FVM's is revealed in the following quotation (typical of many): 'Finite element methods can easily be used on irregular geometries, but the equations are more complex and it is often more difficult to explain them physically' (from Melaaen, 1992). Somewhat ironically, this paper then goes on to a discussion of curvilinear non-orthogonal grids via tensor calculus and seems, in our humble opinion, to get bogged down in equations that are far 'more complex' than GFEM equations ever are!
A brief sampling of recent theoretical/convergence results, a field that is rather nascent compared with FEM, follows (and it is important to point out that there are many varieties of what is most often called the 'finite volume method'): Cai (1991) and Cai et al. (1991); Süli (1991) and Morton and Süli (1991), from which we quote: 'Despite the practical success of finite volume methods, their theoretical foundation is still unsatisfactory and their stability and accuracy properties are not well understood'; see also Nicolaides et al. (1995) and Shin and Strikwerda (1997).

Crucial to the CVFEM is the conservation (divergence) form of the PDE, (2.1-11) in this case, because it is based on the 'conservation of $T$' at control volume level. The weak form of this AD equation begins, as usual, by multiplication by a test (weighting) function and integration over the domain. In this case, however, the test function is piecewise constant; it is unity over a particular subdomain (control volume or, in 2D, the only case we consider in detail, a control area) and zero over the rest of $\Omega$. Thus, in this subdomain method the PDE is satisfied on the average over each subdomain. Calling the test function for subdomain '$i$' surrounding node $i$ $\psi_i$, we have

$$\int \psi_i \left[ \frac{\partial T}{\partial t} - S + \nabla \cdot (uT - \kappa \nabla T) \right] = 0, \quad i = 1, 2, \ldots, N, \tag{2.2-36}$$

where $N$ is now the number of non-overlapping subdomains covering $\Omega$. But owing to the nature of the test functions, the above equation is equivalent, via the divergence theorem, to

$$\int_{\Omega_i} \left( \frac{\partial T}{\partial t} - S \right) + \int_{\Gamma_i} n \cdot (uT - \kappa \nabla T) = 0, \quad i = 1, 2, \ldots, N, \tag{2.2-37}$$

where $\Gamma_i$ is the boundary of subdomain $\Omega_i$. This simple set of equations, each representing an energy balance over one subdomain, is the starting point for the finite-element discretization; it is the desired, and intuitively appealing, weak form.

What about boundary conditions? They are not nearly as apparent here as in the GFEM weak form, a la (2.1-23) and (2.2-2).
But the answer is actually simple: if (and only if) $\Gamma_i$ includes a portion of the full domain boundary, $\Gamma$, then 'special procedures' need to be introduced so that both Dirichlet and Robin/Neumann data are properly incorporated. These procedures are, in fact, little different from those already discussed and will later be presented in some detail. Suffice it to say here that the simple looking equation of the weak form is, in practice, only slightly simpler than that from GFEM. (It is also easy to interpret physically, a very important attribute in the eyes of many would-be finite-element practitioners.)

The next step is to approximate the solution in the finite-element spirit: expand the 'solution' in the same set of piecewise polynomials used in the GFEM, a la (2.2-1) and (2.2-3), which converts (2.2-37) to the final control volume weak form:

$$\int_{\Omega_i} \sum_{j=1}^{N} \phi_j \left( \dot{T}_j - S_j \right) + \int_{\Gamma_i} \sum_{j=1}^{N} n \cdot (u \phi_j - \kappa \nabla \phi_j) T_j = -\int_{\Omega_i} \frac{\partial \tilde{T}}{\partial t} - \int_{\Gamma_i} n \cdot (u \tilde{T} - \kappa \nabla \tilde{T}), \quad i = 1, 2, \ldots, N, \tag{2.2-38}$$

where, for simplicity, we have also expanded the source term into the FEM basis functions, $S = \sum_j S_j \phi_j$, via interpolation. Note that the RHS is only non-zero at points where
$\Omega_i$ and $\Gamma_i$ involve $\Gamma_D$; i.e., the (now-less-simple-looking) weak formulation now does account for the Dirichlet BC. (Neumann and Robin BC's are deferred until Section 2.5.3, where they will appear as NBC's.) Note too that $j$ also ranges over 1 to $N$, where $N$ is the number of nodes (in $\Omega$ and on $\Gamma_N$, as before) at which $T_j$ is to be determined; there must be one subdomain for each unknown nodal temperature. Thus, we have reduced the weak form of the continuous problem [obtained via $N \to \infty$ in (2.2-38)] to one of finite dimension.

All that remains to be addressed prior to programming is the precise definition of $\Omega_i$ and $\Gamma_i$ for $i = 1, 2, \ldots, N$, in such a way that the test functions retain linear independence. We do this first in the 2D context in which the basis functions $\{\phi_j\}$ are bilinear. Consider the 4-patch of isoparametric elements shown in Figure 2.2-1 surrounding a generic node ($i$) in the domain: the subdomain $\Omega_i$ (control 'volume') is that formed by joining the element centroids ($x_c = \frac{1}{4} \sum_{j=1}^{4} x_j$ is the $x$-coordinate of a centroid, etc.) with eight straight line segments, each of which passes through the midside of the appropriate element.

It should now be apparent, at least in principle, 'how to build' a CVFEM code: each internal node's control volume, for the integration of 'volume quantities' ($\partial T/\partial t$ and $S$ above), is composed of pieces of neighboring elements (four for a 4-patch, two for a 2-patch, etc.); each internal node's control volume boundary, for the integration of flux quantities (like $uT$ and $\kappa \nabla T$), is made up from two internal segments from each element that has something to contribute. It may also be apparent that this method is more 'localized' than GFEM, owing to the nature of the test functions; i.e., CVFEM will give more weight to node $i$ relative to its neighbors than does GFEM.
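The claim that the subdomain test function weights node $i$ more heavily than the Galerkin one can be seen in a 1D analogue. The sketch below is not from the text and makes several assumptions: linear (hat) basis functions on a uniform mesh, a control 'volume' running from midpoint to midpoint around node $i$, and hypothetical helper names (`hat`, `integrate`, `time_derivative_weights`). It computes the normalized weights that each method applies to $(\dot{T}_{i-1}, \dot{T}_i, \dot{T}_{i+1})$.

```python
import numpy as np

h = 0.2                                        # illustrative uniform element length
gp, gw = np.polynomial.legendre.leggauss(2)    # exact for the quadratic integrands used

def hat(x, center):
    """Linear 'hat' basis function of half-width h centered at `center`."""
    return np.maximum(0.0, 1.0 - np.abs(x - center) / h)

def integrate(f, a, b):
    """Gauss-Legendre quadrature of f on [a, b]."""
    xq = 0.5 * (b - a) * gp + 0.5 * (a + b)
    return 0.5 * (b - a) * np.sum(gw * f(xq))

def time_derivative_weights(test):
    """Normalized weights of nodes i-1, i, i+1 (node i at x = 0) for a given test function."""
    # Integrate piecewise so that every kink of the hats falls on an interval end.
    pieces = [(-h, -h / 2), (-h / 2, 0.0), (0.0, h / 2), (h / 2, h)]
    w = np.array([sum(integrate(lambda x, c=c: test(x) * hat(x, c), a, b)
                      for a, b in pieces)
                  for c in (-h, 0.0, h)])
    return w / w.sum()

galerkin_test = lambda x: hat(x, 0.0)                          # test function = phi_i
subdomain_test = lambda x: (np.abs(x) <= h / 2).astype(float)  # unity on the control volume

gfem = time_derivative_weights(galerkin_test)
cvfem = time_derivative_weights(subdomain_test)
print("GFEM :", gfem)
print("CVFEM:", cvfem)
```

In this 1D setting the Galerkin weights are $(1/6,\ 2/3,\ 1/6)$ and the control-volume weights are $(1/8,\ 3/4,\ 1/8)$, so the subdomain method is indeed more 'localized' on node $i$; the 2D bilinear case discussed in the text follows the same pattern but is not computed here.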
In local coordinates $(\xi, \eta)$, these line segments are simply pieces of the coordinate lines themselves, e.g., $\xi = 0$ or $\eta = 0$, and this fact actually makes the 'boundary' calculations significantly easier to perform since the general bilinear interpolation becomes simply linear on each of these segments. [The volume quantities are not simplified, however, and conventional element-level matrices (or the equivalent) need to be constructed. See Appendix 1 for some CVFEM element matrices.]

Fig. 2.2-1 A 4-patch of control volume finite elements.

This is a sufficient exposition of the method at this point. Later, we will actually present the resulting semi-discrete equations and compare them with those from the GFEM. Suffice it to say here that there are more similarities (when done our way) than differences. How do the two schemes compare theoretically? Numerically? While we do not have many answers here [indeed we have not (yet) programmed a CVFEM], some conjectures, opinions, and assertions are offered:
1. Because of the particular conservation formulation, the CVFEM has the nice property that 'whatever exits one CV through its boundary surface enters the neighboring CV.' This physically appealing property, which also assures global conservation, accounts in (very) large measure for its popularity, as already hinted at above. [The GFEM does not, in general, incorporate control volume or element-level conservation properties. In a sense, it resembles more a Fourier series expansion method, or any other eigenfunction expansion method (e.g., spectral methods), which also lack the appeal of physical intuition but nonetheless provide very useful mathematical tools. But, see Appendix 2.] Also, and importantly, local conservation does not imply local accuracy.

2. There is no assurance that quadratic quantities are conserved (e.g., $\int T^2$), and in general they will not be; hence, boundedness of the solutions is not a priori guaranteed, as indeed it is not for the analogous ($\beta = 1$) GFEM.

3. For any mesh other than simple rectangles, both the mass matrix and the diffusion matrix are unsymmetric; not a nice feature.

4. For situations in which variational principles apply, which generally require $u = 0$, the GFEM is guaranteed to produce the most accurate solution possible on a given mesh, at least when errors are measured in the appropriate norm. For example, for the elliptic (Poisson) problem PDE, $\nabla^2 T = -S$, the error from GFEM, $e = T - T^h$, is a minimum in the 'energy' (or 'heat flux') norm, $\left( \int \nabla e \cdot \nabla e \right)^{1/2}$.

5. The dispersion (phase speed) error in the advection-dominated situation is significantly smaller for GFEM than CVFEM, as we will demonstrate.

6. Higher-order schemes are rare (non-existent?); CVFEM advantages tend to disappear when higher accuracy is demanded.

7.
Finally, we note that no one has yet (to our knowledge) presented results from the 'honest' CVFEM presented here, at least for the time-dependent case; the practitioners thus far seem to both lump the mass (thus 'blowing' the very local conservation that they so admire!) to simplify the algorithms and modify the advection terms to control wiggles (Section 2.6.1) by adding dissipation. Comini et al. (1996) have recently performed some steady heat conduction analyses for CVFEM and GFEM; but they, like 'Patankar' (see below), did not use the correct norm in which to measure the error, although they did come closer. They used the proper (intrinsic) energy norm, but did so after interpolating the exact solution into the finite element subspace.

Following up on Remarks (4) and (7), we wish to point out a misleading paper by Ramadhyani and Patankar (1980) in which they 'showed' that the bilinear CVFEM is more accurate than the bilinear GFEM by solving several 'Poisson' problems on the same meshes and measuring the errors in the wrong norms. The facts, somewhat alluded to by Habashi and Youngson (1980), are these: (i) if they had used the proper norm (energy norm, or $H^1$-semi-norm), they would have reversed their conclusions; GFEM would win every time, as it must since there is no more accurate solution than that from GFEM; (ii) they would, however, see that GFEM does not win by much. We (RLS and PMG) repeated these experiments and found consistency between numerics and theory; the 'bad' news was that GFEM only won by a 'small' margin in all cases. CVFEM is definitely a valid and viable competitor to GFEM, but we are not pleased with some of the misleading 'advertising' that often accompanies it.
2.3 SOME SEMI-DISCRETE EQUATIONS

In this section we will get closer to the nitty gritty and actually display the final semi-discrete equations for a few common elements (time is still treated as a continuous variable), both in 1D and 2D. (We leave 3D to the reader! Or to his/her computer with symbolic manipulator.) We shall use both the GFEM form and, so that the reader can more easily interpret the results via identification with the PDE and its BC's, the finite difference form developed above, (2.2-29). We limit the presentation to linear and quadratic basis functions and, in 2D, we further limit the discussion to rectangular quadrilateral elements [distorted, isoparametric element equations are better left to the computer; see, for example, Irons and Ahmad (1980) for further discussion of this point, and for much other interesting 'discussion']. Also, we further limit the presentation, for the most part, to the constant coefficient case, although we will permit one important coefficient, the velocity, to be (sometimes) non-constant. Finally, we make the obvious remark that while our presentation of the time-dependent case yields ODE's, the simpler steady case follows easily by discarding all time derivative terms to arrive at the appropriate systems of linear algebraic equations.

As we mentioned in Chapter 1, we generally presume that the reader is sufficiently FEM-educated both to construct 'element matrices' and to use them for the generation of the global GFEM equations. And to provide a little bit of assistance, we have already mentioned, in Chapter 1, many element matrices that are displayed in Appendix 1. In this chapter, we have used these matrices to present the final, 'assembled' equations. For the reader who wants even more assistance, a look-ahead to Section 3.13.5 in the next chapter will actually show, step-by-step, how to 'build' the global equations using bilinear basis functions.
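2.3.1 One Dimension

The 1D version of (2.1-1) is

\[ \frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} = \kappa\frac{\partial^2 T}{\partial x^2} + S \quad \text{on } 0 < x < L, \tag{2.3-1} \]

\[ T = T_D \quad \text{at } x = 0, \tag{2.3-2} \]

\[ \kappa\frac{\partial T}{\partial x} + H(T - \bar T) = q \quad \text{at } x = L, \tag{2.3-3} \]

with

\[ T(x, 0) = T_0(x). \tag{2.3-4} \]

The weak formulation of the advective form is given by the 1D version of (2.2-16) with $\beta = 0$: find $T^h(x,t)$ on $0 < x \le L$ and (optionally) $q_D^h$ at $x = 0$ from

\[ \int\left[\phi_i\left(\frac{\partial T^h}{\partial t} + u\frac{\partial T^h}{\partial x}\right) + \kappa\frac{d\phi_i}{dx}\frac{\partial T^h}{\partial x}\right] + \phi_i H T^h\Big|_{x=L} = \int\phi_i S + \phi_i\,(q + H\bar T)\Big|_{x=L} + \phi_i\,q_D^h\Big|_{x=0}, \quad i = 1, 2, \ldots, N_T, \tag{2.3-5} \]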
where

\[ T^h(x,t) = T_D\,\phi_{N_T}(x) + \sum_{j=1}^{N} T_j(t)\,\phi_j(x). \tag{2.3-6} \]

Note that for this simple case, $N_T = N + 1$, and $\phi_{N_T}$ corresponds to the single node at $x = 0$; i.e., $\phi_{N_T}(0) = 1$, $\phi_i(0) = 0$ for $i = 1, 2, \ldots, N$. Also, $\int(\cdot) = \int_0^L(\cdot)\,dx$. Although we will only treat the case of constant $S$ (and $\kappa$, etc.) for the most part, note that it is very easy to extend the results to the case of variable $S$ when $S$ is interpolated into the basis functions via $S = \sum_j S_j\phi_j$ by simply realizing that the (consistent) mass matrix applies so that the same coefficients that multiply $\dot T_i$ will multiply $S_i$. (We shall soon explicate this in the next section, in 2D.)

a. Linear elements

The domain, showing a single 'hat,' 'chapeau,' 'roof,' 'tent,' 'teepee,' 'pagoda,' 'pyramid,' etc. basis function (all of these names and probably others, have been used, although perhaps/probably not all in 1D) is shown in Figure 2.3-1:

Fig. 2.3-1 1D mesh and one linear basis function.

Using linear basis functions on (at first) the variable mesh shown above with $N$ elements (and $N + 1$ nodes) leads to the following equations, of which there are four different types:

1. $i = 1$. The first node in from $x = 0$ is special because it incorporates the Dirichlet/essential BC from (2.3-5); it is

\[ \frac{1}{6}\left[l_1\dot T_D + 2(l_1 + l_2)\dot T_1 + l_2\dot T_2\right] + \frac{u}{2}(T_2 - T_D) + \kappa\left(\frac{T_1 - T_D}{l_1} + \frac{T_1 - T_2}{l_2}\right) = \frac{l_1 + l_2}{2}\,S, \tag{2.3-7} \]

in which $T_D(t)$ is given by BC (2.3-2) and, in practice, the terms in $T_D(t)$, and $\dot T_D$, are transposed to the RHS to form a portion of the 'data' (forcing function), unless mass lumping (see Section 2.5.1) is invoked, in which case all of the $\dot T$ terms are replaced by $[(l_1 + l_2)/2]\,\dot T_1$.

2. $1 < i < N$, where node $N$ is located at $x = L$:

\[ \frac{1}{6}\left[l_i\dot T_{i-1} + 2(l_i + l_{i+1})\dot T_i + l_{i+1}\dot T_{i+1}\right] + \frac{u}{2}(T_{i+1} - T_{i-1}) + \kappa\left(\frac{T_i - T_{i-1}}{l_i} + \frac{T_i - T_{i+1}}{l_{i+1}}\right) = \frac{l_i + l_{i+1}}{2}\,S \tag{2.3-8} \]
in the GFEM form, or, via (2.2-29), converted to the FD form, where here $\int\phi_i = (l_i + l_{i+1})/2$,

\[ \frac{1}{6}\,\frac{2l_i}{l_i + l_{i+1}}\,\dot T_{i-1} + \frac{2}{3}\,\dot T_i + \frac{1}{6}\,\frac{2l_{i+1}}{l_i + l_{i+1}}\,\dot T_{i+1} + u\,\frac{T_{i+1} - T_{i-1}}{l_i + l_{i+1}} + \frac{2\kappa}{l_i + l_{i+1}}\left(\frac{T_i - T_{i-1}}{l_i} - \frac{T_{i+1} - T_i}{l_{i+1}}\right) = S, \tag{2.3-9} \]

which appears to be more amenable to a term-by-term identification with those of the PDE. In fact, it is also ostensibly amenable to a Taylor-series 'error analysis,' as follows: assume that the exact solution to (2.3-1) is available and smooth enough that a Taylor series makes sense. Thus, for example,

\[ T(x_{i-1}, t) = T_{i-1} = T_i - l_i T_i' + \frac{l_i^2}{2}T_i'' - \frac{l_i^3}{6}T_i''' + \frac{l_i^4}{24}T_i'''' + O(l_i^5), \]

where $T_i' = \partial T/\partial x|_{x=x_i}$, etc. Performing a similar expansion for $T_{i+1}$ and inserting the results into the above difference equation yields [using $\dot T_i' = \partial(\partial T/\partial t)/\partial x|_{x_i}$, etc., and $h = l_i + l_{i+1}$]

\[ \begin{aligned}
&\dot T_i + \frac{1}{3}(l_{i+1} - l_i)\dot T_i' + \frac{l_i^3 + l_{i+1}^3}{6h}\dot T_i'' + \frac{l_{i+1}^4 - l_i^4}{18h}\dot T_i''' + \frac{l_i^5 + l_{i+1}^5}{72h}\dot T_i'''' + O(l^5) \\
&\quad + u\left[T_i' + \frac{l_{i+1} - l_i}{2}T_i'' + \frac{l_i^3 + l_{i+1}^3}{6h}T_i''' + \frac{l_{i+1}^4 - l_i^4}{24h}T_i'''' + \frac{l_i^5 + l_{i+1}^5}{120h}T_i^{(v)} + O(l^5)\right] \\
&= \kappa\left[T_i'' + \frac{l_{i+1} - l_i}{3}T_i''' + \frac{l_i^3 + l_{i+1}^3}{12h}T_i'''' + \frac{l_{i+1}^4 - l_i^4}{60h}T_i^{(v)} + \frac{l_i^5 + l_{i+1}^5}{360h}T_i^{(vi)} + O(l^5)\right] + S.
\end{aligned} \]

But since we have assumed/claimed that $T$ is the exact solution to (2.3-1), the leading terms (within each bracketed group) cancel, and we are left with one measure of the local truncation error (LTE) of the scheme (subtract the RHS from the LHS):

\[ TE_i = (l_{i+1} - l_i)\left[\frac{1}{3}\dot T_i' + \frac{1}{2}uT_i'' - \frac{1}{3}\kappa T_i'''\right] + \frac{l_i^3 + l_{i+1}^3}{6(l_i + l_{i+1})}\left[\dot T_i'' + uT_i''' - \frac{1}{2}\kappa T_i''''\right] + O(l_{i+1}^4 - l_i^4)/(l_i + l_{i+1}) + O(l^4). \]

Digression: Before proceeding with this 'error analysis,' we note that if the velocity is not constant and is interpolated via the basis function as $u = \sum_j u_j\phi_j$, then the advection term changes to

\[ \frac{1}{6}\left[u_{i-1}\,\frac{T_i - T_{i-1}}{(l_i + l_{i+1})/2} + 4u_i\,\frac{T_{i+1} - T_{i-1}}{l_i + l_{i+1}} + u_{i+1}\,\frac{T_{i+1} - T_i}{(l_i + l_{i+1})/2}\right] \]
for the finite difference form, in which the 1/6, 2/3, 1/6 averaging of upwind, centered, and downwind difference approximations (encountered here for the first of many times with linear basis functions) is clearly apparent. We make the important remark that this advective form ($\beta = 0$) can actually generate an unstable ODE [sometimes referred to as a linear (or non-linear, depending on the author) aliasing instability in the finite difference literature] if $u(x)$ varies in certain ways. To see this, we write the general row of the corresponding advection matrix (GFEM form, wherein, recall, the variable element lengths show up in the mass matrix but not in the advection matrix):

\[ \frac{1}{6}\left[0, \ldots, 0,\ -(2u_i + u_{i-1}),\ (u_{i-1} - u_{i+1}),\ (2u_i + u_{i+1}),\ 0, \ldots, 0\right], \]

whose (generally complex) eigenvalues may display positive real parts; i.e., there is no guarantee that they will not. In contrast, if the quadratically conserving form ($\beta = 1/2$), equivalent to $\frac{1}{2}\left[\partial(uT)/\partial x + u\,\partial T/\partial x\right] = u\,\partial T/\partial x + \frac{1}{2}T\,\partial u/\partial x$, is employed, then the GFEM advection term becomes

\[ \frac{1}{4}\left[u_i(T_{i+1} - T_{i-1}) + (u_{i+1}T_{i+1} - u_{i-1}T_{i-1})\right] = \frac{1}{2}\left(\frac{u_i + u_{i+1}}{2}\,T_{i+1} - \frac{u_i + u_{i-1}}{2}\,T_{i-1}\right), \]

whose matrix row is

\[ \frac{1}{4}\left[0, \ldots, 0,\ -(u_i + u_{i-1}),\ 0,\ (u_i + u_{i+1}),\ 0, \ldots, 0\right], \]

which shows that the advection matrix is skew symmetric and thus displays purely imaginary eigenvalues regardless of the behavior of $u(x)$; i.e., it cannot generate an unstable ODE. [The linearly conserving form, $\beta = 1$, leads to the following GFEM advection term:

\[ \frac{1}{6}\left[2(u_{i+1}T_{i+1} - u_{i-1}T_{i-1}) + u_i(T_{i+1} - T_{i-1}) + (u_{i+1} - u_{i-1})T_i\right], \]

and a general matrix row like

\[ \frac{1}{6}\left[0, \ldots, 0,\ -(2u_{i-1} + u_i),\ (u_{i+1} - u_{i-1}),\ (2u_{i+1} + u_i),\ 0, \ldots, 0\right], \]

which, because it corresponds to an indefinite advection matrix, like that for $\beta = 0$, is also susceptible to unstable behavior even though it conserves $T$.]
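The three matrix rows quoted above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a periodic mesh (so that every row is an interior row), and the helper names `advection_matrix` and `max_symmetric_part` are hypothetical. It builds the GFEM advection matrix for $\beta = 0$, $1/2$ and $1$ from nodal velocities and measures the size of its symmetric part, which vanishes only for the skew-symmetric ($\beta = 1/2$) form.

```python
import math

def advection_matrix(u, beta):
    """GFEM advection matrix for linear elements on a periodic 1D mesh.

    u    : list of nodal velocities (length >= 3)
    beta : 0.0 (advective), 0.5 (quadratically conserving) or 1.0 (conservative)
    The periodic wrap-around is an assumption made only for this illustration.
    """
    n = len(u)
    N = [[0.0] * n for _ in range(n)]
    for i in range(n):
        im, ip = (i - 1) % n, (i + 1) % n
        if beta == 0.0:
            lower = -(2 * u[i] + u[im]) / 6
            diag = (u[im] - u[ip]) / 6
            upper = (2 * u[i] + u[ip]) / 6
        elif beta == 0.5:
            lower = -(u[i] + u[im]) / 4
            diag = 0.0
            upper = (u[i] + u[ip]) / 4
        elif beta == 1.0:
            lower = -(2 * u[im] + u[i]) / 6
            diag = (u[ip] - u[im]) / 6
            upper = (2 * u[ip] + u[i]) / 6
        else:
            raise ValueError("beta must be 0.0, 0.5 or 1.0")
        N[i][im] += lower
        N[i][i] += diag
        N[i][ip] += upper
    return N

def max_symmetric_part(N):
    """Largest entry (in magnitude) of (N + N^T)/2; zero means skew-symmetric."""
    n = len(N)
    return max(abs(N[i][j] + N[j][i]) / 2 for i in range(n) for j in range(n))

if __name__ == "__main__":
    # A smooth but non-constant velocity field (illustrative choice).
    u = [1.0 + 0.5 * math.sin(2 * math.pi * i / 8) for i in range(8)]
    for beta in (0.0, 0.5, 1.0):
        print(beta, max_symmetric_part(advection_matrix(u, beta)))
```

Since $x^T N x$ depends only on the symmetric part of $N$, a nonzero symmetric part (as for $\beta = 0$ and $\beta = 1$ with variable $u$) means that the advection term can feed or drain $\int T^2$, whereas for $\beta = 1/2$ it cannot.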
Remark: It is interesting that the famous 'non-linear aliasing instability' of Phillips (1959) may have in fact been a simpler linear instability (or perhaps some of each) caused by trying to solve what are effectively unstable ODE's. The possibility of explaining the growth as a linear instability seems to have first been put forth by Miyakoda (1962). Later, this possibility was rediscovered by Gerrity (1972) and later yet by Gary (1979). A finite difference analog of the quadratically stable ($\beta = 1/2$) version was presented by Piacsek and Williams (1970). And, earlier, Lilly (1965) recognized that any quadratically conserving scheme would preclude the Phillips' instability. Finally, see Kreiss and Oliger (1973), who also showed that linear equations with sufficiently rough coefficients can be unstable. Neither of the latter two (P & W, K & O) seemed aware of the previous
and essentially equivalent descriptions described by the earlier authors. We will discuss aliasing, briefly, in the next chapter (Section 3.17). In our own experience (Lee et al., 1982), we too observed linear 'aliasing' instability in certain inviscid simulations: only the quadratically conserving advection approximation precluded blow up of $\int T^2$ in a velocity field that was semi-chaotic, with small scales in both space and time. Later in this chapter, we will demonstrate the occurrence of an unstable ODE system for $\beta \ne 1/2$.

End Digression

Returning now to the Taylor series analysis, we note that if a uniform mesh is employed, then $l_i = l_{i+1} = l = L/N$, and the semi-discrete difference equation [(2.3-8)] simplifies to

\[ \frac{1}{6}\left(\dot T_{i-1} + 4\dot T_i + \dot T_{i+1}\right) + u\,\frac{T_{i+1} - T_{i-1}}{2l} = \kappa\,\frac{T_{i+1} - 2T_i + T_{i-1}}{l^2} + S, \tag{2.3-10} \]

and its Taylor series analog to

\[ \dot T_i + \frac{l^2}{6}\dot T_i'' + \frac{l^4}{72}\dot T_i'''' + u\left(T_i' + \frac{l^2}{6}T_i''' + \frac{l^4}{120}T_i^{(v)}\right) = \kappa\left(T_i'' + \frac{l^2}{12}T_i'''' + \frac{l^4}{360}T_i^{(vi)}\right) + S + O(l^5). \]

The corresponding LTE in this case is

\[ TE_i = \frac{l^2}{6}\left(\dot T_i'' + uT_i''' - \frac{\kappa}{2}T_i''''\right) + \frac{l^4}{24}\left(\frac{1}{3}\dot T_i'''' + \frac{1}{5}uT_i^{(v)} - \frac{\kappa}{15}T_i^{(vi)}\right) + O(l^5). \]

So what do these equations tell us, besides the obvious fact that if $T(x,t)$ that solves the given IBVP is available, then the quantities called $TE_i(t)$ could be computed (up to some power of $l$) at each node? Considering first the general result ($l_i \ne l_{i+1}$), it seems to say that the FDM version of the GFEM equation has leading truncation error terms in $(l_{i+1} - l_i)$ and in $(l_{i+1}^3 + l_i^3)/(l_i + l_{i+1})$, the latter of which can be interpreted as second-order in $l$; i.e., it is $O(l^2)$. Considering the first of these, however, there are two possibilities: (i) the mesh grading is random/rough such that $(l_{i+1} - l_i) = O(l_i)$, e.g., $l_{i+1} = 3l_i$; i.e., large changes in element size will occur; and (ii) $(l_{i+1} - l_i) = O(l_i^2)$; i.e., $(l_{i+1} - l_i)/l_i = O(l_i)$, so the mesh gradation is gradual. In the former case, the Taylor series accuracy is clearly only first-order, whereas in the second, it is second-order.
One obvious conclusion, and one that is generally also borne out in practice, is to grade the mesh gradually. In practice, a common mesh grading procedure is 'geometric'; e.g., $l_{i+1} - l_i = \epsilon\,l_i$, where $\epsilon \ll 1$. While theoretically still only first-order accurate, the computed results are usually quite acceptable/accurate. We quickly add, however, the important remark that while Taylor series analyses are useful to some extent, they are subservient to the more powerful error analysis techniques inherent in the GFEM and the results of these techniques, in that the truncation error as computed above is not necessarily the correct measure of the error in a GFEM solution. A second remark, related to the first, is that Taylor series consistency is not even necessary for convergence (to the PDE solution) to occur; although this is not the case here, there are FEM situations where it is (e.g., Carey, 1976). Part of the reason for these assertions is that the GFEM is not based on Taylor series expansions,
which are actually only appropriate (i.e., when used to actually generate approximations to derivatives) when '$\Delta x$' is 'sufficiently small.' The GFEM tries, in some sense (and often-but-not-always succeeds) to deliver an approximate solution that is 'as accurate as possible' based on the given mesh (and the basis functions employed).

Finally, we consider two interesting limiting cases. In the first, we will uncover here (and demonstrate later) an example of 'improved' convergence: let $\kappa = 0 = S$ so that we have the pure advection equation, $T_t + uT_x = 0$. The uniform grid result in this case enjoys a property of truncation error cancellation that increases its Taylor series accuracy from first-order on a variable mesh all the way to fourth-order! This is because the term $\dot T_i'' + uT_i''' = \partial^2(T_t + uT_x)/\partial x^2 = 0$, and the $l^2$ term drops out of the TE equation. The second case is that of pure diffusion ($u = 0$). Here the general TE equation becomes

\[ TE = \frac{l_{i+1} - l_i}{3}\,\frac{\partial}{\partial x}\left(\frac{\partial T}{\partial t} - \kappa\frac{\partial^2 T}{\partial x^2}\right) + \frac{l_i^3 + l_{i+1}^3}{6(l_i + l_{i+1})}\left(\dot T'' - \frac{1}{2}\kappa T''''\right) + (\text{HOT}). \]

But since $T_t = \kappa T_{xx} + S$, we have $\partial(T_t - \kappa T_{xx})/\partial x = 0$; the variable-grid GFEM for the transient heat equation is second-order accurate, a result that is general; i.e., it also holds for a uniform mesh. We remark too that this is a consequence of the so-called consistent mass matrix of GFEM and does not occur if the mass is lumped [setting $\dot T_{i-1}$ and $\dot T_{i+1}$ to $\dot T_i$ in (2.3-10)]; lumping gives only first-order accuracy. (See Section 2.5.1 for a more extensive discussion on mass lumping.) The fourth-order accuracy for the pure advection case is also lost if the mass is lumped; it is then only second-order accurate. We also mention that the second-order accuracy obtains even if the source term is not constant and is treated either via interpolation into the FEM basis functions,

\[ \frac{l_i + l_{i+1}}{2}\,S \;\to\; \frac{1}{6}\left[l_iS_{i-1} + 2(l_i + l_{i+1})S_i + l_{i+1}S_{i+1}\right], \]

or via numerical (approximate) integration (Strang and Fix, 1973).
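The fourth-order versus second-order claim for pure advection on a uniform mesh can be illustrated by inserting an exact solution into the nodal equation and watching how the residual shrinks under mesh halving. The sketch below is only a Taylor-series (local truncation error) check, with all the caveats the text attaches to such checks; the travelling wave $T = \sin k(x - ut)$, the evaluation point, and the function names `residual` and `observed_order` are assumptions made for illustration.

```python
import math

def residual(l, lumped, u=1.0, k=1.0, x=0.3):
    """Residual of the uniform-mesh nodal equation (kappa = S = 0) at t = 0.

    The exact solution T = sin(k (x - u t)) of T_t + u T_x = 0 is inserted into
    (1/6)(Tdot_{i-1} + 4 Tdot_i + Tdot_{i+1}) + u (T_{i+1} - T_{i-1}) / (2 l),
    or into its lumped-mass counterpart, in which the mass term is just Tdot_i.
    """
    T = lambda s: math.sin(k * s)                # exact solution at t = 0
    Tdot = lambda s: -u * k * math.cos(k * s)    # exact time derivative at t = 0
    if lumped:
        mass = Tdot(x)
    else:
        mass = (Tdot(x - l) + 4.0 * Tdot(x) + Tdot(x + l)) / 6.0
    advection = u * (T(x + l) - T(x - l)) / (2.0 * l)
    return mass + advection

def observed_order(lumped, l=0.1):
    """Estimate the truncation-error order from residuals at l and l/2."""
    r_coarse = abs(residual(l, lumped))
    r_fine = abs(residual(l / 2.0, lumped))
    return math.log(r_coarse / r_fine, 2)

if __name__ == "__main__":
    print("consistent mass:", observed_order(lumped=False))
    print("lumped mass:    ", observed_order(lumped=True))
```

With the consistent mass the estimated order should come out close to four, and with the lumped mass close to two, in line with the statements above.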
Before proceeding, however, we emphasize the point (as mentioned earlier) that we must be careful not to fall into the common trap of actually believing that the FDM (Taylor series) truncation error analysis actually governs the behavior of the error in the solution to the GFEM equations (it does not) and thereby provides accurate error estimates. [For a recent example of such entrapment, see Fletcher's (1991) discussion of non-uniform grids. See also Fletcher and Srinivas (1983).] The 'reduction' of the GFEM equations to an FDM 'equivalent' is thus actually misleading in several ways:

(i) The (global) finite element theory [via energy inner products, Sobolev spaces, etc.; see, for example, Strang and Fix (1973); Wait and Mitchell (1985); Girault and Raviart (1986); Ciarlet and Lions (1990); Kardestuncer and Norrie (1987)] prevails, and the (local) Taylor series theory is secondary; i.e., the one-node-at-a-time-analyzed-separately-and-independently approach does not reveal the whole truth/the big picture.

(ii) Whereas $M$ and $K_D$ are symmetric matrices in the GFEM version of the semi-discrete equations, these important properties are lost (or at least, camouflaged) in the FD version.

(iii) Whereas the advection matrix, see (2.2-9), is skew-symmetric in the GFEM version [$N(u)$ for variable velocity with $\beta = 1/2$], this property too is 'lost'
in the FD version. Thus, for example, the former version assures that the advection approximation is dissipation-free (since $x^TNx = 0,\ \forall x$), but the latter version, via the particular Taylor series term $[(l_{i+1} - l_i)/2]\,u\,(\partial^2T/\partial x^2)|_i$, which looks dissipative/diffusive, at least if $(l_{i+1} - l_i)$ were replaced by $|l_{i+1} - l_i|$, seems to imply otherwise; pointwise! This 'conclusion' is erroneous because the local error analysis via FDM/Taylor series is itself erroneous, and is the cause of, for example, Fletcher's (1991) errors when discussing this sort of example.

So the best advice that we can offer seems to be, 'Do not be afraid to use Taylor series analysis to assist in studying accuracy and verifying consistency, but do not rely on it too heavily, and certainly not exclusively. Also, when it generates "good news," advertise the fact; but if it generates bad or suspicious results, be suspicious, or ignore it.'

3. $i = N$ (the right-most node). Here, (2.3-5) gives

\[ \frac{l_N}{6}\left(\dot T_{N-1} + 2\dot T_N\right) + \frac{u}{2}(T_N - T_{N-1}) + \frac{\kappa}{l_N}(T_N - T_{N-1}) + HT_N = \frac{1}{2}l_NS + q + H\bar T, \tag{2.3-11} \]

which rearranges to

\[ \kappa\,\frac{T_N - T_{N-1}}{l_N} + H(T_N - \bar T) = q + \frac{1}{2}l_N\left[S - u\,\frac{T_N - T_{N-1}}{l_N} - \frac{2}{3}\dot T_N - \frac{1}{3}\dot T_{N-1}\right], \tag{2.3-12} \]

a notable result in several ways:

(i) The GFEM form of the semi-discrete equation is 'useful' as it stands because it is the 'integrated form' that is appropriate for studying Neumann/Robin boundary conditions; there is no need to divide by $\int\phi_N = l_N/2$. (This is true only in 1D.)

(ii) It shows how the GFEM satisfies approximately the natural boundary condition, (2.3-3).

(iii) The ODE at node $N$ is, in effect, the BC given by (2.3-3). We shall return to this important point later (Section 2.4).

(iv) Presuming convergence as $l_N \to 0$, all terms on the RHS but the first vanish so that convergence to the NBC indeed occurs.
(v) A Taylor series analysis about $T_N$ at $x = L$ gives, for $l_N = l$,

\[ \kappa\frac{\partial T}{\partial x} + H(T_N - \bar T) = q - \frac{l}{2}\left(\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} - \kappa\frac{\partial^2T}{\partial x^2} - S\right) + \frac{l^2}{12}\,uT_{xx} + O(l^3), \]

which, if the PDE holds on $\Gamma$ (i.e., if the term in parentheses vanishes), and if the Taylor series interpretation is valid, is a second-order accurate (third-order if $u = 0$ and the mass is not lumped) approximation to BC (2.3-3); see also Strang and Fix (1973, p. 33). The equation can also be interpreted as a sort of discrete (GFEM) energy balance in the last element, or perhaps better, in the last half of the last element, or perhaps better yet, at the last node. (See Section 4.2.4 in Chapter 4.) Finally, we remark that
the discrete equations generated by the GFEM (or any other approximation), and their solution, are indifferent to the manner of 'interpretation.'

4. $i = N_T$ (the left-most node); optional/'post-processing.' Here, since all $\phi_j = 0$ at $x = 0$ except for $j = N_T$, (2.3-5) gives

\[ \frac{l_1}{6}\left(2\dot T_D + \dot T_1\right) + \frac{1}{2}u(T_1 - T_D) - \kappa\,\frac{T_1 - T_D}{l_1} = \frac{1}{2}l_1S + q_D^h, \tag{2.3-13} \]

which is similar to (2.3-11) and is actually an equation for $q_D^h$, the heat flux into $\Omega$ at $x = 0$. That is, it is presumed that equations 1 through $N$ have already been solved so that $T_1$ is known. Again we note that this (and only this) calculation is optional, and is a result of expressing the equations in the generalized/augmented weak form. Clearly, $l_1 \to 0 \Rightarrow q_D^h = -\kappa(\partial T/\partial x)|_{x=0} = \kappa(\partial T/\partial n)|_\Gamma$; the heat flux through $\Gamma_D$, for $l \to 0$, is purely diffusive, even though $\mathbf{n}\cdot\mathbf{u} \ne 0$ there. Also, for constant $l$,

\[ q_D^h = -\kappa\,\frac{T_1 - T_D}{l} + \frac{1}{2}l\left[\frac{2}{3}\dot T_D + \frac{1}{3}\dot T_1 + u\,\frac{T_1 - T_D}{l} - S\right], \]

which yields, via Taylor series analysis,

\[ q_D^h = -\kappa\frac{\partial T}{\partial x} + \frac{l}{2}\left(\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} - \kappa\frac{\partial^2T}{\partial x^2} - S\right) + \frac{l^2}{6}\,\frac{\partial}{\partial x}\left(\frac{\partial T}{\partial t} + \frac{3}{2}u\frac{\partial T}{\partial x} - \kappa\frac{\partial^2T}{\partial x^2}\right) + O(l^3), \]

which gives, again assuming the PDE holds on $\Gamma$,

\[ q_D^h = -\kappa\frac{\partial T}{\partial x} + \frac{l^2}{12}\,uT_{xx} + O(l^3), \]

another second-order accurate heat flux result (again, it turns out to be third-order in the absence of advection, when consistent mass is retained).

b. Quadratic elements

The domain in this case, with $N/2$ elements and $N + 1$ nodes, is shown in Figure 2.3-2 ($N$ is necessarily even). Now the fun begins; i.e., higher-order basis functions generate higher-order complexity, complete with different 'types' of nodes, basis functions, and associated (and often counter-intuitive) equations; this is a necessary adjunct to higher-order accuracy, at least when we insist on $C^0$ Lagrange basis functions with compact support.
But, in fact, quadratic basis functions are but the first step up in what is called the $p$-finite element method ($p = 2$ here and stands for polynomial of degree 2), a technique that we shall not present here, just summarize: rather than employing a simple mesh refinement using the same type of element on each mesh ($h$-method) to increase the accuracy, it holds a fixed mesh (relatively coarse) and increases the order ($p$) of the basis functions. In the $h$-$p$ method (see, for example, Oden et al., 1992), both $h$ and $p$ are varied, usually 'adaptively,' in order to find an optimal solution strategy. After seeing
some of the discrete equations below from $p = 2$, the reader will probably appreciate why most people are content to just trust Galerkin and let the computer do the work.

Fig. 2.3-2 1D mesh and four quadratic basis functions.

We now 'evaluate'/explicate (2.3-5) for 'quads'; here there are six different 'types' of nodes.

1. $i = 1$, the first of two nodes that see the Dirichlet data, $T_D(t)$:

\[ \frac{l_1}{15}\left(\dot T_D + 8\dot T_1 + \dot T_2\right) + \frac{2u}{3}(T_2 - T_D) + \frac{8\kappa}{3l_1}\left(-T_D + 2T_1 - T_2\right) = \frac{2l_1}{3}\,S, \tag{2.3-14} \]

in which the 'data,' $l_1\dot T_D/15 - 2uT_D/3 - 8\kappa T_D/(3l_1)$, would be transposed to the RHS when forming the ODE system à la (2.2-7). (As for linear elements, mass lumping removes the effect of $\dot T_D$ by replacing all $\dot T$ terms by $\dot T_1$.)

2. $i = 2$, the second of two nodes coupling with $T_D(t)$:

\[ \frac{1}{30}\left[-l_1\dot T_D + 2l_1\dot T_1 + 4(l_1 + l_2)\dot T_2 + 2l_2\dot T_3 - l_2\dot T_4\right] + \frac{u}{6}\left[4(T_3 - T_1) - (T_4 - T_D)\right] + \frac{\kappa}{3}\left[\frac{T_D - 8T_1 + 7T_2}{l_1} + \frac{7T_2 - 8T_3 + T_4}{l_2}\right] = \frac{l_1 + l_2}{6}\,S, \tag{2.3-15} \]

in which the data, $-l_1\dot T_D/30 + uT_D/6 + \kappa T_D/(3l_1)$, again go over to the RHS. [If the mass is lumped, $\dot T_D$ does not go to the RHS because all time derivative terms are then replaced by their 'sum,' $(l_1 + l_2)\dot T_2/6$.]

3. $i = 3, 5, 7, \ldots, N - 1$; these are the rest of the 'center' nodes, which we easily display as FDM equations by dividing each GFEM equation by $\int\phi_i = \frac{2}{3}l_{(i+1)/2}$:
\[ \frac{1}{10}\left(\dot T_{i-1} + 8\dot T_i + \dot T_{i+1}\right) + u\,\frac{T_{i+1} - T_{i-1}}{l_{(i+1)/2}} + \kappa\,\frac{-T_{i-1} + 2T_i - T_{i+1}}{\Delta x_i^2} = S, \tag{2.3-16} \]

where $\Delta x_i = \frac{1}{2}l_{(i+1)/2}$ is the node-to-node distance on element $(i + 1)/2$.

Remarks: (i) For the case of variable velocity via $u = \sum_j u_j\phi_j$, the term $u(T_{i+1} - T_{i-1})/l_{(i+1)/2}$ is replaced by

\[ \left[u_{i-1}(4T_i - 3T_{i-1} - T_{i+1}) + 8u_i(T_{i+1} - T_{i-1}) + u_{i+1}(T_{i-1} - 4T_i + 3T_{i+1})\right]/\left(10\,l_{(i+1)/2}\right), \]

a linear combination of first-derivative approximations. (ii) To lump the mass, just set $\dot T_{i-1}$ and $\dot T_{i+1}$ to $\dot T_i$.

This result, (2.3-16), 'smells like' a simple, second-order accurate approximation, and it is if 'simple-think' is employed. But the actual scent is better: the global theory, which couples all midside nodes with all 'edge' nodes, gives the result that quadratic basis functions are third-order accurate, even for variable coefficients.

4. $i = 4, 6, \ldots, N - 2$, the remaining edge nodes except the last: here we first present the GFEM form, obtained from the element matrices (Appendix 1) for two adjacent elements:

\[ \begin{aligned}
&\frac{1}{30}\left[-l_{i/2}\dot T_{i-2} + 2l_{i/2}\dot T_{i-1} + 4(l_{i/2} + l_{i/2+1})\dot T_i + 2l_{i/2+1}\dot T_{i+1} - l_{i/2+1}\dot T_{i+2}\right] + \frac{u}{6}\left[4(T_{i+1} - T_{i-1}) - (T_{i+2} - T_{i-2})\right] \\
&\quad + \frac{\kappa}{3}\left[\frac{T_{i-2} - 8T_{i-1} + 7T_i}{l_{i/2}} + \frac{7T_i - 8T_{i+1} + T_{i+2}}{l_{i/2+1}}\right] = \frac{1}{6}\left(l_{i/2} + l_{i/2+1}\right)S,
\end{aligned} \tag{2.3-17} \]

which we further present in FDM form by dividing through by $\int\phi_i = (l_{i/2} + l_{i/2+1})/6$:

\[ \begin{aligned}
&\frac{1}{5}\left[4\dot T_i + 2\,\frac{l_{i/2}\dot T_{i-1} + l_{i/2+1}\dot T_{i+1}}{l_{i/2} + l_{i/2+1}} - \frac{l_{i/2}\dot T_{i-2} + l_{i/2+1}\dot T_{i+2}}{l_{i/2} + l_{i/2+1}}\right] + u\left[2\,\frac{T_{i+1} - T_{i-1}}{(l_{i/2} + l_{i/2+1})/2} - \frac{T_{i+2} - T_{i-2}}{l_{i/2} + l_{i/2+1}}\right] \\
&\quad + \frac{2\kappa}{l_{i/2} + l_{i/2+1}}\left[\frac{T_{i-2} - 8T_{i-1} + 7T_i}{l_{i/2}} + \frac{7T_i - 8T_{i+1} + T_{i+2}}{l_{i/2+1}}\right] = S,
\end{aligned} \tag{2.3-18} \]

wherein we note/remark:

(i) The diffusion terms are easily identified as differences of fluxes. For $l = 2\Delta x$ = constant, they become $\kappa\left[(T_{i-2} - 2T_i + T_{i+2})/l^2 - 2(T_{i-1} - 2T_i + T_{i+1})/\Delta x^2\right]$.
(ii) Lumped mass is, as usual, easily obtained by setting $\dot T_{i\pm1}$ and $\dot T_{i\pm2}$ to $\dot T_i$.

(iii) The variable velocity case, via $u = \sum_j u_j\phi_j$, replaces the advection term in (2.3-18) by

\[ \begin{aligned}
\frac{1}{5}\big[&(u_{i-2} + 2u_{i-1} + 2u_i)T_{i-2} - 4(2u_{i-1} + 3u_i)T_{i-1} + (u_{i+2} - 6u_{i+1} + 6u_{i-1} - u_{i-2})T_i \\
&+ 4(3u_i + 2u_{i+1})T_{i+1} - (2u_i + 2u_{i+1} + u_{i+2})T_{i+2}\big]/\left(l_{i/2} + l_{i/2+1}\right).
\end{aligned} \]

(iv) Again, local Taylor series analysis is not recommended, except/unless to demonstrate consistency. (Recall, though, that consistency is not necessary for convergence; it is merely sufficient.)

5. $i = N$; the right-most (NBC) node: here, (2.3-5) gives

\[ \frac{l_{N/2}}{30}\left(-\dot T_{N-2} + 2\dot T_{N-1} + 4\dot T_N\right) + \frac{u}{6}\left(T_{N-2} - 4T_{N-1} + 3T_N\right) + \frac{\kappa}{3l_{N/2}}\left(T_{N-2} - 8T_{N-1} + 7T_N\right) + HT_N = \frac{1}{6}l_{N/2}S + q + H\bar T, \tag{2.3-19} \]

which is an ODE for node $N$ that can be conveniently rearranged to reveal its alias, the boundary condition given by (2.3-3), via

\[ \kappa\,\frac{T_{N-2} - 8T_{N-1} + 7T_N}{3l_{N/2}} + H(T_N - \bar T) = q + \frac{l_{N/2}}{6}\left[S - u\,\frac{T_{N-2} - 4T_{N-1} + 3T_N}{l_{N/2}} - \frac{1}{5}\left(-\dot T_{N-2} + 2\dot T_{N-1} + 4\dot T_N\right)\right], \tag{2.3-20} \]

and letting $l_{N/2} \to 0$.

6. $i = N_T$; the left-most node, via post-processing: here (2.3-5) gives

\[ \frac{l_1}{30}\left(4\dot T_D + 2\dot T_1 - \dot T_2\right) + \frac{u}{6}\left(-3T_D + 4T_1 - T_2\right) + \frac{\kappa}{3l_1}\left(7T_D - 8T_1 + T_2\right) = \frac{l_1}{6}S + q_D^h, \tag{2.3-21} \]

which we recognize as the (consistent) heat flux into the domain at $x = 0$ via rearrangement:

\[ q_D^h = \kappa\,\frac{7T_D - 8T_1 + T_2}{3l_1} + \frac{l_1}{6}\left[\frac{1}{5}\left(4\dot T_D + 2\dot T_1 - \dot T_2\right) + u\,\frac{-3T_D + 4T_1 - T_2}{l_1} - S\right], \tag{2.3-22} \]

which, for $l_1 \to 0$, is $q_D^h = -\kappa\,\partial T/\partial x|_0 + O(l_1^3)$. But for finite $l_1$, (2.3-22) describes the most accurate estimate to $q_D$ available by accounting for more ('finite') physics than simply heat conduction. (Further details of such post-processing will be provided in Chapter 4, 'Derived Quantities,' as well as in Appendix 2.)
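The edge-node equation (2.3-17) can be reproduced by assembling two adjacent quadratic elements. The sketch below is offered as an illustration rather than as the procedure used to produce the equations above: it assumes the standard quadratic Lagrange basis on a reference element (end, middle, end nodes), integrates the element matrices exactly with rational arithmetic, and adds the last row of the left element to the first row of the right one. The names `element_matrices` and `edge_node_row` are hypothetical.

```python
from fractions import Fraction as Fr

# Quadratic Lagrange basis on the reference element 0 <= s <= 1 (end, mid, end),
# stored as coefficient lists [c0, c1, c2] of c0 + c1*s + c2*s**2.
PHI = [
    [Fr(1), Fr(-3), Fr(2)],   # (1 - s)(1 - 2s)
    [Fr(0), Fr(4), Fr(-4)],   # 4 s (1 - s)
    [Fr(0), Fr(-1), Fr(2)],   # s (2s - 1)
]

def mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    r = [Fr(0)] * (len(p) + len(q) - 1)
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            r[a + b] += pa * qb
    return r

def der(p):
    """Derivative with respect to s."""
    return [k * c for k, c in enumerate(p)][1:]

def integ(p):
    """Exact integral over 0 <= s <= 1."""
    return sum(c / (k + 1) for k, c in enumerate(p))

def element_matrices(l, u, kappa):
    """Mass, advection (constant u) and diffusion matrices for length l."""
    rng = range(3)
    M = [[l * integ(mul(PHI[i], PHI[j])) for j in rng] for i in rng]
    A = [[u * integ(mul(PHI[i], der(PHI[j]))) for j in rng] for i in rng]
    K = [[kappa / l * integ(mul(der(PHI[i]), der(PHI[j]))) for j in rng] for i in rng]
    return M, A, K

def edge_node_row(la, lb, u, kappa):
    """Assembled rows (columns i-2 .. i+2) for the node shared by two elements."""
    Ma, Aa, Ka = element_matrices(la, u, kappa)
    Mb, Ab, Kb = element_matrices(lb, u, kappa)

    def glue(Ea, Eb):
        return [Ea[2][0], Ea[2][1], Ea[2][2] + Eb[0][0], Eb[0][1], Eb[0][2]]

    return glue(Ma, Mb), glue(Aa, Ab), glue(Ka, Kb)

if __name__ == "__main__":
    mass, adv, diff = edge_node_row(Fr(1, 2), Fr(1, 3), Fr(2), Fr(5))
    print("mass:     ", mass)
    print("advection:", adv)
    print("diffusion:", diff)
```

The three printed rows should match, coefficient by coefficient, the mass, advection and diffusion brackets of (2.3-17) for the chosen element lengths; exact fractions are used so that the comparison is not blurred by rounding.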
This concludes our 1D 'demonstration.' The reader should now both have a better 'feel' for the GFEM and be able to 'go farther'. Thus, we leave as an exercise the generation of the ODE's using cubic (or higher) basis functions.

2.3.2 Two Dimensions with Bilinear Elements

Here we wish to present and examine the semi-discrete equations, (2.2-2) or (2.2-16) with $\beta = 0$, on a mesh of variable rectangles for both bilinear and, in the next two sections, biquadratic elements, both in the interior ($\Omega$) and on the boundary ($\Gamma$). For one case (bilinear elements), we will even show how the advection terms change when $\beta$ is set to 1 (flux-divergence/conservation form) or to 1/2 (quadratically conserving form). We remark that anyone who decides to verify/check these results is well-advised to use a 'symbolic/algebraic-manipulator' package rather than 'do it by hand' as was done by the authors (PMG, in fact).

a. An Interior 4-Patch

We start with a general 4-patch of general rectangular elements and a 'general' velocity field, i.e., the velocity is variable and expressed (via interpolation) in terms of the bilinear basis functions. The semi-discrete equation for the 'control' node (0) comprises the following terms for the 4-patch shown in Figure 2.3-3, in which we employ compass-point notation:

Fig. 2.3-3 A 4-patch of bilinear elements.

1.

\[ M\dot T|_0 = \frac{1}{36}\Big\{l_1h_1\dot T_{SW} + l_2h_1\dot T_{SE} + l_2h_2\dot T_{NE} + l_1h_2\dot T_{NW} + 2\left[(l_1 + l_2)(h_1\dot T_S + h_2\dot T_N) + (h_1 + h_2)(l_1\dot T_W + l_2\dot T_E)\right] + 4(l_1 + l_2)(h_1 + h_2)\dot T_0\Big\}. \]

2. For $\mathbf{K} \to \kappa$, $K_D \approx -\kappa\nabla^2$:

\[ K_DT|_0 = \frac{\kappa}{6}\Big\{\frac{h_1}{l_1}\left[2(T_0 - T_W) + (T_S - T_{SW})\right] + \frac{h_1}{l_2}\left[2(T_0 - T_E) + (T_S - T_{SE})\right] + \frac{h_2}{l_1}\left[2(T_0 - T_W) + (T_N - T_{NW})\right] + \frac{h_2}{l_2}\left[2(T_0 - T_E) + (T_N - T_{NE})\right]\Big\} \]
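\[ \qquad + \frac{\kappa}{6}\Big\{\frac{l_1}{h_1}\left[2(T_0 - T_S) + (T_W - T_{SW})\right] + \frac{l_2}{h_1}\left[2(T_0 - T_S) + (T_E - T_{SE})\right] + \frac{l_1}{h_2}\left[2(T_0 - T_N) + (T_W - T_{NW})\right] + \frac{l_2}{h_2}\left[2(T_0 - T_N) + (T_E - T_{NE})\right]\Big\}. \]

3. $N(\mathbf{u})T = N_x(u)T + N_y(v)T \approx u(\partial T/\partial x) + v(\partial T/\partial y)$, where $u = \sum_j u_j\phi_j$ and $v = \sum_j v_j\phi_j$:

\[ \begin{aligned}
N_x(u)T|_0 = {}& \frac{1}{72}\Big\{h_1(u_W + u_{SW})(T_S - T_{SW}) + \left[3(h_1 + h_2)u_W + h_1u_{SW} + h_2u_{NW}\right](T_0 - T_W) + h_2(u_W + u_{NW})(T_N - T_{NW})\Big\} \\
&+ \frac{1}{36}\Big\{h_1(u_0 + u_S)(T_{SE} - T_{SW}) + \left[3(h_1 + h_2)u_0 + h_1u_S + h_2u_N\right](T_E - T_W) + h_2(u_0 + u_N)(T_{NE} - T_{NW})\Big\} \\
&+ \frac{1}{72}\Big\{h_1(u_E + u_{SE})(T_{SE} - T_S) + \left[3(h_1 + h_2)u_E + h_1u_{SE} + h_2u_{NE}\right](T_E - T_0) + h_2(u_E + u_{NE})(T_{NE} - T_N)\Big\},
\end{aligned} \]

\[ \begin{aligned}
N_y(v)T|_0 = {}& \frac{1}{72}\Big\{l_1(v_{SW} + v_S)(T_W - T_{SW}) + \left[3(l_1 + l_2)v_S + l_1v_{SW} + l_2v_{SE}\right](T_0 - T_S) + l_2(v_S + v_{SE})(T_E - T_{SE})\Big\} \\
&+ \frac{1}{36}\Big\{l_1(v_0 + v_W)(T_{NW} - T_{SW}) + \left[3(l_1 + l_2)v_0 + l_1v_W + l_2v_E\right](T_N - T_S) + l_2(v_0 + v_E)(T_{NE} - T_{SE})\Big\} \\
&+ \frac{1}{72}\Big\{l_1(v_N + v_{NW})(T_{NW} - T_W) + \left[3(l_1 + l_2)v_N + l_1v_{NW} + l_2v_{NE}\right](T_N - T_0) + l_2(v_N + v_{NE})(T_{NE} - T_E)\Big\},
\end{aligned} \]

which involves a linear combination of upwind, centered, and downwind differences for $\mathbf{u}\cdot\nabla T$.

Combining the above three results in the form $M\dot T|_0 + N(\mathbf{u})T|_0 + K_DT|_0 = MS|_0$, where $MS|_0$ is, for $S = \sum_j S_j\phi_j$, the same form as $M\dot T|_0$ and thus need not be explicitly written out, yields the GFEM ODE for $T_0$. [Note that if $S$ is constant, then $MS|_0 = (l_1 + l_2)(h_1 + h_2)/4 \cdot S$.]

Since the GFEM equation as stated (and as usually seen by the computer) is not in a clearly tasteful form, we next multiply it by the inverse of the lumped mass matrix [$M_L = (l_1 + l_2)(h_1 + h_2)/4$] and rearrange the terms to reduce it to a more palatable form; i.e., the finite difference form discussed in Section 2.2.5, which is more easily interpreted. The result is, where $A_{SW} = l_1h_1$ (the area of the southwest element), etc., and $A_T = A_{SW} + A_{SE} + A_{NE} + A_{NW} = (l_1 + l_2)(h_1 + h_2)$,

\[ \begin{aligned}
&\frac{1}{9}\Big[\frac{A_{SW}}{A_T}\dot T_{SW} + \frac{A_{NW}}{A_T}\dot T_{NW} + \frac{A_{NE}}{A_T}\dot T_{NE} + \frac{A_{SE}}{A_T}\dot T_{SE} + 2\Big(\frac{A_{NE} + A_{NW}}{A_T}\dot T_N + \frac{A_{SE} + A_{SW}}{A_T}\dot T_S + \frac{A_{SE} + A_{NE}}{A_T}\dot T_E + \frac{A_{SW} + A_{NW}}{A_T}\dot T_W\Big) + 4\dot T_0\Big] \\
&+ \frac{1}{6}\Big\{\frac{1}{6}\,\frac{2h_1}{h_1 + h_2}\,\frac{u_W + u_{SW}}{2}\,\frac{T_S - T_{SW}}{(l_1 + l_2)/2} + \frac{4}{6}\cdot\frac{1}{4}\Big(3u_W + \frac{h_1u_{SW} + h_2u_{NW}}{h_1 + h_2}\Big)\frac{T_0 - T_W}{(l_1 + l_2)/2} + \frac{1}{6}\,\frac{2h_2}{h_1 + h_2}\,\frac{u_W + u_{NW}}{2}\,\frac{T_N - T_{NW}}{(l_1 + l_2)/2}\Big\} \\
&+ \frac{4}{6}\Big\{\frac{1}{6}\,\frac{2h_1}{h_1 + h_2}\,\frac{u_0 + u_S}{2}\,\frac{T_{SE} - T_{SW}}{l_1 + l_2} + \frac{4}{6}\cdot\frac{1}{4}\Big(3u_0 + \frac{h_1u_S + h_2u_N}{h_1 + h_2}\Big)\frac{T_E - T_W}{l_1 + l_2} + \frac{1}{6}\,\frac{2h_2}{h_1 + h_2}\,\frac{u_0 + u_N}{2}\,\frac{T_{NE} - T_{NW}}{l_1 + l_2}\Big\} \\
&+ \frac{1}{6}\Big\{\frac{1}{6}\,\frac{2h_1}{h_1 + h_2}\,\frac{u_E + u_{SE}}{2}\,\frac{T_{SE} - T_S}{(l_1 + l_2)/2} + \frac{4}{6}\cdot\frac{1}{4}\Big(3u_E + \frac{h_1u_{SE} + h_2u_{NE}}{h_1 + h_2}\Big)\frac{T_E - T_0}{(l_1 + l_2)/2} + \frac{1}{6}\,\frac{2h_2}{h_1 + h_2}\,\frac{u_E + u_{NE}}{2}\,\frac{T_{NE} - T_N}{(l_1 + l_2)/2}\Big\} \\
&+ \frac{1}{6}\Big\{\frac{1}{6}\,\frac{2l_1}{l_1 + l_2}\,\frac{v_S + v_{SW}}{2}\,\frac{T_W - T_{SW}}{(h_1 + h_2)/2} + \frac{4}{6}\cdot\frac{1}{4}\Big(3v_S + \frac{l_1v_{SW} + l_2v_{SE}}{l_1 + l_2}\Big)\frac{T_0 - T_S}{(h_1 + h_2)/2} + \frac{1}{6}\,\frac{2l_2}{l_1 + l_2}\,\frac{v_S + v_{SE}}{2}\,\frac{T_E - T_{SE}}{(h_1 + h_2)/2}\Big\} \\
&+ \frac{4}{6}\Big\{\frac{1}{6}\,\frac{2l_1}{l_1 + l_2}\,\frac{v_0 + v_W}{2}\,\frac{T_{NW} - T_{SW}}{h_1 + h_2} + \frac{4}{6}\cdot\frac{1}{4}\Big(3v_0 + \frac{l_1v_W + l_2v_E}{l_1 + l_2}\Big)\frac{T_N - T_S}{h_1 + h_2} + \frac{1}{6}\,\frac{2l_2}{l_1 + l_2}\,\frac{v_0 + v_E}{2}\,\frac{T_{NE} - T_{SE}}{h_1 + h_2}\Big\} \\
&+ \frac{1}{6}\Big\{\frac{1}{6}\,\frac{2l_1}{l_1 + l_2}\,\frac{v_N + v_{NW}}{2}\,\frac{T_{NW} - T_W}{(h_1 + h_2)/2} + \frac{4}{6}\cdot\frac{1}{4}\Big(3v_N + \frac{l_1v_{NW} + l_2v_{NE}}{l_1 + l_2}\Big)\frac{T_N - T_0}{(h_1 + h_2)/2} + \frac{1}{6}\,\frac{2l_2}{l_1 + l_2}\,\frac{v_N + v_{NE}}{2}\,\frac{T_{NE} - T_E}{(h_1 + h_2)/2}\Big\} \\
&+ \frac{\kappa}{6}\,\frac{2}{l_1 + l_2}\Big\{\frac{2h_1}{h_1 + h_2}\Big(\frac{T_S - T_{SW}}{l_1} - \frac{T_{SE} - T_S}{l_2}\Big) + 4\Big(\frac{T_0 - T_W}{l_1} - \frac{T_E - T_0}{l_2}\Big) + \frac{2h_2}{h_1 + h_2}\Big(\frac{T_N - T_{NW}}{l_1} - \frac{T_{NE} - T_N}{l_2}\Big)\Big\} \\
&+ \frac{\kappa}{6}\,\frac{2}{h_1 + h_2}\Big\{\frac{2l_1}{l_1 + l_2}\Big(\frac{T_W - T_{SW}}{h_1} - \frac{T_{NW} - T_W}{h_2}\Big) + 4\Big(\frac{T_0 - T_S}{h_1} - \frac{T_N - T_0}{h_2}\Big) + \frac{2l_2}{l_1 + l_2}\Big(\frac{T_E - T_{SE}}{h_1} - \frac{T_{NE} - T_E}{h_2}\Big)\Big\} \\
&= \frac{1}{9}\Big[\frac{A_{SW}}{A_T}S_{SW} + \frac{A_{NW}}{A_T}S_{NW} + \frac{A_{NE}}{A_T}S_{NE} + \frac{A_{SE}}{A_T}S_{SE} + 2\Big(\frac{A_{NE} + A_{NW}}{A_T}S_N + \frac{A_{SE} + A_{SW}}{A_T}S_S + \frac{A_{SE} + A_{NE}}{A_T}S_E + \frac{A_{SW} + A_{NW}}{A_T}S_W\Big) + 4S_0\Big].
\end{aligned} \tag{2.3-23} \]

Interesting, isn't it? The finite element method displayed. (Palatable, maybe; but still not enticing!) The following remarks seem appropriate:

1. All terms are consistently weighted according to element size, a point especially noteworthy, both now and later when we address the subject of 'mass lumping.'

2. 'Superimposed' on the element size weighting is the characteristic averaging associated with linear basis functions; namely, $(1\ \ 4\ \ 1)/6$.

3. The advection approximation is composed of one-sixth upwind, two-thirds centered, and one-sixth downwind differences. It also employs an interesting combination of averaged coefficients.

4. The diffusion term (as in 1D) is in the form $-\nabla\cdot\mathbf{q}$ where $\mathbf{q} = -\kappa\nabla T$.

5. It is trivial to reduce the advection terms to their constant velocity counterparts.

6. The GFEM treatment of products, here in the term $\mathbf{u}\cdot\nabla T$, is complicated and costly in 'operation counts'; its cost-effectiveness for advection is still an open issue, especially when 'wiggles' are generated in such a costly way, a subject we will later return to, briefly, in Section 2.5.4.

7. GFEM generates many averages of averages (usually weighted) on a uniform rectangular mesh, wherein the extra effort (of averaging) may not always be worth the extra work; it may be the case that it is on general, distorted isoparametric element meshes for which the extra effort really pays off, i.e., for complex geometry.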
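8. If side SE-E-NE were a boundary segment with time-varying Dirichlet data supplied, the nodal values of $T_{SE}$, $T_E$, and $T_{NE}$, as well as their time derivatives, would ultimately wind up on the RHS as given data.

For the record, and for completeness, we present below the uniform mesh ($l_1 = l_2 = l$, $h_1 = h_2 = h$) version of the above AD equation:

\[ \begin{aligned}
&\frac{1}{36}\left[(\dot T_{SW} + \dot T_{NW} + \dot T_{NE} + \dot T_{SE}) + 4(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + 16\dot T_0\right] \\
&+ \frac{1}{6}\Big\{\frac{1}{6}\,\frac{u_W + u_{SW}}{2}\,\frac{T_S - T_{SW}}{l} + \frac{2}{3}\cdot\frac{1}{4}\Big(3u_W + \frac{u_{SW} + u_{NW}}{2}\Big)\frac{T_0 - T_W}{l} + \frac{1}{6}\,\frac{u_W + u_{NW}}{2}\,\frac{T_N - T_{NW}}{l}\Big\} \\
&+ \frac{2}{3}\Big\{\frac{1}{6}\,\frac{u_0 + u_S}{2}\,\frac{T_{SE} - T_{SW}}{2l} + \frac{2}{3}\cdot\frac{1}{4}\Big(3u_0 + \frac{u_S + u_N}{2}\Big)\frac{T_E - T_W}{2l} + \frac{1}{6}\,\frac{u_0 + u_N}{2}\,\frac{T_{NE} - T_{NW}}{2l}\Big\} \\
&+ \frac{1}{6}\Big\{\frac{1}{6}\,\frac{u_E + u_{SE}}{2}\,\frac{T_{SE} - T_S}{l} + \frac{2}{3}\cdot\frac{1}{4}\Big(3u_E + \frac{u_{SE} + u_{NE}}{2}\Big)\frac{T_E - T_0}{l} + \frac{1}{6}\,\frac{u_E + u_{NE}}{2}\,\frac{T_{NE} - T_N}{l}\Big\} \\
&+ \frac{1}{6}\Big\{\frac{1}{6}\,\frac{v_S + v_{SW}}{2}\,\frac{T_W - T_{SW}}{h} + \frac{2}{3}\cdot\frac{1}{4}\Big(3v_S + \frac{v_{SW} + v_{SE}}{2}\Big)\frac{T_0 - T_S}{h} + \frac{1}{6}\,\frac{v_S + v_{SE}}{2}\,\frac{T_E - T_{SE}}{h}\Big\} \\
&+ \frac{2}{3}\Big\{\frac{1}{6}\,\frac{v_0 + v_W}{2}\,\frac{T_{NW} - T_{SW}}{2h} + \frac{2}{3}\cdot\frac{1}{4}\Big(3v_0 + \frac{v_W + v_E}{2}\Big)\frac{T_N - T_S}{2h} + \frac{1}{6}\,\frac{v_0 + v_E}{2}\,\frac{T_{NE} - T_{SE}}{2h}\Big\} \\
&+ \frac{1}{6}\Big\{\frac{1}{6}\,\frac{v_N + v_{NW}}{2}\,\frac{T_{NW} - T_W}{h} + \frac{2}{3}\cdot\frac{1}{4}\Big(3v_N + \frac{v_{NW} + v_{NE}}{2}\Big)\frac{T_N - T_0}{h} + \frac{1}{6}\,\frac{v_N + v_{NE}}{2}\,\frac{T_{NE} - T_E}{h}\Big\} \\
&= \frac{\kappa}{6}\Big[\frac{T_{SW} - 2T_S + T_{SE}}{l^2} + 4\,\frac{T_W - 2T_0 + T_E}{l^2} + \frac{T_{NW} - 2T_N + T_{NE}}{l^2} + \frac{T_{SW} - 2T_W + T_{NW}}{h^2} + 4\,\frac{T_S - 2T_0 + T_N}{h^2} + \frac{T_{SE} - 2T_E + T_{NE}}{h^2}\Big] \\
&\quad + \frac{1}{36}\left[(S_{SW} + S_{NW} + S_{NE} + S_{SE}) + 4(S_N + S_S + S_E + S_W) + 16S_0\right],
\end{aligned} \tag{2.3-24} \]

which seems to warrant no further comment, except that the simple case of constant velocity is easily obtainable and gives, not surprisingly,

\[ u\frac{\partial T}{\partial x} \approx \frac{u}{6}\left(\frac{T_{SE} - T_{SW}}{2l} + 4\,\frac{T_E - T_W}{2l} + \frac{T_{NE} - T_{NW}}{2l}\right), \]

with an obvious analogous term for $v\,\partial T/\partial y$.

For the special case of constant velocity on a uniform rectangular mesh, it is interesting to note that the 2D equation can be generated using the outer products (tensor products; see, for example, Wait and Mitchell, 1985) of the corresponding 1D 'operators', because the 2D basis functions can be written as the (tensor) product of the corresponding 1D basis functions; $\phi_i(x, y) = \psi_i(x)\cdot\psi_i(y)$, where $\psi_i(x) = (1 \pm x/l)$, etc. This permits the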
separation of each 2D integration into the product of two 1D integrations. The result is a 'speedy' way to go from a 1D 'stencil' to the corresponding 2D version. Example: the 1D mass matrix contains the operator $l\,(1\ \ 4\ \ 1)/6$ in the $x$-direction and $h\,(1\ \ 4\ \ 1)/6$ in $y$. The outer product of these operators is

\[ \frac{h}{6}\begin{pmatrix}1\\4\\1\end{pmatrix}\otimes\frac{l}{6}\begin{pmatrix}1 & 4 & 1\end{pmatrix} = \frac{lh}{36}\begin{pmatrix}1 & 4 & 1\\ 4 & 16 & 4\\ 1 & 4 & 1\end{pmatrix}, \]

which is just the 2D stencil for $M$ presented above. Similarly, the term $u\,\partial T/\partial x$ in 2D can be generated from the outer product of the 1D mass operator in $y$ with the 1D advection operator in $x$, thus

\[ \frac{h}{6}\begin{pmatrix}1\\4\\1\end{pmatrix}\otimes\frac{u}{2}\begin{pmatrix}-1 & 0 & 1\end{pmatrix} = \frac{uh}{12}\begin{pmatrix}-1 & 0 & 1\\ -4 & 0 & 4\\ -1 & 0 & 1\end{pmatrix}, \]

which is clearly the appropriate 2D representation. For further details on these matters, see Wait and Mitchell (1985) or, especially, Fletcher (1991), who shows how to use it on non-uniform rectangular meshes, although still with constant coefficients. For general, isoparametric meshes with variable velocity, however, the simple tensor-product formulation does not work; numerical integration à la 'conventional' FEM construction of the assembled equations is the only way.

Digression on Mass Lumping: Since we will later spend a fair amount of time discussing lumped mass approximations to GFEM, it may be useful and appropriate here to show one form of the relationship between lumped and consistent mass for bilinear elements. If we had used (2.2-28) to define the mass matrix, $M \to M_L$, and the time-derivative terms in (2.3-24) would be much simpler (uncoupled): all time derivatives are replaced by $\dot T_0$. We now wish to examine the difference between $M$ and $M_L$ when 'operating' on a smooth function, say $F(x, y)$, evaluated at the nodes and denoted by the vector $f$.
We easily obtain, for a mesh of $l \times h$ elements, from (2.3-24):

$$
(Mf - M_Lf)|_0 = \frac{lh}{36}\left[(f_{SW}+f_{NW}+f_{NE}+f_{SE}) + 4(f_N+f_S+f_E+f_W) - 20f_0\right], \tag{2.3-25}
$$

which, if subjected to Taylor series analysis about node 0, in the form $F(x+\delta, y+\epsilon) = F(x,y) + \sum_{n=1}^{\infty}\frac{1}{n!}\left(\delta\frac{\partial}{\partial x} + \epsilon\frac{\partial}{\partial y}\right)^n F(x,y)$, yields the following:

$$
(M - M_L)f|_0 = \frac{lh}{6}\left(l^2F_{xx} + h^2F_{yy}\right)|_0 + lh\cdot O(l^4, h^4), \tag{2.3-26}
$$

showing that the 'difference' between $M$ and $M_L$ shrinks with mesh refinement. Also, multiplication by $M_L^{-1}$ and rearranging gives [see (2.2-30) through (2.2-32)],

$$
M_L^{-1}Mf|_0 = F(x_0, y_0) + \tfrac{1}{6}\left(l^2F_{xx} + h^2F_{yy}\right)|_0 + O(l^4, h^4). \tag{2.3-27}
$$

For a fine enough mesh, CM and LM are equivalent; for the finite meshes that we are forced to use in practice, however, they are not, and the effect of the difference can be very significant, as we will later demonstrate.
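As a quick numerical cross-check of (2.3-25) and (2.3-26), the following minimal Python sketch builds the consistent-mass stencil (divided by $lh$) as the outer product of two 1D $(1\;4\;1)/6$ operators, subtracts the lumped stencil, and applies the difference to a quadratic test function, for which the leading Taylor term $(l^2F_{xx} + h^2F_{yy})/6$ is exact. The function names, the test function, and the mesh sizes are illustrative assumptions, not part of the text.

```python
from fractions import Fraction as Fr

# 1D consistent-mass weights (1 4 1)/6; the element length is factored out.
MASS_1D = [Fr(1, 6), Fr(4, 6), Fr(1, 6)]

def outer(col, row):
    """Outer (tensor) product of two 1D stencils."""
    return [[c * r for r in row] for c in col]

# Consistent and lumped 2D mass stencils, both divided by l*h.
M = outer(MASS_1D, MASS_1D)                      # (1/36) [1 4 1; 4 16 4; 1 4 1]
ML = [[Fr(0)] * 3, [Fr(0), Fr(1), Fr(0)], [Fr(0)] * 3]

def mass_difference(F, x0, y0, l, h):
    """Return ((M - ML) f)|_0 / (l*h) for nodal samples of F around (x0, y0)."""
    total = Fr(0)
    for j, dy in enumerate((-h, 0, h)):          # rows S, 0, N
        for i, dx in enumerate((-l, 0, l)):      # columns W, 0, E
            total += (M[j][i] - ML[j][i]) * F(x0 + dx, y0 + dy)
    return total

def F(x, y):
    """Hypothetical smooth test function with F_xx = 2 and F_yy = 6."""
    return x * x + 3 * y * y + x * y + x + 2

l, h = Fr(1, 2), Fr(1, 4)
computed = mass_difference(F, Fr(0), Fr(0), l, h)
predicted = (l * l * 2 + h * h * 6) / 6          # (l^2 F_xx + h^2 F_yy) / 6
print(computed, predicted)
```

With exact rational arithmetic the two printed values coincide (both are 7/48 for the assumed data); for a non-polynomial $F$ they would differ by the higher-order remainder in (2.3-26).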
End Digression

The results above are for the advective form ($\beta = 0$). The conservative form ($\beta = 1$) is interesting in that the mix of upwind, centered, and downwind differences is replaced by a mix of pointwise flux terms, $uT$, all in a centered difference format (a la finite volume methods, before upwinding is invoked). It is sufficient to focus on just the x-portion, as the rest follows from 'symmetry.' Calling then $N^c_x(uT)|_i = \int \phi_i\,\partial(u^hT^h)/\partial x$, the conservative form of x-advection yields, for the same 4-patch:

$$
\begin{aligned}
72\,N^c_x(uT)|_0 = {}& h_1\{3[(u_0+u_E)(T_0+T_E) - (u_0+u_W)(T_0+T_W)] \\
&\quad+ 2[(u_E+u_{SE})(T_E+T_{SE}) - (u_W+u_{SW})(T_W+T_{SW}) - (u_{SE}T_{SE} - u_{SW}T_{SW})] \\
&\quad+ (u_S+u_{SE})(T_S+T_{SE}) - (u_S+u_{SW})(T_S+T_{SW}) \\
&\quad+ (u_0+u_{SE})(T_0+T_{SE}) - (u_0+u_{SW})(T_0+T_{SW}) \\
&\quad+ (u_S+u_E)(T_S+T_E) - (u_S+u_W)(T_S+T_W)\} \\
&+ h_2\{3[(u_0+u_E)(T_0+T_E) - (u_0+u_W)(T_0+T_W)] \\
&\quad+ 2[(u_E+u_{NE})(T_E+T_{NE}) - (u_W+u_{NW})(T_W+T_{NW}) - (u_{NE}T_{NE} - u_{NW}T_{NW})] \\
&\quad+ (u_N+u_{NE})(T_N+T_{NE}) - (u_N+u_{NW})(T_N+T_{NW}) \\
&\quad+ (u_0+u_{NE})(T_0+T_{NE}) - (u_0+u_{NW})(T_0+T_{NW}) \\
&\quad+ (u_N+u_E)(T_N+T_E) - (u_N+u_W)(T_N+T_W)\}.
\end{aligned}
$$

If we multiply this by $M_L^{-1} = 4/[(l_1+l_2)(h_1+h_2)]$ and judiciously rearrange the result, we obtain the FDM form of $\partial(uT)/\partial x$; i.e.,

$$
\begin{aligned}
M_L^{-1}N^c_x(uT)|_0 = {}& \frac{1}{9}\,\frac{2h_1}{h_1+h_2}\,\frac{1}{l_1+l_2}\Bigg\{2\left[\frac{u_E+u_{SE}}{2}\frac{T_E+T_{SE}}{2} - \frac{u_W+u_{SW}}{2}\frac{T_W+T_{SW}}{2}\right] - \frac{1}{2}\left(u_{SE}T_{SE} - u_{SW}T_{SW}\right) \\
&\quad+ \left[\frac{u_S+u_{SE}}{2}\frac{T_S+T_{SE}}{2} - \frac{u_S+u_{SW}}{2}\frac{T_S+T_{SW}}{2}\right] + \left[\frac{u_0+u_{SE}}{2}\frac{T_0+T_{SE}}{2} - \frac{u_0+u_{SW}}{2}\frac{T_0+T_{SW}}{2}\right] \\
&\quad+ \left[\frac{u_S+u_E}{2}\frac{T_S+T_E}{2} - \frac{u_S+u_W}{2}\frac{T_S+T_W}{2}\right]\Bigg\} \\
&+ \frac{2}{3}\,\frac{1}{l_1+l_2}\left[\frac{u_0+u_E}{2}\frac{T_0+T_E}{2} - \frac{u_0+u_W}{2}\frac{T_0+T_W}{2}\right] \\
&+ \frac{1}{9}\,\frac{2h_2}{h_1+h_2}\,\frac{1}{l_1+l_2}\Bigg\{2\left[\frac{u_E+u_{NE}}{2}\frac{T_E+T_{NE}}{2} - \frac{u_W+u_{NW}}{2}\frac{T_W+T_{NW}}{2}\right] - \frac{1}{2}\left(u_{NE}T_{NE} - u_{NW}T_{NW}\right) \\
&\quad+ \left[\frac{u_N+u_{NE}}{2}\frac{T_N+T_{NE}}{2} - \frac{u_N+u_{NW}}{2}\frac{T_N+T_{NW}}{2}\right] + \left[\frac{u_0+u_{NE}}{2}\frac{T_0+T_{NE}}{2} - \frac{u_0+u_{NW}}{2}\frac{T_0+T_{NW}}{2}\right] \\
&\quad+ \left[\frac{u_N+u_E}{2}\frac{T_N+T_E}{2} - \frac{u_N+u_W}{2}\frac{T_N+T_W}{2}\right]\Bigg\},
\end{aligned}
\tag{2.3-28}
$$

which is rather easily interpreted: the $2h_1/(h_1+h_2)$ terms contribute a net $\tfrac{1}{3}\partial(uT)/\partial x$ from the southern nodes, the middle term is clearly $\tfrac{1}{3}\partial(uT)/\partial x$ from the central nodes, and the remainder, the $2h_2/(h_1+h_2)$ terms, contribute the remaining $\tfrac{1}{3}\partial(uT)/\partial x$, from the northern nodes.

Final 4-Patch Remarks: (1) The conservation form results displayed as above involved a non-trivial amount of manipulative algebra, but the results are felt to be worthwhile and instructive, especially because we will later compare them with those generated by the control volume FEM. (2) It is easy to see how the conservation form will conserve $T$ via addition of the nodal equations: all interior contributions will cancel. (3) Taylor series analysis of these equations might be fruitful, but this is doubtful. (4) The quadratically conserving (skew-symmetric) form, $N^s(\mathbf{u})T|_0$, obtainable either by averaging the advective and divergence forms above or by forming the skew-symmetric part of the advective-form matrix, a la Section 2.2.4, is:

$$
\begin{aligned}
72\,N^s(\mathbf{u})T|_0 = \tfrac{3}{2}\{& h_1[(u_0+u_S+u_E+u_{SE})T_{SE} - (u_0+u_S+u_W+u_{SW})T_{SW}] \\
&+ [3(h_1+h_2)(u_0+u_E) + h_1(u_S+u_{SE}) + h_2(u_N+u_{NE})]\,T_E \\
&- [3(h_1+h_2)(u_0+u_W) + h_1(u_S+u_{SW}) + h_2(u_N+u_{NW})]\,T_W \\
&+ h_2[(u_0+u_N+u_E+u_{NE})T_{NE} - (u_0+u_N+u_W+u_{NW})T_{NW}]\},
\end{aligned}
$$

in which we note that the coefficients of $T_S$, $T_0$, and $T_N$ are all zero. (The diagonal entries of any skew-symmetric matrix are zero.)

b. A boundary 2-patch

Suppose the N-O-S side of the 4-patch shown above is an outflow plane, and the two right-most elements are (of course) omitted. It is of interest (and not difficult, having completed the 4-patch analysis) to determine the ODE corresponding to node 0 for this case, when the BC given by (2.1-3) is employed.
It is, from (2.2-2) term-by-term, in the FDM form [here, $M_L = l_1(h_1 + h_2)/4$]:
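$$
\begin{aligned}
&\frac{1}{9}\left[\frac{h_1}{h_1+h_2}(\dot T_{SW}+2\dot T_S) + \frac{h_2}{h_1+h_2}(\dot T_{NW}+2\dot T_N) + 2(\dot T_W+2\dot T_0)\right] \\
&\quad+ \frac{1}{6}\,\frac{2h_1}{h_1+h_2}\,\frac{u_{SW}+u_W+2u_0+2u_S}{6}\,\frac{T_S-T_{SW}}{l_1} + \frac{1}{6}\,\frac{2h_2}{h_1+h_2}\,\frac{u_W+u_{NW}+2u_0+2u_N}{6}\,\frac{T_N-T_{NW}}{l_1} \\
&\quad+ \left[\frac{1}{12}\,\frac{2h_1}{h_1+h_2}\,\frac{2u_S+u_{SW}}{3} + \frac{1}{2}\,\frac{2u_0+u_W}{3} + \frac{1}{12}\,\frac{2h_2}{h_1+h_2}\,\frac{2u_N+u_{NW}}{3}\right]\frac{T_0-T_W}{l_1} \\
&\quad+ \frac{2}{3}\left[\frac{v_{SW}+3v_S+6v_0+2v_W}{12}\,\frac{T_0-T_S}{h_1+h_2} + \frac{2v_W+6v_0+3v_N+v_{NW}}{12}\,\frac{T_N-T_0}{h_1+h_2}\right] \\
&\quad+ \frac{1}{3}\left[\frac{v_{SW}+v_S+2v_0+2v_W}{6}\,\frac{T_W-T_{SW}}{h_1+h_2} + \frac{2v_W+2v_0+v_N+v_{NW}}{6}\,\frac{T_{NW}-T_W}{h_1+h_2}\right] \\
&\quad+ \frac{2}{l_1}\,\frac{K}{6}\left[\frac{2h_1}{h_1+h_2}\,\frac{T_S-T_{SW}}{l_1} + 4\,\frac{T_0-T_W}{l_1} + \frac{2h_2}{h_1+h_2}\,\frac{T_N-T_{NW}}{l_1}\right] \\
&\quad- \frac{K}{3}\,\frac{2}{h_1+h_2}\left[2\left(\frac{T_N-T_0}{h_2} - \frac{T_0-T_S}{h_1}\right) + \left(\frac{T_{NW}-T_W}{h_2} - \frac{T_W-T_{SW}}{h_1}\right)\right] \\
&\quad+ \frac{2}{l_1}\,\frac{H}{6}\left[\frac{2h_1}{h_1+h_2}\,T_S + 4T_0 + \frac{2h_2}{h_1+h_2}\,T_N\right] \\
&= \frac{1}{9}\left[\frac{h_1}{h_1+h_2}(S_{SW}+2S_S) + 2(S_W+2S_0) + \frac{h_2}{h_1+h_2}(S_{NW}+2S_N)\right] \\
&\quad+ \frac{2}{l_1}\,\frac{1}{6}\left[\frac{2h_1}{h_1+h_2}(q_S+H\tilde T_S) + 4(q_0+H\tilde T_0) + \frac{2h_2}{h_1+h_2}(q_N+H\tilde T_N)\right],
\end{aligned}
\tag{2.3-29}
$$

where we have allowed both $q$ and $\tilde T$ to vary along $\Gamma_N$ and have interpolated them via the basis functions. While most terms in this equation permit obvious identification with their continuous counterparts, there are three that do not quite; namely, those with a coefficient of $2/l_1$, and this is a key observation that will show how the ODE at node 0 is actually also the Robin BC given by (2.1-3). So let us first multiply the equation by $l_1/2$ and rearrange:

$$
\begin{aligned}
\frac{l_1}{2}\Bigg\{&\frac{1}{18}\left[\frac{2h_1}{h_1+h_2}(\dot T_{SW}+2\dot T_S-S_{SW}-2S_S) + 4(\dot T_W+2\dot T_0-S_W-2S_0) + \frac{2h_2}{h_1+h_2}(\dot T_{NW}+2\dot T_N-S_{NW}-2S_N)\right] \\
&+ \frac{1}{6}\,\frac{2h_1}{h_1+h_2}\,\frac{u_{SW}+u_W+2(u_0+u_S)}{6}\,\frac{T_S-T_{SW}}{l_1} + \frac{1}{6}\,\frac{2h_2}{h_1+h_2}\,\frac{u_W+u_{NW}+2(u_0+u_N)}{6}\,\frac{T_N-T_{NW}}{l_1} \\
&+ \left[\frac{1}{12}\,\frac{2h_1}{h_1+h_2}\,\frac{2u_S+u_{SW}}{3} + \frac{1}{2}\,\frac{2u_0+u_W}{3} + \frac{1}{12}\,\frac{2h_2}{h_1+h_2}\,\frac{2u_N+u_{NW}}{3}\right]\frac{T_0-T_W}{l_1} \\
&+ \frac{2}{3}\left[\frac{v_{SW}+3v_S+6v_0+2v_W}{12}\,\frac{T_0-T_S}{h_1+h_2} + \frac{2v_W+6v_0+3v_N+v_{NW}}{12}\,\frac{T_N-T_0}{h_1+h_2}\right] \\
&+ \frac{1}{3}\left[\frac{v_{SW}+v_S+2v_0+2v_W}{6}\,\frac{T_W-T_{SW}}{h_1+h_2} + \frac{2v_W+2v_0+v_N+v_{NW}}{6}\,\frac{T_{NW}-T_W}{h_1+h_2}\right] \\
&- \frac{K}{3}\,\frac{2}{h_1+h_2}\left[2\left(\frac{T_N-T_0}{h_2} - \frac{T_0-T_S}{h_1}\right) + \left(\frac{T_{NW}-T_W}{h_2} - \frac{T_W-T_{SW}}{h_1}\right)\right]\Bigg\} \\
&+ \frac{K}{6}\left[\frac{2h_1}{h_1+h_2}\,\frac{T_S-T_{SW}}{l_1} + 4\,\frac{T_0-T_W}{l_1} + \frac{2h_2}{h_1+h_2}\,\frac{T_N-T_{NW}}{l_1}\right] \\
&+ \frac{H}{6}\left[\frac{2h_1}{h_1+h_2}(T_S-\tilde T_S) + 4(T_0-\tilde T_0) + \frac{2h_2}{h_1+h_2}(T_N-\tilde T_N)\right] \\
&= \frac{1}{6}\left(\frac{2h_1}{h_1+h_2}\,q_S + 4q_0 + \frac{2h_2}{h_1+h_2}\,q_N\right).
\end{aligned}
\tag{2.3-30}
$$

Note that the operation of multiplying by $l_1/2$ is equivalent to realizing that it is $M^B_L = (h_1 + h_2)/2$, a boundary (lumped) mass matrix, which is the appropriate 'de-scrambler' rather than the 'area' bulk/mass matrix used initially. While this is indeed an ODE describing the behavior of $T_0$ and is indeed the equation 'solved' by the computer, it is also the Robin BC (2.1-3), at least approximately. For further clarity, we first simplify the equation to the constant coefficient case on a uniform mesh:

$$
\begin{aligned}
\frac{l}{2}\Bigg\{&\frac{1}{9}\left[\frac{\dot T_{SW}+\dot T_{NW}}{2} + (\dot T_S+\dot T_N) + 2(\dot T_W+2\dot T_0)\right] - S \\
&+ \frac{u}{6}\left[\frac{T_S-T_{SW}}{l} + 4\,\frac{T_0-T_W}{l} + \frac{T_N-T_{NW}}{l}\right] + \frac{v}{3}\left[\frac{T_{NW}-T_{SW}}{2h} + 2\,\frac{T_N-T_S}{2h}\right] \\
&- \frac{K}{3}\left[\frac{T_{NW}-2T_W+T_{SW}}{h^2} + 2\,\frac{T_N-2T_0+T_S}{h^2}\right]\Bigg\} \\
&+ \frac{K}{6}\left[\frac{T_S-T_{SW}}{l} + 4\,\frac{T_0-T_W}{l} + \frac{T_N-T_{NW}}{l}\right] + H\left[\frac{T_S+4T_0+T_N}{6} - \tilde T\right] = q,
\end{aligned}
\tag{2.3-31}
$$

which represents a sort of (time-dependent) energy balance in the vicinity of node 0 on a finite mesh, a balance that includes the specified (and time-independent) Robin BC. As the mesh is refined, the contribution from the terms in curly brackets tends to vanish because these terms are multiplied by $l$, which $\to 0$, and all that remains is the BC on $\Gamma_N$. {The ostensibly 'missing' term, $K\,\partial^2T/\partial x^2$ in the curly brackets terms, can be found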
by a Taylor series analysis of the diffusive flux term in x; namely,

$$
\frac{K}{6}\left[\frac{T_S-T_{SW}}{l} + 4\,\frac{T_0-T_W}{l} + \frac{T_N-T_{NW}}{l}\right] = K\left[\frac{\partial T}{\partial x} - \frac{l}{2}\frac{\partial^2T}{\partial x^2}\right]_0 + O(l^2, h^2).\}
$$

From the ODE viewpoint, $l \to 0$ means that the time constant for node 0 is also approaching zero, and that the ODE therefore approaches a quasi-steady-state condition, which is another way to say that the Robin BC is being enforced. Returning briefly to Taylor series one more time, the insertion of the exact solution into (2.3-31) gives, after Taylor series expansion about node 0,

$$
\frac{l}{2}\left\{\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} + v\frac{\partial T}{\partial y} - K\nabla^2T - S + O(l)\right\}_0 + K\frac{\partial T}{\partial x} + H(T-\tilde T) = q + O(h^2);
$$

i.e., the ODE at the outflow yields a second-order accurate (a la Taylor series) approximation to the Robin BC, (2.1-3), as in 1D. Finally, we remark that a variable-grid version of the Taylor series analysis would, for both the boundary 2-patch and the interior 4-patch, yield only first-order accuracy, a misleading result since the prevailing global theory proves second-order accuracy (still) for the bilinear element.

c. A boundary corner

To finish, we consider the case involving a corner in the domain; for the sketch in Figure 2.3-4, let W-O-S lie on the boundary of the domain. We will first present the equation for node 0 under the condition that the Robin BC applies on both W-0 and 0-S. Referring to the element matrices again, Row 3 gives the final result; i.e., (2.2-2) for this case is:

$$
\begin{aligned}
&\frac{lh}{36}\left(4\dot T_0 + 2\dot T_W + 2\dot T_S + \dot T_{SW}\right) \\
&\quad+ \frac{h}{72}\left[(6u_0+3u_W+2u_S+u_{SW})(T_0-T_W) + (2u_0+2u_S+u_W+u_{SW})(T_S-T_{SW})\right] \\
&\quad+ \frac{l}{72}\left[(6v_0+3v_S+2v_W+v_{SW})(T_0-T_S) + (2v_0+2v_W+v_S+v_{SW})(T_W-T_{SW})\right] \\
&\quad+ K\frac{h}{6l}\left[2(T_0-T_W) + (T_S-T_{SW})\right] + K\frac{l}{6h}\left[2(T_0-T_S) + (T_W-T_{SW})\right] \\
&\quad+ \frac{H}{6}\left[l(T_W+2T_0) + h(2T_0+T_S)\right] \\
&= \frac{lh}{36}\left(4S_0+2S_W+2S_S+S_{SW}\right) + \frac{1}{6}\left\{l\left[(q_W+H\tilde T_W) + 2(q_0+H\tilde T_0)\right] + h\left[2(q_0+H\tilde T_0) + (q_S+H\tilde T_S)\right]\right\}.
\end{aligned}
\tag{2.3-32}
$$

Fig. 2.3-4 A corner bilinear element.

Here, the appropriate 'de-averaging' mass matrix is the boundary integral $M^B_L = \int_\Gamma \phi_0 = (l + h)/2$; division by $M^B_L$ and rearranging then yields the FDM form,

$$
\begin{aligned}
&\frac{lh}{2(l+h)}\left[\frac{1}{9}\left(4\dot T_0 + 2\dot T_W + 2\dot T_S + \dot T_{SW}\right)\right] \\
&\quad+ \frac{lh}{6(l+h)}\Bigg[2\,\frac{6u_0+3u_W+2u_S+u_{SW}}{12}\,\frac{T_0-T_W}{l} + \frac{2u_0+2u_S+u_W+u_{SW}}{6}\,\frac{T_S-T_{SW}}{l} \\
&\qquad+ 2\,\frac{6v_0+3v_S+2v_W+v_{SW}}{12}\,\frac{T_0-T_S}{h} + \frac{2v_0+2v_W+v_S+v_{SW}}{6}\,\frac{T_W-T_{SW}}{h}\Bigg] \\
&\quad+ \frac{K}{3(l+h)}\left[h\left(2\,\frac{T_0-T_W}{l} + \frac{T_S-T_{SW}}{l}\right) + l\left(2\,\frac{T_0-T_S}{h} + \frac{T_W-T_{SW}}{h}\right)\right] \\
&\quad+ \frac{H}{6}\left[\frac{2l}{l+h}(T_W-\tilde T_W) + 4(T_0-\tilde T_0) + \frac{2h}{l+h}(T_S-\tilde T_S)\right] \\
&= \frac{lh}{2(l+h)}\left[\frac{1}{9}\left(4S_0+2S_W+2S_S+S_{SW}\right)\right] + \frac{1}{6}\left(\frac{2l}{l+h}\,q_W + 4q_0 + \frac{2h}{l+h}\,q_S\right),
\end{aligned}
\tag{2.3-33}
$$

as the ODE for node 0: another interesting energy-balance-plus BC. As usual, the only surviving terms upon mesh refinement are those from the Robin BC; in fact, the asymptotic version of (2.3-33) is

$$
K\left(\frac{h}{l+h}\frac{\partial T}{\partial x} + \frac{l}{l+h}\frac{\partial T}{\partial y}\right) + H(T-\tilde T) = q + O(l, h),
$$

which might seem to suggest a 'heat balance' definition of a consistent (and unique) normal direction: $\mathbf{n}\cdot\nabla T = n_x(\partial T/\partial x) + n_y(\partial T/\partial y)$. But this implies $n_x = h/(l+h)$ and $n_y = l/(l+h)$, which does not satisfy the requirement that $n_x^2 + n_y^2 = 1$. A better interpretation of this result can be obtained via multiplication by $(l + h)$ and regrouping:

$$
h\left[K\frac{\partial T}{\partial x} + H(T-\tilde T) - q\right] + l\left[K\frac{\partial T}{\partial y} + H(T-\tilde T) - q\right] = O(l^2, h^2);
$$
i.e., it leads to a linear combination of the BC (energy balance) in the x-direction and that in the y-direction. Or, perhaps more accurately, it is a linear combination of the residuals of the BC equations.

There is one more 'corner scene' that we wish to examine, because the results are interesting. Suppose the Robin BC is applied to only one of the two surfaces (say 0-S) and Dirichlet data on the other. In this case, $T_0$ is given, and the GFEM equation at node 0 is an equation for the heat flux there. It is obtained from (2.2-16) rather than from (2.2-2), and all of the nodal temperatures are presumed to be known. [Recall that these come from (2.2-2), in which there is no equation at node 0.] The result is (for $\beta = 0$), after dividing by $M^B_L = \int_\Gamma \phi_i = \int_W^0 \phi_0\,dx = l/2$,

$$
\begin{aligned}
&\frac{h}{2}\Bigg\{\frac{1}{9}\left(4\dot T_0 + 2\dot T_W + 2\dot T_S + \dot T_{SW}\right) \\
&\qquad+ \frac{1}{3}\left[2\,\frac{6u_0+3u_W+2u_S+u_{SW}}{12}\,\frac{T_0-T_W}{l} + \frac{2u_0+2u_S+u_W+u_{SW}}{6}\,\frac{T_S-T_{SW}}{l}\right] \\
&\qquad+ \frac{1}{3}\left[2\,\frac{6v_0+3v_S+2v_W+v_{SW}}{12}\,\frac{T_0-T_S}{h} + \frac{2v_0+2v_W+v_S+v_{SW}}{6}\,\frac{T_W-T_{SW}}{h}\right] \\
&\qquad- \frac{1}{9}\left(4S_0+2S_W+2S_S+S_{SW}\right)\Bigg\} \\
&\quad+ \frac{h}{l}\,\frac{K}{3}\left(2\,\frac{T_0-T_W}{l} + \frac{T_S-T_{SW}}{l}\right) + \frac{h}{3l}\,H\left[2(T_0-\tilde T_0) + (T_S-\tilde T_S)\right] \\
&\quad+ \frac{K}{3}\left(2\,\frac{T_0-T_S}{h} + \frac{T_W-T_{SW}}{h}\right) = \frac{h}{3l}\left(2q_0+q_S\right) + \frac{1}{3}\left(2q_{D0}+q_{DW}\right),
\end{aligned}
\tag{2.3-34}
$$

where $q^h_D = \sum q_{Di}\phi_i$ has been utilized to describe the (diffusive, in the limit) heat flux into $\Omega$ through $\Gamma_D$. This is an equation for $q_{D0}$, basically. (See also Chapter 4.) But, as seen several times already, this equation for $q_{D0}$ involves quite a lot of information, all gleaned from a single bilinear element! What does it tell us? In general, it is an accounting of all energy 'flows' in the neighborhood of node 0. In particular, asymptotic analysis leads to

$$
\frac{h}{l}\left[K\frac{\partial T}{\partial x} + H(T-\tilde T) - q\right] + K\frac{\partial T}{\partial y} = q_D + O(l, h),
$$

or

$$
h\left[K\frac{\partial T}{\partial x} + H(T-\tilde T) - q\right] + l\left(K\frac{\partial T}{\partial y} - q_D\right) = O(l^2, lh, h^2),
$$

which has the following interpretation, roughly:

1. If $h \to 0$ with $l$ fixed, then $q_D = K\,\partial T/\partial y + O(l)$; the corner diffusive flux is in the y-direction.
2. If $l \to 0$ with $h$ fixed, then the Robin BC, $K(\partial T/\partial x) + H(T - \tilde T) = q$, is recovered: an x-direction heat balance.
3. If $h/l$ is fixed and both $\to 0$, the most meaningful/sensible case, then it is (again) a linear combination of the horizontal and vertical components of an energy balance in the corner.

d. An internal line heat source

A sometimes useful mathematical model is one that employs a point heat source in 1D, a line heat source in 2D, or a plane heat source in 3D. Here we shall show how, using bilinear elements, a line source may be modeled in the plane (2D). And we shall show two ways to do so, the second of which extends nicely (in the next chapter) to the mathematical insertion of a pump, which makes the pressure jump.

For the first way, we return to Figure 2.3-3 and suppose that we wish to solve the heat equation (transient in general, steady by dropping the time-derivative term) with a line heat source along the column of nodes passing through S, 0, and N. The problem statement is the following:

$$
\frac{\partial T}{\partial t} = K\nabla^2T + Q(y)\,\delta(x - x_0) \quad \text{in } \Omega, \tag{2.3-35}
$$

plus appropriate BC's and IC's, the details of which do not concern us here because we are simply interested in generating a typical internal node's equation. The line source location is $x_0$ (along which line we insist on placing some of our nodes), and $\delta(x)$ is the Dirac delta function. The weak form of (2.3-35) is

$$
\int \phi_i\,Q(y)\,\delta(x - x_0) - \int \phi_i\frac{\partial T}{\partial t} = \int K\nabla\phi_i\cdot\nabla T - \int_\Gamma K\phi_i\frac{\partial T}{\partial n}, \tag{2.3-36}
$$

and we henceforth omit the boundary integral, since we are studying internal nodes only.
In fact, we will only examine the case $\phi_i = \phi_0$ in Figure 2.3-3, and we further simplify to the case of equal element size, the result being, with $T \approx T^h = \sum_j T_j\phi_j$, and lumping the mass for simplicity,

$$
\begin{aligned}
\int_{y_S}^{y_N}\phi_0(x_0,y)\,Q(y)\,dy - lh\,\dot T_0 = {}& \frac{Kh}{6l}\{(T_S-T_{SW}) - (T_{SE}-T_S) + 4[(T_0-T_W) - (T_E-T_0)] + (T_N-T_{NW}) - (T_{NE}-T_N)\} \\
&- \frac{Kl}{6h}\left[(T_{SW}-2T_W+T_{NW}) + 4(T_S-2T_0+T_N) + (T_{SE}-2T_E+T_{NE})\right],
\end{aligned}
\tag{2.3-37}
$$

in which the algebraic form of the x- and y-diffusion terms is purely intentional: the line heat source will cause a jump (discontinuity) in the x-direction heat flux. If we divide by $h$ and rearrange, we obtain

$$
\begin{aligned}
\frac{1}{h}\int_{y_S}^{y_N}\phi_0(x_0,y)\,Q(y)\,dy = {}& \frac{K}{6}\left\{\left[\frac{T_S-T_{SW}}{l} - \frac{T_{SE}-T_S}{l}\right] + 4\left[\frac{T_0-T_W}{l} - \frac{T_E-T_0}{l}\right] + \left[\frac{T_N-T_{NW}}{l} - \frac{T_{NE}-T_N}{l}\right]\right\} \\
&+ l\left\{\dot T_0 - \frac{K}{6}\left[\frac{T_{SW}-2T_W+T_{NW}}{h^2} + 4\,\frac{T_S-2T_0+T_N}{h^2} + \frac{T_{SE}-2T_E+T_{NE}}{h^2}\right]\right\},
\end{aligned}
\tag{2.3-38}
$$
which yields the following result for $l, h \to 0$:

$$
Q(y_0) = K\left[\!\left[\frac{\partial T}{\partial x}\right]\!\right]_{x_0}, \tag{2.3-39}
$$

where $[[\ ]]$ denotes the jump in $\partial T/\partial x$ at $x = x_0$; i.e.,

$$
Q(y_0) = K\frac{\partial T}{\partial x}\bigg|_{x_0^-} - K\frac{\partial T}{\partial x}\bigg|_{x_0^+}. \tag{2.3-40}
$$

Remarks: (1) The line heat source causes a jump in heat flux. (2) Setting $Q(y) = 0$ in (2.3-37) recovers, properly, the 'conventional' GFEM equation, in which both $\partial^2T/\partial x^2$ and $\partial^2T/\partial y^2$ are still present as $l, h \to 0$, converging to (2.3-35) with $Q$ omitted. (3) The 1D case, obtained by taking $Q(y) = Q_0$ and setting $T_N = T_S = T_0$, etc. in (2.3-37), is

$$
Q_0 = K\left[\frac{T_0-T_W}{l} - \frac{T_E-T_0}{l}\right] + l\,\dot T_0, \tag{2.3-41}
$$

which converges to $Q_0 = K[[\partial T/\partial x]]_{x_0}$ and has the following exact solution for the steady-state situation with $T = 0$ at $x = 0$ and $x = L$:

$$
T_j = \begin{cases} \dfrac{N-J}{N}\,j\,l\,Q_0/K & \text{for } j = 0, 1, \dots, J, \\[2mm] \dfrac{J}{N}\,(N-j)\,l\,Q_0/K & \text{for } j = J, J+1, \dots, N, \end{cases} \tag{2.3-42}
$$

where there are $N$ elements and $N + 1$ total nodes ($l = L/N$) and $j = J$ is the location of the (only) point source of heat, $Q_0$. This piecewise linear function is, in fact, the Green's function for $T_{xx}$, and the flux jump at $x = x_J$ is

$$
\Delta q = K\left[\frac{T_J-T_{J-1}}{l} - \frac{T_{J+1}-T_J}{l}\right],
$$

which gives, using (2.3-42), $\Delta q = Q_0$.

The second way to solve the problem is less direct, involves NBC's, and starts by 'neglecting' the line heat source. Suppose we imagine, for a moment, that the two sides of the line S-O-N in Figure 2.3-3 are disjoint and that we apply, separately to each surface, an applied heat flux. The weak form is now (2.3-36) with the heat source term omitted and the boundary integral retained. For the case in which the domain is on the left, we say $q_L(y) = K\,\partial T/\partial x|_\Gamma$ is the (leftward flowing) applied heat flux, and when the domain is on the right, we use $q_R(y) = -K\,\partial T/\partial x|_\Gamma$ as the (rightward flowing) applied heat flux in the boundary integral terms. Thus, the two disjoint equations, from (2.3-36), are

$$
\begin{aligned}
\int_{y_S}^{y_N}\phi_0\,q_L(y)\,dy = {}& \frac{Kh}{6l}\left[(T_S-T_{SW}) + 4(T_0-T_W) + (T_N-T_{NW})\right] \\
&- \frac{Kl}{6h}\left[(T_{SW}-2T_W+T_{NW}) + 2(T_S-2T_0+T_N)\right] + \frac{lh}{2}\dot T_0,
\end{aligned}
\tag{2.3-43}
$$
which, not surprisingly, is like the 2-patch result given by (2.3-31), and

$$
\begin{aligned}
\int_{y_S}^{y_N}\phi_0\,q_R(y)\,dy = {}& \frac{Kh}{6l}\left[(T_S-T_{SE}) + 4(T_0-T_E) + (T_N-T_{NE})\right] \\
&- \frac{Kl}{6h}\left[2(T_S-2T_0+T_N) + (T_{NE}-2T_E+T_{SE})\right] + \frac{lh}{2}\dot T_0.
\end{aligned}
\tag{2.3-44}
$$

Finally, we rejoin the two pieces by summing the two equations to obtain the total applied flux $q_{TOT} = q_L + q_R$. The result is (2.3-37), with $q_{TOT} = Q$; the total applied heat flux is 'equivalent' to a line source of internal generation.

Remarks: (1) The individual heat fluxes (leftward-flowing, $q_L$, and rightward-flowing, $q_R$) can be obtained from (2.3-43) and (2.3-44), respectively, after the temperature has been obtained. This is nothing other than another application of the consistent flux methodology of Chapter 4. (2) If the sum of the fluxes ($q_{TOT}$) is set to zero, the conventional GFEM internal node equation is obtained, modeling/approximating a continuous heat flux. Also in this case, the use of (2.3-43) and (2.3-44) to compute $q_L$ and $q_R$ in a 'postprocessing' mode would give (consistent) flux continuity: $q_L + q_R = 0$ as required. (See Appendix 2 for further discussion of related issues.) (3) If, instead of a specified heat source or a specified total heat flux along the line S-0-N, it is desired to specify the temperature along this same line (an internal Dirichlet BC!), we now see how: simply switch known and unknown variables in (2.3-37); $T_0$ (and $\dot T_0$ if time-dependent) is the given variable, and $Q_0$ from $\int_{y_S}^{y_N}\phi_0\,Q(y)\,dy$ is to be determined, via $Q(y) = Q(y_S)\phi_S + Q(y_0)\phi_0 + Q(y_N)\phi_N$. A specified temperature distribution along a line internal to $\Omega$ can be realized by applying the appropriate heat source distribution along this same line. After the full temperature field is available, (2.3-43) and (2.3-44) may be utilized to see how the specified heat flux is distributed in the two directions, by computing $q_L$ and $q_R$, again via $q_L = q_{LS}\phi_S + q_{L0}\phi_0 + q_{LN}\phi_N$, etc.
[In both cases, i.e., in computing either $Q(y)$ or $q_L(y)$ and $q_R(y)$, one may use (at least for this element) consistent or lumped mass.]

2.3.3 Two Dimensions with Biquadratic Elements

The first 'higher-order' element (and the highest we consider) in more than 1D is too elaborate to make anything but the simplest case 'palatable' when it comes to writing and pondering nodal equations. Thus, we a priori limit ourselves to the advective form on a mesh of uniform rectangles with constant coefficients, and we drop $S$. The fact that there are several different types of nodes to consider more than compensates for the above simplifications. To further streamline the presentation, we show only final results, and those in FDM form only, obtainable by dividing by $\int\phi_i$, which is $lh/9$ for a corner node (in a 4-patch), $2lh/9$ for an edge node (in a 2-patch), and $4lh/9$ for the center node of a 1-patch; surely every reader by this time knows how to assemble equations from element matrices and is thus more interested in the final results than more 'tutorial.'
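The consistent-mass weights of the biquadratic FDM forms in this subsection can be cross-checked with a few lines of Python. The sketch below assumes the standard 1D quadratic (three-node) element mass matrix, $(l/30)[4\;2\;{-1};\ 2\;16\;2;\ {-1}\;2\;4]$, assembles the rows for a vertex node and for a midside node, normalizes each by its row sum (the 1D analogue of dividing by $\int\phi_i$), and forms outer products as permitted for constant coefficients on a uniform rectangular mesh. All function and variable names are hypothetical.

```python
from fractions import Fraction as Fr

# Standard 1D quadratic-element mass matrix, divided by the element length l.
ELEM_MASS = [[Fr(4, 30), Fr(2, 30), Fr(-1, 30)],
             [Fr(2, 30), Fr(16, 30), Fr(2, 30)],
             [Fr(-1, 30), Fr(2, 30), Fr(4, 30)]]

def normalized(row):
    """Divide a stencil row by its row sum (the 1D lumped mass)."""
    total = sum(row)
    return [w / total for w in row]

def vertex_row():
    """Assembled 5-point row for a vertex node shared by two elements."""
    left, right = ELEM_MASS[2], ELEM_MASS[0]
    return normalized([left[0], left[1], left[2] + right[0], right[1], right[2]])

def midside_row():
    """3-point row for the node in the middle of a single element."""
    return normalized(ELEM_MASS[1])

def outer(col, row):
    """Outer (tensor) product: col runs over y, row runs over x."""
    return [[c * r for r in row] for c in col]

corner_stencil = outer(vertex_row(), vertex_row())      # 25 points, 4-patch
edge_stencil = outer(midside_row(), vertex_row())       # 15 points, 2-patch
center_stencil = outer(midside_row(), midside_row())    # 9 points, 1-patch

# Central row (WW, W, 0, E, EE) of the 4-patch stencil, in hundredths.
print([int(100 * w) for w in corner_stencil[2]])        # [-8, 16, 64, 16, -8]
```

The central weights 64/100, 128/200 and 256/400 reproduce the leading time-derivative coefficients of the 4-patch, 2-patch and 1-patch equations, respectively.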
a. An interior 4-patch

For a different presentation of the nine-node (and eight-node) results to be presented below, see Gray and Pinder (1976). By 'different,' we mean simply an algebraic rearrangement. We are suspicious, however, of portions of the ensuing error analysis in that paper. The 4-patch (and associated stencil/'molecule') for the interior node '0' comprises 25 nodes, as shown in Figure 2.3-5, in which the three node 'types' are shown with different symbols.

Fig. 2.3-5 A 4-patch of biquadratic elements.

Thus, upon forming the GFEM equations for node 0 and converting them to FDM form, one is led to the following 'interesting' 25-point stencil/ODE ($\Delta x = l/2$, $\Delta y = h/2$):

$$
\begin{aligned}
&\frac{1}{100}\big[64\dot T_0 + 16(\dot T_N+\dot T_S+\dot T_E+\dot T_W) - 8(\dot T_{NN}+\dot T_{SS}+\dot T_{EE}+\dot T_{WW}) + 4(\dot T_{NE}+\dot T_{NW}+\dot T_{SW}+\dot T_{SE}) \\
&\qquad- 2(\dot T_{NNE}+\dot T_{NNW}+\dot T_{NWW}+\dot T_{SWW}+\dot T_{SSW}+\dot T_{SSE}+\dot T_{SEE}+\dot T_{NEE}) + (\dot T_{NNEE}+\dot T_{NNWW}+\dot T_{SSWW}+\dot T_{SSEE})\big] \\
&\quad+ \frac{u}{20}\bigg(32\,\frac{T_E-T_W}{l} - 16\,\frac{T_{EE}-T_{WW}}{2l} + 16\,\frac{T_{NE}-T_{NW}+T_{SE}-T_{SW}}{2l} + 4\,\frac{T_{NNEE}-T_{NNWW}+T_{SSEE}-T_{SSWW}}{4l} \\
&\qquad- 8\,\frac{T_{SSE}-T_{SSW}+T_{NNE}-T_{NNW}}{2l} - 8\,\frac{T_{SEE}-T_{SWW}+T_{NEE}-T_{NWW}}{4l}\bigg) \\
&\quad+ \frac{v}{20}\bigg(32\,\frac{T_N-T_S}{h} - 16\,\frac{T_{NN}-T_{SS}}{2h} + 16\,\frac{T_{NE}-T_{SE}+T_{NW}-T_{SW}}{2h} + 4\,\frac{T_{NNEE}-T_{SSEE}+T_{NNWW}-T_{SSWW}}{4h} \\
&\qquad- 8\,\frac{T_{NWW}-T_{SWW}+T_{NEE}-T_{SEE}}{2h} - 8\,\frac{T_{NNE}-T_{SSE}+T_{NNW}-T_{SSW}}{4h}\bigg) \\
&= \frac{K}{40}\bigg(64\,\frac{T_E-2T_0+T_W}{\Delta x^2} + 32\,\frac{T_{NE}-2T_N+T_{NW}+T_{SE}-2T_S+T_{SW}}{2\Delta x^2} + 8\,\frac{T_{NNEE}-2T_{NN}+T_{NNWW}+T_{SSEE}-2T_{SS}+T_{SSWW}}{2l^2} \\
&\qquad- 32\,\frac{T_{EE}-2T_0+T_{WW}}{l^2} - 16\,\frac{T_{NNE}-2T_{NN}+T_{NNW}+T_{SSE}-2T_{SS}+T_{SSW}}{2\Delta x^2} - 16\,\frac{T_{NEE}-2T_N+T_{NWW}+T_{SEE}-2T_S+T_{SWW}}{2l^2}\bigg) \\
&\quad+ \frac{K}{40}\bigg(64\,\frac{T_N-2T_0+T_S}{\Delta y^2} + 32\,\frac{T_{NW}-2T_W+T_{SW}+T_{NE}-2T_E+T_{SE}}{2\Delta y^2} + 8\,\frac{T_{NNEE}-2T_{EE}+T_{SSEE}+T_{NNWW}-2T_{WW}+T_{SSWW}}{2h^2} \\
&\qquad- 32\,\frac{T_{NN}-2T_0+T_{SS}}{h^2} - 16\,\frac{T_{NEE}-2T_{EE}+T_{SEE}+T_{NWW}-2T_{WW}+T_{SWW}}{2\Delta y^2} - 16\,\frac{T_{NNE}-2T_E+T_{SSE}+T_{NNW}-2T_W+T_{SSW}}{2h^2}\bigg).
\end{aligned}
\tag{2.3-45}
$$

Remarks: (1) This higher-order approximation is seen to be simply a (non-simple) linear combination of second-order centered difference approximations, an observation that applies to most, if not all, of those stencils to follow. (2) As discovered long ago by Carey (1976), it is not appropriate, though tempting, to analyze the accuracy of this equation via TS expansions. The results would generally be wrong, because they do not properly account for the 'difference' between node types and the couplings between the different types of equations. This point is also brought out in the recent text by Morton (1996). (3) This is the 2D extension of the constant $l$ version of (2.3-18), to which it simplifies by assuming 1D behavior. (4) Mass lumping (see Section 2.5.1) can also be invoked by setting all 25 time derivatives to $\dot T_0$.

b. An interior 2-patch

An appropriate 2-patch is shown in Figure 2.3-6 for a 'typical' edge node, such as node N in the above 4-patch sketch. (The other edge nodes, like node E, can be easily derived from that below via 'symmetry.')

Fig. 2.3-6 A 2-patch of biquadratic elements.

The GFEM for node 0 in this case couples 14 others and generates the following ODE in FDM format:

$$
\begin{aligned}
&\frac{1}{200}\big[128\dot T_0 + 32(\dot T_E+\dot T_W) + 16(\dot T_S+\dot T_N) + 4(\dot T_{SE}+\dot T_{SW}+\dot T_{NE}+\dot T_{NW}) \\
&\qquad- 16(\dot T_{EE}+\dot T_{WW}) - 2(\dot T_{SWW}+\dot T_{NWW}+\dot T_{SEE}+\dot T_{NEE})\big] \\
&\quad+ \frac{u}{40}\left(64\,\frac{T_E-T_W}{l} - 32\,\frac{T_{EE}-T_{WW}}{2l} + 16\,\frac{T_{NE}-T_{NW}+T_{SE}-T_{SW}}{2l} - 8\,\frac{T_{NEE}-T_{NWW}+T_{SEE}-T_{SWW}}{4l}\right) \\
&\quad+ \frac{v}{40}\left(32\,\frac{T_N-T_S}{h} + 16\,\frac{T_{NW}-T_{SW}+T_{NE}-T_{SE}}{2h} - 8\,\frac{T_{NWW}-T_{SWW}+T_{NEE}-T_{SEE}}{2h}\right) \\
&= \frac{K}{80}\bigg(128\,\frac{T_E-2T_0+T_W}{\Delta x^2} + 32\,\frac{T_{SE}-2T_S+T_{SW}+T_{NE}-2T_N+T_{NW}}{2\Delta x^2} \\
&\qquad- 64\,\frac{T_{EE}-2T_0+T_{WW}}{l^2} - 16\,\frac{T_{SWW}-2T_S+T_{SEE}+T_{NWW}-2T_N+T_{NEE}}{2l^2}\bigg) \\
&\quad+ \frac{K}{80}\bigg(64\,\frac{T_S-2T_0+T_N}{\Delta y^2} + 32\,\frac{T_{SE}-2T_E+T_{NE}+T_{SW}-2T_W+T_{NW}}{2\Delta y^2} \\
&\qquad- 16\,\frac{T_{SWW}-2T_{WW}+T_{NWW}+T_{SEE}-2T_{EE}+T_{NEE}}{2\Delta y^2}\bigg),
\end{aligned}
\tag{2.3-46}
$$

wherein the 'anisotropy' of such a midside node is obvious. This equation properly also degenerates to (2.3-18), for $l$ = constant, if the solution is taken to be 1D (no y-variation). Also, mass lumping by equating all $\dot T$'s to $\dot T_0$ is viable (but less accurate).

c. An interior 1-patch

Our last 'biquadratic equation' is also the simplest, and is contained in the single element shown in Figure 2.3-7.

Fig. 2.3-7 One nine-node element.

The central node equation is easily found to be

$$
\begin{aligned}
&\frac{1}{400}\left[256\dot T_0 + 32(\dot T_N+\dot T_S+\dot T_E+\dot T_W) + 4(\dot T_{NE}+\dot T_{NW}+\dot T_{SW}+\dot T_{SE})\right] \\
&\quad+ \frac{u}{10}\left(\frac{T_{SE}-T_{SW}}{l} + 8\,\frac{T_E-T_W}{l} + \frac{T_{NE}-T_{NW}}{l}\right) + \frac{v}{10}\left(\frac{T_{NW}-T_{SW}}{h} + 8\,\frac{T_N-T_S}{h} + \frac{T_{NE}-T_{SE}}{h}\right) \\
&= \frac{K}{10}\left(\frac{T_{SW}-2T_S+T_{SE}}{\Delta x^2} + 8\,\frac{T_E-2T_0+T_W}{\Delta x^2} + \frac{T_{NE}-2T_N+T_{NW}}{\Delta x^2}\right) \\
&\quad+ \frac{K}{10}\left(\frac{T_{NW}-2T_W+T_{SW}}{\Delta y^2} + 8\,\frac{T_N-2T_0+T_S}{\Delta y^2} + \frac{T_{NE}-2T_E+T_{SE}}{\Delta y^2}\right),
\end{aligned}
\tag{2.3-47}
$$

whose 1D version is easily seen to agree with (2.3-16); and, with the following statement, we are done: boundary patches are left as exercises for the reader.

2.3.4 Two Dimensions with Serendipity Elements

Because some still use the eight-node serendipity element (it can sometimes be slightly more cost-effective than the nine-node element, although we do not recommend it), and especially since it leads to some very interesting (bizarre) behavior when mass lumping is invoked, we repeat the exercise for this element. Again, only final (FDM) forms of the equations will be presented, and again only for constant coefficients on a uniform rectangular mesh with $S = 0$. (As usual, the basis function-interpolated source term can be instantly realized via the mass matrix.)

a. An interior 4-patch

We begin with the 25-node stencil shown earlier (Figure 2.3-5) for the nine-node element and reduce it to a 21-node stencil by removing the center node from each element. The concomitant serendipity basis functions and their element matrices (Appendix 1) then lead to the following ODE, after some judicious rearrangement that even includes adding and subtracting equal terms to make more sense of the GFEM equations:

$$
\begin{aligned}
&\frac{1}{60}\big[12(\dot T_N+\dot T_S+\dot T_E+\dot T_W) + 8(\dot T_{NEE}+\dot T_{NNE}+\dot T_{NNW}+\dot T_{NWW}+\dot T_{SWW}+\dot T_{SSW}+\dot T_{SSE}+\dot T_{SEE}) \\
&\qquad- 4(\dot T_{NN}+\dot T_{SS}+\dot T_{EE}+\dot T_{WW}) - 3(\dot T_{NNEE}+\dot T_{NNWW}+\dot T_{SSEE}+\dot T_{SSWW}) - 24\dot T_0\big] \\
&\quad+ \frac{u}{15}\left(8\,\frac{T_{EE}-T_{WW}}{2l} + 3\,\frac{T_{NNEE}-T_{NNWW}+T_{SSEE}-T_{SSWW}}{4l} - 10\,\frac{T_E-T_W}{l} + 14\,\frac{T_{NEE}-T_{NWW}+T_{SEE}-T_{SWW}}{4l}\right) \\
&\quad+ \frac{v}{15}\left(8\,\frac{T_{NN}-T_{SS}}{2h} + 3\,\frac{T_{NNEE}-T_{SSEE}+T_{NNWW}-T_{SSWW}}{4h} - 10\,\frac{T_N-T_S}{h} + 14\,\frac{T_{NNE}-T_{SSE}+T_{NNW}-T_{SSW}}{4h}\right) \\
&= \frac{K}{15}\bigg(28\,\frac{T_{EE}-2T_0+T_{WW}}{l^2} - 20\,\frac{T_E-2T_0+T_W}{\Delta x^2} - 10\,\frac{T_{NNE}-2T_{NN}+T_{NNW}+T_{SSW}-2T_{SS}+T_{SSE}}{2\Delta x^2} \\
&\qquad- 6\,\frac{T_{NEE}-2T_N+T_{NWW}+T_{SEE}-2T_S+T_{SWW}}{2l^2} + 23\,\frac{T_{NNEE}-2T_{NN}+T_{NNWW}+T_{SSEE}-2T_{SS}+T_{SSWW}}{2l^2}\bigg) \\
&\quad+ \frac{K}{15}\bigg(28\,\frac{T_{NN}-2T_0+T_{SS}}{h^2} - 20\,\frac{T_N-2T_0+T_S}{\Delta y^2} - 10\,\frac{T_{NEE}-2T_{EE}+T_{SEE}+T_{NWW}-2T_{WW}+T_{SWW}}{2\Delta y^2} \\
&\qquad- 6\,\frac{T_{NNE}-2T_E+T_{SSE}+T_{NNW}-2T_W+T_{SSW}}{2h^2} + 23\,\frac{T_{NNEE}-2T_{EE}+T_{SSEE}+T_{NNWW}-2T_{WW}+T_{SSWW}}{2h^2}\bigg),
\end{aligned}
\tag{2.3-48}
$$

which deserves several Remarks: (1) The lumped mass for node 0, $\int\phi_0$, is negative ($-lh/3$); and, we have divided the GFEM equations by this negative 'mass' in order to obtain the FDM form above. (2) The 'small' change from the nine-node element has caused a large change in the coefficients, with some rather 'strange' weightings. (3) Even though the coefficients of the $\dot T$ terms sum to unity, lumping via 'row sum' does not work, an important point that we will soon return to.

b. An interior 2-patch

Again we can use the nine-node, 2-patch sketch presented earlier (Figure 2.3-6) and simply omit nodes E and W. Again, only the FDM form of the equation is shown, and is

$$
\begin{aligned}
&\frac{1}{120}\left[64\dot T_0 + 20(\dot T_{NE}+\dot T_{NW}+\dot T_{SE}+\dot T_{SW}) + 16(\dot T_{EE}+\dot T_{WW}) - 12(\dot T_N+\dot T_S) - 8(\dot T_{NEE}+\dot T_{NWW}+\dot T_{SWW}+\dot T_{SEE})\right] \\
&\quad+ \frac{u}{15}\left(10\,\frac{T_{NE}-T_{NW}+T_{SE}-T_{SW}}{2l} + 12\,\frac{T_{EE}-T_{WW}}{2l} - 7\,\frac{T_{NEE}-T_{NWW}+T_{SEE}-T_{SWW}}{4l}\right) \\
&\quad+ \frac{v}{3}\left(\frac{T_N-T_S}{h} + 2\,\frac{T_{NE}-T_{SE}+T_{NW}-T_{SW}}{2h}\right) \\
&= \frac{K}{10}\left(8\,\frac{T_{EE}-2T_0+T_{WW}}{l^2} + 2\,\frac{T_{NWW}-2T_N+T_{NEE}+T_{SWW}-2T_S+T_{SEE}}{2l^2}\right) \\
&\quad+ \frac{K}{3}\left(2\,\frac{T_N-2T_0+T_S}{\Delta y^2} + \frac{T_{NWW}-2T_{WW}+T_{SWW}+T_{NEE}-2T_{EE}+T_{SEE}}{2\Delta y^2}\right),
\end{aligned}
\tag{2.3-49}
$$

which displays even more anisotropy than does the analogous nine-node result in (2.3-46). Remarks: (1) At least now the 'close-in' nodes get the largest positive weighting. (2) Here $\int\phi_0 = 2lh/3$ was used to convert the GFEM form to the displayed FDM form. (3) Again, even though the sum of the $\dot T$ terms is unity, this form of mass lumping is to be avoided.

c. Another interior 2-patch

Because the eight-node element is 'different,' it is of some interest to examine the other type of midside node, node 0 in Figure 2.3-8. And the result is

$$
\begin{aligned}
&\frac{1}{120}\left[64\dot T_0 + 20(\dot T_{NE}+\dot T_{NW}+\dot T_{SW}+\dot T_{SE}) + 16(\dot T_{NN}+\dot T_{SS}) - 12(\dot T_E+\dot T_W) - 8(\dot T_{NNE}+\dot T_{NNW}+\dot T_{SSE}+\dot T_{SSW})\right] \\
&\quad+ \frac{u}{3}\left(\frac{T_E-T_W}{l} + 2\,\frac{T_{NE}-T_{NW}+T_{SE}-T_{SW}}{2l}\right) \\
&\quad+ \frac{v}{15}\left(10\,\frac{T_{NE}-T_{SE}+T_{NW}-T_{SW}}{2h} + 12\,\frac{T_{NN}-T_{SS}}{2h} - 7\,\frac{T_{NNE}-T_{SSE}+T_{NNW}-T_{SSW}}{4h}\right) \\
&= \frac{K}{3}\left(2\,\frac{T_E-2T_0+T_W}{\Delta x^2} + \frac{T_{NNE}-2T_{NN}+T_{NNW}+T_{SSE}-2T_{SS}+T_{SSW}}{2\Delta x^2}\right) \\
&\quad+ \frac{K}{10}\left(8\,\frac{T_{NN}-2T_0+T_{SS}}{h^2} + 2\,\frac{T_{NNW}-2T_W+T_{SSW}+T_{NNE}-2T_E+T_{SSE}}{2h^2}\right),
\end{aligned}
\tag{2.3-50}
$$
an equation that is actually easily derived from (2.3-49) via 'symmetry,' but one that we present because it leads to a different 1D equation, which we show below.

Fig. 2.3-8 A 2-patch of serendipity elements.

Before continuing our promised mass lumping discussion, it is of some interest to see how the serendipity element responds to a 1D problem; thus, suppressing all y-variation in the above equations leads to the following three 'x-direction' equations:

$$
\frac{1}{30}\left[14(\dot T_{i+1}+\dot T_{i-1}) + 3(\dot T_{i+2}+\dot T_{i-2}) - 4\dot T_i\right] + \frac{u}{3}\left(5\,\frac{T_{i+2}-T_{i-2}}{2l} - 2\,\frac{T_{i+1}-T_{i-1}}{l}\right) = K\left(3\,\frac{T_{i-2}-2T_i+T_{i+2}}{l^2} - 2\,\frac{T_{i-1}-2T_i+T_{i+1}}{\Delta x^2}\right) \tag{2.3-51}
$$

from the 4-patch equation;

$$
\frac{1}{3}\left(\dot T_{i-1}+\dot T_i+\dot T_{i+1}\right) + \frac{u}{3}\left(2\,\frac{T_{i+1}-T_{i-1}}{l} + \frac{T_{i+2}-T_{i-2}}{2l}\right) = K\,\frac{T_{i-2}-2T_i+T_{i+2}}{l^2} \tag{2.3-52}
$$

from the first 2-patch equation; and

$$
\frac{1}{10}\left(\dot T_{i-1}+8\dot T_i+\dot T_{i+1}\right) + u\,\frac{T_{i+1}-T_{i-1}}{l} = K\,\frac{T_{i+1}-2T_i+T_{i-1}}{\Delta x^2} \tag{2.3-53}
$$

from the second 2-patch equation, wherein it is clear that we have switched from compass point to index notation. Whereas (2.3-53) does recover the analogous equation from 1D quadratic elements, i.e., (2.3-16), the other two are new. But they are valid/legitimate; i.e., the serendipity element can indeed solve 1D problems even though its basis functions are not tensor products of 1D basis functions.

To conclude this discussion (on a really sour note), we now discuss mass lumping. The first, and rather obvious, lumping is simply 'row sum' (see too Section 2.5.1), which is obtained from the above consistent mass equations, (2.3-48) through (2.3-50), simply by setting all $\dot T$'s to $\dot T_0$. The equations that result are the same as those derived via the so-called optimal lumping scheme of Malkus et al. (1988); see too the original reference, Fried and Malkus (1975), which was derived using nodal quadrature rather than exact integration. Regardless of the fact that row sum lumping is the same as optimal nodal quadrature, and regardless of the fact that the resulting ODE's 'look' okay, they are in fact
very far from 'okay.' In what might be called one of the 'finite element surprises,' it turns out that such a lumped mass approximation turns what were perfectly legitimate GFEM ODE's into totally inappropriate non-GFEM ODE's that are sometimes even unstable ($e^{\lambda t}$ behavior, where $\mathrm{Re}(\lambda) > 0$). In our numerical experiments, summarized in Gresho et al. (1976, 1978), we found that these lumped mass ODE's were unstable for pure diffusion, stable but 'meaningless' (i.e., not representing the PDE) for pure advection, and somewhere in between for finite Peclet number. The case of pure diffusion was later substantiated in Malkus and Plesha (1986) and further discussed in Malkus et al. (1988).

To add even more confusion to the mass lumping arena wherein PDE consistency can be lost, we must discuss the innovative (but still ad hoc, as are all) mass lumping scheme of Hinton et al. (1976), beginning with the admonition that they were mainly interested in structural dynamics; they did not advocate nor test their scheme on the equations of fluid mechanics. (They in fact cleverly side-stepped this issue by restricting their claims to problems for which variational principles exist.) Their scheme is simple to state and to implement, and it has apparently served well in some areas, but not advection-diffusion via 'serendipity' (!): compute the diagonal entries of the element consistent-mass matrix, sum them, and multiply each by the (same) unique scalar that preserves total element mass. Since both diagonal entries and total mass are always positive, this scheme never generates negative lumped masses, which, indeed, was one of the authors' objectives. Referring then to the eight-node mass matrix in Appendix 1, this mass lumping algorithm puts 3/76 of the mass at each corner node and 16/76 at each midside node. What does this do to the three nodal equations in (2.3-48) through (2.3-50)?
To answer, we must first back up from the FDM form presented to the original GFEM form, then lump the mass, and then divide by $\int \phi_0$, so that all terms except the time derivatives are as presented above, and it is no longer GFEM. The resulting time-derivative terms are:

1. $-\frac{9}{19}\dot T_0$ for a corner node (2.3-48), and
2. $\frac{12}{19}\dot T_0$ for a midside node (2.3-49) and (2.3-50),

which merit the following

Remarks:
(1) Recall that row sum lumping of the FDM versions of the equations gives $\dot T_0$ for each, a result that 'looks' consistent (but is not).
(2) The minus sign in front of the corner node equation is caused by our 'undoing' of the weighting by division by $\int \phi_0 = -lh/3$.
(3) Neither coefficient looks consistent.
(4) These same coefficients apply in the 1D cases, (2.3-51) through (2.3-53), the first giving $-\frac{9}{19}\dot T_0$ and the other two $\frac{12}{19}\dot T_0$.
(5) We also tested this lumping scheme in Gresho et al. (1976, 1978) with the following results: (i) it was always stable, (ii) it was, like row sum, completely inappropriate for the hyperbolic limit of pure advection, and (iii) it was 'reasonable' (but much less accurate than consistent mass) for pure diffusion. We did no mesh refinement experiments, but believe that the resulting 'pure advection' ODE's do not model $\partial T/\partial t + \mathbf{u}\cdot\nabla T = 0$ even when $h \to 0$.
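The lumped weights quoted above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the standard eight-node serendipity basis on the reference square $[-1,1]^2$, integrates the consistent mass matrix with a 3 x 3 Gauss rule (exact for these polynomials), and then forms both the row-sum weights and the diagonal-scaling weights of the Hinton et al. scheme. The node ordering and all function names are illustrative choices.

```python
import numpy as np

# Reference-square node coordinates: four corners first, then four midsides.
# (The ordering is an arbitrary choice made for this sketch.)
NODES = np.array([(-1, -1), (1, -1), (1, 1), (-1, 1),
                  (0, -1), (1, 0), (0, 1), (-1, 0)], dtype=float)


def shape_functions(xi, eta):
    """Eight-node serendipity basis functions evaluated at (xi, eta)."""
    N = np.empty(8)
    for a, (xa, ya) in enumerate(NODES):
        if xa != 0 and ya != 0:      # corner node
            N[a] = 0.25 * (1 + xi * xa) * (1 + eta * ya) * (xi * xa + eta * ya - 1)
        elif xa == 0:                # midside node on an edge eta = +/-1
            N[a] = 0.5 * (1 - xi**2) * (1 + eta * ya)
        else:                        # midside node on an edge xi = +/-1
            N[a] = 0.5 * (1 + xi * xa) * (1 - eta**2)
    return N


def consistent_mass():
    """Consistent mass matrix on [-1, 1]^2 via 3 x 3 Gauss quadrature."""
    pts, wts = np.polynomial.legendre.leggauss(3)
    M = np.zeros((8, 8))
    for xi, wx in zip(pts, wts):
        for eta, wy in zip(pts, wts):
            N = shape_functions(xi, eta)
            M += wx * wy * np.outer(N, N)
    return M


M = consistent_mass()
total_mass = M.sum()                       # element area (4 on the reference square)
row_sum = M.sum(axis=1) / total_mass       # row-sum lumping, as mass fractions
diag_scaled = np.diag(M) / np.diag(M).sum()  # diagonal scaling, as mass fractions

print("row sum     corner, midside:", row_sum[0], row_sum[4])
print("diag scaled corner, midside:", diag_scaled[0], diag_scaled[4])
```

Row-sum lumping gives a negative corner weight ($-1/12$ of the element mass, with $1/3$ at each midside node), whereas diagonal scaling gives the positive weights $3/76$ and $16/76$ cited in the text.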
While further analysis may be interesting, we believe it to be unwarranted; suffice it to say that neither lumping scheme should be used in practice, especially the more logical-looking row sum lumping, since it will often generate unstable ODE's. If you wish to use the serendipity element for the scalar transport equation (or even for the NS equations), then you should use only the consistent mass (GFEM) formulation (for which it behaves very well) or design a new lumping scheme that works. (We have here a perfect example of the use of the adjective 'consistent'!)

2.4 OPEN BOUNDARY CONDITIONS (OBC'S)

We now specifically address, for but one of several times in this text, the important issue of outflow (or, more generally, open) boundary conditions (OBC's), a special case (usually) of the NBC's associated with the weak form. In many simulations of interest in fluid mechanics, the fluid, and the 'load' that it carries/advects (here the scalar $T$), flows through (i.e., both into and out of) the computational domain, a situation necessitated by the fact that the true (physical) domain of interest is (much) too large to even be considered in the numerical simulation. For an engineering example, consider a physical laboratory in which the experiment of interest is flow past an obstacle (a cylinder in a channel, or an airplane in a wind tunnel) and the flow is forced via a pump or fan/compressor; to attempt to model the entire closed loop would be expensive. (Modeling an open wind tunnel would be out of the question.) For a geophysical application, consider the problem of trying to predict the air pollution from a (dirty) factory that is located (to make the problem more interesting) in mountainous terrain; to attempt to model the entire atmosphere of the earth would be expensive.
So we must consider inflow/outflow situations in which our computational domain is truncated and some BC's necessarily applied at these artificial/synthetic 'boundaries'; i.e., the PDE does not know that we are truncating the universe; all it knows is that BC's on $T$ are required in order to 'solve for $T$.' OBC's synthesize the connection of our restricted computational domain to the rest of the 'universe.' The general goal is to apply BC's at inflow ($\mathbf{n}\cdot\mathbf{u} < 0$) and, especially, at outflow ($\mathbf{n}\cdot\mathbf{u} > 0$) that are both mathematically legitimate and computationally useful. But what does 'useful' mean? While necessarily vague, it is basically this: useful BC's are those that lead to good results in the 'smallest' truncated domain. But what does 'good' mean? What does 'smallest' mean? Good results are those that cause the solution in the 'subdomain of principal interest' to change little when the computational domain is made larger and that would agree well with those from the true (physical) domain. The smallest truncated domain is often (but not always) the largest domain that one can afford to model. Naturally, all of these issues are rather qualitative in nature, a necessary consequence of domain truncation. But it is a very real fact of life that many CFD simulations must deal with the open-boundary situation, and whereas many BC's are supplied by nature, OBC's are not. Finally, we remark that good OBC's are cost-effective, and bad OBC's can even destroy the entire simulation.

2.4.1 One Dimension

So now let us assume that our 1D problem, which indeed displays inflow ($x = 0$) and outflow ($x = L$) as posed in (2.3-1) for $u > 0$ and constant, is one in which the 'desired'
simulation is to produce good results on some subdomain, $L_s < L$. We have already imposed a Dirichlet BC at the inflow point, which presumes that we have been supplied with some 'outside information' there. But this is not actually necessary (nor desirable, sometimes), in that a Robin BC could also (and sometimes judiciously; see, for example, Novy et al., 1990) be applied at $x = 0$, this time either in the form $-\kappa(\partial T/\partial x) + H(T - \hat T) = q_0$ or $-\kappa(\partial T/\partial x) + uT + H(T - \hat T) = q_0$, the minus sign accounting for the fact that the unit normal vector in the $x$-direction is $-1$ at the inlet. But here we retain the Dirichlet BC at $x = 0$ and shift our attention to the outlet, $x = L$. The NBC of (2.3-3) can, in fact, be usefully applied there as an OBC, usually in the following way: take $H = q = 0$; i.e., the useful OBC is simply $\partial T/\partial x = 0$ at $x = L$. Recalling the semi-discrete equation for $i = N$ [(2.3-12)] gives, for this case,

$$\kappa\,\frac{T_N - T_{N-1}}{l_N} + \frac{u\,(T_N - T_{N-1})}{2} + \frac{l_N}{3}\,\dot T_N = -\frac{l_{N-1}}{6}\,\dot T_{N-1}, \qquad (2.4\text{-}1)$$

an ODE for node $T_N$, with (ignoring the coupling) time constant $\tau \approx (3\kappa/l_N^2 + 3u/2l_N)^{-1}$, that is also the NBC approximation to $\partial T/\partial x = 0$. Note that the time constant, which is $l_N^2/3\kappa$ for pure diffusion and $2l_N/3u$ for pure advection, tends to zero with mesh refinement; the ODE solution appears to then respond so rapidly that it is in quasi-equilibrium; i.e., $l_N \dot T_N \to 0$ with $\dot T_N$ finite, which presumably could have implications regarding the stability of the chosen ODE method. (But in fact it does not, fortunately; see Section 2.7.2c.) For $l_N \to 0$, the ODE is (automatically) 'sacrificed' in favor of the OBC. We shall demonstrate later (Section 2.6.2d) that this BC permits a passive exit (of the advective flux) from the domain even in the advection-dominated ($Pe \gg 1$) case.
We will also show that a Dirichlet BC at the outflow is (usually) not at all passive, especially for advection-dominated flow: 'From the study of singular perturbation problems, it is well known that boundary layers are weaker if boundary conditions on the derivatives are used (soft BC's) rather than boundary conditions on a function itself (hard BC's)' (Naughton, 1986).

A specific and simple example might be useful to help investigate the concept of quasi-equilibrium of the ODE for node $N$, and the 'dual' assignment of this ODE; i.e., it must simultaneously (i) satisfy closely the BC of the PDE and (ii) ensure that $T_N(t)$ is an accurate approximation to the PDE solution at the boundary. Consider the problem $T_t = \kappa T_{xx}$ for $x > 0$, $-\kappa T_x = q$ at $x = 0$, $T = 0$ at $t = 0$; i.e., the transient heat equation in a semi-infinite domain with an applied flux BC. The exact solution is

$$T = \frac{q}{\kappa}\left[\sqrt{\frac{4\kappa t}{\pi}}\; e^{-x^2/4\kappa t} - x\left(1 - \operatorname{erf}\frac{x}{\sqrt{4\kappa t}}\right)\right], \qquad (2.4\text{-}2)$$

which for 'small' $t$ (namely, $4\kappa t \ll L^2$), approximates well the solution on the finite span, $0 \le x \le L$, which we shall use for a numerical solution (with BC $T = 0$ at $x = L = 1$). The ODE for node 1 (at $x = 0$; node $N$ $\to$ node 1), corresponding to the OBC, is [linear elements, lumped mass (for simplicity), which is also 'FDM + image point,' see (2.4-26) through (2.4-28) below]

$$\tfrac{1}{2}\, l\, \dot T_1 + \kappa (T_1 - T_2)/l = q, \qquad (2.4\text{-}3)$$

and our objective is to focus on the temporal behavior of node 1 with an applied flux BC. Viewed 'in isolation,' it responds with a time constant $\tau = l^2/2\kappa$, which goes to
zero with mesh refinement, suggesting, as stated above, quasi-equilibrium behavior; i.e., for $l$ 'sufficiently small' (and $t$ 'sufficiently large,' both of which we shall soon define), node 1 should respond so fast that $\dot T_1(t)$ in (2.4-3) will agree very closely with that of the exact solution, $\partial T/\partial t|_0 = q/\sqrt{\pi\kappa t}$, from (2.4-2). We then presume that the quasi-equilibrium ODE will then very nearly satisfy $\kappa(T_1 - T_2)/l \approx q - (l/2)\,\partial T/\partial t|_0 = q[1 - l/(2\sqrt{\pi\kappa t})] \approx q$; it will of course always satisfy $\kappa(T_1 - T_2)/l = q - (l/2)\dot T_1$. But since the exact acceleration is unbounded at $t = 0$, quasi-equilibrium behavior is clearly not true for all $t$. This is because of the discontinuity in heat flux at $t = 0$; the closer you go toward $t = 0$, the finer the needed mesh if the approximate solution is to be accurate; a given mesh will not be able to capture $\dot T_1 = O(1/\sqrt{t})$ for $t$ 'too small.' We can quantify this approximately as follows: supposing that $t$ is small enough that the mesh nearly captures the correct solution at $x = 0$, we can say

$$\dot T(0, t) = q/\sqrt{\pi\kappa t} \approx T_1(t)/\tau \approx T(0, t)/\tau = (q/\tau)\sqrt{4t/\pi\kappa}, \qquad (2.4\text{-}4)$$

with $\tau = l^2/2\kappa$, to give the 'limiting case' result,

$$l = \sqrt{4\kappa t}. \qquad (2.4\text{-}5)$$

Thus, for a fixed mesh, our approximate solution will be poor for $t < O(l^2/4\kappa)$; conversely, if we wish to capture the behavior at small $t$, then the first element length should be less than $\sqrt{4\kappa t}$. Quasi-equilibrium (and concomitant good approximation of the constant flux BC) in this case will be observed only for $t \gg l^2/4\kappa$; for $t \lesssim O(l^2/4\kappa)$, $T_1(t)$ will not be in quasi-equilibrium, and neither the PDE near $x = 0$ nor the BC will be accurately approximated. Quasi-equilibrium and accurate results near $x = 0$ go hand-in-hand, and can only occur for $t \gg O(l^2/\kappa)$. Figure 2.4-1 shows, thanks to A.C.
Hindmarsh and his VODE code (Brown et al., 1989), the exact and approximate ($N = 20$) solutions at $x = 0$, and Figure 2.4-2 shows the 'flux error,' $q - \kappa(T_1 - T_2)/l$, which is also $l\dot T_1/2$, for a variable mesh [$x_i = (i/N)^{1.2}$] solution of the appropriate ODE's on the unit span with $T = 0$ at $x = 1$ for 10, 20, 40, and 80 grid points, with $q = \kappa = 1$. [The $\infty$ span solution is 'valid' for $t \ll L^2/\kappa = 1$. Also, the quantity called 'flux error' is not exactly consistent with our discussion of 'consistent flux' in Chapter 4. C'est la vie.] Also shown is the 'scaled' exact acceleration, $l\dot T(0, t)/2 = lq/(2\sqrt{\pi\kappa t})$, where $l = (1/N)^{1.2} = 0.063$, 0.027, 0.012, and 0.0052 for $N = 10$, 20, 40, and 80, respectively. These give the following values for $\tau$: 0.0020, 0.00038, 0.000071, and 0.000014, respectively. Figure 2.4-3 shows the actual acceleration error, $\dot T_1(t) - q/\sqrt{\pi\kappa t}$, for the same four cases. ($N = 80$ is really there; it is just 'too accurate' to see, and the initial error in all cases is $-\infty$.) It is seen that for $t > \sim 10\tau$ quasi-equilibrium and good accuracy ('error' in flux $\le \sim 0.13$) obtain. (The horizontal line shown corresponds to $\sim 5.3\tau$ and an 'error' of $\le \sim 0.18$ in Figure 2.4-2; e.g., for $N = 20$, the flux error is $< 0.18$ for $t > \sim 0.002$.) This time ($10\tau$) is closely related to the 'minimum time of believability' discussed in Gresho and Lee (1981) and would be even more obvious had we used consistent mass in the above example because $T_2(t)$ would then start off in the wrong direction, and would have recovered and become accurate by $10\tau$.

Remark: Another approximation that can be applied for this problem, from the FDM point of view: replace the ODE for node 1 by the PDE BC approximation, $q = \kappa(T_1 - T_2)/l$. We
Fig. 2.4-1 Exact and approximate solutions at $x = 0$: $T(0, t) = \sqrt{4t/\pi}$ and $T_1(t)$ for $N = 20$ and $x_i = (i/N)^{1.2}$, plotted against $t$.

Fig. 2.4-2 Scaled acceleration: numerical (solid curves) and exact, plotted against $t$.
did the experiment and obtained much less accurate answers. The ODE NBC is the best way to go.

Fig. 2.4-3 Acceleration error at $x = 0$, plotted against $t$.

We conclude the analysis by actually determining the exact solution to the ODE's via an eigenvector expansion, hopefully for further elucidation. Also, the solution is both simple and interesting, so we present it. To do so, we switch to a uniform mesh and utilize the (uniform mesh) eigenproblem results from Hindmarsh et al. (1984), accounting for the reversed BC's there ($T = 0$ at $x = 0$, $T_x = 0$ at $x = L$):

$$K y = -\mu M_L y, \qquad (2.4\text{-}6)$$

where

$$K = \frac{\kappa}{l}\begin{bmatrix} -2 & 1 & & & \\ 1 & -2 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & -2 & 1 \\ & & & 1 & -1 \end{bmatrix} \qquad (2.4\text{-}7)$$

is the $N \times N$ diffusion matrix and

$$M_L = l\,\operatorname{diag}\left(1, 1, \ldots, 1, \tfrac{1}{2}\right) \qquad (2.4\text{-}8)$$
is the lumped mass matrix. The corresponding ODE system is

$$M_L \dot T = K T + b, \qquad T(0) = 0, \qquad (2.4\text{-}9)$$

where $b^T = (0, \ldots, 0, q)$ is the driving force. The eigensolution is

$$\mu_n = \frac{2\kappa N^2}{L^2}\left[1 - \cos\frac{(2n-1)\pi}{2N}\right] \qquad (2.4\text{-}10)$$

and

$$y_j^{(n)} = \sin\frac{(2n-1)j\pi}{2N} \quad \text{for } n, j = 1, 2, \ldots, N, \qquad (2.4\text{-}11)$$

where $l = 1/N$, and $j = N$ corresponds to the ODE + BC; namely, from (2.4-9),

$$\frac{l}{2}\dot T_N = \frac{\kappa}{l}(T_{N-1} - T_N) + q, \qquad (2.4\text{-}12)$$

which corresponds to the $T_1$ equation in the earlier discussion. The solution for $T(t)$ in terms of the eigenvectors, $\{y^{(n)}\}$, goes like:

1. Expand the solution in terms of the eigenvectors (which form a basis):

$$T(t) = \sum_{n=1}^{N} a_n(t)\, y^{(n)}. \qquad (2.4\text{-}13)$$

2. Expand the RHS vector in a 'related' basis:

$$b = \sum_{n=1}^{N} b_n M_L\, y^{(n)}, \qquad (2.4\text{-}14)$$

where the expansion coefficients are obtained using the following orthogonality relation

$$(y^{(m)})^T M_L\, (y^{(n)}) = \frac{L}{2}\,\delta_{mn}, \qquad (2.4\text{-}15)$$

to obtain $b_n = 2 b^T y^{(n)}/L = (2q/L)\sin\frac{(2n-1)\pi}{2} = (2q/L)(-1)^{n+1}$.

3. Insert both of the above expansions into (2.4-9) and utilize (2.4-6) to obtain

$$\sum_{n=1}^{N} (\dot a_n + \mu_n a_n - b_n)\, M_L\, y^{(n)} = 0, \qquad (2.4\text{-}16)$$

which, since the vectors $\{M_L y^{(n)}\}$ are linearly independent, is an expansion of the zero function so that we have

$$\dot a_n + \mu_n a_n = b_n, \qquad a_n(0) = 0, \qquad (2.4\text{-}17)$$

with solution

$$a_n(t) = b_n (1 - e^{-\mu_n t})/\mu_n. \qquad (2.4\text{-}18)$$

Hence, the complete solution at node $j$ is

$$T_j(t) = \frac{2q}{L}\sum_{n=1}^{N} (-1)^{n+1}\,(1 - e^{-\mu_n t})\,\sin\frac{(2n-1)j\pi}{2N}\Big/\mu_n, \qquad (2.4\text{-}19)$$
and we zoom in on the 'BC node,' $j = N$, to obtain

$$T_N(t) = \frac{2q}{L}\sum_{n=1}^{N} (1 - e^{-\mu_n t})/\mu_n, \qquad (2.4\text{-}20)$$

and its rate of change, much more relevant in this case since $T$ itself starts out with zero error, is

$$\dot T_N(t) = \frac{2q}{L}\sum_{n=1}^{N} e^{-\mu_n t}. \qquad (2.4\text{-}21)$$

In Table 2.4-1, we show, for $N = 20$ ($\tau = l^2/2\kappa = 0.00125$), some eigenvector results at $t = 0.002$ ($1.6\tau$) and $0.01$ ($8\tau$), at which times $T_N(0.002) = 0.046$ vs $T(0, 0.002) = 0.050$ (8% error), $\dot T_N(0.002) = 14.1$ vs $\dot T(0, 0.002) = 12.6$ (12% error), and $T_N(0.01) = 0.111$ vs $T(0, 0.01) = 0.113$ (1.8% error), and finally $\dot T_N(0.01) = 5.74$ vs $\dot T(0, 0.01) = 5.64$ (1.7% error), showing, again, good accuracy for $t > \sim 10\tau$. We first note the lack of 'mode separation' at $t = 0.002$ (all modes are significant) for $T_N$, whereas at $t = 0.01$ there is a hint of mode separation. This actually is not surprising when it is realized that initially all modes contribute equally, that from each mode to $T_N(t)$ being $2qt/L$. We seem to have chosen a bad example for arguing 'quasi-equilibrium' for node $N$! Surely $\dot T_N(t)$ is a better indicator than $T_N(t)$, since both exact and approximate values of $T$ start at zero. By $t = 0.01$, though, there is significant mode separation, especially in $\dot T_N$; only the first five or so are really significant. We conclude our simple(?) example with the following

Table 2.4-1 Eigenvalues and eigenvector contributions to $T_N$ and $\dot T_N$ for $N = 20$ at $t = 0.002$ and $0.01$.
 n    mu_n       10^3 (1 - e^{-0.002 mu_n})/mu_n    e^{-0.002 mu_n}    10^3 (1 - e^{-0.01 mu_n})/mu_n    e^{-0.01 mu_n}
 1    2.46613    1.99508                            0.995080           9.87770                           0.975640
 2    22.1041    1.95644                            0.956755           8.97192                           0.801684
 3    60.8964    1.88301                            0.885172           7.48954                           0.543425
 4    117.888    1.78171                            0.789958           5.87318                           0.307623
 5    191.675    1.66128                            0.681574           4.44980                           0.147084
 6    280.442    1.53078                            0.570705           3.34992                           0.060542
 7    382.001    1.39843                            0.465798           2.56039                           0.021928
 8    493.853    1.27076                            0.372430           2.01038                           0.007165
 9    613.244    1.15236                            0.293321           1.62713                           0.002171
10    737.233    1.04594                            0.228901           1.35557                           0.000628
11    862.767    0.95266                            0.178078           1.15885                           0.000179
12    986.756    0.87259                            0.138968           1.01337                           0.000052
13    1106.15    0.80509                            0.109449           0.90402                           0.000016
14    1218.00    0.74917                            0.087510           0.82102                           0.000005
15    1319.56    0.70370                            0.071423           0.75783                           0.000002
16    1408.32    0.66760                            0.059806           0.71006                           0.000001
17    1482.11    0.63990                            0.051601           0.67471                           0
18    1539.10    0.61981                            0.046042           0.64973                           0
19    1577.90    0.60675                            0.042605           0.63376                           0
20    1597.53    0.60037                            0.040964           0.62596                           0
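The tabulated eigenvalues and the boundary-node values quoted with Table 2.4-1 can be reproduced with a few lines of NumPy. This is a minimal sketch under the stated data ($N = 20$, $q = \kappa = L = 1$); the function names are illustrative, and the matrix check simply confirms that the closed-form $\mu_n$ of (2.4-10) are the eigenvalues of the pair in (2.4-7) and (2.4-8).

```python
import numpy as np


def eigenvalues(N=20, kappa=1.0, L=1.0):
    """Closed-form eigenvalues mu_n of (2.4-10) for n = 1..N."""
    n = np.arange(1, N + 1)
    return (2.0 * kappa * N**2 / L**2) * (1.0 - np.cos((2 * n - 1) * np.pi / (2 * N)))


def boundary_node(t, N=20, kappa=1.0, q=1.0, L=1.0):
    """Return (T_N, dT_N/dt) from the series (2.4-20) and (2.4-21)."""
    mu = eigenvalues(N, kappa, L)
    T_N = (2.0 * q / L) * np.sum((1.0 - np.exp(-mu * t)) / mu)
    dT_N = (2.0 * q / L) * np.sum(np.exp(-mu * t))
    return T_N, dT_N


def exact_boundary(t, kappa=1.0, q=1.0):
    """Semi-infinite-span values (T, dT/dt) at x = 0 from (2.4-2)."""
    return q * np.sqrt(4.0 * t / (np.pi * kappa)), q / np.sqrt(np.pi * kappa * t)


def matrices(N=20, kappa=1.0, L=1.0):
    """Diffusion matrix (2.4-7) and lumped mass matrix (2.4-8), uniform mesh."""
    l = L / N
    K = -2.0 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)
    K[-1, -1] = -1.0                 # last row carries the flux (Neumann) condition
    m = l * np.ones(N)
    m[-1] = l / 2.0                  # half-length lump at the boundary node
    return (kappa / l) * K, np.diag(m)


for t in (0.002, 0.01):
    T_N, dT_N = boundary_node(t)
    T_ex, dT_ex = exact_boundary(t)
    print(f"t={t}: T_N={T_N:.3f} (exact {T_ex:.3f}), dT_N/dt={dT_N:.2f} (exact {dT_ex:.2f})")
```

Summing the table columns by hand gives the same numbers: twice the column sums at $t = 0.002$ yield $T_N \approx 0.046$ and $\dot T_N \approx 14.1$.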
Remarks:
(1) Pure diffusion is not a particularly good example of an OBC.
(2) We are open to suggestion for a better model problem that includes advection.

Turning now to another OBC issue, we conjecture: Suppose, however, that one employed the weak form (2.1-26) rather than (2.1-23), which incorporates the total flux as an NBC. Our purpose here is to indicate the possibility of serious danger in this case with respect to OBC's, especially (again) for the advection-dominated case. The 1D version corresponds to the following BC at $x = L$: $\kappa(\partial T/\partial x) - uT + H(T - \hat T) = q$, which leads to the following semi-discrete equation for $i = N$, after replacing the term $\int \phi_i\, u\,(\partial T_h/\partial x)$ by $-\int u T_h\,(\partial \phi_i/\partial x)$, à la (2.1-26):

$$\frac{1}{6}\left[l_{N-1}\dot T_{N-1} + 2 l_N \dot T_N\right] - \frac{u}{2}(T_{N-1} + T_N) + \frac{\kappa}{l_N}(T_N - T_{N-1}) + H T_N = \frac{1}{2} l_N S + q + H \hat T_N,$$

which is the NBC of the 'total-flux' BC given above. Consider now the use of these two forms of the OBC (at $i = N$) for the advection-dominated case, the dimensionless form of which is (2.1-10) for the former, and

$$\frac{\partial T}{\partial x} - Pe\, T + Bi\,(T - \hat T) = q \qquad (2.4\text{-}22)$$

for the latter (total flux), which is the dimensionless form of (2.1-25), where $Pe = uL/\kappa$, $Bi = HL/\kappa$, and $q$ has been rescaled to $qL/\kappa$. To simplify(?) matters, first take $q = 0$ and consider the case $Pe \gg 1$. If $Bi = O(1)$, the total flux BC becomes $\partial T/\partial x = Pe\, T$ at the exit rather than $\partial T/\partial x = 0$; forward advection is matched/balanced by back-diffusion with (in general) an extremely large gradient. Whereas the homogeneous Neumann case is computationally effective regardless of the size of $Pe$, the homogeneous Robin case can be a disaster for $Pe \gg 1$, another result we shall demonstrate in due course (Section 2.6.2f). The reason may be clear already: $q = 0$ is usually quite inappropriate for the total flux case, damming as it does the total flux. It is appropriate for the 'advective' form because it permits the advective flux to leave passively, while damming only the diffusive flux.
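A short steady-state calculation makes the 'damming' concrete. It is offered only as an illustration under assumed data that are not taken from the text: dimensionless variables on $0 < x < 1$, no source, $Bi = 0$, $q = 0$, and an inflow value $T(0) = 1$.

```latex
\begin{align*}
Pe\,\frac{dT}{dx} = \frac{d^2T}{dx^2},\quad T(0)=1
  &\;\Longrightarrow\; T(x) = A + B\,e^{Pe\,x},\quad A + B = 1,\\[4pt]
\text{advective form: } \frac{dT}{dx}\Big|_{x=1} = 0
  &\;\Longrightarrow\; B = 0,\quad T(x) \equiv 1,\\[4pt]
\text{total-flux form: } \Big(\frac{dT}{dx} - Pe\,T\Big)\Big|_{x=1} = 0
  &\;\Longrightarrow\; -Pe\,A = 0,\quad T(x) = e^{Pe\,x},\quad T(1) = e^{Pe}.
\end{align*}
```

With the homogeneous Neumann condition the inflow value simply passes through the domain, whereas the homogeneous total-flux condition forces an exit value of $e^{Pe}$, which is astronomically large for $Pe \gg 1$.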
Another OBC has been used and advocated in the finite difference literature (e.g., Orlanski, 1976; Sani and Gresho, 1994), again for the advection-dominated case wherein it tries to satisfy the maxim, 'Let the advective flux leave the domain passively.' In the simple 1D context, it is

$$\frac{\partial T}{\partial t} + V\frac{\partial T}{\partial x} = 0, \qquad (2.4\text{-}23)$$

where it clearly makes the most sense here to take $V = u$. Since, however, the multidimensional case rarely has a fixed and uniform outflow velocity, we will for now keep $V$ as a 'general' (but $> 0$) velocity. [In Sani and Gresho (1994) are also cited several other OBC's: some for FDM, some for FEM, and a few via the spectral element method.] In the finite element context, this OBC might be implemented simply by replacing the semi-discrete equation at the outlet ($i = N$) by (for linears)

$$\tfrac{1}{3}\left(l_{N-1}\dot T_{N-1} + 2 l_N \dot T_N\right) + V(T_N - T_{N-1}) = 0, \qquad (2.4\text{-}24)$$
where we note that in this same context, it looks more like a Dirichlet (essential) BC [as indeed does (2.4-23)] than a Robin/Neumann BC; i.e., it is not an NBC. But it is also clear from the above discussion that, provided $V = u$, it corresponds properly to the hyperbolic limit associated with $Pe \to \infty$ after, of course, setting $H = 0$, $q = 0$, and $S = 0$ in these equations. At this time, we have little or no experience in using this BC via GFEM; nor do we know others who have. It should probably receive more attention/testing, especially in the more difficult context of multi-dimensional flow wherein, for example, setting (in 2D) $V = (1/H)\int_0^H (\mathbf{n}\cdot\mathbf{u})\,dy$, where $\mathbf{n}\cdot\mathbf{u}$ ($> 0$) is the normal velocity at the outlet, may be appropriate. If, of course, $\mathbf{n}\cdot\mathbf{u}$ varies greatly between $0 < y < H$, then this BC would be more severely tested.

An alternative, and indeed more appropriate, way to enforce (2.4-23) is via the NBC route of GFEM, as follows: find $T_h$ from

$$\int_\Omega \phi_i \frac{\partial T_h}{\partial t} + \int_{\Gamma_O} \frac{\kappa}{V}\,\phi_i \frac{\partial T_h}{\partial t} + \int_\Omega \left(\phi_i\, \mathbf{u}\cdot\nabla T_h + \kappa\,\nabla\phi_i\cdot\nabla T_h\right) = 0 \quad \forall i, \qquad (2.4\text{-}25)$$

which is obtained via the usual integration by parts and (2.4-23), and we see that a simple modification to the mass matrix is all that is needed.

Remarks:
(1) The $l \to$ '0' limit yields, for linears, the lumped version of (2.4-24).
(2) It has not yet been tested (to our knowledge), probably in part because the FEM community is perfectly happy with the simple 'do-nothing' OBC: $\partial T/\partial n = 0$.
(3) It properly 'vanishes' for the hyperbolic ($\kappa = 0$) case.
(4) It also vanishes at the steady state, leaving $\partial T_h/\partial n = 0$ as the OBC.
(5) It appears to be also interpretable as a Dirichlet BC, applied weakly.

The last issue to be raised in this, our first visit to OBC's, is (perhaps somewhat unfairly, but we do not believe so) to compare the OBC/NBC of the GFEM with a common OBC that is derived in a finite difference version of (2.3-3).
This is the (Taylor series) second-order BC at $x = L$ that is derived using the image point method, wherein a fictitious node ($N + 1$) is temporarily introduced beyond $x = L$ (at a distance $l_N$, to be precise) and used to formulate a discrete version of BC (2.3-3); namely, the FDM might proceed as follows for $i = N$:

$$\dot T_N + \frac{u\,(T_{N+1} - T_{N-1})}{2 l_N} = \kappa\,\frac{T_{N-1} - 2T_N + T_{N+1}}{l_N^2} + S, \qquad (2.4\text{-}26)$$

where $T_{N+1}$ is found from the Robin BC,

$$\frac{\kappa\,(T_{N+1} - T_{N-1})}{2 l_N} + H(T_N - \hat T_N) = q, \qquad (2.4\text{-}27)$$

and used to eliminate the fictitious node from the semi-discrete equation above, resulting in the FDM-ODE-plus-OBC,

$$\dot T_N + H\left(\frac{2}{l_N} - \frac{u}{\kappa}\right) T_N + \frac{2\kappa}{l_N^2}(T_N - T_{N-1}) = S + (q + H\hat T_N)\left(\frac{2}{l_N} - \frac{u}{\kappa}\right). \qquad (2.4\text{-}28)$$
First we note that the equation is consistent: multiplication by $l_N$ and letting $l_N \to 0$ recovers (2.3-3). Next, we write the equation in non-dimensional form ($x \to x/L$, $t \to ut/L$), à la (2.1-8), and let $\Delta x = l_N/L$:

$$Pe\,\dot T_N + Bi\left(\frac{2}{\Delta x} - Pe\right) T_N + 2(T_N - T_{N-1})/\Delta x^2 = Q_1 + (q + Bi\,\hat T_N)\left(\frac{2}{\Delta x} - Pe\right), \qquad (2.4\text{-}29)$$

where $Q_1 = SL^2/\kappa$, and we note two things: (i) $\Delta x \to 0$ recovers (2.4-22), and (ii) the nature of the equation changes at $Pe = 2/\Delta x$; while $(2/\Delta x - Pe) > 0$ for any fixed $Pe$ as $\Delta x \to 0$, it is also true that $(2/\Delta x - Pe) < 0$ for $\Delta x$ fixed and $Pe$ sufficiently large, and the crossover corresponds to a grid Peclet number, $P = u\Delta x/2\kappa$, equal to 1. For comparison, the dimensionless form of the GFEM analog is

$$\frac{1}{6}\left(\Delta x_{N-1}\dot T_{N-1} + 2\Delta x_N \dot T_N\right) + \frac{1}{2} Pe\,(T_N - T_{N-1}) + (T_N - T_{N-1})/\Delta x_N + Bi\,T_N = \frac{1}{2}\Delta x_N Q_1 + q + Bi\,\hat T_N, \qquad (2.4\text{-}30)$$

which, (i) for $\Delta x \to 0$ recovers (2.4-22), and (ii) sees no 'transition' at $P = 1$. These differences can be important. Consider, for example, the advection-dominated case, $Pe \gg 1/\Delta x$ for $\Delta x$ fixed [and $Bi = O(1)$, $Q_1 = O(1)$]: the FDM equation becomes, approximately, $\dot T_N = Bi\,Pe\,(T_N - \hat T_N)$, and the FEM equation above becomes, again approximately, $\frac{1}{3}(\Delta x_{N-1}\dot T_{N-1} + 2\Delta x_N \dot T_N) + Pe\,(T_N - T_{N-1}) = 0$, which approximates the large-$Pe$ PDE at the outflow, $\partial T/\partial t + Pe\,(\partial T/\partial x) = 0$. Thus, we see the proper asymptotic behavior of the GFEM approximation; for $Pe \to \infty$, the AD equation (i.e., the PDE) approaches the pure hyperbolic equation for which no BC is applicable; the PDE itself applies at the outflow point. The FDM ODE equation above is, of course, then unstable in the following sense: it would yield $T_N \sim e^{Bi\,Pe\,t}$. Even if $H = 0$ ($Bi = 0$), the equation would be poorly-behaved; i.e., $\dot T_N = 0$ implies that $T_N(t) = T_N(0)$, similar to the 'hard'/foolish Dirichlet OBC referred to earlier and discussed in detail later (Section 2.6.2). See also Smith (1980), who shows that this OBC is a 'wiggle-maker' even at steady state.
So, another nice feature of the NBC/OBC is that it better mimics the 'physics' of the problem and over all ranges of the parameters/data. It does so by approximating the time-independent BC (for constant $H$, $\hat T$, etc.) via a stable ODE in such a way that, while $T_N$ varies in time, it does so in such a manner that the NBC is always and stably, although approximately, satisfied. In this case, however, the 'physics' could also be well-approximated via the FDM simply by using an upwind approximation to $u\,\partial T/\partial x$ at the outlet (as indeed is the method 'chosen' by the GFEM); if $u(T_{N+1} - T_{N-1})/2l_N$ is replaced by $u(T_N - T_{N-1})/l_N$, then the FDM ODE becomes

$$\dot T_N + \frac{u\,(T_N - T_{N-1})}{l_N} + \frac{2\kappa}{l_N^2}(T_N - T_{N-1}) + \frac{2H}{l_N}(T_N - \hat T_N) = S + \frac{2}{l_N}\,q, \qquad (2.4\text{-}31)$$

which is stable for all values of $Pe$ and is, in fact, the (wiggle-free) lumped mass version of the GFEM equation. This is but one example of a situation in which FDM modelers could benefit by looking to the FEM for guidance in selecting BC's.
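The contrast between the two FDM outflow equations can be seen from the coefficient that multiplies $T_N$ when each is written as $\dot T_N = \lambda T_N + \ldots$ with $T_{N-1}$ held fixed. The sketch below evaluates that 'isolated' rate for the image-point form (2.4-28) and the upwind form (2.4-31); the function name and the sample data are illustrative assumptions, and a positive $\lambda$ signals exponential growth.

```python
def isolated_rates(u, kappa, l, H):
    """Coefficient lam in dT_N/dt = lam * T_N + (terms not involving T_N).

    Returns (image_point, upwind), read off from (2.4-28) and (2.4-31)
    respectively with T_{N-1} and all data held fixed.
    """
    image_point = -(2.0 * kappa / l**2 + H * (2.0 / l - u / kappa))
    upwind = -(u / l + 2.0 * kappa / l**2 + 2.0 * H / l)
    return image_point, upwind


# Sample data (assumed for illustration): fixed mesh, H = 1, two diffusivities.
l, H, u = 0.1, 1.0, 1.0
for kappa in (1.0, 1.0e-3):
    grid_peclet = u * l / (2.0 * kappa)
    lam_image, lam_upwind = isolated_rates(u, kappa, l, H)
    print(f"P = {grid_peclet:7.2f}: image point lam = {lam_image:9.2f}, "
          f"upwind lam = {lam_upwind:9.2f}")
```

For the small-diffusivity case (grid Peclet number 50) the image-point rate is large and positive, while the upwind rate stays negative for any positive data.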
A final remark: if an FDM was a priori generated to solve the purely hyperbolic equation, $T_t + uT_x = 0$, surely no silly OBC of the form discussed above would arise, because the analyst would know that an approximation of the PDE itself is required at the exit point. But an analyst might fall into the trap discussed above if s/he was 'thinking parabolic when writing the code,' i.e., $\kappa > 0$, in which case the image point derivation of an OBC is natural and useful, and the trap could later catch either the analyst or (more probably and probably more dangerously) an unwary user who might innocently set $\kappa = 0$ (or simply 'very small') in order to solve a pure advection problem.

A final final remark for the FDM community: if the image point (wiggle-making, at large $P$) BC of (2.4-27) with $H = q = 0$ is employed, with a forward Euler timestepping scheme (see Section 2.7.2e), then the OBC wiggles can be eliminated (precluded) by adding the BTD term to the diffusion term and operating at or near the stability limit, a change that actually converts the BAD OBC to the good one.

We conclude by returning to the GFEM and introduce the reader to what is often a better OBC than $\partial T/\partial x = 0$. While it is easy to state and easy to implement in a GFEM code, it is very difficult to analyze, which has led to its being called a 'fuzzy boundary condition' (FBC) by Sani and Gresho (1994), and a 'no BC boundary condition' by Griffiths (1997). Stated in words, it is simply this: for nodes on the open boundary (only), do not integrate the diffusion term by parts when generating the GFEM equations.
That such a simple procedure seems to leave the AD equation without any BC at the open boundary was first noted by Sani and Gresho (1994), and remarked upon by two mathematicians who later analyzed it, as follows: 'It is our feeling that it should be referred to as the "no BC boundary condition" as this more accurately describes the situation, since within a purely continuous setting (as opposed to finite element approximations) the weak formulation is invalid because it is equivalent to not setting any BC at $x = L$ and the governing equations cannot therefore isolate a unique solution' (Griffiths, 1997), and 'At the level of the partial differential equation it is of course impossible to simply drop a boundary condition. However the discrete problem in Reference 1 yields perfectly well-defined solutions which appear to be better than those obtained with more "classical" choice of boundary conditions' (Renardy, 1997). Thus, for finite $h$ (i.e., any result from a finite element code), the 'new' BC yields unique (and often surprisingly good) results, but the problem actually becomes ill posed (non-unique) for $h \to 0$, thus causing us to call it a fuzzy BC.

Reference 1 is Papanastasiou et al. (1992), in which the so-called 'free boundary condition' was put forth and demonstrated (on non-trivial 2D problems, in fact). It was also put forth even earlier by Frind (1988), who called it a 'free exit boundary condition'. Whereas we introduced it above by stating that integration by parts is waived at open boundary nodes, both of the above references derived it in an equivalent, but perhaps slightly more confusing, manner; viz., they did perform the integration by parts, but then put the resulting boundary integral terms back onto the LHS as 'unknown' quantities. It is also worth mentioning that Frind used (only) linear basis functions (for which $\nabla^2 T_h$ vanishes identically) and Papanastasiou et al. used (only) quadratic basis functions.
Frind even tested the pure diffusion case, with the disappointing result, later analyzed and explained by Sani and Gresho (1994), that the initial condition at $x = L$ also served as a BC there. He thus suggested, and Griffiths later 'proved', that there should be some advection present in order that the new OBC work 'well'. (Sani and Gresho also showed failure of the new BC for steady state diffusion.)
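For linear elements in 1D the free OBC is easy to try. The sketch below is an illustration under assumed data, not a reproduction of any cited computation: steady advection-diffusion on a uniform mesh with a constant source $S$, a Dirichlet inflow value $T(0) = 0$, and GFEM equations at interior nodes. At the outflow node, the conventional NBC ($\partial T/\partial x = 0$) keeps the diffusion term from integration by parts, whereas the free OBC restores the boundary term, which for linears cancels the diffusion contribution in that one equation. Far from the outlet the solution should behave like $T = Sx/u$.

```python
import numpy as np


def solve_steady(outflow, N=20, u=1.0, kappa=0.05, S=1.0, L=1.0):
    """Steady 1D GFEM (linear elements, uniform mesh) with T(0) = 0.

    outflow = "nbc"  : homogeneous Neumann (do-nothing) outflow equation
    outflow = "free" : diffusion term not integrated by parts at the last node
    Returns (x, T).
    """
    l = L / N
    A = np.zeros((N + 1, N + 1))
    b = np.zeros(N + 1)
    A[0, 0] = 1.0                                  # Dirichlet inflow, T_0 = 0
    for i in range(1, N):                          # interior Galerkin equations
        A[i, i - 1] = -u / 2.0 - kappa / l
        A[i, i] = 2.0 * kappa / l
        A[i, i + 1] = u / 2.0 - kappa / l
        b[i] = l * S
    # Outflow node: the free OBC drops the diffusion contribution here.
    kappa_out = kappa if outflow == "nbc" else 0.0
    A[N, N - 1] = -u / 2.0 - kappa_out / l
    A[N, N] = u / 2.0 + kappa_out / l
    b[N] = l * S / 2.0
    return np.linspace(0.0, L, N + 1), np.linalg.solve(A, b)


x, T_nbc = solve_steady("nbc")
_, T_free = solve_steady("free")
print("outflow value, do-nothing NBC:", T_nbc[-1])
print("outflow value, free OBC      :", T_free[-1])
print("undisturbed value S*L/u      :", 1.0)
```

With these data the free OBC reproduces $T = Sx/u$ at every node, while the do-nothing condition pulls the exit value down by roughly $\kappa S/u^2$ through a thin outflow boundary layer. Setting $u = 0$ makes the free-OBC system singular, in line with the reported failure for steady-state diffusion.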
Thus, whereas it is a demonstrably effective OBC and better than $\partial T/\partial x = 0$ (see the numerical results in Frind, Papanastasiou et al., and Griffiths), and is thus recommended as perhaps the very best way to treat outflow boundaries, it is still slightly mysterious as to how and why it works, although Griffiths has done an excellent job of unravelling at least most of the mystery, albeit with some non-'classical' results; e.g., 'It does not converge in the conventional sense, but it does converge to something, and this paper tells what that something is' (D.F. Griffiths, 1997, personal communication). What is it? Unfortunately, one thing that it is, is that it is 'element-dependent'; the results for linears are different than those for quadratics, which are different than those for cubics, etc., with higher-order elements giving higher-order (better) results. Part of the reason for this is that the hidden/fuzzy BC is actually not applied at the boundary, but somewhere within the element that contains the boundary node (nodes in multi-D), although neither Griffiths nor Renardy studied any but 1D. (In fact, Griffiths is currently, at the time of writing, working on the 2D case, which appears to be rather more difficult, with results that may be more fuzzy.)

We conclude by summarizing a few of Griffiths's (1997) results, and we do so for the much more common case of zero source terms ($S = 0$), although both Griffiths and Renardy also obtained results with $S \ne 0$.

(i)
$$\kappa\,\frac{\partial T}{\partial x}\bigg|_{x=L} = \int_{x_l}^{x=L} \psi(x)\,(\cdots)\,dx, \qquad (2.4\text{-}32)$$

where the integration is over only the last element, and $\psi(x)$ is an 'appropriate' linear combination of the FEM basis functions (and is thus 'element-dependent'). Clearly the RHS of (2.4-32) is 'small' (because $L - x_l$ is) and thus the 'free' OBC is actually quite close to $\partial T/\partial x = 0$.
(ii) The error is $O[\kappa(\kappa + h)^p]$ for $p$th-order elements ($p = 1$ for linears), whereas that from the conventional GFEM NBC OBC ($\partial T/\partial x = 0$) adds a term like $h^{p+1}$ to the above error; it is noticeably larger for the convection-dominated case, defined here by very small $\kappa$ ($\kappa \ll h$).

(iii) The general ('non-standard') OBC is

$$\left(\frac{\partial}{\partial t} + u\frac{\partial}{\partial x}\right)\frac{\partial^{p-1} T}{\partial x^{p-1}} = 0 \qquad (2.4\text{-}33)$$

at $x = \xi$, where $\xi$ is somewhere in the last element. [This result, for $p = 2$, was also independently derived by Renardy (1997).]

(iv) For $u$ sufficiently smooth, this OBC can also be stated as

$$\frac{\partial^{p+1} T}{\partial x^{p+1}} = 0 \quad \text{at } x = \xi. \qquad (2.4\text{-}34)$$

Thus, as mentioned at the outset, it is easier to apply this BC than to understand it. So our final remark is: try it; you will probably like it (usually).

2.4.2 Two Dimensions

In fact, not very much more need be added in the multi-dimensional case. The goals are the same and, for the most part, so is the method of implementation.
The analog of (2.4-1) for 2D can be obtained from the boundary two-patch equations presented earlier, (2.3-30) and (2.3-31). We take the latter case for simplicity; i.e., as an OBC, (2.3-31) with $q = H = S = 0$ gives

$$\frac{1}{9}\left[\frac{\dot T_{SW} + \dot T_{NW}}{2} + (\dot T_S + \dot T_N) + 2(\dot T_W + 2\dot T_0)\right] + \frac{u}{6}\left(\frac{T_S - T_{SW}}{l} + 4\,\frac{T_0 - T_W}{l} + \frac{T_N - T_{NW}}{l}\right) + \frac{v}{3}\left(\frac{T_{NW} - T_{SW}}{2h} + 2\,\frac{T_N - T_S}{2h}\right) + \frac{\kappa}{3l}\left(\frac{T_S - T_{SW}}{l} + 4\,\frac{T_0 - T_W}{l} + \frac{T_N - T_{NW}}{l}\right) = \frac{\kappa}{3}\left(\frac{T_{NW} - 2T_W + T_{SW}}{h^2} + 2\,\frac{T_S - 2T_0 + T_N}{h^2}\right), \qquad (2.4\text{-}35)$$

an ODE that also represents the energy-balance-plus-NBC/OBC at node 0 that approximates $\partial T/\partial x = 0$ there, but achieves $\partial T/\partial x = 0$ only as $l \to 0$. Again, even for advection-dominated flow, this OBC equation is effective because it does not interfere with the passage of the advective flux through $\Gamma_N$. If the limiting condition, $\kappa = 0$, applies, then (2.4-35) automatically becomes the appropriate hyperbolic equation at the exit,

$$\frac{1}{9}\left[\frac{\dot T_{SW} + \dot T_{NW}}{2} + (\dot T_S + \dot T_N) + 2(\dot T_W + 2\dot T_0)\right] + \frac{u}{6}\left(\frac{T_S - T_{SW}}{l} + 4\,\frac{T_0 - T_W}{l} + \frac{T_N - T_{NW}}{l}\right) + \frac{v}{3}\left(\frac{T_{NW} - T_{SW}}{2h} + 2\,\frac{T_N - T_S}{2h}\right) = S, \qquad (2.4\text{-}36)$$

which properly approximates $\partial T/\partial t + \mathbf{u}\cdot\nabla T = S$ at the outflow. It is noteworthy that this PDE (not BC) is, as are the NBC's for $\kappa > 0$, built-in to a GFEM code; if $\kappa \ne 0$ and a Robin BC à la (2.1-3) is appropriate, then the code user need only input the Robin data ($H$, $q$, $\hat T$). If a passive OBC is desired, then the code user simply 'does nothing'; i.e., homogeneous Neumann data are automatically applied if no data are supplied. Finally, if the pure advection case is being run, then the user again 'does nothing' (except of course set $\kappa = 0$) and the GFEM will automatically deliver (2.4-36) at outflow points.

Perhaps the largest difference in 2D is that $\nabla T$ and $\mathbf{u}$ are generally not parallel, neither to each other nor to the unit normal vector at the domain exit, a situation that is conceivably more difficult.
This is because the NBC always involves $\kappa\,\mathbf{n}\cdot\nabla T$, the normal flux on $\Gamma_N$, independently of either the flow direction or the shape of $\Gamma_N$; e.g., consider the situation in Figure 2.4-4 (thanks to J. Leone), with an $x$-directed velocity field and an outflow boundary not parallel to $y$. For example, if the flow were fully developed (plane-parallel/uni-directional) with $v = 0$ and $T$ linear in $y$ (e.g., a steady, stably stratified flow), then the NBC of $\mathbf{n}\cdot\nabla T = n_x(\partial T/\partial x) + n_y(\partial T/\partial y) = 0$ can cause some outflow problems relative to the desired solution, $T = T(y)$, or $\partial T/\partial x = 0$; i.e., the (homogeneous) NBC may force $\partial T/\partial x \neq 0$. Figure 2.4-4(b)-(d), with $u = 1$ and $v = 0$, should all show $T = y$, but only the advection-dominated case comes even close at the outlet, because $\kappa$ is 'small.'
Fig. 2.4-4 (a) Isotherms for non-parallel outflow boundary; (b) Pe = 100; (c) Pe = 10; (d) Pe = 1. [Panels plotted in the $(x, y)$ plane, $0 \le x \le 3$, with isotherm levels $T$ = 0, 0.5, 1.0.]
2.5 SOME NON-GALERKIN RESULTS

We mentioned earlier that there is a significant body of literature in which, for one reason or another, the Galerkin method is not used to generate the discrete FEM equations. Herein, we address several of these for the AD equation, all based on bilinear function approximation. We do not discuss here the many non-Galerkin approximations of the advection terms, most of which involve the intentional introduction of some form of numerical dissipation. Those that we introduce here are: mass lumping, one-point quadrature, and our version of a CVFEM (Control Volume FEM).

2.5.1 The Lumped Mass Approximation

A very common approximation/short-cut in the FEM is that of 'lumping the mass,' the term deriving from solid mechanics in which the mass matrix of (2.2-8) is related to the acceleration of inertial mass; see Archer (1963). The terminology is rather fixed because it is so prevalent, and we retain it here, usually in the analogous context wherein the mass matrix is associated with the time rate of change of a dependent variable. We first attempt to motivate the approximation in three ways: physically, mathematically, and numerically.

Physically, the procedure may be (at least for some elements) motivated by the following assumption/approximation: assume, at least temporarily, that all nodes that share the support of the test function [e.g., $\phi_i$ in (2.2-2)] vary in time at the same rate. In the 4-patch context of Section 2.3.2 [see (2.3-23) or (2.3-24) and Figure 2.2-3] the time rate of change of all eight nodes surrounding node 0 is assumed to be no different from that of node 0 itself. It is a 'temporary' assumption because it must be changed when changing the focus from node $i$ to any other node, a fact that clearly calls such a rationale into question.
Mathematically, to permit equivalent formulation of the FEM equations, it can be described/derived by replacing $\int \phi_i\phi_j$ in (2.2-8) by $\int \phi_i$ on the diagonal, as follows:

$$M_{ij} = \int \phi_i\phi_j \;\longrightarrow\; (M_L)_{ij} = \delta_{ij}\int \phi_i,$$

which is (2.2-28). Note that since $\sum_{j=1}^{N}\phi_j = 1$, this procedure is equivalent to summing the rows of the consistent mass (CM) matrix, $M_{ij}$. We hasten to add, however, that this so-called 'row-sum' technique of mass lumping does not work for all elements, since $\int \phi_i$ is not always positive; the simplest example is the six-node triangle, and the next simplest, the eight-node serendipity (semi-biquadratic) element already discussed in Section 2.3.4. But here we use the row-sum technique because it does work for the four-node bilinear element. By 'work' we mean that the resulting semi-discrete equations generate 'appropriate' ODE's, which converge to the PDE as $h \to 0$.

Numerically, the principal motivation behind the technique is the generation of a diagonal mass matrix (basically a vector) whose inverse is trivial to evaluate and 'compute with' when compared with the CM matrix, which, while sparse and banded, has an inverse that is dense.

We will return to all of these aspects in more detail later, but here our goal is simply to point out the effect of this ad hoc modification on the ODE's of the semi-discrete system, and to emphasize that the result is a non-Galerkin modification. And this is perhaps best done by referring to the element patches discussed earlier, (2.3-23) through (2.3-47). In all of these equations the LM approximation is effected simply by replacing the coupled time derivatives in the first term by $\dot T_0$, as done in most FDM approximations. [In 1D, e.g., in (2.3-8), the analogous procedure is to replace $\dot T_{i-1}$ and $\dot T_{i+1}$ by $\dot T_i$, of course.]

It is important to point out that the LM technique can give quite inconsistent (read 'silly') weighting to the time-derivative terms, especially on a non-uniform mesh. Consider, for example, the 4-patch shown earlier, in which the northwest (NW) element is 'much' larger than the southeast (SE) element (i.e., $h_2 \gg h_1$). The Galerkin weighting is basically area weighting, for which node NW assumes a larger role (is more 'important') than does node SE in the ODE for node 0, but this larger role is felt consistently in all terms when GFEM is used. Lumping the mass removes this consistent weighting from the $\partial T/\partial t$ approximation, with the result that the semi-discrete equations are much less accurate approximations to the PDE. (They are often even less accurate than the simplest FDM on the same mesh!) This inconsistency of mass lumping will show up again later, when we examine phase speed and group velocity, in which case a serious loss of accuracy occurs even on a uniform mesh.

In closing this brief section (see Section 2.7.2b for further LM discussion), it is also worthwhile pointing out (again) that only the CM matrix produces a best least-squares fit to the data; i.e., for (2.2-7), with $T$ given, $\dot T = M^{-1}[f - N(u)T - KT]$ produces an 'acceleration' field, $\partial T^h(\mathbf{x}, t)/\partial t = \sum_j \dot T_j(t)\phi_j(\mathbf{x})$, that is also the $L_2$-projection (see Appendix 3) of $\nabla\cdot(\kappa\cdot\nabla T) + S - \mathbf{u}\cdot\nabla T$ onto the bilinear basis functions. This 'best approximation' property of the time derivative is lost when LM is involved. An interim bottom line on mass lumping, especially for advection-dominated flow: avoid it if at all possible; it is not honest GFEM.
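The row-sum idea can be illustrated with a minimal sketch in 1D, where the linear-element mass matrix is small enough to inspect by hand. The mesh, the function names, and the use of exact fractions are illustrative assumptions, not taken from the text; the element matrix $(l_e/6)[[2, 1], [1, 2]]$ is the standard one for linear elements.

```python
from fractions import Fraction as F

def consistent_mass_1d(lengths):
    """Assemble the consistent mass matrix for 1D linear elements.

    Each element of length le contributes (le/6) * [[2, 1], [1, 2]].
    """
    n = len(lengths) + 1
    M = [[F(0)] * n for _ in range(n)]
    for e, le in enumerate(lengths):
        for a in range(2):
            for b in range(2):
                M[e + a][e + b] += F(le) * (2 if a == b else 1) / 6
    return M

def lump_by_row_sum(M):
    """Row-sum lumping: diagonal entry i is the sum of row i of M."""
    return [sum(row) for row in M]

lengths = [F(1), F(1), F(3)]   # hypothetical non-uniform mesh
M = consistent_mass_1d(lengths)
M_lumped = lump_by_row_sum(M)

print(M[1])       # row of a node between two unit elements: (1, 4, 1)/6
print(M_lumped)   # each entry is the integral of one basis function
```

Because the basis functions sum to one, each lumped entry equals the integral of the corresponding basis function, and the total mass (the sum of the element lengths) is preserved; only its distribution among neighbors changes.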
2.5.2 One-point Quadrature

Another common short-cut to the GFEM is to employ a less accurate (and much less expensive, especially in 3D) integration rule to form the element matrices. Here, for linear (bilinear, trilinear) basis functions, we examine the effect of replacing a sufficiently accurate Gauss-Legendre rule for the integrals in (2.2-8) through (2.2-12) by a simple and cheap one: evaluate (at element level, of course) each integrand at the element centroid and multiply the result by the element area (volume in 3D). The element matrices corresponding to this particular non-Galerkin (or perhaps 'approximate Galerkin,' since the spirit is still the same) modification are presented in Appendix 1. Here, we use those results to 'replicate' some of those presented earlier for GFEM. We remark, however, that this approximation tends to make the diffusion matrix singular [to a $2\Delta x \times 2\Delta y$ 'checkerboard' eigenvector, $(-1)^{i+j}$ in the simplest case], and various 'hourglass corrections' (a term from Lagrangian solid mechanics; see, for example, Goudreau and Hallquist, 1982) often need be applied, a subject we defer until the next chapter.

a. An interior 4-patch of uniform rectangles

The ODE for node 0 in Figure 2.3-3 that approximates the GFEM equation given by (2.3-24) is
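$$\begin{aligned}
&\frac{1}{16}\left[(\dot T_{SW} + \dot T_{NW} + \dot T_{NE} + \dot T_{SE}) + 2(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + 4\dot T_0\right] \\
&+ \frac14\Bigg\{\frac14\left(\frac{u_W + u_{SW}}{2}\,\frac{T_S - T_{SW}}{l} + 2\,\frac{u_{SW} + 2u_W + u_{NW}}{4}\,\frac{T_0 - T_W}{l} + \frac{u_W + u_{NW}}{2}\,\frac{T_N - T_{NW}}{l}\right) \\
&\qquad + \frac12\left(\frac{u_0 + u_S}{2}\,\frac{T_{SE} - T_{SW}}{2l} + 2\,\frac{u_S + 2u_0 + u_N}{4}\,\frac{T_E - T_W}{2l} + \frac{u_0 + u_N}{2}\,\frac{T_{NE} - T_{NW}}{2l}\right) \\
&\qquad + \frac14\left(\frac{u_E + u_{SE}}{2}\,\frac{T_{SE} - T_S}{l} + 2\,\frac{u_{SE} + 2u_E + u_{NE}}{4}\,\frac{T_E - T_0}{l} + \frac{u_E + u_{NE}}{2}\,\frac{T_{NE} - T_N}{l}\right)\Bigg\} \\
&+ \frac14\Bigg\{\frac14\left(\frac{v_S + v_{SW}}{2}\,\frac{T_W - T_{SW}}{h} + 2\,\frac{v_{SW} + 2v_S + v_{SE}}{4}\,\frac{T_0 - T_S}{h} + \frac{v_S + v_{SE}}{2}\,\frac{T_E - T_{SE}}{h}\right) \\
&\qquad + \frac12\left(\frac{v_0 + v_W}{2}\,\frac{T_{NW} - T_{SW}}{2h} + 2\,\frac{v_W + 2v_0 + v_E}{4}\,\frac{T_N - T_S}{2h} + \frac{v_0 + v_E}{2}\,\frac{T_{NE} - T_{SE}}{2h}\right) \\
&\qquad + \frac14\left(\frac{v_N + v_{NW}}{2}\,\frac{T_{NW} - T_W}{h} + 2\,\frac{v_{NW} + 2v_N + v_{NE}}{4}\,\frac{T_N - T_0}{h} + \frac{v_N + v_{NE}}{2}\,\frac{T_{NE} - T_E}{h}\right)\Bigg\} \\
&= \frac{\kappa}{4}\left[\frac{T_{SW} - 2T_S + T_{SE}}{l^2} + 2\,\frac{T_W - 2T_0 + T_E}{l^2} + \frac{T_{NW} - 2T_N + T_{NE}}{l^2}\right] \\
&\quad + \frac{\kappa}{4}\left[\frac{T_{SW} - 2T_W + T_{NW}}{h^2} + 2\,\frac{T_S - 2T_0 + T_N}{h^2} + \frac{T_{SE} - 2T_E + T_{NE}}{h^2}\right] \\
&\quad + \frac{1}{16}\left[(S_{SW} + S_{NW} + S_{NE} + S_{SE}) + 2(S_N + S_S + S_E + S_W) + 4S_0\right].
\end{aligned} \tag{2.5-1}$$

Besides a different averaging of the velocity coefficients, the one-point quadrature approximation has consistently converted the (1 4 1)/6 averaging coefficients (and its tensor product equivalent for 2D via bilinear basis functions) to (1 2 1)/4 for 1D and its tensor product equivalent for 2D. Also, the 1/6 upwind, 2/3 centered, and 1/6 downwind of GFEM becomes 1/4 upwind, 1/2 centered, and 1/4 downwind. It is interesting to note that the advection terms can be rearranged to

$$\begin{aligned}
\frac14\Bigg[&\bar u_{SW}\,\frac{(T_0 - T_W) + (T_S - T_{SW})}{2l} + \bar u_{SE}\,\frac{(T_E - T_0) + (T_{SE} - T_S)}{2l} \\
&+ \bar u_{NE}\,\frac{(T_E - T_0) + (T_{NE} - T_N)}{2l} + \bar u_{NW}\,\frac{(T_0 - T_W) + (T_N - T_{NW})}{2l}\Bigg]
+ \frac14\left[\text{similar terms from the } v\,\partial T/\partial y \text{ portion}\right],
\end{aligned}$$

where $\bar u_{SW} = \frac14(u_{SW} + u_S + u_0 + u_W)$, the average (centroid) $x$-velocity in the southwest element, etc.; i.e., the one-point quadrature approximation, at least on a patch of uniform rectangles, can be usefully interpreted and described as follows: the average velocity in each element is multiplied by the average temperature gradient in the same element, and the results are averaged over the four elements sharing the node in question.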
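The verbal description above (element-average velocity times element-average gradient, averaged over the four elements) translates directly into a few lines of code. The following is a minimal sketch under stated assumptions: a uniform 3 x 3 patch of nodes stored as nested lists indexed `[row][col]` with rows running south to north and columns west to east, so node 0 is `[1][1]`; the function names and sample data are hypothetical.

```python
from fractions import Fraction as F

def one_point_x_advection(u, T, l):
    """x-advection at the center node of a 3x3 patch via one-point quadrature.

    For each of the four elements, the centroid velocity (mean of its four
    nodal values) multiplies the centroid gradient dT/dx; the four products
    are then averaged.
    """
    total = F(0)
    for j in range(2):          # element row: south, north
        for i in range(2):      # element column: west, east
            u_bar = (u[j][i] + u[j][i + 1] + u[j + 1][i] + u[j + 1][i + 1]) / 4
            dTdx = ((T[j][i + 1] - T[j][i]) + (T[j + 1][i + 1] - T[j + 1][i])) / (2 * l)
            total += u_bar * dTdx
    return total / 4

def weighted_121_centered(u_const, T, l):
    """(1 2 1)/4 average of centered differences, valid for constant u."""
    rows = [(T[j][2] - T[j][0]) / (2 * l) for j in range(3)]
    return u_const * (rows[0] + 2 * rows[1] + rows[2]) / 4

T = [[F(3), F(1), F(4)],
     [F(1), F(5), F(9)],
     [F(2), F(6), F(5)]]
u = [[F(2)] * 3 for _ in range(3)]   # constant velocity field
l = F(1, 2)

print(one_point_x_advection(u, T, l), weighted_121_centered(F(2), T, l))
```

For a constant velocity the two functions agree, which is the (1 2 1)/4 weighting quoted in the text; for a variable velocity only the first function applies.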
If the skew-symmetric version of the advection matrix is desired ($\beta = 1/2$), it can be obtained via the skew-symmetric part of the above advection matrix, as discussed in Section 2.2.4. For example, the $u\,\partial T/\partial x$ portion becomes

$$\frac14\left[\frac{\bar u_{SE}T_{SE} - \bar u_{SW}T_{SW}}{2l} + 2\,\frac{\dfrac{\bar u_{SE} + \bar u_{NE}}{2}\,T_E - \dfrac{\bar u_{SW} + \bar u_{NW}}{2}\,T_W}{2l} + \frac{\bar u_{NE}T_{NE} - \bar u_{NW}T_{NW}}{2l}\right],$$

which is 'similar' to the GFEM skew-symmetric version shown earlier.

b. A boundary 2-patch

Similarly, the one-point quadrature approximation that corresponds to (2.3-30), simplified to a uniform mesh, is

$$\begin{aligned}
&\frac{l}{2}\Bigg\{\frac18\left[\dot T_{SW} + \dot T_S + \dot T_N + \dot T_{NW} + 2(\dot T_0 + \dot T_W) - (S_{SW} + S_S + S_N + S_{NW}) - 2(S_0 + S_W)\right] \\
&\quad + \frac14\Bigg(\frac{u_{SW} + u_S + u_0 + u_W}{4}\,\frac{T_S - T_{SW}}{l} + 2\,\frac{u_{SW} + u_S + 2(u_0 + u_W) + u_N + u_{NW}}{8}\,\frac{T_0 - T_W}{l} \\
&\qquad\qquad + \frac{u_W + u_0 + u_N + u_{NW}}{4}\,\frac{T_N - T_{NW}}{l}\Bigg) \\
&\quad + \frac14\Bigg(\frac{v_{SW} + v_S + v_0 + v_W}{4}\,\frac{T_W - T_{SW}}{h} + 2\,\frac{v_{SW} + v_S + 2(v_0 + v_W) + v_N + v_{NW}}{8}\,\frac{T_N - T_S}{2h} \\
&\qquad\qquad + \frac{v_W + v_0 + v_N + v_{NW}}{4}\,\frac{T_{NW} - T_W}{h}\Bigg) \\
&\quad - \frac{\kappa}{2}\left(\frac{T_{SW} - 2T_W + T_{NW}}{h^2} + \frac{T_S - 2T_0 + T_N}{h^2}\right)\Bigg\} \\
&+ \frac{\kappa}{4}\left(\frac{T_S - T_{SW}}{l} + 2\,\frac{T_0 - T_W}{l} + \frac{T_N - T_{NW}}{l}\right)
 + \frac{H}{4}\left[(T_S - \hat T_S) + 2(T_0 - \hat T_0) + (T_N - \hat T_N)\right]
 = \frac14(q_S + 2q_0 + q_N);
\end{aligned} \tag{2.5-2}$$

again a similar modification of the GFEM result.

c. A boundary corner

Moving right along, the one-point quadrature approximation corresponding to (2.3-33) is

$$\begin{aligned}
&\frac{lh}{2(l + h)}\Bigg\{\frac14(\dot T_{SW} + \dot T_S + \dot T_0 + \dot T_W) - \frac14(S_{SW} + S_S + S_0 + S_W)
 + \bar u\,\frac{T_0 - T_W + T_S - T_{SW}}{2l} + \bar v\,\frac{T_0 - T_S + T_W - T_{SW}}{2h}\Bigg\} \\
&+ \kappa\left[\frac{h}{l + h}\,\frac{T_0 - T_W + T_S - T_{SW}}{2l} + \frac{l}{l + h}\,\frac{T_0 - T_S + T_W - T_{SW}}{2h}\right] \\
&+ H\left[\frac{l}{l + h}\,\frac{(T_0 - \hat T_0) + (T_W - \hat T_W)}{2} + \frac{h}{l + h}\,\frac{(T_0 - \hat T_0) + (T_S - \hat T_S)}{2}\right]
 = \frac{l}{l + h}\,\frac{q_W + q_0}{2} + \frac{h}{l + h}\,\frac{q_S + q_0}{2},
\end{aligned} \tag{2.5-3}$$

where $\bar u$ and $\bar v$ are the centroid (average) velocities. As for the GFEM, the $l, h \to 0$ limit is

$$\frac{h}{l + h}\left[\kappa\frac{\partial T}{\partial x} + H(T - \hat T) - q\right] + \frac{l}{l + h}\left[\kappa\frac{\partial T}{\partial y} + H(T - \hat T) - q\right] = 0.$$

2.5.3 Control Volume Finite Element (CVFEM)

A straightforward application of the theory presented in Section 2.2.6 will display the CVFEM and permit some interesting comparisons with GFEM (both honest and via the one-point quadrature approximation). However, as mentioned earlier, there is not yet available a computational comparison on advection-dominated flows (to our knowledge), although Comini et al. (1996) come close. It will become reasonably clear, though, that the CVFEM has more similarities with (a lowest-order) GFEM than differences. We proceed as before, beginning with an interior four-patch and ending with an exposition of NBC's and OBC's for the CVFEM, after mentioning again that only on simple meshes of the type shown below are $M$ and $K_D$ symmetric.

a. An interior 4-patch

We begin with a new sketch, shown in Figure 2.5-1, for the CVFEM on rectangles (as for GFEM, distorted meshes and isoparametric elements are probably best left to the computer).

Fig. 2.5-1 A bilinear 4-patch with control volume.

A direct application of (2.2-38), with $S_j = 0$ for simplicity, to the CV for node 0 yields the following, in which the matrix notation of (2.2-7) is employed:
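1. $$M\dot T|_0 = \frac{1}{64}\Big\{l_1h_1\dot T_{SW} + l_2h_1\dot T_{SE} + l_2h_2\dot T_{NE} + l_1h_2\dot T_{NW} + 3\big[(l_1 + l_2)(h_1\dot T_S + h_2\dot T_N) + (h_1 + h_2)(l_1\dot T_W + l_2\dot T_E)\big] + 9(l_1 + l_2)(h_1 + h_2)\dot T_0\Big\};$$

2. $$\begin{aligned}
K_DT|_0 = {}&\frac{\kappa}{8}\Big\{\frac{h_1}{l_1}[3(T_0 - T_W) + (T_S - T_{SW})] + \frac{h_1}{l_2}[3(T_0 - T_E) + (T_S - T_{SE})] \\
&\quad + \frac{h_2}{l_1}[3(T_0 - T_W) + (T_N - T_{NW})] + \frac{h_2}{l_2}[3(T_0 - T_E) + (T_N - T_{NE})]\Big\} \\
&+ \frac{\kappa}{8}\Big\{\frac{l_1}{h_1}[3(T_0 - T_S) + (T_W - T_{SW})] + \frac{l_2}{h_1}[3(T_0 - T_S) + (T_E - T_{SE})] \\
&\quad + \frac{l_1}{h_2}[3(T_0 - T_N) + (T_W - T_{NW})] + \frac{l_2}{h_2}[3(T_0 - T_N) + (T_E - T_{NE})]\Big\};
\end{aligned}$$

3. $N(u)T|_0 = N_x(uT) + N_y(vT) \approx \dfrac{\partial}{\partial x}(uT) + \dfrac{\partial}{\partial y}(vT)$, where $u = \sum_j u_j\phi_j$ and $v = \sum_j v_j\phi_j$:

$$\begin{aligned}
N(uT)|_0 = {}&\frac{1}{24}\Bigg\{h_1\Bigg[\frac{u_S + u_{SE}}{2}\frac{T_S + T_{SE}}{2} - \frac{u_S + u_{SW}}{2}\frac{T_S + T_{SW}}{2}
 + 2\left(\frac{u_0 + u_E}{2}\frac{T_S + T_{SE}}{2} - \frac{u_0 + u_W}{2}\frac{T_S + T_{SW}}{2}\right) \\
&\qquad\quad + 2\left(\frac{u_S + u_{SE}}{2}\frac{T_0 + T_E}{2} - \frac{u_S + u_{SW}}{2}\frac{T_0 + T_W}{2}\right)\Bigg]
 + 7(h_1 + h_2)\left[\frac{u_0 + u_E}{2}\frac{T_0 + T_E}{2} - \frac{u_0 + u_W}{2}\frac{T_0 + T_W}{2}\right] \\
&\quad + h_2\Bigg[\frac{u_N + u_{NE}}{2}\frac{T_N + T_{NE}}{2} - \frac{u_N + u_{NW}}{2}\frac{T_N + T_{NW}}{2}
 + 2\left(\frac{u_0 + u_E}{2}\frac{T_N + T_{NE}}{2} - \frac{u_0 + u_W}{2}\frac{T_N + T_{NW}}{2}\right) \\
&\qquad\quad + 2\left(\frac{u_N + u_{NE}}{2}\frac{T_0 + T_E}{2} - \frac{u_N + u_{NW}}{2}\frac{T_0 + T_W}{2}\right)\Bigg]\Bigg\} \\
&+ \frac{1}{24}\Bigg\{l_1\Bigg[\frac{v_W + v_{NW}}{2}\frac{T_W + T_{NW}}{2} - \frac{v_W + v_{SW}}{2}\frac{T_W + T_{SW}}{2}
 + 2\left(\frac{v_0 + v_N}{2}\frac{T_W + T_{NW}}{2} - \frac{v_0 + v_S}{2}\frac{T_W + T_{SW}}{2}\right) \\
&\qquad\quad + 2\left(\frac{v_W + v_{NW}}{2}\frac{T_0 + T_N}{2} - \frac{v_W + v_{SW}}{2}\frac{T_0 + T_S}{2}\right)\Bigg]
 + 7(l_1 + l_2)\left[\frac{v_0 + v_N}{2}\frac{T_0 + T_N}{2} - \frac{v_0 + v_S}{2}\frac{T_0 + T_S}{2}\right] \\
&\quad + l_2\Bigg[\frac{v_E + v_{NE}}{2}\frac{T_E + T_{NE}}{2} - \frac{v_E + v_{SE}}{2}\frac{T_E + T_{SE}}{2}
 + 2\left(\frac{v_0 + v_N}{2}\frac{T_E + T_{NE}}{2} - \frac{v_0 + v_S}{2}\frac{T_E + T_{SE}}{2}\right) \\
&\qquad\quad + 2\left(\frac{v_E + v_{NE}}{2}\frac{T_0 + T_N}{2} - \frac{v_E + v_{SE}}{2}\frac{T_0 + T_S}{2}\right)\Bigg]\Bigg\},
\end{aligned}$$

which is (as required) symmetric in velocity and temperature and is a combination of centered differences.

These are the contributions to the CVFEM version of (2.2-7) for node 0 and can be compared with the GFEM equivalent in Section 2.3.2. Similar to the GFEM, we can generate an FDM form via multiplication by the inverse of the (same!) lumped mass matrix, $M_L = (l_1 + l_2)(h_1 + h_2)/4$, which now is simply the size of the CV, and rearrange, à la (2.3-23) and, especially, the conservation form ($\beta = 1$) that came later; see (2.3-28). With $A_{SW} = l_1h_1$, $A_{SE} = l_2h_1$, $A_{NE} = l_2h_2$, $A_{NW} = l_1h_2$, and $A_T$ their sum, the result is

$$\begin{aligned}
&\frac{1}{16}\Bigg[\frac{A_{SW}}{A_T}\dot T_{SW} + \frac{A_{NW}}{A_T}\dot T_{NW} + \frac{A_{NE}}{A_T}\dot T_{NE} + \frac{A_{SE}}{A_T}\dot T_{SE} \\
&\qquad + 3\left(\frac{A_{NE} + A_{NW}}{A_T}\dot T_N + \frac{A_{SE} + A_{SW}}{A_T}\dot T_S + \frac{A_{SE} + A_{NE}}{A_T}\dot T_E + \frac{A_{SW} + A_{NW}}{A_T}\dot T_W\right) + 9\dot T_0\Bigg] \\
&+ \frac{1}{24}\,\frac{2h_1}{h_1 + h_2}\Bigg[\frac{\dfrac{u_S + u_{SE}}{2}\dfrac{T_S + T_{SE}}{2} - \dfrac{u_S + u_{SW}}{2}\dfrac{T_S + T_{SW}}{2}}{(l_1 + l_2)/2}
 + 2\,\frac{\dfrac{u_E + u_{SE}}{2}\dfrac{T_E + T_{SE}}{2} - \dfrac{u_W + u_{SW}}{2}\dfrac{T_W + T_{SW}}{2}}{(l_1 + l_2)/2} \\
&\qquad + 2\,\frac{\dfrac{u_0 + u_{SE}}{2}\dfrac{T_0 + T_{SE}}{2} - \dfrac{u_0 + u_{SW}}{2}\dfrac{T_0 + T_{SW}}{2}}{(l_1 + l_2)/2}
 + 2\,\frac{\dfrac{u_S + u_E}{2}\dfrac{T_S + T_E}{2} - \dfrac{u_S + u_W}{2}\dfrac{T_S + T_W}{2}}{(l_1 + l_2)/2}
 - 2\,\frac{u_{SE}T_{SE} - u_{SW}T_{SW}}{l_1 + l_2}\Bigg] \\
&+ \frac{1}{12}\Bigg[7\,\frac{\dfrac{u_0 + u_E}{2}\dfrac{T_0 + T_E}{2} - \dfrac{u_0 + u_W}{2}\dfrac{T_0 + T_W}{2}}{(l_1 + l_2)/2} - 2\,\frac{u_ET_E - u_WT_W}{l_1 + l_2}\Bigg] \\
&+ \frac{1}{24}\,\frac{2h_2}{h_1 + h_2}\Bigg[\frac{\dfrac{u_N + u_{NE}}{2}\dfrac{T_N + T_{NE}}{2} - \dfrac{u_N + u_{NW}}{2}\dfrac{T_N + T_{NW}}{2}}{(l_1 + l_2)/2}
 + 2\,\frac{\dfrac{u_E + u_{NE}}{2}\dfrac{T_E + T_{NE}}{2} - \dfrac{u_W + u_{NW}}{2}\dfrac{T_W + T_{NW}}{2}}{(l_1 + l_2)/2} \\
&\qquad + 2\,\frac{\dfrac{u_0 + u_{NE}}{2}\dfrac{T_0 + T_{NE}}{2} - \dfrac{u_0 + u_{NW}}{2}\dfrac{T_0 + T_{NW}}{2}}{(l_1 + l_2)/2}
 + 2\,\frac{\dfrac{u_N + u_E}{2}\dfrac{T_N + T_E}{2} - \dfrac{u_N + u_W}{2}\dfrac{T_N + T_W}{2}}{(l_1 + l_2)/2}
 - 2\,\frac{u_{NE}T_{NE} - u_{NW}T_{NW}}{l_1 + l_2}\Bigg] \\
&+ \text{'vertical advection'} \\
&= \frac{\kappa}{8}\,\frac{2}{l_1 + l_2}\Bigg[\frac{2h_1}{h_1 + h_2}\left(\frac{T_{SE} - T_S}{l_2} - \frac{T_S - T_{SW}}{l_1}\right) + 6\left(\frac{T_E - T_0}{l_2} - \frac{T_0 - T_W}{l_1}\right) + \frac{2h_2}{h_1 + h_2}\left(\frac{T_{NE} - T_N}{l_2} - \frac{T_N - T_{NW}}{l_1}\right)\Bigg] \\
&\quad + \frac{\kappa}{8}\,\frac{2}{h_1 + h_2}\Bigg[\frac{2l_1}{l_1 + l_2}\left(\frac{T_{NW} - T_W}{h_2} - \frac{T_W - T_{SW}}{h_1}\right) + 6\left(\frac{T_N - T_0}{h_2} - \frac{T_0 - T_S}{h_1}\right) + \frac{2l_2}{l_1 + l_2}\left(\frac{T_{NE} - T_E}{h_2} - \frac{T_E - T_{SE}}{h_1}\right)\Bigg],
\end{aligned} \tag{2.5-4}$$

which merits the following Remarks:

(1) The vertical advection terms can be written by inspection/symmetry.

(2) The advection terms were judiciously rearranged (as was done earlier for the $\beta = 1$ GFEM case) so that pointwise products (fluxes) are present everywhere.

(3) The $x$-advection terms are interpretable as follows: the $2h_1/(h_1 + h_2)$ terms contribute $\frac18\,\partial(uT)/\partial x$ from the southern nodes, the $2h_2/(h_1 + h_2)$ terms contribute a like amount from the northern nodes, and the remaining $\frac34\,\partial(uT)/\partial x$ is contributed by the center nodes; cf. (2.3-28).

(4) The similarity with the GFEM conservation form is obvious; only the averaging is changed.

(5) The source term can be easily incorporated by replacing each $\dot T_j$ by $(\dot T_j - S_j)$, since they share the same mass matrix.

(6) The diffusion term again is in the form $-\nabla\cdot\mathbf{q}$, where $\mathbf{q} = -\kappa\nabla T$.

To make further 'progress,' we simplify the results to the simplest case: constant velocity on a uniform mesh. The result is

$$\begin{aligned}
&\frac{1}{64}\left[(\dot T_{SW} + \dot T_{NW} + \dot T_{NE} + \dot T_{SE}) + 6(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + 36\dot T_0\right] \\
&+ \frac{u}{8}\left[\frac{T_{SE} - T_{SW}}{2l} + 6\,\frac{T_E - T_W}{2l} + \frac{T_{NE} - T_{NW}}{2l}\right]
 + \frac{v}{8}\left[\frac{T_{NW} - T_{SW}}{2h} + 6\,\frac{T_N - T_S}{2h} + \frac{T_{NE} - T_{SE}}{2h}\right] \\
&= \frac{\kappa}{8}\left[\frac{T_{SW} - 2T_S + T_{SE}}{l^2} + 6\,\frac{T_W - 2T_0 + T_E}{l^2} + \frac{T_{NW} - 2T_N + T_{NE}}{l^2}\right] \\
&\quad + \frac{\kappa}{8}\left[\frac{T_{SW} - 2T_W + T_{NW}}{h^2} + 6\,\frac{T_S - 2T_0 + T_N}{h^2} + \frac{T_{SE} - 2T_E + T_{NE}}{h^2}\right],
\end{aligned} \tag{2.5-5}$$

which is to be compared with (2.3-24) for GFEM and (2.5-1) for one-point GFEM, after setting the velocity to a constant there. Before making any obvious remarks other than noting that the Galerkin weighting (1 4 1)/6 has gone over to (1 6 1)/8, let us first display the analogous 1D CVFEM results, easily derived from (2.5-5):

$$\frac18\left[\frac{2l_1}{l_1 + l_2}\dot T_W + 6\dot T_0 + \frac{2l_2}{l_1 + l_2}\dot T_E\right] + u\,\frac{T_E - T_W}{l_1 + l_2} = \kappa\,\frac{2}{l_1 + l_2}\left(\frac{T_E - T_0}{l_2} - \frac{T_0 - T_W}{l_1}\right), \tag{2.5-6}$$

which is a seemingly small change from the GFEM equation, (2.3-9), and it may be interesting to examine this 'small change' (mass matrix only) from the point of view of local conservation, which is the 'strong point' of control volume methods. To this end, consider the local solution, $T^h(x)$, shown in Figure 2.5-2. While the support of the GFEM test function, $\phi_0$, spans all of both elements, that of CVFEM, $\psi_0$, spans one half of each. The change in the mass matrix, from $\frac16[l_1, 2(l_1 + l_2), l_2]$ to $\frac18[l_1, 3(l_1 + l_2), l_2]$, is just that change needed to conserve energy in $\Omega_0 = [x_0 - l_1/2,\, x_0 + l_2/2]$.

Fig. 2.5-2 Piecewise linear function, GFEM test function ($\phi_0$), and CVFEM test function ($\psi_0$).

If the mass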
is lumped, then both GFEM and CVFEM become identical, and neither one represents a proper local energy balance, since the total energy in the CV (via the piecewise-linear representation) is exactly $\frac18[l_1T_W + 3(l_1 + l_2)T_0 + l_2T_E]$; it is not $[(l_1 + l_2)/2]T_0$. Also, whereas the GFEM diffusive flux is discontinuous at element boundaries, the CVFEM flux is continuous at CV boundaries, even though it too is discontinuous at element boundaries. The CVFEM 'cleverly' arranges local flux continuity and (using consistent mass) local energy conservation, two physically appealing attributes that are sacrificed via Galerkin weighting. Note that this comparison is at odds with those recently published by Comini et al. (1991, 1992), who assert that GFEM also displays element-level balances, which clashes with the opinion that only a weighted-residual method that utilizes discontinuous test functions can generate element-level balances. Both viewpoints are presented in Appendix 2.

Remark: A quadratically conserving CVFEM and the associated guaranteed stable ODE's do not appear to be easily derived, even if desired, since by construction/definition only the divergence form (not skew-symmetric) of advection is permitted. This could be construed as a disadvantage.

The (interim) bottom line on these last two methods is this: the (1 4 1)/6 averaging inherent in the GFEM is replaced by (1 2 1)/4 for the one-point quadrature approximation, and by (1 6 1)/8 for the CVFEM; the former increases the influence of 'neighbors,' and the latter decreases it. In 2D, the tensor product of these terms applies, at least to the mass matrix. In fact, in this simple context, we have: (GFEM) = $\frac13$ (GFEM via one-point) + $\frac23$ (CVFEM), or, (CVFEM) = $\frac32$ (GFEM) $-$ $\frac12$ (GFEM via one-point). A few further comparisons will be presented later, when we focus on pure advection and the associated problems (errors) associated with dispersion (phase speed and group velocity).

b.
A boundary 2-patch

The key new feature here will be the NBC associated with the CVFEM, for which purpose we introduce the following 2-patch in Figure 2.5-3.

Fig. 2.5-3 A boundary 2-patch with control volume.

Application of (2.2-38) to the CV of
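node 0 yields, with due account taken of the Robin BC of (2.1-3) on the two segments of $\Gamma_0$ that are coincident with $\Gamma_N$ for the $\kappa\,\partial T/\partial n$ terms,

$$\begin{aligned}
&\frac{l}{64}\left[h_1(3\dot T_S + \dot T_{SW}) + 3(h_1 + h_2)(3\dot T_0 + \dot T_W) + h_2(3\dot T_N + \dot T_{NW})\right] \\
&+ \frac{h_1}{24}\Bigg\{7\left[u_0T_0 - \frac{u_0 + u_W}{2}\frac{T_0 + T_W}{2}\right] + 2(u_0T_S + u_ST_0) + u_ST_S - \frac{u_S + u_{SW}}{2}\frac{T_S + T_{SW}}{2} \\
&\qquad\quad - 2\left(\frac{u_S + u_{SW}}{2}\frac{T_0 + T_W}{2} + \frac{u_0 + u_W}{2}\frac{T_S + T_{SW}}{2}\right)\Bigg\} \\
&+ \frac{h_2}{24}\Bigg\{7\left[u_0T_0 - \frac{u_0 + u_W}{2}\frac{T_0 + T_W}{2}\right] + 2(u_0T_N + u_NT_0) + u_NT_N - \frac{u_N + u_{NW}}{2}\frac{T_N + T_{NW}}{2} \\
&\qquad\quad - 2\left(\frac{u_N + u_{NW}}{2}\frac{T_0 + T_W}{2} + \frac{u_0 + u_W}{2}\frac{T_N + T_{NW}}{2}\right)\Bigg\} \\
&+ \frac{l}{24}\Bigg\{7\left[\frac{v_0 + v_N}{2}\frac{T_0 + T_N}{2} - \frac{v_0 + v_S}{2}\frac{T_0 + T_S}{2}\right]
 + 2\left[\frac{v_0 + v_N}{2}\frac{T_W + T_{NW}}{2} - \frac{v_0 + v_S}{2}\frac{T_W + T_{SW}}{2}\right] \\
&\qquad\quad + 2\left[\frac{v_W + v_{NW}}{2}\frac{T_0 + T_N}{2} - \frac{v_W + v_{SW}}{2}\frac{T_0 + T_S}{2}\right]
 + \frac{v_W + v_{NW}}{2}\frac{T_W + T_{NW}}{2} - \frac{v_W + v_{SW}}{2}\frac{T_W + T_{SW}}{2}\Bigg\} \\
&- h_1\,\frac{Q_S + 3Q_0}{8} - h_2\,\frac{Q_N + 3Q_0}{8} \\
&= -\frac{\kappa h_1}{8l}[3(T_0 - T_W) + (T_S - T_{SW})] - \frac{\kappa h_2}{8l}[3(T_0 - T_W) + (T_N - T_{NW})] \\
&\quad + \frac{\kappa l}{8h_2}[3(T_N - T_0) + (T_{NW} - T_W)] - \frac{\kappa l}{8h_1}[3(T_0 - T_S) + (T_W - T_{SW})],
\end{aligned} \tag{2.5-7}$$

where we have introduced the short-cut notation $Q_i = q_i - H(T_i - \hat T_i)$ from the Robin BC, in which both $q$ and $\hat T$ are assumed variable and represented (as before) via interpolation using the bilinear basis functions. Since we now know (or at least suspect) that the proper averaging coefficient is the lumped mass matrix on $\Gamma_N$, namely $(h_1 + h_2)/2$, we divide by this quantity and rearrange the result to see the CVFEM ODE + NBC at $\Gamma_N$:

$$\begin{aligned}
&\frac{l}{64}\left[\frac{2h_1}{h_1 + h_2}(3\dot T_S + \dot T_{SW}) + 6(3\dot T_0 + \dot T_W) + \frac{2h_2}{h_1 + h_2}(3\dot T_N + \dot T_{NW})\right] \\
&+ \frac{l}{24}\,\frac{2h_1}{h_1 + h_2}\,\frac{1}{l}\Bigg\{2(u_0T_S + u_ST_0) - 2\left(\frac{u_S + u_{SW}}{2}\frac{T_0 + T_W}{2} + \frac{u_0 + u_W}{2}\frac{T_S + T_{SW}}{2}\right) + u_ST_S - \frac{u_S + u_{SW}}{2}\frac{T_S + T_{SW}}{2}\Bigg\} \\
&+ \frac{7l}{12}\,\frac{1}{l}\left[u_0T_0 - \frac{u_0 + u_W}{2}\frac{T_0 + T_W}{2}\right] \\
&+ \frac{l}{24}\,\frac{2h_2}{h_1 + h_2}\,\frac{1}{l}\Bigg\{2(u_0T_N + u_NT_0) - 2\left(\frac{u_N + u_{NW}}{2}\frac{T_0 + T_W}{2} + \frac{u_0 + u_W}{2}\frac{T_N + T_{NW}}{2}\right) + u_NT_N - \frac{u_N + u_{NW}}{2}\frac{T_N + T_{NW}}{2}\Bigg\} \\
&+ [\text{similar terms in } v \text{ and } T] \\
&- \frac{\kappa l}{8}\,\frac{2}{h_1 + h_2}\left[\frac{3(T_N - T_0) + (T_{NW} - T_W)}{h_2} - \frac{3(T_0 - T_S) + (T_W - T_{SW})}{h_1}\right] \\
&- \frac18\Bigg\{\frac{2h_1}{h_1 + h_2}\left[Q_S - \kappa\,\frac{T_S - T_{SW}}{l}\right] + 6\left[Q_0 - \kappa\,\frac{T_0 - T_W}{l}\right] + \frac{2h_2}{h_1 + h_2}\left[Q_N - \kappa\,\frac{T_N - T_{NW}}{l}\right]\Bigg\} = 0,
\end{aligned} \tag{2.5-8}$$

which is in a simpler form but is still rather complicated. But the main result is the same as it was for the GFEM in (2.3-30); namely, all terms but the last are (in effect) multiplied by $l$ and will decrease in size with mesh refinement. So, analogous to (2.3-31), we simplify the above to the case of constant velocity on a uniform mesh; the result is

$$\begin{aligned}
&\frac{l}{2}\Bigg\{\dot T_0 + \frac{u}{8}\left[\frac{T_S - T_{SW}}{l} + 6\,\frac{T_0 - T_W}{l} + \frac{T_N - T_{NW}}{l}\right] + \frac{v}{4}\,\frac{3(T_N - T_S) + (T_{NW} - T_{SW})}{2h} \\
&\qquad - \frac{\kappa}{4}\left[\frac{T_{NW} - 2T_W + T_{SW}}{h^2} + 3\,\frac{T_N - 2T_0 + T_S}{h^2}\right]\Bigg\} \\
&+ \frac18\Bigg\{\left[\kappa\,\frac{T_S - T_{SW}}{l} + H(T_S - \hat T_S) - q_S\right] + 6\left[\kappa\,\frac{T_0 - T_W}{l} + H(T_0 - \hat T_0) - q_0\right] + \left[\kappa\,\frac{T_N - T_{NW}}{l} + H(T_N - \hat T_N) - q_N\right]\Bigg\} = 0,
\end{aligned} \tag{2.5-9}$$

where we also lumped the mass for variety. Equation (2.5-9) appears to converge to (2.3-3) with mesh refinement, as desired.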
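So again, CVFEM is similar to GFEM; only the averaging coefficients differ. (Probably the Taylor series definition of local error also differs, but probably not by very much; its inherent irrelevance stops us from doing the work.)

c. A boundary corner

For completeness, and to assure the absence of surprises, we present the CVFEM at node 0 in Figure 2.5-4. The equation below, derived from (2.2-38), has been divided by $\int_{\Gamma_0} = (l + h)/2$, as was the case for GFEM in (2.3-33):

$$\begin{aligned}
&\frac{lh}{2(l + h)}\,\frac{9\dot T_0 + 3\dot T_S + 3\dot T_W + \dot T_{SW}}{16} \\
&+ \frac{h}{12(l + h)}\Bigg\{7\left[u_0T_0 - \frac{u_0 + u_W}{2}\frac{T_0 + T_W}{2}\right] + 2\left[u_0T_S + u_ST_0 - \frac{u_0 + u_W}{2}\frac{T_S + T_{SW}}{2} - \frac{u_S + u_{SW}}{2}\frac{T_0 + T_W}{2}\right] \\
&\qquad\qquad + \left[u_ST_S - \frac{u_S + u_{SW}}{2}\frac{T_S + T_{SW}}{2}\right]\Bigg\} \\
&+ \frac{l}{12(l + h)}\Bigg\{7\left[v_0T_0 - \frac{v_0 + v_S}{2}\frac{T_0 + T_S}{2}\right] + 2\left[v_0T_W + v_WT_0 - \frac{v_0 + v_S}{2}\frac{T_W + T_{SW}}{2} - \frac{v_W + v_{SW}}{2}\frac{T_0 + T_S}{2}\right] \\
&\qquad\qquad + \left[v_WT_W - \frac{v_W + v_{SW}}{2}\frac{T_W + T_{SW}}{2}\right]\Bigg\} \\
&+ \frac18\Bigg\{\frac{2h}{l + h}\left[\kappa\,\frac{T_S - T_{SW}}{l} + H(T_S - \hat T_S) - q_S\right]
 + 6\left[\frac{h}{l + h}\,\kappa\,\frac{T_0 - T_W}{l} + \frac{l}{l + h}\,\kappa\,\frac{T_0 - T_S}{h} + H(T_0 - \hat T_0) - q_0\right] \\
&\qquad + \frac{2l}{l + h}\left[\kappa\,\frac{T_W - T_{SW}}{h} + H(T_W - \hat T_W) - q_W\right]\Bigg\} = 0,
\end{aligned} \tag{2.5-10}$$

which is simultaneously the ODE for node 0, an energy balance over the control volume $\Omega_0$, and (above all) an NBC for the CVFEM.

Fig. 2.5-4 A boundary corner with control volume.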
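As a quick arithmetic check on the statement that only the averaging coefficients differ, the three 1D weight triplets quoted in the interim bottom line of Section 2.5.3a can be compared directly. This is only a sketch of that arithmetic; the variable and function names are illustrative and do not come from the text.

```python
from fractions import Fraction as F

# 1D averaging coefficients (neighbor, center, neighbor) on a uniform mesh
gfem      = [F(1, 6), F(4, 6), F(1, 6)]   # Galerkin, consistent mass
one_point = [F(1, 4), F(2, 4), F(1, 4)]   # GFEM via one-point quadrature
cvfem     = [F(1, 8), F(6, 8), F(1, 8)]   # control-volume FEM

def combine(a, x, b, y):
    """Return the weight triplet a*x + b*y."""
    return [a * xi + b * yi for xi, yi in zip(x, y)]

# (GFEM) = 1/3 (one-point) + 2/3 (CVFEM)
gfem_from_others = combine(F(1, 3), one_point, F(2, 3), cvfem)

# (CVFEM) = 3/2 (GFEM) - 1/2 (one-point)
cvfem_from_others = combine(F(3, 2), gfem, F(-1, 2), one_point)

print(gfem_from_others == gfem, cvfem_from_others == cvfem)
```

The identities hold for the 1D triplets, and hence for any 2D term that applies one such triplet across a single direction (the advection and diffusion terms of the constant-coefficient, uniform-mesh patches). They should not be expected to hold entry by entry for the 2D mass stencils, which are tensor products of the triplets.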
The lumped-mass, constant-velocity version of this equation converts the storage term and advection terms to

$$\frac{lh}{2(l + h)}\left\{\dot T_0 + \frac{u}{4}\left[\frac{3(T_0 - T_W)}{l} + \frac{T_S - T_{SW}}{l}\right] + \frac{v}{4}\left[\frac{3(T_0 - T_S)}{h} + \frac{T_W - T_{SW}}{h}\right]\right\},$$

terms that diminish with mesh refinement owing to the coefficient $lh$. Finally, in either the variable or constant coefficient case, the asymptotic result is simply

$$\frac{h}{l + h}\left[\kappa\frac{\partial T}{\partial x} + H(T - \hat T) - q\right] + \frac{l}{l + h}\left[\kappa\frac{\partial T}{\partial y} + H(T - \hat T) - q\right] = O(l^2, h^2),$$

the same as the GFEM and the one-point quadrature results; it is a linear combination of the $x$-direction and $y$-direction Robin BC's at the corner.

A final remark: If in the above CVFEM equations the mass is lumped, then the results would be closer to what is more often seen in the literature, a rather sad circumstance, in that lumping loses what the CV method had gained, i.e., local conservation.

d. OBC's

Little need be said here except perhaps that the CVFEM is close enough to the GFEM at outflow points that the technique that works well there should also do so here; namely, set $q = H = 0$ in the Robin BC and let the code 'do its thing'; the results should usually be acceptable (sometimes even good) for any value of the Peclet number.

e. A nine-node CVFEM

Finally, to emphasize that the finite volume method ('our' way, at least) is inherently a low-order method, we present the results of an analysis using higher-order (biquadratic) approximation. It turns out that since the test functions (piecewise-constants) are (still) low-order, the results are also low-order (second-order), so that all one can get from higher-order basis functions is an expensive second-order method. The 4-patch of nine-node elements in Figure 2.5-5 below is used both to define the control 'volumes' and to facilitate the discussion/analysis, the 'gaps' being shown for clarity only. Shown are the CV's for nodes 0, E, N, and NE, three of which we shall pursue (node E and node N are related by some obvious symmetries).
Fig. 2.5-5 A 4-patch of biquadratic elements with four control volumes.

A direct application
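of (2.2-38) to node 0 gives (the elements are equal rectangles, all coefficients are constant, and $S = 0$ for simplicity), in FDM 'format' (divide the CVFEM equation by $M_0 = A_0 = lh/4 = \Delta x\,\Delta y$):

$$\begin{aligned}
&\frac{1}{576}\Big[256\dot T_0 + 80(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + 25(\dot T_{NE} + \dot T_{NW} + \dot T_{SE} + \dot T_{SW}) - 16(\dot T_{NN} + \dot T_{SS} + \dot T_{EE} + \dot T_{WW}) \\
&\qquad - 5(\dot T_{NEE} + \dot T_{NNE} + \dot T_{NNW} + \dot T_{NWW} + \dot T_{SWW} + \dot T_{SSW} + \dot T_{SSE} + \dot T_{SEE}) + (\dot T_{NNEE} + \dot T_{NNWW} + \dot T_{SSWW} + \dot T_{SSEE})\Big] \\
&+ u\Bigg\{\frac{1}{16}\,\frac{-(T_{SSE} - T_{SSW}) + 5(T_{SE} - T_{SW}) + 16(T_E - T_W) + 5(T_{NE} - T_{NW}) - (T_{NNE} - T_{NNW})}{l} \\
&\qquad - \frac{1}{48}\,\frac{-(T_{SSEE} - T_{SSWW}) + 5(T_{SEE} - T_{SWW}) + 16(T_{EE} - T_{WW}) + 5(T_{NEE} - T_{NWW}) - (T_{NNEE} - T_{NNWW})}{2l}\Bigg\} \\
&+ v\Bigg\{\frac{1}{16}\,\frac{-(T_{NEE} - T_{SEE}) + 5(T_{NE} - T_{SE}) + 16(T_N - T_S) + 5(T_{NW} - T_{SW}) - (T_{NWW} - T_{SWW})}{h} \\
&\qquad - \frac{1}{48}\,\frac{-(T_{NNEE} - T_{SSEE}) + 5(T_{NNE} - T_{SSE}) + 16(T_{NN} - T_{SS}) + 5(T_{NNW} - T_{SSW}) - (T_{NNWW} - T_{SSWW})}{2h}\Bigg\} \\
&= \frac{\kappa}{24}\,\frac{-(T_{SSE} - 2T_{SS} + T_{SSW}) + 5(T_{SE} - 2T_S + T_{SW}) + 16(T_E - 2T_0 + T_W) + 5(T_{NE} - 2T_N + T_{NW}) - (T_{NNE} - 2T_{NN} + T_{NNW})}{\Delta x^2} \\
&\quad + \frac{\kappa}{24}\,\frac{-(T_{NEE} - 2T_{EE} + T_{SEE}) + 5(T_{NE} - 2T_E + T_{SE}) + 16(T_N - 2T_0 + T_S) + 5(T_{NW} - 2T_W + T_{SW}) - (T_{NWW} - 2T_{WW} + T_{SWW})}{\Delta y^2},
\end{aligned} \tag{2.5-11}$$

which is to be compared with the GFEM analog, (2.3-48). It is interesting that only the nearest neighbors are present in the diffusion terms. The 1D version of this result is, for the $x$-direction,

$$\frac{-\dot T_{WW} + 5\dot T_W + 16\dot T_0 + 5\dot T_E - \dot T_{EE}}{24} + u\left(\frac32\,\frac{T_E - T_W}{l} - \frac12\,\frac{T_{EE} - T_{WW}}{2l}\right) = \kappa\,\frac{T_E - 2T_0 + T_W}{\Delta x^2}.$$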
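The mass weights in (2.5-11) can be recovered by integrating the 1D quadratic Lagrange basis functions over the half-cell control volumes and then forming tensor products. The sketch below does this with exact fractions; it assumes unit node spacing (an element occupying $0 \le \xi \le 2$ with nodes at 0, 1, 2), and the helper names are illustrative rather than taken from the text.

```python
from fractions import Fraction as F

def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_integrate(p, a, b):
    """Definite integral of polynomial p over [a, b]."""
    return sum(c * (b ** (k + 1) - a ** (k + 1)) / (k + 1) for k, c in enumerate(p))

def lagrange(nodes, i):
    """Coefficients of the i-th Lagrange basis polynomial on the given nodes."""
    p = [F(1)]
    for j, xj in enumerate(nodes):
        if j != i:
            d = nodes[i] - xj
            p = poly_mul(p, [-xj / d, F(1) / d])
    return p

nodes = [F(0), F(1), F(2)]   # one quadratic element, unit node spacing

# Vertex node at xi = 0: its CV covers [0, 1/2] of this element (mirrored on the other side)
half = [poly_integrate(lagrange(nodes, i), F(0), F(1, 2)) for i in range(3)]
vertex_weights = [half[2], half[1], 2 * half[0], half[1], half[2]]

# Mid-side node at xi = 1: its CV covers [1/2, 3/2]
mid_weights = [poly_integrate(lagrange(nodes, i), F(1, 2), F(3, 2)) for i in range(3)]

# 2D mass stencil at a vertex node, scaled by 576
stencil = [[wa * wb * 576 for wb in vertex_weights] for wa in vertex_weights]

print([w * 24 for w in vertex_weights])   # 1D vertex-node weights times 24
print([w * 24 for w in mid_weights])      # 1D mid-node weights times 24
print(stencil[2])                         # central row of the 2D stencil
```

The same two 1D triplets, combined as tensor products, also give the leading coefficients 352 and 484 that appear for the mid-side and central nodes.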
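Next we present a typical 2-patch result, using Figure 2.3-6:

$$\begin{aligned}
&\frac{1}{576}\Big[352\dot T_0 + 110(\dot T_E + \dot T_W) + 16(\dot T_N + \dot T_S) + 5(\dot T_{NE} + \dot T_{NW} + \dot T_{SE} + \dot T_{SW}) \\
&\qquad - 22(\dot T_{EE} + \dot T_{WW}) - (\dot T_{NEE} + \dot T_{NWW} + \dot T_{SWW} + \dot T_{SEE})\Big] \\
&+ u\Bigg\{\frac{1}{16}\,\frac{(T_{SE} - T_{SW}) + 22(T_E - T_W) + (T_{NE} - T_{NW})}{l} - \frac{1}{48}\,\frac{(T_{SEE} - T_{SWW}) + 22(T_{EE} - T_{WW}) + (T_{NEE} - T_{NWW})}{2l}\Bigg\} \\
&+ \frac{v}{24}\,\frac{-(T_{NWW} - T_{SWW}) + 5(T_{NW} - T_{SW}) + 16(T_N - T_S) + 5(T_{NE} - T_{SE}) - (T_{NEE} - T_{SEE})}{h} \\
&= \frac{\kappa}{24}\,\frac{(T_{SW} - 2T_S + T_{SE}) + 22(T_W - 2T_0 + T_E) + (T_{NW} - 2T_N + T_{NE})}{\Delta x^2} \\
&\quad + \frac{\kappa}{24}\,\frac{-(T_{NWW} - 2T_{WW} + T_{SWW}) + 5(T_{NW} - 2T_W + T_{SW}) + 16(T_N - 2T_0 + T_S) + 5(T_{NE} - 2T_E + T_{SE}) - (T_{NEE} - 2T_{EE} + T_{SEE})}{\Delta y^2},
\end{aligned} \tag{2.5-12}$$

which is the CVFEM version of (2.3-46) and whose 1D version in the $x$-direction is the same as that above, and in the $y$-direction is

$$\frac{1}{24}(\dot T_N + 22\dot T_0 + \dot T_S) + v\,\frac{T_N - T_S}{h} = \kappa\,\frac{T_N - 2T_0 + T_S}{\Delta y^2}. \tag{2.5-13}$$

Finally, the central node; referring to Figure 2.3-7, gives

$$\begin{aligned}
&\frac{1}{576}\left[484\dot T_0 + 22(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + (\dot T_{NE} + \dot T_{NW} + \dot T_{SE} + \dot T_{SW})\right] \\
&+ \frac{u}{24}\,\frac{(T_{SE} - T_{SW}) + 22(T_E - T_W) + (T_{NE} - T_{NW})}{l}
 + \frac{v}{24}\,\frac{(T_{NE} - T_{SE}) + 22(T_N - T_S) + (T_{NW} - T_{SW})}{h} \\
&= \frac{\kappa}{24}\,\frac{(T_{SE} - 2T_S + T_{SW}) + 22(T_E - 2T_0 + T_W) + (T_{NE} - 2T_N + T_{NW})}{\Delta x^2} \\
&\quad + \frac{\kappa}{24}\,\frac{(T_{NE} - 2T_E + T_{SE}) + 22(T_N - 2T_0 + T_S) + (T_{NW} - 2T_W + T_{SW})}{\Delta y^2},
\end{aligned} \tag{2.5-14}$$

whose 1D version is

$$\frac{\dot T_W + 22\dot T_0 + \dot T_E}{24} + u\,\frac{T_E - T_W}{l} = \kappa\,\frac{T_E - 2T_0 + T_W}{\Delta x^2} \tag{2.5-15}$$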
in the $x$-direction. [Note that both the equation for the central node and its interpretation are rather different from that presented in Raw et al. (1984).]

So there they are: a highly coupled, very complex, second-order accurate approximation to the scalar transport equation; lots of work for little gain, an example of a cost-ineffective method. (Later, when we present 1D phase speed results for pure advection, we will point out that this quadratic CVFEM is less accurate than even linear GFEM.)

A higher-order CVFEM could perhaps be generated as follows: (i) replace the four contiguous subdomains in the sketch by the single subdomain obtained by omitting the internal boundaries, and (ii) apply a discontinuous, bilinear test function over this new control volume. The four parameters of the test function 'balance' the four nodal unknowns. In practice, this would generate four independent equations by making the error orthogonal to the following four functions: $1$, $x$, $y$, and $xy$, with a resulting approximation that is just as complicated as GFEM, but probably less accurate.

If, however, one completely abandons the finite element methodology and turns instead to a finite difference methodology, it is possible to generate higher-order finite volume methods, starting from (2.2-37). For example, Lilek and Peric (1995) generated a fourth-order CV-FDM by examining each term separately and devising higher-order approximations term-by-term. These authors also support our 'philosophy' that no upwinding is required on properly designed grids, although they do also advocate the so-called 'deferred-correction' approach, which utilizes some upwinding 'technology' during the iterations toward a non-upwinded converged result. Perhaps the FEM community could also benefit from such an approach.
2.5.4 The Group FEM/Product Approximation

We present one other non-Galerkin technique in this section: an ad hoc but quite inexpensive method for dealing with product non-linearities. An early work in this area is that by Swartz and Wendroff (1969); more recent are Christie et al. (1981), Abia and Sanz-Serna (1984), and Fletcher (1991). In the last reference, Fletcher, a pioneer and strong advocate of the approximation, devotes much space to it and argues for its cost-effectiveness. [We remark, parenthetically, that R. Taylor suggested back in 1975 that we (at LLNL) try it for our non-linear terms, a suggestion we briefly tested on Hamel flow; Newton's method behaved less robustly than it did for GFEM.] Here we present it briefly and with little discussion; indeed, as with CVFEM, we have no personal experience (i.e., computing) with it.

The group FEM must begin, like the CVFEM, with the divergence form of the equation, because the whole idea rests on approximating the product, $uT$, in a simple but hopefully cost-effective way. It is simply this: instead of multiplying the two variables $u$ and $T$ together after expressing each in the basis set, do it before expanding (!); i.e., $uT$ is approximated via

$$uT \approx \sum_{j=1}^{N}(uT)_j\,\phi_j, \tag{2.5-16}$$

a trick that essentially linearizes the advection term. The resulting advection approximation via GFEM is equivalent to that with a constant velocity; i.e., use (2.2-9) with $u$ = constant
to form the $N$-matrix, but in the final result replace $T_j$ at the nodes by $(uT)_j$; i.e., the velocity can vary. Thus, we begin by returning to (2.3-23) and re-writing the advection term for the case of constant velocity on the four-patch; it simplifies to

$$\begin{aligned}
\mathbf{u}\cdot\nabla_h T|_0 = {}&\frac{u}{6}\left(\frac{2h_1}{h_1 + h_2}\,\frac{T_{NE} - T_{NW}}{l_1 + l_2} + 4\,\frac{T_E - T_W}{l_1 + l_2} + \frac{2h_2}{h_1 + h_2}\,\frac{T_{SE} - T_{SW}}{l_1 + l_2}\right) \\
&+ \frac{v}{6}\left(\frac{2l_1}{l_1 + l_2}\,\frac{T_{NW} - T_{SW}}{h_1 + h_2} + 4\,\frac{T_N - T_S}{h_1 + h_2} + \frac{2l_2}{l_1 + l_2}\,\frac{T_{NE} - T_{SE}}{h_1 + h_2}\right),
\end{aligned} \tag{2.5-17}$$

another averaging of centered differences. The group FEM for the variable velocity case is now easily obtained in the manner mentioned above; i.e., the (entire) set of advection terms in (2.3-23) is simply replaced by

$$\begin{aligned}
\nabla_h\cdot(\mathbf{u}T)|_0 = {}&\frac16\left(\frac{2h_1}{h_1 + h_2}\,\frac{u_{NE}T_{NE} - u_{NW}T_{NW}}{l_1 + l_2} + 4\,\frac{u_ET_E - u_WT_W}{l_1 + l_2} + \frac{2h_2}{h_1 + h_2}\,\frac{u_{SE}T_{SE} - u_{SW}T_{SW}}{l_1 + l_2}\right) \\
&+ \frac16\left(\frac{2l_1}{l_1 + l_2}\,\frac{v_{NW}T_{NW} - v_{SW}T_{SW}}{h_1 + h_2} + 4\,\frac{v_NT_N - v_ST_S}{h_1 + h_2} + \frac{2l_2}{l_1 + l_2}\,\frac{v_{NE}T_{NE} - v_{SE}T_{SE}}{h_1 + h_2}\right),
\end{aligned} \tag{2.5-18}$$

a remarkable simplification, to be sure, and one that could easily be immediately converted to the CVFEM group approximation to advection: just replace the (1 4 1)/6 weighting above by (1 6 1)/8, and then place the result in the CVFEM 4-patch equation, (2.5-4), after deleting the (many) advection terms from that equation.

The cost reduction of the product approximation appears to be very significant. It also ensures global conservation of $T$ (but not $T^2$). What else can we say, except that Fletcher advocates it quite strongly and that we (negligently?) have not tested it? It seems worthy of further careful exploration, especially in the context of the NS equations, where the product non-linearity is present in large measure. A negative opinion on the approximation, which may or may not be too relevant for our particular advection non-linearity, also exists, however: Abia and Sanz-Serna (1984) assert that in 2D and 3D, the product approximation may be more costly than standard GFEM.
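A 1D analog may make the product approximation more concrete. The sketch below compares, at an interior node of a mesh of linear elements, the conservation-form Galerkin term (with $u$ and $T$ each interpolated, so that $uT$ is piecewise quadratic) against the group form, in which the nodal products $(uT)_j$ are interpolated directly. The 1D setting, function names, and sample data are illustrative assumptions and are not the 2D patch equations of the text; the values returned are the integrals $\int\phi_i\,\partial(uT)/\partial x$, not yet divided by any mass.

```python
from fractions import Fraction as F

def gfem_conservative(u, T, i):
    """Integral of phi_i * d(uT)/dx with u and T each piecewise linear.

    After integration by parts at an interior node, this is the difference
    of the element averages of u*T in the right and left elements.
    """
    def element_average(a, b):
        return (2 * u[a] * T[a] + u[a] * T[b] + u[b] * T[a] + 2 * u[b] * T[b]) / 6
    return element_average(i, i + 1) - element_average(i - 1, i)

def group_fem(u, T, i):
    """Same integral when uT is replaced by the interpolant of its nodal values."""
    return (u[i + 1] * T[i + 1] - u[i - 1] * T[i - 1]) / 2

T = [F(2), F(0), F(1), F(5), F(3)]
u_var = [F(1), F(2), F(4), F(3), F(1)]
u_const = [F(2)] * 5

interior = range(1, 4)
same_for_constant_u = all(
    gfem_conservative(u_const, T, i) == group_fem(u_const, T, i) for i in interior
)
group_sum = sum(group_fem(u_var, T, i) for i in interior)
flux = [ui * Ti for ui, Ti in zip(u_var, T)]
boundary_only = (flux[4] + flux[3] - flux[1] - flux[0]) / 2

print(same_for_constant_u, group_sum, boundary_only)
```

For constant velocity the two forms coincide, consistent with the remark that the group form is built from the constant-velocity advection matrix. For variable velocity the group terms telescope when summed over interior nodes, leaving only near-boundary fluxes, which is the 1D counterpart of the global conservation of $T$ noted above.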
2.5.5 The Petrov-Galerkin FEM To conclude this section, we mention the existence of a large class of 'non-Galerkin' methods, all of which add artificial diffusion in one way or another to control/damp wiggles, called Petrov-Galerkin methods, wherein the test functions are different than the basis functions. For the interested reader, we cite but two recent references, the first of which seems to present a pretty useful historical account; i.e., see Goldschmit and Dvorkin (1994) for a Petrov-Galerkin 'family tree.' The second reference is a new book by Bill Morton—an acknowledged expert numerical analyst with broad experience in
FDM, FEM, and FVM. In Morton (1996) is much useful analysis of the AD equation (and its approximate solutions) that complements what we present here—including GFEM error analysis for both linear/bilinear and quadratic/biquadratic on triangles and quadrilaterals. But the bulk (main thrust) of the book is nearly 'orthogonal' to ours: Petrov-Galerkin finite volume methods for compressible flow. Nevertheless, especially for the reader interested in learning about this approach to CFD, this text is 'required reading'. In the next section (2.6.1) we discuss some CFD 'philosophy', but mention here that the above reference generally reflects a philosophy that is different from ours. We give short shrift to this large 'branch' of CFD for the following, probably somewhat naive (and not quite 100% true) reasons: we have no (or little) experience with them because we (usually) see no need for them, causing us to have little interest in them. (Those who believe we are 'wrong' in this regard would probably assert that we've never solved any 'really hard' problems. ... Our only defence against such a probably-valid assertion is this: probably neither have you really solved the 'hard' problems that you have in mind.) If the reader detects that we are being somewhat contentious here, please read on! (If not, please wake up.)

2.6 DISPERSION, DISSIPATION, PHASE SPEED, GROUP VELOCITY, MESH DESIGN, AND—WIGGLES

2.6.1 Qualitative Discussion

a. Wiggles

CFD is not a 'hard' science in that some of its aspects (such as 'conservation form') are often more religious than rational in nature. We are about to discuss one of these aspects. Wiggles—the Nemesis of CFD.
Perhaps no other single difficulty has generated more frustration and caused more effort than the rapid, high-frequency—typically (but definitely not always) node-to-node (or time-step to time-step)—oscillations that come out of the computer and pollute the putative 'solution' than that called, most simply, wiggles. Perhaps no other aspect of CFD has so divided the world into two basic camps: those who hate/fear the wiggles so much that they use only methods that never permit their occurrence, and those who, while not exactly embracing them, believe that there is a message in the wiggles and that there is more to good CFD analyses than simply being wiggle-free. And—in many cases—never the twain shall meet; there is too large a gap between the two 'religions.' Indeed, some zealots in the teaching profession—especially those who belong to the smooth-is-good, wiggle-free religion—seem to generate 'products' (typically Ph.D. students) who have never had any first-hand experience with wiggles and thus leave the university wearing blinders—an unfortunate situation in our opinion. (Our 'religion' already shows through.) The price that is too often paid by those who a priori suppress wiggles by their choice of a numerical method is simply that they are often solving the wrong problem; i.e., the effectively/numerically much-reduced Peclet or Reynolds number leads these analysts to believe that they are really solving some tough problems when, in fact, they are not (a virtual reality of CFD) because they have changed the problem. The wiggly-camp, on the other hand, often has much difficulty with tough problems wherein, in the worst case, they can get no solution at all. This camp, in which we are fairly firmly (but perhaps not permanently) entrenched, believes that 'The
wiggles are telling you something' and try to use wiggle signals as a guide to better mesh design (where possible) or, in the worst of cases, admit that the stated problem—truth be told—is just too difficult (for the current generation of computers). Consistent with our religious belief, we shall not do the disservice of citing any but a bare minimum of publications from the 'other side'; e.g., as mentioned above, a good history is available in Goldschmit and Dvorkin (1994). On the other hand, sometimes the wiggle signal can be too strong—forcing a fine mesh upon the analyst in regions of the domain where high accuracy is known (or assumed) to be not important. Thus, it is not hard to see why there are two camps, because one must basically make a Hobson's choice: results (when obtainable!) that may display spurious wiggles or results that may be deceptively smooth. In between these two 'extremist' philosophies are those who seriously try to reduce the wiggles—not always a priori—in ways that are not otherwise too deleterious, thus suggesting at least some hope for a 'middle ground.' We conclude this metaphilosophical introduction with a quotation from another field of 'science' that also abounds with religious zealots—stratospheric ozone depletion—because it probably applies here as well: 'The debate has become quasi-theological, with each side basing its arguments on faith in its own imperfect calculations'—Singer (1994). Holy ozone, Batman! While this divisive issue transcends the boundaries of FEM to encompass FDM, FVM, and spectral methods, we shall naturally focus mostly on FEM, in which our belief—generally—is reflected in our new acronym, GFEMIA: Galerkin finite element method intelligently applied.
In partial support of this belief, we shall show below how a thousand nodes, badly placed, will lead to a wiggly solution, but that just a single node, intelligently placed, can give a really good solution—an example that will also obviate much of the typical related GFEM error analyses. The GFEMIA requires, besides a lower bound on the analyst's IQ, not much more than common sense—and, of course, an appreciation for some of the subtleties of both fluid mechanics and the numerical methods used to describe it. It can yield results that are unbelievably more accurate than most error analyses can/would ever predict. An interesting wiggle-opinion from the spectral method side of the house is this: 'A very important attribute of spectral methods is their self-diagnosis property. Inadequate grid resolution is reflected in excessive values of high-order expansion coefficients.'—Rogallo and Moin (1984). Hopefully, it is obvious that 'excessive values of high-order expansion coefficients' is just another way to say 'wiggles,' and we now borrow their useful terminology: wiggles are a self-diagnosis property. To further emphasize this property, this time in the other direction, which again introduces some nice terminology, we quote from Gropp and Keyes (1992): 'Complaints that heavily upwinded discretizations conceal their own errors are common in the literature ...' So, another way to state the extremes of this dilemma is this: choose a method that either has a self-diagnosis property (and thus often 'makes waves') or one that conceals its own errors (and is wiggle-free). While not everyone understands the wiggles—or their causes—it is a safe bet to assert that everyone does recognize them—and nearly everyone has an opinion about them. (Those who have no opinion have probably been brainwashed early-on and have never seen them from their 'smooth-is-beautiful' codes. Smooth may be 'beautiful', but it can also be very wrong.) 
While not as bad as turbulence, or pornography, or even art, in each
of which you may not be able to define it but believe that you recognize it when you see it, there really is no pure and simple always-applicable definition of wiggles except perhaps this (which may require some knowledge of physics): wiggles are non-physical oscillations.

What does cause the wiggles? One principal cause of spatial wiggles is the specification of a problem in which the dependent variable is forced to suffer/experience gradients (rates of change) in the flow direction that are 'too large' to be 'captured' by the mesh—with the resulting solution (when a solution can be attained, which is basically 'always' for the linear problems now under discussion) that 'radiates' $2\Delta x$ (and other short-wave) oscillations away from the 'source' (the non-resolved 'boundary layer'), typically in the upstream direction. But not always upstream—especially on grids that are regular in the sense of having the node points more or less aligned with the direction of the coordinate axes. We believe—and will act on this belief in what follows—that one good way to understand wiggles is to study the eigenproblems associated with the spatial operators (matrices) in that wiggles are excitations of the high-frequency (wave number) eigenmodes; i.e., the amplitude coefficients of one or more of the oscillatory eigenvectors that describe the GFEM (or other method) for advection-dominated flows become relatively 'too large.' In the above category is one of the most important wiggle makers: flow toward a Dirichlet (hard) BC that generates a boundary layer that is 'too thin' relative to the mesh employed; i.e., the boundary layer thickness is smaller than the (normal) distance from the boundary to the first node point, typically quantified by $k/u_n < \Delta x_n/2$, where $u_n$ is the component of the velocity at the node in question that is directed toward the boundary.
This inequality, which also reads $P = u_n\Delta x_n/2k > 1$, where $P$ is called the grid Peclet number, makes wiggles—and if $P \gg 1$, the wiggles become stronger and more widespread. The flow at large $Pe\,(= u_0 L/k)$ toward the hard BC can force the dependent/transported variable to undergo a large adjustment/change in a very short distance, $O(\delta)$, where $\delta/L = 1/Pe$, so that transport by diffusion—which is small away from the boundary—'suddenly' and necessarily becomes large, to 'balance' the large transport via advection, thus generating a steep gradient through the boundary layer. This is a challenging and (in many cases) important physical problem in virtually all areas of transport phenomena: momentum, heat, and mass transfer. An example of the importance of the above situation, we surmise, may have been related to the jet engine failures suffered by certain manufacturers twenty-some years ago when the wiggle-suppressing numerical results failed to reveal overheating and a concomitant serious thermal stress condition. ... Heavy-handed but 'robust' techniques can and do generate a 'What, me worry?' type of attitude among otherwise good analysts/designers, and a perfect example of the following statement that we attribute to J. Ferziger: 'The greatest disaster one can encounter in computation is not instability or lack of convergence, but results that are good enough to be believable but bad enough to cause trouble.' This is the euphoria of CFD: a heavily 'damped' code that always delivers smooth, wiggle-free results. How does a designer/analyst know how to make a good mesh if his/her code never makes wiggles? If the answer to this question is, 'They don't', then perhaps it is more understandable why so many clearly difficult problems have been ostensibly solved using wiggle-free methods on really coarse grids. (We admit, here, to inexperience with good adaptive mesh methods, which, in the best of all worlds, would generate an appropriate mesh for the unskilled
analyst—at least if employed in conjunction with 'honest,' not overly artificially-damped methods.) Two other wiggle makers, to be addressed below, that are worthy of mention in this introduction are: advection of a wave-form that is too tough for the chosen mesh, and transient diffusion at early time and close to a large local disturbance—such as a step change in a Dirichlet BC or a discontinuous source term. The first of these is probably far more important, prevalent, and obvious—and the second is more subtle and less well appreciated. Returning to the former—a common and important special case of which is the propagation of a (steep) front through the domain—it may be the case that this situation is even more common than the 'hard BC' case discussed above. In fact, a recent (and excellent) book has been devoted to this single subject (Finlayson, 1992). But a moment's reflection will reveal a strong similarity between these two wiggle makers—the key difference being that the moving front problem cannot be 'solved' by simple local mesh refinement. Another cause of wiggles is poorly resolved, or rough in general, IC's—a consequence of the fact that the GFEM mass matrix generates an $L_2$-best fit to the data (in this case, the data are the operations of the advection and diffusion matrices on the initial condition vector).

What are the wiggles telling you? They are suggestive oscillations that are trying to tell you that there may exist a serious deficiency in your mesh design—or in the problem specifications if a too-sharp IC is used or a hard OBC is used when a soft one could or should be used. They are saying, 'An important steep gradient—and a concomitant large diffusive flux—exists that cannot be resolved ("captured") by the chosen mesh.' They also tell you, usually (if you can translate the wiggle signal) where in your mesh a better resolution is needed.
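The grid Peclet criterion can be tried out on the simplest model problem. The sketch below assumes steady 1D advection-diffusion, u dT/dx = k d²T/dx², on a uniform mesh over [0, 1] with T(0) = 0 and the hard BC T(1) = 1 at the outflow end, discretized by centered differences (which, for this steady uniform-mesh case, coincide with linear GFEM). The function names and the monotonicity check are illustrative choices made here, not code from the text.

```python
def solve_steady_ad(n_elems, grid_peclet):
    """Nodal values for u T' = k T'' with T(0)=0, T(1)=1.

    With P = u*h/(2k), each interior equation reads
    (1 + P) T[j-1] - 2 T[j] + (1 - P) T[j+1] = 0,
    solved here by the Thomas (tridiagonal) algorithm.
    """
    P = grid_peclet
    n = n_elems - 1                      # number of interior unknowns
    a, b, c = 1.0 + P, -2.0, 1.0 - P     # sub-, main and super-diagonal
    diag = [b] * n
    rhs = [0.0] * n
    rhs[-1] = -c * 1.0                   # known boundary value T(1) = 1
    for j in range(1, n):                # forward elimination
        m = a / diag[j - 1]
        diag[j] = b - m * c
        rhs[j] -= m * rhs[j - 1]
    T = [0.0] * n
    T[-1] = rhs[-1] / diag[-1]
    for j in range(n - 2, -1, -1):       # back substitution
        T[j] = (rhs[j] - c * T[j + 1]) / diag[j]
    return [0.0] + T + [1.0]

def has_wiggles(T, tol=1e-12):
    """The exact solution rises monotonically, so any decrease is spurious."""
    return any(T[j + 1] < T[j] - tol for j in range(len(T) - 1))

if __name__ == "__main__":
    for P in (0.5, 2.0):
        T = solve_steady_ad(10, P)
        print("P =", P, "wiggles:", has_wiggles(T))
```

In this setting the discrete solution is monotone for P < 1 and alternates in sign from node to node for P > 1, with the oscillation growing toward the under-resolved boundary layer at x = 1. That is the sense in which the wiggle signal points to where refinement is needed.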
If it is a thermal analysis, temperature wiggles are often telling you where the local Nusselt number is very large and where to improve (refine) the mesh so that the next solution might be wiggle-free and the high heat flux properly computed. (An analogous momentum transfer problem, to be studied in the next chapter, is that of obtaining proper lift, drag, and torque.) To close this extended and somewhat philosophical introduction to the 'theory' of wiggles, we point out a recent serious application of the same philosophy: in September 1993, the ASME's Journal of Fluids Engineering modified their editorial policy in a major way such that smooth-but-lousy numerical results would/should be much more difficult to slip past the referees and mislead the readers. In a somewhat less brash manner, the International Journal for Numerical Methods in Fluids followed suit—as did the AIAA Journal. Finally, two suggested readings for the novice interested in wiggles—whose full titles we must present: (1) 'A Survey of Finite Differences of Opinion on Numerical Muddling of the Incomprehensible Defective-Confusion Equation,' by B.P. Leonard (1979), and (2) 'Don't Suppress the Wiggles—They're Telling You Something!' by P.M. Gresho and R. Lee (1979); both in the ASME publication, Finite Element Methods for Convection-Dominated Flow, T.J.R. Hughes, Ed. (1979). [The latter appeared later—and longer—in Gresho and Lee (1981).]

b. Dispersion

To open this discussion, we appeal to authority for the definition (Whitham, 1974, p. 3), 'A linear dispersive system is any system which admits solutions of the form

$$\varphi = a\cos(kx - \omega t), \qquad \text{(2.6-1)}$$
where the frequency $\omega$ is a definite real function of the wavenumber $k$ (with wavelength $\lambda = 2\pi/k$), and the function $\omega(k)$ is determined by the particular system. The phase speed is then $\omega(k)/k$, and the waves are said to be "dispersive" if this phase speed is not a constant but depends on $k$. The term refers to the fact that a more general solution will consist of the superposition of several modes like (2.6-1) with different $k$. If the phase speed $\omega/k$ is not the same for all $k$, that is $\omega \neq c_0 k$ where $c_0$ is some constant, the modes with different $k$ will propagate at different speeds; they will disperse.' $\omega/2\pi$ is the frequency at which wave crests/troughs pass a fixed point. The first important remark for our 'waves' is this: the continuum solutions are non-dispersive. And the second is this: all grid-based numerical approximations are dispersive. Thus, all of our numerical dispersion is spurious—it is pure error. [N.B. In some fields, such as groundwater flow, the term dispersion has a different meaning than the above wave-based definition; it refers to flow processes in which small-scale velocity variations cause enhanced mixing. See, for example, Shubin and Bell (1984) for a description of porous medium dispersion, and Cussler (1984) for turbulence-induced dispersion—both cases being models of the true physics. Also, some writers confound the term dispersion with dissipation, which we will discuss next.] It is ironic (and paradoxical) that the ostensible 'simplifications' engendered by making the incompressible flow assumption/restriction that should surely preclude most internal wave phenomena (except in Boussinesq fluids—covered in Volume 2)—and does in the PDE's—nevertheless require the (learned) analyst to become familiar with a fairly significant portion of 'wave theory' simply because of the deficiencies associated with the modeling of (non-dissipative!) advection.
The simplest case of dispersion (or dispersion error, an interchangeable term in this text) is when an initial wave-form (e.g., a Gaussian, or a triangular 'pulse') is placed on the grid, and the pure advection solution sought; it will, if followed long enough in time, break up into a trail of wiggles. Results—and some theory—follow later. c. Dissipation A synonym for this term, in the context that we shall use it, is 'artificial dissipation,' and another is 'artificial diffusion,' and, finally, some call it 'numerical diffusion.' Others mean artificial dissipation when they use the term dispersion. C'est la vie. They all mean this: when the pure advection equation—which, by definition (almost), is free of dissipation—is solved by a numerical approximation method that reduces the amplitude and changes the shape of the initial wave in a way analogous to a diffusional process, the method is said to contain (or to suffer from) 'dissipation.' A dissipative scheme will monotonically and erroneously decrease the 'energy' in the wave—the quadratic conservation property of the advection equation (cf. Sections 2.1.4, 2.2.3, and 2.2.4) is lost. Whereas it seems safe to say that all 'numerical methods designers' seek and covet schemes that display as little dispersion as possible, it is definitely not the case that they also want to minimize dissipation. And the reason is wiggle-related: dispersion leads to/generates wiggles, and some people are simply allergic to wiggles. And, adding numerical diffusion to an otherwise non-dissipative method will always reduce the wiggles, and often eliminates them entirely. Hence the occurrence or intentional introduction of artificial diffusion, although sometimes the intent is disguised by phrases such as 'upwinded advection' approximations. [Pig-pen advection, a name attributed to B. Spalding by Roache (1982), is also an appropriate description of 'upwinding.'] More later.
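The link between 'upwinding' and artificial diffusion can be checked directly. The sketch below assumes a uniform periodic 1D grid and first-order upwind differencing, and verifies the standard identity that the upwind advection term equals the centered one minus a diffusion term with artificial diffusivity |u|h/2. The function names are hypothetical and the grid data are arbitrary.

```python
import math

def centered_advection(T, u, h):
    """Centered approximation of u dT/dx on a periodic grid."""
    n = len(T)
    return [u * (T[(j + 1) % n] - T[(j - 1) % n]) / (2.0 * h) for j in range(n)]

def upwind_advection(T, u, h):
    """First-order upwind approximation of u dT/dx (one-sided, against the flow)."""
    n = len(T)
    if u >= 0.0:
        return [u * (T[j] - T[(j - 1) % n]) / h for j in range(n)]
    return [u * (T[(j + 1) % n] - T[j]) / h for j in range(n)]

def second_difference(T, h):
    """Centered approximation of d2T/dx2 on a periodic grid."""
    n = len(T)
    return [(T[(j + 1) % n] - 2.0 * T[j] + T[(j - 1) % n]) / h ** 2 for j in range(n)]

def upwind_as_centered_plus_diffusion(T, u, h):
    """Centered advection minus k_art * d2T/dx2, with k_art = |u| h / 2."""
    k_art = abs(u) * h / 2.0
    adv = centered_advection(T, u, h)
    dif = second_difference(T, h)
    return [a - k_art * d for a, d in zip(adv, dif)]

if __name__ == "__main__":
    n = 20
    h = 1.0 / n
    T = [math.exp(-50.0 * (j * h - 0.5) ** 2) for j in range(n)]
    lhs = upwind_advection(T, 1.0, h)
    rhs = upwind_as_centered_plus_diffusion(T, 1.0, h)
    print(max(abs(a - b) for a, b in zip(lhs, rhs)))  # round-off only
```

Read this way, solving pure advection with upwind differences is the same as solving an advection-diffusion problem with a grid Peclet number of exactly one, which is why the scheme is wiggle-free and also why it smears the wave.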
d. Phase speed

Although briefly introduced (in 1D) just above, a few more words may be helpful—and will here be specifically applied to our problem: advection. If a sine wave with wave number $k$ (and wave vector $\mathbf{k}$ with direction—by definition—orthogonal to the wave crests) is placed in a flow field with constant velocity, $\mathbf{u}$, the phase speed (a scalar, $c$) is the projection of $\mathbf{u}$ in the direction of $\mathbf{k}$—see Figure 2.6-1:

$$c = \mathbf{u}\cdot\mathbf{k}/k = |\mathbf{u}|\cos(\theta - \beta), \qquad \text{(2.6-2)}$$

where $k = |\mathbf{k}|$; it is the speed at which wave crests move past a stationary observer. Clearly $c$ is maximized when $\mathbf{u}$ and $\mathbf{k}$ are parallel. The apparent wavelength, to a stationary observer, is $\lambda_a = 2\pi|\mathbf{u}|/\mathbf{k}\cdot\mathbf{u} = 2\pi|\mathbf{u}|/kc$, which is a minimum ($\lambda_a = \lambda = 2\pi/k$) when $\mathbf{k}$ and $\mathbf{u}$ are parallel ($c = |\mathbf{u}|$) and is a maximum ($\lambda_a = \infty$) when $\mathbf{k}$ and $\mathbf{u}$ are perpendicular ($c = 0$). Finally, since $\mathbf{k}/k$ is just the unit vector in the direction of $\mathbf{k}$, $c$ is independent of the magnitude of $\mathbf{k}$; thus, there is no dispersion. In 1D, of course, $c = u$, and the sine wave is simply dragged along (translated) by the flow at speed $u$. The above is a special case of the more general concept of phase velocity, which is the speed of the wave train in the wave direction $\mathbf{k}$, which is normal to the lines of constant phase. It is shown in Figure 2.6-1 and given by Whitham (1974)

$$\mathbf{c} = \frac{\omega}{k}(\mathbf{k}/k) = \omega\mathbf{k}/k^2 = c\,\mathbf{k}/k = (\mathbf{u}\cdot\mathbf{k})\mathbf{k}/k^2, \qquad \text{(2.6-3)}$$

giving $|\mathbf{c}| = c = \omega/k$, where $\omega$ is the temporal frequency of the wave and $2\pi/\omega = \tau$ is its period, via a generalization of (2.6-1) to a plane wave in multi-dimensions,

$$\varphi = a\cos(\mathbf{k}\cdot\mathbf{x} - \omega t). \qquad \text{(2.6-4)}$$

It is clear from (2.6-3) that the direction of the phase velocity is that of the wave number vector. To obtain (2.6-2) from (2.6-3) and (2.6-4), simply seek a solution of the form (2.6-4) to the advection equation,

$$\partial\varphi/\partial t + \mathbf{u}\cdot\nabla\varphi = 0. \qquad \text{(2.6-5)}$$

Fig. 2.6-1 Plane waves in a fluid with constant velocity $\mathbf{u}$.
The result is $\omega = \mathbf{u}\cdot\mathbf{k}$ and, using (2.6-3) and $c = |\mathbf{c}|$ yields the phase speed given in (2.6-2). Finally, the phase velocity can also be written as

$$\mathbf{c} = \frac{\lambda}{\tau}(\mathbf{k}/k). \qquad \text{(2.6-6)}$$

e. Group velocity

Whereas the study of wave motion for an isothermal incompressible fluid in the absence of free surfaces is, or should be, a nearly vacuous subject, it is (unfortunately) far from that when numerical approximations are employed. So we must, because of our imperfect model of the PDE's, address certain aspects of wave motion—mainly to help to assess the errors contained in the model. First, however, another appeal to authority: 'The theory of group velocity is an essentially mathematical theory that has been developed over the years with an eye on a great variety of spheres of application' —Lighthill (1965). A general qualitative description of group velocity is that it is the velocity of the energy (amplitude squared, basically) of the wave (or wave packet, to which it is generally applied). Rather than simply dividing the wave's frequency by its wave number, à la (2.6-3) for phase speed, the group velocity is the vector that obtains by differentiating $\omega(\mathbf{k})$ with respect to the components of the wave number vector, $\mathbf{k}$:

$$\mathbf{G}(\mathbf{k}) = \nabla_{\mathbf{k}}\,\omega = (\partial\omega/\partial k_i). \qquad \text{(2.6-7)}$$

Now for our simple advection equation, $\omega = \mathbf{u}\cdot\mathbf{k}$, and thus

$$\mathbf{G} = \mathbf{u}; \qquad \text{(2.6-8)}$$

the group velocity is merely the fluid's velocity and is independent of $\mathbf{k}$: there is no dispersion in the continuous case, and the theory of wave motion (for an isothermal incompressible fluid) need not be invoked—at least in the absence of free surfaces or stratified fluids (see Volume 2).
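For readers who want the intermediate steps, the continuum results quoted above can be written out as follows. This is only a worked restatement of substituting the plane wave (2.6-4) into the advection equation (2.6-5), using the symbols already defined; nothing new is assumed beyond constant $\mathbf{u}$.

```latex
\begin{align*}
\varphi &= a\cos(\mathbf{k}\cdot\mathbf{x} - \omega t), \\
\frac{\partial\varphi}{\partial t} &= a\,\omega\,\sin(\mathbf{k}\cdot\mathbf{x} - \omega t), \qquad
\nabla\varphi = -a\,\mathbf{k}\,\sin(\mathbf{k}\cdot\mathbf{x} - \omega t), \\
\frac{\partial\varphi}{\partial t} + \mathbf{u}\cdot\nabla\varphi
  &= a\,(\omega - \mathbf{u}\cdot\mathbf{k})\,\sin(\mathbf{k}\cdot\mathbf{x} - \omega t) = 0
  \;\;\Longrightarrow\;\; \omega = \mathbf{u}\cdot\mathbf{k}, \\
c &= \frac{\omega}{k} = \frac{\mathbf{u}\cdot\mathbf{k}}{k}, \qquad
\mathbf{G} = \nabla_{\mathbf{k}}\,\omega = \nabla_{\mathbf{k}}(\mathbf{u}\cdot\mathbf{k}) = \mathbf{u}.
\end{align*}
```

Because $\omega$ is linear in $\mathbf{k}$, neither $c$ nor $\mathbf{G}$ depends on $|\mathbf{k}|$, which is the non-dispersive property of the continuum problem.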
If one were to try to examine the 'components' of the phase velocity, (2.6-3), in an 'analogous' way as above for group velocity, via $c_1 = \omega/k_1$ and $c_2 = \omega/k_2$, then one would find that these non-physical components do not 'transform' properly; and, rather than satisfying $c^2 = c_1^2 + c_2^2$, it turns out that $c^{-2} = c_1^{-2} + c_2^{-2}$ because, in fact, $c_1 = c/\cos\theta$ and $c_2 = c/\sin\theta$. These phase 'components' are geometrical, not physical—whereas those of $\mathbf{G}$ are physical. (There are some slippery concepts in wave motion.) The 'physical' components of $\mathbf{c}$ come from (2.6-3) via $\mathbf{k} = k(\mathbf{e}_1\cos\theta + \mathbf{e}_2\sin\theta)$ and $\mathbf{c} = c_1\mathbf{e}_1 + c_2\mathbf{e}_2$; i.e., $c_1 = c\cos\theta$ and $c_2 = c\sin\theta$ are the projections of $\mathbf{c}$ onto the coordinate directions. When we try to solve (2.6-5) by any numerical approximation method, however, both phase speed and group velocity will differ from those above and will exhibit dispersion by not being independent of wavelength. These concepts will also prove to be much more useful than local ($h \to 0$) error analyses, whether based on Taylor series or via more sophisticated convergence theory, because they do not require $h \to 0$ and thus apply on 'real' meshes. Stated differently, asymptotic (small $h$) analyses are restricted to long waves ($h/\lambda \to 0$), whereas the general situation always involves a linear combination of long and short waves, and the behavior of the actual error must account for this fact. (Short waves are typically those whose wavelengths are between $4\Delta x$ and $2\Delta x$, the latter being the shortest wave resolvable on the mesh. More on this later. And yes, $h = \Delta x$ and
we use both designations, depending on the context; e.g., a $2h$ wave just does not 'sound right,' whereas a $2\Delta x$ wave does.) To conclude this brief introduction to phase and group velocity, we present a few more gems from Whitham (1974):

1. (p. 10), 'For linear problems, solutions more general than (2.6-1) are obtained by superposition to form Fourier integrals, such as $\varphi = \int_0^\infty F(k)\cos[kx - \omega(k)t]\,dk$, where $\omega(k)$ is the dispersion function appropriate to the system. Formally, at least, this is a solution for arbitrary $F(k)$, which is then chosen to fit the boundary or initial conditions, with use of the Fourier inversion theorem. The solution of (2.6-1) is a superposition of wave-trains of different wave numbers, each traveling at its own phase speed, $c(k) = \omega(k)/k$. As time evolves, these different component modes 'disperse,' with the result that a single concentrated hump, for example, disperses into a whole oscillatory train. This process is studied by various asymptotic expansions of (2.6-1). The key concept that comes out of the analysis is that of the group velocity, defined as $G(k) = \partial\omega(k)/\partial k$.'

2. (p. 371), 'Although the Fourier integrals give exact solutions, the content is hard to see.' This statement can often also be applied to the exact solutions of semi-discrete equations given by the eigenvector expansions to be presented soon.

3. (p. 376), '... an observer moving with the velocity $G(k_0)$ will always see waves with wave number $k_0$ and frequency $\omega(k_0)$.'

4. (p. 377), 'An observer following any particular crest moves with the local phase velocity but sees the local wave number and frequency changing; that is, neighboring crests get farther away. An observer moving with the group velocity sees the same local wave number and frequency, but crests keep passing him.'

And, finally,

5. (p. 380, 381), 'The group velocity $G(k)$ is the propagation velocity for the wave number $k$, $\partial k/\partial t + G(k)\,\partial k/\partial x = 0$...
It is interesting and significant that (this equation) is non-linear, even though the original problem is linear ...'

f. Mesh design

We have stated above that the design of smart meshes is part of GFEMIA, and we reiterate here: in our opinion, not enough use has been made of the inherent flexibility of isoparametric finite elements to really maximize the cost-effective use of GFEM. But, unfortunately, we too have no magic recipe. But we can offer some advice on initial mesh design (more on this in Volume 2): think in advance of calling your mesh generator, analyze the problem parameter range of interest, and, qualitatively, the associated fluid dynamics and potential 'wiggle dynamics.' As a start, consider the following questions and answers:

Q: Should I use a uniform or a slightly graded or a highly graded mesh?

A1: If the flow is advection-dominated and if transport of wave-forms ('shapes') is important and if there are no obstacles to flow around that require hard (Dirichlet) BC's, then a uniform or quasi-uniform mesh is suggested (slow growth of element
size in the flow direction if diffusion is strong enough to be 'noticeable' and if there is a clear flow direction).

A2: If flow past obstacles is the important part of the problem and if the obstacles require hard BC's, then an advection-dominated flow needs a graded mesh such that one or (preferably) a few nodes are placed in the boundary layer at least upstream of any stagnation points, and wherever there is a boundary layer on the obstacle in the normal direction. Such a procedure will properly both eliminate (or make small) wiggles and produce a reasonably accurate solution. If, though, transverse boundary layers are also important, then mesh refinement/grading is needed near these surfaces for other than wiggle reasons—usually. (Transverse boundary layers are not often big wiggle generators, except under certain transient conditions.) Hard BC's are the name of the game when no-penetration and no-slip are appropriate for the momentum equations and $Re \gg 1$. They apply to the scalar problem if, for example, cool fluid flows around a hot obstacle and $Pe \gg 1$.

A3: If both wave form transport and flow past obstacles are present in the problem, then obviously a combination of A1 and A2 is required.

Q: Does the problem involve a boundary layer that you consider to be unimportant and therefore do not wish to pay the extra price of selective mesh refinement?

A: If yes, why? (If no, see above.)

This protracted 'introduction' has been intended partly as an 'eye opener' and partly as an incentive for the reader to believe it worthwhile to plow through the often lengthy mathematical analyses to follow both here and (especially) in the next chapter—and which is but a minuscule 'sample' of a type of 'CFD' problem that has consumed the efforts of so many for so long, and continues to do so.
An appreciation of this single aspect of CFD will make a CFD 'analyst' much more 'useful' in both the applied world of CFD and in the never-ending world of research seeking better methods. For the reader truly interested in the numerical treatment of advection, it would be advisable to stay abreast of this portion of the following 'specialty areas,' since each works quite hard at devising improved numerical methods (most of which are not FEM's) and more often than not, each field develops in relative isolation—not knowing or (it seems) caring what developments are occurring outside of the particular specialty, all in addition (of course) to perusing the applied math literature: geophysical fluid dynamics (general circulation modeling, planetary boundary layer modeling, ocean modeling, shallow-water equations, etc.), flow through porous media and ground water pollution, oil reservoir analysis, aerodynamics, air pollution, astrophysics, magnetohydrodynamics and particle physics, river dynamics and sediment transport, gas dynamics, materials science, separation science, ...

2.6.2 Quantitative Discussion for Some 1D Problems

Many of the above items are amenable to at least some useful quantitative analysis—at least in 1D. Here we shall look at these issues from the point of view of the semi-discrete equations—time remaining continuous. In a later section (2.7.6), we shall include the additional complexity of time-marching and analyze some full discretizations. Finally, we shall spend just a little time discussing the steady equations—which of course are just a 'special case,' albeit an important one.
Before embarking on some extensive analysis in 1D, we make the comment that much less analysis will be applied when we go to multi-dimensions (for several reasons!) but that a good understanding of the behavior—idiosyncratic and not—of 1D approximations, although seemingly slightly academic, is actually helpful for the multi-dimensional cases in the following sense: if the 1D method has 'problems,' surely so too will the multi-dimensional version have these and more (a sort of Murphy's Law of CFD). Thus, if 1D analysis causes rejection of a scheme, then it is a safe bet that it should not be tried in multi-dimensions. The 'converse,' unfortunately, is not generally true: good/great methods in 1D can be poor—or, more often, 'impossible' to implement properly—in multi-dimensions. Even in 1D, the bulk of the effort will be for the simple (but useful, for learning) case of periodic BC's, and mostly for pure advection, to which we now turn.

a. Pure advection with periodic BC's

• Continuum. We start with the simplest of all hyperbolic equations, one which is a true time-dependent equation since steady solutions (other than $T$ = constant) do not exist,

$$\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} = 0 \quad \text{on } 0 \le x \le L = 1 \qquad \text{(2.6-9)}$$

with periodic BC's,

$$T(0,t) = T(L,t), \qquad \text{(2.6-10)}$$

and some IC,

$$T(x,0) = T_0(x), \qquad \text{(2.6-11)}$$

where $u$ is constant. The exact solution is

$$T(x,t) = T_0(x - ut); \qquad \text{(2.6-12)}$$

the solution simply translates the initial data to the right at speed $u$, thus showing how simple this PDE really is. Until you spatially discretize it—at which point you will encounter what Mitchell (1984) has called CFD's 'ultimate embarrassment.' [More precisely: 'The ultimate embarrassment is to be unable to solve the simplest of equations, $\partial u/\partial t + \partial u/\partial x = 0$, accurately by numerical methods on a fixed grid'.] The solution of the semi-discrete equations is unbelievably more 'difficult' than that of the continuum.
This is due, at least in part, to the lack of damping in this first-order hyperbolic equation; whereas (2.6-12) will propagate all wavelengths, and even discontinuities, with no change in size or shape, the short waves (and especially discontinuities) are very difficult to propagate properly by most numerical approximations to (2.6-9). Another relevant opinion from an advection expert is this: 'There is no such thing as a perfect advection scheme, only differing degrees of badness' (A. Staniforth, personal communication). Finally, from Lien and Leschziner (1994), we quote: 'The approximation of convection poses challenges which might not be expected at first sight. The problem is, essentially, one of reconciling stability, boundedness, and accuracy.'

Before pursuing the general problem, we first digress to perform some Fourier analyses that will be both useful and important when we study the semi-discretized version. To do this, we take a single Fourier mode as our initial data:

$$T_0(x) = e^{ikx} \tag{2.6-13}$$
where, because of the periodicity constraint, (2.6-10), we have a constraint on the allowable wave number:

$$k = k_n = 2\pi n, \tag{2.6-14}$$

where n is any integer (or 0). The wavelength, of course, is just

$$\lambda = \lambda_n = 2\pi/k = 1/n, \tag{2.6-15}$$

and the exact solution to (2.6-9) is simply

$$T(x,t) = e^{ik(x - ut)} = e^{i(2\pi n x - \omega_c t)}, \tag{2.6-16}$$

where $\omega_c = ku = 2\pi n u$ is the temporal frequency of the wave, with period $\tau = 2\pi/\omega_c = 1/(nu)$. (The subscript refers to the continuous case, to distinguish this frequency from those to follow.)

Remarks:

(1) If we studied instead the closely related IVP on the infinite span rather than the stated IBVP, then the restriction on k would be removed; it would be a continuous independent variable. We choose the periodic IBVP because our finite-dimensional 'analog' will be always restricted to (a finite set of) discrete wave numbers. Even so, we shall often consider n to be a continuous variable, for 'convenience' (and with little error, usually), as have many before us.

(2) The restriction to periodic BC's is not as 'severe' (unrealistic) as one might first imagine, in the following sense: lots of what is learned applies to situations with different BC's, at least during a time interval in which the BC's are not important/felt.

(3) The spatial derivatives of this wave behave as follows: $\partial^m T/\partial x^m = (ik)^m T$, which gives $\pm k^m T$ for m even and $\pm i k^m T$ for m odd; the former corresponds to dissipation and the latter to dispersion, which accounts for the often-seen statement that even-order derivatives are dissipative and odd-order are dispersive; see, for example, Ramshaw (1994).

(4) The restriction to a single Fourier mode is, of course, not really a restriction in that any IC, and solution, can always be represented as a linear combination of such modes.

(5) A quotation from Strang (1986, p. 264) is relevant here: 'Every harmonic $e^{ikx}$ is an eigenfunction of every derivative and every finite difference.'
It is, however, less relevant when we move up to quadratic finite elements, as we shall see.

o Linear elements. So, each Fourier mode translates to the right at speed u. How well is this simple behavior approximated by the GFEM (or other)? Let us find the answer to this very important question by first examining 'linear elements' on a uniform mesh (the only kind of mesh amenable to Fourier analysis, unfortunately), for which (2.6-9) is approximated by [see (2.3-10)]

$$\frac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + \frac{u}{2h}\left(T_{j+1} - T_{j-1}\right) = 0, \quad j = 1, 2, \ldots, N, \tag{2.6-17}$$

and (2.6-10) by

$$T_0 = T_N; \tag{2.6-18}$$
there are N elements and N nodes for the periodic BC problem, and Nh = 1. (It may be helpful to think of the 1D problem as one on a circular track, wherein node 0 and node N coincide, at x = 0 and x = 1 when unrolled; nodes 1 and N + 1 also coincide.) Taking our cue from (2.6-16), the general solution to the above system of ODE's is sought in the form ($x \to x_j = jh$):

$$T_j(t) = e^{i(kjh - \omega t)}, \tag{2.6-19}$$

where k is a given wave number, and $\omega$ is to be determined; hopefully it is close to $2\pi n u$. Note that (2.6-19) satisfies (2.6-18) only when $k = k_n = 2\pi n$. It is convenient, for later use, to rewrite the above result as

$$T_j(t) = e^{ik_n jh}\cdot e^{-i\omega t} = v_j^{(n)}\,e^{-i\omega t}, \tag{2.6-20}$$

which has introduced the eigenvectors $\{v^{(n)},\ n = 1, 2, \ldots, N\}$. $T_j(t)$ thus describes the temporal behavior of the n-th eigenvector. We remark that $v^{(n)}$ qualifies as an eigenvector because it satisfies the generalized eigenvalue problem

$$K v^{(n)} = \lambda_n M v^{(n)} \tag{2.6-21}$$

obtained from (2.6-17) and (2.6-18) via $T_j = v_j^{(n)} e^{-\lambda_n t}$, where K and M are (after multiplying by h) 'defined by' (2.6-17) and $\lambda_n$ is here an eigenvalue (not a wavelength); i.e., a row of the skew-symmetric matrix K reads $\frac{u}{2}(0, \to, -1, 0, 1, 0, \to)$, and one of the SPD matrix M reads $\frac{h}{6}(0, \to, 1, 4, 1, 0, \to)$, and it follows that $\lambda_n = i\omega_n$, where $\omega_n$ is to be determined. Inserting the 'test' solution, (2.6-19), into (2.6-17) leads to

$$-\frac{i\omega}{6}\left(e^{-i\theta} + 4 + e^{i\theta}\right) + \frac{u}{2h}\left(e^{i\theta} - e^{-i\theta}\right) = 0, \tag{2.6-22}$$

where $\theta = kh$ is a dimensionless wave number and $k = k_n = 2\pi n$, and we henceforth omit the subscript for 'convenience.' The 'approximate' frequency is thus found to be

$$\omega = \frac{u}{h}\,\sin\theta\cdot\frac{3}{2+\cos\theta} = uk\,\frac{\sin\theta}{\theta}\cdot\frac{3}{2+\cos\theta}, \tag{2.6-23}$$

which approximates the true frequency, $\omega_c = uk$; and it is (only) a good approximation for $\theta$ 'small.'

Remarks:

(1) We have apparently succeeded in finding the analytic solution to (2.6-17).
(2) If the mass is lumped in (2.6-17), the factor $3/(2 + \cos\theta)$ is replaced by unity; i.e.,

$$\omega_{LM} = uk\,\frac{\sin\theta}{\theta}, \tag{2.6-24}$$

which is also the second-order (centered) finite difference result.
(3) It is sometimes convenient, but not necessary nor rigorous, to regard $\theta$ as a continuous variable, rather than $\theta = \theta_n = 2\pi n/N$; n = 1, 2, ..., N. It is a good approximation when $N \gg 1$; see, for example, Hindmarsh et al. (1984).

(4) The frequency in (2.6-23) is sometimes called the 'symbol' of the operator/matrix (Vichnevetsky and Bowles, 1982), $A = -M^{-1}K$ from $\dot T = AT$; i.e., it is the (normalized) result of the effect of operating on a Fourier mode with the operator A: $S_A = e^{-ij\theta} A\cdot(e^{ij\theta})$. The separate symbols in this case are: $S_M = (2 + \cos\theta)/3$ and $S_K = iu\sin\theta/h$, whose continuous analogs are, respectively, 1 and uk. See also Swartz and Wendroff (1974), who may have been the first to analyze the phase speed of this GFEM.

Recalling that $\theta = \theta_n = k_n h = 2\pi n/N$ for n = 1, 2, ..., N, we plot in Figure 2.6-2 the eigenfrequency, $\omega_n/u$ vs (continuous) n for N = 80 for both CM and LM [e.g., from (2.6-24) we have $\omega_n/u = N\sin(2\pi n/N)$].

[Fig. 2.6-2 Eigenvalues for linear elements on pure advection. Ordinate: $\omega_n/u$, with ticks at $\pm N$ and $\pm\sqrt{3}N$; abscissa: n, with ticks at 1, N/2 and N.]

Noteworthy from this figure are:

1. The CM curve remains much closer to the goal, $2\pi n$, for the low-frequency modes, than does that for LM. But both wander quite far from the goal for n 'large.' (This is the 'cause' of dispersion error.)

2. Both go through 0 at n = N/2 ($\theta = \pi$), which corresponds to $\lambda = 2/N = 2h$, the infamous '2Δx-wave.' The eigenvector of the n = N/2 mode, from (2.6-20), is $v_j^{(N/2)} = e^{i\pi j} = (-1)^j$, showing a spatial period (wavelength) of 2Δx; hence, the name '2Δx-mode.' (If N is odd, then this wave does not strictly exist, but one very close to it, for large N, does.)

3. The curves are anti-symmetric about n = N/2.

4. The upper half of the spectrum (n > N/2) corresponds to waves shorter than 2Δx, none of which can be resolved/seen/displayed on the discrete mesh. Each of these waves is 'aliased' to the lower half of the spectrum (see Fornberg, 1996, for discussion and pictures). In fact,
5. $\omega_{N-n} = -\omega_n$ and $v^{(N-n)} = \bar v^{(n)}$ (complex conjugate) for n = 1, 2, ..., N − 1. This leads [see (2.6-20)] to $T^{(N-n)}(t) = \bar T^{(n)}(t)$, showing that each high-'frequency' mode is aliased to the complex conjugate of the corresponding low-frequency mode. (What you 'thought' was a high-frequency wave is actually a low-frequency one.)

6. n = N is the special case of the constant eigenvector with zero frequency and infinite wavelength.

Thus, in a sense, we can 'disregard' the upper half of the spectrum, and focus our attention on only the resolvable lower half. (Note, however, that this does not mean we are really discarding useless or redundant 'information'; we just look in two different 'ways' at the lower portion of the spectrum, a 'consequence' of periodic BC's. The eigenvectors for the upper half, for example, are needed for the expansion of an arbitrary vector on the grid, as we will show later.) So now let us look more closely at the lower half (Figure 2.6-3), in which we also show the wavelength scale.

[Fig. 2.6-3 Eigenvalues for linear elements on pure advection: a closer look. Ordinate: $\omega_n/u$; legible curve labels: Analytical ($2\pi n$) and Consistent; Nh = 1.]

Additional noteworthy points are:

1. The midpoint, N/4 ($\lambda$ = 4Δx), is special in two ways, the first of which is quite important:

(i) It divides the modes 50/50 (for all semi-discrete approximations, not just the two under discussion); one half of the modes, which we might call long-wave modes, have wave numbers between $2\pi$ and $N\pi/2$ (wavelengths between 4Δx and NΔx = 1), and the other half, which are obviously called short waves, have wave numbers between $N\pi/2$ and $N\pi$ (wavelengths between 2Δx and 4Δx). Clearly, 'too many' waves are short, and this is why wiggles of 'all kinds' can occur, especially for the lumped-mass case.
(ii) It is the point where $d\omega/dk = 0$ for the lumped-mass (finite difference) case; i.e., the group velocity of the 4Δx wave is zero.

2. The eigenvector of each short-wave mode ($2h < \lambda < 4h$) is also actually the product of the 2Δx eigenvector with the (complex conjugate of the) eigenvector of the 'corresponding' long-wave mode; i.e., we have

$$v_j^{(N/2-m)} = e^{2\pi i j (N/2-m)/N} = e^{i\pi j}\,e^{-2\pi i j m/N} = v_j^{(N/2)}\cdot\bar v_j^{(m)} = (-1)^j\,\bar v_j^{(m)}; \tag{2.6-25}$$

thus, if $v^{(m)}$ is considered as 'smooth,' then $v^{(N/2-m)}$ is 'rough,' a distinction that is most clear when m is 'small' and least clear as $m \to N/4$, the 4Δx wave. Even though mode N/2 − m can be obtained from mode m, using (2.6-25), it is important to realize that this does not imply that mode N/2 − m is not an independent eigenvector. In fact, each of the N eigenvectors is distinct; the vector space is only spanned when all N of them are utilized, even though $v^{(N-m)} = \bar v^{(m)}$ for m = 1, 2, ..., N − 1, and $v_j^{(N/2-m)} = (-1)^j\,\bar v_j^{(m)}$ for m = 1, 2, ..., N/2 − 1.

3. If we instead define $d\omega/dk = 0$ as the dividing point between long and short waves (i.e., G = 0), which actually makes good sense, the consistent mass spectrum clearly contains more long-wave modes than does the lumped-mass spectrum (one-third more, in fact, since $d\omega/dk = 0$ gives $\lambda$ = 3Δx; 2/3 of the modes are 'long'). Many of the waves that look 'short' (hard) for the less-accurate lumped-mass model, look 'long' (easy) for the GFEM (consistent-mass) model. And we shall later argue that $d\omega/dk = 0$ is indeed a useful definition of long vs short.

[Fig. 2.6-4 Eigenvectors for linear elements; n = 1, 39. Abscissa: $x_j$, from 1/N to 1.]

It may be useful to show a few of these analytical eigenvector results pictorially, which we do with N = 80 (h = 0.0125). Figures 2.6-4 and 2.6-5 show the eigenvector
(real part) $\cos(2\pi n j/N) = \cos(2\pi n x_j)$, j = 1, 2, ..., N for n = 1 (an 80Δx wave, shown dashed) and n = 2 (a 40Δx wave, dashed) along with their short-wave analogs, n = 39 and 38, respectively (shown solid), on the same graphs. The wavelengths of the latter two are (80/39)Δx and (80/38)Δx, respectively; i.e., they are close to the limit of 2Δx, given by n = 40. This shows the long and short 'pairs' referred to above, and it is relevant to note that only the long-wave modes look like 'conventional' cosine curves. The modulated cosine that is the short-wave mode is another consequence of dealing with a high-frequency wave on a discrete mesh; e.g., the function $\cos(2\pi j \times 38/80) = (-1)^j \cos(2\pi j \times 2/80)$ displays the shape that it does owing to sampling 'error,' a form of 'aliasing.' If j was instead a continuous variable, the 38-th mode would be a pure cosine wave (amplitude ±1 with no modulation) of wavelength (80/38)Δx; i.e., there would be 38 cosine cycles across the mesh. Sampling this 'pure' wave at the discrete points j = 1, 2, ..., 80 yields the discrete eigenvector for mode 38, and is the 'cause' of group velocity error, as we shall show (near the end of this section).

[Fig. 2.6-5 Eigenvectors for linear elements; n = 2, 38. Abscissa: $x_j$, from 1/N to 1.]

Moving more toward the middle of the spectrum, Figures 2.6-6 and 2.6-7 show n = 10 (an 8Δx wave) and n = 40 − 10 = 30 (an 8Δx/3 wave, 3 waves per 8Δx), shown separately for ease of viewing even though mode 30 is a 2Δx modulation of mode 10. Finally, Figure 2.6-8 shows a 4Δx wave, n = 20, halfway through the spectrum. Enough pictures for now. Later we shall return to these figures in our discussion of phase and group velocities.

o Quadratic elements.
Before discussing phase speed and group velocity, and their comparison with the exact results (the next 'logical' step), let us examine one 'higher-order' element, the quadratic, to see how much more difficult the analysis becomes for methods in which 'one-node-looks-just-like-another' (on a uniform mesh) does not hold true. It will also reveal how significantly more accurate this element is than the linear one, and not just asymptotically for $h \to 0$, which is all that 'local' theory can predict. The pure advection equation, via quadratics, is available from (2.3-16) and (2.3-18), as

$$\frac{1}{10}\left(\dot T_{j-1} + 8\dot T_j + \dot T_{j+1}\right) + u\,\frac{T_{j+1} - T_{j-1}}{2h} = 0 \tag{2.6-26}$$
for center nodes, j = 1, 3, ..., N (N is necessarily odd, in contrast to the situation discussed earlier, in Section 2.3-1, with different BC's), and

$$\frac{1}{10}\left(-\dot T_{j-2} + 2\dot T_{j-1} + 8\dot T_j + 2\dot T_{j+1} - \dot T_{j+2}\right) + 2u\,\frac{T_{j+1} - T_{j-1}}{2h} - u\,\frac{T_{j+2} - T_{j-2}}{4h} = 0 \tag{2.6-27}$$

for edge nodes, j = 2, 4, ..., N + 1, where $h = \ell/2$ is the internodal distance (half the element length $\ell$), the number of elements is (N + 1)/2, the number of nodes is N + 1, h = 1/(N + 1), and we require $N \ge 5$ for the equations to make sense.

[Fig. 2.6-6 Eigenvector for linear elements; n = 10. Abscissa: $x_j$, from 1/N to 1.]

[Fig. 2.6-7 Eigenvector for linear elements; n = 30. Abscissa: $x_j$, from 1/N to 1.]

If we seek a solution to (2.6-26) and (2.6-27) in the form of (2.6-20) (and we have), we will fail (and we did). Due account must be made of the difference between edge
and midside nodes, and this can be accomplished by seeking a solution in the form of (2.6-20) for edge nodes, and

$$T_j(t) = \beta\,e^{i(kjh - \omega t)} \tag{2.6-28}$$

for center nodes, where $\beta$ is to be determined along with $\omega$. A combined form that covers all nodes is

$$T_j(t) = \tfrac{1}{2}\left[(1 + \beta) + (-1)^j (1 - \beta)\right] e^{i(kjh - \omega t)}, \tag{2.6-29}$$

and another is

$$T_j(t) = \left[\frac{1 + (-1)^j}{2} + \beta\,\frac{1 - (-1)^j}{2}\right] e^{i(kjh - \omega t)} \tag{2.6-30}$$

for j = 1, 2, ..., N + 1. The analog of (2.6-10) is $T_0 = T_{N+1}$ and $T_{N+2} = T_1$, which, again, requires $k = k_n = 2\pi n$.

[Fig. 2.6-8 Eigenvector for linear elements; n = 20.]

Inserting the trial solution into (2.6-26) and (2.6-27) yields the pair

$$-\frac{i\omega}{10}\left(-e^{-2i\theta} + 2\beta e^{-i\theta} + 8 + 2\beta e^{i\theta} - e^{2i\theta}\right) + 2\beta u\,\frac{e^{i\theta} - e^{-i\theta}}{2h} - u\,\frac{e^{2i\theta} - e^{-2i\theta}}{4h} = 0 \tag{2.6-31}$$

and

$$-\frac{i\omega}{10}\left(e^{-i\theta} + 8\beta + e^{i\theta}\right) + u\,\frac{e^{i\theta} - e^{-i\theta}}{2h} = 0, \tag{2.6-32}$$

for $\omega$ and $\beta$. Rewriting these in terms of the trigonometric functions gives

$$5u\left(4\beta\sin\theta - \sin 2\theta\right) = 2\omega h\left(4 + 2\beta\cos\theta - \cos 2\theta\right) \tag{2.6-33}$$

and

$$5u\sin\theta = \omega h\left(4\beta + \cos\theta\right). \tag{2.6-34}$$
Solving the second of these for $\beta$, inserting the result into the first, and rearranging yields the following quadratic equation:

$$\left(3 - \cos 2\theta\right)\left(\frac{\omega h}{u}\right)^2 + \left(4\sin 2\theta\right)\frac{\omega h}{u} - 5\left(1 - \cos 2\theta\right) = 0, \tag{2.6-35}$$

with solution

$$\frac{\omega h}{u} = \frac{-2\sin 2\theta \pm \sqrt{(1 - \cos 2\theta)(19 - \cos 2\theta)}}{3 - \cos 2\theta}, \tag{2.6-36}$$

and we are up against the dilemma that has confused many an analyst in the past; namely, the quadratic equation gives two roots (frequencies) for each wave number. (Recall $\theta = kh = 2\pi n h$.) Is one of them spurious (extraneous) and if so, which? And why?

Before answering these questions, we briefly review some of the previous literature on the subject that may have given 'quads' a bad name, all because both roots of the quadratic were utilized and assumed needed/present, with the 'explanation' (rationale/excuse) that these silly elements possessed one spurious non-physical mode for each physical mode. And to show that we too are not immune as we criticize previous efforts, please note that the earliest entry on this list of references is ourselves. The list that we are aware of includes Gresho et al. (1976), Hedstrom (1979), Cullen and Morton (1980), Cullen (1982), Cathers and O'Conner (1985), Bates and Cathers (1986). On the other hand, there is at least one publication early-on that got it right: Vichnevetsky and DeSchutter (1975). It is too bad that this early paper was not seen by all those later people who 'blew it.' [We finally discovered the error, and published correct results (there are no spurious modes) in Gresho and Lee (1987) and Rowley and Gresho (1987).]

The best way we have found to resolve this dilemma is to realize that $\omega h/u$ is closely related to an eigenvalue of $M^{-1}K$ and to invoke linear algebra theory, where M is the mass matrix and K the advection matrix defined by (2.6-26) and (2.6-27).
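To make the two-root 'dilemma' concrete, the following Python sketch evaluates both roots of (2.6-36) for a given $\theta$ and substitutes them back into (2.6-35). The function names are illustrative assumptions; the code only exhibits the two algebraic roots and makes no claim about which of them belongs to which mode.

```python
import math


def quad_roots(theta):
    """Return both roots W = omega*h/u of (2.6-35), as given by (2.6-36)."""
    s2, c2 = math.sin(2.0 * theta), math.cos(2.0 * theta)
    disc = math.sqrt((1.0 - c2) * (19.0 - c2))
    w_plus = (-2.0 * s2 + disc) / (3.0 - c2)
    w_minus = (-2.0 * s2 - disc) / (3.0 - c2)
    return w_plus, w_minus


def quadratic_residual(w, theta):
    """Left-hand side of (2.6-35) evaluated at W = w (zero for a true root)."""
    s2, c2 = math.sin(2.0 * theta), math.cos(2.0 * theta)
    return (3.0 - c2) * w * w + 4.0 * s2 * w - 5.0 * (1.0 - c2)


if __name__ == "__main__":
    for theta in (0.01, 0.5, 1.5, 3.0):
        wp, wm = quad_roots(theta)
        # For small theta the plus root is close to theta (i.e. omega ~ u*k),
        # while the minus root is negative.
        print(theta, wp, wm,
              quadratic_residual(wp, theta), quadratic_residual(wm, theta))
```

For small $\theta$ the plus root approaches $\theta$, which is the physically expected $\omega \approx uk$, while the minus root has the opposite sign; this sign difference is what was read as an upstream-moving 'spurious' mode in the literature cited above.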
Actually, to display the symmetries that we will need below, these equations must first be returned/converted to a form more like that of GFEM. (We presented them in finite difference form for ease of 'interpretation.') Thus, multiplying (2.6-26) by 4h/3 and (2.6-27) by 2h/3 yields the desired form; namely,

$$M\dot T + KT = 0, \tag{2.6-37}$$

whose solution we now seek in the form $T(t) = q\,e^{-\lambda t}$ to give

$$Kq = \lambda Mq; \tag{2.6-38}$$

$\lambda$ is an eigenvalue of $M^{-1}K$ (one of precisely N + 1), and q the corresponding ('quad') eigenvector, of length N + 1; also, $\lambda = i\omega$. Thus, if we can determine the proper properties of the spectrum, we will have resolved the dilemma regarding $\omega$. And this we can do because M is symmetric-positive-definite (SPD), and K is skew-symmetric. We proceed as follows: since M is SPD, its square root exists (and is also SPD) as does the inverse of its square root. Thus, (2.6-38) can be written as $M^{-1/2} K M^{-1/2} x \equiv Ax = \lambda x$, where $x = M^{1/2} q$. Now, since $K^T = -K$, so too does $A^T = -A$. Since A is a real skew-symmetric matrix, its eigenvalues [except $\lambda = 0$, which occurs for n = (N + 1)/2 and n = N + 1] are pure imaginary and occur in conjugate pairs. Since there are N + 1 − 2 nonzero values of $\lambda$, there are at most (N − 1)/2 distinct values of $|\omega|$.

Now return to the quadratic equation solution, (2.6-36), and use $\theta = kh = k_n h = 2\pi n h = 2\pi n/(N + 1)$ for n = 1, 2, ..., N + 1. In order to obtain just N + 1 roots and
not 2(N + 1) and to obtain the proper complex conjugate pairs, it is necessary to select the plus (or minus) sign for the first (lower) half of the spectrum (n = 1, 2, ..., (N + 1)/2) and the minus (or plus) for the upper half. (This also rejects the N + 1 extraneous roots.) We choose the former (plus for first half, minus for second) to give, for convenience and compatibility, $\omega_n \to u k_n$ for $h \to 0$ when n < (N + 1)/2, rather than $\omega_n \to -u k_n$. The first (N + 1)/2 frequencies are thus positive, and the second half negative, with $\omega_{N+1-n} = -\omega_n$ corresponding to the complex conjugate pairs of eigenvalues (cf. the previous frequency plot, Figure 2.6-2, in which the same behavior is shown using linear basis functions, with no need to solve a quadratic equation). The spurious root issue was a spurious issue.

Remark: We suspect, but have not proven and thus merely offer a potentially useful warning, that similar erroneous conclusions have been obtained, via similar erroneous analyses, in two other CFD fields: shallow water equations and acoustics. For the former, see Kinnmark and Gray (1985) and Kinnmark (1986), wherein they asserted the existence of spurious solutions ('numerical artifacts'). For the latter, see Belytschko and Mullen (1978), Mullen and Belytschko (1982), and Schreyer (1983), wherein the extraneous roots were identified as 'optical (rather than acoustical) branches.' That these optical branches may be optical illusions has also been pointed out by, at least, Abboud and Pinsky (1990, 1992).

To complete the analysis, we need $\beta$; from (2.6-34), this is simply

$$\beta_n = \frac{1}{4}\left(\frac{5u\sin\theta_n}{\omega_n h} - \cos\theta_n\right), \quad n = 1, 2, \ldots, N + 1, \tag{2.6-39}$$

with $\theta_n = 2\pi n/(N + 1)$, and we are done.
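The sign selection and (2.6-39) can be checked by direct substitution. The Python sketch below computes $\omega_n$ and $\beta_n$ for a given mode and evaluates the left-hand sides of (2.6-31) and (2.6-32), which should vanish to rounding error. The function names and the choice N = 79 in the demo are assumptions made for illustration; the two zero-frequency modes are excluded because (2.6-39) is of the form 0/0 there.

```python
import cmath
import math


def quad_mode(n, N, u=1.0):
    """Return (theta_n, omega_n, beta_n) for the quadratic GFEM on N+1 nodes.

    Uses (2.6-36) with '+' for n < (N+1)/2 and '-' for n > (N+1)/2,
    then (2.6-39) for the mid-side amplitude.
    """
    if n % (N + 1) == 0 or 2 * n == N + 1:
        raise ValueError("zero-frequency mode: (2.6-39) is indeterminate")
    h = 1.0 / (N + 1)
    theta = 2.0 * math.pi * n / (N + 1)
    s2, c2 = math.sin(2.0 * theta), math.cos(2.0 * theta)
    disc = math.sqrt((1.0 - c2) * (19.0 - c2))
    sign = 1.0 if 2 * n < N + 1 else -1.0
    omega = (u / h) * (-2.0 * s2 + sign * disc) / (3.0 - c2)
    beta = 0.25 * (5.0 * u * math.sin(theta) / (omega * h) - math.cos(theta))
    return theta, omega, beta


def residuals(n, N, u=1.0):
    """Magnitudes of the left-hand sides of (2.6-31) and (2.6-32)."""
    h = 1.0 / (N + 1)
    theta, omega, beta = quad_mode(n, N, u)
    e1 = cmath.exp(1j * theta)
    e2 = cmath.exp(2j * theta)
    edge = (-1j * omega / 10.0) * (-1.0 / e2 + 2.0 * beta / e1 + 8.0
                                   + 2.0 * beta * e1 - e2)
    edge += 2.0 * beta * u * (e1 - 1.0 / e1) / (2.0 * h)
    edge -= u * (e2 - 1.0 / e2) / (4.0 * h)
    center = (-1j * omega / 10.0) * (1.0 / e1 + 8.0 * beta + e1)
    center += u * (e1 - 1.0 / e1) / (2.0 * h)
    return abs(edge), abs(center)


if __name__ == "__main__":
    N = 79  # illustrative choice, giving N + 1 = 80 nodes
    for n in (1, 10, 39, 41, 70, 79):
        theta, omega, beta = quad_mode(n, N)
        print(n, omega, beta, residuals(n, N))
```

With this selection the computed frequencies pair up as $\omega_{N+1-n} = -\omega_n$, and $\beta_n$ takes the same value for n and N + 1 − n.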
If the mass is lumped [replacing all time derivatives by $\dot T_j$], then the RHS's of (2.6-33) and (2.6-34) are replaced by $10\omega h$ and $5\beta\omega h$, respectively, (2.6-36) is replaced by

$$\frac{\omega h}{u} = \frac{-\sin 2\theta \pm \sqrt{17 - 16\cos 2\theta - \cos^2 2\theta}}{4}, \tag{2.6-40}$$

and (2.6-39) by

$$\beta_n = \frac{u\sin\theta_n}{\omega_n h}. \tag{2.6-41}$$

It is noteworthy that the analog of (2.6-25) also applies to these quads, with the eigenvector being now [cf. (2.6-30)]

$$q_j^{(n)} = \left[\frac{1 + (-1)^j}{2} + \beta_n\,\frac{1 - (-1)^j}{2}\right] e^{ikjh}, \tag{2.6-42}$$

where $k = 2\pi n$, n = 1, 2, ..., N + 1; i.e.,

$$q_j^{[(N+1)/2 - m]} = (-1)^j\,\bar q_j^{(m)}, \tag{2.6-43}$$

although in this case, as we shall see, (N + 1)/2 is perhaps not the best definition of the long-wave/short-wave dividing point. (It still does divide the modes 50/50, however, as mentioned earlier.)
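The lumped-mass formulas can be checked in the same spirit. The Python sketch below, with hypothetical function names and with u and h scaled out (so that $W = \omega h/u$), evaluates the plus root of (2.6-40) and the amplitude (2.6-41), and then substitutes them into (2.6-33) and (2.6-34) with the lumped right-hand sides $10\omega h$ and $5\beta\omega h$.

```python
import math


def lumped_quad(theta):
    """Plus root W = omega*h/u of (2.6-40) and beta from (2.6-41)."""
    s2, c2 = math.sin(2.0 * theta), math.cos(2.0 * theta)
    w = (-s2 + math.sqrt(17.0 - 16.0 * c2 - c2 * c2)) / 4.0
    beta = math.sin(theta) / w
    return w, beta


def lumped_residuals(theta):
    """Residuals of (2.6-33)/(2.6-34) after the lumped-mass replacement.

    Edge nodes:   5 (4 beta sin(theta) - sin(2 theta)) - 10 W = 0
    Center nodes: 5 sin(theta) - 5 beta W = 0
    """
    w, beta = lumped_quad(theta)
    edge = 5.0 * (4.0 * beta * math.sin(theta) - math.sin(2.0 * theta)) - 10.0 * w
    center = 5.0 * math.sin(theta) - 5.0 * beta * w
    return edge, center


if __name__ == "__main__":
    for theta in (0.01, 0.3, 1.0, 2.5):
        print(theta, lumped_quad(theta), lumped_residuals(theta))
```

The sketch is restricted to $0 < \theta < \pi$, where the plus root is positive; at $\theta = \pi$ the frequency vanishes and (2.6-41) becomes indeterminate.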
Now that we have found our 'Fourier' modes for quads, we realize that they are not simple/conventional Fourier modes, a consequence of the two different types of equations. The plot of $\beta_n$ vs n in Figure 2.6-9 for N = 79, which is symmetrical about (N + 1)/2, shows that only the low modes ($n \ll N$) look like simple Fourier modes; higher modes have a reduced amplitude for mid-side nodes, reaching 1/2 at the shortest resolvable mode [n = (N + 1)/2, $k_n h = 2\pi n/(N + 1) = \pi$, $\lambda_n = 1/n = 2h$, the '2Δx' mode]. This causes, for example, the 'simple'/conventional 2Δx mode (±1) to be a linear combination of two quad eigenvectors, which we will later explicate. This somewhat awkward behavior, while in no way harming the approximating properties of the quadratic basis functions, leads to another point of confusion in the literature (see, for example, Cullen, 1982). This is awkward and unfortunate; the latter because those potential newcomers to quadratic finite elements who study only these negative and misleading papers will probably be 'turned off' when they should not be. We try herein to 'restore the faith', since quads are actually very accurate for advection.

To start, let us try to reinterpret the situation as follows: it is just pure dumb luck that some numerical approximations (linear FEM, centered FDM) share the same eigenvector (discrete version, on a uniform mesh) as the continuum. Such a fortunate occurrence merely makes the analysis easier; it does not make the approximation better. For example, if quadratic B-splines are utilized as basis functions in the Galerkin method (a non-interpolatory, and thus awkward, basis) the semi-discrete equations are simpler in that they do satisfy the one-node-looks-like-another property, on a uniform mesh, and the advection problem looks like (see, for example, Chin et al.
1979, or Vichnevetsky and Bowles, 1982)

$$\frac{1}{120}\left(\dot T_{j-2} + 26\dot T_{j-1} + 66\dot T_j + 26\dot T_{j+1} + \dot T_{j+2}\right) + \frac{u}{24h}\left(T_{j+2} + 10T_{j+1} - 10T_{j-1} - T_{j-2}\right) = 0, \tag{2.6-44}$$

which also gives a highly accurate approximation asymptotically [phase error = $O(h^6)$ vis-a-vis $O(h^4)$ for the GFEM quads] from $c = 5u(\sin 2\theta + 10\sin\theta)/[\theta(33 + 26\cos\theta + \cos 2\theta)]$. But if one plots $c(\theta)$ and $G(\theta)$ for these 'unambiguous' (and smoother, with $C^1$ continuity) quads, one would find that they lie virtually on top of the corresponding curves from the 'conventional' (GFEM) quads that we will present later; i.e., they would perform just about like the $C^0$ quads in practice.

[Fig. 2.6-9 Mid-side node amplitude coefficient for quads.]

Another blow to the 'simple' Fourier analysis is this: real grids rarely have equal nodal spacing, thus restricting the Fourier analysis method to a small subset of 'model' problems. What happens in practice, and which can only be dealt with in 'generalities,' is that every 'real' problem can be represented in terms of eigenvectors of the matrix ($M^{-1}K$ in our case) approximating the differential operator. Only for special methods and special grids do the eigenvectors take the simple form of Fourier modes, or any other simple 'analytic' form. Thus, whereas the quadratic element displays 'modes' that differ even more from simple trigonometric functions, these modes still qualify for (i) representing an arbitrary function via linear combinations, and (ii) modal analysis that asks 'How well is each mode propagated with respect to the ideal?' That is, how close is the phase velocity of each mode to the fluid velocity, u, as we let n range through the spectrum? The answer is this: quads do an excellent job of modal translation/advection relative to linears (and to many very high-order FDM's) because they transport a larger fraction of their modes at much closer to the proper velocity, as we shall see. (The errant literature referred to above has interpreted their extraneous roots as modes that move upstream, against the flow; a totally fallacious conclusion.)
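The asymptotic orders quoted for the two kinds of quadratics can be estimated numerically. The Python sketch below evaluates the B-spline phase speed given above and the $C^0$ quadratic phase speed obtained as $c = \omega/k$ from the plus root of (2.6-36), then estimates the exponent p in $|1 - c/u| \sim \theta^p$ by halving $\theta$. The function names and the sample values of $\theta$ are assumptions chosen for illustration.

```python
import math


def c_bspline(theta, u=1.0):
    """Phase speed of the quadratic B-spline Galerkin scheme (2.6-44)."""
    num = 5.0 * u * (math.sin(2.0 * theta) + 10.0 * math.sin(theta))
    den = theta * (33.0 + 26.0 * math.cos(theta) + math.cos(2.0 * theta))
    return num / den


def c_quad(theta, u=1.0):
    """Phase speed of the C0 quadratic GFEM: c = omega/k, plus root of (2.6-36)."""
    s2, c2 = math.sin(2.0 * theta), math.cos(2.0 * theta)
    w = (-2.0 * s2 + math.sqrt((1.0 - c2) * (19.0 - c2))) / (3.0 - c2)
    return u * w / theta


def observed_order(c, theta):
    """Estimate p in |1 - c(theta)| ~ theta**p (with u = 1) by halving theta."""
    e_coarse = abs(1.0 - c(theta))
    e_fine = abs(1.0 - c(theta / 2.0))
    return math.log(e_coarse / e_fine, 2.0)


if __name__ == "__main__":
    print("B-spline order estimate:", observed_order(c_bspline, 0.2))
    print("C0 quad order estimate: ", observed_order(c_quad, 0.2))
    for theta in (0.5, 1.0, 1.5):
        print(theta, c_bspline(theta), c_quad(theta))
```

The estimates only probe the small-$\theta$ limit; over the rest of the resolvable range the two phase-speed curves can be compared directly with the loop at the end.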
A hint of their behavior is shown in Figure 2.6-10, which should be compared with Figure 2.6-3 and which shows graphs of (2.6-36) and (2.6-40), using the + sign only (because we are plotting only the lower half of the spectrum, recall). The GFEM version (consistent mass) hugs the exact solution line for a very large fraction of its spectrum. Lumped mass quads, on the other hand, are (slightly) less accurate than consistent mass linears.

[Fig. 2.6-10 Eigenvalues for quadratic elements on pure advection. Ordinate: $\omega_n/u$, with ticks at 0, N and 2N; abscissa: n, with ticks at 1, (N+1)/4 and (N+1)/2; legible curve labels: Analytical ($2\pi n$) and Consistent.]

If $d\omega/dk = 0$ is used to separate short from long waves for CM quads,
which definition we strongly recommend in general, then almost three-quarters of the waves/modes fall into the long (easy) category (vs two-thirds for GFEM linears and one-half for LM linears).

Again, a few eigenvector pictures may be useful, although we will again be somewhat misled using quads, this time because of what could be called laziness; i.e., we use a 'conventional' plotting package that connects discrete data points with straight line segments rather than the proper piecewise-parabolas. The key difference, though, will be in the '$\beta$-factor' for the center nodes; e.g., the 2Δx wave oscillates between +1 and −1/2 rather than between ±1, and should look like a sequence of parabolas, as shown (in part) in Figure 2.6-11. This is the mode that modulates the lower modes to 'generate' the corresponding higher modes. Figure 2.6-12 shows n = 1 (dashed) and n = 39 (solid), corresponding to Figure 2.6-4 for linears, in which the $\beta$-factor is again easily recognized. We will skip n = 2, 38, and 20, and conclude with the analog of Figures 2.6-6 and 2.6-7: n = 10, 30, shown in Figures 2.6-13 and 2.6-14.

[Fig. 2.6-11 The 2Δx eigenvector for quads. Ordinate ticks: 1.0, 0.5, −0.5, −1.0.]

[Fig. 2.6-12 Eigenvectors for quadratic elements; n = 1, 39. Abscissa from 1/(N+1).]
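The effect of the $\beta$-factor on the nodal values can be seen by tabulating the real part of (2.6-42). The Python sketch below does this for modes in the lower half of the spectrum, using the plus root of (2.6-36) and then (2.6-39); the function name and the choice N = 79, n = 39 in the demo are illustrative assumptions (n = 39 is the mode closest to the 2Δx mode, for which (2.6-39) itself is indeterminate).

```python
import math


def quad_eigenvector(n, N):
    """Return (beta_n, real part of the quad eigenvector (2.6-42)) for mode n.

    Nodes are j = 1, ..., N+1; odd j are element-center (mid-side) nodes with
    amplitude beta_n, even j are element-edge nodes with amplitude 1.
    Valid for 1 <= n < (N+1)/2, where the plus root of (2.6-36) applies.
    """
    if not (1 <= n and 2 * n < N + 1):
        raise ValueError("this sketch covers only the lower half of the spectrum")
    theta = 2.0 * math.pi * n / (N + 1)
    s2, c2 = math.sin(2.0 * theta), math.cos(2.0 * theta)
    w = (-2.0 * s2 + math.sqrt((1.0 - c2) * (19.0 - c2))) / (3.0 - c2)  # (2.6-36)
    beta = 0.25 * (5.0 * math.sin(theta) / w - math.cos(theta))         # (2.6-39)
    vec = []
    for j in range(1, N + 2):
        amplitude = beta if j % 2 == 1 else 1.0
        vec.append(amplitude * math.cos(theta * j))
    return beta, vec


if __name__ == "__main__":
    beta, vec = quad_eigenvector(39, 79)
    print("beta_39 =", beta)
    print("first six nodal values:", vec[:6])
```

For n = 39 the edge-node values stay near ±1 in magnitude while the center-node values are roughly halved, which is the near-2Δx pattern described above.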
[Fig. 2.6-13 Eigenvector for quadratic elements; n = 10. Abscissa from 1/(N+1).]

[Fig. 2.6-14 Eigenvector for quadratic elements; n = 30. Abscissa from 1/(N+1).]

o Phase and group speeds. We are now ready to examine the numerical phase and group velocities (neither of which is simply u, as in the continuum), for both linear and quadratic elements, and see how well the semi-discrete systems can translate/advect the various 'Fourier' modes. The phase speed, from (2.6-3), is just $c = |\mathbf{c}| = \omega/k$, and the group velocity (= group speed in 1D), from (2.6-7), is just $G = d\omega/dk$, where now it will really be seen how 'convenient' it is to regard k as a continuous variable. For linear finite elements, we thus have, from (2.6-19),

$$T_j(t) = e^{i(kx_j - \omega t)} = e^{ik(x_j - \omega t/k)} = e^{ik(x_j - ct)} \tag{2.6-45}$$
rather than $T(x_j, t) = e^{ik(x_j - ut)}$ from (2.6-16). Here $c = \omega/k$ is the phase speed of the Fourier mode $k = k_n = 2\pi n$, obtained from (2.6-23) for GFEM and from (2.6-24) for its LM counterpart; i.e.,

$$c = u\,\frac{\sin\theta}{\theta}\cdot\frac{3}{2+\cos\theta} \tag{2.6-46}$$

and

$$c_{LM} = u\,\frac{\sin\theta}{\theta}. \tag{2.6-47}$$

The respective group velocities, $G = d\omega/dk = h\,d\omega/d\theta = d(ck)/dk = c + k\,dc/dk = c + \theta\,dc/d\theta$, are

$$G = u\,\frac{1 + 2\cos\theta}{2+\cos\theta}\cdot\frac{3}{2+\cos\theta} \tag{2.6-48}$$

and

$$G_{LM} = u\cos\theta. \tag{2.6-49}$$

For quads, the corresponding results are [from (2.6-36)]

$$c = u\,\frac{-2\sin 2\theta + \sqrt{(1 - \cos 2\theta)(19 - \cos 2\theta)}}{\theta\,(3 - \cos 2\theta)} \tag{2.6-50}$$

for consistent mass and, for lumped mass,

$$c_{LM} = u\,\frac{-\sin 2\theta + \sqrt{17 - 16\cos 2\theta - \cos^2 2\theta}}{4\theta} \tag{2.6-51}$$

from (2.6-40), with corresponding group velocities

$$G = \frac{2u}{(3 - \cos 2\theta)^2}\left[2(1 - 3\cos 2\theta) + \frac{\sin 2\theta\,(11 + 7\cos 2\theta)}{\sqrt{(1 - \cos 2\theta)(19 - \cos 2\theta)}}\right] \tag{2.6-52}$$

and

$$G_{LM} = \frac{u}{2}\left[\frac{(8 + \cos 2\theta)\sin 2\theta}{\sqrt{17 - 16\cos 2\theta - \cos^2 2\theta}} - \cos 2\theta\right]. \tag{2.6-53}$$

These functions are plotted in Figures 2.6-15 and 2.6-16, in which we switch to $\theta_n$ on the abscissa rather than n; also, we ignore the 'discrete' effects (such as n = N/2 $\Rightarrow$ N is even, and N should really be N + 1 for quads) and consider $\theta_n$ as a continuous variable, for convenience. (The curve labeled 'Best Petrov-Galerkin' will be discussed later.)

Remarks:

(1) The phase speed for CVFEM, from (2.5-6), is easily found to be [replacing (1 4 1)/6 by (1 6 1)/8 in the mass matrix 'translates' to: replace $(2 + \cos\theta)/3$ by $(3 + \cos\theta)/4$ in the phase speed equation] $c = u\,(\sin\theta/\theta)\cdot 4/(3 + \cos\theta)$, a result also shown in Figure 2.6-15. $G = u\cdot 4/(3 + \cos\theta)\cdot(1 + 3\cos\theta)/(3 + \cos\theta)$ is shown in Figure 2.6-16. CVFEM is not nearly as accurate as GFEM with linear basis functions.
(2) In fact, however, virtually all FVM users lump their mass, giving up what little gain they could have. The curve labeled 'Lumped Linear' describes their scheme.

(3) The phase speed of quadratic CVFEM [see (2.5-11) through (2.5-15)] would lie just below that of linear GFEM (not plotted, not worth it).

[Fig. 2.6-15 Phase speed for several elements: c/u vs $\theta$ (0 to $\pi$) for Quadratic, Linear, Lumped Quad., Lumped Linear, CVFEM and Best Petrov-Galerkin.]

[Fig. 2.6-16 Group velocity for several elements: G/u vs $\theta$ (0 to $\pi$) for Quadratic, Linear, Lumped Quadratic, Lumped Linear and CVFEM.]

Recall: the phase speed shows the modal speed (mode n moves at speed $c_n$), and G gives the velocity of a wave group, a concept that may thus far seem somewhat vague, but soon we shall make it clear (hopefully) via examples, including some with negative group speed.
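The phase-speed and group-velocity formulas (2.6-46) through (2.6-53), together with the CVFEM expressions of Remark (1), are simple enough to tabulate directly. The Python sketch below does so for a few modes of an 80-node mesh with u = 1; the function names and the particular modes printed are illustrative assumptions, and the quadratic formulas are evaluated only for $0 < \theta < \pi$ (they are of the form 0/0 at $\theta = \pi$).

```python
import math


def linear_cm(theta, u=1.0):
    """Consistent-mass linears: c from (2.6-46), G from (2.6-48)."""
    c = u * (math.sin(theta) / theta) * 3.0 / (2.0 + math.cos(theta))
    g = u * 3.0 * (1.0 + 2.0 * math.cos(theta)) / (2.0 + math.cos(theta)) ** 2
    return c, g


def linear_lm(theta, u=1.0):
    """Lumped-mass linears (centered FDM): (2.6-47) and (2.6-49)."""
    return u * math.sin(theta) / theta, u * math.cos(theta)


def cvfem(theta, u=1.0):
    """Linear CVFEM with its (1 6 1)/8 mass matrix, from Remark (1)."""
    c = u * (math.sin(theta) / theta) * 4.0 / (3.0 + math.cos(theta))
    g = u * 4.0 * (1.0 + 3.0 * math.cos(theta)) / (3.0 + math.cos(theta)) ** 2
    return c, g


def quad_cm(theta, u=1.0):
    """Consistent-mass quads: c from (2.6-50), G from (2.6-52)."""
    s2, c2 = math.sin(2.0 * theta), math.cos(2.0 * theta)
    root = math.sqrt((1.0 - c2) * (19.0 - c2))
    c = u * (-2.0 * s2 + root) / (theta * (3.0 - c2))
    g = (2.0 * u / (3.0 - c2) ** 2) * (2.0 * (1.0 - 3.0 * c2)
                                       + s2 * (11.0 + 7.0 * c2) / root)
    return c, g


def quad_lm(theta, u=1.0):
    """Lumped-mass quads: c from (2.6-51), G from (2.6-53)."""
    s2, c2 = math.sin(2.0 * theta), math.cos(2.0 * theta)
    root = math.sqrt(17.0 - 16.0 * c2 - c2 * c2)
    c = u * (-s2 + root) / (4.0 * theta)
    g = 0.5 * u * ((8.0 + c2) * s2 / root - c2)
    return c, g


if __name__ == "__main__":
    N = 80
    schemes = [("linear CM", linear_cm), ("linear LM", linear_lm),
               ("CVFEM", cvfem), ("quad CM", quad_cm), ("quad LM", quad_lm)]
    for n in (1, 2, 20, 38, 39):
        theta = 2.0 * math.pi * n / N
        for name, f in schemes:
            c, g = f(theta)
            print(f"n={n:2d} {name:10s} c/u={c:.8f} G/u={g:.6f}")
```

Printing c and G side by side makes the contrast between the two concepts visible: for the shortest modes every phase speed remains positive while the group velocities become negative.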
Additional Remarks:

(1) Lumped linears (FDM) are not very accurate.

(2) All except quads show a lagging phase speed (c < u), whereas quads show a slight phase lead for a good fraction of the spectrum ($c_{\max}/u = 1.0067$ at n = 0.24N; $\theta = 0.48\pi$; $\lambda \approx$ 4Δx). All modes move in the proper direction, in spite of what some have said about quads.

(3) The higher the phase accuracy is for 'long' waves (G > 0), the larger is the (negative) short-wave group velocity. This translates in practice to: an accurate scheme will track most of the waves very well ($c \approx u$), but those few (short) waves that it cannot track well will cause fast upstream-moving wiggles. [By 'waves' here, we mean of course the projection of the initial data onto the eigenvectors; i.e., we imagine (and could do, at higher cost) the problem is being solved via actual eigenvector expansion. See below.] Thus, for example, the group velocity of −5u for the 2Δx wave with quads is actually a blessing in disguise, in some sense, relative to −u for lumped linears. And by 'upstream-moving,' we mean that the appropriate linear combination of all modes, each of which is moving to the right, causes the resulting total wave form to display leftward-moving wiggles; this is group velocity (soon to be shown).

(4) CVFEM is in between LM and CM for the linear GFEM; while it may conserve locally better than GFEM, that conservation is obtained at the cost of not transporting the solution as accurately. When LM is employed for CVFEM (and it seems to be always the case), then it does not even conserve locally, in spite of what its adherents assert.

(5) Another interpretation is this: quads are especially good for smooth data.

Let us now return to the eigenmode pictures in Figures 2.6-4 through 2.6-8, and imagine first that a single linear basis function eigenvector is placed on the grid as initial data (a portion of modal analysis/eigenvector expansion analysis).
Setting u = 1 for convenience, the n = 1 mode will translate at a speed given by (2.6-46) for $\theta = 2\pi n/N = \pi/40$, namely $c_1 = 0.99999979$, unless the mass is lumped, in which case the speed is given by (2.6-47) and is only $c_1^{LM} = 0.9989723$. For n = 2, the speeds are reduced to $c_2 = 0.9999966$ and $c_2^{LM} = 0.99589274$. In contrast, the short-wave 'sisters' will be much lazier in their forward motion. For n = 39, we have $\theta = 39\pi/40$ to give $c_{39} = 0.0766$ and $c_{39}^{LM} = 0.0256$. For the second mode, n = 38, we obtain $c_{38} = 0.1553$ and $c_{38}^{LM} = 0.0524$. Finally, the associated group velocities, from (2.6-48) and (2.6-49), are as follows, even though a single mode follows the phase speed and not the group speed (there is no 'group' for a pure/monochromatic wave; more later): $G_1 = 0.9999989$, $G_1^{LM} = 0.9969173$, $G_2 = 0.999983$, $G_2^{LM} = 0.987688$, $G_{39} = -2.963$, $G_{39}^{LM} = -0.997$, $G_{38} = -2.855$, and $G_{38}^{LM} = -0.987$. For quads, we let the reader do the work.

The gist of all of this is this: an arbitrary IC will behave as some linear combination of all 80 modes and will only be accurate if the amplitude coefficients of the higher frequency modes are very small, which will normally only be the case when the IC is 'sufficiently smooth,' the definition of which varies with N (rougher data require larger N for fixed 'error'). Demonstrations will be made soon.

o Finite difference comparison. To compare some of these results with FDM, as an 'aside', we go to Vichnevetsky and Bowles (1982), a short book that should be required
reading for anyone interested in understanding more on this interesting subject (as also should Trefethen, 1982, and Vichnevetsky, 1987), and plot a few of 'their' finite difference schemes alongside our finite elements. In Figures 2.6-17, 2.6-18, and 2.6-19, we compare frequency, phase speed, and group velocity for 2nd-, 4th-, 6th-, 12th-, 18th-, and 24th-order centered FDM's, along with linear and quadratic FEM's, the FDM frequency equation being

$$\frac{\omega h}{u} = \sin\theta\left[1 + \sum_{j=1}^{K}\frac{(j!)^2}{(2j+1)!}\left(2\sin\frac{\theta}{2}\right)^{2j}\right] \tag{2.6-54}$$

for K = 0, 1, ..., where 2(K + 1) is the 'order' of the FDM. For variety, we also plot the phase speed and the phase-speed error (1 - c/u) in Figures 2.6-18(b) and 2.6-18(c) as functions of the number of points per wave ($\lambda/\Delta x$). The type of plot in Figure 2.6-18(c) is useful for answering questions like: How many points per wave are required to attain 2% phase-speed error?

These figures seem to say that GFEM with linear elements is a good competitor to 8th-order FDM and that it takes a 24th-order FDM to beat quads. Asymptotically, of course, this is not true. Both linears and quads show a 4th-order convergence rate (super-convergence for linears) as $h \to 0$. But in practice ('in the real world'), unfortunately, we are almost never lucky enough to operate in this limit. Thus, the phase and group speeds over the full range tell the story better. Nevertheless, for completeness, in Table 2.6-1 we list some asymptotic results ($kh \to 0$) for most of the schemes pictured above, wherein the following formulas describe the FDM cases, with thanks to B. Fornberg:

$$\frac{c}{u} = 1 - \left(\frac{\theta}{2}\right)^{n}\prod_{j=2(2)}^{n}\frac{j}{j+1} \tag{2.6-55}$$

[Fig. 2.6-17 Eigenvalues of some FDM's and FEM's. Plot of ωh/u versus θ; curves: 2nd-, 4th-, 6th-, 12th-, 18th-, and 24th-order FDM, Quadratic, Linear, and the analytical result.]
[Fig. 2.6-18 Phase speeds corresponding to Figure 2.6-17. (a) Phase speeds (c/u versus θ); (b) phase speed vs λ/h (c/u versus wave length / grid spacing); (c) phase speed error (1 - c/u versus wave length / grid spacing). Curves: 2nd-, 4th-, 6th-, 12th-, 18th-, and 24th-order FDM, Quadratic, Linear.]
and

$$\frac{G}{u} = 1 - \left(\frac{\theta}{2}\right)^{n}\prod_{j=2(2)}^{n}\frac{j}{j-1} \tag{2.6-56}$$

where n = 2(K + 1) is the order of the scheme.

[Fig. 2.6-19 Group velocities. Plot of G/u versus θ; curves: 2nd-, 4th-, 6th-, 12th-, 18th-, and 24th-order FDM, Linear, Quadratic.]

Table 2.6-1 Asymptotic phase speed for several discretizations.

Method (θ = kh)           c/u                          G/u
Lumped linears (= FDM)    $1 - \theta^2/6$             $1 - \theta^2/2$
Linear GFEM               $1 - \theta^4/180$           $1 - \theta^4/36$
4th-order FDM             $1 - \theta^4/30$            $1 - \theta^4/6$
6th-order FDM             $1 - \theta^6/140$           $1 - \theta^6/20$
12th-order FDM            $1 - \theta^{12}/12{,}012$   $1 - \theta^{12}/924$
18th-order FDM            $1 - \theta^{18}/923{,}780$  $1 - \theta^{18}/48{,}620$
Lumped quads              $1 - \theta^4/270$           $1 - \theta^4/54$
Quadratic GFEM            $1 + \theta^4/270$           $1 + \theta^4/54$
24th-order FDM            $1 - \theta^{24}/67{,}603{,}900$   $1 - \theta^{24}/2{,}704{,}156$

A more useful comparison is framed with the following question: How many points per wave are required for a 1% (5%) error in the phase speed? Table 2.6-2 answers this ($\lambda/h = \lambda k/kh = 2\pi/\theta$).

Remarks:

(1) A result shown in Table 7 of Swartz and Wendroff (1974) is reasonably compatible with our results: for a phase error of 0.01 per period, the implicit quadratic FEM via TR costs about the same as the explicit 8th-order centered FDM via leapfrog. (We shall discuss the ODE methods called 'TR' and 'leapfrog' in Section 2.7.)

(2) For additional very-high-order FDM results, and pseudo-spectral results, see the recent book by Fornberg (1996).
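The points-per-wave question can also be explored numerically. The sketch below is ours, not the authors': it evaluates the centered-FDM phase speed from the frequency equation (2.6-54), computes the leading asymptotic error coefficient from a product form of (2.6-55) that we assume here, and finds the resolution λ/h by simple bisection. All function names are hypothetical, and the bisection assumes that the error 1 - c/u grows monotonically with θ on (0, π), which holds for the centered schemes considered.

```python
import math
from fractions import Fraction


def fdm_phase_speed(theta, order):
    """c/u for the centered FDM of the given (even) order, from (2.6-54)."""
    K = order // 2 - 1
    s = 2.0 * math.sin(theta / 2.0)
    bracket = 1.0
    for j in range(1, K + 1):
        bracket += math.factorial(j) ** 2 / math.factorial(2 * j + 1) * s ** (2 * j)
    return math.sin(theta) * bracket / theta


def asymptotic_coefficient(order):
    """Return a such that c/u ~ 1 - a*theta**order (assumed product form of (2.6-55))."""
    a = Fraction(1, 2 ** order)
    for j in range(2, order + 1, 2):
        a *= Fraction(j, j + 1)
    return a


def points_per_wave(speed, error, lo=1e-6, hi=math.pi):
    """Bisect for the theta at which 1 - c/u equals `error`; return lambda/h = 2*pi/theta."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 1.0 - speed(mid) < error:
            lo = mid
        else:
            hi = mid
    return 2.0 * math.pi / lo


if __name__ == "__main__":
    for order in (2, 4, 6, 12, 18, 24):
        res1 = points_per_wave(lambda th: fdm_phase_speed(th, order), 0.01)
        res5 = points_per_wave(lambda th: fdm_phase_speed(th, order), 0.05)
        print(order, asymptotic_coefficient(order), round(res1, 2), round(res5, 2))
```

The resolutions obtained this way can be compared with the tabulated ones; small differences in the last digit are to be expected, since the rounding convention used for the table is not stated.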
Table 2.6-2 Resolution required (λ/h) to get 1% (5%) phase speed error for several discretizations.

Method                    1% error    5% error
Lumped linears (= FDM)    26          11.3
4th-order FDM             8.31        5.49
6th-order FDM             5.70        4.25
Linear GFEM               5.61        3.93
12th-order FDM            3.82        3.21
18th-order FDM            3.29        2.89
Quadratic GFEM            3.19        2.85
24th-order FDM            3.03        2.72

o Reduced quadrature. To wrap up this 1D analytical discussion, we move away from GFEM and examine two more (besides lumped-mass) non-Galerkin FEM's. The first is reduced quadrature, and the second goes under the name of SUPG (originally, Streamline Upwinding Petrov-Galerkin, but see Hughes (1988) for an etymological discussion). The former is interesting because, with linear elements, it generates a finite element derivation of a well-known finite difference method called Keller's (1971) box scheme, and the latter because it is a dissipative scheme that is called 'simple first-order upwinding' in the finite difference literature. (The true 'power' of SUPG is revealed in multi-dimensions: it has no harmful 'crosswind diffusion.')

The one-point quadrature analog of (2.6-17), available for example by appropriately simplifying (2.5-1), is

$$\frac{1}{4}\left(\dot T_{j-1} + 2\dot T_j + \dot T_{j+1}\right) + \frac{u}{2h}\left(T_{j+1} - T_{j-1}\right) = 0, \qquad j = 1, 2, \ldots, N, \tag{2.6-57}$$

in which only the mass matrix is changed. This is Keller's box scheme, and it is interesting to note that it is also a finite element scheme, although non-Galerkin, obtained via inaccurate quadrature on the mass matrix. Repeating the analysis following (2.6-18), it is immediate to obtain the one-point quadrature frequency equation

$$\omega = uk\,\frac{\sin\theta}{\theta}\,\frac{2}{1+\cos\theta} = \frac{2uk}{\theta}\tan\frac{\theta}{2}, \tag{2.6-58}$$

which leads to

$$c = u\,\frac{\sin\theta}{\theta}\,\frac{2}{1+\cos\theta} = \frac{2u}{\theta}\tan\frac{\theta}{2} \tag{2.6-59}$$

and

$$G = 2u/(1 + \cos\theta). \tag{2.6-60}$$

This second-order accurate approximation displays a quite different error behavior than those seen heretofore.
In particular, it exhibits a leading error in both phase and group velocities, with both c and G approaching $\infty$(!) as $\theta \to \pi$, even though the actual $2\Delta x$ wave ($\theta = \pi$) is stationary; i.e., the functions are not continuous at $\theta = \pi$. See also Vichnevetsky and Bowles (1982) for further discussion of this method, and for methods 'in between.'
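A short numerical look at (2.6-59) and (2.6-60) makes the leading error and the blow-up near $\theta = \pi$ concrete. This is a minimal sketch with function names of our own choosing; it also checks, by centered differencing, that the group velocity is the derivative of the dimensionless frequency $\omega h/u = 2\tan(\theta/2)$ implied by (2.6-58).

```python
import math


def box_frequency(theta):
    """Dimensionless frequency omega*h/u of the box scheme, from (2.6-58)."""
    return 2.0 * math.tan(theta / 2.0)


def box_phase_speed(theta, u=1.0):
    """Phase speed (2.6-59)."""
    return u * box_frequency(theta) / theta


def box_group_velocity(theta, u=1.0):
    """Group velocity (2.6-60)."""
    return 2.0 * u / (1.0 + math.cos(theta))


if __name__ == "__main__":
    for theta in (0.1, 0.5, 1.0, 2.0, 3.0, 3.1):
        print(theta, round(box_phase_speed(theta), 4), round(box_group_velocity(theta), 4))
```

Both columns exceed unity for every θ in (0, π) and grow without bound as θ approaches π, in contrast to the lagging GFEM and FDM curves.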
The box scheme above was obtained via reduced quadrature and led to a scheme with a leading phase error. It is interesting and worth pointing out that the same sort of behavior (leading phase error) obtains with quadratic basis functions and, like linears, it also carries over to 2D and 3D. If a two-point quadrature rule is used on the 'quadratic' GFEM equations, then the mass matrix (only) changes (GFEM requires three-point Gaussian quadrature on M); it becomes

$$M = \frac{l}{18}\begin{pmatrix} 2 & 2 & -1 \\ 2 & 8 & 2 \\ -1 & 2 & 2 \end{pmatrix}, \tag{2.6-61}$$

where $l = 2h$ is the element length, which leads to the following advection equations [cf. (2.6-26) and (2.6-27)]:

$$\frac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + u\,\frac{T_{j+1} - T_{j-1}}{2h} = 0 \tag{2.6-62}$$

for center nodes (like linear elements!) and

$$\frac{1}{6}\left(-\dot T_{j-2} + 2\dot T_{j-1} + 4\dot T_j + 2\dot T_{j+1} - \dot T_{j+2}\right) + 2u\,\frac{T_{j+1} - T_{j-1}}{2h} - u\,\frac{T_{j+2} - T_{j-2}}{4h} = 0 \tag{2.6-63}$$

for edge nodes. The analogous phase speed/dispersion analysis leads to

$$c = u\,\frac{-3\sin 2\theta + \sqrt{3(7 - \cos 2\theta)(1 - \cos 2\theta)}}{2\theta(1 - \cos 2\theta)} \tag{2.6-64}$$

for the phase speed. The associated group velocity is

$$G = u\,\frac{3\sqrt{3(7 - 8\cos 2\theta + \cos^2 2\theta)} - 9\sin 2\theta}{(1 - \cos 2\theta)\sqrt{3(7 - 8\cos 2\theta + \cos^2 2\theta)}}, \tag{2.6-65}$$

wherein it is clear that, like linears, both go to $\infty$ at $\theta = \pi$. Finally, the midside node coefficient, $\beta$, is [cf. (2.6-39)]

$$\beta_n = \frac{1}{2}\left(\frac{3u\sin\theta_n}{\omega_n h} - \cos\theta_n\right), \tag{2.6-66}$$

which again varies from 1 for $\theta_n \to 0$ to 1/2 for $\theta_n \to \pi$. Figure 2.6-20 shows the phase and group speeds for the reduced quadrature approximations, whose leading phase and group error we shall later demonstrate in 2D (Section 2.6.3b), but the 'message' is already clear: use full quadrature/honest GFEM.

o Upwind, SUPG. Before examining SUPG, we first introduce and study its 'predecessor,' pure upwinding; i.e.,

$$\frac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + \frac{u}{h}\left(T_j - T_{j-1}\right) = 0. \tag{2.6-67}$$

The 'usual' procedure now leads to an unusual result; i.e., the use of (2.6-19) in (2.6-67) results in the following analog of (2.6-22):

$$-\frac{i\omega}{6}\left(e^{-i\theta} + 4 + e^{i\theta}\right) + \frac{u}{h}\left(1 - e^{-i\theta}\right) = 0, \tag{2.6-68}$$
which has the solution

$$\omega = \frac{u}{h}\left[\sin\theta - i(1 - \cos\theta)\right]\frac{3}{2 + \cos\theta}; \tag{2.6-69}$$

an imaginary part has been tacked onto the GFEM frequency. This has the effect of changing the solution, (2.6-20), from the GFEM result:

$$T_j(t) = e^{ik(x_j - ct)}, \tag{2.6-70}$$

where $c = u(\sin\theta/\theta)(3/(2 + \cos\theta))$, to

$$T_j(t) = e^{-[ut(1 - \cos\theta)/h][3/(2 + \cos\theta)]} \cdot e^{ik(x_j - ct)} \tag{2.6-71}$$

with the same c. Thus we now also have dissipation (numerical damping) of the traveling and dispersing [$c = c(k)$] wave. In the 'limit' $\theta \ll 1$, this becomes

$$T_j(t) = e^{-k^2hut/2} \cdot e^{ik(x_j - ct)}, \tag{2.6-72}$$

which actually looks like a solution to the full advection-diffusion equation with diffusivity

$$\kappa = uh/2, \tag{2.6-73}$$

because if (2.6-13) is used as initial data for the transient heat equation, then it follows easily that the analytic solution is just $T(x, t) = e^{-k^2\kappa t} \cdot e^{ikx}$. Thus, the upwinded advection equation reacts as if it were solving the parabolic advection-diffusion equation, not the hyperbolic advection equation.

[Fig. 2.6-20 Reduced quadrature phase speed and group velocity. Plot versus θ; curves: Group_quadratic, Phase_quadratic, Group_linear, Phase_linear.]

On the other hand, $(1 - \cos\theta)$ is a monotone function between $\theta = 0$ and $\theta = \pi$; for $\theta = \pi$, the damping is a maximum. The shorter the wave, the stronger the damping, with the 'deleterious' $2\Delta x$ wave being damped faster than any others. In fact, this [in the LM mode, in which the $3/(2 + \cos\theta)$ factor is omitted] is the world's most famous wiggle suppressor; it has the property of always suppressing wiggles, regardless of the initial data, continuous or not, such as a step function or even a discrete Dirac delta function ($T_j = 1/h$ at node j, zero at all others), both of which 'excite' the entire
spectrum quite significantly. In fact, the exact (LM version) solution of a related delta function problem is available (Wurtele, 1961): the solution of the lumped mass (FDM) version of (2.6-67) is the Poisson distribution, $T_j(t) = e^{-\tau}\tau^j/j!$ for $j \ge 0$ and $T_j(t) = 0$ for $j < 0$, where $T_j(0) = 1$ for $j = 0$ and 0 for $j \ne 0$, and $\tau = ut/h$. (Unlike the true delta function solution, Wurtele's solution goes to zero as $h \to 0$. For further discussion of the true delta function, see the end of Section 2.7.4.) For further discussion of the upwinded AD equation, see Section 2.6.2b.

Turning now to SUPG, with its convenient upwinding 'parameter' (a tuning knob), we will see that it can do better on pure advection. Following Brooks and Hughes (1982), who followed (generalized) Raymond and Garder (1976), who followed Dendy (1974), we shall 'tune for the best phase speed.' But first it is instructive to rewrite (2.6-67) in the following equivalent form:

$$\frac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + u\,\frac{T_{j+1} - T_{j-1}}{2h} = \frac{uh}{2}\,\frac{T_{j-1} - 2T_j + T_{j+1}}{h^2}, \tag{2.6-74}$$

in which the numerical diffusion coefficient (2.6-73) is clearly displayed. The SUPG, though, is better than the above 'simple' upwinded approximation because the effect of the weighting function ('Petrov-type') also 'shows up' in the (unsymmetric) mass matrix for approximating (2.6-9); it is given by

$$\frac{1}{6}\left[(1 + 3\beta)\dot T_{j-1} + 4\dot T_j + (1 - 3\beta)\dot T_{j+1}\right] + u\,\frac{T_{j+1} - T_{j-1}}{2h} = \beta uh\,(T_{j-1} - 2T_j + T_{j+1})/h^2, \tag{2.6-75}$$

where $\beta$ is an upwinding parameter. Whereas $\beta = 1/2$ is the recommended value for steady-state simulations at large values of the grid Peclet number ($P = uh/2\kappa$), which then gives simple upwinding as shown above, the choice $\beta = 1/\sqrt{15} = 0.26$ (Raymond and Garder, 1976) minimizes the phase speed error, at least for long waves ($kh \ll 1$), which we shall demonstrate.
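The claim about $\beta = 1/\sqrt{15}$ can be checked numerically before doing any algebra. The sketch below (our own construction, with hypothetical function names) substitutes the Fourier mode $T_j = e^{i(j\theta - \omega t)}$ into (2.6-75), solves for the complex frequency, and estimates the order of the long-wave phase error by halving θ. The real part of the frequency gives the phase speed; the negative of the imaginary part gives the damping rate in units of u/h.

```python
import math


def supg_frequency(theta, beta):
    """Complex omega*h/u obtained by inserting a Fourier mode into (2.6-75)."""
    mass = (2.0 + math.cos(theta)) / 3.0 - 1j * beta * math.sin(theta)
    forcing = math.sin(theta) - 2j * beta * (1.0 - math.cos(theta))
    return forcing / mass


def phase_speed(theta, beta):
    """c/u = Re(omega*h/u) / theta."""
    return supg_frequency(theta, beta).real / theta


def damping_rate(theta, beta):
    """Decay rate in units of u/h; the mode amplitude behaves like exp(-rate*u*t/h)."""
    return -supg_frequency(theta, beta).imag


def observed_order(beta, theta=0.1):
    """Estimate p in |1 - c/u| ~ theta**p by comparing theta and theta/2."""
    e1 = abs(1.0 - phase_speed(theta, beta))
    e2 = abs(1.0 - phase_speed(theta / 2.0, beta))
    return math.log2(e1 / e2)


if __name__ == "__main__":
    for beta in (0.0, 0.5, 1.0 / math.sqrt(15.0)):
        print(round(beta, 4), round(observed_order(beta), 2),
              damping_rate(1.0, beta))
```

With $\beta = 0$ the scheme reduces to GFEM (no damping, fourth-order phase error); the tuned value removes the fourth-order term of the phase error at the price of a small amount of damping.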
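Proceeding as before, the analog of (2.6-22) is

$$-\frac{i\omega}{6}\left[(1 + 3\beta)e^{-i\theta} + 4 + (1 - 3\beta)e^{i\theta}\right] + u\left(e^{i\theta} - e^{-i\theta}\right)/2h = \beta uh\left(e^{-i\theta} - 2 + e^{i\theta}\right)/h^2, \tag{2.6-76}$$

which leads to the solution

$$T_j(t) = e^{-k^2Kt} \cdot e^{ik(x_j - ct)}, \tag{2.6-77}$$

where

$$K = \frac{\beta uh\,(1 - \cos\theta)^2}{3\theta^2\left[\left(\dfrac{2 + \cos\theta}{3}\right)^2 + \beta^2\sin^2\theta\right]} \tag{2.6-78}$$

is the artificial/numerical diffusion coefficient, and

$$c = u\,\frac{\sin\theta}{\theta}\;\frac{\dfrac{2 + \cos\theta}{3} + 2\beta^2(1 - \cos\theta)}{\left(\dfrac{2 + \cos\theta}{3}\right)^2 + \beta^2\sin^2\theta} \tag{2.6-79}$$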
is the phase speed. For $\theta \to 0$, the phase speed is

$$c = u\left[1 - \left(\frac{1}{180} - \frac{\beta^2}{12}\right)\theta^4 + O(\theta^6)\right], \tag{2.6-80}$$

which is clearly a sixth-order accurate result (i.e., for long waves) if the upwind parameter, $\beta$, is taken to be $1/\sqrt{15}$. For this choice, we have

$$c = u\,\frac{\sin\theta}{\theta}\;\frac{9(4 + \cos\theta)}{23 + 20\cos\theta + 2\cos^2\theta} \tag{2.6-81}$$

$$K = \frac{\sqrt{15}\,u\,(1 - \cos\theta)^2}{k^2h\,(23 + 20\cos\theta + 2\cos^2\theta)} \tag{2.6-82}$$

which gives, for $\theta \to 0$,

$$K = \frac{\sqrt{15}\,u}{180\,k^2h}\,\theta^4 + O(\theta^6) = \frac{\sqrt{15}}{180}\,uk^2h^3 + O(h^5), \tag{2.6-83}$$

which is actually a higher-order diffusive term; SUPG is a big improvement over simple upwinding. The phase speed for this accurate advection method is plotted in Figure 2.6-15, where it is rather interesting that it nearly overlaps the curve for quadratic basis functions. This suggests that, in practice, the 'best' linear and the standard quadratic would perform quite similarly (and very well), although only the latter has zero numerical diffusion.

o Fourier analysis, eigenvector expansion for linear elements. Next, we show some sample results, mostly in the form of (a sampling of) popular tutorial test problems, which we hope the reader will find useful. Before presenting any numerical results, however, we introduce more quantitatively the notion of easy vs difficult problems in the context of Fourier analysis. Although the Fourier transforms that we are about to introduce are only strictly valid for a single copy of the waveform on an $\infty$-span, they are nevertheless useful for our periodic case, at least if the given function goes to zero 'sufficiently fast.' Figure 2.6-21 shows three types of waveforms (e.g., initial data) that can span the range from 'easy' to difficult. The Gaussian is easy in that its Fourier transform, shown normalized in the figure, is also a Gaussian (in wave number) and thus rapidly goes to zero as $\sigma k$ increases; i.e.,

$$F_1(k\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-ikx}\,e^{-x^2/2\sigma^2}\,dx = e^{-\sigma^2k^2/2}, \tag{2.6-84}$$

which has virtually zero response for $\sigma k > 5$ or so ($e^{-12.5} = 4 \times 10^{-6}$).
Translation: 'A fine mesh should not be required.' Moving next to the other end of the scale, the square wave is very difficult: the discontinuities cause the excitation of many high wave numbers. The normalized spectral amplitude (transform) in this case is given by

$$F_2(kl) = \frac{1}{2l}\int_{-\infty}^{\infty} f(x)\,e^{ikx}\,dx = (\sin kl)/kl, \tag{2.6-85}$$

where f(x) describes the step function (of unit amplitude). Noteworthy here is the slow drop-off with increasing kl: the n-th local maximum or minimum is given, approximately, by $F_2(kl = [(2n + 1)/2]\pi) = \pm 2/[\pi(2n + 1)]$ for n = 1, 2, ...; i.e., it decays like
$1/n\pi$, which is not fast enough, as we shall demonstrate. A (very) fine mesh is required. Finally, the intermediate case of a waveform given by the triangle has the intermediate response curve given by

$$F_3(kl) = \frac{1}{l}\int_{-\infty}^{\infty} g(x)\,e^{ikx}\,dx = 2(1 - \cos kl)/k^2l^2, \tag{2.6-86}$$

where g(x) defines the triangle with unit amplitude and base $2l$.

[Fig. 2.6-21 Normalized Fourier spectrum of several waveforms. Horizontal axis: dimensionless wavenumber, $kl$ or $k\sigma$.]

Consider now the problem of advecting these three waves across the periodic unit span. Let us begin by selecting a grid for the Gaussian (guided by the figure) such that $\sigma k_{\max} = 5$, which is achieved if $\sigma = 1.6h$, so this is what we choose first; i.e., $\sigma k_{\max} = 5 = \sigma \cdot 2\pi/\lambda_{\min} = 2\pi\sigma/2h \Rightarrow \sigma = 5h/\pi = 1.6h$. Later, we will also compare some methods on a well-resolved Gaussian: $\sigma = 4h$ ($\sigma k_{\max} = 4\pi$). For the triangle, we try $kl = 4\pi$ to catch the first small lobe but neglect the rest (whose amplitudes decrease like $1/n^2$ per period). Since $k_{\max} = 2\pi/\lambda_{\min} = \pi/h$, we get $l = 4h$ as a guess at the minimum resolution required. Finally, for the step, we try two different 'grids'; for the first, we take a $10h$ step width ($2l$) and for the second, we take $40h$. Both grids will get to the $2\Delta x$ wave at $k = \pi/h$ or $kl = \pi l/h$, giving $5\pi$ for the first and $20\pi$ for the second. The former will 'capture' the first four 'lobes' in Figure 2.6-21, and the latter about the first 19 (not shown in the figure).

The analog of the continuous Fourier transform of the initial data is the amplitude coefficients of the eigenvectors corresponding to the same data; i.e., the ($L_2$, see Appendix 3) projection of the initial data onto the eigenmodes, which brings us to an important, and somewhat lengthy, diversion: the exact representation of the approximate (semi-discrete) solution in terms of the discrete eigenvectors. We have already introduced these in (2.6-21) for linear elements and in (2.6-38) for quads.
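The three normalized spectra, (2.6-84) through (2.6-86), are simple enough to tabulate directly, which helps in judging the grid choices just described. The following is a minimal sketch; the function names are ours, and the arguments are the dimensionless wavenumbers $k\sigma$ and $kl$ used in the text.

```python
import math


def gaussian_spectrum(k_sigma):
    """F1 of (2.6-84): spectrum of the Gaussian, as a function of k*sigma."""
    return math.exp(-0.5 * k_sigma ** 2)


def step_spectrum(kl):
    """F2 of (2.6-85): spectrum of the unit step of width 2l, as a function of k*l."""
    return 1.0 if kl == 0.0 else math.sin(kl) / kl


def triangle_spectrum(kl):
    """F3 of (2.6-86): spectrum of the unit triangle of base 2l, as a function of k*l."""
    return 1.0 if kl == 0.0 else 2.0 * (1.0 - math.cos(kl)) / kl ** 2


if __name__ == "__main__":
    # Response at the grid cutoff k_max = pi/h for the grids chosen in the text.
    print("Gaussian, sigma = 5h/pi :", gaussian_spectrum(5.0))
    print("Triangle, l = 4h        :", triangle_spectrum(4.0 * math.pi))
    # For the step, the envelope 1/(kl) at the cutoff is the relevant measure.
    print("Step, 2l = 10h, envelope:", 1.0 / (5.0 * math.pi))
    print("Step, 2l = 40h, envelope:", 1.0 / (20.0 * math.pi))
```

The Gaussian and triangle responses are negligible at the cutoff, whereas the envelope of the step spectrum is still a few percent there, which is the quantitative content of 'easy' versus 'difficult.'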
But let us 'take it from the top,' more or less, starting with linears: we seek a solution of (2.6-17) in the form

$$T_j(t) = \sum_{m=1}^{N} a_m v_j^{(m)} e^{-i\omega_m t}, \tag{2.6-87}$$
where $v_j^{(m)} = e^{i2\pi mj/N}$ is the j-th element in the m-th eigenvector, the eigenvalue $\omega_m$ is given by (2.6-23) [(2.6-24) for LM], and the amplitude coefficients, $\{a_m\}$, are to be determined from the initial data. Thus, at t = 0, we have [see (2.6-11)]

$$T_j(0) = T_0(x_j) = \sum_{m=1}^{N} a_m v_j^{(m)}, \qquad j = 1, 2, \ldots, N; \tag{2.6-88}$$

an expansion of a real number (for each j) in terms of N complex numbers. Defining

$$(u, w) = \sum_{j=1}^{N} w_j \bar u_j, \tag{2.6-89}$$

it follows that our eigenvectors are orthogonal,

$$(v^{(m)}, v^{(n)}) = N \cdot \delta_{mn}, \tag{2.6-90}$$

which allows us to easily evaluate $a_m$; i.e.,

$$a_m = (v^{(m)}, T_0(x_j))/(v^{(m)}, v^{(m)}) = \frac{1}{N}\sum_{j=1}^{N} T_0(x_j)\,e^{-i2\pi mj/N} = \frac{1}{N}\sum_{j=1}^{N} T_0(x_j)\,\bar v_j^{(m)}, \tag{2.6-91}$$

and we see that $a_m$ indeed is the projection of the IC onto the m-th eigenvector, which is also our approximate/discrete/finite Fourier transform of $T_0(x)$. Introducing the phase speed, from (2.6-46) and (2.6-47), the exact solution to the approximate advection equation is

$$T^h(x, t) = \sum_{j=1}^{N} T_j(t)\,\phi_j(x) = \sum_{j=1}^{N}\left[\sum_{m=1}^{N} a_m e^{i2\pi m(j/N - c_m t)}\right]\phi_j(x), \tag{2.6-92}$$

where $a_m$ is obtained from (2.6-91) and $c_m$ from (2.6-46) or (2.6-47) with $\theta = \theta_m$. So, in principle, we are finished. But in practice, we actually still have a long way to go, because it is not cost-effective to actually compute the solution this way. The cost-effective way is to invoke a numerical method for integrating (approximately) the discrete ODE's, (2.6-17), a subject we shall turn to in the next section (Section 2.7). The above analytical solution is presented only/mainly to increase our 'awareness level' and to help us better understand what our ODE solutions will be trying to do.

So, back to the theory. While (2.6-92) is a full and complete representation, there are a few matters of interpretation to deal with [e.g., for m > N/2, the eigenvectors are aliased to lower (resolvable) modes]. Also, we can usually omit (defer) the basis function summation and just focus on each nodal coefficient, $T_j(t)$.
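As an aside before continuing, the nodal form of (2.6-92) is easy to evaluate directly for small N, even though it is not the cost-effective route. The sketch below is ours and makes several assumptions: the periodic unit span with nodes $x_j = j/N$, the projection (2.6-91) for $a_m$, and the linear-element phase speed $c = u(\sin\theta/\theta)\,3/(2+\cos\theta)$ (or $u\sin\theta/\theta$ for lumped mass) as the content of (2.6-46) and (2.6-47). The function names are hypothetical, and the double loop costs $O(N^2)$ operations per time level.

```python
import cmath
import math


def phase_speed(theta, u=1.0, lumped=False):
    """Assumed linear-element phase speed (consistent or lumped mass)."""
    c = u * math.sin(theta) / theta
    return c if lumped else c * 3.0 / (2.0 + math.cos(theta))


def modal_solution(T0, t, u=1.0, lumped=False):
    """Nodal values T_j(t), j = 1..N, from the eigenvector expansion (2.6-92)."""
    N = len(T0)
    # Amplitude coefficients a_m, m = 1..N, from the projection (2.6-91).
    a = [sum(T0[node - 1] * cmath.exp(-1j * 2.0 * math.pi * m * node / N)
             for node in range(1, N + 1)) / N
         for m in range(1, N + 1)]
    result = []
    for node in range(1, N + 1):
        total = 0.0 + 0.0j
        for m in range(1, N + 1):
            c_m = phase_speed(2.0 * math.pi * m / N, u, lumped)
            total += a[m - 1] * cmath.exp(1j * 2.0 * math.pi * m * (node / N - c_m * t))
        result.append(total.real)  # imaginary parts cancel for real initial data
    return result


if __name__ == "__main__":
    N = 40
    sigma, x0 = 1.6 / N, 0.15  # a narrow Gaussian, sigma = 1.6 h
    T0 = [math.exp(-0.5 * ((j / N - x0) / sigma) ** 2) for j in range(1, N + 1)]
    T = modal_solution(T0, 0.6)
    print(min(T), max(T))
```

For modes with m > N/2 the formula returns a negative $c_m$, and the product $m c_m$ then matches $-(N-m)c_{N-m}$, which is the aliasing mentioned above expressed in code.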
To this end, then, we first rewrite the equation for $T_j(t)$ as

$$T_j(t) = \sum_{m=1}^{N/2} a_m v_j^{(m)} e^{-i2\pi m c_m t} + \sum_{m=0}^{N/2-1} a_{N-m} v_j^{(N-m)} e^{-i2\pi(N-m)c_{N-m}t}, \tag{2.6-93}$$

where obviously we are (for now) assuming that N is an even number. (We shall account for the remaining 50% of the cases later.) The reason for the rewrite is because of the following easily verified facts:
1. $a_{N-m} = \bar a_m$;
2. $v_j^{(N-m)} = \bar v_j^{(m)}$; and
3. $(N - m)c_{N-m} = -mc_m$.

Before using this information, however, let us note two special features of (2.6-93):

1. For m = N/2 in the first summation, we have

$$a_{N/2}\,v_j^{(N/2)}\,e^{-i\pi Nc_{N/2}t} = (-1)^j a_{N/2} = (-1)^j\,\frac{1}{N}\sum_{l=1}^{N}(-1)^l\,T_0(x_l)$$

because $v_j^{(N/2)} = (-1)^j$ is the $2\Delta x$ mode, and its phase speed, $c_{N/2}$, is zero. $a_{N/2}$ measures the amount of $2\Delta x$ 'noise' in the IC. (A smooth IC will obviously give a small $a_{N/2}$.)

2. For m = 0 in the second summation, we have

$$a_N\,v_j^{(N)}\,e^{-i2\pi Nc_Nt} = a_N = \frac{1}{N}\sum_{l=1}^{N}T_0(x_l),$$

which is the average value of $T_0(x_j)$ because $v_j^{(N)} = 1$ is the constant mode, and $c_N = 0$.

Thus, we can conveniently rewrite (2.6-93) as

$$T_j(t) = \sum_{m=1}^{N/2-1}\left[a_m v_j^{(m)} e^{-i2\pi mc_mt} + \bar a_m \bar v_j^{(m)} e^{i2\pi mc_mt}\right] + (-1)^j a_{N/2} + a_N. \tag{2.6-94}$$

Finally, noting for any complex number $z = x + iy$ that $z + \bar z = 2\,\mathrm{Re}(z)$ leads to

$$T_j(t) = \frac{2}{N}\sum_{m=1}^{N/2-1}\left\{\left[\sum_{l=1}^{N}T_0(x_l)\cos\frac{2\pi ml}{N}\right]\cos 2\pi m\left(\frac{j}{N} - c_mt\right) + \left[\sum_{l=1}^{N}T_0(x_l)\sin\frac{2\pi ml}{N}\right]\sin 2\pi m\left(\frac{j}{N} - c_mt\right)\right\} + \frac{1}{N}\sum_{l=1}^{N}\left[(-1)^{j+l} + 1\right]T_0(x_l), \tag{2.6-95}$$

which merits the following Remarks:

(1) The solution is clearly seen to be a projection of the IC onto each mode, followed by a linear combination of all of these translating (to the right because $c_m > 0$) modes. The larger the projection coefficient for the higher modes, the worse (usually) will be the approximate solution because $c_m$ is too far from u.

(2) N must be even for the above result to be valid. If N is odd, there is no 'pure' $2\Delta x$ mode, and the result is slightly different (an exercise we leave to the reader) and is this: drop the $(-1)^{j+l}$ term and change the upper limit in the outer (modal) summation to $(N - 1)/2$.

(3) The original N modes with complex exponential eigenvectors finally 'break down' to the first $\approx N/2$ modes represented twice: once by cosines and once by sines.
(4) $T_0(x_j)$ must be real for the above result to be valid, which is the principal case of interest herein. [Complex $T_0(x_j)$ can be easily accounted for, however.]

o Eigenvector expansion for quadratic elements. For quads, the analysis proceeds in much the same way but, of course, it is (necessarily) more complicated, although the bottom line will be just the same: project the initial data onto the eigenmodes and then let all modes 'advect' (or try to, since, almost by definition, advection means to move precisely at the fluid's velocity). What happens is that each mode 'drifts' to the right at its characteristic (phase) speed, and the total solution is the superposition of all moving modes. Here we still have the solution as described by (2.6-87), with N replaced by N + 1 and eigenvectors, now called q and given by (2.6-42), rewritten as

$$q_j^{(m)} = \left[\frac{1 + (-1)^j}{2} + \frac{1 - (-1)^j}{2}\,\beta_m\right]e^{i2\pi mj/(N+1)} \tag{2.6-96}$$

and the eigenvalues given by [the positive (!) roots of] (2.6-36), or (2.6-40) for LM. The next complication arises from the fact that these eigenvectors, although linearly independent, are not orthogonal in the sense of (2.6-89). But they are M-orthogonal; $q^{(m)T}Mq^{(n)} = 0$ if $m \ne n$, where M is the [proper, à la (2.6-37), not à la (2.6-26) and (2.6-27)] GFEM mass matrix (or the lumped mass matrix if lumping is invoked). The vector $w^{(m)} = Mq^{(m)}$ is the eigenvector of the adjoint (transpose) problem; q satisfies $M^{-1}Kq = \lambda q$ and w satisfies the adjoint problem, $KM^{-1}w = -\lambda w$ with $(w^{(m)}, q^{(n)}) = 0$ for $m \ne n$. In fact, it is not too difficult to obtain the following explicit orthogonality result:

$$(Mq^{(m)}, q^{(n)}) = \frac{N + 1}{15}\left[4 + 8\beta_m^2 + 4\beta_m\cos\theta_m - \cos 2\theta_m\right]\delta_{mn}, \tag{2.6-97}$$

which replaces (2.6-90).
The corresponding amplitude coefficient, from (2.6-88) with q replacing v, is now, via (2.6-97),

$$a_m = (w^{(m)}, T_0(x_j))/(w^{(m)}, q^{(m)}) = \sum_{j=1}^{N+1}\frac{2\left[1 - (-1)^j\right](4\beta_m + \cos\theta_m) + \left[1 + (-1)^j\right](4 + 2\beta_m\cos\theta_m - \cos 2\theta_m)}{(N + 1)(4 + 8\beta_m^2 + 4\beta_m\cos\theta_m - \cos 2\theta_m)}\;T_0(x_j)\,e^{-i2\pi mj/(N+1)}, \tag{2.6-98}$$

which is the quadratic element's version of the projection of $T_0(x)$ onto mode m. The full solution is now, as before, given by (2.6-92) with N replaced by N + 1. The analysis leading from (2.6-93) to (2.6-95) follows analogously (with, in addition, the use of $\beta_{N+1-m} = \beta_m$) to give the final result for the j-th node:

$$T_j(t) = 2\sum_{m=1}^{\tilde N}\left[\frac{1 + (-1)^j}{2} + \frac{1 - (-1)^j}{2}\,\beta_m\right]\left[\mathrm{Re}(a_m)\cos 2\pi m\left(\frac{j}{N+1} - c_mt\right) - \mathrm{Im}(a_m)\sin 2\pi m\left(\frac{j}{N+1} - c_mt\right)\right] + \frac{1 + 3(-1)^j}{4}\,a_{(N+1)/2} + a_{N+1}, \tag{2.6-99}$$
where $\tilde N = (N + 1)/2 - 1$ and $a_{(N+1)/2} = (4/3(N + 1))\sum_{j=1}^{N+1}(-1)^jT_0(x_j)$ is the projection of the IC onto the $2\Delta x$ mode, $(1 + 3(-1)^j)/4$ (see Figure 2.6-11), and $a_{N+1} = (1/3(N + 1))\sum_{j=1}^{N+1}[3 - (-1)^j]\,T_0(x_j)$ is the projection of the IC onto the constant mode, wherein the midside nodes get twice the weight of the edge nodes: the 'average' value of $T_0(x_j)$ à la quads.

If mass lumping is employed, the following changes must be made:

1. Replace (2.6-97) by

$$(Mq^{(m)}, q^{(n)}) = \tfrac{1}{3}(N + 1)(2\beta_m^2 + 1)\,\delta_{mn}. \tag{2.6-100}$$

2. Replace (2.6-98) by

$$a_m = \sum_{j=1}^{N+1}\frac{2\left[1 - (-1)^j\right]\beta_m + \left[1 + (-1)^j\right]}{(N + 1)(2\beta_m^2 + 1)}\;T_0(x_j)\,e^{-i2\pi mj/(N+1)}. \tag{2.6-101}$$

Remarks:

(1) If the 'simple' $2\Delta x$ mode, $T_0(x_j) = (-1)^j$, is the IC, then the quad needs both $q^{((N+1)/2)}$ and $q^{(N+1)}$ to represent it, and the result is

$$(-1)^j = \tfrac{4}{3}\,q_j^{((N+1)/2)} - \tfrac{1}{3}\,q_j^{(N+1)},$$

which is but one example of a two-for-one deal that we now examine further.

(2) The eigenvectors for linear elements are, of course, also M-orthogonal; they just 'happen to' also satisfy (2.6-90).

o One eigenvector or two? Before showing a few sample results, it may be important to point out another ostensible 'problem' (Cullen, 1982) with quads (while eigenvector expansions are still fresh in our minds): because of the '$\beta$-factor' in the quad's eigenvector, the ostensibly simple experiment of placing a single discrete 'Fourier mode,' $e^{i2\pi mj/N}$, on the grid to see how accurately it is moved to the right (a common form of 'dispersion analysis') is in fact not so simple, and has, owing again to misinterpretation, been interpreted as another detraction of this element. The reason it is not simple is that each simple Fourier mode (more precisely: the linear interpolant of each continuous Fourier mode, which of course introduces sampling error) requires a linear combination of two quad eigenvectors for its representation: one with the same index (say m) and the other with index $(N + 1)/2 - m$.
The reason it can be misinterpreted is that the second mode is inaccurately translated, with the result that quads might appear to do a poor job of (discrete) Fourier mode advection—even when the mode has a long wavelength—because of 'eigenvector error' in addition to phase error. And in fact they do, but this fact is irrelevant, usually, because what is relevant is how well do the quad eigenvectors translate the general initial condition; i.e., that made up of all Fourier modes (eigenvectors for the simple cases, such as linears) and all quad eigenvectors? The answer is that quads do very, very well—as we shall soon demonstrate. The confusing issue may become less confusing if we also asked the inverse question: How well do linear elements (or simple centered differences—or any method with 'zero' eigenvector error—at node points) advect a given quad eigenvector? The inverse answer is (for linears): not very well, because to
represent the m-th quad eigenvector requires two simple (Fourier) eigenvectors, the m-th and the $[(N + 1)/2 - m]$-th, the second of which will 'spoil' the result. Thus, to be complete, in some sense, we show below the results of advecting a single eigenvector from lumped linear elements via two quad eigenvectors, and vice versa. We leave the details of the analyses as exercises.

First, given the m-th Fourier mode as the IC (real, for convenience/variety)

$$T_0(x_j) = v_j^{(m)} = \cos 2\pi mj/(N + 1) = \cos j\theta_m, \tag{2.6-102}$$

where our mesh has N + 1 nodes because of quads (and, recall, N + 1 is even), the solution in terms of quads is

$$T_j(t) = a_m\left[\frac{1 + (-1)^j}{2} + \frac{1 - (-1)^j}{2}\,\beta_m\right]\cos 2\pi m\left(\frac{j}{N+1} - c_mt\right) + a_{(N+1)/2-m}\left[\frac{1 + (-1)^j}{2} + \frac{1 - (-1)^j}{2}\,\beta_{(N+1)/2-m}\right]\cos 2\pi\left(\frac{N+1}{2} - m\right)\left(\frac{j}{N+1} - c_{(N+1)/2-m}\,t\right), \tag{2.6-103}$$

where

$$a_m = \frac{2(4\beta_m + \cos\theta_m) + (4 + 2\beta_m\cos\theta_m - \cos 2\theta_m)}{4 + 8\beta_m^2 + 4\beta_m\cos\theta_m - \cos 2\theta_m}, \tag{2.6-104}$$

$$a_{(N+1)/2-m} = 1 - a_m, \tag{2.6-105}$$

the phase speeds are given by (2.6-50), and the center-node amplitude coefficients by (2.6-39). Whereas linear elements would translate $T_0(x_j)$ at the single appropriate phase speed (close to u for m small), from (2.6-47), the quad representation will ostensibly not be nearly so good (for t > 0) because of the seriously lagging phase speed of the $(N + 1)/2 - m$ mode.

The other side of this coin is this: given the real part of the m-th quad eigenvector

$$T_0(x_j) = \mathrm{Re}\left\{\left[\frac{1 + (-1)^j}{2} + \frac{1 - (-1)^j}{2}\,\beta_m\right]e^{ij\theta_m}\right\} \tag{2.6-106}$$

as the IC, its representation via linear basis functions also requires two eigenvectors and is given by the (simpler) linear combination

$$T_j(t) = \frac{1 + \beta_m}{2}\cos 2\pi m\left(\frac{j}{N+1} - c_mt\right) + \frac{1 - \beta_m}{2}\cos 2\pi\left(\frac{N+1}{2} - m\right)\left(\frac{j}{N+1} - c_{(N+1)/2-m}\,t\right), \tag{2.6-107}$$

with the phase speeds obtained from (2.6-47), which again shows a good mode and a bad (slow) one.
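The 'two linears make one quad' representation (2.6-107) can be checked at the nodes with a few lines of code. In this sketch (ours, with hypothetical names) the midside coefficient $\beta_m$ is treated as a given input, since it comes from (2.6-39); the sample value 0.94733 is the one quoted in the text for N + 1 = 8 and m = 1. The lumped-linear phase speed is assumed to be $c = u\sin\theta/\theta$, which we take to be (2.6-47).

```python
import math


def quad_eigenvector_real(j, theta_m, beta_m):
    """Real part of the m-th quad eigenvector, (2.6-106): weight 1 at edge
    nodes (even j) and beta_m at midside nodes (odd j)."""
    weight = 1.0 if j % 2 == 0 else beta_m
    return weight * math.cos(j * theta_m)


def lumped_phase_speed(mode, n_nodes, u=1.0):
    """Assumed lumped-linear phase speed c = u*sin(theta)/theta, theta = 2*pi*mode/n_nodes."""
    theta = 2.0 * math.pi * mode / n_nodes
    return u * math.sin(theta) / theta


def two_linears(j, t, m, n_nodes, beta_m, u=1.0):
    """Nodal value from (2.6-107): a 'good' long mode plus a 'slow' short mode."""
    m_slow = n_nodes // 2 - m
    x = j / n_nodes
    good = 0.5 * (1.0 + beta_m) * math.cos(
        2.0 * math.pi * m * (x - lumped_phase_speed(m, n_nodes, u) * t))
    slow = 0.5 * (1.0 - beta_m) * math.cos(
        2.0 * math.pi * m_slow * (x - lumped_phase_speed(m_slow, n_nodes, u) * t))
    return good + slow


if __name__ == "__main__":
    n_nodes, m, beta = 8, 1, 0.94733  # N + 1 = 8, an 8*dx wave
    for t in (0.0, 1.0, 2.0):
        print(t, [round(two_linears(j, t, m, n_nodes, beta), 4) for j in range(1, n_nodes + 1)])
```

At t = 0 the two cosine modes combine to give exactly 1 at edge nodes and $\beta_m$ at midside nodes (times $\cos j\theta_m$); for t > 0 the two modes separate because their phase speeds differ.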
For $1 \le m \ll N$, however, it turns out in both cases that the amplitude coefficient of the slow modes is virtually zero and that of the good (long) modes is $\approx 1$. The harder and more interesting cases have 'intermediate' values of m, e.g., $m = (N + 1)/8$ (an $8\Delta x$ wave), for which the slow mode, $3(N + 1)/8$ with $\lambda = 8\Delta x/3$, has both a significant amplitude and a slow phase speed. The results of one such experiment are shown in Figures 2.6-22 and 2.6-23 for N = 7 and m = 1, which is an $8\Delta x$ wave. The curve labeled t = 0 in Figure 2.6-22 is the $8\Delta x$ wave for lumped linears that is composed of the two quadratic eigenvectors, m = 1 and m = (N + 1)/2 - 1 = 3, which agrees with $\cos 2\pi j/8$ at the nodes (and also in between when linear interpolation is employed, as done here). The curves labeled t = 1 and t = 2 in Figure 2.6-22 show the quadratic element solution, from (2.6-103), after one and two laps, respectively, and that labeled t = 100 shows it after 100 laps. The goal in all cases is the exact transport of the linear eigenvector, the t = 0 curve. Now look at Figure 2.6-23, in which the m = 1 quad eigenvector is synthesized by two linears (m = 1 and m = 3). [Also shown is the first mode linear eigenvector at t = 0 to show that they (linears vis-a-vis quads) really are not all that different.] The solution at later times, however, is quite different. The results after just one lap (t = 1) are already in serious error, which just gets worse with larger t, so we stop with the t = 2 result. The surprising (?) result is that the two quads advect the single $8\Delta x$ mode of lumped linears (= second-order FDM in this case) much better than do the linears themselves! This is simply because $c_1$ for quads is very close to u, and $\beta_1$ is not far from unity.
This can perhaps be better appreciated by showing the 'numbers' for this case (and let the reader examine other cases); (2.6-103) gives

$$T_j(t) = 1.03418\left[\frac{1 + (-1)^j}{2} + 0.94733\cdot\frac{1 - (-1)^j}{2}\right]\cos 2\pi(j/8 - 1.00115\,t) - 0.03418\left[\frac{1 + (-1)^j}{2} + 0.59378\cdot\frac{1 - (-1)^j}{2}\right]\cos 6\pi(j/8 - 0.89960\,t),$$

and (2.6-107) gives

$$T_j(t) = 0.97367\cos 2\pi(j/8 - 0.90032\,t) + 0.02633\cos 6\pi(j/8 - 0.30011\,t).$$

The m = 1 quad has a (leading) phase speed error of 0.115% vs 10% (lagging) for lumped linears (and 0.19% lagging for consistent linears, not shown) and an amplitude 'error,' in each case, of only about 3%. And this is nearly the 'worst' case; larger or smaller values of m/N would reduce these differences. So much for the 'wrong' quad eigenvector.

The bottom line on this two-for-one digression is that while higher-order elements are definitely much harder to analyze, a correct analysis is all the more important if unwarranted or even erroneous conclusions are not to be drawn.

o Numerical experiments. We now abandon exact eigenvector solutions and turn to some 'numerical' results, all with u = 1 and obtained by integrating the ODE's with a 'time-marching' ODE method (to be discussed in the next section) and a small enough $\Delta t$ that the results can be considered virtually exact. We do this to study spatial errors separately from temporal errors. Thus, all results to follow can be considered to be the exact solution of the ODE's with interpolated IC's, as if we had actually employed the eigenvector expansions just discussed.
DISPERSION, DISSIPATION, PHASE SPEED, GROUP VELOCITY, MESH DESIGN 167
Fig. 2.6-22 Two quads make one linear.
Fig. 2.6-23 Two linears make one quad.
We start with the $\sigma = 1.6\Delta x$ Gaussian discussed earlier. Figure 2.6-24 shows the initial Gaussian, centered at $x = 0.15$ on a mesh with 40 nodes ($N = 40$ for linears, $N + 1 = 40$ for 20 quads) as well as its linear interpolate. The integration was stopped at $t = 0.6$ with the exact solution now centered at $x = 0.75$. Also shown are the results from GFEM, for both linear and quadratic basis functions. While a small amount of dispersion (and associated wiggles) is already apparent for linears, the quadratic element's solution is still quite close to the interpolant. To gain further perspective, Figure 2.6-25 shows a few finite difference results for the same problem. The highly damped shape (owing to numerical diffusion) of the curve labeled first-order FD is that from equations (2.6-67) through (2.6-73) with lumped mass [omit the $(2 + \cos\theta)/3$ factor]. It is readily apparent
that the other (centered) FDM results display significant dispersion error ($\sigma = 1.6\Delta x$ is not easy for these methods, in spite of the nice Fourier spectrum of Figure 2.6-21). Also clear is that linear FEM is quite a bit more accurate than even fourth-order FDM. A final noteworthy result is that the highly dispersive second-order FDM is the same as FEM linear basis functions with lumped mass: our first actual demonstration of the truly deleterious effect caused by mass lumping in advection-dominated flows.
Fig. 2.6-24 Pure advection in 1D; Gaussian waveform, finite element results. (Curves: Exact, Linear FE, Quadratic FE; T versus x.)
Fig. 2.6-25 Pure advection in 1D; Gaussian waveform, finite difference results. (Curves: Exact, 1st order FD, 2nd order FD, 4th order FD; T versus x.)
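The comparison between consistent-mass and lumped-mass linears can be reproduced in a few lines. The sketch below is not the calculation behind Figures 2.6-24 and 2.6-25: it assumes periodic BC's on a unit domain (the text notes later that the periodic case would be little different), and it advances each discrete Fourier mode exactly in time with the modal frequency $\omega = u\sin\theta/(m(\theta)h)$, where $m = (2+\cos\theta)/3$ for consistent mass and $m = 1$ for lumped mass. The function names are illustrative only.

```python
import numpy as np

def advect_linear_elements(T0, u, h, t, lumped):
    """Exact-in-time solution of the semi-discrete equations on a periodic uniform mesh."""
    n = T0.size
    theta = 2.0 * np.pi * np.fft.fftfreq(n)            # theta = k*h for each discrete mode
    mass = np.ones(n) if lumped else (2.0 + np.cos(theta)) / 3.0
    omega = u * np.sin(theta) / (mass * h)              # modal frequency
    return np.real(np.fft.ifft(np.fft.fft(T0) * np.exp(-1j * omega * t)))

def periodic_gaussian(x, center, sigma, length=1.0):
    """Gaussian evaluated with the shortest periodic distance to its center."""
    d = (x - center + 0.5 * length) % length - 0.5 * length
    return np.exp(-0.5 * (d / sigma) ** 2)

n, u, t_end = 40, 1.0, 0.6          # 40 nodes, unit speed, stop at t = 0.6
h = 1.0 / n
x = h * np.arange(n)
sigma = 1.6 * h                     # the sigma = 1.6*dx Gaussian
T0 = periodic_gaussian(x, 0.15, sigma)
exact = periodic_gaussian(x, 0.15 + u * t_end, sigma)

err_consistent = np.max(np.abs(advect_linear_elements(T0, u, h, t_end, lumped=False) - exact))
err_lumped = np.max(np.abs(advect_linear_elements(T0, u, h, t_end, lumped=True) - exact))
print(f"max nodal error, consistent mass: {err_consistent:.3f}")
print(f"max nodal error, lumped mass:     {err_lumped:.3f}")
```

Because time integration is exact here, any difference between the two printed errors is purely spatial, which is the separation of spatial and temporal errors that these experiments aim for.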
If one studies the GFEM equations with linear basis functions on a variable mesh (different element lengths) via Taylor-series methods, the result is rather discouraging: the super-convergence enjoyed on a uniform mesh (fourth-order) drops to 'inferior'-convergence (first-order). But we have said several times that GFEM is generally not amenable to such analyses, and we demonstrate that now. We designed a mesh with 53 linear elements that alternated between $\Delta x = 0.025$ and $0.0125$ and solved the same ($\sigma = 1.6 \times 0.025$) problem. Figure 2.6-26 shows the result, and also that from a truly second-order FDM on a variable mesh, in which $\partial T/\partial x|_0$ is approximated by $\frac{1}{h_L + h_R}\left[(h_R/h_L)(T_0 - T_L) + (h_L/h_R)(T_R - T_0)\right]$. The results are rather striking; both results are insensitive to the variable mesh! That is, FDM still looks poor, and GFEM still looks good; in fact, the GFEM result is even more accurate here than it was on the uniform mesh; it responds to the extra nodes by giving extra accuracy. So much for Taylor, and for the misleading 'note of caution' on p. 422 of Vichnevetsky (1987).
Moving up in difficulty, a la Figure 2.6-21, we next study the advection of a triangle. The base width ($2l$) is $8\Delta x$, and the grid contains 50 nodes. The IC in Figure 2.6-27 is centered at $x = 0.12$, and it is noteworthy that here we have no interpolation error in the IC. The $t = 0.6$ result is shown for linear elements, both consistent (GFEM) and lumped (FDM). Comparing the latter result with that in Figure 2.6-25, it is clear, since the solutions look much the same, that the FDM cannot recognize the difference between a $1.6\Delta x$ Gaussian and the triangle. The GFEM does notice the difference: it holds the triangular shape pretty well but leaves behind a longer trail of wiggles for this less-smooth waveform. Now we move to the really hard wave, a square wave, for which GFEM falls on its face, as does FDM.
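Returning briefly to the variable-mesh difference formula quoted above for the Figure 2.6-26 comparison: a Taylor expansion shows that its $T''$ terms cancel for any $h_L$ and $h_R$, so it differentiates quadratics exactly. The sketch below checks this on the two element lengths used in the text; the test polynomial, the evaluation point and the function names are arbitrary choices made here for illustration, and the 'naive' formula is included only for contrast.

```python
def nonuniform_centered_derivative(t_left, t_mid, t_right, h_left, h_right):
    """dT/dx at the middle node from its two unequally spaced neighbours."""
    return ((h_right / h_left) * (t_mid - t_left)
            + (h_left / h_right) * (t_right - t_mid)) / (h_left + h_right)

def naive_centered_derivative(t_left, t_right, h_left, h_right):
    """(T_R - T_L)/(h_L + h_R): only first-order accurate when h_L != h_R."""
    return (t_right - t_left) / (h_left + h_right)

def quadratic(x):
    return 3.0 * x * x - 2.0 * x + 1.0       # exact derivative is 6x - 2

x0, h_left, h_right = 0.4, 0.025, 0.0125     # the two element lengths used in the text
values = quadratic(x0 - h_left), quadratic(x0), quadratic(x0 + h_right)

exact = 6.0 * x0 - 2.0
weighted = nonuniform_centered_derivative(*values, h_left, h_right)
naive = naive_centered_derivative(values[0], values[2], h_left, h_right)
print(f"exact {exact:.6f}  weighted {weighted:.6f}  naive {naive:.6f}")
```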
Fig. 2.6-26 Pure advection in 1D; Gaussian waveform, variable-grid results. (Curves: Exact, Linear FE, 2nd order FD; T versus x.)
Figure 2.6-28 shows a $10h$ step ($2l = 10\Delta x$ in Figure 2.6-21) on a 100-node uniform mesh, both at $t = 0$ ($x_0 = 0.15$) and at $t = 0.6$ ($x_0 = 0.75$) for linear and quadratic GFEM, as well as second-order FDM (lumped linears). All are lousy, wiggly, dispersive. The discontinuities excite the bad end of the spectrum. Clearly, if advection of discontinuities is important, GFEM is of questionable utility. Fortunately, such is rarely
the case in the 'applied' (incompressible!) world with which we are familiar. For methods that are more-or-less designed to solve problems with discontinuities and sharp fronts, see the recent text by Finlayson (1992).
Fig. 2.6-27 Pure advection in 1D; triangular waveform. (Curves: Exact, Linear FE, 2nd order FD; T versus x.)
Fig. 2.6-28 Pure advection in 1D; rectangular waveform.
Remarks:
(1) The above results were taken from Lee et al. (1976) and did not employ periodic BC's; the inlet had $T = 0$, and the outlet had no BC. But the calculations were stopped well short of the exit, so that the periodic BC case would be little different.
(2) It is interesting that for lumped linears, the exact solution of the ODE's is available in terms of Bessel functions of the first kind. For example, Wurtele (1961) shows the following solution for the discrete 'delta' function IC ($T_j(0) = 1$ for $j = 0$, and $0$ for $j \ne 0$): $T_j(t) = J_j(ut/h)$, where $J_j$ is a Bessel function of the first kind, and more general solutions (for more general IC's) via linear combinations of these 'fundamental solutions.'
(3) Sincovec (1972) shows, for the square wave above, that other higher-order methods, such as cubic spline Galerkin and cubic Hermite collocation, also do a very bad job.
The next series of figures (with thanks to J. Rowley) shows another view of a $10\Delta x$ square wave ($2l = 10h$), this time on an 80-node mesh, beginning with the modal decomposition plot of the IC; i.e., Figure 2.6-29 shows the absolute value of the (normalized) projection of the step function onto the 41 resolvable modes [see (2.6-95) with $N = 80$ and (2.6-99) with $N = 79$], which shows 39 sines and cosines plus the constant (zero wave number) mode plus the $2\Delta x$ mode for both linears and quads. Mode 0 is the constant mode [$F(0) = 1.0$], and mode 41 is $2\Delta x$. Comparing the (discrete) eigenvector expansion in Figure 2.6-29 with the (continuous) Fourier transform of Figure 2.6-21 (for $kl = 5\pi$) shows some similarity and some differences; e.g., the discrete case shows $4\tfrac{1}{2}$ lobes vs 4 for the continuous case. The similarity of the spectra for the two discrete cases is closer yet, suggesting more similarities than differences between the two, even though each quad mode is a linear combination of two linear modes and vice versa.
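The Bessel-function solution cited in Remark (2) can be checked numerically. The sketch below integrates the lumped-linear ODE's, $dT_j/ds = -(T_{j+1} - T_{j-1})/2$ with $s = ut/h$, from a discrete delta IC using classical RK4, and compares the nodal values with $J_j(s)$ evaluated from Bessel's integral. Everything about the setup is an assumption made for illustration: the lattice is truncated at $|j| \le 40$ (where $J_j(5)$ is negligible), and the step count and quadrature resolution are arbitrary.

```python
import math

def bessel_j(order, x, points=200):
    """Integer-order J_n(x) from (1/pi) * integral_0^pi cos(n*tau - x*sin(tau)) dtau (midpoint rule)."""
    dtau = math.pi / points
    total = 0.0
    for i in range(points):
        tau = (i + 0.5) * dtau
        total += math.cos(order * tau - x * math.sin(tau))
    return total * dtau / math.pi

def lumped_linear_rhs(T):
    """dT_j/ds = -(T_{j+1} - T_{j-1})/2 with s = u*t/h; values beyond the ends are taken as zero."""
    n = len(T)
    return [-((T[j + 1] if j + 1 < n else 0.0) - (T[j - 1] if j > 0 else 0.0)) / 2.0
            for j in range(n)]

def rk4_step(T, ds):
    """One classical fourth-order Runge-Kutta step."""
    k1 = lumped_linear_rhs(T)
    k2 = lumped_linear_rhs([a + 0.5 * ds * b for a, b in zip(T, k1)])
    k3 = lumped_linear_rhs([a + 0.5 * ds * b for a, b in zip(T, k2)])
    k4 = lumped_linear_rhs([a + ds * b for a, b in zip(T, k3)])
    return [a + ds * (b + 2.0 * c + 2.0 * d + e) / 6.0
            for a, b, c, d, e in zip(T, k1, k2, k3, k4)]

half_width, s_end, steps = 40, 5.0, 500
T = [0.0] * (2 * half_width + 1)
T[half_width] = 1.0                           # discrete delta at j = 0
for _ in range(steps):
    T = rk4_step(T, s_end / steps)

max_diff = max(abs(T[half_width + j] - bessel_j(j, s_end))
               for j in range(-half_width, half_width + 1))
print(f"max |T_j - J_j(ut/h)| at ut/h = {s_end}: {max_diff:.2e}")
```

Note that nodes upstream of the delta ($j < 0$) also acquire nonzero values, since $J_{-j} = (-1)^j J_j$; this is the upstream-propagating short-wave content discussed throughout this section.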
The differences show up in the phase speeds, especially, for example, that between CM quads and LM linears; recall that, based on the zero-group-velocity criterion, three-quarters of the modes look easy for quads, but only one-half look easy for lumped linears, a 50% increase in favor of quads. The practical effect of these phase speed differences will be well demonstrated later. Now we just show a short integration, in Figure 2.6-30, wherein the center of the square wave moves just a small distance, from 0.5625 to 0.625, in which we show results in the order 'best to worst'; i.e., CM quads, CM linears, LM quads, and LM linears. It is clear that none do very well; also clear are the short-wave upstream-moving wiggles, the fastest moving at close to the group speed of the $2\Delta x$ mode; namely, $-5$, $-3$, $-2$, and $-1$, respectively, in the order of appearance.
Shown next is another step function, this time one with a much longer flat top ($2l = 40\Delta x$ rather than $10\Delta x$), shown in Figure 2.6-31, the first of which shows the IC and its 'Fourier' spectrum (i.e., eigenvector amplitude coefficients) in which the excruciatingly slow decay of the higher modes' amplitude coefficients is particularly notable, showing again that square waves are difficult. (Shown in Figure 2.6-31 is the spectrum for quads, but that for linears is virtually identical.) Figure 2.6-32 shows the results at $t = 0.0625$, complete with upstream wiggles, again corroborating the Fourier 'analysis.'
A simpler periodic function is examined next, briefly. The $C^0$ function given by $x(1 - x)$ is 'easy' in that only the first derivative is discontinuous, as the discrete spectrum in Figure 2.6-33 shows: only the lower 20% or so of the modes are 'active.' The single result shown in Figure 2.6-34, linear elements at $t = 0.125$, is again 'consistent': only a small amount of dispersion is present.
A less simple ($C^{-1}$) function, a periodic ramp with a discontinuity, is portrayed next: a sawtooth function.
Note the smooth but slow drop-off of the discrete spectrum
shown in Figure 2.6-35 (for linears; quads look virtually the same), like $1/n$ (or $1/k$ for the continuum). The four results, at $t = 0.125$, are shown in Figure 2.6-36. Here again the more accurate methods tend not to look so because of the upstream-moving wiggles: group velocity 'in action.'
Fig. 2.6-29 A $10\Delta x$ square wave and its modal decomposition. (Panel (a): initial condition, f(x) versus x; spectrum panels: |F(k)| versus normalized frequency.)
The next four examples, and the last in this series, are designed to further accentuate the group velocity aspects of the several methods under scrutiny. We begin with the experiment inspired by Cathers and O'Connor (1985) and place a truncated $2\Delta x$ wave on the 80-node span (see Figure 2.6-37) for both linear and quadratic elements (and apologize for the different 'scale' used for quads). The discrete Fourier spectra are again revealing, showing (logically) the dominance of the $2\Delta x$ eigenvector. [The spectrum for
Fig. 2.6-30 Brief advection of a square wave. (Panels include (a) Consistent quads and (b) Consistent linears; T versus x.)
the quadratic shows a larger fraction of long modes; these are required to compensate for the '$\beta$-factor' that shifts the (dominant) short modes away from an average value of zero.] The solutions for all four cases, shown at $t = 0.125$ in Figure 2.6-38, reveal that the wave packet is indeed dominated by the group velocity of the $2\Delta x$ wave, since all packets move leftward at close to its corresponding group velocity. Thus, in some sense, the error is 'total' in that the exact solution (translation to the right of the initial sawtooth at unit speed) can in no way be well approximated by any of the methods. The positive 'bias' shown by quads is a reflection of the IC, whose sawtooth displayed positive values for end nodes and negative values (of the same magnitude) for center nodes. If the IC is shifted by one grid length, then the 'reverse' is seen, and the solution displays a negative bias. [Note that Figure 13 in Cathers and O'Connor (1985) is, as pointed out by Gresho and Lee (1987), erroneous. The code bug was later corrected by Cathers (personal communication).]
Next we show (Figures 2.6-39 and 2.6-40) a wave packet that is obtained by modulating, and reflecting about the $x$-axis, a smooth Gaussian ($\sigma = 10h$) by $(-1)^j$, the $2\Delta x$ wave for linear elements, in which the excitation of long waves for the quad case (Figure 2.6-39) is even more evident/prominent, and for the same reason; except for this feature, the symmetric modulated Gaussian displays only short-wave excitation from linears. And the results are, correspondingly, cleaner: the wave packets are (with the exception of quads, whose difference will be made clear in the next example to follow) really translating leftward at the group velocity of the 'shortest' waves. (Exercise for the reader: from these two figures, deduce $t$ in Figure 2.6-40.)
Fig. 2.6-31 A $40\Delta x$ square wave and its modal decomposition. (Spectrum plotted versus normalized frequency.)
Fig. 2.6-32 Brief advection of a longer square wave. (Panels include (a) Consistent quads and (b) Consistent linears.)
Fig. 2.6-33 A periodic $C^0$ function and its modal decomposition. (Spectrum plotted versus normalized frequency.)
Fig. 2.6-34 Brief advection of $x(1 - x)$ via linears.
The fact that there are no rightward moving waves (for linears) is a consequence of the reflected Gaussian: the smooth envelope cancels out so that only the short waves are seen. In contrast, if we had used an unsymmetric Gaussian, say positive only (as in earlier examples), the solution would display similar leftward moving 'noise' plus a smooth Gaussian moving rightward at $c = u = 1$. For examples of this case, see Vichnevetsky (1987), in which paper he also 'corrects' (on p. 423) an erroneous statement related to this behavior in his book with Bowles (Vichnevetsky and Bowles, 1982). There they discuss the separation of smooth and rough solutions via, in part, an appeal to the second-order wave equation, $T_{tt} = u^2 T_{xx}$, easily derivable from the first-order equation of interest, $T_t + uT_x = 0$, to make the following statement, '... which shows that the central-difference semi-discretization is a consistent approximation of (the second-order wave equation) rather than that of the advection equation, $T_t + uT_x = 0$.' What they
meant to say [R. Vichnevetsky (1985), personal communication] was 'in addition to' rather than 'rather than.' Nevertheless, the remaining discussion and analysis presented there is useful, interesting, and enlightening (even if not absolutely necessary, nor perhaps totally rigorous; e.g., for $h \to 0$, implied by the writing of PDE's, the amplitude of their left-moving difference wave goes to zero).
Fig. 2.6-35 A sawtooth function and its modal decomposition. (Panel (a): initial condition, f(x) versus x; panel (b): linears, spectrum versus normalized frequency.)
The third case is for quads (CM) only, and was kindly supplied by D.F. Griffiths. It uses an IC of a 'symmetric' Gaussian modulated by the quad's $2\Delta x$ eigenvector, hence the positive bias; see Figure 2.6-41. This wave packet (for which the theory of group velocity is most apt) corresponds more closely to that in the previous example using linear elements; i.e., it shows a purely leftward motion at a velocity of $\approx -5$. The rightward-moving smooth portion of the wave displayed in the previous example is completely absent (as are low mode eigenvector amplitude coefficients of the IC, not shown), owing to cancellation.
Another interesting experiment was performed by Cathers and O'Connor (1985); in their Figure 9 is shown an IC that is comprised of about five $2\Delta x$ waves, followed by the same number of $4\Delta x$ waves, followed by the same number of $8\Delta x$ waves. The results showed the proper wave packet 'separation' as each portion of the IC moved at (nearly) its own group velocity, which concludes (almost) our group velocity examples for the time being.
The last of the four group velocity demonstrations returns us to the eigenvector pictures in Figure 2.6-5, as promised there.
The 'exact' solution referred to in the legends of Figures 2.6-42 and 2.6-43, which figures are to be viewed top-to-bottom, is the exact solution of the ODE with the given mode (eigenvector) as an IC; they are given by a special case (pure cosine IC) of (2.6-95): $T_j(t) = \cos 2\pi m(j/N - c_m t)$ for $m = 2$, $N = 80$, where $c_m = \sin(2\pi m/N)/(2\pi m/N) = 0.9959$, and the same equation for $m = 38$ for
which $c_m = 0.0524$. Each mode is shown at $t = 0$ and at $t = \tau/4$, $2\tau/4$, $3\tau/4$, and $4\tau/4$, where $\tau = 2\pi/[N\sin(2\pi m/N)]$ is the wave's period, $2\pi/\omega = 1/mc_m = 0.5021$ for both $m = 2$ and $m = 38$, owing to the symmetry of the sine function. [The consistent mass version (not shown) would not display this (fortuitous) symmetry. It would have $c_2 = 0.9999966$, $c_{38} = 0.15533$ and thus $\tau_2 = 0.50000$ and $\tau_{38} = 0.1694$.]
Fig. 2.6-36 Brief advection of a sawtooth wave.
Anyway, the object of the exercise is to point out the temporal behavior of low vis-a-vis high modes, and
Fig. 2.6-37 A finite $2\Delta x$ wave and its modal decomposition. (Spectrum plotted versus normalized frequency.)
Fig. 2.6-38 Brief advection of the truncated $2\Delta x$ wave. (Panels include (a) Consistent quads and (c) Lumped quads; T versus x.)
this we now do. Mode 2 moves rightward at its phase speed (0.9959) 'because' it is a pure sinusoidal wave. But this is definitely not the case for mode 38, which moves leftward at its group velocity, $G_{38} = \cos 2m\pi/N = -0.9877$, 'because' it is a wave group, comprised as it is of a $2\Delta x$ wave and a $40\Delta x$ wave. A reaction from the maker of these figures, J. Rowley, after viewing the results in 'movie mode,' is relevant: 'The groups zoom to the left while the envelope creeps to the right.' Thus, since a general IC requires 'all' modes to represent it (80 in this case), the solution will be poor to the extent that the IC contains significant amounts of the high modes. [A close perusal of Mode 38 in Figure 2.6-43 will reveal that the actual plot is not quite right; plotted is $\cos 76\pi(j - 1)/80$ for $1 \le j \le 79$.]
Fig. 2.6-39 A modulated Gaussian and its modal decomposition. (|F(k)| versus normalized frequency.)
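The mode-2 and mode-38 numbers quoted above follow directly from the lumped-linear relations $c_m = \sin\theta/\theta$, $G_m = \cos\theta$ and $\tau = 1/(mc_m)$ with $\theta = 2\pi m/N$, and the bracketed consistent-mass values from $c_m = 3\sin\theta/[\theta(2+\cos\theta)]$. The short sketch below simply evaluates them for $N = 80$; it assumes $u = 1$ and a unit-length periodic domain, and the function names are illustrative.

```python
import math

def lumped_linear_mode(m, n_nodes, u=1.0):
    """Phase speed, group velocity and period of mode m for lumped linears (unit domain)."""
    theta = 2.0 * math.pi * m / n_nodes           # theta = k*h
    phase_speed = u * math.sin(theta) / theta
    group_velocity = u * math.cos(theta)
    period = 1.0 / (m * phase_speed)              # wavelength 1/m divided by phase speed
    return phase_speed, group_velocity, period

def consistent_linear_phase_speed(m, n_nodes, u=1.0):
    """Phase speed of mode m for consistent-mass linears, using m(theta) = (2 + cos(theta))/3."""
    theta = 2.0 * math.pi * m / n_nodes
    return 3.0 * u * math.sin(theta) / (theta * (2.0 + math.cos(theta)))

results = {}
for m in (2, 38):
    c, g, tau = lumped_linear_mode(m, 80)
    c_cm = consistent_linear_phase_speed(m, 80)
    results[m] = (c, g, tau, c_cm, 1.0 / (m * c_cm))
    print(f"m={m:2d}: c={c:.4f}  G={g:.4f}  tau={tau:.4f}  c(CM)={c_cm:.7f}  tau(CM)={1.0 / (m * c_cm):.4f}")
```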
Fig. 2.6-40 Brief advection of a modulated Gaussian wave. (Panels include (b) Consistent linears.)
Fig. 2.6-41 Reverse advection of a $2\Delta x$-modulated Gaussian; $\sigma/\Delta x = 5$, $\Delta x = 0.02$. (Snapshots at t = 0, .04, .08, .12, .16, .20; x from -1.0 to 1.0.)
Fig. 2.6-42 'Exact' advection of mode 2 for lumped linears (N = 80), shown each quarter cycle. (Horizontal axis: j from 0 to 80.)
Fig. 2.6-43 'Exact' advection of mode 38 for lumped linears (N = 80), shown each quarter cycle. (Horizontal axis: j from 0 to 80.)
For our final final example, we 'get more serious' about sending waveforms around the periodic circuit, finally getting away from all of the short time results shown heretofore. Specifically, we shall chase a $\sigma = 4\Delta x$ Gaussian around and around an 80-node circuit and compare several methods, using results previously published in Rowley and Gresho (1987). Figure 2.6-44(a) shows a result using the least accurate of the methods considered: lumped linears (= centered second-order FDM) after 15 circuits/cycles. Dispersion already so dominates that the Gaussian is no longer discernible. The next three are much more accurate, so we subjected them to a tougher test: 80 laps through the mesh. Figures 2.6-44(b) and (c) show, as discussed and demonstrated previously, that lumped quads and consistent linears are quite close in advection accuracy: their average speeds are ~0.9975, with slightly larger wiggles from lumped quads. Consistent quads, on the other hand, are almost 'spot on' (see Figure 2.6-44(d)), demonstrating the remarkable accuracy alluded to earlier.
To finish, we show some variable-grid results, partly to show that GFEM is still quite good, but also to show some puzzling, and not yet understood, results. Starting with the 80-node uniform mesh, we randomly perturbed the nodes to generate element lengths up to 20% above or below the uniform value of 0.0125. (Note that variable-grid FEM results are also non-dissipative because the advection matrix remains skew-symmetric.) All results are shown after 80 cycles with an average CFL number of 0.1 using the trapezoid rule for time integration. Figure 2.6-45(a) shows the worst result: lumped linears not only lose the Gaussian but really generate a lot of high frequency noise, especially near $2\Delta x$. Mass lumping on non-uniform meshes of linear elements is not a wise move.
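The parenthetical claim above, that the advection matrix stays skew-symmetric on a variable grid, is easy to confirm for linear elements. The sketch below assembles the Galerkin mass and advection matrices on a periodic 80-element mesh with randomly perturbed element lengths; the random seed, the perturbation law and the function name are illustrative assumptions, not the mesh used by Rowley and Gresho (1987).

```python
import numpy as np

def assemble_periodic_linear(element_lengths, u=1.0):
    """Galerkin mass matrix M and advection matrix C (C_ij = integral of phi_i * u * dphi_j/dx)."""
    n = len(element_lengths)
    M = np.zeros((n, n))
    C = np.zeros((n, n))
    for e, h in enumerate(element_lengths):
        nodes = (e, (e + 1) % n)                           # periodic connectivity
        m_e = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        c_e = u / 2.0 * np.array([[-1.0, 1.0], [-1.0, 1.0]])   # independent of h
        for a in range(2):
            for b in range(2):
                M[nodes[a], nodes[b]] += m_e[a, b]
                C[nodes[a], nodes[b]] += c_e[a, b]
    return M, C

rng = np.random.default_rng(0)
lengths = 0.0125 * (1.0 + rng.uniform(-0.2, 0.2, size=80))   # up to 20% above or below 0.0125
M, C = assemble_periodic_linear(lengths)

T = rng.standard_normal(80)
skew_defect = np.max(np.abs(C + C.T))
energy_rate = T @ C @ T        # d/dt (T^T M T / 2) = -T^T C T for M dT/dt + C T = 0
print(f"max |C + C^T| = {skew_defect:.1e},  T^T C T = {energy_rate:.1e}")
```

Since $T^T C T = 0$ for every $T$, the discrete 'energy' $T^T M T/2$ is conserved by the semi-discrete equations whatever the element lengths.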
Consistent linears (GFEM), on the other hand, do quite well—in spite of their 'first-order spatial accuracy'; Figure 2.6-45(b) shows about the same phase error as for the uniform ('fourth-order') mesh, but the results are somewhat polluted by low-amplitude, high-frequency wiggles. Perhaps a remark by Vichnevetsky
(1987), for gradually changing mesh size, is relevant here, even though our mesh change may not qualify here as gradual: 'During the passage of such wave packets, the frequency ($\omega$) remains constant and satisfaction of the dispersion relation with $h$ variable results in an $x$-dependence of the wave number.'
Fig. 2.6-44 Uniform mesh of 80 nodes: (a) Lumped linears after 15 cycles through the domain, (b) Lumped quads after 80 cycles through the domain, (c) Consistent linears after 80 cycles through the domain, (d) Consistent quads after 80 cycles through the domain.
Lumped and consistent quads are shown in Figures 2.6-45(c) and (d), respectively, in which more puzzling results are apparent: (i) although lumped quads show slightly stronger wiggles, both show about the same phase error, (ii) consistent quads seem to retain the proper shape of the Gaussian, thus
suggesting that the phase speed error is basically the same for all eigenvectors, although we note (or believe) that only the low modes will have significant initial amplitudes for this easy-for-quads-to-resolve case.
Fig. 2.6-45 Non-uniform 80-node mesh; 80 cycles: (a) Lumped linears, (b) Consistent linears, (c) Lumped quads, (d) Consistent quads.
But another part of the puzzle, not shown by these results, is this: different runs, with different randomized node locations, yield different, and surprising, results in that dispersion seems to be virtually absent and sometimes the numerical solution moves faster, sometimes slower, than the continuous one. For example, for six runs, the average speed of the Gaussian over 80 laps ranged from 0.9963 to 1.0031. For further results and discussion of this issue, see Rowley and Gresho (1987). We leave the resolution/explanation of these curious results as an exercise for the reader (!), with
a plea to share the explanation with us. Perhaps alternate IC's, such as the best least-squares fit via the consistent mass matrix, would deliver further insight; we simply used the interpolant IC.
b. Advection-diffusion with periodic BC's
Little need be added to the pure advection discussion when the diffusion term is added to the equation, except to note that: (i) the problem becomes parabolic rather than hyperbolic, and thus much easier since 'roughness' no longer remains but is diffused into 'smoothness'; and (ii) the rate of diffusion in the semi-discrete equations only approximates the correct rate. Each of the eigenmodes (still $e^{ikx}$) will decay (diffuse) while advecting, and more quickly the higher the wave number owing to the larger gradients. The analytic solution for the continuous case is easily found to be a simple combination of that for pure advection (already discussed) and that for pure diffusion ($e^{ikx - k^2\kappa t}$); namely,
$$T(x,t) = e^{-k^2\kappa t}\,e^{ik(x - ut)}, \qquad (2.6\text{-}108)$$
where $k\,(= k_n) = 2\pi n$. The eigenvalue for the AD operator is thus $\lambda_n = k_n^2\kappa + ik_n u$. Each Fourier mode (eigenfunction) undergoes scale-dependent diffusion while being advected at speed $u$, and, interestingly, this is the only case (BC's) for which eigenfunction advection actually occurs; they decay in place for all other BC's, as we shall see.
For the semi-discretized approximation we will restrict the discussion to linear elements and focus primarily on how GFEM and several other schemes approximate the decay rate. The semi-discrete equations of interest are obtained by adding the diffusion term to (2.6-17):
$$\tfrac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + \frac{u}{2h}\left(T_{j+1} - T_{j-1}\right) = \frac{\kappa}{h^2}\left(T_{j-1} - 2T_j + T_{j+1}\right), \quad j = 1, 2, \ldots, N. \qquad (2.6\text{-}109)$$
We seek a solution of the form $T_j(t) = e^{-\mu t}e^{i(kjh - \omega t)}$ to easily obtain
$$-(\mu + i\omega)\,m(\theta) + \frac{iu}{h}\sin\theta = -\frac{2\kappa}{h^2}(1 - \cos\theta), \qquad (2.6\text{-}110)$$
where $\theta = kh$ and $m(\theta)$ is the mass matrix 'response,' $m(\theta) = (2 + \cos\theta)/3$, also called the 'matrix symbol.'
But we shall 'generalize' $m(\theta)$ to permit the inclusion of three additional mass matrix approximations, as follows: $m = 1$ for lumped mass, $m = (1 + \cos\theta)/2$ for one-point quadrature (the 'Box' scheme), and $m = (3 + \cos\theta)/4$ for CVFEM. In all cases, (2.6-110) yields
$$\mu = 2\kappa(1 - \cos\theta)/mh^2, \qquad (2.6\text{-}111)$$
$$\omega = u\sin\theta/mh, \qquad (2.6\text{-}112)$$
to give the eigenvalue $\lambda = \mu + i\omega$ and the solution
$$T_j(t) = e^{-2\kappa t(1 - \cos\theta)/mh^2}\,e^{ik(jh - ut\sin\theta/m\theta)}, \qquad (2.6\text{-}113)$$
which is to be compared with (2.6-45) through (2.6-47), the pure advection limit ($\kappa = 0$) above, and of course to (2.6-108), the 'goal.' These eigenvectors translate to the right at
the (mode-dependent) phase speed while being damped by diffusion. It is seen that the advection portion of the solution is unchanged, thus permitting us to focus on the diffusion portion. Figure 2.6-46 shows the effective diffusivity, $\kappa_{\mathrm{eff}}/\kappa = 2(1 - \cos\theta)/m\theta^2$, ratioed to the true diffusivity for the above-mentioned schemes. We see that (i) CM (GFEM) is over-diffusive, an ostensible advantage in the shortwave portion ($\theta > \tfrac{1}{2}\pi$; see Figure 2.6-3), since these are the modes that are inaccurately advected; i.e., why not get rid of the noise more quickly? (ii) LM is rather under-diffusive (noise will linger longer than it should); (iii) CVFEM is clearly the winner (finally!); and (iv) the box scheme is a clear loser. Finally, we leave the analysis of quads as an exercise, perhaps not an easy one.
Exercise for the reader:
(1) Replace the GFEM advection term by its upwinded counterpart (cf. 2.6-67). Show that the resulting AD ODE is
$$\tfrac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + \frac{u}{2h}\left(T_{j+1} - T_{j-1}\right) = \frac{u}{2h}\left(\frac{2L}{h\,Pe} + 1\right)\left(T_{j-1} - 2T_j + T_{j+1}\right),$$
which shows, for large $Pe$, that the equation (and therefore its solution) becomes completely independent of $Pe$ (all diffusion is numerical), a result that obtains for other BC's as well, and one that carries over to both 2D and 3D and thus should relegate simple upwinding to the numerical methods cemetery.
(2) Show that the resulting effective diffusivity is $\kappa_{\mathrm{eff}} = \kappa + uh/2$ and that the effective Peclet number is $Pe_{\mathrm{eff}} = Pe/(1 + P)$, where $P = uh/2\kappa$ is the grid Peclet number, showing that $Pe_{\mathrm{eff}} \to 2L/h$ for $\kappa \to 0$; hyperbolic behavior is completely impossible with upwinding.
Fig. 2.6-46 Effective diffusivity of four methods. ($\kappa_{\mathrm{eff}}/\kappa$ versus $\theta$; curves labeled Box scheme, Consistent mass, Lumped mass.)
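The ranking of the four mass approximations can be checked by evaluating $\kappa_{\mathrm{eff}}/\kappa = 2(1 - \cos\theta)/m\theta^2$ directly. The sketch below does so at $\theta = \pi/2$ and also evaluates the upwinding result of Exercise (2); it assumes the exercise's $Pe$ means $uL/\kappa$ (which is what the stated limit $2L/h$ implies), and all names in the code are illustrative.

```python
import math

# Mass-matrix symbols m(theta) for the four schemes compared in the text.
MASS_SYMBOLS = {
    "consistent mass (GFEM)": lambda th: (2.0 + math.cos(th)) / 3.0,
    "lumped mass": lambda th: 1.0,
    "box scheme": lambda th: (1.0 + math.cos(th)) / 2.0,
    "CVFEM": lambda th: (3.0 + math.cos(th)) / 4.0,
}

def diffusivity_ratio(theta, mass_symbol):
    """kappa_eff / kappa = 2 (1 - cos(theta)) / (m(theta) * theta^2)."""
    return 2.0 * (1.0 - math.cos(theta)) / (mass_symbol(theta) * theta ** 2)

def upwind_effective_peclet(pe, h_over_length):
    """Pe_eff = Pe / (1 + P), with Pe = u L / kappa (assumed) and P = u h / (2 kappa)."""
    grid_peclet = 0.5 * pe * h_over_length
    return pe / (1.0 + grid_peclet)

ratios = {name: diffusivity_ratio(0.5 * math.pi, symbol) for name, symbol in MASS_SYMBOLS.items()}
for name, ratio in ratios.items():
    print(f"{name:24s} kappa_eff/kappa at theta = pi/2: {ratio:.4f}")

pe_eff_limit = upwind_effective_peclet(1.0e9, 0.0125)   # essentially kappa -> 0 with h/L = 0.0125
print(f"upwinded Pe_eff for Pe = 1e9, h/L = 0.0125: {pe_eff_limit:.2f}")
```

At this wave number the consistent-mass ratio exceeds one, the lumped-mass ratio falls below one, CVFEM lies closest to one and the box scheme lies farthest from it; the upwinded effective Peclet number saturates near $2L/h = 160$ no matter how large $Pe$ becomes.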
c. Advection-diffusion with Dirichlet BC's
o Continuum. We now switch from easy BC's to hard ones, still in 1D: we change from studying IVP's to IBVP's in which physical diffusion will play a major role. We are thus interested in approximate solutions to (2.3-1), such as given by (2.6-109), now with BC's
$$T(0,t) = T_0(t) \qquad (2.6\text{-}114)$$
and
$$T(L,t) = T_L(t), \qquad (2.6\text{-}115)$$
and IC (2.3-4). We will also address the steady-state version of the AD equation with the above BC's, which will lead to a discussion of a classic tough case that has been used, effectively but erroneously (we believe), to denigrate GFEM as a viable solution technique. In fact, if it were not for the excessive use and abuse of the above case, the steady-state section would probably not need to be written, at least for the advection-dominated case (which we emphasize here). But written it must be, and for reasons that are perhaps as much philosophical as technical. We hope, at the end, though, that the reader will share with us a more balanced view regarding the utility, or otherwise, of GFEMIA, as measured against the various 'upwinded' alternatives whose pushers/sales people/advocates/zealots believe in the need for artificial dissipation, intelligently (or, sometimes, otherwise) applied.
The reason that this section should not need to be written is that the use of a Dirichlet boundary condition as an 'outflow' BC is, to say the least, a bit silly in a fluid flow problem. It (usually) makes little or no sense physically and causes (or can cause) serious problems mathematically. That it is 'silly' has been addressed in (at least) Gartling (1978), Chang and Finlayson (1980), and Gresho and Lee (1981).
But we believe that more yet needs to be said, and we now proceed to do so, beginning with the associated eigenproblem, obtained by seeking $e^{-\lambda t}$ temporal behavior with homogeneous BC's, which leads directly to
$$u\Phi_x - \kappa\Phi_{xx} = \lambda\Phi \quad \text{on } 0 < x < L = 1, \qquad (2.6\text{-}116)$$
$$\Phi = 0 \quad \text{at } x = 0, L, \qquad (2.6\text{-}117)$$
where $\Phi$ is an eigenfunction, with concomitant eigenvalue $\lambda$. The solution is
$$\lambda_n = (n^2\pi^2 + Pe^2)\kappa/L^2, \qquad (2.6\text{-}118)$$
$$\Phi_n(x) = e^{Pe\,x/L}\sin n\pi x/L, \quad n = 1, 2, \ldots, \qquad (2.6\text{-}119)$$
where
$$Pe = uL/2\kappa \qquad (2.6\text{-}120)$$
is the (new) global Peclet number, the factor of two introduced solely for mathematical convenience [as indeed it was for the local (grid) Peclet number, $P = uh/2\kappa = Pe\,h/L$], and it is quite worth noting that each 'advection-diffusion' eigenfunction, as initial data, only diffuses (!) in place, according to $T(x,t) = e^{-\lambda t}\Phi(x)$; they do not advect at all, yet the linear combinations of them that define the general solution to the IBVP do 'advect,' or appear to, at least. Such is the 'power' of linear combinations/superposition. This somewhat 'strange' (abnormal?) modal behavior is related to the fact, discussed in more detail at the end of the next section (2.6.2d), that because of the BC's, the operator is no longer normal and the modes no longer 'simply' orthogonal (vis-a-vis the normal
periodic case with orthogonal modes). It is also noteworthy that the 'advection part' of the eigenvalue is independent of $n$ (cf. the periodic case).
Fig. 2.6-47 First three Dirichlet eigenfunctions and eigenvectors for Pe = 24; N = 79 (P = 0.3) for (b) and (c). (Panels: (b) Consistent mass, (c) Lumped mass; horizontal axis j.)
The exponential factor (the advection effect) is part of 'the problem.' And the Dirichlet OBC is the other part. These two 'do battle' in that the former wants large $\Phi$, and the latter small $\Phi$, at the outlet. Figure 2.6-47(a) shows the net result at a moderate $Pe$ (24) and small $n$; the trend is clear, and Dirichlet wins, as it must. The discrete versions (linear elements) shown in Figures 2.6-47(b) and (c) will be derived in the next subsection; suffice it to say here that the low modes are easily simulated when $P$ is small. But it wins at the expense of an extremely large gradient at $x = L$, given by
$$\Phi_n'(L) = (-1)^n\,\frac{n\pi}{L}\,e^{Pe}, \qquad (2.6\text{-}121)$$
$= 2.6 \times 10^{10}\,(-1)^n n\pi/L$ for $Pe = 24$, and of a concomitant thin OBL (outflow boundary layer), whose thickness is $O(Pe^{-1})$ or 'less' (less for large $n$). [The extremum closest
to $x = L$ (defining the OBL) is given by $x/L = (n\pi - \tan^{-1} n\pi/Pe)/n\pi$, giving $\delta/L = 1 - x/L = \frac{1}{n\pi}\tan^{-1}(n\pi/Pe)$, which looks like $\delta/L = 1/Pe$ for $n\pi \ll Pe$ and like $\delta/L = 1/2n$ for $n\pi \gg Pe$; i.e., the BL is very thin for large $n$ and/or $Pe \gg 1$.] Figure 2.6-48 shows the same first three eigenfunctions, but now with $Pe = 100$, and only for the right-most 20% of the domain. It is clear that advection takes the pure heat equation eigenfunction and, for large $Pe$, 'compresses' it toward the right boundary while strongly amplifying it there. If one is really trying to solve an applied problem in thermal (or other) analysis, it is almost clear already that the Dirichlet OBC is not such a good idea; not only does it make little or no sense physically, but it causes unnecessarily difficult mathematical and computational problems. But we shall 'push on' for largely historical reasons, leaving until the next section a better, more sensible OBC.
Fig. 2.6-48 Dirichlet eigenfunctions for Pe = 100.
Thus we have a hard problem for $Pe \gg 1$, since any solution of (2.3-1) with Dirichlet BC's is always a linear combination of these 'badly behaved' eigenfunctions, whose analytical solution, at least, is made simple (in principle) via the (adjoint) orthogonality condition
$$(\Phi_n, \Phi_m) = \int_0^L e^{-2Pe\,x/L}\,\Phi_n\Phi_m\,dx = \delta_{nm}/2, \qquad (2.6\text{-}122)$$
permitting (awkward) eigenfunction expansions of the following form: $f(x) = \sum_n a_n\Phi_n(x)$ via $a_n = 2\int f(x)\,e^{-Pe\,x/L}\sin(n\pi x/L)\,dx$, a somewhat counter-intuitive result for large $Pe$, weighting 'small $x$' as it does. Note too that the time constant for mode $n$,
$$\tau_n = 1/\lambda_n = \frac{L^2/\kappa}{n^2\pi^2 + Pe^2} = \frac{1}{n^2\pi^2\kappa/L^2 + u^2/4\kappa}, \qquad (2.6\text{-}123)$$
is a combination of a (physical) diffusional time scale, $\tau_D = L^2/\kappa n^2\pi^2$, and an advection-diffusional time scale, $\tau_{AD} = 4\kappa/u^2$, that is non-physical; the former (ultimately)
dominating for the high modes and the latter for the low modes (at large Pe), the 'crossover' occurring near $n = \mathrm{Pe}$, suggesting the possible need for many modes when $\mathrm{Pe} \gg 1$. We shall encounter the above non-physical time scale several more times.

A simple example of the use of these eigenfunctions is given by a constant initial condition, $T_0(x) = T_0$, with homogeneous boundary values, $T(0,t) = T(L,t) = 0$, which describes the advection and diffusion of a (two-sided) step function:

$$T(x,t) = T_0 \sum_{n=1}^{\infty} \frac{2n\pi\,[1 - (-1)^n e^{-\mathrm{Pe}}]}{n^2\pi^2 + \mathrm{Pe}^2}\, \Phi_n\, e^{-\lambda_n t} = T_0 \sum_{n=1}^{\infty} \frac{2n\pi\,[1 - (-1)^n e^{-\mathrm{Pe}}]}{n^2\pi^2 + \mathrm{Pe}^2}\, e^{\mathrm{Pe}\,x/L} \sin(n\pi x/L)\; e^{-(n^2\pi^2 + \mathrm{Pe}^2)\kappa t/L^2}, \tag{2.6-124}$$

which, for $\mathrm{Pe} \gg 1$, is a complex way to say that the (near) step function translates across the interval on the advective time scale ($L/u$) with 'massive' diffusion occurring at $x = L$, and at $x = ut$ for $ut < L$ (back-diffusion from the backside of the step). There are two interesting points worth noting here for this, or any other, $T_0(x)$: (i) while each individual mode has a decay time constant much smaller than the advective scale of $L/u$, the total process does occur on this slower time scale, thus suggesting (again) that 'many' modes will be required to get an accurate result, which is another manifestation of the statement that these (Dirichlet BC at outlet) are 'hard' problems for large Pe; (ii) each individual mode (eigenfunction, $\Phi_n$) decays 'in place' (N.B. As noted above, there is in fact no advection of these 'advection-diffusion' modes.) according to its decay constant ($\lambda_n$), the linear combination of them conspiring to generate a total result (waveform) that does move to the right (at speed $u$).

o Linear elements, mostly.
We turn now to the approximation eigenproblem, via GFEM and the associated discrete eigenproblem, the discrete version of (2.6-116) and (2.6-117), given for linear basis functions by

$$u\,\frac{T_{j+1} - T_{j-1}}{2h} - \kappa\,\frac{T_{j-1} - 2T_j + T_{j+1}}{h^2} = \lambda\,\frac{T_{j-1} + 4T_j + T_{j+1}}{6}, \quad j = 1, 2, \ldots, N, \tag{2.6-125}$$

with $T_0 = T_{N+1} = 0$, the solution of which is sufficiently challenging/interesting that we present it; i.e., we show one way to solve the generalized eigenproblem $Kz = \lambda Mz$ for simple tri-diagonal matrices. It is this: using the fact (e.g. Fletcher and Griffiths, 1980) that the eigenproblem $Az = \gamma z$, where $A$ is $N \times N$ and tri-diagonal with lower diagonal $l$, main diagonal $d$, and upper diagonal $u$ (not to be confused with the velocity), has the solution $z_j^{(m)} = (l/u)^{j/2} \sin[jm\pi/(N+1)]$ and $\gamma_m = d + 2\sqrt{lu}\,\cos[m\pi/(N+1)]$, we form $A = \lambda M - K$, and solve $Az = \gamma z$, after which we set $\gamma = 0$ to get $\lambda$. The result, applied to (2.6-125), is, with $h = L/(N+1)$ and $z = T$,

$$T_j^{(m)} = \left(\frac{1 + \lambda_m h^2/6\kappa + P}{1 + \lambda_m h^2/6\kappa - P}\right)^{j/2} \sin(jm\pi h/L), \tag{2.6-126}$$

and

$$\lambda_m = \frac{6\kappa}{h^2}\;\frac{2 + \cos^2\theta_m - \cos\theta_m \sqrt{9 - P^2(4 - \cos^2\theta_m)}}{4 - \cos^2\theta_m} \quad \text{for } m = 1, 2, \ldots, N, \tag{2.6-127}$$
where $\theta_m = m\pi h/L$ and (recall) $P = uh/2\kappa$. Also, we have changed the 'name' of the eigenvectors, from $V_j$ in (2.6-21) to $T_j^{(m)}$. Actually, an easier alternative to solving the quadratic equation $d(\lambda) + 2\sqrt{l(\lambda)u(\lambda)}\,\cos(m\pi h/L) = 0$ is to simply 'guess' the above eigenvector (from the form of the tri-diagonal matrices) and place it into $Kz = \lambda Mz$, which gives $\lambda$ directly. Anyway, we are done, for linears. This result [for $\lambda_m$, not for $z^{(m)}$] is also presented in Fletcher and Griffiths (1980) and in Mitchell and Griffiths (1979), in the latter of which they also show that all of the eigenvalues are real only for $P \le 3/2$. They also show that all are complex for $P > \sqrt{3}$. For $1.5 < P < \sqrt{3}$, some are real and some are complex. Recalling that the continuous case has only real eigenvalues, this clearly points to a 'problem' in the approximate solution, and one may suspect that the approximation can only be good for 'small $P$.' We shall see later that this is generally a valid suspicion; but we shall also show how a smart mesh with variable $h$ can get good results for very large values of $P$ in most of the domain and small values [$O(1)$] only locally, selectively.

The lumped mass version of the above results is easier to obtain and has been presented in, at least, Hindmarsh et al. (1984):

$$\lambda_m = \frac{2\kappa}{h^2}\left(1 - \sqrt{1 - P^2}\,\cos\theta_m\right) \tag{2.6-128}$$

and

$$T_j^{(m)} = \left(\frac{1 + P}{1 - P}\right)^{j/2} \sin(jm\pi h/L) \tag{2.6-129}$$

for $P \le 1$. For $P = 1$, the degenerate case is $\lambda_m = 2\kappa/h^2$ for all $m$ and the single eigenvector $T^{(m)} = (0, 0, \ldots, 0, 1)^T$. The lumped case has a real solution only for $P < 1$. For $P > 1$, it is complex and can be written in a more convenient form:

$$T_j^{(m)} = (-i)^j \left(\frac{P + 1}{P - 1}\right)^{j/2} \sin(jm\pi h/L), \tag{2.6-130}$$

with a similar form of (2.6-126) for $P > \sqrt{3}$:

$$T_j^{(m)} = (-i)^j \left[\frac{P + (1 + \lambda_m h^2/6\kappa)}{P - (1 + \lambda_m h^2/6\kappa)}\right]^{j/2} \sin(jm\pi h/L). \tag{2.6-131}$$

In both cases, CM and LM, the corresponding eigenvalues have a factor of $i$ in front of the radical, and the quantity under the radical is negated. This corresponds to spurious, damped, temporal oscillations, with the same damping rate ($\tau = h^2/2\kappa$) for all modes, with frequency (LM) $(2\kappa/h^2)\sqrt{P^2 - 1}\,\cos\theta_m$ [highest for mode 1 and lowest (0) for the (stationary) $2\Delta x$ mode]. Also, in both cases, both eigenvalue and eigenvector for the discrete case converge, as they must, to the continuous results for $h \to 0$.

Another relevant property of all of the above four eigenvectors is that they display adjoint orthogonality [cf. (2.6-122)] with respect to the mass matrix; i.e., the adjoint eigenvector is given by $\tilde{T}^{(m)}(P) = T^{(m)}(-P)$, and it follows that $(\tilde{T}^{(m)})^T M\, T^{(n)} = 0$ for $m \ne n$. For $m = n$, the 'normalization' results are

$$[\tilde{T}^{(n)}]^T M_L\, T^{(n)} = (N + 1)/2 \tag{2.6-132}$$
for LM and $3[\tilde{T}^{(n)}]^T M\, T^{(n)}$ = N + 1 + cos — sin • sin ' • cos —:— (2.6-133) N+l N + 1 N + 1 for CM. A final property of these eigenvectors that is worth pointing out is that each higher mode is obtainable from a particular lower mode and the $2\Delta x$ mode; i.e., as for the periodic BC case, we have, for both lumped and consistent mass,

$$T_j^{(N+1-m)} = (-1)^{j+1}\, T_j^{(m)} \quad \text{for all } m. \tag{2.6-134}$$

Enough 'theory'; now for some pictures, beginning with a return to Figure 2.6-47, where we have already noted that both CM and LM simulate well the low modes when $P$ is small. The high modes, even for $P$ 'small,' are not nearly so well simulated, as seen in Figure 2.6-49 for the (nearly) $4\Delta x$ mode (40) and for the nearly $2\Delta x$ mode [the wavelength of mode 79 is $80/79$ times $2\Delta x$]. Worse yet (much worse, in fact) is the case of 'large' $P$, where large here is basically any $P$ giving complex eigenvalues and eigenvectors; i.e., $P > 1$ for LM and $P > \sqrt{3}$ for CM. Thus, whereas the eigenvalues of the discrete case are reasonable approximations to the continuum for small $P$, see Figure 2.6-50(a), they are rather unreasonable for 'large' $P$, as shown in Figures 2.6-50(b) and (c) for $P = 3$, wherein we note the 'similarity' between $N = 79$ and $N = 799$. The continuum eigenvalues are purely real, and the approximate eigenvalues are complex, with the real part being really far from the correct value. And we will not even try to display the complex eigenvectors! No wonder that 'advection-dominated' flows are hard for Dirichlet BC's. It even seems slightly amazing that either approximate solution can ever be close to the exact solution for large $P$. But close it can be, even for $P = 10^6$ (say), as long as the boundary at $x = L$ is not 'seen,' as we shall soon show. The 'success' alluded to is apparently 'just' another manifestation of the miracle of linear combinations. But see too the remarks on non-normal matrix eigenvector expansions at the end of the next section (2.6.2d).
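The closed-form eigenvalues (2.6-127) and (2.6-128) can be spot-checked numerically. The following is a minimal sketch, not the authors' code: it assembles the tri-diagonal matrices of (2.6-125) scaled by $h^2/\kappa$ (so the quantity returned is $\lambda h^2/\kappa$), solves the generalized eigenproblem with NumPy, and measures the mismatch with the formulas. The function names and the small size N = 9 are assumptions chosen for illustration.

```python
import numpy as np

def discrete_eigs(N, P, lumped=False):
    """Eigenvalues lambda*h^2/kappa of K z = lambda M z for (2.6-125)."""
    K = (2.0 * np.eye(N)
         + (-1.0 - P) * np.eye(N, k=-1)   # coefficient of T_{j-1}
         + (-1.0 + P) * np.eye(N, k=1))   # coefficient of T_{j+1}
    if lumped:
        M = np.eye(N)
    else:
        M = (4.0 * np.eye(N) + np.eye(N, k=-1) + np.eye(N, k=1)) / 6.0
    return np.linalg.eigvals(np.linalg.solve(M, K))

def formula_eigs(N, P, lumped=False):
    """Scaled eigenvalues from (2.6-128) (lumped) or (2.6-127) (consistent)."""
    m = np.arange(1, N + 1)
    c = np.cos(m * np.pi / (N + 1)) + 0j  # complex, so the radical may go negative
    if lumped:
        return 2.0 * (1.0 - np.sqrt(1.0 - P**2 + 0j) * c)
    root = np.sqrt(9.0 - P**2 * (4.0 - c**2))
    return 6.0 * (2.0 + c**2 - c * root) / (4.0 - c**2)

def max_mismatch(N, P, lumped=False):
    """Largest distance from a formula eigenvalue to the nearest computed one."""
    num = discrete_eigs(N, P, lumped)
    return max(np.min(np.abs(num - t)) for t in formula_eigs(N, P, lumped))

if __name__ == "__main__":
    for P in (0.3, 3.0):
        for lumped in (False, True):
            print(P, lumped, max_mismatch(9, P, lumped))
```

For P = 0.3 the computed spectrum is real, while for P = 3 most eigenvalues appear as complex-conjugate pairs, in line with the thresholds quoted above.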
Considering quadratic elements, the analysis is much more difficult (penta-diagonal matrices, quartic equations); thus, we just mention that some results are available in Fletcher and Griffiths (1980), and some of their properties are discussed in Mitchell and Griffiths (1979), including lumped mass. They tell us that all eigenvalues are real (GFEM) when $P \le 2\sqrt{31/15} \approx 2.9$; i.e., close to the previous results, from linears.

o Wiggles or not? We now continue our discussion of the time-dependent case, (2.3-1), by returning to an example first presented in Gresho and Lee (1981), again by integrating the ODE's with a very small $\Delta t$ to get an accurate ODE solution. To see how the Dirichlet OBC makes an otherwise easy problem difficult, we set $T_0 = T_L = 0$, $u = 1$, $\kappa = 0.004$, and use for $T_0(x)$ a fairly well-resolved Gaussian, $T_0(x) = e^{-(x - x_0)^2/2\sigma^2}$ with $\sigma = 1.6\Delta x$, $\mathrm{Pe} = u\sigma/\kappa = 10$ ($P = 3.125$), $\Delta x = 0.025$, and 40 linear elements. The problem is easy until the Gaussian 'sees' (or feels) the hard BC, at which time GFEM announces the difficulty by generating non-negligible wiggles. In Figure 2.6-51, we show the solution a short time later ($t = 0.3$) to show that advection of this well-resolved IC really is easy (for CM only, cf. Figures 2.6-24 and 2.6-25), and finally at $t = 0.8$, wherein the $\infty$-span solution
would be halfway out of the domain. The infamous GFEM wiggles are sending out their signal, and even more clearly in Figure 2.6-52, in which Pe has been increased to 1000. The dotted lines show the corresponding $\infty$-span analytical solution (with $x$ scaled by $\sigma$ and $t$ by $\sigma/u$),

$$T(x,t) = \frac{\exp\left[-(x - x_0 - t)^2 / 2(1 + 2t/\mathrm{Pe})\right]}{\sqrt{1 + 2t/\mathrm{Pe}}}, \tag{2.6-135}$$

interpolated to the nodes, which, of course, is not valid when the BC $T_L = 0$ is encountered. But it is a reasonable goal for OBC testing; i.e., the perfect OBC would generate the $\infty$-span solution.

Fig. 2.6-49 Some higher modes with Dirichlet BC's for Pe = 24, N = 79 (P = 0.3). Panels: (a) continuum, mode 40; (b) consistent mass, mode 40; (c) lumped mass, mode 40; (d) continuum, mode 79; (e) consistent mass, mode 79; (f) lumped mass, mode 79.

Remark: It is interesting to note that the solutions obtained via numerical time integration could also be obtained via the eigenvector expansion method, using a linear combination of
Fig. 2.6-50 Continuous and discrete eigenvalues for Dirichlet BC's. Panels: (a) Pe = 24, N = 79 (P = 0.3); (b) Pe = 240, N = 79 (P = 3), Im($\lambda_m$) versus Re($\lambda_m$); (c) Pe = 2400, N = 799 (P = 3), Im($\lambda_m$) versus Re($\lambda_m$).
those modes of, for example, (2.6-126), and it is interesting (again, perhaps) to ponder how the short modes 'get excited' later when they appear to be absent from the IC expansion and in the 'small $t$' solution. Such is the power of linear combinations; i.e., the apparently $2\Delta x$ oscillation is much more than just the $2\Delta x$ mode since its amplitude coefficient must be very small. It is in fact a particular linear combination of many of the short modes, each of which has a very small amplitude coefficient, that conspire (with just the proper phases) to make up the wiggly solution.

Fig. 2.6-51 Gaussian hitting hard OBC; Pe = 10. Curves: GFEM at t = 0.3 and t = 0.8; exact at t = 0.0, 0.3, and 0.8; horizontal axis x from 0 to 1.

Fig. 2.6-52 Gaussian hitting hard OBC; Pe = 1000. Curves: GFEM at t = 0.3 and t = 0.8; exact at t = 0.0, 0.3, and 0.8; horizontal axis x from 0 to 1.

Supposing, however, that we really wanted to solve the 'Dirichlet' problem, we must respond to the wiggle signal, which forces us to realize that there is an OBL of thickness
$O(1/\mathrm{Pe})$ that is not being resolved by our uniform mesh. So we take advantage of one of FEM's greatest virtues and design a new mesh that will, with little or no additional cost, solve the stated problem. We re-meshed by adding five more elements, graded between 0.00244 and 0.00595, from $x = 1$ to $x = 0.98$, and used a uniform mesh of 39 elements to the left of $x = 0.98$ ($\Delta x = 0.02513$). The virtually wiggle-free results are shown in Figure 2.6-53 for Pe = 1000 at the same times as above. Resolving the OBL has solved the main wiggle problem, and reveals the minor wiggles associated with dispersion error; $\Delta x = 0.02513$ is just not quite small enough. So, if we resolve the (silly?) OBL, we can get a good solution; in the next section we shall show how to get a good solution on the 'coarse' mesh by employing a smarter OBC.

Fig. 2.6-53 Gaussian hitting hard resolved OBC; Pe = 1000. Curves: GFEM at t = 0.3 and t = 0.8; exact at t = 0.0, 0.3, and 0.8; horizontal axis x from 0 to 1.

o Diffusion wiggles, minimum time of believability. We will conclude the discussion of the time-dependent case by showing how the mass matrix wiggles that can occur for even pure diffusion ($u = 0$) can and should probably be regarded as a blessing in disguise, a point first brought out in Gresho and Lee (1981). Consider the following common 'sharp' transient, a step change in boundary temperature:

$$\frac{\partial T}{\partial t} = \kappa \frac{\partial^2 T}{\partial x^2} \quad \text{on } 0 < x < 1, \tag{2.6-136}$$

with initial temperature zero and boundary temperature

$$T(0,t) = 1, \quad T(1,t) = 0. \tag{2.6-137}$$

The exact solution is given by

$$T(x,t) = (1 - x) - \frac{2}{\pi} \sum_{n=1}^{\infty} \frac{\sin n\pi x}{n}\, e^{-n^2\pi^2\kappa t}, \tag{2.6-138}$$

and a useful small-time approximation (based on a semi-infinite thickness) is

$$T(x,t) = 1 - \mathrm{erf}\left(x/\sqrt{4\kappa t}\right), \tag{2.6-139}$$
from which the heat flux at $x = 0$ is seen to be

$$q = \kappa\, \partial T/\partial x = -\kappa/\sqrt{\pi\kappa t}, \tag{2.6-140}$$

which, of course, is unbounded for $t \to 0$. This should clearly alert the analyst to 'worry' about the approximate solution's behavior near $x = 0$ for small time. But if the analyst does not know of the asymptotic heat flux approximation, or the 'real' problem is so much more difficult that no analytical solution of any sort is available, s/he can usually rely on the GFEM to signal the danger by making WIGGLES. In the above case the GFEM ODE's, given by $M\dot{T} + KT = f$, give the initial temperature 'acceleration'

$$\dot{T}_0 = M^{-1} f, \tag{2.6-141}$$

where $f$ is a vector of zeros except for the first entry (from the inhomogeneous Dirichlet BC). Now, a property of the mass matrix is that its inverse (which is dense) oscillates in sign (and decreases in magnitude) away from the main diagonal, which is positive, 'so that' $M^{-1}f$ can be a least-squares best fit. The net result, for this case, is that $M^{-1}f$ picks out the first column of $M^{-1}$ to get $\dot{T}_0$. But since the entries of $(M^{-1})_{i1}$ oscillate in sign, so too do the entries in $\dot{T}_0$; thus, whereas the exact $\dot{T}(x,t)$ is always positive, every other node in the GFEM acceleration vector is negative, thus causing one half of the nodal values to start out in a definitely non-physical way. If the mass is lumped (i.e., if we do not employ GFEM), only node 1 has a non-zero value of $\dot{T}_0$ (and it is positive), and there are never negative temperatures. (But of course, the LM results are also wrong for small $t$, a subject we return to at the end of the next chapter: Section 3.19.) So what are the GFEM wiggles telling us?
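The sign pattern of $M^{-1}f$ described above is easy to reproduce. The sketch below is an illustration under stated assumptions rather than the authors' computation: it uses the linear-element consistent mass matrix $(h/6)\,\mathrm{tridiag}(1, 4, 1)$ on a uniform mesh of interior nodes, and it takes the single non-zero forcing entry to be $\kappa/h$ (an assumed normalization; only its sign and position matter here). The function name is hypothetical.

```python
import numpy as np

def initial_rate(N, h=1.0, kappa=1.0, lumped=False):
    """dT/dt at t = 0 at interior nodes 1..N for the step-change problem."""
    if lumped:
        M = h * np.eye(N)
    else:
        M = h * (4.0 * np.eye(N) + np.eye(N, k=-1) + np.eye(N, k=1)) / 6.0
    f = np.zeros(N)
    f[0] = kappa / h      # assumed forcing: only the node next to x = 0 is driven
    return np.linalg.solve(M, f)

if __name__ == "__main__":
    print(initial_rate(6))               # consistent mass: alternating signs
    print(initial_rate(6, lumped=True))  # lumped mass: only node 1 is non-zero
```

With the consistent mass matrix the entries alternate in sign and shrink in magnitude away from the heated boundary; with the lumped matrix only the first entry is non-zero, and it is positive.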
The wiggles are alerting us to the fact (perhaps not appreciated a priori and certainly not recognized by the lumped mass approximation) that we have defined a very difficult problem and that the approximate solution at small time [at least while any nodal values $T_j(t)$ are negative], especially near $x = 0$, is not reliable. This is the self-diagnosis capability referred to earlier. Once so alerted though, what is the analyst to do? Here are two responses, each of which is better than mass lumping and living in its associated and erroneous dream world: (i) do not believe the solution for times shorter than that at which the last negative temperature (which will be closest to the step change) passes through zero from below, called the 'minimum time of believability' in Gresho and Lee (1981); (ii) equate the $t = 0$ heat flux from the discrete solution, which can be approximated by $\kappa\,\partial T/\partial x = -\kappa/h$, where $h$ is the first element length, to the analytic flux at small time given by (2.6-140), which yields a simple estimate of the minimum time of believability (since $-\kappa/h$ is an upper bound for the discrete problem). This yields

$$t_c = h^2/\pi\kappa, \tag{2.6-142}$$

which is also close to the element time constant; see Gresho and Lee (1981), in which is also shown the exact solution to the finite element equations. Related to (2.6-142), we present below the maximum eigenvalues (minimum time constants) for (2.6-136) when 'solved' via linear (L) or quadratic (Q) FEM:

$$\tau^L_{CM} \approx h^2/12\kappa, \quad \tau^Q_{CM} \approx h^2/15\kappa \tag{2.6-143}$$

and their lumped mass counterparts:

$$\tau^L_{LM} \approx h^2/4\kappa, \quad \tau^Q_{LM} \approx h^2/6\kappa, \tag{2.6-144}$$
where $h$ is the smallest nodal separation in the mesh. All of these are useful to know and are derived by studying the eigenproblem for a single element; see Gresho and Lee (1981) and Hughes et al. (1979). Another thing the GFEM solution would show you is the desirability of employing a graded mesh (small elements near $x = 0$) because a uniform mesh will cause larger wiggles near $x = 0$ than elsewhere. Finally, even a very well-designed graded mesh will have its own (smaller!) minimum time of believability, and GFEM will also announce that fact, in the usual way.

Our last time-dependent 'example' is borrowed, but appears to be useful enough to pass on to others. In Vichnevetsky (1985) an inlet boundary condition was proposed that lets any spurious upwind-moving noise (wiggles) leave gracefully, rather than being fully reflected (and 'aliased' to long waves) as does the simple Dirichlet BC, $T(0,t) = T_0(t)$. He addresses only linear elements, but shows for them that a boundary condition that couples the inlet node, say $T_0$, to the first interior node, as follows:

$$\tilde{T}_0(t) = 2T_0(t) - T_1(t), \tag{2.6-145}$$

which looks like $T_0(t)$ applied at the middle of the first element, successfully removes upstream-moving wiggles.

o Steady state. Now we turn to the steady-state case. We will introduce the (over-publicized) 'tough' problem alluded to above and solve it three ways: (i) uniform mesh GFEM, (ii) smart upwind methods, and (iii) GFEMIA. But in between the first two we will summarize the error analysis that applies to (i), because (iii) will obviate it. The steady-state solution of (2.3-1), (2.6-114), and (2.6-115) is (with $\mathrm{Pe} = uL/2\kappa$)

$$\frac{T(x) - T_0}{T_L - T_0} = \frac{e^{2\mathrm{Pe}\,x/L} - 1}{e^{2\mathrm{Pe}} - 1}, \tag{2.6-146}$$

while that of its GFEM approximation with $N + 1$ linear elements,

$$\mathrm{Pe}\,(T_{j+1} - T_{j-1}) = \frac{T_{j+1} - T_j}{h_{j+1}/L} + \frac{T_{j-1} - T_j}{h_j/L}, \quad j = 1, 2, \ldots, N, \tag{2.6-147}$$

on a uniform mesh [$h_j = h = L/(N+1)$] is (use $T_j = a + b\xi^j$)

$$\frac{T_j - T_0}{T_L - T_0} = \frac{\left(\dfrac{1+P}{1-P}\right)^j - 1}{\left(\dfrac{1+P}{1-P}\right)^{N+1} - 1}, \quad j = 0, 1, \ldots, N + 1. \tag{2.6-148}$$
The approximation of (2.6-148) to (2.6-146) can be shown to be: excellent for $P \ll 1$, 'reasonable' for $P = O(1)$ but not $> 1$, oscillatory for $P > 1$, very bad/unreasonable/wiggly for $P \gg 1$, and sometimes unbounded (!) for $\mathrm{Pe} \to \infty$. For example, in Figure 2.6-54 we show, for $T_0 = 1$ and $T_L = 0$, the exact solution and the approximate solution for $N = 7$ for Pe = 4 ($P = 1/2$) and Pe = 16 ($P = 2$). Worse yet is large $P$: Figure 2.6-55 shows the results for Pe = 160 on two (coarse) grids: $N = 6$ ($P = 22\tfrac{6}{7}$) and $N = 7$ ($P = 20$). The reason for two values of $N$ is to display the disjoint solutions for odd and even $j$ when $N$ is odd, wherein the odd nodes actually become unbounded for fixed $N$ ($T_j \sim \mathrm{Pe}/N^2 \sim P/N$ for $j$ odd and $P \gg N$).
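The behavior just described can be reproduced with a few lines. This is a minimal sketch under stated assumptions, not the authors' code: it solves the uniform-mesh form of (2.6-147) directly as a tri-diagonal system, evaluates the closed form (2.6-148), and evaluates the exact solution (2.6-146). The function names are hypothetical, and the closed form assumes $P \ne 1$.

```python
import numpy as np

def gfem_uniform(N, Pe, T0=1.0, TL=0.0):
    """Solve (2.6-147) on a uniform mesh of N+1 linear elements."""
    P = Pe / (N + 1)                       # grid Peclet number, P = u h / 2 kappa
    A = (2.0 * np.eye(N)
         + (-1.0 - P) * np.eye(N, k=-1)    # coefficient of T_{j-1}
         + (-1.0 + P) * np.eye(N, k=1))    # coefficient of T_{j+1}
    b = np.zeros(N)
    b[0] += (1.0 + P) * T0                 # known inlet value moved to the RHS
    b[-1] += (1.0 - P) * TL                # known outlet value moved to the RHS
    return np.concatenate(([T0], np.linalg.solve(A, b), [TL]))

def closed_form(N, Pe, T0=1.0, TL=0.0):
    """Nodal values from (2.6-148); assumes P != 1."""
    P = Pe / (N + 1)
    xi = (1.0 + P) / (1.0 - P)
    j = np.arange(N + 2)
    return T0 + (TL - T0) * (xi**j - 1.0) / (xi**(N + 1) - 1.0)

def exact(x_over_L, Pe, T0=1.0, TL=0.0):
    """Exact steady solution (2.6-146)."""
    return T0 + (TL - T0) * np.expm1(2.0 * Pe * x_over_L) / np.expm1(2.0 * Pe)

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 9)
    for Pe in (4.0, 16.0):
        print(Pe, gfem_uniform(7, Pe), exact(x, Pe))
```

For N = 7 the Pe = 4 solution decreases monotonically from inlet to outlet, whereas for Pe = 16 the node next to the outlet overshoots the inlet value (about 1.33 against an inlet temperature of 1).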
Fig. 2.6-54 Exact and GFEM (N = 7) solutions for two Peclet numbers. Curves: P = 1/2 (Pe = 4) and P = 2 (Pe = 16); vertical axis T(x).

Fig. 2.6-55 Exact and two GFEM solutions for Pe = 160.

In any event, it is clear that GFEM gives very poor accuracy for $P \gg 1$. But it also gives a very clear wiggle signal. And we shall respond to these two aspects in two different ways: first, by verifying/ascertaining analytically (via error analysis) that GFEM can be very 'bad,' and later, by heeding the wiggle signal, to show that GFEMIA can be very 'good'. To this end, then, we present an error analysis of the GFEM approximation, largely following T. Hughes (personal communication). Defining the error as $e = T^h - T$, where $T^h = \sum_j T_j \varphi_j(x)$ and $T$ is the exact solution, a first basic step in finite element error analysis is to decompose the error into two parts, as follows,
and analyze each part separately:

$$e = (T^h - \tilde{T}^h) + (\tilde{T}^h - T) = e^h + \eta, \tag{2.6-149}$$

where $\tilde{T}^h$ is the finite element interpolant of $T$, $e^h$ is the part of the error contained in the finite element space, and $\eta$ (the interpolation error) is that part of the error not contained in the finite element space (because $T$ is not so contained). The plan then is to independently bound these two errors, then combine the results via the triangle inequality. To set the stage for $e^h$, we first restate the weak form of the problem as

$$B(T^h, w^h) = \int_0^L \left(u\,T^h_x\,w^h + \kappa\,T^h_x\,w^h_x\right) dx = 0 \quad \text{for every } w^h, \tag{2.6-150}$$

where, for convenience, we are assuming that every trial solution satisfies the inhomogeneous Dirichlet BC's. Next, we note that the true solution also satisfies (2.6-150) to give $B(e, w^h) = 0$; the projection of the error onto the finite-dimensional subspace vanishes. We also need the following 'stability' result:

$$B(w^h, w^h) = \int_0^L \left(u\,w^h_x\,w^h + \kappa\,(w^h_x)^2\right) dx = \kappa\,\|w^h_x\|_0^2, \tag{2.6-151}$$

since $w^h = 0$ on $\Gamma$. We will use this result to obtain an estimate for $e^h$, via

$$\kappa\,\|e^h_x\|_0^2 = B(e^h, e^h) = B(e^h, e - \eta) = B(e^h, e) - B(e^h, \eta) = -B(e^h, \eta) = |B(e^h, \eta)| = \left|\int_0^L \left(u\,e^h_x\,\eta + \kappa\,e^h_x\,\eta_x\right) dx\right|, \tag{2.6-152}$$

which, via the triangle inequality, yields $\kappa\,\|e^h_x\|_0^2 \le u\,|(e^h_x, \eta)| + \kappa\,|(e^h_x, \eta_x)|$, which, via the Cauchy-Schwarz inequality, yields $\kappa\,\|e^h_x\|_0^2 \le u\,(\|e^h_x\|_0 \cdot \|\eta\|_0) + \kappa\,(\|e^h_x\|_0 \cdot \|\eta_x\|_0)$, which, via an application of Young's inequality,

$$xy \le \frac{1}{2}\left(a x^2 + \frac{1}{a}\,y^2\right) \quad \text{for all } a > 0, \tag{2.6-153}$$

[obtainable from $(x\sqrt{a} - y/\sqrt{a})^2 \ge 0$], yields

$$\kappa\,\|e^h_x\|_0^2 \le \frac{u}{2}\left(a_1 \|e^h_x\|_0^2 + \frac{1}{a_1}\|\eta\|_0^2\right) + \frac{\kappa}{2}\left(a_2 \|e^h_x\|_0^2 + \frac{1}{a_2}\|\eta_x\|_0^2\right) \tag{2.6-154}$$

for arbitrary $a_1$ and $a_2$. We now apply a common trick in such error analyses: pick $a_1$ and $a_2$ such that when the $\|e^h_x\|_0^2$ terms on the RHS of (2.6-154) are transferred to the LHS,
the final coefficient is positive; i.e., we want $\kappa - \tfrac{1}{2}(a_1 u + a_2 \kappa) = c > 0$. A good way to do this is to set $c = \kappa/2$, which implies $a_1 u + a_2 \kappa = \kappa$, which leads to the following logical choice: $a_1 = \kappa/2u$ and $a_2 = 1/2$. Then (2.6-154) rearranges to

$$\|e^h_x\|_0^2 \le 2\left(\frac{u^2}{\kappa^2}\,\|\eta\|_0^2 + \|\eta_x\|_0^2\right). \tag{2.6-155}$$

To make further progress, we recall two facts: $\|\varphi\|_1^2 = \int (\varphi^2 + \varphi_x^2) = \|\varphi\|_0^2 + |\varphi|_1^2$, and that the semi-norm, $|\varphi|_1$, qualifies as a norm because the only constant function allowed is $\varphi = 0$ because $\varphi = 0$ on $\Gamma$. Thus, (2.6-155) can be rewritten as

$$|e^h|_1^2 \le 2\left(\frac{u^2}{\kappa^2}\,\|\eta\|_0^2 + |\eta|_1^2\right), \tag{2.6-156}$$

and we have a valid (norm) estimate of $e^h$ in terms of the interpolation error. To finish, we apply the triangle inequality to (2.6-149) in the $H^1$ semi-norm, $|e|_1 \le |e^h|_1 + |\eta|_1$, square it, $|e|_1^2 \le |e^h|_1^2 + 2|e^h|_1 \cdot |\eta|_1 + |\eta|_1^2$, and apply Young's inequality with $a = 1$ to the middle term, $|e^h|_1 \cdot |\eta|_1 \le \tfrac{1}{2}(|e^h|_1^2 + |\eta|_1^2)$, to give, finally,

$$|e|_1^2 \le 2\left(|e^h|_1^2 + |\eta|_1^2\right) \le 2\left[2\left(\frac{u^2}{\kappa^2}\,\|\eta\|_0^2 + |\eta|_1^2\right) + |\eta|_1^2\right] = 2\left(\frac{2u^2}{\kappa^2}\,\|\eta\|_0^2 + 3|\eta|_1^2\right) \le 2\max\left(\frac{2u^2}{\kappa^2},\, 3\right)\left(\|\eta\|_0^2 + |\eta|_1^2\right) = 2\max\left(\frac{8\mathrm{Pe}^2}{L^2},\, 3\right)\|\eta\|_1^2. \tag{2.6-157}$$

To finish, all we need is the $H^1$-estimate of the interpolation error, a standard result (e.g., Strang and Fix, 1973, or p. 190 of Wait and Mitchell, 1985):

$$\|\eta\|_1 \le c\,h^k\,\|T\|_{k+1}, \tag{2.6-158}$$

where $k$ is the polynomial degree ($k = 1$ for linear elements). Thus, finally,

$$|e|_1 \le \max\left(\frac{4\mathrm{Pe}}{L},\, \sqrt{6}\right) c\,h^k\,\|T\|_{k+1} \tag{2.6-159}$$

bounds the GFEM error in terms of the exact solution, and we zoom right in on the GFEM's major alleged defect: for $\mathrm{Pe} > L\sqrt{6}/4$, the error increases linearly with Pe (for
fixed $h$), so that large Pe simulations may show large errors. (Indeed, this is quite the case for the simple example presented above, and perhaps also for the time-dependent case; cf. Figures 2.6-51 and 2.6-52.) For this case, it follows easily that to have $|e|_1$ small, it is necessary (at least) to keep $\mathrm{Pe}\,h^k$ small, giving $P$ 'small' for linear elements, $hP$ 'small' for quadratics, etc. It thus follows, from this 'worst case' error analysis, that the only way linear elements could be accurate is if $P < O(1)$, and this is true, per the above example, if a uniform mesh is employed. For example, if $\mathrm{Pe} = 10^4$, we would need $10^4$ elements to get $P = 1$. And, as it turns out, the result is 'only if,' in a sense; i.e., it turns out that the worst thing you can do for this hard OBC/Dirichlet case is to use a uniform mesh; it maximizes the error! To make our main point of this section in the clearest possible way, we state now what we shall prove later: the large Pe case can always be solved quite accurately, in spite of the dreary estimate of (2.6-159), by the intelligent use of just two elements (one degree of freedom)! Details later.

o Smart upwinding. Before returning to GFEMIA, we believe a short digression is in order to discuss what some believe is the 'proper' solution to GFEM's hard OBC/wiggles dilemma: smart upwinding. To review the history of smart upwinding prior to about 1980, see Gresho and Lee (1981) wherein it was stated that 'Further ... literature review would undoubtedly continue to reveal more and more re-discoveries of this highly touted scheme.' Here are two more, unknown to us then and brought to our attention via Segal (1982) in a paper contemporaneous and complementary [and (thus) well worth reading!]
to our 'wiggle paper': (i) Il'in (1969) and (ii) Chien (1977); both are in a 'disguised' (and less compact) form, as is Segal's, representing the effective diffusivity as $\kappa P\,[1 - 2(e^{2P} - 1)/(e^{4P} - 1)]^{-1}$ rather than the simpler equivalent form, $\kappa P \coth P$, which we present below.

First, we define the smart upwind method: it is a method that obtains the exact solution, (2.6-146), at the nodes (in 1D only). And it 'looks like'/becomes simple upwinding when the grid Peclet number is large. Next, we present our version of it, on a variable grid yet (sought by Segal, 1982), beginning by rewriting (2.6-147) as

$$T_{j+1} - T_{j-1} = \frac{T_{j+1} - T_j}{P_{j+1}} + \frac{T_{j-1} - T_j}{P_j}, \tag{2.6-160}$$

where $P_j = u h_j/2\kappa$, and we remark/note that

$$\sum_{j=1}^{N+1} P_j = \mathrm{Pe}. \tag{2.6-161}$$

Replacing $1/P_j$ by $\coth P_j$ [which functions agree to $O(P_j)$ for $P_j \to 0$] gives

$$T_{j+1} - T_{j-1} = (\coth P_{j+1})(T_{j+1} - T_j) + (\coth P_j)(T_{j-1} - T_j), \tag{2.6-162}$$

the solution of which is exactly (2.6-146) evaluated at the nodes. For a uniform mesh, an alternate representation of (2.6-162) is

$$u\,(T_{j+1} - T_{j-1})/2h = \kappa\,(P \coth P)(T_{j+1} - 2T_j + T_{j-1})/h^2, \tag{2.6-163}$$

thus revealing the effective diffusivity (when viewed as a centered scheme),

$$\kappa_{\mathrm{eff}} = \kappa P \coth P, \tag{2.6-164}$$
giving (for $P \to 0$) $\kappa_{\mathrm{eff}} = \kappa\,[1 + P^2/3 + O(P^4)]$. This is the form that has been derived many times and in many different ways: finite differences, finite elements with Petrov-Galerkin weighting, etc. For small $P$, $\kappa_{\mathrm{eff}} \approx \kappa$, and the central differencing scheme is recovered; but for large $P$, $\kappa_{\mathrm{eff}} \approx \kappa P = uh/2$, which recovers 'pure' upwinding for $P \to \infty$ (and yields $T_j = T_{j-1}$). But it is for intermediate $P$ that this scheme gained its fame; not many approximation schemes are nodally exact. The problem is: it does not generalize; not to the time-dependent case, not to situations with internal sources or sinks, and not to multi-dimensions. Thus, we believe it to be cute, interesting, but not really viable. Also, some observations from Segal's (1982) study (of it and other methods) are worth mentioning; Segal refers to smart upwinding as 'Il'in's scheme':

1. 'Hence we may conclude that the numerical oscillations... are caused by the presence of the normal boundary layer and the fact that Dirichlet boundary conditions are given at the outflow boundary.' (p. 332)
2. 'These examples show that when a normal boundary layer is present, a central difference scheme with mesh refinement is preferable to upwind differencing. Only in the case that the outer solution is constant does the Il'in scheme appear to be very accurate.' (p. 334)
3. 'Our final conclusion is that for the general (2D) case, where boundary layers are present, the "outer solution" is not a constant, a Neumann condition is given at the outflow (normal) boundary and mesh refinement is used in the horizontal (parallel) boundary, central differencing is much more accurate than upwind differencing...' (p. 340)

Another paper on this subject that is worth reading is Smith (1980), in which a number of exact solutions to the discrete equations, via FDM, linear FEM, and even quadratic FEM, are derived and discussed.
The conclusions reached by Smith are in complete agreement with Segal's, and with our own.

o Try GFEMIA! So much for smart (and other) upwinding. We now return to GFEMIA and attempt to drive another nail into the upwinder's coffin, which is probably just wishful thinking since this religion will probably never die. And this brings us to an interesting paper by Veldman and Rinzema (1992) that is worth recognizing before we present our version of it, because they too look at crude-but-smart solutions employing but one internal grid point (which solutions we will generalize and improve) and compared GFEM's version of variable grids, (2.6-160), with a common, second-order-accurate (Taylor series) variable grid used by finite differencers. These are called, respectively, 'Method A' and 'Method B' in their paper. Some quotations:

1. 'Method A is only of first order, whereas Method B is of second order. Nevertheless, Method A gives good results for all three grids, whereas Method B is not able to produce an acceptable solution at all. Thus, the local truncation error does not give a reliable indication about the behavior of the global discretization error' (p. 124)
2. 'Method A has often been mentioned, but each time it was rejected because of its LTE. The present experiments show that this rejection has been premature; Method A is much more powerful than generally assumed' (p. 130)

So let us return to 'Method A' (GFEM) and consider the intelligent placement of nodes for the hard (Dirichlet) OBC problem, (2.6-160) with $T_0 = 1$ and $T_L = 0$, as
before. Let us start with a two-element mesh ($N = 1$) for which the single nodal equation, from (2.6-160), gives

$$T_1 = \frac{1/P_1 + 1}{1/P_1 + 1/P_2}, \tag{2.6-165}$$

where $P_1 = u h_1/2\kappa$, $P_2 = u h_2/2\kappa$, and $P_1 + P_2 = \mathrm{Pe}$, à la (2.6-161); i.e., $h_1 + h_2 = L$. We take Pe large and see how $T_1$ varies as we 'move' the single node. It is not hard to find that $T_1(P_1)$ peaks at $P_1 = \tfrac{1}{2}(\mathrm{Pe} - 1) \approx \mathrm{Pe}/2$ with value $T_1 = (\mathrm{Pe} + 1)^2/4\mathrm{Pe} \approx \mathrm{Pe}/4$, which is consistent with (2.6-159) and corresponds to a uniform mesh; $h_1 = h_2 = L/2$. Since $T(L/2) \approx 1$ for $\mathrm{Pe} \gg 1$, we see that the largest error occurs on a uniform mesh, at least for this simple case. (But the result does generalize, when all $P_j \gg 1$.) On the other hand, if we place the node such that $P_2 = 1$, we get a much better result (and one which also generalizes); $T_1 = 1$ (the inlet temperature). While not perfect, the result is quite good, considering, for $\mathrm{Pe} \gg 1$. That is, $P_2 = 1 \Rightarrow h_2/L = 1/\mathrm{Pe}$, and the exact solution at $x/L = 1 - h_2/L = 1 - 1/\mathrm{Pe}$ is, from (2.6-146), $T(x/L) = (1 - e^{-2})/(1 - e^{-2\mathrm{Pe}}) \approx 0.865$ for $\mathrm{Pe} \gg 1$ (say $\mathrm{Pe} > 10$). Placing the node at $\delta/L = 1/\mathrm{Pe}$ from the hard OBC turns out to be a smart thing to do (GFEMIA) for any number of nodes in the mesh. In fact, it is easy to show from (2.6-160) that, for any $N$, setting $P_{N+1} = 1$ will always give $T_j = 1$, $j = 1, 2, \ldots, N$, no matter where the other nodes are placed and no matter how many of them are used! No wiggles, reasonable solution, and no more need for the pessimistic, uniform grid error analysis (or, more commonly, and less useful yet, error analysis in which '$h$' is considered to be the size of the largest element). Figure 2.6-56 shows $T_1$ vs $h_1/L$ and the resulting two-element solution, $T(x)$, for Pe = 50 and three values of $h_1/L$: 0.2, 0.5, and 0.8. Clearly the uniform mesh has the largest error; also clear is that $h_1/L = 1 - 1/\mathrm{Pe} = 0.98$ is an excellent choice.
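The node-placement argument above can be tried out directly. The sketch below is an illustration under stated assumptions, not the authors' program: it assembles (2.6-160) for an arbitrary set of node positions and solves for the interior nodal values, so that the two-element result (2.6-165) and the $P_{N+1} = 1$ property can both be checked. The function name and the sample meshes are hypothetical.

```python
import numpy as np

def gfem_variable(x_over_L, Pe, T0=1.0, TL=0.0):
    """Solve (2.6-160) for the nodes of an arbitrary 1D mesh.

    x_over_L: increasing node positions from 0 to 1, both ends included.
    """
    x = np.asarray(x_over_L, dtype=float)
    Pj = Pe * np.diff(x)                   # P_j = u h_j / 2 kappa; they sum to Pe
    N = len(x) - 2                         # number of interior nodes
    A = np.zeros((N, N))
    b = np.zeros(N)
    for j in range(1, N + 1):              # node j lies between elements j and j+1
        left, right = 1.0 / Pj[j - 1], 1.0 / Pj[j]
        # (2.6-160) rearranged: coefficients of T_{j-1}, T_j, T_{j+1}
        coef = {j - 1: -1.0 - left, j: left + right, j + 1: 1.0 - right}
        for k, c in coef.items():
            if k == 0:
                b[j - 1] -= c * T0         # inlet Dirichlet value
            elif k == N + 1:
                b[j - 1] -= c * TL         # outlet Dirichlet value
            else:
                A[j - 1, k - 1] = c
    return np.concatenate(([T0], np.linalg.solve(A, b), [TL]))

if __name__ == "__main__":
    Pe = 50.0
    for s in (0.2, 0.5, 0.8, 0.98):        # h_1/L for the two-element mesh
        print(s, gfem_variable([0.0, s, 1.0], Pe)[1])
    print(gfem_variable([0.0, 0.1, 0.37, 0.6, 0.98, 1.0], Pe))
```

For Pe = 50 the uniform two-element mesh gives $T_1 = 13$, far from the exact value of about 1, while any mesh whose last element has $P_{N+1} = 1$ returns $T_j = 1$ at every interior node.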
Remarks:

(1) The above property was called 'disconnected' by Griffiths and Lorenz (1978) in their study of Petrov-Galerkin methods, because $T_j = T_0$ (the inlet value) for all $j$ is independent of the specified value of $T_L$.

(2) If the 'second-order-accurate,' variable-grid finite difference scheme is employed, then (2.6-160) changes to

$$\frac{1 - P_j}{P_{j+1}}\,(T_{j+1} - T_j) = \frac{1 + P_{j+1}}{P_j}\,(T_j - T_{j-1}), \tag{2.6-166}$$

which has the (peculiar) property that $P_N = 1$ (rather than $P_{N+1} = 1$) gives $T_j = 1$ for all $j$. Thus, $P_{N+1}$ could be 'large' and the boundary layer totally missed, with still $T_j = 1$ for all $j$. For a similar comparison of centered and upwind FDM's, with a similar conclusion as our own, see Ferziger and Peric (1996).

(3) Whereas a uniform mesh would require on the order of Pe node points to get a good result, a smart mesh could do nearly as well with only one or two nodes—a considerable saving for large Pe; e.g., $10^4$.

(4) The high cost (large error constant) of a uniform finite-difference grid for this problem is also clearly shown in Table IV of Roache and Knupp (1993).

Our bottom line(s) for the Dirichlet problem in 1D are the following:

1. Do not bother with smart upwind methods because they do not generalize.
2. Never use a uniform mesh.

3. Do not place the first node in from the outflow farther than $\delta/L = 1/\mathrm{Pe}$ from the exit. If you do, GFEM will wiggle.

4. General advice (which later needs to be—and will be—extended to multi-dimensions): place a node at $\delta/L = 1/\mathrm{Pe}$ away from the 'hard' boundary, place another halfway between it and the boundary, and, unless there is a source term in the steady, 1D, advection-diffusion equation, you need no more nodes; and no supercomputer, no workstation, no desktop computer, no calculator, no slide rule—just pencil and paper. GFEMIA can really work.

5. For a better-yet mesh design, in both 1D and multi-dimensions, see Hegarty et al. (1995) and the new book by Miller et al. (1996).

Fig. 2.6-56 Locus of nodal temperature and $T(x)$ for Pe = 50; exact solution shown dashed. (Abscissa: $h_1/L$ and $x/L$.)

Somewhat along these lines is a sample 'pure advection with source term' test problem, originated by Leonard (1979), in which Brooks and Hughes (1980, 1982) present what we believe is a somewhat incomplete picture—which we complete below. Specifically, they neglected to mention GFEM's performance on the problem, perhaps leaving the reader with the impression that it is not worth considering, because it is so 'wiggle-prone.' In Figure 2.6-57, we correct this impression by showing the GFEM solution (it is 1/24th larger than the exact solution at the four nodes marked 'X'; it is exact at the others) as well as those shown by Brooks and Hughes for the problem $u\,dT/dx = S(x)$ on $0 < x < 15$ with $T(0) = 0$, where $S(x) = 1 - x/4$ for $0 \le x \le 6$, $S(x) = -2 + x/4$ for $6 \le x \le 8$ and $S(x) = 0$ for $x > 8$; here, $u = \Delta x = 1$.
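The GFEM result for this source-term problem can be reproduced with a few lines of linear algebra. The sketch below rests on assumptions the text does not spell out: linear elements with $\Delta x = 1$, a consistent load vector integrated exactly (possible because $S$ is piecewise linear with its kinks at nodes), and no boundary condition at $x = 15$ beyond the equation GFEM itself generates there. The names `source`, `exact`, and `gfem_advection` are hypothetical.

```python
import numpy as np

def source(x):
    """Piecewise-linear source S(x) of the test problem."""
    x = np.asarray(x, dtype=float)
    return np.where(x <= 6, 1 - x / 4, np.where(x <= 8, -2 + x / 4, 0.0))

def exact(x):
    """Exact solution T(x) = integral of S from 0 to x (u = 1, T(0) = 0)."""
    x = np.asarray(x, dtype=float)
    return np.where(x <= 6, x - x**2 / 8,
                    np.where(x <= 8, 1.5 - 2 * (x - 6) + (x**2 - 36) / 8, 1.0))

def gfem_advection(n_el=15, u=1.0, h=1.0):
    """Galerkin, linear elements, for u dT/dx = S with T(0) = 0 (sketch)."""
    x = h * np.arange(n_el + 1)
    s = source(x)
    A = np.zeros((n_el, n_el))            # unknowns T_1 ... T_N
    b = np.zeros(n_el)
    for j in range(1, n_el + 1):
        r = j - 1
        if j < n_el:                      # interior node: u (T_{j+1} - T_{j-1}) / 2
            A[r, r + 1] = u / 2
            if j > 1:
                A[r, r - 1] = -u / 2      # T_0 = 0 contributes nothing for j = 1
            b[r] = h * (s[j - 1] + 4 * s[j] + s[j + 1]) / 6
        else:                             # outflow node: u (T_N - T_{N-1}) / 2
            A[r, r] = u / 2
            A[r, r - 1] = -u / 2
            b[r] = h * (s[j - 1] + 2 * s[j]) / 6
    T = np.concatenate(([0.0], np.linalg.solve(A, b)))
    return x, T

x, T = gfem_advection()
for xj, Tj, Te in zip(x, T, exact(x)):
    print(f"x = {xj:4.1f}   GFEM = {Tj:8.5f}   exact = {Te:8.5f}   error = {Tj - Te:+.5f}")
```

Under these assumptions the nodal error is 1/24 in magnitude at four nodes and zero at all the others, with no wiggles downstream of the source.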
Fig. 2.6-57 1D steady advection with a source term. (Legend: line, exact (nodal interpolate); squares, upwind/classical; circles, upwind/Petrov-Galerkin; X, GFEM.)

d. Advection-diffusion with Dirichlet/Neumann BC's

This case applies when the Dirichlet BC at the outlet, $x = L$, (2.6-115), is replaced by a Neumann BC, sometimes called a 'soft' BC,

$$\kappa\,\partial T/\partial x = q \quad \text{at } x = L. \tag{2.6-167}$$

(In the next section we will consider the Neumann BC at both ends.) Except for pure heat conduction problems (see, for example, Reddy and Gartling, 1994), there is really only one important case worth detailing here—the use of the homogeneous ($q = 0$) Neumann BC as an OBC. This simple change can go a long way toward relieving/precluding wiggle problems near $x = L$. Reason? Dirichlet imposes a large gradient upon the solution, and Neumann imposes a small one—zero. Large gradients are wiggle makers—unless they are 'smashed' by upwinding.

But let us start by summarizing the associated eigenproblem results, which will suggest the alleviation of difficulties relative to the previous case—usually. Solving (2.6-116) with the BC $\Phi = 0$ at $x = L$ replaced by $\Phi' = 0$ yields

$$\lambda_n = (\gamma_n^2 + \mathrm{Pe}^2)\,\kappa/L^2, \tag{2.6-168}$$

where $\gamma_n$ are the roots of the transcendental equation

$$\mathrm{Pe}\tan\gamma + \gamma = 0, \tag{2.6-169}$$

and

$$\Phi_n(x) = e^{\mathrm{Pe}\,x/L}\sin(\gamma_n x/L), \tag{2.6-170}$$

where $\gamma_n$ lies between $(n - \frac{1}{2})\pi$ and $n\pi$. Again, the advection effect seems to want to cause 'problems' near $x = L$—although not nearly as badly as before because $\sin\gamma_n \ne 0$; in fact, the eigenfunction attains an extremum at $x = L$ rather than crashing back down to zero there, as in the Dirichlet OBC situation. This key difference helps to explain its relative success as an OBC; note though that for $\mathrm{Pe} \gg 1$, $\gamma_n \sim n\pi[1 + O(1/\mathrm{Pe})]$, from which, for low modes at least, (2.6-170) tends to be a return to the Dirichlet case—a point
to bear in mind when we see how an FDM approximation behaves. Figure 2.6-58 shows the first several modes for Pe = 10—for which $\gamma_1 = 2.86$, $\gamma_2 = 5.76$, and $\gamma_3 = 8.71$.

Fig. 2.6-58 Dirichlet-Neumann eigenfunctions for Pe = 10. (Curves for n = 1, 2, 3 plotted against x on 0 to 1.)

The discrete case is interesting, and paradoxical, in at least two ways (via linear elements and, for simplicity, lumped mass): (i) a common FDM version of the Neumann BC ('image point') yields to analysis but gives bad results, and (ii) the GFEM version of the same BC (via an NBC) does not yield to analysis but gives good results—where here bad/good means big/small wiggles when an otherwise well-simulated waveform tries to leave the grid. We shall briefly summarize this awkward situation, and then appeal to the recent revelations of L.N. ('Nick') Trefethen and colleagues to help 'excuse' our incompetence.

The image point method of approximating $\partial T/\partial x = 0$ has already been introduced, in Section 2.4.1 on OBC's; namely, from (2.4-28) with $H = S = q = f_N = 0$, we have (for a uniform mesh)

$$l\,\dot T_N + 2\kappa\,(T_N - T_{N-1})/l = 0 \tag{2.6-171}$$

as the ODE/OBC for the last node, in which the absence of any advection effect should be noted. Recall that the 'image point,' $T_{N+1}$, was eliminated by approximating $\partial T/\partial x = 0$ by $T_{N+1} = T_{N-1}$. The FEM version of this OBC is also a special case of the $N$-th node's equation of Section 2.4.1; i.e., set $S = H = q = 0$ in (2.4-31) to give, for a uniform mesh,

$$l\,\dot T_N + u\,(T_N - T_{N-1}) + 2\kappa\,(T_N - T_{N-1})/l = 0, \tag{2.6-172}$$

a seemingly small perturbation from (2.6-171)—since both yield $\partial T/\partial x \to 0$ for $l \to 0$. But the fact is that this 'perturbation' is almost unbelievably powerful in its effect: even though all other equations in the two problems are identical, with only the term $u(T_N - T_{N-1})$ in the last equation being different, the difference in the 'response' of the two sets of equations, for large Pe, is profound.
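The contrast between (2.6-171) and (2.6-172) can be tried out directly. The following is a rough sketch only: it assumes lumped-mass linear elements (centered differences) in the interior, $T = 0$ at the inlet, a Gaussian IC of width $\sigma = 5l$ centered at $x_0 = 0.5$ on a 50-element mesh with $u = 1$ and $P = ul/2\kappa = 100$, and a classical fourth-order Runge-Kutta integrator with a small time step. The starting position, the integrator, and the names `rhs_matrix` and `advect_gaussian` are choices made for this illustration.

```python
import numpy as np

def rhs_matrix(n_el, u, kappa, obc):
    """Matrix A of dT/dt = A T for nodes 1..N on 0 <= x <= 1, with T_0 = 0."""
    l = 1.0 / n_el
    A = np.zeros((n_el, n_el))
    a = u / (2 * l) + kappa / l**2      # coefficient of T_{j-1}
    c = -u / (2 * l) + kappa / l**2     # coefficient of T_{j+1}
    for r in range(n_el - 1):
        A[r, r] = -2 * kappa / l**2
        if r > 0:
            A[r, r - 1] = a
        A[r, r + 1] = c
    if obc == "image":                  # last row from (2.6-171)
        rate = 2 * kappa / l**2
    elif obc == "nbc":                  # last row from (2.6-172)
        rate = u / l + 2 * kappa / l**2
    else:
        raise ValueError("obc must be 'image' or 'nbc'")
    A[-1, -1] = -rate
    A[-1, -2] = rate
    return A

def advect_gaussian(obc, n_el=50, P=100.0, u=1.0, x0=0.5, t_end=0.9, dt=0.002):
    """Advect a Gaussian out of the mesh and return what is left behind."""
    l = 1.0 / n_el
    kappa = u * l / (2 * P)
    sigma = 5 * l
    x = l * np.arange(1, n_el + 1)
    T = np.exp(-(x - x0) ** 2 / (2 * sigma**2))
    A = rhs_matrix(n_el, u, kappa, obc)
    for _ in range(int(round(t_end / dt))):   # classical RK4 steps
        k1 = A @ T
        k2 = A @ (T + 0.5 * dt * k1)
        k3 = A @ (T + 0.5 * dt * k2)
        k4 = A @ (T + dt * k3)
        T = T + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x, T

_, T_img = advect_gaussian("image")
_, T_nbc = advect_gaussian("nbc")
print("max |T| left on the mesh, image point:", np.max(np.abs(T_img)))
print("max |T| left on the mesh, NBC        :", np.max(np.abs(T_nbc)))
```

By $t = 0.9$ the exact Gaussian has essentially left the domain, so whatever remains on the mesh is numerical residue; the image-point run should retain large $2\Delta x$ waves while the NBC run retains very little.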
Before expounding these differences, however, let us briefly return to the discussion near the end of Section 2.4.1, wherein we suggested that only foolish or naive modelers would actually invoke the image point
method. That this is not necessarily the case is seen in at least the two following FDM papers: Price et al. (1966) needed to invoke the 'theory of oscillatory matrices' to explain their results, and Fisk (1982) showed that even the Keller box scheme is not immune to wiggles. Both were concerned about the resulting 'oscillations' in their solutions; both used the 'conventional' image point OBC approximation to $\partial T/\partial x = 0$.

Since in this case pictures speak much louder than words, we show comparative results before trying to understand/explain them. The IC in Figure 2.6-59 is a Gaussian wave on a 50-element mesh with $u = 1$, $\sigma/l = 5$ (fairly easy even for LM, at least for small $t$), $\mathrm{Pe} = u\sigma/\kappa = 1000$ ($P = ul/2\kappa = 100$), and $\Delta t$ was small enough to consider the ODE integration as exact. The solution is on $0 \le x \le 1$, but we also show the exact solution, (2.6-135), on $0 \le x \le 2$, as a dashed line. The image point BC is obviously quite disruptive, and the homogeneous NBC is obviously quite the opposite—showing only a modicum of small amplitude $2\Delta x$ waves that appear to trail the exact solution as it leaves the mesh. Both these little wiggles and the big ones from the FDM case will actually move upstream at the group velocity of about $-1$.

The analysis of this behavior, in terms of eigenvectors, is not easy. Although the results for the FDM case are well known (see, for example, Hindmarsh et al., 1984), those for the FEM case (even with LM) are not. The analog of (2.6-168) through (2.6-170) for the FDM case is given by

$$\lambda_m = \frac{2\kappa}{h^2}\left(1 - \sqrt{1 - P^2}\,\cos\psi_m\right) \tag{2.6-173}$$

and $\sqrt{1 - P^2}$ is replaced by $i\sqrt{P^2 - 1}$ for $P > 1$, where $\psi_m$ are (for both cases) the $N$ roots ($N = L/h$) between 0 and $\pi$ of

$$P\tan N\psi_m + \tan\psi_m = 0, \tag{2.6-174}$$

with eigenvector

$$T_j^{(m)} = \begin{cases} \left(\dfrac{1 + P}{1 - P}\right)^{j/2}\sin j\psi_m & \text{for } P < 1, \\[2ex] (-1)^{j/2}\left(\dfrac{P + 1}{P - 1}\right)^{j/2}\sin j\psi_m & \text{for } P > 1. \end{cases} \tag{2.6-175}$$

For the finite element case, the (small?)
change in the last row of the matrix changes the characteristic equation from (2.6-174) to

$$P\tan N\psi_m + \sqrt{1 - P^2}\,\sin\psi_m = 0, \tag{2.6-176}$$

with $\lambda_m$ still given by (2.6-173) and $T_j^{(m)}$ still given by (2.6-175). The difference here is that $P > 1$ converts (2.6-176) to a complex equation with complex roots; the roots of (2.6-174) remain real for all $P$. While the details of these subtle differences are not yet fully appreciated, one limiting case is (with thanks to D.F. Griffiths): as $P \to \infty$, (2.6-174) $\to \tan N\psi_m = 0$ or $\psi_m = m\pi/N$ and the Dirichlet OBC case is recovered—an observation that obviously helps to explain FDM's wiggles. For the FEM case, $P \to \infty$ does not recover Dirichlet; rather, (2.6-176) yields

$$\tan N\psi_m + i\sin\psi_m = 0, \tag{2.6-177}$$
Fig. 2.6-59 Comparison of FDM image point and FEM NBC as OBC's: Pe = 1000, P = 100, lumped linears. (Panels labeled FDM and FEM plot T against x on 0 to 2 at successive times, including t = 0.25 and t = 0.50.)
producing a complex set of roots that somehow recognizes that, for $\kappa \to 0$, the OBC should vanish because the AD equation becomes the pure advection equation, $\partial T/\partial t + u\,\partial T/\partial x = 0$. The details of this 'recognition' are, however, still somewhat obscure. All we know for sure is that, if an eigenvector expansion approach is pursued, the wiggle-free FEM result and the wiggly FDM result are both complex linear combinations of complex eigenvectors, only one set of which conspires to make wiggles. The FEM is often smarter than its users.

To conclude this section, we return to the relevant issues raised by Trefethen in this regard, as alluded to earlier. For our AD equation on a finite domain, it is a fact that only the periodic BC case generates an operator (in the continuum) and a matrix (in the discrete approximation) that is 'normal'; a normal operator commutes with its adjoint, and a normal matrix does the same (the adjoint being the complex conjugate of the transpose matrix). For either Dirichlet/Dirichlet or Dirichlet/Neumann BC's, $AA^T \ne A^TA$; our matrices are non-normal, a measure of the non-normality being the condition number of the transformation matrix that converts our AD matrix to a symmetric matrix—which condition number increases with Pe. Such matrix transformations are discussed in Fletcher and Griffiths, 1980; also, for the record, symmetric and skew-symmetric matrices are normal. Reddy and Trefethen (1994) point out that the condition number of the continuous AD operator, as well as that of its basis (as we have seen), is $O(e^{\mathrm{Pe}})$. The key points are these: (i) the eigenvectors corresponding to non-normal matrices are not orthogonal (indeed, for $P \gg 1$, they can be nearly parallel), and (ii) an expansion in terms of these guys (or even the eigenfunctions in the continuous case) would encounter insuperable numerical difficulties (cf. $e^{\pm\mathrm{Pe}}$ for Pe = 10, 100, 1000, ...).
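The non-normality claim, and the growth of the symmetrizing transformation's condition number, can be seen on a small example. This is a sketch under stated assumptions: the lumped-mass, centered interior operator with Dirichlet conditions at both ends and $P < 1$, for which a diagonal similarity transformation with entries $[(1+P)/(1-P)]^{j/2}$ symmetrizes the matrix. The function names `ad_matrix` and `symmetrizer` are hypothetical, and $\mathrm{Pe} = uL/2\kappa$ is taken as $(N+1)P$ for $N$ interior nodes.

```python
import numpy as np

def ad_matrix(n, P, kappa=1.0, h=1.0):
    """Tridiagonal A of dT/dt = A T (centered advection, lumped mass, n interior nodes)."""
    a = kappa / h**2 * (1 + P)          # coefficient of T_{j-1}
    c = kappa / h**2 * (1 - P)          # coefficient of T_{j+1}
    return (-2 * kappa / h**2) * np.eye(n) + a * np.eye(n, k=-1) + c * np.eye(n, k=1)

def symmetrizer(n, P):
    """Diagonal D such that inv(D) @ A @ D is symmetric (valid for P < 1)."""
    rho = np.sqrt((1 + P) / (1 - P))
    return np.diag(rho ** np.arange(1, n + 1))

n, P = 20, 0.5
A = ad_matrix(n, P)
D = symmetrizer(n, P)
S = np.linalg.inv(D) @ A @ D

departure = np.linalg.norm(A @ A.T - A.T @ A)   # zero only for a normal matrix
print("||A A^T - A^T A|| =", departure)
print("symmetrized?      ", np.allclose(S, S.T))
print("cond(D)           =", np.linalg.cond(D))

# For small P the condition number of D grows roughly like exp(Pe).
for n2, P2 in [(50, 0.05), (100, 0.05), (200, 0.05)]:
    Pe = (n2 + 1) * P2
    print(f"Pe = {Pe:6.2f}   log cond(D) = {np.log(np.linalg.cond(symmetrizer(n2, P2))):6.2f}")
```

The diagonal entries of `D` are exactly the exponential factors that multiply the sines in the eigenvectors, which is one way to see why an eigenvector expansion becomes numerically hopeless as Pe grows.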
Let us end the discussion with some cogent words and advice from Trefethen and friends:

1. 'This is a reflection of the fact that these operators* are non-normal. This means that they cannot be unitarily diagonalized, or to put it another way, their eigenfunctions are not orthogonal. For the convection-diffusion problem, the degree of non-normality grows exponentially with Pe. It follows that any attempt to make quantitative estimates of the behavior of L (the AD operator) by means of its eigenfunctions or eigenvalues is likely to lead to exponentially large constants. Such estimates are of little use when Pe is large, and have no content at all that is uniformly valid as $\mathrm{Pe} \to \infty$.'—Reddy and Trefethen (1994).

* Not just matrices; i.e., it is true for the continuum too.

2. 'If it is far from normal, however, the change to eigenvector coordinates may involve an extreme distortion of the state space. In the new coordinates, the physics of the system may become strangely complicated. A typical state of the system may be a superposition of huge eigenfunction components that nearly cancel, and the evolution over time intervals of scientific interest may be determined by how this pattern of cancellation evolves, rather than by the growth or decay of the individual eigenfunctions. In other words, there may be no good scientific reason for attempting to analyze the problem in terms of eigenvalues or eigenvectors.'—Trefethen (1995).

3. 'Eigenvalues and eigenvectors are an imperfect tool for analyzing non-normal matrices and operators, a tool that has often been abused. Physically, it is not always the eigenmodes that dominate what one observes in a highly non-normal system. Mathematically,
eigenanalysis is not always an efficient means to the end that really matters: understanding behavior.'—Trefethen (1991).

With this we abandon the analysis and return to the numerical results, which tell us that the homogeneous NBC is an excellent OBC. They also provide guidance to the FDM community.

So, we have found a desirable OBC for the 1D advection-diffusion equation, namely, $\kappa\,\partial T/\partial x = 0$. This result generalizes to multi-dimensions via $\kappa\,\mathbf{n}\cdot\nabla T = 0$, at least when exit 'planes' are perpendicular to the coordinate axes. It also nicely 'goes away' as $\kappa \to 0$, and we reach the proper pure advection limit for which there is no BC at the exit. And this is as far as we need to take the Dirichlet/Neumann case, except to mention: see Smith (1980) for more results on this class of problems.

e. Advection-diffusion with Neumann BC's at both ends

We include this case only because it is mathematically interesting—if physically lacking in interpretation. And we mention at the outset that it is only legitimate for non-zero diffusion coefficients—pure advection requires a Dirichlet BC at the inlet (and no outlet BC). Here we will restrict attention to the continuous problem, because this set of BC's seems to have little practical utility. The solution of (2.6-116) with $\Phi' = 0$ at both ends is (again, from Hindmarsh et al., 1984)

$$\lambda_0 = 0, \qquad \lambda_n = (n^2\pi^2 + \mathrm{Pe}^2)\,\kappa/L^2, \tag{2.6-178}$$

$$\Phi_0 = 1, \qquad \Phi_n(x) = e^{\mathrm{Pe}\,x/L}\left[\cos\frac{n\pi x}{L} - \frac{\mathrm{Pe}}{n\pi}\sin\frac{n\pi x}{L}\right], \tag{2.6-179}$$

and we refer to the original reference for the discrete analogs via lumped linears—and the 'image point' BC. The interesting aspect of this set of BC's is what it shows us if we pose an IBVP using them; namely,

$$\partial T/\partial t + u\,\partial T/\partial x = \kappa\,\partial^2 T/\partial x^2 \quad \text{on } 0 < x < L, \tag{2.6-180}$$

$$\partial T/\partial x = 0 \quad \text{at } x = 0, L, \tag{2.6-181}$$

and $T(x, 0) = T_0(x)$.
(2.6-182)

An eigenfunction expansion solution of this IBVP reveals the following steady-state solution—from the constant eigenfunction that does not decay:

$$T(x, \infty) = \int_0^L T_0(x)\,e^{-2\mathrm{Pe}\,x/L}\,dx \Big/ \int_0^L e^{-2\mathrm{Pe}\,x/L}\,dx = \frac{2\mathrm{Pe}}{L}\int_0^L T_0(x)\,e^{-2\mathrm{Pe}\,x/L}\,dx \Big/ \left(1 - e^{-2\mathrm{Pe}}\right), \tag{2.6-183}$$

a constant temperature that approximates $T_0(0)$ for $\mathrm{Pe} \gg 1$—the zero derivative at the inlet forces inflow to occur at close to $T_0(0)$ for all time.
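As a small worked example of (2.6-183), the weighted average can be evaluated by quadrature for any initial profile. The sketch below assumes $L = 1$, a simple trapezoidal rule, and the illustrative initial condition $T_0(x) = 1 - x$, for which the ratio of integrals has the closed form $1/(1 - e^{-2\mathrm{Pe}}) - 1/2\mathrm{Pe}$; the function name `neumann_steady_state` is hypothetical.

```python
import numpy as np

def neumann_steady_state(T0, Pe, L=1.0, n=20001):
    """Evaluate the constant steady state of (2.6-183) by trapezoidal quadrature."""
    x = np.linspace(0.0, L, n)
    w = np.exp(-2.0 * Pe * x / L)       # weight exp(-2 Pe x / L)
    dx = x[1] - x[0]

    def trap(g):
        return dx * (g.sum() - 0.5 * (g[0] + g[-1]))

    return trap(T0(x) * w) / trap(w)

def T0(x):
    return 1.0 - x                      # illustrative IC with T0(0) = 1

for Pe in (1.0, 10.0, 100.0):
    closed_form = 1.0 / (1.0 - np.exp(-2.0 * Pe)) - 1.0 / (2.0 * Pe)
    print(f"Pe = {Pe:6.1f}   quadrature = {neumann_steady_state(T0, Pe):.6f}   "
          f"closed form = {closed_form:.6f}")
```

As Pe grows, the computed constant moves toward $T_0(0) = 1$, in line with the remark that the inflow occurs at close to the inlet value of the initial data.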
f. Advection-diffusion with Dirichlet/Robin BC's

The last case we visit returns to Dirichlet at the inlet but employs a BC of the third kind at the exit; namely,

$$\kappa\,\partial T/\partial x + HT = 0 \quad \text{at } x = L, \tag{2.6-184}$$

where $H > 0$ corresponds to a 'Newton's law of cooling' (convective) BC, and $H < 0$ can cause problems (e.g., exponential growth in $x$). In fact, the special case of $H = -u$ gives the zero total flux BC,

$$\kappa\,\partial T/\partial x - uT = 0, \tag{2.6-185}$$

which is interesting in its own right—although perhaps more in the field of mass transfer than heat transfer. In neither case will we present much in the way of results, however, and we also omit/neglect discussing the associated eigenproblems; see Hindmarsh et al. (1984) for a discussion of these. What we will do is refer the reader to some literature for the $H > 0$ case and offer a new challenge for the case $H = -u$; i.e., we conclude this section by returning to the zero total flux case ($H = -u$) and pose the following IBVP:

$$\partial T/\partial t + u\,\partial T/\partial x = \kappa\,\partial^2 T/\partial x^2 \quad \text{on } 0 < x < L, \tag{2.6-186}$$

$$T(0, t) = T_0 \quad \text{at } x = 0, \tag{2.6-187}$$

$$\kappa\,\partial T/\partial x - uT = 0 \quad \text{at } x = L, \tag{2.6-188}$$

with

$$T(x, 0) = 0, \tag{2.6-189}$$

which describes the advection-diffusion of a 'front.' For large $\mathrm{Pe} = uL/\kappa$, advection will dominate for $t \lesssim O(L/u)$, and the front will simply translate at speed $u$. But when the front slams into the wall at $x = L$, advection will no longer dominate, owing to the no-flux OBC. In fact, far from it; even though the flow would like to 'carry $T$' with itself right through the exit 'plane,' back-diffusion precludes it—no $T$ can leave at $x = L$. The resulting battle between advection and diffusion is interesting to say the least; it is, in fact, the toughest, 1D, linear advection-diffusion problem that we have encountered. The one thing we can say for sure is that at sufficiently large $t$, the solution will approach the following steady state

$$T(x) = T_0\,e^{\mathrm{Pe}\,x/L}, \tag{2.6-190}$$

giving $T(L) = 22{,}000\,T_0$ for Pe = 10 and $T(L) = 2.7 \times 10^{43}\,T_0$ for Pe = 100.
Clearly, the advection-dominated case is very difficult, and we 'pause' to note that we are well aware of the fact that this is a problem in mathematics, not in physics. But we nevertheless offer it as a 'next generation' hard problem designed to break codes—and computers; i.e., while a GFEMIA analyst would probably know how—or soon learn how from the wiggle signal—to build a decent mesh, the numbers may become too difficult to cope/compute with. Another challenge in this problem is time integration itself, as there are at least two time scales, one very short ($\tau_A = L/u$) and one very long, $\tau_{ss} = Le^{\mathrm{Pe}}/u\mathrm{Pe} = \tau_A e^{\mathrm{Pe}}/\mathrm{Pe}$, whose derivation came about as follows: (i) the total 'energy' in $(0, L)$ at a steady state
is $\int_0^L T(x)\,dx \approx LT_0e^{\mathrm{Pe}}/\mathrm{Pe}$, (ii) the rate of energy addition by advection at $x = 0$ is $uT_0$, and (iii) a lower bound on the time to reach a steady state is obtained by neglecting back-diffusion and just estimating how long it takes to advect in the total energy—the result is that given above, $\tau_{ss}$. The true time scale for reaching the steady state is even longer!

Shown next is one solution (for $T_0 = 1$) of this no-flux problem, obtained in a semi-analytic way—via Laplace transforms [courtesy of Dr. Novy, then (1989) a postdoctoral student at the University of Minnesota]—but only for small Pe; i.e., the software subroutine INLAP of the IMSL V.10 library on the Cray 2 was not able to obtain a reliable solution (inverse Laplace transform) for Pe much above 10 owing to insufficient resolving capability (machine round-off). The solution for small time ($t \le 1$) is shown in Figure 2.6-60 for three values of Pe in which the 'front' is clearly not sharp; i.e., we thus far have a nearly diffusion-dominated flow.

Fig. 2.6-60 Advection-diffusion with a no-flux outflow boundary condition. (Panels plot T against x on 0 to 1 at several times up to t = 1; panel (c) is for Pe = 10.)

But the medium and large $t$ solution, shown in Figure 2.6-61 for Pe = 10, shows that even this amount of advection does indeed cause
problems as the outlet value slowly climbs to $e^{10} = 22{,}000$ on the time scale whose lower bound of $\tau_{ss} = 2200$ is remarkably accurate; in fact, the following equation describes very accurately the temperature at $x = L$: $T(L, t) = T_0\,e^{\mathrm{Pe}}\left(1 - e^{-\mathrm{Pe}\,t/e^{\mathrm{Pe}}}\right)$ for $t > O(L/u)$—for reasons that we do not yet understand.

Fig. 2.6-61 As in Figure 2.6-60 except longer times and Pe = 10 only. (Ordinate runs from 0 to 25000.)

Returning briefly to the $H > 0$ case, we first remark that a large $H$ ($Hh/\kappa \gg 1$) will cause a previously 'easy' simulation to become hard—unless proper mesh refinement
accompanies large $H$. In particular, it is obvious that $H \to \infty$ returns us to the 'hard' (Dirichlet) OBC, complete with wiggles—unless the grid Peclet number is less than $O(1)$ at the exit.

The only literature on Robin BC's that we are aware of has come from the research group at the University of Minnesota's Department of Chemical Engineering and Materials Science, under the leadership of L.E. Scriven and H.T. Davis. A sampling of this literature is the following: Higgins (1982), Bixler and Scriven (1987), Christodoulou and Scriven (1989), and Novy et al. (1990, 1991). Most, if not all, of these results were obtained via asymptotic analysis of downstream conditions. Since many (most) of the above references are solving the NS equations, we shall revisit some of them in the next chapter (Section 3.8.1).

g. The advective-diffusive time scale

When the AD operator is normal (see Section 2.6.2d), the time scales that appear in any analytical solutions clearly show both advective and diffusive time scales. Two examples:

1. The dimensional form of the Gaussian solution on the $\infty$-span is, from (2.6-135),

$$T(x, t) = \frac{1}{(1 + 2\kappa t/\sigma^2)^{1/2}}\exp\left[-\frac{(x - x_0 - ut)^2}{2\sigma^2\,(1 + 2\kappa t/\sigma^2)}\right], \tag{2.6-191}$$

in which $\tau_A = x_0/u$ and $\tau_D = \sigma^2/2\kappa$ are clearly identifiable as advective and diffusive time scales, respectively.

2. For the case of periodic BC's, (2.6-108) also clearly shows both time scales: $\tau_A = 1/ku = \lambda/2\pi u$ and $\tau_D = 1/k^2\kappa = \lambda^2/4\pi^2\kappa$, where $\lambda$ is the wavelength.

When the AD operator is non-normal, some of the above 'physics' is lost; four more examples:

1. The Dirichlet BC case of Section 2.6.2c has, from (2.6-118), $\tau_D = L^2/n^2\pi^2\kappa$ and $\tau_{AD} = L^2/\kappa\mathrm{Pe}^2 = 4\kappa/u^2$, wherein only $\tau_D$ is 'physical'—see, for example, (2.6-123).

2. The Dirichlet/Neumann case of Section 2.6.2d has, from (2.6-168), $\tau_D = L^2/\gamma_n^2\kappa$ and $\tau_{AD} = 4\kappa/u^2$, similar to the pure Dirichlet case.

3. The Neumann/Neumann case of Section 2.6.2e is the same as all Dirichlet.

4.
The Dirichlet/Robin case of Section 2.6.2f: while not stated there, the results are the 'same' as the Dirichlet/Neumann case—only the value of $\gamma_n$ differs, slightly; it comes from $(HL/\kappa + \mathrm{Pe})\tan\gamma + \gamma = 0$ rather than from (2.6-169).

In all four cases, each mode ($n$) decays in place, at a rate given by the combined time constant, $\tau_n = (1/\tau_{D_n} + 1/\tau_{AD})^{-1}$, so that it seems to be the case that the non-physical (and mode-independent) advective-diffusive time constant is closely related to the apparently non-physical response that does not obviously 'display' advection. Also, if we define the physical advective time scale via $\tau_A \equiv \lambda_n/u = L/nu$, we obtain $\tau_{AD}/\tau_A = 4n\kappa/uL \sim n/\mathrm{Pe}$ and $\tau_{AD}/\tau_D = (2\gamma_n\kappa/uL)^2 \sim n^2/\mathrm{Pe}^2$—neither of which 'make sense'—vis-à-vis $\tau_D/\tau_A = uL/n\kappa \equiv \mathrm{Pe}^{(n)}$, which of course does make sense; e.g., if $\mathrm{Pe} = uL/\kappa \gg 1$, then the low modes are advection-dominated ($\tau_A \ll \tau_D$), whereas the very high (short wavelength) modes ($n \gg \mathrm{Pe}$) are diffusion-dominated.
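A few numbers make the mode-independence of the combined time constant concrete. This is a sketch for the pure Dirichlet case only, assuming $\tau_D = L^2/n^2\pi^2\kappa$, $\tau_{AD} = L^2/\kappa\mathrm{Pe}^2 = 4\kappa/u^2$ (which implies $\mathrm{Pe} = uL/2\kappa$ in these two formulas), and $L = \kappa = 1$; the function name `time_scales` is hypothetical.

```python
import numpy as np

def time_scales(n, Pe, L=1.0, kappa=1.0):
    """Return (tau_D, tau_AD, tau_n) for mode n, pure Dirichlet case (sketch)."""
    u = 2.0 * kappa * Pe / L                    # from Pe = u L / (2 kappa)
    tau_D = L**2 / (n**2 * np.pi**2 * kappa)    # diffusive, mode-dependent
    tau_AD = 4.0 * kappa / u**2                 # advective-diffusive, mode-independent
    tau_n = 1.0 / (1.0 / tau_D + 1.0 / tau_AD)  # combined decay time of mode n
    return tau_D, tau_AD, tau_n

Pe = 100.0
for n in (1, 10, 100, 1000):
    tau_D, tau_AD, tau_n = time_scales(n, Pe)
    print(f"n = {n:5d}   tau_D = {tau_D:.3e}   tau_AD = {tau_AD:.3e}   tau_n = {tau_n:.3e}")
```

For $n\pi \ll \mathrm{Pe}$ the combined constant is essentially $\tau_{AD}$ whatever the mode number, and only for $n\pi \gg \mathrm{Pe}$ does the diffusive time scale take over.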
All of the above observations lend more credence to the admonitions put forth by Trefethen at the end of Section 2.6.2d; namely, it is not a good idea to put much stock in eigenproblems that come from non-normal operators.

h. Final remarks on 1D advection-diffusion

We have seen what may be called 'easy' problems and 'hard' problems in the above discussion, where some 'hard' problems may be contrived or may occur inadvertently (hard OBC's, for example), where here 'easy' means 'no wiggles/very small wiggles/smooth solution' and 'hard' means 'wiggles that are large enough to be distasteful.' What have we learned about wiggles? Basically this: there are two major causes of wiggles, the first of which is sometimes only revealed via honest GFEM (i.e., consistent mass): (i) poorly resolved or rough IC's and (ii) hard OBC's, wherein the following definitions apply: a poorly resolved IC is one that, while smooth enough ($C^0$ at least), is attempted on a too coarse mesh [e.g., a Gaussian ($C^\infty$) with $h \gtrsim O(\sigma)$] and a rough IC is typically a $C^{-1}$ function (e.g., a step function, or any discontinuous function). A caveat is needed in the second (rough) case, however: if $\mathrm{Pe} \ll 1$ (which of course $\Rightarrow P \ll 1$) and the mass is lumped (where possible), the wiggles generated by the consistent mass matrix of GFEM ($L_2$-best fit) are suppressed, and a false sense of security may be thereby engendered. A smooth IC that is well resolved is called 'easy.' By hard OBC's we mean Dirichlet (usually) or Robin with a large $H$, and by easy OBC's we mean homogeneous Neumann (usually, and only when properly implemented via NBC's) or periodic (that rare but wonderful case). Finally, the hard OBC case applies equally well to steady-state simulations. Another, perhaps less common wiggle-maker is a smooth IC (e.g., constant) but a rough source term; e.g., a discontinuity in a 'heat source' will cause CM wiggles in the initial 'acceleration.'
The remainder of this wiggle discussion precludes the (always) hard IC cases and examines the remaining possibilities with easy IC's. Here the results are first presented in the form of a table (2.6-2), which we do for large $N$ ($h \ll L$), the only 'sensible' situation, then via some remarks. (Recall that $P = uh/2\kappa$ and $\mathrm{Pe} = uL/2\kappa = P(L/h) = NP$.)

Table 2.6-2 Some good and bad simulation results (Easy IC's and N >> 1).

              Easy OBC    Hard OBC
   P << 1     Good        Good
   P >> 1     Good        Bad

Remarks:

(1) All of the above results are based on a uniform mesh.

(2) The hard OBC case can be made easy simply by employing an appropriate fine mesh near the outlet (i.e., $h_N/L < 1/\mathrm{Pe}$ and smoothly graded from the outlet).

(3) Since $\mathrm{Pe} \ll 1 \Rightarrow P \ll 1$, a small global Peclet number is always easy.

(4) Clearly $P \gg 1 \Rightarrow \mathrm{Pe} \gg 1$; locally advection-dominated flows are always globally so.
(5) If a non-uniform mesh is employed, even easy IC's and easy OBC's can lead to a 'bad' result if, downwind of the IC (presumed to be of compact support) for an advection-dominated flow, the grid becomes sufficiently coarse that the local Peclet number, $P_j = uh_j/2\kappa$, becomes large compared with unity. For advection-dominated flow, the entire mesh must be 'sufficiently fine,' and is permitted to coarsen only gradually in the flow direction—or else wiggles will occur.

Next, we present another brief discussion on the subject of 'advection-dominated' flows for $N \gg 1$. Does $P \gg 1 \Rightarrow$ advection-dominated? It surely makes Pe very very large. Is $\mathrm{Pe} \ll 1$ diffusion-dominated? It surely makes $P$ very very small. The answer to both questions is, 'Yes, usually.' If $P \gg 1$ and we have a waveform (IC) characterized by a length scale $l$, say, then $P_l = ul/2\kappa = P \cdot l/h$ is a quite reasonable definition of an appropriate (to this problem) Peclet number. Since it would be silly (usually) not to take $h \ll l$ for good resolution, we have that $P \gg 1 \Rightarrow P_l \gg 1$; advection-dominated. The other case, though, is not so easy—partly because what $P \ll 1$ really means is that we have good resolution. For example, consider the case $P = uh/2\kappa \ll 1 \ll ul/2\kappa = P_l$; this very-well-resolved advection-dominated situation would appear to be diffusion-dominated if viewed solely from the 'viewpoint' that $P \ll 1$. Finally, let us suggest that advection-dominated means that $D$ [$= e^{-\kappa k^2 t}$, where $l$ is the characteristic length of the waveform in question or of the object being 'flowed past'] should not be less than 0.9 when advected a distance $l$. This gives $D = e^{-4\kappa/ul} = e^{-2/P_l} > 0.9$ or $P_l > \sim 20$. (If $D > 0.99$ is more to your liking, then you will need $P_l > \sim 200$.) If $l/h = 100$, then we have $P = 0.2$ for this advection-dominated ($P_l = 20$) flow.
As final final remarks, we briefly attempt to play devil's advocate by pointing a finger of failure at GFEM. Thus, we ask, 'After all is said and done, why even consider GFEM for the myriad of cases presented above when you nearly always (or, often) run flat up against a wiggle problem? Haven't you been hoisted with your own petard?' While many 'anti-Galerkin' advocates would indeed argue that we have just presented many reasons to shun GFEM—in favor of ad hoc or other methods that can simultaneously suppress/preclude the wiggles while otherwise generating accurate solutions that are not overly dissipative/smooth—we still generally prefer GFEM, partly because we still feel that the wiggle signals tell us how to do a better job and partly because methods that never wiggle will sometimes be deceptive with respect to alleged or assumed accuracy. Also, we have presented GFEM in the manner shown in part to 'expose' it fully, so that its strong points may also be better appreciated. Consistent with this belief, we will not offer panaceas to wiggle-sensitized readers—we merely warn them that all such non-wiggly methods should be employed with due caution and, usually, healthy skepticism. Finally, we point out that whereas tricky methods that do most of the right things are not too tough to devise in 1D, their extension to multi-dimensions is often either unsuccessful or tremendously difficult and/or expensive. Whereas most of our 1D examples via GFEM have truly analogous 2D and 3D versions, the smart 1D wiggle-suppressors often do not. For a fairly recent and useful summary of 1D methodologies, see Finlayson (1992).

2.6.3 Extension to 2D

The jump from 1D to 2D is a big one—too big in many ways, especially when it comes to analysis; useful, closed-form solutions are much harder to find and often are difficult
to 'interpret' even when found. The additional jump from 2D to 3D is, fortunately, not a big one—at least conceptually. But 3D analysis does involve lots of long equations. Thus, we will move into 2D, but not into 3D, for our analytical discussions. In fact, even our 2D presentation will be brief due, in part, to a lack of known results. For example, we will restrict almost all of our analysis to bilinear elements—leaving quads (and the entire and important class of triangular elements) to 'others,' or to the reader. We will, however, show some computational results comparing these latter elements with bilinears.

a. Pure advection with periodic BC's

The principal purpose of this section is to extend the phase speed/group velocity analysis to 2D, to see what new treats/surprises are in store. The two basic results of the effort below can be easily summarized up front: (i) dispersion error extends to directional error as well as to just translational error (spurious anisotropic behavior of the numerical approximation), and (ii) lumping the mass is even more deleterious than it was in 1D.

To start, we return to the uniform-mesh, 4-patch equation of Section 2.3.2, (2.3-24), simplified to constant velocity and with diffusion and source terms dropped:

$$\frac{1}{36}\left[16\dot T_0 + 4(\dot T_N + \dot T_S + \dot T_E + \dot T_W) + (\dot T_{NE} + \dot T_{NW} + \dot T_{SE} + \dot T_{SW})\right] + \frac{u}{6}\left[\frac{T_{NE} - T_{NW}}{2l} + 4\,\frac{T_E - T_W}{2l} + \frac{T_{SE} - T_{SW}}{2l}\right] + \frac{v}{6}\left[\frac{T_{NE} - T_{SE}}{2h} + 4\,\frac{T_N - T_S}{2h} + \frac{T_{NW} - T_{SW}}{2h}\right] = 0, \tag{2.6-192}$$

which, of course, approximates the pure advection equation,

$$\partial T/\partial t + u\,\partial T/\partial x + v\,\partial T/\partial y = 0. \tag{2.6-193}$$

Actually, for the purpose of the analysis to follow, there is really no reason to be restricted to a finite domain with periodic BC's; thus, we switch to a pure IVP and allow the wave vector to be a continuous variable. (Restriction to periodic domains can always be done at the end, if desired.)
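Thus, inserting the IC

$$
T_0(x, y) = e^{i(k_1 x + k_2 y)},
\tag{2.6-194}
$$

where $k_1^2 + k_2^2 = k^2$, into (2.6-193) yields the solution

$$
T(x, y, t) = e^{i(k_1 x + k_2 y - \omega_c t)},
\tag{2.6-195}
$$

where

$$
\omega_c = k_1 u + k_2 v = \mathbf{k}\cdot\mathbf{u},
\tag{2.6-196}
$$

and the subscript on $\omega$ refers (again, still) to the continuum result. Also, referring to Figure 2.6-1, the phase velocity [see (2.6-3)] is

$$
\mathbf{c} = \frac{\omega_c \mathbf{k}}{k^2} = \frac{\omega_c}{k}\begin{pmatrix}\cos\theta \\ \sin\theta\end{pmatrix} = c\begin{pmatrix}\cos\theta \\ \sin\theta\end{pmatrix}
\tag{2.6-197}
$$

and $\mathbf{G} = \mathbf{u}$—from (2.6-7).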
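As a small worked illustration of the difference between phase velocity and group velocity in the continuum, take $u = 1$, $v = 2$ and a wave vector at $\theta = 45^\circ$. These particular numbers are our own choice for illustration and are not a case worked in the text:

$$
\omega_c = \mathbf{k}\cdot\mathbf{u} = k\,(u\cos\theta + v\sin\theta) = \frac{3}{\sqrt{2}}\,k \approx 2.121\,k,
\qquad
\mathbf{c} = \frac{\omega_c}{k}\begin{pmatrix}\cos\theta \\ \sin\theta\end{pmatrix} = \begin{pmatrix}1.5 \\ 1.5\end{pmatrix},
\qquad
\mathbf{G} = \mathbf{u} = \begin{pmatrix}1 \\ 2\end{pmatrix}.
$$

The crests move along the wave vector at speed $3/\sqrt{2}$, whereas a wave packet is carried with the fluid velocity; the two vectors differ in both magnitude and direction unless $\mathbf{k}$ is parallel to $\mathbf{u}$.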
A similar approach using (2.6-192) on a mesh of $l \times h$ elements leads to

$$
T_0(x_m, y_n) = e^{i(k_1 m l + k_2 n h)}
\tag{2.6-198}
$$

as initial data,

$$
T(x_m, y_n, t) = T_0(x_m, y_n)\,e^{-i\omega t}
\tag{2.6-199}
$$

as the trial solution, and

$$
\omega = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{3}{2 + \cos\theta_1} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{3}{2 + \cos\theta_2}
\tag{2.6-200}
$$

as the resulting frequency, with $\theta_1 = k_1 l = kl\cos\theta$ and $\theta_2 = k_2 h = kh\sin\theta$. Comparing this result with that from the 1D case, (2.6-23), shows a 'nice' generalization that is totally lost if we lump the mass; i.e., whereas the 1D, lumped-mass result is given by (2.6-24), it does not generalize. What happens if the mass is lumped in (2.6-192) is this: (2.6-198) and (2.6-199) lead to

$$
\omega_l = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{2 + \cos\theta_2}{3} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{2 + \cos\theta_1}{3};
\tag{2.6-201}
$$

not only do we lose the improvement generated by the 'consistent mass factor,' $3/(2 + \cos\theta_i)$, we pick up a further reduction in frequency (and therefore in phase speed) by the cross-directional 'pollution' factors, $(2 + \cos\theta_i)/3$ (!). This is perhaps simpler to appreciate if we display the frequency from the simple, second-order centered FDM (the five-point stencil),

$$
\omega_{FDM} = u k_1\,\frac{\sin\theta_1}{\theta_1} + v k_2\,\frac{\sin\theta_2}{\theta_2},
\tag{2.6-202}
$$

which is the proper generalization of the 1D, lumped-mass result. Lumped-mass linears equal FDM in 1D only; they are inferior to FDM in multi-dimensions—an important point not generally appreciated. Consistent mass really is consistent.

We will soon demonstrate these differences via examples, but first let us bring two other non-Galerkin methods, previously considered, into the picture: the CVFEM of Section 2.5.3 and the one-point quadrature approximation of Section 2.5.2—wherein we recall that the key differences are in the mass matrix averages that result; the (1 4 1)/6 of GFEM goes over to (1 6 1)/8 for CVFEM and to (1 2 1)/4 for one-point quadrature. The resulting frequencies are

$$
\omega_{CVFEM} = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{4}{3 + \cos\theta_1} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{4}{3 + \cos\theta_2},
\tag{2.6-203}
$$

and its (more frequently employed) lumped-mass counterpart,

$$
\omega_{LMCVFEM} = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{3 + \cos\theta_2}{4} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{3 + \cos\theta_1}{4},
\tag{2.6-204}
$$

and

$$
\omega_{1\text{-}pt} = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{2}{1 + \cos\theta_1} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{2}{1 + \cos\theta_2},
\tag{2.6-205}
$$

a generalization of the Box-scheme result of (2.6-58) to 2D. Finally, the lumped mass and one-point quadrature result is

$$
\omega_{LM+1\text{-}pt} = u k_1\,\frac{\sin\theta_1}{\theta_1}\,\frac{1 + \cos\theta_2}{2} + v k_2\,\frac{\sin\theta_2}{\theta_2}\,\frac{1 + \cos\theta_1}{2}.
\tag{2.6-206}
$$
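The frequencies (2.6-200) through (2.6-206) are easy to tabulate numerically. The following is a minimal sketch, not code from the text: the function names `omega`, `omega_from_stencil`, and `group_velocity` are our own. It evaluates each closed form, recovers the GFEM frequency independently by applying the 4-patch stencil of (2.6-192) to a discrete Fourier mode, and estimates the group velocity $\partial\omega/\partial\mathbf{k}$ of (2.6-7) by central differences.

```python
import cmath
import math

def omega(method, k1, k2, u, v, l, h):
    """Semi-discrete frequency of one Fourier mode, per (2.6-200)-(2.6-206)."""
    t1, t2 = k1 * l, k2 * h
    c1, c2 = math.cos(t1), math.cos(t2)
    # u*k1*sin(t1)/t1 equals u*sin(t1)/l, and similarly in y
    sx, sy = u * math.sin(t1) / l, v * math.sin(t2) / h
    factors = {
        "GFEM":    (3 / (2 + c1), 3 / (2 + c2)),
        "LM":      ((2 + c2) / 3, (2 + c1) / 3),
        "FDM":     (1.0, 1.0),
        "CVFEM":   (4 / (3 + c1), 4 / (3 + c2)),
        "LMCVFEM": ((3 + c2) / 4, (3 + c1) / 4),
        "1-pt":    (2 / (1 + c1), 2 / (1 + c2)),
        "LM+1-pt": ((1 + c2) / 2, (1 + c1) / 2),
    }
    fx, fy = factors[method]
    return sx * fx + sy * fy

def omega_from_stencil(k1, k2, u, v, l, h):
    """GFEM frequency obtained by inserting exp(i(k1*x + k2*y - w*t))
    directly into the consistent-mass stencil (2.6-192)."""
    def E(m, n):  # nodal value of the mode at offset (m*l, n*h) from node 0
        return cmath.exp(1j * (k1 * m * l + k2 * n * h))
    mass = (16 * E(0, 0) + 4 * (E(0, 1) + E(0, -1) + E(1, 0) + E(-1, 0))
            + E(1, 1) + E(-1, 1) + E(1, -1) + E(-1, -1)) / 36
    adv = ((u / 6) * ((E(1, -1) - E(-1, -1)) + 4 * (E(1, 0) - E(-1, 0))
                      + (E(1, 1) - E(-1, 1))) / (2 * l)
           + (v / 6) * ((E(1, 1) - E(1, -1)) + 4 * (E(0, 1) - E(0, -1))
                        + (E(-1, 1) - E(-1, -1))) / (2 * h))
    # mass * (-i*w) + adv = 0  ->  w = adv / (i * mass)
    return (adv / (1j * mass)).real

def group_velocity(method, k1, k2, u, v, l, h, eps=1e-6):
    """Central-difference estimate of (d omega/d k1, d omega/d k2)."""
    gx = (omega(method, k1 + eps, k2, u, v, l, h)
          - omega(method, k1 - eps, k2, u, v, l, h)) / (2 * eps)
    gy = (omega(method, k1, k2 + eps, u, v, l, h)
          - omega(method, k1, k2 - eps, u, v, l, h)) / (2 * eps)
    return gx, gy

if __name__ == "__main__":
    # An 8l x 4h wave at 45 degrees with u = 1, v = 2, l/h = 2 (l = 2, h = 1 assumed)
    u, v, l, h = 1.0, 2.0, 2.0, 1.0
    k1 = (math.pi / 4) * math.cos(math.pi / 4) / l
    k2 = (math.pi / 2) * math.cos(math.pi / 4) / h
    for name in ("GFEM", "CVFEM", "LMCVFEM", "1-pt", "FDM", "LM", "LM+1-pt"):
        gx, gy = group_velocity(name, k1, k2, u, v, l, h)
        print("%-8s Gx/u = %7.4f  Gy/v = %7.4f" % (name, gx / u, gy / v))
```

For this wave the sketch gives $G_x/u$ of about 0.997 for GFEM and about 0.850 for the five-point FDM, while the lumped variants pick up an additional negative cross term proportional to $\sin\theta_1\sin\theta_2$.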
The numerical phase speeds are simply obtained from (2.6-197) with $\omega_c$ replaced by the approximate frequency from any of (2.6-200) through (2.6-206), say $\omega_i$; and the phase speed magnitude is $\omega_i/k$. The numerical group velocities, however, are another matter—some calculus is required per (2.6-7), and the results are shown in Table 2.6-3; recall that $\mathbf{G} = \mathbf{u}$ for the continuum—see Figure 2.6-1. The cross-pollution caused by mass lumping is particularly evident in the group velocities—for both FEM and CVFEM.

Table 2.6-3 Group velocities for several methods.

| Method | $G_x = \partial\omega/\partial k_1$ | $G_y = \partial\omega/\partial k_2$ |
|---|---|---|
| GFEM | $u\,\dfrac{3}{2+\cos\theta_1}\cdot\dfrac{1+2\cos\theta_1}{2+\cos\theta_1}$ | $v\,\dfrac{3}{2+\cos\theta_2}\cdot\dfrac{1+2\cos\theta_2}{2+\cos\theta_2}$ |
| CVFEM | $u\,\dfrac{4}{3+\cos\theta_1}\cdot\dfrac{1+3\cos\theta_1}{3+\cos\theta_1}$ | $v\,\dfrac{4}{3+\cos\theta_2}\cdot\dfrac{1+3\cos\theta_2}{3+\cos\theta_2}$ |
| LMCVFEM | $u\cos\theta_1\cdot\dfrac{3+\cos\theta_2}{4} - \dfrac{v}{4}\dfrac{l}{h}\sin\theta_1\sin\theta_2$ | $v\cos\theta_2\cdot\dfrac{3+\cos\theta_1}{4} - \dfrac{u}{4}\dfrac{h}{l}\sin\theta_1\sin\theta_2$ |
| 1-pt | $u\,\dfrac{2}{1+\cos\theta_1}$ | $v\,\dfrac{2}{1+\cos\theta_2}$ |
| FDM | $u\cos\theta_1$ | $v\cos\theta_2$ |
| LM | $u\cos\theta_1\cdot\dfrac{2+\cos\theta_2}{3} - \dfrac{v}{3}\dfrac{l}{h}\sin\theta_1\sin\theta_2$ | $v\cos\theta_2\cdot\dfrac{2+\cos\theta_1}{3} - \dfrac{u}{3}\dfrac{h}{l}\sin\theta_1\sin\theta_2$ |
| LM + 1-pt | $u\cos\theta_1\cdot\dfrac{1+\cos\theta_2}{2} - \dfrac{v}{2}\dfrac{l}{h}\sin\theta_1\sin\theta_2$ | $v\cos\theta_2\cdot\dfrac{1+\cos\theta_1}{2} - \dfrac{u}{2}\dfrac{h}{l}\sin\theta_1\sin\theta_2$ |

A quantitative comparison is given in Table 2.6-4 for $\theta = 45^\circ$ and three wave-number pairs, given in terms of wave lengths: (1) $16l \times 8h$ (a fairly well-resolved wave), with $\theta_1 = (\pi/8)\cos(\pi/4)$, $\theta_2 = (\pi/4)\cos(\pi/4)$; (2) $8l \times 4h$ (a not-so-well-resolved wave); and (3) a $4l \times 4h$ wave (barely resolved):

Table 2.6-4 Three special cases.

| Method | $G_x/u$, $16l \times 8h$ | $G_x/u$, $8l \times 4h$ | $G_x/u$, $4l \times 4h$ | $G_y/v$, $16l \times 8h$ | $G_y/v$, $8l \times 4h$ | $G_y/v$, $4l \times 4h$ |
|---|---|---|---|---|---|---|
| GFEM | 0.9998 | 0.9972 | 0.9461 | 0.9972 | 0.9461 | 0.9461 |
| CVFEM | 0.9739 | 0.9579 | 0.7864 | 0.9579 | 0.7864 | 0.7864 |
| LMCVFEM | $0.9256 - 0.036\,\frac{vl}{uh}$ | $0.7316 - 0.118\,\frac{vl}{uh}$ | $0.3823 - 0.201\,\frac{vl}{uh}$ | $0.8416 - 0.036\,\frac{uh}{vl}$ | $0.4273 - 0.118\,\frac{uh}{vl}$ | $0.3823 - 0.201\,\frac{uh}{vl}$ |
| 1-pt | 1.020 | 1.081 | 1.385 | 1.081 | 1.385 | 1.385 |
| FDM | 0.9617 | 0.8497 | 0.4440 | 0.8497 | 0.4440 | 0.4440 |
| LM | $0.9135 - 0.048\,\frac{vl}{uh}$ | $0.6922 - 0.126\,\frac{vl}{uh}$ | $0.3617 - 0.268\,\frac{vl}{uh}$ | $0.8172 - 0.048\,\frac{uh}{vl}$ | $0.4218 - 0.126\,\frac{uh}{vl}$ | $0.3617 - 0.268\,\frac{uh}{vl}$ |
| LM + 1-pt | $0.8894 - 0.072\,\frac{vl}{uh}$ | $0.6135 - 0.236\,\frac{vl}{uh}$ | $0.3206 - 0.401\,\frac{vl}{uh}$ | $0.8334 - 0.072\,\frac{uh}{vl}$ | $0.4106 - 0.236\,\frac{uh}{vl}$ | $0.3206 - 0.401\,\frac{uh}{vl}$ |

Fig. 2.6-62 Pictorial of group velocities for an $8l \times 4h$ wave with $u = 1$, $v = 2$, $l/h = 2$. Panels: (a) GFEM, (b) LM, (c) CVFEM, (d) FDM, (e) LMCVFEM, (f) 1-pt.

Finally, the middle case of Table 2.6-4 is depicted in Figure 2.6-62 for $u = 1$, $v = 2$, and $l/h = 2$. Most notable is the superiority of GFEM and the inferiority of its
lumped-mass counterpart—as usual. CVFEM is in between FDM and GFEM—unless lumping is employed, in which case it is poor. Also, the one-point quadrature approximation gives too large a result, although it is probably not in last place, which seems to go to one-point plus LM.

b. Pure advection with Dirichlet BC's (inlet only)

The GFEM automatically 'switches' to an upwind approximation to advection at the exit if the domain is truncated in a region of outflow ($\mathbf{n}\cdot\mathbf{u} > 0$)—see (2.3-11), with $\kappa = H = S = q = 0$. One might initially think that such an approximation would add artificial diffusion and thus be bad for pure advection. While it is true that the advection matrix is then no longer skew-symmetric, it is also true that the behavior/response is innocuous. In fact, Gustafson has shown (see Kreiss and Oliger, 1973) that the overall order of accuracy is not harmed by one-order-lower approximations to derivatives at boundaries for finite differences, and Hindmarsh (1975) has shown the same for 1D, linear finite elements. But the main purpose of this section is to present some numerical results for pure advection, which uses a Dirichlet BC at the inlet ($T = 0$) and no BC at the exit.

Presented below are numerical results for simple, constant-velocity, pure advection of a Gaussian through a 2D rectangular domain with the purpose of demonstrating some of the features discussed above and to compare various elements numerically/pictorially, including those not studied analytically: quadratic rectangles and triangles, both linear and quadratic. In all cases, the ODE's were integrated numerically using a sufficiently small timestep so that all of the errors can be regarded as 'spatial.' The domain in all cases is a unit square, and the discretization in all cases corresponds to that of a uniform bilinear element mesh of $30 \times 50$; i.e., $\Delta x = 1/30$ and $\Delta y = 1/50$.
The velocity field is given by $u = \cos 37^\circ = 0.7986$, $v = \sin 37^\circ = 0.6018$, and the time integration goes from zero to 1 (500 steps, trapezoid rule; see Section 2.7). The IC is a Gaussian centered at (0.2, 0.2) with $\sigma_x = \sigma_y = 0.05$, giving $\sigma_x/\Delta x = 1.5$ and $\sigma_y/\Delta y = 2.5$, which we 'define' to be fairly well resolved in $y$ but slightly under-resolved in $x$. Thus, at the end of the integration, the Gaussian center should be at $x = 0.2 + \cos 37^\circ = 0.9986$ (virtually at the exit) and $y = 0.2 + \sin 37^\circ = 0.8018$. The comparison is purely pictorial (the 'augenmethod'), and there are 16 different runs, as defined in Table 2.6-5, in which an additional piece of comparison data is given for each case; the max and min of the computed result at $t = 1/2$ ($x = 0.599$ and $y = 0.501$ for the exact result)—which sort of measures 'conservation' and wiggles. The pictures referred to in the table follow the table, and in each the left side shows $t = 1/2$ and the right side shows $t = 1$.

Table 2.6-5 Some advection experiments.

| Run | Element | Mass matrix | Max/(-min) at t = 0.5 | Figure |
|---|---|---|---|---|
| 1 | Bilinear | Consistent | 0.972/0.012 | 2.6-63(a) |
| 2 | Bilinear | Lumped (row sum) | 0.770/0.277 | 2.6-63(b) |
| 3 | 8-node serendipity | Consistent | 1.002/0.004 | 2.6-63(c) |
| 4 | 8-node serendipity | Lumped (1) | 0.400/0.439 | None (gibberish) |
| 5 | Biquadratic | Consistent | 1.004/0.006 | 2.6-64(a) |
| 6 | Biquadratic | Lumped (row sum) | 0.962/0.07 | 2.6-64(b) |
| 7 | Linear triangle (2) | Consistent | 0.963/0.017 | 2.6-65(b) |
| 8 | Linear triangle (2) | Lumped | 0.758/0.305 | 2.6-65(c) |
| 9 | Linear triangle (3) | Consistent | 0.968/0.015 | 2.6-66(a) |
| 10 | Linear triangle (3) | Lumped | 0.787/0.266 | 2.6-66(b) |
| 11 | Linear triangle (4) | Consistent | 0.964/0.044 | None |
| 12 | Linear triangle (4) | Lumped | 0.748/0.302 | None |
| 13 | Quadratic (6-node) triangle (2) | Consistent | 0.998/0.045 | 2.6-64(c) |
| 14 | Quadratic (6-node) triangle (2) | Lumped (1) | 0.662/0.414 | None |
| 15 | Quadratic (6-node) triangle (3) | Consistent | 0.998/0.011 | 2.6-65(a) |
| 16 | Bilinear | Consistent (5) | 0.860/0.200 | 2.6-66(c) |

(1) Compute total element mass (area), then distribute it equally to the nodes; see also Section 2.3.4, in which two other poor lumping schemes were discussed.
(2) Grid designed so the triangles were aligned with the flow—more or less.
(3) Grid designed so the triangles were aligned against the flow—more or less.
(4) 'Union-jack' mesh design—preferred by some (shunned by others).
(5) Reduced quadrature (one-point).

Fig. 2.6-63 Pure advection results for three quadrilateral elements.

Fig. 2.6-64 Pure advection results for two quadrilateral elements and one triangular element.

We offer the following comments regarding the results:

1. The size of the 'base' (added for 'display' purposes) on which the Gaussian sits is 10% of the height of the Gaussian—if the Gaussian remains positive. For the plots in these figures, however, there are some negative values, in which cases the height of the base is 10% of the full range of the plotted function plus the absolute value of the smallest negative value.
2. The three rectangular-element GFEM results, Figures 2.6-63(a), 2.6-63(c), and 2.6-64(a), show clearly the group velocity effect of being under-resolved in $x$ and 'okay' in $y$ by the fact that the dispersion wiggles are basically moving in the negative $x$-direction, irrespective of the fluid's velocity. The fact that lumped biquadratics, Figure 2.6-64(b), also look like the others, suggests that lumping the mass for this element is not nearly as deleterious as it is for most others—an effect we have seen before; Gresho et al. (1978, 1980).
3. The lumped bilinear element, Figure 2.6-63(b), performs poorly—as predicted—showing lots of dispersion in both directions.
4. The lumped serendipity element is inconsistent/non-convergent, an interesting point since the consistent mass result for this element, Figure 2.6-63(c), is arguably the best of all.
5. Linear triangles do indeed demonstrate their famous (infamous) 'mesh-orientation effect'; no surprise, really, since we know that group velocity errors are definitely mesh-dependent.
6. Consistent linears are almost, but not quite, as good as consistent bilinears—again not a new result.
7. Lumped linears are about as bad as lumped bilinears.
8. Quadratic triangles, Figures 2.6-64(c) and 2.6-65(a), are quite good, but the aligned result [Figure 2.6-64(c)] has a surprising trail of wiggles in the better-resolved direction.
9. Finally, the lumping algorithm employed for the quadratic triangle is, like that for the eight-node quad, inappropriate.
10. Reduced quadrature on the bilinear element [Figure 2.6-66(c)] causes phase lead, as predicted; ditto nine-node ($2 \times 2$), but smaller error (not shown).
11. For some analytical results using linear triangles in several configurations, see Neta and Williams (1986)—although at least one of their results is known to be in error (D. Griffiths, personal communication): the criss-cross triangle formulation is not unstable.
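The contrast between runs 1 and 2 (consistent vs row-sum-lumped bilinears) can be reproduced qualitatively with very little code if one is willing to change the problem slightly. The sketch below is a rough analog under our own assumptions, not the authors' experiment: it uses a periodic unit square rather than an inlet Dirichlet BC with a free exit, it integrates the semi-discrete equations exactly in time by multiplying each discrete Fourier mode by $e^{-i\omega t}$ with $\omega$ from (2.6-200) or (2.6-201), and it assumes the Gaussian has the form $\exp[-r^2/(2\sigma^2)]$. The helper names (`dft2`, `advect`, `gaussian`) are hypothetical, and the resulting extrema are not expected to match Table 2.6-5 digit for digit.

```python
import cmath
import math

def dft_rows(a, sign):
    """Naive (unnormalized) DFT along each row; sign=-1 forward, +1 inverse."""
    out = []
    for row in a:
        n = len(row)
        w = [cmath.exp(sign * 2j * math.pi * r / n) for r in range(n)]
        out.append([sum(row[m] * w[(p * m) % n] for m in range(n))
                    for p in range(n)])
    return out

def transpose(a):
    return [list(col) for col in zip(*a)]

def dft2(a, sign):
    """Separable 2D DFT built from two passes of row transforms."""
    return transpose(dft_rows(transpose(dft_rows(a, sign)), sign))

def advect(T0, u, v, l, h, t, lumped):
    """Advance periodic nodal values T0[m][n] to time t, mode by mode."""
    N, M = len(T0), len(T0[0])
    That = dft2(T0, -1)
    for p in range(N):
        th1 = 2 * math.pi * p / N
        for q in range(M):
            th2 = 2 * math.pi * q / M
            sx, sy = u * math.sin(th1) / l, v * math.sin(th2) / h
            if lumped:   # lumped-mass frequency, (2.6-201)
                w = sx * (2 + math.cos(th2)) / 3 + sy * (2 + math.cos(th1)) / 3
            else:        # consistent-mass frequency, (2.6-200)
                w = sx * 3 / (2 + math.cos(th1)) + sy * 3 / (2 + math.cos(th2))
            That[p][q] *= cmath.exp(-1j * w * t)
    return [[z.real / (N * M) for z in row] for row in dft2(That, +1)]

def gaussian(N, M, l, h, x0=0.2, y0=0.2, sigma=0.05):
    """Assumed initial profile exp(-r^2 / (2 sigma^2)) sampled at the nodes."""
    return [[math.exp(-((m * l - x0) ** 2 + (n * h - y0) ** 2) / (2 * sigma ** 2))
             for n in range(M)] for m in range(N)]

N, M = 30, 50
l, h = 1.0 / N, 1.0 / M
u, v = math.cos(math.radians(37)), math.sin(math.radians(37))
T0 = gaussian(N, M, l, h)
for name, lumped in (("consistent", False), ("lumped", True)):
    T = advect(T0, u, v, l, h, 0.5, lumped)
    flat = [x for row in T for x in row]
    print(name, "max = %.3f" % max(flat), "min = %.3f" % min(flat))
```

Because every mode only changes phase, the nodal sum is preserved in both variants; the difference shows up in the peak value and in the depth of the trailing undershoots, which is the same ranking as in the table.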
Fig. 2.6-65 Pure advection results for three triangular elements.

Fig. 2.6-66 Pure advection results for two triangular elements and one quadrilateral element.

c. Advection-diffusion with Dirichlet BC's

The only eigensystem result known to us for this case is that for the related FDM on the simple five-point stencil with constant coefficients. Thus, we shall present it, knowing (believing) that the nine-point GFEM version is 'similar but better.' The FDM version of $\partial T/\partial t + u\,\partial T/\partial x + v\,\partial T/\partial y = \kappa\nabla^2 T$ is, on a uniform mesh (see Figure 2.3-3),

$$
\dot T_0 + \frac{u}{2l}(T_E - T_W) + \frac{v}{2h}(T_N - T_S) = \kappa\left[(T_W - 2T_0 + T_E)/l^2 + (T_N - 2T_0 + T_S)/h^2\right],
\tag{2.6-207}
$$

which is the analog of (2.3-24)—after setting $u_i = u$, $v_j = v$, and $S = 0$ there.
We present the eigensolution only for the simplest case—a square domain with $l = h$. It turns out that the five-point stencil above leads, as does the continuum, to a simple generalization of the 1D results given in (2.6-128) and (2.6-129)—a generalization that is not easy for the GFEM. For the continuum, we have [cf. (2.6-118) and (2.6-119)]

$$
\lambda_{m,n} = \left[(m^2 + n^2)\pi^2 + (Pe_1^2 + Pe_2^2)\right]\kappa/L^2,
\tag{2.6-208}
$$

where $Pe_1 = uL/2\kappa$ and $Pe_2 = vL/2\kappa$, with eigenfunction

$$
\Phi_{m,n} = e^{(Pe_1 x/L + Pe_2 y/L)}\,\sin(m\pi x/L)\,\sin(n\pi y/L),
\tag{2.6-209}
$$

for $m, n = 1, 2, \ldots$. The corresponding FDM result is [for $l = h = L/(N + 1)$]

$$
\lambda_{m,n} = \frac{2\kappa}{h^2}\left[\left(1 - \sqrt{1 - P_1^2}\,\cos(m\pi h/L)\right) + \left(1 - \sqrt{1 - P_2^2}\,\cos(n\pi h/L)\right)\right]
\tag{2.6-210}
$$

with eigenvector

$$
\varphi^{(m,n)}_{jk} = \left(\frac{1 + P_1}{1 - P_1}\right)^{j/2}\sin(jm\pi h/L)\cdot\left(\frac{1 + P_2}{1 - P_2}\right)^{k/2}\sin(kn\pi h/L),
\tag{2.6-211}
$$

for $m, n, j, k = 1, 2, \ldots, N$ and $P_1 = uh/2\kappa$, $P_2 = vh/2\kappa$. Clearly the 2D case is a simple and appropriate generalization of the 1D case discussed in Section 2.6.2c, and we leave to the reader the extension of those results with respect to wiggles (transient and steady-state), minimum time of believability, etc.—including, unfortunately, the extension to GFEM!

But there are two additional features of the 2D case that merit mention here—both of which are much more important than the somewhat silly case discussed above: (i) flows past objects (with or without wiggles) and (ii) contained flows—both of which must extend the previously restricted case to one of variable velocity and variable mesh—via a return to GFEM. With respect to the former, consider flow of a cool fluid past a hot step, or block, or circular cylinder—at large Peclet number, which leads to two remarks. The first remark is this: if $P_1 \gg 1$ and $P_2 \gg 1$ in the neighborhood of the object, then the $T$-field will wiggle; the second is this: if the mesh is properly graded such that $P_1 < 1$ and $P_2 < 1$ (GFEMIA), again (only) in the neighborhood of the object, then the solution will not wiggle and it will be accurate.
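As a numerical check on the five-point eigensolution (2.6-210) and (2.6-211) quoted above, the following sketch applies the stencil of (2.6-207) to the closed-form eigenvector and measures the residual, and also compares the discrete eigenvalue with the continuum value (2.6-208). It assumes $l = h$, grid Peclet numbers below one (so the square roots are real), and a decaying mode $T = \varphi\,e^{-\lambda t}$; the function names are ours.

```python
import math

def fdm_eigenvalue(m, n, N, u, v, kappa, L=1.0):
    """Closed-form five-point eigenvalue, (2.6-210), for l = h = L/(N+1)."""
    h = L / (N + 1)
    P1, P2 = u * h / (2 * kappa), v * h / (2 * kappa)
    if abs(P1) >= 1 or abs(P2) >= 1:
        raise ValueError("this sketch assumes grid Peclet numbers below 1")
    return (2 * kappa / h ** 2) * (
        (1 - math.sqrt(1 - P1 ** 2) * math.cos(m * math.pi * h / L))
        + (1 - math.sqrt(1 - P2 ** 2) * math.cos(n * math.pi * h / L)))

def continuum_eigenvalue(m, n, u, v, kappa, L=1.0):
    """Continuum eigenvalue, (2.6-208)."""
    Pe1, Pe2 = u * L / (2 * kappa), v * L / (2 * kappa)
    return ((m * m + n * n) * math.pi ** 2 + Pe1 ** 2 + Pe2 ** 2) * kappa / L ** 2

def eigen_residual(m, n, N, u, v, kappa, L=1.0):
    """Largest |advection - diffusion - lambda*phi| over interior nodes,
    scaled by lambda*max|phi|, using the eigenvector (2.6-211)."""
    h = L / (N + 1)
    P1, P2 = u * h / (2 * kappa), v * h / (2 * kappa)
    lam = fdm_eigenvalue(m, n, N, u, v, kappa, L)
    r1, r2 = (1 + P1) / (1 - P1), (1 + P2) / (1 - P2)

    def phi(j, k):
        # vanishes (to roundoff) at j, k = 0 and N+1, i.e. on the Dirichlet boundary
        return (r1 ** (j / 2) * math.sin(j * m * math.pi * h / L)
                * r2 ** (k / 2) * math.sin(k * n * math.pi * h / L))

    worst, biggest = 0.0, 0.0
    for j in range(1, N + 1):
        for k in range(1, N + 1):
            adv = (u * (phi(j + 1, k) - phi(j - 1, k))
                   + v * (phi(j, k + 1) - phi(j, k - 1))) / (2 * h)
            dif = kappa * (phi(j - 1, k) + phi(j + 1, k) + phi(j, k - 1)
                           + phi(j, k + 1) - 4 * phi(j, k)) / h ** 2
            worst = max(worst, abs(adv - dif - lam * phi(j, k)))
            biggest = max(biggest, abs(phi(j, k)))
    return worst / (lam * biggest)

if __name__ == "__main__":
    u, v, kappa = 1.0, 0.5, 0.1
    print("relative residual, N = 8:", eigen_residual(1, 2, 8, u, v, kappa))
    for N in (8, 40, 400):
        print(N, fdm_eigenvalue(1, 2, N, u, v, kappa),
              continuum_eigenvalue(1, 2, u, v, kappa))
```

The residual should sit at roundoff level, and the discrete eigenvalue approaches the continuum one as the mesh is refined.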
(We surmise, but have not proven, that it is really the local normal grid Peclet number that makes or breaks wiggles; i.e., $P_n = u_n h_n/2\kappa$, where $h_n$ is the normal distance from the first node in from $\Gamma$, and $u_n = \mathbf{u}\cdot\mathbf{n}$ is the 'normal' velocity at that node.) The final remark relates to contained flow ($\mathbf{u}\cdot\mathbf{n} = 0$ on $\Gamma$), although actually not much is new for this case, because wiggles—and their suppression via local mesh refinement—can still occur when there is a velocity component normal to the boundary at the first node in, and the local (normal) Peclet number is too large there.

d. Advection-diffusion with periodic BC's

Periodic BC's are sometimes appropriate; e.g., if the 'geometry' of the problem is itself periodic (in one or more directions), so too might be the solution. Consider, for example, flow in a channel containing periodically placed identical 'obstacles'—such as cross-flow through a tube bank. The mathematical conditions that apply under such conditions are two: (i) continuity of $T$, and (ii) continuity of $\kappa\,\partial T/\partial n$, where $\partial T/\partial n$ refers to the normal
derivative along the periodic boundary—at $x = 0$ and $x = L$, typically. The first of these is incorporated into the finite element code simply by appropriate node 'labeling'; i.e., at 'outflow' the degree-of-freedom numbers must match, one-for-one, the corresponding numbers at 'inflow'—a situation that is especially obvious for the (rare) case of flow through a torus/doughnut; i.e., a 'truly' periodic flow. By giving the same node number at inlet and outlet, the FEM 'assembly' process will take care of the rest; i.e., the solution will be periodic. The second of the 'BC's' actually comes 'free' in the usual case; i.e., the first BC assures, as well as can be approximated, the second. (Of course, with our $C^0$ basis functions, we do not have $C^0$ gradients; but the jump in $\partial T/\partial n$ will be 'as small as possible,' in some sense—the same sense in which jumps in $\nabla T$ are small 'internally.')

To finish, we take a quick look at the associated 'eigenproblems.' The analytic solution for this case is easily found to be [cf. (2.6-108)]

$$
T(x, y, t) = e^{-\kappa k^2 t}\cdot e^{ik_1(x - ut)}\cdot e^{ik_2(y - vt)},
\tag{2.6-212}
$$

where $k^2 = k_1^2 + k_2^2$ and periodicity requires $k_1 = 2\pi m/L$ and $k_2 = 2\pi n/H$ with $m, n = 1, 2, \ldots$ in an $L \times H$ domain. The bilinear GFEM 'analog' on a uniform mesh is obtained by first adding the diffusion term to the RHS of (2.6-192)—see (2.3-24)—and then seeking a solution in the form $T_{jk}(t) = e^{-\mu t}\,e^{i(k_1 j\Delta x + k_2 k\Delta y - \omega t)}$ to obtain

$$
\mu = \frac{2\kappa}{l^2}(1 - \cos\theta_1)\cdot\frac{3}{2 + \cos\theta_1} + \frac{2\kappa}{h^2}(1 - \cos\theta_2)\cdot\frac{3}{2 + \cos\theta_2},
\tag{2.6-213}
$$

and $\omega$ is (still) given by (2.6-200), where $\theta_1 = k_1\Delta x$ and $\theta_2 = k_2\Delta y$ (with $\Delta x = l$ and $\Delta y = h$); i.e., we again have an appropriate generalization of the 1D result in Section 2.6.2b. We leave the mass-lumped analog to the reader, except for the following parting remark: as for pure advection, the 2D result is not an appropriate generalization of the 1D result.

e. Advection-diffusion with OBC's

Little need be said here except that the FEM NBC as OBC, the homogeneous Neumann BC, $\kappa\,\partial T/\partial n = 0$, should generally be employed or perhaps (probably) the new 'free' OBC discussed in Section 2.4.1. It will usually produce quite acceptable results, at both small and large global and local Peclet numbers, even when $\partial T/\partial n$ is not even close to zero at the outflow boundary—as in 1D. Finally, a brief return to Section 2.4.2 and Figure 2.4-4 may be worthwhile, showing as it does a possible problem when $\mathbf{n}$ is not parallel to a coordinate direction.

f. Final remarks on advection-diffusion via GFEM

The AD equation can range over the following PDE classifications: elliptic (steady-state with $\mathbf{u} = 0$), parabolic (the general case), and hyperbolic ($\kappa = 0$, pure advection). It is nearly common knowledge by now that elliptic equations are the easiest to solve numerically and that the GFEM generates, in the appropriate norm, the 'best' possible solution. For the other extreme, however ($\kappa = 0$), it usually does not—wiggles often get in the way. A primary purpose of much of what has been presented above was to display and explain GFEM—not necessarily to always advocate it. We have demonstrated the 'wiggle-weakness' of GFEM for pure advection (and for advection-dominated, $Pe \gg 1$, flows) and 'rough data'; i.e., for wave forms with some
length scales that are much smaller than the mesh size. Thus, for the parabolic (general) case that is 'close to' the hyperbolic limit, GFEM leaves something to be desired—which causes us to end this section with some ostensibly apologetic remarks. The main one is that, although many in the field of numerical simulation believe that there is a definite need for a dissipative advection scheme (to preclude wiggles, if nothing else), we have rarely found such a need—in spite of some of the wiggles presented above. Thus, we cannot offer the wiggle-sensitized reader any of the many alleged panaceas that have appeared in the literature. But we do suggest the following when considering a particular dissipative scheme: make sure it works well in multi-D and not just 1D. We presented very few multi-D versions of our various GFEM examples, because we know that the behavior generalizes properly (or improperly, if you are a wiggle-hater). But beware the 'smart' dissipative scheme that works great in 1D but often fails to do so in 2D or 3D, especially with regard to phase and group velocity errors. And we will always remain suspicious of Eulerian methods that never ever wiggle. If we were to advocate a 'better' way than GFEM for advection-dominated flows, it would be this: use the method of characteristics—much of which will be summarized in Section 2.7.7a.

2.7 TIME INTEGRATION

While all of the equations of interest in this text are PDE's, we are advocates of what some call the method of lines [we solve for $T$ as a 'continuous' function of time on the $j$-th line, $T_j(t)$—at least in 1D], and others call semi-discretization—and have been practicing it thus far; i.e., the spatial operators of the PDE's are approximated via spatial 'discretization' with the time variable remaining continuous, thus generating (countable) systems of ODE's (this chapter) or DAE's (next chapter).
Then, in 'Step 2,' the (considerable) theory and machinery of ODE's (or DAE's) is brought to bear in order to effect an appropriate/cost-effective time-integration method. A dominant reason for our proceeding along these lines is that one of us (PMG) has had (near) ready and willing access to one of the ODE experts of the day, Dr. A.C. Hindmarsh (LLNL), whose impact on our work has been, to say the least, considerable. While we do not see total agreement by many others with this philosophy, neither are we alone. We borrow from Warming and Beam (1979), who stated our case very well back in 1979 (shortly after our first FEM Navier-Stokes papers were published, in which the same approach was taken): 'Historically, the development and analysis of methods for ODE's have been more advanced than those for PDE's. The present state of numerical methods is no exception; therefore, it behooves the numerical analyst to exploit the sophisticated ODE methods for the numerical solution of PDE's.' This is our belief too; they just said it better. [See also Beam and Warming (1982).] Note too that the recent basic (and excellent) text on finite elements by Hughes (Hughes, 1987) adopts a similar approach. We just take (a portion of) it one important step further than either of the above—we utilize that portion of ODE theory that employs variable timesteps.

But we are also well aware that this approach is not without some disadvantages and even pitfalls. For openers, it generally pays little heed to the possibility of 'matching' spatial methods with temporal methods to obtain an optimum balance; e.g., a very-high-order ODE method does not match well with a spatial differencing scheme that uses simple upwinding. Another disadvantage of the method of lines is that one may sometimes not
see the forest for the trees; i.e., it is not always fruitful, nor even possible, to get at some of the deeper issues buried in PDE's such as regularity (smoothness), spatial singularities, and even convergence as $\Delta x$ and $\Delta t \to 0$. For example, it is not always sufficient to simply study convergence by letting $\Delta t \to 0$ for a fixed $\Delta x$ (fixed ODE system size); sometimes it is necessary to attempt to recover some of the forest by letting the number of ODE's grow toward infinity ($\Delta x \to 0$) while simultaneously allowing $\Delta t$ to approach zero. So, with certain admonitions placed 'up front,' we nevertheless believe that there are many more advantages to this 'ODE' approach than disadvantages, among which are:

1. There exists a large and growing theoretical base that is not accessible if a 'full PDE' treatment (time and space together) is selected.
2. The FEM is especially well developed and suited for spatial approximation, which more or less naturally leads to the method of lines.
3. The FEM in time, while explored by some, has not been notably successful—partly because of the many well-developed and well-understood finite difference methods that are available. Recent exceptions to this statement will be discussed later.
4. Spatial approximation error can be studied and evaluated on its own merits (or otherwise).
5. ODE theory can show us how to select the proper timestep—and when and how to change it.
6. Additional 'machinery' associated with ODE theory, such as solving linear and nonlinear algebraic equations, can also be usefully utilized.
7. Finally, our time derivatives are 'low' (first-order), and our space derivatives are high (second-order), thus relegating—in some sense—temporal integration to a secondary (easier) role.

Enough on philosophy; let us now get on with it.
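Before moving on, the two-step method-of-lines idea described above, and advantage 5 in particular, can be made concrete with a short sketch. Everything in it is an illustrative assumption of ours rather than the authors' algorithm: 'Step 1' uses second-order centered differences on a periodic 1D mesh instead of GFEM, and 'Step 2' uses Heun's method with the embedded forward-Euler result serving as a crude local error estimate that accepts or rejects each step and resizes $\Delta t$.

```python
import math

def make_rhs(N, u, kappa):
    """Step 1: semi-discretize T_t + u T_x = kappa T_xx on a periodic unit
    interval (N nodes, h = 1/N), giving dT_j/dt = rhs(T)_j."""
    h = 1.0 / N
    def rhs(T):
        return [-u * (T[(j + 1) % N] - T[j - 1]) / (2 * h)
                + kappa * (T[(j + 1) % N] - 2 * T[j] + T[j - 1]) / h ** 2
                for j in range(N)]
    return rhs

def integrate(rhs, y, t_end, dt=1e-3, tol=1e-5):
    """Step 2: variable-step Heun integration with an Euler/Heun error estimate."""
    t, steps = 0.0, 0
    while t_end - t > 1e-12:
        dt = min(dt, t_end - t)
        k1 = rhs(y)
        y_euler = [a + dt * b for a, b in zip(y, k1)]
        k2 = rhs(y_euler)
        y_heun = [a + 0.5 * dt * (b + c) for a, b, c in zip(y, k1, k2)]
        err = max(abs(a - b) for a, b in zip(y_heun, y_euler))
        if err <= tol:                      # accept the step
            t, y, steps = t + dt, y_heun, steps + 1
        # resize dt from the error estimate (limited growth and shrinkage)
        scale = 0.9 * math.sqrt(tol / err) if err > 0 else 2.0
        dt *= min(2.0, max(0.2, scale))
    return y, steps

def semi_discrete_exact(N, u, kappa, t):
    """Exact solution of the ODE system for the initial data sin(2*pi*x)."""
    h = 1.0 / N
    theta = 2 * math.pi * h
    mu = 2 * kappa * (1 - math.cos(theta)) / h ** 2   # decay rate
    w = u * math.sin(theta) / h                       # frequency
    return [math.exp(-mu * t) * math.sin(2 * math.pi * j * h - w * t)
            for j in range(N)]

if __name__ == "__main__":
    N, u, kappa = 20, 1.0, 0.01
    T0 = semi_discrete_exact(N, u, kappa, 0.0)
    T, steps = integrate(make_rhs(N, u, kappa), T0, 0.5)
    exact = semi_discrete_exact(N, u, kappa, 0.5)
    print("steps:", steps,
          "max temporal error:", max(abs(a - b) for a, b in zip(T, exact)))
```

Comparing against the exact solution of the ODE system, rather than of the PDE, isolates the temporal error from the spatial error, which is advantage 4 in the list above.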
What we shall do in this chapter is to introduce some ODE methods and some 'model' ODE's, beginning with the largest, single, ODE 'categorization' term: explicit vs implicit, the former usually leading to simple step-by-step 'marching' methods, and the latter to the intermediate solution of simultaneous (non-linear and linear) algebraic equations. Intellectual efforts (considerable) for implicit methods focus on 'How to solve linear and non-linear equation sets efficiently,' and those for explicit methods on 'What are the stability limits as a function of problem parameters and how can they be improved?' Another categorization, lesser in total significance, is low-order (first- and second-, typically) vis-a-vis high-order ODE methods, that will not consume much of our time (higher-order is, usually, not really required; unless, perhaps, one is solving (low-Re) turbulent flows via direct numerical simulation and a high-order spatial method). After the introduction via model ODE's, we will apply a few ODE methods to the scalar transport equation.

We begin by writing a model ODE that in some sense corresponds to the subjects of this chapter: advection and diffusion. It is

$$
\dot y(t) = i\omega y - y/\tau = -\lambda y, \qquad y(0) = y_0,
\tag{2.7-1}
$$

with (obvious) solution $y(t) = y_0 e^{-\lambda t} = y_0 e^{-t/\tau} e^{i\omega t}$, where $\tau\,(> 0)$ corresponds to a time-constant associated with diffusion (like $L^2/\kappa$ or $1/k^2\kappa$, where $k$ is a wavenumber),
and the frequency $\omega$ corresponds (via $\omega = u/L$ or $\omega = ku$) to a velocity (advection); the term $i\omega y$ is 'non-dissipative', like advection. For pure diffusion ('friction'), set $\omega = 0$, and for pure advection, set $\tau = \infty$. Equation (2.7-1) can also be interpreted as one Fourier mode of a converged ($h \to 0$) AD equation; i.e., one with no spatial error. A further correspondence with the PDE describing advection-diffusion is obtained when it is recalled that the diffusion matrix has purely real eigenvalues (corresponding to monotonic decay in time), and the advection matrix (skew-symmetric form, at least) has purely imaginary eigenvalues (corresponding to non-decaying transport). Furthermore, if one were to solve the ODE's corresponding to the AD equation via the eigenvector expansion method, it would turn out that each (uncoupled) 'mode' (eigenvector) would look like (2.7-1); see Remark (4) below.

Remarks:

(1) Whereas the ODE's of advection-diffusion involve the GFEM mass matrix, we shall defer this additional complexity in order to introduce some ODE methods in the easiest way.
(2) It is useful and immediate to generalize (2.7-1) to a vector system of equations each, in general, with a different $\lambda$ (eigenvalue); see Remark (4).
(3) Equation (2.7-1) is sometimes referred to as 'Dahlquist's test equation'; see Hairer and Wanner (1991).
(4) To clarify the above discussion and to motivate (formally at least) the study of a single scalar equation, return to (2.2-7) and rewrite it as $\dot T + AT = b$, where $A = M^{-1}[N(u) + K]$ and $b = M^{-1}f$. Now consider the eigenvalue problem $Ax_i = \lambda_i x_i$, $i = 1, 2, \ldots, N$, and assume the existence of a complete set of eigenvectors (valid for virtually all cases of interest to us) so that the 'total' eigenproblem is $AX = X\Lambda$, where the $j$-th column of the matrix $X$ is the $j$-th eigenvector and $\Lambda$ is a diagonal matrix of the eigenvalues, $\{\lambda_i\}$. Thus $A = X\Lambda X^{-1}$ and $\dot T + X\Lambda X^{-1}T = b$.
Finally, let $y = X^{-1}T$ to give $T = Xy$ and $X\dot y + X\Lambda y = b$ or $\dot y + \Lambda y = X^{-1}b \equiv \tilde b$, an uncoupled system of $N$ ODE's.

Actually, to better introduce the several ODE methods to be described below, and to better set the stage for the next chapter, we will also consider the general (nonlinear) ODE

$$
\dot y = f(y, t), \qquad y(0) = y_0,
\tag{2.7-2}
$$

where $f(y, t)$ is a given function that is fairly well-behaved (continuous in time and satisfies a Lipschitz condition; see, for example, Gear, 1971).

Next we list a few 'desirables' of ODE methods and point out right up front that no method yet devised possesses all of the listed attributes:

1. Self-starting; i.e., the method should be applicable given only the IC, $y(0) = y_0$. (Many are not.)
2. No spurious/extraneous/parasitic roots (solutions); i.e., the method should not introduce any numerical artifacts in addition to the desired solution. (Many do.)
3. Stable for all timestep sizes, $\Delta t$. (Most are not.)
4. No spurious damping; i.e., if the ODE is of the non-dissipative variety, then the solution method should not introduce numerical damping. (Many do.)
5. Easy to implement/inexpensive.
6. Finally, the method should be 'accurate'—an obvious but vague (thus far) statement; i.e., it should solve correctly the ODE for $\Delta t \to 0$, and the error should be small at finite but 'reasonable' $\Delta t$'s.

Additional Remarks:

(1) Attributes (1) and (2) above tend to go together in that self-starting methods have no extraneous roots, and methods with extraneous roots are not self-starting.
(2) Selection of an ODE method always involves compromises among the above list of attributes.
(3) For a method with 'good' stability properties, an additional attribute is that the local error be easy enough to estimate that a cost-effective, variable-step method, with $\Delta t$ based on desired accuracy, can be designed.
(4) As in the linear case, we can also consider that (2.7-2) describes a vector system of (in this case, coupled) ODE's.
(5) ODE people often/usually use $h$ for $\Delta t$ (one symbol vs two), and PDE people often use $k$ for $\Delta t$ because $h$ to them refers to $\Delta x$. But in fluid mechanics, $k$ often means wavenumber. Thus, since we are discussing all of the above—ODE's, PDE's, and fluid mechanics—we shall simply use $\Delta t$ for $\Delta t$.

To further set the stage for our brief discussion of ODE methods for AD, we refer to Figure 2.7-1, taken (with permission) from the Stanford CFD course notes in the Department of Aeronautics and Astronautics, #AA214—Numerical Methods in Fluid Mechanics, taught for many years by Harvard Lomax (ours is from 1980). The terminology in the figure is this: his $\lambda$ is our $\tau^{-1}$, his $h$ is our $\Delta t$, and his $\sigma$'s are the (complex in general) roots of the so-called 'characteristic polynomial' that characterizes the ODE method. We shall soon show some $\sigma$'s, but will call them $\xi$'s.
Each dot on each curve in the figure corresponds to the value of $\sigma$ for a particular timestep size, and the numerical solution is stable if and only if all $\sigma$'s lie within or on the unit circle ($|\sigma| \le 1$), the latter condition ($|\sigma| = 1$) requiring the further constraint that the roots be simple (repeated roots on the unit circle are unstable). To continue the discussion, we find it best to quote directly from the Lomax course notes, after pointing out that the left 'column' in Figure 2.7-1 is modeling pure diffusion, the right column is modeling pure advection, and the middle is AD:

'One can picture the stability of the $\sigma$-roots by plotting them, for given values of $\lambda h$, in the complex $\sigma$-plane relative to the unit circle about the origin. Figure 2.7-1 illustrates what could happen for a numerical method that produced one principal and two spurious roots. We assume the method is applied to inherently stable ODE. Shown are the traces of the $\sigma$-roots as $\lambda h$ starts at zero and is increased by some constant increment. The top row shows a (hypothetical) exact behavior. The second row shows the attempt of the principal $\sigma$-root to match the exact behavior and its ultimate failure for large $\lambda h$. Since $e^{\lambda h} = 1$ for $h = 0$, the principal root always starts at +1 on the unit circle. Traces of two spurious roots are shown on the bottom row. They can start anywhere inside, on, or outside the unit circle. For most useful methods they start inside the circle, and are kept inside by a proper choice of $h$. When any root, principal or spurious, leaves the unit circle, the method is said to be numerically unstable for that $\lambda h$.
Fig. 2.7-1 Two figures from H. Lomax's CFD course. (Reproduced by permission from H. Lomax).

Asymptotic Numerical Stability for $\lambda h \to 0$. In the classical study of ODE's, considerable attention is given to the subject of asymptotic stability, which refers to the behavior of the numerical $\sigma$-root structure in the limit as $h \to 0$. The theory applies only to the study of spurious $\sigma$-roots since all principal $\sigma$-roots must start at +1 when $h = 0$. We mention here two classifications of multi-root methods (defined as methods that produce at least one spurious root). If all the spurious $\sigma$-roots fall on the origin in the complex $\sigma$-plane when $h = 0$, the method is said to be of the Adams type. The Adams-Bashforth and Adams-Moulton methods have this property. If any spurious $\sigma$-root falls on the unit circle when $h = 0$, the method is said to be of the Milne type. The leap-frog method has this property. Milne methods are usually more accurate, but less stable than Adams methods. They are illustrated in Figure 2.7-1.'
Returning now to a general discussion of ODE integration methods, five important ODE definitions are the following:
1. A method is said to be $k$-th order accurate if, given the exact solution at time $t_n$, the local (single-step) error,
$$l_n = y_{n+1} - y(t_{n+1}), \tag{2.7-3}$$
where $y(t_{n+1})$ denotes the exact solution and $y_{n+1}$ the approximate solution at $t_{n+1}$, varies like $l_n = O(\Delta t^{k+1})$.
2. The global error,
$$e_n = y_n - y(t_n), \tag{2.7-4}$$
is the error over the whole range and includes the accumulation of local errors. It is in general larger than the local error by one power of $\Delta t$; $e_n = O(\Delta t^k)$ for a $k$-th order method. [See, for example, Gear (1971) for the proof of this assertion.]
3. A method is said to be A-stable (A $\ne$ Absolutely; it is simply A) if, when applied to (2.7-1) with arbitrary $r > 0$, the numerical solution $\to 0$ as $n \to \infty$. Note that there is no constraint on $\Delta t$ in this definition; thus, all $\Delta t$'s must cause $y_n \to 0$ as $n \to \infty$ if the method is A-stable. (Few are.)
4. A method is said to be L-stable if it is A-stable and, when applied to (2.7-1), gives $\lim_{\Delta t \to \infty} y_{n+1}/y_n = 0$.
5. An intermediate stability, called strong A-stability, has $\lim_{\Delta t \to \infty} y_{n+1}/y_n = \delta$, where $0 < \delta < 1$.

Before getting too embroiled in details, let us clarify some of the above discussion by showing the simplest (and lowest-order) explicit and implicit ODE methods, both first used (apparently) by Euler, because they are called 'forward Euler' (FE) and 'backward Euler' (BE), respectively, or 'explicit' and 'implicit' Euler. They are, applied to (2.7-2),
$$\text{FE:}\quad y_{n+1} = y_n + \Delta t\, f(y_n, t_n) = y_n + \Delta t\, \dot{y}_n, \tag{2.7-5}$$
$$\text{BE:}\quad y_{n+1} = y_n + \Delta t\, f(y_{n+1}, t_{n+1}) = y_n + \Delta t\, \dot{y}_{n+1}. \tag{2.7-6}$$
Clearly the explicit method is much simpler in that, since $y_n$ and $t_n$ are known, the evaluation of $f(y_n, t_n)$ is explicit, and a simple marching method is obtained.
For BE, on the other hand, $f(y_{n+1}, t_{n+1})$ is not known because $y_{n+1}$ is not; the method is implicit in $y_{n+1}$, and each step requires the solution of the (in general) non-linear equation (2.7-6). Even if the ODE is linear, an implicit method will always involve solving for $y_{n+1}$ rather than simply evaluating the RHS, which characterizes explicit methods. Thus, for (2.7-1), BE yields $y_{n+1} = y_n - \lambda \Delta t\, y_{n+1}$, which is 'solved' via $y_{n+1} = y_n/(1 + \lambda \Delta t)$. Before evaluating these two methods further (clearly, FE looks to be the 'winner' thus far), let us show them graphically, starting at $y_n$, $t_n$ in Figure 2.7-2. Note that FE overshoots and BE undershoots; this is also true for a monotonically decreasing function, and it helps to understand (as we shall soon see) the inherent instability of the former and the inherent stability of the latter. Even in the diagram, FE is much simpler than BE, the latter requiring first finding the slope of $y(t)$ at $t_{n+1}$ and then 'translating' this downward (in this case) until it passes through $y(t_n)$ to finally yield $y_{n+1}$. And in truth it is even worse yet because the curve $y(t)$, the exact solution of the ODE, is not available to us! Thus, the BE sketch is more suggestive than real. For example, even the simple
linear equation, $\dot{y} = -\lambda y$, would yield, in going from $y(t_n)$ at $t_n$ to $t_{n+1} = t_n + \Delta t$, $y_{n+1} = y(t_n)/(1 + \lambda \Delta t)$ for BE vs the true solution $y(t_{n+1}) = y(t_n) e^{-\lambda \Delta t}$; i.e., the true slope at $t_{n+1}$ is $-\lambda y(t_{n+1}) = -\lambda y(t_n) e^{-\lambda \Delta t}$, whereas the BE slope is given by $[y_{n+1} - y(t_n)]/\Delta t = -\lambda y(t_n)/(1 + \lambda \Delta t)$, differing from the true slope by $O(\Delta t^2)$ for $\lambda \Delta t$ small.

Fig. 2.7-2 Forward and backward Euler.

How do the two methods stack up against our list of attributes? Like so:
1, 2. Both are self-starting and introduce no parasitic solutions.
3. Only BE qualifies here, and we see that the extra effort associated with the implicit method can yield a dividend; if the ODE itself is stable (not going to $\infty$ at 'large' $t$), then BE is also stable for arbitrary $\Delta t$ whereas FE is, at best, conditionally stable, and, at worst, unstable (yielding solutions that grow without bound). For example, $\dot{y} = -\lambda y$ with $\lambda > 0$ is only stable via FE for $\Delta t < 2/\lambda$; and for another example, $\dot{y} = i\omega y$, the FE method, simple though it is, is unstable for all $\Delta t$. [Proof: $\dot{y} = -\lambda y \Rightarrow y_{n+1} = (1 - \lambda \Delta t) y_n \Rightarrow y_n = (1 - \lambda \Delta t)^n y_0$, and stability thus requires $|1 - \lambda \Delta t| < 1$ giving, for $\lambda = \lambda_r + i\lambda_i$, $(1 - \lambda_r \Delta t)^2 + (\lambda_i \Delta t)^2 < 1$, or $\Delta t < 2\lambda_r/(\lambda_r^2 + \lambda_i^2)$. This same analysis for BE, left as an exercise, gives stability when $\Delta t > -2\lambda_r/(\lambda_r^2 + \lambda_i^2)$, which is everywhere in the complex $\lambda \Delta t$ plane except within the unit circle centered at $\lambda_r \Delta t = -1$; in particular, it is stable for all ODE's with $\lambda_r \ge 0$. It is, in fact, so stable that it will drive to zero (damped oscillation) the numerical solution of an ODE with an exponentially growing solution ($\lambda_r < 0$) if $\Delta t$ is sufficiently large: $\lambda_r \Delta t < -2$ suffices, since that places $\lambda \Delta t$ outside the circle just described. This is the most 'dissipative' (and stable) method ever devised, and FE is the least stable.]
4. Here they both lose; BE will damp the solution to $\dot{y} = i\omega y$, $y = y_0 e^{i\omega t}$, eventually getting to $y = 0$, and FE will blow up, thus suggesting (properly), with due respect to Leonhard Euler, that perhaps better ODE methods should be sought. [FE yields $y_n = (1 + i\omega \Delta t)^n y_0$ and $|y_n/y_0| = (1 + \omega^2 \Delta t^2)^{n/2}$, and BE gives $y_n = y_0/(1 - i\omega \Delta t)^n$ and $|y_n/y_0| = 1/(1 + \omega^2 \Delta t^2)^{n/2}$.]
5. Only FE qualifies here, since solving non-linear equations is, by comparison with simple time-marching, expensive.
6. Both are of minimal acceptable accuracy. As we will see below, they are 'first-order' methods; for small $\Delta t$, $y_n - y(t_n) = O(\Delta t)$.

The local truncation error (LTE), $d_n$, is defined as the residual in the ODE formula when the exact solution is inserted. For FE, the LTE is determined via Taylor series analysis of a single step (hence, 'local'), on the assumption that the exact solution is available at the beginning of the step; we now derive it:
$$\begin{aligned} d_n^{FE} &= y(t_n) + \Delta t\, \dot{y}(t_n) - y(t_{n+1}) \\ &= y_n + \Delta t\, \dot{y}_n - \left[ y_n + \Delta t\, \dot{y}_n + \frac{\Delta t^2}{2} \ddot{y}_n + \frac{\Delta t^3}{6} \dddot{y}_n + \cdots \right] \\ &= -\frac{\Delta t^2}{2} \ddot{y}_n + O(\Delta t^3), \end{aligned} \tag{2.7-7}$$
and we see that in this case, the LTE is also the local error, $l_n$. For BE,
$$\begin{aligned} d_n^{BE} &= y(t_n) + \Delta t\, \dot{y}(t_{n+1}) - y(t_{n+1}) \\ &= \left[ y(t_{n+1}) - \Delta t\, \dot{y}(t_{n+1}) + \frac{\Delta t^2}{2} \ddot{y}(t_{n+1}) + O(\Delta t^3) \right] + \Delta t\, \dot{y}(t_{n+1}) - y(t_{n+1}) \\ &= \frac{\Delta t^2}{2} \ddot{y}(t_{n+1}) + O(\Delta t^3) = \frac{\Delta t^2}{2} \ddot{y}_n + O(\Delta t^3), \end{aligned} \tag{2.7-8}$$
the sign change being consistent with the sketches presented earlier. Also, the actual local error, $l_n$, is not quite $d_n^{BE}$ because, in fact, we do not, as assumed in obtaining (2.7-8), have the exact solution available at $t_{n+1}$ (only at $t_n$, by definition); but it is within $O(\Delta t^3)$ of $d_n^{BE}$.

So much for the two first-order ODE methods, for now, except to point out that the average of the two (first-order) Euler methods might be second-order accurate, and this is true. But it then goes by a different name, the trapezoid rule (TR), a method that we shall return to later (many times, in fact). We now move on to describe a few higher-order methods, both explicit and implicit.
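The orders of accuracy quoted above for FE, BE and their average (TR) can be checked numerically. The following is a minimal Python sketch, not part of the original text: the test problem ($\dot{y} = -\lambda y$ with $\lambda = 1$, integrated to $t = 1$), the step sizes, and the function names `integrate` and `observed_order` are all illustrative assumptions.

```python
import math

def integrate(method, lam, dt, t_end, y0=1.0):
    """Integrate dy/dt = -lam*y from t = 0 to t_end with a fixed step dt."""
    n_steps = round(t_end / dt)
    y = y0
    for _ in range(n_steps):
        if method == "FE":      # forward Euler, (2.7-5)
            y = (1.0 - lam * dt) * y
        elif method == "BE":    # backward Euler, (2.7-6), solved for y_{n+1}
            y = y / (1.0 + lam * dt)
        elif method == "TR":    # trapezoid rule: average of the FE and BE slopes
            y = y * (1.0 - 0.5 * lam * dt) / (1.0 + 0.5 * lam * dt)
        else:
            raise ValueError(method)
    return y

def observed_order(method, lam=1.0, t_end=1.0, dt=0.01):
    """Estimate the global order from the error ratio when dt is halved."""
    exact = math.exp(-lam * t_end)
    e_coarse = abs(integrate(method, lam, dt, t_end) - exact)
    e_fine = abs(integrate(method, lam, dt / 2.0, t_end) - exact)
    return math.log2(e_coarse / e_fine)

if __name__ == "__main__":
    for name in ("FE", "BE", "TR"):
        print(name, round(observed_order(name), 2))
```

Halving $\Delta t$ should roughly halve the global error of a first-order method and quarter that of a second-order one, which is what `observed_order` measures. For this decaying test problem the FE and BE errors also have opposite signs, in line with the sign change between (2.7-7) and (2.7-8).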
Also, it is important to point out that we will provide only a brief overview of (some) 'numerical' ODE theory; for basic, in-depth, authoritative discussions, see, for example, Gear (1971), Shampine and Gordon (1975), Butcher (1987), and Hairer et al. (1987).
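Before moving on, the stability statements made for the two Euler methods can be illustrated with their single-step amplification factors. This is a small Python sketch under stated assumptions: it uses the model problem $\dot{y} = -\lambda y$ with complex $\lambda$, and the helper names `xi_fe`, `xi_be` and `fe_dt_limit` are hypothetical, introduced only for this illustration.

```python
def xi_fe(lam, dt):
    """Forward Euler amplification factor y_{n+1}/y_n for dy/dt = -lam*y."""
    return 1.0 - lam * dt

def xi_be(lam, dt):
    """Backward Euler amplification factor y_{n+1}/y_n for dy/dt = -lam*y."""
    return 1.0 / (1.0 + lam * dt)

def fe_dt_limit(lam):
    """Largest stable FE step, 2*lam_r/(lam_r**2 + lam_i**2); zero if lam_r <= 0."""
    lam = complex(lam)
    return max(0.0, 2.0 * lam.real / abs(lam) ** 2)

if __name__ == "__main__":
    print(fe_dt_limit(4.0))            # real decay rate: limit is 2/lam
    print(abs(xi_fe(-3.0j, 0.1)))      # pure oscillation (lam = -i*omega): FE amplifies
    print(abs(xi_be(-3.0j, 0.1)))      # pure oscillation: BE damps
```

A step is stable when the modulus of the amplification factor does not exceed one. With $\lambda = -i\omega$ (so that $\dot{y} = i\omega y$), `xi_fe` has modulus greater than one for every $\Delta t > 0$ while `xi_be` has modulus less than one, matching the bracketed results above.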
2.7.1 Some Explicit ODE Methods

a. Second-order Adams-Bashforth (AB2), an 'explicit multi-step method'

Applied to (2.7-2), this method is
$$y_{n+1} = y_n + \frac{\Delta t}{2} (3\dot{y}_n - \dot{y}_{n-1}), \tag{2.7-9}$$
which is a two-step method (needing one extra 'history vector') with the following properties:
1. It is second-order accurate; the local error and the LTE are the same, and given by
$$\begin{aligned} d_n &= y_n + \frac{\Delta t}{2} (3\dot{y}_n - \dot{y}_{n-1}) - y(t_{n+1}) \\ &= y_n + \frac{\Delta t}{2} \left[ 3\dot{y}_n - \left( \dot{y}_n - \Delta t\, \ddot{y}_n + \frac{\Delta t^2}{2} \dddot{y}_n + O(\Delta t^3) \right) \right] - \left[ y_n + \Delta t\, \dot{y}_n + \frac{\Delta t^2}{2} \ddot{y}_n + \frac{\Delta t^3}{6} \dddot{y}_n + O(\Delta t^4) \right] \\ &= -\frac{5}{12} \Delta t^3 \dddot{y}_n + O(\Delta t^4), \end{aligned} \tag{2.7-10}$$
where we have used the exact solution at $t_{n-1}$ in order to invoke Taylor series.
2. It has a startup problem (not self-starting). Step 1 ($n = 0$) is therefore typically performed with FE, preferably with a smaller timestep.
3. It introduces a spurious/extraneous 'root,' derivable for $\dot{y} = -\lambda y$ by seeking a solution to (2.7-9) of the form $y_n = y_0 \xi^n$, which gives $\xi^{n+1} = \xi^n - (\lambda \Delta t/2)(3\xi^n - \xi^{n-1})$, with solutions $\xi_\pm = \frac{1}{2}\left(1 - \frac{3}{2}\lambda \Delta t \pm \sqrt{1 - \lambda \Delta t + \frac{9}{4}\lambda^2 \Delta t^2}\right)$, one of which ($\xi_+$) is 'physical' (it approximates the ODE solution, $y(t) = y_0 e^{-\lambda t}$) and the other (the $-$ sign) spurious/extraneous. (A $k$-step Adams-Bashforth method has $k - 1$ spurious solutions.) As $\Delta t \to 0$, however, so does the spurious root (a property of the Adams family of methods; all spurious roots $\to 0$ as $\Delta t \to 0$, as shown earlier in Figure 2.7-1, with $\xi$ replaced by $\sigma$). But as $\Delta t$ is increased, so too is the magnitude of one or both of $\xi_+$ or $\xi_-$, until eventually one of them 'punches through' the unit circle, giving instability. Whether the physical or spurious root goes unstable first depends on the relative magnitudes of $\omega$ and $r$ in (2.7-1), an issue we shall return to when applying AB2 to advection-diffusion in Section 2.7.6. If $\lambda = -i\omega$ (pure advection), then the physical root ($\xi_+$) is larger than 1 in magnitude for all $\Delta t > 0$; as for explicit Euler, which is also AB1, stability is lost for purely oscillatory solutions. (For $\lambda$ real, instability commences at $\lambda \Delta t = 1$, with $\xi_- = -1$.)

b.
Third-order Adams-Bashforth (AB3), another 'explicit multi-step method'

Applied to (2.7-2), it is
$$y_{n+1} = y_n + \frac{\Delta t}{12} (23\dot{y}_n - 16\dot{y}_{n-1} + 5\dot{y}_{n-2}), \tag{2.7-11}$$
which requires one more history vector ($\dot{y}_{n-2}$). But it offers a key advantage over AB2 (in addition to being third-order accurate): it is not unconditionally unstable for pure
advection. It is conditionally stable; $\omega \Delta t \lesssim 0.724$ [Durran (1991), a recent paper that, with what seem to be very good arguments, strongly advocates this method for pure advection]. It, of course, has a startup problem, and it introduces two extraneous roots (solutions), which can (as with AB2) cause stability as well as accuracy problems, as we shall demonstrate in Section 2.7.6. Startup could be done with two FE steps or one FE and one AB2 step, all at smaller $\Delta t$ if similar accuracy is to be preserved. The stability boundary for the first three AB methods is shown in Figure 2.7-3 (AB1 is forward Euler), taken from L-W. Ho (1989, Ph.D. Thesis), with permission. The curves plot the real and imaginary portions of $\lambda \Delta t$ for the ODE $\dot{y} = \lambda y$ for complex $\lambda$ (note the sign change, which is conventional in the ODE literature), and each method is stable only when $\lambda \Delta t$ lies on (the stability limit) or within the closed stability boundary. They are obtained by seeking a solution of the form $y_n = y_0 \xi^n$ and setting $\xi = e^{i\theta}$, which has $|\xi| = 1$. For example, applied to explicit Euler, which we have already derived in a different way, $y_{n+1} = (1 + \lambda \Delta t) y_n = \xi y_n = e^{i\theta} y_n$, or $\lambda \Delta t = e^{i\theta} - 1 = (\cos\theta - 1) + i \sin\theta$, which, as $\theta$ ranges from 0 to $2\pi$, describes the curve labeled AB1. For other stability curves, see, for example, Gear (1971) or Hairer and Wanner (1991). Note too the small regions in the right half plane in which AB3 would drive to zero the numerical solution of an unstable ODE; cf. the discussion of BE above.

c. Runge-Kutta methods (RK2, 4)

These one-step (and thus self-starting with no spurious roots) multi-stage methods are characterized by the fact that they involve evaluations of $f$ at intermediate values of $t$ in addition to those at the temporal mesh points, $t_n$.
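As an aside on the Adams-Bashforth stability statements above (AB1 and AB2 unstable for pure advection, AB3 stable for $\omega \Delta t \lesssim 0.724$, AB2 losing stability at $\lambda \Delta t = 1$ for real $\lambda$), a direct numerical experiment is easy to set up. The following Python sketch is illustrative only; it uses the ODE-literature sign convention $\dot{y} = \mu y$, starts from exact values rather than FE/AB2 start-up steps (an assumption made so that only the multi-step formula is exercised), and the names `adams_bashforth` and `final_amplitude` are hypothetical.

```python
import cmath

# Weights multiplying the slopes at t_n, t_{n-1}, ... in AB1, (2.7-9) and (2.7-11).
AB_WEIGHTS = {
    1: (1.0,),
    2: (3.0 / 2.0, -1.0 / 2.0),
    3: (23.0 / 12.0, -16.0 / 12.0, 5.0 / 12.0),
}

def adams_bashforth(order, mu, dt, n_steps):
    """Integrate dy/dt = mu*y, y(0) = 1, with AB1, AB2 or AB3.

    The first `order` values come from the exact solution exp(mu*t);
    this start-up choice is an assumption of the sketch.
    """
    w = AB_WEIGHTS[order]
    y = [cmath.exp(mu * dt * n) for n in range(order)]
    for n in range(order - 1, n_steps):
        slope = sum(w[j] * mu * y[n - j] for j in range(order))
        y.append(y[n] + dt * slope)
    return y

def final_amplitude(order, mu_dt, n_steps=400):
    """|y_n| after n_steps steps, with dt = 1 so that mu*dt = mu_dt."""
    return abs(adams_bashforth(order, mu_dt, 1.0, n_steps)[-1])

if __name__ == "__main__":
    # Pure advection corresponds to a purely imaginary mu*dt.
    print(final_amplitude(2, 0.5j))    # AB2: grows
    print(final_amplitude(3, 0.70j))   # AB3 just inside the quoted limit: decays
    print(final_amplitude(3, 0.75j))   # AB3 just outside the quoted limit: grows
```

Because the exact solution of $\dot{y} = i\omega y$ has unit modulus, any sustained growth of `final_amplitude` signals that a root of the method has left the unit circle.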
Fig. 2.7-3 Stability regions for first-, second- and third-order Adams-Bashforth formulas. (Reproduced by permission from L-W. Ho).

Since we have little experience with them but since they are extremely popular ODE methods in general (especially RK4, often called 'classical Runge-Kutta') and are reasonably popular in CFD (at least
for the advection terms), we shall summarize the lowest (second-order) member of the family, which itself describes a 'family' (as do RK3, RK4, etc.) in that it comes with a free parameter, $\gamma$, which may take on any value except zero (higher-order RK methods involve more free parameters):
$$\begin{aligned} \text{Stage 1:}\quad & y_{n+1/2\gamma} = y_n + \Delta t\, \dot{y}_n / 2\gamma, \\ \text{Stage 2:}\quad & y_{n+1} = y_n + \Delta t \left[ (1 - \gamma) \dot{y}_n + \gamma\, \dot{y}_{n+1/2\gamma} \right], \end{aligned} \tag{2.7-12}$$
where $\dot{y}_{n+1/2\gamma} = f(y_{n+1/2\gamma}, t_{n+1/2\gamma})$. When $\gamma = 1$, this RK2 method also goes by the name 'explicit midpoint rule', and for $\gamma = 1/2$, it has several aliases: modified Euler, modified (explicit) trapezoid rule, or one of Heun's methods. For $\gamma = 3/4$, a certain bound on the LTE is minimized; see Gear (1971) and/or Hairer et al. (1987) for details and further discussion, and also for other RK methods except RK4, which we shall later use:
$$\begin{aligned} \text{Stage 1:}\quad & y_1 = \Delta t\, \dot{y}_n, \\ \text{Stage 2:}\quad & y_2 = \Delta t\, \dot{y}(y_n + \tfrac{1}{2} y_1), \\ \text{Stage 3:}\quad & y_3 = \Delta t\, \dot{y}(y_n + \tfrac{1}{2} y_2), \\ \text{Stage 4:}\quad & y_4 = \Delta t\, \dot{y}(y_n + y_3), \\ & y_{n+1} = y_n + \tfrac{1}{6} (y_1 + 2y_2 + 2y_3 + y_4). \end{aligned} \tag{2.7-13}$$
A serious impediment to both explicit and implicit [not considered herein; see, for example, Butcher (1987)] RK methods, when one is interested in generating a variable-timestep ODE method, is that the LTE's are difficult, if not impossible, to estimate. (They are, however, good fixed-step integrators.) Finally, if $f(y, t) = i\omega y$ (advection), RK2 is unstable; higher-order RK methods, especially RK4, are then conditionally stable. See Hirsch (1988), Canuto et al. (1988a), or Vichnevetsky and Bowles (1982) for stability plots of the first four RK methods, and Hairer and Wanner (1991) for these and many more.

d. Leapfrog (another explicit midpoint rule)

We would surely be remiss in the eyes of many if we omitted this meteorologically (and oceanographically) popular, simple two-step method of second-order accuracy, even though it is sometimes not regarded as a 'legitimate' ODE 'method' by the experts.
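Returning briefly to the Runge-Kutta formulas (2.7-12) and (2.7-13), the following Python sketch implements one step of each and evaluates the single-step amplification for pure advection, $f = i\omega y$. It is a hedged illustration, not the authors' code: the function names are hypothetical, and the sample values of $\omega \Delta t$ (0.5, 2.8, 2.9) are chosen here only to bracket the behaviour described in the text.

```python
import math

def rk2_step(f, y, t, dt, gamma=1.0):
    """One step of the RK2 family (2.7-12); gamma must be non-zero."""
    f_n = f(y, t)
    y_mid = y + dt * f_n / (2.0 * gamma)              # Stage 1
    f_mid = f(y_mid, t + dt / (2.0 * gamma))
    return y + dt * ((1.0 - gamma) * f_n + gamma * f_mid)   # Stage 2

def rk4_step(f, y, t, dt):
    """One step of classical RK4, (2.7-13)."""
    k1 = dt * f(y, t)
    k2 = dt * f(y + 0.5 * k1, t + 0.5 * dt)
    k3 = dt * f(y + 0.5 * k2, t + 0.5 * dt)
    k4 = dt * f(y + k3, t + dt)
    return y + (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def amplification(step, omega_dt, **kwargs):
    """|y_1/y_0| for one step of dy/dt = i*omega*y, taking dt = 1."""
    f = lambda y, t: 1j * omega_dt * y
    return abs(step(f, 1.0 + 0.0j, 0.0, 1.0, **kwargs))

if __name__ == "__main__":
    for gamma in (0.5, 0.75, 1.0):
        print("RK2", gamma, amplification(rk2_step, 0.5, gamma=gamma))
    print("RK4", amplification(rk4_step, 2.8), amplification(rk4_step, 2.9))
```

For a linear ODE every member of the RK2 family has the same amplification factor, so the choice of $\gamma$ does not rescue RK2 for pure advection; RK4, by contrast, is amplifying only once $\omega \Delta t$ becomes large enough.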
[It is only 'weakly stable,' which often translates to (slightly) 'unstable' in practice.] It is
$$y_{n+1} = y_{n-1} + 2\Delta t\, \dot{y}_n, \tag{2.7-14}$$
which is not self-starting and displays one extraneous solution, often referred to as a time-splitting 'instability' (even and odd timesteps tend to look 'de-coupled'). Seeking a solution of the form $y_n = y_0 \xi^n$ for $\dot{y} = -\lambda y$ yields $\xi = -\lambda \Delta t \pm \sqrt{1 + \lambda^2 \Delta t^2}$, which, for $\lambda$ real, always has an unstable root; leapfrog is unconditionally unstable for diffusion. If, though, $\lambda = -i\omega$, then the roots are $\xi = i\omega \Delta t \pm \sqrt{1 - \omega^2 \Delta t^2}$, and the method is both stable and properly non-dissipative when $\omega \Delta t < 1$ [$|\xi| = 1$; this is the feature loved by meteorologists vis-a-vis virtually all other explicit methods; it also seems to be popular in plasma physics; see Miller (1991).] For $\omega \Delta t \ge 1$, it is unstable; $\omega \Delta t = 1$ is
unstable even though $|\xi| = 1$ because the root is not simple: $y_n = (1 - in) i^n$. $\xi_+$ is the physical root, and $\xi_-$ is the extraneous root. But since $\xi$ is unimodular (when stable), it is more convenient to use $\xi = e^{i\theta}$ to obtain $\xi_+ = e^{i\theta}$, where $\theta = \sin^{-1} \omega \Delta t$, and $\xi_- = -\sqrt{1 - \omega^2 \Delta t^2} + i\omega \Delta t = (-1)(\sqrt{1 - \omega^2 \Delta t^2} - i\omega \Delta t) = -e^{-i\theta}$. To see the time-splitting (Lilly, 1965), we write the exact leapfrog solution to $\dot{y} = i\omega y$ as $y_n = a \xi_+^n + b \xi_-^n = a e^{in\theta} + (-1)^n b e^{-in\theta}$, where $a$ and $b$ are determined from $y_0$ and $y_1 = y_0 + i\omega \Delta t\, y_0$, which comes from explicit Euler on the first step; the result is (for $y_0 = 1$ and $\omega \Delta t$ strictly less than 1)
$$y_n = \tfrac{1}{2}\left(1 + 1/\sqrt{1 - \omega^2 \Delta t^2}\right) e^{in\theta} + (-1)^n\, \tfrac{1}{2}\left(1 - 1/\sqrt{1 - \omega^2 \Delta t^2}\right) e^{-in\theta},$$
vis-a-vis the exact solution, $y(t_n) = e^{i\omega n \Delta t}$; this also corrects a small error in Lilly's paper (he had $\sqrt{1 - \omega^2 \Delta t^2}$ rather than its inverse). If $\omega \Delta t$ is not 'sufficiently small,' then the even and odd timesteps will differ noticeably because of the oscillating extraneous root; this is 'time-splitting.' To see how meteorologists cope with this problem, via 'time-filtering,' see, for example, Durran (1991). The stability boundary, obtained by setting $\xi = e^{i\theta}$ (so that $|\xi| = 1$) in the leapfrog solution to $\dot{y} = \lambda y$, is found to be $\lambda \Delta t = i \sin\theta$, which is merely a small piece of the imaginary axis. If there is any diffusion at all [$0 < r < \infty$ in (2.7-1)], leapfrog is outside of its stability boundary; the extraneous root will exceed one in magnitude. For further discussion of leapfrog for non-linear equations, see Sanz-Serna (1985).

e. Rational Runge-Kutta (RRK)

We end our brief introduction to explicit methods with a particularly suspicious one, and we include it only to steer the reader away from it because (and in spite of the fact that) several recent references (Wambecq, 1978; Hairer, 1980; Liu and Zhang, 1984; Liu et al., 1984; Pironneau, 1989) have advocated it (probably some at least without testing it).
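The leapfrog results of the preceding subsection (the two roots, and the closed-form solution that exhibits time-splitting) can be checked against a direct march of (2.7-14). The Python sketch below is illustrative: the function names are hypothetical, and $\omega = 1$, $\Delta t = 0.5$ in the usage lines are arbitrary sample values with $\omega \Delta t < 1$.

```python
import cmath
import math

def leapfrog_roots(lam_dt):
    """Roots xi = -lam*dt +/- sqrt(1 + (lam*dt)**2) of leapfrog for dy/dt = -lam*y."""
    z = complex(lam_dt)
    s = cmath.sqrt(1.0 + z * z)
    return (-z + s, -z - s)

def leapfrog(omega, dt, n_steps):
    """March (2.7-14) for dy/dt = i*omega*y, y_0 = 1, first step by forward Euler."""
    y = [1.0 + 0.0j, 1.0 + 1j * omega * dt]
    for n in range(1, n_steps):
        y.append(y[n - 1] + 2.0 * dt * 1j * omega * y[n])
    return y

def leapfrog_closed_form(omega, dt, n):
    """Closed-form leapfrog solution quoted in the text (requires omega*dt < 1)."""
    s = omega * dt
    theta = math.asin(s)
    inv_c = 1.0 / math.sqrt(1.0 - s * s)
    a = 0.5 * (1.0 + inv_c)     # amplitude of the physical mode
    b = 0.5 * (1.0 - inv_c)     # amplitude of the extraneous (sign-flipping) mode
    return a * cmath.exp(1j * n * theta) + (-1) ** n * b * cmath.exp(-1j * n * theta)

if __name__ == "__main__":
    ys = leapfrog(1.0, 0.5, 10)
    for n in (8, 9, 10):
        print(n, abs(ys[n]), abs(ys[n] - leapfrog_closed_form(1.0, 0.5, n)))
```

Printing $|y_n|$ for consecutive steps shows the even/odd modulation produced by the $(-1)^n$ term, even though both roots have unit modulus.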
They are non-linear RK methods that are unconditionally stable (hence the interest), of which we will show just a two-stage, second-order method (from Hairer, 1980): on $\dot{y} = f(y)$, it is
$$y_{n+1} = y_n + \Delta t\, \dot{y}_n\, \frac{\dot{y}_n}{2\dot{y}_n - f\!\left(y_n + \frac{\Delta t}{2} \dot{y}_n\right)}, \tag{2.7-15}$$
where the two stages have been combined. While explicit and unconditionally stable, we eschew it for the following reasons, all of which we learned from others:
1. In spite of Hairer's paper (1980), 'Unconditionally Stable Explicit Methods for Parabolic Equations,' in which stiff ODE's were addressed, a follow-up paper by Sottas (1984), in which Hairer served as a 'consultant' (see Acknowledgements), has the following title, 'Rational Runge-Kutta methods are not suitable for stiff systems of ODE's.' [We will discuss 'stiffness' in more detail below; for now, just imagine that it is necessary to solve a pair of ODE's like (2.7-1) simultaneously and that the ratio of the two $r$'s is $\gg 1$.]
2. In another paper whose title also tells the story, 'On the lack of convergence of unconditionally stable rational Runge-Kutta schemes,' Pierce and Prevost (1986) state, 'This paper establishes that the poor performance of the RRK scheme, which has been described as a 'loss of accuracy' by Liu et al. (1984), is in fact due to a lack of convergence. Since the method is unconditionally stable, the loss of accuracy does not manifest itself in a blow-up of the solution or wild oscillations.'
3. At the end of a brief, unpublished analysis, A. Hindmarsh (1985, personal communication) concluded that 'So it seems that the non-stiff components are not being approximated even to order 1.'
4. In a short note on the subject, Fried (1984) states, 'Unconditionally stable (semi-) explicit integration schemes for stiff systems of equations as generated by the finite element analysis of elasto dynamics and non-stationary heat transfer are shown to suffer from the Dufort-Frankel-Saulev syndrome, whereby coupling between the space and time discretizations may have a ruinous effect on the accuracy of the computations.' For the heat equation, he refers to the RRK method, and the 'syndrome' referred to is that consistency as the finite element mesh is refined is obtained only if $\Delta t/\Delta x \to 0$ [cf. forward Euler for the heat equation; here stability requires $\Delta t/\Delta x < O(\Delta x)$ as $\Delta x \to 0$].

2.7.2 Application to Advection-Diffusion (Scalar Transport)

We now address the interesting problem of solving the semi-discrete AD equations (the ODE's) for transient situations. A very common occurrence in CFD is that wherein over large portions of the computational domain the main job of the fluid is just to carry its load (this is advection); i.e., $Pe \gg 1$, and the passive scalar 'simply' tags along with the fluid (for the 'ride'), except near certain boundaries, where the diffusion process can also be quite strong. But, over large portions of many flow domains, the scalar transport equation is, effectively, $\partial T/\partial t + \mathbf{u} \cdot \nabla T = 0$. While the analytic solution of this equation is extremely simple, the numerical solutions are very definitely not; indeed, the design and analysis of numerical schemes for 'solving' the advection equation might well have more person-hours invested in it than has any other linear and scalar PDE.
The analytical solution is simple (in concept, at least) because the equation just represents a Lagrangian, or substantial, derivative; $DT/Dt = 0$ is a derivative following the motion, and thus $T$ does not change along a streamline. In fact, in a later section (Section 2.7.7), we shall return to this concept; for now, however, we address the solution of these equations in their Eulerian form.

a. Generalities

Switching from general (and uncoupled) ODE's to the specific (and coupled) GFEM ODE's governing scalar transport, (2.2-7), we note immediately the obvious fact that these ODE's 'look' implicit (coupled) in that even the 'simple' evaluation of $f(y, t)$ à la explicit ODE methods goes over to: solve $M\dot{y} = f(y, t)$ for $\dot{y}$. Another fact is that three of the simplest ODE methods discussed above, forward Euler (which is also the first-order Adams-Bashforth method and the first-order Runge-Kutta method), AB2, and RK2, are unconditionally unstable in the absence of diffusion (and, of course, in the absence of an advection scheme that introduces diffusion, such as upwinding and its offshoots). But if diffusion is present (and it usually is), then these schemes are at least conditionally stable, and if one insisted upon the simplicity of an explicit scheme, one could easily solve for the needed values of $\dot{y}$ using but two or three iterations of the diagonally scaled conjugate gradient (DSCG) method (see Volume II, and Wathen, 1991). But, as will soon be demonstrated, the conditional stability ($\Delta t$ must be less than some critical value, $\Delta t_{crit}$) is often so restrictive as to require an inordinate number of timesteps to complete the desired time integration. And then even the few DSCG iterations per timestep might
drive one to 'mass lumping,' which brings a pair of bonuses: (i) $\Delta t_{crit}$ is larger and (ii) the 'implicit' ODE's are converted to explicit ones in that $M^{-1} f(y, t)$ is a cheap operation. Indeed, for transient diffusion problems, or even diffusion-dominated (small Peclet number) problems, mass lumping is often employed (when 'legitimate'; see below) and is not very deleterious. But for advection-dominated flows, it usually is, as we will soon show, even though the phase speed and group velocity results shown earlier should suffice.

b. Lumping the mass

Since many have done it in the past, and in spite of our aversion to it, we discuss briefly some mass lumping methods that have been employed (recall that all are ad hoc); the coverage is incomplete at best. [For further information on this subject, see Hughes (1987), Fried and Malkus (1975), Cook et al. (1989), and Zienkiewicz and Taylor (1991).] One of the arguments, hopefully nearly passé, goes like a finite difference vs finite element argument: 'The extra speed of my resulting algorithm will permit me to use rather more elements than otherwise, thus compensating for loss of accuracy by a gain in mesh density.' Some mass lumping methods:
1. For the Lagrange family of quadrilateral elements, the procedure called 'row summing' is effective; the rows of the consistent mass matrix are summed and the result placed on the diagonal, with (of course) zeros placed in all other locations; i.e., $M^e_{ii} = \sum_j \int_e \phi_i \phi_j = \int_e \phi_i$, since $\sum_j \phi_j = 1$.
2. There is no procedure available, to our knowledge, that is successful for the serendipity elements (e.g., the eight-node element in 2D) when the advection terms are important (high $Pe$) and the Galerkin method is employed; see Section 2.3.4. Not all elements are susceptible to mass lumping.
3. For the linear triangle, the procedure developed by Winslow (1967) may be more effective (S.
Sackett, personal communication) than the commonly advocated row-sum technique, which always places one third of the element mass at each node. Here, the mass is distributed among the three nodes in proportion to the fraction of relevant area associated with each node, this area being determined by forming the perpendicular bisectors from each side. For right triangles, this leads to the placement of one-half of the mass at the node with the right angle, and one-quarter at the other two. This proportion is also used for obtuse triangles, wherein the three bisectors do not intersect inside the triangle. 4. For the quadratic, six-node triangle, the row-sum technique is completely inappropriate using GFEM since the vertex nodes are assigned zero mass. Donea et al. (in Hughes, 1979) have suggested a modified, weighted residual method (Petrov-Galerkin) which, upon row summing, places 4/15 of the total mass at each midside node and 1/15 at each vertex. 5. Nodal quadrature is an approximate integration method that, by construction, gives a diagonal mass matrix; see, for example, Fried and Malkus (1975), Gray (1977), and Zienkiewicz and Taylor (1989, 1991). (Figure A8.2 of Vol. 1 and the figure on p. 321 of Vol. 2 have an error for the eight-node element result; 8/36 should be 8/38, and 1/36 should be 3/76. Also, the ninth node, with weight 16/36, should be added to the nine-node element in the same figure, Figure A8.2.)
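Two of the lumping procedures in the list above are simple enough to sketch for the linear triangle: row summing (item 1, which gives one third of the element mass per node) and the perpendicular-bisector distribution of item 3. The Python below is an illustration under stated assumptions, not the original authors' procedure: the consistent mass matrix of the linear triangle is taken as $(A/12)$ times a matrix with 2 on the diagonal and 1 elsewhere, the bisector areas are computed with the standard cotangent formula (valid only for non-obtuse triangles), unit density is assumed, and all function names are hypothetical.

```python
def triangle_area(xy):
    """Area of a triangle given as three (x, y) vertex pairs."""
    (x0, y0), (x1, y1), (x2, y2) = xy
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))

def consistent_mass(xy):
    """Consistent mass matrix of a linear triangle (unit density assumed)."""
    area = triangle_area(xy)
    return [[area / 12.0 * (2.0 if i == j else 1.0) for j in range(3)]
            for i in range(3)]

def row_sum_lump(mass):
    """Diagonal entries of the row-summed (lumped) mass matrix."""
    return [sum(row) for row in mass]

def bisector_weights(xy):
    """Nodal areas bounded by the perpendicular bisectors (non-obtuse triangles)."""
    w = [0.0, 0.0, 0.0]
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3
        # Cotangent of the angle at vertex i, which is opposite edge (j, k).
        ux, uy = xy[j][0] - xy[i][0], xy[j][1] - xy[i][1]
        vx, vy = xy[k][0] - xy[i][0], xy[k][1] - xy[i][1]
        cot_i = (ux * vx + uy * vy) / abs(ux * vy - uy * vx)
        edge_sq = (xy[j][0] - xy[k][0]) ** 2 + (xy[j][1] - xy[k][1]) ** 2
        # Edge (j, k) contributes |edge|^2 * cot(opposite angle) / 8 to each end.
        w[j] += edge_sq * cot_i / 8.0
        w[k] += edge_sq * cot_i / 8.0
    return w

if __name__ == "__main__":
    right_triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    print(row_sum_lump(consistent_mass(right_triangle)))
    print(bisector_weights(right_triangle))
```

For the right triangle in the usage lines, row summing assigns equal mass to the three nodes, whereas the bisector construction assigns half of the element area to the right-angle node and a quarter to each of the others, as described in item 3.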
To our knowledge, none of the above lumping schemes for triangular elements has yet been tested numerically on the advection-diffusion equation. The use of LM (at least for the simplest elements) has an additional advantage with respect to the stability-limited timestep size in the low $Pe$ range: for diffusion-dominated flows, the stability limit ($\Delta t_{max}$) of the explicit Euler scheme is, in 1D at least, three times larger than that when CM is employed (for high $Pe$, both schemes have the same stability limit, $\Delta t \le 2\kappa/u^2$, in 1D, a result we shall soon derive). Also, for $Pe \ll 1$, the LM results are not necessarily less accurate than CM (phase error is less important), although they could be misleading since they tend to suppress important wiggle signals (Gresho and Lee, 1981). All things considered, however, the LM method (even using more elements) can sometimes be more cost-effective than CM when explicit time integration is employed.

c. Stability estimates and the case for implicit methods

The most common and the most useful method for analyzing stability of numerical approximations to PDE's is the so-called von Neumann method, which is based on Fourier analysis. Before presenting our first application of this method, let us point out that while extremely important, it does have a few (seemingly serious) shortcomings:
1. It requires a uniform mesh and periodic BC's (or, what is the same thing for Fourier analysis, no BC's and an infinite spatial domain), neither of which occurs much in practice.
2. It is often difficult to apply in multi-dimensions (but then, so too are alternative methods).
But it turns out that von Neumann stability results are always necessary conditions, at least on a uniform mesh, regardless of BC's.
Alternative methods of analysis also exist, such as the matrix method (in which the eigenvalues of the appropriate matrix are estimated and from which stability estimates obtain) and the energy method (in which the 'energy,' $\frac{1}{2} y^T M y$ in the ODE system $M\dot{y} = f$, a positive-definite quantity, is to remain bounded). We shall discuss these methods a little bit, but not as much as von Neumann, which we now apply to the 1D AD equation discretized via linear elements on a uniform grid with periodic BC's and constant coefficients (velocity, diffusivity) [cf. (2.3-10)], time integrated via forward Euler, and with no source term (stability, or otherwise, of numerical time-integration methods is independent of forcing functions):
$$\frac{1}{6\Delta t} \left[ (T_{j-1}^{n+1} - T_{j-1}^{n}) + 4 (T_{j}^{n+1} - T_{j}^{n}) + (T_{j+1}^{n+1} - T_{j+1}^{n}) \right] + \frac{u}{2h} (T_{j+1}^{n} - T_{j-1}^{n}) = \frac{\kappa}{h^2} (T_{j-1}^{n} - 2T_{j}^{n} + T_{j+1}^{n}), \tag{2.7-16}$$
where $T_j^n$ denotes the value of $T$ at node $j$ and time $t_n$. It is convenient at this stage to introduce two common dimensionless parameters,
$$c \equiv u \Delta t / h \quad \text{and} \quad a \equiv 2\kappa \Delta t / h^2, \tag{2.7-17}$$
the first of which has a name and the second of which generally does not, although some (e.g., Roache, 1982) call it the diffusion number or the diffusion parameter. $c$ is called the
Courant number [or CFL number, after the famous paper by Courant et al. (1928)] and denotes the travel distance measured in mesh spacing during one timestep; $c = 1$ means the fluid moves one grid length per timestep. Rearranging (2.7-16) and introducing $c$ and $a$ gives
$$\frac{1}{6} (T_{j-1}^{n+1} + 4T_{j}^{n+1} + T_{j+1}^{n+1}) = \frac{1}{6} (T_{j-1}^{n} + 4T_{j}^{n} + T_{j+1}^{n}) + \frac{a}{2} (T_{j-1}^{n} - 2T_{j}^{n} + T_{j+1}^{n}) - \frac{c}{2} (T_{j+1}^{n} - T_{j-1}^{n}), \tag{2.7-18}$$
whose solution we seek in the form
$$T_j^n = \xi^n e^{ij\theta}, \tag{2.7-19}$$
where $\theta = kh$ is the dimensionless wave number [$k$ is the ('user-selected') actual wave number of the 'sinusoidal' IC that is part and parcel of Fourier analysis], and $\xi$ is the ('von Neumann') amplification factor. If we find such a solution, it is clear, since $e^{ij\theta}$ is unimodular, that the magnitude of $\xi$ relative to unity will determine whether the numerical solution grows or decays. Since $\xi$ is generally complex, the von Neumann stability criterion is
$$|\xi|^2 = \xi \bar{\xi} \le 1, \tag{2.7-20}$$
where the overbar denotes a complex conjugate.

Remarks:
(1) This is actually a modified von Neumann criterion, as the original criterion would be a bit more generous, $|\xi| \le 1 + O(\Delta t)$, permitting growth of the numerical solution even when the true (ODE) solution does not grow. Richtmyer and Morton (1967) and Morton (1971, an excellent paper) call this the 'practical' stability requirement, one that precludes the numerical solution from growing faster than the true solution. We do, however, take issue with one of the points made by Morton in that paper; he blames the trapezoid rule for phase error that is more properly ascribed to the second-order, centered, spatial differencing employed to obtain the results in his Figure 2; those for a Courant number of 0.25 and 0.50 being virtually the same indicates that the 'inaccurate' ODE's were actually solved accurately by TR.
(2) The wave number selection should actually not be arbitrary; rather, because of discrete Fourier analysis (or because of the periodic BC's) it should be of the form $2\pi n$, where $n$ is an integer; i.e., for a mesh with $N$ nodes ($h = 1/N = \Delta x$), the mesh can only support a finite set ($N$) of wave numbers $n = n_0, n_0 + 1, \ldots, n_0 + N - 1$, where $n_0$ is any integer. But it is common practice, and much easier, if the analysis regards $k$ as a continuous variable over the range 0 to $\pi/\Delta x$ (the '$2\Delta x$' wave), and that is what we shall do. But since being 'easy' is not sufficient justification, we also point out that for large $N$ at least, the results of the two analyses differ only by $O(h^2)$; see Paolucci and Chenoweth (1982) and Hindmarsh et al. (1984).
(3) Similar von Neumann analyses can be done (but not as easily) for other explicit methods, but the end result will usually be the same: conditional stability (with different 'constants'). We shall perform some such analyses in a later section, Section 2.7.6.
Anyway, inserting (2.7-19) into (2.7-18) leads to the von Neumann amplification factor equation,
$$\frac{1}{6} (e^{-i\theta} + 4 + e^{i\theta})\, \xi = \frac{1}{6} (e^{-i\theta} + 4 + e^{i\theta}) + \frac{a}{2} (e^{-i\theta} - 2 + e^{i\theta}) - \frac{c}{2} (e^{i\theta} - e^{-i\theta}),$$
which can be simplified and rearranged to give the final result:
$$\xi = \frac{\dfrac{2 + \cos\theta}{3} - a (1 - \cos\theta) - ic \sin\theta}{\dfrac{2 + \cos\theta}{3}}, \tag{2.7-21}$$
and thus
$$|\xi|^2 = \frac{\left(\dfrac{2 + \cos\theta}{3}\right)^2 - 2a \left(\dfrac{2 + \cos\theta}{3}\right) (1 - \cos\theta) + a^2 (1 - \cos\theta)^2 + c^2 (1 - \cos^2\theta)}{\left(\dfrac{2 + \cos\theta}{3}\right)^2}. \tag{2.7-22}$$
Requiring $|\xi|^2 \le 1$ leads easily to
$$c \le \frac{2P (1 - \cos\theta)(2 + \cos\theta)/3}{(1 - \cos\theta)^2 + P^2 (1 - \cos^2\theta)}, \tag{2.7-23}$$
where
$$P \equiv u \Delta x / 2\kappa \tag{2.7-24}$$
is called the grid Peclet number. (Note that $aP = c$.) This is our 'CFL' stability limit in terms of the arbitrarily variable dimensionless wave number, $\theta$. If $\theta \ne 0$, then it can be simplified to
$$c \le \frac{2P (2 + \cos\theta)/3}{1 - \cos\theta + P^2 (1 + \cos\theta)} \equiv f(P, \theta). \tag{2.7-25}$$
Studying the function $f(P, \theta)$ over the range $0 < \theta \le \pi$ and all $P \ge 0$, or plotting it as a function of $P$ with $\theta$ as a parameter as in Figure 2.7-4 (all curves passing through $f = 1/\sqrt{3}$ at $P = \sqrt{3}$), yields the desired stability results. The key point, and the final result of this stability analysis, is that the two curves defined by $\theta \to 0$ and $\theta = \pi$ are, for $P > \sqrt{3}$ and $P < \sqrt{3}$, respectively, the lowest of all $f(P, \theta)$. Thus, the stability limits are the following:
$$\text{(i)}\quad c \le P/3 \quad \text{for } P < \sqrt{3}, \tag{2.7-26}$$
$$\text{(ii)}\quad c \le 1/P \quad \text{for } P > \sqrt{3}, \tag{2.7-27}$$
and we make the following

Remarks:
(1) The diffusion limit, $c \le P/3$, translates to $\Delta t \le \Delta x^2/6\kappa$, and the $\theta = \pi$ boundary curve suggests (properly) that it is the ubiquitous $2\Delta x$ wave that goes unstable first. That pure diffusion requires $\Delta t \to 0$ as $\Delta x \to 0$ can perhaps be better appreciated if FE is attempted on the heat equation with the spatially continuous
Laplacian: $(T_{n+1} - T_n)/\Delta t = \kappa\nabla^2 T_n$, which, at least up to BC's, implies that $T_{n+1} = (I + \Delta t\,\kappa\nabla^2)^{n+1} T_0$, which 'blows up' for all $\Delta t > 0$ because $(I + \Delta t\,\kappa\nabla^2)$ is an unbounded operator.

[Fig. 2.7-4 Forward Euler stability diagram: $f(P, \theta)$ versus grid Peclet number $P$.]

(2) The pure advection limit ($P \to \infty$) is unstable for all $\Delta t$, a reflection of the fact that the explicit Euler method is unconditionally unstable for ODE's with purely imaginary eigenvalues. That it is bounded by the $\theta \to 0$ curve suggests, again properly, that the longest waves are the most unstable. This 'advection-diffusion' forward Euler stability limit ($c \le 1/P$ or $\Delta t \le 2\kappa/u^2$) also applies to higher-order centered difference approximations, and presumably (probably) to higher-order GFEM's, since it is basically an 'asymptotic' result ($\theta \to 0$ is the most unstable case); i.e., it even applies to explicit Euler with exact spatial representation, in the advection-dominated limit.

(3) If the mass is lumped, then the $(2 + \cos\theta)/3$ factor in the above equations is replaced by unity, as is the factor of three in the final stability results. That is, the consistent mass matrix reduces (only) the diffusional stability limit by a factor of three.

(4) The lumped mass result is of course also the second-order, centered, finite-difference result, for which erroneous stability results have appeared in the literature that asserted instability if $P > 2$, regardless of $\Delta t$; see Hindmarsh et al. (1984) and Hirsch (1988) for details, and see Thompson et al. (1985), whose title tells all: 'The cell Reynolds number myth.'

(5) The advection-dominated stability limit, $c \le 1/P$, or, equivalently, $\Delta t \le 2\kappa/u^2$, is particularly distressing when $P \gg 1$; in the next section we will show how to improve this to the much more tolerable limit, $c \le 1$.
It is interesting, even if not too 'relevant,' that the FE stability limit is as 'non-physical' and non-intuitive as was the advective-diffusive time scale discussed in Section 2.6.2g. See too E and Liu (1996) on this issue.

[Fig. 2.7-4 curve labels: $1/P$ ($\theta = 0$); $P/3$ ($\theta = \pi$).]
(6) Recalling the discussion following (2.4-1), in which the time constant for the ODE at the outflow boundary is (for the LM version, for simplicity) $\tau = (2\kappa/\Delta x^2 + u/\Delta x)^{-1}$, and recalling the stability result for FE of $\Delta t \le 2\tau$, gives (approximately, since the outflow ODE is actually coupled to all of the others) $c \le \sim 2P/(1 + P)$, which is less restrictive than the LM version of (2.7-26), $c \le P$; i.e., the OBC ODE appears not to further reduce the allowable $\Delta t$ for explicit methods even though the time constant tends to zero with mesh refinement.

The multi-dimensional analog of the above stability analysis, while vitally important, has not yet been successfully completed. We [PMG, with A. Hindmarsh and D. Griffiths; see Hindmarsh et al. (1984)] have tried but failed for the GFEM case (except for some special cases) with bilinear basis functions. But we did succeed for the lumped mass case and the finite difference (five-point) Laplacian, for which the following rigorous results were obtained: the FTCS (forward-time-centered space) method is stable provided (if and only if)

(i) $\Delta t \le 1\Big/\left[2\kappa\sum_{j=1}^{n_s}(1/\Delta x_j)^2\right]$ or $\sum_{j=1}^{n_s}\alpha_j \le 1$ (2.7-28)

and

(ii) $\Delta t \le 2\kappa\Big/\sum_{j=1}^{n_s}u_j^2$, (2.7-29)

where $n_s$ is the spatial dimensionality (1, 2, or 3). Note the obvious similarity to the easy-to-obtain 1D results. It follows that (2.7-28) prevails when all $P_j$ (grid Peclet number in the $j$-th direction) are $< 1$ and (2.7-29) when all are $> 1$. The real bottom line here, which must generalize to the true GFEM case even though the details are not yet available, is this: there is a diffusional stability limit on $\Delta t$ that becomes severe for small elements, and there is an 'advection-diffusion' stability limit that becomes severe when grid Peclet numbers are much larger than unity. For some recent, analogous results when leapfrog time integration is used, for a variety of spatial FDM's, see Kwok and Tam (1993).
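The 1D limits (2.7-26) and (2.7-27), and the multi-dimensional lumped-mass limits (2.7-28) and (2.7-29), are easy to spot-check numerically. The following is only a minimal sketch under stated assumptions, not code from the references: the function names are hypothetical, the amplification factor is taken from (2.7-21) with $\alpha = c/P$, and the maximum of $|\xi|$ is merely sampled on a uniform grid in $\theta$ rather than found analytically.

```python
import math

def amplification_factor(c, P, theta, lumped=False):
    """Forward Euler amplification factor for linear elements, cf. (2.7-21)."""
    alpha = c / P  # follows from alpha * P = c
    # Symbol of the mass matrix: (2 + cos(theta))/3 if consistent, 1 if lumped.
    mass = 1.0 if lumped else (2.0 + math.cos(theta)) / 3.0
    return 1.0 - (alpha * (1.0 - math.cos(theta)) + 1j * c * math.sin(theta)) / mass

def max_growth(c, P, lumped=False, n_theta=2000):
    """Largest |xi| sampled on 0 < theta <= pi (a sampled, not exact, maximum)."""
    return max(abs(amplification_factor(c, P, math.pi * k / n_theta, lumped))
               for k in range(1, n_theta + 1))

def courant_limit(P, lumped=False):
    """1D limits (2.7-26)/(2.7-27); lumping the mass drops the factor of three."""
    return min(P if lumped else P / 3.0, 1.0 / P)

def ftcs_dt_limit(kappa, dx, u):
    """Lumped-mass FTCS limits (2.7-28) and (2.7-29); both must hold."""
    dt_diffusion = 1.0 / (2.0 * kappa * sum(1.0 / h ** 2 for h in dx))
    speed2 = sum(v * v for v in u)
    dt_advection = 2.0 * kappa / speed2 if speed2 > 0.0 else math.inf
    return min(dt_diffusion, dt_advection)

if __name__ == "__main__":
    for P in (0.5, math.sqrt(3.0), 5.0):
        c_max = courant_limit(P)
        # Slightly below the limit |xi| stays <= 1; slightly above it exceeds 1.
        print(P, c_max, max_growth(0.99 * c_max, P), max_growth(1.05 * c_max, P))
```

Sampling $\theta$ is a deliberate simplification; it suffices here because the critical waves are $\theta = \pi$ (included exactly) and $\theta \to 0$ (approached by the smallest samples).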
For the record (and/or the challenged reader), the stability condition for

$$\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} + v\frac{\partial T}{\partial y} = \kappa_1\frac{\partial^2 T}{\partial x^2} + 2\kappa_{12}\frac{\partial^2 T}{\partial x\,\partial y} + \kappa_2\frac{\partial^2 T}{\partial y^2}$$

for bilinear elements on a mesh of uniform rectangles, which has resisted our efforts to 'solve,' is:

$$\left[1 - g_1(\theta_2)(1 - \cos\theta_1) - g_2(\theta_1)(1 - \cos\theta_2) - \alpha_{12}\sin\theta_1\sin\theta_2\right]^2 + \left[h_1(\theta_2)\sin\theta_1 + h_2(\theta_1)\sin\theta_2\right]^2 \le 1,$$

where $\theta_i = k_i\Delta x_i$, $g_i(\theta_j) = \alpha_i(2 + \cos\theta_j)/3$, where $\alpha_i = 2\kappa_i\Delta t/\Delta x_i^2$ and $\alpha_{12} = 2\kappa_{12}\Delta t/\Delta x\,\Delta y$, and $h_i(\theta_j) = \beta_i(2 + \cos\theta_j)/3$, where $\beta_i = u_i\Delta t/\Delta x_i$, and $|\theta_i| \le \pi$.

We now (somewhat boldly) apply some of these stability results to two hypothetical, but hopefully practical, cases, using a graded mesh (and even variable velocities), simply in order to show (or at least suggest) that explicit methods can sometimes be too expensive. In general, this situation can occur when the stability $\Delta t$ is very small relative to that required for an accurate solution and/or to the total integration time required.

1. Pure diffusion (Pe = 0). Consider the transient heat equation (even in 1D) and a step change (or other rapid variation) in boundary temperature on a mesh that is highly graded (very small elements near the boundary) in order to obtain an accurate result near this boundary. If $L$ is the characteristic length and the solution is required for the full transient,
then the total number of timesteps will be $\sim \tau/\Delta t$, where $\tau \sim L^2/\kappa$ is the longest 'time constant.' Since $\Delta t \sim \Delta x_{\min}^2/\kappa$, we have $\tau/\Delta t \sim (L/\Delta x_{\min})^2$; a finely graded mesh with fewer than even ~100 nodes in the 'x-direction' could easily result in $\Delta x_{\min}/L < 0.001$, giving $\tau/\Delta t > 10^6$. [$(L/\Delta x_{\min})^2$ in this case is a measure of the 'stiffness' of the problem.]

2. Flow past an obstacle. Consider steady flow around an obstacle such as a cylinder. At $t = 0$, a temperature anomaly, or hot-spot, is introduced into the otherwise isothermal flow field at the inlet and approximately on the 'axis' (i.e., it is transported toward the obstacle); it is desired to find how much heat is transferred to the (isothermal, say) obstacle and the downstream temperature distribution for two cases: low (0.01) and high ($10^4$) Peclet numbers, $UD/\kappa$, where $D = O(1)$ is the 'size' of the obstacle. Since an accurate solution is desired, a graded mesh is again employed such that $\Delta x_{\min}/D \sim 0.001$ (near the obstacle) and $\Delta x_{\max}/D \sim 1$ (far downstream). Consider a total domain length of ~$10D$ and a nominal (characteristic) velocity of $O(1)$; thus, the total mesh transport time, $\tau$, is ~10. Using the 1D stability results as a first approximation, we have $P = (\Delta x/2D)\,\mathrm{Pe}$, giving $P_{\min} = 0.0005\,\mathrm{Pe}$. For Pe = 0.01, the diffusion limit gives

$$\Delta t = \frac{\Delta x_{\min}^2}{2\kappa} = \tfrac{1}{2}\times 10^{-6}\,D^2/\kappa = \tfrac{1}{2}\times 10^{-6}\,\mathrm{Pe}\,\frac{D}{U} = 0.5\times 10^{-8},$$

and ~$10^9$ steps would be required for the simulation. On the other hand, for Pe = $10^4$, the advection-diffusion limit (using $\Delta x_{\min}$) gives $\Delta t \le (2/\mathrm{Pe})(D/U) = 2\times 10^{-4}$, which is controlling. In this case, 'only' $10^5$ or so timesteps would be required.

In general, a strongly graded mesh (often required for optimum spatial accuracy) and a low Pe lead to expensive explicit calculations. On the other hand, a rather uniform mesh and a moderate or high Pe may not be so bad; i.e., explicit methods are more affordable (and can be more cost-effective) for certain 'hyperbolic' cases.

d. Matrix method of stability analysis

An alternate method of stability analysis that is sometimes useful, partly because BC's other than periodic can, in principle, be easily accommodated, is based on estimating either the spectral radius or an appropriate norm of the matrix that relates the solution from time level $t_n$ to that at $t_{n+1}$. We provide a bare introduction to this topic in this small section. For further details, see, for example, the book by Hirsch (1988), and the following papers: Hindmarsh et al. (1984), Griffiths et al. (1980), and Morton (1980), which contains the following important remark: 'Unfortunately, most of the analysis was based on the so-called matrix method, and an associated concept of stability which is misleading both in theory and practice for such problems.'

The 'matrix method' begins with the system of ODE's, such as (2.2.7), and applies the ODE method of interest (say FE, for simplicity) to obtain (for constant forcing)

$$M T_{n+1} = \{M - \Delta t[N(u) + K]\}\,T_n + \Delta t\,f \tag{2.7-30}$$

and thus (see Hindmarsh et al., 1984)

$$T_n = E^n T_0 - (E^n - I)[N(u) + K]^{-1} f, \tag{2.7-31}$$

where

$$E \equiv I - \Delta t\,M^{-1}[N(u) + K] \tag{2.7-32}$$
is the 'amplification' matrix. Clearly both stability and the attainment of a steady solution require, for $n \to \infty$, $E^n \to 0$ (the zero matrix). This in turn requires $\|E\| < 1$ for some (induced) matrix norm, because $\|E\| < 1 \Rightarrow \|E^n\| < 1$, which then $\Rightarrow \|E^n T_0\| \le \|E^n\|\,\|T_0\| < \|T_0\|$. But since the norm of $E$ is usually very difficult to estimate relative to the spectral radius, $\rho(E) \equiv |\lambda_{\max}(E)|$, and since $\rho(E) \le \|E\|$, the matrix method of stability analysis is often/usually simply taken to be

$$\rho(E) < 1 \tag{2.7-33}$$

or, if $|\lambda_{\max}(E)|$ is a simple eigenvalue, $\rho(E) \le 1$. Some authors use $\rho(E) \le 1$ as the stability limit even when there are repeated eigenvalues of unit modulus; see Hindmarsh et al. (1984) to see what sort of trouble this more lax definition can cause.

Now, $\rho(E) < 1$ does guarantee that $E^n \to 0$ as $n \to \infty$ for fixed $N$ (ODE size) and fixed $\Delta t$. Thus, ultimately the satisfaction of (2.7-33) will assure stability. But what it does not assure is that $\|E\| < 1$ or $\|E^n\| < 1$, the violation of which, even with $\rho(E) < 1$, can cause large growth in $\|E^n\|$ before it peaks and turns around, even in the cases in which there are no repeated eigenvalues. Just such behavior was in fact demonstrated by Griffiths et al. (1980) and Hindmarsh et al. (1984) for a case in which [while $\rho(E) < 1$] the von Neumann stability analysis predicted instability. What happens in the computer is this: if you have $\rho(E) < 1$ but are unstable according to von Neumann, then the solution magnitude can become exceedingly large (say $10^{10}$) before finally turning around and decaying. Thus, the matrix method 'wins' ultimately, but certainly not in practice. See too the discussion of non-normal matrices at the end of Section 2.6.2d, to which such behaviour is closely related.

For these reasons, we generally suggest that stability analyses be performed, wherever possible, using the von Neumann method: $|\xi| \le 1$.

e. Balancing tensor diffusivity (BTD)

If one insists on lumping the mass and using explicit time integration (a la many FDM's), one would probably also at least consider (or start with) the simplest method, forward Euler (FE), as we did above. In this section we present a forward Euler method that is modified in such a way that both accuracy (usually) and stability (always) are increased, a rare occurrence in CFD.

To motivate the derivation and to obtain the result in the simplest manner, we revert to the pure advection equation, for which FE is unconditionally unstable (the $P \to \infty$ limit in Figure 2.7-3), and ask the question: Can we (slightly) modify (perturb) the spatial operator (i.e., the problem) in such a way that FE will be both stable (at least conditionally) and sufficiently accurate? The answer turns out to be 'yes,' and it is obtained as follows: rather than trying to integrate

$$\partial T/\partial t = -u\cdot\nabla T \equiv LT \tag{2.7-34}$$

with FE (which is fruitless unless we 'upwind'), let us integrate

$$\partial T/\partial t = \tilde{L}T \tag{2.7-35}$$

with FE, where $\tilde{L}$ approximates $L$ in an appropriate sense. We first use Taylor series on the exact solution as follows: starting at $t_n$,

$$T(t_{n+1}) = T_n + \Delta t\left.\frac{\partial T}{\partial t}\right|_n + \frac{\Delta t^2}{2}\left.\frac{\partial^2 T}{\partial t^2}\right|_n + O(\Delta t^3) \tag{2.7-36}$$
$$= T_n + \Delta t\,LT\Big|_n + \frac{\Delta t^2}{2}\left[L^2 T + \frac{\partial L}{\partial t}T\right]_n + O(\Delta t^3) \tag{2.7-37}$$

using (2.7-34) with $u = u(x, t)$. Next, apply FE to (2.7-35), starting from the same place:

$$T_{n+1} = T_n + \Delta t\,\tilde{L}T\Big|_n. \tag{2.7-38}$$

We find our modified operator [to $O(\Delta t^2)$] by equating $T_{n+1}$ to $T(t_{n+1})$:

$$\tilde{L} = L + \frac{\Delta t}{2}\left(L^2 + \frac{\partial L}{\partial t}\right), \tag{2.7-39}$$

and we do find an operator that is close to $L$. That is, integrating (2.7-35) via FE will, to $O(\Delta t^2)$, agree with the exact solution of (2.7-34). Now use $L^2 = (u\cdot\nabla)(u\cdot\nabla)$ and $\partial L/\partial t = -(\partial u/\partial t)\cdot\nabla$ to obtain $\tilde{L} = -u\cdot\nabla + (\Delta t/2)[(u\cdot\nabla)^2 - (\partial u/\partial t)\cdot\nabla]$, and to make further progress, we note (i.e., we leave as an exercise for the reader) that $\nabla\cdot(uu\cdot\nabla) = (u\cdot\nabla)^2 + (\nabla\cdot u)\,u\cdot\nabla = (u\cdot\nabla)^2$ for our incompressible flow field. Thus,

$$\tilde{L} = -\left(u + \frac{\Delta t}{2}\frac{\partial u}{\partial t}\right)\cdot\nabla + \frac{\Delta t}{2}\nabla\cdot(uu\cdot\nabla) \tag{2.7-40}$$

is, we assert, an appropriate modified operator, wherein we hasten to add that it is the diffusion-like term ($u^2\Delta t$ has units of a diffusion coefficient) that is important. (Indeed, limited testing with a time-dependent velocity field suggested to us that the extra work associated with the $\partial u/\partial t$ term is not worth the extra effort.) In fact, our final modified operator omits the acceleration term [thus losing some of the theoretical closeness, $O(\Delta t^2)$, to the exact solution when $u$ is time-dependent] so that the associated, modified advection equation reads

$$\partial T/\partial t + u\cdot\nabla T = \frac{\Delta t}{2}\nabla\cdot(uu\cdot\nabla T), \tag{2.7-41}$$

which is to be integrated via FE, which we now examine for the simplest case; i.e., 1D lumped linears with constant velocity. Thus, (2.7-41) becomes

$$\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} = \frac{u^2\Delta t}{2}\frac{\partial^2 T}{\partial x^2}, \tag{2.7-42}$$

and another application of 'von Neumann' [cf. (2.7-16) through (2.7-20)] to

$$\frac{T_j^{n+1} - T_j^n}{\Delta t} + u\,\frac{T_{j+1}^n - T_{j-1}^n}{2h} = \frac{u^2\Delta t}{2}\,\frac{T_{j-1}^n - 2T_j^n + T_{j+1}^n}{h^2} \tag{2.7-43}$$

gives

$$\xi = 1 - c^2(1 - \cos\theta) - ic\sin\theta, \tag{2.7-44}$$

$$|\xi|^2 = 1 - c^2(1 - c^2)(1 - \cos\theta)^2, \tag{2.7-45}$$

and it is not too difficult to find that the stability limit occurs at $\theta = \pi$ (the $2\Delta x$ wave) with the result

$$c \le 1 \tag{2.7-46}$$
for stability. Also, for $h \to 0$ (and $c \le 1$), $|\xi| \to 1 - u^2k^4\Delta t^2h^2/8$, which is called (owing to $k^4$) 'dissipative of fourth order in the sense of Kreiss' (Richtmyer and Morton, 1967).

Thus we have stabilized forward Euler for pure advection by changing (with justification) the equation, at least for this special case. It is noteworthy that FE stability has been achieved by adding a diffusion term to the pure advection equation, a result that helps to explain the instability mechanism of FE on pure advection; i.e., FE is unstable by virtue of the fact that its truncation error 'generates' an advection-diffusion equation with negative diffusivity, which is an unstable PDE. The addition of the positive diffusion coefficient, $u^2\Delta t/2$, balances this destabilizing LTE. (Likewise, the BE integration of the advection equation adds the same amount of numerical diffusion, helping to 'explain' its robustness.) It is also noteworthy that this result is actually quite old, and well enough known to have a name: the Lax-Wendroff method (see Richtmyer and Morton, 1967). [Our derivation, from A. Hindmarsh (personal communication), is, we believe, new, and first appeared in Upson et al. (1983).] Also relevant from this classic text is the following quotation (p. 332):

It has occasionally been argued that the Lax-Wendroff equations must in some sense be less accurate than the centered or 'leapfrog' equations because of the damping of those Fourier components $\exp(ikx)$ having $k\Delta x \sim 1$, which does not occur for the leapfrog equations or for the differential equations. This argument fails to take account of the phases of the Fourier coefficients, which are falsified, for $k\Delta x \sim 1$, by both the Lax-Wendroff and the leapfrog schemes, in fact by an amount of order $(k\Delta x)^3$, the same amount for both schemes, which is one order of magnitude larger than the falsification of the amplitudes (the damping).
It seems to us that to retain the short-wave Fourier components with unchanged amplitudes is unrealistic under these circumstances.

More history: in 1D, the BTD method is also known as Leith's method (Roache, 1982), referring to Leith (1965), but in fact it is older yet; in Knox (1961), it is mentioned that Leith suggested the scheme to him in 1960. Knox also showed that the scheme competes well with 'the well-known centered explicit scheme,' i.e., leapfrog.

Generalization to include diffusion is possible (almost), but we present only the final results, referring to the original references for details (i.e., Gresho et al., 1984b; Hindmarsh et al., 1984): to solve $\partial T/\partial t + u\cdot\nabla T = \nabla\cdot K\cdot\nabla T$, where $K$ is a diagonal diffusion tensor, via FE and lumped mass FEM on bilinear elements (trilinear in 3D), simply replace $K$ by $K + uu\Delta t/2$. We succinctly summarize the results in the form of Remarks:

(1) The entire procedure is called BTD (balancing tensor diffusivity) because it balances the FE truncation error with a diffusivity tensor.

(2) The von Neumann stability limit (necessary and sufficient) for the 1D case is (see, e.g., Hindmarsh et al., 1984)

$$c \le 2P/\left(1 + \sqrt{1 + 4P^2}\right) \tag{2.7-47}$$

and is plotted in Figure 2.7-5, and called modified FTCS. The 'upwind' stability result, from FDM or LM linears with pure upwinding, is also shown, as is the no-BTD LM result, a la Figure 2.7-4 for CM, and called FTCS here. Note that CM causes a factor of three reduction in $\Delta t$ for unmodified FE and $P < 1$.
[Fig. 2.7-5 Stability results for three schemes with forward Euler: lumped linears (FTCS); upwind FDM; and lumped linears using BTD (modified FTCS).]

(3) The stability limit for multi-dimensions is not yet known unless the one-point quadrature approximation is invoked, in which case it is (2.7-47), applied separately in each direction, but only as necessary conditions. Presumably, it is close to a sufficient condition and is a good approximation to the full quadrature ($2\times 2$) case. (It has served us (PMG) well in practice, even for 'real' problems, i.e., those with variable grids, velocity, etc.)

(4) Since the 'fix' is based on the time-dependent AD equation, it is not strictly valid (though still needed for stability) if a steady state is reached (the time truncation error of FE that introduces negative diffusion is no longer present; there is nothing left to 'balance'). It is then merely another example of streamline upwinding, so-called because the diffusion tensor, $u_iu_j\Delta t/2$, is anisotropic in just such a way that it is non-zero (value $|u|^2\Delta t/2$) only along streamlines; crosswind diffusion, the bane of multi-dimensional upwinding, is absent. [The proof that the diffusion acts only along streamlines is simple: rotate the diffusion tensor to streamline and normal (to streamline) coordinates, i.e., to principal directions, via (in 2D, for simplicity) $\tan\theta = v/u$ and $K'_{ij} = (R^T\cdot K\cdot R)_{ij}$, where $K_{ij} = u_iu_j$ and

$$R = \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}$$

is the appropriate rotation matrix. The result is that the rotated 'diffusivity' matrix,

$$K' = \begin{pmatrix}u^2 + v^2 & 0\\ 0 & 0\end{pmatrix},$$

is non-zero only in the streamline direction. A second, and simpler, 'proof' follows from $\nabla\cdot(uu\cdot\nabla) = (u\cdot\nabla)^2$ and the realization that $u\cdot\nabla = u_s\,\partial/\partial s$, the directional derivative in the streamline direction.]
(5) If BE is used to integrate the AD equation, then it adds (implicitly) the streamline diffusivity, $uu\Delta t/2$, to the physical diffusivity; BE is thus inherently a 'streamline-diffusion' method (see Johnson, 1987) for 'small' $\Delta t$. For large $\Delta t$, BE goes from extra damping in the streamline direction to massive amplitude reduction everywhere. Returning to 'small' $\Delta t$, it is clear that the accuracy of BE can be increased (and stability decreased) by subtracting a BTD term, a result that was demonstrated by Engelman (see Gresho and Chan, 1990) on a spinning vortex problem for the incompressible Euler equations.

We now demonstrate that BTD 'works' via a simple 1D example from Gresho et al. (1984b): a unit-amplitude Gaussian wave with $\sigma = 2\Delta x$ is advected with unit velocity through a uniform mesh of 100 lumped linear elements with $\mathrm{Pe} = u\sigma/\kappa = 20$ (and $P = 5$) and periodic BC's on the unit span. Figure 2.7-6 shows three approximate solutions and the exact (infinite span) solution, given by

$$T(x, t) = \exp\left[-(x - x_0 - ut)^2/(2\sigma^2 + 4\kappa t)\right]\Big/\sqrt{1 + 2\kappa t/\sigma^2}$$

with $x_0 = 0.5$ and $t = 1.0$ (one lap) and shown as the dotted curve. Curve (1) shows the true solution (very small $\Delta t$) to the ODE with no BTD, and curve (2) shows the no-BTD result at its stability limit ($c = 1/P = 0.2$) and seems to corroborate the notion that (at the stability limit) the negative diffusivity of forward Euler has 'precisely' cancelled the physical diffusivity, because the curve looks just like a pure advection result with lumped mass (dispersion error). Finally, curve (3) shows the BTD result ($\kappa + u^2\Delta t/2$) at its larger stability limit [$c = 2P/(1 + \sqrt{1 + 4P^2}) = 0.905$], a very cost-effective result because the accuracy is much better and the timestep is significantly larger (a factor of 0.905/0.2 = 4.5).
While these truly desirable assets gained by BTD with forward Euler are not always so striking in the general multi-dimensional case, they are sufficiently good as to strongly recommend the use of BTD when forward Euler time integration is employed.

[Fig. 2.7-6 Advection-diffusion of a Gaussian ($T$ versus $x$): Curve (1) no BTD at very small $\Delta t$; curve (2) no BTD at the stability limit; curve (3) BTD result at the stability limit. (The dotted curve is the exact solution.)]
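The Gaussian test above can be imitated in a few lines. What follows is a rough sketch, not the code of Gresho et al. (1984b): it assumes lumped linear elements on a uniform periodic mesh (so the spatial operators reduce to centered differences), takes $\kappa = 10^{-3}$ as implied by $P = 5$ with $\Delta x = 0.01$ and $u = 1$, and runs the BTD case at $c = 0.8$ (somewhat below the 0.905 limit) only so that a whole number of steps reaches $t = 1$. The function names are hypothetical.

```python
import math

def exact_gaussian(x, t, u, kappa, sigma, x0):
    """Infinite-span solution quoted in the text, wrapped onto the unit span."""
    d = (x - x0 - u * t + 0.5) % 1.0 - 0.5  # periodic distance to the peak
    return (math.exp(-d * d / (2.0 * sigma ** 2 + 4.0 * kappa * t))
            / math.sqrt(1.0 + 2.0 * kappa * t / sigma ** 2))

def advect_diffuse_fe(c, n_steps, btd, n=100, u=1.0, kappa=1.0e-3,
                      sigma=0.02, x0=0.5):
    """Forward Euler, lumped linears, periodic BC's; returns (T, max error)."""
    dx = 1.0 / n
    dt = c * dx / u
    # BTD simply augments the physical diffusivity by u^2 * dt / 2.
    k_eff = kappa + (0.5 * u * u * dt if btd else 0.0)
    alpha = 2.0 * k_eff * dt / dx ** 2
    T = [exact_gaussian(j * dx, 0.0, u, kappa, sigma, x0) for j in range(n)]
    for _ in range(n_steps):
        # T[j - 1] with j = 0 wraps to the last node, giving periodicity.
        T = [T[j]
             - 0.5 * c * (T[(j + 1) % n] - T[j - 1])
             + 0.5 * alpha * (T[(j + 1) % n] - 2.0 * T[j] + T[j - 1])
             for j in range(n)]
    t_end = n_steps * dt
    err = max(abs(T[j] - exact_gaussian(j * dx, t_end, u, kappa, sigma, x0))
              for j in range(n))
    return T, err

if __name__ == "__main__":
    _, err_plain = advect_diffuse_fe(c=0.2, n_steps=500, btd=False)  # no BTD
    _, err_btd = advect_diffuse_fe(c=0.8, n_steps=125, btd=True)     # with BTD
    print("max error without BTD:", err_plain)
    print("max error with BTD:   ", err_btd)
```

Under these assumptions the BTD run takes four times fewer steps and should show the smaller maximum error, in line with the comparison of curves (2) and (3).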
As a final remark on BTD, it seems that it may only be effective for the explicit Euler method; e.g., if second-order Adams-Bashforth is analyzed in a similar way, the leading FE truncation error term, $\Delta t\,L^2/2$, where $L = -u\cdot\nabla$, is replaced (for constant $u$) by the higher-order AB2 truncation error term, $5\Delta t^2L^3/12$, which does not look like simple diffusion.

2.7.3 Some Implicit ODE Methods

Since explicit methods may be too expensive for some cases of interest (for which the stability-induced step size is often orders of magnitude smaller than that required to integrate the ODE's with acceptable accuracy), it is appropriate to additionally consider stable, implicit methods. In these situations the $\Delta t$ question is changed from 'How small must $\Delta t$ be to maintain stability?' to 'How large can $\Delta t$ be while still assuring sufficient accuracy?' The answer to this important question is, unfortunately, not easy to obtain in a simple and general way since the answer is problem-dependent and is often a strong function of time for a given problem. (The question is important because implicit methods are relatively expensive per timestep, and the goal is to use as few steps as possible.) Here we first give a few general guidelines or insights, before suggesting the 'proper' answer.

If $\mathrm{Pe} \ll 1$, then the diffusion-dominated, time-dependent flow usually exhibits multiple time scales; e.g., consider the typical (1D) analytical solution, $T = \sum_{n=1}^{\infty} a_n(x)\,e^{-\beta n^2\kappa t/L^2}$, where $a_n(x)$ is proportional to an appropriate eigenfunction, and $\beta = O(1)$. For very small time ($\kappa t/L^2 \ll 1$), many of the 'faster' modes (large $n$) are often quite important; but as time goes on, these modes contribute less and less until, for $\kappa t/L^2 \ge \sim 1$, only the slowest mode ($n = 1$) remains significant.
This suggests, and properly so, that numerical solutions of the FEM model (the ODE's) should use a very small $\Delta t$ initially (on the order of the time constant of the smallest element; e.g., $\tau \sim 0.1h^2/\kappa$), and $\Delta t$ should (or at least, could) be increased monotonically and significantly during the simulation, without sacrificing accuracy. On the other hand, for certain advection-dominated situations ($\mathrm{Pe} \gg 1$) involving the transport (with little diffusion) of a discrete waveform, the solution is more hyperbolic in nature: the 'temperature' follows the fluid streamlines (particle paths for time-varying velocity). As a result, the relevant time scale is not nearly so variable, and often a fixed step size (and perhaps an explicit method) is most appropriate. For problems not pushing the hyperbolic limit, a variable step size may generally be useful and cost-effective, if done properly. Fortunately, the theory of the numerical solution of ODE's is sufficiently well advanced to provide simple but effective methods for selecting the appropriate step size for the general case. Thus, after introducing implicit methods in the simpler context of fixed-step-size algorithms, we will present 'smarter' and more cost-effective algorithms for solving the ODE's.

The stability issues discussed above for AD are usually translated into what is called 'stiffness' in the ODE jargon. The early portion of a diffusion (-dominated) problem is usually 'non-stiff' in that small timesteps are required for accuracy no matter what type of time integrator, explicit or implicit, is employed. A general definition of stiffness, in terms of our model ODE [(2.7-1)] (considered now to be one of many, each with a different time constant and frequency), is this: 'The problem is stiff if the smallest time constant ($\tau_{\min}$) is "very small" relative to the time over which we wish to solve the problem.'
Thus, for example, the transient heat equation, when spatially discretized, becomes stiff at 'large' time because the 'fast' modes have decayed away. If the maximum time constant
is called $\tau_{\max}$ and if the minimum frequency is called $\omega_{\min}$, then it is typically the case that the time interval of interest is at least $\tau_{\max}$ or $1/\omega_{\min}$. Since all explicit methods, and many implicit methods, are only stable for $\Delta t$ not far from $\tau_{\min}$, a measure of the stiffness of a problem is the ratio $\tau_{\max}/\tau_{\min}$ or the inverse product $1/(\omega_{\min}\tau_{\min})$, each of which is an estimate of the number of timesteps necessary to do the job stably, though not necessarily accurately, although it is often true that the step size needed for stability will give adequate accuracy. (Sometimes in the extreme, $\Delta t_{\mathrm{crit}}$ can be so small that the ODE's are integrated with an accuracy much greater than is needed, or reasonable, especially when solving PDE's, which almost never warrant 'too precise' an ODE solution because of '$\Delta x$ errors.') In the case $\dot{y} = Ay$, where $A$ is a matrix, the stiffness ratio is often taken to be the ratio of largest to smallest eigenvalue moduli of $A$. For further discussions of this subject, see Aiken (1985), Lambert (1991), and Radhakrishnan and Hindmarsh (1993).

Three examples should suffice to demonstrate stiffness (for others, see, for example, Gear, 1971; Shampine and Gear, 1979; and Hairer and Wanner, 1991, in addition to those already mentioned); the first is from Dahlquist and Bjorck (1974), and the second is our own invention (Lee et al., 1983) and will be referred to later (Volume II) when we discuss anelastic equations:

1. Consider the ODE

$$\dot{y} = 100(\sin t - y), \qquad y(0) = 0, \tag{2.7-48}$$

with solution

$$y(t) = \frac{\sin t - 0.01\cos t + 0.01e^{-100t}}{1.0001}, \tag{2.7-49}$$

which displays $y(0) = 0$ and $y(t) \approx \sin t - 0.01\cos t$ for $t > \sim 0.1$, and even $y(t) \approx \sin t$ to within 1%. But since $\lambda = 1/\tau = 100$, any of the explicit schemes discussed above would be unstable for $\Delta t$ much larger than 0.02 or so.
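While $\Delta t = O(0.01)$ is fine (appropriate) for the initial rapid transient, the stability limit later gives the requirement of $n = 2\pi/\Delta t$ = several hundred steps per period of the simple sine wave, which is many more steps than would be necessary to give reasonable accuracy, probably ten-fold too many. And suppose 100 were changed to $10^4$!

2. Here we consider a pair of equations,

$$\dot{y}_1 = -y_1 + y_2, \qquad y_1(0) = 1, \tag{2.7-50}$$

and

$$\dot{y}_2 = y_1 - 1000y_2, \qquad y_2(0) = 1. \tag{2.7-51}$$

The negatives of the eigenvalues of the associated matrix,

$$A = \begin{pmatrix}-1 & 1\\ 1 & -1000\end{pmatrix},$$

are $\lambda_1 = 1000.001001$ and $\lambda_2 = 0.998999$, and the exact solution is

$$y_1 = \frac{\lambda_1}{\lambda_1 - \lambda_2}\,e^{-\lambda_2 t} - \frac{\lambda_2}{\lambda_1 - \lambda_2}\,e^{-\lambda_1 t}, \qquad y_2 = \frac{\lambda_2(\lambda_1 - 1)}{\lambda_1 - \lambda_2}\,e^{-\lambda_1 t} + \frac{\lambda_1(1 - \lambda_2)}{\lambda_1 - \lambda_2}\,e^{-\lambda_2 t}. \tag{2.7-53}$$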
This solution, to five or six digits, is

$$y_1 = 1.001e^{-0.999t} - 0.001e^{-1000t} \tag{2.7-54}$$

and

$$y_2 = 0.001e^{-0.999t} + 0.999e^{-1000t}, \tag{2.7-55}$$

which rapidly ($t > \sim 0.01$) changes to $y_1 = 1.001e^{-0.999t}$ and $y_2 = 0.001e^{-0.999t}$; only the slowly decaying portion of the solution is significant. If explicit Euler were to be applied to this problem, stability would require $\Delta t \le 2/\lambda_1 = 0.002$; to follow $y_1$'s decay to, say, $10^{-4}$ would mean $t_{\mathrm{final}} = 10$, and a total of 5000 steps would be needed when less than 1/100 that many, based on accuracy, would suffice.

Exercise: Solve $\dot{y} = 2t - 10^6(y - t^2)$, $y(0) = 0$ via FE and BE. Show that FE is not useful unless $\Delta t \le \sim 10^{-6}$, whereas BE gives six-place accuracy with $\Delta t = 1$ (!). (See also Gear, 1971.)

3. One more fairly common situation in which stiffness is encountered is that where the advection-diffusion equation becomes the 'reaction-advection-diffusion' equation, i.e., when chemical reactions occur in the fluid. It is often the case that very wide differences in chemical reaction rates ('time constants') exist, thus introducing 'chemical' stiffness. After a short time, some of the reactants are virtually in equilibrium ($\dot{y} = 0$), yet a stability-limited ODE method would not know this and would need to always use the small timestep (to follow the fastest reaction, which is in equilibrium) that only makes sense at very early time, in general. For examples of stiffness of this type, see, for example, Enright and Hull (1976), Radhakrishnan (1986), and Aiken (1985).

It is interesting (again) to note that the ODE's that are also the natural BC's, which are also used for OBC's, become stiffer and stiffer as $\ell \to 0$; i.e., the ODE's time constant goes to zero like $\ell^2/\kappa$ [see, for example, (2.3-11) or (2.4-1)], causing, appropriately, 'equilibrium' behavior.
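The exercise above makes the FE/BE contrast concrete, and a short script is enough to see it. This is one possible sketch with hypothetical function names; it relies on two facts that are easy to verify by substitution: the exact solution is $y = t^2$, and, because the ODE is linear in $y$, the BE update can be solved in closed form with no iteration.

```python
LAMBDA = 1.0e6  # the stiff rate constant in y' = 2t - LAMBDA * (y - t^2)

def rhs(t, y):
    """Right-hand side of the exercise ODE; the exact solution is y = t^2."""
    return 2.0 * t - LAMBDA * (y - t * t)

def forward_euler(dt, n_steps):
    """Explicit Euler from y(0) = 0; returns y at t = n_steps * dt."""
    y = 0.0
    for n in range(n_steps):
        y = y + dt * rhs(n * dt, y)
    return y

def backward_euler(dt, n_steps):
    """Implicit Euler from y(0) = 0, solved exactly since the ODE is linear."""
    y = 0.0
    for n in range(1, n_steps + 1):
        t_new = n * dt
        # (1 + LAMBDA*dt) * y_new = y_old + dt * (2*t_new + LAMBDA*t_new^2)
        y = (y + dt * (2.0 * t_new + LAMBDA * t_new * t_new)) / (1.0 + LAMBDA * dt)
    return y

if __name__ == "__main__":
    # BE with dt = 1 out to t = 10 (exact answer: 100).
    print("BE, dt = 1:   ", backward_euler(1.0, 10))
    # FE at dt = 1e-6 out to t = 1e-3 (exact answer: 1e-6).
    print("FE, dt = 1e-6:", forward_euler(1.0e-6, 1000))
    # FE at dt = 1e-5: the error is multiplied by (1 - LAMBDA*dt) = -9 per step.
    print("FE, dt = 1e-5:", forward_euler(1.0e-5, 20))
```

The FE error obeys $e_{n+1} = (1 - 10^6\Delta t)\,e_n - \Delta t^2$, so it is bounded only for $\Delta t \le 2\times 10^{-6}$, whereas the BE error settles near $\Delta t/10^6$.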
Since this time constant is essentially the same as the $\Delta t$ stability limit for explicit methods, we see that, fortunately, the NBC ODE does no additional 'harm' when an explicit ODE method is used. Implicit methods, of course, also take this 'stiffness' in their stride and should use a step size based on accuracy, even for very small $\ell$.

Recall: 'A method is A-stable if $\lim_{n\to\infty} y_n = 0$ for all $\mathrm{Re}(\lambda) < 0$ and a fixed positive $\Delta t$ when applied to the test problem $\dot{y} = \lambda y$' (Brenan et al., 1989, 1996). 'The condition of A-stability, as a requirement for a method to be considered for stiff problems, leads to disappointingly few methods. A more fruitful approach to this subject was followed by Gear. In this approach, instead of requiring that the entire half-plane $\mathrm{Re}(\lambda\Delta t) \le 0$ lie in the absolute stability region, it is only required that a suitably large part of it lie there.' (Hindmarsh, 1974). See Gear (1971) for the precise definition of 'stiff stability'; suffice it to say here that there are many more stiffly stable methods than A-stable methods. Every A-stable method is also stiffly stable, but not vice versa.

We now consider implicit ODE methods that can cope with stiffness, after stating that a final and significant reason to consider implicit methods is that the GFEM ODE's from PDE's are 'inherently implicit' owing to the mass matrix.

a. The trapezoid rule (TR)

This is the second member of the Adams-Moulton family, or implicit Adams family (the first is backward Euler); applied to (2.7-2), the TR gives
260 THE ADVECTION-DIFFUSION EQUATION yn+\ = yn + -z-(yn+ yn+\\ (2.7-56) and applied to the linear ODE [(2.7-1)], it yields 1 ~ *Af/2 y*+x = urn]?*- (2-7"57) This popular method, called 'Crank-Nicolson' in much of the PDE literature (at least when applied to the diffusion-only equation), has a number of interesting and important features—most of them very good; recall our list of virtues in the introductory discussion in Section 2.7: 1. It is self-starting. 2. There are no extraneous roots. 3. It is stable for all At whenever the ODE is stable (i.e., it is A-stable). It is also unstable whenever the ODE is unstable. 4. It displays no spurious damping; it is completely neutral—at least for constant a>. 5. It is nearly as simple to implement as its first-order dissipative relative, backward Euler. 6. It is second-order accurate. Before presenting the next neat feature of TR, the 'Dalhquist theorem,' we briefly digress to define a full class of methods; two classes, actually—the Adams methods. These are embedded in a yet more general class of methods called linear multi-step methods as is yet another: BDF (backward differentiation formula) methods, popularized by Gear (1971) and Hindmarsh, beginning with Hindmarsh (1972) on through to Hindmarsh and Petzold (1995a,b). The linear multi-step methods are given by yn = Y.ajy"-j + ^^fryn-h (2.7-58) wherein the terms 'linear' (for linear combinations, vis-a-vis, for example, RK methods, which are non-linear) and 'multi-step' (at least for K\ and/or K2 > 1) are obvious. The quantities {K,, a,-, /},•} are constants given by the particular choice of method. If we define K = max (A!'], K2), then the formula gives a AT-step method. It is explicit if /}0 = 0 (e.g., Adams-Bashforth) and implicit otherwise (e.g., Adams-Moulton). Also, the order (accuracy) of the method is determined by K\ and K2. The subsets (families) referred to above, of order q, are 1. Adams-Bashforth (explicit): K\ = 1 (and a\ = 1), K2 = q, fio = 0. 2. 
Adams-Moulton (implicit): $K_1 = 1$ (and $\alpha_1 = 1$), $K_2 = q - 1$, $\beta_0 > 0$.
3. BDF: $K_1 = q$, $K_2 = 0$, $\beta_0 > 0$.

See, for example, Gear (1971) for values of the coefficients as functions of $q$ in each case. From this text (p. 130), we quote: '... the region of stability for the implicit Adams-Moulton methods is larger by a factor of ten or more than that of the explicit Adams-Bashforth methods. The truncation errors are also smaller for the implicit methods, so the implicit methods can be used with a step size that is several times larger than that of the explicit methods. This increase in step size usually more than offsets the additional effort in solving the corrector, which may require two or three function evaluations.' The
'corrector' here refers to a predictor-corrector method for these implicit ODE methods, which we shall define below (or see p. 114 of Gear, 1971).

This excursion into general ODE methodology, for our purposes, has ended (at least until the next chapter); but for those interested in how general ODE software packages (of variable order and automatically varying step sizes) are built upon the above multi-step methods, see Hindmarsh (1979); Shampine and Gordon (1975, non-stiff ODE's); Burrage et al. (1980); Hindmarsh (1983); and Brown et al. (1989). Also, a general CFD-oriented discussion of these methods (and others, such as one-leg methods that we will define below) is available in the excellent Von Karman Institute lecture series publication by Beam and Warming (1982).

The forward Euler method is clearly the first in the AB family, and the implicit Euler is the first member in both the Adams-Moulton and BDF families, whereas TR is the second member ($q = 2$) of the Adams-Moulton family. The BDF methods are so named because they look like formulas for $\dot{y}_n$ that are obtained by looking backwards at earlier values of $y_j$. They are famous because of their 'stiff-stability.'

We now return to the TR and quote, loosely, the famous Dahlquist theorems (following Hughes, 1987), which help explain our attraction to it:
1. There are no explicit linear multi-step methods that are A-stable.
2. The highest order attainable that has A-stability is 2 ($q = 2$), and it is necessarily implicit.
3. Of all second-order, A-stable methods, the TR is the most accurate (smallest constant in the LTE).

It is useful to demonstrate the accuracy, stability, and lack of dissipation of TR since we will rely heavily on this 'optimal' method in the sequel. The LTE is determined from (2.7-56) as follows:

$d_n = y_n + \frac{\Delta t}{2}[\dot{y}_n + \dot{y}(t_{n+1})] - y(t_{n+1})$
$\quad = y_n + \frac{\Delta t}{2}\left[\dot{y}_n + \dot{y}_n + \Delta t\,\ddot{y}_n + \frac{\Delta t^2}{2}\dddot{y}_n + O(\Delta t^3) + O(\ell_n)\right] - \left[y_n + \Delta t\,\dot{y}_n + \frac{\Delta t^2}{2}\ddot{y}_n + \frac{\Delta t^3}{6}\dddot{y}_n + O(\Delta t^4)\right]$
$\quad = \frac{1}{12}\Delta t^3\,\dddot{y}_n + O(\Delta t^4)$,  (2.7-59)

where, as with BE, the LTE differs slightly from the true local error, $\ell_n$ (here by $O(\Delta t^4)$).

The stability of TR, for the constant-coefficient case, from (2.7-57), is determined by studying $\xi = (1 - \lambda\Delta t/2)/(1 + \lambda\Delta t/2)$ and setting $\xi = e^{i\theta}$ to find the stability boundary ($|\xi| = 1$), which yields $\lambda\Delta t = -2i\tan(\theta/2)$: the stability boundary is the entire imaginary axis. This is the epitome of A-stability: all values of $\lambda\Delta t$ with $\mathrm{Re}(\lambda\Delta t) > 0$ are stable and all values of $\lambda\Delta t$ with $\mathrm{Re}(\lambda\Delta t) < 0$ are unstable, thus mimicking optimally the ODE behavior.

Finally, to test dissipation, we set $\lambda = i\omega$ and compute

$|\xi|^2 = \left|\frac{1 - i\omega\Delta t/2}{1 + i\omega\Delta t/2}\right|^2 = \frac{1 + (\omega\Delta t/2)^2}{1 + (\omega\Delta t/2)^2} = 1$;
TR is neutrally stable and (properly) will not damp solutions that should be purely oscillatory, thus showing that TR shares that coveted property of the leapfrog method.

The most serious criticism ever leveled against TR (perhaps besides being only second-order accurate) is that its neutral stability (lack of numerical damping) is sometimes regarded as a disadvantage because, although A-stable, too large a timestep (used at the wrong time) can (but need not; see below) lead to what are called TR oscillations, or ringing. TR is also not very good at damping perturbations, or round-off error, for the same reason. To see the ringing, consider $\lambda$ real so that the ODE solution is monotonically decaying like $e^{-\lambda t}$. If $\lambda\Delta t \gg 1$ in (2.7-57), then the TR solution is, approximately, $y_{n+1} = -(1 - 4/\lambda\Delta t)\,y_n$, giving $y_n = (-1)^n(1 - 4/\lambda\Delta t)^n y_0 \approx (-1)^n y_0$; the solution oscillates nearly between $\pm y_0$ (similar to forward Euler at its stability limit, $\Delta t = 2/\lambda$), while only very slowly decaying, non-monotonically, to zero.

Soon, though, we shall show how this alleged shortcoming is easily overcome by a 'smart integrator' (a la ODE theory) that precludes ringing by intelligently (and efficiently) selecting appropriate step sizes for TR. Unfortunately, the literature abounds with papers in which a fixed step size is used and, especially for diffusion-dominated situations, spurious oscillations result because the fixed $\Delta t$ is inappropriate (at early time) for the 'higher modes.' These TR oscillations should be regarded as another 'wiggle signal,' telling the astute analyst that a smaller $\Delta t$ should have been used, at least at early time. (TR for time integration is the ODE 'portion' of what we call 'honest GFEM' for PDE's.)
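The ringing described above can be seen directly from the amplification factor in (2.7-57). The following is a minimal sketch, not taken from the text: the function names `tr_factor` and `be_factor` and the value $\lambda\Delta t = 100$ are illustrative assumptions. It prints the first few values of $y_n/y_0$ for TR, with backward Euler shown alongside for contrast.

```python
def tr_factor(z):
    """TR amplification factor for y' = -lam*y, with z = lam*dt (Eq. 2.7-57)."""
    return (1.0 - 0.5 * z) / (1.0 + 0.5 * z)


def be_factor(z):
    """Backward Euler amplification factor for the same ODE, for contrast."""
    return 1.0 / (1.0 + z)


z = 100.0                    # lam*dt >> 1 (illustrative value, an assumption)
xi = tr_factor(z)
approx = -(1.0 - 4.0 / z)    # large-z approximation quoted in the text

# y_n / y_0 for the first few steps of each method
tr_seq = [xi ** n for n in range(6)]
be_seq = [be_factor(z) ** n for n in range(6)]

print(f"TR factor = {xi:.6f}, large-z approximation = {approx:.6f}")
for n, (a, b) in enumerate(zip(tr_seq, be_seq)):
    print(f"n={n}  TR: {a:+.4f}   BE: {b:.3e}")
```

The TR sequence alternates in sign while shrinking only a few percent per step, whereas the BE sequence collapses monotonically; this is the behavior that a fixed, too-large step produces for the fast modes.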
We will return to (and solve) this problem soon, but first we introduce two serious competitors to TR, both also second-order: the implicit midpoint rule and BDF2, after mentioning that TR is also a member of the second-order implicit Runge-Kutta family.

A suggestion by R. Rannacher that may sometimes be useful (e.g., in the presence of 'rough data') in the context of a fixed-step integrator is called 'a slight shift to the implicit side' in Heywood and Rannacher (1990); namely, modify (2.7-57) to

$y_{n+1} = y_n\,\frac{1 - \frac{\lambda\Delta t}{2}(1 - \lambda\Delta t)}{1 + \frac{\lambda\Delta t}{2}(1 + \lambda\Delta t)}$.

(See also Timmermans et al., 1994, who make a similar suggestion.)

A final remark on TR that also applies to leapfrog and to the implicit midpoint rule to be described next, but does not apply to dissipative methods: it is a symmetric (self-adjoint) method in that any given time integration can be reversed ($\Delta t \to -\Delta t$) and backward-integrated to recover the original IC's. {Proof (A. Hindmarsh): Given $\dot{y} = f(y, t)$ and $y = y_0$ at $t = t_0$, one TR step gives the non-linear system, $y_1 = y_0 + (\Delta t/2)[f(y_0, t_0) + f(y_1, t_1)]$, for $y_1$. Assuming a unique solution exists, all we need to show is that a backward step recovers $y_0$, which is easy: given $y_1$ at $t_1$, integrate backwards to find $\tilde{y}(t_0) = \tilde{y}_0$ as follows: $\tilde{y}_0 = y_1 - (\Delta t/2)[f(y_1, t_1) + f(\tilde{y}_0, t_0)]$, which is again assumed to have a unique solution. But this is just $y_1 = \tilde{y}_0 + (\Delta t/2)[f(y_1, t_1) + f(\tilde{y}_0, t_0)]$, which proves that $\tilde{y}_0 = y_0$. For further discussion of symmetric ODE methods, see Hairer et al. (1987).} Later in this chapter we will demonstrate this symmetry for a 2D, pure advection problem.

b. Implicit midpoint rule (IMR)

There exists a class of methods pioneered by Dahlquist (see, for example, Dahlquist, 1983) called 'one-leg methods' because only one function evaluation is involved in each
timestep. The implicit midpoint rule is the second-order member of this family (backward Euler is the first); applied to (2.7-2), it is

$y_{n+1} = y_n + \Delta t\, f\!\left(\frac{y_n + y_{n+1}}{2},\; t_n + \frac{\Delta t}{2}\right)$,  (2.7-60)

with the function evaluation occurring at the 'midpoint' of the interval. Note that application of the IMR to the linear ODE of (2.7-1) simply returns the TR, a specific example of a general fact: one-leg methods are the same as linear multi-step methods when applied to linear ODE's with constant coefficients and a fixed step size. Even for the general nonlinear case, they are quite closely related (Hairer and Wanner, 1991): if $\{y_n\}$ is the solution of (2.7-60), then $\bar{y}_n = \frac{1}{2}(y_n + y_{n+1})$ at $\bar{t}_n = \frac{1}{2}(t_n + t_{n+1})$ satisfies (2.7-56); conversely, if $\{y_n, t_n\}$ is the TR solution, then $\{y_n - (\Delta t/2) f(y_n, t_n),\; t_n - (\Delta t/2)\}$ satisfies (2.7-60).

The main reason that one might be attracted to this method, which is self-starting and displays no spurious roots, is that it is somewhat more stable than TR, without being dissipative, which we now demonstrate for the GFEM ODE's, (2.2-25),

$M\dot{T} + [N_Q(u) + K]\,T = 0$,  (2.7-61)

in the absence of a forcing function. Recall that $N_Q$ is the skew-symmetric version of the advection approximation for $n \cdot u = 0$ on $\Gamma$, which we assume. The 'energy' (quadratic form) of the system is $E = \frac{1}{2}T^T M T$, and (2.7-61) easily leads to

$\dot{E} = -T^T K T$,  (2.7-62)

which, since $K$ is SPD, shows monotonic decay: the ODE's are stable, and the advection process (properly) has no net effect on the energy. (Recall that $x^T A x = 0$ for all $x$ when $A$ is a skew-symmetric matrix.) The TR applied to (2.7-61) with a time-varying velocity field is

$M\,\frac{T_{n+1} - T_n}{\Delta t} + \frac{1}{2}\left[N_n T_n + N_{n+1} T_{n+1} + K(T_n + T_{n+1})\right] = 0$,  (2.7-63)

where we have dropped the $Q$-subscript on $N$, and $N_n \equiv N[u(t_n)]$. Taking the scalar product of this equation with $(T_n + T_{n+1})$ yields, using the symmetry of $M$ and $K$ and the skew-symmetry of $N$,

$\frac{1}{\Delta t}\left(T_{n+1}^T M T_{n+1} - T_n^T M T_n\right) = -\frac{1}{2}\left[T_{n+1}^T (N_n - N_{n+1})\,T_n + (T_{n+1} + T_n)^T K\,(T_{n+1} + T_n)\right]$.
(2.7-64)

Since the diffusion term is behaving 'properly,' we now focus on the pure advection limit by setting $K = 0$ to give

$E_{n+1} = E_n + \frac{\Delta t}{4}\left[T_{n+1}^T (N_{n+1} - N_n)\,T_n\right] = E_n + O(\Delta t^3)$  (2.7-65)

rather than energy conservation, $E_{n+1} = E_n$, from (2.7-62). The TR gives an indefinite result for the energy, which seems to imply that stability (in the 'energy sense') is not guaranteed. Soon we will present an alternate analysis that 'returns' stability, but first we examine the IMR result for the pure advection case:

$M\,\frac{T_{n+1} - T_n}{\Delta t} + N\!\left(\frac{u_n + u_{n+1}}{2}\right)\left(\frac{T_n + T_{n+1}}{2}\right) = 0$.  (2.7-66)
Forming the same scalar product as above leads easily to the desirable result that $E_{n+1} = E_n$; energy conservation is achieved for all step sizes. The TR result, in an asymptotic form that replaced $(N_{n+1} - N_n)$ by $\Delta t\,\dot{N} + O(\Delta t^2)$, was first discussed from the standpoint of stability and conservation by Lee et al. (1982), and the better qualities of the IMR on the same were first pointed out by Cliffe (1981).

We now turn to an interesting model ODE with a time-varying decay rate,

$\dot{y} = -\lambda(t)\,y, \quad y(0) = y_0$,  (2.7-67)

and compare TR with IMR. The former yields

$y_{n+1} = \left(\frac{1 - \lambda_n\Delta t/2}{1 + \lambda_{n+1}\Delta t/2}\right) y_n \equiv \xi_n y_n$  (2.7-68)

and the latter

$y_{n+1} = \frac{1 - \bar{\lambda}\Delta t/2}{1 + \bar{\lambda}\Delta t/2}\, y_n$,  (2.7-69)

where $\bar{\lambda} = \lambda(t_n + \Delta t/2)$. As first pointed out by Gourlay (1970) and further studied by Hughes (1977) (see also Nevanlinna and Liniger, 1978), the TR result above actually looks unstable if $\lambda(t) > 0$ is a decreasing function of time and if $\Delta t > 4/(\lambda_n - \lambda_{n+1}) \equiv \Delta t_c$, a result that obtains from violating the left inequality in the stability statement $-1 \le \xi_n \le 1$. In contrast to this ostensibly conditional stability behavior, the IMR is easily seen to be stable for all step sizes.

But it is actually too hasty to conclude instability if $\Delta t > \Delta t_c$ because we have a variable-coefficient ODE. If $\lambda$ was constant and if $|\xi| > 1$, we would get $y_n = \xi^n y_0$, which is indeed unbounded. But for $\lambda = \lambda(t)$, a function of time, the analogous behavior is $y_n = \left(\prod_{j=0}^{n-1}\xi_j\right) y_0 \equiv \varphi_n y_0$, and it does not necessarily follow that $\varphi_n$ is unbounded for $n \to \infty$ even when each $\xi_j$ has $|\xi_j| > 1$; e.g., $\prod_{j=1}^{\infty}(1 + 1/j^2) = \sinh\pi/\pi$. In fact, it is not hard to show that TR is (at least for constant $\Delta t$) actually still stable (in a slightly extended sense, and not as 'cleanly' stable as IMR) even when $\Delta t > \Delta t_c$ for every step. The proof uses 'energy' arguments and for the scalar problem starts from the TR step, $(1 + \lambda_{n+1}\Delta t/2)\,y_{n+1} = (1 - \lambda_n\Delta t/2)\,y_n$ with $\lambda_{n+1} < \lambda_n$, and we permit $\Delta t > \Delta t_c$ for which $|\xi_n| > 1$.
Squaring both sides gives $(1 + \lambda_{n+1}\Delta t/2)^2 y_{n+1}^2 = (1 - \lambda_n\Delta t/2)^2 y_n^2 = (1 - \lambda_n\Delta t + (\Delta t^2/4)\lambda_n^2)\,y_n^2 \le (1 + \lambda_n\Delta t/2)^2 y_n^2$, which implies $|y_n| \le [(1 + \lambda_0\Delta t/2)/(1 + \lambda_n\Delta t/2)]\,|y_0|$; i.e., although $y_{n+1}$ may not decay to zero if $\Delta t$ is large, it is true that $y_{n+1}$ does not 'blow up' for any $\Delta t$. This boundedness is stability.

This result also generalizes to the system of ODE's given by (2.7-61), written now as

$M\dot{T} + [N(t) + K(t)]\,T = 0$,  (2.7-70)

where $N^T = -N$ represents advection, and $K$ is SPD (diffusion with, for generality but not necessarily, a time-varying diffusivity): the TR yields $(M + (\Delta t/2)A_{n+1})\,T_{n+1} = (M - (\Delta t/2)A_n)\,T_n$, where $A = N + K$. The energy analysis goes as follows (and uses the fact that $M$, and thus $M^{-1}$, is SPD): let $(M + (\Delta t/2)A_{n+1})\,T_{n+1} \equiv a$ and $(M - (\Delta t/2)A_n)\,T_n \equiv b$. Since $a = b$, we have $M^{-1}a = M^{-1}b$ and $a^T M^{-1} a = a^T M^{-1} b = b^T M^{-1} b$ which, after some algebra, yields

$T_{n+1}^T\left[M + \Delta t K_{n+1} + \frac{\Delta t^2}{4}\left(K_{n+1}M^{-1}K_{n+1} - 2N_{n+1}M^{-1}K_{n+1}\right)\right]T_{n+1} + \frac{\Delta t^2}{4}(N_{n+1}T_{n+1})^T M^{-1}(N_{n+1}T_{n+1})$
$\quad = T_n^T\left[M - \Delta t K_n + \frac{\Delta t^2}{4}\left(K_n M^{-1}K_n - 2N_n M^{-1}K_n\right)\right]T_n + \frac{\Delta t^2}{4}(N_n T_n)^T M^{-1}(N_n T_n)$
$\quad \le T_n^T\left[M + \Delta t K_n + \frac{\Delta t^2}{4}\left(K_n M^{-1}K_n - 2N_n M^{-1}K_n\right)\right]T_n + \frac{\Delta t^2}{4}(N_n T_n)^T M^{-1}(N_n T_n)$,  (2.7-71)

which, since now all terms on each side of the inequality are the 'same' functions of $n$, leads by induction to $\mathrm{LHS}_{n+1} \le \mathrm{RHS}_0$, and we have stability. If in fact $K = 0$ (pure advection), then we have, from (2.7-71), the following 'conservation' law: $T_n^T M T_n + (\Delta t^2/4)(N_n T_n)^T M^{-1}(N_n T_n) = T_0^T M T_0 + (\Delta t^2/4)(N_0 T_0)^T M^{-1}(N_0 T_0)$, which is seen to be more useful than (2.7-65) in that now we have shown a useful and definite result: boundedness.

In Hughes (1983) similar arguments were presented, but he apparently did not realize (nor did Gourlay for the scalar case) that $\varphi_n = \prod_{j=0}^{n-1}\xi_j$ need not become unbounded even though $|\xi_j| > 1$, and thus saw a discrepancy between the scalar case and the ODE system. Our analysis shows that there need not be a discrepancy, at least for the linear case [Hughes was considering the more general non-linear case, and there are examples from this class, e.g., Fornberg (1973), for which TR is indeed only conditionally stable].

But the main reason that we are not attracted to IMR is that only TR provides us with easy and accurate methods of error control and automatic step size variation, as we will soon show.

c. Backward differentiation formulae (BDF)

The last implicit method that we will consider is BDF2 (see Gear, 1971, and Hindmarsh, 1972). It is the second of the so-called 'stiffly-stable' BDF ODE methods (implicit Euler being the first) meaning, roughly at least, that ODE's like $\dot{y} = -\lambda y$ with $\lambda > 0$ are integrated in an unconditionally stable way, and those like $\dot{y} = i\omega y$ are at least conditionally stable; recall that stiff stability is weaker than A-stability. BDF2 is sometimes referred to as the second-order, implicit Euler method in the CFD literature (BDF1 is precisely implicit Euler).
BDF2 is not neutrally stable, like TR or IMR; rather, it (like backward Euler, but less enthusiastically) will damp an oscillatory solution like that of $\dot{y} = i\omega y$, and thus would be shunned by those wishing to solve pure advection on long time scales. It is, for (2.7-2),

$\frac{y_{n+1} - y_n}{\Delta t} = \frac{1}{3}\,\frac{y_n - y_{n-1}}{\Delta t} + \frac{2}{3}\,\dot{y}_{n+1}$,  (2.7-72)

in which form it appears to be one-third 'extrapolation' and two-thirds backward Euler. (The general BDF family seems susceptible to a similar interpretation.) Another way to write it is

$\frac{3y_{n+1} - 4y_n + y_{n-1}}{2\Delta t} = \dot{y}_{n+1}$,  (2.7-73)

in which the LHS is, via Taylor series, a well-known, second-order-accurate, one-sided approximation to $\dot{y}$ at $t_{n+1}$. BDF2 is actually better than just stiffly stable, and even better
than A-stable [see Figures 11.6 and 11.7 of Gear, 1971 (the higher-order BDF methods are only stiffly stable)]. Applied to $\dot{y} = -\lambda y$, it gives

$y_{n+1} = \frac{4y_n - y_{n-1}}{3 + 2\lambda\Delta t}$,  (2.7-74)

which, via $y_{n+1} = \xi y_n$, yields a quadratic equation for $\xi$ with roots $\xi = (2 \pm \sqrt{1 - 2\lambda\Delta t})/(3 + 2\lambda\Delta t)$, with $|\xi| < 1$, which displays L-stability ($y_{n+1} \to 0$ for $\lambda\Delta t \to \infty$), a good feature for dissipative systems (at least). The disadvantages of BDF2 are two: it is not self-starting (requiring, for example, a BE first step, at a smaller $\Delta t$, or, preferably in our opinion, TR at the same $\Delta t$) and it displays one spurious root (given by the minus sign in the $\xi$-equation). Also, for pure diffusion ($\lambda$ real) and $\lambda\Delta t > 1/2$, BDF2 displays (in both roots) an oscillatory damped behavior, vis-a-vis BDF1 (BE), which is monotone.

While BDF2 may seem thus to be less attractive than TR (which we believe to be generally the case), it may be a useful complement to TR when variable timesteps are employed and a solution is heading for a (non-zero) steady state (which, of course, precludes the undamped, pure advection cases). We shall return to this point later.

To conclude our brief visit to implicit ODE methods, we summarize as follows: TR and IMR are A-stable whereas BDF1 (implicit Euler) and BDF2 are L-stable. The former pair are clearly preferred for pure advection whereas the latter pair have certain 'stability' advantages (e.g., errors are always damped).

2.7.4 A Variable-Step Implicit Method for Advection-Diffusion

Having presented a number of methods in the fixed-$\Delta t$ context, it is time to get 'serious' regarding implicit ODE methods and show how they should be used.
If one is committed to (or more interested in) implicit methods so that $\Delta t$ can be selected based principally on desired accuracy with little or no regard to stability issues, one is naturally led to consider (stiffly stable or even A-stable) implicit methods in which $\Delta t$ is varied during the time integration procedure, being 'small' only when necessary (lots of high-frequency and/or small time-constant modes that are active/important, or via a 'busy' forcing function) and 'large' whenever possible (lack of high-frequency modes, small time-constant modes are of small amplitude, and any forcing is 'slow'). Such a method we refer to as a smart integrator because it follows the 'physics' intelligently, and is a natural adjunct to FEM which, when employed optimally, puts nodes where they are most needed.

a. Variable step trapezoid rule

Based on the above considerations, the implicit method that we favor for such an approach is TR, whose only detraction is its tendency to oscillate when $\Delta t$ is (too) large, a detraction that is absent when $\Delta t$ is selected properly, a point that we cannot emphasize too strongly. That is, our TR integrator will not generate visible oscillations even when $\lambda\Delta t$, as selected by the smart integrator, is very large. (Large $\Delta t$ will only be employed when it is 'safe'/appropriate to do so.)

The ODE theory that we employ is that in which an explicit method (cheap by comparison with the implicit method) is also employed in such a way that the LTE of TR can be easily and reliably estimated and used to control accuracy via step size changes; increase $\Delta t$ whenever possible, decrease $\Delta t$ only when necessary. (The timesteps are selected to
follow the 'physics.') Since TR is second-order accurate and uses two values of $\dot{y}$, the natural explicit method is second-order Adams-Bashforth, AB2. These two are used side by side to make a variable-step TR that we call 'smart': it emulates just what the better ODE software packages do, which packages could sometimes be used to advantage in CFD, at least if the problem is not too large. (But they rarely are; here's one exception we're aware of: Randriamampianiva et al. (1987).) The strategy will first be introduced with the scalar model problem (and favorably compared with its first-order counterpart: backward Euler, with forward Euler used to estimate the local error) and then for the GFEM ODE's for advection-diffusion, including some 'heuristics' and some 'warnings.' In the next chapter we will extend it to the NS equations.

The trick is to combine the variable-$\Delta t$ AB2 method, as a sort of 'predictor' (see, for example, Shampine and Gordon, 1975),

$y^p_{n+1} = y_n + \frac{\Delta t_n}{2}\left[(2 + \Delta t_n/\Delta t_{n-1})\,\dot{y}_n - (\Delta t_n/\Delta t_{n-1})\,\dot{y}_{n-1}\right]$  (2.7-75)

with LTE [cf. (2.7-10)]

$y^p_{n+1} - y(t_{n+1}) = -(2 + 3\Delta t_{n-1}/\Delta t_n)\,\Delta t_n^3\,\dddot{y}_n/12 + O(\Delta t^4)$,  (2.7-76)

which is used to estimate the LTE of TR, given by (2.7-59), with $\Delta t$ replaced by $\Delta t_n$, the current step size; i.e., the pair of equations, (2.7-59) and (2.7-76), in the two unknowns, $y(t_{n+1})$ and $\dddot{y}_n$, can be solved, and the item of interest, $d_n = y_{n+1} - y(t_{n+1})$, where $y_{n+1}$ is the TR solution, computed. The result is

$d_n = \frac{y_{n+1} - y^p_{n+1}}{3(1 + \Delta t_{n-1}/\Delta t_n)}$;  (2.7-77)

the local error in the TR step [to $O(\Delta t_n^4)$] is simply proportional to the difference between the TR and the AB2 solution, with a known proportionality constant. Armed with this knowledge, we can vary the step size in such a way that this local error is maintained below a (user-specified) tolerance, which also keeps any TR oscillations within the 'noise' level [$\approx O(\varepsilon)$, where $\varepsilon$ is the specified error allowance], at least when $\varepsilon$ is chosen to be 'sufficiently small.'
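As a quick check on the estimate (2.7-77), the sketch below takes a single AB2/TR step on $\dot{y} = -\lambda y$, starting from exact data, and compares the estimated $d_n$ with the true local error. The function name `tr_lte_estimate` and the values $\lambda = 1$, $\Delta t_n = 0.01$, $\Delta t_{n-1} = 0.02$ are illustrative assumptions, not values from the text.

```python
import math


def tr_lte_estimate(lam, dt, dt_prev):
    """One AB2/TR step on y' = -lam*y, started from exact data at t_n = 0.

    Returns (d_est, d_true): the estimate from Eq. (2.7-77) and the true
    local error y_TR - y_exact at t_{n+1} = dt.
    """
    y_n = 1.0
    ydot_n = -lam * y_n
    ydot_prev = -lam * math.exp(lam * dt_prev)   # exact ydot at t_{n-1} = -dt_prev
    r = dt / dt_prev
    # AB2 predictor, Eq. (2.7-75)
    y_pred = y_n + 0.5 * dt * ((2.0 + r) * ydot_n - r * ydot_prev)
    # TR step for the linear ODE, Eq. (2.7-57)
    y_tr = y_n * (1.0 - 0.5 * lam * dt) / (1.0 + 0.5 * lam * dt)
    # Local error estimate, Eq. (2.7-77)
    d_est = (y_tr - y_pred) / (3.0 * (1.0 + dt_prev / dt))
    d_true = y_tr - math.exp(-lam * dt)
    return d_est, d_true


d_est, d_true = tr_lte_estimate(lam=1.0, dt=0.01, dt_prev=0.02)
print("estimated d_n:", d_est)
print("true local error:", d_true)
print("ratio:", d_est / d_true)
```

For small $\lambda\Delta t$ the two numbers agree to within a few percent, the discrepancy being the neglected higher-order term.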
[Some of the smart integrator 'package' falls into the lap of (is the responsibility of) the user.]

Remarks:
(1) A simple rearrangement of (2.7-75) makes it more 'transparent':

$y^p_{n+1} = y_n + \Delta t_n\,\dot{y}_n + \frac{\Delta t_n^2}{2}\left(\frac{\dot{y}_n - \dot{y}_{n-1}}{\Delta t_{n-1}}\right)$.

(2) If the ODE is linear in $y$, then AB2 is used only to estimate $d_n$; $y^p_{n+1}$ is then 'discarded.' (For the general non-linear case, or for the alternative AB/TR algorithm to be described below, $y^p_{n+1}$ has a dual use, further amortizing its (small) added cost.)

Suppose we have just computed $d_n$ from (2.7-77). The first thing we might want to do is improve our current result by using (2.7-77) again, this time in the form

$y(t_{n+1}) \approx y_{n+1} - d_n$,  (2.7-78)
which is ostensibly a better estimate than the TR result, $y_{n+1}$; i.e., subtracting the LTE yields a result that should be third-order accurate. And this is true. But it is not true that the above value should be used to replace $y_{n+1}$ in the overall algorithm, which would then no longer be TR. It could be used at 'output' times, but should not be used in any other way since A-stability would then be lost. And, it is only really sensible (and recommended) to use it at the first step beyond the specified output times, and an [$O(\Delta t^4)$] interpolation scheme would need to be invoked to retain the third-order accuracy at exactly the specified output times.

The next step is to compute the next step, $\Delta t$. This is done using the LTE of TR, $d_n = \Delta t_n^3\,\dddot{y}_n/12$, as follows: if we change $\Delta t$ from $\Delta t_n$ to $\Delta t_{n+1}$ and take the next step, we would clearly expect to see $d_{n+1}/d_n = (\Delta t_{n+1})^3\,\dddot{y}_{n+1}/(\Delta t_n^3\,\dddot{y}_n) = (\Delta t_{n+1}/\Delta t_n)^3 + O(\Delta t_n)$, which is just a reflection of TR's second-order accuracy. Now we place the following accuracy constraint on the next TR solution:

$|d_{n+1}| \le \varepsilon\,y_{\max}$,  (2.7-79)

where $\varepsilon$ is the user-specified, dimensionless relative-error tolerance parameter, and $y_{\max}$ (also user-supplied) is an appropriate scaling factor. This leads to $\varepsilon\,y_{\max}/|d_n| \ge (\Delta t_{n+1}/\Delta t_n)^3$, which we use to obtain our next step size (by invoking equality):

$\Delta t_{n+1} = \Delta t_n\,(\varepsilon\,y_{\max}/|d_n|)^{1/3}$,  (2.7-80)

and we are basically finished with the derivation. If $|d_n| = |y_{n+1} - y^p_{n+1}|/[3(1 + \Delta t_{n-1}/\Delta t_n)]$ is larger than $\varepsilon\,y_{\max}$, then the timestep will be reduced, and vice versa. Finally, to set up for the next AB2 predictor, (2.7-75), we 'invert' the TR as follows:

$\dot{y}_{n+1} = \frac{2}{\Delta t_n}(y_{n+1} - y_n) - \dot{y}_n$,  (2.7-81)

thus providing a cheap, recursive formula for use in the next predictor step.

Remarks:
(1) An alternative method of error control is based on the (dimensionally inconsistent) concept of 'local error per unit step' (Shampine and Gordon, 1975); an example of its inferiority is given in Griffiths (1988).
(2) It is generally not advisable to replace (2.7-81) by (2.7-2); i.e., $\dot{y}_{n+1} = f(y_{n+1}, t_{n+1})$, because it is generally more costly and would cause accumulation of undamped errors.
(3) The global error, $e_n = y_n - y(t_n)$, is of course the error that we would like to control directly, a feat which, unfortunately, is too difficult (expensive) to perform. (It would require, for example, at least two full integrations at different $\Delta t$'s, via 'step doubling,' or equivalent.) Thus, the following caveat from the experts: 'Local error control in a code can be viewed as a knob that can be turned to try to adjust the step sizes and hence the global error. It is not a guarantee of small global error' (Hindmarsh and Petzold, 1995a).

Written as an algorithm, the overall smart integrator is the following:

Initialization: (The predictor begins after the first step and error control after the second.)
(1) Given $y_0$ and $\varepsilon$, compute $\Delta t_0 = \tau\varepsilon^{1/3}$, where $\tau$ is an estimate of the initial 'time-constant.' (If no such estimate is available, then a conservatively small $\Delta t_0$ can
be selected because the smart integrator will quickly increase $\Delta t$ to the proper, $\varepsilon$-dependent value.)
(2) Solve for $y_1$ from TR: $y_1 = y_0 + (\Delta t_0/2)(\dot{y}_0 + \dot{y}_1)$.
(3) Compute $\dot{y}_1 = (2/\Delta t_0)(y_1 - y_0) - \dot{y}_0$, and we are ready for the

General Step: $n = 1, 2, \ldots$, with $\Delta t_1 = \Delta t_0$:
(1) Compute $y^p_{n+1} = y_n + (\Delta t_n/2)[(2 + \Delta t_n/\Delta t_{n-1})\,\dot{y}_n - (\Delta t_n/\Delta t_{n-1})\,\dot{y}_{n-1}]$.
(2) Solve $y_{n+1} = y_n + (\Delta t_n/2)(\dot{y}_n + \dot{y}_{n+1})$, where, when $f(y, t)$ is non-linear, use also $y^p_{n+1}$ as a first guess when solving for $y_{n+1}$. [Note that in general, the solution for $y_{n+1}$ involves the solution of a non-linear equation, $y_{n+1} - (\Delta t_n/2) f(y_{n+1}, t_{n+1}) = y_n + (\Delta t_n/2) f(y_n, t_n) \equiv b_n$.]
(3) $d_n = (y_{n+1} - y^p_{n+1})/[3(1 + \Delta t_{n-1}/\Delta t_n)]$.
(4) $\dot{y}_{n+1} = (2/\Delta t_n)(y_{n+1} - y_n) - \dot{y}_n$.
(5) $t_{n+1} = t_n + \Delta t_n$.
(6) $\Delta t_{n+1} = \Delta t_n\,(\varepsilon\,y_{\max}/|d_n|)^{1/3}$.
(7) Go to (1) unless it is time to STOP.

An alternative AB/TR algorithm, equivalent in theory but more accurate in the face of round-off error, is the following (A. Hindmarsh, personal communication), beginning with a better way of updating the derivative for the predictor step; instead of 'inverting' TR to obtain the next value of $\dot{y}$ for the predictor, we proceed as follows:

$\dot{y}^p_{n+1} = \frac{2}{\Delta t_n}(y^p_{n+1} - y_n) - \dot{y}_n$  (2.7-82)

rather than (2.7-81), a small change [$O(\Delta t^3)$] that, combined with those to follow, can reduce the round-off error by a factor (roughly) of $|y_{n+1} - y_n|/|y_n|$. Inserting $y^p_{n+1}$ from (2.7-75) gives the final form for computation:

$\dot{y}^p_{n+1} = (1 + \Delta t_n/\Delta t_{n-1})\,\dot{y}_n - (\Delta t_n/\Delta t_{n-1})\,\dot{y}_{n-1}$.  (2.7-83)

[Noting that $\ddot{y}_n \approx (\dot{y}_n - \dot{y}_{n-1})/\Delta t_{n-1}$, (2.7-83) 'looks like' $\dot{y}^p_{n+1} = \dot{y}_n + \Delta t_n\,\ddot{y}_n$.] The next change is to solve for the (small) difference between the predictor and the corrector rather than for $y_{n+1}$ itself. Subtracting $y^p_{n+1} = y_n + (\Delta t_n/2)(\dot{y}_n + \dot{y}^p_{n+1})$ from $y_{n+1} = y_n + (\Delta t_n/2)(\dot{y}_n + \dot{y}_{n+1})$ yields the '$\delta$-form' of TR,

$y_{n+1} - y^p_{n+1} \equiv \delta y = \frac{\Delta t_n}{2}(\dot{y}_{n+1} - \dot{y}^p_{n+1}) = \frac{\Delta t_n}{2}\left[f(y_{n+1}, t_{n+1}) - \dot{y}^p_{n+1}\right] = \frac{\Delta t_n}{2}\left[f(y^p_{n+1} + \delta y,\, t_{n+1}) - \dot{y}^p_{n+1}\right]$,  (2.7-84)

a non-linear (in general) equation in $\delta y$ [cf. Step (2) above].
After solving (2.7-84) for $\delta y$, the final change is in the way that $\dot{y}$ is updated for the next step:

$\dot{y}_{n+1} = \dot{y}^p_{n+1} + \frac{2}{\Delta t_n}\,\delta y$,  (2.7-85)
obtained by subtracting (2.7-82) from (2.7-81). [Note that, in the absence of round-off error, (2.7-81) and (2.7-85) are completely equivalent, as are the other results.] Thus, in the improved AB/TR algorithm, Step (2) is replaced by (2.7-83) and (2.7-84) and Step (4) by (2.7-85). Also, of course, we need to add $y_{n+1} = y^p_{n+1} + \delta y$.

Finally, we address the proper way to present the results. To retain the TR accuracy at user-specified output times, a second-order-accurate interpolation formula is needed: for $t_n < t < t_{n+1}$, the following formula does the trick, and is recommended: (2.7-86)

b. Variable step backward Euler

Before applying TR to AD, let us briefly present a similar, but first-order algorithm (partly) for the purpose of showing that it is not much cheaper per timestep (although rather less accurate). Then we will show that it often requires many more timesteps and is thus usually not a serious competitor. (The details of the algorithm design are left as an exercise.)
(0) Initial step size: $\Delta t_0 = \tau\varepsilon^{1/2}$. For $n = 0, 1, 2, \ldots$, do
(1) FE predictor: $y^p_{n+1} = y_n + \Delta t_n\,\dot{y}_n$.
(2) BE step: $y_{n+1} = y_n + \Delta t_n\,\dot{y}_{n+1}$; solve for $y_{n+1}$.
(3) $d_n = (y_{n+1} - y^p_{n+1})/2$.
(4) $t_{n+1} = t_n + \Delta t_n$.
(5) $\Delta t_{n+1} = \Delta t_n\,(\varepsilon\,y_{\max}/|d_n|)^{1/2}$.
(6) Go to (1), or STOP.

Remarks:
(1) If $f(y)$ is expensive to evaluate, $\dot{y}$ in step (1) should/could be obtained by 'inverting' the BE formula via $\dot{y}_n = (y_n - y_{n-1})/\Delta t$, which yields the simple extrapolation formula $y^p_{n+1} = 2y_n - y_{n-1}$.
(2) A variable step version of BDF2 will be presented in the next chapter (Section 3.16.4) and would be preferred to that above for BE (BDF1) in the event that a more stable method than TR is desired or required.

c. A model problem

It is of some interest to note that for the simple linear model problem $\dot{y} = -\lambda y$ with $y(0) = 1$, a 'perfect' algorithm (one whose local error estimate, from $d_n = \Delta t_n^3\,\dddot{y}_n/12$ for TR and $d_n = \Delta t_n^2\,\ddot{y}_n/2$ for BE, uses the exact solution for calculating $d_n$) would produce timesteps that increase exponentially like

(i) TR: $\lambda\Delta t(t) = (12\varepsilon)^{1/3} e^{\lambda t/3}$.  (2.7-87)
(ii) BE: $\lambda\Delta t(t) = (2\varepsilon)^{1/2} e^{\lambda t/2}$.  (2.7-88)
These are the theoretical goals. It turns out that in practice the growth of $\Delta t$ is conservative (sometimes too much so); $\Delta t$ grows more slowly than these theoretical 'goals,' partly because true $e^{-\lambda t}$ behavior is (by design) 'lost' upon reaching $y \approx \varepsilon$. For $\varepsilon = 10^{-4}$ (a typical value), $\lambda\Delta t(0) = 0.106$ for TR and $\lambda\Delta t(0) = 0.014$ for BE, which is 7.5 times smaller. But $\Delta t$ grows faster for BE and would (in theory) catch TR in step size at the time given by $(12\varepsilon)^{1/3} e^{\lambda t/3} = (2\varepsilon)^{1/2} e^{\lambda t/2}$, or $\lambda t = 6\ln[(12\varepsilon)^{1/3}/\sqrt{2\varepsilon}] = 12.1$ for this example. But $e^{-12.1} = 5 \times 10^{-6}$, so there is virtually nothing left of the transient; TR wins, especially when the total number of steps to the cross-over point is counted: about 32 for TR and about 148 for BE. (For the 'real,' not theoretical, results, see below.)

We will show two more simple examples before dropping the simple model problems: the first a continuation, with more details, of the above case and the second with two disparate time constants.

The results of the model problem $\dot{y} = -\lambda y$ are shown in two tables (Tables 2.7-1 and 2.7-2) and two figures (Figures 2.7-7 and 2.7-8) for TR and BE. The numerical ('real') results, in the tables, are seen to be increasingly conservative as $\Delta t$ increases, as mentioned above. The second figure shows the maximum global error ($e_{\max}$) as a function of the local error ($\varepsilon$) for the 'real' results and verifies the theory that asserts that the former is one order lower than the latter; i.e., if $d = c_1 h^{q+1}$ for a $q$-th order method, then $e = c_2 h^q$ is the global error for a fixed-$\Delta t$ integration. If $d = \varepsilon$, then $e = c_2(\varepsilon/c_1)^{q/(q+1)}$, and the slopes of the graphs are 'close' to this result: about 0.633 ($q = 2$) for TR and about 0.465 for BE ($q = 1$). Note too, for the same $\varepsilon$, that BE's global error is approximately an order of magnitude larger than TR's; e.g., for $\varepsilon = 10^{-4}$, $e_{\max}(\mathrm{BE}) = 0.0036$ and $e_{\max}(\mathrm{TR}) = 0.00047$. (Inversely, for $e_{\max} = 10^{-3}$, TR needs $\varepsilon = 3 \times 10^{-4}$, but BE needs $\varepsilon = (7\text{-}8) \times 10^{-6}$.)
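The general-step list given earlier for the AB2/TR smart integrator can be sketched for the model problem $\dot{y} = -\lambda y$ as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `ab2_tr_smart` is hypothetical, the initial step uses $\tau = 1/\lambda$ in $\Delta t_0 = \tau\varepsilon^{1/3}$, the TR equation is solved in closed form because the ODE is linear, and a tiny floor on $|d_n|$ (not in the text) guards against division by zero.

```python
import math


def ab2_tr_smart(lam, eps, t_end, y0=1.0, ymax=1.0):
    """Variable-step AB2/TR on y' = -lam*y.

    Returns a list of (t_n, dt_n, y_n), one entry per completed step.
    """
    dt = eps ** (1.0 / 3.0) / lam      # dt_0 = tau*eps^(1/3), assuming tau = 1/lam
    t, y, ydot = 0.0, y0, -lam * y0

    # Initialization: one plain TR step, then 'invert' TR for ydot (Eq. 2.7-81).
    y_new = y * (1.0 - 0.5 * lam * dt) / (1.0 + 0.5 * lam * dt)
    ydot_new = (2.0 / dt) * (y_new - y) - ydot
    t += dt
    history = [(t, dt, y_new)]
    ydot_prev, ydot, y, dt_prev = ydot, ydot_new, y_new, dt   # dt_1 = dt_0

    while t < t_end:
        r = dt / dt_prev
        # (1) AB2 predictor, Eq. (2.7-75)
        y_pred = y + 0.5 * dt * ((2.0 + r) * ydot - r * ydot_prev)
        # (2) TR step; closed form because the ODE is linear
        y_new = y * (1.0 - 0.5 * lam * dt) / (1.0 + 0.5 * lam * dt)
        # (3) local error estimate, Eq. (2.7-77)
        d = (y_new - y_pred) / (3.0 * (1.0 + dt_prev / dt))
        # (4) derivative for the next predictor, Eq. (2.7-81)
        ydot_new = (2.0 / dt) * (y_new - y) - ydot
        # (5) advance time
        t += dt
        history.append((t, dt, y_new))
        ydot_prev, ydot, y = ydot, ydot_new, y_new
        # (6) next step size, Eq. (2.7-80); the floor on |d| is an added guard
        dt_prev, dt = dt, dt * (eps * ymax / max(abs(d), 1e-300)) ** (1.0 / 3.0)
    return history


hist = ab2_tr_smart(lam=1.0, eps=1e-4, t_end=8.0)
max_err = max(abs(y - math.exp(-t)) for t, _, y in hist)
print(len(hist), "steps; max global error =", max_err)
for t, dt, y in hist[::5]:
    print(f"t = {t:7.4f}   dt = {dt:7.4f}   y = {y:.5f}")
```

Because the starting step here differs from the one used for Table 2.7-1, the numbers will not match the table digit for digit; the point of the sketch is the qualitative behavior, with $\Delta t$ growing as the solution decays while the global error stays at a small multiple of $\varepsilon$. A production code would normally also limit the step growth ratio and reject steps whose estimated error is too large, neither of which is attempted here.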
A bound on the global error for TR for the above situation (monotonically increasing At) is given in Hairer and Wanner (1991) via yn — y{tn) ^ cAt?max ■ maxoscr^,, y(t), where Atmax = Atn. They also give the result for non-monotonic At increases. Our last 'theoretical' demonstration of a smart integrator is the following: suppose we have the solution ;y = 0.5(e-A|r+e-*2r) (2.7-89) for one of the components of a pair of ODE's. If we apply the 'theoretical' TR to this solution, then we can get a pretty good idea as to how the integrator would change At through the course of the integration. Requiring |Ar3y/12| = e leads to the theoretical Table 2.7-1 AB/TR on y = -Xy. n 1 2 3 4 5 10 15 20 25 30 XAtn 0.1063 0.1063 0.1100 0.1140 0.1183 0.1453 0.1874 0.2618 0.4233 0.9722 Xtn 0.1063 0.2125 0.3225 0.4366 0.5549 1.223 2.068 3.209 4.944 8.381 yf — 0.8089 0.7246 0.6465 0.5743 0.2944 0.1265 0.04050 0.00732 0.00043 ylR 0.8991 0.8084 0.7241 0.6460 0.5738 0.2939 0.1260 0.04005 0.00693 0.00019 y(t„) 0.8992 0.8085 0.7243 0.6463 0.5742 0.2944 0.1265 0.04040 0.00712 0.00023 yTnR -y(tn) -9.01 x 10 5 -1.62 x 10"4 -2.26 x 10"4 -2.81 x 10"4 -3.29 x 10"4 -4.65 x 10 4 -4.58 x 10"4 -3.53 x 10"4 -1.94 x 10"4 -4.34 x 10"5
272 THE ADVECTION-DIFFUSION EQUATION Table 2.7-2 FE/BE on y = -Xy. n 1 2 3 4 5 10 15 20 40 60 80 100 120 140 XAtn 0.01414 0.01424 0.01445 0.01455 0.01465 0.01521 0.01580 0.01644 0.01963 0.02432 0.03193 0.04631 0.08317 0.3323 Mn 0.01414 0.02838 0.04273 0.05717 0.07172 0.1461 0.2233 0.3035 0.6607 1.095 1.646 2.404 3.611 6.613 yFnE 0.9859 0.9720 0.9583 0.9446 0.9311 0.8648 0.8010 0.7397 0.5191 0.3379 0.1960 0.09306 0.02873 0.00165 yBnE 0.9861 0.9722 0.9585 0.9448 0.9313 0.8650 0.8012 0.7399 0.5193 0.3381 0.1962 0.09325 0.02891 0.00181 y(tn) 0.9860 0.9720 0.9582 0.9444 0.9308 0.8641 0.7999 0.7382 0.5165 0.3347 0.1927 0.09033 0.02702 0.00134 y%E-y(tn) 9.77 x 10"5 1.94 x 10"4 2.89 x 10"4 3.82 x 10"4 4.74 x 10"4 9.14 x 10"4 1.32 x 10"3 1.69 x 10"3 2.82 x 10"3 3.40 x 10"3 3.43 x 10"3 2.92 x 10"3 1.89 x 10"3 4.68 x 10"4 101 10° At 10"1 10"2 0 2 4 6 8 10 12 14 16 Fig. 2.7-7 TR and BE on a scalar test problem (theoretical results). TR formula / 24e \l/3 A'- = U-*.'-+*te-"-) ' (2'7"90> A picture of this result, for A.) = 1, X2 = 10, and e = 10~3, is shown in Figure 2.7-9, along with the variations that would occur if each time constant was behaving independently. It is seen that the smart integrator does just what it should: follow the rapidly varying part with sufficiently small steps while it is important to do so but not thereafter. If this was a true two-equation ODE system, then the algorithm presented above would behave much like these theoretical results—and no TR oscillations would occur [none larger than O(10~3) anyway] with the solution being accurate for all t, without 'wasting' timesteps. 1 I I I I I V
Fig. 2.7-8 Maximum global errors for TR and BE.

Fig. 2.7-9 Trapezoid rule $\Delta t$ selection.

d. An aerospace version of TR

In some implicit CFD codes (in which the ODE's are no longer linear) developed by and employed in the aerospace industry, typically at NASA Ames (e.g., Beam and Warming, 1982), a 'simplified' BE or TR method is used. They use the term 'linearized BE' (first-order) or 'linearized TR' (second-order) to describe what is also sometimes called 'one-step Newton', although it is also called a linearly implicit method by some (see, for example, Hairer and Wanner, 1991). Additionally, these codes are typically inefficient in that they use fixed timesteps and no error control, and they are actually
not guaranteed to be A-stable because of the linearizing approximations invoked. Here we shall compare three methods for solving the explicit ODE's described by the (non-linear) vector-valued system $\dot{y} = f(y)$: (i) the rigorous-but-somewhat-costly way, (ii) the 'inefficient' way, and (iii) a reasonable compromise that might even help the large aerospace CFD community. We shall employ the second-order method and leave the other as an exercise. We show only the fixed $\Delta t$ version, for simplicity. Generalization via error estimates and local timestep control is, or should be, straightforward (see below). Starting with the TR formula, $y_{n+1} = y_n + \Delta t(\dot{y}_n + \dot{y}_{n+1})/2 = y_n + \Delta t(f_n + f_{n+1})/2$, we have:

1. Rigorous TR. Define $F(y_{n+1}) = y_{n+1} - y_n - \Delta t(f_n + f_{n+1})/2 = 0$ and apply Newton's method:

$$\frac{\partial F(y_{n+1}^{(m)})}{\partial y_{n+1}} \cdot \left(y_{n+1}^{(m+1)} - y_{n+1}^{(m)}\right) = -F(y_{n+1}^{(m)}),$$

where $y_{n+1}^{(0)} = y_{n+1}^P = y_n + \Delta t(3f_n - f_{n-1})/2$, and $m$ is the iteration index. [$\partial F/\partial y = I - \tfrac{1}{2}\Delta t\,\partial f(y)/\partial y$ is called the Jacobian matrix.] When $F(y_{n+1}^{(m+1)})$ is 'sufficiently small,' stop.

2. Aerospace method. Linearize $f_{n+1}$ in the TR formula via $f_{n+1} = f_n + \partial f(y_n)/\partial y_n \cdot (y_{n+1} - y_n) + O(\Delta t^2)$ to obtain $y_{n+1} = y_n + \tfrac{1}{2}\Delta t[2f_n + J_n(y_{n+1} - y_n) + O(\Delta t^2)]$, where $J_n = \partial f/\partial y$ is also called the Jacobian matrix. Dropping the $O(\Delta t^2)$ term gives $(I - \Delta t J_n/2)(y_{n+1} - y_n) = \Delta t f_n$, which is also sometimes called the 'delta method.' It is important to note that if no iterations are taken (and this seems to be the 'rule') then the result is no longer TR; it is an approximation to TR. This is the linearized TR that we call the 'aerospace method.' The Euler version of the method is also called the linearly implicit Euler method (Hairer and Wanner, 1991).

3. One-step Newton. In fact, the linearized TR described above is also what some ODE people call 'one-step Newton.'
But it is very easy to improve this one-step Newton method by obtaining a much better first guess, via the AB2 predictor, which gives our recommended linearized/one-step Newton scheme: solve

$$\frac{\partial F}{\partial y} \cdot (y_{n+1} - y_{n+1}^P) = \left[I - \tfrac{1}{2}\Delta t\,J(y_{n+1}^P)\right] \cdot (y_{n+1} - y_{n+1}^P) = -F(y_{n+1}^P) = -y_{n+1}^P + y_n + \tfrac{1}{2}\Delta t\left[f_n + f(y_{n+1}^P)\right] = \tfrac{1}{2}\Delta t\left[f_{n-1} - 2f_n + f(y_{n+1}^P)\right]$$

for $y_{n+1}$, which is (very) little more work yet much more accurate than the 'aerospace method.'

e. TR on advection-diffusion

We now return to the full GFEM linear ODE's governing advection and diffusion, (2.2-7) or (2.2-25), and generalize the variable-step integrator to these coupled ODE's, which of course introduces significant additional issues, and some 'heuristics.' First we write the general 'corrector' step, then the general 'predictor' step, then the general 'acceleration' update step. After that, we will return to the beginning (startup) and describe an entire algorithm. [In the next chapter, we will extend these results to cases in which the ODE's are non-linear, which in fact we have already just (tersely) introduced above.] The TR applied to (2.2-7) leads to the linear system

$$\left(\frac{2}{\Delta t}M + A\right)T_{n+1} = \left(\frac{2}{\Delta t}M - A\right)T_n + 2f, \tag{2.7-91}$$

where $A = N(u) + K$ is $N \times N$, and we limit, for now, the discussion to the constant-coefficient case; in particular, the velocity is time-independent. Noting that $f - AT_n = M\dot{T}_n$ leads to the more efficient form of the RHS,
$$\left(\frac{2}{\Delta t}M + A\right)T_{n+1} = M\left(\frac{2}{\Delta t}T_n + \dot{T}_n\right) + f, \tag{2.7-92}$$

because $\dot{T}_n$ will be available 'recursively.' It is worth pointing out that the 'more efficient' form is also more stable, a result that may be more important. Let us demonstrate this assertion for the simple scalar equation, $\dot{y} = -\lambda y$. The form corresponding to (2.7-91) is $y_{n+1} = [(1 - \lambda\Delta t/2)/(1 + \lambda\Delta t/2)]\,y_n$, followed by $\dot{y}_{n+1} = 2(y_{n+1} - y_n)/\Delta t - \dot{y}_n$, and that corresponding to (2.7-92) is $y_{n+1} = (y_n + (\Delta t/2)\dot{y}_n)/(1 + \lambda\Delta t/2)$ followed by the same (inverted) formula for $\dot{y}_{n+1}$ (needed for the predictor portion). If we let $x = (y, \dot{y})^T$, then we can relate $x_{n+1}$ to $x_n$ via a $2 \times 2$ matrix; $x_{n+1} = Bx_n$, where in the former case

$$B = \begin{pmatrix} \dfrac{1 - \lambda\Delta t/2}{1 + \lambda\Delta t/2} & 0 \\[2ex] \dfrac{-2\lambda}{1 + \lambda\Delta t/2} & -1 \end{pmatrix} \tag{2.7-93}$$

and in the latter case,

$$B = \frac{1}{1 + \lambda\Delta t/2}\begin{pmatrix} 1 & \Delta t/2 \\ -\lambda & -\lambda\Delta t/2 \end{pmatrix}. \tag{2.7-94}$$

The corresponding eigenvalues, $\{\mu\}$, from $Bz = \mu z$, are $\mu = [-1,\ (1 - \lambda\Delta t/2)/(1 + \lambda\Delta t/2)]$ for the first case, and $\mu = [0,\ (1 - \lambda\Delta t/2)/(1 + \lambda\Delta t/2)]$ for the second. Since $x_n = B^n x_0$, the former (with eigenvalue $-1$) displays a complete lack of damping, of roundoff errors, for example. Thus, the latter case, (2.7-92) for the AD equation, is to be preferred. The general 'predictor step' is

$$T_{n+1}^P = T_n + \frac{\Delta t_n}{2}\left[\left(2 + \frac{\Delta t_n}{\Delta t_{n-1}}\right)\dot{T}_n - \frac{\Delta t_n}{\Delta t_{n-1}}\dot{T}_{n-1}\right], \tag{2.7-95}$$

where, of course, the 'accelerations' in the RHS are easily obtained by inverting TR; namely, (2.7-81). We now present the entire algorithm and introduce a few heuristics, needed to answer the several natural questions that arise:

Startup:

(1) Solve $M\dot{T}_0 = f - AT_0$ for $\dot{T}_0$; DSCG is the recommended solution method.

(2) Select $\Delta t_0$ via $\Delta t_0 = \tau\varepsilon^{1/3}$, where $\varepsilon$ is the relative error tolerance (e.g., $10^{-4}$), and $\tau$ is an estimate of the initial time constant, via

$$\tau = \frac{\max(T_s,\ \max_i |T_{0,i}|)}{\max_i |\dot{T}_{0,i}|},$$

where $T_s$ is a 'user-specified' input value (needed, for example, if $T_0 = 0$).
An alternative to estimating $\tau$ is simply to select a conservatively small value for $\Delta t_0$ and watch the smart algorithm quickly (after two timesteps) recover to a more appropriate value. Another alternative, one advocated by careful mathematicians (e.g., R. Rannacher) who worry about 'rough data', is to use a dissipative scheme such as BE or BDF2 for the first step or so, because TR will not 'algorithmically' damp noisy data (a subject that we shall return to at the end of this section).
(3) Take the first TR step; i.e., solve

$$\left(\frac{2}{\Delta t_0}M + A\right)T_1 = M\left(\frac{2}{\Delta t_0}T_0 + \dot{T}_0\right) + f.$$

(4) Invert TR to get the required AB2 data: $\dot{T}_1 = 2(T_1 - T_0)/\Delta t_0 - \dot{T}_0$.

General Step: With $\Delta t_1 = \Delta t_0$, for $n = 1, 2, \ldots$, do:

(1) $T_{n+1}^P = T_n + \tfrac{1}{2}\Delta t_n\left[(2 + \Delta t_n/\Delta t_{n-1})\dot{T}_n - (\Delta t_n/\Delta t_{n-1})\dot{T}_{n-1}\right]$.

(2) Solve $(2M/\Delta t_n + A)T_{n+1} = M(2T_n/\Delta t_n + \dot{T}_n) + f$.

(3) $\dot{T}_{n+1} = 2(T_{n+1} - T_n)/\Delta t_n - \dot{T}_n$.

(4) $d_n = (T_{n+1} - T_{n+1}^P)/[3(1 + \Delta t_{n-1}/\Delta t_n)]$.

(5) $t_{n+1} = t_n + \Delta t_n$.

(6) $\Delta t_{n+1} = \Delta t_n(\varepsilon/\|d_n\|)^{1/3}$.

(7) Go to (1) unless it is time to STOP.

The choice of norm, $\|\cdot\|$, deserves some discussion. Following the lead of ODE general-purpose software designers, we would generally opt for a properly weighted RMS norm: a relative root mean square norm. (Recall: $\|\cdot\|_{RMS} = \|\cdot\|_2/\sqrt{N}$, where $\|\cdot\|_2$ denotes the $L_2$-norm. Division by $N$ of the square gives the mean square; etc.) A 'properly weighted' norm will be dimensionless and well-scaled; cf. (2.7-80) for the scalar ODE. A good choice of scale factor for the $i$-th entry $d_{n,i}$, $i = 1, 2, \ldots, N$, is $|T_{n+1,i}| + T_s$, where $T_s$ is the user-supplied estimate (minimum expected value, in case $|T_{n+1,i}|$ is close to zero) discussed above. This leads to

$$\|d_n\|^2 = \frac{1}{N}\sum_{i=1}^{N}\left[\frac{d_{n,i}}{|T_{n+1,i}| + T_s}\right]^2, \tag{2.7-96}$$

as the generally suggested relative RMS norm.

Remarks:

(1) The all-important, user-specified tolerance parameter, $\varepsilon$, can obviously have a significant effect on both cost and accuracy. Too large an $\varepsilon$ can cause: (i) inaccuracy, (ii) TR oscillations, and (iii) a 'weakened' theory (no longer in the asymptotic range). Too small an $\varepsilon$ will merely cause the simulation to be excessively expensive, seeking more accuracy than the spatial grid 'deserves.' Recommended values, at least to start: $10^{-4} \le \varepsilon \le 10^{-3}$; experiment!
(2) Because we are really trying to solve PDE's and not 'just' ODE's, there may be times when a maximum ($L_\infty$) norm is actually preferable, so as to better (or, at least, more easily) capture the behavior near a spatial singularity, for example; i.e., if most of the error comes from one small region of the domain, the above norm may not be sufficiently 'sensitive' to the potentially locally large error. Thus, an alternate norm that may sometimes be useful is

$$\|d_n\|_\infty = \max_i \frac{|d_{n,i}|}{|T_{n+1,i}| + T_s}, \tag{2.7-97}$$
wherein the selected $i$ should be 'printed' (or saved), although a reasonable argument might be made that (2.7-96) with a tighter tolerance (smaller $\varepsilon$) might serve nearly as well.

(3) The $\tau$ estimate given in step (2) of 'Startup' is a simple 'heuristic' that tries to generalize from a scalar ODE to a system, and is not guaranteed to be conservative; others are surely possible, such as $\tau = \max_i(T_s, |T_{0i}|)/\max_i |\dot{T}_{0i}|$.

(4) The following relationships between norms may be useful:

$$\|\cdot\|_\infty/\sqrt{N} \le \|\cdot\|_{RMS} \le \|\cdot\|_\infty \le \|\cdot\|_2 = \sqrt{N}\,\|\cdot\|_{RMS} \le \sqrt{N}\,\|\cdot\|_\infty,$$

a manifestation of the fact that all norms are equivalent in a finite-dimensional space.

The 'improved' (less round-off error) version of the above 'general step' presented earlier [(2.7-83) through (2.7-85)] for the single ODE, goes like:

(i) Replace Step (2) by: solve for $\delta T$ from

$$\left(\frac{2}{\Delta t_n}M + A\right)\delta T = f - AT_{n+1}^P - M\dot{T}_{n+1}^P,$$

where $\dot{T}_{n+1}^P = (1 + \Delta t_n/\Delta t_{n-1})\dot{T}_n - (\Delta t_n/\Delta t_{n-1})\dot{T}_{n-1}$ is computed first.

(ii) Replace Step (3) by $\dot{T}_{n+1} = \dot{T}_{n+1}^P + (2/\Delta t_n)\,\delta T$.

(iii) Set $T_{n+1} = T_{n+1}^P + \delta T$.

(iv) Return to Step (4).

Except for a few (crucial) 'details,' the description of the smart integrator is complete. The rest is 'simply' linear algebra, in which reside the crucial details, which we now partially reveal via a list of questions:

Q1. Noting that the TR step generally involves the formation and solution of a new linear system (different matrix) for each timestep, we ask: Should we really change $\Delta t$ at every step? Related to this is:

Q2. Letting DTSF $= \Delta t_{n+1}/\Delta t_n$ (Delta T Scale Factor), should we treat DTSF < 1 differently than DTSF > 1? Note that the former will generally only occur with variable-coefficient problems (addressed below), and the latter will be 'the rule' for most AD simulations, especially diffusion-dominated; smaller steps are usually required at the beginning of a simulation rather than later.

Q3. Should we limit the magnitude of DTSF in general, e.g., 0.2 ≤ DTSF ≤ 2?
That is, stop with a 'Warning Message' if these limits are exceeded (after Step (2), to permit initially large changes, up or down, sometimes needed to 'correct' a poor choice of $\Delta t_0$).

But before answering these questions, we will 'generalize' the problem somewhat, which will raise even more questions. Then we shall try to answer all of them. Thus far we have assumed $u(x)$ is constant in time and that any boundary conditions and/or source terms were time-independent. Let us now generalize to perhaps a more typical case:
$u(x, t)$ is time-dependent and, usually, supplied by a GFEM solution of the Navier-Stokes equations on the same mesh. It is then just as convenient to permit the source term and BC's to vary with time, so that (2.7-92) generalizes to

$$\left(\frac{2}{\Delta t_n}M + A_{n+1}\right)T_{n+1} = M\left(\frac{2}{\Delta t_n}T_n + \dot{T}_n\right) + f_{n+1} = b, \tag{2.7-98}$$

which leads to the obvious simple changes in the above algorithm: replace $f$ and $A$ by $f_n$ and $A_n$ with the appropriate value of $n$ in Steps (1) and (3) of 'Startup' and in Step (2) of 'General Step.' The most important observation to be made in this more general case is that $A(t)$ does change at each step whether or not we actually change $\Delta t$, so that a matrix update seems to be absolutely required at each timestep. And this is true, as it stands. Before addressing this issue further, however, let us raise a few more key questions:

Q4. Since variable coefficients could conceivably change rapidly (thus 'surprising' the algorithm), at what lower limit on DTSF should we reject the result and repeat the current step at the (significantly) smaller $\Delta t$?

Q5. Should we 'test for stiffness,' as in good, general ODE software [such as LSODA, in Hindmarsh (1983); see also Sepehrnoori and Carey (1981)], and switch to a more efficient method when non-stiff, or should we simply retain the algorithm as presented (which 'presumes' that stiffness is always present)?

Q6. Suppose one wishes (for some strange reason) to lump the mass, or selects an FDM, which (effectively) converts the ODE's from 'implicit' to 'explicit.' Are there then more-efficient ODE methods (e.g., non-stiff) that should be considered?
{It is a somewhat ironic fact that mass lumping is more deleterious for advection problems than for diffusion problems in that mass lumping for advection is less accurate yet yields the 'more appropriate' (and non-stiff) explicit ODE's [$\dot{y} = M^{-1}f(y) = g(y)$] for which a nearly constant $\Delta t$ and an explicit method are often appropriate; yet it is the diffusion equation that generates stiff ODE's and 'demands' stiff (implicit) ODE methods whether or not mass lumping is invoked; i.e., consistent mass for advection generates implicit ODE's [$M\dot{y} = f(y)$], and lumped mass for diffusion generates explicit ODE's. C'est la vie.}

Q7. Last, but certainly not least: how are we going to solve the linear systems?

Enough questions. Time for some answers, mixed with guesses. Since the answer to some of the early questions depends in a crucial way on the answer to the last question, we answer it first.

A7. On most modern computers it is probably safe to say that the best way to solve the linear systems for all but the largest 2D simulations and for 'small' 3D simulations will turn out to be Gaussian elimination (direct method) in one form or another, usually via an LU decomposition (see Volume II), at least for those cases (probably the majority) in which it is not required to form a new LHS matrix at every timestep. For extremely large 2D problems and most 3D problems, the preferred solution method is iterative rather than direct, the reason being that the direct methods can usually only be 'best' when both $L$ and $U$ can be stored in 'memory.' If the problem is too large for in-core storage of $L$ and $U$, then the
relatively low storage requirements for iterative methods usually promote them into first place. The iterative method that we have in mind at this point is one of the simplest: DSCGS (diagonally scaled conjugate gradient squared). Diagonal scaling is particularly simple and should be generally effective. [If more sophisticated 'preconditioners' are to be employed, such as ILU (incomplete LU), then the strategy discussed below would be subject to change because then the cost of applying the preconditioner would not be 'negligible,' as it is with diagonal scaling.]

A1. This issue is noticeably different depending on the 'solver' selected, so we answer it in two parts, first for the DSCGS method: yes. If a different preconditioned iterative method is used, the 'yes' becomes a maybe, depending in part on the cost of applying the preconditioner. For the LU (direct) method, however, a few special 'tricks' are very worthwhile. We describe two LU algorithms, with the latter largely borrowed from ODE 'software.' The first algorithm is simple and effective, but perhaps a bit naive (relative to the second); and it is this:

(1) Compute $\Delta t_{n+1}$ as above for each step.

(2) If DTSF ≤ 0.8, then reject the current solution and re-compute $T_{n+1}$ using the smaller step.

(3) If 0.8 < DTSF < 1.0, then do not change the step size. This of course permits the re-use of the factored matrix $((2/\Delta t_n)M + A_{n+1})$, via forward-reduction and back-substitution. {A 're-solve' is generally much cheaper than the first solve since the cost of performing the LU decomposition is high relative to the re-solution [the ratio is $O(N)$, where $N$ is the length of the $T$-vector].}

(4) If 1.0 ≤ DTSF ≤ 1.5 and if this occurs four times in a row (to detect and verify a trend) then change $\Delta t$ and re-factor the matrix with the new $\Delta t$.

(5) If DTSF > 1.5, then change $\Delta t$ and re-factor the matrix; a 50% or more increase in step size is enough to make the change worthwhile.
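The five rules of this first LU algorithm amount to a small decision function. The sketch below is only one way to organize it: the function name, the action strings, and the choice to reset the 'four in a row' counter whenever any other branch is taken are our own assumptions, while the thresholds (0.8, 1.5, 4) are the rule-of-thumb values given in the text.

```python
def lu_step_decision(dtsf, growth_streak,
                     reject_below=0.8, big_growth=1.5, streak_needed=4):
    """Return (action, new_streak) for one step of the first LU algorithm.

    dtsf          -- proposed ratio dt_{n+1}/dt_n from the error estimate
    growth_streak -- number of immediately preceding steps that proposed
                     1.0 <= dtsf <= big_growth without a re-factorization
    """
    if dtsf <= reject_below:
        # Rule (2): the step just taken was too large; redo it with the smaller dt.
        return "reject_and_repeat_with_smaller_dt", 0
    if dtsf < 1.0:
        # Rule (3): keep dt and re-use the existing LU factors.
        return "keep_dt_reuse_factors", 0
    if dtsf <= big_growth:
        # Rule (4): modest growth; act only once the trend has been confirmed.
        streak = growth_streak + 1
        if streak >= streak_needed:
            return "change_dt_and_refactor", 0
        return "keep_dt_reuse_factors", streak
    # Rule (5): growth of more than 50% is worth a new factorization right away.
    return "change_dt_and_refactor", 0
```

A caller would carry `growth_streak` from one timestep to the next and form or re-use the factored matrix according to the returned action.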
Remarks: (1) The scalar parameters (0.8, 1.5, 4) are just suggestions/rules of thumb. Perhaps others would be better—the key piece of advice being that it is not cost-effective to re-factor at every step. (2) For the limiting case of pure diffusion, the above algorithm can deliver good results out to steady state in as few as 50 timesteps, or even less, because At grows rapidly. (3) For pure advection and 'uniform' flow, the algorithm may change At only very rarely, thus giving an implicit scheme that costs about the same as an explicit scheme in that the latter would factor the mass matrix (LU) once and for all, and each timestep would be a back-substitution. (4) For moderate Pe, the algorithm still works well, often increasing At moderately during the simulation. We shall show sample results at the end of the chapter. (5) As pointed out earlier in our discussion of model (scalar) problems, the growth rate of At (vs t) may occasionally tend to 'stall'; i.e., grow more slowly than you think it should, especially on problems that are approaching the steady state. C'est la vie.
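To make the 'Startup' and 'General Step' sequences given earlier concrete, here is a minimal sketch for the scalar model problem $\dot{y} = -\lambda y$ (that is, $M = 1$, $A = \lambda$, $f = 0$), for which the TR 'linear system' is a single division. The function name, the small scale value `y_scale` (playing the role of $T_s$), the clipping of the last step so that the run ends exactly at `t_end`, and the guard against a zero error estimate are our own additions; none of the step-size heuristics (rejection, DTSF limits) is included.

```python
import math

def smart_tr_scalar(lam, y0, t_end, eps=1.0e-3, y_scale=1.0e-8):
    """Variable-step AB2/TR for dy/dt = -lam*y; assumes lam > 0 and y0 != 0."""
    # Startup (1): initial rate from the ODE itself.
    ydot0 = -lam * y0
    # Startup (2): dt0 = tau * eps^(1/3), with tau a time-constant estimate.
    tau = max(y_scale, abs(y0)) / abs(ydot0)
    dt = tau * eps ** (1.0 / 3.0)
    # Startup (3)-(4): first TR step, then invert TR for the new rate.
    y1 = (2.0 * y0 / dt + ydot0) / (2.0 / dt + lam)
    ydot1 = 2.0 * (y1 - y0) / dt - ydot0
    t, y, ydot, ydot_prev, dt_prev = dt, y1, ydot1, ydot0, dt
    history = [(0.0, y0), (t, y)]
    while t < t_end:
        if t + dt >= t_end:              # clip the final step (our convenience)
            dt, t_next = t_end - t, t_end
        else:
            t_next = t + dt
        ratio = dt / dt_prev
        # (1) AB2 predictor with unequal steps.
        y_pred = y + 0.5 * dt * ((2.0 + ratio) * ydot - ratio * ydot_prev)
        # (2) TR corrector: (2/dt + lam) * y_new = 2*y/dt + ydot.
        y_new = (2.0 * y / dt + ydot) / (2.0 / dt + lam)
        # (3) Invert TR for the new rate.
        ydot_new = 2.0 * (y_new - y) / dt - ydot
        # (4) Local truncation error estimate.
        d = (y_new - y_pred) / (3.0 * (1.0 + dt_prev / dt))
        # (6) Next step size from the relative error (guarded against zero).
        norm = max(abs(d) / (abs(y_new) + y_scale), 1.0e-300)
        dt_next = dt * (eps / norm) ** (1.0 / 3.0)
        # (5) Advance time and shift the stored data.
        ydot_prev, dt_prev = ydot, dt
        t, y, ydot, dt = t_next, y_new, ydot_new, dt_next
        history.append((t, y))
    return history

if __name__ == "__main__":
    for t, y in smart_tr_scalar(1.0, 1.0, 5.0):
        print(f"t={t:7.4f}  y={y:.6f}  exact={math.exp(-t):.6f}")
```

Because the error measure here is relative to $|y|$, the step size for a pure exponential settles near a constant value of $\lambda\Delta t$; the tabulated scalar results earlier in this section were obtained with the scalar form of the test and need not match this sketch.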
This answers Q1 for the first LU algorithm. We now describe a second algorithm also based on direct solvers, a more sophisticated method that is more closely aligned with what is done in, for example, LSODE (Radhakrishnan and Hindmarsh, 1993). The first thing we do is to significantly modify the TR solution of (2.7-98); and this we do by invoking a Newton- (or Newton-Raphson-) like strategy. (Yes, a Newton method on a linear system of algebraic equations! Bear with us.) Rearranging (2.7-98) as $F(T_{n+1}) = ((2/\Delta t_n)M + A_{n+1})T_{n+1} - b = 0$ and applying Newton's method leads to

$$J_m\left(T_{n+1}^{(m+1)} - T_{n+1}^{(m)}\right) = -F\left(T_{n+1}^{(m)}\right), \tag{2.7-99}$$

where $J_m = \partial F(T_{n+1}^{(m)})/\partial T_{n+1}^{(m)}$ is the Jacobian matrix and $m$ is the iteration index; i.e.,

$$\left(\frac{2}{\Delta t_n}M + A_{n+1}\right)\left(T_{n+1}^{(m+1)} - T_{n+1}^{(m)}\right) = b - \left(\frac{2}{\Delta t_n}M + A_{n+1}\right)T_{n+1}^{(m)}, \tag{2.7-100}$$

for $m = 0, 1, 2, \ldots$, and $T_{n+1}^{(0)} = T_{n+1}^P$. One might, at this point, question our sanity since, as is well known and easily shown, the above iteration scheme always converges exactly in one iteration; $T_{n+1}$ is the solution of (2.7-100). The trick (strategy is probably a better word) that makes it useful for our purposes is this: use a Jacobian matrix that is not current; it is an out-of-date Jacobian from some earlier time, say $J_O = (2/\Delta t_O)M + A_O$, where $O$ stands for 'old.' Thus we have a modified Newton method (or chord method) and now do have a reason to iterate, on

$$\left(\frac{2}{\Delta t_O}M + A_O\right)\left(T_{n+1}^{(m+1)} - T_{n+1}^{(m)}\right) = b - \left(\frac{2}{\Delta t_n}M + A_{n+1}\right)T_{n+1}^{(m)}, \tag{2.7-101}$$

for $m = 0, 1, \ldots$, with $T_{n+1}^{(0)} = T_{n+1}^P$. Here is how it works, in general: at some earlier time, we formed and factored our old Jacobian ($J_O = LU$) and saved $L$ and $U$. Thus, at all later times, the iterations defined by (2.7-101) represent the 'cheap' part of direct solution methods; one forward reduction and one back-substitution.
The iterations are stopped when

$$\left\|T_{n+1}^{(m+1)} - T_{n+1}^{(m)}\right\| \le \gamma\varepsilon, \tag{2.7-102}$$

where $\|\cdot\|$ is as in (2.7-96), $\varepsilon$ is the relative error tolerance parameter introduced above, and $\gamma$ is a 'safety' factor chosen so that our approximate solution of (2.7-98) is close enough to the true solution that the error caused by the difference does not contaminate our local time truncation error estimate ($d_{n+1}$); $\gamma = 0.1$ is typical. When all is well, only a few iterations will be required, typically < 1.5, on average (A. Hindmarsh, personal communication). In fact, only when convergence becomes too slow (too many iterations) do we update the Jacobian. Finally, we point out that most of the 'heuristics' of the first algorithm apply here as well, as do some new ones; namely,

(1) If $|\Delta t_{n+1}/\Delta t_O - 1.0| \ge 0.3$, then refactor the matrix based on $\Delta t_{n+1}$.

(2) If the modified Newton method fails to pass the convergence test, (2.7-102), after three or four iterations, then stop the iterations, update the matrix, form its LU decomposition, and start the timestep over (same $\Delta t$).
(3) If DTSF ≤ 0.8, then reject the step, as before.

(4) If 0.8 < DTSF < 1.0, then do not change the step size, as before.

(5) If 1.0 < DTSF ≤ 1.3, then follow Step (4) above, to prevent frivolous updates.

(6) Finally, if DTSF > 1.3, then update the matrix (with the new $\Delta t$), form its LU decomposition, and go to the next step. (This is to keep the number of iterations low.)

A2. For iterative solvers or for the modified Newton method: no. For the other LU method, the answer is contained in A1.

A3. Yes, in all cases, to catch 'glitches.'

A4. DTSF = 0.8 is reasonable, but not sacrosanct.

A5 and A6. In general: no; the non-stiff portions of most simulations (except pure advection or $Pe \gg 1$) will be a fairly small fraction of the total simulation time so that using the 'stiff method' for the entire simulation is usually not very wasteful. If, though, pure advection or highly advection-dominated simulations are more common than others, and if you wish to use lumped mass (and rather more node points, to make up for the accuracy loss), then a non-stiff method based on functional iteration may be more appropriate. Called Adams-Bashforth-Moulton predictor-corrector methods for explicit ODE's, the AB2/TR version of same, for

$$\dot{T} = M_L^{-1}\left[f(t) - A(t)T\right], \tag{2.7-103}$$

where $M_L$ is a diagonal matrix, goes like:

(1) Use the 'standard' AB2 predictor, from (2.7-95).

(2) Write the TR as

$$T_{n+1} = T_n + \frac{\Delta t_n}{2}\left[\dot{T}_n + M_L^{-1}(f_{n+1} - A_{n+1}T_{n+1})\right] = b - \frac{\Delta t_n}{2}M_L^{-1}A_{n+1}T_{n+1} \tag{2.7-104}$$

and 'solve' it via functional iteration instead of linear algebra; namely, for $m = 0, 1, \ldots$, do

$$T_{n+1}^{(m+1)} = b - \frac{\Delta t_n}{2}M_L^{-1}A_{n+1}T_{n+1}^{(m)} \tag{2.7-105}$$

until

$$\left\|T_{n+1}^{(m+1)} - T_{n+1}^{(m)}\right\| \le 0.1\,\varepsilon, \tag{2.7-106}$$

with $T_{n+1}^{(0)} = T_{n+1}^P$.

(3) If the total number of functional iterations becomes too large (more than, say, five), the problem is probably becoming 'stiff,' and the method should be dropped in favor of those discussed above. If this occurs, though, the 'assumption' of hyperbolic behavior may have been wrong.
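The functional iteration (2.7-105) with the stopping test (2.7-106) and the 'too many iterations' check of item (3) can be sketched for a single unknown as follows. This is a scalar stand-in only: `k` represents $(\Delta t_n/2)M_L^{-1}A_{n+1}$ reduced to one number, the relative scaling by `t_scale` mimics the weighted norm of (2.7-96), and the function name and return convention are our own.

```python
def tr_functional_iteration(b, k, t_pred, eps, t_scale=1.0e-8, max_iter=5):
    """Solve T = b - k*T by functional (fixed-point) iteration, scalar case.

    b      -- known part of the TR formula (everything not multiplying T_{n+1})
    k      -- scalar stand-in for (dt/2) * M_L^{-1} * A_{n+1}
    t_pred -- AB2 predictor, used as the starting iterate
    Returns (T, iterations_used, converged).
    """
    t_old = t_pred
    for m in range(1, max_iter + 1):
        t_new = b - k * t_old                       # one sweep of (2.7-105)
        change = abs(t_new - t_old) / (abs(t_new) + t_scale)
        if change <= 0.1 * eps:                     # test (2.7-106)
            return t_new, m, True
        t_old = t_new
    # Too many iterations: treat the problem as 'stiff' and fall back to a
    # linear solve, as suggested in item (3).
    return t_old, max_iter, False
```

For example, `tr_functional_iteration(1.0, 0.1, 0.905, 1.0e-3)` converges in a few sweeps toward $1/1.1$, whereas with `k = 2.0` the iterates oscillate with growing amplitude and the function reports failure. In this scalar setting the iteration contracts only when $|k| < 1$, which is the sense in which a large iteration count signals stiffness.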
Remarks:

(1) In ODE jargon, the above method is also referred to as $P(EC)^m$: predict; evaluate (the RHS), correct, evaluate, correct, ... $m$ times; i.e., until (2.7-106) is satisfied.

(2) Some (naive) predictor-corrector methods do not use error control to determine the number of iterations, but instead take one predictor and one corrector per timestep. To see what kind of troubles this can cause for non-linear ODE's (hint: spurious solutions, non-linear dynamical systems, strange attractors), see Griffiths (1988) and the following papers by H. Yee, who properly chastises/admonishes some portion (at least) of the aerospace industry [cf. our discussion of the 'aerospace method' above (2.7-91)]: Yee et al. (1991), Yee and Sweby (1995a), and Lafon and Yee (1996); see too Yee and Sweby (1994, 1995b).

For 'completeness,' we present a consistent, 'ODE-style' $\delta$-form of the second LU-algorithm [(2.7-99) through (2.7-102)]. It consists of updating $T_{n+1}^{(m)}$ and $\dot{T}_{n+1}^{(m)}$ 'simultaneously' and uses the form of the algorithm shown below (2.7-97); i.e., replace (2.7-101) by

$$\left(\frac{2}{\Delta t_O}M + A_O\right)\delta T^{(m+1)} = f_{n+1} - A_{n+1}T_{n+1}^{(m)} - M\dot{T}_{n+1}^{(m)}, \tag{2.7-107}$$

followed by

$$T_{n+1}^{(m+1)} = T_{n+1}^{(m)} + \delta T^{(m+1)} \quad\text{and}\quad \dot{T}_{n+1}^{(m+1)} = \dot{T}_{n+1}^{(m)} + \frac{2}{\Delta t_n}\delta T^{(m+1)}, \tag{2.7-108}$$

and we remark that there does not appear to be any compelling reason to choose one of these over the other; they are mathematically equivalent.

Remark: B. Finlayson and students have also developed some smart integrators that are similar to ours, but different; see Jensen (1980), Finlayson (1985; Section 3.4 in Aiken, 1985), Josse and Finlayson (1984), and Jensen and Finlayson (1986).

f. The smoothing property

To close this discussion of implicit ODE methods, we turn briefly to a limiting case, the transient heat equation ($u = 0$), which has received very much mathematical analysis. (Some, but probably not all, of what we summarize below also applies for $u \ne 0$.)
This parabolic equation has a powerful 'smoothing property' that actually permits rather 'wild' initial data to be contemplated, and leads to some quite technical analyses regarding the solution's behavior/regularity as $t \downarrow 0$. In fact, from Wahlbin (1980), we quote, 'Note that even if $v$ is only in $L_1$, say, $u(\cdot, t)$ is infinitely differentiable for $t > 0$; this is the parabolic smoothing property ...,' where $v$ is the IC for the 1D transient heat equation, $u(\cdot, t)$ is its solution, and $L_1$ is the function space such that $\int_\Omega |v|\,dx < \infty$, which can even include Dirac delta functions (which functions are not permitted in $L_2$). The 'heat' equation is a really strong smoother, and its solution, for $t \downarrow 0$, can thus obviously be quite difficult to simulate for finite $h$ and finite $\Delta t$. Should the numerical time-integration method used to integrate the finite element approximation to the heat equation also possess this powerful damping behavior? This is a good question and the answer is not simple; nor is it even, we
believe, uniformly agreed upon by the 'experts.' For example, Rannacher (1984) believes that TR, being the 'delicate' integrator that it is, should generally not be used 'in isolation'; rather, it should be 'assisted' by a little bit of a dissipative method (e.g., two to four steps of BE): at $t = 0^+$ if only the initial data are 'rough' (but still in $L_2$), and later too if the source terms are rough (also in $L_2$ but time-dependent). [In fact, both the FIDAP code (Fluid Dynamics International, 1993) and the Nachos code (Gartling, 1987) are 'hard-wired' to do just that.] He also states that TR alone with rough initial data can be very bad: 'For rough data the global order of convergence may reduce even to $o(1)$, in the extreme case.' That is to say, convergence will occur but not at any predictable rate.

While the above may indeed be excellent advice in general, we believe that there is another side to the story/argument: smart integrators with variable $\Delta t$, even if 'only' A-stable, like TR, can probably usually deal rather well with rough data, especially if/when it is presumed that the rough data are introduced intentionally; i.e., that one is really seeking the time-accurate solution to the rough data problem. (Why otherwise introduce rough data?) It is then, of course, true that the initial $\Delta t$ will be (and should be) quite small, probably $O(\Delta x^2/\kappa)$, as pointed out in Luskin and Rannacher (1982); but $\Delta t$ can and should grow rapidly as the initial sharp transient 'diffuses away.' In fact, if a variable-step BE method was applied to the same rough data problem that presumably causes trouble for TR, both the initial step size and all subsequent ones would be rather smaller (with a more expensive but not more accurate result) than those selected by TR for the same specified accuracy.

In fact, to test the above 'hypothesis,' we [with the able help of A.
Hindmarsh and his LSODI code (Hindmarsh, 1983), restricted for this problem to the second-order implicit Adams method (TR)] repeated the numerical experiments reported in Rannacher (1982) in which, for an IC, the (discrete) $L_2$-projection of a Dirac delta function was placed at $x = 0.5$ and the 1D heat equation on $0 \le x \le 1$ solved via linear finite elements on a uniform mesh with homogeneous Dirichlet BC's. [The above IC translates to $MT_0 = e_{N/2}$, where $e_{N/2}$ is the unit vector, with zeros in all entries except $N/2$ (we take $N$ even) where it contains 1.0, which comes from $\int_\Omega \phi_i(x)\,\delta(x - 0.5)\,dx = 1.0$ for $i = N/2$; this is very rough data. For $N = 160$, it gives $T_0(N/2) = 277$ and $T_0(N/2 \pm 1) = -74.3$; the signs alternate and the successive nodal amplitudes decrease in the ratio $2 - \sqrt{3} = 0.27$. For $N = 80$, the corresponding values are $\approx 139$ and $\approx -37.1$, very close to linear with $N$, as is the norm of each eigenvector.] We first integrated the associated ODE's via variable-step TR with a tolerance ($\varepsilon$) selected to give virtually the same accuracy reported by R. Rannacher, who used four BE steps and the rest TR, all at a fixed $\Delta t\,(= 2h)$. The results are as follows: for virtually the same number of steps, the smart TR delivers accurate results over the entire time interval: e.g., for $N = 80$, Rannacher took $\Delta t = 1/40 = 0.025$, integrated from $t = 0$ to $t = 1$ ($\kappa = 1/\pi^2$) with four BE and 36 TR steps to get $T(0.5, 1) = 0.7371$, whereas LSODI started with $\Delta t = 0.00002$, finished in 63 steps at $\Delta t = 0.38$ to get $T(0.5, 1) = 0.7351$; the 'exact' result is 0.7361. [N.B. Rannacher's results are not accurate at small time, intentionally. Also note that the really exact delta-function solution, $T(x, t) = e^{-(x-0.5)^2/4\kappa t}/\sqrt{4\pi\kappa t}$, gives $T(0.5, 1.0) = \sqrt{\pi/4} = 0.8862$, whereas we followed Rannacher by setting homogeneous Dirichlet BC's; the true value at $x = 0, 1$ and $t = 1$ is $\sqrt{\pi/4}\,e^{-\pi^2/16} = 0.4782$.]
The smoothing property of the heat equation can be readily duplicated via TR if the 'proper' step sequence is taken—a sequence easily realized with a smart integrator.
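The nodal values quoted above for the projected delta function can be reproduced with a few lines of linear algebra. The sketch below assumes the standard consistent mass matrix for linear elements on a uniform mesh, $(h/6)\,\mathrm{tridiag}(1, 4, 1)$ on the $N - 1$ interior nodes with $h = 1/N$, and solves $MT_0 = e_{N/2}$ with the Thomas algorithm; the function name is ours.

```python
import math

def projected_delta_nodal_values(n_elements):
    """Solve M*T0 = e_{N/2} on [0, 1] with homogeneous Dirichlet BC's.

    Returns (T0 at the interior nodes, zero-based index of node N/2).
    """
    h = 1.0 / n_elements
    n = n_elements - 1                   # interior unknowns, nodes 1 .. N-1
    centre = n_elements // 2 - 1         # zero-based position of node N/2
    rhs = [0.0] * n
    rhs[centre] = 6.0 / h                # system divided through by h/6
    # Thomas algorithm for tridiag(1, 4, 1) * x = rhs.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = 1.0 / 4.0
    d_prime[0] = rhs[0] / 4.0
    for i in range(1, n):
        denom = 4.0 - c_prime[i - 1]
        c_prime[i] = 1.0 / denom
        d_prime[i] = (rhs[i] - d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x, centre

if __name__ == "__main__":
    for n_el in (80, 160):
        t0, c = projected_delta_nodal_values(n_el)
        print(n_el, round(t0[c], 1), round(t0[c + 1], 1), round(t0[c + 1] / t0[c], 4))
    print("2 - sqrt(3) =", round(2.0 - math.sqrt(3.0), 4))
```

The alternating signs and the decay ratio $2 - \sqrt{3}$ between neighboring nodes follow from the characteristic roots of the $(1, 4, 1)$ stencil.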
The above calculations used a rather crude $\varepsilon$: 0.01. To see further how a smart integrator behaves with a more typical tolerance, we repeated the above experiment for $\varepsilon = 5 \times 10^{-4}$ and $N = 160$, and we also repeated it with (variable-step) BE all the way. The results are shown in Table 2.7-3. Noteworthy are the following points.

1. TR does, on average, match $\Delta t$ to $h$ (164 timesteps for $N = 160$).

2. BE is quite inefficient by comparison.

3. Most of the effort goes toward small time accuracy.

4. The minimum time of believability (Section 2.6.2c) is about $0.1h^2/\kappa = 0.1(\pi/160)^2 = 4 \times 10^{-5}$, which 'consumed' about 20 TR steps and 84 BE steps; i.e., these steps are necessarily 'wasted' in order to obtain accurate results at later time.

5. The small difference in the 'exact' solutions at $t = 1$ between $N = 80$ and $N = 160$ is interesting, and explainable: the projection of the initial data vector onto the first eigenvector (see Section 2.6.2a) is close to 2, and this eigenvector dominates the solution by time $t = 1$, because the lowest eigenvalue is $\approx \kappa\pi^2$ and, since $\kappa = 1/\pi^2$, we have, for the first mode, $\approx 2e^{-\lambda_1 t} = 2e^{-1} = 0.7358$, a result that changes only in the fourth decimal place between $N = 80$ and $N = 160$. Thus, $N = 80$ is quite sufficient to accurately track the first mode, and this mode is nearly all that remains at $t = 1$ ($e^{-\lambda_2 t} = e^{-4} = 0.02$, etc.).

To see how well BE and TR fare against the 'full power' of LSODI, two additional runs were made (at the same $\varepsilon$, $5 \times 10^{-4}$, and $N = 160$) in which all the stops were pulled out. In the first, the Adams family was selected with the following results: 204 steps were required (final solution: 0.7350) and the 'order selector' got as high as fifth (twelfth is the maximum available), and spent most of the time there, dropping back to second (TR) toward the end.
The second run used the BDF family with the following results: 156 steps were required (final solution: 0.7361) and again the order quickly rose to fifth (the maximum available), and stayed there. Thus, BDF 'wins' (fewer steps and smaller error) for this transient heat equation example. But the variable-step TR was not very far behind, and would show up better yet if the problem was one of advection-diffusion, especially if advection-dominated (or pure advection), thus, in our opinion, further justifying our selection of (smart) TR as an optimal integration method.

Table 2.7-3 More LSODI results. Entries are: closest step number to time $t$ / $T(0.5, t)$.

Time, t    TR            BE
10^-4      37/98.94      262/99.42
10^-3      80/28.49      479/28.29
10^-2      113/8.878     620/8.864
10^-1      141/2.734     724/2.812
10^0       164/0.7357    800/0.7429

'Exact' at $t = 1$ ($\varepsilon = 10^{-6}$, 1250 TR steps): 0.7360

We hope (and believe) that these results lend further support to the use of smart integrators in general and TR in particular for CFD. We would not, however, be too much against the occasional introduction of a 'damping' step (say, every 50 or 100 steps), with BDF2 the method of choice, in order to help control 'extraneous noise.' Our final bottom line on implicit integrators is this: constant step size should be used with about the same frequency as uniform finite element meshes; i.e., rarely.

Final Remark: In the next chapter we will also present variable-step versions of BE (BDF1) and BDF2, albeit for the rather more complicated Navier-Stokes equations.

2.7.5 A Semi-Implicit Method

There are two principal reasons why one may wish to employ a 'hybrid' integration method in which the diffusion term is integrated with an implicit method, and the advection term with an explicit method: (i) there are thin BL's that are resolved by the mesh (and the grid Peclet number there is less than one) and (ii) robust, unsymmetric 'solvers' are not available/affordable/desirable. In fact, once one realizes that explicit advection leads typically to the infamous 'CFL stability limit,' $u\Delta t/h \le O(1)$, a third reason appears if the flow is advection-dominated (with still thin BL's and fine mesh there): the timestep required for accurate tracking of 'parcels' will often need $\Delta t$ small enough that $u\Delta t/h \le 1$ even if an implicit method was used for advection. Thus we, and many others, have also developed methods for time-marching that are semi-implicit, for which we note immediately that error control and variable timestep sizes are more or less abandoned; whereas one might vary $\Delta t$ based on accuracy if the resulting $\Delta t$ is less than the stability limit [CFL $= O(1)$] and if the local error could be adequately estimated, no one (to our knowledge) has seriously attempted it. What is more common (and careless) is to assume that the CFL limit will always give a sufficiently accurate time integration.
And this is the approach we take in this section, although we will advise the user of one of these methods to always verify the temporal accuracy of a simulation by repeating it at a smaller $\Delta t$.

Our semi-implicit scheme begins where BTD left off; recall that in Section 2.7.2e we found a modified (perturbed) operator for forward Euler on advection, which both stabilized it (conditionally) and improved its accuracy. Here we extend/modify that scheme in a simple but far-reaching way; we integrate both diffusion and BTD terms via TR while still marching advection with FE. The gain turns out to be (at least for special cases) unconditional stability, and the loss is, of course, the need to solve linear systems. But the matrix is SPD, and the linear systems are actually quite easy to solve so that usually there is a net gain. Starting from

$$\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T = \nabla\cdot\left[\left(\mathbf{K} + \frac{\Delta t}{2}\,\mathbf{u}\mathbf{u}\right)\cdot\nabla T\right], \tag{2.7-109}$$

where $\mathbf{K}$ is diagonal (and typically/usually $\mathbf{K} = \kappa\mathbf{I}$), we let $\tilde{\mathbf{K}} = \mathbf{K} + (\Delta t/2)\,\mathbf{u}\mathbf{u}$ and apply our semi-implicit integrator to the GFEM form of (2.7-109) to obtain

$$\frac{1}{\Delta t} M (T_{n+1} - T_n) + N_n T_n + \frac{1}{2}\left(\tilde K_n T_n + \tilde K_{n+1} T_{n+1}\right) = \frac{1}{2}\left(f_n + f_{n+1}\right), \tag{2.7-110}$$

in which we have permitted time-varying velocity and BC's. Thus, each timestep requires the formation and solution of the linear system,
$$\left(\frac{2}{\Delta t} M + \tilde K_{n+1}\right) T_{n+1} = \left(\frac{2}{\Delta t} M - \tilde K_n - 2N_n\right) T_n + \left(f_n + f_{n+1}\right), \tag{2.7-111}$$

where the coefficient matrix is SPD and amenable to efficient iterative methods. For 'time-accurate' integrations, it is often even better yet because then $\Delta t$ is often small enough that the mass matrix 'dominates' $\tilde K_{n+1}$, and it is well known (e.g., Wathen, 1991) that M is a 'very nice' matrix for the conjugate gradient method. (Its eigenvalues are clustered, and DSCG converges very quickly and is cheap.) (See Volume II for further details on CG and DSCG.)

Whereas we cannot prove unconditional stability in the general case (arbitrary mesh, variable velocity, etc.), we can for the 1D, constant coefficient case with periodic BC's, via von Neumann [see Bullister et al. (1986) for some further discussion of the general case, and see Ascher et al. (1995) for a recent FDM paper on the subject]. In 1D, (2.7-110) is, using linear basis functions,

$$\frac{1}{6}\left[(T_{j-1}^{n+1} - T_{j-1}^{n}) + 4(T_{j}^{n+1} - T_{j}^{n}) + (T_{j+1}^{n+1} - T_{j+1}^{n})\right] + \frac{c}{2}\left(T_{j+1}^{n} - T_{j-1}^{n}\right) = \frac{1}{4}(\alpha + c^2)\left[(T_{j-1}^{n} - 2T_{j}^{n} + T_{j+1}^{n}) + (T_{j-1}^{n+1} - 2T_{j}^{n+1} + T_{j+1}^{n+1})\right], \tag{2.7-112}$$

where $c = u\Delta t/h$ and $\alpha = 2\kappa\Delta t/h^2$. Performing the von Neumann analysis à la Section 2.7.2c yields

$$\xi = \frac{\dfrac{2 + \cos\theta}{3} - \dfrac{(\alpha + c^2)}{2}(1 - \cos\theta) - ic\sin\theta}{\dfrac{2 + \cos\theta}{3} + \dfrac{(\alpha + c^2)}{2}(1 - \cos\theta)} \tag{2.7-113}$$

for the amplification factor ($\theta = kh$, where k is the wavenumber, recall). It is straightforward to form $|\xi|^2$ and to see that it does not exceed one for any values of $\theta$, $\alpha$, and c. Stability is unconditional, even for pure advection.

Remarks:
(1) It is somewhat remarkable that TR stability is obtained even though advection is treated explicitly. It will be less remarkable, however, after Section 2.7.7b is assimilated.
(2) Limited numerical experiments suggest that these von Neumann results generalize to multi-dimensional problems on real grids with variable (but divergence-free) velocity and real BC's.
(3) Unlike TR, however, this semi-implicit scheme is not neutral when $\kappa = 0$; it is dissipative. For $\Delta t$ and $h \to 0$, in fact, $|\xi| = 1 - u^2 k^4 \Delta t^2 h^2/8$. But for $\Delta t \to \infty$ with h fixed, it does not damp at all; in fact, $\xi \to -1$ to give the characteristic $2\Delta t$ 'wiggles' of TR. The strongest damping occurs for $c = 1/\sqrt{3}$, for which a $2\Delta x$ wave dies in a single step.
(4) If LM is used, then $(2 + \cos\theta)/3$ is replaced by unity; stability is still unconditional (and accuracy suffers). For pictures of $|\xi|$, $|\xi|^{1/c}$, and phase speed vs $\theta$, for both
CM and LM, see Gresho and Chan (1985). ($|\xi|^{1/c}$ measures the amplitude error corresponding to the true solution moving one grid length, h.)
(5) If an AD simulation attains a steady state and $\Delta t$ is very large ($c \gg 1$), then the resulting solution will be overly diffusive (but only along streamlines); the effective diffusivity is $\tilde\alpha = \alpha + c^2 = \alpha(1 + Pc)$, which is much larger than $\alpha$ when $P \gtrsim O(1)$.

A demonstration of the last situation, first discovered by McCallen (1988), is obtained by integrating the semi-implicit equations to a steady state for the 'tough' problem (with OBL for $Pe \gg 1$) discussed in Section 2.6.2c; rather than (2.6-148), the uniform mesh solution for the semi-implicit/BTD method is

$$T_j = \frac{\left[\dfrac{1 + P(1 + c)}{1 - P(1 - c)}\right]^{j} - 1}{\left[\dfrac{1 + P(1 + c)}{1 - P(1 - c)}\right]^{N+1} - 1} \quad \text{for } j = 0, 1, \ldots, N+1, \tag{2.7-114}$$

which (appropriately) degenerates to (2.6-148) for $c \to 0$. Also, for $c \gg 1$ and $P \gtrsim O(1)$, the solution becomes, approximately, $[(1 + 2/c)^j - 1]/[(1 + 2/c)^{N+1} - 1] \approx j/(N + 1)$, the solution corresponding to pure diffusion. So, for fixed P, the steady solution ranges from the wiggly GFEM result for $c \ll 1$ on to a non-oscillatory, overly diffusive solution for $c \gg 1$ (which is only obtained after many timesteps that display slowly damped $2\Delta t$ oscillations: a clue to a problem, a wiggle signal). Plotted in Figures 2.7-10 and 2.7-11 are the 'BTD' results analogous to those shown in Figures 2.6-54 and 2.6-55 for GFEM, for several values of c. Whereas small c sends out a steady-state, spatial wiggle signal, large c is, at the steady state, oblivious to the difficulty of the problem (it does see the $2\Delta t$ TR wiggles during the transient). Bottom line: DO NOT USE BTD with $c \gg 1$. (Pure TR, on the other hand, would yield a 'valid' steady state, although not a useful transient, when $c \gg 1$.) Finally, it is good to recall that BTD is 'inherently' a transient trick and should, in fact, be viewed with some suspicion if and when a steady state is attained.
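Returning to the amplification factor (2.7-113) and Remarks (3) and (4), the stability claim can be spot-checked numerically. The following is a minimal sketch, not the authors' code: the function names and the sampling grid are illustrative assumptions, and a scan over sample points is of course not a proof.

```python
import math


def xi_semi_implicit(theta, alpha, c, lumped=False):
    """Amplification factor (2.7-113): FE advection, TR on diffusion + BTD.

    theta = k*h, alpha = 2*kappa*dt/h**2, c = u*dt/h.
    lumped=True replaces the mass-matrix symbol (2 + cos(theta))/3 by 1.
    """
    m = 1.0 if lumped else (2.0 + math.cos(theta)) / 3.0
    d = 0.5 * (alpha + c * c) * (1.0 - math.cos(theta))
    return complex(m - d, -c * math.sin(theta)) / (m + d)


def worst_modulus(alphas, cs, n_theta=181, lumped=False):
    """Largest |xi| found on a grid of (alpha, c, theta) samples (assumed grid)."""
    worst = 0.0
    for alpha in alphas:
        for c in cs:
            for i in range(n_theta):
                theta = math.pi * i / (n_theta - 1)
                worst = max(worst, abs(xi_semi_implicit(theta, alpha, c, lumped)))
    return worst


if __name__ == "__main__":
    alphas = [0.0, 0.01, 1.0, 100.0]
    cs = [0.01, 0.1, 1.0, 10.0, 1000.0]
    print("max |xi|, consistent mass:", worst_modulus(alphas, cs))
    print("max |xi|, lumped mass:    ", worst_modulus(alphas, cs, lumped=True))
    # Remark (3): pure advection, c = 1/sqrt(3), 2*dx wave (theta = pi).
    print("|xi| of the 2dx wave:", abs(xi_semi_implicit(math.pi, 0.0, 1 / math.sqrt(3))))
```

On the sampled grid the modulus should never exceed one (up to rounding), and the $2\Delta x$ wave at $c = 1/\sqrt{3}$ should be annihilated in one step.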
Fig. 2.7-10 Steady results for the hard 1D AD problem via BTD (Pe = 16, N = 7, P = 2). Curves of T(x) vs x for c = 0.1, c = 1.0, c = 10.0, and the analytic solution.
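The nodal values behind a plot such as Fig. 2.7-10 follow directly from (2.7-114). The sketch below evaluates that formula for the caption's parameters; the function name is hypothetical, and the guard for a vanishing denominator is an added assumption about how one would want to handle $P(1 - c) = 1$.

```python
def btd_steady_solution(P, c, N):
    """Nodal values T_0..T_{N+1} of (2.7-114) on a uniform mesh."""
    denom = 1.0 - P * (1.0 - c)
    if denom == 0.0:
        raise ValueError("1 - P(1 - c) = 0: (2.7-114) is singular for this P, c")
    r = (1.0 + P * (1.0 + c)) / denom
    return [(r**j - 1.0) / (r ** (N + 1) - 1.0) for j in range(N + 2)]


if __name__ == "__main__":
    # Parameters of Fig. 2.7-10: P = 2, N = 7.
    for c in (0.1, 1.0, 10.0):
        T = btd_steady_solution(P=2.0, c=c, N=7)
        print(f"c = {c:5.1f}:", " ".join(f"{t:8.4f}" for t in T))
```

For small c the ratio in (2.7-114) is negative and the nodal values alternate in sign (the wiggles); for large c they approach the straight line $j/(N+1)$.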
Fig. 2.7-11 Same as Figure 2.7-10 except Pe = 160 (P = 20).

2.7.6 Dispersion (et al.) Errors For Some Fully Discrete Methods

a. Introduction

Now it is time to tie together some previous pieces to obtain a better feeling for what actually happens in the computer when approximating solutions of the scalar transport equation. That is to say, we shall apply several ODE methods to the semi-discrete AD equation (whose spatial error properties we have already examined) to see how $\Delta x$-errors and $\Delta t$-errors 'interact.' We shall be interested in phase errors, damping errors, and, of course, stability, which will often show up as a 'negative' damping error. Our 'study vehicle' will be the 1D advection-diffusion equation using linear basis functions.

We have already studied one end of the spectrum: spatial errors in the absence of temporal errors. Now we shall do the opposite, but only briefly since it is usually of more academic interest than the ones already studied. Specifically, we shall apply TR to the PDE for pure advection to see how temporal error alone, from a specific ODE 'integrator,' manifests itself. [It will turn out that stable implicit methods at large CFL 'obtain their stability' by seriously slowing down the wave form; whereas explicit methods usually give leading phase error, and require $c \le O(1)$ for stability.] After deriving this one in some detail, we will present 'final results' for one first-order method (BE), one explicit second-order method (LF2), and one fourth-order explicit method (RK4).
Then, having developed a feeling for the temporal error end of the spectrum, we will 'bite the big bullet' and study the real error when the specimen PDE is discretized with linear elements and time-integrated with six explicit methods, three implicit methods, and one semi-implicit method; namely, FE, AB2, RK2, LF2, AB3, and RK4 for the explicit family; BE, TR, and BDF2 for the implicit family; and FE/BTD(TR) for the semi-implicit method. In each case we will present the key quantitative item needed for error and stability quantification: the von Neumann amplitude coefficient, $\xi$; yes, the analysis is (necessarily, it seems) of the common (but useful) 'Fourier type.' Given $\xi$,
it is then a simple matter (in principle) to recover the two limiting (semi-discrete) cases via letting either $\Delta x \to 0$ or $\Delta t \to 0$ and performing asymptotic analysis (perhaps using appropriate software). Our results, however, will be mostly pictures; i.e., we will show graphical results and leave asymptotics to the reader.

The target is the PDE

$$T_t + uT_x = \kappa T_{xx} \tag{2.7-115}$$

on the unit span with periodic BC's (or, alternatively, on the infinite span) and IC $T_0(x) = e^{ikx}$, with solution

$$T(x,t) = e^{-k^2\kappa t}\cdot e^{ik(x - ut)}, \tag{2.7-116}$$

which is a repeat of (2.6-108) and clearly depicts the k-th Fourier mode being transported at the fluid velocity and decaying monotonically in time; recall [see (2.6-14)] that $k = k_n = 2\pi n$, and we suppress the mode number, n. The ODE analog of the target is the semi-discretized system

$$\tfrac{1}{6}\left(\dot T_{j-1} + 4\dot T_j + \dot T_{j+1}\right) + u\left(T_{j+1} - T_{j-1}\right)/2h = \kappa\left(T_{j-1} - 2T_j + T_{j+1}\right)/h^2, \tag{2.7-117}$$

or its lumped mass counterpart ($\dot T_{j-1}$ and $\dot T_{j+1} \to \dot T_j$). These ODE's have the exact solution [cf. (2.6-113)]

$$T_j(t) = e^{-2k^2\kappa t(1 - \cos\theta)/m\theta^2}\cdot e^{ik(jh - ut\sin\theta/m\theta)}, \tag{2.7-118}$$

where

$$m = m(\theta) = (2 + \cos\theta)/3 \quad \text{for CM}, \tag{2.7-119}$$

and

$$m = 1 \quad \text{for LM} \tag{2.7-120}$$

is the 'symbol' of the mass matrix [see Remark (4) below (2.6-24)] and $\theta = kh = 2\pi n/N$ (as usual). Clearly the effective diffusivity is $2\kappa(1 - \cos\theta)/m\theta^2$, which $\to \kappa$ (with second-order accuracy) as $h \to 0$, and the effective velocity (phase speed) is $u\sin\theta/m\theta$, which $\to u$ as $h \to 0$ (with second-order accuracy for LM and fourth-order accuracy for CM). Both the effective diffusivity and the effective velocity have been discussed in Sections 2.6.2b and 2.6.2a, respectively. Every ODE method must, of course, recover (2.7-118) for $\Delta t \to 0$.

b. Semi-discrete the other way

We begin by applying our favorite ODE integrator, TR, to the pure advection PDE (if $\kappa \ne 0$, then this analysis becomes too complex to be useful),

$$\frac{T(x, t_{n+1}) - T(x, t_n)}{\Delta t} + \frac{u}{2}\left[T_x(x, t_n) + T_x(x, t_{n+1})\right] = 0. \tag{2.7-121}$$

Seeking a solution in the form $T(x, t_n) = e^{ik(x - \tilde u t_n)}$, for some $\tilde u$ TBD, gives

$$\frac{e^{-ik\tilde u\Delta t} - 1}{\Delta t} + \frac{iku}{2}\left(e^{-ik\tilde u\Delta t} + 1\right) = 0,$$
or

$$\frac{e^{ik\tilde u\Delta t/2} - e^{-ik\tilde u\Delta t/2}}{e^{ik\tilde u\Delta t/2} + e^{-ik\tilde u\Delta t/2}} = i\tan\frac{k\tilde u\Delta t}{2} = \frac{iku\Delta t}{2};$$

i.e.,

$$\frac{\tilde u}{u} = \frac{2}{ku\Delta t}\tan^{-1}\frac{ku\Delta t}{2}, \tag{2.7-122}$$

which is the phase speed resulting when TR is applied to the advection PDE. There is, in fact, a more useful way to interpret this result; it obtains by relating the step size, $\Delta t$, to the period of the wave from the exact solution, $\tau = \lambda/u = 2\pi/ku = 2\pi/\omega$, where $\omega$ is the wave (angular) frequency. Defining $N_P = \tau/\Delta t$ as the number of steps per period gives $N_P = 2\pi/ku\Delta t$ and thus

$$\frac{\tilde u}{u} = \frac{N_P}{\pi}\tan^{-1}\frac{\pi}{N_P} \tag{2.7-123}$$

as a more useful formula for the TR phase error. (If one considers $\tilde N_P = 2\pi/k\tilde u\Delta t$, the number of timesteps for the slower wave to pass by, there follows $N_P/\tilde N_P = \tilde u/u$.) Figure 2.7-12 shows that while a 'coarse mesh' (large $\Delta t$, small $N_P$) yields a much too small phase speed, it does not require very many steps per period to recover good accuracy; e.g., for $N_P = 10$, the error is just over 3%, and for $N_P = 20$, it is less than 1%. This seems to suggest that at least 20 steps per period should be employed in practice (more yet if many periods are to be computed), especially considering that spatial error also (usually) decreases the wave speed.

Fig. 2.7-12 Phase speeds and decay per period for several ODE methods. Curves vs $N_P$: relative phase speed for TR, BE, LF2, and RK4; $|\xi|^{N_P}$ for BE and for RK4.
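The two percentages quoted for (2.7-123) are easy to reproduce. This is a small sketch under stated assumptions: the function names are illustrative, and the cross-check uses the standard TR amplification factor for the scalar model $\dot y = -i\omega y$ with $\omega\Delta t = 2\pi/N_P$.

```python
import cmath
import math


def tr_phase_speed_ratio(n_p):
    """Relative TR phase speed from (2.7-123), n_p = steps per period."""
    return (n_p / math.pi) * math.atan(math.pi / n_p)


def tr_phase_speed_from_xi(n_p):
    """Same ratio, recovered from the TR amplification factor of y' = -i*omega*y."""
    z = 2.0 * math.pi / n_p  # omega * dt
    xi = (1 - 0.5j * z) / (1 + 0.5j * z)
    return -cmath.phase(xi) / z


if __name__ == "__main__":
    for n_p in (4, 10, 20, 40):
        ratio = tr_phase_speed_ratio(n_p)
        print(f"N_P = {n_p:3d}: relative speed = {ratio:.5f}, lag = {100 * (1 - ratio):.2f}%")
```

Both routes give the same ratio; the lag at 10 and 20 steps per period can be read off the printed table.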
Also shown in the figure is the result for BE,

$$\frac{\tilde u}{u} = \frac{N_P}{2\pi}\tan^{-1}\frac{2\pi}{N_P}, \tag{2.7-124}$$

which requires finer resolution (naturally) for similar accuracy. But the situation is worse yet for BE; it gives a solution that goes like $|\xi|^n$, where $|\xi| = 1/\sqrt{1 + (2\pi/N_P)^2}$ is the decrease per timestep: serious numerical damping. Even worse is the more appropriate measure, $|\xi|^{N_P}$, the decay per period, shown in the same figure. (Recall that $|\xi| = 1$ for TR.)

Remark: FE gives the same phase speed as BE but, unfortunately, grows (unboundedly) at the same rate that BE decays; i.e., $|\xi| = \sqrt{1 + (2\pi/N_P)^2}$. Neither Euler method should be used for advection.

Also shown are the following results from two explicit methods, in the stable range only:

$$\text{LF2:}\quad \frac{\tilde u}{u} = \frac{N_P}{2\pi}\sin^{-1}\frac{2\pi}{N_P} \tag{2.7-125}$$

with $|\xi| = 1$, which is unstable for $N_P < 2\pi$, and

$$\text{RK4:}\quad \frac{\tilde u}{u} = \frac{N_P}{2\pi}\tan^{-1}\left\{\frac{(2\pi/N_P)\left[1 - \frac{1}{6}(2\pi/N_P)^2\right]}{1 - \frac{1}{2}(2\pi/N_P)^2 + \frac{1}{24}(2\pi/N_P)^4}\right\} \tag{2.7-126}$$

with

$$|\xi| = \left\{\left[1 - \frac{1}{2}\left(\frac{2\pi}{N_P}\right)^2 + \frac{1}{24}\left(\frac{2\pi}{N_P}\right)^4\right]^2 + \left[\frac{2\pi}{N_P}\left(1 - \frac{1}{6}\left(\frac{2\pi}{N_P}\right)^2\right)\right]^2\right\}^{1/2}. \tag{2.7-127}$$

$|\xi|^{N_P}$ is also shown. The stability limit for this case is $N_P = 2\pi/\sqrt{8} = 2.22$. The following explicit methods are not shown because they are unstable: FE, AB2, and RK2. AB3, which is conditionally stable, is left as an exercise.

From the above results, we observe the following:
1. Implicit methods generate lagging phase error, thus exacerbating that caused (usually) by spatial error.
2. Explicit methods generate somewhat compensatory (usually) leading phase error.
3. RK4 is impressively accurate and should give good results with as few as 10 steps per period.
4. BE is impressively inaccurate and highly overdamped.

c. Fully discrete

o Trapezoid rule. We now return to the real world of finite $\Delta t$, finite h, and finite $\kappa$, and apply several ODE methods to (2.7-117), with special thanks to J. Leone, Jr., whose skills with modern software packages were crucial. Again, we shall derive the results in
any detail for only one method, but give final results for all those considered. Applying TR to (2.7-117) and then seeking a solution of the resulting difference equation in the usual form,

$$T_j^{(n)} = \xi^n e^{ij\theta}, \tag{2.7-128}$$

where $T_j^{(n)}$ denotes the value of $T_j$ at time $t_n$, yields

$$m(\xi - 1)\,\xi^n e^{ij\theta} + \frac{iu\Delta t}{h}\left(\frac{1 + \xi}{2}\right)\sin\theta\;\xi^n e^{ij\theta} = -\frac{2\kappa\Delta t}{h^2}\left(\frac{1 + \xi}{2}\right)(1 - \cos\theta)\,\xi^n e^{ij\theta};$$

i.e.,

$$\xi = \frac{1 - [\alpha(1 - \cos\theta) + ic\sin\theta]/2m}{1 + [\alpha(1 - \cos\theta) + ic\sin\theta]/2m}, \tag{2.7-129}$$

in terms of $\alpha$ and c, where (as before) $\alpha = 2\kappa\Delta t/h^2$ and $c = u\Delta t/h$. Another, perhaps more useful, form makes $\xi = \xi(\theta, c, P)$ via $\alpha = c/P$, and this is the form we will use. Simplifying by multiplying numerator and denominator by the complex conjugate of the denominator yields the final form,

$$\xi = \frac{1 - \dfrac{\alpha^2(1 - \cos\theta)^2 + c^2\sin^2\theta}{4m^2} - \dfrac{ic\sin\theta}{m}}{\left[1 + \dfrac{\alpha(1 - \cos\theta)}{2m}\right]^2 + \dfrac{c^2\sin^2\theta}{4m^2}}. \tag{2.7-130}$$

[This is the only method whose form we shall bother to explicate as $\mathrm{Re}(\xi) + i\,\mathrm{Im}(\xi)$; for the others, we simply let the 'software' do the complex arithmetic.] To relate this result to the exact solution, and to determine the effective diffusivity (damping) and phase speed, we proceed as follows [cf. (2.7-116)] and write $\xi = x + iy = |\xi|e^{i\arg\xi}$:

$$T_j^{(n)} = \xi^n e^{ij\theta} = |\xi|^n e^{i(j\theta + n\tan^{-1}y/x)} = |\xi|^n \exp\left\{ik\left[jh + \frac{t_n}{k\Delta t}\tan^{-1}\frac{y}{x}\right]\right\} = |\xi|^n e^{ik(x_j - u_P t_n)}, \tag{2.7-131}$$

where $u_P = -(1/k\Delta t)\tan^{-1}(y/x)$ with proper account taken of the quadrant in which $\xi$ lies (not always easy), or

$$\frac{u_P}{u} = -\frac{1}{uk\Delta t}\tan^{-1}\frac{y}{x} = -\frac{1}{c\theta}\tan^{-1}\frac{y}{x} \tag{2.7-132}$$

as the numerical (relative) phase speed of the fully discrete approximation. This equation is general in that it depends only on the von Neumann amplification factor, $\xi$, which itself is, of course, a function of both spatial and temporal discretization methods. Also, $|\xi| = (x^2 + y^2)^{1/2} = |T_j^{(n+1)}/T_j^{(n)}|$ is the (general) amplification factor magnitude, which must be $\le 1$ for stability and is to be compared with $e^{-k^2\kappa\Delta t}$ from (2.7-116), which latter shows that short waves damp fast. The relative amplitude change per timestep,

$$R_{\Delta t} = |\xi|\,e^{k^2\kappa\Delta t}, \tag{2.7-133}$$

the relative amplitude change per grid length traversed by the exact solution,

$$R_h = R_{\Delta t}^{1/c}, \tag{2.7-134}$$
because $1/c = h/u\Delta t$ is the number of steps per grid length for the true solution, and the relative amplitude change per period,

$$R_P = (R_{\Delta t})^{N_P} = (R_{\Delta t})^{2\pi/c\theta}, \tag{2.7-135}$$

are all interesting measures of the error (and all are 1.0 for the perfect method, and all $\to 1$ as $\theta \to 0$). To further the interpretation of these results, it is useful to re-express (2.7-133) in terms of the local Peclet number, $P = uh/2\kappa$, and the CFL number, $c = u\Delta t/h$, as

$$R_{\Delta t} = |\xi|\,e^{c\theta^2/2P}, \tag{2.7-136}$$

or in terms of the local 'diffusion number,' $\alpha = c/P$, as

$$R_{\Delta t} = |\xi|\,e^{\alpha\theta^2/2}, \tag{2.7-137}$$

the first of which is the form of relative 'diffusion' that we shall utilize. (But it will be useful to keep the relationship $\alpha = c/P$ in mind when studying the curves to follow, and the fact that $\alpha \ll 1$ usually means an accurate diffusion calculation and $\alpha \gg 1$ an inaccurate one.)

Remarks:
(1) Although perhaps counterintuitive but nevertheless rather important, it follows that $R_{\Delta t} < 1$ means an overdamped numerical solution and $R_{\Delta t} > 1$ is underdamped. The goal, of course, is $R_{\Delta t} = 1$.
(2) The limiting case for pure advection is, appropriately, $R_{\Delta t} = |\xi|$, and $|\xi| = 1$ is the goal; and that for pure diffusion is just (2.7-137), and $R_{\Delta t} = 1$ is the goal, as usual.
(3) For finite P and $\theta$, $R_{\Delta t} \to \infty$ as $c \to \infty$ because the exact solution becomes zero in one timestep, thus reducing somewhat the utility of the following results at large c.
(4) It is interesting at this point to note that the exact PDE solution, (2.7-116), can be expressed in terms of the discrete variables and parameters (and thus at the discrete mesh points in both space and time) as

$$T(x_j, t_n) = \left[e^{-c\theta^2/2P}\right]^n e^{i\theta(j - nc)} = \left[e^{-\alpha\theta^2/2}\right]^n e^{i\theta(j - nc)} \tag{2.7-138}$$

and that of the ODE's, (2.7-118), as

$$T_j(t_n) = \left[e^{-\alpha(1 - \cos\theta)/m}\right]^n \cdot e^{i\theta(j - nc\sin\theta/m\theta)}. \tag{2.7-139}$$

(5) In the case of TR, as with all of the ODE methods considered herein, the following analogy exists between the AD equations and the single scalar (model) equation [(2.7-1)] with complex $\lambda$, $\dot y = -\lambda y$:

$$\lambda\Delta t = [\alpha(1 - \cos\theta) + ic\sin\theta]/m(\theta), \tag{2.7-140}$$

so that any ODE solution with amplification factor $\xi = \xi(\lambda\Delta t)$ can be instantly 'translated' to the von Neumann growth rate for the AD PDE. For TR, cf. (2.7-57) and (2.7-129). ($\lambda$ is an eigenvalue of $M^{-1}K$, where, recall, the modal index has been suppressed; $\theta = \theta_n = 2\pi n/N \Rightarrow \lambda = \lambda_n$.) Different spatial discretizations will,
of course, give different functionalities between $\lambda\Delta t$ and $\theta$, $\alpha$, c; i.e., different spatial 'response functions' than $[\alpha(1 - \cos\theta) + ic\sin\theta]/m(\theta)$, but the analogy will probably still hold, at least if every nodal equation is of the same type. (Quadratic elements, for example, would again cause 'difficulty.')

o Range of P? Before pushing on, we pose a few relevant questions that will help to guide our analysis: What range of P should be covered? And, at least for implicit (stable) schemes, what range of c? Or: when is P small enough for advection to be neglected or large enough to be called (or replaced by) pure advection, and when is c small enough to say that the ODE solution can be said to have been attained and, finally, when is c too large for the 'solution' to make sense? These are reasonable questions, only some of which we shall answer with any certainty, beginning with the range of P, which itself begins with an analysis of the exact solution for a wave of length $\lambda = 2\pi/k = 2\pi h/\theta$. Next, we ask: How fast does this wave decay by diffusion relative to its transport by advection? Or, more quantitatively, how much decay (say, D, $D \le 1$) occurs per wavelength advected? If D is small, then the decay is strong and the process is diffusion-dominated, while if D is close to one, then there is little damping and the process is advection-dominated. Pure diffusion will have D = 0 (and $Pe = uL/2\kappa = 0$) and D = 1 corresponds to pure advection ($Pe = \infty$). The time required for one wavelength of advection, say $\Delta t_\lambda$, is given by $\Delta t_\lambda = \lambda/u = n\Delta t$, giving $n = \lambda/u\Delta t = 2\pi/ku\Delta t = 2\pi/c\theta$ timesteps per period. Thus, (2.7-138) gives

$$T(x_j, \Delta t_\lambda) = e^{-\pi\theta/P}\cdot e^{i(j\theta - 2\pi)} = e^{-\pi\theta/P}e^{ij\theta}, \tag{2.7-141}$$

and we have found the damping factor:

$$D = e^{-\pi\theta/P} = e^{-2\pi k\kappa/u} = e^{-\pi kL/Pe}. \tag{2.7-142}$$

D describes the relative amplitude of an initial 'sine' wave after it has moved forward one wavelength; D is the damping of the exact solution per wavelength advected, as would occur on a fixed, discrete mesh (with periodic BC's). To interpret these results in a useful way, we consider the following sequence:
(1) Pick L (L = 1 is usual), the domain length.
(2) Pick N, giving $h = \Delta x = L/N$.
(3) Pick $P = uh/2\kappa$, which implies that Pe = NP (global Peclet number) is also fixed.
(4) Vary k such that $0 \le kh \le \pi$; i.e., $0 \le \theta \le \pi$, with the upper limit on k being set by the mesh. Alternatively, we have $\infty \ge \lambda/\Delta x \ge 2$; the given mesh can only support waves equal to or longer than $2\Delta x$.
(5) Plot D vs $\theta$; equivalently, D vs k for fixed h, P.
(6) Change P and repeat.

The results of such a procedure are shown in Figures 2.7-13 and 2.7-14, the latter perhaps showing better the wide range of damping covered between P = 0.1 and P = 100, and we more or less arbitrarily show a 0.1% cutoff at D = 0.001: any wave that decays more than three decades per period is considered to be rather evanescent.

So, what values of P should we consider in subsequent analysis? We have selected P = 0.1, P = 1, and P = 100 as covering sufficiently the entire range of behavior, with
P = 1 ($\alpha = c$) 'defining' the case where advection and diffusion are equally important (on the mesh), more or less. P = 100 is reasonably close to the pure advection limit, and P = 0.1 surely represents a diffusion-dominated case; e.g., for P = 100, an $8\Delta x$-wave ($\theta = \pi/4$) is damped by only about 2.4% per period, which requires about 28 periods for a decrease of 50%, whereas for P = 0.1 even a wave as long as $\lambda/\Delta x = 28.5$ ($\theta = 0.22$) decays one-thousand fold per period (a $285\Delta x$ wave decays by 50% per period). This forms our response to the issue regarding the range of P. For that of c, see the figures that follow, since the answer depends a lot on both the ODE method and P itself.

Fig. 2.7-13 Decay per wavelength for several values of P.

Fig. 2.7-14 Decay per wavelength for several values of P; semi-log plot. Curves of D vs $\theta$ (equivalently $\lambda/\Delta x$ from $\infty$ down to 2) for P = 100, P = 1, and P = 0.1, with the 0.1% cutoff marked.

o Return to TR. The results for TR, which 'set the stage' for most of the other ODE methods that follow, are shown in Figure 2.7-15. Phase speed (relative) is on the left and relative amplitude, per (2.7-133) and (2.7-136), on the right, for each of the three selected values of P, the top set showing 'advection-dominated' and the bottom set 'diffusion-dominated.'
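The quantities plotted in this subsection can be generated from a handful of formulas: the damping factor (2.7-142), the 'translation' (2.7-140), the TR amplification factor (2.7-129), the relative phase speed (2.7-132), and the relative amplitude (2.7-136). The sketch below is one possible arrangement, with hypothetical function names; it uses `cmath.phase` (an `atan2`-based principal value) as an assumed way of handling the quadrant issue, which does not unwrap phases beyond $\pm\pi$.

```python
import cmath
import math


def damping_per_wavelength(theta, P):
    """D of (2.7-142): exact decay per wavelength advected."""
    return math.exp(-math.pi * theta / P)


def lambda_dt(theta, c, P, lumped=False):
    """lambda*dt of (2.7-140) with alpha = c/P; m = 1 if lumped."""
    m = 1.0 if lumped else (2.0 + math.cos(theta)) / 3.0
    alpha = c / P
    return complex(alpha * (1.0 - math.cos(theta)), c * math.sin(theta)) / m


def xi_tr(theta, c, P, lumped=False):
    """TR amplification factor, equivalent to (2.7-129)."""
    z = lambda_dt(theta, c, P, lumped)
    return (1.0 - 0.5 * z) / (1.0 + 0.5 * z)


def relative_phase_speed(xi, theta, c):
    """u_P/u of (2.7-132), using the principal value of arg(xi)."""
    return -cmath.phase(xi) / (c * theta)


def relative_amplitude(xi, theta, c, P):
    """R_dt of (2.7-136)."""
    return abs(xi) * math.exp(c * theta**2 / (2.0 * P))


if __name__ == "__main__":
    theta = math.pi / 4  # an 8*dx wave
    print("D at P = 100:", damping_per_wavelength(theta, 100.0))
    for c in (0.01, 0.1, 1.0, 10.0):
        xi = xi_tr(theta, c, 100.0)
        print(c, relative_phase_speed(xi, theta, c), relative_amplitude(xi, theta, c, 100.0))
```

Swapping `xi_tr` for another method's $\xi(\lambda\Delta t)$ gives the corresponding curves, since only the amplification factor changes from method to method.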
Fig. 2.7-15 Phase and amplitude results for TR at three values of P. Panels: (a) phase speed and (b) amplitude for P = 100; (c) phase speed and (d) amplitude for P = 1; (e) phase speed and (f) amplitude for P = 0.1; each plotted vs $\theta$ for several values of c.

Remarks:
(1) Whereas reasonable accuracy is obtained for $c \le 1$ for the advection-dominated case, large c seriously slows down the waves, a typical characteristic of implicit methods: stability is gained at the expense of slowly advected waves. For $c \lesssim 0.1$, all of the phase error is 'spatial.'
(2) For the diffusion-dominated case, the phase speed is both strange and mostly irrelevant for $c \gtrsim 0.10$, for which TR oscillations have precluded any hope for accuracy. Accuracy requires $c \lesssim 0.01$, which corresponds to $\alpha \le 0.1$ for this case.
(3) The intermediate/transition case, P = 1, is 'intermediate.'
(4) The relative amplitude for P = 100 is basically just $e^{c\theta^2/2P}$ since $|\xi| \approx 1$ for all c's shown; see Remark (3) following (2.7-137). Also, here and in those results to follow, $R_{\Delta t}$ is bounded (for c and P finite) even though it may not appear so. For example, for a $4\Delta x$ wave ($\theta = \pi/2$) and c = 100, $R_{\Delta t} = 0.99973 \times e^{\pi^2/8} = 3.433$ and a $2\Delta x$ wave with c = 100 has $R_{\Delta t} = 0.5\,e^{\pi^2/2} = 69.5$ (it peaks at about 110 at $\theta = 3.07$). TR is clearly underdamped for large c.
(5) From (2.7-140), we note (for any ODE method) that P = 1 tends to equalize advection and diffusion in that the coefficient of each trigonometric term is c; P = 1 is indeed a logical 'transition' value.
(6) $R_s(\theta = \pi) = 0$, for TR and for all succeeding results; some curves are discontinuous at $\theta = \pi$.

o Backward Euler. Changing to a BE integration leads to

$$\xi = \frac{1}{1 + \lambda\Delta t} = \frac{1}{1 + \dfrac{\alpha(1 - \cos\theta) + ic\sin\theta}{m(\theta)}} \tag{2.7-143}$$

and

$$|\xi| = \frac{1}{\sqrt{\left[1 + \dfrac{\alpha(1 - \cos\theta)}{m}\right]^2 + \left[\dfrac{c\sin\theta}{m}\right]^2}}, \tag{2.7-144}$$

the results of which are shown in Figure 2.7-16, here omitting P = 0.1, which looks much like P = 1.

Fig. 2.7-16 Phase and amplitude results for BE at two values of P. Panels: (a) phase speed and (b) amplitude for P = 100; (c) phase speed and (d) amplitude for P = 1.

The major difference from the TR results shows up for the
advection-dominated case with large $\Delta t$; e.g., c = P = 100 shows that all but the shortest waves are seriously overdamped. On the other hand, $R_{\Delta t} = \frac{1}{7}\cdot e^{\pi^2/2} \approx 20$ at $\theta = \pi$; the $2\Delta x$ wave is underdamped. Thus, BE is bad at both ends for large c; it overdamps the good (long) waves and underdamps the bad (short) ones. Its phase speeds are not really different from those for TR except for $c \lesssim 1$; TR converges faster.

o BDF2. Changing to our last implicit method, BDF2, leads to (see Section 2.7.3c; take the positive root)

$$\xi = \frac{2 + \sqrt{1 - 2\lambda\Delta t}}{3 + 2\lambda\Delta t}, \tag{2.7-145}$$

where again $\lambda\Delta t$ is given by (2.7-140), the results of which are shown in Figure 2.7-17, which seems to merit the following remarks:
1. Its phase speed characteristics are remarkably close to those from BDF1 (BE) for P = 100, and rather more like TR for the diffusion-dominated case.
2. It shares the BE property of overdamping the good waves and underdamping the bad for large P and c.
3. A final remark, not from the figure: the spurious (negative) root for BDF2 is basically innocuous, a statement that will be seen not to apply to some explicit methods. If $\lambda\Delta t$ is real and > 1/2, then the decay of $\xi_-$ [change the sign in front of the radical in (2.7-145)] is damped oscillatory; both roots $\to 0$ as $\Delta t \to \infty$.

o Forward Euler. So much for implicit methods. We now turn to explicit methods, beginning with the simplest, FE:

$$\xi = 1 - \lambda\Delta t = 1 - [\alpha(1 - \cos\theta) + ic\sin\theta]\,3/(2 + \cos\theta), \tag{2.7-146}$$

the results of which are shown in Figure 2.7-18, for which the following remarks apply:
1. Recall (Section 2.7.2c) that FE is unstable if $c > P/3$ for $P \le \sqrt{3}$ (diffusion-limited case) and if $c > 1/P$ for $P > \sqrt{3}$. Several unstable results are shown for P = 100 to help 'understand' unstable behavior (at least for FE; different methods often display different types of unstable behavior); e.g., for c = 0.02, which is twice the stability limit, FE is unstable for all $\theta \lesssim 2.1$ ($\lambda \gtrsim 3\Delta x$), with the longer waves being more unstable (larger $|\xi| > 1$); see Hindmarsh et al. (1984) for further discussion of unstable modes for FE.
2. For the advection-dominated case, we see that if FE is stable, then it is also accurate, as well it should be since $c \le 0.01$ gives a very small timestep.
3. For the diffusion-dominated case (P = 0.1), stability requires $c \le 1/30$, or $\alpha = c/P \le 1/3$; the two results in the phase speed graph that 'end midstream' (c = 1, c = 0.1) are unstable for $\theta$ larger than the cutoff. Yes, the shorter waves are more unstable when diffusion dominates.

o FE with BTD. Changing now to a more stable (modified) FE, we add BTD (see Section 2.7.2e) and, for variety, lump the mass (m = 1; recall that mass lumping and explicit methods are a 'natural' combination). Here

$$\xi = 1 - [(\alpha + c^2)(1 - \cos\theta) + ic\sin\theta], \tag{2.7-147}$$
a result obtained by adding physical diffusion to the RHS of (2.7-43); $u^2\Delta t/2 \to \kappa + u^2\Delta t/2$. Figure 2.7-19 shows the results, which now are stable for $c \le 2P/(1 + \sqrt{1 + 4P^2})$, giving critical CFL numbers of about 0.995, 0.618, and 0.099 for the three values of P in the figure. Again some unstable results are displayed, this time showing short wave instability (when unstable) for all values of P. BTD has stabilized the long waves. Also noteworthy, for P = 100, is that time truncation error improves the phase speed, right up to the stability limit, as previously demonstrated in Figure 2.7-6.

Fig. 2.7-17 Phase and amplitude results for BDF2 at three values of P.

o FE with BTD à la TR. The most stable 'FE' method is that which adds BTD but treats it implicitly, via TR. From Section 2.7.5, we have
$$\xi = \frac{1 - \left[\dfrac{\alpha + c^2}{2}(1 - \cos\theta) + ic\sin\theta\right]/m}{1 + \dfrac{\alpha + c^2}{2}(1 - \cos\theta)/m}, \tag{2.7-148}$$

which is unconditionally stable. The results, shown in Figure 2.7-20, look rather like those for TR alone (Figure 2.7-15); thus, remarks made there apply here as well, except to emphasize that TR generates an unsymmetric matrix in the general case, whereas FE/BTD(TR) leads to a symmetric matrix.

Fig. 2.7-18 Phase and amplitude results for FE at three values of P. Panels: (a) phase speed and (b) amplitude for P = 100; (c) phase speed and (d) amplitude for P = 1; (e) phase speed and (f) amplitude for P = 0.1.
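The contrast between the explicit and TR treatments of BTD can be illustrated by scanning $|\xi|$ over $\theta$. The following sketch evaluates (2.7-147) and (2.7-148) and the explicit CFL limit $2P/(1 + \sqrt{1 + 4P^2})$; function names, the $\theta$ grid, and the rounding tolerance are assumptions made for illustration only.

```python
import math


def xi_fe_btd(theta, c, P):
    """(2.7-147): FE with explicit BTD, lumped mass, alpha = c/P."""
    alpha = c / P
    return complex(1.0 - (alpha + c * c) * (1.0 - math.cos(theta)),
                   -c * math.sin(theta))


def xi_fe_btd_tr(theta, c, P, lumped=False):
    """(2.7-148): FE advection with diffusion + BTD treated by TR."""
    m = 1.0 if lumped else (2.0 + math.cos(theta)) / 3.0
    alpha = c / P
    d = 0.5 * (alpha + c * c) * (1.0 - math.cos(theta)) / m
    return complex(1.0 - d, -c * math.sin(theta) / m) / (1.0 + d)


def c_limit_fe_btd(P):
    """Critical CFL number quoted in the text for explicit FE/BTD (LM)."""
    return 2.0 * P / (1.0 + math.sqrt(1.0 + 4.0 * P * P))


def is_stable(xi, c, P, n_theta=361, tol=1e-12):
    """True if |xi| <= 1 (within tol) at every sampled theta in [0, pi]."""
    return all(abs(xi(math.pi * i / (n_theta - 1), c, P)) <= 1.0 + tol
               for i in range(n_theta))


if __name__ == "__main__":
    for P in (100.0, 1.0, 0.1):
        lim = c_limit_fe_btd(P)
        print(f"P = {P:6.1f}: c_limit = {lim:.3f},",
              "explicit stable just below:", is_stable(xi_fe_btd, 0.99 * lim, P),
              "| just above:", is_stable(xi_fe_btd, 1.01 * lim, P),
              "| TR variant at c = 100:", is_stable(xi_fe_btd_tr, 100.0, P))
```

The explicit variant should switch from stable to unstable across the quoted limit, while the TR variant should remain bounded by one at any c sampled.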
Fig. 2.7-19 Phase and amplitude results for FE with BTD (and LM) at three values of P. Panels: (a) phase speed and (b) amplitude for P = 100; (c) phase speed and (d) amplitude for P = 1; (e) phase speed and (f) amplitude for P = 0.1.

o Second-order Adams-Bashforth. Switching back to purely explicit, we next visit our first second-order method, AB2, in the LM version (m = 1). It turns out that AB2 is also a method that forces a more serious consideration of the spurious/extraneous root, once it is realized that this root can go unstable before the physical one. From Section 2.7.1, we have

$$\xi_\pm = \tfrac{1}{2} - \tfrac{3}{4}\lambda\Delta t \pm \tfrac{1}{2}\sqrt{1 - \lambda\Delta t + \tfrac{9}{4}(\lambda\Delta t)^2} \tag{2.7-149}$$

with $\lambda\Delta t$ given by (2.7-140). Since also AB2 is not self-starting, we will show the solution that obtains when FE is used for the first step:
[Full-page figure, rotated in the original: panels of relative phase speed and amplitude against $\theta$.]
It turns out that the stability limit, $|\xi_\pm(\theta, c, P = c/\alpha)| \le 1$, is rather trickier here than for AB1 (FE) in that sometimes it is the physical root ($\xi_+$) and sometimes the spurious root ($\xi_-$) that sets the limit on $\Delta t$. This is clearly not a nice feature; spurious roots should never dominate. C'est la vie. Figure 2.7-21 shows the phase (from $\xi_+$) and amplitude results for this case, including some unstable results for each P. For P = 100, the physical root sets the stability limit (at c = 0.315 to 0.316 with $\theta_{crit} \approx 1.25$); in fact, for $P \to \infty$, we have 'numerically' determined the following asymptotic stability result:

$$c_{crit} = \frac{3}{2P^{1/3}} \tag{2.7-151}$$

at $\theta_{crit}$ = 1.23 to 1.24, which gives $c_{crit}$ = 0.323 for P = 100, suggesting that (2.7-151) is approached from below. But for $P \le 1$, we have the disconcerting result that stability is set by the spurious root; the $2\Delta x$ wave in $\xi_-$ is the first to go unstable, and it does so at the 'diffusion' limit, $\alpha = 1/2$ ($c_{crit} = P/2$). Thus the ($\xi_+$) curves for c > 1/2 in Figure 2.7-21(c) and those for c > 1/20 in Figures 2.7-21(e) and (f) are actually unstable in $\xi_-$ (not shown); they in fact show the behavior of the physical root, which itself can also be unstable (e.g., c = 5 for P = 1). Regarding damping, we generally see the nice result that stable solutions are fairly accurate and the less nice (but typical) result that the shortest waves are the least damped.

[Fig. 2.7-21 Phase and amplitude results for AB2 (with LM) at three values of P. Panels (a), (b): phase speed and amplitude for P = 100; (c), (d): P = 1; (e), (f): P = 0.1; all plotted against $\theta$ for several values of c.]

A more detailed look at the stability behavior as a function of $\theta$ for each of the three values of P is shown in Figure 2.7-22, which plots $\xi_\pm(\theta)$ for $0 \le \theta \le \pi$ for each P at close to the critical value of c: namely, c = 0.315 for P = 100, c = 0.5 for P = 1, and c = 0.05 for P = 0.1. Figure 2.7-23 is a 'zoom' from Figure 2.7-22 for the P = 100 case, in order to see the detailed behavior, and the unit circle is thus distorted. All physical roots start at $|\xi| = 1.0$ (for $\theta = 0$) and all spurious roots start at zero. For P = 100, the spurious root is innocuous and the physical root is (properly) neutrally stable, $|\xi_+| \approx 1$, for all waves between $\infty$ and $4\Delta x$, with shorter waves being nicely damped; $\theta_c$ has the largest $|\xi|$. But when diffusion dominates, so does the spurious root. For P = 1 and 0.1, the physical root just reaches $|\xi_+| = 1/2$ at the same 'time' ($\theta$) that the spurious root attains $|\xi_-| = 1$, at $\theta = \pi$; for $c > c_{crit}$, the $2\Delta x$ unphysical mode blows up.
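The AB2 stability behavior described above can be explored numerically. The following is only a minimal sketch: (2.7-140) is not reproduced in this section, so the lumped-mass form $\lambda\Delta t = \alpha(1 - \cos\theta) + ic\sin\theta$ with $\alpha = c/P$ is an assumption (chosen because it reproduces the limits quoted in the text), and the function names are hypothetical.

```python
import cmath
import math


def lam_dt(theta, c, P):
    """Assumed lumped-mass form of (2.7-140), with alpha = c / P."""
    alpha = c / P
    return alpha * (1.0 - math.cos(theta)) + 1j * c * math.sin(theta)


def ab2_roots(z):
    """Both roots of xi**2 - (1 - 1.5*z)*xi - 0.5*z = 0, where z = lambda*dt."""
    disc = cmath.sqrt(1.0 - z + 2.25 * z * z)
    return 0.5 * (1.0 - 1.5 * z + disc), 0.5 * (1.0 - 1.5 * z - disc)


def ab2_max_modulus(c, P, n=720):
    """Largest |xi| over both roots and over n+1 values of theta in [0, pi]."""
    worst = 0.0
    for k in range(n + 1):
        z = lam_dt(math.pi * k / n, c, P)
        worst = max(worst, max(abs(root) for root in ab2_roots(z)))
    return worst


# Advection-dominated case: scan c around the quoted limit for P = 100.
for c in (0.30, 0.315, 0.35):
    print("P = 100, c = %.3f: max |xi| = %.7f" % (c, ab2_max_modulus(c, 100.0)))

# Diffusion-dominated case: the two roots of the 2*dx wave at c = P/2.
print("P = 1, c = 0.5, theta = pi:", ab2_roots(lam_dt(math.pi, 0.5, 1.0)))
```

The scan takes the maximum over both roots, so it does not need to decide which root is the physical one; that labeling depends on the branch of the square root and is best tracked by continuity in $\theta$.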
As a final comment, we mention that the (apparent) phase speed of the spurious mode for $P \le 1$, while not very meaningful/useful, behaves, for $\theta \ll 1$, like $R_s \approx \pi/(2c\theta)$ or $R_s \approx -3\pi/(2c\theta)$, the result depending on the 'quadrant selection' when evaluating $\tan^{-1}[\mathrm{Im}(\xi_-)/\mathrm{Re}(\xi_-)]$ in (2.7-132). Either choice leads to the same (worthless) numerical solution; they differ only by an integral number of wavelengths (one in this case), which is not discernible; i.e., the difference between the two expressions is $2\pi/(c\theta)$. We defer further discussion of these subtle and perhaps not-so-important aspects until we present leapfrog results, which we do after AB3.

o Third-order AB. Switching now to third-order Adams-Bashforth, we admit up front that it is extremely difficult to analyze the three-step formula given by (2.7-11), leading as it does to the following cubic equation:

$$(\xi - 1) + \tfrac{1}{12}\left[23 - 16/\xi + 5/\xi^2\right](\lambda\Delta t) = 0, \tag{2.7-152}$$

where $\lambda\Delta t$ is given by (2.7-140) with m = 1 (LM). The first thing we would like to mention is that the 'simple-looking' closed-form expressions given in many 'handbooks' (e.g., the CRC Math Tables) are inadequate as presented when the equation and roots are complex. The second thing to mention is that even current mathematical software packages can get 'lost' when branch cuts in the complex plane are involved ('root switching' will occur in ways that only the programmer can know for sure). And, since we had so much trouble with the cube root formula, we present, for the convenience of the reader, the proper way to obtain the three roots of $\xi^3 + \left(\tfrac{23}{12}\lambda\Delta t - 1\right)\xi^2 - \tfrac{4}{3}(\lambda\Delta t)\xi + \tfrac{5}{12}\lambda\Delta t = 0$:

$$\xi_1 = A + B - (23\lambda\Delta t/12 - 1)/3, \qquad \xi_\pm = -\tfrac{1}{2}(A + B) \pm (i\sqrt{3}/2)(A - B) - (23\lambda\Delta t/12 - 1)/3,$$

where $A = \left[-b/2 + \sqrt{(b/2)^2 + (a/3)^3}\,\right]^{1/3}$ and $B = -a/(3A)$, with $a = -4\lambda\Delta t/3 - (23\lambda\Delta t/12 - 1)^2/3$ and $b = -\lambda\Delta t/36 + 23(\lambda\Delta t)^2/27 + 2(23\lambda\Delta t/12 - 1)^3/27$.
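The root formulas above translate almost line for line into code. The following is a minimal sketch using only the Python standard library; the function name is hypothetical, and the choice of sign for the inner square root (made here only to keep A away from zero) is an implementation assumption, not something the text prescribes.

```python
import cmath


def ab3_roots(z):
    """Roots of xi^3 + (23z/12 - 1) xi^2 - (4z/3) xi + 5z/12 = 0, z = lambda*dt."""
    z = complex(z)
    p = 23.0 * z / 12.0 - 1.0
    a = -4.0 * z / 3.0 - p * p / 3.0
    b = -z / 36.0 + 23.0 * z * z / 27.0 + 2.0 * p ** 3 / 27.0
    s = cmath.sqrt((b / 2.0) ** 2 + (a / 3.0) ** 3)
    # Either sign of the square root is admissible; keep the larger cube.
    w_plus, w_minus = -b / 2.0 + s, -b / 2.0 - s
    w = w_plus if abs(w_plus) >= abs(w_minus) else w_minus
    A = w ** (1.0 / 3.0)                   # any one of the three cube roots
    B = -a / (3.0 * A) if A != 0 else 0j   # B is tied to A, never a second cube root
    shift = p / 3.0
    half_i_sqrt3 = 0.5j * 3.0 ** 0.5
    xi_1 = A + B - shift
    xi_plus = -0.5 * (A + B) + half_i_sqrt3 * (A - B) - shift
    xi_minus = -0.5 * (A + B) - half_i_sqrt3 * (A - B) - shift
    return xi_1, xi_plus, xi_minus


# Pure diffusion, 2*dx wave, lambda*dt = 6/11 (three real roots).
print(ab3_roots(6.0 / 11.0))
```

Which of the three returned values is the physical root depends on the cube-root branch taken for A, so in practice the roots still have to be labeled by continuity from $\lambda\Delta t = 0$, where the physical root equals 1.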
The handbook formula is as above except for the calculation of B, for which the seemingly equivalent form is presented: $B = \left[-b/2 - \sqrt{(b/2)^2 + (a/3)^3}\,\right]^{1/3}$, which of course implies $AB = -a/3$. But if left to its own devices, there is no assurance that the B obtained from the cube root formula will be the one that is compatible with A; only $B = -a/(3A)$ can assure this.

[Fig. 2.7-22 The two roots from AB2 at three values of P (at $c_{crit}$), plotted in the complex plane with the unit circle, from $\theta = 0$ to $\theta = \pi$: $\xi_\pm$ for (P = 0.1, c = 0.05), (P = 1, c = 0.5), and (P = 100, c = 0.315).]

[Fig. 2.7-23 Zoom on the physical root for P = 100 (c = 0.315), with the unit circle; marked points are $\theta = 0$, $\theta = \pi/2$, and $\theta_c = 1.236$, where $|\xi_+| = 0.9999815$.]

We show only the phase speed for the advection-dominated case for AB3, in Figure 2.7-24, for which the pure advection stability limit is $12\sqrt{11}/55 \approx 0.723627$ (D. Griffiths, personal communication), showing that all stable results are accurate. The more interesting graphs are those shown in Figure 2.7-25 for $0 \le \theta \le \pi$, for both the physical mode ($\xi_1$) and the two spurious modes ($\xi_2 = \xi_+$ and $\xi_3 = \xi_-$) at very close to the critical value of c for each of two P's: P = 100 (advection-dominated) and P = 1 (nearly diffusion-dominated; P = 0.1 did not seem necessary). For each P, the first mode to go unstable is the spurious mode, $\xi_-$ (we surmise that this is also the case for P = 0 and $P = \infty$). It does so at $\theta = \pi/2$ ($4\Delta x$ wave) for the former and at $\theta = \pi$ ($2\Delta x$) for the latter, for which case the P = 1 results are very close to those for pure diffusion (P = 0), for which we see in Figure 2.7-25 that at $\theta = \pi$, $\xi_1 = 1/2$, $\xi_2 = 15/33 \approx 0.45$, and $\xi_3 = -1$, the approximations being exact for P = 0, for which c = 3/11 defines the stability limit (D. Griffiths, personal communication). (The difference between the two former roots may be difficult to discern in the figure). The interesting and disconcerting consequence of this behavior is that even when stable there is a range of $\Delta t$ for which the spurious root will have a larger value of $|\xi|$ than that of the physical root (e.g., for wavelengths near $4\Delta x$ for the advection-dominated case), ultimately resulting in a solution that is stable but specious, which behavior is of course reminiscent of implicit methods.

[Fig. 2.7-24 AB3 phase speed for P = 100 (with LM).]

[Fig. 2.7-25 The three roots for AB3 at two values of P, each at very close to its stability limit, plotted in the complex plane: (a) P = 100, $c = 12\sqrt{11}/55$; (b) P = 1, c = 3/11.]

o Leapfrog. Next, we switch to one of the most meteorologically popular explicit schemes for pure advection ($P = \infty$), LF2 (second-order leapfrog), which also displays some peculiarities that probably are not familiar to many, partly because in meteorology, LF2 is always 'filtered' in practice, to control what they call the 'time-splitting instability'; see, for example, Durran (1991). Since LF2 is much more commonly associated with FDM rather than FEM, we shall (again) lump the mass via $m(\theta) = 1$. We begin by showing the analytic solution when FE is used for the first step:

$$T_j^n = (a\xi_+^n + b\xi_-^n)e^{ij\theta}, \tag{2.7-153}$$

where

$$a = \frac{1 + \sqrt{1 - c^2\sin^2\theta}}{2\sqrt{1 - c^2\sin^2\theta}}, \qquad b = \frac{-1 + \sqrt{1 - c^2\sin^2\theta}}{2\sqrt{1 - c^2\sin^2\theta}},$$

$\xi_+ = \sqrt{1 - c^2\sin^2\theta} - ic\sin\theta$ is the physical root, $\xi_- = -\sqrt{1 - c^2\sin^2\theta} - ic\sin\theta$ is the spurious root, and we note that $a + b = 1$, $\xi_+\xi_- = -1$, and, for $c\sin\theta < 1$, $|\xi_+| = |\xi_-| = 1$. Also, this solution is only valid for $c\sin\theta \ne 1$; if $c\sin\theta = 1$, we have the special case of root coalescence and linear instability owing to a repeated root ($\xi_+ = \xi_- = -i$). The solution then is given by $T_j^n = (A + Bn)\xi^n e^{ij\theta}$ in general, and for a FE first step, it is

$$T_j^n = (1 + in)(-i)^n e^{ij\theta}, \tag{2.7-154}$$

where $c\sin\theta = 1$. An overall picture of the LF2 behavior can be obtained by examining $\xi_\pm$ in the complex plane, which we present in Figure 2.7-26, in which the arrows depicting the parameter $\theta$ ($0 \le \theta \le \pi$) that look 'parallel' to the unit circle are actually on the unit circle.
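The closed-form leapfrog solution (2.7-153), and the repeated-root case (2.7-154), can be checked against a direct recursion for a single Fourier mode. This is a minimal sketch with hypothetical function names; it assumes pure advection with lumped mass, so that $\lambda\Delta t = ic\sin\theta$, and it tracks only the amplitude multiplying $e^{ij\theta}$.

```python
import cmath
import math


def leapfrog_history(c, theta, nsteps):
    """Mode amplitude for n = 0..nsteps: forward Euler start, then leapfrog."""
    s = 1j * c * math.sin(theta)        # lambda*dt for pure advection, m = 1
    amp = [1.0 + 0.0j, 1.0 - s]         # n = 0 and the FE first step
    for n in range(1, nsteps):
        amp.append(amp[n - 1] - 2.0 * s * amp[n])
    return amp


def leapfrog_closed_form(c, theta, n):
    """a*xi_plus**n + b*xi_minus**n; requires c*sin(theta) != 1."""
    root = cmath.sqrt(1.0 - (c * math.sin(theta)) ** 2)
    xi_plus = root - 1j * c * math.sin(theta)
    xi_minus = -root - 1j * c * math.sin(theta)
    a = (1.0 + root) / (2.0 * root)
    b = (-1.0 + root) / (2.0 * root)
    return a * xi_plus ** n + b * xi_minus ** n, (a, b, xi_plus, xi_minus)


history = leapfrog_history(1.0, math.pi / 4.0, 30)
value, (a, b, xi_plus, xi_minus) = leapfrog_closed_form(1.0, math.pi / 4.0, 30)
print("recursion:", history[30], " closed form:", value)
print("a =", a, " b =", b)
```

Because the square root is taken with `cmath.sqrt`, the same closed form also covers $c\sin\theta > 1$, where the two roots leave the unit circle.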
The symmetric behavior between $\xi_+$ and $\xi_-$ is only 'violated' for c > 1 and $\sin^{-1}(1/c) < \theta < \pi - \sin^{-1}(1/c)$. The phase speeds of the two roots, meaningful only when c < 1 in practice, are given by $R_s = -\arg(\xi)/c\theta = -\varphi/c\theta$ in general [see (2.7-132)] and

$$R_s^+ = -\frac{1}{c\theta}\tan^{-1}\left(\frac{-c\sin\theta}{\sqrt{1 - c^2\sin^2\theta}}\right) = \frac{\sin^{-1}(c\sin\theta)}{c\theta} \tag{2.7-155}$$

and

$$R_s^- = -\frac{\sin^{-1}(c\sin\theta) \pm \pi}{c\theta} \tag{2.7-156}$$

in particular, both valid for $c \le 1$ and $\theta \le \pi$. The difference between the two choices of sign for $R_s^-$ gives a difference in phase speed of $2\pi/c\theta$ that corresponds to a difference in phase (wave location) of $2\pi/\theta$ grid intervals per timestep, which is just one wavelength for the chosen wave; i.e., the difference is not discernible: either choice of sign gives the same numerical result, even though one choice appears to give a positive phase speed and the other a negative one. The spurious root generates an ambiguous wave.

[Fig. 2.7-26 Amplification factors for leapfrog in the complex plane: (a) c < 1, where the roots reach $\theta = \sin^{-1}c$ at $\theta = \pi/2$ on the unit circle; (b) c > 1, where the roots coalesce at $\theta = \sin^{-1}(1/c)$ and $\pi - \sin^{-1}(1/c)$ and, at $\theta = \pi/2$, $\xi_- = -i(\sqrt{c^2 - 1} + c)$.]
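The ambiguity in the spurious phase speed is easy to confirm numerically. The sketch below (hypothetical function names, standard library only) evaluates (2.7-155) and both sign choices in (2.7-156), then rebuilds the per-step phase factor $e^{-iRc\theta}$ implied by each; the two spurious choices give the same factor.

```python
import cmath
import math


def leapfrog_phase_speeds(c, theta):
    """R+ and both branches of R-; requires c*sin(theta) <= 1 and theta > 0."""
    phi = math.asin(c * math.sin(theta))
    r_plus = phi / (c * theta)
    r_minus_first = -(phi + math.pi) / (c * theta)    # '+' sign in (2.7-156)
    r_minus_second = -(phi - math.pi) / (c * theta)   # '-' sign in (2.7-156)
    return r_plus, r_minus_first, r_minus_second


def phase_factor(r, c, theta):
    """Unit-modulus amplification factor implied by a relative phase speed r."""
    return cmath.exp(-1j * r * c * theta)


# An 8*dx wave (theta = pi/4) at c = 1.
r_plus, r_first, r_second = leapfrog_phase_speeds(1.0, math.pi / 4.0)
print("R+ =", r_plus, " R- =", r_first, "or", r_second)
print(phase_factor(r_first, 1.0, math.pi / 4.0),
      phase_factor(r_second, 1.0, math.pi / 4.0))
```

The two values of $R_s^-$ differ by $2\pi/c\theta$, so the corresponding exponents differ by exactly $2\pi$, which is why the printed factors agree.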
These phase speeds are plotted in Figures 2.7-27 and 2.7-28 for several values of c and using the minus sign in (2.7-156), and the following remarks apply, considering also the previous figures ($\xi_\pm$) and equations:

1. Only c < 1 gives bounded solutions; and then, only small $\theta$ gives accurate ones ($a \approx 1$ and $b \approx 0$), except for the exceptional case of $c \approx 1$ and $\theta \le \pi/2$.

2. At c = 1, the $4\Delta x$ wave ($\theta = \pi/2$) is always linearly unstable, and the curve for $\theta \ge \pi/2$ is given by $R_s = \pi/\theta - 1$. [See also Miller (1991).]

3. For c > 1, the physical mode is damped ($|\xi_+| < 1$) for $\theta$ in the 'neighborhood' of $\pi/2$; i.e., for $\sin^{-1}(1/c) < \theta < \pi - \sin^{-1}(1/c)$, while the spurious mode displays unbounded growth with period $4\Delta t$ ($|\xi_-| > 1$) in this same $\theta$ range, with the $4\Delta x$ wave growing fastest.

4. For an example of the ambiguity, consider an $8\Delta x$ wave: $\theta = \pi/4$ and c = 1. Here $a = (1 + \sqrt{2})/2 = 1.207$, $b = 1 - a \approx -0.207$, $\xi_+ = (1 - i)/\sqrt{2}$, $\xi_- = -(1 + i)/\sqrt{2}$. For the first choice for phase speed (+ sign), we have $R_s^+ = 1$ and $R_s^- = -5$; the physical mode has 'size' 1.207 and speed 1, while the spurious mode (noise) has size $-0.207$ and speed $-5$. The net result (sum of the two) is, of course, dispersion. For the second choice (minus sign), however, while $R_s^+$ remains at 1, we obtain $R_s^- = 3$. The difference in the two speeds for the 'noisy' part is 8, which means, since c = 1, that the difference is $8\Delta x$ per timestep, which is exactly one wavelength and not discernible.

[Fig. 2.7-27 Phase speed for LF2; pure advection with lumped mass. Relative phase speed against $\theta$ for several values of c up to 1.0.]

[Fig. 2.7-28 Phase speed for leapfrog's spurious root. Relative phase speed against $\theta$ for several values of c.]

If diffusion is present (in 1D, 2D, or 3D) and treated via Dufort-Frankel, then some recent results by Kwok and Tam (1993) may prove useful, at least for second-order centered FDM:

$$\Delta t \le \left[(u^2 + v^2 + w^2)\left(\frac{1}{\Delta x^2} + \frac{1}{\Delta y^2} + \frac{1}{\Delta z^2}\right)\right]^{-1/2} \tag{2.7-157}$$

for constant diffusivity, a result which is then independent of the diffusivity [cf. (2.7-28) and (2.7-29) for FE on the same equation]; and therefore also applies to pure advection.

o Second-order Runge-Kutta. The last explicit method we shall examine is RK (Runge-Kutta), both second- and fourth-order. RK2 [(2.7-12) with $\gamma = 1$ for simplicity] is, for $\dot{y} = -\lambda y$,

$$y_{n+1/2} = y_n - \frac{\lambda\Delta t}{2}y_n,$$

$$y_{n+1} = y_n + \Delta t\,\dot{y}_{n+1/2} = y_n - \lambda\Delta t\left(y_n - \frac{\lambda\Delta t}{2}y_n\right) = \left[1 - \lambda\Delta t + (\lambda\Delta t)^2/2\right]y_n, \tag{2.7-158}$$

and our amplification factor for the AD equation is thus

$$\xi = 1 - \lambda\Delta t + (\lambda\Delta t)^2/2, \tag{2.7-159}$$

with $\lambda\Delta t$ given by (2.7-140) and we choose $m(\theta) = 1$ for LM.
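The RK2 amplification factor (2.7-159) can be scanned over $\theta$ to locate stability limits numerically. As a hedged sketch only: (2.7-140) is not reproduced in this section, so the form $\lambda\Delta t = [\alpha(1 - \cos\theta) + ic\sin\theta]/m(\theta)$ with $\alpha = c/P$ is an assumption, and the function names are hypothetical.

```python
import math


def lam_dt(theta, c, P, lumped=True):
    """Assumed form of (2.7-140); m = 1 (LM) or (2 + cos(theta))/3 (CM)."""
    alpha = c / P
    m = 1.0 if lumped else (2.0 + math.cos(theta)) / 3.0
    return (alpha * (1.0 - math.cos(theta)) + 1j * c * math.sin(theta)) / m


def rk2_factor(z):
    """Amplification factor (2.7-159) with z = lambda*dt."""
    return 1.0 - z + 0.5 * z * z


def max_modulus(factor, c, P, lumped=True, n=720):
    """Largest |xi| over n+1 equally spaced theta values in [0, pi]."""
    return max(abs(factor(lam_dt(math.pi * k / n, c, P, lumped)))
               for k in range(n + 1))


for c in (0.35, 0.40, 0.45, 0.50):
    print("P = 100, c = %.2f: max |xi| = %.6f" % (c, max_modulus(rk2_factor, c, 100.0)))

# The 4*dx wave at P = 1, c = 1.
print("xi(theta = pi/2) =", rk2_factor(lam_dt(math.pi / 2.0, 1.0, 1.0)))
```

Under the assumed form of $\lambda\Delta t$, the case P = 1, c = 1 reduces to $\xi = e^{-i\theta}\cos\theta$, which vanishes at $\theta = \pi/2$.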
The results, including a few that are unstable, are shown in Figure 2.7-29, which we believe merit the following Remarks:

(1) While RK2 is, like AB2 and FE, unconditionally unstable for pure advection, it is 'reasonably' stable for the advection-dominated case; i.e., c = 0.416 is not too terribly stringent.

(2) For $P \le 1$, the stability limit is that of pure diffusion, $\alpha \le 1$ ($c \le P$).

(3) The damping characteristics for P = 100 are quite good.

(4) The $4\Delta x$ wave is very special (strange) when P = 1 and c = 1 (the stability limit); it has an ambiguous phase speed and a damping factor of zero! Whatever its phase speed is, it is gone after one timestep, because $\xi = 0$.

(5) The negative phase speeds for short waves in the diffusion-dominated case (and for P = 1, the 'transitional' case) are disconcerting; while stable (for $c \le P$), they surely are not accurately represented. It seems that $\alpha < {\sim}0.1$ is needed for good accuracy in these cases.

[Fig. 2.7-29 Phase and amplitude results for RK2 with LM at three values of P. Panels (a) to (f): phase speed and amplitude against $\theta$ for several values of c.]

o Fourth-order Runge-Kutta. Finally, we discuss our last explicit method for this section, RK4. (In the next section we shall subject a few specialized AD methods to this von Neumann analysis.) Applying (2.7-13) to $\dot{y} = -\lambda y$ gives, perhaps not surprisingly,

$$y_{n+1} = \left[1 - \lambda\Delta t + (\lambda\Delta t)^2/2 - (\lambda\Delta t)^3/6 + (\lambda\Delta t)^4/24\right]y_n, \tag{2.7-160}$$

and thus $\xi$ for AD is given by the bracketed term. Figures 2.7-30 and 2.7-31 show the results, for CM and LM, respectively, the latter of which was really computed only to help us to believe the CM results. Again, some unstable results are also displayed.

[Fig. 2.7-30 Phase and amplitude results for RK4 at three values of P. Panels (a) to (f): phase speed and amplitude against $\theta$ for several values of c.]

[Fig. 2.7-31 Phase and amplitude results for RK4 with LM at three values of P. Panels (a) to (f): phase speed and amplitude against $\theta$ for several values of c.]

Remarks:

(1) For pure advection, $\lambda\Delta t = ic\sin\theta/m(\theta)$ from (2.7-140), and it is easy to verify that $|\xi| \le 1$ for $c\sin\theta/m(\theta) \le \sqrt{8}$. Hence, the stability limit for LM (m = 1) occurs at $c = \sqrt{8} = 2.828$ at $\theta = \pi/2$ (a $4\Delta x$ wave) and, for CM, at $c = \sqrt{8}\,(2 + \cos\theta_c)/(3\sin\theta_c) = \sqrt{8/3} = 1.633$, where $\theta_c = 2\pi/3$ (a $3\Delta x$ wave).

(2) For P = 100, the CM (LM) phase speed error is 'all spatial' for $c < {\sim}1$ (2), where of course the spatial error is rather larger for LM.

(3) The damping factor ($R_{\Delta t}$) for P = 100 is somewhat unusual for both CM and LM; middle waves are rather overdamped as c approaches the stability limit, except those close to the critical (${\sim}4\Delta x$ for LM and ${\sim}3\Delta x$ for CM).

(4) For pure diffusion and LM, we have the ($2\Delta x$) stability limit $\alpha = 1.39$ from Hirsch (1988, p. 448), which we extend to CM via $\alpha_c = 1.39/3 = 0.463$; these results are close to the experimental results for both P = 1 and P = 0.1.
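The RK4 limits quoted in the remarks can be recovered from the bracketed polynomial in (2.7-160). The following is a minimal sketch with hypothetical function names; the consistent-mass factor $m(\theta) = (2 + \cos\theta)/3$ for linear elements is assumed, since its definition lies outside this section.

```python
import math


def rk4_factor(z):
    """Bracketed term of (2.7-160) with z = lambda*dt."""
    return 1.0 - z + z ** 2 / 2.0 - z ** 3 / 6.0 + z ** 4 / 24.0


def consistent_mass(theta):
    """Assumed m(theta) for linear elements with consistent mass."""
    return (2.0 + math.cos(theta)) / 3.0


def lumped_mass(theta):
    return 1.0


def advection_limit(mass, n=3000):
    """Smallest c for which c*sin(theta)/m(theta) reaches sqrt(8) at some theta."""
    return min(math.sqrt(8.0) * mass(math.pi * k / n) / math.sin(math.pi * k / n)
               for k in range(1, n))


print("LM advection limit:", advection_limit(lumped_mass))
print("CM advection limit:", advection_limit(consistent_mass))
# Pure diffusion, 2*dx wave, LM: lambda*dt = 2*alpha.
for alpha in (1.39, 1.40):
    print("alpha = %.2f: |xi| = %.6f" % (alpha, abs(rk4_factor(2.0 * alpha))))
```

On the imaginary axis, $|\xi|^2 = 1 - y^6/72 + y^8/576$ with $y = c\sin\theta/m$, so the bound $y \le \sqrt{8}$ follows directly; the grid search only locates the wave that reaches it first.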
With this, we close (finally) our discussion of ODE methods applied to the AD equation with linear elements, leaving both quads and other ODE methods to others, except for the next short section!

d. Generalizations and extensions

The results presented above, while rather thorough in some ways, are noticeably deficient in others. Here are two:

1. Only linear elements were considered.
2. Only phase speed, not group speed, was considered.

Thus, to partially remedy these shortcomings we offer below a methodology, and a few results using it, that may be profitably pursued by others. But we shall also restrict the discussion and analysis to the more important limiting case: pure advection. (The methodology, however, is not so limited.) The methodology depends/relies upon a knowledge of the eigenproblem results for the spatial discretization chosen, linear FEM being just one special case. Thus, we begin by returning to the general ODE's that describe pure advection,

$$M\dot{T} + KT = 0, \tag{2.7-161}$$

which is intended to be a generalization of (2.6-17) for linear FEM; i.e., M and K could come from quadratic [cf. (2.6-37)] or cubic FEM's, or from a CVFEM, or even simple FDM with M = I. Now choose an eigenvector of $M^{-1}K$ as an IC, say v, and find a solution in the form of (2.6-20) to obtain

$$Kv = i\omega Mv, \tag{2.7-162}$$

which, since v (and M, and K) are presumed known, is a 'defining' equation for the frequency, $\omega$; $i\omega$ is, of course, an eigenvalue of $M^{-1}K$. Once $\omega$ is known, so too is the solution of (2.7-161) with the given IC. But we care more about phase and group speeds, both available from the frequency, via

$$P = \omega/k \tag{2.7-163}$$

and

$$G = d\omega/dk, \tag{2.7-164}$$

a la (2.6-46) and (2.6-48) for linear FEM. We have changed the name of the phase speed from c to P because now c is reserved for $u\Delta t/h$, the Courant number.
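For a concrete instance of (2.7-162) to (2.7-164), the sketch below takes linear elements on a uniform periodic mesh and the Fourier eigenvector $v_j = e^{ij\theta}$. The stencils used, the consistent mass row $(h/6)(1, 4, 1)$ and the Galerkin advection row $(u/2)(-1, 0, 1)$, are the standard ones but are assumptions here, since (2.6-17) lies outside this section; the function names are hypothetical.

```python
import cmath
import math


def fem_frequency(theta, u=1.0, h=1.0):
    """omega from K v = i*omega*M v for v_j = exp(i*j*theta), theta = k*h."""
    def v(j):
        return cmath.exp(1j * j * theta)

    Mv = h * (v(-1) + 4.0 * v(0) + v(1)) / 6.0   # row of M applied to v at node 0
    Kv = u * (v(1) - v(-1)) / 2.0                # row of K applied to v at node 0
    return (Kv / (1j * Mv)).real


def phase_speed(theta, u=1.0, h=1.0):
    """P = omega / k with k = theta / h."""
    return fem_frequency(theta, u, h) / (theta / h)


def group_speed(theta, u=1.0, h=1.0, d=1e-6):
    """G = d(omega)/dk by a centered difference in theta."""
    return (fem_frequency(theta + d, u, h) - fem_frequency(theta - d, u, h)) / (2.0 * d / h)


for theta in (0.1, math.pi / 2.0, 2.0 * math.pi / 3.0, math.pi):
    print("theta = %.4f: P/u = %.5f, G/u = %.5f"
          % (theta, phase_speed(theta), group_speed(theta)))
```

With these stencils the frequency is $\omega h/u = 3\sin\theta/(2 + \cos\theta)$, so the group speed changes sign at $\theta = 2\pi/3$ and reaches $-3u$ for the $2\Delta x$ wave. Any other discretization can be substituted by changing the two stencil lines.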
Note that the dependence of $\omega$ on the wave number implies, necessarily, that the eigenvector, v, is a (known) function of k, which we presume to be always true. After all, this is just some more 'Fourier analysis'; and, in fact, (2.6-20) will most often be the appropriate eigenvector. Next, to show the methodology that we wish to introduce, we first get specific and apply a particular ODE method, TR in our case, to (2.7-161), but in a somewhat special way. TR on (2.7-161) yields, using $T_{n+1} = \xi T_n$ as usual,

$$KT_n = \frac{2}{\Delta t}\,\frac{1 - \xi}{1 + \xi}\,MT_n, \tag{2.7-165}$$

which clearly 'resembles' (2.7-162). All we need do now to complete the connection is to apply TR to the same eigenvector that we employed for the ODE. Thus, with $T_0 = v$ and n = 0, it is clear that $(1 - \xi)/(1 + \xi) = i\omega\Delta t/2$; i.e.,

$$\xi = \frac{1 - i\omega\Delta t/2}{1 + i\omega\Delta t/2} = \frac{(1 - \omega^2\Delta t^2/4) - i\omega\Delta t}{1 + \omega^2\Delta t^2/4} \equiv x - iy = |\xi|e^{-i\varphi}, \tag{2.7-166}$$

where, recall, $\omega$ is 'known.' Recalling, and repeating, the analysis leading to (2.7-131) yields the phase speed a la TR:

$$u_P = \frac{\varphi}{k\Delta t} = \frac{u}{c\theta}\tan^{-1}\frac{\omega\Delta t}{1 - \omega^2\Delta t^2/4} = \frac{2u}{c\theta}\tan^{-1}\frac{\omega\Delta t}{2}, \tag{2.7-167}$$

the last coming from the 'trig identity,' $2\tan^{-1}(\theta/2) = \tan^{-1}\left(\theta/(1 - \theta^2/4)\right)$. But from (2.7-163), $\omega = kP$ and thus $\omega\Delta t = \Delta t\,kP = (u\Delta t/h)(kh)(P/u) = c\theta P/u$ to give

$$u_P = \frac{2u}{c\theta}\tan^{-1}\left(\frac{c\theta}{2}\,\frac{P}{u}\right), \tag{2.7-168}$$

the final phase speed result for TR. Given $P(\theta)/u$, the relative phase speed, from the ODE solution, (2.7-168) shows how TR approximates it as a function of $\theta$ (and, of course, c); clearly $u_P \to P$ for $c \to 0$.

To now extend the analysis to group 'velocity,' we first return to (2.7-131), this time in the form

$$T_j^{(n)} = |\xi|^n v_j e^{-in\varphi} = |\xi|^n v_j e^{-i\tilde{\omega}n\Delta t}, \tag{2.7-169}$$

giving $\tilde{\omega} = \varphi/\Delta t$ and thus

$$u_G = \frac{d\tilde{\omega}}{dk} = \frac{1}{\Delta t}\frac{d\varphi}{dk}. \tag{2.7-170}$$

But we have $\varphi = \varphi(\omega)$ given so that, using also (2.7-164),

$$u_G = \frac{1}{\Delta t}\frac{d\varphi}{d\omega}\frac{d\omega}{dk} = \frac{G}{\Delta t}\frac{d\varphi}{d\omega} \tag{2.7-171}$$

as the final equation for the group speed for the general ODE method selected; i.e., we recall that both $G(\theta)$ and $\varphi(\omega)$ are 'given' from the spatial discretization selected (and its eigenproblem results). For TR, $\varphi = 2\tan^{-1}(\omega\Delta t/2)$ to give $d\varphi/d\omega = \Delta t/(1 + \omega^2\Delta t^2/4)$, and we thus obtain the group speed a la TR as

$$u_G = \frac{G}{1 + \omega^2\Delta t^2/4} \tag{2.7-172}$$

and we are done, with our specific example. Other ODE methods will give other results in (2.7-165) et seq., the 'general' results being $u_P = \varphi/k\Delta t$ and $u_G = (G/\Delta t)(d\varphi/d\omega)$. We conclude by presenting a summary table (Table 2.7-4) of these and other results, leaving both details and other ODE methods to others. So now we are really finished, with apologies for not showing some of these results in pictures. Time just ran out.

Table 2.7-4 Fully discrete results in terms of ODE results [$\omega\Delta t = c\theta P/u$ in all cases].

ODE method | $u_P$ (1) | $u_G$ (2)
FE (3) and BE | $\frac{u}{c\theta}\tan^{-1}(\omega\Delta t)$ | $\frac{G}{1 + (\omega\Delta t)^2}$
LF2 | $\frac{u}{c\theta}\sin^{-1}(\omega\Delta t)$ | $\frac{G}{\sqrt{1 - (\omega\Delta t)^2}}$
RK2 (3) | $\frac{u}{c\theta}\tan^{-1}\frac{\omega\Delta t}{1 - (\omega\Delta t)^2/2}$ | $\frac{[1 + (\omega\Delta t)^2/2]\,G}{1 + (\omega\Delta t)^4/4}$
TR | $\frac{2u}{c\theta}\tan^{-1}\left(\frac{\omega\Delta t}{2}\right)$ | $\frac{G}{1 + (\omega\Delta t/2)^2}$
RK4 | $\frac{u}{c\theta}\tan^{-1}\frac{\omega\Delta t\,[1 - \frac{1}{6}(\omega\Delta t)^2]}{1 - \frac{1}{2}(\omega\Delta t)^2 + \frac{1}{24}(\omega\Delta t)^4}$ | $\frac{[1 - (\omega\Delta t)^4/24 + (\omega\Delta t)^6/144]\,G}{1 - (\omega\Delta t)^6/72 + (\omega\Delta t)^8/576}$

(1) For linear FEM, P/u is given by (2.6-46) and for quads by (2.6-50).
(2) For linear FEM, G/u is given by (2.6-48) and for quads by (2.6-52).
(3) Unstable; presented for completeness. A 'small' amount of diffusion could stabilize the methods without much affecting the results in the table.

2.7.7 Other (Different) Methods Used by Others

In this final section on methods we merely perform (probably poorly) a sort of 'duty' by first noting that there are many other 'interesting' time-integration methods used in CFD (but not by us), and then by pointing out some of them, briefly. While we cannot fully embrace these methods for various reasons (not the least of which is that we have not tested them; who has the time and personnel? Would that we did!), we would be remiss if we neglected even to mention at least what seem to be the best (or most popular) of them. Perhaps we will even find time to implement the 'best' of them. Some (indeed perhaps most) of the methods have been derived specifically for the advection-diffusion (AD) equation or even just the pure advection equation, and are thus custom/specialized methods.
Others were derived for solving the NS equations of the next chapter but can also be applied to AD. A nearly common feature of all of them is that they eschew GFEM (rightly or wrongly) for advection-dominated flows, thus displaying somewhat 'religious' underpinnings (as do we, of course). We are, however, ready to admit that there may be one or more 'winners' among them. Probably only time (lots of it, most likely) will tell. Before describing a few particular methods, we point out that the reader interested in learning about the wide choice of schemes (FEM and not) available before 1986 or so will find the paper by Rood (1987) useful and interesting as it attempts to summarize the many methods (with many more references) applied to advection and advection-diffusion. Interesting quotations: 'Originally, an attempt was made to review the literature of plasma
physics, meteorology, oceanography, computational physics, applied mathematics, and air pollution research. Well over 100 schemes were found, and during the research for this article, at least ten new algorithms were introduced, and at least four new comparison studies were published.' Another interesting pair of 'comparison' papers are those by Chock and Dunker (1983) and Chock (1985). Another survey paper is that by Thompson (1984). For a more recent survey of many of these 'other' schemes and some not discussed herein, see Donea and Quartapelle (1992). Finally, for a new report on the 'status' of some AD methods, see Baptista et al. (1995).

a. Methods based on trajectories/characteristics

Most of the better schemes are based in one way or another on the method of characteristics (for hyperbolic equations). Two useful quotations in this regard are: 'Any method for approximating hyperbolic equations sacrifices a good deal if it takes no account of the method of characteristics' (Morton, 1982); and 'Numerical schemes that follow characteristics backwards in time and then interpolate at their feet have a history stretching back to the very early days of computational fluid dynamics (Courant et al., 1952)' (Priestley, 1993). That this method is good (read also, difficult) is perhaps best demonstrated by realizing/recognizing the tremendous number of 'people-years' in a wide variety of disciplines that have gone into its development, which development is not yet over, at least in some disciplines. The number of names given to virtually the same method is also somewhat remarkable. Before attempting to describe the method(s), we mention the principal reasons that it is worthy of such pursuit, some of which may only be realized in 'special' cases: it is very accurate (sometimes more accurate for large timesteps than small ones!)
because (in part, at least) the constant in the error estimate is much smaller when the equations are solved along the characteristics than otherwise (see, for example, Russell, 1990); it is unconditionally stable (when done 'correctly'); and it involves linear systems of equations with only SPD matrices. On the other hand, it is not perfect: it introduces numerical diffusion and dispersion (both small, usually), and it only sometimes conserves 'mass.' On balance, though, those who use it believe that the advantages win big!

The method is based on the BMOC (backward method of characteristics), which is only one of its many names (perhaps the best one), and comes about as follows: given that we have the solution at the current time, $T(x, t_n)$, we wish to find it at the next (or a later) time level $t_{n+1} = t_n + \Delta t$. This is done as follows: at time $t_{n+1}$, select a point of interest, say $x_j$, which is naturally taken as a node point on the mesh and, looking backwards along the trajectory (characteristic), ask the following question to an imaginary 'fluid particle' at this point, $T(x_j, t_{n+1})$: 'Where were you at $t_n$?' In the case of pure advection, the reason we ask this question is quite simple: the value of $T(x_j, t_{n+1})$ is exactly the value of T at the point and time in question. So the advection process becomes, appropriately and simply (again, simple in words), finding the trajectory of each 'particle' (fictitious Lagrangian moving point) of the mesh, there being one particle for each node point, for each timestep; a new set of particles, one for each node (or integration point in some algorithms), is employed for each timestep. In the more general case, with diffusion and sources present, these processes must/should be accounted for during the transit of the 'particle', a complicating feature to be sure, but not insurmountable.
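The backward-looking idea can be illustrated in a few lines. The sketch below is not the Galerkin form of the method; it is a deliberately simple stand-in that backtracks each node along its characteristic and interpolates linearly at the foot, assuming constant velocity on a uniform periodic 1D grid. The function name and test profile are hypothetical.

```python
import math


def bmoc_step(T, c):
    """One backward-characteristics step; c = u*dt/h may have any size or sign."""
    n = len(T)
    shift = math.floor(c)      # whole cells travelled in one step
    frac = c - shift           # leftover fraction, 0 <= frac < 1
    new = []
    for j in range(n):
        # The foot x_j - c*h lies between nodes (j - shift - 1) and (j - shift).
        right = (j - shift) % n
        left = (j - shift - 1) % n
        new.append((1.0 - frac) * T[right] + frac * T[left])
    return new


# A Gaussian bump advected with a Courant number well above 1.
T0 = [math.exp(-0.5 * ((j - 8) / 2.0) ** 2) for j in range(32)]
T = T0
for _ in range(50):
    T = bmoc_step(T, 3.7)
print("max before: %.6f, max after 50 steps: %.6f" % (max(T0), max(T)))
print("sum before: %.6f, sum after 50 steps: %.6f" % (sum(T0), sum(T)))
```

Because each new value is a convex combination of two old ones, the result stays bounded for any c, which is the unconditional stability in its simplest guise; the price, visible in the decaying maximum, is the numerical diffusion of linear interpolation. An integer Courant number lands each foot on a node and the translation is then exact.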
Remark: It is a consequence of looking backwards along characteristics that leads to the unconditional stability of the method. In contrast, many early characteristic-based methods looked forward and thus encountered the typical CFL stability limit. See Staniforth and Côté (1991) for references to some of these earlier stability-limited methods; but in particular we cite Tremback et al. (1987) as a good recent example of 'forward-looking' characteristics methods, and Smolarkiewicz and Rasch (1991) as a recent example of how to convert all of the Tremback et al. methods to 'backward-looking' characteristic methods.

Since we, regrettably, do not have personal experience with this class of methods, our coverage of it will be both brief and non-authoritarian; we will, however, steer the interested reader toward the bulk of the vast literature on the subject. Regarding actual details, all that we shall do is present the simplest possible case (pure advection in 1D on a uniform grid with constant velocity and linear basis functions) in sufficient detail that the gist of the method will hopefully be realized. This presentation will be enough to convince the reader that although the concept is simple, the realization is not. (Computational advection is not easy, no matter how you look at it.)

To see this method in its simplest form, for expository purposes, consider the constant-velocity pure advection equation in 1D, $T_t + uT_x = 0$, written more usefully here as $DT/Dt = 0$, with known exact solution

$$T(x, t) = T_0(x - ut), \tag{2.7-173}$$

where $T_0(x)$ is the IC, stating simply that T does not change when followed on a characteristic curve, a trajectory; here $x = ut + x_0$ for all $x_0$. In pictures, we have the situation shown in Figure 2.7-32.
The simplest finite difference approximation to $DT/Dt = 0$ is
$$\frac{T(x,t_{n+1}) - T(x - u\Delta t, t_n)}{\Delta t} = 0, \tag{2.7-174}$$
or
$$T(x,t_{n+1}) = T(x - u\Delta t, t_n) = T^*(x); \tag{2.7-175}$$
the discrete-time, continuous-space approximate solution is, in this semi-trivial case, also the exact PDE solution. But we must discretize in space, too, and this is where the fun begins. Since we want a finite element approximate solution, we begin by expressing (2.7-175) weakly (Galerkin form):
$$\int \varphi_i\,T(x,t_{n+1}) = \int \varphi_i\,T^*(x) \quad \forall i. \tag{2.7-176}$$
Fig. 2.7-32 Pure advection: $T(x,t) = T_0(x - ut)$.
Next, express $T(x,t_{n+1})$ in the conventional GFEM manner, $T(x,t_{n+1}) \to T^h(x,t_{n+1}) = \sum_{j=1}^{N} T_j^{n+1}\varphi_j(x)$, to arrive at
$$\sum_{j=1}^{N} T_j^{n+1}\int\varphi_i\varphi_j = \int\varphi_i\,T^*(x) \quad \forall i, \tag{2.7-177}$$
and we recognize the familiar mass matrix on the LHS and thus realize that our solutions represent an $L_2$-projection (see Appendix 3) of $T^*(x)$, a best fit to $T^*$ via the FEM basis functions. We also see, by setting $\varphi_i = 1$ (equivalently, by summing (2.7-177) over $i$, since $\sum_i\varphi_i = 1$), that the method, in theory, displays conservation of $T$: $\sum_j T_j^{n+1}\int\varphi_j = \int T^*$. When the integrations are replaced by imperfect quadratures, however, conservation is also imperfect.
So far, so good, and seemingly very simple. The 'problem' is the quadrature on the RHS; $T^*(x)$ is not a 'nice' function, representing (approximately) as it does $T(x - u\Delta t, t_n)$, the solution at an earlier time translated along $x$ from $x - u\Delta t$ to $x$. To make further progress, we simplify (2.7-177) to the case with linear basis functions (the most common case by far in the literature) to obtain
$$\frac{h}{6}\left(T_{i-1}^{n+1} + 4T_i^{n+1} + T_{i+1}^{n+1}\right) = \int\varphi_i\,T^*(x) \quad \forall i. \tag{2.7-178}$$
Next, we attempt to represent $T^*(x) = T^h(x - u\Delta t, t_n)$ via the same functions, $T^*(x) = \sum_j T_j^n\varphi_j(x - u\Delta t)$, to obtain
$$\frac{h}{6}\left(T_{i-1}^{n+1} + 4T_i^{n+1} + T_{i+1}^{n+1}\right) = \sum_j T_j^n\int\varphi_i(x)\,\varphi_j(x - u\Delta t) \quad \forall i. \tag{2.7-179}$$
While (2.7-179) is actually in the final form needed to 'write code,' the whole process may be better appreciated/comprehended via another sketch; see Figure 2.7-33. The solid curve labeled $T^h(x,t_n)$ is the known solution at time $t_n$, and the dashed curve labeled $T^*(x)$ is its translate via $u\Delta t$ and is in fact the exact solution at time $t_{n+1}$. But we cannot exactly represent this piecewise-linear function via our chosen basis functions (because our nodes are not in the right places), so we must project it 'down,' giving the function shown with small dashes and labeled $\varphi_i(x)T^*(x)$.
Fig. 2.7-33 Characteristic GFEM for pure advection; linear elements.
This is, for node $i$ in the sketch, the
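function whose integral forms the RHS of (2.7-178). It is instructive to show the step-by-step construction of this RHS because it will show that the method will surely not be easy to implement in the general (multi-dimensional, isoparametric) case. Placing the origin at node $i$ gives (for $0 \le u\Delta t \le h$)
$$\begin{aligned}
\text{RHS} &= \sum_j T_j^n\int\varphi_i(x)\varphi_j(x - u\Delta t) = T_{i-2}^n\int\varphi_i(x)\varphi_{i-2}(x - u\Delta t) + \cdots + T_{i+1}^n\int\varphi_i(x)\varphi_{i+1}(x - u\Delta t)\\
&= T_{i-2}^n\int_{-h}^{-h+u\Delta t}\left(1+\frac{x}{h}\right)\left(-1-\frac{x-u\Delta t}{h}\right)\\
&\quad + T_{i-1}^n\left[\int_{-h}^{-h+u\Delta t}\left(1+\frac{x}{h}\right)\left(2+\frac{x-u\Delta t}{h}\right) + \int_{-h+u\Delta t}^{0}\left(1+\frac{x}{h}\right)\frac{u\Delta t-x}{h} + \int_{0}^{u\Delta t}\left(1-\frac{x}{h}\right)\frac{u\Delta t-x}{h}\right]\\
&\quad + T_{i}^n\left[\int_{-h+u\Delta t}^{0}\left(1+\frac{x}{h}\right)\left(1+\frac{x-u\Delta t}{h}\right) + \int_{0}^{u\Delta t}\left(1-\frac{x}{h}\right)\left(1+\frac{x-u\Delta t}{h}\right) + \int_{u\Delta t}^{h}\left(1-\frac{x}{h}\right)\left(1-\frac{x-u\Delta t}{h}\right)\right]\\
&\quad + T_{i+1}^n\int_{u\Delta t}^{h}\left(1-\frac{x}{h}\right)\frac{x-u\Delta t}{h}.
\end{aligned}$$
Letting $z = x/h$ and $c = u\Delta t/h$ gives
$$\begin{aligned}
\text{RHS} &= h\Big\{T_{i-2}^n\int_{-1}^{-1+c}(1+z)(-1-z+c) + T_{i-1}^n\Big[\int_{-1}^{-1+c}(1+z)(2-c+z) + \int_{-1+c}^{0}(1+z)(c-z) + \int_{0}^{c}(1-z)(c-z)\Big]\\
&\qquad + T_{i}^n\Big[\int_{-1+c}^{0}(1+z)(1-c+z) + \int_{0}^{c}(1-z)(1-c+z) + \int_{c}^{1}(1-z)(1+c-z)\Big] + T_{i+1}^n\int_{c}^{1}(1-z)(z-c)\Big\}\\
&= \frac{h}{6}\Big\{c^3T_{i-2}^n + \big[c^2(3-c) + (1-c)(c^2+4c+1) + c^2(3-c)\big]T_{i-1}^n\\
&\qquad + \big[(1-c)(2-c-c^2) + c(6-6c+c^2) + (1-c)(2-c-c^2)\big]T_i^n + (1-c)^3T_{i+1}^n\Big\},
\end{aligned}$$
which is the RHS of (2.7-179); the bracketed coefficients simplify to $1+3c+3c^2-3c^3$ and $4-6c^2+3c^3$, respectively. Inserting it into (2.7-179) and rearranging yields a recognizable form of the 'characteristic-Galerkin' equation for node $j$:
$$\frac{(T_{j-1}^{n+1}-T_{j-1}^{n}) + 4(T_{j}^{n+1}-T_{j}^{n}) + (T_{j+1}^{n+1}-T_{j+1}^{n})}{6} + \frac{c}{2}\left(T_{j+1}^n - T_{j-1}^n\right) = \frac{c^2}{2}\left(T_{j-1}^n - 2T_j^n + T_{j+1}^n\right) + \frac{c^3}{6}\left(T_{j-2}^n - 3T_{j-1}^n + 3T_j^n - T_{j+1}^n\right), \tag{2.7-180}$$
which induces the following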
Remarks:
(1) Except for the $c^3$ (dispersive) term, it looks like forward Euler + BTD; see Section 2.7.2e.
(2) It was previously presented in Hasbani et al. (1983), as their equation (15); they also briefly study the errors induced by approximate (Gauss-Legendre) integration. See also Celia et al. (1990), who obtained this result as a special case of their ELLAM method.
(3) It possesses the 'unit CFL' property; if $c = 1$, then $T_j^{n+1} = T_{j-1}^n$, at least for periodic BC's.
(4) It is unconditionally stable (when done properly for $c > 1$), a result we prove below.
(5) It is easy to see that, for $h \to 0$, the equation approximates
$$\frac{(T_{j-1}^{n+1}-T_{j-1}^{n}) + 4(T_{j}^{n+1}-T_{j}^{n}) + (T_{j+1}^{n+1}-T_{j+1}^{n})}{6\Delta t} + uT_x\big|_j = \frac{u^2\Delta t}{2}T_{xx}\big|_j - \frac{u^3\Delta t^2}{6}T_{xxx}\big|_j.$$
(6) The name given to the above result (but usually not to the procedure employed to obtain it; see below) in the meteorological literature is 'semi-Lagrangian', stemming from the Lagrangian form of the time derivative and from the use of a new set of (imaginary) particles for each timestep (see, for example, Staniforth and Côté, 1991).
Before studying the accuracy and stability of this apparently dissipative and dispersive scheme, we point out how this characteristic Galerkin method is to be used if $c > 1$:
1. Set $c - I = \tilde c$, where $I$ is an integer and $0 \le \tilde c \le 1$; i.e., $I$ counts the number of nodes (elements) skipped over when looking backwards along the characteristic to get to $x - u\Delta t$ when $u\Delta t > \Delta x$.
2. Subtract $I$ from all indices of the time-level-$n$ values in (2.7-180); $T_j^n \to T_{j-I}^n$.
3. Replace $c$ by $\tilde c$ in (2.7-180).
More Remarks:
(1) It will turn out, for the simple case of constant velocity at least, that the method is actually more accurate for large $c$ than small $c$!
(2) The variable grid case is (for constant velocity in 1D) a simple extension of the above procedure, an exercise we leave to the reader.
(3) The multi-D case is not a simple extension, especially on unstructured meshes.
(4) For an example with both variable velocity and variable mesh, see Roache (1992b).
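The three-step recipe for $c > 1$ can be sketched as follows for a periodic, uniform grid with constant $u > 0$. This is a minimal illustration rather than production code: the function name `char_galerkin_step` is hypothetical, the mass matrix is formed densely for clarity (a cyclic tridiagonal solver would normally be used), and the four RHS weights are the coefficients of $T_{i-2}, \ldots, T_{i+1}$ in the RHS of (2.7-179) with $c$ replaced by its fractional part.

```python
import numpy as np

def char_galerkin_step(T, c):
    """One characteristic-Galerkin step, linear elements, periodic BCs.

    T : nodal values at t_n;  c : Courant number u*dt/h (assumed >= 0).
    """
    n = T.size
    I = int(np.floor(c))          # whole elements skipped along the characteristic
    ct = c - I                    # fractional part, 0 <= ct < 1
    # Weights (times 6) of T_{i-2-I}, T_{i-1-I}, T_{i-I}, T_{i+1-I} on the RHS
    w = np.array([ct**3,
                  1 + 3*ct + 3*ct**2 - 3*ct**3,
                  4 - 6*ct**2 + 3*ct**3,
                  (1 - ct)**3]) / 6.0
    # np.roll(T, s)[i] == T[i - s], so shift I + 2 - k picks T_{i-2-I+k}
    rhs = sum(wk * np.roll(T, I + 2 - k) for k, wk in enumerate(w))
    # Consistent mass matrix (divided by h): periodic tridiag(1, 4, 1)/6
    E = np.eye(n)
    M = (4*E + np.roll(E, 1, axis=1) + np.roll(E, -1, axis=1)) / 6.0
    return np.linalg.solve(M, rhs)
```

With `c = 1.0` the step returns `np.roll(T, 1)` (the 'unit CFL' property), and because both the weights and each row of the mass matrix sum to one, the nodal sum of `T` is preserved for any `c`.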
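To examine accuracy and stability in the standard way (Fourier analysis à la von Neumann), we seek a solution to the general ($c > 1$) equation [$T_j^n \to T_{j-I}^n$ and $c \to \tilde c$ in (2.7-180)] via $T_j^n = \xi^n e^{ij\theta}$ to get
$$\left(\xi e^{iI\theta} - 1\right)\frac{2+\cos\theta}{3} + i\tilde c\sin\theta = -\tilde c^2(1-\cos\theta) + \tilde c^3(3 - 4\cos\theta + \cos 2\theta)/6 + i\tilde c^3(2\sin\theta - \sin 2\theta)/6,$$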
or
$$\xi = e^{-iI\theta}\left\{1 - \frac{3\tilde c}{2+\cos\theta}\left[\tilde c(1-\cos\theta) - \frac{\tilde c^2}{6}(3 - 4\cos\theta + \cos 2\theta) + i\left(\sin\theta - \frac{\tilde c^2}{6}(2\sin\theta - \sin 2\theta)\right)\right]\right\}, \tag{2.7-181}$$
wherein the unimodular factor $e^{-iI\theta}$ accounts for the shift (jump) over $I$ elements ($I = 0$ describes the case analyzed in detail above). Using now $\xi = (x + iy)e^{-iI\theta}$ and thus $T_j^n = \left(\sqrt{x^2+y^2}\right)^n e^{ik[jh + (t_n/k\Delta t)\tan^{-1}(y/x)]}\,e^{-it_nI\theta/\Delta t} = |\xi|^n e^{ik(x_j - u_pt_n)}$ gives
$$\frac{u_p}{u} = \frac{I\theta}{uk\Delta t} - \frac{1}{uk\Delta t}\tan^{-1}\frac{y}{x} = \frac{I}{I+\tilde c} - \frac{1}{(I+\tilde c)\theta}\tan^{-1}\left\{\frac{-\tilde c\sin\theta + \tilde c^3(2\sin\theta - \sin 2\theta)/6}{\dfrac{2+\cos\theta}{3} - \tilde c\left[\tilde c(1-\cos\theta) - \tilde c^2(3 - 4\cos\theta + \cos 2\theta)/6\right]}\right\} \tag{2.7-182}$$
as the numerical phase speed. Note that $I/(I+\tilde c) = 1 - \tilde c/c$ and $I + \tilde c = c$; thus, $c \to \infty \Rightarrow u_p/u \to 1$: perfect advection via 'translation.' This is a statement of the fact that all of the error in this method is caused by the 'fractional part' of the Courant number; if $\tilde c = 0$, then there is no phase error. This beautiful behavior is, of course, related to the use of a constant velocity on a uniform mesh in 1D.
In Figure 2.7-34 we show some amplitude and phase speed results. We observe that $|\xi| \le 1$ for all $\tilde c$, all $I$, and all $\theta$ (for $\tilde c \le 1$, which is true by construction/definition).
Additional Remarks:
(1) Perfect results are obtained for $\tilde c = 1$ and any $I$.
(2) Larger $I$ gives more accurate results.
(3) One of the few criticisms leveled against the BMOC is shown in the $|\xi|$ plot: it is dissipative, although in a useful way, damping mostly the short waves. Finally, the symmetry about $\tilde c = 1/2$ is interesting.
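The amplification factor (2.7-181) and the relative phase speed (2.7-182) are easy to tabulate numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and `np.angle` (a four-quadrant arctangent) is used in place of $\tan^{-1}(y/x)$ so that the phase stays on a consistent branch when $x < 0$.

```python
import numpy as np

def xi_char_galerkin(theta, c):
    """Amplification factor of the characteristic-Galerkin scheme, eq. (2.7-181)."""
    I = np.floor(c)
    ct = c - I                                   # fractional Courant number
    a = ct*(1 - np.cos(theta)) - ct**2*(3 - 4*np.cos(theta) + np.cos(2*theta))/6
    b = np.sin(theta) - ct**2*(2*np.sin(theta) - np.sin(2*theta))/6
    return np.exp(-1j*I*theta) * (1 - 3*ct/(2 + np.cos(theta))*(a + 1j*b))

def relative_phase_speed(theta, c):
    """u_p/u from eq. (2.7-182); theta = k*h must be nonzero."""
    I = np.floor(c)
    xy = xi_char_galerkin(theta, c) * np.exp(1j*I*theta)   # x + i*y
    return (I*theta - np.angle(xy)) / (c*theta)
```

Evaluating `abs(xi_char_galerkin(theta, c))` over $0 < \theta \le \pi$ for a few Courant numbers shows the damping of the short waves noted in Remark (3), and swapping a fractional part $\tilde c$ for $1 - \tilde c$ leaves the amplitude unchanged, which is the symmetry about $\tilde c = 1/2$.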
Having shown that this BMOC (backward method of characteristics, not Big Man On Campus) has some really good qualities, we now show how that cumbersome RHS integral can be alternately (and exactly, still) evaluated using cubic spline interpolation; i.e., cubic spline interpolation of both sides of (2.7-175) is equivalent to the projection, using linear basis functions, derived above. This result was discovered by Bermejo (1990, 1991) and is perhaps not too surprising when it is noted that when $\varphi_i(x)$ is linear, $\varphi_i(x)\varphi_j(x)$ is quadratic, and its integral is cubic. A convenient, but not necessary, cubic spline derivation of (2.7-180) is through the use of B-splines, which form a local basis in the linear space of cubic splines (B = Basis). First we recall the key property of a cubic spline, the smoothness property: it has $C^2$
continuity; the function and its first two derivatives are continuous. For further spline theory, see, for example, Ahlberg et al. (1967), De Boor (1978), and Schumaker (1981); for a brief discussion of cubic splines and finite elements, see Strang and Fix (1973).
Fig. 2.7-34 Relative phase speed and amplitude for pure advection, plotted against $\theta$: (a) phase speed, $I = 0$; (b) phase speed, $I = 1$; (c) phase speed, $\tilde c = 0.3$; (d) amplitude ($I = 0$).
The B-spline we shall use is equivalent to that on p. 136 of Schumaker (1981) and is shown in Figure 2.7-35 ($z = x/h$); it is defined piecewise over four elements:
$$\Phi_0(z) = \begin{cases}
B_1(z) = (8 + 12z + 6z^2 + z^3)/6 & \text{for } -2 \le z \le -1 \quad (2.7\text{-}183)\\
B_2(z) = (4 - 6z^2 - 3z^3)/6 & \text{for } -1 \le z \le 0 \quad\;\;\, (2.7\text{-}184)\\
B_3(z) = (4 - 6z^2 + 3z^3)/6 & \text{for } 0 \le z \le 1 \quad\;\;\;\;\, (2.7\text{-}185)\\
B_4(z) = (8 - 12z + 6z^2 - z^3)/6 & \text{for } 1 \le z \le 2. \quad\;\;\;\, (2.7\text{-}186)
\end{cases}$$
We can now state and prove our (1D) version of Bermejo's (2D)
Equivalence Theorem. The solution of (2.7-179), i.e., the characteristic Galerkin solution of (2.7-176) via linear basis functions, is equivalent to each of:
1. Put a cubic spline through all node point values of $T^n$ at $t_n$ and, for each node $i$, evaluate the result via cubic spline interpolation at the point $u\Delta t$ upstream of node $i$ to give $T_i^{n+1}$. Note that the first step involves the solution of a tridiagonal system whose matrix is equivalent to, if not identical to, the FEM mass matrix. For proof, see Bermejo (1990, 1991).
2. At time $t_{n+1}$, for each node $i$, form the sum $\sum_j T_j^{n+1}\Phi_j(x_i)$ from the (unknown) nodal values and the B-spline basis functions. This gives $\frac{1}{6}\left(T_{i-1}^{n+1} + 4T_i^{n+1} + T_{i+1}^{n+1}\right)$ and corresponds to the LHS of (2.7-179). Also, for each node $i$, form the RHS sum $\sum_j T_j^n\Phi_i(x_j + u\Delta t)$, in which $\Phi_i(x + u\Delta t)$ corresponds to a leftward (upstream) shift of the B-spline $\Phi_i(x)$ by a distance $u\Delta t$. The result is the RHS of (2.7-179). Solve the resulting linear system.
Fig. 2.7-35 A B-spline.
Fig. 2.7-36 The dashed curve on the left represents the cubic fit through $T(x,t_n)$, x represents a B-spline node, and $u = ch/\Delta t$.
Proof of (2). First we present in Figure 2.7-36 another helpful sketch, in which $I$ is (as before) the integer portion of the number of elements traversed in $\Delta t$. The dashed curve on the left represents the cubic fit through $T(x,t_n)$. The LHS follows easily (instantly) from the right sketch; the left sketch yields the RHS as
$$T_{i-2-I}^n\Phi_i(x_{i-2-I} + u\Delta t) + T_{i-1-I}^n\Phi_i(x_{i-1-I} + u\Delta t) + T_{i-I}^n\Phi_i(x_{i-I} + u\Delta t) + T_{i+1-I}^n\Phi_i(x_{i+1-I} + u\Delta t),$$
which, using (2.7-183) through (2.7-186) and $c = \tilde c + I$, can be rewritten as
$$T_{i-2-I}^nB_1(-2+\tilde c) + T_{i-1-I}^nB_2(-1+\tilde c) + T_{i-I}^nB_3(\tilde c) + T_{i+1-I}^nB_4(1+\tilde c) = \frac{1}{6}\left[T_{i-2-I}^n\tilde c^3 + T_{i-1-I}^n(1 + 3\tilde c + 3\tilde c^2 - 3\tilde c^3) + T_{i-I}^n(4 - 6\tilde c^2 + 3\tilde c^3) + T_{i+1-I}^n(1-\tilde c)^3\right],$$
which is easily seen to be the same as the $T^n$ terms in (2.7-180), after setting $I$ to 0 and $\tilde c$ to $c$ there. QED.
As a short digression, it is of interest, given the close relationship between the Galerkin projection and the cubic spline interpolation, to inquire whether we can generate a cubic
spline using the piecewise-linear basis functions. The answer is yes: suppose we are given a piecewise-linear function, $f(x) = \sum_i f_i\varphi_i$, and seek the best $L_2$ fit (projection) to $d^2f/dx^2$ via the same linear basis. Thus we seek $y = \sum_j y_j\varphi_j = f''$ in the weak form; i.e., we solve $\sum_j y_j\int\varphi_i\varphi_j = -\sum_k f_k\int\varphi_i'\varphi_k'$, or $My = -Kf$, for $y$, where $K$ is the linear basis function 'diffusion' matrix. After solving for the $\{y_j\}$, we focus on the element spanning $[x_i, x_{i+1}]$ and perform an integration of $y = f''$ to get $\int_0^x f'' = \sum_j y_j\int_0^x\varphi_j$, or $f'(x) - f_i' = y_i(x - x^2/2h) + y_{i+1}x^2/2h$, where we chose $x_i = 0$ for convenience; here $f_i'$ is to be regarded as unknown. One more integration, this time from 0 to $x_{i+1}$, yields $f_{i+1} - f_i = hf_i' + h^2(2y_i + y_{i+1})/6$, which is used to evaluate the unknown value of $f_i'$. Finally, returning to the $f'(x)$ equation and this time integrating between $x_i\,(= 0)$ and $x$ yields our cubic spline over the selected element:
$$f(x) = f_i + \left[\frac{f_{i+1} - f_i}{h} - \frac{h(2y_i + y_{i+1})}{6}\right]x + \frac{y_i}{2}\left(x^2 - \frac{x^3}{3h}\right) + \frac{y_{i+1}x^3}{6h}. \tag{2.7-187}$$
This cubic function displays $C^2$ continuity [because $f''(x) = y_i + (y_{i+1} - y_i)x/h$ is continuous because $y(x)$ is, by construction] and describes the cubic spline in terms of the known data, $\{f_j\}$ and $\{y_j\}$; it also agrees with equation (2.1.2) on p. 10 of Ahlberg et al. (1967) after the appropriate simplifications are made.
Remarks:
(1) The special case of constant grid spacing was taken solely for the purpose of simplifying the presentation. All equations in this section generalize easily to variable element lengths.
(2) If quadratic basis functions were used in the above, then it would ostensibly turn out that the equivalence between BMOC via Galerkin and spline interpolation would lead to quintic splines, since quadratic $\varphi_i$ yields quartic $\varphi_i\varphi_j$ whose integral is a quintic polynomial, etc.
End digression.
The next step in our advection adventure is to see what happens if we replace the $C^2$ cubic splines with $C^0$ cubic Lagrange polynomials (e.g., via cubic Lagrange FEM basis functions) to perform the interpolation step. The reason that we do this, since it clearly deviates from the Galerkin projection discussed thus far, is suggested by the following: 'Cubic Lagrange interpolation is particularly popular since it represents a good compromise between computational efficiency and numerical accuracy' (Bermejo and Staniforth, 1992). The efficiency arises because there are no linear algebraic equations to solve; the interpolation is local and direct and corresponds, in some sense, to lumping the mass in the characteristic Galerkin method. The Lagrange interpolation goes like so: simply use (2.7-175) as it stands, but expand the LHS into linear basis functions (as before) and interpolate the RHS from the ('displaced') piecewise-linear basis into the Lagrange cubic basis. Thus imagine that the cubic shown through the points on the left side of the last sketch is the $C^0$ Lagrange polynomial. Then, $T_i^{n+1} = T^n(x_i - u\Delta t) = T^*(x_i) = \sum_j T_j^n\psi_j(x_i - u\Delta t)$, where $\{\psi_j(x)\}$ are the Lagrange cubic basis functions (see, for example, Reddy, 1993). The result is
$$T_i^{n+1} = T_{i-2-I}^n\psi_{i-2-I}(x_i - ch) + T_{i-1-I}^n\psi_{i-1-I}(x_i - ch) + T_{i-I}^n\psi_{i-I}(x_i - ch) + T_{i+1-I}^n\psi_{i+1-I}(x_i - ch)$$
$$= T_{i-2-I}^n(\tilde c^3 - \tilde c)/6 + T_{i-1-I}^n(2\tilde c + \tilde c^2 - \tilde c^3)/2 + T_{i-I}^n(2 - \tilde c - 2\tilde c^2 + \tilde c^3)/2 + T_{i+1-I}^n(-2\tilde c + 3\tilde c^2 - \tilde c^3)/6,$$
which rearranges to
$$\frac{T_i^{n+1} - T_{i-I}^n}{\Delta t} + \frac{\tilde c}{6\Delta t}\left(2T_{i+1-I}^n + 3T_{i-I}^n - 6T_{i-1-I}^n + T_{i-2-I}^n\right) = \frac{\tilde c^2}{2\Delta t}\left(T_{i+1-I}^n - 2T_{i-I}^n + T_{i-1-I}^n\right) + \frac{\tilde c^3}{6\Delta t}\left(T_{i-2-I}^n - 3T_{i-1-I}^n + 3T_{i-I}^n - T_{i+1-I}^n\right), \tag{2.7-188}$$
which is to be compared with (2.7-180) after generalizing it via $T_j^n \to T_{j-I}^n\ \forall j$ and $c \to \tilde c$. It is clear, once we realize that $(2T_{i+1} + 3T_i - 6T_{i-1} + T_{i-2})/6h = T_x|_i + O(h^3)$ (the leading error term being $h^3T_{xxxx}/12$), that we have traded consistent mass plus 'second-order' advection (whose net result is, recall, fourth-order accurate) for lumped mass plus third-order, upwind-biased advection, more like a higher-order finite difference method. What is the net result? Let von Neumann tell us; $T_j^n = \xi^ne^{ij\theta} = |x + iy|^n e^{ik[jh + (t_n/k\Delta t)\tan^{-1}(y/x)]}$ gives $u_p = -(1/k\Delta t)\tan^{-1}(y/x)$ and (taking $I = 0$ and thus $\tilde c = c$ for simplicity, with no adverse effect)
$$\xi = 1 - c^2(1-\cos\theta) - \frac{c(1-c^2)}{6}(3 - 4\cos\theta + \cos 2\theta) - \frac{ic}{6}\left[8\sin\theta - \sin 2\theta - c^2(2\sin\theta - \sin 2\theta)\right]. \tag{2.7-189}$$
Figure 2.7-37 shows the amplitude and phase and is to be compared with the $I = 0$ curves of Figure 2.7-34, the cubic spline results. We see that they are remarkably similar, showing that the trade-off was probably a good one.
Fig. 2.7-37 Relative phase speed and amplitude for pure advection, plotted against $\theta$: (a) phase speed; (b) amplitude.
If the velocity is not constant, then the method is even more difficult to implement; here we summarize what is needed, and since it is just as easy to describe the method in
multidimensions as in 1D, we do so, following, for example, Pironneau (1982): to solve
$$\frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T = \frac{DT}{Dt} = f(\mathbf{x}, t), \tag{2.7-190}$$
where now $f$ describes source terms and diffusion and $\mathbf{u}(\mathbf{x}, t)$ is given, we need 'only' to define the trajectories and then integrate $f(\mathbf{x},t)$ along them. Each 'timestep' looks like so:
1. Select a point in the domain, $\mathbf{x}$, and a time, $t$, at which $T(\mathbf{x},t)$ is desired. Find $\mathbf{X}(\mathbf{x}, t; \tau)$, the location of a 'particle' at 'auxiliary time' $\tau$ that was located at $\mathbf{x}$ at time $t$, by integrating 'backwards' (from head to tail/foot) the trajectory ODE that defines the characteristic curves,
$$\frac{d\mathbf{X}}{d\tau} = \mathbf{u}(\mathbf{X}, \tau), \tag{2.7-191}$$
from $\tau = t$ to $\tau = t - \Delta t$ (some earlier time), where at $\tau = t$, $\mathbf{X}(\mathbf{x}, t; t) = \mathbf{x}$. Thus,
$$\mathbf{X}(\mathbf{x}, t; \tau) = \mathbf{x} - \int_\tau^t \mathbf{u}[\mathbf{X}(\mathbf{x}, t; s), s]\,ds. \tag{2.7-192}$$
Note that $\mathbf{x}$ and $t$ are simply parameters in the trajectory equation. In practice, $\mathbf{x}$ is the location of a node (or Gauss point in some methods) and $t = t_{n+1} = t_n + \Delta t$, thus giving $\mathbf{X}(\mathbf{x}, t_{n+1}; t_n)$ as the particle position one timestep back.
2. Integrate (2.7-190) forward (tail to head) along the trajectory:
$$T(\mathbf{x}, t) = T_0 + \int_{t-\Delta t}^{t} f[\mathbf{X}(\mathbf{x}, t; \tau), \tau]\,d\tau, \tag{2.7-193}$$
where $T_0 = T[\mathbf{X}(\mathbf{x}, t; t - \Delta t), t - \Delta t]$ is the temperature at the foot (tail) of the trajectory at the earlier time $t - \Delta t$. Thus, for pure advection with no source term, $T(\mathbf{x}, t) = T[\mathbf{X}(\mathbf{x}, t; t - \Delta t), t - \Delta t]$; the solution remains constant along a trajectory.
The methods used to approximate the above exact results are too many and too varied to report here; consult the references (more of which we will cite below and in the next chapter) and the references within references, for details. In fact, one of the reasons that we have not yet tried these methods is that, difficult as they are for 2D problems on simple meshes with rectangular elements, they must be nearly impossible to implement on the general, distorted isoparametric element meshes in 3D. Or, at least this is our perception of the state of the art.
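The two steps above can be sketched for the pure advection case as follows. This is a hedged illustration, not any of the cited authors' algorithms: the names `foot_of_trajectory` and `advect_pure` are hypothetical, classical fourth-order Runge-Kutta with a few substeps is assumed for the backward integration of (2.7-191), and the previous solution is assumed to be available as a callable `T_prev` (in practice it would be an interpolant or projection of nodal data).

```python
import numpy as np

def foot_of_trajectory(x, t, dt, u, nsub=4):
    """Integrate dX/dtau = u(X, tau) backwards from tau = t to tau = t - dt.

    x : array of departure... arrival points (nodes) at time t
    u : callable u(X, tau) returning the velocity at positions X (assumed known)
    """
    X = np.array(x, dtype=float)
    tau = t
    h = -dt / nsub                       # negative step: head to foot
    for _ in range(nsub):
        k1 = u(X, tau)
        k2 = u(X + 0.5*h*k1, tau + 0.5*h)
        k3 = u(X + 0.5*h*k2, tau + 0.5*h)
        k4 = u(X + h*k3, tau + h)
        X = X + h*(k1 + 2*k2 + 2*k3 + k4)/6
        tau += h
    return X                             # approximates X(x, t; t - dt)

def advect_pure(T_prev, x, t, dt, u):
    """Pure advection, no source: T(x, t) = T_prev evaluated at the foot."""
    return T_prev(foot_of_trajectory(x, t, dt, u))
```

For a constant velocity the foot is exactly `x - u*dt`; with a source term, $f$ would additionally be integrated along the stored trajectory as in (2.7-193), which this sketch omits.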
In fact, the number of person-years thus far expended on just 1D problems is somewhat staggering, judging by the literature. The last feature of the 1D problem we have been discussing that we wish to consider before moving on to 2D is that of numerical integration. It is a sad surprise that quadrature errors can convert the algorithm from unconditionally stable to conditionally stable (somewhat like explicit Eulerian methods, but displaying some bizarre/illogical results) and, in the worst of cases, to unconditionally unstable (!). The paper by Morton et al. (1988), on certain characteristic-based methods that are similar to those discussed above (and identical in some cases), presents some stability analyses. We report just a few cases from that paper:
1. If one-point quadrature is used on both sides of (2.7-177), then the resulting algorithm is unconditionally unstable.
2. If the exact mass matrix is used (LHS) with one-point quadrature on the RHS, then the resulting algorithm is unstable for the fractional part of the CFL number in the range $1/\sqrt{6} \approx 0.408$ to $1 - 1/\sqrt{6} \approx 0.592$; strange.
3. If lumped mass is used on the LHS and one-point quadrature on the RHS, then the method is unconditionally stable.
(Numerical advection is still full of surprises; it is not easy.)
Moving finally to 2D and 3D, we begin by listing some of the mostly-still-open problems, some of which have contributed to the delay in our becoming personally involved in these advanced techniques:
1. Loss of unconditional stability caused by inexact integration, already referred to in 1D, although the bicubic spline method, already referred to, is equivalent to exact integration and thus generally recommended (when applicable; e.g., for rectangles).
2. Little or no 3D progress, except for that at EDF (Electricite de France) employing tetrahedral elements with flat faces. See Section 3.16.9a in the next chapter.
3. Little or no complex geometry progress, except for that at EDF employing tetrahedral elements with flat faces. See Section 3.16.9a in the next chapter.
4. There are still unresolved boundary condition issues, especially for $c \gg 1$, at both inflow and outflow.
Rather than presenting any of the myriad of details for the multi-dimensional case, we shall content ourselves with providing a relevant cross-section of the literature. A large cross-section it is, with some (most) subgroups not knowing (or caring, probably; each thinks that it is the best and so ignores the rest) about very closely related efforts in another field. As we have already made clear, there is no subject like 'numerical advection' that gets so much attention in so many disciplines. Further citations will be presented in the next chapter, wherein the Navier-Stokes equations are the principal subject.
A brief (and assuredly incomplete) synopsis of several key groups of contributing researchers and their principal interest (the 'physics' side) is now attempted, and it concludes our discussion on 'characteristic advection':
1. The French: starting in about 1980, several French researchers, including O. Pironneau and M. Bercovier, investigated characteristic-based methods as an attractive alternative to upwinding. See Benque et al. (1980, 1982), Bardos et al. (1981), Pironneau (1982), Bercovier and Pironneau (1982), Bercovier et al. (1983), Hasbani et al. (1983), and Pironneau (1989). Their field, besides applied mathematics of course, is advection-diffusion and, principally, incompressible viscous flow.
2. The English: spearheaded by K.W. Morton and focusing on advection-diffusion, compressible flow, and (to a lesser extent) incompressible flow, this group of applied mathematicians has produced many papers, and many names for the methods, some of which we cite: under the name Euler Characteristic Galerkin Methods, see Morton and Stokes (1982), Morton (1982, 1983), and Childs and Morton (1990); under Characteristic Galerkin Methods, see Morton (1985); under Lagrange-Galerkin Method, see Morton et al. (1988) and Priestley (1994). For a good overall discussion of both of the latter two, see Morton (1990).
3. The groundwater flow simulators: here the key players are many and they often team up. The methods are somewhat different and so are some of the names of the methods. Under Characteristic Galerkin Methods, we cite Ewing and Russell (1981, 1983), Douglas and Russell (1982), Krishnamachari et al. (1989); under Modified Method of Characteristics are Ewing and Russell (1981, 1983), Douglas and Russell (1982), and Roache (1992a); under Eulerian-Lagrangian Localized Adjoint Methods (ELLAM), we have Russell (1990), Celia et al. (1990, 1993), Russell and Trujillo (1990), and Arbogast and Wheeler (1995); under the Backward Method of Characteristics is Baptista (1987). Finally, under 'particle methods' see Tompson and Gelhar (1990) and Schafer-Perini and Wilson (1991).
4. In spectral finite elements, spearheaded by A. Patera and applied to incompressible flow, are Ho et al. (1990) and Maday et al. (1990).
5. In magnetohydrodynamics, we cite the Ephemeral Particle-in-Cell (PIC) method of Eastwood (1987). See also Arter and Eastwood (1995) for particle methods in fluid flow.
6. The meteorologists: last but not least are those who worry about advection at least as much as any other group, the computational meteorologists/applied mathematicians. In both advection-diffusion and their versions of the NS equations (numerical weather prediction, general circulation models, and even climate simulation models), there has been a tremendous amount of work, mostly via finite difference methods and under the name of Semi-Lagrangian Methods. Fortunately, most of our work on citations has been done for us by one of the leading researchers in that field, A. Staniforth. Staniforth and Côté (1991) give a thorough review of the history of characteristic-based methods in meteorology.
Since that review, the following relevant papers have appeared. Bermejo and Staniforth (1992) discuss a method to convert 'conventional' semi-Lagrangian schemes to those which in addition suppress most wiggles. Their new scheme is inspired by the important finite difference paper by Zalesak (1979), in which a compound scheme is employed that utilizes Godunov's theorem relating wiggles to order of accuracy in order to generate a quasi-monotone scheme; it retains the higher order of accuracy in regions where the solution is smooth but reverts to first-order when necessary to suppress wiggles (simple upwinding in the Eulerian context, linear interpolation in the Lagrangian one). The demonstrated results are impressive. In Priestley (1993), it was shown how to recover conservation (of $\int T$) after adding the improvements of Bermejo and Staniforth to obtain a method that is monotone, inexpensive, and accurate. A new twist to these methods is espoused in a recent paper by Purser and Leslie (1994): a third-order method, 'equivalent' to AB3, is used with an efficient, forward trajectory, semi-Lagrangian method that is also stable 'in practice.' Finally, mention should be made of another paper that ostensibly generalizes/unifies the entire concept using the advanced mathematics associated with differential forms; see Smolarkiewicz and Pudykiewicz (1992). A final geophysical application is that of Malevsky (see Malevsky and Yuen, 1991, and Malevsky, 1993): apparently totally unaware of the efforts of the meteorologists, he rediscovered the cubic spline interpolation-characteristics method and even applied it in 3D (thermal convection at infinite Prandtl number).
b. Methods based on modified equations
Inspired by Lax and Wendroff (1960, 1962, 1964) and Leith (1965) in the finite difference community, Donea (1984) launched a new research direction, to be followed by quite
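a few others, by introducing a finite element version of these techniques, in which a modified equation results from a Taylor series expansion in time that is applied prior to spatial discretization via the FEM. It stands in contrast to the family of so-called Petrov-Galerkin methods, which always and intentionally introduce some artificial diffusion via special test functions, in that the Galerkin weak formulation is applied to the modified equation. The original publication on pure advection was quickly followed by one applied to advection-diffusion (Donea et al., 1984) and then further developed in Selmin et al. (1985) and Donea et al. (1987). We present the method for one case only: the trapezoid rule integration of the pure advection equation with constant (and divergence-free) velocity, $T_t + \mathbf{u}\cdot\nabla T = 0$:
1. Expand forward in time about $t_n$:
$$T^{n+1} = T^n + \Delta t\,T_t^n + \frac{\Delta t^2}{2}T_{tt}^n + \frac{\Delta t^3}{6}T_{ttt}^n + O(\Delta t^4). \tag{2.7-194}$$
2. Expand backward in time about $t_{n+1}$:
$$T^{n} = T^{n+1} - \Delta t\,T_t^{n+1} + \frac{\Delta t^2}{2}T_{tt}^{n+1} - \frac{\Delta t^3}{6}T_{ttt}^{n+1} + O(\Delta t^4). \tag{2.7-195}$$
3. Subtracting the second from the first and dividing the result by $2\Delta t$ yields, upon rearrangement,
$$\frac{T^{n+1} - T^n}{\Delta t} = \frac{1}{2}\left(T_t^n + T_t^{n+1}\right) + \frac{\Delta t}{4}\left(T_{tt}^n - T_{tt}^{n+1}\right) + \frac{\Delta t^2}{12}\left(T_{ttt}^n + T_{ttt}^{n+1}\right) + O(\Delta t^3). \tag{2.7-196}$$
4. Use the original PDE to obtain $T_{tt} = -\mathbf{u}\cdot\nabla T_t = (\mathbf{u}\cdot\nabla)^2T$ and $T_{ttt} = (\mathbf{u}\cdot\nabla)^2T_t$ to give
$$\frac{T^{n+1} - T^n}{\Delta t} + \mathbf{u}\cdot\nabla\left(\frac{T^n + T^{n+1}}{2}\right) = \frac{\Delta t}{4}(\mathbf{u}\cdot\nabla)^2\left(T^n - T^{n+1}\right) + \frac{\Delta t^2}{12}(\mathbf{u}\cdot\nabla)^2\left(T_t^n + T_t^{n+1}\right) + O(\Delta t^3). \tag{2.7-197}$$
5. Now note from (2.7-196) that, to within $O(\Delta t^3)$, $\Delta t^2(T_t^n + T_t^{n+1})/2$ in (2.7-197) can be replaced by $\Delta t(T^{n+1} - T^n)$ to give, dropping the $O(\Delta t^3)$ terms, the modified equation, which (by construction) is fourth-order accurate,
$$\frac{T^{n+1} - T^n}{\Delta t} + \mathbf{u}\cdot\nabla\left(\frac{T^n + T^{n+1}}{2}\right) = \frac{\Delta t}{12}(\mathbf{u}\cdot\nabla)^2\left(T^n - T^{n+1}\right)$$
(the two $(\mathbf{u}\cdot\nabla)^2$ contributions combine with net coefficient $\Delta t/4 - \Delta t/6 = \Delta t/12$), which immediately simplifies to
$$\left[1 + \frac{\Delta t^2}{12}(\mathbf{u}\cdot\nabla)^2\right]\frac{T^{n+1} - T^n}{\Delta t} + \mathbf{u}\cdot\nabla\left(\frac{T^n + T^{n+1}}{2}\right) = 0, \tag{2.7-198}$$
where, since $\nabla\cdot\mathbf{u} = 0$, $(\mathbf{u}\cdot\nabla)^2 = \nabla\cdot(\mathbf{uu}\cdot\nabla)$, i.e., a second-derivative operator with the tensor $\mathbf{uu}$ in the role of a diffusivity, which 'looks like' a diffusion term (but really is not) and can be so treated à la Galerkin's weak form; i.e., via the usual integration by parts.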
The final result is then called a Taylor-Galerkin method ('Crank-Nicolson/Taylor-Galerkin' to be more precise), and it looks like
$$\left[M - \frac{\Delta t^2}{12}K_u(\mathbf{u})\right]\frac{T^{n+1} - T^n}{\Delta t} + N(\mathbf{u})\left(\frac{T^n + T^{n+1}}{2}\right) = 0, \tag{2.7-199}$$
where $K_u = \int\nabla\varphi_i\cdot\mathbf{uu}\cdot\nabla\varphi_j = \int(\mathbf{u}\cdot\nabla\varphi_i)(\mathbf{u}\cdot\nabla\varphi_j)$. Donea (1984) shows, in 1D, that (2.7-199), with linear basis functions, is indeed more accurate than conventional TR (which is 'recovered' by dropping the $\Delta t^2$ term), especially for larger $\Delta t$. It is also (like TR) non-dissipative and fourth-order accurate in space (also like TR with linear elements).
Remarks:
(1) If diffusion is included, then the $T_{ttt}$ terms cannot be so neatly utilized (they are dropped), with the result that the final Taylor-Galerkin equation is only second-order accurate in time (Donea et al., 1984).
(2) If the forward Euler method is employed rather than TR, then the resulting Taylor-Galerkin equation looks like our BTD result except that the mass matrix is modified in the same way as in (2.7-199), and the result is third-order accurate in time.
(3) For more background on the method of modified equations, see Warming and Hyett (1974) and Griffiths and Sanz-Serna (1986).
(4) See Donea and Quartapelle (1992) for further results.
Figure 2.7-38 shows the phase speed for CNTG, derived in the usual way from (2.7-199) with linear elements in 1D, which is obtained from
$$\xi = \frac{\dfrac{4-c^2}{2+c^2} + \cos\theta - i\,\dfrac{3c}{2+c^2}\sin\theta}{\dfrac{4-c^2}{2+c^2} + \cos\theta + i\,\dfrac{3c}{2+c^2}\sin\theta}, \tag{2.7-200}$$
which is easily seen to be unimodular. The $\xi(\theta)$ curve traces the entire unit circle as $\theta$ varies from 0 to $\pi$. Comparing these results with those for straight TR in Figure 2.7-15 for P = 100 shows a big gain in accuracy over a wide range of useful CFL numbers; e.g., $0.1 < c < 3$. Finally, we remark again that the non-zero phase speed shown for several values of P at $\theta = \pi$ is somewhat illusory, as the $2\Delta x$ wave is stationary for all $c$; the curves are simply discontinuous at $\theta = \pi$.
To conclude this section, we point out that A.J. Baker has both generalized and categorized several of these methods under the term 'Taylor weak statement.' Baker and Kim (1987) present and discuss many related CFD algorithms, including some finite difference methods.
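As a concrete check of (2.7-199) and (2.7-200), the sketch below applies the Crank-Nicolson/Taylor-Galerkin step with linear elements on a periodic, uniform 1D grid. It is an illustration only: the function names are hypothetical, the matrices are dense periodic (circulant) stencils scaled so that only the Courant number $c = u\Delta t/h$ appears, and the amplification factor is written with numerator and denominator multiplied through by $2 + c^2$.

```python
import numpy as np

def circulant(n, lo, di, up):
    """Periodic tridiagonal matrix with sub-, main- and super-diagonal entries."""
    E = np.eye(n)
    return lo*np.roll(E, -1, axis=1) + di*E + up*np.roll(E, 1, axis=1)

def cntg_step(T, c):
    """One CNTG step: [M - c^2 K/12](T_new - T) + (c/2) N (T_new + T) = 0."""
    n = T.size
    M = circulant(n, 1, 4, 1) / 6.0      # consistent mass matrix / h
    K = circulant(n, -1, 2, -1)          # h * stiffness matrix (K_u = u^2 K / h)
    N = circulant(n, -1, 0, 1) / 2.0     # centered advection stencil
    A = M - (c**2 / 12.0) * K
    # Note: the system is singular only for c = 1 together with the 2*dx mode.
    return np.linalg.solve(A + 0.5*c*N, (A - 0.5*c*N) @ T)

def xi_cntg(theta, c):
    """Amplification factor of the step above (unimodular for real theta)."""
    A = (4 - c**2) + (2 + c**2)*np.cos(theta)
    return (A - 3j*c*np.sin(theta)) / (A + 3j*c*np.sin(theta))
```

Advecting one period of a well-resolved sine wave (for example 64 nodes at `c = 0.5` for 128 steps) returns it essentially unchanged, with the discrete 2-norm preserved to round-off, consistent with the non-dissipative character noted above.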
Much related history is also discussed there.
Fig. 2.7-38 Phase speed for Donea's CNTG method (pure advection): relative phase speed versus $\theta$.
Finally, Chaffin
and Baker (1995) analyze some of these methods for both linear and higher-order FEM's (quadratic and cubic) and compare them against some finite difference (finite volume) schemes [QUICK methods, à la Leonard, both third-order (Leonard, 1979) and fifth-order (Leonard and Mokhtari, 1992), and even seventh-order, in Leonard and Mokhtari (1990)]. They also introduce a new, improved method with linear basis functions. Their results show FEM to be more accurate and, especially for CFL near one, their new scheme, based on a Taylor weak statement and linear basis functions, is really quite good. Again, a useful 'history' (updated from the 1987 paper) is included.
c. Some least-squares finite element methods (LSFEM)
B. Jiang is becoming to the least-squares FEM what A. Patera is to the spectral element method, but we have neither the time nor space to describe the many and varied applications by either; we simply provide a sampling and refer the reader to relevant citations for the remainder. Some citations first: Carey and Jiang (1988), Jiang (1993), Carey and Shen (1989). In the next chapter we will return to this list and mention some papers devoted to the NS equations.
We first remark that the 'obvious' least-squares method is generally of little use for approximating diffusion because integration by parts is not allowed; i.e., the Laplacian operator must remain 'as is', thus ostensibly requiring higher-order continuity ($C^1$) in the basis functions. To clarify this point, we quickly summarize the least-squares method as a weighted residual method for linear PDE's (for further details, see, for example, Eason, 1976; Carey and Oden, 1983): given the general equation $Lu = f$, an approximate solution, $u^h = \sum_j u_j\varphi_j$, is generated by selecting the coefficients $\{u_j\}$ in such a way that the mean square of the residual, $R = Lu^h - f$, is minimized (in an appropriate norm).
Thus using the $L_2$-norm, for example (the most common LSFEM),
$$I \equiv \int (Lu^h - f)^2 \tag{2.7-201}$$
is minimized via $\partial I/\partial u_i = 0$, $i = 1, 2, \ldots, N$, to yield
$$\int (Lu^h - f)(L\varphi_i) = 0 \quad \forall i; \tag{2.7-202}$$
i.e.,
$$\sum_{j=1}^{N} \Big( \int L\varphi_i \, L\varphi_j \Big) u_j = \int f \, L\varphi_i, \quad i = 1, 2, \ldots, N. \tag{2.7-203}$$
Thus, the weighting (test) function is $L\varphi_i$ rather than simply $\varphi_i$ of the GFEM, an observation that makes clear our above remark regarding the Laplacian. Another noteworthy observation is that the coefficient matrix, here $\int L\varphi_i \, L\varphi_j$, is symmetric regardless of whether $L$ is or is not, which is a nice feature. A final introductory and summary remark is this: $C^0$ finite elements can be, and commonly are, used even when $L$ includes the Laplacian, by first rewriting each PDE with higher-order operators as coupled systems of PDE's each with only first-derivative operators; e.g., $\nabla^2 u = f$ is rewritten as the system $\boldsymbol{\sigma} = \nabla u$ and $\nabla \cdot \boldsymbol{\sigma} = f$ (and, at least in theory, $\nabla \times \boldsymbol{\sigma} = 0$; see, for example, Jiang and Povinelli, 1993), and then LSFEM is applied to the system, thus permitting the (convenient) use of low-order $C^0$ basis functions but requiring rather more dependent variables, and equations. More on this in the next chapter.
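The two observations above (the test function is $L\varphi_i$, and the coefficient matrix is symmetric even when $L$ is not) can be checked on a small model problem. The sketch below is only an illustration under assumptions that are not from the text: the first-order operator $L = d/dx + c$ on $(0, 1)$ with $u(0) = 0$, the manufactured solution $u = x$ (so $f = 1 + cx$), a uniform mesh of linear elements, a two-point Gauss rule, and the hypothetical helper names `assemble` and `solve`.

```python
import math

def assemble(n, c, f):
    """Assemble least-squares and Galerkin matrices for L = d/dx + c.

    Illustrative assumptions: n linear elements on (0, 1), two-point Gauss rule.
    Returns (A_ls, A_g, b_ls) with
      A_ls[i][j] = int(L phi_i * L phi_j)   (least squares, cf. (2.7-203))
      A_g[i][j]  = int(phi_i * L phi_j)     (Galerkin, for comparison)
      b_ls[i]    = int(f * L phi_i)
    """
    h = 1.0 / n
    gauss = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
    A_ls = [[0.0] * (n + 1) for _ in range(n + 1)]
    A_g = [[0.0] * (n + 1) for _ in range(n + 1)]
    b_ls = [0.0] * (n + 1)
    for e in range(n):
        for xi in gauss:
            x = (e + 0.5 * (1.0 + xi)) * h      # quadrature point
            wt = 0.5 * h                        # Gauss weight (1) times Jacobian
            phi = (0.5 * (1.0 - xi), 0.5 * (1.0 + xi))
            dphi = (-1.0 / h, 1.0 / h)
            Lphi = [dphi[a] + c * phi[a] for a in range(2)]
            for a in range(2):
                b_ls[e + a] += wt * f(x) * Lphi[a]
                for b in range(2):
                    A_ls[e + a][e + b] += wt * Lphi[a] * Lphi[b]
                    A_g[e + a][e + b] += wt * phi[a] * Lphi[b]
    return A_ls, A_g, b_ls

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            m = M[r][k] / M[k][k]
            for col in range(k, n + 1):
                M[r][col] -= m * M[k][col]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(M[k][col] * x[col] for col in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

def asymmetry(A):
    """Largest |A_ij - A_ji|; zero for a symmetric matrix."""
    return max(abs(A[i][j] - A[j][i]) for i in range(len(A)) for j in range(len(A)))
```

With `A_ls, A_g, b = assemble(8, 2.0, lambda x: 1.0 + 2.0 * x)`, `asymmetry(A_ls)` is at round-off level while `asymmetry(A_g)` is of order one. Dropping row and column 0 (the Dirichlet node) and calling `solve` returns the nodal values of $u = x$, because the residual $Lu^h - f$ can be made to vanish identically in this manufactured case.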
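The LSFEM 'sample' that we will present begins by applying TR to $T_t + \mathbf{u} \cdot \nabla T = 0$ on a periodic BC domain and ends by studying $\zeta$ for linear basis functions in 1D. Thus, for $T^h = \sum_{j=1}^{N} T_j(t)\varphi_j(\mathbf{x})$, we define $T^h_n \equiv T^h(n\Delta t) = T^h(t_n)$ and
$$R \equiv \frac{T^h_{n+1} - T^h_n}{\Delta t} + \mathbf{u} \cdot \nabla \left( \frac{T^h_{n+1} + T^h_n}{2} \right), \tag{2.7-204}$$
where $T^h_n$ is given, and the LSFEM finds $T^h_{n+1}$ by minimizing $\frac{1}{2}\int R^2$ with respect to each amplitude coefficient, $T_i^{n+1} = T_i(t_{n+1})$, $i = 1, 2, \ldots, N$; i.e.,
$$0 = \frac{\partial}{\partial T_i^{n+1}} \, \frac{1}{2} \int \Big\{ \sum_j \Big[ \frac{T_j^{n+1} - T_j^n}{\Delta t}\varphi_j + \mathbf{u} \cdot \nabla\varphi_j \, \frac{T_j^{n+1} + T_j^n}{2} \Big] \Big\}^2,$$
giving
$$\sum_j \int \Big[ \frac{T_j^{n+1} - T_j^n}{\Delta t}\varphi_j + \mathbf{u} \cdot \nabla\varphi_j \, \frac{T_j^{n+1} + T_j^n}{2} \Big] \Big( \frac{\varphi_i}{\Delta t} + \frac{1}{2}\mathbf{u} \cdot \nabla\varphi_i \Big) = 0,$$
which, upon multiplying by $\Delta t$, looks like (and can be interpreted as) a Petrov-Galerkin method with test function $w_i = \varphi_i + (\Delta t/2)\,\mathbf{u} \cdot \nabla\varphi_i$. In fact, the above LSFEM is called (with $\Delta t/2$ replaced by a 'generic' $\tau$) SUPG (streamline upwind Petrov-Galerkin) by T.J. Hughes and colleagues/students (see, for example, Hughes et al., 1989). Rearrangement gives, after another multiplication by $\Delta t$,
$$\sum_j \Big[ \int \varphi_i\varphi_j + \frac{\Delta t}{2}\int (\varphi_i \mathbf{u} \cdot \nabla\varphi_j + \varphi_j \mathbf{u} \cdot \nabla\varphi_i) + \frac{\Delta t^2}{4}\int (\mathbf{u} \cdot \nabla\varphi_i)(\mathbf{u} \cdot \nabla\varphi_j) \Big] T_j^{n+1} = \sum_j \Big[ \int \varphi_i\varphi_j + \frac{\Delta t}{2}\int (\varphi_j \mathbf{u} \cdot \nabla\varphi_i - \varphi_i \mathbf{u} \cdot \nabla\varphi_j) - \frac{\Delta t^2}{4}\int (\mathbf{u} \cdot \nabla\varphi_i)(\mathbf{u} \cdot \nabla\varphi_j) \Big] T_j^{n}, \quad i = 1, 2, \ldots, N, \tag{2.7-205}$$
an equation (with symmetric matrices) previously presented in, for example, Carey and Jiang (1988, in 1D), Jiang and Povinelli (1990, albeit for BE instead of TR), and Donea and Quartapelle (1992). But this mess can be simplified since $\nabla \cdot \mathbf{u} = 0$ and since our BC's are periodic (non-periodic BC's would simplify only if $\mathbf{n} \cdot \mathbf{u} = 0$ on $\Gamma$) via (i) $\int (\varphi_i \mathbf{u} \cdot \nabla\varphi_j + \varphi_j \mathbf{u} \cdot \nabla\varphi_i) = \int \nabla \cdot (\mathbf{u}\varphi_i\varphi_j) = \oint_\Gamma \mathbf{n} \cdot \mathbf{u}\,\varphi_i\varphi_j = 0$; and (ii) $\int (\varphi_j \mathbf{u} \cdot \nabla\varphi_i - \varphi_i \mathbf{u} \cdot \nabla\varphi_j) = \int [\nabla \cdot (\mathbf{u}\varphi_i\varphi_j) - 2\varphi_i \mathbf{u} \cdot \nabla\varphi_j] = -2\int \varphi_i \mathbf{u} \cdot \nabla\varphi_j$, to give, upon division by $\Delta t$ and rearranging,
$$\sum_j \Big\{ \int \varphi_i\varphi_j \, \frac{T_j^{n+1} - T_j^n}{\Delta t} + \int (\varphi_i \mathbf{u} \cdot \nabla\varphi_j)\, T_j^n + \frac{\Delta t}{2}\int (\mathbf{u} \cdot \nabla\varphi_i)(\mathbf{u} \cdot \nabla\varphi_j) \, \frac{T_j^{n+1} + T_j^n}{2} \Big\} = 0, \quad i = 1, 2, \ldots, N; \tag{2.7-206}$$
which, while still symmetric (in the coefficients of $T_j^{n+1}$) is much simplified, and admits a very interesting interpretation/identification: namely for linear basis functions at least,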
it is identical to the pure advection version of the semi-implicit FE + TR + BTD scheme presented in Section 2.7.5; cf. (2.7-110) and (2.7-112)! The LSFEM 'automatically' inserts BTD, a feature that also extends to 2D and 3D. We leave the proof of this assertion to the reader and merely point out that we have here another example wherein the final result is obtainable in any of several ways. (Is a method that is insensitive to its manner of derivation generally good? Bad? Or just insensitive?) Part of the 'cost' of generating symmetric matrices is that, even for (or, especially for) pure advection, the scheme must be diffusive since only a skew-symmetric advection matrix can be dissipation-free.

d. Methods based on a discontinuous-in-time Galerkin ODE technique

In a very large number of papers, originally emanating from two far apart 'countries,' Sweden (C. Johnson et al., the originators) and California (T. Hughes et al.), but recently joined by (at least) T. Tezduyar at Minnesota, a set of methods variously described as streamline diffusion, SUPG, Galerkin/least squares, and SST (stabilized space-time) has evolved, all based on a (never-clearly-derived) discontinuous-in-time ($C^{-1}$) Galerkin ODE method. In this section we shall derive this family of ODE methods and summarize, mostly via the citation of relevant references, its use by the above-named researchers. Those that apply also to the NS equations will be recalled again in the next chapter, and some that apply only to NS will only be cited there. To introduce the discontinuous Galerkin method for ODE's (in time, for our purposes) in the simplest and cleanest way that we have found, consider the 'standard' ODE $\dot{y} = f(y, t)$ in which, even though $y(t)$ is continuous, we will be interested in Galerkin approximations that are discontinuous.
[Continuous-in-time Galerkin methods (and least-squares methods) have been tried and, to the best of our knowledge, largely abandoned; see Zienkiewicz and Taylor (1991) and references therein; also Wood (1990).] For the discontinuous Galerkin method, to be described below, see also: Delfour et al. (1981, 1992), Johnson (1988, 1992), and (especially) Estep (1995) and references therein. To motivate the discussion, imagine that we wish to integrate our ODE over $(0, \Delta t)$ using a discontinuous approximation that suffers a jump at $\delta t$, where $0 \le \delta t < \Delta t$ (see Figure 2.7-39). A weak solution, using a continuous test function $\varphi(t)$ on $(0, \Delta t)$ is as follows:
$$\int_0^{\Delta t} \varphi\dot{y}\,dt = \int_0^{\delta t} \varphi\dot{y}\,dt + (y_0 - y^-)\varphi(\delta t) + \int_{\delta t}^{\Delta t} \varphi\dot{y}\,dt = \int_0^{\Delta t} \varphi f\,dt, \tag{2.7-207}$$
in which the Dirac delta function in the integrand at $t = \delta t$ has been properly accounted for. Letting $\delta t \to 0$ causes the jump to occur at the boundary of each time-slab and is the discontinuous Galerkin (DG) method on $(0, \Delta t)$ (once $y$ is approximated via the same functions comprising the test space, which we do below):
$$(y_0 - y^-)\varphi_0 + \int_0^{\Delta t} \varphi\dot{y}\,dt = \int_0^{\Delta t} \varphi f\,dt, \tag{2.7-208}$$
where now $y^-$ ($= y$) is the value to the left of $t = 0$ and $y_0$ that to the right. The next figure (Figure 2.7-40) and the next weak statement generalize the situation. We are at $t_n$ and have available $y(t_n^-)$; the discontinuous Galerkin ODE method between $t_n$ and $t_{n+1}$
is then: find $y(t) = \sum_{j=1}^{N+1} y_j\varphi_j(t)$ from
$$\varphi_i(t_n)\,[\,y(t_n^+) - y(t_n^-)\,] + \int_{t_n}^{t_{n+1}} \varphi_i\,\dot{y}(t)\,dt = \int_{t_n}^{t_{n+1}} \varphi_i\,f(y, t)\,dt, \quad i = 1, 2, \ldots, N + 1, \tag{2.7-209}$$
where $\{\varphi_i\}$ are, a la FEM, piecewise polynomials on $(t_n, t_{n+1})$.

Fig. 2.7-39 Discontinuous ODE solution concept.

Fig. 2.7-40 Discontinuous Galerkin method for our ODE.

Remarks:
(1) The jump term enforces continuity, weakly.
(2) $y(t_n^+)$ is to be determined as part of the solution; the closer it is to $y(t_n^-)$, the less 'jumpy' is the solution.
(3) The solution on each time 'slab' involves $N + 1$ simultaneous (non-linear in general) algebraic equations.
(4) $N = 1$ appears to be a 'practical' limit; all results we have seen use $N = 0$ (piecewise constant) or 1 (piecewise linear).
(5) In some of the references on the subject (e.g., Hughes et al., 1989), a different but equivalent form is presented; it is a result of integrating by parts in (2.7-209):
$$\int_{t_n}^{t_{n+1}} \varphi_i\dot{y}\,dt = \varphi_i(t_{n+1})\,y(t_{n+1}^-) - \varphi_i(t_n)\,y(t_n^+) - \int_{t_n}^{t_{n+1}} \dot{\varphi}_i\,y\,dt$$
to give
$$\varphi_i(t_{n+1})\,y(t_{n+1}^-) - \varphi_i(t_n)\,y(t_n^-) - \int_{t_n}^{t_{n+1}} \dot{\varphi}_i\,y\,dt = \int_{t_n}^{t_{n+1}} \varphi_i f\,dt. \tag{2.7-210}$$
Remark: It is equivalent (in multi-dimensions) only in the absence of quadrature error; inexact integration favors the latter (T. Hughes, personal communication).

Let us now demonstrate this ODE method on $\dot{y} = -\lambda y$, $y(0) = y_0 = 1$ for both $N = 0$ and $N = 1$:

1. $N = 0$. Here $y(t) = y_{n+1} = y(t_n^+)$, $\varphi_i = \varphi_1 = 1$, $y(t_n^-) = y_n$, and $\dot{y} = 0$ to give $(y_{n+1} - y_n) + 0 = \int_0^{\Delta t} f\,dt = -\int_0^{\Delta t} \lambda y\,dt = -\lambda\Delta t\,y_{n+1}$, or $y_{n+1} = y_n/(1 + \lambda\Delta t)$, and we have merely recovered the BE method, albeit with a (perhaps) different interpretation. Note though that if $f$ is non-linear or includes a source term, say $s(t)$, this result would be slightly better than BE because of $\int_0^{\Delta t} s(t)\,dt$ rather than simply $\Delta t \cdot s_{n+1}$.

2. $N = 1$. Here $y(t) = y_1\varphi_1 + y_2\varphi_2$, where $y_1 = y_n^+$, $y_2 = y_{n+1}^-$, $\varphi_1 = (t_{n+1} - t)/\Delta t_n$, and $\varphi_2 = (t - t_n)/\Delta t_n$. At $t = 0$, $y^- = y(0)$. In this case, (2.7-209) for $i = 1$ gives
$$(y_n^+ - y_n^-) + \int_{t_n}^{t_{n+1}} \frac{t_{n+1} - t}{\Delta t_n}\,\frac{y_{n+1}^- - y_n^+}{\Delta t_n}\,dt = -\lambda \int_{t_n}^{t_{n+1}} \frac{t_{n+1} - t}{\Delta t_n} \Big[ y_n^+\frac{t_{n+1} - t}{\Delta t_n} + y_{n+1}^-\frac{t - t_n}{\Delta t_n} \Big] dt,$$
which incorporates the jump, and for $i = 2$ yields
$$0 + \int_{t_n}^{t_{n+1}} \frac{t - t_n}{\Delta t_n}\,\frac{y_{n+1}^- - y_n^+}{\Delta t_n}\,dt = -\lambda \int_{t_n}^{t_{n+1}} \frac{t - t_n}{\Delta t_n} \Big[ y_n^+\frac{t_{n+1} - t}{\Delta t_n} + y_{n+1}^-\frac{t - t_n}{\Delta t_n} \Big] dt,$$
a pair of equations in $y_n^+$ and $y_{n+1}^-$. Integrating yields
$$\frac{y_{n+1}^- + y_n^+}{2} - y_n^- = -\frac{\lambda\Delta t}{6}\,(2y_n^+ + y_{n+1}^-) \tag{2.7-211}$$
and
$$\frac{y_{n+1}^- - y_n^+}{2} = -\frac{\lambda\Delta t}{6}\,(y_n^+ + 2y_{n+1}^-) \tag{2.7-212}$$
with solution $y_n^+ = y_n^- \cdot (1 + 2\lambda\Delta t/3)/(1 + 2\lambda\Delta t/3 + \lambda^2\Delta t^2/6)$ and $y_{n+1}^- = y_n^- (1 - \lambda\Delta t/3)/(1 + 2\lambda\Delta t/3 + \lambda^2\Delta t^2/6)$. The jump at $t_n$ is $y_n^+ - y_n^- = -y_n^-\,\lambda^2\Delta t^2/[6(1 + 2\lambda\Delta t/3 + \lambda^2\Delta t^2/6)]$; i.e., it is small, $O(\Delta t^2)$. It is, however, larger than the global truncation error, which is $O(\Delta t^3)$. Also, as is obvious from the above results, the $\Delta t \to \infty$ results are that both $y_n^+$ and $y_{n+1}^- \to 0$; i.e., it is L-stable, similar to (dissipative) low-order BDF's and different from (non-dissipative) TR. The discontinuous Galerkin ODE solution will not 'wiggle', which is one of its sales points by its promoters. The general end-of-step solution is easily found to be
$$y_n^- = y_0 \cdot \big[ (1 - \lambda\Delta t/3)/(1 + 2\lambda\Delta t/3 + \lambda^2\Delta t^2/6) \big]^n, \tag{2.7-213}$$
which is a third-order accurate approximation that is also unconditionally stable; the factor $(1 - \lambda\Delta t/3)/(1 + 2\lambda\Delta t/3 + \lambda^2\Delta t^2/6)$ is, in fact, the so-called 2,1 Pade approximation to $e^{-\lambda\Delta t}$ (e.g., Vichnevetsky and Bowles, 1982, or Wood, 1990). How does the method compare in accuracy with TR? To TR at 1/2 the step size, since it is about twice as much work (twice as many equations) as TR? To BDF3 or AB3? Quite well, actually; Table 2.7-5 shows the results for $\lambda\Delta t = 0.1$ and 10 timesteps. The error columns are to be multiplied by $10^{-6}$. (See Notes below table for explanation.)

Remarks:
(1) DG is remarkably accurate, and global error grows slowly, or not at all ('a consequence of Galerkin orthogonality'; R. Rannacher, personal communication). This nice result is rather special, however; for a general, non-linear system of ODE's, the global error will increase with time (A. Hindmarsh, personal communication).
(2) TR/2 is listed because it is a good second-order method with about the same work (one-half as many equations to solve).
(3) TR/8 is listed because it is the break-even point for TR with respect to accuracy.
Table 2.7-5

 n   y(t_n) = e^{-0.1n}   y(t_n) - y_DG   y(t_n) - y_AB3   y(t_n) - y_BDF3   y(t_n) - y_TR/2   y(t_n) - y_TR/8
 1   0.9048               1.225           0 (1)            0 (1)             18.86             1.178
 2   0.8187               2.216           0 (1)            0 (1)             34.13             2.132
 3   0.7408               3.008           32.54            -10.81            46.32             2.894
 4   0.6703               3.629           55.75            -26.56            55.88             3.491
 5   0.6065               4.104           76.04            -41.68            63.20             3.949
 6   0.5488               4.456           91.65            -53.94            68.62             4.288
 7   0.4966               4.704           103.71           -63.20            72.44             4.526
 8   0.4493               4.865           112.6            -69.96            74.91             4.681
 9   0.4066               4.952           118.9            -74.75            76.25             4.765
10   0.3679               4.979           123.0            -77.99            76.66             4.790

Notes:
DG: Discontinuous (linear) Galerkin
BDF3: Third-order backward differentiation: $y_{n+1} = \frac{1}{11}[18y_n - 9y_{n-1} + 2y_{n-2} + 6h\dot{y}_{n+1}]$
AB3: Third-order Adams-Bashforth
TR/2: Trapezoid rule at $\Delta t/2$
TR/8: Trapezoid rule at $\Delta t/8$
(1) Exact solution used for first two steps.
(4) AB3 is rather disappointing.
(5) The jump in DG is about 0.001 ± 0.0005, decreasing with $t$.

So, for the model heat equation, linear-DG does very well. What about the model advection equation, $\dot{y} = i\omega y$? First we note that it is not conservative (it introduces numerical damping); the solution is of course given by (2.7-213) with $\lambda = -i\omega$, which in turn leads to
$$\zeta = \frac{1 + i\omega\Delta t/3}{1 - 2i\omega\Delta t/3 - \omega^2\Delta t^2/6} = \frac{1 - 7\omega^2\Delta t^2/18 + i\omega\Delta t\,(1 - \omega^2\Delta t^2/18)}{1 + \omega^2\Delta t^2/9 + \omega^4\Delta t^4/36}, \qquad |\zeta| < 1,$$
which, according to Johnson et al. (1984), is third-order accurate and fourth-order dissipative. Applying the linear DG method to the AD equation (GFEM in space and DG in time) is straightforward but quite complex. (We shall study the piecewise-constant DG method in the next section.) What we shall do here, instead, is to take a mere first step or so in this direction, applying it to $T_t + uT_x = 0$. The result, on a uniform mesh, where $T_j^n$ corresponds (at node $j$) to $y(t_n^-)$ in the scalar ODE case and $\tilde{T}_j^n$ corresponds to $y(t_n^+)$, the value of $T_j$ just after the jump, is, analogous to the ODE result in (2.7-213), the pair
$$\frac{1}{6}\Big[ (T_{j-1}^{n+1} + \tilde{T}_{j-1}^n - 2T_{j-1}^n) + 4(T_j^{n+1} + \tilde{T}_j^n - 2T_j^n) + (T_{j+1}^{n+1} + \tilde{T}_{j+1}^n - 2T_{j+1}^n) \Big] + \frac{u\Delta t}{6h}\Big[ (T_{j+1}^{n+1} - T_{j-1}^{n+1}) + 2(\tilde{T}_{j+1}^n - \tilde{T}_{j-1}^n) \Big] = 0 \tag{2.7-214}$$
and
$$\frac{1}{6}\Big[ (T_{j-1}^{n+1} - \tilde{T}_{j-1}^n) + 4(T_j^{n+1} - \tilde{T}_j^n) + (T_{j+1}^{n+1} - \tilde{T}_{j+1}^n) \Big] + \frac{u\Delta t}{6h}\Big[ 2(T_{j+1}^{n+1} - T_{j-1}^{n+1}) + (\tilde{T}_{j+1}^n - \tilde{T}_{j-1}^n) \Big] = 0, \tag{2.7-215}$$
in which the unknowns are $T_j^{n+1}$, $T_{j\pm1}^{n+1}$, $\tilde{T}_j^n$, and $\tilde{T}_{j\pm1}^n$, and the first equation incorporates the jump; e.g., $\tilde{T}_j^n - T_j^n$. Rather than attempting a von Neumann analysis, we instead quote from Shakib and Hughes (1991), where these equations first appeared, who did: 'A symbolic manipulator program, SMP, was used to carry out these operations. The resulting difference stencil is very long and complicated, consequently we will not present it here.'
What is worth noting is that the DG method is quite implicit; it intimately couples the unknown nodal parameters at the beginning and end of the time interval with the nodal values at the end of the previous interval, a result that can (as shown above) give quite accurate results but that doubles both the number of equations and the bandwidth over competing (conventional?) 'continuous' (single-valued) methods such as TR and BDF3. See Shakib and Hughes (1991) for further discussion of this method, including some 'specialized' methods for dealing with the associated (and 'large') linear algebra problem. It is shown there to be unconditionally stable in the two limiting cases of pure advection and pure diffusion; presumably it is also stable for the general case. Additional relevant references on this subject may be found in Johnson et al. (1984), Hughes et al. (1987), and Eriksson and Johnson (1990).
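The scalar results of this section are easy to verify numerically. The sketch below is a check written for this purpose, not code from any of the cited works: it solves the 2×2 system (2.7-211) and (2.7-212) for $\dot{y} = -\lambda y$ by Cramer's rule with $z = \lambda\Delta t$, and compares the end-of-step value with the Pade factor of (2.7-213). The function names `dg_linear_step`, `pade_factor`, and `dg_errors` are hypothetical.

```python
import math

def dg_linear_step(y_minus, z):
    """One linear-DG step for dy/dt = -lam*y with z = lam*dt (may be complex).

    Solves the pair
      (y_end + y_plus)/2 - y_minus = -(z/6) * (2*y_plus + y_end)   # (2.7-211)
      (y_end - y_plus)/2           = -(z/6) * (y_plus + 2*y_end)   # (2.7-212)
    and returns (y_plus, y_end): the value just after the jump at t_n and the
    value at the end of the time slab.
    """
    a11, a12 = 0.5 + z / 3.0, 0.5 + z / 6.0
    a21, a22 = -0.5 + z / 6.0, 0.5 + z / 3.0
    det = a11 * a22 - a12 * a21
    y_plus = y_minus * a22 / det        # Cramer's rule, right-hand side (y_minus, 0)
    y_end = -a21 * y_minus / det
    return y_plus, y_end

def pade_factor(z):
    """Closed-form end-of-step amplification factor of (2.7-213)."""
    return (1.0 - z / 3.0) / (1.0 + 2.0 * z / 3.0 + z * z / 6.0)

def dg_errors(z, nsteps):
    """Errors exp(-z*n) - y_n for n = 1..nsteps, starting from y(0) = 1 (real z)."""
    y, errs = 1.0, []
    for n in range(1, nsteps + 1):
        _, y = dg_linear_step(y, z)
        errs.append(math.exp(-z * n) - y)
    return errs
```

For `z = 0.1`, `dg_errors(0.1, 10)` scaled by $10^6$ gives values that round to the first and last entries of the DG column of Table 2.7-5. Passing an imaginary argument, e.g. `pade_factor(-1j * 0.5)`, gives the advection amplification factor $\zeta$, whose modulus stays below one, and a very large real `z` drives `y_end` toward zero, which is the L-stability noted in the text.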
e. Methods based on least-squares and time-discontinuous ODE's

'Galerkin is accurate but unstable; least squares is stable but not accurate ...' (L. Franca, during his lecture on stabilized methods at the October 1993 Workshop on Numerical Methods for the Navier-Stokes Equations, at Heidelberg, Germany).

If the added stability (relative to GFEM + ODE) of least squares alone or DG alone is not sufficient for your 'taste' you may follow Hughes et al. and combine the two. Thus, in Hughes et al. (1989) is presented the 'Galerkin/least-squares' (GLSQ) method for the scalar transport equation. Whereas they discuss both steady and time-dependent equations, we shall focus only on the latter, in which the DG method in time is combined with GFEM in space and augmented by an additional weighted-residual method, in a linear combination: the least-squares method in both space and time. (At the time of writing, the GLSQ is the method of choice by Prof. Hughes.) Thus, the GLSQ on $T_t + \mathbf{u} \cdot \nabla T = \kappa\nabla^2 T + S$ with homogeneous Dirichlet BC's (for simplicity) is the following, for each time 'slab':
$$\int_{t_n}^{t_{n+1}} \Big[ \Big( w^h, \frac{\partial T^h}{\partial t} + \mathbf{u} \cdot \nabla T^h \Big) + (\nabla w^h, \kappa\nabla T^h) + \tau \Big( \frac{\partial w^h}{\partial t} + \mathbf{u} \cdot \nabla w^h - \kappa\nabla^2 w^h,\; \frac{\partial T^h}{\partial t} + \mathbf{u} \cdot \nabla T^h - \kappa\nabla^2 T^h - S \Big)_e \Big] dt + \big( w^h(t_n^+),\; T^h(t_n^+) - T^h(t_n^-) \big) = \int_{t_n}^{t_{n+1}} (w^h, S)\,dt \quad \forall w^h, \tag{2.7-216}$$
where $\tau$ (with units of 'time') is the LSQ weighting parameter ($\tau = 0$ is DGFEM of the previous section), and $w^h(\mathbf{x}, t)$ is the discontinuous-in-time, continuous-in-space test function. Also $(\;,\;)_e$ denotes the $L_2$ inner product over element interiors (and summed over elements), thus permitting $\nabla^2 w^h$ to 'make sense' with $C^0$ basis functions. [Note that letting $\tau \to \infty$ causes this weighted residual method to be 'purely' least squares (in space and time), and would of course now require $C^1$ basis functions of degree two or more and the usual (global) definition of $(\;,\;)$ unless $\kappa = 0$.] In fact, however, the DGFEM employs only $C^0$ basis functions (and finite $\tau$) and does not even entertain the pure LSQ limit.
This fact, plus the element-based $L_2$ inner product (which gives zero for the Laplacian with linear basis functions), emphasizes the point that the LSQ is to be considered simply as a stabilizing addition to the basic Galerkin FEM, even for the steady-state case. A final remark regarding the above formulation is that omission of the Laplacian term in the least-squares ($\tau$) terms returns us to the SUPG method: 'The Galerkin/least squares method is closely related to SUPG, but represents a conceptually simpler and more general methodology, applicable to a wide variety of problem classes' (Hughes et al., 1989). If we use the simplest DG method, then both $T^h$ and $w^h$ are piecewise constant in time ($\partial w^h/\partial t = 0$) over each time-slab and the result, from Shakib and Hughes (1991) is, in 1D on a uniform mesh of linear elements ($\nabla^2 w^h = 0$),
$$\frac{1}{6}\Big( \frac{T_{j-1}^{n+1} - T_{j-1}^n}{\Delta t} \Big) + \frac{2}{3}\Big( \frac{T_j^{n+1} - T_j^n}{\Delta t} \Big) + \frac{1}{6}\Big( \frac{T_{j+1}^{n+1} - T_{j+1}^n}{\Delta t} \Big) + u\,\frac{T_{j+1}^{n+1} - T_{j-1}^{n+1}}{2h} = (\kappa + u^2\tau)\,\frac{T_{j-1}^{n+1} - 2T_j^{n+1} + T_{j+1}^{n+1}}{h^2} + \frac{1}{\Delta t}\int_{t_n}^{t_{n+1}} S\,dt, \tag{2.7-217}$$
which, like the scalar case of the previous section, is simply BE with a twist: the LSQ portion/addition has added 'BTD' to the BE method. Recall (see Section 2.7.2e) that BTD made sense from the time integration point of view when FE was the time integrator (with $\tau$ replaced by $\Delta t/2$); here it just adds additional diffusion (streamline diffusion in multi-dimensions) to an already 'maximum-dissipation' time integrator, BE. Some might view this as overkill, sacrificing accuracy (take $\kappa = 0$ and watch your initial data 'disappear') for super stability, especially when the arguments applied in Section 2.7.2e to the FE method would lead to a negative BTD for BE; i.e., the diffusion term from the time truncation error argument would be $(\kappa - u^2\Delta t/2)(T_{j-1}^{n+1} - 2T_j^{n+1} + T_{j+1}^{n+1})/h^2$, and would give more accurate results than BE and (especially) BE + BTD; i.e., GLSQ. Thus, this lowest-order method is not recommended for time-accurate simulations, even by its designers; use it only to reach steady-state solutions is their advice.

Since it may not be totally obvious what is meant by $T^h$ and $w^h$ in (2.7-216) for the linear DG case, we shall elucidate. First, recalling that this is a Galerkin-based method means that the trial (basis) and test functions are from the same space; they are the following:
$$T^h(x, t) = \sum_j \varphi_j(x)\,[\,\Pi_1(t)\,\tilde{T}_j^n + \Pi_2(t)\,T_j^{n+1}\,] \tag{2.7-218}$$
and
$$w^h(x, t) = \sum_j \varphi_j(x)\,[\,\Pi_1(t)\,w_j^n + \Pi_2(t)\,w_j^{n+1}\,], \tag{2.7-219}$$
where $\Pi_1(t) = (t_{n+1} - t)/\Delta t$, $\Pi_2(t) = (t - t_n)/\Delta t$, and $\Delta t = t_{n+1} - t_n$; also $T_j^n$ and $\tilde{T}_j^n$ have the same meaning as in the previous section, Section 2.7.7d. In the implementation of (2.7-216), it suffices to satisfy (2.7-216) for every $w^h$ as follows: first take $w^h = \varphi_i(x)\Pi_1(t)$ for node $i$ and then take $w^h = \varphi_i(x)\Pi_2(t)$, still for node $i$; repeat for every $i$ (all nodes), giving two 'ODE's' per node. To conclude our 'summary,' we return to the pure advection case in 1D presented in the previous section, (2.7-214) and (2.7-215), and see what the LSQ addition looks like.
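To do so, we turn again to Shakib and Hughes (1991): to the LHS of (2.7-214) are added [from $\Pi_1(t)$] the following three terms:
$$\frac{\tau}{3\Delta t}\Big[ (\tilde{T}_{j-1}^n + 4\tilde{T}_j^n + \tilde{T}_{j+1}^n) - (T_{j-1}^{n+1} + 4T_j^{n+1} + T_{j+1}^{n+1}) \Big] \quad \text{from} \quad \tau\int_{t_n}^{t_{n+1}} \Big( \frac{\partial w^h}{\partial t}, \frac{\partial T^h}{\partial t} \Big) dt,$$
$$\frac{u\tau}{h}\,(T_{j-1}^{n+1} - T_{j+1}^{n+1}) \quad \text{from} \quad \tau\int_{t_n}^{t_{n+1}} \Big[ \Big( \frac{\partial w^h}{\partial t}, \mathbf{u} \cdot \nabla T^h \Big) + \Big( \mathbf{u} \cdot \nabla w^h, \frac{\partial T^h}{\partial t} \Big) \Big] dt,$$
and
$$\frac{u^2\tau\Delta t}{3h^2}\Big[ (-T_{j-1}^{n+1} + 2T_j^{n+1} - T_{j+1}^{n+1}) + 2(-\tilde{T}_{j-1}^n + 2\tilde{T}_j^n - \tilde{T}_{j+1}^n) \Big] \quad \text{from} \quad \tau\int_{t_n}^{t_{n+1}} (\mathbf{u} \cdot \nabla w^h, \mathbf{u} \cdot \nabla T^h)\,dt;$$
and, analogously, to the LHS of (2.7-215) is added, from the same three LSQ terms, this time from $\Pi_2(t)$: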
$$\frac{\tau}{3\Delta t}\Big[ (T_{j-1}^{n+1} + 4T_j^{n+1} + T_{j+1}^{n+1}) - (\tilde{T}_{j-1}^n + 4\tilde{T}_j^n + \tilde{T}_{j+1}^n) \Big] + \frac{u\tau}{h}\,(\tilde{T}_{j+1}^n - \tilde{T}_{j-1}^n) + \frac{u^2\tau\Delta t}{3h^2}\Big[ 2(-T_{j-1}^{n+1} + 2T_j^{n+1} - T_{j+1}^{n+1}) + (-\tilde{T}_{j-1}^n + 2\tilde{T}_j^n - \tilde{T}_{j+1}^n) \Big].$$
Since only some of the 'correction terms' are amenable to useful physical interpretation, we shall not bother writing out the full equations and refer the reader to Hughes et al. (1989) and Shakib and Hughes (1991) for further details. This latter reference also shows some 'Fourier analysis' results, both for $\tau = 0$ and for
$$\tau = \big[ (2/\Delta t)^2 + (2u/h)^2 + 9(4\kappa/h^2)^2 \big]^{-1/2}. \tag{2.7-220}$$
It also provides a brief comparison with more conventional methods. Related methods applied to quadratic basis functions (in addition to linears) can be found in Franca et al. (1992) and Khelifa et al. (1993). To conclude this brief summary, we state our current opinion regarding GLSQ: it is not felt to be a serious competitor for the simple scalar transport equation that is the subject of this chapter. It may, however, be a viable competitor for the much more difficult problems to be addressed in the next chapter, especially for free-surface flows. It may also be competitive for that branch of CFD not addressed in this book: compressible flow. Finally, for some new methods that are a sort of blend of characteristic-based methods and GLSQ, in which only symmetric linear equations are generated, called the Characteristic Streamline Diffusion (CSD) method, see Johnson (1992) and Hansbo (1992a). See too Pironneau et al. (1992) in which the 'characteristic Galerkin' and 'Galerkin/least squares space-time formulation' were compared.

f. A wave equation method

A very recent result (Wu, 1994) seems to have successfully 'transcribed' the second-order wave-equation method of Lynch and Gray (1979), which has been demonstrated to do a very good job for the shallow-water equations, to the advection-diffusion equation.
It is shown, for both FDM in 1D and lumped mass bilinear FEM in 2D, to have no spurious damping, excellent phase-speed properties that are nearly independent of CFL in the stable range (explicit formulation), and few or no wiggles. It properly precludes the second of the two wave solutions generally displayed by the second-order wave equation, by appending a second, required and proper, IC on the problem.

g. Another combined method: Taylor least squares

As our final 'survey' example, we summarize a method developed by Park and Liggett (1990, 1991) that is described as 'An extremely accurate and also very flexible method ...' by Donea and Quartapelle (1992). It is restricted to $C^1$ basis functions because it is also of high order in time. It combines ideas from the modified equation approach in which Taylor series are used to obtain a fourth-order accurate time 'integration method' (for $\theta = 1/2$, the only case considered herein), the modified equation, which is then spatially discretized using least squares in space with Hermite cubic polynomials for
basis functions. The modified (Taylor portion) equation is, for pure advection in 1D,
$$\Big( 1 + \frac{u\Delta t}{2}\frac{\partial}{\partial x} + \frac{u^2\Delta t^2}{12}\frac{\partial^2}{\partial x^2} \Big)\Big( \frac{T_{n+1} - T_n}{\Delta t} \Big) + u\frac{\partial T_n}{\partial x} = 0, \tag{2.7-221}$$
and the weighted residual equation upon applying the least-squares method, is
$$\int \Big[ \Big( 1 + \frac{u\Delta t}{2}\frac{\partial}{\partial x} + \frac{u^2\Delta t^2}{12}\frac{\partial^2}{\partial x^2} \Big)\Big( \frac{T^h_{n+1} - T^h_n}{\Delta t} \Big) + u\frac{\partial T^h_n}{\partial x} \Big] \Big[ \Big( 1 + \frac{u\Delta t}{2}\frac{\partial}{\partial x} + \frac{u^2\Delta t^2}{12}\frac{\partial^2}{\partial x^2} \Big) w^h \Big] dx = 0, \quad \forall w^h, \tag{2.7-222}$$
where $w^h$ is a test function. Upon representing both $T^h$ and $w^h$ by means of Hermite cubic polynomials, the higher-order TLS scheme of Park and Liggett is obtained. When diffusion is included, it is treated temporally via the trapezoid rule using an operator-splitting (fractional step) technique, thus accepting only second-order accuracy for diffusion. It is applied (tested) in 1D and 2D in Park and Liggett (1990) and in 3D in Park and Liggett (1991), wherein also presented is a ray method for obtaining analytical solutions to the transient AD equation in 3D. Impressive results are presented, and the TLS method compares well against other weighted residual methods [the Crank-Nicolson Taylor-Galerkin and Lax-Wendroff Taylor-Galerkin modified equation methods of Donea et al. (1987)] using similar (cubic) basis functions. Thus we conclude our discussion of other methods on a high note; i.e., a higher order in space and time method can indeed deliver high-order accuracy on relatively coarse grids. It remains to be seen if the method can be successfully applied to the more difficult NS equations. It should be sufficiently clear by now that the 'simple' AD equation continues to provide fertile ground for numerical methods developers; the 'sand box' is very large.

2.7.8 Concluding Remarks and Suggestions

It may be interesting, and perhaps even useful, to try to draw a few conclusions and make some recommendations after such a long discussion of time integration techniques. They will naturally be rather subjective since we (or anyone else, for that matter) have only limited personal experience.
Thus, the first thing we must do is remain rather silent on those new techniques that involve least-squares and discontinuous-in-time Galerkin methods—they are probably excellent, but they look to us rather expensive. Next, we do believe in the method of characteristics for advection and recommend it to our readers—although the best of them may still be in the future, and should probably not use fixed timesteps. Returning to the more mundane-but-much-more-common subject of ODE methods applied to the Eulerian version of the AD equation, we can offer more confident advice: 1. Since the GFEM equations are inherently implicit (CM matrix) and since we strongly believe that implicit methods should almost never be implemented as fixed-step methods, our first vote goes to the variable-step TR method of Section 2.7.4.
2. If you are uncomfortable with a method that contains no built-in damping, use the variable-step BDF2 method that we discuss in the next chapter (Section 3.16.4d); simply simplify it for the AD equation.
3. Please do not use BE, variable step or not; it is too dissipative and requires too small a $\Delta t$ for accuracy.
4. If you insist on explicit, fixed-$\Delta t$ integrators for simplicity (or whatever), yet are interested in accurate time integration and in advection-dominated flows, then consider either AB3 or RK4 and use a few DSCG iterations per timestep to reap the benefits of consistent mass.
5. If you insist on solving only symmetric matrices but want more stability than explicit methods can provide, consider the CM semi-implicit but unconditionally stable (we believe; see Bullister, 1986) method of Section 2.7.5 (FE/TR/BTD), perhaps even generalized to variable step sizes, somehow.
6. The lowest-level but cheapest per step method that we would even consider advocating uses LM quadratic elements and the FE/BTD explicit method of Section 2.7.2e.

2.8 ADDITIONAL NUMERICAL EXAMPLES

We conclude this chapter with just three numerical examples: one that occurred in our 'real world of applications' and two that serve as simple but effective test problems to demonstrate several aspects of what has been discussed.

2.8.1 Unstable ODE Example

This example came from a real-world application: thermal convection in molten uranium. During a particular series of simulations (at LLNL), a previously quite useful code kept blowing up. After many 'diagnostic' runs, including the use of two independent codes, it was (finally!) determined that the advection-diffusion ODE for the temperature was often unstable. Yes, the ODE itself. (This is the energy equation in the Boussinesq model, see Volume II, and the velocity field that drives it is rapidly varying, both temporally and spatially.)
Presented below is a simplified sample of those results in which the quite complex and always time-dependent velocity field was 'frozen' in the middle of a run (when all the physics was in full swing) and that field, now only spatially varying, used as a steady velocity field to drive the linear scalar transport equation. The flow field was generated from the following thermal convection problem in a 59.7 × 12.7 horizontal cavity: the bottom and right end are heated ($T_{hot}$), the top is cooled ($T_{cold}$), and the left end is a symmetry boundary. The velocity BC's were: no slip, no penetration on the bottom and right end, no penetration and shear-free on the top, and 'symmetry' on the left end (i.e., again no penetration and no shear). Depending somewhat on $\Delta T$ ($= T_{hot} - T_{cold}$), a multi-cell (four or five, usually) flow pattern resulted, a snapshot of which is shown in Figure 2.8-1. The 'test' problem that finally evolved was to use this flow field (which 'generated' the unstable ODE's) to solve the pure advection equation with a specified, initial Gaussian temperature field ($\sigma = 0.25H = 3.18$) centered near the top of the right-most cell; see Figure 2.8-2(a). [We switched from advection-diffusion to pure advection partly because diffusion (in
'reasonable' amounts) did not stabilize the ODE's and partly to perform other numerical tests, reported below.]

Fig. 2.8-1 Flow field that generated an unstable ODE. [Panels include (c) Vector Zoom and (d) Stream Function Zoom.]

The series of experiments described below will demonstrate the following points:
1. It is not too difficult to generate unstable ODE's from the advection term in complex and rapidly varying velocity fields if any but the quadratically conserving formulation ($\beta = 1/2$, skew-symmetric advection matrix) is utilized.
2. The skew-symmetric form will conserve $\int T^2$ but not $\int T$; i.e., it always generates stable ODE's but cannot be guaranteed to conserve $T$, nor be guaranteed to be accurate. Bounded gibberish can occur.
Fig. 2.8-2 Temperature field at several times (TR, $\beta = 0$): (a) t = 0; (b) t = 1; (c) t = 4; (d) t = 16; (e) t = 32; (f) t = 84. [Each panel is annotated with its $T_{max}$ and $T_{min}$.]
3. The conservation form (divergence form; $\beta = 1$) will indeed conserve $\int T$ but generally not $\int T^2$ and will thus blow up (become unbounded). [N.B. finite volume fans.]
4. The simplest advective form ($\beta = 0$) will conserve neither while blowing up.
5. Owing principally to directional group velocity errors, the passive scalar that should remain in the first two cells crosses not only those dividing streamlines but all the others as well, to eventually fill the whole flow field with spurious numbers. This does not bode well for any of the advection options, and the only fix we know of would be to use one of the better characteristic/trajectory methods for advection instead of Galerkin (or Petrov-Galerkin, or least squares, or Galerkin least squares, or Taylor-Galerkin, or any Eulerian method).
6. The time reversibility of the TR integrator will be dramatically demonstrated, even on unstable ODE's, and compared with the dissipative BE method. TR is an excellent time integrator for non-dissipative systems, and BE is a very poor one, requiring extremely small timesteps for comparable accuracy.
7. Streamline diffusion can be used to generate different wrong answers, and does not guarantee stability.

Figures 2.8-1(a) and (b) show the vector field and stream function for the flow, and Figures 2.8-1(c) and (d) show a 'zoom' (different $\psi$-contours) into the right-most 15% or so of the cavity because this is where the largest 'action' will be seen to occur. Even though the upper small eddy [Figure 2.8-1(d)] appears to be fairly well resolved on this 100 × 24 graded mesh of $Q_1Q_0$ (see Chapter 3) elements in which temperature is piecewise bilinear, it is the 'generator' of the unstable ODE in the following sense: the eigenvector corresponding to the most unstable mode (largest growth rate) has its largest entries at nodes in this region. Thus, ultimately (large $t$) the temperature in this neighborhood dominates that in the rest of the field.
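The conservation statements in points 1 to 4 of the list above can be checked on a one-dimensional caricature. The sketch below is an illustration only and rests on several assumptions that are not taken from the text: a periodic mesh of linear elements with unit spacing, a velocity that is constant within each element but varies from element to element (a crude stand-in for a field that is not discretely divergence-free), and the blend $N(\beta) = (1 - \beta)C - \beta C^T$, where $C_{ij} = \int \varphi_i u\,\partial\varphi_j/\partial x$ is the advective form and $-C^T$ is the divergence form after integration by parts. With $M\dot{T} = -N T$ and a symmetric mass matrix, zero column sums of $N$ imply conservation of $\int T$, and a vanishing quadratic form $T^T N T$ implies conservation of $\int T^2$.

```python
import math

def advection_matrix(u_elem, beta):
    """Galerkin advection matrix N(beta) on a periodic 1D mesh (illustrative).

    u_elem[e] is the (assumed piecewise-constant) velocity in element e, which
    joins nodes e and (e + 1) % n. The blend (1 - beta)*C - beta*C^T is an
    assumed form of the beta-family: beta = 0 advective, 1/2 skew-symmetric,
    1 divergence (conservation) form.
    """
    n = len(u_elem)
    C = [[0.0] * n for _ in range(n)]
    for e, u in enumerate(u_elem):
        nodes = (e, (e + 1) % n)
        for a in range(2):
            for b in range(2):
                # int(phi_a * u * dphi_b/dx) over one linear element = +-u/2
                C[nodes[a]][nodes[b]] += 0.5 * u * (-1.0 if b == 0 else 1.0)
    return [[(1.0 - beta) * C[i][j] - beta * C[j][i] for j in range(n)]
            for i in range(n)]

def quad_form(N, T):
    """T^T N T; minus this is the rate of change of (1/2) T^T M T."""
    n = len(T)
    return sum(T[i] * N[i][j] * T[j] for i in range(n) for j in range(n))

def column_sums(N):
    """Column sums of N; all zero means sum_i (M T)_i is conserved."""
    n = len(N)
    return [sum(N[i][j] for i in range(n)) for j in range(n)]

# Hypothetical test data: a varying element velocity and a sampled sine wave.
U_ELEM = [1.0, 1.5, 1.0, 0.5, 1.0, 1.5, 1.0, 0.5]
S = math.sqrt(0.5)
T_TEST = [0.0, S, 1.0, S, 0.0, -S, -1.0, -S]
```

For these data, `quad_form` vanishes only for `beta = 0.5`, and `column_sums` vanish only for `beta = 1.0`; the advective form `beta = 0.0` has neither property. The sign of the quadratic form for `beta = 0` or `1` depends on the particular `T`, so growth is possible but not guaranteed in this toy setting.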
The small-but-variable velocities in the upper corner generate (for $\beta = 0$ and $\beta = 1$) a quite unstable ODE from the advection matrix [$M^{-1}N(u)$ to be precise]. The maximum speed in the entire field is ~4.1 (at x = 16.5, y = 3.3) and the maximum in the right-most large eddy of Figure 2.8-1 is ~3.5 (occurring at x = 59.3, y = 6.2). Based on these speeds, the estimated average eddy turnover time is ~16. The sequence of figures in Figure 2.8-2 displays the instability qualitatively via a 'base case' (TR with $\beta = 0$, $\Delta t = 0.10$, and $0 \le t \le 84$), beginning with the IC ($T_0$) in Figure 2.8-2(a), in which the value of $T_{max}$ (20.8) was 'arbitrarily' selected to give $\int T_0 = A = 59.7 \times 12.7 = 758$, so that we could easily monitor 'conservation' of $T$: $\int T/A$ should remain unity. By t = 4 [Figure 2.8-2(c)] the simulation is already 'in trouble' in that the Gaussian has both 'broken up' and is about to penetrate the dividing streamline between cells 2 and 3 to show up in the next cell, a clear violation of physics. Also, as shown in the figure, the violation of $T_{max} = 20.8$ and $T_{min} = 0$ has already occurred. Thus the rest of the experiments will focus on numerics rather than physics, which, of course, is the prime purpose of this example. The 'trail of bad numbers' continues in Figures 2.8-2(d) and (e), and we remark that the clear unstable behavior of the dominant mode does not really appear until time 50 or so, as we will see later via some time histories. But by the end of the run, t = 84 in Figure 2.8-2(f), the 'final' pattern is set: the dominant mode is a growing oscillation, $T(\mathbf{x}, t) \sim f(\mathbf{x})\,e^{(i2\pi/\tau + \lambda)t}$ with $\lambda > 0$; in fact, $\lambda = 0.18$ and $\tau = 1.8$, giving $1/\lambda\Delta t = 55$ steps per 'e-folding' and $\tau/\Delta t = 18$ steps per period, showing a
'nearly' sufficiently small $\Delta t$ for TR to 'track' the solution pretty well. (Also, $e^{\lambda\tau} = 1.4$, a 40% growth per period.)

Next, starting from the large-$T$ [$O(10^6)$] solution at $t = 84$, we 'digress' to demonstrate a remarkable property of TR (symmetry) that is shared by only a few time integrators: complete reversibility in time, as discussed in Section 2.7.3. This property is obviously closely related to its lack of numerical dissipation/lack of spurious damping/neutral stability. Starting at $t = 84$ and reversing the sign of $\Delta t$ (or, what is equivalent in this case, reversing the sign of the velocity), Figures 2.8-3(a) and (b) show two stages in the reverse integration, the first (after 52 time units) corresponding to $t = 32$ in the forward integration and the second (at 80 time units) to $t = 4$. These are to be compared with Figure 2.8-2(e) and (c), respectively. Not shown is the end of the backward integration at $t = -84$, the IC, which looks just like Figure 2.8-2(a) and agrees with it to 'several' digits.

Fig. 2.8-3 (a) and (b) Temperature field at two times during backwards integration via TR, at t = 52 (32) and t = 80 (4); (c) a nodal time history (temperature, x 10^-5, vs. time) during a portion of the backward integration.

The last of our backward TR integration results is shown in Figure 2.8-3(c), the first 7.5 time units of the backward integration, during which
we see, appropriately, a damped sinusoidal oscillation (of the most unstable mode, for small enough time; at 'large' time, other modes become relevant). This is for node '2399,' located one node down from the top and six nodes in from the right ($x = 59.38$, $y = 11.92$), which is certainly close to the 'most unstable' node.

The next series of pictures, Figure 2.8-4, are all at $t = 32$ [cf. Figure 2.8-2(e)] and show how the solutions vary with $\beta$ and the effect of switching from TR to BE. Thus, shown in the figure are: TR for $\beta = 1$ (conserves $\int T$) and $\beta = 1/2$ (conserves $\int T^2$), and BE for all three values of $\beta$, with an additional BE result, Figure 2.8-4(d), using a ten-fold smaller timestep (8400 total steps). Associated with all of the above is a summary of extrema, Table 2.8-1, at both $t = 32$ and at the end of each run ($t = 84$). For each value of $\beta$, the BE and TR results should agree, whether stable or unstable. When $\Delta t$ is reduced, the results should not change 'significantly.' Comparing first $\beta = 0$ at $t = 32$ in the table shows that $\Delta t = 0.01$ for BE is 'almost' good enough [close to TR at $\Delta t = 0.10$; cf. also Figures 2.8-2(e) and 2.8-4(d)], but at $t = 84$ it is far from good enough, giving extrema that are ~100-fold too small. Worse yet is BE at $\Delta t = 0.10$, being ~600-fold too small at $t = 84$. Figure 2.8-5(a) shows that, even though two to three orders of magnitude too small, BE gets the solution 'qualitatively correct.' Another indication that TR is 'correct' is given by comparing the theoretical amplification factor, $f_T = (1 + \lambda\Delta t/2)/(1 - \lambda\Delta t/2)$, with the observed factor, $f_O = (T_n/T_m)^{1/(n-m)}$; the two agree to within about 0.1% for $t >$ ~50, with $f = 1.02$. A similar comparison for the BE run gives rather different theoretical vs observed growth rates, suggesting that the severe damping has so strongly distorted the results that the most unstable mode is not yet clearly present.
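The amplification factors and the reversibility property discussed above can be illustrated on a single scalar mode. The sketch below is only a stand-in for the full problem: it assumes the dominant mode behaves like $dT/dt = \lambda T$ with $\lambda = 0.18 + 2\pi i/1.8$ (built from the growth rate and period quoted in the text) and $\Delta t = 0.10$; the function names are ours, not the authors'.

```python
import cmath
import math


def tr_factor(lam, dt):
    """Trapezoid-rule amplification factor for dT/dt = lam * T."""
    return (1 + lam * dt / 2) / (1 - lam * dt / 2)


def be_factor(lam, dt):
    """Backward-Euler amplification factor for dT/dt = lam * T."""
    return 1 / (1 - lam * dt)


# Values quoted in the text: growth rate 0.18, period 1.8, timestep 0.10.
growth, period, dt = 0.18, 1.8, 0.10
# Assumption: one complex scalar mode stands in for the dominant eigenmode.
lam = complex(growth, 2 * math.pi / period)

f_exact = cmath.exp(lam * dt)
f_tr = tr_factor(lam, dt)
f_be = be_factor(lam, dt)

# One step forward followed by one step with the sign of dt reversed.
round_trip_tr = tr_factor(lam, -dt) * f_tr
round_trip_be = be_factor(lam, -dt) * f_be

print(f"|f| per step: exact {abs(f_exact):.4f}, TR {abs(f_tr):.4f}, BE {abs(f_be):.4f}")
print(f"forward+backward: TR {abs(round_trip_tr):.6f}, BE {abs(round_trip_be):.6f}")
```

Under these assumptions the TR factor has modulus close to the exact growth per step (about 1.02), while the BE factor has modulus below one, so BE damps a mode that should grow. The TR round trip returns the starting value to rounding error because $f_T(-\Delta t) = 1/f_T(\Delta t)$; the BE round trip does not.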
Finally, a factor of two reduction in $\Delta t$ for TR led to a factor of ~2 increase in the extremal values at $t = 84$, suggesting, as mentioned earlier, that $\Delta t = 0.10$ is 'almost' small enough for the TR integrator. (Recall that extreme values are being compared; better accuracy over most of the domain can be expected.) Also noteworthy from the figures and the table is that $\beta = 1$ is actually more 'unstable' (slightly larger $\lambda$, and much earlier growth) than the simpler advective form, even though it conserves $\int T$; conservation of linear quantities guarantees neither stability nor accuracy. Finally, $\beta = 1/2$ gives bounded but (also) totally fallacious results; conservation of $\int T^2$ guarantees stability but not accuracy.

Another measure of BE's quality is how well it does on the reverse integration. Thus, Figures 2.8-5(b) and (c) are to be compared with the TR counterparts, Figures 2.8-3(a) and (b); the difference is significant, both qualitatively and quantitatively.

Finally, some relevant time histories are shown in Figure 2.8-6 for TR and BE. Plotted there is $\int T\,dA/A$ and the natural logarithm of $\int T^2\,dA/A$, with respective values 1.0 and 2.47 at $t = 0$. For $\beta = 0$, TR shows an oscillatory blow-up of $\int T$ [Figure 2.8-6(a)] and a pretty clean exponential, and non-oscillatory, blow-up of $\int T^2$ [Figure 2.8-6(d)]. (We offer an explanation for these counter-intuitive phenomena at the end.) For $\beta = 1$ and $\beta = 1/2$, the predicted theoretical results obtain, with the former showing a slightly larger growth rate of $\int T^2$ than did the advective form ($\beta = 0$); the latter is not shown (it is constant at 2.47). The analogous BE results, Figures 2.8-6(c) and (e), agree less well with theory (with the exception that $\beta = 1$ does conserve $\int T$) because $\Delta t$ is too large; e.g., $\int T^2$ is actually decreasing for $\beta = 1/2$.
These poor BE results are in complete harmony with those in Marx (1994), who found, for one of Rannacher's test problems (Rannacher, 1989) that a 50-fold smaller At was needed by BE to get the same accuracy as TR on ODE's displaying a slowly damped, oscillatory solution.
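The contrast between TR and BE on a non-dissipative system can be reproduced on a very small model. The following sketch assumes a hypothetical 3-node system $M\dot{T} + NT = 0$ with a symmetric positive-definite $M$ and a skew-symmetric $N$ (the situation for $\beta = 1/2$); the matrices, timestep and helper names are illustrative choices, not taken from the authors' code.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def matvec(a, v):
    """Matrix-vector product for nested lists."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def energy(m, t):
    """Discrete 'energy' T^T M T."""
    return sum(ti * mti for ti, mti in zip(t, matvec(m, t)))


def theta_step(m, n_mat, t, dt, theta):
    """One step of M (T1 - T0)/dt + N (theta T1 + (1 - theta) T0) = 0.

    theta = 1/2 is the trapezoid rule (TR); theta = 1 is backward Euler (BE).
    """
    size = len(t)
    lhs = [[m[i][j] / dt + theta * n_mat[i][j] for j in range(size)] for i in range(size)]
    rhs = [[m[i][j] / dt - (1 - theta) * n_mat[i][j] for j in range(size)] for i in range(size)]
    return solve(lhs, matvec(rhs, t))


# Hypothetical data: M symmetric positive definite, N skew-symmetric.
M = [[2.0, 0.5, 0.0], [0.5, 2.0, 0.5], [0.0, 0.5, 2.0]]
N = [[0.0, 1.0, -0.3], [-1.0, 0.0, 0.7], [0.3, -0.7, 0.0]]
T0 = [1.0, 0.0, -0.5]
dt, nsteps = 0.5, 200

t_tr, t_be = T0[:], T0[:]
for _ in range(nsteps):
    t_tr = theta_step(M, N, t_tr, dt, 0.5)
    t_be = theta_step(M, N, t_be, dt, 1.0)

e0, e_tr, e_be = energy(M, T0), energy(M, t_tr), energy(M, t_be)
print(f"energy: initial {e0:.6f}, TR {e_tr:.6f}, BE {e_be:.6f}")
```

In this toy setting TR keeps $T^TMT$ at its initial value for any $\Delta t$, whereas BE steadily drains it, which is the discrete counterpart of the decreasing $\int T^2$ noted above for BE with $\beta = 1/2$.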
Fig. 2.8-4 Temperature fields at t = 32 for six different integrations: (a) TR, $\beta = 1$ ($T_{max} = 831$, $T_{min} = -1124$); (b) TR, $\beta = 1/2$ ($T_{max} = 59.5$, $T_{min} = -70.4$); (c) BE, $\beta = 0$; (d) BE, $\beta = 0$ ($\Delta t = 0.01$; $T_{max} = 132$, $T_{min} = -180$); (e) BE, $\beta = 1$ ($T_{max} = 870$, $T_{min} = -561$); (f) BE, $\beta = 1/2$ ($T_{max} = 47.1$, $T_{min} = -64.6$).
Table 2.8-1 Temperature extrema.

Case | Trapezoid rule $T_{min}$ | Trapezoid rule $T_{max}$ | Backward Euler $T_{min}$ | Backward Euler $T_{max}$
t = 32:
$\beta = 0$ | -182 | 133 | -170 | 130
($\beta = 0$, $\Delta t = 0.01$) | n/a | n/a | (-180) | (132)
$\beta = 1$ | -1124 | 831 | -561 | 870
$\beta = 1/2$ | -70.4 | 59.5 | -64.6 | 47.1
t = 84:
$\beta = 0$ | $-3.2 \times 10^5$ | $3.3 \times 10^5$ | -523 | 493
($\beta = 0$, $\Delta t = 0.01$) | n/a | n/a | $-3.20 \times 10^3$ | $3.00 \times 10^3$
($\beta = 0$, $\Delta t = 0.05$) | $(-6.0 \times 10^5)$ | $(6.7 \times 10^5)$ | n/a | n/a
$\beta = 1$ | $-2.4 \times 10^7$ | $1.3 \times 10^7$ | $-1.65 \times 10^4$ | $2.24 \times 10^4$
$\beta = 1/2$ | -145 | 90 | -80 | -58

Fig. 2.8-5 Some more backward Euler results with $\Delta t = 0.1$: (a) $\beta = 0$ at t = 84; (b) and (c) backward integration starting from the result in (a), at t = 52 (32) and t = 80 (4).

Remark: It may be worthwhile demonstrating (for $\beta = 1/2$) that TR conserves energy for all $\Delta t$ and BE degrades it. Thus, starting from $M\dot{T} + NT = 0$ with $N^T = -N$:
(1) Clearly $\tfrac{1}{2}\,d(T^TMT)/dt = T^TM\dot{T} = 0$ because $T^TNT = 0$ because $N$ is skew-symmetric; i.e., $T^TMT$ is constant: $\int T^2$ is conserved by the ODE's.
Fig. 2.8-6 Time histories of various 'conserved' quantities (time axis from 0 to 90): (a) TR, $\beta = 0$; (b) TR, $\beta = 1/2$; (c) BE, $\beta = 0$, 1 and 1/2; (d) TR, log of integral $T^2/A$, $\beta = 0$ and 1; (e) BE, $\beta = 0$, 1 and 1/2.

(2) For TR, $M(T_{n+1} - T_n)/\Delta t + N(T_n + T_{n+1})/2 = 0$ yields $(T_{n+1} + T_n)^TM(T_{n+1} - T_n)/\Delta t + (T_{n+1} + T_n)^TN(T_{n+1} + T_n)/2 = 0$, giving the desired conservation: $T_{n+1}^TMT_{n+1} = T_n^TMT_n$.
(3) For BE, $M(T_{n+1} - T_n)/\Delta t + NT_{n+1} = 0$ yields $T_{n+1}^TM(T_{n+1} - T_n)/\Delta t + T_{n+1}^TNT_{n+1} = 0$, giving $T_{n+1}^TMT_{n+1} = T_{n+1}^TMT_n = T_n^TMT_{n+1}$. But from $T_{n+1} = (I + \Delta tM^{-1}N)^{-1}T_n = [I - \Delta tM^{-1}N + \Delta t^2(M^{-1}N)^2 + O(\Delta t^3)]T_n = T_n - \Delta tM^{-1}NT_n + \Delta t^2M^{-1}NM^{-1}NT_n + O(\Delta t^3)$, we obtain $T_{n+1}^TMT_{n+1} = T_n^TMT_n - \Delta t\,T_n^TNT_n + \Delta t^2\,T_n^TNM^{-1}NT_n + O(\Delta t^3) = T_n^TMT_n - \Delta t^2(NT_n)^TM^{-1}(NT_n) + O(\Delta t^3)$; BE monotonically decreases the energy because $(NT_n)^TM^{-1}(NT_n) \ge 0$ since $M$ is positive definite.

Can we stabilize the ODE by adding dissipation/numerical diffusion? Should we? The answer to the first question is 'yes', if you change your advection 'operator' to one that is monotonic, a common 'defense' in FDM and FVM, although less common in
FEM. As we have little interest and zero capability in monotone FEM schemes, we will present the closest thing that we do have: streamline upwinding. Although our derivation of it was via BTD in which it was introduced to counter the FE instability of negative diffusion (see Sections 2.1.It and 2.7.5), here we add it to TR simply to try to gain stability without, hopefully, degrading accuracy. Thus we added the BTD coefficient, $u_iu_j\Delta t/2$, as a diffusion term, for the $\beta = 0$ case via TR to see 'what happens.' What happened is this: the ODE's appeared to be stable but were tending toward a completely wrong (and ostensibly steady-state) solution. But in fact they were not stable; they were just more stable than those without BTD. Figure 2.8-7 shows the result at 'near' steady state, $t = 115$.

That the 'perfect' BTD solution would lead to a (wrong) steady state for this situation (closed streamlines, steady velocity field) is easily seen by writing the pure advection 'PDE' + BTD along a streamline: $\partial T/\partial t + u_s\,\partial T/\partial s = (\Delta t/2)(u_s\,\partial/\partial s)^2T$, where $s$ represents a streamline coordinate. This 1D advection-diffusion equation causes diffusion along each streamline, with the ultimate result that a steady state could ultimately be attained in which $T$ is constant on streamlines (different constants on different streamlines, of course, depending only on the IC's). But the discrete approximation to this PDE is only approximately representing 1D advection-diffusion along a streamline. Hence, pollution of the other cells still occurs (group velocity errors) and even instability still occurs.
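The tendency of the continuum advection + BTD equation toward a constant on each closed streamline can be made concrete with a Fourier-mode estimate. The sketch below assumes, as a simplification not made in the text, that the speed $u_s$ is constant along the streamline, so that mode $m$ has wavenumber $2\pi m/(u_s t_e)$ with $t_e$ the turnover time; the speed then cancels and the decay rate is $(\Delta t/2)(2\pi m/t_e)^2$. The values $\Delta t = 0.10$ and $t_e = 16$ are taken from the text; the function name is ours.

```python
import math


def mode_decay(m, t, dt=0.10, turnover=16.0):
    """Amplitude ratio |T_m(t) / T_m(0)| of Fourier mode m on a closed streamline.

    Model: dT/dt + u dT/ds = (dt/2) u^2 d2T/ds2 with constant speed u and
    streamline length u * turnover, so the decay rate is (dt/2) (2 pi m / turnover)^2.
    """
    rate = 0.5 * dt * (2.0 * math.pi * m / turnover) ** 2
    return math.exp(-rate * t)


for m in range(4):
    print(m, round(mode_decay(m, 115.0), 4))
```

Only the mean ($m = 0$) is untouched; every other mode decays, the higher ones fastest, so in this idealization the solution on each streamline drifts toward its own average. With these numbers the first mode is still far from extinguished at $t = 115$, which is at least consistent with the description of that state as 'near' steady.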
Although Figure 2.8-7, and time histories (not shown), show a tendency to display constant-along-streamlines 'temperature,' in fact, the solution eventually became unstable, but with a much lower growth rate ($\lambda = 0.0021$ vis-a-vis $\lambda = 0.18$ without BTD), the instability becoming clearly dominant only at very large $t$ (several thousand); this run was actually accomplished with the variable-step version of TR. (BTD almost stabilized the ODE's for this problem.)

Fig. 2.8-7 Temperature field at 'nearly' steady state (t = 115) via TR plus BTD.

So what else have we learned from this example? What advice can be given to the analyst based on it? Our answers to this type of advection dilemma are the following:
1. In general, stay with the cheapest method ($\beta = 0$) whose accuracy, when stable, is as good as the other two. Also, instabilities of the type shown above are not very common.
2. If an unstable ODE is generated in a particular simulation, first rerun it using $\beta = 1/2$, which should verify that you did have an unstable ODE by being stable. If the associated 'accuracy' looks good, you are lucky.
3. But if your stable results look like 'bounded gibberish,' as they did in our example, go back to your mesh generator and refine the mesh, at least in the region of maximum instability.
4. Suppose you cannot afford to remesh and rerun (you may already be at the computer's capacity in 3D)? We offer three possible solutions to this sad situation: (i) give up! (the
stated problem is simply beyond your resources), (ii) find a friend with a bigger computer, (iii) try streamline (or other) upwinding, perhaps multiplying $u_iu_j\Delta t/2$ by a scalar (10? 100?), but be very suspicious of your results.

To conclude, it may be of some interest to show, for a model problem in 1D, that it is indeed possible (with thanks to A.C. Hindmarsh) for a periodic-in-time (eigenvector) solution to show a complete lack of periodicity in 'energy.' The dominant eigenvalue for the 1D AD equation with a Dirichlet BC on the left and a Neumann BC on the right (outflow) is, from Section 2.6.2d, in the advection-dominated 'limit' ($P \gg 1$),

$\lambda_1 = \dfrac{2\kappa}{h^2}\,(1 - iP\cos\pi/N) \cong 2\kappa/h^2 - iu/h$, (2.8-1)

which yields the temporal behavior $e^{-\lambda_1t} = e^{-2\kappa t/h^2}\,e^{iut/h}$ which, since $P \equiv uh/2\kappa \gg 1$, corresponds to an oscillating and slowly decaying mode [$-\lambda_1t = (ut/h)(-1/P + i)$]. The corresponding eigenvector is

$v_j = (-i)^j\sin j\pi/N, \quad j = 1, 2, \ldots, N$, (2.8-2)

and the solution to the semi-discretized AD equation, assuming this to be the only mode present, is

$T_j(t) = (e^{-\lambda_1t}v_j + e^{-\bar{\lambda}_1t}\bar{v}_j)$, (2.8-3)

which is, as required, real (a real solution requires that the conjugate modes have equal amplitude). Letting $T$ and $v$ be the corresponding $N$-vectors, we seek the behavior of the energy, $E = T^TT$, and obtain

$E = 2e^{-2t\,\mathrm{Re}(\lambda_1)}\,\bar{v}^Tv + 2\,\mathrm{Re}(e^{-2\lambda_1t}\,v^Tv)$, (2.8-4)

where the first term represents monotonic (non-oscillatory) decay, and the second describes oscillatory decay unless $q \equiv v^Tv = 0$. We leave as an exercise (non-trivial) the proof that, in fact, $q = 0$. We presume that an analogous situation has occurred for our TR integration of the $\beta = 0$ unstable ODE.

For a brief survey of the literature on unstable ODE's, refer to Section 2.3.1, in the Remark in the Digression following (2.3-9).

2.8.2 Advection-Diffusion of a Puff (Point Source)

Air pollution studies often involve the elevated release of either a point source or a finite 'puff' of pollutant, the fate of which is of some interest.
A point source is, of course, a mathematical idealization which, with the passage of time, becomes a puff. As a sample result in this regard, Figure 2.8-8 shows a simple simulation of a puff release at $t = 0$ and its subsequent advection and diffusion downwind. Shown are the 33 x 10 mesh of 4-node bilinear elements and both CM and LM results, as well as the exact result (dotted), at several times. The IC is a 2D Gaussian puff with $\sigma = 0.25$ released at $x = 0.625$, $y = 1.0$, the wind speed is $u = 1$, $v = 0$, the (constant in this case, which is not meteorologically appropriate) diffusivity is $\kappa = 0.05$ to give a puff Peclet number, $u\sigma/2\kappa$, of 2.5, showing
a fairly close balance between advection and diffusion (each is important). The BC's are: $T = 0$ at $x = 0$ and along the top, and $\partial T/\partial n = 0$ at $y = 0$ and at the outflow boundary. The analytical solution is that for a point source released earlier ($t = -0.625$) at $x = 0$, $y = 1$, which generates the IC mentioned above. Even with significant diffusion, the lumped mass result is noticeably inferior to that from GFEM, attributable (as always) to numerical dispersion. Also noteworthy is (for CM) the utility of the homogeneous NBC as OBC: $\partial T/\partial x = 0$ at $x = L$.

Fig. 2.8-8 Advection-diffusion of a puff: (a) 33 x 10 mesh of 4-node elements; (b) CM results at t = 0, 2.5, 6.5, and 11.5; (c) LM results at the same times.

Also of interest is $\Delta t$ vs $t$ from a smart integrator (variable step TR), and this is shown in Figure 2.8-9 for the CM case, for a rather 'tight' $\varepsilon$ ($10^{-5}$, used to generate Figure 2.8-8) and one order larger, the results of which are not shown but differed little. In each case, $\Delta t$ increased nearly 20-fold during the simulation as the effect of diffusion made the simulation progressively easier. The fact that the small-$\varepsilon$ run required a little more than twice the total number of steps as the 10-fold larger $\varepsilon$ case is quite consistent with the theory: $10^{1/3} = 2.15$.

2.8.3 The Rotating Cone: A Pure Advection Test Problem

We conclude this chapter by showing one more example in which mass lumping seriously degrades the otherwise high accuracy attainable from GFEM. The popular test problem [independently initiated in 1968 by Molenkamp (1968) and Crowley (1968)] known as
the 'rotating cone' begins by placing a cone (2D 'hat function') on the mesh as an IC in a pure/solid body rotational ($\Omega$ rad/sec) velocity field; $u = -\Omega y$, $v = \Omega x$. The solution is easy to depict in words: it is as if an (empty!) ice cream cone were placed upside-down on a turntable/record player (remember record albums?) and the switch turned on. (A low-speed, e.g., 33 rpm, record player may be needed, to preclude 'slippage' ... Remember 45 rpm? 78 rpm?)

Fig. 2.8-9 Timestep history (TR) for the CM results in Figure 2.8-8 ($\varepsilon = 10^{-5}$; total steps = 306) and for $\varepsilon = 10^{-4}$ (total steps = 148; not shown in Figure 2.8-8).

Figure 2.8-10 shows a one-revolution result for both CM (GFEM) and LM, for a cone whose base diameter is $8\Delta x\ (= 8\Delta y)$; see Figures 2.6-21 and 2.6-27 for 'analogous' 1D versions. The BC's specify $T = 0$ at the four inlet portions of the domain (one half of each boundary) and, of course, no BC at the outflow portions. Time integration was via TR with a very tight $\varepsilon$; all error is spatial. Noteworthy is the accurate GFEM solution with little error in either phase or group velocity and no artificial dissipation, even though the advective form ($\beta = 0$) was used. For further discussion of these results, in which they are compared with eight- and nine-node GFEM as well as with the disastrous lumping of the serendipity element discussed in Section 2.3.4, as well as a comparison with some very good finite difference schemes (Arakawa's second- and fourth-order methods), as well as some spectral comparisons, see Gresho et al. (1978). [Arakawa's famous FDM's are presented in Arakawa (1966), and the equivalence of his second-order method to lumped mass bilinear FEM is shown in Jespersen (1974). Finally, in Gresho et al. (1978), it is shown that GFEM on bilinears
is more accurate than (and not equivalent to) Arakawa's fourth-order FDM, consistent with our previous analyses of phase and group velocities.]

Fig. 2.8-10 Pure advection in 2D; bilinear FEM approximation. Contours are for 0.8, 0.6, 0.4, 0.2 (dotted for exact solution, solid for approximate solution) and -0.2 (dashed) for lumped mass results. Lowest curves are for consistent mass, 1 revolution; rightmost curves are for lumped mass at 1/2 revolution; leftmost at 1 revolution.

We conclude with a comment on reduced quadrature: if one-point quadrature is used on the above problem (giving a 2D 'Box' scheme), then large and fast-moving high-frequency noise emanates ahead of the cone, a result that is probably even worse than that from mass lumping, and is consistent with the analysis presented in Section 2.6.3a.

The reader interested in other test problems for AD is advised to peruse Baptista et al. (1995).
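For readers who wish to set up the rotating-cone test of Section 2.8.3 themselves, the exact solution is simply the IC carried around by the solid-body rotation $u = -\Omega y$, $v = \Omega x$. The sketch below evaluates it by rotating each point back through the angle $\Omega t$. The cone centre, base radius (4 mesh widths for a base diameter of $8\Delta x$ with $\Delta x = 1$), height and $\Omega$ are illustrative assumptions, as are the function names.

```python
import math


def cone(x, y, xc, yc, radius, height=1.0):
    """2D 'hat function': a linear cone of the given base radius centred at (xc, yc)."""
    r = math.hypot(x - xc, y - yc)
    return max(0.0, height * (1.0 - r / radius))


def exact_solution(x, y, t, omega, xc, yc, radius):
    """Exact pure-advection solution in the velocity field u = -omega*y, v = omega*x.

    The field rotates counter-clockwise, so the value at (x, y) at time t is the
    initial value at the point obtained by rotating (x, y) back through omega*t.
    """
    c, s = math.cos(omega * t), math.sin(omega * t)
    x0 = c * x + s * y
    y0 = -s * x + c * y
    return cone(x0, y0, xc, yc, radius)


# Hypothetical setup: cone of base radius 4 centred at (8, 0), omega = 1 rad per time unit.
OMEGA, XC, YC, RADIUS = 1.0, 8.0, 0.0, 4.0
quarter_turn_peak = exact_solution(0.0, 8.0, math.pi / 2, OMEGA, XC, YC, RADIUS)
full_turn_value = exact_solution(9.0, 1.0, 2 * math.pi, OMEGA, XC, YC, RADIUS)
print(quarter_turn_peak, full_turn_value, cone(9.0, 1.0, XC, YC, RADIUS))
```

After a quarter revolution the peak sits a quarter turn ahead of its starting position, and after one revolution the field coincides with the IC, which is the reference against which the CM and LM contours are judged.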
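The Navier-Stokes Equations

3.1 NOTATIONAL INTRODUCTION

Because of the varied background and experience of the readers, and because there are several ways to generate weak forms of vector equations, we shall (for the first, and simplest case) show the details of several equivalent formulations. To do so, it may first be a good idea to carefully explain our notation and conventions with respect to vectors, tensors, dyads, and various 'gradients.' We prefer the invariant bold-face vector notation of Gibbs, and we will use it more often than the index notation (with summation convention on repeated indices) that is preferred by many. For clarity and reduction of ambiguity, we present in Tables (3.1-1) and (3.1-2) a summary of the important 'vector operations' that will be used in the sequel. Vectors are lower case (bold) and second-rank tensors are upper case (bold). With respect to dot products, our convention is to start at the dot (or dots in the case of, e.g., $\mathbf{A}:\mathbf{B}$), look left and right, then dot the first thing you see; repeat as necessary. Examples:

1. $\mathbf{A}:\mathbf{B} = \mathbf{e}_i\mathbf{e}_jA_{ij} : \mathbf{e}_k\mathbf{e}_lB_{kl} = \mathbf{e}_iA_{ij}(\mathbf{e}_j\cdot\mathbf{e}_k)\cdot\mathbf{e}_lB_{kl} = \mathbf{e}_iA_{ij}\delta_{jk}\cdot\mathbf{e}_lB_{kl} = A_{ij}(\mathbf{e}_i\cdot\mathbf{e}_l)B_{jl} = A_{ij}B_{ji}$, where $\delta_{ij}$ is the Kronecker delta and the $\mathbf{e}_i$'s are cartesian base vectors so that, for example, $\mathbf{u} = \mathbf{e}_iu_i = (1, 0, 0)^Tu_1 + (0, 1, 0)^Tu_2 + (0, 0, 1)^Tu_3$.

2. $\mathbf{n}\cdot[(\nabla\mathbf{u})\cdot\mathbf{v}] = \mathbf{e}_in_i\cdot\left[\mathbf{e}_j\dfrac{\partial}{\partial x_j}(\mathbf{e}_ku_k)\cdot\mathbf{e}_lv_l\right] = \mathbf{e}_in_i\cdot\mathbf{e}_j\dfrac{\partial u_k}{\partial x_j}v_k = n_j\dfrac{\partial u_k}{\partial x_j}v_k = \dfrac{\partial\mathbf{u}}{\partial n}\cdot\mathbf{v} = n_ju_{k,j}v_k$.

3. $\mathbf{n}\cdot[(\nabla\mathbf{u})^T\cdot\mathbf{v}] = \mathbf{e}_in_i\cdot\left[\mathbf{e}_j\mathbf{e}_k\dfrac{\partial u_j}{\partial x_k}\cdot\mathbf{e}_lv_l\right] = n_i(\mathbf{e}_i\cdot\mathbf{e}_j)\dfrac{\partial u_j}{\partial x_k}(\mathbf{e}_k\cdot\mathbf{e}_l)v_l = n_jv_k\dfrac{\partial u_j}{\partial x_k} = \mathbf{n}\cdot[\mathbf{v}\cdot(\nabla\mathbf{u})] = \mathbf{n}\cdot[(\mathbf{v}\cdot\nabla)\mathbf{u}] = \mathbf{v}\cdot\nabla(\mathbf{n}\cdot\mathbf{u}) = \mathbf{v}\cdot\nabla u_n$.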
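The 'dot the first thing you see' convention differs from the element-by-element double contraction used by many authors, so a quick numerical check may help. The sketch below implements $\mathbf{A}:\mathbf{B} = A_{ij}B_{ji}$ as in Example 1 and verifies two consequences on small arrays; the helper names and sample numbers are ours.

```python
def double_dot(a, b):
    """A:B with the convention of Example 1, i.e. A_ij B_ji (the trace of A.B)."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))


def transpose(a):
    """Transpose of a nested-list matrix."""
    return [list(row) for row in zip(*a)]


def dyad(u, v):
    """Dyad uv with components u_i v_j."""
    return [[ui * vj for vj in v] for ui in u]


def dot(u, v):
    """Ordinary dot product u_i v_i."""
    return sum(ui * vi for ui, vi in zip(u, v))


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
ab = double_dot(A, B)               # A_ij B_ji
abt = double_dot(A, transpose(B))   # A_ij B_ij

u, v, w, z = [1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 0, 2]
dyads = double_dot(dyad(u, v), dyad(w, z))   # should equal (v.w)(u.z)
print(ab, abt, dyads, dot(v, w) * dot(u, z))
```

Note that it is $\mathbf{A}:\mathbf{B}^T$, not $\mathbf{A}:\mathbf{B}$, that reproduces the element-by-element sum $A_{ij}B_{ij}$; for symmetric tensors the two coincide.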
358 THE NAVIER-STOKES EQUATIONS Table 3.1-1 Vector conventions. Invariant (Gibbs) notation u A Ar U • V{= V- U} U X V{= -V X U} A u{=u A7} Ar u{=u A} A B{=B A} A : B{= B . A} A:Br{=Br : A} u (A • v) = (A • v) u = (u A) v = u A v v (A u) = v A u(2) uv A uv{= (A- u)v} A : uv{= (A- u) v = v A u} uv : wz = (v w)(u z) = Cartesian base-vector notation e,Uj ©/' ®/Ay ©/' ©/A'/ e,ui ■ QjVj = SijUjV,- = UjVj e,u, x QjUj =ekSijkU,u^) e^-Ay ■ ekuk = e,AyU/ e,efAf, -ekUk = e,A/U/ e,e/Ay ■ ekeiBki = e,e/AyB// eiejAjj : ekeiBki = AjjBjj e,e/Ay : eke/Bik =AjjBjj e,u, ■ QjQkAjk ■ eiv, = u,Am \jjA,kiik{= u-AkiVk} eiejUiVj e,e,Ay ■ ekeiukv, = e,e,AyU/W e/e/Ay : e^e/u^v/ = (e/e/Ay • ©a = AgUfV, QiQjUiVj : ekeiWkZ, = uiVjWjZ, M-eiVi Index notation Ui Ail A/ UjVj SijkUiVj AijUj ApUj AjkBkj AijB/i AijBij UjAjjVj UjAijVj U/Vj AjjUjVk AijUjVj UjZiVjWj (1) Ejjk = Alternating tensor: e^ = as /,/', k are or are not in cyclic <2> If A7" = A, then u A v = v = 0 unless i,j,k are all different, in which case e^ = 1 or -1 according order. A u. Table 3.1-2 Derivative conventions. Invariant (Gibbs) notation Cartesian base-vector notation Index notation V(-) V0 v u Vx u vu (Vu)r u-V(-) u-V0 u Vv{= (u u-(W)r{= u • (Vu)r = V A • V)v (Vv)- (Vu). = (Vv)r • u} u} u = Jv(u u) e,a(-)/9*/ e/30/ax/ e,- • d(ejUj)/dXi = dUj/dXj e,- x d{ejU;)/dXj = e,- x ejdUj/dx,- ^ekSijkdUj/dXj ejd(ejUj)/dXi = eiejdiij/dXj e/efdui/dxj eM-e/aoya*/ =u,d(-yax, Ujd(f>/dXi eiUidVj/dXj QjUi ■ d(ejekVj)/dXk = eku,dVi/dXk e;d(ejUj ■ ekuk)/dXi = eiUfdUf/dxt e, • d(ejekAjk)/dXj = e^pc/dx,- (•),/ 0./ UiJ ZijkUjj UiJ UU Ui(-)j Ui<t>J UiVi-i UjVjj uiuiJ Ay,/'
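Table 3.1-2 (continued).

Invariant (Gibbs) notation | Cartesian base-vector notation | Index notation
$\nabla\cdot\mathbf{A}^T$ | $\mathbf{e}_k\,\partial A_{kl}/\partial x_l$ | $A_{ij,j}$
$\mathbf{A}:\nabla\mathbf{u} = \nabla\mathbf{u}:\mathbf{A} = \mathbf{A}^T:(\nabla\mathbf{u})^T = (\nabla\mathbf{u})^T:\mathbf{A}^T$ | $\mathbf{e}_i\mathbf{e}_jA_{ij} : \mathbf{e}_k\mathbf{e}_l\,\partial u_l/\partial x_k = A_{ij}\,\partial u_i/\partial x_j$ | $A_{ij}u_{i,j}$
$\mathbf{A}^T:\nabla\mathbf{u} = \mathbf{A}:(\nabla\mathbf{u})^T = \nabla\mathbf{u}:\mathbf{A}^T = (\nabla\mathbf{u})^T:\mathbf{A}$ | $A_{ij}\,\partial u_j/\partial x_i$ | $A_{ij}u_{j,i}$
$\nabla\cdot(\mathbf{u}\mathbf{v})\ \{= (\nabla\cdot\mathbf{u})\mathbf{v} + \mathbf{u}\cdot\nabla\mathbf{v}\}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j\mathbf{e}_ku_jv_k)/\partial x_i = \mathbf{e}_k\,\partial(u_iv_k)/\partial x_i$ | $(u_iv_j)_{,i}$
$\nabla\mathbf{u}:\nabla\mathbf{v}\ \{= \nabla\mathbf{v}:\nabla\mathbf{u}\}$ | $\mathbf{e}_i\mathbf{e}_j\,\partial u_j/\partial x_i : \mathbf{e}_k\mathbf{e}_l\,\partial v_l/\partial x_k = (\partial u_j/\partial x_i)(\partial v_i/\partial x_j)$ | $u_{j,i}v_{i,j}$
$\nabla\mathbf{u}:(\nabla\mathbf{v})^T = \nabla\mathbf{v}:(\nabla\mathbf{u})^T = (\nabla\mathbf{u})^T:\nabla\mathbf{v} = (\nabla\mathbf{v})^T:\nabla\mathbf{u} = \nabla u_k\cdot\nabla v_k$ (sum on $k$) | $(\partial u_j/\partial x_i)(\partial v_j/\partial x_i)$ | $u_{j,i}v_{j,i}$
$\nabla\cdot(\nabla\phi)\ \{= \nabla^2\phi\}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j\,\partial\phi/\partial x_j)/\partial x_i = \partial^2\phi/\partial x_i^2$ | $\phi_{,ii}$
$\nabla(\nabla\cdot\mathbf{u})$ | $\mathbf{e}_i\,\partial[\mathbf{e}_j\cdot\partial(\mathbf{e}_ku_k)/\partial x_j]/\partial x_i = \mathbf{e}_i\,\partial^2u_j/\partial x_i\partial x_j$ | $u_{j,ji}$
$\nabla\cdot(\nabla\mathbf{u})\ \{= \nabla^2\mathbf{u}\}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j\mathbf{e}_k\,\partial u_k/\partial x_j)/\partial x_i = \mathbf{e}_k\,\partial^2u_k/\partial x_i^2$ | $u_{i,jj}$
$\nabla\cdot(\nabla\mathbf{u})^T\ \{= \nabla(\nabla\cdot\mathbf{u})\}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_j\mathbf{e}_k\,\partial u_j/\partial x_k)/\partial x_i = \mathbf{e}_k\,\partial^2u_i/\partial x_k\partial x_i$ | $u_{j,ji}$
$\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{v})\ \{= \mathbf{u}\cdot\nabla(\nabla\cdot\mathbf{v}) + \nabla\mathbf{u}:\nabla\mathbf{v}\}$ | $\mathbf{e}_i\cdot\partial[\mathbf{e}_ju_j\cdot\mathbf{e}_k\,\partial(\mathbf{e}_lv_l)/\partial x_k]/\partial x_i = u_i\,\partial^2v_k/\partial x_k\partial x_i + (\partial u_j/\partial x_i)(\partial v_i/\partial x_j)$ | $u_iv_{j,ji} + u_{i,j}v_{j,i}$
$\nabla\cdot[\mathbf{u}\cdot(\nabla\mathbf{v})^T]\ \{= \mathbf{u}\cdot\nabla^2\mathbf{v} + (\nabla\mathbf{u}):(\nabla\mathbf{v})^T\}$ | $\mathbf{e}_i\cdot\partial[\mathbf{e}_ju_j\cdot(\mathbf{e}_k\mathbf{e}_l\,\partial v_k/\partial x_l)]/\partial x_i = u_i\,\partial^2v_i/\partial x_j^2 + (\partial u_i/\partial x_j)(\partial v_i/\partial x_j)$ | $u_iv_{i,jj} + u_{i,j}v_{i,j}$
$\nabla\cdot(\mathbf{u}\cdot\mathbf{A})\ \{= \mathbf{u}\cdot(\nabla\cdot\mathbf{A}^T) + \mathbf{A}:\nabla\mathbf{u}\}$ | $\mathbf{e}_i\cdot\partial(\mathbf{e}_ju_j\cdot\mathbf{e}_k\mathbf{e}_lA_{kl})/\partial x_i = u_i\,\partial A_{ij}/\partial x_j + A_{ij}\,\partial u_i/\partial x_j$ | $u_iA_{ij,j} + A_{ij}u_{i,j}$
$\nabla\cdot(\mathbf{A}\cdot\mathbf{u})\ \{= \mathbf{u}\cdot(\nabla\cdot\mathbf{A}) + \mathbf{A}:(\nabla\mathbf{u})^T\}$ | $\mathbf{e}_i\cdot\partial(A_{jk}\mathbf{e}_j\mathbf{e}_k\cdot u_l\mathbf{e}_l)/\partial x_i = u_j\,\partial A_{ij}/\partial x_i + A_{ij}\,\partial u_j/\partial x_i$ | $u_jA_{ij,i} + A_{ij}u_{j,i}$
$\mathbf{n}\cdot[(\nabla\mathbf{u})\cdot\mathbf{v}]\ \{= \mathbf{n}\cdot[\mathbf{v}\cdot(\nabla\mathbf{u})^T] = \mathbf{v}\cdot[(\nabla\mathbf{u})^T\cdot\mathbf{n}] = \mathbf{v}\cdot(\mathbf{n}\cdot\nabla\mathbf{u}) = \mathbf{v}\cdot\partial\mathbf{u}/\partial n\}$ | $\mathbf{e}_in_i\cdot[\mathbf{e}_j\,\partial(\mathbf{e}_ku_k)/\partial x_j\cdot\mathbf{e}_lv_l] = v_kn_j\,\partial u_k/\partial x_j$ | $n_iv_ju_{j,i}$
$\mathbf{n}\cdot[(\nabla\mathbf{u})^T\cdot\mathbf{v}]\ \{= \mathbf{n}\cdot[\mathbf{v}\cdot(\nabla\mathbf{u})] = \mathbf{n}\cdot(\mathbf{v}\cdot\nabla\mathbf{u}) = \mathbf{v}\cdot\nabla(\mathbf{n}\cdot\mathbf{u}) = \mathbf{v}\cdot[(\nabla\mathbf{u})\cdot\mathbf{n}]\}$ | $\mathbf{e}_in_i\cdot[\mathbf{e}_j\mathbf{e}_k\,\partial u_j/\partial x_k\cdot\mathbf{e}_lv_l] = n_jv_k\,\partial u_j/\partial x_k$ | $n_iv_ju_{i,j}$
$\mathbf{u}\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = \tfrac{1}{2}[\nabla\cdot(q^2\mathbf{u}) - q^2\nabla\cdot\mathbf{u}]$ ($q^2 = \mathbf{u}\cdot\mathbf{u}$) | $\tfrac{1}{2}[\mathbf{e}_i\cdot\partial(q^2\mathbf{e}_ju_j)/\partial x_i - q^2\mathbf{e}_i\cdot\partial(\mathbf{e}_ju_j)/\partial x_i]$ | $u_iu_ju_{j,i}$

Note that, except for the invariant notation, our notation and definitions are restricted to non-curved boundaries, a convenient restriction that we shall remove when necessary (e.g., in the treatment of free-surface fluid mechanics). If, in the sequel, certain derivations/manipulations do not quite seem to be transparent/obvious, the reader may often refer to the above results for assistance.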
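Identities such as the last entry of Table 3.1-2 are easy to spot-check numerically. The sketch below compares $\mathbf{u}\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = u_iu_j\,\partial u_j/\partial x_i$ with $\tfrac{1}{2}[\nabla\cdot(q^2\mathbf{u}) - q^2\nabla\cdot\mathbf{u}]$ at one point, using central differences on an arbitrary smooth 2D field. The field, the sample point and the step size are assumptions chosen only for illustration.

```python
import math


def u_field(x, y):
    """Hypothetical smooth 2D velocity field (deliberately not divergence-free)."""
    return (math.sin(x) * math.cos(y), x * y + math.cos(x))


def ddx(f, x, y, h=1e-5):
    """Central-difference approximation of df/dx."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def ddy(f, x, y, h=1e-5):
    """Central-difference approximation of df/dy."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def component(j):
    """Scalar function returning the j-th velocity component."""
    return lambda a, b: u_field(a, b)[j]


def q2(x, y):
    """q^2 = u . u."""
    ux, uy = u_field(x, y)
    return ux * ux + uy * uy


def lhs(x, y):
    """u . (u . grad u) = u_i u_j du_j/dx_i."""
    u = u_field(x, y)
    # grad[i][j] = du_j/dx_i, matching the table's convention for grad u.
    grad = [[ddx(component(j), x, y) for j in range(2)],
            [ddy(component(j), x, y) for j in range(2)]]
    return sum(u[i] * u[j] * grad[i][j] for i in range(2) for j in range(2))


def rhs(x, y):
    """(1/2) [div(q^2 u) - q^2 div u]."""
    div_q2u = (ddx(lambda a, b: q2(a, b) * u_field(a, b)[0], x, y)
               + ddy(lambda a, b: q2(a, b) * u_field(a, b)[1], x, y))
    div_u = ddx(component(0), x, y) + ddy(component(1), x, y)
    return 0.5 * (div_q2u - q2(x, y) * div_u)


left, right = lhs(0.3, 0.7), rhs(0.3, 0.7)
print(left, right)
```

The two sides agree to within the finite-difference error. The same pattern (build both sides from `ddx` and `ddy`) can be reused to check other rows of the table.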
3.2 THE CONTINUUM EQUATIONS (THE PDE'S)

As mentioned in the previous chapter, the Navier-Stokes (NS) equations, which are the governing PDE's for the motion of many fluids, are somewhat similar to the advection-diffusion (AD) equation (with the Reynolds number, $Re$, replacing the Peclet number, $Pe$), but are much more complicated and much more difficult to solve. The reasons for the extra difficulty are related to the following items: (i) they comprise a vector equation in which the associated scalar equations (one for each direction) are intricately coupled to each other; (ii) the equation is inherently non-linear in the advection terms; and (iii) the constraint equation between velocity components and the associated pressure field (div-grad coupling) is not present in advection-diffusion. These additional features cause numerical (and theoretical) solutions to be much more difficult to obtain, and the search for 'optimum elements' (and better numerical methods in general, not just FEM) is a continuing one, as is the search for a global existence theorem.

For the NS equations, even the simple diffusion-dominated limit of so-called Stokes flow ($Re \to 0$; analogous to $Pe \to 0$ in AD) is sometimes very complex even though the equations are still elliptic for steady flows, and parabolic (with still an elliptic part) for time-dependent ones; examples are: (i) the occurrence of reverse flows (separated flows containing recirculating eddies), and (ii) the very large gradients in pressure and velocity that are often generated near sharp 'corners' (singularities). The other limit, advection-dominated flows ($Re \gg 1$), is especially difficult, as the principal 'actors' in the momentum equation are then advection and the pressure gradient, with diffusion being nearly negligible except near solid boundaries or internal shear layers.
The final difficulty of the NS equations worthy of mention in this introduction is their inherent instability: at a sufficiently large $Re$, any previously stable laminar flow will undergo 'transition' to turbulence or near-turbulence (semi-chaotic laminar flow), another consequence of the inherent non-linearity. Further increases of $Re$ will ultimately lead to fully developed turbulence, in which the fluid behavior tends to become stochastic (as opposed to deterministic) in nature, with an extremely wide range of 'characteristic eddy sizes' and concomitant time scales which essentially defy detailed numerical simulations. In this chapter we address the more modest goals of solving the NS equations for laminar flows only. (A chapter in Volume II will address turbulence and methods of modeling it.) In the remainder of this chapter we will set the stage for FEM approximations to the NS equations and present some effective solution methods.

There are several ways to express the partial differential equations of motion (conservation of momentum) and continuity (conservation of mass) for a constant property Newtonian fluid, in which shear stress is proportional to the rate of strain. The fundamental formulation is often referred to as the 'primitive variables' equations, which we emphasize herein, partly owing to space limitations, partly owing to our desire to use a common approach for 2D and 3D, and partly because of our predisposition and experience. The NS equations, and the associated (mass) continuity equation, written in terms of the primitive variables, velocity and pressure, are generally and most efficiently/compactly expressed as

$\rho\left(\dfrac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) = \rho\,\dfrac{D\mathbf{u}}{Dt} = -\nabla P + \mu\nabla^2\mathbf{u}$, (3.2-1)

and

$\nabla\cdot\mathbf{u} = 0$, (3.2-2)
where $\mathbf{u}$ is the velocity (with cartesian components $u$, $v$ in 2D and $u$, $v$, $w$ in 3D), $P$ is the pressure deviation from hydrostatic [hence, no gravitational body force term in (3.2-1)], and $\rho$ and $\mu$ are the fluid density and viscosity, respectively (assumed henceforth to be constant).

Remarks:
(1) Why is (3.2-2) referred to as the 'continuity equation'? After noting/recalling that its more general form is $\partial\rho/\partial t + \nabla\cdot(\rho\mathbf{u}) = 0$, we quote first from Batchelor (1969), "It has been called the 'equation of continuity' for many years, although not for any evident good reason.", and then from Panton (1984), "The equation derived in this section has been called the continuity equation to emphasize that the continuum assumptions (the assumption that density and velocity may be defined at every point in space) are a prerequisite." But he goes on to say, properly, "The continuum assumption is, of course, a foundation for all the basic laws." Thus, we side with Batchelor.
(2) The implications of the 'simplification' of the mass conservation equation are actually quite profound, both 'theoretically' (PDE theory, functional analysis) and numerically; indeed, the simplification comes at a higher cost than many might imagine, before actually 'diving in.' In many ways, the incompressible NS equations are more difficult to 'solve' than their compressible progenitors, especially for 'neophytes' who often believe otherwise. For more discussion on the omnipotence of $\nabla\cdot\mathbf{u} = 0$ and its effects, see Gresho (1991a,b, 1992).
(3) While not intended to be obvious, (3.2-2) is/can be interpreted as 'the equation for pressure', and we shall do so, many times in fact.

Except for the essential (and quadratic) non-linearity of the advection term, $\mathbf{u}\cdot\nabla\mathbf{u}$, and the (implicit) presence of the velocity-pressure coupling through $\nabla\cdot\mathbf{u} = 0$, the NS equations share many of the features of the AD equation of the previous chapter.
While the additional term, $\nabla P$, appears to be a simple body force (acceleration) in Newton's second law, it is actually that plus quite a bit more; $P$ also plays the role of a Lagrange multiplier for the incompressibility constraint by 'adjusting itself,' instantaneously in time (related to the infinite speed of 'sound' in an incompressible medium) and everywhere in space so that $\nabla\cdot\mathbf{u} = 0$ everywhere (including the boundary) and for all time. More details will be given later and additional discussion of $P$ is presented at the very end of this chapter.

Actually, it is the combination of nonlinearity and the pressure-velocity coupling that makes the NS equations difficult (if not impossible, in general) to solve. If either is absent, the equations are much simpler and are known to have solutions (Galdi 1996), the limiting cases being Stokes flow (see Section 3.7.1) and the so-called Burgers equation, $\rho(\partial\mathbf{u}/\partial t + \mathbf{u}\cdot\nabla\mathbf{u}) = \mu\nabla^2\mathbf{u}$, respectively. [Another interesting item from these 'Short course' notes of Galdi (1996) that distinguishes the two extremes of fluid mechanics is this: 'Fluid Dynamicists are divided into Hydraulic Engineers who observe what cannot be explained and mathematicians who explain things that cannot be observed', which he attributes to Sir Cyril Hinshelwood.]

Similar to the AD equation, there are two useful non-dimensional forms for NS, depending on how time and pressure are measured, and we emphasize immediately that pressure must always be measured in units such that the resulting dimensionless equation can never cause $\nabla P$ to vanish in any of the allowable limiting cases. (If it did, so too would the constraint $\nabla\cdot\mathbf{u} = 0$.) Introducing $u_0$ as the characteristic velocity, $L$ as the characteristic length, $\tau$ as the characteristic time, and $P_c$ as the characteristic pressure
gives, upon introducing the Reynolds number,

$Re = \rho u_0L/\mu$, (3.2-3)

(the ratio of advective to diffusive momentum transport), the following two forms:

(i) $\dfrac{\partial\mathbf{u}}{\partial t} + Re\,\mathbf{u}\cdot\nabla\mathbf{u} = -\nabla P + \nabla^2\mathbf{u}$, (3.2-4)

for the case where $\tau = \rho L^2/\mu$ (wherein $P_c = \mu u_0/L$), and

(ii) $\dfrac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} = -\nabla P + \dfrac{1}{Re}\nabla^2\mathbf{u}$, (3.2-5)

when $\tau = L/u_0$ (wherein $P_c = \rho u_0^2$). As for AD, (3.2-4) is more appropriate in the low $Re$ regime ($Re \to 0$ via $u_0 \to 0$) and (3.2-5) is better when dealing with the advection-dominated, large $Re$ regime (usually 'realized' as $Re \to \infty$ via $\mu \to 0$). For Stokes flow, $Re = 0$ in (3.2-4).

Remarks:
(1) In the $Re \to 0$ limit, it may seem awkward that the characteristic pressure 'vanishes' with $u_0$, and it is. The best 'rationalization' here seems to be that $u_0 \to 0$ should not be taken all the way to the limit; instead, one argues that the advection term, varying like $u_0^2$, becomes negligibly small and may be safely neglected when $Re \ll 1$. (This enigma is probably related to the so-called Stokes paradox; see, for example, Panton, 1984.)
(2) Most of the 'general characteristics' of AD problems, discussed in Section 2.1.2, regarding degree of difficulty of a simulation, carry over to the NS equations, except the last (fifth): pure advection. In the 'hydrodynamics' limit of pure advection, we have the inviscid ($\mu = 0$) incompressible Euler equations, whose solution (numerical, at least) is often more than difficult; it is (we believe) virtually impossible in the general case. For the steady case, these slippery equations admit too many 'solutions.'

Many CFD codes and publications on the numerical solution of the NS equations were written with less than full understanding of the goal, with the result that many results fell short of the mark in one way or another (or in many ways). What is the goal?
The first goal in the numerical solution of PDE's in general and CFD in particular (with incompressible NS equations being more 'particular' yet) is to understand as much as possible about the PDE's, and their solutions, whose approximate solution is sought. And these depend, in part, on the answers to the following questions: What are the proper (and improper) BC's and IC's? When is the problem well-posed? And when is it ill-posed? Our approach is to answer many of these questions regarding the first goal before launching into the task of generating approximate numerical solutions.

3.3 ALTERNATE FORMS OF THE VISCOUS TERM

There are many equivalent ways to represent both the viscous terms and the advection terms in the NS equations, many of which are based on the omnipotent constraint equation, $\nabla\cdot\mathbf{u} = 0$. These alternate representations, all of which are equivalent in the continuum, lead to semi-discrete (continuous time, discrete space) equations that are generally not equivalent, and which sometimes offer advantages over the simplest (conventional) form
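of the momentum equation presented above. In this section we focus on the viscous terms, and in the next on the advection terms.

3.3.1 Stress-Divergence Form

This form is considered fundamental by many, especially those who came to FEM in fluid mechanics with a knowledge of FEM in solid mechanics. Indeed, the form presented in (3.2-1) is actually derived from the stress-divergence form, which is
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) = \nabla\cdot\boldsymbol{\sigma} = \nabla\cdot(\boldsymbol{\tau} - P\mathbf{I}), \tag{3.3-1}$$
where $\boldsymbol{\tau} = \mu[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]$ or, equivalently,
$$\tau_{ij} = \mu\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right) = \mu(u_{i,j} + u_{j,i}) = 2\mu\,\epsilon_{ij}, \tag{3.3-2}$$
and $\mathbf{I}$ is the identity tensor. $\boldsymbol{\tau}$ is the viscous stress tensor (or deviatoric stress tensor), whose divergence is $\mu[\nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u})]$, which is just $\mu\nabla^2\mathbf{u}$ when $\nabla\cdot\mathbf{u} = 0$, thus recovering (3.2-1) and indeed justifying it, since only (3.3-1) is a true momentum balance equation. Here $\boldsymbol{\sigma}$ is the total stress tensor and $\boldsymbol{\epsilon}$ is the strain rate tensor. It is worth repeating, however, that the discretized versions of these two equations are not identical, usually.

The reason that (3.3-1) is often preferred to the simpler form, (3.2-1), is related to weak forms and natural boundary conditions, which we summarize now and derive later: only the stress-divergence form leads to NBC's that represent true (physical) forces; and they are (for non-curved boundaries)
$$\mathbf{n}\cdot\boldsymbol{\sigma} = \boldsymbol{\sigma}\cdot\mathbf{n} = \boldsymbol{\tau}\cdot\mathbf{n} - P\mathbf{n} = \mu[\mathbf{n}\cdot\nabla\mathbf{u} + \nabla(\mathbf{n}\cdot\mathbf{u})] - P\mathbf{n} = \mathbf{F}, \tag{3.3-3}$$
where $\mathbf{F}$ is the applied force (traction) on/at the boundary. For curved boundaries, the more general form of the viscous stress must be employed: $\mu\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] = \mu[\mathbf{n}\cdot\nabla\mathbf{u} + (\nabla\mathbf{u})\cdot\mathbf{n}]$.

3.3.2 Div-Curl Form

Using the vector identity $\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}$, (3.2-1) can be rewritten as
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) = -\nabla P + \mu[\nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}], \tag{3.3-4}$$
where we note (again) that $\nabla\cdot\mathbf{u} = 0$ in the continuum and also that the vorticity has entered the equation; i.e.,
$$\boldsymbol{\omega} = \nabla\times\mathbf{u} \tag{3.3-5}$$
is the vorticity.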
Again the reason for even considering (3.3-4) is related to NBC's associated with the weak formulation, and will be further discussed later.
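The vector identity behind (3.3-4) can be checked in a few lines of index notation. The following is only a sketch, assuming Cartesian coordinates, the summation convention, and the Levi-Civita symbol $\varepsilon_{ijk}$ (notation introduced here for illustration, not taken from the text above):

```latex
\begin{align*}
[\nabla\times(\nabla\times\mathbf{u})]_i
  &= \varepsilon_{ijk}\,\partial_j\,(\varepsilon_{klm}\,\partial_l u_m) \\
  &= (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\,\partial_j\partial_l u_m \\
  &= \partial_i(\partial_m u_m) - \partial_j\partial_j u_i
   = [\nabla(\nabla\cdot\mathbf{u}) - \nabla^2\mathbf{u}]_i ,
\end{align*}
so that $\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}$.
```

In the continuum the first term on the right vanishes with $\nabla\cdot\mathbf{u} = 0$; it is retained in (3.3-4) only because of its effect on the natural boundary conditions.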
3.3.3 Curl Form

The last variation on the viscous theme is to indeed invoke $\nabla\cdot\mathbf{u} = 0$ in (3.3-4) to obtain
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) = -\nabla P - \mu\nabla\times\nabla\times\mathbf{u}, \tag{3.3-6}$$
which leads to yet another NBC, slightly different from that of (3.3-4). Details follow later.

3.4 ALTERNATE FORMS OF THE NON-LINEAR TERM

Here we return to (3.2-1) but focus on the advection term, $\mathbf{u}\cdot\nabla\mathbf{u}$, and derive four alternate forms, each of which displays certain advantages.

3.4.1 Divergence Form

From $\nabla\cdot(\mathbf{u}\mathbf{u}) = \mathbf{u}\cdot\nabla\mathbf{u} + \mathbf{u}(\nabla\cdot\mathbf{u})$ and (3.2-2), where $\mathbf{u}\mathbf{u}$ is the dyadic product (see Table 3.1-1), (3.2-1) can be rewritten as
$$\rho\left[\frac{\partial\mathbf{u}}{\partial t} + \nabla\cdot(\mathbf{u}\mathbf{u})\right] = -\nabla P + \mu\nabla^2\mathbf{u}, \tag{3.4-1}$$
which, like the analogous AD equation of Chapter 2, we call a divergence form (of advection). The attributes associated with this form are probably slight, if present at all. If, however, this divergence form is combined with the stress-divergence form, we obtain the total divergence form,
$$\rho\frac{\partial\mathbf{u}}{\partial t} = \nabla\cdot(\boldsymbol{\sigma} - \rho\mathbf{u}\mathbf{u}), \tag{3.4-2}$$
which offers two advantages when discretized via GFEM: (i) it leads easily to the proper overall/global momentum balance (a global force balance), and (ii) a form of its NBC can be useful if a BC of specified total momentum flux, $\mathbf{n}\cdot(\rho\mathbf{u}\mathbf{u} - \boldsymbol{\sigma})$, is desired or appropriate at a boundary.

3.4.2 Rotational Form

Here we begin with the vector identity,
$$\mathbf{u}\cdot\nabla\mathbf{u} = \tfrac{1}{2}\nabla(\mathbf{u}\cdot\mathbf{u}) - \mathbf{u}\times\nabla\times\mathbf{u}, \tag{3.4-3}$$
noting that $\tfrac{1}{2}\rho\,\mathbf{u}\cdot\mathbf{u} = \tfrac{1}{2}\rho q^2$ is the dynamic pressure, and insert the results into (3.3-6) to obtain, using (3.3-5),
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \boldsymbol{\omega}\times\mathbf{u}\right) = -\nabla P_T - \mu\nabla\times\boldsymbol{\omega}, \tag{3.4-4}$$
where $P_T = P + \rho q^2/2$ is the total (Bernoulli) pressure, which we call the rotational form of the NS equations.
Some potentially useful aspects of this formulation, pointed out by Gresho (1991b), are:
1. If the flow is irrotational ($\boldsymbol{\omega} = 0$) or nearly so, then even if the Reynolds number is large and advection is 'dominant,' this formulation has subsumed most of the non-linearity into the (linear) pressure term and leads to a better outflow BC; i.e., $P + \tfrac{1}{2}\rho q^2$ = constant. Details later.
2. Even if the vorticity is large, it is sometimes the case (e.g., for coherent structures in turbulent flow; see Frisch and Orszag, 1990) that it is nearly aligned with the velocity; i.e., it has close to the same direction as $\mathbf{u}$, so that $\boldsymbol{\omega}\times\mathbf{u}$, the remaining portion of the advection term, is small.
3. Noting that the combined system of equations (3.2-2), (3.3-5), and (3.4-4) displays no derivatives higher than first order, the door is opened to other methods, such as the method of least squares. More on this later.
4. It seems to offer the possibility of generating a skew-symmetric advection matrix (see below).

In closing this section, it is worth pointing out that the rotational form also displays a disadvantage: more degrees of freedom to compute (i.e., velocity, pressure, and vorticity).

3.4.3 Skew-Symmetric Form

The next 'trick' is to consider rewriting the advection term in (3.2-1) as a linear combination (average) of the advective form and the divergence form (as already done for the scalar transport equation in Chapter 2, Section 2.2.4), $\tfrac{1}{2}[\nabla\cdot(\mathbf{u}\mathbf{u}) + \mathbf{u}\cdot\nabla\mathbf{u}] = \mathbf{u}\cdot\nabla\mathbf{u} + \tfrac{1}{2}\mathbf{u}(\nabla\cdot\mathbf{u})$, which gives the following momentum equation:
$$\rho\left[\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \tfrac{1}{2}\mathbf{u}(\nabla\cdot\mathbf{u})\right] = -\nabla P + \mu\nabla^2\mathbf{u}. \tag{3.4-5}$$
This form was introduced by Temam (1968; see also Temam, 1984) in order to be able to prove that the numerical calculations would be stable. This is related, for NS as it was for AD, to the fact that this form leads to a skew-symmetric advection matrix, which guarantees quadratic conservation (of kinetic energy in this case). Again, we will present more details later.
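The link between the averaged form and a skew-symmetric matrix can be illustrated with a deliberately simple stand-in. The sketch below is not the GFEM discretization discussed in this book; it assumes a scalar field `c` on a periodic, uniform 1D grid with second-order central differences, and an arbitrary advecting velocity `u` that is not divergence-free. All names are hypothetical. It builds the advective and divergence forms of the discrete advection operator, averages them, and evaluates the quadratic form $c^T A c$ (the rate of change of $\tfrac{1}{2}\sum c_i^2$ due to advection).

```python
import math

def advection_matrices(u, h):
    """Periodic central-difference matrices for u*dc/dx and d(u*c)/dx."""
    n = len(u)
    a_adv = [[0.0] * n for _ in range(n)]
    a_div = [[0.0] * n for _ in range(n)]
    for i in range(n):
        ip, im = (i + 1) % n, (i - 1) % n
        # advective form: u_i * (c_{i+1} - c_{i-1}) / (2h)
        a_adv[i][ip] += u[i] / (2.0 * h)
        a_adv[i][im] -= u[i] / (2.0 * h)
        # divergence form: (u_{i+1} c_{i+1} - u_{i-1} c_{i-1}) / (2h)
        a_div[i][ip] += u[ip] / (2.0 * h)
        a_div[i][im] -= u[im] / (2.0 * h)
    return a_adv, a_div

def quadratic_form(a, c):
    """Return c^T a c."""
    n = len(c)
    return sum(c[i] * a[i][j] * c[j] for i in range(n) for j in range(n))

n = 16
h = 2.0 * math.pi / n
u = [1.0 + 0.5 * math.sin(i * h) for i in range(n)]   # assumed velocity, du/dx != 0
c = [1.0 + 0.5 * math.cos(i * h) for i in range(n)]   # assumed transported field

a_adv, a_div = advection_matrices(u, h)
a_skew = [[0.5 * (a_adv[i][j] + a_div[i][j]) for j in range(n)] for i in range(n)]

# Largest entry of (A + A^T): zero (to round-off) for a skew-symmetric matrix.
skew_defect = max(abs(a_skew[i][j] + a_skew[j][i]) for i in range(n) for j in range(n))
energy_rate_skew = quadratic_form(a_skew, c)   # vanishes for the averaged form
energy_rate_adv = quadratic_form(a_adv, c)     # need not vanish for the advective form
```

In this toy setting the averaged operator conserves the quadratic quantity whether or not the discrete velocity is divergence-free, which is the property the skew-symmetric form is designed to deliver.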
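Another quadratically conserving form is that used by Heywood et al. (1996); it uses the conventional (Laplacian) form for the viscous term but uses the vector identity $\boldsymbol{\omega}\times\mathbf{u} = \mathbf{u}\cdot\nabla\mathbf{u} - (\nabla\mathbf{u})\cdot\mathbf{u} = \mathbf{u}\cdot\nabla\mathbf{u} - \mathbf{u}\cdot(\nabla\mathbf{u})^T$ in (3.4-4) to avoid needing a curl operation. It is
$$\rho\left[\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} - (\nabla\mathbf{u})\cdot\mathbf{u}\right] = -\nabla P_T + \mu\nabla^2\mathbf{u}. \tag{3.4-6}$$

3.4.4 A Symmetric Form

In a very interesting recent pair of papers (Gellert and Harbord, 1987, and Harbord and Gellert, 1990), the construction of a symmetric advection operator was put forth. It begins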
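with the same vector identity used above,
$$\mathbf{u}\cdot\nabla\mathbf{u} = \mathbf{u}\cdot(\nabla\mathbf{u})^T + (\nabla\times\mathbf{u})\times\mathbf{u} = (\nabla\mathbf{u})\cdot\mathbf{u} + \boldsymbol{\omega}\times\mathbf{u}. \tag{3.4-7}$$
Using again $\mathbf{u}\cdot\nabla\mathbf{u} = (\nabla\mathbf{u})^T\cdot\mathbf{u}$ and forming the average of these two expressions yields
$$\mathbf{u}\cdot\nabla\mathbf{u} = \tfrac{1}{2}\{[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{u} + \boldsymbol{\omega}\times\mathbf{u}\}, \tag{3.4-8}$$
which is at least partially symmetric. But the above authors also show that the following symmetric form of $\boldsymbol{\omega}\times\mathbf{u}$ exists:
$$\boldsymbol{\omega}\times\mathbf{u} = \mathbf{Z}(\mathbf{u})\cdot\mathbf{u} = \mathbf{Z}^T(\mathbf{u})\cdot\mathbf{u}, \tag{3.4-9}$$
where the non-linear operator/matrix $\mathbf{Z}(\mathbf{u})$ is given by
$$\mathbf{Z} = \begin{bmatrix}
\omega_2\sin 2\varphi_2 - \omega_3\sin 2\varphi_3 & \omega_3\cos 2\varphi_3 & \omega_2\cos 2\varphi_2 \\
\omega_3\cos 2\varphi_3 & \omega_3\sin 2\varphi_3 - \omega_1\sin 2\varphi_1 & \omega_1\cos 2\varphi_1 \\
\omega_2\cos 2\varphi_2 & \omega_1\cos 2\varphi_1 & \omega_1\sin 2\varphi_1 - \omega_2\sin 2\varphi_2
\end{bmatrix} \tag{3.4-10}$$
and the 'angles' are rather complicated functions of the velocity:
$$\begin{array}{c|cccc}
i & \sin\varphi_i & \cos\varphi_i & \sin 2\varphi_i & \cos 2\varphi_i \\ \hline
1 & \dfrac{u_3}{\sqrt{u_2^2+u_3^2}} & \dfrac{u_2}{\sqrt{u_2^2+u_3^2}} & \dfrac{2u_2u_3}{u_2^2+u_3^2} & \dfrac{u_2^2-u_3^2}{u_2^2+u_3^2} \\
2 & \dfrac{u_1}{\sqrt{u_3^2+u_1^2}} & \dfrac{u_3}{\sqrt{u_3^2+u_1^2}} & \dfrac{2u_3u_1}{u_3^2+u_1^2} & \dfrac{u_3^2-u_1^2}{u_3^2+u_1^2} \\
3 & \dfrac{u_2}{\sqrt{u_1^2+u_2^2}} & \dfrac{u_1}{\sqrt{u_1^2+u_2^2}} & \dfrac{2u_1u_2}{u_1^2+u_2^2} & \dfrac{u_1^2-u_2^2}{u_1^2+u_2^2}
\end{array}$$
In 2D, this simplifies to (using $\omega = \omega_3 = \partial v/\partial x - \partial u/\partial y$)
$$\mathbf{Z} = \frac{\omega}{u^2+v^2}\begin{bmatrix} -2uv & u^2-v^2 \\ u^2-v^2 & 2uv \end{bmatrix}. \tag{3.4-11}$$
Thus, a symmetric advection operator is
$$\mathbf{u}\cdot\nabla\mathbf{u} = \tfrac{1}{2}[\nabla\mathbf{u} + (\nabla\mathbf{u})^T + \mathbf{Z}(\mathbf{u})]\cdot\mathbf{u} = \mathbf{A}(\mathbf{u})\cdot\mathbf{u}, \tag{3.4-12}$$
with $\mathbf{A}^T(\mathbf{u}) = \mathbf{A}(\mathbf{u})$, an identity that can be verified by direct calculation. It also follows that a GFEM approximation of this symmetric advection form generates a symmetric advection matrix, thus opening the door to methods that work well on (or require) symmetric matrices. But it may be no panacea, considering the following:

Remarks:
(1) While symmetric, $\mathbf{A}(\mathbf{u})$ is indefinite; it has both positive and negative eigenvalues.
(2) Conjugate gradient-like methods (especially, perhaps, the minimum residual method) might work well on this matrix.
(3) Its Jacobian (functional derivative, $\partial\mathbf{A}/\partial\mathbf{u}$) is unsymmetric, which would reduce its potential effectiveness if a Newton method is invoked after spatial discretization.

Nevertheless, the concept is quite interesting and probably worth further development.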
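The 'direct calculation' mentioned for the symmetric form is easy to do numerically. The sketch below builds $\mathbf{Z}(\mathbf{u})$ from the double-angle expressions in the form $\sin 2\varphi_1 = 2u_2u_3/(u_2^2+u_3^2)$, $\cos 2\varphi_1 = (u_2^2-u_3^2)/(u_2^2+u_3^2)$, and cyclically for $\varphi_2$, $\varphi_3$, then compares $\mathbf{Z}\cdot\mathbf{u}$ with $\boldsymbol{\omega}\times\mathbf{u}$ for arbitrarily chosen test vectors. The function names and the sample numbers are hypothetical, and $\boldsymbol{\omega}$ is treated here as an independent vector rather than being computed as the curl of a velocity field.

```python
def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def z_matrix(omega, u):
    """Symmetric matrix Z(u) with Z.u = omega x u.

    Undefined where two velocity components vanish together (zero denominator).
    """
    u1, u2, u3 = u
    w1, w2, w3 = omega
    s1, c1 = 2 * u2 * u3 / (u2**2 + u3**2), (u2**2 - u3**2) / (u2**2 + u3**2)
    s2, c2 = 2 * u3 * u1 / (u3**2 + u1**2), (u3**2 - u1**2) / (u3**2 + u1**2)
    s3, c3 = 2 * u1 * u2 / (u1**2 + u2**2), (u1**2 - u2**2) / (u1**2 + u2**2)
    return [[w2 * s2 - w3 * s3, w3 * c3, w2 * c2],
            [w3 * c3, w3 * s3 - w1 * s1, w1 * c1],
            [w2 * c2, w1 * c1, w1 * s1 - w2 * s2]]

u = [0.7, -1.3, 2.1]        # arbitrary test velocity (assumed values)
omega = [0.4, 1.9, -0.6]    # arbitrary test vorticity (assumed values)

z = z_matrix(omega, u)
z_dot_u = [sum(z[i][j] * u[j] for j in range(3)) for i in range(3)]
omega_cross_u = cross(omega, u)

residual = max(abs(z_dot_u[i] - omega_cross_u[i]) for i in range(3))
symmetry_defect = max(abs(z[i][j] - z[j][i]) for i in range(3) for j in range(3))
```

Note that the entries of $\mathbf{Z}$ are singular wherever two velocity components vanish simultaneously, which is one more practical point to weigh alongside the remarks above.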
3.5 DERIVED EQUATIONS

The various forms of the NS equations discussed thus far are all primitive variable ($\mathbf{u}$-$P$) forms. But there are sometimes good reasons for considering alternate (and, hopefully and usually, equivalent) forms that are less 'primitive'; they are derived from the primitive variables. The forms to be considered below manage (usually) to bypass the continuity equation, $\nabla\cdot\mathbf{u} = 0$, although they do (and must) imply the same mass conservation. They are derived (in part) by differentiation, a process that introduces higher-order equations and additional problems; the $\nabla\cdot\mathbf{u} = 0$ 'problem' is traded for other problems. They are also often useful for generating alternative numerical methods of solution. We introduce these derived equations in this section, deferring discussion of their numerical solution to later sections.

3.5.1 The Pressure Poisson Equation (PPE)

The PPE is an equation that is implied by the $\mathbf{u}$-$P$ equations and derived therefrom by taking the divergence of one form or another of the momentum equation and invoking the (mass) continuity equation. Actually the PPE exists (is implied) only under the conditions/assumption that the $\mathbf{u}$-$P$ solution is sufficiently smooth so that the divergence of the momentum equation makes sense, which need not always be true. It thus follows that the $\mathbf{u}$-$P$ formulation can admit a larger class of solutions (wherein only first derivatives of $P$ and second derivatives of $\mathbf{u}$ need exist). Paradoxically, in Section 3.10.5, we will discuss a flip-side to this issue wherein the PPE formulation can display more solutions than can the $\mathbf{u}$-$P$ formulation, but with the following important difference: the extra solutions of the PPE system are spurious in that they are not solutions of the $\mathbf{u}$-$P$ system.

Let us begin with the advective/stress-divergence form given in (3.3-1), to which we add a body force term, $\rho\mathbf{g}$, for generality:
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) + \nabla P = \mu\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \rho\mathbf{g} = \mu[\nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u})] + \rho\mathbf{g}. \tag{3.5-1}$$
Remark: $\mathbf{g}(\mathbf{x}, t)$ is meant to describe any given acceleration (forcing) term, not just 'gravity', although that is a simple special case.

Assuming sufficient regularity (i.e., that all required derivatives actually exist), the divergence of this vector equation yields the first form of the (scalar) PPE, namely,
$$\nabla^2\tilde{P} = \nabla\cdot\{\nu[\nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u})] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t\},$$
where $\nu = \mu/\rho$ is the kinematic viscosity and $\tilde{P} = P/\rho$ is called the kinematic pressure. In the best of all worlds, we can invoke $\nabla\cdot\mathbf{u} = 0$ in 'many' places to obtain a useful 'working version' of the PPE. (That above is not useful, because of the presence of the acceleration.) Thus, (i) $\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u} = -\nabla\times\nabla\times\mathbf{u}$ and $\nabla\cdot(\nabla\times\nabla\times\mathbf{u}) = 0$, and (ii) $\nabla\cdot\partial\mathbf{u}/\partial t = \partial(\nabla\cdot\mathbf{u})/\partial t$ and, since we want/assume $\nabla\cdot\mathbf{u} = 0$ for all time,
this term also vanishes. The final (first) form of the PPE is thus
$$\nabla^2\tilde{P} = \nabla\cdot(\mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}), \tag{3.5-2}$$
an equation that, with (3.5-1), constitutes the PPE system and can, with sufficient care, also be used to solve the NS equations, but not always uniquely (more on this later). The PPE that 'works best' is that in which the (seemingly zero) viscous term is retained:
$$\nabla^2\tilde{P} = \nabla\cdot(\mathbf{g} + \nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u}), \tag{3.5-3}$$
which we shall bless with the name consistent PPE (CPPE), for reasons that will be made clear later.

Remarks:
(1) We shall return to the ill-posedness of (3.5-2) and the well-posedness of (3.5-3) after completing the discussion of boundary conditions and initial conditions in Section 3.10.
(2) Other forms (perhaps slightly more efficient) of the PPE are also possible: (i) via the identity $\nabla\cdot[\mathbf{u}\cdot\nabla\mathbf{u}] = \nabla\mathbf{u}:\nabla\mathbf{u} + \mathbf{u}\cdot\nabla(\nabla\cdot\mathbf{u})$ from Table 3.1-2, to give, with $\nabla\cdot\mathbf{u} = 0$,
$$\nabla^2\tilde{P} = \nabla\cdot\mathbf{g} - \nabla\mathbf{u}:\nabla\mathbf{u}, \tag{3.5-4}$$
for which, in 2D cartesian geometry, $\nabla\mathbf{u}:\nabla\mathbf{u} = u_x^2 + 2u_yv_x + v_y^2$; (ii) a further use of $\nabla\cdot\mathbf{u} = 0$ leads to another simpler form [subtract $(\nabla\cdot\mathbf{u})^2$ from the above result]: $\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = 2(u_yv_x - v_yu_x)$, as presented in, for example, Roache (1982).
(3) It will turn out that none of the alternatives that purport to simplify the RHS of the PPE is advisable when obtaining approximate solutions via the FEM.
(4) We shall often (usually) omit the tilde over the kinematic pressure for simplicity of notation.

When the pair (3.5-1) and (3.5-3) is employed properly (which we will define carefully in Sections 3.8.2 and 3.9.2), it can be used, rather than (3.5-1) and $\nabla\cdot\mathbf{u} = 0$, to solve the NS equations. In particular, they will deliver a divergence-free velocity. We will return to the PPE formulation later; for now, we just make the following additional

Remarks:
(1) The PPE is elliptic and thus shows that the pressure field is always in equilibrium with the corresponding divergence-free velocity field.
(2) It can be a useful formulation when the solution of the time-dependent NS equations is the principal goal. (It is not so useful if the steady NS equations are to be attacked.)
(3) PPE formulations and solution methods are generally more 'delicate' than $\mathbf{u}$-$P$ formulations; the 'cost' of bypassing the explicit solution of $\nabla\cdot\mathbf{u} = 0$ can be higher than some might initially anticipate, and probably has been, frequently.

3.5.2 The Vorticity Transport Equation (VTE)

Applying curl rather than div to the momentum equation yields the VTE. Starting this time with the simplest form of the momentum equation (3.2-1), and recalling the vorticity
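definition,
$$\boldsymbol{\omega} = \nabla\times\mathbf{u}, \tag{3.5-5}$$
yields [using curl grad $(\cdot) = 0$ to eliminate the ('God-awful,' in the eyes of $\psi$-$\omega$ fans) pressure, and $\nabla\cdot\boldsymbol{\omega} = 0$ because div curl $(\cdot) = 0$]
$$\frac{\partial\boldsymbol{\omega}}{\partial t} + \nabla\times(\mathbf{u}\cdot\nabla\mathbf{u}) = \nu\nabla\times\nabla^2\mathbf{u},$$
which simplifies via
(i) $\nabla\times(\mathbf{u}\cdot\nabla\mathbf{u}) = \nabla\times[\tfrac{1}{2}\nabla q^2 - \mathbf{u}\times\nabla\times\mathbf{u}] = -\nabla\times(\mathbf{u}\times\boldsymbol{\omega}) = \nabla\times(\boldsymbol{\omega}\times\mathbf{u}) = \boldsymbol{\omega}(\nabla\cdot\mathbf{u}) - \mathbf{u}(\nabla\cdot\boldsymbol{\omega}) - \boldsymbol{\omega}\cdot\nabla\mathbf{u} + \mathbf{u}\cdot\nabla\boldsymbol{\omega} = -\boldsymbol{\omega}\cdot\nabla\mathbf{u} + \mathbf{u}\cdot\nabla\boldsymbol{\omega}$ and
(ii) $\nabla\times\nabla^2\mathbf{u} = \nabla\times[\nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}] = -\nabla\times\nabla\times\boldsymbol{\omega} = \nabla^2\boldsymbol{\omega} - \nabla(\nabla\cdot\boldsymbol{\omega}) = \nabla^2\boldsymbol{\omega}$
to
$$\frac{\partial\boldsymbol{\omega}}{\partial t} + \mathbf{u}\cdot\nabla\boldsymbol{\omega} = \boldsymbol{\omega}\cdot\nabla\mathbf{u} + \nu\nabla^2\boldsymbol{\omega} \tag{3.5-6}$$
in the general (3D) case and to the degenerate/simpler version
$$\frac{\partial\omega}{\partial t} + \mathbf{u}\cdot\nabla\omega = \nu\nabla^2\omega \tag{3.5-7}$$
in the 2D case, wherein $\omega$ is a scalar. (For example, $\omega = \mathbf{k}\cdot\nabla\times\mathbf{u}$, where $\mathbf{u}$ is in the $xy$-plane and $\mathbf{k}$ is the unit vector in the $z$-direction; this formulation is also useful for axisymmetric problems, using cylindrical coordinates.) In 2D, the VTE is just the advection-diffusion equation of Chapter 2, a parabolic equation that is one of the pair that comprise the stream function-vorticity formulation, which we will soon present.

As the VTE will see less 'action' in this text than either the $\mathbf{u}$-$P$ or PPE formulations, we defer further discussion except to say that:
1. Again, more regularity (than even for the PPE) in $\mathbf{u}$ is required for these higher-order (third) derivatives to exist.
2. The pair (3.5-5) and (3.5-6) can be used (again with proper care and sometimes with some difficulty) to solve simultaneously for the velocity and the vorticity. [Also, $\nabla\cdot\mathbf{u} = 0$ may sometimes need to be specifically invoked; it depends on BC's and solution strategy. See Gresho (1992) for further information and references.]
3. A major 'source' of vorticity occurs at 'no-slip' boundaries, usually via the non-zero tangential pressure gradient there, but it can also be generated by an accelerating tangential boundary.

3.5.3 The Penalized Momentum Equation

A 'slightly compressible' fluid may, intuitively, behave quite like an incompressible fluid in many situations.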
That this premise is true has led to a large amount of work
in approximately incompressible fluids, from which we select (at this point) just one type because it has achieved a large following in some finite element circles. Suppose we replace $\nabla\cdot\mathbf{u} = 0$ with
$$\nabla\cdot\mathbf{u} = -\epsilon P \tag{3.5-8}$$
or its equivalent
$$P = -\lambda\nabla\cdot\mathbf{u}, \tag{3.5-9}$$
where $\lambda = 1/\epsilon \gg \mu$ has the same units as viscosity and is called the penalty parameter, and is 'user-selected.' Clearly if $P$ is finite, then $\nabla\cdot\mathbf{u} \to 0$ as $\epsilon \to 0$ ($\lambda \to \infty$). Note that if $\int_\Gamma \mathbf{n}\cdot\mathbf{u} = 0$, then the average 'penalty' pressure is zero: $\int_\Omega P = 0$. While the name 'penalty parameter' will be discussed further later, for now we just assume that this new continuity equation approximates well the incompressible one and insert the above pressure into the momentum equation, say in the stress-divergence form (3.3-1), to obtain the penalized momentum equation,
$$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u}\right) - (\lambda + \mu)\nabla(\nabla\cdot\mathbf{u}) = \mu\nabla^2\mathbf{u}, \tag{3.5-10}$$
which contains no pressure and can therefore be solved directly for the velocity field, the pressure being 'recovered', if and when desired, from (3.5-9). This is a substantial simplification over the $\mathbf{u}$-$P$ system, and is what accounts for its popularity. But it is no pure panacea for (at least) the following two (related) reasons: (i) it is not a priori obvious how large $\lambda$ must be to approximate $\nabla\cdot\mathbf{u} = 0$ with acceptable error, and (ii) clearly if $\lambda$ is 'too large' (related to round-off error when a numerical solution is sought), then the above equation becomes simply $\nabla(\nabla\cdot\mathbf{u}) = 0$.

It is of interest (and important) to derive the analog of the PPE when the penalty method is employed because the small compressibility can have a large effect on the pressure, but only for small time, that is related to what may be called a (spurious) 'transient penalty shock wave.'
To this end, we first form the divergence of (3.5-10) after adding a source term to the RHS:
$$\rho\left[\frac{\partial(\nabla\cdot\mathbf{u})}{\partial t} + \nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u})\right] - (\lambda + \mu)\nabla^2(\nabla\cdot\mathbf{u}) = \mu\nabla^2(\nabla\cdot\mathbf{u}) + \rho\nabla\cdot\mathbf{g},$$
which, with (3.5-8) and (see Table 3.1-2) using $\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = \mathbf{u}\cdot\nabla(\nabla\cdot\mathbf{u}) + \nabla\mathbf{u}:\nabla\mathbf{u}$, gives
$$\epsilon\left(\frac{\partial P}{\partial t} + \mathbf{u}\cdot\nabla P\right) = \frac{1}{\rho}[(1 + 2\mu\epsilon)\nabla^2 P] - (\nabla\cdot\mathbf{g} - \nabla\mathbf{u}:\nabla\mathbf{u}), \tag{3.5-11}$$
wherein, since $\lambda = 1/\epsilon$ is usually much larger than $\mu$, the $2\mu\epsilon$ term is (usually) negligible. This equation is to be compared with the PPE of incompressible flow, (3.5-4). The penalty pressure (and thus div $\mathbf{u}$) actually 'dances' to a time-dependent advection-diffusion equation with a source term; the effective diffusion coefficient is $\lambda/\rho$. But since $\epsilon$ is very small, the transient will be very sharp, and short; there is an ephemeral temporal boundary layer. After this initial (and spurious, relative to either incompressible or compressible flow) penalty transient ('shock' wave) has passed through the domain (the required time for which is $O(\rho L^2\epsilon)$, where $L$ is a characteristic length scale of the domain) the pressure will be in quasi-steady equilibrium and can respond to the true time-variations of
the flow; i.e., it then satisfies, approximately, $\nabla^2 P = \rho[\nabla\cdot\mathbf{g} - \nabla\mathbf{u}:\nabla\mathbf{u}]$, which is (3.5-4). It is important to emphasize that even though $P$ is absent from the penalized momentum equation and (3.5-11) is in fact never formed, the implied pressure [from (3.5-8)] still satisfies (3.5-11) and this can have a significant effect on the penalty velocity. Later we shall demonstrate this spurious transient and show that div $\mathbf{u}$ can be very large (and thus $\mathbf{u}$ very wrong) during this adjustment phase.

A Final Remark: The word 'penalty,' and the concept, comes from the variational statement of the Stokes equations (Section 3.7.1) as follows: 'The term penalty is to be understood in the framework of optimal control: the cost functional is augmented with $\epsilon^{-1}\int(\nabla\cdot\mathbf{u})^2\,dx$, so that diverging velocity fields are strongly penalized.' (Thomasset, 1981, p. 81).

3.6 ALTERNATE STATEMENTS OF THE NS EQUATIONS

The stage is now set, in part at least, for writing the various theoretically equivalent forms of the NS equations. We do some of this, plus a little more, in this section, some of which will simply serve as a summary of the preceding discussion. Also, we switch to the dimensionless form of the equations introduced in (3.2-5), for advection-dominated flow.

3.6.1 Velocity-Pressure in Divergence Form

This is simply an expansion of (3.4-2); i.e.,
$$\frac{\partial\mathbf{u}}{\partial t} + \nabla\cdot(\mathbf{u}\mathbf{u} + P\mathbf{I}) = Re^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0. \tag{3.6-1}$$

3.6.2 Velocity-Pressure in Rotational Form

This form combines the curl form of the viscous terms with the rotational (curl) form of advection; i.e.,
$$\frac{\partial\mathbf{u}}{\partial t} + \boldsymbol{\omega}\times\mathbf{u} + \nabla P_T = -Re^{-1}\,\nabla\times\boldsymbol{\omega}, \quad \boldsymbol{\omega} = \nabla\times\mathbf{u}, \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0, \tag{3.6-2}$$
where $P_T = P + \tfrac{1}{2}q^2$ is the total pressure.

3.6.3 PPE Form

This important form combines (3.2-5) and (3.5-3) with $\mathbf{g} = 0$; i.e.,
$$\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = Re^{-1}\,\nabla^2\mathbf{u} \quad\text{and}\quad \nabla^2 P = \nabla\cdot(Re^{-1}\,\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u}). \tag{3.6-3}$$
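A concrete divergence-free field makes the PPE and its 2D right-hand sides easy to check. The sketch below uses the classical Taylor-Green vortex, $u = \sin x\cos y$, $v = -\cos x\sin y$, with pressure $P = \tfrac{1}{4}(\cos 2x + \cos 2y)$; this field is an assumption introduced here for illustration and is not taken from the text. For it $\nabla^2\mathbf{u} = -2\mathbf{u}$, so the viscous contribution to the PPE vanishes identically and the PPE reduces to $\nabla^2 P = -\nabla\mathbf{u}:\nabla\mathbf{u}$. Derivatives are approximated by central differences with an assumed step `H`.

```python
import math

H = 1.0e-3  # finite-difference step (assumed; truncation error is O(H^2))

def u(x, y):
    return math.sin(x) * math.cos(y)

def v(x, y):
    return -math.cos(x) * math.sin(y)

def p(x, y):
    return 0.25 * (math.cos(2.0 * x) + math.cos(2.0 * y))

def ddx(f, x, y):
    return (f(x + H, y) - f(x - H, y)) / (2.0 * H)

def ddy(f, x, y):
    return (f(x, y + H) - f(x, y - H)) / (2.0 * H)

def laplacian(f, x, y):
    return (f(x + H, y) + f(x - H, y) + f(x, y + H) + f(x, y - H)
            - 4.0 * f(x, y)) / (H * H)

def ppe_terms(x, y):
    """Both sides of the PPE (g = 0) at one point, plus the velocity divergence."""
    ux, uy, vx, vy = ddx(u, x, y), ddy(u, x, y), ddx(v, x, y), ddy(v, x, y)
    return {
        "div_u": ux + vy,
        "lap_p": laplacian(p, x, y),
        # -grad u : grad u in 2D cartesian form, cf. (3.5-4)
        "rhs_full": -(ux * ux + 2.0 * uy * vx + vy * vy),
        # the shorter form obtained after subtracting (div u)^2
        "rhs_short": -2.0 * (uy * vx - vy * ux),
    }

terms = ppe_terms(0.7, 0.3)   # arbitrary sample point
```

This only confirms the continuum identities at a point; it says nothing about which right-hand side behaves best after discretization, which is the subject of Remark (3) in Section 3.5.1.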
3.6.4 The Stream Function-Vorticity ($\psi$-$\omega$) Formulation

This 2D (only) formulation utilizes (3.5-7) and the definition of vorticity, $\omega = \partial v/\partial x - \partial u/\partial y$, along with the introduction of the stream function ($\psi$) via ($\mathbf{u} = \nabla\psi\times\mathbf{e}_3$)
$$u = \partial\psi/\partial y \quad\text{and}\quad v = -\partial\psi/\partial x \tag{3.6-4}$$
to obtain
$$\frac{\partial\omega}{\partial t} + \mathbf{u}\cdot\nabla\omega = Re^{-1}\,\nabla^2\omega \tag{3.6-5}$$
and
$$\nabla^2\psi + \omega = 0, \tag{3.6-6}$$
which is the $\psi$-$\omega$ formulation.

Remarks:
(1) The elliptic equation relating $\psi$ and $\omega$ is somewhat analogous to that relating $P$ and $\mathbf{u}$ in the PPE formulation.
(2) It is possible to eliminate the elliptic equation by inserting $\omega = -\nabla^2\psi$ into the VTE:
$$\frac{\partial}{\partial t}\nabla^2\psi + \mathbf{u}\cdot\nabla(\nabla^2\psi) = Re^{-1}\,\nabla^4\psi, \tag{3.6-7}$$
which, with (3.6-4), can be used to solve directly for the stream function. While this formulation has actually been implemented via the FEM [see, for example, Olson and Tuann (1979), Girault and Raviart (1986), and Gunzburger (1989)], we will not pursue it any further. (A higher degree of regularity is obviously presumed, which entails the necessity of using basis functions that are of class $C^1$.)
(3) In 3D, the analogous formulation leads to the vector system of velocity (vector) potential and vorticity, another approach that we believe to be too complicated and largely unnecessary. See Gunzburger (1989) and references therein for further discussion.

3.6.5 The Velocity-Vorticity Formulation

This formulation can be used in 2D or 3D and combines the kinetic vorticity transport equation, (3.5-6), with the two kinematic equations (3.2-2) and (3.5-5), to give the trio of equations
$$\frac{\partial\boldsymbol{\omega}}{\partial t} + \mathbf{u}\cdot\nabla\boldsymbol{\omega} = \boldsymbol{\omega}\cdot\nabla\mathbf{u} + Re^{-1}\,\nabla^2\boldsymbol{\omega}, \tag{3.6-8}$$
$$\boldsymbol{\omega} = \nabla\times\mathbf{u}, \tag{3.6-9}$$
and
$$\nabla\cdot\mathbf{u} = 0, \tag{3.6-10}$$
where the vortex stretching term, $\boldsymbol{\omega}\cdot\nabla\mathbf{u}$, is absent in 2D, wherein also $\omega$ is a simple scalar.
Remarks:
(1) We shall also have little more to say regarding this formulation in the sequel (see, for example, Gunzburger et al., 1990, and references therein).
(2) The kinematic equation, $\nabla\times\mathbf{u} = \boldsymbol{\omega}$, is sometimes replaced (or augmented) by a higher-order one by taking its curl: $\nabla\times\nabla\times\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla^2\mathbf{u} = \nabla\times\boldsymbol{\omega}$; i.e.,
$$\nabla^2\mathbf{u} = -\nabla\times\boldsymbol{\omega}, \tag{3.6-11}$$
a vector Poisson equation. See, for example, Hafez et al. (1989) for further discussion of this formulation.
(3) $\boldsymbol{\omega}$ is also divergence-free; $\nabla\cdot\boldsymbol{\omega} = 0$ from (3.6-9).

3.6.6 Other Formulations

Yes, there are still others, although they seem thus far to have proven more useful theoretically than computationally. These are obtained from the non-dimensional forms of (3.3-4) and (3.3-6); i.e.,
$$\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = Re^{-1}[\nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}] \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0, \tag{3.6-12}$$
and
$$\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = -Re^{-1}\,\nabla\times\nabla\times\mathbf{u} \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0, \tag{3.6-13}$$
which we previously referred to as div-curl form and curl form, respectively.

The reason for 'belaboring' the issue of equivalent statements of the PDE's is simply that the various weak formulations derived from these PDE's are not equivalent when it comes to natural BC's; i.e., while the PDE's above are equivalent, the 'natural' boundary value problems corresponding to them are not equivalent. Thus, we shall revisit some of these various alternate statements of the NS equations when we generate the corresponding weak forms and NBC's.

3.7 SPECIAL CASES OF INTEREST

There are three special cases (subsets, actually) of the NS equations that we wish to illuminate here and explain some aspects of their utility.

3.7.1 Stokes Flow

If $Re \ll 1$ and the non-linear advection terms are therefore neglected/omitted, then the NS equations become
$$\frac{\partial\mathbf{u}}{\partial t} + \nabla P = \nu\nabla^2\mathbf{u} + \mathbf{g} \tag{3.7-1}$$
and
$$\nabla\cdot\mathbf{u} = 0, \tag{3.7-2}$$
which are the (linear) equations of Stokes flow, also often called creeping flow. While they are sometimes appropriate/applicable in the time-dependent form shown (e.g., release a small pearl in a vat of motor oil or, if your sensibilities prefer it, a glass bead in glycerin), they are most often used in the steady version by omitting $\partial\mathbf{u}/\partial t$ in (3.7-1); indeed, there are those who seem to 'not believe' in transient Stokes flow, using the name 'Stokes flow' (or the more descriptive 'creeping flow') to represent the steady version of the above equations; see, for example, Langlois 1964 (Slow Viscous Flow, Macmillan Co., NY; out of print!). A small exception is briefly noted in Happel and Brenner (1965), wherein they admit to the existence of 'unsteady creeping flows' in which they point out, properly, that the $\partial\mathbf{u}/\partial t$ term need not be small; a large exception is the paper by Maxey and Riley (1983), as is the rather recent and very relevant paper by Lovalenti and Brady (1993). Indeed, in a later section, we shall show how a transient Stokes flow is, in some sense, the proper 'precursor' to what is often and erroneously called 'impulsive starts' in that $\partial\mathbf{u}/\partial t \to \infty$, yet we are nevertheless dealing with Stokes flow. (Explanation: the acceleration is arbitrarily large for a very short time.) These non-believers would, ostensibly, also not believe in the following questions:
(1) How much time is required for the pearl to attain 90% of its terminal velocity? (It is clearly not zero.)
(2) Ditto in a fluid with twice the viscosity?
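A rough answer to the two questions can be sketched with a much simpler model than the transient Stokes equations themselves. The sketch below assumes a rigid sphere acted on only by net weight and quasi-steady Stokes drag, $m\,dv/dt = (m - m_f)g - 6\pi\mu a v$, and deliberately neglects the added-mass and history forces that appear in treatments such as Maxey and Riley (1983); those terms would change the numbers. The function names and the material values (a 1 mm glass bead in glycerin) are hypothetical, order-of-magnitude choices.

```python
import math

def stokes_sphere_response(radius, rho_particle, rho_fluid, mu, gravity=9.81):
    """Response of m dv/dt = (m - m_f) g - 6 pi mu a v, starting from rest."""
    tau = 2.0 * rho_particle * radius**2 / (9.0 * mu)     # m / (6 pi mu a)
    v_term = 2.0 * (rho_particle - rho_fluid) * gravity * radius**2 / (9.0 * mu)
    t90 = tau * math.log(10.0)                            # v(t90) = 0.9 * v_term
    distance = v_term * (t90 - 0.9 * tau)                 # integral of v over [0, t90]
    reynolds = rho_fluid * v_term * 2.0 * radius / mu     # based on diameter
    return {"tau": tau, "v_term": v_term, "t90": t90,
            "distance": distance, "Re": reynolds}

def velocity(t, tau, v_term):
    """Exponential approach to terminal velocity in this simplified model."""
    return v_term * (1.0 - math.exp(-t / tau))

# Assumed, illustrative property values (SI units).
bead = dict(radius=1.0e-3, rho_particle=2500.0, rho_fluid=1260.0)
base = stokes_sphere_response(mu=1.4, **bead)
doubled = stokes_sphere_response(mu=2.8, **bead)

t90_ratio = doubled["t90"] / base["t90"]   # effect of doubling the viscosity
```

Within this simplified model the time to 90% of terminal velocity is $\tau\ln 10$ with $\tau = 2\rho_p a^2/(9\mu)$, so doubling the viscosity halves it, and the distance covered during the acceleration phase is a small fraction of the sphere radius for values like these.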
It is also important to point out (and emphasize) that one must realize that the fluid 'goes nowhere' during a typical transient Stokes problem that attains a steady state; i.e., the viscous time scale is so short that all fluid 'parcels' are virtually stationary throughout the diffusion-dominated simulation, unless, of course, the boundary conditions are time-dependent, in which case the Stokes flow equations only apply if $Re \ll 1$ for all time (and then fluid parcel displacement can occur). The dropped pearl will travel only a very small distance before its acceleration becomes negligible and its terminal velocity is attained. Although Leal (1992) is also 'close' to a non-believer in transient Stokes flow, his book has much good discussion regarding steady Stokes flow; for example, it discusses details of our falling pearl experiment above after the 'acceleration phase' is over.

Another example of a transient Stokes flow is afforded by any 'spindown' experiment: turn off the body force in any contained flow, at any Re, and let the flow 'spindown'; the advection term will become negligibly small long before the flow has come to rest. The final portion of the process is transient Stokes flow.

The transient Stokes equations are useful for a number of reasons, not the least of which is that they are linear. They thus form a very nice 'test-bed' for numerical methods (and their detailed theoretical analysis!) whose real goal is usually to solve the full NS equations. The important issue of 'div-grad coupling' is as delicate and crucial in the Stokes equations as it is in the NS equations; ditto boundary conditions, for both $\mathbf{u}$ and $P$.
Finally, the steady Stokes equations, which represent a linear algebraic system of equations when discretized, are often used to generate a simple first guess to a steady NS flow via what is called the incremental Reynolds number solution method: use the solution at a lower Re as the first guess to the solution of the non-linear equations that describe the flow at some non-zero Re. More on this in Volume II. Note that in the special case when g can be expressed as the gradient of a scalar—such as gravity—it is then a conservative 'force' field, and it is both possible and advisable (especially when a numerical solution is sought) to 'absorb' this scalar into the pressure and drop g from the RHS. The stationary Stokes equations are also special in that there are powerful variational statements that relate to them and their solution. We introduce two of these here and
will later refer back to them when seeking approximate solutions. The first is this: the minimizing function over all divergence-free vector fields that satisfy $\mathbf{u} = \mathbf{w}$ on $\Gamma$ of the following functional, called the Dirichlet integral,
$$J_0(\mathbf{u}) = \int_\Omega \left[\tfrac{1}{2}\nu\nabla\mathbf{u}:(\nabla\mathbf{u})^T - \mathbf{u}\cdot\mathbf{g}\right], \tag{3.7-3}$$
satisfies the steady Stokes equations,
$$\nabla P = \nu\nabla^2\mathbf{u} + \mathbf{g} \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega, \tag{3.7-4}$$
with $\mathbf{u} = \mathbf{w}$ on $\Gamma$. [The proof of this statement, provided in Section 3.15, involves the recognition that if $\nu\nabla^2\mathbf{u} + \mathbf{g}$ is $L^2$-orthogonal to all divergence-free vector fields, then it must be the gradient of a scalar, which scalar is called $P$. If $\mathbf{u}$ is obtained from (3.7-3), then $P$ can then be obtained from $\nabla^2 P = \nabla\cdot\mathbf{g}$ in $\Omega$, $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} + \mathbf{g})$ on $\Gamma$.] In the space of divergence-free functions, the Stokes solution is a true minimizer.

But suppose we enlarge the space of functions so that divergence-free functions are only a subset? In this case, the variational problem becomes one of 'minimization plus constraint': Find the minimum of
$$J_1(\mathbf{u}) = \int_\Omega \left[\tfrac{1}{2}\nu\nabla\mathbf{u}:(\nabla\mathbf{u})^T - \mathbf{u}\cdot\mathbf{g}\right] \tag{3.7-5}$$
for every $\mathbf{u}$ that takes on the value $\mathbf{w}$ on $\Gamma$ subject to the constraint
$$\nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega. \tag{3.7-6}$$
The realization of this (second) variational formulation usually involves the introduction of a Lagrange multiplier ($\lambda$) to enforce the constraint: Find the stationary point of
$$J_2(\mathbf{u}, \lambda) = \int_\Omega \left[\tfrac{1}{2}\nu\nabla\mathbf{u}:(\nabla\mathbf{u})^T - \mathbf{u}\cdot\mathbf{g} - \lambda\nabla\cdot\mathbf{u}\right] \tag{3.7-7}$$
over all vector functions that satisfy $\mathbf{u} = \mathbf{w}$ on $\Gamma$ and over all scalar functions $\lambda$ in $L^2$. This problem is easier in that the class of vector functions over which the search is performed is much less restricted (even though the same $\mathbf{u}$ will ultimately be obtained), but it is harder in at least two ways: (i) the search must simultaneously range over all $L^2$ scalar functions (the Lagrange multipliers), and (ii) the (more powerful) minimum has been replaced by a mere extremum (a stationary point); i.e., we have now in fact a saddle-point problem to deal with.
(The introduction of a Lagrange multiplier generally transforms a minimization problem to a saddle-point problem.) The first Frechet derivative of (3.7-7) yields, again, the steady Stokes equation (3.7-4); i.e., it turns out that the Lagrange multiplier, which entered the functional as a mathematical object, exits as the Stokes pressure; $\lambda = P$. The extremum simultaneously minimizes $J_2(\mathbf{u}, P)$ with respect to $\mathbf{u}$ [which minimum is clearly the same as that of $J_1(\mathbf{u})$] and maximizes it with respect to $P$; it is a minimax problem. (For $Re > 0$, there is no variational principle, but it may still be permissible to call $P$ a Lagrange multiplier in that it still takes on the value necessary to ensure constraint satisfaction.) Further detailed discussion is presented in Section 3.15.
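The roles of the Lagrange multiplier and of the penalty parameter of Section 3.5.3 can be seen in a two-unknown caricature of these functionals. In the sketch below, which is an illustrative assumption and not a discretization of the Stokes problem, the 'velocity' is the pair $(x, y)$, the 'divergence' is $x + y$, the forcing is $(g_1, g_2)$, and $\nu = 1$. The constrained problem is solved exactly through its stationarity conditions, and the penalized problem (with $P = -\lambda(x + y)$, cf. (3.5-9)) is solved for increasing penalty parameters. All function and variable names are hypothetical.

```python
def lagrange_solution(g1, g2):
    """Stationary point of 0.5*(x^2 + y^2) - g1*x - g2*y - lam*(x + y).

    Stationarity gives x = g1 + lam, y = g2 + lam, and x + y = 0.
    """
    lam = -0.5 * (g1 + g2)
    return g1 + lam, g2 + lam, lam

def penalty_solution(g1, g2, big_lambda):
    """Minimizer of 0.5*(x^2 + y^2) - g1*x - g2*y + 0.5*big_lambda*(x + y)^2."""
    divergence = (g1 + g2) / (1.0 + 2.0 * big_lambda)   # x + y, small but nonzero
    p = -big_lambda * divergence                        # recovered 'pressure'
    return g1 + p, g2 + p, p

g1, g2 = 1.0, 3.0                       # assumed forcing values
x_l, y_l, p_l = lagrange_solution(g1, g2)

penalties = [1.0e0, 1.0e2, 1.0e4]
errors = []
for big_lambda in penalties:
    x_p, y_p, p_p = penalty_solution(g1, g2, big_lambda)
    errors.append(max(abs(x_p - x_l), abs(y_p - y_l), abs(p_p - p_l)))
```

In this caricature the multiplier comes out as the 'pressure' that enforces the constraint exactly, while the penalized solution approaches it with an error that shrinks roughly in proportion to $1/\lambda$.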
The penalty formulation (Section 3.5.3) returns us to a minimization problem: minimize, for $\lambda > 0$ given and fixed,
$$J_P(\mathbf{u}) = \int_\Omega \left[\tfrac{1}{2}\nu\nabla\mathbf{u}:(\nabla\mathbf{u})^T - \mathbf{u}\cdot\mathbf{g} + \tfrac{\lambda}{2}(\nabla\cdot\mathbf{u})^2\right] \tag{3.7-8}$$
over all vector functions that satisfy $\mathbf{u} = \mathbf{w}$ on $\Gamma$. The larger the divergence of $\mathbf{u}$, the larger is $J_P(\mathbf{u})$; thus the term 'penalty': the functional is penalized by non-solenoidal vector fields. 'The penalty function method reduces problems of conditional (or constrained) extremum to problems without constraints by the introduction of a penalty on the infringement of constraints.' (Reddy, 1982). The minimum of $J_P(\mathbf{u})$ is attained when $\delta J_P(\mathbf{u}) = 0$ [and $\delta^2 J_P(\mathbf{u}) > 0$], which requires (leads to)
$$\nu\nabla^2\mathbf{u} + \mathbf{g} = -\lambda\nabla(\nabla\cdot\mathbf{u}), \tag{3.7-9}$$
which are the steady Stokes equations if $\lambda\nabla\cdot\mathbf{u} = -P$ [see (3.5-9)] and gives a velocity field that is within $\epsilon = 1/\lambda$ of the Stokes velocity for $\lambda \gg \mu$ in (3.7-9). See Bercovier (1978) for the theory of penalty methods.

Finally, in the 2D $\psi$-$\omega$ formulation, the Stokes flow equations are
$$\frac{\partial\omega}{\partial t} = \nu\nabla^2\omega + \mathbf{k}\cdot\text{curl}\,\mathbf{g} \tag{3.7-10}$$
and
$$\nabla^2\psi + \omega = 0, \tag{3.7-11}$$
a transient heat equation for the vorticity and, it would seem, an uncoupled elliptic equation for the stream function. But, as we shall see later, life is not quite that simple for the $\psi$-$\omega$ formulation; the reason is, basically, that (3.7-10) comes without BC's and thus cannot be solved alone for the vorticity. The coupled set must always be solved because $\psi$ 'contains' all of the BC data.

3.7.2 Inviscid Flow

If $\nu = 0$, then the NS equations 'simplify' to the incompressible Euler equations, an especially slippery system for which, when $\partial\mathbf{u}/\partial t = 0$, non-uniqueness is the name of the game. They are, in the simplest form,
$$\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = \mathbf{g} \tag{3.7-12}$$
and
$$\nabla\cdot\mathbf{u} = 0. \tag{3.7-13}$$
Although we have not yet addressed the subject of boundary conditions, it is well known that the no-slip BC goes away with the viscosity, a simplification that can be a complication.
Consider, for example, the following 'expansion' flow in a channel, also called the backward-facing step ('An ingenious device for generating Ph.D.s'; F. Habashi, personal communication), a simplified version of which is shown in Figure 3.7-1. The flow enters at the upper half of the left boundary, and the irrotational solution is shown (for $u = 1$ at the inlet).
[Fig. 3.7-1 Potential flow in a channel expansion: (a) stream function; (b) vector field. The vertical scale is magnified in the vector plot.]

There are (at least) two very different steady Euler flows for this case: (i) potential flow ($v \neq 0$ at the inlet, shown above and discussed below), and (ii) 'slug' flow in which $u = 1$ in the entire upper half, $u = 0$ in the entire lower half, and $v = 0 = P$ everywhere; i.e., jumps in tangential velocity ('vortex sheets') are perfectly admissible for inviscid flow, an essential complication that (for $t > 0$ at least) is missing for finite $\nu$, no matter how small. The rotational form of the Euler equations is interesting; it is, from (3.6-2) with a source term added,

$$\frac{\partial\mathbf{u}}{\partial t} + \boldsymbol{\omega}\times\mathbf{u} + \nabla P_T = \mathbf{g}, \tag{3.7-14}$$

$$\nabla\cdot\mathbf{u} = 0, \tag{3.7-15}$$
where (recall) $P_T = P + \tfrac{1}{2}q^2$ is the Bernoulli pressure. If also the flow is irrotational ($\boldsymbol{\omega} = 0$), the equations describe a (time-varying in general) potential flow, another special case that we discuss next.

3.7.3 Potential Flow

For $\boldsymbol{\omega} = 0$ and $\mathbf{g}$ the gradient of a scalar potential, say $\mathbf{g} = -\nabla h$, the rotational form of the (now irrotational) Euler equations becomes

$$\frac{\partial\mathbf{u}}{\partial t} + \nabla P_T = 0 \tag{3.7-16}$$

and

$$\nabla\cdot\mathbf{u} = 0, \tag{3.7-17}$$

where here $P_T = P + \tfrac{1}{2}q^2 + h$. Since $\nabla\cdot\mathbf{u} = 0$ for all time, the continuity equation can be written as $\nabla\cdot\partial\mathbf{u}/\partial t = 0$ and the pair rewritten in terms of the acceleration, $\mathbf{a} = \partial\mathbf{u}/\partial t$:

$$\mathbf{a} + \nabla P_T = 0 \tag{3.7-18}$$

and

$$\nabla\cdot\mathbf{a} = 0, \tag{3.7-19}$$

in which $\mathbf{a}$ is clearly a potential divergence-free acceleration (since it is the gradient of a scalar, $P_T$, the potential in this case). A classical potential flow is both divergence-free and curl-free (see, for example, Batchelor, 1967) and is thus the gradient of a scalar, say $\phi$ (the velocity potential). Inserting

$$\mathbf{u} = \nabla\phi \tag{3.7-20}$$

into (3.7-16) and (3.7-17) gives

$$\nabla\left(\frac{\partial\phi}{\partial t} + P_T\right) = 0 \tag{3.7-21}$$

and

$$\nabla^2\phi = 0, \tag{3.7-22}$$

the latter of which yields $\phi$ once the BC's have been specified. The pressure can then be obtained from

$$P = -\left(\tfrac{1}{2}q^2 + h + \frac{\partial\phi}{\partial t}\right). \tag{3.7-23}$$

Remarks:
(1) The Euler equations are rarely attacked numerically, especially if the flow is irrotational; see, however, Bell and Marcus (1992), who claim to at least come close to solving them.
(2) The potential acceleration will be a useful notion when we discuss IC's and BC's, and also (later) in the discussion of projection methods.
(3) $\partial\phi/\partial t$ is only non-zero when the BC's are time-varying.
(4) Boundary conditions for potential flow are: $u_n = \mathbf{n}\cdot\nabla\phi$ specified (usually) or $\phi$ specified (rarely, and typically as an OBC). Note that when the velocity at the inlet of a domain is desired to be specified, only the normal component can be so specified; the tangential components must be left 'free' and will be such that $\nabla\times\mathbf{u} = 0$ at the inlet. In Figure 3.7-1 above, the inlet BC was $u = 1 = -u_n = -\partial\phi/\partial n = \partial\phi/\partial x$.
(5) Since $\nabla^2\mathbf{u} = 0$ when $\mathbf{u}$ is a potential flow velocity, it follows that every potential flow satisfies the NSE's.

Finally, for an extensive discussion of both potential flow and Stokes flow and their associated variational principles, see Section 3.15.

3.7.4 Axisymmetric Flow

There are many practical situations in which axisymmetric flow is either present or assumed to be present; namely, flow in a tube/pipe in which there is no angular ($\theta$) variation of any flow quantity, only radial ($r$) and axial ($z$). Thus, the equations of motion in cylindrical coordinates are of interest. Note that axisymmetry can only exist (in a bounded domain at least) if the 'geometry' is circular; $r = R(z)$ defines the radial boundary. Calling now $u = u_r$ the radial velocity, $v = u_\theta$ the tangential/swirl velocity, and $w = u_z$ the axial velocity, the equations in stress-divergence form are, in component form with $\mathbf{u} = (u, v, w)$,

$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial r} - \frac{v^2}{r} + w\frac{\partial u}{\partial z} = \frac{1}{r}\frac{\partial}{\partial r}(r\sigma_r) - \frac{\sigma_\theta}{r} + \frac{\partial\sigma_{rz}}{\partial z}, \tag{3.7-24}$$

$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial r} + \frac{uv}{r} + w\frac{\partial v}{\partial z} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2\sigma_{r\theta}) + \frac{\partial\sigma_{z\theta}}{\partial z}, \tag{3.7-25}$$

and

$$\frac{\partial w}{\partial t} + u\frac{\partial w}{\partial r} + w\frac{\partial w}{\partial z} = \frac{1}{r}\frac{\partial}{\partial r}(r\sigma_{rz}) + \frac{\partial\sigma_z}{\partial z}, \tag{3.7-26}$$

with

$$\nabla\cdot\mathbf{u} = \frac{1}{r}\frac{\partial}{\partial r}(ru) + \frac{\partial w}{\partial z} = 0, \tag{3.7-27}$$

where $\sigma_r = -P + 2\nu\,\partial u/\partial r$, $\sigma_\theta = -P + 2\nu u/r$, $\sigma_{rz} = \nu(\partial u/\partial z + \partial w/\partial r)$, $\sigma_{r\theta} = \nu r\,\partial(v/r)/\partial r$, $\sigma_{z\theta} = \nu\,\partial v/\partial z$, and $\sigma_z = -P + 2\nu\,\partial w/\partial z$.

Remarks:
(1) Simple (and much more common) axisymmetric flow has $v = 0$ in these equations (no swirl), and of course (3.7-25) is omitted.
(2) Even when $v \neq 0$, the swirling flow is 2D, although there are three velocity components and six components of the stress tensor.
(3) $P$ is the kinematic pressure ($P/\rho$).
(4) See Stakgold (1979, p. 502) for some useful remarks on such coordinate systems.
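As a sanity check on the stress-divergence form, one can verify that a known axisymmetric solution satisfies (3.7-26) and (3.7-27). The sketch below uses fully developed pipe (Poiseuille) flow, with u = v = 0, w = 1 - r^2 and kinematic pressure P = -4νz; these fields, the viscosity value, and the finite-difference step are assumptions chosen for illustration, and central differences stand in for the derivatives.

```python
# Finite-difference check (illustrative) that Poiseuille pipe flow satisfies the
# axial momentum equation (3.7-26) in stress-divergence form.
# Assumed test field: u = v = 0, w = 1 - r^2, P = -4*nu*z (kinematic pressure).

NU = 0.1     # kinematic viscosity (assumed value)
H = 1.0e-4   # finite-difference step (assumed value)


def w(r, z):
    return 1.0 - r * r


def pressure(r, z):
    return -4.0 * NU * z


def ddr(f, r, z):
    return (f(r + H, z) - f(r - H, z)) / (2.0 * H)


def ddz(f, r, z):
    return (f(r, z + H) - f(r, z - H)) / (2.0 * H)


def sigma_rz(r, z):
    # sigma_rz = nu (du/dz + dw/dr), with u = 0 here
    return NU * ddr(w, r, z)


def sigma_z(r, z):
    # sigma_z = -P + 2 nu dw/dz
    return -pressure(r, z) + 2.0 * NU * ddz(w, r, z)


def r_sigma_rz(r, z):
    return r * sigma_rz(r, z)


def axial_residual(r, z):
    """RHS of (3.7-26); the LHS vanishes for this steady, fully developed flow."""
    return ddr(r_sigma_rz, r, z) / r + ddz(sigma_z, r, z)


def divergence(r, z):
    """Continuity (3.7-27) with u = 0 reduces to dw/dz."""
    return ddz(w, r, z)
```

Both `axial_residual` and `divergence` should vanish to within round-off at any interior point, since the shear-stress term contributes -4ν and the normal-stress term +4ν.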
Just as the cartesian stress-divergence form can be reduced to the simpler 'Navier-Stokes' form [cf. (3.2-1)], so too can $\nabla\cdot\mathbf{u} = 0$ be used to simplify the above equations, to

$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial r} - \frac{v^2}{r} + w\frac{\partial u}{\partial z} + \frac{\partial P}{\partial r} = \nu(\nabla^2)_r u, \tag{3.7-28}$$

$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial r} + \frac{uv}{r} + w\frac{\partial v}{\partial z} = \nu(\nabla^2)_\theta v, \tag{3.7-29}$$

and

$$\frac{\partial w}{\partial t} + u\frac{\partial w}{\partial r} + w\frac{\partial w}{\partial z} + \frac{\partial P}{\partial z} = \nu(\nabla^2)_z w, \tag{3.7-30}$$

where, because of the curvilinear coordinates, the 'Laplacian' is not quite the same in each direction; we have (since $\partial/\partial\theta = 0$)

$$(\nabla^2)_r u = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) - \frac{u}{r^2} + \frac{\partial^2 u}{\partial z^2}, \tag{3.7-31}$$

$$(\nabla^2)_\theta v = (\nabla^2)_r v, \tag{3.7-32}$$

and

$$(\nabla^2)_z w = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial w}{\partial r}\right) + \frac{\partial^2 w}{\partial z^2}. \tag{3.7-33}$$

While not appearing to be much simpler, there are fewer terms, and the calculations turn out to be slightly 'cheaper.'

3.8 BOUNDARY CONDITIONS

The BC issue for the NS equations is larger, and even more confusing, than that of how to write the NS equations. Also, the jury is still out regarding the full story on even mathematically permissible BC's, let alone those that are 'best' in some sense. The simplest and most common BC is well understood, however, although it is often misnamed: for a viscous fluid, the BC at a solid wall (or object) is 'no-penetration and no-slip'; i.e., the normal and tangential velocity components must agree with those at the 'wall,' typically via $\mathbf{u} = \mathbf{w}$ on $\Gamma$, where $\mathbf{w}$ is specified (Dirichlet BC's; note too the absence of BC's for the pressure for this case). This BC is often simply referred to as the no-slip BC even though, as we will show, for an incompressible fluid the no-penetration portion is often much more influential than the no-slip portion. Indeed, if $\nu = 0$, then the $\mathbf{u} = \mathbf{w}$ BC must be changed to $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$: arbitrary slip is permitted, but not 'penetration.' If also $\mathbf{n}\cdot\mathbf{w} = 0$, then incompressibility requires that the flow must always be parallel to the boundary, viscous or not.
In numerical simulations, the above so-called 'specified velocity' (Dirichlet) BC is also very common for flow-through domains—at the inlet. The outlet of these domains is another matter entirely, as the quest for better open boundary conditions (OBC's) is a never-ending one. (Specified velocity, while legitimate, often is not a good OBC.) In the remainder of this section we review the state of the art ('science'? It is evolving, albeit rather slowly, from the former to the latter) regarding BC's for the several formulations presented earlier. The only general statement that can be made with assurance is this: BC's are required in both the normal and the tangential directions.
We also concur with Kreiss and Lorenz (1989): 'In computations, boundary conditions cause most of the problems,' although most of their book covers the 'easiest' case, periodic BC's, a subject we defer until Section 3.13.9 for reasons explained there.

3.8.1 u-P Equations

a. Traction

In addition to specified velocity, another BC that is appropriate in some branches of fluid mechanics (typically those dealing with free surfaces) is a force (per unit area) balance BC, sometimes referred to as specified traction. This BC has already been presented (Section 3.3.1), but we restate it here:

$$\boldsymbol{\sigma}\cdot\mathbf{n} = \mu[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - P\mathbf{n} = \mathbf{F}, \tag{3.8-1}$$

where $\mathbf{F}$ (presumed given) is the applied force (traction) on the boundary, i.e., the force applied by the boundary to the fluid. While (3.8-1) is valid as it stands for any shaped surface, it can be simplified for planar surfaces (constant curvature) to $\mu[\mathbf{n}\cdot\nabla\mathbf{u} + \nabla(\mathbf{n}\cdot\mathbf{u})] - P\mathbf{n} = \mathbf{F}$ or, equivalently, to

$$\mu\left(\frac{\partial\mathbf{u}}{\partial n} + \nabla u_n\right) - P\mathbf{n} = \mathbf{F}, \tag{3.8-2}$$

where $u_n = \mathbf{n}\cdot\mathbf{u}$ is the (outward) normal velocity. It may be useful to clarify the surface stress and traction vector with a sketch, in 2D for simplicity; see Figure 3.8-1.

[Fig. 3.8-1 Traction vector on the boundary, showing the normal component $F_n = \mathbf{n}\cdot\mathbf{F} = \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{n}$ and the tangential component $F_\tau = \boldsymbol{\tau}\cdot\mathbf{F} = \boldsymbol{\tau}\cdot\boldsymbol{\sigma}\cdot\mathbf{n}$.]
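Also,

$$\mathbf{F} = \begin{pmatrix} F_x \\ F_y \end{pmatrix} = \begin{pmatrix} \mathbf{e}_1\cdot\boldsymbol{\sigma}\cdot\mathbf{n} \\ \mathbf{e}_2\cdot\boldsymbol{\sigma}\cdot\mathbf{n} \end{pmatrix} = \begin{pmatrix} n_x(2\mu\,\partial u/\partial x - P) + n_y\mu(\partial u/\partial y + \partial v/\partial x) \\ n_x\mu(\partial u/\partial y + \partial v/\partial x) + n_y(2\mu\,\partial v/\partial y - P) \end{pmatrix} \tag{3.8-3}$$

and $F_n = n_xF_x + n_yF_y$, $F_\tau = \tau_xF_x + \tau_yF_y$, to give (using $\mathbf{n}\cdot\boldsymbol{\tau} = 0$)

$$\begin{pmatrix} F_n \\ F_\tau \end{pmatrix} = \begin{pmatrix} \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{n} \\ \boldsymbol{\tau}\cdot\boldsymbol{\sigma}\cdot\mathbf{n} \end{pmatrix} = \begin{pmatrix} 2\mu[n_x^2\,\partial u/\partial x + n_xn_y(\partial u/\partial y + \partial v/\partial x) + n_y^2\,\partial v/\partial y] - P \\ \mu[2\tau_xn_x\,\partial u/\partial x + (\tau_xn_y + \tau_yn_x)(\partial u/\partial y + \partial v/\partial x) + 2\tau_yn_y\,\partial v/\partial y] \end{pmatrix}. \tag{3.8-4}$$

b. Mixed

At this point, it may be as well to point out that not all components of the full vector equation need be applied simultaneously on $\Gamma$; e.g., the normal component of (3.8-1) may be applied on the same portion of the boundary where the tangential velocity is specified; i.e.,

$$\mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{n} = \mu\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - P = \mathbf{F}\cdot\mathbf{n} = F_n \quad\text{and}\quad \mathbf{n}\times\mathbf{u}\times\mathbf{n} = \mathbf{n}\times\mathbf{w}\times\mathbf{n} \tag{3.8-5}$$

may be applied simultaneously.

Remarks:
(1) Note from (3.8-2) that for planar boundaries, the normal component of the viscous force simplifies to $2\mu\,\partial u_n/\partial n$. If also $\mathbf{u} = 0$ on $\Gamma$, then $\nabla\cdot\mathbf{u} = 0 \Rightarrow \partial u_n/\partial n = 0$ there; i.e., the normal viscous stress vanishes on a stationary solid boundary. This relationship also follows from the alternative form of the traction vector, $\mathbf{F} = -P\mathbf{n} + \mu(\boldsymbol{\omega}\times\mathbf{n})$ for $\mathbf{u} = 0$ on $\Gamma$; since $\boldsymbol{\omega}\times\mathbf{n}$ lies in the tangent plane, normal viscous forces are absent. [See also, for example, Serrin (1959, p. 241) or Panton (1984, 1996; p. 335).] {A related discussion, focusing on the identity $\nabla\cdot[(\nabla\mathbf{u}) + (\nabla\mathbf{u})^T] = \nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u}) = \nabla^2\mathbf{u} = -\nabla\times\boldsymbol{\omega}$ away from $\Gamma$, is presented by Batchelor (1967, p. 148).}

The seemingly awkward representation of a vector in the tangent plane in (3.8-5), compared with $\mathbf{n}\times\mathbf{u} = \mathbf{n}\times\mathbf{w}$ (which also relates the two vectors in the tangent plane), is required because the latter is not the proper projection of $\mathbf{u}$ and $\mathbf{w}$ onto the tangent plane; the second cross-product returns the result, via a simple rotation, to the proper projection. It could also be written (more awkwardly yet) as $\mathbf{u} - \mathbf{n}(\mathbf{u}\cdot\mathbf{n}) = \mathbf{w} - \mathbf{n}(\mathbf{w}\cdot\mathbf{n})$, since any vector, say $\mathbf{v}$, can be expressed as the additive decomposition $\mathbf{v} = \mathbf{n}(\mathbf{v}\cdot\mathbf{n}) + \mathbf{n}\times\mathbf{v}\times\mathbf{n}$. See also Gunzburger (1989).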
(2) For the coordinate system in Figure 3.8-1, the following relationship exists between the magnitudes of the components of the unit vectors: $\tau_x = n_y$ and $\tau_y = -n_x$.
(3) In fact, borrowing partly from Gunzburger (1989), we show in Figure 3.8-2 a symbolic representation of a 2D domain ($\Omega$) and its boundary that shows the many ways in which the boundary ($\partial\Omega$) may be 'broken up' and BC's applied; in 3D, there are even more combinations.
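The traction components of (3.8-3) and (3.8-4), and the tangent-plane projection used in (3.8-5), are easy to evaluate directly. The following is a minimal 2D sketch under stated assumptions: the function names, the sample velocity gradient, and the choice of tangent τ = (n_y, -n_x) are illustrative and are not taken from the text.

```python
# Illustrative 2D evaluation of the traction vector F = sigma . n of (3.8-1)
# and its normal/tangential components (3.8-4). All names and numbers are assumed.
import math


def unit_normal(angle):
    """Outward unit normal making the given angle (radians) with the x-axis."""
    return (math.cos(angle), math.sin(angle))


def traction(grad_u, p, mu, n):
    """F = mu [grad u + (grad u)^T] . n - P n, with grad_u[i][j] = d u_i / d x_j."""
    F = []
    for i in range(2):
        viscous = sum(mu * (grad_u[i][j] + grad_u[j][i]) * n[j] for j in range(2))
        F.append(viscous - p * n[i])
    return F


def components(F, n):
    """Return (F_n, F_tau) with the assumed tangent tau = (n_y, -n_x)."""
    tau = (n[1], -n[0])
    return F[0] * n[0] + F[1] * n[1], F[0] * tau[0] + F[1] * tau[1]


def formula_3_8_4(grad_u, p, mu, n):
    """Direct evaluation of the expanded expressions in (3.8-4)."""
    (ux, uy), (vx, vy) = grad_u
    nx, ny = n
    tx, ty = ny, -nx
    Fn = 2 * mu * (nx * nx * ux + nx * ny * (uy + vx) + ny * ny * vy) - p
    Ft = mu * (2 * tx * nx * ux + (tx * ny + ty * nx) * (uy + vx) + 2 * ty * ny * vy)
    return Fn, Ft


def tangential_projection(v, n):
    """v - n (v . n): the tangent-plane part of v, i.e. n x v x n."""
    vn = v[0] * n[0] + v[1] * n[1]
    return (v[0] - vn * n[0], v[1] - vn * n[1])
```

Projecting the general expression σ·n onto n and τ reproduces the expanded formulas, and the projected vector has no normal component, which is the property that distinguishes n × v × n from n × v.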
[Fig. 3.8-2 Boundary conditions in 2D. The boundary is partitioned as follows: $F_n$ and $F_\tau$ specified on $\Gamma - \Gamma_n \cup \Gamma_\tau$ (least constrained); $F_n$ and $u_\tau$ specified on $\Gamma_\tau - \Gamma_n \cap \Gamma_\tau$; $u_n$ and $F_\tau$ specified on $\Gamma_n - \Gamma_n \cap \Gamma_\tau$; $u_n$ and $u_\tau$ specified on $\Gamma_n \cap \Gamma_\tau$ (most constrained).]

Remark: It is also permissible, and often useful, to replace the above traction BC's with 'pseudo-traction' BC's, obtained simply by omitting the term $(\nabla\mathbf{u})^T$ from (3.8-1) et seq. Then, of course, $\mathbf{F}$ cannot be a true physical force; it is not the same $\mathbf{F}$ as in (3.8-1). It will turn out that this latter form is more natural when the conventional ($\nabla^2$) form of the viscous term is used when writing the NS equations and is often more useful as an OBC. It is also noteworthy that some trained more thoroughly in solid mechanics than fluid mechanics have difficulty accepting the very legitimacy of the pseudo-traction notion, let alone its utility; we assert that it is both legitimate and useful.

c. Total momentum flux

If the total momentum flux is known at a boundary (typically inflow or outflow), or is desired to be specified, the advective flux must be 'added' to the traction BC as follows (see Section 3.4.1):

$$\mathbf{n}\cdot(\rho\mathbf{u}\mathbf{u} - \boldsymbol{\sigma}) = \mathbf{F}_m, \tag{3.8-6}$$

where the vector $\mathbf{F}_m$ (considered given; i.e., specified) is the normal component of the total (local) flux of momentum on $\Gamma$, the sign change from that in (3.8-1) showing that $\mathbf{F}_m$ is more closely related to the force (including inertial) applied by the fluid to its boundary. Inserting the stress tensor from (3.3-1) and (3.3-2) yields, at a planar boundary,

$$\rho u_n\mathbf{u} + P\mathbf{n} - \mu\left(\frac{\partial\mathbf{u}}{\partial n} + \nabla u_n\right) = \mathbf{F}_m, \tag{3.8-7}$$

as the specified momentum flux BC.

d. Symmetry

Symmetry (of one kind or another) is sometimes present in fluid mechanical systems (although more so in the research world, probably, than in the 'real' world, owing mostly
to turbulence in the latter), and can be used to significantly reduce the cost of a simulation via the application of appropriate BC's at the symmetry plane (or line) and solving the problem in the appropriate fraction of the full domain: typically one half, although smaller fractions are appropriate in some cases. The most common symmetry BC is typified by vanishing normal velocity and vanishing shear stress:

$$\mathbf{n}\cdot\mathbf{u} = 0 \tag{3.8-8}$$

and

$$\boldsymbol{\sigma}\cdot\mathbf{n} - \mathbf{n}(\mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{n}) = 0. \tag{3.8-9}$$

But the simplification shown below is worth pointing out for planar boundaries. Using (3.3-2), (3.8-9) becomes $\mu(\nabla u_n + \partial\mathbf{u}/\partial n - 2\mathbf{n}\,\partial u_n/\partial n) = 0$, which itself can be simplified via $\mathbf{u} = \mathbf{u}_\tau + \mathbf{n}u_n$, $\nabla = \nabla_\tau + \mathbf{n}(\partial/\partial n)$, where $\mathbf{u}_\tau$ is the component of $\mathbf{u}$ in the tangent plane and $\nabla_\tau$ is the gradient operator in the tangent plane, to $\mu(\nabla_\tau u_n + \partial\mathbf{u}_\tau/\partial n) = 0$, which further simplifies (finally) to

$$\frac{\partial\mathbf{u}_\tau}{\partial n} = 0, \tag{3.8-10}$$

since $u_n = 0$ on $\Gamma$ (and thus $\nabla_\tau u_n = 0$). Another type of 2D symmetry BC may also be of interest, at least in special situations:

$$F_n = 0 = 2\mu\frac{\partial u_n}{\partial n} - P \tag{3.8-11}$$

and

$$u_\tau = 0, \tag{3.8-12}$$

which, in the 'Navier-Stokes' form (change two to one), was successfully employed by Silvester and Kechkar (1990) to solve one half of a Stokes flow in a box (the lid-driven cavity problem) by applying the above symmetry BC at the vertical 'centerline' of the box. Noting that $\partial u_\tau/\partial\tau = 0$ along the centerline gives $\partial u_n/\partial n = 0$, and thus $P = 0$, at least in theory. Such a BC may therefore be limited to steady Stokes flow where such a symmetry is known to exist.

e. Robin

The Robin BC, in a simpler form than (3.8-7), can also be applied to the NS equations, and it does have some utility. It is useful in the tangential direction (in 2D for simplicity):

$$\mathbf{u}\cdot\mathbf{n} = w_n \quad\text{and}\quad u_\tau + \beta\frac{\partial u_\tau}{\partial n} = w_\tau, \tag{3.8-13}$$

where $w_n$, $\beta$, and $w_\tau$ are specified (data), and we note that this will be an NBC of the weak formulation only if the '$\nabla^2$-form' of the NS equations is used; i.e., (3.2-1).
A typical and simple application of this BC would have both $w_n$ and $w_\tau$ zero and $\beta < 0$ to describe a non-penetrable boundary on which the (slip) velocity is proportional to the shear stress (see, for example, Silliman and Scriven, 1980). Professor L.E. Scriven and colleagues/students at the University of Minnesota have used a Robin BC to better match known asymptotic ($x \to \infty$) solutions in a number of situations in which such analytical results are available, and a sampling of them follows. In
Higgins (1982), a vector Robin OBC with the appropriate 'coupling coefficient' permitted good results for viscocapillary film flow using shorter domains than those needed for Dirichlet or Neumann OBC's. Bixler and Scriven (1987) extended this from 2D to 3D and found it still beneficial. In both cases, the coupling coefficient was obtained by examining the asymptotic behavior of a relevant eigenvalue problem. In Christodoulou and Scriven (1989), a Robin BC was used at the inlet of a slide coater by matching the 2D GFEM solution to an asymptotic, upstream 1D solution. [A related 1D/2D 'matching' BC was successfully employed in Kistler and Scriven (1994), although it was not a Robin condition.] For effective use of the Robin BC in a continuous-flow, chemical reactor system, see Novy et al. (1990), and for its use in porous media flow, see Novy et al. (1991), from which we quote their bottom line: 'We believe that Robin-type boundary conditions deserve more widespread use.' Finally, for a recent use of the momentum flux OBC that admits velocity reversal, see Carvalho and Scriven (1995), or Carvalho's Ph.D. Thesis (Department of Chemical Engineering and Materials Science, University of Minnesota).

A so-called 'filtration BC,'

$$\mathbf{u}\cdot\mathbf{n} + \gamma(\mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{n} + F_n) = 0 \quad\text{and}\quad \mathbf{u}\cdot\boldsymbol{\tau} = 0, \tag{3.8-14}$$

where $\gamma \geq \gamma_0 > 0$, which is a Robin BC in the normal direction, was recently proposed and briefly tested by Shopov and Iordanov (1994) to model permeable walls: 'the flux through the boundary is proportional to the pressure drop across it.'

f. OBC's

Finally, we address additional aspects of the most difficult and not-yet-resolved issue of open boundary conditions (OBC's). Although we will not derive the various BC's to be stated below until Section 3.12 on weak formulations (wherein only some are derived via NBC's), we state them now for the sake of completeness.
Key to this issue for the momentum equation vis-à-vis the scalar AD equation of the previous chapter is the tight coupling (in the normal direction) between velocity and pressure: $\nabla\cdot\mathbf{u} = 0$ at the outlet must be enforced/respected and causes significant extra troubles (although it may cause fewer troubles than in the past when it was not fully appreciated how important $\nabla\cdot\mathbf{u} = 0$ in $\Omega$ is). To encompass one of the major difficulties, we address the case in which a body force is present, often in the form of a buoyancy term in situations in which the temperature field is intimately coupled with the velocity field, a situation that forms (a portion of) the subject of Volume II. The most common OBC is that given by (3.8-1) or its $\nabla^2$-counterpart, the latter, which is simpler and often better, obtained by omitting the $(\nabla\mathbf{u})^T$ term. Expressed in 2D with a straight boundary for simplicity, it expands to

$$2\mu\frac{\partial u_n}{\partial n} - P = \mathbf{n}\cdot\mathbf{F} = F_n \tag{3.8-15}$$

in the normal direction, and to

$$\mu\left(\frac{\partial u_\tau}{\partial n} + \frac{\partial u_n}{\partial\tau}\right) = \boldsymbol{\tau}\cdot\mathbf{F} = F_\tau \tag{3.8-16}$$

in the tangential direction. Three noteworthy aspects of these traction OBC's (because they cause 'problems') are: (i) the proper values of $F_n$ and $F_\tau$, the required data, are
usually not known, (ii) the pressure appears in the normal OBC, and (iii) the tangential derivative of the normal velocity (a part of the shear stress) appears in the tangential OBC. The pressure can sometimes be eliminated (special cases) by starting from a different form of the momentum equation (which we do below), and the latter ($\partial u_n/\partial\tau$) by omitting the $(\nabla\mathbf{u})^T$ term from (3.8-1), which generally requires its omission from the momentum equation as well; i.e., use (3.2-1) rather than (3.6-1).

But why should one want to change from (3.8-15) and (3.8-16) as OBC's? As the first part of the answer to this germane question, recall the passive and useful OBC discussed and demonstrated in the previous chapter: $\partial(\;)/\partial n = 0$ usually works better than any of the alternatives. Thus, we presume that it would (usually) be 'nice' to be able to use both $\partial u_n/\partial n = 0$ and $\partial u_\tau/\partial n = 0$ as OBC's, neither of which appears to be realizable from (3.8-15) and (3.8-16). But it is actually quite easy to obtain $\partial u_\tau/\partial n = 0$ as an OBC, since the NBC's associated with the $\nabla^2$ form/conventional form of the NS equation, (3.2-1), are just

$$\mu\frac{\partial u_n}{\partial n} - P = f_n \tag{3.8-17}$$

and

$$\mu\frac{\partial u_\tau}{\partial n} = f_\tau, \tag{3.8-18}$$

where we have switched to lower case $f$'s so they are not confused with the traction force components in (3.8-15) and (3.8-16).

Remarks:
(1) If the true tractive force were known at the outflow boundary, then (3.8-15) and (3.8-16) would be a useful and appropriate OBC, as is often the case in solid mechanics, wherein the term 'outflow' is irrelevant. But the fact is that it is almost never known in flow problems, thus opening the door for considering alternate OBC's of which (3.8-17) and (3.8-18) are but one example, and a pretty useful one at that, as demonstrated, for example, in Heywood et al. (1996).
The 'problem' that can occur using (3.8-15) and (3.8-16) as outflow BC's was also demonstrated by them and also, much earlier, by Leone and Gresho (1981), who also explained the reason for the failure.
(2) There are even recent mathematical analyses of BC's of this type that are shown to be useful from the point of view of stability; i.e., it can sometimes be shown that these BC's cannot have a destabilizing influence; see, for example, Naughton (1986) and Hagstrom (1991), who also develop some theory that would apply to FDM implementation of OBC's. [On this point it is interesting to note that these BC's are completely natural (as NBC's) when the weak form is discretized via GFEM.]
(3) If the Euler equations are considered, then it may seem (at first) appropriate to relinquish BC's at the exit because the equations are 'hyperbolic.' But this is wrong; the Euler equations are mixed elliptic and hyperbolic, and there is still an implied PPE. One legitimate OBC follows from (3.8-15) and (3.8-16) with $F_\tau = 0$, or (3.8-17) and (3.8-18) with $f_\tau = 0$, by simply setting $\mu = 0$; the tangential BC simply disappears (the $u_\tau$ equation is hyperbolic with $\partial P/\partial\tau$ acting as a given source term), and the normal BC becomes $P = -f_n$, which is also an appropriate (and Dirichlet) BC (weakly applied, as an NBC) for the PPE, thus further strengthening the argument
that the pressure gradient must be integrated by parts. (This is how one can 'specify' $P$ yet not sacrifice $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$.)
(4) A look ahead to Figure 3.13-22 in Section 3.13.5e may be helpful, for OBC's and other BC's.

Exercise for the reader: Using (3.8-17), show, for $v = 0$ at $y = 0$ and $y = H$ at the outlet, that the average pressure is constrained; it must be $-(1/H)\int_0^H f_n(y)\,dy$. Hint: use $\nabla\cdot\mathbf{u} = 0$.

Remarks:
(1) It is often the case in practice that $f_n = 0$ (but not $F_\tau = 0$), and it follows that the average pressure is zero at the outlet.
(2) The same reasoning for the OBC $\partial u_n/\partial n = 0$, often seen in FDM papers, leads to a different, and stronger, constraint: $u_\tau = 0$. (See too the discussion related to (3.8-30) below.)

Now it is clear that simply setting $f_\tau = 0$ achieves the desired passive OBC for the tangential velocity. The normal component, though, is abnormal: only if $P = -f_n$ will we obtain $\partial u_n/\partial n = 0$, a goal that is often not so easy to attain since it presumes some (too much) a priori knowledge regarding the solution. Note that if $P + f_n$ is 'large,' then so too is $\mu\,\partial u_n/\partial n$ and, from $\nabla\cdot\mathbf{u} = 0$, so too is $\partial u_\tau/\partial\tau$; such artificially large values of velocity derivatives can cause a large 'distortion' near the outlet, as we will demonstrate in Volume II for a Boussinesq fluid.

But now let us return to the momentum equation itself, with a body force,

$$\rho\frac{D\mathbf{u}}{Dt} + \nabla P = \mu\nabla^2\mathbf{u} + \rho\mathbf{g}. \tag{3.8-19}$$

Sticking to 2D again for simplicity, and even to the simpler $x$-$y$ cartesian form, the component equations are

$$\rho\frac{Du}{Dt} + \frac{\partial P}{\partial x} = \mu\nabla^2 u + \rho g_x \tag{3.8-20}$$

and

$$\rho\frac{Dv}{Dt} + \frac{\partial P}{\partial y} = \mu\nabla^2 v + \rho g_y, \tag{3.8-21}$$

where we shall also assume for simplicity that the outflow boundary is at $x = L$ and is parallel to the $y$-axis. Thus, (3.8-20) is the normal component of the momentum equation, and (3.8-21) is the tangential component.
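The constraint on the average outlet pressure stated in the exercise above can be checked numerically. In the sketch below, the divergence-free velocity field (built from an assumed stream function so that v vanishes at y = 0 and y = H), the outlet pressure profile, and all parameter values are manufactured test data; f_n is then defined from (3.8-17), and the two averages are compared by a midpoint rule.

```python
# Numerical check (illustrative) of the exercise on the average outlet pressure:
# with v = 0 at y = 0 and y = H, (3.8-17) and div u = 0 imply
#   (1/H) * integral of P dy = -(1/H) * integral of f_n dy   at the outlet.
# The velocity field and pressure profile below are assumed test data.
import math

MU = 0.7      # dynamic viscosity (assumed value)
H = 2.0       # outlet height (assumed value)
X_OUT = 1.3   # outlet location (assumed value)


def dudx(x, y):
    # From the stream function psi = sin(x) sin^2(pi y / H):
    # u = d(psi)/dy, v = -d(psi)/dx = -cos(x) sin^2(pi y / H), so v(0) = v(H) = 0,
    # and du/dx = cos(x) (pi / H) sin(2 pi y / H).
    return math.cos(x) * (math.pi / H) * math.sin(2.0 * math.pi * y / H)


def pressure(y):
    # Arbitrary (assumed) outlet pressure profile.
    return 0.4 + 0.25 * y * y - math.cos(3.0 * y)


def f_n(y):
    # Data consistent with (3.8-17): f_n = mu du_n/dn - P, with n along +x.
    return MU * dudx(X_OUT, y) - pressure(y)


def average(func, n=2000):
    # Midpoint rule over 0 <= y <= H.
    h = H / n
    return sum(func((i + 0.5) * h) for i in range(n)) * h / H
```

The viscous contribution integrates to zero because the integral of du/dx over the outlet equals -[v(H) - v(0)] = 0, so `average(pressure)` and `-average(f_n)` agree regardless of the pressure profile chosen.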
Next, we introduce the 'experimental fact' that the pressure field is usually largely dominated by the hydrostatic 'component'; i.e., if $P = P_H + \delta P$, where $P_H$ (by definition) satisfies $\nabla P_H = \rho\mathbf{g}$, then it is often true that $\delta P \ll P_H$. (The 'Lagrange multiplier portion' of $P$, required to enforce $\nabla\cdot\mathbf{u} = 0$, is small but still crucial!) Returning now to (3.8-17), which becomes

$$\mu\frac{\partial u}{\partial x} = P + f_n = P_H + f_n + \delta P, \tag{3.8-22}$$

we see that we could make $\partial u/\partial x$ 'small' if we could cause $P_H + f_n = 0$, which we can do by using the tangential component of the hydrostatic equation, $\partial P_H/\partial y = \rho g_y$, to
compute a 'proper' value of $f_n$ to use in the normal component, via

$$f_n(y, t) = -P_H = -\rho\int_0^y g_y(L, y', t)\,dy', \tag{3.8-23}$$

where the constant of integration is taken to be zero (usually permissible, at least in 2D). While inconvenient at best, such a procedure can and has been made to work; i.e., it can give good results in cases where the previously discussed OBC's do not (see, for example, Leone et al., 1983; Lee and Leone, 1988; and Leone and Lee, 1989). This hydrostatic OBC then reads (at $x = L$)

$$\mu\frac{\partial u}{\partial x} - P = -\rho\int_0^y g_y(L, y', t)\,dy', \tag{3.8-24}$$

which is relatively easy to implement. It is also more 'useful' in a time-dependent situation (at least if a good initial pressure is available) than for the steady equations, which also need a good initial guess. (A poor initial guess causes a very slow, linear convergence; see Leone, 1980.)

Another method that has proven useful in dealing with certain OBC problems when $\mathbf{g}$ points along only one of the coordinate directions is one that is often employed in geophysical fluid mechanics: modify the vertical component of the momentum equation as follows (again for the 2D cartesian case, and with $\mathbf{g} = \mathbf{e}_y g$):

$$\rho\frac{Dv}{Dt} + \frac{\partial P}{\partial y} = \mu\nabla^2 v + \rho g - f(y, t), \tag{3.8-25}$$

where it is important that the newly introduced 'forcing function,' $f$, be independent of $x$. This constraint permits a modification of the pressure as follows: $\nabla P \to \nabla P - \mathbf{f}(y, t)$; i.e., since $\mathbf{f} = \mathbf{e}_y f(y, t)$ is curl-free, it can be expressed as the gradient of a scalar. Next, take $f(y, t)$ to be special: $f(y, t) = \rho g(L, y, t)$, the value of the original forcing function at the outlet plane. Thus we have

$$\rho\frac{D\mathbf{u}}{Dt} + \nabla P = \mu\nabla^2\mathbf{u} + \rho[\mathbf{g}(x, y, t) - \mathbf{g}(L, y, t)], \tag{3.8-26}$$

and the OBC of $\mu\,\partial u/\partial x = P + f_n$ will now give $\partial u/\partial x \approx 0$ simply by setting $f_n = 0$, because $P = P_H + \delta P$ now has $P_H = 0$ at the outlet (and $\delta P$ is, still, small); i.e., $\nabla P_H = \rho[\mathbf{g}(x, y, t) - \mathbf{g}(L, y, t)]$ gives $\nabla P_H = 0$ at $x = L$. This OBC is of course much easier to implement, the extra cost now being that associated with the second 'source' term.
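For the first of the two procedures above, the datum f_n of (3.8-23) must be tabulated along the outlet. A minimal sketch of one way to do this at a given time level is shown below, assuming the outlet nodes are ordered from y = 0 upward and that g_y is available at those nodes; the function name, the trapezoidal quadrature, and the sample profiles are assumptions for illustration rather than the procedure used in the cited references.

```python
# Illustrative evaluation of the hydrostatic datum (3.8-23),
#   f_n(y) = -P_H(y) = -rho * integral from 0 to y of g_y(L, y') dy',
# at the nodes of an outlet boundary, using the trapezoidal rule.
# The node layout and the g_y profiles used with it are assumed test data.


def hydrostatic_fn(y_nodes, gy_nodes, rho):
    """Return f_n at each outlet node; the integration constant is taken as zero."""
    fn = [0.0]
    for k in range(1, len(y_nodes)):
        dy = y_nodes[k] - y_nodes[k - 1]
        fn.append(fn[-1] - rho * 0.5 * (gy_nodes[k] + gy_nodes[k - 1]) * dy)
    return fn
```

For a uniform downward acceleration g_y = -g the result is simply f_n = ρ g y, and the trapezoidal rule is exact whenever g_y varies linearly between nodes.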
We shall return to this BC, and demonstrate it (as well as others), in the chapter on 'Boussinesq' fluids in Volume II.

An interesting OBC situation arises by considering the rotational form of the NS equation (3.6-2). The form of (3.8-17) and (3.8-18) that is relevant here is

$$Re^{-1}\frac{\partial u_n}{\partial n} - P_T = f_n \tag{3.8-27}$$

and

$$Re^{-1}\frac{\partial u_\tau}{\partial n} = f_\tau, \tag{3.8-28}$$

where $P_T = P + \tfrac{1}{2}\mathbf{u}\cdot\mathbf{u}$ is the 'Bernoulli' pressure. Consider now the following interesting situation: $Re \gg 1$ (common) and the flow is nearly irrotational ($\boldsymbol{\omega} \approx 0$, less common),
for which (3.6-2) simplifies to $\partial\mathbf{u}/\partial t + \nabla P_T = 0$, $\nabla\cdot\mathbf{u} = 0$. Setting $f_n$ and $f_\tau$ to zero in (3.8-27) and (3.8-28) for this case yields $P_T = 0$ and $\partial u_\tau/\partial n = 0$. But $P_T = 0 = P + \tfrac{1}{2}q^2$ is the Bernoulli equation for potential flow; i.e., it is just the right BC for the case postulated. It appears that only this combination of equation 'form' and OBC (as an NBC for the weak form, to be derived later) would work well in this case, or in its limiting form via the following problem: solve a potential flow problem in a 'flow-through' domain, prescribe the resulting velocity as an initial condition, set $\nu = 0$, and consider the time-evolution of the resulting Euler equations. It seems that only the above formulation could even hope to hold the IC as a steady solution (i.e., it would give $\partial\mathbf{u}/\partial t = 0$); any other OBC will violate the Bernoulli equation, $P + \tfrac{1}{2}q^2 = 0$, by changing the pressure. Then, according to Kelvin's theorem, this no-longer-potential flow must introduce vorticity at the inlet region; a bad BC at the outlet will cause an 'error' at the inlet!

Next we mention three cases wherein some experimentally-inspired OBC's that have proven useful in practice but are difficult or impossible to analyze (rationalize?) have been demonstrated to work well, in some sense. In Taylor et al. (1985), an iterative 'bootstrapping' technique was employed in which the first guess was the 'conventional' GFEM homogeneous NBC as OBC: zero tractions [$F_n = F_\tau = 0$ in (3.8-15) and (3.8-16)]. Then, using the solution obtained with these BC's, the OBC is updated to a non-zero traction NBC, using the values of $\mathbf{u}$ and $P$ just computed to update the 'imposed' tractions, and the process is iterated until convergence. They also applied it to time-dependent flows by using the previous time-step values to update the tractions for the next step. Neither this method nor the one to be discussed next has been analyzed to see what actual PDE BC's these iterative methods converge to.
In the FIDAP code (used for many of the examples in this book), the normal momentum equation OBC is treated as follows, and is actually very similar to the method of Taylor et al. just discussed, but simpler. That is, ignore the viscous part of $F_n$ (or $f_n$) and update the RHS (per iteration or per time step) simply by the approximation $F_n = -P$ (and $f_n = -P$ for pseudo-traction).

The third method has already been introduced and discussed in Chapter 2, at the end of Section 2.4.1. It is simply this: for nodes on an open boundary, do not integrate by parts the viscous terms. What the 'free' BC lacks in understanding, it seems to make up for in performance. We recommend it as probably being the best of the three, and again [as we did in Sani and Gresho (1994)] implore the mathematics community to (further) analyze it! It has even been successfully applied as an inlet BC by Carvalho and Scriven (1996).

To conclude the discussion, recall that in Section 2.6.2c of Chapter 2 we show how a hard (silly?) Dirichlet OBC can be accommodated for advection-dominated flows without generating wiggle signals 'simply' by using a fine-enough (graded) mesh at the outlet so that the BC-induced BL is at least marginally resolved; see Figure 2.6-53 vis-à-vis Figure 2.6-52. Here we briefly revisit this case for the NSE's, because the same 'solution' works. Figure 3.8-3 (thanks to S. Chan) shows a snapshot of the OBC region of a vortex shedding simulation behind a square cylinder at Re = 100 using the hard (silly) OBC of $u = 1$, $v = 0$. Because the OBL is resolved via a graded mesh, the vector field goes smoothly from what it 'wants' to be to what we have forced it to be. The same calculation without OBC resolution generated huge wiggles (not shown). The homogeneous OBC's of (3.8-17) with $f_n = 0$ and (3.8-18) with $f_\tau = 0$ give very nice results, with no wiggles, on the coarse mesh shown just upstream of the outlet.
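The effect of outlet mesh grading on a hard Dirichlet OBC can be reproduced in the 1D scalar advection-diffusion analogue of Chapter 2; the sketch below is that analogue, not an NS calculation. It assumes linear Galerkin elements, for which the interior nodal equations reduce to a two-term recurrence on the nodal jumps, and it uses illustrative values for the velocity, diffusivity, node count, and grading ratio.

```python
# 1D scalar analogue (illustrative) of a "hard" Dirichlet outflow condition:
#   u dphi/dx = k d2phi/dx2 on 0 < x < 1,  phi(0) = 0,  phi(1) = 1,
# discretized with linear Galerkin elements on a possibly graded mesh.
# The interior nodal equations reduce to a two-term recurrence for the jumps
# d_i = phi_{i+1} - phi_i:
#   (k/h_{i-1} + u/2) d_{i-1} = (k/h_i - u/2) d_i,
# so a cell with Peclet number u h_i / k > 2 flips the sign of the jump (a wiggle).


def solve(h, u=1.0, k=0.01):
    """Return nodal values phi_0..phi_N for cell sizes h[0..N-1]."""
    n = len(h)
    d = [0.0] * n
    d[n - 1] = 1.0                       # provisional scale, fixed below
    for i in range(n - 1, 0, -1):
        d[i - 1] = d[i] * (k / h[i] - 0.5 * u) / (k / h[i - 1] + 0.5 * u)
    scale = 1.0 / sum(d)                 # enforce phi(1) - phi(0) = 1
    phi = [0.0]
    for di in d:
        phi.append(phi[-1] + di * scale)
    return phi


def uniform_mesh(n):
    return [1.0 / n] * n


def graded_mesh(n, ratio=0.5):
    """Cell sizes shrinking geometrically toward the outlet at x = 1."""
    raw = [ratio ** i for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]
```

With ten cells and these assumed parameters, the uniform mesh (cell Peclet number 10) produces large undershoots next to the outlet, whereas the graded mesh, which resolves the outflow boundary layer, leaves only very small oscillations in the coarse upstream cells.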
[Fig. 3.8-3 Snapshot of vortex shedding with Dirichlet BC at outlet.]

g. More OBC's

Other OBC's are possible, and we list some more below, but point out that they are thus far more theoretical than practical; i.e., they are legitimate BC's but too few have tested them numerically on a range of problems. In fact, it is probably safe/fair to say that they were not specifically 'designed' (derived) to solve the 'OBC problem' but, rather, just to show other legitimate and potentially useful BC's for the NS equations that are based on the fact that well-posed weak formulations suggest legitimate BC's for the strong form via the associated natural boundary conditions. (Our opinion at this time is more pragmatic in the following sense: if they are not useful as OBC's, then where would they be useful?) We present six avant-garde BC's, listed below, some of which we shall derive when we present the various weak formulations in Section 3.12. For further details regarding the others, consult the original references, which we also list below.

1. The tangential velocity and the pressure can be specified. This BC could be a useful OBC if a parallel flow (or nearly so) is known to exist, as it does away with the awkward coupling between $P$ and $\partial u_n/\partial n$; i.e., set $P = 0$ and $\mathbf{n}\times\mathbf{u}\times\mathbf{n} = \mathbf{u}_\tau = 0$ at the outlet. [O. Pironneau (personal communication) derived this OBC to solve, for example, the problem of a bifurcating flow, such as one pipe splitting into two, wherein different downstream pressures are presumed known. Our belief, however, is that this type of problem could be solved nearly as well, and more easily, by using (3.8-17) and (3.8-18) and specifying $f_n$ for the desired pressure at the two outlets, at least in the absence of body force terms.]

2. The tangential vorticity and the pressure can be specified, the formulation of which we shall derive later (Section 3.12.2).
This too could be useful as an OBC if it can be assumed that the flow is irrotational at the exit, a situation that is often realized in aerodynamics wherein only the wake region contains significant vorticity. Another application might be to channel flow wherein a linear variation in vorticity across the channel is often realized (Poiseuille flow).

3. The normal velocity and tangential vorticity can be specified. The utility of this BC as an OBC is doubtful, we believe, since it is rarely a good idea to use Dirichlet data at an outflow point. (It is a wiggle-maker in general, as discussed above.)
4. The normal velocity, normal vorticity, and normal component of the curl of the vorticity, $\mathbf{n}\cdot(\nabla\times\boldsymbol{\omega})$, can be specified. Really; see Girault (1988b). If the normal curl of the vorticity does not appeal to you, then consider the next one:

5. Specified normal velocity, normal vorticity, and normal pressure gradient, where the last of these is not at your discretion; it must be $\partial P/\partial n = \rho\,\mathbf{n}\cdot\mathbf{g}$, where $\mathbf{g}$ is the applied body force/acceleration.

6. Finally, it is possible to specify the normal velocity, the tangential vorticity, and (a particular) normal pressure gradient. This BC is more ticklish yet, as it requires satisfaction of the following 'constraint' on the data: $\partial P/\partial n = \rho\,\mathbf{n}\cdot\mathbf{g} - \nabla_s\cdot\boldsymbol{\omega}_\tau$, where $\boldsymbol{\omega}_\tau$ is the specified tangential vorticity and $\nabla_s$ is the surface gradient operator. Again, mainly because of the need to specify the normal velocity, we see little utility in this BC as an OBC.

Further details on the above BC's can be found in the following references, where OP1, 2, 3 = Pironneau (1986, 1987, 1989), HF = Hughes and Franca (1987), VG1, 2 = Girault (1988a, b), and MG = Gunzburger (1989): for 1, see all of the above; for 2, see HF and MG; for 3, see OP2, HF, VG1, 2, and MG; for 4, see VG2; for 5, see OP3, who attributes it to Girault (1988a); and for 6, see OP1, 3.

A final remark on BC's: Periodic BC's can be utilized, but their discussion is deferred until later (Section 3.13.9) for reasons that will be explained there.

h. Penalty method OBC's

What about BC's in the penalty approximation? Since there is no pressure (at least explicitly) in (3.5-10), it would seem that there might even be a reward connected with the penalty in that the annoying pressure need not show up in the BC's. The logical (and indeed, proper and legitimate) BC's for (3.5-10) are simply Robin conditions, which include the subsets of Dirichlet and Neumann. This would then seem to allow, for a significant example, a return to the desired passive OBC of $\partial u_n/\partial n = 0$.
Does it? Unfortunately, no; the glimmer of hope for a free lunch is dashed by the realization that the penalty term, $\lambda\nabla(\nabla\cdot\mathbf{u})$, also shows up in the OBC. For example, if (3.8-15) would have been our OBC in the u-P formulation, then the penalty approximation to same is (necessarily) obtained by replacing $P$ by $-\lambda\nabla\cdot\mathbf{u}$ to obtain

    $2\mu\,\partial u_n/\partial n + \lambda\nabla\cdot\mathbf{u} = F_n$;    (3.8-29)

that is, the penalty version of the OBC is (must be) also penalized in order to keep $\nabla\cdot\mathbf{u}$ small at the outflow boundary. The pressure is still present, only in a disguised form. Thus, in practice when using the penalty method, it is usually best to 'think u-P' even though the actual equations do not contain the pressure.

i. Ill-posed OBC's

Having exposed (overexposed?) the reader to quite a variety of potentially confusing BC options for the u-P form of the NS equations, we wish now to point out that only a few of these will actually be carried through the rest of the text. Do not despair. But do be aware of the fact that the 'best' open BC for these equations is still an open issue.
We conclude this OBC discussion by mentioning some ill-posed OBC's for the normal velocity that have nevertheless been proposed and used (somehow) in CFD:

    $\partial u_n/\partial n = 0$,    (3.8-30)

    $\partial u_n/\partial n = 0$ and, at just one point, $P = 0$,    (3.8-31)

    $\partial u_n/\partial t + V\,\partial u_n/\partial n = 0$,    (3.8-32)

where $V$ is user-specified, and

    $\partial u_n/\partial n = 0$ and $P = 0$.    (3.8-33)

We mention now and analyze later (Section 3.12.5) the reasons for the ill-posedness: the first three BC's are ill-posed because they are under-specified, leading to non-uniqueness (an infinity of solutions), and the last because it is over-specified (no solution exists). If (3.8-32) were changed to

    $\nu(\partial u_n/\partial t + V\,\partial u_n/\partial n) = VP$,    (3.8-34)

then the ill-posedness would, we assert, be vanquished. As to the utility of this OBC, we can only conjecture that it might work well for problems with no body force, but perhaps not otherwise, and perhaps never better than $\nu\,\partial u_n/\partial n = P$, to which it reverts at a steady state. Its implementation would follow along either of the two lines presented for $\partial T/\partial t + V\,\partial T/\partial n = 0$ of Chapter 2 (Section 2.4).

For further discussion on some OBC issues, see the paper by Sani and Gresho (1994) that summarizes two OBC minisymposia (see too Gresho, 1991c, for more on these symposia) and also shows some 'fuzzy' BC's: those that deliver useful results on 'normal' grids but that are ill-posed in the continuous limit. See too the OBC benchmark solutions to four test problems, in Volume 11, No. 7 of the International Journal for Numerical Methods in Fluids (1990).

By way of introduction to BC's for derived equations in the next two sections, we make the following general (and generally obvious) remark: the BC's must also be 'derived.' So, if BC's for u-P are still vague/obscure in any way, those for derived equations must be vaguer/'obscurer' yet.
3.8.2 The Pressure Poisson Equation and Pressure Boundary Conditions

One of the most confusing and misunderstood aspects of incompressible flow has been that of 'boundary conditions for the pressure.' While it seems to be better understood that one role of the pressure is to keep the flow divergence-free, it has not (until recently) been clear just how this phenomenon occurs mathematically. While it is well known that the PPE (pressure Poisson equation) is an alternate way to state that $\nabla\cdot\mathbf{u} = 0$ in $\Omega$, it is much less known that the BC's for the PPE are intimately related to the simple fact that $\nabla\cdot\mathbf{u}$ must also vanish on the boundary of $\Omega$; and even when this is known (or believed), the translation of $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ to actual and legitimate BC's for the elliptic PPE has usually not been obvious. Indeed, the very meaning of $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ is somewhat ambiguous; e.g., consider a straight boundary in 2D: is it simply $\partial u_n/\partial n + \partial u_\tau/\partial\tau = 0$ or could it be something else? While the full answer must await the following section on
initial conditions, we provide an introduction to it now: Except for one very important special case, the vanishing of div $\mathbf{u}$ can be expressed as $\partial u_n/\partial n + \partial u_\tau/\partial\tau = 0$. The special case is 'start-up', $t = 0$. The statement 'div $\mathbf{u} = 0$ on $\Gamma$ at $t = 0$' translates, in practice, to $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$, where $\mathbf{u}_0$ is the initial velocity in $\Omega$ and $\mathbf{w}_0$ is the specified velocity (at $t = 0$) on $\Gamma$. It is quite permissible, for example, to have $\partial u_n/\partial n = 0$ and $\partial u_\tau/\partial\tau \neq 0$ at $t = 0$ and yet satisfy $u_{0n} = w_{0n}$, giving the result that $\mathbf{u}_0$ is divergence-free and legitimate. (For example, the lid-driven cavity could be driven by a 'tent' function for $u_\tau$, with $\mathbf{u}_0 = 0$ in $\Omega$.) For $t > 0$, the same requirement, $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$, is equivalent to $\partial u_n/\partial n + \partial u_\tau/\partial\tau = 0$, because any discontinuities in $u_\tau$ (the only velocity component permitted to be discontinuous at $t = 0$) are 'smoothed by viscosity' for $t > 0$.

So what does this have to do with the pressure and BC's for the PPE? The answer is this: if and only if the PPE BC's are always derived from the proper statement/realization of $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ can they be guaranteed to be correct. Here we define 'correct' as follows: the correct BC's on the PPE will ensure that $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ for all time and that the normal velocity will be continuous [$\lim_{x\to 0}\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ for all $t$, where $x$ is here construed to be the distance from $\Gamma$ into $\Omega$ along the direction of $\mathbf{n}$, and where $\mathbf{n}$ is presumed to be uniquely defined on $\Gamma$].

So how does one find these proper BC's for the PPE? After all, it is well known that a Poisson equation can be solved with, in the most general case, any Robin BC: $\alpha P + \beta\,\partial P/\partial n = \gamma$, where $\alpha$, $\beta$, and $\gamma$ are in general completely arbitrary (unless $\alpha = 0$ and $\beta \neq 0$, the Neumann case, in which the Neumann compatibility conditions between the RHS of the Poisson equation and the boundary data must be respected; i.e., if $\nabla^2 P = f$ in $\Omega$, then the divergence theorem tells us that $\int_\Omega f = \int_\Gamma \gamma/\beta$ in order for the problem to be well-posed).
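The Neumann compatibility condition just stated can be checked numerically before a Poisson solve is attempted. The following is only a minimal sketch on the unit square, with $\beta = 1$ so that $\gamma$ is the prescribed $\partial P/\partial n$; the function name `neumann_compatibility_gap` and the manufactured test field $P = x^2 + y^2$ are illustrative assumptions, not taken from the text.

```python
import numpy as np

def neumann_compatibility_gap(f, edge_fluxes, n=200):
    """Return (integral of f over the unit square) minus
    (integral of dP/dn over its boundary), using the midpoint rule.

    f           : callable f(x, y), the Poisson right-hand side
    edge_fluxes : four callables g(s), the prescribed dP/dn on each
                  unit-length edge, s being arc length along the edge
    """
    s = (np.arange(n) + 0.5) / n                 # midpoints on [0, 1]
    x, y = np.meshgrid(s, s, indexing="ij")
    volume_integral = np.mean(f(x, y))           # area of the square is 1
    boundary_integral = sum(np.mean(g(s)) for g in edge_fluxes)
    return volume_integral - boundary_integral

# Manufactured example (an assumption): P = x^2 + y^2, so f = 4 and
# dP/dn = 0 on x = 0 and y = 0, dP/dn = 2 on x = 1 and y = 1.
f = lambda x, y: 4.0 + 0.0 * x
compatible = [lambda s: 0.0 * s, lambda s: 2.0 + 0.0 * s,
              lambda s: 0.0 * s, lambda s: 2.0 + 0.0 * s]
gap_ok = neumann_compatibility_gap(f, compatible)

# Changing the data on one edge breaks solvability: the gap is no longer zero.
incompatible = [lambda s: 0.0 * s, lambda s: 1.0 + 0.0 * s,
                lambda s: 0.0 * s, lambda s: 2.0 + 0.0 * s]
gap_bad = neumann_compatibility_gap(f, incompatible)
```

A zero gap is necessary for the pure Neumann problem to have a solution; a nonzero gap, as in the second data set, signals data for which no solution exists.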
The other special case, of Dirichlet BC's, is realized via $\beta = 0$, $\alpha \neq 0$.

The way in which we 'learned how' to apply $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ is detailed in Gresho and Sani (1987, 1988), and comprises basically two simple steps: (i) define any consistent discrete approximation to the equations $\mathbf{a} + \nabla P = \mathbf{f}$ and $\nabla\cdot\mathbf{a} = 0$, which $\Rightarrow \nabla^2 P = \nabla\cdot\mathbf{f}$, with $\mathbf{f}$ given and with any of the legitimate BC's discussed above for the velocity applied (differentiated in time) to the vector $\mathbf{a}$ (acceleration); and (ii) solve the discrete vector equations for all values of $\mathbf{a}$ not specified by the BC's and insert the result into the discrete form of $\nabla\cdot\mathbf{a} = 0$. The result (when the grid size shrinks to zero and node points at the boundary are examined) will be the proper BC for the PPE. Some of these operations were performed by Gresho and Sani, and later in this chapter we shall perform many more. [This procedure was quickly picked up by Veldman (1990) who, inspired by the above paper, published another called 'Missing Boundary Conditions? Discretize First, Substitute Next, and Combine Later,' in which other examples are also included. Unfortunately, Veldman did not perform the 'final' step: analyze your final discrete equations to discover the true BC's inherited by the higher-order PDE's.]

Below we summarize the results of such an activity and present the proper pressure BC's for the PPE, after noting that techniques can be devised for explicitly enforcing $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ as a BC for the PPE [Canuto et al. (1988a, p. 404 ff), Schuller (1990)], which BC is equivalent to those presented below. [See also Gresho and Sani (1987, 1988).]

If the normal velocity is specified on $\Gamma$, the proper PPE BC, from Gresho and Sani, is the Neumann BC obtained simply by applying the normal component of the momentum equation on $\Gamma$. Thus, recalling (3.5-3), the PPE

    $\nabla^2 P = \rho\nabla\cdot(\mathbf{g} + \nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u})$,    (3.8-35)
inherits the Neumann BC,

    $\partial P/\partial n = \rho\,\mathbf{n}\cdot(\mathbf{g} + \nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)$,    (3.8-36)

where, of course, the last term on the RHS of (3.8-36) is replaced by the given data, $\rho\,\mathbf{n}\cdot\partial\mathbf{w}/\partial t$. [For all $t \geq 0$, the alternate realization of $\nabla\cdot\mathbf{u} = 0$ in $\Omega$ and $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ on $\Gamma$ (i.e., of $\nabla\cdot\mathbf{u} = 0$ in $\bar{\Omega}$) is just (3.8-35) and (3.8-36), although there are some pitfalls here that will be described later.]

Remarks:

(1) The pressure from (3.8-35) and (3.8-36) is obtainable only up to an arbitrary additive constant (the so-called hydrostatic pressure mode, which 'constant' could actually be an arbitrary function of time). It is thus permissible to set the pressure at any one point in $\Omega$, at any value, to resolve the ambiguity.

(2) The Neumann compatibility/solvability condition is automatically satisfied in the (only relevant) case wherein $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ on the whole of $\Gamma$, provided that the u-P equations from which the PPE + BC's are derived are well-posed (the only case of interest). The details and proof will be presented later (Section 3.9.2).

(3) For $t > 0$, it turns out (see Gresho and Sani, 1987, 1988) that the tangential components of the momentum equation also apply on $\Gamma$, and could be used in principle, as PPE BC's; i.e., for $t > 0$ we have, in addition to (3.8-36), on $\Gamma$,

    $\mathbf{n}\times\nabla P\times\mathbf{n} = \rho\,\mathbf{n}\times(\mathbf{g} + \nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)\times\mathbf{n}$,    (3.8-37)

which, via integration over $\Gamma$, can be converted to/interpreted as a Dirichlet BC for the pressure, and the so-called overdetermined Neumann problem is then well-posed [both Neumann and Dirichlet BC's, the latter via integration of (3.8-37) over $\Gamma$, are satisfied]. But this equation generally does not apply at $t = 0$ because of the existence (in the general case) of vortex sheets on $\Gamma$; the overdetermined Neumann problem is generally ill-posed at $t = 0$. (See, for example, Gresho, 1991a, b.)
Also, since it does not ever appear to us to be a useful or easy-to-implement BC, we drop it from further consideration, and simply mention that it is automatically satisfied when the BC employed is (3.8-36).

(4) For $t > 0$, (3.8-35) + (3.8-36) $\Leftrightarrow$ (3.8-35) and (3.8-37), and these in turn $\Rightarrow \nabla\cdot\mathbf{u} = 0$ in $\Omega$ and on $\Gamma$. [At $t = 0$, (3.8-37) does not generally apply, and thus $\nabla\cdot\mathbf{u} = 0$, on $\Gamma$, is not implied. Details to follow in Section 3.9.2.]

(5) For reasons to be discussed later, it is generally not permissible to neglect $\nabla\cdot(\nabla^2\mathbf{u})$ in (3.8-35) via the argument that $\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}$, and therefore $\nabla\cdot\nabla^2\mathbf{u} = \nabla^2(\nabla\cdot\mathbf{u}) - \nabla\cdot(\nabla\times\nabla\times\mathbf{u}) = 0$ because $\nabla\cdot\mathbf{u} = 0$ and div curl $(\cdot) = 0$.

(6) If only steady solutions of the NS equations are sought via the tactic of setting $\partial\mathbf{u}/\partial t = 0$ and attacking directly the steady equations, then the PPE 'method' is only applicable if $\nabla\cdot\mathbf{u} = 0$ is used as a PPE BC, and even then it is not a generally recommended procedure, although we mention that Schuller (1990) has successfully employed it as part of a multigrid solution algorithm.

(7) The symmetry BC (planar) is merely a special case of (3.8-36), which simplifies to $\partial P/\partial n = \rho\,\mathbf{n}\cdot\mathbf{g}$. {Proof: (i) $\mathbf{n}\cdot\nabla^2\mathbf{u} = \nabla^2 u_n = \partial^2 u_n/\partial n^2 + \partial^2 u_n/\partial\tau^2$; but $u_n = 0$
and therefore $\partial^2 u_n/\partial\tau^2 = 0$, and $\partial^2 u_n/\partial n^2 = -\partial/\partial\tau(\partial u_\tau/\partial n)$ via $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$, and $\partial u_\tau/\partial n = 0$ because of symmetry [cf. (3.8-9)]; (ii) $\mathbf{n}\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = \mathbf{u}\cdot\nabla u_n = u_n\,\partial u_n/\partial n + u_\tau\,\partial u_n/\partial\tau = 0$ because $u_n = 0$ and $\partial u_n/\partial\tau = 0$; (iii) $\mathbf{n}\cdot\partial\mathbf{u}/\partial t = \partial u_n/\partial t = 0$ because $u_n = 0$ for all $t$.}

(8) For steady Stokes flow with no body force, the PPE + BC simplifies to $\nabla^2 P = 0$ in $\Omega$ and $\partial P/\partial n = \nu\,\mathbf{n}\cdot\nabla^2\mathbf{u}$ on $\Gamma$, showing that the entire pressure field 'comes from' the inhomogeneous Neumann BC.

So much for the most common BC, specified (normal) velocity, for now. Let us move on to the rest, or at least to some others, as it may not be fruitful to attempt to be exhaustive.

1. If the normal velocity is not specified on the boundary, but rather the normal traction (or pseudo-traction) is, then it turns out, somewhat paradoxically, that this Neumann BC for the velocity, a la (3.8-1) or (3.8-2), carries over completely intact, but is a Dirichlet BC for the PPE. For example, if the normal component of (3.8-1), or its pseudo-traction counterpart in which the term $(\nabla\mathbf{u})^T$ is omitted, is the BC applied to the momentum equation, then it is actually inherited by the PPE; i.e., the (Dirichlet) BC for the PPE in this case is

    $P = \mu\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - \mathbf{n}\cdot\mathbf{F}$,    (3.8-38)

which, for the simpler case of planar boundaries (or straight, in 2D), becomes $P = 2\mu\,\partial u_n/\partial n - F_n$, a normal force balance. [If $(\nabla\mathbf{u})^T$ is omitted, then this becomes $P = \mu\,\partial u_n/\partial n - f_n$, a pseudo-force balance.]

2. Similarly, the specified momentum flux BC of (3.8-7) carries over, as another Dirichlet BC (via the normal component) for the PPE.

3. OBC's. The general statement regarding OBC's for the PPE is this: whenever the normal OBC for the momentum equation is of the form $\alpha(\partial u_n/\partial n) - P = \beta$, the same BC applies, interpreted as a Dirichlet BC, for the pressure; i.e., $P = \alpha\,\partial u_n/\partial n - \beta$. This covers (3.8-15), (3.8-17), (3.8-24), and (3.8-27), as well as the BC for (3.8-26) in the form $P = \mu\,\partial u/\partial x$ (because $f_n$ is set to zero).
We also remark that it is often (usually) the case that the viscous terms are small at such a boundary so that $P = -\beta$ is observed.

We stop here, purposely avoiding the PPE BC's corresponding to the avant-garde OBC's for the u-P equations. This we leave as an exercise, perhaps difficult, for the reader. As a final remark, however, we emphasize the close coupling between the normal velocity (or normal momentum) equation and the pressure at all boundaries, another consequence of $\nabla\cdot\mathbf{u} = 0$.

3.8.3 The Vorticity Transport Equation and Boundary Conditions on the Vorticity

a. The 2D stream function-vorticity formulation

If one simply (naively) examines each of the $\psi$-$\omega$ pair of equations (3.6-5) and (3.6-6), in turn, and applies classical PDE theory to each, one reaches a dilemma that has caused at least as much confusion (and probably more years of frustrating research) as has been
associated with the subject of boundary conditions for the pressure. The reason for the confusion is this: each equation involves a Laplacian operator and thus each ostensibly needs BC's: Dirichlet, Neumann, or Robin. But for the most common BC, $\mathbf{u} = \mathbf{w}$ on $\Gamma$, which translates to $u_n = w_n = \partial\psi/\partial\tau$ and $u_\tau = w_\tau = -\partial\psi/\partial n$ on $\Gamma$, the dilemma becomes clear: there are two BC's on $\psi$ and none on $\omega$; $\psi$ has one too many and $\omega$ has one too few. Since we will not pursue the $\psi$-$\omega$ formulation in earnest in this book, we simply state the resolution of this dilemma and refer the interested reader to Gresho (1991a, b, 1992) for the details.

The fallacy in the above application of PDE theory, well understood by Glowinski and Pironneau (1979) in their important paper on this subject, is rooted in the specious notion that each equation needs BC's because each contains the $\nabla^2$ operator. In fact, however, these two equations are very closely coupled (as indeed are $\mathbf{u}$ and $P$ in the primitive variables or PPE formulations) and it is only required that proper BC's exist for the coupled pair. It turns out, when examined in this light, that two BC's on $\psi$ and zero on $\omega$ are just fine; there are no BC's for the vorticity in $\psi$-$\omega$ formulations, and none is needed. This realization of course brings with it some extra cost, however, which helps to justify the older approaches that attempted to avoid this cost: the coupled system of (3.6-5) and (3.6-6) must be solved as a coupled system; only then do the two BC's on $\psi$ and none on $\omega$ permit a solution. [In the finite difference world on uniform grids, however, E and Liu (1996) argue convincingly that some of the older methods using uncoupled BC's, in which one of the $\psi$ BC's is converted to one for $\omega$ via 'local' formulae, are virtually equivalent to the more globally coupled 'modern' methods.]
Remark: The $\partial\psi/\partial\tau = w_n$ (specified penetration) BC is usually converted to a Dirichlet BC by integrating along the boundary: $\psi(\tau) = \psi_0 + \int_0^{\tau} w_n\,ds'$, which works just fine for simply connected domains.

So much for 'no-slip' (and 'no-penetration') boundaries; what about inflow and outflow? Well, if $\mathbf{u} = \mathbf{w}$ is the known/desired BC, then there is no choice, no change: two on $\psi$ and none on $\omega$. [If $\nu = 0$, however, then a value of $\omega$ must be specified at the inlet (only) and this is proper: the equation is then hyperbolic and the $\partial\psi/\partial n$ BC must be dropped (e.g., no-slip is no longer possible).] It is also permissible to specify the vorticity at the inlet, even for $\nu > 0$, unless $\mathbf{u} = \mathbf{w}$ is given (see above). If in addition $\psi$ is specified (normal velocity), then the inlet tangential velocity is a 'result,' since only its normal derivative is then effectively specified. If $\partial\psi/\partial n$ (tangential velocity) and $\omega$ are specified, also legitimate, then the normal velocity is the floater; this is probably not often useful/desirable.

At outflow points, there are several options but, as in u-P formulations, there is also lots of room for both ambiguity and improvement; i.e., the 'best' choice is not readily available. If a passive OBC is desired, then $\partial\omega/\partial n = 0$ is usually okay, even though it does imply the seemingly illegitimate BC a la u-P of $\nabla^2 u_\tau = 0$ at outflow (an exercise we leave to the reader), which is known not to be a legitimate BC for the u-P equations. But a second BC is also required and this causes some difficulty: specifying $\psi$ implies the specification of the normal velocity (usually a poor OBC), and specifying $\partial\psi/\partial n$ implies the specification of the tangential velocity, also usually not a good idea. Nevertheless, $\partial\psi/\partial n = 0\ (= u_\tau)$ is the most common (and legal) BC employed in $\psi$-$\omega$ formulations. See Tezduyar et al. (1988, 1991) for some alternative $\psi$-$\omega$ OBC's via FEM, and both Roache (1982) and Peyret and Taylor (1983) for FDM.
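The conversion of the specified-penetration BC into Dirichlet data for the stream function, mentioned in the Remark at the start of this discussion, amounts to a running integral of the normal velocity along the boundary. The following is a minimal sketch only, assuming a simply connected domain whose boundary is sampled at increasing arc-length values; the function name `boundary_stream_function` and the unit-circle test case are illustrative assumptions, not from the text.

```python
import numpy as np

def boundary_stream_function(s, w_n, psi0=0.0):
    """Cumulative trapezoidal integral of the normal velocity w_n along
    the boundary arc length s: psi(s) = psi0 + integral from s[0] to s."""
    s = np.asarray(s, dtype=float)
    w_n = np.asarray(w_n, dtype=float)
    increments = 0.5 * (w_n[1:] + w_n[:-1]) * np.diff(s)
    return psi0 + np.concatenate(([0.0], np.cumsum(increments)))

# Illustrative case (an assumption): uniform flow w = (1, 0) through the
# unit circle, traversed counterclockwise with outward normal
# n = (cos(theta), sin(theta)), so that w_n = cos(theta).
theta = np.linspace(0.0, 2.0 * np.pi, 401)   # arc length on the unit circle
w_n = np.cos(theta)
psi = boundary_stream_function(theta, w_n)
```

For this case the result should reproduce $\psi = y = \sin\theta$ on the circle (the stream function of a uniform flow in $x$), and $\psi$ returns to its starting value after one circuit because the net flux through the closed boundary is zero. On a multiply connected domain each boundary piece would need its own constant, which this sketch does not address.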
b. The 3D velocity-vorticity formulation

This formulation, while not new, is less developed and still developing, and a review of the (confusing) literature reveals but one clear fact: those using this formulation do not agree on the BC's, either with respect to legitimacy or utility. Since we have not been party to this effort, and will have little need to discuss it further, we refer the reader to the growing and semi-vast (half-vast) literature, beginning with Gunzburger et al. (1990), Gresho (1992), and Wu et al. (1996), and references therein.

3.9 INITIAL CONDITIONS (AND WELL-POSEDNESS)

The last technical 'detail' regarding incompressible flow that needs to be addressed (and even cleaned up/clarified relative to much of what exists in the literature) before we can move on to the subject of FEM approximation methods is that of the initial data. Just what is required or permissible regarding initial velocity, pressure, vorticity, stream function, etc.?

3.9.1 The u-P Formulation

Again the incompressibility constraint makes itself felt (quite strikingly, in fact) in such a way that the simplicity of the choice of the initial condition (IC) for the scalar transport equation of the previous chapter (basically any function in $L^2$) is totally inappropriate. The 'bottom line' can be easily stated, even though it was not an easy one to obtain; indeed, it is probably not yet well known nor widely appreciated: the initial velocity must be incompressible everywhere: $\nabla\cdot\mathbf{u} = 0$ in $\bar{\Omega}$. The translation of these simple (and nearly obvious) words into mathematics is a bit more subtle:

    $\nabla\cdot\mathbf{u}_0 = 0$ in $\Omega$    (3.9-1)

and

    $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$ on $\Gamma_D$,    (3.9-2)

where $\mathbf{u}_0(\mathbf{x})$ is the initial velocity and $\Gamma_D$ is that portion of the boundary (the 'normal' Dirichlet portion) on which the normal velocity BC is specified ($\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$). It may actually be helpful to derive (3.9-2).
This can be done by referring to Figure 3.9-1, where we note the generalization of (3.9-1) and (3.9-2) to $t \geq 0$. It is accomplished by applying (3.9-1) and the divergence theorem to the thin Gaussian pillbox (Jackson, 1975) shown there, where $\mathbf{u}$ is the velocity in the fluid and $\mathbf{w}$ is the specified BC for $\mathbf{u}$: $\nabla\cdot\mathbf{u} = 0$ in $\Omega \Rightarrow \int_\Omega \nabla\cdot\mathbf{u} = \int_\Gamma \mathbf{n}\cdot\mathbf{u} = 0$, which becomes $0 = \int_0^{l_2} \mathbf{n}\cdot\mathbf{w}\,ds + \int_{l_1}^{0} \mathbf{n}\cdot\mathbf{u}\,ds + O(\delta)$, which gives, for $\delta \to 0$, $l_1 \to l_2 = l$ and thus $\int_0^{l} \mathbf{n}\cdot(\mathbf{w} - \mathbf{u})\,ds = 0$, from which we conclude that we need $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ on $\Gamma$; i.e., the normal component of the velocity BC is also the realization of (3.9-1) on $\Gamma$. [The sufficiency of this requirement is obvious. That it is also a necessary condition can be proved in either of two ways: (i) since the location (on $\Gamma$) and the size of the pillbox are arbitrary, if $\mathbf{n}\cdot\mathbf{u} \neq \mathbf{n}\cdot\mathbf{w}$ at some point on $l$ yet $\int_0^{l} \mathbf{n}\cdot(\mathbf{u} - \mathbf{w})\,ds = 0$, one merely needs to slide the pillbox to a new location (and perhaps change $l$) that would give $\int_0^{l} \mathbf{n}\cdot(\mathbf{u} - \mathbf{w})\,ds \neq 0$; and (ii) we preclude sources and sinks of mass on the boundary.]
Fig. 3.9-1 A Gaussian pill box in the fluid at the boundary.

One of the most interesting consequences of the IC constraint of (3.9-1) and (3.9-2) is related to impulsive starts, which we believe merits the following important digression: the fluid mechanics literature is replete with the notion that impulsive behavior (impulsive starts in particular, impulsive changes in general) is commonplace, even in the case of the mathematical model that presumes an incompressible fluid. And this perception is widely held among experimentalists, theorists, and computationalists. While it is true that some realize what the mathematical basis for such fluid motion is (as, for example, so well described by Batchelor (1967)), it seems all too true that many hold an erroneous perception. Whereas it is fairly well accepted, at least by those who might be called upon to design laboratory experiments, that a truly impulsive start of any kind of 'mechanical' system is not possible to attain owing to inertia, the $\nabla\cdot\mathbf{u} = 0$ constraint/model adds mathematical muscle to the statement that impulsive changes in the normal direction are precluded. Normal impulsive changes in velocity would require a discontinuous normal velocity that violates $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ and are thus mathematically ill-posed. If the normal component of velocity applied to the fluid, as a boundary condition, is different from the initial normal velocity at the same point of the boundary, then the incompressible flow equations are ill-posed by violation of (3.9-2), and no solution exists. (At least in the conventional sense; any solutions that do exist are necessarily in the class called 'generalized' solutions.)

Normal impulsive acceleration, however, is mathematically possible, an example of which is: $\mathbf{n}\cdot\mathbf{u} = w_0(1 - e^{-t/\tau})$ with $\mathbf{u}_0 = 0$ in $\Omega$ and $\tau$ as small as you like (but not zero); we shall present just such an example later (Section 3.19).
The acceleration is $\mathbf{n}\cdot\mathbf{a} = (w_0/\tau)e^{-t/\tau}$, giving $\mathbf{n}\cdot\mathbf{a}_0 = w_0/\tau$. As $\tau \to 0$, $\mathbf{n}\cdot\mathbf{a}_0 \to \infty$ and $\mathbf{n}\cdot\mathbf{u}$ is a smooth function (continuous in time) that rapidly approaches $w_0$. This situation (or a similar one, such as a ramp function, $\mathbf{n}\cdot\mathbf{u} = \beta t$ with $\beta$ constant) can be legitimately used to 'model' an impulsive start in the sense that a very large acceleration is applied for a very short time.

Tangential impulsive changes, on the other hand, are permitted, and are quite common; the initial conditions need not satisfy the tangential BC's (typically no-slip). The simplest example of this situation is given by the Stokes/Rayleigh problem of an instantaneous change in the tangential velocity of an infinitely long, flat plate in a semi-infinite fluid at rest, with the familiar complementary error function solution:

    $u(x, t) = u_0\,\mathrm{erfc}\left(x/\sqrt{4\nu t}\right)$.    (3.9-3)
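Both start-up models above are simple enough to evaluate directly. The following is a minimal sketch, using only the Python standard library, of the Stokes/Rayleigh solution (3.9-3) and of the smooth exponential start together with its acceleration; the function names and the default parameter values are illustrative assumptions, not part of the text.

```python
import math

def rayleigh_velocity(x, t, u0=1.0, nu=1.0):
    """Tangential velocity u(x, t) = u0 * erfc(x / sqrt(4 nu t)) at a
    distance x >= 0 from an impulsively started plate, valid for t > 0."""
    if t <= 0.0:
        raise ValueError("the similarity solution requires t > 0")
    return u0 * math.erfc(x / math.sqrt(4.0 * nu * t))

def smooth_normal_start(t, w0=1.0, tau=1.0e-3):
    """Normal boundary velocity w0 * (1 - exp(-t/tau)) and the matching
    normal acceleration (w0/tau) * exp(-t/tau)."""
    decay = math.exp(-t / tau)
    return w0 * (1.0 - decay), (w0 / tau) * decay

u_plate = rayleigh_velocity(0.0, 1.0)        # on the plate the fluid moves with it
u_far = rayleigh_velocity(10.0, 0.01)        # far from the plate it is still at rest
v_start, a_start = smooth_normal_start(0.0)  # zero velocity, acceleration w0/tau
v_late, a_late = smooth_normal_start(1.0e-2) # ten time constants later
```

The contrast is the point: the tangential velocity jumps at the plate while the fluid away from it has not yet responded, whereas the normal velocity starts from zero and is continuous in time, with an initial acceleration $w_0/\tau$ that grows without bound as $\tau$ is reduced.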
The consequence of impulsive tangential velocity changes is well known: a vortex sheet is created in the fluid at the boundary. Of course in the limit of an inviscid fluid, while such changes are still permissible, they no longer generate vortex sheets since the fluid has lost its ability to communicate tangentially with its boundary. (Normal impulsive changes, however, are just as illegal for $\nu = 0$ as they are for viscous flow.)

It turns out that the normal impulsive start model (the rapid start via exponential, ramp, or similar functions) does approach a limit that, paradoxically for an impulsive start, describes potential flow, even though the no-slip BC may have been legitimately applied; i.e., Re = '$\infty$' rather than Re = 0 is the effective initial condition, and the fluid will appear to slip along the boundary. This limiting case will be described below, first with words and then with mathematics: for $t < 0$, a steady, inviscid, irrotational flow (i.e., a potential flow) exists. At $t = 0$, a wand is waved that endows the fluid with viscosity, which enables the no-slip BC to be satisfied (the brakes are applied) instantaneously; a vortex sheet now exists on all no-slip surfaces. Also instantaneously, the entire pressure field snaps from that of potential flow (the 'Bernoulli pressure') to one that feels the (usually significant) effect of the no-slip BC. Finally, for $t > 0$, the vortex sheet has been dissipated by viscosity and a simpler (more regular) time-dependent viscous flow develops.

Mathematically, the foregoing events are:

1. $t < 0$. The steady potential flow is obtained from $\nabla^2\phi = 0$ in $\Omega$, $\partial\phi/\partial n = \mathbf{n}\cdot\mathbf{w}_0$ on $\Gamma$, where of course $\int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 = 0$. The resulting velocity is $\mathbf{u}_0 = \nabla\phi$, and the (potential) pressure ($P_p$) is given by Bernoulli's equation, $P_p + \tfrac{1}{2}\rho q^2 - C = 0$, where $q = |\mathbf{u}_0|$.
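[As a check, the insertion of this solution into the Euler equations, $\partial\mathbf{u}_0/\partial t + \mathbf{u}_0\cdot\nabla\mathbf{u}_0 + \nabla P_p = 0$, $\nabla\cdot\mathbf{u}_0 = 0$ in $\Omega$, $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$ on $\Gamma$ gives, using $\mathbf{u}_0\cdot\nabla\mathbf{u}_0 = \tfrac{1}{2}\nabla q^2 - \mathbf{u}_0\times\nabla\times\mathbf{u}_0 = \tfrac{1}{2}\nabla q^2$, $\nabla^2 P_p = -\tfrac{1}{2}\nabla^2 q^2$ in $\Omega$ with $\partial P_p/\partial n = -\mathbf{n}\cdot(\mathbf{u}_0\cdot\nabla\mathbf{u}_0) - \mathbf{n}\cdot\dot{\mathbf{w}}_0 = -\mathbf{n}\cdot(\mathbf{u}_0\cdot\nabla\mathbf{u}_0) = -\tfrac{1}{2}\mathbf{n}\cdot\nabla q^2$ on $\Gamma$ to give $P_p + \tfrac{1}{2}q^2 = C$ and $\partial\mathbf{u}_0/\partial t = 0$.] Steady potential flows always satisfy the steady Euler equations (and even the full NS equations).

2. $t = 0$. Here we have $\nu > 0$ and no-slip on $\Gamma$. But the velocity field is still $\mathbf{u}_0$ (except on $\Gamma$), and the NS equations read

    $\partial\mathbf{u}_0/\partial t + \tfrac{1}{2}\nabla q^2 + \nabla P_0 = \nu\nabla^2\mathbf{u}_0$ and $\nabla\cdot\mathbf{u}_0 = 0$ in $\Omega$,

with $\mathbf{u}_0 = \mathbf{w}_0$ on $\Gamma$ (no slip and no penetration). But $\nabla^2\mathbf{u}_0 = \nabla(\nabla\cdot\mathbf{u}_0) - \nabla\times\nabla\times\mathbf{u}_0 = 0$, since $\mathbf{u}_0$ is both divergence-free and curl-free; the viscous term in the NS momentum equation is still zero (in $\Omega$). The initial pressure satisfies the following PPE at $t = 0$:

    $\nabla^2 P_0 = -\tfrac{1}{2}\nabla^2 q^2$ in $\Omega$ and $\partial P_0/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u}_0 - \tfrac{1}{2}\nabla q^2) = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u}_0 - q\nabla q)$.

Here, even though $\nu\nabla^2\mathbf{u}_0 = 0$ in $\Omega$, $\nu\,\mathbf{n}\cdot\nabla^2\mathbf{u}_0 \neq 0$ on $\Gamma$ because now $\boldsymbol{\tau}\cdot\mathbf{u}_0 = \boldsymbol{\tau}\cdot\mathbf{w}_0$ there rather than $\boldsymbol{\tau}\cdot\mathbf{u}_0 = \partial\phi/\partial\tau$, and it is just this 'no-slip' (and non-smooth!) viscous term that causes the step change in pressure. (For a planar stationary boundary, this term is simply $\nu\,\partial^2 u_{0n}/\partial n^2$.) Finally,

    $\partial\mathbf{u}_0/\partial t = -\nabla(P_0 + \tfrac{1}{2}q^2) = \nabla(P_p - P_0) \neq 0$;

the step change in pressure caused by the no-slip BC causes the potential flow to change to a viscous flow with vortex sheets, and a concomitant large acceleration near $\Gamma$. On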
$\Gamma$, however, we have (for $\mathbf{n}\cdot\mathbf{w}$ independent of time)

    $\mathbf{n}\cdot\partial\mathbf{u}_0/\partial t = \mathbf{n}\cdot[\nu\nabla^2\mathbf{u}_0 - \nabla(P_0 + \tfrac{1}{2}q^2)] = 0$.

Thus, the 'large acceleration near $\Gamma$' is only in the tangential direction. If Re $\ll 1$, then $P_0$ will be dominated by the boundary condition via $\partial P_0/\partial n \approx \nu\,\mathbf{n}\cdot\nabla^2\mathbf{u}_0$ (an Euler velocity and a Stokes pressure), whereas if Re $\gg 1$, then $P_0$ will look only a little different than $P_p$ (except very close to $\Gamma$) because it is then dominated by the source term in the PPE, $-\tfrac{1}{2}\nabla^2 q^2$.

3. $t > 0$. After the (large) dose of vorticity has been absorbed by the fluid at the boundary, the now smoother flow is free to evolve as it may.

Final Remarks:

(1) The above discussion has been based on the requirement that the minimum regularity (smoothness) condition on the normal velocity is that it be continuous in time and space, a requirement not always invoked when seeking or discussing certain weak (ultra-weak?) solutions; see, for example, Hopf (1950/1951), Ladyzhenskaya (1969, 1975), and Temam (1984).

(2) Our principal 'bottom line' on the misnomer called (normal) impulsive starts is this: the initial velocity is not zero; it is potential flow. A normal impulsive change is an incompressible impossibility.

(3) An example of such a start-up is given in Gresho (1992, in which the captions for Figures 7 and 8 were inadvertently switched!); later (Section 3.19) we shall present another.

(4) There are no initial conditions on the pressure; $P_0$ is always induced by $\mathbf{u}_0$, a statement that leads us naturally to the next section.

3.9.2 The PPE Formulation

As in the u-P formulation, initial data for the velocity are all that is needed. But the PPE does apply at $t = 0$, and it follows that the induced initial pressure field can be computed.
The manner in which this is done is the obvious one: solve the PPE of (3.8-35) under any of the BC's discussed in Section 3.8.2, with the exception of (3.8-37), because the tangential momentum equation generally does not apply on $\Gamma$ at $t = 0$, a fact closely associated with the vortex sheets that are generally present at $t = 0$. [If, however, the BC's are rather special, then the tangential momentum equation does apply on $\Gamma$ at $t = 0$ and there are no vortex sheets; such BC's are said to satisfy the overdetermined Neumann problem, a la (3.8-37) and the discussion there; see also Heywood and Rannacher (1982) and Temam (1982).]

Related to this issue is the following very important observation for the special-but-common-and-important case where the BC is $\mathbf{u} = \mathbf{w}$ on $\Gamma$: The initial pressure field is always set by the PPE with the Neumann BC coming from the normal momentum equation applied on $\Gamma$. The resulting pressure field acts like a given 'source term' in the tangential momentum equation, which responds initially ($t > 0$ but small) and (of course) close to $\Gamma$ as a sort of transient heat equation (parabolic) that is, in general, also subjected to a step change at the boundary.
The normal momentum equation does not behave like the parabolic heat equation owing to its intimate connection with the omnipotent constraint $\nabla\cdot\mathbf{u} = 0$. On the other hand, the normal acceleration and the pressure can (for $t > 0$) be solved either as a pair via $\mathbf{a} + \nabla P = \mathbf{f}(\mathbf{u})$ and $\nabla\cdot\mathbf{a} = 0$ in $\Omega$ with $\mathbf{n}\cdot\mathbf{a} = \mathbf{n}\cdot\partial\mathbf{w}/\partial t$ on $\Gamma$, or, equivalently and sequentially, via $\nabla^2 P = \nabla\cdot\mathbf{f}$ in $\Omega$, with $\partial P/\partial n = \mathbf{n}\cdot(\mathbf{f} - \partial\mathbf{w}/\partial t)$ on $\Gamma$, followed by $\mathbf{a} = \mathbf{f} - \nabla P$.

This should cover the PPE formulation, and it would if it were not for the possibility of either overlooking or forgetting from whence it came (the momentum equation and div $\mathbf{u} = 0$). So we now address an issue that turns around the statements made in Section 3.5.1, namely that the u-P formulation can have solutions for pressure that lie in a larger space (only $\nabla P$ need exist) than for the PPE formulation (where $\nabla P$ and $\nabla^2 P$ must exist, as well as third spatial derivatives of velocity). The turn-around is this: when PPE solutions exist (i.e., when $P$ is smooth enough), there can exist more solutions to the PPE formulation than to the u-P formulation! But it may be better to defer the details of these (important) 'anomalies' until the next section; anyway, that is our plan. Here we merely close with the blanket statement/bottom line that IC's for the PPE formulation should respect the same constraints on the IC's, (3.9-1) and (3.9-2), as are required by the u-P formulation.

3.9.3 Vorticity-Based Methods

These methods include $\psi$-$\omega$ and $\mathbf{u}$-$\omega$ methods, which we lump together because our discussion will be brief, and biased. In fact, we shall focus only on the simpler 2D case via the $\psi$-$\omega$ formulation as this will suffice to make our point. The key problem with IC's in this formulation is, of course, related to those with the previous formulations, but it is also sufficiently different and sufficiently more difficult in general so as to merit separate and extensive discussion.
Ironically, however, it is easier in the following sense: the initial distribution of vorticity, $\omega_0(\mathbf{x})$, is completely arbitrary; every $\omega_0(\mathbf{x})$ will correspond to some divergence-free initial velocity. But it is much more difficult with respect to the common IC-BC combination that admits vortex sheets, a direct consequence of employing (or trying to solve) a higher-order equation, the VTE. As with the PPE formulation, we initiate the discussion using the simple and common BC of $\mathbf{u} = \mathbf{w}$ on $\Gamma$ because, in fact, this BC is not so simple when vorticity is the variable that is desired to be directly computed. In Section 3.12.4 we will discuss a more general case. We recall the $\psi$-$\omega$ pair, see (3.6-5) and (3.6-6), $\partial\omega/\partial t + \mathbf{u}\cdot\nabla\omega = \nu\nabla^2\omega$ and $\nabla^2\psi + \omega = 0$, and the velocity equations, $u = \partial\psi/\partial y$ and $v = -\partial\psi/\partial x$, to initiate the discussion and analysis. These are to be solved with the BC's of Section 3.8.3: $\partial\psi/\partial\tau = w_n$ and $\partial\psi/\partial n = -w_\tau$ on $\Gamma$, corresponding to $\mathbf{u} = \mathbf{w}$ there. Given $\omega_0(\mathbf{x})$, the stream function equation is used to obtain the concomitant initial (and divergence-free) velocity field as follows: solve

$$\nabla^2\psi_0 + \omega_0 = 0 \quad \text{in } \Omega$$ (3.9-4)

subject to the (Dirichlet) BC of

$$\psi_0 = f_0 \quad \text{on } \Gamma,$$ (3.9-5)

where $f_0$ is obtained from the normal velocity BC applied at $t = 0$, $\partial\psi_0/\partial\tau = \partial f_0/\partial\tau = w_n^0$, via integration over $\Gamma$: $f_0 = \int_\Gamma w_n^0$. The resulting velocity field, $u_0 = \partial\psi_0/\partial y$ and
$v_0 = -\partial\psi_0/\partial x$, will satisfy $\nabla\cdot\mathbf{u}_0 = 0$ in $\Omega$ and on $\Gamma$ by construction (i.e., $\nabla\cdot\mathbf{u}_0 = \partial u_0/\partial x + \partial v_0/\partial y = \partial^2\psi_0/\partial x\,\partial y - \partial^2\psi_0/\partial y\,\partial x = 0$). So far, so good; this was the easy part. But now we turn to the hard part. First we note that, just like its u-P and PPE counterparts, the initial velocity will generally slip along $\Gamma$ because it was not (and cannot, in the most general case, be) restrained from doing so: $u_\tau^0 = -\partial\psi^0/\partial n$ will not agree with $w_\tau^0$ on $\Gamma$, and we thus introduce the slip velocity, $s$:

$$s(\Gamma) = u_\tau^0 - w_\tau^0 \quad \text{on } \Gamma.$$ (3.9-6)

So what is the problem? The problem is that the now-present vortex sheet on $\Gamma$, the strength of which is $s$, generates an unbounded vorticity there: while the given $\omega_0(\mathbf{x})$ describes the vorticity in $\Omega$, on $\Gamma$ it is necessarily given by (with apologies for somewhat imprecise notation)

$$\omega_0(\Gamma) = s(\mathbf{x}_\Gamma)\,\delta(\mathbf{x} - \mathbf{x}_\Gamma),$$ (3.9-7)

where $\delta$ is the Dirac delta function. Integration in the normal direction from $\Gamma$ into $\Omega$ gives

$$\int_{x_\Gamma}^{x_\Gamma + \varepsilon} s(\mathbf{x}_\Gamma)\,\delta(\mathbf{x} - \mathbf{x}_\Gamma)\,dx_n = s(\mathbf{x}_\Gamma),$$ (3.9-8)

where $\varepsilon$ is small. The initial 'wall' vorticity is singular (unbounded) but integrable and is computable in principle (but probably not otherwise) as a function of position on $\Gamma$. The imposition of the no-slip, vorticity-generating BC is seen to cause a large problem in the general case. (In the special case wherein no vortex sheet is present, the initial vorticity would need to be very special, so that $s = 0$.)

The vortex sheet need not be confined to the boundary. In fact, it may be of interest to present a simple example of initial vorticity and velocity fields in such a case; to that end, we consider a 2D, solenoidal velocity field of compact support in 2D cylindrical geometry. It is simple solid-body rotation: $u_\phi = fr$ for $r \le R$, where $u_\phi$ ($= u_\tau$) is the tangential velocity and $f$ is the (constant) angular velocity, and $u_\phi = 0$ for $r > R$. The jump in $u_\phi$ (the slip velocity if the rotating fluid were replaced by a solid cylinder rotating at the same angular velocity) is $s = fR$.
Thus, $\omega = (1/r)[d(r u_\phi)/dr] = 2f$ for $r < R$ and $\omega = 0$ for $r > R$; at $r = R$, we have a vortex sheet, the description of which is the object of the exercise. The stream function equation for this case is

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{d\psi}{dr}\right) + \omega = 0,$$

with $d\psi/dr = 0$ at $r = 0$ and $\psi = 0$ (no normal velocity) at $r = R$, and its solution is $\psi = f(R^2 - r^2)/2$. Since $\mathbf{u} = 0$ for $r > R$, $\psi = 0$ there, too. To compute the vortex sheet, we integrate the stream function equation across the discontinuity (an application of Stokes' theorem in the general case):

$$\int_{R-\varepsilon}^{R+\varepsilon} \frac{1}{r}\frac{d}{dr}\left(r\frac{d\psi}{dr}\right) r\,dr + \int_{R-\varepsilon}^{R+\varepsilon} \omega\, r\,dr = 0,$$

and let $\varepsilon \to 0$ to obtain

$$\left[r\frac{d\psi}{dr}\right]_{R^-}^{R^+} + \int_{R^-}^{R^+} \omega\, r\,dr = 0 \quad\text{or}\quad -\int_{R^-}^{R^+} \omega\, r\,dr = R u_\tau = R s,$$
since $u_\tau = u_\phi = fR$ is the jump in the tangential velocity, and the vortex sheet strength. To go further requires, it seems, the introduction of a model; e.g., suppose $\omega$ is a step function of amplitude $A/\varepsilon$ and width $\varepsilon$ (centered about $r = R$). Then,

$$\int_{R-\varepsilon/2}^{R+\varepsilon/2} \frac{A}{\varepsilon}\, r\,dr = AR = -R u_\tau;$$

i.e., $A = -u_\tau$, and the vorticity at $r = R$ is $-u_\tau/\varepsilon$ with $\varepsilon \to 0$. In summary, $\omega = 2f$ for $r < R$, $\omega = -\infty$ at $r = R$, and $\omega = 0$ for $r > R$.

The root of this singularity problem can be traced back to the fact that the tangential momentum equation does not apply on $\Gamma$ at $t = 0$ in the general case. But the VTE involves (comes from) the normal derivative of the tangential momentum equation. What is the derivative of an invalid equation? Anyway, it is clear that vorticity methods will necessarily confront significant difficulty whenever vortex sheets are involved.

3.10 INTERIM SUMMARY

3.10.1 A Well-Posed IBVP for Incompressible Flow, and the Equivalence 'Theorem'

It may be well to pause, collect some results, and summarize the status of our description of the incompressible flow problem before forging ahead to discuss approximate solution methods. Also, partly because we will not need the results, and partly because we are not fully confident regarding our opinion of what they are, we henceforth omit, except for one small digression in Section 3.12.4, all vorticity-based methods and zoom in on what will be the major focus of the remainder of this book: pressure-based methods. [See, for example, Gresho (1992) for further discussions of well-posedness of vorticity-based methods.] We present below two equivalent, and fairly general, initial boundary-value problems (IBVP's) for the NS equations that are well-posed and form the basis of much of the rest of the book. For simplicity (i.e., ease of presentation), we stick to 2D, but the 3D extension is straightforward, really.
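The momentum conservation equation,

$$\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = \mathrm{Re}^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} \quad \text{in } \Omega,$$ (3.10-1)

and either form of the mass conservation equation,

$$\nabla\cdot\mathbf{u} = 0 \quad \text{in } \Omega$$ (3.10-2)

or

$$\nabla^2 P = \nabla\cdot\{\mathrm{Re}^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}\} \quad \text{in } \Omega,$$ (3.10-3)

with BC's of

$$\mathbf{u} = \mathbf{w} \quad \text{on } \Gamma_D,$$ (3.10-4)

$$\mathrm{Re}^{-1}[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - P\mathbf{n} = \mathbf{F} \quad \text{on } \Gamma_N,$$ (3.10-5)

$$\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w} \quad\text{and}\quad \mathrm{Re}^{-1}\,\boldsymbol{\tau}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} = F_\tau \quad \text{on } \Gamma_n,$$ (3.10-6)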
and

$$\boldsymbol{\tau}\cdot\mathbf{u} = \boldsymbol{\tau}\cdot\mathbf{w} \quad\text{and}\quad \mathrm{Re}^{-1}\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - P = F_n \quad \text{on } \Gamma_\tau,$$ (3.10-7)

where $\Gamma_D + \Gamma_N + \Gamma_n + \Gamma_\tau = \Gamma = \partial\Omega$, when (3.10-2) is employed, or (3.10-4) and

$$\frac{\partial P}{\partial n} = \mathbf{n}\cdot\{\mathrm{Re}^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{w}/\partial t\} \quad \text{on } \Gamma_D,$$ (3.10-8)

(3.10-5) on $\Gamma_N$, (3.10-6) and

$$\frac{\partial P}{\partial n} = \mathbf{n}\cdot\{\mathrm{Re}^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{w}/\partial t\} \quad \text{on } \Gamma_n,$$ (3.10-9)

and (3.10-7) on $\Gamma_\tau$ when (3.10-3) is employed, with initial conditions of

$$\mathbf{u} = \mathbf{u}_0(\mathbf{x}) \quad \text{in } \Omega$$ (3.10-10)

with

$$\nabla\cdot\mathbf{u}_0 = 0 \quad \text{in } \Omega$$ (3.10-11)

and

$$\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0 \quad \text{on } \Gamma_D \text{ and on } \Gamma_n,$$ (3.10-12)

constitute two equivalent, well-posed problems (called u-P and PPE, respectively) that can be solved for $\mathbf{u}$ and $P$. If $\Gamma_N$ and $\Gamma_\tau$ are empty (i.e., if $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ is specified on all of $\Gamma$), then it is also required that

$$\int_\Gamma \mathbf{n}\cdot\mathbf{w} = 0 \quad \text{for all } t \ge 0,$$ (3.10-13)

in which case $P$ is determined only up to an arbitrary additive constant.

Remarks:
(1) Their equivalence, which applies only to the time-dependent case, is a generalization of the so-called Equivalence Theorem put forth by Gresho and Sani (1987, 1988). It obviously presumes sufficient regularity of $\mathbf{u}$ and $P$ and is (still) actually more of an assertion than a theorem since it has not (yet) been proven. The 'equivalence' is, in some sense, only formal. [(3.10-3) cannot replace (3.10-2) for the steady NS equations, unless the equation $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$ is appended; see Schüller (1990), who presents a steady-state equivalence theorem.]
(2) If any of the three constraints on the data, (3.10-11), (3.10-12), and (3.10-13) when applicable, are violated, then the NS equations are ill-posed, and no solution of (3.10-1) and (3.10-2) exists. The pair, (3.10-1) and (3.10-3), is more lenient, however, but see caveat (1) below. It can only be ill-posed when (3.10-13) applies with a time-varying $\mathbf{w}$ that violates the time derivative of (3.10-13); i.e., PPE solvability requires only $\int_\Gamma \mathbf{n}\cdot(\partial\mathbf{w}/\partial t) = 0$, and this only when $\mathbf{n}\cdot\mathbf{u}$ is specified on all of $\Gamma$. More on this later, below and in Section 3.10.5.
(3) If the u-P problem is well-posed and (3.10-13) applies, then it is automatic that the PPE problem is also well-posed even though the Neumann problem for the pressure implies a solvability constraint. See below.
(4) If the $(\nabla\mathbf{u})^T$ term is omitted consistently (everywhere, a common occurrence in practice), then the problems are also well-posed, and we are dealing with the
conventional ($\nabla^2$) form of the viscous term rather than the stress-divergence form, and the things called '$\mathbf{F}$' are then pseudo-tractions.
(5) Although no initial data on $P$ are supplied (or required), the initial pressure field is obtainable by solving (3.10-3, 5, 7b, 8, and 9) at $t = 0$. [See Heywood (1980), Heywood and Rannacher (1982), Gresho and Sani (1987), and Gresho (1991a).] Also, inserting the resulting pressure into (3.10-1) gives the initial acceleration. These observations are, in fact, just a special case of the general situation: given a divergence-free velocity ($\nabla\cdot\mathbf{u} = 0$ in $\Omega$ and $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ on those parts of $\Gamma$ with specified normal velocity), it is always possible to compute, sequentially, the concomitant pressure and the associated divergence-free acceleration; every divergence-free velocity implies (induces) both a pressure and a divergence-free acceleration. The acceleration is, of course, a measure of unbalanced forces, and will be zero if all forces balance; this is a steady state.
(6) It is worth admitting that the very existence of a bounded solution for all time in the 3D case is still an open issue; we offer nothing in this regard except hope (and belief!).
(7) It is interesting to note that many have used the 'high-Re' approximation to the Neumann BC for the pressure, $\partial P/\partial n = 0$ at stationary walls and no source term, vis-a-vis the correct BC, from (3.10-8), of $\partial P/\partial n = \mathrm{Re}^{-1}\,\mathbf{n}\cdot\nabla^2\mathbf{u} \ne 0$, but that 'on average' they were correct (even at low Re) because

$$\int_\Gamma \frac{\partial P}{\partial n} = \mathrm{Re}^{-1}\int_\Gamma \mathbf{n}\cdot\nabla^2\mathbf{u} = \mathrm{Re}^{-1}\int_\Omega \nabla\cdot[\nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}] = 0,$$

since $\nabla\cdot\mathbf{u} = 0$ and div curl $(\cdot) = 0$.
(8) If (3.10-4) applies and $\mathbf{n}\cdot\mathbf{w} = 0$, then the flow must be parallel to the boundary.
(9) The definition of boundary segments is different from that used in Figure 3.8-2 of Section 3.8.1; both are legitimate, however.
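Remark (5) can be made concrete on a simple case. The following is a minimal sketch, not the book's method: it assumes a doubly periodic box $[0, 2\pi)^2$ (so no boundary conditions on $P$ are needed), $\mathbf{g} = 0$, the conventional ($\nabla^2$) viscous term, and a Fourier pseudo-spectral discretization. The function names (`wavenumbers`, `induced_pressure`, `divergence`) and the Taylor-Green initial velocity are illustrative choices. It solves the PPE for the pressure induced by a divergence-free $\mathbf{u}_0$ and then forms the acceleration $\mathbf{a} = \mathbf{f} - \nabla P$, which should itself be divergence-free.

```python
import numpy as np

def wavenumbers(n):
    """Integer wavenumbers for an n x n grid on the periodic box [0, 2*pi)^2."""
    k = np.fft.fftfreq(n, d=1.0 / n)
    return np.meshgrid(k, k, indexing="ij")

def divergence(a, b):
    """Spectral divergence of the vector field (a, b)."""
    kx, ky = wavenumbers(a.shape[0])
    return np.real(np.fft.ifft2(1j * kx * np.fft.fft2(a) + 1j * ky * np.fft.fft2(b)))

def induced_pressure(u, v, nu):
    """Return (P, ax, ay): pressure and acceleration induced by the velocity (u, v).

    Assumes periodicity and g = 0; P is made unique by giving it zero mean.
    """
    kx, ky = wavenumbers(u.shape[0])
    k2 = kx**2 + ky**2
    uh, vh = np.fft.fft2(u), np.fft.fft2(v)

    def phys(fh):
        return np.real(np.fft.ifft2(fh))

    ux, uy = phys(1j * kx * uh), phys(1j * ky * uh)
    vx, vy = phys(1j * kx * vh), phys(1j * ky * vh)
    # f = nu * laplacian(u) - u . grad(u): everything in the momentum
    # equation except the time derivative and the pressure gradient
    fxh = -nu * k2 * uh - np.fft.fft2(u * ux + v * uy)
    fyh = -nu * k2 * vh - np.fft.fft2(u * vx + v * vy)
    # PPE: laplacian(P) = div(f), i.e. -k^2 P_hat = i k . f_hat
    Ph = -(1j * kx * fxh + 1j * ky * fyh) / np.where(k2 == 0, 1.0, k2)
    Ph[0, 0] = 0.0
    # acceleration a = f - grad(P)
    return phys(Ph), phys(fxh - 1j * kx * Ph), phys(fyh - 1j * ky * Ph)

n, nu = 32, 0.01
x = 2.0 * np.pi * np.arange(n) / n
X, Y = np.meshgrid(x, x, indexing="ij")
u0 = np.sin(X) * np.cos(Y)       # divergence-free Taylor-Green field
v0 = -np.cos(X) * np.sin(Y)
P0, ax0, ay0 = induced_pressure(u0, v0, nu)
print("max |div a0| =", np.abs(divergence(ax0, ay0)).max())
```

For this particular $\mathbf{u}_0$ the induced pressure can be worked out by hand as $P_0 = (\cos 2x + \cos 2y)/4$ and the acceleration as $-2\nu\mathbf{u}_0$, which gives a convenient check on the sequential pressure-then-acceleration computation.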
It is not too difficult to provide a heuristic proof of the Equivalence 'Theorem' for, at least, the special case wherein a unique NS solution is known to exist ('all' 2D problems and 3D problems for sufficiently small data: Re, $\mathbf{g}$, $\mathbf{u}_0$, $\mathbf{w}$, $\mathbf{F}$, $F_n$, $F_\tau$), so we do so:
1. Clearly the u-P problem implies the PPE problem; just form the divergence of (3.10-1) using (3.10-2) and Heywood's (1980) result, (3.10-8) and (3.10-9).
2. The PPE problem implies the u-P problem if it guarantees that $\nabla\cdot\mathbf{u} = 0$. This (necessary, but perhaps not sufficient, condition) is easy to show: just subtract (3.10-3) from the divergence of (3.10-1) to obtain $\nabla\cdot(\partial\mathbf{u}/\partial t) = \partial(\nabla\cdot\mathbf{u})/\partial t = 0$ in $\Omega$. But we have $\nabla\cdot\mathbf{u} = 0$ at $t = 0$ thanks to (3.10-11), and thus $\nabla\cdot\mathbf{u} = 0$ for all $t \ge 0$.

It may also be useful to prove some of the assertions made in Remarks (2) and (3) above. Suppose $\mathbf{u}\cdot\mathbf{n} = \mathbf{n}\cdot\mathbf{w}$ is specified on all of $\Gamma$. Integrating then (3.10-3) over the
domain and invoking the divergence theorem on both sides yields

$$\int_\Gamma \frac{\partial P}{\partial n} = \int_\Gamma \mathbf{n}\cdot\{\mathrm{Re}^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}\}$$

as the solvability requirement associated with the Neumann problem. But application of the BC given by (3.10-8) shows that this solvability requirement is nothing more than the constraint $\int_\Gamma \mathbf{n}\cdot(\partial\mathbf{w}/\partial t) = 0$. So, if (3.10-13) is satisfied for all time, the PPE is always solvable, and more: the only time there exists a solvability issue for the PPE is when $\mathbf{w}$ varies with time; this implies, for example, that a constant (in time) value of $\mathbf{w}$ on $\Gamma$ that violates (3.10-13) will solve the PPE system but not the u-P system. Thus, it is vitally important to emphasize the following two caveats on the Equivalence Theorem:
1. Only if all solvability requirements associated with the u-P system are also respected by the PPE system will solutions of the latter also be solutions of the former.
2. The theorem is valid only if the solution is sufficiently regular: if the data are such that $\nabla^2 P$ does not exist or if $\nabla\cdot\{\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\}$ does not exist in $\Omega$, or if $\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]$ does not exist on $\Gamma$, but $\nabla P$ and $\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]$ do exist in $\Omega$, then the u-P system can have a solution but not the PPE system. (N.B., we are of course discussing only classical solutions here.)

3.10.2 Some Ill-Posed Problems

If $\nabla\cdot\mathbf{u}_0 \ne 0$ in $\Omega$ and/or if $\mathbf{n}\cdot\mathbf{u}_0 \ne \mathbf{n}\cdot\mathbf{w}_0$ on $\Gamma_D$ and/or on $\Gamma_n$, then the ill-posed problem can be converted to a well-posed problem by modifying $\mathbf{u}_0$ in such a way that the 'nearest' (in $L^2$) divergence-free field is found and substituted for $\mathbf{u}_0$.
This is accomplished by a projection: $\mathbf{u}_0$ is projected to the nearest admissible divergence-free subspace (that satisfies the normal BC on $\Gamma_D$ and $\Gamma_n$), a technique that will also prove useful later when we discuss projection methods for solving (approximately) the NS equations. It proceeds as follows: find $\mathbf{v}$ and the associated scalar $\phi$ (a Lagrange multiplier) via the Helmholtz/Weyl additive decomposition of $\mathbf{u}_0$ into a divergence-free part and a curl-free part (see, for example, Galdi, 1994):

$$\mathbf{u}_0 = \mathbf{v} + \nabla\phi$$ (3.10-14)

and

$$\nabla\cdot\mathbf{v} = 0 \quad \text{in } \Omega,$$ (3.10-15)

subject to the BC's

$$\mathbf{n}\cdot\mathbf{v} = \mathbf{n}\cdot\mathbf{w}_0 \quad \text{on } \Gamma_D \text{ and } \Gamma_n$$ (3.10-16)

and

$$\phi = 0 \quad \text{on } \Gamma_N \text{ and } \Gamma_\tau.$$ (3.10-17)

Remark: The actual nearest divergence-free velocity is obtained only when $\mathbf{n}\cdot\mathbf{w}_0 = 0$ on $\Gamma_D$ and $\Gamma_n$, in which case it is an orthogonal projection in that $\mathbf{v}$ is orthogonal to $\nabla\phi$: $\int \mathbf{v}\cdot\nabla\phi = 0$. (See Appendix 3.) This projection is realized via the following two steps:
Step 1. Solve

$$\nabla^2\phi = \nabla\cdot\mathbf{u}_0 \quad \text{in } \Omega$$ (3.10-18)

subject to the BC's

$$\frac{\partial\phi}{\partial n} = \mathbf{n}\cdot(\mathbf{u}_0 - \mathbf{w}_0) \quad \text{on } \Gamma_D \text{ and } \Gamma_n$$ (3.10-19)

and

$$\phi = 0 \quad \text{on } \Gamma_N \text{ and } \Gamma_\tau.$$ (3.10-20)

Step 2. Compute $\mathbf{v} = \mathbf{u}_0 - \nabla\phi$ as the new IC.

[Exercise for the reader: Prove that $\mathbf{v}$ is closer to $\mathbf{u}_0$ than is any other divergence-free vector field when $\mathbf{n}\cdot\mathbf{w}_0 = 0$.]

While some ill-posed problems can be easily converted to well-posed neighbors, such as those just discussed, those generated by a violation of (3.10-13), global mass conservation, cannot. Glowinski (1984) has presented a way to modify the boundary data so that even this ill-posed problem can be converted to one that is well-posed, and we shall present another later (Section 3.13.1g).

3.10.3 The Simplified PPE is also Ill-Posed

The simplified PPE (SPPE), obtained from (3.10-3) by assuming $\nabla\cdot\mathbf{u} = 0$ in the viscous term, i.e.,

$$\nabla^2 P = \nabla\cdot(\mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}),$$ (3.10-21)

is also often regarded as the PPE. But we have earlier alluded to the existence of some potential dangers associated with this seemingly equivalent 'statement of incompressibility' (Section 3.5.1), and we discuss some of them now. If (3.10-1) and (3.10-21) are solved together, then the behavior of the velocity divergence ($\nabla\cdot\mathbf{u} \equiv \theta$) can be obtained by subtracting (3.10-21) from the divergence of (3.10-1) to give, for the simpler case of straight (planar in 3D) boundaries wherein $\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] = \nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u})$,

$$\frac{\partial\theta}{\partial t} = 2\nu\nabla^2\theta,$$ (3.10-22)

where the factor of 2 exists because we used the stress-divergence formulation; the $\nabla^2$ formulation would not have it. But the key issue is the same in either case: the divergence satisfies a transient 'heat' equation. If the initial velocity is divergence-free, $\theta = 0$ at $t = 0$ and (3.10-22) would ensure that $\theta$ remains zero if either $\theta$ or $\partial\theta/\partial n$ was held at zero on $\Gamma$.
But if we cannot show the existence of either of these BC's, and we cannot from the problem posed, then Murphy's Law would tell us that divergence might sneak in, which of course would cause the SPPE solution to be wrong in that it is not then a solution of the NS equations. This ambiguity seems to be avoidable (for the continuum at least) simply by replacing $\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]$ by $-\nabla\times\nabla\times\mathbf{u}$ in (3.10-8) because, in addition to implying (3.10-22), it also implies, in an exercise that we leave to the reader, $\partial\theta/\partial n = 0$ on $\Gamma_D$, thus assuring that $\theta = 0$ if it started that way. We now present a proof of this ill-posedness for the linear case, Stokes flow (thanks to R. Rannacher), after which we present an example of same (thanks to J. Strikwerda).
Since the ill-posedness is caused by lack of uniqueness, we begin by testing uniqueness: suppose we have a solution, $(\mathbf{u}, P)$, to the PPE problem in which (3.10-21) is used rather than (3.10-3). Then, add any harmonic function, say $H$, to $P$ and see if the pair $(\mathbf{u} + \mathbf{v}, P + H)$, with $\mathbf{v}$ to be determined, has the unique solution $\mathbf{v} = 0$ and $H = 0$; any non-zero solutions would show non-uniqueness. Inserting this pair into the IBVP and utilizing the fact that $(\mathbf{u}, P)$ is a solution of (3.10-1) and (3.10-21), sans $\mathbf{u}\cdot\nabla\mathbf{u}$, leaves us with the following problem: find $\mathbf{v}$ from

$$\frac{\partial\mathbf{v}}{\partial t} + \nabla H = \mathrm{Re}^{-1}\,\nabla\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T] \quad\text{with}\quad \nabla^2 H = 0 \quad \text{in } \Omega,$$ (3.10-23)

$$\mathbf{v} = 0 \quad \text{on } \Gamma_D,$$ (3.10-24)

$$\mathrm{Re}^{-1}[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\cdot\mathbf{n} = H\mathbf{n} \quad \text{on } \Gamma_N,$$ (3.10-25)

$$\mathbf{n}\cdot\mathbf{v} = 0 \quad\text{and}\quad \mathrm{Re}^{-1}\,\boldsymbol{\tau}\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\cdot\mathbf{n} = 0 \quad \text{on } \Gamma_n,$$ (3.10-26)

and

$$\boldsymbol{\tau}\cdot\mathbf{v} = 0 \quad\text{and}\quad \mathrm{Re}^{-1}\,\mathbf{n}\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\cdot\mathbf{n} = H \quad \text{on } \Gamma_\tau,$$ (3.10-27)

with

$$\mathbf{v} = 0 \text{ at } t = 0 \quad (\text{and thus } \nabla\cdot\mathbf{v} = 0 \text{ at } t = 0).$$ (3.10-28)

It is immediately clear, since $H$ is any given harmonic function, that the above well-posed problem for $\mathbf{v}$ possesses a non-trivial solution, and one which is generally not divergence-free. The SPPE problem is ill-posed owing to non-uniqueness.

Remarks:
(1) The consistent PPE problem, if subjected to the same analysis, easily leads to $\mathbf{v} = 0$, $H = 0$ because $\nabla^2 H = 0$ is replaced by $\nabla^2 H = \nabla\cdot\{\mathrm{Re}^{-1}\,\nabla\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\}$, showing that in this case $H$ is not generally a harmonic function, which then leads to $\partial(\nabla\cdot\mathbf{v})/\partial t = 0$, and thus to $\mathbf{v} = 0$, with $H = 0$ then following easily. [Take the inner product of the first of (3.10-23) with $\mathbf{v}$, etc. Or, see below.]
(2) A well-posed SPPE approach can be recovered by adding the additional BC $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$, which we shall show below.
(3) In Gresho and Sani (1987, pp. 1138 and 1141), it was erroneously stated that the SPPE is well posed if the proper Neumann BC is employed.
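An example of the SPPE non-uniqueness is the following: solve for $\mathbf{u}$ and $P$ from

$$\frac{\partial\mathbf{u}}{\partial t} + \nabla P = \nu\nabla^2\mathbf{u} \quad\text{and}\quad \nabla^2 P = 0 \quad \text{in } \Omega,$$ (3.10-29)

where $\Omega$ is the unit square centered at (1.5, 1.5) with sides parallel to the x- and y-axis. The initial data are $\mathbf{u} = 0$ and the boundary conditions are

$$u(\mathbf{x}, t) = -\mathrm{erfc}\,(x/\sqrt{4\nu t}),$$ (3.10-30)

$$v(\mathbf{x}, t) = \mathrm{erfc}\,(y/\sqrt{4\nu t}),$$ (3.10-31)

and

$$\frac{\partial P}{\partial n} = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} - \partial\mathbf{u}/\partial t),$$ (3.10-32)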
which satisfy the IC and the mass-conservation condition,

$$\int_\Gamma \mathbf{u}\cdot\mathbf{n} = 0 \quad \text{for } t \ge 0.$$ (3.10-33)

Whereas there exists a unique solution to the Stokes equations $(\mathbf{u}, P)$ or the CPPE version of same (with same solution) for these data, the SPPE formulation possesses this solution and many more, one of which is simply $u$ and $v$ from (3.10-30), (3.10-31), and $P = 0$. This extraneous solution, and most of the others, will not retain $\nabla\cdot\mathbf{u} = 0$; in fact, the above solution has $\nabla\cdot\mathbf{u} = (e^{-x^2/4\nu t} - e^{-y^2/4\nu t})/\sqrt{\pi\nu t}$.

Remarks:
(1) The SPPE solution [and the BC's in (3.10-30) through (3.10-32)] was generated by seeking solutions to the 1D heat equation.
(2) The reason that the unit square is not centered at (1/2, 1/2), which is its usual location, is to preclude jumps in the normal velocity at $t = 0^+$; i.e., to have a well-posed problem.

Similar remarks apply to related versions of the SPPE of (3.5-2), such as (3.5-4), for which the implied divergence equation is

$$\frac{\partial\theta}{\partial t} + \mathbf{u}\cdot\nabla\theta = 2\nu\nabla^2\theta,$$

and for the 2D version below (3.5-4), that using $\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = 2(u_y v_x - v_y u_x)$, it becomes

$$\frac{\partial\theta}{\partial t} + \mathbf{u}\cdot\nabla\theta = 2\nu\nabla^2\theta - \theta^2,$$

the first being an advection-diffusion equation and the second the same except for a (stabilizing) sink term. But the important conclusion from all of these is that these simplified forms of the PPE do not, in fact, simplify the analysis and cannot generally assure that $\theta = 0$ in $\Omega$ for $t > 0$; only the CPPE (or the SPPE plus the BC $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$, as shown next) can perform this important function.

3.10.4 Fixing the SPPE, and the PPE Paradox

The non-uniqueness of the SPPE formulation can be removed by the addition, simple in principle but probably not in practice, of another BC, as mentioned earlier [Remark (6) in Section 3.8.2]:

$$\nabla\cdot\mathbf{u} = 0 \quad \text{on } \Gamma,$$ (3.10-34)

which 'closes' the problem posed in (3.10-22); i.e., we now have that $\nabla\cdot\mathbf{u} = 0$ in $\Omega$ for $t \ge 0$.
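Before continuing, the extraneous SPPE solution of the example in Section 3.10.3 is easy to check numerically. The sketch below is an illustration only: the function names, the sample point, and the finite-difference step sizes are assumptions, not part of the original example. It confirms by central differences that $u$ and $v$ of (3.10-30) and (3.10-31) each satisfy the 1D heat equation (so that $P = 0$ solves the SPPE-Stokes system), and that their divergence matches the closed form quoted in the text and is not zero.

```python
import math

def u_vel(x, t, nu):
    """x-velocity of the extraneous SPPE solution, (3.10-30)."""
    return -math.erfc(x / math.sqrt(4.0 * nu * t))

def v_vel(y, t, nu):
    """y-velocity of the extraneous SPPE solution, (3.10-31)."""
    return math.erfc(y / math.sqrt(4.0 * nu * t))

def heat_residual(f, s, t, nu, h=1e-4, dt=1e-5):
    """Central-difference estimate of df/dt - nu * d2f/ds2 for f(s, t, nu)."""
    f_t = (f(s, t + dt, nu) - f(s, t - dt, nu)) / (2.0 * dt)
    f_ss = (f(s + h, t, nu) - 2.0 * f(s, t, nu) + f(s - h, t, nu)) / h**2
    return f_t - nu * f_ss

def divergence_fd(x, y, t, nu, h=1e-4):
    """Central-difference estimate of du/dx + dv/dy."""
    du_dx = (u_vel(x + h, t, nu) - u_vel(x - h, t, nu)) / (2.0 * h)
    dv_dy = (v_vel(y + h, t, nu) - v_vel(y - h, t, nu)) / (2.0 * h)
    return du_dx + dv_dy

def divergence_exact(x, y, t, nu):
    """Closed-form divergence quoted in the text."""
    a = 4.0 * nu * t
    return (math.exp(-x * x / a) - math.exp(-y * y / a)) / math.sqrt(math.pi * nu * t)

# A sample point inside the unit square centered at (1.5, 1.5)
nu, t, x, y = 0.1, 1.0, 1.3, 1.7
print("heat residual for u:", heat_residual(u_vel, x, t, nu))
print("heat residual for v:", heat_residual(v_vel, y, t, nu))
print("divergence (finite difference):", divergence_fd(x, y, t, nu))
print("divergence (closed form):      ", divergence_exact(x, y, t, nu))
```

The two residuals vanish to within the finite-difference error, while the divergence does not, which is exactly the sense in which this pair solves the SPPE problem without solving the Stokes equations.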
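With (3.10-34) imposed, so that $\nabla\cdot\mathbf{v} = 0$, uniqueness then follows from (3.10-23) via

$$\frac{1}{2}\frac{d}{dt}\int_\Omega \mathbf{v}\cdot\mathbf{v} + \int_\Omega \mathbf{v}\cdot\nabla H = \mathrm{Re}^{-1}\int_\Omega \mathbf{v}\cdot\{\nabla\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\};$$ (3.10-35)

i.e. (see Table 3.1-2),

$$\frac{1}{2}\frac{d}{dt}\int_\Omega \mathbf{v}\cdot\mathbf{v} - \int_\Omega H\,\nabla\cdot\mathbf{v} + \int_\Gamma H\,\mathbf{n}\cdot\mathbf{v} = \mathrm{Re}^{-1}\int_\Gamma \mathbf{n}\cdot[\nabla\mathbf{v} + (\nabla\mathbf{v})^T]\cdot\mathbf{v} - \mathrm{Re}^{-1}\int_\Omega \nabla\mathbf{v} : [\nabla\mathbf{v} + (\nabla\mathbf{v})^T]$$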
to give, noting that $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_D$ and $\Gamma_n$ and accounting for the BC on $\Gamma_N$ and $\Gamma_\tau$,

$$\frac{1}{2}\frac{d}{dt}\int_\Omega \mathbf{v}\cdot\mathbf{v} = -\mathrm{Re}^{-1}\int_\Omega \nabla\mathbf{v} : [\nabla\mathbf{v} + (\nabla\mathbf{v})^T] \le 0,$$ (3.10-36)

which, since $\mathbf{v} = 0$ at $t = 0$, gives $\mathbf{v} = 0$ for all time.

We conclude this portion of the PPE discussion by stating The PPE Paradox: If you include it, you do not need it; but if you do not include it, you do need it, where 'it' refers to the viscous term, either $\nu\nabla\cdot\{\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\} = 2\nu\nabla^2(\nabla\cdot\mathbf{u})$ or $\nu\nabla\cdot\nabla^2\mathbf{u} = \nu\nabla^2(\nabla\cdot\mathbf{u})$, on the RHS of the PPE. The former (inclusion) gives $\nabla\cdot\mathbf{u} = 0$ via the CPPE formulation, and the latter (exclusion), via the SPPE formulation, generally results in $\nabla\cdot\mathbf{u} \ne 0$. Interesting.

3.10.5 PPE Solutions that are not NSE Solutions

But even the CPPE is not free of problems, as mentioned earlier. One of them is already apparent: if for some reason $\theta_0 = \nabla\cdot\mathbf{u}_0 \ne 0$, via oversight or carelessness or 'dumb/naive user' (the most common case, perhaps), the CPPE solution will give $\nabla\cdot\mathbf{u}(\mathbf{x}, t) = \nabla\cdot\mathbf{u}_0(\mathbf{x})$; any initial divergence is frozen (spatially) into the fluid for all time. Another such violation could occur if $\mathbf{n}\cdot\mathbf{u}_0 \ne \mathbf{n}\cdot\mathbf{w}_0$ at some points of $\Gamma$ with $\mathbf{n}\cdot\mathbf{w}_0$ independent of time; the PPE would hold $\mathbf{n}\cdot\mathbf{u}(t) = \mathbf{n}\cdot\mathbf{u}_0 \ne \mathbf{n}\cdot\mathbf{w}_0$ at these same points on $\Gamma$ for all time. This and other ill-posed NS problems are summarized in two equivalent ways in Figures 3.10-1 and 3.10-2, which attempt to show those PPE solutions that are not solutions of the NS (u-P) equations by virtue of the looser PPE solvability constraints. [Recall: The only case in which the data $(\mathbf{u}_0, \mathbf{w})$ do not admit a PPE solution is when $\mathbf{n}\cdot\mathbf{u}$ is specified (as $\mathbf{n}\cdot\mathbf{w}$) on all of $\Gamma$ and $\int_\Gamma \mathbf{n}\cdot(\partial\mathbf{w}/\partial t) \ne 0$. Conversely, the data admit a solution whenever $\mathbf{n}\cdot\mathbf{u}$ is not specified on all of $\Gamma$ or whenever $\mathbf{n}\cdot\mathbf{u}$ is specified on all of $\Gamma$ and $\int_\Gamma \mathbf{n}\cdot(\partial\mathbf{w}/\partial t) = 0$ for $t \ge 0$, which latter case constitutes the only solvability requirement for the PPE.
The simplest example of a well-posed PPE problem that is ill-posed in the u-P formulation is a constant (in time) value of $\mathbf{n}\cdot\mathbf{w}$ that violates $\int_\Gamma \mathbf{n}\cdot\mathbf{w} = 0$.] The figures are meant to include all initial data pairs $(\mathbf{u}_0, \mathbf{n}\cdot\mathbf{w}_0)$ that admit a PPE solution. Each ellipse in Figure 3.10-1 corresponds to a NS constraint violation: (3.10-13) at $t = 0$, (3.10-11), or (3.10-12). The union of the three ellipses denotes that subset that does not admit a solution to the NS (u-P) equations, and the horizontal ellipse (present only when $\Gamma_N = \emptyset$ and $\Gamma_\tau = \emptyset$) is contained in the union of the other two because $\nabla\cdot\mathbf{u}_0 = 0$ and $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0 \Rightarrow \int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 = 0$ [and therefore $\int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 \ne 0 \Rightarrow$ one of the following: (i) $\nabla\cdot\mathbf{u}_0 \ne 0$ and $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$, (ii) $\mathbf{n}\cdot\mathbf{u}_0 \ne \mathbf{n}\cdot\mathbf{w}_0$ and $\nabla\cdot\mathbf{u}_0 = 0$, or (iii) $\nabla\cdot\mathbf{u}_0 \ne 0$ and $\mathbf{n}\cdot\mathbf{u}_0 \ne \mathbf{n}\cdot\mathbf{w}_0$]. The little boxes depict computational domains and the vectors depict BC's (Dirichlet, normal direction) and IC's; e.g., the vector in box (1) implies $\mathbf{u}_0 \ne 0$ and $\nabla\cdot\mathbf{u}_0 \ne 0$ (and is the case discussed above, which we shall also demonstrate later), and that in (5) indicates a BC violation of the type $\mathbf{n}\cdot\mathbf{u}_0 \ne \mathbf{n}\cdot\mathbf{w}_0$ on $\Gamma$ (with $\mathbf{u}_0 = 0$ in $\Omega$). The intended interpretation is then as follows: (1) and (2) each violate only one constraint, (3) through (5) violate two, and (6) violates all three constraints. Further explanation: the statement $\mathbf{u}_0 = 0$ in (2) and (5) can also be interpreted as/generalized to: a divergence-free
IC of compact support ($\nabla\cdot\mathbf{u}_0 = 0$ and $\mathbf{u}_0 \to 0$ for $\mathbf{x} \to \Gamma$); also, these cases are probably ill-posed for another reason than constraint violation (rough data). In Case (2), the BC vectors are of the same length to indicate the satisfaction of global mass conservation [(3.10-13)] when $\mathbf{w}$ is time-varying. (If the BC's are time independent, they could be of different lengths since the PPE formulation does not then see them.) Case (3) is the 'sum' of (1) and (2), and Case (4) is meant to depict a situation wherein the initial velocity is smooth and satisfies $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0 = \mathbf{n}\cdot\mathbf{w}(t)$ but does not satisfy $\nabla\cdot\mathbf{u}_0 = 0$. Here too the initial divergence would remain for all time with, in addition, a violation of global mass conservation. Finally, (6) is the sum of (1) and (5). If $\mathbf{n}\cdot\mathbf{w}$ is time-dependent and the condition $\int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 \ne 0$ is generalized to $\int_\Gamma \mathbf{n}\cdot\mathbf{w}(t) \ne 0$, then Cases (4)-(6) are also ill-posed in the PPE formulation.

Hopefully the key point has been made: it is incumbent upon the user of a PPE method to be sure that s/he is really solving the NS equations. One can best do this by 'imagining' that one is using the u-P formulation and heeding all of its associated constraints.

[Fig. 3.10-1 A Venn diagram of those PPE solutions that do not solve the NSE.]
[Fig. 3.10-2 Schematic of those PPE solutions that do not solve the NSE. Panel groups: $\int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 = 0$ and $\int_\Gamma \mathbf{n}\cdot\mathbf{w}_0 \ne 0$. Panel labels: $\nabla\cdot\mathbf{u}_0 \ne 0$ ($\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}_0$); $\mathbf{n}\cdot\mathbf{u}_0 \ne \mathbf{n}\cdot\mathbf{w}_0$ ($\nabla\cdot\mathbf{u}_0 = 0$); $\nabla\cdot\mathbf{u}_0 \ne 0$ and $\mathbf{n}\cdot\mathbf{u}_0 \ne \mathbf{n}\cdot\mathbf{w}_0$.]
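The 'frozen divergence' behavior described at the start of Section 3.10.5 can be demonstrated with a few lines of code. The following is a minimal sketch under stated assumptions, not a method from the text: it uses a doubly periodic box (so there are no boundary conditions and no solvability issue), $\mathbf{g} = 0$, the $\nabla^2$ form of the viscous term, a Fourier pseudo-spectral discretization, and forward Euler time stepping. The names `cppe_step` and `run`, and the perturbation amplitude `eps`, are hypothetical. The momentum equation is advanced with a pressure obtained from the consistent PPE, starting from an initial velocity that is deliberately not divergence-free.

```python
import numpy as np

def cppe_step(uh, vh, kx, ky, nu, dt):
    """One forward-Euler step of du/dt = f - grad(P) with laplacian(P) = div(f).

    (uh, vh) are Fourier coefficients of the velocity on a periodic box and
    f = nu * laplacian(u) - u . grad(u). The constraint div(u) = 0 is never
    imposed directly; only the PPE is used.
    """
    k2 = kx**2 + ky**2

    def phys(fh):
        return np.real(np.fft.ifft2(fh))

    u, v = phys(uh), phys(vh)
    ux, uy = phys(1j * kx * uh), phys(1j * ky * uh)
    vx, vy = phys(1j * kx * vh), phys(1j * ky * vh)
    fxh = -nu * k2 * uh - np.fft.fft2(u * ux + v * uy)
    fyh = -nu * k2 * vh - np.fft.fft2(u * vx + v * vy)
    Ph = -(1j * kx * fxh + 1j * ky * fyh) / np.where(k2 == 0, 1.0, k2)
    Ph[0, 0] = 0.0
    return uh + dt * (fxh - 1j * kx * Ph), vh + dt * (fyh - 1j * ky * Ph)

def run(n=32, nu=0.01, dt=1e-3, steps=200, eps=0.1):
    """Return (theta0, theta): the divergence before and after time stepping."""
    k = np.fft.fftfreq(n, d=1.0 / n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    x = 2.0 * np.pi * np.arange(n) / n
    X, Y = np.meshgrid(x, x, indexing="ij")
    u0 = np.sin(X) * np.cos(Y) + eps * np.sin(X)   # eps * sin(x) adds divergence
    v0 = -np.cos(X) * np.sin(Y)
    uh, vh = np.fft.fft2(u0), np.fft.fft2(v0)
    theta0 = np.real(np.fft.ifft2(1j * kx * uh + 1j * ky * vh))
    for _ in range(steps):
        uh, vh = cppe_step(uh, vh, kx, ky, nu, dt)
    theta = np.real(np.fft.ifft2(1j * kx * uh + 1j * ky * vh))
    return theta0, theta

theta0, theta = run()
print("max |initial divergence|  =", np.abs(theta0).max())
print("max |change in divergence| =", np.abs(theta - theta0).max())
```

The velocity itself evolves, but its divergence stays at its initial value to within roundoff: the PPE solver is perfectly content, yet the result is not a solution of the NS (u-P) equations.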
3.10.6 A Remark on the Penalty Method

To conclude this section, whose objective was, in part, to reveal and discuss some of the principal subtleties of incompressible flow that carry over (for the most part) to the FEM approximate solution methods to be discussed later, we return to the penalized momentum equation and the associated penalty approximation, (3.5-10) and (3.5-11), and make the single remark: there is no such thing as an ill-posed penalty method problem, which at first blush sounds much better than either the u-P method or the PPE method; 'all' IC's and BC's generate well-posed problems. But not all solutions will be well-behaved or attractive (the lunch is cheap but not free) owing to the spurious penalty transient, an example of which we will present later. A sort of 'bottom line' here, and for PPE methods, is this: there is no substitute for a good understanding of the 'primitive' $(\mathbf{u}, P)$ Navier-Stokes equations.

3.10.7 Key Features of Incompressible Flow

Somewhat in the way of a summary of what has been said, and just before getting on with the task of actually trying to solve the NS equations via GFEM, we present some key features of incompressible flow that are unique to it, i.e., not present for compressible flows:
1. The equations contain an elliptic part, which causes instantaneous transmission of pressure signals throughout the domain. (The sound speed is infinite.)
2. The initial conditions are constrained: they must be divergence-free.
3. The boundary conditions are constrained, sometimes: they must satisfy global mass conservation.
4. Impulsive starts/changes in the normal direction are forbidden, which implies a need for some compatibility between BC's and IC's.
5. The 'simple' mass conservation (constraint) equation is omnipotent, all space, all time.
6. The implied Poisson equation for the pressure and its BC's are, when made explicit, subject to an unbelievably large amount of misunderstanding.
7.
Even the nearly hyperbolic Euler equations need an OBC, for the pressure.
8. 'However, the emphasis is very different for high speed and low speed flows and we shall concentrate on the latter because the burden they place on the design of good difference schemes is in many ways greater' (Morton, 1971).
9. When semi-discretized (in space), the resulting differential-algebraic equations are more difficult to solve than ordinary differential equations.

3.11 GLOBAL CONSERVATION LAWS

Before returning to the FEM, there is one more set of 'goals' that needs to be discussed. Inherent in the NS PDE's are certain global conservation laws that should be mimicked by the discrete solution. We present these here and their discrete analogs later after pointing out that only some of the continuum conservation laws are also respected by the discrete solution, and sometimes even these are not obtained without some extra work that usually
is not obvious at the outset. We present next the following results from classical fluid mechanics: global conservation of mass, momentum, energy, vorticity, and enstrophy, the latter two, of course, being less relevant to our later needs.

3.11.1 Conservation of Mass

We start easy and repeat what has been said many times before, for emphasis. Local mass conservation is described by $\nabla\cdot\mathbf{u} = 0$ and global mass conservation by integrating over the domain; i.e.,

$$\int_\Omega \nabla\cdot\mathbf{u} = \int_\Gamma \mathbf{n}\cdot\mathbf{u} = 0$$ (3.11-1)

does it: inflow = outflow. Whether $\mathbf{n}\cdot\mathbf{u}$ is part of the solution or is given by BC's, (3.11-1) must always be satisfied. For example, for the IBVP of the previous section, (3.10-1) through (3.10-13), this translates to

$$\int_{\Gamma_D} \mathbf{n}\cdot\mathbf{w} + \int_{\Gamma_N} \mathbf{n}\cdot\mathbf{u} + \int_{\Gamma_n} \mathbf{n}\cdot\mathbf{w} + \int_{\Gamma_\tau} \mathbf{n}\cdot\mathbf{u} = 0.$$ (3.11-2)

Global mass conservation is the one requirement that carries over strongly and intact to the discrete equations. The remainder are both harder to satisfy and somewhat less important in general.

3.11.2 Momentum Conservation

The best starting point is, of course, the divergence form of the momentum equation (3.6-1). Integration over $\Omega$ and using the divergence theorem gives, directly,

$$\frac{d}{dt}\int_\Omega \mathbf{u} + \int_\Gamma [(\mathbf{n}\cdot\mathbf{u})\mathbf{u} + P\mathbf{n}] = \mathrm{Re}^{-1}\int_\Gamma \mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T],$$ (3.11-3)

or, using (3.3-2) and (3.3-3),

$$\frac{d}{dt}\int_\Omega \mathbf{u} = \int_\Gamma [\mathbf{F} - (\mathbf{n}\cdot\mathbf{u})\mathbf{u}],$$ (3.11-4)

which can also be interpreted as a global force balance once the momentum flux, $u_n\mathbf{u}$, is interpreted as a 'force.' This is the general result. Special cases come from specific problems; e.g., for that described by (3.10-4) through (3.10-7), that portion of the boundary integral over $\Gamma_N$ would have $\mathrm{Re}^{-1}\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] - P\mathbf{n}$ replaced by $\mathbf{F}$, the applied force on this portion of $\Gamma$, etc. As a final remark we mention that, if any other equivalent PDE were used as the starting point, such as (3.2-4), it would need to be manipulated into this form, usually via the judicious use of $\nabla\cdot\mathbf{u} = 0$, before the 'proper' form of global momentum conservation could be ascertained.
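As a small numerical illustration of (3.11-1), the sketch below evaluates the outward flux $\mathbf{n}\cdot\mathbf{u}$ through each side of the unit square for a divergence-free field. Everything specific here is an assumption made for the example: the stream function $\psi = \sin(\pi x)\,e^{y} + x y^2$ is hypothetical, as are the function names and the choice of Gauss-Legendre quadrature. The individual side fluxes are not zero, but inflow and outflow cancel.

```python
import numpy as np

def velocity(x, y):
    """Divergence-free field from the hypothetical psi = sin(pi x) e^y + x y^2."""
    u = np.sin(np.pi * x) * np.exp(y) + 2.0 * x * y      # u =  d(psi)/dy
    v = -np.pi * np.cos(np.pi * x) * np.exp(y) - y**2    # v = -d(psi)/dx
    return u, v

def boundary_fluxes(npts=20):
    """Outward flux n . u through each side of the unit square."""
    s, w = np.polynomial.legendre.leggauss(npts)
    s, w = 0.5 * (s + 1.0), 0.5 * w                      # map [-1, 1] to [0, 1]
    zero, one = np.zeros_like(s), np.ones_like(s)
    return {
        "right": np.sum(w * velocity(one, s)[0]),        # n = (+1, 0)
        "left": -np.sum(w * velocity(zero, s)[0]),       # n = (-1, 0)
        "top": np.sum(w * velocity(s, one)[1]),          # n = (0, +1)
        "bottom": -np.sum(w * velocity(s, zero)[1]),     # n = (0, -1)
    }

fluxes = boundary_fluxes()
for side, value in fluxes.items():
    print(f"{side:>6s}: {value:+.12f}")
print("   net:", sum(fluxes.values()))
```

For this field the flux leaves through the right side and enters through the top, and the net is zero to roundoff; a set of Dirichlet data $\mathbf{w}$ whose side fluxes did not sum to zero would violate (3.10-13).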
3.11.3 Kinetic Energy Conservation This one is trickier, but also important. The derivation below is the most efficient and useful that we have found. Defining the kinetic energy (KE) as
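$$E \equiv \tfrac{1}{2}\rho\int_\Omega \mathbf{u}\cdot\mathbf{u} = \tfrac{1}{2}\rho\int_\Omega q^2, \tag{3.11-5}$$

we first form the local KE equation by taking the scalar product of (3.6-1), rewritten in the following (dimensional) form,

$$\rho\left[\frac{\partial\mathbf{u}}{\partial t} + \nabla\cdot(\mathbf{u}\mathbf{u})\right] = \nabla\cdot\boldsymbol{\sigma}, \tag{3.11-6}$$

with $\mathbf{u}$, to obtain

$$\tfrac{1}{2}\rho\frac{\partial q^2}{\partial t} + \rho\,\mathbf{u}\cdot[\nabla\cdot(\mathbf{u}\mathbf{u})] = \mathbf{u}\cdot(\nabla\cdot\boldsymbol{\sigma}). \tag{3.11-7}$$

Using now $\mathbf{u}\cdot[\nabla\cdot(\mathbf{u}\mathbf{u})] = (\mathbf{u}\cdot\mathbf{u})\nabla\cdot\mathbf{u} + \tfrac{1}{2}\mathbf{u}\cdot\nabla(\mathbf{u}\cdot\mathbf{u}) = \tfrac{1}{2}\mathbf{u}\cdot\nabla q^2 = \tfrac{1}{2}\nabla\cdot(q^2\mathbf{u})$ and $\nabla\cdot(\boldsymbol{\sigma}\cdot\mathbf{u}) = \mathbf{u}\cdot(\nabla\cdot\boldsymbol{\sigma}) + \boldsymbol{\sigma}:\nabla\mathbf{u}$ yields

$$\tfrac{1}{2}\rho\left[\frac{\partial q^2}{\partial t} + \nabla\cdot(q^2\mathbf{u})\right] = \nabla\cdot(\boldsymbol{\sigma}\cdot\mathbf{u}) - \boldsymbol{\sigma}:\nabla\mathbf{u}, \tag{3.11-8}$$

which is in the proper form for the 'integration + divergence theorem,' giving

$$\dot{E} + \tfrac{1}{2}\rho\int_\Gamma q^2(\mathbf{n}\cdot\mathbf{u}) = \int_\Gamma \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{u} - \int_\Omega \boldsymbol{\sigma}:\nabla\mathbf{u}. \tag{3.11-9}$$

Now we use $\mathbf{n}\cdot\boldsymbol{\sigma} = \mathbf{F}$ and $\boldsymbol{\sigma}:\nabla\mathbf{u} = (\mathbf{d} - P\mathbf{I}):\nabla\mathbf{u} = \mathbf{d}:\nabla\mathbf{u}$, since $\mathbf{I}:\nabla\mathbf{u} = \nabla\cdot\mathbf{u} = 0$, to give, semi-finally,

$$\dot{E} + \tfrac{1}{2}\rho\int_\Gamma q^2\,\mathbf{u}\cdot\mathbf{n} = \int_\Gamma \mathbf{F}\cdot\mathbf{u} - \int_\Omega \mathbf{d}:\nabla\mathbf{u}. \tag{3.11-10}$$

But because $\mathbf{d}^T = \mathbf{d}$,

$$\mathbf{d}:\nabla\mathbf{u} = \tfrac{1}{2}\,\mathbf{d}:[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] = \frac{1}{2\mu}\,\mathbf{d}:\mathbf{d} \equiv \Phi \ge 0, \tag{3.11-11}$$

where $\Phi$ is the viscous dissipation function (internal friction), the final KE equation is

$$\dot{E} = \int_\Gamma \mathbf{F}\cdot\mathbf{u} - \tfrac{1}{2}\rho\int_\Gamma q^2\,\mathbf{n}\cdot\mathbf{u} - \int_\Omega \Phi; \tag{3.11-12}$$

the kinetic energy increases owing to work done by the boundary on the fluid and by net inflow of KE. It always decreases owing to viscous dissipation throughout the domain. (The kinetic energy lost by $\Phi$ is gained as internal energy/enthalpy; this is detailed in Volume II.) Typically, on part of $\Gamma$, $\mathbf{F}$ is an applied traction force while on the rest of $\Gamma$, it is a reactionary pressure and viscous force applied to the fluid and is to be evaluated via (3.8-1); i.e., $\mathbf{F} = \mu[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - P\mathbf{n}$. [This same comment applies to the global momentum equation (3.11-4), where (3.8-1) typically is used to compute lift and drag.] For an inviscid fluid, the energy balance equation simplifies to

$$\dot{E} = -\int_\Gamma \left(P + \tfrac{1}{2}q^2\right)\mathbf{n}\cdot\mathbf{u}, \tag{3.11-13}$$

which, for example, is consistent with steady potential flow for which $P + \tfrac{1}{2}q^2$ is constant, and thus $\dot{E} = 0$.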
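The algebra behind (3.11-10) and (3.11-11) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: a random traceless matrix stands in for $\nabla\mathbf{u}$, the values of `mu` and `P` are arbitrary, and the double contraction is taken as $A_{ij}B_{ji}$ (for the symmetric $\mathbf{d}$ used here the alternative $A_{ij}B_{ij}$ gives the same numbers).

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 0.7                                   # arbitrary (hypothetical) viscosity
P = 1.3                                    # arbitrary (hypothetical) pressure

G = rng.standard_normal((3, 3))            # stand-in for grad u
G -= np.trace(G) / 3.0 * np.eye(3)         # enforce div u = trace(grad u) = 0

d = mu * (G + G.T)                         # viscous stress tensor (symmetric)
sigma = d - P * np.eye(3)                  # total stress tensor

def ddot(A, B):
    """Double contraction A_ij B_ji (assumed convention for ':')."""
    return float(np.einsum("ij,ji->", A, B))

phi_from_gradu = ddot(d, G)                # d : grad u
phi_from_dd = ddot(d, d) / (2.0 * mu)      # (1 / 2 mu) d : d
sigma_term = ddot(sigma, G)                # sigma : grad u

print(phi_from_gradu, phi_from_dd, sigma_term)
```

All three printed values agree and are non-negative: the pressure drops out of $\boldsymbol{\sigma}:\nabla\mathbf{u}$ because the velocity gradient is traceless, and the remainder is a sum of squares.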
3.11.4 Vorticity Conservation

We limit our consideration to 2D since we do not plan to discuss 3D vorticity methods in this text. (Indeed, we give little more than lip service to the 2D $\psi$-$\omega$ formulation, and that with slightly snarling lips.) So, starting with (3.5-7) with the advection term changed to divergence form yields

$$\frac{\partial\omega}{\partial t} + \nabla\cdot(\mathbf{u}\omega) = Re^{-1}\nabla^2\omega, \tag{3.11-14}$$

which easily leads to

$$\frac{d}{dt}\int_\Omega \omega = \int_\Gamma \left(Re^{-1}\frac{\partial\omega}{\partial n} - \omega\,\mathbf{n}\cdot\mathbf{u}\right), \tag{3.11-15}$$

the typical global conservation law for the scalar transport equation. Another, and simpler, vorticity conservation law is derivable from Stokes' circulation theorem, $\int_S \boldsymbol{\omega}\cdot\mathbf{n}\,dS = \oint_C \mathbf{u}\cdot d\mathbf{l}$, which, in the special 2D case being considered here (wherein both $\boldsymbol{\omega}$ and $\mathbf{n}$ point 'out of the page'), is

$$\int_\Omega \omega = \oint_\Gamma u_\tau; \tag{3.11-16}$$

the total vorticity is, instantaneously, given by the boundary integral of the tangential velocity.

Remarks:

(1) This last result is also obtainable (equivalently) from the elliptic equation portion of the $\psi$-$\omega$ pair, $\nabla^2\psi + \omega = 0$, via the divergence theorem:

$$\int_\Omega \omega = -\int_\Omega \nabla\cdot(\nabla\psi) = -\int_\Gamma \frac{\partial\psi}{\partial n} = \int_\Gamma u_\tau.$$

(2) The result applies for 'any' choice of $\Omega$ and its bounding curve, $\Gamma$; e.g., $\Omega$ need not be the full domain.

The time derivative of (3.11-16) gives the rate of change of total vorticity,

$$\frac{d}{dt}\int_\Omega \omega = \int_\Gamma \frac{\partial u_\tau}{\partial t}, \tag{3.11-17}$$

at least for a fixed domain and boundary; cf. (3.11-15), whose equivalence we will soon show. One interesting application of (3.11-16) that could even be useful in computations is the popular test problem called the lid-driven cavity (LDC). Here, $\oint_\Gamma u_\tau = l u_0$, where $l$ is the width of the cavity and $u_0$ is the driven lid speed. In a typical non-dimensional application, both $l$ and $u_0$ are unity. So, independent of both time and Reynolds number, the total vorticity in the LDC is $\int_\Omega \omega = l u_0$. The simplicity of this result might be useful in computations in the following way (even for a u-P code): if $|l u_0 - \int_\Omega \omega| > \varepsilon$, where $\varepsilon$ is 'user input,' presumably small, the mesh is too coarse, or badly designed (or both).
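The diagnostic suggested above can be sketched as follows. This is a minimal illustration, not a cavity solver: the velocity field is a hypothetical analytic, divergence-free, cavity-like field built from the stream function $\psi = x^2(1-x)^2\,(y^3 - y^2)$, which has a 'regularized lid' $u = x^2(1-x)^2$ on $y = 1$ and zero velocity on the other walls. The sketch assumes $\omega = \partial v/\partial x - \partial u/\partial y$ and a counterclockwise boundary traverse; the sign of both integrals flips together if the opposite orientation for $\tau$ is adopted.

```python
import numpy as np

def f(x):  return x**2 * (1.0 - x)**2
def fp(x): return 2.0 * x * (1.0 - x) * (1.0 - 2.0 * x)   # df/dx
def g(y):  return y**3 - y**2
def gp(y): return 3.0 * y**2 - 2.0 * y                     # dg/dy

def u(x, y): return f(x) * gp(y)       # u =  d(psi)/dy
def v(x, y): return -fp(x) * g(y)      # v = -d(psi)/dx

def total_vorticity(n=100, h=1e-5):
    """Midpoint-rule integral of omega, with omega from central differences."""
    s = (np.arange(n) + 0.5) / n
    X, Y = np.meshgrid(s, s, indexing="ij")
    omega = (v(X + h, Y) - v(X - h, Y)) / (2 * h) \
          - (u(X, Y + h) - u(X, Y - h)) / (2 * h)
    return omega.sum() / n**2

def circulation(n=100):
    """Counterclockwise boundary integral of the tangential velocity."""
    s = (np.arange(n) + 0.5) / n
    one, zero = np.ones(n), np.zeros(n)
    bottom = u(s, zero).sum() / n      # traversed in +x
    right = v(one, s).sum() / n        # traversed in +y
    top = -u(s, one).sum() / n         # traversed in -x
    left = -v(zero, s).sum() / n       # traversed in -y
    return bottom + right + top + left

gamma_total = total_vorticity()
gamma_boundary = circulation()
eps = 1e-3                             # hypothetical 'user input' tolerance
mesh_ok = abs(gamma_boundary - gamma_total) <= eps
print(gamma_total, gamma_boundary, mesh_ok)
```

In a real computation the area integral would use the vorticity recovered from the discrete velocity field, so a large mismatch would point to insufficient resolution rather than to quadrature error.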
Next, it is a useful exercise to show that (3.11-17) and (3.11-15) are actually equivalent, since their origins are so different—one being kinematic and the other kinetic. We begin
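by inserting the tangential momentum equation, $\partial u_\tau/\partial t + \mathbf{u}\cdot\nabla u_\tau + \partial P/\partial\tau = Re^{-1}\nabla^2 u_\tau$, into (3.11-17) to get

$$\frac{d}{dt}\int_\Omega \omega = \int_\Gamma \left(Re^{-1}\nabla^2 u_\tau - \mathbf{u}\cdot\nabla u_\tau\right),$$

since $\oint_\Gamma \partial P/\partial\tau = 0$. Taking now each term on the RHS in turn,

$$\text{(i)}\quad \nabla^2 u_\tau = \frac{\partial^2 u_\tau}{\partial n^2} + \frac{\partial^2 u_\tau}{\partial\tau^2} = \frac{\partial^2 u_\tau}{\partial n^2} - \frac{\partial^2 u_n}{\partial n\,\partial\tau} = \frac{\partial}{\partial n}\left(\frac{\partial u_\tau}{\partial n} - \frac{\partial u_n}{\partial\tau}\right) = \frac{\partial\omega}{\partial n}$$

because $\nabla\cdot\mathbf{u} = \partial u_n/\partial n + \partial u_\tau/\partial\tau = 0$ and $\omega = \partial u_\tau/\partial n - \partial u_n/\partial\tau$, and

$$\text{(ii)}\quad \mathbf{u}\cdot\nabla u_\tau = u_n\frac{\partial u_\tau}{\partial n} + u_\tau\frac{\partial u_\tau}{\partial\tau} = u_n\left(\omega + \frac{\partial u_n}{\partial\tau}\right) + u_\tau\frac{\partial u_\tau}{\partial\tau} = \omega u_n + \frac{1}{2}\left(\frac{\partial u_n^2}{\partial\tau} + \frac{\partial u_\tau^2}{\partial\tau}\right) = \omega u_n + \frac{1}{2}\frac{\partial q^2}{\partial\tau};$$

but $\oint_\Gamma \partial q^2/\partial\tau = 0$, and we obtain

$$\frac{d}{dt}\int_\Omega \omega = \int_\Gamma \left(Re^{-1}\frac{\partial\omega}{\partial n} - \omega\,\mathbf{n}\cdot\mathbf{u}\right),$$

as required. One additional piece of information, related to the above, that is sometimes useful is this: a stationary boundary with no penetration ($\mathbf{u}\cdot\mathbf{n} = 0$) and no slip ($u_\tau = 0$), the latter giving $\int_\Omega \omega = 0$, results in $Re^{-1}\oint_\Gamma \partial\omega/\partial n = 0$; both the total vorticity in $\Omega$ and the net viscous flux of vorticity into $\Omega$ through $\Gamma$ are zero for all $t > 0$, regardless of how $\omega$ may vary in time and space. It is likely that at least sometimes the clever use of such simple conservation laws could aid the simulator/modeler. Another informative result is obtained by inserting $\nabla^2 u_\tau = \partial\omega/\partial n$ into the tangential momentum equation applied on $\Gamma$ in the form

$$Re^{-1}\frac{\partial\omega}{\partial n} = \frac{Du_\tau}{Dt} + \frac{\partial P}{\partial\tau}, \tag{3.11-18}$$

which permits the following interpretation of viscous flux of vorticity through a bounding 'wall': it is 'caused by' tangential acceleration and/or tangential pressure gradients. (For some discussion and references related to 'cause and effect' of vorticity vis-a-vis velocity, see Gresho, 1992. Does velocity cause vorticity via its curl or does vorticity cause/induce velocity via the Biot-Savart law?) Our final and very important remark regarding vorticity conservation is this: (3.11-15) and (3.11-18) are generally not applicable at $t = 0$, because (3.5-7) is not applicable there, which in turn is because the tangential momentum equation is not. They cannot cope with vortex sheets.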
But (3.11-16) is applicable, once it is recognized that singular behavior in the integrand of the LHS is a common occurrence when vortex sheets are present. For example, consider a stationary fluid in a very long cylindrical container of radius $R$ that undergoes an impulsive rotational start-up of its boundary; the motionless fluid suddenly contains an amount of vorticity (per unit length), $\oint_\Gamma u_\tau = 2\pi R u_\tau$, initially concentrated in a vortex sheet: $\int_\Omega \omega = 2\pi R u_\tau$ with $\omega = 0$ except at $r = R$, where it is unbounded. See also Section 3.8.3. [Exercise for the reader: An impulsive rotation cannot be given to any but a circular cylinder. Why?]
To conclude the vorticity discussion, we return to the LDC (for $l u_0 = 1$) and consider an impulsive start-up from rest and then an impulsive stop and subsequent spindown. There are three 'phases' to consider, all using the simple result given by (3.11-16): (i) at $t = 0^+$, the total vorticity in the cavity is 1.0 and is all contained in the vortex sheet under the lid; (ii) for $t > 0$, the same total vorticity is diffused and advected throughout the cavity, approaching a constant-in-time spatial distribution if Re is not too large; and (iii) upon reducing the lid speed from 1 to zero, the total vorticity instantly drops from 1.0 to zero, this time in a 'negative' vortex sheet at the lid that cancels the entire 'bulk' vorticity. The spindown, at zero total vorticity, finally obtains a zero pointwise vorticity as $t \to \infty$.

3.11.5 Enstrophy Conservation

The last fluid mechanical quantity whose global conservation is of some interest is the enstrophy, a positive measure of rotation that is simply one-half the square of the vorticity, $e = \omega^2/2$ (see, for example, Leith, 1969; Lesieur, 1987; and Pedlosky, 1987). Starting again with (3.5-7), we multiply by $\omega$ to obtain the local (2D) enstrophy equation

$$\frac{\partial e}{\partial t} + \nabla\cdot\left(e\mathbf{u} - Re^{-1}\nabla e\right) = -Re^{-1}\,\nabla\omega\cdot\nabla\omega, \tag{3.11-19}$$

where we have used $\omega\,\mathbf{u}\cdot\nabla\omega = \tfrac{1}{2}\nabla\cdot(\omega^2\mathbf{u})$ and $\omega\nabla^2\omega = \nabla\cdot\tfrac{1}{2}\nabla\omega^2 - \nabla\omega\cdot\nabla\omega$. Note that $\nabla\omega\cdot\nabla\omega \ge 0$ so that the RHS is always $\le 0$ (enstrophy dissipation). Integration (etc.) yields

$$\frac{d}{dt}\int_\Omega e + \int_\Gamma \left(e\,\mathbf{n}\cdot\mathbf{u} - Re^{-1}\frac{\partial e}{\partial n}\right) = -Re^{-1}\int_\Omega \nabla\omega\cdot\nabla\omega; \tag{3.11-20}$$

$e$ increases by net transport (advection plus diffusion) into $\Omega$ through $\Gamma$ and is always decreased by viscosity. The following rearrangement of one of the terms in (3.11-20) is interesting (for $t > 0$):

$$Re^{-1}\frac{\partial e}{\partial n} = Re^{-1}\,\omega\frac{\partial\omega}{\partial n} = \omega\,Re^{-1}\nabla^2 u_\tau = \omega\left(\frac{\partial u_\tau}{\partial t} + \mathbf{u}\cdot\nabla u_\tau + \frac{\partial P}{\partial\tau}\right)$$

to give

$$\frac{d}{dt}\int_\Omega e + \int_\Gamma \left[e\,\mathbf{n}\cdot\mathbf{u} - \omega\left(\frac{Du_\tau}{Dt} + \frac{\partial P}{\partial\tau}\right)\right] = -Re^{-1}\int_\Omega \nabla\omega\cdot\nabla\omega; \tag{3.11-21}$$

in addition to advective transport through $\Gamma$, it is seen that the same terms that 'cause' vorticity flux, $Du_\tau/Dt + \partial P/\partial\tau$, are also the generators of enstrophy.
In fact, at steady state with no inflow or outflow,

$$\int_\Gamma \omega\,\frac{\partial}{\partial\tau}\left(P + \tfrac{1}{2}u_\tau^2\right) = Re^{-1}\int_\Omega \nabla\omega\cdot\nabla\omega \ge 0, \tag{3.11-22}$$

from which it follows that the 'production' term on the LHS is (at least for this case) necessarily $\ge 0$, a seemingly non-intuitive result. For 3D enstrophy equations, see, for example, Batchelor (1967) and Wu (1995).
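The step from the global enstrophy balance to the steady, closed-domain result (3.11-22) is short but not written out; the following worked derivation fills it in, assuming only the relations already stated in this section (the integrated form of (3.11-19), the expression for $\mathbf{u}\cdot\nabla u_\tau$ in normal and tangential components, and $Re^{-1}\partial\omega/\partial n = Du_\tau/Dt + \partial P/\partial\tau$ on $\Gamma$).

```latex
% Steady state: d/dt = 0.  No inflow or outflow: u_n = n . u = 0 on \Gamma.
\begin{align*}
\frac{Du_\tau}{Dt}
  &= \underbrace{\frac{\partial u_\tau}{\partial t}}_{=0}
   + \underbrace{u_n \frac{\partial u_\tau}{\partial n}}_{=0}
   + u_\tau \frac{\partial u_\tau}{\partial \tau}
   = \frac{\partial}{\partial \tau}\left(\tfrac{1}{2} u_\tau^2\right), \\[4pt]
Re^{-1} \frac{\partial e}{\partial n}
  &= \omega\, Re^{-1} \frac{\partial \omega}{\partial n}
   = \omega \left( \frac{Du_\tau}{Dt} + \frac{\partial P}{\partial \tau} \right)
   = \omega\, \frac{\partial}{\partial \tau}\left( P + \tfrac{1}{2} u_\tau^2 \right), \\[4pt]
0 + \int_\Gamma \left( 0 - Re^{-1} \frac{\partial e}{\partial n} \right)
  &= -Re^{-1} \int_\Omega \nabla\omega \cdot \nabla\omega
  \quad\Longrightarrow\quad
  \int_\Gamma \omega\, \frac{\partial}{\partial \tau}\left( P + \tfrac{1}{2} u_\tau^2 \right)
   = Re^{-1} \int_\Omega \nabla\omega \cdot \nabla\omega \;\ge\; 0 .
\end{align*}
```

The sign of the boundary 'production' term is therefore fixed entirely by the non-negativity of the enstrophy dissipation.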
3.12 WEAK FORMS OF THE PDE'S/NATURAL BOUNDARY CONDITIONS (NBC'S)

The review of (a portion of) classical incompressible fluid mechanics is over. We now push on toward the formulations of the NS equations for which finite element methods are most applicable. The combination of a vector-valued system and the div u constraint (and the pressure) renders, not surprisingly, weak formulations of the NS equations considerably more complicated and difficult, both to generate and to solve, than that of the simple scalar transport equation of Chapter 2. Accordingly, we believe it profitable to start slowly and work up, perhaps at the expense of some repetition, but with the hope that the various formulations seen in the literature will then be much easier to follow and understand. Part of the complexity is related to notation, which can range from overly cumbersome to overly terse, with equivalence not always immediately obvious. Thus we will start with the simplest (hopefully) method of derivation and limit it to 2D cartesian coordinate systems so that we can more easily work separately with each component of the vector equation. After this, we will re-do the same example while simultaneously generalizing it to 3D via the more efficient index notation. Finally, the equivalent formulation will be presented in the most compact notation of all: Gibbs' vector notation, complete with vector-valued test functions. Also, for tractability, we narrow the scope ab initio by presenting and discussing in any detail only a subset of all the possibilities, emphasizing the more common and preferred forms of the equations for the most part. We shall further emphasize the forms of the equations that have been of most interest to us and about which we have real experience, but will only summarize other forms, especially those more avant-garde forms that were developed, at least in part, because of the continuous search for better OBC's.
We emphasize via repetition at the outset: applied PDE's (i.e., those whose solution is sought) come with BC's (see Strang, 1986), and this is a significant feature of the various weak forms that can be developed; i.e., weak formulations of PDE's also come with (incorporate) BC's. Our terminology, not generally employed by others but useful in our opinion and already employed in the previous chapter, is this: by weak formulation, we mean the sequence of steps needed to go from the given PDE's + BC's to a specific weak form of them. Before developing weak forms, it may be well to pause and reflect upon the following quotation from Ladyzhenskaya (1969, p. 142), also for weak/generalized solutions: "Before becoming involved with precise formulations, we call the readers' attention to the fact that the statement 'it has been proved that the problem has a unique solution' can have very different meanings depending on the function space in which one looks for the solution. The form in which the requirements of the problem must be satisfied is different for different spaces, and different extensions of the concept of a solution of a problem, i.e. different 'generalized solutions', present themselves. In fact, for every problem there are infinitely many 'generalized solutions', but they coincide with the classical solution, if the latter exists."

3.12.1 The Conventional u-P Formulation and the Stress-Divergence Form Combined

To begin, we expand the NS equations, in the form $\partial\mathbf{u}/\partial t + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P = \nu\nabla\cdot[\nabla\mathbf{u} + \gamma(\nabla\mathbf{u})^T]$ (with $\gamma = 0$ or 1), into 2D cartesians:
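$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + \frac{\partial P}{\partial x} \equiv \frac{Du}{Dt} + \frac{\partial P}{\partial x} = \nu\left(\nabla^2 u + \gamma\frac{\partial}{\partial x}\nabla\cdot\mathbf{u}\right), \tag{3.12-1}$$

$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} + \frac{\partial P}{\partial y} \equiv \frac{Dv}{Dt} + \frac{\partial P}{\partial y} = \nu\left(\nabla^2 v + \gamma\frac{\partial}{\partial y}\nabla\cdot\mathbf{u}\right), \tag{3.12-2}$$

and, of course,

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0. \tag{3.12-3}$$

Remark: For $\gamma = 1$, the $\nabla\cdot\mathbf{u}$ term is retained in the stress-divergence form of the momentum equations, for 'generality' when we introduce/discover the NBC's. For $\gamma = 0$, we are starting with the 'conventional' ($\nabla^2$) formulation. While equivalent in the continuum, the discrete equations for $\gamma = 1$ differ from those for $\gamma = 0$, and so do the solutions (although usually not by much).

Upon setting BC's and IC's, these equations can be transformed from the above strong form to the more desirable (amenable) weak form. But this time we shall proceed slightly differently, hopefully profitably; i.e., we shall permit the weak formulation to show us the way to some legitimate BC's via the NBC's of the weak formulation. (But are they useful? That's another issue; but it turns out that the answer is, usually, yes.) And we shall use only the x-component of the NS equations to demonstrate the technique. Further toward our goal, we first rewrite the x-component of the NS equation (3.12-1) in the following equivalent form, a divergence form:

$$\frac{Du}{Dt} = \nabla\cdot\boldsymbol{\tau}_x, \tag{3.12-4}$$

where

$$\boldsymbol{\tau}_x \equiv \begin{bmatrix} (1+\gamma)\nu\dfrac{\partial u}{\partial x} - P \\[8pt] \nu\left(\dfrac{\partial u}{\partial y} + \gamma\dfrac{\partial v}{\partial x}\right) \end{bmatrix} \tag{3.12-5}$$

and we note [see (3.3-1)] for $\gamma = 1$ that the vector $\boldsymbol{\tau}_x = \mathbf{e}_x\cdot\boldsymbol{\sigma}$; i.e., it is the x-component of the stress tensor. Now we multiply (3.12-4) by a test function, $\phi^{(x)}$ (an x-direction test function), whose exact 'qualities' (function class) we defer defining until a more appropriate time, and integrate by parts, etc.:

$$\int_\Omega \phi^{(x)}\frac{Du}{Dt} = \int_\Omega \phi^{(x)}\nabla\cdot\boldsymbol{\tau}_x = \int_\Omega \nabla\cdot\left(\phi^{(x)}\boldsymbol{\tau}_x\right) - \int_\Omega \nabla\phi^{(x)}\cdot\boldsymbol{\tau}_x = \int_\Gamma \phi^{(x)}\,\mathbf{n}\cdot\boldsymbol{\tau}_x - \int_\Omega \nabla\phi^{(x)}\cdot\boldsymbol{\tau}_x,$$

which we rearrange to

$$\int_\Omega \left(\phi^{(x)}\frac{Du}{Dt} + \nabla\phi^{(x)}\cdot\boldsymbol{\tau}_x\right) = \int_\Gamma \phi^{(x)}\,\mathbf{n}\cdot\boldsymbol{\tau}_x$$

and 'expand' to

$$\int_\Omega \left\{\phi^{(x)}\frac{Du}{Dt} + \frac{\partial\phi^{(x)}}{\partial x}\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + \nu\frac{\partial\phi^{(x)}}{\partial y}\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right)\right\} = \int_\Gamma \phi^{(x)}\left\{n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + n_y\nu\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right)\right\}. \tag{3.12-6}$$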
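A further rearrangement of the LHS (isolating $\gamma$) gives

$$\int_\Omega \left\{\phi^{(x)}\frac{Du}{Dt} + \nu\nabla\phi^{(x)}\cdot\nabla u + \gamma\nu\left(\frac{\partial\phi^{(x)}}{\partial x}\frac{\partial u}{\partial x} + \frac{\partial\phi^{(x)}}{\partial y}\frac{\partial v}{\partial x}\right) - P\frac{\partial\phi^{(x)}}{\partial x}\right\} = \int_\Gamma \phi^{(x)}\left\{n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + n_y\nu\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right)\right\}, \tag{3.12-7}$$

and one more (isolating $u$ and $v$) gives

$$\int_\Omega \left\{\phi^{(x)}\frac{Du}{Dt} + \nu\left[(1+\gamma)\frac{\partial\phi^{(x)}}{\partial x}\frac{\partial u}{\partial x} + \frac{\partial\phi^{(x)}}{\partial y}\frac{\partial u}{\partial y}\right] + \nu\gamma\frac{\partial\phi^{(x)}}{\partial y}\frac{\partial v}{\partial x} - P\frac{\partial\phi^{(x)}}{\partial x}\right\} = \int_\Gamma \phi^{(x)}\left\{n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + n_y\nu\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right)\right\}, \tag{3.12-8}$$

which leads to the following

Remarks:

(1) If $\gamma = 0$, then the viscous term in (3.12-8) is clearly of the same form as the diffusive term in the advection-diffusion equation; e.g., see (2.2-2).

(2) The RHS of the above equations shows the way to an appropriate NBC for this form of the x-momentum equation; namely,

$$\mathbf{n}\cdot\boldsymbol{\tau}_x = n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + n_y\nu\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right) = F_x, \tag{3.12-9}$$

where $F_x$, an applied force (per unit area) in the x-direction, is prescribed input data. (The RHS above then becomes, simply, $\int_\Gamma \phi^{(x)}F_x$.) When $\gamma = 1$, $F_x$ is the true x-component of the applied traction vector; when $\gamma = 0$, we call $F_x$ a pseudo-traction as it is not then a physical force. But for either value of $\gamma$, (3.12-9) is a legitimate (and natural) BC, and a solution can be found once $F_x$ is specified. $F_x = 0$ is another example of a 'do-nothing' BC; no action is required on the part of the code user. Indeed, no action by the code writer is required if $F_x = 0$ is always the desired BC on $\Gamma$.

(3) Usually (3.12-9) is a BC that applies only over a portion of $\Gamma$, an issue we will soon return to as we continue the development of the weak form.

Digression: before completing the weak formulation, we show how a simple rearrangement of the div u term in (3.12-1) can lead to a different weak form. Rewrite (3.12-1) as

$$\frac{Du}{Dt} = \nu\nabla^2 u + \frac{\partial}{\partial x}\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right),$$

multiply by $\phi^{(x)}$, and integrate by parts as follows:

$$\begin{aligned}
\int_\Omega \phi^{(x)}\frac{Du}{Dt} &= \int_\Omega \nu\phi^{(x)}\nabla^2 u + \int_\Omega \phi^{(x)}\frac{\partial}{\partial x}\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right) \\
&= \int_\Omega \nu\left[\nabla\cdot\left(\phi^{(x)}\nabla u\right) - \nabla\phi^{(x)}\cdot\nabla u\right] + \int_\Omega \frac{\partial}{\partial x}\left[\phi^{(x)}\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right)\right] - \int_\Omega \frac{\partial\phi^{(x)}}{\partial x}\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right) \\
&= \int_\Gamma \phi^{(x)}\nu\frac{\partial u}{\partial n} - \int_\Omega \nu\nabla\phi^{(x)}\cdot\nabla u + \int_\Gamma \phi^{(x)}n_x\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right) - \int_\Omega \frac{\partial\phi^{(x)}}{\partial x}\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right),
\end{aligned}$$

which can be rearranged to

$$\int_\Omega \left\{\phi^{(x)}\frac{Du}{Dt} + \nu\nabla\phi^{(x)}\cdot\nabla u + \gamma\nu\frac{\partial\phi^{(x)}}{\partial x}\nabla\cdot\mathbf{u} - P\frac{\partial\phi^{(x)}}{\partial x}\right\} = \int_\Gamma \phi^{(x)}\left[\nu\frac{\partial u}{\partial n} + n_x\left(\gamma\nu\nabla\cdot\mathbf{u} - P\right)\right],$$

which can be further rearranged to the final weak form,

$$\int_\Omega \left\{\phi^{(x)}\frac{Du}{Dt} + \nu\left[(1+\gamma)\frac{\partial\phi^{(x)}}{\partial x}\frac{\partial u}{\partial x} + \frac{\partial\phi^{(x)}}{\partial y}\frac{\partial u}{\partial y} + \gamma\frac{\partial\phi^{(x)}}{\partial x}\frac{\partial v}{\partial y}\right] - P\frac{\partial\phi^{(x)}}{\partial x}\right\} = \int_\Gamma \phi^{(x)}\left\{n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P + \gamma\nu\frac{\partial v}{\partial y}\right] + n_y\nu\frac{\partial u}{\partial y}\right\}, \tag{3.12-10}$$

and we discover another NBC; namely,

$$n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P + \gamma\nu\frac{\partial v}{\partial y}\right] + n_y\nu\frac{\partial u}{\partial y} = F_x. \tag{3.12-11}$$

Remarks:

(1) If $\gamma = 0$ or $\nabla\cdot\mathbf{u} = 0$, then this formulation is identical to (3.12-8) and (3.12-9) with $\gamma = 0$.

(2) When $\gamma = 1$ and $\nabla\cdot\mathbf{u} \ne 0$ (which is the case for most FEM's; $\nabla\cdot\mathbf{u}$ is only weakly zero, not pointwise), (3.12-11) might be more useful as an OBC than (3.12-9); it cannot be used when true traction BC's are required.

(3) This formulation has not been tested in the CFD lab, to our knowledge.

End Digression: We now return to the more conventional weak formulation, a la (3.12-8), and augment it with the equivalent result from the y-momentum equation (3.12-2):

$$\int_\Omega \left\{\phi^{(y)}\frac{Dv}{Dt} + \nu\left[\frac{\partial\phi^{(y)}}{\partial x}\frac{\partial v}{\partial x} + (1+\gamma)\frac{\partial\phi^{(y)}}{\partial y}\frac{\partial v}{\partial y} + \gamma\frac{\partial\phi^{(y)}}{\partial x}\frac{\partial u}{\partial y}\right] - P\frac{\partial\phi^{(y)}}{\partial y}\right\} = \int_\Gamma \phi^{(y)}\left\{n_x\nu\left(\gamma\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) + n_y\left[(1+\gamma)\nu\frac{\partial v}{\partial y} - P\right]\right\}, \tag{3.12-12}$$

where $\phi^{(y)}$ is the generic test function for the y-equation. Boundary conditions can now be addressed, which will also permit the completion of the definition of the test functions,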
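$\phi^{(x)}$ and $\phi^{(y)}$, and the completion of the weak formulation, yielding a particular weak form. The most general BC's that are appropriate to this weak form are:

For $u$:

$$u = U \quad\text{on } \Gamma_D^u, \tag{3.12-13}$$

$$n_x\left[(1+\gamma)\nu\frac{\partial u}{\partial x} - P\right] + n_y\nu\left(\frac{\partial u}{\partial y} + \gamma\frac{\partial v}{\partial x}\right) = F_x \quad\text{on } \Gamma_N^u. \tag{3.12-14}$$

For $v$:

$$v = V \quad\text{on } \Gamma_D^v, \tag{3.12-15}$$

$$n_x\nu\left(\gamma\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) + n_y\left[(1+\gamma)\nu\frac{\partial v}{\partial y} - P\right] = F_y \quad\text{on } \Gamma_N^v, \tag{3.12-16}$$

where $\Gamma_D^u + \Gamma_N^u = \Gamma = \Gamma_D^v + \Gamma_N^v$ and $F_x$, $F_y$ are specified, as are $U$ and $V$. The weak formulation of the momentum equations is completed by restricting the class of test functions to vanish on the Dirichlet portions of the boundary, just as we did for the scalar equation in Chapter 2. Thus we restrict $\phi^{(x)}$ to vanish on $\Gamma_D^u$ and $\phi^{(y)}$ to vanish on $\Gamma_D^v$, so that the boundary integrals are effectively restricted to $\Gamma_N^u$ for (3.12-8) and to $\Gamma_N^v$ for (3.12-12). We now shift the focus to the continuity equation (3.12-3), and generate its weak form very simply via

$$\int_\Omega \psi\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) = 0, \tag{3.12-17}$$

where $\psi$ is a test function that is related to the pressure. The (nearly complete) weak form of the NS equations under the BC's of (3.12-13) through (3.12-16) can now be stated: Find $u \in H^1_{u,E}$, $v \in H^1_{v,E}$ and $P \in L^2$ such that

$$\int_\Omega \left\{\phi^{(x)}\frac{Du}{Dt} + \nu\left[(1+\gamma)\frac{\partial\phi^{(x)}}{\partial x}\frac{\partial u}{\partial x} + \frac{\partial\phi^{(x)}}{\partial y}\frac{\partial u}{\partial y} + \gamma\frac{\partial\phi^{(x)}}{\partial y}\frac{\partial v}{\partial x}\right] - P\frac{\partial\phi^{(x)}}{\partial x}\right\} = \int_{\Gamma_N^u} \phi^{(x)}F_x, \quad \forall\phi^{(x)} \in H^1_{u,0}, \tag{3.12-18}$$

$$\int_\Omega \left\{\phi^{(y)}\frac{Dv}{Dt} + \nu\left[\frac{\partial\phi^{(y)}}{\partial x}\frac{\partial v}{\partial x} + (1+\gamma)\frac{\partial\phi^{(y)}}{\partial y}\frac{\partial v}{\partial y} + \gamma\frac{\partial\phi^{(y)}}{\partial x}\frac{\partial u}{\partial y}\right] - P\frac{\partial\phi^{(y)}}{\partial y}\right\} = \int_{\Gamma_N^v} \phi^{(y)}F_y, \quad \forall\phi^{(y)} \in H^1_{v,0}, \tag{3.12-19}$$

and (3.12-17) $\forall\psi \in L^2$, where $H^1_{u,E}$ is that set of piecewise, once-differentiable functions in $\Omega$ that take on the value $U$ on $\Gamma_D^u$, $H^1_{v,E}$ is that similar set that takes on the value $V$ on $\Gamma_D^v$, $H^1_{u,0}$ and $H^1_{v,0}$ are their homogeneous counterparts ($\phi^{(x)} = 0$ on $\Gamma_D^u$ and $\phi^{(y)} = 0$ on $\Gamma_D^v$), and $L^2$ is the set of square integrable functions on $\Omega$.

Remarks:

(1) The resulting velocity field is called weakly solenoidal; or, divergence-free almost everywhere. [$\nabla\cdot\mathbf{u}$ could take the value seven, or any other finite constant, at one or at a countable infinity of points in $\Omega$, and still satisfy (3.12-17).]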
(2) The distinction between $\phi^{(x)}$ and $\phi^{(y)}$ is required because of the need to satisfy different BC's for $u$ and $v$ in general. Away from $\Gamma$, one may conveniently envision them as the same functions (as is indeed the case when we introduce the FEM).

(3) The solution space and the trial space (of test functions) for the pressure are identical because there are no explicit BC's on $P$.

(4) Since there are no spatial derivatives of $P$ in the final weak form, the pressure (and the test functions, $\{\psi\}$) is not even required to be continuous (thus $P \in L^2$ implies that $\nabla P$ need not even exist in the classical sense, as, of course, is indeed also the case for $\nabla^2\mathbf{u}$).

(5) The incorporation of NBC's into weak forms should now be obvious, as should the utility of weak formulations.

(6) If $P$ is replaced by $-\lambda\nabla\cdot\mathbf{u}$ and (3.12-17) dropped, then we have the weak form of the penalized momentum equation; see (3.5-10).

(7) If $\Gamma_N^u$ and $\Gamma_N^v$ are zero, then a solvability constraint enters (and must be satisfied for a solution to exist): $\oint_\Gamma (n_x U + n_y V) = 0$. In this case the pressure is determined only up to an arbitrary additive constant.

The problem specification is not actually complete until we specify the IC's, which in this case are

$$\mathbf{u}(\mathbf{x}, 0) = \mathbf{u}_0(\mathbf{x}), \tag{3.12-20}$$

where $\mathbf{u}_0(\mathbf{x})$ is subject to some significant constraints:

$$\text{(i)}\quad \nabla\cdot\mathbf{u}_0 = 0 \quad\text{in } \Omega, \tag{3.12-21}$$

$$\text{(ii)}\quad \mathbf{n}\cdot\mathbf{u}_0 = n_x U_0 + n_y V_0 \quad\text{on } \Gamma_D^u \cap \Gamma_D^v, \tag{3.12-22}$$

$$\text{(iii)}\quad \mathbf{n}\cdot\mathbf{u}_0 = n_x U_0 \quad\text{on } \Gamma_D^u \cap \Gamma_N^v, \tag{3.12-23}$$

and

$$\text{(iv)}\quad \mathbf{n}\cdot\mathbf{u}_0 = n_y V_0 \quad\text{on } \Gamma_N^u \cap \Gamma_D^v, \tag{3.12-24}$$

corresponding to (3.9-1), (3.9-2), where $U_0$, $V_0$ are the initial values of $U$ and $V$ of (3.12-13) and (3.12-15). Note that, as in the classical formulation of Section 3.9, if any of (3.12-21) through (3.12-24) are violated, the problem is ill-posed, and no solution exists. We also remark that applications of this formulation are probably usually limited to cases wherein $n_x$ and $n_y$ are either 0 or 1; i.e., the boundaries are aligned with the coordinate system.
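How a natural BC enters a weak form 'for free' can be seen in a much simpler setting than the NS equations. The following sketch is a scalar, 1D analog only, under stated assumptions: it solves $-\nu u'' = f$ on $(0, 1)$ with the essential BC $u(0) = 0$ and the natural BC $\nu u'(1) = F$, using linear elements and a constant source. The function name `solve_1d` and all parameter values are hypothetical. The essential BC is imposed by restricting the unknowns (and test functions); the natural BC appears only as one added entry in the load vector, and $F = 0$ requires no code at all (the 'do-nothing' case).

```python
import numpy as np

def solve_1d(nu, f, F, n_el=8):
    """Galerkin/linear-element solution of -nu u'' = f, u(0) = 0, nu u'(1) = F."""
    n = n_el + 1                              # number of nodes
    h = 1.0 / n_el                            # uniform element length
    K = np.zeros((n, n))
    b = np.zeros(n)
    ke = (nu / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])   # element stiffness
    for e in range(n_el):
        idx = [e, e + 1]
        K[np.ix_(idx, idx)] += ke             # assemble int nu phi_i' phi_j'
        b[idx] += f * h / 2.0                 # assemble int phi_i f (constant f)
    b[-1] += F                                # natural BC: boundary term phi(1) * F
    u = np.zeros(n)                           # essential BC: u[0] = 0 stays fixed
    u[1:] = np.linalg.solve(K[1:, 1:], b[1:]) # test functions vanish at x = 0
    return u

u_do_nothing = solve_1d(nu=1.0, f=1.0, F=0.0)   # exact: u = x - x^2 / 2
u_traction = solve_1d(nu=1.0, f=0.0, F=1.0)     # exact: u = x
print(u_do_nothing[-1], u_traction[-1])
```

Setting `F` to zero simply omits the boundary term, which is the 1D counterpart of the remark that $F_x = 0$ needs no action by the code user.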
More general cases, to be discussed later, will usually have BC's applied in normal and tangential directions, thus necessitating transformations of the equations via (local) rotations. Finally, we leave as an exercise the demonstration of reversibility; i.e., if $\mathbf{u}$ and $P$ are sufficiently smooth, then the solution of (3.12-17) through (3.12-24) is also a classical solution.

An alternative and shorter derivation will now be presented, which also generalizes (to 3D) the above. It is based on the cartesian index notation, complete with the summation convention. The NS equations are first rewritten as

$$\frac{Du_\alpha}{Dt} + \frac{\partial P}{\partial x_\alpha} = \nu\left(\nabla^2 u_\alpha + \gamma\frac{\partial^2 u_\beta}{\partial x_\beta\,\partial x_\alpha}\right), \quad \alpha = 1, 2, \ldots, n_s, \tag{3.12-25}$$
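and

$$\frac{\partial u_\alpha}{\partial x_\alpha} = 0, \tag{3.12-26}$$

where $Du_\alpha/Dt = \partial u_\alpha/\partial t + u_\beta(\partial u_\alpha/\partial x_\beta)$, $\nabla^2 u_\alpha = \partial^2 u_\alpha/\partial x_\beta\partial x_\beta$, and $n_s$ is the number of spatial dimensions (two or three). [We use Greek indices to denote spatial vectors (and directions) because this will help us later when we introduce nodes and finite element basis functions.] Next, multiply (3.12-25) by the generic $\alpha$-direction test function $\phi^{(\alpha)}$ and integrate by parts:

$$\int_\Omega \left\{\phi^{(\alpha)}\frac{Du_\alpha}{Dt} + \nu\left(\nabla\phi^{(\alpha)}\cdot\nabla u_\alpha + \gamma\frac{\partial\phi^{(\alpha)}}{\partial x_\beta}\frac{\partial u_\beta}{\partial x_\alpha}\right) - P\frac{\partial\phi^{(\alpha)}}{\partial x_\alpha}\right\} = \int_\Gamma \phi^{(\alpha)}\left[\nu\left(\frac{\partial u_\alpha}{\partial n} + \gamma n_\beta\frac{\partial u_\beta}{\partial x_\alpha}\right) - Pn_\alpha\right], \quad \alpha = 1, 2, \ldots, n_s, \tag{3.12-27}$$

where $\partial u_\alpha/\partial n = n_\beta(\partial u_\alpha/\partial x_\beta)$, and the parenthetical superscript $(\alpha)$ on $\phi$ denotes the important restriction: 'no sum on $\alpha$', both here and hereafter. To complete the formulation we simply restrict the test functions as follows: $\phi^{(\alpha)} = 0$ on $\Gamma_D^\alpha$, where $\Gamma_D^\alpha$ denotes that part of $\Gamma$ on which $u_\alpha$ is specified; i.e., corresponding to (3.12-13) and (3.12-15), we now also have the BC's stated more compactly as

$$u_\alpha = U_\alpha \ \text{ on } \Gamma_D^\alpha \quad\text{and}\quad \nu\left(\frac{\partial u_\alpha}{\partial n} + \gamma n_\beta\frac{\partial u_\beta}{\partial x_\alpha}\right) - Pn_\alpha = F_\alpha \ \text{ on } \Gamma_N^\alpha, \tag{3.12-28}$$

where $\Gamma_D^\alpha + \Gamma_N^\alpha = \Gamma$; $\alpha = 1, 2, \ldots, n_s$. The statement of the weak form is now: find $u_\alpha \in H^1_{\alpha,E}$ and $P \in L^2$ such that

$$\int_\Omega \left\{\phi^{(\alpha)}\frac{Du_\alpha}{Dt} + \nu\left(\nabla\phi^{(\alpha)}\cdot\nabla u_\alpha + \gamma\frac{\partial\phi^{(\alpha)}}{\partial x_\beta}\frac{\partial u_\beta}{\partial x_\alpha}\right) - P\frac{\partial\phi^{(\alpha)}}{\partial x_\alpha}\right\} = \int_{\Gamma_N^\alpha} \phi^{(\alpha)}F_\alpha, \quad \alpha = 1, 2, \ldots, n_s, \tag{3.12-29}$$

and

$$\int_\Omega \psi\frac{\partial u_\beta}{\partial x_\beta} = 0 \tag{3.12-30}$$

$\forall\phi^{(\alpha)} \in H^1_{\alpha,0}$ and $\forall\psi \in L^2$, subject to appropriate IC's, which we defer, temporarily [until Section 3.13.1; but they are the same as those in (3.12-20) through (3.12-24)].

Remarks:

(1) Although the notation is different, it should be clear that (3.12-17) through (3.12-19) and (3.12-28) through (3.12-30) are describing the very same problem.

(2) This compact notation will prove useful when we push on to find approximate solutions to the weak form via Galerkin's method.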
(3) Since $P$ will eventually become a linear combination of the $\psi$'s, (3.12-30) implies $\int_\Omega P\,\nabla\cdot\mathbf{u} = 0$, which in turn implies $\int_\Omega \mathbf{u}\cdot\nabla P = \int_\Gamma P\,\mathbf{n}\cdot\mathbf{u}$; thus, if the flow is parallel to the boundary ($\mathbf{n}\cdot\mathbf{u} = 0$), the velocity is orthogonal to the pressure gradient. This observation is also true for the classical solution, and may even be counterintuitive since it tells us that, on average, the flow is anything but 'down (parallel to) the pressure gradient.' [Indeed, if $\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma$, then $\mathbf{u}$ is orthogonal to the gradient of any scalar function in $L^2$, a fact that is more mathematical than physical, and corresponds to an orthogonal decomposition (Helmholtz/Weyl) of $L^2$ into the space of divergence-free vector functions and that of gradients of scalars; the pressure is, in this sense, not so special.]

There is a third equivalent way to state the same problem; it is even more condensed/succinct than that just presented. It retains vector/tensor/dyadic notation throughout, starting with

$$\frac{D\mathbf{u}}{Dt} + \nabla P = \nabla\cdot\mathbf{d}, \tag{3.12-31}$$

$$\nabla\cdot\mathbf{u} = 0, \tag{3.12-32}$$

where

$$\mathbf{d} = \nu\left[\nabla\mathbf{u} + \gamma(\nabla\mathbf{u})^T\right] \tag{3.12-33}$$

is the (symmetric when $\gamma = 1$) viscous stress tensor (or pseudo-stress tensor if $\gamma = 0$), and we leave (for the moment, anyway) cartesian coordinates behind; i.e., these are the coordinate-free NS equations. The next step is to form the scalar product of the momentum equation with a generic vector test function, $\mathbf{v}$, whose 'properties' will be defined/discovered along the way, and (of course) perform an integration by parts:

$$\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \mathbf{v}\cdot\nabla P = \mathbf{v}\cdot(\nabla\cdot\mathbf{d})$$

or, equivalently,

$$\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nabla\cdot(P\mathbf{v}) - P\,\nabla\cdot\mathbf{v} = \nabla\cdot(\mathbf{d}\cdot\mathbf{v}) - \mathbf{d}^T:\nabla\mathbf{v}$$

to give

$$\int_\Omega \left\{\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\left[(\nabla\mathbf{u})^T + \gamma\nabla\mathbf{u}\right]:\nabla\mathbf{v} - P\,\nabla\cdot\mathbf{v}\right\} = \int_\Gamma \left(\mathbf{d}^T\cdot\mathbf{n} - P\mathbf{n}\right)\cdot\mathbf{v}, \tag{3.12-34}$$

an equation that (for $\gamma = 0$) at least 'resembles' (3.12-27), an observation that uses (for $\gamma = 0$) $\mathbf{d}^T\cdot\mathbf{n} = \nu(\nabla\mathbf{u})^T\cdot\mathbf{n} = \nu\,\mathbf{n}\cdot\nabla\mathbf{u} = \nu\,\partial\mathbf{u}/\partial n$.
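To make the natural BC of (3.12-28) concrete, the sketch below evaluates $F_\alpha = \nu(\partial u_\alpha/\partial n + \gamma n_\beta\,\partial u_\beta/\partial x_\alpha) - Pn_\alpha$ from a given velocity gradient, pressure, and unit normal. The function name `natural_bc_flux`, the storage convention `grad_u[b, a]` $= \partial u_a/\partial x_b$, and all numerical values are assumptions made for illustration.

```python
import numpy as np

def natural_bc_flux(grad_u, P, n, nu, gamma):
    """F_a = nu * (du_a/dn + gamma * n_b du_b/dx_a) - P * n_a.

    grad_u[b, a] holds du_a/dx_b (assumed storage convention);
    n is the outward unit normal; gamma is 0 or 1.
    """
    n = np.asarray(n, dtype=float)
    du_dn = n @ grad_u            # n_b du_a/dx_b  (normal derivative of each component)
    transpose_part = grad_u @ n   # n_b du_b/dx_a  (extra stress-divergence term)
    return nu * (du_dn + gamma * transpose_part) - P * n

# Hypothetical 2D data: divergence-free gradient (u_x + v_y = 0).
u_x, u_y, v_x, v_y = 0.3, 0.5, -0.2, -0.3
grad_u = np.array([[u_x, v_x],
                   [u_y, v_y]])
P, nu = 2.0, 0.1
n = np.array([0.6, 0.8])

F_traction = natural_bc_flux(grad_u, P, n, nu, gamma=1)   # true traction
F_pseudo = natural_bc_flux(grad_u, P, n, nu, gamma=0)     # pseudo-traction
print(F_traction, F_pseudo)
```

With `gamma=0` the result reduces to $\nu\,\partial\mathbf{u}/\partial n - P\mathbf{n}$, which is the form identified in the observation just made; with `gamma=1` the first component agrees with the 2D expression for $F_x$ in (3.12-9).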
But (3.12-34) is just a single scalar equation (for each v), and (3.12-27) is clearly a vector equation—so something seems to be amiss. To complete the connection, we must realize that the (vector) test function must ultimately range over all possibilities, some of which will contain non-zero entries only in one of its components, others of which will have non-zeros only in another of its components, etc. And there is an infinite number of each. Thus, (3.12-34) is simply a short-cut notation that actually implies—and should be interpreted as—a set of vector equations with ns components. In fact, the easiest way to obtain (3.12-27) from (3.12-34) is to specialize the set of vector test functions;
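i.e., set $\mathbf{v} = \mathbf{e}_\alpha\phi^{(\alpha)}$, recalling that there is no summation because of ( ), use $\mathbf{u} = \mathbf{e}_\beta u_\beta$, $\nabla = \mathbf{e}_\beta\,\partial/\partial x_\beta$, etc., and let $\alpha$ range over 1 to $n_s$, an exercise we leave to the reader.

Remarks:

(1) A nice feature about this formulation for $\gamma = 1$ is the ease with which the kinetic energy equation can be obtained: just set $\mathbf{v} = \mathbf{u}$ in (3.12-34), utilize $\int_\Omega P\,\nabla\cdot\mathbf{u} = 0$, and set $\mathbf{F} = \mathbf{d}^T\cdot\mathbf{n} - P\mathbf{n}$ to get

$$\int_\Omega \mathbf{u}\cdot\frac{D\mathbf{u}}{Dt} = \int_\Gamma \mathbf{F}\cdot\mathbf{u} - \nu\int_\Omega \left[\nabla\mathbf{u} + (\nabla\mathbf{u})^T\right]:\nabla\mathbf{u},$$

which is easily reduced to (3.11-10) by invoking $\nabla\cdot\mathbf{u} = 0$.

(2) The following identities, with mix of notation, are interesting: $(\nabla\mathbf{u})^T:\nabla\mathbf{v} = \nabla\mathbf{u}:(\nabla\mathbf{v})^T = \nabla u_i\cdot\nabla v_i$, a la Table 3.1-2.

(3) A more compact yet statement of (3.12-34) for $\gamma = 1$ is

$$\int_\Omega \left(\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \boldsymbol{\sigma}:\nabla\mathbf{v}\right) = \int_\Gamma \mathbf{n}\cdot\boldsymbol{\sigma}\cdot\mathbf{v},$$

where $\boldsymbol{\sigma} = \nu[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] - P\mathbf{I}$ is the total stress tensor, for which the classical form is $D\mathbf{u}/Dt = \nabla\cdot\boldsymbol{\sigma}$, a la (3.3-1).

(4) Related to Remark (3) following (3.12-30), it is important to emphasize that even though $\int_\Omega \psi\,\nabla\cdot\mathbf{u} = 0$ and thus $\int_\Omega P\,\nabla\cdot\mathbf{u} = 0$, it is not true that $\int_\Omega P\,\nabla\cdot\mathbf{v} = 0$; this is because $\int_\Omega \psi\,\nabla\cdot\mathbf{v} \ne 0$: our test functions are not even weakly divergence-free. Indeed, this is one of the major advantages of the mixed formulation, since useful divergence-free test and trial functions are hard to find (but not impossible; see below, Section 3.13.7).

3.12.2 Other u-P Formulations

Having covered one formulation in detail, we now cover others in more abbreviated form to conserve space; the missing details should not be hard to fill in. Thus,

a. Full divergence form

Starting from (3.4-2),

$$\frac{\partial u_\alpha}{\partial t} + \frac{\partial}{\partial x_\beta}\left(u_\alpha u_\beta - \sigma_{\alpha\beta}\right) = 0, \quad \alpha = 1, 2, \ldots, n_s, \tag{3.12-35}$$

where

$$\sigma_{\alpha\beta} = \nu\left(\frac{\partial u_\alpha}{\partial x_\beta} + \gamma\frac{\partial u_\beta}{\partial x_\alpha}\right) - P\delta_{\alpha\beta},$$

the weak form is generated via

$$\int_\Omega \phi^{(\alpha)}\frac{\partial u_\alpha}{\partial t} + \int_\Omega \frac{\partial}{\partial x_\beta}\left[\phi^{(\alpha)}\left(u_\alpha u_\beta - \sigma_{\alpha\beta}\right)\right] - \int_\Omega \frac{\partial\phi^{(\alpha)}}{\partial x_\beta}\left(u_\alpha u_\beta - \sigma_{\alpha\beta}\right) = 0,$$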
which leads directly to

$$\int_\Omega \left\{\phi^{(\alpha)}\frac{\partial u_\alpha}{\partial t} + \frac{\partial\phi^{(\alpha)}}{\partial x_\beta}\left[\nu\left(\frac{\partial u_\alpha}{\partial x_\beta} + \gamma\frac{\partial u_\beta}{\partial x_\alpha}\right) - P\delta_{\alpha\beta} - u_\alpha u_\beta\right]\right\} = \int_\Gamma \phi^{(\alpha)}n_\beta\left[\nu\left(\frac{\partial u_\alpha}{\partial x_\beta} + \gamma\frac{\partial u_\beta}{\partial x_\alpha}\right) - P\delta_{\alpha\beta} - u_\alpha u_\beta\right], \quad \alpha = 1, 2, \ldots, n_s. \tag{3.12-36}$$

Except for the NBC, this form is not much different; the NBC is

$$\nu\left(\frac{\partial u_\alpha}{\partial n} + \gamma\frac{\partial u_n}{\partial x_\alpha}\right) - \left(Pn_\alpha + u_n u_\alpha\right) = F_\alpha, \tag{3.12-37}$$

where $u_n = \mathbf{n}\cdot\mathbf{u}$. Here $-F_\alpha$ represents the total momentum flux (at least when $\gamma = 1$); see (3.8-6) and (3.8-7). It would seem rare that one would actually know the total momentum flux at an inlet, but if one did, then this is the proper weak form. But it is at the outlet that this NBC could actually cause trouble, probably even more so than it would for the advection-diffusion equation; cf. (2.4-22) of Chapter 2. The momentum flux must get out, but its value ($F_\alpha$) is generally not known, yet must be specified (zero or any constant probably would not work). Advice: do not use this formulation for an OBC, unless $F_\alpha$ is known, as in some of the Scriven references cited in Section 3.8.1; but if you do, and $F_\alpha$ is not known, set $-F_\alpha$ to the average inlet advective momentum flux, $(1/H)\int_0^H u_n u_\alpha\,dl$ at the inlet, and cross your fingers.

b. Skew-symmetric form

This form, (3.4-5), will only be interesting when we later return to the subject of global energy conservation (Section 3.13.8) because it, not surprisingly, leads to a skew-symmetric advection matrix. Here it is sufficient to state that it is the same as (3.12-29) with $\gamma = 0$ and $Du_\alpha/Dt$ replaced by $Du_\alpha/Dt + \tfrac{1}{2}u_\alpha\,\partial u_\beta/\partial x_\beta$.

c. Rotational form and other curl forms

These forms are worth developing in more detail because div is 'replaced by' curl, and the necessary integration by parts formulas are different, and additional vector identities are needed.
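It is also more convenient to work in vector notation; thus, (3.4-4) becomes, for our purposes here,

$$\frac{\partial\mathbf{u}}{\partial t} + \boldsymbol{\omega}\times\mathbf{u} + \nabla P_T = -\nu\,\nabla\times\nabla\times\mathbf{u}, \tag{3.12-38}$$

and it is the curl-curl term that needs work in order to be cast into the appropriate weak form. Toward this end, then, let us temporarily digress and seek the weak form of

$$\nabla\times\nabla\times\mathbf{u} = \mathbf{f}, \tag{3.12-39}$$

where $\nabla\times\mathbf{u} = \boldsymbol{\omega}$. This is best done using vector test functions; so we start with $\int_\Omega \mathbf{v}\cdot\nabla\times\nabla\times\mathbf{u} = \int_\Omega \mathbf{v}\cdot\mathbf{f}$. Next, recall the vector identity,

$$\nabla\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot(\nabla\times\mathbf{A}) - \mathbf{A}\cdot(\nabla\times\mathbf{B})$$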
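to give, with $\mathbf{A} = \nabla\times\mathbf{u}$ and $\mathbf{B} = \mathbf{v}$,

$$\mathbf{v}\cdot\nabla\times\nabla\times\mathbf{u} = \nabla\cdot[(\nabla\times\mathbf{u})\times\mathbf{v}] + \nabla\times\mathbf{u}\cdot\nabla\times\mathbf{v} = -\nabla\cdot(\mathbf{v}\times\nabla\times\mathbf{u}) + \nabla\times\mathbf{u}\cdot\nabla\times\mathbf{v},$$

so that

$$\int_\Omega \nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} = \int_\Gamma \mathbf{n}\cdot(\mathbf{v}\times\boldsymbol{\omega}) + \int_\Omega \mathbf{v}\cdot\mathbf{f}$$

via the divergence theorem. Now we work with the boundary term; first, we decompose $\boldsymbol{\omega}$ into its normal and tangential components, $\boldsymbol{\omega} = (\mathbf{n}\cdot\boldsymbol{\omega})\mathbf{n} + \mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}$, to obtain

$$\mathbf{n}\cdot\mathbf{v}\times\boldsymbol{\omega} = \mathbf{n}\cdot\mathbf{v}\times[(\mathbf{n}\cdot\boldsymbol{\omega})\mathbf{n} + \mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}] = \mathbf{n}\cdot\mathbf{v}\times(\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}) = (\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n})\cdot(\mathbf{n}\times\mathbf{v})$$

since $\mathbf{n}\cdot\mathbf{v}\times(\mathbf{n}\cdot\boldsymbol{\omega})\mathbf{n} = (\mathbf{n}\cdot\boldsymbol{\omega})(\mathbf{n}\cdot\mathbf{v}\times\mathbf{n}) = 0$, and we have used the triple vector product identity $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \mathbf{C}\cdot\mathbf{A}\times\mathbf{B}$, where here $\mathbf{A} = \mathbf{n}$, $\mathbf{B} = \mathbf{v}$, and $\mathbf{C} = \mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}$, the tangential vorticity. The final weak form of (3.12-39) is thus

$$\int_\Omega \nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} = \int_\Omega \mathbf{v}\cdot\mathbf{f} + \int_\Gamma (\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n})\cdot(\mathbf{n}\times\mathbf{v}),$$

and our digression is over. The weak form of (3.12-38) that is of interest can now be easily developed, and the result is

$$\int_\Omega \left[\mathbf{v}\cdot\left(\frac{\partial\mathbf{u}}{\partial t} + \boldsymbol{\omega}\times\mathbf{u}\right) + \nu\,\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} - P_T\,\nabla\cdot\mathbf{v}\right] = \int_\Gamma \left[\nu(\mathbf{n}\times\mathbf{v})\cdot(\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}) - \mathbf{n}\cdot\mathbf{v}\,P_T\right], \tag{3.12-40}$$

which is seen to introduce the following new NBC's:

$$\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n} = \mathbf{f} \quad\text{and}\quad P_T = g; \tag{3.12-41}$$

both the tangential vorticity and the total pressure are specified. A closely related result follows easily from the curl form [(3.6-13)] rather than (3.6-2) or (3.4-4):

$$\int_\Omega \left[\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\,\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} - P\,\nabla\cdot\mathbf{v}\right] = \int_\Gamma \left[\nu(\mathbf{n}\times\mathbf{v})\cdot(\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}) - \mathbf{n}\cdot\mathbf{v}\,P\right], \tag{3.12-42}$$

where $P$ is now the 'usual' (static) pressure, with NBC's of

$$\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n} = \mathbf{f} \quad\text{and}\quad P = g; \tag{3.12-43}$$

the tangential vorticity and the pressure are specified, both weakly.

Remarks:

(1) This result agrees with that in equation (4.11.3) of Gunzburger (1989) once the 'typos' there are corrected; the sign of $s$ in his equations (4.11.3) and (4.11.4) should be negated, as should that of $r$ in equation (4.11.4).

(2) If the tangential velocity is specified on $\Gamma$, then the first boundary integral in (3.12-42) vanishes, and the second involves only $\mathbf{n}\cdot\mathbf{v}\,P$; thus, if the normal velocity is not specified, then we have an NBC in terms of pressure alone ($P = g$, say).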
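(3) Finally, if the normal velocity is specified, $\mathbf{n}\cdot\mathbf{v} = 0$, and if the tangential velocity is not specified, then we obtain an NBC that specifies only the tangential component of vorticity.

Another closely related and more stable (Gunzburger, 1989) form starts from the div-curl form given by (3.6-12) and becomes [integrating also $\mathbf{v}\cdot\nabla(\nabla\cdot\mathbf{u})$ by parts]:
$$\int \left[\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\,\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} + (\nu\nabla\cdot\mathbf{u} - P)\,\nabla\cdot\mathbf{v}\right] = \int_\Gamma \left[\nu(\mathbf{n}\times\mathbf{v})\cdot(\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}) + (\mathbf{n}\cdot\mathbf{v})(\nu\nabla\cdot\mathbf{u} - P)\right], \tag{3.12-44}$$
with NBC's of
$$\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n} = \mathbf{f} \quad\text{and}\quad P - \nu\nabla\cdot\mathbf{u} = g, \tag{3.12-45}$$
which is equation (4.11.4) in Gunzburger (1989) once $r$ is changed to $-r$ there.

Remark: It is probably usually safe to assume that $\nu\nabla\cdot\mathbf{u} \approx 0$ in (3.12-45) and to interpret it as specifying (weakly) a pressure, as in (3.12-43).

Thus, we see that the curl forms open up three new NBC possibilities (while simultaneously closing others): (i) specify tangential velocity and (as an NBC) pressure, (ii) specify tangential vorticity and pressure (both as NBC's), and (iii) specify normal velocity and (as an NBC) tangential vorticity. All of these, of course, are in addition to the conventional Dirichlet (essential) BC of specified velocity. In an attempt to summarize the situation and complete the presentation, let us define a particular problem based on (3.12-42):
$$\mathbf{u} = \mathbf{w} \quad\text{on } \Gamma_1, \tag{3.12-46}$$
$$\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n} = \mathbf{f}_2 \quad\text{and}\quad \mathbf{u}\cdot\mathbf{n} = w_2 \quad\text{on } \Gamma_2, \tag{3.12-47}$$
$$\mathbf{n}\times\mathbf{u}\times\mathbf{n} = \mathbf{w}_3 \quad\text{and}\quad P = P_3 \quad\text{on } \Gamma_3, \tag{3.12-48}$$
$$\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n} = \mathbf{f}_4 \quad\text{and}\quad P = P_4 \quad\text{on } \Gamma_4. \tag{3.12-49}$$
The weak form is as follows: find $\mathbf{u}$ that satisfies $\mathbf{u} = \mathbf{w}$ on $\Gamma_1$, $\mathbf{u}\cdot\mathbf{n} = w_2$ on $\Gamma_2$, and $\mathbf{n}\times\mathbf{u}\times\mathbf{n} = \mathbf{w}_3$ on $\Gamma_3$ from
$$\int \left(\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\,\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} - P\,\nabla\cdot\mathbf{v}\right) = \int_{\Gamma_2} \nu(\mathbf{n}\times\mathbf{v})\cdot\mathbf{f}_2 - \int_{\Gamma_3} \mathbf{n}\cdot\mathbf{v}\,P_3 + \int_{\Gamma_4} \left[\nu(\mathbf{n}\times\mathbf{v})\cdot\mathbf{f}_4 - \mathbf{n}\cdot\mathbf{v}\,P_4\right] \quad \forall\,\mathbf{v} \tag{3.12-50}$$
that is once-piecewise differentiable and also satisfies: $\mathbf{v} = \mathbf{0}$ on $\Gamma_1$, $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_2$, and $\mathbf{n}\times\mathbf{v} = \mathbf{0}$ on $\Gamma_3$.

Remarks:
(1) It is not at all obvious if or when one would actually wish to solve such an IBVP, but the possibility does exist, as do simpler subsets.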
(2) The continuity equation, (3.12-30), must of course be solved simultaneously.
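The two vector identities that carry the boundary term in the curl-form derivation above can be spot-checked numerically. The following is only a minimal sketch in plain Python, not part of the original presentation; the helper names (`tangential_part`, `boundary_identity_residuals`) are hypothetical, and $\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n}$ is read here as $(\mathbf{n}\times\boldsymbol{\omega})\times\mathbf{n}$ with $\mathbf{n}$ a unit vector.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(s, a):
    return tuple(s * x for x in a)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def normalize(a):
    return scale(1.0 / dot(a, a) ** 0.5, a)

def tangential_part(n, w):
    """(n x w) x n: the component of w tangent to a surface with unit normal n."""
    return cross(cross(n, w), n)

def boundary_identity_residuals(n, v, w):
    """Return the residuals of the two identities used for the boundary term.

    First residual:  w - [(n.w) n + n x w x n]       (normal/tangential split)
    Second residual: n.(v x w) - (n x w x n).(n x v)  (triple product step)
    """
    w_tan = tangential_part(n, w)
    recomposed = add(scale(dot(n, w), n), w_tan)
    r_split = max(abs(x - y) for x, y in zip(recomposed, w))
    r_triple = abs(dot(n, cross(v, w)) - dot(w_tan, cross(n, v)))
    return r_split, r_triple
```

Both residuals should vanish to rounding error for any unit normal `n` and arbitrary `v` and `w`, which is all that the step from $\mathbf{n}\cdot(\mathbf{v}\times\boldsymbol{\omega})$ to $(\mathbf{n}\times\boldsymbol{\omega}\times\mathbf{n})\cdot(\mathbf{n}\times\mathbf{v})$ requires.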
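To conclude this discussion, and perhaps shed more light on weak forms involving curl, let us write the weak div-curl form, (3.12-44), in simple 2D cartesians (via $\mathbf{v} = \mathbf{e}_\alpha\phi^{(\alpha)}$, etc.), wherein $\omega = \partial v/\partial x - \partial u/\partial y$:
$$\int \left[\phi^{(x)}\frac{Du}{Dt} + \nu\frac{\partial\phi^{(x)}}{\partial y}\left(\frac{\partial u}{\partial y} - \frac{\partial v}{\partial x}\right) + \frac{\partial\phi^{(x)}}{\partial x}(\nu\nabla\cdot\mathbf{u} - P)\right] = \int_\Gamma \phi^{(x)}\left[-\nu n_y\omega + n_x(\nu\nabla\cdot\mathbf{u} - P)\right] \tag{3.12-51}$$
and
$$\int \left[\phi^{(y)}\frac{Dv}{Dt} + \nu\frac{\partial\phi^{(y)}}{\partial x}\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) + \frac{\partial\phi^{(y)}}{\partial y}(\nu\nabla\cdot\mathbf{u} - P)\right] = \int_\Gamma \phi^{(y)}\left[\nu n_x\omega + n_y(\nu\nabla\cdot\mathbf{u} - P)\right], \tag{3.12-52}$$
whose corresponding strong form reads
$$\frac{Du}{Dt} + \nu\frac{\partial}{\partial y}\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) + \frac{\partial}{\partial x}(P - \nu\nabla\cdot\mathbf{u}) = 0 \tag{3.12-53}$$
and
$$\frac{Dv}{Dt} + \nu\frac{\partial}{\partial x}\left(\frac{\partial u}{\partial y} - \frac{\partial v}{\partial x}\right) + \frac{\partial}{\partial y}(P - \nu\nabla\cdot\mathbf{u}) = 0, \tag{3.12-54}$$
where the simpler (but less stable computationally, according to Gunzburger, 1989) curl form is obtained by setting $\nabla\cdot\mathbf{u} = 0$ in each of the above.

Remarks:
(1) If these forms were to be used in a problem with outflow in the $x$-direction (say) via homogeneous NBC's ($\omega = 0$ and $P = 0$ at outlet), the exit flow would be forced to satisfy the irrotational constraint $\partial v/\partial x = \partial u/\partial y$ and the 'non-Bernoulli' constraint $P_T = P + q^2/2 = q^2/2$. Thus, if a near-potential flow is thought or assumed to exist at the exit plane, the full rotational form of (3.12-38) should be used instead because then the homogeneous NBC's are compatible with a potential flow: $\omega = 0$ and $P_T = P + q^2/2 = 0$.
(2) If the full rotational form is used, $\mathbf{u}\cdot\nabla\mathbf{u}$ in (3.12-53) and (3.12-54) is replaced by $\boldsymbol{\omega}\times\mathbf{u}$; i.e.,
$$\mathbf{u}\cdot\nabla u = u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} \quad\text{goes to}\quad -v\omega = v\left(\frac{\partial u}{\partial y} - \frac{\partial v}{\partial x}\right)$$
and
$$\mathbf{u}\cdot\nabla v = u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} \quad\text{goes to}\quad u\omega = u\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right).$$

d. Other recent formulations

Here we simply summarize some of what others have done recently via more avant-garde weak forms in a weak attempt at 'completeness.' The principal intent is to make the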
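reader aware of yet other options. Specifically, we show three additional cases [all of which must be appended by (3.12-30)], the first two from Pironneau (1989) and the last from Girault (1988b), and all three are our own 'generalizations/interpretations,' which may not be precise.

1. The BC's $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and $\mathbf{u}\cdot\mathbf{n} = u_\Gamma$, $\mathbf{n}\times\boldsymbol{\omega} = \mathbf{d}$, $\partial P/\partial n = \mathbf{f}\cdot\mathbf{n} - \nabla_\Gamma\cdot\mathbf{d}$ on $\Gamma_N$, where $\nabla_\Gamma$ is the surface gradient operator and it is required that $\mathbf{n}\cdot\mathbf{d} = 0$, can be satisfied via the following weak form: find $\mathbf{u}$ that satisfies $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and $\mathbf{u}\cdot\mathbf{n} = w_\Gamma$ on $\Gamma_N$ from
$$\int \left[\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\,\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} + (\nu\nabla\cdot\mathbf{u} - P)\,\nabla\cdot\mathbf{v}\right] = \int \mathbf{v}\cdot\mathbf{f} + \int_{\Gamma_N} \mathbf{v}\cdot\mathbf{d} \quad \forall\,\mathbf{v} \tag{3.12-55}$$
that is once-piecewise differentiable and also satisfies: $\mathbf{v} = \mathbf{0}$ on $\Gamma_D$, $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_N$.

Remarks:
(1) $\mathbf{f}$ is a specified body force (acceleration).
(2) This is a slight (and risky?) generalization (from Stokes to NS) of that presented in Pironneau (1989).

2. The BC's $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and, on $\Gamma_N$, the three BC's $\mathbf{n}\cdot\mathbf{u} = u_\Gamma$, $\mathbf{n}\cdot\boldsymbol{\omega} = 0$, and $\partial P/\partial n = \mathbf{n}\cdot\mathbf{f}$, where $\mathbf{f}$ is a body force, can be satisfied using the following weak form: find $\mathbf{u}$ that satisfies $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and $\mathbf{u}\cdot\mathbf{n} = w_\Gamma$ on $\Gamma_N$ from
$$\int \left[\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\,\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} + (\nu\nabla\cdot\mathbf{u} - P)\,\nabla\cdot\mathbf{v}\right] = \int \mathbf{v}\cdot\mathbf{f} \quad \forall\,\mathbf{v} \tag{3.12-56}$$
that is once-piecewise differentiable and also satisfies: $\mathbf{v} = \mathbf{0}$ on $\Gamma_D$, $\mathbf{n}\cdot\mathbf{v} = 0$ and $\mathbf{n}\cdot\nabla\times\mathbf{v} = 0$ on $\Gamma_N$, where Remark (2) above applies again.

3. The BC's $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and, on $\Gamma_N$, the three BC's $\mathbf{n}\cdot\mathbf{u} = 0$, $\mathbf{n}\cdot\boldsymbol{\omega} = 0$, and $\mathbf{n}\cdot(\nabla\times\boldsymbol{\omega}) = 0$. These can be satisfied by the weak form: find $\mathbf{u}$ subject to $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and $\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma_N$ from
$$\int \left\{\mathbf{v}\cdot\frac{\partial\mathbf{u}}{\partial t} + \nu\,\nabla\times\mathbf{v}\cdot\nabla\times\mathbf{u} + \mathbf{v}\cdot[(\nabla\times\mathbf{u})\times\mathbf{u}] + \mathbf{v}\cdot\nabla\left(P + \tfrac{1}{2}\mathbf{u}\cdot\mathbf{u}\right)\right\} = \int \mathbf{v}\cdot\mathbf{f} \quad \forall\,\mathbf{v}\in H^1 \tag{3.12-57}$$
that vanishes on $\Gamma_D$ and has $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_N$.

Remarks:
(1) Again, this presentation is an 'interpretation' of the presentation in Girault (1988b), which should be consulted if these BC's are seriously contemplated.
(2) See also Girault and Raviart (1986) for a few other interesting weak formulations.

e. Divergence-free basis functions

It is possible in principle (and in practice), though apparently not very popular, to construct a space of approximating functions such that each member is solenoidal. The reason that it is seldom used in practice is that the construction is cumbersome (especially in 3D), and the choice of elements is limited; at least that is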
the situation up to now. Although more detailed discussions will be presented in a later section, below we merely provide some motivation for the search for divergence-free basis functions/elements. Beginning with (3.12-34): suppose every vector test function, $\mathbf{v}$, were divergence-free; this would obviously simplify (3.12-34) to
$$\int \left\{\mathbf{v}\cdot\frac{D\mathbf{u}}{Dt} + \nu\left[(\nabla\mathbf{u})^T + \gamma\nabla\mathbf{u}\right]:\nabla\mathbf{v}\right\} = \int_\Gamma \mathbf{v}\cdot(\mathbf{d}^\gamma\cdot\mathbf{n} - P\mathbf{n}). \tag{3.12-58}$$
In addition, since $\mathbf{u}$ will be 'generated' via (expressed as a linear combination of) similar divergence-free functions, $\nabla\cdot\mathbf{u} = 0$ will be satisfied automatically, and (3.12-30) is not needed. In this case, the resulting scalar equation [for each $\mathbf{v}$, which now can not be expressed via $\mathbf{v} = \mathbf{e}_\alpha\phi^{(\alpha)}$ since its components are necessarily coupled] is really that; divergence-free basis functions lead to a scalar system of equations rather than a vector system, and the pressure is gone! The incentive should now be clear: rather than solve a system of equations (infinite at the moment, finite when approximated via the FEM) for $u$, another for $v$, another (in 3D) for $w$, and yet another for $P$, all of which are coupled, a divergence-free basis reduces this to a smaller single system of (coupled) equations (for the scalar amplitude coefficients in the expansion of the vector $\mathbf{u}$), seemingly a very substantial savings. We will return to this issue after we discretize the weak form in Section 3.13 (i.e., Section 3.13.7).

3.12.3 Pressure Poisson Equation Formulations

Another way to avoid the explicit use of $\nabla\cdot\mathbf{u} = 0$ is to replace it with the PPE. Here we develop the obvious weak form of the PPE that could be used to replace the weak form of the continuity equation (3.12-30), and suggest why the 'obvious' weak formulation may not be the best one. The problem to be addressed is that described by (3.10-1) and (3.10-3) through (3.10-13) in the PPE formulation.
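But the momentum equation has already been dealt with, so we focus now on the PPE and seek a good weak form, beginning in the obvious way; i.e., multiplying (3.10-3) by the 'generic' scalar test function $\phi$ and performing the usual steps, after abbreviating by defining $\mathbf{f} = Re^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}$, gives
$$\int \phi\nabla^2 P = \int \phi\,\nabla\cdot\mathbf{f} \;\Rightarrow\; \int_\Gamma \phi\frac{\partial P}{\partial n} - \int \nabla\phi\cdot\nabla P = \int_\Gamma \phi\,\mathbf{n}\cdot\mathbf{f} - \int \mathbf{f}\cdot\nabla\phi;$$
i.e.,
$$\int \nabla\phi\cdot\nabla P = \int \mathbf{f}\cdot\nabla\phi + \int_\Gamma \phi\left(\frac{\partial P}{\partial n} - \mathbf{n}\cdot\mathbf{f}\right),$$
and we pause to reflect upon the BC's, noting first that the normal component of the momentum equation on $\Gamma$ is $\partial P/\partial n = \mathbf{n}\cdot(\mathbf{f} - \partial\mathbf{u}/\partial t)$, an equation that supplies the Neumann BC for the PPE when $\mathbf{n}\cdot\mathbf{u}$ is specified (as $\mathbf{n}\cdot\mathbf{w}$) on $\Gamma$, which is here the case on $\Gamma_D$ and $\Gamma_n$. On the remainder of $\Gamma$ ($\Gamma_N$ and $\Gamma_\tau$), the pressure is related to the velocity via $P = Re^{-1}\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - \mathbf{n}\cdot\mathbf{F} = P_N$ on $\Gamma_N$, and $P = Re^{-1}\,\mathbf{n}\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T]\cdot\mathbf{n} - F_n = P_\tau$ on $\Gamma_\tau$; i.e., via Dirichlet BC's on the pressure. So, a weak form of the PPE is the following: Find $P \in H^1_{PE}$ such that
$$\int \nabla\phi\cdot\nabla P = \int \mathbf{f}\cdot\nabla\phi - \int_\Gamma \phi\,\mathbf{n}\cdot\frac{\partial\mathbf{w}}{\partial t} \quad \forall\,\phi \in H^1_{P0}, \tag{3.12-59}$$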
where $H^1_{PE}$ is that set of piecewise once-differentiable $H^1$-functions in $\Omega$ that take on the value $P_N$ on $\Gamma_N$ and the value $P_\tau$ on $\Gamma_\tau$, and $H^1_{P0}$ is that subset that vanishes on $\Gamma_N$ and $\Gamma_\tau$.

Remarks:
(1) Since $\mathbf{f} = Re^{-1}\,\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] + \mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}$, it is clear that this form requires the existence of second spatial derivatives of the velocity field; i.e., it seems to demand that $\mathbf{u} \in H^2$ rather than the usual $H^1$, which is at least consistent with our now requiring $P \in H^1$ rather than the larger space ($L^2$) associated with the $\mathbf{u}$-$P$ formulation.
(2) The (weak) form of $\nabla\cdot\mathbf{u} = 0$ that is implied by this PPE method is not obvious; perhaps it would be enough just to know that there is one, but we would rest more easily if we could identify it. We believe and assert that there is none, but this alone certainly does not preclude potential utility of the method.
(3) Satisfaction of the essential BC's will be more difficult than usual.

In response to Remark (1), consider the following: assume $\nabla\cdot\mathbf{u} = 0$ to obtain $\nabla\cdot[\nabla\mathbf{u} + (\nabla\mathbf{u})^T] = \nabla^2\mathbf{u} + \nabla(\nabla\cdot\mathbf{u}) = \nabla^2\mathbf{u}$ and use $\nabla\cdot\nabla^2\mathbf{u} = \nabla\cdot[\nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}] = 0$ and
$$\int \nabla\phi\cdot\nabla^2\mathbf{u} = \int_\Gamma \phi\,\mathbf{n}\cdot\nabla^2\mathbf{u} - \int \phi\,\nabla\cdot(\nabla^2\mathbf{u}) = \int_\Gamma \phi\,\mathbf{n}\cdot\nabla^2\mathbf{u},$$
so that another weak form is: Find $P \in H^1_{PE}$ such that
$$\int \nabla\phi\cdot\nabla P = \int \nabla\phi\cdot(\mathbf{g} - \mathbf{u}\cdot\nabla\mathbf{u}) + \int_\Gamma \phi\,\mathbf{n}\cdot\left(Re^{-1}\,\nabla^2\mathbf{u} - \frac{\partial\mathbf{w}}{\partial t}\right) \quad \forall\,\phi \in H^1_{P0}, \tag{3.12-60}$$
and one may well ask what has been gained since we still need to evaluate $\nabla^2\mathbf{u}$, or at least its normal component, and on the boundary yet! What may have been gained is a decent approximation that might work for large Reynolds number: simply neglect the boundary term $Re^{-1}\,\mathbf{n}\cdot\nabla^2\mathbf{u}$ for $Re \gg 1$. That this might work is related to the fact that $\partial P/\partial n = 0$ on solid, stationary no-slip walls (with no body force) is a good approximation at large $Re$ (indeed, it is one of the cornerstones of boundary layer theory).
This result also seems to imply that the neglect of the viscous terms on the RHS of (3.12-59) may also be valid for $Re \gg 1$, with the introduction of (3.12-60) merely serving as a method of justification. We shall later (Section 3.13.4) return to these issues, and raise new ones, after we discretize the weak form in Section 3.13. For now we merely state that the above weak formulations of the PPE are probably not the best ones.

3.12.4 The Stream Function-Vorticity Formulation

Although we will not discuss the finite element implementation of the $\psi$-$\omega$ formulation, we will discuss proper (and improper) weak formulations, partly for completeness and partly to show why we remain primitive variable advocates. Similarly, and for similar reasons, we do not discuss $\mathbf{u}$-$\boldsymbol{\omega}$ methods. See Gresho (1992) for some discussion of these, and for many references.
Suppose we wish to solve (3.6-5) and (3.6-6), using (3.6-4), in a weak form, for the situation wherein $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$ and $\Gamma - \Gamma_D = \Gamma_N$ comprises an outflow boundary, with OBC TBD. Also, we are either given an initial divergence-free velocity [from which (3.6-4) can be used to compute the initial stream function and then (3.6-6) gives the initial vorticity], or we are given an initial vorticity, $\omega_0$, from which (3.6-6) gives the initial stream function and then (3.6-4) gives the initial velocity. First we show how early investigators fell into the 'weak formulation trap,' à la early FDM investigators; namely, generate the 'usual' weak forms of (3.6-5) and (3.6-6), using $\theta$ as the generic test function for the former and $\phi$ for the latter; i.e.,
$$\int \theta\left(\frac{\partial\omega}{\partial t} + \mathbf{u}\cdot\nabla\omega - \nu\nabla^2\omega\right) = 0 \quad\text{and}\quad \int \phi(\nabla^2\psi + \omega) = 0,$$
which led to, in the usual way,
$$\int \left[\theta\left(\frac{\partial\omega}{\partial t} + \mathbf{u}\cdot\nabla\omega\right) + \nu\nabla\theta\cdot\nabla\omega\right] = \int_\Gamma \theta\,\nu\frac{\partial\omega}{\partial n} \tag{3.12-61}$$
and
$$\int \nabla\phi\cdot\nabla\psi = \int \phi\,\omega + \int_\Gamma \phi\frac{\partial\psi}{\partial n}. \tag{3.12-62}$$
The $\partial\psi/\partial n$ term is okay because $\partial\psi/\partial n = u_\tau = \boldsymbol{\tau}\cdot\mathbf{w}$ and $u_\tau$ is known (except at the exit). The problem is the viscous flux of vorticity, $\nu\,\partial\omega/\partial n$: it is not known on any part of $\Gamma$ (except perhaps at outflow where it is usually assumed/taken to be zero; i.e., 'fully developed'). Thus, this weak form is not only useless, it has also been misused; for example, by assuming that a specified (essential) BC for $\omega$ on $\Gamma_D$ could be obtained via the computation $\omega_\Gamma = -\nabla^2\psi|_\Gamma$. See Stevens (1982), in which these ideas were implemented and compared with the proper weak formulation (detailed below), which Stevens also implemented. He showed quite conclusively that the fully coupled method, with both BC's on $\psi$, is the thing to do. The proper weak formulation was discovered by Campion-Renson and Crochet (1978), and, it seems, simultaneously by Barrett (1978), via variational methods applied to the steady Stokes equations.
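It begins with the BC's $\psi = g$ and $\partial\psi/\partial n = u_\tau$ on $\Gamma_D$ and, for OBC's, we choose $\partial\psi/\partial n = a$ and $\nu\,\partial\omega/\partial n = b$ on $\Gamma_N$, where we will soon set both $a$ and $b$ to zero. The weak form is then (see also Thomasset, 1981, and Gresho, 1992): Find $\psi \in H^1_E$ and $\omega \in H^1$ from
$$\int \left[\theta\left(\frac{\partial\omega}{\partial t} + \mathbf{u}\cdot\nabla\omega\right) + \nu\nabla\theta\cdot\nabla\omega\right] = \int_{\Gamma_N} \theta\,b \quad \forall\,\theta \in H^1_0, \tag{3.12-63}$$
and
$$\int \nabla\phi\cdot\nabla\psi = \int \phi\,\omega + \int_{\Gamma_D} \phi\,u_\tau + \int_{\Gamma_N} \phi\,a \quad \forall\,\phi \in H^1, \tag{3.12-64}$$
where $H^1$ contains once-piecewise differentiable functions on $\Omega$, $H^1_E$ is that subset that takes the value $g$ on $\Gamma_D$, and $H^1_0$ is that subset that vanishes on $\Gamma_D$.

Remarks:
(1) The vorticity is computed everywhere, in $\Omega$ and on $\Gamma$, because the test functions, $\phi$, do not vanish on $\Gamma$. (The no-slip BC, $\partial\psi/\partial n = \boldsymbol{\tau}\cdot\mathbf{w}$ on $\Gamma_D$, is implicitly/automatically realized because the proper value of $\omega$ on $\Gamma_D$ is computed.)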
(2) $a = b = 0$ is the usual choice for an OBC.
(3) The $\psi$-$\omega$ system is tightly coupled, in proper analogy to $\mathbf{u}$-$P$ formulations, wherein $\mathbf{u}$ and $P$ are tightly coupled; this is another consequence of $\nabla\cdot\mathbf{u} = 0$ in $\Omega$.
(4) Assuming $\theta$ and $\omega$ are smooth enough, the weak form may be reversed via
$$\int \nabla\theta\cdot\nabla\omega = \int \nabla\cdot(\theta\nabla\omega) - \int \theta\nabla^2\omega = \int_{\Gamma_N} \theta\frac{\partial\omega}{\partial n} - \int \theta\nabla^2\omega$$
since $\theta = 0$ on $\Gamma_D$; i.e., (3.12-63) yields $\int \theta(D\omega/Dt - \nu\nabla^2\omega) = \int_{\Gamma_N} \theta(b - \nu\,\partial\omega/\partial n)$ $\Rightarrow$ $D\omega/Dt = \nu\nabla^2\omega$ in $\Omega$ and $\nu\,\partial\omega/\partial n = b$ on $\Gamma_N$. Next,
$$\int \nabla\phi\cdot\nabla\psi = \int \nabla\cdot(\phi\nabla\psi) - \int \phi\nabla^2\psi = \int_\Gamma \phi\frac{\partial\psi}{\partial n} - \int \phi\nabla^2\psi,$$
so that (3.12-64) yields $\int \phi(\nabla^2\psi + \omega) = \int_{\Gamma_D} \phi(\partial\psi/\partial n - u_\tau) + \int_{\Gamma_N} \phi(\partial\psi/\partial n - a)$, which $\Rightarrow$ $\nabla^2\psi + \omega = 0$ in $\Omega$, $\partial\psi/\partial n = u_\tau$ on $\Gamma_D$, and $\partial\psi/\partial n = a$ on $\Gamma_N$.
(5) Multiply-connected domains cause additional difficulties. See the above references for details.
(6) For $\Gamma_N = \emptyset$, the above formulation satisfies Quartapelle's projection theorem (Quartapelle and Valz-Gris, 1981; and Quartapelle, 1981): 'A function $\omega$ is such that $\omega = -\nabla^2\psi$ in $\Omega$, where $\psi = g$ and $\partial\psi/\partial n = u_\tau$ on $\Gamma$ if, and only if, $\int \omega\phi = \int_\Gamma (g\,\partial\phi/\partial n - \phi\,u_\tau)$ for every $\phi$ satisfying $\nabla^2\phi = 0$.' When the requirements of this theorem are satisfied, then the problem: $\nabla^2\psi + \omega = 0$ and $\nabla^2\omega = 0$ in $\Omega$ with either $\psi = g$ or $\partial\psi/\partial n = u_\tau$ on $\Gamma$ has a unique solution that also solves the implied biharmonic problem: $\nabla^4\psi = 0$ in $\Omega$, $\psi = g$ and $\partial\psi/\partial n = u_\tau$ on $\Gamma$.

Enough on $\psi$-$\omega$. Now we must get down to brass tacks and discuss finite element methods for solving the $\mathbf{u}$-$P$ and PPE weak formulations, after one small digression.

3.12.5 Some Ill-posed Formulations

Not all formulations lead to well-posed problems. In this section we discuss some formulations that are mathematically ill-posed but which have also been 'coded up' and used, seemingly with success (!), mostly in the finite difference literature, but occasionally in the finite element literature.
The ill-posedness shows up only in those situations in which an NBC is 'active' (notably as OBC's, as mentioned already in Section 3.8.1) and is most easily presented/analyzed in the simple but relevant case of steady Stokes flow, so that is what we shall do. Specifically, we consider three formulations, differing only in their treatment of the NBC/OBC, and the proofs we present (for the ill-posed cases) are based on those supplied to us by V. Girault, whom we gratefully acknowledge. Thus, consider the following three problems, all of which strive to find $\mathbf{u}$ and $P$ from
$$-\nabla^2\mathbf{u} + \nabla P = \mathbf{f} \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega \tag{3.12-65}$$
with
$$\mathbf{u} = \mathbf{w} \quad\text{on } \Gamma_D, \tag{3.12-66}$$
where $\mathbf{f}$ and $\mathbf{w}$ are data and unit viscosity has been assumed for simplicity, with no loss of generality. They differ only in their treatment of BC's on $\Gamma_N = \Gamma - \Gamma_D$, the
'outflow' (or open) portion of $\Gamma$, as follows:
$$\text{(i)}\quad \partial\mathbf{u}/\partial n - \mathbf{n}P = \mathbf{0} \tag{3.12-67}$$
$$\text{(ii)}\quad \partial\mathbf{u}/\partial n = \mathbf{0} \tag{3.12-68}$$
$$\text{(iii)}\quad \partial\mathbf{u}/\partial n = \mathbf{0} \quad\text{and}\quad P = 0. \tag{3.12-69}$$

Remarks:
(1) It will suffice to consider only homogeneous BC's; generalization to the inhomogeneous case is immediate.
(2) It is also sufficient to consider just the case in which the NBC is simultaneously applied to both normal and tangential momentum equations, as above. Generalization is again straightforward.

We state now and prove below (but not in great detail) the following result: only the first formulation is well-posed.

Problem 1. Actually, we will not delve deeply into the well-posedness issue for this problem (existence, uniqueness, continuous dependence on the data), since the theory is deep and presented well elsewhere (see, for example, Ladyzhenskaya, 1969; and Girault and Raviart, 1986). We will merely state that Problem 1 is well-posed and show its weak form, which is easily obtained from (3.12-34) after adding a body force term and setting $\gamma = 0$:
$$\int \nabla\mathbf{v}:(\nabla\mathbf{u})^T - \int P\,\nabla\cdot\mathbf{v} = \int \mathbf{v}\cdot\mathbf{f} + \int_{\Gamma_N} \mathbf{v}\cdot(\partial\mathbf{u}/\partial n - \mathbf{n}P) \quad \forall\,\mathbf{v} \in H^1_0, \tag{3.12-70}$$
which is to be solved, along with $\int \psi\,\nabla\cdot\mathbf{u} = 0$, $\forall\,\psi \in L^2$, after dropping the boundary integral term, which of course forces the solution to satisfy the homogeneous NBC of (3.12-67).

Problem 2. The key thing to show here is that if a solution exists, it is not unique, thus proving ill-posedness. To try to do this in the 'standard way,' we form the $L^2$-inner product of the momentum equation with $\mathbf{u}$ and similarly for the continuity equation with $P$, where now $\mathbf{u} = \mathbf{u}_1 - \mathbf{u}_2$ and $P = P_1 - P_2$, the difference of two alleged solutions of $-\nabla^2\mathbf{u} + \nabla P = \mathbf{0}$, $\nabla\cdot\mathbf{u} = 0$ in $\Omega$ with $\mathbf{u} = \mathbf{0}$ on $\Gamma_D$, and $\partial\mathbf{u}/\partial n = \mathbf{0}$ on $\Gamma_N$. Thus,
$$-\int \mathbf{u}\cdot\nabla^2\mathbf{u} + \int \mathbf{u}\cdot\nabla P = 0 \quad\text{and}\quad \int P\,\nabla\cdot\mathbf{u} = 0,$$
and we integrate by parts the momentum equation to obtain
$$-\int \nabla\cdot[\mathbf{u}\cdot(\nabla\mathbf{u})^T] + \int \nabla\mathbf{u}:(\nabla\mathbf{u})^T + \int \nabla\cdot(\mathbf{u}P) - \int P\,\nabla\cdot\mathbf{u} = 0,$$
or
$$\int \nabla\mathbf{u}:(\nabla\mathbf{u})^T + \int_\Gamma (P\,\mathbf{n}\cdot\mathbf{u} - \mathbf{u}\cdot\partial\mathbf{u}/\partial n) = 0.$$
Now $\mathbf{u} = \mathbf{0}$ and $\mathbf{n}\cdot\mathbf{u} = 0$ on $\Gamma_D$, and $\partial\mathbf{u}/\partial n = \mathbf{0}$ on $\Gamma_N$, so we are left with
$$\int \nabla\mathbf{u}:(\nabla\mathbf{u})^T + \int_{\Gamma_N} P\,\mathbf{n}\cdot\mathbf{u} = 0. \tag{3.12-71}$$
But since we know nothing about either $P$ or $\mathbf{n}\cdot\mathbf{u}$ on $\Gamma_N$, we can not prove uniqueness in the 'usual way.' In contrast, the usual way applied to Problem 1 gives, instead of (3.12-71),
$$\int \nabla\mathbf{u}:(\nabla\mathbf{u})^T + \int_{\Gamma_N} (\mathbf{n}P - \partial\mathbf{u}/\partial n)\cdot\mathbf{u} = 0, \tag{3.12-72}$$
which is obtained in just the same way. But $P = \mathbf{n}\cdot\partial\mathbf{u}/\partial n$ on $\Gamma_N$, and we get that $\int \nabla\mathbf{u}:(\nabla\mathbf{u})^T = 0$, which can only be satisfied by $\mathbf{u} = \mathbf{0}$, thus proving uniqueness (of velocity) in Problem 1. [Uniqueness of pressure follows from $\mathbf{u} = \mathbf{0} \Rightarrow \nabla P = \mathbf{0} \Rightarrow P = C$, but $C = 0$ because $P = 0$ on $\Gamma_N$ from (3.12-67).]

Indeed, we show in Figure 3.12-1 an example of this non-uniqueness, also from V. Girault, and one whose non-zero result does indeed satisfy (3.12-71); with $\mathbf{f} = \mathbf{0}$ and $\mathbf{w} = \mathbf{0}$, the desired Stokes solution is $\mathbf{u} = \mathbf{0}$, $P = 0$, which is realized by Problem 1. But Problem 2 has, in addition to this, the solutions
$$u = a(y^2 - 1), \quad v = 0, \quad P = 2ax + b \tag{3.12-73}$$
for arbitrary values of $a$ and $b$: we find a single infinity of velocity solutions and a double infinity of pressures! Here $\nabla\mathbf{u}:(\nabla\mathbf{u})^T = u_x^2 + u_y^2 + v_x^2 + v_y^2 = u_y^2 = (2ay)^2$, so that $\int \nabla\mathbf{u}:(\nabla\mathbf{u})^T = 4a^2/3$ and
$$\int_{\Gamma_N} P\,\mathbf{n}\cdot\mathbf{u} = \int_0^1 (Pu|_{x=1} - Pu|_{x=0})\,dy = \int_0^1 [(2a + b) - b]\,a(y^2 - 1)\,dy = -4a^2/3,$$
thus properly satisfying (3.12-71). [The solution of this same problem with, in addition, $P = 0$ at just one point, as mentioned in (3.8-31), is as above except $P = 2a(x - x_0)$; the double infinity is reduced to a single one.]

Although this single and simple counterexample is sufficient to show that Problem 2 is ill-posed by virtue of non-uniqueness, we present a second: consider 2D steady flow in a channel of height $H$ and length $L$ under the BC's $P - \partial u/\partial x = 3$ and $v = 0$ at $x = 0$, $\partial u/\partial x = 0$ and $v = 0$ at $x = L$ (the outlet), and $u = v = 0$ at $y = 0$ and $y = H$. This problem has (at least) two solutions and is thus ill-posed. They are:
1. $u = v = 0$, $P = 3$;
2. $u = \dfrac{3H^2}{2L}\cdot\dfrac{y}{H}\cdot\left(1 - \dfrac{y}{H}\right)$, $v = 0$, and $P = 3(1 - x/L)$, which is Poiseuille flow, and which also satisfies (3.12-71).

Fig. 3.12-1 A simple domain for the Stokes equations. [Figure labels recoverable from the original: $y$ axis, origin 0, corner point (1,1), domain $\Omega$, boundary segments $\Gamma_D$ and $\Gamma_N$.]
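Both counterexamples above are simple enough to be checked directly. The sketch below (plain Python, with hypothetical helper names `girault_check` and `poiseuille_check`) evaluates the $x$-momentum residual of $-\nabla^2\mathbf{u} + \nabla P = \mathbf{0}$ by finite differences at one interior point and the two integrals in (3.12-71) by Simpson's rule. It assumes, for the first example, that the $\Gamma_N$ contribution comes only from $x = 0$ and $x = 1$ (where $\mathbf{n}\cdot\mathbf{u} = \mp u$), and, for the second, that unit viscosity and the inlet constant 3 are as stated in the text.

```python
def simpson(f, lo, hi, n=20):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def first_diff(f, x, h=1e-2):
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_diff(f, x, h=1e-2):
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

def girault_check(a, b):
    """Example (3.12-73) on the unit square: u = a(y^2 - 1), v = 0, P = 2ax + b."""
    u = lambda y: a * (y * y - 1.0)
    p = lambda x: 2.0 * a * x + b
    # x-momentum residual -u_yy + P_x (u does not depend on x, v is zero)
    residual = -second_diff(u, 0.3) + first_diff(p, 0.7)
    # volume term of (3.12-71): only u_y = 2ay is non-zero; x-integration gives 1
    volume = simpson(lambda y: (2.0 * a * y) ** 2, 0.0, 1.0)
    # boundary term: P u at x = 1 minus P u at x = 0
    boundary = simpson(lambda y: (p(1.0) - p(0.0)) * u(y), 0.0, 1.0)
    return residual, volume, boundary

def poiseuille_check(H, L, p_in=3.0):
    """Second example: u = (p_in H^2 / 2L)(y/H)(1 - y/H), P = p_in (1 - x/L)."""
    u = lambda y: p_in * H * H / (2.0 * L) * (y / H) * (1.0 - y / H)
    du_dy = lambda y: p_in * H / (2.0 * L) * (1.0 - 2.0 * y / H)
    p = lambda x: p_in * (1.0 - x / L)
    residual = -second_diff(u, 0.3 * H) + first_diff(p, 0.7 * L)
    volume = L * simpson(lambda y: du_dy(y) ** 2, 0.0, H)
    boundary = simpson(lambda y: (p(L) - p(0.0)) * u(y), 0.0, H)
    return residual, volume, boundary
```

In each case the residual should vanish and the volume and boundary terms should cancel, so a non-trivial field satisfies the homogeneous identity (3.12-71); this is the non-uniqueness that the text describes.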
If (3.12-68) was also applied at the inlet of the Poiseuille flow channel, an even worse redundancy obtains: $u = aH^2(1 - y/H)(y/H)/2\mu L$, $v = 0$, and $P = b - ax/L$ for all $a$ and $b$.

Problem 3. We present two proofs that this problem is ill-posed, the first from D. Arnold and the second from V. Girault (personal communications):
1. Since Problem 1 is well-posed and satisfies $\partial\mathbf{u}/\partial n - \mathbf{n}P = \mathbf{0}$ on $\Gamma_N$, with in general $P \neq 0$ on $\Gamma_N$, it follows that Problem 3, which sets both $\partial\mathbf{u}/\partial n = \mathbf{0}$ and $P = 0$ on $\Gamma_N$, is ill-posed via overspecification.
2. Assume that Problem 3 has a solution. It is straightforward, as for Problem 1, to show that it too satisfies (3.12-70) with, again, the boundary integral omitted because here $P = 0$ and $\partial\mathbf{u}/\partial n = \mathbf{0}$ on $\Gamma_N$. But we already have that the solution to (3.12-70) satisfies (3.12-67), whose unique solution does not generally give $\partial\mathbf{u}/\partial n = \mathbf{0}$ and $P = 0$. Thus, (3.12-69) is overspecified, and Problem 3 is thus ill-posed. If Problem 1 is well-posed (and it is), then Problem 3 cannot be (and it is not).

While we do not profess to understand the many and varied 'schemes' used by many finite difference (and finite volume) code writers, we assert that they are often solving problems with OBC's that in the continuum are ill-posed (typically $\partial\mathbf{u}/\partial n = \mathbf{0}$). We called these 'fuzzy' BC's in Sani and Gresho (1994) and implored/challenged the CFD numerical analysts to try to explain them. (A partial response to this challenge, for the AD equation, has recently been provided by Griffiths, 1997, and Renardy, 1997.)

Later, we will again address the ill-posedness of Problem 2 when we discuss the pressure Poisson equation (PPE) for time-dependent flow, but we state the bottom line here: the PPE cannot be solved (uniquely, at least) because it has no BC on $\Gamma_N$. Also later [(3.13-33)], we will show ill-posedness for the discrete Stokes equations, again for Problem 2.
A final comment on the time-dependent version of (3.12-68), in the form $\partial u_n/\partial t + V\,\partial u_n/\partial n = 0$, which is (3.8-32): integration over $\Gamma$ and application of the constraint $\int_\Gamma \mathbf{n}\cdot\mathbf{u} = 0$, and thus $\int_\Gamma \mathbf{n}\cdot\partial\mathbf{u}/\partial t = 0$, yields a constraint on the (up-until-now seemingly arbitrary) advecting velocity $V$: $\int_{\Gamma_{OBC}} V(\partial u_n/\partial n) = \int_{\Gamma_D} \mathbf{n}\cdot\partial\mathbf{w}/\partial t$, where here $\Gamma = \Gamma_{OBC} + \Gamma_D$, and $\mathbf{w}$ is the specified velocity on $\Gamma_D$. Clearly this would be difficult to satisfy in the general case, thus providing another reason not to try it. [See, however, the end of Section 3.8.1 for a generalization that could be made to work by reintroducing the pressure; i.e., (3.8-34). See, too, Lee and Leone (1988), who managed to make it work (without the pressure!) in an application involving mountain lee waves.]

3.13 THE FINITE ELEMENT EQUATIONS/DISCRETIZATION OF THE WEAK FORM

Henceforth, we will focus almost exclusively on Galerkin's method for generating the FEM equations corresponding to either the $\mathbf{u}$-$P$ or the PPE formulations and, while each set of weak forms discussed above leads to a different set of GFEM equations,
we shall focus on, and develop in some detail, only one, albeit one that is sufficiently general and useful. Later, we will get even more specific and present some detailed nodal equations, including NBC's; this time (virtually) only for one particular element, the paradoxical $Q_1P_0 = Q_1Q_0$, which element is simultaneously very popular in practice and very unpopular in theory (even though lots of theory, much more than its 'fair share,' has been devoted to it). And we shall attempt to explain the paradox. [The designation $Q_mP_n$ denotes the following polynomial approximation over a quadrilateral (2D) or hexahedral (3D) element: the velocity is approximated by $m$-th degree polynomials in each direction and the pressure is approximated by an $n$-th degree polynomial; details later.]

3.13.1 Detailed derivation of one u-P formulation

a. Continuum formulation

We begin by restating a particular IBVP in the weak form [(3.12-27) through (3.12-30) with the body 'force,' à la (3.10-1), re-introduced], which we shall use to launch our GFEM adventure: Find $u_\alpha \in H^1_{\alpha E}$ and $P \in L^2$ such that
$$\int \left[\phi^{(\alpha)}\frac{Du_\alpha}{Dt} + \nu\nabla\phi^{(\alpha)}\cdot\nabla u_\alpha + \gamma\nu\frac{\partial\phi^{(\alpha)}}{\partial x_\beta}\frac{\partial u_\beta}{\partial x_\alpha} - P\frac{\partial\phi^{(\alpha)}}{\partial x_\alpha}\right] = \int \phi^{(\alpha)}g_\alpha + \int_{\Gamma_{n\alpha}} \phi^{(\alpha)}F_\alpha, \quad \alpha = 1, 2, \ldots, n_s, \tag{3.13-1}$$
and
$$\int \psi\frac{\partial u_\beta}{\partial x_\beta} = 0, \tag{3.13-2}$$
$\forall\,\phi^{(\alpha)} \in H^1_{\alpha 0}$ and $\forall\,\psi \in L^2$, where $H^1_{\alpha E}$ are those once continuously differentiable functions in $\Omega$ that take the value $U_\alpha$ on $\Gamma_{e\alpha}$, subject to the IC's
$$\mathbf{u}(\mathbf{x}, 0) = \mathbf{u}^0(\mathbf{x}) \tag{3.13-3}$$
and the following constraints on the data:
$$\text{(i)}\quad \nabla\cdot\mathbf{u}^0 = 0 \quad\text{in } \Omega, \tag{3.13-4}$$
$$\text{(ii)}\quad n_\beta u^0_\beta = n_\beta U^0_\beta \quad\text{on } \Gamma(n), \tag{3.13-5}$$
where $\Gamma(n)$ is that portion of $\Gamma$ on which the normal component of velocity is specified, with the understanding that $U^0_\beta$ is taken to be zero on any portion of $\Gamma$ on which $U_\beta$ is not specified as an essential BC; cf. (3.12-22) through (3.12-24).
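Remarks:
(1) If $\Gamma_{n\alpha} = \emptyset$ $\forall\,\alpha$ [i.e., if $\Gamma(n) = \Gamma$], or if the normal component of velocity is specified everywhere, then the following additional constraint must also be satisfied for $t \geq 0$:
$$\int_\Gamma n_\beta U_\beta = \int_\Gamma \mathbf{n}\cdot\mathbf{U} = 0. \tag{3.13-6}$$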
(2) The NBC's associated with the specified values of $F_\alpha$ in (3.13-1) are
$$\nu n_\beta\left(\frac{\partial u_\alpha}{\partial x_\beta} + \gamma\frac{\partial u_\beta}{\partial x_\alpha}\right) - Pn_\alpha = F_\alpha \quad\text{on } \Gamma_{n\alpha}, \quad \alpha = 1, 2, \ldots, n_s. \tag{3.13-7}$$
(3) $\gamma$ plays its usual role, that of combining two weak forms within one set of equations.
(4) This generalizes (to 3D) the presentation in (3.10-1) through (3.10-13) and that in (3.12-12) through (3.12-24).
(5) The classical version of this weak form is, essentially: find $\mathbf{u}$ and $P$ in $\Omega$ for $t > 0$ from
$$\frac{D\mathbf{u}}{Dt} + \nabla P = \nu[\nabla^2\mathbf{u} + \gamma\nabla\cdot(\nabla\mathbf{u})^T] + \mathbf{g} \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega,$$
subject to the essential BC's given by (3.10-4) through (3.10-7), the NBC's of (3.13-7), the IC's of (3.13-3) through (3.13-5), and the constraint (when applicable) (3.13-6) where, of course, $\nabla\cdot(\nabla\mathbf{u})^T = \nabla(\nabla\cdot\mathbf{u}) = \mathbf{0}$.
(6) Noteworthy is the fact that, while we begin our search for a weak solution over all of $H^1_{\alpha E}$, we conclude it by finding a velocity field that is actually in a subset of $H^1_{\alpha E}$, i.e., $J_E$, the set of weakly solenoidal vector fields that satisfy the essential BC's, thanks to the constraint (3.13-2). See also Appendix 3.

b. GFEM equations

We now move on to the approximate solution of the above IBVP by seeking a solution in the appropriate finite dimensional subspaces of those spaces in which $(u_\alpha, P)$ above reside. Before beginning, we caution the reader (well, some readers) that there is some tough sledding ahead, mostly for the following reasons, some of which we paraphrase from Gunzburger (1989):
1. In all of our GFEM approximations associated with the scalar equations of Chapter 2, stability and convergence were in a sense 'automatic' once the finite dimensional spaces in which the GFEM solution was sought were established to be subspaces of the appropriate infinite dimensional spaces in which the weak continuum solution lay. (Well, at least for diffusion-dominated flow, since GFEM is not always as stable as one would like when $Pe \gg 1$.)
This is no longer true for NS owing to the $\nabla\cdot\mathbf{u} = 0$ constraint; it introduces a serious set of compatibility conditions/problems between the velocity space and the pressure space. (As if it did not already cause enough problems in the strong formulation!)
2. Thus, 'We find ourselves in the realm of what are known as mixed finite element methods' (Gunzburger, 1989); both velocity and pressure must be approximated.

Anyway, we now move on toward a finite element approximate solution of (3.13-1) through (3.13-7). Suppose $\Omega$ has been discretized ('tesselated,' 'triangulated') via a mesh of finite elements in which there are $N_\alpha + M_\alpha = N_t$ total velocity nodes, where $N_\alpha$ comprises those nodes in $\Omega$ and on $\Gamma_{n\alpha}$ and $M_\alpha$ are those nodes on $\Gamma_{e\alpha}$ (clearly $M_\alpha \ll N_\alpha$, at least in 2D), and there are $N_p$ total pressure nodes (in $\Omega$ and on $\Gamma$). Associated with each node in $N_\alpha$ is a velocity test function (in the $\alpha$-direction), $\phi_i^{(\alpha)}$, and with each node
in $N_p$ is a pressure test function, $\psi_i$; this is the finite dimensional equivalent of 'for every $\phi^{(\alpha)}$ contained in $H^1_{\alpha 0}$ and for every $\psi$ contained in $L^2$,' above. Also, as with the corresponding scalar transport equation in the previous chapter, we expand $u_\alpha$ via a linear combination of the same nodal functions (now called basis functions) over the $N_\alpha$ nodes where $u_\alpha$ needs to be determined, and we use the analogous $M_\alpha$ basis functions to interpolate the Dirichlet BC's on $u_\alpha$. The pressure is also approximated as a linear combination of the $N_p$ pressure test functions (i.e., basis functions, since we are using Galerkin's method in which the test and basis functions are members of the same family). Thus,
$$u_\alpha = \tilde{u}_\alpha + \sum_{j=1}^{N_\alpha} u_{\alpha j}\,\phi_j^{(\alpha)}, \tag{3.13-8}$$
where
$$\tilde{u}_\alpha = \sum_{j=1}^{M_\alpha} \tilde{u}_{\alpha j}\,\tilde{\phi}_j^{(\alpha)}, \tag{3.13-9}$$
and
$$P = \sum_{j=1}^{N_p} P_j\,\psi_j, \tag{3.13-10}$$
where $u_{\alpha j}$ is the nodal value of $u_\alpha$ at the $j$-th node (etc.), and $\tilde{u}_\alpha$ approximates (interpolates, typically, at least until we reach the important discussion in Section 3.13.1d) the specified value $U_\alpha$ on $\Gamma_{e\alpha}$, in which $\tilde{\phi}_j$ is used to denote a velocity 'basis' function on $\Gamma_{e\alpha}$, and we repeat the

Important convention: when the spatial index, $\alpha$ or $\beta$, appears in parentheses, as $(\alpha)$ or $(\beta)$, the summation convention is not in force (no summation).

Another notational shortcut is the omission of the superscript $h$, commonly used to emphasize the approximate solution; i.e., we 'should' use $u^h_\alpha$ instead of $u_\alpha$ in (3.13-8) and $P^h$ rather than $P$ in (3.13-10), but we shall simply let $h$ be implied. Note that, in contrast to the presentation in Chapter 2, cf. (2.2-1) through (2.2-3), we are numbering the velocity nodes with Dirichlet data separately from the others. Again, though, this is (probably, but not necessarily) more for expositional convenience than coding efficiency, although it does require the '~' on the velocity basis function, for clarity.
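To facilitate comprehension of the GFEM equations to follow, we note that the substantial derivative, $Du_\alpha/Dt$ in (3.13-1), is actually quite a bit more involved than it appears there, owing to (3.13-8) and (3.13-9). Thus,
$$\begin{aligned}
\frac{Du_\alpha}{Dt} &= \frac{\partial u_\alpha}{\partial t} + u_\beta\frac{\partial u_\alpha}{\partial x_\beta} = \frac{\partial}{\partial t}\left(\tilde{u}_\alpha + \sum_{j=1}^{N_\alpha} u_{\alpha j}\phi_j^{(\alpha)}\right) + \left(\tilde{u}_\beta + \sum_{k=1}^{N_\beta} u_{\beta k}\phi_k^{(\beta)}\right)\frac{\partial}{\partial x_\beta}\left(\tilde{u}_\alpha + \sum_{j=1}^{N_\alpha} u_{\alpha j}\phi_j^{(\alpha)}\right) \\
&= \frac{\partial\tilde{u}_\alpha}{\partial t} + \tilde{u}_\beta\frac{\partial\tilde{u}_\alpha}{\partial x_\beta} + \sum_{j=1}^{N_\alpha} \frac{du_{\alpha j}}{dt}\phi_j^{(\alpha)} + \tilde{u}_\beta\sum_{j=1}^{N_\alpha} u_{\alpha j}\frac{\partial\phi_j^{(\alpha)}}{\partial x_\beta} + \frac{\partial\tilde{u}_\alpha}{\partial x_\beta}\sum_{k=1}^{N_\beta} u_{\beta k}\phi_k^{(\beta)} + \sum_{k=1}^{N_\beta} u_{\beta k}\phi_k^{(\beta)}\sum_{j=1}^{N_\alpha} u_{\alpha j}\frac{\partial\phi_j^{(\alpha)}}{\partial x_\beta},
\end{aligned}$$
where $\tilde{u}_\alpha$, $\tilde{u}_\beta$ are given by (3.13-9), and we see that $Du_\alpha/Dt$ generates six parts: three linear parts, one non-linear part, and two known parts that will be sent to the RHS. To be absolutely sure that our summation convention is understood, or to further clarify it (since the only free index at the end of the day is $\alpha$), we expand (in 2D) the last term above (as an example):
$$\sum_{k=1}^{N_\beta} u_{\beta k}\phi_k^{(\beta)}\frac{\partial}{\partial x_\beta} = \sum_{k=1}^{N_1} u_{1k}\phi_k^{(1)}\frac{\partial}{\partial x_1} + \sum_{k=1}^{N_2} u_{2k}\phi_k^{(2)}\frac{\partial}{\partial x_2},$$
showing, in a sense, that there is indeed a 'summation' over $\beta$. Inserting (3.13-8) through (3.13-10) into the finite dimensional version of (3.13-1) and (3.13-2) gives
$$\begin{aligned}
&\sum_{j=1}^{N_\alpha}\left\{\left(\int \phi_i^{(\alpha)}\phi_j^{(\alpha)}\right)\frac{du_{\alpha j}}{dt} + \left[\int \phi_i^{(\alpha)}\left(\tilde{u}_\beta + \sum_{k=1}^{N_\beta} u_{\beta k}\phi_k^{(\beta)}\right)\frac{\partial\phi_j^{(\alpha)}}{\partial x_\beta} + \nu\int \nabla\phi_i^{(\alpha)}\cdot\nabla\phi_j^{(\alpha)}\right]u_{\alpha j}\right\} \\
&\quad + \sum_{j=1}^{N_\beta}\left[\int \phi_i^{(\alpha)}\phi_j^{(\beta)}\frac{\partial\tilde{u}_\alpha}{\partial x_\beta} + \gamma\nu\int \frac{\partial\phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial\phi_j^{(\beta)}}{\partial x_\alpha}\right]u_{\beta j} - \sum_{j=1}^{N_p}\left(\int \psi_j\frac{\partial\phi_i^{(\alpha)}}{\partial x_\alpha}\right)P_j \\
&= \int \phi_i^{(\alpha)}g_\alpha + \int_{\Gamma_{n\alpha}} \phi_i^{(\alpha)}F_\alpha - \int \phi_i^{(\alpha)}\left(\frac{\partial\tilde{u}_\alpha}{\partial t} + \tilde{u}_\beta\frac{\partial\tilde{u}_\alpha}{\partial x_\beta}\right) - \nu\int \nabla\phi_i^{(\alpha)}\cdot\nabla\tilde{u}_\alpha - \gamma\nu\int \frac{\partial\phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial\tilde{u}_\beta}{\partial x_\alpha}, \\
&\qquad \alpha = 1, 2, \ldots, n_s; \quad i = 1, 2, \ldots, N_\alpha,
\end{aligned} \tag{3.13-11}$$
and
$$\sum_{j=1}^{N_\beta}\left(\int \psi_i\frac{\partial\phi_j^{(\beta)}}{\partial x_\beta}\right)u_{\beta j} = -\int \psi_i\frac{\partial\tilde{u}_\beta}{\partial x_\beta}, \quad i = 1, 2, \ldots, N_p, \tag{3.13-12}$$
which have been written (as usual) so that the unknowns are on the LHS's and the RHS's represent given forcing terms and, to repeat for emphasis and to further define/clarify, summation over $\beta$ (in the sense defined above) but not over $\alpha$ is implied in the momentum equations (only), and $\tilde{u}_\alpha$ and $\tilde{u}_\beta$ are a shortcut notation for the expansions (interpolations) given in (3.13-9). In the mass conservation equation (3.13-12), 'conventional' summation over $\beta$ on the RHS is implied because $\beta$ appears twice with no parentheses.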
Remarks:

(1) The non-linear advection term causes an awkward (and often expensive) 'triple product' of basis functions, in the term $\int \phi_i^{(\alpha)}\phi_k^{(\beta)}\,\partial\phi_j^{(\alpha)}/\partial x_\beta$, as well as two linear parts via the coupling with the Dirichlet BC, $\bar{u}_\alpha$. Whereas the triple product coupling terms are pervasive, the two linear parts are not: most of the terms are zero because $\bar{u}_\alpha$ 'acts' only at nodes contiguous to $\Gamma_D$.

(2) $\gamma = 1$ causes an awkward viscous coupling between velocity components that also engenders additional computational expense (clearly $\gamma = 0$ generates less work).

(3) The RHS of the continuity equation (3.13-12) corresponds to given data from those values of velocity components that are specified on $\Gamma$.

(4) While perhaps the GFEM momentum equations appear to be overly 'complex,' they are actually quite compact when it is realized that they describe (almost) the entire approximate solution, albeit in the form of a large system of non-linear differential-algebraic equations (DAE's) that is not inexpensive to solve. [The part of the solution that they do not describe is the interpolation between node points; for this one uses (3.13-8) through (3.13-10).]

An alternate (but equivalent) representation that is also popular is based on the identity $a(b^T c) = (ab^T)c$, where $a$, $b$, and $c$ are $n$-vectors, $b^T c$ is the vector inner product, and $ab^T$ is the vector outer product ($b^T c$ is a scalar and $ab^T$ is an $n \times n$ matrix).
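Thus, replacing (3.13-8) through (3.13-10) with the equivalent versions

\[ u_\alpha = \bar{u}_\alpha + \boldsymbol{\phi}_{(\alpha)}^T \mathbf{u}_\alpha, \tag{3.13-13} \]
\[ \bar{u}_\alpha = \bar{\boldsymbol{\phi}}_{(\alpha)}^T \bar{\mathbf{u}}_\alpha, \tag{3.13-14} \]

and

\[ P = \boldsymbol{\psi}^T \mathbf{P}, \tag{3.13-15} \]

with the obvious identifications [e.g., $\boldsymbol{\phi}_{(\alpha)}$ is an $N_\alpha$-vector of basis functions for the $\alpha$-direction, $\mathbf{u}_\alpha$ is an $N_\alpha$-vector of nodal values of $u_\alpha$, etc.], leads to the following equivalent statement of the GFEM equations, in terms of vector products:

\[
\Big[\int \boldsymbol{\phi}_{(\alpha)}\boldsymbol{\phi}_{(\alpha)}^T\Big]\dot{\mathbf{u}}_\alpha
+ \Big[\int \boldsymbol{\phi}_{(\alpha)}\big(\bar{u}_\beta + \boldsymbol{\phi}_{(\beta)}^T\mathbf{u}_\beta\big)\frac{\partial \boldsymbol{\phi}_{(\alpha)}^T}{\partial x_\beta}
+ \nu\int \frac{\partial \boldsymbol{\phi}_{(\alpha)}}{\partial x_\beta}\frac{\partial \boldsymbol{\phi}_{(\alpha)}^T}{\partial x_\beta}\Big]\mathbf{u}_\alpha
\]
\[
+ \Big[\int \boldsymbol{\phi}_{(\alpha)}\boldsymbol{\phi}_{(\beta)}^T\frac{\partial \bar{u}_\alpha}{\partial x_\beta}
+ \gamma\nu\int \frac{\partial \boldsymbol{\phi}_{(\alpha)}}{\partial x_\beta}\frac{\partial \boldsymbol{\phi}_{(\beta)}^T}{\partial x_\alpha}\Big]\mathbf{u}_\beta
- \Big[\int \frac{\partial \boldsymbol{\phi}_{(\alpha)}}{\partial x_\alpha}\boldsymbol{\psi}^T\Big]\mathbf{P}
\]
\[
= \int \boldsymbol{\phi}_{(\alpha)} g_\alpha + \int_\Gamma \boldsymbol{\phi}_{(\alpha)} F_\alpha
- \int \boldsymbol{\phi}_{(\alpha)}\Big(\frac{\partial \bar{u}_\alpha}{\partial t} + \bar{u}_\beta\frac{\partial \bar{u}_\alpha}{\partial x_\beta}\Big)
- \nu\int\Big(\frac{\partial \boldsymbol{\phi}_{(\alpha)}}{\partial x_\beta}\frac{\partial \bar{u}_\alpha}{\partial x_\beta}
+ \gamma\frac{\partial \boldsymbol{\phi}_{(\alpha)}}{\partial x_\beta}\frac{\partial \bar{u}_\beta}{\partial x_\alpha}\Big),
\quad \alpha = 1, 2, \ldots, n_s, \tag{3.13-16}
\]

and

\[
-\Big[\int \boldsymbol{\psi}\frac{\partial \boldsymbol{\phi}_{(\beta)}^T}{\partial x_\beta}\Big]\mathbf{u}_\beta
= \int \boldsymbol{\psi}\frac{\partial \bar{u}_\beta}{\partial x_\beta}, \tag{3.13-17}
\]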
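where we have changed the sign of the continuity equation to recover the (skew) symmetry between div and grad ['grad $= -$div$^T$'; see, for example, Strang (1986)], a change that can (should) also be made in (3.13-12). As before, summation over $\beta$ but not over $\alpha$ is implied, and $\bar{u}_\alpha$, $\bar{u}_\beta$ are given by (3.13-14).

c. Matrix-vector representation

We can now introduce the global matrices from either (3.13-11) and (3.13-12) or (3.13-16) and (3.13-17); i.e., these equations can be written in yet another equivalent form that may be more amenable to interpretation as finite element equations vis-a-vis Galerkin equations, even though it requires that we specify the spatial dimensionality of the problem, $n_s$, which we take for now as 2:

\[
\begin{bmatrix} M_1 & 0 & 0 \\ 0 & M_2 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} \dot{u}_1 \\ \dot{u}_2 \\ \dot{P} \end{bmatrix}
+
\begin{bmatrix}
A_1 + B_{11} + N_1(u) + (1+\gamma)K_1 & B_{12} + \gamma K_{12} & C_1 \\
B_{21} + \gamma K_{21} & A_2 + B_{22} + N_2(u) + (1+\gamma)K_2 & C_2 \\
C_1^T & C_2^T & 0
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ P \end{bmatrix}
=
\begin{bmatrix} f_1 \\ f_2 \\ g \end{bmatrix},
\tag{3.13-18}
\]

where $u_1$ is an $N_1$-vector of nodal velocities in the $x_1$-direction (etc.), and the matrix definitions are:

\[ M_\alpha = \int \boldsymbol{\phi}_{(\alpha)}\boldsymbol{\phi}_{(\alpha)}^T \quad\text{or}\quad M_{\alpha ij} = \int \phi_i^{(\alpha)}\phi_j^{(\alpha)}, \tag{3.13-19} \]

\[ A_\alpha = \int \boldsymbol{\phi}_{(\alpha)}\Big(\bar{u}_1\frac{\partial \boldsymbol{\phi}_{(\alpha)}^T}{\partial x_1} + \bar{u}_2\frac{\partial \boldsymbol{\phi}_{(\alpha)}^T}{\partial x_2}\Big) \quad\text{or}\quad A_{\alpha ij} = \int \phi_i^{(\alpha)}\,\bar{u}_\beta\frac{\partial \phi_j^{(\alpha)}}{\partial x_\beta}, \tag{3.13-20} \]

\[ N_\alpha(u) = \int \boldsymbol{\phi}_{(\alpha)}\Big(\boldsymbol{\phi}_{(1)}^T u_1\frac{\partial \boldsymbol{\phi}_{(\alpha)}^T}{\partial x_1} + \boldsymbol{\phi}_{(2)}^T u_2\frac{\partial \boldsymbol{\phi}_{(\alpha)}^T}{\partial x_2}\Big) \quad\text{or}\quad N_{\alpha ij} = \sum_{k=1}^{N_1} u_{1k}\int \phi_i^{(\alpha)}\phi_k^{(1)}\frac{\partial \phi_j^{(\alpha)}}{\partial x_1} + \sum_{k=1}^{N_2} u_{2k}\int \phi_i^{(\alpha)}\phi_k^{(2)}\frac{\partial \phi_j^{(\alpha)}}{\partial x_2}, \tag{3.13-21} \]

\[ K_\alpha = \nu\int \frac{\partial \boldsymbol{\phi}_{(\alpha)}}{\partial x_\beta}\frac{\partial \boldsymbol{\phi}_{(\alpha)}^T}{\partial x_\beta} \quad\text{or}\quad K_{\alpha ij} = \nu\int \nabla\phi_i^{(\alpha)}\cdot\nabla\phi_j^{(\alpha)}, \tag{3.13-22} \]

\[ B_{\alpha\beta} = \int \boldsymbol{\phi}_{(\alpha)}\boldsymbol{\phi}_{(\beta)}^T\frac{\partial \bar{u}_\alpha}{\partial x_\beta} \quad\text{or}\quad B_{\alpha\beta ij} = \int \phi_i^{(\alpha)}\phi_j^{(\beta)}\frac{\partial \bar{u}_\alpha}{\partial x_\beta}, \tag{3.13-23} \]

\[ K_{\alpha\beta} = \nu\int \frac{\partial \boldsymbol{\phi}_{(\alpha)}}{\partial x_\beta}\frac{\partial \boldsymbol{\phi}_{(\beta)}^T}{\partial x_\alpha}, \ \text{with } (K_{\alpha\beta})^T = K_{\beta\alpha}, \quad\text{or}\quad K_{\alpha\beta ij} = \nu\int \frac{\partial \phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial \phi_j^{(\beta)}}{\partial x_\alpha}, \tag{3.13-24} \]

\[ C_\alpha = -\int \frac{\partial \boldsymbol{\phi}_{(\alpha)}}{\partial x_\alpha}\boldsymbol{\psi}^T \quad\text{or}\quad C_{\alpha ij} = -\int \frac{\partial \phi_i^{(\alpha)}}{\partial x_\alpha}\psi_j, \tag{3.13-25} \]

with dimensions as follows:

$M_\alpha$: $(N_\alpha \times N_\alpha)$,
$A_\alpha$: $(N_\alpha \times N_\alpha)$,
$K_\alpha$: $(N_\alpha \times N_\alpha)$,
$B_{\alpha\beta}$: $(N_\alpha \times N_\beta)$,
$K_{\alpha\beta}$: $(N_\alpha \times N_\beta)$,
$C_\alpha$: $(N_\alpha \times N_p)$.

The RHS $N_\alpha$-vector is

\[
f_\alpha = \int \boldsymbol{\phi}_{(\alpha)} g_\alpha + \int_\Gamma \boldsymbol{\phi}_{(\alpha)} F_\alpha
- \int \boldsymbol{\phi}_{(\alpha)}\Big(\frac{\partial \bar{u}_\alpha}{\partial t} + \bar{u}_\beta\frac{\partial \bar{u}_\alpha}{\partial x_\beta}\Big)
- \nu\int\Big(\frac{\partial \boldsymbol{\phi}_{(\alpha)}}{\partial x_\beta}\frac{\partial \bar{u}_\alpha}{\partial x_\beta} + \gamma\frac{\partial \boldsymbol{\phi}_{(\alpha)}}{\partial x_\beta}\frac{\partial \bar{u}_\beta}{\partial x_\alpha}\Big)
\]
or
\[
f_{\alpha i} = \int \phi_i^{(\alpha)} g_\alpha + \int_\Gamma \phi_i^{(\alpha)} F_\alpha
- \int \phi_i^{(\alpha)}\Big(\frac{\partial \bar{u}_\alpha}{\partial t} + \bar{u}_\beta\frac{\partial \bar{u}_\alpha}{\partial x_\beta}\Big)
- \nu\int\Big(\frac{\partial \phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial \bar{u}_\alpha}{\partial x_\beta} + \gamma\frac{\partial \phi_i^{(\alpha)}}{\partial x_\beta}\frac{\partial \bar{u}_\beta}{\partial x_\alpha}\Big),
\tag{3.13-26}
\]

and represents the forcing caused by, respectively: the body force term, the natural (traction) boundary conditions, and the essential (Dirichlet) BC's. Finally, the RHS $N_p$-vector is

\[ g = \int \boldsymbol{\psi}\frac{\partial \bar{u}_\beta}{\partial x_\beta} \quad\text{or}\quad g_i = \int \psi_i\frac{\partial \bar{u}_\beta}{\partial x_\beta}, \tag{3.13-27} \]

which, of course, should not be confused with the acceleration $N_\alpha$-vector, $g_\alpha$, in (3.13-26).

Remarks:

(1) The LHS of the continuity equations, $C_1^T u_1 + C_2^T u_2$, when examined at a pressure 'node' at or near $\Gamma_D$, will look somewhat strange (incomplete); this is because part of '$\nabla\cdot u = 0$' is on the RHS, in $g$, where it is very important to realize that the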
$N_p$-vector $g$ lives only on the boundary and is thus quite sparse. (Otherwise, we would be dealing with $\nabla\cdot u = S$, where $S$ represents mass sources/sinks in $\Omega$.)

(2) The coefficient matrix of $(\dot{u}_1^T\ \dot{u}_2^T\ \dot{P}^T)^T$ is singular; this is called the time-singular representation (the coefficient of $\dot{P}$ is zero; see, for example, Campbell, 1980) and emphasizes the point that we are dealing with DAE's and not simply ODE's. More on this later.

(3) In evaluating the boundary integrals in (3.13-26), it is sometimes better (but not necessary), when $\alpha$ corresponds to a normal direction on $\Gamma_N$, to expand $F_\alpha$ into the pressure basis functions rather than those for velocity (because $P$ usually plays a larger role than the normal viscous stress in the NBC force balance), unless $F_\alpha$ is given analytically and numerical quadrature is adopted.

(4) A simple, but not cost-effective, way to solve the Stokes equations is to take $\nu$ very, very large, say $10^6$ or $10^8$, in the viscous matrix; the advection terms, but not $\partial u/\partial t$ and $\nabla P$, will 'automatically' shrink in importance. [This presumes, of course, that all other 'characteristic quantities,' such as length scale and velocity scale, are $O(1)$.]

There is one more level of 'condensation' that will be of much use in the sequel; i.e., even though (3.13-18) through (3.13-27) carefully and fully define the DAE's that need to be solved, and are in the form that is appropriate for their construction via the element-level matrix and vector contributions, and for code writing, this representation is still too cumbersome for purposes of further discussion. Hence, we introduce (nearly) the most compact matrix-vector representation of the DAE's:

\[ M\dot{u} + [K + N(u)]u + CP = f, \tag{3.13-28} \]
\[ C^T u = g, \tag{3.13-29} \]

where the partitioned matrices are defined as follows:

\[
M = \begin{bmatrix} M_1 & 0 \\ 0 & M_2 \end{bmatrix}, \quad
K = \begin{bmatrix} A_1 + B_{11} + (1+\gamma)K_1 & B_{12} + \gamma K_{12} \\ B_{21} + \gamma K_{21} & A_2 + B_{22} + (1+\gamma)K_2 \end{bmatrix}, \quad
N(u) = \begin{bmatrix} N_1(u) & 0 \\ 0 & N_2(u) \end{bmatrix},
\]
\[
C = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}, \quad
u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \quad
f = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix},
\]

and $g$ is unchanged. $M$, $K$, and $N(u)$ are of dimension $(N_1 + N_2) \times (N_1 + N_2)$, and $C$ is of dimension $(N_1 + N_2) \times N_p$.
The vectors $u$, $f$, and $g$ are of length $(N_1 + N_2)$, $(N_1 + N_2)$, and $N_p$, respectively. [Hopefully, $N_1(u)$ for the advection matrix will not be confused with the number of unknown velocities in the $x_1$-direction, $N_1$; etc.]

The names sometimes associated with these matrices, while not always 'accurate,' are these: $M$ is the mass matrix, $K$ is the viscous or diffusion matrix (ignoring, 'conveniently,' the $B$ portion), $N(u)$ is the non-linear advection matrix (or just advection matrix), and $C$ is the coupling matrix (it couples $u$ and $P$), or the constraint matrix (it constrains $u$, via the Lagrange multiplier, $P$, to be discretely divergence-free) or, finally, the compromise matrix (a 'compromise' between div and grad). Actually, of course, $M^{-1}$ (or perhaps $M_L^{-1}$, the inverse of the lumped mass matrix) should first multiply each matrix and only then should it be 'named'; e.g., $M^{-1}C$ is the (weak) gradient operator, etc. Also noteworthy is that $C^T$ is the negative of the divergence operator, and it follows that $(\text{'grad'})^T = -\text{'div'}$; see Strang (1986).

Digression

A small digression related to $C$ and $C^T$ may be useful, beginning with an outline of their shapes, noting that typically $N_\alpha > N_p$, and always that $N_1 + N_2 > N_p$ (2D) and $N_1 + N_2 + N_3 > N_p$ (3D). Thus, schematically we have, with $n > m$, that $C$ is $n \times m$ (tall) and $C^T$ is $m \times n$ (wide). There are $n = N_1 + N_2$ or $N_1 + N_2 + N_3$ velocity equations and $m = N_p$ constraint equations among these $n$ velocities. We next note that, if $r$ is the rank of $C$ (and $C^T$), the dimensions of the respective null spaces are $\dim N(C) = m - r$ and $\dim N(C^T) = n - r$. If $C$ is of full rank, then $r = m$ (all constraint equations are independent), and we have that $\dim N(C) = 0$ and $\dim N(C^T) = n - m$. This is the common case (no so-called 'pressure modes,' spurious or otherwise, which we shall soon carefully explain): the divergence matrix has a large null space (the field of discretely divergence-free vectors), and the gradient matrix has no null space. Pressure modes, when present ($r < m$), increase the null space dimension of both $C$ and $C^T$; each pressure mode (an $m$-vector in the null space of $C$) reduces by one the number of linearly independent constraints and increases by one the number of divergence-free vectors.
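The dimension counts in this digression are easy to check numerically. The sketch below uses random stand-in matrices rather than an actual finite element $C$ (an assumption made purely for illustration): one of full column rank, and one modified so that the constant vector lies in its null space, mimicking a single pressure mode. The helper name `null_dims` is hypothetical.

```python
import numpy as np

def null_dims(C):
    """Return (dim N(C), dim N(C^T)) for an n x m matrix C of rank r."""
    n, m = C.shape
    r = np.linalg.matrix_rank(C)
    return m - r, n - r

rng = np.random.default_rng(0)
n, m = 8, 3   # n velocity unknowns, m pressure unknowns (illustrative sizes)

# Stand-in for a full-rank gradient matrix: no pressure modes.
C_full = rng.standard_normal((n, m))

# Subtracting each row's mean makes every row sum to zero, so that
# C_mode @ ones(m) = 0: the constant ("hydrostatic") vector is a pressure mode.
C_mode = C_full - C_full.mean(axis=1, keepdims=True)

dims_full = null_dims(C_full)   # expected (0, n - m)
dims_mode = null_dims(C_mode)   # expected (1, n - m + 1)
```

The second case shows the statement made above: one pressure mode removes one independent constraint and adds one discretely divergence-free vector.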
It is also noteworthy that, of the $n$ total momentum equations, the $m$ constraints among the velocities leave only $n - m$ 'effective' momentum equations. Finally, it is important to point out that $C$ is a 'grad' except at nodes on $\Gamma$ at which a natural BC is employed in the normal direction (which BC 'acts like' a Dirichlet BC for pressure; details later) and that $C^T$ is a 'div' except at nodes on $\Gamma$ at which a Dirichlet BC is employed for either velocity component. (Actually, because of the sign change, $C^T$ is a convergence matrix. Actually, $Q^{-1}C^T$, or perhaps $Q_L^{-1}C^T$, is the true convergence matrix, where $Q$ is the pressure mass matrix: $Q_{ij} = \int \psi_i\psi_j$. Actually, we, like others, will often maintain the sloppy-but-convenient terminology that calls $C^T$ a 'div.')

End digression

The DAE's in (3.13-28) and (3.13-29) can be 'solved' (integrated forward in time) only after appropriate (well-posed) IC's are stated; from (3.13-3) through (3.13-5), these are:

\[ u(0) = u_0 \quad\text{with}\quad C^T u_0 = g(0) \equiv g_0, \tag{3.13-30} \]

and, in addition, when $n\cdot u$ is specified on all of $\Gamma$, the constraint

\[ \sum_{i=1}^{N_p} g_i(t) = 0 \quad\text{for } t \ge 0, \tag{3.13-31} \]
the discrete analog of (3.13-6) that requires global mass conservation from the specified normal velocity. One way to prove this is to sum (3.13-12) over $i$, using (3.13-27), and to realize/utilize that $\sum_{i=1}^{N_p}\psi_i = 1$; thus

\[ \sum_{j=1}^{N_\beta} u_{\beta j}\int \frac{\partial \phi_j^{(\beta)}}{\partial x_\beta} = -\sum_{i=1}^{N_p} g_i. \]

But $\int \partial\phi_j^{(\beta)}/\partial x_\beta = \int_\Gamma n_\beta\,\phi_j^{(\beta)}$ from Green's theorem, so the LHS becomes $\sum_{j=1}^{N_\beta}\int_\Gamma n_\beta\,u_{\beta j}\,\phi_j^{(\beta)} = \int_\Gamma n\cdot u^h = 0$ because of (3.13-6).

Remarks:

(1) If (3.13-30), or (3.13-31) when applicable, is violated, then the DAE's are ill-posed and no solution exists. This is, of course, the discrete counterpart of (3.10-11) through (3.10-13).

(2) If the steady NS equations are being addressed, then $\dot{u}$ is set to zero in (3.13-28), and one is faced with solving a non-linear algebraic system for $u$ and $P$. In this case the only solvability constraint is (3.13-31), and that only when $n\cdot u$ is specified on all of $\Gamma$.

(3) Additional (extraneous and spurious) solvability constraints, for transient or steady flow, enter when certain 'elements' (combinations of $\{\phi_i\}$ and $\{\psi_i\}$) are employed, a situation that will be addressed in more detail below when we discuss 'pressure modes.'

(4) The solution of these DAE's will satisfy the BC's given in (3.12-28), which actually encompass those given in (3.10-4) through (3.10-7) and in Figure 3.8-2. The formulation of the GFEM DAE's for any other of the permissible BC's discussed in Section 3.8.1, some of which require the generation and use of different weak formulations, and in Section 3.12.2, should now be fairly straightforward.

(5) As already mentioned once, the matrix $C$ (or $M^{-1}C$) corresponds to 'grad' in $\Omega$ but not on all of $\Gamma$; only on $\Gamma_D$. It corresponds to a pressure force on $\Gamma_N$; the important details behind this remark will be presented later.

(6) Another way to derive (3.13-31) is simply to sum each of the $N_p$ discrete mass conservation equations of (3.13-29); all internal nodes will 'cancel out', with the result that the summations on the LHS will give zero.
(7) The vector g is always 'generated' by inhomogeneous Dirichlet BC's and, because we preclude 'volumetric' sources (sinks) of mass, it contains mostly zeros; for homogeneous Dirichlet BC's, CTu = 0, which describes contained flow within stationary boundaries. For most practical problems of interest, 'significant' non-zero values in g will be generated by Dirichlet BC's in the normal direction. For tangentially specified velocity, g will be either 'small' or zero—the latter for constant tangential velocity and a uniform mesh. More details on this 'boundary' vector will be presented in Section 3.13.51—and in the two examples presented in Section 3.13.2b. (8) The matrix-vector notation clearly implies a particular ordering and arrangement of the discrete equations—for 'talking purposes' only, not necessarily for code writing.
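Remarks (6) and (7) can be seen at work in a deliberately simple toy setting. The assumptions here are not from the text: one space dimension, piecewise-linear velocity on a uniform mesh, piecewise-constant pressure (one $\psi_i$ per element), and velocity specified at both end nodes. The names `D`, `Ct`, and `g_vector` are hypothetical.

```python
import numpy as np

n_el = 5   # number of elements = number of pressure unknowns in this toy

# D[i, j] = integral over element i of d(phi_j)/dx = phi_j(x_{i+1}) - phi_j(x_i)
D = np.zeros((n_el, n_el + 1))
for i in range(n_el):
    D[i, i], D[i, i + 1] = -1.0, 1.0

interior = np.arange(1, n_el)   # nodes carrying unknown velocities
Ct = -D[:, interior]            # sign-changed divergence, as in (3.13-17)

def g_vector(u_left, u_right):
    """RHS of the continuity equations, generated only by the Dirichlet data."""
    u_bar = np.zeros(n_el + 1)
    u_bar[0], u_bar[-1] = u_left, u_right
    return D @ u_bar

col_sums = Ct.sum(axis=0)       # summing all continuity equations: LHS cancels
g_ok = g_vector(1.0, 1.0)       # inflow equals outflow
g_bad = g_vector(1.0, 0.5)      # net mass imbalance in the boundary data

u_ok = np.linalg.lstsq(Ct, g_ok, rcond=None)[0]
```

In this toy, `g` is nonzero only in the two elements touching the boundary, the column sums of `Ct` vanish, and the constraint system has an exact solution only when the entries of `g` sum to zero.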
d. Ill-posed equations

First we point out that if the DAE's are ill-posed because the constraint in (3.13-30) is not satisfied, a nearby, discretely divergence-free velocity field can be obtained by performing the discrete version of the $L_2$-projection previously discussed for the continuum case, (3.10-14) through (3.10-20), as follows: (i) define $v = u_0 - M^{-1}C\lambda$, where the $N$-vector $v$ and the $M$-vector $\lambda$ are to be found (see too Appendix 3) such that (ii) $C^T v = g_0$. This projection is realized via the two-step procedure: (1) solve $(C^T M^{-1} C)\lambda = C^T u_0 - g_0$ for $\lambda$, and (2) compute $v = u_0 - M^{-1}C\lambda$, the adjusted and mass-consistent velocity that replaces $u_0$ as the IC. This 'mass adjustment'/projection is, of course, most conveniently performed (when 'legitimate') after lumping the mass ($M \to M_L$); otherwise, the fully coupled system given by (i) and (ii) needs to be solved, and only the consistent mass matrix gives a true $L_2$-projection.

If violation of (3.13-31) at $t = 0$ is the cause of the ill-posedness, which will often occur if the normal velocity is specified on all of $\Gamma$ and interpolation is employed as in (3.13-9), and will be demonstrated later, then the above technique will not work. In that case, it is the applied BC rather than the IC that is the cause of the ill-posedness. Glowinski (1984) shows one way to fix this problem, and we, at the end of this section (Section 3.13.1g), show another; both involve changing the BC to recover discrete, global mass conservation.

Next, we note that the (skew) symmetry between div and grad (i.e., $C$ in the momentum equation and $C^T$ in the continuity equation) is, of course, a consequence of integrating $\int \phi\nabla P$ by parts to generate the weak form. Suppose we do not do so? This question is of more than just academic interest because, recalling the discussion of OBC's in Sections 3.8.1 and 3.12.5, it has the potential of removing $P$ from the (normal) OBC, as we show next.
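Before turning to that question, the two-step mass adjustment described at the start of this subsection can be sketched in a few lines. The matrices below are random stand-ins, not assembled finite element matrices (an assumption for illustration): `M` is merely symmetric positive definite and `C` merely of full column rank. The function name `mass_adjust` is hypothetical.

```python
import numpy as np

def mass_adjust(M, C, u0, g0):
    """Project u0 onto the set {v : C^T v = g0}.

    Step (1): solve (C^T M^{-1} C) lam = C^T u0 - g0.
    Step (2): v = u0 - M^{-1} C lam.
    """
    Minv_C = np.linalg.solve(M, C)        # M^{-1} C without forming M^{-1}
    S = C.T @ Minv_C                      # the matrix C^T M^{-1} C
    lam = np.linalg.solve(S, C.T @ u0 - g0)
    v = u0 - Minv_C @ lam
    return v, lam

rng = np.random.default_rng(1)
n, m = 6, 2
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)               # stand-in for a consistent mass matrix
C = rng.standard_normal((n, m))           # stand-in for the gradient matrix
u0 = rng.standard_normal(n)               # initial velocity violating C^T u0 = g0
g0 = np.zeros(m)

v, lam = mass_adjust(M, C, u0, g0)
```

With a lumped (diagonal) mass matrix the first solve reduces to a row scaling of `C`, which is the convenience referred to above; the sketch keeps a full `M` so that both cases are covered by the same code.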
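To focus in on this issue as efficiently as possible, we consider the simplest relevant case: steady Stokes flow, and we take $\gamma = 0$ and omit the body force term. Thus, we return to (3.13-11) and set $\dot{u}_{\alpha j} = 0$, omit advection, and re-integrate the term $\int \psi_j(\partial\phi_i/\partial x_\alpha)$ by parts, to obtain

\[
\nu\sum_{j=1}^{N_\alpha} u_{\alpha j}\int \nabla\phi_i^{(\alpha)}\cdot\nabla\phi_j^{(\alpha)}
+ \sum_{j=1}^{N_p} P_j\int \phi_i^{(\alpha)}\frac{\partial \psi_j}{\partial x_\alpha}
= \int_\Gamma \phi_i^{(\alpha)}\tilde{F}_\alpha - \nu\int \nabla\phi_i^{(\alpha)}\cdot\nabla\bar{u}_\alpha;
\quad \alpha = 1, 2, \ldots, n_s; \quad i = 1, 2, \ldots, N_\alpha, \tag{3.13-33}
\]

and (3.13-12). The NBC, of course, is now different, as acknowledged by the tilde over $F_\alpha$; it is

\[ \tilde{F}_\alpha = \nu\,n_\beta\frac{\partial u_\alpha}{\partial x_\beta} = \nu\frac{\partial u_\alpha}{\partial n}, \tag{3.13-34} \]

and the hope for a better NBC for use as an OBC, $\partial u_\alpha/\partial n = 0$, is apparent. Before dashing this hope, let us write the matrix-vector form of this result:

\[ Ku + GP = f, \tag{3.13-35} \]
\[ C^T u = g, \tag{3.13-36} \]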
where the new gradient matrix, $G$, is given by (in 2D)

\[
G = \begin{bmatrix} \int \boldsymbol{\phi}_{(1)}\,\partial\boldsymbol{\psi}^T/\partial x_1 \\[4pt] \int \boldsymbol{\phi}_{(2)}\,\partial\boldsymbol{\psi}^T/\partial x_2 \end{bmatrix},
\tag{3.13-37}
\]

and the following remarks are relevant:

1. The lowest-order discontinuous pressure approximations, $P_1P_0$ and $Q_1P_0$, later called $Q_1Q_0$, are precluded; pressure must be at least linear (this element 'terminology' is explained in Section 3.13.2a).

2. Div and grad are no longer (skew) symmetric.

3. $G$ is always a 'grad,' even on boundaries with NBC's, such as outflow boundaries.

4. Since $G$ is always a grad, the hydrostatic pressure vector, $P_H = (1, \ldots, 1)^T$, is always in its null space; i.e., an eigenvector of the matrix $\begin{bmatrix} K & G \\ C^T & 0 \end{bmatrix}$ is the vector $\begin{pmatrix} 0 \\ P_H \end{pmatrix}$, and the corresponding eigenvalue is zero. This puts the following new solvability constraint (proven below, in Section 3.13.2b) on the system (3.13-35) and (3.13-36): $u^T f + P^T g = 0$, where $\begin{pmatrix} u \\ P \end{pmatrix}$ is the null vector of $\begin{bmatrix} K & C \\ G^T & 0 \end{bmatrix}$; i.e., the data ($b$ in $Ax = b$) must be orthogonal to the null vector of the transposed system ($z^T b = 0$, where $A^T z = 0$). (See Section 3.13.2b if the above solvability condition is not sufficiently clear.) But $P \ne P_H$ and $u \ne 0$ in general, and we see that the loss of symmetry associated with the ostensibly legitimate notion of not 'integrating the pressure gradient around by parts' could lead to significant difficulty and may even be fatal. A final damning feature of this notion, which we further explore in Section 3.13.2b below, is this: the associated/implied PPE (for the time-dependent case, in general) has no BC on $\Gamma_N$, with the result that the pressure is underdetermined. [This also applies to steady Stokes: the equation $(C^T K^{-1} G)P = C^T K^{-1} f - g$ that is implied by (3.13-35) and (3.13-36) would be found to be lacking in BC's on $\Gamma_N$; also, directly related to the lack of a pressure BC is the fact that $(C^T K^{-1} G)$ is singular, with null vector $P_H$, but $G^T P_H \ne 0$.] These observations are probably related to some of the difficulties experienced by some FDM codes when $G$ is a grad and $G^T \ne C^T$.

5. Finally, recall that in Section 3.12.5, this formulation was shown to be ill-posed (underdetermined) in the continuous case.

[Exercise for the reader: Consider integrating the continuity equation by parts, $\int \psi\,\nabla\cdot u = 0 = \int_\Gamma \psi\,u\cdot n - \int u\cdot\nabla\psi$. Discuss known and possible consequences: 1. $\nabla P$ integrated by parts. 2. $\nabla P$ not integrated by parts.]

e. Normal and tangential BC's

The full generality of the finite element method requires that we be able to apply any of the legitimate BC's to domains of any shape. This can lead to some awkward cases if we 'stay the course' with isoparametric mappings and cartesian velocities. As first pointed out by Engelman et al. (1982a) for the incompressible NS equations, there are situations in which the cartesian directions are not appropriate for applying BC's; a (local)
coordinate transformation (rotation) to the normal and tangential direction(s, in 3D) is required. [Earlier, Pinder and Gray (1977) performed a similar function, 'a rational method,' for the shallow water equations.] Examples: (i) at a free surface, we often require $n\cdot u = 0$ and $t\cdot u$ 'free' ($t$ is the unit tangent vector, typically oriented so that the domain is on your left); (ii) in a turbulent boundary layer in which so-called 'wall functions' ['numerical grafitti': F. Habashi, personal communication (to a large audience!)] are employed (see the chapter on 'Turbulent Flow' in Volume II), the BC's are $n\cdot u = 0$ and specified shear stress; (iii) an inlet region wherein parallel flow ($t\cdot u = 0$) and a normal force BC are desired; (iv) both normal and tangential tractions are specified; and (v) a problem stated in polar coordinates.

The best way to implement such BC's is via local rotation (at each node needing it) so that the momentum equation is expressed in normal and tangential coordinates. Engelman et al. (1982a) showed how to perform this rotation and, importantly, how to properly, and uniquely, define the normal direction. We summarize (and paraphrase) their key results here, first in 2D, using Figure 3.13-1, in which $t \equiv \tau$. The general technique involves both a rotation of the momentum equations at node $i$ from cartesian to $(n, \tau)$ and a change of variables from $(u, v)$ to $(u_n, u_\tau)$, the latter using

\[ u_n = n\cdot u = n_x u + n_y v, \tag{3.13-38} \]
\[ u_\tau = \tau\cdot u = \tau_x u + \tau_y v, \tag{3.13-39} \]

or, in terms of global discrete velocity vectors,

\[ u_R = R\,u, \tag{3.13-40} \]

where $u^T = (\cdots u_i \cdots v_i \cdots)$, $u_R^T = (\cdots u_{ni} \cdots u_{\tau i} \cdots)$ is the rotated velocity vector (at node $i$), and $R$ is the (orthogonal, $R^{-1} = R^T$) rotation matrix; i.e., a matrix with ones on the diagonal and zeros elsewhere except for the four entries that transform $u_i$ to $u_{ni}$ and $v_i$ to $u_{\tau i}$; i.e., the transformation puts $u_n$ into $u$ and $u_\tau$ into $v$ at node $i$ in the global arrays. The exact location of these entries in $R$ depends on the global node and equation numbering schemes used, but in general we can call them $j$ and $k$; i.e., $u_i$ (and thus $u_{ni}$) is at location $j$ in the global $n$-vector, and $v_i$ (and thus $u_{\tau i}$) is at location $k$.

Fig. 3.13-1 Unit normal and tangent vectors. [Panel labels: ($u_\tau$ specified, $u_n$ free); ($u_\tau$ free, $u_n$ specified).]

The rotation matrix thus looks like:

\[
R = \begin{bmatrix}
1 & & & & & & \\
 & \ddots & & & & & \\
 & & n_x^{(i)} & \cdots & n_y^{(i)} & & \\
 & & \vdots & \ddots & \vdots & & \\
 & & \tau_x^{(i)} & \cdots & \tau_y^{(i)} & & \\
 & & & & & \ddots & \\
 & & & & & & 1
\end{bmatrix},
\tag{3.13-41}
\]

in which the rows and columns run from 1 to $n$, the entries $n_x^{(i)}$ and $\tau_x^{(i)}$ sit in column $j$, the entries $n_y^{(i)}$ and $\tau_y^{(i)}$ sit in column $k$, and the two modified rows are rows $j$ and $k$; it is clear(?) that $R^T R = I$, as desired. Inserting the rotated velocities into the DAE's of (3.13-28) and (3.13-29) gives, using $u = R^{-1}u_R = R^T u_R$,

\[ M R^T \dot{u}_R + [K + N(R^T u_R)]R^T u_R + CP = f, \tag{3.13-42} \]
\[ C^T R^T u_R = g. \tag{3.13-43} \]

To finish, we also rotate the momentum equations (still at node $i$), which is accomplished simply by multiplication by $R$ [cf. (3.13-40)]:

\[ (R M R^T)\dot{u}_R + [R K R^T + R N(R^T u_R) R^T]u_R + R C P = R f, \tag{3.13-44} \]

a procedure that is easily done at element level. We now have both momentum equations and velocities in terms of normal and tangential components, and it is a simple matter to apply either essential or natural BC's to either component of node $i$, in the 'usual' way. Note that it is only the 'all essential' BC case ($u_n$, $u_\tau$ given) that does not need to be rotated (although this case could be done in rotated mode), because we could then use the inverse of (3.13-38) and (3.13-39) to obtain ($u = R^T u_R$)

\[ u = n_x u_n + \tau_x u_\tau, \tag{3.13-45} \]
\[ v = n_y u_n + \tau_y u_\tau, \tag{3.13-46} \]

a pair of equations that can also be used to transform back to cartesians after the $(n, \tau)$ BC's have been applied and the boundary velocity computed.

We are nearly finished. The remaining (and crucial) step is the proper computation of the normal vector at node $i$. Noting first from Figure 3.13-1 that the geometric normal is not even well-defined at node $i$ (because of the $C^0$ boundary shape), the final task is to find an appropriate and unique normal (and tangent) vector. This problem was also solved by Engelman et al. (1982a), and we repeat the solution here, with slight variations; it turns out that the 'omnipotent' incompressibility condition once again plays a major role, as follows: starting from $\int \psi_i\,\nabla\cdot u^h = 0$ [cf. (3.13-2)], we sum over $i$ and use $\sum_{i=1}^{N_p}\psi_i = 1$
to obtain $\int \nabla\cdot u^h = 0$ where, from (3.13-8) with $\mathbf{u}_j = (u_j, v_j)^T$ denoting the nodal velocity,

\[ u^h = \sum_{j=1}^{N_T} \mathbf{u}_j\,\phi_j, \tag{3.13-47} \]

where $N_T = M_\alpha + N_\alpha$ there, because here we need not distinguish between specified/given values of $\mathbf{u}_j$ and those to be determined, as we shall see. Thus,

\[ 0 = \int \nabla\cdot u^h = \int \sum_j \nabla\cdot(\mathbf{u}_j\phi_j) = \int \sum_j \mathbf{u}_j\cdot\nabla\phi_j = \sum_j \mathbf{u}_j\cdot\int \nabla\phi_j. \tag{3.13-48} \]

Now Green's Theorem in the form $\int \nabla\phi_j = \int_\Gamma n\,\phi_j$ yields

\[ \sum_j \mathbf{u}_j\cdot\int_\Gamma n\,\phi_j = \sum_j \Big(u_j\int_\Gamma n_x\phi_j + v_j\int_\Gamma n_y\phi_j\Big) = 0. \tag{3.13-49} \]

The next key observation is that $\phi_j|_\Gamma = 0$ for all internal nodes, so that the summation from one to $N_T$ in (3.13-49) effectively collapses to one over only the $N_B$ (say) boundary nodes. Thus, (3.13-48) becomes, effectively,

\[ \sum_{j=1}^{N_B} \mathbf{u}_j\cdot\int \nabla\phi_j = 0 = \sum_{j=1}^{N_B}\Big(u_j\int \frac{\partial \phi_j}{\partial x} + v_j\int \frac{\partial \phi_j}{\partial y}\Big), \tag{3.13-50} \]

which gives, using (3.13-45) and (3.13-46), but now evaluated at node $j$ on the assumption that $n_j$ and $\tau_j$ are uniquely defined,

\[ \sum_{j=1}^{N_B}\Big[(n_{xj}u_{nj} + \tau_{xj}u_{\tau j})\int \frac{\partial \phi_j}{\partial x} + (n_{yj}u_{nj} + \tau_{yj}u_{\tau j})\int \frac{\partial \phi_j}{\partial y}\Big] = 0, \tag{3.13-51} \]

and we are almost finished. Rearranging to

\[ \sum_{j=1}^{N_B}\Big[u_{nj}\Big(n_{xj}\int \frac{\partial \phi_j}{\partial x} + n_{yj}\int \frac{\partial \phi_j}{\partial y}\Big) + u_{\tau j}\Big(\tau_{xj}\int \frac{\partial \phi_j}{\partial x} + \tau_{yj}\int \frac{\partial \phi_j}{\partial y}\Big)\Big] = 0 \tag{3.13-52} \]

yields the next key result: since (3.13-52) is still a statement of global mass conservation, it can only depend on $\{u_{nj}\}$; i.e., it must be totally independent of the values of the tangential velocity, $u_{\tau j}$, which can only be true if

\[ \tau_{xj}\int \frac{\partial \phi_j}{\partial x} + \tau_{yj}\int \frac{\partial \phi_j}{\partial y} = \tau_j\cdot\int \nabla\phi_j = \tau_j\cdot\int_\Gamma n\,\phi_j = 0, \tag{3.13-53} \]

a relation that gives the ratio of the two components of $\tau$. To finish, we simply add the normalization requirement, $\tau\cdot\tau = 1$, which permits the unique (up to a sign) solution,

\[ \tau_{xj} = \pm\frac{\int \partial\phi_j/\partial y}{\left|\int \nabla\phi_j\right|} = \pm\frac{\int_\Gamma n_y\phi_j}{\left|\int_\Gamma n\,\phi_j\right|}, \tag{3.13-54} \]
\[ \tau_{yj} = \mp\frac{\int \partial\phi_j/\partial x}{\left|\int \nabla\phi_j\right|} = \mp\frac{\int_\Gamma n_x\phi_j}{\left|\int_\Gamma n\,\phi_j\right|}. \tag{3.13-55} \]

Now Figure 3.13-1 shows that $\tau = k \times n$, where $k$ is the unit vector in the $z$-direction (out of the plane), giving $\tau_{xj} = -n_{yj}$ and $\tau_{yj} = n_{xj}$, which, with (3.13-54) and (3.13-55), gives

\[ n_j = \frac{\int \nabla\phi_j}{\left|\int \nabla\phi_j\right|} \tag{3.13-56} \]

as the final result, in which the proper 'sign selection' has taken place [$-$ in (3.13-54), $+$ in (3.13-55)]. Note, of course, that the global integration effectively collapses to the area defined by the support of $\phi_j$. An alternative form of this final result that makes good physical sense but is probably not the preferred way in practice obtains via another application of Green's Theorem in (3.13-56) [and is already in (3.13-54), (3.13-55)]:

\[ n_j = \frac{\int_\Gamma n\,\phi_j}{\left|\int_\Gamma n\,\phi_j\right|}, \tag{3.13-57} \]

a form presented in Lynch and Gray (1980); the mass-consistent unit normal at node $j$ is a basis function-weighted geometric normal. [See (3.13-353) et seq. for a specific example.] This latter result also suggests that a 'simple,' and unique, geometric average value, via $n_j = \int_{\Gamma_j} n / |\int_{\Gamma_j} n|$, where $\Gamma_j$ means 'integrate over that portion of $\Gamma$ containing node $j$,' may not be mass-consistent, and this is true. $n_j$ as computed from (3.13-56) or (3.13-57) is the only normal vector that assures that 'flow in = flow out' of the element pair meeting at node $j$, an interpretation employed by Gray (1977) in his original (and simpler) derivation of the consistent normal, and in Pinder and Gray (1977). Finally, if the normal velocity is specified on all of $\Gamma$, then only the consistent normal will assure discrete global mass conservation and $\sum g_i = 0$ in (3.13-31); i.e., only then are the data orthogonal to the hydrostatic null vector, and only then is the problem well-posed.

The consistent normal at a corner of the domain is interesting. It can easily be shown to look like that in Figure 3.13-2 for any rectangular-shaped element.
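As a minimal sketch of (3.13-56), the consistent normal at a domain corner can be computed by integrating the gradient of the corner node's basis function over the single element that supports it. The assumptions here are not from the text: an axis-aligned rectangular bilinear element with the node at the local origin, and 2 x 2 Gauss quadrature. The same sketch also forms the 2 x 2 nodal block of the rotation matrix in (3.13-41). Function and variable names are hypothetical.

```python
import numpy as np

gp, gw = np.polynomial.legendre.leggauss(2)
s = 0.5 * (gp + 1.0)   # Gauss points on [0, 1]
w = 0.5 * gw           # matching weights

def integral_grad_phi_corner(hx, hy):
    """Integrate grad(phi) over [0, hx] x [0, hy], where
    phi = (1 - x/hx)(1 - y/hy) is the bilinear basis function of node (0, 0)."""
    gx = gy = 0.0
    jac = hx * hy                         # area scaling from the unit square
    for a, wa in zip(s, w):
        for b, wb in zip(s, w):
            dphidx = -(1.0 - b) / hx      # with y = b * hy
            dphidy = -(1.0 - a) / hy      # with x = a * hx
            gx += wa * wb * dphidx * jac
            gy += wa * wb * dphidy * jac
    return np.array([gx, gy])

def consistent_normal(hx, hy):
    """Mass-consistent unit normal at the corner node, per (3.13-56)."""
    total = integral_grad_phi_corner(hx, hy)
    return total / np.linalg.norm(total)

n_square = consistent_normal(1.0, 1.0)    # square element
n_rect = consistent_normal(2.0, 1.0)      # 2:1 rectangular element

# Tangent from tau = k x n, then the nodal block of R in (3.13-41).
tau = np.array([-n_rect[1], n_rect[0]])
R_block = np.array([n_rect, tau])

u_cart = np.array([2.0, -1.0])            # illustrative nodal velocity (u, v)
u_rot = R_block @ u_cart                  # (u_n, u_tau), as in (3.13-38), (3.13-39)
u_back = R_block.T @ u_rot                # back to cartesians, (3.13-45), (3.13-46)
```

For the square element the computed normal bisects the corner; for the 2:1 rectangle it tilts toward the normal of the longer edge, which is the basis function-weighting of (3.13-57) at work.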
While perhaps awkward geometrically in some cases (see Engelman et al., 1982a, for further details), here we present an example of the positive side of the story: if one is using the $Q_1Q_0$ element with the BC of specified traction in the corner of the domain, any normal except that shown in Figure 3.13-2 will violate the two following (traction BC) requirements: (i) the normal force balance involves the centroid (element) pressure, and (ii) the tangential force balance must not involve the centroid pressure; mass consistency is here a prerequisite in order to obtain momentum consistency. See Section 3.13.5f for details.

Fig. 3.13-2 Corner normal vector.

To conclude the consistent normal discussion, we summarize the 3D situation, which contains no surprises, only the need to deal with a tangent plane (with two tangent vectors) rather than a tangent line. The consistent normal is still that from (3.13-56), with the integrations now occurring over the 3D volume supported by $\phi_j$. The only 'hitch' in 3D is that, once a normal direction has been determined, there is an infinite number of choices for the two tangential directions. While this does not present a problem from a theoretical point of view, in practice it does. For example, consider the simple case where we wish to specify a zero normal velocity component and known (non-zero) tangential velocities on a surface such as a cylinder which is not aligned with any coordinate direction. The procedure advocated in Engelman et al. (1982a) will result in a different set of tangential directions at each node on the cylinder, clearly an undesirable situation. Or consider turbulent flows using 'wall functions' (see Volume II), in which an applied shear stress is specified in a particular tangential direction together with a zero normal velocity. One solution to these 'problems' (where again we stress that it is a practical implementation problem rather than a theoretical one) is to allow the user, once the normal direction has been consistently derived, to specify explicitly one of the tangential directions, say $\tau_1$. The remaining tangential direction can then be computed using a simple vector cross product ($n \times \tau_1$).

f.
Axisymmetric case

Recalling the axisymmetric version of the NS equations of Section 3.6.4, it is of some interest to discuss/present their weak formulation, especially since the axis itself ($r = 0$) has caused certain difficulties to non-FEM CFDer's, who often 'drill a hole' through the center of the mesh to avoid placing nodes along $r = 0$; see, e.g., de Vahl Davis (1979) and Smutek et al. (1985). The important 'saving grace' for GFEM begins with the realization that the 'volume element' for integration begins as $2\pi r\,dr\,dz$ rather than the $dx\,dy\,dz$ of cartesians, and we drop the common factor $2\pi$. Thus, for example, the $uv/r$ term in (3.7-29) looks like $\int \phi_i\,uv\,dr\,dz$ in the weak form; i.e., 'easy.' There is, however, one term that retains $r$ in the denominator: the $u/r^2$ term in (3.7-31) goes over to $\int \phi_i\,u\,dr\,dz/r$, which still does not cause problems; the term is integrable, and appropriate numerical integration (Gauss-Legendre) keeps us away from $r = 0$. Next, the viscous term like $(1/r)\,\partial(r\,\partial u/\partial r)/\partial r$ in (3.7-31) integrates by parts to $\int r\phi_i\,\partial u/\partial r|_{r=R}\,dz - \int r\,(\partial\phi_i/\partial r)(\partial u/\partial r)\,dr\,dz$, with the boundary integral ultimately 'showing up' as part of the normal viscous (pseudo) traction force. The final 'interesting' term is $\partial P/\partial r$ in (3.7-28); here, integration by parts recovers the appropriate (div-grad) symmetry:

\[
\int \phi_i\frac{\partial P}{\partial r}\,r\,dr\,dz
= \int \frac{\partial}{\partial r}(r\phi_i P)\,dr\,dz - \int P\frac{\partial}{\partial r}(r\phi_i)\,dr\,dz
= \int r\phi_i P\big|_{r=R}\,dz - \int P\frac{\partial (r\phi_i)}{\partial r}\,dr\,dz,
\]

the second of which looks like its symmetric counterpart from the first term of (3.7-27), $\int \psi\,\partial(ru)/\partial r$, once the appropriate 'expansions' are made. [The first term is, of course, the pressure contribution to the normal force (traction) at the tube wall and will be part of the NBC unless the tube wall is a no-flow boundary (the 99.99% case), in which case $\phi_i = 0$ at $r = R$, and the term vanishes.] After $u = \sum u_j\phi_j$ and $P = \sum P_j\psi_j$ are performed, we recover the required symmetry: the $C$-matrix contribution is $-\int(\partial\phi_i/\partial r + \phi_i/r)\,\psi_j\,r\,dr\,dz$ and that for $C^T$ is $-\int \psi_i(\partial\phi_j/\partial r + \phi_j/r)\,r\,dr\,dz$, and we are done.

The bottom line is simply the following [cf. (3.13-18) through (3.13-27)], for the simpler 2D case [no swirl: omit (3.7-25) and set $v = 0$], leaving the 2.5D case (2D equations, three components, with swirl) to the reader:

1. Identify $x_1 = x$ with $r$ and $x_2 = y$ with $z$.

2. Replace $dx\,dy$ by $r\,dr\,dz$ in all 'bulk' integrals, even though we derived some terms in which we cancelled the $r$'s, for expository purposes.

3. Set $\gamma = 0$; we are dealing with the simpler ($\nabla^2$) form, leaving the more complex stress-divergence form as another exercise.

4. Augment the viscous matrix, $K_\alpha$, by
\[ K_\alpha \to K_\alpha + \nu\,\delta_{\alpha 1}\int \frac{\boldsymbol{\phi}_{(\alpha)}\boldsymbol{\phi}_{(\alpha)}^T}{r^2}\,r\,dr\,dz, \]
where $\delta_{\alpha 1}$ is the Kronecker delta.

5. Augment the $C$-matrix by
\[ C_\alpha \to C_\alpha - \delta_{\alpha 1}\int \frac{1}{r}\,\boldsymbol{\phi}_{(\alpha)}\boldsymbol{\psi}^T\,r\,dr\,dz. \]

Done.

g. Fixing ill-posed Dirichlet BC's

A common situation is that where the normal velocity is specified on all of $\Gamma$ ($n\cdot u = n\cdot w \equiv f$), and it is a simple fact that the finite element interpolant of this function, say $\Pi f$, generally does not satisfy $\int_\Gamma \Pi f = 0$ even if the continuum problem is well-posed ($\int_\Gamma f = 0$). This ill-posed problem must be converted to one that is well-posed if we are to make any progress with GFEM.

Remark: As noted earlier (Section 3.10.5), if $n\cdot w$ is time-independent and the time-dependent NS equations are being solved in the PPE formulation, then this ill-posedness is not recognized by the mathematics; it is then up to the analyst to recognize the problem.
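A minimal numerical sketch of this ill-posedness, and of the constant shift that is derived in the remainder of this subsection, follows. The setting is an assumption made for illustration: the boundary is represented by a single arc-length coordinate $s \in [0, 1]$ with a uniform piecewise-linear interpolant, and the data $f(s) = 3s^2 - 1$ are chosen so that the continuum integral vanishes. The helper name `boundary_integral` is hypothetical.

```python
import numpy as np

def boundary_integral(nodes, values):
    """Exact integral of the piecewise-linear interpolant (trapezoid rule)."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(nodes)))

s_nodes = np.linspace(0.0, 1.0, 11)   # nodes along the boundary, total length 1
f_nodal = 3.0 * s_nodes**2 - 1.0      # f_j = f(s_j); continuum integral of f is 0

# The interpolant is not mass-consistent even though f itself is.
imbalance = boundary_integral(s_nodes, f_nodal)

# Least-squares fix: shift every nodal value by the same constant,
# lam = (integral of the interpolant) / (boundary length).
length = s_nodes[-1] - s_nodes[0]
lam = imbalance / length
fh_nodal = f_nodal - lam

residual = boundary_integral(s_nodes, fh_nodal)
```

On this mesh the imbalance is of the size of the interpolation error (it scales with the square of the node spacing), so the adjustment is small whenever the continuum data are themselves well-posed.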
One way to 'fix' the data, as mentioned earlier, is given by Glowinski (1984), in which a two-step procedure is employed: (i) modify the unit normal vector on all of $\Gamma$ by projecting it (in the $L_2$ sense) onto the velocity basis; (ii) subtract off (pointwise) an appropriate fraction of the global mass imbalance to regain global conservation. See his book for details. Here we provide an alternative and, we believe, simpler way to get the job done: we modify the normal velocity in a least-squares sense with no need to modify the normal vector on $\Gamma$. Like so: given $\tilde{f} \equiv \Pi f = \sum_j f_j\phi_j$ with $f_j = f(x_j)$ and $\int_\Gamma \tilde{f} \ne 0$, perform a least-squares adjustment, from $\tilde{f}$ to $f_h$, via: minimize $\int_\Gamma (f_h - \tilde{f})^2$ subject to $\int_\Gamma f_h = 0$. Converting this constrained extremal problem to a saddle-point problem via the introduction of a Lagrange multiplier, $\lambda$, to satisfy the constraint, yields: extremize

\[ F(f_h, \lambda) = \tfrac{1}{2}\int_\Gamma (f_h - \tilde{f})^2 + \lambda\int_\Gamma f_h, \]

which in turn leads to (i) $f_h - \tilde{f} + \lambda = 0$ and (ii) $\int_\Gamma f_h = 0$, to give $\lambda = \Gamma^{-1}\int_\Gamma \tilde{f}$, where $\Gamma = \int_\Gamma d\Gamma$ is the boundary measure 'size.' The Lagrange multiplier is a constant that is proportional to the global mass imbalance. The final result is simply

\[
f_h(x) = \tilde{f}(x) - \Gamma^{-1}\int_\Gamma \tilde{f}
= \sum_j f_j\phi_j - \Gamma^{-1}\int_\Gamma \sum_j f_j\phi_j
= \sum_j f_j\,[\phi_j(x) - \bar{\phi}_j]
= \sum_j f_{hj}\,\phi_j(x),
\]

where $\bar{\phi}_j = \Gamma^{-1}\int_\Gamma \phi_j$ is the average value of $\phi_j(x)$ over all of $\Gamma$.

Remarks:

1. The nodal values are simpler yet: $f_{hi} = f_i - \lambda$; the pointwise values are all adjusted by the same amount. Clearly, if $\Pi f$ is already mass-consistent, then $\lambda = 0$ and no change is made.

2. The mass-inconsistent interpolant is modified (slightly, at least when $\int_\Gamma f = 0$) at each node in such a way that the new contribution to $f_h$ from each node is itself 'mass-consistent' in that $\int_\Gamma (\phi_j - \bar{\phi}_j) = 0$ for all $j$. [Note that $\bar{\phi}_j \ll 1$, usually, whereas $\phi_j(x) = O(1)$.]

3. It is not even required that the continuum problem be well-posed; i.e., the modified discrete problem is well-posed even if $\int_\Gamma f \ne 0$. (If a wildly non-physical problem, with $\int_\Gamma f$ 'large,' has been posed, then the 'adjustment' may also be large.)

4. If part of $\Gamma$ is truly impenetrable so that $f = 0$ there, before and after the adjustment, then simply omit this part of $\Gamma$ in the above calculations.

5. The necessity or desirability of employing consistent normals, a la Section 3.13.1e, should also be considered if this method is to be implemented. [We derived this method while writing this book, and have not (yet) implemented it ourselves.]

6. Similar procedures have been employed in certain FDM's to adjust the mass imbalance (ill-posedness) that comes from trying to use $\partial u_n/\partial n = 0$ as an OBC; see, for example, Schutt (1991) and Sani and Gresho (1994).

3.13.2 The Choice of Elements

a.
Introduction and summary tables

We are about to embark on one of the most difficult and dangerous of finite element trails as we attempt to explain some of the nitty-gritty details behind the long (and still developing) history that finally leads up to not only the simple question, 'Which element(s) do you prefer and why?' but to much more basic and difficult ones: 'Which elements work, why do some not, and why do some appear to work but perhaps should not?' Or even, 'Why do some advocate the use of an element that others claim is doomed to fail?' or 'Why are there so d... many choices?' Why cannot reasonable people agree on such ostensibly simple issues, especially those with firm mathematical underpinnings? Partly, it must be that the issues are actually far from simple. Naturally, we shall need to try to clarify what it means for an element to 'work' or to not work, or even to fail. Perhaps W. Habashi put it best when he said, 'Convergence is in the eye of the beholder' (personal communication, to a large audience, many persons). To begin, we emphasize that most of this deep and troubled and muddy water came to be because of the single simplification (!) of the mass conservation equation;
i.e., the fluid will be treated as, or assumed to be, incompressible. We also note that this alleged simplification has also taken its toll in the finite difference world, where numerical solutions of the incompressible NS equations began; many in this world are also quite confused even today. A relevant comment on this situation was made recently by M. Rose: '...because the treatment of incompressible flow is so unforgiving of imprecise ideas, such flows still remain a fertile ground.' (personal communication, 1990). On the other hand, formulations of the fully compressible equations can also 'act up' (e.g., Pironneau, 1989; and Fortin and Pierre, 1992), especially in regions in which the flow is 'behaving' incompressibly ($\nabla \cdot u$ is 'small').

In contrast to the scalar transport equation, the choice of elements for the NS equations is far from simple. Mixed methods and saddle-point problems are the name of the game, or at least part of it. The new issues include: div-grad symmetry, compatible function spaces, null spaces, spurious modes, stability (with mesh refinement), and element-level mass conservation, as well as the 'simpler' ones of accuracy, simplicity, and cost-effectiveness. All of these issues will be discussed further below.

A perusal of the literature reveals more than two dozen elements, either triangular or quadrilateral, for 2D flows. In 3D, the corresponding numbers are smaller, but there is still a plethora of possibilities. The ease with which FEM researchers can generate various 'higher-order' approximations, relative to those in FDM or control volume methods, is probably as much of a curse as it is a blessing, especially when it is acknowledged that not all seemingly reasonable approximations deliver useful and/or cost-effective results.
The number of element 'combos' (velocity and pressure approximating functions) that have been analyzed is very large; so too is the number that have been coded up and tested in the CFD laboratory. Add to this list the concept of 'macro elements' and other tricks to make 'stable' elements out of 'unstable' ones, and the situation becomes even more complex, perhaps even scary or daunting. ... It is therefore especially easy to understand how 'outsiders' or 'newcomers' scouting the field might view the FEM for incompressible flow with some skepticism, perhaps wondering, 'When are they going to get their act together?', and asking, 'Why should I jump into this clearly confused and frustrating fray?' And indeed these are valid concerns, and we only wish that we could address them more adequately than we do below. Perhaps, though, it is just another (annoying?) manifestation of the general fact that the FEM offers many, many choices of basis functions. (How many 'higher-order' finite volume methods are there? Lack of choice is also not best.) The field is definitely 'richer' for the finite element mathematician (or 'mathematical engineer') than for the average CFD practitioner who is mainly interested in obtaining good/useful results fairly cost effectively. But the truth (as we know it) is, unfortunately, that there is no unequivocally 'best' element. ...

In this section we shall attempt to summarize the state of confusion (a moving target) regarding element choices, focus on those subsets of elements that we advocate (partly, of course, because of our own experience), and still try to give a reasonably balanced presentation. That this is not entirely possible is probably obvious, since there often seems to be a fairly large increase in adrenalin flow whenever the subject of 'element choices' is discussed.
Our discussion will probably also create a few new enemies—a plight we could bear if in addition it attracts enough outsiders and newcomers to give finite elements a try—so that, on balance, the FEM might move forward faster. Our general philosophy will be based on the premise that simplicity is still beautiful, and on the fact that the theory
is too often silent. ... A colleague recently opined that the entire field would still be in the Stone Age if practitioners had waited for the theorists to prove 'consistency, stability, accuracy, convergence, etc.' We are already clearly in violation of (our understanding of) the 'French school', for example, which usually seems to require some minimum number of proved theorems before any computer programming and subsequent numerical experiments are permitted. But even they manage to 'ignore' the unfortunate fact that no one has yet been able to prove global existence of solutions to the subject of this book, the NS equations. ... C'est la vie.

After presenting a few tables of 'elements' (good and bad), we will briefly summarize, as we see/understand it, the key state of the art in some of the finite element selection criteria, namely stability and how to analyze it. We begin by defining a few broad categories of elements, some of which we will need and others we shall drop:

1. Equal-order vs mixed-interpolation. This refers to the basis functions used for velocity (components) vis-a-vis those for pressure: $\phi$ vs $\psi$ in the previous section. The former is obvious (the same for both), and the latter usually means that a higher-degree piecewise polynomial is used for the velocity than for the pressure [which is at least partially related to the following two issues: (i) $\nabla^2 u$ involves a higher-order operator than does $\nabla P$; and (ii) (roughly) the divergence of the (vector) 'velocity space' is a scalar space that is close to being (and sometimes is) the 'pressure space']. It is worth noting that a stable method can always be obtained by sufficiently enriching the velocity space for a fixed pressure space. Such a stable space might not be very accurate, however, in the sense that a high-order polynomial basis function for velocity (e.g., quadratic) may not give high-order accuracy if the accuracy of the pressure space is low (e.g., piecewise constant), and few would opt for an expensive, inaccurate element (cost effectiveness in reverse!). It is also worth noting that the choice of a pressure space (and, of course, the velocity space) implies the choice of the divergence operator. The most popular mixed-interpolation elements employ one-order-lower basis functions for pressure than for velocity, although also popular are 'stabilized' equal-order elements, to be addressed by D. Silvester in the next section (Section 3.13.3).

2. Continuous vs discontinuous pressure. Since (or when) the weak formulation of the momentum equation involves integration by parts of $\nabla P$, the resulting weak form contains no derivatives of pressure, thus introducing the possibility of approximating it by functions (piecewise polynomials, of course) that are not $C^0$-continuous; indeed, this has been done and is quite popular/useful. But it is not necessary; hence, continuous approximations for $P$ are also much used, with or without integration by parts, with the latter generating unsymmetric div and grad matrices and 'problems' with NBC's and well-posedness. Note that discontinuous pressure elements do not possess uniquely defined pressure on the element boundaries; they are dual-valued there, and often multi-valued at certain velocity nodes. Note too that only discontinuous pressure elements assure an element-level mass balance. Proof: for $\psi_i$ piecewise constant (unity on element $e$, zero elsewhere), $0 = \int_\Omega \psi_i\, \nabla \cdot u_h = \int_e \nabla \cdot u_h = \oint_{\partial e} n \cdot u_h$, and only discontinuous pressure elements contain this element-level test function. QED.
3. Conforming vs non-conforming. Conforming velocity elements are those for which the basis functions form a subset of $H^1$ for the continuous problem; i.e., the first derivatives (and their squares) are integrable in $\Omega$. The simplest non-conforming element is a linear triangle with the nodes placed at the three midsides; it 'conforms' with the velocity in each neighboring triangle at just one point. Following Girault and Raviart (1986) and Gunzburger (1989), we shall mostly neglect these little-used elements, but we do cite the classic reference; it is Crouzeix and Raviart (1973), who also introduced some important new concepts regarding conforming elements. See also Thomasset (1981). Also, the nonconforming quadrilateral element of Rannacher and Turek (1992), a sort of 'rotated' (and LBB-stable) version of $Q_1Q_0$, should be mentioned here; Turek (1994, 1996a) has shown many good results with it.

Next, we introduce some terminology for efficient element descriptions, mostly borrowed, but with a little bit that is new:

1. For triangles/tetrahedra, the designation $P_mP_n$ means that the velocity (each component) is approximated by continuous piecewise complete Polynomials of degree $m$ and pressure by continuous piecewise complete Polynomials of degree $n$. (For example, $P_2P_1$ in 2D means $u \sim a_1 + a_2 x + a_3 y + a_4 xy + a_5 x^2 + a_6 y^2$ with a similar approximation for $v$, and $P \sim A_1 + A_2 x + A_3 y$.) Both velocity and pressure are continuous across element boundaries, and each (triangular) element contains six velocity nodes and three pressure nodes. The 3D (tetrahedron) version of this element contains 10 velocity nodes and four pressure nodes.
2. For the same families, $P_mP_{-n}$ is as above, except that pressure is approximated via piecewise-discontinuous polynomials ($C^{-1}$) of degree $n$; e.g., $P_2P_{-1}$ is the same as $P_2P_1$ except that the pressure is now an independent linear function in each element and is therefore discontinuous at element boundaries.
3. For quadrilaterals/hexahedra, the designation $Q_mQ_n$ means that the velocity (each component) is approximated by a continuous piecewise polynomial of degree $m$ in each direction on the Quadrilateral and likewise for the pressure, except that the polynomial degree is $n$. [For example, $Q_2Q_1$ is like $P_2P_1$ above, with the addition of $a_7 x^2 y + a_8 x y^2 + a_9 x^2 y^2$ to $u$ and $A_4 xy$ to $P$. Each element contains nine velocity nodes ($3^2$) and four pressure nodes ($2^2$); the 3D (brick) version has, of course, 27 velocity nodes ($3^3$) and eight pressure nodes ($2^3$).]
4. For these same families, $Q_mQ_{-n}$ is as above, except that the pressure approximation is not continuous at element boundaries.
5. Again for the same families, $Q_mP_{-n}$ indicates the same velocity approximation with a pressure approximation that is a discontinuous complete piecewise Polynomial of degree $n$ (not of degree $n$ in each direction; it is as if the pressure were to be represented on a triangle within the quadrilateral, with 'extrapolation' as necessary).
6. The designation $P^+$ or $Q^+$ adds some sort of 'bubble function' to the polynomial approximation for the velocity. These are sometimes called 'enriched' elements (Arnold et al., 1984).
7. Finally, for $n = 0$, we have piecewise-constant pressure, and we omit the minus sign for simplicity.
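The node counts quoted in items 1 and 3 follow from counting monomials, and a few lines of Python make the bookkeeping explicit. This is a minimal sketch with hypothetical function names; it assumes standard Lagrange elements, so that a complete polynomial of degree $m$ on a simplex in $d$ dimensions carries $\binom{m+d}{d}$ nodes and a polynomial of degree $m$ in each direction on a quadrilateral or brick carries $(m+1)^d$.

```python
from math import comb


def simplex_nodes(degree, dim):
    """Nodes of a complete polynomial of the given degree on a triangle (dim=2) or tetrahedron (dim=3)."""
    return comb(degree + dim, dim)


def tensor_nodes(degree, dim):
    """Nodes of a polynomial of the given degree in each direction on a quadrilateral or brick."""
    return (degree + 1) ** dim


def element_nodes(family, vel_degree, pres_degree, dim):
    """Return (velocity nodes, pressure nodes) per element for PmPn ('P') or QmQn ('Q')."""
    count = simplex_nodes if family == "P" else tensor_nodes
    return count(vel_degree, dim), count(pres_degree, dim)


p2p1_triangle = element_nodes("P", 2, 1, 2)   # (6, 3)
p2p1_tetra = element_nodes("P", 2, 1, 3)      # (10, 4)
q2q1_quad = element_nodes("Q", 2, 1, 2)       # (9, 4)
q2q1_brick = element_nodes("Q", 2, 1, 3)      # (27, 8)

# Q2P-1 in 2D: tensor-product velocity, complete linear pressure on each element.
q2p_minus1_quad = (tensor_nodes(2, 2), simplex_nodes(1, 2))  # (9, 3)

print(p2p1_triangle, p2p1_tetra, q2q1_quad, q2q1_brick, q2p_minus1_quad)
```

For the discontinuous-pressure designations the pressure count is a number of unknowns per element rather than a number of shared nodes, which is why the $Q_mP_{-n}$ case mixes the two counting rules.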
Before presenting a summary of most of the known 'incompressible elements,' we provide a hopefully-useful 'heuristic' to perhaps show how these mixed-interpolation elements might have come about. And this we do via one example (with six 'variations'): suppose you 'like' the nine-node quad ($Q_2$) for scalar problems and are interested in generalizing it to a 2D vector problem for the incompressible NSE's. To make such a 'velocity' element incompressible, you might first consider 'average' incompressibility; i.e., request mass conservation at only the element level, leading to the weak form $\int_e \nabla \cdot u_h = 0$ for each element, $e$. A second idea might be to request a weak form that approximates $\nabla \cdot u = 0$ at every velocity node, which would lead to $\int_\Omega \phi_i\, \nabla \cdot u_h = 0$ for $i = 1, 2, \ldots, n$. A third method would request the same weak form, but only at the corner nodes of the 9-node quad, $\int_\Omega \psi_i\, \nabla \cdot u_h = 0$, where $\psi_i$ is a bilinear basis function and $i$ ranges over only the corner nodes of the mesh. A fourth might be a combination of the first and third; i.e., require both $\int_e \nabla \cdot u_h = 0$ and $\int_\Omega \psi_i\, \nabla \cdot u_h = 0$. A fifth idea would be to strengthen the first via 'moment' equations; i.e., in addition to $\int_e \nabla \cdot u_h = 0$, also require $\int_e \xi\, \nabla \cdot u_h = 0$ and $\int_e \eta\, \nabla \cdot u_h = 0$ on each element, where $(\xi, \eta)$ are the local coordinates. Finally, a sixth variation would add the cross-moment, $\int_e \xi\eta\, \nabla \cdot u_h = 0$, to the fifth. That's enough; all we need say in addition is that these six elements have the names $Q_2Q_0$, $Q_2Q_2$, $Q_2Q_1$, $Q_2(Q_1 + Q_0)$, $Q_2P_{-1}$, and $Q_2Q_{-1}$, all of which (save one, $Q_2Q_2$, since equal-order interpolation is definitely not very viable) are listed in one of the tables below, and each of which implies/generates the concomitant pressure approximation for the mixed method in question.
With these new names in hand, we present in Tables 3.13-1 and 3.13-2 a summary description of some of the triangular and rectangular elements that have been and/or are used today; a full list is not necessary, we believe (consult the references for those not listed). Also, and importantly: as we have only very limited personal experience with the elements in these tables, much of the associated qualitative discussion is simply based on our perception of the issues. The designation 'LBB-stable' refers to the three mathematicians who made important contributions to the analysis of stability; they are Ladyzhenskaya (1969), Babuska (1971, 1973), and Brezzi (1974). We will later summarize what is meant by LBB stability and its several aliases: inf-sup condition, BB-condition, consistency condition, and div-stability condition. Briefly, any element passing this stability test will converge 'optimally' (in the sense of approximation theory; details later) and without spurious pressure behavior, and those failing the test may not (not will not); they may converge, and may even converge optimally, but this theory does not assure it: it becomes 'silent.' Indeed, from one of the leading 'stability experts' (M. Fortin), we have: 'Knowing which elements are stable is not, however, by far, a complete picture of the situation.' (Fortin and Fortin, 1985a.) Any reference to 'accuracy' in Tables 3.13-1 and 3.13-2 (e.g., first-order) refers to velocity error in $H^1$ and pressure error in $L^2$. (Rough rule of thumb: most if not all velocity errors can be restated in $L^2$ by 'adding one' to the $H^1$ error estimate; $h^k \to h^{k+1}$.) Tables 3.13-3 and 3.13-4 present a similar summary for the less developed 3D case. Finally, below Tables 3.13-3 and 3.13-4 we offer some general comments, some objective and some subjective; hopefully some are valid. A general remark pertaining to Tables 3.13-1 through 3.13-4 is this: M.
Fortin seems to be the clear leader when it comes to both the creation/design of new incompressible elements and in their stability analysis—even if the effects of this leadership cannot be clearly discerned in our tables. The citations include (at least): Fortin (1977, 1981, 1983, 1985), and with some help: Fortin and
Table 3.13-1 Summary of (useful?) 2D triangular elements.
Sketch symbols: • velocity; o velocity and continuous pressure; x discontinuous pressure.

Name | LBB stable?
$P_1P_0$ | N
$P_1^+P_1$ (MINI) | Y
$P_1P_0$, criss-cross on a 4-patch (macro) | N
$P_1P_1$ on a 4-patch (macro) | Y
$P_2P_0$ | Y
$P_2P_1$ (Taylor-Hood) (4) | Y
$P_2^+P_1$ | Y
$P_2(P_1 + P_0)$ | Y
$P_2P_{-1}$ | N

Advantages (entries in row order): Simple; Simple; CAC (2) stable; Pointwise divergence-free; Best element with linear velocity (Gunzburger, 1989); See too Glowinski (1984); Simple; Simplest second-order triangle; Better than $P_2P_1$; Element mass balance; Pointwise divergence-free.

Disadvantages (entries in row order): Sometimes 'locks' ($u = 0$); Rarely if ever usable (1); More work than $P_1P_0$ but no more accurate; Only 1st-order accurate; More work than $P_2P_1$; 2 hydrostatic modes; 'Variable' spurious null space (1); Can be less accurate than $P_2P_1$.

Other (entries in row order): First-order; Cubic bubble; First-order; $N/2$ local CB's (3); Penalty method should be used; First-order; Also called iso $P_2$-$P_1$; Also called $P_1$ iso $P_2$-$P_1$; Beat $P_2P_{-1}$ in several tests (Thompson, 1975); An early favorite; Second-order; Cubic bubble; 'Good element' (M. Fortin); Second-order; Second-order; Can give good results for relaxed (natural) BC's; also, see Section 3.13.7.
Table 3.13-1 (continued).

Name | LBB stable? | Advantages | Disadvantages | Other
$P_2^+P_{-1}$ (Crouzeix-Raviart) | Y | Stabilizes $P_2P_{-1}$ | More work than $P_2P_1$ [but see too the 'modified' version, discussed in Cuvelier et al. (1986), which is more economical] | Second-order; Cubic bubble; 'Good element' (M. Fortin)

(1) But see Qin (1994).
(2) CAC: constrained approximation condition (see Malkus and Olsen, 1984).
(3) See Section 3.13.2b for discussion of checkerboard modes (CB's); $N$ = number of macro-elements.
(4) Taylor and Hood (1973).
(5) Thompson (1975).

Table 3.13-2 Summary of (useful?) 2D quadrilateral elements.

Name | LBB stable?
$Q_1Q_0$ ($Q_1P_0$) | N
$Q_1Q_1$ on a 4-patch (macro) | Y
$Q_1P_{-1}$ | N
$P_2^+P_{-1}$ | Y

Advantages (entries in row order): Simplicity; Penalty method works; See Section 3.13.5; Less sensitive to mesh distortion than $Q_1Q_0$.

Disadvantages (entries in row order): Mathematicians hate it; See Section 3.13.5; Less accurate than $Q_2Q_1$ on same grid; More work than $Q_1Q_0$, but also more accurate; Awkward?

Other (entries in row order): First-order, usually; Pressure is constant over the element; See Section 3.13.5; First-order; Little or no demonstrated utility; First-order; Quadratic bubble; $P = P_0 + xP_x + yP_y$ is equivalent representation; First-order; Quadratic bubble; Quadratic.
Table 3.13-2 (continued).

Name | LBB stable?
$Q_2Q_0$ ($Q_2P_0$) | Y
$Q_2Q_1$ (Taylor-Hood) (1) | Y
$Q_2(Q_1 + P_0)$ | Y
$Q_2P_{-1}$ | Y
$Q_2Q_{-1}$ | N

Advantages (entries in row order): Few, except stable; Simplest higher-order $C^0$-pressure quadrilateral; Better approximation to $\nabla \cdot u = 0$ than $Q_2Q_1$; Element mass balance; Probably the most accurate 2D element; Consistent (2)(3) penalty works; Penalty works.

Disadvantages (entries in row order): First-order accurate; div $u = 0$ is often not strong enough (see Volume II); 2 hydrostatic modes; One CB-mode.

Other (entries in row order): normal velocity at mid-sides (linear tangential velocity); R means 'Restricted'; Momentum equation for central mode sees no pressures!; Second-order; More accurate than $Q_2^{(8)}Q_1$; (Much) less accurate than $Q_2P_{-1}$; Second-order; Introduced in Gresho et al. (1980b) to improve on $Q_2Q_1$; Second-order; First introduced, it seems, in Sani et al. (1981a); it is also referred to as the 9/3 element; Second-order; See Section 3.13.6b.

(1) Hood and Taylor (1974) actually used the eight-node serendipity element (remove central node), $Q_2^{(8)}Q_1$, which is also LBB stable and second-order accurate. See too Taylor and Hood (1973) for earlier 'equal order' experiments.
(2) See Section 3.13.2e (penalty).
(3) It is permissible/viable to use either local ($P = a + b\xi + c\eta$) or global ($P = A + Bx + Cy$) pressure approximation; Shopov and Iordanov (1994) prefer the former (more accurate on distorted iso-P's), whereas only the global representation, to our knowledge, has been shown to possess the optimal error estimate (D. Arnold and F. Brezzi, personal communication).
Table 3.13-3 Summary of usable/useful (?) 3D tetrahedral elements.

Name | LBB stable?
$P_1^+P_1$ (MINI) | Y
Iso $P_2$-$P_1$ | Y
$P_1^+P_{-1}$ | Y
$P_2P_1$ (Taylor-Hood) | Y
$P_2^+P_{-1}$ (Crouzeix-Raviart) | Y
$P_2(P_1 + P_0)$ | Y

Advantages (entries in row order): Simple; 'Best linear tetrahedral element' (1); Simplest higher-order tetrahedron; Cost effective (3); Better mass conservation than $P_2P_1$ (4).

Disadvantages (entries in row order): Not very accurate (1)(2); Awkward (?); div $u = 0$ may be too 'weak'; 2 hydrostatic modes.

Other (entries in row order): First-order; Quartic bubble; See Table 3.13-1 for 2D version; Add mid-face and centroid velocities to $P_1P_{-1}$; Second-order; Second-order; Quartic bubble; Second-order.

(1) Soulaimani et al. (1987).
(2) Parre (1992).
(3) Bertrand et al. (1992).
(4) Tidd et al. (1988).

Table 3.13-4 Summary of usable/useful (?) 3D hexahedral elements.

Name | LBB stable? | Advantages | Disadvantages | Other
$Q_1Q_0$ ($Q_1P_0$) | N | Simplicity; Penalty works | Multiple modes | First-order
$Q_2Q_1$ (Taylor-Hood) (*) | Y | Simplest higher-order $C^0$-pressure brick | div $u = 0$ not strong enough | Second-order
$Q_2(Q_1 + P_0)$ | Y(?) | Same as 2D | Same as 2D | Second-order
$Q_2P_{-1}$ | Y | Element mass balance; Consistent penalty works | May not be as good as 2D analog | Second-order
$Q_2Q_{-1}$ | N | May give better div $u = 0$ than $Q_2P_{-1}$; Penalty works | Multiple modes | Second-order
Linear triangular prism | ? | Transition elements | Awkward | (**)
Quadratic triangular prism | ? | Transition elements | Awkward | (**)

(*) The 20-node serendipity element, $Q_2^{(20)}Q_1$, popular in solid mechanics, is obtained by omitting all midface nodes and the center node from $Q_2Q_1$.
(**) The linear case would typically use constant pressure and the quadratic case would use either continuous or discontinuous linear pressure. A growing use of these elements is in boundary layers close to no-slip boundaries when unstructured tetrahedra are used in the bulk of the domain. It is conjectured that the 3D element is stable if the two 2D elements comprising it are.

Soulie (1983), Fortin and Fortin (1985a, b), and Soulaimani et al. (1987). [It seems that his interest in finding stable elements was stimulated/increased when Sani et al. (1981) showed that a particular element that Fortin had used and studied ($Q_1P_0$) could lead to ill-posed algebraic problems if the imposed velocity BC's did not satisfy a certain spurious constraint equation. Details later.]

Additional Remarks:

(1) Mixed-interpolation can give 'stability' and optimal accuracy, but requires additional bookkeeping, both in the 'main' code and in post-processing/graphics.
(2) Equal-order interpolation can only give 'stability' (and optimal accuracy) if $\nabla \cdot u = 0$ is modified/weakened, typically to $\nabla \cdot u = \epsilon f(P)$ with $\epsilon$ 'small'; see Section 3.13.3 for a discussion on 'stabilization.' If not stabilized, spurious pressure modes (which we shall discuss later) result.
(3) 'LBB stable' elements assure the existence of a unique solution (Stokes flow) and assure convergence at the optimal rate (i.e., as good as that from approximation theory).
(4) 'LBB unstable' elements may not converge, and if they do, they may not do so at the optimal rate. But the theory is mostly silent, thus far; i.e., they may converge (and even at the optimal rate) in some cases/grids, as we will show later for $Q_1Q_0$.
(5) Continuous pressure approximation cannot deliver element-level mass balances, nor can the penalty method be efficiently implemented. Discontinuous pressures sidestep/skirt/obviate each of these disadvantages. [The lowest-order continuous pressure on quadrilaterals ($Q_1Q_1$) corresponds to 'unstaggered grids' in finite difference methods, and the lowest-order discontinuous pressure ($Q_1Q_0$) (roughly) to 'staggered grids.']
(6) Quadrilateral elements are usually more accurate than triangular elements, all else being equal; the latter often display mesh orientation effects, at least when using regular/structured triangulations. For example, Zhu and Zienkiewicz (1988) saw a need for ~2.5 times as many nodes using $P_2$ as when using $Q_2$, for equal accuracy. Triangles should, we believe, be 'mixed up' by the mesh generator to reduce grid orientation error. See also Shubin and Bell (1984) for some FDM 'grid orientation' effects.
(7) Triangular elements are usually more useful for describing truly complex geometry. [A common(?) and good policy is to use triangles only where necessary, with quadrilaterals used where feasible; this is clearly a challenging problem for mesh generation, and one not yet 'solved' in 3D.]
(8) Quadratic approximation has (de facto?) been judged to be of 'high-enough' order; these elements also do a fairly good job of matching/describing curved boundaries via isoparametric approximation, and cubic elements are deemed (it seems, 'implicitly') to be just too complex/expensive.
(9) Bubble functions (and other internal nodes) can often be profitably 'condensed out'/eliminated at element level. Indeed, Fortin has even opined 'that it is worthwhile adding internal nodes to some elements, even if the order of convergence is not increased,' to enhance stability (Fortin and Fortin, 1985a), although this does make the element construction more expensive.
(10) The $Q_1Q_0$ element is probably the most controversial and least well-understood of all elements (but see Section 3.13.5), in spite of the fact that it has been the center of focus of very many analyses; paradoxically, in 3D it may also be the element most frequently employed (most 'element-hours' logged on computers). Note too that an equally good name for it is $Q_1P_0$, which is indeed often used. A comment of Ciarlet (1978) seems particularly relevant here: '... we shall simply emphasize the fact that, for all practical purposes, nothing replaces the numerical experience accumulated over the years by engineers.'
(11) Some of the elements listed can be generalized to arbitrary order; e.g., $P_kP_{-(k-1)}$ for $k > 2$.
(12) A 'reduced' quadratic element, the $Q_2^{(8)}Q_1$, eliminating the center node ($x^2y^2$), was in fact the first of the higher-order quadrilateral elements. It was introduced by Taylor and Hood (1973) and Hood and Taylor (1974), rather than the $Q_2Q_1$, which also often bears their names, because the eight-node quadratic element (called 'serendipity' by some; e.g., Zienkiewicz and Taylor, 1989) was then popular in solid mechanics. C. Taylor later teamed up with one of the present authors (PMG) and two others to show that the full quadratic element was worthy of more serious consideration (Huyakorn et al., 1978), a conclusion later supported mathematically in Bercovier and Pironneau (1979).
It is probably safe to say today that the serendipity element should be passé in CFD, at least in 2D. And we now feel the same about $Q_2Q_1$ when measured against $Q_2P_{-1}$. (See Section 3.13.6a.)
(13) Stability of higher-order 'Taylor-Hood' elements, $Q_kQ_{k-1}$ for $k \ge 3$, is proven in Brezzi and Falk (1991).
(14) 'The Hexahedron. The element having six quadrilateral faces is in general a better element in three dimensions than the tetrahedron' (Wait and Mitchell, 1985); this is a 'general' remark, not explicitly addressed to the NS equations.
(15) If one is (only) interested in solving potential flows via the mixed FEM, the elements in the larger space, $H(\mathrm{div})$, could perhaps be used effectively. For a discussion of these 'Raviart-Thomas' or 'Brezzi-Douglas-Marini' et al. elements, see Brezzi and Fortin (1991). See too the numerical example at the end of this chapter.
(16) For some recent nice results using an improved MINI element (rescaled bubble function) in 2D, see Simo et al. (1995), a paper which also discusses/summarizes some modern 'alternative' approaches that we do not discuss (much) in this text: SUPG, Galerkin-least squares, 'optimally dissipative' methods, etc.
(17) From Malkus (Appendix 4.II in Hughes, 1987) we quote, 'The reason that the standard error estimates for incompressible elements steer the practitioner toward the underconstrained and inconvenient elements is that the standard estimates demand too much from a Lagrange multiplier (or related penalty pressure). The role of the Lagrange multiplier has been seen to be a two-fold role of enforcer of the constraint and of pressure solution. The choice made in all five of the safe elements is to choose elements in which the role of enforcer has to some extent been sacrificed to avoid pressure modes.' The five 'safe elements' are: $P_2P_0$, $P_2P_1$, $Q_2Q_0$, $Q_2P_{-1}$, and $Q_2Q_1$.
(18) Attempts have been made to quantitatively measure an element's quality by a seemingly appropriate counting of constraints. Based on the fact that in the continuum there exists one (vector) momentum conservation equation and one (scalar) mass conservation equation at every point in the fluid, a similar 'counting' for the discrete case may be made, with a qualitative judgment then following by asserting that an element is 'good' if the constraint ratio (mass/momentum) is close to unity and 'bad' if far from it. (See, for example, Gresho et al., 1980b, and Hughes, 1987, and references therein.) For example, $Q_1Q_0$ has, on average, one momentum equation and one continuity equation per element, so that in this sense it is perfect. But so too is $Q_1Q_1$, an element not even listed in the tables because of its plethora of spurious pressure modes (defined in the next section). One more example: $Q_2Q_1$ has a constraint ratio of only 1/4 in 2D and 1/8 in 3D, whereas $Q_2P_{-1}$, a 'better' element in most practitioners' opinion (we believe), has 3/4 in 2D and 1/2 in 3D, which is also better. Our current position on the constraint ratio notion is that it provides, at best, a first-order feel regarding the potential utility of an element, and for this reason we leave to the reader the preparation of a detailed 'comparison' table.
We do believe, however, that 'good' elements will have a ratio not too far from unity and certainly not too large compared with unity because of possible 'locking'; i.e., if the ratio exceeds two in 2D and three in 3D, there will be more constraint equations than can be satisfied by the available momentum equations. (19) For all elements, the GFEM generates basically centered difference approximations to the (nonlinear) advection term which, as for AD in the previous chapter, can be wiggle-prone. As in the linear case of AD, we are still believers in 'wiggle signals' over 'smooth is beautiful.' A re-read of the wiggle Section (2.6.1a) may be useful at this point, just before we offer a supporting opinion from the FDM side of the house—in which a dissipative and 'monotone' Godunov advection scheme and a virtually non-dissipative centered-difference scheme are compared and discussed. In Brown and Minion (1995) we find the following: (1) '... our computations cast doubt on the validity of the proposition that the numerical dissipation mechanisms in Godunov-projection methods mimic the physical dissipation'—they show that some physically-reasonable-looking vortices on grids that are relatively too coarse are spurious; (2) 'Whether or not under-resolved Godunov-projection computations are useful is certain to be a controversial issue'; (3) 'In addition, since the centered methods fail rather badly in the under-resolved case, it is somewhat easier to know when one is properly resolving the computed solutions for those methods'—an understatement that we freely translate as 'ye Olde Wiggle Signal'. (20) For some new ideas related to LBB-stable low-order elements, via a macro-element approach, see Nafa and Thatcher (1993). (21) For some new ideas for 'ranking' elements, see Section A3.3.7 of Appendix 3.
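The constraint-ratio counting of remark (18) can be sketched in a few lines. This is only an illustrative tally, assuming a large structured mesh of quadrilaterals or hexahedra with boundary effects ignored; the function names and the element table are hypothetical and are not taken from the text.

```python
from fractions import Fraction

def velocity_nodes_per_element(order, dim):
    # Assumption: a structured mesh of N**dim elements with continuous Q_order
    # velocity has (order*N + 1)**dim nodes, i.e. about order**dim per element.
    return order ** dim

def pressure_dofs_per_element(space, dim):
    if space == "Q0":    # one constant pressure per element
        return 1
    if space == "Q1":    # continuous (bi/tri)linear: about one vertex per element
        return 1
    if space == "P-1":   # discontinuous linear: dim + 1 coefficients per element
        return dim + 1
    raise ValueError(f"unknown pressure space: {space}")

def constraint_ratio(velocity_order, pressure_space, dim):
    """Continuity equations per (vector) momentum equation, per element."""
    return Fraction(pressure_dofs_per_element(pressure_space, dim),
                    velocity_nodes_per_element(velocity_order, dim))

# Hypothetical lookup table: element name -> (velocity order, pressure space).
ELEMENTS = {"Q1Q0": (1, "Q0"), "Q1Q1": (1, "Q1"),
            "Q2Q1": (2, "Q1"), "Q2P-1": (2, "P-1")}

for name, (order, space) in ELEMENTS.items():
    print(name, constraint_ratio(order, space, 2), constraint_ratio(order, space, 3))
```

The tally reproduces the ratios quoted in remark (18), including the fact that $Q_1Q_0$ and $Q_1Q_1$ both score a 'perfect' unity, which is one reason the ratio gives only a first-order feel.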
After reading all of this, the reader may/should (?) ask, 'So what? What are the answers to the questions raised at the beginning of this section?' Somewhat apologetically, we answer, 'We wish we knew!!' There is definitely more work to be done. We do believe, however, that we can provide some guidance for new code writers who do not wish to generate and provide huge 'element libraries':
In 2D, the triangular elements $P_2^+P_1$ and $P_2^+P_{-1}$ are very good, as is the $Q_2P_{-1}$ isoparametric quadrilateral, which is probably the most accurate 2D element. If you wish to avoid bubble functions on triangular elements, $P_2P_1$ is not bad, and $P_2(P_1 + P_0)$ is even better, although somewhat more complicated.
In 3D, the situation is (still) much less clear, partly because not many 3D simulations have been made with a wide variety of elements (such testing is difficult and expensive, but nevertheless needed). Also, mesh generation capabilities can and do enter in, often in a big way. If truly unstructured meshes are to be used, then tetrahedral elements are a near necessity, for which the low-order MINI element is not too bad; but the second-order elements, $P_2P_1$ and $P_2(P_1 + P_0)$, are better. For those who prefer hexahedra, perhaps reverting to 'tets' or wedges where needed for complex geometry, try $Q_2P_{-1}$ if you need LBB stability and/or want second-order accuracy. [If you're not afraid of pressure modes and your geometry is always sufficiently complex that the probability of their occurrence is small, you might also want to try $Q_2Q_{-1}$.] Finally, we opine that $Q_1Q_0$ is still a competitive element in both 2D and 3D and, at least after assimilating our many discussions about it in this book, the new code developer should probably put it in his/her library.
*b. Null spaces and their effects; pressure modes
o Introduction.
Another 'negative' virtue of incompressible flows stems from the facts that the velocity part of the solution lives in the (large) null space (also called the kernel) of a differential operator, the divergence, and the pressure part, depending on BC's, can contain a component from the small (one-dimensional at most) null space of another differential operator, the gradient. These properties of the solution should be, but are not always, mimicked properly by the corresponding matrices (discrete operators) of the approximate solution. It turns out that the discrete approximations are up against two serious problems: (i) the approximate velocity, while (ultimately) living in a discretely divergence-free space, is in a space that is not a subspace of the continuous divergence-free space, and (ii) the null space dimension of the approximate pressure gradient is often larger (sometimes much larger) than it should be; it should have dimension 0 or 1 depending on BC's. These two issues/facts permeate completely the entire subject of approximate solutions for incompressible flow, and their effects are powerful, profound, often not well-understood, and sometimes devastating; e.g., not all seemingly reasonable approximations 'work', a statement that is not limited to FEM, applying as it does to virtually every approximation method that has ever been tried. These issues also limit our understanding of approximation methods. But we shall, for the most part, stick to the FEM version of these issues, and begin this by noting that the problems are 'caused' (if indeed blame can be placed anywhere) by the use of 'mixed interpolation' (or mixed method) in a different sense than before; i.e., both velocity and pressure are to be approximated, because our velocity basis functions are generally (but see Section 3.13.7) not discretely divergence-free. In an attribution (we believe) to G.
Strang, we may say that 'mixed interpolation brings mixed blessings': the approximation of P as well as u permits
us to approximate the latter using basis functions that are not (discretely) divergence-free, which provides much (too much?) more latitude than otherwise. But this additional 'breathing room' brings with it a plethora of new problems that are the subject of this section: spurious/extraneous null spaces that are filled with spurious pressure modes that introduce spurious/extraneous/redundant solvability constraints and sometimes reduce the convergence rate (when solutions exist!). [The finite difference version of 'selection of elements/basis functions' goes (roughly) over to, 'How "large" a stencil (how many node points) and of what type (staggered, colocated, etc.) should I use for velocity and ditto pressure in order to preclude odd-even decoupling?' But our discussion of the issues via the language of linear algebra, below, surely covers the finite difference method as well as the finite element method; we simply omit details regarding the former.] The choice for velocity-pressure 'pairs' is far from arbitrary.
We start simple, with linear equations, and in some sense end there, because these issues are, thankfully, independent of Reynolds number/advection. The discrete forms of the potential flow equations, $Mu + CP = f$ and $C^Tu = g$, or the transient NS equations, written as $M\dot{u} + CP = f(u)$ and $C^T\dot{u} = \dot{g}$, or the steady Stokes equations, $Ku + CP = f$ and $C^Tu = g$, all present a linear algebra (saddle-point) problem (see Section 3.15) of the form
$$\begin{pmatrix} B & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} u \\ P \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}, \qquad (3.13\text{-}58)$$
where $B$ is either $M$ or $K$ (or, when time-marching via implicit treatment of the viscous terms, a linear combination of the two) and is thus $n \times n$ and SPD; $C$ is $n \times m$ with $n > m$; $u, \dot{u}, f \in R^n$; and $P, g \in R^m$.
o A digression.
It may be well to show why we need $n > m$, which will be an algebraic proof that we must choose our velocity and pressure basis functions so that there are more momentum equations than continuity equations, thus precluding us from even considering 'bizarre'/silly cases such as bilinear velocity and biquadratic pressure ($Q_1Q_2$). First, in words: if $m > n$, then there are more constraints on the velocities than there are velocities. In mathematics: $m > n$ implies that the number of columns of $C$ is larger than the dimension of the space in which they live, which implies linear dependence among the columns. The result would be that the matrix $A = \begin{pmatrix} B & C \\ C^T & 0 \end{pmatrix}$ is singular and that the system (3.13-58) is generally inconsistent. Thus we henceforth assume that $m < n$. End digression.
Since $B$ is non-singular, the above system of linear algebraic equations is of full rank, which ensures the existence of an inverse and thereby a unique solution to (3.13-58), if and only if $C$ is of full rank, which is true if the rank of $C$ is $m$. If $C$ has full rank, then $Cq = 0$ has but one solution: $q = 0$. But, unfortunately, there are many situations in which the rank of $C$ is less than $m$ ($C$ is then said to be rank-deficient), and this causes $A$ to be singular (having one or more zero eigenvalues) and the solution to $Ax = b$, if it exists, to be non-unique. [Here, $x = (u^T, P^T)^T$ and $b = (f^T, g^T)^T$.] We consider some of these situations, and their effects, in this section.
Before getting any deeper into these 'null space issues,' let us digress briefly to note a general property of the spectrum of the $A$-matrix above (see, for example, Bank et al., 1990). This 'property' is obtained in two steps. The first step is to note that $A$ can be
rewritten, via a so-called congruence transformation, as follows:
$$A = \begin{pmatrix} B & 0 \\ C^T & I \end{pmatrix} \begin{pmatrix} B^{-1} & 0 \\ 0 & -C^TB^{-1}C \end{pmatrix} \begin{pmatrix} B & C \\ 0 & I \end{pmatrix}, \qquad (3.13\text{-}59)$$
and the second is an application of Sylvester's 'law of inertia' (e.g., Strang, 1976, p. 259): the matrix in the middle, $\begin{pmatrix} B^{-1} & 0 \\ 0 & -C^TB^{-1}C \end{pmatrix}$, has the same number of positive eigenvalues, the same number of zero eigenvalues, and the same number of negative eigenvalues, as does $A$. Thus, since we have a block-diagonal matrix, all we need to know are the spectra of $B^{-1}$ and $C^TB^{-1}C$. (The matrix $-C^TB^{-1}C$ is referred to as the Schur complement of $B$ in the matrix $A$; see, for example, p. 58 of Golub and Van Loan, 1983.) The former is easy since $B$ (and thus $B^{-1}$) is SPD: it has exactly $n$ positive eigenvalues. The second part is a little harder, and we rely in part on some results due to D. Malkus (1992, personal communication; but see too his 1981 paper): if $C$ is of full rank, then $C^TB^{-1}C$ has exactly $m$ positive eigenvalues; but if $C$ is rank deficient, say rank $C = r < m$, then $C^TB^{-1}C$ has exactly $r$ positive eigenvalues and exactly $m - r$ zero eigenvalues. Thus, we have the general result for matrix $A$: it has exactly $n$ positive eigenvalues, exactly $m - r$ zero eigenvalues, and exactly $r$ negative eigenvalues. (Note that the above characterization applies in general to symmetric saddle-point problems.) And it is, of course, the case when $r < m$ that is of interest herein, because that is when $A$ exhibits a non-trivial null space. In all cases, $A$ is definitely indefinite. A 'look ahead' to Figure 3.13-13 in the next section shows a 'picture' of this eigenvalue distribution (wherein $m - r = k$, the dimension of the null space of $C$).
o Another digression.
One more small digression is in order before we leap into the subject of pressure modes, both pure and impure; and it too is simply a 'review' of (and application of) linear algebra:
Theorem: The linear system $Ax = b$, where $A$ is $n \times n$ (and real) and $b$ is a given $n$-vector, has a solution if and only if $b$ is orthogonal to all vectors in the null space of $A^T$, and when the solution does exist, it is unique only if $A$ (and thus $A^T$) has only the trivial null space.
A proof of this theorem is both useful and reasonably simple; thus we present it, with due thanks to A. Hindmarsh for the second part.
(1) The vectors in the null space of $A^T$, say $z_i$, $i = 1, 2, \ldots, k$, where $k$ is the dimension of the null space (which is $n - r$ where $r$ is the rank of $A$), each satisfy (by definition) $A^Tz_i = 0$.
(2) Multiply $Ax = b$ by $z_i^T$ to obtain $z_i^TAx = z_i^Tb = x^TA^Tz_i = 0$, and thus a necessary condition for $Ax$ to equal $b$ is $z_i^Tb = 0$ for $i = 1, 2, \ldots, k$. If $z_i^Tb \ne 0$, then $b$ is not in the range of $A$; $b$ cannot be 'reached' by the operation of $A$ on any vector in its domain ($R^n$).
(3) For sufficiency, we start with $z_i^Tb = 0$, and form the residual, $r_e = Ax - b$.
(4) Consider the quadratic form/functional, $J = r_e^Tr_e$, and consider its minimum: $dJ/dx = 0 \Rightarrow 2A^Tr_e = 0$, which implies that $r_e \in N(A^T)$, which implies that $r_e$ is necessarily a linear combination of the $\{z_i\}$.
(5) Also, $z_i^Tr_e = z_i^TAx - z_i^Tb = 0$ because $A^Tz_i = 0$ and $z_i^Tb = 0$.
(6) Hence, $r_e \perp z_i$ and thus $r_e \perp r_e$, and we obtain $r_e = 0$ because only the zero vector is orthogonal to itself. This proves sufficiency.
(7) Thus there then exists a solution to $Ax = b$. It is given by $x = x_p + \sum_{j=1}^{k} a_jy_j$, where $x_p$ is a particular solution and each $y_j$ is a null vector for $A$ (i.e., $Ay_j = 0$, $j = 1, 2, \ldots, k$), and the scalar coefficients $\{a_j\}$ are completely arbitrary, and
the proof is complete. Note that while our original $A$ was symmetric, the more general theory (for unsymmetric $A$) was presented because we shall soon (and temporarily!?) come across situations where $A$ is of the form $A = \begin{pmatrix} B & G \\ D & 0 \end{pmatrix}$, where $G$ approximates grad, $D$ approximates $-$div, and $D^T \ne G$. But for our current case, (3.13-59), $A = A^T$, and thus the simpler result, $z_i = y_i$, $i = 1, 2, \ldots, k$, obtains.
o The unsymmetric case.
We are now ready to apply this linear algebra theory to (3.13-58), but first let us do so for the more general (and troublesome, and not often used, at least in FEM) unsymmetric version,
$$\begin{pmatrix} B & G \\ D & 0 \end{pmatrix} \begin{pmatrix} u \\ P \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}, \qquad (3.13\text{-}60)$$
before which we note that we cannot even find a congruence transformation, and that does not even matter because even if we could, we could not apply Sylvester's law of inertia because it only applies to symmetric matrices. So we do not even have a good handle on the spectrum of $A$ in this unsymmetric case, although it is probably a safe bet to assert that there are still at least $n$ positive eigenvalues because of $B$. Because $B$ is invertible, it is possible to eliminate $u$ in favor of a single equation for $P$,
$$(DB^{-1}G)P = DB^{-1}f - g, \qquad (3.13\text{-}61)$$
an equation that is also useful, although usually only for theoretical purposes (unless $D = C^T$, $G = C$, $B = M$, and the mass is lumped, in which case $B^{-1}$ is no longer dense, and the equation corresponds to a Poisson equation that is also useful in computations, as we will see soon). The symmetric version of this equation, from (3.13-58), is
$$(C^TB^{-1}C)P = C^TB^{-1}f - g. \qquad (3.13\text{-}62)$$
The dimension of the null spaces of the matrices $DB^{-1}G$ or $C^TB^{-1}C$ is the same as that of $A$, namely, $k = m - r$; that of the former is at least as large as that of $G$, and for the symmetric case, see Remark (5) below. Application of the existence/uniqueness theory presented above to (3.13-60) leads to the following:
1. Existence.
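A solution exists if and only if $w_i^Tf + r_i^Tg = 0$, where $\begin{pmatrix} w_i \\ r_i \end{pmatrix}$ is the $i$-th null vector of $A^T$: $Bw_i + D^Tr_i = 0$ and $G^Tw_i = 0$, $i = 1, 2, \ldots, k$ [which $\Rightarrow G^TB^{-1}D^Tr_i = 0$; cf. (3.13-61)].
2. Uniqueness. When a solution exists, it is not unique when $k > 0$, being given by
$$\begin{pmatrix} u \\ P \end{pmatrix} = \begin{pmatrix} u \\ P \end{pmatrix}_p + \sum_{i=1}^{k} a_i \begin{pmatrix} v_i \\ q_i \end{pmatrix}$$
for arbitrary $a_i$, where $\begin{pmatrix} u \\ P \end{pmatrix}_p$ is a particular solution of (3.13-60), and $\begin{pmatrix} v_i \\ q_i \end{pmatrix}$ is the $i$-th null vector of $A$: $Bv_i + Gq_i = 0$ and $Dv_i = 0$, $i = 1, 2, \ldots, k$.
Remarks:
(1) $v_i = -B^{-1}Gq_i$ ($q_i \ne 0$, necessarily), and if we have the special case of $q_i \ne 0$ yet $Gq_i = 0$, we see that $v_i = 0$ and, by definition, we have a pure pressure mode, or,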
more simply and hereafter, a pressure mode; i.e., $q_i$ is then called a pressure mode, a mode (eigenvector) that corresponds to a zero eigenvalue but has no component in the velocity 'portion.' Pressure modes thus affect only the pressure part of the solution of (3.13-60), when such solutions exist, which, from above, still requires $w_i^Tf + r_i^Tg = 0$: the data must be orthogonal to the corresponding adjoint eigenvector, which vector will generally not be an analogous 'pure pressure mode'; i.e., $w_i \ne 0$ in general (but it could be zero).
(2) If $v_i \ne 0$ and if a solution exists, then the velocity is also polluted, and it is probably the case that (3.13-60) does not represent a valid approximation to the original continuum equations. An example of this type will be presented at the end of this section.
(3) The only physically relevant/meaningful (non-spurious) null vector is the pressure mode called the hydrostatic pressure mode, $q_i = P_H$: $P_H^T = (1, 1, \ldots, 1)$, a constant vector. The existence of $P_H$ is assured whenever $G$ properly approximates the gradient operator, both in $\Omega$ and on $\Gamma$. If (and only if) the corresponding adjoint eigenvector, $r_H$, is also a constant vector and if $w_H = 0$ and $D^Tr_H = 0$ (which would, of course, be true if it too were a pure pressure mode, which seems to also require that $D^T$ also approximate the gradient operator, or at least annihilate constant vectors), then a meaningful approximate solution could be obtained. If $P_H$ exists (i.e., if $G$ is a gradient everywhere) and an NBC is used as an OBC (i.e., $u \cdot n$ is not specified on the OBC portion of $\Gamma$), then $P_H$ itself is (usually) a spurious hydrostatic pressure mode, a subject we shall return to below.
(4) All null vectors except $P_H$ are spurious numerical artifacts; they are extraneous and have no analogs in the continuous case.
(5) The null space of $C^TB^{-1}C$ is the same as that of $C$. (Proof: $C^TB^{-1}Cx = 0 \Rightarrow x^TC^TB^{-1}Cx = 0 \Rightarrow z^TB^{-1}z = 0$, where $z = Cx$.
But $B^{-1}$ is SPD and thus $z = 0$.)
(6) The corresponding solvability constraint in the pressure only, a la (3.13-61), is easily seen to be $r_i^T(DB^{-1}f - g) = 0$.
(7) The null vectors of the adjoint system affect the existence of a solution, while those of the original system affect the 'quality' of the solution.
(8) An example of such an unsymmetric system is presented later in this section, and another is in Section 3.13.4b.
o The symmetric case.
Let us now apply the same theory to the symmetric case, (3.13-58), to see, in part, how much more 'attractive' it is. And it makes more sense to reverse the order of presentation: uniqueness first.
1. Uniqueness. Suppose that $(v^T, q^T)^T$ is a null vector of (3.13-58):
$$\begin{pmatrix} B & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} v \\ q \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
Take the inner product of this equation with $(v^T, q^T)^T$ to obtain $v^TBv + v^TCq + q^TC^Tv = 0$. But $C^Tv = 0$, and thus $v^TCq = q^TC^Tv = 0$, and we are left with $v^TBv = 0$. Since $B$ is SPD, we are led to the important result that $v = 0$, and the first block row then gives $Cq = 0$; any and all null vectors in the symmetric case are (pure) pressure modes. Thus, the velocity solution will always be unique, if a solution exists at all, which we address next.
2. Existence. Since only pressure modes can be present when $C$ is rank deficient, the solvability condition is simpler:
$$q_i^Tg = 0, \quad i = 1, 2, \ldots, k = m - r. \qquad (3.13\text{-}64)$$
When these constraints are satisfied, the solution of (3.13-58) is
$$\begin{pmatrix} u \\ P \end{pmatrix} = \begin{pmatrix} u \\ P \end{pmatrix}_p + \sum_{i=1}^{k} a_i \begin{pmatrix} 0 \\ q_i \end{pmatrix},$$
where $Cq_i = 0$ and the $\{a_i\}$ are arbitrary.
Remarks:
(1) Recall the origin of $g$: it came from the weak form of $\nabla \cdot u = 0$ applied to the specified boundary velocities, (3.13-27).
(2) Again, only the hydrostatic pressure mode is 'physical'; in this case, (3.13-64) is the same constraint given in our earlier GFEM formulation of the NS equation (3.13-31), and corresponds to/represents the requirement that a discrete global mass balance be assured by the Dirichlet BC's on the normal velocity (the only BC for which $P_H$ exists). Associated with the hydrostatic mode is the fact that one (any one) of the $m$ continuity equations is redundant; each equation of $C^Tu = g$ could be represented as a linear combination of all $(m - 1)$ others. The existence of $P_H$ in this situation, and the existence of the associated redundancy (by one) in the continuity equations, is completely proper and physical, and it has a simple (and also physical) analog in, for example, heat transfer for the Poisson equation that may be useful to state: if a steady temperature field is sought from $\nabla^2T = -Q$ in $\Omega$, $\partial T/\partial n = -q$ on $\Gamma$, where $Q$ is an internal heat source, and $q$ is the specified heat flux removed from $\Omega$ through $\Gamma$, then it is well known that: (i) a solution exists if and only if the data satisfy the following solvability condition (heat balance): $\int_\Omega Q = \int_\Gamma q$; (ii) the temperature level is arbitrary, up to an arbitrary additive constant; and (iii) the GFEM analog generates $KT = b$ with $\det K = 0$ and $KT_H = 0$ where $T_H$ is any constant vector (the 'hydrostatic'/thermostatic temperature?), so that a solution exists if and only if $T_H^Tb = 0$ (the discrete global heat balance), and there is one redundant 'heat balance' equation because the applied BC's
(properly) duplicate the global heat balance that is obtained by summing all of the discrete equations. It is also well known that, when solvable, the arbitrariness can be removed by setting the temperature at any node to any desired value and omitting the corresponding equation in $KT = b$.
(3) All other pressure modes, and their concomitant solvability constraints a la (3.13-64), are spurious and, of course, are not constant vectors. This is the more serious side of pressure modes; i.e., whereas some could be content with good velocities and good pressure gradients [arguing (meekly?/weakly?) that only gradients are needed anyway, usually], few if any would like to be saddled with the extra (and spurious, non-physical) constraints on allowable velocity BC's engendered by (3.13-64).
(4) As is the case for the hydrostatic pressure mode, each spurious mode also implies a redundancy in the continuity equations; presumably again each continuity equation could be obtained as a linear combination of all of the others. This of course says that there is a $k$-fold redundancy in the continuity equations; we have $k$ more than
we need. Each pressure mode (spurious or not) decreases the number of independent constraints by one.
(5) The corresponding solvability constraint from the (symmetric) pressure equation, (3.13-62), is $q_i^T(C^TB^{-1}f - g) = 0$, which (appropriately) is just (3.13-64) again because $q_i^TC^T = 0$ by definition.
(6) Besides depending on the BC's employed, pressure modes are 'mesh-dependent'; i.e., their occurrence/existence is sometimes obviated simply by changing the mesh from 'too regular' to less regular, a result that seems to properly suggest that redundancy of continuity equations is more likely to occur on regular, structured meshes than on general meshes of distorted isoparametric elements. This of course is another feature that tends to lead to the conclusion/suggestion that only stable elements should be used. But the fact is that some unstable elements are used daily, and with success; thus, we shall not take the easy way out and restrict further consideration to elements that never display even a single spurious mode.
(7) The pressure modes considered herein are so-called 'global' modes. There also exist 'local' pressure modes (redundant constraints on the element, or macro-element, level; e.g., criss-cross $P_1P_0$ or $P_2P_{-1}$ of Table 3.13-1); for discussion of these local modes, see Malkus's Appendix 4.II in Hughes (1987), Brezzi and Fortin (1991), and Qin (1994).
So which elements display spurious pressure modes and why, and what do the little devils, and their constraint equations, look like? Also, should spurious modes preclude an element from being used? We can only partially answer these questions, which may be just as well, and this we try to do in the remainder of this section, incompletely and partly intentionally so.
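The symmetric-case results above (every null vector of $A$ is a pure pressure mode, and a solution exists only when $q_i^Tg = 0$) can be checked on a tiny example. The sketch below is an illustration under stated assumptions, not a matrix from any finite element code: $B$ is an arbitrary SPD tridiagonal matrix, $C$ is made rank deficient by construction ($n = 4$, $m = 3$, $r = 2$), and the helper names are hypothetical. Exact rational arithmetic is used so that 'zero' really means zero.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form (exact arithmetic); returns (matrix, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

def null_space(rows):
    """Basis of the null space: one vector per free column of the RREF."""
    m, pivots = rref(rows)
    ncols = len(rows[0])
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for row, p in zip(m, pivots):
            vec[p] = -row[free]
        basis.append(vec)
    return basis

def rank(rows):
    return len(rref(rows)[1])

n, m_p = 4, 3
# Assumed example data: B is SPD; column 3 of C = column 1 + column 2 (rank 2).
B = [[2, -1, 0, 0], [-1, 2, -1, 0], [0, -1, 2, -1], [0, 0, -1, 2]]
C = [[1, 0, 1], [0, 1, 1], [-1, 0, -1], [0, -1, -1]]
A = [B[i] + C[i] for i in range(n)] + \
    [[C[i][j] for i in range(n)] + [0] * m_p for j in range(m_p)]

modes = null_space(A)                      # expect m - r = 1 null vector
velocity_part = modes[0][:n]               # expect all zeros (pure pressure mode)
q = modes[0][n:]
Cq = [sum(C[i][j] * q[j] for j in range(m_p)) for i in range(n)]

def solvable(f, g):
    """Ax = b is consistent iff appending b does not raise the rank."""
    b = list(f) + list(g)
    return rank([row + [bi] for row, bi in zip(A, b)]) == rank(A)

f = [1, 0, 0, 2]
print(len(modes), velocity_part, Cq)
print(solvable(f, [1, 2, 3]), solvable(f, [1, 2, 4]))
```

Here $g = (1, 2, 3)^T$ is orthogonal to the pressure mode (proportional to $(1, 1, -1)^T$) and $g = (1, 2, 4)^T$ is not, so only the first right-hand side admits a solution, whatever $f$ is.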
First, we point out that all equal-order elements generate 'too many' (i.e., redundant) mass balance constraints and thus possess spurious modes, whose descriptions are only known for a very few, and this accounts for the predominance of 'mixed interpolation' elements in which the pressure basis functions are polynomials of one (or more) degree(s) lower than those for velocity, a 'trick' which reduces (as desired/necessary) the number of constraint equations. Such elements have either few or no spurious modes, in contrast to equal-order elements that have (too) many. Next, we mention [again; see Remark (4) following (3.13-37)] that the unsymmetric case, if generated by the method referred to above (that of not integrating $\nabla P$ by parts), produces a hydrostatic pressure mode even at an outflow boundary (or wherever NBC's are applied). This hydrostatic mode is spurious, as is its associated solvability condition, even if the element in question is free of pressure modes when $\nabla P$ is integrated by parts, as was mentioned earlier and will be demonstrated later.
o Bilinear velocity.
We now present an extended discussion on the derivation of pressure modes for the bilinear element, for both piecewise discontinuous pressure ($Q_1Q_0$) and continuous bilinear pressure ($Q_1Q_1$), the latter being one of the simplest examples of an equal-order element and presented in just enough detail (which, unfortunately, is a lot of detail) to clearly make our point that equal order is rarely if ever viable. We shall show one way to find the null vectors called pressure modes from the symmetric formulation,
(3.13-58). In contrast to the original presentation (which delivered only partial results for $Q_1Q_1$) in Sani et al. (1981a) employing more 'local' analysis techniques, which were then 'translated through the mesh' to obtain the needed 'global' results, here we shall derive (some of) these global results directly, using simple linear algebra on simple meshes (another severe restriction needed to obtain analytical results) with only simple BC's (Dirichlet). But simple cases are often good enough to elucidate the issues, simply. The analysis is in fact so simple that it is (usually) restricted to meshes of uniform rectangles. Our technique may thus also be directly applicable to finite difference or finite volume methods.
o Piecewise-constant pressure.
The methodology to be used here, which was first presented in summary form in Shih and Gresho (1985), starts with the definition of a pressure mode, $CP = 0$, and seeks non-trivial null vectors $\{P\}$ that satisfy both components of this equation. For a mesh of $l \times h$ rectangles, the $Q_1Q_0$ pressure mode equations at node $i, j$ of Figure 3.13-3 are (see Appendix 1)
$$C_xP = 0 = -\frac{h}{2}\left[(P_{i+1,j+1} - P_{i,j+1}) + (P_{i+1,j} - P_{i,j})\right]$$
and
$$C_yP = 0 = -\frac{l}{2}\left[(P_{i,j+1} - P_{i,j}) + (P_{i+1,j+1} - P_{i+1,j})\right],$$
which are obviously satisfied by $P_H = (1, 1, \ldots, 1)^T$, the hydrostatic pressure mode. But another way to satisfy these equations is to have both $P_{i,j} + P_{i,j+1} = 0$ from the first equation and $P_{i,j} + P_{i+1,j} = 0$ from the second. Seeking a solution to this set of difference equations via $P_{i,j} = a^ib^j$ leads easily to $a = b = -1$, and we obtain
$$P_{i,j} = (-1)^{i+j} \qquad (3.13\text{-}66)$$
as the simplest representation of the most (in)famous spurious pressure mode of them all: the checkerboard (CB)-mode, taking on as it does the value of $+1$ on red (say) elements and $-1$ on black. We defer to, for example, Stephens et al. (1984) to pronounce that there are no other pressure modes for this element. We also defer, briefly, the interaction of this pressure mode with velocity BC's.
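The two $Q_1Q_0$ difference equations above are easy to test numerically. The following sketch assembles them at every interior node of a small mesh of uniform rectangles (all boundary velocities assumed Dirichlet, so only interior nodes carry equations) and confirms that the hydrostatic and checkerboard vectors are the only null vectors. The mesh size, the values of $l$ and $h$, and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def q1q0_pressure_mode_rows(nx, ny, l, h):
    """C_x P = 0 and C_y P = 0 at each interior node of an nx-by-ny element mesh."""
    idx = lambda i, j: j * nx + i          # element (i, j) -> position in P
    rows = []
    for j in range(ny - 1):
        for i in range(nx - 1):
            cx = [Fraction(0)] * (nx * ny)
            cy = [Fraction(0)] * (nx * ny)
            # (di, dj, sign in x-equation, sign in y-equation) for the 4-patch
            for di, dj, sx, sy in ((0, 0, -1, -1), (1, 0, 1, -1),
                                   (0, 1, -1, 1), (1, 1, 1, 1)):
                cx[idx(i + di, j + dj)] = -h / 2 * sx
                cy[idx(i + di, j + dj)] = -l / 2 * sy
            rows += [cx, cy]
    return rows

def residual(rows, p):
    return [sum(a * b for a, b in zip(row, p)) for row in rows]

nx, ny = 4, 3                               # assumed small test mesh
rows = q1q0_pressure_mode_rows(nx, ny, l=Fraction(2), h=Fraction(1))
hydrostatic = [1] * (nx * ny)
checkerboard = [(-1) ** (i + j) for j in range(ny) for i in range(nx)]
nullity = nx * ny - rank(rows)

print(nullity)
print(all(x == 0 for x in residual(rows, hydrostatic)),
      all(x == 0 for x in residual(rows, checkerboard)))
```

On this mesh the null space has dimension two, consistent with the statement that there are no pressure modes beyond $P_H$ and the CB-mode for this element with Dirichlet data.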
[Fig. 3.13-3: A 4-patch of $Q_1Q_0$ elements, with element pressures $P_{i,j}$, $P_{i+1,j}$, $P_{i,j+1}$, and $P_{i+1,j+1}$.]
o Bilinear pressure, too.
Turning now to $Q_1Q_1$ on the mesh shown in Figure 3.13-4, the same appendix allows us to write the following pressure mode equations at node $i, j$:
$$C_xP = 0 = -\frac{h}{12}\left[(P_{i+1,j-1} - P_{i-1,j-1}) + 4(P_{i+1,j} - P_{i-1,j}) + (P_{i+1,j+1} - P_{i-1,j+1})\right]$$
and
$$C_yP = 0 = -\frac{l}{12}\left[(P_{i-1,j+1} - P_{i-1,j-1}) + 4(P_{i,j+1} - P_{i,j-1}) + (P_{i+1,j+1} - P_{i+1,j-1})\right],$$
which are obviously satisfied by $P_H$, the hydrostatic mode, as they should be.
[Fig. 3.13-4: A 4-patch of $Q_1Q_1$ elements, with nodes $(i-1, j-1)$ through $(i+1, j+1)$.]
The next three modes are also obviously pressure modes, and were derived previously by Sani et al. (1981a):
1. The '$\xi$-wave,' a '$2\Delta x$' wave: $P_{i,j} = (-1)^i$.
2. The '$\eta$-wave,' a '$2\Delta y$' wave: $P_{i,j} = (-1)^j$.
3. The CB-wave, a '$2\Delta x \times 2\Delta y$' (or $\xi$-$\eta$) wave: $P_{i,j} = (-1)^{i+j}$.
Remark: These three spurious modes are also present in a simple, second-order, centered finite difference approximation (five-point stencil) using 'non-staggered'/'co-located' grids.
To find the others, we proceed as above for $Q_1Q_0$; i.e., seek solutions of $P_{i,j-1} + 4P_{i,j} + P_{i,j+1} = 0$ from the $x$-equation and of $P_{i-1,j} + 4P_{i,j} + P_{i+1,j} = 0$ from the $y$-equation, which would satisfy $CP = 0$. (Note that more coupling can also introduce more modes!) The solution of these homogeneous difference equations can again be obtained by trying $P_{i,j} = a^ib^j$, which results in the following quadratic equation for both $a$ and $b$:
$$x^2 + 4x + 1 = 0, \qquad (3.13\text{-}67)$$
where $x = a$ or $b$, with solutions $x = -2 \pm \sqrt{3}$. Noting that the product of the two roots is unity leads to the following general solution to the pair of difference equations, where $\xi = -2 + \sqrt{3}$:
$$P_{i,j} = A_1\xi^{i+j} + A_2\xi^{i-j} + A_3\xi^{-i+j} + A_4\xi^{-i-j}, \qquad (3.13\text{-}68)$$
which describes four additional (and linearly independent) pressure modes, since the $A_k$'s are arbitrary coefficients. So, we are up to eight modes for $Q_1Q_1$, seven of them spurious. We can find no more, nor could our computer; thus, we assert that the dimension of the null space is eight. [We have also used a (numerical) Gram-Schmidt orthogonalization routine on these eight vectors for several simple (e.g., 6 x 6, 7 x 6, 7 x 7) meshes and
found, in all cases examined, that the orthogonalization 'succeeded', which proves linear independence of the original vectors: linear dependence would lead to one or more zero vectors via the Gram-Schmidt procedure.]
It is convenient to re-define the four new pressure modes in the following rearrangements, because then certain symmetries are displayed:
1. Even-even mode ($P_{i,j} = P_{-i,j} = P_{i,-j}$) via $A_k = 1/4$ for all $k$ to give
$$P^{(ee)}_{i,j} = (\xi^i + \xi^{-i})(\xi^j + \xi^{-j})/4, \qquad (3.13\text{-}69)$$
which gives $P^{(ee)}_{0,0} = 1$.
2. Even-odd mode ($P_{i,j} = P_{-i,j} = -P_{i,-j}$) via $A_1 = A_3 = -A_2 = -A_4 = 1/[2(\xi - \xi^{-1})] = 1/(4\sqrt{3})$ to give
$$P^{(eo)}_{i,j} = (\xi^i + \xi^{-i})(\xi^j - \xi^{-j})/(4\sqrt{3}), \qquad (3.13\text{-}70)$$
giving $P^{(eo)}_{0,1} = 1$.
3. Odd-even mode ($P_{i,j} = -P_{-i,j} = P_{i,-j}$), which is a 90° rotation of even-odd:
$$P^{(oe)}_{i,j} = (\xi^i - \xi^{-i})(\xi^j + \xi^{-j})/(4\sqrt{3}), \qquad (3.13\text{-}71)$$
giving $P^{(oe)}_{1,0} = 1$.
4. Odd-odd mode ($P_{i,j} = -P_{-i,j} = -P_{i,-j}$) via $A_1 = A_4 = -A_2 = -A_3 = (\xi - \xi^{-1})^{-2} = 1/12$ to give
$$P^{(oo)}_{i,j} = (\xi^i - \xi^{-i})(\xi^j - \xi^{-j})/12, \qquad (3.13\text{-}72)$$
giving $P^{(oo)}_{1,1} = 1$.
If node (0,0) is chosen to be the node at the geometric center of a rectangular domain containing an even number of elements in each direction, then the above four modes are mutually orthogonal. In other cases they are still linearly independent but generally not orthogonal. In 'pictures,' these four pressure modes look as follows, here on a mesh of 36 elements (6 x 6) and 49 nodes, $-3 \le i, j \le 3$:
(i) The even-even mode:
    676  -182    52   -26    52  -182   676
   -182    49   -14     7   -14    49  -182
     52   -14     4    -2     4   -14    52
    -26     7    -2     1    -2     7   -26
     52   -14     4    -2     4   -14    52
   -182    49   -14     7   -14    49  -182
    676  -182    52   -26    52  -182   676
(ii) The even-odd mode:
   -390   105   -30    15   -30   105  -390
    104   -28     8    -4     8   -28   104
    -26     7    -2     1    -2     7   -26
      0     0     0     0     0     0     0
     26    -7     2    -1     2    -7    26
   -104    28    -8     4    -8    28  -104
    390  -105    30   -15    30  -105   390
(iii) The odd-even mode:
    390  -104    26     0   -26   104  -390
   -105    28    -7     0     7   -28   105
     30    -8     2     0    -2     8   -30
    -15     4    -1     0     1    -4    15
     30    -8     2     0    -2     8   -30
   -105    28    -7     0     7   -28   105
    390  -104    26     0   -26   104  -390
(iv) The odd-odd mode:
   -225    60   -15     0    15   -60   225
     60   -16     4     0    -4    16   -60
    -15     4    -1     0     1    -4    15
      0     0     0     0     0     0     0
     15    -4     1     0    -1     4   -15
    -60    16    -4     0     4   -16    60
    225   -60    15     0   -15    60  -225
We now consider the solvability issue; for $u$ and $v$ specified on all of $\Gamma$, the solvability constraints, (3.13-64), would generally lead to eight constraints on the data, $g_i$, $i = 1, 2, \ldots, 24$, where we take $i = 1$ as the lower left node, and we loop counter-clockwise around $\Gamma$ to number the $g_i$ (which, recall, live only on $\Gamma$, since we have no sources/sinks of mass); we shall elucidate this constraint for one spurious mode, the even-even mode (one example is enough!):
$$676(g_1 + g_7 + g_{13} + g_{19}) - 182(g_2 + g_6 + g_8 + g_{12} + g_{14} + g_{18} + g_{20} + g_{24}) + 52(g_3 + g_5 + g_9 + g_{11} + g_{15} + g_{17} + g_{21} + g_{23}) - 26(g_4 + g_{10} + g_{16} + g_{22}) = 0, \qquad (3.13\text{-}73)$$
a spurious constraint that precludes the existence of a solution to (3.13-58) unless the imposed BC's satisfy it. And there are six others! (Not seven because it turns out that the CB-mode is innocuous in that its solvability constraint equation is always and automatically satisfied for all possible BC's. On the other side of the ledger, though, is the concomitant fact that there are no BC's that can preclude the CB-mode's existence, and the $A$-matrix is thus always singular.) All are spurious except that from the global mass balance constraint, $\sum_{j=1}^{24} g_j = 0$. {From Appendix 1, 'Element Matrices,' we select two entries from the $g$-vector for the above 6 x 6 mesh for display and further elucidation, $g_4$ (at the bottom center) and $g_8$ (on the right side, one node up): $g_4 = [2h(u_5 - u_3) - l(v_3 + 4v_4 + v_5)]/12$ and $g_8 = [h(u_7 + 4u_8 + u_9) + 2l(v_9 - v_7)]/12$.
The combinations of sums of normal velocities and differences of tangential velocities in $P_H^T g = 0$ are evident; the normals will 'sum up,' and the 'tangentials' will cancel in $\sum g_i$ to yield the appropriate global mass balance constraint.} We leave the explicit construction of the six spurious constraint equations to the interested reader. It thus becomes abundantly clear why equal-order interpolation has not been very popular; in addition to a multi-dimensional null space and polluted pressures, very few otherwise well-posed problems with inhomogeneous Dirichlet BC's would even have solutions.

Additional remarks:

(1) If other BC's (NBC's) are employed on some parts of the domain with Dirichlet on the remainder, the number of spurious modes is reduced. Dirichlet data are the worst case, unless of course they are homogeneous; u = 0 on Γ obviously satisfies all of the pressure mode constraints.

(2) For a vortex shedding simulation (Gresho et al., 1984a) with the following BC's, $f_n$ and v specified at inlet, $f_n$ and $f_\tau$ specified at outlet, and u specified laterally, the number of pressure modes (all spurious, since $P_H$ cannot exist with $f_n$ specified on any portion of Γ) observed was only three [and this was only determined indirectly by monitoring pivots; a zero eigenvalue generates a (machine) zero pivot during Gaussian elimination].

(3) In case one wishes to compute with $Q_1Q_1$ anyway, we list below the BC's, applied on some part of Γ, that will eliminate each (except the omnipresent-but-innocuous CB-mode) of the spurious pressure modes (of course, as already mentioned, restriction to contained flows with u = 0 on Γ will always cause all constraint equations to be satisfied):
(i) The ξ-wave is eliminated via a normal NBC on part of the x-boundary (e.g., along x = L).
(ii) The η-wave is similarly precluded by a normal NBC somewhere along y = H (for example).
(iii) All four modes from (3.13-68) are eliminated via application of a tangential NBC on a portion of Γ (along an x- or a y-boundary).
[Thus, the three modes remaining in the vortex shedding simulation referred to above are: (a) the η-wave, (b) the CB-mode, (c) a mystery mode; after all, the theory above is limited to rectangular elements, not distorted isoparametric ones. Perhaps we mis-identified one via numerical pivot monitoring.]

(4) It is interesting (and coincidental) that certain spectral method approximations are also cursed with a seven-dimensional spurious null space (also in 2D); see, for example, Bernardi et al. (1990), Canuto et al. (1988b), and Schumack et al. (1991). They somehow seem to cope with it in ways that we in finite elements have not discovered; i.e., they use it anyway, whereas we run away from it.

(5) It is sad but true that perturbing a single node in the above example can change the results (number of modes, etc.) drastically, further emphasizing the futility of the situation.

o Discontinuous pressure. Having over-exposed (perhaps) the pressure mode problem with equal-order interpolation, let us return to the 'single CB' (2D) element, $Q_1Q_0$, which we still use and advocate, and attempt to make it more 'palatable' than we did in our original publication on the subject (Sani et al., 1981a). There we summarized a several-year effort that probably exposed more problems than solutions (with $Q_1Q_0$ especially), with the not surprising result that that paper turned out to be somewhat morose and thus helped form a 'doom and gloom' attitude among many who were already or who might have become 'friends of FEM.' We apologize for this. The state of the art back then was still somewhat primitive, and changing rapidly. Since that time, much has been learned, and while there are still a few cautionary measures that need to be understood by the user of this element, they are not all that difficult to master (we show you how) nor are they overly restrictive. In fact, at LLNL, all of our new codes since the 'CB paper' have been built around $Q_1Q_0$ (both 2D and, mostly, 3D). The current atmosphere surrounding this element is (or, at least, should be), in our opinion, much more upbeat than it was back then and than it was even in the recent book by Gunzburger (1987); and we also have a new reason for this optimism (in addition to mega computer minutes of good experience): in Section 3.13.5 we present a new convergence proof in which the velocity converges at the optimal rate under 'all conditions' (not just 'special' meshes), and, although only 'proven' experimentally, so do the (filtered when necessary) pressures.

We also take this opportunity to correct some 'typos' in Sani et al. (1981a). Part 1: (i) $AM^{-1}g$ should be $ACM^{-1}g$ in (4d), (ii) the last + sign in (24) should be −, (iii) $C^T$ and $C$ should be switched in the equation below (A8); Part 2: (i) $D_1 = D_4$, $D_3 = D_2$ on p. 175 should be $D_1 = D_3$, $D_4 = D_2$, (ii) the 80 spurious modes on p. 175 should be 8, (iii) insert / before $l_N$ in the last equation on p. 178, (iv) the × in (58) should be −, (v) remove the limit (lim) from (65), (vi) the last 0.625 should be 0.626 in Table III, (vii) remove the extra ) in the equation in the middle of p. 194, (viii) on p. 199, if the inconsistent case is suspected (by the pivot test) in item 3, print WARNING—MAY BE ILL-POSED before replacing the pivot by a 'suitably large value ($10^{10}$, say).'

The principal issue for $Q_1Q_0$ in 2D is the sometime presence of a CB-mode (both pure and impure, which latter we shall define later) and its effect on solvability of the problem, the velocity accuracy, the pressure accuracy, and the effects of mesh refinement. There is also, on simple meshes, the issue of other 'modes' besides the CB that are more trouble theoretically than practically, and we shall also return to these later (Section 3.13.5k). But before launching into the negative aspects associated with the CB-mode (modes in 3D), let us remind the reader (again) that $Q_1Q_0$ has a particularly intuitively attractive property, the results of which seem to show up in practice: just as the continuum equations satisfy one (vector) momentum equation and one divergence-free constraint equation at each point in the domain, so too does this element satisfy one momentum equation and one divergence-free constraint 'at' each element in the domain; i.e., in spite of CB's, there is a nearly optimal balance of equations and constraints, including element-level mass conservation.

The first thing to realize about $Q_1Q_0$ is the generalization of the simple CB-equation, $P_{ij} = (-1)^{i+j}$, discovered by Fortin (1972a,b) and analyzed by Sani et al. (1981a), and its 3D extension: on a mesh of rectangular elements (or even parallelograms, actually), the 2D CB of (3.13-66) generalizes to
$$P_{ij} = (-1)^{i+j}/A_{ij}, \tag{3.13-74}$$
where $A_{ij}$ is the area of element $i, j$, and we are using the somewhat unconventional notation (in the FEM world) that associates $i$ with element 'rows' and $j$ with element 'columns' in the mesh. An equivalent alternate description of the CB-eigenvector is
$$P_{CB}|_k = \pm 1/A_k, \tag{3.13-75}$$
where $P_{CB}|_k$ is the k-th element of the CB-null vector, $A_k$ is the area of element k (the elements now being numbered more conventionally, from 1 to m), and the + sign applies to (say) a 'black' element and the − sign to a 'red'. The important thing is that $C_x P = 0$ and $C_y P = 0$, where P is the CB-eigenvector of (3.13-75).

The 3D extension of the CB-mode leads to CB-modes (many), at least for simple, brick-shaped elements; there can be (depending, as always, on BC's) a 2D mode (nonzero entries only in a particular 'plane' of elements) in each 2D plane of elements (except one) for each of the three 2D planes (x-y, x-z, y-z) and one fully 3D CB-mode (a $2\Delta x \times 2\Delta y \times 2\Delta z$ 'wave'). The 'cartesian' description of these, à la (3.13-66), is
$$P_{i,j,k} = (-1)^{i+j} \quad \text{for } k = 1, 2, \ldots, N_z - 1, \tag{3.13-76}$$
$$P_{i,j,k} = (-1)^{i+k} \quad \text{for } j = 1, 2, \ldots, N_y - 1, \tag{3.13-77}$$
$$P_{i,j,k} = (-1)^{j+k} \quad \text{for } i = 1, 2, \ldots, N_x - 1, \tag{3.13-78}$$
and
$$P_{i,j,k} = (-1)^{i+j+k}, \tag{3.13-79}$$
which applies (only) to a mesh of uniform bricks with $N_x \times N_y \times N_z$ elements. This mesh can support a maximum of $N_x + N_y + N_z - 3$ 2D CB-modes, one 3D CB-mode, and one hydrostatic mode, for a total null space dimension of $N_x + N_y + N_z - 1$, with an equal number of extraneous/redundant continuity equations and an equal number of BC constraint equations, all but one of which are spurious. As in 2D, if the 3D mesh is not geometrically regular, it is usually the case that all spurious redundancies vanish because all of the continuity equations are then required.
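The null-space count $N_x + N_y + N_z - 1$ quoted above can be checked numerically on small meshes. The following is a minimal sketch, not the authors' code: it assumes uniform bricks, trilinear velocity, piecewise-constant pressure, and Dirichlet velocity on the whole boundary (so only interior nodes carry unknowns). The function names are hypothetical. Each row of `D` is the element integral of the divergence, obtained from face averages of the nodal velocities.

```python
import itertools
import numpy as np


def q1q0_divergence_3d(nx, ny, nz, hx=1.0, hy=1.0, hz=1.0):
    """Element-integrated divergence for trilinear velocity on uniform bricks.

    Rows: elements. Columns: [u at all nodes, v at all nodes, w at all nodes].
    """
    nny, nnz = ny + 1, nz + 1
    n_nodes = (nx + 1) * nny * nnz
    face_area = (hy * hz, hx * hz, hx * hy)
    D = np.zeros((nx * ny * nz, 3 * n_nodes))
    elements = itertools.product(range(nx), range(ny), range(nz))
    for e, (i, j, k) in enumerate(elements):
        for corner in itertools.product((0, 1), repeat=3):
            a, b, c = corner
            n = ((i + a) * nny + (j + b)) * nnz + (k + c)
            for comp in range(3):
                # +face average minus -face average, times the face area
                sign = 2 * corner[comp] - 1
                D[e, comp * n_nodes + n] = sign * face_area[comp] / 4.0
    return D


def interior_velocity_dofs(nx, ny, nz):
    """Column indices of the velocity unknowns left free by all-Dirichlet BC's."""
    nny, nnz = ny + 1, nz + 1
    n_nodes = (nx + 1) * nny * nnz
    nodes = [(i * nny + j) * nnz + k
             for i in range(1, nx) for j in range(1, ny) for k in range(1, nz)]
    return [comp * n_nodes + n for comp in range(3) for n in nodes]


def pressure_nullity(nx, ny, nz):
    """Dimension of the pressure null space (hydrostatic plus CB-modes)."""
    D = q1q0_divergence_3d(nx, ny, nz)
    G = D[:, interior_velocity_dofs(nx, ny, nz)].T  # discrete gradient
    return nx * ny * nz - int(np.linalg.matrix_rank(G))


for shape in [(2, 2, 2), (3, 3, 3), (4, 3, 2)]:
    print(shape, pressure_nullity(*shape), sum(shape) - 1)
```

The two numbers printed for each mesh should agree. Distorting the mesh is outside the scope of this sketch, since the face-average formula holds only for bricks.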
The generalization to a mesh of variable-sized bricks is as follows, after mentally 'coloring' the elements 'red/black' on the 3D mesh: (3.13-76) is replaced by $P_{CB}|_k = \pm 1/(\Delta z\, A_k)$, where $A_k$ is the area-projection of the volume, $V_k = \Delta z\, A_k$, $k = 1, 2, \ldots, N_z$, of the k-th element onto the xy-plane, and $\Delta z$ is the thickness of this plane of elements; analogously, (3.13-77) and (3.13-78) are replaced by the other 2D 'area' eigenvectors; finally, (3.13-79) goes over to $P_{CB}|_k = \pm 1/V_k$, where $V_k$ is the volume of element k, $k = 1, 2, \ldots, m = N_x \times N_y \times N_z$.

To close this portion of the CB-mode description, we mention that D. Griffiths has managed to find the most general 'CB-mesh' (although somewhat 'esoteric' in that the probability of generating such a mesh with conventional mesh-generating programs is probably nearly zero), which can support the 2D CB-mode (3D is still 'open'), of which the rectangular case discussed herein is a special (degenerate) case. This general case is described in Sani et al. (1981a), to which we refer the interested reader.

It is unfortunate, of course, to be saddled with such a massive mess of mesh-dependent spurious pressure modes, and perhaps one would be well advised to utilize other elements, as indeed many have done. But, as we will try to demonstrate, there are enough advantages to the $Q_1Q_0$ element, at least in 3D, to justify its sustained use. Also, we will show that it is not really too difficult to adapt to 'life with modes,' especially after we show how easy it is to accommodate them and to filter them.

o Solvability issues, symmetric case. Having described the CB-eigenvectors, let us now demonstrate some of their effects for $Q_1Q_0$, simultaneously with a demonstration of the effects of the physical mode, $P_H$, by examining a small 'piece' of a 2D mesh; shown in Figure 3.13-5 is an 11-element (five black, six red) segment of what can be construed to be a much larger mesh of $Q_1Q_0$ elements, in which the arrows 'represent' the CB-constraint equation ('out of the black and into the red': an accountant's nightmare, although T.J. Hughes has called $Q_1Q_0$ 'a million dollar element', personal communication) to be derived below.

Fig. 3.13-5 The CB mode affects the velocities.

We begin with the important observation that the solvability conditions represented by (3.13-64) are, in fact, also derivable by forming the appropriate linear combinations of continuity equations, from (3.13-58), such that all coefficients of internal velocities (in from the boundary) cancel identically. It is no accident that the 'appropriate' linear combination is just that obtained by setting the scalar product of the null eigenvector with the vector of the continuity equation LHS's, $C^T u$, to zero. But the additional important observation is that these 'inner products' and related results apply equally to any internal 'boundary,' or cut, through the same mesh used to obtain (3.13-64); i.e., for any subset of $m_s$ elements, the equation generated by
$$\sum_{j=1}^{m_s} (C^T u)_j\, q_j = 0,$$
where $q_j$ is any pressure mode (here $P_H$ or $P_{CB}$), generates an equation, satisfied by the solution of (3.13-58) whenever a solution exists, in which any nodes that can be identified as 'internal' are no longer present. (Obviously, in order to have internal nodes, the set $m_s$ must form a closed curve, following element boundaries, of course, within the domain.) Returning now to the above sketch, we first apply the above equation with $q = P_H$ to obtain $\sum_{j=1}^{11} (C^T u)_j = 0$, which leads to
$$(u_5 - u_1)h_1 + (u_{10} - u_6)(h_1 + h_2) + u_{15}(h_2 + h_3) - u_{11}h_2 + (u_{19} - u_{16} - u_{12})h_3 + (v_{11} + v_{12} - v_1)l_1 + v_{16}l_2 - v_2(l_1 + l_2) + (v_{17} - v_3)(l_2 + l_3) + (v_{18} - v_4)(l_3 + l_4) + (v_{19} - v_5)l_4 = 0, \tag{3.13-80}$$
which, more or less obviously, is a statement of 'global' mass conservation over the selected subdomain, $\oint_\Gamma \mathbf{n} \cdot \mathbf{u} = 0$. Similarly, but with $q = P_{CB}$, is obtained
$$\sum_{\text{RED}} (C^T u)_j / A_j = \sum_{\text{BLACK}} (C^T u)_i / A_i, \tag{3.13-81}$$
whose generality and importance we wish to emphasize, even though it is 'but another' linear combination of continuity equations, which always eliminates internal nodes (providing $m_s$ is as above, which of course it always is if $m_s = M$, the total number of elements). For the selected subdomain, (3.13-81) leads to:
$$-\frac{u_1}{l_1} + u_2\left(\frac{1}{l_1} + \frac{1}{l_2}\right) - u_3\left(\frac{1}{l_2} + \frac{1}{l_3}\right) + u_4\left(\frac{1}{l_3} + \frac{1}{l_4}\right) - \frac{u_5}{l_4} - \frac{v_1}{h_1} + \frac{v_5}{h_1} - v_{10}\left(\frac{1}{h_1} + \frac{1}{h_2}\right) + v_{15}\left(\frac{1}{h_2} + \frac{1}{h_3}\right) - \left(\frac{u_{19}}{l_4} + \frac{v_{19}}{h_3}\right) + u_{18}\left(\frac{1}{l_3} + \frac{1}{l_4}\right) - u_{17}\left(\frac{1}{l_2} + \frac{1}{l_3}\right) + \frac{u_{16}}{l_2} - \frac{v_{16}}{h_3} - \frac{u_{12}}{l_1} + \frac{v_{12}}{h_3} + \frac{u_{11}}{l_1} - \frac{v_{11}}{h_2} + v_6\left(\frac{1}{h_1} + \frac{1}{h_2}\right) = 0, \tag{3.13-82}$$
which is, again more or less obviously, an equation relating the tangential velocity components on the boundary of the subdomain. Again, for emphasis, any solution of (3.13-58) will (must) satisfy both (3.13-80) and (3.13-82). To help see what the CB-constraint means (it clearly does not approximate $\oint_\Gamma \boldsymbol{\tau} \cdot \mathbf{u} = 0$), let us simplify to the case of a uniform mesh to obtain
$$(u_2 - u_1) - (u_3 - u_2) + (u_4 - u_3) - (u_5 - u_4) - (v_{10} - v_5) + (v_{15} - v_{10}) - (v_{19} - v_{15}) - (u_{19} - u_{18}) + (u_{18} - u_{17}) - (u_{17} - u_{16}) - (u_{12} - u_{11}) - (v_{16} - v_{12}) + (v_6 - v_1) - (v_{11} - v_6) = 0,$$
which, if Taylor series analysis dare be applied, generates nothing but $\partial u/\partial x + \partial v/\partial y = 0$; no surprise, really, since each of the 11 continuity equations also represents $\nabla \cdot \mathbf{u} = 0$. Thus, it is clear (hopefully) that this CB-'constraint' equation (and the many more in 3D) is not really terribly 'evil' (although spurious) and poses absolutely no barrier to convergence with mesh refinement. What these equations do pose, when applied to the full mesh with inhomogeneous Dirichlet BC's, are spurious constraint equations on the allowable, specified, tangential velocities; but even then they cannot be too 'deleterious'; i.e., their satisfaction is easy to obtain and does not really do much 'damage' to the specification of any particular problem, as we shall soon show.
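The two observations above, that internal nodes drop out of both combinations and that the CB-combination sees only tangential boundary velocities, can be checked on a full rectangular mesh. The sketch below is an illustration under stated assumptions, not the authors' code: the mesh is a non-uniform tensor-product grid of rectangles (not the 11-element subdomain of Fig. 3.13-5), the element-integrated divergence stands in for $(C^T u)_e$ up to normalization, and `q1q0_divergence_2d` and `constraint_coefficients` are hypothetical names.

```python
import numpy as np


def q1q0_divergence_2d(lx, hy):
    """Element-integrated divergence for bilinear velocity on rectangles.

    lx: element widths (index i), hy: element heights (index j).
    Rows: elements e = i*ny + j. Columns: [u at all nodes, v at all nodes].
    """
    nx, ny = len(lx), len(hy)
    n_nodes = (nx + 1) * (ny + 1)
    D = np.zeros((nx * ny, 2 * n_nodes))
    for i in range(nx):
        for j in range(ny):
            e = i * ny + j
            for a in (0, 1):
                for b in (0, 1):
                    n = (i + a) * (ny + 1) + (j + b)
                    D[e, n] = (2 * a - 1) * hy[j] / 2.0            # u coefficient
                    D[e, n_nodes + n] = (2 * b - 1) * lx[i] / 2.0  # v coefficient
    return D


def constraint_coefficients(p, lx, hy):
    """Coefficients of nodal u and v in sum_e p_e (C^T u)_e, as 2D arrays."""
    nx, ny = len(lx), len(hy)
    c = np.asarray(p, dtype=float).reshape(nx * ny) @ q1q0_divergence_2d(lx, hy)
    cu, cv = c.reshape(2, nx + 1, ny + 1)
    return cu, cv


lx = np.array([1.0, 2.0, 0.5, 1.5])
hy = np.array([1.0, 3.0, 0.7])
area = np.outer(lx, hy)
i, j = np.indices(area.shape)
p_h = np.ones_like(area)           # hydrostatic mode
p_cb = (-1.0) ** (i + j) / area    # CB-mode, as in (3.13-74)

cu_h, cv_h = constraint_coefficients(p_h, lx, hy)
cu_cb, cv_cb = constraint_coefficients(p_cb, lx, hy)

# Internal nodes cancel in both combinations.
print(np.abs(cu_cb[1:-1, 1:-1]).max(), np.abs(cv_h[1:-1, 1:-1]).max())
# On the bottom edge (away from corners): P_H keeps only the normal velocity v,
# P_CB keeps only the tangential velocity u.
print(cu_h[1:-1, 0], cv_h[1:-1, 0])
print(cu_cb[1:-1, 0], cv_cb[1:-1, 0])
```

On the bottom edge the CB coefficients of u come out as $\pm\tfrac12(1/l_{i} + 1/l_{i+1})$, the same pattern of reciprocal element lengths that appears in (3.13-82).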
As a final remark relating to the above grid (which, of course, generalizes easily), we leave the following exercises to the reader: (i) sum the continuity equations for those elements immediately surrounding element 7, (ii) apply the CB-constraint equation to these same elements, then in each resulting equation, set to zero all velocities except those on element 7; the result from (i) will be $(C^T u)_7 = 0$, the continuity equation for element 7, and that from (ii) will be $(C^T u)_7/A_7 = 0$, where $A_7$ is the area of element 7.

We now move on to consider other BC's than Dirichlet. (Recall that the 'worst case' for pressure modes is that with inhomogeneous Dirichlet BC's.) Focusing first on normal BC's, we note that if normal 'traction' BC's, à la (3.8-17) or (3.10-7), are employed on any portion of Γ, even at just a single point, then the C-matrix no longer contains $P_H$ in its null space, and there is no longer a redundant continuity equation, a statement that is true for any element. This is because a normal traction BC 'looks like' a Dirichlet BC for the pressure, a fact also reflected in the continuum equations, cf. (3.8-38). Only when $\mathbf{n} \cdot \mathbf{u}$ is specified on all of Γ is $P_H$ present (and the pressure then only determinable up to an additive constant, which 'constant' can actually be an arbitrary function of time). So much for the physical pressure mode.

For the non-physical modes, it is probably no surprise by now that their presence (or absence) is related to tangential BC's on velocity, and the first important point to be made is this: when (3.13-58) corresponds to either potential flow or to the acceleration and pressure for NS, there are no spurious pressure modes because there are no tangential Dirichlet BC's! Only the Stokes (or NS) equations legitimately permit the application of essential tangential BC's, and only they can lead to pressure modes. (Later, however, we shall admit to applying tangential BC's anyway, sometimes illegitimately and sometimes not.) So, for the time being, suppose we are dealing with steady Stokes flow, which definitely always requires the application of tangential BC's.
For the $Q_1Q_0$ element in 2D, it follows (like that in the normal direction for $P_H$) that the use of a natural/'traction' BC, à la (3.8-18) or (3.10-6), a shear stress BC, on any portion of Γ, again even at a single node, precludes the existence of $P_{CB}$ in the null space of C. And with $P_{CB}$ goes the redundancy; i.e., when $P_{CB}$ is precluded by BC's, then there is no redundancy in the continuity equations. The 3D case, with its many modes, follows easily from 2D, at least for 'simple' domains: (i) if a tangential BC of the traction type is applied (in the proper direction) at any point on the boundary of the 'ring' of elements comprising one of the 2D pressure modes, of which there are $N_x + N_y + N_z - 3$, then this mode will no longer exist; (ii) for the single, 3D mode, a tangential traction BC (in both directions) at one or more points on Γ will preclude it. What this boils down to in practice is this: NBC's on one of the two bounding planes in two of the three cartesian directions will preclude all 'CB'-pressure modes; i.e., the maximum, tangential, Dirichlet case without CB-modes is that with tangential velocity specified on four of six bounding planes; any more makes modes, mostly the 2D type. We leave as an exercise the generalization of the CB-constraint equation to a segment of a 3D domain, and simply state that in our experience, it has never been necessary to actually explicate these relations. Another situation which is more or less guaranteed to preclude all 3D pressure modes is that wherein complex geometry and/or unstructured meshes of distorted isoparametric elements are employed; i.e., for just those problems that are the 'raison d'être' of the FEM.

We now summarize some additional linear algebra issues related to pressure modes (for any element) and their redundant constraints:

1. If the applied velocity BC's do not duplicate a pressure-mode boundary-constraint equation, then this mode (its null vector, and its zero eigenvalue) will be absent, as are related solvability issues. The velocity solution will then satisfy the constraint equation.

2. If the applied BC's do duplicate the constraint equation for a particular pressure mode, then this mode will be present (in the null space of C) as will its associated zero eigenvalue and resulting non-unique pressure solution and redundant continuity equation. The system can be said to be (and is) over-specified but consistent (consistent singular).

3. If the applied BC's violate the boundary constraint equation associated with/implied by any particular pressure mode, spurious or not, then the problem is ill-posed, and no solution exists.

Next we present two simple examples of $Q_1Q_0$ pressure modes and their interactions with BC's, first for $P_H$ and then for $P_{CB}$. Consider the mesh and BC's in Figure 3.13-6, a sort of forced transition from slippery plug flow to Poiseuille flow (not a recommended problem in practice, except as a sort of 'wiggle experiment' if Re ≫ 1 and the grid is not 'refined' near x = L). This mesh will display a hydrostatic mode and a CB-mode, with the constraint equation from the hydrostatic mode [(3.13-31)] being $6u_0 = u_1 + u_2 + u_3 + u_4 + u_5$, and no constraint from $P_{CB}$ (why?). Any values of $u_1$ through $u_5$ (at the five 'exit' nodes) that sum to $6u_0$ generate a well-posed problem; any that violate it do not. If we want a parabolic (fully developed) exit flow, we might try $f(y) = Ay(6 - y)$, where we have assumed a channel of height six, and A is to be determined. Thus, $u_1 = u_5 = f(1)$, $u_2 = u_4 = f(2)$, and $u_3 = f(3)$ to give $6u_0 = A(2 \times 1 \times 5 + 2 \times 2 \times 4 + 3 \times 3) = 35A$, which gives a well-posed problem only for $A = 6u_0/35$, not $A = u_0/6$, which is the appropriate parabola amplitude for the continuous problem. This is but a simple example of a general and very important result: the application of the interpolated value of $\mathbf{n} \cdot \mathbf{u}$ on Γ from a well-posed continuous problem is generally not a well-posed discrete problem; it is the discrete mass balance condition that must prevail.
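The arithmetic of this example is easy to reproduce exactly. The sketch below is an illustration only; the function name is hypothetical, and the generalization to a channel of `n_elem` unit-height elements (discrete balance $H u_0 = \sum_i u_i$ over the interior exit nodes, with $H$ = `n_elem`) is an assumption that reduces to the text's $6u_0 = u_1 + \cdots + u_5$ for six elements.

```python
from fractions import Fraction


def parabola_amplitudes(n_elem=6, u0=Fraction(1)):
    """Amplitude A of f(y) = A*y*(H - y) at the exit, H = n_elem, unit elements.

    Returns (A from the discrete mass balance, A from the continuous one).
    """
    H = n_elem
    nodal_sum = sum(y * (H - y) for y in range(1, H))      # sum of f(y_i)/A
    a_discrete = Fraction(H) * u0 / nodal_sum              # H*u0 = A * nodal_sum
    a_continuous = Fraction(H) * u0 / Fraction(H ** 3, 6)  # integral of y(H-y) is H^3/6
    return a_discrete, a_continuous


a_d, a_c = parabola_amplitudes()
print(a_d, a_c, a_d / a_c)   # 6/35 1/6 36/35
```

The ratio of the two amplitudes measures how much the interpolated continuous parabola must be scaled up before it satisfies the discrete mass balance.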
(Because the integral of a piecewise-linear interpolant of a parabola is less than that of the integrated parabola, the discrete parabola must be scaled up accordingly, in this case by the factor 36/35.)

Fig. 3.13-6 From plug flow to Poiseuille flow. [Boundary labels: u = v = 0 on the walls; u = $u_0$, v = 0 at the inlet; u = f(y), v = 0 at the exit.]

The next example involves $P_{CB}$, but not $P_H$, and is the (in)famous lid-driven cavity (LDC), as first discussed in Sani et al. (1981a). Application of the CB-constraint equation, (3.13-81), to the problem described in Figure 3.13-7 (with u = v = 0 everywhere on Γ not on the top lid, except at one node where $f_n = 0$ is applied to preclude $P_H$) yields different results for an even or odd number of nodes across the top of the cavity.

Fig. 3.13-7 The top of a lid-driven cavity.

For N even, the CB-constraint equation gives
$$\frac{u_1}{l_1} - u_2\left(\frac{1}{l_1} + \frac{1}{l_2}\right) + u_3\left(\frac{1}{l_2} + \frac{1}{l_3}\right) - \cdots + u_{N-1}\left(\frac{1}{l_{N-2}} + \frac{1}{l_{N-1}}\right) - \frac{u_N}{l_{N-1}} = 0, \tag{3.13-83}$$
whereas N odd leads to
$$\frac{u_1}{l_1} - u_2\left(\frac{1}{l_1} + \frac{1}{l_2}\right) + u_3\left(\frac{1}{l_2} + \frac{1}{l_3}\right) - \cdots - u_{N-1}\left(\frac{1}{l_{N-2}} + \frac{1}{l_{N-1}}\right) + \frac{u_N}{l_{N-1}} = 0. \tag{3.13-84}$$
The applied tangential velocity must satisfy (3.13-83) or (3.13-84) in order that the algebraic system be well-posed. If the simulation is the (simpler) 'leaky' LDC, then typically $u_i = u_0$ = constant, for all i, and there is 'no problem': each $1/l_i$ term in the above equations cancels with its neighbor (in pairs) with the result that the LHS = 0, too, giving well-posed problems [$q^T g = 0$ in (3.13-64)]. But now consider the tougher problem of a non-leaky ('water-tight') cavity that is realized by setting $u_1$ and $u_N$ to zero, and the solvability issue is no longer trivial. To get specific, take N = 6 in (3.13-83) and N = 7 in (3.13-84); the former gives
$$-u_2\left(\frac{1}{l_1} + \frac{1}{l_2}\right) + u_3\left(\frac{1}{l_2} + \frac{1}{l_3}\right) - u_4\left(\frac{1}{l_3} + \frac{1}{l_4}\right) + u_5\left(\frac{1}{l_4} + \frac{1}{l_5}\right) = 0,$$
and the latter gives
$$-u_2\left(\frac{1}{l_1} + \frac{1}{l_2}\right) + u_3\left(\frac{1}{l_2} + \frac{1}{l_3}\right) - u_4\left(\frac{1}{l_3} + \frac{1}{l_4}\right) + u_5\left(\frac{1}{l_4} + \frac{1}{l_5}\right) - u_6\left(\frac{1}{l_5} + \frac{1}{l_6}\right) = 0.$$
If now we set $u_i = u_0$ in the above two equations, which is the most common LDC BC, they simplify to $u_0(-1/l_1 + 1/l_5) = 0$ in the even case and to $-u_0(1/l_1 + 1/l_6) = 0$ in the odd case. [The general results, obviously, are $-u_0(1/l_1 - 1/l_{N-1}) = 0$ for the first case, and $-u_0(1/l_1 + 1/l_{N-1}) = 0$ for the second.] Now we see before us the first real problem with the spurious CB-constraint: the ramp-up-over-one-element BC cannot be used when the number of nodes across the top is odd; the CB-constraint cannot then be satisfied, and the linear system is inconsistent. (In the even case, we can solve the problem if and only if we take $l_1 = l_{N-1}$.) But forewarned is forearmed; i.e., our knowledge of the CB-constraint equation permits us to deal with it, easily and effectively, when it poses a potential stumbling block.
In this case, the solution is either: (i) stay with N even or (ii) ramp up over two elements, the latter solution being left as an exercise (or see Malkus's discussion in the Appendix to Chapter 4 in Hughes (1987)). Also left to the reader is the analogous situation for the 3D LDC.
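The lid constraint is simple enough to evaluate directly for any proposed set of lid velocities. The sketch below is an illustration, not the authors' code: `cb_lid_constraint` is a hypothetical name, the sign pattern is the alternating one of (3.13-83) and (3.13-84), and the symmetric half-velocity ramp shown at the end is only one possible reading of the 'ramp up over two elements' exercise, checked here on a uniform mesh.

```python
def cb_lid_constraint(u, l):
    """Left-hand side of the lid CB-constraint, (3.13-83) or (3.13-84).

    u: tangential lid velocities at nodes 1..N (stored 0-based),
    l: lengths of the N-1 elements along the lid.
    """
    n = len(u)
    if len(l) != n - 1:
        raise ValueError("need exactly one element length between adjacent nodes")
    total = 0.0
    for i in range(n):
        left = 1.0 / l[i - 1] if i > 0 else 0.0    # element to the left of node i+1
        right = 1.0 / l[i] if i < n - 1 else 0.0   # element to the right of node i+1
        total += (-1) ** i * u[i] * (left + right)
    return total


u0 = 1.0
# Leaky cavity: constant lid velocity, any element lengths.
print(cb_lid_constraint([u0] * 6, [1.0, 2.0, 3.0, 2.0, 4.0]))
# Water-tight, N = 6 (even): solvable only if l_1 = l_5.
print(cb_lid_constraint([0, u0, u0, u0, u0, 0], [1.0, 2.0, 3.0, 2.0, 1.0]))
print(cb_lid_constraint([0, u0, u0, u0, u0, 0], [1.0, 2.0, 3.0, 2.0, 4.0]))
# Water-tight, N = 7 (odd), one-element ramp: never solvable.
print(cb_lid_constraint([0, u0, u0, u0, u0, u0, 0], [1.0] * 6))
# Water-tight, N = 7, a two-element ramp on a uniform mesh (assumed ramp shape).
print(cb_lid_constraint([0, u0 / 2, u0, u0, u0, u0 / 2, 0], [1.0] * 6))
```

A nonzero result means the specified lid data violate the constraint and the linear system is inconsistent; a zero result means the data are admissible.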
It may occur to some (as it did to us) to attempt to 'solve' the ill-posed problem in the following way: simply peg two pressures (one on a black element and one on a red); i.e., set the pressures as Dirichlet BC's (value immaterial) and omit the corresponding two continuity equations, with the result being that the matrix is no longer singular, and thus the solvability equation, $u_0(1/l_1 + 1/l_{N-1}) = 0$, is vanquished, on the premise that we had two redundant continuity equations anyway so that all will now be well. That such a trick is invalid follows from the realization that the two constraint equations (one from $P_H$ and the other from $P_{CB}$) still apply to the remaining elements in the mesh, with results that look as follows: (i) $a_1 (C^T u)_1 + a_2 (C^T u)_2 = 0$ from the hydrostatic constraint, where $(C^T u)_1$ is the continuity equation on the first omitted element, $(C^T u)_2$ is that on the second, and $a_1$ and $a_2$ are known scalars; (ii) $b_1 (C^T u)_1 + b_2 (C^T u)_2 = u_0(1/l_1 + 1/l_{N-1})$ from the CB-constraint equation, where $b_1$ and $b_2$ are known scalars. Thus, the violation of the original CB-solvability equation when no pressures are pegged shows up as a loss of mass conservation on the two selected elements. It thus also follows that in a well-posed problem, such a procedure is perfectly legal: the RHS of the CB-constraint equation above would then be zero, with the result that both elements will display conservation of mass.

We are almost, but not quite, finished with our discussion of (pure) pressure modes for $Q_1Q_0$. Still to come are discussions of the following: filtering the spurious pressures (including smoothing and grid smoothing), pressure pegging, node-freeing, and, finally, the nasty and hard-to-predict/analyze impure CB-modes.

o Filtering the checkerboard mode.
While the final details and suggested procedures are deferred to Chapter 4, we state here for the record the simple filter that works well: if n elements share velocity node i, then the CB-filter equation below produces a smooth pressure at this same node, which could then, if desired, be confidently interpolated via the velocity shape functions:
$$P_i = \sum_{j=1}^{n} P_j^e\, \Omega_j^e \Big/ \sum_{j=1}^{n} \Omega_j^e, \tag{3.13-85}$$
where $P_j^e$ is the (piecewise-constant) pressure on element j and $\Omega_j^e$ is the size of the same element ($\Omega_j^e = A_j^e$ in 2D and $\Omega_j^e = V_j^e$ in 3D). This filter was, of course, derived via our knowledge about $P_{CB}$; i.e., from (3.13-74) through (3.13-79) and the related discussion. If the grid (internal node locations) is 'smoothed' in the 'same' way, by moving the nodes according to (3.13-85), a more-accurate-yet pressure is obtained; for details, see Section 4.2.9, and for an example, see Figure 17 of Sani et al. (1981a). Finally we remark that the post-processed pressure from a penalty method approximation will also need the same filter.

o Impure pressure modes. Thus far we have discussed pressure modes on 'finite difference' meshes: uniform rectangles (bricks in 3D). This is a worst case with respect to pressure modes, and it turns out that for practical/real-world/complex-geometry simulations, i.e., for those cases where GFEM can really 'shine', the CB-modes are (fortunately) generally absent. There is (usually) no redundant continuity equation (equations in 3D), and thus no pure CB-modes and no zero eigenvalues. So, are we now 'home' with respect to the CB-mode(s)? Close, but not quite; there may exist what were called 'impure modes' by Sani et al. (1981a). Since the time of that publication, which raised a flag of fear regarding the possible consequences of these impure modes, we (and many others, we believe) have accumulated many years of successful experience with $Q_1Q_0$ on a variety of meshes and never 'worry' about impure pressure modes. We simply apply the 'standard' filter, (3.13-85), regardless of the possible existence of pure or impure modes, and are virtually always happy with our pressures. Nevertheless, the impure modes can perhaps cause an occasional difficulty/surprise, so we shall briefly summarize them. At their very worst, they correspond to a singular perturbation problem; i.e., there may be a mesh that supports a pure CB that is only an 'ε-away' (nodal locations) from the chosen mesh. If this happens, the following bizarre behavior could occur: whereas the CB-eigenvalue is only slightly perturbed from zero, $\lambda_{CB} = O(\epsilon^2)$ in fact, the solution (velocity portion, in fact) may be perturbed in a major way, to order 1 in ε, and thus be rather inaccurate. The CB-eigenvector is still present, and with a computable (not arbitrary) and large magnitude, $P_{CB} = O(1/\epsilon)$; see Sani et al. (1981a) for details. Also, $P_{CB}$ need no longer be orthogonal to g; the algebraic problem is well-posed for all g. Although the probability of encountering such a solution is probably very low, it is not zero, and a way to reckon with it would be nice. We summarize two ways in the next two subsections, after mentioning that the conventional filter seems to work well for any impure modes that may exist, and that mesh refinement will always reduce the problem (convergence is still assured).

o The pesky modes of $Q_1Q_0$.
If one studies the literature on $Q_1Q_0$, namely, (at least) Boland and Nicolaides (1984), Girault and Raviart (1986), Brezzi and Fortin (1991), one will see that there are other 'unstable' and 'CB-like' modes besides the pure CB-mode with its zero eigenvalue and the impure modes discussed above: modes with an LBB 'constant' (see next section) that is not a constant, but rather $O(h)$. Although these too are impure modes, they are rather different from those just discussed in that they are present in perfectly regular meshes of square (or rectangular) elements. For this reason, and because they are rarely big enough to really cause trouble, we shall adopt the name coined by D. Silvester and PMG when they first stumbled upon them in 1991-1992: 'pesky modes.' How many there are and how much harm they can do has been, and perhaps still is, an open issue, although some seem to believe that as many as one fourth of the total pressures are in this category. One 'measure' of them, more or less in retrospect, nearly surfaced in our first publications on the subject (Sani et al., 1981a), but did not; i.e., when we tested a reasonably obvious method for removal of the pure/global CB-mode by making it orthogonal to the final (filtered) pressure, it did not 'work' as well as we had hoped; small amounts of 'unexplainable' pressure mode seemed to be still present. The method that should have worked but did not is this: if $P_N$ is the numerical pressure from the NS code and $P_F$ is the to-be-determined filtered pressure in which the CB-mode ($P_{CB}$) is removed, then we can hypothesize that the following procedure would work:
$$P_F = P_N - \beta P_{CB}, \tag{3.13-86}$$
where β is determined so that $P_F$ lies in the orthogonal complement of $P_{CB}$; i.e., setting $P_{CB}^T P_F = 0$ yields
$$\beta = P_{CB}^T P_N / P_{CB}^T P_{CB}, \tag{3.13-87}$$
and (3.13-86) then ostensibly gives a CB-less pressure (still at element centroids). The reason that it left lingering, pesky modes is that there indeed are additional CB-like modes, recently quantified (finally!) by D. Griffiths (see Griffiths and Silvester, 1994, and Griffiths, 1996), which modes (i) are not pure (non-zero eigenvalues, and the velocity is also oscillatory); and (ii) are not vanquished via 'shear stress' BC's on a portion of Γ. We shall describe these modes in more detail later (Section 3.13.5k); suffice it to say here that they are rather smooth modes that are modulated by the roughest of modes, the pure CB-mode, and that the same filter designed for the pure CB-mode also works very well on these pesky modes, which are also, fortunately, rather hard to 'excite' (it seems to require 'rough' data); i.e., even before filtering, their amplitudes are usually small.

o Pressure pegging. Although we are only tentative regarding this trick in 3D, we know it can work well in 2D, and it is this: if you detect an anomalous velocity solution and/or an inordinately large CB-polluted pressure solution that might be caused by an impure mode, try the following: (i) select a region of your problem where you expect $\nabla P$ to be small, (ii) pick one 'red' element and its black neighbor, and peg these pressures at zero for your next run, which (of course) removes the two associated constraint equations from the system. This will trade off very slight [$O(\epsilon)$] mass imbalances on these two elements for a nice regularization of the matrix and thus eliminate any $O(1)$ errors in velocity that may have been present. The magnitude of the CB-part of the pressure will also be small. We conclude by mentioning that this trick could even work well for a pure CB-grid; the velocity solution will be unaffected and the CB-pressure more-or-less minimized. Finally, if the BC's do not support a hydrostatic mode, peg only one pressure rather than two, preferably at zero in a region of the domain where the pressure is close to zero.
Pressure pegging, in general, may sometimes be convenient, and can be invoked in the following ways, in 2D and 3D, after ensuring that the problem is solvable [i.e., that the solvability constraint (3.13-64) is satisfied]: in 2D, peg any red element and one of its black neighbors in the general case (both modes present), and any single element when only one mode is present. In 3D, we consider only the worst case, 'all Dirichlet' (e.g., flow in a 'box' with brick-shaped elements); an easy way to preclude the $N_x + N_y + N_z - 1$ pressure modes is to peg the three columns of elements extending in the three coordinate directions, starting from one corner, i.e., along three orthogonal edges of the box (see Figure 8 in Sani et al., 1981a), and one other (anywhere else).

o Node-freeing. The final CB-trick that we have derived goes in just the other direction. Rather than reducing the size of the system, we increase it, by freeing-up previously specified velocities at Dirichlet boundary nodes, and we again present only the 2D case. We begin by noting that a simulation on any mesh (with any element, in fact) that supports a pure hydrostatic mode can (if well-posed) be solved with no hydrostatic mode and no adverse effects simply by releasing (freeing) the normal component of velocity at any node on $\Gamma$; this causes the NBC to be activated with the result that $P$ will be 'set' to $\approx 0$ at this node, and the hydrostatic constraint equation, (3.13-64), will assure that the velocity solution is unchanged. Transferring this 'theory' to the (pure) CB-mode leads to the same result except that it is one tangential velocity BC that must be released. If it is an impure CB-mode, the same trick should both regularize the matrix [thus precluding $O(1)$ errors in velocity] and cause no more than $O(\epsilon)$ mass imbalance on the selected element. We conclude by noting that releasing $u_\tau$ in favor of setting $f_\tau = 0$ is legitimate with or without an accompanying hydrostatic mode.
Finally, we re-emphasize: pressure pegging or node freeing should rarely be required, in 2D or 3D—we believe.
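The global filter of (3.13-86) and (3.13-87) discussed earlier in this subsection can be sketched in a few lines. The following is only an illustration under stated assumptions: the 4 x 4 grid, the 'smooth' pressure, and the 'modulated' mode (a CB-mode multiplied by a smooth function that is odd about mid-domain) are hypothetical data chosen for the example; the modulated mode is merely a stand-in for a pesky mode, not the modes quantified by Griffiths. It suggests why the projection removes the pure CB-mode exactly and yet can leave a CB-like remnant behind.

```python
def checkerboard(nx, ny):
    """Pure CB-mode: +1 on 'red' cells, -1 on 'black' cells (row-major order)."""
    return [(-1) ** (i + j) for j in range(ny) for i in range(nx)]


def cb_filter(p_num, p_cb):
    """Apply (3.13-86)/(3.13-87): return (beta, P_F) with P_CB . P_F = 0."""
    beta = sum(c * p for c, p in zip(p_cb, p_num)) / sum(c * c for c in p_cb)
    return beta, [p - beta * c for p, c in zip(p_num, p_cb)]


nx = ny = 4
cb = checkerboard(nx, ny)
xs = [i for j in range(ny) for i in range(nx)]       # column index of each cell

# Hypothetical data (assumptions for illustration only):
smooth = [x + 0.5 for x in xs]                        # pressure linear in x
modulated = [(x - 1.5) * c for x, c in zip(xs, cb)]   # CB-mode times a smooth odd function

# 'Numerical' pressure = smooth part + 3 * pure CB + 0.25 * modulated CB
p_num = [s + 3.0 * c + 0.25 * m for s, c, m in zip(smooth, cb, modulated)]

beta, p_filt = cb_filter(p_num, cb)
leftover = [pf - s for pf, s in zip(p_filt, smooth)]

print("beta =", beta)                                 # amplitude of the pure CB-mode
print("largest remnant =", max(abs(v) for v in leftover))
```

In this sketch the pure CB amplitude is recovered, but the modulated mode is orthogonal to $P_{CB}$ and therefore passes through the projection untouched.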
Table 3.13-5 Spurious modes in two 9-node elements.

Element: $Q_2Q_{-1}$
  Spurious modes: One spurious $2\Delta x \times 2\Delta y$ mode ($\xi\eta$-wave) per element; PcB|k = ±VAc on element $k$.
  Comments on rectangles: (i) It is tolerable and filterable; see below. (ii) The CB-mode can be said to 'live' at the four 2 x 2 Gaussian points within each element.

Element: $Q_2^{(8)}Q_{-1}$
  Spurious modes: One $\xi\eta$-wave. One $\xi$-wave. One $\eta$-wave.
  Comments on rectangles: Do not use this element.

o Higher-order elements. So much for $Q_1Q_0$, finally. We now switch gears to consider higher-order quadrilateral elements with spurious pressure modes, but only briefly. In Sani et al. (1981a), the eight-node (serendipity) element and the nine-node (Lagrangian biquadratic) element with discontinuous bilinear pressure were considered and analyzed, with results that are summarized in Table 3.13-5. In Jackson and Cliffe (1981), another 'pressure mode' paper contemporaneous with our own, the following additional numbers of spurious modes were reported for some higher-order elements: Q2 Q2 has five, Q2Q2 has seven, and the Q2Q9 element has two. By adding two more velocity degrees of freedom ($x$- and $y$-derivatives at the centroid), they enhanced the Q2Q2 to what we would call $Q_2^{(11)}Q_2^{(8)}$, with no spurious modes. [They called it an 11-8 element and advocated it at that time; since then, however, they came to prefer the $Q_2P_{-1}$ element, as do we, and many others (K.A. Cliffe, personal communication).]

Returning to the non-leaky LDC and the $Q_2Q_{-1}$ element, the simplest relevant sample problem for which the CB-constraint can 'act up,' we find, since the constraint equation is basically the same as that for $Q_1Q_0$ [see (3.13-81) and its nine-node analog in Sani et al. (1981a)], for a mesh with $N$ elements (even or odd) across the top and $u_1 = u_{2N+1} = 0$,

$$(u_0 - 2u_2)/l_1 = (u_0 - 2u_{2N})/l_N,$$

where $l_1$ and $l_N$ are the lengths of the first and last elements.
Thus, the fix is similar to that of ramping over two elements for $Q_1Q_0$: set $u_2 = u_{2N} = u_0/2$. [It would not work to ramp up to $u_0$ at the first node from the corner; nor would it work to use the 'smooth' parabola discussed, for example, by Carey and McLay (1986) wherein $u_2 = u_{2N} = 0.75\,u_0$.] In 3D, the $Q_2Q_{-1}$ element has not, to our knowledge, been analyzed, but based on our 2D knowledge and on how the single CB-mode in the 2D $Q_1Q_0$ element expands to many in 3D, we are fairly confident that the following remarks are true (for all-Dirichlet BC's, of course):

1. As with $Q_1Q_0$, each 2D wave extends into the third dimension as a constant, leading to the same number of 2D modes as the $Q_1Q_0$ displays when each 3D quadratic element (27-node brick) is replaced by the eight analogous elements (eight-node bricks).
2. There is one fully 3D pressure mode: a $\xi\eta\zeta$ wave, $(-1)^{i+j+k}$.
3. Tangential velocity BC's remove pressure modes in the same ways they do for $Q_1Q_0$.

Finally, we address the subject of filtering the 2D CB-mode for/from $Q_2Q_{-1}$, but not smoothing, as for $Q_1Q_0$, since the filtered results still apply at the same pressure nodes (not at the velocity nodes à la $Q_1Q_0$). There are two ways to derive the filter for this element, and each relies, as for $Q_1Q_0$, on a knowledge of the CB-eigenvector: (i) assume that the true (physical) pressure is $L_2$-orthogonal to the $xy$ mode on each element (i.e., assume that the true pressure has no projection onto the element-level CB-mode), and (ii) simply subtract off the $xy$-portion of the computed pressure in each element. To derive these, we first present the pressure basis functions for the $l \times h$ rectangular element shown in Figure 3.13-8 ($P_1$ through $P_4$ are at the 2 x 2 Gaussian points). The element pressure can be expressed in the usual way, $P(x, y) = \sum_{j=1}^{4} P_j \psi_j(x, y)$, or in the equivalent way,

$$P(x, y) = P_0 + xP_x + yP_y + xyP_{xy}, \tag{3.13-88}$$

where

$$P_0 = \tfrac{1}{4}(P_1 + P_2 + P_3 + P_4), \tag{3.13-89}$$

$$P_x = \frac{\sqrt{3}}{2l}(P_2 - P_1 + P_3 - P_4), \tag{3.13-90}$$

$$P_y = \frac{\sqrt{3}}{2h}(P_4 - P_1 + P_3 - P_2), \tag{3.13-91}$$

and

$$P_{xy} = \frac{3}{lh}(P_1 - P_2 + P_3 - P_4). \tag{3.13-92}$$

Assume

$$P = P_P + \alpha P_{CB}, \tag{3.13-93}$$

where $P_P(x, y)$ is the (unknown) physical pressure, and $P_{CB} = xy$ is the local CB-pressure. To obtain $\alpha$, and thus $P_P = P - \alpha P_{CB}$ as the filtered pressure, we assert/assume that $\int_e P_P P_{CB} = 0$, which yields $\int_e xyP = \alpha \int_e P_{CB}^2$, where $P$ is available and given by (3.13-88). Thus, both integrals can be evaluated (quickly and easily when it is realized that all integrands except that involving $P_{xy}$ are odd and integrate to zero), giving $\alpha = P_{xy}$, and thus

$$P_P = P - \frac{3}{lh}\,xy\,(P_1 - P_2 + P_3 - P_4) = P_0 + xP_x + yP_y, \tag{3.13-94}$$

showing that the filtered pressure is actually linear, à la the stable and accurate $Q_2P_{-1}$. It is also clear that simply subtracting the $xy$ portion of the original pressure gives the same result; namely, the physical pressure lies in the orthogonal complement of the null space.

Fig. 3.13-8 One $Q_2Q_{-1}$ element.

In 3D, the analogous 3D 'CB' is that described by the $xyz$ portion of the pressure, whose subtraction should leave a linear pressure on each element and no 3D pressure mode. The many 2D modes in 3D are presumably similarly removed in each plane of elements, by subtracting off the offending 2D CB-mode. ... We have done neither the complete analysis nor numerical experiments. Final 3D, 27-node-brick remark: it is still not firmly resolved as to whether the $Q_2Q_{-1}$ element with eight continuity equations per element and spurious modes will give a better solution (when, of course, a solution exists!) than the $Q_2P_{-1}$ element with its ostensible 'shortage' of continuity equations (only four per element) but no pressure modes. (We believe that it just might.) More work, and lots of numerical experiments, are still required.

o The unsymmetric case, revisited. We now turn to the unsymmetric case. The simplest way (via finite elements) to generate the unsymmetric system, (3.13-60), introduced already in (3.13-33), which also more closely mimics many finite difference methods, is to avoid integrating by parts the $\nabla P$ term, which, of course, precludes the lowest-order quadrilateral element, $Q_1Q_0$, from consideration; only elements with the ability to generate a valid approximation to $\nabla P$ can be employed. (The only pressure gradient that exists for $Q_1Q_0$ is the weak one; ditto $P_1P_0$.) The reason, or at least one reason, for doing so is related to outflow boundary conditions (OBC's), a subject we take up in greater depth in Volume II. Thus, for example, in Jackson and Cliffe (1981), Sani et al. (1984), and Eaton (1983), the gain (or potential gain) in OBC flexibility was the reason for heading down (or almost heading down) the path of ill-posed formulations à la Section 3.12.5, although it was not really known then (but suspected, by some) that the formulation was (or could be) ill-posed.
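Before proceeding with the unsymmetric case, the element-level filter (3.13-94) derived above for $Q_2Q_{-1}$ can be sketched numerically. This is a minimal sketch under stated assumptions: the element is taken to be an $l \times h$ rectangle centered at the origin, the four pressure nodes are numbered counterclockwise from the lower-left Gauss point (which is what the sign patterns of the coefficient formulas imply), and the function names and test data are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)


def gauss_points(l, h):
    """2 x 2 Gauss points of an l-by-h element centered at the origin (assumed layout)."""
    gx, gy = l / (2.0 * SQRT3), h / (2.0 * SQRT3)
    return [(-gx, -gy), (gx, -gy), (gx, gy), (-gx, gy)]


def modal_coefficients(p, l, h):
    """Coefficients P0, Px, Py, Pxy of P = P0 + x*Px + y*Py + x*y*Pxy."""
    p1, p2, p3, p4 = p
    p0 = 0.25 * (p1 + p2 + p3 + p4)
    px = SQRT3 / (2.0 * l) * (p2 - p1 + p3 - p4)
    py = SQRT3 / (2.0 * h) * (p4 - p1 + p3 - p2)
    pxy = 3.0 / (l * h) * (p1 - p2 + p3 - p4)
    return p0, px, py, pxy


def filtered_pressure(p, l, h):
    """Filtered nodal pressures: drop the xy (element-level CB) part, keep the linear part."""
    p0, px, py, _ = modal_coefficients(p, l, h)
    return [p0 + x * px + y * py for x, y in gauss_points(l, h)]


# Hypothetical test data: a linear 'physical' pressure polluted by a +,-,+,- pattern.
l, h = 2.0, 1.0
linear = [1.0 + 2.0 * x - y for x, y in gauss_points(l, h)]
polluted = [v + s * 5.0 for v, s in zip(linear, (1, -1, 1, -1))]

print(modal_coefficients(polluted, l, h))   # Pxy carries all of the pollution
print(filtered_pressure(polluted, l, h))    # agrees with 'linear' up to round-off
```

Because the alternating pattern changes only $P_{xy}$, subtracting the $xy$ term recovers the linear field at the same four pressure nodes.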
Thus, we return to the weak formulation of the equations in Section 3.12.1; if the pressure gradient term was not integrated by parts [see (3.12-4) through (3.12-19)], there are two changes: (i) the pressure portion of (3.12-14) and (3.12-16) is absent (it is no longer part of the NBC), and (ii) the derivatives are not shifted to the velocity test functions; $\int(-P\,\partial\varphi^{(x)}/\partial x)$ in (3.12-18) becomes $\int \varphi^{(x)}\,\partial P/\partial x$, with an analogous change in (3.12-19). The net result, in the final GFEM equations given by (3.13-28) and (3.13-29), is that we lose the 'div-grad symmetry' present there because '$C$' is no longer the transpose of $C^T$ (which remains unchanged); rather, $C$ is replaced by $G$ (gradient matrix) where, cf. (3.13-25), $G_\alpha = \int \varphi^{(\alpha)}\,\partial\psi^T/\partial x_\alpha$. This formulation leads to, for the special cases under consideration in this section, (3.13-60), where $D = C^T$.

Remarks:

(1) It may be worth mentioning that the difference between $C$ and $G$ is in some sense quite small; indeed, it is zero if all BC's are Dirichlet. Thus, they agree in all of $\Omega$ and only differ on those parts of $\Gamma$ on which the normal NBC is applied. To see this, simply note that

$$\int_\Omega \mathbf{v}\cdot\nabla P = -\int_\Omega P\,\nabla\cdot\mathbf{v} + \int_\Gamma P\,\mathbf{n}\cdot\mathbf{v},$$

where $\mathbf{v}$ is a vector test function à la (3.12-34), which leads to $(G - C)P|_i = \int_\Gamma P\,(\mathbf{n}\cdot\mathbf{v})_i\ \forall i$, and we know that $\mathbf{n}\cdot\mathbf{v}|_\Gamma = 0$ except on $\Gamma_N$; thus, $G = C$ nearly everywhere.
(2) But, on $\Gamma_N$, the difference between $G$ and $C$ is profound; $GP$ is a 'gradient' ($\nabla P$), and $CP$ is a 'force.' These differences, and their effects, are demonstrated below.

(3) $GP_H = 0$, always; yet, if the problem has inflow and outflow with an NBC as OBC, there is no redundant continuity equation associated with this singularity; $P_H$ is then a spurious hydrostatic mode.

*o A simple example. We conclude this section with a simple but carefully developed and hopefully useful (but long) example, with the intent of demonstrating the theory discussed above. Specifically, we consider the following exact solution to the 2D steady Stokes equations in an unbounded domain: $u = x^2$, $v = -2xy$, $P = 2x + ay$, where $a$ corresponds to a body force (like gravity). Figure 3.13-9 depicts the streamlines for this solution [and, for $a = 0$, the (vertical) isobars]. The boxed subdomain is the one chosen for our example. And, for the ultimate in simplicity, we cover this domain by a single element, the $Q_2^{(8)}Q_1$ 'serendipity' element, because it is the simplest (quadrilateral) element that is capable of representing the exact solution. We shall specify Dirichlet/essential BC's along the axes and NBC's along the other two boundaries, per Figure 3.13-10 below, in which there are only 10 degrees of freedom: three $u$'s, three $v$'s, and four $P$'s (• is velocity, □ represents pressure).

Fig. 3.13-9 Streamlines and isobars (- - -) for a Stokes test problem.

Fig. 3.13-10 One serendipity element.

We shall 'solve' this problem in several ways:
1. Conventional GFEM.
2. Unconventional FEM, in which the pressure gradient is not 'integrated around by parts.'
3. Variations on 2, in which the pressure is specified at a point.

The reason that 3 is even considered is related to the simple fact that 2 is generally ill-posed: the Stokes matrix is singular (and unsymmetric), and the RHS vector is (generally) not orthogonal to the null vector of the transpose matrix. Before embarking on the details (in which the Devil resides), let us state the bottom line: the Galerkin FEM is always well-posed and for the example here can (given exact data) obtain the exact solution, whereas the unconventional FEM is generally ill-posed (exception: exact data) unless global mass conservation is forsaken via pressure specification, and, of course, cannot thereby obtain the exact solution. The GFEM condensed statement of the problem is

$$Ku + CP = f, \tag{3.13-95}$$

$$C^T u = g, \tag{3.13-96}$$

where we use the simplest (y = 0) option so that these equations mimic $-\nabla^2\mathbf{u} + \nabla P = a\,\mathbf{e}_y$ and $\nabla\cdot\mathbf{u} = 0$. The unsymmetric version is

$$Ku + GP = f^G, \tag{3.13-97}$$

$$C^T u = g, \tag{3.13-98}$$

where $G$ replaces $C$ and, because of different NBC's, $f^G$ replaces $f$, and the first job at hand is to construct the new gradient matrix, $G$, which we do starting from the available (16 x 4) $C$-matrix of Appendix 1. We will detail the construction for the $x$-momentum equation only, leaving $y$ to the reader: from (3.13-25), with $\alpha$ switched to a superscript for convenience, $C^x_{ij} = -\int \partial\varphi_i/\partial x\;\psi_j$, and from the Stokes equations with no integration by parts of $\nabla P$, we have the obvious definition, $G^x_{ij} = \int \varphi_i\,\partial\psi_j/\partial x$, so that

$$G^x_{ij} - C^x_{ij} = \int_\Omega \frac{\partial}{\partial x}(\varphi_i\psi_j) = \int_\Gamma \varphi_i\psi_j n_x. \tag{3.13-99}$$

Defining $B^x_{ij} = \int_\Gamma \varphi_i\psi_j n_x$ gives the following (8 x 4) boundary element matrix on the 'usual' $l \times h$ rectangular domain (all integrals are over $-1 \le \eta \le 1$):

$$B^x = \frac{h}{2}\begin{bmatrix}
-\int\varphi_1\psi_1\,d\eta & 0 & 0 & -\int\varphi_1\psi_4\,d\eta \\
0 & \int\varphi_2\psi_2\,d\eta & \int\varphi_2\psi_3\,d\eta & 0 \\
0 & \int\varphi_3\psi_2\,d\eta & \int\varphi_3\psi_3\,d\eta & 0 \\
-\int\varphi_4\psi_1\,d\eta & 0 & 0 & -\int\varphi_4\psi_4\,d\eta \\
0 & 0 & 0 & 0 \\
0 & \int\varphi_6\psi_2\,d\eta & \int\varphi_6\psi_3\,d\eta & 0 \\
0 & 0 & 0 & 0 \\
-\int\varphi_8\psi_1\,d\eta & 0 & 0 & -\int\varphi_8\psi_4\,d\eta
\end{bmatrix},$$

which, using the serendipity element basis functions, leads to

$$B^x_{ij} = \frac{h}{6}\begin{bmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 \\
0 & 2 & 2 & 0 \\
0 & 0 & 0 & 0 \\
-2 & 0 & 0 & -2
\end{bmatrix},
\qquad
G^x_{ij} = C^x_{ij} + B^x_{ij} = \frac{h}{36}\begin{bmatrix}
\mathbf{1} & -1 & -2 & 2 \\
1 & \mathbf{-1} & -2 & 2 \\
2 & -2 & \mathbf{-1} & 1 \\
2 & -2 & -1 & \mathbf{1} \\
-8 & 8 & 4 & -4 \\
-6 & \mathbf{6} & \mathbf{6} & -6 \\
-4 & 4 & 8 & -8 \\
\mathbf{-6} & 6 & 6 & \mathbf{-6}
\end{bmatrix},$$

in which the bold entries are those that differ from the corresponding entries in $C^x_{ij}$. This is the full element-matrix; we need only rows 3, 6, and 7 for our purposes, since we have only 10 degrees of freedom (unknowns) in our test problem; namely, $u_3$, $u_6$, $u_7$, $v_3$, $v_6$, $v_7$, and $P_1$, $P_2$, $P_3$, $P_4$. Omitting the details, we also present the $y$-version of the new gradient matrix, 'for the record':

$$G^y_{ij} = \frac{l}{36}\begin{bmatrix}
1 & 2 & -2 & -1 \\
2 & 1 & -1 & -2 \\
2 & 1 & -1 & -2 \\
1 & 2 & -2 & -1 \\
-6 & -6 & 6 & 6 \\
-4 & -8 & 8 & 4 \\
-6 & -6 & 6 & 6 \\
-8 & -4 & 4 & 8
\end{bmatrix},$$

and again, only rows 3, 6, and 7 are needed here. By referring to Appendix 1, and grabbing rows 3, 6, and 7 of the appropriate $K$ ($-\nabla^2$) and $C$ matrices, we can now form the LHS of (3.13-95) and (3.13-96) and of (3.13-97) and (3.13-98), where, for the case of interest, $l = h = 2$: first for GFEM,

$$\begin{bmatrix}
\dfrac{1}{45}\begin{bmatrix} 52 & -37 & -37 \\ -37 & 104 & 0 \\ -37 & 0 & 104 \end{bmatrix} & 0 &
\dfrac{1}{18}\begin{bmatrix} 2 & -2 & -7 & 1 \\ -6 & -6 & -6 & -6 \\ -4 & 4 & 8 & -8 \end{bmatrix} \\[3ex]
0 & \dfrac{1}{45}\begin{bmatrix} 52 & -37 & -37 \\ -37 & 104 & 0 \\ -37 & 0 & 104 \end{bmatrix} &
\dfrac{1}{18}\begin{bmatrix} 2 & 1 & -7 & -2 \\ -4 & -8 & 8 & 4 \\ -6 & -6 & -6 & -6 \end{bmatrix} \\[3ex]
\dfrac{1}{18}\begin{bmatrix} 2 & -6 & -4 \\ -2 & -6 & 4 \\ -7 & -6 & 8 \\ 1 & -6 & -8 \end{bmatrix} &
\dfrac{1}{18}\begin{bmatrix} 2 & -4 & -6 \\ 1 & -8 & -6 \\ -7 & 8 & -6 \\ -2 & 4 & -6 \end{bmatrix} & 0
\end{bmatrix}
\begin{bmatrix} u_3 \\ u_6 \\ u_7 \\ v_3 \\ v_6 \\ v_7 \\ P_1 \\ P_2 \\ P_3 \\ P_4 \end{bmatrix};$$
for the unsymmetric version, only the upper 6 x 4 matrix changes ($C \to G$); it is then

$$\frac{1}{18}\begin{bmatrix}
2 & -2 & -1 & 1 \\
-6 & 6 & 6 & -6 \\
-4 & 4 & 8 & -8 \\
2 & 1 & -1 & -2 \\
-4 & -8 & 8 & 4 \\
-6 & -6 & 6 & 6
\end{bmatrix},$$

which, it is important to note and in contrast to $C$, annihilates a constant (pressure) vector; $GP_H = 0$. We now focus our attention on the RHS vectors, in which $f$ and $f^G$ receive contributions from all three 'sources': (i) inhomogeneous Dirichlet velocity BC's, (ii) inhomogeneous NBC's, and (iii) body forces (accelerations): $f = f_D + f_{NBC} + f_{BF}$, and we construct each in turn (and later repeat it for $f^G$). For $f_D$, we start with the eight-node element matrix for $-\nabla^2$ from Appendix 1, and transpose to the RHS the Dirichlet data (nodes 1, 2, 4, 5, and 8), first for $u$ and then for $v$, to obtain

$$f_D = \frac{1}{90}\begin{bmatrix}
-46u_1 - 45u_2 - 45u_4 + 46u_5 + 46u_8 \\
46u_1 + 74u_2 + 46u_4 - 32u_8 \\
46u_1 + 46u_2 + 74u_4 - 32u_5 \\
-46v_1 - 45v_2 - 45v_4 + 46v_5 + 46v_8 \\
46v_1 + 74v_2 + 46v_4 - 32v_8 \\
46v_1 + 46v_2 + 74v_4 - 32v_5
\end{bmatrix}
= \frac{1}{45}\begin{bmatrix} -67 \\ 148 \\ 76 \\ 0 \\ 0 \\ 0 \end{bmatrix},$$

where we have employed the exact solution, $u = x^2$, $v = -2xy$. For $f_{NBC}$, we first refer to Figure 3.13-11, in which the exact solution is inserted into the 'traction' terms.

Fig. 3.13-11 The natural boundary conditions. On the top boundary: $f_x = -f_\tau = \partial u/\partial y = 0$ and $f_y = f_n = -P + \partial v/\partial y = -4x - 2a$. On the right boundary: $f_x = f_n = -P + \partial u/\partial x = -ay$ and $f_y = f_\tau = \partial v/\partial x = -2y$.

The NBC contribution, $f_{NBC}$, is thus comprised of the following boundary integrals, wherein we invoke the shortcut notation that $\int_a^b(\cdot)$ denotes the boundary integral between nodes $a$ and $b$:
$$f_{NBC} = \begin{bmatrix}
\int_2^3 \varphi_3(-ay) + \int_3^4 \varphi_3\cdot 0 \\
\int_2^3 \varphi_6(-ay) \\
0 \\
\int_2^3 \varphi_3(-2y) + \int_3^4 \varphi_3(-4x - 2a) \\
\int_2^3 \varphi_6(-2y) \\
\int_3^4 \varphi_7(-4x - 2a)
\end{bmatrix}
= \begin{bmatrix} -2a/3 \\ -4a/3 \\ 0 \\ -4 - 2a/3 \\ -8/3 \\ -16/3 - 8a/3 \end{bmatrix},$$

wherein we observe that part of $f_{NBC}$ actually comes from the body force term, $a\,\mathbf{e}_y$, via the pressure force. Finally, the body force contribution from the 'bulk' integrals is

$$f_{BF} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \int\varphi_3 a \\ \int\varphi_6 a \\ \int\varphi_7 a \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \\ -a/3 \\ 4a/3 \\ 4a/3 \end{bmatrix},$$

so that the total $f$-vector is

$$f = \begin{bmatrix} -67/45 - 2a/3 \\ 148/45 - 4a/3 \\ 76/45 \\ -4 - a \\ -8/3 + 4a/3 \\ -16/3 - 4a/3 \end{bmatrix},$$

and we now turn to the $g$-vector, from the mass conservation equations. It is worthwhile to start by writing out the four full continuity equations, corresponding to $P_1$ through $P_4$, again using Appendix 1 for the $C^T$-matrix; here for $l = h = 2$ and dropping the common factor 1/18:

$$(7u_1 + u_2 + 2u_3 + 2u_4 - 8u_5 - 6u_6 - 4u_7 + 6u_8) + (7v_1 + 2v_2 + 2v_3 + v_4 + 6v_5 - 4v_6 - 6v_7 - 8v_8) = 0,$$

$$(-u_1 - 7u_2 - 2u_3 - 2u_4 + 8u_5 - 6u_6 + 4u_7 + 6u_8) + (2v_1 + 7v_2 + v_3 + 2v_4 + 6v_5 - 8v_6 - 6v_7 - 4v_8) = 0,$$

$$(-2u_1 - 2u_2 - 7u_3 - u_4 + 4u_5 - 6u_6 + 8u_7 + 6u_8) + (-2v_1 - v_2 - 7v_3 - 2v_4 + 6v_5 + 8v_6 - 6v_7 + 4v_8) = 0,$$

$$(2u_1 + 2u_2 + u_3 + 7u_4 - 4u_5 - 6u_6 - 8u_7 + 6u_8) + (-v_1 - 2v_2 - 2v_3 - 7v_4 + 6v_5 + 4v_6 - 6v_7 + 8v_8) = 0.$$

Before forming the $g$-vector, it is interesting to add the four equations to obtain the following global mass balance:

$$6[(u_1 - u_2) + 4(u_8 - u_6) + (u_4 - u_3)] + 6[(v_1 - v_4) + 4(v_5 - v_7) + (v_2 - v_3)] = 0,$$

which is interesting because it is also an element-level mass balance, a balance not normally achieved when using continuous pressure basis functions. (But it is also true that one-element domains are not common.) Then, transposing all Dirichlet data to the RHS gives the $C^T$-matrix already displayed and gives $g$ on the RHS (after reinstating the 1/18 factor):

$$g = \frac{1}{18}\begin{bmatrix}
(-7u_1 - u_2 - 2u_4 + 8u_5 - 6u_8) + (-7v_1 - 2v_2 - v_4 - 6v_5 + 8v_8) \\
(u_1 + 7u_2 + 2u_4 - 8u_5 - 6u_8) + (-2v_1 - 7v_2 - 2v_4 - 6v_5 + 4v_8) \\
(2u_1 + 2u_2 + u_4 - 4u_5 - 6u_8) + (2v_1 + v_2 + 2v_4 - 6v_5 - 4v_8) \\
(-2u_1 - 2u_2 - 7u_4 + 4u_5 - 6u_8) + (v_1 + 2v_2 + 7v_4 - 6v_5 - 8v_8)
\end{bmatrix}
= \frac{1}{18}\begin{bmatrix} 4 \\ 20 \\ 4 \\ -4 \end{bmatrix},$$

which comes from the only non-zero values: $u_2 = 4$ and $u_5 = 1$ (see Figure 3.13-10). All of the data are now available to obtain the GFEM solution of (3.13-95) and (3.13-96), which solution gives the exact results ($u = x^2$, $v = -2xy$, $P = 2x + ay$), as can be easily verified by substitution:

$$(u_3, u_6, u_7, v_3, v_6, v_7, P_1, P_2, P_3, P_4)^T = (4, 4, 1, -8, -4, -4, 0, 4, 4 + 2a, 2a)^T.$$

We now turn to the $G$-matrix version of the same problem, which requires the construction of $f^G$. To do this, we first repeat in Figure 3.13-12 the 'NBC sketch' shown earlier (Figure 3.13-11), wherein the pressure terms are now absent. Building $f^G_{NBC}$ in the same way we did for $f_{NBC}$ (i.e., using the exact solution) yields

$$f^G_{NBC} = \begin{bmatrix} 4/3 \\ 16/3 \\ 0 \\ -8/3 \\ -8/3 \\ -8/3 \end{bmatrix}.$$
Fig. 3.13-12 Natural boundary conditions sans pressure. On the top boundary: $f_x = -f_\tau = \partial u/\partial y = 0$ and $f_y = f_n = \partial v/\partial y = -2x$. On the right boundary: $f_x = f_n = \partial u/\partial x = 4$ and $f_y = f_\tau = \partial v/\partial x = -2y$.

Since the other two parts of the $f$-vector remain unchanged, the total $f^G$-vector is

$$f^G = \begin{bmatrix} -67/45 + 4/3 \\ 148/45 + 16/3 \\ 76/45 + 0 \\ -a/3 - 8/3 \\ 4a/3 - 8/3 \\ 4a/3 - 8/3 \end{bmatrix}
= \begin{bmatrix} -7/45 \\ 388/45 \\ 76/45 \\ -a/3 - 8/3 \\ 4a/3 - 8/3 \\ 4a/3 - 8/3 \end{bmatrix}.$$

The $g$-vector is the same for the '$G$-problem' as it is for the '$C$-problem' since both use $C^T$; thus, we have completed the definition of $A$ and $b$ in $Ax = b$ for each. For the symmetric case ($C$-matrix), $A$ is non-singular, and the solution of (3.13-95) and (3.13-96) gives the exact solution, 'as advertised' [i.e., if the exact solution is contained in the grab bag (the trial space), GFEM will find it]. For the unsymmetric case ($G$-matrix), $A$ is singular, and the solution of (3.13-97) and (3.13-98) is not unique, when it exists. It turns out that it does exist for the above data, and it is unique (and exact) only up to an additive multiple of $P_H$, the hydrostatic pressure mode; i.e., we have the special case of a consistent singular system. This case is very special in that the use of the exact solution to build $f^G_{NBC}$ yields a consistent system; virtually any $f^G_{NBC}$ other than that shown above leads to an ill-posed problem, a point that we shall prove below. (Indeed, part of the reason for defining such a 'small' problem is so that the linear algebra issues are easy to illustrate.) Since (3.13-97) and (3.13-98) have a singular matrix, we are led to seek the null vector (corresponding to the pure pressure mode, $P_H$) for the adjoint/transpose problem and then test for solvability. Thus, we form

$$Kw + Cr = 0, \tag{3.13-100}$$

$$G^T w = 0, \tag{3.13-101}$$

from the given matrices and seek the solution of the resulting 10 x 10 homogeneous linear system, which is easily found to be

$$w^T = (w^x_3, w^x_6, w^x_7, w^y_3, w^y_6, w^y_7) = (2, 1, -1/2, 2, -1/2, 1)$$

and

$$r^T = (r_1, r_2, r_3, r_4) = (9/5, -14/5, 29/5, -14/5),$$

where we 'normalized' the eigenvector by setting $w^y_7 = 1$. Recalling that the solvability constraint is $w^T f^G + r^T g = 0$, where $f^G = (f^x_3, f^x_6, f^x_7, f^y_3, f^y_6, f^y_7)^T$ and $g = (g_1, g_2, g_3, g_4)^T$, leads directly to the solvability condition

$$2f^x_3 + f^x_6 - 0.5f^x_7 + 2f^y_3 - 0.5f^y_6 + f^y_7 + 1.8g_1 - 2.8g_2 + 5.8g_3 - 2.8g_4 = 0. \tag{3.13-102}$$

If this equation is not satisfied by the given data ($f$, $g$), then the problem is ill-posed and no solution exists. Inserting the $f^G$ and $g$ results above into this constraint equation yields

$$-14/45 + 388/45 - 38/45 - 2a/3 - 16/3 - 2a/3 + 4/3 + 4a/3 - 8/3 + (1.8 \times 4 - 2.8 \times 20 + 5.8 \times 4 + 2.8 \times 4)/18 = 0,$$

or $0 = 0$; i.e., it is satisfied identically! This rare result is obviously related to the 'simplicity' of the test problem and to our knowledge and utilization of the exact solution to form $f^G$ ($f^G_{NBC}$ in particular). In the general case (many elements, no exact solution), a solvability condition similar to (3.13-102), but with many more terms, will always exist for the unsymmetric problem, but its satisfaction will rarely if ever be attained. Thus, the general case is ill-posed. (Even though the $A$-matrix displays only a pure pressure mode for its null vector, the null vector of $A^T$ will generally contain some non-zero entries in the velocity portion of its null vector, as above, and these will usually assure that the linear system $Ax = b$ has no solution.) Before pushing on, let us be sure to appreciate the current situation in which we have constructed a consistent singular system.
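The linear algebra of this small example is easy to check by machine. The sketch below, in exact rational arithmetic, assembles the 10 x 10 matrices from the $K$, $C$, and $G$ blocks quoted above (for $l = h = 2$) and checks (i) that the exact solution satisfies both the $C$- and the $G$-systems, (ii) that $P_H$ is a null vector of the $G$-system, (iii) that $(w, r)$ is a null vector of its transpose, and (iv) that the solvability condition (3.13-102) holds for any $a$ but fails once a single RHS entry is altered. The helper names (`assemble`, `rhs_c`, `rhs_g`, and so on) and the sample value $a = 3/7$ are assumptions made for illustration only.

```python
from fractions import Fraction as F


def scaled(rows, den):
    """Integer rows divided by a common denominator, as exact fractions."""
    return [[F(v, den) for v in row] for row in rows]


# Rows 3, 6, 7 of the element matrices for l = h = 2 (values quoted in the text).
K = scaled([[52, -37, -37], [-37, 104, 0], [-37, 0, 104]], 45)
CX = scaled([[2, -2, -7, 1], [-6, -6, -6, -6], [-4, 4, 8, -8]], 18)
CY = scaled([[2, 1, -7, -2], [-4, -8, 8, 4], [-6, -6, -6, -6]], 18)
GX = scaled([[2, -2, -1, 1], [-6, 6, 6, -6], [-4, 4, 8, -8]], 18)
GY = scaled([[2, 1, -1, -2], [-4, -8, 8, 4], [-6, -6, 6, 6]], 18)
G_VEC = [F(4, 18), F(20, 18), F(4, 18), F(-4, 18)]


def assemble(px, py):
    """Momentum rows use (px, py), i.e. C or G; continuity rows always use C^T."""
    a = [[F(0)] * 10 for _ in range(10)]
    for i in range(3):
        for j in range(3):
            a[i][j] = a[3 + i][3 + j] = K[i][j]
        for j in range(4):
            a[i][6 + j], a[3 + i][6 + j] = px[i][j], py[i][j]
            a[6 + j][i], a[6 + j][3 + i] = CX[i][j], CY[i][j]
    return a


def rhs_c(a):
    return [F(-67, 45) - 2 * a / 3, F(148, 45) - 4 * a / 3, F(76, 45),
            -4 - a, F(-8, 3) + 4 * a / 3, F(-16, 3) - 4 * a / 3] + G_VEC


def rhs_g(a):
    return [F(-7, 45), F(388, 45), F(76, 45),
            -a / 3 - F(8, 3), 4 * a / 3 - F(8, 3), 4 * a / 3 - F(8, 3)] + G_VEC


def exact(a):
    return [F(4), F(4), F(1), F(-8), F(-4), F(-4), F(0), F(4), 4 + 2 * a, 2 * a]


def matvec(m, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


A_C, A_G = assemble(CX, CY), assemble(GX, GY)
Z = [F(2), F(1), F(-1, 2), F(2), F(-1, 2), F(1),          # w
     F(9, 5), F(-14, 5), F(29, 5), F(-14, 5)]             # r
P_H = [F(0)] * 6 + [F(1)] * 4

for a in (F(0), F(3, 7)):                                  # sample body forces
    print("a =", a,
          "| C-system exact:", matvec(A_C, exact(a)) == rhs_c(a),
          "| G-system exact:", matvec(A_G, exact(a)) == rhs_g(a),
          "| (3.13-102) residual:", dot(Z, rhs_g(a)))

null_right = matvec(A_G, P_H)                              # A P_H
null_left = [sum(A_G[i][k] * Z[i] for i in range(10)) for k in range(10)]  # A^T z

perturbed = rhs_g(F(0))
perturbed[4] = F(0)                                        # alter one traction entry
print("residual after perturbation:", dot(Z, perturbed))
```

Because every entry of $(w, r)$ is non-zero, altering any single RHS entry produces a non-zero residual in (3.13-102).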
Since in this particular case the null vector of $A^T$ is fully populated (no zeros in either $w$ or $r$), it follows that each of the 10 matrix rows is linearly dependent on the other nine [simply apply (3.13-102) to the matrix rows, with $f^x_3$ replaced by Row 1, ..., and $g_4$ replaced by Row 10]; since also the system is consistent, with the RHS satisfying (3.13-102), it follows that any of the 10 equations could be dropped after pegging (specifying) any one of the four pressures and transposing the corresponding column data to the RHS, thus reducing the problem to a nine-equation one with a 9 x 9 matrix that is not singular. Its solution would agree with that from the $C$-matrix up to a constant multiple of $P_H$, the hydrostatic pressure mode. For example, let us omit the third equation (momentum equation for $u_7$) and specify the first pressure ($P_1$) to be $P_1 = 10$; the remaining nine equations will give the proper solution once the RHS of each is modified by subtracting $10G_{i1}$, $i$ = 1, 2, 4, 5, and 6, from the RHS vector, where $G_{i1}$ is the first column in the original (6 x 4) $G$-matrix; it is the seventh column in the $A$-matrix. Since the exact solution has $P_1 = 0$, the 9 x 9 solution will be 10 units too large in pressure and exact in the six velocities. [In the general case of an $N \times N$ system, it may not be the case that the null vector is fully populated with non-zeros; any equation corresponding to a zero entry in the null vector of $A^T$ is not linearly dependent on the other $N - 1$ and must be retained. It is probably safe to say, though, that any of the continuity equations in the consistent (!) singular system could be safely jettisoned ($r_i \neq 0\ \forall i$ is presumed) and its pressure specified, which is the most 'familiar' case/situation. But, as already mentioned, it is probably also safe to say that the general case will not be consistent, so that the point is somewhat moot.]

So now let us see what happens if we employ the 'trick' used by many FDM practitioners; i.e., they often argue as follows: 'Since the pressure is never determinable except up to an additive constant, let us remove this indeterminacy by setting the value of P at a single point.' What they seemingly do not realize is that they do not, unless they are solving the (too) rare case with a consistent singular system, really have a redundant 'continuity' equation (when OBC's are present) even though they do have a hydrostatic pressure mode, because the linear system is not consistent. (While the matrix rows still do display some linear dependence, the corresponding equations do not because the RHS is not consistent.) Thus, once a pressure is specified, the corresponding mass conservation equation is necessarily lost; removing the singularity by setting the pressure at one node sacrifices the continuity equation at that node, a procedure that may or may not cause serious 'damage.' It must be the case that serious damage is not the common consequence, judging by the plethora of 'successful' simulations done this way. Perhaps it is the case that the problems are otherwise well designed, so that the local loss of mass conservation is hardly noticeable. Perhaps it is also the case that 'pressure-pegging' is done selectively, cleverly; e.g., if it is done in a region of low velocity, the local loss of conservation may well go unnoticed. [Note that the location of the pressure specification point (node) is completely arbitrary; it could, but need not be, chosen to be on the outlet boundary. It could be far away from any boundary, preferably, as stated above, in a region of very low velocity.]
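The consistent-case pegging described above (omit the momentum equation for $u_7$, set $P_1 = 10$, and move the corresponding column to the RHS) can be tried directly. The following is a minimal sketch for $a = 0$, assuming the $K$, $G$, and $C^T$ blocks and the $f^G$ and $g$ vectors quoted in the text; the small Gauss-Jordan solver and all helper names are illustrative, not part of the original presentation.

```python
from fractions import Fraction as F


def scaled(rows, den):
    return [[F(v, den) for v in row] for row in rows]


def solve(a, b):
    """Gauss-Jordan elimination in exact rational arithmetic (square, non-singular)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                fac = m[r][col]
                m[r] = [vr - fac * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


K = scaled([[52, -37, -37], [-37, 104, 0], [-37, 0, 104]], 45)
CX = scaled([[2, -2, -7, 1], [-6, -6, -6, -6], [-4, 4, 8, -8]], 18)
CY = scaled([[2, 1, -7, -2], [-4, -8, 8, 4], [-6, -6, -6, -6]], 18)
GX = scaled([[2, -2, -1, 1], [-6, 6, 6, -6], [-4, 4, 8, -8]], 18)
GY = scaled([[2, 1, -1, -2], [-4, -8, 8, 4], [-6, -6, 6, 6]], 18)

# Unsymmetric 10 x 10 system: G in the momentum rows, C^T in the continuity rows.
A = [[F(0)] * 10 for _ in range(10)]
for i in range(3):
    for j in range(3):
        A[i][j] = A[3 + i][3 + j] = K[i][j]
    for j in range(4):
        A[i][6 + j], A[3 + i][6 + j] = GX[i][j], GY[i][j]
        A[6 + j][i], A[6 + j][3 + i] = CX[i][j], CY[i][j]

# RHS for a = 0: f^G followed by g.
b = [F(-7, 45), F(388, 45), F(76, 45), F(-8, 3), F(-8, 3), F(-8, 3),
     F(4, 18), F(20, 18), F(4, 18), F(-4, 18)]

drop_eq, peg_col, peg_val = 2, 6, F(10)      # third equation (u7), first pressure (P1)
rows = [i for i in range(10) if i != drop_eq]
cols = [j for j in range(10) if j != peg_col]
a9 = [[A[i][j] for j in cols] for i in rows]
b9 = [b[i] - peg_val * A[i][peg_col] for i in rows]

x9 = solve(a9, b9)
x = x9[:6] + [peg_val] + x9[6:]              # re-insert the pegged pressure
print("velocities:", [str(v) for v in x[:6]])
print("pressures: ", [str(v) for v in x[6:]])
```

If the reduced matrix is non-singular, as the text asserts, the velocities come out exact and every pressure is shifted by the pegged value.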
But it will always be the case that changing the point of pressure specification will change the entire solution. Toward this end, then, we show below the results of selective pressure specification for our simple test problem, for $a = 0$. To both generate an inconsistent $G$-matrix problem, and to permit some sort of comparison with the $C$-matrix result when the proper value of $f$ is not known, we replaced the specified shear stress ($-2y$) at node six (the momentum equation for $v_6$) by zero in the $f^y_6$-location, for both the $G$ and $C$ problems. For the $G$-problem, we must also specify one pressure and omit one of the 10 equations to obtain a non-singular 9 x 9 system, as described above. For simplicity, and to correspond more closely to what is done 'in practice,' we simply omitted the mass conservation equation corresponding to the specified pressure. (For the $C$-problem, we retain the always-consistent 10 x 10 system; only the RHS vector is changed: $f^y_6$ goes from $-8/3$ to 0.) To seek the 'best possible' solution, we first peg the chosen pressure at the exact value; but then, to better correspond to reality, we also repeated the two cases wherein the exact pressure is non-zero ($P_2 = P_3 = 4$), this time pegging the pressure at zero, the usual choice when no knowledge is available. The results are shown in Table 3.13-6, with the last row showing the residual from the mass conservation equations.... What can be said about these results? Well,

1. The pressures are all bad (the notorious 'over-reactors').
2. The $P_2 = 4$ case is exceptionally good.
3. The $P_1 = 0$ case is exceptionally bad.
4. It may be worthwhile to replace $C$ by $G$, peg a pressure, and live with the result, especially for the case of stratified (Boussinesq) flows that we shall discuss in Vol. II, wherein the homogeneous OBC of the $C$-matrix often does a lousy job.
Table 3.13-6 Test problem results.

                                   G-matrix solutions for various specified pressures
Variable   Exact     C-matrix   P1 = 0   P2 = 4   P3 = 4   P4 = 0   P2 = 0   P3 = 0
           solution  solution
u3          4         3.44       9.76     3.40     4.55     6.91     5.79     4.55
u6          4         3.80       2.96     4.08     4.46     4.77     3.78     4.46
u7          1         0.86       4.66     0.61     1.48     3.16     2.16     1.48
v3         -8        -7.44      -2.65    -8.83    -7.86    -6.62    -5.51    -6.62
v6         -4        -3.86      -0.44    -4.68    -3.62    -2.94    -1.95    -3.62
v7         -4        -3.80      -6.30    -3.84    -4.80    -5.48    -4.49    -4.80
P1          0        13.00       0        5.81    -2.05   -21.3    -21.4     -6.05
P2          4        -7.21      63.6      4        1.33    -1.92     0       -2.67
P3          4         9.00      29.3      7.22     4       -9.51    -9.66     0
P4          0        -6.80      63.4     -0.59     1.18     0       -2.22    -2.82
C^T u - g   0         0          2.22     0.48     0.69    -1.43    -1.43     0.69

Perhaps the main thing to conclude from this prolonged example is that the semi-discrete equations, via linear algebra, lead to the same conclusion stated several times already for the continuous case: integration by parts of the pressure gradient term is required in order to have a well-posed problem for any but Dirichlet BC's.

*c. LBB-stability/div-stability

In Chapter 2 we mentioned the fundamental theorem that provided sufficient conditions for the existence of a unique solution of the steady advection-diffusion equation in the continuum; namely, the Lax-Milgram theorem. We saw that when the GFEM was applied on the appropriate finite-dimensional subspaces (i.e., via conforming elements), the resulting matrices were guaranteed to be invertible if the conditions of the theorems were satisfied. [We also saw cases in which the theory was silent (i.e., the conditions of the theorems were not satisfied), some of which were nevertheless shown to deliver what appeared to be very reasonable (sometimes even surprisingly good) solutions on the finite-dimensional subspaces even though it is (ostensibly) not known that the problem remains well-posed as $h \to 0$.]
If this scalar situation may be termed 'bad' or 'scary,' then we are about to confront a situation in which we go from bad to much worse. And this even for the simplest of all cases for the vector-valued systems of interest in this book—the steady (and even self-adjoint/symmetric) Stokes equations. It begins by recognizing that we must leave the simpler situation wherein a single type of basis function was adequate to represent the single unknown variable and enter the larger and more difficult realm of mixed methods and 'mixed'-finite elements, wherein different unknown variables are (or, at least, may be) represented by different types of basis functions. In our case, of course, we encounter the option of using different basis functions for pressure than for (each component of) velocity. 'Loosely speaking, we want to choose our velocity space and our pressure space so that the resulting method is both accurate and stable. These demands are in some sense conflicting and one has to find a reasonable compromise'—Johnson (1987). And from Arnold et al. (1984), we add, '.. .This space is chosen so that the approximate solution is easily computable...'; i.e., implementational convenience/efficiency should also play
a major role. As an indication of the magnitude of the problems ahead, both in theory and in practice (computations), we note that an entire book on the subject has recently appeared: Mixed and Hybrid Finite Element Methods, by Brezzi and Fortin (1991), and we are very thankful because it both reduces the magnitude of our task and permits 'an appeal to authority.' (As we have no need for 'hybrid' elements in our book, we shall simply leave the curious reader dangling with respect to the term 'hybrid.') Depending on the reader's mathematical background, it may be the case that there is some 'tough sledding' ahead, although we shall endeavor to display and discuss what we consider to be close to the minimum amount of advanced mathematics (functional analysis, mainly) needed to adequately appreciate the magnitude and scope of the issues involved. In this regard, we make two remarks: (i) the Brezzi/Fortin book is written at a much higher mathematical level than ours, as is another important precursor to our work: Finite Element Methods for Navier-Stokes Equations, by Girault and Raviart (1986); and (ii) it is the very existence of this higher level of mathematics that turns off (scares) many potential 'CFD engineers' (and physical scientists) from the finite element method; they are much more comfortable with Taylor series and related divided differences, and are usually comfortable in their assumed world of finding classical solutions. [They, or at least a large subset of them, of course live under a false sense of security most of the time, since most of their results represent/approximate 'only' weak solutions, not classical ones, since the latter are often non-existent.]
As stated by Mason in Methods of Functional Analysis for Application in Solid Mechanics (1985), 'The subject of Functional Analysis, with its abstract character and sweeping generalizations, is not easy for untrained minds to master, since it departs considerably from the usual offhand engineering approach to mathematics, but once one succeeds in learning some of it, the dividends are very rewarding.' This statement could serve equally well as a warning and an inducement! It is also true, unfortunately for those with narrow/applied interests, that functional analysis covers much more than just weak solutions to 'our' PDEs. But there are some texts that try to focus on the applied side; in addition to Mason (1985) referred to above, there are Rektorys (1980) and Reddy (1986), to name a few. From Rektorys, we quote, and concur with(!), 'Functional analysis is a difficult subject for a non-mathematician. It is rich in abstract concepts which cannot be absorbed with haste. That is why I have advanced very cautiously, in an inductive rather than a deductive manner, from the simpler to the more complicated.' And from Reddy, 'An increased interest is seen in recent years in the study of functional analysis among engineers and physicists who are theoretically inclined. This is because it is now widely accepted that functional analysis is a powerful tool in the solution of mathematical problems arising from physical situations.' Next, from Ortega (1990), 'The most important tool in many areas of numerical analysis is linear algebra and matrix theory. ... In more advanced work, infinite-dimensional linear algebra (functional analysis) plays an analogous role.' Finally, a statement from O. Pironneau (personal communication) is relevant here: 'Functional analysis is really only required in finite elements if you want to do error analysis', a subject we mostly wish to avoid in this text; but not totally.

For the Stokes equations, unfortunately, Lax-Milgram cannot help us much.
We are forced to look to more general/powerful theory—especially when we seek the approximate solution via the GFEM. In the words of Gunzburger (1989), another important reference for our subject, 'In the positive-definite case... the mere inclusion of the finite element spaces within the underlying function spaces is essentially sufficient to assure that the
approximate solution is well-defined and is, as far as the rate of convergence is concerned, as accurate as possible for the type of finite element functions being used. Here, for the Navier-Stokes equations, the inclusions... are not by themselves sufficient to produce stable, meaningful approximations. We find ourselves in the realm of what are known as mixed finite element methods.' Another good reference for this (and other) material is Arnold (1990), who states, 'A key point, which is characteristic of mixed variational principles, is that the pair... is not an extremum point.... It is a saddle point.' That is to say, the steady Stokes equations, in u and P, correspond not to the minimizing function of a quadratic functional, but to a saddle point, in that the solution corresponds simultaneously to a minimum with respect to the velocity and a maximum with respect to the pressure; see (3.7-7) and Section 3.15. If we were to use (discretely) divergence-free basis functions, a possibility that we discuss in Section 3.13.7, much of the difficulty associated with mixed methods would vanish; but the fact is that the great majority of approximation methods use the mixed approximation, necessitating (at least) the solution of saddle-point problems. [It is an interesting 'aside' that if one were only and always interested in solving time-dependent non-linear problems (i.e., the NS equations), the issue of minimizing a functional over a divergence-free subspace in which the associated Lagrange multiplier 'turns out' to be the pressure would ostensibly never arise; one would then a priori give a more 'physical' interpretation to the pressure (it is no longer a mere mathematical adjunct), although it would still need the interpretation that part of its role is to keep u divergence-free.
It is still true, however, that weak solutions would/could benefit from the use of divergence-free basis functions, and the pressure would 'magically' disappear from the final equations.] In most FEM codes, both u and P are approximated, because the velocity basis functions are not divergence-free, and this leads one to the subject of 'approximate (FEM) solution of saddle-point problems': their existence, uniqueness, and accuracy (error estimates). The general case is of no interest here since our PDEs are specific; for more general cases than Stokes flow, see, for example, Oden and Carey (1983); Arnold (1990); the FEM Handbook (Kardestuncer and Norrie, 1987); and Brezzi and Fortin (1991). The general case (and associated 'abstract theory') of the type of saddle-point problems of interest herein was first directly addressed by Brezzi (1974), even though this turns out to be a 'special case' of the earlier Babuška theory (1971; see also the article (book!) by Babuška and Aziz in Aziz, 1972; and Babuška, 1973). Brezzi developed independently his own version of the theory, and it is his version that most closely describes the weak formulation of the Stokes equations in function space. In any event, the two theories agree and, importantly, they paved the way for much subsequent finite element analysis. The theory is considerably more powerful than that of Lax-Milgram in that it provides necessary and sufficient conditions for a well-posed continuous problem; it also supplies sufficient conditions for the approximate (discrete) problem that are also necessary in the following sense: only if they are satisfied is the approximate solution guaranteed to be stable (as $h \to 0$) and of at least quasi-optimal (best power of h) accuracy. The 'joint' theory was apparently first noticed by Bercovier (e.g., 1977; and in Bercovier and Pironneau, 1978), who combined the credit and led the way to the current nomenclature of 'BB theory/BB condition' (which condition we present below).
But two names were not enough, apparently. Oden et al. (1982) noted a connection between this theory and earlier theory on the NS equations by Ladyzhenskaya (1969), and coined the triple-crown appellation, LBB. Finally, in an attempt to be more objective/descriptive, Gunzburger
(1987) suggested the term, first used in Boland and Nicolaides (1983), div-stability, for reasons that we hope to clarify below. Another descriptive name, used (e.g.) by Carey and Oden (1986), is a 'consistency condition' (between the two function spaces). It 'tests the consistency of the approximation of derivatives...' (Thomasset, 1981, p. 32). Related to this is yet another appellation: 'compatibility condition.' Finally, the name most loved (it seems) by mathematicians (but feared by many engineers) is the term 'inf-sup condition.' All of these names are referring to the same 'problem.' [For a generalization of this theory to unsymmetric saddle-point problems, and a simplified/alternate treatment of Babuška's theory (without functional analysis), see Nicolaides (1982), and see Bernardi et al. (1990) for an application of this theory to spectral methods for NSE.]

Let us get on with it; consider the inhomogeneous Stokes equations (with body force) and inhomogeneous BC's:

$$\nabla P = \nu\nabla^2\mathbf{u} + \mathbf{g}, \quad \nabla\cdot\mathbf{u} = 0 \ \text{ in } \Omega, \quad \mathbf{u} = \mathbf{w} \ \text{ on } \Gamma. \tag{3.13-103}$$

The weak form is [see, for example, (3.12-34)]: find $\mathbf{u} \in H^1$ and $P \in L_0^2$ such that

$$\nu\int (\nabla\mathbf{u})^T : \nabla\mathbf{v} - \int P\,\nabla\cdot\mathbf{v} = \int \mathbf{g}\cdot\mathbf{v} \quad \forall\,\mathbf{v} \in H_0^1, \tag{3.13-104}$$

and

$$-\int q\,\nabla\cdot\mathbf{u} = 0 \quad \forall\, q \in L_0^2, \tag{3.13-105}$$

where, to conform to (more or less) standard terminology, we restate it (abstractly) as

$$a(\mathbf{u},\mathbf{v}) + b(\mathbf{v},P) = f(\mathbf{v}) \tag{3.13-106}$$

and

$$b(\mathbf{u},q) = g(q), \tag{3.13-107}$$

where the definition of the two bilinear forms $a(\cdot,\cdot)$ and $b(\cdot,\cdot)$, as well as that of the linear form, $f(\mathbf{v})$, is obvious, and $L_0^2$ is that subspace of $L^2$ that has the hydrostatic mode removed (e.g., via $\int P = 0$). The linear form $g(q)$ is not obvious, nor need it be explicated carefully; suffice it to say that it is a boundary term resulting from the inhomogeneous BC.
The B, BB, LBB, div-stability theory provides the necessary and sufficient conditions for the existence and uniqueness (up to an additive constant for P) of a solution to the weak form of the Stokes (and other) equations (the general theory is more general; we specialize here to our needs). Since Babuška's theory is very abstract (read 'difficult'), we present only the seemingly simpler of the two, that due to Brezzi, and this in only a very brief summary form, following Brezzi and Fortin (1991), to which the (mathematically literate) reader is referred for details. Given (assuming, which is true in our case) that both $a(\cdot,\cdot)$ and $b(\cdot,\cdot)$ are bounded, i.e., for all permissible $\mathbf{u}$, $\mathbf{v}$, and $q$,

$$|a(\mathbf{u},\mathbf{v})| \le \|a\|\cdot\|\mathbf{u}\|_1\cdot\|\mathbf{v}\|_1 \quad\text{and}\quad |b(\mathbf{u},q)| \le \|b\|\cdot\|\mathbf{u}\|_1\cdot\|q\|_0,$$

where

$$\|a\| = \sup_{0\ne\mathbf{u},\mathbf{v}\in H_0^1}\frac{|a(\mathbf{u},\mathbf{v})|}{\|\mathbf{u}\|_1\cdot\|\mathbf{v}\|_1} \quad\text{and}\quad \|b\| = \sup_{0\ne\mathbf{v}\in H_0^1,\ 0\ne q\in L_0^2}\frac{|b(\mathbf{v},q)|}{\|\mathbf{v}\|_1\cdot\|q\|_0},$$

the necessary and sufficient conditions for the existence of a unique (up to an additive constant for P) solution of (3.13-106) and (3.13-107) are two:
1. The ellipticity condition on $a(\cdot,\cdot)$,

$$a(\mathbf{v},\mathbf{v}) \ge \alpha\|\mathbf{v}\|_1^2 \quad \forall\,\mathbf{v}\in H_0^1, \tag{3.13-108}$$

where $\alpha > 0$ is a constant; i.e., $a(\cdot,\cdot)$ is coercive as well as bounded.

2. The condition

$$\sup_{\mathbf{v}\in H_0^1}\frac{\int q\,\nabla\cdot\mathbf{v}}{\|\mathbf{v}\|_1} \ge \beta\|q\|_0 \quad \forall\, q\in L_0^2, \tag{3.13-109}$$

where $\beta > 0$, and $L_0^2$ denotes $L^2$ modulo constants because q = constant is disallowed (it gives a LHS of zero; $\int_\Omega\nabla\cdot\mathbf{v} = \int_\Gamma \mathbf{n}\cdot\mathbf{v} = 0$ because $\mathbf{v} = 0$ on $\Gamma$), and the supremum (least upper bound) also precludes $\mathbf{v}$ from residing in $J_0$; i.e., $\nabla\cdot\mathbf{v} \ne 0$. In fact, $\mathbf{v}$ resides in the orthogonal complement to the divergence-free subspace of $H_0^1$ (see, for example, Girault and Raviart, 1986).

Before pressing on, it may be useful to attempt to relate (3.13-109), which soon will be shown to be equivalent to the famous (infamous?) inf-sup condition, a main product of the 'modern' (BB) theory, to the 'older' theory developed by Ladyzhenskaya and others, so that all will feel more comfortable with the term 'LBB condition.' (We are indebted to J.T. Oden for helping us track some of the history described below, especially since we were initially tempted to get the L out of LBB. E. Olsen also helped us in this regard.) This is 'required' because the earlier theorists seemingly did not need the inf-sup condition to obtain the same well-posedness results. [Remember that the Stokes equations are but one special (symmetric) case of the modern abstract theory. It is also noteworthy that many modern analysts do not seem to specifically state the need to satisfy the inf-sup condition in order to have a well-posed (continuous) problem; some examples: Temam (1984), Constantin and Foias (1989), and Kreiss and Lorenz (1989).] Specifically, Ladyzhenskaya (et al.) were concerned primarily with the decomposition of the space of $L^2$-vector-valued functions into the sum of divergence-free vectors and gradients of scalars, each from a different subspace of $L^2$.
In her classic textbook (Ladyzhenskaya, 1969), she proves the existence of a divergence-free vector field ($\in H^1$) that satisfies given Dirichlet BC's, which vector field can be used to show that for every $q \in L_0^2$, there exists a vector $\mathbf{v} \in H_0^1$ satisfying

$$\nabla\cdot\mathbf{v} = q \quad\text{and}\quad \|\mathbf{v}\|_1 \le \gamma\|q\|_0, \tag{3.13-110}$$

where $\gamma$ is a positive constant. The fact that (3.13-110) implies (3.13-109), which gets the L into LBB, goes as follows:

1. Given $q \in L_0^2$, define $A(q) = \sup_{\mathbf{v}} \int q\,\nabla\cdot\mathbf{v}/\|\mathbf{v}\|_1$, so that $A(q) \ge \int q\,\nabla\cdot\mathbf{u}/\|\mathbf{u}\|_1$ for general (arbitrary) $\mathbf{u}$.

2. Pick that $\mathbf{u}$ satisfying $\nabla\cdot\mathbf{u} = q$ to give $A(q) \ge \int q^2/\|\mathbf{u}\|_1 = \|q\|_0^2/\|\mathbf{u}\|_1$.
3. Rearrange and use (3.13-110) as follows:

$$A(q) = \sup_{\mathbf{v}}\frac{\int q\,\nabla\cdot\mathbf{v}}{\|\mathbf{v}\|_1} \ge \frac{\|q\|_0^2}{\|\mathbf{u}\|_1} \ge \frac{\|q\|_0}{\gamma},$$

which is just (3.13-109) with $\beta = 1/\gamma$. Thus, the 'L' (et al.) approach is equivalent, for the Stokes equations, to the BB approach, and shows that (3.13-109) is indeed satisfied.

Remarks: (1) The second part of (3.13-110) is not to be found in Ladyzhenskaya (1969); it is probably in one of the references cited in her appendix called 'Comments.' (2) For additional, detailed discussion of the problem $\nabla\cdot\mathbf{v} = q$, see Galdi (1994, Vol. 1).

To finish, we restate (3.13-109) in the equivalent inf-sup form:

$$\inf_{q\in L_0^2}\ \sup_{\mathbf{v}\in H_0^1}\frac{\int q\,\nabla\cdot\mathbf{v}}{\|\mathbf{v}\|_1\,\|q\|_0} \ge \beta. \tag{3.13-111}$$

To recapitulate: the satisfaction of (3.13-108) and (3.13-109) or (3.13-111), the last of which is the LBB condition, assures the existence of a unique solution to the (weak form of the) Stokes equations. To conclude our summary of the continuous case, we present (for $\mathbf{w} = 0$) the following bounds on the Stokes solution in terms of the constants $\alpha$ and $\beta$ above, from Brezzi and Fortin (1991):

$$\|\mathbf{u}\|_1 \le \frac{1}{\alpha}\|f\| + \frac{1}{\beta}\left(1 + \frac{\|a\|}{\alpha}\right)\|g\|, \tag{3.13-112}$$

and, once the average pressure has been subtracted from the pressure solution to give $P \in L_0^2$,

$$\|P\|_0 \le \frac{1}{\beta}\left(1 + \frac{\|a\|}{\alpha}\right)\|f\| + \frac{\|a\|}{\beta^2}\left(1 + \frac{\|a\|}{\alpha}\right)\|g\|, \tag{3.13-113}$$

where

$$\|f\| = \sup_{\mathbf{v}\in H_0^1}\frac{\int\mathbf{v}\cdot\mathbf{g}}{\|\mathbf{v}\|_1} \quad\text{and}\quad \|g\| = \sup_{q\in L_0^2}\frac{g(q)}{\|q\|_0},$$

where $\mathbf{g}$ is the 'body force' in (3.13-103), and $g$ is the linear form/functional in (3.13-107).

This was the easy part. Now we move on to the hard part: discrete (via FEM) approximations to the Stokes equations, which leads to the 'discrete LBB condition.' 'Fortunately,' however, there are only two aspects of this condition that are difficult: (i) understanding it, and (ii) applying (verifying) it, especially on general meshes. Unfortunately, however, both aspects are very difficult; experts are few. The recent Ph.D. thesis by Qin (1994) summarizes some of the available techniques. We begin by writing the approximation problem in the same abstract formulation as the continuum problem, (3.13-106) and (3.13-107),

$$a(\mathbf{u}^h,\mathbf{v}^h) + b(\mathbf{v}^h,P^h) = f(\mathbf{v}^h) \tag{3.13-114}$$
and

$$b(\mathbf{u}^h,q^h) = g(q^h), \tag{3.13-115}$$

where we a priori assume $a(\cdot,\cdot)$ is properly defined (coercive, bounded), which is true in all realistic cases. Thus, the discrete LBB/inf-sup condition is the following analog of the continuous version: for every $q^h \in Q^h$,

$$\sup_{\mathbf{v}^h\in V^h}\frac{|b(\mathbf{v}^h,q^h)|}{\|\mathbf{v}^h\|_1} = \sup_{\mathbf{v}^h\in V^h}\frac{\left|\int q^h\,\nabla\cdot\mathbf{v}^h\right|}{\|\mathbf{v}^h\|_1} \ge k_h\|q^h\|_0, \tag{3.13-116}$$

where $k_h \ge k_0 > 0$ ($k_0$ is the $h \to 0$ limit of $k_h$, the value of the constant on a mesh of 'size' h), $V^h$ is the discrete velocity space and $Q^h$ is the discrete pressure space modulo any pressure modes (the null space of the discrete gradient is 'out of bounds'). The 'matrix' realization of (3.13-114) and (3.13-115) is

$$Ku + CP = f, \tag{3.13-117}$$

$$C^Tu = g, \tag{3.13-118}$$

and the corresponding realization of (3.13-116) is

$$\max_{v\in V_h}\frac{v^TCq}{\sqrt{v^TKv}} \ge k_h\sqrt{q^TQq} \quad \forall\, q\in Q_h, \tag{3.13-119}$$

where $Q_{ij} = \int\psi_i\psi_j$ is the pressure 'mass' matrix; or, equivalently,

$$\min_{q\in Q_h}\ \max_{v\in V_h}\frac{|v^TCq|}{\sqrt{v^TKv}\cdot\sqrt{q^TQq}} \ge k_h, \tag{3.13-120}$$

where $k_h > 0$, which is basically a stability condition on the C-matrix. If $k_0 > 0$ is fixed independent of h (i.e., if $k_h$ becomes independent of h for $h \to 0$), then the LBB (inf-sup) condition is satisfied, the discrete unique solution, u and P (up to the elements in the null space of C), exists, and $u^h$ and $P^h$ are of optimal (or at least quasi-optimal) accuracy: the convergence rate as $h \to 0$ is the best possible from the approximating space. In fact, Brezzi and Fortin (1991, p. 56) give the following error estimates for LBB-stable elements:

$$\|\mathbf{u} - \mathbf{u}^h\|_1 \le c_1\inf_{\mathbf{v}^h\in V^h}\|\mathbf{u} - \mathbf{v}^h\|_1 + c_2\inf_{q^h\in Q^h}\|P - q^h\|_0, \tag{3.13-121}$$

where

$$c_1 \le \left(1 + \frac{\|a\|}{\alpha}\right)\cdot\left(1 + \frac{\|b\|}{k_h}\right) \quad\text{and}\quad c_2 \le \frac{\|b\|}{\alpha}, \tag{3.13-122}$$

and, modulo pressure modes,

$$\|P - P^h\|_0 \le \left(1 + \frac{\|b\|}{k_h}\right)\inf_{q^h\in Q^h}\|P - q^h\|_0 + \frac{\|a\|}{k_h}\cdot\|\mathbf{u} - \mathbf{u}^h\|_1, \tag{3.13-123}$$

and we note that, with respect to $k_h$, we have $\|\mathbf{u} - \mathbf{u}^h\|_1 \le O(1) + O(1/k_h)$
and $\|P - P^h\|_0 \le O(1) + O(1/k_h) + O(1/k_h^2)$; i.e., pressure is less stable than velocity if $k_h$ is badly behaved.

Important Remarks: (1) If the discrete LBB condition is not satisfied (e.g., if $k_h = ch^\beta$ for positive constants c and $\beta$), then the theory becomes silent; or at least nearly so. Whereas LBB satisfaction was necessary and sufficient for the continuous case, it is subordinated to a sufficient condition in the discrete case. (2) A solution to (3.13-117) and (3.13-118) can only exist if the g-vector is orthogonal to all null vectors of the C-matrix, a solvability condition discussed in the previous section that we assume to be satisfied at this point.

Actually, there is somewhat more that can be said when a particular element fails LBB: (i) convergence may still occur, but the rate may be suboptimal (lower power of h); and (ii) there exist data for which convergence does not occur. It 'is in a sense necessary if we want a reasonable behavior of the discrete problem' (Brezzi and Fortin, 1991, p. 59). (Our 'reaction' to this issue will be presented in both the following section and in Section 3.13.5J, and is related to the 'reasonableness' of the data.) A good part of the problem in the finite-dimensional case is caused by the fact that the space of discretely divergence-free velocities is generally not a subspace of the continuous divergence-free vector space; rather, it is an 'external' approximation (Brezzi and Fortin, 1991). This precludes the establishment of relationships like (3.13-110), which would, as it does in the continuous case, lead directly to a satisfaction of the LBB condition.
This shortcoming leaves us with another problem as well: as we search for useful elements, we must always beware lest the pressure (constraint) space become (relatively) too large, and we must seek elements whose discretely divergence-free velocity space still permits good approximation capability...; see Malkus's Appendix to Chapter 4 in Hughes (1987).

d. Bringing LBB to the rest of the people

o Stability analysis. More light (for some) might be shed on the above matters if we review some linear algebra and then restate the discrete div-stability criteria. Toward this end, first recall that if we are given a linear system, $Ax = b$, where A is $N \times N$ and SPD, we can write $x = A^{-1}b$ because we know that $A^{-1}$ exists. Next, we use the property of a compatible matrix norm to write $\|x\| \le \|A^{-1}\|\cdot\|b\|$, which bounds (perhaps pessimistically) the solution in terms of the data, and is thus a 'stability' statement. Choosing now the discrete $L^2$ norm (also called the Euclidean vector norm, $\|x\|_2^2 = x^Tx$, which induces the spectral matrix norm), which is appropriate for our purposes, we have $\|A\|_2 = \max_{z\ne 0}\|Az\|_2/\|z\|_2 = \lambda_{\max}(A)$, where $\lambda_{\max}(A)$ is the maximum eigenvalue of A. It then follows that $\|A^{-1}\|_2 = \lambda_{\max}(A^{-1}) = 1/\lambda_{\min}(A)$ so that $\|x\|_2 \le \|b\|_2/\lambda_{\min}(A)$. If x is to remain bounded as N (the vector length) $\to \infty$, it is required (assuming $\|b\|_2$ stays bounded) that $\lambda_{\min}(A)$ stay bounded away from zero; $\lambda_{\min}^{-1}(A)$ could be called the stability constant in that the solution remains bounded for all N (stable) as long as $\lambda_{\min}(A)$ remains bounded away from zero, and a 'large' stability constant implies a less 'stable' problem (see also Arnold, 1990).
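The bound $\|x\|_2 \le \|b\|_2/\lambda_{\min}(A)$ can be checked numerically. The following is only a minimal NumPy sketch under stated assumptions: the matrix is a small random SPD matrix (a hypothetical stand-in, not a finite element matrix), and the helper name `spd_stability_check` is ours.

```python
import numpy as np

def spd_stability_check(A, b):
    """Return ||x||_2, ||b||_2 / lambda_min(A), and lambda_min(A) for SPD A x = b."""
    lam = np.linalg.eigvalsh(A)        # eigenvalues of a symmetric matrix, ascending
    x = np.linalg.solve(A, b)
    return np.linalg.norm(x), np.linalg.norm(b) / lam[0], lam[0]

# Hypothetical test data: a random SPD matrix G^T G + I and a random right-hand side.
rng = np.random.default_rng(0)
G = rng.standard_normal((6, 6))
A = G.T @ G + np.eye(6)
b = rng.standard_normal(6)

norm_x, bound, lam_min = spd_stability_check(A, b)
inv_norm = np.linalg.norm(np.linalg.inv(A), 2)   # spectral norm of A^{-1}

print(f"||x||_2 = {norm_x:.4e} <= ||b||_2 / lambda_min = {bound:.4e}")
print(f"||A^-1||_2 = {inv_norm:.4e}, 1/lambda_min = {1.0 / lam_min:.4e}")
```

The bound is attained only when b is parallel to the eigenvector belonging to $\lambda_{\min}(A)$; for generic data it is pessimistic, as the text remarks.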
It is possible, and useful, to take this concept one step further, by actually constructing the solution to $Ax = b$ in terms of the full spectrum (eigenvalues) and corresponding eigenvectors of A; i.e., instead of merely bounding the solution, we shall represent it in 'closed' form. To this end, recall that an SPD matrix has a complete set of orthogonal eigenvectors, say $\{z_i\}$, which form a basis for $R^N$ and satisfy $Az_i = \lambda_iz_i$ with $0 < \lambda_1 \le \lambda_2 \le \dots \le \lambda_N < \infty$ and (after normalization, which is always possible) $z_i^Tz_j = \delta_{ij}$, the Kronecker delta. Then (recall) the solution of $Ax = b$ can be obtained via the eigenvector expansion, $x = \sum_{j=1}^N a_jz_j$, as follows: find $a_i$ by inserting the expansion for x into $Ax = b$ and forming $z_i^TAx = z_i^Tb$; i.e., $\sum_{j=1}^N a_jz_i^TAz_j = z_i^Tb$, which implies that $\sum_{j=1}^N a_j\lambda_jz_i^Tz_j = z_i^Tb$, or $a_i = (z_i^Tb)/\lambda_i$, so that $x = \sum_{j=1}^N (z_j^Tb)z_j/\lambda_j$ is the solution to $Ax = b$. (Note that the previous bound, $\|x\|_2 \le \|b\|_2/\lambda_1$, is also derivable from this complete solution.) The key object of this exercise is to point out, and then emphasize, the fact that it is more than just the behavior of $\lambda_{\min}(A) = \lambda_1$ that is important in the analysis of stability: the projection of the data (b) onto the eigenvectors $\{z_j\}$ is also quite relevant, at least in 'practice'; i.e., the amplitude coefficients, $\{z_j^Tb\}$. In particular, if b 'just happened' to be orthogonal to the first eigenvector, then $z_1$ and $\lambda_1$ would play no role in determining the stability of the solution to this particular problem (except in the face of round-off error, which we ignore at this point). More relevant yet is the possibility that while $z_1^Tb$ is not zero, it may (and generally will) vary with N (just as $\lambda_1$ may do). Thus, suppose $\lambda_1 = C_1/N^\alpha$ for large N where $\alpha > 0$. Then, if $\beta_1 = z_1^Tb$ remains bounded as N increases, or if it decreases with N, via $\beta_1 = C_2/N^\beta$, but too slowly ($0 < \beta < \alpha$), the solution will indeed become unbounded as $N \to \infty$; i.e., be 'unstable' via $(z_1^Tb)/\lambda_1 = O(N^{\alpha-\beta})$. But the other side of the coin is also relevant, and possibly important; namely, if $\beta > \alpha$, then even though $\lambda_1 \to 0$ as $N \to \infty$, the solution may not 'blow up' because $(z_1^Tb)/\lambda_1 \to 0$ as $N \to \infty$. (For simplicity, and focus, we assume that only $\lambda_1$ might be a 'bad actor'; generalization to $\lambda_2$, etc., is virtually immediate.) This is as far as we need take this discussion at this point, ending with the realization that if matrix A 'appears' to be unstable because $\lambda_1 \approx C_1/N^\alpha$, then it is only truly unstable for the specific problem at hand if $\beta_1 \sim C_2/N^\beta$ and $\beta < \alpha$. As we shall see later, for the Stokes equations, even though situations occur for which $\lambda_1 \sim C_1/N^\alpha$ with $\alpha = 2$ (for example), there are no 'reasonable data' (i.e., data for problems of interest, those whose solution makes sense to consider) for which $\beta_1 \sim C_2/N^\beta$ with $\beta < \alpha$. For all reasonable data, it will be the case that $\beta > \alpha$. [This is, of course, related to the purely mathematical definition of stability, 'solution bounded for all data,' vis-a-vis the more practical (but perhaps occasionally risky) definition, 'solution bounded for all data that make sense to consider.' A mathematician may be perfectly content to consider solving (or, more likely, contemplate solving) a problem that most 'engineers' (i.e., applied physical scientists) would walk away from. See too the end of Section 3.13.5J.]

We now apply this analysis, and reasoning, to the Stokes equations,

$$Ku + CP = f, \quad C^Tu = g, \tag{3.13-124}$$

at least 'formally,' to obtain

(i) Solve

$$(C^TK^{-1}C)P = C^TK^{-1}f - g \tag{3.13-125}$$

for P, where $u \in R^n$ and $P \in R^m$, etc., and the total dimension of the (product) space is $n + m = N$. [Alternatively, and equivalently, and in a form which will later prove useful,
solve $(Q^{-1/2}C^TK^{-1}CQ^{-1/2})(Q^{1/2}P) = Q^{-1/2}(C^TK^{-1}f - g)$ for $Q^{1/2}P$, where Q is the (SPD) pressure 'mass matrix,' $Q_{ij} = \int\psi_i\psi_j$.]

(ii) Solve

$$Ku = f - CP \quad\text{for } u. \tag{3.13-126}$$

Remark: The solution procedure is only 'formal' because $K^{-1}$ is a dense matrix, and one would be foolish to actually construct it; the coupled (and sparse) system of (3.13-124) is what is actually solved on the computer. The point is that the two solution methods are totally equivalent, algebraically, and the above formalism will help us to better understand inf-sup/LBB.

Continuing the formal solution procedure, we first obtain the m-vector P via

$$P = (C^TK^{-1}C)^{-1}(C^TK^{-1}f - g) \tag{3.13-127}$$

and then the n-vector u from

$$u = K^{-1}(f - CP), \tag{3.13-128}$$

and we are ready to apply classical linear algebra theory to the results. First, clearly there are two necessary conditions for solvability: (i) K is non-singular, and (ii) $C^TK^{-1}C$ is non-singular. It turns out that (i) is virtually always satisfied (for all physically reasonable problems, those requiring Dirichlet data for velocity on at least a portion of the boundary) and (ii) is satisfied up to the existence of 'pure pressure modes' (see Section 3.13.2b), which we preclude from the present discussion. Thus, by fiat, we have a solvable system. Next, for stability, we would like to have both $\|K^{-1}\|$ and $\|(C^TK^{-1}C)^{-1}\|$ bounded, in appropriate norms, and we start with the (ostensibly) easy part by showing that $\|K^{-1}\|$ is bounded (and show the bounds) in both the spectral norm and in the K-norm, which latter norm is the discrete version of the $H^1$-(semi-)norm. (It is easy because K is SPD.) We begin by showing that, in fact, $\|K^{-1}\|_K = \|K^{-1}\|_2$, where $\|x\|_K = \sqrt{x^TKx}$ is the K-norm of x: for a general but SPD matrix A, we seek $\|A^{-1}\|_A$ as follows:

$$\begin{aligned}
\|A^{-1}\|_A &= \max_x\frac{\|A^{-1}x\|_A}{\|x\|_A} = \max_x\sqrt{\frac{(A^{-1}x)^TA(A^{-1}x)}{x^TAx}} = \max_x\sqrt{\frac{x^TA^{-1}x}{x^TAx}} \\
&= \max_{y=A^{1/2}x}\sqrt{\frac{(A^{-1/2}y)^TA^{-1}(A^{-1/2}y)}{y^Ty}} = \max_y\sqrt{\frac{y^TA^{-2}y}{y^Ty}} \\
&= \sqrt{\lambda_{\max}(A^{-2})} = \sqrt{1/\lambda_{\min}(A^2)} = 1/\lambda_{\min}(A) = \lambda_{\max}(A^{-1}) = \|A^{-1}\|_2,
\end{aligned} \tag{3.13-129}$$

a result that actually applies to any power of A; i.e., $\|A^\alpha\|_A = \|A^\alpha\|_2$. Thus, $\|K^{-1}\|_K = \|K^{-1}\|_2 = 1/\lambda_{\min}(K)$, and we have (from Axelsson and Barker, 1984, for example),

$$\lambda_{\min}(K) \sim Ch^{n_s} \quad (C = \text{constant}), \tag{3.13-130}$$

where $n_s$ is the spatial dimension ($n_s$ = 1, 2, or 3), h is the maximum element size (length), and the meaning of (3.13-130) is that there are constants $C_1$ and $C_2$ independent
of h such that $C_1h^{n_s} \le \lambda_{\min} \le C_2h^{n_s}$. Thus,

$$\|K^{-1}\|_K \sim C/h^{n_s}, \tag{3.13-131}$$

and we obtain our first 'surprise': $\|K^{-1}\|_2$ is not uniformly bounded! The resolution of this dilemma is first to realize that K does not in fact correspond to the (unbounded) operator $-\nabla^2$, and thus $K^{-1}$ does not correspond to its (bounded) inverse; it is, in fact, $M^{-1}K$ that corresponds to $-\nabla^2$ (see p. 213 of Strang and Fix, 1973), and it follows that $\|M^{-1}K\|_2 \le \|M^{-1}\|_2\cdot\|K\|_2 \le \lambda_{\max}(K)/\lambda_{\min}(M)$ and that $\|(M^{-1}K)^{-1}\|_2 \le \lambda_{\max}(M)/\lambda_{\min}(K)$. It is easily shown (see, for example, Axelsson and Barker, 1984, again, or Strang and Fix) that $\lambda_{\max}(M) \sim Ch^{n_s}$ so that $\|(M^{-1}K)^{-1}\|_2 = O(1)$; i.e., the true inverse 'Laplacian' is bounded. But we have K in our Stokes equations, not $M^{-1}K$, which forces us to return to the fact that $\|K^{-1}\|_2 \sim O(1/h^{n_s})$. The 'final' resolution of the paradox is obtained by looking at the entire equation; in particular, at the magnitude of the RHS vector, f (CP will be discussed later), and this we do for a scalar (Poisson-type) problem for simplicity of presentation (the true, vector case follows easily, we assert). The problem $-\nabla^2u = S$ in $\Omega$ with $u = 0$ on $\Gamma$ leads to $Ku = f$ via the GFEM, where $f_i = \int\phi_iS$, and we assume S to be 'well-behaved' (e.g., $S \in L^2$). Considering the compact support of the basis functions (or, with similar although slightly less accurate results, interpolate S via the basis functions to obtain a mass matrix on the RHS), it follows easily that $f_i = \bar{S}_ih^{n_s}$, where $\bar{S}_i$ is the basis-function-weighted average value of $S(x)$ over the support of $\phi_i$ [which is $O(h^{n_s})$]. Thus, $\|u\|_2 \le \|f\|_2/\lambda_{\min}(K) = h^{n_s/2}\sqrt{\Omega\,\overline{S^2}}/\lambda_{\min}(K)$, where $\overline{S^2} = (1/N)\sum_{i=1}^N\bar{S}_i^2$ is the (discrete) domain average of $S^2$ and $\Omega$ is the domain size.
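The scalings $\lambda_{\min}(K) \sim Ch^{n_s}$ and $\lambda_{\max}(M) \sim Ch^{n_s}$ quoted above can be observed in the simplest setting. The sketch below is an illustration under our own assumptions, not taken from the text: $n_s = 1$, linear elements on a uniform mesh of the unit interval with homogeneous Dirichlet conditions, and a hypothetical helper name `stiffness_and_mass_1d`.

```python
import numpy as np

def stiffness_and_mass_1d(n_interior):
    """Linear-element K and M on (0, 1), uniform mesh, homogeneous Dirichlet BCs."""
    h = 1.0 / (n_interior + 1)
    off = np.ones(n_interior - 1)
    K = (2.0 * np.eye(n_interior) - np.diag(off, 1) - np.diag(off, -1)) / h
    M = (4.0 * np.eye(n_interior) + np.diag(off, 1) + np.diag(off, -1)) * h / 6.0
    return h, K, M

rows = []
for n_interior in (15, 31, 63):
    h, K, M = stiffness_and_mass_1d(n_interior)
    lam_min_K = np.linalg.eigvalsh(K)[0]     # smallest eigenvalue of K
    lam_max_M = np.linalg.eigvalsh(M)[-1]    # largest eigenvalue of M
    rows.append((h, lam_min_K / h, lam_max_M / h, lam_max_M / lam_min_K))
    print(f"h={h:.4f}  lam_min(K)/h={lam_min_K / h:.4f}  "
          f"lam_max(M)/h={lam_max_M / h:.4f}  "
          f"lam_max(M)/lam_min(K)={lam_max_M / lam_min_K:.4f}")
```

In this 1D setting both scaled eigenvalues settle to constants as the mesh is refined, so $1/\lambda_{\min}(K)$ grows like $1/h$ while the ratio $\lambda_{\max}(M)/\lambda_{\min}(K)$, which bounds $\|(M^{-1}K)^{-1}\|_2$, stays $O(1)$.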
Using (3.13-130) then leads to $\|u\|_2 \le c'\sqrt{\overline{S^2}}/h^{n_s/2}$, which looks ominous until it is realized that $\|u\|_2 = \sqrt{\sum_{j=1}^Nu_j^2} = \sqrt{N\,\overline{u^2}} = c''\sqrt{\overline{u^2}}/h^{n_s/2}$ to give, finally (in the RMS norm),

$$\sqrt{\overline{u^2}} \le c\sqrt{\overline{S^2}}; \tag{3.13-132}$$

the solution is indeed 'stable' (bounded by the data independently of h) even though $\|K^{-1}\|_2 \to \infty$ as $h \to 0$, and is our first example of a stable result from $Ax = b$ even though $\lambda_{\min} \to 0$ as $N \to \infty$; both $z_j^Tb$ and $\lambda_j$ vary with h as $h^{n_s}$.

We now turn our attention to $C^TK^{-1}C$ and the issue of its (potential) bounded inverse. In this regard it is first relevant to point out that Malkus (1981) showed long ago that the following eigenvalue problem,

$$(C^TK^{-1}C)q_i = \sigma_iQq_i, \quad i = 1, 2, \dots, m, \tag{3.13-133}$$

which he called the second adjoint LBB eigenproblem, has the (real) spectrum $0 \le \sigma_1 \le \sigma_2 \le \dots \le \sigma_m < \infty$. In fact, if the continuous version of this eigenproblem can be interpreted as

$$[\nabla\cdot(\nabla^2)^{-1}\nabla]q_i = \sigma_iq_i, \tag{3.13-134}$$
and we believe that it can, at least up to BC's, and if $\nabla\cdot(\nabla^2)^{-1}\nabla$ approximates the identity operator [see too Karniadakis et al. (1993)], then we would obviously have $\{\sigma_i\} = 1$ (and $q_i$ arbitrary!), and this does seem to be approximately true, numerically, at least for the 'good' part of the spectrum. That is, at least for LBB-stable elements, it does seem to be true that the eigenvalues of (3.13-133) satisfy $-\delta_1 \le \sigma_i - 1 \le \delta_2$ where $0 < \delta_1, \delta_2 < 1$. [Also, the eigenvalues of (3.13-134) are bounded, above and below, because $\nabla\cdot(\nabla^2)^{-1}\nabla$ is a bounded operator; there are not any derivatives left.] The 'velocity-mode' that is associated with (3.13-134) for $\sigma = 1$ is given by $u_i = [2/(1 \pm \sqrt{5})](\nabla^2)^{-1}\nabla q_i$ and is referred to as an 'irrotational' (curl-free) mode in Griffiths and Silvester (1994) and Griffiths (1996). Thus, we might expect the eigenvalues of $Q^{-1}C^TK^{-1}C$ [or, equivalently, those of $Q^{-1/2}(C^TK^{-1}C)Q^{-1/2}$] to be $O(1)$, at least in the best of cases. But for 'LBB-unstable' elements, such as $Q_1Q_0$, we are not so fortunate.

But this is getting ahead of the story, and we return to a consideration of the stability of (3.13-127) and (3.13-128) after making a useful change of variable: define $\tilde{u}$ via

$$\tilde{u} = K^{-1}f, \tag{3.13-135}$$

after which the Stokes equations become $Ku + CP = K\tilde{u}$ and $C^Tu = g$; i.e., they correspond to a discrete $H^1$-projection (see Appendix 3) of $\tilde{u}$ to the discretely divergence-free subspace. Then (3.13-127) becomes simply

$$P = (C^TK^{-1}C)^{-1}(C^T\tilde{u} - g), \tag{3.13-136}$$

and (3.13-128) becomes

$$u = \tilde{u} - K^{-1}CP, \tag{3.13-137}$$

where we remark that, from our earlier discussion, we know that $\tilde{u}$ is well-behaved (bounded independent of h).
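The algebraic equivalence between the formal two-step procedure, (3.13-135) to (3.13-137), and the coupled system (3.13-124) can be confirmed on a small example. This is only a sketch under stated assumptions: K, C, f and g are random stand-ins of hypothetical size (K SPD, C of full column rank so that no pure pressure modes exist), not matrices assembled from any finite element.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3                                  # hypothetical numbers of velocity / pressure unknowns
G = rng.standard_normal((n, n))
K = G.T @ G + n * np.eye(n)                  # SPD stand-in for the viscous matrix
C = rng.standard_normal((n, m))              # random gradient matrix, full column rank
f = rng.standard_normal(n)
g = rng.standard_normal(m)

# Formal two-step solution, written with solves rather than explicit inverses.
u_tilde = np.linalg.solve(K, f)              # (3.13-135)
S = C.T @ np.linalg.solve(K, C)              # C^T K^{-1} C
P = np.linalg.solve(S, C.T @ u_tilde - g)    # (3.13-136)
u = u_tilde - np.linalg.solve(K, C @ P)      # (3.13-137)

# Coupled saddle-point system (3.13-124), solved directly for comparison.
saddle = np.block([[K, C], [C.T, np.zeros((m, m))]])
coupled = np.linalg.solve(saddle, np.concatenate([f, g]))

print("max |u - u_coupled| =", np.max(np.abs(u - coupled[:n])))
print("max |P - P_coupled| =", np.max(np.abs(P - coupled[n:])))
```

For matrices of realistic size one would of course not form $C^TK^{-1}C$ densely; as the text notes, the procedure is a formal device for analysis.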
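Beginning with P from (3.13-136), which is appropriately measured in the discrete version of the $L^2$-norm, via $P^h(x) = \sum_{j=1}^mP_j\psi_j(x)$, we have

$$\|P^h\|_0^2 = \int[P^h(x)]^2 = \int\Big[\sum_{j=1}^mP_j\psi_j(x)\Big]^2 = \sum_j\sum_kP_jP_k\int\psi_j\psi_k = P^TQP = \|P\|_Q^2 = \|Q^{1/2}P\|_2^2, \tag{3.13-138}$$

and (3.13-136) yields

$$\begin{aligned}
\|P^h\|_0^2 &= [(C^TK^{-1}C)^{-1}(C^T\tilde{u} - g)]^TQ[(C^TK^{-1}C)^{-1}(C^T\tilde{u} - g)] \\
&= [Q^{-1/2}(C^T\tilde{u} - g)]^T[Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}]^2[Q^{-1/2}(C^T\tilde{u} - g)] \\
&= \|[Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}][Q^{-1/2}(C^T\tilde{u} - g)]\|_2^2,
\end{aligned}$$

and thus

$$\begin{aligned}
\|P^h\|_0 &= \|[Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}]\cdot[Q^{-1/2}(C^T\tilde{u} - g)]\|_2 \\
&\le \|Q^{-1/2}(C^T\tilde{u} - g)\|_2\cdot\lambda_{\max}(Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}) \\
&= \|Q^{-1/2}(C^T\tilde{u} - g)\|_2/\lambda_{\min}(Q^{-1/2}(C^TK^{-1}C)Q^{-1/2}),
\end{aligned} \tag{3.13-139}$$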
and we wish to know if and how $\|P^h\|_0$ varies with h. The numerator, the easy part, is estimated first: to begin, we recognize that there exists a continuous 'velocity' field, $\tilde{u}^h(x)$, that satisfies the BC's of the Stokes velocity, and that its discrete, weak divergence (at pressure nodes) is $Q^{-1}(C^T\tilde{u} - g)$; i.e., $\nabla_h\cdot\tilde{u}^h = \sum_{j=1}^m[Q^{-1}(C^T\tilde{u} - g)]_j\psi_j(x)$ so that

$$\begin{aligned}
\Omega\,\overline{(\nabla_h\cdot\tilde{u}^h)^2} &= \int(\nabla_h\cdot\tilde{u}^h)^2 = \|\nabla_h\cdot\tilde{u}^h\|_0^2 = \sum_j\sum_k[Q^{-1}(C^T\tilde{u} - g)]_j[Q^{-1}(C^T\tilde{u} - g)]_k\int\psi_j\psi_k \\
&= [Q^{-1}(C^T\tilde{u} - g)]^TQ[Q^{-1}(C^T\tilde{u} - g)] = \|Q^{-1}(C^T\tilde{u} - g)\|_Q^2 = \|Q^{-1/2}(C^T\tilde{u} - g)\|_2^2,
\end{aligned} \tag{3.13-140}$$

where $\overline{(\nabla_h\cdot\tilde{u}^h)^2}$ is the average value of $(\nabla_h\cdot\tilde{u}^h)^2$ over the domain, $\Omega$, and is obviously independent of h. Thus,

$$\|Q^{-1/2}(C^T\tilde{u} - g)\|_2 = \sqrt{\Omega\,\overline{(\nabla_h\cdot\tilde{u}^h)^2}} = \|Q^{-1/2}(C^TK^{-1}f - g)\|_2 \tag{3.13-141}$$

is independent of h. We are thus left with $\|P^h\|_0 \le c/\lambda_{\min}[Q^{-1/2}(C^TK^{-1}C)Q^{-1/2}]$, where c is independent of h, and we now turn our attention to the denominator and show that it is intimately related to the LBB 'constant,' $k_h$, that is defined by [cf. (3.13-120)]

$$k_h = \min_P\ \max_u\frac{|u^TCP|}{\sqrt{u^TKu}\cdot\sqrt{P^TQP}}, \tag{3.13-142}$$

wherein we note that the 'appropriate' norms have been utilized: $H^1$ for velocity and $L^2$ for pressure. This is the finite-dimensional version of the inf-sup condition. By actually evaluating $k_h$ from the above equation, we shall see how it is connected to $\lambda_{\min}[Q^{-1/2}(C^TK^{-1}C)Q^{-1/2}]$. The evaluation below follows that of Stenberg (1991; personal communication), although it has been known much longer (e.g., Malkus, 1981). For a given P, we must first compute

$$\begin{aligned}
\sigma(P) &= \max_u\frac{u^TCP}{\sqrt{u^TKu}} = \max_u\frac{(K^{1/2}u)^TK^{-1/2}CP}{\sqrt{(K^{1/2}u)^T(K^{1/2}u)}} = \max_{v=K^{1/2}u}\frac{v^TK^{-1/2}CP}{\sqrt{v^Tv}} \\
&= \max_v\frac{v^TK^{-1/2}CP}{\|v\|_2} = \max_v\frac{v^Tw}{\|v\|_2} \quad\text{where } w = K^{-1/2}CP \\
&= \max_\theta\|w\|_2\cos\theta,
\end{aligned}$$

where $\theta$ is the angle between v and w, from the definition of the inner product: $v^Tw = \|v\|_2\cdot\|w\|_2\cos\theta$. Clearly, the maximum is attained when $\theta = 0$; i.e., when v and w are
parallel. This says $v = \beta w$ (and thus $u = \beta K^{-1/2}w = \beta K^{-1}CP$) for an arbitrary scalar $\beta$ and yields

$$\sigma(P) = \|w\|_2 = \|K^{-1/2}CP\|_2. \tag{3.13-143}$$

We now insert (3.13-143) into (3.13-142) and vary P:

$$\begin{aligned}
k_h^2 &= \min_P\frac{\sigma^2(P)}{P^TQP} = \min_P\frac{P^TC^TK^{-1}CP}{P^TQP} \\
&= \min_P\frac{(Q^{1/2}P)^T(Q^{-1/2}C^TK^{-1}CQ^{-1/2})(Q^{1/2}P)}{(Q^{1/2}P)^T(Q^{1/2}P)} \\
&= \min_{q=Q^{1/2}P}\frac{q^T(Q^{-1/2}C^TK^{-1}CQ^{-1/2})q}{q^Tq}.
\end{aligned}$$

But by Rayleigh's quotient, the RHS is just the minimum eigenvalue ($\lambda_{\min}$) of the matrix $Q^{-1/2}C^TK^{-1}CQ^{-1/2}$, and we have the (important) result that

$$k_h^2 = \lambda_{\min}(Q^{-1/2}C^TK^{-1}CQ^{-1/2}), \tag{3.13-144}$$

from which follows

$$\|P^h\|_0 \le c/k_h^2, \tag{3.13-145}$$

where (recall) c is independent of h.

Remarks: (1) Whereas the LBB stability constant, $k_h$, was obtained/derived by studying the coupled/mixed problem, the 'same' result, namely that $\lambda_{\min}[Q^{-1/2}(C^TK^{-1}C)Q^{-1/2}]$ should be bounded independent of h, was derived by isolating P and studying its stability, via the minimum eigenvalue of (3.13-133); i.e., these are two equivalent ways to analyze the saddle-point problem. (2) $k_h^2$ from (3.13-144) is the same eigenvalue as $\sigma_1$ from (3.13-133); $k_h = \sqrt{\sigma_1}$.

Before looking further at $k_h$ and how it might vary with h, we turn to the velocity solution; we wish to evaluate the size of the velocity portion of the solution, again in the appropriate semi-norm, $|u^h(x)|_1 = \|u\|_K = \sqrt{u^TKu}$. We have, from (3.13-136) and (3.13-137),

$$\begin{aligned}
u &= \tilde{u} - K^{-1}C(C^TK^{-1}C)^{-1}(C^T\tilde{u} - g) \\
&= [I - K^{-1}C(C^TK^{-1}C)^{-1}C^T]\tilde{u} + K^{-1}C(C^TK^{-1}C)^{-1}g \\
&= B\tilde{u} + K^{-1}CA^{-1}g,
\end{aligned} \tag{3.13-146}$$

where $A = C^TK^{-1}C$, and $B = I - K^{-1}CA^{-1}C^T$ is a projection matrix (see Appendix 3, where it appears under a different symbol) that is associated with the $H^1$-projection of $\tilde{u}$ to the discretely divergence-free subspace; $B^2 = B$ and $C^TB = 0$ (B projects into the null space of $C^T$; i.e., into the discretely divergence-free subspace). Thus,

$$\|u\|_K^2 = u^TKu = (B\tilde{u} + K^{-1}CA^{-1}g)^TK(B\tilde{u} + K^{-1}CA^{-1}g)$$
$$= \tilde u^TB^TKB\tilde u + 2g^TA^{-1}C^TB\tilde u + g^TA^{-1}C^TK^{-1}CA^{-1}g = \tilde u^TK\tilde u - (C^T\tilde u + g)^TA^{-1}(C^T\tilde u - g)$$
$$= \|\tilde u\|_K^2 - (C^T\tilde u + g)^T(C^TK^{-1}C)^{-1}(C^T\tilde u - g), \tag{3.13-147}$$

where we note that this would properly simplify in the event that $\tilde u$ happened to be properly divergence-free; i.e., if $C^T\tilde u = g$, then we get $P = 0$ and $u = \tilde u$. Pushing on, we rewrite (3.13-147) as

$$\|u\|_K^2 = \|\tilde u\|_K^2 - [Q^{-1/2}(C^T\tilde u + g)]^T\,[Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}]\,[Q^{-1/2}(C^T\tilde u - g)]$$

and thus, the obvious inequality,

$$\|u\|_K \le \|\tilde u\|_K + \left[\|Q^{-1/2}(C^T\tilde u + g)\|_2\cdot\|Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}\|_2\cdot\|Q^{-1/2}(C^T\tilde u - g)\|_2\right]^{1/2}$$

follows. Noting now that $\|Q^{-1/2}q\|_2 = \|Q^{-1}q\|_Q$ and that

$$\|Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}\|_2 = \lambda_{\max}[Q^{1/2}(C^TK^{-1}C)^{-1}Q^{1/2}] = 1/\lambda_{\min}(Q^{-1/2}C^TK^{-1}CQ^{-1/2}) = k_h^{-2}$$

from (3.13-144) yields

$$\|u\|_K \le \|\tilde u\|_K + \left[\|Q^{-1}(C^T\tilde u + g)\|_Q\cdot\|Q^{-1}(C^T\tilde u - g)\|_Q\right]^{1/2}/k_h. \tag{3.13-148}$$

Recalling now the result in (3.13-141), it is clear that the numerator in the second term of (3.13-148) is independent of $h$ to give

$$\|u\|_K \le \|\tilde u\|_K + c'(\tilde u)/k_h, \tag{3.13-149}$$

where the only possible dependence on $h$ could be in $k_h$. So, our final stability bounds are that $\|u\|_K \le c_0 + c'/k_h$ and $\|P\|_Q \le c/k_h^2$, and further progress rests on being able to estimate the LBB stability constant, $k_h$, which estimate (i) is not easy and (ii) depends on the element under consideration; thus, we have reached the end of 'general' results.

Remarks: (1) The fact that $P$ varies quadratically with $k_h^{-1}$ and $u$ only linearly has important ramifications; namely, if $k_h = O(h^\alpha)$ for $\alpha > 0$, the pressure 'bound' is 'lost' (becomes poor) 'sooner' than that for velocity, an observation previously made by Brezzi and Fortin (1991, p. 57). (2) Several papers that also approach the stability issue mainly from the pure linear algebra approach are: (i) Brezzi and Bathe (1990); (ii) Fortin and Pierre (1992); (iii) Chapelle and Bathe (1993); (iv) Wathen and Silvester (1993); and (v) Nicolaides (1982).
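The two stability bounds can be checked numerically on a small synthetic saddle-point system. The following is only a sketch under stated assumptions: `K` and `Q` are random SPD stand-ins (no finite elements are involved), `C` is a random full-rank matrix so that no pressure modes exist, and the helper names `random_spd`, `inv_sqrt_spd`, and `lbb_bounds` are hypothetical. It evaluates $k_h$ from (3.13-144) and compares both sides of (3.13-145) and (3.13-148).

```python
import numpy as np


def random_spd(size, rng):
    """Hypothetical SPD stand-in for a stiffness (K) or pressure mass (Q) matrix."""
    G = rng.standard_normal((size, size))
    return G @ G.T + size * np.eye(size)


def inv_sqrt_spd(Q):
    """Return Q^{-1/2} for a symmetric positive definite Q (eigendecomposition)."""
    w, V = np.linalg.eigh(Q)
    return (V / np.sqrt(w)) @ V.T


def lbb_bounds(n=12, m=5, seed=0):
    rng = np.random.default_rng(seed)
    K, Q = random_spd(n, rng), random_spd(m, rng)
    C = rng.standard_normal((n, m))            # full column rank assumed
    f, g = rng.standard_normal(n), rng.standard_normal(m)

    Qmh = inv_sqrt_spd(Q)
    A = C.T @ np.linalg.solve(K, C)            # C^T K^{-1} C
    kh2 = np.linalg.eigvalsh(Qmh @ A @ Qmh)[0]  # lambda_min, cf. (3.13-144)
    kh = np.sqrt(kh2)

    u_t = np.linalg.solve(K, f)                # u-tilde = K^{-1} f
    r = C.T @ u_t - g
    P = np.linalg.solve(A, r)                  # pressure
    u = u_t - np.linalg.solve(K, C @ P)        # velocity

    s = C.T @ u_t + g
    bound_P = np.linalg.norm(Qmh @ r) / kh2    # right side of (3.13-145)
    bound_u = np.sqrt(u_t @ K @ u_t) + np.sqrt(
        np.linalg.norm(Qmh @ s) * np.linalg.norm(Qmh @ r)) / kh  # (3.13-148)
    return {
        "kh": kh,
        "norm_P": np.sqrt(P @ Q @ P),
        "bound_P": bound_P,
        "norm_u": np.sqrt(u @ K @ u),
        "bound_u": bound_u,
        "constraint_residual": np.linalg.norm(C.T @ u - g),
    }
```

Calling `lbb_bounds()` returns the computed norms next to their bounds; shrinking the smallest singular value of `C` drives `kh` toward zero and loosens the pressure bound quadratically, the velocity bound only linearly.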
o Eigenvector expansion. To conclude our LBB discussion, we replay (with a twist) the eigenvector expansion technique on/for the Stokes equations, the results of which add further to our overall understanding of stability. To this end, we construct (in principle, at least) the analytical solution of (3.13-124). But to do so profitably, i.e., to build upon previous knowledge garnered by others, we consider a non-standard eigenvalue problem corresponding to (3.13-124): rather than considering the conventional eigenproblem, $Kv + Cq = \lambda v$ and $C^Tv = \lambda q$, we address the 'scaled' eigenproblem ($\mu$ = eigenvalue)

$$Kv + Cq = \mu Kv \tag{3.13-150}$$

and

$$C^Tv = \mu Qq, \tag{3.13-151}$$

which was first considered by Malkus (1981), who referred to it as the 'convergence' eigenproblem. [For a more 'modern' approach, and some new results, see Griffiths (1996), portions of which we will summarize/utilize.] It also leads directly to (3.13-133), with $\sigma = \mu(\mu - 1)$. This corresponds to a generalized eigenproblem of the type $Az = \lambda Bz$, where $A$ (which we assume initially to be the case) is non-singular, and $B$ is SPD, thus assuring a complete set of $B$-orthogonal eigenvectors $\{z_i\}$, which we take to be normalized; $z_i^TBz_j = \delta_{ij}$. The solution of $Ax = b$ in terms of this set of basis vectors is done as follows: $x = \sum_i\alpha_iz_i$, $b = \sum_i\beta_iBz_i$, the latter expansion employed because it leads to a particularly efficient analytical solution. [It is a legitimate expansion because $\{z_i\}$ is a basis, and $B$ is SPD, which makes $\{Bz_i\}$ a basis too.] The final result is easily found to be $x = \sum_i\lambda_i^{-1}(z_i^Tb)\,z_i$, as for the conventional eigenproblem. Applied to (3.13-124), we first have

$$\begin{pmatrix} f \\ g \end{pmatrix} = \sum_i (v_i^Tf + q_i^Tg)\begin{pmatrix} Kv_i \\ Qq_i \end{pmatrix}, \tag{3.13-152}$$

in which, as an aside, it is interesting to note that if $g = 0$ (a common situation in practice, such as thermal convection in a contained flow), then (3.13-152) necessarily implies $\sum_i(v_i^Tf)\,Qq_i = 0\ \ \forall f$.
That this is 'reasonable' follows from the rearrangement to $\sum_iQq_iv_i^Tf = 0$ and the realization (thanks to A. Hindmarsh) that the $m\times n$ 'outer product' matrix, $\sum_iQq_iv_i^T$, is actually the zero matrix, from the orthonormality condition,

$$v_i^TKv_j + q_i^TQq_j = \delta_{ij}; \tag{3.13-153}$$

it is the lower left partition ($m\times n$) of the identity matrix. (End of 'aside.') The final solution of (3.13-124) is then

$$\begin{pmatrix} u \\ P \end{pmatrix} = \sum_{i=1}^{N}\frac{v_i^Tf + q_i^Tg}{\mu_i}\begin{pmatrix} v_i \\ q_i \end{pmatrix}, \tag{3.13-154}$$

which produces a stable solution to (3.13-124), with (graph) norm (squared) of $\|u\|_K^2 + \|P\|_Q^2 = u^TKu + P^TQP = \sum_{j=1}^{N}(v_j^Tf + q_j^Tg)^2/\mu_j^2$, as long as the amplitude coefficients, $(v_i^Tf + q_i^Tg)/\mu_i$, $i = 1, 2, \ldots, N$, remain bounded as $h\to0$ ($N\to\infty$). Again, we point out that unbounded growth (instability) can only occur if $\mu_i\to0$ faster than does the
projection of the data, $(v_i^Tf + q_i^Tg)$, as $h\to0$, in those cases wherein $v_i^Tf + q_i^Tg\to0$ as $h\to0$. The concern (mostly by mathematicians) over just $\mu_i\to0$ as $h\to0$ in the general case is, however, warranted, for the following reason: stability in the strictest sense means a bounded solution as $h\to0$ for all possible (bounded) data ($f$ and $g$). If $\mu_j\to0$, then there must be some data for which the solution will blow up rather than converge. And this is true. But what this general and cautious/conservative stability definition does not take into account is any consideration of the 'reasonableness' of such data and its relationship to the continuous problem. (Mathematicians are prohibited from enforcing 'reasonableness' on the data, partly, and certainly justifiably, because the term cannot be well-defined.) Suppose, for example, that $\mu_k\to0$ as $h\to0$ for some $k$ (and, for simplicity, that this is the only eigenvalue that does so). Suppose further that the data were carefully selected to be (as a 'worst case') the corresponding eigenvector, i.e., $(f, g) = (v_k, q_k)$ to give the very simple solution [from (3.13-154)]

$$\begin{pmatrix} u \\ P \end{pmatrix} = \frac{1}{\mu_k}\begin{pmatrix} v_k \\ q_k \end{pmatrix}; \tag{3.13-155}$$

the point of this extreme example (unstable for $h\to0$) is to emphasize the fact that the eigenvectors are important too (not just the eigenvalues), and it may just turn out (as we have discovered at least for some special cases) that the 'unstable' eigenvector is so 'bizarre' that it makes absolutely no sense in the continuum ($h\to0$); it might even be the case that it does not, for our case, lie in the range of the Stokes operator, which here would mean that the data in the continuous problem are not in the dual space of $H^1$, viz., $H^{-1}$.
As we will show later, in Section 3.13.5k for the (allegedly) 'unstable' $Q_1Q_0$ element, the eigenvectors corresponding to the unstable eigenvalues are indeed rather 'bizarre'; they are very highly oscillatory (similar to a '$2\Delta x$ wave') and, as a consequence, are nearly orthogonal to 'smooth' data, i.e., to reasonable data, and thus the numerators of the amplitude coefficients do decrease as $h\to0$, and they do so faster than does the denominator, $\mu_j$, thus permitting both stability and convergence. It is this simple fact that accounts for the major 'success' of $Q_1Q_0$.

It may be of interest to reveal a few more of the known results regarding the eigensystem [(3.13-150), (3.13-151)], results derived originally by Malkus (1981), and relate them to the LBB stability constant given in (3.13-142) and (3.13-144). To start, we eliminate $v$, for $\mu\ne1$, from [(3.13-150), (3.13-151)] to recover (3.13-133):

$$(C^TK^{-1}C)\,q = \mu(\mu - 1)\,Qq = \sigma Qq, \tag{3.13-156}$$

which we convert, via the change of variable, $r = Q^{1/2}q$, to

$$[Q^{-1/2}(C^TK^{-1}C)Q^{-1/2}]\,r = \sigma r, \tag{3.13-157}$$

a conventional eigenproblem of 'size' $m$; i.e., $r\in\mathbb{R}^m$, and the matrix is $m\times m$. Malkus (1981) has shown that this 'second adjoint LBB eigenproblem' has a discrete set of positive eigenvalues (in the absence of pressure modes, which we assume to be the case at this point),

$$0 < \sigma_1 \le \sigma_2 \le \sigma_3 \ldots \le \sigma_m, \tag{3.13-158}$$

and, from (3.13-144) we see that the LBB constant is just

$$k_h = \sqrt{\sigma_1}. \tag{3.13-159}$$
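Remark: It seems to be the case that $\sigma_m \le 1$, although we know of no proof.

Then, since $\mu_i(\mu_i - 1) = \sigma_i$, each of the $m$ values of $\sigma_i$ produces two values of $\mu_i$ (one $< 0$ and the other $> 1$),

$$\mu_i^\pm = \tfrac{1}{2}\left(1 \pm \sqrt{1 + 4\sigma_i}\right), \tag{3.13-160}$$

and we have $2m$ of the desired $n + m\ (= N)$ eigenvalues with corresponding eigenvectors

$$\begin{pmatrix} v_i^\pm \\ q_i \end{pmatrix}, \qquad v_i^\pm = \frac{1}{\mu_i^\pm - 1}K^{-1}Cq_i, \tag{3.13-161}$$

where $q_i$ and $\sigma_i$ come from (3.13-156), and the $q$'s satisfy $q_i^TQq_j = \delta_{ij}$. Also, the LBB constant is now expressible as

$$k_h = \sqrt{\mu_1^-(\mu_1^- - 1)}. \tag{3.13-162}$$

From (3.13-160) we see that one half of the $\mu$'s ($\mu_i^-$) are $< 0$, and the other half ($\mu_i^+$) are $> 1$; and it is the negative roots that can be dangerous (if $\sigma_i\to0$). The remaining $n - m$ eigenvectors have $\mu = 1$ and $q = 0$; i.e., they are the divergence-free subset of ('velocity-only') eigenvectors, cf. (3.13-151), of the form

$$\begin{pmatrix} v_j \\ q_j \end{pmatrix} = \begin{pmatrix} v_j \\ 0 \end{pmatrix}, \tag{3.13-163}$$

where $C^Tv_j = 0$, and $\mu_j = 1$. Note that the $2m$ 'velocity' eigenvectors of (3.13-161) are not divergence-free; they are dilatational, with

$$C^Tv_i^\pm = \frac{1}{\mu_i^\pm - 1}C^TK^{-1}Cq_i = \mu_i^\pm Qq_i. \tag{3.13-164}$$

These are the vectors that permit satisfaction of the inhomogeneous constraint equation, $C^Tu = g$, in (3.13-124). They are referred to as 'discretely irrotational vectors' (curl-free) by Griffiths (1996), because they are ($K$-) orthogonal to the divergence-free vectors. If we now order the eigenvectors according to the size of the corresponding eigenvalue, we can represent the full solution of (3.13-124) via (3.13-161) and (3.13-163) as

$$\begin{pmatrix} u \\ P \end{pmatrix} = \sum_{i=1}^{m}\omega_i^-\,(v_i^{-T}f + q_i^Tg)\begin{pmatrix} v_i^- \\ q_i \end{pmatrix} + \sum_{i=m+1}^{n}(v_i^Tf)\begin{pmatrix} v_i \\ 0 \end{pmatrix} + \sum_{i=n+1}^{N}\omega_i^+\,(v_i^{+T}f + q_i^Tg)\begin{pmatrix} v_i^+ \\ q_i \end{pmatrix}, \qquad \omega_i^\pm = \frac{\mu_i^\pm - 1}{\mu_i^\pm\,(2\mu_i^\pm - 1)}, \tag{3.13-165}$$

in which $2\mu_i^\pm - 1 = \pm\sqrt{1 + 4\sigma_i}$, the weight $\omega_i^\pm$ combines the factor $1/\mu_i^\pm$ of (3.13-154) with the normalization (3.13-153) of the eigenvectors (3.13-161), the divergence-free $v_i$ are $K$-orthonormal, and where $q_{i+n} = q_i$, and $\sigma_{i+n} = \sigma_i$ for $i = 1, 2, \ldots, m$; and (recall) $N = n + m$.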
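The structure of this spectrum is easy to confirm numerically. The sketch below is an illustration under assumptions, not the authors' computation: `K` and `Q` are random SPD stand-ins, `C` is a random full-rank matrix (no pressure modes), and the function name `scaled_stokes_spectrum` is hypothetical. It solves the scaled eigenproblem (3.13-150), (3.13-151) directly and compares the result with the $\mu^\pm$ predicted by (3.13-160) plus $n - m$ unit eigenvalues; it also evaluates $k_h$ both from (3.13-159) and from (3.13-162).

```python
import numpy as np


def scaled_stokes_spectrum(n=10, m=4, seed=1):
    rng = np.random.default_rng(seed)
    GK = rng.standard_normal((n, n))
    GQ = rng.standard_normal((m, m))
    K = GK @ GK.T + n * np.eye(n)      # SPD stand-in for the viscous matrix
    Q = GQ @ GQ.T + m * np.eye(m)      # SPD stand-in for the pressure mass matrix
    C = rng.standard_normal((n, m))    # full column rank assumed

    # Generalized symmetric problem S z = mu * Bd z with Bd = blkdiag(K, Q).
    S = np.block([[K, C], [C.T, np.zeros((m, m))]])
    Bd = np.block([[K, np.zeros((n, m))], [np.zeros((m, n)), Q]])
    L = np.linalg.cholesky(Bd)
    Linv = np.linalg.inv(L)
    mu = np.linalg.eigvalsh(Linv @ S @ Linv.T)       # ascending order

    # sigma from (C^T K^{-1} C) q = sigma Q q, cf. (3.13-156).
    LQinv = np.linalg.inv(np.linalg.cholesky(Q))
    A = C.T @ np.linalg.solve(K, C)
    sigma = np.linalg.eigvalsh(LQinv @ A @ LQinv.T)  # ascending order

    root = np.sqrt(1.0 + 4.0 * sigma)
    mu_pred = np.sort(np.concatenate(
        [(1.0 - root) / 2.0, np.ones(n - m), (1.0 + root) / 2.0]))

    mu1_minus = mu[mu < 0].max()       # negative root closest to zero
    kh_from_sigma = np.sqrt(sigma[0])                     # (3.13-159)
    kh_from_mu = np.sqrt(mu1_minus * (mu1_minus - 1.0))   # (3.13-162)
    return mu, mu_pred, kh_from_sigma, kh_from_mu
```

The returned `mu` should contain `m` negative values, `n - m` values equal to one, and `m` values above one, in that order.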
Returning now to stability and LBB from an alternate viewpoint, we focus on the smallest eigenvalue ($\sigma_1$); i.e., we have $\sigma_1 = \sigma_{\min}(Q^{-1/2}C^TK^{-1}CQ^{-1/2}) = k_h^2$ from (3.13-144), wherein $\sigma$ was called $\lambda$. Suppose now that $\sigma_1 = ch^2$, as indeed it is for some 'LBB-unstable' elements such as $Q_1Q_0$ in 2D (see Sections 3.13.5j and k), giving $k_h = O(h)$ and an amplitude coefficient for the first eigenvector, using $\sqrt{1 + 4\sigma_1} \approx 1 + 2ch^2$, of

$$\frac{1}{\mu_1}(v_1^Tf + q_1^Tg) = -(q_1^TC^TK^{-1}f + q_1^Tg)/ch^2, \tag{3.13-166}$$

and the potential instability with mesh refinement is clearly evident. Only if $q_1^T(C^TK^{-1}f + g) = O(h^{2+\varepsilon})$ for $\varepsilon \ge 0$ will the solution not blow up as $h\to0$. And this can be the case, as we show in Section 3.13.5j for $Q_1Q_0$, because $q_1$ is very 'checkerboardish', and thus nearly orthogonal to reasonably smooth (not checkerboardish!) data. We shall return to this issue in Section 3.13.5k.

Remarks: (1) If a non-trivial null space (dimension $k$) of pressure modes is present, (i) the first and last summation in (3.13-165) is each reduced by $k$, (ii) the middle sum is increased by $k$ (each pressure mode 'generates' another divergence-free velocity mode), and (iii) there is added a fourth sum, $\sum_{i=n+m+1-k}^{n+m}\alpha_i\begin{pmatrix} 0 \\ P_i \end{pmatrix}$, where $P_i$ is one of the $k$ pressure modes ($CP_i = 0$) and $\{\alpha_i\}$ are arbitrary scalars. These modes correspond to $\sigma = 0$ in (3.13-156) and have $\mu^+ = 1$, $\mu^- = 0$. When pressure modes are present, the 'effective' number of pressures (and concomitant constraint/continuity) equations is reduced, one per mode. [The rank ($r$) of $C$ is $m - k$; cf. the discussion following (3.13-58).] (2) A similar eigenvector expansion for the transient Stokes equations, which has more physical relevance, will be presented later (Section 3.16.3). (3) A clever re-scaling of the above eigenvectors, due to Griffiths (1996), leads to a much more compact representation:

$$\begin{pmatrix} u \\ P \end{pmatrix} = \sum_{i=1}^{m}\frac{1}{\sigma_i\,q_i^TQq_i}\begin{pmatrix} (q_i^Tg)\,w_i \\ (w_i^Tf - q_i^Tg)\,q_i \end{pmatrix} + \sum_{i=m+1}^{n}\frac{v_i^Tf}{v_i^TKv_i}\begin{pmatrix} v_i \\ 0 \end{pmatrix}, \tag{3.13-167}$$

where

$$w_j = K^{-1}Cq_j \tag{3.13-168}$$

is called 'discretely irrotational' because it is ($K$)-orthogonal to the discretely divergence-free eigenvectors ($v_j$) and was obtained (in part) by exploiting the (unnormalized) orthogonality between the two sets of dilatational $m$-vectors; namely, [cf. (3.13-156)],

$$w_j^TKw_i = \sigma_i\,q_j^TQq_i, \tag{3.13-169}$$

where we note that the new eigenvectors are actually $[w_j, (\mu_j^\pm - 1)q_j]^T = [K^{-1}Cq_j, (\mu_j^\pm - 1)q_j]^T$. Additional relationships that led to the more streamlined expansion are: $\mu_j^+ + \mu_j^- = 1$ and $\mu_j^+\cdot\mu_j^- = -\sigma_j$.
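A compact expansion in terms of $w_j = K^{-1}Cq_j$ and the divergence-free vectors can be tested against a direct solve. In the sketch below, the coefficient formulas are worked out here from (3.13-156), (3.13-168), and (3.13-169) for the system $Ku + CP = f$, $C^Tu = g$; they are an assumption about the form such an expansion takes, not a transcription of Griffiths' result. The matrices are random stand-ins and the function name `compact_expansion` is hypothetical.

```python
import numpy as np


def compact_expansion(n=9, m=4, seed=2):
    rng = np.random.default_rng(seed)
    GK = rng.standard_normal((n, n))
    GQ = rng.standard_normal((m, m))
    K = GK @ GK.T + n * np.eye(n)
    Q = GQ @ GQ.T + m * np.eye(m)
    C = rng.standard_normal((n, m))              # full column rank assumed
    f, g = rng.standard_normal(n), rng.standard_normal(m)

    # Reference: direct solve of the saddle-point system.
    S = np.block([[K, C], [C.T, np.zeros((m, m))]])
    ref = np.linalg.solve(S, np.concatenate([f, g]))

    # Pressure eigenvectors q_i of (C^T K^{-1} C) q = sigma Q q.
    LQinv = np.linalg.inv(np.linalg.cholesky(Q))
    A = C.T @ np.linalg.solve(K, C)
    sigma, R = np.linalg.eigh(LQinv @ A @ LQinv.T)
    qs = LQinv.T @ R                             # columns q_i
    W = np.linalg.solve(K, C @ qs)               # columns w_i = K^{-1} C q_i
    qQq = np.sum(qs * (Q @ qs), axis=0)          # q_i^T Q q_i (equal to 1 here)

    # Divergence-free velocity vectors: basis of null(C^T), K-orthogonalized.
    _, _, Vt = np.linalg.svd(C.T)
    Z = Vt[m:].T                                 # n x (n - m)
    Lz = np.linalg.cholesky(Z.T @ K @ Z)
    V = Z @ np.linalg.inv(Lz).T                  # V^T K V = I
    vKv = np.sum(V * (K @ V), axis=0)

    u = W @ ((qs.T @ g) / (sigma * qQq)) + V @ ((V.T @ f) / vKv)
    P = qs @ ((W.T @ f - qs.T @ g) / (sigma * qQq))
    return np.concatenate([u, P]), ref, W, V, K
```

Besides reproducing the direct solution, the returned `W` and `V` can be used to confirm the $K$-orthogonality between the dilatational and divergence-free vectors.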
To conclude our discussion of LBB, we show 'qualitatively' the spectrum $\{\mu_j\}$ in Figure 3.13-13, which might be useful. The eigenvalues are imagined to be distributed along the curve from #1 to #8. For an element-specific version of this figure, see Griffiths (1996). The figure is qualitative in that it applies to any element (probably even to other spatial discretizations than FEM) and the curved portions are only 'suggestive,' and the two 'limit points,' $\mu_j = (1 \pm \sqrt5)/2$ that obtain when $\sigma_j = 1$, are probably only approached for large $N$. Additional remarks related to the circled numbers:

1. Regions #1 and #8 show the regions of 'good' (smooth) modes that are trying to mimic the curl-free modes with $\sigma = 1$ mentioned below (3.13-134).
2. Regions #2 and #7 are 'transition' regions and may vary in shape from element to element; i.e., with the discretization.
3. Regions #3 and #6 are 'bad' regions with rather oscillatory modes that seem to have no counterparts in the continuum, which would never deviate from the $(1 \pm \sqrt5)/2$ 'asymptotes.'
4. Another numerical artifact is Region #4, at least if $k$, the dimension of the null space of $C$, exceeds unity; these are pure pressure modes.
5. Region #5 is the 'clean' region, the space of truly discretely-divergence-free eigenvectors, the null space of $C^T$; and it is often smaller than we would like. We would like (we believe) a larger Region #5 and smaller Regions #3 and #6, and no Region #4.
6. Finally, we return to the 'danger zone,' Region #3, and remark that the difference between a stable and an unstable element shows up at the right end of #3; a stable element will have the smallest $\mu_i^-$'s that approach a constant independent of $N$ as $N$ is increased and unstable ones will approach zero.
Fig. 3.13-13 A spectral picture of the discretized Stokes equations. [Plot of $\mu_j$ versus $j$, $j = 1, \ldots, n + m = N$, with limit values $(1 \pm \sqrt5)/2$; labeled groups: $m - k$ dilatational, $n - m + k$ div-free, $m - k$ dilatational, $k$ pure pressure modes; annotations dim $N(C^T)$ and dim $N(C)$.]

A final eigenvector remark, until we revisit the analogous problem for the transient Stokes equations in Section 3.16.3, is this: the combination of $2(m - k)$ dilatational modes
are those needed to satisfy the following two sets of $m - k$ equations: 1. $C^Tu = g$ and 2. $(C^TK^{-1}C)P = C^TK^{-1}f - g$ [cf. (3.13-124) and (3.13-125)]. The former is in a sense a boundary equation in that $g$ is populated only at pressures corresponding to velocity nodes on the boundary with (inhomogeneous) Dirichlet BC's, and the latter is a 'full domain' equation. This concludes our version/description of LBB stability, and related matters. Hopefully, it adds something positive to the total picture.

e. Penalty methods

We have already briefly introduced 'the' penalty method, via the penalty 'equation of state' in Section 3.5.3 and variationally in Section 3.7.1, showing how the penalty parameter, the pressure (Lagrange multiplier), and the velocity divergence are related. See also Section 3.10.6. Here we further discuss the penalty method, extend it to the full NS equations, and mention two other penalty-like methods. First, however, we shall cite some of the relevant references on the subject (there are too many to list 'all'; see the references' references) to which we refer the reader for the 'heavy' theory associated with the continuous penalty method (PDE's), as our approach will be more 'practical' in that we will begin with the discrete (FEM) equations. For the continuous case, see, for example, Bercovier (1978), Temam (1984), Reddy (1978, 1982), Oden et al. (1982), Carey and Oden (1983), and Girault and Raviart (1986). Some relevant publications more on the 'applied' side are: Zienkiewicz and Godbole (1975), Hughes et al. (1976, 1978), Malkus and Hughes (1978), Bercovier and Engelman (1979), Hughes et al. (1979a), Oden and Jacquotte (1984), Kheshgi and Scriven (1982, 1984, 1985), and Reddy et al. (1992). The demarcation of theoretical vs applied in listing the above references is, of course, subject to 'interpretation.'
Finally, an entire ASME conference on the subject is described in the proceedings edited by Reddy (1982).

o Model problem. To further motivate the method, and indeed to aid in understanding it, we present first a three-equation 'model' of the Stokes equations, the steady version of the model discussed later in some detail (Section 3.16.2),

$$ku + c_1P = f_1, \tag{3.13-170}$$
$$kv + c_2P = f_2, \tag{3.13-171}$$

and

$$c_1u + c_2v = g, \tag{3.13-172}$$

with solution

$$u = \frac{c_2^2f_1 - c_1c_2f_2 + c_1kg}{k(c_1^2 + c_2^2)}, \tag{3.13-173}$$
$$v = \frac{c_1^2f_2 - c_1c_2f_1 + c_2kg}{k(c_1^2 + c_2^2)}, \tag{3.13-174}$$
and

$$P = \frac{c_1f_1 + c_2f_2 - kg}{c_1^2 + c_2^2}. \tag{3.13-175}$$

This was the 'mixed-interpolation/Lagrange multiplier' approach. The penalty approximation replaces (3.13-172) by

$$c_1u + c_2v - g = \varepsilon P, \tag{3.13-176}$$

where $\varepsilon$ is 'small' ($1/\varepsilon = \lambda$ is the 'penalty parameter'). The penalty solution is also easily found:

$$u = \frac{c_2^2f_1 - c_1c_2f_2 + c_1kg + \varepsilon kf_1}{k(c_1^2 + c_2^2 + \varepsilon k)}, \tag{3.13-177}$$
$$v = \frac{c_1^2f_2 - c_1c_2f_1 + c_2kg + \varepsilon kf_2}{k(c_1^2 + c_2^2 + \varepsilon k)}, \tag{3.13-178}$$

and

$$P = \frac{c_1f_1 + c_2f_2 - kg}{c_1^2 + c_2^2 + \varepsilon k}. \tag{3.13-179}$$

But the 'raison d'être' of the penalty method is to eliminate $P$ a priori and only calculate it at the 'end of the day', if ever. Thus, inserting (3.13-176) into (3.13-170), (3.13-171) yields the penalized momentum equations sans pressure:

$$(k + \lambda c_1^2)\,u + \lambda c_1c_2\,v = f_1 + \lambda c_1g, \tag{3.13-180}$$
$$\lambda c_1c_2\,u + (k + \lambda c_2^2)\,v = f_2 + \lambda c_2g. \tag{3.13-181}$$

It is clear from (3.13-173) through (3.13-179) that the penalty solution is 'close to' the 'mixed' solution; clearly $O(\varepsilon)$ away, in fact. And this is 'exactly' what happens in the full-blown GFEM case. So much for 'theory', for now. Except for the following remarks, which also generalize to the 'many'-degrees-of-freedom case:

1. The 'penalty matrix,' $B = \begin{pmatrix} c_1^2 & c_1c_2 \\ c_1c_2 & c_2^2 \end{pmatrix}$, is singular, a requirement first noticed, we believe, by Fried (1974); a non-singular $B$ would, from (3.13-180) and (3.13-181), drive $u$ and $v$ to zero ('locking') for $\lambda\to\infty$ (at least for the $g = 0$ case), not a good approximation to (3.13-173) through (3.13-175).

2. The full system matrix,

$$A = K + \lambda B = \begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix} + \lambda\begin{pmatrix} c_1^2 & c_1c_2 \\ c_1c_2 & c_2^2 \end{pmatrix} = \begin{pmatrix} k + \lambda c_1^2 & \lambda c_1c_2 \\ \lambda c_1c_2 & k + \lambda c_2^2 \end{pmatrix}, \tag{3.13-182}$$

has eigenvalues $k$ and $k + \lambda(c_1^2 + c_2^2)$ and consequent large condition number, $O(\lambda)$. The concomitant loss of accuracy in solving the penalty equations, (3.13-180) and (3.13-181), via Gaussian elimination is thus also 'large,' and this sets an upper bound on $\lambda$ depending on your computer's accuracy; a 14 digit machine will produce relative errors of about $10^{-14}\lambda$, thus limiting $\lambda$ to values less than, say $10^9$.
This is the principal problem with the penalty method: $\lambda$ too small gives too large a divergence (and other errors), $\lambda$ too big loses significant digits. There is a bathtub curve measuring 'penalty error,' whose 'bottom' is
nearly flat (and on which the penalty method works well) only over a somewhat limited range of $\lambda$, typically $10^4$ to $10^9$. Iterative solution methods 'like' large $\lambda$ even less than do direct methods, because the convergence rate becomes very small.

3. For steady-state problems (only), the sign of $\varepsilon$ is immaterial, a statement that may not sit too well with our friends from solid mechanics, who often like to liken the penalty parameter to a physical one (a Lamé parameter) that should be of proper sign when dealing with compressible elasticity. But we are dealing with incompressible fluid mechanics and thus feel free to simply regard $\lambda$ as a penalty parameter (math, not physics).

o The real problem. We now apply the penalty method to the NS equations. Starting from (3.13-28) and (3.13-29), we obtain the penalty version of them by first replacing the latter by

$$C^Tu - g = \varepsilon QP, \tag{3.13-183}$$

where $Q$ is the pressure 'mass matrix' (or 'basis function overlap' matrix; Kheshgi and Scriven, 1985):

$$Q_{ij} = \int\psi_i\psi_j, \tag{3.13-184}$$

and eliminating the pressure in (3.13-28) via (3.13-183); i.e., the penalized momentum equations are

$$M\dot u + [K + N(u) + \lambda CQ^{-1}C^T]\,u = f + \lambda CQ^{-1}g, \tag{3.13-185}$$

and the pressure is gone... and our problems are thus gone!? Not quite. While the penalty method is often very useful, it is, unfortunately, no panacea, for at least the following reasons, some of which we shall further elucidate and/or demonstrate later:

1. The selection of $\lambda$ is not always as 'easy' as implied above.
2. The matrix $Q^{-1}$ is generally dense, effectively precluding practical utility. But consistent mass is not a requirement for getting good penalty results (cf. Engelman et al., 1982b, and Fortin, 1983); for those elements for which mass lumping is legitimate, $Q$ may be lumped and thus rendered diagonal. For other elements, the replacement of $Q$ by (average) element size (area or volume) would probably work fine, although we have not tested this.
3.
Unless element-contained, and thus discontinuous, pressures are employed, the (consistent, and 'highly' singular) penalty matrix,

$$B = CQ^{-1}C^T, \tag{3.13-186}$$

is global and must be so constructed, a feature that makes the method unattractive in practice. This helps explain why virtually all applied penalty methods have employed discontinuous pressures.
4. The penalty matrix intensely couples the velocities, a fact that affects/limits the ensuing numerical solution procedures.
5. The necessarily large penalty parameter makes the problem rather ill-conditioned, thus again restricting the choice of numerical solution procedures and (usually) adversely affecting their performance.
6. The spurious penalty start-up transient (Section 3.5.3) can be 'annoying.' (See Section 3.16.2e for more on this.)
7. If pressure modes plagued the mixed interpolation 'progenitor,' they will usually also make their presence felt in the penalty version, when (and if) the pressure is obtained via post-processing, from (3.13-183).

o Reduced quadrature. Thus far, we have been discussing what we call 'consistent penalty,' a term introduced in Engelman et al. (1982b); the above equations resulted from a consistent GFEM applied to the continuum momentum equation and to the continuum penalty equation (3.5-9), in the conventional 'mixed-interpolation mode' (e.g., $P$ is one order lower than $u$). Historically, and still alternatively today, there is another way to obtain the penalized discretized momentum equation; namely, apply GFEM to (3.5-10), the continuum penalized momentum equation. This approach results in a generally (but not always, see below) different $B$-matrix and a somewhat different set of 'problems.' [The viscosity is completely negligible next to $\lambda$ in (3.5-10), or should be if the penalty method is to succeed, and can/should be dropped from that term.] The name of this historical game is 'reduced integration' or 'reduced integration penalty' (RIP) or 'selective reduced integration', and it came about by starting 'wrong,' where by 'wrong' we simply (naively?) mean starting from the weak form of the continuum-derived penalty momentum equation; i.e., from (3.5-10), which when discretized à la GFEM, reads

$$M\dot u + (K + N(u) + \lambda B)\,u = f, \tag{3.13-187}$$

where all terms are as before [cf. (3.13-18) et seq.] except the penalty matrix, which is now given by (3.13-188) or, in index form,

$$B_{ij}^{\alpha\beta} = \int\frac{\partial\varphi_i}{\partial x_\alpha}\frac{\partial\varphi_j}{\partial x_\beta}. \tag{3.13-189}$$

The only problem with this formulation is that it does not work. Not as stated, at least; because as presented thus far, the $B$-matrix is not singular, giving for the simplest case of steady Stokes flow, for $\lambda\to\infty$, $u = \lambda^{-1}B^{-1}f \approx 0$, the locking problem mentioned above. How to fix it?
Well, what began as a 'trick,' reduced integration (to render $B$ 'less accurate,' and singular) was later elevated to a legitimate methodology by Malkus and Hughes (1978), who put all of the previous clues together and came up with their famous equivalence theorem, which we loosely state in words as it applies to the case of interest herein (it covers other mixed method applications in addition to incompressible flows), albeit 'only' for discontinuous pressures (nearly the only case of practical interest): the $B$-matrix of (3.13-187) and that of (3.13-186) are the same (at least under certain conditions, such as straight-sided quadrilaterals) if and only if (3.13-189) is under-integrated in just the right way, that being the Gauss-Legendre rule whose integration points correspond to the pressure 'nodes' for the corresponding mixed method. Examples:

1. The $Q_1Q_0$ element displays the equivalence when one-point quadrature is used for $B$ in (3.13-189). Full quadrature requires $2\times2$ and also has an equivalence: $Q_1Q_{-1}$, an element that is no good because there are more constraint equations than velocities to satisfy them; ergo, locking.
2. The $Q_2Q_{-1}$ element displays the equivalence when the $B$-matrix is integrated via $2\times2$ Gaussian quadrature. Again, full quadrature, here $3\times3$, also has an equivalent mixed method element: $Q_2Q_{-2}$, another loser. Both of these equivalences apply only to straight-sided elements.
3. Any other higher-order Lagrange element, $Q_mQ_{(1-m)}$.

But the reduced integration method has enough significant 'problems' associated with it that we recommend only the consistent penalty method: $B = CQ^{-1}C^T$, with discontinuous pressures so that element-level construction of $B$ is possible. Some of the reduced integration 'problems,' per Engelman et al. (1982b), are: (i) equivalence does not obtain for $Q_2Q_{-1}$ if the sides are curved, i.e., via fully isoparametric element simulations; (ii) there is no known reduced quadrature method to do the penalty version of $Q_2P_{-1}$, a leading contender for the 'best' 2D element; (iii) equivalence in 3D for $Q_1Q_0$ and $Q_2Q_{-1}$ only occurs for simple bricks, not for distorted ones; and (iv) when consistent penalty is pitted against reduced integration penalty under non-equivalent conditions (which is still legitimate for both methods), the consistent penalty method is more accurate. Finally, only consistent penalty is applicable (although not recommended) to all elements for which mixed interpolation is successful, even those with $C^0$ pressure approximation.

We conclude this comparison with an important quotation from the Malkus/Hughes equivalence paper: 'We believe the practical computing consequences of these results are significant. Namely, the accuracy of the mixed formulation in constrained media situations can be obtained with a displacement [read velocity] formulation, completely eliminating the additional computational expense engendered by the auxiliary field of the mixed formulation.'
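The role of a singular penalty matrix can be seen on a small steady Stokes-like system. The sketch below is an illustration under assumptions (random SPD `K`, a diagonal 'lumped' `Q`, random `C`, $g = 0$; all helper names are hypothetical): with the consistent penalty matrix $B = CQ^{-1}C^T$, whose rank is only $m < n$, the penalty velocity approaches the mixed one as $\lambda$ grows, whereas a deliberately non-singular stand-in for $B$ locks the velocity toward zero.

```python
import numpy as np


def build_problem(n=8, m=3, seed=3):
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n))
    K = G @ G.T + n * np.eye(n)               # SPD stand-in for the viscous matrix
    Q = np.diag(rng.uniform(1.0, 2.0, m))     # 'lumped' pressure mass matrix
    C = rng.standard_normal((n, m))
    f = rng.standard_normal(n)
    g = np.zeros(m)
    return K, Q, C, f, g


def mixed_velocity(K, C, f, g):
    """Velocity part of the mixed (Lagrange multiplier) solution."""
    n, m = C.shape
    S = np.block([[K, C], [C.T, np.zeros((m, m))]])
    return np.linalg.solve(S, np.concatenate([f, g]))[:n]


def penalty_velocity(K, B, C, Q, f, g, lam):
    """Solve (K + lam*B) u = f + lam*C Q^{-1} g for the penalized velocity."""
    rhs = f + lam * (C @ np.linalg.solve(Q, g))
    return np.linalg.solve(K + lam * B, rhs)


def compare(lams=(1e3, 1e6)):
    K, Q, C, f, g = build_problem()
    u_mix = mixed_velocity(K, C, f, g)
    B_cons = C @ np.linalg.solve(Q, C.T)       # consistent, singular (rank m)
    B_full = B_cons + np.eye(K.shape[0])       # hypothetical non-singular B
    errors = [np.linalg.norm(penalty_velocity(K, B_cons, C, Q, f, g, lam) - u_mix)
              for lam in lams]
    locked = np.linalg.norm(penalty_velocity(K, B_full, C, Q, f, g, lams[-1]))
    return np.linalg.matrix_rank(B_cons), errors, locked, np.linalg.norm(u_mix)
```

`compare()` returns the rank of the consistent `B`, the penalty-versus-mixed velocity errors for each $\lambda$, and the norm of the 'locked' velocity next to the norm of the mixed velocity.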
A solid vote for the consistent penalty method from the solid mechanics community is the following, from Lee and Dawson (1989): 'It has been found that consistent penalty techniques give more accurate pressure and velocity fields with less computation time. The approach suggested by Engelman and co-workers uses a linear, discontinuous interpolation function for pressure to avoid spurious modes with quadratic velocity approximation.' Finally, we remark that the reader interested in 'Some Historical Remarks on Mixed and Reduced and Selective Integration Methods' should see p. 226 of Hughes (1987).

o Transient penalty. Turning now fully to the time-dependent case, we form the PPE analog à la penalty by inserting $\dot u$ from (3.13-185) into the time derivative of (3.13-183) and use (3.13-183) again to obtain

$$\frac{1}{\lambda}Q\dot P + (C^TM^{-1}C)\,P = C^TM^{-1}[f - Ku - N(u)u] - \dot g, \tag{3.13-190}$$

which, for $\lambda\to\infty$, recovers the conventional PPE, derived in Section 3.13.4; see (3.13-242). But it is precisely the $\dot P$ term for finite $\lambda$ that makes (3.13-190) look more like the transient heat equation than the desired elliptic equation. [The advection-diffusion equation, derived in (3.5-11) for the continuum case, could, and perhaps should, also be rearranged and then 're-interpreted' as the transient heat equation, since advection, $\varepsilon u\cdot\nabla P$, is negligibly small.] The implied PPE/heat equation for $P$ is indeed implied by the penalty velocity solution and is what we called a 'transient penalty shock wave' in Section 3.5.3. It corresponds to intentionally introduced stiffness (see Section 2.7.2c) in the ODE sense in that, once the spurious 'wave' has diffused through the mesh, the heat equation 'portion' is finished, and the PPE 'portion' takes over; i.e., $P$ is in
quasi-equilibrium after a time of order $\tau \sim 1/\lambda$. Three additional remarks: (i) although indeed an implied equation that is always satisfied by the pressure, it of course need never be formed in a computer code; $P$ also satisfies (3.13-183), which is a much more convenient way to 'retrieve' $P$, if and when desired; (ii) the index (see Section 3.16.1) of the penalty method is, of course, zero: there are no more algebraic equations, just (stiff) ODE's; and (iii) the divergence-free constraint on the initial velocity field is ostensibly no longer present; arbitrary $u_0$ fields, however, are still physically meaningless and will be converted to a (hopefully reasonable) divergence-free [to $O(\varepsilon)$] velocity in a time of $O(\varepsilon)$ via what is effectively an $L^2$-projection, the details of which will be presented later (Sections 3.16.2e and 3.16.4f).

The above discussion also clearly shows that 'stiff integrators' are required in order to solve the time-dependent penalized momentum ODE's; explicit methods are OUT. In Sani et al. (1981b), this spurious penalty transient was introduced and demonstrated. Also shown was the important trick of 'bypassing' this transient with a dissipative integrator; one step of BE or BDF2, for example, of size $\Delta t \gg 1/\lambda$ will put (3.13-190) into the 'PPE mode,' after which the (stable) implicit time integrator (BE, TR, BDF2, etc.) could take over and the smart timestepper turned on. That is, the short transient associated with penalty need not be accurately integrated, and generally should not, being totally non-physical. The only time this sidestepping trick might fail would be a situation in which the true physics occurs on a time scale so short that $\Delta t = O(1/\lambda)$ or less is required for accuracy. This would represent a penalty method failure, unless $\lambda$ could be increased sufficiently without running out of digits on the computer.
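The 'bypass' trick can be illustrated on a small linear model of the penalized momentum ODE's. This is a sketch under simplifying assumptions, not the code used for the figures: $M = I$, $Q = I$, $N(u) = 0$, $g = 0$, a random SPD `K`, a `C` with orthonormal columns, and hypothetical function names. One backward Euler (BE) step of size $\Delta t \gg 1/\lambda$ takes an arbitrary (non-divergence-free) $u_0$ to a nearly divergence-free state, while forward Euler with the same step blows up.

```python
import numpy as np


def setup(n=8, m=3, seed=4):
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n))
    K = G @ G.T / n + np.eye(n)                        # SPD stand-in, modest size
    C, _ = np.linalg.qr(rng.standard_normal((n, m)))   # orthonormal columns
    f = rng.standard_normal(n)
    u0 = rng.standard_normal(n)                        # arbitrary, not div-free
    return K, C, f, u0


def backward_euler_step(u, dt, K, C, f, lam):
    """One BE step of du/dt = f - (K + lam*C*C^T) u (M = I, Q = I, g = 0)."""
    lhs = np.eye(len(u)) + dt * (K + lam * (C @ C.T))
    return np.linalg.solve(lhs, u + dt * f)


def forward_euler_step(u, dt, K, C, f, lam):
    """One explicit Euler step of the same ODE system."""
    return u + dt * (f - (K + lam * (C @ C.T)) @ u)


def bypass_demo(lam=1e6, dt=1e-2, explicit_steps=5):
    K, C, f, u0 = setup()
    u1 = backward_euler_step(u0, dt, K, C, f, lam)
    div0 = np.linalg.norm(C.T @ u0)                    # initial 'divergence'
    div1 = np.linalg.norm(C.T @ u1)                    # after one large BE step
    ue = u0.copy()
    for _ in range(explicit_steps):
        ue = forward_euler_step(ue, dt, K, C, f, lam)
    return div0, div1, np.linalg.norm(ue)
```

With the default values ($\lambda\,\Delta t = 10^4$), the discrete divergence drops by roughly the factor $1/(1 + \lambda\,\Delta t)$ in a single BE step.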
Fig. 3.13-14 Mesh and boundary conditions for transient Poiseuille flow; the inlet BC is $f_n = 4$, $v = 0$. [Channel mesh, $0 \le x \le 2$, with labeled pressure nodes ($P_5$ through $P_{50}$) and velocity nodes; wall BC $u = v = 0$.]

In Figures 3.13-14 and 3.13-15 we show some aspects of the spurious penalty transient for a simple Poiseuille flow start-up via the imposition of a pressure drop over the length of the channel, using the variable-step BE method, described in detail later (Section 3.16.4). Here, $\lambda = 10^6$, and the $Q_1Q_0$ element was employed with a local time truncation error tolerance ($\varepsilon$) of 0.001. Whereas the true (or mixed interpolation) solution displays a constant-in-time (and $y$), linearly decreasing in $x$ (from four to zero) pressure profile and a uniformly increasing $u(y)$ in the $y$-direction with $v = 0$, the penalty method does
not attain this situation until $t = O(10^{-5})$, the time it takes for the penalty transient to equilibrate/dissipate. Note too that the non-zero vertical velocities are also spurious, albeit they are $O(\varepsilon)$. Only 80 timesteps were required to track this penalty transient, during which $\Delta t$ grew by seven orders of magnitude, from $\sim10^{-11}$ to $10^{-4}$; see Figure 3.13-15. For $t > 10^{-5}$, the 'true' transient Poiseuille flow 'starts' for the penalty approach; see Sani et al. (1981b) for further details, and also a non-physical penalty transient in which the incompressibly illegal BC of $u = 2y(1 - y/2)$ with zero initial velocity was also run with the penalty method, both in the stiff and non-stiff modes, the former taking advantage of the L-stability of BE via a starting $\Delta t$ of 0.01, which stepped right over the spurious transient. (The non-stiff integration follows the transient accurately in time via local error control and variable timesteps, which will be carefully described in Section 3.16.4.) In this latter case, the spurious compression wave generated pressures of size $10^5$ to $10^7$ for $0 < t < O(10^{-5})$, after which they returned to $O(1)$. [Here TR required only 43 time steps and BE required 94, with $\varepsilon = 0.001$, using (of course) the 'smart' (variable-step) integrators described in Section 3.16.4.]

Fig. 3.13-15 The specious penalty transient. [(a) Pressures $P$ (0 to 4.0) at selected nodes ($P_{10}$ through $P_{40}$) versus time; (b) vertical velocities ($v_{29}$, $v_{53}$, $v_{59}$) and $\Delta t$ versus time; logarithmic time axes from $10^{-11}$ to $10^{-5}$.]

Besides the spurious penalty transient, the penalty method is (for $\varepsilon > 0$) artificially dissipative, although only slightly. Neglecting the viscous term in (3.13-187) yields the penalized Euler equations, whose kinetic energy 'conservation' law reads, for $f = 0$ and $N(u)$ skew-symmetric,

$$\frac{1}{2}\frac{d}{dt}u^TMu = -\lambda u^TBu, \tag{3.13-191}$$

which is dissipative if $u^TBu > 0$.
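From (3.13-186) we obtain

$$\frac{1}{2}\frac{d}{dt}\left(u^T M u\right) = -\lambda\,(C^T u)^T Q^{-1} (C^T u) \qquad \text{(3.13-192)}$$

and, since $Q^{-1}$ is SPD, dissipation has been proven. But it is not badly dissipative, even though $\lambda$ appears on the RHS, because $C^T u$ is small. In fact, the use of (3.13-183), for $g = 0$, yields

$$\frac{1}{2}\frac{d}{dt}\left(u^T M u\right) = -\varepsilon\, P^T Q P, \qquad \text{(3.13-193)}$$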
i.e., the kinetic energy slowly decays (for $\varepsilon > 0$ and small) at a rate proportional to the pressure's 'kinetic energy.' Perhaps this energy 'argument' is, in fact, sufficient to suggest that only positive values of $\lambda$ should be employed (but see below). For an example of penalty's dissipation, see Sasaki and Reddy (1980).

o Pressure modes. Next we present a short discussion on 'pressure modes a la penalty,' beginning with the obvious observation that even if the corresponding mixed-mode element would display one or more pure pressure modes (zero eigenvalue, $CP_m = 0$, recall), including the physical hydrostatic mode ($n \cdot u$ specified on all of $\Gamma$, recall), the penalty version of it will not be singular. In this sense, the penalty method is also a regularization method; any eigenvalue that would be zero is then $O(\varepsilon)$. Presuming the existence of a (mixed-mode) pressure mode, $P_m$, (3.13-183) yields

$$P_m^T C^T u - P_m^T g = u^T C P_m - P_m^T g = -P_m^T g = \varepsilon\, P_m^T Q P, \qquad \text{(3.13-194)}$$

which brings up two points: (i) if the 'mixed' problem is well-posed ($P_m^T g = 0$, recall), then the penalty pressure is $Q$-orthogonal to that mode, $P_m^T Q P = 0$, which, as shown by Sani et al. (1981a), tends [cf. (3.13-87)] to act like a 'filter' if the mode is checkerboard-like; and (ii) if the mixed problem is ill-posed ($P_m^T g \neq 0$), the corresponding penalty pressure will be very large [$O(\lambda)$], and the associated velocities not 'physical.' Do not use 'penalty' to try to solve otherwise ill-posed problems.

o Variable penalty parameter. To conclude our discussion of 'the' penalty method, we return briefly to the subject of 'selecting $\lambda$' for steady flows. We have already asserted that penalty 'works' [gives $u$ and $P$ that are $O(\varepsilon)$ from those obtained with the 'mixed' analog] for both positive and negative values of $\lambda$.
Here we further strengthen this position by pointing to the papers by Kheshgi et al.; see, for example, Kheshgi and Scriven (1985) for 'applications' and Kheshgi and Luskin (1985) for theory, and references therein. In this 'variable penalty method,' the penalty parameter is chosen (2D) to alternate in sign in a checkerboard manner and to vary in magnitude in a certain way, the combination having been shown to increase the 'quality' of penalty results, at least in some cases. For the $Q_1Q_0$ element, they take

$$\lambda_i = \pm\, c_\lambda\, a_i / \langle a \rangle, \qquad \text{(3.13-195)}$$

where $c_\lambda$ is the 'conventional' penalty parameter (e.g., $10^7$), $a_i$ is the area of element $i$, $\langle a \rangle$ is the number average of all element areas, and the signs alternate in CB-fashion. When it 'works' (explained below), this trick extends the utility of the penalty method by both reducing compressibility error and permitting a wider range of $c_\lambda$ (several more decades of utility). There are, however, two classes of BC's for which it does not work well: (i) normal traction applied on a portion of $\Gamma$, and (ii) tangential stress applied on a portion of $\Gamma$. In both cases, Dirichlet BC's are applicable on the rest of $\Gamma$. For all Dirichlet or for partly Dirichlet and partly normal and tangential stress BC's, the variable penalty method works well; i.e., better than the conventional method. The extension of these results to other elements or to 3D, however, has yet to be accomplished, to our knowledge.

o Closure. Final remarks on the penalty method:

1. It is effective, especially in the 'consistent' method (no reduced integration), for any element in which the pressure is element-contained ($C^{-1}$).

2. It fully and tightly couples the velocities, which can have serious implications in 3D; e.g., segregated/uncoupled solution methods (discussed in Section 3.16.7) are not 'available.' It is more 'viable' in 2D.

3. Only fully implicit time-integration methods should be employed (except perhaps for advection, which may be semi-implicit), thus again coupling all velocity components in the 'large' matrix. See Section 3.16.4f for implicit time-integration discussion.

4. The first timestep (or first several) should generally be made using a strongly A-stable (or even L-stable) integrator with step sizes large compared with $\varepsilon$.

5. If a wide variety of element sizes is present in the mesh for steady-state simulations, then the variable penalty method of Kheshgi et al. may be worthwhile, BC's permitting. It may also be a good idea in general, at least in 2D for 'proper' BC's, except for time-dependent flows.

o Another penalty-like method: Augmented Lagrangian. In order to both reduce the need for such a large penalty parameter and to more closely approximate exact enforcement of the divergence-free constraint, the method called 'augmented Lagrangian' has been employed and advocated by some [here are three: Fortin and Glowinski (1983), in which the Navier-Stokes problem is but one of many addressed, Fortin and Fortin (1985c), and Simo and Armero (1994)]. Unlike the classical/conventional penalty method, this method does not try to reduce the size of the problem by eliminating the pressure; rather, it tries to decouple the velocity and pressure. Thus, rather than (3.13-183) and (3.13-185), the problem addressed is

$$M\dot{u} + [K + N(u) + \lambda C Q^{-1} C^T]\,u + CP = f + \lambda C Q^{-1} g \qquad \text{(3.13-196)}$$

and

$$C^T u = g; \qquad \text{(3.13-197)}$$

i.e., the pressure is retained and exact mass conservation recovered. But as the astute reader has no doubt already noticed, (3.13-197) implies that the penalty terms drop out in (3.13-196), bringing us back to 'square one.'
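A way out of 'square one' is to lag the pressure: solve the augmented momentum equation with the current pressure, then correct the pressure with the constraint residual. This is the Uzawa-type iteration written out in (3.13-198) and (3.13-199) for steady Stokes flow. The following is only a minimal sketch under stated assumptions: the function name, the dense solves, the stopping test, and the small random test matrices in the usage note are hypothetical, and the pressure update here carries a $Q^{-1}$ so that it matches the penalty term (with $Q = I$ it reduces to adding $\lambda(C^T u - g)$).

```python
import numpy as np


def augmented_lagrangian_stokes(K, C, Q, f, g, lam=1.0e3, tol=1.0e-10, max_iter=50):
    """Uzawa-type iteration on the augmented steady Stokes system (illustrative).

    K : (n_u, n_u) SPD viscous matrix
    C : (n_u, n_p) gradient matrix, so that the constraint reads C^T u = g
    Q : (n_p, n_p) SPD pressure mass matrix
    Returns the velocity, the pressure and the number of iterations used.
    """
    Qinv_Ct = np.linalg.solve(Q, C.T)        # Q^{-1} C^T
    Qinv_g = np.linalg.solve(Q, g)           # Q^{-1} g
    A = K + lam * C @ Qinv_Ct                # augmented velocity matrix, fixed
    rhs0 = f + lam * C @ Qinv_g
    P = np.zeros(C.shape[1])                 # initial pressure guess (assumption)
    u = np.zeros(C.shape[0])
    for k in range(1, max_iter + 1):
        # velocity solve with the lagged pressure
        u = np.linalg.solve(A, rhs0 - C @ P)
        # pressure correction driven by the mass-conservation residual
        residual = C.T @ u - g
        P = P + lam * np.linalg.solve(Q, residual)
        if np.linalg.norm(residual) < tol:
            return u, P, k
    return u, P, max_iter
```

For a full-column-rank `C`, the pressure error is multiplied by $(I + \lambda Q^{-1} C^T K^{-1} C)^{-1}$ at each pass, which is why a moderate $\lambda$ already gives fast convergence. The result can be compared with a direct solve of the unaugmented saddle-point system.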
The augmented Lagrangian 'trick' is to apply the so-called Uzawa method to a 'split-up' iterative form of the above equations. We 'demonstrate' this only for the simple special case of steady Stokes flow (see the references for the rest):

$$[K + \lambda C Q^{-1} C^T]\,u_{k+1} = f + \lambda C Q^{-1} g - C P_k \qquad \text{(3.13-198)}$$

and

$$P_{k+1} = P_k + \lambda\,(C^T u_{k+1} - g), \qquad \text{(3.13-199)}$$

a system which often converges 'quickly' even for not-so-large values of $\lambda$ (say $10^3$). We conclude this very brief summary with a few Remarks: (1) In Fortin and Glowinski (1983), the possibility of finding improved convergence rates by using different $\lambda$'s in (3.13-198) and (3.13-199) is discussed. (2) In Simo and Armero (1994), the algorithm is applied as part of a time-marching method with only two iterations per timestep and, for reasons not obvious to us, they omitted the penalty terms in the momentum equations.
(3) Fortin and Fortin (1985c) apply the method in conjunction with a Newton method for solving the steady Navier-Stokes equations.

o Another penalty-like method: PALM. In a completely different approach and for completely different reasons, Hutton and Smith (1981) and Smith (1985) invented the penalty-augmented Lagrangian multiplier (PALM) method in order to correct a deficiency when using biquadratic velocity and continuous bilinear pressure on isoparametric quadrilaterals (typically the serendipity element). The 'deficiency' is a not very accurate representation of pressure and (concomitantly) a not very accurate approximation to $\nabla \cdot u = 0$. The 'fix' is to 'augment' the Lagrange multiplier (pressure) with a penalty term that tends to return element-level mass conservation in a way that seems to be closely related to that using the $Q_2(Q_1 + P_0)$ element of Table 3.13-2; but in PALM the element-level mass balance is only approximately achieved, so it is a sort of penalty approximation to $Q_2(Q_1 + P_0)$. The PALM equations are basically (3.13-196) and (3.13-197) applied to $Q_2^{(8)}Q_1$ (or $Q_2Q_1$), in which (only) the penalty matrix is different; $C$ there is replaced by $C_0$, where here $C_0$ refers to the $C$-matrix of a $Q_2Q_0$ element, which would produce the following element-level mass balance (and the penalty approximation gets it close), a la Gresho et al. (1980b), for the lower left element in Figure 3.13-36:

$$\frac{h}{6}\left[(u_{SSWW} + 4u_{SWW} + u_{WW}) - (u_{SS} + 4u_{S} + u_{0})\right] + \frac{\ell}{6}\left[(v_{SSWW} + 4v_{SSW} + v_{SS}) - (v_{WW} + 4v_{W} + v_{0})\right] = 0. \qquad \text{(3.13-200)}$$

Also, the 'true' pressure from the PALM method is obtained as

$$P(x) = \sum_j P_j\,\psi_j(x) + \sum_e P_e\,\psi^e(x), \qquad \text{(3.13-201)}$$

where $\psi_j(x)$ is the $C^0$-bilinear basis function, $P_j$ is the nodal pressure corresponding to (3.13-196) and (3.13-197), $\psi^e(x)$ is the piecewise-constant basis function on element $e$, and

$$P_e = \frac{\lambda}{A_e}\,(C_0^T u - g) \qquad \text{(3.13-202)}$$

is the 'augmentation' of the Lagrange multiplier.

Final remarks on PALM:

1. Hutton and Smith also apply the method to the $P_2P_1$ element.

2. The larger is $\lambda$, the closer is (3.13-200) satisfied; $\lambda = 10^5$ seems 'typical.'

Final Remark: From Fortin and Glowinski (1983), we end our augmented penalty presentation: 'In summary, the penalisation is indissociable from a mixed (velocity-pressure) method, and must be considered as a solution technique for this latter method, and not as an approximation technique in itself. In this sense the use of augmented Lagrangian methods is quite natural and the techniques of Chapter I provide some advance on the more usual methods, since several iterations actually enable the error due to the penalisation to be eliminated. We do not therefore have to choose values of $r$ as large as in a pure penalisation method. This possibility allows an improvement in the conditioning of the problems in $u_h$, and
this is particularly useful if one is unable to use double precision, or if the problem in $u_h$ is to be solved by an iterative method...,' where their $r$ is our $\lambda$.

f. Some 2D vs 3D considerations

While there may never be even close to consensus on the 'best' element, especially in this day of rapid growth of parallel computers, we dare offer a few opinions for a few special cases; cf. Tables 3.13-1 to 3.13-4 in Section 3.13.2a.

1. If 2D is the name of your game, then the element of choice is the $Q_2P_{-1}$, usually via the (consistent) penalty method, at least if you favor quadrilaterals. If you are a 'triangleperson,' $P_2P_1$ is good, as is the Crouzeix-Raviart element ($P_2^+P_{-1}$); it permits the 'elimination' of $u$ and $v$ at the center node and two pressures at element level, thus making it cost-effective (D. Pelletier, 1993, personal communication); the cost is that of $P_2P_0$ (not recommended), but the accuracy is much higher.

2. If all of your work is in 3D, then the best is less clear, but some probably competitive choices are:
(i) $Q_1Q_0$, perhaps in stabilized form (see next section)
(ii) Stabilized $Q_1Q_1$ (see next section)
(iii) $Q_2P_{-1}$
(iv) $P_2P_1$ or $P_2(P_1 + P_0)$
(v) Mini ($P_1^+P_1$)

3. See Thatcher (1993) in Gunzburger and Nicolaides (1993) for some suggestions and opinions, some of which are surely unjustified and probably wrong, such as 'In practice, the only elements that are widely used are based on quadratic velocities and linear pressures on tetrahedra and triquadratic velocities and trilinear pressures on hexahedra, and it is the latter that are most often used for 3D flow.'

3.13.3 Stabilization [D. J. Silvester]

First, we review some basic definitions and set up notation that is used subsequently. Our starting point is a conventional Galerkin formulation of the incompressible steady-state Stokes equations.
Our aim is to find the velocity-pressure pair $u_h \in X_h$ and $p_h \in M_h$ satisfying:

$$(\operatorname{grad} u_h, \operatorname{grad} v) - (p_h, \operatorname{div} v) = (f, v) \quad \forall v \in X_h,$$
$$-(q, \operatorname{div} u_h) = 0 \quad \forall q \in M_h. \qquad \text{(3.13-203)}$$

Here, $h$ is a representative grid parameter associated with some (quasi-uniform) subdivision $\mathcal{C}_h$ of the flow domain $\Omega$. $X_h \subset X$ and $M_h \subset M$ are finite-dimensional subspaces of the underlying function spaces: $X = (H_0^1(\Omega))^d$ and $M = L^2(\Omega)$ with $d = 2$ or 3. Any non-uniqueness of the pressure solution $p_h$ is associated with the space of pressure modes:

$$Q_h = \{\, q \in M_h \mid (q, \operatorname{div} v) = 0 \quad \forall v \in X_h \,\} \qquad \text{(3.13-204)}$$

being non-trivial.
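In matrix terms, with the gradient matrix $C$ used earlier in this chapter (constraint $C^T u = g$), the pressure modes of (3.13-204) are the vectors $q$ with $Cq = 0$, so a basis for them can be read off a singular value decomposition. The sketch below is only illustrative: the function name and the rank tolerance are assumptions, and the small matrix in the usage note is synthetic (built so that the constant, 'hydrostatic' vector is its only mode), not taken from any particular element.

```python
import numpy as np


def pressure_modes(C, rtol=1.0e-12):
    """Orthonormal basis of {q : C q = 0}, the discrete analog of Q_h.

    C is the (n_u, n_p) gradient matrix; the columns of the returned
    (n_p, n_modes) array span its null space.
    """
    # full SVD: Vt is (n_p, n_p); rows beyond the numerical rank span null(C)
    _, s, Vt = np.linalg.svd(C)
    rank = int(np.sum(s > rtol * s[0])) if s.size and s[0] > 0.0 else 0
    return Vt[rank:].T
```

For example, if every row of a hypothetical `C` sums to zero (as happens when the normal velocity is specified on the whole boundary), the returned basis contains the constant vector; any further columns would be spurious modes such as the checkerboard.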
If finite element spaces $X_h$ and $M_h$ are constructed so that the stability condition:

$$\inf_{q \in M_h \setminus Q_h}\ \sup_{v \in X_h}\ \frac{|(\operatorname{div} v, q)|}{\|v\|_X\, \|q\|_M} \ \ge\ \gamma \qquad \text{(3.13-205)}$$

is satisfied with the stability constant $\gamma > 0$ and independent of $h$, then (3.13-203) is well-posed since the velocity $u_h$ is unique, and the pressure $p_h$ is unique in $M_h \setminus Q_h$.

a. Stable vs. stabilized methods

Let us start with an innocent-looking question, namely: Is stability essential? The issue is surprisingly contentious. Although stability is fundamental in a mathematical sense (it ensures good approximation properties on any conceivable mesh), specialists in solving practical incompressible flow problems often argue otherwise. Their point is that reasonable numerical solutions are often computed using supposedly unstable approximation methods, particularly low-order finite element methods like $Q_1Q_0$. Obviously, a different definition of stability is needed in such cases. It is clearly possible to argue that discretization methods exhibiting pathologies only on certain types of grids are 'semi-stable' in reality. Moreover 'stabilizing' such methods is straightforward in principle; all one needs to do is to restrict the class of allowable grids.

The most stunning example of sensitivity to grid design is the $P_2P_1$ triangular element, with a 'non-conforming' pressure approximation defined by the values at the mid-edge points (so that the pressure is not continuous across the edge except at the mid-point). On the one hand, if uniform grids are constructed by triangulating square elements into 'union-jack' patches, then the non-conforming $P_2P_1$ approximation is 'stable.' Yet, if the direction of triangulation is changed to give a 'diagonal grid,' stability is lost and solution accuracy is immediately reduced by one order of $h$. Similar sensitivity to the triangulation direction is observed in the case of the fully discontinuous $P_2P_{-1}$ triangular element; see Qin (1994) for details.
One thing that makes restricting grids difficult in general is the desirability of adaptive refinement as a means of error control. In particular, using $Q_1Q_0$ as a discretization method it is impossible to categorize stable/unstable meshes in advance. The adaptivity feature is what would seem to make stability really essential. It has certainly prompted the rapid development of universally stabilized formulations in recent years. It also leads us to the main issue addressed here: Is stabilization the way forward?

Those opposed to the principle of stabilization will argue that ensuring stability is not really difficult. For example, using standard finite elements, enriching the velocity approximation space will always do the trick (either adding 'bubble functions,' or else adding velocity degrees of freedom to inter-element edges). Alternatively, working in a finite difference or finite volume setting, a staggered grid (sometimes referred to as the MAC scheme), with normal velocities defined on cell edges, is always stable. On the other hand, stabilization of equal order interpolation elements often looks appealing because it is computationally convenient, especially in a parallel processing/multigrid context. The main drawback, however, is that stabilization always introduces (regularization) parameters, either explicitly or implicitly. Thus, insensitivity to such parameter values is important if the methodology is to be competitive. In many cases, estimates of good/optimal parameter values can be deduced a priori. In other cases, however, an appropriate selection of parameters is not obvious, and in this instance the
advantage of stabilization must be questionable. We give some personal recommendations after the discussion of particular element combinations which follow.

b. Equal order interpolation via stabilization

Using the same basis functions to approximate velocity and pressure is almost always unstable*. For convenience, the lowest and the higher-order cases are discussed separately below. We consider the simplest approximation methods first.

o $P_1P_1$ and $Q_1Q_1$. The inherent instability of low order interpolation methods is well known. For example, solving an enclosed flow problem using a grid of $Q_1Q_1$ rectangular elements, the space $Q_h$ in (3.13-204) is eight dimensional (see Section 3.13-2b). This means that extreme care is required in imposing non-homogeneous velocity boundary conditions in order to ensure that the singular system is consistent. If the pressure modes are filtered a priori (for example, by relaxing the boundary conditions), the approximation is still likely to be unstable because of the presence of 'pesky modes' (see Section 3.13-2b) associated with the inf-sup constant in (3.13-205) being $O(h)$. An illustration is given later in this section.

In the $P_1P_1$ case, there is a simple way to stabilize the approximation without a formal loss of accuracy. The idea (originally presented in Brezzi and Pitkäranta, 1984) is to 'regularize' (3.13-203) via a pressure Laplacian perturbation of the incompressibility constraint:

$$(\operatorname{grad} u_h, \operatorname{grad} v) - (p_h, \operatorname{div} v) = (f, v) \quad \forall v \in X_h,$$
$$-(q, \operatorname{div} u_h) - \beta \sum_{K \in \mathcal{C}_h} h_K^2\, (\operatorname{grad} p_h, \operatorname{grad} q)_K = 0 \quad \forall q \in M_h. \qquad \text{(3.13-206)}$$

Here, $h_K$ is the diameter of the $K$th element in the subdivision $\mathcal{C}_h$, $(\cdot, \cdot)_K$ denotes the element-wise $L^2$ inner product, and the regularization parameter $\beta$ is strictly positive.
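On a single linear triangle, the perturbation term in (3.13-206) is just $\beta h_K^2$ times the usual $P_1$ 'stiffness' matrix for the pressure. The sketch below assembles that 3x3 element contribution; it is a minimal illustration, with a hypothetical function name, and with $h_K$ taken as the longest edge of the triangle (its diameter).

```python
import numpy as np


def p1_pressure_stabilization(vertices, beta):
    """Element matrix beta * h_K^2 * (grad psi_i, grad psi_j)_K on a P1 triangle.

    vertices : three (x, y) pairs; beta : regularization parameter (> 0).
    """
    v = np.asarray(vertices, dtype=float)
    x, y = v[:, 0], v[:, 1]
    # signed twice-area of the triangle
    det = (x[1] - x[0]) * (y[2] - y[0]) - (x[2] - x[0]) * (y[1] - y[0])
    area = 0.5 * abs(det)
    # constant gradients of the three barycentric (P1) basis functions
    grads = np.column_stack([
        [y[1] - y[2], y[2] - y[0], y[0] - y[1]],
        [x[2] - x[1], x[0] - x[2], x[1] - x[0]],
    ]) / det
    # element diameter: longest edge
    h_K = np.linalg.norm(v - np.roll(v, -1, axis=0), axis=1).max()
    return beta * h_K**2 * area * (grads @ grads.T)
```

Each row of the returned matrix sums to zero, so a constant pressure is annihilated by the perturbation; this is the algebraic reason why the hydrostatic pressure level is unaffected by the stabilization.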
The formulation (3.13-206) is also useful in the $Q_1Q_1$ case, although the original analysis (Hughes et al., 1986) indicated that additional 'consistency' terms should be added to (3.13-206) if the quadrilateral grid is non-cartesian. (These extra terms are usually omitted in practice; see Hughes et al. 1986, p. 96.) Because of the $O(h^2)$ perturbation, the method (3.13-206) is first-order convergent at best: if the exact Stokes solution $(u, p)$ is sufficiently smooth, and if the hydrostatic pressure is set appropriately, then an a priori error estimate is satisfied:

$$\|u - u_h\|_1 + \|p - p_h\|_0 \le C\,(h\,|u|_2 + h^2\,|p|_2). \qquad \text{(3.13-207)}$$

Note that the order of the pressure approximation in (3.13-207) is limited by the velocity error, so that using either $P_1P_1$ or $Q_1Q_1$, the $O(h)$ pressure error is the best that one can expect.

It must be stressed here that the 'weakening' of the incompressibility constraint does not destroy global conservation of mass (because the hydrostatic pressure is in the nullspace of the perturbation operator). This is in contrast to penalty methods, whereby global incompressibility is sacrificed, the degree of compressibility being proportional to the size of the penalty parameter.

* The exception is the case of fully periodic boundary conditions; see, for example, Dean and Glowinski (1993, p. 49).
The popularity of the approach (3.13-206) is largely due to computational convenience: implementation is relatively trivial since the stabilization matrix is a standard element 'stiffness matrix.' The difficulty of finding a 'good' choice of $\beta$ in (3.13-206) is the only limiting factor. Unfortunately, as illustrated below, the quality of the approximation is very dependent on the magnitude of $\beta$. Furthermore, we show later on that an inappropriate choice not only leads to inaccurate solutions, but also adversely affects the convergence of iterative solvers applied to (3.13-206).

Perhaps the biggest problem with (3.13-206) is that it is very easy to 'over-stabilize' by using a parameter which is too large, in which case the quality of the divergence-free approximation must inevitably deteriorate. Indeed, in the limiting case of infinite $\beta$ the solution of (3.13-206) is a constant pressure together with a velocity which is virtually unconstrained (it is only divergence-free globally). The important point is that stability does not guarantee accuracy.

The issue of a small parameter value is quite subtle since (3.13-206) is theoretically stable for all positive values of $\beta$. The proof is trivial enough; defining the bilinear form:

$$B_h(w, r; v, q) = (\nabla w, \nabla v) - (\operatorname{div} v, r) - (\operatorname{div} w, q) - \beta \sum_{K \in \mathcal{C}_h} h_K^2\, (\operatorname{grad} r, \operatorname{grad} q)_K, \qquad \text{(3.13-208)}$$

and the mesh-dependent norm:

$$|||(u, p)||| = \Big( \|u\|_1^2 + \sum_{K \in \mathcal{C}_h} h_K^2\, \|\nabla p\|_{0,K}^2 \Big)^{1/2}, \qquad \text{(3.13-209)}$$

the approximation (3.13-206) is coercive over $X_h \times M_h$ (and thus stable). That is, for all $\beta > 0$ a positive constant $\alpha$ exists such that

$$B_h(v, q; v, -q) \ge \alpha\, |||(v, q)|||^2. \qquad \text{(3.13-210)}$$

(Note that this is a slightly unusual definition of coercivity; see Franca et al. 1993, p. 97 for further details.) The loophole in the theory is the fact that the constant $\alpha$ in (3.13-210) behaves like $\beta$ for $\beta < 1$. Intuitively, if the parameter is small then we are essentially solving the original problem and the usual symptoms of instability will be apparent.
A set of numerical experiments illustrating this is given in Pierre (1988). To gain further understanding it is useful to look at the issue of 'small' $\beta$ in a spectral setting. To this end, consider the general matrix eigenproblem:

$$\begin{bmatrix} K & C \\ C^T & -\beta S \end{bmatrix} \begin{bmatrix} u \\ p \end{bmatrix} = \mu \begin{bmatrix} K & 0 \\ 0 & Q \end{bmatrix} \begin{bmatrix} u \\ p \end{bmatrix}, \qquad \text{(3.13-211)}$$

where $\mu$ is the eigenvalue, $Q$ is the pressure mass matrix and $K$, $C$ and $S$ are defined in the usual way from (3.13-206). The first thing to point out (see below) is that a sharp upper bound on the largest negative eigenvalue $\mu_{-1}$ is given by:

$$\mu_{-1} \le \frac{1}{2}\Big(1 - \sqrt{1 + 4\gamma_s^2}\Big), \qquad \text{(3.13-212)}$$

where the 'inf-sup like' constant $\gamma_s$ is defined via:

$$\gamma_s^2 = \min_{p \in M_h} \frac{p^T (C^T K^{-1} C + \beta S)\, p}{p^T Q\, p}. \qquad \text{(3.13-213)}$$
Note that without stabilization ($\beta = 0$) (3.13-213) is algebraically equivalent to (3.13-205) (for details see Brezzi and Fortin, 1991, pp. 73-78). This suggests that the role of stabilization is to ensure that $\gamma_s$ is independent of $h$ in cases when $\gamma$ in (3.13-205) is $O(h)$.

The derivation of (3.13-212) is satisfyingly neat, and is included for completeness. Expanding (3.13-211):

$$-Cp = (1 - \mu)\, K u, \qquad \text{(3.13-214)}$$
$$C^T u - \beta S p = \mu\, Q p, \qquad \text{(3.13-215)}$$

we consider $\mu < 0$ and note that in this case $p \neq 0$ since otherwise (3.13-214) implies $u = 0$, contradicting the definition of an eigenvector. Then, taking the scalar product of (3.13-215) with $p$ and substituting for $u$ from (3.13-214) gives:

$$(1 - \mu)^{-1}\, p^T C^T K^{-1} C p + \beta\, p^T S p = -\mu\, p^T Q p,$$

and since $0 \le (1 - \mu)^{-1} \le 1$ it follows that

$$(1 - \mu)^{-1}\, p^T (C^T K^{-1} C + \beta S)\, p \le -\mu\, p^T Q p.$$

Using the definition (3.13-213), and the fact that the mass matrix is positive definite, leads to:

$$\gamma_s^2\, (1 - \mu)^{-1} \le -\mu; \quad \text{thus} \quad 0 \le \mu^2 - \mu - \gamma_s^2,$$

from which (3.13-212) easily follows. Note that the argument above is very general: (3.13-212) holds independently of the choice of stabilization matrix $S$.

The next step is to quantify the relationship between $\gamma_s$ and $\beta$ as described in Silvester (1994). Starting from (3.13-213), a crude bound is obviously

$$\gamma_s^2 \le \min_{q \in M_h} \frac{q^T C^T K^{-1} C q}{q^T Q q} + \beta \max_{q \in M_h} \frac{q^T S q}{q^T Q q}. \qquad \text{(3.13-216)}$$

Thus, introducing the constant

$$\Theta^2 = \max_{q \in M_h} \frac{q^T S q}{q^T Q q} \qquad \text{(3.13-217)}$$

and noting that the definition of $Q_h$ (3.13-204) implies that

$$0 = \min_{q \in M_h} \frac{q^T C^T K^{-1} C q}{q^T Q q}, \qquad \text{(3.13-218)}$$

we deduce that $\gamma_s^2 \le \beta \Theta^2$. An analogous argument gives the alternative bound

$$\gamma_s^2 \le \max_{q \in M_h} \frac{q^T C^T K^{-1} C q}{q^T Q q} + \beta \min_{q \in M_h} \frac{q^T S q}{q^T Q q}. \qquad \text{(3.13-219)}$$

Thus if we define an upper bound

$$\Gamma^2 = \max_{q \in M_h} \frac{q^T C^T K^{-1} C q}{q^T Q q}, \qquad \text{(3.13-220)}$$
and note that in the case of $P_1P_1$ or $Q_1Q_1$ stabilized via (3.13-206)

$$0 = \min_{q \in M_h} \frac{q^T S q}{q^T Q q} \qquad \text{(3.13-221)}$$

(because a constant pressure is in the nullspace of the stabilization operator by construction), we deduce that $\gamma_s^2 \le \Gamma^2$, and hence that

$$\gamma_s^2 \le \min(\beta \Theta^2,\ \Gamma^2). \qquad \text{(3.13-222)}$$

This implies that the stability constant tends to zero as $\beta \to 0$. Moreover, if the stability parameter is small, then (3.13-212) and (3.13-222) imply that:

$$\mu_{-1} = -\beta \Theta^2 + O(\beta^2),$$

which means that the matrix system associated with (3.13-206) becomes increasingly ill-conditioned as $\beta \to 0$.

The result (3.13-222) also points to the existence of a critical ('optimal') parameter $\beta_c = \Gamma^2/\Theta^2$. The implication of (3.13-222) is that the stability constant $\gamma_s$ (and thus $\mu_{-1}$) is essentially independent of $\beta$ as long as $\beta$ is large enough, i.e. $\beta \ge \beta_c$. One way of using this result is to estimate constants $\Gamma_*$ and $\Theta_*$ which are independent of the mesh so that $\Gamma_* \ge \Gamma$ and $\Theta_* \le \Theta$, and then to make the specific choice $\beta = \Gamma_*^2/\Theta_*^2 \ge \beta_c$. Note however, $\beta_c$ is optimal in the sense that if $\beta > \beta_c$ then the maximum negative eigenvalue of (3.13-211) varies like $O(\beta)$ (so that the system associated with (3.13-206) also becomes increasingly ill-conditioned as $\beta \to \infty$; see Silvester, 1994). Thus it pays off to estimate $\beta_c$ as accurately as possible.

A simple estimate of $\Gamma_*$ is well known (see Fortin and Pierre, 1992): a Cauchy-Schwarz argument yields

$$\frac{|(\operatorname{div} v, p)|^2}{\|v\|_X^2\, \|p\|_M^2} \le \frac{\|\operatorname{div} v\|^2}{\|\nabla v\|^2} \le d, \qquad \text{(3.13-223)}$$

so for example in $\mathbb{R}^2$ we have $\sqrt{2} \ge \Gamma$. In practice, this estimate (which holds for all mixed approximations) seems to be pessimistic. In particular, in the case of the $Q_1Q_1$ approximation, numerical computations on quasi-uniform cartesian grids of rectangular elements suggest that $\Gamma \to 1$ from below, as $h \to 0$. Hence a better choice would be $\Gamma_* = 1$ in this case.
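The bounds (3.13-212) and (3.13-222) are purely algebraic, so they can be checked numerically on any matrices with the right structure. The sketch below does this for the block eigenproblem implied by (3.13-214) and (3.13-215); it is an illustration only, with hypothetical function names, dense linear algebra, and (in the usage note) small synthetic matrices rather than matrices assembled from a real mesh.

```python
import numpy as np


def generalized_eigvals(A, M):
    """Eigenvalues (ascending) of A x = mu M x, for symmetric A and SPD M."""
    L = np.linalg.cholesky(M)
    Y = np.linalg.solve(L, A)              # L^{-1} A
    At = np.linalg.solve(L, Y.T).T         # L^{-1} A L^{-T}
    return np.linalg.eigvalsh(0.5 * (At + At.T))


def stabilized_stokes_spectrum(K, C, S, Q, beta):
    """Return (mu_neg, gamma_s2, bound) for the stabilized Stokes eigenproblem.

    mu_neg   : largest negative eigenvalue of the block problem
    gamma_s2 : min of p^T (C^T K^{-1} C + beta S) p / p^T Q p
    bound    : (1 - sqrt(1 + 4 gamma_s2)) / 2, the right-hand side of (3.13-212)
    """
    n_u, n_p = C.shape
    A = np.block([[K, C], [C.T, -beta * S]])
    M = np.block([[K, np.zeros((n_u, n_p))],
                  [np.zeros((n_p, n_u)), Q]])
    mu = generalized_eigvals(A, M)
    schur = C.T @ np.linalg.solve(K, C) + beta * S
    gamma_s2 = max(generalized_eigvals(schur, Q)[0], 0.0)  # clip round-off
    mu_neg = mu[mu < 0.0].max()
    bound = 0.5 * (1.0 - np.sqrt(1.0 + 4.0 * gamma_s2))
    return mu_neg, gamma_s2, bound
```

A usage example under these assumptions: take a random SPD `K`, a `C` with one artificial pressure mode, and an `S` whose nullspace is the constants; sweeping `beta` then shows `gamma_s2` growing like `beta * Theta^2` for small `beta` and saturating at or below `Gamma^2` for large `beta`, while `mu_neg` never exceeds `bound`.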
Estimating $\Theta^2$ a priori is more problematical, at least in the case of the scaled Laplacian stabilization operator in (3.13-206). (Using a local stabilization based on macro-elements there is no problem; see the next section.) The obvious way of proceeding is to try to use a finite element inverse estimate

$$\|\nabla p\|^2 \le C_I\, h^{-2}\, \|p\|^2, \qquad \text{(3.13-224)}$$

but this only gives limited information. Specifically, (3.13-224) implies that for a quasi-uniform sequence of grids, $\Theta^2$ is bounded above by the inverse constant $C_I$. In general, it seems that direct computation of the largest eigenvalue of the (scaled) stabilization matrix using representative grids is a better way of getting a good estimate for $\Theta_*$.

The determination of the optimal parameter in (3.13-206) is the key to assessing the quality of implicitly stabilized methods. The simplest method in this category is the 'mini-element' discretization, see Brezzi and Fortin (1991, p. 213), where the basic $P_1P_1$
method is 'stabilized' by augmenting the velocity in each element with 'bubble' degrees of freedom (one for each component). Since (by definition) a typical bubble function $\phi_K$ is zero on the boundary of the element (ensuring that the enriched velocity is continuous across element edges or faces), the associated degrees of freedom may be removed using the standard process of static condensation; that is, each bubble degree of freedom may be eliminated before element assembly. If this is done, then the reduced system also looks like (3.13-206) with a perturbed incompressibility constraint of the form

$$-(q, \operatorname{div} u_h) - \sum_{K \in \mathcal{C}_h} \frac{\big(\int_K \phi_K\big)^2}{A_K\, (\operatorname{grad} \phi_K, \operatorname{grad} \phi_K)_K}\, (\operatorname{grad} p_h, \operatorname{grad} q)_K = 0 \quad \forall q \in M_h, \qquad \text{(3.13-225)}$$

where $A_K$ is the element area/volume. For a proof, see Pierre (1995). This interpretation is appealing theoretically since it facilitates the construction of the 'optimal bubble' such that after normalizing (for example, setting the centroid value of $\phi_K$ to unity), the inf-sup constant (3.13-205) is maximized. One of the products is that for a grid of equilateral triangles, the standard cubic bubble element can be shown to be the 'best' possible conforming mini-element.

In practice, it is important to note that the cubic bubble mini-element often gives a poor approximation (Löhner, 1993; Pierre, 1988); for example, local $O(h)$ pressure oscillations are often observed near boundaries, suggesting that the implicitly defined stabilization parameter in (3.13-225) is too small. The above discussion of the optimal parameter $\beta_c$ sheds a little light on this: specifically, if we have a uniform grid of right-angled triangles with short sides parallel to the coordinate axes, then the cubic bubble mini-element solution satisfies (3.13-206) with:

$$\beta h_K^2 = \frac{1}{40}\, \frac{\Delta x^2\, \Delta y^2}{\Delta x^2 + \Delta y^2}. \qquad \text{(3.13-226)}$$

Thus, putting $\Delta x = \Delta y = h_K$ we have a stabilization parameter of $\beta = 1/80$.
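The value $\beta = 1/80$ can be checked directly from the coefficient in (3.13-225). The sketch below evaluates $(\int_K \phi_K)^2 / [A_K (\operatorname{grad}\phi_K, \operatorname{grad}\phi_K)_K]$ for a triangle, assuming the standard cubic bubble $\phi_K = 27\lambda_1\lambda_2\lambda_3$ (unit value at the centroid) and using the exact barycentric integration formula $\int_K \lambda_1^a\lambda_2^b\lambda_3^c = 2A_K\, a!\,b!\,c!/(a+b+c+2)!$, which gives $\int_K \phi_K = 9A_K/20$ and $\int_K |\operatorname{grad}\phi_K|^2 = (81/20)\,A_K \sum_i |\operatorname{grad}\lambda_i|^2$. The function name is hypothetical.

```python
import numpy as np


def cubic_bubble_coefficient(vertices):
    """Coefficient multiplying (grad p_h, grad q)_K after condensing the bubble.

    Assumes phi_K = 27 * l1 * l2 * l3 on a triangle with the given vertices.
    """
    v = np.asarray(vertices, dtype=float)
    x, y = v[:, 0], v[:, 1]
    det = (x[1] - x[0]) * (y[2] - y[0]) - (x[2] - x[0]) * (y[1] - y[0])
    area = 0.5 * abs(det)
    # constant gradients of the barycentric coordinates
    grads = np.column_stack([
        [y[1] - y[2], y[2] - y[0], y[0] - y[1]],
        [x[2] - x[1], x[0] - x[2], x[1] - x[0]],
    ]) / det
    int_phi = 9.0 * area / 20.0                           # integral of the bubble
    int_grad2 = 81.0 / 20.0 * area * np.sum(grads**2)     # integral of |grad phi|^2
    return int_phi**2 / (area * int_grad2)
```

For a right-angled triangle with legs `dx` and `dy` the returned value equals `dx**2 * dy**2 / (40 * (dx**2 + dy**2))`; with both legs equal to `h` it is `h**2 / 80`.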
Moreover, if a distortion parameter $a$ is introduced so that $\Delta x = a h_K$ and $\Delta y = h_K$, then

$$\beta = \frac{[\ldots]}{40\,(1 + a^2)}, \qquad \text{(3.13-227)}$$

which shows that $\beta$ tends to zero in the limit of highly stretched triangles with $a \to 0$ or $a \to \infty$. Further discussion of this issue is given in Becker and Rannacher (1994). In Löhner (1993) the suggested 'fix' is to multiply the natural mini-element parameter by an order of magnitude for elements within boundary layers!

Implicitly stabilized methods also arise in a natural way if standard (semi-)explicit time stepping schemes are applied to the 'slightly compressible' Stokes system:

$$(\operatorname{grad} u_h, \operatorname{grad} v) - (p_h, \operatorname{div} v) = (f, v) \quad \forall v \in X_h,$$
$$-(q, \operatorname{div} u_h) = \frac{1}{c^2}\Big(q, \frac{\partial p_h}{\partial t}\Big) \quad \forall q \in M_h, \qquad \text{(3.13-228)}$$

where $c$ is the speed of 'sound'. See Zienkiewicz and Wu (1991) for a complete list of possibilities, which include simple predictor-corrector methods (with the momentum equation treated explicitly), Taylor-Galerkin type methods, and explicit Runge-Kutta methods. In the simplest case of a fixed time-step $\Delta t$, the steady-state solution to
(3.13-228) satisfies a system containing an $O(\Delta t)$ perturbation term. For example, if (3.13-228) is discretized using a uniform grid of $P_1P_1$ elements, with a time-step $\Delta t = h^2/4$ determined by the local stability limit, then the solution ultimately obtained using a Taylor-Galerkin approach (see Löhner, 1993) will satisfy a system which is essentially (3.13-206) with a (large) parameter $\beta = 1/2$. In general, the idea of viewing pseudo time-stepping as a mechanism for enforcing stability is likely to be fraught with peril. Nevertheless, good results are certainly possible in the hands of experts; see Zienkiewicz and Wu (1992).

o $P_kP_k$ and $Q_kQ_k$ for $k \ge 2$. The instability of equal-order interpolation is intrinsic. Unfortunately, stabilization in the case $k \ge 2$ appears to be considerably more complicated than in the low order case. A variety of practical computations have been done nevertheless, using the methodology described below (for details see Tezduyar, 1992). The key idea was suggested by Hughes and Franca (1987), and involves adding mesh-dependent Galerkin-least-squares perturbation terms to the discrete formulation (3.13-203):

$$(\operatorname{grad} u_h, \operatorname{grad} v) - (p_h, \operatorname{div} v) - \alpha \sum_{K \in \mathcal{C}_h} h_K^2\, (-\nabla^2 u_h + \operatorname{grad} p_h - f,\ \pm\nabla^2 v)_K = (f, v) \quad \forall v \in X_h,$$
$$-(q, \operatorname{div} u_h) - \alpha \sum_{K \in \mathcal{C}_h} h_K^2\, (-\nabla^2 u_h + \operatorname{grad} p_h - f,\ \operatorname{grad} q)_K = 0 \quad \forall q \in M_h. \qquad \text{(3.13-229)}$$

In this case $\alpha > 0$ is the regularization parameter. The 'trick' is to add the stabilization terms in an element-by-element fashion, circumventing the more demanding continuity requirements associated with a conventional least-squares formulation.

There are obviously two possibilities above, depending on the choice of sign in (3.13-229). Both possibilities are consistent in the sense that the solution of the underlying continuous Stokes problem also satisfies (3.13-229). The symmetric (minus) formulation is that given in Hughes and Franca (1987). It is stable only if $0 < \alpha < \alpha_*$, where $\alpha_*$ is defined via the inverse estimate (cf.
(3.13-224)):

$$\alpha_* = \max_{v \in X_h} \frac{\|\operatorname{grad} v\|_0^2}{\sum_{K \in \mathcal{C}_h} h_K^2\, \|\nabla^2 v\|_{0,K}^2}. \qquad \text{(3.13-230)}$$

A systematic way of computing $\alpha_*$ (solving a local eigenvalue problem on each element), which is applicable to highly non-uniform grids, is given in Franca and Madureira (1993). The unsymmetric (plus) formulation is more recent; see Douglas and Wang (1989). The motivation for its introduction is that stability is ensured for all values of $\alpha$ in this case.

In the symmetric case, if $\alpha < \alpha_*$, and assuming that the exact solutions are sufficiently smooth, the following error estimate is established in Franca and Stenberg (1991):

$$\|u - u_h\|_1 + \|p - p_h\|_0 \le C\,(h^k\, |u|_{k+1} + h^{k+1}\, |p|_{k+1}). \qquad \text{(3.13-231)}$$

The same estimate holds for the unsymmetric formulation with no restriction on $\alpha$ (note that $C$ in (3.13-231) depends on $\alpha$, however). A numerical performance comparison of the plus/minus formulations in the case $k = 2$ can be found in Franca and Frey (1992). The bottom line seems to be that the unsymmetric (plus) formulation is the better method
because of its relative insensitivity to the choice of parameter, although a clean mechanism for determining $\alpha$ in practice is not obvious in either case.

c. Stabilized approximations using discontinuous pressure

Apart from the case of equal-order interpolation, the need for stabilization is most apparent using mixed methods based on discontinuous pressure. Discontinuous pressures (or alternatively control-volume finite element methods) are to be preferred in cases where mass conservation at the element level is important. Noting the typical form of the mixed approximation error estimate (for example, (3.13-231)), it is clear that methods with a pressure approximation which is one degree less than the velocity are of prime interest. Again, the lowest order case is special and will be considered in detail.

o $P_1P_0$ and $Q_1Q_0$. The unstable $P_1P_0$ triangular/tetrahedral and $Q_1Q_0$ quadrilateral/brick elements are very natural mixed methods; moreover their inherent simplicity makes them computationally attractive. Two ways of stabilizing these methods are discussed here. Each of these has advantages/disadvantages which are described below. Unfortunately, as we shall demonstrate, both types of stabilization tend to destroy the intrinsic simplicity of the underlying approximations.

The first approach we consider is the so-called global stabilization approach, which is essentially a special case of the formulation in Hughes and Franca (1987), and corresponds to controlling the jumps in pressure across element boundaries:

$$(\operatorname{grad} u_h, \operatorname{grad} v) - (p_h, \operatorname{div} v) = (f, v) \quad \forall v \in X_h,$$
$$-(q, \operatorname{div} u_h) - \beta \sum_{e \in \Gamma_h} h_e \int_e [[p_h]]_e\, [[q]]_e\, ds = 0 \quad \forall q \in M_h. \qquad \text{(3.13-232)}$$

Here, $h_e$ is the length of the element edge (in $\mathbb{R}^2$) or the diameter of the element face (in $\mathbb{R}^3$), $[[\cdot]]_e$ is the jump operator, and $\Gamma_h$ is the set of all interior inter-element edges/faces.
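For piecewise-constant pressures the jump term in (3.13-232) is easy to assemble edge by edge. The sketch below does so for a uniform grid of square elements of side $h$, where every interior edge has $h_e = h$ and length $h$, so each edge contributes $\beta h^2 (p_i - p_j)(q_i - q_j)$. The function name and the row-by-row cell numbering are assumptions made only for this illustration.

```python
import numpy as np


def jump_stabilization(nx, ny, h, beta):
    """Matrix of beta * sum_e h_e * int_e [[p]] [[q]] ds on an nx-by-ny grid
    of square Q1Q0 elements of side h (piecewise-constant pressure)."""
    n = nx * ny
    S = np.zeros((n, n))

    def cell(i, j):
        # hypothetical numbering: cells counted row by row
        return j * nx + i

    weight = beta * h * h          # h_e times the edge length
    for j in range(ny):
        for i in range(nx):
            # visit each interior edge once: east and north neighbours
            for di, dj in ((1, 0), (0, 1)):
                ii, jj = i + di, j + dj
                if ii < nx and jj < ny:
                    a, b = cell(i, j), cell(ii, jj)
                    S[a, a] += weight
                    S[b, b] += weight
                    S[a, b] -= weight
                    S[b, a] -= weight
    return S
```

On a 3-by-3 patch, the row belonging to the central element has $4\beta h^2$ on the diagonal and $-\beta h^2$ for each of the four edge neighbours, which is a five-point discrete Laplacian on the grid of element centroids; constant pressures are again in the nullspace.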
The stabilization is global in the sense that the eigenfunctions of the perturbation operator in (3.13-232) are all global functions; they cannot be constructed using a local (element or macro-element) basis. It is interesting to observe that the stabilized system (3.13-232) is closely related to the stabilized $Q_1Q_1$ method above. To illustrate, consider a uniform grid of square $Q_1Q_0$ elements of side $h$. Assuming the usual pointwise interpretation of the constant pressure (i.e., the value at the centroid), the stabilization term corresponding to the piecewise-constant pressure test function $q_0$ at the centre of the patch of nine elements illustrated in Figure 3.13-16 is given by:

$\sum_{e \in \Gamma_h} h_e \int_e [[p_h]]_e\, [[q_0]]_e \, ds = h^2 \{(p_0 - p_E) + (p_0 - p_N) + (p_0 - p_W) + (p_0 - p_S)\}$
$\qquad = h^2 (4p_0 - p_E - p_N - p_W - p_S)$
$\qquad = -h^4 \nabla^2 p_h + O(h^6),$

that is, the stabilization term is just a scaled discrete Laplacian defined on the 'dual' grid obtained by joining the element centroids. Since the approach (3.13-232) is clearly another
'pressure Laplacian' stabilization, poor accuracy is to be expected if $\beta$ is too large. (In the limiting case of infinite $\beta$ the solution of (3.13-232) is again a constant pressure.)

[Fig. 3.13-16 Typical patch of $Q_1Q_0$ elements.]

The analysis of (3.13-232) is a straightforward generalisation of that above; coercivity in the mesh-dependent norm

$|||(\mathbf{u}, p)||| = \left( \|\mathbf{u}\|_1^2 + \beta \sum_{e \in \Gamma_h} h_e \int_e [[p]]_e^2 \, ds \right)^{1/2}$    (3.13-233)

over $X_h \times M_h$ is the key to deriving the desired error estimate:

$\|\mathbf{u} - \mathbf{u}_h\|_1 + \|p - p_h\|_0 \le Ch.$    (3.13-234)

The spectral analysis leading to the 'best' choice of $\beta$ also applies here. For example, computations on uniform square grids in Silvester (1994) suggest that $\Gamma^* = 1$ is a good estimate in this case. Estimating $\Theta^*$ is more tricky, however; see the discussion of $Q_1Q_1$ above.

A relevant observation here is that explicitly perturbing the incompressibility constraint by a suitably scaled pressure Laplacian operator is also a relatively clean way of stabilizing 'non-staggered' finite volume (and centered finite difference) methods. Furthermore, in the case of a uniform grid, a local Fourier analysis of the non-staggered finite difference operators suggests that $\beta = 1/16$ is an intelligent choice to make in such cases. Specifically, $\beta = 1/16$ is the smallest possible value that maximizes the local ellipticity measure which determines smoothing of high frequency error components. Hence an optimally convergent multigrid solver for the underlying discrete Stokes system is obtained in this case. See Linden et al. (1988) for further details.

In practice, however, the globally stabilized $Q_1Q_0$ and $P_1P_0$ methods have limited appeal. There are two main reasons for this: First, the jump terms make life awkward in the
sense of ease of implementation into existing codes based on $Q_1Q_0$; secondly, in gaining stability, the local incompressibility and the simplicity of the underlying approximation are sacrificed. (Note that mass is still conserved globally since the nullspace of the stabilization matrix contains constant vectors.) Both of these limitations may be overcome by making a subtle modification, however. The idea is simply to control jumps in pressure locally within macro-elements instead of globally across the whole grid.

Starting from the global approach (3.13-232), the idea is to aggregate adjoining elements into an appropriate macro-element partitioning $\mathcal{M}_h$, and then omit those jumps in pressure across the macro-element boundaries. This gives the following local stabilization method [Kechkar and Silvester (1992)]:

$(\mathrm{grad}\,\mathbf{u}_h, \mathrm{grad}\,\mathbf{v}) - (p_h, \mathrm{div}\,\mathbf{v}) = (\mathbf{f}, \mathbf{v}) \quad \forall \mathbf{v} \in X_h,$
$-(q, \mathrm{div}\,\mathbf{u}_h) - \beta \sum_{M \in \mathcal{M}_h} h_M \sum_{e \in \Gamma_M} \int_e [[p_h]]_e\, [[q]]_e \, ds = 0 \quad \forall q \in M_h,$    (3.13-235)

where $\Gamma_M$ is the set of all edges/faces in the interior of the $M$th macro-element, and $h_M$ is a measure of the macro-element's size (see below). Of course, if stability is to be retained then the number of elements in each macro-element must be sufficiently large: if every macro-element contained just one element (i.e. $\mathcal{M}_h = \mathcal{C}_h$) there are no internal jump terms (i.e. $\Gamma_M = \emptyset$), and (3.13-235) degenerates to the unstabilized formulation.

In the motivating paper [Kechkar and Silvester (1992)] it is rigorously established that as long as $\mathcal{M}_h$ is constructed so that each macro-element is topologically equivalent to a reference macro-element having a velocity node on every edge (or every face in three dimensions), then there exists a minimal parameter value $\beta_0 > 0$ such that the formulation (3.13-235) is stable ($\gamma_S$ in (3.13-213) is independent of $h$), and the optimal error estimate (3.13-234) holds (with a constant $C$ independent of $\beta$).
Note also that the globally stabilized formulation (3.13-232) corresponds to the extreme case of a local stabilization based on a single macro-element.

One of the features of (3.13-235) is that if the discrete incompressibility constraints are added together then the jump terms sum to zero in each macro-element (a specific example is given below). This is crucially important to the success of the method since it implies that the local incompressibility of the $Q_1Q_0$ or $P_1P_0$ method is retained after stabilization (albeit over macro-elements). It also suggests that a good strategy when constructing $\mathcal{M}_h$ is to form macro-elements containing as few elements as possible. Indeed, given some arbitrary grid, an 'optimal' partitioning $\mathcal{M}_h$ may be constructed by a simple adaptive process: successively subdividing large patches into smaller ones until further subdivision cannot be done without violating the connectivity constraint on the macro-elements. As an illustration, the patch of seven quadrilaterals in Figure 3.13-17 can obviously be split into two macro-elements which are topologically equivalent to the reference macro-elements illustrated in Figure 3.13-18.

Once a suitable macro-element partitioning has been formed, the local stabilization matrices can be calculated by running through the component elements, summing jump contributions corresponding to the internal edges. For example, in the case of the four-element macro-element in Figure 3.13-17, each element has two internal (dotted) edges
and the local stabilization matrix implied by (3.13-235) is given by:

$S_M = h_M \begin{pmatrix} l_{41} + l_{12} & -l_{12} & 0 & -l_{41} \\ -l_{12} & l_{12} + l_{23} & -l_{23} & 0 \\ 0 & -l_{23} & l_{23} + l_{34} & -l_{34} \\ -l_{41} & 0 & -l_{34} & l_{34} + l_{41} \end{pmatrix}.$    (3.13-236)

Here $l_{ij}$ is the length of the edge between elements $i$ and $j$. The reference length $h_M$ may be computed by simply defining it to be the average diameter of the constituent elements.

[Fig. 3.13-17 Typical patch of seven $Q_1Q_0$ elements.]

[Fig. 3.13-18 Reference $Q_1Q_0$ macro elements.]

In two dimensions a convenient way of constructing a 'legitimate' $\mathcal{M}_h$ is to take a coarse subdivision of quadrilaterals and triangles and then to uniformly refine it once by joining the mid-edge points. This gives a macro-element partitioning with each macro-element consisting of precisely four elements. The quadrilateral macro elements are thus
topologically equivalent to the left-hand reference macro-element in Figure 3.13-18, and the triangular macro-elements are all topologically equivalent to the reference triangle in Figure 3.13-19. In three dimensions the 2 x 2 x 2 block is the obvious starting point for stabilizing $Q_1Q_0$. Similarly, the basic reference $P_1P_0$ tetrahedral macro-element has each face built up from three adjoining sub-tetrahedra, as illustrated in Figure 3.13-19. (The crucial point here is that each face of the macro-element tetrahedron must have a 'centre-node' in order to satisfy the connectivity condition.)

[Fig. 3.13-19 Reference $P_1P_0$ macro elements.]

Perhaps the most serious potential drawback of the local framework (3.13-235) is that stability is only guaranteed if the stabilization parameter $\beta$ is bigger than some critical value $\beta_0$, which needs to be estimated (cf. the globally stabilized case where stability is built in). Fortunately, this does not cause any real difficulty in practice since it is known theoretically that the value $\beta_c$ defined via the spectral analysis is always big enough (since $\beta_0 \le \beta_c$); see Silvester (1994). This is important in the framework (3.13-235) because the determination of an estimate for $\beta_c$ is a simple piece of local analysis.

To illustrate this point, consider the case of a uniform grid of $J \times J$ square $Q_1Q_0$ elements of side $h$. In constructing (3.13-235) there are two possibilities: if $J$ is even, then local stabilization can be based on the 2 x 2 macro-element illustrated in Figure 3.13-18, whereas if $J$ is odd, then a single layer of larger 2 x 3 macro-elements needs to be appended around the boundary.
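The kind of local analysis referred to above can be sketched numerically for a single four-element macro-element. The helper `macro_stab_matrix` below is a hypothetical name; it simply transcribes the pattern of (3.13-236), assuming the four elements are numbered cyclically around the central node so that element 1 shares internal edges with elements 2 and 4.

```python
import numpy as np

def macro_stab_matrix(l12, l23, l34, l41, h_M):
    """Local stabilization matrix following the pattern of (3.13-236).

    l_ij is the length of the internal edge shared by elements i and j;
    h_M is the reference length of the macro-element."""
    return h_M * np.array([
        [l41 + l12, -l12,        0.0,       -l41],
        [-l12,       l12 + l23, -l23,        0.0],
        [0.0,       -l23,        l23 + l34, -l34],
        [-l41,       0.0,       -l34,        l34 + l41],
    ])

# Uniform 2 x 2 macro-element of squares: every internal edge has length h
h = 0.1
S_M = macro_stab_matrix(h, h, h, h, h_M=h)
eigs = np.linalg.eigvalsh(S_M) / h**2
print(np.round(eigs, 12))          # eigenvalues in units of h^2

# Distorted macro-element (edge lengths made up for illustration)
S_D = macro_stab_matrix(0.10, 0.20, 0.15, 0.12, h_M=0.14)
row_sums = S_D.sum(axis=1)
print(np.allclose(row_sums, 0.0))  # jump terms cancel when the rows are added
```

The vanishing row sums are the matrix form of the statement that the jump terms sum to zero within each macro-element, whatever the edge lengths.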
Restricting attention to the even case for simplicity, and setting $h_M = l_{ij} = h$ in (3.13-236), we see that the stabilization matrix $S$ in (3.13-211) is block diagonal with identical 4 x 4 blocks of the form:

$S_M = h^2 \begin{pmatrix} 2 & -1 & 0 & -1 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ -1 & 0 & -1 & 2 \end{pmatrix}.$    (3.13-237)

A simple calculation then shows that the eigenvalues of the stabilization matrix $S$ are simply $0, 2h^2, 2h^2, 4h^2$ (each with multiplicity equal to $J^2/4$). Furthermore, since the pressure mass matrix $Q$ is diagonal in this case, with entries $Q_{ii} = h^2$, we immediately see that $\Theta^* = \Theta^2 = 4$ in (3.13-217), independently of the grid. If this is combined with the numerically computed upper bound $\Gamma^2 = 1$ (see Table 3.13-8), then a 'good' parameter value is easily deduced, namely $\beta = 1/4$. Similar considerations apply in three dimensions, for
example using the 2 x 2 x 2 brick as the $Q_1Q_0$ building block, each element has three 'internal' faces; thus the local stabilization matrix has maximal eigenvalue equal to $6h^3$, and hence $\Theta^2 = 6$ in this case.

To complete the picture, a brief discussion of the higher order versions of the $Q_1Q_0$ and $P_1P_0$ methods is appropriate here.

o $P_{k+1}P_{-k}$ and $Q_{k+1}P_{-k}$ for $k \ge 1$

The triangular/tetrahedral case is discussed first. The $P_{k+1}P_{-k}$ approximation is very special since the divergence of the velocity approximation is contained in the pressure space, for all $k$. The upshot is that the discrete velocity field is divergence-free everywhere in the flow domain:

$(q, \mathrm{div}\,\mathbf{u}_h) = 0 \quad \forall q \in M_h \quad \text{implies} \quad \mathrm{div}\,\mathbf{u}_h = 0 \ \text{in} \ \Omega.$    (3.13-238)

Whilst the need for such a strong enforcement of incompressibility is not obvious in all cases, it may be highly desirable for certain types of flows, and the method then gives an alternative to working with a streamfunction formulation.

With the incompressibility so strongly enforced, the stability of the approximation is bound to be problematic. What is surprising is that whilst the lowest order methods $P_2P_{-1}$ and $P_3P_{-2}$ are unstable (with the inf-sup constant in (3.13-205) behaving like $O(h)$ on certain types of grids), for $k \ge 3$ the methods are stable as long as a technical condition associated with (near-)singular vertices is satisfied; see Brezzi and Fortin (1991, pp. 227-228). The price to pay in this case is that whilst the inf-sup constant is independent of $h$, it is not independent of $k$. Thus it is difficult to use the family as a basis for adaptivity via $p$-refinement (see Jensen and Vogelius, 1990, for details). Stabilization of such $p$-refinement methods remains an active research area, partly because similar difficulties arise when using spectral approximations; see Canuto et al. (1988a, pp. 394-406).
Returning to the $h$-refinement setting, the $P_2P_{-1}$ and $P_3P_{-2}$ methods both need to be stabilized if they are to work on all possible grids. This can be done either by adding velocity bubble functions, leading to the Crouzeix-Raviart family of 'bubble' elements, or else by working within a globally stabilized formulation like (3.13-229). Either way, stabilization is relatively straightforward, although the alluring property (3.13-238) is lost. Stabilizing in a global framework via (3.13-229), the 'difficult to handle' pressure jump terms in (3.13-232) are not needed for $k \ge 2$ (Franca et al., 1993), nor in the case of the $P_2P_{-1}$ triangle, which does have the crucial mid-edge velocity node.

On the other hand, some recent results (Qin, 1994) reinforce the basic point made at the outset: the underlying methods can be made to work without stabilization by restricting attention to carefully selected grids. For example, using (Clough-Tocher) macro-elements with each triangle divided into three, both the $P_2P_{-1}$ and $P_3P_{-2}$ methods can be shown to be stable and thus give optimal rates of convergence. See Qin (1994) for details.

Finally, it should perhaps be stressed that the $Q_{k+1}P_{-k}$ family of methods are ($h$-)stable for $k \ge 1$; see Girault and Raviart (1986, pp. 156-157). Indeed, this class of methods is one of the more attractive starting points for both $p$ and $h$-$p$ refinement strategies. A more complete analysis is given in Stenberg and Suri (1994). As a result, the unstable $Q_{k+1}Q_{-k}$ family are clearly of limited interest.

d. Impact on iterative solvers

The recent development of high-performance computing architectures, and the ability to render three-dimensional solution information, has led to an increased emphasis on iterative
linear equation solvers. The aim of this section is to assess the impact of instability on the convergence rate of state-of-the-art iterative solvers when applied to the linearized equations that arise (at each time-level) when solving the time-dependent Navier-Stokes equations. For ease of exposition, we restrict ourselves to operator splitting methods which involve a Stokes (or $H^1$-) projection onto a discretely divergence-free subspace at every step (e.g., the second-order $\theta$-scheme; Dean and Glowinski, 1993, p. 27). If an $L^2$-projection is done instead, then the ramifications of instability are not nearly so evident; see Griffiths and Silvester (1994).

Using a stabilized formulation of the form (3.13-235) [or (3.13-232)], the Stokes projection step requires the solution of a generalized Stokes system of the form (3.13-239), where $K$, $C$, and $S$ are as defined previously, $\nu$ is the inverse of the Reynolds number, $M$ is the velocity mass matrix, and $f$ depends on the solution $u^l$ and $p^l$ at the previous time-step.

Recently, 'fast iteration' methods for solving linear equation systems have been developed. These typically incorporate multigrid or multilevel strategies, and are often optimally efficient in the sense that a solution can be generated in $O(N)$ floating point operations (where $N$ is the number of degrees of freedom), cf. direct methods for dense linear equations, which require $O(N^3)$ flops. One iteration method that is uniquely appropriate for indefinite systems like (3.13-239) is the minimal residual method (MINRES); see Silvester and Wathen (1996). Alternatively, the mathematically equivalent conjugate residual (CR) algorithm (Elman, 1994) may be used. In practice, our experience is that MINRES has better stability properties than CR; furthermore, it is relatively cheap to implement, requiring only one matrix-vector product, two dot products and six 'AXPY' operations per iteration.
The key point here is that if the basic iteration is preconditioned by velocity and pressure operators which are spectrally equivalent* to the primal operator $K$ and the dual operator $C^T K^{-1} C + \beta S$, respectively, then a mesh-independent convergence rate is assured (see Silvester and Wathen, 1996). Furthermore, using such a preconditioner the 'energy' of the error is spectrally equivalent to the preconditioned residual, and hence it is strictly minimized at every step. (To get a convergence rate which is independent of $\nu$ and $\Delta t$ requires a more sophisticated preconditioning approach; see Bramble and Pasciak, 1995, for details.)

Note that in the limiting case of an arbitrarily large timestep the system (3.13-239) reduces to a standard Stokes problem. In any case, (3.13-205) is a necessary and sufficient condition for the pressure mass matrix $Q$ to be spectrally equivalent to the dual operator. In fact, the inf-sup constant $\gamma$ is the lower bound on the equivalence relation (in the stabilized case, the condition (3.13-213) plays the same role). The crucial point is this: if a 'fast' method is applied to a system corresponding to an unstabilized $Q_1Q_0$ discretization, then the ratio of the equivalence constants corresponding to the dual operator may blow up as $h \to 0$. In this case, what will happen in practice is that the rate of convergence of the iteration will deteriorate under mesh refinement.

* Two symmetric matrix operators $S$ and $T$ are said to be spectrally equivalent if there exist constants $a$ and $b$ independent of $h$ such that $a \le x^T S x / x^T T x \le b$ for all vectors $x$.
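The footnote's definition can be explored numerically: for symmetric $S$ and symmetric positive definite $T$, the best constants $a$ and $b$ are the extreme eigenvalues of the generalized problem $Sx = \lambda Tx$. The sketch below is only an illustration on a model problem chosen here (a 1D finite difference Laplacian, not the Stokes operators of the text), and the function names are hypothetical. It contrasts a pair whose ratio $b/a$ stays bounded under refinement with a pair whose ratio grows.

```python
import numpy as np

def equivalence_constants(S, T):
    """Extreme values of x^T S x / x^T T x for symmetric S and SPD T."""
    L = np.linalg.cholesky(T)                  # T = L L^T
    Linv = np.linalg.inv(L)
    lam = np.linalg.eigvalsh(Linv @ S @ Linv.T)
    return lam[0], lam[-1]

def laplacian_1d(n):
    """Finite difference Laplacian on n interior nodes of the unit interval."""
    h = 1.0 / (n + 1)
    return (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

ratios_shifted, ratios_identity = [], []
for n in (8, 16, 32):
    A = laplacian_1d(n)
    a, b = equivalence_constants(A + np.eye(n), A)   # shifted operator vs A
    ratios_shifted.append(b / a)
    a, b = equivalence_constants(A, np.eye(n))       # A vs the identity
    ratios_identity.append(b / a)

print(ratios_shifted)    # stays close to 1 as the grid is refined
print(ratios_identity)   # grows under refinement: not spectrally equivalent
```

In this toy setting the second ratio is just the condition number of the Laplacian, which is the kind of growth a mesh-independent preconditioner is meant to remove.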
A numerical experiment will hopefully reinforce this point. We solve an enclosed Stokes flow problem, with Dirichlet velocity conditions on all boundaries. Note that, ignoring the effect of the starting guess, the convergence of iterative solvers applied to symmetric systems of equations is completely determined by the systems' eigenvalues, and hence is essentially independent of the actual boundary values imposed. We discretize using a sequence of uniform grids of square $Q_1Q_0$ elements, each grid obtained by uniform refinement of the previous one. The velocity components are preconditioned using a multigrid solver applied separately to each of the discrete Laplacian blocks. In particular, we perform the simplest multigrid smoothing strategy available: one V-cycle of optimal point-Jacobi relaxation with just one smoothing step before and after transferring to the next grid. Bilinear prolongation is used, and restriction is via 'full weighting' to ensure that the preconditioner is symmetric. The discrete pressure is preconditioned by the pressure mass matrix, that is, by a simple diagonal scaling. The iteration is stopped when the preconditioned residual has reduced by a factor of $10^{-6}$, and the iteration counts are recorded in Table 3.13-7.

Stabilization is enforced via (3.13-235) using the 2 x 2 macro-element construction analyzed above. Note that in all cases the computed velocity solutions were identical in the 'eyeball norm'. Iteration and operation counts (in megaflops) for three particular values of $\beta$ are shown: the unstabilized case $\beta = 0$, the value $\beta = 1/4$ discussed above, and the value $\beta = 0.058$ (see Silvester, 1994), which minimizes the condition number of the dual operator that determines the speed of convergence. This 'perfect' value is hard to estimate in general, although it is easily determined (by numerical experiment) in the case of uniform grids; see Vincent (1995).
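The remark that full-weighting restriction keeps the multigrid preconditioner symmetric can be illustrated on a much smaller model. The sketch below is an assumption-laden stand-in for the experiment described above: it uses a 1D Laplacian, a two-grid cycle with an exact coarse solve instead of a full V-cycle, damped Jacobi with weight 2/3, and linear interpolation in place of bilinear prolongation. All names and parameter values are illustrative.

```python
import numpy as np

def laplacian_1d(n):
    h = 1.0 / (n + 1)
    return (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def prolongation(nc):
    """Linear interpolation from nc coarse nodes to 2*nc + 1 fine nodes."""
    P = np.zeros((2 * nc + 1, nc))
    for j in range(nc):
        P[2 * j, j] = 0.5
        P[2 * j + 1, j] = 1.0
        P[2 * j + 2, j] = 0.5
    return P

def two_grid_apply(A, P, r, omega=2.0 / 3.0):
    """Approximate A^{-1} r: one Jacobi sweep, coarse correction, one sweep."""
    D = np.diag(A)
    R = 0.5 * P.T                        # full weighting: scaled transpose of P
    x = omega * r / D                    # pre-smoothing from a zero guess
    A_c = R @ A @ P                      # Galerkin coarse-grid operator
    x = x + P @ np.linalg.solve(A_c, R @ (r - A @ x))
    x = x + omega * (r - A @ x) / D      # post-smoothing
    return x

nc = 7
P = prolongation(nc)
n = P.shape[0]
A = laplacian_1d(n)

# Build the preconditioner as an explicit matrix, one column per unit vector
B = np.column_stack([two_grid_apply(A, P, e) for e in np.eye(n)])
asymmetry = np.max(np.abs(B - B.T))

# Eigenvalues of the preconditioned operator B A, via a symmetric similarity
Lc = np.linalg.cholesky(A)
eigs = np.linalg.eigvalsh(Lc.T @ B @ Lc)
print(asymmetry, eigs.min(), eigs.max())
```

Because the restriction is a multiple of the transposed prolongation and the same symmetric smoother is used before and after the coarse correction, the matrix `B` comes out symmetric to rounding error, which is what a MINRES preconditioner requires.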
What is observed is that iteration counts are only independent of the grid in the stabilized cases: using the raw $Q_1Q_0$ method the iteration counts significantly increase with decreasing $h$. Note that exactly the same picture would be observed if we had used the global formulation (3.13-232) instead of (3.13-235) above. The only difference is that the computed solutions are much more sensitive to the choice of $\beta$ in this case; see Silvester and Kechkar (1990) for details.

Table 3.13-7 MINRES iteration counts (operations).

Grid       $\beta = 0$      $\beta = 0.058$    $\beta = 0.25$
8x8        46 (1.88)        32 (1.35)          35 (1.48)
16x16      76 (13.05)       34 (6.04)          42 (7.44)
32x32      128 (91.06)      34 (25.03)         41 (30.13)
64x64      213 (619.3)      34 (102.4)         43 (129.2)

The poor performance in the case of unstabilized $Q_1Q_0$ is easily explained theoretically. Indeed, an asymptotic (worst-case) analysis of the instability of $Q_1Q_0$ suggests that the number of iterations will double for every uniform refinement, independently of the fast iteration actually used. To illustrate the difference in behavior, the extremal eigenvalues of the preconditioned dual operator $(C^T K^{-1} C + \beta S)/h^2$ are given in Table 3.13-8 (cf. Section 3.13.5k). In the unstabilized case there are two zero eigenvalues corresponding to the hydrostatic and the 'pure' chequerboard mode. The minimal eigenvalue clearly decreases with $h$; in fact it is known that $\sigma_{\min} \to 3h^2\pi^2/8$ as $h \to 0$; see Griffiths and Silvester (1994). In addition, the maximal eigenvalue clearly tends to unity, hence the condition number
of the preconditioned dual operator becomes unbounded as $h \to 0$. In contrast, in the stabilized case, there is one zero eigenvalue, the minimal eigenvalue is bounded away from zero, and the condition number is uniformly bounded by a small constant (approximately 5).

Table 3.13-8 Extremal eigenvalues of the preconditioned dual operator.

           $\beta = 0$                         $\beta = 0.058$
Grid       $\sigma_{\min}$   $\sigma_{\max}$   $\sigma_{\min}$   $\sigma_{\max}$
8x8        4.661E-2          0.9764            0.2320            1.1009
16x16      1.318E-2          0.9941            0.2312            1.1121
32x32      3.465E-3          0.9984            0.2200            1.1147

Looking at Table 3.13-8 it is obvious that the instability of $Q_1Q_0$ is fundamental; any iterative solver that relies on a well-conditioned dual operator, e.g., any Uzawa method, is bound to converge arbitrarily slowly in the limit $h \to 0$. In three dimensions, the situation is even worse since the analysis in Griffiths and Silvester (1994) indicates that, using a uniform grid of $Q_1Q_0$ bricks, $\sigma_{\min}$ behaves like $O(h^4)$ as $h \to 0$. Thus if a 'fast' iteration is employed and the method is not stabilized, the number of iterations required to satisfy a fixed tolerance will ultimately increase by a factor of four with each uniform grid refinement.

A 3D comparison, kindly supplied by S. T. Chan, for potential flow ($L^2$-projection) over a hemisphere, is shown in Figure 3.13-19A, using an incomplete Cholesky conjugate gradient iterative solver and $\beta = 0.10$. The payoff is quite considerable, although not as dramatic as for the 2D Stokes example ($H^1$-projection) shown in Table 3.13-7.

e. Recommendations

• It is important to be aware that stabilization does not come for free. The general price that must be paid is that the regularization parameter has to be appropriately chosen. Ignoring the lowest order conforming methods (i.e. with $Q_1$ or $P_1$ velocity), efficient mixed approximation methods exist, like $Q_{k+1}P_{-k}$, which are intrinsically stable. Hence, stabilized higher-order methods seem to have limited attraction.
• The stabilized formulation (3.13-206) is a clean way of implementing $P_1P_1$ approximation on triangles and tetrahedra, which is intrinsically superior to the alternative mini-element discretization. Equation (3.13-206) is also a good starting point for equal-order $Q_1Q_1$ approximation on grids of rectangles and bricks, but is less attractive in the case of distorted grids (because the estimation of a good parameter choice is more difficult). Care must be taken, however: it is easy to over-stabilize, in which case the quality of the divergence-free approximation is compromised, adversely affecting accuracy.
• The local stabilization approach provides a convenient and efficient way of getting $Q_1Q_0$ and $P_1P_0$ methods to work with minimal restrictions on the grid used. In such cases a priori estimates of an 'optimal parameter' are easily computed. However, the need for a macro-element data structure is an unavoidable consequence. In three dimensions, the necessity for iterative solvers makes the case for stabilizing these
methods compelling, especially if Uzawa-type iteration methods based on a Stokes projection step are used.

[Fig. 3.13-19A A 3D example; solid curves show stabilized results, dashed curves unstabilized. Horizontal axis: ICCG iterations.]

3.13.4 The Discrete Pressure Poisson Equations (PPE's)

We have previously discussed the PPE in the continuum (see Sections 3.5.1, 3.8.2, 3.10.3-3.10.5, 3.12.3, and 3.12.5); here we address the discrete analog, in several versions, and point out relative advantages and disadvantages vis-a-vis $(u, P)$. Later (Sections 3.16.1 and 3.16.5), we will show how some discrete PPE formulations can be used to write code and generate numbers.

a. The consistent PPE

Just as the continuous equations (PDE's) of mass and momentum conservation (plus BC's) imply (induce) the existence of an associated pressure Poisson equation (PPE), complete with appropriate BC's, so too do the semi-discrete forms of these equations, most easily derived from the DAE's presented in their most compact form in (3.13-28) and (3.13-29). The PPE derivation (but not all of its consequences) is simple:

1. Since (3.13-29) applies for all $t \ge 0$, its time derivative exists,

$C^T \dot{u} = \dot{g},$    (3.13-240)
which simply states that the acceleration is also divergence-free; we remark that (3.13-240) is only valid when the domain shape is time-independent.

2. Since $M$ is non-singular, (3.13-28) can be uniquely solved for the acceleration,

$\dot{u} = M^{-1}[f - Ku - N(u)u - CP];$    (3.13-241)

i.e., if $u$ and $P$ are known, the acceleration is computable.

3. To ensure that the velocity remains divergence-free, the acceleration is inserted into (3.13-240) and the result rearranged to give

$(C^T M^{-1} C)P = C^T M^{-1}[f - Ku - N(u)u] - \dot{g},$    (3.13-242)

the analog of (3.5-3), which is the consistent (discrete) PPE and, given $u$ (such that $C^T u = g$), generates the unique pressure that assures that $u$ remains divergence-free. (The PPE formulation generally assures only a divergence-free acceleration.)

So why do we care about the discrete PPE and what do we mean by 'consistent'? Simply put, the answers are these: we care about the PPE because it can replace the mass conservation equation and be used, with the momentum equation, to sometimes solve more efficiently the time-dependent, semi-discrete NS equations. It is consistent in that it can generate exactly the same solution, $u(t)$ and $P(t)$, as that obtained by time integration of the 'primitive' equations (3.13-28) and (3.13-29). It also has the appropriate BC's for the pressure 'built-in', consistently. These statements will be seen to be especially relevant when we discuss alternatives, i.e., inconsistent discrete PPE methods, in the next section.

Remarks

(1) The matrix $C^T M^{-1} C$ usually 'corresponds to' $(-\nabla^2)$ and is thus (up to pressure modes) SPD; and the RHS, appropriately, corresponds to $(-\mathrm{div})$ on the 'data.' But see Remarks (4) and (5) below.

(2) Whereas $C^T M^{-1} K u$ approximates $\nabla \cdot (\nu \nabla^2 u)$, which is zero in the continuum, it is not zero in the discrete world.
Whereas it will be 'small' (and appropriately unimportant) away from solid boundaries, it is 'large', and very important, at these boundaries (unless $Re \gg 1$); it is the term that puts the viscous portion of the Neumann BC, $\nu\, n \cdot \nabla^2 u$, into the PPE [see (3.8-36)]. In fact, for Stokes flow with BC's that are independent of time, it is the only term that 'establishes' the pressure field, per Remark (8) below (3.8-37).

(3) The fact that $M^{-1}$ is a dense matrix will cause 'problems,' which we discuss later. The 'solution,' mass lumping, also 'weakens' our claim of 'consistency' for the PPE in that the lumped mass (LM) approach is no longer 'honest,' no longer GFEM, and not always possible.

(4) Examining the consistent PPE at any portion of $\Gamma$ on which $n \cdot u$ is specified (which we do later for $Q_1Q_0$) and letting the mesh size tend to zero recovers the appropriate Neumann BC for the PPE, e.g., (3.8-36). It is important to point out that here $C^T M^{-1} C$ corresponds not to $-\nabla^2$, but to the normal derivative. (See Section 3.13.5f for a demonstration of this fact.) The tangential momentum equation (3.8-37) is not obviously satisfied on $\Gamma$, even for $t > 0$; its satisfaction is and must be 'implicit': a spatially converged solution properly mimics the continuum.
(5) Examining this consistent PPE at any portion of $\Gamma$ on which an NBC is applied in the normal direction (which we do later for $Q_1Q_0$) and letting the mesh size tend to zero recovers the appropriate (Dirichlet) boundary condition for the PPE, equation (3.8-38). Again, $C^T M^{-1} C$ then does not correspond to $(-\nabla^2)$; it corresponds to a 'constant' operator, a scalar. Again, see Section 3.13.5f for a demonstration.

(6) It is noteworthy that solving the pair (3.13-241) and (3.13-242) for $u$ and $P$ does not imply $C^T u = g$; it only implies (3.13-240). More on this issue in Section 3.16 on time integration.

(7) If the problem is one in which the rotated equations (at a boundary node) are required in order to supply proper normal and/or tangential BC's, a la (3.13-42) through (3.13-44), then it is an easy matter to show that the appropriate CPPE is

$(C^T M^{-1} C)P = C^T M^{-1}[f - (K + N(R^T u_R)) R^T u_R] - \dot{g};$    (3.13-243)

the (scalar) equation for $P$ is affected only by the change of (vector) variable from $u$ to $u_R$.

Since some of our mathematician friends tend to 'cringe' at the thought of even mentioning a PPE when/since the pressure generally 'lives' in $L^2$ rather than $H^1$ (especially true for those elements employing discontinuous pressure), let alone considering using one to obtain an (alleged) solution, a few words of 'justification' (vindication?) may be in order (with thanks here to both R. Rannacher and D. Griffiths, personal communications). Suppose $P \in L^2$ (e.g., $Q_1Q_0$), yet we propose to both discuss and utilize (3.13-242); what then does the vector $(C^T M^{-1} C)P$ really mean? It means this: if the pressure from the NSE's is 'sufficiently smooth' (typically 'valid' in much of $\Omega$ and away from singularities), then this vector really does approximate $-\nabla^2 P$, per Remark (1) above.
If, however, $P(x)$ is not sufficiently smooth (say near a 'corner'), then the above vector cannot correspond to $-\nabla^2 P$; rather, it should then be interpreted precisely for what it is: an algebraic 'rearrangement' of the momentum and mass conservation equations, with $P \in L^2$. After all, the numbers coming out the other end are the same whether the $(u$-$P)$ formulation or the CPPE formulation is used, at least when both are done properly. Thus, our 'final' and, in our opinion, justifiable position on the matter, which applies equally well to FDM's and FVM's, is this: the CPPE approach gives the same results as the primitive equation approach and, since $P \in L^2$ in general, we must simply admit that $C^T M^{-1} C$ is generally not a Laplacian. (Note that this same reasoning might suggest that the use of continuous pressure elements, such as $Q_2Q_1$, is generally ill-advised; probably not an unreasonable suggestion.)

b. Some inconsistent (approximate) PPE's

Of the many ways to generate an inconsistent PPE, we will present just two, each displaying its own form of 'inconsistency' and each precluding the use of the two lowest-order elements, $P_1P_0$ and $Q_1Q_0$. For the first, we simply discretize the continuum PPE given in (3.12-59), presuming that we can somehow generate a decent approximation to the viscous term, to obtain [omitting the $(\nabla u)^T$ term for simplicity ($\gamma = 0$)]

$LP = [K - N(u)]u + \tilde{f} - \tilde{g},$    (3.13-244)
where $L_{ij} = \int \nabla\psi_i \cdot \nabla\psi_j$, etc. Note that $L$, like $C^T M^{-1} C$, approximates $(-\nabla^2)$ for sufficiently smooth pressure, at least up to BC's. The attempts to make such a method work are so numerous and so varied, and often rather vague, that we cannot bring ourselves to discuss any of them, even though most seem to 'succeed.' What we will do is refer the reader to much of the relevant literature by requesting that s/he consult a recent paper by Gresho et al. (1995, references #3 through #19 in particular); less important in this paper are two attempts to make the $Q_1Q_1$ element work (which we too seemingly attained, until the algorithms were placed under close scrutiny) via one form or another of ad hoc stabilization tricks. These remarks are not intended to denigrate the above references; rather, they are meant to imply that our understanding of them is less than perfect. Some of them in fact probably do not belong on our list because they are addressing stabilization of $Q_1Q_1$ (et al.), a la Section 3.13.3, rather than inconsistent PPE's. We simply tend to group them together, with apologies for any errors.

But there was one result from the above paper that is worth mentioning here: the simple replacement of $C^T M^{-1} C$ in (3.13-242) by $L$ (ostensibly legitimate/interesting because each represents $(-\nabla^2)$, at least for Dirichlet BC's on velocity, which we shall presume here) is ill-advised. Why? Simply because

$LP = C^T M^{-1}[f - Ku - N(u)u] - \dot{g},$

combined with

$C^T M^{-1}[f - Ku - N(u)u] = C^T \dot{u} + C^T M^{-1} C P$

from (3.13-241), implies that

$LP + \dot{g} = C^T M^{-1} C P + C^T \dot{u},$

which integrates to

$C^T u(t) - g(t) = C^T u_0 - g_0 + (L - C^T M^{-1} C) \int_0^t P(\tau)\, d\tau,$

showing a clear loss of control of 'div.' See the above paper for an example of a steady-state result of a well-posed problem that satisfies $LP = C^T M^{-1} C P$ with a very large divergence error and a totally non-physical solution.
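The algebra behind the consistent PPE (3.13-241) and (3.13-242), and the drift that appears when $C^T M^{-1} C$ is swapped for a different matrix, can be checked with a small sketch. Everything below is synthetic: the matrices are random stand-ins with the right symmetry and rank properties, the vector `r` stands for $f - Ku - N(u)u$ at a frozen instant, and the perturbed matrix `L` is simply some other SPD matrix, not a discretized Laplacian.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3                               # numbers of velocity / pressure unknowns (illustrative)

X = rng.standard_normal((n, n))
M = X @ X.T + n * np.eye(n)               # SPD stand-in for the mass matrix
C = rng.standard_normal((n, m))           # full-rank stand-in for the gradient matrix
r = rng.standard_normal(n)                # stands for f - K u - N(u) u
gdot = rng.standard_normal(m)             # time derivative of g

Minv = np.linalg.inv(M)
A = C.T @ Minv @ C                        # consistent PPE matrix

def pressure_and_acceleration(ppe_matrix):
    """Solve a PPE with the given matrix, then recover the acceleration."""
    P = np.linalg.solve(ppe_matrix, C.T @ Minv @ r - gdot)
    udot = Minv @ (r - C @ P)             # pattern of (3.13-241)
    return P, udot

# Consistent PPE: the acceleration satisfies C^T udot = gdot
P_c, udot_c = pressure_and_acceleration(A)
drift_c = C.T @ udot_c - gdot

# Replacing C^T M^{-1} C by another SPD matrix leaves a residual
Y = rng.standard_normal((m, m))
L = A + 0.1 * (Y @ Y.T)
P_L, udot_L = pressure_and_acceleration(L)
drift_L = C.T @ udot_L - gdot

print(np.linalg.norm(drift_c))                    # rounding-level
print(np.allclose(drift_L, (L - A) @ P_L))        # instantaneous form of the drift
```

The second check is the differentiated version of the integrated identity above: at each instant the rate of change of $C^T u - g$ equals $(L - C^T M^{-1} C)P$.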
The second way is as discussed in Section 3.13.2b: do not integrate $\nabla P$ by parts when generating the weak form. This makes the primitive equations look like

$$M\dot u + [K + N(u)]u + GP = f^{(G)}, \tag{3.13-245}$$

$$C^T u = g, \tag{3.13-246}$$

where only $G$ and $f^{(G)}$ are different. They are ($x$-components)

$$G_{x_{ij}} = \int \phi_i\,\partial\psi_j/\partial x; \tag{3.13-247}$$

$f^{(G)}$ is given by (3.13-26) except that now $F_x$ (on $\Gamma_N$) does not contain the pressure contribution to the normal force; drop $-Pn_\alpha$ from (3.12-28). The NBC associated with this formulation is $\nu\,\partial u/\partial x = F_x$ and, from the discussion following (3.12-69), is ill-posed in the continuum. Nevertheless, we shall form the associated PPE in the usual way [insert $\dot u$ into the time derivative of (3.13-246)], to get

$$(C^T M^{-1} G)P = C^T M^{-1}[f^{(G)} - Ku - N(u)u] - \dot g, \tag{3.13-248}$$

which, while not inconsistent with (3.13-245) and (3.13-246) [i.e., (3.13-245) and (3.13-248) can give the same solution as (3.13-245) and (3.13-246)], has a different sort of inconsistency; namely, if one were to examine this discrete PPE on any portion of $\Gamma$ where the NBC in the normal direction is being used, and let the mesh size tend to zero, one would not recover the proper Dirichlet BC for the PPE given by (3.8-38). What one
would recover is a repeat of the normal momentum equation NBC, $\nu\,\partial u_n/\partial n = F_n$! This is another (and much longer) way to discover the ill-posedness of the associated continuum equations: the PPE has no BC, as mentioned earlier, in Section 3.12.5. For the 'finite $h$' case, and assuming a 'stable' element (no spurious pressure modes), say $Q_2Q_1$, there is another aspect of (3.13-248) that should arouse suspicion, and it is this: there is always a hydrostatic mode ($P_h$ is always a null vector of $G$ because $G$ is always a gradient), yet there are no redundant continuity equations. This is a spurious hydrostatic mode, another manifestation of the 'inconsistency.'

3.13.5 Additional Detailed Discussion of the Slightly UNSTABLE but Highly USABLE $Q_1Q_0$ Element

a. Introduction

While it may be true that the successful use of $Q_1Q_0$ over a wide range of problems (especially in 3D) requires a bit more expertise (and perhaps a bit of faith!) on the part of the user than does an LBB-stable element, one of our goals is to turn the serious reader of this book into one of those 'expert users' who can apply this often-excellent (and cost-effective) element confidently and successfully. After respectfully recalling the following remarks, (i) 'Whenever ... the power of the computer is needed not only to solve a system of equations, but also to formulate and assemble the discrete approximation in the first place, the finite element method has something to offer' (Strang and Fix, 1973); and (ii) 'I should like to say briefly why I am a finite element man, and why I have adopted numerical integration almost exclusively. Firstly, I am lazy, secondly, I do not enjoy mathematics, and thirdly, I make mistakes' (Irons, 1970); we put forth the somewhat counter viewpoint, more in concert with Aris (1978), who said, 'Though a model may have been formulated with perfect propriety and perspicacity, it is almost always a mistake to jump in with an extensive set of computations.
It is better to live with it for a bit, to view it from different angles, to shape and mold it more justly. If the analogy may be permitted, there is a need for mathematical foreplay if the model is to be fully responsive and the ultimate knowledge is to be satisfactory', that you should also study your discrete equations to at least some extent, to help to understand both your own algorithms and the PDE's and BC's that they allegedly represent. To this end, then, we present the following 'details' for the simplest quadrilateral element, the $Q_1Q_0$, the last of which is a new convergence proof.

b. General problem statement

It is an interesting and useful, although tedious, exercise to actually assemble and study the full NS equations in semi-discrete form, especially for 'higher-order' elements. The result is worthwhile, however, in several respects, not the least of which is to learn how the GFEM actually generates boundary equations when non-'essential' BC's are employed, as we have done in the previous chapter. It will also prove fruitful to see first-hand how the GFEM automatically generates proper BC's for the pressure Poisson equation.

The analysis will be limited, of course. We will use the simplest 2D quadrilateral element, $Q_1Q_0$ (bilinear approximations for velocity and temperature, and piecewise-constant approximation for pressure), and study only a mesh of uniform rectangles, as shown in Figure 3.13-20, in which the right side is a boundary. While we shall initially
develop the equations via the 'honest' GFEM, we will be forced to compromise this later in the interest of being able to directly generate nodal (or elemental) equations, a convenience that is lost when the consistent mass matrix is retained; i.e., then we do accede to Strang and Fix, Irons, et al., and let the computer do most of the work: the discrete equations that we present will use the lumped mass approximation. (We also leave non-uniform grids to the computer, or to the reader!)

Fig. 3.13-20 A patch of $Q_1Q_0$ elements.

The continuum equations that we choose to study are the 2D Boussinesq equations,

$$\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u = -\frac{\partial P}{\partial x} + \nu\nabla^2 u + \alpha T, \tag{3.13-249}$$

$$\frac{\partial v}{\partial t} + \mathbf{u}\cdot\nabla v = -\frac{\partial P}{\partial y} + \nu\nabla^2 v + \beta T, \tag{3.13-250}$$

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0, \tag{3.13-251}$$

and

$$\nabla^2 P = -\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) + \alpha\frac{\partial T}{\partial x} + \beta\frac{\partial T}{\partial y}, \tag{3.13-252}$$

where $\alpha$, $\beta$ are the $x$- and $y$-components of the gravitational buoyancy force ($\alpha = \gamma g_x$, $\beta = \gamma g_y$, where $\gamma$ is the volumetric expansion coefficient). The energy equation (AD equation for $T$) should also be appended to close the system, but it is not needed for our limited purposes here. That we even include the buoyancy term relates to some 'difficulties' that it can cause at outflow boundaries, a subject taken up in more detail in Volume II, to which the Boussinesq equations themselves are similarly deferred. The GFEM form of these equations is

$$M\dot u + Ku + N(u)u + C_x P = f_x + \alpha MT, \tag{3.13-253}$$

$$M\dot v + Kv + N(u)v + C_y P = f_y + \beta MT, \tag{3.13-254}$$

$$C_x^T u + C_y^T v = g, \tag{3.13-255}$$
and

$$(C^T M^{-1} C)P = (C_x^T M^{-1} C_x + C_y^T M^{-1} C_y)P = C_x^T M^{-1}[f_x + \alpha MT - Ku - N(u)u] + C_y^T M^{-1}[f_y + \beta MT - Kv - N(u)v] - \dot g, \tag{3.13-256}$$

where we have omitted the subscripts $x$ and $y$ on $M$, $K$, and $N$ for simplicity; this slight laxity of notation should cause no confusion and will cause no harm for our current purpose. Finally, $f_x$ and $f_y$ come from applied BC's. The plan is as follows:

1. We shall explicitly construct the nodal equations for momentum conservation at a typical interior node, node 6.
2. We shall then construct the explicit mass-conservation equation for a typical interior element, element 5 (the only available fully interior element for this patch of nine elements).
3. The discrete pressure Poisson equation (PPE) will then be formed at element 5, a procedure that requires a departure from true GFEM by approximating $M$ by a lumped mass (i.e., diagonal) matrix. After studying crudely the asymptotic behavior of these typical interior equations via Taylor series analysis ($l, h \to 0$), we will move on to examine some boundary equations and related BC's.
4. The momentum equations associated with natural boundary conditions (NBC's) will then be constructed at a typical boundary node, node 2.
5. Then the same equations and BC's will be applied to a less typical boundary node, node 1 (construed to form the lower right corner of the domain).
6. The effect of both Dirichlet and Neumann velocity BC's will then be examined for the PPE, by studying its discrete form on element 2, obtained via the continuity equation on element 2.
7. Finally, several PPE BC equations (16, actually!) will be presented for a corner element when various velocity BC's are employed; e.g., on element 1, wherein the bottom line in Figure 3.13-20 ($x$-axis) is construed to be the lower boundary of the domain, or on element 3, wherein the top horizontal line is construed to be a boundary.

It is worth repeating that the only BC's applied to the NS equations are associated with the velocity.
Thus, it is of some interest to see what BC's are selected by the GFEM for the PPE, and how, especially since (we assert) these are the only proper BC's (for this weak form of) the continuum PPE; i.e., for any particular (and well-posed) weak form, pressure BC's are always induced by those on the velocity and the requirement that $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$.

In order to begin the analysis, we will need the 'element matrices' for one $l \times h$ $Q_1Q_0$ element. Except for the non-linear advection matrix, we present these below for 'convenience,' even though they are also listed in Appendix 1. (The very complicated advection matrix, for the general case, is not needed for our purposes. Later we will just carry it along 'symbolically.')
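As a cross-check on the closed forms of these element matrices, the following Python sketch evaluates the defining integrals on one $l \times h$ rectangle by $2 \times 2$ Gauss quadrature (exact for these bilinear integrands). It rests on assumptions that should be stated plainly: the local nodes are taken to run counterclockwise from the lower-left corner (inferred from the row pattern of the matrices), $\nu$ is included in the diffusion matrix, lumping is done by row sums, and the function name `q1q0_element_matrices` is hypothetical.

```python
import numpy as np


def q1q0_element_matrices(l, h, nu=1.0):
    """Element matrices for one l x h rectangle, by 2x2 Gauss quadrature.

    Assumed local node order: lower-left, lower-right, upper-right, upper-left.
    Returns (M consistent, M lumped, K, Cx, Cy).
    """
    # Reference coordinates (xi, eta) in [-1, 1]^2 of the four local nodes.
    xi_n = np.array([-1.0, 1.0, 1.0, -1.0])
    eta_n = np.array([-1.0, -1.0, 1.0, 1.0])
    gauss = np.array([-1.0, 1.0]) / np.sqrt(3.0)   # Gauss points, unit weights
    jac = l * h / 4.0                              # area Jacobian of the mapping

    M = np.zeros((4, 4))
    K = np.zeros((4, 4))
    Cx = np.zeros(4)
    Cy = np.zeros(4)
    for xi in gauss:
        for eta in gauss:
            phi = 0.25 * (1 + xi_n * xi) * (1 + eta_n * eta)
            dphi_dx = 0.25 * xi_n * (1 + eta_n * eta) * (2.0 / l)
            dphi_dy = 0.25 * eta_n * (1 + xi_n * xi) * (2.0 / h)
            M += jac * np.outer(phi, phi)
            K += nu * jac * (np.outer(dphi_dx, dphi_dx)
                             + np.outer(dphi_dy, dphi_dy))
            Cx -= jac * dphi_dx        # pressure basis function is 1 on the element
            Cy -= jac * dphi_dy
    M_lumped = np.diag(M.sum(axis=1))  # row-sum lumping
    return M, M_lumped, K, Cx, Cy


M, M_lumped, K, Cx, Cy = q1q0_element_matrices(l=0.5, h=0.2, nu=0.3)
print(np.round(36 * M / (0.5 * 0.2)))   # integer pattern of the consistent mass
print(2 * Cx / 0.2, 2 * Cy / 0.5)       # sign patterns of the gradient vectors
```

The same routine, called with other values of $l$ and $h$, shows how the two parts of the diffusion matrix scale like $\nu h/l$ and $\nu l/h$.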
1. Mass matrix: $M^e_{ij} = \int_e \phi_i\phi_j\,dx\,dy$,

$$M^e = \frac{lh}{36}\begin{bmatrix} 4 & 2 & 1 & 2\\ 2 & 4 & 2 & 1\\ 1 & 2 & 4 & 2\\ 2 & 1 & 2 & 4 \end{bmatrix} \tag{3.13-257}$$

or, if lumped, $M^e_{ij} = \delta_{ij}\int_e \phi_i$,

$$M^e = \frac{lh}{4}\begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}. \tag{3.13-258}$$

2. Diffusion matrix: $K^e_{ij} = \nu\int_e \nabla\phi_i\cdot\nabla\phi_j$,

$$K^e = \frac{\nu h}{6l}\begin{bmatrix} 2 & -2 & -1 & 1\\ -2 & 2 & 1 & -1\\ -1 & 1 & 2 & -2\\ 1 & -1 & -2 & 2 \end{bmatrix} + \frac{\nu l}{6h}\begin{bmatrix} 2 & 1 & -1 & -2\\ 1 & 2 & -2 & -1\\ -1 & -2 & 2 & 1\\ -2 & -1 & 1 & 2 \end{bmatrix}. \tag{3.13-259}$$

3. Advection matrix: (not listed; see the remark above).

4. Gradient matrices: $C^e_{x_j} = -\int_e \psi\,(\partial\phi_j/\partial x)$, $C^e_{y_j} = -\int_e \psi\,(\partial\phi_j/\partial y)$,

$$C^e_x = \frac{h}{2}\begin{bmatrix} 1\\ -1\\ -1\\ 1 \end{bmatrix} \tag{3.13-260}$$

and

$$C^e_y = \frac{l}{2}\begin{bmatrix} 1\\ 1\\ -1\\ -1 \end{bmatrix}. \tag{3.13-261}$$

5. Divergence matrices:

$$C^{eT}_x = \frac{h}{2}[1, -1, -1, 1], \qquad C^{eT}_y = \frac{l}{2}[1, 1, -1, -1]. \tag{3.13-262}$$

c. Interior momentum equation

o Momentum equations at node 6. To form the global equations at any node, we first determine, in turn, the individual contributions from each element containing that particular (global) node; i.e., the 'standard' FEM assembly process. We will carefully (step-by-step) generate the global $x$-momentum equation for node 6. After that, we will skip details (which should be easy to fill in) and move more hastily toward our final goal.

We now use the element matrices to determine the contributions to each of the six terms from each of the four elements sharing node 6, for the $x$-momentum equation (only):

1. Element 1. Global node 6 is local node 4 in element 1; thus, we use the fourth row of each element matrix to give, first,

$$M\dot u|_6 \sim \frac{lh}{36}(2\dot u_5 + \dot u_1 + 2\dot u_2 + 4\dot u_6),$$

where $\sim$ is to be read, 'the contribution to the global node in question'; it is not an equation and should not be so written (even though some authors have presented element-level contributions as element-level equations).
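If lumped mass is employed,

$$M\dot u|_6 \sim \frac{lh}{4}\dot u_6.$$

Also,

$$Ku|_6 \sim \frac{\nu h}{6l}[2(u_6 - u_2) + (u_5 - u_1)] + \frac{\nu l}{6h}[2(u_6 - u_5) + (u_2 - u_1)],$$

and

$$N(u)u|_6 \sim \frac{lh}{4}\cdot{}^{1}A^u_6,$$

where we have introduced the following shortcut notation: ${}^{1}A^u_6$ = discrete (GFEM) approximation to $\mathbf{u}\cdot\nabla u$ at node 6 from element 1. Next,

$$C_xP|_6 \sim \frac{h}{2}P_1,$$

where $C_xP|_6$ is the nodal contribution to node 6, and $P_1$ is the pressure in element 1. Finally, $f_x|_6 \sim 0$ (we are not at a boundary), and

$$\alpha MT|_6 \sim \frac{\alpha lh}{36}(2T_5 + T_1 + 2T_2 + 4T_6) \quad\text{or, if lumped,}\quad \alpha MT|_6 \sim \frac{\alpha lh}{4}T_6.$$

2. Element 2. Here, global node 6 is local node 1, and we get, from the first row of the element matrices,

$$M\dot u|_6 \sim \frac{lh}{36}(4\dot u_6 + 2\dot u_2 + \dot u_3 + 2\dot u_7) \quad\text{or, if lumped,}\quad M\dot u|_6 \sim \frac{lh}{4}\dot u_6.$$

Also,

$$Ku|_6 \sim \frac{\nu h}{6l}[2(u_6 - u_2) + (u_7 - u_3)] + \frac{\nu l}{6h}[2(u_6 - u_7) + (u_2 - u_3)],$$

$$N(u)u|_6 \sim \frac{lh}{4}\cdot{}^{2}A^u_6, \qquad C_xP|_6 \sim \frac{h}{2}P_2, \qquad f_x|_6 \sim 0,$$

$$\alpha MT|_6 \sim \frac{\alpha lh}{36}(4T_6 + 2T_2 + T_3 + 2T_7) \quad\text{or, if lumped,}\quad \alpha MT|_6 \sim \frac{\alpha lh}{4}T_6.$$

3. Element 4. Here, global node 6 is local node 3; the third row of the element matrices gives

$$M\dot u|_6 \sim \frac{lh}{36}[\dot u_9 + 2\dot u_5 + 4\dot u_6 + 2\dot u_{10}] \quad\text{or, if lumped,}\quad M\dot u|_6 \sim \frac{lh}{4}\dot u_6.$$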
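Also,

$$Ku|_6 \sim \frac{\nu h}{6l}[(u_5 - u_9) + 2(u_6 - u_{10})] + \frac{\nu l}{6h}[2(u_6 - u_5) + (u_{10} - u_9)],$$

$$N(u)u|_6 \sim \frac{lh}{4}\cdot{}^{4}A^u_6, \qquad C_xP|_6 \sim -\frac{h}{2}P_4, \qquad f_x|_6 \sim 0,$$

$$\alpha MT|_6 \sim \frac{\alpha lh}{36}(T_9 + 2T_5 + 4T_6 + 2T_{10}) \quad\text{or, if lumped,}\quad \alpha MT|_6 \sim \frac{\alpha lh}{4}T_6.$$

4. Element 5. Finally, global node 6 here corresponds to local node 2, giving

$$M\dot u|_6 \sim \frac{lh}{36}(2\dot u_{10} + 4\dot u_6 + 2\dot u_7 + \dot u_{11}) \quad\text{or, if lumped,}\quad M\dot u|_6 \sim \frac{lh}{4}\dot u_6.$$

Also,

$$Ku|_6 \sim \frac{\nu h}{6l}[2(u_6 - u_{10}) + (u_7 - u_{11})] + \frac{\nu l}{6h}[2(u_6 - u_7) + (u_{10} - u_{11})],$$

$$N(u)u|_6 \sim \frac{lh}{4}\cdot{}^{5}A^u_6, \qquad C_xP|_6 \sim -\frac{h}{2}P_5, \qquad f_x|_6 \sim 0,$$

$$\alpha MT|_6 \sim \frac{\alpha lh}{36}(2T_{10} + 4T_6 + 2T_7 + T_{11}) \quad\text{or, if lumped,}\quad \alpha MT|_6 \sim \frac{\alpha lh}{4}T_6.$$

We are now ready to assemble the $x$-momentum equation for node 6. Collection of all elemental contributions and placement into the $x$-momentum equation (3.13-253) yields, for the simpler case of lumped mass,

$$lh\,\dot u_6 - \frac{\nu h}{6l}[(u_1 - 2u_5 + u_9) + 4(u_2 - 2u_6 + u_{10}) + (u_3 - 2u_7 + u_{11})] - \frac{\nu l}{6h}[(u_1 - 2u_2 + u_3) + 4(u_5 - 2u_6 + u_7) + (u_9 - 2u_{10} + u_{11})] + \frac{lh}{4}[{}^{1}A^u_6 + {}^{2}A^u_6 + {}^{4}A^u_6 + {}^{5}A^u_6] + \frac{h}{2}(P_1 - P_4 + P_2 - P_5) - \alpha lh\,T_6 = 0. \tag{3.13-263}$$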
If we use consistent mass, then the terms in $\dot u_6$ and $T_6$ are more complicated; e.g., $lh\,\dot u_6$ would be replaced by

$$\frac{lh}{36}[(\dot u_1 + \dot u_3 + \dot u_9 + \dot u_{11}) + 4(\dot u_2 + \dot u_5 + \dot u_7 + \dot u_{10}) + 16\dot u_6],$$

with a similar expression replacing $\alpha lh\,T_6$. But in this example we must use lumped mass in order to later invert $M$ algebraically (analytically) and thus form explicitly the PPE.

Remark: If $u_2$ and $u_3$ are specified (given), then their contributions to (3.13-263) would actually wind up on the RHS as part of $f_x$ in (3.13-253), as would their time derivatives when CM is employed.

To proceed more 'efficiently,' we now introduce some further shortcuts in terminology, using the compass point notation (except for pressure) shown in Figure 3.13-21.

1. $A^u_i = \frac{1}{4}({}^{1}A^u_i + {}^{2}A^u_i + {}^{3}A^u_i + {}^{4}A^u_i)$, and represents $\mathbf{u}\cdot\nabla u$ at node $i$.
2. $D_{xx}u_i = \frac{1}{6}[(u_{SW} - 2u_S + u_{SE}) + 4(u_W - 2u_i + u_E) + (u_{NW} - 2u_N + u_{NE})]$, with an obvious analog in the $y$-direction, $D_{yy}u_i$.
3. $D_xu_i = \frac{1}{12}[(u_{SE} - u_{SW}) + 4(u_E - u_W) + (u_{NE} - u_{NW})]$, and its obvious $y$-analog, $D_yu_i$.

We will also need boundary terms later, so we now add the following:

4. $A_E = \frac{1}{2}({}^{2}A_E + {}^{3}A_E)$, which represents $\mathbf{u}\cdot\nabla u$ at node $E$ and is needed for the case where node $E$ lies on the domain boundary.
5. $D_xu_E = \frac{1}{6}[(u_{SE} - u_S) + 4(u_E - u_i) + (u_{NE} - u_N)]$.
6. $D_{yy}u_E = \frac{1}{3}[2(u_{SE} - 2u_E + u_{NE}) + (u_S - 2u_i + u_N)]$.

Fig. 3.13-21 Another 4-patch.

The $x$-momentum equation at node 6 now reads

$$lh\,\dot u_6 - \nu\left(\frac{h}{l}D_{xx} + \frac{l}{h}D_{yy}\right)u_6 + lh\,A^u_6 + \frac{h}{2}(P_1 - P_4 + P_2 - P_5) - \alpha lh\,T_6 = 0. \tag{3.13-264}$$

Since $lh = \int\phi_6$ for this case, it is not surprising (since $\phi_i$ is the general weighting function) that we must divide by $lh$ to 'unweight' the weighted residual equation and thus see the final version of the discrete momentum equation at node 6 that looks more
like the PDE (and a finite difference approximation thereto); namely,

$$\dot u_6 - \nu\left(\frac{D_{xx}}{l^2} + \frac{D_{yy}}{h^2}\right)u_6 + A^u_6 + \frac{1}{2}\left(\frac{P_1 - P_4}{l} + \frac{P_2 - P_5}{l}\right) - \alpha T_6 = 0. \tag{3.13-265}$$

The one-to-one correspondence of each term with the corresponding term in (3.13-249) is now quite obvious. In fact, a Taylor series expansion about node 6 could be performed, an exercise we leave to the reader. The results would show second-order accurate approximations to each of the derivative terms; i.e., the $Q_1Q_0$ element generates a second-order accurate (Taylor series) scheme on a uniform mesh of rectangles. (Of course, the lazy presentation of the advection terms precludes such an analysis, but the details could be worked out.) As a final notational shortcut, we let $D_{xx}/l^2 + D_{yy}/h^2 = \nabla_h^2$, the bilinear, discrete Laplacian operator, the expanded version of which can be seen in the previous chapter; (2.3-24). Clearly a similar analysis applies to the $y$-momentum equation at node 6, the final result of which is

$$\dot v_6 - \nu\nabla_h^2 v_6 + A^v_6 + \frac{1}{2}\left(\frac{P_2 - P_1}{h} + \frac{P_5 - P_4}{h}\right) - \beta T_6 = 0. \tag{3.13-266}$$

d. Interior PPE

o Continuity equation ($\nabla\cdot\mathbf{u} = 0$) at element 5. Here we have an easier construction since, for $Q_1Q_0$, each element directly yields the discrete equation of interest. From (3.13-262), we have $C^{eT}_x\dot u + C^{eT}_y\dot v = 0$; i.e.,

$$\frac{h}{2}(\dot u_{10} - \dot u_6 + \dot u_{11} - \dot u_7) + \frac{l}{2}(\dot v_{10} + \dot v_6 - \dot v_7 - \dot v_{11}) = 0, \tag{3.13-267}$$

which is a second-order accurate approximation to $-\nabla\cdot\dot{\mathbf{u}} = 0$ at the centroid of element 5 in Figure 3.13-20; we have employed the time-derivative of (3.13-262) to obtain (3.13-267), a needed step in the derivation of the PPE.

o Pressure Poisson equation at element 5.
While the $C^T M^{-1} C$ matrix is global and must generally be so constructed, it is easy to generate one row of this matrix (and one of the set of discrete PPE's), when the mass matrix is lumped, by simply inserting the resulting accelerations from the associated momentum equations into the corresponding (element-level) continuity equation. This we shall do, after noting that we can do no such thing when the more accurate consistent mass matrix is employed; in that case the PPE (and the 'Laplacian' matrix, $C^T M^{-1} C$) is both global and has a fully populated band structure because $M^{-1}$ is dense; all nodes in the mesh are then involved in forming this elliptic equation for pressure, both in the matrix and in the RHS vector. While it is, of course, true that the $C^T$ operation, even if applied to a fully populated vector, will 'localize' the result to one element at a time (corresponding to element-level divergence), it is also true that the final (more local) result cannot be obtained until the fully populated global vector is constructed first. Hence, we leave consistent mass to the computer (where it probably belongs anyway, although we hasten to state that the actual implementation of
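CM and the PPE method are generally not recommended) and focus on the lumped mass approximation in the remainder of this discussion.

Since the continuity equation for element 5 requires the acceleration at several nodes in addition to that at node 6 presented thus far, we first write (by inspection, hopefully) the remaining momentum equations:

$$\dot u_7 - \nu\nabla_h^2 u_7 + A^u_7 + \frac{1}{2}\left(\frac{P_3 - P_6 + P_2 - P_5}{l}\right) - \alpha T_7 = 0, \tag{3.13-268}$$

$$\dot v_7 - \nu\nabla_h^2 v_7 + A^v_7 + \frac{1}{2}\left(\frac{P_3 - P_2 + P_6 - P_5}{h}\right) - \beta T_7 = 0, \tag{3.13-269}$$

$$\dot u_{10} - \nu\nabla_h^2 u_{10} + A^u_{10} + \frac{1}{2}\left(\frac{P_4 - P_7 + P_5 - P_8}{l}\right) - \alpha T_{10} = 0, \tag{3.13-270}$$

$$\dot v_{10} - \nu\nabla_h^2 v_{10} + A^v_{10} + \frac{1}{2}\left(\frac{P_5 - P_4 + P_8 - P_7}{h}\right) - \beta T_{10} = 0, \tag{3.13-271}$$

and

$$\dot u_{11} - \nu\nabla_h^2 u_{11} + A^u_{11} + \frac{1}{2}\left(\frac{P_5 - P_8 + P_6 - P_9}{l}\right) - \alpha T_{11} = 0, \tag{3.13-272}$$

$$\dot v_{11} - \nu\nabla_h^2 v_{11} + A^v_{11} + \frac{1}{2}\left(\frac{P_6 - P_5 + P_9 - P_8}{h}\right) - \beta T_{11} = 0. \tag{3.13-273}$$

Next, we rewrite the continuity equation in a form that looks more like $\nabla\cdot\partial\mathbf{u}/\partial t = 0$ by dividing it by $(-lh)$:

$$\frac{1}{2}\left(\frac{\dot u_6 - \dot u_{10}}{l} + \frac{\dot u_7 - \dot u_{11}}{l}\right) + \frac{1}{2}\left(\frac{\dot v_7 - \dot v_6}{h} + \frac{\dot v_{11} - \dot v_{10}}{h}\right) = 0. \tag{3.13-274}$$

The eight momentum equations are now rearranged and combined (as needed) to give

$$\frac{\dot u_6 - \dot u_{10}}{l} = \nu\nabla_h^2\left(\frac{u_6 - u_{10}}{l}\right) - \left(\frac{A^u_6 - A^u_{10}}{l}\right) - \frac{1}{2l^2}[(P_1 - 2P_4 + P_7) + (P_2 - 2P_5 + P_8)] + \alpha\frac{(T_6 - T_{10})}{l}, \tag{3.13-275}$$

$$\frac{\dot u_7 - \dot u_{11}}{l} = \nu\nabla_h^2\left(\frac{u_7 - u_{11}}{l}\right) - \left(\frac{A^u_7 - A^u_{11}}{l}\right) - \frac{1}{2l^2}[(P_2 - 2P_5 + P_8) + (P_3 - 2P_6 + P_9)] + \alpha\frac{(T_7 - T_{11})}{l}, \tag{3.13-276}$$

$$\frac{\dot v_7 - \dot v_6}{h} = \nu\nabla_h^2\left(\frac{v_7 - v_6}{h}\right) - \left(\frac{A^v_7 - A^v_6}{h}\right) - \frac{1}{2h^2}[(P_1 - 2P_2 + P_3) + (P_4 - 2P_5 + P_6)] + \beta\frac{(T_7 - T_6)}{h}, \tag{3.13-277}$$

and

$$\frac{\dot v_{11} - \dot v_{10}}{h} = \nu\nabla_h^2\left(\frac{v_{11} - v_{10}}{h}\right) - \left(\frac{A^v_{11} - A^v_{10}}{h}\right) - \frac{1}{2h^2}[(P_4 - 2P_5 + P_6) + (P_7 - 2P_8 + P_9)] + \beta\frac{(T_{11} - T_{10})}{h}. \tag{3.13-278}$$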
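Finally, inserting these results into (3.13-274) and rearranging yields the CPPE for element 5:

$$\frac{1}{4l^2}[(P_1 - 2P_4 + P_7) + 2(P_2 - 2P_5 + P_8) + (P_3 - 2P_6 + P_9)] + \frac{1}{4h^2}[(P_1 - 2P_2 + P_3) + 2(P_4 - 2P_5 + P_6) + (P_7 - 2P_8 + P_9)]$$
$$= -\frac{1}{2}\left[\left(\frac{A^u_6 - A^u_{10}}{l} + \frac{A^u_7 - A^u_{11}}{l}\right) + \left(\frac{A^v_7 - A^v_6}{h} + \frac{A^v_{11} - A^v_{10}}{h}\right)\right] + \frac{\alpha}{2}\left(\frac{T_6 - T_{10}}{l} + \frac{T_7 - T_{11}}{l}\right) + \frac{\beta}{2}\left(\frac{T_7 - T_6}{h} + \frac{T_{11} - T_{10}}{h}\right)$$
$$+ \frac{\nu}{2}\nabla_h^2\left[\left(\frac{u_6 - u_{10}}{l} + \frac{u_7 - u_{11}}{l}\right) + \left(\frac{v_7 - v_6}{h} + \frac{v_{11} - v_{10}}{h}\right)\right]. \tag{3.13-279}$$

[Multiplication of the LHS by $-lh$ gives $C^T M^{-1} C P$; i.e., $(C^T M^{-1} C)P \approx lh(-\nabla^2 P)$; $lh$ is the pressure mass matrix, $Q$.] Hopefully, the term-by-term identification is obvious; for $l, h \to 0$, it gives

$$\frac{\partial^2 P}{\partial x^2} + \frac{\partial^2 P}{\partial y^2} = -\frac{\partial}{\partial x}(\mathbf{u}\cdot\nabla u) - \frac{\partial}{\partial y}(\mathbf{u}\cdot\nabla v) + \alpha\frac{\partial T}{\partial x} + \beta\frac{\partial T}{\partial y} + \nu\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) + O(l^2, h^2). \tag{3.13-280}$$

Clearly, if $\nabla\cdot\mathbf{u} = 0$, which is the $l, h \to 0$ limit of the continuity equation, then we recover (3.13-252), the desired PPE (at the centroid of element 5, to be precise).

Remarks:

(1) It may not be obvious, but we have just obtained, in (3.13-279), one row of the $C^T M^{-1} C$ matrix of (3.13-256), whose nine-element stencil, per Figure 3.13-20, we elucidate below, using $C^T M^{-1} C = C_x^T M^{-1} C_x + C_y^T M^{-1} C_y$:

$$\frac{1}{lh}C_x^T M_L^{-1} C_x = \frac{1}{4l^2}\begin{bmatrix} -1 & 2 & -1\\ -2 & 4 & -2\\ -1 & 2 & -1 \end{bmatrix} \tag{3.13-281}$$

and

$$\frac{1}{lh}C_y^T M_L^{-1} C_y = \frac{1}{4h^2}\begin{bmatrix} -1 & -2 & -1\\ 2 & 4 & 2\\ -1 & -2 & -1 \end{bmatrix}, \tag{3.13-282}$$

which clearly approximate $-\partial^2/\partial x^2$ and $-\partial^2/\partial y^2$, respectively; and their sum is $(1/lh)\,C^T M^{-1} C \approx -\nabla^2$.

(2) If $l = h$, the Laplacian is seen to simplify to

$$\frac{1}{2l^2}(P_1 + P_3 + P_7 + P_9 - 4P_5) \tag{3.13-283}$$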
rather than to the simplest finite difference stencil,

$$\frac{1}{l^2}(P_2 + P_4 + P_6 + P_8 - 4P_5); \tag{3.13-284}$$

i.e., it is a 'rotated Laplacian' that involves the 'corner' elements rather than the nearest neighbors in the $x$- and $y$-directions. Although both are second-order accurate, the GFEM stencil here displays its CB (checkerboard) pressure mode in the simplest way; i.e., the above stencil clearly annihilates a pressure that takes the value $+1$ on red elements and $-1$ on black ones.

(3) If the elements are $l \times h$ rectangles, then the following stencil describes $-Q^{-1}C^T M^{-1} C \approx \nabla^2$ for the central element of a 9-patch:

$$\frac{1}{4}\begin{bmatrix} 1/l^2 + 1/h^2 & 2(1/h^2 - 1/l^2) & 1/l^2 + 1/h^2\\ 2(1/l^2 - 1/h^2) & -4(1/l^2 + 1/h^2) & 2(1/l^2 - 1/h^2)\\ 1/l^2 + 1/h^2 & 2(1/h^2 - 1/l^2) & 1/l^2 + 1/h^2 \end{bmatrix}.$$

(4) If CM had been used, then all we need realize/recall [see (2.3-26) in Chapter 2] is that, for example, $M\dot u = M_L\dot u + lh\cdot O(l^2, h^2)$, where now $M$ refers to CM and $M_L$ to LM, to 'see' the same consistent results from consistent mass for $l, h \to 0$.

So much for interior nodes; we now move on to the more interesting case of boundary nodes and BC's.

e. Boundary momentum equations/NBC's

o Momentum equations at a typical boundary node. Since there are no momentum equations at boundary nodes when the velocities are specified, we are naturally interested here in other BC's and related equations. Thus, assume that node 2 is on a boundary for which both $f_n$ and $f_\tau$ have been specified. While we are ultimately more interested in the PPE at the boundaries (and its built-in BC's, to be discovered), the momentum equations are an important necessary step along the way. Besides, the analysis will shed some light on the often somewhat mysterious NBC's. So let us return to the element matrices and construct the $x$- and $y$-momentum equations at node 2, first on the assumption that no (explicit) BC's are applied on this surface.

$x$-momentum equation.

1. Element 1.
Global node 2 is local node 3, and we get (LM approximation), from the same element matrices,

$$M\dot u|_2 \sim \frac{lh}{4}\dot u_2,$$

$$Ku|_2 \sim \frac{\nu h}{6l}[2(u_2 - u_6) + (u_1 - u_5)] + \frac{\nu l}{6h}[2(u_2 - u_1) + (u_6 - u_5)],$$

$$N(u)u|_2 \sim \frac{lh}{4}\cdot{}^{1}A^u_2, \qquad C_xP|_2 \sim -\frac{h}{2}P_1,$$
$$f_x|_2 \sim \int\phi_2 f_n\,dy = \frac{h}{2}f_n^{(1)}, \qquad \alpha MT|_2 \sim \frac{\alpha lh}{4}T_2.$$

2. Element 2. Global node 2 is also local node 2, and we get

$$M\dot u|_2 \sim \frac{lh}{4}\dot u_2,$$

$$Ku|_2 \sim \frac{\nu h}{6l}[2(u_2 - u_6) + (u_3 - u_7)] + \frac{\nu l}{6h}[2(u_2 - u_3) + (u_6 - u_7)],$$

$$N(u)u|_2 \sim \frac{lh}{4}\cdot{}^{2}A^u_2, \qquad C_xP|_2 \sim -\frac{h}{2}P_2,$$

$$f_x|_2 \sim \int\phi_2 f_n\,dy = \frac{h}{2}f_n^{(2)}, \qquad \alpha MT|_2 \sim \frac{\alpha lh}{4}T_2.$$

The equation for $u_2$ is now 'fully summed,' and by combining the contributions to the momentum equation at node 2 from elements 1 and 2, we obtain, from (3.13-253), an NBC in the guise of an ODE for $u_2$:

$$\frac{lh}{2}\dot u_2 + \frac{\nu h}{6l}[(u_1 - u_5) + 4(u_2 - u_6) + (u_3 - u_7)] - \frac{\nu l}{6h}[2(u_1 - 2u_2 + u_3) + (u_5 - 2u_6 + u_7)] + \frac{lh}{4}({}^{1}A^u_2 + {}^{2}A^u_2) - \frac{h}{2}(P_1 + P_2) = \frac{h}{2}[f_n^{(1)} + f_n^{(2)}] + \alpha\frac{lh}{2}T_2, \tag{3.13-285}$$

where it is interesting and important to note that there are no pressure differences in the equation; clearly the approximate equation does not contain the $x$-component of $\nabla P$. But what does it contain? Recalling and returning to the shortcut notation introduced earlier, we also (logically but, as it turns out, naively) divide the equation by $\int\phi_2 = lh/2$ and, defining $f_{n2} = \frac{1}{2}[f_n^{(1)} + f_n^{(2)}]$, obtain

$$\dot u_2 + 2\nu\frac{D_x}{l^2}u_2 - \nu\frac{D_{yy}}{h^2}u_2 + \frac{1}{2}({}^{1}A^u_2 + {}^{2}A^u_2) - \frac{P_1 + P_2}{l} = \frac{2f_{n2}}{l} + \alpha T_2, \tag{3.13-286}$$

the interpretation of which as a 'momentum equation' is not at all obvious; it is even suspicious-looking: it sure doesn't look like a momentum equation! And this is true. To see what it does represent, and what the GFEM is telling us, we multiply by $l/2$ and regroup:

$$\frac{l}{2}\left(\dot u_2 - \nu\frac{D_{yy}}{h^2}u_2 + A^u_2 - \alpha T_2\right) + \left(\nu\frac{D_x}{l}u_2 - \frac{P_1 + P_2}{2}\right) = f_{n2}. \tag{3.13-287}$$
The terms within the first parentheses, for $l, h \to 0$, correspond (at node 2) to

$$\left(\frac{\partial u}{\partial t} - \nu\frac{\partial^2 u}{\partial y^2} + \mathbf{u}\cdot\nabla u - \alpha T\right) + O(l, h),$$

and those in the second parentheses to

$$\left(\nu\frac{\partial u}{\partial x} - \frac{l}{2}\nu\frac{\partial^2 u}{\partial x^2}\right) - \left(P - \frac{l}{2}\frac{\partial P}{\partial x}\right) + O(l^2, h^2),$$

so that the equation, for small $l, h$, becomes

$$\frac{l}{2}\left(\frac{\partial u}{\partial t} - \nu\nabla^2 u + \mathbf{u}\cdot\nabla u + \frac{\partial P}{\partial x} - \alpha T\right) + \left(\nu\frac{\partial u}{\partial x} - P\right) = f_n + O(l^2, lh, h^2). \tag{3.13-288}$$

But the coefficient of $l/2$ is zero, since the $x$-momentum equation is (we assume) satisfied. Thus, the alleged momentum equation for node 2 actually converges to

$$\nu\frac{\partial u}{\partial x} - P = f_n, \tag{3.13-289}$$

the natural BC for the $x$-momentum equation at $\Gamma$, and it does so to second-order accuracy in the Taylor series sense even though simple one-sided derivatives are present.

Remarks:

(1) It was the assumption of $x$-momentum equation satisfaction that led to a second-order accurate Taylor series result. This assumption is, of course, not necessary; i.e., the final result is still $\nu(\partial u/\partial x) - P = f_n$, the (pseudo)-traction NBC.

(2) This is, in fact, the natural BC associated with the terms $\nu\nabla^2 u - \partial P/\partial x$ in the $x$-momentum equation that were integrated by parts to obtain the weak formulation.

(3) The initial GFEM equation for node 2 was first divided by $hl/2$ ($= \int\phi_2$) and then multiplied by $l/2$ to obtain the final equation. It turns out that the net result, division by $h$, is proper since the equation is (properly) dominated by the boundary terms, and $h = \int_\Gamma\phi_2$. Henceforth, we will account for such boundary behavior in a more satisfactory way, the above being presented as it was for (hopefully) tutorial purposes.
(4) If we had begun the analysis with the stress-divergence form of the NS equations, the final result would be only slightly different; i.e., the NBC is then

$$2\nu\frac{\partial u}{\partial x} - P = f_n, \tag{3.13-290}$$

which small difference promotes $f_n$ to the status of a true boundary normal force (or traction) per unit area; i.e., $2\mu\,\partial u/\partial x$ is the normal component of the viscous stress at the boundary, and $f_n$ is the normal force applied to the fluid at the boundary. Only in this way could we actually apply a force to the fluid, a result that is often more prominent and important in solid mechanics than fluid mechanics, because in the latter the $f_n$ NBC is usually (but not always; e.g., for free surface flows) a
convenience at an (artificial) outflow 'boundary.' The NBC then corresponds to the terms

$$\nu\left(\nabla^2 u + \frac{\partial^2 u}{\partial x^2}\right) - \frac{\partial P}{\partial x}$$

of the stress-divergence form.

(5) The non-dimensional form of the original ('$\nabla^2$-form') NBC can be written

$$\frac{1}{Re}\frac{\partial u_n}{\partial n} - P = f_n, \tag{3.13-291}$$

which can be (properly) interpreted to suggest that $P = -f_n$ for large $Re$ simulations; i.e., this BC approximates a prescribed pressure, and it does so too when the stress-divergence form is utilized. (In actual fact, however, numerical results often give $P \approx -f_n$ even for low $Re$ simulations.)

(6) The original discrete $\dot u_2$ equation, (3.13-285), is still the one 'seen' by the computer; it is actually an ODE that approximates the PDE's NBC.(!)

$y$-momentum equation. By repeating the analogous steps as for $\dot u_2$, the 'momentum' equation (ODE, actually) for $\dot v_2$ can be obtained:

$$\frac{lh}{2}\dot v_2 + \frac{\nu h}{6l}[(v_1 - v_5) + 4(v_2 - v_6) + (v_3 - v_7)] - \frac{\nu l}{6h}[2(v_1 - 2v_2 + v_3) + (v_5 - 2v_6 + v_7)] + \frac{lh}{4}({}^{1}A^v_2 + {}^{2}A^v_2) + \frac{l}{2}(P_2 - P_1) - \beta\frac{lh}{2}T_2 = \int_{(1)}\phi_2 f_\tau\,dy + \int_{(2)}\phi_2 f_\tau\,dy = hf_{\tau 2}. \tag{3.13-292}$$

This time, suspecting that a boundary equation will survive at the end of the day, we divide by $h = \int_\Gamma\phi_2$ and rearrange to get

$$\frac{l}{2}\left(\dot v_2 - \nu\frac{D_{yy}}{h^2}v_2 + A^v_2 + \frac{P_2 - P_1}{h} - \beta T_2\right) + \nu\frac{D_x}{l}v_2 = f_{\tau 2}. \tag{3.13-293}$$

Again, while each of the terms multiplied by $l/2$ converges to a well-defined quantity, the key observation is that this whole term vanishes when $l, h \to 0$ because it represents (after combining it with the $l/2$ term from a Taylor series expansion of $D_xv_2/l$) the $y$-momentum equation, and we are simply left with

$$\nu\frac{\partial v}{\partial x} = f_\tau + O(l^2, lh, h^2); \tag{3.13-294}$$

i.e., a (Taylor series) second-order accurate approximation to $\nu\,\partial v/\partial x = f_\tau$.

Remarks:

(1) Again, as for the normal momentum equation, this NBC does not generally represent a force balance. If the stress-divergence form of the momentum equations had been used, then the corresponding NBC would turn out to be
$$\nu\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) = f_\tau, \tag{3.13-295}$$

which is a true force (per unit area) balance: a shear stress balance on the boundary. So only here does $f_\tau$ represent a true traction.

(2) The simple original form presented, $\nu\,\partial v/\partial x = f_\tau$, corresponding to the $\nabla^2$-form of the equations, is actually more useful for the purpose of a good 'outflow' BC via setting $f_\tau$ to zero: it usually causes minimal perturbation to the flow as it leaves the domain, and this is often the name of the game (or at least part of it) in CFD.

As a final remark on the implementation of non-zero values for $f_n$ and $f_\tau$, we have found (for the $Q_1Q_0$ element) that it is better to employ piecewise-constant (average) values at the (line) centroid of each element when actually computing $\int_\Gamma\phi f_n$ and $\int_\Gamma\phi f_\tau$. The use of this seemingly lower-order-than-necessary approximation (i.e., why not use linear interpolation?) is related to two items:

1. In the normal momentum equation, it is a better match to the pressure, which is a centroid quantity and usually more important than the viscous term.
2. In the tangential momentum equation, we merely state the following experimental fact: even though $f_\tau$ is linear (in $y$, say) for steady Poiseuille flow, the GFEM can only obtain the nodally exact solution on a mesh composed of rectangular elements of varying size if the discontinuous, stepwise-changing approximation is employed.

o NBC's in general. Figure 3.13-22 helps to relate $x$- and $y$-directions to (local) normal and tangential directions, and shows how the NBC's are stated in both $n$-$\tau$ and $x$-$y$ coordinate systems, limited for convenience and simplicity to straight boundaries aligned with the $x$-$y$ coordinate system. Note that $f_n > 0$ is an applied (normal) tensile stress, and $f_n < 0$ is an applied (normal) compressive stress.

o Momentum equations at a corner node. Continuing on, it is of interest to inquire, 'Just what "boundary equations" are contained in the momentum equations at a node in the corner of a computational domain?'
So we focus now on node 1 of Figure 3.13-20 and suppose that nodes 13-9-5-1 form a bottom boundary, just as nodes 1-2-3-4 form a side boundary. The first thing we observe is the following: since Dirichlet data (specified velocity, essential BC) 'prevails,' we can only write momentum equations at node 1 if both boundaries (bottom and side) are 'free,' or 'natural,' or 'flow-through' (i.e., open), so this is the case of interest. Here we are dealing with only one element, and we can easily construct the 'global momentum equation.' Thus, from row two of the element matrices, we obtain directly the equation for global node 1, where it will suffice to consider just one of the two equations; i.e., either one will show us all there is to learn about such a corner node. So, considering the $x$-equation, we have

$$\frac{lh}{4}\dot u_1 + \frac{\nu h}{6l}[2(u_1 - u_5) + (u_2 - u_6)] - \frac{\nu l}{6h}[2(u_2 - u_1) + (u_6 - u_5)] + \frac{lh}{4}\cdot{}^{1}A^u_1 - \frac{h}{2}P_1 - \frac{\alpha lh}{4}T_1 = \int_\Gamma\phi_1 f_x = \int_{\text{Node 1}}^{\text{Node 2}}\phi_1 f_n\,dy + \int_{\text{Node 5}}^{\text{Node 1}}\phi_1 f_\tau\,dx, \tag{3.13-296}$$

where $f_n$ is the applied normal 'force' on the vertical boundary and $f_\tau$ the applied tangential 'force' on the horizontal boundary. For the reasons discussed above, we represent these
applied (pseudo-) 'forces' by their average values, so that the RHS becomes $\frac{h}{2}f_n + \frac{l}{2}f_\tau$.

Fig. 3.13-22 Natural boundary conditions and force balances; for the conventional ($\nabla^2$) formulation, omit the underlined terms and all factors of 2, to obtain pseudo-tractions.

To obtain the 'limit equation' for this case, we might divide the equation by $\int_\Gamma\phi_1 = (l + h)/2$. But the result would be awkward. So we merely multiply by two and rearrange it to

$$\frac{lh}{2}\left(\dot u_1 + A^u_1 - \alpha T_1\right) + h\left\{\frac{\nu}{3l}[2(u_1 - u_5) + (u_2 - u_6)] - P_1 - f_n\right\} - l\left\{\frac{\nu}{3h}[2(u_2 - u_1) + (u_6 - u_5)] + f_\tau\right\} = 0, \tag{3.13-297}$$
where it is noteworthy that this is the '$x$-equation' when $u_1$ is 'free,' regardless of the BC employed (essential or natural) in the $y$-direction. Clearly, the first term 'goes away' (it is higher-order) as $l, h \to 0$. So in the limit (or near-limit) for $l \sim h$, we obtain a special case of the following general result:

$$h\left[f_x - n_x\left(\nu\,\partial u/\partial x - P\right)\right] + l\left[f_x - n_y\,\nu\,\partial u/\partial y\right] = 0 \quad \text{in the } x\text{-direction.} \tag{3.13-298}$$

Similarly, from the nodal $y$-momentum equation (not presented), we obtain

$$h\left[f_y - n_x\,\nu\,\partial v/\partial x\right] + l\left[f_y - n_y\left(\nu\,\partial v/\partial y - P\right)\right] = 0 \quad \text{in the } y\text{-direction.} \tag{3.13-299}$$

(In the case under consideration, $n_x = 1$ and $n_y = -1$.) These equations can be interpreted simply and properly as follows: the GFEM 'converts' the momentum equations at a corner node, when NBC's are applied there, to surface 'force' balances on the corner element. For example, (3.13-298) says: $h \times$ ($x$-component of net normal stress) $+\ l \times$ ($x$-component of net tangential stress) $= 0$, a la $\sum F_x = 0$ from your first course in 'statics.'

Remark: A true force balance, of course, is

$$h\left[f_x - n_x\left(2\nu\,\partial u/\partial x - P\right)\right] + l\left[f_x - n_y\,\nu\left(\partial u/\partial y + \partial v/\partial x\right)\right] = 0 \tag{3.13-300}$$

and

$$h\left[f_y - n_x\,\nu\left(\partial u/\partial y + \partial v/\partial x\right)\right] + l\left[f_y - n_y\left(2\nu\,\partial v/\partial y - P\right)\right] = 0, \tag{3.13-301}$$

which is only obtained if the stress-divergence form of the NS equations is used. The result presented is the proper pseudo-force balance that is appropriate to the $\nu\nabla^2\mathbf{u}$ (conventional) form of the equations.

This concludes our elaboration of the momentum equations, for now. Now we shift back to the PPE.

f. The PPE at boundaries.

Boundary conditions for the PPE. Consider the PPE for element 2 in Figure 3.13-20, which is a typical boundary element. It is of interest to determine just what BC's are built into the PPE when any of the following velocity BC's are imposed (on the vertical boundary for this case; generalization is straightforward):
(a) Specified $u$, $v$: Dirichlet BC's (e.g., solid wall, or specified inflow).
(b) Specified $f_n$, $v$: sometimes useful for inflow or outflow.
(c) Specified $f_n$, $f_\tau$: the general NBC, often useful for outflow.
(d) Specified $u$, $f_\tau$: most commonly used (with $u = 0$ and $f_\tau = 0$) for a symmetry BC.

The common starting point, of course, is the continuity equation for element 2:

$$\frac{h}{2}\left(\dot u_6 - \dot u_2 + \dot u_7 - \dot u_3\right) + \frac{l}{2}\left(\dot v_2 - \dot v_3 + \dot v_6 - \dot v_7\right) = 0, \tag{3.13-302}$$

where the values of the component accelerations at nodes 6 and 7 have already been presented.
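As a rough illustration of the element-level continuity statement (3.13-302), the following sketch evaluates the same four-node stencil on sampled nodal values. The function names, the node coordinates (nodes 2 and 3 on the boundary at $x = L$, nodes 6 and 7 one element width $l$ inside), and the test fields are assumptions made for this example; they are not taken from the text. With that assumed layout and the signs as written in (3.13-302), the stencil returns minus the integral of the divergence over the element, and the same stencil applies to velocities or to accelerations.

```python
def element_continuity(u, v, h, l):
    """Evaluate the stencil of (3.13-302) for element 2.

    u, v : dicts mapping node labels 2, 3, 6, 7 to nodal values
           (velocities or accelerations; the stencil is linear).
    h, l : element height and width.
    """
    return (0.5 * h * (u[6] - u[2] + u[7] - u[3])
            + 0.5 * l * (v[2] - v[3] + v[6] - v[7]))


def sample(field, L, y0, h, l):
    """Sample field(x, y) at the four nodes (assumed node layout)."""
    xy = {2: (L, y0), 3: (L, y0 + h), 6: (L - l, y0), 7: (L - l, y0 + h)}
    return {node: field(x, y) for node, (x, y) in xy.items()}


h, l, L, y0 = 0.5, 0.25, 2.0, 1.0

# Divergence-free linear field: u = x, v = -y.
u_nodes = sample(lambda x, y: x, L, y0, h, l)
v_nodes = sample(lambda x, y: -y, L, y0, h, l)
residual_div_free = element_continuity(u_nodes, v_nodes, h, l)

# Field with unit divergence: u = x, v = 0.
v_zero = sample(lambda x, y: 0.0, L, y0, h, l)
residual_unit_div = element_continuity(u_nodes, v_zero, h, l)

print(residual_div_free, residual_unit_div)
```

For a bilinear velocity the stencil integrates the divergence exactly over the rectangle, so a nonzero value is a direct measure of the element's mass imbalance.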
THE FINITE ELEMENT EQUATIONS/DISCRETIZATION OF THE WEAK FORM 571 We now examine each case in turn, since the same continuity equation will 'generate' different pressure equations: Specified u, v. This is the simplest case, since all needed information is already available. In anticipation of the final result (a boundary equation, recall), we divide the continuity equation by h before substituting the accelerations: thus, 1 . / . -(K6 ~U2+U1 - K3) + — (V2 - V3 +V(,~ Vn) = 0, 2 2h (3.13-303) into which we insert the previous ii and v equations for nodes 6 and 7, (3.13-265) through (3.13-269). There are no equations for u and v at nodes 2 and 3; they lie on a Dirichlet boundary and therefore u2, ^2, «3, and v3 are given 'data' and thus show up in g in (3.13-256). The result is 1 1 2vVi(u6 + «7) - -{Al + A") - - 1 fP\ ~ Pa + Pi - P5 , P3-P6 + P2- P5 I + + t(76 + T7) - -(«2 + u3) + ^l(v6 ~ t>i) -—(Al- Av7) 2 2 2n 2h I (P2-P\+P5-P4 4h\ h PI I . . + ^-(T6-T7)+— (v2-v3) = 0, 2h 2h P2 + Pe h (3.13-304) which is a mess. To proceed, we transpose all but the terms in Pt/l to the RHS, change the sign, and regroup to obtain (/>, -P4) + 2(P2-P5) + (P3-P6) 4/ 2 / U6 + W7 \ = VH 2 J" / + 2 -^ U2 + U3 { Au6+A"~ 2 ' 2 j Nv)+V+t* r7 -T6) V2 h ~V3~ h 1 , (^6 + ^7) + a 2 - 2P2 + P3) + (P4 - 2/z2 -2/>5+/>6) (3.13 an equation that is to be interpreted as applying at the centroid of element 2. Letting / and h go toward zero, we obtain, in a two-step procedure that is not really necessary, — = vV2w - [ — + u • Vw + aT dx \dt I 3 29^ 9 dP dv 3y 3r + 0(l,h), (3.13-306) in which only the first four of the RHS terms survives the limit process; i.e., the final 'PPE' for element 2 is actually a boundary equation (again—no surprise anymore); it is,
572 THE NAVIER-STOKES EQUATIONS in fact, the (Neumann) boundary condition for the PPE, dP , fdu \ — = v\72u- ( — + u-Vw }+aT, (3.13-307) ox \at J which is just the normal component of the NS equations applied on T^. The actual equation (3.13-305), is, of course, an ODE that is actually a first-order accurate approximation to the normal momentum equation—applied at the centroid of element 2, which itself approaches the wall as / —► 0. (This ostensibly 'first-order' approximation, however, in no way vitiates the second-order accuracy of the overall solution.) Remarks: (1) The 'suspicious sign' of dv/dt in the bracketed term above is just one example of many in which the GFEM will do whatever is necessary, with respect to tweaking/twiddling the truncation error terms, so as to attain the key objective for incompressible flow: V/j • uh = 0. (2) The term — (/z/2)(w2 + W3) + (l/2)(v2 — v3) in the original statement of the continuity equation corresponds to (a portion of) g in (3.13-256). (3) This case (Dirichlet BC's on velocity and Neumann BC's on pressure) is further discussed in great detail in Gresho and Sani (1987). Specified fn, v. Here we need the full momentum equation for nodes 2 and 3 in the jc-direction. That for node 2 has already been presented, see (3.13-285) and (3.13-286), and that for node 3 follows 'by inspection'; it is lh . vh — vl — lh 9 „ , „ -«3 + —Dxu3 - -zrDyyUi + -(2AU3 +3 Au3) L I Lh 4 --(P2 + P3)-oc-T3=hJny (3.13-308) Starting afresh, we rewrite the continuity equation for element 2 in the form -[(W6 + tin) - (K2 + «0] + ^[(V2 - V3) + (V6 ~V7)] = 0 (3.13-309) to give, after inserting the relevant momentum equations, vh (Au + A") h -V\{ub + «7) - hK 62 V - -[(/>, - P4) + 2(/>2 -P5) + (P3- p6)] ah vh— v— h , 0 0 , + y (^6 + T7) + -J2DAU2 + m) - -Dyy(u2 + w3) + 4(^2 +2 A\ +2A% +3 A\) h ah h — — - -(/», + 2P2 + Pi) - ~(T2 + T3)- -{fni + /„3) + ^ {(^2 - v3) + vV2h(v6 - vj) - (Al-Aw7) 1 + 2 (P, - 2P2 + P3) + (P4 - 2P5 + P6) another mess. 
+ £(76-:T7U=0; (3.13-310)
THE FINITE ELEMENT EQUATIONS/DISCRETIZATION OF THE WEAK FORM 573 To proceed, note that several terms are divided by / and most are multiplied by h. Thus, we multiply by l/2h and judiciously rearrange to arrive at Dx(u2 + u3) />, + 2P2 + P3 7„2 + In, v - / 2 4 2 / 2 fu6+u7\ Al+A" (Pl-P4) + 2(P2-P5) + (P,-P6) (T6 + T7) Dyy lAu2+2Au2+2Au3+3Au3 (T2 + T3) - a + v-y(u2 + u3) - "- +a 2 h 4 2 I2 + 4 V3 ~V2 2 (I)-; - V6) (AV7 - AD h " h h (Tn - 7^1 (3.13-311) (Pl-2P2 + P3) + (P4-2P5+P6) (r7-r6) _ +p 2hz h As /, h —> 0, the only remaining terms are those on the LHS; i.e., we obtain v^-P-fn=0, (3.13-312) ox the same BC satisfied by the x-momentum equation at node 2; cf. (3.13-289). Here though, the BC acts more like a Dirichlet condition for the PPE; while we do not specifically set P at the boundary, it is inherently (and approximately for finite /, h) set via this '/„ BC as P = v^--fn, (3.13-313) ax and is done in such a way that V/j • uh = 0 on F is assured. Specified f„, fz. The full equations for this case are almost too long to be useful; the final equation is similar to the above case except that the y-momentum equations must be employed for i>2 and i>3, since these velocities are no longer specified. We shall be content to leave the details to those interested and merely state what is nearly obvious: the end result is the same. As /, h —>• 0, the pressure (continuity) equation becomes the same BC for the PPE, p=vT-~fn, (3.13-314) ax the y-direction (tangential) BC is again waived in favor of the normal 'momentum equation'—i.e., the 'force' balance in the jc-direction—because this is the equation that keeps the solution divergence-free. Specified u, fT. For this case we need the momentum equations for nodes 2 and 3 in the y-direction. That for node 2 has already been presented, (3.13-292), and that for node 3 follows from it by 'inspection': Ih. vh — V3 + — [(V2 - Vb) + 4(V3 - V7) + (V4 - Vs)] 1 o/
574 THE NAVIER-STOKES EQUATIONS vl - — [2{v2 - 2v3 + v4) + (V(y - 2v7 + v8)] on Ih , -, I Blh — + -(2A3' +3 A\) + -(P3 - P2) - ~~T3 = hf *3! (3.13-315) or, dividing by h and using the shortcut notation, Pi-Pi D_2 ti DY - ( V3 ~ v-gvs +A\ + ' J , ' * - 0T3 1 + v-^u3 = / /* / *y (3.13-316) Inserting all momentum equations (except those for which the velocity is specified, u2 and u3) into the (same) continuity equation, this time in the form \[{u2 +"3) — («6 + "7)] + jr[(v3 - v2) + (vi ~ Vf,)] = 0, gives U2 + U3 V 1 2 --(V>6 + V^7) + -(A^+A^) + -[(/>! -P4 + P2- P5) + (P2 - ^5 + P3 - Pe)] ain + T^ + (fT-fZ7) Dx(v3-v2) h — v- l h + + Dyv (v3-v2) , Q(T3 -T2) A\-A\ Px- 2P2 + P3 V ^ ; V P~ —-' uV hA h ' '' h h 2(v7-v6) , .(7-7-7-6) (A*-AD hl h + P- h h \_ /P1-2P2 + P3 P4-2P5+P6 2 I h2 h2 0. (3.13-317) Clearly this equation approximates—at the centroid of element 2— fdu 9 dP \ d ( dv\ uV2w + u • Vw H aT\-\ I fz - v—\ \dt dx J dy V ar/ / 3 / d2v 2Yy ["dy2 + ^— I v^~2 + vV2v - 2u ■ Vv - 2— + 2pT ) =0(l,h), (3.13-318) dP Jy which, since the tangential BC, v(dv/dx) = fT from (3.13-294), is valid for all y, will converge to the Neumann BC for the PPE, dP = vVlu + olT du + u • Vw , (3.13-319) dx V & the normal component of the momentum equation. In a common application of this BC, u = 0 and fT = 0—i.e., a symmetry BC—we get dP dx = ccT; (3.13-320) the normal pressure gradient balances the 'body force.'
Remarks:
(1) The viscous term vanishes (for the symmetry BC) because $u = 0$ for all $y$, which leaves $\nu\,\partial^2 u/\partial x^2$. But

$$\nabla\cdot\mathbf{u} = 0 \;\Rightarrow\; \frac{\partial^2 u}{\partial x^2} = -\frac{\partial}{\partial x}\left(\frac{\partial v}{\partial y}\right) = -\frac{\partial}{\partial y}\left(\frac{\partial v}{\partial x}\right) \quad\text{and}\quad \nu\frac{\partial v}{\partial x} = f_\tau = 0$$

for all $y$.
(2) In summary, and generalizing slightly, the PPE BC's are: (i) For either $\mathbf{u}$ specified or $\mathbf{u}\cdot\mathbf{n}$ and tangential 'traction' specified, $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} + \gamma\mathbf{g}T - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)$, a Neumann BC. (ii) For either normal 'traction' and tangential velocity specified or total 'traction' specified, $P = \nu\,\partial u_n/\partial n - f_n$, a Dirichlet BC.
(3) If the stress-divergence form is used, the term $\partial u_n/\partial n$ is multiplied by two.
(4) In general, specified normal velocity implies that the BC for the PPE is the normal component of the momentum equation, at least for domains with a fixed boundary.
(5) The overriding issue related to pressure BC's is that they be such as to ensure that the discrete version of $\nabla_h\cdot\mathbf{u}^h = 0$ is satisfied on the boundary [just as the (consistent) PPE itself ensures that the discrete version of $\nabla_h\cdot\mathbf{u}^h = 0$ is satisfied in the domain].

Corner boundary conditions for the PPE. The last situation to examine, and in some sense the most involved, is the 'pressure equations' that result from applying the various velocity BC's at a domain corner. For variety and simplicity, we let this corner be that denoted by element 3; i.e., the domain boundary is now the line connecting nodes 1, 2, 3, 4, 8, 12, and 16. Since we have two equations (with BC's) on each of two surfaces (horizontal, say at $y = H$, and vertical, say at $x = L$) and four possible velocity BC's at each (specified: $u_n$ and $u_\tau$, $u_n$ and $f_\tau$, $f_n$ and $u_\tau$, and $f_n$ and $f_\tau$), there are $4^2 = 16$ possibilities, and this is just 2D! (In 3D, there are, by our count, $9^3 = 729$, all of which will be left to the computer as far as we are concerned; i.e., the knowledge gained from the 2D analysis should be sufficient.)

We shall develop only two of the 16 in any detail, but will present a few of the others in abbreviated form. First, a summary of the results, in the form of a table (Table 3.13-9). We note several things from Table 3.13-9:
1. The specified $f_n$ BC is 'dominant,' and provides a Dirichlet BC for the pressure.
2. Only 'specified $\mathbf{u}$' returns the Neumann BC for $P$.
3. The loss of a pressure BC in three cases with $u_n$ and $f_\tau$ specified on part or all of $\Gamma$ is surprising, especially since 'specified $u_n$' usually delivers the Neumann BC for $P$.
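The three observations on Table 3.13-9 can be condensed into a small decision rule. The sketch below enumerates the $4^2 = 16$ combinations and classifies each one; the string labels (`"un"`, `"ut"`, `"fn"`, `"ft"`) and the function name are hypothetical conveniences for this example, and the rule itself is only a restatement of the tabulated outcomes, not an independent derivation.

```python
from itertools import product

# Each face carries one of four velocity-BC pairs: (normal part, tangential part).
# "un"/"ut" = specified normal/tangential velocity,
# "fn"/"ft" = specified normal/tangential traction.
FACE_BCS = [("un", "ut"), ("un", "ft"), ("fn", "ut"), ("fn", "ft")]


def corner_pressure_bc(bc_at_x_L, bc_at_y_H):
    """Classify the pressure BC delivered by the corner continuity equation."""
    if "fn" in (bc_at_x_L[0], bc_at_y_H[0]):
        # Specified normal traction is dominant: P = nu * dun/dn - fn.
        return "Dirichlet"
    if bc_at_x_L == ("un", "ut") and bc_at_y_H == ("un", "ut"):
        # Fully specified velocity: normal momentum equation.
        return "Neumann"
    # un with ft on part or all of the corner: only the shear NBC is returned.
    return "none"


table = {
    (bx, by): corner_pressure_bc(bx, by)
    for bx, by in product(FACE_BCS, repeat=2)
}

counts = {}
for kind in table.values():
    counts[kind] = counts.get(kind, 0) + 1

print(len(table), counts)
```

The counts (twelve Dirichlet, one Neumann, three with no pressure BC) match the tally of entries in the table.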
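Table 3.13-9 Boundary conditions for a corner element.

Case | BC's at $x = L$ | BC's at $y = H$ | Resulting pressure BC's
1 | $u, v$ $(u_n, u_\tau)$ | $u, v$ $(u_n, u_\tau)$ | $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} + \gamma\mathbf{g}T - \partial\mathbf{u}/\partial t - \mathbf{u}\cdot\nabla\mathbf{u})$
2 | $u, v$ $(u_n, u_\tau)$ | $u, f_y$ $(f_n, u_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
3 | $u, v$ $(u_n, u_\tau)$ | $f_x, v$ $(u_n, f_\tau)$ | $f_\tau = \nu\,\partial u_\tau/\partial n$ (1)
4 | $u, v$ $(u_n, u_\tau)$ | $f_x, f_y$ $(f_n, f_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
5 | $u, f_y$ $(u_n, f_\tau)$ | $u, v$ $(u_n, u_\tau)$ | $f_\tau = \nu\,\partial u_\tau/\partial n$ (1)
6 | $u, f_y$ $(u_n, f_\tau)$ | $u, f_y$ $(f_n, u_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
7 | $u, f_y$ $(u_n, f_\tau)$ | $f_x, v$ $(u_n, f_\tau)$ | $(f_x - \nu\,\partial u/\partial y)_{y=H} + (f_y - \nu\,\partial v/\partial x)_{x=L} = 0$ (1)
8 | $u, f_y$ $(u_n, f_\tau)$ | $f_x, f_y$ $(f_n, f_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
9 | $f_x, v$ $(f_n, u_\tau)$ | $u, v$ $(u_n, u_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
10 | $f_x, v$ $(f_n, u_\tau)$ | $u, f_y$ $(f_n, u_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
11 | $f_x, v$ $(f_n, u_\tau)$ | $f_x, v$ $(u_n, f_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
12 | $f_x, v$ $(f_n, u_\tau)$ | $f_x, f_y$ $(f_n, f_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
13 | $f_x, f_y$ $(f_n, f_\tau)$ | $u, v$ $(u_n, u_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
14 | $f_x, f_y$ $(f_n, f_\tau)$ | $u, f_y$ $(f_n, u_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
15 | $f_x, f_y$ $(f_n, f_\tau)$ | $f_x, v$ $(u_n, f_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$
16 | $f_x, f_y$ $(f_n, f_\tau)$ | $f_x, f_y$ $(f_n, f_\tau)$ | $P = \nu\,\partial u_n/\partial n - f_n$

(1) A loss of a pressure BC! It is only a pointwise loss, however, and is mathematically legitimate (D. Griffiths, personal communication).

We now present details of these derivations for Cases 1 and 16 (the two extremes) in Table 3.13-9, after emphasizing that all of the pressure BC's come from a single continuity equation, i.e.,

$$\frac{h}{2}\left(\dot u_3 - \dot u_7 + \dot u_4 - \dot u_8\right) + \frac{l}{2}\left(\dot v_4 - \dot v_3 + \dot v_8 - \dot v_7\right) = 0, \tag{3.13-321}$$

and pointing out that not all terms in each discrete momentum equation are required to be carried along, since they vanish with mesh refinement. Thus, we first present the (only) form of the momentum equations needed for the analysis of the most general case, Case 16 [where we distinguish between normal and tangential applied stresses and let HOT = Higher-Order Terms; actually, they are $O(1)$ in $h$ and $l$]:

$$\begin{aligned}
\dot u_3 &= \frac{2}{l}\left(f^n_x + P - \nu\frac{\partial u}{\partial x}\right)\Big|_3 + \text{HOT},\\
\dot v_3 &= \frac{2}{l}\left(f^\tau_y - \nu\frac{\partial v}{\partial x}\right)\Big|_3 + \text{HOT},\\
\dot u_4 &= \frac{2}{l}\left(f^n_x + P - \nu\frac{\partial u}{\partial x}\right)\Big|_4 + \frac{2}{h}\left(f^\tau_x - \nu\frac{\partial u}{\partial y}\right)\Big|_4 + \text{HOT},\\
\dot v_4 &= \frac{2}{l}\left(f^\tau_y - \nu\frac{\partial v}{\partial x}\right)\Big|_4 + \frac{2}{h}\left(f^n_y + P - \nu\frac{\partial v}{\partial y}\right)\Big|_4 + \text{HOT},\\
\dot u_7 &= \left(\nu\nabla_h^2 u + \alpha T - \mathbf{u}\cdot\nabla u - \partial P/\partial x\right)\big|_7,\\
\dot v_7 &= \left(\nu\nabla_h^2 v + \beta T - \mathbf{u}\cdot\nabla v - \partial P/\partial y\right)\big|_7,\\
\dot u_8 &= \frac{2}{h}\left(f^\tau_x - \nu\frac{\partial u}{\partial y}\right)\Big|_8 + \text{HOT}, \quad\text{and}\\
\dot v_8 &= \frac{2}{h}\left(f^n_y + P - \nu\frac{\partial v}{\partial y}\right)\Big|_8 + \text{HOT},
\end{aligned} \tag{3.13-322}$$

where we point out that in all cases save one (Case 1), the $\dot u_7$ and $\dot v_7$ contributions will vanish with mesh refinement.

Case 1: $u$ and $v$ specified. Here, only node 7 is 'free,' and the continuity equation becomes

$$\frac{h}{2}\left[\dot u_3 - \left(\nu\nabla_h^2 u + \alpha T - \mathbf{u}\cdot\nabla u - \partial P/\partial x\right)\big|_7 + \dot u_4 - \dot u_8\right] + \frac{l}{2}\left[\dot v_4 - \dot v_3 + \dot v_8 - \left(\nu\nabla_h^2 v + \beta T - \mathbf{u}\cdot\nabla v - \partial P/\partial y\right)\big|_7\right] = 0. \tag{3.13-323}$$

Now as $l, h \to 0$, we note that $\dot u_3 + \dot u_4 - \dot u_8 = \partial u/\partial t|_7 + \text{HOT}$ and $\dot v_4 - \dot v_3 + \dot v_8 = \partial v/\partial t|_7 + \text{HOT}$ to yield, at node 7,

$$h\left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u + \frac{\partial P}{\partial x} - \nu\nabla^2 u - \alpha T\right) + l\left(\frac{\partial v}{\partial t} + \mathbf{u}\cdot\nabla v + \frac{\partial P}{\partial y} - \nu\nabla^2 v - \beta T\right) = 0, \tag{3.13-324}$$

a (particular) linear combination of the $x$- and $y$-momentum equations. But if we introduce the so-called 'mass-consistent' normal vector (with concomitant tangent), from (3.13-56), a more appropriate (mass-consistent) interpretation is possible. For corner node 4, the consistent normal vector is given by $n_x = h/\sqrt{h^2 + l^2}$ and $n_y = l/\sqrt{h^2 + l^2}$ (and the concomitant tangent vector by $\tau_x = -n_y$, $\tau_y = n_x$). Thus, dividing the above linear combination of momentum equations by $\sqrt{h^2 + l^2}$ gives, with $M_x = 0$ and $M_y = 0$ used as shortcut designations of the $x$- and $y$-momentum equations, respectively, $n_x M_x + n_y M_y = 0$. But this is just $\mathbf{n}\cdot\mathbf{M} = M_n = 0$, the normal momentum equation in the corner element; i.e., the continuity-equation-derived PPE BC is simply (again)

$$\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} + \gamma\mathbf{g}T - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t); \tag{3.13-325}$$

the normal component of the momentum equation is the (Neumann) BC for pressure.

Case 16: $f_n$ and $f_\tau$ specified. This is the opposite end of the spectrum of cases, in which all momentum equations partake, and Figure 3.13-23 may be useful.

Fig. 3.13-23 A corner element.

Thus, inserting six of the eight momentum equations from (3.13-322) into (3.13-321) gives, neglecting HOT,

$$\begin{aligned}
&\frac{h}{2}\left[\frac{2}{l}\left(f^n_x + P - \nu\frac{\partial u}{\partial x}\right)_3 - \dot u_7 + \frac{2}{l}\left(f^n_x + P - \nu\frac{\partial u}{\partial x}\right)_4 + \frac{2}{h}\left(f^\tau_x - \nu\frac{\partial u}{\partial y}\right)_4 - \frac{2}{h}\left(f^\tau_x - \nu\frac{\partial u}{\partial y}\right)_8\right]\\
&\quad + \frac{l}{2}\left[\frac{2}{l}\left(f^\tau_y - \nu\frac{\partial v}{\partial x}\right)_4 + \frac{2}{h}\left(f^n_y + P - \nu\frac{\partial v}{\partial y}\right)_4 - \frac{2}{l}\left(f^\tau_y - \nu\frac{\partial v}{\partial x}\right)_3 + \frac{2}{h}\left(f^n_y + P - \nu\frac{\partial v}{\partial y}\right)_8 - \dot v_7\right] = 0,
\end{aligned} \tag{3.13-326}$$

where we refrain from explicating $\dot u_7$ and $\dot v_7$ because, as will be seen below, their contributions vanish in the limit of mesh refinement. In fact, letting $l, h \to 0$ in the above equation yields

$$\frac{2h}{l}\left(f^n_x + P - \nu\frac{\partial u}{\partial x}\right) + l\frac{\partial}{\partial x}\left(f^\tau_x - \nu\frac{\partial u}{\partial y}\right) + h\frac{\partial}{\partial y}\left(f^\tau_y - \nu\frac{\partial v}{\partial x}\right) + \frac{2l}{h}\left(f^n_y + P - \nu\frac{\partial v}{\partial y}\right) = O(l, h), \tag{3.13-327}$$

and multiplication by $lh/2$ gives

$$h^2\left(f^n_x + P - \nu\frac{\partial u}{\partial x}\right) + l^2\left(f^n_y + P - \nu\frac{\partial v}{\partial y}\right) = O(l^2 h, l h^2). \tag{3.13-328}$$

Introducing again the consistent normal definition gives, upon division by $(h^2 + l^2)$,

$$P = n_x^2\left(\nu\frac{\partial u}{\partial x} - f^n_x\right) + n_y^2\left(\nu\frac{\partial v}{\partial y} - f^n_y\right) + O(l, h). \tag{3.13-329}$$

If we now invoke the interpretation that each term in parentheses is simply $(\nu\,\partial u_n/\partial n - f_n)$, even though the first applies to the vertical boundary and the second to the horizontal boundary, and use $n_x^2 + n_y^2 = 1$, we obtain

$$P = \nu\,\partial u_n/\partial n - f_n; \tag{3.13-330}$$

the normal 'traction' BC on velocity serves (again) as a Dirichlet BC for the pressure.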
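The last step of Case 16, from (3.13-328) to (3.13-330), is a weighted average with weights $n_x^2$ and $n_y^2$. A minimal sketch of that step follows, assuming hypothetical function names and treating the two bracketed quantities, $\nu\,\partial u/\partial x - f^n_x$ on the vertical boundary and $\nu\,\partial v/\partial y - f^n_y$ on the horizontal one, as plain numbers supplied by the caller.

```python
import math


def consistent_corner_normal(h, l):
    """Mass-consistent normal at the corner node: (h, l) / sqrt(h^2 + l^2)."""
    r = math.hypot(h, l)
    return h / r, l / r


def corner_pressure(h, l, vertical_term, horizontal_term):
    """Weighted combination of the two normal-stress terms.

    vertical_term   : nu * du/dx - f_x^n on the vertical boundary.
    horizontal_term : nu * dv/dy - f_y^n on the horizontal boundary.
    """
    nx, ny = consistent_corner_normal(h, l)
    return nx**2 * vertical_term + ny**2 * horizontal_term


# If both terms are read as the same quantity nu * dun/dn - fn, the element
# aspect ratio drops out because nx^2 + ny^2 = 1.
same_value = 2.5
p_square = corner_pressure(1.0, 1.0, same_value, same_value)
p_stretched = corner_pressure(0.3, 0.7, same_value, same_value)
print(p_square, p_stretched)
```

When the two terms differ, the result depends on $h/l$, which is one way to see why the common interpretation of the two parentheses is needed to arrive at a grid-independent Dirichlet condition.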
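Before leaving this case, however, it is of some interest to return to the actual momentum equations (surface force balances) at node 4; i.e., (for $l, h \to 0$) [cf. (3.13-298) and (3.13-299)]

$$h\left(f^n_x + P - \nu\frac{\partial u}{\partial x}\right) + l\left(f^\tau_x - \nu\frac{\partial u}{\partial y}\right) = 0$$

from the $x$-equation, and

$$h\left(f^\tau_y - \nu\frac{\partial v}{\partial x}\right) + l\left(f^n_y + P - \nu\frac{\partial v}{\partial y}\right) = 0$$

from the $y$-equation. Rotating these equations from $x$-$y$ to $n$-$\tau$, as before, gives

$$n_x\left[h\left(f^n_x + P - \nu\frac{\partial u}{\partial x}\right) + l\left(f^\tau_x - \nu\frac{\partial u}{\partial y}\right)\right] + n_y\left[h\left(f^\tau_y - \nu\frac{\partial v}{\partial x}\right) + l\left(f^n_y + P - \nu\frac{\partial v}{\partial y}\right)\right] = 0 = M_n \tag{3.13-331}$$

and

$$\tau_x\left[h\left(f^n_x + P - \nu\frac{\partial u}{\partial x}\right) + l\left(f^\tau_x - \nu\frac{\partial u}{\partial y}\right)\right] + \tau_y\left[h\left(f^\tau_y - \nu\frac{\partial v}{\partial x}\right) + l\left(f^n_y + P - \nu\frac{\partial v}{\partial y}\right)\right] = 0 = M_\tau. \tag{3.13-332}$$

Using now $h = n_x\sqrt{l^2 + h^2}$ and $l = n_y\sqrt{l^2 + h^2}$ gives

$$P = n_x^2\left(\nu\frac{\partial u}{\partial x} - f^n_x\right) + n_x n_y\left(\nu\frac{\partial u}{\partial y} - f^\tau_x + \nu\frac{\partial v}{\partial x} - f^\tau_y\right) + n_y^2\left(\nu\frac{\partial v}{\partial y} - f^n_y\right) \tag{3.13-333}$$

from $M_n = 0$ and

$$\tau_x n_x\left(f^n_x - \nu\frac{\partial u}{\partial x}\right) + \tau_x n_y\left(f^\tau_x - \nu\frac{\partial u}{\partial y}\right) + \tau_y n_x\left(f^\tau_y - \nu\frac{\partial v}{\partial x}\right) + \tau_y n_y\left(f^n_y - \nu\frac{\partial v}{\partial y}\right) = 0 \tag{3.13-334}$$

from $M_\tau = 0$, where we have used $\tau_x n_x + \tau_y n_y = \boldsymbol{\tau}\cdot\mathbf{n} = 0$. Noteworthy here are the following:

1. The consistent (normal) rotation has removed $P$ from the tangential momentum equation, an absolute 'must' when the $Q_1P_0$ element is used, since it is unable to represent a pressure gradient within a single element (see also the discussion surrounding Figure 3.13-2. in Section 3.13.1e).

2. In the $M_\tau = 0$ equation, the first and last terms sum to zero, after again invoking the interpretation that $\nu(\partial u/\partial x) - f^n_x = \nu(\partial v/\partial y) - f^n_y = \nu(\partial u_n/\partial n) - f_n$. This leads to the final result, $\nu(\partial u_\tau/\partial n) = f_\tau$, after invoking a similar interpretation of the shear terms.

3. Comparing the $M_n = 0$ result with that earlier from the mass conservation requirement, (3.13-330), and requiring consistency between the two resulting Dirichlet BC equations for the pressure implies that $n_x n_y(\nu(\partial u/\partial y) - f^\tau_x + \nu(\partial v/\partial x) - f^\tau_y) = 0$ is required. But this is identically true from the same shear stress BC, $f_\tau = \nu\,\partial u_\tau/\partial n$; i.e., the offered interpretation is compatible/consistent.

4. Using (for straight boundaries)

$$\frac{\partial u_n}{\partial n} = \left(n_x\frac{\partial}{\partial x} + n_y\frac{\partial}{\partial y}\right)(n_x u + n_y v) \quad\text{and}\quad \frac{\partial u_\tau}{\partial n} = \left(n_x\frac{\partial}{\partial x} + n_y\frac{\partial}{\partial y}\right)(\tau_x u + \tau_y v)$$

permits the following alternate representation of the rotated momentum equations: (i) $\nu(\partial u_n/\partial n) - P = n_x f^{\text{eff}}_x + n_y f^{\text{eff}}_y = f^{\text{eff}}_n$ and (ii) $\nu\,\partial u_\tau/\partial n = \tau_x f^{\text{eff}}_x + \tau_y f^{\text{eff}}_y = f^{\text{eff}}_\tau$, where

$$f^{\text{eff}}_x = n_x f^n_x + n_y f^\tau_x = \frac{h f^n_x + l f^\tau_x}{\sqrt{h^2 + l^2}} \quad\text{and}\quad f^{\text{eff}}_y = n_x f^\tau_y + n_y f^n_y = \frac{h f^\tau_y + l f^n_y}{\sqrt{h^2 + l^2}}$$

are the effective surface forces caused by the applied surface tractions.

5. If the stress-divergence form of the equations was used, then each normal velocity derivative above would be doubled and $\partial u_n/\partial\tau$ added to every appearance of $\partial u_\tau/\partial n$.

6. If $\nu = 0$, then all tangential 'traction' terms are dropped (replaced by the original tangential momentum equations, actually); the pressure BC remains unchanged ($P = -f_n$), as does the normal momentum equation; and the tangential momentum equation becomes $-l(\partial u/\partial t + \mathbf{u}\cdot\nabla u) + h(\partial v/\partial t + \mathbf{u}\cdot\nabla v) = 0$; i.e., $\boldsymbol{\tau}\cdot(\partial\mathbf{u}/\partial t + \mathbf{u}\cdot\nabla\mathbf{u}) = 0$, and no pressure gradient is present.

This completes the top and bottom of the entries in Table 3.13-9. Only a few of the remaining 14 cases will now be presented, and in a much-abbreviated form; in all cases we just write $\dot u_7$, $\dot v_7$ for the internal node momentum equation.

Case 3: $u$, $v$ specified at $x = L$; $f_x$ and $v$ specified at $y = H$.

$$\frac{h}{2}\left[\dot u_3 - \dot u_7 + \dot u_4 - \frac{2}{h}\left(f_x - \nu\frac{\partial u}{\partial y}\right)_8\right] + \frac{l}{2}\left[\dot v_4 - \dot v_3 + \dot v_8 - \dot v_7\right] = 0; \tag{3.13-335}$$

i.e., the $l, h \to 0$ result is simply

$$f_x = \nu\frac{\partial u}{\partial y} \quad\text{or}\quad f_\tau = \nu\frac{\partial u_\tau}{\partial n}, \tag{3.13-336}$$

a 'repeat' of the only NBC in the momentum equations, and no BC for pressure.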
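The claim in item 1, that rotating the two corner force balances with the consistent normal removes $P$ from the tangential equation, is easy to check numerically. The sketch below is only illustrative: the function name is hypothetical, the tangent is assumed to be $(\tau_x, \tau_y) = (-n_y, n_x)$ (the choice consistent with item 6), and the four stress terms are arbitrary numbers standing in for $f^n_x - \nu\,\partial u/\partial x$, $f^\tau_x - \nu\,\partial u/\partial y$, $f^\tau_y - \nu\,\partial v/\partial x$, and $f^n_y - \nu\,\partial v/\partial y$.

```python
import math


def rotated_balances(h, l, P, normal_x, shear_x, shear_y, normal_y):
    """Rotate the corner x- and y-force balances into (M_n, M_tau).

    normal_x = f_x^n - nu du/dx,   shear_x = f_x^tau - nu du/dy,
    shear_y  = f_y^tau - nu dv/dx, normal_y = f_y^n - nu dv/dy.
    """
    r = math.hypot(h, l)
    nx, ny = h / r, l / r      # mass-consistent normal
    tx, ty = -ny, nx           # assumed tangent convention
    x_eq = h * (normal_x + P) + l * shear_x
    y_eq = h * shear_y + l * (normal_y + P)
    return nx * x_eq + ny * y_eq, tx * x_eq + ty * y_eq


h, l = 0.4, 0.9
terms = (0.7, -1.3, 2.1, 0.2)
Mn_0, Mt_0 = rotated_balances(h, l, 0.0, *terms)
Mn_P, Mt_P = rotated_balances(h, l, 3.7, *terms)
print(Mt_P - Mt_0, Mn_P - Mn_0)
```

The tangential balance is unchanged by the pressure, while the normal balance shifts by $P\sqrt{h^2 + l^2}$; with any other (non-consistent) normal the pressure would leak into the tangential equation.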
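Case 4: $u$, $v$ specified at $x = L$; $f_x$, $f_y$ specified at $y = H$.

$$\frac{h}{2}\left[\dot u_3 - \dot u_7 + \dot u_4 - \frac{2}{h}\left(f_x - \nu\frac{\partial u}{\partial y}\right)_8\right] + \frac{l}{2}\left[\dot v_4 - \dot v_3 + \frac{2}{h}\left(f_y + P - \nu\frac{\partial v}{\partial y}\right)_8 - \dot v_7\right] = 0, \tag{3.13-337}$$

which $\Rightarrow\ l(f_y + P - \nu(\partial v/\partial y)) - h(f_x - \nu(\partial u/\partial y)) = 0$, which, upon invoking the tangential BC from the momentum equation, gives $P = \nu\,\partial v/\partial y - f_y = \nu\,\partial u_n/\partial n - f_n$ as the BC for the PPE.

Case 7: $u_n$ and $f_\tau$ specified at $x = L$ and at $y = H$.

$$\frac{h}{2}\left[\dot u_3 - \dot u_7 + \dot u_4 - \frac{2}{h}\left(f_x - \nu\frac{\partial u}{\partial y}\right)_8\right] + \frac{l}{2}\left[\dot v_4 - \frac{2}{l}\left(f_y - \nu\frac{\partial v}{\partial x}\right)_3 + \dot v_8 - \dot v_7\right] = 0, \tag{3.13-338}$$

to give $(f_x - \nu(\partial u/\partial y))|_{y=H} + (f_y - \nu(\partial v/\partial x))|_{x=L} = 0$; both shear-stress NBC's are inherited by the PPE, with the pressure absent. It must be the case that the contiguous boundary nodes, which supply Neumann BC's for the PPE, also set the pressure in such a corner.

Remark: It is interesting to examine the inviscid version of this BC: $u_n$ specified at $x = L$ and at $y = H$. The mass-consistent normal $\Rightarrow\ \dot u_n = n_x\dot u_4 + n_y\dot v_4$, which, when placed in the (original) continuity equation, gives $n_x(\dot u_3 - \dot u_7 - \dot u_8) + \dot u_n + n_y(\dot v_8 - \dot v_7 - \dot v_3) = 0$, which becomes $n_x(l(\partial/\partial x)(\partial u/\partial t) - \dot u_8) + \dot u_n + n_y(h(\partial/\partial y)(\partial v/\partial t) - \dot v_3) = 0$, or $\dot u_n = n_x\dot u_8 + n_y\dot v_3$, wherein $\dot u_n$ is given. Finally, inserting the (tangential) momentum equations and rearranging gives $\partial P/\partial n = -\mathbf{n}\cdot(\partial\mathbf{u}/\partial t + \mathbf{u}\cdot\nabla\mathbf{u})$, a return to the Neumann BC. (A similar result obtains for Cases 3 and 5.) So we see that the 'strange' result for the viscous case is indeed a viscous effect.

Case 8: $u$, $f_y$ specified at $x = L$; $f_x$, $f_y$ specified at $y = H$.

$$\frac{h}{2}\left[\dot u_3 - \dot u_7 + \dot u_4 - \frac{2}{h}\left(f_x - \nu\frac{\partial u}{\partial y}\right)_8\right] + \frac{l}{2}\left[\frac{2}{l}\left(f_y - \nu\frac{\partial v}{\partial x}\right)_4 + \frac{2}{h}\left(f_y + P - \nu\frac{\partial v}{\partial y}\right)_4 - \frac{2}{l}\left(f_y - \nu\frac{\partial v}{\partial x}\right)_3 + \frac{2}{h}\left(f_y + P - \nu\frac{\partial v}{\partial y}\right)_8 - \dot v_7\right] = 0, \tag{3.13-339}$$

to give

$$\frac{2l}{h}\left(f_y + P - \nu\frac{\partial v}{\partial y}\right) - \left(f_x - \nu\frac{\partial u}{\partial y}\right) = O(l, h); \tag{3.13-340}$$

after invoking the $x$-equation BC, we obtain $P = \nu\,\partial v/\partial y - f_y = \nu\,\partial u_n/\partial n - f_n$. And we stop here.

g. Flow past a flat plate.

While the equations and methodology are still fresh in our minds, it is interesting to examine another common situation in which the PPE is deprived of a BC at a single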
point, at least under one set of velocity BC's. To this end, suppose now that the original patch of nine elements in Figure 3.13-20 describes flow near the leading edge of a flat plate at zero angle of attack. The bottom surface is thus a symmetry line until the plate begins at node 5; i.e., the flow is left to right and the BC at nodes 13 and 9 is $f_x = 0$, $v = 0$, and at nodes 5 and 1 (comprising the leading portion of the plate), we have $u = v = 0$. The continuity equation for element 4 is the focal point:

$$\frac{h}{2}\left[\dot u_5 - \dot u_9 + \dot u_6 - \dot u_{10}\right] + \frac{l}{2}\left[\dot v_6 - \dot v_5 + \dot v_{10} - \dot v_9\right] = 0, \tag{3.13-341}$$

where, with our newly gained experience, we know that the key player will be the $x$-momentum equation at node 9. Also, even though node 5 has $\dot u_5 = \dot v_5 = 0$, we shall (as usual) carry these (specified) accelerations for consistency of interpretation. Thus, neglecting HOT ab initio,

$$\frac{h}{2}\left[\dot u_5 - \frac{2}{h}\left(f_x - \nu\frac{\partial u}{\partial y}\right)_9 + \dot u_6 - \dot u_{10}\right] + \frac{l}{2}\left[\dot v_6 - \dot v_5 + \dot v_{10} - \dot v_9\right] = 0, \tag{3.13-342}$$

which gives the no-longer-surprising lack of a PPE BC equation,

$$f_x = \nu\frac{\partial u}{\partial y} = 0. \tag{3.13-343}$$

Now, by the same token, and from earlier results, we find that on all upstream elements, like #7, the same procedure (symmetry BC) gives (for buoyancy absent) $\partial P/\partial y = 0$, and for all downstream elements ($u_n = u_\tau = 0$), such as #1, we obtain the Neumann BC, $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)$ or, specialized to the case of interest, $\partial P/\partial y = \nu\,\partial^2 v/\partial y^2$. Interesting. The 'transition' element, #4, 'sees' $\partial P/\partial y = 0$ to its left, $\partial u/\partial y = 0$ thereon, and $\partial P/\partial y = \nu\,\partial^2 v/\partial y^2$ to its right for the pressure/continuity equation; i.e., there is actually no pressure BC from nodes '9 til 5.' What is the effect of this? It seems to translate to the loss of a specific BC for the pressure at the leading (and trailing) edge of the plate, perhaps related to the singularity there in which the pressure is allowed some 'freedom.' On the other hand, a similar analysis in which symmetry is not invoked a priori yields a dissimilar result: it regains the Neumann BC for pressure.
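To see this, suppose the plate to begin at node 6 (and extending to node 2 and beyond) so that there is flow on both sides, and the velocity is specified at nodes 6 and 2, but symmetry on 14-10-6 is not invoked. There are now four continuity equations that are relevant: those on elements 1, 2, 4, and 5. But we need to examine only two of them in any detail (say #1 and #4), the others following from symmetry. Element 1 has

$$\frac{h}{2}\left(\dot u_1 - \dot u_5 + \dot u_2 - \dot u_6\right) + \frac{l}{2}\left(\dot v_2 - \dot v_1 + \dot v_6 - \dot v_5\right) = 0 \tag{3.13-344}$$

to give (omitting the buoyancy term for simplicity)

$$\frac{h}{2}\left[(\nu\nabla^2 u - \mathbf{u}\cdot\nabla u - \partial P/\partial x)_1 - (\nu\nabla^2 u - \mathbf{u}\cdot\nabla u - \partial P/\partial x)_5 + \dot u_2 - \dot u_6\right] + \frac{l}{2}\left[\dot v_2 - (\nu\nabla^2 v - \mathbf{u}\cdot\nabla v - \partial P/\partial y)_1 + \dot v_6 - (\nu\nabla^2 v - \mathbf{u}\cdot\nabla v - \partial P/\partial y)_5\right] = 0; \tag{3.13-345}$$

i.e.,

$$-\frac{hl}{2}\frac{\partial}{\partial x}\left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u + \partial P/\partial x - \nu\nabla^2 u\right) + l\left(\frac{\partial v}{\partial t} + \mathbf{u}\cdot\nabla v + \partial P/\partial y - \nu\nabla^2 v\right) = O(l, h), \tag{3.13-346}$$

which finally converges to the Neumann BC, $\partial P/\partial y = \nu\nabla^2 v - \mathbf{u}\cdot\nabla v - \partial v/\partial t$ in general, and $\partial P/\partial y = \nu\,\partial^2 v/\partial y^2$ in particular. No surprise.

Remark: If the flow is inviscid, then $u_2$ and $u_6$ are free, and the BC is $v = 0$ at nodes 2 and 6. The result is $\partial P/\partial y = 0$ in elements 1, 2, 4, and 5. Also no surprise.

Switching now to element 4,

$$\frac{h}{2}\left(\dot u_5 - \dot u_9 + \dot u_6 - \dot u_{10}\right) + \frac{l}{2}\left(\dot v_6 - \dot v_5 + \dot v_{10} - \dot v_9\right) = 0 \tag{3.13-347}$$

is the continuity equation, and $\dot u_6 = \dot v_6 = 0$ is the BC. This leads to

$$\frac{h}{2}\left[l\frac{\partial}{\partial x}\frac{\partial u}{\partial t} + \left(\frac{\partial u}{\partial t} - \nu\nabla^2 u + \mathbf{u}\cdot\nabla u + \partial P/\partial x\right)\right] + \frac{l}{2}\left[\left(\frac{\partial v}{\partial t} - \nu\nabla^2 v + \mathbf{u}\cdot\nabla v + \partial P/\partial y\right) + h\frac{\partial}{\partial y}\frac{\partial v}{\partial t}\right] = O(l, h), \tag{3.13-348}$$

which converges to

$$h\left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u + \partial P/\partial x - \nu\nabla^2 u\right) + l\left(\frac{\partial v}{\partial t} + \mathbf{u}\cdot\nabla v + \partial P/\partial y - \nu\nabla^2 v\right) = 0, \tag{3.13-349}$$

and further interpretation seems to rely on the recognition that $n_x = h/\sqrt{l^2 + h^2}$ and $n_y = l/\sqrt{l^2 + h^2}$ define the consistent normal for node 6 in element 4; i.e., even though the global mass-consistent unit normal would be $\mathbf{n} = -\mathbf{i}$ (obtained from the pair of elements, 1 and 2, that contain the domain's boundary), the element-consistent normal is needed to interpret the above equation as the usual Neumann BC for the PPE, $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)$ in general, and $\partial P/\partial n = \nu\,\partial^2 v/\partial y^2$ in particular. Clearly, analysis of element 2 (respectively, 5) will give results that basically replicate those from element 1 (respectively, 4); and, not clearly, we have no explanation for this paradoxical behavior, other than 'singularities can cause strange reactions.' We do point out, however, that a symmetric solution is 'compatible' with this result; all terms in the $y$-equation vanish, and (3.13-349) becomes $\partial P/\partial x = \nu\nabla^2 u - u\,\partial u/\partial x - \partial u/\partial t$.

h. Flow past a corner

Again, because of the resulting/induced forced interpretation that seems to be required to understand discrete equations, we present one more example.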
And to make it more interesting, we will use different-sized rectangular elements. Consider flow past the corner shown in Figure 3.13-24, with $\mathbf{u} = 0$ on the boundary: 4-5-8. Whereas previous analyses apply more or less directly to elements 1 and 3 (giving $\partial P/\partial x = \nu\nabla^2 u - \mathbf{u}\cdot\nabla u - \partial u/\partial t = \nu\,\partial^2 u/\partial x^2$ and $\partial P/\partial y = \nu\nabla^2 v - \mathbf{u}\cdot\nabla v - \partial v/\partial t = \nu\,\partial^2 v/\partial y^2$, respectively), it is the corner
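element that is more interesting.

Fig. 3.13-24 Flow past a corner.

Thus, for element 2 we have

$$\frac{h_2}{2}\left(\dot u_5 - \dot u_2 + \dot u_6 - \dot u_3\right) + \frac{l_1}{2}\left(\dot v_3 - \dot v_2 + \dot v_6 - \dot v_5\right) = 0, \tag{3.13-350}$$

where $\dot u_5$ is specified. Thus,

$$\frac{h_2}{2}\left[\frac{\partial u}{\partial t}\Big|_5 - (\nu\nabla^2 u - \mathbf{u}\cdot\nabla u - \partial P/\partial x)_2 + l_1\frac{\partial}{\partial x}\frac{\partial u}{\partial t}\Big|_3\right] + \frac{l_1}{2}\left[h_2\frac{\partial}{\partial y}\frac{\partial v}{\partial t}\Big|_2 + (\nu\nabla^2 v - \mathbf{u}\cdot\nabla v - \partial P/\partial y)_6 - \frac{\partial v}{\partial t}\Big|_5\right] = O(l, h), \tag{3.13-351}$$

which ostensibly 'converges' to

$$h_2\left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u + \partial P/\partial x - \nu\nabla^2 u\right) - l_1\left(\frac{\partial v}{\partial t} + \mathbf{u}\cdot\nabla v + \partial P/\partial y - \nu\nabla^2 v\right) = 0, \tag{3.13-352}$$

a linear combination of the two momentum equations that is, in fact, the element-level consistent normal momentum equation at node 5; i.e., $n_x = h_2/\sqrt{h_2^2 + l_1^2}$ and $n_y = -l_1/\sqrt{h_2^2 + l_1^2}$ is the consistent normal at node 5 when 'viewed from element 2'; it is called $\mathbf{n}^L$ in the sketch. (We will see below that this local unit normal is generally different from the global unit normal.) Thus, we seem to have arrived at the common and agreeable result, $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t) = \nu\nabla^2 u_n$ at node 5.

But now consider the inviscid case, for which we must introduce the globally mass-consistent unit normal in order to both permit slip and satisfy the appropriate 'no-penetration' BC, $\mathbf{u}\cdot\mathbf{n} = 0$. This unit normal vector is calculated from (3.13-56):

$$\mathbf{n} = \beta\int_\Omega \nabla\phi_5, \tag{3.13-353}$$

where

$$\beta^{-2} = \left(\int_\Omega \frac{\partial\phi_5}{\partial x}\right)^2 + \left(\int_\Omega \frac{\partial\phi_5}{\partial y}\right)^2, \tag{3.13-354}$$

as follows (note that all three elements sharing node 5 are involved):

$$n_x = \beta\int_\Omega \frac{\partial\phi_5}{\partial x} = \beta\int_{-1}^{1}\!\int_{-1}^{1}\left[\frac{h_1}{2}\frac{\partial\psi^{(3,1)}}{\partial\xi} + \frac{h_2}{2}\frac{\partial\psi^{(2,2)}}{\partial\xi} + \frac{h_2}{2}\frac{\partial\psi^{(1,3)}}{\partial\xi}\right] d\xi\, d\eta, \tag{3.13-355}$$

where $\psi^{(k,m)}$ is the local basis function at (local) node $k$ in element $m$. Since $\psi_k = (1 \pm \xi)(1 \pm \eta)/4$, the computation is easy and gives $n_x = \beta h_1/2$. Similarly,

$$n_y = \beta\int_\Omega \frac{\partial\phi_5}{\partial y} = \beta\int_{-1}^{1}\!\int_{-1}^{1}\left[\frac{l_1}{2}\frac{\partial\psi^{(3,1)}}{\partial\eta} + \frac{l_1}{2}\frac{\partial\psi^{(2,2)}}{\partial\eta} + \frac{l_2}{2}\frac{\partial\psi^{(1,3)}}{\partial\eta}\right] d\xi\, d\eta = -\beta l_2/2. \tag{3.13-356}$$

Thus, from $n_x^2 + n_y^2 = 1$, $\beta^{-2} = (h_1^2 + l_2^2)/4$ to give, finally,

$$n_x = h_1/\sqrt{h_1^2 + l_2^2}, \qquad n_y = -l_2/\sqrt{h_1^2 + l_2^2}, \tag{3.13-357}$$

which is, interestingly, the negative of the local consistent normal for node 5 when viewed from the 'phantom' element 4 in the sketch, and called $\mathbf{n}^G$ there. It then follows that $\mathbf{n}\cdot\mathbf{u} = n_x u_5 + n_y v_5$, and the $\mathbf{n}\cdot\mathbf{u} = 0$ slippery BC is thus $h_1 u_5 = l_2 v_5$.

Now let us return to the 'continuity' equation for element 2 for $\nu = 0$: (3.13-352) gives

$$h_2\left(\frac{\partial u}{\partial t} + \mathbf{u}\cdot\nabla u + \partial P/\partial x\right) - l_1\left(\frac{\partial v}{\partial t} + \mathbf{u}\cdot\nabla v + \partial P/\partial y\right) = 0, \tag{3.13-358}$$

which is the local (element-level) version of $\mathbf{n}\cdot(\partial\mathbf{u}/\partial t + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P) = 0$, except here

$$n_x = h_2/\sqrt{h_2^2 + l_1^2}, \qquad n_y = -l_1/\sqrt{h_2^2 + l_1^2}; \tag{3.13-359}$$

i.e., we have two versions of a unit normal vector: one based on local mass conservation and one based on global mass conservation. Denoting these as $\mathbf{n}^L$ and $\mathbf{n}^G$, respectively, the inviscid flow past the corner seems to satisfy

$$\mathbf{n}^L\cdot\left(\frac{\partial\mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + \nabla P\right) = 0, \tag{3.13-360}$$

which serves as a 'mass-conserving' PPE BC, and

$$\mathbf{n}^G\cdot\frac{\partial\mathbf{u}}{\partial t} = 0, \tag{3.13-361}$$

which is the no-normal-flow-and-mass-conserving momentum equation BC. It is (3.13-361), $h_1\dot u_5 = l_2\dot v_5$, that will be satisfied by the discrete equations.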
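The difference between the local and global consistent normals at node 5 can be made concrete with a few lines of arithmetic. The sketch below assumes an element layout inferred from the results quoted above (element 1 of size $l_1 \times h_1$, element 2 of size $l_1 \times h_2$, element 3 of size $l_2 \times h_2$, with node 5 at the top-right, bottom-right, and bottom-left corner of each, respectively); that layout, the function names, and the sample dimensions are assumptions for illustration only. It uses the fact that, for a bilinear basis function on a rectangle of width $l$ and height $h$, the integrals of its $x$- and $y$-derivatives over the element are $\pm h/2$ and $\pm l/2$.

```python
import math


def unit(vx, vy):
    """Normalize a 2D vector."""
    r = math.hypot(vx, vy)
    return vx / r, vy / r


def global_normal(h1, h2, l1, l2):
    """Globally mass-consistent normal at node 5 (three fluid elements).

    Contributions are the element integrals of d(phi_5)/dx and d(phi_5)/dy;
    the signs follow from the corner that node 5 occupies in each element.
    """
    gx = 0.5 * h1 + 0.5 * h2 - 0.5 * h2   # elements 1, 2, 3
    gy = 0.5 * l1 - 0.5 * l1 - 0.5 * l2   # elements 1, 2, 3
    return unit(gx, gy)


def local_normal(h2, l1):
    """Element-consistent normal at node 5 as seen from element 2."""
    return unit(h2, -l1)


h1, h2, l1, l2 = 1.0, 2.0, 1.0, 3.0
nG = global_normal(h1, h2, l1, l2)
nL = local_normal(h2, l1)

# Slip velocity satisfying the global no-penetration condition h1*u5 = l2*v5.
u5 = 1.0
v5 = h1 * u5 / l2
nG_dot_u = nG[0] * u5 + nG[1] * v5
nL_dot_u = nL[0] * u5 + nL[1] * v5
print(nG, nL, nG_dot_u, nL_dot_u)
```

On a uniform grid ($h_1 = h_2$, $l_1 = l_2$) the two normals coincide; otherwise the slip velocity that is tangent with respect to $\mathbf{n}^G$ has a nonzero component along $\mathbf{n}^L$, proportional to $(h_2 - h_1 l_1/l_2)u_5$.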
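The Neumann BC for the PPE is thus $\partial P/\partial n = -\mathbf{n}^L\cdot[(\mathbf{u}\cdot\nabla\mathbf{u}) + \partial\mathbf{u}/\partial t]$, with $\mathbf{n}^L\cdot\mathbf{u} = (h_2 - h_1(l_1/l_2))u_5 \ne 0$ rather than the 'expected' result, $\partial P/\partial n = -\mathbf{n}\cdot(\mathbf{u}\cdot\nabla\mathbf{u})$. Again, it seems that the GFEM does what it must to enforce the discrete version of $\nabla\cdot\mathbf{u} = 0$; i.e., it can apparently be somewhat indiscreet in its discrete enforcement. If $l_2 = l_1$ and $h_2 = h_1$, then $\mathbf{n}^G = \mathbf{n}^L$, and the discrepancy disappears; uniform grids do have certain advantages.

i. Div u = 0 as a PPE BC

In Section 3.10.4 we discussed the ill-posedness of the SPPE (as first presented) and mentioned that it could lead to a well-posed problem if appended with the additional BC, $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$. Here we present a weak formulation of this approach and show how it might discretize for one simple case. Consider first the problem

$$\partial\mathbf{u}/\partial t + \nabla P = \nu\nabla^2\mathbf{u} + \mathbf{f} \tag{3.13-362}$$

and

$$\nabla^2 P = \nabla\cdot\mathbf{f} \quad\text{in } \Omega, \tag{3.13-363}$$

with BC's $\mathbf{u} = \mathbf{w}$ and $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} + \mathbf{f} - \partial\mathbf{u}/\partial t)$ on $\Gamma$, and IC $\mathbf{u} = \mathbf{u}_0$, with $\nabla\cdot\mathbf{u}_0 = 0$. But as shown in Section 3.10.4, this problem is not well-posed (it has an infinite number of solutions), and many of those solutions do not satisfy $\nabla\cdot\mathbf{u} = 0$. So, here we recover uniqueness, and divergence-free-ness, by explicitly adding another BC: $\nabla\cdot\mathbf{u} = 0$ on $\Gamma$. A weak formulation of the latter SPPE problem might go roughly like so: Find $\mathbf{u} \in H^1$ and $P \in H^1/R$ via

$$\int_\Omega \mathbf{v}\cdot\frac{\partial\mathbf{u}}{\partial t} + \nu\int_\Omega \nabla\mathbf{v} : (\nabla\mathbf{u})^T + \int_\Omega \mathbf{v}\cdot\nabla P = \int_\Omega \mathbf{v}\cdot\mathbf{f} \quad \forall\,\mathbf{v} \in H^1_0, \tag{3.13-364}$$

$$\int_\Omega \nabla\phi\cdot\nabla P = \int_\Omega \mathbf{f}\cdot\nabla\phi \quad \forall\,\phi \in H^1_0, \tag{3.13-365}$$

and

$$\int_\Gamma \psi\,\nabla\cdot\mathbf{u} = 0 \quad \forall\,\psi \in L^2, \tag{3.13-366}$$

where (3.13-365) is obtained from (3.13-363) via the usual integration by parts and noting that the boundary integrals vanish because $\phi = 0$ on $\Gamma$. The functions $\{\psi\}$ could, in practice, be just those $\phi$'s 'omitted from' (3.13-365) by the restriction $\phi \in H^1_0$; i.e., we replace BC's on (3.13-363) by $\nabla\cdot\mathbf{u} = 0$ there.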
There are three new features when discretization of these equations is contemplated: (i) we cannot use Q\Qo, since (3.13-365) requires P €//', not just L2, and (ii) the boundary evaluation of V • u, and (iii) the boundary evaluation of vnV2u. So, using Figure 3.13-25 and bilinear functions for both u and -ft, i.e. the Q\Q\ element, leads to—for node 0: 0 / vv-u f° h/2 + y Js h fN h/2 - y N (h/2-y) (h/2 + y) , v0 - vs] (us usw) ■ ,. +(«o uw) ■ ., + In In h , W2 -30,, , (h/2 + y) vN- (Uo Uw)- ,, + (W/V Unw)- , + In h h dy vo' dy,
THE FINITE ELEMENT EQUATIONS/DISCRETIZATION OF THE WEAK FORM 587

Fig. 3.13-25 A boundary 2-patch.

which finally leads to

$$\frac{1}{6l}\left[u_S - u_{SW} + 4(u_0 - u_W) + u_N - u_{NW}\right] + \frac{v_N - v_S}{2h} = 0$$ (3.13-367)

as a typical 'boundary' statement of $\nabla\cdot\mathbf{u} = 0$, and seemingly somewhat 'defective' (although still legitimate) in that no internal $v$-nodes appear. Also, of course, $\mathbf{u}$ on $\Gamma$ is replaced by $\mathbf{w}$ there. If the 'bulk' continuity equation were applied at node 0 instead, the result would be [from the element matrices in (3.13-262)]

$$\frac{1}{6l}\left[u_S - u_{SW} + 4(u_0 - u_W) + u_N - u_{NW}\right] + \frac{1}{3}\cdot\frac{1}{2h}\left[2(v_N - v_S) + (v_{NW} - v_{SW})\right] = 0,$$ (3.13-368)

a somewhat 'better looking' approximation.

This entire formulation, while feasible, does not seem highly advisable for at least two reasons: (i) there will be no discrete version of $\nabla\cdot\mathbf{u} = 0$ except that in (3.13-366) on $\Gamma$, and (ii) the boundary coupling is awkward. But it may have an additional advantage that is worth exploring (not yet done, to our knowledge): it may (and probably does) stabilize $Q_1Q_1$, or any(!) equal-order element. Thus we conclude this little section in a rather speculative manner, but do point out that a successful FDM version of the idea seems to have been demonstrated by Schüller (1990).

j. $Q_1Q_0$ convergence proof

The analysis below relies heavily on the concepts shown in Figure 3.13-26, and it may be useful (for some) to first review the stability discussion in Section 3.13.2d, as well as Appendix 3.

Synopsis/Roadmap: The proof goes roughly like so, referring to the circled 'distances' in the figure:
(1) ① ≤ ② + ③.
(2) ③ ≤ ④ ≤ ⑤ + ⑥.
Fig. 3.13-26 Sketch for proving $Q_1Q_0$ convergence.

(3) Thus, ① ≤ ② + ⑤ + ⑥.
(4) Show that each of ②, ⑤, ⑥ is $O(h)$. Done; proof complete. [Showing ⑥ = $O(h)$ is easy; ② and ⑤ are not.]

Introductory remarks, per Figure 3.13-26:
(1) $\mathbf{u}$ is the exact Stokes velocity (from the weak solution of $\nabla P - \nu\nabla^2\mathbf{u} = \mathbf{f}$, $\nabla\cdot\mathbf{u} = 0$ in $\Omega$, $\mathbf{u} = 0$ on $\Gamma$) and $\mathbf{u}^h \in J^h$ is the $Q_1Q_0$ numerical approximation to $\mathbf{u}$. The former is divergence-free but not discretely divergence-free, and the latter is discretely divergence-free but not divergence-free.
(2) The plane outside of $V^h$ is the divergence-free subspace of $H^1$ that we call $J$. ($V^h$ is a subspace of $H^1$, but it is not a subspace of $J$. Thus, unfortunately, $\mathbf{u}^h$ is not obtainable as a divergence-free projection of $\mathbf{u}$. It is, however, a discretely divergence-free projection of $\mathbf{u}$ and $\nabla P$; see Appendix 3.)
(3) $p_I^h$ is the interpolation projection from $H^1$ to $V^h$.
(4) $p_V^h$ is the $H^1$-orthogonal projection from $H^1$ to $V^h$.
(5) $p_J^h$ is the $H^1$-orthogonal projection from $H^1$ to $J^h$ (the discretely divergence-free subspace of $V^h$).
(6) The $p_V^h$-projection is not essential (because, as shown in Appendix 3, $p_J^h = p_J^h p_V^h$, where $p_J^h$ is essential), but it does serve to point out some interesting facts: (i) $\|\mathbf{u} - p_V^h\mathbf{u}\|_1 \le \|\mathbf{u} - p_J^h\mathbf{u}\|_1 = \|\mathbf{u} - p_J^h p_V^h\mathbf{u}\|_1$, and (ii) $\|\mathbf{u} - p_V^h\mathbf{u}\|_1 \le \|\mathbf{u} - \mathbf{u}^h\|_1$. The closest vector (in the $H^1$-norm) in $V^h$ to the Stokes velocity is not in $J^h$; also, it is neither divergence-free nor discretely divergence-free. Finally, the approximate solution is not the closest discretely divergence-free vector to the Stokes velocity; $\|\mathbf{u} - p_J^h\mathbf{u}\|_1 < \|\mathbf{u} - \mathbf{u}^h\|_1$.

Background material:
(1) The important but 'technical' (read 'difficult') 1981 paper by Malkus.
(2) The important but 'technical' 1984 paper by Malkus and Olsen [see too Malkus and Olsen (1982) for a less technical discussion].
(3) Appendix 3 describes the above projections in great detail.

We now address one of the most difficult tasks facing authors on the subject of this book: explaining (or perhaps rationalizing) one of the most popular (perhaps arguably) finite elements extant for incompressible flow (and incompressible elasticity?). We are referring, of course, to the $Q_1Q_0$ element: the bilinear velocity, piecewise-constant pressure element on isoparametric quadrilaterals, and its 3D analog, trilinear velocity, piecewise-constant pressure on isoparametric bricks. (It may be the case that its greatest relative popularity is in 3D.) We now embark on this task, hopefully profitably; i.e., we shall make a serious attempt at explaining the ostensibly paradoxical success of this slightly unstable but highly usable element. Worth first recalling, though, from Brezzi and Fortin (1991, p. 215) is, 'However simple it may look, the $Q_1Q_0$ element is one of the hardest elements to analyze, and many questions are still open about its properties.' And this, more than 20 years after Fortin (1972) first discovered some of these properties!

Remark: Note that many others call it $Q_1P_0$ rather than $Q_1Q_0$, but since $Q_1P_0$ has, in many circles (see below), generated a bad reputation, we have changed its name; $Q_0$ means 'constant on the Quadrilateral.' We also hope to change its reputation.

Before presenting our convergence proof for $Q_1Q_0$, which may be interpreted as our version/interpretation of the Malkus-Olsen theory (or at least a piece of that theory, since we focus on a single element and they studied several), we make some introductory and motivational remarks:
1.
The Malkus-Olsen theory uses the LBB theory, in conjunction with some results of Mercier, to prove some more general results by showing that the error in the $Q_1Q_0$ velocity (not pressure) is bounded by the sum of the smallest possible $L^2$-error in the pressure and a computable error bound, using the $H^1$-projection of the interpolant to the same discretely divergence-free subspace of $H^1$ in which the GFEM velocity resides, $J^h$.
2. The goal (obviously) is to estimate the velocity error, $\|\mathbf{u}^h - \mathbf{u}\|_1$, which will, of course, imply stability if convergence can be proven. (A sort of 'inverse' Lax theorem???)
3. The $H^1$-projection of $\mathbf{u}$ to the discretely divergence-free subspace will be seen to play an important role; see Appendix 3 for details of this projection.
4. Recall (or see, for example, Strang and Fix, 1973, or Appendix 3) that for 'simple' elliptic problems, the GFEM solution is, because it is the best approximation, more accurate than is the interpolant of the exact solution, and that both converge at the same rate. Here, because we are using a mixed method, we lose the best approximation property, but we will still find a good use for the interpolant, $\mathbf{u}_I^h$, by showing that $\mathbf{u}^h$ still converges at the same rate as $\mathbf{u}_I^h$, where it is important to realize/note that although $\mathbf{u}_I^h$ is neither divergence-free nor discretely divergence-free, it nevertheless converges to the proper divergence-free result as $h \to 0$.
5. Just as the Malkus-Olsen theory (Malkus and Olsen, 1984) 'follows in the spirit of Mercier,' by showing that 'when a "good" approximation exists in the null space of
the discrete divergence operator, velocity convergence can take place without the LBB condition,' so does ours follow in the spirit of Malkus-Olsen; i.e., it is not simply a re-statement of their analysis.

To be 'fair,' we should at least briefly turn to the other side of the ledger and make some de-motivating remarks, most of which have appeared in the literature:
1. 'Moreover, they have shown (Boland and Nicolaides, 1985) that there exist data f for which the pressure approximations do not converge and that it is also possible to set up problems for which the velocity approximations do not converge as well' (Gunzburger, 1989, p. 24). Two remarks on this remark: (i) we believe that the 'data' referred to above are far beyond what reasonable people would deem reasonable; and (ii) a useful summary of the Boland-Nicolaides theory ('patch test') is available in Thatcher (1993). See too Stenberg (1984) who, according to Thatcher (1990), 'extended and simplified' the method.
2. 'Only elements satisfying the consistency condition are recommended in a finite element program for viscous flow computation using a mixed velocity/pressure formulation' (Carey and Oden, 1986, p. 138).
3. 'On the other hand, one might hesitate to disqualify elements, like the $Q_1P_0$ element, which do not satisfy the inf-sup condition, on the argument that the condition may be too strong. These doubts are not confirmed by experience' (Chapelle and Bathe, 1993).
4. 'Given any constant c, however large, a flow (u, P) (dependent on c) exists such that the ratio of the norm of the finite element error to the norm of the best approximation error of this flow exceeds c for some h. A stronger statement is also true, namely that a flow (u, P) exists for which the finite element scheme produces nonconvergent approximations.' (Boland and Nicolaides, 1984). [See Remark (1).]
5.
'The $Q_1P_0$ element was introduced more than a decade ago and then one did not know so much about the troubles with mixed finite elements. Now the situation is different and in view of what we know, the element cannot be recommended.' And, 'The $Q_1P_0$ element has done a good job (when in skilled hands), but every product has its lifespan.' (R. Stenberg, personal communication, 1991).

So, returning to Figure 3.13-26, we present our version of a convergence theory for the $Q_1Q_0$ element in a baker's dozen (or so) steps:
1. To begin, we introduce and utilize the $H^1$-orthogonal projection (with projection operator $p_J^h$; see Appendix 3) of the exact velocity to the discretely divergence-free subspace $J^h$ via $\mathbf{u}^h - \mathbf{u} = \mathbf{u}^h - p_J^h\mathbf{u} + p_J^h\mathbf{u} - \mathbf{u}$ and the triangle inequality to get
$$\|\mathbf{u}^h - \mathbf{u}\|_1 \le \|\mathbf{u}^h - p_J^h\mathbf{u}\|_1 + \|p_J^h\mathbf{u} - \mathbf{u}\|_1,$$
and we shall estimate (bound) separately both members of the RHS, with the goal being to prove that $\|\mathbf{u}^h - \mathbf{u}\|_1 = O(h)$; i.e., that velocity convergence occurs when the mesh is refined, and that it occurs at the optimal rate.
2. The following theorem is due to Mercier (1979a, b) and is crucial to our proof:
$$\|\mathbf{u}^h - p_J^h\mathbf{u}\|_1 \le c_1\|p_0^h P - P\|_0,$$
where $P$ is the Stokes pressure, and $p_0^h P$ is its $L^2$-orthogonal projection (the average pressure over each element, not shown in Figure 3.13-26) to the discrete pressure space, see Appendix 3; i.e., the distance between the numerical solution and the $H^1$-orthogonal projection of the exact solution to the discretely divergence-free subspace depends (only) on the quality of the pressure space
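approximability; or, equivalently, the numerical solution approximates the discretely divergence-free $H^1$-projection of the exact velocity as well as the pressure basis functions can approximate the pressure. Since this assertion (theorem) is surely not obvious, we prove it below, following Olsen (1983):

(i) The weak form of the Stokes equation is
$$\int \nabla\mathbf{u} : (\nabla\mathbf{v})^T - \int P\,\nabla\cdot\mathbf{v} = \int \mathbf{f}\cdot\mathbf{v} \quad \forall\,\mathbf{v} \in H_0^1.$$

(ii) The weak form of the numerical (GFEM) solution is
$$\int \nabla\mathbf{u}^h : (\nabla\mathbf{v}^h)^T - \int P^h\,\nabla\cdot\mathbf{v}^h = \int \mathbf{f}\cdot\mathbf{v}^h \quad \forall\,\mathbf{v}^h \in V^h \subset H_0^1.$$

(iii) Setting $\mathbf{v} = \mathbf{v}^h$ in (i), i.e., using only those test functions of the finite-dimensional approximation, and subtracting generates the following 'principal error equation,'
$$\int \nabla(\mathbf{u} - \mathbf{u}^h) : (\nabla\mathbf{v}^h)^T = \int (P - P^h)\,\nabla\cdot\mathbf{v}^h \quad \forall\,\mathbf{v}^h \in V^h.$$

(iv) Selecting a particularly useful $\mathbf{v}^h \in J^h \subset V^h$, namely, $\mathbf{v}^h = p_J^h\mathbf{u} - \mathbf{u}^h = p_J^h(\mathbf{u} - \mathbf{u}^h)$ because (see Appendix 3) $p_J^h\mathbf{u}^h = \mathbf{u}^h$, gives
$$\int \nabla(\mathbf{u} - \mathbf{u}^h) : [\nabla p_J^h(\mathbf{u} - \mathbf{u}^h)]^T = \int (P - P^h)\,\nabla\cdot p_J^h(\mathbf{u} - \mathbf{u}^h).$$

(v) Using $\mathbf{u} - \mathbf{u}^h = p_J^h(\mathbf{u} - \mathbf{u}^h) + Q_J^h(\mathbf{u} - \mathbf{u}^h)$, where $Q_J^h = I - p_J^h$, and the fact (see Appendix 3) that $p_J^h$ is $H^1$-orthogonal to $Q_J^h$,
$$\int \nabla p_J^h(\;) : [\nabla Q_J^h(\;)]^T = 0,$$
gives
$$\int \nabla(\mathbf{u} - \mathbf{u}^h) : [\nabla p_J^h(\mathbf{u} - \mathbf{u}^h)]^T = \int [\nabla p_J^h(\mathbf{u} - \mathbf{u}^h) + \nabla Q_J^h(\mathbf{u} - \mathbf{u}^h)] : [\nabla p_J^h(\mathbf{u} - \mathbf{u}^h)]^T = \int \nabla p_J^h(\mathbf{u} - \mathbf{u}^h) : [\nabla p_J^h(\mathbf{u} - \mathbf{u}^h)]^T = \|p_J^h\mathbf{u} - \mathbf{u}^h\|_1^2,$$
and thus
$$\|p_J^h\mathbf{u} - \mathbf{u}^h\|_1^2 = \int (P - P^h)\,\nabla\cdot p_J^h(\mathbf{u} - \mathbf{u}^h).$$

(vi) We now turn to the RHS; since $P^h \in S^h$ and $p_J^h(\mathbf{u} - \mathbf{u}^h) \in J^h$, and by the definition of $J^h$: $\int q^h\,\nabla\cdot\mathbf{v}^h = 0$ $\forall\,q^h \in S^h$ and $\forall\,\mathbf{v}^h \in J^h$, the second term on the RHS of the above equation is identically zero. But another (and useful) term that is also identically zero for the same reason is $\int (p_0^h P)\,\nabla\cdot p_J^h(\mathbf{u} - \mathbf{u}^h)$. Thus, we can replace $P^h$ above by the $L^2$-projection of $P$ (replacing one zero term by another) to obtain
$$\|\mathbf{u}^h - p_J^h\mathbf{u}\|_1^2 = \int (P - p_0^h P)\,\nabla\cdot p_J^h(\mathbf{u} - \mathbf{u}^h)$$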
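and then use the Cauchy-Schwarz inequality to obtain
$$\|\mathbf{u}^h - p_J^h\mathbf{u}\|_1^2 \le \|P - p_0^h P\|_0 \cdot \|\nabla\cdot p_J^h(\mathbf{u} - \mathbf{u}^h)\|_0.$$

(vii) Now consider
$$\|\nabla\cdot\mathbf{u}\|_0^2 = \int \Big(\sum_{i=1}^d u_{i,i}\Big)^2 \le d\int \sum_{i=1}^d u_{i,i}^2,$$
obtained by using (again) the Cauchy-Schwarz inequality, this time in the form
$$\Big(\sum_{i=1}^d a_i b_i\Big)^2 \le \sum_{i=1}^d a_i^2 \sum_{i=1}^d b_i^2 \quad \text{with } b_i = 1.$$

(viii) Next, use the obvious inequality $\int \sum_{i=1}^d u_{i,i}^2 \le \int \sum_{i=1}^d \sum_{j=1}^d u_{i,j}^2 = \|\mathbf{u}\|_1^2$ to obtain $\|\nabla\cdot\mathbf{u}\|_0 \le \sqrt{d}\,\|\mathbf{u}\|_1$, and thus
$$\|\nabla\cdot p_J^h(\mathbf{u} - \mathbf{u}^h)\|_0 \le \sqrt{d}\,\|p_J^h(\mathbf{u} - \mathbf{u}^h)\|_1 = \sqrt{d}\,\|\mathbf{u}^h - p_J^h\mathbf{u}\|_1.$$

(ix) Thus, finally, $\|\mathbf{u}^h - p_J^h\mathbf{u}\|_1^2 \le \|p_0^h P - P\|_0 \cdot \sqrt{d}\,\|\mathbf{u}^h - p_J^h\mathbf{u}\|_1$; i.e.,
$$\|\mathbf{u}^h - p_J^h\mathbf{u}\|_1 \le c_1\|p_0^h P - P\|_0, \quad \text{where } c_1 = \sqrt{d}. \quad \text{QED}$$

3. Thus, we arrive at [see step (1)]
$$\|\mathbf{u}^h - \mathbf{u}\|_1 \le c_1\|p_0^h P - P\|_0 + \|\mathbf{u} - p_J^h\mathbf{u}\|_1,$$
a result that is similar to (different constants) equation (1.11) of Girault and Raviart (1986), to equation (2.18) of Gunzburger (1986), and to equation (2.11) of Brezzi and Fortin (1991), to name three recent texts. But we shall take this result 'farther' than did any of the above authors, starting with the approximation theory result, $\|p_0^h P - P\|_0 = c_2 h$ [on uniform grids, the error is actually smaller yet: $O(h^2)$], to write
$$\|\mathbf{u}^h - \mathbf{u}\|_1 \le c_1 c_2 h + \|\mathbf{u} - p_J^h\mathbf{u}\|_1.$$
Remark: If the difference between a divergence-free velocity and its discretely divergence-free $H^1$-projection can be shown to approach zero like $O(h^k)$, then the Malkus-Olsen (1984) Constrained Approximation Condition (CAC) will be satisfied to $O(h^k)$.

4. To deal with the second term on the RHS, we begin with the obvious inequality
$$\|\mathbf{u} - p_J^h\mathbf{u}\|_1 \le \|\mathbf{v}^h - \mathbf{u}\|_1 \quad \forall\,\mathbf{v}^h \in J^h,$$
because $p_J^h\mathbf{u}$ is the closest function in $J^h$ to $\mathbf{u}$.

5. Next, take a particularly useful $\mathbf{v}^h$, namely, $\mathbf{v}^h = p_J^h\mathbf{u}_I^h = p_J^h p_I^h\mathbf{u}$, so that $\mathbf{v}^h - \mathbf{u} = p_J^h\mathbf{u}_I^h - \mathbf{u}_I^h + \mathbf{u}_I^h - \mathbf{u}$ to give, via the triangle inequality,
$$\|\mathbf{v}^h - \mathbf{u}\|_1 = \|p_J^h\mathbf{u}_I^h - \mathbf{u}\|_1 \le \|p_J^h\mathbf{u}_I^h - \mathbf{u}_I^h\|_1 + \|\mathbf{u}_I^h - \mathbf{u}\|_1$$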
and thus
$$\|\mathbf{u}^h - \mathbf{u}\|_1 \le c_1 c_2 h + \|p_J^h\mathbf{u}_I^h - \mathbf{u}_I^h\|_1 + \|\mathbf{u}_I^h - \mathbf{u}\|_1.$$

6. Another application of approximation theory yields $\|\mathbf{u}_I^h - \mathbf{u}\|_1 = c_3 h$ to give, with $c_4 = c_1 c_2 + c_3$,
$$\|\mathbf{u}^h - \mathbf{u}\|_1 \le c_4 h + \|p_J^h\mathbf{u}_I^h - \mathbf{u}_I^h\|_1.$$

7. All we need to show now is that the interpolant of $\mathbf{u}$ into the bilinear basis, $\mathbf{u}_I^h$, is close enough, $O(h)$ or better, to its discretely divergence-free projection, a seemingly trivial task. (But it is actually not very trivial, at least 'our' way.) We now seek $\mathbf{w}^h = p_J^h\mathbf{u}_I^h - \mathbf{u}_I^h$ and its magnitude, $\|\mathbf{w}^h\|_1$. This requires finding the discretely divergence-free $H^1$-projection of $\mathbf{u}_I^h$ onto $J^h$; i.e., $p_J^h\mathbf{u}_I^h$. From Appendix 3, we know that $p_J^h\mathbf{u}_I^h$ is obtained by solving the following algebraic system:
$$Kv + Cq = K\tilde{u} \quad \text{and} \quad C^T v = 0$$
for $q$ (the Lagrange multiplier, at element centroids, associated with the projection) and $v$ [the nodal values of the projection $p_J^h\mathbf{u}_I^h = \sum_{j=1}^n v_j\phi_j(\mathbf{x})$], where $\tilde{u}$ is the vector of nodal values of the interpolant, $\mathbf{u}_I^h(\mathbf{x}) = \sum_{j=1}^n \tilde{u}_j\phi_j(\mathbf{x})$, where $\tilde{u}_j = \mathbf{u}(\mathbf{x}_j)$. The nodal values of $\mathbf{w}^h(\mathbf{x}) = \sum_{j=1}^n w_j\phi_j(\mathbf{x})$ are then $w_j = v_j - \tilde{u}_j$, $j = 1, 2, \ldots, n$, and we seek $\|\mathbf{w}^h\|_1 = (w^T K w)^{1/2} = \|p_J^h\mathbf{u}_I^h - \mathbf{u}_I^h\|_1$. Solving the algebraic system of the projection yields
$$q = (C^T K^{-1} C)^{-1} C^T\tilde{u} \quad \text{and} \quad w = -K^{-1}Cq = -K^{-1}C(C^T K^{-1} C)^{-1}C^T\tilde{u},$$
and thus
$$w^T K w = [-K^{-1}C(C^T K^{-1} C)^{-1}C^T\tilde{u}]^T K [-K^{-1}C(C^T K^{-1} C)^{-1}C^T\tilde{u}] = (C^T\tilde{u})^T (C^T K^{-1} C)^{-1} (C^T\tilde{u}).$$
Remark: $w$ is the minimum norm solution of $C^T w = -C^T\tilde{u}$, i.e., $w^T K w$ is a minimum. $w$ has the same discrete divergence as $\tilde{u}$, but $w$ is smaller than $\tilde{u}$ in $H^1$. [It is in fact much smaller, $\mathbf{w}^h = O(h)$, as we shall see; and, of course, $\tilde{u} \sim O(1)$.]

8. Introducing now (for reasons soon to become clear) the mass matrix on $S^h \subset L^2$, $Q_{ij} = \int \psi_i\psi_j$, yields
$$w^T K w = (Q^{-1/2}C^T\tilde{u})^T [Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}] [Q^{-1/2}C^T\tilde{u}] \le \|Q^{-1/2}C^T\tilde{u}\|_2^2 \cdot \|Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}\|_2$$
from $x^T y \le \|x\| \cdot \|y\|$, via Cauchy-Schwarz, and from $\|Ax\| \le \|A\| \cdot \|x\|$ (a property of an induced matrix norm) with $y = Ax$.
9.
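Now recall that the $L^2$-norm of a (symmetric positive-definite) matrix $A$ is given by its largest eigenvalue, $\|A\|_2 = \lambda_{\max}(A)$. So, we have
$$\|Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}\|_2 = \lambda_{\max}[Q^{1/2}(C^T K^{-1} C)^{-1}Q^{1/2}] = 1/\lambda_{\min}(Q^{-1/2}C^T K^{-1} C Q^{-1/2})$$
because $\lambda_{\max}(A) = 1/\lambda_{\min}(A^{-1})$. Thus, we have that
$$\|\mathbf{w}^h\|_1^2 = w^T K w \le \|Q^{-1/2}C^T\tilde{u}\|_2^2 / \lambda_{\min}(Q^{-1/2}C^T K^{-1} C Q^{-1/2}).$$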
10. But $Q^{-1/2}C^T K^{-1} C Q^{-1/2}$ is, or is equivalent to, an 'LBB stability matrix' (see Section 3.13.2d) with the following known results: (i) $\lambda_{\min} = k_h^2$, where $k_h$ is the LBB stability constant; (ii) $k_h = O(h)$ for a sequence of meshes of uniform rectangles or for quasi-uniform meshes generated by uniform refinement of an initially 'arbitrary' mesh of quadrilaterals (Chapelle and Bathe, 1993; and Griffiths and Silvester, 1994, some of whose results we discuss in the next section); and, importantly, (iii) $k_h = O(1)$ for most sequences of 'general' meshes [Malkus (1981); Brezzi and Fortin (1991, p. 244), who also stated that 'This last fact is still resisting analysis'; Malkus and Olsen (1984); Silvester and Gresho (1992, unpublished experiments); and Griffiths and Silvester (1994)]. Admittedly, the need to lean on some experimental results weakens (negates, in fact, for most mathematicians) the alleged 'proof.' But we, and ostensibly many others, can live with this 'problem.'

11. $\|Q^{-1/2}C^T\tilde{u}\|_2^2 = (Q^{-1}C^T\tilde{u})^T Q\,(Q^{-1}C^T\tilde{u}) = \|Q^{-1}C^T\tilde{u}\|_Q^2$, where (recall, or see Appendix 3) the matrix $-Q^{-1}C^T$ corresponds to the weak divergence operator; i.e.,
$$\nabla_h\cdot\mathbf{u}(\mathbf{x}) = -\sum_{j=1}^N (Q^{-1}C^T\tilde{u})_j\,\psi_j(\mathbf{x}),$$
where $\tilde{u}_j$ is the value of $\mathbf{u}(\mathbf{x})$ at node $j$, and thus [see too (3.13-141)]
$$\|\nabla_h\cdot\mathbf{u}\|_0^2 = \sum_j\sum_k (Q^{-1}C^T\tilde{u})_j (Q^{-1}C^T\tilde{u})_k \int \psi_j\psi_k = \|Q^{-1}C^T\tilde{u}\|_Q^2,$$
to give $\|Q^{-1/2}C^T\tilde{u}\|_2 = \|\nabla_h\cdot\mathbf{u}_I^h\|_0 = \|Q^{-1}C^T\tilde{u}\|_Q$.

12. But from Johnson and Pitkaranta (1982) and Oden et al. (1982) (see also Olsen, 1983) we have the following 'superapproximation' result on a mesh of rectangles for $\mathbf{u}_I^h$: $\|Q^{-1}C^T\tilde{u}\|_Q = O(h^2)$ rather than the expected $O(h)$; the discrete divergence of the (bilinear) interpolant of a divergence-free vector is 'extra small,' another attribute of the $Q_1Q_0$ element. [For a general mesh of quadrilaterals, the normal approximation theory result, $\|Q^{-1}C^T\tilde{u}\|_Q = O(h)$, obtains.]
13.
Thus, finally, we have
$$\|\mathbf{w}^h\|_1 \le \|Q^{-1}C^T\tilde{u}\|_Q \Big/ \sqrt{\lambda_{\min}(Q^{-1/2}C^T K^{-1} C Q^{-1/2})} = \|Q^{-1}C^T\tilde{u}\|_Q / k_h;$$
and we (finally) see why the discrete projection of the interpolant to the discretely divergence-free subspace [a la step (7)] is perhaps not so trivial. It is, or rather it could be, amplified by a badly behaved LBB 'constant'. From steps (8)-(11), we obtain $\|\mathbf{w}^h\|_1 \le O(h^2)/O(h)$ for rectangles (or quasi-uniform mesh refinements) and $\|\mathbf{w}^h\|_1 \le O(h)/O(1)$ for a general mesh where, in the latter case, we have no superapproximation of $(\nabla_h\cdot)$ to aid us; and, fortunately, we do not need it.
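The matrix algebra of steps (7) through (13) can be checked numerically. What follows is a minimal sketch only, assuming NumPy and small random stand-in matrices (a symmetric positive-definite K, a full-column-rank C, and a diagonal Q) rather than actual $Q_1Q_0$ matrices; the function names are hypothetical and chosen purely for illustration.

```python
import numpy as np


def project_to_discretely_div_free(K, C, u):
    """Solve K v + C q = K u, C^T v = 0, as in step (7).

    Returns (v, q, w), where w = v - u is the correction to u.
    """
    S = C.T @ np.linalg.solve(K, C)      # C^T K^{-1} C
    q = np.linalg.solve(S, C.T @ u)      # Lagrange multiplier
    w = -np.linalg.solve(K, C @ q)       # w = -K^{-1} C q
    return u + w, q, w


def bound_terms(K, C, Q_diag, u):
    """Return (w^T K w, ||Q^{-1/2} C^T u||_2^2, lambda_min) for diagonal Q.

    lambda_min is the smallest eigenvalue of Q^{-1/2} C^T K^{-1} C Q^{-1/2}.
    """
    _, _, w = project_to_discretely_div_free(K, C, u)
    d = C.T @ u                          # discrete divergence data C^T u
    S = C.T @ np.linalg.solve(K, C)
    s = 1.0 / np.sqrt(Q_diag)
    B = (s[:, None] * S) * s[None, :]    # Q^{-1/2} S Q^{-1/2}
    B = 0.5 * (B + B.T)                  # remove round-off asymmetry
    lam_min = np.linalg.eigvalsh(B)[0]   # eigenvalues come back ascending
    return w @ K @ w, np.sum(d**2 / Q_diag), lam_min


# Random stand-ins (assumptions, not finite element matrices).
rng = np.random.default_rng(0)
n, m = 12, 5
A = rng.standard_normal((n, n))
K = A @ A.T + n * np.eye(n)              # symmetric positive definite
C = rng.standard_normal((n, m))          # full column rank (generically)
Q_diag = rng.uniform(0.5, 2.0, m)        # diagonal of Q ("element areas")
u = rng.standard_normal(n)

energy, dnorm2, lam_min = bound_terms(K, C, Q_diag, u)
print("w^T K w                  :", energy)
print("bound ||.||^2/lambda_min :", dnorm2 / lam_min)
```

The diagonal Q mirrors the piecewise-constant pressure space, for which the mass matrix is simply the diagonal matrix of element areas. With genuine $Q_1Q_0$ matrices, the spurious pressure modes would presumably have to be removed first so that $C^T K^{-1} C$ is nonsingular.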
14. Finally, inserting these last results, $\|\mathbf{w}^h\|_1 \le c_5 h$, into step (6) gives the final result: $\|\mathbf{u}^h - \mathbf{u}\|_1 \le ch$; the LBB-unstable $Q_1Q_0$ element converges at the optimal rate.

Finally, lest we be accused of the worst form of CFD (Charlatan Fluid Dynamics; another form of CFD, not quite so bad, is Colorful Fluid Dynamics, in which GREAT GRAFIX can replace/displace serious analysis of numerical results), we hasten to add the following remarks, partly to somewhat temper the above conclusion (but in no way negating it):
1. A wealth of laboratory experience seems to support this conclusion, or at least does not refute it.
2. Convergence of the pressure is in no way implied in the above; all we can offer here is the combination of experience and hope, the former of course being more useful; we have never seen divergent pressures, and our filtered pressures, from 'proper' meshes, are virtually always acceptable if not excellent.
3. 'Bad data' (i.e., the forcing function, $\mathbf{f}(\mathbf{x})$) could conceivably weaken the above conclusion; e.g., if the 'crazy' mesh-dependent data discussed by (or implied by) Boland and Nicolaides (1985) were imposed, the $Q_1Q_0$ element might fail to converge, which failure must (we believe) show up in an 'unstable' interpolant in the above proof. See also Arnold et al. (1984), who offer, among many other things, the following advice: 'Certainly formal insistence that the (LBB) stability condition is satisfied is inappropriate.' For counter arguments to our counter arguments, see the article by Nicolaides in Gunzburger and Nicolaides (1993).
4. We conjecture, but cannot prove, that the same convergence rate obtains for more general, and inhomogeneous, BC's than just $\mathbf{u} = 0$ on $\Gamma$, at least as long as the (spurious) CB-constraint equation is satisfied (or absent).
5.
If the LBB 'constant' was more badly behaved, say $k_h = O(h^3)$ on rectangles, then the result of this theory, $\|\mathbf{w}^h\|_1 \le O(h^{-1})$, would be that it too (like LBB theory) is simply not good enough. That is to say, it is very important to realize that the theory does not say that $\|\mathbf{w}^h\|_1 = \|\nabla_h\cdot\mathbf{u}_I^h\|_0/k_h$, only that $\|\mathbf{w}^h\|_1$ is bounded by $\|\nabla_h\cdot\mathbf{u}_I^h\|_0/k_h$; if this term $\to \infty$ for $h \to 0$, then the theory simply becomes silent/useless. Both theories can, if conditions are 'right,' be used to prove convergence. But, when conditions are wrong, they do not prove divergence! (in spite of numerous implied statements in the literature).
6. A quotation from Olsen (1983), to help wrap it up, is relevant here: 'This generally favorable computational experience, together with the fact that the LBB condition is only a sufficient condition for convergence, suggests that if the bilinear rectangle and crossed triangle elements fail to satisfy the LBB condition, it is as much the fault of the LBB condition as of the elements.' And another, from Malkus and Olsen (1984), '... when a "good" approximation exists in the null space of the discrete divergence operator, velocity convergence can take place without the LBB condition.'
7. Even Girault and Raviart (1986) seem to believe that the LBB condition is not so sacrosanct: 'On the other hand, quadrilateral elements (more precisely, rectangular elements) provide excellent examples of schemes which do not satisfy the inf-sup condition and yet can be proved to converge with optimal accuracy.'
Fig. 3.13-27 An example of a CB-precluding 'mesh'; each 4-patch is converted to a 5-patch.

8. For further discussion of results using the technique from which the above analysis was derived, see Malkus and Olsen (1984) and Appendix 4.II (by D. Malkus) in Hughes (1987), which, incidentally, is independently recommended reading. For example, the crossed linear triangle element (a macro element) is also 'optimally constrained' yet does not pass the LBB stability test.
9. For fans of $Q_1Q_0$ who want guaranteed optimal convergence of both $\mathbf{u}$ and $P$ (with, however, larger error constants caused by the distorted shapes?), one way to assure this is to discretize via the macro elements of Figure 3.13-27, each composed of five $Q_1Q_0$ quadrilaterals. These meshes can be made, e.g., from a mesh of nine-node elements by 'splitting in two' the central node in each element. Such 'CB-killer' meshes have been employed in practice by (at least) J. Bathe (see, for example, Chapelle and Bathe, 1993) and J. Schutt (personal communication). Both the macro-element and the proof are due to Stenberg (1984).
10. A recent relevant reference that supports our position that an unstable element will behave nicely for a wide range of input data (but not for all data) and that demonstrates the notion in 1D is Babuska and Narisimhan (1997).

k. Quantitative description of some unstable modes

In Section 3.13.2b on filtering $Q_1Q_0$ pressure modes, we introduced the concept of 'pesky modes' which are, in fact, just those most unstable LBB modes, where 'most unstable' ⇒ smallest eigenvalue, though it turns out that there are many others that are also unstable, at least on uniform meshes. In this section, we quantify some of these modes, following Griffiths and Silvester (1994, 'GS') but generalizing (somewhat) their results, after noting that these modes were actually 'discovered' earlier (1992) in the CFD laboratory by Silvester and Gresho, using MATLAB.
Using the method of modified equations, GS obtained the following results for the eigenproblems of (3.13-150), (3.13-151), and (3.13-156) for a uniform mesh of square ($h \times h$) elements for the 'Dirichlet problem' on the unit square:

$$\mu_{m,n} = -\tfrac{3}{8}\pi^2h^2(m^2 + n^2)$$ (3.13-369)

for $m, n = 0, 1, 2, \ldots$ but 'small' relative to the number of elements in each direction; i.e., it is an asymptotic result and applies for $mh \ll 1$ and $nh \ll 1$. The corresponding
('LBB') eigenvectors, for the $m, n$ mode, are smooth modes that are modulated by the infamous CB-mode, and are thus no longer smooth:

$$\begin{bmatrix} u_{ij} \\ v_{ij} \\ q_{i+1/2,\,j+1/2} \end{bmatrix} = (-1)^{i+j} \begin{bmatrix} -[\ldots]\,\pi^2h^2\cos(im\pi h)\sin(jn\pi h) \\ -[\ldots]\,\pi^2h^2\sin(im\pi h)\cos(jn\pi h) \\ \pi\cos((i - 1/2)m\pi h)\cos((j - 1/2)n\pi h) \end{bmatrix}$$ (3.13-370)

where $x_i = ih$ and $y_j = jh$, $i, j = 1, 2, \ldots, 1/h - 1$ are the internal nodal locations, and we presume $1/h$ to be an integer, for simplicity.

Fig. 3.13-28 First of two most unstable LBB modes, 16 × 16 mesh: (a) raw pressure contours; (b) velocity; (c) smooth pressure contours; (d) smooth pressure mesh surface.

Note that $u_{ij}$ vanishes smoothly at the top and bottom, but roughly (in a boundary layer of thickness $h$) at the two sides (because $u = v = 0$ on $\Gamma$). Also noteworthy is that the 'velocity modes' are quite small in amplitude relative to the pressure mode, owing to the $h^2$ factor, which hopefully helps explain why these pesky modes are rarely seen in computations, although another reason that they are small or absent is that the 'CB-factor,' $(-1)^{i+j}$, will make these modes nearly orthogonal to virtually all 'reasonable' (not CB-ish) data. A final reason that the pesky velocity modes are sometimes absent is explained by the expansion (3.13-167), and the realization that the unstable modes are represented by the first 'few' terms of the
first summation, and noting that $g = 0$ suppresses all of the velocity portions; i.e., only $g \neq 0$ can excite the pesky velocities.

Setting $m = 1$ and $n = 0$ or $m = 0$ and $n = 1$ yields the lowest mode with nonzero $\mu$, and with it a quantitative value for the LBB constant [cf. (3.13-159) and (3.13-162)]:

$$k_h = \sqrt{\lambda_{0,1}} = \sqrt{\mu_{0,1}(\mu_{0,1} - 1)} = \pi h\sqrt{3/8} + O(h^2).$$ (3.13-371)

Figures 3.13-28 through 3.13-35, kindly supplied by D. Silvester, display the first few unstable modes pictorially on two meshes (to show 'mode similarity' with mesh refinement). They were obtained numerically, via MATLAB, not via (3.13-370), and were done, in part, to verify (3.13-370). Shown are the first four LBB unstable modes corresponding to (3.13-369) and (3.13-370); namely, for $(m, n) = (1, 0)$, $(0, 1)$, $(1, 1)$, and $(2, 0) + (0, 2)$, the last one being 'MATLAB's choice' since both modes have the same eigenvalue. In each plot, (a) shows the computed (and polluted) pressure part, (b) shows the corresponding velocities, (c) is the filtered pressure, and (d) is a 3D perspective/isometric plot of the smoothed pressures, sometimes rotated for better viewing. Even though the pictures speak well for themselves, a few remarks may be in order:
1. Clearly the same modes are present on both meshes.
2. Clearly the mode shapes are well-described by (3.13-370).

Fig. 3.13-29 Second of two most unstable LBB modes, 16 × 16 mesh: (a) raw pressure contours; (b) velocity; (c) smooth pressure contours; (d) smooth pressure mesh surface.
Fig. 3.13-30 Next unstable LBB mode, 16 × 16 mesh: (a) raw pressure contours; (b) velocity; (c) smooth pressure contours; (d) smooth pressure mesh surface.

3. MATLAB sometimes 'chooses' a + sign, sometimes a − sign.
4. The smoothed pressures were not obtained with our 'standard' CB filter to the nodes. Rather, they are still centroid pressures, obtained by multiplying the computed ('raw') pressures by $(-1)^{i+j}$.
5. Since both $\mathbf{u}$ and $P$ oscillate as fast as possible, with the 'frequency' (wave number) increasing with increasing number of elements, it is clear from (3.13-167) that both $(w_j^T f)$ and $(q_j^T g)$ will be small for reasonable (smooth) data ($f$ and $g$) and will $\to 0$ as $h \to 0$.

We conclude by comparing the corresponding numerical eigenvalues with those from the new theory, (3.13-369): the entries in Table 3.13-10 are $(-8\mu_{m,n}/3\pi^2h^2)$.

Additional Remarks
(1) For a more general grid of rectangular elements, the CB-factor in (3.13-370) probably becomes $(-1)^{i+j}/A_{i-1/2,\,j-1/2}$, where $A_{i-1/2,\,j-1/2}$ is the area of the associated element, and the simple trigonometric parts of the eigenvector are no longer so simple, and not available in closed form. The eigenvalues, likewise, are not simply
Fig. 3.13-31 Sum of LBB modes (2,0) and (0,2), 16 × 16 mesh: (a) raw pressure contours; (b) velocity; (c) smooth pressure contours; (d) smooth pressure mesh surface.

given by (3.13-369). But we (and D. Griffiths, personal communication) believe that the 'general' picture remains much the same.
(2) The area-weighted pressure filter applied to a typical 4-patch generated by (3.13-370) gives, at node $ij$,

$$\frac{\pi}{4}(-1)^{i+j}\left[\cos((i + 1/2)m\pi h) - \cos((i - 1/2)m\pi h)\right] \times \left[\cos((j + 1/2)n\pi h) - \cos((j - 1/2)n\pi h)\right] = \pi(-1)^{i+j}\sin(im\pi h)\sin\frac{m\pi h}{2}\,\sin(jn\pi h)\sin\frac{n\pi h}{2},$$ (3.13-372)

which is clearly $O(h^2)$. The 'standard' CB filter + smoother is thus not perfect for the pesky modes, but it is quite good on good meshes (small $h$).
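The closed form in (3.13-372) is easy to check numerically. The sketch below is illustrative only and rests on two assumptions of ours: that the pressure entry of (3.13-370) is the value at the element centroid $((i - 1/2)h, (j - 1/2)h)$, and that on a uniform mesh the area-weighted filter reduces to an equal-weight average over the four elements sharing a node. The function names are hypothetical; NumPy is assumed.

```python
import numpy as np


def pesky_pressure(h, m, n):
    """Centroid pressures of the (m, n) mode on the unit square.

    Entry [i-1, j-1] is the value at ((i-1/2)h, (j-1/2)h), i, j = 1..N,
    i.e. a smooth cosine product modulated by the CB factor (-1)^(i+j).
    """
    N = round(1.0 / h)
    idx = np.arange(1, N + 1)
    ii, jj = np.meshgrid(idx, idx, indexing="ij")
    return ((-1.0) ** (ii + jj) * np.pi
            * np.cos((ii - 0.5) * m * np.pi * h)
            * np.cos((jj - 0.5) * n * np.pi * h))


def filter_to_nodes(q):
    """Equal-weight 4-patch average at the interior nodes i, j = 1..N-1."""
    return 0.25 * (q[:-1, :-1] + q[1:, :-1] + q[:-1, 1:] + q[1:, 1:])


def closed_form(h, m, n):
    """Right-hand side of (3.13-372) at the interior nodes."""
    N = round(1.0 / h)
    idx = np.arange(1, N)
    ii, jj = np.meshgrid(idx, idx, indexing="ij")
    return ((-1.0) ** (ii + jj) * np.pi
            * np.sin(ii * m * np.pi * h) * np.sin(0.5 * m * np.pi * h)
            * np.sin(jj * n * np.pi * h) * np.sin(0.5 * n * np.pi * h))


for N in (16, 32):
    h = 1.0 / N
    filtered = filter_to_nodes(pesky_pressure(h, 1, 1))
    print(N, "max |raw| =", np.pi, " max |filtered| =", np.abs(filtered).max())
```

Halving $h$ should reduce the largest filtered value by a factor of about four, which is the $O(h^2)$ behaviour claimed above, while the raw mode keeps an $O(1)$ amplitude.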
Fig. 3.13-32 Same as Figure 3.13-28 except 32 × 32 mesh: (a) raw pressure contours; (b) velocity; (c) smooth pressure contours; (d) smooth pressure mesh surface.

(3) GS also found the 3D LBB constant:

$$k_h = \sqrt{3}\,[\ldots]\,\pi^2h^2 + O(h^{[\ldots]}).$$ (3.13-373)

(4) They also showed, for the leaky lid-driven cavity problem in both 2D and 3D (when well-posed), that the amplitude coefficient of the most unstable mode, see (3.13-165) through (3.13-168), is $O(h)$ even though the minimum eigenvalue is $O(h^2)$ in 2D and $O(h^4)$ in 3D. This is an example of the contention we have made several times: the bad modes are nearly orthogonal to 'reasonable' data and thus offer no impediment to obtaining good results on good grids, and convergent results as $h \to 0$.

A final remark is this, and it applies to both stable and unstable elements: if a domain has aspect ratio $l/h$ and is uniformly discretized into rectangular elements (with of course the same aspect ratio), then the LBB constant goes to zero like $O((l/h)^2)$ for $l/h \to 0$ or like $O((h/l)^2)$ for $l/h \to \infty$; from D. Silvester (personal communication, 1995).

l. The boundary vector, g

We mentioned earlier [Remark (7) below (3.13-32)] that $g$ is a sparse vector that accounts for inhomogeneous Dirichlet BC's in the mass conservation equation, $C^T u = g$. In this
Fig. 3.13-33 Same as Figure 3.13-29 except 32 × 32 mesh: (a) raw pressure contours; (b) velocity; (c) smooth pressure contours; (d) smooth pressure mesh surface.

little section we shall explicitly construct $g$ for a special-but-general case. It is special because we do it only for $Q_1Q_0$, and it is general because we consider an arbitrary ('isoparametric') mesh. Generalization to other elements is left as an exercise.

Let us return to Figure 3.13-23, but imagine it to be generalized in the following way: nodes 3 and 7 are no longer constrained to form a rectangle; i.e., the element is envisioned to be an isoparametric quadrilateral with the only restriction being that nodes 4 and 8 still define a line in the $x$-direction. This restriction, too, could be removed if we were willing to work in the transformed coordinate system (normal and tangential, a la Section 3.13.1e), a complication we prefer to avoid for present purposes. Consulting the element matrices of Appendix 1, it is straightforward to obtain the following mass conservation equation, $C^T u = 0$, a 'preprocessor' step prior to enforcing Dirichlet data upon boundary nodes 4 and 8 (nodes 3 and 7 are here both internal nodes, not on $\Gamma$):

$$2C^T u = (y_8 - y_3)(u_7 - u_4) + (y_4 - y_7)(u_8 - u_3) + (x_3 - x_8)(v_7 - v_4) + (x_4 - x_7)(v_3 - v_8) = 0.$$ (3.13-374)
Fig. 3.13-34 Same as Figure 3.13-30 except 32 x 32 mesh. Panels: (a) raw pressure, contours; (b) velocity; (c) smooth pressure, contours; (d) smooth pressure, mesh surface.

Dividing by two and transposing Dirichlet data to the RHS gives the 'true' $C^T$-matrix and 'true' (unknown) velocity vector (u) as

$$C^T u = \tfrac12 (y_8 - y_3)u_7 - \tfrac12 (y_4 - y_7)u_3 + \tfrac12 (x_3 - x_8)v_7 + \tfrac12 (x_4 - x_7)v_3 = \tfrac12 (y_8 - y_3)u_4 - \tfrac12 (y_4 - y_7)u_8 + \tfrac12 (x_3 - x_8)v_4 + \tfrac12 (x_4 - x_7)v_8 = g, \tag{3.13-375}$$

and we see that the element contribution to the global g-vector is made up of the difference between two neighboring tangential velocities ($u_4$ and $u_8$) and the sum of two neighboring normal velocities ($v_4$ and $v_8$). The main thing to note is that for 'smooth' data [$u_4 = u_8 + O(x_4 - x_8)$, etc.] the 'tangential' contribution to g is small compared with the normal component, as also mentioned in the above Remark. Indeed, if we specialize to a simple $l \times h$ rectangular element as in Figure 3.13-23, we obtain a simpler version of g:

$$g = \frac{h}{2}(u_4 - u_8) + \frac{l}{2}(v_4 + v_8), \tag{3.13-376}$$

from which it is obvious that normal contributions are generally much more 'important' than tangential ones, the latter vanishing totally for the case of uniform shear velocity; all of this is a simple consequence of (global) mass conservation. It is also clear that only normal velocities remain in g as $l$ and $h \to 0$.
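The construction of g above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are ours, and the node placement used in the rectangular check (node 8 at the origin, node 4 a distance l to its right on the boundary, nodes 7 and 3 a distance h below them) is one placement for which (3.13-375) reduces to (3.13-376); Figure 3.13-23 itself is not reproduced here.

```python
def boundary_g(xy, uv):
    """Element contribution to g, transcribed from the RHS of (3.13-375).

    xy: dict mapping node number (3, 4, 7, 8) to its (x, y) coordinates.
    uv: dict mapping the Dirichlet boundary nodes (4, 8) to their (u, v) data.
    """
    (x3, y3), (x4, y4) = xy[3], xy[4]
    (x7, y7), (x8, y8) = xy[7], xy[8]
    u4, v4 = uv[4]
    u8, v8 = uv[8]
    return 0.5 * ((y8 - y3) * u4 - (y4 - y7) * u8
                  + (x3 - x8) * v4 + (x4 - x7) * v8)


def boundary_g_rectangle(l, h, uv):
    """Rectangular special case, transcribed from (3.13-376)."""
    u4, v4 = uv[4]
    u8, v8 = uv[8]
    return 0.5 * h * (u4 - u8) + 0.5 * l * (v4 + v8)


def rectangle_nodes(l, h):
    # Assumed (hypothetical) node placement for an l x h element whose
    # boundary edge 8-4 lies along y = 0.
    return {8: (0.0, 0.0), 4: (l, 0.0), 7: (0.0, -h), 3: (l, -h)}


l, h = 2.0, 0.5
data = {4: (1.0, 3.0), 8: (0.25, 5.0)}
g_general = boundary_g(rectangle_nodes(l, h), data)
g_rect = boundary_g_rectangle(l, h, data)

# Uniform shear along the boundary (equal tangential, zero normal velocity)
shear = {4: (1.0, 0.0), 8: (1.0, 0.0)}
g_shear = boundary_g(rectangle_nodes(l, h), shear)
```

For this data the general and rectangular formulas agree, and the uniform-shear case gives a zero contribution, as the text states.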
Fig. 3.13-35 Same as Figure 3.13-31 except 32 x 32 mesh. Panels: (a) raw pressure, contours; (b) velocity; (c) smooth pressure, contours; (d) smooth pressure, mesh surface.

Table 3.13-10

(m,n)       (1,0) = (0,1)      (1,1)              (2,0) = (0,2)
Mesh        Expt     Theory    Expt     Theory    Expt     Theory
8 x 8       0.772    1         1.188    2         2.162    4
16 x 16     0.900    1         1.613    2         3.278    4
32 x 32     0.955    1         1.822    2         3.732    4

The details, but not the concepts, will vary if one examines elements other than $Q_1Q_0$. Indeed, if one has studied Section 3.13.2b on 'pressure modes,' s/he will know that the explicit construction of g for two other elements, $Q_1Q_1$ and $Q_2Q_1$, has already been presented in the two example problems discussed there.

3.13.6 Higher-Order Elements

We shall be fairly brief here, and only present a sample of 'final' results, because they are mostly less than enlightening.
Nearly everything needed for building a typical GFEM equation has already been presented in Chapter 2. Here we provide, along with Appendix 1, just enough that is not in Chapter 2 to permit the interested reader to finish the equations, namely $\nabla P$ and $\nabla \cdot u$; and we will provide only the $\partial/\partial x$ portions of these stencils for the quadrilateral elements $Q_2Q_{-1}$, $Q_2Q_1$, and $Q_2P_{-1}$, leaving the rest as an exercise. The key differences in the first-derivative operators from those of advection (with constant velocity) shown in Chapter 2 are, of course, a consequence of 'mixed interpolation' and, since P is only linear or bilinear, it is to be expected that the resulting first derivatives will be both simpler and perhaps less accurate than the 'full' quadratic approximation (trial and test functions) that is applied to advection.

a. $Q_2Q_1$

We begin with the 4-patch of Figure 3.13-36. By consulting the element matrices in Appendix 1 one can easily construct the following operators:

$$\partial u/\partial x|_0 = -Q_L^{-1} C_x^T u|_0 \tag{3.13-377}$$

and

$$\partial P/\partial x|_0 = M_L^{-1} C_x P|_0, \tag{3.13-378}$$

where $Q_L$ and $M_L$ are the lumped mass matrices for pressure (bilinear basis functions) and velocity (biquadratic basis functions) respectively; specifically, $Q_L = lh$ and $M_L = lh/9$ for the 4-patch. Since P 'lives' only at corner nodes, so too does $\nabla \cdot u$, so all we need to investigate is $\partial u/\partial x$ for the 4-patch shown, for which we obtain

$$\partial u/\partial x|_0 = \tfrac19 \{ 2[(u_{SE} - u_{SW})/l + (u_E - u_W)/l + (u_{NE} - u_{NW})/l] + (u_{SEE} - u_{SWW})/2l + (u_{EE} - u_{WW})/2l + (u_{NEE} - u_{NWW})/2l \}, \tag{3.13-379}$$

which is indeed much simpler than $\partial u/\partial x$ when the test functions are also biquadratic. Turning now to $\partial P/\partial x$, we must address the three different types of nodes: corner, midside, and center.
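The weights in (3.13-379) can be checked with exact rational arithmetic. The sketch below is ours, not the book's: it assumes the usual 1D quadratic shape functions (nodes at 0, 1/2, 1 of a unit element) for velocity and a linear 'hat' for the pressure test function, and it takes the normalization $Q_L = lh$ from the text. The helper names are hypothetical.

```python
from fractions import Fraction as F


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_deriv(p):
    return [k * c for k, c in enumerate(p)][1:]


def integrate01(p):
    """Exact integral over the unit interval."""
    return sum(c / (k + 1) for k, c in enumerate(p))


# 1D quadratic shape functions in xi on [0, 1]: near corner, midside, far corner
QUAD = [[F(1), F(-3), F(2)],   # (1 - xi)(1 - 2 xi)
        [F(0), F(4), F(-4)],   # 4 xi (1 - xi)
        [F(0), F(-1), F(2)]]   # xi (2 xi - 1)
HAT = [F(1), F(-1)]            # linear test function 1 - xi

# Integrals of hat * shape (in units of h) and hat * d(shape)/dxi
y_weights = [integrate01(poly_mul(HAT, p)) for p in QUAD]
x_weights = [integrate01(poly_mul(HAT, poly_deriv(p))) for p in QUAD]

# Rows S, 0 and N all carry the same weight; rows SS and NN carry none.
row_weight = 2 * y_weights[0]          # corner row: two elements contribute
coef_near = row_weight * x_weights[1]  # multiplies (u_E - u_W) / l
coef_far = row_weight * x_weights[2]   # multiplies (u_EE - u_WW) / l


def dudx_379(u, l, h):
    """Apply the stencil (3.13-379) to a function u(x, y) at node 0 = (0, 0)."""
    total = 0
    for y in (-h / 2, 0, h / 2):
        total += 2 * (u(l / 2, y) - u(-l / 2, y)) / l
        total += (u(l, y) - u(-l, y)) / (2 * l)
    return total / 9


approx = dudx_379(lambda x, y: 3 * x + 5 * y + 7 * x * y, F(1, 2), F(1, 3))
```

Under these assumptions the near and far differences come out with weights 2/9 and 1/18 (that is, 1/9 times 2 and 1/2), matching the factor of 1/9 in front of the braces, and the stencil returns the exact slope for the test function.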
Fig. 3.13-36 A 4-patch of biquadratic elements. The 5 x 5 velocity nodes are labeled by compass position relative to the central node 0 (NNWW through SSEE); the lower-case labels nnww, nnw, nww and nw mark the four pressure nodes of the top-left element.

We begin with the corner node; for the uniform grid 4-patch, the result is strikingly simple:

$$\partial P/\partial x|_0 = (P_{EE} - P_{WW})/2l, \tag{3.13-380}$$
really. Even a variable-rectangular grid brings just a little more coupling: only node 0 gets into the act/stencil, a fact clearly revealed by perusing the element matrix, $C_x^T$, in Appendix 1. Turning now to the 2-patch containing midside node E, we obtain another surprising lack of coupling and a correspondingly simple result, here using $M_L = 2hl/9$:

$$\partial P/\partial x|_E = (P_{EE} - P_0)/l. \tag{3.13-381}$$

The 2-patch for midside node N is only slightly more involved:

$$\partial P/\partial x|_N = \tfrac12 [(P_{EE} - P_{WW})/2l + (P_{NNEE} - P_{NNWW})/2l]. \tag{3.13-382}$$

Finally, the center node equation, with $M_L = 4lh/9$, is, for node NE,

$$\partial P/\partial x|_{NE} = \tfrac12 [(P_{EE} - P_0)/l + (P_{NNEE} - P_{NN})/l], \tag{3.13-383}$$

and we seem to be led to an obvious overall 'conclusion' that is at least sometimes borne out in practice: the element may not be very accurate in $\nabla \cdot u$ and $\nabla P$, at least relative to the other terms in the NSE's that benefit from 'full quadratic' approximation. This probably also helps to explain why $Q_2P_{-1}$ and $Q_2Q_{-1}$ outperform $Q_2Q_1$, and we state the additional fact that helps to reinforce our assessment: because $Q_2Q_1$ uses a $C^0$ pressure approximation, element mass balances do not exist, whereas the $C^{-1}$ pressures in $Q_2Q_{-1}$ and $Q_2P_{-1}$ do generate element-level mass balances.

To complete our brief analysis of $Q_2Q_1$, we shall present the (lumped mass) 'Laplacian', $C^T M_L^{-1} C$, which would be used for explicit time integration of the PPE version of the semi-discrete NSE's. Appendix 1 shows the stencils for both $C_x^T M_L^{-1} C_x$ and $C_y^T M_L^{-1} C_y$. Summing these gives $C^T M_L^{-1} C$, and multiplication by $-Q_L^{-1}$ gives, upon rearrangement, the 'familiar' (finite difference) representation of the Laplacian. We present the result in several steps, using Figure 3.13-36, but this time the velocity nodes are to be interpreted as pressure nodes; i.e. the sketch now represents a 16-patch (4 x 4) of 9-node elements containing 25 pressures (the size of the patch is now $4l \times 4h$).
(1) $$\partial^2 P/\partial x^2|_0 = -Q_L^{-1} C_x^T M_L^{-1} C_x P|_0 = \tfrac{1}{18}[(P_{SWW} - 2P_S + P_{SEE})/(2l)^2 + 2(P_{SW} - 2P_S + P_{SE})/l^2 + 4(P_{WW} - 2P_0 + P_{EE})/(2l)^2 + 8(P_W - 2P_0 + P_E)/l^2 + (P_{NWW} - 2P_N + P_{NEE})/(2l)^2 + 2(P_{NW} - 2P_N + P_{NE})/l^2]. \tag{3.13-384}$$

(2) $$\partial^2 P/\partial y^2|_0 = -Q_L^{-1} C_y^T M_L^{-1} C_y P|_0 = \tfrac{1}{18}[(P_{NNW} - 2P_W + P_{SSW})/(2h)^2 + 2(P_{NW} - 2P_W + P_{SW})/h^2 + 4(P_{NN} - 2P_0 + P_{SS})/(2h)^2 + 8(P_N - 2P_0 + P_S)/h^2 + (P_{NNE} - 2P_E + P_{SSE})/(2h)^2 + 2(P_{NE} - 2P_E + P_{SE})/h^2]. \tag{3.13-385}$$

(3) Adding these two gives $\nabla_h^2$. We present only the simpler result for a 'square' mesh, $l = h$:

$$\nabla_h^2 P|_0 = -Q_L^{-1} C^T M_L^{-1} C P|_0 = \frac{1}{72h^2}[(P_{SWW} + P_{SEE} + P_{NWW} + P_{NEE} + P_{NNW} + P_{SSW} + P_{NNE} + P_{SSE}) + 4(P_{NN} + P_{SS} + P_{EE} + P_{WW}) + 14(P_N + P_S + P_E + P_W) + 16(P_{SW} + P_{SE} + P_{NW} + P_{NE}) - 144P_0], \tag{3.13-386}$$

which would probably better be presented in 'stencil' form, an exercise we leave to the reader.

b. $Q_2Q_{-1}$

Here, in the first case (Case 1 in Appendix 1) we shall be even more brief, mostly because the results are not especially 'revealing', a fact that is related to the computationally-convenient and 'historically'-determined choice of pressure basis functions: they are Lagrangian basis functions whose nodes are at the four 2 x 2 Gauss points. Referring back to Figure 3.13-36, we show the 4 pressure nodes in one of the elements (top left) via the lower case letters; since the figure would be too 'busy' if we had explicated them everywhere, we let the reader fill in the other 3 in the obvious way. The distance between these Gauss-point nodes is $l/\sqrt{3}$ in x and $h/\sqrt{3}$ in y. We shall present only two of the $\nabla P$ equations and none of the $\nabla \cdot u$ equations, as they reveal little but confusion. But we do show the sum of the four $C^T u = 0$ equations, because they do display something useful, namely an element-level mass balance, which is (for the top left element)

$$\frac{h}{6}[(u_{WW} - u_0) + 4(u_{NWW} - u_N) + (u_{NNWW} - u_{NN})] + \frac{l}{6}[(v_{WW} - v_{NNWW}) + 4(v_W - v_{NNW}) + (v_0 - v_{NN})] = 0. \tag{3.13-387}$$

Next we report $M_L^{-1} C_x P$ for node 0 in the 4-patch, a la (3.13-378), from the element matrices in Appendix 1: dP/dx\o = -[(9 + 5V3)(Pne - PnW + Pse ~ P.w) 0/ + (3 + V3)(Pnne ~ PnnW + ^.v, ~ P*sW) ~ 0^3 ~ 3)(Pnee - Pnww + Psee ~ Psww) + (9 - 5V3)(Pnnee ~ P„nww + Pssee ~ PSSWW)], (3.13-388) and it may be obvious why we present only one more, and the only one that is intuitively appealing: that for a center node. It is

$$\partial P/\partial x|_{NW} = \frac12 \left[ \frac{P_{nnw} - P_{nnww}}{l/\sqrt{3}} + \frac{P_{nw} - P_{nww}}{l/\sqrt{3}} \right]. \tag{3.13-389}$$

More 'useful' equations result using the alternative-but-equivalent pressure basis referred to as Case 2 in Appendix 1.
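Returning briefly to the $Q_2Q_1$ 'Laplacian' (3.13-386), the 'stencil' form left to the reader can be sketched as a 5 x 5 array. The coefficients below are transcribed from (3.13-386) as we read it for a square mesh (pressure-node spacing h); the overall factor $1/(72h^2)$ is our assumption, chosen so that the stencil reproduces $\nabla^2 P$ exactly for quadratics. The names are hypothetical.

```python
from fractions import Fraction as F

# Rows run from north (NN) to south (SS), columns from west (WW) to east (EE);
# the four far corners (NNWW, NNEE, SSWW, SSEE) do not appear in (3.13-386).
STENCIL_72 = [
    [0,  1,    4,  1, 0],
    [1, 16,   14, 16, 1],
    [4, 14, -144, 14, 4],
    [1, 16,   14, 16, 1],
    [0,  1,    4,  1, 0],
]


def laplacian_386(P, h):
    """Apply the stencil to a function P(x, y) at node 0 = (0, 0)."""
    total = 0
    for row, j in zip(STENCIL_72, (2, 1, 0, -1, -2)):      # j = y index
        for w, i in zip(row, (-2, -1, 0, 1, 2)):           # i = x index
            total += w * P(i * h, j * h)
    return total / (72 * h * h)


h = F(1, 4)
weight_sum = sum(sum(row) for row in STENCIL_72)
lap_quadratic = laplacian_386(lambda x, y: x * x + y * y + 3 * x - 2 * y + 5, h)
lap_harmonic = laplacian_386(lambda x, y: x * x - y * y + x * y, h)
```

The weights sum to zero, the quadratic test function returns its exact Laplacian of 4, and the harmonic test function returns zero, which is at least consistent with the transcription and the assumed normalization.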
Now the pressures 'live' at the 4 corner nodes of each element, with the somewhat inconvenient consequence that pressure is now a multivalued quantity; for example, there are 4 different pressures at node 0 of the 4-patch. Note that this is really no different from the 2 x 2 Gauss point pressure basis, which is also quadruple-valued at node 0. (Note too that the numerical results using either equivalent basis will be the same.) We simply need to introduce some new names/nomenclature; and, rather than further cluttering up an already busy figure, we ask the reader to help us by returning to Figure 3.13-36 and 'mentally' adding 3 rows of node numbers for the 16
pressures. Thus, 1 through 4 lie (left-to-right) on the bottom, 5 through 8 lie on the line connecting WW and EE, as do 9 through 12 (the former living 'just' below the line and the latter just above). Finally, nodes 13 through 16 are on the top row (NNWW $\to$ NNEE). Thus node SS contains $P_2$ and $P_3$, node WW contains $P_5$ and $P_9$, node 0 contains $P_6$, $P_7$, $P_{10}$, and $P_{11}$, etc. We can now go to the same $C^T$ matrices in Appendix 1 that were used earlier for $Q_2Q_1$ and now re-use them for $Q_2Q_{-1}$, with the following sampling of results (the lumped mass matrices, Q and M, are unchanged):

(1) $\nabla \cdot u$ at node 6 (1 of 4 continuity equations in the lower left element, that at node 0):

$$\partial u/\partial x|_6 = -\frac{1}{9l}[(u_{WW} + 4u_W - 5u_0) + 2(u_{SWW} + 4u_{SW} - 5u_S)], \tag{3.13-390}$$

which is 'representative'. Viewed 'alone'/in isolation, it is easy to believe that the $Q_2Q_{-1}$ approximation to $\nabla \cdot u$ will not be very accurate (the above is first-order, if TS applies, which it does not). But note/recall that the sum of all four equations in an element gives $\int_e \nabla \cdot u = \int_{\Gamma_e} n \cdot u = 0$, an element mass balance; see (3.13-387).

(2) $M_L^{-1} C_x P|_0$ for the 4-patch:

$$\partial P/\partial x|_0 = \frac{1}{4l}[(P_8 - P_5 + P_{12} - P_9) + 5(P_7 - P_6 + P_{11} - P_{10})], \tag{3.13-391}$$

which degenerates to that for $Q_2Q_1$ in (3.13-380) if $P_5 = P_9$ and $P_{12} = P_8$ and $P_6 = P_7 = P_{10} = P_{11}$; i.e. if the pressure were continuous. But it is generally discontinuous, the extra degrees of freedom accounting for its superior performance (up to the CB mode!) compared to $Q_2Q_1$. It is also tempting to suggest that the term in the second parentheses (the jump term) will, for smooth pressures, tend to zero with mesh refinement, with the first term tending to $(P_{EE} - P_{WW})/2l = \partial P/\partial x|_0$; i.e., for $l, h \to 0$, $Q_2Q_{-1} \to Q_2Q_1$.

(3) $M_L^{-1} C_x P|_S$ for a 'horizontal' 2-patch:

$$\partial P/\partial x|_S = \frac{1}{4l}[(P_4 - P_1 + P_8 - P_5) + 5(P_3 - P_2 + P_7 - P_6)], \tag{3.13-392}$$

the second group of terms again describing a jump.

(4) $M_L^{-1} C_x P|_W$ for a 'vertical' 2-patch:

$$\partial P/\partial x|_W = \frac{1}{2l}[P_6 - P_5 + P_{10} - P_9], \tag{3.13-393}$$

with no jump terms.
(5) $M_L^{-1} C_x P|_{NE}$ for a center node: see (3.13-383).

Note for both cases that all pressure equations annihilate the CB pressure mode, a reflection of the LBB instability for $Q_2Q_{-1}$. This concludes our 'sampling' for $Q_2Q_{-1}$, about which we remark: if you think this element is slightly confusing, turn the 'page'.

c. $Q_2P_{-1}$

Unlike $Q_2Q_{-1}$, this element, from the viewpoint of studying the discrete equations, truly 'suffers' from the need to put a 'square peg' (triangular 'element') into a 'round hole'
(rectangular element). For $\nabla \cdot u$ we merely mention that the sum of the three continuity equations also generates (3.13-387). Whereas $Q_2Q_{-1}$ at least has some symmetry that helps a little to interpret the discrete equations, even this is lost with the 'great' $Q_2P_{-1}$ element. As we use the first three Gauss points in each element to define the pressure nodes and concomitant Lagrangian basis functions, we must omit the upper left node (Gauss point) in each element (node nnww in Figure 3.13-36). We report, for what it's worth, the $M_L^{-1} C_x P|_0$ result for the above four patch, and let the reader both interpret the result and build other discrete equations from the element matrices listed in Appendix 1: dP/dx\0 = -[(2V3 + 3XPne+Psse) + V3> nee T' ssee ww + Pssw) + (3^3 - \)(Pnnw + Psw - Pnnee - Psee) - 5V3(Pnw + />,,„)], (3.13-394) about which the only remark we care to make is that the apparently wrong-signed term (the third) is a common feature of higher-order approximations; recall that it even occurred in 1D for the AD equation; cf. Section 2.3.1b.

To 'finish,' we show again the center node equation, since, especially for this fun-to-use but difficult-to-'analyze' element, it is the only one that is reasonably simple to understand:

$$\partial P/\partial x|_{NW} = \frac{P_{nw} - P_{nww}}{l/\sqrt{3}}, \tag{3.13-395}$$

where we recall that there is no node corresponding to $P_{nnww}$ for this element (thus it 'cleverly' neglects $P_{nnw}$ in the $\partial P/\partial x$ approximation).

To conclude our discussion of 9-node elements (27-node in 3D), we speculate on $Q_2Q_{-1}$ vs.
$Q_2P_{-1}$, the latter having become the favorite 2D element in many codes besides ours, in both 2D and 3D. We do so partly ('slightly' is a better word) as a result of contemplating the above (and other) stencils: in the cases of most interest for the FEM (complex geometry in 3D) there will seldom, if ever, appear spurious pressure modes for $Q_2Q_{-1}$, for which applications it just might be superior to $Q_2P_{-1}$ which, especially in 3D, seems to be a little 'short' in pressure degrees of freedom. 3D numerical comparisons between these two elements are strongly recommended. (For finite difference geometries, like boxes and cubes, $Q_2Q_{-1}$ is often somewhat hampered by spurious modes 'because' the geometry is so simple that not quite all of the continuity equations are actually required. We even believe that finite difference methods are more appropriate, i.e. more cost-effective, than finite elements on 'finite difference geometries'.)

Our final remark on 'higher-order elements' for NSE's is this: they probably are better left to the computer (in practice) and to the finite element mathematician (in theory).

3.13.7 Divergence-Free Elements (and Methods)

Before diving into this section, it may be a good idea to take a look at our brief introduction to this subject in Section 3.12.2e. The 'idea' behind a divergence-free basis is best initiated via the conventional weak form of the continuum Stokes problem, following Griffiths (1979a,b): find $u \in H_0^1$ and $P \in L^2$ from

$$a(u, v) - (\operatorname{div} v, P) = (f, v) \quad \forall v \in H_0^1, \tag{3.13-396}$$
and

$$(\operatorname{div} u, q) = 0 \quad \forall q \in L^2, \tag{3.13-397}$$

which, by decomposing the Hilbert space $H_0^1$ into the direct sum of a divergence-free subspace (D) and its orthogonal complement, a curl-free subspace (C), reduces the above pair of coupled equations to the following sequence of equations, with the second step 'optional':

1. Find $u \in D$ from $a(u, v) = (f, v) \quad \forall v \in D$. (3.13-398)
2. Find $P \in L^2$ from $(\operatorname{div} v, P) = a(u, v) - (f, v) \quad \forall v \in C$, (3.13-399)

where $H_0^1 = D \oplus C$. Whereas the (orthogonal) decomposition is always theoretically possible, the challenge is to repeat the problem formulation for the finite-dimensional subspaces associated with the FEM (the easy part) and then find suitable (local) basis functions for these subspaces (the hard part). It is also relevant to point out that the pressure 'recovery' in Step 2, which must use $C^{-1}$-approximations for P if the divergence-free basis is to remain local, is not a 'conventional' linear system of algebraic equations (as is Step 1); rather, it is solved by looping through the elements, thereby determining the jumps across element boundaries of the pressure 'parameters' (P, $\partial P/\partial x$, etc.); see Griffiths (1981) for details.

The divergence-free approach has had somewhat of a checkered history, beginning (we believe) with one of the early pioneers of 'FEM in Fluids,' M. Fortin. About a quarter of a century ago, he showed in his Ph.D. thesis (Fortin, 1972a) how a divergence-free basis could be employed, although perhaps not gainfully: 'Indeed, one can construct finite element methods where the incompressibility condition is exactly satisfied [cf. Fortin (8), (9)], but this leads to the use of complex elements of limited applicability.' The quotation is from Crouzeix and Raviart (1973), an important paper that introduced discontinuous pressure on triangles. See also Thomasset (1981) and Hecht (1984), who not only talked about divergence-free bases, but constructed and used them (for $Q_2P_{-1}$).
Our brief attempt at a summary of this history, most assuredly with numerous errors of omission (at least), continues with the early contributions of E. Thompson and students. Although not employing a divergence-free basis, Thompson (Thompson and Hague, 1973, and Thompson, 1975) did experiment with a pointwise (exactly) divergence-free element, $P_2P_{-1}$, happily and innocently unaware (then) of the sometimes serious 'LBB stability' problems displayed by this element (there can be many spurious pressure modes, depending on mesh design and BC's; details later), because they always used NBC's on large segments of the boundary and apparently never encountered insoluble problems. (A very neat example of a 'drooping candle' is presented in Thompson and Hague, 1973: a sequence of steady Stokes 'flows' in a Lagrangian formulation.) It may be interesting to demonstrate the pointwise incompressibility of this element, so we do so, beginning with the observation (required for exactly divergence-free elements) that the pressure space is precisely the divergence of the velocity space; see, for example, the Appendix to Chapter 4 by
D. Malkus in Hughes (1987), so that

$$\int_e \psi \, \nabla \cdot u^h = 0 \quad \forall \psi, \tag{3.13-400}$$

becomes, because of quadratic velocity ($P_2$) and linear (and discontinuous) pressure ($P_{-1}$), for each element (e),

$$\int_e (a + bx + cy)(A + Bx + Cy) = 0, \tag{3.13-401}$$

because $\psi = a + bx + cy$ and $\nabla \cdot u^h = A + Bx + Cy$. Since this must hold for all a, b, and c, simply select a = A, b = B, and c = C to obtain $\int_e (\nabla \cdot u^h)^2 = 0$, and thus $\nabla \cdot u^h = 0$. The $P_2P_{-1}$ element is an example of a mixed interpolation method whose computed pressures 'cause' pointwise incompressibility, for straight-sided triangles only (E. Thompson, 1991, personal communication). Thus, Thompson et al. did not even attempt to save computer time by seeking out the inherent divergence-free velocity basis functions; they just used the element 'as is' (u and P are computed) because they liked it, at least for a while. In fact, however, and perhaps somewhat surprisingly, Thompson (1975) found this element to usually be actually less accurate than $P_2P_0$ in several test problems; thus, 'divergence-freeness' does not in and of itself imply accuracy. [Currently, Thompson (personal communication, 1991) prefers $Q_1Q_0$ for both 2D and 3D simulations.] For other applications of $P_2P_{-1}$, see, for example: Dawson and McTigue (1985), Dawson (1987), and Mathur and Dawson (1987), in which it was found to be inferior to the quadrilateral element $Q_2Q_{-1}$, at least for Eulerian formulations. [For Lagrangian formulations, the $P_2P_{-1}$ is generally more robust than $Q_2P_{-1}$, which 'crashes' more often (P. Dawson, 1991, personal communication).] Another paper on $P_2P_{-1}$ (plus others) is Malkus and Olsen (1984), in which they identified the LBB-instability and gave the dimension of the null space as six (correct as far as it went, but see below!). It took D.
Griffiths to actually generate the divergence-free basis for the $P_2P_{-1}$ element, and many more, some in the pointwise sense and others in the weak sense, via pressure test functions. In a series of papers over a six-year period (Griffiths, 1977, 1979a, b, 1981, 1982), he, inspired and challenged by the important paper of Crouzeix and Raviart (1973), single-handedly generated analytically the divergence-free bases for virtually 'all' elements then in use. Not all are 'local,' however, and not any of them were extended to iso-P elements with curved sides. Even straight-sided elements possess (at least some do) rather involved divergence-free bases. And they seem (thus far, at least) to be sufficiently complex that they appear not to have 'caught on,' i.e., shown up in subsequent papers, except for some recent work by Shopov et al. (see Shopov et al., 1992, and Shopov and Iordanov, 1994), who use the $Q_2P_{-1}$ element in (weakly) divergence-free form (no pressures), and on isoparametric elements yet. Perhaps one reason that not many codes have been written is due to Griffiths himself; in Griffiths (1981) is: 'It is not, as yet, clear whether the computational procedures based on these basis functions would be more cost-effective than the orthodox Lagrange multiplier method (or indeed, penalty methods), although both methods would give identical results when used with the same underlying functions spaces.' On the other hand, on the same page (342) is a statement that big code developers should have taken rather seriously but (it seems) did not: 'The potential savings brought about by using these new basis functions are much greater for time-dependent problems, since the pressure does not need to be computed at each step.' A final quotation is useful in that it helps to better understand the somewhat
distinct roles played by the different types of nodal velocities for a nine-node quad: 'normal components of velocities at midside nodes control the flux across element edges (also the discontinuity/jump in pressure across a boundary), internal nodes control the creation/destruction of mass within an element (also the gradient of pressure in the element), and the remaining nodes are free to approximate the momentum equations.' Thus, the four corner nodes and the four tangential components at midside nodes are those 'available' to satisfy Newton's second law. Interesting. For more recent work in this regard, see Shopov and Iordanov (1994).

In another sequence of papers, Gustafson and Hartman attacked the divergence-free 'challenge' as posed in the book by Temam (1984); in Hartman and Gustafson (1981), and in Gustafson and Hartman (1983, 1985), graph-theoretic methods were used to explicate the underlying discretely divergence-free bases of the elements discussed by Temam. Like the Griffiths' work, however, the results seem not to have attracted much attention by 'code-builders', at least to our knowledge.

In a recent Ph.D. thesis under the supervision of D. Arnold, Qin (1994) studied theoretically and numerically the divergence-free $P_2P_{-1}$ element (in 'mixed mode') and its two 'neighbors,' $P_1P_0$ and $P_3P_{-2}$.
We summarize here only a few of his salient results for $P_2P_{-1}$ on the unit square: (i) on a mesh oriented so that all hypotenuses go in the same direction (45°) with Dirichlet BC's, the dimension of the null space of the gradient operator is six, the 'reduced' (after removal of the zero eigenvalues) inf-sup constant is $O(h)$, and optimal convergence with mesh refinement occurs for the velocity only (pressure does not converge); i.e., the velocity error is $O(h^3)$ in $L^2$ and $O(h^2)$ in $H^1$; (ii) ditto except NBC's, which remove the entire six-dimensional null space; (iii) on 'many' other meshes, such as one composed of criss-cross triangles, four to a square, the dimension of the null space (of C) is huge(!), unbounded like $O(h^{-1})$ for either essential or natural BC's, yet velocity converges optimally as does pressure [$O(h^2)$ in $L^2$], and the reduced inf-sup constant is good/stable, $O(1)$; and (iv) for certain very special meshes, optimal convergence of both u and P occurs with no instability and no spurious null space. Very interesting news, but perhaps not to the 'applications engineer', unless he is prepared to generate his mesh as follows: (i) start with squares (presumably rectangles are also okay, and presumably of various sizes, to permit graded grids); (ii) form the triangles by criss-crossing each rectangle; and (iii) move the center node in each rectangle off-center 'a fixed distance, for instance h/4', such a mesh being called a distorted 'criss-cross subdivision.'

Returning now to a low-order element and a divergence-free basis for velocity (no pressure), we mention the work of Rannacher and Turek (1992) and Turek (1994, 1996, 1997); see too Heywood et al.
(1996) for an interesting 'variationally based' analysis of 'built-in' NBC's/OBC's when divergence-free basis functions are employed. The Rannacher-Turek work is one of the most impressive-in-practice uses of such an element for time-dependent flows, with but one small 'hitch': it is non-conforming, defined as it is on a so-called 'rotated bilinear' element with the bilinear velocities defined at element midsides rather than at the corners. (The $xy$ part of the bilinear basis function goes over to $x^2 - y^2$.) This trick, however, permits the efficient definition of a more local divergence-free basis (element-contained even for distorted quadrilaterals) than is possible with the standard/conforming $Q_1$ element, whose divergence-free basis requires the use of 4-patches of macro elements (Griffiths, 1981). (Note too that even Griffiths did not find a local basis for $Q_1Q_0$ on isoparametric elements.) But the biggest key to the good performance realized in Turek's code is the effective use
of 'multigrid' to solve the associated linear systems (S. Turek, 1992, personal communication). Multigrid is good in that even though the divergence-free 'stiffness' matrix has a condition number of $O(h^{-4})$ and the divergence-free mass matrix has a condition number of $O(h^{-2})$, Turek's multigrid method has a convergence rate that is independent of the condition numbers. This matters because, in fact, the number of unknowns with the discretely divergence-free basis on the rotated $Q_1Q_0$ element is virtually the same (not substantially less, as perhaps intuitively expected) as that on the conforming $Q_1Q_0$ element with mixed interpolation! (And in 3D, the mixed method actually involves fewer total degrees of freedom than the div-free method, 'because' the stream 'function' is a vector quantity in 3D.) It turns out that the divergence-free basis necessarily (see too the Griffiths' papers) introduces a sort of stream function (at the corners) which, along with tangential velocities at the midside nodes, results in an element with an average of three degrees of freedom per element, just like $Q_1Q_0$ (conforming) with bilinear velocity and constant pressure. The 'reason' that there are so many nodal parameters/degrees of freedom is that nonconforming elements always have more. In contrast, the $Q_1Q_0$ basis derived by Griffiths (1981) has only one-third the nodal parameters as when using mixed interpolation. In fact, even though a very impressive 2D code has been generated with the divergence-free basis, the Turek-Rannacher 'team' seems to be switching toward projection methods (see Section 3.16.6); i.e., back to mixed interpolation (with a projection method, described later), especially in 3D.

We would also like to point out that a finite difference, divergence-free method that is effectively the 'conventional' $Q_1Q_0$ element has also been invented and used; in Stephens et al.
(1984, which paper also proves that there can be at most two pressure modes for $Q_1Q_0$) and Bell et al. (1989), a 'finite difference Galerkin formulation' was employed, in which a set of discretely divergence-free velocity 'mesh functions' was utilized to preclude the pressure, on a 9-patch (see, too, p. 172 of Girault and Raviart, 1986) rather than on the 4-patch a la Griffiths. [Fortin (1981) also used a nine-patch to show the divergence-free 'Vortex' for $Q_1Q_0$.] However, like Turek and Rannacher, Bell et al. have returned to conventional 'mixed-interpolation' (in finite difference 'garb'), for both 2D (Bell et al., 1991) and 3D simulations, and have even switched over to 'approximate projections' (the discrete velocity is only 'close' to being discretely divergence-free; see Section 3.16.6d). In the first of these (steady flow), the fully coupled, divergence-free system of size N (N = total number of nodes) was solved via banded solvers (and Newton's method). In their extension to time-dependent equations, they 'split out' the projection, with the result that in a sequential manner, they returned to 3N equations (N each for u, v, P). The main reason for their switch was to more efficiently invoke a higher-order Godunov method for advection. In 3D, Bell et al. returned to the 'PPE' approach, in part because the 3D 'basis' is very complicated and also because the relative savings is then not so large (J. Bell, 1992, personal communication).

Another time-dependent, divergence-free-basis FDM approach is discussed in Goodrich and Soh (1989), and a strong connection revealed between that approach and a stream function only approach. In fact, it was the stream-function-only code that was used for the computations presented in that paper (J. Goodrich, personal communication, 1994). They also showed the 'equivalence' between their (and Stephens, Bell, et al.) 'finite difference Galerkin' method and the 'dual variable' method of Amit et al.
(1981)—another approach that uses graph theory. In fact, Goodrich and Soh state, 'The next section will show that the dual variable or finite difference Galerkin algorithms can actually be interpreted as stream
function algorithms. This discovery resulted from trying to understand and simplify the product terms in the FDG algorithms.' It is, in fact, this very 'equivalence' (discretely divergence-free bases seem always to introduce a stream function) that might help to explain why the time-dependent case has not received much attention in the fully coupled (N equations only, one per node) divergence-free approach; rather than simply $\partial u/\partial t$, the acceleration becomes, at least partially, converted to an equation for $\partial \omega/\partial t$ (there is an implied curl operation) or, 'worse yet,' an equation for $\partial(\nabla^2 \psi)/\partial t$. For a more recent application of Goodrich's $\psi$-only approach for time-dependent flow, see Gresho et al. (1993). The method has, however, thus far only been applied on uniform grids, with no 'geometry.'

To conclude this discussion, we make two observations:

1. Divergence-free elements have the added advantage that they cannot generate unstable DAE's via the advection term (recall the example in Chapter 2 wherein any but skew-symmetric advection caused the ODE's to be unstable; and see Section 3.16.4 in this chapter, wherein the possibility of unstable DAE's is discussed). This is simply because divergence-free basis vectors necessarily (at least via GFEM) generate skew-symmetric advection matrices, at least up to outflow BC's; see Remark (1) following (2.2-24) in Chapter 2.

2. The world is still in need of a truly cost-effective divergence-free basis for 3D GFEM simulations in which complex geometries are to be tackled. The mixed blessings of mixed methods seem currently to be on top, even though the 'best' 3D element is also not yet 'obvious.' (In 2D, it seems that there are now several 'best' elements: $Q_2P_{-1}$, $Q_1Q_0$, or any of several triangular elements; it all depends on who is calling it best.)
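The stability claim in observation 1 rests on a simple algebraic fact: a skew-symmetric advection matrix N gives $u^T N u = 0$ for every u, so advection can neither create nor destroy discrete kinetic energy. The sketch below illustrates only that fact. As a stand-in for a divergence-free GFEM matrix (which we do not construct here) it assumes a periodic 1D central-difference advection operator with constant speed c; the function names are hypothetical.

```python
def periodic_advection_matrix(n, c=1.0):
    """Skew-symmetric stand-in: (N u)_i = c * (u_{i+1} - u_{i-1}) / 2, periodic."""
    N = [[0.0] * n for _ in range(n)]
    for i in range(n):
        N[i][(i + 1) % n] += c / 2
        N[i][(i - 1) % n] -= c / 2
    return N


def is_skew_symmetric(N):
    n = len(N)
    return all(N[i][j] == -N[j][i] for i in range(n) for j in range(n))


def quadratic_form(N, u):
    """u^T N u: the rate at which the advection term changes u^T u / 2."""
    n = len(u)
    return sum(u[i] * N[i][j] * u[j] for i in range(n) for j in range(n))


N = periodic_advection_matrix(6)
u = [1.0, 2.0, 0.0, -1.0, 3.0, 2.0]
energy_production = quadratic_form(N, u)

# Adding a symmetric part (here a positive diagonal) breaks the property.
N_bad = [row[:] for row in N]
for i in range(6):
    N_bad[i][i] += 0.5
bad_production = quadratic_form(N_bad, u)
```

With the skew-symmetric matrix the energy production is exactly zero for this u; the perturbed matrix produces a nonzero value, which is the mechanism behind the unstable ODE's recalled from Chapter 2.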
3.13.8 Conservation Laws Revisited

Recalling the discussion in Sections 2.2.3 and 2.2.4 of the previous chapter, we now repeat the analogous steps for the NS equations, except that we are now smart enough to do it in the 'efficient' way right away; i.e., we shall work directly with the semi-discretized equations in matrix-vector form, to study conservation of momentum and conservation of kinetic energy. To this end, we first rewrite them in the augmented form corresponding to (2.2-15) through (2.2-27), starting from the condensed form in (3.13-28), (3.13-29):

$$M\dot{u} + [K + N(u) + \beta D(u)]u + CP = f \tag{3.13-402}$$

and

$$C^T u = 0, \tag{3.13-403}$$

where

$$D_{ij}(u) = \int_\Omega \varphi_i \varphi_j \, \nabla \cdot u^h \tag{3.13-404}$$

is another 'divergence' matrix, $\beta$ is a scalar to be determined later, the RHS of (3.13-403) is zero because u now contains all velocities [including those on $\Gamma_D$; cf. (3.13-13) and (3.13-14)], and f is the 'augmented' forcing vector that includes the (as yet unknown) force applied by $\Gamma_D$ to the fluid, but no longer contains the u terms in (3.13-26):

$$f_{\alpha i} = \int_\Omega \varphi_i g_\alpha + \int_{\Gamma_N} \varphi_i F_\alpha + \int_{\Gamma_D} \varphi_i F_\alpha, \tag{3.13-405}$$
where $\tilde F_\alpha$ is the new (reaction-force) term that, analogous to the 'Dirichlet' heat flux in (2.2-16), is to be determined once the velocity and pressure are known; i.e., as for the scalar transport equation, we have effectively defined a two-step procedure: (1) solve the DAE's of (3.13-28) and (3.13-29), which are 'contained in' (3.13-402) and (3.13-403), by omitting the test functions (and equations) that correspond to nodes on $\Gamma_D^\alpha$, and (2) solve for the 'reaction' force, $\tilde F_\alpha$, via

$$\tilde F_\alpha^h = \sum_{j=N_\alpha+1}^{N_{T\alpha}} \tilde F_{\alpha j}\,\varphi_j, \tag{3.13-406}$$

with $u$ and $P$ available (known), analogous to (2.2-14) et seq.; i.e., (3.13-402), (3.13-405), and (3.13-406) are to be considered a linear system of $N_{T\alpha} - N_\alpha$ equations in the unknown nodal force components $\{\tilde F_{\alpha j}\}$, $\alpha = 1, 2$ or $\alpha = 1, 2, 3$. The details of this force calculation will be explained in Chapter 4. Step (2) is, of course, optional and can only be performed after Step (1) is completed.

With the efficient 'nomenclature' now in place, it is a relatively simple matter to study conservation of momentum and kinetic energy, beginning with the former. Since in this augmented form all basis functions sum to unity, it is now a simple matter to sum all of the momentum equations in (3.13-402), separately for each component ($u_\alpha$), to obtain

$$\frac{d}{dt}\int \mathbf u^h + \int \mathbf u^h\cdot\nabla\mathbf u^h + \beta\int \mathbf u^h\,\nabla\cdot\mathbf u^h = \int \mathbf g + \int_\Gamma \mathbf F^h, \tag{3.13-407}$$

where $\mathbf u^h$ is the approximate solution given in (3.13-13) and where, on $\Gamma_N$, the applied traction vector, $F_\alpha^h$, is (in 2D) that given by (3.12-14) and (3.12-16), whereas on $\Gamma_D$ it is the reaction force given by (3.13-406), and we of course realize that true traction forces (vis-a-vis pseudotraction forces) are only obtained via the full stress-divergence form of the momentum equation: $\gamma = 1$ in, for example, (3.13-18) through (3.13-26). If $\gamma = 0$, then we only have portions of the full momentum balance.
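The two-step procedure and the 'sum all equations' argument can be tried on the scalar analog that the text keeps pointing back to. The following is only a sketch under stated assumptions: it uses a 1D Poisson problem ($-u'' = 1$ on $(0,1)$ with $u(0) = u(1) = 0$) and linear elements rather than the NS system, and the function name `consistent_reactions` is hypothetical. Step (1) omits the two Dirichlet rows and solves; Step (2) evaluates those omitted rows of the augmented system to recover the boundary 'reactions' (here, the outward normal derivative of $u$). Because the basis functions sum to unity, the reactions and the source must balance.

```python
import numpy as np

def consistent_reactions(n_el=8):
    """1D analog of the two-step 'consistent force' procedure.

    Solves -u'' = 1 on (0, 1), u(0) = u(1) = 0, with linear elements,
    then recovers the boundary reactions from the omitted equations.
    """
    h = 1.0 / n_el
    n = n_el + 1
    K = np.zeros((n, n))   # augmented stiffness: rows for ALL nodes
    f = np.zeros(n)        # augmented load vector (source term only)
    ke = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    fe = np.array([0.5, 0.5]) * h
    for e in range(n_el):
        idx = [e, e + 1]
        K[np.ix_(idx, idx)] += ke
        f[idx] += fe

    # Step (1): omit the equations at the Dirichlet nodes and solve.
    interior = np.arange(1, n - 1)
    u = np.zeros(n)
    u[interior] = np.linalg.solve(K[np.ix_(interior, interior)], f[interior])

    # Step (2): the omitted rows of the augmented system give the reactions.
    residual = K @ u - f
    return u, residual[0], residual[-1], f.sum()

u, r_left, r_right, total_source = consistent_reactions()
print("reactions:", r_left, r_right)
print("global balance (should vanish):", r_left + r_right + total_source)
```

For this particular problem the exact outward normal derivative is $-1/2$ at both ends, so the recovered reactions can be compared directly with the continuum values.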
The final step in obtaining a true force balance that will make (3.13-407) look just like the appropriate continuum balance equation (3.11-3) (after adding the body 'force' term, $\mathbf g$, to that equation), is generally only possible if $\beta = 1$ because

$$\int \mathbf u^h\cdot\nabla\mathbf u^h = \int \nabla\cdot(\mathbf u^h\mathbf u^h) - \int \mathbf u^h\,\nabla\cdot\mathbf u^h = \int_\Gamma (\mathbf n\cdot\mathbf u^h)\,\mathbf u^h - \int \mathbf u^h\,\nabla\cdot\mathbf u^h,$$

and (3.13-407) then becomes

$$\frac{d}{dt}\int \mathbf u^h = \int \mathbf g + \int_\Gamma \left[\mathbf F^h - (\mathbf n\cdot\mathbf u^h)\,\mathbf u^h\right], \tag{3.13-408}$$

a true global momentum balance; cf. (3.11-4). Thus, rigorous conservation of momentum requires the divergence form of both the viscous stress term and the advection term [recall that $\beta = 1$ is equivalent to replacing $\mathbf u^h\cdot\nabla\mathbf u^h$ by $\nabla\cdot(\mathbf u^h\mathbf u^h)$]; but we hasten to add that 'just fine' solutions to the NS equations can be obtained with the simpler (and thus less expensive) versions via the $\nabla^2$-form ($\gamma = 0$) and advective form ($\beta = 0$).

As was the case for the scalar transport equation, reversibility in the sense that replacing Dirichlet data on $\Gamma_D$ by Neumann data there and achieving the same $(\mathbf u^h, P^h)$ solution is only achievable via the 'consistent force' formulation (for $\gamma = 0$ or 1 in fact, with only the latter giving a true force); i.e., if $\tilde F_\alpha$ is determined in any way other than via 'Step (2)' above, then the resulting velocities and pressures will not be the same as those obtained with Dirichlet BC's. For added clarification here, we explicitly describe the $x$-component of the reaction
force calculation on $\Gamma_D^x$:

$$\int_{\Gamma_D^x} \varphi_i\,\tilde F_x^h = \left\{M\dot u + [K + N(u) + \beta D(u)]\,u + CP\right\}_i - \int \varphi_i\, g_x - \int_{\Gamma_N^x} \varphi_i\, F_x, \tag{3.13-409}$$

where $\{\ \}_i$ denotes the $i$-th row of the LHS of the $x$-momentum equation [see (3.13-28) and below it], and

$$\tilde F_x^h = \sum_{j=N_x+1}^{N_{Tx}} \tilde F_{xj}\,\varphi_j. \tag{3.13-410}$$

The entire RHS is known, and the nodal values, $\{\tilde F_{xj}\}$, which represent a true $x$-direction force component if $\gamma = 1$, can be computed; the boundary mass matrix, $\int_{\Gamma_D^x}\varphi_i\varphi_j$, may be lumped (when lumping is feasible) if desired, as discussed in the previous chapter. This is the 'consistent' force and, at least when the consistent mass matrix is used, is 'exceptionally accurate' (details later, in Chapter 4).

Finally, we turn to kinetic energy conservation, wherein (3.11-12) is our goal. To that end, we simply take the scalar product of (3.13-402) and the (full) velocity vector $u$, use (3.13-403) to see that $u^T C P = P^T C^T u = 0$, and obtain

$$\frac{1}{2}\frac{d}{dt}\,u^T M u + u^T[N(u) + \beta D(u)]\,u = u^T f - u^T K u, \tag{3.13-411}$$

where, à la (2.2-26) and (2.2-27), we have [for $\gamma = 1$ in the (augmented) $K$-matrix]

$$u^T M u = \int \mathbf u^h\cdot\mathbf u^h = \int q_h^2,$$

$$u^T[N(u) + \beta D(u)]\,u = \int \left[\mathbf u^h\cdot(\mathbf u^h\cdot\nabla\mathbf u^h) + \beta\,\mathbf u^h\cdot\mathbf u^h\,\nabla\cdot\mathbf u^h\right] = \frac{1}{2}\int\left[\nabla\cdot(q_h^2\,\mathbf u^h) - q_h^2\,\nabla\cdot\mathbf u^h\right] + \beta\int q_h^2\,\nabla\cdot\mathbf u^h,$$

which, if and only if $\beta = 1/2$, becomes $\frac{1}{2}\int_\Gamma (\mathbf n\cdot\mathbf u^h)\,q_h^2$; also,

$$u^T f = \int \mathbf u^h\cdot\mathbf g + \int_\Gamma \mathbf F^h\cdot\mathbf u^h, \qquad u^T K u = \int \Phi_h = \frac{\nu}{2}\int\left[\nabla\mathbf u^h + (\nabla\mathbf u^h)^T\right]^2,$$

which we do not claim to be 'obvious.' Thus, for $\beta = 1/2$, we have

$$\dot E_h = \int_\Gamma \mathbf F^h\cdot\mathbf u^h - \frac{1}{2}\int_\Gamma q_h^2\,(\mathbf n\cdot\mathbf u^h) + \int(\mathbf u^h\cdot\mathbf g - \Phi_h), \tag{3.13-412}$$

where $E_h = \frac{1}{2}\int q_h^2$, which is (3.11-12) after dividing by $\rho$ and adding the body force term, $\mathbf g$, to (3.11-6) et seq. So we are done; conservation of energy can be assured, but to do so requires, as did the quadratic conservation 'law' in the previous chapter, $\beta = 1/2$, thus sacrificing global conservation of momentum.

Final Remarks:

(1) Conservation of energy is, of course, more important than conservation of momentum if guaranteed stable DAE's are desired.
Recall too that only $\beta = 1/2$ gives a skew-symmetric advection matrix when $\mathbf n\cdot\mathbf u^h = 0$ on $\Gamma$; i.e., we then have $u^T[N(u) + \beta D(u)]\,u = 0$. If $\beta = 0$ or 1, then the DAE's are 'indefinite': they may be stable or unstable, although for well-designed problems and grids, instability will be rare (unless $\nu = 0$, which is not recommended in general).

(2) $\beta = 1/2$ has long been used by many 'theorists'; it was introduced by Temam (1966, 1968) in order to assure 'well-behaved' equations, and the analysis above shows why.
(3) Further detailed discussions of issues of momentum conservation and consistent force calculations are presented in Gresho et al. (1987), albeit in a manner that is somewhat obfuscated by the use of somewhat 'clumsy' notation.

(4) The 'sacrifice' of global momentum conservation when $\beta = 1/2$ is usually a small one because $\nabla\cdot\mathbf u^h$ is, for 'reasonable' simulations, 'small.' (Ditto the conservation of energy when $\beta = 1$, at least if the DAE's are stable; and ditto both when $\beta = 0$.) Finally, $\gamma = 0$ is also usually quite 'acceptable', for any $\beta$.

3.13.9 Periodic Boundary Conditions

Having shown how the GFEM can be used to compute consistent forces on boundaries, we are finally ready to return to the 'BC Section' (3.8) and complete it, ironically by removing the boundaries. Recalling first the discussion of periodic BC's for the scalar transport equation (Section 2.6.3d, a review of which might be helpful) and the discussion of internal line heat sources and internal heat fluxes in Section 2.3.2d, we extend these concepts in the appropriate (if not obvious) way to the NS equations. The less obvious part is related, of course, to the pressure.

We deal with the easy part first: tangential velocity (velocities in 3D). They are easy because they are a direct analog of the scalar case; thus, the periodic BC is simply a matter of 'node numbering': there must be a one-for-one nodal identity for each node on the periodic boundary. That's all there is to it.

But the normal direction, that which involves the pressure in the 'force balance,' is another matter. The first thing that must be addressed is 'frictional pressure drop' and the associated lack of a periodic pressure BC.
Since the pressure usually, at least 'on average', decreases in the flow direction, it is clear that $P$ cannot be the same at the exit as at the inlet of such a 'periodic' domain, a simple example being flow in a long pipe with venturi meters (of the same type) inserted every $l$ units, for which we choose a domain of length $l$. What to do? Well, the only way the 'modeler' can force the normal velocity to be periodic yet permit 'pressure drop' in the computational domain is to add a 'pump' at the periodic boundary, to cause a jump in $P$ from the lower exit value to the higher inlet value. Note too that if the physical flow were truly periodic, in that flow through a closed loop were being addressed, said closed loop could also not operate without a pump. Thus, we have physical justification for adding a mathematical pump to make the pressure jump. [Actually, the pump jump is modeled as a normal traction (or pseudo-traction) jump, as we shall see.]

The velocity portion is (again) easy, and 'standard': just give the appropriate inlet and exit nodes (degrees of freedom, to be more precise) the same 'name.' Then, just as we allowed the possibility of either adding a line (plane) heat source or specifying the temperature in the scalar case, Section 2.3.2d, we now have the option of either adding a pump (line/plane 'source' for normal momentum) or specifying the desired normal velocity along the periodic boundary and determining the required 'pump characteristics' (which could be rather strange, depending on the normal velocity profile imposed). To do the former requires the use of the 'augmented' set of momentum equations, those just developed in the previous Section [(3.13-402) through (3.13-406)], which obviously explains why we waited until now to discuss the periodic case. We need the so-called 'reaction force,' called $\tilde F_\alpha$ in (3.13-405) of Section 3.13.8, to introduce our mathematical pump.
Remark: Actually, the $\alpha$ in $\tilde F_\alpha$ must correspond to the normal direction at the periodicity line (plane), which could also be the x-, or y-, or z-direction, but need not be. If it is 'none of the above,' then the rotated momentum equations, to normal and tangent directions, as described in Section 3.13.1e [equations (3.13-38) through (3.13-57)], must be employed.

A digression.

Before actually addressing the periodic case, let us analyze the situation in which a 'pump' is inserted along a line of nodes internal to the domain; i.e., we consider a 'line source of normal momentum.' As in Section 2.3.2d, there are two ways, nearly equivalent, to insert a pump: (i) add a line momentum source along a line of velocity nodes, or (ii) use the NBC approach to specify the total jump in momentum flux. We shall present the second form because, while requiring more effort to derive, it is slightly more general: it permits the calculation of the split in 'pump work,' by separate calculation of 'suction side' and 'pressure side'; details to follow, at least for one type of element (with two types of pressure approximation).

To this end, consider the 4-patch in Figure 3.13-37, which we shall employ simultaneously in two ways, $Q_1Q_0$ and $Q_1Q_1$, partly to show how much simpler is the case of discontinuous pressure. The pump is located between the two center columns of nodes and will inject x-momentum only. Also, the separation is figurative only: nodes $0_L$ and $0_R$ (et al.) reside at the same x-location. The reasons for the 'duality' are two: (i) it is needed for $Q_1Q_1$ because we need two pressures at the same location in order to permit a discontinuity/jump, and (ii) it will make the periodic BC case easier. To make the analysis tractable, we shall consider only the transient Stokes equations (or steady, by dropping the acceleration terms).
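Also, as in Section 2.3.2d, we begin the analysis by 'decoupling' the left and right pairs of elements, as if $S_L - 0_L - N_L$ were the right boundary of the 'left' domain and $S_R - 0_R - N_R$ the left boundary of the 'right' domain.

Fig. 3.13-37 A 4-patch for analyzing periodic BC's. [Node columns, left to right: SW, W, NW; $S_L$, $0_L$, $N_L$; $S_R$, $0_R$, $N_R$; SE, E, NE.]

The weak form of interest is

$$\int_\Gamma \varphi_i\,\mathbf F^h = \int \varphi_i\,\frac{\partial\mathbf u^h}{\partial t} + \nu\int \nabla\varphi_i\cdot\nabla\mathbf u^h - \int P^h\,\nabla\varphi_i, \tag{3.13-413}$$

which we apply, sequentially, to nodes $0_L$ and $0_R$, via $\mathbf u^h = \sum_j \mathbf u_j\varphi_j$ and $P^h = \sum_j P_j\psi_j$ and LM (lumped mass) for simplicity:

$$\int_{y_S}^{y_N} \varphi_{0_L} F_{x_L}^h(y)\,dy = \frac{\nu h}{6l}\left[u_{S_L} - u_{SW} + 4(u_{0_L} - u_W) + u_{N_L} - u_{NW}\right] - \frac{\nu l}{6h}\left[(u_{SW} - 2u_W + u_{NW}) + 2(u_{S_L} - 2u_{0_L} + u_{N_L})\right] - f_L(P) + \frac{lh}{2}\,\dot u_{0_L}, \tag{3.13-414}$$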
where $f_L(P)$ is the pressure term, which is different for the two elements:

$$Q_1Q_0:\quad f_L(P) = \frac{h}{2}(P_{SW} + P_{NW}) \quad\text{with } P \text{ at centroids},$$

$$Q_1Q_1:\quad f_L(P) = \frac{h}{12}\left[(P_{SW} + 4P_W + P_{NW}) + (P_{S_L} + 4P_{0_L} + P_{N_L})\right] \quad\text{with } P \text{ at nodes}.$$

The analogous equation for the right node is

$$\int_{y_S}^{y_N} \varphi_{0_R} F_{x_R}^h(y)\,dy = \frac{\nu h}{6l}\left[u_{S_R} - u_{SE} + 4(u_{0_R} - u_E) + u_{N_R} - u_{NE}\right] - \frac{\nu l}{6h}\left[2(u_{S_R} - 2u_{0_R} + u_{N_R}) + (u_{SE} - 2u_E + u_{NE})\right] + f_R(P) + \frac{lh}{2}\,\dot u_{0_R}, \tag{3.13-415}$$

where $f_R(P)$ is the pressure term:

$$Q_1Q_0:\quad f_R(P) = \frac{h}{2}(P_{SE} + P_{NE}) \quad\text{with } P \text{ at centroids},$$

$$Q_1Q_1:\quad f_R(P) = \frac{h}{12}\left[(P_{S_R} + 4P_{0_R} + P_{N_R}) + (P_{SE} + 4P_E + P_{NE})\right] \quad\text{with } P \text{ at nodes}.$$

To obtain the final GFEM x-momentum equation at node 0, we sum the two equations to get the total applied force at node 0, and we merge the L and R nodes for velocity (only) to get, after dividing by $h$,

$$\frac{1}{h}\int_{y_S}^{y_N} \varphi_0\left[F_{x_L}^h(y) + F_{x_R}^h(y)\right]dy = \frac{\nu}{6l}\left\{(u_S - u_{SW}) - (u_{SE} - u_S) + 4\left[(u_0 - u_W) - (u_E - u_0)\right] + (u_N - u_{NW}) - (u_{NE} - u_N)\right\} - \frac{\nu l}{6h^2}\left[(u_{SW} - 2u_W + u_{NW}) + 4(u_S - 2u_0 + u_N) + (u_{SE} - 2u_E + u_{NE})\right] + \frac{1}{h}\left[f_R(P) - f_L(P)\right] + l\,\dot u_0, \tag{3.13-416}$$

where, for $Q_1Q_0$,

$$\frac{1}{h}\left[f_R(P) - f_L(P)\right] = \frac{P_{SE} + P_{NE}}{2} - \frac{P_{SW} + P_{NW}}{2}$$

with $P$ at centroids, and for $Q_1Q_1$,

$$\frac{1}{h}\left[f_R(P) - f_L(P)\right] = \frac{1}{12}\left\{\left[P_{SE} + P_{S_R} + 4(P_E + P_{0_R}) + P_{NE} + P_{N_R}\right] - \left[P_{SW} + P_{S_L} + 4(P_W + P_{0_L}) + P_{NW} + P_{N_L}\right]\right\},$$

with $P$ at nodes (and two $P$'s at three of the nodes; see Remark (5) below).

Letting $F_{x_L}^h(y) + F_{x_R}^h(y) = F_x^h(y)$, the total applied force in the x-direction (at $y$), and letting $l, h \to 0$, it is hopefully clear that the viscous 'y-terms' ($\sim \nu l\,u_{yy}$) and the acceleration terms vanish, and the remaining terms converge (we hope) to

$$F_x = \nu\left(\frac{\partial u}{\partial x}\bigg|_L - \frac{\partial u}{\partial x}\bigg|_R\right) - \left(P\big|_L - P\big|_R\right), \tag{3.13-417}$$

showing the jump in (pseudo) traction force caused by the pump.
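As a sanity check on the one-sided viscous stencil in (3.13-414), the sketch below assembles the bilinear ($Q_1$) Laplacian over the two left elements of the 4-patch and compares the row for node $0_L$ with the coefficients read off the formula, namely $(\nu h/6l)[u_{S_L} - u_{SW} + 4(u_{0_L} - u_W) + u_{N_L} - u_{NW}] - (\nu l/6h)[(u_{SW} - 2u_W + u_{NW}) + 2(u_{S_L} - 2u_{0_L} + u_{N_L})]$. The assumptions are that the elements are $l \times h$ rectangles and that $\gamma = 0$ (the $\nabla^2$ form); the function names (`q1_stiffness`, `assembled_row`, `formula_row`) are hypothetical, and only the viscous part is checked, not the pressure or mass terms.

```python
import numpy as np

def q1_stiffness(l, h):
    """Element matrix  int grad(phi_i) . grad(phi_j)  for a bilinear (Q1)
    rectangle of width l and height h, by 2x2 Gauss quadrature (exact here).
    Local node order: (0,0), (l,0), (l,h), (0,h)."""
    g = 1.0 / np.sqrt(3.0)
    xi_n = np.array([-1.0, 1.0, 1.0, -1.0])
    eta_n = np.array([-1.0, -1.0, 1.0, 1.0])
    ke = np.zeros((4, 4))
    for xi in (-g, g):
        for eta in (-g, g):
            dphi_dx = 0.25 * xi_n * (1.0 + eta_n * eta) * (2.0 / l)
            dphi_dy = 0.25 * eta_n * (1.0 + xi_n * xi) * (2.0 / h)
            w = (l / 2.0) * (h / 2.0)  # Jacobian determinant; Gauss weights are 1
            ke += w * (np.outer(dphi_dx, dphi_dx) + np.outer(dphi_dy, dphi_dy))
    return ke

# The six nodes of the 'left' half of the 4-patch.
names = ["SW", "W", "NW", "SL", "OL", "NL"]
idx = {name: k for k, name in enumerate(names)}
# Lower-left and upper-left elements, counterclockwise from the lower-left corner.
elements = [("SW", "SL", "OL", "W"), ("W", "OL", "NL", "NW")]

def assembled_row(l, h, nu=1.0):
    """Row of the assembled viscous matrix belonging to node 0_L."""
    K = np.zeros((6, 6))
    ke = nu * q1_stiffness(l, h)
    for el in elements:
        dofs = [idx[n] for n in el]
        K[np.ix_(dofs, dofs)] += ke
    return K[idx["OL"]]

def formula_row(l, h, nu=1.0):
    """Coefficients of the same row, read off the stencil quoted above."""
    a = nu * h / (6.0 * l)
    b = nu * l / (6.0 * h)
    coeff = {"SW": -a - b, "W": -4 * a + 2 * b, "NW": -a - b,
             "SL": a - 2 * b, "OL": 4 * a + 4 * b, "NL": a - 2 * b}
    return np.array([coeff[n] for n in names])

print(np.allclose(assembled_row(0.7, 1.3, 0.2), formula_row(0.7, 1.3, 0.2)))
```

The same element routine, applied to the two right elements, gives the mirror-image row used in (3.13-415).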
Remarks:

(1) Probably most of the jump given in (3.13-417) will usually be taken up by the pressure change with, as usual, the viscous contribution being small.

(2) When the total pump jump, $F_x^h(y)$, is specified, (3.13-416) is the GFEM equation that determines $u_0$ ($\dot u_0$ for the transient case); i.e., this case shows how to put a pump into the system.

(3) If $u(y)$ is prescribed along the pump line [and $\dot u(y)$], an internal Dirichlet BC, then (3.13-416) is the equation for the total jump, in the form $F_x^h(y) = \varphi_S F_{x_S} + \varphi_0 F_{x_0} + \varphi_N F_{x_N}$. In this case, it is possible (although not necessary) to 'post-process' via (3.13-414) and (3.13-415), and $F_{x_L}^h(y) = F_{x_L,S}\,\varphi_S + F_{x_L,0}\,\varphi_0 + F_{x_L,N}\,\varphi_N$ and similarly for $F_{x_R}^h$, to determine how much of the total jump is on the suction side ($F_{x_L}^h$) and how much on the pressure side ($F_{x_R}^h$), for what it is worth.

(4) If we take $F_x = 0$, we recover from (3.13-416) the conventional GFEM equation for an interior node and no pump. For this case, it would (or at least should) turn out that $P_{0_L} = P_{0_R}$, etc., for $Q_1Q_1$, thus converting $(1/h)[f_R(P) - f_L(P)]$ back (after dividing the entire equation by $l$) to a term that approximates $\partial P/\partial x$.

(5) For $Q_1Q_1$, we are not yet done: we must account for the extra pressures via appropriate (extra) continuity equations. This follows naturally once we realize that the pressure basis function for $0_L$ spans only the two left elements and that for $0_R$ spans only the two on the right. (No such concern exists for $Q_1Q_0$, or any element with discontinuous pressure, because the pressure jump is accommodated 'naturally,' and the continuity equations are exactly the same as those without a pump.) Thus, rather than the single continuity equation for node 0,

$$\frac{h}{12}\left[(u_{SE} - u_{SW}) + 4(u_E - u_W) + (u_{NE} - u_{NW})\right] + \frac{l}{12}\left[v_{NE} - v_{SE} + 4(v_N - v_S) + (v_{NW} - v_{SW})\right] = 0,$$

we get instead the pair:
$$\frac{h}{12}\left[(u_S - u_{SW}) + 4(u_0 - u_W) + (u_N - u_{NW})\right] + \frac{l}{12}\left[v_{NW} - v_{SW} + 2(v_N - v_S)\right] = 0 \quad\text{for } 0_L,$$

and

$$\frac{h}{12}\left[u_{SE} - u_S + 4(u_E - u_0) + (u_{NE} - u_N)\right] + \frac{l}{12}\left[2(v_N - v_S) + (v_{NE} - v_{SE})\right] = 0 \quad\text{for } 0_R.$$

(Note that the sum of the last two equations is the first equation, so that it too is satisfied in the 'pump' case.)

(6) If pressure modes (spurious or hydrostatic) are permitted by the BC's on the non-periodic portion of the domain, or if the entire domain is endowed with periodic BC's, then they will appear in full measure and the associated matrix will be singular. They are, however, innocuous in that the associated solvability conditions [$P_m^T g = 0$ or $P_m^T g(t) = 0$] are automatically satisfied because $g = 0$ at all periodic boundaries. The matrix singularity in these cases can be avoided by appropriate specification of pressures, as discussed earlier (Section 3.13.2b).

Digression

Following on from Remark (2), another possibility (see, for example, Fortin, 1988) is to require that the total flow rate, $\int_{\Gamma_P} u_n$, be specified along the 'pump line,' $\Gamma_P$, rather than the pointwise value of $\mathbf u\cdot\mathbf n$. In this case, rather than a variable $F_x(y)$ in (3.13-416), we are restricted to a constant value: the additional single constraint equation permits only a single extra degree of freedom, and $u(y)$, from (3.13-416), will vary along the pump line (plane in 3D). One does, however, have the freedom/flexibility to apply this constant force/pressure drop at only some elements/nodes (even one!) and not others, as long as $F_x$ is the same wherever it is applied. Probably a constant $F_x$ along the entire pump line
would make the most sense in most cases. (Note too that a total flow constraint is more or less 'global' in that all nodes along the pump line are coupled.)

End digression

Back to periodic.

We are finally ready to complete the discussion of periodic BC's for the NSE. And the discussion will be brief because all of the hard work is behind us. For the periodic case, the 'left' nodes in Figure 3.13-37 are those at the outlet (assuming the conventional left-to-right flow convention); i.e., the left nodes (and their equations) are located at the right boundary of the periodic domain, and the 'right' nodes in Figure 3.13-37 are at the left domain boundary. Thus, in the periodic case, the x-locations of the nodes (but not the y-locations) are truly different, even though the velocities (but not the pressures) are still tied by periodicity, $u_{0_L} = u_{0_R} = u_0$, etc. (They are given the same node number.) Finally, we hope and believe that this extended discussion of $Q_1Q_0$ and $Q_1Q_1$ will permit the reader to implement pumps and/or periodic BC's for other elements, with either continuous or discontinuous pressure.

If the expected solution is truly periodic (velocity and pressure), nothing else need be done. If, however, the domain is a 'flow-through' type that suffers a pressure drop in the flow direction, then further input is required: either $u(y)$ or $F_x(y)$ must be specified. In either case, where continuous pressure elements are employed, the pressure nodes along the periodic boundary are not tied; they are separate, and there are separate (not coupled) continuity equations at inlet and outlet pressure nodes. Discontinuous pressure elements need no special consideration. Finally, for continuous pressure elements and true periodicity, the pressure nodes may be tied together, though it is not then a requirement.
Remark: We are speculating more than we like on this last point, and in Remark (4) above: i.e., we believe but cannot prove that the pressure jump would be zero if either $F_x = 0$ or true periodicity were computed without tying the pressure nodes together.

For the PPE formulation, we actually proceed in much the same way, at least for 'Case 1' (specified pressure drop): (i) use the 'standard' method (proper node numbering) for the velocity, (ii) use the velocity normal traction NBC as the PPE Dirichlet BC (still applied weakly, of course), to add the desired pressure drop; i.e., we proceed to set the problem up exactly as if we were solving the primitive variable formulation. The consistent construction of the consistent PPE will take care of the rest.

For 'Case 2' (specified $u_n$), the PPE will again take care of itself, automatically, but differently: the inherited, inhomogeneous Neumann BC that is associated with specified normal velocity still applies here. But note that the 'inlet' nodes for $P$ see a different RHS than do those at the outlet, and therefore $\partial P/\partial n$ may differ at the 'interface'; the jump in pressure will be a natural consequence of these Neumann BC's and is, of course, that produced by the implied pump. So, it turns out, once again, that the PPE 'method' should be treated, with respect to IC's and BC's, just as if it were the u-P method. Also similar to the u-P formulation, post-processing for the $\Delta f_n$ could be applied if desired.

Finally, related to these issues is one more: we believe that it matters little, at least for a Newtonian fluid away from the Stokes limit, whether the stress divergence form ($\gamma = 1$, true traction vector) or the simpler form ($\gamma = 0$, pseudo-traction) is employed in the above equations.
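The 'node numbering' treatment of the periodic velocity degrees of freedom can be sketched as follows. This is a minimal illustration, assuming a structured grid of $Q_1$ velocity elements that is periodic in x; the helper names `periodic_node_numbers` and `element_connectivity` are hypothetical and not taken from any particular code. The outlet column of nodes simply reuses the numbers of the inlet column, so that ordinary element-by-element assembly ties the two boundaries together with no further work.

```python
import numpy as np

def periodic_node_numbers(nx, ny):
    """Velocity node numbers for an nx-by-ny grid of Q1 elements, periodic in x.

    Returns an (ny+1, nx+1) integer array in which the last column (outlet)
    carries the same node numbers as the first column (inlet).
    """
    ids = np.arange((ny + 1) * nx).reshape(ny + 1, nx)
    return np.hstack([ids, ids[:, :1]])

def element_connectivity(node_ids):
    """Counterclockwise corner node numbers of every element, row by row."""
    n_rows, n_cols = node_ids.shape
    conn = []
    for j in range(n_rows - 1):
        for i in range(n_cols - 1):
            conn.append((node_ids[j, i], node_ids[j, i + 1],
                         node_ids[j + 1, i + 1], node_ids[j + 1, i]))
    return np.array(conn)

ids = periodic_node_numbers(nx=4, ny=2)
conn = element_connectivity(ids)
print(ids)
print(conn[3])  # last element of the bottom row wraps around to the inlet column
```

For a flow-through domain with continuous-pressure elements, the pressure nodes would keep an ordinary (non-periodic) numbering, since the text above notes that they are not tied in that case.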
For some nice examples of 2D periodic flow past arrays of cylinders, both in one direction and two, see Tezduyar and Liou (1990), who used the $\psi$-$\omega$ formulation. For some primitive variables results, including alternative methods of implementing the periodic BC's, see Fortin (1988) and Segal et al. (1994).

3.14 A CONTROL VOLUME FINITE ELEMENT METHOD

As promised in Chapter 2, we will extend the CVFEM discussion there to the NS equations. But since we now know that (most) FVM's are inherently low-order methods (first- and second-order), we limit our scope ab initio and present the CVFEM version of only $Q_1Q_0$. First we give an executive summary: since all terms except $\nabla P$ and $\nabla\cdot\mathbf u$ are obviously the same (for each velocity component) as for the scalar problem of Section 2.5.3, and since the discrete approximations to div and grad turn out to be identical to those of GFEM on $Q_1Q_0$ (really), we are basically done; i.e., the DAE's for CVFEM are quite close (identical for $CP$ and $C^Tu$) to those for GFEM presented in Section 3.13.5.

All that really remains is to show that GFEM = CVFEM for div and grad, and this we do next, using the sketch in Figure 3.14-1. In the sketch, $\mathbf n_G$ is a unit normal on an element boundary, $\mathbf n_{CV}$ is a Control Volume (FEM) unit normal, and we shall focus on element 3 to present our story.

Fig. 3.14-1 A control volume 4-patch.

The subdomain/CVFEM begins (necessarily) with the divergence form of the NS equations, à la (3.4-2), to obtain

$$\rho\,\frac{\partial\mathbf u}{\partial t} = \nabla\cdot(\boldsymbol\sigma - \rho\,\mathbf u\mathbf u) = \mu\nabla^2\mathbf u - \nabla\cdot(P\mathbf I) - \nabla\cdot(\rho\,\mathbf u\mathbf u), \tag{3.14-1}$$
which is equivalent (for straight/planar boundaries) to

$$\int_{\Omega_i} \rho\,\frac{\partial\mathbf u}{\partial t} = \int_{\Gamma_i} \mathbf n\cdot(\mu\nabla\mathbf u - P\mathbf I - \rho\,\mathbf u\mathbf u) = \int_{\Gamma_i} \mathbf n\cdot(\mu\nabla\mathbf u - \rho\,\mathbf u\mathbf u) - \int_{\Gamma_i} \mathbf n P = \int_{\Gamma_i}\left(\mu\,\frac{\partial\mathbf u}{\partial n} - \rho\,u_n\mathbf u\right) - \int_{\Gamma_i} \mathbf n P, \tag{3.14-2}$$

where we have used $\nabla\cdot\mathbf u = 0$ to obtain the simplified form of the viscous term. If the acceleration, viscous term, and advective term are broken down into individual cartesian components, it is clear that each component 'looks like' the analogous scalar terms in (2.2-37) and (2.2-38); thus, we need now only focus on the new term, the pressure gradient:

$$\int_{\Omega_i} \nabla P = \int_{\Gamma_i} \mathbf n P \tag{3.14-3}$$

is the CVFEM version of $\nabla P$. Recall now the GFEM form of $\nabla P$:

$$\int_\Omega \varphi_i\nabla P = -\int_\Omega P\,\nabla\varphi_i + \int_\Gamma \mathbf n\,\varphi_i P = -\int_\Omega P\,\nabla\varphi_i \tag{3.14-4}$$

for an interior 4-patch because there is then no boundary integral. To show the equivalence, we use $P = \sum_j P_j\psi_j(\mathbf x)$, where $\{\psi_j\}$ are the piecewise-constant pressure basis functions for $Q_1Q_0$, in both (3.14-3) and (3.14-4) to give

$$\int_{\Omega_i} \nabla P = \sum_j P_j \int_{\Gamma_{CV_j}} \mathbf n \tag{3.14-5}$$

and

$$\int_\Omega \varphi_i\nabla P = -\sum_j P_j \int_{\Omega_j} \nabla\varphi_i, \tag{3.14-6}$$

respectively, where, in each case, the sum over $j$ is effectively a sum over the 4-patch. Finally, we invoke Green's theorem in (3.14-6) to obtain

$$\int_\Omega \varphi_i\nabla P = -\sum_j P_j \int_{\Gamma_j} \mathbf n\,\varphi_i, \tag{3.14-7}$$

where $\Gamma_j$ is the boundary of element $j$. Now note that, for each element, $\varphi_i$ is zero on those two sides that are 'opposite' node $i$. (For example, in Figure 3.14-1, $\varphi_i$ is zero on sides $6 \to 9$ and $8 \to 9$ in element 3.) Considering now each element in turn, we have equivalence of the two pressure gradients if

$$\int_{\Gamma_{CV}} \mathbf n_{CV} = -\int_{\Gamma_G} \varphi_i\,\mathbf n_G, \tag{3.14-8}$$

where $\Gamma_{CV}$ and $\Gamma_G$ are the appropriate control-volume boundary segments and element boundary segments, respectively. For example, for element 3 we need

$$\int_{a\to b} \mathbf n_{CV} + \int_{b\to c} \mathbf n_{CV} = -\int_{5\to 6} \varphi_i\,\mathbf n_G - \int_{5\to 8} \varphi_i\,\mathbf n_G \tag{3.14-9}$$
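in order that the matrix coefficient of $P_3$ be the same for each. We shall prove that (3.14-9) is true by direct construction, after rewriting it in local coordinates:

$$\int_{-1}^{0} \mathbf n_{CV}\,d\eta + \int_{-1}^{0} \mathbf n_{CV}\,d\xi = -\int_{-1}^{1} \mathbf n_G\,\varphi_i\big|_{\eta=-1}\,d\xi - \int_{-1}^{1} \mathbf n_G\,\varphi_i\big|_{\xi=-1}\,d\eta. \tag{3.14-10}$$

Noting that each unit normal vector is constant in the integrand and that the boundary integral of $\varphi_i$ is simply half the length of the element side, reduces the problem to showing that

$$\ell_{a-b}\,\mathbf n(0, -1/2) + \ell_{b-c}\,\mathbf n(-1/2, 0) = -\tfrac{1}{2}\,\ell_{5-6}\,\mathbf n(0, -1) - \tfrac{1}{2}\,\ell_{5-8}\,\mathbf n(-1, 0), \tag{3.14-11}$$

where we have 'evaluated' the unit normal vectors at the midpoint of each line segment for convenience. To finish, we use Figure 3.14-2, which will give us the various normal vectors.

Fig. 3.14-2 Unit normal vectors. [A line $y = mx + b$ with slope $m = (y_2 - y_1)/(x_2 - x_1)$ and segment length $\ell = \sqrt{\Delta x^2 + \Delta y^2}$.]

The equation of the normal vectors is then

$$\mathbf n_L = -\mathbf n_R = \frac{1}{\sqrt{1 + m^2}}\begin{bmatrix} -m \\ 1 \end{bmatrix} = \frac{1}{\sqrt{1 + (y')^2}}\begin{bmatrix} -y' \\ 1 \end{bmatrix},$$

and we can now evaluate each term in (3.14-11):

$$\begin{aligned}
1.\quad \ell_{a-b}\,\mathbf n(0, -1/2) &= \begin{bmatrix} \dfrac{y_5 + y_6 + y_8 + y_9}{4} - \dfrac{y_5 + y_6}{2} \\[2mm] \dfrac{x_5 + x_6}{2} - \dfrac{x_5 + x_6 + x_8 + x_9}{4} \end{bmatrix} = \frac{1}{4}\begin{bmatrix} y_8 + y_9 - y_5 - y_6 \\ x_5 + x_6 - x_8 - x_9 \end{bmatrix}, \\
2.\quad \ell_{b-c}\,\mathbf n(-1/2, 0) &= \frac{1}{4}\begin{bmatrix} y_5 + y_8 - y_6 - y_9 \\ x_6 + x_9 - x_5 - x_8 \end{bmatrix}, \\
3.\quad -\tfrac{1}{2}\,\ell_{5-6}\,\mathbf n(0, -1) &= \frac{1}{2}\begin{bmatrix} -(y_6 - y_5) \\ x_6 - x_5 \end{bmatrix}, \\
4.\quad -\tfrac{1}{2}\,\ell_{5-8}\,\mathbf n(-1, 0) &= \frac{1}{2}\begin{bmatrix} y_8 - y_5 \\ -(x_8 - x_5) \end{bmatrix},
\end{aligned} \tag{3.14-12}$$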
and we are finished. Both sides of (3.14-11) give $\frac{1}{2}\begin{bmatrix} y_8 - y_6 \\ x_6 - x_8 \end{bmatrix}$, which is just the 'C-matrix' coefficient of $Q_1Q_0$ (see Appendix 1). Generalization is immediate, and we see that the discrete gradient operators for GFEM and CVFEM are identical.

Remarks:

(1) Noting that the integrals of both GFEM and CVFEM test functions are the same (1/4 of the relevant area) makes the equivalence more believable, intuitively.

(2) Clearly, the discrete divergence operators are also identical in the two cases, since both test and basis functions are identical in $\int \psi_i\,\nabla\cdot\mathbf u^h = 0$.

(3) The above construction has generated an alternate, but not necessarily useful, way to compute the C-matrix.

(4) The absence of node 9 in the gradient evaluation at node 5 makes the 'bent element blues' discussed in Gresho and Leone (1984) particularly obvious; the pressure gradient at node 5 is completely independent of/oblivious to the location of node 9: it could be on the moon and make no difference. (This remark, of course, also applies to nodes 1, 3, and 7 when 'fully assembled.') The gradient is also independent of node 5's location.

(5) For an element that is a simple rectangle, the equivalence of the two gradients is fairly obvious.

(6) The extension to 3D is not fairly obvious, unless the elements are simple bricks. Isoparametric elements with planar faces/sides are also straightforward, but those with non-planar sides are not.

Now that we have shown div and grad equivalence, there is one final aspect of CVFEM that needs to be addressed: open (or outflow) boundary conditions. What is the CVFEM equation for a node at which the velocity is not specified? To answer this important question, we examine the following two-patch in Figure 3.14-3 at the right edge of a computational domain.

Fig. 3.14-3 A boundary 2-patch.

In generating the discrete momentum equation for node 0, we are led to consider

$$\int_{\Omega_0} \nabla\cdot(\mathbf I P) = \int_{\Gamma_0} \mathbf n P, \tag{3.14-13}$$
which leads to the question, 'Is that portion of $\Gamma_0$ that comprises $a - 0 - b$ to be included or not?' Our answer is the following: yes for the tangential component, but no for the normal component, the second part of which may be surprising to some CVFEM practitioners. The reason is this: if the entire CV boundary was included in the equation for the normal momentum equation, the result would be zero, because $P$ is piecewise-constant. [It is, of course, not zero for the tangential equation; there it is simply and appropriately $H(P_N - P_S)/2$.] What is needed is the realization that a normal force balance is needed on the open boundary, and this leads to the CVFEM version of the NBC of GFEM; namely,

$$-P + \mu\,\frac{\partial u_n}{\partial n} = f_n, \tag{3.14-14}$$

where, as with GFEM, the missing factor of two in the viscous term is a result of dropping the $(\nabla\mathbf u)^T$ portion of the viscous stress in (3.14-1), so that, as with GFEM, we really have a pseudo-traction BC. But the key point is that the pressure must show up in the OBC so that, when considering the boundary integral at node 0, the portion of it on $a \to b$ must be replaced by the above BC; i.e., the combined viscous and pressure term [see (3.14-2)], $\int_\Gamma (\mu\,\partial\mathbf u/\partial n - \mathbf n P)$, is replaced by $\mathbf F$, a given (pseudo) traction vector. [In the above sketch, the pressure term is then $-(P_N + P_S)h/2$, which is '$C_xP$' at node 0 and is again identical to the GFEM result, which result came about somewhat more 'naturally' as a natural boundary condition.]

So we have come to the end of our CVFEM presentation for the NS equations. For convenience, however, we summarize the key results below, since some of them are lifted from the previous chapter:

1. For an interior four-patch, (2.5-4) can be used to obtain the CVFEM version of $\partial\mathbf u/\partial t$, $\mathbf u\cdot\nabla\mathbf u$, and $\nabla^2\mathbf u$ by replacing $T$ by $u$ and $v$, respectively.

2. The div and grad terms are identical to those from GFEM, à la Section 3.13.5.

3.
To construct the CVFEM equations at an open boundary, use (2.5-8) for a 2-patch and (2.5-10) for a 1-patch in the same way as above, and use the GFEM results in Section 3.13.5 to get div and grad and the OBC's.

4. As for the scalar case, the characteristic GFEM averaging of (1 4 1)/6 changes to (1 6 1)/8.

5. The consistently derived CVFEM has more similarities than differences from GFEM, and it is the authors' opinion that every difference but one is in favor of GFEM. That 'one' is: 'flux in = flux out.' But see Appendix 2.

6. The nascent theory of FVM's has been 'covered' briefly via some of the citations of Section 1.7.

*3.15 VARIATIONAL PRINCIPLES FOR POTENTIAL AND STOKES FLOW

3.15.1 Introduction

The steady Stokes equations, and their seemingly totally unrelated but simpler cousin, the potential flow equations, provide a rich setting for mathematical analysis, mostly related to or caused by the incompressibility constraint. These equations can be formulated via
minimum principles or via maximum principles (which also introduces a least-squares solution) or by 'both' (saddle-point principles); or by projections. We shall do all of the above below, and in an order that moves more or less monotonically up the scale of difficulty, beginning with the discrete equations for which linear algebra provides the bulk of the analytical tools, as well described by Strang (1986; see also his SIAM review article, Strang, 1988), who refers to the subject as 'duality.' After developing and presenting this theory for the Stokes equations, we will digress to show how easily it also applies to the discrete equations of potential flow. Then we leave the comforts of the finite dimensional world to address the continuous analogs, first for potential flow, and finally concluding with steady Stokes flow.

This path may seem to some a digression from the goal of solving the NS equations via GFEM, and indeed it is, and can be skipped over without loss of continuity. It is much more aimed at the 'incompressible flow' part of our title. For 'instructions' (or, at least, some guidance) on how to generate appropriate functionals, see Kardestuncer and Norrie (1987) or Sewell (1987); see also Finlayson (1972).

3.15.2 Discrete Stokes

The discrete steady Stokes equations are contained in (3.13-28) and (3.13-29): omit $M\dot u$ and $N(u)u$, and add a body force (in general) to obtain

$$Ku + CP = f, \tag{3.15-1}$$

$$C^T u = g, \tag{3.15-2}$$

which can be rewritten as $Ax = b$ with $x^T = (u^T, P^T)$, $b^T = (f^T, g^T)$, and

$$A = \begin{bmatrix} K & C \\ C^T & 0 \end{bmatrix}, \tag{3.15-3}$$

where $K$ is SPD (and $n \times n$), and $C$ ($n \times m$, $n > m$) has full rank (no pure pressure modes permitted, for the time being); thus, $K$ is invertible, and $C$ has no null space. It is noteworthy that $C$ has more rows than columns, and $C^T$ has more columns than rows.
Thus, when $C$ (and therefore $C^T$) has full rank, the rank is $m$, and since $m < n$, it follows that $C^T$ has a non-trivial null space even though $C$ does not: there are $n - m$ linearly independent vectors $\{u\}$ for which $C^T u = 0$; this null space is called the divergence-free subspace of $R^n$ corresponding to/generated by $C^T$.

Formally (but never computationally), the solution to this linear system can be obtained in two steps:

$$\text{(i)}\quad P = (C^T K^{-1} C)^{-1}\,[C^T K^{-1} f - g], \tag{3.15-4}$$

$$\text{(ii)}\quad u = K^{-1}(f - CP), \tag{3.15-5}$$

showing also that $A$ is invertible.

To put the above solution into a variational setting, we introduce three functionals:

$$\text{(i)}\quad J(v) = v^T\left(\tfrac{1}{2}Kv - f\right), \tag{3.15-6}$$

$$\text{(ii)}\quad I(q) = -\tfrac{1}{2}(Cq - f)^T K^{-1}(Cq - f) - q^T g, \tag{3.15-7}$$

and

$$\text{(iii)}\quad L(v, q) = J(v) + q^T(C^T v - g). \tag{3.15-8}$$
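The two-step solution (3.15-4), (3.15-5) and the three functionals can be exercised on a small dense example. The sketch below is illustrative only: $K$ and $C$ are random stand-ins (an SPD matrix and a full-rank rectangular matrix), not finite element matrices, and forming $K^{-1}$-type products explicitly is exactly what the text warns is never done computationally. It compares the two-step result with a direct solve of the block system $Ax = b$, and then probes the primal and dual functionals with admissible perturbations, anticipating the minimization and maximization statements that follow.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3
B = rng.standard_normal((n, n))
K = B @ B.T + n * np.eye(n)      # SPD stand-in for the viscous matrix
C = rng.standard_normal((n, m))  # stand-in for the gradient matrix
assert np.linalg.matrix_rank(C) == m  # full rank: no pure pressure modes
f = rng.standard_normal(n)
g = rng.standard_normal(m)

# Two-step solution, (3.15-4) then (3.15-5).
S = C.T @ np.linalg.solve(K, C)
P = np.linalg.solve(S, C.T @ np.linalg.solve(K, f) - g)
u = np.linalg.solve(K, f - C @ P)

# Direct solve of the block system A x = b of (3.15-3).
A = np.block([[K, C], [C.T, np.zeros((m, m))]])
x = np.linalg.solve(A, np.concatenate([f, g]))

def J(v):
    """Primal functional (3.15-6)."""
    return v @ (0.5 * K @ v - f)

def I(q):
    """Dual functional (3.15-7)."""
    r = C @ q - f
    return -0.5 * r @ np.linalg.solve(K, r) - q @ g

def L(v, q):
    """Lagrangian functional (3.15-8)."""
    return J(v) + q @ (C.T @ v - g)

# Admissible velocity perturbation: a vector in the null space of C^T.
_, _, Vt = np.linalg.svd(C.T)
dv = Vt[m:].T @ rng.standard_normal(n - m)
dq = rng.standard_normal(m)

print("two-step matches block solve:", np.allclose(x, np.concatenate([u, P])))
print("J increases under a divergence-free perturbation:", J(u + dv) > J(u))
print("I decreases under a pressure perturbation:", I(P + dq) < I(P))
print("J(u) - I(P):", J(u) - I(P))
```

In this sketch the primal and dual functionals take the same value at the solution, which is the 'duality' alluded to in the introduction to this section.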
$J(v)$ is called the primal functional, $I(q)$ the dual (or reciprocal) functional, and $L(v, q)$ the Lagrangian functional. We shall show that the solution of (3.15-1) and (3.15-2), i.e., (3.15-4) and (3.15-5), can also be obtained in three other ways, one from each functional: (i) minimize $J(v)$ subject to the constraint $C^T v = g$, (ii) maximize $I(q)$ with no constraints, and (iii) find the saddle-point of $L(v, q)$. Note the asymptotic behaviors: $J(v) \to \infty$ for $\|v\| \to \infty$, $I(q) \to -\infty$ for $\|q\| \to \infty$, and $L(v, q) \to \infty$ for $\|v\| \to \infty$. If also $L(v, q) \to -\infty$ for $\|q\| \to \infty$, we would be assured that $L(v, q)$ is a saddle-point functional. The last condition is, however, not always realized, and the sufficient conditions for the existence of a saddle point may not be satisfied. It is, however, not always necessary, as we shall see. See, for example, Carey and Oden (1983, Volume II) for further discussion.

1. Minimize $J$. The first step is easy; the first variation of $J(v)$ is simply

$$\delta J(v) = \delta v^T (Kv - f), \tag{3.15-9}$$

but the second step is more subtle because of the constraint $C^T v = g$. Thus, attempting to find $\delta J(v) = 0$ via $v = K^{-1} f$ is not allowed because this $v$ is generally not in the admissible set of functions: it does not satisfy $C^T v = g$. The admissible functions do satisfy $C^T v = g$, and thus their first variations are necessarily discretely divergence-free: $C^T \delta v = 0$, which leads us to consider an $n$-vector, say $w$, that is generated by an $m$-vector, say $q$, via $w = Cq$, because all such vectors are orthogonal to $\delta v$: $\delta v^T w = w^T \delta v = q^T C^T \delta v = 0$. This leads to the proper conclusion that if $Kv - f$ in (3.15-9) were one of these $w$-vectors, we would have $\delta J(v) = 0$ and be respecting the constraint. Thus, $Kv - f$ is the 'gradient' of some scalar, say $w = -Cq$, and we have $\delta J(v) = 0$ if $Kv + Cq = f$, where $C^T v = g$; i.e., the extremum of $J(v)$ is attained at the Stokes solution: $v = u$ and $q = P$. To finish, we must show that the extremum is indeed a minimum.
This is easy; the second variation of $J(v)$, from (3.15-9), is simply

$$\delta^2 J(v) = \delta v^T K \delta v, \tag{3.15-10}$$

which is a positive definite quadratic form because $K$ is SPD, which proves that the extremum is a minimum ($\delta^2 J > 0$).

2. Maximize $I(q)$. There are fewer subtleties here because the variational problem is unconstrained. The first variation of $I(q)$ in (3.15-7) gives

$$\delta I(q) = -\delta q^T [C^T K^{-1} (Cq - f) + g], \tag{3.15-11}$$

and since $\delta q$ is a completely arbitrary $m$-vector, we obtain $\delta I(q) = 0$ when $C^T K^{-1}(Cq - f) + g = 0$; i.e., when $q = P$ from (3.15-4). This 'dual' variable formulation thus leads directly to the correct value of the dual variable, pressure. To show that this solution maximizes $I(q)$, simply form $\delta^2 I(q) = -\delta q^T C^T K^{-1} C \delta q = -x^T K^{-1} x < 0$, where $x = C\delta q$ is nonzero for $\delta q \neq 0$ because $C$ has full rank, and $K$ (hence $K^{-1}$) is SPD. Done. To finish, we simply return to (3.15-5) to recover the primal variable, $u$.

Digression 1: Least-squares solutions and projections. As a small aside, it is noteworthy that for the special case of $g = 0$ in (3.15-2), corresponding to homogeneous BC's on velocity, the above maximization (pressure solution) is related to a least-squares solution of $CP = f$,
and the full solution ($u$ and $P$) is related to an orthogonal projection of a 'velocity,' say $w$, defined by $Kw = f$ in (3.15-1), from the given space ($\mathbb{R}^n$) to the divergence-free subspace of $\mathbb{R}^n$. Let us prove these assertions: suppose we seek a 'solution' to $CP = f$, a system with more equations ($n$) than unknowns ($m$). A classical technique is the method of (weighted) least squares: find a $P$ that minimizes $\|CP - f\|_{K^{-1}}$; i.e., minimize the residual of $CP - f$ in the $K^{-1}$-norm. Defining

$$R(q) = \|Cq - f\|^2_{K^{-1}} = (Cq - f)^T K^{-1} (Cq - f), \tag{3.15-12}$$

and setting $\delta R = 2\,\delta q^T C^T K^{-1}(Cq - f) = 0$ which, since $\delta q$ is arbitrary, means that the $m$-vector $C^T K^{-1}(Cq - f)$ must be orthogonal to all $m$-vectors in $\mathbb{R}^m$, and thus it must be the zero vector: $(C^T K^{-1} C) q = C^T K^{-1} f$, and we have proven our first point: when $g = 0$, the maximization of $I(q)$ is equivalent to finding a $q$ that is the least-squares solution (in the $K^{-1}$-norm) of $Cq = f$, which we call $P$. Actually, the proof that $\delta R = 0$ yields a minimum (least-squares residual) relies on the fact that $\delta^2 R = 2\,\delta q^T C^T K^{-1} C \delta q > 0$, which we have already shown. Stated differently: the vector $CP - f$ is the smallest vector in the (large; dimension $n - m$) null space of $C^T K^{-1}$ (smallest in the $K^{-1}$-norm, that is), and it is (of course) unique. Stated differently yet: the pressure is the vector that minimizes the 'related' vector, $K^{-1}(Cq - f)$, in the $K$-norm over all divergence-free vectors, an interpretation that is more 'consistent' with the continuum analog to be discussed below, wherein the relevant norm is the $H^1$-norm.

To see the projection connection, we rewrite the Stokes equations (again for $g = 0$) as $Ku + CP = f = Kw$ and $C^T u = 0$ with solution $P = (C^T K^{-1} C)^{-1} C^T w$ and

$$u = w - K^{-1} C P = [I - K^{-1} C (C^T K^{-1} C)^{-1} C^T]\, w = \mathcal{P}_K w,$$

where $\mathcal{P}_K$ is a projection matrix ($\mathcal{P}_K^2 = \mathcal{P}_K$) that projects a given $n$-vector, here $w$, to the divergence-free subspace that is the null space of $C^T$: $C^T \mathcal{P}_K w = C^T u = 0$ for all $w$ because $C^T \mathcal{P}_K = 0$.
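Both assertions of this digression (the pressure as a weighted least-squares solution, and the velocity as a projection of $w$) can be checked numerically for $g = 0$. The NumPy sketch below uses an arbitrary random test problem; the sizes, the seed, and the variable names are assumptions made here for illustration. The $K^{-1}$-weighted problem is turned into an ordinary least-squares problem through a Cholesky factor of $K$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 3
B = rng.standard_normal((n, n))
K = B @ B.T + n * np.eye(n)              # SPD
C = rng.standard_normal((n, m))          # full column rank with probability one
f = rng.standard_normal(n)

# Pressure from the normal equations (C^T K^{-1} C) P = C^T K^{-1} f  (g = 0).
Kinv_C = np.linalg.solve(K, C)
S = C.T @ Kinv_C
P = np.linalg.solve(S, Kinv_C.T @ f)

# Same pressure from a weighted least-squares problem: with K = Lc Lc^T,
# ||C q - f|| in the K^{-1}-norm is the Euclidean norm of Lc^{-1} (C q - f).
Lc = np.linalg.cholesky(K)
P_ls = np.linalg.lstsq(np.linalg.solve(Lc, C), np.linalg.solve(Lc, f), rcond=None)[0]

# Projection matrix onto the null space of C^T, applied to w = K^{-1} f.
proj = np.eye(n) - Kinv_C @ np.linalg.solve(S, C.T)
w = np.linalg.solve(K, f)
u = proj @ w

print(np.allclose(P, P_ls))              # least-squares pressure agrees
print(np.allclose(proj @ proj, proj))    # idempotent
print(np.allclose(C.T @ u, 0.0))         # projected velocity is divergence-free
```

The matrix `proj` is not symmetric, but `proj.T @ K @ (np.eye(n) - proj)` vanishes to rounding error, which is the $K$-orthogonality of the projection.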
That the projection is $K$-orthogonal, $(\mathcal{P}_K u)^T K (\mathcal{Q}_K u) = 0$ for every $u$, where $\mathcal{Q}_K = I - \mathcal{P}_K$ is the 'residual' projection matrix, follows immediately by noting that

$$\mathcal{P}_K^T K \mathcal{Q}_K = [I - C (C^T K^{-1} C)^{-1} C^T K^{-1}]\, C (C^T K^{-1} C)^{-1} C^T = 0. \tag{3.15-13}$$

For more projection discussion, see Appendix 3, in which $\mathcal{P}_K$ appears under a different symbol. To conclude this portion of the digression, we note that it is the presence of $K$ that caused that least-squares solution to be a weighted (via $K^{-1}$) least-squares solution and caused the (non-symmetric, $\mathcal{P}_K^T \neq \mathcal{P}_K$) projection to be $K$-orthogonal rather than 'simply' (Euclidean) orthogonal. A change of variable would change both of these results to their 'simpler' interpretations; i.e., $\tilde{u} = K^{1/2} u$, $\tilde{f} = K^{-1/2} f$, and $\tilde{C} = K^{-1/2} C$ yields a system with $K$ replaced by the identity: $\tilde{u} + \tilde{C} P = \tilde{f}$ and $\tilde{C}^T \tilde{u} = 0$, and (i) the (unweighted) least-squares solution of $\tilde{C} P = \tilde{f}$, namely $(\tilde{C}^T \tilde{C}) P = \tilde{C}^T \tilde{f}$, gives $P$, and (ii) the new projection matrix, $\mathcal{P}$, is $\mathcal{P} = I - \tilde{C} (\tilde{C}^T \tilde{C})^{-1} \tilde{C}^T$, which is symmetric and generates a 'conventionally orthogonal' projection of $\tilde{f}$: $\tilde{u} = \mathcal{P} \tilde{f}$ with $(\mathcal{P} \tilde{f})^T (\mathcal{Q} \tilde{f}) = 0$, where $\mathcal{Q} = I - \mathcal{P}$, because $\mathcal{P}$ and $\mathcal{Q}$ are 'conventionally' orthogonal: $\mathcal{P}\mathcal{Q} = 0$. End Digression

Digression 2: Show that $J_{\min} = I_{\max}$. It is probably not intuitively obvious that if $J_{\min}$ and $I_{\max}$ describe the same solution, then it follows that they are equal. To show that this is indeed the case
'is just a medley of matrix algebra' (Strang, 1986, p. 101):

$$\begin{aligned}
J(v) - I(q) &= \tfrac12 v^T K v - v^T f + \tfrac12 (Cq - f)^T K^{-1} (Cq - f) + q^T g \\
&= \tfrac12 [v^T K v + (Cq - f)^T K^{-1} (Cq - f)] + q^T g - v^T f \\
&= \tfrac12 [(Kv + Cq - f)^T K^{-1} (Kv + Cq - f)] - v^T (Cq - f) + q^T g - v^T f \\
&= \tfrac12 (Kv + Cq - f)^T K^{-1} (Kv + Cq - f) + q^T (g - C^T v)
\end{aligned} \tag{3.15-14}$$

is the general result. Now if $v$ is divergence-free, then we have $C^T v = g$ and, because $K^{-1}$ is SPD, we then obtain $J(v) - I(q) = 0$ if and only if $Kv + Cq - f = 0$; at the solution, we have $Ku + CP = f$ and $C^T u = g$. We then have $J(u) = J_{\min}(v) = I_{\max}(q) = I(P)$. Finally, it is also noteworthy from above that $J(v) \ge I(q)$ for all admissible $v$ in the minimization problem, i.e., those satisfying $C^T v = g$, again because $K^{-1}$ is SPD. End Digression

3. Find the saddle-point. The last item on our list is to study the (alleged) saddle-point problem associated with the Lagrangian functional, (3.15-8). We will show that the Stokes solution minimizes $L(v, q)$ with respect to $v$ and maximizes it with respect to $q$. We begin by seeking the extremal/stationary points of $L(v, q)$ via

$$\delta L = \delta J + \delta q^T (C^T v - g) + q^T C^T \delta v = \delta v^T (Kv + Cq - f) + \delta q^T (C^T v - g), \tag{3.15-15}$$

and we emphasize that we are now not in the space of divergence-free vectors ($C^T \delta v \neq 0$) because the introduction of the Lagrange multiplier variable ($q$) has permitted a relaxation of this constraint. The extremum of $L$ is given by $\delta L = 0$ and yields, since $\delta q$ and $\delta v$ are independent and arbitrary variations, $Kv + Cq = f$ and $C^T v = g$, thus recovering the discrete Stokes equations: $u$ and $P$ from (3.15-1) and (3.15-2) are a stationary point of the Lagrangian. To show that the stationary point is a saddle-point, we start with the easy part; at the solution, we have $L(u, P) = J(u)$, and we have already shown that this is the minimum $J$.
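Before continuing, the identity (3.15-14) of Digression 2 and the resulting inequality $J(v) \ge I(q)$ for admissible $v$ can be verified numerically. The following NumPy sketch uses a random test problem; the sizes, the seed, and the helper names (`gap_formula`, `Z`) are assumptions made here for illustration only. Admissible trial vectors are generated by adding to $u$ a vector from the null space of $C^T$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 8, 3
B = rng.standard_normal((n, n))
K = B @ B.T + n * np.eye(n)              # SPD
C = rng.standard_normal((n, m))          # full column rank with probability one
f = rng.standard_normal(n)
g = rng.standard_normal(m)

def J(v):
    """Primal functional (3.15-6)."""
    return 0.5 * v @ K @ v - v @ f

def I(q):
    """Dual functional (3.15-7)."""
    r = C @ q - f
    return -0.5 * r @ np.linalg.solve(K, r) - q @ g

def gap_formula(v, q):
    """Right-hand side of (3.15-14)."""
    r = K @ v + C @ q - f
    return 0.5 * r @ np.linalg.solve(K, r) + q @ (g - C.T @ v)

# (1) The identity holds for arbitrary (not necessarily admissible) v and q.
v = rng.standard_normal(n)
q = rng.standard_normal(m)
identity_ok = np.isclose(J(v) - I(q), gap_formula(v, q))

# (2) Stokes solution via (3.15-4) and (3.15-5).
S = C.T @ np.linalg.solve(K, C)
P = np.linalg.solve(S, C.T @ np.linalg.solve(K, f) - g)
u = np.linalg.solve(K, f - C @ P)

# (3) J(v) >= I(q) whenever C^T v = g. Z is the Euclidean projector onto
#     null(C^T), used here only to manufacture admissible trial vectors.
Z = np.eye(n) - C @ np.linalg.solve(C.T @ C, C.T)
gaps = []
for _ in range(100):
    v_adm = u + Z @ rng.standard_normal(n)
    q_any = rng.standard_normal(m)
    gaps.append(J(v_adm) - I(q_any))
min_gap = min(gaps)

print(identity_ok, min_gap >= 0.0, np.isclose(J(u), I(P)))
```

The gap is strictly positive for every random trial pair and closes only at $v = u$, $q = P$.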
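But to show that $L(u, P)$ is also a minimum with respect to $v$ at fixed $q\,(= P)$, we must examine

$$\begin{aligned}
L(u + \varepsilon v, P) &= J(u + \varepsilon v) + P^T [C^T (u + \varepsilon v) - g] = J(u + \varepsilon v) + \varepsilon P^T C^T v \\
&= \tfrac12 (u + \varepsilon v)^T K (u + \varepsilon v) - (u + \varepsilon v)^T f + \varepsilon P^T C^T v \\
&= \tfrac12 u^T K u - u^T f + \varepsilon [u^T K v - v^T f + P^T C^T v] + \tfrac12 \varepsilon^2 v^T K v \\
&= L(u, P) + \varepsilon v^T [Ku + CP - f] + \tfrac12 \varepsilon^2 v^T K v \\
&= L(u, P) + \tfrac12 \varepsilon^2 v^T K v,
\end{aligned} \tag{3.15-16}$$

and we have that $L(u + \varepsilon v, P) > L(u, P)$ because $K$ is SPD. Turning to the other side, the analysis of the maximum proceeds as follows: pick a $q$, any $q$ but fixed (so that $\delta q = 0$), and find the $v$ that makes $\delta L = 0$ for this $q$. This $v$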
comes from (3.15-15) with $\delta q = 0$ and is $v = K^{-1}(f - Cq)$; i.e., for fixed $q$, this $v$ gives $\delta L = 0$. Now insert this $v(q)$ back into $L(v, q)$ to find its value at $\delta L = 0$:

$$L(v(q), q) = \tfrac12 [K^{-1}(f - Cq)]^T K [K^{-1}(f - Cq)] - f^T K^{-1}(f - Cq) + q^T [C^T K^{-1}(f - Cq) - g],$$

which, after simplification, becomes

$$L(v(q), q) = -\tfrac12 [q^T C^T K^{-1} C q + f^T K^{-1} f] + q^T [C^T K^{-1} f - g], \tag{3.15-17}$$

which is just $I(q)$ in (3.15-7), and we already know that maximizing $I(q)$ yields the Stokes solution. [Indeed, the above observation leads to a useful way to construct $I(q)$ from $J(v)$ and $L(v, q)$, and we shall soon use this fact to our benefit.] Thus, we have shown that $L(u, P)$ is a minimum over $v$ and a maximum over $q$; i.e., we have just shown that

$$L(u, q) \le L(u, P) \le L(v, P), \tag{3.15-18}$$

which is the definition of a saddle point. Finally, we close with the observation that $J_{\min} = I_{\max} = L_{\text{saddle-point}}$, the value of which we leave to the reader to work out as we have not found the exercise productive.

3.15.3 Discrete Potential

Now we switch gears and show the power and ease of linear algebra by making an instant change from (discrete) Stokes (rotational) flow to (discrete) potential/irrotational flow. It is easy in the discrete case, but not in the continuous case, thus perhaps serving as an example of the rather significant difference between finite and infinite dimensional spaces. The easy (and indeed somewhat remarkable) part is this: changing $K$ to $M$ (the velocity mass matrix) at every occurrence above converts every statement about Stokes flow to an equivalent one about potential flow. Thus, changing (3.15-1) gives

$$Mu + CP = f, \tag{3.15-19}$$
$$C^T u = g, \tag{3.15-20}$$

which describes discrete potential flow in which $f$ is now normally relegated to describing boundary condition forcing, whereas for Stokes flow it was this plus a 'body force,' but the key point regarding linear algebra is that they are both 'merely $n$-vectors.'
And there is also a change in $g$, in general, because Stokes flow needs both normal and tangential Dirichlet BC's on velocity, whereas (slippery) potential flow permits specification of only the normal velocity. One noteworthy difference between $M$ and $K$ is that $M$ can sometimes (depending on the element) be 'lumped' without destroying the potential flow approximation, but $K$ cannot, and the reason is simple: $M$ (and its diagonal lumped version, $M_L$) both approximate the identity operator of the continuum (which is very 'local'), whereas $K$ approximates $-\nabla^2$, the Laplacian (an elliptic operator whose inverse 'fills the domain'). The continuum analog is that potential flow moves us from the $H^1$-norm to the simpler $L^2$-norm.
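The 'change $K$ to $M$' recipe, and the effect of lumping, can be sketched on a toy problem. In the NumPy example below, $M$ is the consistent mass matrix of one-dimensional linear elements and $M_L$ is its row-sum lumped version; these element choices, the random stand-in for $C$ (not a real gradient matrix), and the helper names are all assumptions made here purely to exercise the linear algebra. The same two-step solver used for the discrete Stokes system is reused unchanged.

```python
import numpy as np

def p1_mass_matrix(n_nodes, h):
    """Consistent mass matrix for 1D linear elements on a uniform mesh."""
    M = np.zeros((n_nodes, n_nodes))
    Me = (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])   # element mass matrix
    for e in range(n_nodes - 1):
        M[e:e + 2, e:e + 2] += Me
    return M

def solve_two_step(A, C, f, g):
    """(3.15-4) and (3.15-5) with K replaced by any SPD matrix A."""
    S = C.T @ np.linalg.solve(A, C)
    P = np.linalg.solve(S, C.T @ np.linalg.solve(A, f) - g)
    u = np.linalg.solve(A, f - C @ P)
    return u, P

n, m = 9, 3
rng = np.random.default_rng(3)
M = p1_mass_matrix(n, h=1.0 / (n - 1))
M_lumped = np.diag(M.sum(axis=1))        # row-sum lumping (diagonal)
C = rng.standard_normal((n, m))          # stand-in for the gradient matrix
f = rng.standard_normal(n)
g = rng.standard_normal(m)

u_M, P_M = solve_two_step(M, C, f, g)          # consistent mass, (3.15-19)-(3.15-20)
u_L, P_L = solve_two_step(M_lumped, C, f, g)   # lumped mass

# With the lumped matrix, C^T M_L^{-1} C needs only a diagonal scaling of C.
S_lumped = C.T @ (C / np.diag(M_lumped)[:, None])

print(np.allclose(C.T @ u_M, g), np.allclose(C.T @ u_L, g))
```

Both solutions satisfy the constraint exactly; they differ from each other because $M$ and $M_L$ are different approximations of the identity operator.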
3.15.4 Continuous Potential

So let us now state the continuum version of potential flow described by (3.15-19) and (3.15-20) and try to find the analogous variational 'consequences.' It is

$$\mathbf{u} + \nabla P = \mathbf{f} \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega \tag{3.15-21}$$

with

$$\mathbf{u}\cdot\mathbf{n} = u_n \text{ on } \Gamma_D \quad\text{and}\quad P = P_N \text{ on } \Gamma_N, \tag{3.15-22}$$

where $\mathbf{f}$, $u_n$, and $P_N$ comprise the data. We retain a body force (with $\nabla\times\mathbf{f} = 0$ because potential flow is, by definition, irrotational) for generality, even though $\mathbf{f} = 0$ for 'conventional' potential flow, and we retain the symbol $P$ for the velocity potential for 'convenience.' These equations imply, for sufficiently smooth solutions,

$$\nabla^2 P = \nabla\cdot\mathbf{f} \quad\text{in } \Omega, \tag{3.15-23}$$

with

$$\partial P/\partial n = \mathbf{n}\cdot\mathbf{f} - u_n \text{ on } \Gamma_D \quad\text{and}\quad P = P_N \text{ on } \Gamma_N, \tag{3.15-24}$$

wherein we note (again) the 'inversion' of essential and natural BC's; $\Gamma_N$ 'looks like' a Dirichlet boundary for $P$, whereas $\Gamma_D$ looks like a Neumann boundary, and indeed this would be the case if the solution of (3.15-23) and (3.15-24) were to be attacked directly (but weakly, of course). But we are usually more interested in the mixed (primitive) formulation of (3.15-21) and (3.15-22), in which our appellation is the proper one. The relevant/corresponding functionals for this case are again the primal, the dual, and the Lagrangian, respectively:

$$\text{(i)}\quad J(\mathbf{v}) = \tfrac12 \int \mathbf{v}\cdot\mathbf{v} - \int \mathbf{v}\cdot\mathbf{f} + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\mathbf{v}, \tag{3.15-25}$$

where every $\mathbf{v}$ must satisfy $\mathbf{n}\cdot\mathbf{v} = u_n$ on $\Gamma_D$ and be divergence-free: $\nabla\cdot\mathbf{v} = 0$.

$$\text{(ii)}\quad I(q) = -\tfrac12 \int (\nabla q - \mathbf{f})\cdot(\nabla q - \mathbf{f}) - \int_{\Gamma_D} u_n q, \tag{3.15-26}$$

where here every $q$ must agree with $P_N$ on $\Gamma_N$.

$$\text{(iii)}\quad L(\mathbf{v}, q) = J(\mathbf{v}) - \int q\, \nabla\cdot\mathbf{v}, \tag{3.15-27}$$

where here $\mathbf{n}\cdot\mathbf{v} = u_n$ on $\Gamma_D$ and $q = P_N$ on $\Gamma_N$ are required. It is noteworthy that the primal functional contains a boundary integral over its Neumann portion ($\Gamma_N$), and its dual displays a boundary integral over its Neumann portion ($\Gamma_D$, which is the Neumann boundary for $q$). We now repeat the variational analyses presented above for the continuous case: min then max then saddle.

1. Minimize $J(\mathbf{v})$. Again, the first step is easy; the first variation of $J(\mathbf{v})$ is

$$\delta J = \int (\mathbf{v} - \mathbf{f})\cdot\delta\mathbf{v} + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\delta\mathbf{v}, \tag{3.15-28}$$
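and we now utilize the fact that $\mathbf{v}$ is divergence-free (so that $\nabla\cdot\delta\mathbf{v} = 0$) and that $\mathbf{n}\cdot\mathbf{v} = u_n$ on $\Gamma_D$ (so that $\mathbf{n}\cdot\delta\mathbf{v} = 0$ there) to write, for any scalar, $q$,

$$0 = \int q\, \nabla\cdot\delta\mathbf{v} = \int_{\Gamma_N} q\, \mathbf{n}\cdot\delta\mathbf{v} - \int \delta\mathbf{v}\cdot\nabla q. \tag{3.15-29}$$

Thus, comparing (3.15-28) and (3.15-29), we see that in order to have $\delta J = 0$ in the space of divergence-free functions satisfying $\mathbf{n}\cdot\mathbf{v} = u_n$ on $\Gamma_D$, we need that $\mathbf{v} - \mathbf{f}$ is the gradient of a particular scalar, via

$$\mathbf{v} - \mathbf{f} = -\nabla q \text{ in } \Omega \quad\text{and}\quad q = P_N \text{ on } \Gamma_N. \tag{3.15-30}$$

Comparing this result with the potential flow equations (3.15-21) and (3.15-22) shows that $\mathbf{v} = \mathbf{u}$ and $q = P$. The final step is to show that this solution minimizes $J(\mathbf{v})$, and this follows easily by taking the second variation of $J(\mathbf{v})$ via (3.15-28):

$$\delta^2 J = \int \delta\mathbf{v}\cdot\delta\mathbf{v}, \tag{3.15-31}$$

which is positive, and we are finished.

Remark: We have just rediscovered (as a special case) the Kelvin principle for inviscid flow: among all incompressible flows satisfying (3.15-21) with $\mathbf{f} = 0$ (or conservative, $\nabla\times\mathbf{f} = 0$) and either $P_N = 0$ or $\Gamma_N = \emptyset$, the one that minimizes the kinetic energy is irrotational.

2. Maximize $I(q)$ with no constraints. Again, the no-strings-attached variational problem is simpler, although $q = P_N$ on $\Gamma_N$ (and thus $\delta q = 0$ there) is still at least an attached 'thread.' We obtain

$$\begin{aligned}
\delta I &= -\int (\nabla q - \mathbf{f})\cdot\nabla\delta q - \int_{\Gamma_D} u_n\, \delta q \\
&= -\int \nabla\cdot[\delta q\,(\nabla q - \mathbf{f})] + \int \delta q\, \nabla\cdot(\nabla q - \mathbf{f}) - \int_{\Gamma_D} u_n\, \delta q \\
&= \int \delta q\,(\nabla^2 q - \nabla\cdot\mathbf{f}) - \int_{\Gamma_D} \delta q\,[\mathbf{n}\cdot(\nabla q - \mathbf{f}) + u_n],
\end{aligned} \tag{3.15-32}$$

which vanishes if $\nabla^2 q = \nabla\cdot\mathbf{f}$ in $\Omega$, $\partial q/\partial n = \mathbf{n}\cdot\mathbf{f} - u_n$ on $\Gamma_D$, and (of course) $q = P_N$ on $\Gamma_N$ which, via comparison with (3.15-23) and (3.15-24), shows that $q = P$, our potential function.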
Returning to $\delta I = -\int (\nabla q - \mathbf{f})\cdot\nabla\delta q - \int_{\Gamma_D} u_n\, \delta q$ and taking its first variation yields $\delta^2 I = -\int \nabla\delta q\cdot\nabla\delta q$, a negative-definite quantity, thus assuring that the pressure maximizes $I(q)$: $I(q) \le I(P)$ for all $q$, a restatement (for $\mathbf{f} = 0$, or conservative, and $\Gamma_N = \emptyset$) of the Dirichlet principle for inviscid flow: among all irrotational flows satisfying $\mathbf{n}\cdot\mathbf{u} = u_n$ on $\Gamma_D$, the one that minimizes the kinetic energy is divergence-free (Fix et al., 1981); maximizing $I$ minimizes $KE = \tfrac12 \int \mathbf{u}\cdot\mathbf{u}$ because $-\mathbf{u} = \nabla q = \nabla P$. Now, to finish, we note that even though $\nabla\cdot(\nabla P - \mathbf{f}) = 0$, it is definitely not the case that $\nabla P = \mathbf{f}$. What is true is that, because the vector $\nabla P - \mathbf{f}$ is divergence-free, it is the
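curl of some other vector (because div curl $(\cdot) = 0$); i.e., we have

$$\nabla P - \mathbf{f} = \nabla\times\mathbf{v} = -\mathbf{u}, \tag{3.15-33}$$

where $\mathbf{u}$ is both divergence-free and curl-free; finally, (3.15-24) and (3.15-33) show that $\mathbf{u}$ will satisfy $\mathbf{n}\cdot\mathbf{u} = u_n$ on $\Gamma_D$, and we are finished; maximizing $I(q)$ solves the potential flow equations. Finally, in analogy with the discrete case, we note that for $u_n = 0$, the potential is a least-squares solution to $\nabla q = \mathbf{f}$, and the potential velocity is an $L^2$-orthogonal projection of $\mathbf{f}$ to the appropriate divergence-free subspace. Again, see Appendix 3.

Digression 3: Show that $J_{\min} = I_{\max}$. Again, as in the discrete case, it is worth noting that the solution of (3.15-25) and (3.15-26) causes these two functionals to 'touch':

$$\begin{aligned}
J(\mathbf{v}) - I(q) &= \tfrac12 \int \mathbf{v}\cdot\mathbf{v} - \int \mathbf{v}\cdot\mathbf{f} + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\mathbf{v} + \tfrac12 \int (\nabla q - \mathbf{f})\cdot(\nabla q - \mathbf{f}) + \int_{\Gamma_D} u_n q \\
&= \tfrac12 \int (\mathbf{v} + \nabla q - \mathbf{f})\cdot(\mathbf{v} + \nabla q - \mathbf{f}) - \int \mathbf{v}\cdot(\nabla q - \mathbf{f}) - \int \mathbf{v}\cdot\mathbf{f} + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\mathbf{v} + \int_{\Gamma_D} u_n q \\
&= \tfrac12 \int (\mathbf{v} + \nabla q - \mathbf{f})^2 + \int q\, \nabla\cdot\mathbf{v} - \int_{\Gamma} q\, \mathbf{n}\cdot\mathbf{v} + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\mathbf{v} + \int_{\Gamma_D} u_n q \\
&= \tfrac12 \int (\mathbf{v} + \nabla q - \mathbf{f})^2 + \int q\, \nabla\cdot\mathbf{v}
\end{aligned} \tag{3.15-34}$$

because $q = P_N$ on $\Gamma_N$ and $\mathbf{n}\cdot\mathbf{v} = u_n$ on $\Gamma_D$. Thus, when $\mathbf{v}$ is divergence-free, $J(\mathbf{v}) = I(q)$ if and only if $\mathbf{v} + \nabla q - \mathbf{f} = 0$, and we are finished: the potential flow solution makes $J_{\min} = I_{\max}$. End Digression

3. Find the saddle-point. We now demonstrate that (3.15-27) has a saddle-point at the potential flow solution:

$$\begin{aligned}
\delta L &= \delta J - \int q\, \nabla\cdot\delta\mathbf{v} - \int \delta q\, \nabla\cdot\mathbf{v} \\
&= \delta J - \int_{\Gamma} q\, \mathbf{n}\cdot\delta\mathbf{v} + \int \delta\mathbf{v}\cdot\nabla q - \int \delta q\, \nabla\cdot\mathbf{v} \\
&= \int (\mathbf{v} - \mathbf{f} + \nabla q)\cdot\delta\mathbf{v} + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\delta\mathbf{v} - \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\delta\mathbf{v} - \int \delta q\, \nabla\cdot\mathbf{v},
\end{aligned} \tag{3.15-35}$$

and we obtain $\delta L = 0$ if $\mathbf{v} + \nabla q = \mathbf{f}$ and $\nabla\cdot\mathbf{v} = 0$ in $\Omega$ with $\mathbf{n}\cdot\mathbf{v} = u_n$ on $\Gamma_D$ and $q = P_N$ on $\Gamma_N$; the stationary point of the Lagrangian is obtained at the potential flow solution.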
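To show that it is a saddle-point, we proceed as for the discrete case, by first varying $\mathbf{v}$ about $\mathbf{u}$ at the fixed value of the potential, $P$:

$$\begin{aligned}
L(\mathbf{u} + \varepsilon\mathbf{v}, P) &= J(\mathbf{u} + \varepsilon\mathbf{v}) - \int P\, \nabla\cdot(\mathbf{u} + \varepsilon\mathbf{v}) \\
&= \tfrac12 \int (\mathbf{u} + \varepsilon\mathbf{v})\cdot(\mathbf{u} + \varepsilon\mathbf{v}) - \int \mathbf{f}\cdot(\mathbf{u} + \varepsilon\mathbf{v}) + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot(\mathbf{u} + \varepsilon\mathbf{v}) - \varepsilon \int P\, \nabla\cdot\mathbf{v} \\
&= \tfrac12 \int \mathbf{u}\cdot\mathbf{u} - \int \mathbf{f}\cdot\mathbf{u} + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\mathbf{u} + \varepsilon \left[ \int (\mathbf{u}\cdot\mathbf{v} - \mathbf{f}\cdot\mathbf{v} - P\, \nabla\cdot\mathbf{v}) + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\mathbf{v} \right] + \tfrac12 \varepsilon^2 \int \mathbf{v}\cdot\mathbf{v},
\end{aligned} \tag{3.15-36}$$

giving

$$L(\mathbf{u} + \varepsilon\mathbf{v}, P) - L(\mathbf{u}, P) = \varepsilon \left[ \int \mathbf{v}\cdot(\mathbf{u} - \mathbf{f} + \nabla P) - \int_{\Gamma} P\, \mathbf{n}\cdot\mathbf{v} + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\mathbf{v} \right] + \tfrac12 \varepsilon^2 \int \mathbf{v}\cdot\mathbf{v}. \tag{3.15-37}$$

But $L(\mathbf{u}, P) = J(\mathbf{u}) > 0$ and $\int_{\Gamma} P\, \mathbf{n}\cdot\mathbf{v} = \int_{\Gamma_N} P_N\, \mathbf{n}\cdot\mathbf{v}$ because $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_D$ (here, $\mathbf{v}$ is a variation from $\mathbf{u}$, which satisfies $\mathbf{n}\cdot\mathbf{u} = u_n$ on $\Gamma_D$) to give, noting from (3.15-21) that $\mathbf{u} - \mathbf{f} + \nabla P = 0$,

$$L(\mathbf{u} + \varepsilon\mathbf{v}, P) - J(\mathbf{u}) = \tfrac12 \varepsilon^2 \int \mathbf{v}\cdot\mathbf{v}, \tag{3.15-38}$$

showing that $L(\mathbf{u}, P)$ is a minimum with respect to velocity. To show the 'other half' (that $L(\mathbf{u}, P)$ is a maximum with respect to $P$) we proceed as follows: pick a $q$, any $q$ (but fixed: $\delta q = 0$) and seek $\delta L = 0$ for that $q$. Thus, setting $\delta q = 0$ in (3.15-35), $\mathbf{v} = \mathbf{f} - \nabla q$ makes $\delta L = 0$, which result we place back into (3.15-27):

$$\begin{aligned}
L(\mathbf{v}(q), q) &= J(\mathbf{v}) - \int q\, \nabla\cdot\mathbf{v} \\
&= \tfrac12 \int (\mathbf{f} - \nabla q)\cdot(\mathbf{f} - \nabla q) - \int (\mathbf{f} - \nabla q)\cdot\mathbf{f} + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot(\mathbf{f} - \nabla q) - \int q\, \nabla\cdot(\mathbf{f} - \nabla q).
\end{aligned} \tag{3.15-39}$$

Integrating the last term by parts gives $\int q\, \nabla\cdot(\mathbf{f} - \nabla q) = \int_{\Gamma} q\, \mathbf{n}\cdot(\mathbf{f} - \nabla q) - \int (\mathbf{f} - \nabla q)\cdot\nabla q$ and $\int_{\Gamma} q\, \mathbf{n}\cdot(\mathbf{f} - \nabla q) = \int_{\Gamma_D} q\, \mathbf{n}\cdot\mathbf{v} + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot(\mathbf{f} - \nabla q) = \int_{\Gamma_D} q\, u_n + \int_{\Gamma_N} P_N\, \mathbf{n}\cdot(\mathbf{f} - \nabla q)$ to yield

$$\begin{aligned}
L(\mathbf{v}(q), q) &= \tfrac12 \int (\mathbf{f} - \nabla q)\cdot(\mathbf{f} - \nabla q) - \int (\mathbf{f} - \nabla q)\cdot\mathbf{f} + \int (\mathbf{f} - \nabla q)\cdot\nabla q - \int_{\Gamma_D} u_n q \\
&= -\tfrac12 \int (\nabla q - \mathbf{f})\cdot(\nabla q - \mathbf{f}) - \int_{\Gamma_D} u_n q = I(q),
\end{aligned} \tag{3.15-40}$$

and we are done because we have already shown that the solution, $q = P$, maximizes the dual functional, $I(q)$. Thus, we have our saddle-point and the final result that

$$L(\mathbf{u}, q) \le L(\mathbf{u}, P) \le L(\mathbf{v}, P) \tag{3.15-41}$$

and, of course, $J_{\min} = I_{\max} = L_{\text{saddle-point}}$.

3.15.5 Continuous Stokes

This completes the potential flow analysis. Now, finally, we come to the subject that started this whole section: steady Stokes flow. We have followed the chosen path because the Stokes equations are significantly more difficult with respect to variational principles, and in fact we shall use the general result that the maximum of the Lagrangian with respect to $P$ agrees with $I(q)$ to obtain this latter functional. Even at that, the procedure is much more 'formal' (involving implicit Green's functions and integral operators) and is (in fact) less useful, in some sense, explaining, or rationalizing, the fact that we will fall a bit short of our goal. So we begin by introducing only two functionals, the 'easy ones,' primal and Lagrangian:

$$\text{(i)}\quad J(\mathbf{v}) = \tfrac12 \int \nabla\mathbf{v} : (\nabla\mathbf{v})^T - \int \mathbf{v}\cdot\mathbf{f} - \int_{\Gamma_N} \mathbf{v}\cdot\mathbf{F} \tag{3.15-42}$$

and

$$\text{(ii)}\quad L(\mathbf{v}, q) = J(\mathbf{v}) - \int q\, \nabla\cdot\mathbf{v}, \tag{3.15-43}$$

which are 'related' to the following Stokes problem:

$$-\nabla^2\mathbf{u} + \nabla P = \mathbf{f} \quad\text{and}\quad \nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega, \tag{3.15-44}$$

with

$$\mathbf{u} = \mathbf{w} \text{ on } \Gamma_D \quad\text{and}\quad \partial\mathbf{u}/\partial n - \mathbf{n}P = \mathbf{F} \text{ on } \Gamma_N, \tag{3.15-45}$$

where here $\mathbf{f}$, $\mathbf{w}$, and $\mathbf{F}$ are the data. These are in fact the PDE's and BC's described by the opening equations of this section, (3.15-1) and (3.15-2).

1. Minimize $J(\mathbf{v})$ subject to $\nabla\cdot\mathbf{v} = 0$ and $\mathbf{v} = \mathbf{w}$ on $\Gamma_D$. The first variation of (3.15-42) gives

$$\delta J = \int \nabla\mathbf{v} : (\nabla\delta\mathbf{v})^T - \int \delta\mathbf{v}\cdot\mathbf{f} - \int_{\Gamma_N} \delta\mathbf{v}\cdot\mathbf{F}. \tag{3.15-46}$$

Recalling that (or referring to Table 3.1-2) $\nabla\mathbf{u} : (\nabla\mathbf{v})^T = \nabla\mathbf{v} : (\nabla\mathbf{u})^T = \nabla\cdot[\mathbf{v}\cdot(\nabla\mathbf{u})^T] - \mathbf{v}\cdot\nabla^2\mathbf{u}$ gives

$$\delta J = \int_{\Gamma} \mathbf{n}\cdot[(\nabla\mathbf{v})\cdot\delta\mathbf{v}] - \int \delta\mathbf{v}\cdot(\nabla^2\mathbf{v} + \mathbf{f}) - \int_{\Gamma_N} \delta\mathbf{v}\cdot\mathbf{F}. \tag{3.15-47}$$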
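But since $\mathbf{v} = \mathbf{w}$ on $\Gamma_D$, $\delta\mathbf{v} = 0$ there and we get

$$\delta J = \int_{\Gamma_N} \delta\mathbf{v}\cdot(\partial\mathbf{v}/\partial n - \mathbf{F}) - \int \delta\mathbf{v}\cdot(\nabla^2\mathbf{v} + \mathbf{f}). \tag{3.15-48}$$

Utilizing the constraints that $\nabla\cdot\mathbf{v} = 0$ in $\Omega$ and $\mathbf{v} = \mathbf{w}$ on $\Gamma_D$ implies that $\nabla\cdot\delta\mathbf{v} = 0$ in $\Omega$ and $\delta\mathbf{v} = 0$ on $\Gamma_D$, which leads to

$$0 = \int q\, \nabla\cdot\delta\mathbf{v} = \int_{\Gamma_N} q\, \mathbf{n}\cdot\delta\mathbf{v} - \int \delta\mathbf{v}\cdot\nabla q \tag{3.15-49}$$

which, when compared with (3.15-48), shows that $\delta J = 0$ when

$$\nabla^2\mathbf{v} + \mathbf{f} = \nabla q \text{ in } \Omega \quad\text{and}\quad \partial\mathbf{v}/\partial n - \mathbf{F} = q\mathbf{n} \text{ on } \Gamma_N, \tag{3.15-50}$$

which, with $\nabla\cdot\mathbf{v} = 0$ in $\Omega$ and $\mathbf{v} = \mathbf{w}$ on $\Gamma_D$, shows that the constrained extremum of $J(\mathbf{v})$ gives the Stokes solution: $\mathbf{v} = \mathbf{u}$, $q = P$. That it is a minimum follows easily from (3.15-47), the first variation of which gives

$$\delta^2 J = \int \nabla\delta\mathbf{v} : (\nabla\delta\mathbf{v})^T, \tag{3.15-51}$$

which is positive-definite, and thus $J(\mathbf{u})$ is a minimum.

Remark: For $\mathbf{f} = 0$ (or conservative, $\mathbf{f} = \nabla\lambda$) and $\Gamma_N = \emptyset$, this result is virtually the same as the Helmholtz dissipation theorem (Serrin, 1959): the solution of the (steady) Stokes equations minimizes the viscous dissipation.

2. Find the saddle-point of $L(\mathbf{v}, q)$. We now vary both $\mathbf{v}$ and $q$ in (3.15-43) while respecting $\mathbf{v} = \mathbf{w}$ on $\Gamma_D$ and $\partial\mathbf{v}/\partial n - q\mathbf{n} = \mathbf{F}$ on $\Gamma_N$ (but $\nabla\cdot\mathbf{v} \neq 0$ in general):

$$\begin{aligned}
\delta L &= \delta J - \int \delta q\, \nabla\cdot\mathbf{v} - \int q\, \nabla\cdot\delta\mathbf{v} \\
&= \delta J - \int \delta q\, \nabla\cdot\mathbf{v} - \int_{\Gamma_N} q\, \mathbf{n}\cdot\delta\mathbf{v} + \int \delta\mathbf{v}\cdot\nabla q \\
&= \int_{\Gamma_N} \delta\mathbf{v}\cdot(\partial\mathbf{v}/\partial n - \mathbf{F} - q\mathbf{n}) - \int \delta\mathbf{v}\cdot(\nabla^2\mathbf{v} + \mathbf{f} - \nabla q) - \int \delta q\, \nabla\cdot\mathbf{v},
\end{aligned} \tag{3.15-52}$$

which has $\delta L = 0$ when $\partial\mathbf{v}/\partial n - q\mathbf{n} = \mathbf{F}$ on $\Gamma_N$ and $-\nabla^2\mathbf{v} + \nabla q = \mathbf{f}$ in $\Omega$ and, finally, when also $\nabla\cdot\mathbf{v} = 0$ in $\Omega$; i.e., when $\mathbf{v}$ and $q$ solve the Stokes equations: $\mathbf{v} = \mathbf{u}$, $q = P$. As usual, we now seek to show that $\delta L = 0$ is a minimum with respect to velocity, which we do (again, as usual) by holding $P$ and varying $\mathbf{u}$ via $\mathbf{u} + \varepsilon\mathbf{v}$, where we note that, while $\mathbf{v}$ is not required to be divergence-free, it is required to vanish on $\Gamma_D$ (so that $\mathbf{u} + \varepsilon\mathbf{v} = \mathbf{w}$ there). Thus, we form

$$\begin{aligned}
L(\mathbf{u} + \varepsilon\mathbf{v}, P) &= J(\mathbf{u} + \varepsilon\mathbf{v}) - \int P\, \nabla\cdot(\mathbf{u} + \varepsilon\mathbf{v}) = J(\mathbf{u} + \varepsilon\mathbf{v}) - \varepsilon \int P\, \nabla\cdot\mathbf{v} \\
&= \tfrac12 \int \nabla(\mathbf{u} + \varepsilon\mathbf{v}) : [\nabla(\mathbf{u} + \varepsilon\mathbf{v})]^T - \int (\mathbf{u} + \varepsilon\mathbf{v})\cdot\mathbf{f} - \int_{\Gamma_N} (\mathbf{u} + \varepsilon\mathbf{v})\cdot\mathbf{F} - \varepsilon \int P\, \nabla\cdot\mathbf{v} \\
&= \tfrac12 \int \nabla\mathbf{u} : (\nabla\mathbf{u})^T + \varepsilon \int \nabla\mathbf{u} : (\nabla\mathbf{v})^T - \int \mathbf{u}\cdot\mathbf{f} - \int_{\Gamma_N} \mathbf{u}\cdot\mathbf{F} - \varepsilon \left[ \int \mathbf{v}\cdot\mathbf{f} + \int_{\Gamma_N} \mathbf{v}\cdot\mathbf{F} + \int P\, \nabla\cdot\mathbf{v} \right] + \tfrac12 \varepsilon^2 \int \nabla\mathbf{v} : (\nabla\mathbf{v})^T \\
&= J(\mathbf{u}) + \varepsilon \left\{ \int [\nabla\mathbf{u} : (\nabla\mathbf{v})^T - \mathbf{v}\cdot\mathbf{f} - P\, \nabla\cdot\mathbf{v}] - \int_{\Gamma_N} \mathbf{v}\cdot\mathbf{F} \right\} + \tfrac12 \varepsilon^2 \int \nabla\mathbf{v} : (\nabla\mathbf{v})^T.
\end{aligned} \tag{3.15-53}$$

But $\int \nabla\mathbf{u} : (\nabla\mathbf{v})^T = \int_{\Gamma} \mathbf{v}\cdot\partial\mathbf{u}/\partial n - \int \mathbf{v}\cdot\nabla^2\mathbf{u} = \int_{\Gamma_N} \mathbf{v}\cdot\partial\mathbf{u}/\partial n - \int \mathbf{v}\cdot\nabla^2\mathbf{u}$ and $\int P\, \nabla\cdot\mathbf{v} = \int_{\Gamma} P\, \mathbf{n}\cdot\mathbf{v} - \int \mathbf{v}\cdot\nabla P$ to give

$$\begin{aligned}
L(\mathbf{u} + \varepsilon\mathbf{v}, P) - J(\mathbf{u}) &= \varepsilon \left[ \int_{\Gamma_N} \mathbf{v}\cdot(\partial\mathbf{u}/\partial n - \mathbf{n}P - \mathbf{F}) - \int \mathbf{v}\cdot(\nabla^2\mathbf{u} + \mathbf{f} - \nabla P) \right] + \tfrac12 \varepsilon^2 \int \nabla\mathbf{v} : (\nabla\mathbf{v})^T \\
&= \tfrac12 \varepsilon^2 \int \nabla\mathbf{v} : (\nabla\mathbf{v})^T > 0
\end{aligned} \tag{3.15-54}$$

because $\mathbf{u}$ and $P$ satisfy the Stokes equations. Hence, we do have that $L(\mathbf{u}, P)$ is a minimum with respect to velocity.

Now the final, and hardest, part: show that $\delta L = 0$ is a maximum with respect to $P$. Again, as usual, pick a $q$, any $q$, but fixed ($\delta q = 0$) and place it into $L(\mathbf{v}, q)$ and seek $\delta L = 0$ by finding just the right $\mathbf{v}(q)$. Then we will vary $q$, if possible/lucky. Thus, we simply omit the $\delta q$-term in (3.15-52) to obtain

$$\delta L = \int_{\Gamma_N} \delta\mathbf{v}\cdot(\partial\mathbf{v}/\partial n - \mathbf{F} - q\mathbf{n}) - \int \delta\mathbf{v}\cdot(\nabla^2\mathbf{v} + \mathbf{f} - \nabla q), \tag{3.15-55}$$

and it is clear that

$$\delta L = 0 \;\Rightarrow\; -\nabla^2\mathbf{v} + \nabla q = \mathbf{f} \text{ in } \Omega \quad\text{and}\quad \partial\mathbf{v}/\partial n - q\mathbf{n} = \mathbf{F} \text{ on } \Gamma_N, \tag{3.15-56}$$

which, along with $\mathbf{v} = \mathbf{w}$ on $\Gamma_D$, provides a well-posed problem for $\mathbf{v}(q)$. Switching notation via $\nabla^2 = \Delta$ for convenience, we formally write the solution for $\mathbf{v}(q)$ as

$$\mathbf{v} = \Delta^{-1}(\nabla q - \mathbf{f}). \tag{3.15-57}$$

We shall place this result into (3.15-43) after rearranging the Lagrangian to a more convenient form, first via integration by parts of the last term: $\int q\, \nabla\cdot\mathbf{v} = \int_{\Gamma} q\, \mathbf{n}\cdot\mathbf{v} - \int \mathbf{v}\cdot\nabla q = \int_{\Gamma_D} q\, \mathbf{n}\cdot\mathbf{w} + \int_{\Gamma_N} q\, \mathbf{n}\cdot\mathbf{v} - \int \mathbf{v}\cdot\nabla q$, and (3.15-43) becomes $L(\mathbf{v}, q) = \tfrac12 \int \nabla\mathbf{v} : (\nabla\mathbf{v})^T - \int \mathbf{v}\cdot\mathbf{f} - \int_{\Gamma_D} q\, \mathbf{n}\cdot\mathbf{w} - \int_{\Gamma_N} \mathbf{v}\cdot(\mathbf{F} + q\mathbf{n}) + \int \mathbf{v}\cdot\nabla q$, which we rearrange by using $\nabla q - \mathbf{f} = \nabla^2\mathbf{v}$ to obtain $L(\mathbf{v}, q) = \tfrac12 \int \nabla\mathbf{v} : (\nabla\mathbf{v})^T + \int \mathbf{v}\cdot\nabla^2\mathbf{v} - \int_{\Gamma_D} q\, \mathbf{n}\cdot\mathbf{w} - \int_{\Gamma_N} \mathbf{v}\cdot(\mathbf{F} + q\mathbf{n})$, which we further rearrange by (yep, you guessed it) integration by parts; this time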
of $\int \mathbf{v}\cdot\nabla^2\mathbf{v}$, and utilizing the BC $\mathbf{F} + q\mathbf{n} = \partial\mathbf{v}/\partial n$ on $\Gamma_N$: $\int \mathbf{v}\cdot\nabla^2\mathbf{v} = \int_{\Gamma} \mathbf{v}\cdot\partial\mathbf{v}/\partial n - \int \nabla\mathbf{v} : (\nabla\mathbf{v})^T = \int_{\Gamma_D} \mathbf{w}\cdot\partial\mathbf{v}/\partial n + \int_{\Gamma_N} \mathbf{v}\cdot(\mathbf{F} + q\mathbf{n}) - \int \nabla\mathbf{v} : (\nabla\mathbf{v})^T$ to give $L(\mathbf{v}, q) = \int_{\Gamma_D} \mathbf{w}\cdot(\partial\mathbf{v}/\partial n - q\mathbf{n}) - \tfrac12 \int \nabla\mathbf{v} : (\nabla\mathbf{v})^T$, and we are 'finished' after 'simply' replacing $\mathbf{v}(q)$ from (3.15-57) to obtain [cf. (3.15-17)]

$$L[\mathbf{v}(q), q] = -\tfrac12 \int [\nabla\Delta^{-1}(\nabla q - \mathbf{f})] : [\nabla\Delta^{-1}(\nabla q - \mathbf{f})]^T + \int_{\Gamma_D} \mathbf{w}\cdot\left[ \frac{\partial}{\partial n}\Delta^{-1}(\nabla q - \mathbf{f}) - q\mathbf{n} \right] = I(q), \tag{3.15-58}$$

and we leave it to the (talented) reader to verify the rest: the extremum of $I(q)$ exists and is a maximum. [An alternative, and probably better, functional to be maximized, is given by Finlayson (1972, p. 2.27) in Kardestuncer and Norrie (1987).] Thus we conclude our (condensed, yet protracted) odyssey on 'variational fluid mechanics,' a journey that we hope has been useful to some.

3.16 SOLUTION METHODS FOR THE SEMI-DISCRETIZED TIME-DEPENDENT (AND STEADY) EQUATIONS

We have now arrived at what is arguably the most important part of the book: solving the time-dependent incompressible Navier-Stokes equations via the GFEM. There are those who would argue that these are the only codes that should be written, since one rarely if ever knows that a stable steady state exists. But there are also those who would argue (perhaps at least after 'modifying' the equations to include a turbulence model; see Volume II) that most flows of interest are turbulent and 'steady,' at least statistically. Both arguments have merit, and while the bulk of this section is focused upon the (much more complex) time-dependent case, the steady case will also occasionally be addressed separately, albeit usually as a special case/subset of the time-dependent case. Anyway, we now address numerical time integration of the DAE's that comprise the semi-discretized NS equations, and we begin by pointing out an early and important paper on the subject with a very cogent title, 'DAE's Are Not ODE's' (Petzold, 1982). So what are they and why do they need a special name?
They are ODE's subjected to algebraic constraints, and they need a special name because they have, in the last 15 years or so, spawned a new, and growing, branch of applied mathematics that attempts to properly account for the often-significant additional difficulties, both theoretical and applied, when ODE's are constrained. We are fortunate not to need more than the tip of the iceberg in this field because our DAE's are of a special class (semi-explicit) and the 'index' (a measure of the 'degree of difficulty,' defined below) is not too high. Suffice it to say in this introduction that 'simple' ODE-thinking can lead to trouble, and we thus introduce the reader (or some readers) to the proper approach to the problem. After introducing some of the concepts and ideas behind DAE's, we shall show how to apply several ODE methods to both types of DAE's that comprise the semi-discrete NS
equations; namely, those involving the primitive variables, (3.13-28) and (3.13-29), and those involving the PPE, (3.13-28) and (3.13-242). The first and foremost new concept is the notion of the index of the DAE system: The minimum number of times that all or part of the constraints of a DAE system must be differentiated with respect to $t$ in order to obtain an ODE system in the original variables is called the index of the DAE's. 'The index is a measure of the singularity of a system' (Petzold and Lotstedt, 1986). And, '... the more singular a DAE system is, the more difficult it is to solve numerically' (Hindmarsh and Petzold, 1988). Note that we have already performed one constraint differentiation, (3.13-240), to obtain the PPE, another algebraic constraint equation [(3.13-29) was the first constraint equation]. If we perform one more, this time on the PPE itself, (3.13-242), to obtain

$$(C^T M^{-1} C)\,\dot{P} = C^T M^{-1} \frac{d}{dt}[f - Ku - N(u)u] - \ddot{g}, \tag{3.16-1}$$

we have a system of ODE's; namely, (3.13-28) and (3.16-1), and we have thus discovered the indices: since it required two differentiations of the algebraic constraint equations to arrive at our (index 0) ODE system in $u$ and $P$, the index of the primitive variable DAE's is two. Similarly it follows that the index of the PPE formulation, (3.13-28) and (3.13-242), is one. We remark that (3.16-1) has no other use than 'index-determination.' Next, we paraphrase a few important remarks from the recent text on the subject by Brenan et al. (1996):

1. The higher the index, the more difficult is the DAE system to solve. (This statement refers, in particular, to ODE methods in which automatic error control of the non-linear DAE's is desired.)
2. DAE's with index two or more are called 'higher index' systems.
3. The solution of higher index systems can involve derivatives of order $k - 1$ of the forcing function ($k$ is the index).
['Since numerical differentiation is notoriously ill-conditioned (sensitive to small errors), difficulties for numerical ODE methods can be expected' (Hindmarsh and Petzold, 1988).]
4. Not all initial conditions admit a smooth solution if $k \ge 1$. (Consistent IC's generally lead to smooth solutions.)
5. Higher-order DAE's can have hidden algebraic constraints.
6. Lowered-index DAE's (derived by differentiation) have more solutions than the original DAE's. Only some of these solutions are solutions of the original DAE's.
7. Often the most difficult part of solving a DAE system in applications is to determine a consistent set of initial conditions. Fortunately, this problem need not plague us because we know how to determine consistent IC's; details later. (It has, however, plagued others in the past, with the most common offenders being 'impulsive starts.')
8. This one is our own: we presume that the solution 'most desired' is that satisfying the original DAE system, that of highest index. [This is because the implication is 'one-way'; solutions of the highest index system will always satisfy all (derived) lower index systems.]
It may be helpful to present a simple, but meaningful example (because generalization is possible) of one of the difficulties that can occur with DAE's. It has to do with local truncation error estimates, and the lesson to be learned is this: 'ordinary' error estimates, from ODE theory, may be wrong. The following example is from Petzold (1982):

$$\dot{y}_2 = y_1, \tag{3.16-2}$$
$$y_2 = g(t), \tag{3.16-3}$$

an innocuous-looking index 2 system in two degrees of freedom. To prove that the index is 2, differentiate the constraint to get $\dot{y}_2 = \dot{g}$, which, when inserted into (3.16-2), gives the 'hidden' constraint $y_1 = \dot{g}$, which, when differentiated once more, yields the pair of ODE's for $y_1$ and $y_2$: $\dot{y}_2 = y_1$ and $\dot{y}_1 = \ddot{g}$. Hence, index 2. As an 'aside,' note that the general solution to these ODE's is $y_1 = y_1(0) + \dot{g}(t) - \dot{g}(0)$ and $y_2 = y_2(0) + g(t) - g(0) + t[y_1(0) - \dot{g}(0)]$, whereas that of the original (index 2) DAE's is simply $y_1 = \dot{g}(t)$ and $y_2 = g(t)$; also, that of the intermediate, index 1, system is $y_1 = \dot{g}(t)$ and $y_2 = y_2(0) + g(t) - g(0)$. The index 1 solution (the solution of the index 1 system) is only correct if the particular IC $y_2(0) = g(0)$ is prescribed, and the index 0 solution (the solution of the index 0, ODE, system) is only correct if, in addition, the particular IC $y_1(0) = \dot{g}(0)$ is prescribed, showing clearly the existence of extraneous solutions from the derived lower index problems.

Let us consider the simplest and most 'robust' time integrator, backward Euler. As we learned in the previous chapter, the local truncation error for BE (on ODE's) is $O(\Delta t^2)$, a result that was used to generate a variable-step method based on local error control. Does this estimate carry over to (index 2) DAE's? To help answer this, let us compute the exact local truncation error for the above example. BE, starting at $t_n$, gives

$$y_{1,n+1} = \frac{y_{2,n+1} - y_{2,n}}{\Delta t} = \frac{g_{n+1} - g_n}{\Delta t}, \tag{3.16-4}$$
$$y_{2,n+1} = g(t_{n+1}) = g_{n+1}, \tag{3.16-5}$$

whereas the exact solution has $y_1(t_{n+1}) = \dot{g}(t_{n+1}) = \dot{g}_{n+1}$ and $y_2(t_{n+1}) = g_{n+1}$. Thus, the error is zero in the second component; but in the first, it is

$$e_{1,n+1} = y_{1,n+1} - y_1(t_{n+1}) = \frac{g_{n+1} - g_n}{\Delta t} - \dot{g}_{n+1} = \frac{\Delta t\, \dot{g}_{n+1} - (\Delta t^2/2)\, \ddot{g}(\xi)}{\Delta t} - \dot{g}_{n+1} = -\frac{\Delta t}{2}\, \ddot{g}(\xi), \tag{3.16-6}$$

where $t_n \le \xi \le t_{n+1}$. Thus, the surprising result is that one order of accuracy (for $y_1$) has been lost by applying BE to the simplest index 2 problem. 'The situation... is obviously enough to wreak havoc with any step size selection algorithm which assumes that errors are $O(\Delta t^{k+1})$, where $k$ is the order of the method' (Petzold, 1982). If all local errors were this bad, and if they accumulated, then the global error would be $O(1)$ in $\Delta t$! Fortunately, things are not quite that bad; in this
example, $y_1$ is the algebraic variable (and $y_2$ is the ODE variable), and the general theory for BE on index 2 DAE's (Brenan et al., 1996) states that these larger local errors in the algebraic variable do not accumulate, with the result that $y_1$ above is first-order accurate both locally and globally. This will extend to our Navier-Stokes index 2 DAE's in the obvious way: BE will be (globally) first-order accurate for both $u$ and $P$ even though the local truncation errors are $O(\Delta t^2)$ and (generally) $O(\Delta t)$, respectively.

The 'general' advice that comes out of DAE theory, for our purposes (our DAE's), is roughly this (but not uniformly for all DAE's): base your error estimates and timestep control strategies on the differential variable. The general 'lesson,' which is also true for our index 2 NS DAE's, is this: when $g(t)$ is time-dependent, the numerical solution will be more difficult (and often less accurate in pressure) because the integration process actually involves an implied numerical differentiation of $g(t)$. [The index 1 problem, in contrast, is easier in principle because it involves a numerical integration of $\dot g(t)$, a statement that presumes both $g(t)$ and $\dot g(t)$ are given continuous functions of $t$.]

Another intuitively useful concept when 'thinking about' DAE's is that they can be thought of as stiff differential equations in the limit of infinite stiffness. Suppose, for example, that we changed the algebraic constraint equations, $C^T u = g$, to a set of differential equations via $\tau \dot P + C^T u = g$, which we could integrate simultaneously with the momentum equations. This technique was introduced by Chorin (1967b) to efficiently 'time-march' to steady solutions, and has been recently utilized by Kwak et al. (1986). If we then let $\tau \to 0$, we approach infinite stiffness: DAE's.
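As a quick numerical check of the two-variable example (3.16-2), (3.16-3) discussed earlier in this section, the following sketch applies backward Euler to it. The choice $g(t) = \sin t$ and all function names are our own assumptions, made only for illustration. Under these assumptions, both the one-step error and the largest error over an interval in the algebraic variable $y_1$ should roughly halve when $\Delta t$ is halved, which is what one would expect from a first-order local error that does not accumulate.

```python
import math

def g(t):
    # Assumed constraint function for illustration; any smooth g(t) would do.
    return math.sin(t)

def gdot(t):
    # Exact derivative of the assumed g(t); the exact DAE solution has y1 = gdot.
    return math.cos(t)

def be_step_y1(t_n, y2_n, dt):
    """One backward-Euler step for  y2' = y1,  y2 = g(t).

    The constraint fixes y2 at the new time; the differential equation,
    differenced backward, then determines y1 (an implied differentiation of g).
    """
    y2_new = g(t_n + dt)
    y1_new = (y2_new - y2_n) / dt
    return y1_new, y2_new

def local_error_y1(t_n, dt):
    """Error in y1 after one step started from exact data at t_n."""
    y1_new, _ = be_step_y1(t_n, g(t_n), dt)
    return y1_new - gdot(t_n + dt)

def max_global_error_y1(t_end, n_steps):
    """Largest error in y1 over n_steps equal steps on [0, t_end]."""
    dt = t_end / n_steps
    y2 = g(0.0)
    worst = 0.0
    for n in range(n_steps):
        y1, y2 = be_step_y1(n * dt, y2, dt)
        worst = max(worst, abs(y1 - gdot((n + 1) * dt)))
    return worst

for n_steps in (100, 200, 400):
    dt = 2.0 / n_steps
    print(dt, local_error_y1(1.0, dt), max_global_error_y1(2.0, n_steps))
```

The differential variable $y_2$ is reproduced exactly here, so an error estimator based on $y_2$ alone would see nothing; the sketch only illustrates the behaviour of the algebraic variable.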
Thus, it is no accident that a very recent book on the subject (Hairer and Wanner, 1991) bears the telling title, Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems.

Remark 6 above should not be too surprising in light of what has already been discussed in Section 3.9.2 regarding extraneous solutions of the PDE's when the PPE is used to replace $\nabla \cdot u = 0$, a discussion we now revisit by asking, 'When will the index 1 solution (PPE) agree with that from the index 2 (primitive equation) formulation?' The answer, prior to introducing time integration error, is, just as it was for the continuum: when the same solvability conditions that apply to the index 2 formulation are satisfied; here $C^T u_0 = g_0$ and (when $n \cdot u$ is specified on all of $\Gamma$) $P_H^T g(t) = 0$ for $t \ge 0$, where $P_H$ is the hydrostatic pressure mode. But there are many additional solutions to the PPE formulation that are not solutions of the index 2 formulation, even for the time-continuous 'theoretical' solutions; e.g., if $g(t) = g_0$ is time-independent and $C^T u_0 \ne g_0$, the PPE formulation has no solvability constraints so that a solution always exists, but that solution will not be a solution to the original (index 2) DAE's, which are ill-posed and have no solution. From (3.13-240), which is easily seen to be implied by (3.13-241) and (3.13-242), it is easy to see that the PPE solution, rather than satisfying the proper mass conservation equation, $C^T u = g(t)$, will instead satisfy

$$C^T u(t) = g(t) + C^T u_0 - g_0; \tag{3.16-7}$$

any initial divergence error will linger forever. The PPE does, however, sometimes have a solvability constraint; namely, when (and only when, in the absence of spurious pressure modes, which we assume herein) the hydrostatic pressure mode exists (i.e., only when $u \cdot n$ is prescribed on all of $\Gamma$), the scalar product of the PPE, (3.13-242), with $P_H$ yields, using the fact that $P_H^T C^T = 0$,

$$P_H^T \dot g(t) = 0, \tag{3.16-8}$$
which is the time derivative of the constraint given earlier for the index 2 formulation, (3.13-31), and corresponds to the time derivative of (3.10-13) in the continuum; see the discussion on 'solvability' following Remark (9) after (3.10-13). So, only if the normal velocity is specified on all of $\Gamma$ and is time-varying on some portion of $\Gamma$ does the PPE method 'retain' the original solvability constraint related to global mass conservation. Then, if $\sum_i \dot g_i(0) \ne 0$, the index 1 system is also ill-posed (even if $C^T u_0 = g_0$, actually) and has no solution because the PPE for $P$ at $t = 0$ is ill-posed.

Except for cases where $g(t)$ is time-dependent and $P_H^T \dot g(t) \ne 0$, the violation of $C^T u_0 = g_0$ is also depicted in Figure 3.10-1 of Section 3.10, to which we now return. In Figure 3.10-1, the union of the two larger ellipses is now described by $C^T u_0 \ne g_0$, and the horizontal ellipse describes $P_H^T g_0 \ne 0$, where $\dot g(t) = 0$. ($C^T u = g \Rightarrow P_H^T g = 0$ and thus $P_H^T g \ne 0 \Rightarrow C^T u \ne g$; but $C^T u \ne g$ does not $\Rightarrow P_H^T g \ne 0$.) In Figure 3.10-2, as in Figure 3.10-1, all six cases correspond to $C^T u_0 \ne g_0$; here, though, if we generalize our interpretation and allow $g(t)$ to be time-dependent and interpret Cases 4-6 as a violation of $P_H^T \dot g(t) = 0$, then these three cases are also ill-posed via the PPE formulation. (Recall that two procedures for fixing this problem were presented in Section 3.13.1g.) Finally, the vectors in both figures might now make more sense intuitively since each vector could correspond to a velocity at just one node.

One final remark on solvability: if inflow = outflow in (special) Case 2, this case solves the index 1 DAE's (but not those of index 2) even when the normal velocity on $\Gamma$ is time-varying.
We close this introductory discussion with a quotation from two leading general-purpose code developers before we present a few methods: 'The development of codes for DAE's is not a straightforward task because of the difficulties in the computation arising from the singular part of the system and the coupling to the differential part, which do not occur for ODE systems. In particular, starting, error estimation, and solving the nonlinear system all present difficulties, even for index 1 systems.' (Hindmarsh and Petzold, 1988).

3.16.1 Some Time-Integration Methods for the DAE's

In this section we shall show how to apply several of the ODE methods discussed in the previous chapter to both index 2 and index 1 formulations. In Volume II, we shall discuss how to solve the resulting equations for a few of the methods. Before this, however, we will first form another index 0 formulation (ODE's), one that is useful only in that it shows how to properly integrate the index 1 DAE's; it is not useful for 'generating' code. But before we do even this, we introduce a new, highly condensed notation, to save 'ink,' because we will be rewriting these DAE's many times. Thus, we rewrite the index 2 and index 1 formulations, respectively, in the following two ways:

(i)
$$\dot u + GP = f(u), \tag{3.16-9}$$
$$Du = g(t) \tag{3.16-10}$$

and, by differentiation of the constraint,

(ii)
$$\dot u + GP = f(u), \tag{3.16-11}$$
$$LP = Df(u) - \dot g(t), \tag{3.16-12}$$
where (obviously) $G = M^{-1}C$, $f(u) = M^{-1}[f - Ku - N(u)u]$ (and we apologize if the dual use of $f$ causes any confusion), $D = C^T$, and $L = DG = L^T$ is the Laplacian. It is noteworthy that many simple FDM formulations already display this concise description, in which the new terms really are simple because $M = I$. But it is also important to remember that we are dealing with $M \ne I$ and, in the general case, $M^{-1}$ is dense so that these new definitions are more 'formal.' Also slightly noteworthy is that many FDM formulations do not generate a symmetric $L$. Finally, we point out that the stability of the DAE's can only be assured if the advection matrix is skew-symmetric.

Some remarks on initial conditions: whereas (3.16-9) and (3.16-10) are well-posed given any initial velocity, $u_0$, satisfying $Du_0 = g_0$ (and not otherwise), (3.16-11) and (3.16-12) are well-posed for any $u_0$. Thus, index 1 solutions are actually wrong/spurious if $Du_0 \ne g_0$. Also, the proper/appropriate/consistent initial pressure, $P_0$, cannot be chosen arbitrarily; it must be derived, and this can be done in either of two equivalent ways: (i) solve (3.16-9) and the time-derivative of (3.16-10) at $t = 0$ for $\dot u_0$ and $P_0$; (ii) solve (3.16-12) at $t = 0$ for $P_0$.

Now we find our alternate index 0 (ODE) formulation by solving (3.16-12) for $P$ and placing the result in (3.16-11) to get

$$\dot u = (I - GL^{-1}D)f(u) + GL^{-1}\dot g = \wp f(u) + GL^{-1}\dot g \tag{3.16-13}$$
$$\phantom{\dot u} = \wp f(u) + \dot v, \quad \text{where } v(t) \equiv GL^{-1}g(t) \tag{3.16-14}$$

and $\wp = I - GL^{-1}D$ is a projection matrix/operator. See Appendix 3. Note that these ODE's $\Rightarrow D\dot u = \dot g$, since $D\wp = 0$, $DG = L$ and $D\dot v = \dot g$. These ODE's are not in the original variables ($P$ is gone), and (3.16-13) is also 'formal' in that it is not a useful representation from which to launch code writing. But it is, or can be, useful as a canonical approach for the application of ODE methods to DAE's (A.C. Hindmarsh, personal communication); it is also useful when performing theoretical analysis, as we shall soon see.
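The algebraic properties used above ($D\wp = 0$, $\wp^2 = \wp$, and $D\dot v = \dot g$) are easy to check numerically on a small example. The sketch below is only an illustration: the matrix `D` is a made-up full-row-rank matrix (not taken from any mesh), the mass matrix is assumed to be the identity so that $G = D^T$, and the helper names `project` and `lift` are ours.

```python
import numpy as np

# Hypothetical 2-constraint, 6-unknown "divergence" matrix, chosen only so that
# D has full row rank; it does not come from any particular mesh.
D = np.array([[1.0, -1.0, 0.0, 0.5, 0.0, 0.0],
              [0.0, 1.0, -1.0, 0.0, 0.5, 0.0]])
G = D.T            # assumes M = I (unit lumped mass), so G = M^{-1} C = D^T
L = D @ G          # discrete Laplacian L = D G, symmetric positive definite here

def project(w):
    """Apply the projector (I - G L^{-1} D) to a vector w."""
    return w - G @ np.linalg.solve(L, D @ w)

def lift(rhs):
    """Return G L^{-1} rhs, a vector whose 'divergence' D(.) equals rhs."""
    return G @ np.linalg.solve(L, rhs)

w = np.arange(1.0, 7.0)          # arbitrary test vector
gdot = np.array([0.3, -0.2])     # arbitrary value of dg/dt at some instant

pw = project(w)
vdot = lift(gdot)

print("D (P w)       =", D @ pw)             # ~ 0: projected vectors are divergence-free
print("P (P w) - P w =", project(pw) - pw)   # ~ 0: projecting twice changes nothing
print("D vdot - gdot =", D @ vdot - gdot)    # ~ 0: the lifted part carries the constraint rate
```

Note that the sketch never forms $L^{-1}$ or $\wp$ explicitly; it only solves small systems with $L$, which is closer to how such operators would be applied in practice.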
In fact, however, if (3.16-13) were construed as a legitimate index-determining ODE system, then the erroneous conclusion that the index of (3.16-9) and (3.16-10) is 1 would obtain.

Remarks:

(1) Here and hereafter, unless otherwise specified, we presume, when writing $\dot g_n = \dot g(t_n)$, that we have a given continuous function of time, just as we presume for $g_n = g(t_n)$.

(2) The penalty method (see Section 3.13.2e), $P = \lambda(Du - g)$ for $\lambda$ 'large', generates a more useful (albeit quite stiff) index 0 system, $\dot u + \lambda GDu = f(u) + \lambda Gg$, that we shall return to later (at the end of Section 3.16.4).

(3) Here and hereafter we presume that spurious pressure modes are either absent or have been properly accommodated (see Section 3.13.2) so that our matrices are either non-singular or, at worst, our problems are 'consistent singular.'

(4) If the IC for the NSE's is the most general possible, there will be an ephemeral vortex sheet on that portion of $\Gamma$ on which the tangential velocity is specified. If this sheet is important, then a fine spatial mesh near $\Gamma$ will be required as well as small timesteps at small time, regardless of the time integration method selected.
a. Primitive equations/index 2

We shall start at the 'top': index 2. We shall present two explicit and four implicit methods, and then argue that, for GFEM at least, only implicit methods make much sense for this formulation. [What we call 'explicit' some (more careful) authors, e.g., Hairer et al. (1989), call 'half explicit', because $P$ is always implicit.]

1. Forward Euler. Applied to (3.16-9), (3.16-10), this simplest of methods goes like so: given $u_n$ with $Du_n = g_n$, solve

$$\frac{u_{n+1} - u_n}{\Delta t} + GP_n = f_n = f(u_n) \tag{3.16-15}$$

and

$$Du_{n+1} = g_{n+1} \tag{3.16-16}$$

for $u_{n+1}$ and $P_n$. As a coupled linear system, this looks like

$$\begin{pmatrix} I & \Delta t\,G \\ D & 0 \end{pmatrix} \begin{pmatrix} u_{n+1} \\ P_n \end{pmatrix} = \begin{pmatrix} u_n + \Delta t\,f_n \\ g_{n+1} \end{pmatrix}, \tag{3.16-17}$$

which we immediately 'expand,' using our previous definitions, to

$$\begin{pmatrix} M & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} u_{n+1} \\ \Delta t\,P_n \end{pmatrix} = \begin{pmatrix} Mu_n + \Delta t\,[f_n - Ku_n - N(u_n)u_n] \\ g_{n+1} \end{pmatrix} \tag{3.16-18}$$

(with a different $f_n$, of course), which leads to the following

Remarks:

(1) The time 'index' on $P$ is perfectly proper; it should not be $P_{n+1}$ as many believe, a statement we shall later prove.

(2) If the mass is lumped (or if FDM is used), then the system in (3.16-17) easily uncouples to the following less expensive sequence [solve for $u_{n+1}$ in (3.16-15) and put the result in (3.16-16)]: (i) solve $LP_n = Df_n - (g_{n+1} - g_n)/\Delta t$; (ii) compute $u_{n+1} = u_n + \Delta t\,(f_n - GP_n)$, which turns out to be exactly the same as applying forward Euler to the index 1 DAE's after approximating $\dot g$ by $(g_{n+1} - g_n)/\Delta t$, as we later show. The above PPE is an example of the 'hidden constraint equation' with higher-order DAE's, referred to in the introduction, and of the implied numerical differentiation of the constraint in the index 2 solution, a fact that will be seen to sometimes adversely affect the local accuracy of the pressure.

(3) If the mass is not lumped (GFEM), then (3.16-17) does not (easily) uncouple, and we have an expensive explicit method, and one that is not recommended.
(4) If $P_H$ exists ($n \cdot u$ specified on all of $\Gamma$), then (3.16-15) and (3.16-16) are solvable if and only if $P_H^T g_{n+1} = 0$, $n = 0, 1, \ldots$, and the resulting matrix singularity can then be removed by specifying any one of the elements of the pressure vector, at any value. This remark applies to all time-integration methods.

(5) If $Du_0 \ne g_0$, then $P_0$ will behave like $O[(Du_0 - g_0)/\Delta t]$ and $u_1 - u_0 = O(1)$ in $\Delta t$ with $Du_1 = g_1$; this is an example of IC's that give non-smooth solutions because the
problem is ill-posed. In fact, $LP_0 = Df_0 + (Du_0 - g_1)/\Delta t = Df_0 - \dot g_0 + (Du_0 - g_0)/\Delta t + O(\Delta t)$ and $u_1 = \wp(u_0 + \Delta t\,f_0) + v_1 = \wp u_0 + v_1 + O(\Delta t)$. In fact, $\Delta t\,P_0$ approximates [to $O(\Delta t)$] the Lagrange multiplier of the $L_2$-projection discussed in Sections 3.10.2 and 3.13.1d, and $u_1 - v_1$ is clearly $O(\Delta t)$ away from the corresponding projected velocity.

2. Second-order Adams-Bashforth. AB2 applied to (3.16-9) and (3.16-10) gives, given $u_n$ with $Du_n = g_n$, $u_{n-1}$ with $Du_{n-1} = g_{n-1}$, and $P_{n-1}$,

$$\frac{u_{n+1} - u_n}{\Delta t} = \frac{3}{2}(f_n - GP_n) - \frac{1}{2}(f_{n-1} - GP_{n-1}) \tag{3.16-19}$$

and

$$Du_{n+1} = g_{n+1}, \quad \text{for } n = 1, 2, \ldots, \tag{3.16-20}$$

which again are to be solved for $u_{n+1}$ and $P_n$. Like forward Euler, this is only a feasible method for FDM or lumped mass FEM, because the equations then uncouple to permit the following sequential solution steps, obtained after inserting $u_{n+1}$ from (3.16-19) into (3.16-20): (i) compute $P_n$ from the implied PPE, $L[(3P_n - P_{n-1})/2] = D[(3f_n - f_{n-1})/2] - (g_{n+1} - g_n)/\Delta t$; (ii) compute $u_{n+1}$ from (3.16-19). Consistent mass 'demands' implicit methods for these DAE's even more than it did in the case of the scalar transport equation of the previous chapter. Finally, we mention that:

(i) Start-up is, as for ODE's, 'special'; typically forward Euler would be used to take the first (and smaller) step.

(ii) In order to see what approximation to $\dot g$ is implied by AB2, simply insert $Df(u) - LP = \dot g$ from (3.16-12) into the above PPE to obtain the implied differentiation of the constraint,

$$\dot g_n = \frac{1}{3}\,\dot g_{n-1} + \frac{2}{3}\,\frac{g_{n+1} - g_n}{\Delta t} \cong \left.\frac{dg}{dt}\right|_{t_n},$$

which rearranges to $g_{n+1} = g_n + (\Delta t/2)(3\dot g_n - \dot g_{n-1})$, the AB2 integration of a system of ODE's for $g$. This observation will later be seen to be quite 'general' and quite significant.

3. Backward Euler.
The simplest implicit method gives

$$\frac{u_{n+1} - u_n}{\Delta t} + GP_{n+1} = f_{n+1} \tag{3.16-21}$$

and

$$Du_{n+1} = g_{n+1}, \tag{3.16-22}$$

a fully coupled non-linear system for $(u_{n+1}, P_{n+1})$.

Remarks:

(1) The advanced time-level index on both $u$ and $P$ is related to the fact that BE cares not whether $u_0$ is discretely divergence-free; it thus also has no need for the initial pressure, $P_0$. Thus, this most robust of all implicit schemes can, like the less robust, simplest explicit scheme, even solve ill-posed index 2 DAE's, and is related to the
introductory remark that 'Not all IC's admit a smooth solution'; a la Remark (5) under FE.

(2) The discrete PPE satisfied (implied) by (3.16-21) and (3.16-22) is $LP_{n+1} = Df_{n+1} - (g_{n+1} - Du_n)/\Delta t$, which, if $Du_0 = g_0$, becomes $LP_{n+1} = Df_{n+1} - (g_{n+1} - g_n)/\Delta t$, which approximates (3.16-12) and which, if $Du_0 \ne g_0$, gives, using $g_{n+1} = g_n + \Delta t\,\dot g_n + O(\Delta t^2)$ at $n = 0$, $LP_1 = Df_1 - \dot g_0 - (g_0 - Du_0)/\Delta t + O(\Delta t)$, like FE on $P_0$; i.e., $P_1 \to \infty$ as $\Delta t \to 0$. Another consequence of this ill-posedness is (again) that $u_1 - u_0$ is $O(1)$ in $\Delta t$ (not smooth) rather than $O(\Delta t)$ as when well-posed; $u_1 = \wp u_0 + v_1 + O(\Delta t)$.

(3) It turns out (see Gresho (1991b) and Section 3.16.1d) that $u_1$ is, for $Du_0 \ne g_0$ and $\Delta t \to 0$, an $L_2$-projection of $u_0$ onto the discretely divergence-free subspace, and $\Delta t\,P_1$ is the associated Lagrange multiplier, a statement that we have already applied to FE with $\Delta t\,P_1$ replaced by $\Delta t\,P_0$; BE (or FE) applied to an ill-posed problem 'changes the data' so that after one timestep, it becomes well-posed, at least for those cases violating $Du_0 = g_0$ but not $P_H^T g_0 = 0$ (problems violating $P_H^T g_0 = 0$ when $P_H$ exists are ill-posed because of bad BC's, not bad IC's).

(4) As for ODE's, setting $\Delta t = \infty$ recovers the steady equations; i.e., BE can, in principle, obtain the (or a) steady state in one step, assuming one exists, which solution may not be (temporally) stable.

4. Trapezoid rule. A single step up from BE gives the next member of the Adams family: given $u_n$ with $Du_n = g_n$ and $P_n$, solve

$$\frac{u_{n+1} - u_n}{\Delta t} + \frac{1}{2}\,G(P_n + P_{n+1}) = \frac{f_n + f_{n+1}}{2}, \tag{3.16-23}$$
$$\frac{\Delta t}{2}\,D(u_n + u_{n+1}) = \frac{\Delta t}{2}\,(g_n + g_{n+1}) \tag{3.16-24}$$

for $(u_{n+1}, P_{n+1})$, which prompts us to mention how (3.16-24), rather than $Du_{n+1} = g_{n+1}$, which we shall soon discuss, came about.
A general/canonical method of discovering how an implicit ODE method can be applied to a DAE system is to first write the latter in time-singular form (Campbell, 1980), $A\dot y = b(y)$, where $A$ is singular and $b$ contains the algebraic constraints (which, of course, can be done for our two DAE systems), and then 'multiply' the selected ODE method by $A$ and replace $A\dot y$ by $b$; i.e., for TR applied to $\dot y = b$, we have $y_{n+1} = y_n + \Delta t\,(\dot y_n + \dot y_{n+1})/2$, and thus $Ay_{n+1} = Ay_n + \Delta t\,(b_n + b_{n+1})/2$. This is how (3.16-23), (3.16-24) was obtained from (3.16-9), (3.16-10), with

$$y = \begin{pmatrix} u \\ P \end{pmatrix}, \quad A = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}, \quad \text{and} \quad b = \begin{pmatrix} f - GP \\ g - Du \end{pmatrix}.$$

[Another way to arrive at the same result is to replace $Du = g$ by the (stiff) ODE, $\tau\dot P = g - Du$, apply the implicit ODE method, and then let $\tau \to 0$.] But it is easy to see, via induction, since we assumed that $Du_n = g_n$ for all $n$ (in particular for $n = 0$), that (3.16-24) quickly simplifies to

$$Du_{n+1} = g_{n+1}, \tag{3.16-25}$$

which we shall call shortened TR (STR); and this is the equation that would probably be used in most codes. But the above difference in the treatment of $Du = g$ leads to a useful digression in which we briefly return to the subject of wiggles and wiggle signals. It turns out that
the 'long' version of TR, referred to as the 'direct approach' by Hairer and Wanner (1991), using (3.16-24), can alert the code user to some difficulties that would mostly be overlooked by the 'short' version, (3.16-25), the 'indirect approach' (Hairer and Wanner, 1991). To see this distinction, we first examine the two PPE's that are implied by the two methods. Writing $P_{n+1/2} = (P_n + P_{n+1})/2$ and $f_{n+1/2} = (f_n + f_{n+1})/2$, (3.16-23) and (3.16-24) give, in the 'general' case ($Du_0 \ne g_0$), for $n = 0, 1, \ldots$,

$$LP_{n+1/2} = Df_{n+1/2} - (g_{n+1} - g_n)/\Delta t + 2(-1)^n (Du_0 - g_0)/\Delta t, \tag{3.16-26}$$

and the velocity would satisfy [see (3.16-13)]

$$u_{n+1} = u_n + \Delta t\,\wp f_{n+1/2} + GL^{-1}(g_{n+1} - g_n) - 2(-1)^n GL^{-1}(Du_0 - g_0) \tag{3.16-27}$$

and

$$Du_n - g_n = (-1)^n (Du_0 - g_0). \tag{3.16-28}$$

Thus, there is a persistent $2\Delta t$ oscillation/wiggle in the solution, which is larger the larger the initial divergence error. Especially noteworthy is the pressure oscillation; contrary to 'conventional' large-$\Delta t$ TR oscillations, these oscillations increase as $\Delta t$ is decreased. We can and shall regard this as a wiggle signal: a clear message to the user that s/he should re-examine the input data. [A recent example of this very thing is contained in Marx (1994), except that the signal was either not 'received' or not heeded; rather, TR was wrongly condemned, again.]

On the other hand, the shortened TR, (3.16-23) and (3.16-25), yields the following:

$$u_1 = u_0 + \Delta t\,\wp f_{1/2} + GL^{-1}(g_1 - g_0) - GL^{-1}(Du_0 - g_0) = \wp u_0 + v_1 + O(\Delta t), \tag{3.16-29}$$

which satisfies $Du_1 = g_1$ and, for $n = 1, 2, \ldots$,

$$u_{n+1} = u_n + \Delta t\,\wp f_{n+1/2} + GL^{-1}(g_{n+1} - g_n), \tag{3.16-30}$$

which satisfies $Du_{n+1} = g_{n+1}$; for the pressure,

$$LP_{1/2} = Df_{1/2} - (g_1 - g_0)/\Delta t + (Du_0 - g_0)/\Delta t, \tag{3.16-31}$$

and, for $n \ge 1$,

$$LP_{n+1/2} = Df_{n+1/2} - (g_{n+1} - g_n)/\Delta t. \tag{3.16-32}$$

Thus, except for the first step, the shortened version (like the Euler methods) is, provided that only $P_{n+1/2}$ is reported, oblivious to ill-posedness; bad initial data will be changed (via an appropriate $L_2$-projection) during the first timestep, and a different-but-legitimate (consistent) problem solved thereafter. But, as for FE and BE, the code may be solving a different problem than the code user thinks is being solved.

Unless one computes and reports only $P_{n+1/2}$ (a viable option, actually), the initial pressure field is needed in (3.16-23) for $n = 0$, and it comes (only; there is no other choice, which extends our definition of 'honest GFEM' further into the time domain) from the DAE's applied, with differentiated constraint equation, at $t = 0$; i.e., given $u_0$ with $Du_0 = g_0$, solve

$$\dot u_0 + GP_0 = f_0 \tag{3.16-33}$$
and

$$D\dot u_0 = \dot g_0, \tag{3.16-34}$$

a linear system with matrix $\begin{pmatrix} I & G \\ D & 0 \end{pmatrix}$ or, in GFEM form, $\begin{pmatrix} M & C \\ C^T & 0 \end{pmatrix}$, for the initial acceleration and pressure, $(\dot u_0, P_0)$. From the above, we observe the following general result: every divergence-free velocity field induces a concomitant pressure and a divergence-free acceleration, an observation that is also true for the PDE's; recall Remark (5) below (3.10-13). The initial pressure is thus seen to satisfy the appropriate PPE at $t = 0$,

$$LP_0 = Df_0 - \dot g_0. \tag{3.16-35}$$

Note that even when (3.16-33) and (3.16-34) [and thus (3.16-35)] are solvable, both $\dot u_0$ and $P_0$ make little sense if $Du_0 \ne g_0$. Note too that if the hydrostatic pressure mode, $P_H$, is present and $P_H^T \dot g_0 \ne 0$, (3.16-35) has no solution, even if $Du_0 = g_0$. If $P_H$ is present and $P_H^T \dot g_0 = 0$, the singularity in $L$ can be removed by 'pegging' a pressure, as stated earlier, with no adverse effects.

Returning, however, to the ill-posed case ($Du_0 \ne g_0$), we wish to point out the pressure behaviour for both TR and STR. Rather than the average pressure behaviour discussed above, we get, with $P_0$ from (3.16-35) and for $n = 1, 2, \ldots$,

$$LP_n = Df_n + (-1)^n \left[\, 2\sum_{j=0}^{n-1} (-1)^j\,\frac{g_{j+1} - g_j}{\Delta t} - \dot g_0 - 2\beta_n\,\frac{Du_0 - g_0}{\Delta t} \right], \tag{3.16-36}$$

where $\beta_n = 2n$ for TR and $\beta_n = 1$ for STR. Thus, a 'true' TR simulation with $Du_0 \ne g_0$ will show a linearly growing oscillatory pressure, whereas STR will merely oscillate; each, of course, is a wiggle signal (especially for small $\Delta t$) that begs for attention.

Finally, for the well-posed case ($Du_0 = g_0$), it is interesting to find the implied PPE for $P_{n+1}$ rather than that for $(P_n + P_{n+1})/2$ as done thus far. From (3.16-26) applied at $n = 0$ and (3.16-35), we obtain, by induction,

$$LP_{n+1} = Df_{n+1} - \left( \frac{2(g_{n+1} - g_n)}{\Delta t} - \dot g_n \right) \tag{3.16-37}$$

(where it is important to point out that $\dot g_n$ is a TR approximation to $dg/dt$ at $t_n$, but $\dot g_n \ne dg/dt$ at $t_n$), which is (3.16-12) applied at $t = t_{n+1}$ with a 'special' (appropriate) approximation to $\dot g_{n+1}$; namely, $\dot g_{n+1} = 2(g_{n+1} - g_n)/\Delta t - \dot g_n$, which is just (the inversion of) TR applied to an ODE system for $g$.

5. Second-order backward-differentiation method. BDF2, like BE, needs neither pressure nor a divergence-free IC to advance the solution; i.e., given $u_n$ and $u_{n-1}$, from (2.7-72) of the previous chapter, we get

$$\frac{u_{n+1} - u_n}{\Delta t} = \frac{1}{3}\,\frac{u_n - u_{n-1}}{\Delta t} + \frac{2}{3}\,(f_{n+1} - GP_{n+1}) \tag{3.16-38}$$

and

$$Du_{n+1} = g_{n+1}, \quad n = 0, 1, 2, \ldots, \tag{3.16-39}$$

a non-linear system in $(u_{n+1}, P_{n+1})$. Note that all BDF methods apply the algebraic constraint equation only at the advanced time so that, like the BE case already discussed,
an infinite timestep recovers the steady-state NS equations (but not necessarily their solution, even if one exists). If $Du_0 = g_0$, the implied PPE and concomitant approximation to $\dot g(t)$ is easily found to be

$$LP_{n+1} = Df_{n+1} - (3g_{n+1} - 4g_n + g_{n-1})/2\Delta t, \tag{3.16-40}$$

which is (again) the appropriate PPE with $\dot g_{n+1}$ consistently (a la BDF2) approximated; cf. (2.7-73).

Since BDF2 is not self-starting, it may be well to address this issue here. There are two choices, basically, both viable: (i) the first step could be made with the first-order member of the same BDF family; namely, BDF1, which of course is backward Euler; because it is only first-order accurate, the $\Delta t$ should be smaller (by a factor of approximately five to 10) than that for BDF2; (ii) the TR, via (3.16-23) and (3.16-25) with $Du_0 = g_0$, could also be used to obtain $u_1$, and at the same $\Delta t$ contemplated for BDF2.

6. Implicit midpoint rule. The last implicit method we consider, on the index 2 system, and the third of the 'second-order-accurate' family, is IMR:

$$\frac{u_{n+1} - u_n}{\Delta t} + G\left(\frac{P_n + P_{n+1}}{2}\right) = f\left(\frac{u_n + u_{n+1}}{2}\right) \tag{3.16-41}$$

and

$$\frac{\Delta t}{2}\,D(u_n + u_{n+1}) = \Delta t\; g\!\left(\frac{t_n + t_{n+1}}{2}\right) \quad \text{for } n = 0, 1, 2, \ldots, \tag{3.16-42}$$

which, unless $g(t)$ is either constant or a linear function of time, introduces a slight peculiarity because the divergence-free constraint applies only at the midpoint; i.e., $Du_n \ne g_n$ in the general case, even if $Du_0 = g_0$. [Expanding $g$ about the midpoint in (3.16-42) gives $Du_1 = g_1 + O(\Delta t^2)$.] IMR is also a member of the family of implicit Runge-Kutta methods, all of which share this same 'difficulty.' For the (common) case in which $g$ is independent of time, IMR will also satisfy the constraint equation at endpoints as well as midpoints, if $Du_0 = g$. The implied PPE is also different:

$$L(P_n + P_{n+1})/2 = Df\big((u_{n+1} + u_n)/2\big) - D(u_{n+1} - u_n)/\Delta t, \tag{3.16-43}$$

which implies a differentiation of $Du$ rather than $g$.
Another good candidate (better, probably) would be a modified IMR in which the constraint is enforced at the 'end-of-step'; i.e., use $Du_{n+1} = g_{n+1}$ rather than (3.16-42). This implies the following PPE and concomitant $\dot g$ approximation, replacing (3.16-43):

$$L(P_n + P_{n+1})/2 = Df\big((u_n + u_{n+1})/2\big) - (g_{n+1} - g_n)/\Delta t. \tag{3.16-44}$$

These are all of the methods that we wish to discuss for now, concluding with some final remarks on index 2 time-marching methods:

1. Unless the mass is lumped (or unless a simple finite-difference method is used to generate the DAE's), only implicit methods should be considered. If lumped mass is employed, then the index 1/PPE 'version' (which 'falls out' of the index 2 equations) of any explicit integration method should be used, because cheaper sequential solutions then obtain.
2. The first-order BE method should generally only be used if a rapid approach (via fairly 'large' timesteps) to a steady state, assuming one exists, is desired. For this reason, we will later present a 'smart' BE integrator, even though it is rather inefficient for obtaining 'time-accurate' transient solutions (too many small steps are needed).

3. Among the three second-order methods,

(i) TR is not recommended by some ODE/DAE experts (e.g., Brenan et al., 1996) because of its neutral stability. We, however, have had much success with TR, especially when used with variable step size via error control, as we present later, and are not afraid of it. [Neither is Gartling, in his NACHOS II code (1987), who has told us that TR (with error control) 'always' works well (personal communication).]

(ii) IMR may be a good choice if a fixed-$\Delta t$ code is written because it conserves 'energy' ($u^T u$ or $u^T M u$) regardless of step size, and because variable $\Delta t$ methods are not easy to derive for the one-leg family. (TR is indefinite with respect to energy conservation, and BDF2 removes energy.) But we do not believe in fixed-$\Delta t$ implicit integrators.

(iii) BDF2 may be the best choice of all because of its (extra) stability (L-stable rather than just A-stable) and because (like TR) implementation of error control and variable $\Delta t$ is easy; its disadvantage is artificial dissipation (the price for L-stability). Thus, we will later also present a robust, variable-step BDF2 integration technique.

(iv) If unconditional stability is desired or required, then only implicit methods should be used, regardless of mass lumping issues.

b. PPE methods/index 1

For the index 1 system, (3.16-11) and (3.16-12), we shall consider four explicit methods (FE, AB2, RK2 and LF) but only one implicit method, and the latter only to make the point that index 1 and implicit methods are not a good combination (implicit methods should be applied directly to the index 2 DAE's).
Of the explicit methods considered, only one, AB2, will be described in any detail. Another important point is that consistent mass virtually precludes all index 1 solution methods; we will therefore soon limit our discussion to lumped mass FEM or FDM.

1. Backward Euler. Let us start with the (simplest) implicit example, backward Euler. Applied to (3.16-11) and (3.16-12), it is, simply,

$$u_{n+1} = u_n + \Delta t\,(f_{n+1} - GP_{n+1}) \tag{3.16-45}$$

and

$$LP_{n+1} = Df_{n+1} - \dot g_{n+1}, \tag{3.16-46}$$

a fully coupled non-linear system in $(u_{n+1}, P_{n+1})$. Since $L = DG = C^T M^{-1} C$ and $Df(u) = C^T M^{-1}[f - Ku - N(u)u]$, and $M^{-1}$ is dense, it is clear that GFEM is just not meant to be integrated with implicit methods in index 1 form. Even if the mass were lumped (or FDM used), the system is still fully coupled, and, as with all ODE methods applied to the true index 1 DAE's, the resulting solution will generally not even be discretely divergence-free; from (3.16-45) and (3.16-46), it is easy to derive that $Du_{n+1} = Du_n + \Delta t\,\dot g_{n+1} \ne g_{n+1}$, even when $Du_0 = g_0$.
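The last statement can be illustrated with a small linear model problem. In the sketch below everything specific is an assumption made for illustration: the constraint matrix `D`, the identity mass matrix (so $G = D^T$), the linear right-hand side $f(u) = s - Ku$, and the data $g(t) = c\,\sin t$. With those choices, backward Euler on the index 1 system (3.16-45), (3.16-46) reduces to one coupled linear solve per step, and the computed velocity satisfies $Du_{n+1} = Du_n + \Delta t\,\dot g_{n+1}$ to roundoff while drifting away from $Du = g$.

```python
import numpy as np

D = np.array([[1.0, -1.0, 0.0, 0.5, 0.0, 0.0],
              [0.0, 1.0, -1.0, 0.0, 0.5, 0.0]])   # hypothetical constraint matrix
G = D.T                                            # assumes M = I
L = D @ G
K = np.diag(np.linspace(0.1, 0.6, 6))              # assumed linear 'diffusion' matrix
s = np.ones(6)                                     # assumed constant forcing
c = np.array([1.0, -0.5])

def g(t):
    return c * np.sin(t)      # assumed time-dependent constraint data

def gdot(t):
    return c * np.cos(t)

def be_index1_step(u_n, t_n, dt):
    """One backward-Euler step of (3.16-45), (3.16-46) with f(u) = s - K u."""
    n_u = G.shape[0]
    # The unknowns (u_{n+1}, P_{n+1}) remain coupled even though M = I:
    #   (I + dt K) u + dt G P = u_n + dt s
    #          D K u +    L P = D s - gdot(t_{n+1})
    A = np.block([[np.eye(n_u) + dt * K, dt * G],
                  [D @ K, L]])
    b = np.concatenate([u_n + dt * s, D @ s - gdot(t_n + dt)])
    x = np.linalg.solve(A, b)
    return x[:n_u], x[n_u:]

def run(dt, n_steps):
    w = np.arange(1.0, 7.0)
    u = w - G @ np.linalg.solve(L, D @ w)   # projected start: D u0 = g(0) = 0
    for n in range(n_steps):
        u, _ = be_index1_step(u, n * dt, dt)
    return u

dt = 0.01
u0 = run(dt, 0)
u1 = run(dt, 1)
u100 = run(dt, 100)
print("one-step identity residual:", D @ u1 - D @ u0 - dt * gdot(dt))
print("constraint error at t = 1: ", D @ u100 - g(100 * dt))
```

The first printed vector should be at roundoff level, while the second is a visibly non-zero constraint error of order $\Delta t$ for this assumed $g(t)$.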
We now go to the explicit methods, beginning with AB2.

2. Second-order Adams-Bashforth. Because we advocate higher than first-order methods, we begin with AB2, which is best applied first (and formally) to the second index 0 form, (3.16-13), to give

$$u_{n+1} = u_n + \frac{\Delta t}{2}\left[\,3(\wp f_n + GL^{-1}\dot g_n) - (\wp f_{n-1} + GL^{-1}\dot g_{n-1})\right]. \tag{3.16-47}$$

Next, to recover the appropriate index 1 version that is useful, we explicitly construct the projection of $f$, after rearranging the above result to $(u_{n+1} - u_n)/\Delta t = \tfrac{3}{2}[f_n - GL^{-1}(Df_n - \dot g_n)] - \tfrac{1}{2}[f_{n-1} - GL^{-1}(Df_{n-1} - \dot g_{n-1})]$, which, noting that $L^{-1}(Df - \dot g)$ is an $m$-vector called $P$, yields

$$\frac{u_{n+1} - u_n}{\Delta t} = \frac{3}{2}(f_n - GP_n) - \frac{1}{2}(f_{n-1} - GP_{n-1}), \tag{3.16-48}$$

with

$$LP_n = Df_n - \dot g_n \quad \text{and} \quad LP_{n-1} = Df_{n-1} - \dot g_{n-1} \tag{3.16-49}$$

as the appropriate 'defining' equations for the pressure. Thus, presuming $P_{n-1}$ to be available (we will return to 'start-up' later), the AB2 method, as an 'algorithm,' is

Step 1. solve $LP_n = Df_n - \dot g_n$ for $P_n$;
Step 2. update $u_{n+1}$ from (3.16-48). Done.

We see already that index 1 (with lumped mass) and explicit methods are a good match since the (sole?) advantage of explicit methods is realized: there are no algebraic equations to solve for the 'ODE variable,' $u$. (There will always be algebraic equations to solve for $P$, because these are DAE's, not ODE's.)

So how close to divergence-free is the resulting velocity? This is most easily answered by applying $D$ to (3.16-47) and recalling that $D\wp = 0$ to give

$$\frac{D(u_{n+1} - u_n)}{\Delta t} = \frac{1}{2}(3\dot g_n - \dot g_{n-1}), \tag{3.16-50}$$

which is now a second-order accurate approximation (AB2 in fact) to $D\dot u = \dot g$; index 1 methods preserve the discrete divergence of the acceleration, but only up to the LTE, and they do not, in general, yield $Du_n = g_n$. If, however, the Dirichlet BC's were time-independent, we would have $\dot g = 0$, and then AB2 would be maintaining $Du_n = g$ if it started that way ($Du_0 = g$).
But we can do better yet when $\dot g \ne 0$ by constructing a special-but-appropriate approximation to $\dot g_n$; i.e., one that keeps the velocity discretely divergence-free regardless of step size. We shall call this approach a modified index 1 method, vis-a-vis 'true' index 1. Setting $Du_{n+1} = g_{n+1}$ and $Du_n = g_n$ in (3.16-50) yields the following approximation to $\dot g_n$ that is to be used in (3.16-49):

$$\dot g_n = \frac{2}{3}\,\frac{g_{n+1} - g_n}{\Delta t} + \frac{1}{3}\,\dot g_{n-1}, \tag{3.16-51}$$

which is simply the AB2 ODE method 'inverted.' The 'improved' (modified) AB2 index 1 algorithm is then
Step 1. solve $LP_n = Df_n - \tfrac{2}{3}(g_{n+1} - g_n)/\Delta t - \tfrac{1}{3}\dot g_{n-1}$;
Step 2. update $u_{n+1}$ from (3.16-48) and update $\dot g_n$ from (3.16-51).

Done, and the result will satisfy $Du_{n+1} = g_{n+1}$, the index 2 constraint equation (iff $Du_0 = g_0$). So, if the mass is lumped, we have a good way to solve the modified index 1 DAE's. But, as the astute reader will have already realized, comparing this 'new' result to the index 2 integration with AB2, (3.16-19), (3.16-20) and the Poisson equations below that equation, we see ... equivalence! We have found the way to start with index 1 yet obtain a solution to index 2, at least with perfect 'arithmetic' (no roundoff, no iteration errors, etc.). But in the next section, we will discuss the (minor) 'cost' of this 'improvement' (reduced local pressure accuracy).

Final remarks on AB2: (i) Use forward Euler for the (smaller) first step, in the 'modified' manner to be described below. (ii) Before taking that first step, solve $LP_0 = Df_0 - \dot g_0$ for $P_0$.

3. Forward Euler. Applied to (3.16-11) and (3.16-12), with the help of (3.16-13) if necessary [via $u_{n+1} = u_n + \Delta t\,(\wp f_n + \dot v_n)$ and $\wp f = f - GP - \dot v$] to get the right time index on $GP$, it is

$$u_{n+1} = u_n + \Delta t\,(f_n - GP_n) \tag{3.16-52}$$

and

$$LP_n = Df_n - \dot g_n, \tag{3.16-53}$$

in which (3.16-53) is solved first. Equation (3.16-53) is the best equation to use to prove the assertion made earlier regarding the time index on $P$; i.e., the index is not $n + 1$. We offer two proofs, the second from J. Leone (personal communication, 1989):

(i) The elliptic (algebraic) PPE always enforces the pressure to be in equilibrium with the current velocity field; since the RHS of (3.16-53) shows a velocity field at time $t_n$, the LHS must agree: $P_n$ is correct, not $P_{n+1}$. QED.

(ii) If $P_{n+1}$ were correct, then the PPE would read $LP_{n+1} = Df_n - \dot g_n$, which makes the value of $P_{n+1} = P(t_n + \Delta t)$ completely independent of $\Delta t$! QED.
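Returning briefly to the two AB2 index 1 algorithms described above, the following sketch compares the 'true' version (exact $\dot g_n$ in Step 1) with the 'modified' version (the rate from (3.16-51)) on a small linear model problem. The matrices, the right-hand side $f(u) = s - Ku$, the data $g(t) = c\,\sin t$, the identity mass matrix, and the particular start-up (consistent $P_0$ for the AB2 history, forward Euler for the first step) are all assumptions of this illustration rather than a prescription. Under these assumptions the modified version should hold $Du_n = g_n$ to roundoff, while the true version leaves a small but clearly non-zero constraint error.

```python
import numpy as np

D = np.array([[1.0, -1.0, 0.0, 0.5, 0.0, 0.0],
              [0.0, 1.0, -1.0, 0.0, 0.5, 0.0]])   # hypothetical constraint matrix
G = D.T                                            # assumes M = I (lumped mass)
L = D @ G
K = np.diag(np.linspace(0.1, 0.6, 6))              # assumed linear operator in f(u)
s = np.ones(6)                                     # assumed constant forcing
c = np.array([1.0, -0.5])

def f(u):
    return s - K @ u

def g(t):
    return c * np.sin(t)

def gdot(t):
    return c * np.cos(t)

def pressure(u, rate):
    """Solve the PPE  L P = D f(u) - rate  for P."""
    return np.linalg.solve(L, D @ f(u) - rate)

def ab2_index1(dt, n_steps, modified):
    """Return |D u_N - g_N| after n_steps of AB2 on the index 1 system."""
    w = np.arange(1.0, 7.0)
    u = w - G @ np.linalg.solve(L, D @ w)          # projected start: D u0 = g(0) = 0
    # AB2 history at t = 0 uses the consistent initial pressure, L P0 = D f0 - gdot(0).
    rate_prev = gdot(0.0)
    acc_prev = f(u) - G @ pressure(u, rate_prev)
    # First step by forward Euler, in the true or the modified manner.
    rate_fe = (g(dt) - g(0.0)) / dt if modified else gdot(0.0)
    u = u + dt * (f(u) - G @ pressure(u, rate_fe))
    for n in range(1, n_steps):
        t_n = n * dt
        if modified:
            # (3.16-51): the AB2 formula 'inverted' for the constraint rate
            rate = (2.0 / 3.0) * (g(t_n + dt) - g(t_n)) / dt + rate_prev / 3.0
        else:
            rate = gdot(t_n)                       # true index 1: exact dg/dt
        acc = f(u) - G @ pressure(u, rate)         # Step 1, then f_n - G P_n
        u = u + dt * (1.5 * acc - 0.5 * acc_prev)  # Step 2, (3.16-48)
        acc_prev, rate_prev = acc, rate
    return np.linalg.norm(D @ u - g(n_steps * dt))

err_true = ab2_index1(0.01, 100, modified=False)
err_mod = ab2_index1(0.01, 100, modified=True)
print("true index 1 constraint error:    ", err_true)
print("modified index 1 constraint error:", err_mod)
```

The only difference between the two branches is the number subtracted on the right-hand side of the PPE, which is the sense in which the index 2 formulation 'tells' the index 1 formulation how to approximate $\dot g$.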
This (true) index 1 solution would satisfy $D(u_{n+1} - u_n)/\Delta t = \dot g_n$, a first-order approximation to $D\dot u = \dot g$, and thus $Du_n = g_n + O(\Delta t)$. Again, however, the simple and expedient use of the 'appropriate' approximation to $\dot g_n$ in (3.16-53), i.e., $\dot g_n = (g_{n+1} - g_n)/\Delta t$, recovers the index 2 result via our modified index-1 method; i.e., the resulting solution would satisfy (3.16-15) and (3.16-16).

Remark: We mentioned in Section 3.13.4a that PPE methods only enforce the weaker constraint $C^T\dot u = \dot g$. Using $\dot g_n = (g_{n+1} - g_n)/\Delta t$ in (3.16-53), it is easy to show that $Du_{n+1} = g_{n+1} + (Du_n - g_n)$. Any initial divergence error ($Du_0 - g_0 \neq 0$) will forever be 'frozen' in the fluid; two examples of this are presented in Gresho (1991b), and we shall present another at the end of this Section (3.16.1d).

Another way to interpret (and generalize) these results is that the index 2 formulation, via its implied ('hidden,' initially) PPE, which itself implies an approximation to $\dot g$, tells
the index 1 formulation the best way to approximate $\dot g$, an interpretation that we believe is useful and general [it applies to any ODE method, although only explicit ones, with lumped mass (or FDM), make sense for code writing]. It is also restricted to the situation in which the matrix $D$ is time-independent, which is usually the case. The exception is free-surface flows in which all matrices are time-dependent because of a continuously changing domain shape (see Volume II). As mentioned above, however, both index 2 and modified index 1 solutions give a less accurate pressure (but only locally) when $g$ is time-dependent; the details are presented in the next section.

With this new interpretation, we now present the final results for two more second-order explicit methods, leapfrog and Runge-Kutta, in both 'true' and modified formulations.

4. Leapfrog. The algorithm is as simple as FE (which it needs for the first step), but more accurate:

$$u_{n+1} = u_{n-1} + 2\Delta t(f_n - GP_n), \tag{3.16-54}$$

where

$$LP_n = Df_n - \dot g_n \tag{3.16-55}$$

for true index 1, or

$$LP_n = Df_n - (g_{n+1} - g_{n-1})/2\Delta t \tag{3.16-56}$$

for the modified version. But because LF2 is unstable for diffusion (cf. Chapter 2), its use for NS is probably ill-advised.

5. Second-order Runge-Kutta. Strictly speaking, RK2 requires two PPE 'solves' per timestep (once per stage), but see also Le and Moin (1991), in which a way was found to do just one. Also, like leapfrog (but in the other extreme), it is unstable for pure advection (Euler's equations) and must be used 'carefully', like FE. The general algorithm [see (2.7-12)] goes as follows:

Stage 1:

$$u_{n+1/2\gamma} = u_n + \Delta t(f_n - GP_n)/2\gamma, \tag{3.16-57}$$

where

$$LP_n = Df_n - 2\gamma(g_{n+1/2\gamma} - g_n)/\Delta t \tag{3.16-58}$$

is solved first.
Stage 2:

$$u_{n+1} = u_n + \Delta t\left[(1 - \gamma)(f_n - GP_n) + \gamma(f_{n+1/2\gamma} - GP_{n+1/2\gamma})\right], \tag{3.16-59}$$

where

$$LP_{n+1/2\gamma} = Df_{n+1/2\gamma} - \left[(g_{n+1} - g_n)/\gamma\Delta t - 2(1 - \gamma)(g_{n+1/2\gamma} - g_n)/\Delta t\right] \tag{3.16-60}$$

is solved first, with the result that

$$Du_{n+1} = g_{n+1}. \tag{3.16-61}$$

The way is now clear how to apply higher-order RK formulas to the DAE's; these are exercises we leave to the reader.
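The difference between 'true' and modified index 1 is easy to see on a toy problem. What follows is only a minimal sketch, not code from the text: it assumes an identity (lumped) mass matrix so that $G = D^T$ and $L = DG$, and the small matrices `D` and `K`, the forcing, and $g(t)$ are made-up illustrative data. It applies forward Euler, (3.16-52), with the PPE (3.16-53) solved first, using either the exact $\dot g_n$ (true index 1) or the difference quotient $(g_{n+1} - g_n)/\Delta t$ (modified index 1).

```python
import numpy as np

# Made-up test data (assumptions for illustration only).
D = np.array([[1.0, -1.0, 0.0, 0.5],
              [0.0, 1.0, -1.0, 0.25]])   # discrete divergence: 2 constraints, 4 velocities
G = D.T                                   # discrete gradient (identity lumped mass assumed)
L = D @ G                                 # discrete PPE matrix, L = DG
K = np.diag([0.5, 1.0, 1.5, 2.0])         # stand-in 'viscous' matrix
a = np.array([1.0, -0.5])

def g(t):
    return a * np.sin(2.0 * t)

def gdot(t):
    return 2.0 * a * np.cos(2.0 * t)

def f(u, t):
    # Generic right-hand side; here linear with a constant source.
    return -K @ u + np.array([1.0, 0.5, -0.5, 2.0])

def forward_euler_ppe(u0, dt, nsteps, modified):
    """FE on the index 1 system: solve the PPE, then update the velocity."""
    u, t = u0.copy(), 0.0
    for _ in range(nsteps):
        fn = f(u, t)
        gd = (g(t + dt) - g(t)) / dt if modified else gdot(t)
        P = np.linalg.solve(L, D @ fn - gd)   # (3.16-53), solved first
        u = u + dt * (fn - G @ P)             # (3.16-52)
        t += dt
    return u, t

w0 = np.array([0.3, -0.2, 0.1, 0.4])
u0 = w0 - G @ np.linalg.solve(L, D @ w0 - g(0.0))   # adjusted so that D u0 = g(0)

dt, nsteps = 0.01, 100
u_mod, t_end = forward_euler_ppe(u0, dt, nsteps, modified=True)
u_true, _ = forward_euler_ppe(u0, dt, nsteps, modified=False)
res_mod = np.linalg.norm(D @ u_mod - g(t_end))     # roundoff level
res_true = np.linalg.norm(D @ u_true - g(t_end))   # O(dt) drift

# Unadjusted initial data: the initial divergence error stays 'frozen'.
r0 = D @ w0 - g(0.0)
u_bad, _ = forward_euler_ppe(w0, dt, nsteps, modified=True)
frozen_change = np.linalg.norm((D @ u_bad - g(t_end)) - r0)

print(res_mod, res_true, frozen_change)
```

With these data the modified variant keeps $Du_n = g_n$ to roundoff, the true index 1 variant drifts by an amount proportional to $\Delta t$, and an initial residual $Du_0 - g_0 \neq 0$ is carried along unchanged, as the Remark following (3.16-53) states.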
Final remarks on index 1/PPE methods: (i) FE is generally not recommended. (ii) AB2 (or even AB3, not shown) may be preferable to RK2 because it needs to solve half as many PPE's. (iii) Use the modified index-1 method rather than true index 1.

c. Error analysis for index 1 and 2

We now present a brief analysis of local error for the two Euler methods applied to index 2 (and modified index 1, since they are the same algorithm) and (true) index 1. We will present enough detail (we believe) so that the reader may do the 'same' (perhaps with some effort) for higher-order methods, and we explicitly re-introduce the viscous terms.

• We begin with index 2 and forward Euler; FE applied to the index 2 (= FE applied to modified index 1) DAE's is

$$\frac{u_{n+1} - u_n}{\Delta t} + Ku_n + GP_n = f(u_n) \equiv f_n \tag{3.16-62}$$

and

$$Du_{n+1} = g_{n+1}, \tag{3.16-63}$$

which is index 2, or, equivalently, (3.16-62) and

$$LP_n = D(f_n - Ku_n) - (g_{n+1} - g_n)/\Delta t, \tag{3.16-64}$$

which is modified index 1. Eliminating the pressure via the projection matrix, $\wp \equiv I - GL^{-1}D$, permits an efficient analysis of the velocity error; i.e.,

$$\frac{u_{n+1} - u_n}{\Delta t} = \wp(f_n - Ku_n) + GL^{-1}\frac{(g_{n+1} - g_n)}{\Delta t} = \wp(f_n - Ku_n) + (v_{n+1} - v_n)/\Delta t, \tag{3.16-65}$$

which is to be compared to (3.16-13), due account being taken of the different definition of $f(u)$ there. In order to compute $l_n \equiv u_{n+1} - u(t_{n+1}) = d_n$ for FE [cf. (2.7-7) in Chapter 2, wherein we showed, for FE, that the LTE is also the local error], where $u(t_{n+1})$ is the solution of (3.16-13), we use Taylor series:

$$u(t_{n+1}) = u(t_n) + \Delta t\,\dot u(t_n) + \frac{\Delta t^2}{2}\ddot u(t_n) + O(\Delta t^3) \tag{3.16-66}$$

$$= u_n + \Delta t\,\dot u_n + \frac{\Delta t^2}{2}\ddot u_n + O(\Delta t^3). \tag{3.16-67}$$

Using (3.16-13) gives

$$u(t_{n+1}) = u_n + \Delta t\left[\wp(f_n - Ku_n) + \dot v_n\right] + \frac{\Delta t^2}{2}\ddot u_n + O(\Delta t^3), \tag{3.16-68}$$

which, when subtracted from $u_{n+1}$ in (3.16-65), yields

$$d_n^{FE} = v_{n+1} - v_n - \Delta t\,\dot v_n - \frac{\Delta t^2}{2}\ddot u_n + O(\Delta t^3)$$
$$= v_n + \Delta t\,\dot v_n + \frac{\Delta t^2}{2}\ddot v_n + O(\Delta t^3) - v_n - \Delta t\,\dot v_n - \frac{\Delta t^2}{2}\ddot u_n + O(\Delta t^3) = \frac{\Delta t^2}{2}(\ddot v_n - \ddot u_n) + O(\Delta t^3); \tag{3.16-69}$$

the velocity is locally second-order accurate. For the pressure, we must compare $P_{n+1}$ and $P(t_{n+1})$, since $P_n = P(t_n)$ was presumed exact. Thus,

$$LP_{n+1} = D(f_{n+1} - Ku_{n+1}) - (g_{n+2} - g_{n+1})/\Delta t \tag{3.16-70}$$

and

$$LP(t_{n+1}) = D\{f[u(t_{n+1})] - Ku(t_{n+1})\} - \dot g_{n+1}, \tag{3.16-71}$$

where

$$f_{n+1} = f(u_{n+1}) = f[u(t_{n+1}) + d_n^{FE}] = f[u(t_{n+1})] + \left.\frac{\partial f}{\partial u}\right|_{t_{n+1}} d_n^{FE} + O(\|d_n^{FE}\|^2), \tag{3.16-72}$$

where $\partial f/\partial u|_{t_{n+1}} \equiv J$ is a Jacobian matrix. Then, defining $e_n \equiv P_{n+1} - P(t_{n+1})$ yields, using (3.16-69),

$$Le_n = D(J - K)d_n^{FE} + \dot g_{n+1} - (g_{n+2} - g_{n+1})/\Delta t + O(\Delta t^4). \tag{3.16-73}$$

Now use TS on $g_{n+2}$:

$$g_{n+2} = g_{n+1} + \Delta t\,\dot g_{n+1} + \frac{\Delta t^2}{2}\ddot g_{n+1} + \frac{\Delta t^3}{6}\dddot g_{n+1} + O(\Delta t^4)$$

to give, using (3.16-69) again,

$$Le_n = \frac{\Delta t^2}{2}D(J - K)(\ddot v_n - \ddot u_n) - \frac{\Delta t}{2}\ddot g_{n+1} - \frac{\Delta t^2}{6}\dddot g_{n+1} + O(\Delta t^3), \tag{3.16-74}$$

and we have discovered, in $(\Delta t/2)\ddot g_{n+1}$, another DAE problem: when $g(t)$ is time-dependent, the numerical differentiation of the constraint equation, either implicitly via index 2 or explicitly via modified index 1, causes a loss of local accuracy in $P$, by one order, over that for velocity (and for FE on ODE's). If and only if $g$ is either independent of time or linear in $t$, (3.16-74) gives

$$Le_n = -\frac{\Delta t^2}{2}D(J - K)\ddot u_n + O(\Delta t^3); \tag{3.16-75}$$

for the general case, however, $e_n = O(\Delta t)$, while $d_n^{FE} = O(\Delta t^2)$.

• Next we apply FE to the true index-1 DAE's:

$$\frac{u_{n+1} - u_n}{\Delta t} + Ku_n + GP_n = f_n \tag{3.16-76}$$

and

$$LP_n = D(f_n - Ku_n) - \dot g_n, \tag{3.16-77}$$
which, upon elimination of $P_n$, is

$$\frac{u_{n+1} - u_n}{\Delta t} = \wp(f_n - Ku_n) + GL^{-1}\dot g_n = \wp(f_n - Ku_n) + \dot v_n \tag{3.16-78}$$

rather than (3.16-65), a simple change with non-simple consequences. (Recall/note that $Du_{n+1} = Du_n + \Delta t\,\dot g_n = g_n + \Delta t\,\dot g_n \neq g_{n+1}$.) Subtracting $u(t_{n+1})$ in (3.16-68) from $u_{n+1}$ in (3.16-78) yields

$$d_n^{FE} = -\frac{\Delta t^2}{2}\ddot u_n + O(\Delta t^3), \tag{3.16-79}$$

à la ODE local error. What about $P$? Well,

$$LP_{n+1} = D[f_{n+1} - Ku_{n+1}] - \dot g_{n+1} \tag{3.16-80}$$

and

$$LP(t_{n+1}) = D\{f[u(t_{n+1})] - Ku(t_{n+1})\} - \dot g_{n+1}, \tag{3.16-81}$$

to give

$$Le_n = L[P_{n+1} - P(t_{n+1})] = D\{f_{n+1} - f[u(t_{n+1})] - K[u_{n+1} - u(t_{n+1})]\}, \tag{3.16-82}$$

which, using (3.16-70) through (3.16-73), yields

$$Le_n = D(J - K)d_n^{FE} + O(\|d_n^{FE}\|^2) = -\frac{\Delta t^2}{2}D(J - K)\ddot u_n + O(\Delta t^3); \tag{3.16-83}$$

no loss of accuracy relative to the velocity! We see here a special instance of a general fact: The index 1 DAE's 'trade' a loss of discrete divergence ($Du_n \neq g_n$) for a (local) gain in pressure accuracy in the general case (time-dependent $g$).

• We are now ready to analyze BE, first on index 2 (= modified index 1):

$$\frac{u_{n+1} - u_n}{\Delta t} + Ku_{n+1} + GP_{n+1} = f_{n+1} \tag{3.16-84}$$

and

$$Du_{n+1} = g_{n+1}, \tag{3.16-85}$$

or (3.16-84) and

$$LP_{n+1} = D(f_{n+1} - Ku_{n+1}) - (g_{n+1} - g_n)/\Delta t, \tag{3.16-86}$$

both of which yield the following equation in velocity only, upon elimination of $P_{n+1}$ and using (3.16-14):

$$\frac{u_{n+1} - u_n}{\Delta t} = \wp(f_{n+1} - Ku_{n+1}) + \frac{v_{n+1} - v_n}{\Delta t}. \tag{3.16-87}$$

The LTE for BE [cf. (2.7-8) in the previous chapter] is obtained by inserting the exact solution into (3.16-87) and evaluating the residual. Thus,

$$d_n^{BE} = u(t_n) + \Delta t\,\wp\{f[u(t_{n+1})] - Ku(t_{n+1})\} + (v_{n+1} - v_n) - u(t_{n+1}), \tag{3.16-88}$$
and the TS is now taken backwards from $t_{n+1}$:

$$u(t_n) = u(t_{n+1}) - \Delta t\,\dot u(t_{n+1}) + \frac{\Delta t^2}{2}\ddot u(t_{n+1}) + O(\Delta t^3)$$

$$= u(t_{n+1}) - \Delta t\{\wp(f[u(t_{n+1})] - Ku(t_{n+1})) + \dot v_{n+1}\} + \frac{\Delta t^2}{2}\ddot u_{n+1} + O(\Delta t^3), \tag{3.16-89}$$

and we obtain

$$d_n^{BE} = v_{n+1} - v_n - \Delta t\,\dot v_{n+1} + \frac{\Delta t^2}{2}\ddot u_{n+1} + O(\Delta t^3) = \frac{\Delta t^2}{2}(\ddot u_{n+1} - \ddot v_{n+1}) + O(\Delta t^3) = \frac{\Delta t^2}{2}(\ddot u_n - \ddot v_n) + O(\Delta t^3); \tag{3.16-90}$$

as for ODE's, BE reverses the sign of the LTE for FE [cf. (3.16-69)]. For the pressure, the analysis follows precisely as that for FE, given by (3.16-70) through (3.16-74), except that $g_{n+2} - g_{n+1}$ is now $g_{n+1} - g_n$, with the 'same' result: first-order accurate.

• Finally, to finish, we examine BE applied to the true index-1 system:

$$\frac{u_{n+1} - u_n}{\Delta t} + Ku_{n+1} + GP_{n+1} = f_{n+1}, \tag{3.16-91}$$

$$LP_{n+1} = D(f_{n+1} - Ku_{n+1}) - \dot g_{n+1}, \tag{3.16-92}$$

which, assuming $Du_n = g_n$, implies $Du_{n+1} = Du_n + \Delta t\,\dot g_{n+1} \neq g_{n+1}$ in general. Eliminating $P_{n+1}$ gives

$$u_{n+1} = u_n + \Delta t\,\wp(f_{n+1} - Ku_{n+1}) + \Delta t\,\dot v_{n+1}, \tag{3.16-93}$$

the BE analog of (3.16-78). As above, we compute the LTE as

$$d_n^{BE} = u(t_n) + \Delta t\,\wp\{f[u(t_{n+1})] - Ku(t_{n+1})\} + \Delta t\,\dot v_{n+1} - u(t_{n+1}) \tag{3.16-94}$$

and use (3.16-89) to obtain

$$d_n^{BE} = \frac{\Delta t^2}{2}\ddot u_{n+1} + O(\Delta t^3) = \frac{\Delta t^2}{2}\ddot u_n + O(\Delta t^3), \tag{3.16-95}$$

with no error contribution from $g(t)$. For the pressure, the analysis proceeds just as for FE, see (3.16-80) through (3.16-83), with the final result that

$$Le_n = \frac{\Delta t^2}{2}D(J - K)\ddot u_n + O(\Delta t^3); \tag{3.16-96}$$

as for FE on index 1, the pressure error is also locally $O(\Delta t^2)$.
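The local pressure orders derived above for FE can be observed numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: the matrices, the constant forcing `s` (so that $J = 0$), and $g(t) = a\cos\omega t$ are made up; the mass matrix is taken as the identity; and the 'exact' solution is approximated by classical RK4 with many substeps on the underlying ODE. One FE step is taken from exact data and $e_n = P_{n+1} - P(t_{n+1})$ is measured for two step sizes.

```python
import numpy as np

# Made-up small linear test problem (assumptions for illustration only).
D = np.array([[1.0, -1.0, 0.0, 0.5],
              [0.0, 1.0, -1.0, 0.25]])
G = D.T                                  # identity lumped mass assumed
L = D @ G
K = np.diag([0.5, 1.0, 1.5, 2.0])
s = np.array([1.0, 0.5, -0.5, 2.0])      # constant forcing, so J = 0
a = np.array([1.0, -0.5])
w = 2.0

def g(t):
    return a * np.cos(w * t)             # chosen so that g''(0) != 0

def gdot(t):
    return -w * a * np.sin(w * t)

def pressure(u, gd):
    """Solve the PPE  L P = D(f - K u) - gd."""
    return np.linalg.solve(L, D @ (s - K @ u) - gd)

def rhs(u, t):
    """Underlying ODE: u' = f - K u - G P, with P from the exact PPE."""
    return s - K @ u - G @ pressure(u, gdot(t))

def reference(u0, T, nsub=200):
    """Classical RK4 with many substeps, standing in for the exact solution."""
    u, h = u0.copy(), T / nsub
    for i in range(nsub):
        t = i * h
        k1 = rhs(u, t)
        k2 = rhs(u + 0.5 * h * k1, t + 0.5 * h)
        k3 = rhs(u + 0.5 * h * k2, t + 0.5 * h)
        k4 = rhs(u + h * k3, t + h)
        u = u + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return u

w0 = np.array([0.3, -0.2, 0.1, 0.4])
u0 = w0 - G @ np.linalg.solve(L, D @ w0 - g(0.0))   # enforce D u0 = g(0)

def local_pressure_error(dt, modified):
    gd0 = (g(dt) - g(0.0)) / dt if modified else gdot(0.0)
    u1 = u0 + dt * (s - K @ u0 - G @ pressure(u0, gd0))   # one FE step
    gd1 = (g(2 * dt) - g(dt)) / dt if modified else gdot(dt)
    P1 = pressure(u1, gd1)                                 # numerical P_{n+1}
    P_exact = pressure(reference(u0, dt), gdot(dt))        # P(t_{n+1})
    return np.linalg.norm(P1 - P_exact)

def observed_order(modified, dt=1e-3):
    return np.log2(local_pressure_error(dt, modified) /
                   local_pressure_error(dt / 2, modified))

order_modified = observed_order(True)    # index 2 / modified index 1
order_true = observed_order(False)       # true index 1
print(order_modified, order_true)
```

For this toy problem the observed order is close to 1 for the modified formulation and close to 2 for true index 1, in line with (3.16-74) and (3.16-83). If $g$ is replaced by a constant or a function linear in $t$, the $\ddot g$ term vanishes and both variants should show second order.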
This concludes our error analysis. We now appeal to authority (the DAE experts, several of whom have already been cited) to state the following facts: in all four cases studied above the local velocity errors, all of which are $O(\Delta t^2)$, accumulate during the integration to give global errors one order lower, $O(\Delta t)$, as with ODE's. For pressure (the algebraic variable), it turns out, luckily we add, that the $O(\Delta t)$ local pressure errors for index 2 (= modified index 1) do not accumulate, so that pressure is also globally first-order accurate. For the true index-1 systems, the pressure errors do accumulate, so that the locally $O(\Delta t^2)$ accuracy degrades to first-order globally. Since also the discrete continuity equation is not exactly satisfied for the true index 1 system (unless $g = 0$), it would seem, on balance, to suggest that the modified index 1 approach (= index 2) is the way to go.

Finally, we summarize these results and those for some other methods, in Table 3.16-1, obtained mostly from Hairer et al. (1989) and Hairer and Wanner (1991), where Euler refers to FE and BE (the latter going by the name Radau IA in the above citations). Also, TR has the alias Lobatto IIIA ($s = 2$) and IMR is also called a Gaussian method with $s = 2$; these alternate names come from 'quadrature rules.'

Final remarks: (1) Some implicit Runge-Kutta (IRK) methods may also be useful for these DAE's, but we refer the reader to the literature for details, particularly Brenan et al. (1996) and Hairer and Wanner (1991). See also Arnold (1993) to see how error amplification in the algebraic variable can occur when IRK is applied to index 2 DAE's.
(2) A perhaps useful intuitive notion to 'explain' the less-accurate local truncation error for pressure in the index 2 formulation is that the momentum equation causes pressure to 'look like' a sort of time-derivative of the velocity.

(3) If time-dependent free-surface fluid mechanics is contemplated, the matrices become time-dependent, which introduces additional DAE difficulties, and the 'naive' DAE solution procedures discussed herein may often experience difficulties. See Ascher and Petzold (1991) and Brenan et al. (1996) for relevant theory and solution methods.

Table 3.16-1 Errors in velocity/pressure.

Index 2 (or modified index 1)
Method       LTE(1)        Global Error
Euler(2)     Δt²/Δt        Δt/Δt
TR           Δt³/Δt²       Δt²/Δt²
BDF2         Δt³/Δt²       Δt²/Δt²
IMR(3)       Δt³/Δt        Δt²/1
AB2(4)       Δt³/Δt²       Δt²/Δt²
RK2(4)       Δt³/Δt²       Δt²/Δt²

Index 1
Method       LTE           Global Error
Euler(5)     Δt²/Δt²       Δt/Δt
AB2          Δt³/Δt³       Δt²/Δt²
RK2          Δt³/Δt³       Δt²/Δt²
LF2(6)       ?             ?

(1) If g = 0, the pressure error is the same as that for velocity, both locally and globally.
(2) If BE is selected, please also select TR (and/or BDF2).
(3) The pessimistic result for pressure probably does not apply in our easier situation (linear, constant coefficient constraint matrix; DG = L); from L. Petzold (personal communication).
(4) Not recommended for index 2.
(5) If FE is selected, please also select AB2 (and/or AB3).
(6) Not available in the DAE literature (we believe). It is probably the same as AB2.
d. Some numerical results (Taylor vortex)

With a large amount of help from D. Veyret, we verify some of the DAE solution methods in this section via a pretty popular (and rare) analytical 2D solution of the NSE's. (We will also include results using two algorithms that are fully described in a later section, with some apology.) Although restricted to periodic BC's, it nevertheless provides a useful family of exact solutions, originated by G.I. Taylor (1923) and recently generalized by Walsh (1991). After presenting the original solution and Walsh's significant generalizations, we shall focus on the simplest member of the family in order to test some time integrators. (We shall also, as a useful 'aside', obtain some results on spatial accuracy for the 7 elements examined: namely $Q_1Q_0$, $Q_2P_{-1}$, $Q_2Q_1$, $Q_2Q_{-1}$, $P_2^+P_1$, $P_2^+P_{-1}$ and $P_2P_1$.)

The 'Taylor vortex' solution, as well as its generalizations by Walsh, is 'simply' obtained via stream functions that are also eigenfunctions of the Laplacian operator in a periodic domain (either $2\pi$-periodic or unit-square 'half-periodic'; we chose the latter):

$$\nabla^2\psi = -\lambda\psi \quad \text{on} \quad -\tfrac{1}{2} \le x, y \le \tfrac{1}{2}, \tag{3.16-97}$$

with period 2. The eigenvalues are all of the simple form

$$\lambda_{m,n} = \pi^2(m^2 + n^2) \tag{3.16-98}$$

and the concomitant eigenfunctions are linear combinations of sums and products of cosines and sines of $m\pi x$ and $n\pi y$ in the $x$- and $y$-directions. For example, the simplest of them (in 2D, and our choice for a test problem) has $m = n = 1$ ($\lambda = 2\pi^2$) and the single eigenfunction comprising cosines:

$$\psi = -\cos\pi x\,\cos\pi y. \tag{3.16-99}$$

Moving toward the 'other end of the spectrum', we present an eigenfunction derived by Walsh (1991), with $\lambda = 625\pi^2$:

$$\psi = \sin 25\pi x + \cos 25\pi y - \sin 24\pi x\,\cos 7\pi y + \cos 15\pi x\,\cos 20\pi y - \cos 7\pi x\,\sin 24\pi y, \tag{3.16-100}$$

whose shape is so complex that only a fraction of the full cell can be easily shown; thus Figure 3.16-1, from Walsh (1991), shows one-eighth of a full cell in each direction ($0 \le x, y \le 1/8$).
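As a quick sanity check on the eigenfunction with $\lambda = 625\pi^2$, one can verify $\nabla^2\psi = -\lambda\psi$ numerically. The sketch below is only illustrative: it assumes the wavenumber pairs $(25,0)$, $(0,25)$, $(24,7)$, $(15,20)$ and $(7,24)$, each of which satisfies $m^2 + n^2 = 625$, and it uses a five-point central-difference Laplacian at random sample points (the step `h` and the sample set are arbitrary choices).

```python
import numpy as np

LAM = 625.0 * np.pi**2   # eigenvalue, pi^2 (m^2 + n^2) with m^2 + n^2 = 625

def psi(x, y):
    """Stream function built from wavenumber pairs with m^2 + n^2 = 625."""
    p = np.pi
    return (np.sin(25*p*x) + np.cos(25*p*y)
            - np.sin(24*p*x) * np.cos(7*p*y)
            + np.cos(15*p*x) * np.cos(20*p*y)
            - np.cos(7*p*x) * np.sin(24*p*y))

def laplacian_fd(fun, x, y, h=1e-4):
    """Five-point central-difference approximation to the Laplacian."""
    return (fun(x + h, y) + fun(x - h, y) + fun(x, y + h) + fun(x, y - h)
            - 4.0 * fun(x, y)) / h**2

rng = np.random.default_rng(1)
x = rng.uniform(-0.5, 0.5, 200)
y = rng.uniform(-0.5, 0.5, 200)

residual = laplacian_fd(psi, x, y) + LAM * psi(x, y)
rel_residual = np.max(np.abs(residual)) / LAM
print(rel_residual)   # small: limited only by the finite-difference error
```

The same check applies to the simplest member, $\psi = -\cos\pi x\cos\pi y$, with `LAM` replaced by $2\pi^2$.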
Clearly Walsh's generalization of the simpler Taylor 'vortex cells' provides an incredibly rich and challenging family of test problems (!). The rest of the solution is as follows (the stream function, $\psi_0$, represents initial data, and is given, for example, by (3.16-99) or (3.16-100)):

$$u = u_0\,e^{-\lambda\nu t}, \tag{3.16-101}$$

where

$$u_0 = (\partial\psi_0/\partial y,\; -\partial\psi_0/\partial x)^T, \tag{3.16-102}$$

$$\psi = \psi_0\,e^{-\lambda\nu t}, \tag{3.16-103}$$

and, finally,

$$P = P_0\,e^{-2\lambda\nu t}, \tag{3.16-104}$$
Fig. 3.16-1 $\psi(x, y)$ for 1/64 of a cell for $\lambda = 625\pi^2$.

where $P_0$ comes from

$$\nabla^2 P_0 = -\nabla\cdot(u_0\cdot\nabla u_0). \tag{3.16-105}$$

This is the 'general' solution, given $\psi$ [from (3.16-97)], and it may already be evident that this 'general' solution family is not quite as 'general' as one would desire; i.e. it is actually rather special in several ways, thus perhaps somewhat reducing its viability as an 'ideal' test problem.

1. The shape of the solution is time-independent; it merely decays in place, as do all eigenfunctions of the 'heat equation'.
2. The pressure is not needed (in the continuum only) to keep $\nabla\cdot u = 0$; it is only needed to (exactly) balance advection (a sort of 'Bernoulli' pressure), and is thus not needed at all for Stokes flows.
3. Related to point 2, the acceleration and viscous terms are also in perfect balance for all time; i.e., we really do have simple viscous decay (albeit div-free) solutions of the (vector) heat equation.
4. The vorticity is proportional, via the eigenvalue, to the stream function:

$$\omega \equiv \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = -\nabla^2\psi = \lambda\psi. \tag{3.16-106}$$

We now return to the simplest version of this family of exact solutions, (3.16-99) and

$$u = \pi\cos\pi x\,\sin\pi y\;e^{-2\pi^2\nu t}, \tag{3.16-107}$$

$$v = -\pi\sin\pi x\,\cos\pi y\;e^{-2\pi^2\nu t}, \tag{3.16-108}$$

$$p = \frac{\pi^2}{2}\left(\sin^2\pi x + \sin^2\pi y\right)e^{-4\pi^2\nu t}, \tag{3.16-109}$$

with $\omega = -2\pi^2\cos\pi x\,\cos\pi y\;e^{-2\pi^2\nu t}$,
where we note that, consistent with (3.16-104) and (3.16-105), the pressure decays at twice the rate that velocity does. This is also a special case: the normal components of both acceleration and viscous terms, from the general PPE BC of (3.10-8), are exactly zero. In more general cases, we might expect to see the temporal decay rate of $P$ to be twice that of $u$ (from advection in the PPE, which is 'quadratic' in velocity) away from $\Gamma$, but (sometimes at least) approaching the same rate of decay as $u$ near or on $\Gamma$ because of the linear (viscous) coupling there.

Before embarking on our own numerical experiments, for which we use (only) (3.16-106) through (3.16-109) on the unit square, we digress briefly to summarize what a few others have done with this (and similar) 'Taylor vortex' problem(s). While (probably) numerous investigators have taken a 'quick' look at this problem in order to verify, for example, the alleged accuracy of their chosen time integration method [for example, Kim and Moin (1985), Tau (1994), and we make no attempts to be complete here], D. Valentine and colleagues have pursued the problem in rather more depth, beginning (we believe) with Mohamed et al. (1991), submitted in July 1988, in which they tested a new code against a 16-cell (4 x 4) solution of square cells, using collocation on high-order finite elements (Hermite bicubics) and TR ('Crank-Nicolson') for time integration (plus Richardson extrapolation). They even used a 'smart' integrator and varied the time step during the simulation, albeit not in the most cost-effective way. [They were probably not aware of the much more efficient predictor-corrector method for TR (i.e., AB/TR), and employed a 'step-doubling' method that is more appropriate for other ODE methods, such as Runge-Kutta.]
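The simplest Taylor vortex, (3.16-107) through (3.16-109), is easy to check by direct substitution. The following sketch does so numerically with central differences; it is an illustration only, assuming unit density (so $p$ is the kinematic pressure), $\nu = 0.01$ as used for the runs reported below, and arbitrary random sample points. It also evaluates the pressure range at $t = 0$ and $t = 5$ and the maximum Reynolds number quoted later in this section.

```python
import numpy as np

NU = 0.01   # kinematic viscosity (value used for the results in this section)

def taylor_vortex(x, y, t, nu=NU):
    """Return (u, v, p, omega) for the simplest Taylor vortex; unit density assumed."""
    c = np.pi
    d = np.exp(-2.0 * c**2 * nu * t)           # velocity decay factor
    u = c * np.cos(c*x) * np.sin(c*y) * d
    v = -c * np.sin(c*x) * np.cos(c*y) * d
    p = 0.5 * c**2 * (np.sin(c*x)**2 + np.sin(c*y)**2) * d**2   # decays twice as fast
    omega = -2.0 * c**2 * np.cos(c*x) * np.cos(c*y) * d
    return u, v, p, omega

def residuals(x, y, t, nu=NU, h=1e-4):
    """Central-difference residuals of continuity, momentum and vorticity relations."""
    F = lambda k, xx, yy, tt: taylor_vortex(xx, yy, tt, nu)[k]
    dx = lambda k: (F(k, x + h, y, t) - F(k, x - h, y, t)) / (2*h)
    dy = lambda k: (F(k, x, y + h, t) - F(k, x, y - h, t)) / (2*h)
    dt = lambda k: (F(k, x, y, t + h) - F(k, x, y, t - h)) / (2*h)
    lap = lambda k: (F(k, x + h, y, t) + F(k, x - h, y, t) + F(k, x, y + h, t)
                     + F(k, x, y - h, t) - 4.0 * F(k, x, y, t)) / h**2
    u, v, p, om = taylor_vortex(x, y, t, nu)
    cont = dx(0) + dy(1)
    mom_x = dt(0) + u*dx(0) + v*dy(0) + dx(2) - nu*lap(0)
    mom_y = dt(1) + u*dx(1) + v*dy(1) + dy(2) - nu*lap(1)
    vort = dx(1) - dy(0) - om
    return cont, mom_x, mom_y, vort

rng = np.random.default_rng(2)
xs = rng.uniform(-0.5, 0.5, 100)
ys = rng.uniform(-0.5, 0.5, 100)
max_res = max(np.max(np.abs(r)) for r in residuals(xs, ys, 1.0))

p_range_t0 = taylor_vortex(0.5, 0.5, 0.0)[2] - taylor_vortex(0.0, 0.0, 0.0)[2]
p_range_t5 = taylor_vortex(0.5, 0.5, 5.0)[2] - taylor_vortex(0.0, 0.0, 5.0)[2]
re_max = np.pi * 1.0 / NU    # u_max * H / nu at t = 0

print(max_res, p_range_t0, p_range_t5, re_max)
```

The residuals are at the level of the finite-difference error, and the pressure range evaluates to $\pi^2$ at $t = 0$ and about 1.371 at $t = 5$.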
In Valentine and Mohamed (1989), they proposed the Taylor vortex problem (this time as a 2 x 2 array) as a numerical test problem and performed some limited (numerical) hydrodynamic stability analyses by introducing certain perturbations to the IC's. Finally (or, perhaps, more recently), in Valentine (1995) a more serious set of numerical stability analyses was performed on a 4 x 4 array of Taylor vortices; Valentine also introduced a more efficient finite difference method, which produced vortex merging/capturing under certain conditions, depending on the form of the initial perturbation. For related (classical) linear stability analyses, see Lin and Tobak (1986), Thess (1992), and other references in Valentine (1995).

Returning now to our 'simple' version of the test problem, (3.16-106) through (3.16-109), we begin by mentioning that we, like Valentine et al., did not invoke periodic BC's in our simulation (but see Veyret et al., 1999); rather, in order to have time-varying $g(t)$ in $Du = g$, we applied the analytic solution as time-dependent Dirichlet data on the boundary of our domain, which is in fact a more difficult BC. This approach (simplification?), as well as that of comparing numerical solutions with an analytical solution in general, is not without its own 'problems'; so we begin by addressing them.

1. Since the GFEM interpolant of the IC given by (3.16-102) is not discretely divergence-free, the interpolated initial data must be 'adjusted' in order to have a well-posed index 2 DAE system.
2. Whereas in the general case the GFEM interpolant of the normal component of the specified velocity will not satisfy the hydrostatic pressure mode's solvability constraint (3.13-31) even when the specified normal velocity does ($\oint_\Gamma n\cdot u = 0$), in this case we are 'lucky' because $n\cdot u = 0$ on $\Gamma$. Thus, simply maintaining $n\cdot u = 0$ on $\Gamma$ gives a well-posed problem.
3. But the 'tangential consequence' of holding the analytic Dirichlet BC's is also significant, though not fatal (the problem is still well-posed). It turns out that the largest error between the adjusted (via an $L_2$-projection, typically) and interpolated velocity occurs 'one-node-in' from $\Gamma$, a consequence of our not permitting adjustment of the tangential BC, a constraint (along with the normal constraint) that we chose to impose but didn't need to (at least for an $L_2$-projection).

In every case except that using forward Euler (FE), we first adjusted the interpolated data via an $L_2$-projection to the div-free subspace, per Appendix 3, which, it is important to realize, injects an error at $t = 0$ when comparing the results with the analytic solution, one of the 'problems' referred to earlier. But since the code employed (FIDAP; see Fluid Dynamics International (1993)), probably like many others, does not have the $L_2$-projection 'machinery' built-in (à la Section 3.10.2), we explain next how such a projection can be well-approximated via BE, as mentioned in the Remarks following (3.16-21), which is what we did. Following on from Remark (2) there, we have

$$LP_{n+1} = Df_{n+1} - (g_{n+1} - Du_n)/\Delta t,$$

which we re-write, for $n = 0$, as

$$L(P_1^p + \varphi/\Delta t) = Df_1 - (g_1 - g_0)/\Delta t + (Du_0 - g_0)/\Delta t,$$

in which we have defined $P_1 = P_1^p + \varphi/\Delta t$, where $P_1^p$ corresponds to the 'physical' (true) pressure and $\varphi$ is the Lagrange multiplier associated with the $L_2$-projection; i.e. we 'interpret' the first time-step result via the two equations

$$LP_1^p = Df_1 - (g_1 - g_0)/\Delta t, \tag{3.16-110}$$

the PPE at $t_1 = \Delta t$, and

$$L\varphi = Du_0 - g_0, \tag{3.16-111}$$

whose separation is not generally available from the $P_1$ actually computed (but could be). For small-enough $\Delta t$, what the code calls $P_1$ we must interpret as $\varphi/\Delta t$.
The resulting velocity is (see too Appendix 3)

$$u_1 = \wp(u_0 + \Delta t f_1) + GL^{-1}g_1, \tag{3.16-112}$$

where $\wp = I - GL^{-1}D$, which, of course, satisfies $Du_1 = g_1$ for any value of $\Delta t$, but it is only a good approximation to the $L_2$-projection of $u_0$ for $\Delta t \to 0$ (specifically, we need $\Delta t\|f_1\| \ll \|u_0\|$), for which (3.16-112) is equivalent to (A3.3-43) in Appendix 3. Note too that a second small BE step is required in order to obtain a true/physical pressure, and this is what we do; all (implicit) runs begin after two small BE steps of size $\Delta t_0 = 10^{-5}$.

All results herein are performed for $\nu = 0.01$, giving a Reynolds number of $O(100)$; $Re_{max} = u_{max}H/\nu = \pi \times 1.0/0.01 = 314$ at $t = 0$. For results at other values of $\nu$, as well as variations in other problem parameters, see Veyret et al. (1998).

Figure 3.16-2a shows a typical mesh used for all simulations to follow. It has $2 \times 24^2 = 1152$ quadratic triangular elements, $24^2 = 576$ biquadratic quadrilateral elements (obtained by removing the 'diagonals'), and the equivalent (and nearly equal, but not shown) bilinear element mesh has $48^2 = 2304$ elements. The range of $\Delta x$ and $\Delta y$ is from 0.0083 (corners) to 0.0415 (center). All but $P_2^+P_{-1}$ and $P_2^+P_1$ meshes have $49^2 = 2401$ total velocity nodes; the two remaining quadratic 'triangles' have 1152 ($2 \times 24^2$)
Fig. 3.16-2 A single Taylor vortex. (a) The mesh for triangular elements. (b) Interpolated analytic velocity at t = 0. (c) Analytic vorticity at t = 0 (contour separation = 1.406). (d) Analytic pressure at t = 0 (contour separation = 0.705). (e) Initial pressure for $Q_1Q_0$ (contour separation = 0.705). (f) Pressure from $Q_1Q_0$ at t = 5 (contour separation = 0.097).
more, because of the bubble functions. The number of pressures is as follows: $Q_1Q_0$ and $Q_2Q_{-1}$ have $4 \times 576 = 2304$, $Q_2P_{-1}$ has $3 \times 576 = 1728$, $Q_2Q_1$ has $25^2 = 625$, $P_2^+P_{-1}$ has $3 \times 2 \times 24^2 = 3456$, $P_2^+P_1$ and $P_2P_1$ have $25^2 = 625$. (Note the relative 'paucity' of pressures for the two Taylor-Hood elements and the relative 'abundance' for $P_2^+P_{-1}$.) The total number of degrees of freedom (ignoring BC's) thus covers the range of $2 \times 2401 + 625 = 5427$ for $Q_2Q_1$ and $P_2P_1$ (Taylor-Hood elements) to $2 \times (2401 + 2 \times 24^2) + 3456 = 10562$ for $P_2^+P_{-1}$. (Our favorite 2D element, $Q_2P_{-1}$, lies toward the low end, with $2 \times 2401 + 1728 = 6530$.) These differences also show up in CPU costs and should be borne in mind when comparing results; indeed, $P_2^+P_{-1}$ was about three times as costly as $Q_1Q_0$ and about twice as costly as $Q_2P_{-1}$, $P_2P_1$, and $Q_2Q_1$. Also, when $Q_1Q_0$ (with $2 \times 2401 + 48^2 = 7106$ unknowns) is run in 'penalty mode', there are only $2 \times 2401 = 4802$ degrees of freedom, and this was indeed the least costly in CPU time (when measured against other implicit ODE methods).

Figures 3.16-2(b), (c) and (d), respectively, show the IC from the exact solution [(3.16-99) and (3.16-109)] for velocity, vorticity [stream function, too; see (3.16-106)], and pressure. Figure 3.16-2(e) shows the 'projected' initial pressure (2 BE steps at $\Delta t = 0.00001$, the second step required to get a true pressure) for $Q_1Q_0$ (representative), showing very small differences from the exact pressure; for example, whereas the range of $P_0$ is $\pi^2 = 9.8696$ from the exact solution, it is ~9.8743 in Figure 3.16-2(e).

The simulations were run to $t = 5$, for which the $Q_1Q_0$ pressure result is shown in Figure 3.16-2(f). Clearly the shape has been properly retained, and the range of pressure is ~1.359 vis-à-vis the exact: $\pi^2 e^{-4\pi^2 \times 0.01 \times 5} \approx 1.371$. Again this is 'representative' of all other results, and thus we show no more [until our planned later publication on the subject, Veyret et al. (1999)].
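The node and unknown counts quoted above follow from simple arithmetic on a mesh with 24 quadratic elements per side. The short sketch below reproduces them; the dictionary keys are just labels for the elements, and the per-element pressure counts (4 for $Q_{-1}$, 3 for $P_{-1}$, one bubble velocity node per triangle) are read off from the totals stated in the text.

```python
n = 24                          # quadratic elements per side (24^2 = 576 quadrilaterals)
vel_nodes = (2 * n + 1) ** 2    # 49^2 = 2401 velocity nodes
n_quads = n ** 2                # 576 biquadratic quadrilaterals
n_tris = 2 * n ** 2             # 1152 quadratic triangles
n_bilinear = (2 * n) ** 2       # 48^2 = 2304 bilinear elements
vertex_nodes = (n + 1) ** 2     # 25^2 = 625 corner nodes (continuous linear pressure)

pressures = {
    "Q1Q0":   n_bilinear,       # one pressure per bilinear element
    "Q2Q-1":  4 * n_quads,      # four discontinuous pressures per element
    "Q2P-1":  3 * n_quads,      # three discontinuous pressures per element
    "Q2Q1":   vertex_nodes,
    "P2+P-1": 3 * n_tris,
    "P2+P1":  vertex_nodes,
    "P2P1":   vertex_nodes,
}
bubbles = {"P2+P-1": n_tris, "P2+P1": n_tris}   # one extra velocity node per triangle

# Total unknowns: two velocity components per velocity node, plus pressures.
dof = {name: 2 * (vel_nodes + bubbles.get(name, 0)) + npres
       for name, npres in pressures.items()}

for name in sorted(dof, key=dof.get):
    print(f"{name:7s} pressures = {pressures[name]:5d}  total dof = {dof[name]:6d}")
```

Sorting by total unknowns puts the Taylor-Hood elements at the low end and $P_2^+P_{-1}$ at the high end, matching the range given in the text.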
It seems that, in general, all elements performed quite well on this (easy?) problem, at least in the augen-norm. We will see some differences later.

We now turn to a detailed comparison of several time integrators and a less detailed, but still useful, comparison of several 'elements' (spatial accuracy). We measured the error in two norms, the first of which we call the PDE error; for the velocity, it is given by

$$e_v(t) \equiv \sqrt{\frac{1}{N}\sum_j \left\{[u_j(t) - u(x_j, t)]^2 + [v_j(t) - v(x_j, t)]^2\right\}}, \tag{3.16-113}$$

where $u(x_j, t)$ is the $x$-component of the exact solution at node $j$, etc., and $N$ is the number of (internal) nodes in the domain. For pressure, we use the same type of RMS norm after forming an (element) area-weighted average of the 'raw' pressures at the same velocity nodes and after 'normalizing' the results via a reference pressure, defined to be the pressure at the node closest to $x = y = \tfrac{1}{4}$, which is one-half of the distance between the pressure extrema of the analytic solution; i.e. we have

$$e_p(t) \equiv \sqrt{\frac{1}{N}\sum_j \left[P_j(t) - p(x_j, t) - P_J(t) + p(x_J, t)\right]^2}, \tag{3.16-114}$$

where $J$ is the node located closest to $(\tfrac{1}{4}, \tfrac{1}{4})$.

Remark: These PDE norms are perhaps better called 'lazy'-person's (or discrete) PDE norms, as the true $L_2$-norm would involve integrals of the squares of the differences rather than sums.
For the second norm, called the ODE norm, we discard the exact solution and define 'truth' as the solution at the smallest $\Delta t$ employed, usually 0.001. Thus,

$$d_v(\Delta t) \equiv \sqrt{\frac{1}{N}\sum_j \left\{[u_j(\Delta t) - u_j(\Delta t_s)]^2 + [v_j(\Delta t) - v_j(\Delta t_s)]^2\right\}}, \tag{3.16-115}$$

where $\Delta t_s$ is the smallest time step, with a similar equation for the ODE pressure error norm. (Actually, because of symmetry, we only used $u_j$ in the ODE velocity norm definition, thus giving a $d_v$ that is low by a factor of $\sqrt{2}$.) This approach should allow us to verify the ODE/DAE theory if two criteria are satisfied: (i) $\Delta t_s \ll \Delta t$ and (ii) $\Delta t$ is 'sufficiently small'. As will be seen, this approach generally works quite well, at least for velocity.

All linear systems were solved with one or another form of Gaussian elimination and the fixed-point iterative method of Picard [functional iteration/successive substitution; see Vol. II or Reddy and Gartling (1994)] was used for the nonlinear algebraic systems, using a reasonably tight convergence criterion ($\varepsilon = 0.001$, where we enforce both $\|\Delta x\|/\|x_i\| \le \varepsilon$ and $\|f(x_i)\|/\|f_0\| \le \varepsilon$ when solving $f(x) = 0$. It turns out that most of the runs needed but one iteration; only those with large $\Delta t$ required more, two or three, suggesting perhaps again that the 'fluid dynamics' is not particularly difficult.)

We begin by comparing three quadrilateral elements for both BE and TR time integrators, for velocity, in the PDE norm in Figure 3.16-3 for three different times: $t = 1, 3, 5$, where, from (3.16-107) we note that the decay time constant, $\tau = 1/2\pi^2\nu$, is very close to 5, so that the simulation stops after one time constant [one-half of the time constant for the pressure, à la (3.16-109)]. For 'comparison' purposes, the RMS (nodal) norm of the exact velocity at these three times is 1.72, 1.16, and 0.78, respectively; it is 2.09 at $t = 0$.
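The discrete norms above, and the slope test used in the figures that follow, can be sketched as below. This is a simplified illustration, not the procedure actually used for the reported results: the function names are hypothetical, the pressure norm omits the element area-weighted averaging described in the text, and the error values in the demonstration are synthetic (a second-order temporal error plus a constant 'spatial' floor), not data from the simulations.

```python
import numpy as np

def rms_velocity_error(u_num, v_num, u_ref, v_ref):
    """Discrete RMS velocity error over the nodes, in the manner of (3.16-113)/(3.16-115)."""
    du = np.asarray(u_num) - np.asarray(u_ref)
    dv = np.asarray(v_num) - np.asarray(v_ref)
    return np.sqrt(np.mean(du**2 + dv**2))

def rms_pressure_error(p_num, p_ref, J):
    """RMS pressure error after removing the error at reference node J, cf. (3.16-114)."""
    d = np.asarray(p_num) - np.asarray(p_ref)
    return np.sqrt(np.mean((d - d[J])**2))

def observed_orders(dts, errs):
    """Slopes between successive points of a log-log error-versus-step-size curve."""
    dts = np.asarray(dts, dtype=float)
    errs = np.asarray(errs, dtype=float)
    return np.log(errs[1:] / errs[:-1]) / np.log(dts[1:] / dts[:-1])

# Synthetic illustration: second-order temporal error plus a spatial error floor.
dts = np.array([0.3, 0.1, 0.03, 0.01])
errs = 1.0e-2 * dts**2 + 1.0e-5
orders = observed_orders(dts, errs)
print(orders)   # near 2 for the largest steps, dropping as the curve flattens

# The pressure norm is insensitive to an additive constant in the pressure.
p = np.linspace(0.0, 1.0, 11)
shift_error = rms_pressure_error(p + 3.0, p, J=5)

# Using only u when the u and v errors are equal gives a value low by sqrt(2).
e = np.array([1e-3, -2e-3, 3e-3])
ratio = rms_velocity_error(e, e, 0 * e, 0 * e) / np.sqrt(np.mean(e**2))
```

The flattening of the slope at small step sizes in this synthetic example mimics the behavior expected whenever spatial error dominates temporal error.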
The penalty method (ODE's, penalty parameter = $10^9$) is shown for $Q_1Q_0$, whereas fully-coupled mixed interpolation (DAE's) is used for the rest of the presentation. We do this just to make the point that penalty 'works', by pointing out that the error graphs for the fully-coupled $Q_1Q_0$ result (not shown) are virtually the same in the 'augen-norm' (i.e. by visual comparison), except for the smaller values of $\Delta t$, where the combination of small $\Delta t$ and large penalty parameter seems to sometimes cause differences (in the errors) of ~10%. In these figures, and (most of) those to follow, we placed a line with slope = 1 on all BE results and one with slope = 2 for the TR results to test agreement with theory. Also noteworthy from Figure 3.16-3 are the following.

1. When $\Delta t$ is sufficiently small (a mesh-dependent determination/definition, naturally), the error curves flatten out, indicating that the error is spatial only, a situation that is realized much sooner (larger $\Delta t$) for TR than for BE; typically, for the mesh chosen, this 'convergence' $\Delta t$ is ~0.10 or so for TR (~50 steps per time constant) and more like 0.01 (500 steps per time constant) for BE. This 10-fold increase in cost is truly that because the CPU costs per time-step for TR and BE are within a percent or so of being equal. (Later, when we present variable-step/smart integration results, we will see similar accuracy for significantly fewer time steps to get to $t = 5$; 15-20 for TR and about twice that for BE.)
2. By the same token, only when $\Delta t$ is sufficiently large can we hope to see temporal error only. For $Q_1Q_0$ for example, this appears to be $\Delta t >$ ~0.03 for BE and $\Delta t >$ ~0.3
Fig. 3.16-3 PDE velocity error for 3 quadrilateral elements: log-log plots of $e_v$ versus $\Delta t$. (a) $Q_1Q_0$ via BE (penalty). (b) Ditto (a) except TR. (c) $Q_2P_{-1}$ via BE. (d) Ditto (c) except TR. (e) $Q_2Q_1$ via BE. (f) Ditto (e) except TR. Here, and in those to follow, □ denotes t = 1, • denotes t = 3, and △ denotes t = 5.
for TR. Only when temporal error 'totally' dominates spatial error can a PDE error norm be compared to DAE theory.
3. The 'flat' portion of the curves (spatial error only) shows that, for this problem at least, $Q_2P_{-1}$ [and $Q_2Q_{-1}$, which gave nearly identical results (3-4 significant figures), not shown] is significantly more accurate than $Q_1Q_0$, which itself is significantly more accurate than $Q_2Q_1$. [This latter result may surprise a few readers, but we have become convinced that this is just another example of a true thing, or two, actually: $Q_1Q_0$ is often 'very' accurate and $Q_2Q_1$ is often very inaccurate. Even though the 9-node element is generally much more accurate than the 4-node for 'simple' (scalar) PDE's, the mixed-interpolation requirement of the NSE's drags down the 9-node element's ('deliverable') accuracy (for $Q_2Q_1$) because of the poor job it does on the pressure (and $\nabla\cdot u$).]

Note that a $\Delta t$ as large as $\approx 0.3$ with TR (17 steps to $t = 5$) has virtually removed all temporal error for $Q_2Q_1$ (Figure 3.16-3(f)), and that, in marked contrast, BE needs a $\Delta t$ of 0.001 (5000 steps to $t = 5$!) to get to the 'same' point when using the good 9-node element, $Q_2P_{-1}$, which 'point' has more than an order of magnitude smaller error than $Q_2Q_1$. The 'best' accuracy attainable on the chosen mesh is that using $Q_2P_{-1}$, and its most cost-effective attainment (for fixed $\Delta t$ integration and mixed interpolation) is via TR with $\Delta t = 0.1$ ($\lambda\nu\Delta t = 0.02$), until we apply the 'smart' version of TR; see below.

Moving now to Figure 3.16-4, we first discuss the pressure errors for $Q_2P_{-1}$, for BE (Figure 3.16-4(a)) and TR (Figure 3.16-4(b)). There are two things worthy of discussion, the first of which is obvious. (i) The pressure error is much larger than the velocity error, such that even BE has 'converged' for $\Delta t$ just less than 0.10 (and TR by $\Delta t \approx 0.3$, as for velocity). (ii) These pressure errors are rather 'representative' of those for all elements tested; i.e.
the large spatial errors are quite close to each other. (And for this reason we will not show the others.) The reason for this behavior, we believe, is that the 'DAE pressure' dances to a different drummer than does the PDE pressure, beginning with the fact that the largest error occurs at t = 0+; i.e. the pressure error associated with the projected velocity is both large (relative to that for velocity) and nearly element-independent. We are not so surprised by the former result, since the pressure is well-known to be a rather 'sensitive' variable, responding with relatively large changes to relatively small changes in velocity. But the latter (element 'independence') is still rather puzzling—an issue we plan to return to in Veyret et al. (1999). The rest of Figure 3.16-4 shows the velocity error for two of the better triangular elements, and we see that P^P-\ is very similar in accuracy to <2i<2o (but at several times the cost) and that P^P\ is close to <22<2i (also in cost) the latter result being less surprising than the former. [We have also run the PiP\ (Taylor-Hood) element, with the following results (not shown): the PDE velocity error looked a lot like that from either QiQ\ or P^Pu the remaining three types of error curves looked very much like those from any other element.] In Figure 3.16-5 we show the PDE error for 'Projection 2' with semi-consistent mass (Q\Qo) (see Section 3.16.6c) and the first of the ODE errors. Figures 3.16-5(a) and (b) present us—and perhaps the rest of the world—with our biggest surprises in this study:
[Fig. 3.16-5 PDE velocity error for Projection 2, and several ODE errors; each panel plots an error norm ($e_v$, $d_v$, or $d_p$) against $\Delta t$. Panels: (a) $Q_1Q_0$ via BE and Projection 2. (b) Ditto (a) except TR. (c) ODE velocity error; $Q_2P_{-1}$ and BE. (d) Ditto (c) except TR. (e) ODE pressure error; $Q_2P_{-1}$ and BE. (f) Ditto (e) except TR.]
1. Both BE and TR behave very nearly the same, clearly violating any remaining notions one might have of applying ODE or DAE theory to projection methods of this type.
2. The errors are large, by factors of 5-10, relative to those from the more 'honest' methods, i.e. time integration of the DAE's (cf. Figures 3.16-3(a) and (b)). We believe that we can explain the relatively large spatial error, and we do not believe it is a code 'bug', because of the following: whereas the results in Figures 3.16-5(a) and (b) were obtained with a code (PASTIS) written by H. Daniels while spending a year at LLNL, these results were closely duplicated (at least in the range $0.01 \le \Delta t < 0.10$) by running the FIDAP code in the manner discussed in Section 3.16.7; i.e., using the segregated solver with just 'one pass' (no iterations) is an 'algorithm' that is (at least for sufficiently small $\Delta t$) very close to that of Projection 2. One of the obvious differences is that the Projection 2 code uses 1-point quadrature on the element matrices, which is less accurate (but perhaps more cost-effective). Another difference is in the treatment of advection: whereas FIDAP does it 'honestly' (triple-product integrals, etc.), PASTIS uses the 'centroid advection' approximation (with only double-product integrals) discussed in Section 3.16.5. This combination of differences 'explains' the larger error. [Note too that when $\Delta t$ was sufficiently small with FIDAP (in the 1-pass mode), the 'spatial only' error, attained for $\Delta t \lesssim 0.001$, agreed 'completely' with that using BE or TR in Figure 3.16-3.] A related conclusion from this comparison is thus this: the extra effort of employing 'honest GFEM' does at least deliver greater accuracy.
3. The two straight lines shown in both of the figures have slopes of 1.5 and 2.0 because it appears possible that $e_v \sim \Delta t^{1.5}$ for early time (t = 1) and $e_v \sim \Delta t^{2}$ for 'late' time (t = 5), for both TR and BE. We are just not sure.
4. The $\Delta t$ range is one order of magnitude smaller than all the others, partly because larger $\Delta t$ caused much larger errors (and convergence difficulties), presumably caused by the spurious boundary layer (e.g. $\sqrt{\nu \Delta t} = 0.1$ for $\Delta t = 1$; see Section 3.16.6c) of this projection method. (Clearly the results presented for $\Delta t < 0.001$ are superfluous, until we examine the ODE errors.)
The rest of Figure 3.16-5 shows the ODE error for $Q_2P_{-1}$, for both BE and TR and for both velocity and pressure. We point out that, just as pressure PDE errors are nearly element-independent, so too are the ODE errors, but this time for both P and u (and even more so for u, for which they are virtually identical for all elements examined). Again, except to conjecture that the 'spatial' differences are 'removed' (hidden?) by our 'normalization' ('truth' $\Delta t = 0.001$, one for each element), we are at a loss to explain such close behavior, since each element employed generates a different set of DAE's and because the differences when compared to the PDE solution are quite noticeable. Especially confusing is the ODE pressure error for TR at small $\Delta t$: besides being 'inverted' from the rest of the results (largest errors at large time rather than smallest errors at large time), they display a nearly $\Delta t$-independent behavior over much of the range. We did verify that the $d_p$ results do (finally) bend over and go to zero at the 'truth' $\Delta t$ of 0.001, although the error is still between $10^{-6}$ and $10^{-5}$ for $\Delta t$ as small as 0.0012; the turn-down is very steep. Also interesting (and also not understood) is why the BE velocity errors do not spread out/separate with time (smaller error at larger t) as the solution decays; TR is much more reasonable in this regard. Finally, we mention that, as a check, we defined the 'truth' solution for BE as the (more accurate) TR solution at $\Delta t = 0.001$; the result (not shown) would be three more points on the curve in Figure 3.16-5(c) at $\Delta t = 0.001$, with values between 1 and $2 \times 10^{-5}$, which is reassuring: it reinforces the 'assumption' that virtually all of the error is spatial for both TR and BE for $\Delta t = 0.001$.
Finally, Figure 3.16-6 shows the ODE errors from 'Projection 2' and the two PDE velocity errors for FE (see Section 3.16.5), all with $Q_1Q_0$. Whereas the ODE velocity errors for Projection 2 using TR seem to be in accord with what we believe to be known theoretically, those from BE are, like the PDE errors, rather surprising. For large $\Delta t$, they look very much like those from TR (slope 2 and similar magnitude), and display a painfully slow transition toward a slope of unity (the smallest 'local' slope in Figure 3.16-5(a) is ~1.2, and the two straight lines have slopes 1 and 2), where we point out that the 'truth' $\Delta t$ for the BE result is very, very small (1/30000), which was needed to convince us that the asymptote, at least, does appear to agree with BE theory. ['Projection 1' (see Section 3.16.6b) behaves more like expectations, and its slope is 1 for both BE and TR; see Veyret et al. (1998).] Turning to the ODE pressure errors, in Figures 3.16-6(c) and 3.16-6(d), we seem to see the following behavior: BE has a slope 'near' 2 for 'large' $\Delta t$ and 'near' 1 for small $\Delta t$ (the two lines shown have slopes 1 and 1.5); TR may have a slope near 2 for large $\Delta t$, it seems to show $O(\Delta t^{1.5})$ for some ill-defined 'intermediate' range of $\Delta t$, and may be heading toward a slope of 1 for small $\Delta t$. The 3 curves in Figure 3.16-6(d) have slopes of 1, 1.5, and 2. Note that Shen (1996) also showed some experimental results that seemed to have a slope of ~1.5. Note also that Van Kan (1986), after demonstrating second-order accuracy for u, had this to say about P: 'For the pressure the results appeared to be less trustworthy.'
All in all, it seems that Projection 2 does not fit comfortably into any 'niche' regarding temporal accuracy.
The last 2 curves in this figure show some FE results (with a line of slope 1 drawn on Figure 3.16-6(f)), which, unlike the BE and TR results presented earlier, are obtained from the index 1 DAE's (PPE 'method'). (We did not invoke BTD, see Section 3.16.5, in order to keep the DAE system 'honest'.) The semi-theoretical stability limits for FE, based on the FDM version of the ODE's and given in (2.7-28) and (2.7-29) for constant velocity on a uniform mesh, are approximately as follows: (i) $\Delta t \le \Delta x_{\min}^2/4\nu = (0.0083)^2/0.04 \approx 0.0017$ and (ii) $\Delta t \le 0.02/(\pi^2 + 0) = 0.0020$. The empirical/experimental result is $\Delta t_{\max} = 0.0045$-$0.0050$. But the important result is that all stable time steps are necessarily so small that spatial error always dominates. (BTD would help, but not much: $\Delta t_{\max}$ from (2.1-41) is ~0.01.) Also worth noting, by comparing Figure 3.16-6(e) to Figure 3.16-3(a) or (b), is that mass lumping ('required' for explicit time integration) has increased the spatial error several-fold. Finally, note that the ODE error agrees well with its implicit counterpart: Figure 3.16-6(f) vis-a-vis Figure 3.16-5(c).
Figure 3.16-7 shows the divergence error (contours of $C^T u$) that goes with the interpolated (but not projected, which is not necessary) IC used in the FE runs. It is actually quite small and, of course, retains the values shown in the figure during the entire integration. [The solution (vectors, etc.) from the FE runs were visually as good as those that are div-free.] We conclude the FE discussion with the following remark: it is interesting to realize that the index 1 'property' of preserving the divergence error (see Sections 3.10.1, 3.13.4, 3.16.1b, 3.16.2c, and 3.16.2g) will actually preclude the FE time integration from attaining the final steady-state of no-flow.
For the case above and a long time integration, the FE PDE error, which starts at ~$1.3 \times 10^{-4}$, increases for a while, peaks at t = 3 at ~$6 \times 10^{-4}$, then decreases monotonically until t = 30 or so (at which time the true velocity is $\approx \pi e^{-2\pi^2 \nu t} = 0.008$), after which it levels off at ~$6 \times 10^{-6}$, because $C^T u(t) - g(t) = C^T u_0 - g_0 \ne 0$, per (3.16-7).
[Fig. 3.16-6 Several other errors using $Q_1Q_0$; each panel plots an error norm ($d_v$, $d_p$, or $e_v$) against $\Delta t$. Panels: (a) ODE velocity error; $Q_1Q_0$, Projection 2, BE. (b) Ditto (a) except TR. (c) ODE pressure error; $Q_1Q_0$, Projection 2, BE. (d) Ditto (c) except TR. (e) PDE velocity error; $Q_1Q_0$ via FE. (f) Ditto (e) except ODE error.]
[Fig. 3.16-7 Contours of the divergence error $C^T u$; marked value $= -1.44 \times 10^{-6}$ (contour interval $= 0.288 \times 10^{-6}$).]
To close our discussion of 'spatial only' errors, we show in Table 3.16-2 the projection (PDE) errors for velocity and pressure (2 BE steps at $\Delta t = 10^{-5}$, called 'Initial Velocity Error') and the range of errors from the ($\Delta t$-converged) results at t = 1, 3, and 5. From these results, we seem to see the following.
1. The pressure error is virtually element-independent. We have no good explanation for this, nor for the anomalous behaviour of $P_2^+P_{-1}$, which, at $t = 0^+$, simultaneously displays the largest velocity error and the smallest pressure error.
2. The pressure errors are much larger than those for velocity, and are largest initially.
3. Elements $Q_1Q_0$ and $P_2^+P_{-1}$, which generally have performed rather similarly, display nearly the same projection error as the $\Delta t$-converged velocity error, and both are (at $t = 0^+$) relatively rather large.
Table 3.16-2 Some element errors.

Element                    10^4 x Initial    Initial          10^4 x Range of   Range of
                           Velocity Error    Pressure Error   Velocity Error    Pressure Error
Q1Q0                       1.3               0.026            1-2               0.003-0.017
Q2P-1                      0.056             0.020            0.2-0.9           0.003-0.014
Q2Q-1                      0.056             0.020            0.2-0.9           0.003-0.014
Q2Q1                       0.092             0.020            5-20              0.003-0.014
P2+P-1                     3.2               0.012            1-3               0.002-0.008
P2+P1                      0.12              0.021            9-30              0.003-0.012
P2P1                       0.14              0.020            6-22              0.002-0.012
Q1Q0 via Projection 2      1.3               0.026            5-20              0.003-0.015

4. $Q_2Q_1$, $P_2^+P_1$, and $P_2P_1$ all show 'small' initial velocity error but 'large' $\Delta t$-converged errors.
5. $Q_2P_{-1}$ and $Q_2Q_{-1}$ are the most accurate elements and, for the mesh chosen, seem to show that the potential extra accuracy offered by the xy term is not really needed for this 'easy' problem. See Veyret et al. (1998) for more on this issue.
To wrap it up, we 'advance' from fixed-step to two variable-$\Delta t$ (smart) integrators. The results of one such pair of simulations (for $\Delta t_0 = 10^{-3}$ and $\varepsilon = 10^{-3}$ via $Q_1Q_0$), one for BE and the other for TR, are shown in Figure 3.16-8. In Figures 3.16-8(a) and (b) are shown the $\Delta t$ histories as the solid lines (the first three steps are below 0.01 and are not shown), the PDE velocity errors as the solid lines + dots with every time step plotted (1 per dot), and the 'theoretical' (scalar ODE) curve as a dashed line [$\lambda \Delta t = (2\varepsilon e^{\lambda t})^{1/2}$ for BE and $\lambda \Delta t = (12\varepsilon e^{\lambda t})^{1/3}$ for TR with $\lambda = 2\pi^2 \nu$; see Section 2.7.4c]. The BE run required 36 steps to get to t = 5 and TR needed 19. Figures 3.16-8(c) and (d) show (again) the $\Delta t$ vs. t curve plus the PDE pressure error. We conclude from these results the following:
1. Variable time-stepping is much more efficient.
2. For TR, all of the error is spatial for all t [cf. Figure 3.16-3(b)]; for this mesh a larger $\varepsilon$ could be used for $Q_1Q_0$.
3. Whereas the TR curve of $\Delta t$ vs. t is somewhat conservative relative to the simple ODE theoretical curve, the BE result is not. The BE time steps appear to have grown a bit 'too fast', and the 'increasing' PDE global error reflects this. TR did a much better job of maintaining the error in the desired range.
4. The PDE pressure errors are virtually identical for BE and TR and are decaying in proportion to the pressure, like $e^{-4\pi^2 \nu t}$. Recall too that the pressure error is greatest at t = 0.
5. When we tightened $\varepsilon$ to $10^{-4}$ (not shown), BE kept the velocity error below $10^{-3}$ and required 106 time steps to t = 5, whereas TR maintained the error at about $2 \times 10^{-4}$ and needed only 33 steps; as noted above, even $\varepsilon = 10^{-3}$ is 'overkill' for TR. Both of these results are in fair agreement with simple asymptotic ODE theory ($36\sqrt{10} = 114$ and $19\sqrt[3]{10} = 41$).
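The two estimates quoted in item 5 can be reproduced with a few lines of arithmetic. The sketch below assumes the usual asymptotic scaling in which a method of order p has local error proportional to $\Delta t^{p+1}$, so that the step count grows like $\varepsilon^{-1/(p+1)}$; the function name `steps_after_tightening` is a hypothetical helper introduced only for this illustration.

```python
def steps_after_tightening(n_steps, eps_ratio, order):
    """Scale a step count when the local error tolerance is tightened.

    Assumes local error ~ dt**(order + 1), so dt scales like
    eps**(1/(order + 1)) and the number of steps like its inverse.
    eps_ratio is (old tolerance) / (new tolerance).
    """
    return n_steps * eps_ratio ** (1.0 / (order + 1))


# Step counts reported in the text for eps = 1e-3: BE 36 steps, TR 19 steps.
# Tightening eps from 1e-3 to 1e-4 gives eps_ratio = 10.
be_estimate = steps_after_tightening(36, 10.0, order=1)  # backward Euler
tr_estimate = steps_after_tightening(19, 10.0, order=2)  # trapezoid rule

print(round(be_estimate), round(tr_estimate))
```

The rounded estimates can then be set beside the reported counts of 106 (BE) and 33 (TR) to judge what 'fair agreement' means here.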
[Fig. 3.16-8 Sample performance of two variable-step integrators (4/1 element); each panel plots $\Delta t$ and an error norm against t. Panels: (a) $\Delta t$ and PDE velocity error for BE. (b) $\Delta t$ and PDE velocity error for TR. (c) $\Delta t$ and PDE pressure error for BE. (d) $\Delta t$ and PDE pressure error for TR.]
Finally, we report the results when using the usually-cost-effective (in 2D at least) penalty method, an index 0 DAE system; i.e. ODE's. Our view of the (usually) most efficient 2D transient simulation is that using the $Q_2P_{-1}$ (9/3) element in the consistent penalty mode (see Section 3.13.2e) integrated via the variable-step TR. The results, using a penalty parameter (see (3.16.254)) of $10^7$ and $\Delta t_0$ (still) $= 10^{-3}$, which (via the first two BE steps) suppresses (skips over) the 'penalty transient', are as follows: BE needed 36 time steps and TR needed 18, the same as mixed interpolation on $Q_1Q_0$, à la Figure 3.16-8. The penalty method can be quite cost-effective.
These extensive simulations, more of which will be reported in Veyret et al. (1999), have led us to the following conclusions:
1. While the Taylor-vortex problem is indeed worthwhile as a test problem, it is not perfect.
2. The most accurate elements are $Q_2P_{-1}$ and $Q_2Q_{-1}$, with the former being both stable and more cost-effective.
3. The Projection 2 method has a few remaining nagging problems (high spatial error, strange $\Delta t$ behaviour) that are 'calling out' to the numerical analysis community.
4. Perhaps the Taylor-Hood elements (continuous pressure) have outlived their usefulness.
5. Variable-step TR is a good way to integrate the index 2 DAE's; even variable-step BE is a better choice than any fixed-$\Delta t$ integration.
6. Whereas we saw mostly excellent agreement with DAE theory, some of the results (e.g. ODE errors) need further error analysis.

3.16.2 A Model DAE Problem

a. Introduction

The following (ostensibly simple) model problem for the incompressible Stokes equations is introduced and developed for several reasons: (i) it provides insight into the mathematics and behaviour of DAE's; (ii) analytical solutions are reasonably easy to obtain, and these reveal some of the potentially bizarre behavior of DAE's; (iii) the model mimics remarkably well the true semi-discretized DAE's generated by the Stokes equations; and (iv) it sheds some light on the behavior of the penalty method. It is, however, diversionary and is, like the section to follow, not absolutely essential; skip to Section 3.16.4 if you only wish to see how to write code.
To start, we review the goal: find $\mathbf{u}$ and P from
$$\partial \mathbf{u}/\partial t + \nabla P = \nu \nabla^2 \mathbf{u}, \qquad \nabla \cdot \mathbf{u} = 0 \quad \text{in } \Omega, \tag{3.16-116}$$
with BC $\mathbf{u} = \mathbf{w}(t)$ on $\Gamma$, and with IC $\mathbf{u}(\mathbf{x}, 0) = \mathbf{u}_0(\mathbf{x})$. If $\mathbf{u}_0$ is divergence-free, i.e., if $\nabla \cdot \mathbf{u}_0 = 0$ in $\Omega$ and $\mathbf{n} \cdot \mathbf{u}_0 = \mathbf{n} \cdot \mathbf{w}(0)$ on $\Gamma$, then the IC's are said to be compatible. If $\mathbf{u}_0$ is not divergence-free, a continuous-in-time solution does not exist, and the problem is generally regarded as ill-posed. A nearby well-posed problem can then be obtained as discussed in Section 3.10.2, and it is worth mentioning now that the analytical solution of the ill-posed DAE analog of (3.16-116) that we are about to present will automatically 'correct itself' to generate the analogous well-posed problem.
The simple DAE model for (3.16-116) is given by the following three-degree-of-freedom system:
$$\dot{u} + k_1 u + c_1 P = f_1(t), \qquad u(0) = u_0, \tag{3.16-117}$$
$$\dot{v} + k_2 v + c_2 P = f_2(t), \qquad v(0) = v_0, \tag{3.16-118}$$
and
$$c_1 u + c_2 v = g(t), \qquad P(0) = P_0, \tag{3.16-119}$$
where $c_1 \ne 0$, $c_2 \ne 0$, $k_1 > 0$, and $k_2 > 0$ are all constant. (An anisotropic 'viscosity,' $k_1 \ne k_2$, while not necessary, is utilized for generality. The model is still relevant, and in some sense more appropriate, for $k_1 = k_2$. Also, $c_1 \ne c_2$ is most appropriate.) We initially proceed naively in that (i) we presume that an initial pressure is required and given, and (ii) we presume that $u_0$ and $v_0$ are arbitrary. This is the index 2 (primitive variable) version of the model problem, whose index we shall now verify.
Just as the time derivative of $\nabla \cdot \mathbf{u} = 0$ leads to the PPE in the continuum, so too does
$$c_1 \dot{u} + c_2 \dot{v} = \dot{g} \tag{3.16-120}$$
lead to the semi-discrete PPE and an index 1 DAE; i.e., (3.16-117), (3.16-118) and (3.16-120) constitute an index 1 DAE system. To obtain the PPE version, substitute the accelerations from (3.16-117), (3.16-118) into (3.16-120) to get
$$c_1 \dot{u} + c_2 \dot{v} = \dot{g} = c_1(f_1 - k_1 u - c_1 P) + c_2(f_2 - k_2 v - c_2 P),$$
which we rearrange to
$$(c_1^2 + c_2^2) P + c_1 k_1 u + c_2 k_2 v = c_1 f_1 + c_2 f_2 - \dot{g}, \tag{3.16-121}$$
and denote (3.16-117), (3.16-118) and (3.16-121) as the PPE/index 1 formulation; note that $(c_1^2 + c_2^2)$ is our mimic of the Laplacian. It is this latter version of the index 1 DAE's that corresponds to the way some computer codes are written to solve the NS equations. Finally, one more differentiation of the constraint leads to an index 0 DAE system, i.e., an ODE system: (3.16-117), (3.16-118) and
$$(c_1^2 + c_2^2) \dot{P} + c_1 k_1 \dot{u} + c_2 k_2 \dot{v} = c_1 \dot{f}_1 + c_2 \dot{f}_2 - \ddot{g}. \tag{3.16-122}$$
We can now claim that the stated 'indices' are correct since it required two differentiations of the constraint to obtain an ODE system in the original variables; the original DAE's were indeed index 2.
Remarks: (1) Another (and not equivalent) index 0 system could be obtained by solving (3.16-121) for P and placing the result in (3.16-117) and (3.16-118) to obtain a pair of ODE's in u and v only. (2) Since neither index 0 system is used to write computer codes, we will say little more about them.

b. Index 2

Returning now to the index 2 formulation, (3.16-117) through (3.16-119), we proceed to find an analytical solution in terms of eigenvalues and eigenvectors. To start, we write it as a general system of singular ODE's (see, for example, Campbell, 1980, or Hairer and Wanner, 1991) via
$$B \dot{y} + A y = F(t), \qquad y(0) = y_0, \tag{3.16-123}$$
where $y = (u, v, P)^T$, $F = (f_1, f_2, g)^T$,
$$B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \text{and} \quad A = \begin{pmatrix} k_1 & 0 & c_1 \\ 0 & k_2 & c_2 \\ c_1 & c_2 & 0 \end{pmatrix},$$
where it is clear that B is singular. It is also clear that A is symmetric and non-singular ($|A| = -c_1^2 k_2 - c_2^2 k_1$); it is less clear but true that A is indefinite.
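The stated properties of A and B are easy to confirm in exact arithmetic. The following is a minimal sketch under assumed, purely illustrative coefficient values ($c_1 = 1$, $c_2 = 2$, $k_1 = 3$, $k_2 = 5$); the helpers `det3` and `quad_form` are hypothetical names, not part of any code discussed in the text. Indefiniteness is shown by exhibiting one vector with positive and one with negative quadratic form $x^T A x$.

```python
from fractions import Fraction as Fr


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def quad_form(m, x):
    """Quadratic form x^T m x."""
    return sum(x[r] * m[r][c] * x[c] for r in range(3) for c in range(3))


# Illustrative (assumed) coefficients satisfying c1, c2 != 0 and k1, k2 > 0.
c1, c2, k1, k2 = Fr(1), Fr(2), Fr(3), Fr(5)

A = [[k1, 0, c1], [0, k2, c2], [c1, c2, 0]]
B = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

det_A = det3(A)          # expected: -(c1^2 k2 + c2^2 k1), hence non-singular
det_B = det3(B)          # expected: 0, B is singular

# A is indefinite: the quadratic form takes both signs.
q_pos = quad_form(A, [1, 0, 0])       # equals k1 > 0
q_neg = quad_form(A, [c1, 0, -k1])    # equals -k1 * c1^2 < 0

print(det_A, det_B, q_pos, q_neg)
```

The second test vector works for any admissible coefficients, since $x = (c_1, 0, -k_1)^T$ gives $x^T A x = k_1 c_1^2 - 2 k_1 c_1^2 = -k_1 c_1^2$.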
To solve (3.16-117) through (3.16-119) via an eigenvector expansion, we must first find the proper eigenvectors. We start 'conventionally,' naively; i.e., as if B were not singular, and pretend that we have a simple system of linear ODE's. The conventional ODE method is the following:
Step 1. Set F(t) = 0 and study the homogeneous problem.
Step 2. Seek a solution of the form $y(t) = x e^{-\lambda t}$, which generates the following generalized eigenproblem:
$$A x = \lambda B x. \tag{3.16-124}$$
Step 3. Solve the eigenproblem to obtain (hopefully) a set of basis vectors, the first key objective.
Step 4. Return to the inhomogeneous problem, (3.16-123), and expand both y(t) and F(t) into linear combinations of the basis vectors. The amplitude coefficients for F(t) are obtained by solving a linear algebraic system and those for y(t) by solving (integrating) a set of uncoupled (usually) ODE's.
Step 5. Once the IC vector, $y_0$, is also expanded into the basis set, the solution is complete.
So let us start along this conventional path and see what happens. To solve (3.16-124), we first must form the characteristic equation,
$$|A - \lambda B| = 0, \tag{3.16-125}$$
which gives
$$\lambda = (c_1^2 k_2 + c_2^2 k_1)/(c_1^2 + c_2^2), \tag{3.16-126}$$
rather than the expected cubic equation for the desired three eigenvalues of the 3 × 3 system. This is the first DAE 'surprise'; we find only one eigenvalue rather than three. The eigenvector, from (3.16-124) and (3.16-126), is easily found to be
$$x = [c_2, \; -c_1, \; c_1 c_2 (k_2 - k_1)/(c_1^2 + c_2^2)]^T, \tag{3.16-127}$$
which is 'divergence-free': $c_1 x_1 + c_2 x_2 = 0$.
What next? Guided by, for example, Cook et al. (1989), or by Wilkinson (1978), we seek the rest of our 'solution' via the inverse eigenproblem,
$$B x = \mu A x, \tag{3.16-128}$$
which displays the same eigenvectors as (3.16-124) and, importantly, $\mu = 1/\lambda$. The characteristic equation here is $|B - \mu A| = 0$, which gives
$$\mu^2 [(c_1^2 k_2 + c_2^2 k_1)\mu - (c_1^2 + c_2^2)] = 0; \tag{3.16-129}$$
i.e., the three roots are $\mu = \{0, 0, 1/\lambda\}$. The two missing original eigenvalues are infinite! Another DAE surprise. Inserting $\mu = 0$ into (3.16-128) yields the eigenvector
$$x = x_0 = (0, 0, 1)^T, \tag{3.16-130}$$
and we come up against the third DAE surprise: the matrix $A^{-1}B$ is defective; there is but a single eigenvector corresponding to the repeated eigenvalue $\mu = 0$.
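The three 'surprises' above can be checked directly. The sketch below uses exact rational arithmetic with assumed, illustrative coefficients ($c_1 = 1$, $c_2 = 2$, $k_1 = 3$, $k_2 = 5$) and hypothetical helper names; it verifies that $|A - \lambda B|$ is only linear in $\lambda$, that (3.16-126) and (3.16-127) form an eigenpair, that $|B - \mu A|$ matches (3.16-129) at a few sample values of $\mu$, and that $x_0$ lies in the null space of B.

```python
from fractions import Fraction as Fr


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def matvec(m, x):
    """Matrix-vector product for 3x3 nested lists."""
    return [sum(m[r][c] * x[c] for c in range(3)) for r in range(3)]


# Illustrative (assumed) coefficients.
c1, c2, k1, k2 = Fr(1), Fr(2), Fr(3), Fr(5)
S = c1**2 + c2**2

A = [[k1, 0, c1], [0, k2, c2], [c1, c2, 0]]
B = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]


def char_poly(lam):
    """|A - lam*B| of (3.16-125)."""
    return det3([[A[r][c] - lam * B[r][c] for c in range(3)] for r in range(3)])


def inv_char_poly(mu):
    """|B - mu*A| of the inverse eigenproblem (3.16-128)."""
    return det3([[B[r][c] - mu * A[r][c] for c in range(3)] for r in range(3)])


lam = (c1**2 * k2 + c2**2 * k1) / S            # (3.16-126)
x = [c2, -c1, c1 * c2 * (k2 - k1) / S]         # (3.16-127)

# A vanishing second difference on an equally spaced grid means the
# characteristic polynomial is (at most) linear: only one finite root.
second_diff = char_poly(Fr(0)) - 2 * char_poly(Fr(1)) + char_poly(Fr(2))

# Residual of A x = lam B x.
residual = [a - lam * b for a, b in zip(matvec(A, x), matvec(B, x))]

# Compare |B - mu*A| with the closed form (3.16-129) at sample points.
samples = [Fr(1, 3), Fr(2), Fr(-1)]
closed_form_ok = all(
    inv_char_poly(mu) == mu**2 * ((c1**2 * k2 + c2**2 * k1) * mu - S)
    for mu in samples
)

x0 = [Fr(0), Fr(0), Fr(1)]                     # (3.16-130)
print(lam, second_diff, residual, closed_form_ok, matvec(B, x0))
```

A general-purpose generalized eigenvalue solver would typically report the two missing roots as infinite; the hand-rolled check is used here only to keep the example dependency-free.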
The resolution of this dilemma was provided by Jordan, and we thus seek a generalized eigenvector ($x_1$) corresponding to the repeated eigenvalue ($\mu = \mu_0$) via the application of Jordan theory (e.g., Noble, 1969):
$$(B - \mu_0 A) x_1 = A x_0, \tag{3.16-131}$$
where $\mu_0 = 0$, but we will not utilize this fact until later, for reasons that will become clear. The solution of (3.16-131) is easily found to be
$$x_1 = \begin{pmatrix} c_1 \\ c_2 \\ \gamma \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \\ 0 \end{pmatrix} + \gamma \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \\ 0 \end{pmatrix} + \gamma x_0, \tag{3.16-132}$$
where $\gamma$ is arbitrary, and it is obviously expedient (and legitimate) to take $\gamma = 0$. Note that the generalized eigenvector is not divergence-free; it is dilatational, with div $x_1 = c_1 x_{11} + c_2 x_{12} = (c_1^2 + c_2^2)$. It will turn out that its 'job' is to enforce $c_1 u + c_2 v = g$ for t > 0, and that of $x_0$ is to enforce the PPE, (3.16-121), for t > 0.
We finally have a basis for $R^n$ ($= R^3$): $x_0 = (0, 0, 1)^T$ with $\mu_0 = 0$ ($\lambda_0 = \infty$); $x_1 = (c_1, c_2, 0)^T$, also with $\mu_0 = 0$ ($\lambda_0 = \infty$); and the first one obtained, $x_2 = [c_2, -c_1, c_1 c_2 (k_2 - k_1)/(c_1^2 + c_2^2)]^T$, with $\lambda_2 = (c_1^2 k_2 + c_2^2 k_1)/(c_1^2 + c_2^2)$. {Proof of basis:
$$\begin{vmatrix} 0 & c_1 & c_2 \\ 0 & c_2 & -c_1 \\ 1 & 0 & \dfrac{c_1 c_2 (k_2 - k_1)}{c_1^2 + c_2^2} \end{vmatrix} = -(c_1^2 + c_2^2) \ne 0,$$
showing that the three vectors are linearly independent.}
Given a basis, we can return to (3.16-123) and express the solution as
$$y(t) = \sum_{j=0}^{2} a_j(t) x_j, \tag{3.16-133}$$
$$y_0 = \sum_{j=0}^{2} \gamma_j x_j, \tag{3.16-134}$$
and
$$F(t) = \sum_{j=0}^{2} b_j(t) A x_j, \tag{3.16-135}$$
where it is convenient to represent F(t) via the modified basis, $\{A x_j\}$, for reasons that will soon become clear. (Since $\{x_j\}$ form a basis, it follows that $\{A x_j\}$ also form a basis for any non-singular matrix A.) The modified basis is easily found to be $A x_0 = (c_1, c_2, 0)^T$, $A x_1 = (c_1 k_1, c_2 k_2, c_1^2 + c_2^2)^T$, and $A x_2 = (\lambda_2 c_2, -\lambda_2 c_1, 0)^T$.
Proceeding in reverse order for the amplitude coefficients, we first find the three $b_j$'s. Expanding (3.16-135) gives
$$c_1 b_0 + c_1 k_1 b_1 + \lambda_2 c_2 b_2 = f_1(t),$$
$$c_2 b_0 + c_2 k_2 b_1 - \lambda_2 c_1 b_2 = f_2(t),$$
and
$$0 \cdot b_0 + (c_1^2 + c_2^2) b_1 + 0 \cdot b_2 = g(t), \tag{3.16-136}$$
with solution
$$b_0(t) = \frac{c_1 f_1(t) + c_2 f_2(t) - \dfrac{c_1^2 k_1 + c_2^2 k_2}{c_1^2 + c_2^2}\, g(t)}{c_1^2 + c_2^2},$$
$$b_1(t) = g(t)/(c_1^2 + c_2^2),$$
and
$$b_2(t) = \frac{c_2 f_1(t) - c_1 f_2(t) + \dfrac{c_1 c_2 (k_2 - k_1)}{c_1^2 + c_2^2}\, g(t)}{c_1^2 k_2 + c_2^2 k_1}. \tag{3.16-137}$$
Similarly, solving the 3 × 3 system from (3.16-134) gives the $\{\gamma_j\}$, where $y_0 = (u_0, v_0, P_0)^T$:
$$\gamma_0 = P_0 - \frac{c_1 c_2 (k_2 - k_1)(c_2 u_0 - c_1 v_0)}{(c_1^2 + c_2^2)^2},$$
$$\gamma_1 = (c_1 u_0 + c_2 v_0)/(c_1^2 + c_2^2),$$
$$\gamma_2 = (c_2 u_0 - c_1 v_0)/(c_1^2 + c_2^2), \tag{3.16-138}$$
where we still (naively) presume the initial data to be arbitrary. Finally, we insert (3.16-133) and (3.16-135) into (3.16-123) to obtain
$$\sum_{j=0}^{2} [B \dot{a}_j(t) x_j + A a_j(t) x_j - b_j(t) A x_j] = 0,$$
which, using (3.16-128) for j = 0, 2 and (3.16-131) for j = 1 (the generalized eigenvector), gives
$$[\mu_0 \dot{a}_0(t) + a_0(t) - b_0(t)] A x_0 + (\mu_0 A x_1 + A x_0)\dot{a}_1(t) + [a_1(t) - b_1(t)] A x_1 + [\mu_2 \dot{a}_2(t) + a_2(t) - b_2(t)] A x_2 = 0,$$
which we rearrange to
$$[\mu_0 \dot{a}_0(t) + \dot{a}_1(t) + a_0(t) - b_0(t)] A x_0 + [\mu_0 \dot{a}_1(t) + a_1(t) - b_1(t)] A x_1 + [\mu_2 \dot{a}_2(t) + a_2(t) - b_2(t)] A x_2 = 0, \tag{3.16-139}$$
which shows why our choice of the modified basis for F(t) is 'convenient:' the vectors $\{A x_j\}$ are linearly independent, so that we finally obtain the nearly uncoupled ODE's, i.e., uncoupled in the sense of Jordan,
$$\mu_0 \dot{a}_0(t) + \dot{a}_1(t) + a_0(t) = b_0(t),$$
$$\mu_0 \dot{a}_1(t) + a_1(t) = b_1(t),$$
$$\mu_2 \dot{a}_2(t) + a_2(t) = b_2(t), \tag{3.16-140}$$
with IC's given by (3.16-134) and (3.16-138).
While the solution of (3.16-140) is particularly simple if we set $\mu_0 = 0$, we shall proceed somewhat differently, to highlight some additional 'properties' of DAE's. Thus, suppose for the moment that $\mu_0 > 0$; we shall solve the resulting ODE's and then see what happens as $\mu_0 \to 0$. These ODE's are solved in the usual way (integrating factor) to give (using $\lambda_j = 1/\mu_j$):
$$a_0(t) = e^{-\lambda_0 t}\left[\gamma_0 + \lambda_0 \int_0^t h(\tau) e^{\lambda_0 \tau}\, d\tau\right],$$
$$a_1(t) = e^{-\lambda_0 t}\left[\gamma_1 + \lambda_0 \int_0^t b_1(\tau) e^{\lambda_0 \tau}\, d\tau\right],$$
$$a_2(t) = e^{-\lambda_2 t}\left[\gamma_2 + \lambda_2 \int_0^t b_2(\tau) e^{\lambda_2 \tau}\, d\tau\right], \tag{3.16-141}$$
where $h(t) = b_0(t) - \lambda_0\{b_1(t) - e^{-\lambda_0 t}[\gamma_1 + \lambda_0 \int_0^t b_1(\tau) e^{\lambda_0 \tau}\, d\tau]\}$. The full solution, from (3.16-133), is then
$$\begin{pmatrix} u \\ v \\ P \end{pmatrix} = a_0(t) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} + a_1(t) \begin{pmatrix} c_1 \\ c_2 \\ 0 \end{pmatrix} + a_2(t) \begin{pmatrix} c_2 \\ -c_1 \\ \dfrac{c_1 c_2 (k_2 - k_1)}{c_1^2 + c_2^2} \end{pmatrix}; \tag{3.16-142}$$
i.e.,
$$u(t) = c_1 a_1(t) + c_2 a_2(t),$$
$$v(t) = c_2 a_1(t) - c_1 a_2(t),$$
and
$$P(t) = a_0(t) + c_1 c_2 (k_2 - k_1) a_2(t)/(c_1^2 + c_2^2). \tag{3.16-143}$$
But what about $t \to 0$ and IC's? Recall that we presumably selected $u_0$, $v_0$, and $P_0$ arbitrarily. Also recall that we are ultimately interested in $\mu_0 = 0$ ($\lambda_0 = \infty$). Well, it turns out to matter in which order we take the two limits, and the difference is interesting and illuminating. If we let $t \to 0$ first, (3.16-141) through (3.16-143) yield $u(0) = c_1 \gamma_1 + c_2 \gamma_2$, $v(0) = c_2 \gamma_1 - c_1 \gamma_2$, and $P(0) = \gamma_0 + c_1 c_2 (k_2 - k_1)\gamma_2/(c_1^2 + c_2^2)$, or, using (3.16-138), $u(0) = u_0$, $v(0) = v_0$, and $P(0) = P_0$, as required; i.e., by construction. But if we let $\mu_0 \to 0$ first (the real case of interest), the results are vastly different.
Thus, we finally focus on the solution of real interest for the DAE's: $\lambda_0 = \infty$ ($\mu_0 = 0$), for which we need, in order to use (3.16-141),
1.
$$\lim_{\lambda_0 \to \infty} \lambda_0 e^{-\lambda_0 t} \int_0^t G(\tau) e^{\lambda_0 \tau}\, d\tau = G(t) \tag{3.16-144}$$
and
2.
$$\lim_{\lambda_0 \to \infty} h(t) = b_0(t) - \lim_{\lambda_0 \to \infty}\left[\lambda_0 b_1(t) - \lambda_0^2 e^{-\lambda_0 t} \int_0^t b_1(\tau) e^{\lambda_0 \tau}\, d\tau\right] = b_0(t) - \dot{b}_1(t). \tag{3.16-145}$$
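Both limits can be illustrated numerically for a fixed t > 0. The sketch below is only an illustration under stated assumptions: the test function $G(\tau) = \cos \tau$, the time t = 1, the midpoint quadrature rule, and the helper name `relaxed_average` are all choices made here and do not come from the text. The two exponentials are merged into $e^{-\lambda_0 (t - \tau)}$ so that large $\lambda_0$ does not overflow.

```python
import math


def relaxed_average(G, t, lam0, n=200_000):
    """Midpoint-rule estimate of
    lam0 * exp(-lam0*t) * integral_0^t G(tau) * exp(lam0*tau) dtau,
    written with exp(-lam0*(t - tau)) to avoid overflow.
    """
    h = t / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h
        total += G(tau) * math.exp(-lam0 * (t - tau))
    return lam0 * h * total


t = 1.0
errors = {}   # |left side of (3.16-144) minus G(t)|
slopes = {}   # lam0 * (G(t) - left side), which mimics the bracket in (3.16-145)

for lam0 in (10.0, 100.0, 1000.0):
    value = relaxed_average(math.cos, t, lam0)
    errors[lam0] = abs(value - math.cos(t))
    slopes[lam0] = lam0 * (math.cos(t) - value)
    print(lam0, value, errors[lam0], slopes[lam0])

# For comparison: G(t) = cos(1) and dG/dt = -sin(1).
print(math.cos(t), -math.sin(t))
```

As $\lambda_0$ grows the first quantity approaches G(t) and the second approaches $\dot{G}(t)$, which is how a time derivative of the data enters the limiting solution.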
The final ($\mu_0 = 0$) solution is therefore
$$a_0(t) = b_0(t) - \dot{b}_1(t), \qquad a_1(t) = b_1(t),$$
and
$$a_2(t) = e^{-\lambda_2 t}\left[\gamma_2 + \lambda_2 \int_0^t b_2(\tau) e^{\lambda_2 \tau}\, d\tau\right] \tag{3.16-146}$$
inserted into (3.16-143).
Remarks: (1) This solution is, of course, (much) easier to obtain by setting $\mu_0 = 0$ and solving (3.16-140), now a system of DAE's, directly. We proceeded as we did for pedagogical reasons, hopefully profitably and not pedantically. (2) It is noteworthy that the solution involves a time derivative of the data, a situation that does not occur with ODE's. (3) $\gamma_0$ and $\gamma_1$ are totally irrelevant.
Now for our final(!) DAE surprise: if we now evaluate (3.16-143) at t = 0, using (3.16-146) and (3.16-138), we obtain
$$u(0) = \frac{c_2 (c_2 u_0 - c_1 v_0) + c_1 g(0)}{c_1^2 + c_2^2} \ne u_0,$$
$$v(0) = \frac{c_1 (c_1 v_0 - c_2 u_0) + c_2 g(0)}{c_1^2 + c_2^2} \ne v_0,$$
and
$$P(0) = \frac{c_1 f_1(0) + c_2 f_2(0) - \dot{g}(0) - [c_1 k_1 u(0) + c_2 k_2 v(0)]}{c_1^2 + c_2^2} \ne P_0; \tag{3.16-147}$$
i.e., these results do not (in general) satisfy the chosen IC's! But they do satisfy both the continuity equation and the PPE; i.e., $c_1 u(0) + c_2 v(0) = g(0)$ and $(c_1^2 + c_2^2) P(0) + c_1 k_1 u(0) + c_2 k_2 v(0) = c_1 f_1(0) + c_2 f_2(0) - \dot{g}(0)$, the two algebraic constraint equations. What has happened is that the DAE solution is smarter than we are in the following sense: if we were foolish enough to select an initial velocity field that is not divergence-free and/or foolish enough to believe that we could select an initial condition for the pressure, the solution will correct our errors by (via the two infinite eigenvalues) changing the IC's! Another way to view it is that the solution will initially be discontinuous in time owing to a lack of compatible initial data: at $t = 0^+$, the solution will be u(0), v(0), and P(0), rather than $u_0$, $v_0$, and $P_0$. For t > 0, the solution will be continuous, divergence-free, and will always satisfy the PPE.
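The 'self-correcting' initial state can be reproduced with a short exact-arithmetic calculation. The sketch below assembles u(0), v(0), and P(0) from the modal amplitudes of (3.16-146) at t = 0 and then checks the two algebraic constraints; every numerical value (the coefficients, the deliberately incompatible $u_0$, $v_0$, $P_0$, and the forcing data) is an assumption chosen only for illustration.

```python
from fractions import Fraction as Fr

# Illustrative (assumed) model coefficients.
c1, c2, k1, k2 = Fr(1), Fr(2), Fr(3), Fr(5)
S = c1**2 + c2**2
x23 = c1 * c2 * (k2 - k1) / S        # third component of x2, see (3.16-127)

# Deliberately incompatible initial data (hypothetical numbers):
# c1*u0 + c2*v0 = 3 differs from g(0) = 1, and P0 is chosen arbitrarily.
u0, v0, P0 = Fr(1), Fr(1), Fr(7)
f1, f2 = Fr(2), Fr(-1)               # f1(0), f2(0)
g0, gdot0 = Fr(1), Fr(4)             # g(0), dg/dt at t = 0

# Forcing amplitudes at t = 0, from (3.16-136)/(3.16-137).
b0 = (c1 * f1 + c2 * f2) / S - (c1**2 * k1 + c2**2 * k2) * g0 / S**2
b1 = g0 / S
b1dot = gdot0 / S
gamma2 = (c2 * u0 - c1 * v0) / S     # (3.16-138)

# mu0 = 0 solution (3.16-146) evaluated at t = 0.
a0, a1, a2 = b0 - b1dot, b1, gamma2

# Physical variables from (3.16-143).
u_adj = c1 * a1 + c2 * a2
v_adj = c2 * a1 - c1 * a2
P_adj = a0 + x23 * a2

# Residuals of the continuity constraint and of the PPE (3.16-121).
continuity_residual = c1 * u_adj + c2 * v_adj - g0
ppe_residual = (S * P_adj + c1 * k1 * u_adj + c2 * k2 * v_adj
                - (c1 * f1 + c2 * f2 - gdot0))

print(u_adj, v_adj, P_adj, continuity_residual, ppe_residual)
```

Note that $\gamma_0$ and $\gamma_1$, and hence $P_0$, never enter the calculation, in line with Remark (3).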
Remarks: (1) Not all numerical integration methods will be as smart as the analytical solution, with the result (both here and in general) that DAE's can cause integration schemes to go crazy. (2) Note that $a_0(t)$ is used only for pressure and that $a_1(t)$, with the generalized eigenvector, is used only for velocity, a situation that will be seen to carry over to the full Stokes equations. (3) If $c_1 u_0 + c_2 v_0 = g(0) = g_0$, then the velocity solution (but generally not the pressure) will satisfy the IC's.
Before we conclude the index 2 presentation, we point out that it is highly significant, because it too is not limited to this model problem, that the 'adjusted' IC's are in fact identical to those obtained by the $L_2$-projection of the initial data to the nearest divergence-free subspace, as discussed in Section 3.10.2 for the continuous case. To prove this important assertion, we pose the following problem: given $u_0$ and $v_0$ ($P_0$ is irrelevant), find the closest divergence-free velocity, u and v. The solution of this problem is: find u and v such that $J(u, v) = \frac{1}{2}[(u - u_0)^2 + (v - v_0)^2]$ is minimized over all (u, v) satisfying $c_1 u + c_2 v = g_0$. Introducing a Lagrange multiplier, $\varphi$, to satisfy the constraint, an equivalent statement of the problem is: find the stationary point of $J(u, v; \varphi) = J(u, v) + \varphi(c_1 u + c_2 v - g_0)$ over all u, v, $\varphi$. This leads to $\partial J/\partial u = 0$, $\partial J/\partial v = 0$, and $\partial J/\partial \varphi = 0$; i.e., to $u = u_0 - c_1 \varphi$, $v = v_0 - c_2 \varphi$, and $c_1 u + c_2 v = g_0$, with solution
$$u = \frac{c_2 (c_2 u_0 - c_1 v_0) + c_1 g_0}{c_1^2 + c_2^2}, \qquad v = \frac{c_1 (c_1 v_0 - c_2 u_0) + c_2 g_0}{c_1^2 + c_2^2}, \qquad \varphi = \frac{c_1 u_0 + c_2 v_0 - g_0}{c_1^2 + c_2^2}.$$
Clearly u = u(0) and v = v(0) from (3.16-147). QED.
If we had been smarter a priori, we would not violate div $\mathbf{u}_0 = 0$, nor would we select a $P_0$. If we constrain $u_0$ and $v_0$ to satisfy $c_1 u_0 + c_2 v_0 = g_0$, (3.16-147) 'agreeably' gives $u(0) = u_0$ and $v(0) = v_0$; and P(0) is the appropriate pressure corresponding to $u_0$ and $v_0$ and satisfies the PPE, (3.16-121), at t = 0.
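The projection argument can be exercised in a few lines. This is a minimal sketch under assumed data ($c_1 = 1$, $c_2 = 2$, $u_0 = v_0 = 1$, $g_0 = 1$); the function names `project_to_constraint` and `J` are hypothetical helpers. It computes the Lagrange-multiplier solution, compares it with the closed-form velocities of (3.16-147), and confirms that other points on the constraint line are farther from $(u_0, v_0)$.

```python
from fractions import Fraction as Fr


def project_to_constraint(u0, v0, c1, c2, g0):
    """Closest point to (u0, v0) on the line c1*u + c2*v = g0.

    Returns (u, v, phi), where phi is the Lagrange multiplier.
    """
    phi = (c1 * u0 + c2 * v0 - g0) / (c1**2 + c2**2)
    return u0 - c1 * phi, v0 - c2 * phi, phi


def J(u, v, u0, v0):
    """Half the squared distance from (u, v) to (u0, v0)."""
    return Fr(1, 2) * ((u - u0) ** 2 + (v - v0) ** 2)


# Illustrative (assumed) data: (u0, v0) violates the constraint.
c1, c2 = Fr(1), Fr(2)
u0, v0, g0 = Fr(1), Fr(1), Fr(1)

u_p, v_p, phi = project_to_constraint(u0, v0, c1, c2, g0)

# Closed-form 'adjusted' initial velocities from (3.16-147).
S = c1**2 + c2**2
u_formula = (c2 * (c2 * u0 - c1 * v0) + c1 * g0) / S
v_formula = (c1 * (c1 * v0 - c2 * u0) + c2 * g0) / S

# Other feasible points: slide along the constraint direction (c2, -c1).
J_min = J(u_p, v_p, u0, v0)
J_others = [J(u_p + s * c2, v_p - s * c1, u0, v0)
            for s in (Fr(-1), Fr(1, 10), Fr(3))]

print(u_p, v_p, phi, J_min, J_others)
```

Because the correction $(u - u_0, v - v_0) = -\varphi (c_1, c_2)$ is orthogonal to the constraint direction $(c_2, -c_1)$, every displaced point has a strictly larger J.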
To conclude the index 2 discussion, we specialize to the simpler case of constant forcing to glean some additional insight; the above solution [with $c_1u_0 + c_2v_0 = g$ and $P_0$ obtained from (3.16-121)] then 'agreeably' simplifies to $u(t) = u_0e^{-\lambda_2t} + u_{ss}(1 - e^{-\lambda_2t})$, $v(t) = v_0e^{-\lambda_2t} + v_{ss}(1 - e^{-\lambda_2t})$, and $P(t) = P_0e^{-\lambda_2t} + P_{ss}(1 - e^{-\lambda_2t})$, (3.16-148) where $u_{ss} = [c_2(c_2f_1 - c_1f_2) + c_1k_2g]/(c_1^2k_2 + c_2^2k_1)$, $v_{ss} = [c_1(c_1f_2 - c_2f_1) + c_2k_1g]/(c_1^2k_2 + c_2^2k_1)$, $P_0 = [c_1f_1 + c_2f_2 - (c_1k_1u_0 + c_2k_2v_0)]/(c_1^2 + c_2^2)$,
and $P_{ss} = \frac{c_1k_2f_1 + c_2k_1f_2 - k_1k_2g}{c_1^2k_2 + c_2^2k_1} = \frac{c_1f_1 + c_2f_2 - (c_1k_1u_{ss} + c_2k_2v_{ss})}{c_1^2 + c_2^2}$. (3.16-149)

[Exercise for the reader: Add a third momentum equation (for w) to the model system of (3.16-117) through (3.16-119), change (3.16-119) to $c_1u + c_2v + c_3w = g$, and solve as above. Show that the only significant difference is the addition of a second divergence-free viscous decay mode; i.e., the index of the system is still 2, and there are still two infinite eigenvalues. (The two 2's are coincidental.)]

c. Index 1

The first thing to note about the lower index version of the model problem, given by (3.16-117), (3.16-118) and (3.16-121), is that these equations imply (3.16-120) rather than (3.16-119), which itself implies $c_1u(t) + c_2v(t) - g(t) = c_1u_0 + c_2v_0 - g_0$; (3.16-150) i.e., any initial violation of div u = 0, which is in fact mathematically permissible in the index 1 formulation, will remain for all time. Next we note that both B and A of (3.16-123) are singular, where now $A = \begin{pmatrix} k_1 & 0 & c_1 \\ 0 & k_2 & c_2 \\ c_1k_1 & c_2k_2 & c_1^2 + c_2^2 \end{pmatrix}$, $F(t) = (f_1, f_2, c_1f_1 + c_2f_2 - \dot g)^T$, and B is unchanged from its index 2 definition. (Note that the third row of A is a linear combination of the first two rows; hence, A must be singular.)

The eigenproblem (3.16-124) now yields two roots rather than just one, and they are $\lambda = \{0,\ \lambda_2 = (c_1^2k_2 + c_2^2k_1)/(c_1^2 + c_2^2)\}$. The eigenvector corresponding to $\lambda = \lambda_1 = 0$ is $x_1 = (c_1k_2, c_2k_1, -k_1k_2)^T$ and that corresponding to $\lambda_2$ is given by (3.16-127). Now the inverse eigenproblem, (3.16-128), is again needed to complete the vector space. It yields $\mu = \{0, 1/\lambda_2\}$, and it is the first root, $\mu = \mu_0 = 0$ ($\lambda_0 = \infty$), that is now of interest; its eigenvector is $x_0 = (0, 0, 1)^T$, as for the index 2 problem. Thus, one of the infinite eigenvalues from the index 2 formulation has been converted/inverted to zero, a rather significant change and another DAE surprise. It follows easily, as with index 2, that the three eigenvectors are linearly independent.
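The index 1 eigen-structure just described is easy to verify for concrete numbers. The sketch below is illustrative only; the helper names are hypothetical, and the vector used for $x_2$, namely $(c_2, -c_1, c_1c_2(k_2 - k_1)/(c_1^2 + c_2^2))^T$, is an assumption inferred from the form of the full solution (3.16-157), since (3.16-127) is not reproduced in this part of the text.

```python
def matvec(M, x):
    """Product of a 3x3 matrix (list of rows) with a 3-vector."""
    return [sum(M[i][j] * x[j] for j in range(3)) for i in range(3)]


def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def index1_system(k1, k2, c1, c2):
    """Matrices and (assumed) eigenvectors of the index 1 model problem."""
    L = c1 * c1 + c2 * c2
    A = [[k1, 0.0, c1],
         [0.0, k2, c2],
         [c1 * k1, c2 * k2, L]]          # third row = c1*row1 + c2*row2
    B = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
    lam2 = (c1 * c1 * k2 + c2 * c2 * k1) / L
    x0 = [0.0, 0.0, 1.0]                 # B x0 = 0   (infinite eigenvalue)
    x1 = [c1 * k2, c2 * k1, -k1 * k2]    # A x1 = 0   (zero eigenvalue)
    x2 = [c2, -c1, c1 * c2 * (k2 - k1) / L]   # A x2 = lam2 * B x2
    return A, B, lam2, x0, x1, x2
```

A nonzero `det3(x0, x1, x2)` confirms linear independence; for positive $k_1$, $k_2$ it equals $-(c_1^2k_2 + c_2^2k_1)$, which cannot vanish unless $c_1 = c_2 = 0$.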
Given a basis, we now return to (3.16-133) through (3.16-135), except with a new twist: the 'efficient' expansion of F(t) this time is as follows: $F(t) = b_0(t)Ax_0 + b_1(t)Bx_1 + b_2(t)Bx_2$; (3.16-151) i.e., a mixture of modified basis vectors is utilized. We remark that an equally successful expansion would replace $b_2Bx_2$ by $b_2Ax_2$, and we leave as an exercise the proof that these modified vectors are indeed linearly independent even though both A and B are singular. (Hint: form their determinant.) Solving (3.16-151) yields $b_0(t) = \frac{c_1f_1(t) + c_2f_2(t) - \dot g(t)}{c_1^2 + c_2^2}$, $b_1(t) = \frac{\dot g(t)}{c_1^2k_2 + c_2^2k_1}$,
$b_2(t) = \frac{1}{c_1^2 + c_2^2}\left[c_2f_1(t) - c_1f_2(t) - \frac{c_1c_2(k_2 - k_1)}{c_1^2k_2 + c_2^2k_1}\,\dot g(t)\right]$. (3.16-152)

Inserting (3.16-133) and (3.16-151) into (3.16-123) and utilizing the results from the eigenproblems, $Bx_0 = 0$, $Ax_1 = 0$, and $Ax_2 = \lambda_2Bx_2$, yields $(a_0 - b_0)Ax_0 + (\dot a_1 - b_1)Bx_1 + (\dot a_2 + \lambda_2a_2 - b_2)Bx_2 = 0$. (3.16-153) Again, since the vectors $\{Ax_0, Bx_1, Bx_2\}$ are linearly independent, this is necessarily an expansion of the zero vector; i.e., we have $a_0(t) = b_0(t)$, $\dot a_1(t) = b_1(t)$, and $\dot a_2(t) + \lambda_2a_2(t) = b_2(t)$. (3.16-154)

Next, noting that u and v depend only on $x_1$ and $x_2$, the IC's for $a_1(t)$ and $a_2(t)$ are obtained from $u_0 = \gamma_1x_{11} + \gamma_2x_{21}$ and $v_0 = \gamma_1x_{12} + \gamma_2x_{22}$ to give $\gamma_1 = (c_1u_0 + c_2v_0)/(c_1^2k_2 + c_2^2k_1)$, $\gamma_2 = (c_2k_1u_0 - c_1k_2v_0)/(c_1^2k_2 + c_2^2k_1)$, (3.16-155) so that (3.16-154) yields, using (3.16-152), $a_1(t) = \gamma_1 + \int_0^t b_1(\tau)\,d\tau = \gamma_1 + \frac{g(t) - g_0}{c_1^2k_2 + c_2^2k_1}$ and $a_2(t) = e^{-\lambda_2t}\left[\gamma_2 + \int_0^t b_2(\tau)e^{\lambda_2\tau}\,d\tau\right]$, (3.16-156) and the full solution is now at hand: $u(t) = c_1k_2a_1(t) + c_2a_2(t)$, $v(t) = c_2k_1a_1(t) - c_1a_2(t)$, $P(t) = b_0(t) - k_1k_2a_1(t) + c_1c_2(k_2 - k_1)a_2(t)/(c_1^2 + c_2^2)$, (3.16-157) from which we can easily verify that (i) $c_1u + c_2v = g(t) + (c_1u_0 + c_2v_0 - g_0)$ and (ii) $(c_1^2 + c_2^2)P + c_1k_1u + c_2k_2v = c_1f_1 + c_2f_2 - \dot g$; the PPE is always satisfied, but the solution is divergence-free only if it began that way. And only then will it agree with the index 2 solution.

The loss of div u = 0, and the concomitant admission of more solutions, is a direct consequence of the eigenvalue that went from $\infty$ in the index 2 formulation to 0 in the index 1 formulation, the latter not recognizing the possibility of using incompatible IC's. Finally, if some arbitrary $P_0$ were
chosen as a pressure initial condition, then P would be non-smooth (discontinuous at t = 0). [Exercise for the reader: Show that the index 1 and index 2 solutions agree when $c_1u_0 + c_2v_0 = g_0$. (Hint: an integration by parts is required.)]

d. Index 0

Although we will not provide the details, we discuss this case for completeness and, more importantly perhaps, to shed yet more light on the behavior of DAE's. The salient features of the ODE system and its (more conventional) solution via the eigenvectors are:
1. Two of the eigenvalues ($\lambda$) are zero and one is non-zero (and the same as $\lambda_2$ above, corresponding to the solenoidal viscous decay mode).
2. IC's are now required for P, too.
3. All IC's are 'arbitrary'; i.e., they are necessary, and a solution always exists.
4. The only divergence condition that is guaranteed is $c_1\ddot u + c_2\ddot v = \ddot g$; the jerk (rate of change of acceleration) is divergence-free.
5. Only the time derivative of the PPE is guaranteed to be satisfied.
6. Last but not least: if and only if the IC's are compatible will the solution be correct; i.e., agree with the index 2 solution. Compatible IC's are the following: (i) divergence-free velocity and (ii) pressure that satisfies the PPE, (3.16-121), at t = 0.

e. Penalty

The penalty method neatly converts the DAE's to ODE's, so that it would seem a simple (even desirable) ploy, since we now know that DAE's are trickier to solve than ODE's. As we shall see, this is a good ploy except for two things:
1. The results are not uniformly valid in time: there exists a sharp 'penalty transient' (a boundary layer in time) that is spurious from the viewpoint of the index 2 DAE's that it attempts to solve.
2. Non-trivial asymptotic analysis is often required if a full understanding of the penalty method is desired.
The penalty method begins as an index 1 DAE system: $\dot u + k_1u + c_1P = f_1(t)$, $u(0) = u_0$; $\dot v + k_2v + c_2P = f_2(t)$, $v(0) = v_0$; and $c_1u + c_2v - g(t) = \varepsilon P$, $P(0) = (c_1u_0 + c_2v_0 - g_0)/\varepsilon$, (3.16-158) where $\varepsilon$ is 'small' in an appropriate sense, and positive. Elimination of P generates the index 0 penalty ODE's, $\dot u + (k_1 + c_1^2/\varepsilon)u + c_1c_2v/\varepsilon = f_1 + c_1g/\varepsilon$,
688 and THE NAVIER-STOKES EQUATIONS v + (k2 + c2/£)v + C[C2u/e = f2 + c2g/£, (3.16-159) a system we shall solve and then use to attempt to evaluate the penalty premise: for e —► 0, the penalty solution is close—to within 0(e)—to the index 2 solution. Remark: In addition to satisfying (3.16-158), the penalty pressure also satisfies the following 'PPE': eP + (c\ + c\)P + c\k\u + c2k2v = c,/, + c2f2 - g, (3.16-160) which of course is the true PDE iff eP is 'sufficiently small'. In (3.16-123), we now have B = I, ' f\(t) + c]g(t)/e\ A = k\ + c2/e c\c2/e c\c2/e k2 + c\je and F(t) = f2(t)+C2g(t)/£ Solving (3.16-125) then gives the two eigenvalues (X ^ \/e\—the penalty parameter here is simply 1/e): 1 A.= - 2 kl+k2 + C-±±^± \ 2(k2-kx)(c\-c\) (c] + c22 (k2 — k\ y H h (3.16-161) whose asymptotic behavior is of particular interest: as e —> 0, (3.16-161) gives one very large and one finite eigenvalue, 0 0 0 0 c, + c9 c,k\ + c0k2 ^ X+ = -! 2- + -J-4 It1 + 0(e) >+ c2 + 4 and c,k2 + c0k\ _ X- = -4 P + O(e), c\ + c\ (3.16-162) the former tending to infinity and the latter to the proper physical value, X2. It is X+ that will be responsible for the spurious penalty transient and X- for the close approximation to the desired solution once this transient has expired—more or less. The corresponding eigenvectors, from (3.16-124), are found to be £{h - X) 2c2 +Cl g(A.-*i) r 2c, Cl in which X+ [from (3.16-161)] gives one of the eigenvectors, and X_ gives the other [which is divergence-free to 0(e)]. Expanding both y = (u, v)T and F(t) into these eigenvectors and solving (3.16-123) yields e(k2 - X+) x = (3.16-163) u V = e -k+t «.(0)+ / £,(r)e^rdr Jo
SOLUTION METHODS FOR THE SEMI-DISCRETIZED 689 s(k2 -A._) + e -X_r fl2(0)+ / /32(T)ex~Tdz Jo + C2 — C\ , (3.16-164) where 3cj - c] + e(Aik + #) c| - 3c? + e(Ak - R) fli(O) = c\v0 c2u0, _, -_. . _v— . _., 3cl - c\ + s(Ak - R) a2(0) = c2u0 c\vQ, Pi = c\~ 3c22 D 3c\ + s(Ak + R) D - c\ + s(Ak + R) D c\(fi + c2g/s)- D c\ - 3c\ + s(Ak - R) D ci(f\ +cig/e), and c\ - 3c] + s(Ak + R) r . , 3c? - c? + e(AA: - R) where D Ak = k2 — k[, R = D c\(f2 + c2g/e), \ iAk)2+2Ak(4-cl)+(cU4 and D = eR[eAk + 2{c\ - c\)], which is to be compared with the index 2 result (velocity part), (3.16-142) through (3.16-146). The 'comparison' is clearly not easy—and probably not worthwhile, at least in any detail. What is worthwhile is to note that it can be shown, from (3.16-164) for e —► 0 and using (3.16-144), that while u(0) = uq and v(0) = vq where uq and t>o are arbitrary, the spurious penalty transient does a similar job as the index 2 solution if c\u$ + c2v$ ^ go; namely, it performs the 'projection' to the nearest divergence-free subspace. The difference is that it takes longer to do so via penalty [the penalty time constant is r = e/{c\ + c\) vis-a-vis r = 0 for index 2], and the ('slow') projection is only correct to 0(e). It is also noteworthy that there is a non-physical penalty transient even if c\Uq + c2vq = go (for which Pq = 0!); it is just not so spurious in this case, because during (and after) the transient, the velocity will satisfy c\u(t) + c2v(t) = g(t) + 0(e). (P0 is also spurious.) We will now perform a detailed comparison, albeit only for a rather simpler particular case: time-independent forcing and 'isotropic' viscosity, k\ = k2 = k; cf. Section 3.13.2e for the steady case. 
The index 2 solution is, from (3.16-148) and (3.16-149), $\begin{pmatrix} u \\ v \end{pmatrix} = e^{-kt}\begin{pmatrix} u_0 \\ v_0 \end{pmatrix} + \frac{1 - e^{-kt}}{k(c_1^2 + c_2^2)}\begin{pmatrix} c_2(c_2f_1 - c_1f_2) + kc_1g \\ c_1(c_1f_2 - c_2f_1) + kc_2g \end{pmatrix}$ (3.16-165) and $P = P_0 = P_{ss} = (c_1f_1 + c_2f_2 - kg)/(c_1^2 + c_2^2)$; (3.16-166) i.e., we have a particularly simple special case (because $k_1 = k_2$): no transient viscous decay terms in the pressure. The initially divergence-free velocity ($c_1u_0 + c_2v_0 = g$)
remains that way with no 'need' for the pressure. [If the IC is not div-free, (3.16-165) still applies, but with $u_0$ and $v_0$ replaced by $u(0)$ and $v(0)$ in (3.16-147).]

The corresponding penalty solution, however, with eigenvalues [from (3.16-161)] of $\lambda_+ = k + (c_1^2 + c_2^2)/\varepsilon \approx (c_1^2 + c_2^2)/\varepsilon$ and $\lambda_- = k$, is not quite so simple: from (3.16-164) it is

$\begin{pmatrix} u \\ v \end{pmatrix} = \frac{e^{-\lambda_+t}}{c_1^2 + c_2^2}\begin{pmatrix} c_1(c_1u_0 + c_2v_0) \\ c_2(c_1u_0 + c_2v_0) \end{pmatrix} + \frac{1 - e^{-\lambda_+t}}{(c_1^2 + c_2^2)(\varepsilon k + c_1^2 + c_2^2)}\begin{pmatrix} \varepsilon c_1(c_1f_1 + c_2f_2) + (c_1^2 + c_2^2)c_1g \\ \varepsilon c_2(c_1f_1 + c_2f_2) + (c_1^2 + c_2^2)c_2g \end{pmatrix} + \frac{e^{-kt}}{c_1^2 + c_2^2}\begin{pmatrix} c_2(c_2u_0 - c_1v_0) \\ c_1(c_1v_0 - c_2u_0) \end{pmatrix} + \frac{1 - e^{-kt}}{k(c_1^2 + c_2^2)}\begin{pmatrix} c_2(c_2f_1 - c_1f_2) \\ c_1(c_1f_2 - c_2f_1) \end{pmatrix}$ (3.16-167)

where $u_0$ and $v_0$ are (thus far) arbitrary, and we note that the physical decaying portion is exactly divergence-free and thus will make no contribution to the penalty pressure. We shall soon specialize to the case $c_1u_0 + c_2v_0 = g$ (a la the index 2 solution), but first let us show the penalty pressure from (3.16-158), and then study the entire result:

$P(t) = e^{-\lambda_+t}(c_1u_0 + c_2v_0 - g)/\varepsilon + (1 - e^{-\lambda_+t})\,\frac{c_1f_1 + c_2f_2 - kg}{\varepsilon k + c_1^2 + c_2^2}$ (3.16-168)

where we imagine $\varepsilon \to 0$. The first thing we observe is that the penalty transient is the only transient and that P(0) is very large unless the initial velocity is divergence-free, for which case P(0) = 0. But very shortly after this bad start, both velocity and pressure from the penalty approximation become quite good. After the very fast penalty transient is 'finished' ($e^{-\lambda_+t}/\varepsilon \ll 1$), the pressure is clearly the same as that from index 2, to $O(\varepsilon)$ of course. The corresponding penalty velocity, after the penalty transient, is the following:

$\begin{pmatrix} u \\ v \end{pmatrix} = \frac{1}{\varepsilon k + c_1^2 + c_2^2}\begin{pmatrix} \dfrac{\varepsilon c_1(c_1f_1 + c_2f_2)}{c_1^2 + c_2^2} + c_1g \\ \dfrac{\varepsilon c_2(c_1f_1 + c_2f_2)}{c_1^2 + c_2^2} + c_2g \end{pmatrix} + \frac{1 - e^{-kt}}{k(c_1^2 + c_2^2)}\begin{pmatrix} c_2(c_2f_1 - c_1f_2) \\ c_1(c_1f_2 - c_2f_1) \end{pmatrix} + \frac{e^{-kt}}{c_1^2 + c_2^2}\begin{pmatrix} c_2(c_2u_0 - c_1v_0) \\ c_1(c_1v_0 - c_2u_0) \end{pmatrix}$ (3.16-169)

which is $O(\varepsilon)$ away from the index 2 velocity. But for the more reasonable penalty simulation, we would start with a divergence-free (or close to divergence-free) velocity.
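Thus if we invoke $c_1u_0 + c_2v_0 = g$ in (3.16-167), we obtain, after some algebra, the pleasant result, $\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u \\ v \end{pmatrix}_{\text{index 2}} + \frac{\varepsilon P(t)}{c_1^2 + c_2^2}\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$, (3.16-170) where now $P(t) = (1 - e^{-\lambda_+t})\left(\frac{c_1f_1 + c_2f_2 - kg}{\varepsilon k + c_1^2 + c_2^2}\right)$, (3.16-171)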
which, while starting at zero, approximates (3.16-166) to $O(\varepsilon)$ after completion of the penalty transient, and completes our 'comparison.'

We conclude the penalty discussion by noting that in general the pressure is recovered from (3.16-158) via $P = (c_1u + c_2v - g)/\varepsilon$, and is different from that in (3.16-143) by $O(\varepsilon)$ for $t \gtrsim O(1/\lambda_+)$. Note too that the initial penalty pressure is never correct, the simplest example being P(0) = 0 when the IC is divergence-free. The pressure rapidly adjusts, and often can change its amplitude by a very large [$O(\varepsilon^{-1})$] amount, during the penalty transient, an example of which we have already discussed for the full Stokes equations in Section 3.13.2e.

[Exercise for the reader, perhaps important: Show that the penalty transient can be bypassed and an $L_2$-projection of any initially 'divergent' velocity field to the discretely divergence-free subspace simultaneously obtained in just a single BE time step by choosing $\Delta t$ such that $\lambda_+\Delta t \gg 1$. Show too that the corresponding pressure is really not a pressure at all, but $1/\Delta t$ times the potential function associated with the projection. The 'true' pressure is obtained at step 2, which need not have $\lambda_+\Delta t \gg 1$ (but could, as long as $\Delta t$ is still small enough for accuracy). Fortunately, as demonstrated by Gresho and Sani (1998), these results carry over intact to the full NS penalty ODE's, thus showing the 'proper' way to use the penalty method.]

f. Energetics

To conclude, we briefly study the 'kinetic energy,' $KE = \tfrac{1}{2}(u^2 + v^2)$, for the homogeneous DAE's ($f_1 = f_2 = g = 0$) and each of the above formulations:

1. Index 2. From (3.16-117) through (3.16-119) it is easily seen that $\frac{d}{dt}\,\tfrac{1}{2}(u^2 + v^2) + k_1u^2 + k_2v^2 = 0$; the KE is independent of the pressure and, since $k_1 > 0$ and $k_2 > 0$, it decays monotonically. This behavior properly simulates that of unforced Stokes flow.

2. Index 1.
From (3.16-117), (3.16-118) and (3.16-121) can be derived $\frac{d}{dt}KE + \left[k_1u^2 + k_2v^2 - (c_1u + c_2v)(c_1k_1u + c_2k_2v)/(c_1^2 + c_2^2)\right] = 0$, which can be shown to also decay, but at the wrong rate unless $c_1u_0 + c_2v_0 = 0$.

3. Index 0. Here we combine (3.16-117), (3.16-118) with the second integral of $c_1\ddot u + c_2\ddot v = 0$ (i.e., $c_1u + c_2v = \alpha + \beta t$, where $\alpha$ and $\beta$ are constants that depend on the IC's) to obtain $\frac{d}{dt}KE + k_1u^2 + k_2v^2 + (\alpha + \beta t)P = 0$, where, from (3.16-122), $P = \gamma - (c_1k_1u + c_2k_2v)/(c_1^2 + c_2^2)$, where $\gamma$ is another constant. Since it is also true (and not hard to show) that P(t) contains, in the general case (arbitrary IC's), a term linear in t, it follows that KE will contain terms up through $t^3$; it could become very large in magnitude, and be either positive or negative. Again, only if the proper IC's are chosen (divergence-free and $P_0$ satisfies the PPE at t = 0) will $\alpha$, $\beta$, and $\gamma$ vanish and the index 0 solution agree with that of index 2.

4. Penalty. Equations (3.16-158) yield $\frac{d}{dt}KE + k_1u^2 + k_2v^2 + \varepsilon P^2 = 0$;
the KE will decay even faster than that for index 2; the penalty ODE's are (slightly) 'over-stable.'

g. Numerical integration

Another interesting and useful exercise that is (with the help of A.C. Hindmarsh) not too difficult to perform with this three-equation DAE system is to represent the numerical solution exactly, in terms of the eigenvalues and eigenvectors (and generalized eigenvectors), when a particular ODE time integration method is selected. Thus, in this section we shall present some closed form solutions when FE, BE, TR, and shortened TR (STR) are employed on both the index 2 and index 1 DAE's. But, since little more is to be learned for the inhomogeneous case with lots more effort, we shall present and discuss only the homogeneous case ($f_1 = f_2 = 0$, $g = 0$). Finally, as the details behind all eight methods would probably be more burdensome than interesting, we shall only present 'details' for two of them (FE and TR), leaving the rest as exercises, although we will present final results for all of them.

All begin with (3.16-123) with F(t) = 0. But we shall re-define both A and B to be method-dependent, so that each can be expressed in the form $By_{n+1} = Ay_n$. (3.16-172) We shall let the reader determine A and B (the easy part) for all but one method; we show them for one method to 'set the stage': if FE is applied to the index 2 DAE's of (3.16-123), we easily obtain, with $y_n = (u_n, v_n, P_{n-1})^T$, $B = \begin{pmatrix} 1 & 0 & c_1\Delta t \\ 0 & 1 & c_2\Delta t \\ c_1\Delta t & c_2\Delta t & 0 \end{pmatrix}$, (3.16-173) and $A = \begin{pmatrix} 1 - k_1\Delta t & 0 & 0 \\ 0 & 1 - k_2\Delta t & 0 \\ 0 & 0 & 0 \end{pmatrix}$, (3.16-174) where we have multiplied the 'continuity' equation (3.16-119) by $\Delta t$ to obtain a useful symmetry. [For all of the remaining cases, $y_n = (u_n, v_n, P_n)^T$.]
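To make the FE form (3.16-172) concrete, the sketch below assembles B and A of (3.16-173) and (3.16-174) and advances one step by solving the 3x3 system directly. It is an illustration only: the function names are hypothetical, and a small Gaussian elimination is used in place of any particular library routine.

```python
def fe_matrices(k1, k2, c1, c2, dt):
    """B and A of (3.16-173), (3.16-174) for FE on the index 2 model problem."""
    B = [[1.0, 0.0, c1 * dt],
         [0.0, 1.0, c2 * dt],
         [c1 * dt, c2 * dt, 0.0]]
    A = [[1.0 - k1 * dt, 0.0, 0.0],
         [0.0, 1.0 - k2 * dt, 0.0],
         [0.0, 0.0, 0.0]]
    return B, A


def solve3(M, rhs):
    """Solve M x = rhs (3x3) by Gaussian elimination with partial pivoting."""
    n = 3
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[piv][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[r][j] -= f * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def fe_step(y, k1, k2, c1, c2, dt):
    """Advance y_n = (u_n, v_n, P_{n-1}) to y_{n+1} = (u_{n+1}, v_{n+1}, P_n)."""
    B, A = fe_matrices(k1, k2, c1, c2, dt)
    Ay = [sum(A[i][j] * y[j] for j in range(3)) for i in range(3)]
    return solve3(B, Ay)
```

Because the third column of A is zero, the value supplied for $P_{n-1}$ has no effect on the result, and the new velocity satisfies $c_1u_{n+1} + c_2v_{n+1} = 0$ whatever the starting velocity was.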
Returning now to the general case, the next matrix of interest, and the one that will lead to the analytical solution, we shall call the decay matrix, D: $D = B^{-1}A$ (3.16-175) because, once we find its spectrum and corresponding eigenvectors (and generalized eigenvectors when necessary/appropriate), say $\{\lambda_j, X_j\}$, we can obtain the analytical (and decaying, when solved properly) solution by first expressing $y_0$ in terms of the $\{X_j\}$, $y_0 = \sum_{j=1}^{3} a_jX_j$ (3.16-176) which gives $a_1$, $a_2$, and $a_3$, and then using (3.16-172) to obtain $y_n = D^ny_0 = \sum_{j=1}^{3} a_jD^nX_j$; (3.16-177)
i.e., the solution is expressible in powers of the decay matrix operating on the basis vectors. If we were dealing with simple ODE's, the rest of the analysis would also be simple, because then $DX_j = \lambda_jX_j$ (3.16-178) easily leads to the final solution $y_n = \sum_{j=1}^{3} a_j\lambda_j^nX_j$. (3.16-179) But we are dealing with DAE's, not ODE's, and the procedure is not quite so simple because: (i) D may be singular (FE, BE and STR on index 2; FE and BE on index 1); (ii) D may be defective (FE, BE, and TR on index 2); (iii) both may be true (FE and BE on index 2); (iv) the IC's cannot (or, at least, should not) be freely chosen, even though (3.16-176) seems to suggest otherwise (more later).

In our small three-equation system, we will always obtain a (the) physically-decaying mode, whose eigenvalue and eigenvector approximate those of the continuous DAE solution; namely, (3.16-126) and (3.16-127). Corresponding, however, to the two infinite eigenvalues for index 2 and one infinite and one zero eigenvalue for index 1, is a method-dependent range of behavior, which is part of the reason that numerical time integration of DAE's is not quite so straightforward. In our model problem, those cases with defective matrices (repeated eigenvalues with fewer than the desired number of linearly independent eigenvectors) are of the simplest type: one repeated eigenvalue (a double root) with but one eigenvector. So this is the (only) case that we will present, and we shall let $(\lambda_1, X_1)$ denote the single corresponding pair, even though $\lambda_2 = \lambda_1$ (i.e., $\lambda_1$ occurs twice). The (Jordan) theory of generalized eigenvectors then generates the second independent vector, $X_2$, according to $DX_1 = \lambda_1X_1$ (3.16-180) and $DX_2 = \lambda_1X_2 + X_1$. (3.16-181) This is the required generalization: $X_2$ is the generalized eigenvector corresponding to the repeated eigenvalue $\lambda_1$.
In this case, it is not hard to obtain the proper generalization of (3.16-177) through (3.16-179), starting with (3.16-176): $y_n = D^ny_0 = \sum_{j=1}^{3} a_jD^nX_j = a_1\lambda_1^nX_1 + a_2(\lambda_1^nX_2 + n\lambda_1^{n-1}X_1) + a_3\lambda_3^nX_3$ (3.16-182) is the exact solution to the discretized version of the DAE's when D is defective of degree 1. (It will turn out that $\lambda_1 = 0$ for the two Euler methods and $\lambda_1 = -1$ for TR.)
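The closed form (3.16-182) can be compared with brute-force repeated multiplication. The sketch below is an illustration under a simplifying assumption: D is taken already in Jordan form, so that $X_1$, $X_2$, $X_3$ are the unit vectors; the function names are hypothetical.

```python
def jordan_decay_matrix(lam1, lam3):
    """A 3x3 matrix with D X1 = lam1 X1, D X2 = lam1 X2 + X1, D X3 = lam3 X3,
    where X1, X2, X3 are the unit vectors (an illustrative choice of basis)."""
    return [[lam1, 1.0, 0.0],
            [0.0, lam1, 0.0],
            [0.0, 0.0, lam3]]


def apply_n_times(D, y, n):
    """Compute D^n y by n successive matrix-vector products."""
    for _ in range(n):
        y = [sum(D[i][j] * y[j] for j in range(3)) for i in range(3)]
    return y


def closed_form(a, lam1, lam3, n):
    """Right-hand side of (3.16-182) in the unit-vector basis."""
    # n * lam1**(n-1) is taken as 0 for n = 0, which also avoids 0**(-1).
    cross = n * lam1 ** (n - 1) if n > 0 else 0.0
    return [a[0] * lam1 ** n + a[1] * cross,   # X1 component
            a[1] * lam1 ** n,                  # X2 component
            a[2] * lam3 ** n]                  # X3 component
```

With `lam1 = -1` (the TR case) the $X_1$ component grows like $n(-1)^n$, whereas with `lam1 = 0` (the Euler cases) both defective modes vanish after two steps.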
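We are now ready to apply this theory to the several chosen time-marching methods (with fixed $\Delta t$).

o Index 2. As mentioned earlier, we shall show D for only FE and TR. Thus,

$D_{FE} = B_{FE}^{-1}A_{FE} = \frac{1}{L}\begin{pmatrix} c_2^2(1 - k_1\Delta t) & -c_1c_2(1 - k_2\Delta t) & 0 \\ -c_1c_2(1 - k_1\Delta t) & c_1^2(1 - k_2\Delta t) & 0 \\ c_1(1 - k_1\Delta t)/\Delta t & c_2(1 - k_2\Delta t)/\Delta t & 0 \end{pmatrix}$ (3.16-183)

where $L = c_1^2 + c_2^2$ ('Laplacian'); and

$D_{TR} = \frac{1}{L(1 + \lambda_0\Delta t/2)}\begin{pmatrix} c_2^2 - c_1^2 - L\lambda_0\Delta t/2 & -2c_1c_2 & 0 \\ -2c_1c_2 & c_1^2 - c_2^2 - L\lambda_0\Delta t/2 & 0 \\ 4(1 + k_2\Delta t/2)c_1/\Delta t & 4(1 + k_1\Delta t/2)c_2/\Delta t & -L(1 + \lambda_0\Delta t/2) \end{pmatrix}$ (3.16-184)

where the ominous factor of $1/\Delta t$ is related to the possibility of employing inconsistent IC's, $\lambda_0$ is the viscous decay rate, given by (3.16-126), and we remark that only TR and STR have non-zero entries in the third column of D.

Next, we present $\{\lambda_j, X_j\}$ for the four selected methods, all obtained by solving (3.16-180), (3.16-181) and, for j = 3, (3.16-178). In tabular form with $\Delta k = k_2 - k_1$, we obtain:

FE: $X_1 = (0, 0, 1)^T$ ($\lambda_1 = 0$); $X_2 = \left(\frac{c_1\Delta t}{1 - k_1\Delta t}, \frac{c_2\Delta t}{1 - k_2\Delta t}, 0\right)^T$ ($\lambda_2 = \lambda_1 = 0$); $X_3 = \left(c_2, -c_1, \frac{c_1c_2\Delta k}{L(1 - \lambda_0\Delta t)}\right)^T$ ($\lambda_3 = 1 - \lambda_0\Delta t$).

BE: $X_1 = (0, 0, 1)^T$ ($\lambda_1 = 0$); $X_2 = (c_1\Delta t, c_2\Delta t, 0)^T$ ($\lambda_2 = \lambda_1 = 0$); $X_3 = (c_2, -c_1, c_1c_2\Delta k/L)^T$ ($\lambda_3 = 1/(1 + \lambda_0\Delta t)$).

TR: $X_1 = (0, 0, 1)^T$ ($\lambda_1 = -1$); $X_2 = (c_1\Delta t/4, c_2\Delta t/4, 0)^T$ ($\lambda_2 = \lambda_1 = -1$); $X_3 = (c_2, -c_1, c_1c_2\Delta k/L)^T$ ($\lambda_3 = (1 - \lambda_0\Delta t/2)/(1 + \lambda_0\Delta t/2)$).

STR: $X_1 = (0, 0, 1)^T$ ($\lambda_1 = -1$); $X_2 = \left((1 - k_2\Delta t/2)c_1\Delta t/2,\ (1 - k_1\Delta t/2)c_2\Delta t/2,\ z\right)^T$ ($\lambda_2 = 0$); ($\lambda_3$, $X_3$) as for TR,

where $z = \frac{(1 - k_2\Delta t/2)c_1^2\kappa_1 + (1 - k_1\Delta t/2)c_2^2\kappa_2}{L(1 + \lambda_0\Delta t/2)}$ (3.16-185)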
SOLUTION METHODS FOR THE SEMI-DISCRETIZED 695 with K, = 1 + AkAt/2 - k{k2At2/4 (3.16-186) and K2 = 1 - AkAt/2 - k{k2At2/A (3.16-187) In addition to noting that X\ is common to all methods and that STR is rather more 'complicated' than the others, we offer the following Remarks: (1) For FE, BE, and TR, X2 is a generalized eigenvector. The fact that X2 'vanishes' as At —> 0 for all methods is related to the fact that its sole purpose is to force the solution to satisfy any non-div-free IC, a statement that will be more clear when we present the full solutions. (2) Xj = 0 in the discrete equations corresponds to Xj = oo in the DAE's; both are 'active' only initially, and serve to 'adjust' any incompatible initial data. (3) Both BE and FE will quickly (n ^ 1) cause satisfaction of the div-free constraint, c\un + c2vn = 0, and of the (implied) PPE, LPn + c\k\u„ + c2k2vn = 0, per (3.16-121). (4) TR, with X\ = X2 = — 1, will not change the data; rather, it will generate wiggle signals for incompatible IC's. (5) STR will quickly enforce the continuity equation (via X2 = 0), but not the PPE. (6) The isolated singularities for FE at particular values of At are easily explained, (i) At —► \/k\ or \/k2 causes a significant change in D; either the first or second column becomes zero [cf. (3.16-174)] thus necessitating a special case analysis—with the result that X = (0, 0, — c\c\IXqL}) with three (new) linearly independent eigenvectors, (ii) For At = \/X0, D has the spectrum (0, 0, 0), and the eigenspace is completed via the construction of two generalized eigenvectors; i.e., DX\ = 0, DX2 = Xx, and DX3 = X2. Similarly, STR with At = 2/X0 becomes defective (A.3 = X2 = 0) and requires separate analysis with another generalized eigenvector. Finally, we are ready to present the four analytical solutions in terms of the {Xj}, {Xj}, and {aj}. 1. FE. A three stage presentation is required for FE; i.e., it is most convenient to consider separately the cases n = 0, n = 1, and n ^ 2. 
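Remarks (3) and (4) above can be observed directly by stepping the homogeneous index 2 model problem from an incompatible initial velocity. The sketch below is illustrative and rests on one stated assumption: TR is taken to average the constraint as well as the momentum equations, so that $c_1u_{n+1} + c_2v_{n+1} = -(c_1u_n + c_2v_n)$, which is the behavior that $\lambda_1 = \lambda_2 = -1$ implies. The function names are hypothetical.

```python
def be_step(u, v, k1, k2, c1, c2, dt):
    """One backward-Euler step; the new pressure enforces c1*u + c2*v = 0."""
    a1, a2 = 1.0 + k1 * dt, 1.0 + k2 * dt
    p_new = (c1 * u / a1 + c2 * v / a2) / (dt * (c1 * c1 / a1 + c2 * c2 / a2))
    return (u - c1 * dt * p_new) / a1, (v - c2 * dt * p_new) / a2, p_new


def tr_step(u, v, p, k1, k2, c1, c2, dt):
    """One trapezoid-rule step with the constraint averaged over the step
    (an assumption; see the lead-in)."""
    a1, a2 = 1.0 + k1 * dt / 2.0, 1.0 + k2 * dt / 2.0
    b1, b2 = 1.0 - k1 * dt / 2.0, 1.0 - k2 * dt / 2.0
    den = c1 * c1 * a2 + c2 * c2 * a1
    q = 2.0 * (c1 * a2 * u + c2 * a1 * v) / den    # q = (dt/2) * (p_new + p)
    return (b1 * u - c1 * q) / a1, (b2 * v - c2 * q) / a2, 2.0 * q / dt - p


def divergence_history(step, u, v, p, c1, c2, nsteps):
    """Record c1*u_n + c2*v_n for n = 0..nsteps using the supplied stepper."""
    hist = [c1 * u + c2 * v]
    for _ in range(nsteps):
        u, v, p = step(u, v, p)
        hist.append(c1 * u + c2 * v)
    return hist
```

For example, `divergence_history(lambda u, v, p: be_step(u, v, 1.0, 2.0, 3.0, 4.0, 0.1), 1.0, 2.0, 0.0, 3.0, 4.0, 5)` drops to round-off after the first step, whereas the TR stepper returns the initial divergence with alternating sign.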
(i) n = 0: $(u_0, v_0, P_{-1})^T = a_1X_1 + a_2X_2 + a_3X_3$ (3.16-188) gives $a_2$ and $a_3$, and we note that both $P_{-1}$ and $a_1$ are irrelevant: $a_2 = \frac{(c_1u_0 + c_2v_0)(1 - k_1\Delta t)(1 - k_2\Delta t)}{\Delta t\,L(1 - \lambda_0\Delta t)}$, $a_3 = \frac{1}{c_2}\left[u_0 - \frac{\Delta t\,c_1a_2}{1 - k_1\Delta t}\right]$.

(ii) n = 1: $(u_1, v_1, P_0)^T = a_2X_1 + a_3(1 - \lambda_0\Delta t)X_3$ (3.16-189) which used $DX_2 = X_1$, and gives $P_0$ (and, of course, $u_1$ and $v_1$), which initial pressure satisfies $LP_0 + k_1c_1u_0 + k_2c_2v_0 = (c_1u_0 + c_2v_0)/\Delta t$.

(iii) $n \ge 2$: $(u_n, v_n, P_{n-1})^T = a_3(1 - \lambda_0\Delta t)^nX_3$ (3.16-190) which satisfies both $c_1u_n + c_2v_n = 0$ and the PPE, $LP_n + k_1c_1u_n + k_2c_2v_n = 0$.

2. BE. Again a three stage approach seems most useful, even though now the pressure index agrees with that for velocity.

(i) n = 0: $(u_0, v_0, P_0)^T = a_1X_1 + a_2X_2 + a_3X_3$ (3.16-191) gives the three a's: $a_1 = P_0 - c_1c_2\Delta k\,a_3/L$, $a_2 = (c_1u_0 + c_2v_0)/(\Delta t\,L)$, $a_3 = (u_0 - \Delta t\,c_1a_2)/c_2$.

(ii) n = 1: $(u_1, v_1, P_1)^T = a_2X_1 + \frac{a_3}{1 + \lambda_0\Delta t}X_3$ (3.16-192) using $DX_2 = X_1$, which satisfies $c_1u_1 + c_2v_1 = 0$ and gives a $P_1$ that is independent of $P_0$ and satisfies $LP_1 + k_1c_1u_1 + k_2c_2v_1 = (c_1u_0 + c_2v_0)/\Delta t$.

(iii) $n \ge 2$: $(u_n, v_n, P_n)^T = \frac{a_3}{(1 + \lambda_0\Delta t)^n}X_3$ (3.16-193) which satisfies $c_1u_n + c_2v_n = 0$ and the PPE.

3. TR. Here a 'single-stage' presentation 'works', but a different set of 'complications' arises. For $n \ge 0$, $(u_n, v_n, P_n)^T = a_1(-1)^nX_1 + a_2(-1)^n(X_2 - nX_1) + a_3\left(\frac{1 - \lambda_0\Delta t/2}{1 + \lambda_0\Delta t/2}\right)^nX_3$ (3.16-194) where we have used the fact that $DX_2 = X_1 - X_2$. Again, n = 0 gives the three a's: $a_1 = P_0 - c_1c_2\Delta k\,a_3/L$,
$a_2 = 4(c_1u_0 + c_2v_0)/(\Delta t\,L)$, $a_3 = (u_0 - \Delta t\,c_1a_2/4)/c_2$, and we have

(i) $c_1u_n + c_2v_n = (-1)^n(c_1u_0 + c_2v_0)$,

(ii) $LP_n + k_1c_1u_n + k_2c_2v_n = (-1)^n\left[La_1 - \left(n - \frac{(c_1^2k_1 + c_2^2k_2)\Delta t}{4L}\right)\frac{c_1u_0 + c_2v_0}{\Delta t/4}\right]$,

(iii) $L\frac{P_n + P_{n+1}}{2} + k_1c_1\frac{u_n + u_{n+1}}{2} + k_2c_2\frac{v_n + v_{n+1}}{2} = (-1)^n\,\frac{c_1u_0 + c_2v_0}{\Delta t/2}$.

4. STR. For this last case, a two stage presentation seems best.

(i) n = 0: $(u_0, v_0, P_0)^T = a_1X_1 + a_2X_2 + a_3X_3$ (3.16-195) gives the three a's: $a_1 = P_0 - c_1c_2\Delta k\,a_3/L - za_2$, $a_2 = (c_1u_0 + c_2v_0)\Big/\left[\frac{\Delta t}{2}L(1 - \lambda_0\Delta t/2)\right]$, $a_3 = \frac{1}{c_2}\left[u_0 - c_1(1 - k_2\Delta t/2)\frac{\Delta t}{2}a_2\right]$.

(ii) $n \ge 1$: $(u_n, v_n, P_n)^T = a_1(-1)^nX_1 + a_3\left(\frac{1 - \lambda_0\Delta t/2}{1 + \lambda_0\Delta t/2}\right)^nX_3$, (3.16-196) which solution satisfies $c_1u_n + c_2v_n = 0$, $LP_n + k_1c_1u_n + k_2c_2v_n = (-1)^nLa_1$, and $L\frac{P_n + P_{n+1}}{2} + k_1c_1\frac{u_n + u_{n+1}}{2} + k_2c_2\frac{v_n + v_{n+1}}{2} = 0$.

Remarks: (1) Only the first-order (Euler) methods mimic the DAE solution when the initial conditions are not compatible; i.e., they change the 'data' at step 1 to satisfy the algebraic constraints. (2) In all of the cases, if and only if $u_0$ and $v_0$ are selected properly ($c_1u_0 + c_2v_0 = 0$) will $a_2$ be zero. If, in addition (and only if) $P_0$ for the three implicit methods is selected properly ($LP_0 + k_1c_1u_0 + k_2c_2v_0 = 0$), will $a_1$ also be zero. (3) For the general (ill-posed) case, three of the four methods (FE, BE, and STR) will, for $\Delta t \to 0$, generate at the first time step the same velocity solution as do
the DAE's in (3.16-147), corresponding to an $L_2$-projection to the divergence-free subspace. TR, however, will 'preserve the div', up to the $(-1)^n$ wiggle-signal factor. The corresponding three pressures ($P_0$ for FE, $P_1$ for BE and STR), multiplied by $\Delta t$, are actually the Lagrange multipliers associated with the projection. (4) For the ill-posed case ($c_1u_0 + c_2v_0 \ne 0$), TR will send out a strong wiggle-signal in the pressure, the $n(-1)^n$ term a la (3.16-36), even though the average pressure, $(P_n + P_{n+1})/2$, does not grow, a la STR's pressure (whose average P still satisfies the PPE). (5) Even if the initial velocity is div-free, the pressures from TR and STR will still oscillate unless $P_0$ is compatible (making $a_1 = 0$).

o Index 1. As for index 2, we begin by displaying the decay matrix for FE and TR:

$D_{FE} = \frac{1}{L}\begin{pmatrix} L - c_2^2k_1\Delta t & c_1c_2k_2\Delta t & 0 \\ c_1c_2k_1\Delta t & L - c_1^2k_2\Delta t & 0 \\ -k_1c_1 & -k_2c_2 & 0 \end{pmatrix}$ (3.16-197)

and

$D_{TR} = \frac{1}{L(1 + \lambda_0\Delta t/2)}\begin{pmatrix} L + \frac{\Delta t}{2}(c_1^2k_2 - c_2^2k_1) & c_1c_2k_2\Delta t & 0 \\ c_1c_2k_1\Delta t & L + \frac{\Delta t}{2}(c_2^2k_1 - c_1^2k_2) & 0 \\ -k_1c_1(2 + k_2\Delta t) & -k_2c_2(2 + k_1\Delta t) & -L(1 + \lambda_0\Delta t/2) \end{pmatrix}$ (3.16-198)

The results of the eigensystem analysis are presented next for FE, BE, and TR (leaving the more complex STR as an exercise). It turns out that all three have (almost) the same eigenvectors, and no generalized eigenvectors are required since the eigenvalues are distinct. (Lower index systems are always simpler than higher ones.) The common eigenvectors are as follows: $X_1 = (0, 0, 1)^T$, $X_2 = (c_1k_2, c_2k_1, -k_1k_2)^T$, $X_3 = (c_2, -c_1, \rho c_1c_2\Delta k/L)^T$, (3.16-199) where $\rho = 1$ for BE and TR, and $\rho = 1/(1 - \lambda_0\Delta t)$ for FE, owing to the index 'shift' between velocity and pressure. Note that (up to the $\rho$-factor) these are also the eigenvectors of the continuous DAE problem derived in Section 3.16.2c. The eigenvalues of D are as follows:

Method    $\lambda_1$    $\lambda_2$    $\lambda_3$
FE        0      1      $1 - \lambda_0\Delta t$
BE        0      1      $1/(1 + \lambda_0\Delta t)$
TR        -1     1      $(1 - \lambda_0\Delta t/2)/(1 + \lambda_0\Delta t/2)$

where we point out that the 'job' of $\lambda_2$ is to preserve the div; it corresponds to the zero eigenvalue in the continuous DAE system.
The zero eigenvalues for the Euler methods correspond to the infinite eigenvalue for the index 1 DAE's—and their job is to enforce the PPE for t > 0 even if it is not initially satisfied.
Finally, we present the analytical solutions, which, like the eigenvectors, are very similar.

1. FE. (i) n = 0: $(u_0, v_0, P_{-1})^T = a_1X_1 + a_2X_2 + a_3X_3$ (3.16-200) gives $a_2$ and $a_3$, and again we note that $a_1$ and $P_{-1}$ are irrelevant: $a_2 = (c_1u_0 + c_2v_0)/(\lambda_0L)$, $a_3 = (u_0 - c_1k_2a_2)/c_2$. (ii) $n \ge 1$: $(u_n, v_n, P_{n-1})^T = a_2X_2 + a_3(1 - \lambda_0\Delta t)^nX_3$ (3.16-201) which, for n = 1, gives $P_0$ satisfying the PPE. Also, for all $n \ge 0$ the solution satisfies the PPE and $c_1u_n + c_2v_n = c_1u_0 + c_2v_0$; (3.16-202) the index 1 solution preserves the initial divergence, as noted many times earlier.

2. BE. For all $n \ge 0$, $(u_n, v_n, P_n)^T = a_1(0)^nX_1 + a_2X_2 + \frac{a_3}{(1 + \lambda_0\Delta t)^n}X_3$, (3.16-203) where n = 0 gives, using $(0)^0 = 1$, the a's: $a_1 = P_0 + k_1k_2a_2 - c_1c_2\Delta k\,a_3/L$, with $a_2$ and $a_3$ the same as those above for FE. This solution also satisfies (3.16-202) for all n and satisfies the PPE for n > 0. (The value of $P_0$ for n = 0 is actually quite irrelevant.)

3. TR. For $n \ge 0$, the solution is $(u_n, v_n, P_n)^T = a_1(-1)^nX_1 + a_2X_2 + a_3\left(\frac{1 - \lambda_0\Delta t/2}{1 + \lambda_0\Delta t/2}\right)^nX_3$ (3.16-204) with n = 0 giving the same three a's as for BE. Also, the divergence is preserved and the pressure satisfies $LP_n + k_1c_1u_n + k_2c_2v_n = La_1(-1)^n$; i.e., it 'rings' unless $a_1 = 0$, for which case it satisfies the PPE for all n, like the two Euler schemes, even when the solution is not div-free.
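The index 1 behavior summarized above (preserved initial divergence, and a non-decaying $a_2X_2$ component) can be seen in a few lines. The sketch below is illustrative, with hypothetical function names: it applies FE to the homogeneous momentum equations with the pressure taken from the homogeneous PPE at each step, and compares the long-time state with $a_2(c_1k_2, c_2k_1)$, using $a_2 = (c_1u_0 + c_2v_0)/(\lambda_0L)$.

```python
def fe_index1(u, v, k1, k2, c1, c2, dt, nsteps):
    """Forward Euler on the homogeneous index 1 model problem."""
    L = c1 * c1 + c2 * c2
    for _ in range(nsteps):
        p = -(c1 * k1 * u + c2 * k2 * v) / L       # homogeneous PPE
        u, v = u - dt * (k1 * u + c1 * p), v - dt * (k2 * v + c2 * p)
    return u, v


def spurious_steady_state(u0, v0, k1, k2, c1, c2):
    """Velocity part of a2*X2, the state that remains as n grows."""
    # lambda_0 * L = c1^2 k2 + c2^2 k1
    a2 = (c1 * u0 + c2 * v0) / (c1 * c1 * k2 + c2 * c2 * k1)
    return a2 * c1 * k2, a2 * c2 * k1
```

Starting from a divergence-free velocity makes `a2` zero, and the computed solution then decays to zero as the index 2 solution does.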
Final remarks: (1) If (and only if) the two compatibility conditions are satisfied by the initial data, then each ODE method will produce identical solutions for index 1 and index 2 (PPE) formulations, and only the physical eigenvalue, $\lambda_3$, which approximates the decay governed by $\lambda_0$, is needed. (2) Noting that the correct final ($t \to \infty$) solution is zero makes it clear that the index 1 system will generally attain a spurious non-zero final state; only $a_2 = 0$ (and $a_1 = 0$ for TR) will lead to the correct steady state (and transient state, of course).

[Exercise for the reader: Obtain the projected ODE's by eliminating P, which introduces the projection matrix (see Appendix 3) $\mathcal{P} = \frac{1}{L}\begin{pmatrix} c_2^2 & -c_1c_2 \\ -c_1c_2 & c_1^2 \end{pmatrix}$. Show that this ostensibly simpler approach (two equations vs three) is equivalent (only) to the index 1/PPE approach in that it will preserve the initial divergence.] [We stayed with the full system for yet another reason: in the next section we generalize this approach from a (2 + 1)-system to an (n + m)-system.]

As a concluding remark, we believe that a clear understanding of the model problem (plus the next section, 3.16.3) and how the various ODE methods 'solve' the DAE's is bound to be helpful as one applies, or just contemplates applying, the same method to the full DAE's that approximate the NSE's.

h. Final exercise

To close this portion of the extended introductory discussion of DAE's, let us show how easy it is to go astray by not being sufficiently careful. Since (3.16-121) was obtained from (3.16-119), let us see what happens if we use this pair to eliminate P from the index 1 system. Inserting P from (3.16-121) into (3.16-117), (3.16-118) yields $\dot u + \frac{k_1c_2^2u - k_2c_1c_2v}{c_1^2 + c_2^2} = [c_2(c_2f_1 - c_1f_2) + c_1\dot g]/(c_1^2 + c_2^2)$ and $\dot v + \frac{k_2c_1^2v - k_1c_1c_2u}{c_1^2 + c_2^2} = [c_1(c_1f_2 - c_2f_1) + c_2\dot g]/(c_1^2 + c_2^2)$. (3.16-205) Invoking (3.16-119) in these equations leads to the simplified set, $\dot u + \lambda u = [c_2(c_2f_1 - c_1f_2) + c_1(\dot g + k_2g)]/(c_1^2 + c_2^2)$ and $\dot v + \lambda v = [c_1(c_1f_2 - c_2f_1) + c_2(\dot g + k_1g)]/(c_1^2 + c_2^2)$, where $\lambda = (c_1^2k_2 + c_2^2k_1)/(c_1^2 + c_2^2)$ (3.16-206) is the physical decay rate. This result leads to two interesting observations:
1. u and v have become uncoupled.
2. The divergence, $c_1u + c_2v - g \equiv w(t)$, satisfies $dw/dt + \lambda w = 0$, and thus $w(t) = w_0e^{-\lambda t}$; any initial divergence will decay toward zero at the viscous decay rate.
Both of these results are spurious in general: u and v are correct as given above only if $w_0 = 0$, and the correct general solution of the index 1 DAE has $w(t) = w_0$ with no decay. The error was caused by assuming the validity of (3.16-119) 'just because' it was used to obtain (3.16-121). One must be very careful about 'going backwards.'

3.16.3 Analytical Solution of the Stokes Equations

a. Introduction

In a continuation of our diversion into the mathematics of DAE's, we generalize here from the three-equation model problem to the general n + m equation system describing Stokes flow. It is 'analytical' only in the sense that a closed-form solution in terms of the appropriate eigenvectors is developed; in no sense are these eigenvectors 'analytical.' It is useful only in that it may provide (as did the previous section, hopefully) a deeper insight into some of the subtleties/difficulties of incompressible flow, and is similar in spirit to the eigenvector expansion presented earlier, near the end of Section 3.13.2d, for the steady Stokes equations. The details behind the summary of results to be presented below are too many to be useful, we believe, and are thus left as exercises for the reader. (But see Malkus, 1981, for some guidance.) Here we just present 'final' results, for both index 1 and index 2, and point out their similarity with the model problem solutions.

b. Index 2

We start with $M\dot u + Ku + CP = f(t)$, $u(0) = u_0$, (3.16-207) $C^Tu = g(t)$, $C^Tu_0 = g(0)$, (3.16-208) where u is an n-vector and P is an m-vector; or, expressed in singular ODE form, $B\dot y + Ay = F(t)$, $y(0) = y_0$, (3.16-209) where $y = \begin{pmatrix} u \\ P \end{pmatrix}$, $F = \begin{pmatrix} f \\ g \end{pmatrix}$, $B = \begin{pmatrix} M & 0 \\ 0 & 0 \end{pmatrix}$ is singular, and $A = \begin{pmatrix} K & C \\ C^T & 0 \end{pmatrix}$ is not (we preclude pressure modes for 'convenience' until the end). For $y_0 = (u_0, P_0)^T$, simply take $P_0 = 0$ because any IC's attempted to be imposed on P will be ignored; this applies also to the index 1 formulation, considered next.
(It does not apply to the index 0 formulation, which we do not consider anyway.) In order to obtain our 'analytical' solution, we begin as we did with the three-equation model problem of the previous section; i.e., we first seek the homogeneous ($f = g = 0$) solution in the form $y = x e^{-\lambda t}$ to obtain the following generalized symmetric $(n + m)$-dimensional eigenproblem [$x^T = (v^T, q^T)$]:
$$K v_j + C q_j = \lambda_j M v_j, \tag{3.16-210}$$
$$C^T v_j = 0, \tag{3.16-211}$$
or, equivalently, à la (3.16-124),
$$A x_i = \lambda_i B x_i. \tag{3.16-212}$$
This eigenproblem, which turns out to be defective (there are repeated eigenvalues and fewer than $n + m$ independent eigenvectors), was 'solved' by Malkus (1981), who called it the 'natural modes' eigenproblem and who also did consider pressure modes (which he called 'ill-disposed' modes); results:
1. There are only $n - m$ finite (and positive) eigenvalues, and the associated (divergence-free and linearly independent) eigenvectors $(v_j^T, q_j^T)^T$ are $M$-orthogonal in the velocity part (and we take $v_i^T M v_i = 1$; i.e., we presume that they have been normalized). That these are positive follows easily from (3.16-210) and (3.16-211): $v_j^T K v_j + v_j^T C q_j = \lambda_j v_j^T M v_j$ and $v_j^T C q_j = q_j^T C^T v_j = 0$; since both $K$ and $M$ are SPD, $\lambda_j = v_j^T K v_j / v_j^T M v_j > 0$.
2. There are $m$ repeated infinite eigenvalues with a corresponding linearly independent set of $Q$-orthogonal eigenvectors in the pressure only [$Q$ is the pressure mass matrix, and (3.16-212) becomes $B x_j = 0$]:
$$\begin{pmatrix} v_j \\ q_j \end{pmatrix} = \begin{pmatrix} 0 \\ \bar q_j \end{pmatrix}, \tag{3.16-213}$$
where the $\{\bar q_j\}$ can be obtained from the subsidiary (and smaller, dimension $m$) eigenproblem
$$(C^T M^{-1} C)\,\bar q_j = \nu_j Q \bar q_j, \tag{3.16-214}$$
called the 'Adjoint LBB eigenproblem' by Malkus, in which the $m$ eigenvalues $\{\nu_j\}$ are positive (and we take $\bar q_j^T Q \bar q_j = 1$).
3. The remaining $m$ basis vectors (also with $\lambda = \infty$) are generalized eigenvectors [and therefore need not/do not satisfy (3.16-212)] in the velocity only (and can be obtained by the application of Jordan theory to the inverse eigenproblem, $B x_j = \mu_j A x_j$, where $\lambda_j = 1/\mu_j$), one for each $\bar q_j$, and are given by [see (3.16-131) for guidance]
$$\begin{pmatrix} v_j \\ q_j \end{pmatrix} = \begin{pmatrix} M^{-1} C \bar q_j \\ 0 \end{pmatrix}. \tag{3.16-215}$$
These vectors are not divergence-free ($C^T v_j = C^T M^{-1} C \bar q_j = \nu_j Q \bar q_j$), nor are they required to be [see (3.16-131) and (3.16-132) and below] and also 'belong to' the $m$ infinite values of $\lambda_j$. They also correspond to 'gradients,' and are annihilated by the projection matrix (see Appendix 3), $P_0 = I - M^{-1} C (C^T M^{-1} C)^{-1} C^T$, because they are ($M$-) orthogonal to divergence-free vectors.
In fact, they are therefore regarded as curl-free ('discretely irrotational') vectors (Griffiths, 1996).
Remarks: (1) There is an error in the first sentence of Theorem 3 in Malkus (1981), from which the above results were obtained: $n - m$ should be just $n$. (2) Equation (3.16-214) approximates the continuous eigenproblem for the Laplacian operator, which of course implies a wide range of eigenvalues, from $O(1)$ to $O(1/h^2)$, where $h$ is a linear measure of element size.
The $(n + m)$-vector space basis is now complete and we can return to (3.16-207), (3.16-208) and expand both the data and the solution in terms of these basis vectors. Some 'tricks' are required, and they are mostly analogous to those used in the previous
analysis of the model problem; the results are
$$\begin{pmatrix} u(t) \\ P(t) \end{pmatrix} = \sum_{j=1}^{n-m} a_j(t)\, e^{-\lambda_j t} \begin{pmatrix} v_j \\ q_j \end{pmatrix} + \sum_{j=n-m+1}^{n} \left\{ \left[ b_{0j}(t) - \dot b_{1j}(t) \right] \begin{pmatrix} 0 \\ \bar q_j \end{pmatrix} + b_{1j}(t) \begin{pmatrix} M^{-1} C \bar q_j \\ 0 \end{pmatrix} \right\}, \tag{3.16-216}$$
where
$$a_i(t) = v_i^T M u_0 + \lambda_i \int_0^t b_{0i}(\tau)\, e^{\lambda_i \tau}\, d\tau, \qquad i = 1, 2, \ldots, n - m; \tag{3.16-217}$$
$$b_{1i}(t) = \bar q_i^T g(t) / \nu_i, \qquad i = n - m + 1, \ldots, n; \tag{3.16-218}$$
$$b_{0i}(t) = \frac{1}{\lambda_i} v_i^T f(t) - \frac{1}{\lambda_i} \sum_{j=n-m+1}^{n} b_{1j}(t)\, v_i^T K (M^{-1} C \bar q_j), \qquad i = 1, 2, \ldots, n - m; \tag{3.16-219}$$
$$b_{0i}(t) = \frac{1}{\nu_i} (M^{-1} C \bar q_i)^T f(t) - \frac{1}{\nu_i} \sum_{j=n-m+1}^{n} b_{1j}(t)\, (M^{-1} C \bar q_i)^T K (M^{-1} C \bar q_j), \qquad i = n - m + 1, \ldots, n. \tag{3.16-220}$$
Surely it would be less work to simply numerically integrate the Stokes DAE's! But it may be useful to at least realize what the numerical integrations are striving for.
Remarks: (1) The first $n - m$ modes correspond to divergence-free viscous decay. (2) The first of the $m$ 'equilibrium' modes [those with $(v_j, q_j) = (0, \bar q_j)$ and $\lambda_j = \infty$] 'cause' satisfaction of the PPE, $(C^T M^{-1} C) P = C^T M^{-1} (f - Ku) - \dot g$, the 'hidden' algebraic constraint, and, of course, also involve differentiation of the data, which can help to explain why ODE methods sometimes deliver lower local accuracy for $P$ than $u$: numerical differentiation. The PPE is satisfied for $t \ge 0$ if and only if $C^T u_0 = g(0)$; otherwise it is only satisfied for $t > 0$. (3) The second of the $m$ equilibrium modes (those with generalized eigenvectors and $\lambda_j = \infty$) 'cause' satisfaction of the original constraint, $C^T u = g$, for $t > 0$. These equations actually give the solution for arbitrary $u_0$, but only if $C^T u_0 = g(0)$ will $C^T u = g$ for $t \ge 0$; otherwise the solutions will be non-smooth and suffer a jump at $t = 0^+$, via an $L_2$-projection, as discussed in the previous Section (3.16.2b). (4) Recalling that $g(t)$ corresponds to specified BC's and is thus very sparsely populated, it seems that lots of basis vectors (all $m$ generalized eigenvectors) are 'used up' just to enforce a boundary condition! And frequently the BC has $g = 0$!
A partial explanation of this situation is this: the general solution does not specifically recognize that $g(t)$ is sparse and therefore could also solve the more general (and non-physical) problem, in which $g(t)$ could also correspond to sources and sinks of mass in $\Omega$, and thus be non-sparse. (See, for example, Strikwerda, 1984.) (5) If $k$ pressure modes are present, the solution changes in the following ways: (i) the first summation is increased to $n - m + k$, (ii) the second summation is reduced;
it now goes from $n - m + k + 1$ to $n$, (iii) the pressure modes are tacked on to the end via $\sum_{j=n+1}^{n+k} \alpha_j \begin{pmatrix} 0 \\ p_j \end{pmatrix}$, where $p_j$ is a pressure mode ($C p_j = 0$) and the $\{\alpha_j\}$ are arbitrary scalars. Explanation: with $k$ pressure modes, there are only $m - k$ linearly independent continuity equations (constraints); the 'effective' number of pressures is reduced from $m$ to $m - k$. Thus there are $n - (m - k)$ divergence-free modes, $m - k$ $Q$-orthogonal eigenvectors in the pressure only, each of which (still) 'generates' a generalized eigenvector, and $k$ pure pressure eigenvectors, each with $\lambda = 0$; cf. Figure 3.13-13. (6) If a divergence-free basis was employed, then the problem would simplify considerably; the pressure term is gone in (3.16-207), and (3.16-208) is not needed; all $n$ modes would be divergence-free viscous decay modes. (7) While the velocity parts of the $n - m$ divergence-free eigenvectors are the same as those discussed earlier for the steady Stokes equations (Section 3.13.2d and Figure 3.13-13), the remainder of these transient Stokes eigenvectors are different from the 'convergence' eigenvectors there. Since, however, each eigensystem forms a basis, any one of the dilatational eigenvectors from one system can be represented as a linear combination of those from the other. Finally, the pressure parts of the viscous decay modes are obtainable from the velocity parts via $q_j = -(C^T M^{-1} C)^{-1} C^T M^{-1} K v_j$, from (3.16-210) and (3.16-211), which is the discrete version of the continuum equations, $\nabla^2 q_i = 0$ in $\Omega$, with BC $\partial q_i / \partial n = \nu\, n \cdot \nabla^2 v_i$ on $\Gamma_D$, $q_i = \nu\, \partial (n \cdot v_i) / \partial n$ on $\Gamma_N$. (8) Recalling the previous analysis of the three-equation model problem, it is clear that there $n = 2$ and $m = 1$.
c. Index 1
The analogous PPE formulation starts with
$$M\dot u + Ku + CP = f(t), \qquad u(0) = u_0, \tag{3.16-221}$$
$$(C^T M^{-1} C) P = C^T M^{-1} (f - Ku) - \dot g, \tag{3.16-222}$$
whose associated eigenproblem is in the form (3.16-212) with now
$$A = \begin{pmatrix} K & C \\ C^T M^{-1} K & C^T M^{-1} C \end{pmatrix},$$
which is both singular (because the last $m$ rows are obviously just linear combinations of the first $n$ rows, owing to the factor $C^T M^{-1}$) and non-symmetric; $B$ is unchanged from its index 2 form. Here, as in the model problem above, both $A$ and $B$ are singular, thus necessitating a somewhat different set of tricks than for index 2. But we first mention that the $n - m$ div-free modes are the same as those from the index 2 formulation, as are the first $m$ modes with $\lambda = \infty$. What happens is that, rather than generalized eigenvectors à la Jordan, the second set of $m$ eigenvectors comes from a third eigenproblem; i.e., in addition to the obvious eigenproblem from the above equations,
$$K v_j + C q_j = \lambda_j M v_j \tag{3.16-223}$$
and
$$C^T M^{-1} C q_j + C^T M^{-1} K v_j = 0; \tag{3.16-224}$$
we must now invoke Malkus' 'second adjoint LBB eigenproblem' [$M \to K$ in (3.16-214); see also (3.13-156)],
$$(C^T K^{-1} C)\,\tilde q_j = \sigma_j Q \tilde q_j, \qquad j = n - m + 1, \ldots, n, \tag{3.16-225}$$
where the $m$ values of $\{\sigma_j\}$ are (in the absence of pressure modes) positive, and the corresponding linearly independent eigenvectors are $Q$-orthogonal (and we take $\tilde q_i^T Q \tilde q_i = 1$). Noting now that the operation $C^T M^{-1}$ on (3.16-223) yields, considering (3.16-224), $\lambda_j C^T v_j = 0$, we must have either $\lambda_j = 0$ or $C^T v_j = 0$. The latter case has already been accounted for; thus it must be the former case that is associated with (3.16-225), for which (3.16-223) yields
$$v_j = -K^{-1} C q_j, \qquad j = n - m + 1, \ldots, n, \tag{3.16-226}$$
where the $\{q_j\}$ are those from (3.16-225); $q_j = \tilde q_j\ \forall j$. The second set of $m$ equilibrium modes from index 2 ($\lambda_j = \infty$) has been inverted; the 'physics' of this transformation is this: rather than enforcing $C^T u = g$ for index 2 (even if $C^T u_0 \ne g_0$), these $m$ non-decaying modes ($\lambda_j = 0$, a direct result of the fact that the last $m$ rows of $A$ are linear combinations of the first $n$ rows) for the index 1 formulation enforce the weaker constraint, $C^T \dot u = \dot g$; they merely 'hold the div.' The analytical solution in terms of the $n + m$ basis vectors for index 1 turns out to be
$$\begin{pmatrix} u(t) \\ P(t) \end{pmatrix} = \sum_{j=1}^{n-m} a_j(t)\, e^{-\lambda_j t} \begin{pmatrix} v_j \\ q_j \end{pmatrix} + \sum_{j=n-m+1}^{n} \left\{ b_{0j}(t) \begin{pmatrix} 0 \\ \bar q_j \end{pmatrix} + b_{1j}(t) \begin{pmatrix} -K^{-1} C \tilde q_j \\ \tilde q_j \end{pmatrix} \right\}, \tag{3.16-227}$$
where, à la index 2, $(v_j^T, q_j^T)^T$ come from (3.16-210) and (3.16-211), $(0^T, \bar q_j^T)^T$ come from (3.16-214) with $\lambda_j = \infty$, and the rest come from (3.16-225) and (3.16-226) with $\lambda_j = 0$; also
$$a_i(t) = v_i^T M u_0 - \sum_{j=n-m+1}^{n} (\tilde q_j^T C^T u_0)(v_i^T M K^{-1} C \tilde q_j)/\sigma_j + \int_0^t \beta_i(\tau)\, e^{\lambda_i \tau}\, d\tau \tag{3.16-228}$$
and
$$\beta_i(t) = v_i^T f(t) - \sum_{j=n-m+1}^{n} (\tilde q_j^T \dot g(t))(v_i^T M K^{-1} C \tilde q_j)/\sigma_j, \qquad i = 1, 2, \ldots, n - m, \tag{3.16-229}$$
$$b_{0i}(t) = \bar q_i^T \left[ C^T M^{-1} f(t) - \dot g(t) \right] / \nu_i \tag{3.16-230}$$
and
$$b_{1i}(t) = \tilde q_i^T \left[ g(0) - g(t) - C^T u_0 \right] / \sigma_i, \qquad i = n - m + 1, \ldots, n. \tag{3.16-231}$$
And we are done. But, as for index 2, one would probably be foolish to even contemplate actually computing the transient Stokes solution in this way.
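The mode count just described for index 1 can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the matrices are random stand-ins (hypothetical, not from any discretization), with $M$ and $K$ SPD and $C$ of full column rank. Eliminating $P$ from the homogeneous form of (3.16-221) and (3.16-222) gives $\dot u = -P_0 M^{-1} K u$ with $P_0 = I - M^{-1} C (C^T M^{-1} C)^{-1} C^T$, so the eigenvalues of $P_0 M^{-1} K$ should consist of $n - m$ positive values (viscous decay) and $m$ zeros (the modes that 'hold the div').

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3  # hypothetical sizes: n velocity unknowns, m pressure unknowns


def random_spd(k):
    """Return a random symmetric positive-definite k-by-k matrix."""
    a = rng.standard_normal((k, k))
    return a @ a.T + k * np.eye(k)


M = random_spd(n)                # stand-in for the mass matrix
K = random_spd(n)                # stand-in for the viscous matrix
C = rng.standard_normal((n, m))  # stand-in for the gradient matrix (full rank)

Minv = np.linalg.inv(M)
# Projection matrix P0 = I - M^{-1} C (C^T M^{-1} C)^{-1} C^T
P0 = np.eye(n) - Minv @ C @ np.linalg.solve(C.T @ Minv @ C, C.T)

# Homogeneous index 1 velocity equation: du/dt = -P0 M^{-1} K u
lam = np.sort(np.linalg.eigvals(P0 @ Minv @ K).real)

tol = 1e-8 * np.abs(lam).max()
num_zero = int(np.sum(np.abs(lam) < tol))   # non-decaying modes
num_positive = int(np.sum(lam > tol))       # divergence-free decay modes

# A zero-eigenvalue mode has the form v = -K^{-1} C q, cf. (3.16-226)
q = rng.standard_normal(m)
v_null = -np.linalg.solve(K, C @ q)
null_residual = np.abs(P0 @ Minv @ K @ v_null).max()

print(num_zero, num_positive)
```

For an index 2 comparison one would instead examine the pencil of (3.16-212), where the same $n - m$ finite eigenvalues appear but the remaining ones are infinite.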
Remarks: (1) In marked contrast to index 2, the index 1 velocity is divergence-free only if it started that way; the set of $m$ zero eigenvalues cause the initial divergence, whatever it may be, to be retained for all time, which follows from $C^T \dot u = \dot g$.
(2) If and only if $C^T u_0 = g_0$ in both cases will the index 1 solution agree with that from index 2. (3) As for index 2, only the first $n - m$ divergence-free modes describe viscous decay. (4) The $m$ eigenvectors with infinite eigenvalues again 'cause' satisfaction of the PPE for all $t \ge 0$. (5) These index 1 eigenproblem results are not explicitly in the 1981 Malkus paper, but he has since then (personal communication, 1991) generalized them to cover the new results. (6) As for index 2, the 'analytical' results represent an appropriate generalization of those from the model problem in the previous section. (7) Noteworthy is that $g(t)$ is differentiated in the index 2 solution and that $\dot g(t)$ is integrated in the index 1 solution. (8) If we were to bother studying the index 0 (ODE) formulation, we would find $2m$ zero eigenvalues, corresponding to zero time derivatives of both continuity and pressure Poisson equations, and the usual $n - m$ divergence-free viscous decay modes. (9) Remark (5) on pressure modes, at the end of the index 2 discussion, applies here as well. (10) The book by Cook et al. (1989) contains a nice summary of FEM-related eigenproblems.
Final remark for index 2 and index 1: For the finite $\lambda$ cases, (3.16-212) can be 'rearranged' to give $[I - M^{-1} C (C^T M^{-1} C)^{-1} C^T]\, M^{-1} K v_j = P_0 M^{-1} K v_j = \lambda_j v_j$; the projection matrix, $P_0$, reduces the size of the solution space from that of $K v_j = \lambda_j M v_j$ (size $n$) to an $(n - m)$-dimensional subspace of div-free vectors, since $C^T P_0 = 0$ (see too Appendix 3).
d. Linear stability theory
To conclude this lengthy DAE diversion, we consider briefly the application of discrete (via GFEM) linearized stability theory, which may serve as a mini introduction to a chapter in Volume II. Only the index 2 formulation is useful in this context, so that is what we will present.
Classical stability analysis seeks to determine if a particular steady solution of the NS equations, $Ku + Re\,N(u)u + CP = f$ and $C^T u = g$, where $Re$ is the Reynolds number (displayed here for 'emphasis'), say $(u_s, P_s)$, is stable to small perturbations. The results of such an analysis lead to the following generalized eigenproblem (see Volume II for details)
$$[K + Re\,\tilde N(u_s)]\, v_j + C q_j = \lambda_j M v_j \tag{3.16-232}$$
and
$$C^T v_j = 0, \tag{3.16-233}$$
which is obviously related to the index 2 'natural modes' (Malkus, 1981) eigenproblem, (3.16-210) and (3.16-211). Here
$$\tilde N(u_s) = \left. \frac{\partial}{\partial u} [N(u)u] \right|_{u = u_s} \tag{3.16-234}$$
is a modified (augmented) advection matrix that corresponds to the operator $u_s \cdot \nabla + (\nabla u_s)^T \cdot$ in the continuum. The difference, of course, is that (3.16-234) is unsymmetric and so therefore is the eigenproblem (3.16-232) and (3.16-233), which may now display a complex solution. Whereas the details of the stability eigenproblem are certainly different from those of the Stokes problem, some similarities remain; in particular, there are still $2m$ infinite eigenvalues [$2m$ zero eigenvalues for the inverse ($\mu = 1/\lambda$) eigenproblem]. Further details will be presented in Volume II, the above intended only to begin to bridge the gap between DAE's and stability analysis.
3.16.4 Three Variable-Step Implicit (Index 2) Methods, and Some Steady-State Methods
a. Introduction
In this section we return to numerical time integration and generalize the variable-step methods introduced in the previous chapter, and we add one more. We shall construct smart time integrators for TR, BE, and BDF2, all applied to the index 2 DAE's; starting with TR, the history of which goes back to 1978 (Gresho et al., 1978b, 1979, 1980a). In each of the three, the related solution method for the steady form of the NS equations is 'available' as a subset.
b. Trapezoid rule
Analogous to (2.7-92), application of TR to the DAE's of (3.13-28) and (3.13-29) yields
$$\begin{pmatrix} \frac{2}{\Delta t_n} M + K + N(u_{n+1}) & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} u_{n+1} \\ P_{n+1} \end{pmatrix} = \begin{pmatrix} M \left( \frac{2}{\Delta t_n} u_n + \dot u_n \right) + f_{n+1} \\ g_{n+1} \end{pmatrix}, \tag{3.16-235}$$
a fully coupled non-linear system of equations in $(u_{n+1}, P_{n+1})$, wherein we have invoked the 'shortened' form of TR à la (3.16-25), which is valid if $C^T u_0 = g_0$, and it 'adjusts the data' in the first step if $C^T u_0 \ne g_0$; see (3.16-29).
Before discussing methods for actually solving this nonlinear system [and a legitimately linearized version that is always stable when $N(u)$ is skew-symmetric, discussed below], we describe the entire AB2/TR algorithm, wherein it is important to note that the predictor and concomitant error control are based solely on velocity; pressure just goes along for the ride, being the algebraic variable in the DAE's.
Start-up. Given $u_0$ satisfying $C^T u_0 = g_0$:
Step 1. Solve
$$\begin{pmatrix} M & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} \dot u_0 \\ P_0 \end{pmatrix} = \begin{pmatrix} f_0 - [K + N(u_0)]\, u_0 \\ \dot g_0 \end{pmatrix} \tag{3.16-236}$$
for $(\dot u_0, P_0)$.
Step 2. Select $\Delta t_0$ as discussed in Chapter 2 (Sections 2.7.4a and e) for $T$; replace $T$ by $u$ in the start-up algorithm below (2.7-95).
Step 3. Take the first TR step; solve (3.16-235) for $n = 0$ to get $(u_1, P_1)$, using $u_1^p = u_0 + \Delta t_0 \dot u_0$ as a first guess for $u_1$ [e.g., $N(u_1^p)$].
Step 4. Invert TR to get the required AB2 data for velocity; $\dot u_1 = 2(u_1 - u_0)/\Delta t_0 - \dot u_0$.
Remarks: (1) Solving (3.16-236) is the best (most trustworthy) way to get started, even if $P_0$ is not actually needed, or wanted. Approximations that estimate $\dot u_0$ by less rigorous methods are not as robust. Note that, in some sense, TR is not quite as 'self-starting' for these DAE's as it is for ODE's. (2) Even though the more efficient form of the RHS of the momentum equation is used in (3.16-235), vis-a-vis that in (3.16-23), with the result that $P_0$ is not actually required, do not make the mistake of assuming that you can select $P_0$ arbitrarily to obtain $\dot u_0$, as is occasionally seen in the literature (e.g., Eguchi et al., 1988; see also Gresho, 1990a). Such a procedure will cause the 'cursed' TR $2\Delta t$ oscillations, a wiggle signal.
General step. With $\Delta t_1 = \Delta t_0$, for $n = 1, 2, \ldots$, do:
Step 1. Predict the velocity with AB2: $u^p_{n+1} = u_n + \frac{\Delta t_n}{2} \left[ (2 + \Delta t_n/\Delta t_{n-1})\, \dot u_n - (\Delta t_n/\Delta t_{n-1})\, \dot u_{n-1} \right]$.
Step 2. Solve (3.16-235) for $(u_{n+1}, P_{n+1})$ using $u^p_{n+1}$ as the first guess for $u_{n+1}$ (details later).
Step 3. Invert TR to get AB2 data for the next step and update $t$: $\dot u_{n+1} = 2(u_{n+1} - u_n)/\Delta t_n - \dot u_n$, $t_{n+1} = t_n + \Delta t_n$.
Step 4. Compute the LTE estimate based on the velocity: $d_n = (u_{n+1} - u^p_{n+1}) / [3(1 + \Delta t_{n-1}/\Delta t_n)]$.
Step 5. Compute the (potential) next step size: $\Delta t_{n+1} = \Delta t_n (\varepsilon / \|d_n\|)^{1/3}$, where $\|d_n\|^2 = d_n^T d_n / (N_T u_{\max}^2)$, where $N_T$ is the total number of (variable) velocity nodes, and $u_{\max}$ is an appropriate measure of the maximum speed in the domain. A better-yet weighted RMS-norm might be the following (in 2D, for simplicity):
$$\|d_n\|^2 = \frac{1}{N_u + N_v} \left[ \sum_{i=1}^{N_u} \left( \frac{d^u_{n,i}}{|u_{n+1,i}| + U_0} \right)^2 + \sum_{i=1}^{N_v} \left( \frac{d^v_{n,i}}{|v_{n+1,i}| + V_0} \right)^2 \right],$$
where $d^u_{n,i}$ and $d^v_{n,i}$ are the components of $d_n$ associated with $u$ and $v$, $U_0$ and $V_0$ are user-specified 'characteristic' velocities ('floor' values, in case $|\cdot|$ is 'too close' to zero), and $N_u + N_v = N_T$.
Step 6. Bump $n$ and go to Step 1 unless the final time has been reached.
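The start-up and general steps above can be sketched compactly for the linear (Stokes) case. This is a minimal illustration, assuming $N = 0$, $f = 0$, and $g = 0$, dense direct solves, and random SPD stand-ins for $M$ and $K$; the names `solve_saddle` and `ab2_tr` are hypothetical helpers, and every step is accepted (the step-rejection logic discussed later is omitted).

```python
import numpy as np


def solve_saddle(A, C, rhs_u, rhs_p):
    """Solve [[A, C], [C^T, 0]] (x, P) = (rhs_u, rhs_p) with a dense direct solver."""
    n, m = C.shape
    S = np.block([[A, C], [C.T, np.zeros((m, m))]])
    sol = np.linalg.solve(S, np.concatenate([rhs_u, rhs_p]))
    return sol[:n], sol[n:]


def ab2_tr(M, K, C, u0, dt0, t_end, eps, u_max=1.0):
    """Variable-step AB2/TR for M du/dt + K u + C P = 0, C^T u = 0 (N = 0, f = g = 0)."""
    zero_m = np.zeros(C.shape[1])
    # Start-up Step 1: consistent initial acceleration, cf. (3.16-236).
    udot_old, _ = solve_saddle(M, C, -K @ u0, zero_m)
    # Start-up Steps 3 and 4: first TR step, then invert TR for the acceleration.
    u_old, dt_old = u0, dt0
    u, P = solve_saddle(2.0 / dt0 * M + K, C, M @ (2.0 / dt0 * u0 + udot_old), zero_m)
    udot = 2.0 * (u - u_old) / dt0 - udot_old
    t, dt, steps = dt0, dt0, [dt0]
    while t < t_end:
        r = dt / dt_old
        # Step 1: AB2 predictor.
        u_pred = u + 0.5 * dt * ((2.0 + r) * udot - r * udot_old)
        # Step 2: TR corrector, the linear form of (3.16-235).
        u_new, P = solve_saddle(2.0 / dt * M + K, C, M @ (2.0 / dt * u + udot), zero_m)
        # Step 3: invert TR to get the new acceleration.
        udot_new = 2.0 * (u_new - u) / dt - udot
        # Step 4: LTE estimate and its RMS norm scaled by u_max.
        d = (u_new - u_pred) / (3.0 * (1.0 + dt_old / dt))
        d_norm = np.sqrt(d @ d / d.size) / u_max
        # Step 5: potential next step size (guarding against d_norm == 0).
        dt_next = dt * (eps / max(d_norm, 1e-300)) ** (1.0 / 3.0)
        # Step 6: advance.
        t += dt
        steps.append(dt)
        u_old, udot_old, dt_old = u, udot, dt
        u, udot, dt = u_new, udot_new, dt_next
    return t, u, P, steps


rng = np.random.default_rng(0)
n, m = 8, 3
G = rng.standard_normal((n, n))
M = G @ G.T + n * np.eye(n)
H = rng.standard_normal((n, n))
K = H @ H.T + n * np.eye(n)
C = rng.standard_normal((n, m))
Minv_C = np.linalg.solve(M, C)
w = rng.standard_normal(n)
u0 = w - Minv_C @ np.linalg.solve(C.T @ Minv_C, C.T @ w)  # enforce C^T u0 = 0
t, u, P, steps = ab2_tr(M, K, C, u0, dt0=1e-3, t_end=2.0, eps=1e-4)
```

In this sketch the step size is free to grow as the solution decays; a production code would add the acceptance rules, a cap on step growth, and a non-linear solve in Step 2.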
Now, as for the scalar problem of Chapter 2, there are a few practical matters to deal with; more in fact since the TR equations are now non-linear.
Digression
But before biting the non-linear bullet, we digress briefly to point out a useful and cost-effective way to retain the stability of TR without solving non-linear equations! It is borrowed from Simo and Armero (1994) and only applies rigorously when $N(u)$ is strictly skew-symmetric; and our version is this: 'linearize' the advection term in TR so that the momentum equation reads
$$M \frac{(u_{n+1} - u_n)}{\Delta t_n} + K \frac{(u_n + u_{n+1})}{2} + N(u^*) \frac{(u_n + u_{n+1})}{2} + C \frac{(P_n + P_{n+1})}{2} = \frac{f_n + f_{n+1}}{2}, \tag{3.16-237}$$
where $u^*$ is given but is thus far 'arbitrary'; it is, however, not a function of $u_{n+1}$, so that (3.16-237) remains linear. The first thing we note is that for $N(\cdot) = -N^T(\cdot)$, we get energy conservation (in the absence of forcing terms) for all $\Delta t$; i.e., the inner product of (3.16-237) with $(u_n + u_{n+1})/2$ yields, utilizing $C^T (u_n + u_{n+1}) = 0$, the symmetry of $M$ and $K$, and the skew symmetry of $N$,
$$\frac{1}{2} \frac{u_{n+1}^T M u_{n+1} - u_n^T M u_n}{\Delta t_n} = -\left( \frac{u_n + u_{n+1}}{2} \right)^T K \left( \frac{u_n + u_{n+1}}{2} \right), \tag{3.16-238}$$
which (i) proves stability, $u_{n+1}^T M u_{n+1} \le u_n^T M u_n$ because $K$ is SPD, and (ii) approximates the appropriate ODE conservation law, $\frac{1}{2}\, d(u^T M u)/dt = -u^T K u$; viscous dissipation. This is the reason that this linearization is 'interesting': guaranteed stability with no non-linear equations to solve.
Remarks: (1) This 'trick' was well advertised by Simo and Armero (1994), but it was not discovered by them. Although we do not know who actually discovered it, we do know that it was discussed as far back as 1972 by Lions (1972) and Temam (1972); Temam probably discovered it, in Temam (1966), in which is presented a stability proof, probably a much more elegant one than we have just presented.
(2) The skew-symmetric form is only strictly attainable for Dirichlet BC's; NBC's as OBC's can cause trouble (see the discussion of fully implicit methods at the end of Section 3.16.6c); in Simo and Armero (1994), the theory was 'all Dirichlet', yet flows with outflows, nice-looking and stable, were also presented, perhaps somewhat misleadingly. They actually modified the advection operator at the outflow to obtain 'reasonable' solutions; i.e., they sacrificed skew-symmetry there to obtain useful results for which they had no theory (F. Armero, personal communication).
Rearranging (3.16-237), and then invoking the ODE at $t_n$ yields
$$\left[ \frac{2}{\Delta t_n} M + K + N(u^*) \right] u_{n+1} + C P_{n+1} = \frac{2}{\Delta t_n} M u_n + f_n - K u_n - N(u^*) u_n - C P_n + f_{n+1} = M \left( \frac{2}{\Delta t_n} u_n + \dot u_n \right) + f_{n+1} + [N(u_n) - N(u^*)]\, u_n, \tag{3.16-239}$$
and the full 'TR' system becomes
$$\begin{pmatrix} \frac{2}{\Delta t_n} M + K + N(u^*) & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} u_{n+1} \\ P_{n+1} \end{pmatrix} = \begin{pmatrix} M \left( \frac{2}{\Delta t_n} u_n + \dot u_n \right) + f_{n+1} + [N(u_n) - N(u^*)]\, u_n \\ g_{n+1} \end{pmatrix}, \tag{3.16-240}$$
an unsymmetric linear system in $(u_{n+1}, P_{n+1})$. We now address some options for choosing $u^*$:
1. Choose $u^* = u_n$ to gain efficiency [the $\Delta N = N(u_n) - N(u^*)$ term on the RHS vanishes] while losing advection accuracy.
2. Set $u^* = (u_n + u^p_{n+1})/2$, where $u^p_{n+1}$ is the predictor from AB2, to gain advection accuracy but at higher cost ($\Delta N \ne 0$ on the RHS). Note that this causes the advection term to look a lot like an explicit midpoint rule.
3. Ditto 2, but drop $\Delta N$ from the RHS to save cost, and take your chances on stability.
We have not tested any of these choices, but would probably start with 1 or 3. We would also use the same variable-$\Delta t$ algorithm as presented above for the honest (nonlinear) TR, including all the 'rules of thumb;' it would probably work quite well and is recommended.
End Digression
Returning now to the (more general) non-linear and non-skew-symmetric case, we show how the non-linear system of (3.16-235) can be solved pretty effectively using one or another variant of Newton's method. Applied to (3.16-235), Newton's method gives the following sequence of linear systems: for $m = 0, 1, \ldots$ with $u^0_{n+1} = u^p_{n+1}$, solve
$$\begin{pmatrix} \frac{2}{\Delta t_n} M + K + N(u^m_{n+1}) + N'(u^m_{n+1}) & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} u^{m+1}_{n+1} - u^m_{n+1} \\ P^{m+1}_{n+1} \end{pmatrix} = \begin{pmatrix} M \left( \frac{2}{\Delta t_n} u_n + \dot u_n \right) + f_{n+1} - \left[ \frac{2}{\Delta t_n} M + K + N(u^m_{n+1}) \right] u^m_{n+1} \\ g_{n+1} - C^T u^m_{n+1} \end{pmatrix}, \tag{3.16-241}$$
where we have simplified the Newton system because $P_{n+1}$ appears only linearly; also $N'(u)$ denotes the matrix whose $(i, j)$ element is given by
$$N'_{ij}(u) = \frac{\partial N_{ik}(u)}{\partial u_j}\, u_k \tag{3.16-242}$$
or, equivalently,
$$N'(u) = \left. \frac{\partial [N(u)v]}{\partial u} \right|_{v = u}, \tag{3.16-243}$$
and is the 'non-linear' portion of the Jacobian matrix of Newton's method. (See Gresho et al., 1980a, for explicit expressions for $N'_{ij}$.)
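The Newton sequence (3.16-241) can be illustrated on a toy problem. The sketch below is an assumption-laden illustration, not an FEM implementation: the advection matrix is a hypothetical stand-in that is linear in $u$, $N(u)_{ik} = \sum_j T_{jik} u_j$ with a random tensor `T`, so that (3.16-242) gives $N'(u)_{ij} = \sum_k T_{jik} u_k$; `M`, `K`, `C`, and the data vectors are likewise random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 2
G = rng.standard_normal((n, n))
M = G @ G.T + n * np.eye(n)
H = rng.standard_normal((n, n))
K = H @ H.T + n * np.eye(n)
C = rng.standard_normal((n, m))
# Hypothetical tensor defining the toy advection matrix: N(u)[i, k] = sum_j T[j, i, k] u[j]
T = 0.1 * rng.standard_normal((n, n, n))


def N(u):
    """Toy advection matrix, linear in u."""
    return np.einsum("jik,j->ik", T, u)


def N_prime(u):
    """N'(u)[i, j] = sum_k dN[i, k]/du[j] * u[k], cf. (3.16-242)."""
    return np.einsum("jik,k->ij", T, u)


def tr_newton(u_n, udot_n, f_np1, g_np1, dt, u_guess, tol=1e-12, max_iter=20):
    """Newton iteration (3.16-241) for one TR step; returns (u, P, iterations)."""
    rhs_u = M @ (2.0 / dt * u_n + udot_n) + f_np1
    u = u_guess.copy()
    for it in range(1, max_iter + 1):
        A = 2.0 / dt * M + K + N(u)
        S = np.block([[A + N_prime(u), C], [C.T, np.zeros((m, m))]])
        rhs = np.concatenate([rhs_u - A @ u, g_np1 - C.T @ u])
        sol = np.linalg.solve(S, rhs)
        du, P = sol[:n], sol[n:]   # pressure is solved for directly, not as an increment
        u = u + du
        if np.linalg.norm(du) <= tol * (1.0 + np.linalg.norm(u)):
            return u, P, it
    raise RuntimeError("Newton iteration did not converge")


# Arbitrary illustrative data for one step
u_n = rng.standard_normal(n)
udot_n = rng.standard_normal(n)
f_np1, g_np1, dt = np.zeros(n), np.zeros(m), 0.1
u, P, iters = tr_newton(u_n, udot_n, f_np1, g_np1, dt, u_guess=u_n)

# Residual of the non-linear TR system (3.16-235) at the converged state
residual = (2.0 / dt * M + K + N(u)) @ u + C @ P - (M @ (2.0 / dt * u_n + udot_n) + f_np1)
```

Stopping the loop after the first pass would correspond to a single Newton iteration from the supplied guess; here the guess is simply `u_n`, whereas the algorithm in the text starts from the AB2 predictor.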
Remarks: (1) In the (most common) case that $g = 0$, the RHS can be simplified by omitting $C^T u^m_{n+1}$ and $g_{n+1}$ since then even the predicted velocity is divergence-free (if $C^T u_0 = g_0 = g$). This simplification could, however, lead to the accumulation of round-off errors. (2) The 'safety valve' that precludes illegal IC's has been bypassed ('shortened form'); to reset it, add $g_n - C^T u_n$ to the RHS of (3.16-241). [Best advice for code writers: use shortened TR but test on $C^T u_0 - g_0 = e_0$ and print an 'appropriate' norm of $e_0$ so that the user can ascertain the quality of his/her IC. (Recall that $u_1$ will always be discretely divergence-free.)] (3) If iterative methods are used to solve the linear systems, then the solution will always contain some iteration error that should be appropriately controlled.
The iterations in the Newton system, (3.16-241), are terminated when $\Delta u$ is 'sufficiently small'; i.e., small enough that the resulting error does not contaminate the estimation of the local time truncation error. A reasonable criterion is that $\|\Delta u\| \le 0.1\varepsilon$, which will typically be attained in very few iterations: one, two, or three. In fact, however, full Newton iterations are probably seldom cost-effective for several reasons:
1. The cost of updating the Jacobian is significant and, if direct methods are employed, the cost of solving the linear system at each iteration is very high.
2. The AB2 predictor is often close enough to $u_{n+1}$ to accept just one iteration as converged. This is called one-step Newton; it was suggested by Gresho et al. (1979, 1980a) and is used, for example, by Gartling (1987) and in FIDAP (Fluid Dynamics International, 1993).
3. Other approximations, such as the chord method (outdated Jacobian; see, for example, Gresho et al., 1980a) or quasi-Newton methods (FIDAP, 1993) may also often work well; experiment.
Final 'rules of thumb' on the time-stepping algorithm, similar but not identical to those in Chapter 2 (Section 2.7.4) for the scalar problem:
1. If DTSF $= \Delta t_{n+1}/\Delta t_n > 1$, accept the increase.
2. If $\gamma <$ DTSF $< 1$, where $\gamma$ is user-specified, but a value like 0.8 is recommended, accept the solution but do not change $\Delta t$.
3. If DTSF $< \gamma < 1$, reject the current solution and repeat the step using $\Delta t_{n+1}$.
4. If DTSF $\ll 1$ or if more than two step reductions occur within one timestep, then it may be a good idea to stop the integration and print a warning message so that the cause can be studied.
5. Depending on the strategy used in approximating Newton's method, the Jacobian may or may not be updated in cases 1 and 3. In case 2, an update is probably a good idea.
As discussed in Chapter 2 [following (2.7-97)], an implementation of TR that reduces potential problems with round-off error may be a good idea (we have not tested it, but should). It goes like so:
1. Perform Step (1) of 'General step', below (3.16-236), as usual.
2. Compute the predictor acceleration from [see too (2.7-82) and (2.7-83)] $\dot u^p_{n+1} = (1 + \Delta t_n/\Delta t_{n-1})\, \dot u_n - (\Delta t_n/\Delta t_{n-1})\, \dot u_{n-1}$.
3. Replace Step (2) [solve (3.16-235)] by: solve for $\delta u = u_{n+1} - u^p_{n+1}$ and $P_{n+1}$ from
$$\begin{pmatrix} \frac{2}{\Delta t_n} M + K + N(u^p_{n+1} + \delta u) & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} \delta u \\ P_{n+1} \end{pmatrix} = \begin{pmatrix} f_{n+1} - [K + N(u^p_{n+1} + \delta u)]\, u^p_{n+1} - M \dot u^p_{n+1} \\ g_{n+1} - C^T u^p_{n+1} \end{pmatrix}.$$
4. Replace Step 3 by $\dot u_{n+1} = \dot u^p_{n+1} + \frac{2}{\Delta t_n}\, \delta u$, $t_{n+1} = t_n + \Delta t_n$.
5. Set $u_{n+1} = u^p_{n+1} + \delta u$.
6. Return to Step (1).
Suppose you wish to continue a run after studying the results, which will probably occur frequently. There are two options here, which we call 'smooth' and 'fresh' restarts. The former is the simplest and should normally be employed; it does require, however, that more data be saved than with a fresh restart: $u_n$, $u_{n+1}$, $\dot u_n$, $\dot u_{n+1}$, $\Delta t_n$, and $\Delta t_{n+1}$. The smooth restart then starts with the general AB predictor equation, and the continuation run should be as smooth as if you had not stopped. The fresh restart, on the other hand, ignores all history data and saves only $u_{n+1}$, which is then treated like $u_0$ in the start-up phase. This type of continuation run may not be as smooth, in the sense that the computed $\Delta t_0$ (or user-selected if that is your choice) will generally differ from that of a smooth restart. The fresh restart is needed if some parameter is to be changed (such as $\varepsilon$) and is (only) recommended if you are not 'happy' with the current $\Delta t$ vs $t$ behavior of the algorithm, which can occur, for example, if error accumulation (iteration error, too large an $\varepsilon$, etc.) has 'polluted' the history vector. (Even 'smart' integrators are not perfect.)
Final remark: If the skew-symmetric and a priori linearized form, (3.16-240), is employed, then the same timestepping strategy as above may, except for option (1), be safely used because the (second-order accurate) linearization does not change (pollute) the LTE estimates, an assertion whose proof we leave to the reader.
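The timestepping strategy referred to here, i.e., the DTSF 'rules of thumb' listed earlier in this section, can be collected into a small decision function. This is a minimal sketch: the function name, its return convention, and the `collapse` threshold (a hypothetical stand-in for 'DTSF $\ll$ 1') are assumptions made for illustration, and the case DTSF $= \gamma$, which the rules leave unspecified, is treated here as a rejection.

```python
def step_decision(dt_n, dt_np1, reductions=0, gamma=0.8, collapse=0.1):
    """Apply the step-size 'rules of thumb' to a proposed next step dt_np1.

    Returns (action, dt) with action in {"accept", "reject", "stop"}; dt is the
    step size to use next (for "reject", the size with which to repeat the step).
    `reductions` counts the step reductions already made within this timestep.
    `collapse` is a hypothetical threshold standing in for 'DTSF << 1'.
    """
    dtsf = dt_np1 / dt_n
    if dtsf >= 1.0:
        return "accept", dt_np1   # rule 1: accept the increase
    if dtsf > gamma:
        return "accept", dt_n     # rule 2: accept the solution, keep the step size
    if dtsf < collapse or reductions >= 2:
        return "stop", dt_np1     # rule 4: halt and warn so the cause can be studied
    return "reject", dt_np1       # rule 3: repeat the step with the smaller size


print(step_decision(1.0, 1.5))
print(step_decision(1.0, 0.9))
print(step_decision(1.0, 0.5))
```

A caller would increment `reductions` each time a step is repeated and reset it to zero once a step is accepted.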
A few words on the steady NS equations are appropriate here, and similar words apply to each of the following two time-marching methods (because all collapse to the same equations):
1. The Newton system given by (3.16-241) applies equally well to the steady-state NS equations; simply set $M = 0$ or $\Delta t = {}$'$\infty$' and omit the time-level indices, with or without 'explicit relaxation' [see 3 below].
2. It may often be the case that a good first guess is hard to come by (the ball of attraction/radius of convergence of Newton's method is not large; bad initial guesses
result in divergence of the iterations); methods for getting a good guess are described in Volume II (the chapter on continuation methods).
3. An alternative solution strategy is to begin the steady-state iterations with a simpler method that: (i) may have a larger radius of convergence but (ii) converges more slowly (linearly vis-a-vis quadratically for Newton's method). Called successive substitution or functional iteration, or Picard iteration, the recommended alternate strategy uses the following fixed-point iterative method (see too Reddy and Gartling, 1994):
$$\begin{pmatrix} K + N(u^m) & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} u^{m+1} \\ P^{m+1} \end{pmatrix} = \begin{pmatrix} f \\ g \end{pmatrix}, \qquad m = 0, 1, \ldots, \tag{3.16-244}$$
with perhaps a relaxation factor applied: $u^{m+1} \leftarrow \alpha u^m + (1 - \alpha) u^{m+1}$, where $0 < \alpha < 1$, with a similar 'trick' for $P$, but only for getting started. Once the solution is 'close enough' (the hard part, and left to the reader), the switch to Newton's method should be made.
c. Backward Euler
We now repeat the exercise with BE, and repeat the admonition: generally only use BE when seeking a time-marching approach to a steady solution, and then be 'generous' with $\varepsilon$ (try to use 'large' timesteps), or use BE if TR 'crashes' and a 'sanity check' is needed. (Robustness caused by numerical damping is sometimes useful, although too large an $\varepsilon$ can preclude convergence of the non-linear iterations.) If BE is being used to get 'quickly' to a steady state, then it may also be a good idea to monitor $(u_{n+1} - u_n) = \Delta u_n$ and switch to a steady-state solver (probably using Newton's method) when $\Delta u_n$ is 'small enough.' (See too Section 3.16.10 below.) The BE method of (3.13-28) and (3.13-29) gives
$$\begin{pmatrix} \frac{1}{\Delta t_n} M + K + N(u_{n+1}) & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} u_{n+1} \\ P_{n+1} \end{pmatrix} = \begin{pmatrix} \frac{1}{\Delta t_n} M u_n + f_{n+1} \\ g_{n+1} \end{pmatrix} \tag{3.16-245}$$
and the Newton system for solving it is [cf. (3.16-241)]
$$\begin{pmatrix} \frac{1}{\Delta t_n} M + K + N(u^m_{n+1}) + N'(u^m_{n+1}) & C \\ C^T & 0 \end{pmatrix} \begin{pmatrix} u^{m+1}_{n+1} - u^m_{n+1} \\ P^{m+1}_{n+1} \end{pmatrix} = \begin{pmatrix} \frac{1}{\Delta t_n} M u_n + f_{n+1} - \left[ \frac{1}{\Delta t_n} M + K + N(u^m_{n+1}) \right] u^m_{n+1} \\ g_{n+1} - C^T u^m_{n+1} \end{pmatrix}, \qquad m = 0, 1, \ldots, \tag{3.16-246}$$
and we cannot help but point out again the obvious very slight cost reduction over TR, with a significant reduction in accuracy.
As an algorithm, the BE/FE method is the following:
Start-up.
Step 1. Select $\Delta t_0$ as suggested in Sections 2.7.4b and 2.7.4e.
Step 2. Solve for $\dot u_0$ as in (3.16-236); $P_0$ is a bonus.
General Step. For $n = 0, 1, 2, \ldots$, do:
Step 1. Compute $u^p_{n+1} = u_n + \Delta t_n \dot u_n$.
Step 2. Solve (3.16-245) or iterate on (3.16-246) for $u_{n+1}$, using $u^p_{n+1}$ as the first guess.
Step 3. Invert BE via $\dot u_{n+1} = (u_{n+1} - u_n)/\Delta t_n$.
Step 4. Compute the LTE based on velocity from $d_n = (u_{n+1} - u^p_{n+1})/2$.
Step 5. Compute the next (potential) step size from $\Delta t_{n+1} = \Delta t_n (\varepsilon/\|d_n\|)^{1/2}$.
Step 6. Bump $n$ and go to (1) unless the final time has been reached.
Similar rules of thumb as for TR are also suggested here.
Remark: Warning: The 'rush' to an alleged steady state via BE using large $\Delta t$ (large $\varepsilon$) may not yield a stable and/or unique steady state, even when it 'works' (converges). [If a time-accurate solution from well-posed initial data attains a steady state, then that steady state is unique for the given IC's. The 'robust' BE method can still sometimes be a useful alternative to attacking the SS equations directly; see, for example, Reddy and Gartling (1994) for further discussion, wherein they consider mainly non-linear ODE's via a 'semi'-time-accurate BE method, with 'judicious' selection of $\Delta t$ and $\varepsilon$.]
We follow up on this last remark with a few more that may actually be more important. Yee and colleagues have spent much effort in studying the 'dynamics of numerics' for various fixed-step (usually) time integration methods applied to 'model' non-linear ODE problems that are presumed to mimic at least portions of the behavior of the non-linear DAE's (and ODE's) of CFD. The emphasis is usually on the following issue/question: For generally unknown IC's, do time-marching schemes reliably find stable steady states when they exist? By 'unknown' we mean that the original PDE may have come from the steady NSE's, whose associated BVP is converted to an IBVP via a time-integration approach. What IC's should be selected?
'The phenomenon that a non-linear differential equation and its discretized counterpart can have different dynamical behavior (asymptotic behavior) was not uncovered fully until recently' (Yee and Sweby, 1995a). And 'strong dependence on initial data means that for a finite time step $\Delta t$ that is not sufficiently small, the asymptotic numerical solutions and the associated, numerical basins of attraction depend continuously on the initial data' (Yee and Sweby, 1994), where the 'basins of attraction' refers to that set of IC's whose solution curves all approach the same asymptote. (Thus, 'large' $\Delta t$ integrations can give a different steady solution for each different IC.) One of their points of emphasis is that implicit methods (usually using fixed $\Delta t$ that is 'too large') can stabilize unstable steady states, a conclusion that is valid for some, but not necessarily all, IC's. Finally, they also point out that the different methods for solving approximately the resulting non-linear algebraic equations can also affect the dynamics of the numerics; all-in-all a sobering set of 'fears' that we would all do well to appreciate. For the most recent 'summary' of these efforts, see Yee and Sweby (1996). While we still believe that smart, variable-step, integrators will rarely fall prey to such spurious behavior and thus further justify their use, it may often be the case that the number of 'small' timesteps required to find a putative steady state is too large to be deemed affordable, thus leaving the analyst in a quandary: either attack the problem with fixed, 'large' $\Delta t$ implicit integrations and take your chances on obtaining 'correct' results, or 'blow' your computer budget on a smart, variable-but-presumably-'small' $\Delta t$, implicit integration method. We still tend to side with the latter approach for three reasons:
(i) reliability, (ii) smart integrators will properly and automatically use 'large' $\Delta t$ if and only if a stable SS is approached ($\Delta t$ is only small when necessary), and (iii) if $\Delta t$ stays 'small' to follow a time-dependent solution that does not go to SS, so be it: you have found the solution appropriate to your selected IC's.

d. BDF2

The last smart integrator that we consider is another 'predictor-corrector' linear multistep method, with a second-order predictor that turns out to be leapfrog (explicit midpoint rule) when the step size is constant. First, however, we present our variable-$\Delta t$ version of BDF2 on ODE's, $\dot{y} = f(y)$:

$$\frac{y_{n+1} - y_n}{\Delta t_n} = \frac{\Delta t_n}{2\Delta t_n + \Delta t_{n-1}} \cdot \frac{y_n - y_{n-1}}{\Delta t_{n-1}} + \frac{\Delta t_n + \Delta t_{n-1}}{2\Delta t_n + \Delta t_{n-1}} \cdot \dot{y}_{n+1}, \tag{3.16-247}$$

which agrees with the result in Hairer et al. (1987, p. 351). The appropriate predictor equation is

$$y^p_{n+1} = y_n + \left(1 + \frac{\Delta t_n}{\Delta t_{n-1}}\right) \Delta t_n \dot{y}_n - \left(\frac{\Delta t_n}{\Delta t_{n-1}}\right)^2 (y_n - y_{n-1}). \tag{3.16-248}$$

The LTE of (3.16-247) is found to be

$$d_n = y_{n+1} - y(t_{n+1}) = \frac{(\Delta t_n + \Delta t_{n-1})^2}{\Delta t_n (2\Delta t_n + \Delta t_{n-1})} \cdot \frac{\Delta t_n^3 \dddot{y}_n}{6} + O(\Delta t^4) \tag{3.16-249}$$

and that of (3.16-248) is

$$y^p_{n+1} - y(t_{n+1}) = -\left(1 + \frac{\Delta t_{n-1}}{\Delta t_n}\right) \cdot \frac{\Delta t_n^3 \dddot{y}_n}{6} + O(\Delta t^4), \tag{3.16-250}$$

from which $d_n$ can be obtained by solving the above two equations for $y(t_{n+1})$ and $\dddot{y}_n$, to $O(\Delta t^4)$:

$$d_n = \frac{(1 + \Delta t_{n-1}/\Delta t_n)^2}{1 + 3(\Delta t_{n-1}/\Delta t_n) + 4(\Delta t_{n-1}/\Delta t_n)^2 + 2(\Delta t_{n-1}/\Delta t_n)^3} \, (y_{n+1} - y^p_{n+1}) + O(\Delta t_n^4). \tag{3.16-251}$$

With the variable-step ODE preliminaries out of the way (which results could, of course, be applied to the ODE's of the previous chapter), we can now formulate the smart DAE integrator using BDF2:

Start-up.
Step 1. Solve (3.16-236) for $(\dot{u}_0, P_0)$.
Step 2. Select $\Delta t_0$ as per TR.
Step 3. Take the first timestep with TR; solve (3.16-235) for n = 0 to get $(u_1, P_1)$.
Step 4. Invert TR to get the acceleration: $\dot{u}_1 = 2(u_1 - u_0)/\Delta t_0 - \dot{u}_0$.

General Step. With $\Delta t_1 = \Delta t_0$, for n = 1, 2, ..., do:
Step 1. Predict the velocity via 'generalized leapfrog'; i.e., (3.16-248).
Step 2. Solve for $(u_{n+1}, P_{n+1})$ from the Newton system derived from

$$\begin{bmatrix} \dfrac{1 + 2\Delta t_n/\Delta t_{n-1}}{\Delta t_n (1 + \Delta t_n/\Delta t_{n-1})} M + K + N(u_{n+1}) & C \\ C^T & 0 \end{bmatrix} \begin{bmatrix} u_{n+1} \\ P_{n+1} \end{bmatrix} = \begin{bmatrix} \dfrac{1 + \Delta t_n/\Delta t_{n-1}}{\Delta t_n} M u_n - \dfrac{(\Delta t_n/\Delta t_{n-1})^2}{\Delta t_n (1 + \Delta t_n/\Delta t_{n-1})} M u_{n-1} + f_{n+1} \\ g_{n+1} \end{bmatrix}, \tag{3.16-252}$$

using $u^p_{n+1}$ as the first guess.
Step 3. Invert BDF2 to get the required predictor data for the next step: $\dot{u}_{n+1} = (3u_{n+1} - 4u_n + u_{n-1})/2\Delta t_n$.
Step 4. Compute the LTE based on velocity from (3.16-251) with $y \to u$ and $O(\Delta t_n^4)$ dropped.
Step 5. Compute the next (potential) step size: $t_{n+1} = t_n + \Delta t_n$, $\Delta t_{n+1} = \Delta t_n (\varepsilon/\|d_n\|)^{1/3}$.
Step 6. Bump n and go to (1) unless the final time has been reached.

Remarks:
(1) (3.16-252) can of course be solved by other than Newton's method.
(2) We have not tested this algorithm in the laboratory, but nevertheless recommend it, especially as an alternative to TR when and if TR 'acts up', and as a better dissipative algorithm than BE.
(3) The same 'rules of thumb' and solution methods for the corrector equation discussed for TR apply here as well.
(4) As for BE and TR, the predicted velocity is generally not divergence-free.
(5) An admonition from Kheshgi and Scriven (1984), who used the variable-step TR method: 'The scheme is started with a backward-difference corrector to avoid the oscillations that certain initial conditions set off in the TR corrector: higher-order backward-differentiation methods were rejected because their numerical dissipation can make an unstable flow appear to be stable.' [They used a penalty method and wished to preclude spurious penalty transients, discussed below. Also, and related to the Kheshgi-Scriven comment, the BDF methods (especially BE!) have the property that they will actually damp an unstable ODE solution, $e^{(i\omega + \lambda)t}$, for $\lambda > 0$ but not 'too large' and $\omega$ sufficiently large; i.e., slowly growing oscillatory solutions. See, for example, Gear (1971).]
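The damping property noted in the last remark can be checked directly from the constant-step amplification factors on the scalar test equation $\dot{y} = \mu y$ with $\mu = \lambda + i\omega$. The following is a minimal sketch; the particular values of $\lambda$, $\omega$ and $\Delta t$ are illustrative assumptions, and the BDF2 factor is taken as the larger root of its constant-step characteristic polynomial.

```python
import numpy as np

def amp_be(z):
    """BE amplification factor for y' = mu*y, with z = mu*dt."""
    return abs(1.0 / (1.0 - z))

def amp_tr(z):
    """TR amplification factor."""
    return abs((1.0 + 0.5 * z) / (1.0 - 0.5 * z))

def amp_bdf2(z):
    """Largest root modulus of (3/2 - z) zeta^2 - 2 zeta + 1/2 = 0 (constant-step BDF2)."""
    roots = np.roots([1.5 - z, -2.0, 0.5])
    return float(np.max(np.abs(roots)))

# Slowly growing oscillation: lambda > 0 small, omega comparatively large (assumed values).
lam, omega, dt = 0.05, 2.0, 1.0
z = complex(lam, omega) * dt
exact = abs(np.exp(z))          # true growth per step, > 1

print("exact:", exact, "BE:", amp_be(z), "BDF2:", amp_bdf2(z), "TR:", amp_tr(z))
```

For these values the exact solution grows, TR also grows, and both BE and BDF2 decay, which is the behavior the Kheshgi-Scriven admonition warns about.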
Final remark on Index 2 DAE's: In 3D, these fully coupled solution methods are still generally considered by most practitioners to be too costly, even with smart integrators; a situation that we hope will not last forever. This remark in a sense 'justifies' most of the following subsections on time integration.
Final remark on Smart Integrators: Although they usually do the 'advertised' job (follow the physics), they are not infallible, although we strongly believe that they are always superior to fixed-step implicit integration methods. We also believe that there is room for further improvement in this area of CFD; the graduation from 'smart' to brilliant integrators is a noble goal. Some of the attributes of a brilliant integrator would probably be (at least) the following: (1) it never 'stalls' when $\Delta t$ should be increasing, (2) it always selects the proper norm in which to measure the error estimate, (3) it never needs to repeat a step, (4) it never misses any important small scale action, (5) it ... .

e. Discussion

In an important series of papers, Heywood and Rannacher have done much to realistically clarify the picture when the NS equations are solved approximately via GFEM + TR; see Heywood and Rannacher (1982, 1986a, 1988, 1990) and Rannacher (1990) for a 'summary'. They put forth a convergence theory and error estimates for virtually all finite elements (second- through fifth-order) used (and some not used) for the time-dependent index 2 NS equations via TR. The word 'realistically' refers to the fact that, prior to their work, the error analyses seen in the literature 'conveniently' assumed more smoothness/regularity than is realistic for the NS equations; in particular, vortex sheets had been prohibited by assuming that the tangential momentum equation(s) was (were) valid at t = 0 on $\Gamma$. As is now well known, and was the only 'regularity' assumption utilized by Heywood and Rannacher, only the normal component of the momentum equation applies on $\Gamma$ for all $t \ge 0$, and it is the BC for the PPE. The tangential equation(s) apply only for t > 0 and, as stated in Section 3.9.2, behave like the transient heat equation for small time (with $\nabla P$ known from the PPE at t = 0).
This behavior returns us to the discussion presented for the heat equation (with advection, usually) in Section 2.7.4f. Whereas the normal momentum equation satisfies both zeroth- and first-order compatibility conditions [$n \cdot u_0 = n \cdot w(0)$ and the normal momentum equation itself, respectively] and the initial velocity field must be divergence-free (no jumps permitted in the direction along streamlines), the tangential component(s) can tolerate much less regularity: it (they) can reside in $L_2$ or, more precisely, the initial velocity can be in $H(\mathrm{div}; \Omega)$; see Girault and Raviart (1986). A simple example of this situation would be an IC comprising a cylinder of fluid (of finite length if 3D, with ends orthogonal to the axis) in solid body rotation in an otherwise quiescent fluid, an example of a divergence-free IC of compact support. Another example is a box of fluid with an initial velocity determined/derived from an arbitrary vorticity field (arbitrary only up to the necessary constraint that $n \cdot u_0 = 0$ on $\Gamma$), which will generally display vortex sheets ($\omega = \infty$) on $\Gamma$ (at least); see Gresho (1992).

Heywood and Rannacher advocate, similar to the 'Rannacher philosophy' for the transient heat equation (and AD equation) espoused in the previous chapter (see Section 2.7.4f), the use of a 'smoother' in order to compensate for TR's lack of dissipation so that meaningful 'smoothing estimates' can be made. (This means, roughly, that the smooth solution for t > 0 is generally not smooth for $t \downarrow 0$; regularity is lost, in general.) Using only the above regularity assumption, and a quadratically conservative FEM formulation, they derive the following ($L_2$) error estimates for fixed step TR and a stable NS solution:

$$\|u_n - u(t_n)\| \le c\left(h^m / t_n^{m/2 - 1} + \Delta t^2 / t_n\right), \tag{3.16-253}$$
where $t_n = n\Delta t > 0$ and $\Delta t \sim h^2$. Here m ($2 \le m \le 6$) is the order of the FEM basis functions (m = 2 for linear approximation, etc.), and the estimates are called 'smoothing' estimates for the 'singular cases', $m \ge 3$. For a detailed discussion of the various notions and definitions regarding 'stability' of the solution, see Heywood and Rannacher (1986b). This result accounts for the general case in which the overdetermined Neumann problem (see Heywood and Rannacher, 1982) is not satisfied at t = 0; i.e., only the normal momentum equation applies on $\Gamma$ at t = 0, with the general result that $\|\nabla \partial u/\partial t\|_0$, $\|\partial u/\partial t\|_{H^1}$, and $\|u\|_{H^3}$ are unbounded as $t \downarrow 0$. If the TR is 'stabilized' via, for example, the appropriate sprinkling of a few BE steps (or BDF2), as discussed at the end of Section 2.7.4, then the stringent restriction that $\Delta t \sim h^2$ can be removed and (3.16-253) can still apply. Our position here, however, is basically the same as it is for the transient heat equation: a smart, careful application of the variable step TR (or, perhaps, BDF2) applied to the NS DAE's will also yield 'optimal' accuracy, albeit with $\Delta t \sim h^2$ for small t, in general (as selected by the smart integrator, not the analyst!). This approach will, we believe, generally be more cost-effective than a 'fixed $\Delta t$ TR + smoothing' approach; i.e., the total number of steps for a given simulation will be a minimum for a given, specified accuracy, and said accuracy will be achieved even at small time (where, at least in some cases, it may not be fully warranted).

f. Penalty method

We conclude this section with some remarks on the penalty method. If the penalty method is utilized to eliminate the pressure, then we return to an index 0 formulation, albeit a mighty stiff one, owing to the (necessarily) large penalty parameter. These ODE's look like [cf. (3.13-185)]

$$M\dot{u} + [K + N(u) + \lambda B]u = f + \lambda C Q^{-1} g, \tag{3.16-254}$$

where $\lambda \gg \nu$, $B = CQ^{-1}C^T$ is the penalty matrix (cf.
Section 3.13.2e), and Q is the pressure mass matrix. We mentioned in Section 3.5.3 that the penalized momentum equations, while reducing the system size by eliminating P, also 'penalized' the ODE's via the introduction of an extraneous and spurious penalty transient; i.e., a compression 'wave' that travels quickly through the domain until the penalty stiffness becomes virtually equilibrated. This extra stiff behavior effectively rules out all explicit time-integration methods, but the three implicit methods discussed above can easily deal with the problem, provided certain simple modifications are made to the algorithms. Since the spurious transient is of no physical significance, it can be quickly eliminated (overlooked) by taking advantage of the stiff stability of BDF1 and BDF2. The required changes (besides, of course, never seeing the pressure or the continuity equation in the system) are as follows, for all three: for the first timestep (only), use BE and a $\Delta t$ that is large relative to the penalty time constant, $\tau_P = L^2/\lambda$, where L is an appropriate characteristic length, yet small relative to the physical time scales of interest, $\tau_A = L/u_0$ and $\tau_D = L^2/\nu$, where $u_0$ is a characteristic velocity. Note that $\tau_P/\tau_D = \nu/\lambda \ll 1$ and that $\tau_P/\tau_A = Re \cdot \nu/\lambda$, which is also $\ll 1$ for moderate Reynolds numbers, so that such a $\Delta t$ selection should be feasible. Presuming that the initial velocity field is close to satisfying $C^T u_0 = g_0$; i.e., that one is trying to solve a physical (nearly incompressible) problem via 'penalty,' the velocity, $u_1 \equiv \tilde{u}_0$ (the new/effective IC) from the above step will still be nearly divergence-free,
and the penalty transient [especially in the implied pressure field; see (3.5-11)] will have been vanquished. If this step is regarded as a pre-start-up process, which is probably the best way to view it, then simply replace $u_0$ by $\tilde{u}_0$ and return to the general variable-step method for non-linear ODE's, obtained from the above three DAE index 2 implicit integrators simply by deleting the C-matrix, the pressure, and the continuity equation, and using (3.16-254) as the momentum equation.

Remarks:
(1) If $C^T u_0 - g_0$ is not small, then the resulting, nearly divergence-free adjusted velocity, $\tilde{u}_0$, may be 'strange' looking even though it satisfies $C^T \tilde{u}_0 - g_0 = O(1/\lambda)$, an example of which is presented in Sani et al. (1981b). (Even if $\tilde{u}_0$ is 'strange looking', it is still quite close to a discrete $L_2$-projection of $u_0$.)
(2) If $\Delta t$ cannot be large enough to fully kill the penalty transient in a single step, a situation that can be measured by examining the size of $C^T u_1 - g_1$ compared to $1/\lambda$, then it may be necessary to take several BE steps. (One properly selected step should generally suffice, however.)
(3) Another time scale of interest is $\tau_h = h^2/\nu$, where h is a measure of the grid spacing, typically for the smallest element. Another constraint on $\lambda$ is thus $L^2/\lambda \ll h^2/\nu$, or $\lambda \gg \nu (L/h)^2$, which is usually not so difficult to attain.
(4) In practice, $10^5 \le \lambda \le 10^9$ is fairly common and useful.

We conclude this penalty discussion by returning to the 'spirit' of the eigenvector expansions of Section 3.13.2d and 3.16.3 and briefly consider some analogous results a la penalty; i.e., we embark on a somewhat 'academic' diversion, which we hope is justifiable. And it is better to back up a bit and start with (3.13-183) before eliminating P; i.e., we first consider the index 1 system

$$M\dot{u} + Ku + CP = f \tag{3.16-255}$$

and

$$C^T u = g + \varepsilon Q P, \tag{3.16-256}$$

where $\varepsilon = 1/\lambda$, and we have reverted to Stokes flow because $N(u)u$ is not 'receptive' to simple linear analysis.
The associated (n + m)-dimensional eigenproblem is

$$K v_j + C q_j = \lambda_j M v_j \tag{3.16-257}$$

and

$$C^T v_j - \varepsilon Q q_j = 0, \tag{3.16-258}$$

rather than (3.16-210) and (3.16-211). The qualitative 'solution' of this eigenproblem, which is clearly an $O(\varepsilon)$ perturbation from the index 2 eigenproblem, is as follows (and we assume, for simplicity, no 'pressure modes').

1. There will be n - m 'div-free' modes [$C^T v_j = O(\varepsilon)$] with finite $\lambda_j$; both eigenvalues and eigenvectors will be $O(\varepsilon)$ from those of index 2.
2. There will be 2m values of $\lambda_j$ that are $O(1/\varepsilon)$; i.e., very large. To find the eigenvectors, we note that (3.16-257) and (3.16-258) imply

$$C^T (\lambda_j M - K)^{-1} C q_j = \varepsilon Q q_j \tag{3.16-259}$$

in general, and, for sufficiently large $\lambda_j$,

$$C^T M^{-1} C q_j = \varepsilon \lambda_j Q q_j \tag{3.16-260}$$

in particular. Since $\varepsilon \lambda_j = O(1)$, it seems clear that (3.16-260) is an approximation to (3.16-214) with $\varepsilon \lambda_j = \nu_j$. Thus, the first set of m eigenvectors is, considering again (3.16-257) with $\lambda_j = \nu_j/\varepsilon$,

$$\begin{pmatrix} v_j \\ q_j \end{pmatrix} = \begin{pmatrix} \varepsilon M^{-1} C \hat{q}_j / \nu_j + O(\varepsilon^2) \\ \hat{q}_j + O(\varepsilon) \end{pmatrix},$$

where $\hat{q}_j$ are the eigenvectors of (3.16-214). The second set of m eigenvectors corresponds to the generalized eigenvectors of the index 2 result; i.e., to (3.16-215), to within $O(\varepsilon)$, and we are done. The 'mixed' (index 1) penalty method approximates well, to $O(\varepsilon)$, the index 2 results.

But, you may object, the real (index 0) penalty method of interest contains no pressure. Thus we now return to (3.16-254), drop $N(u)u$, and consider the concomitant eigenproblem:

$$\left(K + \frac{1}{\varepsilon} C Q^{-1} C^T\right) v_j = \lambda_j M v_j, \tag{3.16-263}$$

which is of 'size' n. The solution of this eigenproblem will, we assert, display the following properties.

1. There will be the same n - m modes as from 'mixed penalty' with finite $\lambda_j$ that are nearly div-free; $C^T v_j = O(\varepsilon)$. These $\{v_j\}$ are $O(\varepsilon)$ from the $\{v_j\}$-portion of the index 2 eigenvectors from (3.16-212). (There is no analogous pressure-portion; pressure eigenvectors do not exist in this 'velocity-only' formulation.)

2. There are m eigenvalues having $\lambda_j = O(1/\varepsilon)$ with dilatational eigenvectors. From (3.16-263), we have $(\varepsilon K + CQ^{-1}C^T) v_j = \varepsilon \lambda_j M v_j$ in general and

$$C Q^{-1} C^T v_j = \varepsilon \lambda_j M v_j \tag{3.16-264}$$

in particular, for sufficiently small $\varepsilon$, and $C^T v_j \ne O(\varepsilon)$; i.e., the discrete divergence is finite. [Also noteworthy is that (3.16-264) also follows from eliminating $q_j$ from the pair (3.16-257) and (3.16-258).]
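The two asserted properties of the index 0 eigenproblem (3.16-263) are easy to examine numerically on a toy system. The sketch below makes several assumptions purely for illustration: a 6 x 6 tridiagonal K, a 6 x 2 'gradient' matrix C with orthogonal columns, and M = I, Q = I (so that the generalized problem reduces to a standard symmetric one). None of these matrices come from the text.

```python
import numpy as np

n, m = 6, 2
eps = 1.0e-8                                   # eps = 1/lambda (penalty parameter)

# Illustrative matrices (assumptions): M = I and Q = I.
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
C = np.zeros((n, m))
C[0, 0], C[1, 0] = 1.0, -1.0
C[2, 1], C[3, 1] = 1.0, -1.0
Q = np.eye(m)

B = C @ np.linalg.solve(Q, C.T)                # penalty matrix B = C Q^{-1} C^T
A = K + B / eps                                # operator of (3.16-263) with M = I

lam, V = np.linalg.eigh(A)                     # eigenvalues in ascending order
finite, large = lam[: n - m], lam[n - m:]

# Discrete divergence of the n - m modes with finite eigenvalues.
div_finite = np.abs(C.T @ V[:, : n - m]).max()

print("finite eigenvalues:", finite)
print("eps * large eigenvalues:", eps * large)
print("max |C^T v| over finite modes:", div_finite)
```

In this toy case the nonzero eigenvalues of B are both 2, so eps times the large eigenvalues should sit within O(eps) of 2, in the spirit of (3.16-264), while the n - m remaining modes are nearly discretely div-free.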
In addition to the n - m nearly div-free modes discussed above, (3.16-263) and (3.16-264) will admit m modes with finite divergence, which will be $O(\varepsilon)$ from the index 2 modes given by (3.16-215) and (3.16-214), here with $\lambda_j = \nu_j/\varepsilon$; i.e., large but not infinite, as with the second set of m modes from index 1 above. Thus, the index 0 eigensystem mimics both the div-free modes and the dilatational generalized eigenvector modes of the index 2 eigensystem, the latter being those that
cause satisfaction of $C^T u = g$. It does not (apparently/ostensibly) mimic the m infinite eigenvalues of index 2 that cause satisfaction of the PPE; PPE satisfaction via penalty would need to come 'after the fact.' But what we do learn from the above 'analysis' is the following important result (and one that we have verified numerically): since the penalty eigenproblem results are $O(\varepsilon)$ from those of index 2 (at least up to the PPE satisfaction), so too will be the penalty transient if the initial velocity field is far from div-free (or close, of course); i.e., the penalty transient will actually mimic the $L_2$-projection of index 2 to the (nearly) discretely div-free subspace, a result we have actually already seen in the three-equation model problem of Section 3.16.2e. (It is demonstrated again in Gresho and Sani, 1998.)

With these results more-or-less in hand, we now pose two interesting and related problems for the reader, part of which has already been presented in Gresho and Sani (1987). Consider a bounded domain containing a motionless fluid everywhere except in a (small) subdomain in which a non-zero velocity field is given by the stream function

$$\psi(x, y) = \sin^2[\pi (x - x_0)/l] \, \sin^2[\pi (y - y_0)/h], \tag{3.16-265}$$

for $0 \le x \le l$ and $0 \le y \le h$, a 'cellular' divergence-free IC contained in an $l \times h$ subdomain whose lower left corner is located at $(x_0, y_0)$, which is away from the boundaries of the full domain. This is an example of an IC with compact support (u = 0 at and near $\Gamma$). The two 'interesting' problems are these, on a set of node points that 'discretizes' $\Omega$.

1. Take as discrete initial data the interpolant of the velocity from (3.16-265); i.e., the exact velocity field is evaluated at the nodes in the subdomain. Noting/recalling that a div-free velocity is generally not discretely div-free leads to the second problem:

2. Take the discrete $L_2$ projection of the velocity from problem 1 as the IC.
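For readers who want to generate the nodal data of problem 1, the velocity implied by (3.16-265) can be evaluated as sketched below. This is a minimal sketch under assumptions made here: the sign convention $u = \partial\psi/\partial y$, $v = -\partial\psi/\partial x$, the particular cell location and size, and the function name `cell_velocity` are all illustrative, and the cell is taken to occupy $x_0 \le x \le x_0 + l$, $y_0 \le y \le y_0 + h$ with quiescent fluid elsewhere.

```python
import numpy as np

def cell_velocity(x, y, x0=0.2, y0=0.3, l=1.0, h=0.5):
    """Velocity (u, v) = (d psi/dy, -d psi/dx) for the 'cellular' stream function.

    Outside the l-by-h cell the fluid is at rest. Default geometry is illustrative.
    """
    X, Y = np.asarray(x) - x0, np.asarray(y) - y0
    inside = (X >= 0.0) & (X <= l) & (Y >= 0.0) & (Y <= h)
    u = (np.pi / h) * np.sin(np.pi * X / l) ** 2 * np.sin(2.0 * np.pi * Y / h)
    v = -(np.pi / l) * np.sin(2.0 * np.pi * X / l) * np.sin(np.pi * Y / h) ** 2
    return np.where(inside, u, 0.0), np.where(inside, v, 0.0)

# Check the continuum divergence with central differences at interior points.
xs = np.linspace(0.3, 1.1, 9)
ys = np.linspace(0.35, 0.75, 9)
Xg, Yg = np.meshgrid(xs, ys)
delta = 1.0e-5
dudx = (cell_velocity(Xg + delta, Yg)[0] - cell_velocity(Xg - delta, Yg)[0]) / (2.0 * delta)
dvdy = (cell_velocity(Xg, Yg + delta)[1] - cell_velocity(Xg, Yg - delta)[1]) / (2.0 * delta)
max_div = np.abs(dudx + dvdy).max()

# The velocity vanishes on the cell edges, so the IC joins the quiescent fluid continuously.
u_edge, v_edge = cell_velocity(np.array([0.2, 0.7, 1.2]), np.array([0.55, 0.3, 0.55]))
max_edge = max(np.abs(u_edge).max(), np.abs(v_edge).max())

print("max |div u| (finite differences):", max_div, " max edge speed:", max_edge)
```

The interpolated nodal values obtained this way are divergence-free only in the continuum sense; whether they satisfy the discrete constraint depends on the element and mesh, which is precisely the point of the two problems posed above.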
Consider now 'solving' (time-integrating) these two problems (with advection terms re-instated) in three different ways: (i) the index 2 DAE's (u-P formulation), (ii) the PPE/index 1 DAE's, and (iii) the index 0 penalty DAE's (ODE's). After realizing that problem 1 is ill-posed in the strongest (index 2) formulation, we are down to five computable solutions, whose qualitative behavior we shall predict, leaving the quantitative results to the interested reader (with the warning that 'the Devil is in the details'):

(1) Interpolated IC via index 1. This problem is solvable, but the solution will be, at least in part, non-physical; the initial non-zero divergence, $C^T u_0 \ne 0$, will be present for all time [$C^T u(t) = C^T u_0$] and, thus, the concomitant part of the pressure field will also be spurious. (If the mesh is sufficiently fine, the non-physical portion of the solution may be difficult to detect.)

(2) Interpolated IC via penalty. Here the initial pressure ($\varepsilon^{-1} Q^{-1} C^T u_0$) may be quite large, but by time $O(\varepsilon)$ the system would have recovered to be $O(\varepsilon)$ away from the $L_2$-projected IC of problem 2.

(3) Projected IC via index 2. A physical, domain-filling initial pressure for $t \ge 0$ (that is independent of viscosity and has $\partial P/\partial n = 0$ on $\Gamma$) will occur (i.e., not just in the IC's subdomain) and a physical flow would ensue.
(4) Projected IC via index 1. Ditto index 2; i.e., the results would be identical, and this is the example presented, but without any discussion of time integration, in Gresho and Sani (1987).

(5) Projected IC via penalty. This case is similar to that presented in Section 3.16.2e for the three-equation model in that the initial pressure is zero (and thus wrong), but will undergo a penalty transient that will recover the physical pressure, to $O(\varepsilon)$, while only very slightly changing the velocity field, again to $O(\varepsilon)$.

(6,7) Yes, there are actually two more cases that should be considered: each IC via penalty but not 'tracking' the penalty transient (e.g., just take one step with $\lambda \Delta t_0 = 100$ via BE). After a single time step, both solutions should be $O(\varepsilon)$ from the $L_2$ projected IC, and variable-step time integration may now commence. And this is the way the penalty method should be employed in practice.

The penalty transients from cases (2) and (5) would be interesting to compare in that they are very different (in the pressure) yet would both lead to the same result, to $O(\varepsilon)$ in time $O(\varepsilon)$: the $L_2$ projected velocity field and its corresponding pressure. A final exercise is to reconsider the above problem for Stokes flow, for which the exact initial pressure is zero. (Why zero?)

We are now finished with 'honest' (rigorous) implicit methods that choose $\Delta t$ based on the physics and have no need for mass lumping, but generate the largest possible sets of coupled equations; i.e., robustness exacts its price. In the rest of this section, therefore, we investigate some cost-cutting alternatives, beginning with an explicit method applied to the index 1 DAE's.

3.16.5 An Explicit (Index 1) Method, Plus a Few Tricks

The tricks will be mainly applied to one particular element, $Q_1Q_0$, but the explicit method to be described next can be applied to any element for which mass lumping makes sense. It was broached in Gresho et al.
(1980a), first implemented in Gresho et al. (1981), with an additional simplification called 'centroid advection,' and described in detail in Gresho et al. (1984b, c). We begin with the 'general' element and, at the end, specialize to $Q_1Q_0$. Also, whereas we will describe only the simplest explicit method, there seems to be no reason that others cannot be used. The forward Euler method is described only because it is still quite popular and it is the simplest possible way to make a code. We return to (3.16-52) and (3.16-53) but this time in full dress form (FEM): given $u_n$ with $C^T u_n = g_n$, the algorithm is simply as follows, for n = 0, 1, ...:

Step 1. Solve

$$(C^T M_L^{-1} C) P_n = C^T M_L^{-1} [f_n - K u_n - N(u_n) u_n] - (g_{n+1} - g_n)/\Delta t \equiv C^T a_n - (g_{n+1} - g_n)/\Delta t \tag{3.16-266}$$

for $P_n$, where $a_n$ is a partial acceleration (sans $\nabla P$).

Step 2. Update the velocity via

$$u_{n+1} = u_n + \Delta t (a_n - M_L^{-1} C P_n) \equiv u_n + \Delta t \, \dot{u}_n, \tag{3.16-267}$$

where $\dot{u}_n$ is the total acceleration vector.

Step 3. Bump n and go to Step 1 unless it is time to stop.
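The three steps above amount to only a few lines of linear algebra. The following is a minimal dense-matrix sketch, not the implementation of the cited papers: the function name `fe_ppe_step`, the optional `advect` callable (standing in for $N(u)u$), and the small test matrices are assumptions made for illustration. The lumped mass matrix is stored as the vector of its inverse diagonal.

```python
import numpy as np

def fe_ppe_step(u, dt, ml_inv, K, C, f, g_now, g_next, advect=None):
    """One forward Euler step of (3.16-266) and (3.16-267).

    ml_inv : 1D array holding the diagonal of M_L^{-1}
    advect : optional callable returning the vector N(u)u (hypothetical helper)
    """
    conv = advect(u) if advect is not None else np.zeros_like(u)
    a = ml_inv * (f - K @ u - conv)                    # partial acceleration a_n
    A = C.T @ (ml_inv[:, None] * C)                    # C^T M_L^{-1} C
    P = np.linalg.solve(A, C.T @ a - (g_next - g_now) / dt)
    u_new = u + dt * (a - ml_inv * (C @ P))            # total acceleration update
    return u_new, P

# Small illustrative system (assumed data, not from the text).
n, m = 6, 2
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
C = np.zeros((n, m))
C[0, 0], C[1, 0] = 1.0, -1.0
C[2, 1], C[3, 1] = 1.0, -1.0
ml_inv = 1.0 / np.arange(1.0, n + 1.0)
f = np.linspace(0.0, 1.0, n)
g = np.zeros(m)

u = np.array([1.0, 1.0, 2.0, 2.0, 0.0, 0.0])           # satisfies C^T u = g = 0
for _ in range(50):
    u, P = fe_ppe_step(u, 0.01, ml_inv, K, C, f, g, g)

max_div = np.abs(C.T @ u - g).max()
print("max |C^T u - g| after 50 steps:", max_div)
```

Because the pressure is obtained from a direct solve here, the discrete divergence stays at round-off level; with an inexact PPE solve it would not.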
Remarks:
(1) $M_L$ is a diagonal, lumped mass matrix, and we no longer have a GFEM (or at least an 'honest' GFEM); our FEM has moved a step toward FDM. Mass lumping also, of course, restricts the choice of elements (to those for which lumping is 'allowed').
(2) The final velocity is discretely divergence-free, $C^T u_{n+1} = g_{n+1}$, unless $C^T u_0 \ne g_0$, in which case the initial divergence error is forever present: $C^T u_{n+1} = g_{n+1} + C^T u_0 - g_0$, a la (3.16-7).
(3) If (3.16-266) is not solved sufficiently accurately, then spurious divergence may accumulate. There are two known ways to remedy this problem: (i) 'penalize' the PPE by replacing $g_n$ by $C^T u_n$ on the RHS of (3.16-266), (ii) periodically project the velocity to the divergence-free subspace (see next section, and Appendix 3); an example of the latter is given in Dupont and Marchal (1988).

The price paid for the simplicity is instability: only $\Delta t$ less than some stability-limited $\Delta t_{max}$ will preclude blow up. And this is over and above the price, in loss of accuracy, already paid for mass lumping. And $\Delta t_{max}$ is, in general, not known! Thus, rather than addressing the issues covered in the previous section on implicit integration (efficiency issues mainly, but not simple ones), the main issue for explicit methods, besides the single linear algebra issue of how best to solve (3.16-266), is how to 'guesstimate' $\Delta t_{max}$. Another issue, rarely addressed in the literature (see Paolucci, 1990, for an exception), is related to accuracy: even if I can find $\Delta t_{max}$, how do I know it is small enough to be 'sufficiently' accurate? Most users of explicit methods seem to believe (at least implicitly) that $\Delta t_{max}$ is so small (ridiculously so, sometimes) that the results must be accurate; indeed, more accurate than warranted for the given mesh. Without further ado and totally without justification, we shall henceforth make the same (religious?)
assumption and only justify it by offering the following advice: just as a single-grid simulation (one FEM discretization) is not a good simulation practice (see Volume II for more on this subject), so too is a single $\Delta t$ simulation not generally tenable. If you perform a simulation at or near $\Delta t_{max}$ (which is frequently found experimentally, being guided by those estimates to be soon described), perform another one at one-half this step size; large differences tell you to 'keep going,' small ones are a sign of relief.

Rather than advocating the FE algorithm just described, we advocate (but not too strongly) the BTD improvement described in Section 2.7.2e for the scalar transport equation, because the stability limits for NS are at least as stringent as those for AD (FE still generates negative diffusion). But we also include another warning: the simple replacement of the viscosity, $\nu$, by a balancing tensor viscosity, $\nu I + uu\Delta t/2$, does not have quite as simple a consequence for the vector-valued PDE's as it did for the simple scalar equation. In particular, the term $\nabla \cdot (uu \cdot \nabla u)$, when represented in Cartesian coordinates (as in conventional FEM codes), does not 'transform' properly. For example, transformation to 'intrinsic coordinates' (parallel and normal to streamlines) of the 2D Euler equations ($\nu = 0$, for simplicity) with BTD, $\partial u/\partial t + u \cdot \nabla u + \nabla P = \tfrac{1}{2} \Delta t (u \cdot \nabla)^2 u$, yields

$$\partial q/\partial t + q \, \partial q/\partial s + \partial P/\partial s = \Delta t \, q^2 (\partial^2 q/\partial s^2 - \kappa^2 q)/2 \tag{3.16-268}$$

for the streamline component of the velocity ($q = |u|$), where s is the streamline coordinate and $\kappa$ is the principal curvature, which is the culprit; rather than simply $(q^2 \Delta t/2) \, \partial^2 q/\partial s^2$, which is streamline diffusion, the BTD 'curvature crisis' (a la Gresho and Chan, 1990)
shows the undesirable introduction of a damping term. Another way to see the 'problem' is to seek an 'equivalent operator', as in Section 2.7.2e, which, when integrated via FE, would give the right result to $O(\Delta t^2)$. This yields, again for the inviscid case,

$$\frac{\partial u}{\partial t} + u \cdot \nabla u + \nabla P = \frac{\Delta t}{2} \left[ \nabla \cdot (uu \cdot \nabla u) + \nabla P \cdot \nabla u + (u \cdot \nabla) \nabla P + (u \cdot \nabla u) \cdot \nabla u - \frac{\partial \nabla P}{\partial t} \right], \tag{3.16-269}$$

in which only the first term on the RHS is 'BTD.' If one cared to and was able to incorporate the other four correction terms, perhaps the forward Euler fix would really be proper. Rather than try or suggest such silliness, we recommend switching at least the advection terms from FE to one of the higher-order explicit methods discussed earlier: AB2, AB3, or RK4. We present and further discuss FE/BTD only because we and others have quite a lot of experience with it, mostly good, in spite of the theoretical shortcomings just exposed. Then, because we know no better and because it works reasonably well, we assume that using BTD as in the scalar equation leads to the same estimates for $\Delta t_{max}$; i.e., see (2.7-47) and related discussion.

But on the assumption (!) that both the CFL and the diffusional stability limit cause $\Delta t_{max}$ to be much smaller than necessary for a sufficiently accurate integration, we now discuss the concept of 'subcycling,' a cost-effective trick first introduced in Chan et al. (1981; and in detail in Gresho et al., 1984b), and one that frequently works well. The object is to further reduce cost by updating the pressure less frequently than once per timestep, as in the presented algorithm. And here we do reintroduce some accuracy concepts based on the LTE, and utilize the fact that (at least in the absence of forcing) the pressure-velocity coupling via mass conservation has no impact on the stability (since $u_{n+1}^T C P_n = P_n^T C^T u_{n+1} = 0$, the pressure term drops out of the kinetic energy equation).
One of the goals of the method is to perform the expensive part of the problem (solving the PPE) no more frequently than would a BE method with error control. A four-step summary of the idea is given first, followed by the details (in which the Devil resides):

1. The minor (smaller, and fixed) timestep based on stability estimates is used to compute, presumably very accurately, the advection and diffusion terms (these processes are 'subcycled' for a predetermined number of steps), and the pressure gradient is approximated (guessed) via linear extrapolation. Hence, the PPE (or, equivalently, the mass conservation equation) is not needed during subcycling; it is ignored.

2. Project the now non-divergence-free velocity to the divergence-free subspace and take the result as the velocity solution. (Because we do not use the consistent mass matrix for the projection, we do not have an $L_2$-orthogonal projection to the nearest divergence-free velocity, but it is close; see Appendix 3.)

3. Solve the PPE for the concomitant pressure field.

4. Compute the next major timestep (the timestep for the pressure) via local error estimates.

Figure 3.16-9 shows the process schematically; $u_n$ is the divergence-free velocity, $P_n$ is its corresponding pressure, $\tilde{u}_{n+1}$ and $\tilde{P}_{n+1}$ are the 'intermediate' velocity and pressure at the end of subcycling, $\Delta t_s$ is the minor (subcycle) timestep based on stability, and
$\Delta t_n$ is the major (and variable) timestep. (Actually, as seen below, $\tilde{P}_{n+1}$ is not actually needed, nor is it computed.)

Fig. 3.16-9 A schematic of the subcycling process.

The details of the subcycling method are as follows:

1. The subcycling process. Given $u_n$ at $t_n$ with $C^T u_n = g_n$ and $S \, (= \Delta t_{n+1}/\Delta t_s)$, the subcycle ratio (whose value will be discussed later), the following ODE's are solved via FE:

$$M_L \dot{\tilde{u}} + K\tilde{u} + N(\tilde{u})\tilde{u} + C\tilde{P}(t) = f; \tag{3.16-270}$$

i.e., for m = 0, 1, ..., S - 1, with $u_0 = u_n$,

$$u_{m+1} = u_m + \Delta t_s M_L^{-1} [f_m - K u_m - N(u_m) u_m - C \tilde{P}_m], \tag{3.16-271}$$

where $\tilde{P}_m = P_n + (P_n - P_{n-1})(m \Delta t_s)/\Delta t_{n+1}$.

2. Project $u_S \, (= \tilde{u}_{n+1})$ to the divergence-free subspace and call the result $u_{n+1}$:

$$M_L (u_S - u_{n+1}) = C\lambda \quad \text{and} \quad C^T u_{n+1} = g_{n+1}; \tag{3.16-272}$$

i.e., solve $(C^T M_L^{-1} C)\lambda = C^T u_S - g_{n+1}$ for $\lambda$ (the Lagrange multiplier of the projection), and then compute $u_{n+1} = u_S - M_L^{-1} C \lambda$ and discard $\lambda$. (See Appendix 3 for 'projection theory,' and details.)

3. Solve the PPE for $P_{n+1}$,

$$(C^T M_L^{-1} C) P_{n+1} = C^T M_L^{-1} [f_{n+1} - K u_{n+1} - N(u_{n+1}) u_{n+1}] - (g_{n+1} - g_n)/\Delta t_{n+1}. \tag{3.16-273}$$

4. Determine the next major step size, $\Delta t_{n+2}$; this is the heart of the matter, and the most tenuous part. We would like to base the major step size on specified local accuracy (as usual), but we have corrupted the usual ODE (or DAE) problems with our ad hoc procedure of extrapolate, subcycle, project, and the local error estimate is thus trickier than usual. In fact, we have two of them, one more conservative than the other and neither
really rigorous. The conservative one simply ignores the fact that the modified ODE is solved with small minor steps and bases the local error on major steps only, as if full FE were used at the major step sizes. It is thus simply

$$d_{n+1} = u_{n+1} - u(t_{n+1}) = -\Delta t_{n+1}^2 \ddot{u}_n / 2 \tag{3.16-274}$$

a la (2.7-7), and we approximate $\ddot{u}_n$ by $(\dot{u}_{n+1} - \dot{u}_n)/\Delta t_{n+1}$, where both $\dot{u}_{n+1}$ and $\dot{u}_n$ are computed directly from the ODE's, $\dot{u}_k = M_L^{-1} [f_k - K u_k - N(u_k) u_k - C P_k]$, k = n or n + 1, to give $d_{n+1} = -\Delta t_{n+1} (\dot{u}_{n+1} - \dot{u}_n)/2$. The standard ODE theory then gives the next major step size as

$$\Delta t_{n+2} = \Delta t_{n+1} (\varepsilon / \|d_{n+1}\|)^{1/2}, \tag{3.16-275}$$

where $\varepsilon$ and $\|\cdot\|$ are as before (for implicit solvers).

The less conservative estimate attempts to account for the subcycling process (at $\Delta t_s$) and the pressure extrapolation during same. It is also less easy to derive, beginning with

$$e_{n+1} = u_{n+1} - u(t_{n+1}) = \wp u_S - u(t_{n+1}) \tag{3.16-276}$$

as the local error, where $\wp = I - M_L^{-1} C (C^T M_L^{-1} C)^{-1} C^T$ is the projection matrix [see (3.16-13) and Appendix 3]; $e_{n+1}$ is the local error on the assumption that $u_0 = u_n = u(t_n)$. {Recall that, via $\wp$, the original DAE's can be expressed as $\dot{u} = \wp [M_L^{-1} (f - Ku - N(u)u)] + M_L^{-1} C (C^T M_L^{-1} C)^{-1} \dot{g}$; see (3.16-13).} The analysis proceeds by expressing $e_{n+1}$ as the sum of a Subcycle (plus projection) error and an Extrapolation error,

$$e_{n+1} = e^S_{n+1} + e^E_{n+1}, \tag{3.16-277}$$

where $e^S_{n+1} = \wp [u_S - \tilde{u}(t_{n+1})]$, $\tilde{u}(t_{n+1})$ is the exact solution of the subcycled ODE, (3.16-270), and

$$e^E_{n+1} = \wp \tilde{u}(t_{n+1}) - u(t_{n+1}) \tag{3.16-278}$$

is the extrapolation error [the error remaining if $\Delta t_s \to 0$ (with $\Delta t_{n+1}$ fixed) and thus $e^S_{n+1} \to 0$]. We treat these two error contributions in turn, starting with $e^S_{n+1}$, whose local (single subcycle step) error is obviously $-\Delta t_s^2 \ddot{\tilde{u}}_m / 2$. Next, from the theory of global error for ODE's, the accumulated local error at time $\Delta t_{n+1}$ can be shown to be, approximately, $-\Delta t_{n+1} \Delta t_s \ddot{\tilde{u}}_n / 2$ (A.C. Hindmarsh, personal communication); i.e., we have $u_S - \tilde{u}(t_{n+1}) = -\Delta t_{n+1} \Delta t_s \ddot{\tilde{u}}_n / 2$ and thus

$$e^S_{n+1} = \wp [u_S - \tilde{u}(t_{n+1})] = -\Delta t_{n+1} \Delta t_s \, \wp \ddot{\tilde{u}}_n / 2.$$
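(3.16-279) To simplify the ensuing analysis, we take $g = 0$ (the most common situation in practice) and define $h(u) \equiv M_L^{-1}(f - Ku - N(u)u)$ so that the NS ODE reads $\dot{u} = \mathcal{P}h(u)$, and the subcycle ODE reads $\dot{\tilde{u}} = h(\tilde{u}) - M_L^{-1}CP(t)$. Thus, $\ddot{\tilde{u}} = h_u(\tilde{u})\dot{\tilde{u}} - M_L^{-1}C\dot{P}$, where $h_u(u) \equiv \partial h(u)/\partial u$ is a Jacobian matrix. We also have $\ddot{u} = \mathcal{P}h_u(u)\dot{u}$ in general and $\ddot{u}_n = \mathcal{P}h_u(u_n)\dot{u}_n$ in particular. Thus, using $\mathcal{P}M_L^{-1}C = 0$ gives $\mathcal{P}\ddot{\tilde{u}} = \mathcal{P}h_u(\tilde{u})\dot{\tilde{u}}$ and $\mathcal{P}\ddot{\tilde{u}}_n = \mathcal{P}h_u(\tilde{u}_n)\dot{\tilde{u}}_n$. But we have $\tilde{u}_n = u_n$, and it follows easily that $\dot{\tilde{u}}_n = \dot{u}_n$, so that $\mathcal{P}\ddot{\tilde{u}}_n = \mathcal{P}\ddot{u}_n = \ddot{u}_n$ since $\ddot{u}_n$ is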
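divergence-free, and we are done once we approximate $\ddot{u}_n$ by $(\dot{u}_{n+1} - \dot{u}_n)/\Delta t_{n+1}$; i.e., we have

$$e^S_{n+1} = -\Delta t_{n+1}\Delta t_s\,\mathcal{P}\ddot{\tilde{u}}_n/2 = -\Delta t_{n+1}\Delta t_s\,\ddot{u}_n/2 = -\Delta t_s(\dot{u}_{n+1} - \dot{u}_n)/2. \tag{3.16-280}$$

Switching now to $e^E_{n+1}$, we start by utilizing the fact that it is a continuous function of time to write

$$
\begin{aligned}
e^E_{n+1} &= e^E_n + \Delta t_{n+1}\,\dot{e}^E_n + \frac{\Delta t_{n+1}^2}{2}\,\ddot{e}^E_n + \frac{\Delta t_{n+1}^3}{6}\,\dddot{e}^E_n + O(\Delta t_{n+1}^4) \\
&= \mathcal{P}\tilde{u}_n - u_n + \Delta t_{n+1}(\mathcal{P}\dot{\tilde{u}}_n - \dot{u}_n) + \frac{\Delta t_{n+1}^2}{2}(\mathcal{P}\ddot{\tilde{u}}_n - \ddot{u}_n) + \frac{\Delta t_{n+1}^3}{6}(\mathcal{P}\dddot{\tilde{u}}_n - \dddot{u}_n) + O(\Delta t_{n+1}^4) \\
&= \frac{\Delta t_{n+1}^3}{6}(\mathcal{P}\dddot{\tilde{u}}_n - \dddot{u}_n) + O(\Delta t_{n+1}^4).
\end{aligned}
\tag{3.16-281}
$$

Further effort would show that $\mathcal{P}\dddot{\tilde{u}}_n - \dddot{u}_n \sim (\dot{\tilde{P}}_n - \dot{P}_n) = O(\Delta t_n)$, where $\tilde{P}$ denotes the extrapolated pressure, to give $e^E_{n+1} = O(\Delta t_{n+1}^4)$ for the linearly extrapolated $P(t)$. The details are not necessary, however, because even if we used constant extrapolation, $P(t) = P_n$, the result would be $e^E_{n+1} = O(\Delta t_{n+1}^3)$, which is high-order relative to the $O(\Delta t_{n+1}\Delta t_s)$ from the subcycle plus projection error. Thus we can neglect $e^E_{n+1}$ relative to $e^S_{n+1}$ to arrive at our second error estimate,

$$e_{n+1} = -\Delta t_s(\dot{u}_{n+1} - \dot{u}_n)/2, \tag{3.16-282}$$

and the concomitant timestepping strategy

$$\Delta t_{n+2} = \Delta t_{n+1}\,(\varepsilon/\|e_{n+1}\|), \tag{3.16-283}$$

a 'linear' (in $1/\|e_{n+1}\|$) relationship and thus faster changing than the inverse square-root relationship given by the more conservative estimate, (3.16-275). The ratio of the two estimates, from (3.16-275) and (3.16-283), $d_{n+1}/e_{n+1} = \Delta t_{n+1}/\Delta t_s$, is in fact $S$, the subcycle ratio, which can be significant [O(10) or more]. Thus, using either (3.16-275) or (3.16-283), the next subcycle ratio,

$$S = \Delta t_{n+2}/\Delta t_s, \tag{3.16-284}$$

which is naturally rounded down to the nearest integer, is available. The subcycling method via FE plus BTD is complete.

Remarks:

(1) For start-up, set $\Delta t_1 = \Delta t_s$ and solve (3.16-273) with $n = -1$ and $(g_0 - g_{-1})/\Delta t$ replaced by $\dot{g}_0$ to get $P_0$. Take one FE step to get $u_1$. Solve (3.16-273) with $n = 0$ to get $P_1$. Compute $\Delta t_2$ from (3.16-275) or (3.16-283) with $n = 0$ and then $S$ from (3.16-284). Set $n = 1$ and go to (3.16-271).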
(2) Subcycling obviously only makes sense when $S > 2$, since two Poisson equations are solved for each cycle (major time step). If $S \gg 2$ (e.g., 10, 20, 50), then the procedure can be quite cost-effective.

(3) In our experience it seems that often the conservative estimate is too conservative, and the other estimate is too 'jumpy' (rapid adjustments to $S$). A reasonable compromise, although totally ad hoc, might be to average the two estimates, either arithmetically or, perhaps, geometrically.

(4) If an iterative method is used to solve the PPE without subcycling, then it may be a good idea to add a 'penalizing' term to its RHS to help control possible error buildup in the velocity divergence; i.e., replace $g_n$ on the RHS of (3.16-266) by $C^T u_n$. [Then, the operation $C^T u_{n+1}$ in (3.16-267) will give $g_{n+1}$ rather than $C^T u_n + (g_{n+1} - g_n)$ on the RHS.] This procedure has been called 'divergence cleaning' in electromagnetics (e.g., Ramshaw, 1983). Another possibility here is the periodic projection (e.g., every 20 timesteps) of the current velocity to the divergence-free subspace, à la, for example, Dupont and Marchal (1988), as mentioned earlier.

We now specialize to the $Q_1Q_0$ element and offer an additional cost-savings device, also first published in detail in Gresho (1984b), but exposed earlier in Chan et al. (1981) and Chan and Gresho (1982). This additional short cut, which further converts/subverts(?) the GFEM to what might be called an 'isoparametric FDM,' is not without some pain, however, and requires an additional and inconvenient 'patch job.'
Borrowed from the solid mechanics community, this short cut is most simply described as one-point quadrature; rather than paying the cost of 'full quadrature' to get the Galerkin integrals (the coefficient matrices) right (or close to right for distorted elements and the $K$-matrix), the simple expedient of centroid evaluation (approximation) of all integrals can significantly reduce the cost of running a code. It of course also reduces the accuracy; but, on balance, the technique has often proven to be cost-effective, at least when measured against the explicit method just described when 2 x 2 x 2 Gaussian quadrature is employed (eight times as many quadrature points). The method to be described here is an evolution, in $M_L$ and $C$, of that proposed and tested in Gresho et al. (1984c). In particular, the one-point scheme we currently advocate is the following, first in an abbreviated form:

1. $M_L$ is generated by full (3 x 3 x 3) quadrature on $M$, which is then row-summed.

2. The $C$-matrix is generated via 2 x 2 x 2 quadrature for both pressure gradient and velocity divergence. (Earlier experience with one-point on $C$ caused noticeable loss of mass in 3D with distorted elements. If your code is 2D only or 3D on pure bricks, one-point on $C$ is fine since it is then exact.)

3. A second $C$-matrix, say $\tilde{C}$, is generated via one-point quadrature because, as shown below, it can be efficiently utilized to generate $K$, $N(u)$, and the BTD matrix 'on the fly.'

Additional details and remarks:

1. One-point quadrature on the non-linear advection terms simplifies them considerably and, at least on reasonable grids, gives the same result as full quadrature after the 'centroid advection' approximation is made; i.e., for the $N^e_{ij}(u)$-matrix, $\int_e \varphi_i\, u^h \cdot \nabla\varphi_j$, take $u^h$ at the element centroid as representative of that everywhere in the element to obtain $N^e_{ij}(u) =$
$\bar{u}_k \int_e \varphi_i\, \partial\varphi_j/\partial x_k$, in which $\bar{u}_k$ is the average of all nodal values of the $k$-th component of velocity. This serious simplification, reducing otherwise triple product integrals to double, is of course related to the group finite element (product approximation) advocated by Fletcher and others; see Fletcher (1991). This approximation, as already discussed in the previous chapter, modifies the Galerkin weighting/averaging from (1 4 1)/6 to (1 2 1)/4 and is quite easily stated in words: the average (centroid) velocity in each element is multiplied by the average gradient of the advected variable in that element and the result averaged (volume-weighted) over those elements sharing the node in question.

2. The one-point $C$-matrix is defined, on element $e$, by

$$\tilde{C}^e_{ik} = -(\partial\varphi_i/\partial x_k)_0\,\Omega_e, \tag{3.16-285}$$

where $(\;)_0$ indicates centroid evaluation, and $\Omega_e$ is the element 'size' (correctly integrated, 2 x 2 x 2 for general 3D elements, one-point otherwise). The anti-symmetric nature of $(\partial\varphi_i/\partial x_k)_0$ allows for the computation (and storage, if stored) of only one-half of the full $\tilde{C}$-matrix (i.e., in 3D, 8 x 3 goes to 4 x 3).

3. The one-point advection 'matrix' ('matrix'-valued vector product, actually) is easily formed from $\tilde{C}$ as follows, using $T$ as the typical advected variable (it could also be $u$ or $v$ or $w$):

$$N^e_{ij}T_j = \bar{u}_k T_j \int_e \varphi_i\, \frac{\partial\varphi_j}{\partial x_k}\, d\Omega = \bar{u}_k T_j\, \varphi_i(0)\, (\partial\varphi_j/\partial x_k)_0\, \Omega_e = -\varphi_i(0)\, \bar{u}_k\, \tilde{C}^e_{jk} T_j, \tag{3.16-286}$$

which, since $\varphi_i(0) = 1/4$ in 2D and $1/8$ in 3D for all $i$, is seen to be a simple scalar, to be distributed equally to each node in $\Omega_e$.

4. Similarly for the diffusion matrix, we obtain (using $T$ again to represent any diffused quantity), for element $e$,

$$K^e_{ij}T_j = T_j \int_e \frac{\partial\varphi_i}{\partial x_k}\frac{\partial\varphi_j}{\partial x_k}\, d\Omega = T_j\, (\partial\varphi_i/\partial x_k)_0 (\partial\varphi_j/\partial x_k)_0\, \Omega_e = T_j\, (\tilde{C}^e_{ik}/\Omega_e)\, \tilde{C}^e_{jk} \;\text{(no sum on } e) = -\tilde{C}^e_{ik}\, (\partial T/\partial x_k)_0. \tag{3.16-287}$$

5.
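The one-point quadrature implementation of the BTD matrix, say $B(u)$, is also rather efficient (again on $T$);

$$B^e_{ij}(u)\,T_j = \bar{u}_k \bar{u}_l T_j \int_e \frac{\partial\varphi_i}{\partial x_k}\frac{\partial\varphi_j}{\partial x_l}\, d\Omega = \bar{u}_k \bar{u}_l\, \tilde{C}^e_{ik}\, \tilde{C}^e_{jl}\, T_j/\Omega_e \;\text{(no sum on } e), \tag{3.16-288}$$

and 'looks like' $\int (u \cdot \nabla\varphi_i)(u \cdot \nabla T)$. Adding (3.16-288) to (3.16-287) is the BTD improvement over 'straight' FE.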
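The element-level formulas (3.16-285) through (3.16-288) can be sketched in a few lines. What follows is only an illustration under stated assumptions, not the authors' code: it is restricted to a single 2D bilinear quadrilateral (where one-point evaluation of the element size is exact), the function names are hypothetical, and the diffusivity and the BTD time-step factor are left out, as they are in the formulas above.

```python
import numpy as np

# Reference-element signs of the four nodes of a bilinear quad, counterclockwise.
XI = np.array([-1.0, 1.0, 1.0, -1.0])
ETA = np.array([-1.0, -1.0, 1.0, 1.0])

def one_point_element(xy):
    """xy: (4, 2) nodal coordinates.
    Returns (C, area) with C[i, k] = -(d phi_i / d x_k)_0 * area, as in (3.16-285)."""
    dphi_ref = 0.25 * np.stack([XI, ETA], axis=1)   # d phi_i / d xi_a at the centroid
    J = dphi_ref.T @ xy                             # J[a, k] = d x_k / d xi_a at the centroid
    area = 4.0 * np.linalg.det(J)                   # exact for a 2D bilinear quad
    dphi = np.linalg.solve(J, dphi_ref.T).T         # d phi_i / d x_k at the centroid
    return -dphi * area, area

def advection_apply(C, ubar, T):
    """(3.16-286): one scalar per element, given equally to each node (phi_i(0) = 1/4 in 2D)."""
    s = -0.25 * ubar @ (C.T @ T)
    return np.full(4, s)

def diffusion_apply(C, area, T):
    """(3.16-287): K_ij T_j = C_ik C_jk T_j / area."""
    return C @ (C.T @ T) / area

def btd_apply(C, area, ubar, T):
    """(3.16-288): B_ij T_j = ubar_k ubar_l C_ik C_jl T_j / area."""
    return (C @ ubar) * (ubar @ (C.T @ T)) / area

# A mildly distorted element and a linear field T = 1 + 2x - 3y.
xy = np.array([[0.0, 0.0], [2.0, 0.0], [2.5, 1.5], [0.0, 1.0]])
C, area = one_point_element(xy)
T = 1.0 + 2.0 * xy[:, 0] - 3.0 * xy[:, 1]
grad_T = -C.T @ T / area      # centroid gradient recovered from C
```

Only the 4 x 2 array `C` and the element size are needed to apply all three operators, which is the sense in which they can be formed 'on the fly.' For the linear field in the example, `grad_T` reproduces the gradient (2, -3), and rows 0 and 2 (and 1 and 3) of `C` are negatives of each other, which is the anti-symmetry that halves the storage.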
These describe how a 'streamlined' code can be generated via one-point quadrature. Now we turn to the other side of the ledger and admit to a diffusional deficiency. (The advectional deficiency, both for LM and for one-point quadrature, has been thoroughly discussed already, in Chapter 2.) Just as the simplification of forward Euler integration on the hyperbolic terms required a truncation error 'correction' term, so too does the one-point approximation to $\nabla^2$ terms require some correction terms. These are described in the solid mechanics literature (see, for example, Hughes, 1987) as 'hourglass correction' or 'hourglass control,' since the Lagrangian-based elements can deform into zero-energy modes that resemble an hourglass. In our Eulerian framework, of course, the element shape stays constant, and our manifestation of these 'zero energy' modes shows up as a singular $K$-matrix. The amount of work that has gone into patching this problem is impressive, and nearly all of it comes from the solid mechanics community (see, for example, Liu et al., 1985, and references therein), although there are exceptions (e.g., Mallet et al., 1992). The singularity in the $K$-matrix, which actually is truly singular only for Neumann or periodic BC's (Dirichlet BC's add a stabilizing influence, but not enough to sufficiently suppress the deleterious effects), shows up as oscillatory null vectors (and near null vectors, i.e., eigenvectors with small eigenvalues), which make wiggles when excited by the 'data,' especially short wave data. Since advection cannot move these modes very well [not at all for the null vectors ('$2\Delta x$'), which have zero phase speed], they must be dealt with explicitly, by adding truncation error correction terms to the one-point ('under-integrated') $K$-matrix.
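The singularity is easy to exhibit on a single element. The sketch below is an illustration only (a 2D bilinear quadrilateral, a hypothetical function name, unit diffusivity): it forms the element Laplacian by 1 x 1 and by 2 x 2 Gauss quadrature and compares their null spaces. The final line adds the outer product of the oscillatory null vector with itself, which is the simplest correction of the rank-one type; the unit scaling used there is an arbitrary choice for the demonstration, not a recommended coefficient.

```python
import numpy as np

XI = np.array([-1.0, 1.0, 1.0, -1.0])
ETA = np.array([-1.0, -1.0, 1.0, 1.0])

def stiffness(xy, npts):
    """Element Laplacian K_ij = int grad(phi_i) . grad(phi_j), npts x npts Gauss quadrature."""
    pts, wts = np.polynomial.legendre.leggauss(npts)
    K = np.zeros((4, 4))
    for a, wa in zip(pts, wts):
        for b, wb in zip(pts, wts):
            # Reference gradients of phi_i = (1 + xi_i xi)(1 + eta_i eta)/4 at (a, b).
            dref = 0.25 * np.stack([XI * (1 + ETA * b), ETA * (1 + XI * a)], axis=1)
            J = dref.T @ xy                       # Jacobian of the isoparametric map
            dphi = np.linalg.solve(J, dref.T).T   # physical gradients, shape (4, 2)
            K += wa * wb * np.linalg.det(J) * (dphi @ dphi.T)
    return K

xy = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])   # unit square
K_one = stiffness(xy, 1)       # one-point ('under-integrated')
K_full = stiffness(xy, 2)      # full quadrature
hourglass = np.array([1.0, -1.0, 1.0, -1.0])

rank_one = np.linalg.matrix_rank(K_one)     # constant AND hourglass vectors are null
rank_full = np.linalg.matrix_rank(K_full)   # only the constant vector is null
K_fixed = K_one + np.outer(hourglass, hourglass)   # illustrative rank-one correction
```

The constant vector is a legitimate null vector of both matrices (pure Neumann problem); the alternating-sign vector is the spurious one, and it is annihilated only by the one-point matrix.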
But since we have probably already given too much space to the explicit Euler plus 'fixes' method, relative to its importance in today's computing environment, we merely state that both $K$ and $B(u)$ are augmented, at element level, by a correction matrix that is proportional to the outer product of the corresponding null vectors, and refer the reader to the original reference, Gresho et al. (1984c), for details. Suffice it to say here that this final patch job does do a fair job of restoring a better simulation of diffusion.

One final remark on the hourglass correction that is not discussed in Gresho et al. (1984c): the simplest technique, devised by Goudreau and Hallquist (1982) on the basis of explicit knowledge of the form of the null vectors for simply shaped elements (rectangles/bricks), has worked fairly well, even though many solid mechanics codes employ the more rigorous but more expensive method of Flanagan and Belytschko (1981). The latter technique is more appropriate for distorted (iso-P) elements, for which the 'h-stabilization' of Goudreau and Hallquist is only approximately correct (G. Goudreau, personal communication). The idea of this simpler stabilization scheme is to render the $K$-matrix non-singular by adding a rank-one matrix to it for each null vector; e.g., if $x$ is a null vector (with eigenvalue $\lambda = 0$) of $K$, then the modified matrix is $\tilde{K} = K + xx^T$, which raises the rank of $K$ by one; i.e., $\tilde{K}x = Kx + xx^Tx = \tilde{\lambda}x$ gives $\tilde{\lambda} = x^Tx \neq 0$ because $Kx = \lambda x = 0$.

Digression on Quadratic Elements: Since the biquadratic (nine-node) element is still reasonably accurate for advection-dominated flows when mass lumping is invoked (see the discussion and figures in Section 2.6.3b), we believe that explicit integration using it may be a viable alternative to $Q_1Q_0$, probably with linear pressure ($Q_2P_{-1}$).
But it should probably be done with fewer tricks; in particular, we would recommend: full quadrature, AB3 or RK4 rather than FE + BTD, subcycling, and perhaps (we are not sure) some simplified advection
approximation that would be the nine-node analog of centroid advection (if possible), to avoid the 'triple-product integrals.' In fact, even without considering the extra phase speed accuracy of lumped quadratics over lumped linears, it can be shown, at least with respect to the relative costs of both forming and using the matrix $C^T M_L^{-1} C$, that quadratics are more cost-effective than linears (M. Engelman, personal communication). End Digression

We conclude this section with a brief example showing how subcycling works. Figures 3.16-10 through 3.16-14 are results from an early 2D LNG (Liquefied Natural Gas) simulation (with $Q_1Q_0$) run three different ways and called (a), (b), and (c) in the first four figures: (a) no subcycling, (b) subcycling with linear pressure extrapolation, and (c) subcycling with constant pressure extrapolation (used to convince us that linear extrapolation is worth the effort). Time is in seconds, horizontal ($u$) and vertical ($v$) velocities are in meters/second, temperature is in °C, and concentration is in volume fraction of LNG. The time histories shown are for a node near the surface of the 'spill pond' (see Koopman et al., 1989, and Chan, 1992, for details of these and other LNG safety study simulations). LNG 'injection' stops at $t = 40$ and is the cause of the discontinuity in the velocity plots. Only the vertical velocity is noticeably affected by subcycling. More remarkable yet is the vertical velocity in Figure 3.16-14 at a node closer to the surface. The case with constant pressure extrapolation (a) shows extremely variable velocity during subcycling as compared with that using linear extrapolation (b). But the key (and perhaps slightly remarkable) point is that the projected velocity at the end of each round of subcycling is nearly the same in both cases, and it is also very close to the 'right' answer (that with no subcycling): nearly zero for $t > 60$.
The small (stability-limited) timestep was 0.2, and the typical subcycle ratio ($S$) was about 12. Clearly one should not generally 'believe' any velocities but those at the conclusion of each cycle (the projected velocities).

[Figure: three time-history panels, cases (a), (b), and (c); horizontal axis: Time, 0 to 100.]
Fig. 3.16-10 Horizontal velocity for three cases.
[Figure: three time-history panels; horizontal axis: Time, 0 to 100.]
Fig. 3.16-11 Vertical velocity for three cases.

[Figure: three time-history panels; horizontal axis: Time, 0 to 100.]
Fig. 3.16-12 Temperature for three cases.

[Figure: three time-history panels; horizontal axis: Time, 0 to 100.]
Fig. 3.16-13 Concentration for three cases.
[Figure: vertical velocity V versus Time, 10 to 100.]
Fig. 3.16-14 Vertical velocity for two types of subcycling.
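The extrapolate, subcycle, project, PPE cycle of (3.16-271) through (3.16-273), together with the two step-size estimates (3.16-275) and (3.16-283), can be sketched on a toy problem. Everything below is an assumption made for illustration: the matrices are small random stand-ins rather than finite element matrices, $N(u)$ and BTD are omitted, $g = 0$, the major step is held fixed, the RMS norm stands in for the norm used in the text, and all function names are hypothetical.

```python
import numpy as np

def make_system(n=12, m=4, seed=0):
    """Small random stand-ins for M_L (diagonal), K, C and f; purely illustrative."""
    rng = np.random.default_rng(seed)
    ML = 1.0 + rng.random(n)                 # lumped mass, stored as its diagonal
    A = rng.standard_normal((n, n))
    K = A @ A.T / n + np.eye(n)              # symmetric positive definite 'diffusion'
    C = rng.standard_normal((n, m))          # 'gradient' matrix; C.T acts as divergence
    f = rng.standard_normal(n)
    return ML, K, C, f

def poisson_matrix(ML, C):
    return C.T @ (C / ML[:, None])           # C^T M_L^{-1} C

def project(u, ML, C):
    """(3.16-272) with g = 0: solve for lambda, subtract M_L^{-1} C lambda, discard lambda."""
    lam = np.linalg.solve(poisson_matrix(ML, C), C.T @ u)
    return u - (C @ lam) / ML

def ppe(u, ML, K, C, f):
    """(3.16-273) with g = 0 and N(u) omitted."""
    return np.linalg.solve(poisson_matrix(ML, C), C.T @ ((f - K @ u) / ML))

def major_step(u_n, P_n, P_nm1, dt_major, S, ML, K, C, f):
    """S forward-Euler subcycles with linearly extrapolated P, then project, then PPE."""
    dt_s = dt_major / S
    u = u_n.copy()
    for m in range(S):
        P_m = P_n + (P_n - P_nm1) * (m * dt_s) / dt_major
        u = u + dt_s * (f - K @ u - C @ P_m) / ML          # (3.16-271)
    u_new = project(u, ML, C)
    return u_new, ppe(u_new, ML, K, C, f)

def next_major_steps(dt_major, dt_s, udot_new, udot_old, eps):
    """Next major step from (3.16-275) and from (3.16-283), in that order."""
    norm = np.sqrt(np.mean((udot_new - udot_old) ** 2))    # RMS norm (an assumption)
    d = 0.5 * dt_major * norm                              # conservative estimate
    e = 0.5 * dt_s * norm                                  # subcycle-aware estimate
    return dt_major * np.sqrt(eps / d), dt_major * eps / e

def run(S, dt_s, t_end, seed=0):
    """March to t_end with a fixed subcycle ratio S; S = 1 means no subcycling."""
    ML, K, C, f = make_system(seed=seed)
    u = project(np.ones(ML.size), ML, C)
    P = ppe(u, ML, K, C, f)
    P_old = P                                # start-up: constant extrapolation
    for _ in range(round(t_end / (S * dt_s))):
        u, P_new = major_step(u, P, P_old, S * dt_s, S, ML, K, C, f)
        P_old, P = P, P_new
    return u, C
```

Calling `run(S=10, dt_s=0.002, t_end=1.0)` and `run(S=1, dt_s=0.002, t_end=1.0)` gives a subcycled and a reference solution of the same toy problem; the former solves one tenth as many Poisson systems. As in the LNG example, only the projected velocities returned at the end of each major step should be compared.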
Having exposed the reader to a bag of tricks applied to a particular element, we conclude this section with the general advice that there are usually better ways to solve the DAE's, some of which have yet to be presented. The mass-lumped explicit integration method does 'work,' however, and is still sometimes useful, even though it looks too much like an FDM, which is why we call it an 'isoparametric finite difference method.'

3.16.6 Semi-Implicit Projection Methods

a. Introduction

Having described fully implicit and fully 'explicit' methods (except of course for pressure), we now turn to an increasingly popular class of 'compromise' methods, the semi-implicit methods, in which some of the advantages of each of the two 'pure' methods are realized and their disadvantages reduced, if not eliminated. Goodrich and Soh (1989) said it best: 'The diffusion terms are treated implicitly to avoid the numerical stability restrictions from the viscous terms, and the convection terms are lagged to avoid the computational effort of solving a nonlinear system at each timestep.' While explicit-advection, implicit-diffusion is simple to implement (and often effective) for the scalar transport equation, cf. Section 2.7.5, the vector system of NS equations and the $\nabla \cdot u = 0$ constraint preclude, in significant ways, such simplicity. But effective methods have been devised, if not always understood. Relevant here, from E and Liu (1995), is: 'The numerical phenomena involved in the projection method are sufficiently complex that soft arguments can hardly touch the heart of the matter; neither does a simple convergence theorem or crude error estimates.' And, from J.
Shen (personal communication, 1996), after noting that a projection method can be interpreted as a temporal discretization of a singularly perturbed equation: 'Unlike the usual cases where the error of a fully discretized scheme is simply the sum of the temporal discretization error and the spatial discretization error, the error of a full discretization for singularly perturbed Navier-Stokes equations is much more involved, since the a priori estimates for its solution may depend on the perturbation parameter ... On the other hand, this type of analysis is important, because it is not obvious, without a thorough understanding of the error behavior, how to properly match the temporal discretization parameter $\Delta t$ with the spatial discretization parameter $h$.' Finally, some related remarks from the projection pioneer: 'In ending, the author would like to make some comments on the preceding proofs. First of all, he would like to state his belief that the value of a scheme such as (14) lies in its practical usefulness, not in the possibility of a convergence proof. The value of the convergence proofs lies in the fact that they contribute to the understanding of the numerical processes performed on the computer' (Chorin, 1969), wherein (14) referred to what we will later call 'projection 1' for the simplest case, periodic BC's. Herein we provide both some useful semi-implicit projection methods and further (but not complete) understanding (via, unfortunately, somewhat 'soft arguments') regarding how and why they work.

In the semi-implicit projection methods that we have in mind, which carry several aliases (fractional step, splitting, pressure correction, predictor-corrector), unconditional stability of the viscous term is attained, as with implicit methods; also, the decoupling of equations is attained, as with explicit methods (almost). The cost of these gains is that in addition to solving a PPE at each timestep (à la explicit
methods), additional but symmetric linear systems, one per velocity component, must be solved at each timestep. Another disadvantage is that, unlike the semi-implicit treatment of the scalar transport equation in which (using forward Euler for advection) the implicit treatment of BTD stabilized the scheme unconditionally, for the NS equations the non-linearity of the (explicitly treated) advection terms limits us to a CFL stability limit of roughly 1 to 10, an experimental result. One final advantage is also realized, although obtained in a somewhat ad hoc manner: the consistent mass matrix is retained even though the PPE associated with the projection is done with the lumped mass matrix ($C^T M_L^{-1} C$). An additional 'cost,' as will soon be seen, is the notable extra effort required to derive and justify the methods! To further punctuate this last sentence, we quote again from E and Liu (1995), 'It has been a mystery for twenty-five years that the projection method seems to perform better than expected.' On balance, however, one or more of the semi-implicit projection methods to be described below often provides a cost-effective compromise between 'costly' implicit methods and 'unstable' explicit methods, although in the next section we shall describe a competitive method that is both uncoupled and fully implicit. Finally, we emphasize at the outset, and demonstrate later, that these projection methods are intended only for 'time-accurate' simulations, not for 'quickly' time-marching to steady-state solutions.
The plan for this section is the following: after deriving and discussing optimally (or near optimally) accurate projection methods for the PDE's, which require special (and awkward) BC's for the so-called 'intermediate velocity,' we will fall back to less optimal but (probably) more cost-effective methods that use only the simplest BC's for the intermediate velocity, discuss semi-discrete methods in which only the time is discretized (in marked contrast to the DAE approach), and, finally, discuss fully discrete semi-implicit projection methods, in which we show how to maintain the extra accuracy associated with the consistent mass matrix. We mention now and try to explain later that all of these projection methods are approximations to 'legitimate' methods for solving both the PDE's and the resulting DAE's (if we may call them that). This statement would still be true even if the (intermediate velocity) equations were solved fully implicitly, because the principal 'problem' with the projection method (implicit or semi-implicit, but not explicit) is that the viscous term causes the occurrence of a numerical and completely spurious boundary layer (wherever the tangential velocity is specified à la Dirichlet) of thickness $O(\sqrt{\nu\Delta t})$, within which certain quantities (starting with the pressure) are generally in error, and at their worst on $\Gamma_D$ [the pressure by $O(\sqrt{\nu\Delta t})$, and its normal gradient by $O(1)$!]. This behavior, which is part of the 'mystique' of these methods, is still only partially understood, as is the near-miraculous recovery (usually) to the 'right' answer beyond $O(\sqrt{\nu\Delta t})$ from $\Gamma_D$.
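Before embarking on the approximation to a projection method for solving the NS equations, let us briefly visit the real projection that is, after all, our goal; i.e., we shall 'view' the NS equations as a projection: since $\partial u/\partial t + \nabla P = \nu\nabla^2 u + g - u \cdot \nabla u \equiv f$ and $\nabla \cdot u = 0 \Rightarrow \nabla^2 P = \nabla \cdot f$, we have (formally at least, and introducing $\Delta \equiv \nabla^2$),

$$P = \Delta^{-1}\nabla \cdot f, \tag{3.16-289}$$

$$\nabla P = \nabla\Delta^{-1}\nabla \cdot f, \tag{3.16-290}$$

and

$$\frac{\partial u}{\partial t} = (I - \nabla\Delta^{-1}\nabla\cdot)\,f, \tag{3.16-291}$$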
which has introduced the projection operators (see also Appendix 3, which includes some discussion of concomitant BC's that we mostly avoid here)

$$\mathcal{P} \equiv I - \nabla\Delta^{-1}\nabla\cdot \tag{3.16-292}$$

and

$$\mathcal{Q} \equiv I - \mathcal{P} = \nabla\Delta^{-1}\nabla\cdot \tag{3.16-293}$$

so that the NS equations become, simply,

$$\frac{\partial u}{\partial t} = \mathcal{P}f(u) \tag{3.16-294}$$

$$\phantom{\frac{\partial u}{\partial t}} = f(u) - \nabla P, \tag{3.16-295}$$

which shows that $\mathcal{P}$ 'strips off' the gradient part of $f$ to reveal its divergence-free part, the acceleration. Also, $\mathcal{Q}f = \nabla P$. These orthogonal ($\mathcal{Q}\mathcal{P} = \mathcal{P}\mathcal{Q} = 0$) projection ($\mathcal{P}^2 = \mathcal{P}$, $\mathcal{Q}^2 = \mathcal{Q}$) operators, which also come with BC's 'built-in,' display the following additional properties: $\nabla \cdot \mathcal{P} = 0$ and $\nabla \times \mathcal{Q} = 0$, showing that $\mathcal{P}$ projects onto the null space of div, and $\mathcal{Q}$ projects onto the null space of curl; finally, $f = \mathcal{P}f + \mathcal{Q}f$ is the orthogonal decomposition of the vector $f$ into its divergence-free and curl-free components.

Returning to (3.16-294), it is clear that the time integral of $\mathcal{P}f(u)$ is the desired velocity solution of the NS equations, and this is the goal of the projection method: to strip off the gradient part of $f$, a process that is easier said than done. But the basic idea is easy: guess $\nabla P$, subtract it from $f(u)$ and integrate the result for some length of time, the length of which is proportional, in some sense, to the quality of your guess and the tolerance level you place on the divergence (a perfect guess remains divergence-free, while an imperfect guess generates spurious divergence), and then project the result back down to the divergence-free subspace. The Devilish details follow.

b. Derivation of an 'optimal' projection method, simplifications thereto, and analysis thereof

Before doing mathematics, we do English; i.e., we will introduce the projection method in words, beginning with these: if we were somehow provided with the (God-given?)
proper pressure, $P(x, t)$, we note that the NS equations represent nothing more than a coupled system of vector AD equations (or Burgers' equations, as some may prefer to call them) that, because of $\nabla P(x, t)$, would remain divergence-free with no need to carry $\nabla \cdot u = 0$ along as a constraint. Projection methods try to do the same thing, as follows: Given an incompressible velocity field, say at $t = 0$ for convenience, that satisfies the appropriate BC's, perform the following steps:

Step 1. Guess $\nabla P(x, t)$ for $t \geq 0$; i.e., approximate it, somehow.

Step 2. Solve the momentum equations alone, with $\nabla P(x, t)$ simply acting as another given body force 'on the RHS,' up to what we shall call the projection time ($t = T$). This time could either be set a priori (somehow) or, better yet, be defined as that time at which an appropriate norm of $\nabla \cdot \tilde{u}(x, t)$ reaches some predetermined maximum-acceptable value. Here $\tilde{u}$ is the intermediate velocity (from the momentum equations), which does not remain divergence-free because our guess at $\nabla P(x, t)$ was imperfect.
Step 3. Project the intermediate velocity, $\tilde{u}(x, T)$, to the nearest (in $L_2$) divergence-free subspace and (boldly?) report it as the NS velocity; i.e., $u(x, T) = \mathcal{P}\tilde{u}(x, T)$.

This completes one projection cycle; reset the clock and go to Step 1. Immediate questions/issues are the following:

1. How do we guess the pressure (gradient)?
2. How do we select $T$? Or, what is a 'maximum acceptable value' of $\nabla \cdot \tilde{u}(x, T)$?
3. Are there better BC's for $\tilde{u}$ than simply those given for $u$?
4. What are the BC's associated with the projection? Is there a choice?
5. How do we know that the (projected) result is 'close' to the true NS solution?

These are hard questions. But clearly their answers are both necessary and important, so we must address them. We can say this much for sure: the answers to questions 1 and 3 actually 'define' a particular projection method, and the answer to 2 is 'cut and try.' For question 4, we again first quote E and Liu (1995): (i) 'There are still controversies with regard to the optimal choice of boundary conditions at the projection step.' And, (ii) 'The agonizing decision to be made is the BC for (2.7),' i.e., the projection step; and, there is a choice, they assert. We say no: no choice; only the proper (normal direction) BC will retain a well-posed problem and a divergence-free projected velocity. More controversy: c'est la projection. The answer to question 5 will come later.

Let us begin by deriving a particular projection method that we called 'projection 2' in our first publication on the subject (Gresho and Chan, 1988) because it is intended to be second-order accurate in some sense; it begins with a pressure guess that is time-invariant and often assumes more regularity than is really necessary in practice: 1.
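Given $u_0$ that satisfies 'appropriate' BC's (details later) and $\nabla \cdot u_0 = 0$ and given the associated $P_0 = P(x, 0)$, solve for $\tilde{u}(x, t)$ on $0 < t \leq T$ from

$$\frac{\partial\tilde{u}}{\partial t} = \nu\nabla^2\tilde{u} + g - \tilde{u} \cdot \nabla\tilde{u} - \nabla P_0 \equiv \nu\nabla^2\tilde{u} + f(\tilde{u}) - \nabla P_0 \quad \text{in } \Omega \tag{3.16-296}$$

with

$$\tilde{u} = \tilde{w}(t) \quad \text{on } \Gamma_D \tag{3.16-297}$$

and

$$\nu\,\partial\tilde{u}/\partial n = F(t) + nP_0 \quad \text{on } \Gamma_N, \tag{3.16-298}$$

where $\tilde{w}$ is a to-be-determined 'intermediate' Dirichlet BC (recall that $u = w$ on $\Gamma_D$ is the proper NS BC) and $F(t)$ is the given 'traction' force. We lump the body force and advection terms into $f(\tilde{u})$ because most of the 'interesting' behavior in projection methods comes from the viscous term.

2. At $t = T$, project $\tilde{u}$ to the divergence-free subspace as follows:

$$v(x, T) = \mathcal{P}\tilde{u}(x, T), \tag{3.16-299}$$

which is restated less formally as follows: solve for $v$ and $\varphi$ from the additive decomposition of $\tilde{u}$ into a divergence-free vector field and a curl-free vector field:

$$\tilde{u} = v + \nabla\varphi \quad \text{and} \quad \nabla \cdot v = 0 \quad \text{in } \Omega, \tag{3.16-300}$$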
with

$$n \cdot v = n \cdot w(T) \;\text{ on } \Gamma_D, \qquad \varphi = \bar{\varphi} \;\text{ on } \Gamma_N, \tag{3.16-301}$$

where $\bar{\varphi}$ is, like $\tilde{w}$, to be determined. This projection is, usually, realized as follows:

(i) Solve

$$\nabla^2\varphi = \nabla \cdot \tilde{u} \quad \text{in } \Omega, \tag{3.16-302}$$

with

$$\partial\varphi/\partial n = n \cdot (\tilde{u} - w) = n \cdot (\tilde{w} - w) \quad \text{on } \Gamma_D, \tag{3.16-303}$$

and

$$\varphi = \bar{\varphi} \quad \text{on } \Gamma_N, \tag{3.16-304}$$

for $\varphi$.

(ii) Compute

$$v = \tilde{u} - \nabla\varphi \quad \text{in } \Omega. \tag{3.16-305}$$

[Comparing (3.16-299) and (3.16-305) yields $\mathcal{P} = I - \mathrm{grad}\,(\nabla^2)^{-1}\,\mathrm{div}$, as before. See also Appendix 3.]

3. Update $P(x, t) = P(x, T)$ via the 'usual' PPE:

$$\nabla^2 P = \nabla \cdot f(v) \quad \text{in } \Omega, \tag{3.16-306}$$

$$\partial P/\partial n = n \cdot (\nu\nabla^2 v + g - v \cdot \nabla v - \partial w/\partial t) \quad \text{on } \Gamma_D, \tag{3.16-307}$$

$$P = \nu\,\partial v_n/\partial n - n \cdot F(T) \quad \text{on } \Gamma_N. \tag{3.16-308}$$

4. $v$ is called $u_0$, and $P$ is called $P_0$, and the next projection cycle can begin.

Remarks:

(1) It may be preferable to set $v = w$ on $\Gamma_D$ because, in general, (3.16-305) will produce some slip there. More on this key issue later.

(2) Note that $P$ is generally not a continuous function of time in a projection method.

(3) Since $\nabla \times \nabla\varphi = 0$, we see from (3.16-305) that the projection operator preserves the vorticity present in $\tilde{u}$; the projection is a potential flow adjustment, at least until the no-slip BC is re-introduced; see Remark (1).

Boundary conditions. To obtain the 'proper' Dirichlet BC data for $\tilde{u}$ and $\varphi$, we resort to a Taylor series expansion, in time, of both $\tilde{u}$ and $u$ (the true NS velocity) about time 'zero':

$$\tilde{u}(t) = \tilde{u}_0 + t\,\dot{\tilde{u}}_0 + \frac{t^2}{2}\,\ddot{\tilde{u}}_0 + O(t^3), \tag{3.16-309}$$

$$u(t) = u_0 + t\,\dot{u}_0 + \frac{t^2}{2}\,\ddot{u}_0 + O(t^3), \tag{3.16-310}$$

where $\tilde{u}_0 = u_0$. Now we invoke the PDE's for $\tilde{u}$ and $u$, which of course implies certain smoothness assumptions, to obtain

$$\dot{\tilde{u}}_0 = \dot{u}_0 = \nu\nabla^2 u_0 + f(u_0) - \nabla P_0, \tag{3.16-311}$$
$$\ddot{u}_0 = \frac{\partial}{\partial t}\left[\nu\nabla^2 u + f(u) - \nabla P\right]_{t=0} = \nu\nabla^2\dot{u}_0 + \dot{f}(u_0) - \nabla\dot{P}_0, \tag{3.16-312}$$

$$\ddot{\tilde{u}}_0 = \frac{\partial}{\partial t}\left[\nu\nabla^2\tilde{u} + f(\tilde{u}) - \nabla P_0\right]_{t=0} = \nu\nabla^2\dot{\tilde{u}}_0 + \dot{f}(\tilde{u}_0), \tag{3.16-313}$$

where $\dot{f} \equiv (\partial f/\partial u) \cdot \dot{u}$. Clearly $\dot{f}(\tilde{u}_0) = \dot{f}(u_0) = (\partial f/\partial u)_0 \cdot \dot{u}_0$. Subtracting (3.16-309) from (3.16-310) then gives

$$\tilde{u}(t) = u(t) + \frac{t^2}{2}\nabla\dot{P}_0 + O(t^3), \tag{3.16-314}$$

which we boldly 'push to the wall' and neglect HOT to find our BC for $\tilde{u}$:

$$\tilde{w} = w + \frac{t^2}{2}\nabla\dot{P}_0\Big|_{\Gamma_D}. \tag{3.16-315}$$

Inserting (3.16-314) and (3.16-315) into the projection Poisson equation, (3.16-302) through (3.16-304), yields (to second-order)

$$\nabla^2\varphi = \frac{T^2}{2}\nabla^2\dot{P}_0 \quad \text{in } \Omega, \tag{3.16-316}$$

$$\frac{\partial\varphi}{\partial n} = \frac{T^2}{2}\, n \cdot \nabla\dot{P}_0 \quad \text{on } \Gamma_D, \tag{3.16-317}$$

and

$$\varphi = \bar{\varphi} \quad \text{on } \Gamma_N, \tag{3.16-318}$$

and the desired Dirichlet BC for $\varphi$ 'drops out'; i.e., if we take

$$\bar{\varphi} = \frac{T^2}{2}\dot{P}_0, \tag{3.16-319}$$

it is clear that the 'solution' of BVP (3.16-316) through (3.16-318) is simply

$$\varphi = \frac{T^2}{2}\dot{P}_0, \tag{3.16-320}$$

at least through $O(T^3)$. The 'optimal' BC's, (3.16-315) and (3.16-320), are now known.

Simpler projection method. But how do we get $\dot{P}_0$? This is the question that causes us to cheat even further, since it is either impossible or highly inconvenient to try to solve for $\dot{P}_0$ by solving for the time-derivative of the PPE plus BC's. Thus, even though we have found our 'second-order' BC's, we will in fact not use them. What we will use is what we called 'simpler schemes' in Gresho (1990b); i.e.,

(i) just use the physical BC for $\tilde{u}$ on $\Gamma_D$:

$$\tilde{w} = w \quad \text{on } \Gamma_D; \tag{3.16-321}$$

(ii) assume that $\varphi \approx T^2\dot{P}_0/2$ still, approximate $\dot{P}_0$ by $(P(T) - P_0)/T$, and approximate the normal component of (3.16-298) by $F_n(T) + P(T) = 0$, which neglects the viscous contribution [the latter a usually valid approximation, especially for 'large' Re; see point 3 following (3.8-38)], to obtain

$$\varphi = \bar{\varphi} = -T\,[F_n(T) + P_0]/2 \quad \text{on } \Gamma_N. \tag{3.16-322}$$
Later we shall attempt to justify some of these approximations, which are clearly only 'reasonable' for $T$ 'sufficiently small.' For now, we simply use them and rewrite the entire (simpler) projection 2 cycle as an algorithm: given a divergence-free $u_0(x)$ and $P_0(x)$,

Step 1. Solve (3.16-296) through (3.16-298), and (3.16-321), for $0 < t \leq T$ to get $\tilde{u}(T)$.

Step 2. Solve (3.16-302) through (3.16-304) for $\varphi$, where (3.16-303) now reads $\partial\varphi/\partial n = 0$ on $\Gamma_D$, and $\bar{\varphi}$ for (3.16-304) comes from (3.16-322).

Step 3. Compute the new pressure from

$$P(T) = P_0 + 2\varphi/T. \tag{3.16-323}$$

Step 4. Compute $v(T)$ from (3.16-305). Report $v$ and $P$ as the (alleged) NS velocity and pressure.

Step 5. Reset the variables for the next projection cycle as follows: $t = 0$, $P_0 = P(T)$, and $u_0 = v$ except on $\Gamma_D$, where it is set to $u_0 = w(T)$.

Remarks:

(1) At the true beginning of a simulation, the proper PPE and BC's must be solved to get the proper initial pressure ($P_0$) that is induced by the initial (divergence-free) velocity.

(2) The Neumann BC for the Lagrange multiplier, $\partial\varphi/\partial n = 0$ on $\Gamma_D$, implies, using the approximation $\varphi = T(P(T) - P_0)/2$ that is an inherent part of the method/algorithm, that $\partial P(T)/\partial n = \partial P_0/\partial n$ on $\Gamma_D$, rather than the correct inhomogeneous Neumann BC from the correct PPE; i.e., the pressure gradient on $\Gamma_D$ never changes throughout the simulation! While true, and seemingly stupid/wrong, it will be (largely) 'justified' later.

(3) The projected velocity, $v(T)$, from (3.16-305) will, unfortunately, not satisfy the no-slip BC; i.e., $\tau \cdot v \neq \tau \cdot w$ on $\Gamma_D$; the so-called overdetermined Neumann problem is not satisfied [recall Remark (3) following (3.8-36)]. Rather, it will slip, with a slip velocity, $s$, given by $s = \tau \cdot v - \tau \cdot w = \tau \cdot (\tilde{u}(T) - \nabla\varphi - w) = -\tau \cdot \nabla\varphi$. The (necessary?) resetting of $u_0$ to $w(T)$ on $\Gamma_D$ [rather than to $u_0 = v(T)$] introduces a vortex sheet, of strength $s$, on $\Gamma_D$, a phenomenon that we shall discuss in more detail later.
[Note that there is no jump in the normal velocity on $\Gamma_D$ (such a jump would of course be illegal for an incompressible flow); i.e., $\mathbf{n}\cdot\mathbf{v}(T) = \mathbf{n}\cdot\mathbf{w}(T)$ comes from the projection, by construction.]

(4) If $\Gamma_N = \emptyset$, then the resulting Neumann problem for $\varphi$ is 'consistent singular'; it is singular because $\partial\varphi/\partial n = 0$ on $\Gamma$ leaves $\varphi$ = constant as a function in the null space, and it is consistent because, from (3.16-302), we need $\int_\Omega \nabla\cdot\tilde{\mathbf{u}} = \int_\Gamma \mathbf{n}\cdot\tilde{\mathbf{u}} = \int_\Gamma \mathbf{n}\cdot\mathbf{w} = 0$, which is satisfied because $\mathbf{w}$ is properly constrained. $\varphi$ is then obtained only up to an irrelevant additive constant.

(5) It turns out, as part of the 'mystique' associated with this projection method, that the coefficient 2 in (3.16-323) can be replaced by $\gamma$ for $0 < \gamma \le 2$ and 'success' (of some sort) still be achieved, a fact discovered by Gresho and Chan (1990) and subsequently analyzed by Shen (1992, 1996), although $\gamma \to 0$ only 'works' if $\Delta t$ does too. We recommend using either $\gamma = 2$ (somewhat more attractive, theoretically) or $\gamma = 1$ (somewhat more robust, practically; $\gamma = 2$ sometimes makes wiggles, i.e., $2\Delta t$ oscillations in $P$).
(6) Whereas $\tilde{\mathbf{u}}$ must have sufficient regularity to reside in $H^1$, $\mathbf{v}$ can be rather rougher; $\mathbf{v} \in H(\mathrm{div})$ is sufficient [J.-L. Guermond, personal communication; a function $\mathbf{v} \in H(\mathrm{div}) \Rightarrow \mathbf{v} \in L^2$ and $\nabla\cdot\mathbf{v} \in L^2$].

(7) Additional BC discussions, in the framework of spectral element methods, are presented in Karniadakis et al. (1993).

o Oldest projection method. Thus it appears that the seemingly simplifying approximations associated with the above 'projection method' are not without significant cost. (There is just no free lunch when it comes to 'solving' accurately the incompressible NS equations.) But before trying to justify the above technique (we state now, and demonstrate later as well as earlier in Section 3.16.1d, that it really does 'work'), we digress briefly to go a few years backward in time to discuss a simpler-yet projection method (the original one, by Chorin, 1967a, 1968a,b, 1969) that also works but looks even more like it should not. [We credit Chorin with the invention of the first semi-implicit projection method. A similar, but explicit, projection method that was contemporaneous with that of Chorin is that of Temam (1966, 1969a,b). Note though that we have chosen to classify this latter method as simply explicit Euler (see Section 3.18); i.e., the original Temam et al. projection method is simply forward Euler, although Temam did also study implicit and semi-implicit projection methods; see Marion and Temam (1996) for relevant references.]

Called 'projection 1' in Gresho (1990) and Gresho and Chan (1990), we present it simultaneously in two forms, the first (a) corresponding to that using the 'optimal' BC's for $\tilde{\mathbf{u}}$ and $\varphi$ (to assure, more or less, a consistent first-order method) and the second (b) using the simpler BC's. It is characterized, in its simplest form, by its rather simple guess for the pressure while solving for the intermediate velocity: $P = 0$!
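Really; zero (not zero factorial, although that would work, too). It reads: given a divergence-free velocity field, $\mathbf{u}_0(\mathbf{x})$, and $P_0$ in the 'optimal' method, for each cycle, do:

Step 1. Solve for $\tilde{\mathbf{u}}$ for $0 < t < T$ from

$\partial\tilde{\mathbf{u}}/\partial t = \nu\nabla^2\tilde{\mathbf{u}} + \mathbf{g} - \tilde{\mathbf{u}}\cdot\nabla\tilde{\mathbf{u}} \equiv \mathbf{f}(\tilde{\mathbf{u}})$ in $\Omega$, (3.16-324)

with either

(a) $\tilde{\mathbf{u}} = \tilde{\mathbf{w}} = \mathbf{w} + t\nabla P_0$ (3.16-325)

or

(b) $\tilde{\mathbf{u}} = \mathbf{w}$ on $\Gamma_D$ (3.16-326)

and

$\nu\,\partial\tilde{\mathbf{u}}/\partial n = \mathbf{F}(t)$ on $\Gamma_N$. (3.16-327)

Step 2. Perform the projection:

(i) Solve for $\varphi$ from

$\nabla^2\varphi = \nabla\cdot\tilde{\mathbf{u}}(T)$ in $\Omega$, (3.16-328)

with either

(a) $\partial\varphi/\partial n = T\,\partial P_0/\partial n$ (3.16-329)

or

(b) $\partial\varphi/\partial n = 0$ on $\Gamma_D$ (3.16-330)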
and

$\varphi = -T F_n(T)$ on $\Gamma_N$. (3.16-331)

(ii) Compute

$\mathbf{v} = \tilde{\mathbf{u}} - \nabla\varphi$ in $\Omega$. (3.16-332)

Step 3. Compute the new pressure from either the expensive way,

$\nabla^2 P = \nabla\cdot\mathbf{f}(\mathbf{v})$ in $\Omega$, (3.16-333)

with

$\partial P/\partial n = \mathbf{n}\cdot[\mathbf{f}(\mathbf{v}) - \partial\mathbf{v}/\partial t]$ on $\Gamma_D$ (3.16-334)

and

$P = \nu\,\partial v_n/\partial n - F_n(T)$ on $\Gamma_N$, (3.16-335)

or the 'cheap' way,

$P = \varphi/T$ in $\Omega$. (3.16-336)

Step 4. Reset the variables, as for 'projection 2' above, for the next cycle.

Remarks:

(1) The 'optimal' method (a) came from a similar Taylor series analysis to seek the best BC's for $\tilde{\mathbf{u}}$ and $\varphi$ (an exercise we leave to the reader), for which the $\varphi$-$P$ relationship turns out to be $\varphi = TP_0$, a result that was already used to obtain (3.16-331) and (3.16-336).

(2) The simpler method (b) implies that $\partial P/\partial n = 0$ on $\Gamma_D$ for all time, an implication that appears to scuttle the scheme, at least for 'boundary-driven' Stokes flow for which the true pressure obeys $\nabla^2 P = 0$ in $\Omega$, with $\partial P/\partial n = \nu\,\partial^2 u_n/\partial n^2$ on $\Gamma$; i.e., it is only the non-zero value of $\partial P/\partial n$ on $\Gamma$ that generates a non-trivial $P$. We will explain later why this scheme does work, even for Stokes flow.

(3) In his original projection method, Chorin combined the options as follows: he used (3.16-336) to update the pressure, which he employed in (3.16-325) and (3.16-329).

(4) All of the boundary-produced vorticity that comes from the no-slip BC and the tangential pressure gradient is injected rather roughly, as a vortex sheet upon reducing the post-projection slip velocity to zero in preparation for the next cycle. (In contrast to 'projection 2', no vorticity is introduced into the fluid by the tangential pressure gradient during the intermediate velocity portion of the cycle because $\partial P/\partial\tau$ is absent.)

(5) Clearly, $\tilde{\mathbf{u}}(t)$ will stray from the divergence-free subspace more quickly (much more quickly if $\nabla P_0$ is large) than does the corresponding $\tilde{\mathbf{u}}(t)$ from projection 2.
Thus, we do not advocate this 'projection 1' method since 'projection 2' provides more accuracy (at least in theory) for virtually no more effort.

(6) A steady state, if attained, is a function of $\Delta t$, which is not a desirable attribute; the pressure gradient is then multiplied by $(I - \nu\Delta t\nabla^2)$.

o Justification, vorticity production. Before moving on to analyze and at least partially justify the above projection methods, it may be wise to note that even though a 'projection 3' method was proposed (but not tested) in Gresho (1990b), by the 'obvious' extension of projection 2 (guess the pressure as $P = P_0 + t\dot{P}_0$, etc.), it is not to be recommended
because it is actually unstable, a result determined theoretically by Shen (1993) and experimentally (and earlier) by J. Schutt (personal communication).

To justify the simpler projection methods, we must first discuss the 'discovery' announced in Gresho and Sani (1987) and further elucidated in Gresho (1990a, 1991a): whereas the normal component of the momentum equation applies even on $\Gamma_D$ at $t = 0$, and is thus used to set the Neumann BC for the PPE [this portion of the discovery having been borrowed from Heywood (1980) and Heywood and Rannacher (1982)], the tangential component does not (in general); rather, at least for small time and close to $\Gamma_D$, the tangential component of $\nabla P_0$ acts like a given 'source' (body force) term with $P_0$ coming from the PPE plus Neumann BC at $t = 0$. The resulting vortex sheet (present in general, because the no-slip BC is not required to be satisfied by the initial velocity field) quickly diffuses from $\Gamma_D$ into $\Omega$ as the solution to a 'heat equation' with a step change at the boundary (and a given source term near $\Gamma_D$, $\partial P/\partial\tau$); i.e., the 'action' occurs near $\Gamma_D$ in a boundary layer of thickness $\delta \approx \sqrt{\nu t}$. While this behavior usually occurs only once (at start-up) for the true solution of the NS equations, it occurs once per projection cycle when a projection approximation to the NS equations is utilized.

What is happening is this: the projection of $\tilde{\mathbf{u}}$ to $\mathbf{v}$ at the end of each cycle introduces a slip velocity on $\Gamma_D$ of size $s = (-1/2)T^2\,\partial\dot{P}_0/\partial\tau$ for projection 2 ($s = -T\,\partial P_0/\partial\tau$ for projection 1) and a resulting diffusional boundary layer of thickness $\sim\sqrt{\nu t}$, $0 < t \le T$, when $\tilde{u}_\tau$ is (necessarily, in this 'simple' approximation) forced to satisfy the no-slip BC during the next cycle. The effect of these processes is to cause a significant 'loss of regularity' within the layer $\delta$, with one result being that the Taylor series analyses presented above are probably not valid within this 'projection' boundary layer.
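As a rough illustration of how thin this layer is, consider the simplest model of the process just described: a tangential velocity step of size s imposed at the wall at t = 0 and diffusing into a semi-infinite fluid with no source term. This is the classical impulsively started plate problem, whose solution is s erfc(y / (2 sqrt(nu t))). It is offered only as a stand-in; the function name, the semi-infinite geometry, the parameter values, and the neglect of the tangential pressure gradient are assumptions made for this sketch, not part of the projection algorithm itself.

```python
import math

def slip_remaining(y, nu, t, s=1.0):
    """Velocity perturbation at distance y from the wall at time t.

    Assumes a step of size s applied at the wall at t = 0 and pure diffusion
    into a semi-infinite domain (no tangential pressure-gradient source).
    """
    delta = math.sqrt(nu * t)  # nominal boundary layer thickness, sqrt(nu*t)
    return s * math.erfc(y / (2.0 * delta))

nu, T = 1.0e-3, 1.0e-2  # hypothetical viscosity and projection cycle length
delta = math.sqrt(nu * T)
for k in (0, 1, 2, 4):
    frac = slip_remaining(k * delta, nu, T)
    print(f"y = {k} * delta: fraction of the step felt = {frac:.4f}")
```

In this model, less than half a percent of the step is felt four thicknesses from the wall at the end of the cycle, which is the sense in which the 'action' is confined to a layer of thickness of order sqrt(nu t).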
The whole concept of vorticity 'production' at a no-slip boundary is subtle, difficult, and perhaps not even yet fully understood. For some 'classical' discussions on the subject, see, for example, Batchelor (1967) or Panton (1984, 1996). For some fairly recent controversy on the subject, see Gresho (1992) and the references therein, and also E and Liu (1996). For a modern viewpoint with additional in-depth analysis, see Wu and Wu (1993, 1995, 1996) and Wu (1995). We assert that one must deal with this issue, successfully or not, when designing, discussing, and (especially) analyzing projection methods. We start with two questions:

1. Must we re-set the slip velocity, computed via $s = -\boldsymbol{\tau}\cdot\nabla\varphi$, to zero to start the next projection cycle? We presumed yes in our stated algorithm via Remark (3) following (3.16-323), but others seem to obtain good results by not doing so; just setting $\tilde{\mathbf{u}} = \mathbf{w}$ at the start of the next cycle seems to be good enough. (More on this later, at the end of this section.)

2. How is vorticity production at the wall related to the choice of BC?

While we have no 'final' answers, we currently believe that the answer to Question 1 is 'no,' and we try next to answer 2, somewhat heuristically. Starting in 2D for simplicity, the (scalar) vorticity on $\Gamma$ is defined by

$\omega = \partial u_\tau/\partial n - \partial u_n/\partial\tau$ (3.16-337)

and its flux into $\Omega$ at $\Gamma$ (whatever that means) by

$Q = -\nu\,\partial\omega/\partial n.$ (3.16-338)
From $\nabla\cdot\mathbf{u} = 0 = \partial u_n/\partial n + \partial u_\tau/\partial\tau$ and (3.16-337) follows $Q = -\nu\nabla^2 u_\tau$; the flux of $\omega$ is related to the tangential viscous term, which at least suggests the attempted employment of the tangential momentum equation on $\Gamma_D$, in the following way:

$Q = -\nu\nabla^2 u_\tau = \boldsymbol{\tau}\cdot(\mathbf{g} - D\mathbf{u}/Dt - \nabla P);$ (3.16-339)

vorticity flux is 'caused by' tangential body forces, tangential acceleration, and (last but not least) tangential pressure gradients. Since it is the latter 'source' term that is of interest herein, we abbreviate $\boldsymbol{\tau}\cdot(\mathbf{g} - D\mathbf{u}/Dt)$ by $f$ to obtain, integrating in time over one projection cycle,

$\int_0^T Q\,dt = \int_0^T f\,dt - \int_0^T \frac{\partial P}{\partial\tau}\,dt = \int_0^T f\,dt - \int_0^T\left[\frac{\partial P_0}{\partial\tau} + t\frac{\partial\dot{P}_0}{\partial\tau} + \frac{t^2}{2}\frac{\partial\ddot{P}_0}{\partial\tau} + \cdots\right]dt = \int_0^T f\,dt - \left(T\frac{\partial P_0}{\partial\tau} + \frac{T^2}{2!}\frac{\partial\dot{P}_0}{\partial\tau} + \frac{T^3}{3!}\frac{\partial\ddot{P}_0}{\partial\tau} + \cdots\right).$ (3.16-340)

Recalling now the slip velocities associated with the projections via the 'simpler' BC of $\tilde{\mathbf{u}} = \mathbf{w}$ on $\Gamma_D$, $s = -T\,\partial P_0/\partial\tau$ for projection 1 and $s = -(T^2/2)\,\partial\dot{P}_0/\partial\tau$ for projection 2, we see that the end-of-cycle vortex sheet that results from resetting the tangential velocity from $\boldsymbol{\tau}\cdot(\mathbf{w} - \nabla\varphi)$ to zero corresponds to the appropriate term in the Taylor series expansion of $\partial P/\partial\tau$. And this result can also be used to ascertain that the 'better' BC's of (3.16-325) for projection 1 and (3.16-315) for projection 2 'merely' increase by one power of $T$ ($\Delta t$ in the real case) the accuracy of vorticity injection. The important point is that the reduction of the slippery post-projection velocity to the no-slip value is a clear and important part of the projection method; what is not as clear is whether no-slip needs to be applied to both $\tilde{\mathbf{u}}$ and $\mathbf{u}$, or whether application to just $\tilde{\mathbf{u}}$ is sufficient. (This latter seems to be the case ... in which case our previous use of the word 'necessary' was unnecessary.)

In 3D, the situation is similar, just more complicated for a general, curved boundary. The starting point is the (no body force) curl form of the momentum equation (3.3-6), whose tangential components can be written as

$\nu\,\mathbf{n}\times(\nabla\times\boldsymbol{\omega})\times\mathbf{n} = -\mathbf{n}\times\left(\frac{D\mathbf{u}}{Dt} + \nabla P\right)\times\mathbf{n},$ (3.16-341)

where $\boldsymbol{\omega} = \nabla\times\mathbf{u}$.
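Combining this with the equation for the normal flux of tangential vorticity,

$\mathbf{Q} = -\nu\,\mathbf{n}\times\partial\boldsymbol{\omega}/\partial n,$ (3.16-342)

yields, with the help of J.-Z. Wu (personal communication, 1995), finally,

$\mathbf{Q} = \mathbf{n}\times\left(\frac{D\mathbf{u}}{Dt} + \nabla P\right)\times\mathbf{n} + \nu\,\mathbf{n}\times(\nabla_2\mathbf{n})\cdot\boldsymbol{\omega},$ (3.16-343)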
where $\nabla_2$ is the (2D) surface gradient operator. While surface curvature ($\nabla_2\mathbf{n} \ne 0$) causes additional flux of vorticity, the tangential pressure gradient plays a role similar to that in 2D, and that is the major point.

o Further analysis, justification. With this new proviso and new understanding [i.e., be careful about assuming too much 'smoothness' within $\delta(t)$], we now take the next (and last) step toward justifying projection methods, which begins by considering the hypothetical but do-able (in principle) projection cycle in which a continuous projection for all $t$ between 0 and $T$ is performed, for reasons soon to become clear (hopefully). Thus, we now consider both $\mathbf{v}$ and $\varphi$ to also be continuous functions of time and consider the following cycle, $0 \le t \le T$: Given $P_0(\mathbf{x})$ and $\mathbf{u}_0$ with $\nabla\cdot\mathbf{u}_0 = 0$,

Step 1. Solve for $\tilde{\mathbf{u}}$, with $\tilde{\mathbf{u}}_0 = \mathbf{u}_0$ at $t = 0$, from

$\partial\tilde{\mathbf{u}}/\partial t = \nu\nabla^2\tilde{\mathbf{u}} + \mathbf{f}(\tilde{\mathbf{u}}) - \nabla P_0$ in $\Omega$, $\tilde{\mathbf{u}} = \mathbf{w}(t)$ on $\Gamma_D$, and $\nu\,\partial\tilde{\mathbf{u}}/\partial n = \mathbf{F}(t) + \mathbf{n}P_0$ on $\Gamma_N$. (3.16-344)

Step 2. Perform the continuous projection; i.e.,

(a) Solve

$\nabla^2\varphi = \nabla\cdot\tilde{\mathbf{u}}(t)$ in $\Omega$, $\partial\varphi/\partial n = 0$ on $\Gamma_D$, $\varphi = -[F_n(t) + P_0]t/2$ on $\Gamma_N$. (3.16-345)

(b) Compute

$\mathbf{v}(t) = \tilde{\mathbf{u}}(t) - \nabla\varphi(t)$ in $\Omega$. (3.16-346)

Step 3. Compute the pressure from

$P(t) = P_0 + 2\varphi/t.$ (3.16-347)

Step 4. At $t = T$, set $\mathbf{u}_0 = \mathbf{v}(T)$ in $\Omega$ and on $\Gamma_N$, set $\mathbf{u}_0 = \mathbf{w}(T)$ on $\Gamma_D$, set $P_0 = P(T)$ in $\Omega$, set $t = 0$ (reset the clock), and go to Step 1.

The reason for considering a continuous projection is so that we can 'analyze' (partially) the results via Taylor series expansions, which we will do soon. But to begin the analysis, we first insert $\tilde{\mathbf{u}} = \mathbf{v} + \nabla\varphi$ into the intermediate velocity PDE, (3.16-344), to obtain

$(\partial/\partial t - \nu\nabla^2)(\mathbf{v} + \nabla\varphi) = \mathbf{f}(\mathbf{v} + \nabla\varphi) - \nabla P_0$ in $\Omega$, (3.16-348)

with BC's

$\mathbf{v} + \nabla\varphi = \mathbf{w}$ on $\Gamma_D$ (3.16-349)

and

$\nu\frac{\partial}{\partial n}(\mathbf{v} + \nabla\varphi) = \mathbf{F}(t) + \mathbf{n}P_0$ on $\Gamma_N$, (3.16-350)
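which, when augmented by the IC $\mathbf{v} + \nabla\varphi = \mathbf{u}_0$ at $t = 0$, is a well-posed problem for the linear combination of the two vector fields, $\mathbf{v}$ and $\nabla\varphi$. Rearrangement of (3.16-348) identifies it as what we shall refer to as a modified/perturbed momentum equation:

$\frac{\partial\mathbf{v}}{\partial t} + \nabla\left[P_0 + \left(\frac{\partial}{\partial t} - \nu\nabla^2\right)\varphi\right] = \nu\nabla^2\mathbf{v} + \mathbf{f}(\tilde{\mathbf{u}}),$ (3.16-351)

where we have reinstated $\tilde{\mathbf{u}}$ in $\mathbf{f}$, for 'convenience.' Note that to the extent that $P_0 + (\partial/\partial t - \nu\nabla^2)\varphi$ 'looks like' $P$, (3.16-351) 'looks like' the true momentum equation (at least if $\nabla\varphi$ is small compared with $\mathbf{v}$ in $\mathbf{f}(\tilde{\mathbf{u}})$ or, simpler yet, for Stokes flow). In fact, let us now subtract the true NS momentum equation from the $\mathbf{v}$ equation above to obtain

$\frac{\partial}{\partial t}(\mathbf{v} - \mathbf{u}) + \nabla\left[(P_0 - P) + \left(\frac{\partial}{\partial t} - \nu\nabla^2\right)\varphi\right] = \nu\nabla^2(\mathbf{v} - \mathbf{u}) + \mathbf{f}(\tilde{\mathbf{u}}) - \mathbf{f}(\mathbf{u}),$ (3.16-352)

which we analyze as follows, with slightly circuitous but reasonably rigorous logic, starting with the assumption [cf. (3.16-319)] that $\varphi(t) = t^2\dot{P}_0/2 + O(t^3)$:

1. From (3.16-314) and $\mathbf{v}(t) = \tilde{\mathbf{u}}(t) - \nabla\varphi(t)$, we see that

$\mathbf{v} - \mathbf{u} = \mathbf{u}(t) + \frac{t^2}{2}\nabla\dot{P}_0 + O(t^3) - \frac{t^2}{2}\nabla\dot{P}_0 - \mathbf{u}(t) = O(t^3),$ (3.16-353)

and thus $\partial(\mathbf{v} - \mathbf{u})/\partial t = O(t^2)$.

2. $P_0 - P = P_0 - (P_0 + t\dot{P}_0 + (t^2/2)\ddot{P}_0 + \cdots) = -t\dot{P}_0 + O(t^2).$ (3.16-354)

3. $\mathbf{f}(\tilde{\mathbf{u}}) - \mathbf{f}(\mathbf{u}) = \mathbf{f}\left(\mathbf{u} + \frac{t^2}{2}\nabla\dot{P}_0 + O(t^3)\right) - \mathbf{f}(\mathbf{u}) = \frac{t^2}{2}\frac{\partial\mathbf{f}}{\partial\mathbf{u}}\cdot\nabla\dot{P}_0 + O(t^3) = O(t^2).$ (3.16-355)

Inserting these asymptotic results into (3.16-352) gives $\nabla[(\partial/\partial t - \nu\nabla^2)\varphi - t\dot{P}_0] = O(t^2)$, which 'integrates' to $(\partial/\partial t - \nu\nabla^2)\varphi - t\dot{P}_0$ = constant (in space) + $O(t^2)$. Taking the constant to be zero leads to the ostensible 'governing' PDE for $\varphi$, at least for small time:

$\left(\frac{\partial}{\partial t} - \nu\nabla^2\right)\varphi = t\dot{P}_0$ in $\Omega$, (3.16-356)

with BC's

$\partial\varphi/\partial n = 0$ on $\Gamma_D$ (3.16-357)

and

$\varphi = -[F_n(t) + P_0]t/2$ on $\Gamma_N$, (3.16-358)

and IC

$\varphi = 0$ in $\Omega$, (3.16-359)

a parabolic PDE that supplements the 'real' (and elliptic) one, $\nabla^2\varphi = \nabla\cdot\tilde{\mathbf{u}}$; i.e., the $\varphi$ that comes from the elliptic equation also, to the order stated, satisfies the transient heat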
equation. Considering now that $\varphi = 0$ at $t = 0$, and that we are really only interested in the solution to this transient heat equation outside of the spurious projection boundary layer, suggests that the 'outer' solution of (3.16-356) through (3.16-359) in the neighborhood of $\Gamma_D$ (not $\Gamma_N$) can be obtained by neglecting the diffusion term to give

$\varphi(\mathbf{x}, t) \approx \frac{t^2}{2}\dot{P}_0(\mathbf{x}),$ (3.16-360)

at least up to $O(t^3)$ and outside of $\delta = O(\sqrt{\nu t})$, which we happily note is consistent with the assumption made at the beginning of the analysis.

Remark: We realize that the stated form of $\varphi$ is not correct near $\Gamma_N$, but we are not really interested in the solution there since it is near $\Gamma_D$ where the problems lie.

Noting now that $\varphi = t^2\dot{P}_0/2$ is also the solution [through $O(t^3)$] that was obtained in the 'hypothetical' case of 'optimal' BC's, we now believe and assert that the simple BC's only cause (portions of) our projected solution to be spurious within $O(\delta)$ of the Dirichlet boundary and that the actual use of optimal BC's would 'merely' mean that (3.16-360) would apply all the way to the wall. [The pollution from projection would then occur in the $O(t^3)$ terms; but it would occur.] We note, finally, that the pressure update, $P = P_0 + 2\varphi/t = P_0 + t\dot{P}_0$, is consistent with the above solution. Then, to the extent that (3.16-360) does apply, the modified momentum equation (3.16-351) becomes

$\frac{\partial\mathbf{v}}{\partial t} + \nabla(P_0 + t\dot{P}_0) = \nu\nabla^2\mathbf{v} + \mathbf{f}(\tilde{\mathbf{u}}) + O(t^2),$ (3.16-361)

which now looks a lot more like the NS momentum equation.

o Special cases. To help appreciate the importance and extra difficulty caused by no-slip walls, and to see a really great use of the projection method, consider the following two special classes of problems for Stokes flow:

1. Periodic domain, periodic forcing function, periodic BC's.

2. Unbounded domain, IC of compact support, and body force that goes to zero as $\mathbf{x} \to \infty$.
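These can be described by the following IBVP or IVP:

$\partial\mathbf{u}/\partial t + \nabla P = \nu\nabla^2\mathbf{u} + \mathbf{f}(\mathbf{x}, t)$ and $\nabla\cdot\mathbf{u} = 0$ in $\Omega$, (3.16-362)

with divergence-free initial velocity ($\mathbf{u}_0$), which is periodic for Class 1 and with $\mathbf{u}_0(\mathbf{x}) \to 0$ for $\mathbf{x} \to \infty$ for Class 2, periodic BC's for Class 1, and BC of $\mathbf{u} \to 0$ for $\mathbf{x} \to \infty$ for Class 2. Derived from (3.16-362) are the PPE,

$\nabla^2 P = \nabla\cdot\mathbf{f}$ in $\Omega$, (3.16-363)

and the VTE

$\partial\boldsymbol{\omega}/\partial t = \nu\nabla^2\boldsymbol{\omega} + \nabla\times\mathbf{f}.$ (3.16-364)

The BC's for (3.16-363) are (i) periodic or (ii) $P \to 0$ as $\mathbf{x} \to \infty$, and those for $\boldsymbol{\omega}$ are the 'same.'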
The first observation is that the pressure 'stands alone' in that we do not need to know the velocity field to solve (3.16-363). This leads to the following 'idea': omit $\nabla P$ from (3.16-362), call the resulting (and generally non-solenoidal) velocity field $\tilde{\mathbf{u}}$, and consider finding $\tilde{\mathbf{u}}$ and later projecting it ...; i.e., solve

$\partial\tilde{\mathbf{u}}/\partial t = \nu\nabla^2\tilde{\mathbf{u}} + \mathbf{f}$ (3.16-365)

with $\tilde{\mathbf{u}}_0 = \mathbf{u}_0$ and the same BC's as for $\mathbf{u}$. Clearly this vector 'heat' equation (with uncoupled components, yet) is much easier to solve than (3.16-362). The curl of it yields

$\partial\tilde{\boldsymbol{\omega}}/\partial t = \nu\nabla^2\tilde{\boldsymbol{\omega}} + \nabla\times\mathbf{f},$ (3.16-366)

with $\tilde{\boldsymbol{\omega}}_0 = \boldsymbol{\omega}_0$ and the same BC's as on $\boldsymbol{\omega}$; note that both $\tilde{\boldsymbol{\omega}}$ and $\boldsymbol{\omega}$ are necessarily divergence-free. Our next observation is that $\tilde{\boldsymbol{\omega}} = \boldsymbol{\omega}$ even though only the latter is derived from a divergence-free vector field, suggesting that the two velocities could only differ by the gradient of a scalar.

Next, suppose we have solved (3.16-362) and (3.16-365) from $t = 0$ to $t = t_F$ (not necessarily small) and we ask: How would the projection of $\tilde{\mathbf{u}}$ to the divergence-free subspace compare with $\mathbf{u}$? We obtain the answer by 'construction': compute $\mathbf{v}$ and $\varphi$ from

$\tilde{\mathbf{u}} = \mathbf{v} + \nabla\varphi$ and $\nabla\cdot\mathbf{v} = 0$ (3.16-367)

with either periodic (Class 1) or no BC's (Class 2; i.e., $\mathbf{v} \to 0$ for $\mathbf{x} \to \infty$). This of course involves first solving the Poisson equation,

$\nabla^2\varphi = \nabla\cdot\tilde{\mathbf{u}},$ (3.16-368)

and then computing $\mathbf{v} = \tilde{\mathbf{u}} - \nabla\varphi$. To answer the above question, we invoke the following (Helmholtz) theorem: a vector field is uniquely defined if and only if both its curl and divergence are known, as well as its value at one space point.
Thus, since $\mathbf{v}$ and $\mathbf{u}$ have both the same curl ($\tilde{\boldsymbol{\omega}} = \boldsymbol{\omega}$) and the same divergence (zero), it follows that $\mathbf{v} = \mathbf{u}$ and we are done; for the special classes of problems defined above, the transient Stokes equations can be solved by omitting the pressure and solving the vector heat equation of (3.16-365) and projecting the resulting vector field to the divergence-free subspace at whatever times $\mathbf{u}(\mathbf{x}, t)$ is desired, and only at those times. The pressure, if desired, is always obtainable from (3.16-363).

We conclude this interesting 'digression' with three Remarks:

(1) If $\mathbf{f} = 0$, then we have $P = 0$ (or constant) and $\tilde{\mathbf{u}} = \mathbf{u}$: the decaying solution remains divergence-free with no 'need' for a pressure; $\mathbf{u}(\mathbf{x})$ is then (necessarily) a linear combination of (decaying) Stokes eigenfunctions, each of which has $P = 0$ (see Walsh, 1991).

(2) If $\mathbf{f}$ is independent of time such that a steady solution, say $\mathbf{u}_s$, will ultimately obtain, then a similar trick can be applied: simply solve for $\mathbf{v} = \mathbf{u} - \mathbf{u}_s$ from $\partial\mathbf{v}/\partial t = \nu\nabla^2\mathbf{v}$ with $\mathbf{v}_0 = \mathbf{u}_0 - \mathbf{u}_s$. There is no more need for $P$ because $\mathbf{v}$ (and thus $\mathbf{u}$) will remain divergence-free.
Proof: The pressure is given by (3.16-363) and is independent of time, and the divergence-free steady-state velocity is obtained by solving $\nu\nabla^2\mathbf{u}_s = \nabla P - \mathbf{f}$, which permits (3.16-362) to be rewritten as $\partial\mathbf{u}/\partial t = \nu\nabla^2(\mathbf{u} - \mathbf{u}_s)$ and thus as $\partial(\mathbf{u} - \mathbf{u}_s)/\partial t = \nu\nabla^2(\mathbf{u} - \mathbf{u}_s)$. [Note that both $\mathbf{u}_s = 0$ and $P$ = const if $\mathbf{f} = 0$, consistent with Remark (1).] Defining now $D = \nabla\cdot\mathbf{u}$ leads to $\partial D/\partial t = \nu\nabla^2 D$ with $D = 0$ at $t = 0$. Finally, since the BC's cannot cause non-zero $D$, we have $D = 0$, and thus $\mathbf{u} = \mathbf{v} + \mathbf{u}_s$ solves (3.16-362). QED. In this case, the solution is also representable as a linear combination of decaying Stokes eigenfunctions, again with $P = 0$ for each:

$\mathbf{u} = \sum_{n=1}^{\infty}\left[\mathbf{a}_n e^{-\lambda_n t} + \mathbf{b}_n\left(1 - e^{-\lambda_n t}\right)\right],$

where $\mathbf{a}_n$ is the projection onto the $n$-th eigenfunction of $\mathbf{u}_0(\mathbf{x})$, and $\mathbf{b}_n$ is that of $\mathbf{u}_s(\mathbf{x})$.

(3) It is sad but true that the (non-linear) advection terms preclude such a streamlined solution procedure; for the NSE, the pressure is always needed. Damn!

o Yet another equation for $\varphi$, the BHE. One may still wonder (legitimately, we add) if and why the bad/polluted portions of the solution actually recover, as we have merely asserted, outside of the (putative) projection boundary layer. To begin to answer the question, we derive yet a third(!) PDE for $\varphi$, obtained by operating on (3.16-351) with the divergence operator and using $\nabla\cdot\mathbf{v} = 0$:

$\left(\frac{\partial}{\partial t} - \nu\nabla^2\right)\nabla^2\varphi = \nabla\cdot[\mathbf{f}(\tilde{\mathbf{u}}) - \nabla P_0].$ (3.16-369)

If $\mathbf{f}(\tilde{\mathbf{u}})$ was known (e.g., for Stokes flow), (3.16-369) might be solved for $\varphi$ if appropriate IC's and BC's were specified, and if the implied regularity/smoothness was 'true.' [As this assumption is probably violated in the majority of actual simulations, we are admittedly treading on thin ice. Also, some of our more mathematical colleagues have not and presumably will not 'buy' it; in particular J. Shen and R. Rannacher, both of whom have made serious and important contributions to 'projection theory.'
On the other side of the coin, though, we have concurrence by the mathematician who first employed projection 2, J. van Kan (personal communication), and a very recent contribution by another mathematician, Schwab (1995), in which he shows, for the time-discretized case that we consider below, that the pressure does indeed (when sufficiently regular, of course) satisfy a biharmonic equation (BHE). We will thus present it because we believe it does lead to a useful and perhaps deeper understanding of the projection boundary layer; it also leads to what we call the Biharmonic Miracle (BHM).]

We know the IC, $\varphi = 0$, and we know some BC's [(3.16-357), (3.16-358)]. A completely posed problem may be obtained (we assert) by applying, on the boundary, the normal component of the modified momentum equation from which this higher-order equation was derived; namely (3.16-351):

$\left(\frac{\partial}{\partial t} - \nu\nabla^2\right)\frac{\partial\varphi}{\partial n} = \mathbf{n}\cdot[\nu\nabla^2\mathbf{v} + \mathbf{f}(\tilde{\mathbf{u}}) - \nabla P_0 - \partial\mathbf{v}/\partial t]$ on $\Gamma_D + \Gamma_N$. (3.16-370)

This equation is analogous to the inhomogeneous Neumann BC (normal momentum equation on $\Gamma_D$) needed for the PPE and, we believe, for the same reason: to assure
$\nabla\cdot\mathbf{v} = 0$ on $\Gamma$. [Alternatively, one could argue the other way: $\nabla\cdot\mathbf{v} = 0$ on $\Gamma$ causes (3.16-370) to apply.] In any event, the above parabolic problem is now well-posed and should, under perhaps some stringent regularity assumptions (probably even requiring very smooth boundaries) admit a solution, at least in principle. We also believe and assert that the solution to the above IBVP will also (for $t$ small) display a boundary layer behavior such that, outside of $\delta = O(\sqrt{\nu t})$, the $\nabla^4\varphi$ term will become negligible, and the previous approximate solution, $\varphi = t^2\dot{P}_0/2$, will apply, for which (3.16-369) reduces to $\nabla^2(P_0 + t\dot{P}_0) = \nabla\cdot\mathbf{f}(\tilde{\mathbf{u}}) + O(t^2)$, the 'PPE' for small $t$; similarly, (3.16-370) then simplifies to $\mathbf{n}\cdot\nabla(P_0 + t\dot{P}_0) = \mathbf{n}\cdot[(\nu\nabla^2\mathbf{v}) + \mathbf{f}(\tilde{\mathbf{u}}) - \partial\mathbf{v}/\partial t]$, the appropriate (on $\Gamma_D$ at least) BC for the PPE.

o Semi-discrete BHE. To make further progress, we have found it fruitful, in order to complete the (already too-long?) analysis of projection methods, to shift gears and go to a semi-discrete projection method, in which only time is discretized. In so doing we shall make the same assumption that all 'practitioners' have made (whether or not they realized it): that the projection should be performed at each timestep of the selected time integration method for $\tilde{\mathbf{u}}$. That this is silly and expensive, at least for any flow that is approaching a steady state with $\Delta t$ fixed, is obvious. How to do better is less obvious, and few have tried. [We (PMG and H. Daniels) made a brief attempt in 1992, but were not so successful. We also have more recently learned (see Gresho et al., 1995) that projection methods are not so efficient for finding steady solutions; they are 'inherently' time-accurate methods, a conclusion that we shall validate later.]
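Anyway, we now present one semi-implicit algorithm that could be useful in practice (for simplicity, we drop the body force term and assume $\mathbf{F}$ to be time-independent); use AB2 (or even AB3) for advection and TR for diffusion. The general step (ignoring start-up issues for now) is as follows:

1. Given $P_m$ and $\mathbf{u}_m$ with $\nabla\cdot\mathbf{u}_m = 0$, and $\mathbf{u}_{m-1}$ with $\nabla\cdot\mathbf{u}_{m-1} = 0$, solve for $\tilde{\mathbf{u}}_{m+1}$ from

$\frac{\tilde{\mathbf{u}}_{m+1} - \mathbf{u}_m}{\Delta t} + \frac{1}{2}(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}) + \nabla P_m = \frac{\nu}{2}\nabla^2(\tilde{\mathbf{u}}_{m+1} + \mathbf{u}_m)$ in $\Omega$, (3.16-371)

with BC's

$\tilde{\mathbf{u}}_{m+1} = \mathbf{w}_{m+1}$ on $\Gamma_D$ (3.16-372)

and

$\nu\,\partial\tilde{\mathbf{u}}_{m+1}/\partial n = \mathbf{F} + \mathbf{n}P_m$ on $\Gamma_N$, (3.16-373)

a 'modified' Helmholtz equation [with operator $(I - (\Delta t/2)\nu\nabla^2)$ vis-a-vis $(\nabla^2 + k^2)$].

2. Solve for $\varphi$ from

$\nabla^2\varphi = \nabla\cdot\tilde{\mathbf{u}}_{m+1}$ in $\Omega$, (3.16-374)

$\partial\varphi/\partial n = 0$ on $\Gamma_D$, (3.16-375)

$\varphi = -\frac{\Delta t}{2}(F_n + P_m)$ on $\Gamma_N$. (3.16-376)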
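3. Update the velocity via

$\mathbf{u}_{m+1} = \tilde{\mathbf{u}}_{m+1} - \nabla\varphi$ in $\Omega$. (3.16-377)

4. Update the pressure from

$P_{m+1} = P_m + 2\varphi/\Delta t.$ (3.16-378)

5. Reset variables: $P_{m+1} \to P_m$, $\mathbf{u}_{m+1} \to \mathbf{u}_m$ except on $\Gamma_D$, where $\mathbf{w}_{m+1} \to \mathbf{u}_m$. Bump $m$ and go to 1.

We can now complete our projection analysis, such as it is; insert $\tilde{\mathbf{u}}_{m+1} = \mathbf{u}_{m+1} + \nabla\varphi$ into (3.16-371) to obtain the semi-discrete version of the modified momentum equation:

$\frac{\mathbf{u}_{m+1} - \mathbf{u}_m}{\Delta t} + \frac{1}{2}[3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}] + \nabla\left[P_m + \frac{1}{\Delta t}\left(I - \frac{\nu\Delta t}{2}\nabla^2\right)\varphi\right] = \frac{\nu}{2}\nabla^2(\mathbf{u}_m + \mathbf{u}_{m+1}),$ (3.16-379)

whose divergence is also of interest here:

$(I - \delta^2\nabla^2)\nabla^2\varphi/\Delta t = -\nabla\cdot\left[\nabla P_m + \frac{1}{2}(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1})\right],$ (3.16-380)

where now $\delta = \sqrt{\nu\Delta t/2}$ is the projection BLT, and it is worth noting that this result looks like (3.16-369) after a single timestep, starting from $\varphi = 0$. (Indeed, it could have been so derived, and perhaps should have been.) Anyway, we now have a biharmonic equation, the third PDE satisfied by that slippery variable called $\varphi$. The final (?) step in the analysis is to replace $\varphi$ by $(P_{m+1} - P_m)\Delta t/2$ a la (3.16-378) to obtain the BHE for the pressure difference:

$(I - \delta^2\nabla^2)\nabla^2\left(\frac{P_{m+1} - P_m}{2}\right) = -\nabla\cdot\left[\nabla P_m + \frac{1}{2}(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1})\right],$ (3.16-381)

whose rearrangement gives the BHE satisfied by the pressure itself:

$(I - \delta^2\nabla^2)\nabla^2 P_{m+1} = -\nabla\cdot(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}) - (I + \delta^2\nabla^2)\nabla^2 P_m.$ (3.16-382)

Thus, rather than the desired PPE, the 'projection pressure' satisfies a (singularly perturbed) BHE that approximates the PPE, and apparently does it well if $\delta^2\nabla^2 P$ is 'small' compared with $P$. (This is also essentially the BHE derived recently by Schwab, 1995.) A different rearrangement will make the perturbation to the PPE more clear:

$\nabla^2\left(\frac{P_m + P_{m+1}}{2}\right) = -\nabla\cdot\left[\frac{1}{2}(3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1})\right] + \delta^2\nabla^4\left(\frac{P_{m+1} - P_m}{2}\right),$ (3.16-383)

which, since $P_{m+1} - P_m$ is $O(\Delta t)$ and $\delta^2 = \nu\Delta t/2$, is clearly an $O(\Delta t^2)$ perturbation from the 'good' PPE.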
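To get a feel for the size of this perturbation, one can look at a single Fourier mode, for which the Laplacian acts as multiplication by -k^2; the operator (I - delta^2 Laplacian) Laplacian then differs from the Laplacian alone by the relative amount delta^2 k^2 = nu dt k^2 / 2. The sketch below evaluates that ratio. The single-mode (periodic, constant-coefficient) setting, the function names, and the parameter values are all illustrative assumptions, and the estimate says nothing about the boundary layer near the Dirichlet wall.

```python
import math

def projection_blt(nu, dt):
    """Projection boundary layer thickness, delta = sqrt(nu*dt/2)."""
    return math.sqrt(nu * dt / 2.0)

def bhe_relative_perturbation(nu, dt, wavelength):
    """Relative change delta^2 * k^2 of the Laplacian symbol for one mode."""
    k = 2.0 * math.pi / wavelength  # wavenumber of the assumed Fourier mode
    return projection_blt(nu, dt) ** 2 * k ** 2

nu = 1.0e-2  # hypothetical kinematic viscosity
for dt in (1.0e-1, 1.0e-2, 1.0e-3):
    r = bhe_relative_perturbation(nu, dt, wavelength=1.0)
    print(f"dt = {dt:g}: delta = {projection_blt(nu, dt):.4f}, delta^2 k^2 = {r:.3e}")
```

For a fixed wavelength this operator factor is O(dt); the additional power of dt in the O(dt^2) estimate above comes from the operator acting on the pressure difference, which is itself O(dt). The factor also grows as k^2, so the shortest resolved waves are perturbed most.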
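The BC's for (3.16-381) are (from the earlier $\varphi$ equations): $\partial(P_{m+1} - P_m)/\partial n = 0$ on $\Gamma_D$, $(P_{m+1} - P_m) = -(F_n + P_m)$ on $\Gamma_N$, and the normal component of (3.16-379) on $\Gamma_D$ and $\Gamma_N$; i.e.,

$\left(I - \frac{\nu\Delta t}{2}\nabla^2\right)\frac{\partial}{\partial n}(P_{m+1} - P_m) = \mathbf{n}\cdot\left[\nu\nabla^2(\mathbf{u}_m + \mathbf{u}_{m+1}) - (3\mathbf{u}_m\cdot\nabla\mathbf{u}_m - \mathbf{u}_{m-1}\cdot\nabla\mathbf{u}_{m-1}) - 2\nabla P_m - 2\,\frac{\mathbf{u}_{m+1} - \mathbf{u}_m}{\Delta t}\right].$ (3.16-384)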
These BC's permit, in principle, the solution of the BHE for $(P_{m+1} - P_m)$; they also 'cause' a recovery of the normal pressure gradient from the bad value on $\Gamma_D$ (zero) to the proper PPE value because the term $\nu\Delta t\nabla^2(P_{m+1} - P_m)/2$ becomes small compared with $(P_{m+1} - P_m)$ once outside of the projection BL; i.e., for $x_n > O(\sqrt{\nu\Delta t})$, where $x_n$ denotes the normal distance from $\Gamma$ into $\Omega$.

o 1D model problem, BHM. Now, because we probably cannot actually solve the BHE BVP for any real case of interest, we switch to a 1D model problem, introduced in Gresho (1990b), which we believe mimics at least some of the important parts of the solution of (3.16-383).

1. The model PPE ($P \to u$) is given by

$-u'' = S$ on $0 < x < 1$, $u' = a$ at $x = 0$, $u' = a - S$ at $x = 1$, (3.16-385)

where $S$ and $a$ are constant, and we note that the solvability condition $\int_0^1 S = u'(0) - u'(1)$ is satisfied. Thus, (3.16-385) has a solution, up to an arbitrary additive constant, which we take to be zero.

2. The model BHE problem is given by

$\delta^2 u'''' - u'' = S$ in $0 < x < 1$, $u' = q_0$ at $x = 0$, $u' = q_1$ at $x = 1$, and $-\delta^2 u''' + u' = a$ at $x = 0$, $-\delta^2 u''' + u' = a - S$ at $x = 1$, (3.16-386)

where $q_0$ and $q_1$ are arbitrary (although they should both be taken as zero to look more like the pressure conditions, we include the more general case for more 'punch').

Remarks:

(1) We shall regard the solution of the BHE as 'spurious' to the extent that it disagrees with the PPE solution.

(2) The solvability condition is now $[\delta^2 u''' - u']_0^1 = \int_0^1 S$, which is again satisfied, for arbitrary $q_1$ and $q_0$. Again we take the resulting arbitrary additive constant to be zero.

(3) A similar 'boundary layer' problem is discussed by Bender and Orszag (1978), Example 2 on p. 449.

The PPE solution is simply

$u(x) = u_{PPE} = ax - Sx^2/2,$ (3.16-387)

and that of the BHE is (less simply, of course)

$u(x) = u_{PPE} + \frac{\delta}{e^{1/\delta} - e^{-1/\delta}}\left\{\left[S - a(1 - e^{-1/\delta}) + (q_1 - q_0 e^{-1/\delta})\right]e^{x/\delta} + \left[S + a(e^{1/\delta} - 1) + (q_1 - q_0 e^{1/\delta})\right]e^{-x/\delta}\right\},$ (3.16-388)
which is well approximated when $\delta \ll 1$ (the case of most interest) by

$u = u_{PPE} + \delta\left[(S - a + q_1)e^{-(1-x)/\delta} + (a - q_0)e^{-x/\delta}\right].$ (3.16-389)

The 'normal' derivatives are also of interest:

$u'_{PPE} = a - Sx$ (3.16-390)

and, for the BHE solution,

$u' = u'_{PPE} + (S - a + q_1)e^{-(1-x)/\delta} - (a - q_0)e^{-x/\delta},$ (3.16-391)

which satisfies $u' = q_0$ at $x = 0$ and $u' = q_1$ at $x = 1$, as it must. (The exact solution satisfies these BC's exactly, of course.) Thus, we see that:

1. The error in 'pressure' is $O(\delta)$ at the wall and very small outside the BL, $O(\delta e^{-1/\delta})$.

2. The error in the normal gradient is very large, $O(1)$, at the wall, but very small outside the BL. In fact, (3.16-391) shows that $u' = u'_{PPE} + O(e^{-1/\delta})$ for $O(\delta) < x < 1 - O(\delta)$; the large error at the wall 'vanishes' as the BL is traversed.

3. This is the biharmonic miracle in 1D: a 'rough' solution with a tendency to show a loss of regularity because of the very large higher derivatives at the wall [$(d^m/dx^m)u \approx 1/\delta^{m-1}$] smooths out nicely while traversing the BL to recover to virtually the proper PPE results. To the extent that this behavior carries over to the full NS equations, we have explained and rationalized some of the disconcerting features of projection methods.

To further elucidate the BHM and, importantly, to be sure that discrete approximations also 'recover,' we present, with thanks to S. Chan, in Figures 3.16-15 through 3.16-17 a summary of results of exact and finite difference approximate solutions (second-order, centered) to both PPE and BHE, for $a = 4$, $S = 6$, $q_0 = q_1 = 0$, for which $u_{PPE}(x) = 4x - 3x^2$ and $u(x) = 4x - 3x^2 + \delta[2e^{-(1-x)/\delta} + 4e^{-x/\delta}]$; $u'_{PPE}(x) = 4 - 6x$, and $u'(x) = 4 - 6x + 2e^{-(1-x)/\delta} - 4e^{-x/\delta}$. In each figure, we plot the BHE solution, $u(x)$, as solid curves, the upper one being the FDM solution (101 grid points) and the lower one the analytic solution. The solid dots describe the FDM solution of the PPE and the open dots the analytic solution of same.
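The analytic side of this comparison is easy to reproduce. The following is a minimal sketch that evaluates the small-delta forms (3.16-389) and (3.16-391) for a = 4, S = 6, q0 = q1 = 0 and measures how far the BHE solution and its gradient stray from the PPE values at the walls and in the interior. The function names, the choice delta = 0.01, and the "ten thicknesses from the wall" definition of the interior are assumptions made here for illustration; no finite difference solution is attempted.

```python
import numpy as np

def u_ppe(x, a, S):
    """Model PPE solution (3.16-387)."""
    return a * x - 0.5 * S * x**2

def du_ppe(x, a, S):
    """Model PPE gradient (3.16-390)."""
    return a - S * x

def u_bhe(x, a, S, delta, q0=0.0, q1=0.0):
    """Small-delta approximation (3.16-389) to the model BHE solution."""
    return (u_ppe(x, a, S)
            + delta * ((S - a + q1) * np.exp(-(1.0 - x) / delta)
                       + (a - q0) * np.exp(-x / delta)))

def du_bhe(x, a, S, delta, q0=0.0, q1=0.0):
    """Gradient (3.16-391) of the approximate BHE solution."""
    return (du_ppe(x, a, S)
            + (S - a + q1) * np.exp(-(1.0 - x) / delta)
            - (a - q0) * np.exp(-x / delta))

a, S, delta = 4.0, 6.0, 0.01
x = np.linspace(0.0, 1.0, 101)
err_u = np.abs(u_bhe(x, a, S, delta) - u_ppe(x, a, S))
err_du = np.abs(du_bhe(x, a, S, delta) - du_ppe(x, a, S))
interior = (x > 10 * delta) & (x < 1.0 - 10 * delta)  # outside the BL

print("wall errors in u     :", err_u[0], err_u[-1])
print("wall errors in du/dx :", err_du[0], err_du[-1])
print("max interior errors  :", err_u[interior].max(), err_du[interior].max())
```

With these values the wall error in u is about 4 delta at x = 0 and 2 delta at x = 1, the wall errors in the gradient are about 4 and 2 (the BHE gradient is pinned to zero there), and both errors are far smaller ten thicknesses into the domain, which is the recovery described in points 1 through 3.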
Finally, the first derivative of the BHE solution is also plotted ('normal pressure gradient'): dashed for the FDM solution and solid for the analytic solution.

Figure 3.16-15 shows the case wherein the mesh is too coarse to resolve the spurious BL or, stated differently, the timestep is small enough that the BL is very thin; here, $\delta = \sqrt{\nu\Delta t} = 0.001$ and $h = 0.01$, giving $h = 10\delta$. Note first that the analytic PPE results (open circles) lie right on top of the analytic BHE curve; the only discrepancy, of $O(\delta)$ at $x = 0, 1$, is graphically invisible. But the differences do show up in $du/dx$, wherein the BHE shows the $O(1)$ error at the walls; the correct slopes are 4 and $-2$ at $x = 0, 1$, vis-a-vis zero for the BHE. The FDM results, while not particularly accurate, do show a similar behavior with the important result that the approximate BHE agrees with the approximate PPE even though the spurious BL is too thin to be resolved.

The other extreme, a fat BL, is shown in Figure 3.16-16; here $\delta = 0.1 = 10h$ so that the BL is well-resolved in this case. Here even the analytic results for PPE and BHE show large disagreement near the walls. But the key point of this result is that even though the BHE easily resolves the (bad) solution within the spurious BL, once outside of it (say
$2\delta$ from the walls), the BHE solution recovers to the desired PPE solution—both for the analytic and FDM cases. Finally, Figure 3.16-17 shows the case in between, $\delta = h = 0.01$. No more surprises. Not shown are cases with better spatial resolution (up to 401 points, with 'convergence' observed) and even more extreme ratios of $\delta/h$ (from 0.01 to 100).

Fig. 3.16-15 PPE/BHE results; small timestep yields unresolved boundary layer ($\delta = 0.1h$).

The inevitable conclusions from the model problem are these:

1. In all cases, the solution of the discrete PPE was close to that of the discrete BHE once outside of the BL region.
2. For $\delta \ll h$, the discrete BHE solution 'tactfully' ignores a BL that it cannot see; i.e., at the first node away from the wall (and all others), the agreement between BHE and PPE was good.
3. For $\delta \gg h$, the BHE solution within the BL is 'poor' (far from the PPE solution) for both analytic and FDM—but once outside the BL, a near-miraculous recovery occurs.
4. To the extent that this discrete Biharmonic Miracle extends to the multi-dimensional case associated with the projection method, the 'success' of the method is, at least partially, explained—even for Stokes flow.

That was the good news. The bad news is this: suppose you need accurate results right up to and at the wall and you accordingly use a graded and extremely-fine-near-the-wall
mesh? In spite of the fact that your implicit treatment of the viscous terms has removed the onerous diffusional stability limit, $\Delta t < O(h^2/\nu)$, the requirement of high accuracy at the wall seems to say that $\delta = \sqrt{\nu\Delta t}$ should be small relative to the mesh spacing, $h$, so that the specious BL is thin enough not to matter. But this brings us back to $\delta < h$ (or $\delta \ll h$!), which translates to $\sqrt{\nu\Delta t} < h$ or $\Delta t < h^2/\nu$, and we are (or might as well be) back to an explicit treatment of diffusion! That is to say, if it is really true that accurate results near the wall require $\Delta t < O(h^2/\nu)$ for $h$ 'very' small, then the projection method loses much of its appeal for at least a certain class of problems, since forward Euler could do the same job at lower cost. We shall return to this issue after a brief excursion to the 'other end' (relatively large timesteps), which we do next.

Fig. 3.16-16 PPE/BHE results; large timestep yields a thick boundary layer ($\delta = 10h$).

◦ Steady-state paradox. We have mentioned more than once that projection methods should be used in the 'time-accurate' mode and not be used as steady-state seekers—at least not 'by design'; if a time-dependent flow attains a steady state via 'accurate' time-marching, that is another matter. And we are not alone; cf. Simo and Armero (1994) and Turek (1997). Here we will explain why large timesteps should not be used and do so using a most robust ODE method, BE. To simplify our task, we first rewrite the DAE's in the condensed notation used previously (Section 3.16.1), and we do it for Stokes flow, for simplicity. (Recall that BE applied to the index 2 Stokes DAE's will give the proper
steady-state solution in one step if $\Delta t$ is set to infinity.)

Fig. 3.16-17 PPE/BHE results; 'balanced' case ($\delta = h$).

To solve

$$\dot u + Ku + GP = f \ne f(t) \tag{3.16-392}$$

and

$$Du = g \ne g(t) \tag{3.16-393}$$

via the BE projection method, we do the following, given $P_n$ and $u_n$ with $Du_n = g$ for $n = 0, 1, \ldots$:

Step 1. Solve $(\tilde u_{n+1} - u_n)/\Delta t + K\tilde u_{n+1} + GP_n = f$ for $\tilde u_{n+1}$.
Step 2. Solve $DG(P_{n+1} - P_n) = (D\tilde u_{n+1} - g)/\Delta t$ for $\Delta t(P_{n+1} - P_n)$.
Step 3. Compute $u_{n+1} = \tilde u_{n+1} - \Delta t\,G(P_{n+1} - P_n)$.
Step 4. Update $P$, bump $n$, and go to 1. (3.16-394)

We now show that algorithm (3.16-394) is very badly-behaved for large $\Delta t$ with the result that, unlike true BE applied to the Stokes DAE's, a steady result is not attainable simply by taking a few very large steps. Quite the contrary, in fact; the larger is $\Delta t$, the more steps does (3.16-394) require to attain steady state! A qualitative picture of this disaster is shown in Figure 3.16-18, first shown in Gresho et al. (1995), in which
the horizontal line ($\nabla\cdot u = 0$) represents the manifold of all divergence-free velocities—which is the range of the projection matrix,

$$\wp = I - G(DG)^{-1}D \tag{3.16-395}$$

inherent in the above algorithm, in which we have taken $g = 0$ in order to obtain orthogonal projections (see Appendix 3). The figure depicts the fact that Step 1 of (3.16-394) lifts us out of the divergence-free subspace, and farther from it for larger $\Delta t$ ($\Delta t_1 < \Delta t_2 < \Delta t_3, \ldots$), and also depicts Steps 2 and 3, which bring us back.

Fig. 3.16-18 Backward Euler projections for several $\Delta t$'s.

The limiting case, $\Delta t = \infty$, is particularly easy to examine, so we do so. Take $n = 0$ and $\Delta t = \infty$ to obtain

1. $\tilde u_1 = K^{-1}(f - GP_0)$, where (recall) $P_0$ came from $DGP_0 = D(f - Ku_0)$.
2. Solve for $\lambda \equiv \Delta t(P_1 - P_0)$ from $DG\lambda = D\tilde u_1 - g$, where we note that $\Delta t\,\Delta P$ is perfectly well-defined even for $\Delta t \to \infty$; i.e., the Lagrange multiplier ($\lambda$) is finite and thus $P_1 \to P_0$.
3. Compute $u_1 = \tilde u_1 - G\lambda = \wp\tilde u_1 + G(DG)^{-1}g = \wp K^{-1}(f - GP_0) + G(DG)^{-1}g$,

and we are done with the first timestep. Now set $n = 1$ for the second timestep; we easily obtain

$$\tilde u_2 = K^{-1}(f - GP_1) = K^{-1}(f - GP_0) = \tilde u_1 \tag{3.16-396}$$

and we see the 'stall' shown in Figure 3.16-18: the 'velocity' simply bounces up and down between $\tilde u_\infty$ and $u_\infty$. Only for small $\Delta t$ (e.g., $\Delta t = \Delta t_1$ in the figure) is the projection method useful. (Figure 3.16-19 shows the analogous, and even more bizarre, behavior when TR is used for large $\Delta t$—the analysis of which we omit because TR is well-known not to be recommended with large $\Delta t$.)
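The slow-down for large $\Delta t$ can be seen on a very small example. The sketch below applies algorithm (3.16-394) to a toy system that is purely an assumption made for illustration: two velocity unknowns, one pressure unknown, unit mass matrix, $K = \mathrm{diag}(2, 1)$, $D = [1, 1]$, $G = D^T$, and $g = 0$. The function name `be_projection_steps` and all numerical values are hypothetical; the code only counts how many steps are needed to reach the exact steady state for a given $\Delta t$.

```python
def be_projection_steps(dt, k=(2.0, 1.0), f=(1.0, 0.0), tol=1e-8, max_steps=10**6):
    """Count the steps algorithm (3.16-394) needs to reach steady state.

    Toy system (assumed for illustration only): two velocity unknowns,
    one pressure, unit mass matrix, K = diag(k), D = [1, 1], G = D^T, g = 0.
    """
    k1, k2 = k
    f1, f2 = f
    # Exact steady state: K u + G P = f with D u = 0, i.e. u = (a, -a).
    a = (f1 - f2) / (k1 + k2)
    p_ss = f1 - k1 * a
    # Start from rest; P0 from the PPE, DG P0 = D(f - K u0), where DG = 2.
    u1, u2 = 0.0, 0.0
    p = 0.5 * (f1 + f2)
    for n in range(1, max_steps + 1):
        # Step 1: (I/dt + K) u~ = u/dt + f - G P  (K is diagonal here).
        ut1 = (u1 / dt + f1 - p) / (1.0 / dt + k1)
        ut2 = (u2 / dt + f2 - p) / (1.0 / dt + k2)
        # Step 2: DG * lam = D u~ - g, with lam = dt * (P_{n+1} - P_n).
        lam = (ut1 + ut2) / 2.0
        # Step 3: u = u~ - G lam (orthogonal projection onto D u = 0).
        u1, u2 = ut1 - lam, ut2 - lam
        # Step 4: update the pressure.
        p += lam / dt
        if max(abs(u1 - a), abs(u2 + a), abs(p - p_ss)) < tol:
            return n
    return max_steps


if __name__ == "__main__":
    for dt in (10.0, 100.0, 1000.0):
        print("dt =", dt, "steps to steady state:", be_projection_steps(dt))
```

For this toy system the step count grows roughly in proportion to $\Delta t$ once $\Delta t$ is large, because the pressure correction per step scales like $1/\Delta t$; this is only a sketch of the mechanism, not a statement about any particular flow code.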
Fig. 3.16-19 TR projections for several $\Delta t$'s.

We end this discussion by noting that 'small' or 'large' $\Delta t$ with respect to the behavior depicted in Figures 3.16-18 and 3.16-19 is a strong function of the timescales and the temporal behavior inherent in the flow being simulated; e.g., during sharp transients, the $\Delta t$ needed to stay 'sufficiently close' to the divergence-free manifold will be much smaller than that for a slowly changing flow or one that is approaching a steady state. Perhaps $\Delta t\,\|\dot u\| < \varepsilon U_{max}$ would be a good guide, for 'properly selected' $\varepsilon$ and $U_{max}$.

◦ Biharmonic catastrophe. We now return to the notion that an accurate solution within the spurious BL is not readily achievable and that an accurate solution within any physical BL (or near any boundary with $u = w$, actually) requires that $\delta = \sqrt{\nu\Delta t}$ should be small relative to $h$, which itself should be small relative to any physical BLT. For example, in Gresho and Chan (1995) was presented a case in which a very fine mesh was employed near a geometric singularity (re-entrant corner) in which no physical BL was present (Stokes flow, in fact)—a presumably easier case than one at large Re. What they observed was a disastrous manifestation of the BHE (quite the opposite of a BHM) which they called a BHC (Biharmonic Catastrophe) and it is this: if $\Delta t$ is not sufficiently small (read 'very, very small'), then the well-resolved-but-spurious BL in the vicinity of a singularity can generate garbage by 'finding' a singular solution to the BHE that dominates that of the desired PPE—at least near the singularity, giving a region of totally spurious velocity and pressure. And the solution appears to attain a steady state, a manifestation of the behavior discussed regarding Figure 3.16-10 above.
In reality, a steady solution—for which there is no more BHE and therefore no BHM or BHC—would have required an unthinkable number of (too large) steps to follow a non-physical 'transient.' If the spurious BL were to be made sufficiently small so as not to cause a BHC, the needed timestep would be close to that needed for stability if the simple FE method was employed (in their case requiring $\sim 10^9$ steps), either via FE or a BE version of projection 2, either of which would deliver an accurate solution, with FE costing much less (and each unaffordable). See Minev and Gresho (1998) for a possible 'fix' for the BHC.
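Because the whole argument hinges on how the spurious BL scales with $\delta = \sqrt{\nu\Delta t}$, it can help to evaluate the closed-form 1D model expressions (3.16-389) through (3.16-391) directly. The sketch below does so for the parameters quoted earlier ($a = 4$, $S = 6$, $q_0 = q_1 = 0$). The function names are hypothetical, and taking $u_{PPE}(x) = ax - Sx^2/2$ (zero integration constant) is an assumption inferred from the quoted instance $u_{PPE} = 4x - 3x^2$.

```python
import math


def u_ppe(x, a=4.0, S=6.0):
    """PPE model solution; consistent with u'_PPE = a - S x (3.16-390)."""
    return a * x - 0.5 * S * x * x


def du_ppe(x, a=4.0, S=6.0):
    return a - S * x


def u_bhe(x, delta, a=4.0, S=6.0, q0=0.0, q1=0.0):
    """Approximate BHE model solution, (3.16-389), valid for delta << 1."""
    return u_ppe(x, a, S) + delta * (
        (S - a + q1) * math.exp(-(1.0 - x) / delta)
        + (a - q0) * math.exp(-x / delta)
    )


def du_bhe(x, delta, a=4.0, S=6.0, q0=0.0, q1=0.0):
    """Derivative of the BHE model solution, (3.16-391)."""
    return (
        du_ppe(x, a, S)
        + (S - a + q1) * math.exp(-(1.0 - x) / delta)
        - (a - q0) * math.exp(-x / delta)
    )


if __name__ == "__main__":
    for delta in (0.1, 0.01, 0.001):
        wall_err = u_bhe(0.0, delta) - u_ppe(0.0)        # O(delta)
        wall_grad_err = du_bhe(0.0, delta) - du_ppe(0.0)  # O(1)
        mid_err = u_bhe(0.5, delta) - u_ppe(0.5)          # exponentially small
        print(delta, wall_err, wall_grad_err, mid_err)
```

The printed columns show the three regimes discussed in the text: an $O(\delta)$ error in $u$ at the wall, an $O(1)$ error in $u'$ at the wall, and an exponentially small difference in mid-domain.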
◦ Approximate projection. We conclude the semi-discrete discussion by noting that the other side of the coin is also interesting; i.e., suppose we decided to report $\tilde u_m$ rather than $u_m$ as the NS velocity, a decision that (we believe) has actually been made by several investigators (cited later), especially for 'projection 1.' Inserting $\varphi$ from (3.16-378) into (3.16-377) and placing the result into (3.16-371), after lowering the $m$-index by one, yields, reverting to Stokes flow for simplicity,

$$\frac{\tilde u_{m+1} - \tilde u_m}{\Delta t} + \nabla\left(\frac{3P_m - P_{m-1}}{2}\right) = \frac{\nu}{2}\nabla^2(\tilde u_{m+1} + \tilde u_m) - \frac{\nu\Delta t}{4}\nabla^2\nabla(P_m - P_{m-1}). \tag{3.16-397}$$

We now boldly drop the 'spurious' $O(\Delta t^2)$ term, both here and in the advection term for the full NSE, and consider the following algorithm, which clearly mixes AB2 and TR:

Step 1. Given $\tilde u_m$, $P_m$, and $P_{m-1}$, solve for $\tilde u_{m+1}$ from

$$\frac{\tilde u_{m+1} - \tilde u_m}{\Delta t} + \nabla\left(\frac{3P_m - P_{m-1}}{2}\right) = \frac{\nu}{2}\nabla^2(\tilde u_{m+1} + \tilde u_m) \tag{3.16-398}$$

with the same BC's as before, (3.16-372) and (3.16-373).

Step 2. Solve for $P_{m+1}$ from (3.16-374) through (3.16-376); i.e., $\nabla^2(P_{m+1} - P_m) = \frac{2}{\Delta t}\nabla\cdot\tilde u_{m+1}$ in $\Omega$, $\partial(P_{m+1} - P_m)/\partial n = 0$ on $\Gamma_D$, and $P_{m+1} = -F_n$ on $\Gamma_N$. Done. Go to Step 1.

This algorithm obviously merits the following remark, since we omitted the step that strips off the gradient part of $\tilde u$: we have lost incompressibility! True, but before becoming overly concerned, let us see what we have gained, ostensibly (also, as mentioned at the outset, if the pressure 'guess' is good, the velocity will be close to divergence-free): (i) no BC 'problems,' (ii) no BHE, and thus (iii) no spurious BL and (iv) no vortex sheets. As we have not tested this idea, we shall drop it here, but retrieve it later when discussing projection methods used by others, in Section 3.16.6d. [We did test two methods of 'stabilizing' the $Q_1Q_1$ element, in Gresho et al. (1995), one of them a projection method; but neither was deemed to be really worthwhile.]

c. A GFEM (almost) implementation of the second-order projection method—projection 2

One of the discoveries of Gresho and Chan (1990) during their initial investigations on projection methods via $Q_1Q_0$ was a 'trick' that permitted the introduction of a (semi-) consistent mass matrix into the method in spite of performing the projection with the
only viable Laplacian—that using lumped mass: $C^TM_L^{-1}C$. They introduced a modified semi-discrete momentum equation in an ad hoc manner (the pressure gradient term, $CP$, was premultiplied by $MM_L^{-1}$, where $M$ is consistent and $M_L$ its lumped approximation) and used it to help derive 'discrete projection 2.' They also presented a GFEM version of 'projection 1,' in which the consistent mass matrix was (naturally) employed for the intermediate velocity step (which uses no pressure gradient), and the lumped mass matrix (necessarily) was used for the projection step. A consistency analysis of the projection 1 scheme revealed that the method 'automatically' inserted the $MM_L^{-1}$ factor in front of the 'normal' pressure gradient term. Based on that result, and others to follow, here we take the stand that this is the 'proper' way to incorporate the beneficial effects of consistent mass (low dispersion error, mainly) into a finite element projection method. Thus, we begin by stating the projection method DAE's:

$$M\dot u + [K + N(u)]u + MM_L^{-1}CP = f, \qquad C^Tu = g, \tag{3.16-399}$$

wherein we point out that the modified pressure term is really not very far from the original GFEM pressure term because $MM_L^{-1}$ is not far from the identity matrix. In fact, on a uniform 2D mesh of bilinear elements, it is not hard to show that $MM_L^{-1}u = u + (h^2/6)\nabla^2u + O(h^4)$ for a smooth function, $u$. (The second-order truncation error terms may drop to first-order on a general mesh of distorted quadrilaterals.) Thus, since the factor $MM_L^{-1}$ hardly hurts the momentum equation (probably/usually, at least on good grids), yet permits both consistent mass treatment of advection and lumped mass during the projection (see below), it is a significant improvement over either of the two 'same' mass matrix approaches (lumped mass is inaccurate and consistent mass is unaffordable).
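The expansion $MM_L^{-1}u = u + (h^2/6)\nabla^2u + O(h^4)$ is easy to check numerically at a single interior node. The sketch below assumes a uniform $h \times h$ mesh of bilinear elements, for which an interior row of the consistent mass matrix is the tensor-product stencil $(h^2/36)[1\ 4\ 1]^T[1\ 4\ 1]$ and the lumped mass is $h^2$; the function names and the test function $u = \sin x \sin y$ are hypothetical choices for illustration.

```python
import math


def mml_inv_apply(u, x, y, h):
    """Apply one interior row of M M_L^{-1} to the nodal values of u at (x, y).

    Assumes a uniform bilinear mesh: consistent-mass weights are the tensor
    product of (1, 4, 1)/6 in each direction, and lumped mass is h*h.
    """
    w = (1.0, 4.0, 1.0)
    total = 0.0
    for i, wx in zip((-1, 0, 1), w):
        for j, wy in zip((-1, 0, 1), w):
            total += wx * wy * u(x + i * h, y + j * h)
    return total / 36.0


def truncation_error(h, x=0.7, y=0.4):
    """|M M_L^{-1} u - (u + h^2/6 * Laplacian(u))| for u = sin(x) sin(y)."""
    def u(s, t):
        return math.sin(s) * math.sin(t)

    lap_u = -2.0 * u(x, y)  # exact Laplacian of sin(x) sin(y)
    return abs(mml_inv_apply(u, x, y, h) - (u(x, y) + h * h / 6.0 * lap_u))


if __name__ == "__main__":
    e1, e2 = truncation_error(0.1), truncation_error(0.05)
    print("error(h=0.1) =", e1, " error(h=0.05) =", e2, " ratio =", e1 / e2)
```

Halving $h$ should reduce the residual by a factor close to 16, consistent with the $O(h^4)$ remainder on a uniform mesh; on distorted meshes this sketch does not apply.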
Next we note that the implied PPE of (3.16-399) is also a mixed mass matrix result:

$$(C^TM_L^{-1}C)P = C^TM^{-1}[f - Ku - N(u)u] - \dot g, \tag{3.16-400}$$

wherein we point out that even though $M^{-1}$ appears on the RHS, our projection method will bypass this 'inconvenience.' But before describing the projection method, we make some remarks about these DAE's, mostly discouraging:

Remarks:

(1) Even the steady Stokes equations no longer display a symmetric matrix.

(2) Steady state results are not independent of the mass matrix (unless lumping is employed, which is of course still permissible—even advisable for steady solutions). In fact, we emphasize that the sole purpose of the mass matrix trick is, as is that of the projection method in general, to obtain more accurate transient solutions—as demonstrated in Gresho and Chan (1990) and from which we show a sample result in Figure 3.16-20, in which the tick marks between the two figures show the nodal spacing in the $x$-direction. The significant improvement is rather obvious.

(3) Even the linear stability [$N(u) = 0$] of the DAE's, observed experimentally, is not provable—at least not by us.

(4) Another useful interpretation of the modified momentum equation is obtained by rewriting (3.16-399) as $\dot u + M^{-1}[K + N(u)]u + M_L^{-1}CP = M^{-1}f$, which, since $M^{-1}$
is dense, shows that the pressure gradient, and only the pressure gradient, is approximated locally (a la finite differences) in order to compute the acceleration.

(5) As for the original GFEM DAE's, it is easy to show that the first of (3.16-399) and (3.16-400) imply the second of (3.16-399) if and only if $C^Tu_0 = g_0$—a simple exercise we leave for the reader.

Fig. 3.16-20 Snapshot of streamlines for vortex shedding at Re = 250; $\psi$ = 0, 0.50, 0.75, 1.0. (a) Semi-consistent mass, (b) lumped mass. Tick marks denote element lengths.

It may be worthwhile to return to Remark (3) above and point out that in spite of our inability to perform the stability analysis, experimental results have never (yet) generated unstable Stokes solutions for either TR or BE. This stability result can perhaps be better appreciated, even for the fully non-linear case, if we examine the evolution of the kinetic energy in the absence of forcing, obtained by forming the scalar product of (3.16-399) with the velocity vector and utilizing $u^TN(u)u = 0$ for $N$ skew-symmetric (which we here assume) to obtain

$$\dot E \equiv \frac{1}{2}\frac{d}{dt}\left(u^TMu\right) = -u^TKu - u^TMM_L^{-1}CP = -u^TKu - P^TC^TM_L^{-1}Mu \tag{3.16-401}$$

vis-a-vis the true GFEM result

$$\dot E = -u^TKu < 0 \tag{3.16-402}$$

since $K$ is positive-definite and, thus, (3.16-402) properly describes 'viscous decay.' The remaining term on the RHS of (3.16-401) is indefinite (precluding a definite stability
assessment) but small—because $M_L^{-1}M \approx I + O(h^2)$ in the operator sense, as shown earlier for $MM_L^{-1}$, and $C^Tu = 0$, thus at least telling us that the indefinite term is small in some sense. As a final 'persuasion,' the further rearrangement of (3.16-399) via multiplication by $M_LM^{-1}$ leads to the following 'energy' equation:

$$\frac{1}{2}\frac{d}{dt}\left(u^TM_Lu\right) = -u^TM_LM^{-1}[K + N(u)]u = -u^TKu + O(h^2); \tag{3.16-403}$$

i.e., the pressure term is gone if we use the lumped mass kinetic energy, but now the viscous term no longer guarantees decay. So, we have guaranteed viscous dissipation in the natural (CM) norm with an indefinite pressure contribution on the one hand, and no pressure contribution (as desired) but 'slightly' indefinite viscous 'dissipation' (for finite $h$) in the LM norm, on the other hand. However, since in a finite-dimensional vector space all norms are equivalent (i.e., $\alpha u^TMu \le u^TM_Lu \le \beta u^TMu$ and $a\,u^TM_Lu \le u^TMu \le b\,u^TM_Lu$ for some finite $\alpha$, $\beta$, $a$, $b$), we are not surprised that the DAE's have been observed to be stable.

Now we can present the semi-implicit projection 2 algorithm. For simplicity of presentation, we will utilize only the simplest (first-order) explicit and implicit ODE methods—FE (with BTD in the $K$-matrix, per Section 3.16.5) and BE on diffusion, and remark that one would be better advised to use AB2 or AB3 for advection and TR for diffusion. We do what we do just to save 'ink.' Given $P_n$ and $u_n$ with $C^Tu_n = g_n$, projection 2 starts at $n = 0$ and does:

Step 1. Solve

$$M\frac{\tilde u_{n+1} - u_n}{\Delta t} + K\tilde u_{n+1} + N(u_n)u_n + MM_L^{-1}CP_n = f_n$$

for the intermediate velocity, $\tilde u_{n+1}$; i.e., solve

$$(M + \Delta tK)\tilde u_{n+1} = Mu_n + \Delta t\left[f_n - N(u_n)u_n - MM_L^{-1}CP_n\right], \tag{3.16-404}$$

which is done quite efficiently using, for example, DSCG.

Step 2.
Project $\tilde u_{n+1}$ to the divergence-free subspace; $u_{n+1} = \wp\tilde u_{n+1} + M_L^{-1}CA^{-1}g_{n+1}$, or, recalling (3.16-14),

$$u_{n+1} = \wp\tilde u_{n+1} + v_{n+1}, \tag{3.16-405}$$

where $\wp = I - M_L^{-1}CA^{-1}C^T$ and $A = C^TM_L^{-1}C$, which assures that $C^Tu_{n+1} = g_{n+1}$. (See Appendix 3 for a detailed discussion of projections, wherein $\wp$ is called $\wp_I$, the interpolation projection, below (A3.3-38).) This projection is realized by

$$M_L\frac{u_{n+1} - \tilde u_{n+1}}{\Delta t} + C\,\frac{P_{n+1} - P_n}{2} = 0 \quad\text{and}\quad C^Tu_{n+1} = g_{n+1}, \tag{3.16-406}$$

which itself is realized by the following sequential steps:
(i) Solve

$$A(P_{n+1} - P_n) = 2(C^T\tilde u_{n+1} - g_{n+1})/\Delta t; \tag{3.16-407}$$

(ii) Compute the final velocity from

$$u_{n+1} = \tilde u_{n+1} - \frac{\Delta t}{2}M_L^{-1}C(P_{n+1} - P_n); \tag{3.16-408}$$

(iii) Update the pressure; add $(P_{n+1} - P_n)$ to $P_n$. (3.16-409)

Step 3. Bump $n$ and go to (1).

Remarks:

(1) The sequential solution procedure is obvious—once it is realized that (3.16-404) can be solved separately for each velocity component, or 'in parallel' on a more modern computer.

(2) The scheme can be interpreted as a CM predictor and an LM corrector.

(3) In spite of our earlier discussions regarding slippery projections in the continuum, it is clear from the above algorithm that we are enforcing the same BC's on $u$ as on $\tilde u$; i.e., we overspecify the 'no-slip' BC during the projection. That it is 'convenient' is obvious; that it is (usually) innocuous will be demonstrated later. What happens is this: the code is smart enough to ignore the overspecified velocity in the following sense: the first node in from $\Gamma_D$ will look like a slip velocity, and there is no visible (resolved) BL.

(4) The projection in (3.16-405) is more properly described as an affine transformation on $\tilde u_{n+1}$. It is a true projection when $g_{n+1} = 0$. But the affine transformation could also be regarded as a projection in the sense that $u_{n+1} = \wp\tilde u_{n+1} + v_{n+1}$, where $v_{n+1} = M_L^{-1}CA^{-1}g_{n+1}$, gives $\wp u_{n+1} = \wp(\wp\tilde u_{n+1} + v_{n+1}) = \wp\tilde u_{n+1} = u_{n+1} - v_{n+1}$ because $\wp v_{n+1} = 0$ and $\wp^2 = \wp$; thus, finally, $u_{n+1} = \wp u_{n+1} + v_{n+1}$.

(5) The projection method does not need $M^{-1}$ per the RHS of the PPE in (3.16-400)—a very important point, yet $P$ satisfies (3.16-400) (also important)—to $O(\Delta t)$; see Gresho and Chan (1990) for a proof.

(6) Start-up ($n = 0$) requires $P_0$, which must be obtained from the PPE given by (3.16-400) at $t = 0$, which does involve $M^{-1}$. This is easily done as follows: (i) Solve $Ma = f_0 - Ku_0 - N(u_0)u_0$ for $a$ via, for example, DSCG. (ii) Solve $AP_0 = C^Ta - \dot g_0$ for $P_0$ using your favorite method.
[Another useful way to solve $Mx = b$ is via the iteration $M_Lx_{k+1} = b + (M_L - M)x_k$, with $x_0 = 0$; convergence is usually adequate after several iterations; see, e.g., Wathen (1991).]

(7) If $MM_L^{-1}$ is omitted, then the resulting method is unconditionally unstable—unless, of course, $M$ is lumped. (The 'cheat' is really required!) This instability arises because the projection then does not annihilate the previous pressure gradient, which is part of $\tilde u_{n+1}$, as we show below.
(8) For that part of $f_n$ in (3.16-404) that corresponds to the normal force BC (NBC) on open boundaries, it is usually a good idea to multiply this vector by $MM_L^{-1}$ to better balance the (dominant) pressure portion of the normal force balance.

(9) Related to earlier discussion, the 2 in (3.16-406) through (3.16-408) can be (perhaps should be, especially if the Euler methods are utilized) replaced by unity.

(10) Restating our earlier advice, for emphasis: use higher-order methods than Euler's.

(11) If this method were to be used as a time-marching-to-steady-state method, then we recommend three changes: (i) lump the mass, (ii) use BE exclusively, and (iii) use a better method.

(12) Solving the same problem on the same mesh two times, once with CM and once with LM, can sometimes provide a simple test for a grid-converged solution; i.e., they will then agree. This remark also applies to fully implicit methods, and even to the transient AD equation of the previous chapter.

(13) Actually, (3.16-407) is solved most efficiently via multigrid (S. Turek and L. Howell, personal communication), although our implementation has thus far used only direct methods or DSCG—the former (with $A$ stored in factored form) always winning when main memory is large enough to store (the factored) $A$.

◦ Convergence analysis. The discrete projection algorithm described above can be shown to converge to the DAE's of (3.16-399) and (3.16-400) as $\Delta t \to 0$. We will need, in addition to $\wp u_n = u_n - v_n$, which was derived in Remark (4) above,

$$(M + \Delta tK)^{-1} = [M(I + \Delta tM^{-1}K)]^{-1} = (I + \Delta tM^{-1}K)^{-1}M^{-1} = \left[I - \Delta tM^{-1}K + \Delta t^2(M^{-1}K)^2\right]M^{-1} + O(\Delta t^3). \tag{3.16-410}$$

Inserting $\tilde u_{n+1}$ from (3.16-404) into (3.16-405) gives

$$\begin{aligned}
u_{n+1} &= \wp(M + \Delta tK)^{-1}\left[Mu_n + \Delta t\left(f_n - N(u_n)u_n - MM_L^{-1}CP_n\right)\right] + v_{n+1} \\
&= \wp\left[u_n + \Delta tM^{-1}\left(f_n - Ku_n - N(u_n)u_n - MM_L^{-1}CP_n\right)\right] + v_{n+1} + O(\Delta t^2) \\
&= u_n - v_n + \Delta t\,\wp M^{-1}\left(f_n - Ku_n - N(u_n)u_n\right) - \Delta t\,\wp M_L^{-1}CP_n + v_{n+1} + O(\Delta t^2) \\
&= u_n + \Delta t\,\wp M^{-1}\left(f_n - Ku_n - N(u_n)u_n\right) + v_{n+1} - v_n + O(\Delta t^2)
\end{aligned} \tag{3.16-411}$$

because $\wp M_L^{-1}C = 0$.
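The two algebraic facts used above, namely that the projection steps (3.16-407) through (3.16-409) deliver $C^Tu_{n+1} = g_{n+1}$ and that $\wp M_L^{-1}C = 0$, can be checked on a toy system. The sketch below is only illustrative: the 4-by-2 gradient matrix `C`, the lumped masses `ml`, and the vector standing in for the Step 1 result $\tilde u_{n+1}$ are all made-up values, and the helper names are hypothetical.

```python
def solve2(A, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]


def divergence(C, u):
    """Return C^T u for a velocity vector u (C has one row per velocity dof)."""
    return [sum(C[k][i] * u[k] for k in range(len(u))) for i in range(2)]


def project(u_tilde, C, ml, g, dt, p_old):
    """Steps (3.16-407)-(3.16-409) for 2 pressure unknowns.

    ml holds the diagonal of the lumped mass matrix M_L.
    Returns (u_{n+1}, P_{n+1}).
    """
    n = len(u_tilde)
    # A = C^T M_L^{-1} C (2x2).
    A = [[sum(C[k][i] * C[k][j] / ml[k] for k in range(n)) for j in range(2)]
         for i in range(2)]
    # (i) A (P_{n+1} - P_n) = 2 (C^T u~ - g) / dt
    ct_u = divergence(C, u_tilde)
    dp = solve2(A, [2.0 * (ct_u[i] - g[i]) / dt for i in range(2)])
    # (ii) u_{n+1} = u~ - (dt/2) M_L^{-1} C (P_{n+1} - P_n)
    u_new = [u_tilde[k] - 0.5 * dt * (C[k][0] * dp[0] + C[k][1] * dp[1]) / ml[k]
             for k in range(n)]
    # (iii) update the pressure
    p_new = [p_old[i] + dp[i] for i in range(2)]
    return u_new, p_new


# Made-up toy data (assumptions for illustration only).
C = [[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0], [1.0, 1.0]]
ml = [1.0, 2.0, 2.0, 1.0]

if __name__ == "__main__":
    u_tilde = [1.0, -2.0, 0.5, 3.0]   # stands in for the Step 1 result
    g = [0.3, -0.1]
    u_new, p_new = project(u_tilde, C, ml, g, 0.05, [0.0, 0.0])
    print("C^T u_{n+1} =", divergence(C, u_new), " target g =", g)
    # With g = 0 the update is the pure projection; applied to a column of
    # M_L^{-1} C it should return (numerically) zero.
    col0 = [C[k][0] / ml[k] for k in range(4)]
    print("P applied to first column of M_L^{-1} C:",
          project(col0, C, ml, [0.0, 0.0], 0.05, [0.0, 0.0])[0])
```

Note that Step 1 itself (the solve with $M + \Delta tK$) is deliberately left out here; any vector can play the role of $\tilde u_{n+1}$ for the purpose of checking the projection identities.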
Dividing by $\Delta t$ and passing to the limit gives

$$\dot u = \wp M^{-1}\left(f - Ku - N(u)u\right) + \dot v, \tag{3.16-412}$$

which, we assert, is (3.16-399). To verify the assertion, we place $P$ from (3.16-400) into (3.16-399) to obtain

$$M\dot u + [K + N(u)]u + MM_L^{-1}C(C^TM_L^{-1}C)^{-1}\left[C^TM^{-1}[f - Ku - N(u)u] - \dot g\right] = f,$$
which, using $\wp = I - M_L^{-1}C(C^TM_L^{-1}C)^{-1}C^T$ and $M_L^{-1}C(C^TM_L^{-1}C)^{-1}\dot g = \dot v$, rearranges to (3.16-412), and we are done.

Remarks:

(1) If $MM_L^{-1}$ is omitted, then there will remain the non-zero term $-\wp M^{-1}CP$ on the RHS of (3.16-412), thus showing inconsistency and probably explaining the instability referred to above—the pressure gradient should project to zero.

(2) For further analysis and discussion of this and other projection methods, see Gresho and Chan (1990) and Shen (1993), wherein a method called 'Projection 3' in Gresho (1990) was shown to be unconditionally unstable.

◦ Overspecified BC's. It is both important and easy to demonstrate the utility of overspecifying the projection by enforcing the 'no-slip' (i.e., specified tangential velocity) BC in addition to the proper BC of specified normal velocity—and we show this in two ways:

1. Suppose we did permit slip during the projection. A possible and divergence-free result of the projection might look like that shown ($Q_1Q_0$ element) on element (e) in Figure 3.16-21. If, before taking the next step for $\tilde u_{n+1}$, we enforce the (required, for proper vorticity generation) no-slip velocity, the above picture (for $u_n$) changes to that in Figure 3.16-22, which is clearly not 'mass-consistent.' End of 'first way.'

Fig. 3.16-21 Slippery but mass-consistent.

Fig. 3.16-22 No-slip but mass-inconsistent.

2. Write the mass conservation equation for the 'same' element (e) in Figure 3.16-23, both with slip and no-slip (and $n\cdot u = 0$ on $\Gamma$):

(a) Slip: $h(u_3 - u_4 + u_2 - u_1) + l(v_3 + v_4) = 0$;
(b) No slip: $h(u_3 - u_4) + l(v_3 + v_4) = 0$,
from which it is seen that, except in the improbable event that $u_2 = u_1$, a 'mass adjustment' would be necessary when switching BC's from slip (the projection) to no-slip (post-projection). End of second way.

It is also worthwhile demonstrating that such overspecification is innocuous in that the projected (divergence-free) velocity fields for the legitimate (slip) and overspecified (no-slip) BC's are the same—within the 'truncation' error of the method. The 'proper' projection of $\tilde u$ is given by $\tilde u = u + \nabla\lambda$ and $\nabla\cdot u = 0$ in $\Omega$, $u\cdot n = w\cdot n$ on $\Gamma$, and the overspecified projection replaces the BC by $u = w$ on $\Gamma$. The discrete realization of these two projections will be presented for element (e) of Figure 3.16-24. The continuity equation for element (e) is

$$h(u_2 - u_1 + u_4 - u_3) + l(v_1 - v_3 + v_2 - v_4) = 0, \tag{3.16-413}$$

where in both cases we have $v_1 = w_{y,1}$ and $v_2 = w_{y,2}$ (specified). In the second case we also have $u_1 = w_{x,1}$ and $u_2 = w_{x,2}$ (specified). For the first (slippery) case we have $u_1 = \tilde u_1 - (\lambda_0 - \lambda_W)/l$ and $u_2 = \tilde u_2 - (\lambda_E - \lambda_0)/l$, so that $u_2 - u_1 = \tilde u_2 - \tilde u_1 - (\lambda_E - 2\lambda_0 + \lambda_W)/l$. But we will just carry them as $u_1$ and $u_2$ for the time being, and specialize later. The remaining equations needed are:

$$\begin{aligned}
u_3 &= \tilde u_3 - [(\lambda_0 - \lambda_W) + (\lambda_S - \lambda_{SW})]/2l, \\
u_4 &= \tilde u_4 - [(\lambda_E - \lambda_0) + (\lambda_{SE} - \lambda_S)]/2l, \\
v_3 &= \tilde v_3 - [(\lambda_W - \lambda_{SW}) + (\lambda_0 - \lambda_S)]/2h, \\
v_4 &= \tilde v_4 - [(\lambda_0 - \lambda_S) + (\lambda_E - \lambda_{SE})]/2h.
\end{aligned}$$

Fig. 3.16-23 One $Q_1Q_0$ element.

Fig. 3.16-24 A boundary 6-patch.
Inserting these results into (3.16-413) yields

$$h\left[u_2 - u_1 + \tilde u_4 - \tilde u_3 - \frac{(\lambda_E - 2\lambda_0 + \lambda_W) + (\lambda_{SE} - 2\lambda_S + \lambda_{SW})}{2l}\right] + l\left[w_{y,1} + w_{y,2} - \tilde v_3 - \tilde v_4 + \frac{(\lambda_W - \lambda_{SW}) + 2(\lambda_0 - \lambda_S) + (\lambda_E - \lambda_{SE})}{2h}\right] = 0, \tag{3.16-414}$$

which we rearrange and divide by $2l$ to get

$$\frac{\lambda_W - \lambda_{SW} + 2(\lambda_0 - \lambda_S) + (\lambda_E - \lambda_{SE})}{4h} = \frac{\tilde v_3 - w_{y,1} + \tilde v_4 - w_{y,2}}{2} + \frac{h}{2}\left[\frac{(\lambda_E - 2\lambda_0 + \lambda_W) + (\lambda_{SE} - 2\lambda_S + \lambda_{SW})}{2l^2} + \frac{\tilde u_3 - \tilde u_4}{l} + \frac{u_1 - u_2}{l}\right]. \tag{3.16-415}$$

First note that the LHS approximates (and should converge to) $\partial\lambda/\partial y\ (= \partial\lambda/\partial n)$. Next, recalling the (pre-projection) BC $\tilde u = w$ shows that the first term on the RHS approximates $-h\,\partial\tilde v/\partial y$. Finally, it is clear that the remaining terms approximate, respectively, $(h/2)\,\partial^2\lambda/\partial x^2$, $-(h/2)\,\partial\tilde u/\partial x$, and [from $(u_1 - u_2)/l$] either $-(h/2)\,\partial w_x/\partial x$ or $-(h/2)(\partial\tilde u/\partial x - \partial^2\lambda/\partial x^2) = -(h/2)(\partial w_x/\partial x - \partial^2\lambda/\partial x^2)$ for the no-slip (overspecified) or slippery case, respectively. Thus the only difference between the two is the term $(h/2)\,\partial^2\lambda/\partial x^2$, which, along with every other term on the RHS, vanishes with mesh refinement so that in either case, the BC for the Lagrange multiplier associated with the projection is $\partial\lambda/\partial n = 0$—and the highly 'convenient' overspecification of the Dirichlet BC in the tangential direction during the projection is justified/vindicated. Finally, the (important) vorticity 'injection' via a vortex sheet is well-approximated in either case.

Three final remarks on these BC's:

(1) One could justifiably promote the argument that the 'proper' tangential BC (i.e., none) is proper in that the subsequent process of reducing the resulting slip velocity to zero is also perfectly legitimate because it is actually being applied to $\tilde u$, which is not mass-consistent anyway.

(2) For $Q_1Q_0$, the legitimate (slippery) projection would reap an additional and quite attractive benefit—the legitimate elimination of all spurious pressure modes; i.e., they are precluded (cf. Section 3.13.2b). Two 'wrongs' (projection method, slippery walls) could indeed make a 'right'—and has (J.
Schutt, personal communication). The 'slippery walls', of course, are 'wrong' only if really needed for vorticity production; they are right during the projection.

(3) Although $Q_1Q_0$ was used to 'prove' our assertions, we believe that most of the results generalize to all other elements.

◦ Return to the BHE. It is of some interest to duplicate the analysis performed earlier on the semi-discrete equations to see 'how' the FEM equations represent the biharmonic
pressure equation. To this end, we substitute $\tilde u_{n+1}$ from (3.16-408) into (3.16-404) and multiply by $C^TM^{-1}$ to get, upon rearrangement, and replacing the factor of two by $\gamma$ in (3.16-408), where $0 < \gamma \le 2$ per Remark (5) following (3.16-323), for generality:

$$\left[C^TM_L^{-1}C + \Delta t\,C^TM^{-1}KM_L^{-1}C\right]P_{n+1} = \gamma\left[C^TM^{-1}\left(f_n - Ku_{n+1} - N(u_n)u_n\right) - (g_{n+1} - g_n)/\Delta t\right] - \left[(\gamma - 1)C^TM_L^{-1}C - \Delta t\,C^TM^{-1}KM_L^{-1}C\right]P_n, \tag{3.16-416}$$

wherein we note the approximations: $-Q^{-1}C^TM_L^{-1}C \sim \nabla^2$, $-Q^{-1}C^T \sim \mathrm{div}$, $-M^{-1}K \sim \nu\nabla^2$, and $M_L^{-1}C \sim \nabla$, where $Q$ is the pressure mass matrix. Thus, recalling that $\nabla\cdot\nabla^2(\nabla P) = \nabla\cdot[\nabla\nabla\cdot(\nabla P) - \nabla\times\nabla\times(\nabla P)] = \nabla^4P$, $Q^{-1}C^TM^{-1}KM_L^{-1}C \sim \nu\nabla^4$, and we see (up to the factor $Q^{-1}$) that (3.16-416) does correspond to (3.16-382), at least if we take $\gamma = 2$. This result also shows why $\gamma = 1$ is more robust—it kills the term $-C^TM_L^{-1}CP_n$ on the RHS, and with it the tendency to make $2\Delta t$-oscillations.

◦ Temporal accuracy of projection 2. Before mentioning some numerical results, we opine that it is generally not easy to perform a simple, short set of numerical experiments to 'cleanly' validate/support some convergence estimates in the field of numerical solutions of PDE's. Many 'surprises' lurk in the CFD laboratory, and often extensive runs and re-runs are required—and even then truly conclusive and general results are not easy to obtain. And this opinion is probably a large understatement when it comes to projection methods and their numerical evaluation; confusion still reigns. The 'numerical results' referred to have already been discussed, in Section 3.16.1d; there we showed 'good' results for TR (second-order in $\Delta t$) and not so good for BE in the sense that the approach to the theoretical behavior (first order) was attained painfully slowly (very small $\Delta t$ needed). But the good news from BE was second-order accuracy (close to TR in fact) for 'larger' $\Delta t$—more like those used in practice.

◦ The pesky modes of $Q_1Q_0$.
Since we have produced and promoted 'projection 2 via $Q_1Q_0$,' it is natural to wonder if or how the LBB instability tendency for this element manifests itself. The answer is simple: no problem. This fact has been known experimentally for some time and has recently been explained theoretically by Griffiths and Silvester (1994); in an extension of their LBB-mode analysis via the method of modified equations that we have already summarized (Section 3.13.5k), they studied the projection step of projection methods, with the following results:

1. The projection is stable in that the associated eigenproblem on the unit square, $Mu + CP = \lambda M_Lu$ and $C^Tu = \lambda QP$, has $\sigma_{mn} = \lambda_{m,n}(\lambda_{m,n} - 1) = (m^2 + n^2)\pi^2$, which is now independent of $h$ with, for each $(m, n)$, a pair of eigenvectors, one of which is smooth (the 'physical' mode) and the other of which is oscillatory: $(-1)^{m+n}$ multiplying the smooth mode—for $mh \ll 1$ and $nh \ll 1$.

2. The oscillatory (spurious) eigenvectors are [cf. (3.13-370)]

$$\begin{bmatrix} u_{ij} \\ v_{ij} \\ P_{ij} \end{bmatrix} = (-1)^{i+j}\begin{bmatrix} m\pi\,\sin(im\pi h)\cos(jn\pi h) \\ n\pi\,\cos(im\pi h)\sin(jn\pi h) \\ (1 - \lambda_{m,n})\cos\!\left((i - \tfrac12)m\pi h\right)\cos\!\left((j - \tfrac12)n\pi h\right) \end{bmatrix} \quad \text{for } m, n = 0, 1, 2, \ldots \tag{3.16-417}$$
The good news is stability (no eigenvalues go like $O(h)$), and the (slightly) bad news is that the velocity parts are not, as for the Stokes LBB problem, $O(h^2)$ smaller than the pressure parts. Our final remark is this: it must be the CB-part, $(-1)^{i+j}$, that (still) causes the projection of $\tilde u$ to the divergence-free subspace, $u$, to occur without wiggles because this has been our numerical experience; i.e., the intermediate velocity, $\tilde u$, is apparently 'sufficiently smooth' that its projection onto the 'bad' modes is very small.

◦ A fully implicit projection method. We (RLS and D. Veyret) have experimented with a fully implicit second-order projection method with mostly favorable results. We report that and planned future progress here, in summary form, using our modification of the code PASTIS (Daniels, 1992, 1993). We began with the linearized TR equations shown in (3.16-237), with the skew-symmetric advection matrix, $N(u^*)$, being explicitly constructed from the 'advective form' matrix:

$$N(u^*) = \tfrac{1}{2}\left[N(u) - N^T(u)\right], \tag{3.16-418}$$

from which the 'mostly favorable' descriptor originated. The results were favorable for all 'internal' flows tested (Dirichlet BC's all around), but unfavorable for a flow with an NBC at outflow (vortex shedding); there were large spurious $2\Delta x$ 'waves' at and near the outlet. The reason is that $N(u^*)$ involves velocities at nodes on the outflow boundary and the boundary integral, per the $\beta = 1/2$ discussion in Chapter 2 (Sections 2.2.3 and 2.2.4), is not properly accounted for. A 'fix' that is ad hoc and loses skew symmetry, but has worked in practice, is to revert to the simple advective form at outflow boundary nodes—as was done in Simo and Armero (1994; R. Taylor, personal communication) for their vortex shedding simulations. The linearized $N(u^*)$ was then generated using $u^*_{n+1}$ from AB2 or AB3.
Finally, the 'standard' projection 2 algorithm was implemented, with the only change being that the intermediate velocity solution now generates an unsymmetric matrix; $M + \Delta t K$ changes to $M + \Delta t[K + N(u^*)]$ in (3.16-404), except that the ODE integrator was TR (as it should be!) rather than BE. What needs yet to be done to really get a good 'solver' is to take advantage of the unconditional stability engendered by $N(u^*)$ and design a smart integrator, part of which could involve the further potentially cost-effective feature: do not project every timestep; do it only when 'needed.' Three obvious ideas are the following:

1. Noting from (3.16-314) that $\nabla \cdot u \sim t^2$ in the continuous projection method, vary the timestep via

$$\Delta t_{n+1} = \Delta t_n\,(\varepsilon/\|\nabla \cdot u_n\|)^{1/2}, \tag{3.16-419}$$

where $\varepsilon$ is a user-specified maximum allowable divergence.

2. Try an AB predictor-TR corrector scheme as done with the fully coupled system; although theoretically 'lacking,' it may still give reasonable results, at least if $\Delta t$ is not too large.

3. Combine 1 and 2 in some clever fashion, and figure out a smart way to avoid the projection step at each timestep; do it less frequently, especially for flows tending to a steady state.

Final Remark on Smart Integrators Using the Projection Method: Beware the biharmonic catastrophe for too-large $\Delta t$ selection; it could turn 'smart' to stupid.
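The divergence-based timestep rule of idea 1, (3.16-419), can be sketched in a few lines. This is only an illustration and not code from PASTIS: the function name `next_timestep`, the growth cap `max_growth`, and the bounds `dt_min` and `dt_max` are assumptions added here so that a vanishing divergence norm does not produce an unbounded step.

```python
import math

def next_timestep(dt, div_norm, eps, max_growth=2.0, dt_min=0.0, dt_max=math.inf):
    """Timestep update of (3.16-419): dt_new = dt * sqrt(eps / ||div u_n||).

    max_growth, dt_min and dt_max are safeguards assumed for this sketch;
    they are not part of (3.16-419).
    """
    if div_norm > 0.0:
        factor = min(math.sqrt(eps / div_norm), max_growth)
    else:
        factor = max_growth  # divergence already zero: grow as fast as allowed
    return min(max(dt * factor, dt_min), dt_max)
```

A divergence norm four times the tolerance halves the step; a divergence far below the tolerance lets the step grow, but only by the assumed cap.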
d. A sampling of projection methods used by others. To 'prove' that projection methods are both 'attractive' and not easy to understand, regardless of how easy they are to program, we conclude this section with a sampling of the literature on this subject, spanning FEM, FDM, and spectral methods. There is a seemingly endless string of papers on the subject and even as we write this down, we are aware of more coming, from Heidelberg in particular (A. Prohl and R. Rannacher). The more mathematical of the publications will typically begin by making one or another set of regularity assumptions and proceed from there to prove one or another convergence result. The problem is that the results often, but not always, disagree with those of others and with those from one or another numerical experiment. Anyway, we list below, and comment upon, enough of these so that the interested reader may quickly(?) catch up on the literature.

We start with J. Shen, who probably leads the pack in number of publications on the subject, and we cite only some of them. In Shen (1992) he showed ('weakly') first-order global accuracy (in $\Delta t$) for $u$ and 'weakly 1/2' for $P$ for projection 1 (Chorin's method). He also showed 'strongly first-order' for $u$ and 'weakly first-order' for $P$ for projection 2 via BE, the latter being close to the best you can hope for, since BE on the full index 2 coupled system is first-order in $u$ and $P$. (See, however, our numerical results in Section 3.16.1d.) In Shen (1996) he addresses higher-order schemes (projection 2 and variants) and presents some numerical results using a (Legendre-Galerkin) spectral method. He proves second-order for $u$, but only first-order for $P$ for projection 2 and TR. We shall return to this paper after citing a few others, because some of the issues are related, but we also now mention his 'first attempt,' in Shen (1992).
Rannacher (1992) has gone further with projection 1, 'the classical projection method'; by reinterpreting the intermediate velocity ($\tilde{u}$) as the final, reported velocity, he placed the method in the category of 'pressure stabilization methods' (see Section 3.13.3) and then proved the optimal result: first-order for both $u$ and $P$, the latter holding only away from $\Gamma_D$. Near the boundary, convergence deteriorates to $O(\sqrt{\Delta t})$, a la Shen (R. Rannacher, personal communication), owing to the spurious BL there.

In a series of papers, J.-L. Guermond has addressed some projection issues and obtained some interesting results, using both finite elements and finite differences (in different papers). One of the latest is also interesting; in Guermond and Quartapelle (1995) they apply BE to projection 2, but then omit the last step (stripping off the gradient) to obtain an approximate projection method. See also Guermond and Quartapelle (1996). Recalling (3.16-398) and the discussion there, we repeat this 'analysis' for projection 2 via BE, starting from

$$\frac{\tilde{u}_{m+1} - u_m}{\Delta t} + \nabla P_m = \nu \nabla^2 \tilde{u}_{m+1}, \tag{3.16-420}$$

which is solved with the BC's given in (3.16-372) and (3.16-373). The 'conventional' projection step is then: find $u_{m+1}$ and $P_{m+1}$ from

$$\tilde{u}_{m+1} = u_{m+1} + \Delta t\,\nabla(P_{m+1} - P_m) \quad \text{and} \quad \nabla \cdot u_{m+1} = 0, \tag{3.16-421}$$

with $n \cdot u_{m+1} = n \cdot \tilde{u}_{m+1}$ on $\Gamma_D$ and $P_{m+1} = -F_n$ on $\Gamma_N$. This of course yields

$$\nabla^2(P_{m+1} - P_m) = \nabla \cdot \tilde{u}_{m+1}/\Delta t \quad \text{in } \Omega, \tag{3.16-422}$$

$\partial(P_{m+1} - P_m)/\partial n = 0$ on $\Gamma_D$, and $P_{m+1} = -F_n$ on $\Gamma_N$. This is projection 2 via BE. The 'trick' played by Guermond is then to 'pretend' (3.16-421) does not exist after using it
to eliminate the divergence-free velocity in (3.16-420) and then just use (3.16-420) for the (reported) velocity and (3.16-422) for the pressure update; they state: 'In this way the end-of-step velocity is made to disappear from the algorithm, thus eliminating the weird velocity space $Y_{0,h}$ from practical calculations.' The approximate projection algorithm is then, given $\tilde{u}_m$ and $P_m$:

Step 1. Solve for $\tilde{u}_{m+1}$ from

$$\frac{\tilde{u}_{m+1} - \tilde{u}_m}{\Delta t} + \nabla(2P_m - P_{m-1}) = \nu \nabla^2 \tilde{u}_{m+1}, \tag{3.16-423}$$

which is the BE analog of (3.16-398) shown earlier for TR.

Step 2. Solve for $P_{m+1}$ from (3.16-422). Done.

Then they showed (stable, obviously) results for the $P_1P_1$ triangle; i.e., equal-order interpolation, using the 'inconsistent' Laplacian ($\int \nabla\varphi_i \cdot \nabla\varphi_j$) for the Poisson equation. Related (and earlier) papers are Guermond and Tenaud (1994) and Guermond (1994).

It now seems appropriate to revisit (3.16-398) and the discussion there, in light of the contributions by Shen, Rannacher (et al.), and Guermond (et al.). They all seem to opt for an approximate projection by reporting what we have been calling the intermediate velocity. We now return to Shen (1996), who computed and compared (i) with our version of projection 2, (ii) with the approximate version given in (3.16-398), and with a third: change the Adams-Bashforth pressure integration to a forward Euler one; i.e., replace $(3P_m - P_{m-1})/2$ by $P_m$. The reason he did this is that (3.16-398) gave only first-order results for $P$ (and second-order for $u$), which he blamed on the homogeneous Neumann BC for $P$, and its associated boundary layer. His numerical results showed essentially second-order for $u$ and something like $O(\Delta t^{1.5\pm})$ for $P$, for all three methods! These results also seem to justify the seemingly cavalier neglect of the divergence-free velocity; the divergences were small enough not to be harmful, and they apparently did not accumulate.
These results, taken with the above, seem to imply that the ostensible 'gains' listed below (3.16-398) are not all realized; whereas there is no BHE and no vortex sheets, there is still a BL and still BC 'problems.' We note in passing that Shen's pressure accuracy away from $\Gamma_D$ went up to full second-order for projection 2 and for approximate projection 2, (3.16-398), but not for the 'FE version' of the pressure. Interesting. The last Shen paper we cite is Shen (1993), in which he too examines (briefly) projection 2 via BE a la Guermond and Quartapelle (1995), although the purpose of the paper was to prove that a higher-order-yet projection method, called 'projection 3' in Gresho (1990b), is a loser: it is unconditionally unstable.

Moving on, we shall more quickly list the remaining, relevant projection contributions that we are aware of, beginning with the first higher-order finite-difference method (projection 2), of Van Kan (1986); this paper, and that of Bell et al. (1989), are probably the two key finite-difference papers on projection 2. Both predict and demonstrate second-order convergence for velocity; neither reported pressure accuracy. The next important paper we mention (again) is that by E and Liu (1995); they even address the intermediate BC 'issue' for projection 1, including what we earlier called optimal BC's on $\tilde{u}$; cf. (3.16-325). They study Chorin's method ['classical' projection 1 (BE) with optimal BC] and Kim and Moin's (1985) method, which is Chorin's except for an elevation from BE to TR. Their results are as follows: (i) for semi-discrete projection 1 with 'smooth initial data,' the velocity error ($\infty$-norm in time, $L^2$ in space) is $O(\Delta t)$,
and that for pressure is $O(\Delta t^{1/2})$; (ii) for semi-discrete projection 2, the analogous results are $O(\Delta t^2)$ and $O(\Delta t)$, respectively; (iii) if, however, some non-local and generally non-realizable-in-practice initial ($t = 0$) regularity assumptions are satisfied, such as (3.8-37), then additional results are available (especially) for the pressure, outside of the spurious numerical BL, improving the pressure error (pointwise in this case; i.e., stronger) for projection 1 to $O(\Delta t)$ and that for projection 2 to $O(\Delta t^2)$. See the original paper for details. See too their most recent paper [E and Liu (1996)] in which the velocity accuracy applies right up to the boundary, in apparent disagreement with Shen and Rannacher.

The 'approximate factorization' approach taken by Dukowicz and Dvinsky (1992) to obtain/derive higher-order projection methods (projection 2 and others) is novel; the paper has a number of interesting ideas. The finite-difference version of 'approximate projections' seems to have originated with the paper by Almgren et al. (1996), although the earlier one by Dvinsky and Dukowicz (1993) may have coined the phrase first. A related paper by Rider (1994) is interesting in that it includes a large number of numerical experiments.

Another alternative method for deriving projection methods is shown by Perot (1993); the result is similar to projection 2. It is based on approximate factorization like those of Dukowicz and Dvinsky (1992), not approximate projection, as the final velocity is discretely divergence-free; and it involves (implicitly, perhaps) a discrete BHE. Applied to the Stokes system, $\dot{u} + Ku + GP = f$, $Du = g$, it gives, for constant $f$ and $g$ for simplicity, the algorithm below:

Step 1. Solve

$$\frac{\tilde{u}_{n+1} - u_n}{\Delta t} + K\left(\frac{\tilde{u}_{n+1} + u_n}{2}\right) = f,$$

with $P$ dropped, like projection 1.

Step 2. Solve $[DG + (\Delta t/2)DKG]P_{n+1} = (D\tilde{u}_{n+1} - g)/\Delta t$.

Step 3. Obtain the divergence-free velocity, $u_{n+1} = \tilde{u}_{n+1} - \Delta t\,\left(I + \frac{\Delta t}{2}K\right)GP_{n+1}$, which satisfies $Du_{n+1} = g$.
The perturbation to the pressure gradient term, a consequence of the approximate factorization, elevates the method to second-order in velocity and first-order in pressure, which Perot demonstrates numerically. A collective account may be in order for both the Dukowicz and Dvinsky paper mentioned earlier and the one by Perot: they all assert that the issues associated with BC's for the intermediate velocity and for the pressure are obviated by their derivations via 'matrix manipulations' of the discrete equations. These are assertions that we believe, while ostensibly true, are misleading in that the final discrete equations for both cases can be studied near $\Gamma$ to see what BC's are 'built-in'; we believe that they will turn out to be the same ones that we use in the simpler projection methods. Another interesting reaction to these papers is that of Rosenfeld (1996): 'Note that the pressure always converges with second-order accuracy as well, contrary to the predictions of Dukowicz and Dvinsky and Perot.'
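The three steps of the approximate-factorization algorithm quoted above for the Stokes system can be sketched with small dense matrices. This is only an illustration under stated assumptions: the mass matrix is taken to be the identity (as in the system written above), the function name `perot_step` is hypothetical, and dense `numpy.linalg.solve` calls stand in for whatever solvers a real code would use.

```python
import numpy as np

def perot_step(u, K, G, D, f, g, dt):
    """One step of the three-step algorithm for du/dt + K u + G P = f, D u = g.

    Assumes a unit mass matrix and dense arrays (illustration only).
    """
    n = len(u)
    I = np.eye(n)
    # Step 1: TR-like momentum solve with the pressure dropped.
    u_tilde = np.linalg.solve(I / dt + 0.5 * K, f + (I / dt - 0.5 * K) @ u)
    # Step 2: [DG + (dt/2) D K G] P = (D u_tilde - g) / dt.
    B = (I + 0.5 * dt * K) @ G
    p = np.linalg.solve(D @ B, (D @ u_tilde - g) / dt)
    # Step 3: correct the velocity so that D u_new = g holds discretely.
    u_new = u_tilde - dt * (B @ p)
    return u_new, p
```

Because Step 3 subtracts exactly the operator that appears in Step 2, the constraint $Du_{n+1} = g$ is satisfied to roundoff whatever the size of $\Delta t$.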
Returning now to FEM, we mention two recent papers by Turek (1996, 1997), in which many numerical comparisons are made and compared also with some fully coupled (DAE) methods, and even a combination of the two in which the projection method is used as a sort of 'preconditioner' for the fully coupled method. We conclude this brief(?) review with a higher-order, spectral-element projection method that introduces a new wrinkle: in Timmermans et al. (1996), the 'pressure correction term,' $P_{m+1} - P_m$, in, for example, (3.16-378), was replaced by $P_{m+1} - P_m + \nu\nabla \cdot \tilde{u}_{m+1}$; i.e., $P_{m+1} = P_m + 2\varphi/\Delta t - \nu\nabla \cdot \tilde{u}_{m+1}$. This 'improvement' does seem to help the accuracy even though it was not rigorously justified theoretically. We should add that Timmermans et al. did not use either TR or BE for the $\tilde{u}$ step: they used BDF2 (see too Minev and Gresho, 1998). Look for some recent additional work on this and related methods by Prohl and Rannacher which will also improve the results of E and Liu (1995) [R. Rannacher, personal communication]; see, in particular, Prohl (1996, 1997) and Prohl and Rannacher (1997), in which also a new variant, called 'Chorin-Uzawa,' is introduced to virtually preclude the spurious boundary layers. This concludes our excursion into 'work by others' on projection methods; hopefully some of it is useful, perhaps leading some future genius to the 'ultimate projection' and its unerring analysis.

3.16.7 Fully-Implicit Segregated Solution Methods: Transient and Steady-State

The desirability of sequential solution methods with the concomitant significantly smaller matrices is obvious from the viewpoint of computational cost, if not simplicity, at least for large 3D problems. The desirability of fully implicit time-integration techniques, with $\Delta t$ based on the 'physics,' is also (now) obvious.
The method to be described below is predicated (in part) on these features, but in the main part, on obtaining a good steady-state 'solver'. It resembles the semi-implicit projection methods discussed above, but generalizes them in several significant ways: (i) time integration is 'elevated' from semi- to fully-implicit so that the CFL restriction is bypassed and smart timestep control is possible; (ii) the ad hoc approximation of semi-consistent mass is obviated via the honest use of fully consistent mass; (iii) there is no projection boundary layer and no spurious slip velocity; and, finally, (iv) the method is, as mentioned above, most useful for attacking the steady-state form of the equations ($\Delta t = \infty$ in what follows).

While the method to be described below finds its greatest utility in 3D simulations involving lots of coupling (e.g., Boussinesq equations via both thermal and solutal natural convection problems; see Volume II), wherein full/honest coupling of all conservation equations (especially in the index 2 formulation) can easily overload even the biggest computers, we believe it appropriate to introduce it here and, for ease of presentation, in 2D only. The extension to 3D is 'obvious' and that to coupled systems not difficult, and will be done in Volume II. Besides, the intentional lack of coupling can cause convergence 'problems' when the tight coupling should be respected owing to its importance (e.g., thermal convection; see Volume II). Thus, while the presentation to follow 'merely' uncouples $u$, $v$, and $P$, it is worth emphasizing that its rewards are better realized in 3D and when additional transport equations are present (including turbulence 'transport' equations), all of which are segregated/uncoupled and solved sequentially, but repeatedly, via a new 'iteration loop,' even for linear problems. One further simplification that we invoke below, again merely
to simplify the basic/conceptual ideas, is to employ implicit Euler (BE) for the time integrations. Surely by now the reader realizes that: (i) it is not (usually not) the method of choice and (ii) it is easy to convert the final equations from BE to, for example, TR or BDF2. But the equations to follow are, necessarily, quite long even when written via BE, so we hope the reader will: (i) forgive us, and (ii) not 'write code' using (only) BE.

The technique described below was devised by Haroutunian et al. (1993), partly to help iterative solvers cope with unsymmetric matrices and partly to help 'stabilize' the overall solution of non-linear algebraic equation systems. It involves the use of so-called 'implicit relaxation' procedures on every transport equation, in which the 'relaxation factor' is an inherent part of the solution procedure rather than only being explicitly applied at the end; i.e., explicit relaxation means that $x^{k+1} = \omega x^k + (1 - \omega)x^{k+1/2}$, where $x^{k+1/2}$ represents an 'intermediate' (temporary) update of $x$, and $\omega$ is the relaxation factor ($0 < \omega < 1$). The implicit procedure is, in fact, an adaptation of old and successful FDM strategies ('SIMPLE,' 'TEACH,' etc.) in which diagonal dominance was the coveted attribute, and implicit relaxation was employed during the iterative solution process.

The starting point of the segregated solution method is an iterative solution of the BE equation (3.16-245):

$$\left[\frac{1}{\Delta t}M + K + N(u^k_{n+1})\right]u^{k+1}_{n+1} + CP^{k+1}_{n+1} = \frac{1}{\Delta t}Mu_n + f_{n+1} \equiv b_{n+1}, \tag{3.16-424}$$

$$C^Tu^{k+1}_{n+1} = g_{n+1}, \tag{3.16-425}$$

which could (and should) also be construed as a solution method for the steady equations by setting $\Delta t = \infty$ and dropping the $n + 1$ subscripts, a property not shared by TR. But these linear equations are still fully coupled, which is just what we wanted to avoid. We now avoid it, beginning with an explication of the implied (but not used!) equation for the pressure,

$$[C^TA^{-1}(u^k_{n+1})C]\,P^{k+1}_{n+1} = C^TA^{-1}(u^k_{n+1})\,b_{n+1} - g_{n+1}, \tag{3.16-426}$$

where $A(u) = \Delta t^{-1}M + K + N(u)$. Henceforth, we shall suppress the temporal indices, $n$ and $n + 1$, for notational simplicity. Also for simplicity, we shall denote $A(u^k)$ by $A_k$.

The basic idea will be presented first (frills later) and it is this: noting that (3.16-424) and (3.16-425) imply (3.16-426), replace (3.16-425) by an approximation to (3.16-426) (via $A \to \tilde{A}$, where $\tilde{A}$ approximates $A$ and is easy to invert, and will be discussed below) and iterate between (3.16-424) and the approximation to (3.16-426) in such a way that convergence 'rapidly' occurs and in such a way that (3.16-425) will still be satisfied. One way to assure a divergence-free result is to perform a projection during each iteration, leading to the following algorithm, called the 'pressure projection' algorithm in Haroutunian et al. (1993): given $u^0$, do for $k = 0, 1, 2, \ldots$:

Step 1. Solve

$$(C^T\tilde{A}_k^{-1}C)\,P^{k+1} = C^T\tilde{A}_k^{-1}(b - A_ku^k), \tag{3.16-427}$$

and note that (3.16-426) is recovered if $\tilde{A} = A$ and if $C^Tu^k = g$.

Step 2. Solve

$$A_ku^{k+1/2} = b - CP^{k+1}. \tag{3.16-428}$$
Step 3. Project: i.e.,

(a) Solve

$$(C^T\tilde{A}_k^{-1}C)\,\lambda = C^Tu^{k+1/2} - g. \tag{3.16-429}$$

(b) Compute

$$u^{k+1} = u^{k+1/2} - \tilde{A}_k^{-1}C\lambda, \tag{3.16-430}$$

where $\lambda$ is the Lagrange multiplier associated with the projection (see Appendix 3); it is zero upon convergence.

Step 4. If $\|\lambda\| < \varepsilon$, stop; else update $A_k$ and $\tilde{A}_k$ and go to Step 1.

Note that, upon convergence, both $C^Tu = g$ and $A(u)u + CP = b$, as desired; also, (3.16-427) is (still) satisfied, for any $\tilde{A}(u)$, but it is no longer relevant. Note too that there are two Poisson-like equations per (non-linear) iteration, one for $P^{k+1}$ and one for $\lambda$; clearly the method will only be cost-effective if a 'small' number of iterations is needed. It is also noteworthy that even steady Stokes flow ($A = K$) requires iterations (two PPE's plus one Poisson equation for each velocity component, per iteration), whereas, if $\tilde{A} = A$, one iteration would suffice, so that the decoupling is not without some added cost. (It is also worth mentioning that the algorithm was not designed with linear problems in mind.)

What can be said about convergence of the iterations? Is convergence guaranteed? What is the convergence rate? We shall brush these aside, at least for now, to get on to an improved algorithm that is even harder to analyze (yet usually performs better). Besides, until $\tilde{A}$ is defined, it is impossible to answer these questions.

The 'final' algorithm introduces the implicit relaxation referred to above (via the $\alpha$-terms, with $\alpha$ and $\tilde{A}$ defined below) and further decouples the equations by clearly segregating the $u$ and $v$ momentum equations. It is this: for $k = 0, 1, 2, \ldots$, do [with $u^0$ and $v^0$ given, and $A_{xk} = A_x(u^k)$, etc.]:

Step 1. Solve

$$(C_x^T\tilde{A}_{xk}^{-1}C_x + C_y^T\tilde{A}_{yk}^{-1}C_y)\,P^{k+1} = C_x^T\tilde{A}_{xk}^{-1}(b_x - A_{xk}u^k) + C_y^T\tilde{A}_{yk}^{-1}(b_y - A_{yk}v^k). \tag{3.16-431}$$

Step 2. Solve

$$\left(\frac{\alpha_u}{1 - \alpha_u}\tilde{A}_{xk} + A_{xk}\right)u^{k+1/2} = b_x - C_xP^{k+1} + \frac{\alpha_u}{1 - \alpha_u}\tilde{A}_{xk}u^k. \tag{3.16-432}$$

Step 3. Solve

$$\left(\frac{\alpha_v}{1 - \alpha_v}\tilde{A}_{yk} + A_{yk}\right)v^{k+1/2} = b_y - C_yP^{k+1} + \frac{\alpha_v}{1 - \alpha_v}\tilde{A}_{yk}v^k. \tag{3.16-433}$$

Step 4. Project: i.e.,

(a) Solve [cf. (3.16-429)]

$$(C_x^T\tilde{A}_{xk}^{-1}C_x + C_y^T\tilde{A}_{yk}^{-1}C_y)\,\lambda = C_x^Tu^{k+1/2} + C_y^Tv^{k+1/2} - g. \tag{3.16-434}$$

(b) Compute

$$u^{k+1} = u^{k+1/2} - \tilde{A}_{xk}^{-1}C_x\lambda. \tag{3.16-435}$$

(c) Compute

$$v^{k+1} = v^{k+1/2} - \tilde{A}_{yk}^{-1}C_y\lambda. \tag{3.16-436}$$

Step 5. If $\|\lambda\| \le \varepsilon$, stop; else update $A_k$ and $\tilde{A}_k$, increment $k$, and go to Step 1.

Remarks:

(1) Upon convergence, the desired equations, (3.16-424) and (3.16-425), have been solved, regardless of the choice of $\alpha$'s or $\tilde{A}$, as required.

(2) Any or all of Steps 1, 2, 3, and 4(a) could also define 'inner iteration' steps if solved by an iterative method.

(3) The $\alpha$'s are between 0 and 1, the lower bound removing relaxation and the upper bound giving no change; hopefully there is an optimum value.

(4) The algorithm can also be used to attack the steady equations; simply drop the mass matrix terms.

(5) The (transient) algorithm can also utilize the predictor-corrector-variable-$\Delta t$ techniques discussed earlier for the fully coupled solution methods.

(6) The 'proper' $\alpha$ can be very useful to 'stabilize' (read: make converge) the non-linear equations themselves.

(7) When the convergence test is passed ($\|\lambda\| < \varepsilon$), either the timestep has been completed or the steady solution (no mass matrix) has been found.

(8) The stopping criterion could just as well be done using $\|\delta u\|$ and $\|\delta v\|$ or $\|C^Tu - g\|$, and probably better using 'relative' norms; see Section 3.16.4. Safest, of course, is to 'test on all.'

It remains to define the approximations to the $A$-matrix and the relaxation factors. The $A$-matrix is approximated by (no surprise) a diagonal matrix that is (surprise?) formed by summing the absolute values of each row, the hope being that at least some semblance of the actual $A$-matrix will be contained in the result. (It will at least be dimensionally correct!) Thus,

$$\tilde{A}_{xk} = \tilde{A}_x(u^k_{n+1}, v^k_{n+1}) = \mathrm{diag}\left\{\sum_j \left|\left(A_x(u^k_{n+1}, v^k_{n+1})\right)_{ij}\right| : i = 1, 2, \ldots\right\}, \tag{3.16-437}$$

$$\tilde{A}_{yk} = \tilde{A}_y(u^k_{n+1}, v^k_{n+1}) = \mathrm{diag}\left\{\sum_j \left|\left(A_y(u^k_{n+1}, v^k_{n+1})\right)_{ij}\right| : i = 1, 2, \ldots\right\}, \tag{3.16-438}$$

and we note the following features:

1. In the $\Delta t \to 0$ limit, it becomes a sort of row-sum mass lumping, except for the absolute values. In contrast to conventional mass lumping, this 'row-sum' lumping has no effect on the accuracy of either transient or steady-state solutions, at least when iterating to convergence.

2. In a 'worst case' of large Re and steady flow, $A(u)$ is nearly skew-symmetric, and it is not obvious how any diagonal matrix could approximate it.
3. The $\tilde{A}$ described here is only one of an infinite number of possible choices. We implore any reader who finds a (significantly) better $\tilde{A}$ to contact the authors immediately!

There are several ways to select the implicit relaxation factors, $\alpha_u$ and $\alpha_v$:

1. Simply choose the optimal constant values for each. Since, however, these values are usually not known, other methods are also listed.

2. A local Reynolds-number-based scheme. Here the $\alpha$'s are functions of element (grid) Reynolds number, $Re_e = u_el_e/\nu$, where $u_e$ is the average velocity in element $e$ and $l_e = \sqrt{A_e}$ approximates the element 'size' ($A_e$ is element area). The $\alpha$'s are then computed locally after each iteration via $\alpha_u = (\alpha_{min} + Re_e\,\alpha_{max})/(1 + Re_e)$, where again $\alpha_{min}$ and $\alpha_{max}$ are user-specified. This method uses a 'large' $\alpha$ for advection-dominated regions and a small one where the flow is diffusion-dominated.

3. Recent experience with the implicit relaxation scheme suggests that a blend of dynamic element-matrix-based implicit relaxation and explicit relaxation of the non-linear iterations may be an optimal strategy .... This (at the time of writing) is current research!!

More remarks on segregated solvers:

Remarks:

(1) It may be useful, sometimes, to employ some explicit relaxation on the pressure update, via $\alpha_pP^k + (1 - \alpha_p)P^{k+1} \to P^{k+1}$; sometimes this will help the 'non-linear' iterations converge.

(2) Another segregated solution method that sometimes works well, called 'pressure update' (PU) by Haroutunian et al. (1993), from which most of this section was obtained, can be derived via a few simple modifications to the projection method described above: (i) Add a 'penalizing' term to the RHS of the pressure equation (3.16-431), in the form $\lambda_p(C^Tu^k_{n+1} - g_{n+1})$. Experience indicates that $\lambda_p = 0.15$ is a reasonable value. (ii) Omit the projection step. (iii) Test for convergence via $\|C^Tu^{k+1}_{n+1} - g_{n+1}\| \le \varepsilon$.
(3) The SIMPLE algorithm (PC; pressure correction) in Haroutunian et al. (1993) is no longer recommended.

Remarks on Remark (2):

(1) Since the pressure and $\lambda$ equations are usually the most time-consuming part of each iteration, this method has the potential for being much more cost-effective, and it is if not too many more iterations are required.

(2) The resemblance of this scheme to the semi-implicit PPE scheme of Gresho and Chan (1990) is interesting; it generalizes and improves that one via penalizing any spurious divergence, a technique similar to that called 'divergence cleaning' (Ramshaw, 1983).
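One pass of the basic 'pressure projection' algorithm, (3.16-427) through (3.16-430), together with the row-sum diagonal approximation of (3.16-437), can be sketched for a small linear problem as follows. This is a hedged illustration only: the names `rowsum_diag` and `pressure_projection_iteration` are hypothetical, $A$ is held fixed (a Stokes-like linear case), dense `numpy` solves replace the iterative solvers a production code would use, and the implicit relaxation of the 'final' algorithm is not included.

```python
import numpy as np

def rowsum_diag(A):
    """Diagonal approximation to A built from absolute row sums, cf. (3.16-437)."""
    return np.diag(np.abs(A).sum(axis=1))

def pressure_projection_iteration(A, A_tilde, C, b, g, u):
    """One pass of Steps 1-3 for fixed A; returns (u_new, p, lam)."""
    At_inv_C = np.linalg.solve(A_tilde, C)             # A_tilde^{-1} C
    S = C.T @ At_inv_C                                 # C^T A_tilde^{-1} C
    # Step 1: approximate pressure equation (3.16-427).
    p = np.linalg.solve(S, C.T @ np.linalg.solve(A_tilde, b - A @ u))
    # Step 2: momentum solve with the new pressure (3.16-428).
    u_half = np.linalg.solve(A, b - C @ p)
    # Step 3: projection, (3.16-429) and (3.16-430).
    lam = np.linalg.solve(S, C.T @ u_half - g)
    u_new = u_half - At_inv_C @ lam
    return u_new, p, lam
```

By construction the projection enforces $C^Tu^{k+1} = g$ after every pass, whatever $\tilde{A}$ is; with $\tilde{A} = A$ and a starting guess satisfying $C^Tu^0 = g$, a single pass reproduces the coupled solution, as noted in the text. The sketch makes no claim about how many passes the diagonal $\tilde{A}$ needs.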
Final Remark: Recalling that iteration to completion (convergence) yields the associated underlying DAE method (e.g., TR or BE), it is also worth pointing out that the segregated solution method sometimes approaches another popular method in the other limit, for a time-dependent simulation: one pass through the algorithm gives a result that, at least for small $\Delta t$ (time-accurate) simulations, looks (and behaves) very much like 'projection 2.' This assertion/observation requires two important 'facts' [$k = 0$ in (3.16-431) through (3.16-438)]: (i) $\|A_{x0}u^{1/2}\| \gg \alpha_u/(1 - \alpha_u)\,\|\tilde{A}_{x0}(u^{1/2} - u^0)\|$, where $u^0 = u_n$, $u^{1/2}$ is taken to be $\tilde{u}_{n+1}$, and $u^1$ is taken to be $u_{n+1}$; and ditto for $v$; (ii) the pressure from the 'PPE' [(3.16-431)], with $P^1$ taken to be $P_n$, is 'the same as' the pressure from the projection 2 Lagrange multiplier; namely, (3.16-407).

3.16.8 A Fractional-Step (Index 2) Method

In a recent series of papers (Rannacher, 1989, 1993; Müller et al., 1995), Rannacher et al. argue convincingly that (his version/adaptation of) a special scheme designed by R. Glowinski, which is called the fractional-step $\theta$-scheme (FS$\theta$), is one that merits serious consideration by the CFD community. This method, '... which seems to have the potential to become the winner in this race...' (Rannacher, 1993), is second-order accurate, like TR and BDF2 but, unlike TR, it exhibits strong A-stability and, unlike BDF2, it is only lightly dissipative for purely hyperbolic problems. He argues against TR because any high-frequency 'noise' that is injected into the solution (e.g., via too crude a solution of the equations, linear or non-linear, within a timestep) will be too slowly damped. But we note in passing that we know of few complaints by those who have employed our variable-step TR method [e.g., Crochet et al.
(1983, 1985); Kheshgi and Scriven (1984); Bixler and Benner (1985); Keunings (1986); Gartling (1987); Derby et al. (1987); Ladeinde and Torrence (1990); FIDAP (Fluid Dynamics International, 1993); and Basaran and De Paoli (1994)]; in fact, we cite D. Gartling (1994, personal communication), who wrote the NACHOS code and other codes that use the variable-step TR integrator (only NACHOS solves the NSE's), which has numerous users: 'The algorithm works well; we are happy with it. We seldom if ever run the BE version.' Nevertheless, R. Rannacher (1994, personal communication) argues that FS$\theta$ may still beat TR in cost-effectiveness because its good damping properties will permit equivalent accuracy using larger timesteps. We shall return to this issue after describing the method, which we do largely following Rannacher rather than Glowinski in that Rannacher considers it as another ODE method.

Rannacher attributes the FS$\theta$ scheme to Glowinski via Bristeau et al. (1987), but Glowinski himself (Glowinski, 1991) nicely summarizes the scheme as 'a variant of the Peaceman-Rachford scheme,' and refers to even earlier references, e.g., Bristeau et al. (1985); the earliest being Glowinski (1984), and the one with the most appropriate title is Glowinski (1986): 'Splitting Methods for the Numerical Solution of the Incompressible Navier-Stokes Equations.' For a recent analysis of the method, see Kloucek and Rys (1994). It resembles a diagonally implicit Runge-Kutta method in that there are three (implicit) 'stages' (fractional steps), all of which taken together constitute a 'timestep.' Applied first to the (somewhat special) non-linear ODE system described by

$$\dot{y} + A(y)y = f(t), \tag{3.16-439}$$
the 'general' FS$\theta$ scheme is:

1. Solve for $y_{n+\theta}$ from

$$[I + \alpha\theta\Delta t\,A(y_{n+\theta})]\,y_{n+\theta} = [I - (1 - \alpha)\theta\Delta t\,A(y_n)]\,y_n + \theta\Delta t\,f_n. \tag{3.16-440}$$

2. Solve for $y_{n+1-\theta}$ from

$$[I + (1 - \alpha)(1 - 2\theta)\Delta t\,A(y_{n+1-\theta})]\,y_{n+1-\theta} = [I - \alpha(1 - 2\theta)\Delta t\,A(y_{n+\theta})]\,y_{n+\theta} + (1 - 2\theta)\Delta t\,f_{n+1-\theta}. \tag{3.16-441}$$

3. Solve for $y_{n+1}$ from

$$[I + \alpha\theta\Delta t\,A(y_{n+1})]\,y_{n+1} = [I - (1 - \alpha)\theta\Delta t\,A(y_{n+1-\theta})]\,y_{n+1-\theta} + \theta\Delta t\,f_{n+1-\theta}, \tag{3.16-442}$$

where $0 < \theta < 1$ and $0 < \alpha < 1$ are parameters. Glowinski (1991) shows, for the linear case ($A$ is a constant and SPD matrix), by comparing the overall amplification factor (replace $A$ by $\lambda$, an eigenvalue of $A$, and set $f$ to zero), $\xi = y_{n+1}/y_n$, to $e^{-\lambda\Delta t}$, where

$$\xi = \frac{[1 - (1 - \alpha)\theta\lambda\Delta t]^2\,[1 - \alpha(1 - 2\theta)\lambda\Delta t]}{[1 + \alpha\theta\lambda\Delta t]^2\,[1 + (1 - \alpha)(1 - 2\theta)\lambda\Delta t]}, \tag{3.16-443}$$

that the scheme is second-order accurate if either

$$\theta = 1 - 1/\sqrt{2} = 0.2929 \tag{3.16-444}$$

or

$$\alpha = 1/2, \tag{3.16-445}$$

the first of which is the 'interesting' one. (For $\alpha = 1/2$, the scheme approaches TR in either the $\theta \to 0$ or $\theta \to 1$ limit, and is TR at $\Delta t/2$ if $\theta = 1/2$ and is TR at $\Delta t/3$ if $\theta = 1/3$; none of these special cases, all TR, is worthwhile. In fact, $\alpha = 1/2$ is always a variant of TR.) If neither is true, then the scheme is doomed to first-order accuracy. But the choice $\alpha = 1/2$ also gives $\lim_{\lambda\Delta t \to \infty}\xi = -(1 - \alpha)/\alpha = -1$, which is A-stability; like TR, a 'disadvantage.' So the choice of $\theta$ in (3.16-444) is made. Next, it is observed that only if

$$\alpha = (1 - 2\theta)/(1 - \theta) = 2 - \sqrt{2} = 2\theta = 0.5858 \tag{3.16-446}$$

will the coefficient matrix in (3.16-440) through (3.16-442) for the special-but-important case of a constant matrix, $A$, give the same linear system to solve in each of the three steps. Since this choice also gives $\lim_{\lambda\Delta t \to \infty}\xi = -1/\sqrt{2} = -0.7071$, which means that the ODE method is then strongly A-stable (but not L-stable or even stiffly stable), this is deemed to be a good choice, especially by Rannacher. [Large $\omega\Delta t$ computations of $\dot{y} = i\omega y$ will give $y_{n+1} = (-0.7071)^ny_0$ rather than $y_{n+1} = (-1)^ny_0$ of TR and $y_{n+1} = 0$ of BDF2.]
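The properties claimed for the amplification factor (3.16-443) are easy to check numerically. The short sketch below, with hypothetical names `xi`, `THETA`, and `ALPHA`, evaluates $\xi$ as a function of $z = \lambda\Delta t$ and is meant only as a sanity check of the stated limits, not as part of any published code.

```python
import math

THETA = 1.0 - 1.0 / math.sqrt(2.0)             # (3.16-444)
ALPHA = (1.0 - 2.0 * THETA) / (1.0 - THETA)    # (3.16-446)

def xi(z, theta=THETA, alpha=ALPHA):
    """Amplification factor (3.16-443) with z = lambda * dt."""
    num = (1.0 - (1.0 - alpha) * theta * z) ** 2 * (1.0 - alpha * (1.0 - 2.0 * theta) * z)
    den = (1.0 + alpha * theta * z) ** 2 * (1.0 + (1.0 - alpha) * (1.0 - 2.0 * theta) * z)
    return num / den

def local_error(z):
    """Difference between xi and the exact decay exp(-z) over one step."""
    return abs(xi(z) - math.exp(-z))
```

For the optimal pair, the two left-hand-side coefficients $\alpha\theta$ and $(1 - \alpha)(1 - 2\theta)$ coincide, `xi` tends to $-1/\sqrt{2}$ for large `z`, and halving `z` reduces `local_error` by a factor close to 8, which is the cubic local error of a second-order method. With `alpha=0.5` the large-`z` limit is $-1$ instead.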
Remarks:

(1) Glowinski and Rannacher do not seem to agree on the sampling points for $f(t)$ when solving (3.16-440) through (3.16-442), which is the R2 method; Glowinski uses $f_{n+\theta}$ in the first stage and $f_{n+1}$ in the third.
(2) The amplification factor for the (constant coefficient) FS$\theta$ scheme (for optimal $\theta$, $\alpha$) is easily found to be (to four digits)

$$\xi = \frac{(1 - 0.1213\lambda\Delta t)^2\,(1 - 0.2426\lambda\Delta t)}{(1 + 0.1716\lambda\Delta t)^3}, \tag{3.16-447}$$

which, for $\lambda\Delta t \to 0$, gives $\xi = 1 - \lambda\Delta t + (\lambda\Delta t)^2/2 - 0.1759(\lambda\Delta t)^3 + O(\lambda\Delta t)^4$, which has a very small 'local error' term. (Recall that for TR, 0.1759 is replaced by 1/4, which, compared with the 'target' of 1/6, has an error about nine times larger; FS$\theta$ is indeed accurate.) The fractional steps for $\dot{y} = -\lambda(t)y + f(t)$ corresponding to the optimal $(\theta, \alpha)$, and to (3.16-447) for constant $\lambda$, are (again to four digits):

1. $$(1 + 0.1716\lambda_{n+\theta}\Delta t)\,y_{n+\theta} = (1 - 0.1213\lambda_n\Delta t)\,y_n + \theta\Delta t\,f_n; \tag{3.16-448}$$

2. $$(1 + 0.1716\lambda_{n+1-\theta}\Delta t)\,y_{n+1-\theta} = (1 - 0.2426\lambda_{n+\theta}\Delta t)\,y_{n+\theta} + (1 - 2\theta)\Delta t\,f_{n+1-\theta}; \tag{3.16-449}$$

3. $$(1 + 0.1716\lambda_{n+1}\Delta t)\,y_{n+1} = (1 - 0.1213\lambda_{n+1-\theta}\Delta t)\,y_{n+1-\theta} + \theta\Delta t\,f_{n+1-\theta}, \tag{3.16-450}$$

and the suspicious-looking index on $f$ can be justified by simply requiring second-order accuracy when $\lambda = 0$, which means no error for $\dot{y} = t$ with $\theta$ a free parameter. The result is $\theta = 1 - 1/\sqrt{2}$; the indices are right as written.

In Rannacher (1989) are shown some very interesting comparisons of FS$\theta$ with TR, BDF2, BE, FE, and a second-order DIRK (diagonally implicit Runge-Kutta) method (see, too, Marx, 1994). He showed, in the context of fixed $\Delta t$ algorithms, that FS$\theta$ is the clear winner for an ODE system displaying some damped and some undamped oscillations: (i) in the non-stiff cases it displayed about the same accuracy as TR using three times the $\Delta t$ (which is the proper measure since it takes three times the effort of TR); (ii) in the stiff case (stiffness ratio of $10^4$), TR oscillations polluted the early time solution (the $\Delta t$ used was quite accurate for tracing the oscillatory mode, ~126 steps/cycle, but not small enough to damp the 'stiff' mode), thus giving better results at (again) three times the $\Delta t$ of TR (~42 steps/cycle).
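The three fractional steps (3.16-448) through (3.16-450) for the scalar problem $\dot{y} = -\lambda(t)y + f(t)$ translate almost line for line into code. The following is a minimal sketch with a hypothetical function name, `fs_theta_step`; it uses the exact coefficients rather than the four-digit values, takes `lam` and `f` as callables of time, and samples $f$ at $t_n$, $t_{n+1-\theta}$, and $t_{n+1-\theta}$ as written above.

```python
import math

THETA = 1.0 - 1.0 / math.sqrt(2.0)
ALPHA = 2.0 - math.sqrt(2.0)

def fs_theta_step(y, t, dt, lam, f):
    """Advance dy/dt = -lam(t) * y + f(t) by one FS-theta timestep (three stages)."""
    th, al = THETA, ALPHA
    t_a, t_b, t_c = t + th * dt, t + (1.0 - th) * dt, t + dt
    implicit = al * th                 # 0.1716..., the same in all three stages
    # Stage 1, cf. (3.16-448)
    y_a = ((1.0 - (1.0 - al) * th * dt * lam(t)) * y + th * dt * f(t)) \
        / (1.0 + implicit * dt * lam(t_a))
    # Stage 2, cf. (3.16-449)
    y_b = ((1.0 - al * (1.0 - 2.0 * th) * dt * lam(t_a)) * y_a
           + (1.0 - 2.0 * th) * dt * f(t_b)) / (1.0 + implicit * dt * lam(t_b))
    # Stage 3, cf. (3.16-450)
    y_c = ((1.0 - (1.0 - al) * th * dt * lam(t_b)) * y_b + th * dt * f(t_b)) \
        / (1.0 + implicit * dt * lam(t_c))
    return y_c
```

With $\lambda = 0$ and $f(t) = t$ the step is exact, which is the condition used above to justify the index on $f$; with $\lambda = 1$ and $f = 0$ the global error falls by a factor of about four when the step is halved.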
Impressive as these results are, we still believe that a 'smart' TR integrator would fare much better, taking very small $\Delta t$ only initially to follow the stiff components and growing monotonically toward that needed to follow the free oscillations. [Part of the 'problem' is philosophical: we believe stiffness should be accurately resolved because it is (usually at least) physical, whereas Rannacher is more concerned with non-physical stiffness encountered in the form of 'noise.'] And we believe it could be done more cost-effectively in the general case even if the FS$_\theta$ scheme were improved to include error control and automatic $\Delta t$ selection [and it has; see Turek (1996)], because of the following very important fact: local error estimation and the resulting $\Delta t$ control are virtually free (very low in cost) for TR (or LMM's in general) because they are done with an explicit method, whereas the relatively expensive (implicit) technique called 'step doubling' (common, for example, with RK methods) is required for FS$_\theta$ because it is a so-called 'composite method.' For every two 'real steps,' a single step at twice the size is required in order to estimate the local error and thus invoke $\Delta t$ control [see Gear (1971) for details]. Thus, to a first approximation, the overhead cost is a full 50%.
We leave as an exercise the application of the FS$_\theta$ scheme to both the scalar transport equation and the NSE's, but we do show the result for phase speed ($u_p$) and amplitude coefficient ($|\xi|$) for 1D pure advection using linear elements (recall Section 2.7.6): it is

$$\frac{u_p}{u} = -\frac{1}{c\theta}\tan^{-1}\frac{y}{x} \qquad (3.16\text{-}451)$$

and

$$\xi = x + iy = \frac{[m(\theta) - \alpha ic\sin\theta]^2\,[m(\theta) - 2\alpha ic\sin\theta]}{[m(\theta) + \beta ic\sin\theta]^3}, \qquad (3.16\text{-}452)$$

with due account taken of the quadrant in which $\xi$ lies. Here, $m(\theta) = (2 + \cos\theta)/3$ is the mass matrix symbol, $\alpha = 0.1213$, and $\beta = 0.1716$. Figure 3.16-25 shows the result in terms of relative phase speed ($u_p/u$) and amplification coefficient ($|\xi|$), both of which should be 1.0, for several values of $c$. It is seen that the phase speed is as good as or better than its implicit 'competitors,' TR and BDF2 (cf. Figures 2.7-15 and 2.7-17 for $P = 100$). The amplitude comparison, however, is not so easy: it is not available by comparing Figure 3.16-25(b) with Figure 2.7-15(b) or Figure 2.7-17(b), since the former is $|\xi|$ for $P = \infty$ and the latter are an exponentially rescaled $|\xi|$ for $P = 100$. We can compare a random point, however, such as $\theta = \pi/2$ for $c = 3$. The results are (for pure advection): $|\xi| = 1.0$, 0.495, and 0.953 for TR, BDF2, and FS$_\theta$, respectively, which suggests at least that FS$_\theta$ will be much better than BDF2 for advection-dominated flows.

Because it is rather interesting, and was not easy for us (PMG and J. Leone, Jr.) to obtain, we show for one case how $\xi$ varies in the complex plane as $\theta$ ranges from 0 to $\pi$. The spiral shown in Figure 3.16-26 for $c = 10$, which begins on the unit circle at $\theta = 0$ and proceeds in a clockwise (decreasing argument) direction, has a maximum damping ($|\xi| = 0.76$) at $\theta = 2.07$ (a nearly $3\Delta x$ wave) for which the argument is ~ −60° (from the $x$-axis). As $\theta$ increases from 2.07, the spiral reverses its path and retraces virtually the same spiral to return to its starting point at $\theta = \pi$.
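The sample point quoted above ($\theta = \pi/2$, $c = 3$) can be reproduced with a few lines. This is a minimal sketch, assuming the semi-discrete symbol $m(\theta)\dot{\hat u} + (ic\sin\theta/\Delta t)\hat u = 0$ for linear elements; the TR and BDF2 factors are the standard ones for that scalar ODE rather than formulas taken from this section, and all function names are illustrative.

```python
import cmath
import math

ALPHA = 0.1213  # explicit weight, four digits as quoted in the text
BETA = 0.1716   # implicit weight, four digits as quoted in the text


def mass_symbol(theta):
    """Consistent mass matrix symbol for 1D linear elements."""
    return (2.0 + math.cos(theta)) / 3.0


def xi_fs_theta(theta, c):
    """Amplification factor of (3.16-452)."""
    m = mass_symbol(theta)
    s = c * math.sin(theta)
    return (m - 1j * ALPHA * s) ** 2 * (m - 2j * ALPHA * s) / (m + 1j * BETA * s) ** 3


def xi_tr(theta, c):
    """Trapezoid rule applied to the same symbol (assumed standard form)."""
    m = mass_symbol(theta)
    s = c * math.sin(theta)
    return (m - 0.5j * s) / (m + 0.5j * s)


def xi_bdf2(theta, c):
    """Principal root of (3m + 2is) xi^2 - 4m xi + m = 0 (assumed standard BDF2)."""
    m = mass_symbol(theta)
    s = c * math.sin(theta)
    return (2.0 * m + cmath.sqrt(m * m - 2j * m * s)) / (3.0 * m + 2j * s)


def relative_phase_speed(c, n=500):
    """Return [(theta, u_p/u)] for theta in (0, pi], following (3.16-451).

    cmath.phase only returns the principal value, so the argument of xi is
    shifted by multiples of 2*pi to stay continuous along the sweep.
    """
    result = []
    previous = 0.0  # arg(xi) = 0 at theta = 0, where xi = 1
    for k in range(1, n + 1):
        theta = math.pi * k / n
        arg = cmath.phase(xi_fs_theta(theta, c))
        while arg - previous > math.pi:
            arg -= 2.0 * math.pi
        while arg - previous < -math.pi:
            arg += 2.0 * math.pi
        previous = arg
        result.append((theta, -arg / (c * theta)))
    return result
```

Evaluating `abs(xi_tr(math.pi / 2, 3.0))`, `abs(xi_bdf2(math.pi / 2, 3.0))`, and `abs(xi_fs_theta(math.pi / 2, 3.0))` gives values that round to the 1.0, 0.495, and 0.953 quoted in the text.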
The straight line segments (chords) in the figure represent the return path, plotted with increment $\Delta\theta = \pi/500$ ($\approx 0.36°$); the smooth curve outside the chords shows the 'outward' path ($0 < \theta < \sim 2.07$). The resulting $|\xi(\theta)|$ curve is that labeled $c = 10$ in Figure 3.16-25(b). In order to obtain the associated phase speed curve using either a hand calculator or a software package on a computer, both of which usually deal only with the principal branch in the complex plane, one must add or subtract the proper multiple of $\pi$ from $\tan^{-1} y/x$, where $\xi = x + iy$. For the curve shown, there were two subtractions of $\pi$ needed on the way out (one when the spiral crossed the imaginary axis into the left-half plane and the other when it crossed the real axis from above); these two increments needed to be 'added back' during the return portion of the spiral.

Fig. 3.16-25 Phase speed and amplitude for pure advection via FS$_\theta$: (a) phase speed; (b) amplitude. Both panels are plotted against $\theta$, with one curve per value of $C$.

Fig. 3.16-26 Argand diagram of $\xi(\theta)$ for $C = 10$ ($0 \le \theta < \pi$, $\Delta\theta = \pi/250$).

Anyway, the FS$_\theta$ scheme does appear to be a viable contender for CFD; whether it is a good way to solve ODE's in general is another matter. We also leave for the reader the details of applying FS$_\theta$ to the NS equations (or see the above Glowinski/Rannacher references). We do point out, however, that no one has, to our knowledge, applied the scheme to any but the index 2 DAE's ($u$-$P$), where, in Turek (1994, 1996, 1997), it has been applied quite effectively.

3.16.9 Other Methods (Used by Others)

As in the previous chapter, we would be remiss if we did not also mention that there are yet other ways to solve both the steady and time-dependent NS equations, still using FEM but differing in many 'details,' sometimes in major ways. Thus, tending toward 'completeness,' we cover (only) some of these below.

a. Methods based on trajectories/characteristics

The most impressive applied ('industrial') use of the (backward) method of characteristics (BMOC; see Section 2.7.7a) in real-world (3D) geometries (reactor vessels, automobiles, etc.) that we have seen takes place at Électricité de France (EDF), where tetrahedral elements permit the effective use of (Lagrangian) particle tracking ($\dot{\mathbf{x}} = \mathbf{u}$). This is made all the more efficient when thermal, species, and turbulence fields must also be computed, because one trajectory calculation applies to all ($u$, $v$, $w$, $T$, $c_i$, $k$, $\varepsilon$, etc.).
This in fact appears to be an example that strongly favors simple tetrahedra over bricks because the latter, in the isoparametric (distorted) shapes that are necessary to model complex 3D geometries, do not have flat faces (each of the six faces of a distorted brick is a curved surface), and the associated particle tracking is a very difficult task; the QiQ\ element has been dropped from the EDF codes (J.-P. Chabard, personal communication, 1994). They in fact use either the f^i element or the iso P\ - P\, and prefer the latter for big 3D problems in complex geometry. While they began their BMOC using first-order methods (BE for the ODE's and for the characteristic curves), they are currently leaning toward second-order methods (BDF2 for the ODE's, and a BE with extrapolation for the characteristic curves), at least for time-accurate unsteady simulations. (They still use BE for obtaining steady solutions.) See, for example, Janvier et al. (1992) and Boukir et al. (1996). For a sample of results involving a nuclear reactor pressure vessel, see Alvarez et al. (1992), and for flow past an automobile, see Bidot et al. (1992). Finally, for flow through and around a fully 3D (but not rotating) drill bit for the petroleum industry, see King et al. (1990) or Chabard and King (1991).

To conclude our brief discussion related to the impressive EDF Navier-Stokes code (N3S), we note that while these folks have long been fond of (and used) the so-called Uzawa method (see, for example, Cahouet and Chabard, 1988) for solving the pressure-velocity (viscous part) coupling problem, they seem lately to be more interested in the (second-order) projection method, at least for 'large' Re (not Stokes): Ng et al. (1993) and B. Nitrosso (personal communication). They still use 'Uzawa' for laminar flow simulations.
For other representative efforts in this and related areas, see the citations in Section 2.7.7a in Chapter 2 and the following, which were not listed there because they focus exclusively on the NS equations: Suli (1988), Suli and Ware (1989), and Hansbo (1992b).

b. Methods based on least squares (LSFEM)

Another method that is increasing in popularity is the LSFEM (least-squares FEM), wherein many of the problems of GFEM using mixed interpolation are avoided at the cost of solving many more equations, all first order in space. A leading proponent of this technique is B-n. Jiang, from whom we quote, 'It is well known that the Galerkin mixed method leads to a saddle point problem, thus the sophisticated LBB condition is invoked to guarantee the existence of a solution. It is notoriously difficult to verify and satisfy the LBB condition. From a numerical point of view, the most difficult problem associated with the Galerkin mixed method is that the resulting discretized algebraic equations are nonsymmetric and non-positive-definite, which are hard to deal with for large problems. All these difficulties motivated us to apply the least squares method.' (Jiang et al., 1994). These are indeed powerful arguments against GFEM, and it might just turn out that LSFEM is the 'right' way to go, or at least a good way. By representing the NS equations as a coupled system of (many) first-order PDE's, as discussed in the previous chapter for the scalar transport equation, low-order $C^0$ finite element basis functions, with equal-order interpolation in fact (for all variables), can be profitably employed. The curl form [(3.3-6)] or the rotational form [(3.4-4)] are appropriate starting points. But there are many more equations and many more unknowns, thus exhibiting the 'down side' of the method: many coupled equations (but always with symmetric matrices).
By including the definition of vorticity as an additional (vector) equation and the constraint that it be divergence-free as another (scalar), a system of eight equations in seven unknowns (three velocities, three vorticities, and pressure), which is not overdetermined (see below), is the first-order system of NS equations to which LSFEM is applied. For the steady NS equations, the large system of non-linear PDE's is first linearized (Newton's method, usually) and then the LSFEM applied. The resulting SPD systems are then solved via DSCG (also called Jacobi preconditioning). While the matrices are sparse and SPD, and while only matrix-vector products are required during the CG iterations, thus generating a method with minimal memory requirements (minimal for the nearly twice as many equations as for GFEM), the solver typically requires many thousands of iterations for a non-trivial 3D problem (B-n. Jiang, personal communication). Perhaps the planned (potential) switch over to multigrid will further increase the computational efficiency of their LSFEM.

We now briefly return to the issue of more equations than unknowns for this LSFEM. The addition of seemingly redundant constraint equations, such as $\nabla \times \mathbf{v} = 0$ when the problem $\nabla^2 w = f$ is solved via $\mathbf{v} = \nabla w$ and $\nabla \cdot \mathbf{v} = f$, or $\nabla \cdot \boldsymbol{\omega} = 0$ when the steady Stokes equations, $-\nabla P + \nu\nabla^2\mathbf{u} = \mathbf{f}$, $\nabla \cdot \mathbf{u} = 0$, are solved via $\nabla P + \nu\nabla \times \boldsymbol{\omega} = -\mathbf{f}$, $\nabla \cdot \mathbf{u} = 0$, and $\boldsymbol{\omega} = \nabla \times \mathbf{u}$ (ditto transient NS equation), was discovered by C. Chang (see Chang, 1992, and Chang and Gunzburger, 1987). In that work a so-called 'slack variable' that turns out to be identically zero was also introduced; it is needed for convergence proofs and error analysis but not needed in the 'codes,' a fact originally discovered and utilized in both 2D and 3D (but not analyzed) by B-n. Jiang (personal communication). The first of the additional constraint equations (but not the additional slack variable, a quantity not needed in the computations) used in computations appears to be in Chang (1992), but see also Jiang and Povinelli (1993), Jiang et al. (1994), and Bochev and Gunzburger (1993). This simple(?)
trick is what allows LSFEM to 'work' in general, and it removed (finally) the serious constraints that augured against the method and that had been previously promulgated in a series of papers by Fix et al. (1979a, b, 1981) in which 'success' was only assured if certain constraints were imposed; principally the so-called GDP (grid decomposition principle) or DDP (discrete decomposition principle), in which some severe constraints were placed on both the basic approximating spaces and on the nature of the grids employed. These worries are no more when the discrete forms of the continuum-redundant equations are added to the system; the otherwise limited or lost stability is then assured. The price for the simplicity (equal-order approximation and stability) is, 'simply,' more equations. For some recent theoretical results on LSFEM in the manner of the above discussion, which also 'corrects' some earlier theoretical works, see Bochev and Gunzburger (1994, 1996) and Jiang et al. (1994), the last of which also introduces some new ('non-standard') BC's for the NS equations. For a recent 'down-side' report on LSFEM, see Chang (1996) and Nelson and Chang (1995), in which it is shown that some 'problems' can arise in which the discrete, divergence-free constraint equation is too weakly enforced. (The proposed fix, which does work, reintroduces a Lagrange multiplier and an associated saddle-point problem!)

In the arena of time-dependent NS equations, the team under T. Tsang seems to be leading the pack. In Tang and Tsang (1993), they used BE (linearized) then LSFEM on both isothermal NS equations and Boussinesq equations (see Volume II). It may be worth pointing out the following experience (T. Tsang, personal communication): the use of BE plus LSFEM to simulate Karman vortex shedding past a cylinder 'failed' in the following sense: even with a small $\Delta t$, the correct shedding frequency (Strouhal number) could not be obtained.
Perhaps this is a manifestation of combining two dissipative methods, one in space and one in time. More recently (Tang et al., 1995), they have gone to fully 3D, time-dependent simulations, using the $Q_1$ element (eight-node brick) for all seven variables, (fixed step) TR for time integration, and Newton linearization followed by LSFEM with the SPD systems solved via matrix-free DSCG, in what appears to be a cost-effective method. (All they need now is a smart time integrator.) More recent yet is Tang et al. (1998).

To conclude the LSFEM discussion, we simply point out a few others who have recently tried it, but not in the same way as above: Harbord and Gellert (1991), Winderscheidt and Surana (1994), and Bell and Surana (1994).

c. Methods based on Galerkin least squares

The previous section discussed 'pure' least squares in which the least-squares criterion was the only (weighted residual) principle invoked to obtain the final FEM equations in space; time was discretized via 'conventional' ($C^0$) ODE methods. In this last little section, least squares forms only a portion of the methodology, and often the ODE portion is done via discontinuous methods (Galerkin, as discussed in the previous chapter, Section 2.1.Id, e). Another difference is that the methods to be summarized below (via citations) do not reduce second-order operators to first order by introducing new variables; rather, they retain the second-order operators, but only apply least squares on element interiors, wherein the required differentiability is present (the $C^0$ basis functions of FEM are infinitely differentiable inside each element, with most higher-order derivatives being zero). A sample of recent publications in this area includes: Hansbo and Szepessy (1990), Hauke and Hughes (1994), and some involving in addition free and moving surfaces: Tezduyar (1992), Tezduyar et al. (1992a, b), and Hansbo (1992b). These advanced methodologies also employ (usually) the discontinuous-in-time Galerkin method for the ODE integration.
Finally, for a brief comparison of characteristic methods and GLSQ methods, see Pironneau et al. (1992), in which it was determined (O. Pironneau and T. Tezduyar, personal communication) that the latter is easier to formulate but the former is less expensive. For some more recent GLSQ results, see Behr and Tezduyar (1994) and Mittal and Tezduyar (1994).

3.16.10 A Strategy for Hastening Steady Solutions

To conclude our discussion of both time-dependent (mostly) and steady methods for the NS equations, we combine them via a strategy that employs (almost) 'any' time-marching method but switches, cleverly we hope, to a steady solution method when it is deemed likely that a steady state exists, and one is not using a smart implicit integrator, so that quite a few more timesteps would be required to actually attain it. The method to be discussed below is related to one that was partially, but successfully, tested by McCallen (1993).

Suppose one is integrating in time using a fixed-step ODE method and that it is suspected that a stable steady state exists, or the algorithm can be designed to test for 'approach to steady state' in any of several ways. Here is one: monitor the time-history of several key velocity 'nodes' and/or the kinetic energy (less sensitive) of the flow, $u^T M u$, and test for steady state 'once-in-a-while' by monitoring $\Delta u_i$ vs time, where $\Delta u_i$ is the change in $u$ at node $i$ per timestep. If $\Delta u_i$ is of constant sign ($\Delta u_i > 0$ or $\Delta u_i < 0$) for 'many' steps and decreasing monotonically in absolute value, then it may be asymptotically approaching a steady state. When these tests are passed and an appropriate norm of the velocity (or KE) increment drops below a user-specified tolerance ($\varepsilon$), one could try the following algorithm, first in words: guess via exponential extrapolation the steady-state solution by assuming that only the smallest eigenvalue is still 'active' and use this guess as a first guess in a steady-state solution algorithm. As an algorithm it might be the following:

Step 1. For each variable monitored, $x_i$, use three successive (or three not successive, perhaps $10\Delta t$ apart) computed values of the solution at three known times to fit the coefficients of the assumed exponential behavior, $x_i = a_i + b_i e^{-c_i t}$.

Step 2. Solve for $a_i$, $b_i$, $c_i$ from the three equations (see below).

Step 3. Use $x_i = a_i$ (the limit $t \to \infty$) as the first guess in the steady-state iterative solver.

Remarks: (1) If $\varepsilon$ is sufficiently small, then the first guess, $a_i$, may actually be good enough, and could, as a first cut, be used to report the steady-state solution.

(2) The $3 \times 3$ system leads to a nonlinear equation for $c$:

$$R \equiv \frac{x_2 - x_1}{x_3 - x_2} = \frac{e^{-ct_2} - e^{-ct_1}}{e^{-ct_3} - e^{-ct_2}},$$

where the indices now apply to the three time levels, and we have omitted the nodal index $i$, for simplicity. The Newton method should be effective for finding $c$:

$$c_{m+1} = c_m + \frac{(R+1)e^{-c_m t_2} - (Re^{-c_m t_3} + e^{-c_m t_1})}{t_2(R+1)e^{-c_m t_2} - (t_3 Re^{-c_m t_3} + t_1 e^{-c_m t_1})}.$$

Once this has converged, compute $a$ from (e.g.)

$$a = x_2 - be^{-ct_2} = x_2 - \frac{x_3 - x_2}{e^{-ct_3} - e^{-ct_2}}\,e^{-ct_2}.$$

(3) A much simpler method, thanks to S. Chan (personal communication), is this: take $t_2 = t_1 + \delta t$ and $t_3 = t_2 + \delta t$, where $\delta t$ is the chosen time interval (for example, an integral number of timesteps).
This equal-interval approach causes the above equation to simplify considerably: $R = e^{c\,\delta t}$, giving $c = (\ln R)/\delta t$, $b = (x_2 - x_1)e^{ct_1}/(R^{-1} - 1)$, and the final desired result is $a = (x_1 - Rx_2)/(1 - R) = x(t = \infty)$.

(4) We have not tested this idea, but nevertheless advocate it.

3.17 ALIASING AND ALIASING INSTABILITY, LINEAR AND NON-LINEAR

The literature on 'aliasing' in finite element CFD is rather sparse and, in our opinion, rather confusing; the latter may help to explain the former. The concept of aliasing is fairly clear, but its consequences are not. If one attempts to place on a mesh (as in IC, say) a waveform with a spatial frequency content that cannot be 'resolved' by the grid ($\lambda < 2\Delta x$), or if such a wave tends to be 'generated' by the numerical solution procedure (typically via a product, such as $\mathbf{u} \cdot \nabla T$, of two short-but-resolvable waves), the grid will misinterpret the (too) short wave as a longer wave that it can resolve. This is aliasing. If aliasing occurs and if the resulting time-integration (with a stable marching scheme, or with $\Delta t \to 0$) becomes unstable (solution grows without bound), then the result is often called an aliasing instability. If the aliasing is caused by a non-linear product (such as $\mathbf{u} \cdot \nabla\mathbf{u}$) and if the resulting ODE (or DAE) behavior becomes unstable, then the result is often called non-linear aliasing instability. Finally, if the ODE's are linear (e.g., $\mathbf{u}$ given when solving the scalar transport equation) and the product $\mathbf{u} \cdot \nabla T$ is still deemed the generator of aliased modes, and if the resulting ODE becomes unstable, then the term 'linear aliasing instability' is sometimes used.

Before delving further into this issue, we ask the reader to refer back to our first meeting with it, via the Remark in the previous chapter, in Section 2.3.1 between (2.3-9) and (2.3-10), wherein some concepts, and some previous work, were discussed. Before getting any deeper into the alias issue, which itself seems to arise under aliases, we present a brief sample of the literature, in the form of interesting quotations, after mentioning that Canuto et al. (1988a) is both a useful general reference and a particular reference on aliasing in spectral methods.

1. In a truncated Fourier expansion (spectral, $e^{ikx}$) method, Orszag (1972) states, '...
where the summation terms are referred to as 'aliasing' terms; the 'aliases' $k_m = k + mN$ of $k$ satisfy $\exp(ik_m x_n) = \exp(ikx_n)$ so that they are indistinguishable from $k$ on a discrete grid.'

2. 'The confusion of frequencies is an inevitable consequence of discretization' (Roache, 1982). He, of course, is discussing FDM's and their associated, discrete grid point values.

3. In discussing some FDM's, Orszag (1971) states, 'Schemes with no quadratic semi-conservation properties may be unstable due to aliasing errors.' Here, quadratic semi-conservation means stable ODE's.

4. In the same paper, 'The energy-conserving finite difference schemes discussed (above) have aliasing errors (Lilly, 1965; Grammeltveldt, 1969), but they are not susceptible to aliasing instability.'

5. In Gary (1979) appears, 'The analysis of Richtmyer and Phillips indicates that the non-linear instability may be due to a distortion of the non-linear interaction between wave numbers caused by the discrete mesh.' This is what is usually meant by non-linear aliasing instability.

6. Again from Roache (1982): 'Since it conserves $\zeta^2$ (vorticity squared, enstrophy), it is not subject to the non-linear instability of Phillips (1959), which arises from aliasing errors.' (Aliasing errors are present but remain bounded, since $\zeta^2$ remains bounded.) He is here discussing Arakawa's famous method (Arakawa, 1966), which later just happened to also be 'a finite element method'; see Jespersen (1974).

7. 'When enstrophy is not conserved, the average (spatial) frequency $\bar{k}$ will in general not be constant, and in such cases Phillips (1959) has shown that numerical instabilities can be created by high-frequency noise cascading down into the low frequencies. Phillips calls these aliasing errors' (Fix, 1975). He then goes on to show that his finite element method of 'ocean circulation' assures that enstrophy is conserved: 'We thus conclude in this case that the semi-discrete finite element model will have no aliasing errors.' This conclusion follows after his proof of conservation of mean wave number, $\bar{k}$.

8. 'The difference in the accuracy of finite element and finite difference methods is analyzed to illustrate the removal of "aliasing" by the Galerkin approach,' and 'In practice, the grid-point projection is known to be unsatisfactory for non-linear analysis if propagation is possible between points. The errors usually called "aliasing" errors are exactly the spatial evolutionary errors in evaluating products as in (6) by direct multiplication at grid points...' (both in Cullen, 1976).

9. In the same WMO publication containing Gary's article on 'Non-linear Instability,' Cullen (1979) has one in which he states, 'In a numerical method where a function is defined over the whole domain by finite elements or other means, there can be no aliasing because there is no ambiguity.' He means, of course, that a term such as $\mathbf{u}_h \cdot \nabla T_h$ is, via the basis function expansions, a well-defined quantity at all points in space, both at mesh points and in between.

10. 'For the same number of degrees of freedom as a finite difference scheme, the finite element method is more accurate and eliminates the possibility of aliasing energy cascading onto the truncation scale and then back into the larger resolvable scales' (Wyngaard et al., 1984). Note too the use of 'scheme' vis-a-vis 'method,' not the only place these identifiers have been employed (cf. Strikwerda, 1989).

11. Stopping short of a dozen, we end with a statement from the monograph by Gottlieb and Orszag (1977), 'However, Galerkin approximation is sometimes very attractive because it gives approximations that are conservative and have no so-called aliasing errors.'
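The grid-point sense of aliasing in quotation 1 (and the product mechanism described at the start of this section) is easy to check numerically. The sketch below is only an illustration for point values on a uniform periodic grid $x_n = 2\pi n/N$, i.e., the FDM/collocation setting; it says nothing about GFEM, and the helper names are hypothetical.

```python
import cmath
import math


def grid_samples(k, n_points):
    """Sample exp(i*k*x) at the grid points x_n = 2*pi*n/n_points."""
    return [cmath.exp(1j * k * 2.0 * math.pi * n / n_points) for n in range(n_points)]


def alias_of(k, n_points):
    """Wavenumber in (-N/2, N/2] that the grid cannot distinguish from k."""
    k_mod = k % n_points
    if k_mod > n_points // 2:
        k_mod -= n_points
    return k_mod


N = 8
# k and k + N coincide at every grid point.
gap = max(abs(a - b) for a, b in zip(grid_samples(3, N), grid_samples(3 + N, N)))

# Pointwise product of two resolvable k = 3 waves has wavenumber 6 > N/2,
# which the grid sees as the longer wave with wavenumber 6 - 8 = -2.
product = [a * b for a, b in zip(grid_samples(3, N), grid_samples(3, N))]
alias_gap = max(abs(p - q) for p, q in zip(product, grid_samples(alias_of(6, N), N)))
```

Both `gap` and `alias_gap` are zero to rounding error, which is the 'confusion of frequencies' of quotation 2.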
So, it looks like all is right with the world if you employ GFEM: no aliasing and therefore no aliasing instability. Or is it really that simple? What of the several unstable ODE discussions (plus one example) in the previous chapter? These were true Galerkin, true (honest) FEM. Could it be that the 'conventional wisdom,' if our sample above suffices to summarize it, is not always correct? We definitely believe that to be the case; whether or not GFEM is free of aliasing, we know for a fact that it is not always free of instability. In fact, we now know rather well that the most important asset for assuring stability of the ODE's and DAE's is the assurance of a skew-symmetric advection matrix, a coveted attribute that is easiest to obtain for contained flow ($\mathbf{n} \cdot \mathbf{u} = 0$ on $\Gamma$) or periodic BC's when we take $\beta = 1/2$ (velocity or temperature). It is not so easy otherwise. Also, did we see aliasing, stable or not, in our numerical example in Section 2.8.1? Probably not; what we saw in the unstable ODE case in Chapter 2 was one or more eigenmodes of $M^{-1}N(u)$ whose eigenvalues had positive real parts, thus ultimately causing unbounded growth for virtually any IC. What we did have there is a condition that is now fairly well recognized: rapidly varying coefficients in $u(x)$ that, if $\beta \ne 1/2$, generated unstable ODE's. That is to say, in at least Kreiss and Oliger (1973) and Roache (1982) (plus those mentioned in Chapter 2), it is recognized that linear equations with variable coefficients can cause unstable behavior.
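The role of the skew-symmetric advection matrix can be illustrated in 1D. The following is a minimal sketch, assuming linear elements on a uniform periodic mesh, the advective-form matrix $N_{ij} = \int \varphi_i\, u_h\, \partial\varphi_j/\partial x$ (for which the mesh spacing cancels), and the $\beta = 1/2$ matrix taken as $(N - N^T)/2$, which follows from integrating the divergence form by parts under periodic BC's. The function names are illustrative.

```python
def advective_matrix(u):
    """Assemble N_ij = integral(phi_i * u_h * dphi_j/dx) on a periodic mesh.

    u holds the nodal advecting velocities; element e joins node e to node e+1
    (wrapping around), and the element length cancels out of each entry.
    """
    n = len(u)
    matrix = [[0.0] * n for _ in range(n)]
    for a in range(n):
        b = (a + 1) % n
        wa = (2.0 * u[a] + u[b]) / 6.0  # integral of phi_a * u_h, per unit length
        wb = (u[a] + 2.0 * u[b]) / 6.0  # integral of phi_b * u_h, per unit length
        matrix[a][a] -= wa
        matrix[a][b] += wa
        matrix[b][a] -= wb
        matrix[b][b] += wb
    return matrix


def skew_form(matrix):
    """beta = 1/2 advection matrix under periodic BC's (assumed (N - N^T)/2)."""
    n = len(matrix)
    return [[0.5 * (matrix[i][j] - matrix[j][i]) for j in range(n)] for i in range(n)]


def quadratic_form(matrix, x):
    """Return x^T A x; it vanishes for every x when A is skew-symmetric."""
    n = len(x)
    return sum(x[i] * matrix[i][j] * x[j] for i in range(n) for j in range(n))


u_nodes = [0.0, 1.0, 0.0, -1.0]  # rapidly varying advecting velocity
N_adv = advective_matrix(u_nodes)
S_half = skew_form(N_adv)
spike = [1.0, 0.0, 0.0, 0.0]
```

For $M\dot{T} + N(u)T = 0$ the quadratic form controls the 'energy' rate, $\frac{d}{dt}\left(\tfrac12 T^T M T\right) = -T^T N T$. Here `quadratic_form(N_adv, spike)` is $-1/3$, so the advective form lets this mode grow, while `quadratic_form(S_half, x)` is zero for any `x`.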
ALIASING AND ALIASING INSTABILITY, LINEAR AND NON-LINEAR 789 We will conclude this section with a discussion of the ID, inviscid Burger's equation, du du — +u— =0, (Kjc^I, (3.17-1) at dx a non-linear hyperbolic PDE, which we shall semi-discretize in several ways, for both periodic and Dirichlet BC's on a uniform mesh. 1. FDM1 iii + Ui(ui+\ — w,_i)/2/ = 0; (3.17-2) 2. FDM2 iii + \[Ui(ui+l -M(_,)/2/ + (^+i -«72_,)/2/] = 0, (3.17-3) which might have come about by first rewriting (3.17-1) in the equivalent form du \ ( du du \ ¥ + 3(> + l*)=0' (3-,7"4) 3. FEM1 (advective form: fi = 0). From (2.3-9) et seq. in the previous chapter (let Tj —► uj), we get 1 1 -(ui-i +4w,- +w,-+i)+ —[(2uj + Uj+i)uj+i 6 6/ + (uj-i — Uj+\)uj — (2uj + Uj-\)uj-\] = 0, (3.17-5) which rearranges to g(«,-_i +4Ui■ + U;+\)+ ^(ut-\ + Ui + ui+\)(ui+\ —Ui-\)/2l =0. (3.17-6) 4. FEM2 (/J = 1 /2). From the same section of Chapter 2, 1 1 / u; + u-l+\ Ui + ui-\ \ ~(w/-i +4i/,- + ui+\)+ — I ui+i ui-i 1 = 0, (3.17-7) and we omit the /3 = 1 case. Remarks: (1) (3.17-2) is often unstable (Fornberg, 1973; Kreiss and Oliger, 1973; and Gary, 1979). (2) (3.17-3) is a stable ODE under periodic BC's (lac. cit.). (3) (3.17-3) and (3.17-6) are equivalent when mass lumping is invoked; i.e., the stable FDM and the LMFEM with linear basis functions are the same. (4) (3.17-7) shows a skew-symmetric advection matrix and is therefore—up to BC's at least—also a stable ODE. (It is clearly skew-symmetric, and therefore stable, for periodic BC's.) (5) The earliest citation that we know of that discusses (3.17-6) is that of Swartz and Wendroff (1969)—a paper that also introduces (we believe) the 'product approximation'; for the latter, see also Christie etal. (1981) and Fletcher (1991b) and references therein). It is interesting, and probably rare, that the simple advective form (/3 = 0) of the GFEM equations above is guaranteed to be stable (at least for periodic BC's); i.e., even in the
790 THE NAVIER-STOKES EQUATIONS absence of a skew-symmetric matrix. Proof of stability: rewrite (3.17-6) as Mu + N(u)u = 0 (3.17-8) and form uTN(u)u—if it vanishes, the ODE is stable. We have t x~^ f ( dfa\ f h 2^u u'N(u)u = 2_^Ui J (Pi(uj(pj) I Uk-jr- I = / (" ) y- ijk 3 J a* ' 3y 'l0 because of periodicity. Finally, we point out, following on from Cullen (1979), that advection products, like u • Vr or u • Vw, cannot generate 'aliases' of longer wavelength even if both u and T are varying as fast as possible; i.e., even if u(x) is a 2 A* wave (ID for simplicity) and T(x) another 2Ajc wave, the product of u and dT/dx is also simply a third 2A* wave—a statement whose proof we leave as an exercise. We believe that the following summary of aliasing and GFEM is reasonably accurate: 1. GFEM is not susceptible to aliasing, except in the sense discussed in Section 2.6—unresolvable initial data. 2. Hence, GFEM is not susceptible to aliasing instability, whether linear or non-linear. 3. GFEM is, however, susceptible (if fi ^ 1/2) to generating unstable ODE's—both linear and non-linear. 4. Stable GFEM ODE's can always be assured—at least when n • u = 0 or periodic BC's are employed—simply by using the fi — \/2 form of the advection operator. *3.18 A NEW LOOK AT TWO OLD FINITE DIFFERENCE METHODS Anyone who has the interest or need to go back in history to see the origins of some methods might be misled because it is a common occurrence that those discovering/inventing new methods do not always fully understand them when testing them and writing about them. (And the authors of this book are surely no exception!) Such is the case in several papers to be described here: the first, evolving from the MAC (Welch and Harlow 1965) and SMAC (Amsden and Harlow, 1970) methods at Los Alamos Scientific Laboratory (LASC) and the second from France. The former include the SMAC (simplified marker and cell) 'improvement' over the original MAC method of Harlow et al. 
[See, for example, Harlow and Welch (1965), which in many ways was more 'correct' than its SMAC successors. Also, C.W. Hirt has opined (personal communication) that SMAC was never 'needed,' an opinion with which we concur.] The SMAC 'improvements' appeared, e.g., in Amsden and Harlow (1970) and Easton (1972). [Then there is the simpler-yet and less confused manuscript by Hirt et al. describing SOLA (SOLution Algorithm?); in Hirt et al. (1975) is described a 'proper' approach: rather than trying to apply the PPE on the boundary, which, as we shall soon see, is at the root of most of the problems, Hirt et al. applied $\nabla \cdot \mathbf{u} = 0$ there, which, as we have seen, is the proper approach.] From the French group, we have Fortin et al. (1971), recently summarized and somewhat reinterpreted by Peyret and Taylor (1983).

Also, the French (and others!) are sometimes guilty of employing a notational convenience that is strictly illegal and thus valid only when properly interpreted; namely, they write semi-discretized equations that are discrete in time and continuous in space, with the time discretization being explicit (FE in fact), which is illegal for two reasons, even for the simple 'model' equation, the transient heat equation. The first reason: the semi-discretization of $u_t = \nu\nabla^2 u$ via $(u_{n+1} - u_n)/\Delta t = \nu\nabla^2 u_n$ gives, at least away from $\Gamma$, $u_n = (1 + \nu\Delta t\nabla^2)^n u_0$, which is unbounded as $n \to \infty$ for all $\Delta t > 0$. The second occurs at the boundary; for $u = w(t)$ as the Dirichlet BC, the equation $u_{n+1} = (1 + \nu\Delta t\nabla^2)u_n$ for $x \to \Gamma$ is compatible with the BC $u_{n+1} = w_{n+1}$ on $\Gamma$ if and only if $w(t)$ is sufficiently smooth [so that $w_{n+1} = w_n + \Delta t\,\dot{w}_n + O(\Delta t^2)$ is true], which unnecessarily restricts the boundary function, $w(t)$. These results are of course completely compatible with the well-known stability limit of the full discretization, $\Delta t \le C\Delta x^2/\nu$, which tells us that stability requires $\Delta t \le 0$ for the continuum limit(!). Thus, in what follows we must interpret the semi-discrete equations as actually fully discrete ($\nabla = \nabla_h$, etc.).

In this section we shall lift the veil of confusion that has surrounded these two methods for so many years, and reveal them for what they are: forward Euler. Yes, they are not really two distinct methods; each is a confused presentation of the simplest possible NS algorithm. Our explanation will utilize the original (and still popular/useful) MAC grid, as shown below at a solid boundary, in which fictitious/phantom cells are (unnecessarily) introduced as part of the method. All subsequent effort is directed toward the cell labeled $P_0$ in Figure 3.18-1, a representative 'boundary' cell.
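To show that the fictitious cells, velocities, and pressures (and all that they entail) are truly unnecessary, we begin by applying FE to

$$\frac{\partial \mathbf{u}}{\partial t} + \nabla P = \nu \nabla^2 \mathbf{u} - \mathbf{u} \cdot \nabla \mathbf{u} \equiv \mathbf{f}(\mathbf{u}) \tag{3.18-1}$$

[Figure: staggered cells of width $l$ and height $h$ on either side of the wall; the 'Fluid' side carries the cell-centered pressures and the face velocities around cell '0', and the 'Fiction' side carries the fictitious cell and its variables.]
Fig. 3.18-1 MAC-Type (staggered) mesh near a wall.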
and

$$\nabla \cdot \mathbf{u} = 0 \quad \text{in } \Omega, \tag{3.18-2}$$

with $\mathbf{u} = \mathbf{w}(t)$ on $\Gamma$. Recall first the FE algorithm; given $\mathbf{u}_m = \mathbf{u}(t_m)$ with $\nabla \cdot \mathbf{u}_m = 0$:

$$(\mathbf{u}_{m+1} - \mathbf{u}_m)/\Delta t + \nabla P_m = \mathbf{f}_m \equiv \mathbf{f}(\mathbf{u}_m) \tag{3.18-3}$$

and

$$\nabla \cdot \mathbf{u}_{m+1} = 0. \tag{3.18-4}$$

Focusing on 'cell 0,' we begin with the discretized version of (3.18-4):

$$\frac{u_E^{m+1} - u_W^{m+1}}{l} + \frac{v_N^{m+1} - v_S^{m+1}}{h} = 0, \tag{3.18-5}$$

wherein the velocities are obtained from the discretized momentum equations, unless they are known through the BC:

$$u_E^{m+1} = u_E^{m+1}(\Gamma) = w_E^{m+1} \ \text{is given}, \tag{3.18-6}$$

$$u_W^{m+1} = u_W^m + \Delta t\,[f_W^x(\mathbf{u}_m) - (P_0^m - P_W^m)/l], \tag{3.18-7}$$

$$v_N^{m+1} = v_N^m + \Delta t\,[f_N^y(\mathbf{u}_m) - (P_N^m - P_0^m)/h], \tag{3.18-8}$$

and

$$v_S^{m+1} = v_S^m + \Delta t\,[f_S^y(\mathbf{u}_m) - (P_0^m - P_S^m)/h]. \tag{3.18-9}$$

Now use the FE-compatible time variation of $w(t)$ on $\Gamma$,

$$u_E^{m+1}(\Gamma) = u_E^m(\Gamma) + \Delta t\,\dot{u}_E^m(\Gamma), \tag{3.18-10}$$

and the discrete continuity equation (3.18-5) at time $t_m$ to obtain, upon inserting the above velocities into the above continuity equation at $t_{m+1}$, the simple result

$$[\dot{u}_E^m + (P_0^m - P_W^m)/l - f_W^x(\mathbf{u}_m)]/l + [f_N^y(\mathbf{u}_m) - f_S^y(\mathbf{u}_m)]/h - (P_N^m - 2P_0^m + P_S^m)/h^2 = 0, \tag{3.18-11}$$

which is the consistent PPE at the boundary. It is quite clear that multiplication by $l$ and letting $l$, $h$ 'shrink' yields

$$\dot{u}_E + \partial P/\partial x|_0 - f_W^x(\mathbf{u}_m) = l\,\frac{\partial}{\partial y}\left[\frac{\partial P}{\partial y} - f^y\right] + O(l, h), \tag{3.18-12}$$

with the limit ($l, h \to 0$) giving

$$\partial P/\partial x|_\Gamma = (f^x - \partial u/\partial t)|_\Gamma, \tag{3.18-13}$$

which is the proper Neumann BC for the PPE.

Remark: In practice, the FE equation for $u_E^{m+1}(\Gamma)$ would be simply replaced (usually) by $w_E^{m+1}$; the above description is needed only to help understand the actual algorithm.

So much for the simple and straightforward way. Now let us invoke the SMAC and FP (French Projection, explicit version) algorithms and, later, try to understand them; given $\mathbf{u}_m$ with $\nabla \cdot \mathbf{u}_m = 0$, they are:
Step 1. Omit $\nabla P$ from the momentum equation and compute a provisional (or fractional step) velocity field, $\tilde{\mathbf{u}}$, from

$$(\tilde{\mathbf{u}} - \mathbf{u}_m)/\Delta t = \nu \nabla^2 \mathbf{u}_m - \mathbf{u}_m \cdot \nabla \mathbf{u}_m \equiv \mathbf{f}(\mathbf{u}_m). \tag{3.18-14}$$

Step 2. Solve

$$(\mathbf{u}_{m+1} - \tilde{\mathbf{u}})/\Delta t + \nabla P_m = 0 \tag{3.18-15}$$

and

$$\nabla \cdot \mathbf{u}_{m+1} = 0 \tag{3.18-16}$$

for $\mathbf{u}_{m+1}$ and $P_m$ via formation of the implied PPE,

$$\nabla^2 P_m = \nabla \cdot \tilde{\mathbf{u}}/\Delta t. \tag{3.18-17}$$

Then compute $\mathbf{u}_{m+1} = \tilde{\mathbf{u}} - \Delta t\,\nabla P_m$.

Remarks: (1) We shall discuss BC's below. (2) In SMAC, $P_m$ is called $\psi/\Delta t$, and $\psi$ is called a 'potential function.' (3) In the FP, $P_m$ is called $P_{m+1}$, and the second step is called a projection. (4) Both $\psi/\Delta t$ and $P_{m+1}$ are really $P_m$. (This, of course, was part of the confusion.)

Returning to Figure 3.18-1, this time using also the fictitious quantities on the other side of $\Gamma$, the SMAC and FP approaches write a discrete PPE (rather than $\nabla \cdot \mathbf{u} = 0$) for cell '0'; i.e.,

$$\frac{P_E^m - 2P_0^m + P_W^m}{l^2} + \frac{P_N^m - 2P_0^m + P_S^m}{h^2} = \frac{1}{\Delta t}\left[\frac{\tilde{u}_E - \tilde{u}_W}{l} + \frac{\tilde{v}_N - \tilde{v}_S}{h}\right], \tag{3.18-18}$$

where, from (3.18-14),

$$\tilde{u}_W = u_W^m + \Delta t\,f_W^x(\mathbf{u}_m), \tag{3.18-19}$$

$$\tilde{v}_N = v_N^m + \Delta t\,f_N^y(\mathbf{u}_m), \tag{3.18-20}$$

$$\tilde{v}_S = v_S^m + \Delta t\,f_S^y(\mathbf{u}_m), \tag{3.18-21}$$

and

1. SMAC:

$$\tilde{u}_E = u_E^{m+1}(\Gamma); \tag{3.18-22}$$

2. FP: Same as SMAC in Fortin et al. (1971), but Peyret and Taylor (1983) say that it does not matter: 'The essential feature of the projection method is that the numerical solution is independent of $\tilde{\mathbf{u}}(\Gamma)$.' Interesting ....

But $P_E$ is also fictitious, and the following choices were made:

1. SMAC: $P_E = P_0$, which seems to be related to statements about 'homogeneous' BC's for the PPE; i.e., if $P_E = P_0$, it would seem that $\partial P/\partial n = 0$ on $\Gamma$.
2. FP: the Neumann BC inherent in the projection step, $\partial P_m/\partial n = \mathbf{n} \cdot [\tilde{\mathbf{u}} - \mathbf{u}_{m+1}(\Gamma)]/\Delta t$ from (3.18-15), or, in discrete form for cell '0,'

$$\frac{P_E^m - P_0^m}{l} = \frac{\tilde{u}_E - u_E^{m+1}(\Gamma)}{\Delta t}, \tag{3.18-23}$$
an equation whose interpretation is vague at best. If we then chose $\tilde{u}_E = u_E^{m+1}(\Gamma)$, as did Fortin et al. (1971), we obtain $P_E = P_0$, à la SMAC. If we do not choose either $P_E$ or $\tilde{u}_E$, per Peyret and Taylor (1983), but simply insert (3.18-23) into (3.18-16), it follows immediately that both $P_E$ and $\tilde{u}_E$ 'vanish' from the algorithm, as stated above. [Note that this vanishing only occurs if the approximation to (3.18-15) is that given by (3.18-23).] Thus, in all cases, placing the fictitious velocity and pressure ($\tilde{u}_E$ and $P_E$) into (3.18-18) gives

$$\frac{P_W^m - P_0^m}{l^2} + \frac{P_N^m - 2P_0^m + P_S^m}{h^2} = \frac{1}{\Delta t}\left[\frac{u_E^{m+1}(\Gamma) - \tilde{u}_W}{l} + \frac{\tilde{v}_N - \tilde{v}_S}{h}\right], \tag{3.18-24}$$

an equation whose misinterpretation has (we believe) led to the mistaken belief that it converges to $\partial P/\partial x = 0$ (multiply by $l$ and then let $l, h \to 0$). But the proper interpretation comes about as follows: insert (3.18-10) and (3.18-19) through (3.18-21) into (3.18-24) to obtain

$$\frac{P_W^m - P_0^m}{l^2} + \frac{P_N^m - 2P_0^m + P_S^m}{h^2} = \frac{1}{\Delta t}\left[\frac{u_E^m(\Gamma) + \Delta t\,\dot{u}_E^m(\Gamma) - u_W^m - \Delta t\,f_W^x(\mathbf{u}_m)}{l} + \frac{v_N^m + \Delta t\,f_N^y(\mathbf{u}_m) - v_S^m - \Delta t\,f_S^y(\mathbf{u}_m)}{h}\right], \tag{3.18-25}$$

giving

$$\frac{P_W^m - P_0^m}{l^2} + \frac{P_N^m - 2P_0^m + P_S^m}{h^2} = [\dot{u}_E^m - f_W^x(\mathbf{u}_m)]/l + [f_N^y(\mathbf{u}_m) - f_S^y(\mathbf{u}_m)]/h \tag{3.18-26}$$

because $[u_E^m(\Gamma) - u_W^m]/l + (v_N^m - v_S^m)/h = 0$. Multiplication by $l$ and letting $l, h \to 0$ now clearly gives (3.18-13), the proper PPE BC. In fact, it is also clear that (3.18-26) is a 'repeat' of (3.18-11) and we are done: both SMAC and FP are nothing other than FE applied properly; the fictitious variables really are fictitious and totally unnecessary, as is the 'PPE on $\Gamma$' approach.

3.19 NUMERICAL EXAMPLE-IMPULSIVE START

3.19.1 Introduction

We conclude this overly long chapter with but a single example (to help compensate?): flow past a circular cylinder via an 'impulsive' start from rest. But it will turn out to be a long example because of the many different issues and problems that accompany it, the first one being this: an impulsive start is a mathematical impossibility for incompressible flow; it is an ill-posed IBVP.
Please refer back to Section 3.9.1 for further important details, and for an introduction to the problems to be 'solved' below.

So why then is an ill-posed problem our single selected example? There are several reasons:

1. It is a very common 'misconception' in much of the fluid mechanics literature, and one that is in need of further clarification.
2. We want to show how and why some came to believe that an impulsive start from a fluid at rest was actually that (perhaps by not looking carefully enough at the results after a
single time step), and to note that those in the know knew it for what it was: a potential flow initial condition, rather far from a fluid at rest!
3. We want to show how to 'legally' and easily (in principle at least) approximate the so-called impulsive start (nearly) as closely as you like (or can afford), via an inlet BC of, for example, $u = w_0(1 - e^{-\lambda t})$ for 'large' $\lambda$. Note that a nearly equivalent startup could be obtained with a normal traction BC of the type $f_n = \lambda F e^{-\lambda t}$ at the inlet, where $F$ is the desired 'impulsive' stress ($\int_0^\infty f_n\,dt$); i.e., we apply a large but rapidly decaying 'force' at the inlet. We chose the Dirichlet BC so as to get a better handle on the Reynolds number. See Gresho et al. (1980a) for such a 'normal-force' startup (albeit non-impulsive). Note too that the chosen 'exponential' start is only one of an infinite number of time-dependent BC's that could approximate an impulsive start from rest.
4. We want to 'show off' a smart integrator (variable $\Delta t$) on a problem with widely separated time scales; i.e., the very small acceleration time constant, $\tau_{ac} = 1/\lambda$, required to get close to an impulse, would severely challenge any fixed-$\Delta t$ integration method. 'Our' integrator just takes it in its stride.
5. Another feature, related to the chosen problem, is the calculation of potential flow with a Navier-Stokes 'solver', which requires that the code be able to rotate the $x$- and $y$-momentum equations to the normal and tangential momentum equations, à la Section 3.13.1e, in order to apply the appropriate inviscid BC's: $\mathbf{n} \cdot \mathbf{u} = 0$ and free slip (zero shear), which we will also demonstrate.
6. The next reason was actually unplanned: it is a simulation that makes the simulator realize that s/he, and the code employed, has been severely 'numerically challenged'. We will show what appear to be mesh-converged pressure solutions that are really not, a consequence of not being able to properly simulate a vortex sheet.
The easiest way to get into this 'pickle' (which we shall later demonstrate) is to try the impulsive start in the manner discussed in Section 3.9.1; namely, take a potential flow and apply to it the no-slip BC on the cylinder: 'add viscosity and apply the brakes'. Also responding to the numerical challenge was our friend and colleague, David Gartling, who, at our request, provided some significant help early on by verifying some of our early 'strange' results with his own code (NACHOS); see Gartling (1987). (He was also smart enough to 'bail out' early!)
7. The final 'reason' was also unplanned, but worthwhile: to show how badly a great 'Stokes element' can perform when trying to solve problems that are closer to potential flow than to Stokes flow. In fact our favorite 2D viscous flow element, $Q_2P_{-1}$, let us down (via wiggles) during the phase of startup in which viscous effects were either negligible or too 'localized' to capture, a result that also beckons the FEM theorists to get more 'active' in the time-dependent 'arena'. We also mention 'up-front', and show later, that the presumably less stable $Q_1Q_0$ element beats $Q_2P_{-1}$ badly, at least with respect to wiggles.

It is not our purpose either to provide an extensive bibliography on this much-studied problem, or to attempt serious comparison of our results with those of others; the latter partly because we employed a rather restrictive (small) and bounded domain that in no way is meant to approximate the unbounded situation. Rather, we will be closer to following Sarpkaya's suggestion (1989 personal communication) that fully numerical simulations would do well to attack truly accelerating flows rather than just the limiting case
of a potential flow IC. [See too Sarpkaya (1996) for more recent discussion and citations, and Telionis (1989) for transient incompressible flow in general, and impulsive starts in particular.] He also (in Sarpkaya, 1992) comments on the impossibility of generating a 'truly impulsive flow either numerically or experimentally,' the latter needing to additionally deal with compressibility and cavitation effects in liquids (even if the required infinite force were available). We too will have much to say about its numerical impossibility. A very recent reference covering virtually all aspects of flow past circular cylinders is Zdravkovich (1997). Finally, as we are really mainly interested in 'small' time behavior, we shall invoke symmetry and save a factor of 2 or so in each run, which is justifiable since the full flow is symmetric for any Re for $t$ sufficiently small.

We shall investigate two types of 'impulsive' starts (potential flow IC and no-flow IC with large acceleration) for two values of the Reynolds number (1000 and 0) for two elements ($Q_2P_{-1}$, which we shall call 9/3 for short, and $Q_1Q_0$, which we shall call 4/1) using two time integrators (BE and TR), both in the 'smart' (variable-step) mode. Fear not, however: we shall only discuss one of these eight combinations in any detail; namely 9/3 via TR on the no-flow IC at Re = 1000. (We do want the reader to know, however, that we spent many months examining all eight possibilities, plus others!)

3.19.2 Domain, mesh, BC's, IC's

After several 'mesh iterations', we settled on that shown in Figure 3.19-1 for most of our 'runs' (for the 9/3 element; for 4/1 divide by 2 in each direction, roughly, since 9-node meshes are graded differently than 4-node in order to keep mid-side nodes at midsides, à la our discussion on Good Simulation Practices in Volume II). At the 'end of the (long!)
day', we realized and admit that our mesh grading was not 'optimal'; we should have graded more 'aggressively', and we do, briefly, at the end. The mesh contains 4290 elements and 16965 nodes, with the unit-radius cylinder located at the center of a domain that covers $-3 \le x \le 3$ and $-2 \le y \le 2$, although the symmetry BC allows us to use $0 \le y \le 2$, which we do.

The BC's are as follows: $u = w_0(1 - e^{-\lambda t})$ and $v = 0$ at the inlet ($x = -3$) with $w_0 = 0.1$ and $\lambda = 100 = 1/\tau_{ac}$; homogeneous NBC's at the outlet ($x = 3$) as OBC's ($\nu\,\partial u/\partial x - P = 0 = \nu\,\partial v/\partial x$); $\mathbf{u} = 0$ on the cylinder ($\mathbf{n} \cdot \mathbf{u} = 0$ for the inviscid cases); and symmetry elsewhere ($y = 0$ and $y = 2$) via $v = \partial u/\partial y = 0$ (which also implies $\partial P/\partial y = 0$, à la Section 3.8.2). This was for the no-flow IC. For the 'impulsive'/potential flow version of the problem, the inlet BC (only) was changed to $u = w_0 = 0.1$, and the resulting 'illegal' problem (no initial flow) was made legitimate via an $L^2$-projection (see Appendix 3) to a discretely div-free potential flow + vortex sheet (caused by the no-slip BC) at $t = 0^+$ using the very small BE time-step trick already utilized and explained in Section 3.16.1d. (The first step gives the appropriate velocity field and the second gives the concomitant pressure, both results being virtually independent of Re and $\Delta t$, for $\Delta t$ sufficiently small.)

The cylinder diameter ($D$) is 2 and is the characteristic length for defining the Reynolds number; $Re = w_0 D/\nu$ gives, for Re = 1000, $\nu = 0.0002$, which value we also used for the Stokes flow simulations. The resulting additional physical time scales (in addition to $\tau_{ac} = 0.01$) are: $\tau_A = D/w_0 = 20$ (advection) and $\tau_D = D^2/\nu = 20000$ (diffusion), both of which are large relative to $\tau_{ac}$.
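The arithmetic behind these parameters can be collected in a few lines. The Python sketch below is ours (the function names are hypothetical); it simply evaluates the viscosity and the three time scales quoted above, together with the exponential inlet ramp.

```python
import math


def startup_scales(w0=0.1, D=2.0, Re=1000.0, lam=100.0):
    """Viscosity and physical time scales for the cylinder problem of Section 3.19.2."""
    nu = w0 * D / Re              # from Re = w0 * D / nu
    return {
        "nu": nu,                 # kinematic viscosity
        "tau_ac": 1.0 / lam,      # acceleration time constant
        "tau_A": D / w0,          # advection time scale
        "tau_D": D**2 / nu,       # diffusion time scale
    }


def inlet_velocity(t, w0=0.1, lam=100.0):
    """Inlet ramp u(t) = w0 * (1 - exp(-lam * t)); starts from rest at t = 0."""
    return w0 * (1.0 - math.exp(-lam * t))


if __name__ == "__main__":
    for name, value in startup_scales().items():
        print(f"{name} = {value:g}")
    # Fraction of the final inlet speed reached after five time constants.
    print(inlet_velocity(0.05) / 0.1)
```

After five acceleration time constants ($t = 0.05$) the inlet velocity is already within 1% of $w_0$, while the advective and diffusive time scales are three to six orders of magnitude longer.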
[Figure 3.19-1 panels: (a) The full domain. (b) Zoom near the cylinder.]
Fig. 3.19-1 The mesh of 4290 $Q_2P_{-1}$ (9/3) elements; 16 965 nodes.

3.19.3 Two steady-state results ($\nu = 0, \infty$)

Before launching into the transient simulations, we show and discuss two limiting steady-state results, in Figures 3.19-2 to 3.19-4: potential flow (Re = $\infty$, via the 2-step trick using BE with $\nu = 0$) and steady Stokes flow (Re = 0). Whereas the Stokes result looks (and is) reasonable and accurate, that for potential flow (realized, of course, by rotating the momentum equations à la Section 3.13.1e and releasing the no-slip BC to use the $\mathbf{n} \cdot \mathbf{u} = 0$ BC on the cylinder) is highly irregular/suspicious because of the WIGGLES in Figures 3.19-3 and 3.19-4 (the discussion of which we defer temporarily). The reason we can say the Stokes results are 'accurate' is the following, which is always a useful adjunct to mesh refinement studies: the 4/1 element delivered virtually the same solution as the 9/3 [cf. Figures 3.19-2(e) and 3.19-2(f)], which leads to the perhaps somewhat promiscuous conclusion that the much-more-accurate-in-general 9/3 element is virtually 'mesh converged'.
[Figure 3.19-2 panels: (a) Potential flow streamlines ($\Delta\psi = 0.01$). (b) Potential flow pressure; $P_{\max} = 0.00505$, $P_{\min} = -0.0284$ ($\Delta P = 0.00167$). (c) Potential; $\phi_{\max} = 80356$, $\phi_{\min} = -0.1$ ($\Delta\phi = 4018$). (d) Stokes flow streamlines ($\Delta\psi = 0.01$). (e) Stokes flow pressure; $P_{\max} = 3.73 \times 10^{-4}$, $P_{\min} = -0.5 \times 10^{-4}$ ($\Delta P = 2.14 \times 10^{-5}$). (f) 4/1 version of Stokes flow pressure; $P_{\max} = 3.41 \times 10^{-4}$, $P_{\min} = -0.5 \times 10^{-4}$ ($\Delta P = 1.95 \times 10^{-5}$).]
Fig. 3.19-2 Potential flow (left) and steady Stokes flow (right); 9/3 elements (mostly).

The streamlines in Figure 3.19-2 show the rather large 'displacement' thickness for (sticky) Stokes flow, vis-à-vis (slippery) potential flow. Except for the first ones in from the boundaries, the streamlines are equally spaced ($\Delta\psi = 0.01$); the lowest value is 0.005 and the largest shown is 0.195 (the top of the domain has $\psi = 0.2 = \int_0^2 w_0\,dy$). Turning now to the pressure [Figures 3.19-2(b) and 3.19-2(e)], the only thing the two have in common is some symmetry about $x = 0$, but even these are different: the potential flow has $P_{\max} = 0.0051$ at both upstream and downstream stagnation points (even symmetry about $x = 0$), whereas the Stokes pressure field displays odd symmetry, with $P_{\max} = 0.000373$ at the forward stagnation point and $P_{\min} = -0.000055$ at the rear. (Recall that the pressure scales with $\nu$ for Stokes flow.) The pressure at the exit is close to zero for both cases (and at the inlet for potential flow), and the minimum pressure for potential flow occurs at the top of the cylinder; it is $-0.0284$, and we note that the analytic solution [see, for example, Batchelor (1967)] for unbounded potential flow,

$$P = \tfrac{1}{2} u_\infty^2\,\frac{a^2}{r^2}\left(2\cos 2\theta - \frac{a^2}{r^2}\right) \tag{3.19-1}$$

in polar coordinates ($a$ is the radius), has $P_{\max} = \tfrac{1}{2}u_\infty^2 = (0.1)^2/2 = 0.005$ at $\theta = 0$ and $\theta = \pi$, and $P_{\min} = -\tfrac{3}{2}u_\infty^2 = -0.015$ at $\theta = \pi/2$; both at $r = a$. Thus, our bounded domain has captured well the stagnation pressures, but the additional flow forced past the top of the cylinder has further reduced the 'Bernoulli' pressure minimum (to $-0.028$). Also, rather than $u_{\max} = 2u_\infty$ at the top of the cylinder for the unbounded case, we see $u_{\max} =$
[Figure 3.19-3 panels (line plots of $u$): (a) $u$ vs $x$ at $y = 2$. (b) Same as (a). (c) $u$ vs $x$ at $y = 1$. (d) Same as (c). (e) $u$ vs $y$ at $x = 0$. (f) Same as (e).]
Fig. 3.19-3 Potential flow: 9/3 results on the left, 4/1 on the right.
[Figure 3.19-4 panels (line plots of $u$): (a) $u$ vs $x$ at $y = 2$. (b) Same as (a). (c) $u$ vs $x$ at $y = 1$. (d) Same as (c). (e) $u$ vs $y$ at $x = 0$. (f) Same as (e).]
Fig. 3.19-4 Mesh refinement results for 9/3, potential flow; 4323 node mesh on left, 1267 node mesh on right.
$2.75\,u_\infty$, a somewhat uncertain value because of the wiggles. (Bernoulli's theorem would say that it is $\sqrt{2(0.005 + 0.028)} = 0.257 = 2.57\,u_\infty$.) Finally we remark that the 'potential' field shown in Figure 3.19-2(c) is actually the pressure field after the first small ($\Delta t = 10^{-5}$) BE step on the ill-posed problem associated with zero initial velocity and $u = w_0 = 0.1$ as the inlet BC, per the discussion in Section 3.16.1d for the Taylor vortex problem.

A further digression on this potential field may be worthwhile since the associated potential flow problem,

$$\tilde{\mathbf{u}} = \mathbf{u} + \nabla\phi \quad \text{and} \quad \nabla \cdot \mathbf{u} = 0 \quad \text{in } \Omega, \tag{3.19-2}$$

with BC's of

$$\mathbf{n} \cdot \mathbf{u} = \mathbf{n} \cdot \tilde{\mathbf{u}} \quad \text{on } \Gamma_D \tag{3.19-3}$$

and

$$\phi = 0 \quad \text{on } \Gamma_N, \tag{3.19-4}$$

where $\Gamma_N$ is the outflow portion of $\partial\Omega$ and $\Gamma_D$ is all the rest, where $\tilde{\mathbf{u}} = 0$ in $\Omega$ and $\mathbf{n} \cdot \tilde{\mathbf{u}} = w_0$ at the inlet (which describes an $L^2$-projection, see Appendix 3, of the non-smooth function $\tilde{\mathbf{u}}$), does not seem to be an easy one. Harder yet is the derived Poisson equation for the potential,

$$\nabla^2\phi = \nabla \cdot \tilde{\mathbf{u}} \quad \text{in } \Omega, \tag{3.19-5}$$

with

$$\frac{\partial\phi}{\partial n} = 0 \quad \text{on } \Gamma_D \tag{3.19-6}$$

and

$$\phi = 0 \quad \text{on } \Gamma_N, \tag{3.19-7}$$

in which $\nabla \cdot \tilde{\mathbf{u}}$ is a sort of Dirac delta function, being zero everywhere except at the inlet, where it is unbounded. That the problem is not really all that difficult may perhaps be better appreciated by showing the exact solution for the 1D analog to this 'Green's function' problem (for $w_0 = 1$):

$$\phi(x, \xi) = -(\xi - x)H(\xi - x) + L - x, \tag{3.19-8}$$

where $\xi$ is the location of the source term (delta function), $L$ is the domain length, and $H(\cdot)$ is the Heaviside unit step function. It looks as in Figure 3.19-5. It has slope = 0 for $x < \xi$ and $\phi = 0$ at $x = L$, and the solution of interest is that with $\xi = 0$. [See too Hughes (1987), p. 26.] The 2D analog of this exact solution is the potential function in Figure 3.19-2(c). [Note that the potential flow solution is insensitive to the over-specification at the inlet; $v = 0$ or $v$ 'free' (à la true potential flow) look just the same.]
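The stated properties of the 1D Green's function (3.19-8) can be confirmed directly. The short Python sketch below is ours; the function name and the sample values of $\xi$ and $L$ are illustrative assumptions. It evaluates $\phi(x, \xi)$ and checks the zero slope to the left of the source, the zero value at $x = L$, and the inlet value obtained for a channel of length 6 when the result is scaled by $w_0 = 0.1$.

```python
import numpy as np


def green_1d(x, xi, L):
    """phi(x, xi) = -(xi - x) H(xi - x) + L - x, equation (3.19-8), for w0 = 1."""
    x = np.asarray(x, dtype=float)
    return -(xi - x) * np.heaviside(xi - x, 0.0) + L - x


if __name__ == "__main__":
    L, xi = 6.0, 2.0
    xs = np.linspace(0.0, L, 13)
    phi = green_1d(xs, xi, L)
    print(phi[xs < xi])                   # constant (L - xi): zero slope left of the source
    print(green_1d(L, xi, L))             # zero at the outflow end
    print(0.1 * green_1d(0.0, 0.0, L))    # source at the inlet, scaled by w0 = 0.1
```

With the source at the inlet ($\xi = 0$) the function reduces to the linear profile $L - x$, so that multiplying by $w_0 = 0.1$ for $L = 6$ gives an inlet value of 0.6.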
Further explanation of the potential function in this figure, whose maximum is $\approx 8.0 \times 10^4$ at the inlet (and whose minimum is $\approx -0.06$ at the exit, close to the $\phi = 0$ NBC imposed), is as follows. For a channel of length 6 with no cylinder, the linear potential field would have an inlet value of 0.6 (for $w_0 = 0.1$) in order to give $u = \partial\phi/\partial x = 0.1$; the increase from 0.6 to 0.8 is caused by the cylinder, and the factor of $10^5$ is $1/\Delta t$, where we recall that the code thinks that this is a pressure, whereas it is really $\phi/\Delta t$, per the discussion in Section 3.16.1d.

We now return to the wiggles in Figures 3.19-3 and 3.19-4 for potential flow and point out four things: (i) the (LBB-stable) 9/3 element is (ironically?) much noisier than the (LBB-unstable) 4/1; (ii) the clear reduction in wiggle magnitude with mesh
refinement suggests that convergence will indeed occur; (iii) part of the wiggles are caused by the $C$-matrix (even when operating on what appears to be a smooth 'pressure' field) and part are caused by the consistent mass matrix (in its usual mode of 'wiggle signal announcer'); a lumped mass version of this potential flow yields smaller wiggles, and a lumped mass matrix on an Re = 1000 simulation generated no wiggles; and (iv) for the Re = 1000 transient case to follow, the wiggles are only present at early time, before viscous effects obliterate them.

[Figure: sketch of $\phi$ versus $x$, with $0$, $\xi$ and $L$ marked on the $x$ axis.]
Fig. 3.19-5 1D Green's function.

Recall that the discrete version of this 'second-order mixed elliptic' problem is, via the first (very small) time step of a BE integration,

$$\frac{1}{\Delta t}M(u_1 - u_0) + CP_1 = 0 \tag{3.19-9}$$

and

$$C^T u_1 = g, \tag{3.19-10}$$

giving

$$(C^T M^{-1} C)P_1 = (C^T u_0 - g)/\Delta t \tag{3.19-11}$$

and

$$u_1 = u_0 - \Delta t\,M^{-1}CP_1 \tag{3.19-12}$$

for the ($L^2$-projected) potential velocity, where $u_0$ corresponds to the non-smooth field being projected ($\tilde{\mathbf{u}}$) in the above discussion for the continuum, and $\Delta t\,P_1 = \phi$. If $P_1$ is smooth (which usually seems to be the case) yet $u_1$ is rough (wiggly), the 'cause' can be in either $CP_1$ or $M^{-1}CP_1$ (or both), and we believe that we have some of each. For example, the wiggles at $y = 1$ in Figures 3.19-3 and 3.19-4 are much the same when $M$ is lumped, suggesting that the culprit is $C$. Similarly, let us focus on the 4/1 element wiggles at the upper corners of the mesh [the $y = 2$ plot in Figure 3.19-3(b)], which wiggles are present for CM and LM on all three meshes and even on a fourth (with over 67 000 nodes). We offer the following additional remarks for $Q_1Q_0$:

1. This element often performs poorly at or near mesh lines that point into a corner of the domain (for reasons that we do not understand).
2. It also responds badly when only two pressures are available in the 'nodal' momentum equation (2-patches) and when the elements are large and/or distorted, all of which
occur along the $y = 2$ line. (In fact, 2-patch equations often do not have coefficients of the $\nabla P$ approximation, the $C$-matrix, that sum to zero, thus causing trouble even if $P$ is constant!)
3. We have sung the 'Bent Element Blues' before [in Gresho and Leone (1984)].

In Figure 3.19-6 we add yet another piece to the wiggle puzzle. Shown is the velocity field in the upper right-hand corner of the mesh (similar behavior occurs in the upper left-hand corner) from $Q_2P_{-1}$ and potential flow for two different pressure approximations. The rough field in Figure 3.19-6(a) is from the local approximation on the element ($P = a + b\xi + c\eta$) and the smooth field [Figure 3.19-6(b)] is from the global approximation ($P = A + Bx + Cy$), and we remark that (i) both of the solutions are smooth and good for steady Stokes flow, (ii) we also do not understand the cause of these wiggles, and (iii) only the global approximation is used henceforth.

After mentioning that (Stokes) triangular elements (e.g. $P_2P_0$) also wiggle for potential flow, we assert that we have done our best to unravel these element 'stability' problems, but have found the literature to be rather sparse when 'Stokes elements' are used for potential flow or related simulations that also involve $L^2$-projections [for example, projection methods and even explicit time integration methods (see below)]. What we seem to have here is a lack of satisfaction of the so-called 'first Brezzi stability condition' (D. Arnold, personal communication), that of coercivity/ellipticity. An intuitive 'feel' for the problem is related to the 'inconsistency' in the mixed approximation in that the velocity is in $H^1$ and the (lower-order) pressure/potential is in the larger and less smooth space $L^2$, yet the only spatial derivatives [cf. (3.19-2)] are on the latter. The 'second Brezzi condition', the LBB condition, is indeed satisfied for the more wiggly of our two elements, for Stokes or potential flow.
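The algebra of the discrete projection, (3.19-9) through (3.19-12), can be illustrated with a small dense example. The Python sketch below is ours and uses random matrices as hypothetical stand-ins for the finite element mass matrix $M$ and gradient matrix $C$ (it says nothing about wiggles, which depend on the actual element). It forms the Schur complement $C^T M^{-1} C$, solves for $P_1$, and recovers $u_1$.

```python
import numpy as np


def be_projection_step(M, C, u0, g, dt):
    """One inviscid BE step in the form of equations (3.19-9) to (3.19-12).

    M : (n, n) symmetric positive definite 'mass' matrix
    C : (n, m) 'gradient' matrix with full column rank
    Returns the projected velocity u1 and the 'pressure' P1.
    """
    Minv_C = np.linalg.solve(M, C)               # M^{-1} C
    schur = C.T @ Minv_C                         # C^T M^{-1} C
    P1 = np.linalg.solve(schur, (C.T @ u0 - g) / dt)
    u1 = u0 - dt * (Minv_C @ P1)
    return u1, P1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 8, 3
    A = rng.standard_normal((n, n))
    M = A @ A.T + n * np.eye(n)                  # stand-in for a consistent mass matrix
    C = rng.standard_normal((n, m))              # stand-in for the gradient matrix
    u0 = rng.standard_normal(n)
    g = rng.standard_normal(m)

    u1, P1 = be_projection_step(M, C, u0, g, dt=1e-5)
    print(np.abs(C.T @ u1 - g).max())            # discrete continuity residual
    print(1e-5 * P1)                             # phi = dt * P1
```

Repeating the step with a different $\Delta t$ returns the same $u_1$ and the same $\phi = \Delta t\,P_1$; only $P_1$ itself scales like $1/\Delta t$, which is the sense in which the code 'thinks' it is computing a pressure.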
While we know of no proofs that certain (all?) stable Stokes elements either pass or fail the coercivity condition when applied to potential flow, we suspect (and this suspicion is reinforced by our discussions with a fair number of 'finite element mathematicians') that they fail. [This stability condition is related to the stability of the inverse of the matrix $\begin{pmatrix} A & C \\ C^T & 0 \end{pmatrix}$, where $A = K$ for Stokes flow and $A = M$ (or $I$ for FDM) for potential flow.] [It is quite noteworthy that the simple FE time marching method for the NSE's involves virtually the same matrix as potential flow, with $A = (1/\Delta t)M$; yet we, and many others, have computed quite successfully via FE and 'Stokes' elements.] Another 'intuitive feeling' as to why it may fail the ellipticity test is the following. Note first that the inverse of the above matrix also involves the inverse of the Schur complement, $C^T A^{-1} C$; $K^{-1}$ is a 'smoothing' operator (the inverse Laplacian) whereas $M^{-1}$ (or $I$) is not, so that if $C$ tends to be 'unstable' in some sense, $K^{-1}$ (only) can help it to recover.

We believe that when Stokes elements fail the first Brezzi condition for potential flow (or for explicit time marching methods) but nevertheless perform satisfactorily (if not optimally) in practice, this is a consequence of one or both of the following: (i) the stability condition is a sufficient but not a necessary condition for convergence, and (ii) the RHS's that are encountered 'in practice' are special, not general, whereas the study of stability necessarily considers all possible 'data' (RHS's). Indeed, in the new 'Bible' on mixed finite element methods, Brezzi and Fortin (1991) do not even consider these 'Stokes elements' for the solution of such a (potential flow) problem.
Rather, they seek elements in a space larger than $H^1$ called $H(\mathrm{div})$: vector fields in this space need not (and generally do not) have continuous tangential components; the vectors and their divergences are all that must be in $L^2$. But these elements, good as they might be for the
[Figure 3.19-6 panels: (a) Local pressure approximation. (b) Global pressure approximation.]
Fig. 3.19-6 Potential flow velocity field from $Q_2P_{-1}$ in upper right corner of the mesh shown in Figure 3.19-1.
inviscid case, are useless as soon as we introduce viscosity and the no-slip BC; thus, we go no further in the direction of $H(\mathrm{div})$ elements. [For an alternative approach to computing potential flow via a mixed FEM, see Carey and Oden (1986), in which integration by parts is applied to the continuity equation rather than to the pressure (potential) gradient, with the result that 'pressure' (potential) is the 'higher-order' variable. In Pironneau (1989), potential flow is discussed in the context of the Laplace equation for the potential. See too Lee et al. (1982), in which the Laplace equation approach was surprisingly better than the mixed method approach.]

We conclude our discussion of the steady cases by further verifying that the 9/3 element behaves 'perfectly' for steady Stokes flow, as is of course well known; Figure 3.19-7 shows the line plots that were wiggly for potential flow. Also shown is the vorticity, and we remark that results from $Q_1Q_0$ are very close to these (within a few percent).

[Figure 3.19-7 panels: (a) $u$ vs $x$ at $y = 2$. (b) $u$ vs $x$ at $y = 1$. (c) $u$ vs $y$ at $x = 0$. (d) Vorticity ($\omega_{\max} = 0.075$, $\omega_{\min} = -0.799$).]
Fig. 3.19-7 Further results from $Q_2P_{-1}$ for steady Stokes flow.

3.19.4 Pressure impulse

Enough on the two steady solutions; now we will show results from a transient case integrated with TR (in the 'smart' mode): Re = 1000 with a horizontal inlet velocity BC
of $u(t) = 0.1(1 - e^{-100t})$, zero initial velocity, with the 9/3 element. This is the simulation that does start from rest, albeit with large acceleration and pressure at small time (from $\dot{u}(t)|_{\text{inlet}} = 10e^{-100t}$ and the PPE's BC, $\partial P/\partial x = -\dot{u}(t)$, at the inlet). It also provides a good example of what Batchelor (1967) has called the 'pressure impulse':

$$\pi = \int P\,dt, \tag{3.19-13}$$

and, as we shall see, for sufficiently large $\lambda$ and sufficiently small time, the pressure goes like $\phi(x, y)\lambda e^{-\lambda t}$ [from the $\mathbf{n} \cdot \dot{\mathbf{u}}$ term in the PPE's BC, (3.8-36)], so that

$$\pi = \phi(x, y)(1 - e^{-\lambda t}), \tag{3.19-14}$$

which is close to $\phi(x, y)$, the initial potential field, if $\lambda t \gg 1$ yet $t$ is still small enough that the other physical processes (advection and diffusion) have not yet had a chance to 'react'; the initial pressure is just $\lambda$ times the potential function. Thus, as stated by Batchelor (1967), 'This relation provides us with a physical interpretation of the velocity potential. The potential $\phi$ of a given irrotational velocity distribution may be interpreted as $(-1/\rho)$ times the pressure impulse required to set up the flow from rest, or, alternatively, as $(1/\rho)$ times the pressure impulse required to reduce the given motion to rest.' In fact, the initial pressure field for this problem is exactly 1000-fold smaller than the 'potential' function shown in Figure 3.19-2(c); multiplication by $\Delta t$ reduces it by a factor of $10^5$, and multiplying it by $\lambda$ increases it 100-fold; the initial $\Delta P$ over the domain is $\approx 80$.
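Both the time integral behind (3.19-14) and the 1000-fold scaling can be verified numerically. The Python sketch below is ours; the function name, the trapezoid-rule quadrature, and the sample value of $\phi$ are illustrative assumptions. It integrates the assumed early-time pressure $\phi\lambda e^{-\lambda t}$ at a single point and compares the result with $\phi(1 - e^{-\lambda t})$, then converts the maximum of the code's 'potential' in Figure 3.19-2(c) into an initial pressure.

```python
import numpy as np


def pressure_impulse(phi, lam, t, n=20001):
    """Trapezoid-rule integral of P(s) = phi * lam * exp(-lam * s) for 0 <= s <= t."""
    s = np.linspace(0.0, t, n)
    p = phi * lam * np.exp(-lam * s)
    return float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(s)))


if __name__ == "__main__":
    phi, lam, t = 0.8, 100.0, 0.05               # sample point value, lambda, t = 5 tau_ac
    print(pressure_impulse(phi, lam, t))          # numerical impulse
    print(phi * (1.0 - np.exp(-lam * t)))         # closed form, equation (3.19-14)

    # Code 'potential' (really phi / dt) -> phi -> initial pressure lam * phi.
    code_potential_max, dt = 80356.0, 1.0e-5
    print(code_potential_max * dt * lam)          # initial pressure range over the domain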
This shape of the pressure field remains virtually unchanged during the transient for times up to almost $5\tau_{ac}$, while it decays in magnitude according to $P_0(x, y)e^{-\lambda t}$. The flow during this same time is, in fact, merely an accelerating potential flow (basically, except very close to the cylinder), until either advection or diffusion has had time to react/wake up; the idea here is to have $\lambda$ large enough that a true potential flow (not merely a potential acceleration with a feeble flow) exists before the fluid even knows what hit it. (It may be worthwhile to point out/emphasize that, even though we envision $\lambda \to \infty$, which generates unbounded initial pressure and acceleration, the initial velocity is always zero.) Thus we need (at least) $\tau_{ac} \ll \tau_A = D/u_\infty = 20$ and $\tau_{ac} \ll \tau_D = D^2/\nu = 20000$, which, as already mentioned, is easily satisfied for our $\tau_{ac}$ of 0.01.

3.19.5 Minimum Time of Believability; Re = 1000 Results

What is not so easy to obtain relates to the vortex sheet on the cylinder and its diffusion by viscosity. Defining

$$\tau_{MTB} = h^2/4\nu \tag{3.19-15}$$

as the fastest diffusional response time of the mesh (which we call the Minimum Time of Believability for reasons that will become quite clear later), where $h$ is the distance to the closest node off the cylinder surface ($h = 0.022$ for our mesh), gives $\tau_{MTB} \approx 0.60$, or $60\tau_{ac}$ (see too Section 2.4.1). Thus, we are not surprised that the acceleration transient can achieve a virtually steady potential flow before the fluid reacts in other ways; in fact, we planned it that way. A most sensitive measure of this 'reaction' is the maximum magnitude of the vorticity, which of course is that at the top surface of the cylinder: it is 23.6 at
$t = 0^+$ for our version of an impulsive start (potential flow + no-slip BC) and is the best approximation we have to the true value ($\infty$). For later times, it is as follows, where the result from the $e^{-\lambda t}$ startup is shown in parentheses: 22.1 (22.2) at $5\tau_{ac}$, 20.8 (21.1) at $10\tau_{ac}$ ($t = 0.10$), and 19.9 at $15\tau_{ac}$; the corresponding contour plots (not shown) look much like those in Figure 3.19-8, the vortex sheet as 'seen' by our (definitely finite) mesh. Thus, the velocity/vorticity field is indeed changing slowly at the 'completion' of the acceleration transient (say $10\tau_{ac}$).

But the pressure, as usual, is another matter entirely; it always 'reacts' first and fast to any change in 'external' conditions/forcing. This is clearly seen in Figure 3.19-9, which shows the isobars between $t = 5\tau_{ac}$ and $t = 10\tau_{ac}$, which appears to be the transition period: transition from acceleration-dominated potential flow + vortex sheet to a flow in which both advection and diffusion are 'effective'. [At $t = 5\tau_{ac}$, $P_0(\max)e^{-\lambda t} = 80e^{-5}$ is 98.5% of $P_{\max}$ in Figure 3.19-9(a); at $t = 10\tau_{ac}$, $80e^{-10}$ is only 23% of $P_{\max}$ in Figure 3.19-9(f).] By $t = 10\tau_{ac} = 0.10$, the pressure already looks a lot like that for potential flow (Figure 3.19-2), but it is definitely 'viscosity-affected', the range now being from $\approx -0.026$ near the top of the cylinder (vis-à-vis $-0.028$ for potential flow) to $\approx 0.016$ at the forward stagnation point (vis-à-vis $\approx 0.005$ for potential flow). It thus appears that it is the 'source' term [$-\nabla \cdot (\mathbf{u} \cdot \nabla\mathbf{u})$] on the RHS of the PPE (which is still very close to 'potential', $\approx -\nabla^2(\tfrac{1}{2}q^2)$) that determines the pressure minimum (and most of the 'shape' of the field), but that the 'Stokes' boundary term ($\nu\,\mathbf{n} \cdot \nabla^2\mathbf{u}$ in the Neumann BC for the PPE) has a significant effect on the maximum. (The fact that this BC is singular when a vortex sheet is present causes significant difficulty.)
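The numbers quoted in this subsection follow from one-line calculations, collected here in a Python sketch of ours (the function names are hypothetical; the input values are those given in the text and in the captions of Figure 3.19-9).

```python
import math


def tau_mtb(h, nu):
    """Minimum Time of Believability, tau_MTB = h**2 / (4 nu), equation (3.19-15)."""
    return h**2 / (4.0 * nu)


def decayed_pressure(p0_max, lam, t):
    """Acceleration-dominated pressure maximum, P0(max) * exp(-lam * t)."""
    return p0_max * math.exp(-lam * t)


if __name__ == "__main__":
    nu, lam, tau_ac = 2.0e-4, 100.0, 0.01
    t_mtb = tau_mtb(h=0.022, nu=nu)
    print(t_mtb, t_mtb / tau_ac)                      # about 0.6, i.e. about 60 tau_ac

    # Share of the observed P_max explained by pure exponential decay.
    print(decayed_pressure(80.0, lam, 0.05) / 0.547)  # t = 5 tau_ac, Figure 3.19-9(a)
    print(decayed_pressure(80.0, lam, 0.10) / 0.016)  # t = 10 tau_ac, Figure 3.19-9(f)
```

The first ratio is close to one and the second is well below it, which is the quantitative content of the statement that the transition away from acceleration-dominated flow takes place between $5\tau_{ac}$ and $10\tau_{ac}$.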
Fig. 3.19-8 The vortex 'sheet' on the mesh of Figure 3.19-1; potential flow with no-slip boundary condition ($\omega_{max} \approx 4.9$; $\omega_{min} = -23.6$).

Shown next are $\psi$, $P$, and $\omega$ at $t = 5$ and $t = 10$, in Figure 3.19-10, and we note from the minimum value of $\psi$ ($\approx -8 \times 10^{-6}$ at $t = 5$ and $\approx -0.037$ at $t = 10$) that separation has occurred ($\psi_{min} = 0$ prior to separation). Boundary layer theory ($Re = \infty$) predicts, for the unbounded domain, a separation time of $\approx 3.22$, and a good prediction at $Re = 1000$ is $\approx 3.71$ [both from Collins and Dennis (1973a)]. Vorticity advection is clearly evident at these two times. Further snapshots at $t = 15$ and 25 are shown in Figure 3.19-11, the latter showing two secondary eddies. In this figure, and the next two, the $\Delta\psi$ for the contours is definitely not constant; the flow in the small eddies is very weak. Finally, Figures 3.19-12 through 3.19-14 show the results at 'late' time: $t = 40$ (also shown in the frontispiece) and $t = 95$, the latter of which is surely a good 'test' of the OBC and is the end of our simulation. Similar results (but definitely not the same, owing
Fig. 3.19-9 Isobars during the transition phase, Re = 1000.
(a) $t = 5\tau_{ac} = 0.05$; $P_{max} = 0.547$, $P_{min} = 0$ ($\Delta P = 0.027$)
(b) $t = 6\tau_{ac}$; $P_{max} = 0.211$, $P_{min} = 0$ ($\Delta P = 0.011$)
(c) $t = 7\tau_{ac}$; $P_{max} = 0.081$, $P_{min} = 0$ ($\Delta P = 0.0041$)
(d) $t = 8\tau_{ac}$; $P_{max} = 0.037$, $P_{min} = -0.013$ ($\Delta P = 0.0025$)
(e) $t = 9\tau_{ac}$; $P_{max} = 0.024$, $P_{min} = -0.021$ ($\Delta P = 0.0023$)
(f) $t = 10\tau_{ac}$; $P_{max} = 0.016$, $P_{min} = -0.026$ ($\Delta P = 0.0021$)

Fig. 3.19-10 Solution at $t = 5$ (left) and $t = 10$ (right).
(a) Stream function
(b) Pressure; $P_{max} = 0.0079$, $P_{min} = -0.0256$ ($\Delta P = 0.00167$)
(c) Vorticity; $\omega_{max} = 0.67$, $\omega_{min} = -7.53$ ($\Delta\omega = 0.41$)
(d) Stream function
(e) Pressure; $P_{max} = 0.0091$, $P_{min} = -0.0256$
(f) Vorticity; $\omega_{max} = 3.20$, $\omega_{min} = -7.20$ ($\Delta\omega = 0.52$)
at least to our 'small' domain) are available in (at least) Ta Phuoc Loc (1980), Chou and Huang (1996), and Chu et al. (1996). Related results at larger Re (3000 and 9500) are presented in Ta Phuoc Loc and Bouard (1985), in which the two small secondary eddies near the top of the cylinder are called 'phenomenon $\alpha$' after Bouard and Coutanceau (1980). [For a fairly recent review of the literature on unsteady flows, see Sarpkaya (1996).]

Fig. 3.19-11 Solution at $t = 15$ (left) and $t = 25$ (right).
(a) Stream function
(b) Pressure; $P_{max} = 0.0119$, $P_{min} = -0.020$ ($\Delta P = 0.0016$)
(c) Vorticity; $\omega_{max} = 5.80$, $\omega_{min} = -7.09$ ($\Delta\omega = 0.64$)
(d) Stream function
(e) Pressure; $P_{max} = 0.014$, $P_{min} = -0.034$ ($\Delta P = 0.0032$)
(f) Vorticity; $\omega_{max} = 4.62$, $\omega_{min} = -7.12$ ($\Delta\omega = 0.78$)

3.19.6 Transient Stokes Flow

We shall return to this more interesting case after a brief look at the 'boring' case, Re = 0, because we later (must) consider both. (Re = 0 is not as easy as it might seem.) Figure 3.19-15 shows snapshots at three times ($t = 1$, 10 and 100), and we point out that either IC gives the same results (for $t \gg \tau_{ac}$). The symmetric diffusion of vorticity and the nearly-constant shape of the pressure field are obvious. The latter is a subject that we wish to say a lot more about, since virtually every Stokes pressure field that we have examined (even those for the impulsive start: potential flow + no-slip BC) has virtually the same shape for all time. The velocity field for (unbounded) potential flow is [in polar coordinates; Batchelor (1967)]

$$u_r = u_\infty\left(1 - a^2/r^2\right)\cos\theta \tag{3.19-16}$$

and

$$u_\theta = -u_\infty\left(1 + a^2/r^2\right)\sin\theta \tag{3.19-17}$$
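and the normal component of $\nabla^2\mathbf{u}$, needed for the PPE's Neumann BC, is ($\mathbf{n}$ points in the $-r$ direction)

$$\mathbf{n}\cdot\nabla^2\mathbf{u} = -\left[\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial}{\partial r}(r u_r)\right) + \frac{1}{r^2}\frac{\partial^2 u_r}{\partial\theta^2} - \frac{2}{r^2}\frac{\partial u_\theta}{\partial\theta}\right], \tag{3.19-18}$$

Fig. 3.19-12 Solution at $t = 40$.
(a) Stream function
(b) Pressure ($P_{max} = 0.014$, $P_{min} = -0.040$; $\Delta P = 0.0027$)
(c) Vorticity ($\omega_{max} = 2.09$, $\omega_{min} = -7.01$; $\Delta\omega = 0.35$)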
Fig. 3.19-13 Solution at $t = 95$.
(a) Stream function
(b) Pressure ($P_{max} = 0.043$, $P_{min} = -0.010$; $\Delta P = 0.0026$)
(c) Vorticity ($\omega_{max} = 2.10$, $\omega_{min} = -6.99$; $\Delta\omega = 0.45$)
Fig. 3.19-14 The small eddies at $t = 40$ and 95.
(a) $t = 40$; see also Figure 3.19-12
(b) $t = 95$; see also Figure 3.19-13

which, for potential flow, vanishes identically [$\nabla^2\mathbf{u} = 0$ for every solenoidal irrotational vector field, since $\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times\nabla\times\mathbf{u}$]. But if we impose the no-slip BC upon this otherwise potential (curl-free) velocity, the last term in (3.19-18) vanishes (at $r = a$) with the result that

$$\mathbf{n}\cdot\nabla^2\mathbf{u}\big|_{r=a} = -\frac{\partial}{\partial r}\left[\frac{1}{r}\frac{\partial}{\partial r}(r u_r)\right]_{r=a} = \frac{4u_\infty}{a^2}\cos\theta, \tag{3.19-19}$$
so that the PPE BC (at $t = 0$ only) for the impulsive start is (or seems to be)

$$\frac{\partial P}{\partial n} = -\frac{\partial P}{\partial r} = \nu\,\mathbf{n}\cdot\nabla^2\mathbf{u}\big|_{r=a} = \frac{4\nu u_\infty}{a^2}\cos\theta, \tag{3.19-20}$$

which of course is the only driving function for the Stokes pressure field. [It is also impossible to compute accurately on any mesh, an important point to which we will soon return.]

Fig. 3.19-15 Pressure and vorticity during developing Stokes flow, starting from potential flow or from rest with large acceleration.
(a) $P$ at $t = 1$; $P_{max} = 0.00473$, $P_{min} = -0.00092$ ($\Delta P = 0.00028$)
(b) $\omega$ at $t = 1$; $\omega_{max} = 0.20$, $\omega_{min} = -11.34$ ($\Delta\omega = 0.58$)
(c) $P$ at $t = 10$; $P_{max} = 0.00159$, $P_{min} = -0.00031$ ($\Delta P = 0.000095$)
(d) $\omega$ at $t = 10$; $\omega_{max} = 0.099$, $\omega_{min} = -3.680$ ($\Delta\omega = 0.19$)
(e) $P$ at $t = 100$; $P_{max} = 0.000633$, $P_{min} = -0.000119$ ($\Delta P = 0.000038$)
(f) $\omega$ at $t = 100$; $\omega_{max} = 0.052$, $\omega_{min} = -1.375$ ($\Delta\omega = 0.071$)

With the BC given by (3.19-20), it is easy to show that the (unbounded domain) solution to $\nabla^2 P = 0$ is

$$P(r, \theta) = \frac{4\nu u_\infty\cos\theta}{r} \tag{3.19-21}$$

for $P \to 0$ as $r \to \infty$. For steady solutions, multiply (3.19-21) by $-c$ with $c > 0$, which constant will be discussed below, and we note that $P \sim -\cos\theta/r$ is the 'normal' case: high pressure at the inlet ($\cos\theta < 0$) and low pressure at the outlet ($\cos\theta > 0$). The 'wrong sign' for the 'proposed' $t = 0$ potential flow plus vortex sheet solution (which, if it is ever valid, applies only at $t = 0$; not for $t < 0$ and not for $t > 0$) seems to be so because the no-slip BC wants to slow the flow. (Later, we shall present a proposed pressure solution for $t > 0$.) Our bounded-domain solutions (when Stokes-like) are actually not too different from $P(r, \theta) = (A\cos\theta)/r$ for some $A$, even though we have different BC's [$\partial P/\partial y = 0$
at $y = 0$ and $y = 2$, $\partial P/\partial n = \mathbf{n}\cdot(\nu\nabla^2\mathbf{u} - \mathbf{u}\cdot\nabla\mathbf{u} - \partial\mathbf{u}/\partial t)$ at the inlet, and $P = 0$ at the outlet].

Remark: As promised in Section 3.8.2 we have here an example of an incompressible flow that violates $\nabla\cdot\mathbf{u} = 0$ on $\Gamma_D$: for potential flow + no-slip, $\nabla\cdot\mathbf{u} = (1/r)[\partial(r u_r)/\partial r + \partial u_\theta/\partial\theta]$ applied at $r = a$ gives $\nabla\cdot\mathbf{u} = (1/a)[2u_\infty\cos\theta + 0] \neq 0$. It is also worth pointing out that even though $\nabla^2\mathbf{u} = 0$ for potential flow, the normal pressure gradient at $r = a$ is still non-zero; it is, from (3.19-1),

$$\frac{\partial P}{\partial r} = \frac{4u_\infty^2}{a}\sin^2\theta \tag{3.19-22}$$

which is also $-\mathbf{n}\cdot(\mathbf{u}\cdot\nabla\mathbf{u})$ at $r = a$. [This result also corrects a small error in Gresho and Sani (1987), Remark (iv) on p. 1119.] True potential flow has a maximum in $\partial P/\partial r$ at $\theta = \pi/2$ rather than the 'Stokes flow' result of $\partial P/\partial r = 0$ there. The solution given by (3.19-21) is what we refer to, perhaps paradoxically, as an Euler-Stokes flow [see also Gresho (1991a,b, 1992)]: an Euler velocity field (except on $r = a$) and a Stokes pressure field, even though we may not be able to find the latter field numerically.

Next we note that, although not valid for $r \to \infty$, the steady Stokes equations have the 'near-field' solution [see, for example, Panton (1984)]

$$\psi = c\,u_\infty a\left(\frac{a}{r} - \frac{r}{a} + 2\,\frac{r}{a}\ln\frac{r}{a}\right)\sin\theta, \tag{3.19-23}$$

giving

$$u_r = c\,u_\infty\left(\frac{a^2}{r^2} - 1 + 2\ln\frac{r}{a}\right)\cos\theta, \tag{3.19-24}$$

$$u_\theta = c\,u_\infty\left(\frac{a^2}{r^2} - 1 - 2\ln\frac{r}{a}\right)\sin\theta, \tag{3.19-25}$$

and

$$\mathbf{n}\cdot\nabla^2\mathbf{u}\big|_{r=a} = -\left(4c\,u_\infty/a^2\right)\cos\theta, \tag{3.19-26}$$

where $c$ is 'arbitrary' (and related to the Stokes paradox; $c = 2/\ln(7.4/Re)$ for small $Re$, à la Oseen, is a good choice (Batchelor, 1967)). The important point is that the $\theta$-variation of $\partial P/\partial n$ at $r = a$ is the same in both potential flow with no-slip and steady Stokes flow, and our numerical simulations have told us that it is also the same for transient Stokes flow that 'began' as a potential flow.
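A quick numerical check of the near-field Stokes solution is reassuring. The Python sketch below is a minimal illustration under stated assumptions, not part of the original computations: it takes the velocity components in the form $u_r = c\,u_\infty(a^2/r^2 - 1 + 2\ln(r/a))\cos\theta$ and $u_\theta = c\,u_\infty(a^2/r^2 - 1 - 2\ln(r/a))\sin\theta$ (our reading of (3.19-24) and (3.19-25)), uses hypothetical function names, and evaluates the radial term of (3.19-18) at the wall by nested central differences.

```python
import math


def stokes_near_field(r, theta, a=1.0, u_inf=0.1, c=1.0):
    """Near-field steady Stokes velocity (u_r, u_theta); assumed form."""
    g = (a / r) ** 2 - 1.0
    log_term = 2.0 * math.log(r / a)
    u_r = c * u_inf * (g + log_term) * math.cos(theta)
    u_t = c * u_inf * (g - log_term) * math.sin(theta)
    return u_r, u_t


def n_dot_lap_u_wall(theta, a=1.0, u_inf=0.1, c=1.0, dr=1e-3, eps=1e-5):
    """n . lap(u) at r = a with n = -e_r, by nested central differences.

    On a no-slip wall both theta-derivative terms of (3.19-18) vanish,
    so only -d/dr[(1/r) d(r u_r)/dr] is evaluated here.
    """
    def inner(r):
        # (1/r) d(r u_r)/dr by a central difference of width 2*eps
        up = (r + eps) * stokes_near_field(r + eps, theta, a, u_inf, c)[0]
        um = (r - eps) * stokes_near_field(r - eps, theta, a, u_inf, c)[0]
        return (up - um) / (2.0 * eps) / r

    return -(inner(a + dr) - inner(a - dr)) / (2.0 * dr)


if __name__ == "__main__":
    print("no-slip check   :", stokes_near_field(1.0, 0.7))
    print("numerical value :", n_dot_lap_u_wall(0.0))
    print("-(4 c u_inf/a^2):", -4.0 * 1.0 * 0.1 / 1.0 ** 2)
```

With $c = 1$, $u_\infty = 0.1$ and $a = 1$ the numerical value should agree with $-(4c\,u_\infty/a^2)\cos\theta$ to within the finite-difference error; the same $\cos\theta$ dependence as in (3.19-19), with opposite sign, is the point of the comparison.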
[We have also solved the following Neumann problem on the mesh in Figure 3.19-1: $\nabla^2 T = 0$ in $\Omega$, $\partial T/\partial n = \cos\theta$ on the cylinder, and $\partial T/\partial n = 0$ elsewhere; the solution looks just like all of the Stokes pressure fields. In fact, the 'empirical' result on our domain leads to the following relationship between $P_{max} - P_{min} = \Delta P$ and the $\partial P/\partial n$ BC on the cylinder: $\Delta P = (3/\cos\theta)\,\partial P/\partial r$, which can be used, along with the steady Stokes flow result [see Figure 3.19-2] that $\Delta P = 4.2 \times 10^{-4}$, to obtain an estimate of the 'bounded domain factor', $\beta$, in the PPE BC equation $\partial P/\partial r = \beta(4\nu u_\infty\cos\theta)/a^2$, to give $\beta = 4.2 \times 10^{-4}/(3 \times 4 \times 0.0002 \times 0.1) = 1.75$, vis-à-vis $c$ from (3.19-26) and 1 from (3.19-20). Another measure of $\beta$ comes from the exact solution to $\nabla^2 T = 0$ in $\Omega$ with $\partial T/\partial r = A\cos\theta/a^2$ on $r = a = 1$ with $T \to 0$ for
$r \to \infty$; viz. $T = -A\cos\theta/r$, giving $\Delta T_{max} = 2A$ vis-à-vis $3A$ on our bounded domain. Thus, the bounded domain seems to increase the 'magnitude' of the solution by 50-75%.]

We stated above for Stokes flow that either IC gives the same results for $t \gg \tau_{ac}$ and $t > t_{MTB}$. Well, it is also true for Re = 1000, thus justifying our assertion that the 'real' transient, via the $e^{-\lambda t}$ BC starting from rest, is indeed a valid and legitimate (physically and mathematically) technique for obtaining an impulsive start in the CFD laboratory. Another thing that is virtually the same for any Re is the accelerating potential flow field ($\mathbf{u}_P$) for the $e^{-\lambda t}$ case; $\mathbf{u}(\mathbf{x}, t) = \mathbf{u}_P(\mathbf{x})(1 - e^{-\lambda t})$ is independent of Re, except very close to the cylinder.

3.19.7 Divergence for $h \to 0$

But for $t < O(10\tau_{ac})$, the results from the two start-up methods are, of course, not the same. They are also, unfortunately, not accurately computable for at least one case of interest: $\tau_{ac} \ll t < t_{MTB}$; i.e., the inability of any mesh to properly simulate what is effectively a step change in the tangential velocity BC precludes the attainment of an accurate solution, especially for the pressure, for at least $10\tau_{ac} < t < t_{MTB}$. Thus, for example, the pressures that we have shown during the transition phase in Figure 3.19-9 are only 'nearly' correct. A subsequent finer mesh calculation (not shown) with $h = 0.0024$ (rather than 0.022) and thus $t_{MTB} = 0.0072$, which is less than $\tau_{ac} = 0.01$, showed, for example, that $P_{max}$ in Figure 3.19-9 is in error ranging from $\approx 3\%$ low at $t = 5\tau_{ac}$ to $\approx 35\%$ low at $t = 10\tau_{ac}$; $\Delta P = P_{max} - P_{min}$ was less in error (3-10%), and the isobars in Figure 3.19-9 are qualitatively correct.
We shall return to and explain these errors later; for now, let us just report the empirical result (for the impulsive start in general; and for $10\tau_{ac} < t < t_{MTB}$ for the rapid start) that, for $h$ sufficiently small, the initial $\Delta P \approx 14\nu u_\infty/h$, with smaller $h$ needed for smaller $\nu$ (larger Re) to see $O(1/h)$. This $1/h$ behavior was also displayed by $P_{min}$ and $P_{max}$, and leads to the observation that the pressure behaves like a fractal dimension: the more closely you examine/measure it, the larger it becomes! (Divergence as $h \to 0$!)

Why is the pressure field so wrong (in magnitude, but ostensibly not in 'shape'), how wrong is it, and how wrong is the associated velocity field? These are good questions, whose answers we can only partially supply, beginning with an examination of the numerical solution, vis-à-vis what we think we know about the PDE solution, very close to the cylinder, just after the 'impulsive' start [$t = 0^+$ for the potential flow + no-slip IC, and $t = 10\tau_{ac}$ (say) for the no-flow IC]. Consider the 'worst case' (maximum amplitude of the vortex sheet, which is also the maximum slip velocity for the related potential flow): the top of the cylinder (although the analysis to follow applies just as well at other values of $\theta$). It seems clear that, sufficiently close to the cylinder and for sufficiently small time, the dominant terms in the tangential momentum equation are the acceleration term and the 'friction' term; i.e., the normal 'component' of the viscous term. All other terms are negligibly small in comparison. It is also clear for the conditions stated that the curvature of the cylinder can be neglected (close enough to the surface, the radius appears to be infinite), thus permitting a simple model that is nothing more than the 1D transient heat equation [see also Gresho (1991a, 1992)],

$$\frac{\partial u}{\partial t} = \nu\frac{\partial^2 u}{\partial y^2}, \tag{3.19-27}$$
where $u = -u_\theta$ and $y$ is the distance from the surface. The solution to this equation, with an IC of $u = u_0 = $ constant (another valid approximation close to the surface; $u_0 = (1/a)\,\partial\phi/\partial\theta$, the slip velocity) and a BC of $u = 0$ at $y = 0$ [see too Collins and Dennis (1973b)] is

$$u = u_0\,\mathrm{erf}\!\left(y/\sqrt{4\nu t}\right), \tag{3.19-28}$$

which we assert is a reasonably close approximation to the true solution (for the conditions stated). Later, we will present an even better model that, while somewhat more complicated, leads to virtually the same results for our present purposes.

Now please take a look at Figure 3.19-16. Plotted there are three curves corresponding to the first node up from the cylinder top (at $y = h = 0.022$) and three to the second (at $y = 2h = 0.044$). The two solid curves in each figure are from the numerical solutions: one for the 'true' transient startup ($e^{-\lambda t}$) and the other for the impulsive start. (The dotted curves are 'theoretical', and are explained below.) Two things are especially noteworthy when comparing these curves.

1. The transient case catches up to the potential flow IC case by $t = 0.05$ ($= 5\tau_{ac}$) or so, and for $t \gtrsim 10\tau_{ac}$ the results, as mentioned earlier, are virtually identical.
2. For the impulsive start, the second node off the surface clearly starts off in the wrong direction (!). For the impulsive start, the first node also starts in the wrong direction, a fact ('proven' shortly) that is much less obvious than that for the second node, $u_2$. (The $e^{-\lambda t}$ startup is thus also in error.) The spurious initial acceleration for $u_2$ [$= u_x(653)$] is a clear wiggle signal, thanks to the consistent mass matrix; this signal was first identified and utilized in Gresho and Lee (1980) (see also Gresho, 1979).
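The error-function model (3.19-28) is simple enough to evaluate directly. The following Python sketch is only an illustration with hypothetical function names; it assumes a unit slip velocity $u_0 = 1$ and the $\nu$ and $h$ quoted in the text, and it obtains the acceleration by differentiating (3.19-28) analytically with respect to $t$.

```python
import math


def u_erf(y, t, u0=1.0, nu=2.0e-4):
    """Error-function solution (3.19-28); u -> u0 as t -> 0+ for y > 0."""
    if t <= 0.0:
        return u0 if y > 0.0 else 0.0
    return u0 * math.erf(y / math.sqrt(4.0 * nu * t))


def accel(y, t, u0=1.0, nu=2.0e-4):
    """du/dt of the erf solution, obtained by analytic differentiation."""
    if t <= 0.0:
        return 0.0  # exp(-y^2 / 4 nu t) beats any inverse power of t
    eta2 = y * y / (4.0 * nu * t)
    return -u0 * y * math.exp(-eta2) / math.sqrt(4.0 * math.pi * nu * t ** 3)


if __name__ == "__main__":
    h, nu = 0.022, 2.0e-4
    for t in (1e-4, 1e-2, 0.1, h * h / (4.0 * nu), 2.4):
        print(f"t = {t:8.4f}  u(h) = {u_erf(h, t):.4f}  du/dt(h) = {accel(h, t):.4e}")
```

At $t = h^2/4\nu$ the argument of the error function is exactly one, so $u(h) = u_0\,\mathrm{erf}(1)$; for much smaller $t$ the model acceleration at $y = h$ is vanishingly small, which is the behavior a finite mesh fails to reproduce.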
[A lumped mass simulation, or most FVM or FDM simulations, would not be alerted to this dilemma (both nodes would show a not so unreasonable decelerating flow for all $t \geq 0$), and the analyst might just believe the results, and thus remain befuddled as to the pressure's behavior with mesh refinement.]

To see quantitatively the errors in the simulation, let us first return to (3.19-28) and evaluate the acceleration:

$$\frac{\partial u}{\partial t} = -\frac{u_0\,y\,e^{-y^2/4\nu t}}{\sqrt{4\pi\nu t^3}} = \nu\frac{\partial^2 u}{\partial y^2}, \tag{3.19-29}$$

which we wish to examine at $y = h$ and $y = 2h$. In each case, the point of overriding importance is that $\partial u/\partial t = 0$ at $t = 0$; the initial acceleration is neither negative (cf. $u_1$ [$= u_x(392)$] in Figure 3.19-16(a)) nor positive (à la $u_2$ in Figure 3.19-16(b)); it is zero. The dotted curve in these two figures is a plot of (3.19-28) for $y = h$ and $2h$, respectively, with an IC that approximates that from potential flow [cf. Figure 3.19-3(e) and (f)]. These ostensibly small differences are the cause of the big problem, where we point out that recovery to the proper solution does occur for times $> O(t_{MTB})$, as suggested in Figure 3.19-16, in which [see (3.19-15)] $t_{MTB} = 0.6$ for $u_1$ and $\approx 2.4$ for $u_2$. [It is now clear that our definition of $t_{MTB}$ is closely related to the error function solution: for a given distance ($h$) from the surface, it is the time at which the argument in the error function is unity. It is also clear that we are not quite consistent in our definition; cf. Section 2.6.2c.]

The reason that a 'finite $h$' calculation cannot get it 'right' for small time is actually rather easy to understand, and 'leads to' the less-easy-to-understand '$1/h$' pressure
behavior, which, we emphatically point out, is observed for both 9/3 and 4/1 elements. It is simply a reflection of the fact that we cannot compute $\partial^2 u/\partial y^2$ in (3.19-29) with any accuracy; that the step change is beyond our reach is perhaps best realized via the simplest approximation, second-order FDM:

$$\frac{\partial^2 u}{\partial y^2} \approx \frac{1}{h^2}\left(u_{j-1} - 2u_j + u_{j+1}\right), \tag{3.19-30}$$

which, for our IC's and $j = 1$ at $y = h$, gives, roughly,

$$\nu\frac{\partial^2 u}{\partial y^2} \approx \frac{\nu}{h^2}\left(0 - 2u_0 + u_0\right) = -\frac{\nu u_0}{h^2} \tag{3.19-31}$$

at $t = 0$. The finite mesh will always give a spurious second derivative close to the 'wall', and thus spurious acceleration there. [Note that quadratic basis functions do not 'help' at all; they too give $O(\nu u_0/h^2)$.] Also, mesh refinement experiments verified the $O(1/h^2)$ initial acceleration. This is the beginning of the 'problem'.

Fig. 3.19-16 Time histories of $u$ for 2 nodes (392 and 653) just above the cylinder, and 2 error function curves (dotted).
(a) First node up ($y = 1.022$): $u_x(392)$, impulsive and transient.
(b) Second node up ($y = 1.044$): $u_x(653)$, impulsive and transient.

Next, we must connect this bad behavior to the pressure field, where the errors are perhaps more noticeable. And this must be done by first turning to the normal velocity, since it is the component that is closely coupled to the pressure (owing, of course, to $\nabla\cdot\mathbf{u} = 0$). We begin this part of the analysis by recalling the PPE's 'viscous' BC at the cylinder:

$$\frac{\partial P}{\partial n} = \nu\,\mathbf{n}\cdot\nabla^2\mathbf{u} = \nu\frac{\partial^2 u_n}{\partial n^2} \tag{3.19-32}$$

or, in our 'local' coordinate system at the top of the cylinder,

$$\frac{\partial P}{\partial y} = \nu\frac{\partial^2 v}{\partial y^2}. \tag{3.19-33}$$

But, we note, after 'filtering' the wiggles [or from a lumped mass (or FDM/FVM) result which is smooth in $v(y)$ but still has bad accelerations and pressure], that $\partial^2 v/\partial y^2$ is
'smooth' (no step change in the normal direction!) and thus easy to compute well. So we are led to repeat the question: Why is the pressure field so wrong? To answer this nagging but important question, we must examine the discrete PPE at and near the cylinder's surface. (We limit this analysis to the 4/1 element, but assert that the results are quite general.) To this end, please return to Section 3.13.5f and review the analysis leading up to (3.13-306), because it provides the vitally important missing link. And this 'link' is in one of the 'truncation error' terms; namely, $(h/2)\,\partial/\partial\tau\,(\nu\nabla^2 u_\tau)$. In terms of our local coordinate system with $\mathbf{u} = 0$ on $\Gamma$, the remaining important terms are these:

$$\frac{\partial P}{\partial n} = \nu\nabla^2 u_n - \frac{h}{2}\frac{\partial}{\partial\tau}\left(\nu\nabla^2 u_\tau\right), \tag{3.19-34}$$

and the 'bad guy' is the normal 'component' of $\nabla^2 u_\tau$: $\partial^2 u_\tau/\partial n^2 = \partial^2 u/\partial y^2$ in (3.19-34). (It is noteworthy that the error in the normal momentum equation is caused by the tangential velocity, which is much larger than the normal velocity.) We cannot make $h$ small enough to make this error term negligible because $\partial^2 u/\partial y^2 \sim 1/h^2$, à la (3.19-31). Thus, $\partial P/\partial n$ becomes dominated by the error term, giving the spurious result that $\partial P/\partial n = O(1/h)$ and therefore $P = O(1/h)$ as $h \to 0$, which is just what we saw in our numerical results. Although this is really the 'bottom line', it turns out that we can actually be somewhat more quantitative in the analysis, by returning again to (3.19-34):

(i) $\nabla^2 u_n = 4u_\infty\cos\theta/a^2$ from (3.19-19).
(ii) $\dfrac{\partial}{\partial\tau}\nabla^2 u_\tau\Big|_j \approx \left(\nabla^2 u_\tau|_{j+1} - \nabla^2 u_\tau|_j\right)/l$, where $j$ is any node on the surface of the cylinder (increasing $j$ is in the same direction as increasing $\theta$), and $l = a\Delta\theta$.
(iii) As in (3.19-31), $\nabla^2 u_\tau|_j \approx (0 - 2u_{sj} + u_{sj})/h^2 = -u_{sj}/h^2$, where $u_{sj}$ is the original slip velocity on the cylinder at node $j$ (from the potential flow solution). This, of course, is the term that is in error.
(iv) Thus, $-\dfrac{h}{2}\dfrac{\partial}{\partial\tau}\nabla^2 u_\tau\Big|_j \approx \dfrac{1}{2h}\,\dfrac{u_{s,j+1} - u_{sj}}{l}$.
(v) But the $\theta$-variation of the slip velocity is smooth and therefore easily approximated.
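This means that the term $(u_{s,j+1} - u_{sj})/l$ will be close to $(1/a)\,\partial u_\theta/\partial\theta = -2u_\infty\cos\theta/a$ from the potential flow solution (3.19-17), at least for the unbounded domain. Hence,

$$-\frac{h}{2}\frac{\partial}{\partial\tau}\nabla^2 u_\tau\Big|_j \approx -\frac{u_\infty\cos\theta}{ah},$$

and we are basically finished.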
(vi) Thus,

$$\frac{\partial P}{\partial r} = -\frac{\partial P}{\partial n} = -\frac{\nu u_\infty\cos\theta}{a^2}\left(4 - \frac{a}{h}\right),$$

the first term ostensibly coming from 'physics' and the second from 'numerics' (truncation error); the key observation is that the two terms have the same $\theta$-dependent coefficient. (This is the 'good' news and causes the subtle result that the shape is correct. The bad news is that the 'error' term swamps the 'physics' term.)

So, for the impulsive start, and any Reynolds number, for a fine enough mesh, the PPE BC for $t < t_{MTB}$ will be

$$\frac{\partial P}{\partial r} = \frac{\nu u_\infty\cos\theta}{ah}, \tag{3.19-35}$$

which causes all of the bad behavior:

(i) The pressure diverges like $O(1/h)$ for $h \to 0$.
(ii) This boundary term will even swamp the source term on the RHS of the PPE [$-\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u})$] with the result that, for a sufficiently fine mesh, even large-Re simulations will generate an initial pressure field corresponding to steady Stokes flow (in 'shape' only), which is thus totally spurious. (The 'fine mesh' solution at and near $\theta = \pi/2$ may show the source term effect, however.)
(iii) Bad early-time drag coefficients, discussed below.

Thus the inability to get a step change right on any mesh for small time actually causes double damage for a NS simulation: both tangential acceleration and pressure are not accurately computable. Hence, we cannot (at least for the impulsive start) go to the $t \downarrow 0$ limit and obtain anything useful on any mesh. Only for $t > O(t_{MTB})$ does the numerical solution recover and become useful/believable. In fact, for the (more regular) $e^{-\lambda t}$ startup, it is interesting and ironic that the solution's accuracy starts 'good' and ends 'good', but is bad during a portion of the transient. (The potential flow IC, in contrast, is 'consistently' bad for all $t \lesssim t_{MTB}$.)

From the above analysis, we can argue that the numerical BC for the PPE on the cylinder during the early transient is given approximately by

$$\frac{\partial P}{\partial n} \approx \frac{\nu u_\infty\cos\theta}{a^2}\left(4 - \frac{a}{h}\right)\left(1 - e^{-\lambda t}\right), \tag{3.19-36}$$

whereas that at the domain's inlet is

$$-\frac{\partial P}{\partial x} = \frac{\partial P}{\partial n} = -\mathbf{n}\cdot\dot{\mathbf{u}} = \lambda u_\infty e^{-\lambda t}, \tag{3.19-37}$$

where $u_\infty = u_0$. The numerical solution can thus be accurate for 'small $t$' if $|(1 - e^{-\lambda t})\,\nu u_\infty\cos\theta/ah|$ is sufficiently small compared to $\lambda u_\infty e^{-\lambda t}$, which, at the stagnation points, for example, translates to

$$\frac{t}{\tau_{ac}} = \lambda t \ll \ln\left(\frac{\lambda a h}{\nu} + 1\right), \tag{3.19-38}$$
where '$\ll$' will be subject to 'interpretation'. For our parameters, we get $\ln((100 \times 1 \times 0.022)/0.0002 + 1) = 9.3$, which puts us right near the end of the transition period shown in Figure 3.19-9, and thus, as already mentioned, casts doubt on the validity of these pressures. However, because the logarithm with large argument is a slowly-varying function, it doesn't require much of a reduction below $\lambda t = 9.3$ to recover 'validity'; for example, at $\lambda t = 9.3$ we have $\lambda u_\infty e^{-\lambda t} = \nu u_\infty(1 - e^{-\lambda t})/ah = 9.1 \times 10^{-4}$, but at just one-half of this time, $\lambda t = 4.65$, we get $\lambda u_\infty e^{-\lambda t} = 0.096$, whereas $\nu u_\infty(1 - e^{-\lambda t})/ah$ is (still) only 0.00090, over two decades smaller than the 'competition.' The transition from acceleration-dominated to viscous-dominated occurs on a very short time scale, as we have seen in Figure 3.19-9.

Also noteworthy is a comparison of the true (unbounded domain) viscous BC term, $4\nu u_\infty\cos\theta/a^2$, with the acceleration term (at $\theta = 0$); equating them (at $\theta = 0$) gives $\lambda t = \ln(\lambda a^2/4\nu) = 11.7$, which is not much larger than 9.3. Thus, the deleterious truncation error effect 'merely' hastens (by a small amount here) the time at which the transition from potential flow to viscous flow occurs, though it is still true that the magnitude of the pressure will be in error for roughly $10 < \lambda t < \lambda t_{MTB}$, which we shall define as the window of nonbelievability; i.e., for

$$\tau_{ac}\ln\left(\frac{ah}{\nu\tau_{ac}}\right) < t < t_{MTB} = \frac{h^2}{4\nu}, \tag{3.19-39}$$

which is $0.10 < t < 0.60$ on this mesh, a rather small range. For $t > t_{MTB}$, the results from either method of startup are believable, and only for $t > t_{MTB}$ are the results from the potential flow + no-slip IC believable (i.e., accurate). Of course, if $t_{MTB} < \tau_{ac}\ln(ah/\nu\tau_{ac} + 1)$, the above argument breaks down. (This 'crossover' occurs at $h \approx 0.00815$, giving $t_{MTB} \approx 0.083$ for our parameters.) We also believe and assert that these results are general; any element, any numerical method, even spectral.
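The window of nonbelievability and the 'crossover' mesh size can be reproduced with a short calculation. The Python sketch below is illustrative only: the function names are hypothetical, the parameter defaults ($a = 1$, $\tau_{ac} = 0.01$) are those quoted in the text, and the crossover is located by plain bisection on $t_{MTB}(h) - \tau_{ac}\ln(ah/\nu\tau_{ac} + 1)$, which is one of several reasonable ways to find it.

```python
import math


def t_mtb(h, nu):
    """Minimum Time of Believability, h^2 / (4 nu)."""
    return h * h / (4.0 * nu)


def window_start(h, nu, a=1.0, tau_ac=0.01):
    """tau_ac * ln(a h / (nu tau_ac) + 1), the time scale of (3.19-38)."""
    return tau_ac * math.log(a * h / (nu * tau_ac) + 1.0)


def crossover_h(nu, a=1.0, tau_ac=0.01, lo=1e-4, hi=0.1, iters=200):
    """Bisection for the h at which t_mtb(h) equals window_start(h)."""
    def f(h):
        return t_mtb(h, nu) - window_start(h, nu, a, tau_ac)

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    nu, h = 2.0e-4, 0.022
    print("window:", window_start(h, nu), "< t <", t_mtb(h, nu))
    hc = crossover_h(nu)
    print("crossover h:", hc, " t_MTB there:", t_mtb(hc, nu))
```

For $h = 0.022$ the window comes out as roughly $0.09 < t < 0.6$, and the bisection lands near $h \approx 0.0082$, consistent with the values quoted above.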
At this point it seems reasonable to ask: Do these results generalize in any useful way? I.e., you've beaten to death a simple geometry with some known analytical solutions; what about arbitrary geometries for which even the potential flow solution is not known? And what about 3D? Our answers are as follows:

1. We believe that the error function analysis/argument extends 'point-wise' to other, more general geometries with curved streamlines (although we are unsure of the extra-hard effects caused by geometric singularities such as sharp corners).
2. We will present below a better (more general) model for the cylinder that leads to essentially the same results.
3. We therefore believe that the error term, $-(h/2)(\partial/\partial\tau)(\nabla^2 u_\tau)$, can always (for smooth geometries) be approximated by $(h/2)(\partial/\partial\tau)(u_s/h^2)$, where $u_s$ is the slip velocity of the associated potential flow solution; thus
4. The PPE BC is, for small $t$,

$$\frac{\partial P}{\partial n} = \nu\left[\mathbf{n}\cdot\nabla^2\mathbf{u}_P + \frac{1}{2h}\frac{\partial u_s}{\partial\tau}\right]\left(1 - e^{-\lambda t}\right), \tag{3.19-40}$$

where $\mathbf{u}_P$ is the potential flow solution.
And this is as far as we can take it. Since both $\mathbf{u}_P$ and $u_s$ are problem-specific and generally not known in closed form, all we say in general is that pressure will behave somewhat like $1/h$ for times between about $\tau_{ac}\ln(ah/\nu\tau_{ac})$ and $t_{MTB} = h^2/4\nu$. We believe too that if the potential flow is accurately computable on a fine enough mesh (as for the circular cylinder), then the shape of the two terms in (3.19-40) will be much the same, thus causing only the magnitude of the pressure (and, of course, the concomitant tangential acceleration, and the drag coefficient, presented and discussed later) to be spurious during the above time range. Finally, while we have not seriously addressed the 3D situation, we believe that it will not differ much.

3.19.8 A Better Model

The 'better' model for the cylinder case, alluded to above, was obtained from some of the (more general) results in Wang (1967) and Bar-Lev and Wang (1975), and we generated it in order to help explain some portions of the drag coefficient curves to be presented next. Using the method of matched asymptotic expansions, the above authors derived an equation for $u_\theta$, whose leading term (dominant for $t \downarrow 0$) is the one we need; namely,

$$u_\theta = -u_\infty\left[1 + \left(\frac{a}{r}\right)^2 - 2\,\mathrm{erfc}\left(\frac{r - a}{\sqrt{4\nu t}}\right)\right]\sin\theta, \tag{3.19-41}$$

from which, using $\nabla\cdot\mathbf{u} = 0$, we obtain the needed normal component:

$$u_r = u_\infty\left[1 - \left(\frac{a}{r}\right)^2 - 2\,\frac{r - a}{r}\,\mathrm{erfc}\left(\frac{r - a}{\sqrt{4\nu t}}\right) - \frac{2}{r}\sqrt{\frac{4\nu t}{\pi}}\left(1 - e^{-(r-a)^2/4\nu t}\right)\right]\cos\theta, \tag{3.19-42}$$

and we note the reassuring result (not obtainable from the simpler model presented earlier) that the potential flow solution is obtained for $r > a$ and $t = 0$; cf. (3.19-16) and (3.19-17). Evaluating

$$\mathbf{n}\cdot\nabla^2\mathbf{u} = -\frac{\partial}{\partial r}\left[\frac{1}{r}\frac{\partial}{\partial r}(r u_r)\right]$$

at $r = a$ gives the PPE BC:

$$\frac{\partial P}{\partial r} = \frac{2\nu u_\infty\cos\theta}{a^2}\left[\frac{a}{\sqrt{\pi\nu t}} - 1\right], \tag{3.19-43}$$

in which the second term is negligible for small $t$, and it is interesting to note that this second term is the only difference between this 'better' model and the simple one obtainable from (3.19-28) as $u_\theta = -2u_\infty\sin\theta\cdot\mathrm{erf}[(r - a)/\sqrt{4\nu t}]$.
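The consistency of the 'better' model can be checked numerically. The Python sketch below is a hedged illustration, not the authors' procedure: it assumes the forms $u_\theta = -u_\infty[1 + (a/r)^2 - 2\,\mathrm{erfc}(\xi)]\sin\theta$ and $u_r = u_\infty[1 - (a/r)^2 - 2\frac{r-a}{r}\mathrm{erfc}(\xi) - \frac{2}{r}\sqrt{4\nu t/\pi}\,(1 - e^{-\xi^2})]\cos\theta$ with $\xi = (r - a)/\sqrt{4\nu t}$ (our reading of (3.19-41) and (3.19-42)), uses hypothetical function names, and tests the divergence-free constraint and the wall gradient $(2\nu u_\infty\cos\theta/a^2)[a/\sqrt{\pi\nu t} - 1]$ by central differences.

```python
import math

A, U_INF, NU = 1.0, 0.1, 2.0e-4  # a, u_inf, nu as quoted in the text


def u_theta(r, th, t):
    """Tangential velocity, assumed form of (3.19-41)."""
    xi = (r - A) / math.sqrt(4.0 * NU * t)
    return -U_INF * (1.0 + (A / r) ** 2 - 2.0 * math.erfc(xi)) * math.sin(th)


def u_r(r, th, t):
    """Radial velocity, assumed form of (3.19-42)."""
    s = math.sqrt(4.0 * NU * t)
    xi = (r - A) / s
    bracket = (1.0 - (A / r) ** 2
               - 2.0 * (r - A) / r * math.erfc(xi)
               - 2.0 * s / (r * math.sqrt(math.pi)) * (1.0 - math.exp(-xi * xi)))
    return U_INF * bracket * math.cos(th)


def divergence(r, th, t, e=1e-6):
    """(1/r)[d(r u_r)/dr + d(u_theta)/d(theta)] by central differences."""
    d_rur = ((r + e) * u_r(r + e, th, t) - (r - e) * u_r(r - e, th, t)) / (2.0 * e)
    d_uth = (u_theta(r, th + e, t) - u_theta(r, th - e, t)) / (2.0 * e)
    return (d_rur + d_uth) / r


def dpdr_wall(th, t):
    """Wall pressure gradient, assumed form of (3.19-43)."""
    return (2.0 * NU * U_INF * math.cos(th) / A ** 2
            * (A / math.sqrt(math.pi * NU * t) - 1.0))


if __name__ == "__main__":
    print("divergence near the wall:", divergence(1.002, 0.6, 0.01))
    print("no-slip u_theta at r = a:", u_theta(A, 1.0, 0.01))
    print("dP/dr at theta = 0, t = 0.01:", dpdr_wall(0.0, 0.01))
```

The design choice here is to difference the velocity field rather than to trust a hand derivation: if the two velocity components were inconsistent, the computed divergence would not be small.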
It is now interesting to compare the PPE BC's for the impulsive start (potential flow + no slip) for $t < 0$, $t = 0$, and $t > 0$. Collecting previous results yields

(i) $\partial P/\partial r = 4u_\infty^2\sin^2\theta/a$ for $t < 0$ from (3.19-22);
(ii) $\partial P/\partial r = -4\nu u_\infty\cos\theta/a^2$ for $t = 0$ from (3.19-20); and
(iii) $\partial P/\partial r = 2\nu u_\infty\cos\theta/(a\sqrt{\pi\nu t})$ for $t > 0$ but small, from (3.19-43).
The existence and diffusion of the vortex sheet wreaks havoc with the pressure field, and it is interesting and significant to note that the unbounded pressure at $t = 0^+$ is 'caused' by the combination of a step change in tangential velocity, the div-free constraint with curved streamlines, and the PPE's BC, which is the normal momentum equation applied on $\Gamma_{cyl}$. [If an impulsive start were applied to a geometry for which the potential flow is basically unidirectional, as in the classical Rayleigh-Stokes problem or for a long triangular wedge with very small angle, the pressure field would not become unbounded as $t \downarrow 0$. Finally, we mention that a brief numerical experiment for flow past a square cylinder seemed to give $O(1/h^{0.2})$ behavior for small time.]

3.19.9 Drag Coefficients

For more 'surprises', we turn now to a brief(?) look at the drag coefficient (for Re = 1000),

$$C_D = F_x^T\big/\tfrac{1}{2}u_\infty^2 D \tag{3.19-44}$$

where $F_x^T$ is the total $x$-direction force on the (full) cylinder and is comprised of a pressure part (form drag) and a viscous part (friction drag); $F_x^T = F_x^P + F_x^\nu$, where

$$F_x^P = \int_{\Gamma_c} n_x P\,dl \tag{3.19-45}$$

and

$$F_x^\nu = -\nu\int_{\Gamma_c}\left[2n_x\frac{\partial u}{\partial x} + n_y\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right)\right]dl, \tag{3.19-46}$$

where $n_x = -\cos\theta$ and $n_y = -\sin\theta$. [Note that $\rho = 1$ in our case, and recall that $P$ is the kinematic pressure.] The first thing to point out is that, just as the pressure is necessarily in error at small time for an impulsive start, so too is the viscous part, and nearly for the same reason: the step change in tangential velocity. The vortex sheet causes $F_x^\nu \sim \nu/h$ in (3.19-46), again for both 9/3 and 4/1 elements.

Figure 3.19-17 shows some $C_D$ vs. time results for Re = 1000, which we present first and analyze later. The $C_D$ in Figure 3.19-17(a) is clearly dominated by the acceleration transient: it behaves like $C_D(t) = C_D(0)e^{-\lambda t}$, and it is virtually all 'pressure'. On the 'large' time scale ($t > 0.07$), it behaves as shown in Figure 3.19-17(b), about which we make three remarks:

1.
The minimum value of $\approx 1.0$ at $t = 4.5$ is 'too large' relative to the unbounded domain case [$C_{D,min} = 0.5$-$0.6$ at $t = 6$-$7$ from Chou and Huang (1996) and Chu et al. (1996)].
2. When we stretched the domain's top from $y_{max} = 2$ to $y_{max} = 3$ (thus doubling the constricted flow passage), our $C_{D,min}$ decreased to about 0.75 at $t = 6$, thus showing the proper trend.
3. The impulsive start produces results that would lie atop those shown, the difference being that it starts at $C_D = 3.5$ at $t = 0$ rather than $\approx 8000$.

Next, Figure 3.19-17(c) shows another small-time result, up to $20\tau_{ac}$, this time on two meshes and with both viscous and pressure contributions plotted separately. Fx denotes the total drag coefficient, Px the pressure contribution, and Fx − Px the viscous contribution; 16965 denotes the mesh (number of velocity nodes). We see the following behavior:
(i) the viscous contributions grow (for small $t$) from zero at $t = 0$ approximately like $(1 - e^{-\lambda t})$;
(ii) the viscous contribution varies, unfortunately, approximately like $O(1/h)$, and so does the pressure part, but only for $t \gtrsim 0.10 = 10\tau_{ac}$;
(iii) the smart integrator seems to be using rather large time steps, although the piecewise-linear plot in Figure 3.19-17(c) should really be 'faired in' with quadratic interpolation because we are using TR, which would make the curves look more presentable.

Finally, Figure 3.19-17(d) shows the same contributions to $C_D$, and $C_D$ itself, for the impulsive start case, from which we see: (i) $\sim 1/h$ behavior for small $t$, and (ii) convergence (recovery) for $t \gtrsim 2\tau_2$ or so, where $\tau_2 = h^2/4\nu = 2.4$ on the coarse mesh ($\tau_1 = 0.6$ on the fine mesh). Note too that, for $t \gtrsim 0.1 = 10\tau_{ac}$, these curves also closely describe the $e^{-\lambda t}$ startup.

Fig. 3.19-17 Drag coefficients ($C_D$ versus $t$).
(a) Small time behavior.
(b) Long time behavior ($t > 0.07$).
(c) More small time behavior, with more detail, on two grids (curves Fx, Px and Fx − Px for the 16965-node and 4323-node meshes).
(d) Impulsive start results on two grids (same curves).
Another empirical result from our many 'runs' is this: the initial (and spurious) drag coefficient for the impulsive start behaves like $C_D = 38\nu/u_\infty h$. And another is this: for $t \gtrsim t_{MTB}$, $C_D = (12/u_0)\sqrt{\nu/t}$, which can be compared with the theoretical result in an unbounded domain [see, for example, Collins and Dennis (1973b)], $C_D = (4\sqrt{\pi}/u_0)\sqrt{\nu/t}$; our bounded domain has increased the coefficient from $4\sqrt{\pi} = 7.1$ to 12. Also, in our case the pressure contributed $\approx 59\%$ of the total, whereas the theory for the unbounded case (which we shall soon present) predicts exactly 50%. Both of the above results are valid for any Re, and are shown graphically in Figure 3.19-18, in which the 'new' meshes will be discussed further below, as will the $C_D$ curves themselves.

The reason that the results are independent of Re (i.e., for a given $\nu$, it matters not whether the nonlinear advection terms are included) is as follows: the only difference between Stokes and Navier-Stokes is in the pressure (the initial velocity being that from potential flow for each); NS has a source term on the RHS of the PPE and Stokes does not. But the contribution of this source term is (basically) to add to the Stokes pressure field the pressure field for potential flow, which, of course, causes no drag. (In fact, when we subtracted the initial Stokes pressure from the initial NS pressure the result was indeed the potential flow pressure field.) This observation is true for both the continuum (at least up to the no-slip BC 'problem') and the discrete/FEM case, with the latter, however, being a 'victim' of the mostly spurious Stokes pressure (for $h/a \ll 1$), as shown so clearly in Figure 3.19-18.

Now that we have presented our sometimes-good, sometimes-bad numerical results, let us sit back and try to explain them.
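The empirical and theoretical drag estimates just quoted are easy to tabulate. The Python sketch below is a minimal illustration with hypothetical function names; it encodes the spurious initial value $C_D = 38\nu/u_\infty h$, the bounded-domain fit $(12/u_0)\sqrt{\nu/t}$ and the unbounded-domain result $(4\sqrt{\pi}/u_0)\sqrt{\nu/t}$ exactly as stated in the text, with the parameter values used in this section.

```python
import math


def cd_initial_spurious(nu, u_inf, h):
    """Empirical fit quoted in the text: C_D = 38 nu / (u_inf h)."""
    return 38.0 * nu / (u_inf * h)


def cd_unbounded_theory(nu, t, u0):
    """Unbounded-domain result: C_D = (4 sqrt(pi) / u0) sqrt(nu / t)."""
    return 4.0 * math.sqrt(math.pi) / u0 * math.sqrt(nu / t)


def cd_bounded_fit(nu, t, u0):
    """Empirical bounded-domain fit: C_D = (12 / u0) sqrt(nu / t)."""
    return 12.0 / u0 * math.sqrt(nu / t)


if __name__ == "__main__":
    nu, u0 = 2.0e-4, 0.1
    for h in (0.044, 0.022, 0.0024):
        print(f"h = {h:6.4f}  spurious initial C_D = {cd_initial_spurious(nu, u0, h):7.2f}")
    for t in (0.6, 2.4, 10.0):
        print(f"t = {t:5.1f}  bounded fit = {cd_bounded_fit(nu, t, u0):.3f}"
              f"  unbounded theory = {cd_unbounded_theory(nu, t, u0):.3f}")
```

For $h = 0.022$ the spurious initial value is about 3.5, the starting value mentioned for the impulsive start, and the ratio of the two small-time formulas is $12/4\sqrt{\pi} \approx 1.7$ at every $t$.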
[Fig. 3.19-18 More drag coefficients; impulsive start, small time, four grids. Log-log axes: $C_D$ (0.1 to 100) vs. $t$ (0.00001 to 1).]

We begin with an explanation of $C_D(0)$ for the rapid-start case, which, with special thanks to Renwei Mei, we can adequately explain with a fairly simple analytical model. Although our computed results employed a 'cartesian' domain, we could have used a 'polar' domain, enclosing our cylinder inside a larger
and concentric cylinder that defines the outer domain boundary; and this comprises our 'model', for which we can find analytical solutions for limiting cases. Thus, imagine that our domain is $a \le r \le R$ and consider first the rapid start-up of an inviscid flow in this domain. The plan is to find the pressure field associated with an accelerating potential flow and then to determine the resulting 'form' drag caused by this pressure. To this end, we first solve (with $\mathbf{u} = \nabla\phi$)

$$\nabla^2\phi = 0 \quad \text{in } a < r < R, \tag{3.19-47}$$

subject to

$$\partial\phi/\partial r = 0 \quad \text{at } r = a \tag{3.19-48}$$

and

$$\partial\phi/\partial r = u_0(1 - e^{-\lambda t})\cos\theta \quad \text{at } r = R, \tag{3.19-49}$$

the latter converting our rectilinear accelerating flow BC to polar coordinates. The solution to (3.19-47) to (3.19-49) is

$$\phi = u_0(1 - e^{-\lambda t})\cos\theta\,\frac{r + a^2/r}{1 - a^2/R^2} \tag{3.19-50}$$

and we note (for $t = \infty$) that $R \to \infty$ recovers the appropriate potential function for the unbounded domain, from which (3.19-16) and (3.19-17) were obtained. The Bernoulli equation,

$$\frac{\partial\phi}{\partial t} + P + \frac{q^2}{2} = \frac{u_\infty^2}{2}, \tag{3.19-51}$$

where $q = |\nabla\phi| = \sqrt{(\partial\phi/\partial r)^2 + ((1/r)\,\partial\phi/\partial\theta)^2}$ and $u_\infty = u_0(1 - e^{-\lambda t})$, gives the pressure:

$$P = \frac{u_\infty^2(t)}{2}\,\frac{(a^2/r^2)\,(2\cos 2\theta - a^2/r^2) - (a^2/R^2)\,(2 - a^2/R^2)}{(1 - a^2/R^2)^2} - \lambda u_0 e^{-\lambda t}\,\frac{(r + a^2/r)\cos\theta}{1 - a^2/R^2}, \tag{3.19-52}$$

which, for both $t \to \infty$ and $R \to \infty$, recovers (appropriately) the pressure field for steady unbounded potential flow past a cylinder; cf. (3.19-1). The resulting drag force is given by ($n_x = -\cos\theta$)

$$F_x^P = 2\int_0^\pi P|_{r=a}\cos\theta\,a\,d\theta \tag{3.19-53}$$

and the drag coefficient by $C_D^P = F_x^P/(\tfrac12 u_0^2\cdot 2a)$. Since the first term in (3.19-52) is a potential flow pressure, it contributes naught to the drag, à la d'Alembert. But the second term gives

$$C_D^P = \frac{\pi(a + a)\lambda e^{-\lambda t}}{u_0(1 - a^2/R^2)} = C_D(0)\,e^{-\lambda t} \tag{3.19-54}$$

and we have obtained an analytical estimate of $C_D(0)$; we shall 'explain' the factor '$a + a$', which came from $r + a^2/r$ in (3.19-50), later. Recall that, experimentally, we obtained $C_D(0) \approx 8000$ [see Figure 3.19-17(a)] so that, presuming that it were obtained
on our new, annular, domain, we can determine the requisite size of this domain; i.e. solving (3.19-54) for $R$ gives

$$R = \frac{a}{\sqrt{1 - 2\pi a\lambda/(u_0 C_D(0))}} = \frac{1}{\sqrt{1 - 2\pi\times 100/800}} \approx 2.16,$$

which seems quite reasonable; e.g. it gives, via (3.19-50), a maximum velocity (at $\theta = \pi/2$, $r = a$ and $t = \infty$) of $\sim 2.53u_0$ and a total flux ($\int_a^R u_\theta(r, \pi/2)\,dr$) of $\sim 2.17u_0$, vis-à-vis our computed values on the 'cartesian' mesh of $\sim 2.75u_0$ and $2u_0$, respectively. (For the 'record', an unbounded domain gives $C_D(0) = 2\pi a\lambda/u_0$, or 6283 for our parameters.) Later, we shall further interpret these results in terms of 'added mass', but for now we stop, with the following [Exercise for the reader: Re-derive the above results in a different-but-equivalent way, using the PPE and the appropriate Neumann BC's].

Before we actually 'stop', we wish to note that (3.19-52), rewritten as

$$P = -\phi(r,\theta)\,\lambda e^{-\lambda t} + P_{pot}(r,\theta)\,(1 - e^{-\lambda t})^2, \tag{3.19-55}$$

where now

$$\phi(r,\theta) = \frac{u_0(r + a^2/r)\cos\theta}{1 - a^2/R^2} \tag{3.19-56}$$

and

$$P_{pot}(r,\theta) = \frac{u_0^2}{2}\,\frac{(a^2/r^2)\,(2\cos 2\theta - a^2/r^2) - (a^2/R^2)\,(2 - a^2/R^2)}{(1 - a^2/R^2)^2}, \tag{3.19-57}$$

which is uniformly valid in time, can actually be taken to the limit of $\lambda \to \infty$ (an impulsive start-up from rest of an inviscid fluid):

$$P = -\phi(r,\theta)\,\delta(t) + P_{pot}(r,\theta)\cdot H(t), \tag{3.19-58}$$

where $\delta(t) = \lim_{\lambda\to\infty}\lambda e^{-\lambda t}$ is a 'generalized' function that we shall call a Dirac delta function ($\int_0^\infty \delta(t)\,dt = 1$ and $\int_0^\infty f(t)\,\lambda e^{-\lambda t}\,dt = f(0)$ for $\lambda \to \infty$) and $H(t)$ is a version of the Heaviside step function [$H(t) = 0$ for $t \le 0$ and $H(t) = 1$ for $t > 0$]. Thus, it is now even more clear that the role played by the accelerating inlet flow, $u_0(1 - e^{-\lambda t})$, is to quickly accelerate the flow from rest, via the potential function, to a potential flow, in 'ein Augenblick'. To do so in 'zero time' of course requires an infinite pressure, briefly. Note too that $\mathbf{n}\cdot\mathbf{u}$ is not continuous in this limiting case, thus 'violating' incompressibility (again, briefly), which 'explains' the ill-posedness that we mentioned at the beginning.
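The arithmetic behind the equivalent annular domain is easy to repeat. The following is a minimal sketch that inverts (3.19-54) for $R$; the function names are illustrative only, and the values $a = 1$, $u_0 = 0.1$ are an assumption (the text's numbers imply only $a/u_0 = 10$, which is consistent with Re = 1000 and $\nu = 0.0002$ if Re is based on the diameter).

```python
import math

def cd0_annulus(a, R, lam, u0):
    """Initial acceleration drag coefficient, (3.19-54) evaluated at t = 0."""
    return 2.0 * math.pi * a * lam / (u0 * (1.0 - (a / R) ** 2))

def outer_radius(a, lam, u0, cd0):
    """Invert (3.19-54) for the outer radius R that reproduces a given C_D(0)."""
    cd0_unbounded = 2.0 * math.pi * a * lam / u0
    if cd0 <= cd0_unbounded:
        raise ValueError("C_D(0) must exceed the unbounded-domain value")
    return a / math.sqrt(1.0 - cd0_unbounded / cd0)

# Assumed parameter values (only a/u0 = 10 is implied by the text).
a, u0, lam = 1.0, 0.1, 100.0
cd0_observed = 8000.0            # read from Figure 3.19-17(a)

cd0_inf = 2.0 * math.pi * a * lam / u0
R = outer_radius(a, lam, u0, cd0_observed)
print(f"unbounded-domain C_D(0)   = {cd0_inf:.0f}")   # 6283
print(f"equivalent outer radius R = {R:.2f}")         # 2.16
```

Feeding the computed $R$ back into `cd0_annulus` returns the observed value of 8000, which is a quick consistency check on the inversion.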
3.19.10 A New Analytical Model

Much of what has just been said applies equally well (or at least nearly equally well) to a viscous, no-slip fluid, to which we now turn our attention. We have also generated a useful model for the 'real' situation. It begins with a 1D model for the viscous boundary
layer (VBL) that resides beneath an accelerating potential flow (the 'outer' solution), and concludes with a 2D representation of the velocity and pressure, in an unbounded domain, for 'simplicity'. Just as we invoked a 1D model via the transient heat equation for the impulsive start case, cf. (3.19-27) et seq., so too we begin with one appropriate to the $e^{-\lambda t}$ case (with $y \to x$); viz., solve

$$\frac{\partial\theta}{\partial t} = \nu\frac{\partial^2\theta}{\partial x^2} + T_0\lambda e^{-\lambda t} \quad \text{for } x > 0, \tag{3.19-59}$$

with

$$\theta(0,t) = \theta(x,0) = 0, \tag{3.19-60}$$

which models, via the source term, an accelerating 'flow' field above the VBL (i.e. for $x \gg \Delta(t) = \sqrt{4\nu t}$) and a viscous, no-slip flow beneath it. Clearly for $x \gg \Delta$, the solution to (3.19-59, 60) is simply $\theta = T_0(1 - e^{-\lambda t})$. But to obtain the full solution, we transform this problem to one that has already been 'solved' (p. 64 of Carslaw and Jaeger 1959). Letting $\theta = T_0(1 - e^{-\lambda t}) - T$ yields the following IBVP for $T$:

$$\frac{\partial T}{\partial t} = \nu\frac{\partial^2 T}{\partial x^2} \quad \text{for } x > 0, \tag{3.19-61}$$

$$T(x,0) = 0, \tag{3.19-62}$$

and

$$T(0,t) = T_0(1 - e^{-\lambda t}). \tag{3.19-63}$$

The solution to this problem and (especially) the ensuing asymptotic analysis was obtained by A.C. Hindmarsh (whom we thank again; he is as much at home in the complex plane as is PMG in his own backyard); in terms of $\theta$ it is

$$\theta = T_0\left\{(1 - e^{-\lambda t}) - \operatorname{erfc}\!\left(x/\sqrt{4\nu t}\right) + e^{-x^2/4\nu t}\cdot\operatorname{Re}[w(z)]\right\}, \tag{3.19-64}$$

where

$$z = \sqrt{\lambda t} + i\,x/\sqrt{4\nu t} \tag{3.19-65}$$

and the complex function $w(z)$ is given by

$$w(z) = e^{-z^2}\left(1 + \frac{2i}{\sqrt{\pi}}\int_0^z e^{s^2}\,ds\right). \tag{3.19-66}$$

We shall make good use of some asymptotic approximations to this solution (viz., one for $x \gg \sqrt{4\nu t}$, one for $\lambda t \ll 1$, and yet another for $x \ll \sqrt{4\nu t}$) after converting it to the form needed for flow past a cylinder. But before doing even this, we digress briefly to note the existence of a common solution to two different IBVP's, the first given by (3.19-59) and (3.19-60), for $\lambda \to \infty$ [i.e. for $\lambda e^{-\lambda t} \to \delta(t)$, the source term becoming a Dirac 'burst'], and the second by

$$\frac{\partial\Theta}{\partial t} = \nu\frac{\partial^2\Theta}{\partial x^2} \quad \text{for } x > 0, \tag{3.19-67}$$
$$\Theta(x,0) = T_0, \tag{3.19-68}$$

$$\Theta(0,t) = 0; \tag{3.19-69}$$

i.e., a simple step change. The 'common solution' (an interesting result in its own right) is

$$\theta(x,t) = \Theta(x,t) = T_0\operatorname{erf}\!\left(x/\sqrt{4\nu t}\right), \tag{3.19-70}$$

which could prove useful (it did for us) when trying to model (and understand) an impulsive start via $\lambda \to \infty$.

Now the true fluid-mechanical model that we propose to better understand the rapid-start case is simply the following:

$$u_\theta = -2u_0\sin\theta\left\{(1 - e^{-\lambda t}) - \operatorname{erfc}\!\left(\frac{r-a}{\sqrt{4\nu t}}\right) + e^{-(r-a)^2/4\nu t}\cdot\operatorname{Re}[w(z)]\right\}, \tag{3.19-71}$$

where now $z = \sqrt{\lambda t} + i(r - a)/\sqrt{4\nu t}$; i.e. we have 'borrowed' the 1D solution and converted it to one which, for $\nu = 0$ and $r = a + \varepsilon$ where we must have $\varepsilon \ll a$, describes the tangential component of a growing potential flow (small $\varepsilon$ is required so that a 'cartesian' solution can apply to a cylindrical geometry). Note too that (3.19-71) gives $u_\theta = 0$ at $r = a$ and at $t = 0$, as desired. After exploring the asymptotics of this solution, we will invoke $\nabla\cdot\mathbf{u} = 0$ to obtain the corresponding radial velocity derivative which in turn will be used to study the PPE's BC on the cylinder. Ultimately, we shall present an analytical solution for the pressure field, and the drag. For $a \gg r - a \gg \sqrt{4\nu t}$ and $t > 0$, (3.19-71) becomes, approximately,

$$u_\theta = -2u_0(1 - e^{-\lambda t})\sin\theta, \tag{3.19-72}$$

which describes the accelerating potential flow just outside of the VBL, yet still close to the cylinder. Next, we need a 'small time' approximation; for $\lambda t \ll 1$, (3.19-71) becomes

$$u_\theta = -2u_0\lambda t\sin\theta\left\{1 - \left[1 + \frac{(r-a)^2}{2\nu t}\right]\operatorname{erfc}\!\left(\frac{r-a}{\sqrt{4\nu t}}\right) + \frac{r-a}{\sqrt{\pi\nu t}}\cdot e^{-(r-a)^2/4\nu t}\right\}, \tag{3.19-73}$$

which displays a 'reduced' boundary layer thickness, $\lambda t\sqrt{4\nu t}$, and is valid for all $r$, $t$, though we still want $r - a$ small. Finally, for $r - a \ll \sqrt{4\nu t}$, 'we' (ACH) obtain from (3.19-71) the all-important approximation deep within the VBL:
$$u_\theta \approx -4u_0\sin\theta\cdot\frac{r-a}{\sqrt{\pi\nu t}}\cdot\sqrt{\lambda t}\,D(\sqrt{\lambda t}) = -4u_0\sin\theta\cdot\frac{r-a}{\sqrt{\pi\nu\tau_{ac}}}\,D(\sqrt{\lambda t}), \tag{3.19-74}$$

valid for all $\lambda t$, where $D(\cdot)$ is Dawson's integral, $D(y) = e^{-y^2}\int_0^y e^{s^2}\,ds$ (see, e.g., Abramowitz and Stegun 1965, p. 297), whose properties of most interest here are these: (i) for
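$y \ll 1$, $D(y) \approx y$, (ii) for $y \gg 1$, $D(y) \approx 1/2y$, and (iii) for $y \approx 0.92$, $D(y)$ attains a maximum of $D(0.92) \approx 0.54$. (In fact, $D(y)$ somewhat resembles a more familiar function, $ye^{-y^2}$.) Thus, for $\lambda t \ll 1$, (3.19-74) describes an accelerating velocity in the VBL,

$$u_\theta \approx -4u_0\sin\theta\,\frac{r-a}{\sqrt{\pi\nu t}}\,\lambda t = -4u_0\sin\theta\,\frac{r-a}{\sqrt{\pi\nu\tau_{ac}}}\,\sqrt{\lambda t}, \tag{3.19-75}$$

and for $\lambda t \gg 1$ it gives a decelerating one:

$$u_\theta = -2u_0\sin\theta\,\frac{r-a}{\sqrt{\pi\nu t}}, \tag{3.19-76}$$

which, it should be noted, is both independent of $\lambda$ and is an appropriate approximation to the step function solution utilized earlier, in (3.19-41) for $r - a \ll \sqrt{4\nu t}$ and $r - a \ll a$ there. Finally, at $\sqrt{\lambda t} \approx 0.92$, at which point $e^{-\lambda t}$ is only about 43% of its starting value, $u_\theta$ within the VBL peaks (in time) at

$$u_\theta = -2.16u_0\sin\theta\,\frac{r-a}{\sqrt{\pi\nu\tau_{ac}}}, \tag{3.19-77}$$

after which time viscosity 'wins out' over acceleration. (Note that since this equation only applies for $r - a \ll \sqrt{\pi\nu\tau_{ac}}$, $\tau_{ac} \to 0$ ($\lambda \to \infty$) $\Rightarrow r \to a$.)

To obtain the associated BC for the PPE, we need only the VBL approximation, (3.19-74). Thus, we insert $u_\theta$ from (3.19-74) into $\nabla\cdot\mathbf{u} = \frac{1}{r}[\partial(ru_r)/\partial r + \partial u_\theta/\partial\theta] = 0$ to obtain $\partial(ru_r)/\partial r = -\partial u_\theta/\partial\theta$, which we insert into the Neumann BC on the cylinder, cf. (3.19-19), to obtain

$$\left.\frac{\partial P}{\partial r}\right|_a = -\nu\left.\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial u_\theta}{\partial\theta}\right)\right|_a = \frac{4\nu u_0\cos\theta}{a\sqrt{\pi\nu\tau_{ac}}}\,D(\sqrt{\lambda t}), \tag{3.19-78}$$

which, for small time ($\lambda t \ll 1$), becomes

$$\left.\frac{\partial P}{\partial r}\right|_a = \frac{4\nu u_0\cos\theta}{a\sqrt{\pi\nu\tau_{ac}}}\,\sqrt{\lambda t} \tag{3.19-79}$$

and for large time ($\lambda t \gg 1$) becomes

$$\left.\frac{\partial P}{\partial r}\right|_a = \frac{2\nu u_0\cos\theta}{a\sqrt{\pi\nu t}} \tag{3.19-80}$$

and, recalling the additional restriction, $a \gg \sqrt{\pi\nu t}$, we see that, in some sense, we are still restricted to 'small' time even when $\lambda t \gg 1$. Finally, for $\sqrt{\lambda t} \approx 0.92$ ($t \approx 0.85\tau_{ac}$), we obtain the maximum value of $\partial P/\partial r$:

$$\left.\frac{\partial P}{\partial r}\right|_a = \frac{2.16\,\nu u_0\cos\theta}{a\sqrt{\pi\nu\tau_{ac}}}, \tag{3.19-81}$$

which becomes unbounded as $\lambda \to \infty$ ($\tau_{ac} \to 0$). The viscous BC for the PPE will cause unbounded pressure in the limit $\lambda \to \infty$, but only at $t = 0^+$; it (the 'viscous' pressure) is still zero at $t = 0$ because $\mathbf{u} = 0$ then.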
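The three properties of Dawson's integral quoted above, and the constants 0.92, 0.54, 43% and 2.16 that follow from them, can be checked numerically. The sketch below is ours, not the authors': it evaluates $D(y)$ by plain Simpson quadrature using only the standard library (a library routine for Dawson's integral could be substituted), and the function name and scan range are illustrative choices.

```python
import math

def dawson(y, n=400):
    """Dawson's integral D(y) = exp(-y^2) * int_0^y exp(s^2) ds (Simpson's rule).

    n must be even. The integrand is written as exp(s^2 - y^2) so that no
    large intermediate values appear.
    """
    if y == 0.0:
        return 0.0
    h = y / n
    total = math.exp(-y * y) + 1.0              # end points s = 0 and s = y
    for i in range(1, n):
        s = i * h
        total += (4.0 if i % 2 else 2.0) * math.exp(s * s - y * y)
    return total * h / 3.0

# Locate the maximum by a fine scan over 0.5 <= y <= 1.5.
ys = [0.5 + i * 0.001 for i in range(1001)]
y_max = max(ys, key=dawson)
d_max = dawson(y_max)

print(f"small y : D(0.01)/0.01   = {dawson(0.01) / 0.01:.4f}")    # close to 1
print(f"large y : 2*10*D(10)     = {20.0 * dawson(10.0):.3f}")     # close to 1
print(f"maximum : D({y_max:.3f}) = {d_max:.3f}")
print(f"exp(-lambda t) at peak   = {math.exp(-y_max ** 2):.2f}")
print(f"4 * D_max                = {4.0 * d_max:.2f}")
```

The last line is the origin of the coefficient 2.16 in (3.19-77) and (3.19-81): it is simply four times the maximum of $D$.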
Considering, in addition to the above model and PPE BC, the accelerating potential flow away from the cylinder, leads us to propose the following small-time pressure solution to the problem of accelerating flow from rest past a circular cylinder:

$$P = -\phi(r,\theta)\,\lambda e^{-\lambda t} + \frac{u_0^2}{2}\,\frac{a^2}{r^2}\left(2\cos 2\theta - \frac{a^2}{r^2}\right)(1 - e^{-\lambda t})^2 - \frac{4\nu u_0\cos\theta}{\sqrt{\pi\nu\tau_{ac}}}\,\frac{a}{r}\,D(\sqrt{\lambda t}), \tag{3.19-82}$$

where $\phi = u_0(r + a^2/r)\cos\theta$ is the potential function, and the second term accounts for the source term in the PPE ($\nabla^2 P = -\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = -\tfrac12\nabla^2 q^2$) that is given by the accelerating potential flow [$\mathbf{u} = \nabla\phi\cdot(1 - e^{-\lambda t})$ and $q = |\mathbf{u}|$]. The only hesitancy/uncertainty that we still harbor regarding this putative solution is that it seems to imply a slippery solution on the cylinder; i.e.

$$\left.\frac{\partial P}{\partial r}\right|_a = \frac{u_\theta^2}{a} + \frac{4\nu u_0\cos\theta}{a\sqrt{\pi\nu\tau_{ac}}}\cdot D(\sqrt{\lambda t}), \tag{3.19-83}$$

where $u_\theta^2/a = -\tfrac12\,\partial(q^2)/\partial r = 4u_0^2\sin^2\theta/a$ at $r = a$ from (3.19-17) rather than $u_\theta = 0$ there. Perhaps it is best to suggest that the potential flow part of the pressure be only applied 'outside' the VBL, though it is noteworthy that others have also included the potential flow part on the cylinder [e.g. C-Y Wang (1968), Bar-Lev and Yang (1975), Bentwich and Miloh (1982)]. It is clear that (3.19-83) is valid in the two limits, $\nu \to 0$ and $\nu \to \infty$ (transient Stokes flow, for which we probably need to have $\lambda \sim \nu$; or else just drop the advection term). In any event, we believe that it is important to emphasise that the last term in (3.19-82) is a 'viscous-generated' pressure that is realized (only) for incompressible flow by the need to respect the omnipotent constraint, $\nabla\cdot\mathbf{u} = 0$, via the Neumann BC for the PPE. The $\lambda \to \infty$ limit of (3.19-82) is obviously also of much interest; it is

$$P = -\phi(r,\theta)\,\delta(t) + \frac{u_0^2}{2}\,\frac{a^2}{r^2}\left(2\cos 2\theta - \frac{a^2}{r^2}\right)\cdot H(t) - \frac{2\nu u_0\cos\theta}{\sqrt{\pi\nu t}}\,\frac{a}{r}, \tag{3.19-84}$$

showing that, in spite of viscosity, the potential 'burst' at $t = 0$ still generates a potential flow (with vortex sheet) at $t = 0^+$.
Note that, like the potential flow portion, the viscous portion only applies for $t > 0$ [recalling the shape of $D(\sqrt{\lambda t})$]; it is still zero at $t = 0$. Note too that, unlike (3.19-55) which applies uniformly in time, (3.19-84) is still restricted to small time, via $\sqrt{\pi\nu t} \ll a$. The pressure portion of the drag coefficient from (3.19-82) is

$$C_D^P = \frac{2\pi a}{u_0}\,\lambda e^{-\lambda t} + \frac{4}{u_0}\sqrt{\frac{\pi\nu}{\tau_{ac}}}\,D\!\left(\sqrt{\frac{t}{\tau_{ac}}}\right), \tag{3.19-85}$$

and comes from the first and last terms, the middle term contributing nothing. The viscous component of the drag is also computable from our model as follows:

$$F_x^V = -2\int_0^\pi(\tau_{rr}\cos\theta + \tau_{r\theta}\sin\theta)\,a\,d\theta \tag{3.19-86}$$
$$= -2\nu\int_0^\pi\left[2\frac{\partial u_r}{\partial r}\cos\theta + \left(\frac{\partial u_\theta}{\partial r} - \frac{u_\theta}{r} + \frac{1}{r}\frac{\partial u_r}{\partial\theta}\right)\sin\theta\right]a\,d\theta \tag{3.19-87}$$

evaluated at $r = a$, for which it simplifies to

$$F_x^V = -2a\nu\int_0^\pi\frac{\partial u_\theta}{\partial r}\sin\theta\,d\theta \tag{3.19-88}$$

since $\partial u_r/\partial r$, $\partial u_r/\partial\theta$, and $u_\theta$ all vanish at $r = a$. Inserting $u_\theta$ from (3.19-74) gives

$$F_x^V \approx \frac{8a\nu u_0}{\sqrt{\pi\nu\tau_{ac}}}\,D(\sqrt{\lambda t})\int_0^\pi\sin^2\theta\,d\theta \tag{3.19-89}$$

and thus to a viscous drag coefficient of

$$C_D^V = \frac{4}{u_0}\sqrt{\frac{\pi\nu}{\tau_{ac}}}\,D(\sqrt{\lambda t}), \tag{3.19-90}$$

which is clearly identical to the viscous (Stokes) part of the pressure drag coefficient, from (3.19-85); and we are not the first ones to notice this equality, at least for the limit case of $\lambda \to \infty$ for which our results agree with those of previous investigators. (We may be the first to find it for the $e^{-\lambda t}$ startup case.) Anyway, for small time ($0 < \lambda t \ll 1$), we find

$$C_D^P - \frac{2\pi a\lambda}{u_0} = C_D^V = 4\lambda\frac{\sqrt{\pi\nu t}}{u_0}, \tag{3.19-91}$$

corresponding to a constant acceleration of the far-field flow and an accelerating flow in the VBL. For large $\lambda t$, we obtain

$$C_D^P = C_D^V = \frac{2}{u_0}\sqrt{\frac{\pi\nu}{t}}, \tag{3.19-92}$$

which corresponds to a decelerating flow in the VBL and agrees with that from many earlier investigators for the impulsive start case, beginning (probably) with Blasius (1908); e.g. Goldstein & Rosenhead (1938), who erroneously argued that the pressure drag is zero, Collins and Dennis (1973), Wang (1968), Bar-Lev and Yang (1975), and Bentwich and Miloh (1982), although their result (perhaps a 'typo') is four-fold smaller. The two drag coefficients peak at $\sqrt{\lambda t} \approx 0.92$, at

$$C_D^P(\max) = C_D^V(\max) \approx \frac{2.16}{u_0}\sqrt{\frac{\pi\nu}{\tau_{ac}}}, \tag{3.19-93}$$

which clearly become unbounded (at $t = 0^+$) for $\lambda \to \infty$. Again, however, we point out that $C_D$ from both the viscous part of the pressure and shear is zero at $t = 0$. Only the acceleration drag is present at $t = 0$, impulsively for $\lambda \to \infty$. Returning to (3.19-85), it is interesting to compare the acceleration drag to the viscous/pressure drag.
Thus, equating $2\pi a\lambda e^{-\lambda t}/u_0$ and $8\sqrt{\pi\nu/\tau_{ac}}\,D(\sqrt{\lambda t})/u_0$ gives, for $\lambda t \gg 1$, the time of equality: $\lambda te^{-\lambda t} = 2\sqrt{\nu t/\pi}/a$; note that the RHS is a sort of 'reduced' boundary layer thickness, and thus both sides must be small. For our parameters, this equation yields $\lambda t \approx 7.44$ (7+ time constants), which makes us recall the rapid transition observed between $\sim 5\tau_{ac}$ and $10\tau_{ac}$; see Figure 3.19-9, equation (3.19-38), and related discussion, including that
following (3.19-39). This really is a transition time! [Increasing $\lambda$ to $10^4$ changes the 'crossover' time only slightly, to $\lambda t = 9.9$, and $\lambda = 10^8$ only increases it to $\lambda t = 14.7$.]

To finish our analysis, which we believe has already tied up most of the loose ends with respect to our numerical results, we return to the drag coefficient for the bounded domain inviscid model, (3.19-54), and (especially) to the pressure from which it came, (3.19-52), and we remark that the following discussion applies as well to the unbounded ($R = \infty$) domain. The last term in (3.19-52) can be rewritten (generalized) as

$$P_{accel} = -\rho\,\dot u_\infty(t)\,(r + a^2/r)\cos\theta/(1 - a^2/R^2); \tag{3.19-94}$$

and we have converted from kinematic pressure to true pressure, for reasons that will become clear soon. $P_{accel}$ is the pressure field generated by the accelerating potential flow, and it comprises two separately identifiable terms: $P_{accel} = P_{FS} + P_{AM}$, where

$$P_{FS} = -\rho\,\dot u_\infty(t)\,r\cos\theta/(1 - a^2/R^2) = -\rho\,\dot u_\infty(t)\,x/(1 - a^2/R^2); \tag{3.19-95}$$

i.e.

$$\partial P_{FS}/\partial x = -\rho\,\dot u_\infty(t)/(1 - a^2/R^2) \tag{3.19-96}$$

is the (uniform) pressure gradient caused by an accelerating free stream (FS) velocity. [After setting $R = \infty$, (3.19-95) also describes the pressure distribution in an accelerating incompressible solid.] The remaining term

$$P_{AM} = -\rho\,\frac{\dot u_\infty(t)\,a^2\cos\theta}{r(1 - a^2/R^2)} \tag{3.19-97}$$

is the pressure generated by the so-called added mass effect; it accounts for the acceleration of that mass of fluid caused by the need to 'flow around' the cylinder. [Lovalenti and Brady (1993), who also present a most detailed study of particle dynamics, state the following about added mass: 'It represents the additional mass the particle appears to have due to the resistance to acceleration of the surrounding fluid'.] The added mass will be made more clear after we point out the following fact: It turns out that the circular cylinder is a sort of special case with respect to 'added mass' in that both $P_{FS}$ and $P_{AM}$ are identical on the cylinder ($r = a$).
For a sphere, $P_{AM} = \tfrac12 P_{FS}$, and for other shapes the ratio ($P_{AM}/P_{FS}$) can vary from near zero (very thin elliptical cylinders or very prolate spheroids) to very large values (very thick elliptical cylinders or very oblate spheroids); see, e.g. Figure 16.1 in Vogel (1994), Clift et al. (1978) and Sarpkaya and Isaacson (1981). In any case, however, the added mass factor ($\gamma = P_{AM}/P_{FS}$) is no harder to compute than the potential flow field; i.e., it is determined by the shape of the object (with $\partial\phi/\partial n = 0$ on the surface) and $\nabla^2\phi = 0$, with $\partial\phi/\partial x \to u_\infty(t)$ for $r \to \infty$. The drag force from the accelerating potential flow is obtained as usual, via $F_x = 2\int_0^\pi P\cos\theta\,a\,d\theta$ where $P = P_{accel}(a, \theta)$, and we see from (3.19-94) that $r + a^2/r = a + a$ for $r = a$, as noted earlier; we note another 'equality' for a cylinder, the first being the early time equality of viscous and pressure drag. (For other shapes, this factor would probably become something like $r + \gamma l^2/r$ (in 2D), at least for large $r$, where $l$ is a characteristic dimension of the obstacle and $\gamma$ is a scalar. See, e.g. Batchelor (1967), in
which the added mass factor is generalized from a scalar to a second rank tensor.) Thus

$$F_x = 2\pi a^2\rho\,\dot u_\infty/(1 - a^2/R^2), \tag{3.19-98}$$

where (for our case) $\dot u_\infty = \lambda u_0e^{-\lambda t}$, is the force required to hold the cylinder in place against the accelerating potential flow.

To conclude our added mass 'digression', we ask the seemingly simple question, which will also show why we needed to re-introduce the fluid's density: Suppose we release the cylinder in this 2D flow field (in zero gravity); what is its initial acceleration? The naive response is $F_x/(\rho_s\pi a^2)$ where $\rho_s$ is the (solid) cylinder's density and $\pi a^2$ is its volume (per unit length). The correct response, however, is $(\rho_s + \rho)\pi a^2\dot u_s = F_x = 2\pi a^2\rho\,\dot u_\infty/(1 - a^2/R^2)$, or

$$(\rho_s/\rho + 1)\,\dot u_s = 2\dot u_\infty/(1 - a^2/R^2) \tag{3.19-99}$$

because the accelerating cylinder also 'causes' acceleration of surrounding fluid, in this case a mass of fluid that is contained in the 'volume' of the cylinder, because $\gamma = 1$. For a more general geometry, (3.19-99) generalizes (for $R = \infty$) to

$$(\rho_s/\rho + \gamma)\,\dot u_s = (1 + \gamma)\,\dot u_\infty; \tag{3.19-100}$$

there is an added mass effect on both sides of $f = ma$. (Note the 'compatibility' of this result if $\rho_s = \rho$, and note too, per Landau and Lifshitz (1959) who also generalize the result to arbitrary bodies, that time integration gives the cylinder's velocity for all time.) A practical and real life example of added mass effects on a moving object is provided by Vogel (1994): 'Perhaps the most extreme case so far uncovered occurs in the escape response of a crayfish. It flexes tail and abdomen and goes rearward with a maximum acceleration of 51 m/sec$^2$. Drag turns out to be only around 10% of the resistance, with 90% caused by the masses of crayfish and water, as is reasonable for a high acceleration to a fairly low final speed.' And this is as far as we wish to take it, except to state that these inviscid results carry over without change to viscous flows.
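The two 'added mass' statements for the cylinder, that the free-stream and added-mass pressures contribute equally to the force (3.19-98), and that a released body accelerates according to (3.19-100), can be illustrated with a few lines of code. This is a sketch under stated assumptions: the function names are ours, and the sign convention (angle measured from the $+x$ axis, so that the pressure force on the body is $-\oint P\cos\theta\,a\,d\theta$) is our choice for the illustration.

```python
import math

def accel_drag(a, R, rho, udot_inf, n=2000):
    """x-force on the cylinder from P_accel of (3.19-94), by midpoint quadrature.

    Returns (F_FS, F_AM): the free-stream and added-mass contributions.
    Assumed convention: theta is measured from the +x axis, so the pressure
    force on the body is  -oint P cos(theta) a dtheta.
    """
    eps = 1.0 - (a / R) ** 2
    f_fs = f_am = 0.0
    dth = 2.0 * math.pi / n
    for i in range(n):
        th = (i + 0.5) * dth
        p_fs = -rho * udot_inf * a * math.cos(th) / eps             # (3.19-95) at r = a
        p_am = -rho * udot_inf * a ** 2 * math.cos(th) / (a * eps)  # (3.19-97) at r = a
        f_fs += -p_fs * math.cos(th) * a * dth
        f_am += -p_am * math.cos(th) * a * dth
    return f_fs, f_am

def released_acceleration(density_ratio, udot_inf, gamma=1.0):
    """Initial acceleration of a freely released body, from (3.19-100) with R = infinity.

    density_ratio is rho_s / rho; gamma is the added mass factor (1 for a cylinder).
    """
    return (1.0 + gamma) * udot_inf / (density_ratio + gamma)

# Unit values, unbounded domain: each part should equal pi * a^2 * rho * udot.
f_fs, f_am = accel_drag(a=1.0, R=math.inf, rho=1.0, udot_inf=1.0)
print(f"F_FS = {f_fs:.6f}, F_AM = {f_am:.6f}, total = {f_fs + f_am:.6f}")

for ratio in (0.0, 1.0, 10.0):
    print(f"rho_s/rho = {ratio:4.1f}: udot_s/udot_inf = {released_acceleration(ratio, 1.0):.3f}")
```

For the cylinder ($\gamma = 1$) a neutrally buoyant body simply follows the fluid, a very light one accelerates at twice the free-stream rate, and a heavy one lags, which is the 'compatibility' noted in the text.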
But for readers who wish to dig deeper into these concepts, which are quite prevalent when studying the motion of small particles, or drops or bubbles, in an incompressible fluid, we provide a 'short list' of some useful recent references, in addition to those already cited: Maxey and Riley (1983), Chang and Maxey (1995), Mei (1993), Mei and Lawrence (1996), Mei (1996), Lovalenti and Brady (1993, 1995), and Panton (1996).

3.19.11 A Better Mesh

While we argued earlier, and (we assert) correctly, that the computed magnitude of the pressure field should be pretty accurate when the acceleration Neumann BC for the PPE is much larger than that from the viscous term (i.e., when $\lambda t \ll 9.3$), we must also admit that the pressure distribution at and near the cylinder cannot be accurately computed for $\tau_{ac}\ln(ah/\nu\tau_{ac}) < t < t_{mtb}$; hence, the spurious ($1/h$) form drag coefficient for small
time. We also said that we cannot say how $C_D$ should behave during the 'blind spot'. What we meant is that we cannot on this mesh and with the chosen acceleration rate ($\lambda = 100$) answer that question. In order to generate 'believable' results for nearly all $t \ge 0$, and still approximate well an impulsive start via the $e^{-\lambda t}$ start up, it is now clear that the 'effective' acceleration rate should not exceed the mesh response time; i.e., a minimum requirement for good accuracy for all time is that the 'window of non-believability' be closed, realized via

$$\tau_{ac}\ln\!\left(\frac{ah}{\nu\tau_{ac}}\right) \ge t_{mtb} = h^2/4\nu \tag{3.19-101}$$

from (3.19-39), and already reported as $h \le 0.00815$ for our parameters ($t_{mtb} = 0.083$). [Fixing the mesh at $h = 0.022$ with $\nu = 0.0002$, gives $\tau_{ac} = 0.084$ or $\lambda = 11.9$, which is not a very rapid start.] So, to see if our 'methodology' could yield better insight into the small-time behavior, we performed a brief study with modified (improved) versions of the three meshes presented earlier: a coarse mesh with 4323 nodes and $h = 0.0049$ ($t_{mtb} = 0.030$), a medium mesh with 16965 nodes (cf. Figure 3.19-1) and $h = 0.0024$ ($t_{mtb} = 0.0073$), and a fine mesh with 67209 nodes and $h = 0.0012$ ($t_{mtb} = 0.0018$), each differing by close to a factor of 2 in each direction, and each more aggressively graded (Figure 3.19-19 shows the 4323 mesh), thus permitting us to get closer to the cylinder and closer to $t = 0$ believability.

Shown first, in Figure 3.19-20, are the initial numerical pressure fields for the impulsive start case (potential flow + no slip) and Re = 1000 ($\nu = 0.0002$), in order to clearly demonstrate the nearly 'disastrous' effect of mesh refinement for this 'impossible' problem. Whereas the coarse mesh seems to give a 'reasonable' result for this Re (cf. Figure 3.19-9), the two finer meshes quickly dispel any notion regarding mesh convergence, with the 67K mesh clearly showing a return to the truncation-error-dominated Stokes-like pressure field, à la (3.19-35).
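The mesh response times quoted above, and the closure condition (3.19-101), amount to a few lines of arithmetic. The sketch below tabulates them for the original and the three improved meshes; the function names are illustrative, and $a = 1$ is an assumed value for the cylinder radius (it is not stated in this passage, but it reproduces the quoted limit $h \le 0.00815$).

```python
import math

def t_mtb(h, nu):
    """Mesh response time h^2 / (4 nu), as in (3.19-101)."""
    return h * h / (4.0 * nu)

def window_left_edge(h, nu, a, tau_ac):
    """Left edge of the 'window of non-believability': tau_ac * ln(a h / (nu tau_ac))."""
    return tau_ac * math.log(a * h / (nu * tau_ac))

def window_is_closed(h, nu, a, tau_ac):
    """True when (3.19-101) is satisfied, i.e. there is no blind spot."""
    return window_left_edge(h, nu, a, tau_ac) >= t_mtb(h, nu)

nu, tau_ac = 0.0002, 0.01   # from the text (lambda = 100)
a = 1.0                     # assumed cylinder radius

meshes = [("original 16965", 0.022), ("improved 4323", 0.0049),
          ("improved 16965", 0.0024), ("improved 67209", 0.0012)]
for label, h in meshes:
    print(f"{label:15s} h = {h:<7} t_mtb = {t_mtb(h, nu):.4f}  "
          f"left edge = {window_left_edge(h, nu, a, tau_ac):.4f}  "
          f"closed: {window_is_closed(h, nu, a, tau_ac)}")
```

On the original mesh the window is open (roughly $0.09 < t < 0.6$), whereas all three improved meshes satisfy (3.19-101) for $\lambda = 100$.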
Also noteworthy is that both pressure and vorticity (in the vortex sheet) are varying approximately like $O(1/h)$ in magnitude. The coarse mesh, while still giving a spurious initial pressure with respect to the viscous BC for the PPE, is not so dominated by this error term that it cannot 'see' some of the advection 'source' term in the PPE.

We return now to Figure 3.19-18, which shows, this time on a log-log plot, the impulsive start drag coefficients for small $t$, for both Re = 1000 and Re = 0 (both with $\nu = 0.0002$), on several grids: the three new grids mentioned above as well as the first (old) 17K grid used for all previous results. Noteworthy is the $O(1/h)$ specious behavior of $C_D$ for $t < t_{mtb}$ for each, as well as the convergence (for all grids) to a $k/\sqrt{t}$ behavior for $t > O(t_{mtb})$, where $k \approx 0.69$ for the viscous drag, $k \approx 1$ for the pressure drag and, of course, $k = 1.69$ for the total drag. (The theoretical value of $k$, for both viscosity and pressure, is $2\sqrt{\pi\nu}/u_0 = 0.50$ for the unbounded case, as pointed out earlier; we are 38% high for the viscous part, 100% high for the pressure part, and 69% high for the total, all of which are explained by our too-tight domain.)

Our final simulations, on the 'new' 4323 and 16965 node meshes for the $e^{-\lambda t}$ startup at Re = 1000, are summarized by the drag coefficients shown in Figure 3.19-21, on a log-log scale. The first 3-4 decades show mainly the $e^{-\lambda t}$ decay of the acceleration drag (all pressure) and we note with pleasure that both meshes give the same result (to graphical accuracy), i.e., the curves for Fx (total drag) and Px (pressure portion) from the
two meshes are indistinguishable, a consequence of which is that we seem to be mesh-converged. That the viscous portions (Fx − Px) do not agree for $t < t_{mtb}$ [and, in fact, vary like $O(1/h)$], tells us, of course, that the viscous drag (and the viscous portion of the pressure drag) are still not accurate at small time. Fortunately, they are very small for $t < t_{mtb}$ and thus, the total drag is believable for all $t$; and we note, for both meshes, that there is no window of non-believability, since $\tau_{ac}\ln(ah/\nu\tau_{ac}) > t_{mtb}$ for each; see (3.19-101).

[Fig. 3.19-19 A 4323 node mesh of 9/3 elements with improved 'grading'. (a) The full domain. (b) Zoom near the cylinder.]

For $t < \sim 5\tau_{ac}$ or so, $C_D(t) = C_D(0)e^{-\lambda t}$ and is, of course, all pressure drag. For $t > 10\tau_{ac}$ or so, the transition from acceleration-dominated to 'inertia + viscous' is complete, with the curves now agreeing completely with those in Figure 3.19-18 when $t > t_{mtb}$ there. The rise of the viscous part, which follows closely the equation $C_D^V = 10(1 - e^{-\lambda t}) \approx 1000t$, at least up to $t \approx 0.01$, will now be explained. First, however, we explain what it is not; it is not a good approximation to the small time (small $\lambda t$, actually) analytical solution, $C_D^V \approx 4\lambda\sqrt{\pi\nu t}/u_0 = 100\sqrt{t}$ from (3.19-91). The computed viscous drag, for $t < t_{mtb}$, is simply another victim of the small time error on any finite mesh, even though the small $\lambda t$ analytical solution is describing a simpler transient: constant acceleration. But for $t < h^2/4\nu = t_{mtb}$ the mesh still cannot 'see' the proper viscous effect; it 'sees' only the outer solution (potential flow), like $u_\theta \approx -2u_0\lambda t\sin\theta$ when $\lambda t \ll 1$, from (3.19-72). Thus, in the attempt to compute $C_D^V$ [cf. (3.19-86) to (3.19-91)],
the code 'sees' $\partial u_\theta/\partial r = -(2u_0\lambda t\sin\theta - 0)/h$ to give

$$C_D^V = \frac{2\pi\nu\lambda t}{u_0h}, \tag{3.19-102}$$

another spurious result that gives, for our parameters on the '17k' mesh ($h = 0.0024$), $C_D^V \approx 524t$, within a factor of $\sim 2$ of the numerical result on the truncated domain and thus satisfactorily explaining the observed behavior. (To get closer yet, recall that our peak speed across the top of the cylinder is closer to 3 than to 2.)

For our final drag figure, we show in Figure 3.19-22 a portion of the $C_D$ vs. $t$ curve for $\tau_{ac} \le t \le 100\tau_{ac}$ for both the original and the improved 16965 node mesh for Re = 1000. Recalling our earlier discussion regarding the questionable accuracy of the pressure field in the window of non-believability (blind spot) (cf. Figure 3.19-9 and discussion), we assert that the new mesh shows the correct result and the old mesh does indeed display noticeable error in the range $O(\tau_{ac} = 0.01) < t < O(t_{mtb} = 0.61)$.

[Fig. 3.19-20 Initial pressure field and range of vorticity for the impulsive start on 3 meshes (Re = 1000). (a) 4323 mesh; $P_{max} = 0.051$, $P_{min} = -0.016$, $\omega_{max} = 17.1$, $\omega_{min} = -97.2$. (b) 16965 mesh; $P_{max} = 0.097$, $P_{min} = -0.014$, $\omega_{max} = 34.5$, $\omega_{min} = -218$. (c) 67209 mesh; $P_{max} = 0.196$, $P_{min} = -0.034$, $\omega_{max} = 73.0$, $\omega_{min} = -$[?]11.]
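As a check on the numbers attached to (3.19-102), the sketch below evaluates the spurious mesh-limited viscous drag rate alongside the coefficient of the analytical small-time result (3.19-91). The helper names are ours, and $u_0 = 0.1$ is an assumed value (consistent with Re = 1000 and $\nu = 0.0002$ for a cylinder of unit radius).

```python
import math

def spurious_viscous_rate(nu, lam, u0, h):
    """Coefficient of t in the mesh-limited estimate (3.19-102)."""
    return 2.0 * math.pi * nu * lam / (u0 * h)

def analytic_sqrt_coeff(nu, lam, u0):
    """Coefficient of sqrt(t) in the small-time analytical result (3.19-91)."""
    return 4.0 * lam * math.sqrt(math.pi * nu) / u0

nu, lam, h = 0.0002, 100.0, 0.0024   # from the text
u0 = 0.1                             # assumed free-stream speed

rate = spurious_viscous_rate(nu, lam, u0, h)
coeff = analytic_sqrt_coeff(nu, lam, u0)
print(f"mesh-limited estimate : C_D^V ~ {rate:.0f} t")
print(f"analytical small time : C_D^V ~ {coeff:.0f} sqrt(t)")
```

The first figure is the '524t' quoted above; the second is the '100√t' that the computed viscous drag fails to follow for $t < t_{mtb}$.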
[Fig. 3.19-21 Early-time drag coefficients for rapid starts on two 'better' meshes. Log-log axes: $C_D$ (0.001 to 10,000) vs. $t$ (0.00001 to 0.1); curves: Fx, Px and (Fx-Px) on the 4k and 16k meshes.]

[Fig. 3.19-22 The 'window of error' on the original mesh is approximately $0.04 < t < 0.8$. Axis: $C_D$ (100 to 10,000); curves: original mesh, improved mesh.]

To conclude our drag discussion, we return to a small sampling of the literature [Collins and Dennis (1973b); Chou and Wang (1996)] and note that they obtained the asymptotic ($t \to 0$) result that both $C_D^P$ (pressure) and $C_D^V$ (viscous) go like $(2/u_\infty)\sqrt{\pi\nu/t}$. Attempting to calibrate our $e^{-\lambda t}$ startup with these results would suggest, since the friction drag is necessarily zero at $t = 0$, a really rapid growth (achievable of course for $\lambda \to \infty$) in order to approach this asymptote.
To conclude our (numerical) drag discussion, we wish to state that it appears to us that the stream function-vorticity ($\psi$-$\omega$) method should have significantly more difficulty than the primitive-variable ($\mathbf{u}$-$P$) method that we utilize: for example, in Collins and Dennis (1973b) are (after fixing a misprint in the pressure term) the following equations for the drag coefficient:

$$C_D\,(\text{friction}) = \frac{4}{\cdots}\int \omega\,\sin\theta\,d\theta \quad\text{and}\quad C_D\,(\text{pressure}) = \cdots\int \cdots\Big|_{r=a}\cos\theta\,d\theta.$$

The vortex sheet of course has $\omega = \infty$ at $r = a$, and we will not even 'speculate' as to the type of singularity that is $\partial\omega/\partial r$ at $r = a$. The vortex sheet is a real code breaker.

This is probably a good place to relate some opinions of someone else who is and has been very interested in transient fluid dynamics: 'The impulsive start is a man-made problem; Nature has no impulsive starts.' And: 'Any paradoxes at the end were put in at the beginning, by our assumptions.' (T. Sarpkaya, personal communications.)

3.19.12 $\Delta t$ vs. $t$

To conclude on an upbeat note, we now turn to the behavior of our (smart, via local error control) time integrator. We shall present and discuss $\Delta t$ vs. $t$ curves for, unless otherwise indicated, TR applied to the Re = 1000 case via the 9/3 element on the mesh shown in Figure 3.19-1 with $\varepsilon = 10^{-4}$. But because we used a code (FIDAP) that does not initialize TR 'properly,' à la the discussion surrounding (3.16-236), we first describe how this code starts up, and mention up front that, while not theoretically 'perfect,' it usually works quite well in practice; i.e., the short cut is viable. It is also simple: start the integration with 2 (or 3, or 4) fixed, small-$\Delta t$ BE steps (here $10^{-5}$), thus precluding a need for $P_0$, $\dot u_0$, and a div-free IC (the first step is generally to be considered as a 'projection step'). After step 2, the quantity $\dot u_2 = (u_2 - u_1)/\Delta t_0$ is used as the acceleration vector on the RHS of the general TR algorithm, (3.16-235), and the switch to TR is made.
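The startup sequence just described, followed by TR with an AB2 predictor for local error control, can be sketched on the scalar model problem $\dot y = -\lambda y$. This is an illustration only and not FIDAP's implementation: the predictor and error-estimate formulas are the standard variable-step TR/AB2 ones, which we assume correspond to the algorithm referenced in (3.16-235), and all names are ours. The resulting step sizes can be compared with the scalar-ODE estimate $\Delta t = (12\varepsilon e^{\lambda t})^{1/3}/\lambda$ of (2.7-87).

```python
import math

def tr_ab2_scalar(lam=100.0, eps=1e-4, dt0=1e-5, t_end=0.05, n_be=2):
    """Integrate y' = -lam*y, y(0) = 1: BE startup, then adaptive TR with AB2 predictor.

    Returns a list of (t, dt, y) tuples, one per step taken.
    """
    t, y = 0.0, 1.0
    history = []

    # Startup: n_be fixed backward Euler steps (linear problem, solved directly).
    y_prev = y
    for _ in range(n_be):
        y_prev, y = y, y / (1.0 + lam * dt0)
        t += dt0
        history.append((t, dt0, y))
    ydot = (y - y_prev) / dt0      # acceleration estimate from the last BE step
    ydot_old = None                # a second acceleration is not yet available
    dt, dt_old = dt0, dt0

    while t < t_end:
        # AB2 predictor (possible only once two accelerations are stored).
        y_pred = None
        if ydot_old is not None:
            r = dt / dt_old
            y_pred = y + 0.5 * dt * ((2.0 + r) * ydot - r * ydot_old)
        # TR corrector: y_new = y + dt/2 * (ydot + f(y_new)), with f(y) = -lam*y.
        y_new = (y + 0.5 * dt * ydot) / (1.0 + 0.5 * lam * dt)
        ydot_new = 2.0 * (y_new - y) / dt - ydot
        t += dt
        history.append((t, dt, y_new))
        # Local error estimate and next step size.
        dt_next = dt
        if y_pred is not None:
            d = abs(y_new - y_pred) / (3.0 * (1.0 + dt_old / dt))
            dt_next = dt * (eps / max(d, 1e-300)) ** (1.0 / 3.0)
        y, ydot_old, ydot = y_new, ydot, ydot_new
        dt_old, dt = dt, dt_next
    return history

hist = tr_ab2_scalar()
t_f, dt_f, y_f = hist[-1]
theory = (12.0 * 1e-4 * math.exp(100.0 * t_f)) ** (1.0 / 3.0) / 100.0
print(f"steps taken         : {len(hist)}")
print(f"final t, y, exact y : {t_f:.4f}, {y_f:.3e}, {math.exp(-100.0 * t_f):.3e}")
print(f"final dt vs. theory : {dt_f:.2e} vs. {theory:.2e}")
```

In this sketch steps 1 and 2 are BE, step 3 is the first TR step at the initial step size, the error estimate is first formed on step 4, and the step size first changes on step 5. No limit is placed on the growth of $\Delta t$, which a production code would normally impose.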
After one TR step, still at the conservatively-small initial step size, we also have $\dot u_3 = 2(u_3 - u_2)/\Delta t_0 - \dot u_2$, so that error control via AB2 can commence with step 4, and $\Delta t$ changed beginning at step 5. (If the IC is known to satisfy $C^Tu_0 = g_0$, the switches can be made one step sooner.)

Figure 3.19-23 shows the variable-step integrator results at early time, both to show how well it deals with the imposed $e^{-\lambda t}$ transient, and to compare this result with the more conventional impulsive start, the potential flow IC. Here, and in those to follow, all time steps are plotted. The solid line in Figure 3.19-23(a) is the simple theoretical $\Delta t$ vs. $t$ result for the scalar ODE $\dot y = -\lambda y$ [see (2.7-87)]: $\Delta t = (12\varepsilon e^{\lambda t})^{1/3}/\lambda = 0.00106e^{33.3t}$. It is seen to describe well the full Navier-Stokes time integration during the acceleration-dominated period, say $t < \sim 0.10$ or so. Note that only a few steps are needed to 'recover' from the conservatively small $\Delta t_0$ of $10^{-5}$ to the 'theoretically correct' value of $\sim 10^{-3}$. For $t > 0.1$ or so, the physics of advection and diffusion (and spatial numerical error!) take over the job of 'determining $\Delta t$.' Figure 3.19-23(b) shows three items of interest for the impulsive start:
1. $\Delta t$ quickly reaches a value appropriate to the problem; in this case, the mesh 'response time', or $t_{mtb}$, is $\sim 0.6$, and TR ODE theory now says, for $\varepsilon = 10^{-4}$, $\lambda\Delta t$ (different $\lambda$!) should initially be $(12\varepsilon)^{1/3} = 0.1$, so that $\Delta t$ should be $\sim 0.1/\lambda = 0.1t_{mtb} = 0.06$, which is close enough to the values in the plot.
2. Beyond $t = 0.2$, the $\Delta t$ vs. $t$ behavior is very close to that in Figure 3.19-23(a), another desirable property of a 'smart' integrator.
3. The $e^{-\lambda t}$ transient cost only 24 extra time steps for an accurate integration, even though its time scale ($\tau_{ac} = 0.01$) is much shorter than those for advection and diffusion.

[Fig. 3.19-23 Short-time performance of TR integrator (Re = 1000). (a) $e^{-\lambda t}$ case (40 steps). (b) Impulsive start (16 steps). Axes: $\Delta t$ (log scale) vs. $t$.]

[Fig. 3.19-24 Performance of TR and BE for the $e^{-\lambda t}$ case (Re = 1000). (a) The full TR run, with a restart at $t = 50$ (162 steps). (b) Backward Euler, shorter run (200 steps). Axes: $\Delta t$ (log scale, $10^{-5}$ to $10^{1}$) vs. $t$.]

In Figure 3.19-24(a) we show $\Delta t$ vs. $t$ for the entire simulation, including an appropriate time step reduction at $t = 50$, caused by a restart which includes a switch back to BE for
2 or 3 (necessarily smaller) steps. Note too the rapid recovery to what would have been a smoother curve via non-stop integration to $t = 95$. It turns out that the 'rough' restart only 'costs' about 5-10 steps; the non-stop integration used 156 time steps. The reduction in $\Delta t$ at $t = 15$ is apparently required to follow the separation phenomena, although it does grow again beyond $t = 25$. For comparison, we show in Figure 3.19-24(b) the typically inferior performance of BE on (a portion of) the same problem; it took 200 time steps just to reach a time of $\sim 17$.

Moving to the Stokes flow simulations, Figure 3.19-25 shows the $\Delta t$ behavior (TR) for both types of startup, about which we note the following:

1. Again, the $e^{-\lambda t}$ transient required only 24 extra time steps over the potential flow IC.
2. Again, the $\Delta t$ selection mechanism caused virtually identical behavior for $t \gtrsim 0.2$.
3. Stokes flow is 'easier' to integrate and $\Delta t$ grows monotonically owing, in part, to the linearity of the DAE's.

3.19.13 A Deficient Mesh Design

We conclude (finally!) this example with a small excursion/digression related to mesh 'response', i.e. $t_{MTB}$, to show how much more important a 'good' mesh is for parabolic problems than for the more forgiving (easier) elliptic problems. We mentioned earlier that the $\theta$-variation around the cylinder is, in some sense, the 'easy' part of the simulation. Let us return briefly to that issue to point out that a non-uniform mesh size as a function of $\theta$ can cause some significant problems, at least for the transient part of the simulation and for small time close to the cylinder. Figure 3.19-26(a) shows a portion of a highly non-uniform mesh (in the $r$-direction), constructed 'by mistake' early on in our investigation. (The domain is the same as in Figure 3.19-1.) Suffice it to say (but not show) that the transient pressure field that resulted from either type of startup produced lots of interesting-but-spurious dynamics.
Fig. 3.19-25 Performance of TR for Re = 0, $\Delta t$ vs. $t$: (a) no-flow BC, $e^{-\lambda t}$ case (60 steps); (b) potential flow/impulsive start case (36 steps).

Rather, we shall briefly demonstrate related behavior for the much simpler transient heat equation. We solved $\partial T/\partial t = 0.0002\,\nabla^2 T$ on this mesh
with an IC of $T = 1$ and BC's of $\partial T/\partial n = 0$ except on the cylinder, where we used $T = 0$; i.e., we have another step change at the boundary. Figure 3.19-26(b) shows the resulting solution near the cylinder at early time and Figure 3.19-26(c) shows it at a later time.

Fig. 3.19-26 Demonstration of spurious dynamics on a non-uniform mesh: (a) non-uniform, non-optimal mesh, $\Delta r_{min} = 0.0023$, $\Delta r_{max} = 0.022$; (b) $T$ at $t = 0.05$ ($T_{max} = 1.037$); (c) $T$ at $t = 1.0$ ($T_{max} = 1.008$).

We summarize the discussion of this, and other simulations, with the following Remarks:

(1) The spurious non-concentric isotherms are a direct consequence of the range in mesh response times as a function of $\theta$. $t_{MTB}$ ranges from 0.0070 ($h_1 = 0.00236$) to 0.613 ($h_1 = 0.02215$) for the first nodes off of the cylinder, the minimum occurring at $\theta = \pi/2$, the maximum at about halfway down.
(2) The temperature overshoot is 'caused' by the CM matrix; it is a wiggle signal. (A LM result has no overshoot, but also has spurious, non-concentric isotherms for small time.)
(3) Even by $t = 1.0$, the solution has not totally recovered, even though the isotherms are now properly concentric.
(4) A uniform-in-$\theta$ $t_{MTB}$ at least generates concentric isotherms for all $t$, and the only remaining problem then relates to the difficulty of a step change.
(5) The solution of the steady Stokes equations on this mesh looks little different from that on our good mesh (Figure 3.19-1), thus demonstrating the extreme relative lack of sensitivity to mesh non-uniformity for elliptic problems.

Returning to the transient case, we (nearly) conclude this overlong example with the obvious 'word to the wise:' If you solve a NS problem with time-dependent forcing that acts over a time scale $\tau$ in which viscous (and pressure) effects are important near a no-slip boundary, you need $t_{MTB} \ll \tau$ for believable results; i.e., be sure that $h \ll \sqrt{4\nu\tau}$, the same result that 'comes from' the simple 1D transient heat equation.

3.19.14 Concluding Remarks

We end this discussion by attempting to summarize what we know (or think we know) about rapid starts from rest and 'impulsive' starts from potential flow, beginning with the former. There are three competing processes for establishing the velocity and (especially) the pressure fields, and we assume below that $\lambda$ is sufficiently large that 'process 1' dominates for small time:

(i) the acceleration of the inlet flow, in our case via the Dirichlet BC, $u(t) = u_0(1 - e^{-\lambda t})$;
(ii) the viscous diffusion of momentum (and vorticity, as well as its generation) via the no-slip BC on the cylinder;
(iii) advection of momentum and vorticity by the velocity field.

Correspondingly/concomitantly, there are three 'driving forces' for setting the always-in-equilibrium pressure:

(i) the Neumann BC at the inlet, $\partial P/\partial x = -\lambda u_0 e^{-\lambda t}$;
(ii) the Neumann BC at the cylinder, $\partial P/\partial r = \nu\,\dfrac{\partial}{\partial r}\left[\dfrac{1}{r}\dfrac{\partial}{\partial r}(r u_r)\right]$ at $r = a$;
(iii) the source term on the RHS of the PPE, $\nabla^2 P = -\nabla\cdot(\mathbf{u}\cdot\nabla\mathbf{u})$.
Remark: The velocity at $t = 0$ is zero, but the $t = 0$ pressure is very large, approaching infinity as $\lambda \to \infty$, owing to process 1. Since $\lambda$ is 'sufficiently large,' the very-early-time solution is one of accelerating potential flow, albeit with a vorticity-producing no-slip BC on the cylinder; process 1 is
dominant, but process 2 is also active very close to the cylinder. Process 3 is too small to be seen, i.e., the early time response is 'independent' of Re. Next, depending on the Reynolds number (or $1/\nu$), the decaying 'acceleration' BC gives way to advection and diffusion, a transition that is most prominent near the cylinder. If Re is sufficiently small, the viscous effects (process 2) will dominate the transition from a mostly potential flow, and process 3 is still less important (and separation will not occur); advection is totally absent for Stokes flow (very large $\nu$). If Re is sufficiently large, process 3 will dominate the transition, which causes the flow to remain mostly curl-free (potential) except very close to the cylinder where, especially near the two stagnation points, viscosity (process 2) will (slightly) affect the solution (e.g. the upstream stagnation point pressure will exceed $\tfrac{1}{2}u_\infty^2$). Shortly into this transition phase another transition will occur: boundary layer separation and downstream advection of the separated flow, ultimately leading to vortex shedding or even turbulence. For intermediate values of $\nu$ (Re = 10-100?) both processes 2 and 3 will be important as the acceleration phase winds down, and their interaction will often lead to boundary layer separation and downstream advection of momentum and vorticity in a still-laminar flow which may even ultimately become steady.

For the impulsive start case, achievable in principle via $\lambda \to \infty$ in the rapid-start case, process 1 is absent and the effective initial condition is one of potential flow everywhere except at the cylinder, upon which resides a vortex sheet (an integrable singularity of infinite vorticity, whose integral is the slip/potential velocity just off the surface, $u_\theta = -2u_\infty\sin\theta$). The corresponding initial pressure field is Reynolds-number-dependent and (for us at least) very difficult to describe quantitatively.
If $\nu$ is sufficiently large (Stokes flow in the limit), process 2 will totally dominate and the advection source term in the PPE will be completely unimportant (Euler velocity, Stokes pressure). If $\nu$ is sufficiently small, the opposite situation will exist; process 2 will be mostly unimportant and the initial pressure will basically correspond to that of simple potential flow ($P + \tfrac{1}{2}q^2 = \tfrac{1}{2}u_\infty^2$), except that the no-slip BC on the cylinder must still be respected. The PPE will not see the potential flow BC, $\partial P/\partial r = -\mathbf{n}\cdot(\mathbf{u}\cdot\nabla\mathbf{u}) = u_\theta^2/a = 4u_\infty^2\sin^2\theta/a$, because this term is now zero on the cylinder; rather, it will (we believe!) see $4\nu u_\infty\cos\theta$ with very small $\nu$, thus giving $\partial P/\partial r \to 0$ in the limit of $\nu \to 0$. For $t > 0$, however, the $1/\sqrt{t}$-like pressure behavior will agree with the rapid-start case for $\lambda \to \infty$ there, and is caused by the step change in tangential velocity, a velocity jump that 'generates' a concomitant pressure via the 'omnipotent' divergence-free constraint. The impulsive start generates a temporal pressure discontinuity in response to a spatial velocity discontinuity, unless one chooses the probably legitimate position that the velocity change is also temporal.

When one attempts to invoke these three processes via approximate/numerical solution of the NSE's (via FEM in our case, but surely no other numerical method would behave much differently), a fourth 'process' enters the picture. It is both spurious and insidious, and is caused by the combination of incompressibility (and all that it entails) and by the inability to numerically simulate properly a step change for the transient 'heat' equation for the tangential velocity. The net effect is that the viscous BC for the PPE (process 2), $\partial P/\partial r = \nu\,\dfrac{\partial}{\partial r}\left[\dfrac{1}{r}\dfrac{\partial}{\partial r}(r u_r)\right]$, becomes 'augmented' by another Neumann BC of the form $\partial P/\partial r \sim \nu u_\infty\cos\theta/(ah)$ which, for a seemingly well-resolved flow for which $h/a \ll 1$ is clearly
necessary, can generate a pressure field that, while showing the same qualitative shape as a Stokes pressure, is quantitatively totally spurious and even diverges (becomes unbounded) as $h \to 0$. For the impulsive start, this bad solution will dominate the numerical results until the Minimum Time of Believability of the given mesh, $t_{MTB} = h^2/4\nu$, has been passed, after which the pressure 'recovers' to the proper $O(1/\sqrt{t})$ behaviour and the numerical solution actually returns to believability, fortunately.

For the rapid-start case, the numerical behaviour can be even more bizarre, depending as it does on the magnitude of $t_{MTB}$ relative to the time constant of the startup phase, $\tau_{ac} = 1/\lambda$. If $\tau_{ac} \ll t_{MTB}$, there will exist a 'window' of non-believability during which the solution is dominated by the same spurious Neumann BC for the PPE as for the impulsive start case. Prior to entering this window, $t < O(\tau_{ac})$, and after leaving it, $t > O(t_{MTB})$, the solution can be relied upon, assuming 'all else' is done well. If, on the other hand, $t_{MTB} \ll \tau_{ac}$, the rapid-start problem is capable of delivering good results for all $t \ge 0$. The only problem in this desirable case is this: it is quite impossible to let $\lambda$ become arbitrarily large, since $t_{MTB} \ll \tau_{ac} \Rightarrow h \ll \sqrt{4\nu/\lambda}$. But at least this case is superior to the impulsive start/potential flow startup case in that one can select $\lambda$ based on the mesh that is deemed affordable (and graded meshes are extremely useful and important here) via, say, $\lambda = 0.1\,(4\nu/h^2)$, and then proceed to perform a believable simulation of a 'rapid' startup from rest.

Finally, we wish to point out that in spite of what many have said regarding impulsive starts from rest, there is a subtle-but-important (and large!) difference between a flow which truly starts from rest and one which starts from a potential flow.
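The mesh criteria in this discussion reduce to simple arithmetic, which the following Python sketch collects in one place. It is only an illustration: the helper names (`t_mtb`, `max_mesh_size`, `rapid_start_lambda`, `startup_regime`) are hypothetical, and the factor of ten used to interpret '$\ll$' in `startup_regime` is our assumption, not a value taken from the text.

```python
import math


def t_mtb(h, nu):
    """Minimum Time of Believability of a mesh: t_MTB = h^2 / (4 nu)."""
    return h * h / (4.0 * nu)


def max_mesh_size(nu, tau):
    """Length scale sqrt(4 nu tau); the text asks for h to be well below it."""
    return math.sqrt(4.0 * nu * tau)


def rapid_start_lambda(h, nu, factor=0.1):
    """Startup rate suggested in the text: lambda = 0.1 * (4 nu / h^2)."""
    return factor * 4.0 * nu / (h * h)


def startup_regime(h, nu, lam, much_less=0.1):
    """Classify a rapid start by the ratio t_MTB / tau_ac, with tau_ac = 1 / lambda.

    The threshold `much_less` (a factor of ten by default) is an assumed
    reading of the '<<' in the text.
    """
    ratio = t_mtb(h, nu) * lam
    if ratio <= much_less:
        return "resolved"          # t_MTB << tau_ac: believable for all t >= 0
    if ratio >= 1.0 / much_less:
        return "window"            # tau_ac << t_MTB: window of non-believability
    return "marginal"


if __name__ == "__main__":
    nu = 0.0002  # diffusivity of the heat-equation demonstration in Section 3.19.13
    for h1 in (0.00236, 0.02215):
        print(f"h = {h1}: t_MTB = {t_mtb(h1, nu):.4f}")
    print("suggested lambda for h = 0.00236:", rapid_start_lambda(0.00236, nu))
```

With $\nu = 0.0002$, the two mesh sizes quoted in Remark (1) of Section 3.19.13 reproduce the response times given there (about 0.0070 and 0.613).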
In the former case there exists, for $\lambda \to \infty$, a $\delta(t)$ impulse via a potential function and associated infinite pressure that accelerates the fluid to the potential flow that exists at $t = 0^+$ but not at $t = 0$. In the latter case this effect is missing, showing again that this is definitely not an impulsive start from rest. Also for this case, the pressure at $t = 0$ is bounded, not infinite. It is only 'nearly' unbounded at $t = 0^+$ via the weaker singularity, $O(1/\sqrt{t})$, that is also present for the $\lambda \to \infty$ impulsive start from rest. Having said this, we realize that an alternative interpretation of our results may also be viable: that we have 'merely' justified the potential flow + vortex sheet startup by showing how it comes about.

Our final remarks on this difficult problem (except to note that we describe the opposite case, impulsive and sudden stops, in another publication (Gresho & Sani 1998)) are that we now agree more strongly than ever with the adage regarding the NSE's:

- Easy for fluids
- Difficult for people
- Impossible for computers,

first seen by some on a tee shirt!

3.20 CLOSURE: SOME ADDITIONAL REMARKS ON THE PRESSURE

A recent paper on 'projection' methods (Perot, 1993) contains the following statement: 'The pressure is a very interesting variable in the context of numerical discretizations of the incompressible Navier-Stokes equations.' This understatement prompts a digression
to review and augment portions of a lecture given at one of the FEM in Fluids conferences (Antibes, France, 1984) by PMG, updated slightly, entitled 'Some Remarks on the Pressure ...':

- In the stress tensor, $P$ is clearly a compressive stress. The mechanical pressure is proportional to the trace of the stress tensor.
- In hydrostatics ($\mathbf{u} = 0$), $\nabla P$ is a force per unit volume; it balances 'body forces', and $P$ itself has less meaning than pressure differences. ($\nabla P$ is, of course, also a force per unit volume in hydrodynamics; its role there is simply less obvious.)
- In steady Stokes flow, it is a Lagrange multiplier (that enforces the divergence-free constraint), a mathematical entity. Is it not also a physical entity, helping 'guide' the flow around 'obstacles,' etc.? Yes, it is that, too.
- At the fluid's boundaries, it (not its gradient) is an important part of a normal force balance.
- When body forces are present, a 'portion' of the pressure is used to balance them.
- For ideal flow ($\nu = 0$, $\nabla\times\mathbf{u} = 0$), the pressure is an energy per unit volume in the Bernoulli equation. After solving for the velocity from Laplace's equation, only the pressure remains to be determined: '... in an ideal flow, the pressure adjusts itself according to the Bernoulli equation so that the fluid is accelerated to those values of velocity dictated by the geometry of the boundaries' (Panton, 1984, p. 452). Is it then also a Lagrange multiplier? Yes; $P$ is a multi-purpose variable.
- For certain more general (and non-Newtonian) incompressible fluids, $\sigma_{ij} = -P\delta_{ij} + f(u_{i,j} + u_{j,i})$, where $f(\cdot)$ is a nonlinear function of the strain rate, $P$ even loses its physical interpretation as a normal stress; it is then 'merely' a Lagrange multiplier, the slow 'recognition' of which caused some confusion in the past (K. Rajagopal, personal communication).
- What makes the fluid flow around a corner, or a circular cylinder when viscosity is present?
Or when it isn't? It is, again, $\nabla P$.
- Finally, for time-varying incompressible viscous flows, it seems like the pressure is 'all of the above' and more: it must adjust itself, instantaneously, so that $\nabla\cdot\mathbf{u} = 0$ at all points in $\mathbf{x}$ and $t$. Or is this latter simply another manifestation of its role as a Lagrange multiplier? Probably.
- In time-dependent flows, the pressure often varies, in both time and space, in wild and wondrous ways, only some of which are easy to understand; even the range of magnitudes is often quite mysterious. This is apparently related to the many jobs it has to do and to the fact that it is an elliptic variable embedded in an otherwise parabolic system.
- It is by far the most 'sensitive' variable to any change in any 'parameter'; e.g., $\nu$, $h$, $\Delta t$, IC, BC, $\Omega$, $\partial\Omega$, ....
- See Remark 17 following Table 3.13-4 (Section 3.13.2a).
- We conclude these remarks by recalling one thing that, in marked contrast to the situation with compressible flow, the pressure is not: it is not a thermodynamic variable; there is no equation of state.
4 Derived Quantities

4.1 INTRODUCTION

In this chapter we shall discuss a number of issues that are important after a numerical solution has been obtained. (Actually, for time-dependent problems, some of what we shall discuss takes place during the calculation.) Often called 'post-processing,' the 'data manipulations' of the primary variables (velocity, pressure, temperature, etc.) that are needed to obtain such things as streamlines, heat flux, and forces and moments are what we loosely refer to as 'derived quantities.' It is often the case, however, that one or more of these 'secondary' quantities is actually the primary quantity of interest; examples: (i) the drag force on an automobile, and (ii) the heat transfer rate in a thermal convection simulation or in a heat exchanger.

There is often the possibility of deriving the quantity of interest in more than one way, which means that choices must be made. Herein we shall describe some of these options and suggest, when we know the answer, which of the available options is 'best,' a term whose definition will often depend on (at least) the problem, the element used, the quantity desired, and the graphics package available. The last of these can be so important as to be overriding. An example is the vorticity computed from the velocity field obtained with the $Q_1Q_0$ element: clearly $\omega = \partial v/\partial x - \partial u/\partial y = \sum_{j=1}^{4}(v_j\,\partial\phi_j/\partial x - u_j\,\partial\phi_j/\partial y)$ should be computed, reported, and plotted (via contours of constant $\omega$) at the centroid of each element, to take advantage of the superconvergence phenomenon, e.g., Zienkiewicz and Taylor (1989): derivatives at appropriate Gaussian points are more accurate than elsewhere; bilinears, for example, give second-order accuracy for first derivatives at the centroids, at least for rectangular elements. (For general quadrilaterals, the 'order' is hard to define, but it is always true that the most accurate value will be at the centroid; J.Z. Zhu, personal communication.)
For a good bibliography on superconvergence in general, see Krizek and Neittaanmaki (1987). But more often than not, the available graphics package will only accept data at the element nodes, thus requiring further action than simply evaluating $\omega$ at the centroids. How best to bring these data to the nodes is the question forced upon us by the plotting package. This is an example of what this small chapter is all about. We shall first describe, in a fair amount of detail, most items of interest from 2D simulations, after which we shall extend the discussion to 3D, which both complicates some of the issues and introduces more derived variables (e.g., vorticity is then a vector quantity).
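As an illustration of the centroid evaluation recommended above, the following Python/NumPy sketch computes $\omega = \partial v/\partial x - \partial u/\partial y$ at the centroid of a single (possibly distorted) bilinear quadrilateral via the usual isoparametric mapping. The function name `centroid_vorticity` and the counter-clockwise node ordering are assumptions made for this example; it is not code from the text.

```python
import numpy as np


def centroid_vorticity(xy, u, v):
    """Vorticity dv/dx - du/dy at the centroid (xi = eta = 0) of a bilinear quad.

    xy : (4, 2) array of nodal coordinates, ordered counter-clockwise
    u, v : length-4 arrays of nodal velocity components
    """
    xy = np.asarray(xy, dtype=float)
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)

    # Derivatives of the four bilinear shape functions with respect to the
    # reference coordinates (xi, eta), evaluated at xi = eta = 0.
    dN_ref = 0.25 * np.array([[-1.0, 1.0, 1.0, -1.0],    # dN/dxi
                              [-1.0, -1.0, 1.0, 1.0]])   # dN/deta

    # Jacobian of the isoparametric map: rows are d(x, y)/dxi and d(x, y)/deta.
    jac = dN_ref @ xy

    # Chain rule: [dN/dxi; dN/deta] = jac @ [dN/dx; dN/dy].
    dN_xy = np.linalg.solve(jac, dN_ref)

    return dN_xy[0] @ v - dN_xy[1] @ u


if __name__ == "__main__":
    # Rigid-body rotation u = -y, v = x on a distorted element.
    nodes = np.array([[0.0, 0.0], [2.0, 0.1], [2.3, 1.5], [-0.2, 1.0]])
    print(centroid_vorticity(nodes, -nodes[:, 1], nodes[:, 0]))
```

Because isoparametric elements reproduce linear fields exactly, the rigid-rotation test in the usage example is a convenient sanity check: the exact vorticity is 2 regardless of element distortion.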
4.2 TWO DIMENSIONS

The 2D derived quantities that we shall discuss below are: vorticity, stream function, particle paths (tracer), heat flux, forces and moments, and global Peclet (Reynolds) numbers, after discussing methods of smoothing discontinuities at element boundaries and, especially, at nodes.

4.2.1 Smoothing in General

Since we use at most $C^0$ basis functions, whose gradients are only in $C^{-1}$, and since many pressure basis functions are in $C^{-1}$, it is of some interest to review some of the methods that have been used to smooth discontinuities at the global nodes of a finite element mesh. For a simple motivational example, suppose we want the pressure and the vorticity at the 'central' node of a 4-patch of distorted $Q_1Q_0$ elements. Since each of the four contributing elements will present its own value (and, for vorticity, not even a 'unique' value for each of the four, as we shall see), we begin with four different values, and it is clear that a unique nodal value can only be obtained by some sort of averaging procedure, apparently necessarily ad hoc. We will consider four ways of local smoothing/averaging to obtain a unique nodal result, and one global method; in each case, of course, the averaging is performed over the number of elements sharing the node:

1. Simple arithmetic averaging.
2. Area-weighted averaging.
3. Inverse area-weighted averaging.
4. Basis-function-weighted averaging.

The first is certainly the simplest, and this indeed has much to recommend it, especially on well-designed meshes that have gradual grading. The second and fourth, which are rather closely related (the latter being a lumped mass shortcut of the global method to be discussed below), have little to recommend them in most cases because they are actually somewhat illogical, the 'definition' of which follows after we describe one of the other recommended schemes: inverse area weighting.
Inverse (element) area weighting is logical in the following sense: the smallest element presumably has the most accurate value (all else being equal) and thus should get more weight than larger elements. While this seems, in some sense, to fly in the face of the very GFEM that we are espousing in this book (wherein basis function weighting, at least in simple cases, looks more like area weighting than inverse area weighting), we take the position that the averaging (weighting) arguments for these post-processed quantities lie outside of the Galerkin framework. Indeed, recall that in Section 2.6.3 we argued (and demonstrated) that mass lumping (basis function weighting of the time derivatives) was quite inaccurate in many cases, worse even than the 'analogous' FDM. Thus, our general recommendation for local smoothing is either to use simple arithmetic averaging or, if you want higher accuracy and/or frequently employ somewhat 'coarse' meshes that may contain a few 'bad' elements, to use inverse area-weighted averaging.
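The first three local schemes differ only in the weights given to the element values meeting at a node, as the following minimal Python sketch shows. The function name `nodal_average` and its arguments are hypothetical, chosen for this illustration; basis-function weighting is omitted because it requires element integrals rather than a single area per element.

```python
def nodal_average(values, areas, scheme="arithmetic"):
    """Combine the element values meeting at one node into a single nodal value.

    values : one value per element sharing the node
    areas  : the corresponding element areas
    scheme : 'arithmetic', 'area', or 'inverse_area'
    """
    if scheme == "arithmetic":
        weights = [1.0] * len(values)
    elif scheme == "area":
        weights = [float(a) for a in areas]
    elif scheme == "inverse_area":
        # The smallest element gets the largest weight.
        weights = [1.0 / a for a in areas]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return sum(w * val for w, val in zip(weights, values)) / sum(weights)


if __name__ == "__main__":
    element_values = [1.0, 3.0]   # e.g. centroid vorticities of two elements
    element_areas = [1.0, 3.0]    # the second element is three times larger
    for scheme in ("arithmetic", "area", "inverse_area"):
        print(scheme, nodal_average(element_values, element_areas, scheme))
```

On a mesh with equal element areas all three schemes coincide; they separate only where the grading is abrupt, which is where the choice matters.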
4.2.2 Vorticity

We begin with a global smoothing that is also a best least-squares fit ($L_2$-projection, see Appendix 3) via the global basis functions being used for velocity, following Lee et al. (1979):

$\sum_{j=1}^{N}\omega_j\int_\Omega\phi_i\phi_j = \int_\Omega\phi_i\sum_{k=1}^{N}\left(v_k\,\partial\phi_k/\partial x - u_k\,\partial\phi_k/\partial y\right), \quad i = 1, 2, \ldots, N$, (4.2-1)

which is, of course, a mass-matrix problem, $Mx = b$, whose solution can be rather quickly obtained, e.g., via the diagonally scaled conjugate-gradient method (see Volume II). While not regarded as very viable by Lee et al., we nevertheless present it, and even recommend it, because it is based on firm FEM theory: the integral of the square of the error between the raw discontinuous field and the smoothed $C^0$ field is a minimum. A final feature of this method relates to wiggle signals; because of the propensity of $L_2$ best fits to wiggle when the function being approximated changes too rapidly for the given grid, this technique could perhaps also be used as a guide to better mesh design. It is, of course, more expensive than any of the local methods to be described below, but it is also much more consistent than its lumped mass counterpart (or area-weighted averaging).

Alternatively, and perhaps even more consistently, a best fit to the vorticity field may be accomplished in the space of pressure basis functions ($L_2$) rather than the somewhat unnatural space ($H^1$) in (4.2-1), which is from the projection into the velocity 'space'. Thus, rather than $\omega^h = \sum_j\omega_j\phi_j(x)$, we use $\omega^h = \sum_j\omega_j\psi_j(x)$, where $\psi_j$ is a 'pressure' basis function, leading directly to

$\sum_{j=1}^{M}\omega_j\int_\Omega\psi_i\psi_j = \int_\Omega\psi_i\sum_{k=1}^{N}\left(v_k\,\partial\phi_k/\partial x - u_k\,\partial\phi_k/\partial y\right), \quad i = 1, 2, \ldots, M$, (4.2-2)

where $M$ is the total number of pressure (and now, vorticity) nodes in the mesh (same as the number of elements for $Q_1Q_0$, although the notions generalize).
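The structure of the mass-matrix problem (4.2-1) can be seen in a 1D analogue, sketched below in Python/NumPy under our own assumptions: the piecewise-constant derivative of a piecewise-linear field is projected onto the continuous linear basis, either with the consistent mass matrix or with its lumped (row-sum) counterpart. The function name `smooth_derivative` is hypothetical, and a dense solve stands in for the diagonally scaled conjugate-gradient method mentioned in the text.

```python
import numpy as np


def smooth_derivative(x, u, lumped=False):
    """L2-project the elementwise slope of a piecewise-linear u onto the nodes.

    x : increasing nodal coordinates; u : nodal values.
    Consistent mass solves M w = b; lumped mass divides b by the row sums of M,
    which is the same as length-weighted averaging of the adjacent slopes.
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    n = x.size
    h = np.diff(x)
    slope = np.diff(u) / h          # one constant slope per element

    mass = np.zeros((n, n))
    rhs = np.zeros(n)
    for e in range(n - 1):
        # Element mass matrix for linear basis functions.
        mass[e:e + 2, e:e + 2] += h[e] / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        # Integral of phi_i times the constant slope over the element.
        rhs[e:e + 2] += slope[e] * h[e] / 2.0

    if lumped:
        return rhs / mass.sum(axis=1)
    return np.linalg.solve(mass, rhs)


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 6)
    print("consistent:", smooth_derivative(x, x**2))
    print("lumped:    ", smooth_derivative(x, x**2, lumped=True))
```

Comparing the two outputs for a rapidly varying field on a coarse grid shows the tendency of the consistent projection to wiggle, the 'wiggle signal' referred to above.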
A further 'economy' results upon recognizing the coefficient matrices on the RHS as the components of the divergence matrix, $C^T$, in the GFEM NS equations, at least when the BC's for NS are 'appropriate' (no Dirichlet BC's on velocity); i.e., the 'full' $C^T$-matrix is needed, and (thus) $N$ above includes every node in the mesh, as indeed it does in (4.2-1). Thus, recalling the definition of $C^T$ [(3.13-25)],

$Q\omega = C_y^T u - C_x^T v$, (4.2-3)

where $Q$ is the 'pressure' mass matrix. Of course, if a $C^{-1}$ pressure approximation had been employed (e.g., $Q_1Q_0$ or $Q_2P_{-1}$), then the nodes for vorticity are internal to the element and, depending on the graphics package used, the vorticity may need to be transferred to the velocity nodes. $C^{-1}$ (element-contained) pressure basis functions have, of course, an additional significant advantage: $Q$ is also element-contained and the inverse is thus easy to compute, and the vorticity is obtained simply by 'looping through the elements.' In contrast, the vorticity from (4.2-1) is directly available at the velocity nodes, after solving a more expensive linear algebra problem. We even believe but cannot prove that the accuracy at the pressure (vorticity) nodes from (4.2-3) is higher than that from the more
expensive (4.2-1) at the velocity nodes. It is probably less accurate if it too must be 'transferred' to the velocity nodes.

For a method of determining an accurate boundary vorticity when a stream function-vorticity method is used, see MacKinnon et al. (1990), which uses some of the 'consistent flux' concepts discussed in the next section.

We now consider simpler, but less consistent, schemes. It seems that simplicity usually dominates over consistency in that not many CFD packages that we are aware of go the consistent route. For the simpler methods that need results at the (velocity) nodes, we must first compute the 'best' estimate from each element sharing that node. Within an element, we can obviously compute $\omega(x, y)$ at any desired location from

$\omega = \partial v/\partial x - \partial u/\partial y = \sum_{j=1}^{n_e}\left(v_j\,\partial\phi_j/\partial x - u_j\,\partial\phi_j/\partial y\right)$, (4.2-4)

where element $e$ contains $n_e$ nodes. What to do next is the key question, and the (non-unique) answer is quite 'element-dependent.' Zienkiewicz and Taylor (1989, p. 349) show some 'optimal sampling points' (for stresses, which are also basically first derivatives) for several elements: the centroid for $Q_1$ and $P_1$, the four 'Gaussian' points for $P_2$, and the 2 x 2 Gaussian points for the quadratic quadrilaterals (the eight-node serendipity element and $Q_2$). Whereas these points give 'extra' accuracy owing to superconvergence, this (local) accuracy is usually lost (returning us to 'normal' convergence) when 'extrapolation' of any type to the nodes is performed; such is the cost of bowing to the contour package programmer. In the event you must go to the nodes, we present the 'winner' (but not by much) from the brief study performed by Lee et al. (1979) for two Lagrange elements: (i) for $Q_1$, evaluate $\omega$ at the 2 x 2 Gaussian points and the centroid, then linearly extrapolate to each of the four nodes; (ii) for $Q_2$ [or its eight-node serendipity version], evaluate $\omega$ at the 3 x 3 Gaussian points (one of which is the center node for $Q_2$) and linearly extrapolate to the eight 'boundary' nodes.
For triangles, analogous procedures can be used via the appropriate integration points. We close this discussion with a useful warning (implicitly contained in our discussions up to now) from Zienkiewicz and Taylor (1989): '... in quadratic $C^0$ elements, whether 2D or 3D, the stresses (or similar quantities) should never be calculated at nodes.'

In concluding our vorticity discussion, we point out (again; see too Section 3.11.4) some simple, but probably not obvious, facts related to vorticity that could prove useful in CFD simulations. Starting from the equation relating stream function to vorticity (see Section 3.6.4),

$\nabla^2\psi + \omega = 0$, (4.2-5)

integrate over the domain to obtain

$\omega_T \equiv \int_\Omega\omega = -\oint_\Gamma\partial\psi/\partial n = \oint_\Gamma u_\tau$; (4.2-6)

the total vorticity in the domain, even in a time-dependent flow with time-dependent BC's, is (for simply-connected domains) always the line integral of the tangential velocity over the boundary. This is Stokes' theorem in 2D. For but a single simple example of the potential utility of this result, consider a common CFD test problem, the lid-driven cavity (LDC); here the flow in a box (unit size, usually) is at rest, and the top lid is impulsively moved to induce a shearing force (and a vortex sheet). Since (if) both lid speed and box
size are unity, (4.2-6) tells us that $\omega_T = 1$ for all $t > 0$ (and for any value of Re). To the extent that your mesh does not deliver unit total vorticity, it is deficient. While this is a rather easy diagnostic to compute, it does not, unfortunately, tell the analyst where the mesh is deficient. Note too that (4.2-6) tells us, for any contained flow with stationary, no-slip boundaries, that $\int_\Omega\omega = 0$. Finally, we mention that the above test can equally be applied to steady-state simulations.

4.2.3 Stream Function

Besides vector plots, the stream function is the most popular way to depict 2D (Cartesian or axi-symmetric) flow. Here again, a choice needs to be made, at least for elements that are mass-conserving (those with discontinuous pressures). For such elements, we can profitably use the definition of $\psi$ that relates a change in $\psi$ to the flow rate between the two values (see, for example, Batchelor, 1967):

$\psi(x, y) - \psi_0 = \int(u\,dy - v\,dx)$, (4.2-7)

where the line integral is along any curve connecting $\psi_0$ to $\psi(x, y)$. For mass-conserving finite elements, it is natural (Gartling, 1987) to take the curve(s) to be element boundaries; i.e., integration around an element gives the incremental values (node-to-node) of $\psi$ for that element. The entire procedure can be gleaned from the sketch in Figure 4.2-1, in which each node is visited only once (even if theoretically unnecessary owing to specified BC's).

Fig. 4.2-1 Stream function calculation via boundary integrals.

In practice, the line integrals are performed via the appropriate basis functions and associated isoparametric coordinates, i.e., $-1 \le \xi \le 1$, and we show only the results (details left as an exercise) for both linears and quads, both from applying (4.2-7):
1. Linears. Node $n$ to node $n+1$:

$\psi_{n+1} - \psi_n = \tfrac{1}{2}(y_{n+1} - y_n)(u_n + u_{n+1}) - \tfrac{1}{2}(x_{n+1} - x_n)(v_n + v_{n+1})$. (4.2-8)

2. Quads. Node $n$ (edge) to node $n+1$ (center) to node $n+2$ (edge):

$\psi_{n+1} - \psi_n = \tfrac{1}{12}\big[(-6y_n + 7y_{n+1} - y_{n+2})u_n + (-7y_n + 6y_{n+1} + y_{n+2})u_{n+1} + (y_n - y_{n+1})u_{n+2} - (-6x_n + 7x_{n+1} - x_{n+2})v_n - (-7x_n + 6x_{n+1} + x_{n+2})v_{n+1} - (x_n - x_{n+1})v_{n+2}\big]$, (4.2-9)

$\psi_{n+2} - \psi_{n+1} = \tfrac{1}{12}\big[(y_{n+1} - y_{n+2})u_n + (-y_n - 6y_{n+1} + 7y_{n+2})u_{n+1} + (y_n - 7y_{n+1} + 6y_{n+2})u_{n+2} - (x_{n+1} - x_{n+2})v_n - (-x_n - 6x_{n+1} + 7x_{n+2})v_{n+1} - (x_n - 7x_{n+1} + 6x_{n+2})v_{n+2}\big]$. (4.2-10)

If $C^0$ pressure basis functions are employed (e.g., $Q_2Q_1$), then the above procedure will not work so well because $\oint_{\Gamma_e}\mathbf{n}\cdot\mathbf{u} \ne 0$ in general for these non-element-mass-balance elements. An alternate procedure, which in some sense hides its own errors, uses the stream function-vorticity equation,

$\nabla^2\psi = -\omega = \partial u/\partial y - \partial v/\partial x$, (4.2-11)

because [from (4.2-7)]

$u = \partial\psi/\partial y$, (4.2-12)
$v = -\partial\psi/\partial x$. (4.2-13)

The weak form is, of course, desired; it is, via GFEM,

$\sum_j\psi_j\int_\Omega\nabla\phi_i\cdot\nabla\phi_j = \int_\Omega\phi_i\left(\partial v/\partial x - \partial u/\partial y\right) - \int_{\Gamma_\tau}\phi_i u_\tau - \int_\Omega\nabla\phi_i\cdot\nabla\psi_D, \quad i = 1, 2, \ldots, N$, (4.2-14)

where $u_\tau = -\partial\psi/\partial n$ is the specified value of the tangential velocity on $\Gamma_\tau$. Note that $\phi_i = 0$ on any part of $\Gamma$ on which $u_\tau$ is not specified, because $\psi$ itself is there specified by

$\psi(x \in \Gamma) = \psi_0 + \int\mathbf{n}\cdot\mathbf{u}$, (4.2-15)

where $\psi_0$ (typically 0) is selected (at $x_0, y_0$) arbitrarily. Also, $N$ is the total number of nodes in $\Omega$ and on $\Gamma_\tau$, $\partial v/\partial x = \sum_j v_j\,\partial\phi_j/\partial x$, and similarly for $\partial u/\partial y$. After solving (4.2-14), the contour plotting package may be called, for any type of element. This method can, of course, also be used for 'mass-consistent' elements, trading the coding logic of boundary integral tracing for solving linear systems, and generating generally different $\psi$ fields, but hopefully not too different, which would indicate a too coarse mesh.
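The node-to-node update for linears, (4.2-8), is easy to sketch in Python. The function below marches $\psi$ along an ordered chain of nodes joined by straight element edges; the name `stream_function_along_path` and the list-based interface are hypothetical, and a real code would need the mesh-traversal logic of Figure 4.2-1 to visit every node once.

```python
def stream_function_along_path(x, y, u, v, psi0=0.0):
    """March the stream function along a chain of nodes using (4.2-8).

    x, y : coordinates of consecutive nodes on element edges
    u, v : velocity components at those nodes
    psi0 : value of psi assigned to the first node
    Returns the list of psi values, one per node.
    """
    psi = [psi0]
    for n in range(len(x) - 1):
        dpsi = (0.5 * (y[n + 1] - y[n]) * (u[n] + u[n + 1])
                - 0.5 * (x[n + 1] - x[n]) * (v[n] + v[n + 1]))
        psi.append(psi[-1] + dpsi)
    return psi


if __name__ == "__main__":
    # Divergence-free test field u = x, v = -y, for which psi = x * y.
    path_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
    path_b = [(0.0, 0.0), (0.0, 2.0), (1.0, 2.0)]
    for path in (path_a, path_b):
        xs = [p[0] for p in path]
        ys = [p[1] for p in path]
        print(stream_function_along_path(xs, ys, xs, [-yy for yy in ys]))
```

For this linear, divergence-free test field the result at the end node is path-independent; for a computed velocity field, the mismatch between two paths to the same node is a direct measure of how well element-level mass balance is satisfied.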
This was for the simple case of simply connected domains. If the domain is multi-connected (flow past a cylinder, for example), a 'branch cut' must be taken from the external boundary (with known $\psi$ at that point) to each internal object and a line integral like (4.2-15) performed, to complete the Dirichlet data, a necessary complication that many might wish to avoid.

4.2.4 Heat Flux

Again, there are multiple methods available for this post-processing procedure. The simplest is to just estimate the normal derivative at the boundary points wherever $q_n = K\,\partial T/\partial n$ is desired ($\mathbf{n}\cdot\mathbf{K}\cdot\nabla T$ in the most general case), and the most complex is the so-called consistent flux method that is built into the GFEM. Since the latter is demonstrably more accurate and uses basically the same 'data' that was used to solve for $T$ via the GFEM, it is the preferred method, but we shall describe both methods. First we remark that while again we need to approximate the gradient of a computed quantity, it is typically the case that this gradient, and more often its normal component, is only needed at the boundary of the domain, vis-a-vis the vorticity.

We begin with the best method and end with older and more common (but simpler and less accurate) methods. Called the 'consistent (flux) method' in Gresho et al. (1981b), where our version of it was first derived and discussed, and in Gresho et al. (1987), which also included the NS equations, the 'new' method is in fact a method that has been visited several times already in this text. Thus, much of our work has already been done, and all we need do here is to tie up some loose ends and to present it as the preferred postprocessing method for computing the heat flux through $\Gamma$ into $\Omega$.
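The difference between the two approaches can be illustrated with a 1D model problem, which is our own construction and not an example from the text: $-k\,T'' = s$ on $(0, 1)$ with $T(0) = T(1) = 0$, for which the exact boundary value is $k\,T'(0) = s/2$. The Python/NumPy sketch below solves the problem with linear elements and then evaluates $k\,T'(0)$ in two ways: by a one-sided difference, and by evaluating the Galerkin equation that was discarded at the Dirichlet node (the consistent flux idea). The function name `boundary_flux_1d` is hypothetical. In 1D the boundary 'mass matrix' is trivial, since the boundary is a single point.

```python
import numpy as np


def boundary_flux_1d(n_el=10, k=1.0, s=1.0):
    """Solve -k T'' = s on (0, 1), T(0) = T(1) = 0, with linear elements.

    Returns two estimates of k dT/dx at x = 0:
    (one-sided difference, consistent flux from the discarded nodal equation).
    """
    h = 1.0 / n_el
    n = n_el + 1
    stiff = np.zeros((n, n))
    load = np.zeros(n)
    for e in range(n_el):
        stiff[e:e + 2, e:e + 2] += (k / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
        load[e:e + 2] += s * h / 2.0

    # Homogeneous Dirichlet data at both ends: solve for interior nodes only.
    temp = np.zeros(n)
    inner = slice(1, n - 1)
    temp[inner] = np.linalg.solve(stiff[inner, inner], load[inner])

    one_sided = k * (temp[1] - temp[0]) / h

    # Weak form tested with phi_0:  k T'(0) = int(phi_0 s) - int(k phi_0' T_h'),
    # i.e. the residual of the Galerkin equation at the Dirichlet node.
    consistent = load[0] - stiff[0] @ temp
    return one_sided, consistent


if __name__ == "__main__":
    simple, consistent = boundary_flux_1d()
    print("one-sided difference:", simple)
    print("consistent flux:     ", consistent)
    print("exact:               ", 0.5)
```

For this particular problem the consistent value is exact on any mesh, whereas the one-sided difference is in error by $sh/2$; in general the consistent flux is more accurate but not exact.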
Note first that, consistent with this consistent flux 'philosophy,' we are done on $\Gamma_N$, where $q$ was specified as data for the problem; i.e., regardless of the computed GFEM solution for $T^h$, no postprocessing need be or should be done anywhere $q$ was specified: it is the flux on $\Gamma_N$. In this sense, 'closure' has been reached in that applied fluxes are incorporated via NBC's and 'reactive' fluxes are virtually the same things (another aspect of 'consistency'); i.e., they are computed from the very same type of equations that occur at NBC's, although in this case the primary variable (here, $T^h$) must have been computed first. The consistent flux equation that we need has already been presented as part of (2.2-16) in Chapter 2. Here we recall and rearrange it:

$$\int_{\Gamma_D}\phi_i q_D^h = \int \phi_i\left[\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h + \beta T^h\nabla\cdot\mathbf{u} - S\right] + \int \nabla\phi_i\cdot(\mathbf{K}\cdot\nabla T^h) + \int_{\Gamma_N}\phi_i\left[H(T^h - f) - q\right] \quad \text{for } i = N+1, N+2, \ldots, N_T, \tag{4.2-16}$$

where there are $N$ nodes at which $T^h = f + \sum_{j=1}^{N} T_j(t)\phi_j(\mathbf{x})$ has already been computed [cf. (2.2-1) and (2.2-3) in Chapter 2], $N_T$ is the total number of nodes in $\bar\Omega = \Omega + \Gamma$, and the reactive heat flux is given by the expansion [cf. (2.2-15) in Chapter 2]

$$q_D^h = \sum_{j=N+1}^{N_T} q_{Dj}\phi_j(\mathbf{x}) \quad \text{for } \mathbf{x}\in\Gamma_D; \tag{4.2-17}$$
i.e., the consistent flux is expressed in the same $C^0$ basis functions used for $T^h$ (here restricted to $\Gamma_D$), and we have generated another 'mass-matrix problem.' (Again, our node numbering convention is solely for our expository convenience, not for writing code.) Now for some Remarks:

(1) $\beta = 1$ assures global energy conservation even when $\nabla\cdot\mathbf{u} \neq 0$, as discussed in Section 2.2.3; $\beta \neq 1$ often does not. The simpler methods, discussed below, will never (in general) generate global energy balances.

(2) $\beta = 0$ is still permissible regarding consistent heat flux, even if $\nabla\cdot\mathbf{u} \neq 0$, in that it would deliver the same temperatures on $\Gamma_D$ as given by (2.2-3) in Chapter 2 if the problem was recomputed using our applied flux BC on all of $\Gamma$ (as long as $\beta = 0$ is also used in the recomputation).

(3) As it stands, (4.2-16) and (4.2-17) represent a linear algebraic system of size $N_T - N$, with the consistent (boundary) mass matrix; solution via DSCG would be quite effective, as would a skyline-based direct method. Note too that the 'dimensionality' of the post-processing linear systems is one lower than the original problem, thus hopefully parrying 'computer cost' as an argument for not doing it right.

(4) If the boundary mass matrix is lumped (when 'permissible'), the solution becomes much simpler (and generally somewhat less accurate), since then $\int_\Gamma \phi_i q_D^h = q_{Di}\int_\Gamma \phi_i$ and the solution of (4.2-16) is 'free' relative to the cost of forming the RHS.

(5) For $h \to 0$, only one term on the RHS will remain (the others vanishing with $h$) to give $q_D = \mathbf{n}\cdot\mathbf{K}\cdot\nabla T^h$.

(6) For $h$ not $\to 0$, the consistent flux depends on much more data than just the normal derivative term $\mathbf{n}\cdot\mathbf{K}\cdot\nabla T^h$, and this is precisely what makes the result more accurate.

(7) The method (magically?) computes a 'normal' flux at boundary corners (or other sharp changes in shape), even though the normal direction is not uniquely defined.
[See the discussion following (2.3-33) in Chapter 2, wherein an attempt at defining a unique normal direction, which applies here as well, failed.]

(8) The boundary integral term over $\Gamma_N$ will only be present at any intersections of $\Gamma_D$ and $\Gamma_N$; usually $\phi_i$ will be zero on $\Gamma_N$, as indeed it will in the 'bulk' integrals except in the single layer of elements that make up $\Gamma_D$.

(9) The consistent heat flux, from (4.2-16), could be used to re-solve the original (primary variable) problem via specified flux on $\Gamma_D$ rather than specified $T$ and give the same solution. [Here, of course, consistency in mass lumping, or not, will be required; i.e., if the mass is lumped when solving for $q_D$ from (4.2-16), then it must be applied in the lumped manner if used as an applied flux to determine $T$; this is another reason that the term consistent is consistent.] No other method of flux calculation would be 'reversible'/consistent.

(10) A short historical list of some other contributors to these ideas must be presented, some focusing on practical applications and others on the super-convergence issue, lest we leave the (wrong) impression that we discovered all of these good things: see, for example J. Wheeler (1973, 1978), M. Wheeler (1974), Douglas et al.
(1974), Larock and Herrmann (1977), Marshall et al. (1978), Kjaran and Sigurdson (1981), Carey (1981), and Mizukami (1986). The consistent flux 'ideas' are called 'extraction methods' by Babuska and Suri (1994), Carey (1982), Carey et al. (1985), and Lynch (1984, 1985a, 1985b). Finally, MacKinnon and Carey (1990) even show the finite difference community how to obtain super-convergent fluxes.

An algorithmic and heuristic way in which to both appreciate and perhaps program the consistent flux method is as follows:

1. Initially, form all of the boundary nodal equations as if there were to be imposed the most general type of natural boundary condition (consistent with the weak form employed, of course) at each node [for the Laplace operator considered thus far, it is $\mathbf{n}\cdot\mathbf{K}\cdot\nabla T + H(T - f) = q$].

2. Modify the boundary node equations for the particular problem at hand, e.g., for Dirichlet data, the nodal equation can be omitted entirely (after transposing the appropriate coupling information to the RHS), although it should also be 'saved' (e.g., on a disk file) for later use in Step (4). For simpler natural boundary conditions, the proper deletions are made (e.g., $H$, $T_0$, or $q$ in the current problem).

3. Assemble and solve the conventional GFEM equations for the primary variables.

4. Recall each nodal equation for which Dirichlet data were employed and solve for the consistently derived flux; i.e., solve (4.2-16) and (4.2-17).

We now switch to simpler but less consistent methods for flux estimation. Actually, before describing simpler methods, we mention one that appears to be more costly but, we believe, also less accurate than the consistent flux method just discussed: the method of Lagrange multipliers [e.g., Babuska (1973); Strang and Fix (1973, p. 133); Carey and Oden (1983, p. 108)].
It begins with the following 'standard' identity ($\mathbf{K} \to \kappa$ for simplicity, not necessity):

$$\int \kappa\phi_i\nabla^2 T^h = \int_{\Gamma_N}\phi_i\kappa\,\partial T^h/\partial n + \int_{\Gamma_D}\phi_i\kappa\,\partial T^h/\partial n - \int \kappa\nabla\phi_i\cdot\nabla T^h = \int_{\Gamma_N}\phi_i q_a + \int_{\Gamma_D}\phi_i q_r - \int \kappa\nabla\phi_i\cdot\nabla T^h, \tag{4.2-18}$$

where $\Gamma_N + \Gamma_D = \Gamma$, $q_a$ is the applied flux on $\Gamma_N$, and $q_r$ is the reactive flux on $\Gamma_D$. (In our 'old' terminology, $q_a = q$ and $q_r = q_D$.) Now, $q_a$ is given and $q_r$ is not. The Lagrange multiplier method involves using (4.2-18) for the diffusive term in the appropriate GFEM equations and, to introduce the required additional equations in order to 'balance' the $N_T - N$ new unknowns [$q_r = \sum_{j=N+1}^{N_T} q_j\phi_j$ ($\mathbf{x}\in\Gamma_D$)], the Dirichlet BC is applied only weakly:

$$\int_{\Gamma_D}\phi_i(T^h - T_D) = 0, \quad i = N+1, N+2, \ldots, N_T. \tag{4.2-19}$$

The solution of (4.2-19) simultaneously with the nodal temperature equations is the Lagrange multiplier method ($q_r$ is the Lagrange multiplier). Clearly it has turned an ODE problem of size $N$ followed by a linear algebra problem of size $N_T - N$ of the consistent flux method, into a DAE problem of size $N_T$, an increase in difficulty that is
probably unwarranted; the resulting solution will, we believe, not be more accurate than that from the simpler consistent flux method. If the steady-state problem is being solved, then the consistent flux method involves the sequential solution of two linear algebraic systems, one of size $N$ and the other of size $N_T - N$; whereas the Lagrange multiplier method solves one larger system, of size $N_T$. Finally, if $\mathbf{u} = 0$ and a steady solution is sought, the Lagrange multiplier method converts a nice minimization problem (the size $N$ linear system) into a less desirable saddle-point problem of greater size. We (and, for example, Carey, 1982) see no reason to use the Lagrange multiplier method.

Finally, we arrive at simpler, more 'obvious' methods, which we might (somewhat dangerously) call 'engineering' methods: since the diffusive heat flux is defined to be $q = \kappa\,\partial T/\partial n$, why not simply try to estimate the normal derivative of $T$ on $\Gamma_D$? The first answer (and one surely known even by engineers) is that differentiation is a 'noisy' process, and the second is that our $C^0$ temperature approximation necessarily causes the gradient to be discontinuous. To combat the latter (if the plotting package will permit it) the best advice is to compute $\partial T^h/\partial n$ (from the basis functions) only on the boundary of an element [which is, of course, also (generally) an approximation of the domain's boundary], not at nodes shared by two (or more) elements. This avoids the non-uniqueness of $\partial T/\partial n$ at shared nodes and the non-uniqueness (in general) of the normal direction.
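So, if you have the proper type of plotting package (or if you care not about plotting the results), compute the heat flux as follows (on $\Gamma$):

$$q_n(\mathbf{x}\in\Gamma_D) = \kappa\,\mathbf{n}\cdot\nabla T^h = \kappa\left(n_x\frac{\partial T}{\partial x} + n_y\frac{\partial T}{\partial y}\right) = \frac{\kappa}{\left[\left(\frac{dx}{ds}\right)^2 + \left(\frac{dy}{ds}\right)^2\right]^{1/2}}\left(\frac{dy}{ds}\frac{\partial T}{\partial x} - \frac{dx}{ds}\frac{\partial T}{\partial y}\right)$$

$$= \frac{\kappa}{\left[\left(\frac{dx}{ds}\right)^2 + \left(\frac{dy}{ds}\right)^2\right]^{1/2}}\left[\frac{dy}{ds}\left(\frac{\partial T}{\partial\xi}\frac{\partial y}{\partial\eta} - \frac{\partial T}{\partial\eta}\frac{\partial y}{\partial\xi}\right)\Big/J - \frac{dx}{ds}\left(\frac{\partial T}{\partial\eta}\frac{\partial x}{\partial\xi} - \frac{\partial T}{\partial\xi}\frac{\partial x}{\partial\eta}\right)\Big/J\right], \tag{4.2-20}$$

where $\xi$ and $\eta$ are either $s$ or $\pm 1$, depending on which (isoparametric) element side we are on ($s$ is the parameter defining the element side in space and ranges between $-1$ and $+1$), $x = \sum_{j=1}^{n_s} x_j\psi_j(s)$, $y = \sum_{j=1}^{n_s} y_j\psi_j(s)$, and $T = \sum_{j=1}^{n_e} T_j\phi_j(\xi, \eta)$, where $n_s$ is the number of nodes on the side of an element, $J = (\partial x/\partial\xi)\,\partial y/\partial\eta - (\partial x/\partial\eta)\,\partial y/\partial\xi$, $n_e$ is the number of nodes in (and on) the element, $\phi_j$ is the local basis function, and $\psi_j$ is $\phi_j$ restricted to $\Gamma$.

It is important to realize that the value of $q_n$ from (4.2-20) is more accurate at certain points on $\Gamma$ than at others; the phenomenon of superconvergence suggests that the midpoint ($s = 0$) be used for $Q_1$ elements and the two Gaussian points ($s = \pm 1/\sqrt{3}$) be used for $Q_2$ elements. For triangles, the procedure is analogous: just replace 'Gaussian integration points' by those associated with the appropriate integration rules for triangles; for example, 1, or 4, or 7 point rules.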
If you must bow to the graphics programmer and report your heat flux at nodes so that the plotter can give you $q$ vs $\mathbf{x}\in\Gamma$, there are several options available, all of which are probably in current use somewhere:

1. Compute $q_n$ from (4.2-20) at the element nodes and, where shared by another element, just report the arithmetic average.

2. The nearest Gaussian point flux is computed and assumed to apply at the node, and simple (arithmetic) averaging performed.

3. The Gaussian point values may be extrapolated 'appropriately' to the nodes and averaging performed. 'Appropriate' for $Q_1$ is piecewise-constant extrapolation [giving the same result as in 2 above]; for $Q_2$ it is linear extrapolation from the two Gaussian point values, etc.

4. As in 1 through 3, except use a better averaging method in each; inverse element side length seems appropriate.

5. Use a least-squares fit via the boundary basis functions and boundary mass matrix; i.e., solve

$$\sum_{j=1}^{N_B} q_j\int_\Gamma \phi_i\phi_j = \int_\Gamma \phi_i q_n, \tag{4.2-21}$$

where $q_n$ are the 'best' Gaussian point values extrapolated (but not averaged) as per 3 above. It is not advisable to lump the mass; better would be to simply return to 3 above.

There are probably yet other ways to try to salvage a heat flux from the temperature field, some perhaps better than those listed above. Nevertheless, we stop here with our repeated recommendation: use the available GFEM machinery to get the best heat flux obtainable, the consistent flux. If, however, you insist on 'simpler' methods, a good description of the details of doing so (for anisotropic materials yet, and for 3D and for components of the flux vector other than normal) is available in Gartling and Hogan (1994). See also Reddy and Gartling (1994).

We conclude this section with but a single, simple-but-effective example; for others, see Gresho et al.
(1987) and Thornton (1982), who puts forth an argument for lumping the mass (for the $Q_1$-element) and also demonstrates the utility of the consistent flux method for computing inter-element fluxes. The exact solution to the Poisson equation,

$$\nabla^2 T = -S = 5/2, \tag{4.2-22}$$

on ($0 \le x \le 2$, $0 \le y \le 1$) with $T = y^2$ on $x = 0$, $T = 1 + y^2$ on $x = 2$, $T = x^2/4$ on $y = 0$, and $T = x^2/4 + 1$ on $y = 1$, is

$$T = x^2/4 + y^2. \tag{4.2-23}$$

Figure 4.2-2 shows the solution and a simple 4-patch of bilinear elements for use in testing the flux approximations. Since the bilinear element will give a nodally exact solution for this simple problem, we can use (4.2-23) to evaluate the various heat flux 'post-processors' at node S:
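Fig. 4.2-2 Solution of a Poisson equation. (The figure shows a 4-patch of bilinear elements, numbered 1 to 4, with nodes SW, S, SE, W, 0, E, NW, N, NE, element dimensions $\ell = 1$ and $h = 1/2$, and contours of $T$.)

1. Consistent flux, in the lumped mass approximation: (4.2-16) here simplifies to

$$q_{Di}\int_\Gamma \phi_i = \int \nabla\phi_i\cdot\nabla T - \int \phi_i S;$$

i.e.,

$$\ell\, q_{DS} = -\frac{h}{6\ell}\left[2(T_{SW} - 2T_S + T_{SE}) + (T_W - 2T_0 + T_E)\right] - \frac{\ell}{6h}\left[(T_W - T_{SW}) + 4(T_0 - T_S) + (T_E - T_{SE})\right] - S\ell h/2,$$

or

$$q_{DS} = -\frac{h}{6\ell^2}\left[2(T_{SW} - 2T_S + T_{SE}) + (T_W - 2T_0 + T_E)\right] - \frac{1}{6h}\left[(T_W - T_{SW}) + 4(T_0 - T_S) + (T_E - T_{SE})\right] - Sh/2$$

$$= -\frac{1}{12}\left[2(0 - 2/4 + 1) + (1/4 - 2/2 + 1.25)\right] - \frac{1}{3}\left[(1/4 - 0) + 4(1/2 - 1/4) + (1.25 - 1)\right] + 5/8 = -0.125 - 0.5 + 0.625 = 0.$$

2. Average of $\partial T/\partial n$ from each element at node S:

$$q_S = -\frac{1}{2}\left[\left.\frac{\partial T}{\partial y}\right|_{3,S} + \left.\frac{\partial T}{\partial y}\right|_{4,S}\right] = -\frac{1}{2}\left[\frac{T_0 - T_S}{h} + \frac{T_0 - T_S}{h}\right] = -[2(1/2 - 1/4)] = -0.5.$$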
3. Average of the two Gaussian point values:

$$q_S = -\frac{1}{2}\left[\left.\frac{\partial T}{\partial y}\right|_{3,(SW+S)/2} + \left.\frac{\partial T}{\partial y}\right|_{4,(S+SE)/2}\right] = -\frac{1}{2}\left[\frac{(T_W - T_{SW}) + (T_0 - T_S)}{2h} + \frac{(T_0 - T_S) + (T_E - T_{SE})}{2h}\right]$$

$$= -\frac{1}{2}\left[(1/4 - 0) + (1/2 - 1/4) + (1/2 - 1/4) + (1.25 - 1)\right] = -\frac{1}{2}\left[1/2 + 1/2\right] = -0.5.$$

It is, of course, just a coincidence that methods 2 and 3 give the same result; they would usually differ. But it is no coincidence that method 1 produced the exact result: $q_S = \partial T/\partial n|_S = -\partial T/\partial y|_S = -2y|_S = 0$. Note how the consistent flux combines the three pieces of available 'data' ($x$-direction 'heat flow,' $y$-direction flux, and local heat generation) to come up with the right answer. Note too that only the $y$-direction flux would 'survive' in passing to the limit; the consistent flux does the best job possible with the given data on the finite element mesh, consistent with the general GFEM. The final worthwhile comment is this: the consistent flux method would give exact results for this problem even for a mesh comprising various-sized rectangular elements if the consistent (boundary) mass matrix is used in (4.2-16) (LM is only exact for equal-sized rectangles), for which methods 2 and 3 would still give erroneous results (and the two wrong results would differ in the general case).

If the same exercise was to be repeated at node E, at which the exact flux is $\partial T/\partial n|_E = x/2|_E = 1$ (an exercise we leave to the reader), the results are 'the same'; i.e., consistent flux gets $q_{DE} = 1$, and the two simpler methods give $q_E = 0.75$. [The consistent flux method gives 0.75 ($x$-flux) $-$ 1.0 ($y$-'heat flow') $+$ 1.25 (local heat generation).]
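The node S and node E calculations can be checked numerically. The following Python sketch is an illustration, not the authors' code: it assembles the standard bilinear stiffness contributions on the 2 x 2 mesh of this example, inserts the exact nodal temperatures from (4.2-23), and divides each boundary-node residual by a lumped boundary mass (half of each adjacent boundary edge). The function name, the node numbering (row by row from SW), and the lumping rule at corners are assumptions made for this sketch.

```python
# Q1 stiffness patterns on an a-by-b rectangle; local nodes run counterclockwise
# from the lower-left corner: (0,0), (a,0), (a,b), (0,b).
KX = [[2, -2, -1, 1], [-2, 2, 1, -1], [-1, 1, 2, -2], [1, -1, -2, 2]]
KY = [[2, 1, -1, -2], [1, 2, -2, -1], [-1, -2, 2, 1], [-2, -1, 1, 2]]


def consistent_flux_lumped(nx=2, ny=2, lx=2.0, ly=1.0, source=-2.5):
    """Lumped-mass consistent flux at the boundary nodes of a uniform Q1 mesh.

    Nodal temperatures are taken from T = x^2/4 + y^2, and `source` is S
    in del^2 T = -S (so S = -5/2 for this example).
    """
    a, b = lx / nx, ly / ny

    def node(i, j):
        return j * (nx + 1) + i

    n_nodes = (nx + 1) * (ny + 1)
    temp = [0.0] * n_nodes
    for j in range(ny + 1):
        for i in range(nx + 1):
            temp[node(i, j)] = (i * a) ** 2 / 4 + (j * b) ** 2

    # residual_i = int(grad phi_i . grad T) - int(phi_i S), element by element
    residual = [0.0] * n_nodes
    for ej in range(ny):
        for ei in range(nx):
            loc = [node(ei, ej), node(ei + 1, ej),
                   node(ei + 1, ej + 1), node(ei, ej + 1)]
            for p in range(4):
                for q in range(4):
                    k_pq = b / (6 * a) * KX[p][q] + a / (6 * b) * KY[p][q]
                    residual[loc[p]] += k_pq * temp[loc[q]]
                residual[loc[p]] -= source * a * b / 4

    # lumped boundary mass: half the length of each adjacent boundary edge
    mass = [0.0] * n_nodes
    for i in range(nx):
        for j in (0, ny):
            mass[node(i, j)] += a / 2
            mass[node(i + 1, j)] += a / 2
    for j in range(ny):
        for i in (0, nx):
            mass[node(i, j)] += b / 2
            mass[node(i, j + 1)] += b / 2

    flux = {n: residual[n] / mass[n] for n in range(n_nodes) if mass[n] > 0}
    return flux, mass, residual


flux, mass, residual = consistent_flux_lumped()
names = {0: "SW", 1: "S", 2: "SE", 3: "W", 5: "E", 6: "NW", 7: "N", 8: "NE"}
for n in sorted(flux):
    print(names[n], round(flux[n], 6))
print("sum of q_j * (lumped mass)_j:", round(sum(flux[n] * mass[n] for n in flux), 6))
```

Under these assumptions the sketch returns zero flux at S and unit flux at E, and the interior-node residual vanishes, which is the discrete statement that the nodal values are exact for this problem.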
Finally, only the consistent flux method applied at all eight boundary nodes would generate fluxes that satisfy the global energy balance $\int_\Gamma q = -\int_\Omega S = 5$, and (only) it would give the 'proper' flux on each face (proper flux at a corner needs to be 'properly' interpreted); in fact, the full results are these: $q_{SW} = q_S = q_W = 0$, $q_{SE} = 1/3$, $q_E = 1$, $q_{NE} = 5/3$, $q_N = 2$, and $q_{NW} = 4/3$, giving $\int_\Gamma q = \sum_j q_j\int_\Gamma \phi_j = 5$. The consistent flux method always utilizes whatever data are available in order to provide fluxes (secondary variable) of equivalent accuracy as are the temperatures (primary variable), at least if consistent mass is used throughout; this is the real bottom line.

A final derogatory remark on any of the simpler methods: if you were to 'revisit' $\Gamma_N$ during the post-processing phase, on which a given (applied) $q$ was part of the problem's data, you would oftentimes be disappointed in the lack of agreement between the applied $q$ and your post-processing estimate of same.

4.2.5 Forces and Moments

Just as the consistent flux method follows from and is intimately connected with GFEM done properly, so too is the case with 'consistent force': a global force/momentum
balance. Just as the 'conservation form' ($\beta = 1$) of the AD equation was required in order to realize a global energy balance via consistent flux calculations, so too is that form required for the NS equations to realize a global force balance via the consistent force method. But, as there, the conservation form is not required for accurate local forces; the consistent force method is all that is required. From (3.13-402) in Section 3.13.8 of Chapter 3,

$$\sum_{j=N_\alpha+1}^{N_{T\alpha}} F_{\alpha j}\int_\Gamma \phi_i\phi_j = \left[M\dot{u} + [K + N(u) + \beta D(u)]u + CP\right]_i^\alpha - [f]_i^\alpha, \tag{4.2-24}$$

with $\alpha = 1, 2$ or $\alpha = 1, 2, 3$ and $i = N_\alpha + 1, N_\alpha + 2, \ldots, N_{T\alpha}$, where the terms $[\;]_i$ represent the $i$-th row of the LHS of the $\alpha$-component of the momentum equation. This lazy way to write the momentum equation at node $i$ is actually justifiable if the 'algorithmic way' described earlier (for heat flux) is implemented; i.e., just 'haul out' the previously saved equation for node $i$. Whether or not the code is written that way, (4.2-24) is the suggested way to compute the forces exerted on the fluid by the boundary at all locations that used Dirichlet BC's for the velocity; it is the consistent force method and will generate forces (e.g., lift and drag) whose accuracy is commensurate with that of the primary variables.

Of course, just as the heat flux from (4.2-16) 'finally' (on a fine-enough mesh) simply represents $\mathbf{n}\cdot\mathbf{K}\cdot\nabla T^h$ from the term $\int \nabla\phi_i\cdot\mathbf{K}\cdot\nabla T^h$, so too does the consistent force from (4.2-24) finally 'see' only the term on the RHS given by $(Ku + CP)_i$, the viscous plus pressure contributions of the traction vector, the former being correct only if the $\gamma = 1$ form (stress-divergence form) is employed. But since $h \to 0$ is rarely seen in GFEM calculations, the 'other' terms in the 'force balance' equation represented by (4.2-24) will often be significant and should not be neglected, at least if you wish to wring the last drop of accuracy from your simulation.

Finally, we direct the reader's attention to the Remarks and discussion following (4.2-17) because, with the proper (and easy, we assert) reinterpretation, they apply equally well for the force components, $F_x$ and $F_y$.

If, however, you insist on using simpler (and less consistent!) methods of estimating reaction forces, the methods discussed relative to the simpler heat flux equation (4.2-20) can be easily adapted to the force calculations, with the 'exception' that the heat flux is a vector quantity whose normal component is the natural desired result, whereas with the force vector, it is usually the $x$- and $y$-components that are the desired results, although sometimes normal and tangential components of the reaction traction vector may be needed. Here we shall merely present the continuum equations for the boundary forces; details are left as exercises (or see Gartling, 1987):

$$F_x = n_x(2\mu\,\partial u/\partial x - P) + n_y\mu(\partial u/\partial y + \partial v/\partial x) \tag{4.2-25}$$

and

$$F_y = n_x\mu(\partial u/\partial y + \partial v/\partial x) + n_y(2\mu\,\partial v/\partial y - P). \tag{4.2-26}$$

Another advantage (besides the alleged simplicity) of this more 'conventional' approach is that the pressure part of the 'lift and drag' could, if desired, be easily separated out
from the viscous part. Also, if normal and tangential forces are desired, then

$$F_n = n_x F_x + n_y F_y \tag{4.2-27}$$

and

$$F_\tau = \tau_x F_x + \tau_y F_y, \tag{4.2-28}$$

where it may be important to use the appropriate definition of $n_x$ and $n_y$ (and thus of $\tau_x$, $\tau_y$). Recall that in Section 3.13.1e, we showed how to compute a (mass-) consistent normal vector at each node on $\Gamma$ in conjunction with applying certain BC's. Although not as serious in this post-processing stage, we believe that the same definition of $\mathbf{n}$ presented there is appropriate here. Alternatively, of course, $\mathbf{n}$ could be simply computed from the equation for an element side [see (4.2-20)], preferably at the right points on $\Gamma$ (typically Gaussian points), with $F_x$, $F_y$ (or $F_n$, $F_\tau$) computed there as well, followed, perhaps, by extrapolation and averaging to nodes.

Turning briefly to the computation of moments/torques, we refer to Figure 4.2-3 and assume that we want the turning moment about the point O, which could be, for example, the center of gravity of a 2D tumbling spacecraft. The moment of the traction vector at the point P is

$$M_P = \mathbf{F}\times\mathbf{r} = F_x r_y - F_y r_x = |\mathbf{r}|\cdot|F_\tau| \tag{4.2-29}$$

and is directed into the paper. The total moment is thus

$$M = \int_\Gamma \mathbf{F}\times\mathbf{r}, \tag{4.2-30}$$

and this torque (per unit length, into the paper) can then be computed 'consistently,' via (4.2-24) or otherwise, via (4.2-25) and (4.2-26).

To conclude this section, we remark: for a recent and striking demonstration of the much higher accuracy from the consistent force method vis-a-vis the 'simpler' method for computing the drag coefficient for axisymmetric flow past a sphere, see Tabata and Itakura (1995).

Fig. 4.2-3 The moment about O is $\mathbf{F}\times\mathbf{r}$. (Figure label: $\mathbf{F} = \boldsymbol{\sigma}\cdot\mathbf{n}$.)
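Once nodal forces are in hand, (4.2-27) through (4.2-29) reduce to a few lines of arithmetic. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, $\mathbf{r}$ is taken as the vector from O to the point of application, the tangent is taken as the normal rotated 90 degrees counterclockwise, and the integral in the total moment is replaced by a plain sum over already-integrated nodal forces.

```python
def normal_tangential(force, normal):
    """Normal and tangential force components, per (4.2-27) and (4.2-28).

    Assumption: the unit tangent is the unit normal rotated 90 degrees
    counterclockwise, (tau_x, tau_y) = (-n_y, n_x).
    """
    fx, fy = force
    nx, ny = normal
    tx, ty = -ny, nx
    return nx * fx + ny * fy, tx * fx + ty * fy


def total_moment(origin, points, forces):
    """Sum of F x r over nodal forces, with the sign convention of (4.2-29).

    Assumption: r runs from `origin` (the point O) to each point P, and each
    entry of `forces` is an already-integrated nodal force (Fx, Fy).
    """
    ox, oy = origin
    moment = 0.0
    for (x, y), (fx, fy) in zip(points, forces):
        rx, ry = x - ox, y - oy
        moment += fx * ry - fy * rx
    return moment


# Hypothetical data: two unit forces applied one unit away from O.
m_total = total_moment((0.0, 0.0), [(0.0, 1.0), (1.0, 0.0)], [(1.0, 0.0), (0.0, 1.0)])
f_n, f_tau = normal_tangential((3.0, 4.0), (1.0, 0.0))
```

With the sign convention of (4.2-29), the two sample forces produce moments of opposite sign, so `m_total` vanishes; a different convention for $\mathbf{r}$ or for the tangent direction changes signs only.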
4.2.6 A Recommended Method for Computing First Derivatives at Nodes

In what is possibly the best way to compute derivatives at nodes, advantage is taken of the known points (in most cases) of super-convergence/extraordinary accuracy, by O. Zienkiewicz and J. Zhu. In a series of papers, beginning with Zhu and Zienkiewicz (1988), and culminating in what appears to be a significant breakthrough, these authors showed in a pair of papers [Zienkiewicz and Zhu, 'Parts One and Two' (1992), pp. 1331 and 1365], not only how to obtain accurate gradients (first derivatives) at nodes, but also how to use the same 'recovered' information to provide a cost-effective error estimator for mesh redesign. Since adaptive meshing is one of the subjects not covered in this text, we shall focus on the first of their good results (but will summarize the second). In order to set the stage, we present a few quotations that let the reader know that they surely like the '$Z^2$ local $L_2$-recovery method,' the first and last from Part One above and the middle from Zhu (1991):

1. After summarizing what is wrong with all of the older methods, they state, 'In this paper we therefore propose a new procedure in which a single and continuous polynomial expansion of the function describing the derivatives is used on an element patch surrounding the node at which recovery is desired. This expansion can be made to fit locally the superconvergent points in a least-squares manner or simply be an $L_2$-projection of the consistent finite element derivatives. The first of these will be shown always to lead to superconvergent recovery of nodal derivatives. ...' Thus, it is only this 'first' method that we shall examine in any detail.

2. 'It has been demonstrated, by numerical experiments, that the recovered nodal values of the derivatives by the discrete superconvergent recovery procedure are superconvergent.
One-order-higher accuracy is achieved by the procedure for the derivatives of linear and cubic elements. Two-orders-higher accuracy is achieved for the derivatives with quadratic elements. In particular, $O(h^4)$ convergence of the nodal values of the derivatives for the quadratic triangular element is reported for the first time.' Finally, the bottom line:

3. 'The results presented in this part of the paper indicate clearly that a new, powerful and economical process is now available, which should supersede the currently used post-processing procedures applied in most codes.' (!)

They do show lots of convincing evidence and the method is immediately intuitively appealing and so obviously excellent that it is a little bit surprising that it took so long to discover ... 'Why didn't I think of that?' It is appealing because it is simple, easy to apply, and easy to say in words: put a polynomial through the super-convergent values at the appropriate points surrounding the node in question and evaluate this polynomial at that node. What polynomial? Where are the appropriate points? How should the appropriate polynomial be obtained? These are the questions answered in their papers. Here, we merely provide a summary, first via a 1D example using quads that was presented in yet another of their papers, Zienkiewicz and Zhu (1991): consider the 2-patch in Figure 4.2-4, with two center nodes (•), three edge nodes (×), and four Gaussian points (○); Figure 4.2-4 is used as follows:

1. The derivatives of the finite element solution are evaluated at the four superconvergent Gaussian points (○).
2. A least-squares fit through these four points is made with a quadratic polynomial.

3. The new nodal values are obtained by evaluating the resulting polynomial at the nodes (• and ×).

4. For the central edge node, we are done. For the two midside nodes, an arithmetic average will be made with the analogous result obtained by applying the same procedure at the next edge node. (Since both values are superconvergent, so too is their average.)

Fig. 4.2-4 A 1D 2-patch of quads.

Fig. 4.2-5 A partial mesh of bilinear elements. (Figure labels: interior nodes 1, 2, 3, and 4; boundary nodes A, B, C, D.)

The mathematics of this process goes as follows:

1. Evaluate $u^h(x) = \sum_j F_j\,d\phi_j(x)/dx$ at the four Gaussian points, $x_i$ ($i = 1, 2, 3, 4$), from the given finite element solution, $F^h(x) = \sum_j F_j\phi_j(x)$, where $F$ is the 'field' variable in question, and $u^h(x)$ is its conventional first derivative (discontinuous at element edges).

2. Seek a new $u(x) = a + bx + cx^2$ such that the sum of the squares of $u(x_i) - u^h(x_i)$ over the four Gaussian points is a minimum; i.e., minimize

$$J = \frac{1}{2}\sum_{i=1}^{4}\left[(a + bx_i + cx_i^2) - u^h(x_i)\right]^2 \tag{4.2-31}$$
with respect to the three unknown coefficients; namely,

$$\partial J/\partial a = 0 = \sum_i\left[(a + bx_i + cx_i^2) - u^h(x_i)\right], \tag{4.2-32}$$

$$\partial J/\partial b = 0 = \sum_i\left[(a + bx_i + cx_i^2) - u^h(x_i)\right]x_i, \tag{4.2-33}$$

$$\partial J/\partial c = 0 = \sum_i\left[(a + bx_i + cx_i^2) - u^h(x_i)\right]x_i^2. \tag{4.2-34}$$

This $3\times 3$ linear system can be rewritten as

$$\begin{bmatrix} 4 & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} \sum u^h(x_i) \\ \sum x_i u^h(x_i) \\ \sum x_i^2 u^h(x_i) \end{bmatrix}. \tag{4.2-35}$$

3. Once $a$, $b$, and $c$ are available, the nodal values are computed from

$$u(x_j) = a + bx_j + cx_j^2, \quad j = 1, 2, 3, \tag{4.2-36}$$

for the three nodes at which 'recovery' is desired. The value so obtained at the 'central' edge node is final; the two midside node results will later be averaged with their like values from adjacent 2-patches.

Now that we see the concept, the rest should be easy, up to a few details. But we present one more example: 2D bilinear elements on a 3, 4, or 5-patch, near the corner of a domain in which some corner refinement is employed (nodes 1, 2, and 3, respectively, in Figure 4.2-5). In each patch we seek a best fit to the first derivative of $F^h(x, y)$, $\partial F^h/\partial x$, evaluated at the appropriate centroids over each patch, via a bilinear function,

$$u(x, y) = a + bx + cy + dxy, \tag{4.2-37}$$

using the method of least-squares:

$$\text{minimize } J = \frac{1}{2}\sum_{i=1}^{n}\left[a + bx_i + cy_i + dx_iy_i - u^h(x_i, y_i)\right]^2, \tag{4.2-38}$$

where $(x_i, y_i)$ define the centroid locations of the $n$ elements (3, 4, or 5) in the patch, and $u^h(x_i, y_i)$ is the conventional, basis-function evaluation of $\partial F^h/\partial x$ at centroid $i$. The minimization in each case leads to the $4\times 4$ linear system ($\sum \equiv \sum_{i=1}^{n}$)

$$\begin{bmatrix} n & \sum x_i & \sum y_i & \sum x_iy_i \\ \sum x_i & \sum x_i^2 & \sum x_iy_i & \sum x_i^2y_i \\ \sum y_i & \sum x_iy_i & \sum y_i^2 & \sum x_iy_i^2 \\ \sum x_iy_i & \sum x_i^2y_i & \sum x_iy_i^2 & \sum x_i^2y_i^2 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} \sum u^h(x_i, y_i) \\ \sum x_iu^h(x_i, y_i) \\ \sum y_iu^h(x_i, y_i) \\ \sum x_iy_iu^h(x_i, y_i) \end{bmatrix}. \tag{4.2-39}$$

Once these three linear systems have been solved, the best-fit expansion (4.2-37) is evaluated, respectively, at nodes 1, 2, and 3.
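The 1D recovery steps can be sketched in a few lines of Python. Everything specific in this illustration is an assumption made here rather than taken from the text: the test field $F = x^3$, the 2-patch $[0, 1] \cup [1, 2]$, the function names, and the small Gaussian-elimination helper used to solve the normal equations. For a cubic field the quadratic-element derivative happens to be exact at the Gaussian points, so the recovered nodal derivatives should match $3x^2$, whereas the raw element derivative at the shared edge node does not.

```python
import math


def solve(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[pivot] = m[pivot], m[k]
        for r in range(k + 1, n):
            factor = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= factor * m[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (m[k][n] - sum(m[k][c] * x[c] for c in range(k + 1, n))) / m[k][k]
    return x


def fe_derivative(xa, xb, fa, fm, fb, x):
    """dF^h/dx on a quadratic element [xa, xb] whose center node is the midpoint."""
    h = xb - xa
    s = 2.0 * (x - 0.5 * (xa + xb)) / h  # local coordinate in [-1, 1]
    dfds = (s - 0.5) * fa - 2.0 * s * fm + (s + 0.5) * fb
    return dfds * 2.0 / h


def recover(gauss_x, gauss_u, nodes):
    """Quadratic least-squares fit (4.2-35), evaluated at the nodes as in (4.2-36)."""
    moments = [sum(x ** k for x in gauss_x) for k in range(5)]
    normal = [[moments[i + j] for j in range(3)] for i in range(3)]
    load = [sum(u * x ** k for x, u in zip(gauss_x, gauss_u)) for k in range(3)]
    a, b, c = solve(normal, load)
    return [a + b * x + c * x * x for x in nodes]


def field(x):
    """Hypothetical test field, sampled at the nodes to mimic a nodal FE solution."""
    return x ** 3


g = 1.0 / math.sqrt(3.0)
gauss_x, gauss_u = [], []
for xa, xb in [(0.0, 1.0), (1.0, 2.0)]:
    xm, half = 0.5 * (xa + xb), 0.5 * (xb - xa)
    for s in (-g, g):
        x = xm + s * half
        gauss_x.append(x)
        gauss_u.append(fe_derivative(xa, xb, field(xa), field(xm), field(xb), x))

recovered = recover(gauss_x, gauss_u, [0.5, 1.0, 1.5])
raw_at_edge = fe_derivative(0.0, 1.0, field(0.0), field(0.5), field(1.0), 1.0)
```

The 2D bilinear patch fit of (4.2-39) follows the same pattern with the four monomials 1, $x$, $y$, $xy$ in place of 1, $x$, $x^2$; forming and solving normal equations directly is adequate for patches this small, although a QR-based least-squares solve would be the more robust choice in production code.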
Remarks:

(1) The recovery polynomial in all cases is the same polynomial used to represent the primary field variable.

(2) Boundary nodal values are also easily evaluated from the corresponding interior patch that contains that node, and this is the recommended procedure; e.g., nodes A and B come from node 1's 3-patch, node C from node 3's 5-patch, and node D from node 2's 4-patch.

(3) Another, somewhat simpler, but certainly less general method for 4-patches that should display equivalent accuracy at lower cost is available simply from interpolating (using the 'conventional' FEM bilinear basis functions) the super-convergent centroid data to the node in the 'center' of the 4-patch, after finding the 'central' node's location relative to the new centroid, which is just the average of the four surrounding centroids.

(4) A second recovery technique was also discussed by Zienkiewicz and Zhu (1992, Part 1) and tested by Zhu (1991), in which the sums in (4.2-38) and (4.2-39) are replaced by integrals; i.e., the new result minimizes the integral of the square of the difference. It seems to us, however (and J. Zhu agrees, personal communication) that it must be somewhat inferior, because it does not so 'strongly' utilize the original super-convergent information; it tends to smear it out. In fact, it does not yield super-convergent results for one of the elements examined: quadratic.

(5) The application to triangular elements, although more tenuous owing to the lack of clearly super-convergent evaluation points, does indeed deliver super-convergent results at nodes if the centroid is used for the linear element and the three midside nodes for the quadratic element; see Zienkiewicz and Zhu (Part 1). The latter, in fact, is 'ultra convergent'; error ~ $O(h^4)$. (Cubic triangles remain to be studied.)
(6) The error estimator follows simply once the 'super-convergent' nodal values are available: assume these to be exact, form the 'conventional' (inaccurate) derivatives using the basis function derivatives, and compute the energy norm of the difference over each patch. Large errors suggest mesh refinement.

(7) For additional details regarding the methods and many experimental results, see the Zienkiewicz and Zhu references, as well as a more recent paper that further improves the method: Labbe and Garon (1995).

(8) For some recent detailed analysis of the $Z^2$-method see Babuska et al. (1997) and, for a recent comparison of several methods, see Zhu (1997).

(9) For another recent superconvergence result on bilinears, see Zheng and Li (1996).

4.2.7 Particle Paths

Three basic types of particle path plots can be created from the velocity field computed by a CFD simulation: particle path plots, dye plots, and material line plots. In a particle path plot, a particle is introduced at a point (or points) in the flow domain, and the path of motion of the particle is tracked based on the computed flow field. In a dye plot, as for the particle path plot, a massless particle is introduced at a point (or points) in the flow domain, and the path of motion of the particle is tracked based on the computed
flow field. The difference between a particle path plot and a dye plot is that in a dye plot a new particle is introduced at the same position as the original particle at a specified time increment. In each of these plots, the position of the particles can be plotted at discrete time increments or the subsequent positions of a particle can be joined by line segments creating a continuous line plot of the particle's motion. The continuous dye plot corresponds exactly to a flow visualization experiment in which tracer dye is introduced into the flow at some point. Note that for a steady-state simulation, particle path and dye plots are identical; they will only differ in the case of a transient simulation.

In a material line plot, a sequence of positions in the flow field at some initial time are connected with straight line segments. The deformation of this 'line', the material line, is then followed in time with the position of the material line being plotted at specified time increments.

All of the above plots require the computation of the trajectory of a massless particle introduced into the flow; i.e., into the computed flow solution. The basic problem can be stated as follows: Given a particle at position $P(x_1, x_2, x_3)$ in the flow domain at time $t_n$, find the position of the particle at time $t_n + \Delta t = t_{n+1}$. (These times and the ensuing time steps are totally independent of the time-integrator used for the momentum equation.) The particle trajectories are obtained by the solution of the equation:

$$\frac{dx_i^p}{dt} = u_i^p, \tag{4.2-40}$$

where $x_i^p$ is the position coordinate of the particle at time $t$.
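As a minimal physical-space illustration of integrating (4.2-40), the Python sketch below advances a massless particle with a predictor-corrector (explicit trapezoid, or Heun) step. This is a stand-in chosen for brevity and is not the element-by-element, reference-coordinate procedure of the text; the analytic rigid-rotation velocity field and all function names are assumptions, used in place of an interpolated finite element velocity.

```python
import math


def advance_particle(pos, t, dt, velocity):
    """One Heun (explicit trapezoid) step of dx/dt = u(x, t)."""
    u0 = velocity(pos, t)
    predicted = tuple(x + dt * u for x, u in zip(pos, u0))
    u1 = velocity(predicted, t + dt)
    return tuple(x + 0.5 * dt * (a + b) for x, a, b in zip(pos, u0, u1))


def particle_path(start, t0, dt, n_steps, velocity):
    """Positions of one particle at t0, t0 + dt, ..., t0 + n_steps * dt."""
    path = [tuple(start)]
    t = t0
    for _ in range(n_steps):
        path.append(advance_particle(path[-1], t, dt, velocity))
        t += dt
    return path


def rigid_rotation(pos, t):
    """Hypothetical steady test field: rotation about the origin at unit rate."""
    x, y = pos
    return (-y, x)


# One full revolution; the particle should return close to its starting point.
path = particle_path((1.0, 0.0), 0.0, 2.0 * math.pi / 1000, 1000, rigid_rotation)
end_x, end_y = path[-1]
```

A dye plot would call `particle_path` repeatedly with the same `start` but successively later release times `t0`; for a steady field such as this one the two kinds of plot trace the same curve, as noted above.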
Using the finite element representation for the velocity $u_i$, namely

$$u_i = \sum_k U_i^k(t)\,\phi_k(\xi_j),$$

where $\xi_j$ are the local coordinates, $\phi_k$ are interpolation functions on the reference element, and $U_i^k$ are the nodal values of the velocity components, we can write (4.2-40) as

$$\frac{dx_i}{dt} = \sum_k U_i^k(t)\,\phi_k(\xi_j).$$

The relationship between the local coordinates $\xi_j$ and the global coordinates $x_i$ is given by the isoparametric mapping

$$x_i(\xi_j) = \sum_k x_i^k\,\phi_k(\xi_j).$$

It follows that for the reference element, (4.2-40) can be written as

$$\frac{dx_i}{dt} = \frac{\partial x_i}{\partial \xi_j}\frac{d\xi_j}{dt} = u_i,$$

or, equivalently, in matrix form, as

$$\frac{d\mathbf{x}}{dt} = \mathbf{J}\,\frac{d\boldsymbol{\xi}}{dt} = \mathbf{u},$$

where $\mathbf{J}$ is the Jacobian of the parametric transformation. Using this formulation, we can transform the solution of the original problem to the solution of the equivalent problem
on the reference element:

$$\frac{d\boldsymbol{\xi}}{dt} = \mathbf{J}^{-1}\mathbf{u},$$

with $\mathbf{u} = \mathbf{u}(\boldsymbol{\xi})$. Once the particle (i.e., fluid) velocities are available at time $t_{n+1}$, the position of the particle at time $t_{n+1}$ in the reference coordinate system is given by, for example, the trapezoid rule:

$$\xi_j(t_{n+1}) = \xi_j(t_n) + \frac{\dot\xi_j(t_n) + \dot\xi_j(t_{n+1})}{2}\,\beta_k\,\Delta t,$$

where $\beta_k$ is an interpolation factor used to compute intermediate positions when the path traverses several elements in the time step $\Delta t$. Formally, $\beta_1$, $\beta_k$, and $\beta_m$ are the fractions of $\Delta t$ spent in the first, the $k$'th, and the last element along the path; they follow from $\tau_1$, $\tau_k$, and $\tau_m$, which are the time at which the particle enters the first element, the time at which the particle enters the $k$'th element along the path (during the time interval $\Delta t$), and the time at which the particle exits the last element in the path; see Figure 4.2-6. Clearly, if during the time interval $\Delta t$ the particle moves within the same element, then $\beta = 1$. (Note that $\Delta t$ here is a user-specified quantity.)

Fig. 4.2-6 A particle path (passing through elements 1, 2, ..., k - 1, k, ..., m between $t_n$ and $t_{n+1}$).

4.2.8 Effective Peclet (Reynolds) Number

A potentially useful diagnostic that has not, to our knowledge, been tested, is a global measure of artificial/numerical dissipation. For advection-diffusion, we call it an effective Peclet number:

$$\mathrm{Pe(eff)} = \frac{\mathrm{Pe}}{1 + T^T N(u)\,T / \kappa\,T^T K\,T}, \qquad (4.2\text{-}41)$$

where Pe $(= u_0 L/\kappa)$ is the given Peclet number, $N(u)$ is the advection matrix, $K$ is the diffusion matrix (sans diffusivity coefficient), and $T^T$ is the transpose of the nodal temperature vector. For the NS equations, simply replace Pe by Re after replacing $T$ by $u$ and $\kappa$ by $\nu$.

The first thing to notice is that Pe(eff) = Pe (presumably the goal!) if $N(u)$ is skew-symmetric, and the next is that Pe(eff) < Pe if $N(u)$ is dissipative. In our GFEM using
either advective [$\beta = 0$ in (2.2-16)] or conservation (flux) form ($\beta = 1$), the advection matrix is indefinite and thus Pe(eff) may be bigger or smaller than Pe.

For those to whom the above definition is not quite obvious, we now present the derivation. Recall the semi-discretized AD ODE from Chapter 2 [(2.2-7)], here with no source term and no BC forcing:

$$M\dot{T} + N(u)\,T + \kappa K T = 0; \qquad (4.2\text{-}42)$$

i.e., we are considering a 'spindown' problem, and we have intentionally modified the definition of the diffusion matrix by factoring out the (constant) diffusion coefficient. The resulting 'energy' balance equation is obtained by taking the scalar product of (4.2-42) with the $T$ vector:

$$\frac{1}{2}\frac{d}{dt}\,T^T M T = -T^T N(u)\,T - \kappa\,T^T K T = -\kappa_{\rm eff}\,T^T K T, \qquad (4.2\text{-}43)$$

where clearly $\kappa_{\rm eff} = \kappa$ if $T^T N(u)\,T = 0$, which is (only) true if $N(u)$ is skew-symmetric [$\beta = 1/2$ in (2.2-16)]. Otherwise, we see an 'effective' diffusivity given by

$$\kappa_{\rm eff} = \kappa + T^T N(u)\,T / T^T K T, \qquad (4.2\text{-}44)$$

leading to (4.2-41) via Pe(eff) $= u_0 L/\kappa_{\rm eff}$. To the extent that Pe(eff)/Pe differs from unity in the general case (not just the spindown model used for the derivation), the simulation is, in this measure, defective. For example, for simple first-order upwinding in 1D, (4.2-44) gives $\kappa_{\rm eff} = \kappa + u\Delta x/2$.

Since we have no actual experience with this diagnostic, all we can do is put it out on the 'table' for consideration, as was done in Baptista et al. (1995). It just might, at least sometimes, be a simple-but-effective way to measure one aspect of simulation 'quality.' Unfortunately, it, like the global vorticity measure put forth in Section 4.2.2, does not tell you how to improve your results, only that they might need it. Even so, these sorts of diagnostics may eventually also find good use in adaptive mesh algorithms, as independent 'quality' measures.
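The first-order upwinding example can be checked numerically with a small sketch of (4.2-44). The setting is an assumption chosen for simplicity rather than the GFEM of the text: a periodic, uniform 1D grid with lumped mass, so that row $i$ of $N(u)T$ is $u(T_i - T_{i-1})$ for upwinding (or $u(T_{i+1} - T_{i-1})/2$ for centered differences) and row $i$ of $KT$ is $(2T_i - T_{i-1} - T_{i+1})/\Delta x$. The function names are hypothetical.

```python
import math

def effective_diffusivity(T, u, kappa, dx, upwind=True):
    """Evaluate kappa_eff = kappa + (T^T N T) / (T^T K T) on a periodic 1D grid.

    Assumes u > 0 for the upwind variant. N and K are applied matrix-free.
    """
    n = len(T)
    t_n_t = 0.0   # accumulates T^T N(u) T
    t_k_t = 0.0   # accumulates T^T K T
    for i in range(n):
        im, ip = (i - 1) % n, (i + 1) % n
        if upwind:
            n_row = u * (T[i] - T[im])             # first-order upwind advection
        else:
            n_row = u * (T[ip] - T[im]) / 2.0      # centered (skew-symmetric) advection
        k_row = (2.0 * T[i] - T[im] - T[ip]) / dx  # diffusion matrix, no diffusivity
        t_n_t += T[i] * n_row
        t_k_t += T[i] * k_row
    return kappa + t_n_t / t_k_t

def effective_peclet(u0, L, kappa_eff):
    """Pe(eff) = u0 L / kappa_eff."""
    return u0 * L / kappa_eff

if __name__ == "__main__":
    n, u, kappa = 32, 1.0, 0.01
    dx = 1.0 / n
    T = [math.sin(2.0 * math.pi * i * dx) for i in range(n)]
    k_up = effective_diffusivity(T, u, kappa, dx, upwind=True)
    k_c = effective_diffusivity(T, u, kappa, dx, upwind=False)
    print("upwind  :", k_up, effective_peclet(u, 1.0, k_up))
    print("centered:", k_c, effective_peclet(u, 1.0, k_c))
```

Under these assumptions the upwind result reproduces $\kappa + u\Delta x/2$ for any non-constant $T$, while the centered operator, being skew-symmetric on a periodic grid, leaves $\kappa_{\rm eff} = \kappa$.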
4.2.9 Pressure Smoothing and Node Moving for $Q_1Q_0$

The infamous checkerboard (CB) pressure mode (modes in 3D) that afflicts the $Q_1Q_0$ element is easily dealt with, because we know (at least for 'simple' grids) the form of the CB-eigenvector. But even for non-simple grids and even when BC's have precluded CB-mode(s), the procedures described below are useful and generally recommended. If, however, there are no spurious modes and if your graphics package will plot centroid values, that too is a good choice, perhaps better. We shall treat the 2D case in sufficient detail that the extension to 3D will be fairly obvious.

For the most rigorous possible filter for the pure CB-mode, see 'Scheme 1' in Sani et al. (1981a). For a mesh comprising rectangles, the filter to be described is also rigorous and, since it also works well on general meshes and is much simpler than 'Scheme 1,' we do not hesitate to advocate it 'universally.' The filtered pressures are computed at the velocity nodes (and thus could then be considered 'smooth', i.e. $C^0$, and interpolated bilinearly), typically (but not necessarily) at the 'central' node of a '4-patch.' If there are $n$ elements sharing the node in question, say
node $i$, the new nodal pressure is simply

$$P_i = \sum_{j=1}^{n} A_{e_j} P_{e_j} \Big/ \sum_{j=1}^{n} A_{e_j}, \qquad (4.2\text{-}45)$$

where $P_{e_j}$ is the pressure on element $j$ and $A_{e_j}$ is its area. This area-weighted filter works because, recall, the CB-pressure mode eigenvector is $P_{CB} = \pm 1/A_e$, and if the element pressure is interpreted as representing the sum of a physical pressure and the CB-mode, it is clear that (at least on a 'nice' 4-patch or 2-patch, the most likely candidates for containing the CB-mode) the above filter precisely annihilates the CB-mode, leaving an area-weighted physical pressure. We regret that we do not know how to inverse-area-weight the physical pressure (for more accuracy) while simultaneously area-weighting (removing) the CB-mode. But, thanks to D. Griffiths, we also have a fix for this 'problem', also from Sani et al. (1981a): simply move the nodes by the same prescription. When (4.2-45) is applied to both the x- and y-components of node $i$'s location, the pressure at this new location (for 'pressure purposes' only, of course) will be more accurate: second-order vs first-order if not done. This can be seen by considering the 4-patch shown in Figure 4.2-7. Since (4.2-45) has removed the non-smooth portion of the pressure, we can consider the filtered result to be smooth; i.e., susceptible to a Taylor series expansion. Thus, rather than applying (4.2-45) at node $i$, let us apply it at location $(x_0, y_0)$, whose position is to be determined. If $P_s(x, y)$ denotes the general, smooth pressure field, and $(x_j, y_j)$ the location of element $j$'s centroid, then we have

$$P_0 = \sum_{j=1}^{4} P_j A_j \Big/ \sum_{j=1}^{4} A_j = \Big\{ \Big[ P_s|_0 + (x_1 - x_0)\left.\frac{\partial P_s}{\partial x}\right|_0 + (y_1 - y_0)\left.\frac{\partial P_s}{\partial y}\right|_0 + O(\Delta x^2) + O(\Delta y^2) \Big] A_1 + \Big[ P_s|_0 + (x_2 - x_0)\left.\frac{\partial P_s}{\partial x}\right|_0 + (y_2 - y_0)\left.\frac{\partial P_s}{\partial y}\right|_0 + O(\Delta x^2) + O(\Delta y^2) \Big] A_2 + [\;\cdot\;]A_3 + [\;\cdot\;]A_4 \Big\} \Big/ (A_1 + A_2 + A_3 + A_4). \qquad (4.2\text{-}46)$$

Fig. 4.2-7 Nodal 'smoothing' for better pressure.
Can we find an $(x_0, y_0)$ such that $P_0(x_0, y_0)$ is more accurate than for any other location? That the answer is 'yes' follows by requesting the coefficients of $\nabla P_s|_0$ in (4.2-46) to vanish, and the result is, simply

$$x_0 = \sum_{j=1}^{4} x_j A_j \Big/ \sum_{j=1}^{4} A_j, \qquad (4.2\text{-}47)$$

with a similar equation for $y_0$ (and $z_0$ for that matter in the 3D case wherein, of course, $A_j \to V_j$, etc.), giving $P_0(x_0, y_0) = P_s(x_0, y_0) + O(\Delta x^2) + O(\Delta y^2)$, and we can apparently do no better. This prescription, suggested in the figure, moves node $i$ to the geometric center of a rectangular 4-patch. That this node smoothing/moving works is demonstrated in Sani et al. (1981a). Thus, if you want the best interior pressures attainable from $Q_1Q_0$, use both (4.2-45) and (4.2-47), the latter using the original centroid coordinates. (We do not propose moving boundary nodes, and we leave as an exercise how to communicate these results to your plotting package.)

This takes care of interior nodes, but not boundary nodes (and especially not corner nodes). These require special treatment and are to be done after processing the internal nodes, with or without nodal smoothing. We (J.M. Leone, Jr. and PMG) discovered how to effectively 'process' the boundary and explained our technique to D. Malkus, who further analyzed and published it in Yao and Malkus (1990). It is based upon two simple observations: (i) the filtering and smoothing of a boundary 2-patch gives a result that applies most properly at a distance of $\Delta x/2$ (or $\Delta y/2$) from the boundary, corresponding to the centroid distance from same; and (ii) boundary corner nodes 'see' only one element pressure, which is generally CB-polluted, and no filtering operation is apparent. To deal with these two minor issues, return to Figure 4.2-7 and focus first on node B (not B' because we do not shift boundary nodes). The 2-patch filter there, $(P_1A_1 + P_2A_2)/(A_1 + A_2) \equiv \bar{P}$, kills the CB-mode and most properly applies on the line joining $i$ to B, halfway in between.
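As a minimal sketch of the interior-node procedure, the area-weighted filter and the area-weighted node location can be applied to a hypothetical 4-patch of rectangles. The patch geometry, the linear test pressure, and the CB amplitude below are illustrative assumptions; element pressures are built as a smooth field sampled at the centroids plus a checkerboard mode of the form $\pm c/A_e$.

```python
def filter_pressure(pressures, areas):
    """Area-weighted nodal pressure: sum(A_j P_j) / sum(A_j)."""
    return sum(a * p for a, p in zip(areas, pressures)) / sum(areas)

def smoothed_location(centroids, areas):
    """Area-weighted average of the element centroids: (x0, y0)."""
    total = sum(areas)
    x0 = sum(a * x for a, (x, _) in zip(areas, centroids)) / total
    y0 = sum(a * y for a, (_, y) in zip(areas, centroids)) / total
    return x0, y0

def smooth_pressure(x, y):
    """Hypothetical smooth (here linear) physical pressure field."""
    return 1.0 + 2.0 * x + 3.0 * y

# Hypothetical rectangular 4-patch around a node at the origin; elements are
# listed counterclockwise starting from the lower-left one.
centroids = [(-0.5, -0.5), (1.0, -0.5), (1.0, 1.5), (-0.5, 1.5)]
areas = [1.0, 2.0, 6.0, 3.0]
signs = [1.0, -1.0, 1.0, -1.0]        # checkerboard pattern
cb_amplitude = 10.0                   # arbitrary CB pollution level

element_pressures = [
    smooth_pressure(x, y) + cb_amplitude * s / a
    for (x, y), a, s in zip(centroids, areas, signs)
]

if __name__ == "__main__":
    p_filtered = filter_pressure(element_pressures, areas)
    x0, y0 = smoothed_location(centroids, areas)
    print("filtered pressure        :", p_filtered)
    print("smooth pressure at node  :", smooth_pressure(0.0, 0.0))
    print("moved location (x0, y0)  :", (x0, y0))
    print("smooth pressure at moved :", smooth_pressure(x0, y0))
```

In this example the CB contribution cancels in the weighted sum, and the filtered value agrees with the smooth field at the moved location (the geometric center of the patch) rather than at the original node, which is the motivation for moving the node.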
Thus, a better boundary value is obtained by simple linear extrapolation:

$$P_B = 2\bar{P} - P_i, \qquad (4.2\text{-}48)$$

a result that presumably could be improved further by utilizing $P_0$ instead of $P_i$ and devising a more elaborate interpolation/extrapolation scheme.

Finally, assume that node C defines a corner of the domain. The best way that we (and Malkus) have found to obtain $P_C$ is via simple linear extrapolation in the master element ($\xi$-$\eta$ coordinates). Thus, presuming $P_A$ and $P_B$ to have been properly obtained, the value at the corner is simply

$$P_C = P_A + P_B - P_i. \qquad (4.2\text{-}49)$$

Except for a few finishing remarks, we are done:

1. For internal corners, (4.2-49) also applies.
2. It is possible (cf. Hughes et al., 1979a, and Yao and Malkus, 1990) but not recommended, to perform linear extrapolation to corner nodes via x-y coordinates in distorted elements rather than in the simpler $\xi$-$\eta$ system. It is both more cumbersome to implement and, according to both J. Leone (personal communication) and Yao and Malkus (1990), less accurate.
3. It is regrettable that, if node smoothing is to be employed, both $P_i$ and $P_0$ need to be available. Perhaps further effort would result in further improvement, such as permitting the motion of boundary nodes via sliding on $\Gamma$....
4. Extension of the remaining details to 3D is left as a not-very-difficult exercise.
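The two extrapolation formulas, (4.2-48) and (4.2-49), are simple enough to sketch directly. The node layout used in the check is an assumption (the figure is not reproduced here): a uniform unit grid with interior node $i$ at (1, 1), boundary nodes A at (0, 1) and B at (1, 0), and corner C at (0, 0), with a hypothetical linear pressure field standing in for filtered values.

```python
def boundary_pressure(p_bar, p_i):
    """Linear extrapolation to a boundary node, P_B = 2*P_bar - P_i.

    p_bar is the 2-patch filtered value, taken to apply halfway between
    interior node i and the boundary node.
    """
    return 2.0 * p_bar - p_i

def corner_pressure(p_a, p_b, p_i):
    """Linear extrapolation to a corner node, P_C = P_A + P_B - P_i."""
    return p_a + p_b - p_i

def linear_field(x, y):
    """Hypothetical linear pressure used only to check the formulas."""
    return 1.0 + 2.0 * x + 3.0 * y

if __name__ == "__main__":
    p_i = linear_field(1.0, 1.0)                              # interior node i
    p_b = boundary_pressure(linear_field(1.0, 0.5), p_i)      # node B at (1, 0)
    p_a = boundary_pressure(linear_field(0.5, 1.0), p_i)      # node A at (0, 1)
    p_c = corner_pressure(p_a, p_b, p_i)                      # corner C at (0, 0)
    print(p_a, p_b, p_c)
```

For a linear field on this assumed layout both extrapolations reproduce the field exactly at A, B, and C, which is all that a linear extrapolation can be expected to do.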
4.3 THREE DIMENSIONS

In the 'real world' we must accept the (expensive) fact that CFD 'post-processing' is an even more important aspect of the simulation than in 2D, including the intelligent use (still to come, hopefully) of color graphics for other purposes than Colorful Fluid Dynamics (CFD's alias, sometimes). While more difficult to actually code up, most (not all) of the 3D derived quantities are conceptually pretty analogous to those in 2D. We shall therefore highlight 'differences' and discuss some 3D 'only' quantities.

4.3.1 Vorticity

Other than the fact that $\boldsymbol{\omega} = \nabla \times \mathbf{u}$ is a vector quantity, the same issues as in 2D are present. Simply repeat for the two other dimensions that which was done in 2D for the 'vertical' vorticity component; e.g., extending what may be the most straightforward of those methods, we present the other two components, $\omega_x$ and $\omega_y$, from (4.2-3), which describes the z-vorticity:

$$Q\omega_x = C_z^T v - C_y^T w \qquad (4.3\text{-}1)$$

and

$$Q\omega_y = C_x^T w - C_z^T u, \qquad (4.3\text{-}2)$$

giving the vector at the vorticity (pressure) nodes. Note that this approach, for $C^{-1}$ pressure, is still element-by-element. An analogous extension of (4.2-1) to 3D would give two more 'similar' equations, all with the same global mass matrix and all giving $L^2$-projections to the velocity basis. Also, the simpler methods using (4.2-4) can be extended in generally obvious ways, although there may be more room for 'ad hocness' in 3D. Finally, the best method is probably the Z2-method described in Section 4.2.6 for 2D; simply extend it to 3D.

4.3.2 Helicity Density

One quantity that has no 2D counterpart is the scalar product of the velocity and vorticity vectors, called helicity density,

$$h_e = \mathbf{u} \cdot \boldsymbol{\omega} = \mathbf{u} \cdot (\nabla \times \mathbf{u}), \qquad (4.3\text{-}3)$$

and its global integral,

$$H_e = \int h_e, \qquad (4.3\text{-}4)$$

the helicity (see, for example, Pelz et al., 1985, and Mobbs, 1981).
As mentioned in Chapter 3 (Section 3.3.2), if $h_e$ is everywhere large, then $\mathbf{u} \times \boldsymbol{\omega}$, the non-linear (advection) term, is necessarily small, and vice versa: 'nonlinearity is depleted if $\boldsymbol{\omega}$ tends to align with $\mathbf{u}$' (Frisch and Orszag, 1990). Our goal here is, however, only to see if we can compute it, not necessarily understand it. At least it is simply a scalar 'at the end of the day.' Expanded, it is

$$h_e = u\omega_x + v\omega_y + w\omega_z. \qquad (4.3\text{-}5)$$
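The pointwise evaluation of the expanded form $h_e = u\omega_x + v\omega_y + w\omega_z$ can be sketched as follows. Everything specific here is an assumption for illustration: the velocity is a hypothetical analytic field (an ABC-type flow, for which the curl equals the velocity, so that $h_e = |\mathbf{u}|^2$), and the vorticity is obtained by central differences rather than by any of the finite element projections discussed in the text.

```python
import math

def abc_velocity(x, y, z, A=1.0, B=0.7, C=0.4):
    """Hypothetical ABC-type velocity field; its curl equals the field itself."""
    return (A * math.sin(z) + C * math.cos(y),
            B * math.sin(x) + A * math.cos(z),
            C * math.sin(y) + B * math.cos(x))

def vorticity(vel, x, y, z, d=1e-4):
    """Curl of vel at (x, y, z) by second-order central differences."""
    def partial(comp, axis):
        lo = [x, y, z]
        hi = [x, y, z]
        lo[axis] -= d
        hi[axis] += d
        return (vel(*hi)[comp] - vel(*lo)[comp]) / (2.0 * d)
    wx = partial(2, 1) - partial(1, 2)   # dw/dy - dv/dz
    wy = partial(0, 2) - partial(2, 0)   # du/dz - dw/dx
    wz = partial(1, 0) - partial(0, 1)   # dv/dx - du/dy
    return wx, wy, wz

def helicity_density(u, omega):
    """Pointwise product h_e = u*wx + v*wy + w*wz."""
    return sum(ui * wi for ui, wi in zip(u, omega))

if __name__ == "__main__":
    point = (0.3, 1.1, -0.8)
    u = abc_velocity(*point)
    omega = vorticity(abc_velocity, *point)
    print("h_e  :", helicity_density(u, omega))
    print("|u|^2:", sum(c * c for c in u))
```

In a finite element code the same product would be formed from nodal or Gauss-point values of the discrete velocity and vorticity; the analytic field is used here only so that the result can be checked.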
The expensive/Galerkin way would be to first obtain the vorticity (discussed above) either in terms of the velocity ($H^1$) or pressure ($L^2$) basis functions, and then perform a 'Galerkin product' ($L^2$-projection):

$$\sum_j h_{e_j} \int \phi_i \phi_j = \int \phi_i\, \mathbf{u}^h \cdot \boldsymbol{\omega}^h, \qquad i = 1, 2, \ldots, N, \qquad (4.3\text{-}6)$$

where we have chosen the 'velocity' basis for $h_e$. (If $\boldsymbol{\omega}^h$ were expressed in the 'pressure' basis, it might make sense for $h_e$ to be, too.) But the double summation and the integral of triple products of basis function terms on the RHS, followed by a 3D mass-matrix problem, leads us to also consider approximations that are perhaps more cost-effective.

The first of these that comes to mind, and probably the simplest, is pointwise multiplication of Gaussian point (where applicable) values; i.e., evaluate both $\mathbf{u}^h$ and $\boldsymbol{\omega}^h$ at the 'appropriate' Gaussian points (centroid for $Q_1Q_0$, 2 x 2 for $Q_2$, etc.) and multiply them to obtain $h_e$ there. Nodal quantities, if needed, could then be obtained in any of the 'usual' ways. If you have already computed nodal values of $\boldsymbol{\omega}^h$, however, and if you believe they are sufficiently accurate, then simple nodal multiplication ($\mathbf{u} \cdot \boldsymbol{\omega}$) is obviously called for.

We close with the suggestion that if helicity is really important to you, you should try several methods in the 'research version' of your code and select the most 'appropriate' version after completing a small research program aimed at a comparison.

The forces and moments in 3D are probably worth a few words, especially the latter, called pitch, yaw, and roll by the experts. The forces should be done consistently (and thus, accurately) using (4.2-24). Consider the x-axis as defining the direction of travel, so that it is also the roll axis. At any given point (P) on the surface of our moving 'vehicle' (say), it is an 'easy' matter to compute the traction force vector, $\mathbf{F} = \boldsymbol{\sigma} \cdot \mathbf{n}$, by methods discussed above.
To compute the roll moment requires 'dropping a perpendicular' from P to the x-axis and calling the resulting distance vector $\mathbf{r}$. Next, the projection of $\mathbf{F}$ onto the plane containing $\mathbf{r}$ and normal to the x-axis is required. Calling this vector $\mathbf{F}_R$ (roll vector), the pointwise roll moment is computed as $\mathbf{M}_R = \mathbf{F}_R \times \mathbf{r}$, and its integral over the entire boundary (surface area) of the vehicle gives the total roll moment. Similarly, if the y-axis points up, it is used in an analogous way to compute the pitching moment. The remaining axis, z, is then used to compute the yaw moment. The actual details of such a computation are left as an exercise.

The remaining derived quantities are simple (Ha!) extensions of what we have already covered in 2D. If we are wrong, then each 'error' is an exercise for the reader (!). Anyhow, we are, we believe, done.
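A rough sketch of the moment computation left as an exercise above: it sums $\mathbf{r} \times \mathbf{F}\,dA$ over a set of surface facets, with $\mathbf{r}$ taken as the position vector of each facet relative to the origin, and reads off the component about each axis. This conventional ordering is an assumption; it gives the same x-component as $\mathbf{F}_R \times \mathbf{r}$ when $\mathbf{r}$ is instead drawn from P to the x-axis. The facet data and function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def total_moment(points, tractions, areas):
    """Sum of r x F dA over surface facets.

    points    : facet centroids (x, y, z), measured from the origin
    tractions : traction vectors F at those points (force per unit area)
    areas     : facet areas dA
    Returns (Mx, My, Mz): the moments about the x-, y-, and z-axes.
    """
    mx = my = mz = 0.0
    for r, f, da in zip(points, tractions, areas):
        cx, cy, cz = cross(r, f)
        mx += cx * da
        my += cy * da
        mz += cz * da
    return mx, my, mz

if __name__ == "__main__":
    # Two hypothetical facets with the same (y, z) and the same (Fy, Fz):
    pts = [(0.0, 1.0, 0.0), (5.0, 1.0, 0.0)]
    trs = [(0.0, 0.0, 1.0), (3.0, 0.0, 1.0)]
    print(total_moment(pts, trs, [1.0, 1.0]))
```

The moment about the x-axis (the roll axis of the text) depends only on the parts of $\mathbf{r}$ and $\mathbf{F}$ lying in the plane normal to x, which is why the two facets in the example contribute equally to it despite different x-locations and x-tractions.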
Appendix 1 Some Element Matrices

In this appendix we list some results that are useful for performing old-fashioned pencil-and-paper analysis, and for code debugging. In 1D and 2D we present many of the (GFEM) element matrices for fluid mechanics that are 'normally' generated in a computer subroutine. We will show mass, diffusion, and advection matrices in 1D, and these plus divergence and gradient matrices (and their 'product' that generates an uncommon Laplacian matrix for the pressure) in 2D. The 1D results are useful only for AD, whereas the 2D results are also useful for the NSE's. At the end, we even show some element matrices for a CVFEM. Figure A-1 shows the generic elements examined, in both 1D and 2D, which form the basis for understanding the matrices to follow. (Sorry, no triangles.)

A.1.1 ADVECTION-DIFFUSION MATRICES

$$m_{ij} = \int_e \phi_i \phi_j \quad \text{(mass matrix)},$$

$$k_{ij} = \int_e \nabla\phi_i \cdot \nabla\phi_j = \int_e \left( \frac{\partial\phi_i}{\partial x}\frac{\partial\phi_j}{\partial x} + \frac{\partial\phi_i}{\partial y}\frac{\partial\phi_j}{\partial y} \right) \quad \text{(diffusion matrix)},$$

$$n_{ij} = \int_e \phi_i\, \mathbf{u} \cdot \nabla\phi_j = \int_e \left( u\phi_i \frac{\partial\phi_j}{\partial x} + v\phi_i \frac{\partial\phi_j}{\partial y} \right) \quad \text{(advection matrix)}.$$

Notes: (1) In 1D, e is element length; in 2D it is element area. (2) In 1D, drop the y-terms. (3) Lower cases are used to denote element (vis-a-vis global) matrices.

A.1.2 ONE-DIMENSIONAL ELEMENT MATRICES

(1) Linear

$$m_{ij} = \frac{l}{6}\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad k_{ij} = \frac{1}{l}\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \qquad n_{ij} = \frac{u}{2}\begin{bmatrix} -1 & 1 \\ -1 & 1 \end{bmatrix}.$$
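Fig. A-1 Linear and quadratic elements for 1D and 2D. The four internal nodes at the 2x2 Gauss points are for the pressure in $Q_2Q_{-1}$, with the fourth one omitted for $Q_2P_{-1}$. All are omitted for $Q_2Q_1$ and for bilinear elements, for which nodes 5 through 9 are also omitted.

(2) Quadratic

$$m_{ij} = \frac{l}{30}\begin{bmatrix} 4 & 2 & -1 \\ 2 & 16 & 2 \\ -1 & 2 & 4 \end{bmatrix}, \qquad k_{ij} = \frac{1}{3l}\begin{bmatrix} 7 & -8 & 1 \\ -8 & 16 & -8 \\ 1 & -8 & 7 \end{bmatrix}, \qquad n_{ij} = \frac{u}{6}\begin{bmatrix} -3 & 4 & -1 \\ -4 & 0 & 4 \\ 1 & -4 & 3 \end{bmatrix}.$$

A.1.3 TWO-DIMENSIONAL ELEMENT MATRICES

(1) Bilinear (full quadrature and 1-point quadrature)

$$m_{ij} = \frac{lh}{36}\begin{bmatrix} 4 & 2 & 1 & 2 \\ 2 & 4 & 2 & 1 \\ 1 & 2 & 4 & 2 \\ 2 & 1 & 2 & 4 \end{bmatrix}, \qquad m_{ij}\ \text{(1-point)} = \frac{lh}{16}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix},$$

$$k_{ij} = k^x_{ij} + k^y_{ij},$$

where

$$k^x_{ij} = \frac{h}{6l}\begin{bmatrix} 2 & -2 & -1 & 1 \\ -2 & 2 & 1 & -1 \\ -1 & 1 & 2 & -2 \\ 1 & -1 & -2 & 2 \end{bmatrix}, \qquad k^x_{ij}\ \text{(1-point)} = \frac{h}{4l}\begin{bmatrix} 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix},$$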
and

$$k^y_{ij} = \frac{l}{6h}\begin{bmatrix} 2 & 1 & -1 & -2 \\ 1 & 2 & -2 & -1 \\ -1 & -2 & 2 & 1 \\ -2 & -1 & 1 & 2 \end{bmatrix}, \qquad k^y_{ij}\ \text{(1-point)} = \frac{l}{4h}\begin{bmatrix} 1 & 1 & -1 & -1 \\ 1 & 1 & -1 & -1 \\ -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 \end{bmatrix},$$

$$n_{ij} = \frac{uh}{12}\begin{bmatrix} -2 & 2 & 1 & -1 \\ -2 & 2 & 1 & -1 \\ -1 & 1 & 2 & -2 \\ -1 & 1 & 2 & -2 \end{bmatrix} + \frac{vl}{12}\begin{bmatrix} -2 & -1 & 1 & 2 \\ -1 & -2 & 2 & 1 \\ -1 & -2 & 2 & 1 \\ -2 & -1 & 1 & 2 \end{bmatrix},$$

$$n_{ij}\ \text{(1-point)} = \frac{uh}{8}\begin{bmatrix} -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \end{bmatrix} + \frac{vl}{8}\begin{bmatrix} -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 \end{bmatrix}.$$

For $u = \sum_j u_j \phi_j$ and $v = \sum_j v_j \phi_j$, the 'nonlinear' advection matrices become

$$n_{ij} = n^x_{ij} + n^y_{ij},$$

where

$$n^x_{ij} = \frac{h}{72}\begin{bmatrix}
(-6u_1 - 3u_2 - u_3 - 2u_4) & (6u_1 + 3u_2 + u_3 + 2u_4) & (2u_1 + u_2 + u_3 + 2u_4) & (-2u_1 - u_2 - u_3 - 2u_4) \\
(-3u_1 - 6u_2 - 2u_3 - u_4) & (3u_1 + 6u_2 + 2u_3 + u_4) & (u_1 + 2u_2 + 2u_3 + u_4) & (-u_1 - 2u_2 - 2u_3 - u_4) \\
(-u_1 - 2u_2 - 2u_3 - u_4) & (u_1 + 2u_2 + 2u_3 + u_4) & (u_1 + 2u_2 + 6u_3 + 3u_4) & (-u_1 - 2u_2 - 6u_3 - 3u_4) \\
(-2u_1 - u_2 - u_3 - 2u_4) & (2u_1 + u_2 + u_3 + 2u_4) & (2u_1 + u_2 + 3u_3 + 6u_4) & (-2u_1 - u_2 - 3u_3 - 6u_4)
\end{bmatrix}$$

and

$$n^y_{ij} = \frac{l}{72}\begin{bmatrix}
(-6v_1 - 2v_2 - v_3 - 3v_4) & (-2v_1 - 2v_2 - v_3 - v_4) & (2v_1 + 2v_2 + v_3 + v_4) & (6v_1 + 2v_2 + v_3 + 3v_4) \\
(-2v_1 - 2v_2 - v_3 - v_4) & (-2v_1 - 6v_2 - 3v_3 - v_4) & (2v_1 + 6v_2 + 3v_3 + v_4) & (2v_1 + 2v_2 + v_3 + v_4) \\
(-v_1 - v_2 - 2v_3 - 2v_4) & (-v_1 - 3v_2 - 6v_3 - 2v_4) & (v_1 + 3v_2 + 6v_3 + 2v_4) & (v_1 + v_2 + 2v_3 + 2v_4) \\
(-3v_1 - v_2 - 2v_3 - 6v_4) & (-v_1 - v_2 - 2v_3 - 2v_4) & (v_1 + v_2 + 2v_3 + 2v_4) & (3v_1 + v_2 + 2v_3 + 6v_4)
\end{bmatrix}.$$

For 1-point quadrature and variable velocity, use the 1-point results above for constant velocity but replace $u$ and $v$ by their average values at the centroid.

Related results for bilinears on a distorted element

(1) Area:

$$A = \tfrac{1}{2}\left[ (x_3 - x_1)(y_4 - y_2) + (x_2 - x_4)(y_3 - y_1) \right]$$

(2) Lumped mass matrix: $m^L_{ij} = \delta_{ij}\, m_i / 36$, where

$$m_1 = [(y_3 - y_2) + 2(y_4 - y_1)][2(x_2 - x_1) + (x_3 - x_4)] - [(x_3 - x_2) + 2(x_4 - x_1)][2(y_2 - y_1) + (y_3 - y_4)],$$

$$m_2 = [2(y_3 - y_2) + (y_4 - y_1)][2(x_2 - x_1) + (x_3 - x_4)] - [2(x_3 - x_2) + (x_4 - x_1)][2(y_2 - y_1) + (y_3 - y_4)],$$

$$m_3 = [2(y_3 - y_2) + (y_4 - y_1)][(x_2 - x_1) + 2(x_3 - x_4)] - [2(x_3 - x_2) + (x_4 - x_1)][(y_2 - y_1) + 2(y_3 - y_4)],$$

$$m_4 = [(y_3 - y_2) + 2(y_4 - y_1)][(x_2 - x_1) + 2(x_3 - x_4)] - [(x_3 - x_2) + 2(x_4 - x_1)][(y_2 - y_1) + 2(y_3 - y_4)].$$
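Since these tables are intended for code debugging, a small sketch of how the constant-velocity bilinear entries can be regenerated numerically may be useful. It assumes a rectangular element of x-length `l` and y-length `h`, nodes numbered counterclockwise from the lower-left corner, and 2 x 2 Gauss quadrature (exact for these integrands on a rectangle); the function names are hypothetical.

```python
import math

# Signs of the reference coordinates at the four nodes, counterclockwise
# starting from (-1, -1).
SX = (-1, 1, 1, -1)
SY = (-1, -1, 1, 1)

def shape(xi, eta):
    """Bilinear shape functions and their reference-coordinate derivatives."""
    phi = [(1 + SX[k] * xi) * (1 + SY[k] * eta) / 4 for k in range(4)]
    dphi_dxi = [SX[k] * (1 + SY[k] * eta) / 4 for k in range(4)]
    dphi_deta = [SY[k] * (1 + SX[k] * xi) / 4 for k in range(4)]
    return phi, dphi_dxi, dphi_deta

def bilinear_matrices(l, h, u=0.0, v=0.0):
    """Mass, x-diffusion, y-diffusion, and advection matrices on an l-by-h rectangle."""
    g = 1.0 / math.sqrt(3.0)                     # 2-point Gauss abscissa
    m = [[0.0] * 4 for _ in range(4)]
    kx = [[0.0] * 4 for _ in range(4)]
    ky = [[0.0] * 4 for _ in range(4)]
    n = [[0.0] * 4 for _ in range(4)]
    w = l * h / 4.0                              # Gauss weight (1) times Jacobian
    for xi in (-g, g):
        for eta in (-g, g):
            phi, dxi, deta = shape(xi, eta)
            dx = [2.0 * d / l for d in dxi]      # d(phi)/dx on the rectangle
            dy = [2.0 * d / h for d in deta]     # d(phi)/dy on the rectangle
            for i in range(4):
                for j in range(4):
                    m[i][j] += w * phi[i] * phi[j]
                    kx[i][j] += w * dx[i] * dx[j]
                    ky[i][j] += w * dy[i] * dy[j]
                    n[i][j] += w * phi[i] * (u * dx[j] + v * dy[j])
    return m, kx, ky, n

if __name__ == "__main__":
    l, h = 2.0, 3.0
    m, kx, ky, n = bilinear_matrices(l, h, u=1.0, v=0.0)
    print("36 m / (l h)   row 1:", [round(36 * a / (l * h), 10) for a in m[0]])
    print("6 l kx / h     row 1:", [round(6 * l * a / h, 10) for a in kx[0]])
    print("6 h ky / l     row 1:", [round(6 * h * a / l, 10) for a in ky[0]])
    print("12 n / (u h)   row 1:", [round(12 * a / h, 10) for a in n[0]])
```

Scaling each computed matrix by the inverse of its tabulated prefactor should return the integer patterns listed above, which gives a quick check on both a subroutine and a transcription of the tables.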
876 SOME ELEMENT MATRICES (2) Biquadratic mu kjj where kx i i ' J and ky ■ K : - J lh ~ 900 = *& + h 90/ / 90h - 16 -4 1 -4 8 -2 -2 8 . 4 ky - 28 4 -1 -7 -32 2 8 14 .-16 " 28 -7 -1 4 14 8 2 -32 .-16 -4 16 -4 1 8 8 -2 -2 4 4 28 -7 -1 -32 14 8 2 -16 -7 28 4 -1 14 -32 2 8 -16 1 -4 16 -4 -2 8 8 -2 4 -1 -7 28 4 8 14 -32 2 -16 -1 4 28 -7 2 -32 14 8 - -16 - -4 1 -4 16 -2 -2 8 8 4 -7 - -1 - 4 28 8 2 - -32 - 14 - -16 4 -1 -7 28 2 8 14 -32 -16 - 8 8 -2 -2 64 4 -16 4 32 -32 -32 8 8 64 -16 -16 -16 32 - 14 14 2 2 112 -16 16 -16 -128 -2 8 8 -2 4 64 4 -16 32 2 14 14 2 -16 112 -16 16 -128 8 -32 -32 8 -16 64 -16 -16 32 - -2 -2 8 8 -16 4 64 4 32 8 8 -32 -32 -16 -16 64 -16 32 - 2 2 14 14 16 -16 112 -16 -128 8 -2 -2 8 4 -16 4 64 32 14 2 2 14 -16 16 -16 112 -128 -32 8 8 -32 -16 -16 -16 64 32 4" 4 4 4 32 32 32 32 256. -16" -16 -16 -16 32 -128 32 -128 256. -16" -16 -16 -16 -128 32 -128 32 256. n u nx- + n- where uh nx: = I / 1 Of\ J 180 --12 4 -l 3 -16 2 4 -6 . -8 -4 12 -3 1 16 6 -4 -2 8 1 -3 12 -4 -4 6 16 -2 8 3 -1 4 -12 4 2 -16 -6 -8 16 -16 4 -4 0 -8 0 8 0 -2 6 6 -2 8 48 8 -16 64 -4 4 -16 16 0 -8 0 8 0 -6 2 2 -6 -8 16 -8 -48 -64 8 -8 -8 8 0 -64 0 64 0
TWO-DIMENSIONAL ELEMENT MATRICES 877 and n u vl 1 or\ 180 --12 3 -l 4 -6 4 2 -16 . -8 3 -12 4 -1 -6 -16 2 4 -8 1 -4 12 -3 -2 16 6 -4 8 -4 1 -3 12 -2 -4 6 16 8 -6 -6 2 2 -48 -8 16 -8 -64 -4 16 -16 4 8 0 -8 0 0 -2 -2 6 6 -16 8 48 8 64 16 -4 4 -16 8 0 -8 0 0 8 8 -8 -8 64 0 -64 0 0 Remark: Variable velocity biquadratic matrices, a la those shown earlier for bilinear elements, are available; however, they required many pages (via Mathematica, by J. Derby), and are not presented. The interested reader can contact PMG to get a copy. (3) Serendipity (8-node quadratic; omit node 9) and where and m, i j Ih 180 " 6 2 3 2 -6 -8 -8 .-6 2 6 2 3 -6 -6 -8 -8 3 2 6 2 -8 -6 -6 -8 2 3 2 6 -8 -8 -6 -6 -6 -6 -8 -8 32 20 16 20 -8 -6 -6 -8 20 32 20 16 -8 -8 -6 -6 16 20 32 20 -6 -8 -8 -6 20 16 20 32 k- ■ — kx A- ky K'J ~ Kij ^ Kij n ij «h+< k*-h ij 90/ I ky — ,J 90h - 52 28 23 17 -80 -6 -40 . 6 " 52 17 23 28 6 -40 -6 .-80 28 52 17 23 -80 6 -40 -6 17 52 28 23 6 -80 -6 -40 23 17 52 28 -40 6 -80 -6 23 28 52 17 -6 -80 6 -40 17 23 28 52 -40 -6 -80 6 28 23 17 52 -6 -40 6 -80 -80 -80 -40 -40 160 0 80 0 6 6 -6 -6 48 0 -48 0 -6 6 6 -6 0 48 0 -48 -40 -80 -80 -40 0 160 0 80 -40 -40 -80 -80 80 0 160 0 -6 -6 6 6 -48 0 48 0 6" -6 -6 6 0 -48 0 48. -80" -40 -40 -80 0 80 0 160.
878 SOME ELEMENT MATRICES where and x uh n : ; = lJ 180 „y vl n ; ; = ,J 180 "-12 8 3 3 -20 14 0 .-26 "-12 3 3 8 -26 0 14 .-20 -8 12 -3 -3 20 26 0 -14 3 -12 8 3 -26 -20 14 0 -3 -3 12 -8 0 26 20 -14 -3 -8 12 -3 -14 20 26 0 3 3 8 -12 0 14 -20 -26 -8 -3 -3 12 -14 0 26 20 20 -20 0 0 0 -40 0 40 14 14 14 14 -48 -40 -48 -40 -14 -14 -14 -14 40 48 40 48 0 20 -20 0 40 0 -40 0 0 0 -20 20 0 -40 0 40 -14 -14 -14 -14 48 40 48 40 14 14 14 14 -40 -48 -40 -48 20 0 0 -20 40 0 -40 0 A. 1.4 NAVIER-STOKES; ADDITIONAL MATRICES A.1.4.1 Gradient Matrix where -•J </ Lt^ and <■ 90/ dy fj> where 0, is a velocity basis function and xjrj is a pressure basis basis function. A.1.4.2 Divergence Matrix where (Note that d = -cl) dij = (dxijdylj), (1) Bilinear velocity, piecewise-constant pressure (QiQ0) This case is carefully covered in Section 3.13.5, including construction of the global GFEM equations from the element matrices.
NAVIER-STOKES; ADDITIONAL MATRICES 879 (2) Biquadratic velocity, continuous bilinear pressure (Q2Q1) (cT)u = [(cx)Jj (cy)Jj], where and (cl) x nj (<)/; h 36 / 36 "5 1 0 .0 "5 0 0 .1 -1 -5 0 0 0 5 - 1 - 0 0 0 -5 -1 0 - -1 -5 0 - 0 0 1 5 -1 0 0 -5 -4 4 0 0 10 10 2 2 -2 -10 -10 -2 0 -4 4 - 0 - 0 0 4 -4 -2 -2 -10 -10 10 2 2 10 -4 0 0 4 -8 8 8 -8 -8 -8 8 8 (3) Serendipity (Q^8)Qi). Remove node 9 from <22<2i: (c[),7 = 3g and (cl) yfij 36 7 1 2 2 7 2 2 1 1 -7 -2 2 2 7 -1 -2 2 -2 -7 1 2 1 -7 -2 2 -2 -1 7 1 2 -2 -7 -8 8 4 -4 6 - 6 - 6 6 -6 -4 -6 4 -6 8 -6 -8 -4 -6 - -8 -6 - 8 -6 4 -6 6 6 6 6 -8 -4 4 8 (4,) Biquadratic velocity, discontinuous bilinear pressure (Q2Q_i) o Case 1. The pressure basis functions are centered at the 2 x 2 gauss points—and helps explain the appearance of V3. <<£) n n = — x '7 72 9 + 575. -(3-75). 9-575, -(3 + 75). and 7- ' (Cy ),7 = — X r 9+573, -(3 + 75), 9-575, .-(3-75), 3-75 -(9 + 575), 3 + 75. -(9-575). -(3 + 75). 9 + 575, -(3-75). 9-575, -(9-575), 3 + 75, -9 + 575, 3 - 1/3, -(9-575), 3-73. -(9 + 575), 3 + 73, -(3 + 75), 9-575, -(3-75). 9 + 575, 3-73. -(9-573). 3 + 73, -(9 + 573). -4(3 + 73), 4(3 + 73). -4(3-73), 4(3 - 73), 4(273 + 3), 4(273 + 3), -4(273-3). -4(273-3). 4(273-3). -4(273 + 3). -4(273 + 3). 4(273-3), 4(3 - 73). -4(3 + 73). 4(3 + 73), -4(3 - 73), 4(3 - 73). -4(3-73), 4(3 + 73). -4(3 + 73), 4(273-3), 4(273-3), -4(273 + 3), -4(273 + 3), 4(273 + 3) -4(273-3), -4(273-3), 4(273 + 3), -4(3 + 73). 4(3 - 73), 4(3 - 73), -4(3 + 73). -1673) 1675 1675 -1675 -1673- -1673 1675 1675. o Case 2. The pressure basis functions are centered at the four corner nodes; this basis is equivalent to that in case l, and both will thus give the same numerical results. The
880 SOME ELEMENT MATRICES 'new' one is simply simpler. The element matrix has, in fact, already been presented—it is simply the cT matrix of <22<2i- The difference is in how the matrix is used to construct the global equations; for Q2Q1 there is only one pressure per global node (C° pressure), whereas for Q2G-1 there are as many pressures at each global node as there are elements sharing that node (C~l pressure). See main text for further details—Section 3.13.6. (5) Biquadratic velocity, discontinuous linear pressure (Q2P-i) The pressure basis functions are centered at the first three of the four Gauss points. (Any three points in the plane suffice to define the linear pressure.) (cl)u 36 2^3 + 3, 2^3-3, 2^3-3, a/3, -5a/3, -5a/3, -3(a/3- 1), 3(a/3- 1), 3(a/3- 1), and (CTy)lj 36 3(a/3+ 1), -a/3, -(2a/3-3), -3(a/3-1), 3(a/3-1), 5a/3, -a/3, -(2v^-3), -(2^3 + 3), 2a/3 + 3, a/3, 3(a/3- 1), -4a/3, 4a/3, 0, 4(2a/3-3), -8 a/3. -12, -4a/3, 4(2a/3 + 3), 4a/3, -8a/3, 0, 12, -16a/3 16a/3 0 -3(a/3+1), 5 a/3, -(2^ + 3), 12, 8a/3, -4(2a/3-3), 0, -4a/3, 4a/3, -12, 8a/3, -4(2a/3 + 3), 0, -4a/3, 4a/3, 0 •16a/3 16a/3 A.1.4.3 Consistent Laplacian Matrix (CTMZlC)ij = Y,CikMZklkCkj, where ML is the lumped mass matrix (diagonal), and it is important to note that this is an exceptional case: we are no longer dealing with an element matrix; the summation over k ranges over all velocity modes in the mesh. CTM~[lC is a (sparse) global matrix. The following 16-element patch is required for CTM^lC, where • corresponds to the x-portion and x to the y-portion: #- t§- $$- &- #- i$- 4$- h& *$- .0 ■*■ ^^~ ^^ The two stencils (not matrices!) that comprise CTMLlC corresponding to node 0 are h CTxMllCx 111 -1 -8 18 -8 -1 -4 -32 72 -32 -4 -1 -8 18 -8 -1J
TWO-DIMENSIONAL CONTROL VOLUME FINITE ELEMENT MATRICES 881 and CTyMl)Cy 12h -1 -8 18 -8 L-l -4 -32 72 -32 -4 -1 -8 18 -8 -1. A.1.5 TWO-DIMENSIONAL CONTROL VOLUME FINITE ELEMENT MATRICES m, ■i j Ih 64 "9 3 1 .3 3 9 3 1 1 3 9 3 3 1 3 9 Remark: Unfortunately, the symmetry is not preserved if the element shape is non-rectangular. K; j h 81 - 3 -3 -1 . 1 -3 3 1 -1 -1 1 3 -3 1- -1 -3 3. / + Sh " 3 1 -1 .-3 1 3 -3 -1 -1 -3 3 1 -3 -1 1 3 Remark: Unfortunately, the symmetry is not preserved if the element shape is non-rectangular. n u uh T6 3 -3 -1 1 3 -3 -1 1 1 -1 -3 3 ll -1 -3 3J vl + 76 3 1 1 3 1 3 -3 -1 1 3 -3 -1 3 1 -1 -3 Finally, if m = V. iij<f>j, and V = ]T\- Vj<f>j, the 'non-linear' matrices becomes where «o- = 96 7(«i +«3) + 2(«3 + u4), -7(«i + us) - 2(«3 + u4), — U\ — U2 — 2(«3 + U4), Ui + U2 + 2(U3 + U4), and u 4 + 4- 7(«i + u2) + 2(«3 + u4), 2(m + u2) + M3 + "4, —7(«i + u2) — 2(«3 + M4), —2(«i + M2) — "3 ~ "4. 2(«i + «2) + «3 + "4 —2(u\ + u2) — "3 — "4 — Ml — «2 ~ 2(«3 + M4), U] + U2 +2(W3 +M4), -2(«i + U2) - 7(«3 + M4), -2(«i + M2) _ 7("3 + "4) 2(«i + u2) + 7(«3 + M4), 2(«i + «2) + 7(«3 + «4> . 7 96 '7(l>, + V4) + 2(V{ +V2), 2(V2 + Vi) + Vi + V4, -2(V2 + Vi)-V{ -V4, -HVi +V4)-2(Vi +V2), 2(l>, +V4) + V2 + V?, l(V2 + Vi) + 2(Vi +V4), -HV2+V3)-2(Vi +V4), -2(V{ + V4)-V2-V^ 2(V{ + V4) + V2 + V?, l(V2 + Vi) + 2(V\ +V4), -l(V2+Vi)-2(Vl +V4), -2(Vi +V4)-V2-V3, 7(V{+V4) + 2(V2 + Vi) ' 2(V2 + Vi) + Vi +V4 -2(V2+V])-Vl -V4 -1(V{ + V4)-2(V2 + Vi)_
Appendix 2 Further Comparison of Finite Elements and Finite Volumes A.2.1 INTRODUCTION Since, by some measures, recent trends seem to favor the more physically appealing finite volume methods (there is not just one) over the GFEM (and other FEM's), it seems interesting to offer another comparison of the two. This we now do by comparing both GFEM and CVFEM (see Sections 2.2.6 and 2.5.3 for the latter) via bilinear approximations on rectangles. In so doing, we will also digress somewhat to offer some subjective remarks on much (most) of the previous FEM textbook literature and the manner in which the discrete equations are derived. And to make it perfectly clear that we are embarking on a path that is clearly not clear, we shall present two ostensibly rather 'opposite' viewpoints regarding the important-but-ambiguous subject of 'local conservation' via GFEM. In the former, we shall take the position that, alas and alack, the 'poor' GFEM is simply devoid of the often highly coveted local conservation property. Then we shall turn the tables and explain our version of the local conservation properties that, when properly interpreted/understood, assert quite the opposite—GFEM does display local conservation, both at the nodal level and at element level. This latter viewpoint was pioneered in modern Italy by Comini et al. (1991, 1992) and has been further 'interpreted' via some personal communications with G. Comini—to whom we remain grateful. Herein, we present our version of their arguments. At the end, we expect that the reader will either be more confused than ever or will 'choose' one or the other side! In either case, it may be worthwhile reiterating one of the unwritten 'laws' pertaining to numerical simulation: both the final discrete equations, and their numerical solution, are indifferent to the manner of interpretation. A.2.2 VIEWPOINT ONE Let us begin on the negative side of the argument.... 
We have seen very few FEM textbooks that do not (we assert) confuse the newcomer by carrying the finite element method a bit too far by writing so-called element-level equations, which erroneously imply, in most cases, element-level 'balances' (momentum/force, heat, mass, etc.). Many peer-reviewed research papers, unfortunately, also promulgate this confusion. Chapter 5 in the book by Burnett (1987) is one of the clearest and most elaborate treatments of 'element equations' (and with no erroneous implications) that we have seen, even though we still believe that such an approach is unnecessary.
We believe that the proper way to generate the GFEM equations is just the way we did it in Sections 2.2-2.4 and 3.13.5. It is then sad but true that the GFEM lacks (usually) that single attractive property of many finite volume/control volume methods: simple-to-interpret, local-level 'physical' balances. But to help relieve the consistent confusion that still abounds in even the most recent of FEM texts, we shall have a go at a proper description of element equations and element assembly. Before doing so, however, we admit that the confusing concepts are rather subtle, thus rationalizing the rampant confusion, which confusion seems to be more 'prevalent' among engineers than mathematicians.

Consider for simplicity the 1D AD equation,

$$\frac{\partial T}{\partial t} + u\frac{\partial T}{\partial x} = \kappa\frac{\partial^2 T}{\partial x^2} + S(x) \quad \text{on } 0 \le x \le L, \qquad (\text{A.2.2-1})$$

where $u$ and $\kappa$ are constant. We shall discretize the weak form of this equation with both linear and quadratic basis functions in the manner of many FEM texts written by engineers ('free body diagrams'); i.e., we will actually write equations for each node of each element, as if the element were not connected to its neighbors, but we will make every effort to make the presentation both palatable and rigorous (no 'arm waving'). To this end, we first write the finite element approximation to the weak form of (A.2.2-1) over an arbitrary interval $(A, B)$ located somewhere in $(0, L)$, with $0 \le A < B \le L$:

$$\int_A^B \left[ \phi_i \left( \frac{\partial T^h}{\partial t} + u\frac{\partial T^h}{\partial x} \right) + \kappa\frac{\partial\phi_i}{\partial x}\frac{\partial T^h}{\partial x} \right] dx = \int_A^B \phi_i S\, dx + \left. \kappa\phi_i\frac{\partial T^h}{\partial x} \right|_A^B, \qquad (\text{A.2.2-2})$$

where $T^h = \sum_j T_j \phi_j(x)$. Next, an energy balance on the interval $(A, B)$ is, from (A.2.2-1),

$$\frac{d}{dt}\int_A^B T = \int_A^B S + \left. \left( \kappa\frac{\partial T}{\partial x} - uT \right) \right|_A^B, \qquad (\text{A.2.2-3})$$

and is one 'goal' of our GFEM approximate solution. Consider now the (typical) 3-patch in Figure A.2.2-1 using linear elements.
Next we write two equations for element $i+1$, one at the left end and one at the right, using (A.2.2-2); i.e., $A = x_i$, $B = x_{i+1}$, starting at the left, node $i$:

$$\frac{l_{i+1}}{6}(2\dot T_i + \dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_i) + \kappa\frac{T_i - T_{i+1}}{l_{i+1}} = \int_{l_{i+1}} \phi_i S + \kappa\phi_i\frac{\partial T^h}{\partial x}\bigg|_{x_i}^{x_{i+1}}. \tag{A.2.2-4}$$

Fig. A.2.2-1 Local GFEM linear element solution.

The key issue before us lies in the interpretation of the last term: what it is and what it is not. First the easy part: since $\phi_i = 1$ at $x_i$ and $\phi_i = 0$ at $x_{i+1}$, the term simplifies to $-\kappa(dT^h/dx)|_{x_i}$. Now, in spite of what may seem obvious, what it definitely is not is $\kappa(d/dx)\sum_j T_j\phi_j(x)|_{x_i}$, since the first derivative of the $C^0$ function, $T^h(x)$, is not even uniquely defined at $x_i$. [Also foolish, and fatal, would be to attempt to approximate $dT^h/dx|_{x_i}$ by $(T_{i+1} - T_i)/l_{i+1}$, which would cancel the diffusion term on the LHS and is equivalent to not having integrated by parts.] What it is is an unknown, but consistent, GFEM diffusional flux at node $i$ (sometimes called a secondary variable, to distinguish it from the primary variable, $T^h$; see, for example, Reddy (1993); see also Chapter 4 for further discussion of 'consistent flux'). We shall even give this new unknown a name: $-\kappa(dT^h/dx)|_{x_i^+} \equiv q_i^+$, which means the (consistent) flux at $x = x_i^+$, with the + sign ($x_i^+ = x_i + \varepsilon$ for $\varepsilon > 0$, where $\varepsilon \to 0$) indicating that we are quite prepared for a discontinuous flux at $x_i$. (This quantity also goes under the name of 'generalized nodal flux', a name that might make more sense in multi-dimensions; see below.) Thus, the first 'element' equation is

$$\frac{l_{i+1}}{6}(2\dot T_i + \dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_i) + \kappa\frac{T_i - T_{i+1}}{l_{i+1}} = \int_{l_{i+1}} \phi_i S + q_i^+, \tag{A.2.2-5}$$

which, as it will turn out, is actually nothing but the defining equation for $q_i^+$! And this is precisely what causes the confusion. If we had the GFEM solution, then (A.2.2-5) could be used to compute the consistent flux through the left end of $l_{i+1}$. Similarly, at the right end of element $i+1$, we have

$$\frac{l_{i+1}}{6}(\dot T_i + 2\dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_i) + \kappa\frac{T_{i+1} - T_i}{l_{i+1}} = \int_{l_{i+1}} \phi_{i+1} S - q_{i+1}^-, \tag{A.2.2-6}$$

where $q_{i+1}^- = -\kappa(dT^h/dx)|_{x_{i+1}^-}$ is the (unknown, but consistent) heat flux through the right end of $l_{i+1}$. ($q_i^+ > 0 \Rightarrow$ net influx to element $i+1$ at its left end and $q_{i+1}^- > 0 \Rightarrow$ net outflux from element $i+1$ at its right end, because $-\kappa(dT^h/dx)$ describes flux in the positive $x$-direction.)
Important Observation: Whereas we have written two equations per element, which would lead to $2N$ total equations with only $N$ unknown nodal temperatures, we also have introduced two additional unknowns per element so that the total number of unknowns is now $3N$. What to do? Well, since $\phi_i + \phi_{i+1} = 1$ in $l_{i+1}$, it is interesting first to sum the two element equations (in three unknowns); this gives

$$\frac{l_{i+1}}{2}(\dot T_i + \dot T_{i+1}) + u(T_{i+1} - T_i) + 0 = \int_{l_{i+1}} S + q_i^+ - q_{i+1}^-, \tag{A.2.2-7}$$

which, if the heat fluxes on the RHS were 'just right,' would represent an element-level energy balance. And this is a good place to 'prove' again that $\kappa(dT^h/dx)$ cannot be expressed via $T^h = \sum_j T_j\phi_j$ because that would yield $q_i^+ - q_{i+1}^- = 0$ in (A.2.2-7), in clear violation of an element energy balance via the total absence of the conduction term. To make further progress, we write (A.2.2-2) for node $i$ in element $l_i$, and that for node $i+1$ in element $l_{i+2}$:

$$\frac{l_i}{6}(\dot T_{i-1} + 2\dot T_i) + \frac{u}{2}(T_i - T_{i-1}) + \kappa\frac{T_i - T_{i-1}}{l_i} = \int_{l_i} \phi_i S - q_i^- \tag{A.2.2-8}$$

and

$$\frac{l_{i+2}}{6}(2\dot T_{i+1} + \dot T_{i+2}) + \frac{u}{2}(T_{i+2} - T_{i+1}) + \kappa\frac{T_{i+1} - T_{i+2}}{l_{i+2}} = \int_{l_{i+2}} \phi_{i+1} S + q_{i+1}^+. \tag{A.2.2-9}$$

The next, and crucial, step is to require/invoke flux continuity at each node (yes, even though our $C^0$ temperature field seems to preclude same) via

$$q_i^+ = q_i^- \quad \forall i, \tag{A.2.2-10}$$

which closes the system (we now have $3N$ equations in $3N$ unknowns). Although $T^h \in C^0 \Rightarrow dT^h/dx \in C^{-1}$, the consistent flux, $q_i^- = q_i^+$, is continuous; $q_i \in C^0$ by (required) definition. (This is an example of why these concepts are subtle/slippery.) Now we return to the ostensible energy balance given by (A.2.2-7), and use (A.2.2-8) and (A.2.2-10) to eliminate $q_i^+$ and (A.2.2-9) and (A.2.2-10) to eliminate $q_{i+1}^-$, to give

$$\frac{l_i}{6}(\dot T_{i-1} + 2\dot T_i) + \frac{l_{i+1}}{2}(\dot T_i + \dot T_{i+1}) + \frac{l_{i+2}}{6}(2\dot T_{i+1} + \dot T_{i+2}) + \frac{u}{2}(T_{i+1} - T_i + T_{i+2} - T_{i-1}) + \kappa\left(\frac{T_i - T_{i-1}}{l_i} + \frac{T_{i+1} - T_{i+2}}{l_{i+2}}\right) = \int_{l_i} \phi_i S + \int_{l_{i+1}} S + \int_{l_{i+2}} \phi_{i+1} S, \tag{A.2.2-11}$$

which is a confusing mess. But is it also an energy balance? That is to say, is there some sort of local energy balance lurking in these equations? Could it be an energy balance over element $i+1$ and over one-half of each of its neighbors? Unfortunately, no. What it is is simply the sum of two GFEM nodal equations; i.e., the sum of two global equations with no local energy balance. Proof: the GFEM equations for nodes $i$ and $i+1$, obtained (of course) by applying (A.2.2-2) with $A = 0$, $B = L$ so that the last term is zero, are, respectively,

$$\frac{l_i}{6}(\dot T_{i-1} + 2\dot T_i) + \frac{l_{i+1}}{6}(2\dot T_i + \dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_{i-1}) + \kappa\left(\frac{T_i - T_{i-1}}{l_i} + \frac{T_i - T_{i+1}}{l_{i+1}}\right) = \int_{l_i + l_{i+1}} \phi_i S \tag{A.2.2-12}$$

and

$$\frac{l_{i+1}}{6}(\dot T_i + 2\dot T_{i+1}) + \frac{l_{i+2}}{6}(2\dot T_{i+1} + \dot T_{i+2}) + \frac{u}{2}(T_{i+2} - T_i) + \kappa\left(\frac{T_{i+1} - T_i}{l_{i+1}} + \frac{T_{i+1} - T_{i+2}}{l_{i+2}}\right) = \int_{l_{i+1} + l_{i+2}} \phi_{i+1} S, \tag{A.2.2-13}$$

whose sum is easily seen to be (A.2.2-11). Thus, the potential element-level energy balance suggested by (A.2.2-7) is a red herring. Only piecewise-constant test functions, such as those often employed in FVMs, can generate local balances.
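The flux-continuity statement (A.2.2-10) can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes the steady case ($\dot T = 0$), a constant source $S$ (so that $\int_{l} \phi_i S = S\,l/2$ on each element), Dirichlet end conditions, and a NumPy dense solve; the function names `solve_steady_ad` and `consistent_fluxes` are hypothetical. It assembles the linear GFEM equations, then evaluates $q_i^+$ from (A.2.2-5) and $q_i^-$ from (A.2.2-8) at every interior node.

```python
import numpy as np

def solve_steady_ad(x, u, kappa, S, T_left, T_right):
    """Linear GFEM for u T' = kappa T'' + S (steady, constant S), Dirichlet ends."""
    n = len(x)
    A = np.zeros((n, n))
    b = np.zeros(n)
    for e in range(n - 1):
        l = x[e + 1] - x[e]
        # Element contributions: u * int(phi_a dphi_b/dx) and kappa * int(dphi_a/dx dphi_b/dx)
        adv = (u / 2.0) * np.array([[-1.0, 1.0], [-1.0, 1.0]])
        dif = (kappa / l) * np.array([[1.0, -1.0], [-1.0, 1.0]])
        A[e:e + 2, e:e + 2] += adv + dif
        b[e:e + 2] += S * l / 2.0  # int(phi_a S) for constant S
    # Replace the two end equations by the Dirichlet conditions
    for node, value in ((0, T_left), (n - 1, T_right)):
        A[node, :] = 0.0
        A[node, node] = 1.0
        b[node] = value
    return np.linalg.solve(A, b)

def consistent_fluxes(x, T, u, kappa, S):
    """Return (q_plus, q_minus) at the interior nodes.

    q_plus[i] uses the element to the right of node i, cf. (A.2.2-5);
    q_minus[i] uses the element to the left of node i, cf. (A.2.2-8).
    """
    l = np.diff(x)
    i = np.arange(1, len(x) - 1)
    q_plus = 0.5 * u * (T[i + 1] - T[i]) + kappa * (T[i] - T[i + 1]) / l[i] - 0.5 * S * l[i]
    q_minus = 0.5 * S * l[i - 1] - 0.5 * u * (T[i] - T[i - 1]) - kappa * (T[i] - T[i - 1]) / l[i - 1]
    return q_plus, q_minus

x = np.array([0.0, 0.1, 0.25, 0.45, 0.7, 1.0])  # deliberately nonuniform mesh
u, kappa, S = 2.0, 0.5, 1.0                     # illustrative values only
T = solve_steady_ad(x, u, kappa, S, T_left=0.0, T_right=1.0)
q_plus, q_minus = consistent_fluxes(x, T, u, kappa, S)
print("max |q_plus - q_minus| =", np.max(np.abs(q_plus - q_minus)))
```

The two flux arrays come from different elements and different formulas, yet their difference at each interior node is exactly the residual of the assembled GFEM equation (A.2.2-12), so it vanishes to solver precision once the global system has been solved.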
To be absolutely sure that our analysis is both clear and useful, we reiterate: once the GFEM solution has been found, the flux at node $i$ computed as

$$q_i^+ = \frac{l_{i+1}}{6}(2\dot T_i + \dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_i) + \kappa\frac{T_i - T_{i+1}}{l_{i+1}} - \int_{l_{i+1}} \phi_i S$$

from (A.2.2-5) and that computed as

$$q_i^- = \int_{l_i} \phi_i S - \frac{l_i}{6}(\dot T_{i-1} + 2\dot T_i) - \frac{u}{2}(T_i - T_{i-1}) - \kappa\frac{T_i - T_{i-1}}{l_i}$$

from (A.2.2-8) are precisely the same quantities even though they are obtained from very different equations. [Setting $q_i^+ = q_i^-$ above simply returns the GFEM equation (A.2.2-12), for node $i$.] These are the consistent (internal) fluxes that are, of course, computable only after having solved the GFEM equations for $T_i(t)$, $i = 1, 2, \ldots, N$. And, even though the diffusive portion of the flux suffers a jump at the element interface (node $i$ above), the complete consistent flux does not. (Note, however, that continuous consistent flux is a different issue than element-level energy balances.)

To finish the story with linears, let us connect what we have presented above to that in many FEM texts. To obtain the assembled equation for node $i$, using the above, cumbersome (?), element equation-based approach, simply form (A.2.2-5) and (A.2.2-8), then add them, using (A.2.2-10); the result is (A.2.2-12).

For quadratic basis functions, we merely outline the steps to obtain the analog of (A.2.2-11), leaving the details to the interested reader:
1. Form the element equation for the left node, introducing the additional unknown, say, $q_i^+$.
2. Form the element (and global) equation for the center node, which introduces no unknown fluxes because $\phi_{i+1} = 0$ at the two ends of the element.
3. Form the element equation for the right node, introducing $q_{i+2}^-$.
4. Sum the three equations, giving a potential element-level balance.
5. Eliminate $q_i^+$ and $q_{i+2}^-$ from (1) and (3) in favor of the left and right neighboring element equations.
6. The final result in Step (5) is the same as summing all three GFEM equations corresponding to the given element, and there is (again) no local energy balance.

Moving to 2D, we consider (only) the bilinear element and address the equally confusing concept of '...upon assembly, all element boundary integrals will cancel... because the path (element boundary segment) is traversed once in each direction.' The 4-patch (Figure A.2.2-2) below will be all we need to perform the analysis.

[Figure: 4-patch of elements 1 to 4 surrounding node 0, with neighboring nodes NW, N, NE, W, E, SW, S, SE and the element boundary integral paths $\Gamma^{(i)}$.]
Fig. A.2.2-2 Element boundary integral paths.

The six 'paths' shown relate to the boundary (line) integrals below; e.g., $\Gamma_0^{(4)}$ is the boundary integral path
associated with node 0 in element 4. [The paths traverse only one-half of each element's perimeter because the basis (test) function associated with the node in question is zero on the remaining portion of the element boundary.] The 'plan' is three-fold: (i) to show how to form the proper assembled GFEM equation 'one element at a time'; (ii) to show that the thus-introduced unknown inter-element flux terms are not very useful in general; and (iii) to use the results to discuss the concept of an element-level energy balance. The element equation for node 0 from element 1's 'viewpoint' is

$$\int_{A_1} \left[\phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h\right) + \kappa\nabla\phi_0\cdot\nabla T^h\right] = \int_{A_1} \phi_0 S + \int_{\Gamma_0^{(1)}} \phi_0\kappa\frac{\partial \bar T^h}{\partial n}, \tag{A.2.2-14}$$

which is the 2D analog of (A.2.2-2), and, as an alternate to introducing boundary $q$'s as above, we have placed a bar over $T^h$ in the boundary integral to distinguish it from $T^h = \sum_j T_j\phi_j$ on the LHS; $\partial \bar T^h/\partial n$ is an additional unknown quantity.

Remark: Note again that, as in 1D, the attempted identification $\partial \bar T^h/\partial n = \sum_j T_j\,\partial\phi_j/\partial n$ would lead to the total absence of any diffusion terms in an element-level energy balance that could be attempted by repeating (A.2.2-14) for each of the other three nodes in element 1, summing the four equations, and using $\sum_j \phi_j = 1$. This result can also be obtained by realizing (i) that the above procedure is completely equivalent to not having integrated $\phi\nabla^2 T^h$ by parts to obtain the weak form, and (ii) that $\nabla^2 T^h = 0$ for bilinear basis functions. Higher-order elements would not give a null result, but they would also not give a correct result, an exercise we leave for the reader.
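Point (ii) of the Remark can be made explicit with a short worked equation. This is a sketch under the assumption that the element is a rectangle aligned with the coordinate axes, so that the bilinear interpolant is a polynomial in $x$ and $y$ (the coefficients $a$, $b$, $c$, $d$ below are generic and not from the text):

```latex
\begin{aligned}
T^h(x,y)\big|_{\text{element}} &= a + b\,x + c\,y + d\,xy, \\
\frac{\partial^2 T^h}{\partial x^2} &= 0, \qquad
\frac{\partial^2 T^h}{\partial y^2} = 0
\quad\Longrightarrow\quad \nabla^2 T^h = 0 \ \text{inside the element}, \\
\sum_j \phi_j = 1 \ &\Longrightarrow\ \sum_j \nabla\phi_j = 0
\ \Longrightarrow\ \sum_j \int_{A_1} \kappa\,\nabla\phi_j\cdot\nabla T^h = 0 .
\end{aligned}
```

The last line shows why the diffusion term on the LHS disappears when the four element equations are summed, and the second line shows why identifying $\partial \bar T^h/\partial n$ with the derivative of $T^h$ itself would, via the divergence theorem, contribute no net diffusion on the RHS either.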
If we write three more element equations for node 0, one from each element, and sum all four, we obtain, with $\Omega_0 = A_1 + A_2 + A_3 + A_4$,

$$\int_{\Omega_0} \left[\phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h\right) + \kappa\nabla\phi_0\cdot\nabla T^h\right] = \int_{\Omega_0} \phi_0 S + \sum_{i=1}^{4}\int_{\Gamma_0^{(i)}} \phi_0\kappa\frac{\partial \bar T^h}{\partial n}, \tag{A.2.2-15}$$

which becomes the GFEM equation for node 0 after imposing 'flux-continuity' by setting the sum of the boundary integrals to zero; i.e., we must define the boundary integrals to cancel in pairs, the 2D equivalent of the flux-continuity statement in (A.2.2-10). So, with more effort than is needed, we can derive the assembled GFEM equations for the full mesh by repeating the above exercise. (It is more effort than needed because we wrote element-level equations with the necessary introduction of the element boundary integral terms, which later drop out.)

Suppose now that we have solved the GFEM equations so that $T^h(\mathbf{x}, t)$ is available. Is there any utility in returning to element equations and the associated inter-element flux-like terms? Well, let us try first to compute the diffusive flux between elements 1 and 2; i.e., through 0-E. We begin by returning to (A.2.2-14), rearranged and expanded; and using

$$\int_{\Gamma_0^{(1)}} \phi_0\kappa\frac{\partial \bar T^h}{\partial n} = -\int_0^N \phi_0\kappa\frac{\partial \bar T^h}{\partial x}\,dy - \int_0^E \phi_0\kappa\frac{\partial \bar T^h}{\partial y}\,dx,$$

we obtain

$$-\int_0^N \phi_0\kappa\frac{\partial \bar T^h}{\partial x}\,dy - \int_0^E \phi_0\kappa\frac{\partial \bar T^h}{\partial y}\,dx = \int_{A_1}\left[\phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right) + \kappa\nabla\phi_0\cdot\nabla T^h\right] \equiv \mathrm{RHS}_0^{(1)}, \tag{A.2.2-16}$$

where $\mathrm{RHS}_0^{(1)}$ denotes a known RHS associated with node 0 and element 1. Thus, whatever the two terms on the LHS really mean, their sum is now available. But we are interested in only the second of them (the heat flux between elements 1 and 2). What to do? Well, we might try writing the analogous equation for element 2, but it is obvious from the above sketch that this would introduce another boundary integral, $\kappa\int_0^S \phi_0(\partial \bar T^h/\partial x)\,dy$, which we do not care about. Going to node E will not help either because it brings in the two boundaries E-NE and E-SE. So let us return to the other three equations for node 0 and write out each, taking due account (as we did above) of the outward pointing normal on each element:

$$\int_0^E \phi_0\kappa\frac{\partial \bar T^h}{\partial y}\,dx - \int_0^S \phi_0\kappa\frac{\partial \bar T^h}{\partial x}\,dy = \mathrm{RHS}_0^{(2)}, \tag{A.2.2-17}$$

$$\int_0^S \phi_0\kappa\frac{\partial \bar T^h}{\partial x}\,dy + \int_0^W \phi_0\kappa\frac{\partial \bar T^h}{\partial y}\,dx = \mathrm{RHS}_0^{(3)}, \tag{A.2.2-18}$$

and

$$\int_0^N \phi_0\kappa\frac{\partial \bar T^h}{\partial x}\,dy - \int_0^W \phi_0\kappa\frac{\partial \bar T^h}{\partial y}\,dx = \mathrm{RHS}_0^{(4)}, \tag{A.2.2-19}$$

where all RHS's are known. Now we might hope to get somewhere because we have four equations in the four unknown 'generalized' heat fluxes. Thus, writing these four equations as $Ax = b$, with $x_1 = \int_0^N \phi_0\kappa(\partial \bar T^h/\partial x)\,dy$, $x_2 = \int_0^E \phi_0\kappa(\partial \bar T^h/\partial y)\,dx$, $x_3 = \int_0^S \phi_0\kappa(\partial \bar T^h/\partial x)\,dy$, and $x_4 = \int_0^W \phi_0\kappa(\partial \bar T^h/\partial y)\,dx$ gives

$$A = \begin{bmatrix} -1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & -1 \end{bmatrix},$$

which, unfortunately, is singular (the four equations are not linearly independent; the rows sum to zero), and our hopes are dashed: we cannot extricate the individual element-side heat fluxes from the given sums of them. So far, no 'utility'!

But we can get some information on heat flux, although it will be nodal rather than elemental. Suppose we want the total flux through W-0-E that is associated with node 0; i.e., that between elements 1 and 4 above and elements 2 and 3 below? This is do-able, and we shall finally see some utility of these secondary variables. If we sum (A.2.2-16) and (A.2.2-19), we get

$$-\int_0^E \phi_0\kappa\frac{\partial \bar T^h}{\partial y}\,dx - \int_0^W \phi_0\kappa\frac{\partial \bar T^h}{\partial y}\,dx = \mathrm{RHS}_0^{(1)} + \mathrm{RHS}_0^{(4)}, \tag{A.2.2-20}$$
where the LHS looks like $\int_W^E \phi_0 q\,dx = [(l_1 + l_2)/2]\,\bar q_0$, to give

$$\bar q_0 = \frac{2}{l_1 + l_2}\left[\mathrm{RHS}_0^{(1)} + \mathrm{RHS}_0^{(4)}\right] \tag{A.2.2-21}$$

as an 'average' vertical flux 'through' (associated with) node 0. Note too that this same (consistent) flux could also be obtained 'from below'; i.e., via elements 2 and 3. Just add (A.2.2-17) and (A.2.2-18) to get

$$\int_0^E \phi_0\kappa\frac{\partial \bar T^h}{\partial y}\,dx + \int_0^W \phi_0\kappa\frac{\partial \bar T^h}{\partial y}\,dx = \mathrm{RHS}_0^{(2)} + \mathrm{RHS}_0^{(3)}, \tag{A.2.2-22}$$

the LHS of which is just the negative of that in (A.2.2-20). That they agree (up to a sign) is a simple consequence of the fact that $\sum_{i=1}^{4} \mathrm{RHS}_0^{(i)} = 0$, which is (of course) the GFEM equation for node 0. To get the flux through a single element side, say 0-E, one would first compute $\bar q_E$ in an analogous way to obtaining (A.2.2-21), which, of course, requires the introduction of the element to the right of element 1. Then $\bar q_0\phi_0(x) + \bar q_E\phi_E(x)$ can be used to describe, pointwise, the ($C^0$) heat flux through edge 0-E, for any $x$ between 0 and E. While certainly legitimate and probably sometimes useful, these consistent flux calculations are usually more useful at true domain boundaries, a subject we shall return to in Chapter 4.

To conclude our element-equation analysis, let us examine the possibility of element-level energy balances. Thus, we focus on a single element, say number 3, in the above mesh and write all four element equations:

$$-\int_{SW}^{W} \phi_{SW}\kappa\frac{\partial \bar T^h}{\partial x}\,dy - \int_{SW}^{S} \phi_{SW}\kappa\frac{\partial \bar T^h}{\partial y}\,dx = \int_{A_3}\left[\phi_{SW}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right) + \kappa\nabla\phi_{SW}\cdot\nabla T^h\right], \tag{A.2.2-23}$$

$$-\int_{SW}^{S} \phi_S\kappa\frac{\partial \bar T^h}{\partial y}\,dx + \int_S^0 \phi_S\kappa\frac{\partial \bar T^h}{\partial x}\,dy = \int_{A_3}\left[\phi_S\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right) + \kappa\nabla\phi_S\cdot\nabla T^h\right], \tag{A.2.2-24}$$

$$\int_S^0 \phi_0\kappa\frac{\partial \bar T^h}{\partial x}\,dy + \int_W^0 \phi_0\kappa\frac{\partial \bar T^h}{\partial y}\,dx = \int_{A_3}\left[\phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right) + \kappa\nabla\phi_0\cdot\nabla T^h\right], \tag{A.2.2-25}$$

and

$$\int_W^0 \phi_W\kappa\frac{\partial \bar T^h}{\partial y}\,dx - \int_{SW}^{W} \phi_W\kappa\frac{\partial \bar T^h}{\partial x}\,dy = \int_{A_3}\left[\phi_W\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right) + \kappa\nabla\phi_W\cdot\nabla T^h\right], \tag{A.2.2-26}$$

which sum to

$$\int_S^0 \kappa\frac{\partial \bar T^h}{\partial x}\,dy - \int_{SW}^{W} \kappa\frac{\partial \bar T^h}{\partial x}\,dy + \int_W^0 \kappa\frac{\partial \bar T^h}{\partial y}\,dx - \int_{SW}^{S} \kappa\frac{\partial \bar T^h}{\partial y}\,dx = \int_{A_3}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right). \tag{A.2.2-27}$$

Now, except for the 'elusive' nature of the terms on the LHS, it is clear that (A.2.2-27) is at least an attempt at an element energy balance. In fact, the LHS can be rewritten as $\int_{\Gamma_3} \kappa(\partial \bar T^h/\partial n)$, which, from the divergence theorem, is $\int_{A_3} \kappa\nabla^2 \bar T^h$. Unfortunately, however, the element energy balance is just as elusive as $\bar T^h$ (or $\partial \bar T^h/\partial n$); perhaps the best interpretation of (A.2.2-27) is to consider it as the definition of the sum of the diffusive flux terms that would give an element-level balance. In fact, these terms are simply the residual of the known terms on the RHS.

Our 'bottom line' on this issue is simply the following advice: do not form your GFEM equations via element-level equations; use the proper (global) support of the test functions, and you will not get confused. Although this approach implies the construction of nodal (global) equations spanning more than one element, and thus seems to ignore/bypass/minimize/preclude 'classical' finite element thinking/methodology, it really does not. It simply admits (perhaps even emphasizes) that 'looping through the elements,' or 'global assembly,' is merely a bookkeeping procedure in which element contributions to global equations is the proper name of the game. Finally, with regard to the lack of element balances: sometimes one has no choice but to simply let the mathematics speak for itself and let the physics take a back seat. But as we have shown, finite volume methods with their true local conservation usually do not produce as accurate an approximation to the PDE solution as does GFEM! The GFEM trades element (control volume) balances for higher accuracy.

A.2.3 VIEWPOINT TWO

The starting point is the same; write (A.2.2-5) as

$$q_i^+ = -\kappa\frac{T_{i+1} - T_i}{l_{i+1}} + \frac{l_{i+1}}{6}(2\dot T_i + \dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_i) - \int_{x_i}^{x_{i+1}} \phi_i S, \tag{A.2.3-1}$$
where $q_i^+$ is the (consistent) diffusional flux (in the $x$-direction; $q_i^+ > 0 \Rightarrow$ flow in the $x$-direction) through node $i$ as seen from element $i+1$; $q_i^+ > 0 \Rightarrow$ flux is into the element. Yes, in spite of the 'extra' terms on the RHS, $q_i^+$ is a diffusional flux, a concept that will become more clear after we write the analogous equation for node $i$ from the 'viewpoint' of element $i$:

$$q_i^- = -\kappa\frac{T_i - T_{i-1}}{l_i} - \frac{l_i}{6}(2\dot T_i + \dot T_{i-1}) - \frac{u}{2}(T_i - T_{i-1}) + \int_{x_{i-1}}^{x_i} \phi_i S, \tag{A.2.3-2}$$
wherein $q_i^-$ is the diffusional flux through node $i$ as seen from element $i$, and wherein the sign reversals are interesting, and significant. Both (A.2.3-1) and (A.2.3-2) should be interpreted in the following sense: the RHS comprises a 'first cut' (the first term) and a set of 'correction' terms for the diffusive flux at node $i$. This is the key to finding local GFEM balances, and is rationalized/justified, at least in part, by realizing that mesh refinement will leave only the first term on the RHS's, $-\kappa\,\partial T/\partial x$; the other three terms shrink to zero (no more corrections needed). 'Nodal' conservation is now invoked by the requirement that the (consistent) diffusional flux be continuous; i.e., we enforce $q_i^+ = q_i^-$, which is just (A.2.2-10). The result, of course, is simply the GFEM equation for node $i$. Once the full set of GFEM equations is solved for $T_i(t)$, $i = 1, 2, \ldots, N$, either (A.2.3-1) or (A.2.3-2) can be used to compute the diffusional flux at node $i$. Thus, although the GFEM equation for node $i$ does not describe a nodal energy balance per se, the 'post-processing' equations, (A.2.3-1) and (A.2.3-2), do.

The element energy balance (for $l_{i+1}$) begins by writing the analog to (A.2.3-1) for node $i+1$; namely, (A.2.2-6):

$$-q_{i+1}^- = \kappa\frac{T_{i+1} - T_i}{l_{i+1}} + \frac{l_{i+1}}{6}(\dot T_i + 2\dot T_{i+1}) + \frac{u}{2}(T_{i+1} - T_i) - \int_{x_i}^{x_{i+1}} \phi_{i+1} S, \tag{A.2.3-3}$$

which (for $-q_{i+1}^- > 0$) describes the diffusional flux into $l_{i+1}$ from the right. Clearly the sum of (A.2.3-1) and (A.2.3-3) represents the total diffusional flux into $l_{i+1}$; it yields

$$l_{i+1}\frac{\dot T_i + \dot T_{i+1}}{2} + u(T_{i+1} - T_i) = \int_{x_i}^{x_{i+1}} S + q_i^+ - q_{i+1}^- \tag{A.2.3-4}$$

as the total energy balance for element $i+1$. Comparing this with (A.2.2-3), it becomes (again) clear that the term $\kappa(\partial T/\partial x)|_A^B$ is represented by $q_i^+ - q_{i+1}^-$, the total diffusional flux entering element $l_{i+1}$. This is the GFEM element energy balance.
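The 'first cut plus correction' reading of (A.2.3-1) can be illustrated on a problem where everything is known in closed form. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes the steady, pure-diffusion case $-\kappa T'' = S$ on $(0, 1)$ with $T(0) = T(1) = 0$, $u = 0$, and constant $S$, whose exact solution is $T = S\,x(1 - x)/(2\kappa)$. It also assumes the known property that the linear GFEM solution of this particular 1D problem coincides with the exact solution at the nodes, so the nodal values are simply sampled from the exact solution. The function name `flux_errors` is hypothetical.

```python
import numpy as np

kappa, S = 1.0, 1.0  # illustrative constants

def exact_T(x):
    return S * x * (1.0 - x) / (2.0 * kappa)

def exact_flux(x):
    # -kappa dT/dx for the exact solution
    return -S * (1.0 - 2.0 * x) / 2.0

def flux_errors(n_elem):
    """Max nodal error of the 'first cut' and of the consistent flux q_i^+."""
    x = np.linspace(0.0, 1.0, n_elem + 1)
    T = exact_T(x)            # assumed equal to the GFEM nodal values (see lead-in)
    l = x[1] - x[0]
    i = np.arange(n_elem)     # left node of each element
    first_cut = -kappa * (T[i + 1] - T[i]) / l
    # (A.2.3-1) with dT/dt = 0 and u = 0; int(phi_i S) = S l / 2 for constant S
    q_plus = first_cut - 0.5 * S * l
    err_first = np.max(np.abs(first_cut - exact_flux(x[i])))
    err_consistent = np.max(np.abs(q_plus - exact_flux(x[i])))
    return err_first, err_consistent

for n in (4, 8, 16):
    e_first, e_cons = flux_errors(n)
    print(n, e_first, e_cons)
```

In this special case the source correction removes the entire $O(l)$ error of the one-sided difference, so the consistent flux is exact at the nodes while the first cut alone is off by $S\,l/2$. With advection or a time-dependent solution the correction terms are no longer exactly error-cancelling, but they play the same role.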
All that is needed to 'accept' it is the previously stated key interpretation: the $q$'s as given by (A.2.3-1) and (A.2.3-3) are the 'proper' diffusional flux terms, comprising the sum of a simple first approximation and appropriate (and not so simple) correction terms.

Moving to 2D, we begin with (A.2.2-14), where $T^h = \sum_j T_j\phi_j$, refer back to Figure A.2.2-2, and introduce the additional defining equation for the nodal diffusive (and consistent) flux:

$$\int_{\Gamma_0^{(i)}} \phi_0\kappa\frac{\partial \bar T^h}{\partial n} = -\int_{\Gamma_0^{(i)}} \phi_0 q_0^{(i)} = -\frac{l + h}{2}\,\bar q_0^{(i)}, \tag{A.2.3-5}$$

wherein $q_0^{(i)}$ is the (pointwise) normal diffusional flux leaving [for $q_0^{(i)} > 0$] element $i$ through $\Gamma_0^{(i)}$ ('through' node 0) and $\bar q_0^{(i)}$ is the average outward (when positive) normal diffusional flux through $\Gamma_0^{(i)}$. Both $q_0^{(i)}$ and $\bar q_0^{(i)}$ are unknown quantities until the GFEM equations are solved, after which the second term on the RHS of (A.2.2-14) is available.

Remark: The term $\int_{\Gamma_0^{(i)}} \phi_0\kappa(\partial \bar T^h/\partial n)$ is often labeled a 'generalized flux' or some other such mostly meaningless term, and contributes to the extant confusion.
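Equations (A.2.2-14) and (A.2.3-5) represent/describe the consistent diffusional flux through $\Gamma_0^{(1)}$ as a 'first approximation,' $-\int_{A_1} \kappa\nabla\phi_0\cdot\nabla T^h$, plus several correction terms, as

$$\frac{l + h}{2}\,\bar q_0^{(1)} = -\int_{A_1} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_1} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-6}$$

wherein the second integral vanishes as $A_1 \to 0$; i.e., for $l, h \to 0$, this equation becomes

$$\bar q_0^{(1)} = \frac{\kappa}{3(l + h)}\left\{h\,\frac{2(T_E - T_0) + (T_{NE} - T_N)}{l} + l\,\frac{2(T_N - T_0) + (T_{NE} - T_E)}{h}\right\} + O(l, h), \tag{A.2.3-7}$$

which further leads to

$$\bar q_0^{(1)} = \kappa\,\frac{h\,\dfrac{\partial T}{\partial x}\Big|_0 + l\,\dfrac{\partial T}{\partial y}\Big|_0}{l + h} + O(l, h), \tag{A.2.3-8}$$

a consistent description of a particular heat flux vector, including direction. (We presume here that the limit is taken with $l/h$ fixed.) Similar equations apply at node 0 for the other three elements:

$$\int_{\Gamma_0^{(2)}} \phi_0 q_0^{(2)} = -\int_{A_2} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_2} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-9}$$

$$\int_{\Gamma_0^{(3)}} \phi_0 q_0^{(3)} = -\int_{A_3} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_3} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-10}$$

and

$$\int_{\Gamma_0^{(4)}} \phi_0 q_0^{(4)} = -\int_{A_4} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_4} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right). \tag{A.2.3-11}$$

The sum of all four nodal equations for node 0 yields, upon invoking/enforcing flux continuity at node 0,

$$\sum_{i=1}^{4}\int_{\Gamma_0^{(i)}} \phi_0 q_0^{(i)} = 0, \tag{A.2.3-12}$$

or (A.2.2-15) with the second term on the RHS dropped, which is (of course) the GFEM equation (fully 'summed') for node 0. Summarizing, once the full GFEM set of equations is solved, one can return to the above nodal equations and compute the appropriate (and consistent) diffusional fluxes at each node. This is what Comini et al. describe as a nodal energy balance.

Moving now toward an element energy balance, we begin by writing all four of the above type nodal equations, but this time for a single element, say number 3, as follows:

$$\int_{\Gamma_0^{(3)}} \phi_0 q_0^{(3)} = -\int_{A_3} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_3} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-13}$$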
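$$\int_{\Gamma_W^{(3)}} \phi_W q_W^{(3)} = -\int_{A_3} \kappa\nabla\phi_W\cdot\nabla T^h - \int_{A_3} \phi_W\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-14}$$

$$\int_{\Gamma_{SW}^{(3)}} \phi_{SW} q_{SW}^{(3)} = -\int_{A_3} \kappa\nabla\phi_{SW}\cdot\nabla T^h - \int_{A_3} \phi_{SW}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-15}$$

and

$$\int_{\Gamma_S^{(3)}} \phi_S q_S^{(3)} = -\int_{A_3} \kappa\nabla\phi_S\cdot\nabla T^h - \int_{A_3} \phi_S\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right). \tag{A.2.3-16}$$

Simple summation of these four equations yields

$$\int_{\Gamma_0^{(3)}} \phi_0 q_0^{(3)} + \int_{\Gamma_W^{(3)}} \phi_W q_W^{(3)} + \int_{\Gamma_{SW}^{(3)}} \phi_{SW} q_{SW}^{(3)} + \int_{\Gamma_S^{(3)}} \phi_S q_S^{(3)} = \int_{A_3}\left(S - \frac{\partial T^h}{\partial t} - \mathbf{u}\cdot\nabla T^h\right), \tag{A.2.3-17}$$

wherein the LHS describes the total energy leaving element 3 via diffusion, and the equation can be rewritten, using (A.2.3-5), as

$$\bar q_0^{(3)} + \bar q_W^{(3)} + \bar q_{SW}^{(3)} + \bar q_S^{(3)} = \frac{2}{l + h}\int_{A_3}\left(S - \frac{\partial T^h}{\partial t} - \mathbf{u}\cdot\nabla T^h\right). \tag{A.2.3-18}$$

It is thus clear that (A.2.3-17) and (A.2.3-18) are statements of local energy conservation at the element level. While not quite as straightforward to obtain as when a CVFEM procedure is employed, the resulting GFEM equation does indeed describe an equivalent local conservation law.

Remark: The 'completion' of the energy balance actually requires either that $\nabla\cdot\mathbf{u} = 0$ or that the flux-divergence form of advection be employed, so that the RHS of (A.2.3-17) can be rewritten as

$$\int_{A_3} S - \frac{d}{dt}\int_{A_3} T^h - \int_{\Gamma_3} T^h\,\mathbf{n}\cdot\mathbf{u}.$$

As an 'aside,' it may be worth pointing out that the above discussion/derivations involved a 'mass lumping' approximation that is actually not required. By doing more work, via consistent mass, it may be possible to generate more accurate 'first estimates' of the local diffusional (and still consistent) heat fluxes. This can be done by expanding the pointwise flux, $q_i^{(e)}$, the flux associated with node $i$ in element $e$, via the (in this case) linear basis functions on the element's boundary [$\Gamma_i^{(e)}$]:

$$q_i^{(e)} = \sum_{j \text{ on } \Gamma_i^{(e)}} \hat q_j^{(e)}\phi_j, \tag{A.2.3-19}$$

where only three of the four elements' nodes make contributions; e.g., for node 0 in element 3, (A.2.3-19) gives

$$q_0^{(3)} = \hat q_W^{(3)}\phi_W + \hat q_0^{(3)}\phi_0 + \hat q_S^{(3)}\phi_S, \tag{A.2.3-20}$$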
because the basis function for node SW, $\phi_{SW}$, is zero on $\Gamma_0^{(3)}$. Repeating this procedure for the other three nodes in element 3 (S, SW, and W) and performing the integrations on the LHS yields, rather than (A.2.3-13) through (A.2.3-16),

$$\frac{1}{6}\left[l\,\hat q_W^{(3)} + 2(l + h)\,\hat q_0^{(3)} + h\,\hat q_S^{(3)}\right] = -\int_{A_3} \kappa\nabla\phi_0\cdot\nabla T^h - \int_{A_3} \phi_0\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-21}$$

$$\frac{1}{6}\left[2(l + h)\,\hat q_W^{(3)} + h\,\hat q_{SW}^{(3)} + l\,\hat q_0^{(3)}\right] = -\int_{A_3} \kappa\nabla\phi_W\cdot\nabla T^h - \int_{A_3} \phi_W\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-22}$$

$$\frac{1}{6}\left[h\,\hat q_W^{(3)} + 2(l + h)\,\hat q_{SW}^{(3)} + l\,\hat q_S^{(3)}\right] = -\int_{A_3} \kappa\nabla\phi_{SW}\cdot\nabla T^h - \int_{A_3} \phi_{SW}\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-23}$$

and

$$\frac{1}{6}\left[l\,\hat q_{SW}^{(3)} + 2(l + h)\,\hat q_S^{(3)} + h\,\hat q_0^{(3)}\right] = -\int_{A_3} \kappa\nabla\phi_S\cdot\nabla T^h - \int_{A_3} \phi_S\left(\frac{\partial T^h}{\partial t} + \mathbf{u}\cdot\nabla T^h - S\right), \tag{A.2.3-24}$$

whose sum is again the element-level energy balance given by (A.2.3-18), appropriately, with $\bar q_j$ replaced by $\hat q_j$. This 4x4 system can be solved for the consistent mass version of the consistent flux equations on element 3, and the procedure can be repeated for every element in the domain, wherein we point out that the nodal equations on a boundary element in which the flux is specified need not be written.

Exercises for the reader:
1. Verify that the 'consistent mass' approach in the formulation of the nodal equations such as (A.2.3-6) is also consistent/legitimate. (Hint: in addition to flux continuity at node 0, similar enforcement is required at nodes N, S, E, and W.)
2. Show how the element-level matrices can be used to finally implement the (optional) flux calculations.
3. Extend the analysis to arbitrary meshes and to higher-order elements.

If 'Viewpoint Two' is accepted and 'Viewpoint One' is rejected, then the principal argument favoring FVM over FEM has been vanquished.
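The 4x4 'consistent mass' flux system for one element can be sketched as follows. This is a minimal illustration under assumptions: the element is an $l \times h$ rectangle, its nodes are ordered (0, W, SW, S), each boundary path consists of the two sides meeting at the node, and linear basis functions are integrated exactly along each side (giving side-length/3 on the diagonal and side-length/6 for the neighbor on that side). The right-hand-side numbers are hypothetical placeholders for the known area integrals in the four nodal equations, and `consistent_flux_matrix` is a hypothetical name.

```python
import numpy as np

def consistent_flux_matrix(l, h):
    """Boundary 'mass' matrix for nodes ordered (0, W, SW, S) on an l-by-h element.

    Sides 0-W and SW-S have length l; sides 0-S and W-SW have length h.
    """
    d = 2.0 * (l + h)
    return np.array([[d,   l,   0.0, h],
                     [l,   d,   h,   0.0],
                     [0.0, h,   d,   l],
                     [h,   0.0, l,   d]]) / 6.0

l, h = 0.2, 0.1
M = consistent_flux_matrix(l, h)

# Hypothetical values standing in for the known right-hand sides of the
# four nodal equations (they would come from the solved temperature field).
rhs = np.array([0.03, -0.01, 0.02, 0.005])

q_hat = np.linalg.solve(M, rhs)       # nodal flux values on the element boundary
lumped_total = 0.5 * (l + h) * q_hat.sum()
print(q_hat, lumped_total, rhs.sum())
```

Each column of the matrix sums to $(l + h)/2$, so adding the four equations reproduces the lumped element balance: $[(l + h)/2]\sum_j \hat q_j$ equals the sum of the four right-hand sides. The matrix is strictly diagonally dominant for any positive $l$ and $h$, so the element-level solve is always well posed, in contrast to the singular system met under Viewpoint One.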
Appendix 3
Projections, Orthogonal and Not - and Projection Methods

A.3.1 INTRODUCTION

[Warning to some readers: you have really got to want to understand the nitty-gritty about projections to justify the time and effort required to assimilate this appendix!]

This appendix is intended to supplement Chapters 2 and 3 and, accordingly, begins with scalar systems and ends with divergence-free vector systems. It may be devoured whole or piecemeal, the latter by reading only the scalar portions of it when Chapter 2 refers to this appendix and skipping the vector portion until referred to in Chapter 3. It is a selective segment of a very general concept and has been designed to focus on the applications associated with this book. For further background and more information, the reader may consult, among others, the following references: Mikhlin (1964), Bronshtein and Semendyayev (1985).

Galerkin's method is often called a projection method, and one of the key goals of this appendix is to explain why. This will obviously entail a fairly careful definition and description of 'a projection', as well as that of a projection method. In the various linear vector spaces (of functions, or vectors, as they are also called) associated with the branch of mathematics known as functional analysis, there are a variety of so-called projections. These are abstract generalizations of familiar Euclidean projections and, by construction/definition, share some of their properties. Three of these familiar projections are: (i) the projection of a 2D vector in the plane onto a line in the plane; (ii) the projection of a vector in 3-space ($R^3$) onto a plane ($R^2$); and (iii) the projection of a vector in $R^3$ onto a line ($R^1$), which could also be realized by first applying the second of the above projections and then applying the first. The key point is that a projection is a representation of some 'quantity' in a subspace of the original space (sometimes called a proper subspace).
This means that it is never a complete or total representation in that some information is necessarily lost; i.e., you cannot go backwards. If/when the projection is considered as an 'operation,' it is one-way: the inverse operation does not exist. Consider, for example, the two (orthogonal) projections of vectors from $R^2$ to $R^1$ shown in Figure A.3.1-1. The result of projecting either $u_1$ or $u_2$ to the line represented by the $x$-axis is $u$. While $u$ is the unique projection of both $u_1$ and $u_2$, there is clearly no unique inverse projection.
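The one-way nature of this projection is easy to see in matrix form. The following is a small sketch with arbitrarily chosen vectors (the numerical values of `u1` and `u2` are assumptions for illustration, not read off the figure): the orthogonal projection of $R^2$ onto the $x$-axis simply discards the $y$-component.

```python
import numpy as np

# Orthogonal projection of R^2 onto the x-axis: keep x, discard y.
P = np.array([[1.0, 0.0],
              [0.0, 0.0]])

u1 = np.array([2.0, 1.0])    # illustrative vectors sharing the same x-component
u2 = np.array([2.0, -3.0])

print(P @ u1, P @ u2)              # both land on the same vector u
print(np.linalg.matrix_rank(P))    # rank-deficient, hence no inverse
```

Because two different vectors share one image, no operator can map that image back to 'the' original vector; equivalently, the matrix has a nontrivial nullspace (here the $y$-axis) and is not invertible.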
Fig. A.3.1-1 Two simple sample projections.

If the projection is represented by the operator symbol, $\mathcal{P}$ (whether explicitly or, as is often the case, implicitly) in a vector space $W$, the subspace onto which the projection occurs is called the range of the projection and is denoted by $R(\mathcal{P})$. Associated with $\mathcal{P}$ is another subspace, $N(\mathcal{P})$, called the nullspace of the projection; any quantity in $N(\mathcal{P})$ gets projected to zero by $\mathcal{P}$. Similarly, associated with the projection operator, $\mathcal{P}$, is another projection operator, $\mathcal{Q}$, via $\mathcal{Q} = I - \mathcal{P}$. If $u \in R(\mathcal{P})$, then $\mathcal{P}u = u$ and $\mathcal{Q}u = 0$. Similarly, if $v \in N(\mathcal{P})$, then $\mathcal{Q}v = v$ and $\mathcal{P}v = 0$. Also, we use the symbol $\mathbf{x}$, as in $u(\mathbf{x})$, to represent the total dimensionality of the spaces involved; e.g., if we are in $R^3$, $\mathbf{x}$ means $x, y, z$.

Consider now the more general sketch shown in Figure A.3.1-2, and three vectors $u$, $v$, and $w$; we remark that these sketches in the plane are at best 'schematics' when considering higher-dimensional spaces. (Note that the intersection of $N$ and $R$ 'passes through'/defines the only vector that lives in both subspaces: the zero vector.)

Fig. A.3.1-2 General (non-orthogonal) projections.

The projection operator $\mathcal{P}$ is said to project 'down to' $R(\mathcal{P})$ in the direction of (parallel to) $N(\mathcal{P})$; analogously, $\mathcal{Q}$ projects 'down to' $N(\mathcal{P})$ in the direction of $R(\mathcal{P})$. Note that the notions of both distance (norm) and angle (inner product) are implied in the above sketch, as in Euclidean geometry and as in a Hilbert (function) space, and, hence, also the notion of orthogonality ($\perp$): if $N \perp R$, then the projection is said to be orthogonal and vice versa: if we have an orthogonal projection, then $N \perp R$. If orthogonality exists, then the distance between a given vector $u$ and any vector in $R(\mathcal{P})$ is a minimum for $\mathcal{P}u$, where distances are necessarily measured in the norm appropriate to the projection in question. With or without orthogonality, the decomposition $u = \mathcal{P}u + \mathcal{Q}u$ is unique.

Remarks: (1) Some of the projections discussed below are not of the 'usual' type in that the associated subspaces are not linear. (2) More general definitions of projections exist in which the vector space need not possess either an inner product or a norm. These are not of interest herein.

Denoting the inner product between two vectors (functions) by $(u, v)$ and the induced norm by $\|\cdot\|$ leads to the following 'algorithm' for constructing the above diagram: given $u$ and $v$, each as a point lying in the plane of the paper ($W$):
Step 0. Draw a horizontal line; call it $R(\mathcal{P})$, by definition, and place a point at the zero vector.
Step 1. Compute the magnitude of $u$, $\|u\| = \sqrt{(u, u)}$, and its projection onto the range, $\mathcal{P}u$, and onto the null space, $\mathcal{Q}u = u - \mathcal{P}u$.
Step 2. Compute the angle from the $R(\mathcal{P})$-axis to $u$ via $\cos\theta = (u, \mathcal{P}u)/(\|u\| \cdot \|\mathcal{P}u\|)$.
Step 3. Compute the angle between $R$ and $N$ via $\cos\psi = (\mathcal{P}u, \mathcal{Q}u)/(\|\mathcal{P}u\| \cdot \|\mathcal{Q}u\|)$. Draw the line $N(\mathcal{P})$.
Step 4. Plot $u$, $\mathcal{P}u$, and $\mathcal{Q}u$.
Step 5. Compute the magnitude of $v$, $\|v\|$. Plot $v$; $\mathcal{P}v$ and $\mathcal{Q}v$ can then be obtained (and plotted) either directly (via 'computation,' as for $u$) or indirectly (graphically).
Step 6. Compute the angle between $u$ and $v$ via $\cos\phi = (u, v)/(\|u\| \cdot \|v\|)$.

Remarks: (1) Note that $\|u\|^2 = \|\mathcal{P}u + \mathcal{Q}u\|^2 = \|\mathcal{P}u\|^2 + \|\mathcal{Q}u\|^2 + 2\|\mathcal{P}u\| \cdot \|\mathcal{Q}u\| \cdot \cos\psi$. Only if $(\mathcal{P}u, \mathcal{Q}u) = 0$ do we have orthogonality, and the concomitant satisfaction of the 'Pythagorean theorem.'
(2) Another property of the projection is that the angle $\psi$ is the same for all admissible functions (those in $W$), a requirement that is clearly (and most easily) satisfied for orthogonal projections; $\psi = 90°$ via $(\mathcal{P}u, \mathcal{Q}u) = 0$. (3) The dimension of $R$ (hence $N$) may be the same as that of $W$, or it may be less than that of $W$. (4) Only orthogonal projections are norm-reducing, $\|\mathcal{P}u\| \le \|u\|\ \forall u$. In the non-orthogonal projection depicted in the sketch, it is clear that $\|\mathcal{P}w\| > \|w\|$ for the $w$ shown.

And this brings us to the notion that there is usually a variational statement/interpretation of a projection: the projected quantity is the function in $R(\mathcal{P})$ that is 'as close as possible,' in some sense, to the original quantity; and this is, of course, related to the concept of orthogonality. In this context, $\mathcal{Q}u = u - \mathcal{P}u$ is often called the error associated with the projection. The projections that we consider do indeed also have some of their roots in the calculus of variations.

An essential property of a projection (indeed, part of its definition) is that a second projection changes nothing [you are already in the subspace associated with $\mathcal{P}$; i.e., in $R(\mathcal{P})$]: $\mathcal{P}^2 = \mathcal{P}$ ($\mathcal{P}$ is idempotent) is the symbolic statement of this fact/requirement. (Or, $\mathcal{P}^2 u = \mathcal{P}u$.) Note too that $\mathcal{P}^2 = \mathcal{P} \Rightarrow \mathcal{Q}^2 = \mathcal{Q}$ and $\mathcal{P}\mathcal{Q} = \mathcal{Q}\mathcal{P} = 0$.

The larger (original) space and the subspace may be $\infty$-dimensional or finite-dimensional (of course we cannot project from the latter to the former!). We will be primarily interested in the case in which the original space is $\infty$-dimensional and the subspace finite-dimensional, although projections from one finite-dimensional space to another (a subspace of it) will also occasionally be of interest. Projections may also be (loosely) categorized as continuous or discrete, the former sometimes involving differential operators and the latter usually involving matrices.

A simple example of a finite-dimensional function projection is the finite Fourier series representation (or any other truncated eigenfunction expansion) of a given function; namely, the amplitude coefficients: the coefficients of the expansion represent the projection of the given function onto the set of trigonometric (or other) functions. (In this case, the projection is an $L^2$-projection, as will soon become clear.) An example of an $\infty$-dimensional function projection is a partial Fourier series representation of a given function on the interval $[-1, 1]$ by the sine series, $\{\sin n\pi x,\ n = 1, 2, \ldots, \infty\}$. For example, since $e^x$ is not an odd function about the origin, the Fourier sine series can only approximate it; the sine sequence is (and spans) only a subspace of $L^2$ on $[-1, 1]$, even though it spans $L^2$ on $[0, 1]$.
The (orthogonal in this case) complement to the sine series, $\{\cos n\pi x,\ n = 0, 1, \ldots, \infty\}$, would be required, in addition to the sine series, in order to exactly represent $e^x$ on $[-1, 1]$; the combination of the two infinite sets of functions spans $L^2[-1, 1]$. In fact, the sine series representation/approximation of $e^x$ will describe exactly the odd part of it, $\sinh x$. And, equivalently, $\sinh x$ is the best least-squares fit to $e^x$ using the functions $\{\sin n\pi x\}$.

Remark: The simplest finite element projection occurs when the basis functions are interpolating (Lagrange polynomials); i.e., when $\phi_j(x_i) = \delta_{ij}$. It is the simple interpolation of a given function, $f(x)$, via the basis functions, $f^I(x) = \sum_{j=1}^{N} f(x_j)\phi_j(x)$, a projection that we shall call $\mathcal{P}_I$; i.e., $\mathcal{P}_I f(x) = f^I(x)$.

While a particular projection of a given function to a given subspace is a well-defined process that is independent of the source/origin of the given function, it will often be the case that the given function represents the solution to some BVP, in which case the projection turns out to be the approximate (GFEM) solution to the same BVP; indeed, this is why Galerkin's method is a projection method, and we shall demonstrate this projection connection. For example, the projection of a solution to Poisson's equation onto the finite element subspace (mesh, nodes, and basis functions) is one from infinite dimensions to finite dimensions. If the (weak) gradient of this projected function were then further projected to a (finite-dimensional, necessarily!) subspace of discretely divergence-free functions, we would be projecting from one finite-dimensional space to another. We shall demonstrate these projections in what follows.
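The sine-series example above can be checked numerically. The following is a minimal sketch only, assuming Gauss-Legendre quadrature for all integrals; the function names and the number of quadrature points are illustrative choices, not part of the text. It projects $e^x$ onto the first few functions $\sin n\pi x$ (which are orthonormal on $[-1, 1]$) and measures the $L^2$ distance of the partial sum to both $\sinh x$ and $e^x$.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Gauss-Legendre nodes and weights on [-1, 1]; 600 points is an arbitrary
# choice, ample for the oscillatory integrands used below.
XQ, WQ = leggauss(600)

def sine_coefficients(f, n_terms):
    """a_n = (f, sin(n pi x)) on [-1, 1]; these sines are orthonormal there."""
    n = np.arange(1, n_terms + 1)
    basis = np.sin(np.pi * np.outer(n, XQ))      # shape (n_terms, n_quad)
    return basis @ (WQ * f(XQ))

def sine_series(coeffs):
    """Evaluate the truncated sine series at the quadrature nodes."""
    n = np.arange(1, len(coeffs) + 1)
    return coeffs @ np.sin(np.pi * np.outer(n, XQ))

def l2_norm(values):
    """Discrete L2 norm on [-1, 1] using the quadrature weights."""
    return np.sqrt(np.sum(WQ * values**2))

for n_terms in (10, 50):
    s = sine_series(sine_coefficients(np.exp, n_terms))
    print(n_terms,
          "distance to sinh:", l2_norm(np.sinh(XQ) - s),
          "distance to exp:", l2_norm(np.exp(XQ) - s))
```

As terms are added, the distance to $\sinh x$ shrinks (slowly, because $\sinh(\pm 1) \neq 0$ while every sine vanishes there), whereas the distance to $e^x$ cannot fall below $\|\cosh x\|_0$, the norm of the even part that the sine subspace cannot represent.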
Another noteworthy (and general) property of projection operators is that their eigenvalues are either zero or one. [Proof: $\mathcal{P}x = \lambda x \Rightarrow \mathcal{P}^2 x = \mathcal{P}x = \lambda \mathcal{P}x \Rightarrow (1 - \lambda)\mathcal{P}x = 0 = (1 - \lambda)\lambda x$.]

In closing this introduction, we point out that there are only two basic types of projections that are of interest herein; one is called the $L^2$-projection (from the Lebesgue norm, $\|u\|_{L^p} = (\int |u|^p)^{1/p}$ for $p = 2$), wherein we shall convert to the common and simpler name, $\|u\|_0$, because $H^0$ is another name for $L^2$ (the function, but none of its derivatives, must be square integrable); i.e., $\|u\|_0 = \|u\|_{L^2}$. The other is the $H^1$-projection, in the (Hilbert?) norm $\|u\|_1 = (\int \nabla u \cdot \nabla u)^{1/2}$, where we neglect the '$L^2$-portion' of the conventional $H^1$-norm, $\|u\|^2 = \int (\nabla u \cdot \nabla u + u^2)$, because our semi-norm will actually qualify as a true norm (value zero if and only if $u = 0$) in virtually all cases of interest, because our associated BC will preclude $u$ = constant, which would give $\|u\|_1 = 0$ for $u = \text{constant} \neq 0$ and thus be illegal as a norm. Also, while not explicitly stated each time, associated with each $\infty$-dimensional projection is a spatial domain, $\Omega$ (in $R^{n_s}$; $n_s$ = 1, 2, or 3), with boundary $\Gamma$.

In the remainder of this appendix we shall attempt to remove the thus-far qualitative and therefore somewhat vague interpretations of projections, and replace/augment them with quantitative discussions including explicit (when possible) definitions of the projection operators for the two norms mentioned above and for two classes of problems: scalar and vector, with the former serving partly as a stepping stone to the latter, and with an important additional projection regarding the latter: the projection to a discretely divergence-free subspace. The terminology we shall employ utilizes the following (ten!)
projection definitions:

$\mathcal{P}_0$ : Infinite dimensional $L^2$($H^0$)-projection
$\mathcal{P}_0^h$ : Finite dimensional $L^2$($H^0$)-projection
$\mathcal{P}_1$ : Infinite dimensional $H^1$-projection
$\mathcal{P}_1^h$ : Finite dimensional $H^1$-projection
$\mathcal{P}_0^{J}$ : Infinite dimensional $L^2$-projection to the divergence-free subspace, $J$
$\mathcal{P}_0^{J^h}$ : Finite dimensional $L^2$-projection to the weakly divergence-free subspace, $J^h$
$P_0$ : Projection matrix; discrete $L^2$-projection to the discretely divergence-free subspace
$\mathcal{P}_1^{J}$ : Infinite dimensional $H^1$-projection to the divergence-free subspace, $J$
$\mathcal{P}_1^{J^h}$ : Finite dimensional $H^1$-projection to the weakly divergence-free subspace, $J^h$
$P_1$ : Projection matrix; discrete $H^1$-projection to the discretely divergence-free subspace

Thus, we have five variations on each of two themes; the first four are for scalar fields, and the remaining six are for vector fields. We shall define and describe them in the above order.

A.3.2 SCALAR PROJECTIONS

Here we are principally interested in the projections of scalar-valued functions to the finite-dimensional subspace spanned by the basis functions of the FEM, via both $L^2$- and $H^1$-norms, the former (and simpler) of which always delivers an orthogonal projection,
and the latter of which sometimes does. But we will open the discussion with the $\infty$-dimensional case, partly for completeness, in some sense.

A.3.2.1 The $L^2$-Projection, $\mathcal{P}_0$

Suppose we are given a function $u(x)$ in $L^2$ and are asked to find the closest function to it within some (given) linear space of functions that is a subset of $L^2$, say $S$, with $u \notin S$. Calling such a function $u_0(x)$, assuming such a function exists, leads to the following equivalent statements (choose the one that suits/pleases you):

$$\text{(i) Find } \inf_{v \in S} \|u - v\|_0 \qquad \text{(ii) } \|u - u_0\|_0 = \inf_{v \in S} \|u - v\|_0 \qquad \text{(iii) } \|u - u_0\|_0 \le \|u - v\|_0 \ \ \forall v \in S. \tag{A.3.2-1}$$

Thus, to find $u_0$, let $v \in S$, introduce the functional

$$F_0(v) = \tfrac{1}{2} \int (v - u)^2, \tag{A.3.2-2}$$

and try to minimize this functional by varying $v$ within $S$:

$$\delta F_0(v) = 0 = \int (v - u)\, \delta v. \tag{A.3.2-3}$$

(Note that $\delta^2 F_0(v) = \int \delta v\, \delta v > 0$, and thus the extremum is indeed a minimum; i.e., the greatest lower bound.) Now suppose that we have a basis for $S$; i.e., a set of linearly independent functions, $\{v_n,\ n = 1, 2, \ldots\}$, that spans the space and thus permits us to solve (A.3.2-3); any function in $S$ can be represented by a linear combination of the $\{v_n\}$. Thus, we can represent $v$ and $\delta v$ (an arbitrary variation of $v$) as follows: $v = \sum_{n=1}^{\infty} a_n v_n(x)$ and $\delta v = \sum_{m=1}^{\infty} b_m v_m$, from which (A.3.2-3) gives

$$\sum_{m=1}^{\infty} b_m \int \Big( \sum_{n=1}^{\infty} a_n v_n - u \Big) v_m = 0, \tag{A.3.2-4}$$

a result that can only hold for all $\delta v$ if each coefficient of $b_m$ vanishes (the values of $\{b_m\}$ are arbitrary):

$$\int \Big( \sum_{n=1}^{\infty} a_n v_n - u \Big) v_m = 0, \quad m = 1, 2, \ldots, \infty, \tag{A.3.2-5}$$

showing that the error $(u - v)$ is orthogonal (in $L^2$) to the basis (the projection of the error is zero), which actually defines Galerkin's method. Rearrangement gives

$$\sum_{n=1}^{\infty} a_n \int v_m v_n = \int u\, v_m, \quad m = 1, 2, \ldots, \infty, \tag{A.3.2-6}$$

an infinite set of linear equations for the amplitude coefficients, $\{a_n\}$. [Exercise for the reader: Show that the solution of (A.3.2-6), and therefore that of (A.3.2-3), minimizes (A.3.2-2).]
To make further progress, to simplify the notation, and to identify our first projection, we invoke the fact that any linearly independent set of functions can be ortho-normalized (via the Gram-Schmidt procedure, for example), and we suppose this to have been done to the $v$'s; i.e., we have $\int v_m v_n = \delta_{mn}$ so that (A.3.2-6) becomes, simply, $a_n = \int u\, v_n = (u, v_n)$, and our solution, $v = u_0$, is then just

$$u_0(x) = \sum_{n=1}^{\infty} (u, v_n)\, v_n(x) \equiv \mathcal{P}_0 u(x), \tag{A.3.2-7}$$

where $(u, v) = \int u v$ is the $L^2$-inner product: the closest function to $u(x)$ in $S$, $u_0(x)$, is the $L^2$-projection of the given function $u(x)$ onto the basis functions spanning $S$. [The amplitude coefficient, $(u, v_n)$, is the projection of $u(x)$ in the direction of (onto) $v_n(x)$.]

To prove that (A.3.2-7) defines (implicitly) a projection, we simply project again to see if $\mathcal{P}_0^2 = \mathcal{P}_0$. Thus,

$$\mathcal{P}_0 u_0(x) = \mathcal{P}_0^2 u(x) = \sum_{n=1}^{\infty} \Big( \sum_{m=1}^{\infty} (u, v_m) v_m(x),\ v_n(x) \Big) v_n(x) = \sum_{m,n=1}^{\infty} (u, v_m)(v_m, v_n)\, v_n(x) = \sum_{m,n=1}^{\infty} (u, v_m)\, \delta_{mn}\, v_n(x) = \sum_{n=1}^{\infty} (u, v_n)\, v_n(x) = u_0(x) = \mathcal{P}_0 u(x); \ \text{QED.} \tag{A.3.2-8}$$

Finally, to complete our introductory example, we test orthogonality to see if our $L^2$-projection is indeed 'closest': since $u_0 = \mathcal{P}_0 u$ and $\mathcal{Q}_0 = I - \mathcal{P}_0$, the 'remainder' of the projection is $\mathcal{Q}_0 u$; i.e., $u = \mathcal{P}_0 u + \mathcal{Q}_0 u$, and we now wish to see if the Pythagorean theorem is satisfied in the sense of vectors in a vector space: does $\|u\|_0^2 = \|\mathcal{P}_0 u\|_0^2 + \|\mathcal{Q}_0 u\|_0^2 = \|u_0\|_0^2 + \|u - u_0\|_0^2$? A direct calculation yields

$$\int u^2 = \int (\mathcal{P}_0 u + \mathcal{Q}_0 u)^2 = \int \big[ (\mathcal{P}_0 u)^2 + (\mathcal{Q}_0 u)^2 \big] + 2 \int \mathcal{P}_0 u\, \mathcal{Q}_0 u = \|\mathcal{P}_0 u\|_0^2 + \|\mathcal{Q}_0 u\|_0^2 + 2(\mathcal{P}_0 u, \mathcal{Q}_0 u) = \|\mathcal{P}_0 u\|_0^2 + \|\mathcal{Q}_0 u\|_0^2 + 2 \int \mathcal{P}_0 u\, (I - \mathcal{P}_0) u; \tag{A.3.2-9}$$

i.e.,

$$\|u\|_0^2 = \|\mathcal{P}_0 u\|_0^2 + \|\mathcal{Q}_0 u\|_0^2 + 2 \int \big[ u\, \mathcal{P}_0 u - (\mathcal{P}_0 u)^2 \big], \tag{A.3.2-10}$$

and we do have $\perp_0$ (read $\perp_0$ as 'orthogonality in $L^2$') because $\int u\, \mathcal{P}_0 u = (u, u_0) = (u_0, u) = \sum_n (u, v_n)(v_n, u) = \sum_n (u, v_n)^2$ and $\int (\mathcal{P}_0 u)^2 = (u_0, u_0) = \sum_{m,n} \big( (u, v_m) v_m, (u, v_n) v_n \big) = \sum_{m,n} (u, v_m)(u, v_n)(v_m, v_n) = \sum_n (u, v_n)^2$; i.e., we have shown that $\mathcal{P}_0 u \perp_0 \mathcal{Q}_0 u$, or, equivalently, $(\mathcal{P}_0 u, \mathcal{Q}_0 u) = 0$.
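The chain (A.3.2-7) through (A.3.2-10) can be exercised on a small finite-dimensional instance. The sketch below is illustrative only: it assumes, as one convenient orthonormal basis on $[-1, 1]$, the normalized Legendre polynomials $v_n = \sqrt{(2n+1)/2}\, P_n$ for $n = 0, \ldots, 4$, takes $u = e^x$, and evaluates every integral by Gauss-Legendre quadrature. The names `project` and `inner` are hypothetical helpers, not notation from the text.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legvander

XQ, WQ = leggauss(40)          # quadrature nodes and weights on [-1, 1]
DEGREE = 4                     # subspace S = polynomials of degree <= 4

# Columns hold v_n(x) = sqrt((2n+1)/2) P_n(x), n = 0..DEGREE (orthonormal in L2).
V = legvander(XQ, DEGREE) * np.sqrt((2 * np.arange(DEGREE + 1) + 1) / 2.0)

def inner(f, g):
    """L2 inner product (f, g) from samples at the quadrature nodes."""
    return np.sum(WQ * f * g)

def project(u):
    """P0 u = sum_n (u, v_n) v_n, sampled at the quadrature nodes."""
    a = V.T @ (WQ * u)         # amplitude coefficients a_n = (u, v_n)
    return V @ a

u = np.exp(XQ)
u0 = project(u)                # P0 u
q0 = u - u0                    # Q0 u, the 'error'

print("idempotence defect |P0(P0 u) - P0 u|:", np.max(np.abs(project(u0) - u0)))
print("(P0 u, Q0 u):", inner(u0, q0))
print("Pythagoras defect:", inner(u, u) - inner(u0, u0) - inner(q0, q0))
print("mass defect, integral of (u - u0):", np.sum(WQ * q0))
```

All four printed quantities should be at rounding level; the last one reflects the fact that the constant function belongs to this particular subspace.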
[Fig. A.3.2-1 An $L^2$-orthogonal projection: $u$ is resolved into $u_0 = \mathcal{P}_0 u$ in $R(\mathcal{P}_0)$ and $\mathcal{Q}_0 u$ in $N(\mathcal{P}_0)$, with lengths $\|u_0\|_0$ and $\|\mathcal{Q}_0 u\|_0$.]

The sketch in Figure A.3.2-1 thus applies ($N \perp R$).

Final remarks: (1) All of the above discussion still applies if the space $S$ is finite-dimensional; just change the upper limit on the sums from $\infty$ to $N$. (2) $u_0 = \mathcal{P}_0 u$ is the closest function to $u$ (in the $L^2$-norm) in the subspace $S \subset L^2$ spanned by $\{v_n\}$; a best fit in $L^2$. (3) Recalling (A.3.2-2) makes it clear that this $L^2$-projection is also a least-squares best fit to $u(x)$ via $\{v_n(x)\}$; the $L^2$ best fit is also the least-squares best fit. (4) The 'error,' $u - \mathcal{P}_0 u = \mathcal{Q}_0 u$, is $\perp_0$ ('$L^2$-orthogonal') to the subspace, $(u - u_0, v_n) = 0\ \forall n$, which is another way to say that $u_0$ is a best approximation. (5) For most cases of interest herein, the constant functions belong to $S$, from which it follows that the total 'mass' of $u$ (or, equivalently, its average value) is preserved by this projection: $\int u_0 = \int u$. (Take $v = 1 = \sum_m c_m v_m$ as the test function.)

A.3.2.2 The $L^2$-projection, $\mathcal{P}_0^h$

It is a simple matter to specialize the above (somewhat abstract) $L^2$-projection to a specific finite-dimensional version via the FEM basis functions, $\{\phi_i,\ i = 1, 2, \ldots, N\} \in S^h \subset L^2$: return to (A.3.2-3), replace $v(x)$ by $u_0^h(x) = \sum_{j=1}^{N} u_j \phi_j(x)$ and $\delta v$ by $\phi_i$, where $i$ varies from one to $N$, to obtain

$$\delta F_0 = 0 = \int \Big( \sum_{j} u_j \phi_j - u \Big) \phi_i, \tag{A.3.2-11}$$

or

$$\sum_{j=1}^{N} u_j \int \phi_j \phi_i = \int u\, \phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.2-12}$$
SCALAR PROJECTIONS 905 or, introducing an efficient matrix-vector terminology that will be more useful later, Mu = b, (A.3.2-13) where M = f<p<pT, or M-,j = f <j>i<j>j is the (SPD) N x N mass matrix, u = (u{ ... uN)T is the TV-vector of nodal coefficients (to be determined), and b = b(u) = J u<p, or bj(u) = f u(pj = (u, (pj), where <p(x) is an TV-vector of the basis functions, {0,}. Solving the above linear system yields the amplitude coefficients of the projection {u,}. The projection is completed via the basis function expansion: uh0{x) = (pT(x)u = (p(x)TM~lb(u) ee ph0u(x) = J2"j<l>j(x) = ]£[Af-lb(ii)];0y-Or). (A.3.2-14) Thus, we have, ostensibly, derived and described the L2-projection of the GFEM. All that remains is to prove that it is an orthogonal projection; and we begin by showing that ph0u0(x) = (ph0)2u(x) = <p(x)TM-lb[uh0(x)] r-l T ha-\\ <p' (i)M"'b[<p(x)' M~lb(u)] \T w-1 = <p' (x)M'1 / [<p(xy M-lb(u)]<p(x). (A.3.2-15) Noting that [<pT(x)M lb(u)] is a scalar leads to ph0u0(x) = <pT(x)M-1 f<p(x)[<p(x)TM'lb(u)] = <pT(x)M~l <P(x)<P (x) M~lb(u) = <pT(x)M-lb(u) = uhJx); (A.3.2-16) Pq is a projection operator. To prove that the projection is ±o, we need to show that D = JPqU(I — p^)u = (PqU, Qqu) = 0—and, for 'variety', we will do it slightly differently: we have p^u = (pT(x)M~lb(u) = bT(u)M~l<p(x), and thus D = / up^u - I PqUPqU = / ubT{u)M~[<p{x)- bT(u)M~l(p(x)(pT(x)M'lb(u) r-l = b' (u)M'1 / u<p-b' (u)M~ <p(x)<pJ (x) M~lb(u) r-l = b' (u)M'lb(u) - b1 (u)M~lMM~lb(u) = 0. QED. (A.3.2-17) Remarks: (1) Note that here we obtained an orthogonal projection with basis functions that are merely linearly independent but are not themselves orthogonal.
(2) If the mass is lumped in (A.3.2-13), a common trick in some FEM applications, the solution is particularly simple, $u_i = \int u\, \phi_i / \int \phi_i$, a basis function-weighted average value of $u(x)$ at node $i$; it is important to realize, however, that the result is no longer a projection. But it is at least 'close' to a projection. To see this, project again to obtain $\mathcal{P}_0^h u_0^h(x) = \boldsymbol{\phi}^T(x) M_L^{-1} M M_L^{-1} \mathbf{b}(u) = \boldsymbol{\phi}^T(x) M_L^{-1} \mathbf{b}(u) + O(h^p) = u_0^h + O(h^p)$, where $M_L$ is the lumped mass matrix, and we have used the 'sloppy' interpretation that says $M M_L^{-1} = I + O(h^p)$, where $h$ is element 'length' and $p > 0$ depends on mesh and basis functions. [In fact, $M M_L^{-1} \mathbf{u} = \mathbf{u} + O(h^p)$ for a sufficiently smooth $u$.] The lumped mass 'projection' is only as close to a true projection as $M_L$ is to $M$. (3) Lumping loses the 'best approximation' property as well, obviously. (4) If $u(x)$ is replaced/approximated by its interpolant into the basis set, $\{\phi_i\}$, in (A.3.2-11) through (A.3.2-13), then the projection degenerates to the interpolation projection introduced previously (Section A.3.1), which is indeed a projection; it is just not an $L^2$-projection (of $u$; $\mathcal{P}_I u$ is an $L^2$-projection of the interpolant of $u$, which is again the interpolant because it is already in the subspace). With finite elements, $L^2$-projections always involve the consistent mass matrix. (5) If the approximations in both (2) and (4) above are invoked and then the mass lumped also on the RHS, the simple interpolation projection, $\mathcal{P}_I u$, is again obtained. (6) In practice, it is often (usually, probably) the case that $u(x) \in H^1$, a subspace of $L^2$. (7) The simplest possible $L^2$-projection via FEM is the expansion of $u^h$ via piecewise-constant basis functions; in this case, $M$ is diagonal and $u_0^h(x)$ corresponds to element average values of $u(x)$.
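Remarks (1) and (2) lend themselves to a direct numerical check. The following is a minimal sketch under stated assumptions: piecewise-linear elements on a uniform 12-element mesh of $[0, 1]$, $u = e^x$ as the given function, row-sum lumping for $M_L$, and four-point Gauss quadrature per element for the load vector. The helper names (`mass_matrix`, `load_vector`) are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def mass_matrix(n_el, h):
    """Consistent mass matrix M_ij = integral of phi_i phi_j for linear elements."""
    M = np.zeros((n_el + 1, n_el + 1))
    Me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    for e in range(n_el):
        M[e:e + 2, e:e + 2] += Me
    return M

def load_vector(f, nodes, n_gauss=4):
    """b_i = integral of f phi_i, by Gauss quadrature on each element."""
    xi, w = leggauss(n_gauss)
    b = np.zeros(len(nodes))
    for e in range(len(nodes) - 1):
        xl, xr = nodes[e], nodes[e + 1]
        h = xr - xl
        x = 0.5 * (xl + xr) + 0.5 * h * xi       # quadrature points in element e
        fx = f(x)
        b[e] += 0.5 * h * np.sum(w * fx * (xr - x) / h)      # left hat function
        b[e + 1] += 0.5 * h * np.sum(w * fx * (x - xl) / h)  # right hat function
    return b

def u(x):
    return np.exp(x)

n_el = 12
nodes = np.linspace(0.0, 1.0, n_el + 1)
M = mass_matrix(n_el, nodes[1] - nodes[0])
m_lumped = M.sum(axis=1)                         # diagonal of M_L (row sums)

b = load_vector(u, nodes)
c_consistent = np.linalg.solve(M, b)             # M u = b, eq. (A.3.2-13)
c_lumped = b / m_lumped                          # lumped-mass 'projection'

def fe_function(c):
    """Piecewise-linear function with nodal coefficients c."""
    return lambda x: np.interp(x, nodes, c)

# Apply each operator a second time to its own output.
c_again = np.linalg.solve(M, load_vector(fe_function(c_consistent), nodes))
c_lumped_again = load_vector(fe_function(c_lumped), nodes) / m_lumped

print("consistent mass, second application changes coefficients by:",
      np.max(np.abs(c_again - c_consistent)))
print("lumped mass, second application changes coefficients by:",
      np.max(np.abs(c_lumped_again - c_lumped)))
```

With the consistent mass matrix the second application changes nothing beyond rounding, whereas the lumped version keeps smoothing the data, most visibly at the two end nodes; that is the sense in which it is only 'close' to a projection.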
We conclude this section by restating this $L^2$-projection in a different way:

$$u_0^h(x) = \boldsymbol{\phi}^T(x) M^{-1} \mathbf{b}(u) = \mathbf{b}^T(u) M^{-1} \boldsymbol{\phi}(x) = \mathbf{b}^T(u)\, \tilde{\boldsymbol{\phi}}(x) = \tilde{\boldsymbol{\phi}}^T(x)\, \mathbf{b}(u), \quad \text{where } \tilde{\boldsymbol{\phi}}(x) \equiv M^{-1} \boldsymbol{\phi}(x), \tag{A.3.2-18}$$

or

$$\tilde{\phi}_i(x) = \sum_j (M^{-1})_{ij}\, \phi_j(x), \quad i = 1, 2, \ldots, N, \tag{A.3.2-19}$$

are the so-called conjugate (or dual) basis functions; the nodal values of the $i$-th element of $\tilde{\boldsymbol{\phi}}(x)$, $\tilde{\phi}_i(x_k)$ for $k = 1, 2, \ldots, N$, are just the $i$-th row (or column) of $M^{-1}$ (none of which is zero in general). Each dual basis function, $\tilde{\phi}_i(x)$, is a linear combination of all of the FEM basis functions and is therefore a truly global basis function (i.e., a basis function with global support) that is $L^2$-orthogonal to the conventional FEM basis functions;

$$\int \tilde{\boldsymbol{\phi}}\, \boldsymbol{\phi}^T = I. \tag{A.3.2-20}$$

(The conventional basis functions are not an orthogonal set; hence, the non-diagonal mass matrix; the basis functions and the conjugate basis functions form a bi-orthogonal
set.) See Oden (1972) for a detailed discussion of these conjugate functions, including pictures; and further discussion is presented in Oden and Reddy (1976). Thus, recalling (A.3.2-14), the simplest 'projection-looking' representation of $u_0^h(x)$ is either

$$\text{(i)} \quad u_0^h(x) = \boldsymbol{\phi}^T(x) M^{-1} \mathbf{b}(u) = \boldsymbol{\phi}^T(x) M^{-1} (u, \boldsymbol{\phi}) = \boldsymbol{\phi}^T(x) (u, M^{-1} \boldsymbol{\phi}) = \boldsymbol{\phi}^T(x) (u, \tilde{\boldsymbol{\phi}}) = \sum_j (u, \tilde{\phi}_j)\, \phi_j(x) = \sum_j u_j \phi_j(x), \tag{A.3.2-21}$$

or

$$\text{(ii)} \quad u_0^h(x) = \tilde{\boldsymbol{\phi}}^T(x)\, \mathbf{b}(u) = \tilde{\boldsymbol{\phi}}^T (u, \boldsymbol{\phi}) = \sum_j (u, \phi_j)\, \tilde{\phi}_j(x), \tag{A.3.2-22}$$

where bi-orthogonality yields $(u_0^h, \tilde{\phi}_i) = (u, \tilde{\phi}_i)$ from (i) and $(u_0^h, \phi_i) = (u, \phi_i)$ from (ii); thus, the error $(u - \mathcal{P}_0^h u)$ is, appropriately, orthogonal to both sets of basis functions: $(u - u_0^h, \phi_i) = 0 = (u - u_0^h, \tilde{\phi}_i)\ \forall i$.

Although Oden (1972) has already plotted some linear conjugate basis functions for both 1D and 2D, we shall show a few more here, including quadratics. To use (A.3.2-18), we first multiply through by $M$ and then pick a mesh node, say $k$, to get

$$M \tilde{\boldsymbol{\phi}}(x_k) = \boldsymbol{\phi}(x_k) \tag{A.3.2-23}$$

or

$$\sum_{j=1}^{N} M_{ij}\, \tilde{\phi}_j(x_k) = \phi_i(x_k) = \delta_{ik}, \quad i = 1, 2, \ldots, N, \tag{A.3.2-24}$$

where $\delta_{ij}$ is the Kronecker delta. Letting $k$ range over all nodes gives the matrix equation

$$M \tilde{\Phi} = I, \tag{A.3.2-25}$$

where each matrix is $N \times N$. Each column in the (symmetric) $\tilde{\Phi}$-matrix is a vector of nodal values for the corresponding conjugate basis function, and we see that $\tilde{\Phi}$ is just $M^{-1}$, as stated previously. Equation (A.3.2-25) is, of course, solved (for $N$ not too large) by 'factoring' $M$ and performing back-substitution against $\mathbf{e}_k = (0, \ldots, 1, 0, \ldots)^T$ for each $k$, where the one is in the $k$-th position of the $N$-vector $\mathbf{e}_k$. Once the $N^2$ values of $\tilde{\Phi}_{ij}$ are available, we can return to (A.3.2-19) to obtain

$$\tilde{\phi}_i(x) = \sum_{j=1}^{N} \tilde{\Phi}_{ij}\, \phi_j(x), \tag{A.3.2-26}$$

which we plot for several values of $i$ in Figure A.3.2-2 for linears and in Figure A.3.2-3 for quads, each normalized to 1.0 for plotting convenience. The actual peak nodal amplitudes are $\tilde{\phi}_k(x_k)$ = 41.6, 22.3, 20.8, and 20.8 for $k$ = 1, 2, 6, 9 for linears, and $\tilde{\phi}_k(x_k)$ = 50.9, 12.7, 12.2, and 25.5 for quads.
[The asymptotic peak amplitude ($h \to 0$) for linears is $\sqrt{3}/h$ (Oden, 1972) in general, and 20.785 for this example, for internal nodes. It is twice that value for nodes 1 and $N$.] The piecewise parabolas in Figure A.3.2-3 were obtained element-by-element with 20 plotting increments per element, using the conventional quadratic basis functions for interpolation.
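The peak amplitudes quoted above for linear elements can be reproduced from (A.3.2-25) in a few lines. This is a sketch only; it assumes the 12 uniform linear elements span a domain of unit length (so that $h = 1/12$, consistent with the quoted $\sqrt{3}/h = 20.785$), and the function name is a hypothetical helper.

```python
import numpy as np

def linear_mass_matrix(n_el, length=1.0):
    """Consistent mass matrix for n_el uniform piecewise-linear elements."""
    h = length / n_el
    M = np.zeros((n_el + 1, n_el + 1))
    Me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    for e in range(n_el):
        M[e:e + 2, e:e + 2] += Me
    return M

n_el = 12
M = linear_mass_matrix(n_el)

# Solve M Phi = I: column k holds the nodal values of the k-th dual function.
Phi = np.linalg.solve(M, np.eye(n_el + 1))
peaks = np.diag(Phi)                   # peak nodal amplitude of each dual function

for k in (1, 2, 6, 9):                 # node numbering starts at 1, as in the text
    print("node", k, "peak amplitude", round(peaks[k - 1], 1))
print("sqrt(3)/h =", np.sqrt(3.0) * n_el)

# Bi-orthogonality (A.3.2-20) in matrix form: Phi M = I.
print("bi-orthogonality defect:", np.max(np.abs(Phi @ M - np.eye(n_el + 1))))
```

Forming the full inverse is affordable here only because $N = 13$; for larger meshes one would factor $M$ once and back-substitute for the few columns actually needed, as the text describes.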
[Fig. A.3.2-2 $M$-conjugate linear basis functions, $\tilde{\phi}_i(x)$, for nodes 1, 2, 6, and 9 in a mesh of 12 uniform elements.]

[Fig. A.3.2-3 Same as Figure A.3.2-2 except for six quadratic elements.]

Moving to 2D, Figures A.3.2-4 and A.3.2-5 show the bilinear basis functions at five 'different types' of nodes on a $13 \times 13 = 169$ node mesh, and their corresponding duals. They are plotted on an $85 \times 85$ node mesh via bilinear basis function interpolation to provide better clarity, and they are normalized to unit amplitude. In these plots and in those to follow, the height of the 'base' is 10% of the full range of the plotted function plus
the absolute value of the function's minimum value. [The true (asymptotic) amplitudes of the dual functions are $3/lh$ for internal nodes (4-patch), $6/lh$ for non-corner boundary nodes, and $12/lh$ for corner nodes, where $l \times h$ is the element dimension.]

[Fig. A.3.2-4 Basis functions and dual (in $L^2$) basis functions for bilinear elements (4-patches). Panels: (a) $\phi(x)$ for a typical internal node; (b) $\tilde{\phi}(x)$ corresponding to (a), maximum = 432; (c) $\phi(x)$ for one node in from boundary; (d) $\tilde{\phi}(x)$ corresponding to (c), maximum = 463; (e) $\phi(x)$ for one node in from corner; (f) $\tilde{\phi}(x)$ corresponding to (e), maximum = 496.]

Figure A.3.2-6 shows the biquadratic basis functions and their conjugates for the three types of internal nodes, on the same $13 \times 13$ mesh plotted (via biquadratic basis function interpolation)
on an $85 \times 85$ mesh.

[Fig. A.3.2-5 Same as Figure A.3.2-4, but for two boundary nodes. Panels: (a) $\phi(x)$ for a typical boundary node (2-patch); (b) $\tilde{\phi}(x)$ corresponding to (a), maximum = 864; (c) $\phi(x)$ for a corner node (1-patch); (d) $\tilde{\phi}(x)$ corresponding to (c), maximum = 1728.]

To complete the pictures, Figures A.3.2-7 and A.3.2-8 show the corresponding biquadratic functions for nodes at or near a boundary and nodes at or near a corner, respectively. We have no more to say except that the local minima for quads are generally not at nodes and that the biorthogonality property now seems (more-or-less) 'obvious' (!).

To conclude this section, we examine $u(x) = \delta(x - x_k)$, the Dirac delta function 'centered' at node $k$. (A.3.2-12) then gives

$$\sum_j u_j^{(k)} \int \phi_i \phi_j = \int \phi_i\, \delta(x - x_k) = \phi_i(x_k) = \delta_{ik}, \tag{A.3.2-27}$$

the Kronecker delta, which, upon comparison with (A.3.2-24), shows that $\tilde{\phi}_j(x_k) = u_j^{(k)}$; the dual basis functions are also the $L^2$-projections of the Dirac delta function.
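Both the 2D maxima quoted in the figure captions and the delta-function result (A.3.2-27) can be checked with a short computation. The sketch below rests on two assumptions that are not stated in the text: the $13 \times 13$ node mesh covers the unit square with uniform elements ($l = h = 1/12$), and the bilinear mass matrix is built as the Kronecker product of two 1D linear mass matrices, which holds for tensor-product bilinear elements on a rectangular grid.

```python
import numpy as np

def linear_mass_matrix(n_el, length=1.0):
    """Consistent 1D mass matrix for n_el uniform piecewise-linear elements."""
    h = length / n_el
    M = np.zeros((n_el + 1, n_el + 1))
    Me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    for e in range(n_el):
        M[e:e + 2, e:e + 2] += Me
    return M

n_el = 12
n_nodes = n_el + 1
M1 = linear_mass_matrix(n_el)
M2 = np.kron(M1, M1)          # 2D bilinear mass matrix; node (i, j) -> row i*n_nodes + j
Phi2 = np.linalg.inv(M2)      # nodal values of all 169 dual functions

# Peak amplitude of each dual function, arranged on the 13 x 13 node grid.
peaks = np.diag(Phi2).reshape(n_nodes, n_nodes)
print("typical internal node   :", round(peaks[6, 6]))
print("one node in from boundary:", round(peaks[1, 6]))
print("one node in from corner  :", round(peaks[1, 1]))
print("boundary node            :", round(peaks[0, 6]))
print("corner node              :", round(peaks[0, 0]))

# L2-projection of a Dirac delta centered at node k: the load vector is e_k,
# so the nodal coefficients are column k of the inverse mass matrix.
k = 6 * n_nodes + 6
e_k = np.zeros(n_nodes * n_nodes)
e_k[k] = 1.0
delta_coeffs = np.linalg.solve(M2, e_k)
print("delta projection vs dual function:", np.max(np.abs(delta_coeffs - Phi2[:, k])))
```

The five printed amplitudes can be compared with the maxima listed in the captions of Figures A.3.2-4 and A.3.2-5, and with the asymptotic values $3/lh$, $6/lh$, and $12/lh$.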
[Fig. A.3.2-6 Basis functions and their $L^2$-duals for quads. Panels: (a) $\phi(x)$ for a 4-patch internal node; (b) $\tilde{\phi}(x)$ corresponding to (a), maximum = 648; (c) $\phi(x)$ for a 2-patch internal node; (d) $\tilde{\phi}(x)$ corresponding to (c), maximum = 310; (e) $\phi(x)$ for a central node (1-patch); (f) $\tilde{\phi}(x)$ corresponding to (e), maximum = 148.]

A.3.2.3 The $H^1$-Projection, $\mathcal{P}_1$

Instead of the given function being merely square-integrable [i.e., $u(x) \in L^2$], we now consider a given function $u(x)$ that is smoother; its gradient is also in $L^2$. Thus, we now consider $u(x) \in H^1$, a smaller space than $L^2$ (a subspace of it, actually), so that
both $\int u^2 < \infty$ and $\int \nabla u \cdot \nabla u = \|u\|_1^2 = \|\nabla u\|_0^2 < \infty$. (We shall adopt the common-but-sloppy terminology of the $H^1$-norm, a semi-norm actually, by precluding constant functions from our $H^1$-subspace, except in special cases; see below.) This additional a priori smoothness allows one to seek a 'nearby' function in a space of functions that is a subspace of $H^1$ such that its gradient is as close as possible, in some sense, to the gradient of $u(x)$.

[Fig. A.3.2-7 Same as Figure A.3.2-6, but for nodes at or near a boundary (2-patches). Panels: (a) $\phi(x)$ for a boundary node; (b) $\tilde{\phi}(x)$ corresponding to (a), maximum = 1296; (c) $\phi(x)$ 1 node in from boundary; (d) $\tilde{\phi}(x)$ corresponding to (c), maximum = 342.]

Rather than considering 'general' cases beyond our interest, we shall ab initio restrict our attention to subspaces of functions in $\Omega$ that take on a specified (boundary) value on a portion of $\partial\Omega = \Gamma$, say $\Gamma_D$; and this value will be $u(x)|_{\Gamma_D} = u_D \neq 0$ in general, the value of the given function evaluated on $\Gamma_D$ (called 'the trace' of the function in the functional analysis literature). We do this because these projections and the related projection methods are related to elliptic BVP's, in which at least a portion of the BC is of the Dirichlet type, usually. Next, we introduce the subspace of admissible functions, a constrained and non-linear subspace of $H^1$ that, by fiat, does not contain $u$:

$$V_E = w + V_0, \tag{A.3.2-28}$$
where $V_0$ is a linear vector space of functions that vanish on $\Gamma_D$, and $w$ is any $H^1$-function (other than $u$!) that takes on the value $u_D$ on $\Gamma_D$, the essential 'boundary condition.' $w$ is called an $H^1$-extension of $u_D$ into $\Omega$. The space $V_E$ is constrained because every function therein must agree with $u_D$ on $\Gamma_D$, and it is non-linear because the sum of two functions in the subspace, on $\Gamma_D$, is $2u_D$ and is thus no longer in the subspace. If $v \in V_E$, then $v - w \in V_0$; the difference between any admissible function and the chosen $H^1$-extension of $u_D$ does lie in a linear vector space. Note that the 'large' (possibly $\infty$-dimensional) subspace $V_E$ is constructed by adding a single function to another large subspace. Note too that changing $w$ to some other $H^1$-extension of $u_D$ changes $V_E$.

[Fig. A.3.2-8 Same as Figure A.3.2-6, but for nodes at or near a corner (1-patch). Panels: (a) $\phi(x)$ for a corner node; (b) $\tilde{\phi}(x)$ corresponding to (a), maximum = 2592; (c) $\phi(x)$ for 1 node in from corner; (d) $\tilde{\phi}(x)$ corresponding to (c), maximum = 162.]
In this $H^1$-case, we call the objective function $u_1(x)$ [rather than $u_0(x)$ for the $H^0$-norm] and assert that the following are equivalent statements of the appropriate variational problem, and will lead us to the $H^1$-projection:

$$\text{(i) Find } \inf_{v \in V_E} \|u - v\|_1 \qquad \text{(ii) } \|u - u_1\|_1 = \inf_{v \in V_E} \|u - v\|_1 \qquad \text{(iii) } \|u - u_1\|_1 \le \|u - v\|_1 \ \ \forall v \in V_E, \tag{A.3.2-29}$$

where $V_E \subset H^1$ such that if $v \in V_E$, then $v \in H^1$ and $v = u_D$ on $\Gamma_D$; it satisfies the Essential BC. To find $u_1(x)$, let us introduce the functional [cf. (A.3.2-2)]

$$F_1(v) = \tfrac{1}{2} \int \nabla(v - u) \cdot \nabla(v - u) \tag{A.3.2-30}$$

and seek a minimum by varying $v \in V_E$:

$$\delta F_1(v) = 0 = \int \nabla(v - u) \cdot \nabla \delta v. \tag{A.3.2-31}$$

Remarks: (1) It is important to remember that this is a constrained variational problem, the constraint being that the admissible class of functions over which the minimization is sought is constrained by the requirement that each must take on a particular value on $\Gamma_D$ [that of $u(x)$ evaluated there]; and this fact will be seen to affect ('mess up' a bit, actually) our concept of orthogonality. (2) $\delta^2 F_1(v) = \int \nabla \delta v \cdot \nabla \delta v > 0$, and therefore our stationary point is, as desired, a minimum.

To make further progress, we proceed (somewhat) as in the $L^2$-case: we first imagine that we have a basis (orthonormal in $H^1$, for convenience) for $V_0$, say $\{v_n,\ n = 1, 2, \ldots\}$. Thus, we can perform the following expansions for $v \in V_E$ and $\delta v \in V_0$: (i) $v(x) = w(x) + \sum_{n=1}^{\infty} a_n v_n(x)$ with $\{a_n\}$ to be determined, and (ii) $\delta v(x) = \sum_{m=1}^{\infty} b_m v_m(x)$ with $\{b_m\}$ arbitrary. Proceeding again as in the $L^2$-case easily leads to, from (A.3.2-31),

$$\sum_{m=1}^{\infty} b_m \int \nabla \Big( \sum_{n=1}^{\infty} a_n v_n + w - u \Big) \cdot \nabla v_m = 0$$

and then to

$$\sum_{n=1}^{\infty} a_n \int \nabla v_n \cdot \nabla v_m = \int \nabla(u - w) \cdot \nabla v_m, \quad m = 1, 2, \ldots, \infty, \tag{A.3.2-32}$$

an $\infty$-dimensional linear system for the $a_n$'s. Introducing the $H^1$-inner product

$$[u, v] \equiv \int \nabla u \cdot \nabla v \tag{A.3.2-33}$$
and utilizing the assumed ($H^1$) orthogonality of the basis, which we shall refer to as $\perp_1$, gives $[v_n, v_m] = \delta_{mn}$ and thus $a_n = [u - w, v_n]$, and the final solution to (A.3.2-31) is

$$u_1(x) = v(x) = \sum_{n=1}^{\infty} [u - w, v_n]\, v_n(x) + w(x) \equiv \mathcal{P}_1^w u(x), \tag{A.3.2-34}$$

and we have (we assert now and prove below) our $H^1$-projection, with $[u_1, v_n] = [u, v_n]$, or $[u - u_1, v_n] = 0\ \forall n$; the error is $\perp_1$ ($H^1$-orthogonal) to the basis, à la Galerkin. We have also found the associated (implicit) projection operator ($\mathcal{P}_1^w$), for which we note the obvious interesting aspect: the Dirichlet BC is first 'subtracted off' and then 'added back,' a feature that is an essential part of the projection but which will be seen to cause $\mathcal{P}_1^w$ to be 'suboptimal' in that it is not an $H^1$-orthogonal projection, even though the error is $\perp_1$ to the basis. Different $w$'s give different $u_1$'s, for the same $u$; hence, the superscript $w$. [See also p. 70 of Strang and Fix (1973).]

We now address the two questions: (i) is it really a projection?; and (ii) is it $\perp_1$? The first question is answered, as usual, by applying the (alleged) projection a second time: from (A.3.2-34), it follows that

$$\mathcal{P}_1^w u_1(x) = (\mathcal{P}_1^w)^2 u(x) = \sum_{n=1}^{\infty} [u_1 - w, v_n]\, v_n(x) + w(x) = \sum_{n=1}^{\infty} \Big[ \sum_{m=1}^{\infty} [u - w, v_m] v_m(x),\ v_n(x) \Big] v_n(x) + w(x) = \sum_{m,n=1}^{\infty} [u - w, v_m][v_m, v_n]\, v_n(x) + w(x) = \sum_{m,n} [u - w, v_m]\, \delta_{mn}\, v_n(x) + w(x) = \sum_n [u - w, v_n]\, v_n(x) + w(x) = u_1(x), \tag{A.3.2-35}$$

and we do have a projection: $(\mathcal{P}_1^w)^2 = \mathcal{P}_1^w$. The second question is also answered by construction, testing ($H^1$) orthogonality à la Pythagoras: letting $\mathcal{Q}_1^w u = (I - \mathcal{P}_1^w) u$ leads (as usual) to $u = \mathcal{P}_1^w u + \mathcal{Q}_1^w u$, and $\perp_1$ would mean that $\|u\|_1^2 = \|\mathcal{P}_1^w u\|_1^2 + \|\mathcal{Q}_1^w u\|_1^2$. We obtain, however, the following:

$$\|u\|_1^2 = \|\mathcal{P}_1^w u + \mathcal{Q}_1^w u\|_1^2 = \int (\nabla \mathcal{P}_1^w u + \nabla \mathcal{Q}_1^w u) \cdot (\nabla \mathcal{P}_1^w u + \nabla \mathcal{Q}_1^w u) = \int \nabla \mathcal{P}_1^w u \cdot \nabla \mathcal{P}_1^w u + \int \nabla \mathcal{Q}_1^w u \cdot \nabla \mathcal{Q}_1^w u + 2 \int \nabla \mathcal{P}_1^w u \cdot \nabla \mathcal{Q}_1^w u = \|\mathcal{P}_1^w u\|_1^2 + \|\mathcal{Q}_1^w u\|_1^2 + 2 \int \nabla \mathcal{P}_1^w u \cdot \nabla \mathcal{Q}_1^w u, \tag{A.3.2-36}$$
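from which $\perp_1 \Rightarrow$ the necessary condition, $\int \nabla \mathcal{P}_1^w u \cdot \nabla(u - \mathcal{P}_1^w u) = 0$, or $\int \nabla u \cdot \nabla \mathcal{P}_1^w u = \int \nabla \mathcal{P}_1^w u \cdot \nabla \mathcal{P}_1^w u$, by setting $[\mathcal{P}_1^w u, \mathcal{Q}_1^w u] = 0$. To test this, we simply evaluate both sides:

$$\text{LHS} = \int \nabla u \cdot \nabla \Big( \sum_{n=1}^{\infty} [u - w, v_n]\, v_n(x) + w(x) \Big) = \sum_{n=1}^{\infty} [u - w, v_n] \int \nabla u \cdot \nabla v_n + \int \nabla u \cdot \nabla w = \sum_{n=1}^{\infty} [u - w, v_n][u, v_n] + [u, w]. \tag{A.3.2-37}$$

Next,

$$\text{RHS} = \int \nabla \Big( \sum_{n=1}^{\infty} [u - w, v_n]\, v_n(x) + w(x) \Big) \cdot \nabla \Big( \sum_{m=1}^{\infty} [u - w, v_m]\, v_m(x) + w(x) \Big) = \sum_{m,n=1}^{\infty} [u - w, v_n][u - w, v_m][v_n, v_m] + 2 \int \nabla w \cdot \sum_{n=1}^{\infty} [u - w, v_n] \nabla v_n + \|w\|_1^2 = \sum_{n=1}^{\infty} [u - w, v_n]^2 + 2 \sum_{n=1}^{\infty} [u - w, v_n][w, v_n] + \|w\|_1^2. \tag{A.3.2-38}$$

Then, finally,

$$\text{LHS} - \text{RHS} = \Big[ u - w - \sum_{n=1}^{\infty} [u - w, v_n]\, v_n,\ w \Big] = [u - w - (u_1 - w), w] = [u - u_1, w] = [u - \mathcal{P}_1^w u, w] = [\mathcal{Q}_1^w u, w] \tag{A.3.2-39}$$

to give

$$[\mathcal{P}_1^w u, \mathcal{Q}_1^w u] = \int \nabla \mathcal{P}_1^w u \cdot \nabla \mathcal{Q}_1^w u = [\mathcal{Q}_1^w u, w], \tag{A.3.2-40}$$

and we are left with

$$\|u\|_1^2 = \|\mathcal{P}_1^w u\|_1^2 + \|\mathcal{Q}_1^w u\|_1^2 + 2[\mathcal{Q}_1^w u, w] = \|u_1\|_1^2 + \|u - u_1\|_1^2 + 2[u - u_1, w]. \tag{A.3.2-41}$$

We have $\perp_1$ in general only if $w = 0$; i.e., if $u = 0$ on $\Gamma_D$. More precisely, we also have $\perp_1$ if $\mathcal{Q}_1^w u \perp_1 w$ or if $\mathcal{Q}_1^w u = 0$ or if $\nabla \mathcal{Q}_1^w u = 0$ or if $\nabla w = 0$, none of which warrants very serious consideration, except perhaps the last one ($u$ = constant on $\Gamma_D$) and one more: $\Gamma_D = \emptyset$. In the general case, however, $u \neq 0$ on $\Gamma_D$ (thus, $w \neq 0$), and we do not obtain a best approximation in $H^1$ even though our projection is a solution to the variational problem [(A.3.2-31)]; i.e., unless $u = 0$ on $\Gamma_D$, the solution $u_1(x)$ is not the closest function to $u(x)$; nor is it unique, varying as it does if we change $w$, even though the error is $H^1$-orthogonal to the basis. [The error is not $\perp_1$ to $w(x)$.]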
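Equations (A.3.2-34), (A.3.2-40), and (A.3.2-41) can be illustrated with a small 1D experiment. The setup below is entirely an assumed example: $\Omega = [0, 1]$ with $\Gamma_D$ both end points, $u = e^x$, a two-function subspace of $V_0$ spanned by $v_n = \sqrt{2} \sin(n\pi x)/(n\pi)$ (which is $H^1$-orthonormal), and two different $H^1$-extensions $w$ of the boundary data. Only derivatives are needed, so the code works with samples of $u'$, $w'$, and $v_n'$ at Gauss-Legendre points.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

xi, wi = leggauss(200)
X = 0.5 * (xi + 1.0)          # quadrature nodes mapped to [0, 1]
W = 0.5 * wi                  # matching weights

def h1_inner(df, dg):
    """[f, g] = integral of f' g' over [0, 1], from derivative samples."""
    return np.sum(W * df * dg)

N_MODES = 2
n = np.arange(1, N_MODES + 1)
# v_n = sqrt(2) sin(n pi x)/(n pi): zero at x = 0 and 1, with [v_n, v_m] = delta_mn.
DV = np.sqrt(2.0) * np.cos(np.pi * np.outer(n, X))    # rows hold v_n'(x)

DU = np.exp(X)                # u = e^x, so u_D = 1 at x = 0 and u_D = e at x = 1
E = np.e - 1.0

def projected_gradient(dw):
    """Gradient of P_1^w u = w + sum_n [u - w, v_n] v_n, eq. (A.3.2-34)."""
    a = DV @ (W * (DU - dw))  # a_n = [u - w, v_n]
    return a @ DV + dw

# Two extensions of the boundary data: w = 1 + (e-1) x and w = 1 + (e-1) x^2.
extensions = {"linear": E * np.ones_like(X), "quadratic": 2.0 * E * X}

results = {}
for name, dw in extensions.items():
    dp = projected_gradient(dw)        # gradient of P_1^w u
    dq = DU - dp                       # gradient of Q_1^w u
    results[name] = (dp, dq, dw)
    print(name, "extension")
    print("  max |[Q u, v_n]| :", np.max(np.abs(DV @ (W * dq))))
    print("  [P u, Q u]       :", h1_inner(dp, dq))
    print("  [Q u, w]         :", h1_inner(dq, dw))
```

In both cases the error is $\perp_1$ to the basis, and $[\mathcal{P}_1^w u, \mathcal{Q}_1^w u]$ agrees with $[\mathcal{Q}_1^w u, w]$. For the quadratic extension that common value is nonzero, so Pythagoras fails; for the linear extension it vanishes, an instance of the special case $\mathcal{Q}_1^w u \perp_1 w$ (in 1D, a function with constant gradient is $H^1$-orthogonal to everything that vanishes at both ends). The two extensions also produce different projections $u_1$.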
Why is this? What is the closest function to $u(x)$? Is there a different projection (still in $H^1$) that is orthogonal? We will answer the second of these good questions second, after answering the last first: yes (see below). The closest function to $u(x)$ would be that which minimizes the functional in (A.3.2-30) without considering the constraint; $v(x)$ would be allowed to vary on $\Gamma_D$ just as it is on $\Gamma_N = \Gamma - \Gamma_D$. [Actually, however, this case is not quite well-posed in that then $F_1(v + c) = F_1(v)$: any constant could be added to $v$ without affecting the value of the functional. A 'standard' way around this non-uniqueness issue that is, in fact, analogous to the non-uniqueness associated with the classical Neumann BVP is to subtract the average value (over $\Omega$) of each function in the subspace from itself before 'using' it, a procedure that makes each resulting function $L^2$-orthogonal to all constants.] Thus, if $\Gamma_D = \emptyset$, the unconstrained projection will both truly minimize $F_1(u_1)$ and be $\perp_1$.

Noteworthy is the fact that it is actually the constraint of an inhomogeneous Dirichlet BC that is the cause of loss of $\perp_1$; if $u(x)$ were zero on $\Gamma_D$, so too would be $w(x)$, and then the projection, $u_1(x)$, would be closest in $H^1$ even in the presence of the constraint $u = 0$ on $\Gamma_D$, a result that is obviously related to the fact that all of our 'test functions' also vanish on $\Gamma_D$. And this leads to the $\perp_1$-projection alluded to above that was, in fact, already hinted at [after (A.3.2-34)]: subtract off $w(x)$ before doing anything else.
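Thus, we now consider the modified problem: let $\hat{u} = u - w$ (giving $\hat{u} = 0$ on $\Gamma_D$) and seek $\hat{v}(x) \in V_0$ from

$$\text{(i) Find } \inf_{v \in V_0} \|\hat{u} - v\|_1, \text{ or} \qquad \text{(ii) } \|\hat{u} - \hat{v}\|_1 = \inf_{v \in V_0} \|\hat{u} - v\|_1, \text{ or} \qquad \text{(iii) } \|\hat{u} - \hat{v}\|_1 \le \|\hat{u} - v\|_1 \ \ \forall v \in V_0, \tag{A.3.2-42}$$

all of which are solved by finding the stationary (and minimum) point of

$$\hat{F}_1(v) = \tfrac{1}{2} \int \nabla(v - \hat{u}) \cdot \nabla(v - \hat{u}) \tag{A.3.2-43}$$

over all $v \in V_0$ via $\delta \hat{F}_1 = 0$, or

$$\int \nabla(v - \hat{u}) \cdot \nabla \delta v = 0; \tag{A.3.2-44}$$

i.e., we seek the closest function to $u(x) - w(x)$ rather than the closest to $u(x)$. Proceeding as before leads to

$$v(x) = \hat{v}(x) = \sum_{n=1}^{\infty} [\hat{u}, v_n]\, v_n(x) \equiv \hat{\mathcal{P}}_1 \hat{u}(x) = u_1 - w = \mathcal{P}_1^w u - w, \tag{A.3.2-45}$$

and we have our 'different' projection. That $\hat{\mathcal{P}}_1^2 = \hat{\mathcal{P}}_1$ follows easily, and the next step, as usual, is to test for $\perp_1$, which is true if $\int \nabla \hat{u} \cdot \nabla \hat{\mathcal{P}}_1 \hat{u} = \int \nabla \hat{\mathcal{P}}_1 \hat{u} \cdot \nabla \hat{\mathcal{P}}_1 \hat{u}$, which in this case yields:

$$\text{LHS} = \int \nabla \hat{u} \cdot \nabla \Big( \sum_n [\hat{u}, v_n]\, v_n(x) \Big) = \sum_n [\hat{u}, v_n] \int \nabla \hat{u} \cdot \nabla v_n = \sum_n [\hat{u}, v_n]^2$$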
and

$$\text{RHS} = \int \nabla \Big( \sum_m [\hat{u}, v_m]\, v_m \Big) \cdot \nabla \Big( \sum_n [\hat{u}, v_n]\, v_n \Big) = \sum_{m,n} [\hat{u}, v_m][\hat{u}, v_n] \int \nabla v_m \cdot \nabla v_n = \sum_{m,n} [\hat{u}, v_m][\hat{u}, v_n]\, \delta_{mn} = \sum_n [\hat{u}, v_n]^2 = \text{LHS}, \tag{A.3.2-46}$$

and we do have $\perp_1$: $\hat{v}(x)$ is an orthogonal $H^1$-projection of $\hat{u}(x)$ in a Hilbert space; it minimizes the quadratic functional and is the closest function (in $H^1$) to $\hat{u}(x)$. We shall even emphasize this fact by changing the name of $\hat{\mathcal{P}}_1$ to $\mathcal{P}_1^{\perp}$, and note that the gradient of $\hat{v}(x)$ is the closest (in $L^2$) to the gradient of $\hat{u}(x)$.

Now, since $\hat{v}(x)$ minimizes $\int \nabla(\hat{v} + w - u) \cdot \nabla(\hat{v} + w - u)$ and is the closest function to $(u - w)$, so it must follow, since $\hat{v} = u_1 - w$, that $(u_1 - w)$ minimizes the same functional and is the closest function to $u - w$, giving $u_1 - w \perp_1 u_1 - u$; i.e., we have

$$u_1(x) = \mathcal{P}_1^w u(x) = \hat{v}(x) + w(x) = \mathcal{P}_1^{\perp} \hat{u}(x) + w(x) = \mathcal{P}_1^{\perp}[u(x) - w(x)] + w(x), \tag{A.3.2-47}$$

and we see that our original $H^1$-projection, thanks to inhomogeneous Dirichlet BC's, is actually probably better described as an affine (but idempotent) transformation since, as noted earlier, different choices for $w(x)$ give different values of $u_1(x)$. But we shall be content to stay with the terminology of a non-orthogonal projection.

We can summarize our results in the following two sketches, first for the original (Figure A.3.2-9) and non-$\perp_1$-projection, and then for the $\perp_1$-projection (Figure A.3.2-10), wherein we note the following:

1. The zero function is not in the range of $\mathcal{P}_1^w$ (unless $w = 0$) and thus the domain of $\mathcal{P}_1^w$ has no null space; $\mathcal{P}_1^w$ is not a projection in a linear vector space.
2. $\mathcal{P}_1^w u$ is the closest function to $u$ in $R(\mathcal{P}_1^w)$: $\|u - u_1\|_1$ is a minimum and $u - u_1 \perp_1 u_1 - w$.
3. Changing $w$ changes $\mathcal{P}_1^w$ and $R(\mathcal{P}_1^w)$, and thus $u_1$.

[Fig. A.3.2-9 A non-orthogonal $H^1$-projection.]
Fig. A.3.2-10 An $H^1$-orthogonal projection. [Sketch labels: $u - u_1$; $N(P_1^\perp)$; $(I - P_1^\perp)(u - w)$; $u - w$; $u_1 - w = P_1^\perp(u - w)$.]

4. $P_1^w u = u_1$ is not $\perp_1$ to $Q_1^w u = u - u_1$, and $P_1^w$ cannot operate on $Q_1^w u$; $P_1^w Q_1^w$ is undefined. But $P_1^\perp Q_1^w = 0$; i.e., $P_1^\perp (I - P_1^w) u = P_1^\perp [u - w - P_1^\perp(u - w)] = 0$.

Comparing the two figures shows that the non-$\perp_1$-projection of $u(x)$ to $u_1(x)$ is equivalent to the $\perp_1$-projection of $u(x) - w(x)$ to $u_1(x) - w(x)$. Rather than $u_1(x)$ being the closest possible function to $u(x)$, we have that $u_1(x) - w(x) = \hat v(x)$ is the closest possible function to $u(x) - w(x) = \hat u(x)$. Note too that the 'error,' $\|u - u_1\|_1$, is the same size in both depictions, and that it changes when $w(x)$ does; this is what really matters. Clearly, there is room to at least seek a best $w$, by trying to minimize $\|u - u_1(w)\|_1$, but we will drop it here (after mentioning that $w = u$ is clearly a minimizer, but not a very interesting one).

Finally, if $u(x) = 0$ on $\Gamma_D$, we have $w = 0$ and $P_1^w = P_1^0 = P_1^\perp$; this is the 'clean' case, with a simple $H^1$-orthogonal projection, similar to the unconstrained projection that results when $\Gamma_D = \emptyset$. But the $w = 0$ constrained case is different, as are the solutions, from the unconstrained ($\Gamma_D = \emptyset$) case, the latter generating a smaller $F_1(v)$ than the former; constrained minima are never as small as the unconstrained case, which, in some sense, has the largest 'grab bag' of admissible functions. This point will be further clarified when we consider the FEM version in Section A.3.2.4.

So much for orthogonality, for now, except to mention that a final pair of comparison sketches that are intended to be self-explanatory, shown in Figure A.3.2-11, may be useful; in them $V_w$ is the subset of all $H^1$-extensions of $u_D$, and $R(P_1^w)$ is that subset of $V_w$ generated by $P_1^w$.

We said earlier that $H^1$-projections can be associated with a BVP involving the Laplacian operator, at least if $u(x)$ and $v(x)$ are sufficiently smooth, an assertion that we now demonstrate.
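Returning to (A.3.2-31), we perform an integration by parts to give

$$0 = \int_\Omega \big\{ \nabla \cdot [\delta v\, \nabla(v - u)] - \delta v\, \nabla^2 (v - u) \big\} = \oint_\Gamma \delta v\, \mathbf{n} \cdot \nabla(v - u) - \int_\Omega \delta v\, \nabla^2 (v - u). \tag{A.3.2-48}$$

But $\delta v = 0$ on $\Gamma_D$ because $v = u$ there and, using the fact that $\delta v$ is an arbitrary variation in $\Omega$ and on $\Gamma_N$, leads to the Euler-Lagrange equations associated with the minimization of the given functional:

$$\nabla^2 (v - u) = 0 \quad \text{in } \Omega, \tag{A.3.2-49}$$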
$$\frac{\partial}{\partial n}(v - u) = 0 \quad \text{on } \Gamma_N, \tag{A.3.2-50}$$

and, of course,

$$v - u = 0 \quad \text{on } \Gamma_D, \tag{A.3.2-51}$$

which is an elliptic BVP for $v$ since $u$ is given, with the apparent obvious solution, $v = u$ in $\Omega$! What sort of projection is this? It is, of course, the ultimate/perfect 'projection' and represents a limiting case (the subspace is the same as the original space). We shall return to notions related to this when we discuss the projection method.

Fig. A.3.2-11 Interpreting the two $H^1$-projections [N.B. Changing $w$ (to $w'$) changes $P_1^w$ and ...]. [Sketch labels: $u - w$; $u_1 = P_1^w u = P_1^\perp(u - w) + w$; $N(P_1^\perp)$; $V_0$ (a linear vector space); $P_1^\perp(u - w)$; $P_1^\perp Q_1^w u = 0$; $R(P_1^w)$; $V_w$; $R(P_1^\perp)$.]

A.3.2.4 The $H^1$-projection, $P_1^h$

As for the $L^2$-case, it is now reasonably easy to specialize the $H^1$-projection to the finite-dimensional version via the ($C^0$ or better) FEM basis functions, $\{\phi_i,\ i = 1, 2, \ldots, N\}$, where $N$ is the total number of nodes in $\Omega$ and on $\Gamma_N$, which functions are required to be in the space $V^h \subset V_0$; i.e., $\phi_i = 0$ on $\Gamma_D$. Also, as usual, we shall implement the $H^1$-extension of $u$ into $\Omega$ from $\Gamma_D$ via the basis function interpolant of $u$ on $\Gamma_D$; i.e., $w$
is represented/replaced by

$$u^I(x) = \sum_{j=N+1}^{N_T} u(x_j \in \Gamma_D)\, \phi_j(x), \tag{A.3.2-52}$$

which describes $u^I$ as an interpolant of $u(x)$ (another projection!) on $\Gamma_D$, described by the rest of the nodes ($j = N+1, \ldots, N_T$ on $\Gamma_D$) and is the required $H^1$-extension of $u(x \in \Gamma_D)$ into $\Omega$. Then,

$$v(x) = u_1^h(x) = P_1^h u(x) = u^I(x) + \sum_{j=1}^{N} u_j \phi_j(x) = u^I(x) + u^h(x) \tag{A.3.2-53}$$

is utilized in (A.3.2-31) along with $\delta v = \phi_i$ to obtain the discrete form of the finite-dimensional $H^1$-projection,

$$0 = \int \nabla \Big( \sum_{j=1}^{N} u_j \phi_j + u^I - u \Big) \cdot \nabla \phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.2-54}$$

or

$$\int \nabla u^h \cdot \nabla \phi_i = \sum_{j=1}^{N} u_j \int \nabla \phi_j \cdot \nabla \phi_i = \int \nabla(u - u^I) \cdot \nabla \phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.2-55}$$

which can be compared with its $L^2$-counterpart, (A.3.2-11) through (A.3.2-13). Again, it is expedient to utilize matrix-vector notation to rewrite (A.3.2-55) as

$$K u = b, \tag{A.3.2-56}$$

where $K = \int \nabla \phi \cdot \nabla \phi^T$ and $b = b(u - u^I) = \int \nabla(u - u^I) \cdot \nabla \phi$; or $K_{ij} = \int \nabla \phi_i \cdot \nabla \phi_j$ and $b_i = \int \nabla(u - u^I) \cdot \nabla \phi_i$. Thus, $\phi$ is an $N$-vector (as is $u$) and $K$ is a symmetric (and positive definite when $\Gamma_D \ne \emptyset$) $N \times N$ matrix. The solution of (A.3.2-56) gives the nodal amplitude coefficients of the $H^1$-projection, and (A.3.2-53) gives the projection:

$$P_1^h u(x) = u_1^h(x) = \phi^T u + u^I(x) = \phi^T(x) K^{-1} b(u - u^I) + u^I = \sum_j [K^{-1} b(u - u^I)]_j\, \phi_j(x) + u^I, \tag{A.3.2-57}$$

and the proof that $(P_1^h)^2 = P_1^h$ is now left as an exercise. It also follows easily that the finite-dimensional version of (A.3.2-41) obtains ($P_1^w \to P_1^h$, $Q_1^w \to Q_1^h$, $w \to u^I$) and, as there, $H^1$-orthogonality is generally only achieved for homogeneous Dirichlet BC's ($u^I = 0$) or $\Gamma_D = \emptyset$. Also, however, $u_1^h - u^I = u^h$ is an $\perp_1$-projection of $u - u^I$; $u^h = \sum_j u_j \phi_j$ is the closest function in $V^h$ to $u - u^I$.

Finally, it is noteworthy that this $H^1$-projection has introduced another set of (global) conjugate basis functions, another dual basis. We shall introduce them in the same way
as before (for $L^2$), but it is noteworthy that the 'bottom line' is obtainable simply by replacing $M$ by $K$ in (A.3.2-18). From (A.3.2-57), rewritten as $u_1^h(x) = b^T(u - u^I) K^{-1} \phi(x) + u^I(x)$, we introduce the new $N$-vector

$$\tilde\phi(x) = K^{-1} \phi(x), \tag{A.3.2-58}$$

to obtain

$$u_1^h(x) = \tilde\phi^T(x)\, b(u - u^I) + u^I(x) = \sum_{j=1}^{N} \Big[ \int \nabla(u - u^I) \cdot \nabla \phi_j \Big] \tilde\phi_j(x) + u^I(x). \tag{A.3.2-59}$$

The global basis functions, $\{\tilde\phi_i\}$, are $\perp_1$ to the conventional basis functions; in fact, they form a bi-orthonormal set:

$$\int \nabla \tilde\phi \cdot \nabla \phi^T = \int \nabla(K^{-1} \phi) \cdot \nabla \phi^T = K^{-1} \int \nabla \phi \cdot \nabla \phi^T = I, \tag{A.3.2-60}$$

or $\int \nabla \tilde\phi_i \cdot \nabla \phi_j = \delta_{ij}$, which permits the following 'streamlined' representations of the projection:

(i)
$$u_1^h(x) = \phi^T(x) K^{-1} b(u - u^I) + u^I = \phi^T(x) K^{-1} \int \nabla(u - u^I) \cdot \nabla \phi + u^I = \phi^T(x) \int \nabla(u - u^I) \cdot \nabla(K^{-1} \phi) + u^I = \sum_{j=1}^{N} [u - u^I, \tilde\phi_j]\, \phi_j(x) + u^I(x), \tag{A.3.2-61}$$

with $[u_1^h(x), \tilde\phi_i] = [u(x), \tilde\phi_i]$; i.e., $[u_1^h - u, \tilde\phi_i] = 0$. For the simple case of $u = 0$ on $\Gamma_D$, (A.3.2-61) becomes $u_1^h = \sum_j u_j \phi_j(x)$, with $u_j = \int \nabla u \cdot \nabla \tilde\phi_j$.

(ii)
$$u_1^h(x) = \tilde\phi^T(x)\, b(u - u^I) + u^I = \tilde\phi^T(x) \int \nabla(u - u^I) \cdot \nabla \phi + u^I = \sum_{j=1}^{N} [u - u^I, \phi_j]\, \tilde\phi_j(x) + u^I(x), \tag{A.3.2-62}$$

with $[u_1^h(x), \phi_i] = [u, \phi_i]$; i.e., $[u_1^h - u, \phi_i] = 0$.

The error is $H^1$-orthogonal to the original basis and to the conjugate basis, as for the $L^2$-version, and as expected since both sets of basis functions span $V^h$.
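The discrete projection (A.3.2-56) through (A.3.2-62) can be tried out in a few lines. The following is a minimal sketch, not the authors' code, under stated assumptions: a 1D uniform mesh of 12 linear elements on $[0, 1]$ with Dirichlet nodes at both ends, exact integration (for hat functions $b_i$ reduces to nodal differences of $u - u^I$), and an arbitrary sample $u(x)$. The names `load`, `project`, `Phi` and the sample function are hypothetical, chosen only for illustration.

```python
import numpy as np

n_el = 12                        # assumed: 12 uniform linear elements on [0, 1]
h = 1.0 / n_el
x = np.linspace(0.0, 1.0, n_el + 1)
N = n_el - 1                     # interior nodes; x = 0 and x = 1 lie on Gamma_D

# K_ij = int phi_i' phi_j' for the interior hat functions
K = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h
Phi = np.linalg.inv(K)           # nodal values of the conjugate basis, K^{-1}


def load(u_nodes):
    """Return (uI, b): nodal values of u^I and b_i = int (u - u^I)' phi_i'."""
    uI = np.zeros_like(u_nodes)
    uI[0], uI[-1] = u_nodes[0], u_nodes[-1]
    w = u_nodes - uI
    # phi_i' is +1/h and -1/h on the two adjacent elements, so the integral
    # reduces exactly to nodal differences of (u - u^I)
    b = (2.0 * w[1:-1] - w[:-2] - w[2:]) / h
    return uI, b


def project(u_nodes):
    """Nodal values of u_1^h = u^I + sum_j u_j phi_j, with K u = b."""
    uI, b = load(u_nodes)
    u1 = uI.copy()
    u1[1:-1] += np.linalg.solve(K, b)
    return u1


u_nodes = np.exp(x) * np.cos(3.0 * x)    # sample u(x), nonzero at both ends
u1 = project(u_nodes)

# Bi-orthonormality (A.3.2-60): int grad(phi~_i) . grad(phi_j) = (K^{-1} K)_ij
biorthonormal = np.allclose(Phi @ K, np.eye(N))

# Representation (A.3.2-62): coefficients b on the conjugate basis
uI, b = load(u_nodes)
u1_dual = uI.copy()
u1_dual[1:-1] += Phi.T @ b

# Idempotence: projecting the projection changes nothing
u1_again = project(u1)

print(biorthonormal, np.allclose(u1_dual, u1), np.allclose(u1_again, u1))
```

Because the conjugate basis is just $K^{-1}$ applied to the ordinary basis, forms (i) and (ii) give the same nodal values; the dense inverse is used here only for clarity and would be replaced by a linear solve in any serious computation.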
As we did for the mass matrix, we shall compute and plot a few of the 1D $K$-matrix conjugate basis functions, from (A.3.2-58), here for $\tilde\phi(x) = 0$ at $x = 0, 1$. Recalling (A.3.2-24) through (A.3.2-26), a one-for-one replacement of $M$ by $K$ yields the matrix of nodal values for the $K$-conjugate basis functions,

$$\Phi = K^{-1}, \tag{A.3.2-63}$$

and

$$\tilde\phi_i(x) = \sum_{j=1}^{N} \Phi_{ij}\, \phi_j(x), \tag{A.3.2-64}$$

values of which are plotted for several nodes in Figure A.3.2-12 for linears (normalized to unity) and Figure A.3.2-13 for quads, the latter kindly supplied by D.F. Griffiths and showing the quadratic 'Bubbles' for center nodes.

Fig. A.3.2-12 $K$-conjugate linear basis functions for nodes 2, 6, and 9 in a 12-uniform-element mesh. [Plot of $\tilde\phi_i(x)$ versus $x$ on $0 \le x \le 1$.]

We conclude our 1D discussion with the following Remarks:

(1) The $K$-conjugate linear basis functions and the quads at edge nodes are in fact the Green's function for the 1D Laplacian operator:

$$-\frac{d^2 u}{dx^2} = \delta(x - x_k) \quad \text{with} \quad u^h(x) = \sum_j u_j^{(k)} \phi_j(x),$$

gives

$$\sum_j u_j^{(k)} \int \phi_i' \phi_j' = \int \phi_i(x)\, \delta(x - x_k) = \phi_i(x_k) = \delta_{ik},$$
which is just (A.3.2-63) in disguise ($u_j^{(k)} = \Phi_{jk}$), obtained exactly because of two facts: (i) the exact Green's function is piecewise linear, which is a function spanned by our basis functions, and (ii) the source term for the Green's function was placed at the nodes.

(2) The center node conjugate functions for quads are not true Green's functions because they are $C^\infty$-functions within the element; a jump in the first derivative is required to obtain a true Green's function. They are approximate Green's functions.

(3) The functions shown in Figure A.3.2-12 are again normalized, for plotting, by the maximum nodal value. The true peak nodal amplitudes are $\tilde\phi_k(x_k) = x_k(1 - x_k)$ for linears and for quads at edge nodes, showing small amplitudes near the ends and largest amplitude in the center, per Figure A.3.2-13. This is related to the fact (for the Green's function analog at least) that the same total 'heat' flux must be removed no matter where the 'heat' source is placed.

(4) For the (strange-looking) midside node basis functions, the peak value generally does not occur at the node. The nodal value for these is $3h/16$ above the average value of the two neighbouring edge nodes (D.F. Griffiths, private communication).

(5) These functions really do span the space of quadratic functions on (0,1), even though every other basis function is piecewise linear.

Fig. A.3.2-13 $K$-conjugate quadratic basis functions for center nodes 2, 4, 6, and edge nodes 7, 9, 11.

Moving to 2D, we repeat what we did earlier for the $L^2$-conjugate functions: plot some of them on a 13 x 13 = 169 node mesh interpolated to 85 x 85. Figure A.3.2-14 shows some bilinear $H^1$-conjugate functions, and Figures A.3.2-15 and A.3.2-16 show some biquadratics. There are no pictures for nodes on $\Gamma$ since all of the conjugate basis functions are zero there.
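Returning to the 1D Remarks (1) and (3) above, the claim that the $K$-conjugate linear basis functions are exactly the Green's function at the nodes is easy to check numerically. The sketch below assumes the same 12-uniform-element mesh as Figure A.3.2-12 and uses the standard closed form of the Green's function of $-d^2/dx^2$ on $(0,1)$ with homogeneous Dirichlet BC's, $G(x, y) = \min(x, y)\,[1 - \max(x, y)]$; the variable names are illustrative only.

```python
import numpy as np

n_el = 12                                  # as in Figure A.3.2-12
h = 1.0 / n_el
N = n_el - 1                               # interior nodes
x = np.linspace(0.0, 1.0, n_el + 1)[1:-1]  # interior node coordinates

# Stiffness matrix for linear elements, K_ij = int phi_i' phi_j'
K = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h
Phi = np.linalg.inv(K)                     # (A.3.2-63): nodal values of phi~_i

# Exact Green's function of -d^2/dx^2 with G = 0 at x = 0, 1,
# evaluated at all pairs of interior nodes
G = np.minimum.outer(x, x) * (1.0 - np.maximum.outer(x, x))

max_err = np.abs(Phi - G).max()            # agreement at the nodes
peaks = np.diag(Phi)                       # phi~_k(x_k), compare x_k (1 - x_k)

print(max_err)
print(np.allclose(peaks, x * (1.0 - x)))
```

The diagonal of $K^{-1}$ reproduces the peak amplitudes $x_k(1 - x_k)$ of Remark (3): small near the ends and largest (1/4) at the center node.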
The left side of each of the previous figures (Figures A.3.2-4 through A.3.2-8) shows the basis functions to which these are bi-orthonormal in the $H^1$ sense (their
gradients are bi-orthonormal). The little 'bumps' on the otherwise smooth biquadratic functions are really there, and they are really small; they are a simple consequence of quadratic interpolation. As we shall later show, every one of these dual functions is an approximation of the 2D Green's function, $\ln r$, where $r = \sqrt{(x - x_k)^2 + (y - y_k)^2}$; they are rather poor approximations, of course, on this coarse mesh.

Fig. A.3.2-14 Dual (in $H^1$) basis functions for bilinear elements. (a) $\tilde\phi(x)$ for a typical internal node (4-patch); maximum = 0.642. (b) $\tilde\phi(x)$ for 1 node in from boundary; maximum = 0.454. (c) $\tilde\phi(x)$ for 1 node in from a corner; maximum = 0.408.

A.3.2.5 The Projection Method

The principal purpose of this section is to describe and demonstrate the following important, and somewhat remarkable, fact: the GFEM solution to a second-order elliptic BVP
Fig. A.3.2-15 Dual (in $H^1$) basis functions for biquadratic elements at three internal nodes. (a) $\tilde\phi(x)$ for a 4-patch; maximum = 0.698. (b) $\tilde\phi(x)$ for a 2-patch; maximum = 0.539. (c) $\tilde\phi(x)$ for a central node; maximum = 0.446.

Fig. A.3.2-16 Dual (in $H^1$) basis functions for biquadratic elements for nodes near the boundary. (a) $\tilde\phi(x)$ for 1 node in from a corner; maximum = 0.214. (b) $\tilde\phi(x)$ for 1 node in from a boundary; maximum = 0.354.
is precisely the $H^1$-projection of the exact solution onto the finite element subspace (basis functions), and this in spite of the fact that the exact solution is generally 'not available.' We shall demonstrate this remarkable fact for one simple example, but the result is much more general and applies to any elliptic problem in which the operator is self-adjoint in $H_0^1$, although sometimes the most 'natural' norm, such as the so-called energy norm, is different from the simple $H^1$-(semi-)norm we have been using; see the literature for other examples.

Thus, we consider the weak form of the following simple BVP:

$$\nabla^2 u + f = 0 \quad \text{in } \Omega, \tag{A.3.2-65}$$

$$u = u_D \quad \text{on } \Gamma_D, \tag{A.3.2-66}$$

and

$$\partial u / \partial n = g \quad \text{on } \Gamma_N = \Gamma - \Gamma_D; \tag{A.3.2-67}$$

i.e., find $u \in H_E^1$ such that

$$\int \nabla u \cdot \nabla v = \int f v + \int_{\Gamma_N} g v \quad \forall v \in H_0^1, \tag{A.3.2-68}$$

where $H_E^1$ is that subset of $H^1$ (not a linear subspace) in which all functions take on the value $u_D$ on $\Gamma_D$, and $H_0^1$ is the linear subspace of $H^1$-functions that vanish on $\Gamma_D$. An alternate statement of the weak formulation that may actually be more useful is: find $\hat u = u - w \in H_0^1$, where $w$ is an $H^1$-extension of $u_D$ from $\Gamma_D$ into $\Omega$, by solving

$$\int \nabla(\hat u + w) \cdot \nabla v = \int f v + \int_{\Gamma_N} g v \quad \forall v \in H_0^1, \tag{A.3.2-69}$$

or

$$\int \nabla \hat u \cdot \nabla v = \int f v + \int_{\Gamma_N} g v - \int \nabla w \cdot \nabla v \quad \forall v \in H_0^1; \tag{A.3.2-70}$$

i.e., the problem is now placed in a linear vector space setting. The corresponding approximate (GFEM) formulation is

$$\int \nabla \Big( \sum_j u_j \phi_j + u^I \Big) \cdot \nabla \phi_i = \int f \phi_i + \int_{\Gamma_N} g \phi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.2-71}$$

and $u_1^h(x) = u^I(x) + \sum_j u_j \phi_j(x) = u^I + u^h$, where $u^I(x)$ is the interpolant of $u_D$ on $\Gamma_D$.

Now comes the first of two key observations: since (A.3.2-69) is valid for every $v \in H_0^1$, it is surely valid for the finite-dimensional subset, $v = \phi_i$ for $i = 1, 2, \ldots, N$, which implies that

$$\int \nabla(\hat u + w) \cdot \nabla \phi_i = \int f \phi_i + \int_{\Gamma_N} g \phi_i \tag{A.3.2-72}$$

for every $i$; i.e., the exact (weak) solution satisfies this finite set of equations (and, of course, many others) with the same RHS as does the approximate solution.
This fact permits us to subtract (A.3.2-72) from (A.3.2-71), which 'eliminates' the data ($f$ and $g$),
giving

$$\sum_j u_j \int \nabla \phi_j \cdot \nabla \phi_i = \int \nabla(\hat u + w - u^I) \cdot \nabla \phi_i, \quad i = 1, 2, \ldots, N. \tag{A.3.2-73}$$

The second key observation is this: (A.3.2-73) is a repeat of (A.3.2-55), the discrete $H^1$-projection equation, after replacing $u$ by $\hat u + w$ there. So, we have achieved one of our goals: The GFEM approximate solution to the BVP given by (A.3.2-65) through (A.3.2-67) is identical to the $H^1$-projection of that solution onto the subspace spanned by the GFEM basis functions. Galerkin's method is indeed a projection method. More precisely, perhaps: the GFEM solution, $u_1^h - u^I$, is an $\perp_1$-projection of $(u - u^I)$ onto the finite element basis.

Remarks:

(1) A crucial piece of the above analysis is that the finite elements be of the 'conforming' type; i.e., the discrete space is really a subspace of the $\infty$-dimensional problem. The GFEM applied with non-conforming elements is not a projection method, a fact that will later be seen to lessen the 'quality' of incompressible flow simulations.

(2) The $H^1$-projection to the GFEM basis functions is thus seen to satisfy $\partial u_1^h / \partial n = \partial u / \partial n$ on $\Gamma_N$, weakly; it is an NBC that comes with the projection.

(3) If $\Gamma_D = \emptyset$, (i) the problem is only solvable if $\int f + \int_\Gamma g = 0$, (ii) it is then solvable only up to an arbitrary additive constant, (iii) the projected solution will try to maintain $\partial u / \partial n$ on all of $\Gamma$, and (iv) the projection is then an $H^1$-orthogonal ($\perp_1$) projection of the exact solution onto the finite element subspace; the gradients of the two solutions are as close as possible (in $L^2$).

(4) Recalling that $u_1 = w + P_1^\perp(u - w)$, see (A.3.2-47), we observe the following: while it is true that $u_1 - w$ is the closest (in $H^1$) function to $u - w$ in the $\infty$-dimensional subspace $H_0^1$ and that $u_1^h - u^I$ is the closest (in $H^1$) function to $u - u^I$ in the finite-dimensional subspace of $H_0^1$, we have, in general, no idea of how close $u_1^h$ is to $u_1$.
(5) The $P_0$- and $P_0^h$-projections can also be related in a somewhat similar manner, as follows: (i) rewrite (A.3.2-6) as $\int u_0 v_m = \int u v_m$ and (A.3.2-12) as $\int u_0^h \phi_i = \int u \phi_i$ and take $S^h \subset S$; (ii) restrict the $\{v_m\}$ to the subset $\{\phi_i\}$ to obtain $\int u_0 \phi_i = \int u \phi_i = \int u_0^h \phi_i$, which shows that $u_0^h$ is also a $P_0^h$-projection of $u_0$, which itself is a $P_0$-projection of $u$: $u_0^h = P_0^h u_0 = P_0^h P_0 u = P_0^h u$; i.e., $P_0^h(u - P_0 u) = 0$, which is similar to the familiar Euclidean projection of a 3D vector onto a line, in which an intermediate projection might be to a plane containing the line. Perhaps a somewhat more relevant example would be the projection of $u(x)$ onto a subspace spanned by, say, 30 continuous piecewise, cubic polynomials (a subspace of dimension 120), followed by a projection to a subspace spanned by, say, 30 piecewise, linear FEM basis functions at the same set of nodes. The result would be the same as projecting $u(x)$ directly to the 30-dimensional FEM subspace.

We conclude the $H^1$-projection discussion with a return to an issue raised in Section A.3.2.3, between (A.3.2-47) and (A.3.2-48); namely, just what is the difference
between the following two $\perp_1$-projections: (i) $\Gamma_D = \emptyset$ (unconstrained) and (ii) $u_D = 0$ (constrained, but with $u = 0$ on $\Gamma_D \ne \emptyset$)? The difference, which is small but 'finite' when $u(x) = 0$ on $\Gamma_D$, is rather easier to understand for the finite-dimensional FEM projections, beginning with the observation that, all else being the same [same domain, same number of nodes, same basis functions, and same $u(x)$], the two FEM subspaces have different dimensions, with that of the first projection being larger [$N_T$; see (A.3.2-52)]. First we note intuitively that the projection with more functions in the 'grab bag' should have a better chance at minimizing the quadratic functional. Thus, the unconstrained projection, which will allow $u_1^h(x)$ to be different from zero on $\Gamma$, will have a smaller $F_1(u_1^h)$ than that which, while still an $\perp_1$-projection, constrains $u_1^h(x)$ to be zero on $\Gamma_D$. Next we mention an intuitive reason for this difference: the unconstrained solution will more closely match the normal gradient of $u(x)$ on $\Gamma$ [see (A.3.2-67)], which, after all, is part of the goal of an $H^1$-projection; the constrained case, on the other hand, sacrifices the ability to maintain $\partial u / \partial n$ on $\Gamma_D$ by requiring $u_1^h(x) = 0$ there, and thus cannot find the truly smallest $F_1(u_1^h)$, even though it is the smallest in the lower-dimensional subspace of dimension $N$. Finally, we remark that the unconstrained projection (which, of course, must deal with the fact that $K$ is singular, probably by setting $u_j = 0$ on some boundary node) will also be $\perp_1$ for $u(x)$ that is not zero on $\Gamma$; the special case was selected just to compare and contrast the two projections.

A.3.2.6 Brief Discussion of GFEM Errors on Elliptic BVP's

To conclude this discussion, we return to the GFEM approximation of the Green's function for the Laplacian, because it is important in its own right, not just for discussing the $H^1$-conjugate functions of Section 2.6.4, and we thank D. Arnold and R. Rannacher for helping us here.
These approximate Green's functions are used by FEM numerical analysts to make error estimates, a deep subject that we shall merely touch upon, beginning with the following quotation from Brenner and Scott (1994), p. 170:

'The finite element approximation is essentially defined by a mean square projection of the gradient. [As we saw in the previous section.] Thus it is natural that error estimates for the gradient of the error directly follow in the $L^2$ norm. It is interesting to ask whether such a gradient-projection would also be of optimal order in some other norm, for example $L^\infty$. We prove here that this is the case.'

We shall summarize how this works, using the $H^1$-dual functions. Consider solving the following 'Green's function' problem:

$$-\nabla^2 G^{(k)} = \delta(x - x_k) \quad \text{in } \Omega, \tag{A.3.2-74}$$

$$G^{(k)} = 0 \quad \text{on } \partial\Omega, \tag{A.3.2-75}$$

where $x_k$ denotes the node at which the Dirac delta (source) function 'lives', although the theory does not require a nodal source. The weak form (of course) is the equation of interest; namely

$$\int \nabla v \cdot \nabla G^{(k)} = \int v\, \delta(x - x_k) = v(x_k) \quad \forall v \in H_0^1. \tag{A.3.2-76}$$

The GFEM approximation to $G^{(k)}$, $g^h(x) = \sum_j g_j^{(k)} \phi_j(x)$, is given by

$$\int \nabla \phi_i \cdot \nabla g^h = \int \phi_i(x)\, \delta(x - x_k) = \phi_i(x_k) \quad \forall i, \tag{A.3.2-77}$$
i.e.,

$$\sum_j g_j^{(k)} \int \nabla \phi_i \cdot \nabla \phi_j = \delta_{ik}, \quad i = 1, 2, \ldots, N, \tag{A.3.2-78}$$

which is

$$\sum_j K_{ij}\, g_j^{(k)} = \delta_{ik}, \tag{A.3.2-79}$$

giving

$$g_i^{(k)} = \sum_j (K^{-1})_{ij}\, \delta_{jk} = (K^{-1})_{ik} = \Phi_{ik}; \tag{A.3.2-80}$$

the $K$-conjugate basis functions are the GFEM approximations to the Green's functions of the Laplacian, and we are now ready to (imprecisely) summarize a particular GFEM error estimate. Since (A.3.2-76) holds for all $v \in H_0^1$, let us take

$$v = u - u^h, \tag{A.3.2-81}$$

where $u$ is the solution to the BVP,

$$\nabla^2 u = -S(x) \quad \text{in } \Omega, \tag{A.3.2-82}$$

$$u = 0 \quad \text{on } \partial\Omega, \tag{A.3.2-83}$$

and $u^h$ is the GFEM approximation to the same problem (thus $u - u^h$ is the error) and on the same mesh used to obtain $g^h$, the approximate Green's function. Thus, from (A.3.2-76) and (A.3.2-81), with the nodal identifier, $k$, suppressed in the sequel (for convenience),

$$u - u^h = \int \nabla G \cdot \nabla(u - u^h) \tag{A.3.2-84}$$

$$\phantom{u - u^h} = \int \nabla(G - g^h) \cdot \nabla(u - u^h), \tag{A.3.2-85}$$

because $u - u^h \perp_1 g^h$, a fact that is contained in (A.3.2-73) of the previous section, for $w = u^I = 0$ (the error in a GFEM solution to an elliptic BVP is $H^1$-orthogonal to the basis). By the same reasoning, the error in the Green's function, $G - g^h$, is $H^1$-orthogonal to $u^h$, and also to $u^I$, the interpolant (projection) of the exact solution. Thus we can replace $u^h$ by $u^I$,

$$u - u^h = \int \nabla(G - g^h) \cdot \nabla(u - u^I), \tag{A.3.2-86}$$

to obtain, via the Schwarz inequality,

$$|u - u^h| \le \int |\nabla(G - g^h)| \cdot \int |\nabla(u - u^I)|, \tag{A.3.2-87}$$

where $|u - u^h|$ is the absolute value of the error at the (suppressed) node, $x_k$. Next it is clear that $\int |\nabla(u - u^I)| \le \text{meas}(\Omega)\, \max_\Omega |\nabla(u - u^I)|$,
where meas$(\Omega)$ is the 'size' of the domain. Standard approximation theory [see, for example, Strang and Fix (1973) or Ciarlet (1978)] then gives

$$\int |\nabla(u - u^I)| \le c h^r, \tag{A.3.2-88}$$

where $c$ is proportional to the second-derivatives of $u$; and $r = 1$ for linears, 2 for quadratics, etc. Thus,

$$|u - u^h| \le c h^r \int |\nabla(G - g^h)|, \tag{A.3.2-89}$$

and we now appeal to authority [for example, the Brenner and Scott book, and Rannacher and Scott (1982) and references therein, in which paper they successfully 'removed' a previously-present $\ln |h|$ factor for linear (only) basis functions] to state that the (gradient) error in the Green's function approximation is also $O(h)$, for all $r$ (owing to the singularity in $G$), and we are done: now letting $k$ range over all the nodes in the mesh (and even over all of $\Omega$) gives the maximum ($L^\infty$) estimate:

$$\|u - u^h\|_\infty \le c h^{r+1}.$$

This concludes our brief excursion into GFEM error analysis. Hopefully it is useful to some of our readers.

A.3.2.7 Numerical Examples

Before moving on to the more difficult subject of 'vector projections,' we shall show a few 1D examples and one 2D example of scalar projections, some with linear basis functions, some with quadratic, some with both, starting with a very smooth ($C^\infty$) function, a Gaussian.

Fig. A.3.2-17 $L^2$-projection of a Gaussian via quadratic basis functions (dashed) on a 6-element mesh. [Plot of $u_0^h(x)$ versus $x$ on $0 \le x \le 1$.]

Figures A.3.2-17 and A.3.2-18 show the $L^2$- and $H^1$-projections,
respectively, of

$$u(x) = f(x) = e^{-(x - x_0)^2 / 2\sigma^2} \tag{A.3.2-90}$$

for $0 \le x \le 1$ with $x_0 = 1/2$ and $\sigma = \Delta x = 1/12$ onto an $N = 12 \pm 1$-dimensional subspace via quadratic basis functions (six elements) and three-point Gaussian quadrature, which we supplement with the following Remarks:

(1) $N = 13$ for $L^2$ and $N = 11$ for $H^1$ (because $u^h = 0$ at $x = 0, 1$ are BC's).

(2) In these and subsequent figures the plots are 'lifted off' the origin by adding 0.1 to the true results. Also we remark that the plots from quads 'suffer' somewhat because linear interpolation is used by the graphics routine.

(3) Higher-order quadrature rules give the 'same' result (graphically identical).

(4) Linear basis functions give virtually the same graphical results, with a two-point (or greater) Gaussian rule.

(5) The $H^1$-projection looks much like an interpolant projection, and it is (for the edge nodes in 1D and for all linear basis functions), which we show below.

(6) The true $H^1$-projection, via $\|u\|_1^2 = \|u\|_0^2 + \|\nabla u\|_0^2$ gives, for these cases, $(M + K)u = b$, where $b_i = \int (u \phi_i + u' \phi_i')$. Limited experimentation here showed that there is little difference between the true $H^1$-norm and the $H^1$-semi-norm ($K$-matrix) results.

Fig. A.3.2-18 $H^1$-projection of a Gaussian via quadratic basis functions (solid) on a 6-element mesh. [Plot of $u_1^h(x)$ versus $x$ on $0 \le x \le 1$.]

That the 1D $H^1$-projection, $P_1^h$, yields the interpolant for linear basis functions is probably worthwhile demonstrating, even though the result does not extend to multi-dimensions, and we do so in two ways.
First way: Recall (A.3.2-55) for $u = f$ and $u^I = 0$ in 1D; i.e.,

$$\int_0^1 \phi_i' (u^h)' = \int_0^1 \phi_i' f', \quad i = 1, 2, \ldots, N, \tag{A.3.2-91}$$

or, introducing the 'error' function $e(x) = u^h(x) - u(x)$,

$$\int_0^1 \phi_i' e' = \int_0^1 \phi_i' (u^h - f)' = 0 \quad \forall i; \tag{A.3.2-92}$$

the error is $H^1$-orthogonal to the basis. For $f(x)$ sufficiently smooth, we can integrate by parts to obtain, using $e(0) = e(1) = 0$,

$$\int_0^1 e(x)\, \phi_i''(x) = 0 \quad \forall i. \tag{A.3.2-93}$$

But $\phi_i''(x)$ consists of Dirac delta functions: one of weight $-(1/h_L + 1/h_R)$ centered at $x = x_i$, plus ones of weight $1/h_L$ and $1/h_R$ at the two neighbouring nodes. Thus (A.3.2-93) is a non-singular homogeneous system for the nodal errors, which, with $e(0) = e(1) = 0$, gives $e(x_i) = 0 = u^h(x_i) - u(x_i)$.

Second way: Starting again from (A.3.2-55), this time in the form

$$\sum_{j=1}^{N} u_j \int_0^1 \phi_i' \phi_j' = \int_0^1 \phi_i' f', \tag{A.3.2-94}$$

we 'realize' (or can compute) that the LHS is $(u_i - u_{i-1})/h_L + (u_i - u_{i+1})/h_R$. The analogous ($i$-th row of the) RHS is also easy to evaluate, since $\phi_i' = 1/h_L$ on the left element and $-1/h_R$ on the right element:

$$\text{RHS} = \frac{1}{h_L} \int_{x_{i-1}}^{x_i} \frac{df}{dx}\, dx - \frac{1}{h_R} \int_{x_i}^{x_{i+1}} \frac{df}{dx}\, dx = \frac{1}{h_L} [f(x_i) - f(x_{i-1})] + \frac{1}{h_R} [f(x_i) - f(x_{i+1})].$$

Recalling the non-singular $K$-matrix then leads to $K(u - f) = 0$, where $f_i = f(x_i)$; and thus $u = f$.

We have thus proven (twice) that linear basis functions cause the $H^1$-projection to be the same as the interpolation projection. What about quads? Here we use a two-step procedure, as follows (with thanks to D. Arnold):

Step 1. The fact that the linear basis functions are a subset of the quadratics (using only the edge nodes) yields immediately that the edge nodes of the $H^1$-projection via quads are also the interpolant.

Step 2. Here we focus on the center nodes and return to (A.3.2-93). But this time $\phi_i''$ is a constant within the element containing center node $i$ (and $e = 0$ at its edge nodes, by Step 1), and we obtain

$$\int e(x) = 0 \quad \forall i, \tag{A.3.2-95}$$

with the integral taken over that element; i.e., the average error, $e(x) = u^h(x) - f(x)$, is zero over each element, which is consistent with minimizing the error in the gradient, and we are done. For quads, the edge nodes are the interpolant and the center nodes take on the value
that gives zero average error over each element, and we leave generalization to higher-order elements to the reader.

We now move to our second, more difficult, example ($C^0$ function), in which quadrature error precludes obtaining the interpolant for the $H^1$-projections. We chose a portion of a nearly half ellipse to see how well the projections could deal with a large slope discontinuity (and we were pleasantly surprised). [In what follows we discuss, for simplicity, only a single ellipse, the left one in the figures below, whereas the actual calculations and figures show a pair of them, for 'variety,' and to see how well a very large change in slope is accommodated (which is rather well).] In the next four figures are shown some $N = 24 \pm 1$ projection results for the function

$$u(x) = f(x) = c + b \sqrt{1 - \Big( \frac{x - x_0}{a} \Big)^2} \quad \text{on } x_1 \le x \le x_2, \tag{A.3.2-96}$$

with horizontal center location $x_0 = 3/8$, semi-major axis $b = 1$, semi-minor axis $a = 1/8$, and vertical center location $c = -0.1$, giving the intercepts [$f(x) = 0$] $x_1 = x_0 - a\sqrt{1 - c^2/b^2} = 0.2506$, and $x_2 = x_0 + a\sqrt{1 - c^2/b^2} = 0.4994$. The $L^2$-projections have $N = 25$ (24 linear elements or 12 quadratic elements) and, because we set $f$ and $u^h = 0$ at $x = 0$ and 1, the $H^1$-projections have $N = 23$.

Figures A.3.2-19 and A.3.2-20 show the $L^2$-projections for linears (two-point rule) and quads (three-point rule), respectively. In both cases, higher-order quadrature did not visibly change the pictures. On the other hand, the $H^1$-projections were much more sensitive to the 'Gaussian rule'; Figures A.3.2-21 and A.3.2-22 show the linear basis function results for a two-point and a seven-point rule, respectively.

Fig. A.3.2-19 $L^2$-projection of two ellipses via linear basis functions on a 24-element mesh (2-point Gaussian quadrature). [Plot of $u_0^h(x)$ versus $x$ on $0 \le x \le 1$.]

(For quads, not shown, a three-point rule produced a picture much like that in Figure A.3.2-21, and the seven-point rule result was close to, but slightly lower
than, that in Figure A.3.2-22, showing that the quads are harder to get right.)

Fig. A.3.2-20 Same as Figure A.3.2-19 except via 12 quadratic elements (and 3-point quadrature).

Fig. A.3.2-21 $H^1$-projection of two ellipses via linear basis functions on a 24-element mesh (2-point quadrature).

Since exact integration yields the interpolant, all of the error at the nodes in these last two figures is quadrature error. We also mention that in these and other cases, the results are less 'nice' looking if node points are not located at points of discontinuity, owing to additional quadrature error.
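The role of quadrature error can be illustrated with a short experiment. The sketch below is an assumption-laden illustration, not a reproduction of the computations behind the figures: it uses a single ellipse (the left one, with $f = 0$ outside $x_1 \le x \le x_2$) rather than the pair shown in the figures, 24 uniform linear elements, and Gauss-Legendre rules from NumPy. The function names (`f`, `fprime`, `h1_nodal_values`) are hypothetical. With exact integration of $\int \phi_i' f'$ the nodal values reproduce the interpolant to round-off; with numerical quadrature they do not.

```python
import numpy as np

x0, a, b_axis, c = 3.0 / 8.0, 1.0 / 8.0, 1.0, -0.1   # parameters of (A.3.2-96)


def f(x):
    """Single ellipse of (A.3.2-96), set to zero outside x1 <= x <= x2."""
    s2 = 1.0 - ((x - x0) / a) ** 2
    return np.maximum(0.0, c + b_axis * np.sqrt(np.maximum(s2, 0.0)))


def fprime(x):
    """df/dx inside the ellipse, zero outside (slope jumps at x1 and x2)."""
    s = (x - x0) / a
    s2 = 1.0 - s ** 2
    inside = c + b_axis * np.sqrt(np.maximum(s2, 0.0)) > 0.0
    safe = np.where(inside, s2, 1.0)          # avoid sqrt/division outside
    return np.where(inside, -b_axis * s / (a * np.sqrt(safe)), 0.0)


n_el = 24
h = 1.0 / n_el
x = np.linspace(0.0, 1.0, n_el + 1)
N = n_el - 1
K = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h


def h1_nodal_values(n_gauss=None):
    """Solve K u = b with b_i = int phi_i' f'; exact if n_gauss is None."""
    if n_gauss is None:
        elem_int = np.diff(f(x))              # int f' over each element, exact
    else:
        xi, wt = np.polynomial.legendre.leggauss(n_gauss)
        mid = 0.5 * (x[:-1] + x[1:])
        elem_int = np.array(
            [0.5 * h * np.sum(wt * fprime(m + 0.5 * h * xi)) for m in mid]
        )
    # phi_i' = +1/h on the element to the left of node i, -1/h on the right
    rhs = (elem_int[:-1] - elem_int[1:]) / h
    return np.linalg.solve(K, rhs)


f_nodes = f(x[1:-1])
err_exact = np.abs(h1_nodal_values() - f_nodes).max()
err_gauss2 = np.abs(h1_nodal_values(2) - f_nodes).max()
err_gauss7 = np.abs(h1_nodal_values(7) - f_nodes).max()
print(err_exact, err_gauss2, err_gauss7)
```

The nodal error of the quadrature-based results comes almost entirely from the two elements that contain the slope discontinuities, which is why the choice of rule matters so much more here than for the smooth Gaussian.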
Fig. A.3.2-22 Same as Figure A.3.2-21 except using a 7-point quadrature rule.

Let us now recall the projection connection: the continuum results in Figures A.3.2-19 through A.3.2-22 (the ellipses) also correspond to the (weak) solution of the following BVP:

$$\frac{d^2 u}{dx^2} = -S(x) = \frac{d^2 f}{dx^2} \quad \text{on } 0 \le x \le 1 \text{ with } u(0) = u(1) = 0, \tag{A.3.2-97}$$

where, from (A.3.2-96), the 'source' term is given by

$$S(x) = -\frac{d^2 f}{dx^2} = \frac{b}{a^2} \Big[ 1 - \Big( \frac{x - x_0}{a} \Big)^2 \Big]^{-3/2} \quad \text{on } x_1 < x < x_2, \tag{A.3.2-98}$$

with point heat sinks at $x = x_1$ and $x = x_2$, and $S(x) = 0$ for $x < x_1$ and $x > x_2$. The point heat sinks are Dirac delta functions, needed to compensate for the discontinuity in $f'(x)$ at $x = x_1$ and $x_2$, and are given by the 'flux' jump there; namely,

$$S(x) = \big[ b^2 \sqrt{1 - c^2/b^2} / ac \big]\, \delta(x - x_i) \tag{A.3.2-99}$$

for $i = 1, 2$; here $b^2 \sqrt{1 - c^2/b^2} / ac = -79.6$. Note that the ellipse center, $c$, must be negative in order for the solution to make sense (reside in $H^1$); for $c \to 0$, the flux at the two 'edges' of the ellipse $\to \pm\infty$ and the BVP becomes ill-posed.

The approximate (GFEM) solution of the above BVP is given by

$$\sum_j u_j \int \phi_i' \phi_j' = \int \phi_i S(x) = -\int \phi_i u''(x) \quad \forall i, \tag{A.3.2-100}$$

which will clearly not give the same result as the $H^1$-projected result, (A.3.2-94), with $f'$ replaced by $u'$, when numerical integration is employed in both cases; the RHS's will
generally differ and, thus, the solutions. Galerkin's method via FEM is a true projection method only in the absence of quadrature error. In the general case, we can call it an approximate projection method.

Our last 1D example is a $C^{-1}$-function and thus can only be studied in $L^2$. It, and its projection, are shown in Figure A.3.2-23 for a 50-linear-element mesh ($N = 51$) and a two-point Gaussian rule, which merits the following Remarks:

(1) The discontinuity at $x = 0.7$ causes a classic Gibbs jump because our basis functions are continuous.

(2) Quadratic basis functions with a three-point rule and an edge node located at the discontinuity look much like the linears, but see below.

(3) The near-perfect result for the ramp portion is deceptive; i.e., there are wiggles, decreasing in amplitude away from the discontinuity. They are just too small to see because, at least in part, there is a node at the jump in $u'(x)$.

Fig. A.3.2-23 $L^2$-projection of a $C^{-1}$ function via linear basis functions on a 50-element mesh. [Plot of $u_0^h(x)$ versus $x$ on $0 \le x \le 1$.]

The Gibbs jump can, with thanks to D. Griffiths, be studied analytically, at least for the case of a single discontinuity, for both linears and quads. In both cases we wish to solve $Mu = b$, where $b_i = \tfrac{1}{2} \cdot \int \phi_i$ for $i = 0$ (node at discontinuity), $b_i = 1 \cdot \int \phi_i$ for $i < 0$, and $b_i = 0$ for $i > 0$. Application of the theory of difference equations leads first to the homogeneous solution,

$$u_i = a\, \xi_+^i + b\, \xi_-^i, \tag{A.3.2-101}$$

where $\xi_\pm = (-2 \pm \sqrt{3})$ for linear elements and $\xi_\pm = (3 \pm 2\sqrt{2})$ for the edge nodes (only) for quads, the latter having been obtained by first eliminating the midside (1/2-integer) nodes in terms of edge nodes. Next, in order to obtain bounded solutions for $i > 0$, we need $b = 0$ for linears and $a = 0$ for quads; conversely, for $i < 0$ we need
$a = 0$ for linears and $b = 0$ for quads. Finally, to obtain the non-zero values for $a$ and $b$, two equations are needed: (i) the matchup (continuity) condition at $i = 0$ to obtain the particular solution $u_i = 1$ for $i < 0$ and $u_i = 0$ for $i > 0$; and (ii) the 'special' equation, with $b_0 = \tfrac{1}{2}\int\varphi_0$ and $u_0 = 1/2$, must be satisfied for $i = 0$. The final results are:

1. Linears: $u_i = 1 - \tfrac{1}{2}\xi_-^{\,i}$ for $i \le 0$ and $u_i = \tfrac{1}{2}\xi_+^{\,i}$ for $i \ge 0$, giving a Gibbs jump of magnitude $|u_1| = |(-2 + \sqrt{3})/2| = 0.134$ or, equivalently, $|u_{-1}| - 1 = -\tfrac{1}{2}(-2 - \sqrt{3})^{-1}$, independent of grid size.

2. Quadratics:
(i) Edge nodes (integer): $u_i = 1 - \tfrac{1}{2}\xi_+^{\,i}$ for $i \le 0$ and $u_i = \tfrac{1}{2}\xi_-^{\,i}$ for $i \ge 0$. Here, though, there is no Gibbs jump because both $\xi$'s are positive. The Gibbs jump (and oscillatory decay) shows up in the center nodes.
(ii) Center nodes (1/2-integer): $u_{i-1/2} = [10 - (u_i + u_{i-1})]/8$ for $i \le 0$ and $u_{i+1/2} = -(u_i + u_{i+1})/8$ for $i \ge 0$, showing here a Gibbs jump of magnitude $|u_{1/2}| = (u_0 + u_1)/8 = [1/2 + \tfrac{1}{2}(3 - 2\sqrt{2})]/8 = 0.0732$, which is about one-half of that from linears. This does, however, correspond to a sort of 'best case' in that a larger jump is observed if a center node is placed at the discontinuity or, worse yet, if no node is there (additional quadrature error), a situation that also applies to linear elements (the jump can then be more like 20%).

For our single 2D example, we shall show five projections of the 2D Gaussian shown in Figure A.3.2-24(a), which has $\sigma_x = \Delta x = \Delta y = 1/12$ and $\sigma_y = 0.3\sigma_x$, via

$$u(x, y) = \exp\left\{-\tfrac{1}{2}\left[\left(\frac{x - x_0}{\sigma_x}\right)^2 + \left(\frac{y - y_0}{\sigma_y}\right)^2\right]\right\}, \tag{A.3.2-102}$$

with $x_0 = y_0 = 1/2$ (center of the unit square domain). As with our conjugate basis function examples presented earlier, we use a 13 x 13 = 169 node mesh and interpolate all results via the appropriate FEM basis functions onto an 85 x 85 mesh. Figure A.3.2-24(b) shows the bilinear interpolant projection; the rounded cap becomes a spike on this coarse mesh.
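As a numerical aside on the 1D Gibbs analysis above, the linear-element result can be checked with a few lines of NumPy. The sketch below is only an illustration under stated assumptions: a uniform mesh with $h = 1$, exact integration of the right-hand side, a node placed exactly at the jump, and enough elements on each side that the free ends do not matter. The function name `project_step_linear` is ours, not the book's.

```python
import numpy as np

def project_step_linear(n_left=40, n_right=40, h=1.0):
    """L2-project a unit step onto 1D linear elements (consistent mass).

    The step equals 1 on the n_left elements left of node k = n_left and 0 on
    the n_right elements to its right, so a node sits exactly at the jump.
    """
    n_nodes = n_left + n_right + 1
    M = np.zeros((n_nodes, n_nodes))
    b = np.zeros(n_nodes)
    Me = (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])  # element mass matrix
    for e in range(n_nodes - 1):
        M[e:e + 2, e:e + 2] += Me
        f = 1.0 if e < n_left else 0.0   # value of the step on element e
        b[e:e + 2] += f * h / 2.0        # integral of each hat function over e
    return np.linalg.solve(M, b), n_left

u, k = project_step_linear()
overshoot = u[k - 1] - 1.0    # |u_{-1}| - 1 in the notation of the text
undershoot = -u[k + 1]        # |u_1| in the notation of the text
print(u[k], overshoot, undershoot)
```

Both jumps should agree with $(2 - \sqrt{3})/2 \approx 0.134$, and changing `h` should leave them unchanged, consistent with the 'independent of grid size' statement.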
Figure A.3.2-24(c) is from a 3 x 3 (Gauss points) quadrature on bilinear elements, and the extrema (max/min) are (0.981/−0.110); higher-order quadrature looks much the same (for example, 7 x 7 gives extrema of 0.981/−0.106). Figure A.3.2-24(d) is from a 7 x 7 quadrature on biquadratics and has
extrema of (1.061/−0.102), whereas a 3 x 3 rule here, while looking very much the same, gave extrema of (1.177/−0.108).

Fig. A.3.2-24 A 2D Gaussian and five projections: (a) the exact 2D Gaussian; (b) its interpolant via bilinear elements; (c) $L^2$-projection via bilinears; (d) $L^2$-projection via biquadratics; (e) $H^1$-projection via bilinears; (f) $H^1$-projection via biquadratics.

Moving now to the $H^1$ results, Figure A.3.2-24(e) shows a bilinear 3 x 3 result with extrema (0.970/−0.052), whereas 7 x 7 gave the 'same' picture with extrema (0.987/−0.054). Note the similarity to the interpolant, and recall that in 1D these two projections are identical. Finally, a 7 x 7 rule for biquadratics, shown
in Figure A.3.2-24(f), has extrema (1.024/−0.110), whereas a 3 x 3 rule gave the 'same' picture with extrema of (1.161/−0.167). The interpolant projection (not shown) looks much the same, and has extrema of (1.00/−0.120). Clearly the $H^1$-projections 'look' a little bit smoother, and virtually all of the 'wiggles' (for both projections) are in the more poorly resolved y-direction. [We purposely used a coarse mesh to reveal the projection errors, which virtually vanish (in the augen norm at least) on 'normal' meshes; i.e., both projections then 'look like' the interpolant.]

One of the lessons that has been learned from these examples is this: whereas the $L^2$ (best least square fit to the function) projection necessarily 'wiggles', the $H^1$-projection can be 'mentally approximated' by the interpolant (projection), which only 'wiggles' (and not very much, usually) for higher-order elements. Finally, it is useful to recall (see, e.g., Strang and Fix 1973) that the solution to the associated elliptic BVP (Poisson equation) via GFEM is always more accurate (in $H^1$) than is the interpolant of the exact solution.

A.3.3 VECTOR PROJECTIONS

We now move up in complexity by considering both $L^2$- and $H^1$-projections of vector-valued functions. The subspaces of interest are both the same finite element spaces as above (subsets of $L^2$ and $H^1$, respectively) and some new ones: the subspaces of divergence-free vector fields. In fact, our main goal here is to explicate the projection of vectors (divergence-free or not) from either $\infty$-dimensional or finite-dimensional spaces to a finite-dimensional subspace in which the vector is discretely divergence-free.
Along the way we will discover that the projection of the $\infty$-dimensional space to the discretely divergence-free subspace can also be represented as two sequential projections: the first to a non-divergence-free finite-dimensional subspace and the second from there down to the final subspace. Also, as in the scalar case, we will lead up to the case of interest by first examining the case of an $\infty$-dimensional projection of a vector onto its $\infty$-dimensional subspace of vectors that have zero divergence. We shall again start with the simpler $L^2$-case and conclude with the $H^1$-case, and we note now the interesting 'final' result for the finite-dimensional versions: simply swap $M$ with $K$ to convert from one to the other.

A.3.3.1 The $\wp_{J_0}$-Projection

The proper (minimum allowable smoothness/largest possible space) space of vector functions that can be considered here is called $H(\mathrm{Div})$, the space of vectors in $L^2$ whose divergence is also in $L^2$; this space was apparently introduced by P. Raviart (see Thomasset, 1981, p. 94). See also, for example, Temam (1984, p. 5), Girault and Raviart (1986, p. 26), Arnold (1990), or Brezzi and Fortin (1991, p. 18) for detailed descriptions of this space that contains $\mathbf{H}^1$ as a subspace. Such spaces permit the use of non-conforming elements because (viewed from $\mathbf{H}^1$) the tangential velocity is permitted to be discontinuous across element boundaries. But, even though some mathematicians may cringe at the notion, we shall only permit our 'velocities' to lie in (be restricted to) the smaller space $\mathbf{H}^1$; i.e., the space of conforming elements in which all first derivatives are in $L^2$. They may cringe because the function-analytic stability properties of the divergence-free subspaces of this space (velocity-pressure combinations, in simple language) are not fully understood (or at least that is our reading of the situation). We use $\mathbf{H}^1$ because we, and
many others, use this space daily in computations, and mostly with success. Thus, while it may have some theoretical deficiencies (as might its finite-dimensional subspace in the limit $h \to 0$), we opt for the practical side and permit, for example, the $Q_1Q_0$ finite element basis (and avoid the $h \to 0$ cases!).

So, given a bounded domain $\Omega$ containing a vector field $\mathbf{u}(\mathbf{x}) \in \mathbf{H}^1$, we are interested in finding the closest (in the $L^2$-norm) divergence-free function to this vector field, say $\mathbf{u}_0$, subject to the constraint that $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}$ on $\Gamma_D$, where $\mathbf{n}\cdot\mathbf{w}$ is a given function on $\Gamma_D$, which itself is a portion of $\partial\Omega = \Gamma$. We remark that whereas a scalar $L^2$-projection need not be subjected to any 'boundary' conditions, the divergence-free constraint associated with the vector $L^2$-projections introduces a differential operator and, with it, the necessity of finding and using 'appropriate' BC's. That this is reasonable for us is related to the fact that our final 'product' will be an incompressible flow in $\Omega$ whose normal component is 'controlled' on at least a portion of $\partial\Omega$. We also remark that if $\Gamma_D$ is all of $\Gamma$, then $\mathbf{n}\cdot\mathbf{w}$ must satisfy $\int_\Gamma \mathbf{n}\cdot\mathbf{w} = 0$, and that $\mathbf{n}\cdot\mathbf{w} \ne \mathbf{n}\cdot\mathbf{u}$ in general.

Rather than considering the elusive-in-practice divergence-free spaces of vector fields, we a priori introduce a Lagrange multiplier ($\lambda$) and seek a constrained extremum via a saddle-point problem, consistent with our 'theme' of mixed interpolation. Thus, we introduce the (non-quadratic) functional

$$F(\mathbf{v}, \lambda) = \tfrac{1}{2}\int (\mathbf{v} - \mathbf{u})\cdot(\mathbf{v} - \mathbf{u}) - \int \lambda\,\nabla\cdot\mathbf{v}, \tag{A.3.3-1}$$

where $\mathbf{v} \in \mathbf{H}^1_E$ and $\lambda \in L^2$; $\mathbf{H}^1_E$ is that subset of $\mathbf{H}^1$ whose functions satisfy $\mathbf{n}\cdot\mathbf{v} = \mathbf{n}\cdot\mathbf{w}$ on $\Gamma_D$, and $\mathbf{H}^1_0$ is the subspace of $\mathbf{H}^1$ with $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_D$. (The minus sign is chosen for 'convenience' only; for either choice of sign, the functional takes on both positive and negative values.) Seeking a stationary point (critical point) of $F(\mathbf{v}, \lambda)$ via $\delta F(\mathbf{v}, \lambda)$
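$= 0$ and calling the critical values of $\mathbf{v}$ and $\lambda$, $\mathbf{u}_0$ and $\lambda_0$, respectively, gives

$$\begin{aligned}
0 &= \int (\mathbf{u}_0 - \mathbf{u})\cdot\delta\mathbf{v} - \int \lambda_0\nabla\cdot\delta\mathbf{v} - \int \delta\lambda\,\nabla\cdot\mathbf{u}_0 \\
  &= \int (\mathbf{u}_0 - \mathbf{u})\cdot\delta\mathbf{v} - \int \nabla\cdot(\lambda_0\delta\mathbf{v}) + \int \delta\mathbf{v}\cdot\nabla\lambda_0 - \int \delta\lambda\,\nabla\cdot\mathbf{u}_0 \\
  &= \int (\mathbf{u}_0 - \mathbf{u} + \nabla\lambda_0)\cdot\delta\mathbf{v} - \int_{\Gamma_N} \lambda_0\,\mathbf{n}\cdot\delta\mathbf{v} - \int_{\Gamma_D} \lambda_0\,\mathbf{n}\cdot\delta\mathbf{v} - \int \delta\lambda\,\nabla\cdot\mathbf{u}_0,
\end{aligned} \tag{A.3.3-2}$$

where $\Gamma_N = \Gamma - \Gamma_D$. Now, since $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}$ on $\Gamma_D$, $\mathbf{n}\cdot\delta\mathbf{v} = 0$ there. Also, realizing that $\delta\mathbf{v}$ and $\delta\lambda$ are independent variations gives the pair of variational equations,

$$\int (\mathbf{u}_0 - \mathbf{u} + \nabla\lambda_0)\cdot\delta\mathbf{v} - \int_{\Gamma_N} \lambda_0\,\mathbf{n}\cdot\delta\mathbf{v} = 0 \tag{A.3.3-3}$$

and

$$-\int \delta\lambda\,\nabla\cdot\mathbf{u}_0 = 0, \tag{A.3.3-4}$$

which is the final statement of the saddle-point problem. The arbitrariness of the variations finally leads to the so-called Euler-Lagrange equations for $\mathbf{u}_0$ and $\lambda_0$ that describe the projection (proven below) of $\mathbf{u}$ to the closest (in $L^2$) divergence-free subspace,

$$\mathbf{u} = \mathbf{u}_0 + \nabla\lambda_0 \quad\text{and}\quad \nabla\cdot\mathbf{u}_0 = 0 \quad\text{in } \Omega, \tag{A.3.3-5}$$

$$\lambda_0 = 0 \quad\text{on } \Gamma_N, \tag{A.3.3-6}$$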
and, of course,

$$\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w} \quad\text{on } \Gamma_D. \tag{A.3.3-7}$$

Remarks:
(1) This is sometimes called a Helmholtz-Weyl decomposition of a given vector field into the sum of its divergence-free and curl-free parts; see, for example, Galdi (1994).
(2) The BC $\lambda_0 = 0$ on $\Gamma_N$ is actually an NBC. Also on $\Gamma_N$, there is a 'built-in' constraint on $\mathbf{u}_0$; namely, whereas the normal component of $\mathbf{u}_0$ is free to change, $\boldsymbol{\tau}\cdot\mathbf{u}_0 = \boldsymbol{\tau}\cdot\mathbf{u}$ because $\boldsymbol{\tau}\cdot\nabla\lambda_0 = 0$ there; the tangential components of velocity are not allowed to change on $\Gamma_N$. [This restriction could be removed by specifying $\lambda_0$ on $\Gamma_N$ such that $\partial\lambda_0/\partial\tau = \boldsymbol{\tau}\cdot\mathbf{u}_0 = g$, where $g$ is given; i.e., replacing $\lambda_0 = 0$ by $\lambda_0 = \lambda_E = \int_\Gamma g$ on $\Gamma_N$ puts a weak Dirichlet BC on $\lambda_0$ and results in a projected velocity that satisfies a specified tangential velocity on $\Gamma_N$. But we shall usually opt for the simpler situation in the sequel, leaving the more general case as an exercise. And with this exercise, we pose another, to the FDM expert who is able to solve the NSE with $\nabla P$ being a gradient everywhere (even at 'outflows'): how would you solve the above system for $\mathbf{u}$ and $\lambda$ wherein $\mathbf{u}\cdot\boldsymbol{\tau}$ is specified on $\Gamma_N$ and (of course) the solution is discretely divergence-free everywhere?]
(3) The solution to (A.3.3-5) through (A.3.3-7) is a saddle point of (A.3.3-1); it minimizes the kinetic energy of the difference between $\mathbf{u}$ and $\mathbf{u}_0$, and $\lambda_0$ maximizes $F(\mathbf{v}, \lambda_0)$ at $\mathbf{v} = \mathbf{u}_0$.
(4) If $\nabla\cdot\mathbf{u} = 0$ in $\Omega$ and $\mathbf{n}\cdot\mathbf{u} = \mathbf{n}\cdot\mathbf{w}$ on $\Gamma_D$, the solution turns out to be $\lambda_0 = 0$ and $\mathbf{u}_0 = \mathbf{u}$ (see below); the function $\mathbf{u}$ is already in the subspace.
(5) If $\mathbf{u} = 0$, then we have a saddle-point formulation of potential flow; see Section 3.15.
(6) Clearly we have violated the original assertion that $\lambda \in L^2$; we will correct this 'carelessness' soon, after briefly investigating the situation in which even more regularity is required.
If we demand or assume additional smoothness, we obtain a classical formulation, whose 'formal' representation will also be useful in our search for the projection operator; i.e., from (A.3.3-5) through (A.3.3-7) follows

$$\nabla^2\lambda_0 = \nabla\cdot\mathbf{u} \quad\text{in } \Omega, \tag{A.3.3-8}$$

$$\lambda_0 = 0 \quad\text{on } \Gamma_N, \tag{A.3.3-9}$$

$$\partial\lambda_0/\partial n = \mathbf{n}\cdot(\mathbf{u} - \mathbf{w}) \quad\text{on } \Gamma_D, \tag{A.3.3-10}$$

which can be solved for the Lagrange multiplier, from which the desired vector field,

$$\mathbf{u}_0 = \mathbf{u} - \nabla\lambda_0 \quad\text{in } \Omega, \tag{A.3.3-11}$$

is finally obtained. (Note the sequential solution procedure: first $\lambda_0$, then $\mathbf{u}_0$.) The formal solution of (A.3.3-8) through (A.3.3-11) is obtained by inverting the Laplacian in (A.3.3-8) through (A.3.3-10) and placing the result in (A.3.3-11):

$$\lambda_0 = (\nabla^2)^{-1}\nabla\cdot\mathbf{u} = \Delta^{-1}\nabla\cdot\mathbf{u}, \tag{A.3.3-12}$$

where we have switched notation ($\Delta \equiv \nabla^2$) for notational convenience, and we remark that $\lambda_0$ from (A.3.3-12) comes with the BC's [(A.3.3-9) and (A.3.3-10)] 'built-in', again,
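formally. Then,

$$\mathbf{u}_0 = \mathbf{u} - \nabla\Delta^{-1}\nabla\cdot\mathbf{u} = (I - \nabla\Delta^{-1}\nabla\cdot)\,\mathbf{u} = \wp_{J_0}\mathbf{u}, \tag{A.3.3-13}$$

where $\wp_{J_0}$ signifies an $L^2$($H^0$)-projection onto the divergence-free subspace $J$; i.e., the subscript on $J$ refers to the type of projection used to get there ($J_1$ will then be an $H^1$-projection onto $J$). Thus, we assert that we have found our (first) projection operator: $\wp_{J_0} = I - \mathrm{grad}\,\Delta^{-1}\,\mathrm{div}$ projects (in $L^2$) a given vector field onto the divergence-free subspace [and the result satisfies (A.3.3-7)]. $\wp_{J_0}$ is an operator that, when applied to $\mathbf{u}$, 'subtracts off' that portion of $\mathbf{u}$ that is not divergence-free; i.e., it subtracts the curl-free portion, a gradient. Defining additionally

$$Q_{J_0} = I - \wp_{J_0} = \nabla\Delta^{-1}\nabla\cdot, \tag{A.3.3-14}$$

leads to $\nabla\lambda_0 = Q_{J_0}\mathbf{u}$, the gradient part of the decomposition, so that we have

$$\mathbf{u} = \mathbf{u}_0 + \nabla\lambda_0 = \wp_{J_0}\mathbf{u} + Q_{J_0}\mathbf{u}. \tag{A.3.3-15}$$

It is immediate to demonstrate that $\wp_{J_0}^2 = \wp_{J_0}$ and $Q_{J_0}^2 = Q_{J_0}$ as well as $\nabla\cdot\mathbf{u}_0 = 0$; what is more interesting is to examine orthogonality: is $\wp_{J_0}\mathbf{u} \perp_0 Q_{J_0}\mathbf{u}$? Or, perhaps more clumsily: does $\|\mathbf{u}\|_0^2 = \|\wp_{J_0}\mathbf{u}\|_0^2 + \|Q_{J_0}\mathbf{u}\|_0^2 = \|\mathbf{u}_0\|_0^2 + \|\mathbf{u} - \mathbf{u}_0\|_0^2$? The answer is: only if $\mathbf{n}\cdot\mathbf{w} = 0$ (i.e., only when the subset of $\mathbf{H}^1$ becomes also a subspace of $\mathbf{H}^1$).

Proof:

$$\|\mathbf{u}\|_0^2 = \int \mathbf{u}\cdot\mathbf{u} = \int (\mathbf{u}_0 + \nabla\lambda_0)\cdot(\mathbf{u}_0 + \nabla\lambda_0) = \int \mathbf{u}_0\cdot\mathbf{u}_0 + \int \nabla\lambda_0\cdot\nabla\lambda_0 + 2\int \mathbf{u}_0\cdot\nabla\lambda_0.$$

But $\nabla\cdot\mathbf{u}_0 = 0$ and thus $\int \mathbf{u}_0\cdot\nabla\lambda_0 = \int \nabla\cdot(\lambda_0\mathbf{u}_0) = \int_{\Gamma_D} \lambda_0\,\mathbf{n}\cdot\mathbf{w}$; i.e.,

$$\|\mathbf{u}\|_0^2 = \|\mathbf{u}_0\|_0^2 + \|\nabla\lambda_0\|_0^2 + 2\int_{\Gamma_D}\lambda_0\,\mathbf{n}\cdot\mathbf{w} = \|\wp_{J_0}\mathbf{u}\|_0^2 + \|Q_{J_0}\mathbf{u}\|_0^2 + 2\int_{\Gamma_D}\lambda_0\,\mathbf{n}\cdot\mathbf{w}. \tag{A.3.3-16}$$

The $L^2$-projection is only $\perp_0$ when $\mathbf{u}_0$ is parallel to $\Gamma_D$, or if $\Gamma_D = \emptyset$ [no essential BC, a situation that is interesting in that the $\lambda_0 = 0$ BC constrains the tangential component(s) of $\mathbf{u}_0$ on $\Gamma$; see Remark (2) below (A.3.3-7)]. [See Chorin and Marsden (1992) for additional discussion.]

Now that we have identified the projection, and its 'strong' formulation [(A.3.3-8) through (A.3.3-11)], let us get more realistic and obtain the weak formulation of the projection.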
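Before doing so, the formal operator $I - \mathrm{grad}\,\Delta^{-1}\,\mathrm{div}$ of (A.3.3-13) can be tried out in the one setting where $\Delta^{-1}$ is trivial to apply: a doubly periodic square, where every operator is diagonal in Fourier space. This is an assumption made purely for illustration; there is then no $\Gamma_D$ or $\Gamma_N$, so it corresponds loosely to the $\Gamma_D = \emptyset$ case and says nothing about the boundary terms in (A.3.3-16). The helper names below are ours, and an odd grid size is used to sidestep the unpaired Nyquist mode.

```python
import numpy as np

def wavenumbers(n):
    """Integer wavenumber grids for an n x n, 2*pi-periodic box."""
    k = np.fft.fftfreq(n, d=1.0 / n)
    return np.meshgrid(k, k, indexing="ij")

def helmholtz_project(u, v):
    """Apply I - grad (Laplacian)^{-1} div mode by mode in Fourier space."""
    kx, ky = wavenumbers(u.shape[0])
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0                        # mean mode has no gradient part
    uh, vh = np.fft.fft2(u), np.fft.fft2(v)
    s = (kx * uh + ky * vh) / k2          # equals i * lambda_hat
    return np.fft.ifft2(uh - kx * s).real, np.fft.ifft2(vh - ky * s).real

def divergence(u, v):
    """Spectral divergence of a periodic field."""
    kx, ky = wavenumbers(u.shape[0])
    return np.fft.ifft2(1j * (kx * np.fft.fft2(u) + ky * np.fft.fft2(v))).real

rng = np.random.default_rng(0)
n = 33                                    # odd, so no unpaired Nyquist mode
u, v = rng.standard_normal((n, n)), rng.standard_normal((n, n))
u0, v0 = helmholtz_project(u, v)
u00, v00 = helmholtz_project(u0, v0)      # project a second time

max_div = np.abs(divergence(u0, v0)).max()
cross = np.sum(u0 * (u - u0) + v0 * (v - v0))   # discrete (u0, grad lambda0)
print(max_div, cross, np.allclose(u0, u00))
```

In this boundary-free sketch the projected field is divergence-free to round-off, a second application changes nothing, and the cross term vanishes, so the two parts are $L^2$-orthogonal.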
It is important to point out that, while either approach is seemingly legitimate, the most 'appropriate' weak formulation begins with the 'strong' formulation given by (A.3.3-5) through (A.3.3-7) and not the stronger(?) one given by (A.3.3-8) through (A.3.3-11). The reason for this is that the finite-dimensional approximation to the weak
form of (A.3.3-8) through (A.3.3-11) would, in a sense, 'lose track' of the $\nabla\cdot\mathbf{u}_0 = 0$ constraint, with the result that a $\lambda_0$ obtained from the weak formulation of (A.3.3-8) through (A.3.3-10), which reads: 'Find $\lambda_0 \in H^1_0$ such that

$$\int \nabla\lambda_0\cdot\nabla\phi = -\int \phi\,\nabla\cdot\mathbf{u} \quad \forall\phi \in H^1_0, \tag{A.3.3-17}$$

where functions in $H^1_0 \subset H^1$ vanish on $\Gamma_N$,' will not lead, in the finite-dimensional case, to a projected velocity, from (A.3.3-11), that will satisfy any readily identifiable version of $\nabla\cdot\mathbf{u}_0 = 0$.

So, we return to the 'primitive' equations in the 'mixed' formulation, (A.3.3-5) through (A.3.3-7), this time in the weak formulation. Let $\mathbf{H}^1_0$ be a subspace of $\mathbf{H}^1$ such that $\mathbf{v} \in \mathbf{H}^1_0 \Rightarrow \mathbf{v} \in \mathbf{H}^1$ and $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_D$. Next, let $\mathbf{w}$ be a given $\mathbf{H}^1$-extension of a vector field from $\Gamma_D$ into $\Omega$ whose normal component on $\Gamma_D$ equals the given $\mathbf{n}\cdot\mathbf{w}$. Then, to find our second $L^2$-projection, $\mathbf{u}_0 = \wp_{J_0}\mathbf{u}$, we proceed as follows: Find $\mathbf{u}_0^0 \in \mathbf{H}^1_0$ and $\lambda_0 \in L^2$ from

$$\int (\mathbf{u}_0^0 + \mathbf{w} - \mathbf{u})\cdot\mathbf{v} - \int \lambda_0\nabla\cdot\mathbf{v} = 0 \quad \forall\mathbf{v} \in \mathbf{H}^1_0 \tag{A.3.3-18}$$

and

$$-\int q\,\nabla\cdot(\mathbf{u}_0^0 + \mathbf{w}) = 0 \quad \forall q \in L^2, \tag{A.3.3-19}$$

where integration by parts of $\mathbf{v}\cdot\nabla\lambda_0$ (along with $\mathbf{n}\cdot\mathbf{v} = 0$ on $\Gamma_D$ and $\lambda_0 = 0$ on $\Gamma_N$) has been employed to reduce the differentiability requirements on $\lambda_0$ to those of great interest in the GFEM; namely, none. Note, too, that just as in the scalar $H^1$-case, inhomogeneous Dirichlet BC's (the 'essential' variety) must be given special attention in order to invoke the powerful machinery inherent in linear vector spaces. We remark that $\mathbf{w}$ need not be divergence-free, and it generally is not, a feature that tends to 'obscure' the best approximation property of the projection, as we shall see.

Rearrangement of (A.3.3-18) and (A.3.3-19) to place the data on the RHS leads to a nicer form from which to launch our GFEM approximation: Find $\mathbf{u}_0^0 \in \mathbf{H}^1_0$ and $\lambda_0 \in L^2$ from

$$\int \mathbf{u}_0^0\cdot\mathbf{v} - \int \lambda_0\nabla\cdot\mathbf{v} = \int (\mathbf{u} - \mathbf{w})\cdot\mathbf{v} \quad \forall\mathbf{v} \in \mathbf{H}^1_0 \tag{A.3.3-20}$$

and

$$-\int q\,\nabla\cdot\mathbf{u}_0^0 = \int q\,\nabla\cdot\mathbf{w} \quad \forall q \in L^2. \tag{A.3.3-21}$$

Remark: Recall that this solution gives $\boldsymbol{\tau}\cdot\mathbf{u}_0 = \partial\lambda_0/\partial\tau = 0$ on $\Gamma_N$.
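If instead $\Gamma_N$ has a nonzero specified tangential velocity, say $\boldsymbol{\tau}\cdot\mathbf{u}_0 = g$, there needs to be added to the RHS of (A.3.3-20) the following boundary integral: $-\int_{\Gamma_N} (\mathbf{n}\cdot\mathbf{v})\lambda_E$, where $\lambda_E = \int_\Gamma g$ comes from $\partial\lambda_E/\partial\tau = g$. This is the weak form of a Dirichlet BC referred to earlier, in Remark (2) below (A.3.3-7).

Once $\mathbf{u}_0^0$ is available, we have

$$\mathbf{u}_0 = \mathbf{u}_0^0 + \mathbf{w} = \wp_{J_0}\mathbf{u}, \tag{A.3.3-22}$$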
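and the projection proof ($\wp_{J_0}^2 = \wp_{J_0}$) follows easily: place $\mathbf{u}_0$ into (A.3.3-20) in place of $\mathbf{u}$ and realize that the same $\mathbf{u}_0^0$, together with $\lambda_0 = 0$, solves the pair; thus, $\wp_{J_0}\mathbf{u}_0 = \mathbf{u}_0 = \wp_{J_0}^2\mathbf{u}$.

Before leaving the $\infty$-dimensional case, let us again examine orthogonality, or lack thereof. We have (weakly)

$$\mathbf{u} = \mathbf{u}_0^0 + \mathbf{w} + \nabla\lambda_0 = \mathbf{u}_0 + \nabla\lambda_0 = \wp_{J_0}\mathbf{u} + Q_{J_0}\mathbf{u}, \tag{A.3.3-23}$$

which leads to

$$\int \mathbf{u}\cdot\mathbf{u} = \int \mathbf{u}_0\cdot\mathbf{u}_0 + \int \nabla\lambda_0\cdot\nabla\lambda_0 + 2\int \mathbf{u}_0\cdot\nabla\lambda_0,$$

which, using $\int \lambda_0\nabla\cdot\mathbf{u}_0 = 0$ from (A.3.3-21) with $q = \lambda_0$, $\mathbf{n}\cdot\mathbf{u}_0 = \mathbf{n}\cdot\mathbf{w}$ on $\Gamma_D$, and $\lambda_0 = 0$ on $\Gamma_N$ yields, for the last term on the RHS,

$$\int \mathbf{u}_0\cdot\nabla\lambda_0 = \int_\Gamma \lambda_0\,\mathbf{n}\cdot\mathbf{u}_0 - \int \lambda_0\nabla\cdot\mathbf{u}_0 = \int_{\Gamma_D} \lambda_0\,\mathbf{n}\cdot\mathbf{w},$$

which leads to

$$\|\mathbf{u}\|_0^2 = \|\mathbf{u}_0\|_0^2 + \|\nabla\lambda_0\|_0^2 + 2\int_{\Gamma_D} \lambda_0\,\mathbf{n}\cdot\mathbf{w}, \tag{A.3.3-24}$$

showing that, analogous to (A.3.3-16), $\mathbf{u}_0 \perp_0 \nabla\lambda_0$ if and only if $\mathbf{n}\cdot\mathbf{w} = 0$; i.e., we (again) need $\mathbf{n}\cdot\mathbf{w} = 0$ on $\Gamma_D$ in order to have $\perp_0$; i.e., if $\Gamma_D \ne \emptyset$, then the flow must be parallel to $\Gamma_D$. But, as in the scalar $H^1$-case, an alternate version of orthogonality is also of interest: start from $\mathbf{u} - \mathbf{w} = \mathbf{u}_0^0 + \nabla\lambda_0$ and form $\int (\mathbf{u} - \mathbf{w})^2$ to obtain

$$\|\mathbf{u} - \mathbf{w}\|_0^2 = \|\mathbf{u}_0^0\|_0^2 + \|\nabla\lambda_0\|_0^2 + 2\int \mathbf{u}_0^0\cdot\nabla\lambda_0,$$

where

$$\int \mathbf{u}_0^0\cdot\nabla\lambda_0 = \int_\Gamma \lambda_0\,\mathbf{n}\cdot\mathbf{u}_0^0 - \int \lambda_0\nabla\cdot\mathbf{u}_0^0 = \int \lambda_0\nabla\cdot\mathbf{w},$$

to give

$$\|\mathbf{u} - \mathbf{w}\|_0^2 = \|\mathbf{u}_0 - \mathbf{w}\|_0^2 + \|\nabla\lambda_0\|_0^2 + 2\int \lambda_0\nabla\cdot\mathbf{w}, \tag{A.3.3-25}$$

and we obtain an orthogonal decomposition/projection of $\mathbf{u} - \mathbf{w}$ to $\mathbf{u}_0 - \mathbf{w} + \nabla\lambda_0$ if and only if $\mathbf{w}$ is divergence-free; i.e., a divergence-free $\mathbf{H}^1$-extension of $\mathbf{n}\cdot\mathbf{w}$ from $\Gamma_D$ into $\Omega$ will generate an $\perp_0$-projection (of $\mathbf{u} - \mathbf{w}$), even when $\mathbf{n}\cdot\mathbf{w} \ne 0$. With that, we have reached the end of the $\wp_{J_0}$-projection discussion.

A.3.3.2 The $\wp^h_{J_0}$-Projection

It is now relatively easy, as usual, to specialize from an $\infty$-dimensional setting to the finite-dimensional one via our FEM basis functions, which we present, for 'convenience,' in 2D only. And here again, somewhat in violation of the advice provided by many of our FEM mathematician friends, we permit (and indeed, compute with) ostensibly unstable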
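'element pairs,' such as $Q_1Q_0$, $Q_1Q_{-1}$, etc. But this issue is of minor concern relative to our current discussion in that our presentation would not be affected in any appreciable way if we restricted attention to only 'stable' elements, which the reader is welcome to do.

So, consider now the finite-dimensional subspaces $\mathbf{V}^h \subset \mathbf{H}^1$ [or $H(\mathrm{Div})$] for velocity and $S^h \subset L^2$ for pressure as well as the additional subspace, $J^h \subset \mathbf{V}^h$, of discretely divergence-free vectors; and seek a GFEM solution to (A.3.3-20) and (A.3.3-21) via

$$\mathbf{u}_0^h = \mathbf{u}^h + \mathbf{u}^I \in J^h, \tag{A.3.3-26}$$

where $\mathbf{u}_0^h(\mathbf{x})$ is the projected (weakly) divergence-free velocity,

$$\mathbf{u}^h = \begin{pmatrix} u^h \\ v^h \end{pmatrix} = \sum_{j=1}^{N} \begin{pmatrix} u_j \\ v_j \end{pmatrix}\varphi_j = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix}, \tag{A.3.3-27}$$

where $N$ is the total number of nodes in $\Omega$, and on $\Gamma_D$, $\mathbf{u}^I$ is the $\mathbf{H}^1$-extension (corresponding to $\mathbf{w}$) given by interpolation (another projection):

$$\mathbf{u}^I = \sum_{j \in \Gamma_D} (\mathbf{n}\cdot\mathbf{w})_j\,\mathbf{n}_j\varphi_j \in \mathbf{H}^1, \tag{A.3.3-28}$$

where $(\mathbf{n}\cdot\mathbf{w})_j$ is the (specified) value of $\mathbf{n}\cdot\mathbf{w}$ at node $j$, and

$$\mathbf{n}_j = \int \nabla\varphi_j \Big/ \left|\int \nabla\varphi_j\right| \tag{A.3.3-29}$$

is the mass-consistent unit normal at node $j$ (see Section 3.13.1e). Note that $\mathbf{u}^I$ is generally not divergence-free because the velocity basis functions are not. Note also that the 'proper' (mass consistent) value of the normal component of $\mathbf{u}^I$ is only obtained via $\mathbf{n}_j\cdot\mathbf{u}^I$ with $\mathbf{n}_j$ from (A.3.3-29). And note, finally, that (A.3.3-29) is also required in order to assure that $\mathbf{u}^h \in \mathbf{H}^1_0$; i.e., $\mathbf{n}\cdot\mathbf{u}^h = 0$ on $\Gamma_D$ can only be assured using the consistent normal. We also need

$$\lambda_0^h = \sum_{j=1}^{N_p} \lambda_j\psi_j \in S^h, \tag{A.3.3-30}$$

where $N_p$ is the total number of 'pressure' nodes, after which the combination of (A.3.3-20) and (A.3.3-21) with $\mathbf{v} = (\varphi_i, \varphi_i)^T$, $i = 1, 2, \ldots, N$ and $q = \psi_i$, $i = 1, 2, \ldots, N_p$ there, and equations (A.3.3-26) through (A.3.3-29), give

$$\sum_{j=1}^{N} u_j \int \varphi_i\varphi_j - \sum_{j=1}^{N_p} \lambda_j \int \frac{\partial\varphi_i}{\partial x}\psi_j = \int (u - u^I)\varphi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.3-31}$$

$$\sum_{j=1}^{N} v_j \int \varphi_i\varphi_j - \sum_{j=1}^{N_p} \lambda_j \int \frac{\partial\varphi_i}{\partial y}\psi_j = \int (v - v^I)\varphi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.3-32}$$

and

$$-\sum_{j=1}^{N} \left(u_j \int \psi_i\frac{\partial\varphi_j}{\partial x} + v_j \int \psi_i\frac{\partial\varphi_j}{\partial y}\right) = \int \psi_i\nabla\cdot\mathbf{u}^I, \quad i = 1, 2, \ldots, N_p, \tag{A.3.3-33}$$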
the set of $2N + N_p$ GFEM equations of the $L^2$-projection, and this may be a good time to make the following important remark: even if $\mathbf{u}$ is 'perfectly' divergence-free ($\nabla\cdot\mathbf{u} = 0$), its projection ($\mathbf{u}_0^h$) will not generally be, which leads to two further remarks: (i) this projection is also a transformation (mapping) in that it transforms a divergence-free vector to a discretely divergence-free vector, and (more importantly) (ii) a (seemingly 'superior') perfectly divergence-free vector field must be projected to its ('inferior') discretely divergence-free counterpart in order to provide a legitimate IC for time-integration of the GFEM NS equations. This is because $J^h$ is not a subset of $J$, even though $J^h \subset \mathbf{V}^h$ and $\mathbf{V}^h \subset \mathbf{H}^1$ and $J \subset \mathbf{H}^1$.

Before proceeding, it is worthwhile noting that omitting the Lagrange multiplier terms in (A.3.3-31) and (A.3.3-32) and dropping (A.3.3-33) recovers the familiar $L^2$-projections of the scalar case, one for $u$ and one for $v$, uncoupled [albeit with some unnecessary (but permissible) BC's/constraints]; i.e., the RHS's of (A.3.3-31) and (A.3.3-32) could be rewritten in terms of a (non-divergence-free) discrete $L^2$-projection with nodal values $(\tilde{u}, \tilde{v})$ via

$$\begin{pmatrix} M_x & 0 \\ 0 & M_y \end{pmatrix}\begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix} = \begin{pmatrix} M_x\tilde{u} \\ M_y\tilde{v} \end{pmatrix} = \begin{pmatrix} \int (u - u^I)\varphi \\ \int (v - v^I)\varphi \end{pmatrix}, \tag{A.3.3-34}$$

which here defines $(\tilde{u}, \tilde{v})$ and can be compared to (A.3.2-11) through (A.3.2-13).

The Venn/potato diagram in Figure A.3.3-1 depicts some of the preceding observations, showing some projections of two functions in $\mathbf{H}^1$, one divergence-free ($\mathbf{u}_1$) and the other not ($\mathbf{u}_2$); the 'suggestion' in the diagram that the projection onto $J^h$ can be written as a product of two projections will be proven later, in Section A.3.3.5.

Fig. A.3.3-1 Some $L^2$-projections, non-orthogonal in general.

The condensed form of (A.3.3-31) through (A.3.3-33) will also be useful:

$$M_xu + C_x\lambda = b_x = M_x\tilde{u}, \tag{A.3.3-35}$$

$$M_yv + C_y\lambda = b_y = M_y\tilde{v}, \tag{A.3.3-36}$$

$$C_x^Tu + C_y^Tv = g, \tag{A.3.3-37}$$
with the obvious definitions of matrices and vectors; because of our simplified choice of BC's, $M_x = M_y$. Solving (A.3.3-35) and (A.3.3-36) for $u$ and $v$ and placing the results into (A.3.3-37) yields the discrete equation for the Lagrange multiplier,

$$(C_x^TM_x^{-1}C_x + C_y^TM_y^{-1}C_y)\lambda = C^TM^{-1}C\lambda = A\lambda = C_x^T\tilde{u} + C_y^T\tilde{v} - g, \tag{A.3.3-38}$$

which is the discrete analog of (A.3.3-12); $A \sim -\nabla^2$ and $M^{-1}C \sim \nabla$.

Remarks:
(1) If $\mathbf{u}$ is interpolated into the basis functions and the mass then lumped on both sides of (A.3.3-35) and (A.3.3-36), we would obtain the (relatively inexpensive via sequential solutions) discretely divergence-free interpolation projection, say $\wp^h_{J_I}$, which is related to a commonly used projection method when solving the time-dependent Navier-Stokes equations (see Section 3.16.6c). [If the mass is not lumped after the interpolation, then a different (and more expensive) divergence-free projection is obtained.]
(2) As in the continuous case, setting $\mathbf{u} = 0$ yields a mixed interpolation approximation to potential flow.

Solving for $\lambda$ and placing the result in (A.3.3-35) and (A.3.3-36) gives

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} I_x - M_x^{-1}C_xA^{-1}C_x^T & -M_x^{-1}C_xA^{-1}C_y^T \\ -M_y^{-1}C_yA^{-1}C_x^T & I_y - M_y^{-1}C_yA^{-1}C_y^T \end{pmatrix}\begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix} + \begin{pmatrix} M_x^{-1}C_xA^{-1}g \\ M_y^{-1}C_yA^{-1}g \end{pmatrix} \tag{A.3.3-39}$$

or, in further condensed form,

$$A\lambda = C^T\tilde{u} - g, \tag{A.3.3-40}$$

$$u = \tilde{u} - M^{-1}CA^{-1}(C^T\tilde{u} - g), \tag{A.3.3-41}$$

or

$$u = (I - M^{-1}CA^{-1}C^T)\tilde{u} + M^{-1}CA^{-1}g \tag{A.3.3-42}$$

$$\phantom{u} = P_0\tilde{u} + M^{-1}CA^{-1}g, \tag{A.3.3-43}$$

where $P_0$ is the $L^2$-projection matrix; $P_0^2 = P_0$ ($P_0$ is a projection), $C^TP_0 = 0$ ($P_0$ is also a divergence-free projection), and $P_0M^{-1}C = 0$ (gradients project to zero). As noted earlier, the eigenvalues of any projection operator are either zero or one. In the case of $P_0$, the zero eigenvalues (of which there are $N_p$ when $C$ has full rank) correspond to eigenvectors that are gradients of scalars (the null space of $P_0$: $x = M^{-1}Cq$ for some (non-trivial) $q$ has $P_0x = 0$), and the unit eigenvalues (of which there are $2N - N_p$) correspond to divergence-free eigenvectors (at least when $C$ has full rank; see Section 3.13.2b).

Finally, we note that $P_0$ is an orthogonal projection via the discrete $L^2$-inner product, $(x, y) = x^TMy$; i.e., with $Q_0 = I - P_0$,

$$(P_0\tilde{u}, Q_0\tilde{u}) = (P_0\tilde{u})^TM(Q_0\tilde{u}) = (P_0\tilde{u})^TMM^{-1}CA^{-1}C^T\tilde{u} = (C^TP_0\tilde{u})^TA^{-1}C^T\tilde{u} = 0$$

because $C^TP_0 = 0$. Thus, $P_0$ is an '$L^2$-orthogonal divergence-free projection' matrix.

Remark: This $L^2$-projection can be closely realized via the BE or STR time integration method by taking one very small time step in which $\lambda = \Delta t\,P$; see Section 3.16.1d.
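The algebraic properties just listed depend only on $M$ being symmetric positive-definite and $C$ having full column rank, so they can be checked without assembling any finite element matrices. The following sketch uses random stand-ins for $M$ and $C$ (an assumption made for brevity; the sizes `n_vel` and `n_pres` play the roles of $2N$ and $N_p$), and it uses the trace of $P_0$ to count the unit eigenvalues, since the trace of a projector equals its rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vel, n_pres = 12, 4                       # stand-ins for 2N and N_p

B = rng.standard_normal((n_vel, n_vel))
M = B @ B.T + n_vel * np.eye(n_vel)         # SPD stand-in for the mass matrix
C = rng.standard_normal((n_vel, n_pres))    # full-rank stand-in for the gradient matrix

Minv_C = np.linalg.solve(M, C)              # M^{-1} C, the discrete gradient
A = C.T @ Minv_C                            # A = C^T M^{-1} C
P0 = np.eye(n_vel) - Minv_C @ np.linalg.solve(A, C.T)
Q0 = np.eye(n_vel) - P0

u_tilde = rng.standard_normal(n_vel)
checks = {
    "idempotent": np.allclose(P0 @ P0, P0),             # P0^2 = P0
    "divergence_free": np.allclose(C.T @ P0, 0.0),      # C^T P0 = 0
    "gradients_vanish": np.allclose(P0 @ Minv_C, 0.0),  # P0 M^{-1} C = 0
    "unit_eigenvalues": int(round(float(np.trace(P0)))),
    "M_inner_product": float((P0 @ u_tilde) @ M @ (Q0 @ u_tilde)),
}
print(checks)
```

The count of unit eigenvalues should come out as `n_vel - n_pres`, the analog of $2N - N_p$, and the $M$-inner product of $P_0\tilde{u}$ with $Q_0\tilde{u}$ should vanish to round-off.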
Solving (A.3.3-35) through (A.3.3-37) is equivalent to applying (A.3.3-43) to $\tilde{u}$ with the result that $C^Tu = g$; the resulting (discrete) vector field is discretely divergence-free. Also, $Q_0$ yields the gradient part: $M^{-1}C\lambda = Q_0\tilde{u} - M^{-1}CA^{-1}g = M^{-1}CA^{-1}(C^T\tilde{u} - g)$, and we have

$$\tilde{u} = u + M^{-1}C\lambda = P_0\tilde{u} + Q_0\tilde{u}, \tag{A.3.3-44}$$

which mimics (A.3.3-15), except that here the (discrete) projection is $\perp_0$. Note that, especially clearly from (A.3.3-41), if the $\tilde{u}$-field is already (discretely) divergence-free, then the projection does nothing; $u = \tilde{u}$, and (A.3.3-44) shows that, in general, the projection of $\tilde{u}$ 'subtracts out' the gradient portion to leave the divergence-free portion. Testing orthogonality of $u$ and $M^{-1}C\lambda$, however, leads to

$$\|\tilde{u}\|_0^2 = (u + M^{-1}C\lambda)^TM(u + M^{-1}C\lambda) = u^TMu + (M^{-1}C\lambda)^TM(M^{-1}C\lambda) + 2u^TC\lambda;$$

but $u^TC\lambda = \lambda^TC^Tu = \lambda^Tg$, and we thus obtain

$$\|\tilde{u}\|_0^2 = \|u\|_0^2 + \|M^{-1}C\lambda\|_0^2 + 2\lambda^Tg, \tag{A.3.3-45}$$

à la (A.3.3-25); $L^2$-orthogonality between $u$ and $M^{-1}C\lambda$ obtains only for $g = 0$ which, in this case, would require $\mathbf{n}\cdot\mathbf{w} = 0$ (or $\Gamma_D = \emptyset$), since $\mathbf{u}^I$ is not a divergence-free extension. The comparison of (A.3.3-45) with (A.3.3-25) rather than with (A.3.3-24) is appropriate because the discrete vector, $\tilde{u}$, of (A.3.3-44), corresponds to $\mathbf{u} - \mathbf{w}$, $u$ to $\mathbf{u}_0^0$, and $g$ to $\nabla\cdot\mathbf{w}$ in the $\infty$-dimensional case.

This seems like a good place to quote (with a few necessary notational changes) from a nice paper of Strang (1988, p. 283), for the case $g = 0$: 'The first two equations [(A.3.3-35) and (A.3.3-36)] separate $\tilde{u}$ (and $\tilde{v}$) into its components $u$ (and $v$) and $M^{-1}C_x\lambda$ (and $M^{-1}C_y\lambda$). The third equation [(A.3.3-37)] makes these components perpendicular. There is only one solution, one choice from each subspace that will combine to give $\tilde{u}$ (and $\tilde{v}$). It is found from (A.3.3-38).'

Finally, we consider/introduce the continuous projection operator, $\wp^h_{J_0}$, that is induced by the matrix projection, $P_0$; $\wp^h_{J_0}$ is a finite-dimensional $L^2$($H^0$)-projection operator onto the discretely divergence-free subspace, $J^h$.
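As a brief numerical aside, the norm identity (A.3.3-45), including its non-orthogonal cross term $2\lambda^Tg$, can be confirmed by solving the coupled system (A.3.3-35) through (A.3.3-37) 'all at once'. The sketch below again uses random stand-ins for $M$, $C$, $\tilde{u}$ and a nonzero $g$; these are assumptions for illustration only, not assembled FEM matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_vel, n_pres = 10, 3

B = rng.standard_normal((n_vel, n_vel))
M = B @ B.T + n_vel * np.eye(n_vel)         # SPD stand-in for the mass matrix
C = rng.standard_normal((n_vel, n_pres))    # full-rank stand-in for C
u_tilde = rng.standard_normal(n_vel)
g = rng.standard_normal(n_pres)             # nonzero, i.e. inhomogeneous data

# Coupled saddle-point system: [M C; C^T 0] [u; lam] = [M u_tilde; g]
K = np.block([[M, C], [C.T, np.zeros((n_pres, n_pres))]])
sol = np.linalg.solve(K, np.concatenate([M @ u_tilde, g]))
u, lam = sol[:n_vel], sol[n_vel:]
grad = np.linalg.solve(M, C @ lam)          # gradient part M^{-1} C lam

def norm0_sq(x):
    """Squared discrete L2 norm x^T M x."""
    return float(x @ M @ x)

lhs = norm0_sq(u_tilde)
rhs = norm0_sq(u) + norm0_sq(grad) + 2.0 * float(lam @ g)
print(lhs, rhs, np.allclose(C.T @ u, g))
```

Setting `g` to zero in this sketch removes the cross term, which is the discrete statement that orthogonality requires homogeneous data.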
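To do this, we first state the procedure in words, for clarity:
1. Given $(u(\mathbf{x}), v(\mathbf{x}))^T$ and $\mathbf{n}\cdot\mathbf{w}$ on $\Gamma_D$, make a first approximation to the former via $(\tilde{u}^h, \tilde{v}^h)^T = \sum_j (\tilde{u}_j, \tilde{v}_j)^T\varphi_j$ and approximate the latter by $\mathbf{u}^I$ via (A.3.3-28).
2. To obtain $(\tilde{u}_j, \tilde{v}_j)$, perform the (non-divergence-free) $L^2$-projection by solving (A.3.3-34).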
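3. Project $(\tilde{u}_j, \tilde{v}_j)$ to the divergence-free subspace via (A.3.3-39), giving $(u_j, v_j)$.
4. Insert the result into (A.3.3-26) and (A.3.3-27); done.

We have the projection, which is:

$$\wp^h_{J_0}\mathbf{u} \equiv \mathbf{u}_0^h = \mathbf{u}^h + \mathbf{u}^I = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix} + \mathbf{u}^I; \tag{A.3.3-46}$$

i.e.,

$$\begin{pmatrix} u_0^h \\ v_0^h \end{pmatrix} = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{\begin{pmatrix} I_x - M_x^{-1}C_xA^{-1}C_x^T & -M_x^{-1}C_xA^{-1}C_y^T \\ -M_y^{-1}C_yA^{-1}C_x^T & I_y - M_y^{-1}C_yA^{-1}C_y^T \end{pmatrix}\begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix} + \begin{pmatrix} M_x^{-1}C_xA^{-1}g \\ M_y^{-1}C_yA^{-1}g \end{pmatrix}\right\} + \begin{pmatrix} u^I \\ v^I \end{pmatrix} \tag{A.3.3-47}$$

$$\phantom{\begin{pmatrix} u_0^h \\ v_0^h \end{pmatrix}} = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0\begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \begin{pmatrix} u^I \\ v^I \end{pmatrix}, \tag{A.3.3-48}$$

where

$$\begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix} = \begin{pmatrix} M_x^{-1} & 0 \\ 0 & M_y^{-1} \end{pmatrix}\begin{pmatrix} \int (u - u^I)\varphi \\ \int (v - v^I)\varphi \end{pmatrix}. \tag{A.3.3-49}$$

Remark: In practice, of course, the (numerical) solution would be done all-at-once by solving the coupled system (A.3.3-31) through (A.3.3-33), since $M^{-1}$ is dense. Unless we lump the mass, in which case the sequential procedure, (A.3.3-38), (A.3.3-35) and (A.3.3-36), is viable; but the $L^2$-projection is sacrificed, although we still do have a discretely divergence-free (interpolation) projection. In this sense, the 'small-$\Delta t$' BE approximation/trick mentioned in the Remark below (A.3.3-43) is closer to a true $L^2$-projection than is the lumped mass 'sequential procedure'.

To verify (!) the alleged projection (the last one for us), we project again; i.e., $\wp^h_{J_0}\mathbf{u}_0^h = (\wp^h_{J_0})^2\mathbf{u}$ gives

$$\wp^h_{J_0}\mathbf{u}_0^h = \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0M^{-1}\begin{pmatrix} \int \varphi(u_0^h - u^I) \\ \int \varphi(v_0^h - v^I) \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \begin{pmatrix} u^I \\ v^I \end{pmatrix},$$

which, using (A.3.3-47) and (A.3.3-48), gives

$$\begin{aligned}
\wp^h_{J_0}\mathbf{u}_0^h &= \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0M^{-1}\int \begin{pmatrix} \varphi\varphi^T & 0 \\ 0 & \varphi\varphi^T \end{pmatrix}\left[P_0\begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix} + M^{-1}CA^{-1}g\right] + M^{-1}CA^{-1}g\right\} + \begin{pmatrix} u^I \\ v^I \end{pmatrix} \\
&= \begin{pmatrix} \varphi^T & 0 \\ 0 & \varphi^T \end{pmatrix}\left\{P_0\begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix} + M^{-1}CA^{-1}g\right\} + \begin{pmatrix} u^I \\ v^I \end{pmatrix} = \mathbf{u}_0^h,
\end{aligned} \tag{A.3.3-50}$$

because $M^{-1}\int \begin{pmatrix} \varphi\varphi^T & 0 \\ 0 & \varphi\varphi^T \end{pmatrix} = I$, $P_0^2 = P_0$, and $P_0M^{-1}C = 0$. QED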
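The Remark following (A.3.3-49) on mass lumping can be illustrated with a small experiment. The sketch below is a hedged toy, not an assembled 2D problem: it uses the consistent mass matrix of 1D linear elements (with $h = 1$) and its row-sum lumping as stand-ins for $M$, and a random full-rank $C$. The function `constrained_projection` is a hypothetical helper that carries out the sequential procedure of (A.3.3-38) followed by (A.3.3-35) and (A.3.3-36); in real use the lumped $M$ would be inverted trivially rather than through a general solver.

```python
import numpy as np

def constrained_projection(M, C, u_tilde, g):
    """Sequential solve: A lam = C^T u_tilde - g, then u = u_tilde - M^{-1} C lam."""
    Minv_C = np.linalg.solve(M, C)
    lam = np.linalg.solve(C.T @ Minv_C, C.T @ u_tilde - g)
    return u_tilde - Minv_C @ lam, lam

rng = np.random.default_rng(2)
n_vel, n_pres = 9, 3

# Consistent mass of 1D linear elements (h = 1) and its row-sum lumped version.
M = (np.diag(np.full(n_vel, 4.0))
     + np.diag(np.ones(n_vel - 1), 1)
     + np.diag(np.ones(n_vel - 1), -1))
M[0, 0] = M[-1, -1] = 2.0
M /= 6.0
M_lumped = np.diag(M.sum(axis=1))

C = rng.standard_normal((n_vel, n_pres))    # stand-in for the gradient matrix
u_tilde = rng.standard_normal(n_vel)
g = rng.standard_normal(n_pres)

u_consistent, _ = constrained_projection(M, C, u_tilde, g)
u_lumped, _ = constrained_projection(M_lumped, C, u_tilde, g)

difference = np.linalg.norm(u_consistent - u_lumped)
print(np.allclose(C.T @ u_consistent, g), np.allclose(C.T @ u_lumped, g), difference)
```

Both results satisfy $C^Tu = g$, so both are discretely divergence-free in the sense used above, but they differ from one another: only the consistent-mass result is the $L^2$-projection.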
In concluding this section, we assert the truth of the following claim and leave the proof as an exercise: if and only if $\mathbf{n}\cdot\mathbf{w} = 0$ (giving $\mathbf{u}^I = 0$ and therefore $g = 0$) in (A.3.3-47) and (A.3.3-48), it follows that

$$\|\mathbf{u}\|_0^2 = \|\wp^h_{J_0}\mathbf{u}\|_0^2 + \|Q^h_{J_0}\mathbf{u}\|_0^2,$$

where $Q^h_{J_0} \equiv I - \wp^h_{J_0}$; i.e., the projection is $\perp_0$ if the Dirichlet BC is homogeneous [$\mathbf{u}$ is parallel to $\Gamma_D$, à la (A.3.3-16)].

A.3.3.3 The $\wp_{J_1}$-Projection

In this final case of interest for the continuous case there is no ambiguity regarding the smoothness/regularity that is required; the velocities are in $\mathbf{H}^1$, and the 'pressures' (i.e., Lagrange multipliers) can reside in $L^2$; both, of course, when the appropriate equations are put in the appropriate weak form. As for the scalar case, usually we presume that there is always a Dirichlet portion of $\Gamma$ ($\Gamma_D \ne \emptyset$). Thus, given a vector field $\mathbf{u}(\mathbf{x}) \in \mathbf{H}^1$, we seek its closest neighbor in $\mathbf{H}^1$, say $\mathbf{u}_1(\mathbf{x})$, that is both divergence-free and retains the value of $\mathbf{u}$ on $\Gamma_D$; $\mathbf{u}_1 = \mathbf{u}$ on $\Gamma_D$. Combining our knowledge from the $H^1$-scalar case and the $L^2$-vector case just examined permits a rather more expeditious treatment of this case, and that is good because this is the most difficult case. (And, we are all tired.)

Again, we begin with a saddle-point problem via the introduction of an associated Lagrange multiplier, say $\lambda(\mathbf{x})$, via the following functional:

$$F(\mathbf{v}, \lambda) = \tfrac{1}{2}\int \nabla(\mathbf{v} - \mathbf{u}) : [\nabla(\mathbf{v} - \mathbf{u})]^T - \int \lambda\nabla\cdot\mathbf{v} = \tfrac{1}{2}\int \nabla(v_\alpha - u_\alpha)\cdot\nabla(v_\alpha - u_\alpha) - \int \lambda\nabla\cdot\mathbf{v}, \tag{A.3.3-51}$$

where the Greek index implies summation over the dimension of the underlying Euclidean space; e.g., in 2D, $\nabla u_\alpha\cdot\nabla v_\alpha = \nabla u_1\cdot\nabla v_1 + \nabla u_2\cdot\nabla v_2 = \nabla\mathbf{u} : (\nabla\mathbf{v})^T$. In (A.3.3-51), $\mathbf{v} \in \mathbf{H}^1_E$; i.e., $\mathbf{v} \in \mathbf{H}^1$ with $\mathbf{v} = \mathbf{u}$ on $\Gamma_D$; also, $\mathbf{H}^1_0 \subset \mathbf{H}^1$ has $\mathbf{v} = 0$ on $\Gamma_D$, and $\lambda \in L^2$. Taking the first variation of $\mathbf{v}$ and $\lambda$
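to give $\delta F$ and setting $\delta F = 0$ gives an alleged stationary point with $\mathbf{v} = \mathbf{u}_1$ and $\lambda = \lambda_1$ there; i.e.,

$$0 = \int \nabla(u_{1\alpha} - u_\alpha)\cdot\nabla\delta v_\alpha - \int \lambda_1\nabla\cdot\delta\mathbf{v} - \int \delta\lambda\,\nabla\cdot\mathbf{u}_1, \tag{A.3.3-52}$$

which is the starting point for obtaining each of two results: (i) the Euler-Lagrange equations of the stationary point and (ii) the weak formulation of same. To obtain the former (and the implied/formal projection operator), we use integration by parts on the first two terms to get

$$0 = \int \nabla\cdot[\delta v_\alpha\nabla(u_{1\alpha} - u_\alpha)] - \int \delta v_\alpha\nabla^2(u_{1\alpha} - u_\alpha) - \int \nabla\cdot(\lambda_1\delta\mathbf{v}) + \int \delta\mathbf{v}\cdot\nabla\lambda_1 - \int \delta\lambda\,\nabla\cdot\mathbf{u}_1. \tag{A.3.3-53}$$

But $\delta v_\alpha = 0$ on $\Gamma_D$ and thus

$$0 = \int_{\Gamma_N} \left[\frac{\partial}{\partial n}(u_{1\alpha} - u_\alpha) - \lambda_1n_\alpha\right]\delta v_\alpha - \int [\nabla^2(\mathbf{u}_1 - \mathbf{u}) - \nabla\lambda_1]\cdot\delta\mathbf{v} - \int \delta\lambda\,\nabla\cdot\mathbf{u}_1, \tag{A.3.3-54}$$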
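which leads to the Euler-Lagrange equations,

$$\nabla^2\mathbf{u}_1 - \nabla\lambda_1 = \nabla^2\mathbf{u} \quad\text{and}\quad \nabla\cdot\mathbf{u}_1 = 0 \quad\text{in } \Omega, \tag{A.3.3-55}$$

$$\frac{\partial\mathbf{u}_1}{\partial n} - \lambda_1\mathbf{n} = \frac{\partial\mathbf{u}}{\partial n} \quad\text{on } \Gamma_N, \tag{A.3.3-56}$$

and, of course,

$$\mathbf{u}_1 = \mathbf{u} \quad\text{on } \Gamma_D. \tag{A.3.3-57}$$

Remarks:
(1) These 'look like' the steady Stokes equations with 'body force' $\nabla^2\mathbf{u}$.
(2) The BC on $\Gamma_N$ is an NBC.
(3) The solution of (A.3.3-55) through (A.3.3-57) is a saddle-point of (A.3.3-51).
(4) If $\nabla\cdot\mathbf{u} = 0$, then the solution turns out to be $\lambda_1 = 0$ and $\mathbf{u}_1 = \mathbf{u}$; i.e., $\mathbf{u}$ was already in the subspace.
(5) If $\mathbf{u} = 0$ in $\Omega$ and $\partial\mathbf{u}/\partial n = \mathbf{f}$ on $\Gamma_N$ and $\mathbf{u} = \mathbf{w}$ on $\Gamma_D$, we have steady Stokes flow.

Now we will use these PDE's to derive the projection operator; simply 'solve' (A.3.3-55) through (A.3.3-57) for $\mathbf{u}_1$ (which of course includes the BC's) and place the result into $\nabla\cdot\mathbf{u}_1 = 0$:

$$\mathbf{u}_1 = \mathbf{u} + \Delta^{-1}\nabla\lambda_1, \tag{A.3.3-58}$$

giving

$$-\nabla\cdot\Delta^{-1}\nabla\lambda_1 = \nabla\cdot\mathbf{u}, \tag{A.3.3-59}$$

which is 'solved' for $\lambda_1$,

$$\lambda_1 = -(\nabla\cdot\Delta^{-1}\nabla)^{-1}\nabla\cdot\mathbf{u}, \tag{A.3.3-60}$$

and the result placed in (A.3.3-58) to give

$$\mathbf{u}_1 = \mathbf{u} - \Delta^{-1}\nabla(\nabla\cdot\Delta^{-1}\nabla)^{-1}\nabla\cdot\mathbf{u} = [I - \Delta^{-1}\nabla(\nabla\cdot\Delta^{-1}\nabla)^{-1}\nabla\cdot]\,\mathbf{u} = \wp_{J_1}\mathbf{u}, \tag{A.3.3-61}$$

a rather formidable formal operator! Nevertheless, it is a projection operator ($\wp_{J_1}^2 = \wp_{J_1}$) as is $Q_{J_1} = I - \wp_{J_1}$, which is related to $\nabla\lambda_1$ via

$$-\nabla\lambda_1 = \Delta Q_{J_1}\mathbf{u}; \tag{A.3.3-62}$$

i.e., we have obtained the decomposition

$$\mathbf{u} = \mathbf{u}_1 - \Delta^{-1}\nabla\lambda_1 = \wp_{J_1}\mathbf{u} + Q_{J_1}\mathbf{u}, \tag{A.3.3-63}$$

with $\nabla\cdot\wp_{J_1} = 0$ (the projected vector lies in the null space of div) and $\nabla\cdot Q_{J_1} = \nabla\cdot$; i.e., $Q_{J_1}$ has no effect on the divergence. $\wp_{J_1}$ is an $\mathbf{H}^1$-projector onto $J$.

Remark: Rewriting $Q_{J_1}$ as $\Delta^{-1}\,\mathrm{grad}\,[\mathrm{div}\,\Delta^{-1}\,\mathrm{grad}]^{-1}\,\mathrm{div}$ and recalling that $\mathrm{div}\,\mathrm{grad} = \Delta$, it is tempting to suggest that $\mathrm{div}\,\Delta^{-1}\,\mathrm{grad}$ approximates the identity operator so that $Q_{J_1} \approx$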
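$\Delta^{-1}\,\mathrm{grad}\,\mathrm{div}$, etc.; but to yield to this temptation would probably be counterproductive. We merely refer the reader to (3.13-134) in Chapter 3, where a modicum of support for this notion was presented.

Next (of course), we test $\perp_1$; we obtain

$$\|\mathbf{u}\|_1^2 = \int \nabla u_\alpha\cdot\nabla u_\alpha = \int \nabla(\wp_{J_1}u_\alpha + \mathcal{Q}_{J_1}u_\alpha)\cdot\nabla(\wp_{J_1}u_\alpha + \mathcal{Q}_{J_1}u_\alpha)$$
$$= \|\wp_{J_1}\mathbf{u}\|_1^2 + \|\mathcal{Q}_{J_1}\mathbf{u}\|_1^2 + 2\int (\nabla\wp_{J_1}u_\alpha)\cdot(\nabla\mathcal{Q}_{J_1}u_\alpha)$$
$$= \|\mathbf{u}_1\|_1^2 + \|\Delta^{-1}\nabla\lambda_1\|_1^2 - 2\int \nabla u_{1\alpha}\cdot\nabla(\Delta^{-1}\nabla\lambda_1)_\alpha. \tag{A.3.3-64}$$

To make further progress, we must integrate the last term by parts; first, letting $\mathbf{v} = \Delta^{-1}\nabla\lambda_1$ yields

$$\int \nabla u_{1\alpha}\cdot\nabla v_\alpha = \int \nabla\cdot(u_{1\alpha}\nabla v_\alpha) - \int u_{1\alpha}\,\Delta v_\alpha = \int_\Gamma u_{1\alpha}\frac{\partial}{\partial n}(\Delta^{-1}\nabla\lambda_1)_\alpha - \int u_{1\alpha}(\nabla\lambda_1)_\alpha. \tag{A.3.3-65}$$

But $\Delta^{-1}\nabla\lambda_1 = \mathbf{u}_1 - \mathbf{u}$ from (A.3.3-58), and $\int \mathbf{u}_1\cdot\nabla\lambda_1 = \int_\Gamma \lambda_1\mathbf{n}\cdot\mathbf{u}_1$ via the divergence theorem to give

$$\int \nabla u_{1\alpha}\cdot\nabla(\Delta^{-1}\nabla\lambda_1)_\alpha = \int_\Gamma\left[u_{1\alpha}\frac{\partial}{\partial n}(u_{1\alpha} - u_\alpha) - \lambda_1\mathbf{n}\cdot\mathbf{u}_1\right] = \int_{\Gamma_D}\mathbf{u}\cdot\left[\frac{\partial}{\partial n}(\mathbf{u}_1 - \mathbf{u}) - \lambda_1\mathbf{n}\right] \tag{A.3.3-66}$$

because $\partial(\mathbf{u}_1 - \mathbf{u})/\partial n - \lambda_1\mathbf{n} = 0$ on $\Gamma_N$, and $\mathbf{u}_1 = \mathbf{u}$ on $\Gamma_D$. Thus, finally,

$$\|\mathbf{u}\|_1^2 = \|\wp_{J_1}\mathbf{u}\|_1^2 + \|\mathcal{Q}_{J_1}\mathbf{u}\|_1^2 - 2\int_{\Gamma_D}\mathbf{u}\cdot\left[\frac{\partial}{\partial n}(\mathbf{u}_1 - \mathbf{u}) - \lambda_1\mathbf{n}\right] \tag{A.3.3-67}$$

and we see that the projection is $\perp_1$ if $\mathbf{u} = 0$ on $\Gamma_D$, or if $\Gamma_D = \emptyset$; i.e., we have the 'usual result': inhomogeneous Dirichlet BC's cause loss of orthogonality. [Cf. (A.3.2-41) and the discussion following it, to dispense with unimportant special cases.]

Now we derive the weak formulation for this projection, in preparation for the GFEM approximation. Returning to (A.3.3-52), we first introduce $\mathbf{H}_0^1$, a subspace of $\mathbf{H}^1$ such that $\mathbf{v}\in\mathbf{H}_0^1 \Rightarrow \mathbf{v}\in\mathbf{H}^1$ and $\mathbf{v} = 0$ on $\Gamma_D$. Next, let $\mathbf{w}$ be a given $\mathbf{H}^1$-extension of a vector field from $\Gamma_D$ into $\Omega$ that satisfies $\mathbf{w} = \mathbf{u}$ on $\Gamma_D$. Then, to find our desired $\mathbf{H}^1$-projection, $\mathbf{u}_1 = \wp_{J_1}\mathbf{u}$, we proceed as follows: set $\mathbf{u}_1 = \mathbf{u}_1^0 + \mathbf{w}$, and replace $\delta v_\alpha$ by a test function $\mathbf{v}\in\mathbf{H}_0^1$ and $\delta\lambda$ by a test function $q\in L^2$ to obtain the variational problem: find $\mathbf{u}_1^0\in\mathbf{H}_0^1$ and $\lambda_1\in L^2$ from

$$\int \nabla(u_{1\alpha}^0 + w_\alpha - u_\alpha)\cdot\nabla v_\alpha - \int \lambda_1\nabla\cdot\mathbf{v} = 0, \quad \forall\,\mathbf{v}\in\mathbf{H}_0^1, \tag{A.3.3-68}$$

and

$$-\int q\,\nabla\cdot(\mathbf{u}_1^0 + \mathbf{w}) = 0, \quad \forall\, q\in L^2, \tag{A.3.3-69}$$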
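where $\mathbf{w}$ is generally not divergence-free. As before, we rearrange by placing the data on the RHS before 'going Galerkin': Find $\mathbf{u}_1^0\in\mathbf{H}_0^1$ and $\lambda_1\in L^2$ from

$$\int \nabla u_{1\alpha}^0\cdot\nabla v_\alpha - \int \lambda_1\nabla\cdot\mathbf{v} = \int \nabla(u_\alpha - w_\alpha)\cdot\nabla v_\alpha \quad \forall\,\mathbf{v}\in\mathbf{H}_0^1, \tag{A.3.3-70}$$

$$-\int q\,\nabla\cdot\mathbf{u}_1^0 = \int q\,\nabla\cdot\mathbf{w} \quad \forall\, q\in L^2, \tag{A.3.3-71}$$

then set

$$\mathbf{u}_1 = \mathbf{u}_1^0 + \mathbf{w} = \wp_{J_1}\mathbf{u}. \tag{A.3.3-72}$$

Now, analogous to the scalar $H^1$-projection, we can find an $\perp_1$-projection by first 'subtracting' the inhomogeneous Dirichlet data, then performing the projection, and finally 'adding' the data back in. Thus, defining $\tilde{\mathbf{u}} = \mathbf{u} - \mathbf{w}$ and following the same steps as before leads to the same Euler-Lagrange equations as (A.3.3-55) through (A.3.3-57) except that now $\tilde{\mathbf{u}}_1 = 0$ on $\Gamma_D$; we have homogeneous Dirichlet data. Here, $\tilde{\mathbf{u}}_1 = \wp_{J_1}\tilde{\mathbf{u}}$ and $\tilde{\mathbf{u}}_1 + \mathbf{w}$ is the final result which, because $\mathbf{u}\to\tilde{\mathbf{u}} = 0$ on $\Gamma_D$ in (A.3.3-67), gives the desired(?) $\mathbf{H}^1$-orthogonal projection; i.e., we have $\|\tilde{\mathbf{u}}\|_1^2 = \|\wp_{J_1}\tilde{\mathbf{u}}\|_1^2 + \|\mathcal{Q}_{J_1}\tilde{\mathbf{u}}\|_1^2$.

A.3.3.4 The $\wp_{J_1}^h$-Projection

We have finally arrived at our last pair of projections ($\wp_{J_1}^h$ and $P_1$), giving a total of 10, as advertised. We consider now the finite-dimensional subspaces: $\mathbf{V}^h\subset\mathbf{H}^1$, $S^h\subset L^2$, and $J^h\subset\mathbf{V}^h$, the discretely divergence-free subspace in which our GFEM solutions lie. Proceeding as in the $\wp_{J_0}^h$ case, we invoke (A.3.3-26), (A.3.3-27), and (A.3.3-30), but replace (A.3.3-28) by the full interpolation

$$\mathbf{u}^I = \sum_j \mathbf{u}_j\varphi_j \in \mathbf{H}^1, \tag{A.3.3-73}$$

where $\mathbf{u}_j$ is the value of the given velocity at node $j$ on $\Gamma_D$. Thus, rather than (A.3.3-31) through (A.3.3-33), the weak form [(A.3.3-70), (A.3.3-71)] leads to the analogous $\mathbf{H}^1$-version of it, using also (A.3.3-26) and (A.3.3-27); i.e.,

$$\sum_{j=1}^{N} u_j\int \nabla\varphi_i\cdot\nabla\varphi_j - \sum_{j=1}^{N_p}\lambda_j\int \psi_j\frac{\partial\varphi_i}{\partial x} = \int \nabla(u - u^I)\cdot\nabla\varphi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.3-74}$$

$$\sum_{j=1}^{N} v_j\int \nabla\varphi_i\cdot\nabla\varphi_j - \sum_{j=1}^{N_p}\lambda_j\int \psi_j\frac{\partial\varphi_i}{\partial y} = \int \nabla(v - v^I)\cdot\nabla\varphi_i, \quad i = 1, 2, \ldots, N, \tag{A.3.3-75}$$

and

$$-\sum_{j=1}^{N}\left(u_j\int \psi_i\frac{\partial\varphi_j}{\partial x} + v_j\int \psi_i\frac{\partial\varphi_j}{\partial y}\right) = \int \psi_i\,\nabla\cdot\mathbf{u}^I, \quad i = 1, 2, \ldots, N_p \tag{A.3.3-76}$$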
are the $2N + N_p$ Galerkin equations of the $\mathbf{H}^1$-projection. Comparing these equations with those of the $L^2$-projection given by (A.3.3-31) through (A.3.3-33) makes it clear that there is a perfect one-to-one analogy between the mass matrix ($M$) and the Laplacian matrix ($K$), a fact that allows us to abbreviate the rest of this discussion. The 'pre-projection' velocity, corresponding to (A.3.3-34), the non-divergence-free $\mathbf{H}^1$-projection, has as its nodal amplitude coefficients, $(\tilde{u}, \tilde{v})^T$, and is obtained via solving

$$K_x\tilde{u} = \int \nabla(u - u^I)\cdot\nabla\boldsymbol{\varphi} \tag{A.3.3-77}$$

and

$$K_y\tilde{v} = \int \nabla(v - v^I)\cdot\nabla\boldsymbol{\varphi}, \tag{A.3.3-78}$$

after which the final projection takes the form

$$K_x u + C_x\lambda = K_x\tilde{u}, \tag{A.3.3-79}$$

$$K_y v + C_y\lambda = K_y\tilde{v}, \tag{A.3.3-80}$$

and

$$C_x^T u + C_y^T v = g, \tag{A.3.3-81}$$

where here, $K_x = K_y$ because of our somewhat restrictive BC's. Next, the analog of (A.3.3-38) and the discretization of (A.3.3-59) is easily obtained:

$$(C_x^T K_x^{-1} C_x + C_y^T K_y^{-1} C_y)\lambda = C^T K^{-1} C\lambda \equiv B\lambda = C_x^T\tilde{u} + C_y^T\tilde{v} - g. \tag{A.3.3-82}$$

So too are the analogs of (A.3.3-39) and the GFEM analog of (A.3.3-61):

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} I_u - K^{-1}C_x B^{-1}C_x^T & -K^{-1}C_x B^{-1}C_y^T \\ -K^{-1}C_y B^{-1}C_x^T & I_v - K^{-1}C_y B^{-1}C_y^T \end{pmatrix}\begin{pmatrix} \tilde{u} \\ \tilde{v} \end{pmatrix} + \begin{pmatrix} K^{-1}C_x B^{-1}g \\ K^{-1}C_y B^{-1}g \end{pmatrix}, \tag{A.3.3-83}$$

which has introduced the $\mathbf{H}^1$-projection matrix, $P_1$; i.e., in condensed form,

$$u = (I - K^{-1}CB^{-1}C^T)\tilde{u} + K^{-1}CB^{-1}g = P_1\tilde{u} + K^{-1}CB^{-1}g, \tag{A.3.3-84}$$

à la (A.3.3-43), which properly shows the (comforting) lack of change if $\tilde{u}$ is divergence-free via the rearrangement to

$$u = \tilde{u} - K^{-1}CB^{-1}(C^T\tilde{u} - g). \tag{A.3.3-85}$$

Also, in analogy with (A.3.3-44), with $Q_1 = I - P_1 = K^{-1}CB^{-1}C^T$,

$$\tilde{u} = u + K^{-1}C\lambda = P_1\tilde{u} + Q_1\tilde{u}, \tag{A.3.3-86}$$

and the inner product, $[\mathbf{u}, \mathbf{v}] = \int \nabla u_\alpha\cdot\nabla v_\alpha$, goes to

$$[u, v] = u^T K v \tag{A.3.3-87}$$
and leads to the following $\perp_1$-condition, analogous to (A.3.3-45):

$$\|\tilde{u}\|_1^2 = \|u\|_1^2 + \|K^{-1}C\lambda\|_1^2 + 2\lambda^T g; \tag{A.3.3-88}$$

$\perp_1$ is obtained only if $\mathbf{u} = 0$ on $\Gamma_D$ or if $\Gamma_D = \emptyset$. As with the $L^2$-case [see (A.3.3-45) and related discussion], $P_1\tilde{u} \perp_1 Q_1\tilde{u}$ but $u$ is not $\perp_1$ $K^{-1}C\lambda$ even though $\tilde{u} = P_1\tilde{u} + Q_1\tilde{u} = u + K^{-1}C\lambda$, 'because' $P_1\tilde{u} = u - K^{-1}CB^{-1}g$ and $Q_1\tilde{u} = K^{-1}C\lambda + K^{-1}CB^{-1}g$.

Finally, to introduce the continuous projection operator, $\wp_{J_1}^h$, that is induced by the discrete projection operator, $P_1$, we repeat the summary given for $\wp_{J_0}^h$ just above (A.3.3-46):

1. Given $\mathbf{u}$, make a first approximation to it via $\tilde{\mathbf{u}}^h = \sum_j (\tilde{u}_j, \tilde{v}_j)^T\varphi_j$, and approximate its constrained value on $\Gamma_D$ by (A.3.3-73).
2. To obtain $(\tilde{u}_j, \tilde{v}_j)$, perform the (non-divergence-free) $\mathbf{H}^1$-projection by solving (A.3.3-77) and (A.3.3-78).
3. Project $(\tilde{u}_j, \tilde{v}_j)$ to the divergence-free subspace via (A.3.3-83), giving $(u_j, v_j)$.
4. Insert the result into (A.3.3-26) and (A.3.3-27), and we are done; we have the projection, which is (A.3.3-46) through (A.3.3-49): the finite-dimensional $\mathbf{H}^1$-projection operator onto the discretely divergence-free subspace, $J^h$, with (of course) $M$ replaced by $K$, $P_0$ by $P_1$, and $\wp_{J_0}^h$ by $\wp_{J_1}^h$.

A.3.3.5 Sequential Projections

Before relating the divergence-free projections to BVP's and their solution, we need to point out an interesting fact that answers the following question: Does the direct projection from the $\infty$-dimensional space to the discretely divergence-free subspace generate the same vector as results by first applying the non-divergence-free projection operator to the same vector to bring it down to the finite-dimensional subspace and then applying the discretely divergence-free projection to this vector ($L^2$ and/or $\mathbf{H}^1$)? In fewer words, we ask: Does $\mathbf{u}_i^h = \wp_{J_i}^h\mathbf{u} = \wp_{J_i}^h\wp_i^h\mathbf{u} = \wp_{J_i}^h\tilde{\mathbf{u}}^h$ for $i = 0, 1$? Or, pictorially, do we have (a) or (b) in Figure A.3.3-2?
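A small numerical experiment can make the question concrete. The sketch below is an illustration under stated assumptions, not part of the derivation: $\mathbb{R}^N$ with an SPD Gram matrix `G` stands in for the $\infty$-dimensional space, the columns of a random matrix `Phi` stand in for the basis functions, `C` is a random full-rank constraint matrix, and the Dirichlet data are taken as homogeneous ($g = 0$, $\mathbf{u}^I = 0$). All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, m = 40, 8, 3                    # ambient dimension, velocity dofs, constraints (hypothetical)

G0 = rng.standard_normal((N, N))
G = G0 @ G0.T + N * np.eye(N)         # SPD Gram matrix: inner product (a, b) = a^T G b
Phi = rng.standard_normal((N, n))     # columns play the role of the basis functions
C = rng.standard_normal((n, m))       # constraint matrix: J^h = {Phi c : C^T c = 0}
u = rng.standard_normal(N)            # the vector to be projected

# Route 1 (sequential): project onto span(Phi) first, then onto the constrained subspace.
M = Phi.T @ G @ Phi                   # "mass" matrix induced by the inner product
u_tilde = np.linalg.solve(M, Phi.T @ G @ u)           # unconstrained projection coefficients
Minv_C = np.linalg.solve(M, C)
A = C.T @ Minv_C                                       # A = C^T M^{-1} C
P0 = np.eye(n) - Minv_C @ np.linalg.solve(A, C.T)      # P0 = I - M^{-1} C A^{-1} C^T
u_seq = Phi @ (P0 @ u_tilde)

# Route 2 (direct): G-orthogonal projection straight onto {Phi c : C^T c = 0}.
_, _, Vt = np.linalg.svd(C.T)         # rows m..n-1 of Vt span the null space of C^T
Z = Vt[m:].T
W = Phi @ Z                           # basis of the constrained subspace
u_direct = W @ np.linalg.solve(W.T @ G @ W, W.T @ G @ u)

agree = np.allclose(u_seq, u_direct)
print("sequential and direct projections agree:", agree)
```

Under these assumptions the two routes give the same vector to round-off, because `P0` is the `M`-orthogonal projector onto the null space of `C.T`. Swapping the Gram matrix for one built from a gradient inner product would give the corresponding check with `K` in place of `M`.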
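We shall demonstrate that the answer is 'yes' [Figure A.3.3-2(b)] for the $L^2$-projection (as before, replace $M$ by $K$ to get the $\mathbf{H}^1$-analog): from (A.3.3-34), (A.3.3-35) through (A.3.3-37), and (A.3.3-48) we have

$$\mathbf{u}_0^h = \wp_{J_0}^h\mathbf{u} = \begin{pmatrix}\boldsymbol{\varphi}^T & 0 \\ 0 & \boldsymbol{\varphi}^T\end{pmatrix}\left\{P_0 M^{-1}\int\begin{pmatrix}\boldsymbol{\varphi}\,(u - u^I) \\ \boldsymbol{\varphi}\,(v - v^I)\end{pmatrix} + M^{-1}CA^{-1}g\right\} + \mathbf{u}^I. \tag{A.3.3-89}$$

Then, from (A.3.3-34) again, we have

$$\wp_0^h\mathbf{u} = \tilde{\mathbf{u}}^h = \begin{pmatrix}\boldsymbol{\varphi}^T & 0 \\ 0 & \boldsymbol{\varphi}^T\end{pmatrix}\begin{pmatrix}\tilde{u} \\ \tilde{v}\end{pmatrix} + \mathbf{u}^I = \begin{pmatrix}\boldsymbol{\varphi}^T\tilde{u} + u^I \\ \boldsymbol{\varphi}^T\tilde{v} + v^I\end{pmatrix}. \tag{A.3.3-90}$$

Finally,

$$\wp_{J_0}^h\tilde{\mathbf{u}}^h = \wp_{J_0}^h\wp_0^h\mathbf{u} = \begin{pmatrix}\boldsymbol{\varphi}^T & 0 \\ 0 & \boldsymbol{\varphi}^T\end{pmatrix}\left\{P_0 M^{-1}\int\begin{pmatrix}\boldsymbol{\varphi}\boldsymbol{\varphi}^T\tilde{u} \\ \boldsymbol{\varphi}\boldsymbol{\varphi}^T\tilde{v}\end{pmatrix} + M^{-1}CA^{-1}g\right\} + \mathbf{u}^I$$
$$= \begin{pmatrix}\boldsymbol{\varphi}^T & 0 \\ 0 & \boldsymbol{\varphi}^T\end{pmatrix}\left\{P_0 M^{-1}M\begin{pmatrix}\tilde{u} \\ \tilde{v}\end{pmatrix} + M^{-1}CA^{-1}g\right\} + \mathbf{u}^I = \wp_{J_0}^h\mathbf{u}; \quad\text{QED: } \wp_{J_0}^h = \wp_{J_0}^h\wp_0^h. \tag{A.3.3-91}$$

Fig. A.3.3-2 Sequential projections.

A.3.3.6 The Projection Method

To (nearly) conclude this extended discussion on projections, we seek BVP's that correspond to the two divergence-free projections. These correspond to potential flow in the first case and Stokes flow in the second. See also Section 3.15.

Recalling first the $L^2$-projection, we consider the following BVP:

$$\mathbf{u} + \nabla P = \mathbf{f}, \quad \nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega, \tag{A.3.3-92}$$

with

$$\mathbf{n}\cdot\mathbf{u} = u_n \text{ on } \Gamma_D, \quad\text{and}\quad P = P_N \text{ on } \Gamma_N, \tag{A.3.3-93}$$

which, with $P$ identified as a velocity potential, is recognized as the potential flow equations with a (necessarily curl-free) source term, $\mathbf{f}$ (which we could omit but retain for generality/consistency).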
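The 'analogous' Stokes problem is

$$-\nabla^2\mathbf{u} + \nabla P = \mathbf{f}, \quad \nabla\cdot\mathbf{u} = 0 \quad\text{in } \Omega, \tag{A.3.3-94}$$

with

$$\mathbf{u} = \mathbf{u}_D \text{ on } \Gamma_D \quad\text{and}\quad \partial\mathbf{u}/\partial n - \mathbf{n}P = \mathbf{F} \text{ on } \Gamma_N, \tag{A.3.3-95}$$

in which $\mathbf{f}$ may, but need not, be the same source term as in (A.3.3-92), ignoring 'regularity issues.' Also, $\mathbf{f}$ need not be curl-free for Stokes flow.

To see the projection connection, we first rewrite the above BVP's in their appropriate weak formulations and then compare them with the GFEM approximations of same:

Potential: Find $\mathbf{u}\in\mathbf{H}_0^1$ and $P\in L^2$ from

$$\int (\mathbf{u} + \mathbf{w})\cdot\mathbf{v} - \int P\,\nabla\cdot\mathbf{v} = \int \mathbf{f}\cdot\mathbf{v} - \int_{\Gamma_N} P_N\,\mathbf{n}\cdot\mathbf{v} \quad \forall\,\mathbf{v}\in\mathbf{H}_0^1, \tag{A.3.3-96}$$

and

$$-\int q\,\nabla\cdot(\mathbf{u} + \mathbf{w}) = 0 \quad \forall\, q\in L^2, \tag{A.3.3-97}$$

where $\mathbf{w}$ is an $\mathbf{H}^1$-extension of a vector field from $\Gamma_D$ into $\Omega$ that satisfies $\mathbf{n}\cdot\mathbf{w} = u_n$ on $\Gamma_D$.

Stokes: Find $\mathbf{u}\in\mathbf{H}_0^1$ and $P\in L^2$ from

$$\int \nabla(\mathbf{u} + \mathbf{w}) : (\nabla\mathbf{v})^T - \int P\,\nabla\cdot\mathbf{v} = \int \mathbf{f}\cdot\mathbf{v} + \int_{\Gamma_N}\mathbf{F}\cdot\mathbf{v} \quad \forall\,\mathbf{v}\in\mathbf{H}_0^1, \tag{A.3.3-98}$$

and

$$-\int q\,\nabla\cdot(\mathbf{u} + \mathbf{w}) = 0 \quad \forall\, q\in L^2, \tag{A.3.3-99}$$

where $\mathbf{w}$ is an $\mathbf{H}^1$-extension of a vector field from $\Gamma_D$ into $\Omega$ that satisfies $\mathbf{w} = \mathbf{u}_D$ on $\Gamma_D$.

The GFEM version of these follows easily wherein $\mathbf{v}\to\mathbf{v}^h\to(\varphi_i, \varphi_i)^T$, $q\to q^h\to\psi_i$ and $\mathbf{w}\to\mathbf{u}^I$; the test functions become the (subset of) GFEM basis/test functions, and the interpolant is used as the $\mathbf{H}^1$-extension of Dirichlet data. Thus:

Potential: Find $\mathbf{u}^h = \sum_{j=1}^{N}\mathbf{u}_j\varphi_j$ and $P^h = \sum_{j=1}^{N_p}P_j\psi_j$ from

$$\int (\mathbf{u}^h + \mathbf{u}^I)\cdot\mathbf{v}^h - \int P^h\,\nabla\cdot\mathbf{v}^h = \int \mathbf{f}\cdot\mathbf{v}^h - \int_{\Gamma_N}P_N\,\mathbf{n}\cdot\mathbf{v}^h \quad \forall\,\mathbf{v}^h\in\mathbf{V}^h\subset\mathbf{H}_0^1, \tag{A.3.3-100}$$

and

$$-\int q^h\,\nabla\cdot(\mathbf{u}^h + \mathbf{u}^I) = 0 \quad \forall\, q^h\in Q^h\subset L^2, \tag{A.3.3-101}$$

and Stokes: Find $\mathbf{u}^h = \sum_{j=1}^{N}\mathbf{u}_j\varphi_j$ and $P^h = \sum_{j=1}^{N_p}P_j\psi_j$ from

$$\int \nabla(\mathbf{u}^h + \mathbf{u}^I) : (\nabla\mathbf{v}^h)^T - \int P^h\,\nabla\cdot\mathbf{v}^h = \int \mathbf{f}\cdot\mathbf{v}^h + \int_{\Gamma_N}\mathbf{F}\cdot\mathbf{v}^h \quad \forall\,\mathbf{v}^h\in\mathbf{V}^h\subset\mathbf{H}_0^1 \tag{A.3.3-102}$$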
and

$$-\int q^h\,\nabla\cdot(\mathbf{u}^h + \mathbf{u}^I) = 0 \quad \forall\, q^h\in Q^h\subset L^2. \tag{A.3.3-103}$$

Now, as we did earlier for the scalar case, we note that since $\mathbf{V}^h\subset\mathbf{H}_0^1$ and $Q^h\subset L^2$, it follows that (A.3.3-96) through (A.3.3-99) are also valid when $\mathbf{v}$ is replaced by $\mathbf{v}^h$ and $q$ by $q^h$, thus generating finite-dimensional subsets of the $\infty$-dimensional equations of the weak formulations. We then also note that the RHS's (data) for the resulting sets are the same as the RHS's of (A.3.3-100), (A.3.3-101) and (A.3.3-102), (A.3.3-103), respectively, which leads directly to the following results:

Potential:

$$\int \mathbf{u}^h\cdot\mathbf{v}^h - \int P^h\,\nabla\cdot\mathbf{v}^h = \int (\mathbf{u} + \mathbf{w} - \mathbf{u}^I)\cdot\mathbf{v}^h - \int P\,\nabla\cdot\mathbf{v}^h = \int \tilde{\mathbf{u}}^h\cdot\mathbf{v}^h \tag{A.3.3-104}$$

and

$$-\int q^h\,\nabla\cdot\mathbf{u}^h = \int q^h\,\nabla\cdot\mathbf{u}^I, \tag{A.3.3-105}$$

where the introduction of $\tilde{\mathbf{u}}^h$ (which is computable if $\mathbf{u}$ and $P$ are available), a non-divergence-free vector field, is merely (partly) for 'convenience';

and Stokes:

$$\int \nabla\mathbf{u}^h : (\nabla\mathbf{v}^h)^T - \int P^h\,\nabla\cdot\mathbf{v}^h = \int \nabla(\mathbf{u} + \mathbf{w} - \mathbf{u}^I) : (\nabla\mathbf{v}^h)^T - \int P\,\nabla\cdot\mathbf{v}^h = \int \nabla\tilde{\mathbf{u}}^h : (\nabla\mathbf{v}^h)^T \tag{A.3.3-106}$$

and

$$-\int q^h\,\nabla\cdot\mathbf{u}^h = \int q^h\,\nabla\cdot\mathbf{u}^I, \tag{A.3.3-107}$$

where again, $\tilde{\mathbf{u}}^h$ is a computable (and non-solenoidal) vector field, again assuming $\mathbf{u}$ and $P$ are available (which we do).

To finish, all we need do is recognize that (A.3.3-104) and (A.3.3-105) is nothing other than equations (A.3.3-31) through (A.3.3-37) and that (A.3.3-106), (A.3.3-107) is nothing other than equations (A.3.3-74) through (A.3.3-81), after replacing $\mathbf{u}$ by $\mathbf{u} + \mathbf{w}$ in the former equations and accepting the 'new' definition of $\mathbf{u}^I = \sum_j \mathbf{u}_j\varphi_j$. Thus, at least up to the slightly ambiguous $\mathbf{H}^1$-extension functions, $\mathbf{w}$, we clearly (and finally!) see the projection connections for incompressible flow: the GFEM solution of the potential/Stokes flow equations is the same as the projections ($L^2$ in one case, $\mathbf{H}^1$ in the other) of the exact solutions to the appropriate finite element subspaces. These projections are 'best fits'/closest approximations/orthogonal projections in the case wherein the Dirichlet data are homogeneous (as usual).
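For homogeneous data, the discrete counterpart of this connection can be checked in a few lines. The sketch below is illustrative only and rests on assumptions: a random SPD matrix `K` stands in for the Laplacian matrix, `C` is a random full-rank gradient-like matrix, and `f` is a random load vector; none of them come from an actual mesh, and all names are hypothetical. It compares a Stokes-type saddle-point solve with the projection of the unconstrained solution $K^{-1}f$, and then probes the 'best fit' property in the norm induced by $K$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 10, 3                              # velocity and pressure dofs (hypothetical sizes)

A0 = rng.standard_normal((n, n))
K = A0 @ A0.T + n * np.eye(n)             # SPD stand-in for the Laplacian matrix
C = rng.standard_normal((n, m))           # stand-in for the gradient matrix, full column rank
f = rng.standard_normal(n)                # load vector; homogeneous data, so g = 0

# (i) Stokes-type solve: K u + C lam = f, C^T u = 0.
S = np.block([[K, C], [C.T, np.zeros((m, m))]])
u_stokes = np.linalg.solve(S, np.concatenate([f, np.zeros(m)]))[:n]

# (ii) Projection route: unconstrained solve, then u = P1 u_tilde.
u_tilde = np.linalg.solve(K, f)
Kinv_C = np.linalg.solve(K, C)
B = C.T @ Kinv_C                          # B = C^T K^{-1} C
u_proj = u_tilde - Kinv_C @ np.linalg.solve(B, C.T @ u_tilde)

# (iii) Best fit: no other discretely divergence-free vector is closer to u_tilde
# in the K-norm. Random members of null(C^T) serve as competitors.
_, _, Vt = np.linalg.svd(C.T)
Z = Vt[m:].T                              # columns span the null space of C^T

def k_distance(z):
    e = u_tilde - z
    return np.sqrt(e @ K @ e)

best = k_distance(u_proj)
others = [k_distance(Z @ rng.standard_normal(n - m)) for _ in range(200)]

same_solution = np.allclose(u_stokes, u_proj)
print("saddle-point solve equals projection:", same_solution)
print("projection is closest among the samples:", best <= min(others))
```

The random competitors only sample the constrained subspace, so part (iii) is a spot check rather than a proof; the orthogonality argument in the text is what guarantees the result.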
[The functions $\mathbf{u}^h$ (partial/'internal' GFEM solutions) are always orthogonal projections of $\mathbf{u} + \mathbf{w} - \mathbf{u}^I$, but the full solutions ($\mathbf{u}^h + \mathbf{u}^I$) are non-orthogonal projections of $\mathbf{u} + \mathbf{w}$.]

Final remarks, for homogeneous BC's:

1. The potential flow projection can also be interpreted as a projection of the source vector to the discretely divergence-free subspace, since (A.3.3-104) implies (weakly) $\tilde{\mathbf{u}}^h = \mathbf{u} + \nabla P = \mathbf{f}$, with $\nabla\cdot\mathbf{u} = 0$.
2. The Stokes flow projection can also be re-interpreted via (A.3.3-106), this time via $-\nabla^2\tilde{\mathbf{u}}^h = -\nabla^2\mathbf{u} + \nabla P = \mathbf{f}$; i.e., $\tilde{\mathbf{u}}^h = -\Delta^{-1}\mathbf{f}$, with $\nabla\cdot\mathbf{u} = 0$. Thus, the GFEM solution is also just a 'different' decomposition of the forcing function, and is also related to the fact, mentioned earlier, that divergence-free vector fields are (generally) not discretely divergence-free.

A.3.3.7 Ranking Elements via Projections

As another potentially useful product of this prolonged discussion on projections, we offer (perhaps with tongue-in-cheek, since we have not tested it) the following suggestions as one (fairly easy) way to compare 'element A' to 'element B' (to 'element C' to ...) for incompressible flow:

1. Pick an analytical function for the stream function (vector potential in 3D).
2. Take its curl to get a div-free vector field, $\mathbf{u}$.
3. Design a 'reasonable' finite element mesh (or, perhaps better yet, a sequence of them).
4. Interpolate $\mathbf{u}$ onto the mesh; call it $\mathbf{u}_I$.
5. Project (in $L^2$ or $\mathbf{H}^1$ or, perhaps preferably, both) $\mathbf{u}_I$ onto the discretely div-free subspace. This gives $(\lambda, \mathbf{u}^h)$, with $C^T u = g$, where $u$ is the vector of nodal values of $\mathbf{u}^h$.
6. Compute appropriate norms of $\lambda$ and $(\mathbf{u} - \mathbf{u}^h)$.
7. Rank the elements according to the size of these norms; smallest 'wins', for this test case.
8. Goto 1.

Remarks: (1) The $\infty$ Do-loop can be truncated when you've 'had enough'. (2) This of course tests the 'quality' only of the discrete divergence (and, by implication, it seems, that of the pressure), but this is just the additional quantity that is needed. (3) The cost of each projection might also be factored in, somehow, to also rank an element's cost-effectiveness. (4) A small sample of such a comparison is shown in the table near the end of Section 3.16.1d.
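Steps 1, 2, and 4 through 6 of the list above can be sketched as follows. To keep the example short, it deliberately swaps in a staggered-grid finite-difference divergence on the unit square for a finite element's $C^T$, and an identity (lumped) mass matrix for $M$; both are assumptions made for illustration, not the elements discussed in this book. The stream function $\psi = x^2(1-x)^2\,y^2(1-y)^2$ and all function names are likewise hypothetical choices. The metric is computed against the interpolant $\mathbf{u}_I$ rather than $\mathbf{u}$ itself.

```python
import numpy as np

def velocity(x, y):
    """Steps 1-2: curl of psi = a(x) b(y), a(s) = b(s) = s^2 (1 - s)^2; u = psi_y, v = -psi_x."""
    a, da = x**2 * (1 - x)**2, 2 * x * (1 - x) * (1 - 2 * x)
    b, db = y**2 * (1 - y)**2, 2 * y * (1 - y) * (1 - 2 * y)
    return a * db, -da * b

def projection_metrics(n):
    """Steps 4-6 on an n-by-n staggered grid; boundary faces carry zero velocity, so g = 0."""
    h = 1.0 / n
    nu = (n - 1) * n                                   # number of interior x-faces (and y-faces)
    iu = lambda i, j: (i - 1) * n + j                  # x-face at (i h, (j + 1/2) h), i = 1..n-1
    iv = lambda i, j: nu + i * (n - 1) + (j - 1)       # y-face at ((i + 1/2) h, j h), j = 1..n-1
    D = np.zeros((n * n, 2 * nu))                      # discrete divergence, one row per cell
    u_I = np.zeros(2 * nu)                             # step 4: point values of u on the faces
    for i in range(n):
        for j in range(n):
            row = i * n + j
            if i > 0:                                  # west face
                D[row, iu(i, j)] = -1.0 / h
                u_I[iu(i, j)] = velocity(i * h, (j + 0.5) * h)[0]
            if i < n - 1:                              # east face
                D[row, iu(i + 1, j)] = 1.0 / h
            if j > 0:                                  # south face
                D[row, iv(i, j)] = -1.0 / h
                u_I[iv(i, j)] = velocity((i + 0.5) * h, j * h)[1]
            if j < n - 1:                              # north face
                D[row, iv(i, j + 1)] = 1.0 / h
    # Step 5: L2-type projection with M = I. The least-squares solution of D^T lam = u_I
    # satisfies the normal equations (D D^T) lam = D u_I; lstsq returns the zero-mean lam.
    lam = np.linalg.lstsq(D.T, u_I, rcond=None)[0]
    u_h = u_I - D.T @ lam
    rms = lambda z: np.sqrt(np.mean(z**2))
    # Step 6: norms of lam and of the correction, plus the remaining discrete divergence.
    return rms(lam), rms(u_h - u_I), np.abs(D @ u_h).max()

for n in (8, 16):
    lam_rms, corr_rms, div_max = projection_metrics(n)
    print(f"n = {n:2d}: rms(lam) = {lam_rms:.3e}, rms(u_h - u_I) = {corr_rms:.3e}, "
          f"max |div| = {div_max:.1e}")
```

To turn this into the ranking of step 7, one would replace the matrix `D` (and the identity mass matrix) with the operators of each candidate element on the same mesh sequence and compare the resulting norms.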
References N.N. Abboud and P.M. Pinsky. Finite element solution and dispersion analysis for the transient structural acoustics problem. Appl. Mech. Rev., 43(5):S381-S388, Part 2, 1990. N.N. Abboud and P.M. Pinsky. Finite element dispersion analysis for the three- dimensional second-order scalar wave equation. Int. J. Numer. Meth. Eng., 35:1183-1218, 1992. L. Abia and J.M. Sanz-Serna. On the use of the product approximation technique in nonlinear Galerkin methods. Int. J. Numer. Meth. Eng., 20:778-779, 1984. M. Abramowitz and LA. Stegen (Eds.). Handbook of Mathematical Functions, National Bureau of Standards, US Dept. of Commerce, Washington, D.C., USA, 1964. NBS Applied Mathematics Series, Vol. 55. J.H. Ahlberg, E.N. Nilson and J.L. Walsh. Theory of Splines and Their Applications. Academic Press, New York, USA, 1967. J.E. Aiken. Stiff Computations. Oxford University Press, New York, USA, 1985. M. Ainsworth and J.T. Oden. A procedure for a posteriori error estimation for h-p finite element methods. Comput. Meth. Appl. Mech. Eng., 101:73-96, 1991. D. Alvarez, O. Daubert, L. Janvier and J.P. Schneider. Proc. NURETH 5, 1992. Chap. "Three dimensional calculations and experimental investigations of the primary coolant flow in a 900 MW PWR vessel"; Salt Lake City, Utah, USA. A.S. Almgren, J.B. Bell and W.G. Szymczak. A numerical method for the incompressible Navier-Stokes equations based on an approximate projection. SI AM J. Sci. Comput. 17:358-369, 1996. R. Amit, C.A. Hall and T.A. Porsching. An application of network theory to the solution of implicit Navier-Stokes difference equations. J. Comput. Phys., 40:183-201, 1981. A.A. Amsden and F.H. Harlow. A simplified MAC technique for incompressible fluid flow calculations. J. Comput. Phys., 6:322-325, 1970. A.A. Amsden and F.H. Harlow. The SMAC Method: A Numerical Technique for Calculating Incompressible Fluid Flows. 
Los Alamos Scientific Laboratory, Los Alamos, New Mexico, USA, LA-4370; UC-32, mathematics and computers; TID-4500 edition, 1970. A. Arakawa. Computational design for long-term numerical integration of the equations of fluid motion: Two-dimentional incompressible flow. Part 1 J. Comput. Phys., 1:119-143, 1966. T. Arbogast and M.F. Wheeler. A characteristics-mixed finite element method for advection-dominated transport problems. SINUM, 32(2):404-424, 1995.
962 REFERENCES J.S. Archer. Consistent mass matrix for distributed systems. Proc. Am. Soc. Civ. Eng., 89(ST4):161, 1963. R. Aris. Mathematical Modeling Techniques. Pitman, London, England, UK, 1978. D.N. Arnold, I. Babuska, and J. Osborn. Finite element methods: principles for their selection, Comput. Meth. Appl. Mech. Eng. 45:57-96, 1984. D.N. Arnold. Mixed finite element methods for elliptic problems. Comput. Meth. Appl. Mech. Eng., 82:281-300, 1990. M. Arnold. Stability of numerical methods for differential-algebraic equations of higher index. Appl. Numer. Math., 13:5-14, 1993. W. Arter and J.W. Eastwood. Particle-mesh schemes for advection dominated flows. J. Comput. Phys., 117:194-204, 1995. U.M. Ascher and L.R. Petzold. Projected implicit Runge-Kutta methods for differential- algebraic equations. SI AM J. Numer. Anal, 28(4): 1097-1120, 1991. U.M. Ascher, S.J. Ruuth and B.T.R. Wetton. Implicit-explicit methods for time- dependent partial differential equations. SIAM J. Numer. Anal., 32(3):797-823, 1995. 0. Axelsson and V.A. Barker. Finite Element Solution of Boundary Value Problems. Theory and Computation. Academic Press, Inc., Orlando, Florida, USA, 1984. A.K. Aziz (Ed.). The Mathematical Foundations of the Finite Element Method with Applications to Partial Differential Equations. Academic Press, New York, New York, USA, 1972. 1. Babuska. Error-bounds for finite element method. Numer. Math., 16:322-333, 1971. I. Babuska. The finite element method with Lagrangian multipliers. Numer. Math., 20:179-192, 1973. I. Babuska and M. Suri. The p and h-p versions of the finite element method, basic principles and properties. SIAM Rev., 36(4):578-632, 1994. I. Babuska. Courant Element: Before and After. 1994. I. Babuska, and J.T. Oden, Preface. Comput. Meth. Appl. Mech. Eng., 133:xi-xii, 1996. I. Babuska and R. Narasimhan. The Babuska-Brezzi condition and the patch test: an example. Comput. Meth. Appl. Mech. Eng., 140:183-199, 1997. I. Babuska, T. Strouboulis, S.K. 
Gangaraj and C.S. Upadhyay. Pollution error in the h- version of the finite element method and the quality of the recovered derivatives. Comput. Meth. Appl. Mech. Eng., 140:1-37, 1997. A.J. Baker. Finite Element Computational Fluid Mechanics. Hemisphere Publishing Corporation/McGraw-Hill Book Company, Washington/New York, USA, 1983. A.J. Baker and J.W. Kim. A Taylor weak-statement algorithm for hyperbolic conservation laws. Int. J. Numer. Meth. Fluids, 7:489-520, 1987. A.J. Baker and D.W. Pepper. Finite Elements 1-2-3. McGraw-Hill, Inc., New York, New York, USA, 1991. B.R. Baliga and S.V. Patankar. A new finite-element formulation for convection-diffusion problems. Numer. Heat Transfer, 3:393-409, 1980. R.E. Bank, B.D. Welfert and H. Yserentant. A class of iterative methods for solving saddle point problems, Numer. Math., 56:645-666, 1990. A.M. Baptista, E.E. Adams and P. Gresho. Benchmarks for the transport equation: The convection-diffusion forum and beyond. Quantitative Skill Assessment for Coastal Ocean Models Coastal and Estuarine Studies, 47:241 -268, American Geophysical Union, 1995. C. Bardos, M. Bercovier and O. Pironneau. The vortex method with finite elements. Math. Comput., 36(153): 119-136, 1981.
REFERENCES 963 M. Bar-Lev and H.T. Yang. Initial flow field over an impulsively started circular cylinder. J. Fluid Mech., 72(4):625-647, 1975. K.E. Barrett. A variational principle for the stream function-vorticity formulation of the Navier-Stokes equations incorporating no-slip conditions. J. Comput. Phys., 26:153-161, 1978. O.A. Basaran and D.W. DePaoli. Phys. Fluids, 6(9):2923, September 1994. G.K. Batchelor. An Introduction to Fluid Dynamics. Cambridge University Press, London, England, UK, 1967. S. Bates and B. Cathers. Analysis of spurious eigenmodes in finite element equations. Int. J. Numer. Meth. Fluids, 23:1131-1143, 1986. K.J. Bathe. Finite Element Procedures. Prentice-Hall, Englewood Cliffs, New Jersy, USA, 1996. R.M. Beam and R.F. Warming. Lecture Notes, 1981-82 Lecture Series Programme, Computational Fluid Dynamics, von Karman Institute for Fluid Dynamics, UK, 1982. Chap. "Implicit numerical methods for the compressible Navier-Stokes and Euler equations"; Waterloo, Belgium, March 29-April 2; J.A. Essers (Ed.). E.B. Becker, G.F. Carey and J.T. Oden. Finite elements. An Introduction. Vol. I. Prentice- Hall, Inc., Englewood Gliffs, New Jersey, USA, 1981. R. Becker and R. Rannacher. Finite element solution of the incompressible Navier- Stokes equations on anisotropically refined meshes. Technical Report 94-31, Universitat Heidelberg, Germany, 1994. M. Behr and T.E. Tezduyar. Finite element solution strategies for large-scale flow simulations. Comput. Meth. Appl. Mech. Eng., 112:3-24, 1994. B.C. Bell and K.S. Surana. p-version least squares finite element formulation for two- dimensional, incompressible, non-Newtonain isothermal and non-isothermal fluid flow. Int. J. Numer. Meth. Fluids, 18:127-162, 1994. J.B. Bell, P. Colella and H.M. Glaz. A second-order projection method for the incompressible Navier-Stokes equations. J. Comput. Phys., 85:(2)257-283, 1989. J.B. Bell, P. Colella and L.H. Howell. Proc. AIAA 10th Computational Fluid Dynamics Conf.. 
American Institute of Aeronautics and Astronautics (AIAA), New York, USA, 1991. Chap. "An efficient second-order projection method for viscous incompressible flow"; Honolulu, Hawaii, June 24-27, 1991. J.B. Bell and D.L. Marcus. Vorticity intensification and transition to turbulence in the three-dimensional Euler equations. Commun. Math. Phys., 147:371-394, 1992. J.B. Bell, A.S. Almgren and W.G. Szymczak. A numerical method for the incompressible Navier-Stokes equations based on an approximate projection. SIAM J. Sci. Comput., 17(2):358-369, March 1996. T. Belytschko and R. Mullen. On Dispersive Properties of Finite Element Solutions. John Wiley and Sons, Inc., New York, New York, USA, 1978, in Modern Problems in Elastic Wave Propagation; J. Miklowitz et ai, (Eds.). CM. Bender and S.A. Orszag. Advanced Mathematical Methods for Scientists and Engineers. McGraw-Hill, New York, New York, USA, 1978. J.P Benque, B. Ibler, A, Keramsi and G. Labadie. Proc. 3rd. Int. Conf. Finite Elements in Flow Problems, Vol. I, 1980. Chap. "A finite element method for Navier-Stokes equations"; p. 110-120; Banff, Alberta, Canada, June 10-13, 1980. J.P Benque, G. Labadie and J. Ronat. Finite Element Flow Analysis: Proc. 4th. Int. symp. Finite Element Methods in Flow Problems, 1982.
964 REFERENCES M. Bentwich and T. Miloh. Low Reynolds number flow due to impulsively started circular cylinder. J. Eng. Math., 16(1): 1-21, 1982. M. Bercovier. Information Processing 77. North-Holland, 1977. Chap. "A family of finite elements with penalisation for the numerical solution of Stokes and Navier-Stokes equations"; pp. 97-101; B. Gilchrist (Ed.). M. Bercovier and O. Pironneau. Proc. Numerical Methods in Laminar and Turbulent Flow. NUL, 1978. Chap. "Comparisons and error estimates for several finite elements for the numerical simulation of incompressible viscous flows"; Swansea, Wales, UK, July 18-21, 1978. M. Bercovier. Perturbation of Mixed Variational Problems. Application to mixed finite element methods. R.A.I.R.O. Analyse numerique/Numer. Anal, 12(3):211-236, 1978. M. Bercovier and M. Engelman. A finite element for the numerical solution of viscous incompressible flows. J. Comput. Phys., 30:181-201, 1979. M. Bercovier and O. Pironneau. Error estimates for finite element method solution of the Stokes problem in the primitive variables. Numer. Math., 33:211-224, 1979. M. Bercovier and O. Pironneau. Finite Element Flow Analysis: Proc. 4th. Int. Symp. Finite Element Methods in Flow Problems. Chap. "Characteristics and the finite element method"; pp. 67-73; Chuo University, Tokyo, Japan, July 26-29, 1982. M. Bercovier, O. Pironneau and V. Sastri. Finite elements and characteristics for some parabolic-hyperbolic problems. Appl. Math. Modelling, 7:89-96, April 1983. R. Bermejo. On the equivalence of semi-Lagrangian schemes and particle-in-cell finite element methods. Mon. Weather Rev., 118:979-987, April 1990. R. Bermejo. Analysis of an algorithm for the Galerkin-characteristic method. Numer. Math., 60:163-194, 1991. R. Bermejo and A. Staniforth. The conversion of semi-Lagrangian advection schemes to quasi-monotone schemes. Mon. Weather Rev., 120:2622-2632, November 1992. C. Bernardi, C. Canuto and Y. Maday. 
Generalized Inf-Sup Conditions for Chebyshev Spectral Approximation of the Stokes Problem. SIAM J. Numer. Anal., 25:(6)1237-1271, December 1988. C. Bernardi, C. Canuto, Y. Maday and B. Metivet. Single-grid spectral collocation for the Navier-Stokes equations. IMA J. Numer. Anal., 10:253-297, 1990. F.H. Bertrand, M.R. Gadbois and P.A. Tanguy. Tetrahedral elements for fluids flow. Int. J. Numer. Meth. Eng., 33:1251-1267, 1992. T. Bidot, S. Delaroff, J.M. Vanel, G. Monville and G. Pot. Proc. Basel World User's Day CFD. Chap. "Application of the N3S finite element code to numerical simulation around a peugeot car 405"; May, 1992. R.B. Bird, W.E. Stewart and E.N. Lightfoot. Transport Phenomena. John Wiley and Sons, Inc., New York, New York, USA, 1960. N.E. Bixler and R.E. Benner. Proc. Fourth Int. Cont. Numer. Meth. in Laminar and Turbulent Flow. Pineridge Press, Ltd, Swansea, Wales, UK, 1985. Chap. "Finite element analysis of axisymmetric oscillations of sessible liquid drops"; pp. 1325-1335; Swansea, Wales, UK, July 9-12, 1985. N.E. Bixler and L.E. Scriven. Downstream development of three-dimensional viscocap- illary film flow. Ind. Eng. Chem. Res., 26:475-483, 1987. H. Blasius. Grenzschichten in Fltissigkeiten kit Kleiner Reiburg. Zeit. Math. Phys. 56:1-37, 1908. P.B. Bochev and M.D. Gunzburger. Accuracy of least-squares methods for the Navier-Stokes equations. Comput. Fluids, 22(4/5):549-563, 1993.
REFERENCES 965 P.B. Bochev and M.D. Gunzburger. Least-squares methods for the velocity-pressure- stress formulation of the Stokes equations. Comput. Meth. Appl. Mech. Eng., 114:213, 1994. P.B. Bochev and M.D. Gunzburger. Analysis of least-squares finite element methods for the Stokes equations. Math. Comput., to appear; also Virginia Tech., Department of Mathematics and Interdisciplinary Center for Applied Mathematics, Blacksburg, Virginia 24061-0531, USA, 1996. J.M. Boland and R.A. Nicolaides. Stability of finite elements under divergence constraints. S1AMJ. Numer. Anal, 20(4):722-731, 1983. J.M. Boland and R.A. Nicolaides. On the stability of bilinear velocity-constant pressure finite elements. Numer. Math., 44:219-222, 1984. J.M. Boland and R.A. Nicolaides. Stable and semistable low order finite elements for viscous flows. SIAMJ. Numer. Anal., 22(3):474-492, 1985. R. Bouard and M. Coutanceau. The early stage of development of the wake behind an impulsively started cylinder for 40 < Re < 104. J. Fluid Mech., 101(3):583-607, 1980. J. Bramble and J. Pasciak. Iterative techniques for time dependent Stokes problems. In W. Habashi, editor. Solution Techniques for Large-Scale CFD Problems, pp. 201-216. John Wiley, 1995. K. Boukir, Y. Maday, B. Metivet and E. Razafindrakoto. A high order characteristics/finite element method for the incompressible Navier-Stokes equations. Int. J. Numer. Meth. Fluids, 1996. K.E. Brenan, S.L. Campbell, and L.R. Petzold. Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations. North-Holland, Elsevier Science Publishing Co., Inc., New York, New York, USA, 1989. K.E. Brenan, S.L. Campbell and L.R. Petzold. Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, USA, 1996. S.C. Brenner and L.R. Scott. The Mathematical Theory of Finite Element Methods. Springer-Verlag, Berlin, Germany, 1994. F. Brezzi. 
On the existence, uniqueness and approximation of saddle-point problems arising from Lagrangian multipliers. Revue Francaise d' Automatique, Informatique et Recherche Operationnelle (R.A.I.R.O.), R-2:129-151, 1974. F. Brezzi and J. Pitkaranta. On the Stabilisation of Finite Element Approximations of the Stokes Problem. Vieweg, Braunschweig, Germany, 1984, in Efficient Solutions of elliptic Systems; pp. 11-19; W. Hackbusch (Ed.). F. Brezzi and K.-J. Bathe. A discourse on the stability conditions for mixed finite element formulations. Comput. Meth. Appl. Mech. Eng., 82:27-57, 1990. F. Brezzi and M. Fortin. Mixed and Hybrid Finite Element Methods. Springer-Verlag, Inc., New York, New York, USA, 1991. F. Brezzi and R.S. Falk. Stability of higher-order Hood-Taylor methods. SIAM J. Numer. Anal., 28(3):581-590, 1991. M.O. Bristeau, R. Glowinski and J. Periaux. Numerical methods for the Navier-Stokes equations. Applications to the simulation of compressible and incompressible viscous flows. Comput. Phys. Reports, 6:73-187, 1987. M.O. Bristeau, R. Glowinski, B. Mantel, J. Periaux and P. Perrier. Finite Elements in Fluids—Vol. 6: Finite Elements and Flow Problems. John Wiley and Sons, Chichester,
966 REFERENCES England, UK, 1985. Chap. 1, "Numerical methods for incompressible and compressible Navier-Stokes problems"; pp. 1-40; R.H. Gallagher, G.F. Carey, J.T. Oden and O.C. Zienkiewicz (Eds.). I.N. Brohnshtein and K.A. Semendyayev. Handbook of Mathematics. Van Nostrand Rein- hold Company, New York, New York USA, 1985. A. Brooks and T.J.R. Hughes. Proc. Third Int. Conf. Finite Element Methods in Fluid Flow. 1980. Chap. "Streamline-upwind / petrov-Galerkin methods for advection dominated flows"; also Available from Division of Engineering and Applied Science, California Institute of Technology, Pasadena, California 91125, USA. A.N. Brooks and T.J.R. Hughes. Streamline upwind/Petrov-formulations for convection dominated flows with particular emphasis on the incompressible Navier-Stokes equations. Comput. Meth. Appl. Mech. Eng., 32:199-259, 1982. D.L. Brown and M.L. Minion. Performance of under-resolved two-dimensional incompressible flow simulations. J. Comput. Phys., 122:165-183, 1995. P.N. Brown, G.D. Byrne and A.C. Hindmarsh. VODE: A Variable-Coefficient ODE Solver. SIAMJ. Sci. Stat. Comput., 10:(5) 1038-1051, 1989. E.T. Bullister, G.E. Karniadakis, E.M. Ronquist and A.T. Patera. Proc. Sixth Int. Symp. Finite Element Methods in Flow Problems, Antibes, France, 1986. Chap. "Solution of the unsteady Navier-Stokes equations by spectral element methods"; pp. 225-230. E.T. Bullister. Development and Application of High Order Numerical Methods for Solution of the Three-dimensional Navier-Stokes Equations. Ph.D. Thesis, Massachusetts Institute of Technology, Department of Mechanical Engineering, Cambridge, Massachusetts, USA, 1986. D.S. Burnett. Finite Element Analysis: From Concepts to Application. Addison-Wesley Publishing Company, Reading, Massachusetts, USA, 1987. K. Burrage, J.C. Butcher and F.H. Chipman. An implementation of singly-implicit Runge-Kutta Methods. BIT, 20:326-340, 1980. J.C. Butcher. 
The Numerical Analysis of Ordinary Differential Equations: Runge-Kutta and General Linear Methods. John Wiley and Sons, Chichester, England, 1987. J. Cahouet and J.-P. Chabard. Some fast 3d finite element solvers for the generalized Stokes problem. Int. J. Numer. Meth. Fluids, 8:869-895, 1988. Z. Cai. On the finite volume element method. Numer. Math., 58:713-735, 1991. Z. Cai, J. Mandel and S. McCormick. The finite volume element method for diffusion equations on general triangulations. SIAM J. Numer. Anal., 28:(2)392-402, 1991. S.L. Campbell. Singular Systems of Differential Equations. Pitman Advanced Publishing Program, San Francisco, California, USA, 1980. A.-Campion Renson and M.J. Crochet. On the stream function-vorticity finite element solutions of Navier-Stokes equations. Int. J. Numer. Mech. Fluids, 12:1809, 1978. C. Canuto, M.Y. Hussaini, A. Quarteroni, and T.A. Zang. Spectral Methods in Fluid Dynamics. Springer-Verlag, Inc., New York, New York, USA, 1988a. C. Canuto, C. Bernardi and Y. Maday. Generalized inf-sup conditions for Chebyshev spectral approximation of the Stokes problem. SIAMJ. Numer. Anal., 25(6): 1237-1271, December 1988b. G.F. Carey. An analysis of finite element equations and mesh subdivision. Comput. Meth. Appl. Mech. Eng., 9:165-179, 1976. G.R. Carey. Derivative calculation from finite element solutions. Comput. Meth. Appl. Mech. Eng., 35:1-14, 1982.
G.F. Carey and J.T. Oden. Finite Elements: A Second Course. Vol. II. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1983.
G.F. Carey and J.T. Oden. Finite Elements: Computational Aspects. Vol. III. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1984.
G.F. Carey, S.S. Chow and M.K. Seager. Approximate boundary-flux calculations. Comput. Meth. Appl. Mech. Eng., 50:107-120, 1985.
G.F. Carey and J.T. Oden. Finite Elements: Fluid Mechanics. Vol. VI. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1986.
G.F. Carey and R. McLay. Local pressure oscillation and boundary treatment for the 8-node quadrilateral. Int. J. Numer. Meth. Fluids, 6:165-172, 1986.
G.F. Carey and B.N. Jiang. Least-squares finite elements for first-order hyperbolic systems. Int. J. Numer. Meth. Eng., 26:81-93, 1988.
G.F. Carey and Y. Shen. Convergence studies of least-squares finite elements for first-order systems. Commun. Appl. Numer. Methods, 5:427-434, 1989.
H.S. Carslaw and J.C. Jaeger. Conduction of Heat in Solids. Clarendon Press, Oxford, England, UK; 2nd edition, 1959.
M.S. Carvalho and L.E. Scriven. Numerical Methods in Laminar and Turbulent Flow, Pineridge Press, Swansea, Wales, UK, 1995, Vol. 9, Part II. Chap. "Flows between rigid and deformable rotating cylinders with free surfaces, inflow and outflow"; pp. 972-983; C. Taylor and P. Durbetaki (Eds.).
M.S. Carvalho and L.E. Scriven. Multiple states of a viscous free surface flow: transition from a pre-metered to a metering inflow. 24:813.
B. Cathers and B.A. O'Connor. The group velocity of some numerical schemes. 5:201-224, 1985.
M.A. Celia, T.F. Russell, I. Herrera and R.E. Ewing. An Eulerian-Lagrangian localized adjoint method for the advection-diffusion equation. Adv. Water Res., 13(4):187, 1990.
M.A. Celia, I. Herrera, R.E. Ewing and T.F. Russell. Eulerian-Lagrangian localized adjoint method: The theoretical framework. Numer. Meth. Partial Differential Equations, 9:431-457, 1993.
J.P. Chabard and I. King. Proc. XXIVth Biennial Congress Int. Assoc. for Hydraulic Research (AIRH). Chap. "An industrial application of the N3S finite element code"; Madrid, Spain, September 9-13, 1991.
D.J. Chaffin and A.J. Baker. On Taylor weak statement finite element methods for computational fluid dynamics. Int. J. Numer. Meth. Fluids, 21:273-294, 1995.
S.T. Chan, P.M. Gresho, R.L. Lee and C.D. Upson. Proc. AIAA, 5th Comp. Fluid Dynamics Conf., USA, 1981. Chap. "Simulation of three-dimensional, time-dependent, incompressible flows by a finite element method"; pp. 354-364; Palo Alto, California; also Lawrence Livermore National Laboratory, Livermore, California, UCRL-85226.
S.T. Chan and P.M. Gresho. Proc. 4th Int. Symp. Finite Element Methods in Flow Problems, Finite Element Flow Analysis. University of Tokyo Press, Tokyo, Japan, 1982. Chap. "Solution of the multi-dimensional, incompressible Navier-Stokes equations using low-order finite elements and one-point quadrature"; pp. 201-210; T. Kawai (Ed.).
S.T. Chan. Numerical simulations of LNG vapor dispersion from a fenced storage area. J. Hazard. Mater., 30:195-224, 1992.
C.-L. Chang and M.D. Gunzburger. A finite element method for first order elliptic systems in three dimensions. Appl. Math. Comput., 23:171-184, 1987.
C.L. Chang. Finite element approximation for grad-div type systems in the plane. SIAM J. Numer. Anal., 29(2):452-461, 1992.
C.L. Chang. Least-squares finite element methods for incompressible flow with zero residual for mass conservative law. SIAM J. Numer. Anal., 1996, to appear; also Cleveland State University, Department of Mathematics Research Report 94-50 (September 1994).
E.J. Chang and M.R. Maxey. Unsteady flow about a sphere at low to moderate Reynolds number. Part 2. Accelerated motion. J. Fluid Mech., 303:133-153, 1995.
M.W. Chang and B.A. Finlayson. On the proper boundary conditions for the thermal entry problem. Int. J. Numer. Meth. Eng., 15:935-942, 1980.
D. Chapelle and K.J. Bathe. The inf-sup test. Comput. Struct., 47(4/5):537-545, 1993.
R.T.S. Cheng. Numerical solution of the Navier-Stokes equations by the finite element method. Phys. Fluids, 15(12):2098, 1972.
R.C.Y. Chin, G.W. Hedstrom and K.E. Karlsson. A simplified Galerkin method of hyperbolic equations. Math. Comput., 33(146):571-586, 1979.
J.C. Chien. A general finite-difference formulation with application to Navier-Stokes equations. Comput. Fluids, 5:15-31, 1977.
P.N. Childs and K.W. Morton. Characteristic Galerkin methods for scalar conservation laws in one dimension. SIAM J. Numer. Anal., 27(3):553-594, 1990.
D.P. Chock and A.M. Dunker. A comparison of numerical methods for solving the advection equation. Atmos. Environ., 17(1):11-24, 1983.
D.P. Chock. A comparison of numerical methods for solving the advection equation—II. Atmos. Environ., 19(4):571-586, 1985.
A.J. Chorin. The numerical solution of the Navier-Stokes equations for an incompressible fluid. Bull. Am. Math. Soc., 73(6):928, 1967a.
A.J. Chorin. A numerical method for solving incompressible viscous flow problems. J. Comput. Phys., 2:12-26, 1967b.
A.J. Chorin. Numerical solution of incompressible flow problems. Stud. Numer. Anal., 2:64-71, 1968a.
A.J. Chorin. Numerical solution of the Navier-Stokes equations. Math. Comput., 22:745, 1968b.
A.J. Chorin. On the convergence of discrete approximations to the Navier-Stokes equations. Math. Comput., 23(106):341, 1969.
M.H. Chou and W. Huang. Numerical study of high-Reynolds-number flow past a bluff object. Int. J. Numer. Meth. Fluids, 23:711-732, 1996.
I. Christie, D.F. Griffiths, A.R. Mitchell and J.M. Sanz-Serna. Product approximation for non-linear problems in the finite element method. IMA J. Numer. Anal., 1:253-266, 1981.
K.N. Christodoulou and L.E. Scriven. The fluid mechanics of slide coating. J. Fluid Mech., 208:321-354, 1989.
C.-C. Chu, C.-C. Chang, C.-C. Liu and R.L. Chang. Suction effect on an impulsively started circular cylinder: Vortex structure. Phys. Fluids, 8(11):2995, 1996.
P.G. Ciarlet. The Finite Element Method for Elliptic Problems. North-Holland, Amsterdam, The Netherlands, 1978.
P.G. Ciarlet and J.L. Lions (Eds.). Handbook of Numerical Analysis, Vol. I. Finite Difference Methods (Part I), and Solution of Equations in R^n (Part I). North-Holland, Amsterdam, The Netherlands, 1990.
K.A. Cliffe. On conservative finite element formulations of the inviscid Boussinesq equations. Int. J. Numer. Meth. Fluids, 1(2):117, 1981.
R. Clift, J.R. Grace and M.E. Weber. Bubbles, Drops and Particles. Academic Press, New York, New York, USA, 1978.
W.M. Collins and S.C.R. Dennis. Flow past an impulsively started circular cylinder. J. Fluid Mech., 60(1):105-127, 1973a.
W.M. Collins and S.C.R. Dennis. The initial flow past an impulsively started circular cylinder. Quart. Journ. Mech. Appl. Math., XXVI, 1973b.
G. Comini and S. Del Giudice. A physical interpretation of conventional finite element formulations of conduction-type problems. Int. J. Numer. Meth. Eng., 32:559-569, 1991.
G. Comini, S. Del Giudice and C. Nonino. Energy balances in CVFEM and GFEM formulations of convection-type problems. Int. J. Numer. Meth. Eng., 35:709, 1992.
G. Comini, M. Malisan and M. Manzan. Accuracy comparison of control-volume and Galerkin finite-element methods for heat conduction problems. Numer. Heat Trans. Part B, 1996 (in press).
P. Constantin and C. Foias. Navier-Stokes Equations. The University of Chicago Press, Chicago, USA, 1989.
R.D. Cook, D.S. Malkus and M.E. Plesha. Concepts and Applications of Finite Element Analysis. John Wiley and Sons, Inc., New York, New York, USA, 3rd edition, 1989.
R. Courant, K.O. Friedrichs and H. Lewy. Über die partiellen Differenzengleichungen der mathematischen Physik. Mathematische Annalen, 100:32-74, 1928.
S.H. Crandall. Engineering Analysis: A Survey of Numerical Procedures. McGraw-Hill Book Company, Inc., New York, New York, USA, 1956.
M.J. Crochet, F.T. Geyling and J.J. Van Schaftingen. Numerical simulation of the horizontal Bridgman growth of a gallium arsenide crystal. J. Crystal Growth, 65:166-172, 1983.
M.J. Crochet, F.T. Geyling and J.J. Van Schaftingen. Finite element method for calculating the horizontal Bridgman growth of semiconductor crystals. J. Crystal Growth, 65:166-172, 1983.
M.J. Crochet, F.T. Geyling and J.J. Van Schaftingen. Finite Element Method for Calculating the Horizontal Bridgman Growth of Semiconductor Crystals, in Finite Elements in Fluids—Volume 6, John Wiley & Sons Ltd, Chichester, England, UK, 1985; pp. 321-339; R.H. Gallagher, G.F. Carey, J.T. Oden and O.C. Zienkiewicz (Eds.).
M. Crouzeix and P.-A. Raviart. Conforming and nonconforming finite element methods for solving the stationary Stokes equations I. Revue Française d'Automatique, Informatique et Recherche Opérationnelle (R.A.I.R.O.), R-3:33-76, December 1973.
W.P. Crowley. Numerical advection experiments. Mon. Weather Rev., 96(1), January 1968.
M.J.P. Cullen. On the use of artificial smoothing in Galerkin and finite difference solutions of the primitive equations. Q. J. R. Meteorol. Soc., 102:77-93, 1976.
M.J.P. Cullen. Numerical Methods Used in Atmospheric Models. World Meteorological Organization, Bracknell, England, UK, Vol. 2, 1979. GARP Publication Series No. 17.
M.J.P. Cullen and K.W. Morton. Analysis of evolutionary error in finite element and other methods. J. Comput. Phys., 34(2):245-267, 1980.
M.J.P. Cullen. The use of quadratic finite element methods and irregular grids in the solution of hyperbolic problems. J. Comput. Phys., 45:221-245, 1982.
E.L. Cussler. Diffusion: Mass Transfer in Fluid Systems. Press Syndicate of the University of Cambridge, New York, New York, USA, 1984.
C. Cuvelier, A. Segal and A. van Steenhoven. Finite Element Methods and Navier-Stokes Equations. D. Reidel Publishing Company, Dordrecht, The Netherlands, 1986.
G. Dahlquist and Å. Björck. Numerical Methods. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1974. Translated by Ned Anderson.
G. Dahlquist. On one-leg multistep methods. SIAM J. Numer. Anal., 20(6):1130-1138, 1983.
H. Daniels. PASTIS-3D: Finite Element Projection Algorithm Solver for Transient Incompressible Flow Simulations. Implementation Aspects and User's Manual, Version 1.0. Lawrence Livermore National Laboratory, Livermore, California, USA, UCRL-MA-111833, 1992.
H. Daniels. Proc. Finite Elements in Fluids. Pineridge Press Ltd., Swansea, Wales, UK, 1993. Chap. "PASTIS-3D—A new generation finite element code for the incompressible Navier-Stokes equations"; p. 338; K. Morgan, E. Oñate, J. Periaux, J. Peraire and O.C. Zienkiewicz (Eds.).
H.T. Davis, R.A. Novy and L.E. Scriven. A comparison of synthetic boundary conditions for continuous-flow systems. Chem. Eng. Sci., 46(1):57-68, 1991.
P.R. Dawson and D.F. McTigue. A numerical model for natural convection in fluid-saturated creeping porous media. Numer. Heat Transfer, 8:45-63, 1985.
P.R. Dawson. On modeling of mechanical property changes during flat rolling of aluminum. Int. J. Solids Struct., 23(7):947-968, 1987.
C. De Boor. Practical Guide to Splines. Springer-Verlag, New York, New York, USA, 1978. Applied Mathematical Sciences, Vol. 27.
Baptista de Melo A.E. Solution of Advection-Dominated Transport by Eulerian-Lagrangian Methods Using the Backwards Method of Characteristics. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA, 1987.
G. de Vahl Davis. A note on a mesh for use with polar coordinates. Numerical Heat Transfer, 2:261-266, 1979.
E. Dean and R. Glowinski. On some Finite Element Methods for the Numerical Solution of Incompressible Viscous Flow. Cambridge University Press, 1993, in Incompressible Computational Fluid Dynamics; pp. 17-65; M. Gunzburger and R. Nicolaides (Eds.).
M. Delfour, W. Hager and F. Trochu. Discontinuous Galerkin methods for ordinary differential equations. Math. Comput., 36(154):455-473, 1981.
M. Delfour, W. Hager and F. Trochu. Discontinuous Galerkin Methods for Ordinary and Non-Conservative Finite Element Formulations of Convection-Type Problems. Int. J. Numer. Meth. Eng., 35:709-727, 1992.
J.E. Dendy. Two methods of Galerkin type achieving optimum L^2 rates of convergence for first-order hyperbolics. SIAM J. Numer. Anal., 11:637, 1974.
J.J. Derby, L.J. Atherton, P.D. Thomas and R.A. Brown. Finite-element methods for analysis of the dynamics and control of Czochralski crystal growth. J. Sci. Comput., 2(4):297, 1987.
J. Donea. A Taylor-Galerkin method for convective transport problems. Int. J. Numer. Meth. Eng., 20:101-119, 1984.
J. Donea, S. Giuliani, H. Laval and L. Quartapelle. Time-accurate solution of advection-diffusion problems by finite elements. 45: 1984.
J. Donea, L. Quartapelle and V. Selmin. An analysis of time discretization in the finite element solution of hyperbolic problems. J. Comput. Phys., 70:463-499, 1987.
J. Donea and L. Quartapelle. An introduction to finite element methods for transient advection problems. Comput. Meth. Appl. Mech. Eng., 95:169-203, 1992.
J. Douglas Jr, T. Dupont and M.F. Wheeler. A Galerkin procedure for approximating the flux on the boundary for elliptic and parabolic boundary value problems. Revue Française d'Automatique, Informatique et Recherche Opérationnelle (R.A.I.R.O.), August 1974.
J. Douglas Jr and T.F. Russell. Numerical methods for convection-dominated diffusion problems based on combining the method of characteristics with finite element or finite difference procedures. SIAM J. Numer. Anal., 19(5):871-885, 1982.
J. Douglas and J. Wang. An absolutely stabilised finite element method for the Stokes problem. Math. Comp., 52:495-508, 1989.
J.K. Dukowicz and A.S. Dvinsky. Approximate factorization as a high order splitting for the implicit incompressible flow equations. J. Comput. Phys., 102:336-347, 1992.
S. Dupont and J.M. Marchal. Preconditioned conjugate gradients for solving the transient Boussinesq equations in three-dimensional geometries. Int. J. Numer. Meth. Fluids, 8:283-303, 1988.
D.R. Durran. The third-order Adams-Bashforth method: An attractive alternative to leapfrog time differencing. Mon. Weather Rev., 119:702-720, 1991.
A.S. Dvinsky and J.K. Dukowicz. Null-space-free methods for the incompressible Navier-Stokes equations on non-staggered curvilinear grids. Comput. Fluids, 22(6):685-696, 1993.
W. E and J.-G. Liu. Projection method I: Convergence and numerical boundary layers. SIAM J. Numer. Anal., 32(4):1017-1057, 1995.
W. E and J.-G. Liu. Vorticity boundary conditions and related issues for finite difference schemes. J. Comput. Phys., 124:368-382, 1996.
W. E and J.-G. Liu. Projection method II: Godunov-Ryabenki analysis. SIAM J. Numer. Anal., 33(4):1597-1621, 1996.
E.D. Eason. A review of least-squares methods for solving partial differential equations. Int. J. Numer. Meth. Eng., 10:1021-1046, 1976.
R. Easton. Homogeneous boundary conditions for pressure in the MAC method. J. Comput. Phys., 9:375-379, 1972.
J.W. Eastwood. The stability and accuracy of EPIC algorithms. Comput. Phys. Commun., 44:73-82, 1987.
B.E. Eaton. The Galerkin Finite Element Method Applied to Viscous Incompressible Flows. Ph.D. Thesis, University of Colorado, Department of Chemical Engineering, Boulder, Colorado, USA, 1983.
Y. Eguchi, G. Yagawa and L. Fuchs. A conjugate-residual-FEM for incompressible viscous flow analysis. Comput. Mech., 3:59-72, 1988.
H. Elman. Multigrid and Krylov subspace methods for the discrete Stokes equations. Technical Report UMIACS-TR-94-76, Institute for Advanced Computer Studies, University of Maryland, 1994.
M.S. Engelman, R.L. Sani and P.M. Gresho. The implementation of normal and/or tangential boundary conditions in finite element codes for incompressible fluid flow. Int. J. Numer. Meth. Fluids, 2:225-238, 1982a.
M.S. Engelman, R.L. Sani, P.M. Gresho and M. Bercovier. Consistent vs. reduced integration penalty methods for incompressible media using several old and new elements. Int. J. Numer. Meth. Fluids, 2:25-42, 1982b.
M.S. Engelman. Incompressible Computational Fluid Dynamics: Trends and Advances. Cambridge University Press, Cambridge, England, UK, 1993. Chap. 3, "CFD—An Industrial Perspective"; pp. 67-86; M.D. Gunzburger and R.A. Nicolaides (Eds.).
W.H. Enright and T.E. Hull. Numerical Methods for Differential Systems: Recent Developments in Algorithms, Software, and Applications. Academic Press, Inc., New York, New York, USA, 1976. Chap. "Comparing numerical methods for the solution of stiff systems of ODE's"; pp. 45-66; L. Lapidus and W.E. Schiesser (Eds.).
N. Ericsson. On the Stability of Pipe Flow. Master of Science Thesis. Chalmers University of Technology, Göteborg, Sweden, 1993.
K. Eriksson and C. Johnson. Adaptive Streamline Diffusion Finite Element Methods for Convection-Diffusion Problems. Department of Mathematics, Chalmers University of Technology, Göteborg, Sweden, No. 1990-18/ISSN 0347-2809, 1990.
D. Estep. A posteriori error bounds and global error control for approximation of ordinary differential equations. SIAM J. Numer. Anal., 32(1):1-48, 1995.
R.E. Ewing and T.F. Russell. Advances in Computer Methods for Partial Differential Equations—IV. IMACS, Rutgers University, New Brunswick, New Jersey, USA, 1981. Chap. "Multistep Galerkin methods along characteristics for convection-diffusion problems"; pp. 28-36; R. Vichnevetsky and R.S. Stepleman (Eds.).
R.E. Ewing, T.F. Russell and M.F. Wheeler. Simulation of miscible displacement using mixed methods and a modified method of characteristics. Society of Petroleum Engineers of AIME, Dallas, Texas, USA, SPE 12241, 1983.
M. Feistauer. Mathematical Methods in Fluid Dynamics. John Wiley and Sons, Inc., New York, New York, USA, 1993.
J.H. Ferziger and M. Peric. Computational Methods for Fluid Dynamics. Springer-Verlag, Berlin, Germany, 1996.
B.A. Finlayson and L.E. Scriven. The method of weighted residuals—A review. Appl. Mech. Rev., 19(9):735-748, 1966.
B.A. Finlayson. The Method of Weighted Residuals and Variational Principles, with Application in Fluid Mechanics, Heat and Mass Transfer. Academic Press, Inc., New York, New York, USA, 1972. Vol. 87 in Mathematics in Science and Engineering, R. Bellman (Ed.).
B.A. Finlayson. Stiff Computation. Oxford University Press, New York, New York, USA, 1985. Sec. 3.4, "Solution of stiff equations resulting from partial differential equations"; pp. 124-139; R.C. Aiken (Ed.).
B.A. Finlayson. Numerical Methods for Problems with Moving Fronts. Ravenna Park Publishing, Inc., Seattle, Washington, USA, 1992.
R.S. Fisk. On an oscillation phenomenon in the numerical solution of the diffusion-convection equations. SIAM J. Numer. Anal., 19(4):721-724, 1982.
G.J. Fix. Finite element models for ocean circulation problems. SIAM J. Appl. Math., 29(3):371-387, 1975.
G.J. Fix, M.D. Gunzburger and R.A. Nicolaides. Constructive Approaches to Mathematical Models. Academic Press, Inc., New York, New York, USA, 1979a. Chap. "Theory and applications of mixed finite element methods"; pp. 375-393.
G.J. Fix, M.D. Gunzburger and R.A. Nicolaides. On finite element methods of the least squares type. Comput. Math. Appl., 5:87-98, 1979b.
G.J. Fix, M.D. Gunzburger and R.A. Nicolaides. On mixed finite element methods for first order elliptic systems. Numer. Math., 37:29-48, 1981.
G.J. Fix, M.D. Gunzburger and J.S. Peterson. On finite element approximations of problems having inhomogeneous essential boundary conditions. Comput. Math. Appl., 9(5):687-700, 1983.
D.P. Flanagan and T. Belytschko. A uniform strain hexahedron and quadrilateral with orthogonal hourglass control. Int. J. Numer. Meth. Eng., 17:679-706, 1981.
C.A.J. Fletcher and K. Srinivas. Stream function vorticity revisited. Comput. Meth. Appl. Mech. Eng., 41:297-322, 1983.
C.A.J. Fletcher. Computational Techniques for Fluid Dynamics 1. Fundamental and General Techniques. Springer-Verlag, Berlin, Germany, 2nd edition, 1991a. Series: Springer Series in Computational Physics; R. Glowinski, M. Holt, P. Hut, H.B. Keller, J. Killeen, S.S. Orszag and V.V. Rusanov (Eds.).
C.A.J. Fletcher. Computational Techniques for Fluid Dynamics 2. Specific Techniques for Different Flow Categories. Springer-Verlag, Berlin, Germany, 2nd edition, 1991b. Series: Springer Series in Computational Physics; R. Glowinski, M. Holt, P. Hut, H.B. Keller, J. Killeen, S.S. Orszag and V.V. Rusanov (Eds.).
R. Fletcher and D.F. Griffiths. The generalized eigenvalue problem for certain unsymmetric band matrices. Linear Algebra Appl., 29:139-149, 1980.
Fluid Dynamics International, Inc. FIDAP 7.0: Fluid Dynamics Analysis Package: Theory Manual. Fluid Dynamics International, Inc., Evanston, Illinois, USA, revision 7.0, 1st edition, 1993.
B. Fornberg. On the instability of leap-frog and Crank-Nicolson approximations of a nonlinear partial differential equation. Math. Comput., 27(121):45-57, 1973.
B. Fornberg. A Practical Guide to Pseudospectral Methods. Cambridge University Press, Cambridge, England, UK, 1996.
M. Fortin, R. Peyret and R. Temam. Résolution numérique des équations de Navier-Stokes pour un fluide incompressible. J. Méc., 10(3):357-390, 1971.
M. Fortin. Calcul Numérique des Écoulements des Fluides de Bingham et des Fluides Newtoniens Incompressibles par la Méthode des Éléments Finis. Ph.D. Thesis, Université de Paris VI, Paris, France, 1972a.
M. Fortin. Numerical Methods in Fluid Dynamics. 1972b. Chap. "Numerical solution of steady state Navier-Stokes equations"; J.J. Smolderen (Ed.); AGARD Lecture Series No. 48, AGARD-LS-48.
M. Fortin. An analysis of the convergence of mixed finite element methods. R.A.I.R.O. Analyse numérique/Numer. Anal., 11(4):341-354, 1977.
M. Fortin. Old and new finite elements for incompressible flows. Int. J. Numer. Meth. Fluids, 1:347-364, 1981.
M. Fortin. Short communication: Two comments on: Consistent vs reduced integration penalty methods for incompressible media using several old and new elements. Int. J. Numer. Meth. Fluids, 3:93-98, 1983.
M. Fortin and M. Soulie. A non-conforming piecewise quadratic finite element on triangles. Int. J. Numer. Meth. Eng., 19:505-520, 1983.
M. Fortin and R. Glowinski. Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems. North-Holland, Amsterdam, The Netherlands, 1983. Series: Studies in Mathematics and Its Applications, Vol. 15; J.L. Lions, G. Papanicolaou, R.T. Rockafellar and H. Fujita (Eds.).
M. Fortin. A three-dimensional quadratic nonconforming element. Numer. Math., 46:269-279, 1985.
M. Fortin and A. Fortin. Finite Elements in Fluids. John Wiley and Sons, Inc., Chichester, England, UK. Vol. 6, 1985a. Chap. 7, "Newer and newer elements for incompressible flow"; pp. 171-187; R.H. Gallagher, G.F. Carey, J.T. Oden and O.C. Zienkiewicz (Eds.).
M. Fortin and A. Fortin. Experiments with several elements for viscous incompressible flows. Int. J. Numer. Meth. Fluids, 5:911-928, 1985b.
M. Fortin and A. Fortin. A generalization of Uzawa's algorithm for the solution of the Navier-Stokes equations. Commun. Appl. Numer. Meth., 1:205-208, 1985c.
A. Fortin. On the imposition of a flowrate by an augmented Lagrangian method. Commun. Appl. Numer. Meth., 4:835-841, 1988.
M. Fortin and R. Pierre. Stability analysis of discrete generalised Stokes problems. Numer. Meth. Partial Differential Equations, 8:303-323, 1992.
L. Franca and R. Stenberg. Error analysis of some Galerkin least square methods for the elasticity equations. SIAM J. Numer. Anal., 28:1680-1697, 1991.
L.P. Franca, S.L. Frey and T.J.R. Hughes. Stabilized finite element methods: I. Application to the advective-diffusive model. Comput. Meth. Appl. Mech. Eng., 95:253-276, 1992.
L. Franca and S. Frey. Stabilised finite element methods: II. The incompressible Navier-Stokes equations. Comput. Meth. Appl. Mech. Eng., 99:209-233, 1992.
L. Franca and A. Madureira. Element diameter free stability parameters for stabilized methods applied to fluids. Comput. Meth. Appl. Mech. Eng., 105:395-403, 1993.
L. Franca, T.J.R. Hughes and R. Stenberg. Stabilized finite element methods. Cambridge University Press, 1993, in Incompressible Computational Fluid Dynamics; pp. 87-107; M. Gunzburger and R. Nicolaides (Eds.).
I. Fried. Finite element analysis of incompressible material by residual energy balancing. Int. J. Solids Struct., 10:993-1002, 1974.
I. Fried and D.S. Malkus. Finite element mass matrix lumping by numerical integration with no convergence rate loss. Int. J. Solids Struct., 11:461-466, 1975.
I. Fried. On a deficiency in unconditionally stable explicit time-integration methods in elastodynamics and heat transfer. Comput. Meth. Appl. Mech. Eng., 46:195-200, 1984.
E.O. Frind. Solution of the advection-dispersion equation with free exit boundary. Numer. Meth. Partial Diff. Equations, 4:301-313, 1988.
U. Frisch and S.A. Orszag. Turbulence: challenges for theory and experiment. Phys. Today, p. 24, January 1990.
G.P. Galdi. An Introduction to the Mathematical Theory of the Navier-Stokes Equations, Vol. I. Linearized Steady Problems. Springer-Verlag, New York, New York, USA, 1994a.
G.P. Galdi. An Introduction to the Mathematical Theory of the Navier-Stokes Equations, Vol. II. Nonlinear Steady Problems. Springer-Verlag, New York, New York, USA, 1994b.
R.H. Gallagher, G.F. Carey, J.T. Oden and O.C. Zienkiewicz (Eds.). Finite Elements in Fluids—Vol. 6: Finite Elements and Flow Problems. John Wiley and Sons Ltd, Chichester, England, UK, 1985.
J. Gary. Nonlinear Instability. World Meteorological Organization, 1979. Chap. 10 of Numerical Methods Used in Atmospheric Models. Vol. II.
D.K. Gartling. Some comments on the paper by Heinrich, Huyakorn, Zienkiewicz and Mitchell. Int. J. Numer. Meth. Eng., 12:187-191, 1978.
D.K. Gartling. NACHOS II—A Finite Element Computer Program for Incompressible Flow Problems, Part I. Theoretical Background. Sandia National Laboratories, Albuquerque, New Mexico, USA, SAND86-1816; UC-32 edition, 1987.
D.K. Gartling and R.E. Hogan. Coyote II—A Finite Element Computer Program for Nonlinear Heat Conduction Problems, Part I—Theoretical Background. Sandia Report, SAND 94-1173-uc-905, 1994.
C.W. Gear. Numerical Initial Value Problems in Ordinary Differential Equations. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1971.
M. Gellert and R. Harbord. Symmetric forms for finite element analysis of the Navier-Stokes problem. Comput. Fluids, 15(4):379-389, 1987.
J.P. Gerrity Jr. A note on the computational stability of the two-step Lax-Wendroff form of the advection equation. Mon. Weather Rev., 100(1):72-73, 1972.
V. Girault and P.-A. Raviart. Finite Element Methods for Navier-Stokes Equations. Theory and Algorithms. Springer-Verlag, Berlin, Germany, 1986.
V. Girault. Incompressible finite element methods for Navier-Stokes equations with nonstandard boundary conditions in R^3. Math. Comput., 51(183):55-74, July 1988a.
V. Girault. Curl-conforming Finite Element Methods for Navier-Stokes Equations with Non-standard Boundary Conditions in R^3. Université Pierre et Marie Curie, Centre National de la Recherche Scientifique, 1988b.
R. Glowinski and O. Pironneau. Numerical methods for the first biharmonic equation and for the two-dimensional Stokes problem. SIAM Rev., 21(2):167, April 1979.
R. Glowinski. Numerical Methods for Nonlinear Variational Problems. Springer-Verlag, New York, New York, USA, 1984.
R. Glowinski. Vistas in Applied Mathematics. Optimization Software, New York, New York, USA, 1986. Chap. "Splitting methods for the numerical solution of the incompressible Navier-Stokes equations"; p. 57; A.V. Balakrishnan, A.A. Dorodnitsyn and J.L. Lions (Eds.).
R. Glowinski. In Vortex Dynamics and Vortex Methods. American Mathematical Society, Providence, Rhode Island, USA, 1991. Chap. "Finite element methods for the numerical simulation of incompressible viscous flow. Introduction to the control of the Navier-Stokes equations"; pp. 219-301; Lectures in Appl. Math., Vol. 28; C. Anderson and C. Greengard (Eds.).
M.B. Goldschmit and E.N. Dvorkin. On the solution of the steady convection-diffusion equation using quadratic elements: A generalized Galerkin technique also reliable with distorted meshes. Eng. Comput., 11:565-573, 1994.
S. Goldstein and L. Rosenhead. Boundary layer growth. Proc. Cambridge Phil. Soc., 32:392-401, 1936.
G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, Maryland, USA, 1983.
J.W. Goodrich and W.Y. Soh. Time-dependent viscous incompressible Navier-Stokes equations: The finite difference Galerkin formulation and streamfunction algorithms. J. Comput. Phys., 84:207-241, 1989.
D. Gottlieb and S.A. Orszag. Numerical Analysis of Spectral Methods: Theory and Applications. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, USA, 1977.
G. Goudreau and J. Hallquist. Recent developments in large-scale finite element Lagrangian hydrocode technology. Comput. Meth. Appl. Mech. Eng., 33(1-3):725, 1982.
A.R. Gourlay. A note on trapezoidal methods for the solution of initial value problems. Math. Comput., 24(111):629-633, 1970.
A. Grammeltvedt. A survey of finite-difference schemes for the primitive equations for a barotropic fluid. Mon. Weather Rev., 97(5):384-404, 1969.
W.G. Gray and G.F. Pinder. On the relationship between the finite element and finite difference methods. Int. J. Numer. Meth. Eng., 10:893-923, 1976.
W.G. Gray. Proc. Finite Elements in Water Resources. Pentech Press, London, England, UK, 1977. Chap. "An efficient finite element scheme for two-dimensional surface water computation"; pp. 4-33; W. Gray, G. Pinder and C. Brebbia (Eds.).
P.M. Gresho and S. Chan. Solving the incompressible Navier-Stokes equations using consistent mass and a pressure Poisson equation; UCRL-99406. ASME Symposium on Recent Developments in CFD, Chicago, 95:51-75, August 1988.
P.M. Gresho, R.L. Lee and R.L. Sani. Proc. 2nd Int. Symp. Finite Element Methods in Flow Problems. International Centre for Computer Aided Design (ICCAD), 1976. Chap. "Advection-dominated flows, with emphasis on the consequences of mass lumping"; pp. 745-756; Santa Margherita Ligure, Italy, June 14-18, 1976.
P.M. Gresho, R.L. Lee and R.L. Sani. Finite Elements in Fluids. John Wiley and Sons, Inc., New York, USA, Vol. 3, 1978a. Chap. 19, "Advection-dominated flows, with emphasis on the consequences of mass lumping"; p. 335.
P.M. Gresho, R.L. Lee, R.L. Sani and T.W. Stullich. On the Time-dependent FEM Solution of the Incompressible Navier-Stokes Equations in Two- and Three-Dimensions. Preprint. Lawrence Livermore National Laboratory, Livermore, California, USA; UCRL-81323, 1978b.
P.M. Gresho. Comments on a recent paper by Emery et al.: A comparison of some of the thermal characteristics of finite-element and finite-difference calculations of transient problems. Numer. Heat Trans., 2:519-520, 1979.
P.M. Gresho and R.L. Lee. Finite Element Methods for Convection Dominated Flows. AMD, The American Society of Mechanical Engineers, New York, New York, USA, 1979. Chap. 3, "Don't suppress the wiggles—they're telling you something!"; pp. 37-61; T.J.R. Hughes (Ed.).
P.M. Gresho, R.L. Lee, S.T. Chan and J.M. Leone Jr. A New Finite Element for Boussinesq Fluids. Preprint. Lawrence Livermore National Laboratory, Livermore, California, USA; UCRL-82842, 1979.
P.M. Gresho, R.L. Lee and R.L. Sani. Recent Advances in Numerical Methods in Fluids. Pineridge Press Ltd, Swansea, Wales, UK. Vol. 1, 1980a. Chap. 2, "On the time-dependent solution of the incompressible Navier-Stokes equations in two and three dimensions"; pp. 27-79; C. Taylor and K. Morgan (Eds.).
P.M. Gresho, R.L. Lee, S.T. Chan and R.L. Sani. Solution of the time-dependent Navier-Stokes and Boussinesq equations using the Galerkin finite element method, in Approximation Methods for Navier-Stokes Problems, Proceedings of the Symposium Held by IUTAM at the University of Paderborn, Germany, September 9-15, 1979. Springer-Verlag, Berlin, Germany, 1980b; pp. 203-222; R. Rautmann (Ed.). Series: Lecture Notes in Mathematics, Vol. 771. A. Dold and B. Eckmann (Eds.).
P.M. Gresho and R.L. Lee. Don't suppress the wiggles—they're telling you something! Comput. Fluids, 9:223-253, 1981.
P.M. Gresho, S.T. Chan, R.L. Lee and C.D. Upson. Proc. 22nd Num. Meth. Laminar and Turbulent Flow, Pineridge Press Ltd., Swansea, Wales, UK, 1981a. Chap. "Solution of the time-dependent, three-dimensional incompressible Navier-Stokes equations via FEM"; pp. 27-39; C. Taylor and B. Schrefler (Eds.); Venice, Italy.
P.M. Gresho, R.L. Lee and R.L. Sani. The Consistent Method for Computing Derived Boundary Quantities when the Galerkin FEM is used to Solve Thermal and/or Fluids Problems. Preprint. Lawrence Livermore National Laboratory, Livermore, California, USA; UCRL-85366, 1981b.
P.M. Gresho and J.M. Leone Jr. Proc. 5th Int. Conf. on Finite Element Methods. Springer-Verlag, Burlington, Vermont, USA, 1984. Chap. "Another attempt to overcome the bent element blues"; pp. 667-683; June, 1984; also Lawrence Livermore National Laboratory, Livermore, California, USA, UCRL-90449.
P.M. Gresho, R.L. Lee and R.L. Sani. Proc. 5th Int. Symposium on Finite Elements in Flow Problems, TICOM, USA, 1984a. Chap. "Further studies on equal-order interpolation for Navier-Stokes"; pp. 143-148; Austin, Texas, USA; also Lawrence Livermore National Laboratory, Livermore, California, UCRL-89094.
P.M. Gresho, S.T. Chan, R.L. Lee and C.D. Upson. A modified finite element method for solving the time-dependent, incompressible Navier-Stokes equations, Part 1: Theory. Int. J. Numer. Meth. Fluids, 4:557-598, 1984b.
P.M. Gresho, S.T. Chan, R.L. Lee and C.D. Upson. A modified finite element method for solving the time-dependent, incompressible Navier-Stokes equations, Part 2: Applications. Int. J. Numer. Meth. Fluids, 4:619-640, 1984c.
P.M. Gresho and S.T. Chan. Proc. Int. Conf. Num. Meth. in Laminar and Turbulent Flow. Pineridge Press Ltd, Swansea, Wales, UK, 1985. Chap. "A new semi-implicit method for solving the time-dependent conservation equations for incompressible flow"; pp. 3-21; Swansea, Wales, UK, July 9-12, 1985.
P.M. Gresho, C. Taylor, M.D. Olson and W.G. Habashi. Proc. Int. Conf. Numer. 
Meth. in Laminar and Turbulent Flow. Part 2. Pineridge Press, Swansea, Wales, UK, 1985.
P.M. Gresho and R.L. Lee. Comments on 'the group velocity of some numerical schemes'. Int. J. Numer. Meth. Fluids, 7:1357-1362, 1987.
P.M. Gresho, R.L. Lee, R.L. Sani, M.K. Maslanik and B.E. Eaton. The consistent Galerkin FEM for computing derived boundary quantities in thermal and/or fluids problems. Int. J. Numer. Meth. Fluids, 7:371-394, 1987.
P.M. Gresho and R.L. Sani. On pressure boundary conditions for the incompressible Navier-Stokes equations. Int. J. Numer. Meth. Fluids, 7:1111-1145, 1987.
P.M. Gresho and R.L. Sani. Finite Elements in Fluids. John Wiley and Sons Ltd, Chichester, England, UK. Vol. 7, 1988. Chap. 7, "On pressure boundary conditions for the incompressible Navier-Stokes equations"; pp. 123-157; R.H. Gallagher, R. Glowinski, P.M. Gresho, J.T. Oden, and O.C. Zienkiewicz (Eds.).
P.M. Gresho and S.T. Chan. On the theory of semi-implicit projection methods for viscous incompressible flow and its implementation via a finite element method that also introduces a nearly consistent mass matrix. Part 2: Implementation. Int. J. Numer. Meth. Fluids, 11(5):621-660, 1990.
P.M. Gresho. Comments on 'a conjugate-residual-FEM for incompressible viscous flow analysis' by Y. Eguchi, G. Yagawa and L. Fuchs. Comput. Mech., 6:203-204, 1990a.
P.M. Gresho. On the theory of semi-implicit projection methods for viscous incompressible flow and its implementation via a finite element method that also
introduces a nearly consistent mass matrix, Part 1: Theory. Int. J. Numer. Meth. Fluids, 11(5):587-620, 1990b.
P.M. Gresho. Annual Review of Fluid Mechanics, Vol. 23. Annual Reviews, Inc., Palo Alto, California, USA, 1991a. Chap. "Incompressible fluid dynamics: some fundamental formulation issues"; pp. 413-453.
P.M. Gresho. Proc. Fourth Int. Symp. Computational Fluid Dynamics: A Collection of Technical Papers, Vol. I. University of California, Davis, Davis, California, USA, 1991b. Chap. "A summary report on the 14 July 91 minisymposium on outflow boundary conditions for incompressible flow"; pp. 436-442.
P.M. Gresho. Some current CFD issues relevant to the incompressible Navier-Stokes equations. Comput. Meth. Appl. Mech. Eng., 87:201-252, 1991c.
P.M. Gresho. Advances in Applied Mechanics. Academic Press, Inc., New York, New York, USA, Vol. 28, 1992. Chap. "Some interesting issues in incompressible fluid dynamics, both in the continuum and in numerical simulation"; pp. 45-140.
P.M. Gresho, D.K. Gartling, J.R. Torczynski, K.A. Cliffe, K.H. Winters, T.J. Garratt, A. Spence and J.W. Goodrich. Is the steady viscous incompressible two-dimensional flow over a backward-facing step at Re = 800 stable? Int. J. Numer. Meth. Fluids, 17:501-541, 1993.
P.M. Gresho, S.T. Chan, M.A. Christon, and A.C. Hindmarsh. A little more on stabilized Q1Q1 for transient viscous incompressible flow. Int. J. Numer. Meth. Fluids, 21:837-856, 1995.
P.M. Gresho and S.T. Chan. Proc. Sixth Int. Symp. Comput. Fluid Dyn. Chap. "An Update on Projection Methods for Transient Incompressible Viscous Flow"; Vol. I, p. 389; Lake Tahoe, Nevada, USA, September 4-8, 1995.
P.M. Gresho and R.L. Sani. Problems and solutions (generalized and FEM) related to rapid and impulsive changes for incompressible flows, in Computational Fluid Dynamics Review 1997. John Wiley and Sons, Inc., New York, New York, USA, 1998; M. Hafez and K. Oshima (Eds.).
D.F. Griffiths. Proc. 
Sixth Canadian Congress of Applied Mechanics. Chap. "The construction of approximately divergence-free finite elements"; Vancouver, British Columbia, Canada, May 29-June 3, 1977.
D.F. Griffiths and J. Lorenz. An analysis of the Petrov-Galerkin finite element method. Comput. Meth. Appl. Mech. Eng., 1977.
D.F. Griffiths. Mathematics of Finite Elements and Applications III. Academic Press, 1979a. Chap. "The construction of approximately divergence-free finite elements"; pp. 237-245; J.W. Whiteman (Ed.).
D.F. Griffiths. Finite elements for incompressible flow. Math. Meth. Appl. Sci., 1:16-31, 1979b.
D.F. Griffiths, I. Christie and A.R. Mitchell. Analysis of error growth for explicit difference schemes in conduction-convection problems. Int. J. Numer. Meth. Eng., 15:1075-1081, 1980.
D.F. Griffiths. An approximately divergence-free 9-node velocity element (with variations) for incompressible flows. Int. J. Numer. Meth. Fluids, 1:323-346, 1981.
D.F. Griffiths. Numerical Methods for Fluid Dynamics. Academic Press, London, England, UK, 1982. Chap. "The effect of pressure approximations on finite element calculations of incompressible flows"; pp. 359-374; K.W. Morton and M.J. Baines (Eds.).
D.F. Griffiths and J.M. Sanz-Serna. On the scope of the method of modified equations. SIAM J. Sci. Stat. Comput., 7(3):994-1008, 1986.
D.F. Griffiths. Numerical Analysis 1987. Longman Science and Technology, Pitman Research Notes in Mathematics. Vol. 170, 1988. Chap. "The dynamics of some linear multistep methods with step-size control"; pp. 115-134; D.F. Griffiths and G.A. Watson (Eds.).
D.F. Griffiths. Discretised Eigenvalues Problems, LBB Constants and Stabilization. Longman Scientific & Technical, Pitman Research Notes in Mathematics. Vol. 334, 1996. D.F. Griffiths and G.A. Watson (Eds.).
D.F. Griffiths. The 'no boundary condition' outflow boundary condition. Int. J. Numer. Meth. Fluids, 24:393-412, 1997.
D. Griffiths and D. Silvester. Unstable Modes of the Q1-P0 Element. University of Manchester, Manchester, England, UK, technical report NA-257, 1994.
W.D. Gropp and D.E. Keyes. Domain decomposition methods in computational fluid dynamics. Int. J. Numer. Meth. Fluids, 14:147-165, 1992.
J.-L. Guermond and C. Tenaud. Proc. ECCOMAS 94. John Wiley and Sons Ltd, Chichester, England, UK, 1994. Chap. "Error analysis and numerical tests for the approximation of unsteady incompressible viscous flow by means of projection methods".
J.-L. Guermond. Sur l'approximation des équations de Navier-Stokes instationnaires par une méthode de projection. C.R. Acad. Sci. Paris, 319:887-892, 1994. Série I.
J.-L. Guermond and L. Quartapelle. Proc. Ninth Int. Conf. Finite Elements in Fluids: New Trends and Applications, Part 1. 1995. Chap. "Unconditionally stable finite-element method for the unsteady Navier-Stokes equations"; pp. 367-376; M.M. Cecchi, K. Morgan, J. Periaux, B.A. Schrefler and O.C. Zienkiewicz (Eds.); Venice, Italy, October 15-21, 1995.
J.-L. Guermond and L. Quartapelle. Calculation of incompressible viscous flows by an unconditionally stable projection FEM. J. Comput. Phys., 1996. 
submitted; also Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (Notes et Documents LIMSI) No. 95-06 and No. 95-14; Orsay, France, May, 1995.
D. Gunzburger and A. Nicolaides. Incompressible Computational Fluid Dynamics Trends and Advances. Cambridge University Press, Cambridge, UK, 1993.
M.D. Gunzburger. Finite Element Methods for Viscous Incompressible Flows: A Guide to Theory, Practice, and Algorithms. Academic Press, Inc., Boston, Massachusetts, USA, 1989.
M.D. Gunzburger, M. Mundt and J.S. Peterson. Computational Methods for Viscous Flows, Vol. 4, 1990. Chap. "Experiences with finite element methods for the velocity-vorticity formulation of three-dimensional viscous incompressible flows"; pp. 231-271; C.A. Brebbia (Ed.).
M.D. Gunzburger and R.A. Nicolaides. Incompressible Computational Fluid Dynamics Trends and Advances. Cambridge University Press, Cambridge, UK, 1993.
K.K. Gupta and J.L. Meek. A brief history of the beginning of the finite element method. Int. J. Numer. Meth. Eng., 39:3761-3774, 1996.
K. Gustafson and R. Hartman. Divergence-free bases for finite element schemes in hydrodynamics. SIAM J. Numer. Anal., 20(4):697-721, 1983.
K. Gustafson and R. Hartman. Graph theory and fluid dynamics. SIAM J. Alg. Disc. Methods, 6(4):643-656, 1985.
W.G. Habashi and G.G. Youngson. Letter to the editor: Discussion on article by S. Ramadhyani and S.V. Patankar. Int. J. Numer. Meth. Eng., 15:1740-1742, 1980.
M. Hafez, J. Dacles and M. Soliman. Proc. 11th Int. Conf. Num. Methods in Fluid Dynamics, Springer-Verlag, Berlin, Germany, 1989. Chap. "A velocity/vorticity method for viscous incompressible flow calculations"; p. 288; Series: Lecture Notes in Physics; Vol. 323; D.L. Dwoyer and M.Y. Hussaini (Eds.).
T. Hagstrom. Conditions at the downstream boundary for simulations of viscous, incompressible flow. SIAM J. Sci. Stat. Comput., 12(4):843-858, 1991.
E. Hairer. Unconditionally stable explicit methods for parabolic equations. Numer. Math., 35:57-68, 1980.
E. Hairer, S.P. Nørsett and G. Wanner. Solving Ordinary Differential Equations I: Nonstiff Problems. Springer-Verlag, Berlin, Germany, 1987.
E. Hairer, C. Lubich and M. Roche. The Numerical Solution of Differential-Algebraic Systems by Runge-Kutta Methods. Springer-Verlag, Berlin, Germany, 1989. Series: Lecture Notes in Mathematics; Vol. 1409; A. Dold, B. Eckmann and F. Takens (Eds.).
E. Hairer and G. Wanner. Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems. Springer-Verlag, Berlin, Germany, 1991.
P. Hansbo and A. Szepessy. A velocity-pressure streamline diffusion finite element method for the incompressible Navier-Stokes equations. Comput. Meth. Appl. Mech. Eng., 84:175-192, 1990.
P. Hansbo. The characteristic streamline diffusion method for convection-diffusion problems. Comput. Meth. Appl. Mech. Eng., 96:239, 1992a.
P. Hansbo. The characteristic streamline diffusion method for the time-dependent incompressible Navier-Stokes equations. Comput. Meth. Appl. Mech. Eng., 99:171-186, 1992b.
J. Happel and H. Brenner. Low Reynolds Number Hydrodynamics: with Special Applications to Particulate Media. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1965.
R. Harbord and M. Gellert. 
Progress in symmetric formulation of the incompressible Navier-Stokes equations. Comput. Meth. Appl. Mech. Eng., 83:201-209, 1990.
R. Harbord and M. Gellert. A simple least-squares method for FE analysis of the Navier-Stokes problem. Comput. Mech., 8:19-24, 1991.
F.H. Harlow and J.E. Welch. Numerical calculation of time-dependent viscous incompressible flow of fluid with free surface. Phys. Fluids, 8(12):2182, 1965.
V. Haroutunian, M.S. Engelman and I. Hasbani. Segregated finite element algorithms for the numerical solution of large-scale incompressible flow problems. Int. J. Numer. Meth. Fluids, 17:323-348, 1993.
R.L. Hartman and K. Gustafson. Quantum Mechanics in Mathematics, Chemistry, and Physics. Plenum Publishing Corporation, USA, 1981. Chap. "On the dimension of a finite difference approximation to divergence-free vectors"; pp. 125-131; K.E. Gustafson and W.P. Reinhardt (Eds.).
Y. Hasbani, E. Livne, and M. Bercovier. Finite elements and characteristics applied to advection-diffusion equation. Comput. Fluids, 11(2):71-83, 1983.
G. Hauke and T.J.R. Hughes. A unified approach to compressible and incompressible flows. Comput. Meth. Appl. Mech. Eng., 113:389-395, 1994.
L.J. Hayes, S.V. Krishnamachari, and T.F. Russell. A finite element alternating-direction method combined with a modified method of characteristics for convection-diffusion problems. SIAM J. Numer. Anal., 26(6):1462-1473, 1989.
F. Hecht. Analysis of Laminar Flow over a Backward Facing Step, A GAMM-Workshop. Friedr. Vieweg und Sohn, Braunschweig, Germany, 1984. Notes on Numerical Fluid Mechanics; Vol. 9. Chap. "Use of divergence free basis in finite elements methods"; pp. 290-316; K. Morgan, J. Periaux, and F. Thomasset (Eds.).
G.W. Hedstrom. The Galerkin method based on Hermite cubics. SIAM J. Numer. Anal., 16(3):385-393, 1979.
A.F. Hegarty, J.J.H. Miller, E. O'Riordan and G.I. Shishkin. Special meshes for finite difference approximations to an advection-diffusion equation with parabolic layers. J. Comput. Phys., 117:47-54, 1995.
J.G. Heywood. The Navier-Stokes equations: On the existence, regularity, and decay of solutions. Indiana Univ. Math. J., 29(4):639-681, 1980.
J.G. Heywood and R. Rannacher. Finite element approximation of the nonstationary Navier-Stokes problem. I. Regularity of solutions and second-order error estimates for spatial discretization. SIAM J. Numer. Anal., 19(2):275-311, 1982.
J.G. Heywood and R. Rannacher. Finite element approximation of the nonstationary Navier-Stokes problem, Part II: Stability of solutions and error estimates uniform in time. SIAM J. Numer. Anal., 23(4):750-777, 1986a.
J.G. Heywood and R. Rannacher. An analysis of stability concepts for the Navier-Stokes equations. Journal für die reine und angewandte Mathematik (Crelles Journal), Band 372, 1986b.
J.G. Heywood and R. Rannacher. Finite element approximation of the nonstationary Navier-Stokes problem, Part III: Smoothing property and higher-order error estimates for spatial discretization. SIAM J. Numer. Anal., 25(3):489-512, 1988.
J.G. Heywood and R. Rannacher. Finite-element approximation of the nonstationary Navier-Stokes problem. Part IV: Error analysis for second-order time discretization. SIAM J. Numer. Anal., 27(2):353-384, 1990.
J.G. Heywood, R. Rannacher and S. Turek. Artificial boundaries and flux and pressure conditions for the incompressible Navier-Stokes equations. Int. 
J. Numer. Meth. Fluids, 22:325-352, 1996.
B.G. Higgins. Downstream development of two-dimensional viscocapillary film flow. Ind. Eng. Chem., Fundam., 21:168-173, 1982.
A.C. Hindmarsh. GEAR: Ordinary Differential Equation System Solver. Lawrence Livermore National Laboratory, Livermore, California, USA, UCID-30001, rev. 1, computer documentation, 1972.
A.C. Hindmarsh. Numerical Solution of ODE's; Lecture Notes. Lawrence Livermore National Laboratory, Livermore, California, USA, UCID-16558, 1974.
A.C. Hindmarsh. On a Finite Element Algorithm for Ordinary Differential Equations. Numerical Mathematics Group, Lawrence Livermore National Laboratory, Livermore, California, USA, technical memorandum No. 75-4, 1975.
A.C. Hindmarsh. On Numerical Methods for Stiff Differential Equations—Getting the Power to the People. Lawrence Livermore National Laboratory, Livermore, California, USA, UCRL-83259, 1979.
A.C. Hindmarsh. Scientific Computing. Vol. I of IMACS Transactions on Scientific Computation. North-Holland, Amsterdam, The Netherlands, 1983. Chap. "ODEPACK, a systematized collection of ODE solvers"; pp. 55-64; R.S. Stepleman et al. (Eds.).
A.C. Hindmarsh, P.M. Gresho and D.F. Griffiths. The stability of explicit Euler time-integration for certain finite difference approximations of the multi-dimensional advection-diffusion equation. Int. J. Numer. Meth. Fluids, 4:853-897, 1984.
A.C. Hindmarsh and L.R. Petzold. Numerical Methods for Solving Ordinary Differential Equations and Differential/Algebraic Equations. Energy Tech. Rev., September:23-36, 1988.
A.C. Hindmarsh and L.R. Petzold. Algorithms and software for ordinary differential equations and differential-algebraic equations, Part I: Euler methods and error estimation. Comput. Phys., 9(1):34-41, Jan./Feb. 1995a.
A.C. Hindmarsh and L.R. Petzold. Algorithms and software for ordinary differential equations and differential-algebraic equations, Part II: Higher-order methods and software packages. Comput. Phys., 9(2):148-155, Mar./Apr. 1995b.
E. Hinton, T. Rock and O.C. Zienkiewicz. A note on mass lumping and related processes in the finite element method. Earthquake Eng. Struct. Dyn., 4:245-249, 1976.
C. Hirsch. Numerical Computation of Internal and External Flows. Vol. I: Fundamentals of Numerical Discretization. John Wiley and Sons Ltd., Chichester, England, UK, 1988.
C.W. Hirt, B.D. Nichols and N.C. Romero. SOLA—A Numerical Solution Algorithm for Transient Fluid Flows. Los Alamos Scientific Laboratory, Los Alamos, New Mexico, USA, 1975, UC-34 and UC-79d.
L.-W. Ho. A Legendre Spectral Element Method for Simulation of Incompressible Unsteady Viscous Free-Surface Flows. Massachusetts Institute of Technology, Department of Mechanical Engineering, Cambridge, Massachusetts, USA, 1989. Ph.D. Thesis.
L.-W. Ho and A.T. Patera. A Legendre spectral element method for simulation of unsteady incompressible viscous free-surface flows. Comput. Meth. Appl. Mech. Eng., 80:355-366, 1990.
L.-W. Ho, Y. Maday, A.T. Patera and E.M. Rønquist. A high-order Lagrangian-decoupling method for the incompressible Navier-Stokes equations. Comput. Meth. Appl. Mech. Eng., 80:65-90, 1990.
P. Hood and C. 
Taylor. Proc. Finite Element Methods in Flow Problems. University of Alabama Press, Alabama, USA, 1974. Chap. "Navier-Stokes equations using mixed interpolation"; J.T. Oden, R.H. Gallagher, O.C. Zienkiewicz, and C. Taylor (Eds.); Swansea, Wales (January 1974).
N.A. Hookey, B.R. Baliga, and C. Prakash. Evaluation and enhancements of some control volume finite-element methods—Part 1. Convection-diffusion problems. Numer. Heat Trans., 14:255-272, 1988.
N.A. Hookey and B.R. Baliga. Evaluation and enhancements of some control volume finite-element methods—Part 2. Incompressible fluid flow problems. Numer. Heat Trans., 14:273-293, 1988.
E. Hopf. Über die Anfangswertaufgabe für die hydrodynamischen Grundgleichungen. Mathematische Nachrichten, 4, Sept. 1950/1951. Also available as 'On the Initial Value Problem for the Fundamental Equations of Hydrodynamics,' translated by P.P. Weidhaas, Lawrence Livermore National Laboratory, Livermore, California, USA, UCRL-Trans-12144 (December 1986).
T.J.R. Hughes. Unconditionally stable algorithms for nonlinear heat conduction. Comput. Meth. Appl. Mech. Eng., 10:135-139, 1977.
T.J.R. Hughes and A. Brooks. Finite Element Methods for Convection Dominated Flows. AMD, The American Society of Mechanical Engineers, New York, New York, USA,
1979. Chap. 2, "A multi-dimensional upwind scheme with no crosswind diffusion"; pp. 19-35; T.J.R. Hughes (Ed.).
T.J.R. Hughes, W.K. Liu and A. Brooks. Finite element analysis of incompressible viscous flows by the penalty function formulation. J. Comput. Phys., 30:1-60, January 1979a.
T.J.R. Hughes, K.S. Pister and R.L. Taylor. Implicit-explicit finite elements in nonlinear transient analysis. Comput. Meth. Appl. Mech. Eng., 17/18:159-182, 1979b.
T.J.R. Hughes and A. Brooks. Finite Elements in Fluids. John Wiley & Sons Ltd, Chichester, England, UK. Vol. 4, 1982. Chap. "A theoretical framework for Petrov-Galerkin methods with discontinuous weighting functions: application to the streamline-upwind procedure"; p. 47; R.H. Gallagher, D.H. Norrie, J.T. Oden and O.C. Zienkiewicz (Eds.).
T.J.R. Hughes. Analysis of Transient Algorithms with Particular Reference to Stability Behavior, in Computational Methods for Transient Analysis. North-Holland, Amsterdam, The Netherlands, 1983; pp. 67-156; T. Belytschko and T.J.R. Hughes (Eds.).
T.J.R. Hughes, L. Franca and M. Balestra. A new finite element formulation for CFD: V. Circumventing the Babuška-Brezzi condition: a stable Petrov-Galerkin formulation of the Stokes problem accommodating equal-order interpolations. Comput. Meth. Appl. Mech. Eng., 59:85-99, 1986.
T.J.R. Hughes. The Finite Element Method: Linear Static and Dynamic Finite Element Analysis. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1987.
T.J.R. Hughes and L. Franca. A new finite element formulation for CFD: VII. The Stokes problem with various well-posed boundary conditions: Symmetric formulations that converge for all velocity/pressure spaces. Comput. Meth. Appl. Mech. Eng., 65:85-96, 1987.
T.J.R. Hughes and L.P. Franca. A new finite element formulation for computational fluid dynamics: VII. The Stokes problem with various well-posed boundary conditions: Symmetric formulations that converge for all velocity/pressure spaces. Comput. Meth. Appl. Mech. 
Eng., 65:85-96, 1987.
T.J.R. Hughes, L.P. Franca and M. Mallet. A new finite element formulation for computational fluid dynamics: VI. Convergence analysis for linear time-dependent multidimensional advective-diffusive systems. Comput. Meth. Appl. Mech. Eng., 63:97-112, 1987.
T.J.R. Hughes. Finite Elements in Fluids, Vol. 7. John Wiley and Sons Ltd, Chichester, England, UK, 1988.
T.J.R. Hughes, R.L. Taylor and J.F. Levy. Proc. 2nd Int. Symp. Finite Element Methods in Flow Problems. International Center for Computer Aided Design (ICCAD), 1976. Chap. "A finite element method for incompressible viscous flows"; p. 3; Santa Margherita Ligure, Italy, June 14-18, 1976, Conference Series No. 2/76.
T.J.R. Hughes, L.P. Franca, and G.M. Hulbert. A new finite element formulation for computational fluid dynamics: VIII. The Galerkin/least-squares method for advective-diffusive equations. Comput. Meth. Appl. Mech. Eng., 73:173-189, 1989.
A.G. Hutton and R.M. Smith. Numerical Methods in Laminar and Turbulent Flow. Pineridge Press, Swansea, Wales, UK, 1981. Chap. "On the finite element simulation of incompressible turbulent flow in general two-dimensional geometries"; C. Taylor and B.A. Schrefler (Eds.); also CEGB Report RD/B/5010N81.
P.S. Huyakorn, C. Taylor, R.L. Lee and P.M. Gresho. A comparison of various mixed-interpolation finite elements in the velocity-pressure formulation of the Navier-Stokes equations. Comput. Fluids, 6:25-35, 1978.
S.R. Idelsohn and E. Oñate. Finite volumes and finite elements: Two 'good friends'. Int. J. Numer. Meth. Eng., 37:3323-3341, 1994.
A.M. Il'in. Differencing scheme for a differential equation with a small parameter affecting the highest derivative. Math. Notes Acad. Sci. USSR, 6:596, 1969.
B. Irons. Finite Element Techniques, Proceedings of a Seminar at the University of Southampton, April 1970. p. 328; H. Tottenham and C. Brebbia (Eds.). Southampton, England, UK, 1970.
B. Irons and S. Ahmad. Techniques of Finite Elements. Ellis Horwood Ltd, Chichester, England, UK, 1980.
C.P. Jackson and K.A. Cliffe. Mixed interpolation in primitive variable finite element formulations for incompressible flow. Int. J. Numer. Meth. Eng., 17:1659-1688, 1981.
J.D. Jackson. Classical Electrodynamics. John Wiley and Sons, Inc., New York, New York, USA, 2nd edition, 1975.
G. James and R.C. James (Eds.). Mathematics Dictionary. D. Van Nostrand Company, Inc., Princeton, New Jersey, USA, multilingual edition, 1959.
L. Janvier, B. Metivet, R. Mgouni, G. Pot and E. Razafindrakoto. Numerical Simulation of 3-D Incompressible Unsteady Viscous Laminar Flows: A GAMM-Workshop. Friedr. Vieweg und Sohn, Braunschweig, Germany, 1992. Chap. "A 3-D driven cavity flow simulation with N3S code"; pp. 67-78; Series: Notes on Numerical Fluid Mechanics, Vol. 36; M. Deville, T.-H. Le and Y. Morchoisne (Eds.).
O.K. Jensen. An automatic timestep selection scheme for reservoir simulation. Proc. 55th Annual Fall Technical Conf. and Exhibition of SPE of AIME, SPE 9373, Dallas, Texas, USA, September 21-24, 1980.
O.K. Jensen and B.A. Finlayson. A numerical technique for tracking sharp fronts in studies of tertiary oil-recovery pilots. SPE Reservoir Eng., 1:194-202, March 1986.
S. 
Jensen and M. Vogelius. Divergence stability in connection with the p-version of the finite element method. M2AN, 24:737-764, 1990.
D.C. Jespersen. Arakawa's method is a finite-element method. J. Comput. Phys., 16:383-390, 1974.
B.-N. Jiang and L.A. Povinelli. Least-squares finite element method for fluid dynamics. Comput. Meth. Appl. Mech. Eng., 81:13-37, 1990.
B.-N. Jiang and L.A. Povinelli. Optimal least-squares finite element method for elliptic problems. Comput. Meth. Appl. Mech. Eng., 102:199-212, 1993.
B.-N. Jiang. Non-oscillatory and non-diffusive solution of convection problems by the iteratively reweighted least-squares finite element method. J. Comput. Phys., 105(1):108-121, 1993.
B.-N. Jiang, C.Y. Loh, and L.A. Povinelli. Theoretical Study of the Incompressible Navier-Stokes Equations by the Least-Squares Method. Lewis Research Center, Cleveland, Ohio, USA, NASA technical memorandum 106535 and ICOMP-94-04, March 1994.
B.-N. Jiang, T.L. Lin and L.A. Povinelli. Large-scale computation of incompressible viscous flow by least-squares finite element method. Comput. Meth. Appl. Mech. Eng., 114:213-231, 1994.
C. Johnson and J. Pitkaranta. Analysis of some mixed finite element methods related to reduced integration. Math. Comput., 38(158):375-400, 1982.
C. Johnson, U. Navert, and J. Pitkaranta. Finite element methods for linear hyperbolic problems. Comput. Meth. Appl. Mech. Eng., 45:285-312, 1984.
C. Johnson. Numerical Solution of Partial Differential Equations by the Finite Element Method. Cambridge University Press, Cambridge, England, UK, 1987.
C. Johnson. Error estimates and adaptive time-step control for a class of one-step methods for stiff ordinary differential equations. SIAM J. Numer. Anal., 25(4):908-926, 1988.
C. Johnson. A new approach to algorithms for convection problems which are based on exact transport + projection. Comput. Meth. Appl. Mech. Eng., 100:45-62, 1992.
C. Johnson, R. Rannacher and M. Boman. Numerics and hydrodynamic stability: Toward error control in computational fluid dynamics. SIAM J. Numer. Anal., 32(4):1058-1079, August 1995.
S.L. Josse and B.A. Finlayson. Reflections on the numerical viscoelastic flow problem. J. Non-Newtonian Fluid Mech., 16:13-36, 1984.
J.P. Gerrity Jr. A note on the computational stability of the two-step Lax-Wendroff form of the advection equation. Mon. Weather Rev., 100(1):72-73, 1972.
H. Kardestuncer and D.H. Norrie (Eds.). Finite Element Handbook. McGraw-Hill, Inc., New York, New York, USA, 1987.
G. Em Karniadakis, S.A. Orszag, E.M. Rønquist and A.T. Patera. Spectral Element and Lattice Gas Methods for Incompressible Fluid Dynamics, in Incompressible Computational Fluid Dynamics Trends and Advances. Cambridge University Press, Cambridge, England, UK, 1993. p. 203; M.D. Gunzburger and R.A. Nicolaides (Eds.).
G.E. Karniadakis, M. Israeli and S.A. Orszag. High-order splitting methods for the incompressible Navier-Stokes equations. J. Comput. Phys., 97(2):414-443, 1991.
N. Kechkar and D. Silvester. Analysis of locally stabilised mixed finite element methods for the Stokes problem. Math. Comp., 58:1-10, 1992.
H. Keller. Numerical Solution of Partial Differential Equations II. Academic Press, Inc., New York, New York, USA, 1971. Chap. 
"A new finite difference scheme for the incompressible advection-diffusion equation"; p. 327; B. Hubbard (Ed.).
R. Keunings. An algorithm for the simulation of transient viscoelastic flows with free surfaces. J. Comput. Phys., 62:199-220, 1986.
A. Khelifa, J.-L. Robert and Y. Ouellet. A Douglas-Wang finite element approach for transient advection-diffusion problems. Comput. Meth. Appl. Mech. Eng., 110:113-129, 1993.
H. Kheshgi and M. Luskin. Analysis of the finite element variable penalty method for Stokes equations. Math. Comput., 45(172):347-363, 1985.
H.S. Kheshgi and L.E. Scriven. Penalty-Finite Element Methods in Mechanics. AMD, The American Society of Mechanical Engineers, New York, New York, USA, 1982. Chap. "Finite element analysis of incompressible viscous flow by a variable penalty function method"; pp. 67-74; AMD-Vol. 51; J.N. Reddy (Ed.).
H.S. Kheshgi and L.E. Scriven. Finite Elements in Fluids. John Wiley and Sons Ltd, Chichester, England, UK. Vol. 5, 1984. Chap. 19, "Penalty finite element analysis of unsteady free surface flows"; pp. 393-434; R.H. Gallagher, J.T. Oden, O.C. Zienkiewicz, T. Kawai and M. Kawahara (Eds.).
H.S. Kheshgi and L.E. Scriven. Variable penalty method for finite element analysis of incompressible flow. Int. J. Numer. Meth. Fluids, pp. 785-803, 1985.
N. Kikuchi, J.T. Oden and Y.J. Song. Penalty-finite element methods for the analysis of Stokesian flows. Comput. Meth. Appl. Mech. Eng., 31:297-329, 1982.
J. Kim and P. Moin. Application of a fractional-step method to incompressible Navier-Stokes equations. J. Comput. Phys., 59:308-323, 1985.
I. King, C. Bratu, B. Delbast, A. Besson and J.P. Chabard. Proc. Europec 90. Society of Petroleum Engineers, 1990. Chap. "Hydraulic optimization of PDC bits"; The Hague, The Netherlands, October 22-24, 1990.
I. Kinnmark. The Shallow Water Wave Equations: Formulation, Analysis and Application. Springer-Verlag, Berlin, Germany, 1986.
I.P.E. Kinnmark and W.G. Gray. Stability and accuracy of spatial approximations for wave equation tidal models. J. Comput. Phys., 60:447-466, 1985.
S.F. Kistler and L.E. Scriven. The teapot effect—sheet-forming flows with deflection, wetting and hysteresis. J. Fluid Mech., 263:19-62, March 1994.
S.P. Kjaran and S.T. Sigurdsson. Treatment of time derivative and calculation of flow when solving groundwater flow problems by Galerkin finite element methods. Adv. Water Resources, 4, March 1981.
P. Kloucek and F.S. Rys. Stability of the fractional step θ-scheme for the nonstationary Navier-Stokes equations. SIAM J. Numer. Anal., 31(5):1312-1335, 1994.
J.B. Knox. Numerical errors in the time integration of advective processes. J. Geophys. Res., 66(12):4177-4186, 1961.
R.P. Koopman, D.L. Ermak and S.T. Chan. A review of recent field tests and mathematical modelling of atmospheric dispersion of large spills of denser-than-air gases. Atmos. Environ., 23(4):731-745, 1989.
H. Kreiss and J. Oliger. Methods for the Approximate Solution of Time Dependent Problems. World Meteorological Organization, 1973. Vol. 136 in Pure and Applied Mathematics.
H.-O. Kreiss and J. Lorenz. Initial-Boundary Value Problems and the Navier-Stokes Equations. Academic Press, Inc., Boston, Massachusetts, USA, 1989.
S.V. Krishnamachari, L.J. Hayes and T.F. Russell. A finite element alternating-direction method combined with a modified method of characteristics for convection-diffusion problems. SIAM J. Numer. 
Anal., 26(6):1462-1473, 1989.
D. Kwak et al. A three-dimensional incompressible Navier-Stokes flow solver using primitive variables. AIAA Journal, 24(3):390-396, 1986.
Y.-K. Kwok and K.-K. Tam. Stability analysis of three-level difference schemes for initial-boundary problems for multidimensional convective-diffusion equations. Commun. Numer. Methods Eng., 9:595-605, 1993.
M. Křížek and P. Neittaanmäki. On superconvergence techniques. Acta Appl. Math., 9(3):175-198, 1987.
P. Labbe and A. Garon. A robust implementation of Zienkiewicz and Zhu's local patch recovery method. Commun. Numer. Methods Eng., 11:427-434, 1995.
F. Ladeinde and K.E. Torrance. Galerkin finite element simulations of convection driven by rotation and gravitation. Int. J. Numer. Meth. Fluids, 10(1), January 1990.
O.A. Ladyshenskaya. The Mathematical Theory of Viscous Incompressible Flow. Gordon and Breach Science Publishers, Inc., New York, New York, USA, 2nd edition, 1969.
O.A. Ladyshenskaya. Annual Review of Fluid Mechanics. Vol. 7, 1975. Chap. "Mathematical analysis of Navier-Stokes equations for incompressible fluids"; p. 249; M. Van Dyke, W. Vincenti and J. Wehausen (Eds.).
A. Lafon and H.C. Yee. Dynamical approach study of spurious steady-state numerical solutions for nonlinear differential equations. Part III: The effects of nonlinear source
terms and boundary conditions in reaction-convection equations. Int. J. Comput. Fluid Dyn., 6:1, 1996. J.D. Lambert. Numerical Methods for Ordinary Differential Systems: The Initial Value Problem. John Wiley and Sons Ltd, Chichester, England, UK, 1991. L.D. Landau and E.M. Lifshitz. Fluid Mechanics. Pergamon Press, Oxford, England, UK, 1959. W.E. Langlois. Slow Viscous Flow. The Macmillan Company/Collier-Macmillan Limited, New York/London, 1964. L. Lapidus and W.E. Schiesser. Numerical Methods for Differential Systems. Academic Press Inc., New York, New York, USA, 1976. B.E. Larock and L.R. Herrmann. Proc. First Int. Conf. Finite Elements in Water Resour., Pentech Press, London, England, UK, 1977. Chap. "Improved Flux Prediction Using Low Order Finite Elements"; pp. 1.103-1.114; Princeton, New Jersey, USA, July, 1976. P. Lax and B. Wendroff. Systems of conservation laws. Commun. Pure Appl. Math., XIII:217-237, 1960. P.D. Lax and B. Wendroff. On the stability of difference schemes with variable coefficients. Commun. Pure Appl. Math., 15:363-371, 1962. P.D. Lax and B. Wendroff. Difference schemes for hyperbolic equations with high order of accuracy. Commun. Pure Appl. Math., XVII:381-398, 1964. H. Le and P. Moin. An improvement of fractional step methods for the incompressible Navier-Stokes equations. J. Comput. Phys., 92:369-379, 1991. L.G. Leal. Laminar Flow and Convective Transport Processes: Scaling Principles and Asymptotic Analysis. Butterworth-Heinemann, Boston, Massachusetts, USA, 1992. R.L. Lee, P.M. Gresho and R.L. Sani. Smoothing techniques for certain primitive variable solutions of the Navier-Stokes equations. Int. J. Numer. Meth. Eng., 14:1785-1804, 1979. R.L. Lee, S.T. Chan, P.M. Gresho and C.D. Upson. Proc. AIAA, 5th Comp. Fluid Dynamics Conf., USA, 1981. Chap. "Simulation of three-dimensional, time-dependent, incompressible flows by a finite element method"; pp. 
354-364; Palo Alto, California; also Lawrence Livermore National Laboratory, Livermore, California, UCRL-85226. R.L. Lee, P.M. Gresho, S.T. Chan, R.L. Sani, and M.J.P. Cullen. Finite Elements in Fluids. John Wiley and Sons Ltd, Chichester, England, UK. Vol. 4, 1982. Chap. 2, "Conservation laws for primitive variable formulations of the incompressible flow equations using the Galerkin finite element method," p. 21; R.H. Gallagher, D.H. Norrie, J.T. Oden, and O.C. Zienkiewicz (Eds.). R.L. Lee, P.M. Gresho and R.L. Sani. An Exploratory Study on the Application of an Existing Finite Element Navier-Stokes Code to Compute Potential Flows. Preprint. Lawrence Livermore National Laboratory, Livermore, California, USA; UCRL-86684, 1982. R.L. Lee and J.M. Leone Jr. A modified finite element model for mesoscale flows over complex terrain. Comput. Math. Appl., 16(12):41-56, 1988. R.L. Lee, P.M. Gresho, S.T. Chan and C.D.Y. Upson. A Three-dimensional, Finite Element Method for Simulating Heavier-than-air Gaseous Releases over Variable Terrain, in Air Pollution Modeling and Its Application II. 1983; pp. 555-573; C. deWispelaere (Ed.).
R.L. Lee, P.M. Gresho and R.L. Sani. Proc. Summer Computer Simulation Conference. Chap. "A comparative study of certain finite-element and finite-difference methods in advection-diffusion simulations"; pp. 37-42; Washington, D.C., USA; April 1976. Y.-S. Lee and P.R. Dawson. Obtaining residual stresses in metal forming after neglecting elasticity on loading. J. Appl. Mech., 111, June 1989. C.E. Leith. Methods in Computational Physics: Advances in Research and Applications. Academic Press, New York, New York, USA. Applications in hydrodynamics, Vol. 4, 1965. Chap. "Numerical simulation of the earth's atmosphere"; pp. 1-28; B. Alder, S. Fernbach and M. Rotenberg (Eds.). C.E. Leith. Diffusion approximation for two-dimensional turbulence. Phys. Fluids, 11:671-673, 1969. B.P. Leonard. A stable and accurate convective modelling procedure based on quadratic upstream interpolation. 1979a, 19:59-98. B.P. Leonard. Finite Element Methods for Convection Dominated Flows. AMD, The American Society of Mechanical Engineers, New York, New York, USA, 1979b. Chap. 1, "A survey of finite differences of opinion on numerical muddling of the incomprehensible defective confusion equation"; pp. 1-17; T.J.R. Hughes (Ed.). B.P. Leonard and S. Mokhtari. Beyond first-order upwinding: The ultra-sharp alternative for non-oscillatory steady-state simulation of convection. Int. J. Numer. Meth. Eng., 30:729-766, 1990. B.P. Leonard and S. Mokhtari. ULTRA-SHARP solution of the Smith-Hutton problem. Int. J. Numer. Meth. Heat Fluid Flow, 2:407-427, 1992. J.M. Leone Jr., P.M. Gresho, S.T. Chan and L. Lee. A note on the accuracy of Gauss-Legendre quadrature in the finite element method. Int. J. Numer. Meth. Eng., 14:769-773, 1979. J.M. Leone Jr. Finite Element Simulations of Stratified Flow over Simple Geometrical Obstructions and Arbitrarily Complex Terrain. Iowa State University, Ames, Iowa, USA, 1980. Ph.D. Thesis. J.M. Leone Jr. and P.M. Gresho. 
Finite Element Simulations of Steady, Two-Dimensional Viscous Incompressible Flow over a Step. J. Comput. Phys., 41(1):167, 1981. J.M. Leone Jr., P.M. Gresho, R.L. Lee and R.L. Sani. Proc. Int. Conf. Numer. Meth. Laminar and Turbulent Flow. Pineridge Press Ltd, Swansea, Wales, UK, 1983. Chap. "Flow-Through Boundary Conditions for Time-Dependent, Buoyancy-Influenced Flow Simulations Using Low Order Finite Elements"; pp. 1-13; Swansea, Wales, UK, August 8-11, 1983. J.M. Leone Jr. and R.L. Lee. Numerical simulation of drainage flow in Brush Creek, Colorado. J. Appl. Meteorol., 28(6):530-542, 1989. M. Lesieur. Turbulence in Fluids: Stochastic and Numerical Modelling. Kluwer Academic, Boston, Massachusetts, USA, 1987. Series: Mechanics of Fluids and Transport Processes. F.S. Lien and M.A. Leschziner. A general non-orthogonal collocated finite volume algorithm for turbulent flow at all speeds incorporating second-moment turbulence-transport closure, Part 1: Computational implementation. Comput. Meth. Appl. Mech. Eng., 114:123-148, 1994. M.J. Lighthill. Group velocity. J. Inst. Math. Appl., 1:1-28, 1965. D.K. Lilly. On the computational stability of numerical solutions of time-dependent nonlinear geophysical fluid dynamics problems. Mon. Weather Rev., 93(1):11-25, 1965.
S.P. Lin and M. Tobak. Spectral stability of Taylor's vortex array. Phys. Fluids, 29(10):3477, October 1986. J. Linden, G. Lonsdale, B. Steckel and K. Stüben. Multigrid for the steady-state incompressible Navier-Stokes equations: a survey. Technical Report 322, GMD, 1988. J.L. Lions. AGARD Lecture Series No. 48: Numerical Methods in Fluid Dynamics. North Atlantic Treaty Organization (NATO) Advisory Group for Aerospace Research and Development (AGARD), 1972. Chap. "On the numerical approximation of some equations arising in hydrodynamics"; pp. 9-21; J.J. Smolderen (Ed.). J. Liou, O. Pironneau and T. Tezduyar. Characteristic-Galerkin and Galerkin/least-squares space-time formulations for the advection-diffusion equation with time-dependent domains. Comput. Meth. Appl. Mech. Eng., 100:117-141, 1992. W.K. Liu and Y.F. Zhang. Unconditionally stable implicit-explicit algorithms for coupled thermal stress waves. Comput. Struct., 17(3):371-374, 1983; see also Int. J. Numer. Meth. Eng., 20(9):1581, 1984. W.K. Liu, T. Belytschko and Y.F. Zhang. Partitioned rational Runge Kutta for parabolic systems. Int. J. Numer. Meth. Eng., 20:1581-1597, 1984. W.K. Liu, J.S.-J. Ong and R.A. Uras. Finite element stabilization matrices - a unification approach. Comput. Meth. Appl. Mech. Eng., 53:13-46, 1985. T.P. Loc. Numerical analysis of unsteady secondary vortices generated by an impulsively started circular cylinder. J. Fluid Mech., 100(1):111-128, 1980. T.P. Loc and R. Bouard. Numerical solution of the early stage of the unsteady viscous flow around a circular cylinder. A comparison with experimental visualization and measurements. J. Fluid Mech., 160:93-117, 1985. P.M. Lovalenti and J.F. Brady. The hydrodynamic force on a rigid particle undergoing arbitrary time-dependent motion at small Reynolds number. J. Fluid Mech., 256:561-605, 1993. P.M. Lovalenti and J.F. Brady. 
The temporal behaviour of the hydrodynamic force on a body in response to an abrupt change in velocity at small but finite Reynolds number. J. Fluid Mech., 293:35-46, 1995. C. Lubich, E. Hairer and M. Roche. The Numerical Solution of Differential-Algebraic Systems by Runge-Kutta Methods. Springer-Verlag, Berlin, Germany, 1989. Series: Lecture Notes in Mathematics, Vol. 1409; A. Dold, B. Eckmann and F. Takens (Eds.). H.J. Lugt. Vortex Flow in Nature and Technology. John Wiley & Sons, Inc., New York, New York, USA, 1983. M. Luskin and R. Rannacher. On the smoothing property of the Crank-Nicolson scheme. Appl. Anal., 14:117-135, 1982. D.R. Lynch and W.G. Gray. A wave equation model for finite element tidal computations. Comput. Fluids, 7:207-228, 1979. D.R. Lynch and W.G. Gray. Finite element simulation of flow in deforming regions. J. Comput. Phys., 36:135-153, 1980. D.R. Lynch. Mass conservation in finite element groundwater models. Advances in Water Resources, 7:67, 1984. D.R. Lynch. Heat conservation in deforming element phase change simulation. J. Comput. Phys., 57(2):303, January 1985a. D.R. Lynch. Mass balance in shallow water simulations. Commun. Appl. Numer. Meth., 1:153-159, 1985b.
R. Löhner. Design of incompressible flow solvers: practical aspects, Cambridge University Press, 1993, in Incompressible Computational Fluid Dynamics; pp. 267-293; M. Gunzburger and R. Nicolaides (Eds.). M. Morandi, K. Morgan, J. Periaux, B.A. Schrefler and O.C. Zienkiewicz. Proc. Int. Conf. Finite Elements in Fluids: New Trends and Applications. Venezia, Italy, 1995. R.J. MacKinnon and G.F. Carey. Superconvergent derivatives: A Taylor series analysis. Int. J. Numer. Meth. Eng., 28:489-509, 1989. R.J. MacKinnon and G.F. Carey. Nodal superconvergence and solution enhancement for a class of finite-element and finite-difference methods. SIAM J. Sci. Stat. Comput., 11(2):343-353, 1990. R.J. MacKinnon, G.F. Carey and P. Murray. A procedure for calculating vorticity boundary conditions in the stream-function-vorticity method. Commun. Appl. Numer. Meth., 1990, 6:47-48. R.H. MacNeal. An asymmetrical finite difference network. Q. Appl. Math., 11(3):295-310, 1953. Y. Maday, A.T. Patera and E.M. Rønquist. An operator-integration-factor splitting method for time-dependent problems: application to incompressible fluid flow. J. Sci. Comput., 5(4):263-292, December 1990. A.V. Malevsky and D.A. Yuen. Characteristics-based methods applied to infinite Prandtl number thermal convection in the hard turbulent regime. Phys. Fluids A, 3(9):2105-2115, 1991. A.V. Malevsky. Spline-Characteristic Method for Simulation of Convective Turbulence. University of Minnesota, Army High Performance Computing Research Center, Minneapolis, Minnesota, USA, AHPCRC 93-059, 1993. D.S. Malkus and T.J.R. Hughes. Mixed finite element methods - reduced and selective integration techniques: A unification of concepts. Comput. Meth. Appl. Mech. Eng., 15:63-81, 1978. D.S. Malkus. Eigenproblems associated with the discrete LBB condition for incompressible finite elements. Int. J. Eng. Sci., 19:1299-1310, 1981. D.S. Malkus and E.T. Olsen. Penalty-Finite Element Methods in Mechanics. 
AMD, The American Society of Mechanical Engineers, New York, New York, USA, 1982. Chap. "Incompressible finite elements which fail the discrete LBB condition"; pp. 33-50; AMD-Vol. 51; J.N. Reddy (Ed.). D.S. Malkus and E.T. Olsen. Obtaining error estimates for optimally constrained incompressible finite elements. Comput. Meth. Appl. Mech. Eng., 45:331-353, 1984. D.S. Malkus and M.E. Plesha. Zero and negative masses in finite element vibration and transient analysis. Comput. Meth. Appl. Mech. Eng., 59:281-306, 1986. D.S. Malkus, R.D. Cook and M.E. Plesha. Concepts and Applications of Finite Element Analysis. John Wiley and Sons, Inc., New York, New York, USA, 3rd edition, 1989. M. Mallet, C. Poirier and F. Shakib. A new finite element formulation for computational fluid dynamics: development of an hourglass control operator for multidimensional advective-diffusive systems, Comput. Meth. Appl. Mech. Eng., 94:429-442, 1992. R.S. Marshall, J.C. Heinrich and O.C. Zienkiewicz. Natural convection in a square enclosure by a finite-element, penalty function method using primitive fluid variables. Numerical Heat Transfer, 1:315-330, 1978. M. Marion and R. Temam. Navier-Stokes Equations Theory and Approximation. Handbook Numer. Anal., to appear, 1996.
Y.P. Marx. Time integration schemes for the unsteady incompressible Navier-Stokes equations. J. Comput. Phys., 112:182-209, 1994. J. Mason. Methods of Functional Analysis for Application in Solid Mechanics. Elsevier, Amsterdam, The Netherlands, 1985. Series: Studies in Applied Mechanics, Vol. 9. K.K. Mathur and P.R. Dawson. On modeling damage evolution during the drawing of metals. Mech. Mater., 6:179-196, 1987. M.R. Maxey and J.J. Riley. Equation of motion for a small rigid sphere in a nonuniform flow. Phys. Fluids, 26(4):883, 1983. R.C. McCallen. An Investigation of the Finite Element Incompressible Flow Code FEM3: The Model, Some of Its Options, A Guide on How To Obtain and Run the Code, and Its Performance on Some Chosen Problems. Lawrence Livermore National Laboratory, Livermore, California, USA, UCID-21527 edition, 1988. R.C. McCallen. Large-Eddy Simulation of Turbulent Flow Using the Finite Element Method. Ph.D. Thesis, University of California, Davis, Department of Mechanical Engineering, Davis, California, USA, 1993. R. Mei. History force on a sphere due to a step change in the free-stream velocity. Int. J. Multiphase Flow, 19(3):505-525, 1993. R. Mei. Velocity fidelity of flow tracer particles. Exp. Fluids, 22:1-13, 1996. R. Mei and C.J. Lawrence. The flow field due to a body in impulsive motion. J. Fluid Mech., 325:79-111, 1996. M.C. Melaaen. Calculation of fluid flows with staggered and nonstaggered curvilinear nonorthogonal grids - the theory. Numer. Heat Transfer, Part B, 21:1-19, 1992. B. Mercier. Topics in Finite Element Solution of Elliptic Problems. Springer-Verlag, Berlin, Germany, 1979a. B. Mercier. A conforming finite element method for two-dimensional incompressible elasticity. Int. J. Numer. Meth. Eng., 14:942-945, 1979b. S.G. Mikhlin. Variational Methods in Mathematical Physics. Pergamon Press, Oxford, England, UK, 1964. J.J.H. Miller, E. O'Riordan and G.I. Shishkin. 
On piecewise-uniform meshes for upwind- and central-difference operators for solving singularly perturbed problems. IMA J. Numer. Anal., 1995, 15:89-99. J.J.H. Miller, E. O'Riordan and G.I. Shishkin. Solution of singular perturbation problems with ε-uniform numerical methods - introduction to the theory. World Scientific Publishers, Singapore, 1996, to appear. R.H. Miller. A horror story about integration methods. J. Comput. Phys., 93:469-476, 1991. K. Millsaps. Karl Pohlhausen, as I remember him. Ann. Rev. Fluid Mech., 16:1-10, 1984. P. Minev and P.M. Gresho. A remark on pressure correction schemes for transient viscous incompressible flow. Commun. Numer. Meth. Eng., in press. W.J. Minkowycz, E.M. Sparrow, G.E. Schneider and R.H. Pletcher. Handbook of Numerical Heat Transfer. John Wiley and Sons, Inc., New York, New York, USA, 1988. A.R. Mitchell and D.F. Griffiths. Semi-Discrete Generalised Galerkin Methods for Time-Dependent Conduction-Convection Problems. Academic Press Ltd, London, England, UK, 1979. Chap. 2 in MAFELAP III. A.R. Mitchell. Recent Developments in the Finite Element Method. North-Holland, Amsterdam, The Netherlands, 1984, in Computational Techniques and Applications, J. Noye and C. Fletcher (Eds.).
S. Mittal and T.E. Tezduyar. Massively parallel finite element computation of incompressible flows involving fluid-body interactions. Comput. Meth. Appl. Mech. Eng., 112:253-282, 1994. K. Miyakoda. Contribution to the numerical weather prediction: Computation with finite difference. Jpn. J. Geophys., 3(1):75-190, 1962. A. Mizukami. A mixed finite element method for boundary flux computation. Comput. Meth. Appl. Mech. Eng., 57:239-243, 1986. A.G. Mohamed, D.T. Valentine and R.E. Hassel. Numerical study of laminar separation over an annular backstep. Comput. Fluids, 20(2):121-143, 1991. C.R. Molenkamp. Accuracy of finite-difference methods applied to the advection equation. J. Appl. Meteorol., 7(2):160-167, 1968. K.W. Morton. Stability and convergence in fluid flow problems. Proc. R. Soc. Lond. A., 323:237-253, 1971. K.W. Morton. Stability of finite difference approximations to a diffusion-convection equation. Int. J. Numer. Meth. Eng., 1980, 15:677-683. K.W. Morton and A. Stokes. The Mathematics of Finite Elements and Applications IV MAFELAP 1981. Academic Press, New York, New York, USA, 1982. Chap. "Generalised Galerkin methods for hyperbolic equations"; pp. 421-431; J.R. Whiteman (Ed.). K.W. Morton. Proc. IMA Conf. Numerical Methods for Fluid Dynamics. 1982. Chap. "Generalised Galerkin methods for steady and unsteady problems"; pp. 1-32; K.W. Morton and M.J. Baines (Eds.). K.W. Morton. Proc. Fifth GAMM Conf. Numerical Methods in Fluid Mechanics. Friedr. Vieweg und Sohn, Braunschweig, Germany, 1983. Chap. "Characteristic Galerkin methods for hyperbolic problems"; pp. 243-250; M. Pandolfi and R. Piva (Eds.); Rome, Italy, October 5-7, 1983. K.W. Morton. Generalised Galerkin methods for hyperbolic problems. Comput. Meth. Appl. Mech. Eng., 52:847-871, 1985. K.W. Morton, A. Priestley and E. Süli. Stability of the Lagrange-Galerkin method with non-exact integration. Math. Model. Numer. Anal., 22(4):625-653, 1988. K.W. Morton. Proc. Third Int. Conf. 
Hyperbolic Problems. Studentlitteratur, Uppsala, Sweden, 1990. Chap. "Lagrange-Galerkin and Characteristic-Galerkin methods and their applications"; pp. 742-755; B. Engquist and B. Gustafsson (Eds.); Uppsala, Sweden, June 11-15, 1990. K.W. Morton and E. Süli. Finite volume methods and their analysis. IMA J. Numer. Anal., 11:241-260, 1991. K.W. Morton. Numerical Solution of Convection-Diffusion Problems. Chapman & Hall, London, UK, 1996. R. Mullen and T. Belytschko. Dispersion analysis of finite element semidiscretizations of the two-dimensional wave equation. Int. J. Numer. Meth. Eng., 18:11-29, 1982. S. Müller, A. Prohl, R. Rannacher and S. Turek. Fast solvers for flow problems, Friedr. Vieweg und Sohn, Heidelberg, Germany, 1995. Chap. "Implicit time-discretization of the nonstationary incompressible Navier-Stokes equations"; p. 175; Notes on CFD, Vol. 49; W. Hackbusch and G. Wittum (Eds.). B. Metivet, K. Boukir, Y. Maday and E. Razafindrakoto. A high order characteristics/finite element method for the incompressible Navier-Stokes equations. Int. J. Numer. Meth. Fluids, 1996, to appear.
K. Nafa and R.W. Thatcher. Low-order macroelements for two- and three-dimensional Stokes flow. Numerical Methods for Partial Differential Equations, 9:579-591, 1993. M.J. Naughton. On Numerical Boundary Conditions for the Navier-Stokes Equations. Ph.D. Thesis, California Institute of Technology, Pasadena, California, USA, 1986. J. Nelson and C.L. Chang. A mass conservative least-squares finite element method for the Stokes problem. Commun. Numer. Meth. Eng., 11:965-970, 1995. B. Neta and R.T. Williams. Stability and phase speed for various finite element formulations of the advection equation. Comput. Fluids, 14(4):393-410, 1986. O. Nevanlinna and W. Liniger. Contractive methods for stiff differential equations. Part I. BIT, 18:457-474, 1978. O. Nevanlinna and W. Liniger. Contractive methods for stiff differential equations. Part II. BIT, 19:53-72, 1979. E. Ng, B. Pegtor and B. Nitrosso. On the solution of Stokes's system within N3S using supernodal Cholesky factorization, in Finite Elements in Fluids, Part I, Pineridge Press Ltd, Swansea, Wales, UK, 1993. pp. 76-84; K. Morgan, E. Onate, J. Periaux, J. Peraire and O.C. Zienkiewicz (Eds.). R.A. Nicolaides. Existence, uniqueness and approximation for generalized saddle point problems. SIAM J. Numer. Anal., 19(2):349-357, 1982. R.A. Nicolaides, T.A. Porsching and C.A. Hall. Computational Fluid Dynamics Review. John Wiley and Sons, Inc., New York, New York, USA, 1995. Chap. "Covolume methods in computational fluid dynamics"; M. Hafez and K. Oshima (Eds.). B. Noble. Applied Linear Algebra. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1969. R.A. Novy, H.T. Davis and L.E. Scriven. Upstream and downstream boundary conditions for continuous-flow systems. Chem. Eng. Sci., 1990, 45(6):1515-1524. R.A. Novy, H.T. Davis and L.E. Scriven. A comparison of synthetic boundary conditions for continuous-flow systems. Chem. Eng. Sci., 1991, 46(1):57-68. J.T. Oden. Finite-Element Method. 
Finite-Element Analogue of Navier-Stokes Equation, J. Eng. Mech. Div. ASCE, 96(4):529, 1970. J.T. Oden. Finite Elements of Nonlinear Continua. McGraw-Hill Book Company, New York, New York, USA, 1972. J.T. Oden and J.N. Reddy. An Introduction to the Mathematical Theory of Finite Elements. John Wiley and Sons, New York, New York, USA, 1976. J.T. Oden, N. Kikuchi and Y.J. Song. Penalty-finite element methods for the analysis of Stokesian flows. Comput. Meth. Appl. Mech. Eng., 31:297-329, 1982. J.T. Oden and G.F. Carey. Finite Elements: Mathematical Aspects Vol. IV. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1983. J.T. Oden and O.-P. Jacquotte. Stability of some mixed finite element methods for Stokesian flows. Comput. Meth. Appl. Mech. Eng., 43:231-247, 1984. J.T. Oden. The best FEM. Finite Elements Anal. Des., 7:103-114, 1990. J.T. Oden. Finite Elements: An Introduction. Handbook of Numerical Analysis, Vol. II, North-Holland, Amsterdam, The Netherlands, 1991. J.T. Oden, A. Patra and Y. Feng. An hp adaptive strategy, AMD, ASME, 1992, 157:23-46. Adaptive, Multilevel and Hierarchical Computational Strategies; A.K. Noor (Ed.). J.T. Oden, W. Wu and M. Ainsworth. Three-step h-p adaptive strategy for the incompressible Navier-Stokes equations, IMA Summer Program on Modeling, Mesh Generation, and Adaptive Numerical Methods for Partial Differential Equations, 1993.
E.T. Olsen. Stable Finite Elements for Non-Newtonian Flows: First Order Elements Which Fail the LBB Condition. Ph.D. Thesis, Illinois Institute of Technology, Chicago, Illinois, USA, 1983. M.D. Olson and S.-Y. Tuann. New finite element results for the square cavity. Comput. Fluids, 7:123-135, 1979. I. Orlanski. A simple boundary condition for unbounded hyperbolic flows. J. Comput. Phys., 21:251-269, 1976. S.A. Orszag. Numerical simulation of incompressible flows within simple boundaries: Accuracy. J. Fluid Mech., 49(1):75-112, 1971. S.A. Orszag. Comparison of pseudospectral and spectral approximation. Stud. Appl. Math., 51:253-259, 1972. J.M. Ortega. Numerical Analysis: A Second Course. SIAM, Philadelphia, Pennsylvania, USA, 1990. Series: Classics in Applied Mathematics; R.E. O'Malley Jr. (Ed.). R.L. Panton. Incompressible Flow. John Wiley and Sons, Inc., New York, New York, USA, 2nd edition, 1996. S. Paolucci. Direct numerical simulation of two-dimensional turbulent natural convection in an enclosed cavity. J. Fluid Mech., 215:229-262, 1990. S. Paolucci and D. Chenoweth. A note on the stability of the explicit finite differenced transport equation. J. Comput. Phys., 47:489, 1982. T.C. Papanastasiou, N. Malamataris and K. Ellwood. A new outflow boundary condition. Int. J. Numer. Meth. Fluids, 14:587-608, 1992. N.-S. Park and J.A. Liggett. Taylor-least-squares finite element for two-dimensional advection-dominated unsteady advection-diffusion problems. Int. J. Numer. Meth. Fluids, 11:21-38, 1990. N.-S. Park and J.A. Liggett. Application of Taylor-least squares finite element to the three-dimensional advection-diffusion equation. Int. J. Numer. Meth. Fluids, 13:759-773, 1991. J. Pedlosky. Geophysical Fluid Dynamics. Springer-Verlag, New York, New York, USA, 2nd edition, 1987. A. Peirce and J.H. Prevost. On the lack of convergence of unconditionally stable explicit rational Runge-Kutta schemes. Comput. Meth. Appl. Mech. Eng., 57:171-180, 1986. R.B. 
Pelz, V. Yakhot, and S.A. Orszag. Velocity-Vorticity patterns in turbulent flow. Phys. Rev. Letters, 54(23):2505, 1985. D.W. Pepper and J.C. Heinrich. The Finite Element Method: Basic Concepts and Application. Taylor & Francis, Basingstoke, England, UK, 1992. J.B. Perot. An analysis of the fractional step method. J. Comput. Phys., 108:51-58, 1993. L. Petzold and P. Lotstedt. Numerical solution of nonlinear differential equations with algebraic constraints II: Practical implications. SIAM J. Sci. Stat. Comput., 7(3):720-733, 1986. L. Petzold. Differential/algebraic equations are not ODE's. SIAM J. Sci. Stat. Comput., 3(3):367-384, 1982. R. Peyret, M. Fortin and R. Temam. Résolution numérique des équations de Navier-Stokes pour un fluide incompressible. J. Méc., 10(3):357-390, 1971. N.A. Phillips. The Atmosphere and the Sea in Motion. The Rockefeller Institute Press, New York, New York, USA, 1959. Chap. "An Example of Non-Linear Computational Instability"; B. Bolin (Ed.).
S.A. Piacsek and G.P. Williams. Conservation properties of convection difference schemes. J. Comput. Phys., 6:392-405, 1970. R. Pierre. Simple C° approximations for the computation of incompressible flows. Computer Methods in Applied Mechanics and Engineering, 68:205-227, 1988. R. Pierre. Optimal selection of the bubble function in the stabilization of the P1-P1 element for the Stokes problem. SIAM J. Numer. Anal., 32:1210, 1995. G.F. Pinder and W.G. Gray. Finite Element Simulation in Surface and Subsurface Hydrology. Academic Press, New York, New York, USA, 1977. O. Pironneau. On the transport-diffusion algorithm and its application to the Navier-Stokes equations. Numer. Math., 38:309-332, 1982. O. Pironneau. Équations aux dérivées partielles (analyse numérique). Conditions aux limites sur la pression pour les équations de Stokes et de Navier-Stokes. C. R. Acad. Sc. Paris, Sér. I, 303(9):403, 1986. O. Pironneau, C. Bengue, C. Conca, and F. Marat. Problèmes Mathématiques de la Mécanique. À nouveau sur les équations de Stokes et de Navier-Stokes avec des conditions aux limites sur la pression. C. R. Acad. Sc. Paris, Sér. I, 304(1):23, 1987. O. Pironneau. Finite Element Methods for Fluids. John Wiley and Sons Ltd, Chichester, England, UK, 1989. O. Pironneau, J. Liou and T. Tezduyar. Characteristic-Galerkin and Galerkin/Least-Squares Space-Time Formulations for the Advection-Diffusion Equation with Time-Dependent Domains. Comput. Meth. Appl. Mech. Eng., 100:117-141, 1992. C. Prakash. Examination of the upwind (donor-cell) formulation in control volume finite-element methods for fluid flow and heat transfer. Numer. Heat Transfer, 11:401-416, 1987. C. Prakash and B.R. Baliga. Finite Element Analysis in Fluids: Proc. 7th Int. Conf. Finite Element Methods in Flow Problems. UAH Press, 1989. Chap. "Control-volume-based numerical methods for fluid flow: similarities and differences"; pp. 397-404; Huntsville, Alabama, USA, April 3-7, 1989. H.S. 
Price, R.S. Varga and J.E. Warren. Application of oscillation matrices to diffusion convection equations. J. Math. Phys., 1966, 45:301-311. A. Priestley, K.W. Morton and E. Süli. Stability of the Lagrange-Galerkin method with non-exact integration. Math. Model. Numer. Anal., 22(4):625-653, 1988. A. Priestley. A quasi-conservative version of the semi-Lagrangian advection scheme. Mon. Weather Rev., 121:621-629, February 1993. A. Priestley. Exact projections and the Lagrange-Galerkin method: A realistic alternative to quadrature. J. Comput. Phys., 112:316-333, 1994. A. Prohl. Analysis of Chorin's Projection Method for Solving the Incompressible Navier-Stokes Equations. Universität Heidelberg, Institut für Angewandte Mathematik, INF 294, D-69120 Heidelberg, Germany, 1996. A. Prohl. Projection and Quasi-Compressibility Methods for Solving the Incompressible Navier-Stokes Equations. Wiley-Teubner, Chichester, England, UK/Germany, 1997. A. Prohl and R. Rannacher. (1997) An analysis of Chorin's projection method for the incompressible Navier-Stokes equations, submitted to SIAM J. Numer. Anal. R.J. Purser and L.M. Leslie. An efficient semi-Lagrangian scheme using third-order semi-implicit time integration and forward trajectories. Mon. Weather Rev., 122:745-756, April 1994.
J. Qin. On the Convergence of Some Low Order Mixed Finite Elements for Incompressible Fluids. Ph.D. Thesis, The Pennsylvania State University, University Park, Pennsylvania, USA, 1994. L. Quartapelle and F. Valz-Gris. Projection conditions on the vorticity in viscous incompressible flows. Int. J. Numer. Meth. Fluids, 1:129, 1981. L. Quartapelle. Vorticity conditioning in the computation of two-dimensional viscous flows. J. Comput. Phys., 40:453, 1981. K. Radhakrishnan. New integration techniques for chemical kinetic rate equations. I. Efficiency comparison. Combust. Sci. Technol., 46:59-81, 1986. K. Radhakrishnan and A.C. Hindmarsh. Description and Use of LSODE, the Livermore Solver for Ordinary Differential Equations. National Aeronautics and Space Administration, Washington, D.C., USA. NASA reference publication 1327 edition, 1993; also available as Lawrence Livermore National Laboratory Report UCRL-ID-113855, Livermore, California. S. Ramadhyani and S.V. Patankar. Solution of the Poisson equation: Comparison of the Galerkin and control-volume methods. Int. J. Numer. Meth. Eng., 15:1395-1418, 1980. S. Ramadhyani and S.V. Patankar. Solution of the convection-diffusion equation by a finite-element method using quadrilateral elements. Numer. Heat Transfer, 8:595-612, 1985. J.D. Ramshaw. A method for enforcing the solenoidal condition on magnetic field in numerical calculations. J. Comput. Phys., 52(3):592-596, 1983. J.D. Ramshaw. Numerical viscosities of difference schemes. Commun. Numer. Methods Eng., 10:927-931, 1994. A. Randriamampianina, P. Bontoux and B. Roux. Écoulements induits par la force gravifique dans une cavité cylindrique en rotation. Int. J. Heat Mass Transfer, 30(7):1275-1292, 1987. R. Rannacher. Discretization of the heat equation with singular initial data. Z. Angew. Math. Mech., 62:T346-T348, 1982. R. Rannacher and R. Scott. Some Optimal Error Estimates for Piecewise Linear Finite Element Approximations. Math. 
Comput., 38(158):437-445, 1982. R. Rannacher. Finite element solution of diffusion problems with irregular data. Numer. Math., 43:309-327, 1984. R. Rannacher. Applications of Mathematics in Industry and Technology. B.G. Teubner, Stuttgart, Germany, 1989. Chap. "Numerical analysis of nonstationary fluid flow (a survey)"; pp. 34-53; V.C. Boffi and H. Neunzert (Eds.). R. Rannacher. Navier-Stokes Equations: Theory and Numerical Methods. Springer-Verlag, Berlin, Germany, 1990. Chap. "On the numerical analysis of the nonstationary Navier-Stokes equations"; pp. 180-193; J. Heywood et al. (Eds.). R. Rannacher and S. Turek. Simple nonconforming quadrilateral Stokes element. Numer. Methods PDE's, 8(2):97-111, 1992. R. Rannacher. The Navier-Stokes Equations II: Theory and Numerical Methods. Springer-Verlag, Berlin, Germany, 1992. Chap. "On Chorin's projection method for the incompressible Navier-Stokes equations"; pp. 167-183; Lecture Notes in Mathematics, Vol. 1530. R. Rannacher. On the numerical solution of the incompressible Navier-Stokes equations. Z. Angew. Math. Mech., 73:203-216, 1993.
R. Rautmann (Ed.). Approximation Methods for Navier-Stokes Problems. Springer-Verlag, Berlin, Germany, 1979. Proceedings of the Symposium Held by International Union of Theoretical and Applied Mechanics (IUTAM) at the University of Paderborn, Germany, September 9-15, 1979. R. Rautmann, J.G. Heywood, K. Masuda and V.A. Solonnikov (Eds.). The Navier-Stokes Equations II - Theory and Numerical Methods. Springer-Verlag, Berlin, Germany, 1992. Proceedings of a Conference held in Oberwolfach, Germany, August 18-24, 1991. M.J. Raw, G.E. Schneider and V. Hassani. Proc. AIAA 22nd Aerospace Sciences Meeting, McGraw-Hill, Inc., New York, USA, 1987. W.H. Raymond and A. Garder. Selective damping in a Galerkin method for solving wave problems with variable grids. Mon. Weather Rev., 104:1583-1590, Dec 1976. J.N. Reddy. On the accuracy and existence of solutions to primitive variable models of viscous incompressible fluids. Int. J. Eng. Sci., 16(12-A):921-929, 1978. J.N. Reddy. On penalty function methods in the finite-element analysis of flow problems. Int. J. Numer. Meth. Fluids, 2:151-171, 1982. J.N. Reddy. An Introduction to the Finite Element Method. McGraw-Hill Book Company, New York, New York, USA, 1984; also 2nd edition, 1993. J.N. Reddy. Applied Functional Analysis and Variational Methods in Engineering. McGraw-Hill, Inc., New York, New York, USA, 1986. J.N. Reddy, M.P. Reddy and H.U. Akay. Penalty finite element analysis of incompressible flows using element by element solution algorithms. Comput. Meth. Appl. Mech. Eng., 100:169-205, 1992. J.N. Reddy and D.K. Gartling. The Finite Element Method in Heat Transfer and Fluid Dynamics. CRC Press, Inc., Boca Raton, Florida, USA, 1994. S.C. Reddy and L.N. Trefethen. Pseudospectra of the convection-diffusion operator. SIAM J. Appl. Math., 54(6):1634-1649, 1994. K. Rektorys. Variational Methods in Mathematics, Science and Engineering. D. Reidel Publishing Company, Dordrecht, The Netherlands, 2nd edition, 1980. 
M. Renardy. Imposing 'no' boundary condition at outflow: Why does it work? 1997, 24:413-418.
R.D. Richtmyer and K.W. Morton. Difference Methods for Initial-Value Problems. Interscience Publishers, a Division of John Wiley and Sons, Inc., New York, New York, USA, 2nd edition, 1967.
W.J. Rider. Approximate Projection Methods for Incompressible Flow: Implementation, Variants and Robustness. Los Alamos National Laboratory, Los Alamos, New Mexico, USA, 1994, Technical Report LA-UR-2000.
P.J. Roache. Computational Fluid Dynamics. Hermosa Publishers, Albuquerque, New Mexico, USA, 1982.
P.J. Roache. A flux-based modified method of characteristics. Int. J. Numer. Meth. Fluids, 15:1259-1275, 1992a.
P.J. Roache. Proc. Computational Methods in Water Resources, Vol. I: Numerical Methods in Water Resources. Chap. "Validation exercises of a one-dimensional flux-based modified method of characteristics"; pp. 69-76; T.F. Russell et al. (Eds.); June 9-12, 1992b.
P.J. Roache and P.M. Knupp. Completed Richardson extrapolation. Commun. Numer. Methods Eng., 9:365-374, 1993.
R.S. Rogallo and P. Moin. Annual Review of Fluid Mechanics. Annual Reviews Inc., Palo Alto, California, USA. Vol. 16, 1984. Chap. "Numerical simulation of turbulent flows"; pp. 99-137.
E.M. Ronquist. Convection Treatment Using Spectral Elements of Different Order. Int. J. Numer. Meth. Fluids, 22:241-264, 1996.
R.B. Rood. Numerical advection algorithms and their role in atmospheric transport and chemistry models. Rev. Geophys., 25(1):71-100, 1987.
M. Rosenfeld. Uncoupled temporally second-order accurate implicit solver of incompressible Navier-Stokes equations. AIAA Journal, 34(9):1829, September 1996.
J.E. Rowley and P.M. Gresho. Proc. Sixth IMACS Int. Symp. Computer Methods for PDE's. Chap. "Some New Results Using Quadratic Finite Elements for Pure Advection"; pp. 202-209; Lehigh University, Bethlehem, Pennsylvania, June 23-27, 1987; also Lawrence Livermore National Laboratory, Livermore, California, Report UCRL-96615 (May, 1987).
T.F. Russell. Proc. Seventh Int. Conf. Finite Element Meth. in Flow Problems. UALI Press, Huntsville, Alabama, USA, 1989; p. 538; Huntsville, Alabama, USA, 1989.
T.F. Russell and R.V. Trujillo. Computational Methods in Surface Hydrology, Proc. Eighth Int. Conf. Computational Methods in Water Resources. Computational Mechanics Publications, Southampton, England, UK, 1990. Chap. "Eulerian-lagrangian localized adjoint methods with variable coefficients in multiple dimensions"; pp. 357-363; G. Gambolati, A. Rinaldo, C.A. Brebbia, W.C. Gray and G.F. Pinder (Eds.).
T.F. Russell, R.E. Ewing and L.C. Young. An anisotropic coarse-grid dispersion model of heterogeneity and viscous fingering in five-spot miscible displacement that matches experiments and fine-grid simulations. 1989.
T.F. Russell. Numerical Analysis 1989. Longman Science and Technical, Harlow, England, UK. Pitman research notes in mathematics series, Vol. 228, 1990. Chap. "Eulerian-Lagrangian Localized Adjoint Methods for Advection-Dominated Problems"; pp. 206-228; D.F. Griffiths and G.A. Watson (Eds.).
R.L. Sani, P.M. Gresho, R.L. Lee and D.F. Griffiths. On the cause and cure (?) of the spurious pressures generated by certain FEM solutions of the incompressible Navier-Stokes equations: Parts 1 and 2. Int. J. Numer. Meth. Fluids, 1:17-43 for Part 1, 171-204 for Part 2, 1981a.
R.L. Sani, B.E. Eaton, P.M. Gresho, R.L. Lee and S.T. Chan. Proc. 2nd Num. Meth. Laminar and Turbulent Flow. Pineridge Press Ltd, Swansea, Wales, UK, 1981b. Chap. "On the solutions of the time-dependent incompressible Navier-Stokes equations via a Penalty Galerkin finite element method"; pp. 41-51; C. Taylor and B. Schreffler (Eds.); Venice, Italy.
R.L. Sani, B.E. Eaton, P.M. Gresho, C.D. Upson and M.S. Engelman. Proc. 5th Int. Symp. Finite Elements in Flow Problems. TICOM, USA, 1984. Chap. "On outflow boundary conditions for stratified and/or rotating flows"; pp. 85-90; Austin, Texas, USA, January 1984.
R.L. Sani and P.M. Gresho. Resume and remarks on the open boundary condition minisymposium. Int. J. Numer. Meth. Fluids, 18:983-1008, 1994.
J.M. Sanz-Serna and L. Abia. Interpolation of the coefficients in nonlinear elliptic Galerkin procedures. SIAM J. Numer. Anal., 21(1):77-83, 1984.
J.M. Sanz-Serna. Studies in numerical nonlinear instability I. Why do leapfrog schemes go unstable? SIAM J. Sci. Stat. Comput., 6(4):923-938, 1985.
T. Sarpkaya and M. Isaacson. Mechanics of Wave Forces on Offshore Structures. Van Nostrand Reinhold Company, New York, New York, USA, 1981.
T. Sarpkaya. Brief reviews of some time-dependent flows. J. Fluids Eng., 114:283, September 1992.
T. Sarpkaya. Unsteady Flows. John Wiley and Sons, New York, New York, USA, 1996. Handbook of Fluid Dynamics and Fluid Machinery. Joseph A. Schetz and Allen E. Fuhs (Eds.).
Y.K. Sasaki and J.N. Reddy. A comparison of stability and accuracy of some numerical models of two-dimensional circulation. Int. J. Numer. Meth. Eng., 16:149-170, 1980.
A.L. Schaferperini and J.L. Wilson. Efficient and accurate front tracking for two-dimensional groundwater flow models. Water Resour. Res., 27(7):1471-1485, 1991.
H. Schlichting. Boundary Layer Theory. McGraw-Hill, New York, New York, USA, 1979.
G.E. Schneider and M.J. Raw. A skewed, positive influence coefficient upwinding procedure for control-volume-based finite-element convection-diffusion computation. Numer. Heat Transfer, 9:1-26, 1986.
H.L. Schreyer. Dispersion of Semidiscretized and Fully Discretized Systems. North-Holland, Amsterdam, The Netherlands, 1983. Chap. 6 in Computational Methods for Transient Analysis; T. Belytschko and T.J.R. Hughes (Eds.).
M. Schumack, W. Schultz and J. Boyd. Spectral method of the Stokes equations on nonstaggered grids. J. Comput. Phys., 94:30, 1991.
L.L. Schumaker. Spline Functions. Wiley-Interscience, New York, USA, 1981.
J.A. Schutt. ZEPHYR 30: A Finite Difference Computer Program for 3D, Transient Incompressible Flow Problems. Sandia National Laboratories, USA, SAND 91-0350-UC-705, 1991.
C. Schwab. Proc. Int. Conf. Finite Elements in Fluids: New Trends and Applications. Chap. "Remarks on pressure approximation in projection methods for viscous incompressible flow"; Venezia, Italia, October 15-21, 1995.
A. Schuller. Numerical Treatment of the Navier-Stokes Equations. Proc. Fifth GAMM-Seminar. Friedr. Vieweg und Sohn, Braunschweig, Germany, 1990. Chap. "A Multigrid algorithm for the incompressible Navier-Stokes equations"; pp. 124-133; Series: Notes on Numerical Fluid Mechanics, Vol. 30; W. Hackbusch and R. Rannacher (Eds.).
A. Segal. Aspects of numerical methods for elliptic singular perturbation problems. SIAM J. Sci. Stat. Comput., 3(3):327-349, 1982.
A. Segal, P. Wesseling, J. Van Kan, C.W. Oosterlee and K. Kassels. Invariant discretization of the incompressible Navier-Stokes equations in boundary fitted co-ordinates. Int. J. Numer. Meth. Fluids, 15:411-426, 1992.
G. Segal, K. Vuik and K. Kassels. On the implementation of symmetric and antisymmetric periodic boundary conditions for incompressible flow. Int. J. Numer. Meth. Fluids, 18:1153-1165, 1994.
V. Selmin, J. Donea and L. Quartapelle. Finite element methods for nonlinear advection. Comput. Meth. Appl. Mech. Eng., 817-845, 1985.
K. Sepehrnoori and G.F. Carey. Numerical integration of semidiscrete evolution systems. Comput. Meth. Appl. Mech. Eng., 27:45-61, 1981.
J. Serrin. Handbuch der Physik. Springer-Verlag, Berlin, Germany, 1959. Chap. "Mathematical principles of classical fluid mechanics"; pp. 125-263.
M.J. Sewell. Maximum and Minimum Principles: A Unified Approach, with Applications. Cambridge University Press, Cambridge, England, UK, 1987. Series: Cambridge Texts in Applied Mathematics.
F. Shakib and T.J.R. Hughes. A new finite element formulation for computational fluid dynamics: IX. Fourier analysis of space-time Galerkin/least-squares algorithms. Comput. Meth. Appl. Mech. Eng., 87:35-58, 1991.
L.F. Shampine and M.K. Gordon. Computer Solution of Ordinary Differential Equations: The Initial Value Problem. W.H. Freeman and Company, San Francisco, California, USA, 1975.
L.F. Shampine and C.W. Gear. A user's view of solving stiff ordinary differential equations. SIAM Rev., 21(1), 1979.
J. Shen. On error estimates of some higher order projection and penalty-projection methods for Navier-Stokes equations. Numer. Math., 62:49-73, 1992.
J. Shen. A remark on the projection-3 method. Int. J. Numer. Meth. Fluids, 16:249-253, 1993.
J. Shen. On error estimates of the projection methods for the Navier-Stokes equations: Second-order schemes. Math. Comput., 65(215):1039-1065, 1996.
T.-M. Shih and P.M. Gresho. Pressure Modes for Galerkin Finite Element Method Using Equal-Interpolation Bilinear Elements. Lawrence Livermore National Laboratory, Livermore, California, USA, UCRL-92045, extended abstract, 1985.
D. Shin and J.C. Strikwerda. Inf-sup conditions for finite difference approximations of the Stokes equations. J. Austral. Math. Soc., Series B, 39, August 1997.
P.J. Shopov, P.D. Minev and I.B. Bazhlekov. Numerical method for unsteady viscous hydrodynamical problem with free boundaries. Int. J. Numer. Meth. Fluids, 14:681-705, 1992.
P.J. Shopov and Y.I. Iordanov. Numerical solution of Stokes equations with pressure and filtration boundary conditions. J. Comput. Phys., 112(1):12-23, 1994.
G.R. Shubin and J.B. Bell. An analysis of the grid orientation effect in numerical simulation of miscible displacement. Comput. Meth. Appl. Mech. Eng., 41:41-11, 1984.
W.J. Silliman and L.E. Scriven. Separating flow near a static contact line: Slip at a wall and shape of a free surface. J. Comput. Phys., 34:287-313, 1980.
D. Silvester and N. Kechkar. Stabilised bilinear-constant velocity-pressure finite elements for the conjugate gradient solution of the Stokes problem. Comput. Meth. Appl. Mech. Eng., 79:71-86, 1990.
D. Silvester. Optimal low order finite element methods for incompressible flow. Comput. Meth. Appl. Mech. Eng., 111:357-368, 1994.
D. Silvester and A. Wathen. Fast & robust solvers for time-discretised incompressible Navier-Stokes equations, in Numerical Analysis 1995. Pitman Research Notes in Mathematics Series, 1996; D.F. Griffiths (Ed.).
J.C. Simo and F. Armero. Unconditional stability and long-term behavior of transient algorithms for the incompressible Navier-Stokes and Euler equations. Comput. Meth. Appl. Mech. Eng., 111:111-154, 1994.
J.C. Simo, F. Armero and C.A. Taylor. Stable and time-dissipative finite element methods for the incompressible Navier-Stokes equations in advection dominated flows. Int. J. Numer. Meth. Eng., 38:1475-1506, 1995.
R.F. Sincovec. Some Projection Methods in Atmospheric Simulation. Lawrence Livermore National Laboratory, Livermore, California, USA, UCID-16186 edition, 1972.
F. Singer. Ozone, skin cancer, and the SST. Aerosp. Am., pp. 22-26, July 1994.
R.M. Smith. Finite element solutions of the energy equation at high Peclet number. Comput. Fluids, 8:335-350, 1980.
R.M. Smith. The Current Status of Turbulence Modelling in the Fluid Flow Code FEATT. Central Electricity Generating Board, Berkeley Nuclear Laboratories, Berkeley, Gloucestershire, England, UK, TPRD/B/0591/N85, 1985.
P.K. Smolarkiewicz and P.J. Rasch. Monotone advection on the sphere: An Eulerian versus semi-Lagrangian approach. J. Atmos. Sci., 48(6):793-810, 1991.
P.K. Smolarkiewicz and J.A. Pudykiewicz. A class of semi-Lagrangian approximations for fluids. J. Atmos. Sci., 49(22):2082-2096, 1992.
C. Smutek, P. Bontoux, B. Roux, G.H. Schiroky, A.C. Hurford, F. Rosenberger and G. de Vahl Davis. Three-dimensional convection in horizontal cylinders: Numerical solutions and comparison with experimental and analytical results. Numerical Heat Transfer, 8:613-631, 1985.
G. Sottas. Rational Runge-Kutta methods are not suitable for stiff systems of ODEs. J. Comput. Appl. Math., 10:169-174, 1984.
A. Soulaimani, M. Fortin, Y. Ouellet, G. Dhatt and F. Bertrand. Simple continuous pressure elements for two- and three-dimensional incompressible flows. Comput. Meth. Appl. Mech. Eng., 62:47-69, 1987.
I. Stakgold. Green's Functions and Boundary Value Problems. John Wiley and Sons, Inc., New York, New York, USA, 1979.
A. Staniforth and J. Cote. Semi-Lagrangian integration schemes for atmospheric models - a review. Mon. Weather Rev., 119(9):2206-2223, 1991.
R. Stenberg. Analysis of mixed finite element methods for the Stokes problem: A unified approach. Math. Comput., 42(165):9-23, 1984.
R. Stenberg and M. Suri. Mixed hp finite element methods for problems in elasticity and Stokes flow. Technical Report 18, Helsinki University of Technology, 1994.
A.B. Stephens, J.B. Bell, J.M. Solomon and L.B. Hackerman. A finite difference Galerkin formulation of the incompressible Navier-Stokes equations. J. Comput. Phys., 53:152-172, 1984.
W.N.R. Stevens. Finite element stream function vorticity solution of steady laminar natural convection. Int. J. Numer. Meth. Fluids, 2:349, 1982.
G. Strang and G.J. Fix. An Analysis of the Finite Element Method. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1973.
G. Strang. Linear Algebra and Its Applications. Academic Press, New York, New York, USA, 1976.
G. Strang. Introduction to Applied Mathematics. Wellesley-Cambridge Press, Wellesley, Massachusetts, USA, 1986.
G. Strang. A framework for equilibrium equations. SIAM Rev., 30(2):283-297, 1988.
J.C. Strikwerda. Finite difference methods for the Stokes and Navier-Stokes equations. SIAM J. Sci. Stat. Comput., 5(1):56-68, 1984.
J.C. Strikwerda. Finite Difference Schemes and Partial Differential Equations. Wadsworth and Brooks/Cole, Pacific Grove, California, USA, 1989.
C.R. Swaminathan and V.R. Voller. Streamline upwind scheme for control-volume finite elements, Part I. formulations. Numer. Heat Transfer, Part B, 22:95-107, 1992a.
C.R. Swaminathan and V.R. Voller. Streamline upwind scheme for control-volume finite elements, Part II. implementation and comparison with the SUPG finite-element scheme. Numer. Heat Transfer, Part B, 22:109-124, 1992b.
B. Swartz and B. Wendroff. Generalized finite-difference schemes. Math. Comput., 23:37-49, 1969.
B. Swartz and B. Wendroff. The relative efficiency of finite difference and finite element methods. I: Hyperbolic problems and splines. SIAM J. Numer. Anal., 11(5):979-993, 1974.
E. Süli. Convergence and nonlinear stability of the Lagrange-Galerkin method for the Navier-Stokes equations. Numer. Math., 53:459-483, 1988.
E. Süli and A.F. Ware. The Spectral Lagrange-Galerkin Method for the Navier-Stokes Equations: Convergence and Non-Linear Stability. Oxford University Computing Laboratory, Numerical Analysis Group, 11 Keble Road, Oxford, England OX1 3QD, 89/10 (November 1989) edition, 1989.
E. Süli. The Mathematics of Finite Elements and Applications VII. Academic Press Ltd, 1991. Chap. "The accuracy of finite volume methods on distorted partitions"; J. Whiteman (Ed.).
M. Tabata and K. Itakura. Precise Computation of Drag Coefficients of the Sphere. Hiroshima University, Higashi-Hiroshima, 739 Japan, INSAM report no. 12 (95-07) edition, 1995. This paper is also scheduled to appear in Int. J. Numer. Meth. Fluids, from the Proceedings of the 3rd US-Japan Symposium on Large-Scale FEM in CFD.
L.Q. Tang and T.T.H. Tsang. A least-squares finite element method for time-dependent incompressible flows with thermal convection. Int. J. Numer. Meth. Fluids, 17:271-289, 1993.
L.Q. Tang, T. Cheng and T.T.H. Tsang. Transient solutions for three-dimensional lid-driven cavity flows by a least-squares finite element method. Int. J. Numer. Meth. Fluids, 21:413-432, 1995.
L.Q. Tang, J.L. Wright and T.T.H. Tsang. Simulations of 2-D and 3-D Thermocapillary Flows by a Least-Squares Finite Element Method. Int. J. Numer. Meth. Fluids, in press.
E.Y. Tau. A second-order projection method for the incompressible Navier-Stokes equations in arbitrary domains. J. Comput. Phys., 115:147-152, 1994.
C. Taylor and P. Hood. A numerical solution of the Navier-Stokes equations using the finite element technique. Comput. Fluids, 1:73-100, 1973.
C. Taylor, J. Rance and J.O. Midwell. A note on the imposition of traction boundary conditions when using the FEM for solving incompressible flow problems. Comm. Applied Num. Meth., 1:113-121, 1985.
G.I. Taylor. On the Decay of Vortices in a Viscous Fluid. Phil. Mag. S. 6, 46(271), Oct. 1923.
D.P. Telionis. Unsteady Viscous Flows. Springer-Verlag, New York, New York, USA, 1989.
R. Temam. Analyse Mathematique. Sur l'approximation des solutions des equations de Navier-Stokes. C. R. Acad. Sci. Paris, Ser. A, 262:219-221, 24 January 1966.
R. Temam. Une methode d'approximation de la solution des equations de Navier-Stokes. Bull. Soc. Math. France, 1968:115-152, 1968.
R. Temam. Sur l'approximation de la solution des equations de Navier-Stokes par la methode des pas fractionaires I. Arch. Rat. Mech. Anal., 32(2):135, 1969a.
R. Temam. Sur l'approximation de la solution des equations de Navier-Stokes par la methode des pas fractionaires II. Arch. Rat. Mech. Anal., 33:377, 1969b.
R. Temam. AGARD Lecture Series No. 48: Numerical Methods in Fluid Dynamics. North Atlantic Treaty Organization (NATO) Advisory Group for Aerospace Research and Development (AGARD), 1972. Chap. "Approximation of Navier-Stokes equations"; pp. 22-27; J.J. Smolderen (Ed.).
R. Temam. Behaviour at time t = 0 of the solutions of semi-linear evolution equations. J. Diff. Equations, 43:73-92, 1982.
R. Temam. Navier-Stokes Equations. North-Holland, Amsterdam, The Netherlands, 3rd edition, 1984.
T.E. Tezduyar, R. Glowinski and J. Liou. Petrov-Galerkin methods on multiply-connected domains for the vorticity-stream function formulation of the incompressible Navier-Stokes equations. Int. J. Numer. Meth. Fluids, 8:1269-1290, 1988.
T.E. Tezduyar and J. Liou. Computation of spatially periodic flows based on the vorticity-stream function formulation. Comput. Meth. Appl. Mech. Eng., 83:121-142, 1990.
T.E. Tezduyar and J. Liou. On the downstream boundary conditions for the vorticity-stream function formulation of two-dimensional incompressible flows. Comput. Meth. Appl. Mech. Eng., 85:207-217, 1991.
T.E. Tezduyar. Advances in Applied Mechanics. Academic Press, Inc., Boston, Massachusetts, USA. Vol. 28, 1992. Chap. 1, "Stabilized finite element formulations for incompressible flow computations"; pp. 1-45; J.W. Hutchinson and T.Y. Wu (Eds.).
T.E. Tezduyar, M. Behr and J. Liou. A new strategy for finite element computations involving moving boundaries and interfaces - the deforming-spatial-domain/space-time procedure: I. the concept and the preliminary numerical tests. Comput. Meth. Appl. Mech. Eng., 94:339-351, 1992a.
T.E. Tezduyar, M. Behr, S. Mittal and J. Liou. A new strategy for finite element computations involving moving boundaries and interfaces - the deforming-spatial-domain/space-time procedure: II. computation of free-surface flows, two-liquid flows, and flows with drifting cylinders. Comput. Meth. Appl. Mech. Eng., 94:353-371, 1992b.
R.W. Thatcher. Locally mass-conserving Taylor-Hood elements for two- and three-dimensional flow. Int. J. Numer. Meth. Fluids, 11:341-353, 1990.
R.W. Thatcher. Incompressible Computational Fluid Dynamics Trends and Advances. Cambridge University Press, Cambridge, England, UK, 1993. Chap. 13, "The finite element method for three dimensional incompressible flow"; pp. 427-445; M.D. Gunzburger and R.A. Nicolaides (Eds.).
A. Thess. Instabilities in two-dimensional spatially periodic flows, part II: Square eddy lattice. Phys. Fluids, 4(7), July 1992.
F. Thomasset. Implementation of Finite Element Methods for Navier-Stokes Equations. Springer-Verlag, Inc., New York, New York, USA, 1981. Series: Springer Series in Computational Physics; W. Beiglbock, H. Cabannes, H.B. Keller, J. Killeen and S.A. Orszag (Eds.).
E.G. Thompson, L.R. Mack and F.-S. Lin. Finite Element Method for Incompressible Slow Viscous Flow with a Free Surface, in Developments in Mechanics. The Iowa State University Press, Ames, Iowa, USA, Vol. 5, 1969; p. 93; H.J. Weiss, D.F. Young, W.F. Riley and T.R. Rogge (Eds.).
E.G. Thompson and M.I. Haque. A high order finite element for completely incompressible creeping flow. Int. J. Numer. Meth. Eng., 6:315-321, 1973.
E.G. Thompson. Average and complete incompressibility in the finite element method. Int. J. Numer. Meth. Eng., 9:925-932, 1975.
H.D. Thompson, B.W. Webb and J.D. Hoffman. The cell Reynolds number myth. Int. J. Numer. Meth. Fluids, 5:305-310, 1985.
J.F. Thompson. Convection schemes for use with curvilinear coordinate systems - a survey. Mississippi State University Department of Aerospace Engineering, June 1984.
V. Thomee. Galerkin Finite Element Methods for Parabolic Problems. Springer-Verlag, Berlin, Germany. Lecture notes in mathematics, vol. 1054, 1984.
E.A. Thornton. Finite Element Flow Analysis. University of Tokyo Press, Tokyo, Japan, 1982. Chap. "Computation of consistent boundary quantities in finite element thermal-fluid solutions"; pp. 263-270; T. Kawai (Ed.).
D.M. Tidd, R.W. Thatcher and A. Kaye. The free surface flow of Newtonian and non-Newtonian fluids trapped by surface tension. International Journal for Numerical Methods in Fluids, 8:1011-1027, 1988.
L.J.P. Timmermans, F.N. Van De Vosse and P.D. Minev. Taylor-Galerkin-based spectral element methods for convection-diffusion problems. Int. J. Numer. Meth. Fluids, 18:853-870, 1994.
L.J.P. Timmermans, P.D. Minev and F.N. Van De Vosse. An approximate projection scheme for incompressible flow using spectral elements. Int. J. Numer. Meth. Fluids, 22:673, 1996.
A.F.B. Tompson and L.W. Gelhar. Numerical simulation of solute transport in three-dimensional, randomly heterogeneous porous media. Water Resour. Res., 26(10):2541-2562, 1990.
L.N. Trefethen. Group velocity in finite difference schemes. SIAM Rev., 24(2):113-136, 1982.
L.N. Trefethen. Numerical Analysis 1991: Proc. 14th Dundee Conf., June 1991. Longman Scientific and Technical, with John Wiley and Sons, Inc., New York, New York, USA, 1991. Chap. "Pseudospectra of matrices"; p. 234; D.F. Griffiths and G.A. Watson (Eds.).
L.N. Trefethen, A.E. Trefethen, S.C. Reddy and T.A. Driscoll. Hydrodynamic stability without eigenvalues. Science, 261:578, July 1993.
L.N. Trefethen. ICIAM'95: Proc. Third Int. Congress on Industrial and Applied Mathematics. Akademie-Verlag, Berlin, Germany, 1995. Chap. "Pseudospectra of linear operators".
C.J. Tremback, W.R. Cotton, J. Powell and R.A. Pielke. The forward-in-time upstream advection scheme: Extension to higher orders. Mon. Weather Rev., 115:540-555, February 1987.
D.J. Tritton. Physical Fluid Dynamics. Van Nostrand Reinhold Company, New York, New York, USA.
S. Turek. Tools for simulating non-stationary incompressible flow via discretely divergence-free finite element models. Int. J. Numer. Meth. Fluids, 18:71-105, 1994.
S. Turek. A comparative study of some time-stepping techniques for the incompressible Navier-Stokes equations: From fully implicit nonlinear schemes to semi-implicit projection methods. Int. J. Numer. Meth. Fluids, 22(10):987-1012, 1996.
S. Turek. On discrete projection methods for the incompressible Navier-Stokes equations: An algorithmical approach. Comput. Meth. Appl. Mech. Eng., 143:271-288, 1997.
C.D. Upson, P.M. Gresho, R.L. Sani, S.T. Chan and R.L. Lee. Num. Properties and Methodologies in Heat Transfer, Proc. 2nd Nat. Symp. Hemisphere Publishing Corp., 1983. Chap. "A Thermal Convection Simulation in Three Dimensions by a Modified Finite Element Method"; pp. 245-259; T. Shih (Ed.).
D.T. Valentine and A.G. Mohamed. Taylor's vortex array: a new test problem for Navier-Stokes solution procedures, in Solution of Superlarge Problems in Computational Mechanics. Plenum, New York, New York, USA, 1989, pp. 167-181; J. Kane and A. Carlson (Eds.).
D.T. Valentine. Decay of confined, two-dimensional spatially periodic arrays of vortices: A numerical investigation. Int. J. Numer. Meth. Fluids, 21:155-180, 1995.
J. Van Kan. A second-order accurate pressure-correction scheme for viscous incompressible flow. SIAM J. Sci. Stat. Comput., 7(3):870-891, 1986.
M. Van Dyke. An Album of Fluid Motion. The Parabolic Press, Stanford, California, USA, 1982.
John Milton van Dyke, J.V. Wehausen and L. Lumley. Annual review of fluid mechanics. Annual Reviews Inc., Palo Alto, California, USA; Vol. 16, 1984.
R.S. Varga. Matrix Iterative Analysis. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA, 1962.
R.S. Varga, H.S. Price and J.E. Warren. Application of oscillation matrices to diffusion convection equations. J. Math. Phys., 45:301-311, 1966.
A.E.P. Veldman. 'Missing' boundary conditions? Discretize first, substitute next, and combine later. SIAM J. Sci. Stat. Comput., 11(1):82-91, 1990.
A.E.P. Veldman and K. Rinzema. Playing with nonuniform grids. J. Eng. Math., 26:119-130, 1992.
D. Veyret, P. Gresho and R. Sani, 1998 (in preparation).
R. Vichnevetsky and F. De Schutter. A Frequency Analysis of Finite Difference and Finite Element Methods for Initial Value Problems. Proceedings of the AICA International Symposium on Computer Methods for Partial Differential Equations held at Lehigh University, Pennsylvania, 1975.
R. Vichnevetsky and J.B. Bowles. Fourier Analysis of Numerical Approximations of Hyperbolic Equations. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, USA, 1982.
R. Vichnevetsky. Propagation and spurious reflection in finite-element approximations of hyperbolic equations. Comput. Math. Appl., 11(7/8):733-746, 1985.
R. Vichnevetsky. Wave propagation analysis of difference schemes for hyperbolic equations: A review. Int. J. Numer. Meth. Fluids, 7:409-452, 1987.
C. Vincent. The influence of the stabilization parameter on the convergence factor of iterative methods for the solution of the discretized Stokes problem. Int. J. Numer. Methods Fluids, 20:1237-1252, 1995.
S. Vogel. Life in Moving Fluids: The Physical Biology of Flow. Princeton University Press, Princeton, New Jersey, USA; 2nd edition, 1994.
L.B. Wahlbin. A remark on parabolic smoothing and the finite element method. SIAM J. Numer. Anal., 17(1):33-38, 1980.
R. Wait and A.R. Mitchell. Finite Element Analysis and Applications. John Wiley & Sons, Chichester, UK, 1985.
O. Walsh. The Navier-Stokes Equations II - Theory and Numerical Methods: Conference Proceedings. Springer-Verlag, Berlin, Germany, 1991. Chap. "Eddy solutions of the Navier-Stokes equations"; J.G. Heywood, K. Masuda, R. Rautmann and V.A. Solonnikov (Eds.); Oberwolfach, Germany, August 18-24, 1991.
C.-Y. Wang. The flow past a circular cylinder which is started impulsively from rest. J. Math. Phys., XLVI(2):195, 1967.
C.-Y. Wang. A note on the drag of an impulsively started circular cylinder. J. Math. Phys., 47:451, 1968.
A. Wambecq. Rational Runge-Kutta methods for solving systems of ordinary differential equations. Computing, 20:333-342, 1978.
R.F. Warming and B.J. Hyett. The modified equation approach to the stability and accuracy analysis of finite-difference methods. J. Comput. Phys., 14:159-179, 1974.
R.F. Warming and R.M. Beam. Proc. 1979 SIGNUM Meeting on Numerical ODEs. 1979a. Chap. "Factored, A-Stable, linear multistep methods - an alternative to the method of lines for multidimensions"; Champaign, Illinois, April 3-5, 1979; also available from Computational Fluid Dynamics Branch, Ames Research Center, NASA, Moffett Field, California 94035, USA.
R.F. Warming and R.M. Beam. An extension of A-stability to alternating direction implicit methods. BIT, pp. 395-417, 1979b.
A.J. Wathen. On relaxation of Jacobi iteration for consistent and generalized mass matrices. Commun. Appl. Numer. Methods, 7:93-102, 1991.
J.E. Welch and F.H. Harlow. The MAC Method: A Computing Technique for Solving Viscous, Incompressible, Transient Fluid-Flow Problems Involving Free Surfaces. Los Alamos Scientific Laboratory, Los Alamos, New Mexico, USA, LA-3425 edition, 1965.
J. Wheeler. Simulation of heat transfer from a warm pipeline buried in permafrost. 74th Nat. Mtg. Am. Inst. Chem. Engrg.; New Orleans, Louisiana, USA, 1973.
J.A. Wheeler. Permafrost Thermal Design for the Trans-Alaska Pipeline, in Moving Boundary Problems. Academic Press, New York, New York, USA, 1978; p. 267; D.G. Wilson, A.D. Solomon and P.T. Boggs (Eds.).
M.F. Wheeler. A Galerkin procedure for estimating the flux for a two point boundary problem. SIAM J. Numer. Anal., 11:764, 1974.
J.R. Whiteman. The Mathematics of Finite Elements and Applications II: MAFELAP 1975. Academic Press, London, England, UK, 1976.
J.R. Whiteman. The Mathematics of Finite Elements and Applications IV: MAFELAP 1981. Academic Press, London, England, UK, 1982.
G.B. Whitham. Linear and Nonlinear Waves. John Wiley and Sons, Inc., New York, New York, USA, 1974.
J.H. Wilkinson. Recent Advances in Numerical Analysis. Academic Press, Inc., New York, New York, USA, 1978. Chap. "Linear differential equations and Kronecker's canonical form"; pp. 231-265; C. de Boor and G. Golub (Eds.).
P.T. Williams and A.J. Baker. Incompressible Computational Fluid Dynamics and the Continuity Constraint Method for the Three-Dimensional Navier-Stokes equations. Numer. Heat Trans., 29(2):137, 1996.
A.M. Winslow. Internal Memorandum. Lawrence Livermore National Laboratory, Livermore, California, USA, 1967.
D. Winterscheidt and K.S. Surana. p-version least squares finite element formulation for two-dimensional, incompressible fluid flow. Int. J. Numer. Meth. Fluids, 18(1):43-70, 1994.
W.L. Wood. Practical Time-Stepping Schemes. Oxford University Press, New York, New York, USA, 1990. Series: Oxford Applied Mathematics and Computing Science Series.
J. Wu. Wave equation model for solving advection-diffusion equation. Int. J. Numer. Meth. Eng., 37(16):2717-2734, 1994.
J.-Z. Wu and J.-M. Wu. Interactions between a solid surface and a viscous compressible flow field. J. Fluid Mech., 254:183-211, 1993.
J.-Z. Wu. A theory of three-dimensional interfacial vorticity dynamics. Phys. Fluids, 7(10):2375-2395, 1995.
J.Z. Wu and J.M. Wu. Advances in Applied Mechanics. Academic Press, New York, New York, USA, 1996. Chap. "Vorticity dynamics on boundaries"; Vol. 32.
X.H. Wu, J.Z. Wu and J.M. Wu. Effective vorticity-velocity formulations for three-dimensional incompressible viscous flows. J. Comput. Phys., 122:68-82, 1995.
M.G. Wurtele. On the problem of truncation error. Tellus XIII, 3:379-391, 1961.
J.C. Wyngaard, W.D. Bach Jr., S. Burk, W.R. Cotton, J.H. Ferziger, S.R. Hanna, P. Moin, W. Ohmstede and J.C. Weil. Large-Eddy Simulation: Guidelines for Its Application to Planetary Boundary Layer Research. Michaels Communications, Boulder, Colorado, USA, 1984. Final Report from The Working Group on Large-Eddy Simulation; J.C. Wyngaard (Ed.).
M. Yao and D.S. Malkus. Boundary node correction and superconvergence in the FEM. Int. J. Numer. Meth. Fluids, 10:713-721, 1990.
H.C. Yee, P.K. Sweby and D.F. Griffiths. Dynamical approach study of spurious steady-state numerical solutions for nonlinear differential equations. Part 1: The Dynamics of Time Discretizations and Its Implications for Algorithm Development in Computational Fluid Dynamics. J. Comput. Phys., 97:249-310, 1991.
H.C. Yee and P.K. Sweby. Global asymptotic behavior of some iterative implicit schemes. Int. J. Bifur. Chaos, 4(6):1579-1611, 1994.
H.C. Yee and P.K. Sweby. Dynamical approach study of spurious steady-state numerical solutions for nonlinear differential equations. Part II: The dynamics of numerics of systems of 2 x 2 ODEs and its connections to finite discretizations of PDEs. Int. J. Comput. Fluid Dyn., 4:219-283, 1995a.
H.C. Yee and P.K. Sweby. Proc. Conf. Numerical Methods for the Euler and Navier-Stokes Equations. 1995b. Chap. "On super-stable implicit methods and time-marching approaches"; Montreal, Canada, September 14-16, 1995; to appear Int. J. CFD; also RIACS Technical Report 95.12 (July, 1995).
H.C. Yee and P.K. Sweby. Nonlinear Dynamics & Numerical Uncertainties in CFD. National Aeronautics and Space Administration, USA, April 1996. NASA Technical Memorandum 110398; also submitted to J. Comput. Phys.
S.T. Zalesak. Fully multidimensional flux-corrected transport algorithms for fluids. J. Comput. Phys., 31:335-362, 1979.
M.M. Zdravkovich. Flow Around Circular Cylinders. Oxford University Press, Oxford, England, UK. Vol. 1, 1997.
L. Zhang and L. Li. On superconvergence of isoparametric bilinear finite elements. Commun. Numer. Meth. Eng., 12:849-862, 1996.
J.Z. Zhu and O.C. Zienkiewicz. Adaptive techniques in the finite element method. Commun. Appl. Numer. Methods, 4:197-204, 1988.
O. Zienkiewicz and J. Wu. Incompressibility without tears - how to avoid restrictions of mixed formulations. Int. J. Numer. Meth. Eng., 32:1189-1203, 1991.
O. Zienkiewicz and J. Wu. A general explicit or semi-explicit algorithm for compressible and incompressible flows. Int. J. Numer. Methods Eng., 35:457-479, 1992.
O.C. Zienkiewicz and P.N. Godbole. Finite Elements in Fluids, Vol. I: Viscous Flow and Hydrodynamics. John Wiley and Sons, Ltd, London, England, UK, 1975. Chap. 2, "Viscous, incompressible flow with special reference to non-Newtonian (plastic) fluids"; pp. 25-55; R.H. Gallagher, J.T. Oden, C. Taylor and O.C. Zienkiewicz (Eds.).
O.C. Zienkiewicz and K. Morgan. Finite Elements and Approximation. John Wiley and Sons, Inc., New York, New York, USA, 1983.
O.C. Zienkiewicz and R.L. Taylor. The Finite Element Method. Vol. 1: Basic Formulation and Linear Problems. McGraw-Hill Book Company (UK) Ltd, London, England, UK, 4th edition, 1989.
O.C. Zienkiewicz and E. Onate. Nonlinear Computational Mechanics: State of the Art. Springer-Verlag, Berlin, Germany, 1991. Chap. "Finite Volume vs Finite Elements. Is There Really a Choice?"; pp. 240-254; P. Wriggers and W. Wagner (Eds.).
O.C. Zienkiewicz and R.L. Taylor. The Finite Element Method. Vol. 2: Solid and Fluid Mechanics, Dynamics, and Non-Linearity. McGraw-Hill Book Company (UK) Ltd, London, England, UK, 4th edition, 1991.
O.C. Zienkiewicz and J.Z. Zhu. The superconvergent patch recovery and a posteriori error estimates. Part 1: the recovery technique. Int. J. Numer. Meth. Eng., 33:1331-1364, 1992a.
O.C. Zienkiewicz and J.Z. Zhu. The superconvergent patch recovery and a posteriori error estimates. Part 2: error estimates and adaptivity. Int. J. Numer. Meth. Eng., 33:1365-1382, 1992b.
J.Z. Zhu. Further tests on the derivative recovery technique and a posteriori error estimator, in Finite Elements in the 90's. Springer-Verlag/CIMNE, Barcelona, 1991; E. Onate, J. Periaux and A. Samuelsson (Eds.).
J.Z. Zhu. A posteriori error estimation - the relationship between different procedures. Comput. Meth. Appl. Mech. Eng., 150:411-422, 1997.