Текст
                    The Genesis of Simulation in Dynamics


Springer New York Berlin Heidelberg Barcelona Budapest Hong Kong London Milan Paris Santa Clara Singapore Tokyo
Thomas P. Weissert The Genesis of Simulation in Dynamics Pursuing the Fermi-Pasta-Ulam Problem With 63 Illustrations Springer
Thomas P. Weissert The Agnes Irwin School Rosemont, PA 19010 USA Library of Congress Cataloging-in-Publication Data Weissert, Thomas P. The genesis of simulation in dynamics : pursuing the Fermi-Pasta- Ulam problem / Thomas P. Weissert. p. cm. Includes bibliographical references and index. ISBN 0-387-98236-1 (alk. paper) 1. Dynamics — Computer simulation. 2. Dynamics — Mathematical models. I. Title. QC133.W45 1997 53r.ir0113-DC21 97-9272 Printed on acid-free paper. © 1997 Springer-Verlag New York, Inc. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone. Production managed by Timothy Taylor; manufacturing supervised by Jacqui Ashri. Camera-ready copy prepared from the author's L^I^jX files. Printed and bound by R.R. Donnelley and Sons, Harrisonburg, PA. Printed in the United States of America. 987654321 ISBN 0-387-98236-1 Springer-Verlag New York Berlin Heidelberg SPIN 10575007 (Hardcover) ISBN 0-387-98237-X Springer-Verlag New York Berlin Heidelberg SPIN 10575017 (Softcover)
The tree that is nurtured can not be wrought. Our Mum and Dad conceived; summer solstice they got. Sister planted that seed; she brought my mind to light. Then my wife, ah my life, she brings my heart delight. My Lizzy, June flower, my purpose is complete. Oh joy, that tree is me. This is for She.
Preface We hear a lot about dynamical systems theory these days—"chaos dynamics" is the popular idiom—but very little of this abstract discipline is taught in undergraduate or even in graduate school, unless that happens to be one's own field. People may have heard some of the terms, but they may not understand what a dynamicist does. Just what does a dynamicist do? Although I've been through the training, I spend much more time talking and thinking about dynamics than actually doing it, which is a circumstance that affords me an interesting perspective. Some time after defending my dissertation, when I stopped back by my alma mater on a visit through Boulder, I ran into one of my advisors, a practicing dynamicist, and he was excited to show me his new workstation and its resident software. Hopefully, by relating what I experienced then, I may in part help to answer my opening question. He sat down in front of a very large screen, at least 19 inches across, so closely, that it filled his field of view completely. At his fingertips were the usual mouse and keyboard. He fired up the software and chose a model; I think we were out to explore "just the standard map." As the program began, the screen remained black, except for a small blinking cursor at the middle. Resting the fingers of one hand on the keyboard, and the other on the mouse, he tapped the keys, slid the mouse, and clicked its button. Several streamlines appeared on the screen immediately, and they began to flow, extending outward, waving—just colored lines on a black background, squirming and dancing. As he swept the mouse across the pad while rapidly clicking the button, the almost invisible cursor arc'd across the screen, initializing new streamlines as it went. The new lines replaced some of the others, as those decayed away, like the red trail of a flame moving in the dark. I began to see the screen as a window onto a flowing fluid, its undulating texture made manifest by the waving streamlines.
viii Preface Without taking his eyes off the screen, he named the features as they flowed by—islands of order in a sea of chaos, abstract landscapes in a virtual realm. Every so often, he would type a key and so bump the control parameter; the scene changed, eddies appeared, and the swirling became more intense. I thought about vorticity, curl, and electromagnetic fields. I saw the development of attractors, knots in the streamlines, point sources, and singularities; they would come and go, becoming unstable as the parameter changed. I saw catastrophe and understood about structural stability. More sweeps, more lines; he created more whenever he needed more detail. Out of nowhere a pair of stable and unstable fixed points emerged together and begin to spread apart, appearing to me as I had always imagined pair- production out of the quantum vacuum might look. I began to see all kinds of abstract theorems and structures enacted across that screen. But mostly what I saw was a dynamicist exploring phase space with his senses; he watched it flow past, yet he controlled the flow. By moving his hands, he navigated through virtual space; it was simulation in real time. I sat back and watched the beginnings of virtual reality; but not an imitation reality like the CD-ROM games you get now. This reality was the abstract Cartesian space where points and lines represented evolving dynamical systems. Yet by immersing his senses, sight and touch, the dynamicist was climbing into phase space and taking it for a ride. This activity is just part of what a dynamicist does these days; and so just maybe, by going back to the beginning, we can see a little of how dynamical systems theory came to be. Acknowledgments. I would like to thank the following people who were instrumental to this project's existence: Allan Franklin allowed me to study philosophy in the face of zero job prospects, and he continued to encourage me while I created and developed this project with no model to work from. John Cary and Jim Meiss spent many long and I'm sure tedious hours helping me understand the extremely abstract field of dynamics. Allan, John, and Jim also read most of the contents of this book and did their best to ensure that I got both the history and the physics right; any remaining errors must have been inserted by me after their scrutiny. Professor Joseph Ford was very kind and generous with his correspondence, providing me with many references along the way. Working on the time scale with Marc Weiss at the NIST while I researched this project was both a delight and a significant means of support. Paul Harris sharpened my mind with countless hours of theoretical conversations. Lois Cole gave me at least one very nonlinear year of her life. Jo Alyson Parker taught me how to write, and she continues to be an outstanding reader and editor. Tom von Foerster, my editor at Springer-Verlag, believed in this project and in my ability to bring it off. Finally, I must thank Elizabeth Parker Weissert, whose presence in this world has reconciled me to myself.
Contents Preface vii List of Figures xiii Introduction 1 Part I: History 7 1. The FPU Model and Simulation: "A Little Discovery" 9 1.1. Development 9 1.2. Dynamics to Statistical Mechanics 11 1.3. Surfaces of Constraint 13 1.4. Global Versus Local Analysis 15 1.5. Simulation 15 1.6. Loading the Nonlinear String 16 1.7. Modal Representation 19 1.8. Model Considerations 21 1.9. Results 23 1.10. Discussion Post Hoc 27 2. The FPU Research Program: Echoes on a String 31 2.1. The Threads of a Research Program 31 2.2. The Nonlinear Discrete Lattice 32 2.3. Ford, 1961 33 2.4. Jackson, 1963 37 2.5. Ford and Waters, 1963 41 2.6. The Continuous String 43 2.7. In the Continuous Limit 45 2.8. Discreteness as Viscosity 46
x Contents 2.9. The First Soliton Paper 47 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" 51 3.1. A Brief History of Dynamics 51 3.2. The Fundamental Problem of Dynamics 53 3.3. The Small Divisors Problem 57 3.4. Poincare to Kolmogorov 60 3.5. The Conjecture 64 3.6. Beyond the Blaze 66 3.7. The Henon and Heiles Simulation, 1964 72 4. Research Threads Come Together: Harmonic Convergence 83 4.1. The Story Continues 83 4.2. Izrailev and Chirikov, 1966 84 4.3. Zabusky and Deem, 1967 89 4.4. Walker and Ford, 1969: Physical Review 91 4.5. Ford and Lunsford, 1970 93 4.6. Lunsford and Ford, 1972 97 4.7. The Toda Lattice Is Integrable 99 Part II: Philosophy 103 5. Steps to an Epistemology of Simulation 105 5.1. Introduction 105 5.2. Hierarchy of Modeling 107 5.3. Historical Significance 110 5.4. Experiment Ill 5.5. Epistemology 113 5.6. Preconceptions 117 5.7. Strategies for Belief and Pursuit 122 5.8. Case Study I: Fermi-Pasta-Ulam 125 5.9. Case Study II: Henon and Heiles 128 5.10. Methodology 129 5.11. Irreversibility 132 5.12. Proof 133 5.13. Proof and Simulation 136 Appendix A. Hamiltonian Dynamics: Language of Abstraction 139 A.l. Topology and Phase-Space Trajectories 140 A.2. Canonical Transformations 141 A.3. Transforming the Unperturbed String 141
Contents xi A.4. Cyclic Coordinates 142 A.5. Liouville Integrability 143 A.6. The Action-Angle Variables 143 A.7. Dynamics on a Torus 147 A.8. Commensurability: Two Types of Motion 148 A.9. Digital Representation 152 A.lO.Physical Reality and the Continuum 152 A. 11.Perturbing the String 153 Glossary 155 References 157 Index 167
List of Figures I.l. Road Map to the FPU Research Program 3 1.1 The Discrete String 17 1.2 Normal-Mode Displacements 20 1.3 FPU-1: Quadratic Perturbation 24 1.4 FPU-9: Time-Averaged Energy Distribution 24 1.5 FPU-8: String Displacement 26 1.6 The FPU Super-Period 29 2.1 Ford's Modal Energy Plot 36 2.2 The Confused Question of Recurrence 39 2.3 From an Almost Constant of the Motion 43 3.1 Concentric Shells of Tori 63 3.2 Toroidal Phase Space in Perturbation 67 3.3 A Heuristic Explanation of FPU 69 3.4 The Henon and Heiles Original Figures 76 3.5 Reproducing Henon and Heiles Level Curves 80 4.1 Resonance Overlap and FPU 85 4.2 The Basis for Resonance Overlap 87 4.3 Initially Excited Optical Modes 90 4.4 Stochasticity and Resonance Overlap 96 5.1 The Henon and Heiles Original Figures 119 A.l Geometrical transformations 144 A.2 The 2-Torus Coordinate System 148 A.3 Single Orbits on Tori 150 A.4 Multiple Orbits on Tori 151
Introduction Defining My Project Although it is often referred to in the dynamics literature as a standard reference, the Fermi-Pasta-Ulam problem (FPU) has not been comprehensively documented anywhere in a way accessible to a nonspecialized audience. Thus it is virtually unknown outside of dynamics; yet the FPU problem marks a vital turning point in that field. In one of the very first computer simulations, performed in the early 1950s, a simple nonlinear string model exhibited behavior that completely baffled some very prominent scientists. In effect, the FPU problem revitalized the field of nonlinear dynamics, which had been dormant for nearly 50 years. Classical dynamics had reached its peak by the latter part of the nineteenth century. In fact, its very name became synonymous with an entire scientific world view. The universe was thought to be as a mechanical clock, or a completely deterministic book written in the language of the calculus. Many scientists believed that all the major problems had been solved, and that the only challenge remaining was to iron out details, which was indeed a pretty depressing prospect. Where was the challenge in that? However, at the very end of the nineteenth century, Henri Poincare took on the three-body problem in his three-volume magnum opus. In those weighty volumes, he developed the topological method of dynamics as we know it and made great strides in the development of perturbation theory. But as he reached the very end of the third volume, Poincare glimpsed a degree of complexity in that low-order problem that was well beyond anything yet encountered. The universal text of the classical paradigm had some missing pages, text that might turn out to be indecipherable, even with the calculus. Perturbation theory calculations demanded a great deal of time and pedantic repetition; at the beginning of the twentieth century, any useful result in nonlinear dynamics was well out of reach of the pencil and paper dynamicist. Thus, these problems were shelved for half a century—until
2 Introduction burgeoning computer technology brought the possibility of their solution within reach. Having worked with some of the first digital computers during the Manhattan project, the renowned physicist Enrico Fermi recognized the utility of their fast computational speed and relentless repetition for making nonlinear dynamical calculations. Fermi, along with his associates John Pasta and Stanislaw Ulam, expected that the harmonic linear string model, perturbed with a small nonlinear term, would in simulation tend toward equipartition of energy among the normal modes. But that outcome did not occur; where they expected chaos, they found a new realm of order. My self-appointed task, which was to gather the historical information about FPU and develop it into a coherent form, itself proved to be a voyage of discovery. When I began to study the problem, I asked two different dynamicists what the solution was; one said "I thought it was KAM," and the other said "wasn't it KdV?" Such divergent responses intrigued me; clearly there was some shadow of a doubt about this so-called standard problem. Then, as I starting reading and researching, I ran into the problem of identifying the references. The very first such problem was that the original FPU paper was never published in the traditional way, although I was able to locate it in Fermi's collected works. After some thought, I realized that I could not simply work forward in time because articles refer to prior work only, nor could I work backward in time because of the difficulty of finding starting points. Instead I zig-zagged back and forth in time, tracing authors and all the prior articles to which each relevant article referred. Ultimately I uncovered a network of related articles extending over a 20-year period, 1953 through 1973. Beyond 1973, very little further work was performed on the FPU model, indicating that interested parties were satisfied by the conclusions that had been drawn, and that the leading edge of research had moved on to new challenges. Road Map to Research Figure 1.1 represents the main body of publications documenting the FPU research program. This complicated diagram functions as a road map to our travels. I describe its basic structure here, and then treat it piecewise throughout the coming discussion. I strongly recommend periodically referring back to this main figure in order to see a thread within the context of the whole program. For example, in the main figure, we see a feed from the original Tuck and Menzel work—before its publication—over to Zabusky's 1963 article, indicating that the latter worked with full knowledge of the former significant findings. This fact does not show up in the subfigure I use later in the text, in the discussion of Zabusky's work. Although they become densely interwoven in time, there are roughly five threads running through this story, three of which began independently of the others, and two that began as deviations from another thread. For ex-
Introduction 3 Fermi-Pasta-Ulam Simulation 1954 Tuck-Menzel [Simulation 1961 FPU Published 1965 Tuck-Menzel Published 1972 ( The FPU Research Program 1954-1973 Zabusky 1962 m Zabusky 1963 f f Zabusky Kruskal 1964-65 Zabusky Deem 1967 Zabusky 1971 Zabusky 1973 iJackson 1963 Ford 1961 f ¥ Ford-Waters 1963 ft f Waters-Ford 1966 ffffff Walker-Ford 1969 Ford-Lunsford 1970 Lunsford-Ford 1972 [Ford, Stoddard & Turner 1973 Pr t > Kolmogorov Conjecture 1954 Moser 1962 Arnold 1963 Izrailev Chirikov 1966 Toda 1967 Henon-Heiles Simulation 1964 IT Gustavson 1966 Greene 1968 ? f 1974 FPU Solitons Discrete Lattice KAM Figure I.l. The chronological history of the FPU research program, including explicit inferences represented by arrows. Each box represents a publication. There are three independent starting points.
4 Introduction ample, Zabusky initiated his work in an attempt to solve the FPU mystery, but his theory of solitons took on a significant life of its own. Of course nothing happens completely in isolation, so by "independent," I mean that these researchers were not aware of the work performed by the others at the time of writing that particular article. The chronology runs downward, beginning with 1954 at the top and ending with 1973 at the bottom. Each box represents a specific contributing journal article or sequence. The lines and arrows indicate the complicated flow of information as it was acknowledged explicitly by the authors, showing the a priori knowledge informing the work contained within each article. The diagram does not include all articles by each author, but a sampling of the most significant pieces—those that I discuss in this book. The original FPU article—available to select circles but not actually published until 1965—Tuck and Menzel's continuation of the original FPU simulation, and the article that was eventually published in 1972 make up one thread. Down the left-hand side of the diagram we see the sequence of four boxes in this line. Beginning again at FPU, the line to Zabusky's initial article (1962) indicates that the FPU problem informed and inspired Zabusky's work on continuous wave theory using partial differential equations. That work becomes the independent soliton (KdV) thread thereafter, converging once again with the other FPU work at the Walker and Ford Physical Review article (1969). The passage from FPU to Ford (1961) begins the central FPU research program—an attempt to retrodict the FPU simulation results using perturbation theory on the discrete nonlinear lattice. The flow continues downward with the papers by Ford, Jackson, and Waters, eventually intersecting the KAM thread sometime before the Walker and Ford article. The KAM thread begins independently with Kolmogorov's 1954 conjecture. It contributes significantly to the post-1966 work on FPU, and it coincidentally adds to the significance of the year 1954 in the history of dynamics. The KAM theorem was proved independently by the two mathematicians Moser (1962) and Arnold (1963a) using quite different methods. KAM was known to Izrailev and Chirikov (1966), and it passed directly into the English-language physics community via the Walker and Ford Physical Review article in 1969. Ford and Waters claim to have been aware of Kol- mogorov's 1957 address to the International Congress of Mathematicians, in which he repeated the conjecture of 1954. However, because they did not see the direct implications of that conjecture for the FPU problem, I indicate their citation with a dashed line. Not so much a thread per se, the Izrailev and Chirikov (1966) article stands out as the first to identify the results of FPU with the implications of the KAM theorem. Although it came out in 1966, its explicit references place it as following directly from the original FPU work without reference to any of the American work. By that time, Ford, Waters, Jackson, Zabusky, Tuck, and Menzel had all made contributions. Perhaps the de-
Introduction 5 lay in information exchange might be attributed to the strained relations between the Soviet Union and the United States. My Presentation In the first chapter, I present the FPU work in detail, including discussions of the thinking, expectations, and results of Fermi et al. The mathematical development is elementary for the most part, and as such should be familiar to anyone with an undergraduate background in physics; but it should also be accessible to anyone interested in mathematical science and/or the history and philosophy of dynamics. Beyond the first chapter, we follow the trail of what I will be calling the FPU research program. Chapters 2 and 4 are very similar in that they track the thinking of dynamicists through the 1960s, when new methods were developed and new discoveries incorporated. The material concerned with the FPU research program divides neatly into two phases, which are punctuated by the arrival of the Kolmogorov- Arnold-Moser (K AM) theorem. Because news of the K AM theorem reached the main group of researchers in 1966,1 end my discussion of the first phase of the FPU research program in Chapter 2 at that year, and proceed to Chapter 3 which is devoted to a discussion of the KAM theorem itself, its development, the dynamical problem it solves, and the nature of that solution. Following the KAM material, I return to the second phase of FPU work in Chapter 4 and continue to its end in 1973. In Chapter 5, I take up the subject of the epistemology of simulation. Computer simulation has become a new means of scientific investigation, taking its place alongside theory and experiment. As we will see in this exposition, the genesis of simulation is inextricably intertwined with the rebirth of dynamics. For nearly 40 years now, computer simulation has been an invaluable tool for mathematical modeling. Part and parcel with these new instruments and new realms of phenomena, there arises a series of new questions to be addressed: What can a simulation tell us about? How is simulation like and unlike an experiment? Why should we believe what the simulation tells us? And when can a simulation be used to verify or reject an hypothesis? Focusing on the use of simulation in this case study, I seek answers to these important questions. Contrary to what might be assumed at first glance, simulation is far from being a transparent window onto the phase space of nonintegrable models. However, by studying the limitations of simulations, we increase the resolution of what we can learn from them. In so doing, we are able make better choices about how to design a useful simulation. Better simulations mean better understanding of nonintegrable models, and better understanding of macroscopic physical behavior. In an attempt to familiarize the reader with the sometimes difficult language and tools of dynamics, I have included both a long Appendix, in
6 Introduction which I develop the model of the string in a language appropriate for understanding the other developments discussed throughout the book, and a Glossary, which explains many of the very specialized terms in the language of dynamics. Dynamics is a big subject with many different subdisciplines. The language and tools used by some workers may be quite different than those used by others, and this situation was never more in force than in the early days of dynamical systems theory. The list of glossary terms also may be found in the table of contents for easy reference. If at any point in the text, the reader gets bogged down in the formalism, I recommend a short trip to the appropriate section of the Appendix. At the heart of this project is a story of scientific discovery. It follows the usual narrative line: simple expectations lead to a baffling result; random historical events impede progress and access to information; and new research tools lead to new vistas of physical behavior. I have enjoyed researching this material, and I hope that readers enjoy the story as well as the subject matter.
Part I: History This terror of the mind, these shadows must be dispelled not by the suns7 bright shafts, nor by the brilliant daylight, but by an understanding of the laws of Nature. Lucretius (Book I, lines 146-148) The FPU paradox forces us to face some of our deepest insecurities. Given the Hamiltonian for a system, what is the character of its motion? ... Indeed, the ufull and final" explanation of FPU still pends. It is this very fact which makes the FPU paradox such a delightful pedagogical "skeleton" upon which to drape the evolving story of nonlinear dynamics/chaos. Joseph Ford (1992)
1 The FPU Model and Simulation: "A Little Discovery" ... it makes an enormous difference how the atoms are placed, and in what position they are brought together, and what movements they give each other and receive, and how the same atoms, with only a little change, can make both "flames" and "firs," just as these two words have similar elements, but differently arranged, and so we speak of flames and firs by different names. Lucretius (Book I, lines 908-914) Let us say here that the results of our computations show features which were, from the beginning, surprising to us. Fermi, Pasta and Ulam (1955, p. 981) 1.1. Development In the Spring of 1952, the MANIAC-I computing machine came on line at Los Alamos. One of the first digital computers, it filled a large room, used vast arrays of large, hot, glowing vacuum tubes for computation, and processed box after box of permanently inscribed, single-command, disposable, punched cards. That summer, in a professional configuration foreshadowing the disciplinary alliance of the future, physicist Enrico Fermi, computer scientist John Pasta, and mathematician Stanislaw Ulam, gathered together to discuss the computation potential presented by the speed, accuracy, and relentless automation of these new electronic marvels. Although these machines were quite a bit slower than even the first personal computers of the 1970s, to say nothing of the workstations most physicists use routinely today, they could provide access to some of those physical problems that had remained marginalized for 50 years because it took so long to laboriously carry out numerical methods by hand. Fermi wanted to develop new
10 1. The FPU Model and Simulation: "A Little Discovery" heuristic techniques for investigating nonlinear dynamical problems "experimentally" by beginning with the simplest possible nonlinear problem, solving it, and moving on to successively more complex problems, perhaps even making an approach to one of the most difficult nonlinear problems of all—turbulence. Following the method of perturbations devised by Poincare in the 1890s, Fermi et al. (1955) chose the one-dimensional vibrating string as an unperturbed, integrable model. The linear-string results from first expanding the Newtonian equations of force between discrete points along the string in an infinite series, and then dropping all the nonlinear terms. The solution of this linear model is known; it can be derived analytically and exactly; it is the ubiquitous simple harmonic motion that allows for no dispersion of the conserved energy. Whatever modal energy configuration exists at the outset remains the same throughout the subsequent motion of the string. Of course, real strings do not behave this way for very long; energy is both dispersed among harmonic modes and dissipated to the surroundings. Even when we ignore the fact that a real string dissipates energy all during its motion and eventually comes to a stop, the energy in an energy-conserving string does not remain in its initial configuration either. Whereas the linear force terms of the expansion do not allow for energy-sharing, the nonlinear model terms link the modes together and do allow for dispersion of the energy between the modes. From the solid, predictable ground of linear harmonic motion, Fermi et al. proceeded cautiously into the unknown territory of computational nonlinear dynamics. They added one small nonlinear force term to the linear model. The resulting equations of motion could no longer be integrated analytically. The shape of such a string, just barely different from the linear model, could not be predicted accurately by anyone after only a few hundred harmonic periods. Using numerical integration, which approximates the evolution of the string in very small linear steps, Fermi et al. programmed their computer to recursively approximate the solution of the equations of motion for various initial configurations of the string. In order to track the dispersion of energy from its initial configuration along its passage through the higher harmonic modes, Fermi et al. used the method of Fourier analysis. Any state or configuration of the discrete string may be decomposed into a set of Fourier or normal modes, which is a set of independent sinusoidal waveforms. In this way, they could observe the distribution of energy among the modes during the simulation. This choice of representation made the ideal initial condition clear, that is, place all the initial energy into a single normal mode, then watch it disperse. They even hoped to calculate a dispersion constant that represented the rate at which the energy spread out. Fermi et al. knew that, given any initial condition, a string would move toward its most probable state—thermal equilibrium (chaos), wherein all of the energy is distributed evenly among the accessible modes. Any term
1.2. Dynamics to Statistical Mechanics 11 added to the model that links together modes should act as a bridge for the energy to reach higher modes. Energy should march through the sequence of harmonic modes like champagne spilling down a pyramid of glasses. Fermi et al. wanted to measure the rate of thermahzation, which they believed was related to the relative strength of the additional nonlinear term. Nothing in the theory of nonlinear dynamics led Fermi et al. to expect anything other than a consistent march toward thermahzation; but for every run of the simulation, beginning from several different initial conditions and three different nonlinear perturbation terms, the system failed to thermalize. After thousands of cycles of the calculations, corresponding to hundreds of swings of the string, the energy never dispersed beyond the lowest few normal modes, although strangely enough, it did seem to cycle around among the modes that it could reach. Not only did the FPU simulation clearly fail to relate the rate of thermahzation to the size of the nonlinearity, but it chanced upon a new realm of orderly behavior. This result is analogous to an unexpected result in a carefully designed physical experiment; it demanded an explanation. The search for such an explanation brought about a total reevaluation of our expectations about nonlinear behavior. In so doing, the Fermi-Pasta-Ulam problem initiated the study of computational nonlinear dynamics—what is now called dynamical systems theory. No one expected the existence of isolating structures in the phase space of conservative nonlinear dynamical systems—structures that would inhibit the tendency toward thermal equilibrium. Instead of chaos, some new form of order was at work. As we will see, whatever was holding the energy in the lowest few harmonic modes of the string, did so consistently, as if there were a modal energy barrier. Eventually, the repercussions of this work led to an entirely new understanding of complexity, a reinterpretation of our understanding of order and stochasticity, and the discovery of whole new regimes of dynamics, accompanied by new methods of exploring them. 1.2. Dynamics to Statistical Mechanics What led Fermi to his choice of the nonlinear string? One of his many early interests had been in the ergodic theory—the mathematical study of the long-term average behavior of measure-preserving transformations. In terms of physics, ergodic theory is closely allied to statistical mechanics. Whereas classical dynamics studies the dynamical behavior of systems of individual particles acting under classical laws of motion, using trajectories to follow the deterministic evolution of the system, statistical mechanics concerns itself with the statistical behavior of ensembles, or large numbers of particles acting as a coordinated group, and does not recognize or focus on the causality of individual elements. In this sense, statistical mechanics
12 1. The FPU Model and Simulation: "A Little Discovery" is a phenomenological theory that assumes Newtonian mechanics as the underlying fundamental theory. The elusive connection between these two disparate descriptions of mechanical systems has been a source of some concern, especially within a physics community driven by a tradition of fundamental (deterministic and upwardly causal) theories. A fundamental foundation for statistical mechanics would require a way to derive the irreversibility of the second law of thermodynamics from deterministic causes, which is a problem of fundamental difficulty. Fermi wanted to find the smooth transition from classical dynamics to statistical mechanics. Traditionally, the number of particles in a system has been used to determine which branch of mechanics would be the best descriptive theory. As the number of particles increases beyond the computational capabilities of classical dynamics, researchers tend to turn to statistical mechanics for description. However, the predictions of statistical mechanics are more valid for systems with numbers of particles on the order of Avogadro's number (1023) and significantly less valid as the number of particles goes down. On the other hand, classical dynamics gets terribly complicated for any system with more than two particles in them. Here the boundary region takes shape: What can we say about the dynamics of systems with tens, hundreds, and thousands of particles? As the number of particles increases, the number of individual interaction terms increases exponentially and the integrable (usually linear) approximation to the dynamics no longer dominates the behavior. The nonlinear interaction terms, that may be ignored no longer, couple the particles together in a web of interactions, increasing in complexity until total stochasticity dominates the dynamics. As of 1954, expectations for the dynamical behavior of systems beyond those already understood—in the unexplored margins of classical science— were informed by the following basic assumption: Because they are still deterministic, nonlinear force terms can create only technical limitations on the extent of our knowledge, due both to the failure of analytical methods, and to our lack of computational capabilities. It was the second of these conceived limitations that Fermi et al. were trying to overcome. Their simulation was to be the first in a sequence that would demonstrate the transition from the deterministic trajectory dynamics of classical mechanics to the ensemble behavior of statistical mechanics. Fermi et al. knew that the number of particles did not have to be large for the system to develop stochastic behavior. Indeed, they were well aware of Poincare's discovery, 50 years prior, of deterministic chaos in the three-body problem. They were convinced that just a few particles with sufficiently large nonlinear terms would be enough to demonstrate the expected transition. Simulation was intended to be a new instrument of perception; we could look into an area of dynamical behavior that had been blocked off by our computational and analytical failings. However, the transition from classical dynamical systems at the simple end of the mechanical-complexity spectrum, to statistical mechanics at the other, was turning out to be significantly less straightforward
1.3. Surfaces of Constraint 13 than dynamicists had anticipated. Nonintegrability in itself was known to be a necessary but not a sufficient condition for ergodicity. The boundary between "necessary" and "sufficient" conditions is characteristic of the new frontier of dynamical behavior glimpsed by Fermi et al. in 1954. 1.3. Surfaces of Constraint The project of dynamical systems theory is to map the region left open between classical mechanics and statistical mechanics. We often speak of the topological properties of the surfaces accessible to the system dynamics in phase space. Phase space, defined in the Glossary, is the Cartesian configuration space where we track the movement of the system trajectory over time. Once we have converted the dynamical equations to discrete form for computer simulation, we can treat them as mappings of phase space into itself. In this way, we bring the full power of topology to bear in the study of nonintegrable behavior. The transition from classical dynamical systems to statistical mechanical systems may be characterized by the successive breakdown of the integrals of the motion. An integral of the motion corresponds to the ability to integrate analytically one variable of the model. More significantly, an integral of the motion indicates the existence of one additional surface of constraint in phase space, restricting the trajectory to one fewer degree of freedom. If a dynamical system is integrable, then it has a complete set of integrals of the motion and so the trajectory is bound to a known surface (the curve that is the intersection of all the surfaces of constraint). There can be no uncertainty about its movement, no deterministic chaos, and no ergodicity. Analytically integrable systems are completely predictable. If, however, one integral of the motion breaks down, meaning that some parametric change in the model's form reduces the number of integrals by one, then the trajectory is freed from that surface of constraint and may move about in the space delineated by all the other surfaces: one degree of freedom has been unlocked. I emphasize that the trajectory is still deterministic, but we may not be able to predict where in that space of its new freedom it will be at any given time. The trajectory may be ergodic within any portion of phase space (of at least three dimensions) that is bounded by surfaces of constraint. At the extreme opposite end from having a complete set of integrals, there may be only one. In Hamiltonian dynamical systems, the quantity represented by the Hamiltonian function (usually the total energy) is always conserved, and thus it guarantees at least one surface of constraint to which the trajectory must remain affixed. This surface is called the energy surface; it is a subspace of phase space that is one dimension lower than the 2n dimensions of the full system. Dynamics in any Hamiltonian system always
14 1. The FPU Model and Simulation: "A Little Discovery" takes place on the energy surface. If total energy is the only conserved quantity, then the trajectory is free to move about the (2n— l)-dimensional energy surface such that its location at any time may be unpredictable. This is the behavior known as ergodic on the energy surface, which is often referred to as ergodicity. An example of a system easily described in terms of its degrees of freedom and integrals of the motion is the two-body problem—two point masses acting under the mutual attraction of Newtonian gravity. This system has two degrees of freedom: separation and rotation. Phase space has four dimensions, and the energy surface is a three-dimensional subspace embedded within these four dimensions. The Newtonian force of gravity is a central force, so the Hamiltonian function depends on the separation variable only: it is independent of the rotation variable. Thus angular momentum, which is the variable that is canonically conjugate to rotation, is an integral of the motion. Angular momentum in the two-body problem is always a conserved quantity. This second integral restricts the system's trajectory by one further dimension to a two-dimensional phase plane. Therefore most of the energy surface is inaccessible to an arbitrary trajectory, which must remain constrained to a phase plane of constant angular momentum. The actual quantity of energy in the system is determined by the initial conditions. This conserved quantity specifies a specific value for the orbital area in the two-dimensional plane circumscribed by the system trajectory. For a fixed area in a plane, the trajectory must be a closed curve, which may always be transformed into an ellipse. This system is well ordered and absolutely predictable. All of this work relates back to Kepler's laws of planetary motion, one of which identifies planetary orbits as ellipses. Because a single planet orbiting the sun constitutes the prototypical two-body problem, the term "orbit" is traditionally applied to a trajectory with a number of integrals of the motion, even if the dynamical system has nothing else to do with orbiting bodies. Integrable systems are predictable because trajectories move along a curve that is the intersection of all those surfaces of constraint. When a system is not integrable, then its trajectories are assumed to access freely all of the energy surface; so it seems reasonable to expect ergodicity to follow immediately from nonintegrability. But why should it follow that nothing else besides integrals of the motion might exist to constrain the movement of the trajectories? The answer is that integrals of the motion are the only known guarantee of bounded behavior. If ergodicity were to follow directly from the absence of integrals of the motion, then that boundary between dynamics and statistical mechanics would be quite narrow. Fermi et al.'s "ergodic hypothesis" assumed that when a Hamiltonian model did not have a complete set of integrals of the motion, it would immediately tend toward ergodicity.
1.4. Global Versus Local Analysis 15 1.4. Global Versus Local Analysis The separation between Hamiltonian dynamics and statistical mechanics is a local-versus-global distinction (Balescu, 1975, p. 714). At the local level, Hamiltonian dynamics is concerned with questions about the evolution in phase space of a single trajectory that results from the step-by-step integration of the equations of motion, beginning from an arbitrary initial condition. We ask the questions: Is the orbit periodic or conditionally periodic? Which regions of phase space are accessible to which trajectories? We can easily imagine several successive steps generalizing this inquiry toward a more global discussion. We might consider the behavior of a bundle of trajectories—a tube surrounding the first trajectory that corresponds to a small neighborhood of initial conditions. At this level we would be interested in questions concerning the local stability of trajectories—such as: Will trajectories that begin arbitrarily close to the first remain close to it, or will they diverge? This is the type of consideration Poincare took up and that we will adopt in our consideration of the KAM theorem in Chapter 3. At a still more global level, we might be interested in the hydrodynamic behavior of any region of phase space as we map it into itself successively. Here we would naturally take up the question of phase-space properties such as connectedness, density (in the measure-theoretic sense), ergodic- ity, and mixing. This hydrodynamic study is the approach of statistical mechanics. The evolution of single trajectories is of less concern than the average behavior of the hydrodynamic flow of the specified region of phase space. In the hydrodynamic approach, ergodicity implies that the region in question will eventually move over all the accessible phase space. 1.5. Simulation At each step in the numerical procedure of a simulation, we use transformation equations derived from the equations of motion, to make a prediction about the future state of the system, one that is an approximation over some small interval of system time. We adjust the time-step size to obtain either higher resolution when the trajectory is erratic, or lower resolution when it appears well behaved. If the trajectory behaves very erratically, then we want to use a very short time-step size; but when the trajectory is changing direction very slowly, we may increase the time-step size to obtain more information in a shorter time. The process of running iterative numerical integrations with varying initial conditions and parameters to map the phase space of the model, functions similar to a physical experiment, except that instead of testing hypotheses or theory predictions on a physical system, we try to follow the evolution of the mathematically true solution of a set of differential equations. If the true solution is changing in
16 1. The FPU Model and Simulation: "A Little Discovery" an erratic way, the predictions of the numerical procedure may lose track of it, thus degrading the efficacy of the simulation. Every digital computer must perform its calculations in discrete steps rather than continuously. The numerical procedure is discrete also; we map from one state of the system to the next via the prediction that is derived from fitting a continuous curve or straight line to some of the previous discrete states and projecting forward. We assume the system evolves continuously, but we cannot represent that change without these intervening discrete approximations. As this diagram shows: True-Solution Simulation Reconstruction we regain the impression of continuity by conceptually supplying a smooth curve fit to the pure succession of static states of the system—each state being a set of numbers calculated from the previous set. Because we do not know what really occurs within the prediction interval, we must exercise caution when we assume continuity between the states of the numerical procedure. This necessary, discrete approximation enables the possible divergence between the true solution and the simulated solution. "Deterministic chaos" is the term we apply to the trajectory when we cannot follow the true solution for any period of time, no matter how small we make the time-step size. We know the true solution is deterministic, but it appears to be random in our simulation. In Chapter 5, synthesizing from the seminal work brought forth by the FPU problem, I lay the foundation for an episte- mology of simulation, in which we try to understand when and if simulated solutions are consistent with true solutions. If a system is ergodic, then we may reasonably expect to see deterministic chaos in its simulation. Fermi et al. expected the string to tend toward ergodicity, but what they found did not appear to be random. 1.6. Loading the Nonlinear String Fermi et al. wanted to begin with a simple problem. The one-dimensional, linear, vibrating string has long been one of the standard problems physicists are expected to solve as part of their early training. One elegant aspect of the FPU problem is that the model is so fundamental and the perturbation so obvious. Let us review the derivation of this model, following Marion (1970), Tuck and Menzel (1972), Ford (1992), and Fermi et al. (1955). The string is seen as a set of discrete mass points set along
1.6. Loading the Nonlinear String 17 a one-dimensional line, with fixed endpoints. The masses are all identical and can move only longitudinally—that is, along the line in which they lie. Thus the string is not like a violin string vibrating back and forth (transversally), although that string can also be approximated with this same model. Using Newton's law of gravitation as our force law would require all the masses to attract all the other masses. However, instead of that complicated affair, we assume that there is significant attraction between only nearest-neighboring points. Thus the force on any single mass point depends only on the two mass points nearby; motion is generated locally by the interaction of three bodies. This fact forms the conceptual basis for making connections between the well-known three-body problem and the FPU problem. The relationship is definite, but it is complicated by the fact that in the FPU problem, each three bodies in succession are coupled to the next three bodies in the sequence by the two-point overlap. Further, we assume this attraction to be linearly dependent on the distance between the points. This approximation leads us to the Hooke's law linear force. Figure 1.1 shows the arrangement of mass points along the string with fixed endpoints. At equilibrium, each pair of mass points is separated by the equilibrium spring length d; but this constant neatly drops out of the equations. Using the Hooke's law force between nearest-neighbor mass points and setting k = m = d =1, using an appropriate system of units for convenience, the resulting equations of motion, from Newton's second law, take the form Qj = (qj+i-Qj) + (Qj-i-qj), j = l,2,...,n. (1.1) The force on and the total acceleration of each mass point depends linearly on the separation between itself and its two adjoining neighbors. As was mentioned, this is the model for coupled harmonic oscillators, with a completely predictable solution. We put our model into Hamiltonian form by writing down the total energy of the system in terms of the positions qj and momentums pj (where Pj = mqj) of all the mass points. The resulting unperturbed Hamiltonian Hq is the sum of the kinetic energies of the mass points and the potential energies of the springs between them: L=(n+l)d 12 H J j+l n-1 n FIGURE 1.1. The string is seen as a sequence of n identical discrete mass points to, separated by identical springs with material constant k and equilibrium length d.
18 1. The FPU Model and Simulation: "A Little Discovery" Ho(q,P) = \ £ [p2 + (to ~ Qj-i)2} ■ (1.2) The sums run up to n -f 1 because that is how many springs there are, so we must also define pn+i = 0, as there are only n mass points. The addition of a small nonlinear force term to the linear equations must result in behavior that diverges from harmonic oscillation and approaches somewhat that of the real string. This expectation derives from the validity of the series expansion of fundamental force laws. But strict validity applies only to the whole infinite series. Whenever we truncate to a finite set of terms, the model no longer truly represents the fundamental force laws, but only an approximation. By learning the effect of each successive, higher- order term in the expansion—added to the previous terms—we hope to construct a clear understanding of physical behavior. Fermi et al. assumed that the addition of either cubic or quadratic terms to the linear string would bring the system closer to the ergodic tendencies of a real string. For the unperturbed model, the linear force term was used. The next higher-ordered term in the series is the quadratic force term, then the cubic term, and so forth. Although Fermi et al. used three different perturbation terms separately in their work—a quadratic, a cubic, and a broken-linear force term—the separate outcomes all had the same inexplicable feature. I will concentrate on their application of only the first, lowest-order, nonlinear perturbation term, which is representative of all of their results. When added to the linear Hooke's law force, the quadratic perturbation yields the new equations of motion as follows: Qj = (tfj+i " Qj) + (Qj-i ~ Qj) + ai(Qj+i ~ Qj)2 ~ (Qj ~ Qj-i)% j = l,2,...,32. (1.3) Equation (1.3) contains the unperturbed linear force term (1.1) and the additional term that is quadratic in the displacements q. In this instance, I have replaced n by 32, the number of mass points actually used by Fermi et al. in the first run of their simulation. In accordance with the normal procedure for perturbation theory, we see introduced here a small parameter a to control the strength of the perturbation term. When we model real physical systems, this parameter should correspond to some combination of physical characteristics, such as the elasticity of the string; but in exploratory work, it merely gives us a way of varying the influence of the perturbation on the overall behavior of the model. The total Hamiltonian (H = Ho -f ocH{) consists of the unperturbed expression of (1.2), plus the additional terms which are cubic in the displacements and scaled by the
1.7. Modal Representation 19 parameter a, as follows: n+l r 3 = 1 H(q,p) = ^ I X (P2j + (Qj - Qj-i)2) + %j ~ Qj-if (1.4) 1.7. Modal Representation Because the solution to the unperturbed problem is the n-coupled harmonic motion, which is most easily understood in terms of the normal (or Fourier) modes, and because Fermi et al. were interested in measuring the dispersion of energy among modes, conversion to the modal representation made sense. The relationship between the displacements qj and the normal ak for the unperturbed Hamiltonian are given by the following expressions: ak = y^t^qis[n[iTl)' * = l»2,...,n, (1.5) ^ = ]f^l^akSin(iTl)' i = l,2,...,». (1-6) Each mode defines a set of displacements qj for all the mass points along the string, thereby constructing a unique shape for the overall string, one that is linearly independent of all the other modes. With the normal modes as a basis, any arbitrary shape of the discrete string can be represented as a sum of modes. We may think of this shift from displacements qj to normal modes a^ as a change of variables from one set of independent quantities to another—from the local behavior of points along the string to the global behavior or shape of the entire string. Figure 1.2 shows the displacements of the 32 mass points along the string as determined by the first four modes. Up and down displacement in this figure represent left and right displacements along the string. When we make the conversion from local displacements to global modes using the transformation equations (1.5) and (1.6), the new unperturbed Hamiltonian becomes: #o(a, a) = - Y^("l + "14), where uk = 2sin I ^^j • (1-7) The Uk are the frequencies of the normal modes. As the mode index k increases from 1 to n, the argument of the sine function increases from nearly zero to nearly 7r/2, the frequencies increase monotonically and approach 2, so higher modes correspond to higher frequencies of string vibrations. Notice that this expression shows the isolation between the modes. Each successive term in the energy sum represents the energy of each independent
20 1. The FPU Model and Simulation: "A Little Discovery" 0.3 -0.3 Model -♦- Mode 2 -+-- Mode 3 a- Mode 4 ••*••• 10 15 20 25 30 FIGURE 1.2. The first four normal modes of the string. Each mode defines the displacement qj of the n = 32 mass points. They take on the familiar standing-wave patterns. Because the string has fixed endpoints, each mode ak has fc-f 1 nodes (points where the string intersects the horizontal axis). mode. This kind of separation is not possible in the displacement representation, in which each term of the sum depends on preceding and successive terms as well. When we add the effect of the perturbation, we also get additional terms in the Hamiltonian. The full, perturbed Hamiltonian in modal form is as follows (Ford and Waters, 1963): .. it a, H(a, a) = - J^(al 4- Jtal) + a Yl Ckimakaiam. fc=l k,lfm=l (1.8) Here we see more clearly how the perturbation links together the modes. The 3n additional terms allow energy to move from one mode to another that is nearby in energy. The parameter a is kept small in FPU, such that the quadratic perturbation force term is no more than one-tenth the size of the linear term at maximum values of qj. The constants Ckim axe quite complicated combinations of the transformation coefficients; we are not concerned with them here. Whereas the modal transformation equations do appear in the original FPU paper, this Hamiltonian does not. To obtain the FPU simulation, we integrate numerically the (displacement) equations of motion (1.3), and every so often, calculate the modal energy distribution from (1.7).
1.8. Model Considerations 21 For the linear problem, the modal description is isomorphic to a description in terms of the integrals of the motion; the energy is divided up in a particular way initially, and it will be confined in that distribution by the integrals of the motion. Instead of speaking about trajectories locked into particular regions of phase space, as is so common in dynamics today, Fermi et al. speak of the evolution of the modal energy. This description highlights the global configuration of the string, while obscuring both the local behavior of the displacements, and the evolution of the system trajectory in phase space—a representation that will later dominate the field. Without the perturbation, the string would maintain a standing wave pattern, as in Figure 1.2. The motion would be periodic: every time a multiple of T — 27r/ui occurred, the string would reproduce the initial configuration, provided the initial configuration was given by a\. In the perturbed problem, Fermi et al. expected the motion to drift away from periodicity toward ergodicity. By calculating the modal energy at any time in the evolution, Fermi et al. could speak of the evolution of the system completely in terms of the energy distribution among normal modes. 1.8. Model Considerations With their one-dimensional, discrete-string model, Fermi et al. had several parameters with which to modulate their simulation. When choosing n, the experimenters had to mediate between two criteria: whereas a larger number of points would approximate the continuous string better, a smaller number would require less calculation time for each iteration of the model. While this simulation was going to be a demonstration of the computer's usefulness in computational modeling, the speed of even the best machine in 1953 was quite slow. Each machine-language instruction for the computation had to be correctly encoded onto a single punch card. The program consisted of boxes of cards that were fed into the machine, over and over. For each iteration of the calculation, the cards were fed in again. Attempting to simulate a continuum would, of course, require a large number of particles, and we would assume that the larger the number, the better. Even if we convince ourselves that there is no need to simulate a continuum, we still would like a large number of particles to enhance the approach to ergodic equilibrium or to obtain a better simulation of a statistical mechanical system. In 1953, Fermi et al. were comfortable using only 64, 32, or 16 particles on the very earliest computing machines. With computing time as an ever-present constraint, Fermi et al. had to strike a balance between the number of particles—thus the number of equations— and the duration of the time step. Because of the slow divergence from linear behavior, the size of the time step can be on the scale of the recurrence time of the associated linear
22 1. The FPU Model and Simulation: "A Little Discovery" problem to reveal the long-term average behavior of nearly linear systems, or it can be made smaller to try to resolve rapidly diverging orbits in strongly nonlinear systems. In addition to these considerations, care had to be taken in choosing the time-step size because, in this instance, the nonlinear system was being analyzed into the modes of the corresponding linear problem. These modes have characteristic periods that force an upper limit on the step size of the simulation for resolution and interpretation of the evolution. Using the normal modes as an interpretive tool, Fermi et al. had to run each configuration of their system through enough cycles to allow for the tendency toward thermalization—that is, they had to run through enough cycles to simulate several hundred periods of the initial configuration. They chose to break up the period of the associated linear model into a large number of time cycles, up to five hundred in some cases. Thus, in various configurations, they ran the simulation anywhere from 14,000 cycles to 80,000 cycles. As a consequence, they chose n to be at most 64, and in some cases, they also considered n = 16 and n — 32. The fact that only powers of two were considered in FPU seems to have been only a coincidental consequence of using binary arithmetic; but this fact figures prominently in the later analysis by Ford (1961) as a significant factor in his explanation of the results of FPU. Given that we expect eventual thermalization, the choice of the initial string configuration might not seem to make very much difference, because all initial conditions should tend toward equipartition of the energy. For ease of following the evolution of the system in their simulation, Fermi et al. chose very simple combinations of the lowest normal modes, usually a single half sine wave (mode one) and releasing the string from rest (pj = dfc =0). Fixing the initial energy (H = E — constant) determines the value of a\ from (1.7), with o^ = 0 whenever k ^ 1. With this value of ai, the initial values for the displacements qj may be obtained from (1.6). But once again, a decision about initial conditions—namely the choice of initially exciting only the lowest modes—came into question in a later work (Zabusky and Deem, 1967), where it is shown that the initial excitation of optical (high- frequency) modes results in better energy-sharing among modes, although not in thermalization. As mentioned, Fermi et al. chose values for the nonlinear parameter a to keep the quadratic force terms (1.3) smaller than one-tenth the size of the linear terms at maximum displacement. Because the corresponding energy terms (1.8) were cubic in the displacements, they always represented less than a just a few percent of the total energy, so they were ignored when Fermi et al. calculated the modal energy distributions.
1.9. Results 23 1.9. Results At the start of each simulation, the string did appear to be tending toward thermalization, but as the simulation progressed, some odd things began to happen. Most of the energy would strangely accumulate in one or another mode, only to disperse again. Most significantly however, was the fact that the energy apparently remained bound within the first five, lowest-energy modes, those modes nearest in energy to the initial condition. There the energy cycled about within those few modes in an almost periodic way. Figure 1.3, taken from the original FPU paper, shows the results of their first simulation. Using the quadratic perturbation, with all the energy initially in mode one at time to = 0, the string was released from rest and allowed to evolve numerically. The simulation ran for 30,000 cycles, which corresponds to about 160 fundamental periods {T\ = 2tt/uji) of the first mode. In the following passage, the authors describe the result: Instead of a gradual, continuous flow of energy from the first mode to the higher modes, all of the problems show an entirely different behavior. Starting in one problem with a quadratic force and a pure sine wave as the initial position of the string, we indeed observe initially a gradual increase of energy in the higher modes as predicted. Mode 2 starts increasing first, followed by mode 3, and so on. Later on, however, this gradual sharing of energy among successive modes ceases. Instead, it is one or the other mode that predominates. For example, mode 2 decides, as it were, to increase rather rapidly at the cost of all other modes and becomes predominant. At one time, it has more energy than all the others put together! Then mode 3 undertakes this role [from 14,000 cycles to 19,000 cycles]. It is only the first few modes which exchange energy among themselves and they do this in a rather regular fashion. Finally, at a later time mode 1 comes back to within one percent of its initial value so that the system seems to be almost periodic. (Fermi et al., 1955, p. 981) Two very important problems emerge from these results. They show a definite isolating behavior and the modal energy seems to evolve cyclically. Figure 1.4 shows the time-averaged modal-energy distribution. Initially the energy is in the improbable initial configuration, and instead of any tendency toward equipartition of energy in all the modes—a plot of which would show all the modes averaging toward the same low level of energy (E/n)—we see a consistent pattern of energy partitioned unequally into the first four modes. Each successively higher mode has less of the total energy, and, as we saw in Figure 1.3, virtually no energy at all extends beyond mode five.
24 1. The FPU Model and Simulation: "A Little Discovery' 100 0 1 \ \ 2/ Id \ \ \ \ \ 5 3/ V 4 3r 5y / / [ ,2 A \ \ h fU\ (5 ■A I / / / / \2 t / / / 1- \ 20 30 f IN THOUSANDS OF CYCLES FIGURE 1.3. Modal energy plot for the FPU string model with a quadratic perturbation term: n = 32, a = \. The initial configuration was a single half sine wave. About 30,000 computation cycles were calculated. QO ou 20 m 0 T \ ^t 1 \ \ \ \ ' 1 \t L i V \ / t K \ \ =5^ <•_ 4 \ > ^ -— < -»-_ 2 40 80 t IN THOUSANDS OF CYCLES FIGURE 1.4. Modal energy distribution averaged over time. The energy settles down into a clear and consistent pattern, which is not consistent with expectations of thermalization.
1.9. Results 25 In order to understand the implications of these results, imagine taking two containers, one filled with compressed gas and the other empty. If you connect these two together, the second law of thermodynamics—which is based on the very same statistical principles of ergodicity and large numbers of interacting particles—predicts correctly that the gas will spread itself out over the entire volume until a single pressure obtains throughout. Now imagine how surprising it would be if, instead of this predicted outcome, the gas spread itself out along only the bottom third of the two containers and remained that way. In a physical experiment, we would expect to find a hidden variable at work, such as the force of gravity, or the presence of an additional, different gas in the second container. But this was a simulation, where "such experiments on computing machines would have at least the virtue of having the postulates clearly stated. This is not always the case in an actual physical object or model where all the assumptions are not perhaps explicitly recognized" (S. Ulam, in his Introduction to the FPU paper, Segre, 1965, p. 977). Where they expected to see the chaos of randomness and the disorder of ergodicity, they found order, or, as Fermi told Ulam, they made "a little discovery" (p. 977). Anyone dealing with the FPU results must address the following questions: What is the source of the isolating behavior? Is the nonlinear Hamil- tonian actually integrable? But if it is, then why is there any energy sharing at all? Could there be integrals of the motion beyond the total energy, yet not a full set? What else can we learn from the FPU simulation? Boundedness is only the first and most notable feature of FPU. Looking at Figure 1.4, we also notice a tendency toward energy stratification. Different modes have different energies that seem to depend on their mode numbers in decreasing order. Remember that this is a time-average plot; there can be variations in the energy distribution all along, so long as the long-term behavior approaches a constant. Even white noise averages to a constant value. This behavior in the time average is consistent with periodicity. If the system energy visits each mode periodically, then the time-average value of modal energy would be approximately constant. In order to see this periodicity, let us return to Figure 1.3 on page 24. As Fermi et al. point out, on the right-hand side of the graph, near 30,000 cycles, mode one returns to prominence, obtaining 99% of its initial value. Between these peaks, there is a pattern: mode two appears in the middle between the mode-one peaks, with mode-three peaks on either side of the mode-two peak. The motion seems to exhibit something like periodicity, returning every so often to one mode or another. But the return to any mode is not complete. The apparent cycling between modes can also be seen in Figure 1.5. Each mode represents a particular shape of the string that can be represented by a plot of the displacements of the mass points along the string (see Figure 1.2). At both 1000 cycles and 28,300 cycles, mode one is dominant and we see the shape of the string as a single hump. As the string evolves, the shape becomes some combination of the first five modes,
26 1. The FPU Model and Simulation: "A Little Discovery' and every so often, one or another mode dominates the displacement, and thus the energy. As of 1953, there was no published theory to explain this distorted periodicity. The linear model is strictly periodic and so strictly disallows energy- sharing. The additional nonlinear term causes a distortion of the periodicity of the linear model and allows for some degree of energy sharing, but apparently it is not a complete breakdown of the integral-like structure because there still seems to be something resembling periodicity. Fermi et al. conjectured that this "almost periodic" behavior amounted to something like "quasi-states" in an "almost linear" problem; such behavior later became known in the literature as "quasi-periodicity"(Fermi et al., 1955, p. 987). This new behavior constitues the first discovery for dynamical systems theory—the first new feature characteristic of the region between classical integrable dynamical systems and ergodic statistical mechanical systems. The challenge to the world of dynamics was now to develop some new theory to understand this new behavior, which was found in this first excursion into experimental mathematics through computer simulation. 0 2 4 6 8 10 12 14 16 POSITION OF THE MASS POINT FIGURE 1.5. The actual displacement of the string, showing the shape of the string at various times (in cycles).
1.10. Discussion Post Hoc 27 1.10. Discussion Post Hoc Posthumous Results Enrico Fermi died of cancer on November 28, 1954; he was 53 years old. The following year, atomic element 100 was synthesized for the first time and named Fermium in his honor. Although he did see and discuss most of the results of FPU with his coauthors in that year, he did not see all of the results, nor did he see the final paper that was drafted as Document LA-1940, and filed internally in the Los Alamos Reports in May 1955. The ramifications on the general publication of FPU were, as Joseph Ford writes: Pasta and Ulam found themselves trapped: they clearly could not publish without Fermi's name on the paper, but equally they could not publish with Fermi's name on the paper, since he had neither read nor approved it. This dilemma was never resolved, and, as a consequence, the FPU results were never published. However, the manuscript did finally reach the open literature as part of Fermi's collected works, which appeared some ten years after distribution of the original FPU preprint. (1992, p. 275) Copies of the preprint (LA-1940) circulated among the mathematical physics community, and the results also spread by word-of-mouth dissemination. Work on the FPU problem was slow to start as a consequence, and it was some 6 years before the next step along the direct line of FPU was taken in 1961. Postponed Publication In his Introduction to the 1965 printing of the FPU paper, Ulam mentioned the existence of further evidence for the isolating behavior which they had observed originally in 1953, but no such evidence existed in the literature in 1965. But back in 1961, the FPU simulation was continued at Los Alamos by Jim Tuck and Mary Menzel. However, in a strange repetition of the delayed FPU publication, Tuck and Menzel's results were not published until 1972—11 years after the work was performed. Once again, like FPU, the work had been performed, yet it was unavailable to the community except in an unpublished form. For modern science, these are very long time scales for the turnaround of results. But as computer simulation was indeed a new branch of science, still in its infancy, it may not yet have been clear what constituted a publishable result.
28 1. The FPU Model and Simulation: "A Little Discovery" FPU Simulation J 1954 I | f I Tuck-Menzel I Simulation 1961 FPU Published 1965 I Tuck-Menzel Published 1972 In more than one sense, these calculations were a direct continuation of FPU, because one of the authors, Mary Menzel, nee Mary Tsingou, did the actual computer coding of the original FPU simulation. These continued calculations were performed in response to requests for a published reference documenting the super-period of FPU. Because of the issues raised in the original FPU simulation, we wondered whether that apparent quasi-periodicity would continue, or perhaps fall off after a significantly longer integration time. Maybe on each iteration of the cycle, more and more energy would become dispersed. Given these questions about the FPU results, it would seem as if extending the calculations of FPU to longer times would be a promising endeavor. The results of Tuck and Menzel are summarized in Figure 1.6. The conjectured "quasi-states" (Segre, 1965, p. 987) were indeed confirmed in that later study, in which it was found that the energy moved around the lowest modes in an almost periodic fashion and that certain cycles and "super-cycles" were observed: In 1961, on more modern and faster machines, the original problem was considered for still longer periods of time. It was found by J. Tuck and M. Menzel that after one continues the calculations from the first "return" of the system to its original condition the return is not complete. The total energy is concentrated again essentially in the first Fourier mode, but the remaining one or two percent of the total energy is in higher modes. If one continues the calculation, at the end of the next great cycle the error (deviation from the original initial condition) is greater and amounts to perhaps three percent. Continuing again one finds the deviation increasing—after eight great cycles the deviation amounts to some eight percent; but from that time on an opposite development takes place! After eight more, i.e., sixteen, great cycles altogether, the system gets very close—better
1.10. Discussion Post Hoc 29 OSCILLATIONS AT FUNDAMENTAL FREQUENCY FIGURE 1.6. The calculations by Tuck and Menzel extend the FPU results to longer integration times, and so show the development of the larger super-period. than within one percent to the original state! This super-cycle constitutes another surprising property of our nonlinear system. (Segre, 1965, p. 978, Ulam's Introduction) This quasi-periodic behavior brings up the question of round-off error in the simulation. If perhaps some of the energy were lost due to the repetitive calculations, then wouldn't this cause the incomplete return to the initial state and thus disguise a true periodicity, which might result if the model were actually integrable? Fortunately, however, we may dispense with this problem in this case because Fermi et al. knew of the possibility for such errors. The conservation of energy provided a convenient check on the accuracy of the computation in that the sum of the energy in the normal modes at any time should always equal the initial value. Fermi et al. recalculated the total energy periodically to ensure that no significant losses interfered with their results. Postulated Solutions In the following chapters, we will see several solutions to the FPU problem that were suggested in the years following its nonpublication. But two main results eventually dominated the field: One group of researchers came to believe that FPU was a clear case of Kolmogorov-Arnold-Moser (KAM) stability, while another saw FPU as an example of Korteweg-deVries (KdV)
30 1. The FPU Model and Simulation: "A Little Discovery" solitons. Both of these phenomena are significant developments from dynamical systems theory that emerged in the years following FPU, and they are both intimately related to the FPU problem. In the following passage, published well after the fact in 1989, David Campbell suggests that the KAM theorem works as an explanation, indicating that the problem has yet to be resolved conclusively, but there does seem to be a consensus within the discipline: At a conceptual level, then, the KAM theorem explains the non- chaotic behavior and recurrences that so puzzled Fermi, Pasta, and Ulam. Although the FPU chain had many (64) nonlinearly coupled degrees of freedom, it was close enough (for the parameter ranges studied) to an integrable system that the invariant KAM tori and resulting pseudo-integrable properties dominated the behavior over the times of measurement. (Cooper, 1989, p. 244) This solution, which is the isolation of phase space by invariant surfaces in weakly nonlinear Hamiltonian systems, will be investigated in Chapter 3. The second significant and certainly not unrelated solution to the FPU conundrum is the possibility of a series of solitons—nonlinear, nondisper- sive waves—which would also isolate the energy to low modes and would account for the quasi-states: The FPU problem is closely related to the soliton problem. Since the nonlinear mechanical lattice investigated by Toda admits periodic solution, a single lattice soliton can exist on the system with periodic boundary conditions. This is a clear example in which the nonlinearity is not causing the lattice to become thermalized into a state of energy equipartition. (Scott, Chu and McLaughlin, 1973, p. 1469) Clearly the FPU problem provokes a lively discussion. Each group wants to claim it for its own. Here the term "related to" disguises the fact that solitons can exist only in continuous systems, yet the FPU model is obviously a discrete one. But it points to an important tie between different models and different perspectives. Some aspects of these proposed solutions may be more or less attractive, elegant, or applicable to the problem than others; but none of them to date offers a complete explanation for the FPU problem in the rigid sense of proof. The KAM theorem may offer us the best possible solution; but no one has done the dirty work of proving it. Each solution in its own right uncovers a rich vein of new dynamics research, and it might also be said that each of these solutions describes essentially the same phenomenon from a slightly different perspective.
2 The FPU Research Program: Echoes on a String And certainly the atoms did not move by volition, nor did they place themselves by sharp intelligence, nor did they agree what movements to produce, but they, being many and moving about in many ways, are constantly being buffeted and given motion, and by trying every kind of combination and motion, finally they fall into the arrangements and the patterns of which the sum of things consists. Lucretius (Book I, lines 1021-1028) The results of the computer calculations of [FPU] thus appear to be in sharp contrast, if not in actual contradiction, to the widely held notions concerning the approach to equilibrium. ... Consequently, aside from any element of contradiction, a number of physicists have been puzzled by the failure of [the FPU Hamiltonian] to lead to equilibrium behavior. Joseph Ford (1961, p. 387) 2.1. The Threads of a Research Program Even though the FPU results were not published until the two volumes of Fermi's Collected Works came out 11 years after the fact (Segre, 1965), word of the simulation did spread to the physics community by way of private communication. At that time, physics research preceded along the two accepted lines: theory and experiment. Computer simulation was a new kind of research direction that did not yet have a plot of its own in the field of scientific research. Being a set of numerical calculations, the FPU problem seemed to fall into the category of theory. But a theoretical, pencil-and-napkin physicist was not likely to be exploring the realm of "experimental mathematics," such as that opened-up by FPU. Although
32 2. The FPU Research Program: Echoes on a String no physical experiment had been performed, the problem did have many of the effects usually associated with an experiment: theorists were faced with the implicit demand to retrodict the anomalous results of FPU, just as any scientist is challenged to explain the unexpected result of a valid physical experiment. Furthermore, the FPU model was chosen exactly because it was thought to be unsolvable using analytical means. Theory in dynamics had been stalled for nearly 50 years, but here was a new and surprising result that gave theorists renewed motivation. Further theoretical work must come from that most difficult arena known as perturbation theory. I use the term "research program" to mean something akin to a "dynamical system," in that these lines of research are linked together by a common interest and are mutually influenced by one another's results. However, I do not want to convey in the term "program" the sense of "programming." This research preceded stochastically and not according to some preconceived plan. The research following from the FPU simulation may be organized along several lines, all within the discipline of mathematical physics. Already I have discussed Tuck and Menzel's 1961 continuation of the FPU calculations in the previous chapter (see page 27). Their discovery of the FPU super-period served to further validate the FPU results and intensify the need for an explanation. 2.2. The Nonlinear Discrete Lattice FPU Simulation 1954 X Jackson 1963 a, b Ford 1961 1 C 1 Ford-Waters 1963,1966 Around 1965, word of the KAM theorem began to spread across the land, effecting significant changes in this program of research. But before that tide arrived, there were two distinguishable lines of research. Both involved integrable approximations to the FPU model in an attempt to retrodict the FPU recurrence; but they divided on the choice of model topology: Joseph Ford, E. Atlee Jackson, and John Waters applied perturbation theory to
2.3. Ford, 1961 33 the nonlinear discrete lattice, while Norman J. Zabusky and M.D. Kruskal sought an integrable solution to a continuous nonlinear wave equation. Both groups worked through some very difficult theoretical calculations with varying results. Most of this work found its home in the Journal of Mathematical Physics. 2.3. Ford, 1961 In the first paper to address the FPU problem, "Equipartition of Energy for Nonlinear Systems," Ford applied physical principles to obtain an interpretation that followed from physical intuition: A system of harmonic oscillators weakly coupled by nonlinear forces will not achieve equipartition of energy as long as the uncoupled frequencies Uk are linearly independent on the integers, i.e., as long as there is no collection of integers {n^} for which EriktOk = 0 other than all rik = 0. ... Physically, the linear independence of the uncoupled frequencies means that none of the interacting oscillators drives another at its resonant frequency, and this lack of internal resonance precludes appreciable energy-sharing in the limit as the coupling tends to zero. (1961, p. 387) Ford recognized that the unperturbed frequencies of the FPU model, defined by the equations a* = 2sin(|j0, (2.1) must always be linearly dependent unless N equals a prime or a power of 2, in which case they are linearly independent (Hemmer, 1959). In FPU, only N in powers of 2 were considered. Therefore, in all cases considered by FPU, the unperturbed frequencies were linearly independent. What is the significance of this fact? There are two ways of seeing its direct implications. In terms of Hamiltonian dynamics (see the Appendix), in which we talk of orbits on tori, the linear independence of the unperturbed frequencies translates to the exclusion of periodic tori. In other words, because FPU chose to consider N only in powers of 2 (undoubtedly because of the connection between binary arithmetic and computers), all the trajectories they considered, unperturbed, must necessarily lie on conditionally periodic tori; no FPU initial condition could have resulted in a periodic, unperturbed torus. Of course, for the unperturbed system, conditionally periodic tori and periodic tori both correspond to periodic harmonic oscillations in the positions and the momenta, because a full set of integrals of the motion
34 2. The FPU Research Program: Echoes on a String bind the trajectories. But once the perturbation is turned on, the topological difference between these types of tori becomes very significant, as we know for the KAM theorem (see the discussion in the next chapter). But Ford was not working with tori and he was not yet aware of Kol- mogorov's conjecture, so he talked in terms of oscillators and resonance. He turned to the physical model of coupled pendulums, from which he derived his intuition that normal modes could share energy only if they drive each other at resonance—that is, if the condition J2nk"k~0 (2.2) is met by the set of initial frequencies. Clearly, the equality here can be achieved only with frequencies that are linearly dependent, which, as we know now and Ford knew then, could not be satisfied by the FPU frequencies. As a further heuristic argument to support this intuition, Ford made use of the resonant denominators, which, as we will see in the next chapter, play a central role in the development of the KAM theorem: [The] solution clearly indicates that as a —► 0, there will be no energy sharing unless some resonance denominator is zero, i.e., unless the uncoupled frequencies are linearly dependent. Moreover, the form of the resonance denominators [i.e., t^z+i"~ {u)r — ur+2i+i)2] almost demands an interpretation in terms of internal resonance. (1961, p. 393) The terms involving this denominator are all multiplied by the perturbation parameter a, such that when the denominators vanish due to their linear dependence, both numerator and denominator would tend toward zero in the small perturbation limit, resulting in the possibility of terms that do not vanish and so couple the modes together. However, because they are not linearly dependent in this way, the FPU modes could never drive each other at full resonance, and Ford concluded that appreciable energy-sharing would be unlikely and should not have been expected by FPU. But strict equality is not required to satisfy the approximation in (2.2), so the question becomes more subtle and more difficult to answer: Can the FPU unperturbed frequencies be nearly resonant? How close did the approximation need to be to obtain appreciable energy-sharing? Ford recognized that the size of the nonlinear coupling term in part determined how large this approximation interval would be, and so it must be related to the amount of energy-sharing: Thus, as the [size of the perturbation] is increased from zero, one would expect appreciable energy-sharing between higher and higher modes. FPU happened to choose a value such that appreciable energy sharing occurred among the first few modes. (1961, p. 388)
2.3. Ford, 1961 35 Thus it would seem, once again, that because FPU coincidentally chose a particular size of perturbation parameter, the energy was locked into the lowest few modes. If the size of the perturbation were increased, Ford believed that the energy would be shared among higher and higher modes, eventually reaching equipartition. One cannot help but notice that Ford has attributed the FPU results to two coincidental choices made by FPU in the conditions of their model. Ford did consider the approximation in the resonance condition: For large N and small fc, however, we see that for uj\ « (J2/2 « ^3/3 « ..., the Uk are "almost" linearly dependent and, strictly speaking, we should at least allow for the possibility of some internal resonance, i.e., amplitude modulation. (1961, p. 390) However, the perturbation techniques available to Ford at that time proved to be nearly intractable when one had to include terms for the amplitude modulation associated with near-resonance conditions. Without considering amplitude modulation, Ford performed several calculations to find out what to expect in the case of maximum possible internal energy-sharing— i.e., when all the aVs are equal. He found, via an averaging technique, that each mode almost always possesses \/N of the total energy, but his results did not preclude the possibility of any particular oscillator from having all of the energy (l/AT)th of the time. From this result, Ford concluded that, at best, equipartition of energy can be achieved on the average, but not ergodicity. He also tried to retrodict the recurrence times of FPU using perturbation techniques. His findings are included in Figure 2.1, about which he qualifies: "Quantitatively, then, the two solutions [his and FPU] are not comparable; qualitatively, however, they are similar" (1961, p. 392). Without using amplitude modulation terms, Ford had no way of showing definitively whether FPU should or should not have appreciable energy- sharing. Believing in the simulation results, Ford needed an explanation for what might allow some energy-sharing, but not beyond the first few modes. As was mentioned in the first chapter, the only known barriers in phase space were due to integrals of the motion. Ford knew of a result from Balescu to the effect that if the unperturbed frequencies are linearly dependent, then nonlinear equations of this type cannot possess any analytic constants of the motion other than the total energy (Balescu, 1956, p. 622). From this result, Ford formulated certain expectations: Consequently, when the uJk are linearly independent, as is true for the FPU chain, we must anticipate the possibility of finding analytic constants of the motion other than the total energy; and in particular, one must anticipate the possibility that each normal mode energy generates an analytic constant of the motion. (1961, p. 388)
36 2. The FPU Research Program: Echoes on a String A3 OOo it ooo FIGURE 2.1. Ford's modal energy plot of the FPU chain (N = 32, a = \) without considering amplitude instabilities. Expectations such as this border on the fallacious: Let statement A = "unperturbed frequencies are linearly dependent" and let B = "There are no constants of the motion," then Balescu's statement is A —> B, a simple inference. Ford posits "not A" (~ A) to get ~ B. But ~ A and (A —> B) together provide no logically conclusive statement about B. All we can say is that Balescu's result does not preclude constants of the motion for FPU. By anticipating constants of motion because they are not logically excluded, Ford demonstrates a strongly ingrained intuition that bounded trajectories probably result from constants of the motion. Although Ford did allow for some energy-sharing because of the approximation in the resonance condition, he relied on the physical intuition that only resonance in the unperturbed frequencies could lead to appreciable energy-sharing. FPU could not have resonance, therefore it seemed likely that there must be integrals of the motion corresponding to each of the modes, thus locking the energy into those modes. Furthermore, as the perturbation parameter is increased, the integrals might dissociate in an upward cascade leading to more and more energy-sharing. The intuition is good and the results fair; but little could be said conclusively based on Ford's analysis. Both Zabusky (1963) and Jackson (1963b) offered some objections to Ford's work. As discussed in the next section, Jackson's analysis divides the behavior of nonlinear oscillators into two categories whose properties depend on the size of the perturbation parameter. He agrees with Ford's
2.4. Jackson, 1963 37 observations concerning nonresonance due to the linear independence of the unperturbed frequencies; but he goes on to point out that these results are actually irrelevant to the FPU case, because they apply only to weak nonlinear coupling, whereas the FPU simulations always fell into his category of strong coupling (and large iV), where the linear independence of the unperturbed frequencies play no role in the energy-sharing properties. Zabusky called Ford's analysis incomplete because he worked without considering the "amplitude instability" of FPU. Zabusky goes on to point out that Ford's results (see Figure 2.1) show minima and maxima (of modal energies) that are out of phase with the FPU results. Finally, in order to dispel the problem with FPU's choice of N in powers of two, Tuck and Menzel, also in 1961, performed a run of their simulation using N = 17. The results were consistent with all the others in both their work and in the original FPU simulation. Energy remained in the lowest few modes and there appeared to be quasi-periodicity. Although these results do not overturn Ford's work, in that 17 is still a prime number, they do eliminate the powers-of-two connection. 2.4. Jackson, 1963 Two years after Ford's initial article, E. Atlee Jackson published a two- article sequence (1963a, 1963b) following the same line of inquiry. In the first, Jackson developed new perturbation methods that were not ordered in powers of the perturbation parameters, as were traditional perturbation expansions. Using these methods he claimed to show, in his second paper, that the nonergodic behavior of this [FPU] system does not result simply from the incommensurability [linear independence] of the uncoupled frequencies {uJk}, but also from the particular form of mode interaction and the initial conditions used in all calculations, both of which affect the coupled frequency spectrum {nfc}. (1963b, p. 686) Jackson established an iterative procedure for determining the "shifted" frequencies Qk of the perturbed system, which are functions of the perturbation parameter A and also N. For the FPU case, these shifted frequencies are approximated by nk~uk+r^)A2ku;k) (2.3) where the constants Ak are related to the mode energy of mode k. Jackson showed that the particular form of perturbed frequencies play a significant role in the behavior of the perturbed system.
38 2. The FPU Research Program: Echoes on a String Using the cubic FPU Hamiltonian perturbation, H = H0 + XH1 = - ^(d| + (J2ka2k) + - ^2 Vjkiajakah (2.4) k jkl where LOk is still denned as in (2.1) and Vjki is symmetric in the indices, Jackson denned two coupling-strength regimes: strong coupling (limA —> oo) and weak coupling (limA —> 0). He claimed that for the conditions of the FPU experiment, weak coupling corresponded to (A «C 0.1), and since FPU used values of A larger than this (A = \ and A = 1), FPU must be considered as a case of strong coupling. Although Jackson discussed both what he called the "perfect chain" of FPU as well as other, imperfect chains, I will restrict my discussion to what is applicable to the FPU problem. According to Jackson, a perfect chain is one with all identical spring constants. This was indeed one of the simplifying assumptions of FPU; any real string or system of harmonic oscillators will have variations between each of the springs. For such perfect chains, Jackson found that "energy is only transmitted to the higher modes once the intermediary modes have become excited" (1963b, p. 687). This succession of modes does correspond to what we saw in the FPU results. All the initial energy was placed in the first mode and then the string was released. The energy initially flowed toward an ascending succession of mode numbers. Jackson and Ford both shared in this expectation. Jackson focused his analysis on the concept of the Poincare recurrence time—the time for some initial condition to "nearly" recur. He defined the recurrence time, 2tt N t\ = q, xy where Q(m, A) = ]Pmfcftfc(A). (2.5) Although this method seems useful to predict the return of initial conditions—certainly important for any effort to explain what happened in FPU—it is not a very informative mechanism for understanding what would cause such a return. The subscript, A, indicates Jackson's contention that the recurrence time depends on the size of the perturbation: If t\ increases when A increases, this would mean that the energy exchange would continue for a longer period of time when the coupling is stronger—a feature which is presumably necessary, even if not sufficient, for ergodic behavior. Actually what is found to occur for the particular initial condition [of FPU], is that r\ decreases as A increases. This behavior of t\ will be shown to be dependent on the initial condition, and therefore is not necessarily a general feature of such systems. (1963b, p. 687) Once again we find that the FPU results must have come from an exceptional case of initial conditions.
2.4. Jackson, 1963 39 (a) OSCILLATORS. II n r (b) 0 1 \ \ 2/ h \ \ \ \ 5 3/ y 4 3r 57] .2 \ \ \ 3A | v V \ \ V A / \l > K4 lY Y\ ^ ^ j / / / / in \ ^ / / / / f \ ^ -1- \ U 20 30 f IN THOUSANDS OF CYCLES FIGURE 2.2. Compare Jackson's recurrence calculations (a) with the appropriate FPU results (b). Notice the symmetry in Jackson's figure as compared to the obvious asymmetry in the FPU figure.
40 2. The FPU Research Program: Echoes on a String Jackson noted Ford's recognition that FPU used numbers of particles only in multiples of 2, and he calculated a theoretical recurrence time for the case of weak coupling, which while appropriate to Ford's analysis, it was not, according to Jackson, applicable to the actual FPU work. Using the FPU conditions (N = 32, A = 1/4, and A = 1), he found results that were very poor estimates of the recurrence in comparison with FPU. In particular, the recurrence times n and ti, differed from the FPU calcu- 4 lations by 150% to 400%. From this poor correlation, Jackson concluded that, whereas Ford's analysis applied to the weak coupling case, the FPU results indicated the need to consider strong coupling. In his summary of the results for the case of strong coupling, Jackson found that he could closely approximate the recurrence times of the relevant FPU case (see Figure 2.2): The present results show that the frequencies u(m) [= ]T rrikWk] are not relevant in determining the periodic behavior of coupled oscillators. Thus, the nonergodic behavior of this system does not arise from any singular property of uj(m) but from at least two other properties. The first is the fact that the relevant frequencies fi(ra, A) [the denominator in (2.5), the perturbed frequencies] tend to increase as A is increased. Thus, while the amount of energy exchange is increased as A is increased, the duration of this exchange decreases. This feature would not preclude an ergodic behavior, except for the second property of this system. This property, which is inherent in all perfect chains, is that the energy is transmitted to higher modes only via the intermediary modes. (1963b, p. 695) Jackson concluded that perfect chains tend to have their lowest modes preferentially excited, which suggests that if the higher modes had been excited initially, then more widespread energy distribution might have resulted. In Chapter 4, we will see this claim made again by Boris Chirikov, but from quite a different analysis, and we will see it tested by Zabusky and Deem (1967). Several important points arise from Jackson's analysis. First, by focusing on recurrence time, Jackson provided no insights into whether we might expect ergodicity in any general sense. He seems to indicate that ergodicity depends distinctly on the initial conditions; although he did mention that the strength of the coupling plays an important role. His conclusions about weak coupling show his confidence that ergodicity is unlikely to occur, but he is much less forthcoming about what to expect in the case of stronger coupling. Whereas Jackson did allow for the possibility of ergodicity in this system, Ford, on the other hand, seems to have moved in the opposite direction, i.e., eliminating the possibility of ergodicity at all.
2.5. Ford and Waters, 1963 41 2.5. Ford and Waters, 1963 Two months after Jackson submitted his second paper to the journal, Ford submitted a second paper on the subject; this time coauthored with his student John Waters. In this paper, "Computer Studies of Energy Sharing and Ergodicity for Nonlinear Oscillator Systems," Ford and Waters concentrated their efforts on weakly coupled systems with the now familiar cubic Hamiltonian (compare with (2.4)), 1 n n #==2^(^+U;^)+a ^ AJkiQjQkQi- (2.6) k=l j,k,l=l By "weakly coupled," the authors specifically meant that the coupling terms seldom obtained more than 10% of the total energy. This condition they guaranteed by setting all positions (g's) and momenta (p's) initially to zero, except for the initial condition pi = \/3> and they kept a < 0.1. This size of the perturbation term places their work well within the KAM stability zone. Furthermore, Ford and Waters refined the definition of the resonance condition: Y^nkujk<a. (2.7) They confined their work in this paper to this resonance region of the weakly coupled nonlinear system. With intuition and without thorough knowledge of KAM, Ford and Waters isolated the region of phase space most significantly affected by the KAM theorem. Although Ford and Waters did cite Kolmogorov's 1957 presentation to the International Congress of Mathematicians: "Kolmogorov claims to have shown that nonlinear systems, more general than the systems considered here, are not ergodic" (Ford and Waters, 1963, p. 1294). It is clear that they did not yet recognize the limitation of Kolmogorov's theorem to weakly coupled systems, such as those they were concerned with here. Furthermore, no mention was made of the recent proofs of the theorem by Moser (1962) and Arnold (1963a). Ford and Waters cited what is essentially the KAM theorem to claim that these weakly coupled systems are nonergodic in general; but the KAM theorem is limited to precritical conditions and is inappropriate for such a general statement. Ford and Waters focused on the important distinction between equipar- tition of energy among modes and ergodicity. The former is a property of the energy distribution and the latter involves the average behavior of the system trajectory over time and state space. They contended that weakly coupled, nonlinear oscillator systems could come to equipartition of energy, if the initial condition places the system near one of the resonances defined by (2.7). Because of their belief that this system could never be ergodic, they did not consider the possibility that this region might be a (KAM)
42 2. The FPU Research Program: Echoes on a String boundary zone; instead, they thoiight of it in terms of the nearness to resonance, so that related modes can share energy yet still retain their modal identity. Using computer calculations, they demonstrated that indeed as they moved the initial conditions closer and closer to one of the primary resonances (set of frequencies which satisfies the equality contained in (2.2) exactly), the energy sharing between the neighboring modes increased significantly. Thus, to their satisfaction, they inferred that equipartition of energy could occur in this system without ergodicity. This was an important result, because it separated equipartition from the stronger property of ergodicity, an especially important result for anyone who would like to believe that nonlinear discrete systems could still have analytic solutions. This paper was a significant step forward from Ford's previous effort, in that they now accepted the possibility of energy-sharing in these systems. The shift in emphasis from the original, strict, resonance condition (2.2) to the new condition (2.7), that is dependent on the size of the perturbation, acknowledges the influence of the perturbation on the energy-sharing properties. Some region of phase space about the resonances, with width a, allowed for energy-sharing, which is remarkably similar to the KAM result. How can the normal modes share energy and yet retain their identities once the perturbation is turned on? Ford and Waters turned their attention to the possibility that a constant of the motion might persist into the nonlinear regime. Beyond the energy integral, there would need to be an additional integral of the motion to account for the continuation of the modal boundaries. They tried numerically recalculating the value of the hypothesized second constant of the motion (Whittaker's adelphic integral) on each iteration of the numerical model. For a nonlinear, five oscillator system, weakly coupled, they estimated the value of the theoretical constant of the motion (see Figure 2.3). We see that the integral of the motion is not exactly constant, but it seems to be nearly constant in time. Ford and Waters concluded that this figure demonstrated the existence of this constant. They further conjectured that probably there exists a full set of constants of the motion (making the system totally analytic and integrable) because this second constant persisted into the weakly nonlinear region and because the normal modes seemed to be stable there also. We xcan see from Ford and Water's statement of this constant ($): $ = J\ J% cos(02 — 20i)+ a<l>i H that it is a combination of the actions. But it is also a function of the angles, which would require them to maintain a very special relationship, because they clearly vary in time. But then, so do the actions, once the perturbation is turned on. Ford and Waters made no conclusion as to what to expect from this system in the case of stronger coupling, and by doing so, they avoided the issue of transition to ergodicity. This analysis is enlightening but static. It is static because it only considers the class of weakly coupled systems without then pushing their results to the limits of the class; but it is def-
2.6. The Continuous String 43 COUPLING ENERGIES (NONLINEAR SYSTEM) ♦It .TOTAL COUPLING ENERGY •RESONANT COUPLING ENERGV —i— 20 10 30 40 TIME 50 60 70 80 FIGURE 2.3. Ford and Waters conclude the existence of the second constant of the motion from the evidence of this plot. initely enlightening because they came close to modeling the behavior of these systems in the region of KAM stability. This explanation does not allow for any mechanism for the constants of motion to vanish at higher values of the perturbation parameter. With such a limited explanation, once the perturbation was turned on, if the constants of motion did indeed persist, then just changing its value would be an unlikely cause for them to disappear. The switch from linear system to nonlinear system is a major discontinuity in system type; it is at that threshold that qualitative change must occur. The KAM theorem allows for this change and it provides a logical transition in topology from integrable system to global deterministic chaos. 2.6. The Continuous String Unconvinced by the "conventional physics" approaches to the FPU problem by both Ford and Jackson, Norman J. Zabusky and M.D. Kruskal, then of Bell Laboratories and Princeton University, respectively, decided to tackle the FPU problem from an alternative perspective, that is, as a continuous string. Perhaps the mode recurrences could have resulted from the way in which FPU "discretized" time for their lattice model. Fermi et al. (1955) chose a loaded string instead of the continuous one because of the discrete requirements of the simulation:
44 2. The FPU Research Program: Echoes on a String Fermi-Pasta-Ulam Simulation 1954 Zabusky 1962 I i Zabusky 1963 J Zabusky-Kruskal § 1964-65 I For the purposes of numerical work this continuum is replaced by a finite number of points so that the partial differential equations defining the motion of this string is replaced by a finite number of total differential equations. ... The corresponding partial differential equation obtained by letting the number of particles become infinite is the usual wave equation plus nonlinear terms of a complicated nature. (Segre, 1965, p. 979) The mechanical picture of the macroscopic string began as a continuum; but for simulation, the continuum was replaced by a lattice of point masses separated by springs. Because the differential equations for a set of coupled harmonic oscillators becomes the wave equation in the continuous limit, we have a clear understanding of the affect of this approximation on the unperturbed Hamiltonian. By extension, we might expect some analogous smooth transition from the discrete perturbation to the continuous one; but, as we see from Zabusky's work, this is not the case. Whereas the continuous, nonlinear, perturbed Hamiltonian turns out to be integrable, the discrete, perturbed Hamiltonian of FPU is explained best using the properties of the KAM theorem, which applies to a discrete lattice. Whereas "of a complicated nature" seems to indicate that Fermi et al. knew there would be some effect, they also seemed to think this would be only a minor perturbation to the "usual wave equation." Zabusky was initially drawn to a study of the FPU string without the truncation of space, so he directed his own work toward the theory of continuous, partial differential equations and not simulation. He elected to solve the continuous analogue of the FPU system, analytically. But I believe he was very interested to see how the continuous solution would compare with the results of FPU, thus showing that he believed in the simulation, especially in light of the work by Tuck and Menzel (1972). At first Zabusky chose to treat what he called "the lowest continuum limit" version of the string, which means keeping only the lowest-order terms of the continuous expansion of the FPU Hamiltonian.
2.7. In the Continuous Limit 45 2.7. In the Continuous Limit In 1962, Zabusky published the paper "Exact Solution for the Vibrations of a Nonlinear Continuous Model String," in the Journal of Mathematical Physics. He attempted to explain the FPU results by resorting to the partial differential equation that arise from taking the FPU equations to the continuous limit, i.e., taking the number of lattice points to infinity while simultaneously shrinking the spacing between the sites to zero. The resulting partial differential equations are Vtt = {l + eyx)aVxx, (2.8) where e is the perturbation parameter, a is the degree of the nonlinearity, yu is the second partial derivative of displacement with respect to time, yxx is the second partial derivative of displacement with respect to position along the string, and yx is the first partial derivative of displacement with respect to position. Zabusky applied a so-called "Riemann integration technique" to obtain a solution to order e for this equation, which involved inverting the roles of the dependent and independent variables, solving the resulting linear equation with an integral and then reinverting to get the solution of y as a function of x and t. But Zabusky ran into a problem with this technique: "It is demonstrated that yx develops a discontinuity after an elapsed time of order (1/e)" (1962, p. 1028). The continuous solution breaks down for the FPU case even before the second mode achieves its first full maximum (see Figure 2.2), which is well before the recurrence of the first mode. Consequently, very few conclusions can be drawn from this analysis except to try to account for the breakdown. Zabusky pointed to further work which might be tried to penetrate the FPU recurrence: The computations of Fermi et al. indicate that the vibrations of a finite number N of coupled, nonlinear, equimass particles do not develop such a discontinuity. Thus, a continuous nonlinear system described by a partial differential equation of second order cannot describe the vibrations of the equivalent discrete system for "large" times. To account for the FPU results by a continuum representation, one is led to include terms which measure the discreteness or "graininess" of the medium. These terms appear quite naturally if we retain quantities of order (l/N2) that arise in the limiting process. These terms involve higher derivatives (for example, ytttt, Vxxxx, etc.) and should affect the vibrations most at those points on the string where breakdown "tends" to occur. These terms are analogous to the viscosity-like terms that are added to the lowest order hydro-dynamic equations to prevent a discontinuity from forming. (1962, p. 1039)
46 2. The FPU Research Program: Echoes on a String 2.8. Discreteness as Viscosity After the 1962, 1963 papers, Zabusky went back to his original equations and reconsidered the first-order nonlinear version of (2.8), with the addition of the fourth-order derivatives, which he believed would represent the discreteness of the FPU string: -§ = (1 + eyx)yxx + ( — ) yxxxx, (2.9) c2 where c2 = h2u^ is the wave speed in the linear limit. Zabusky then introduced the variable substitution -Ks[(i+*),-i]-')> <2l0) which is an invariant quantity for (2.8). Employing this substitution and transforming to a moving (Lagrangian) reference frame (x —► [x — c£]), Zabusky arrived at the Korteweg-de Vries (KdV) equation: uT 4- uux + 62uxxx = 0, (2.11) , ect c2 h2 where r = —— and o = ——. 2 12e This equation was first obtained by Korteweg and de Vries in 1895 to describe hydrodynamic wave propagation in shallow water. The third- order partial derivative in this equation, in terms of traditional wave equations, represents a dispersive medium. Analyzing this equation, Kruskal and Zabusky (1964) discovered the existence of nondispersive "solitary- wave pulses," which became known to the plasma physics community as "solitons." The unique and exciting property of the soliton is that several of them can coexist in one medium and propagate through one another without losing their identity. These solitons have the mathematical form ti = tie* + (tio - tXoo)Sech2 i}X~^\ , (2.12) where uq, Uqq, and xq are arbitrary constants and i '(Uo-Uoo)' A = 6 C = Woo + 12 (UQ-Uoo) 3 u = U(x — ct). In their attempt to include terms which would account for the discreteness of the medium in the continuous partial differential equations, Kruskal and Zabusky (1964) discovered a nonlinear wave with particle-like properties.
2.9. The First Soliton Paper 47 2.9. The First Soliton Paper In 1965, Zabusky and Kruskal reported their soliton discovery in Physical Review Letters, in a paper called "Interaction of 'Solitons' in a Collisionless Plasma and the Recurrence of Initial States." They described the KdV equation and the soliton solutions, including the important property of "focusing," in which these wave-pulses pass through one another but do not sum in the usual linear-wave superposition. Using this property of the solitons, the authors claim to "give a phenomenological description of the near recurrence to the initial state in numerical calculations for a discretized weakly nonlinear string made by Fermi, Pasta, and Ulam." Their explanation was: In conclusion, we should emphasize that at Tr all the solitons arrive almost in the same phase and almost reconstruct the initial state through the nonlinear interaction. This process proceeds onward, and at 2Tr one again has a "near recurrence" which is not as good as the first recurrence. Tuck, at the Los Alamos Scientific Laboratories, observed this phenomenon as well as eventual "superrecurrences" in calculations for a similar problem. We can understand these phenomena in terms of soliton interactions. For t > Tr the successive focusings get poorer due to solitons arriving more and more out of phase with each other and then eventually gets better again when their phase relationship changes. Furthermore, because the solitons are remarkably stable entities, preserving their identity through numerous interactions, one would expect this system to exhibit thermalization only after extremely long times, if ever. (Zabusky and Kruskal, 1965, p. 242) This explanation is very appealing because it accounts for the super-period and it provides a causal mechanism for the recurrence, both of which were lacking in Jackson's explanation. If we have several solitons moving along the string, independently of one another (except that they accelerate as they pass through each other) and we understand these waves carry energy, then we can imagine that as their relative positions change, the energy passes from mode to mode. As with any explanation, there are drawbacks. In all of their work, Zabusky and Kruskal chose to use periodic boundary conditions, instead of the fixed endpoints of the FPU problem. With periodic boundary conditions, it is clear that waves will return periodically along the string, but with fixed endpoints, the waves would have to reflect from the ends and return, which is not the same process, and we have no idea how problematic this would be for these new, nonlinear waves. No justification was made in the 1965 paper; but later, one of the authors (Zabusky, 1967) did at least address the switch in terms of the pre-KdV continuous equation (2.8):
48 2. The FPU Research Program: Echoes on a String We showed that the lowest FPU modal energies agreed up until breakdown with those obtained from a Fourier decomposition of the waveform y(x, t). The analysis proved, trivially, that we would also get breakdown if we took one Riemann invariant finite and the other zero and replaced the fixed boundary conditions by periodic or cyclic boundary conditions. The analysis also showed that, although the phenomena are nonlinear, signals propagating along the different characteristics did not interact with each other to lowest order. This suggested that the recurrence phenomena would be preserved for these analytically easier-to-treat initial-boundary conditions. A computer simulation for these conditions gave recurrence and we were on our way to a better analytic understanding. (1967, p. 231) In order to recreate the recurrence of FPU, Zabusky chose to go to the periodic boundary conditions because they were easier to use. But, even if they do get recurrence this way, does it mean that they would also get recurrence if they used the more difficult, fixed boundary conditions? Later in his development of soliton theory, Zabusky discussed the affect of the boundary conditions on the recurrence time: Note that it was important to use periodic boundary conditions. This allowed the solitons to "circulate" until they were arrayed for focusing. We now realize that had we chosen an initial condition for the lattice problem which was neither a pure standing wave nor a pure progressive wave, we would have obtained a different focussing time for the solitons propagating along the two characteristic directions. Therefore, the recurrence would not have been as good as the one we observed. (1967, p. 244) As for the energy conservation problem, in the 1965 paper, they mention in a footnote that "the energy is almost conserved" (p. 243). It turns out that they had to neglect some higher order terms and they achieved energy conservation only to the fifth significant figure. So after a long time—many cycles of the calculation—such as that necessary to reproduce the super- cycles, it would seem possible that the energy loss might be significant enough to cause a disagreement. We can see from his decisions that Zabusky was stongly motivated to understand the results of the FPU simulation and that it was more the simulation results than the string that was interesting. Whereas FPU was approximating a continuous string with discrete equations, Zabusky was approximating the discrete approximation with continuous equations. FPU was trying to model a real string and Zabusky was modeling a model. He added higher-order terms to the continuous string to model the discontinuities, or the "graininess" of the discretized string. He developed a new model that incorporated terms usually considered to be velocity-dependent:
2.9. The First Soliton Paper 49 The dispersiveness of the beaded string can be treated in a continuum representation by keeping all or some of the higher derivative terms that were omitted in taking the lowest-continuum limit. In recent years, much effort has gone into resolving the singularities or breakdowns of hydrodynamic problems by incorporating "viscosity-like" higher derivative terms. These viscosity terms are dissipative and not dispersive in nature, and one would not expect the mathematical techniques for treating the former to be applicable to the latter. (1963, p. 128) The additional terms Zabusky added to account for the discretization had not been considered before because they were traditionally associated with other effects, so it seemed unusual to consider them. The approximation of a discrete system of oscillators is clearly not just the usual "continuum limit," as Zabusky said: "There is a fundamental disparity between the continuous and discrete representations of nonlinear, physical systems" (1963, p. 101). There is a lot more to be learned in probing the boundaries between discrete and continuous systems, beyond simply taking the lowest-order, continuous limit. Zabusky had a set of standard presuppositions about the effect of higher-order terms in the continuous expansion that might have prevented him from discovering the soliton. By putting his expectations aside in an effort to understand the FPU results, Zabusky made an important discovery. We can see that he believed in the behavior described by the FPU simulation, because he continued to change his own continuous model in order to obtain results similar to those of FPU, rather than try to understand the lowest-order continuum approximation to the string, which was in disagreement with the FPU results.
The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" In order to connect familiar with unfamiliar territory, in other words to make the unfamiliar familiar, to be able to find his way back, the explorer cuts a blaze precisely on the boundary of the known world. ... The region from which the blaze is visible is unfamiliar in the sense that the explorer has never seen it before, but familiar in the sense that he still knows his way back from it, and can safely venture there. Frederick Turner (1978, p. 625) But here comes the surprise ... All the isolated points correspond to one and the same trajectory, just as the points on one of the closed curves; but they behave in a completely different way. ... They seem to be distributed at random, in an area left free between the closed curves. Most striking is the fact that this change of behavior seems to occur abruptly across some dividing line in the plane. Michel Henon and Carl Heiles (1964, p. 76) 3.1. A Brief History of Dynamics Around 1610, Galileo began the formalization of terrestrial mechanics, while Kepler did the same for celestial mechanics. Sir Isaac Newton saw fit to combine these into one subject, unifying them with his calculus. Newton himself worked out the equations of motion for the two-body problem in order to demonstrate his method (1687). In 1788, the great Italian-French mathematician Joseph Louis Lagrange, who had been working with Euler on the calculus of variations, was able to formalize his brilliant generalized method of finding the equations of motion for any mechanical system— the same method we use today. It was also Lagrange, and then Laplace, who recognized the difficulty of understanding the motion of three bodies, and so arose the question of the stability of the Solar System. In 1834, in
52 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" Dublin, Sir William Rowan Hamilton placed position and momentum on equal footing as the canonical variables of dynamics, and he also showed that the variational principle beneath dynamics was the same as the principle of least time in optics, thereby uniting the two disparate formalisms at the theoretical level. Soon after Hamilton's contribution, Joseph Liouville proved a theorem implying that if energy was conserved in a system, then any volume of initial conditions in phase space must be conserved throughout the evolution. Liouville's theorem holds even if the system is ergodic and highly complex. In the late nineteenth century, statistical mechanics split off from dynamics to study the statistical behavior of large systems of particles, whose highly complex behavior was out of reach of the traditional method of following the paths of isolated trajectories. Beginning with Liouville's theorem, statistical mechanics could define volumes of initial conditions and study the so-called "hydrodynamics" of this conserved volume through time. At the end of the nineteenth century and into the twentieth, Henri Poincare made dramatic progress in the topological methods of dynamics. He applied the qualitative formalism of topology and geometry to the study of phase-space trajectories. He began using the equations of dynamics to define surfaces, such as the energy surface, in phase space. With these definitions, the evolution of dynamics could be achieved through a series of structural surface transformations—including changes in shape, dimension, and topology. But Poincare also proved a general theorem (1890, p. 233) showing that most dynamical systems do not possess a complete set of analytic integrals of the motion; thus they could not be expressed as a closed set of equations of motion—as were used to predict the far future of a dynamical system. He also found that the perturbation methods available at that time, which were used to approximate the solutions of nonintegrable systems, were perhaps not reliable for long-term prediction either. His findings were so conclusive that dynamicists could not see how to go beyond the very few already-solved problems of dynamics. For half a century, while general relativity and quantum mechanics were revolutionizing physics, dynamics remained at nearly a standstill; although there were many notable and important contributions, such as those of Birkhoff, Whittaker, and Siegel. Around 1954, on two fronts, new work in dynamics got under way. Even though they were isolated from one another by language, culture, and geography, these researchers began working on the same problem from two quite different perspectives. In the United States, Fermi and his group put the new digital computer to work simulating the simplest nonlinear problem and were surprised by the results. In what was then the Soviet Union, Andrei Kolmogorov had come to a conclusion about the behavior of dynamical systems with small nonlinear perturbations. And although it remained unproven until 1962, his conjecture (possible theorem yet to be proved) had direct implications for the outcome of the FPU simulation. The failure of
3.2. The Fundamental Problem of Dynamics 53 FPU generated much continuing work for physicists throughout the 1960s and early 1970s. At the same time, following the proofs of Kolmogorov's conjecture by V.I. Arnold and Jiirgen Moser, applied mathematicians and physicists began studying the implications of the theorem for nonintegrable dynamical systems. Eventually, during the 1960s, these two threads came together and provided the first pathway into the new dynamics. _ Kolmogorov's Conjecture 1954 \ 'l Arnold 1 1 96 3 1 ^r Moser l 1962 3.2. The Fundamental Problem of Dynamics Poincare himself called the problem of studying perturbations of conditionally periodic motions in a system given by the Hamil- tonian H = Ho(I) + eH1{I10), c<l, (3.1) in action-angle variables J and 0, the fundamental problem of dynamics. Here Ho is the Hamiltonian of the unperturbed problem, and eH\ a perturbation which is a 27r-periodic function of the angle variables #i,... ,0n- In the unperturbed problem (e = 0) the angles 0 change uniformly with constant frequencies Uk = dHn dlh' (3.2) and all the action variables are first integrals. (3-3) We must investigate the phase curves of Hamilton's equations dJ __d# d0 _ dH dt~ do' dt~ or in a phase space which is a direct product of a region in n- dimensional space with coordinates J and the n-dimensional torus with angular coordinates 0. V.I. Arnold (1978, p. 400)
54 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" The proto-typical problem of this' class is the three-body problem, which goes back to the very beginning of dynamics. Poincare pronounced it un- solvable and we shall see why. The FPU Hamiltonian also falls into this category. In the Appendix, I translate the FPU model into the language of Hamiltonian dynamics. In that venue, the model becomes an explicitly time-independent Hamiltonian function, expressed exactly the same as (3.1)—which represents the total energy (conserved) for a system of coupled oscillators. The action-angle variables (J, 0) suggest a visualization of the dynamics as orbits on the surface of an n-dimensional torus in phase space. For the unperturbed model, the actions {Ik} form a complete set of n integrals of the motion, that imply the simple linear progression of the angle variables 0k on the surface of the n-torus. The unperturbed motion is of two types, both of which are periodic in each of the n degrees of freedom. But one type is periodic overall and the other never visits the same state twice. If the set of frequencies (3.2) are commensurate, satisfying the equation n y] rrik Wk = 0, for any integer values of ra^, (3.4) then the orbit will form a closed curve on the toroidal surface, periodically repeating itself after an exact whole number of windings. However, if the set of unperturbed frequencies are incommensurate, then the orbit will be conditionally periodic—an open curve filling the entire toroidal surface without ever returning to exactly the same configuration. When we turn on the perturbation (by setting e > 0), the difference between periodic and conditionally periodic unperturbed orbits, or more specifically, just how close to commensurate the frequencies are, determines which of the two radically different types of motion will result. Perturbation Theory Because these models are not usually integrable, the method of perturbations was developed—mainly by Poincare—to deal with problems of this nature. But this method has had some profound problems of its own that needed to be overcome before progress could be made on the fundamental problem of dynamics. The procedure of perturbing an integrable system with small terms as successive approximations to the full nonlinear dynamical system goes back to the early days of dynamics. Newton began using approximations when trying to understand the motion of the planets, which led him to realize that his dynamical equations apparently could not be integrated explicitly for more than two bodies interacting under the force of gravity. The study of the next order of the problem, the famous three-body problem has a long history of its own, which I will not attempt to summarize here. At
3.2. The Fundamental Problem of Dynamics 55 the end of the nineteenth century, Poincare formally developed a whole series of perturbation methods to cope with the problems of celestial mechanics. These elaborate methods were deemed necessary because Bruns (1887, p. 70) proved a theorem establishing the non-existence of algebraic integrals of the motion for the three-body problem—that is, any beyond the known six integrals of motion for the center of mass of the system, the three integrals of angular momentum, and the energy. Algebraic integrals take the form /(#, p, t) = constant, where / is an unspecified algebraic function. Two years later, Poincare proved a similar theorem that he says: ... is more general in a sense than that of Bruns, because I demonstrate not only that there exists no algebraic integral, but that there exists not even a transcendental uniform integral, and not only that an integral cannot be uniform for all values of the variables, but that it cannot remain uniform even in a restricted domain. (1890, p. 256ff) Thus it seemed that there could be no subsequent integrals of the motion, which meant nonintegrability in general, and that perturbation methods may be the only possible hope for studying apparently nonintegrable dynamical systems. I say "apparently nonintegrable" because there is no known way of determining that particular property for any general model type. Whereas the problem of finding ways to integrate specific models is still lively today, the number of known integrable models is quite limited. According to Arnold: The collection of solvable "integrable" problems that we have at our disposal is not large (one-dimensional problems, motion of a point in a central field, Eulerian and Langrangian motions of a rigid body, the problem of two fixed centers, and motion along geodesies on the ellipsoid). (1978, p. 399) Yet most of what we have obtained in mechanics since its inception results from these few cases. With no further integrals of the motion, it seemed as if no one should have expected isolating behavior in the nonintegrable problems of dynamics. In an early work, Fermi (1923) tried to prove an ergodic theorem based on Poincare's general result. His theorem claimed to show that if, in any particular dynamical system, no integrals exist beyond the energy integral, then that system must be ergodic on the energy surface. First he showed that once a trajectory originated on the energy surface, it would remain on that surface for all time—a result similar to the Poincare theorem. But the second part of the theorem tried to show that, given a trajectory that begins in some arbitrary small segment of the energy surface, after some time it necessarily would pass arbitrarily close to any other point on the surface (see Haar, 1954, pp. 358-359). In terms of the FPU system,
56 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" he believed that adding nonlinear perturbation terms would couple the oscillators, resulting in ergodic energy-sharing. But his proof of the "quasi- ergodic" theorem was said to be unconvincing. According to E. Serge, the editor of Fermi's Collected Papers: "The proof of the ergodic theorem given by Fermi is not considered rigorous from the mathematical point of view and it is difficult to make it rigorous" (1965, p. 79). It is far from clear whether Fermi believed his theorem to be as yet unproven, unprovable, or doubted in any way; but it is clear from the FPU surprise that Fermi et al. (1955) expected ergodic behavior in their simulation. The Poincare Theorem and Its Application The question of whether Fermi's theorem was or was not provable, and so whether or not it guaranteed ergodicity in the FPU model is further and more significantly confounded by the simple fact that the original Poincare theorem itself does not apply to a system of coupled harmonic oscillators, and so, strictly speaking, it does not apply to the FPU model at all. Not only was there no actual guarantee of ergodicity, but there was also no guarantee that there might not be additional integrals of the motion. In Volume 1, Chapter 5 of New Methods of Celestial Mechanics (1899), Poincare proved the theorem that forms the basis for many of the expectations of nonintegrability discussed throughout this period, especially with respect to the two simulated Hamiltonians of FPU and Henon and Heiles. I restate the theorem as follows: Given a Hamiltonian that may be expressed as H = tf0+etfi4-e2#2 + -.. , that is analytic in e, /, 0, and periodic in 0 and satisfies Hamilton's equations (3.3) and assuming that H0 is a function of n variables Ik and does not depend on 0 and that Ho also satisfies the nondegeneracy condition: dij = d2H0 dlidlj ± 0, (3.5) then there exists no uniform integral, other than H = constant, analytic in e, /, and 0. In action-angle variables, the unperturbed Hamiltonian for any system of coupled harmonic oscillators is the sum of the action-times-the-frequency of each mode, n H0(I) = Y,"kh. (3.6) fc=i
3.3. The Small Divisors Problem 57 Notice that this expression is linear in the actions and so all second derivatives will be identically zero. Therefore no Hamiltonian that is a system of coupled harmonic oscillators can satisfy the condition of nondegener- acy (3.5). But this condition is necessary for any strict application of the Poincare theorem. When formulated as a perturbed chain of coupled harmonic oscillators, the FPU Hamiltonian satisfies neither the Poincare theorem, nor, as we will see, the KAM theorem. However, taken as a whole, the FPU Hamiltonian with its nonlinear perturbation could be seen as a perturbation of another, unknown, nonlinear dynamical system, whose unperturbed part does satisfy the nondegeneracy condition, even though Fermi et al. did not conceive of it that way. But the FPU Hamiltonian could be close to one that does satisfy the Poincare theorem, and then perhaps Fermi's theorem as well, if it is truly a theorem at all. Clearly, the FPU simulation showed that the system did not have a complete set of integrals of the motion, and at the same time, it was also not ergodic. Somewhere between these two extremes lies an explanation and a new realm of dynamical behavior. 3.3. The Small Divisors Problem Difficulties encountered in celestial mechanics on account of the existence of small divisors and approximate commensurabilities of mean motions are connected with the very nature of things and cannot be avoided. H. Poincare (from Arnold, 1963b) The KAM theorem appears to provide a long-awaited solution to a problem that essentially blocked analytical studies of dynamical systems using perturbation theory for the first 50 years of this century. In this section we begin our study of the KAM theorem directly from the analytical perspective, by reviewing the problem for which it provides the solution, and then the solution itself. Because the force equations of dynamics typically cannot be solved analytically in closed form, the usual approach to finding a solution begins by expanding the force functions in an infinite series about some small parameter e. The point of making the expansion is to approximate the dynamics by subsequently truncating the infinite expansion to a finite sum. Hopefully these terms can be integrated in some way to predict the behavior of the system in lieu of a complete solution. The first term is the linear term, which is always integrable, and often is in itself a sufficient approximation to the dynamics. But as an approximation, any truncation of the full infinite series must diverge from the true behavior after some period of time. But the difficulty with adding more terms lies in the fact that, as we add them, we lose the simplicity of the solution. As we have
58 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" seen in the FPU problem, adding force terms to the linear harmonic oscillator results in the loss of any analytical solution and requires new methods of analysis. Classical perturbation theory, which operates in the realm of energy and Hamiltonians—instead of forces—is the general name given to this approach. It proceeds by considering the changes in dynamical systems due to small, angle- and action-dependent terms added to the integrable Hamiltonian—successive approximations to the full dynamical problem. The additional terms are considered to be correction terms of order e— where e is small and Hi is of period 2tt in 0. Usually the integrable part of the Hamiltonian is normalized: we define it to be of order one. The smallness parameter e is used to control the size of the perturbation or correction terms, the first of which is typically the size of e to the first power. This is the essential structure of any perturbation-theory analysis. Underlying this method is the structural-stability assumption, that the behavior due to the perturbation will be only a slight modification of the well-known behavior of the integrable system model. In general, predictions made from a Hamiltonian with this one additional term are valid up to times on the order of t ~ 1/e. The example that was the main driving force for all of this work was the stability of the Solar System: How long can we expect the planets to remain in their orbits as they are today? Will they eventually collide? Fall into the Sun? or Will they escape into interstellar space? In order to answer such questions, We require predictions on astronomical time scales, which requires huge reductions in the size of the correction terms. The first step in answering this question is the study of the three- body problem, the goal of most of Poincare's work in celestial mechanics. Here in perturbation theory, we once again encounter the usefulness of Hamiltonian mechanics. According to the approach of classical perturbation theory, we would like to find a canonical transformation (J, 0) —> (J, <j>) that transforms H, retaining the first term's angular independence, while reducing the size of the correction term. Functionally, we want H(I, 0) - ff(J, (j>) = H0(J) + 62Hi(J, 0), (3.7) which maintains the angular independence of Ho and reduces the error to order e2. Predictions based on this model would be accurate to times on the order of t ~ 1/e2. Such a transformation would be a function of e and it would eliminate the terms of order e by substituting terms of the smaller order e2 (e < 1 => e2 < e). If we can find such a transformation, then, ideally, we would like to further reduce the error by applying a series of such canonical transformations, where each successive transformation reduces the error by at least one power of e and extends the accuracy of predictions to longer times. In the limit of an infinite number of these transformations, if this sequence converges (reduces the error to zero without causing H to become pathological), then we will have found a nonlinear, integrable system. However, Siegel (1956) has shown that these so-called asymptotic series do not converge in general. This then is the manifestation of the
3.3. The Small Divisors Problem 59 problem brought forth by the Poincare theorem: not only could we not integrate the equations, but now there appears to be a severe problem with the only other method we know about. It was this general convergence failure of these series that had blocked any significant progress in dynamics from Poincare (1899) up until Kolmogorov's 1954 paper. But before we discuss modern dynamics, let us study this convergence difficulty in more detail. There are two related problems with the convergence of the asymptotic series, both are due to the so-called small divisors (see the extensive review of this subject in Arnold, 1963b). Each canonical transformation, required to reduce the order of the error by one power of e, involves terms that take the form of Fourier-series expansions, such as S(J, 9) = Vj,ft + ^ p¥±- cos HT Mi J , (3.8) i=i k^L.iK^ \ i ) This highly complicated, canonical-transformation generating function involves one variable (0) from the old variables and one (J) from the new set. The Bk terms are Fourier coefficients that depend on the new actions. Notice that in the denominator of every term in the second sum we find the very same commensurability condition (3.4) that distinguishes between the two types of unperturbed motion—periodic and conditionally periodic. The presence of this denominator is a consequence of both the Fourier series expansion and the requirements of the canonical transformation theory. This expression vanishes for some appropriate set of integer coefficients {fc?}, which means that the Fourier expansion terms "blow up," that is, they cause the Hamiltonian to approach very large energies. This pathological condition indicates that the series cannot converge. Frequencies naturally commensurate in a physical system are called resonances because the rational relationship between them represents a conduit for energy transport among the modes. This is the same effect as when the forcing frequency of a driven harmonic oscillator is in resonance with the natural frequency of the oscillator; the super critical transport of energy can drive the oscillator to the breaking point. But even if we knew that a specific system had no naturally occurring physical resonances, these series would still diverge mathematically, because the rational numbers are dense on the set of real numbers, so there exists a set of coefficients that will make this denominator approach zero. The traditional method of perturbation theory did not provide a way to avoid a rational number in the denominator with each additional correction. One of Kolmogorov's insights was just such a method that would avoid the small divisors, by making certain adjustments with each additional transformation. Notice that even if the frequency sums were guaranteed to be irrational, we would still see very large terms arising from "near rational" irrational numbers, that is, irrational numbers whose ratios are very close to ra-
60 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" tional, and so cause the divisors to become very small, although technically nonzero. But even if, for the moment, we could ignore the rational numbers—for which the terms in the expansion actually could go to infinity—then we would still have a problem with convergence. This second problem is that in an asymptotic series, such as all those from classical perturbation methods, the convergence is so slow in coming—one power reduction in e per transformation step—that the terms of the series get large faster than the reduction in e. The divergence of these perturbation series actually tells us very little about the dynamical system itself, because the energy of physical systems does not go to infinity at resonance. It simply means that the methods of analysis we are using are inappropriate under those conditions. The model still possesses dynamics in those domains, but, before 1954, we had not yet found a way to study the long-term behavior analytically. It was Kolmogorov's major achievement to find a way out of both of these impediments. 3.4. Poincare to Kolmogorov Poincare applied topology and geometrical methods to the study of dynamics; he developed the idea of phase space, and turned dynamics into the study of surfaces. In classical dynamics, as in the entire classical episteme, the world appeared to be extremely ordered and deterministic. For classical dynamicists, phase space also seemed very simple and well structured, because without sufficient analytical methods for treating nonintegrable dynamics, they were seeing only a small part of a much larger realm, full of strange new and vastly complex structures. While practically inventing what we know of as perturbation theory, simply to obtain a set of tools to tackle the three-body problem, Poincare, at the very end of his three- volume magnum opus, realized the possibility of infinite complexity in this, the first unsolved problem. He discovered the existence of what he called a homoclinous point (dynamicists use the term "homoclinic point" these days), which is a special kind of intersection of two boundary curves in phase space. What follows is the often quoted passage from that work in which Poincare expresses his glimpse of what lies in the future of dynamics: When we try to represent the figure formed by these two curves and their intersections in a finite number, each of which corresponds to a doubly asymptotic solution, these intersections form a type of trellis, tissue, or grid with infinitely serrated mesh. Neither of the two curves must ever cut across itself again, but it must bend back upon itself in a very complex manner in order to cut across all the meshes in the grid an infinite number of times.
3.4. Poincare to Kolmogorov 61 The complexity of this figure will be striking, and I shall not even try to draw it. Nothing is more suitable for providing us with an idea of the complex nature of the three-body problem, and all of the problems of dynamics in general, where there is no uniform integral and where the [asymptotic] series are divergent. (Poincare, 1899, Vol. Ill, p. 389) Perhaps the most significant aspect of this expression is that Poincare, with whom we associate topology, geometry, and phase space, would not even attempt to sketch such a figure. During the 50 years after Poincare wrote these words, dynamicists knew that something existed in the phase space of nonintegrable systems, perhaps something incredible, but it was completely unfamiliar to them. Poincare had discovered the woods and had glimpsed the demons, but it would take half a century before someone was ready to venture in. In 1954, Andrei Kolmogorov, a Russian mathematician of the first rank, blazed the trail into Poincare's woods. He saw what must be "the extension of the familiar into the unfamiliar." By then, the classical world-view had been shattered by the successes of quantum mechanics, so dynamicists were now able to look beyond simple, classical structures in phase space. Classically, we had either an integrable system with nice, simple tori constraining the conditionally periodic trajectories, or else we had nearly random trajectories moving about with no discernible (predictable) pattern, in an ergodic phase space (statistical mechanics). In 1954, the same year that John Pasta and Stanislaw Ulam printed the FPU results in an unpublished Los Alamos report, Kolmogorov published a conjecture in the Russian language journal Soviet Math Doklady. Apparently very few Western dynamicists were in the habit of reading untranslated Russian mathematics journals at that time, because news of Kolmogorov's findings was not known to them until the mid-1960s. In 1957, Kolmogorov presented his conjecture in an address (also in Russian) to the International Congress of Mathematicians; it was translated into French and published as Theorie general des systemes dynamiques de la mecanique classique (Sem. Janet, no. 6, Fac. Sci., Paris, 1958). The address was finally published in English as Appendix D in Abraham (1967). Whereas in the early 1960s the Doklady journal was selectively reprinted in English translation in Russian Math Surveys (London: London Mathematical Society), Kolmogorov's original 1954 article predated the starting point (1958) for the Surveys. It was finally translated into English by Helen Dahlby and printed as Los Alamos Scientific Laboratory Translation LA- TR-71-67, and can now most easily be found in reprint in Bai-Lin (1984). The news of KAM traveled slow, first to mathematicians working on classes of dynamics more general than those of concern to most physicists, and then to the physicists themselves when they began to realize its implications for their work in the mid-1960s.
62 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" Dodging Small Divisors In approaching the small-divisors problem, Kolmogorov must have reasoned that if he could avoid the rational combinations of frequencies, then the problem is reduced to accelerating the convergence of the asymptotic series. But how could one ignore the rational numbers, which are dense on the real line? Prom a measure-theoretic point of view, the rational numbers have measure zero as compared to the real numbers. The rational numbers are countable and, as such, the set of all of them is much smaller than the set of all the irrational numbers. Kolmogorov realized that, given an arbitrary real number, it was likely to be an irrational number, and so most tori (in an integrable, unperturbed Hamiltonian system) are conditionally periodic, or nonresonant. One way to visualize the problem is as follows: The particular set of frequencies {u)k} for any oscillator system depends on the values of the initial conditions, which in turn, determine the actions {Ik} (constants of the unperturbed motion), which are represented as the set of radii for the hyper-torus (see the Appendix). Theoretically, most of the different possible tori for any oscillator system are going to be conditionally periodic, which, when perturbed, do not suffer the problem of small divisors. The unperturbed surfaces of these tori are covered by the orbit after a long time, making them seem like solid shells—each one a hollow (hyper-)donut shape (Figure 3.1). Different values for the actions, stemming from different initial conditions, result in larger or smaller shells. Thus we can imagine a whole continuum of donut shells, one surrounding the next. The ones that correspond to resonant frequencies, which, when perturbed, give rise to the small divisors, are not solid shells, but single closed orbits winding within the space defined by the surface of the torus. In order to avoid the resonant tori and so the problem of small divisors, we needed to stay away from initial conditions that led to actions and frequencies that came close to satisfying (3.4). This task presented a difficulty because traditional perturbation theory required us to fix an initial condition and then perform canonical transformations to new coordinates. So although we could have started with initial conditions far away from the resonance condition, there was no way to assure that the frequencies would still be nonresonant after any particular canonical transformation. Kolmogorov's approach suggests that he thought of the tori as structurally robust topological forms (shells). He recognized that the model equations supported both of these distinctly different types of tori; just as the real line supports two exclusive categories of real numbers, so the perturbed equations should allow for two exclusive sets of topological structures. Kolmogorov must have assumed that canonical transformations could not change the structural character of these entities. Intuitively, he saw no reason to believe that all conditionally periodic tori would simply vanish with the addition of a small perturbation, because most of them were "far"
3.4. Poincare to Kolmogorov 63 Conditionally Periodic Periodic: Freq = 9 0-^ Concentric shells that are mostly conditionally-periodic FIGURE 3.1. Conditionally periodic tori are covered by a single trajectory after a long period of time; periodic tori contain only a single closed periodic orbit. The periodic or resonant tori form a small subset of the continuum of possible tori, conceived as a sequence of concentric tori.
64 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" from the dangerous denominators. By "far," I mean that the linear sum of the frequencies is sufficiently larger than zero for any integer values of the coefficients as to be unaffected by the small divisors. If, as Fermi et al. assumed, the presence of the nonlinear perturbation caused the system dynamics to become ergodic, then the shells necessarily would be destroyed or otherwise topologically altered, resulting in trajectories free to move about anywhere within the space of the outermost shell, that is, the result would be a homogeneous chaotic donut. The KAM theorem demands the persistence of most of the internal structure of the unperturbed system of conditionally periodic shells. The mathematical method was the problem and not the topology. Because Kolmogorov expected most of the conditionally periodic tori to persist, he decided to lock onto one that was far from a periodic torus and track it across each canonical transformation by going back each time to modify the initial conditions as necessary to end up with a conditionally periodic torus in the last transformation. This was his first breakthrough. He could eliminate the periodic tori from consideration by breaking with tradition and changing the initial conditions with each new transformation. All that remained was to find a way to assure the convergence of the series. Accelerated Convergence Asymptotic series fail to converge because of their relatively slow rate of convergence—the error decreasing by only one power of e per term. Kolmogorov discovered a solution to this problem in Newton's method of tangents. This method of finding the roots of a function (xi is a root of F whenever F(xi) = 0) uses recursive approximations based on the slope of the function—the tangent. In this method, accuracy doubles on each iteration of the approximation—which means that the error must decrease by quadratic order on each iteration (e2 —» e4 —» e8 —> • • •). Kolmogorov found that if he could apply a similar method to replace the usual asymptotic series of canonical transformations, then this quadratic reduction would be fast enough to assure convergence. He convinced himself that this method would in fact lead to convergence, and he alluded to it in his paper; but he did not have to apply it rigorously because he never formally proved his conjecture. 3.5. The Conjecture Kolmogorov was convinced that he could prove his conjecture and so he announced his findings in the 1954 paper, the two-dimensional version of which is adapted here from the English translation in Abraham (1967):
3.5. The Conjecture 65 duji = d2H0 dlidlj Let us suppose that the Jacobian of the frequencies u)i with respect to the momenta Ii is nonzero: ? 0. (3.9) It turns out that in this case, the partitioning of the region in question of the four-dimensional space into two-dimensional tori is basically stable with respect to small changes in H of the form H(I, 0, e) = H0(I) + eHtf, 0, c). (3.10) To obtain a precise formulation, let us consider a region G determined by the condition JgB, where B is a bounded region in the plane of points /. Assuming that the functions H$ and H\ are analytic and that the above condition is satisfied, we can prove that, for arbitrary (3 > 0, there exists a 6 > 0 such that, for |e| < <5, in the dynamical system dt dli' dt dOi9 [ j the entire region G except for a set of measure less than 0 consists of invariant two-dimensional tori on each of which, in suitable (that is, depending analytically on (/, 6)) circular coordinates 0i, </>2> the motion is determined by the equations where &\ and &2 are constant on each torus, that is, they are conditionally periodic with two periods. Although Kolmogorov claimed that he could prove his theorem, he left that rigorous exercise to younger men. Eight years after his initial statement of the conjecture, Arnold (1963a) and Moser (1962) independently published proofs of the theorem. Arnold used the super-convergent perturbation theory suggested by Kolmogorov and required the perturbation to be analytic (infinitely differentiable), just as Kolmogorov requires in his statement. Moser's approach to the proof took the form of analytical twist mappings and he required the perturbation to have derivatives to only the 333rd order! According to the KAM theorem, most conditionally periodic tori remain stable after the addition of a small perturbation eH\ to the integrable Hamiltonian Hq. However, the action-radii will be dependent upon the angle variables, instead of remaining constant throughout the evolution.
66 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" So instead of toroidal surfaces with fixed radii, these new surfaces have angle-dependent radii, which means that the surfaces no longer retain the ideal donut shape. They may have indentation or bumps, or they may have to undergo much more radical topological transformations. But by not explicitly mentioning them, the theorem also allows for regions in which some tori are not preserved. This exceptional region must contain all the tori affected by the small divisors, which in turn, must be of a measure less than /?, as long as e < 6. This region contains all of what remains of the periodic tori, as well as the conditionally periodic tori that are associated with nearly commensurate frequencies. Furthermore, this theorem implies that the size of the exceptional region (or regions, since they need not be connected) is related to the size of the perturbation. Although nothing more definite is guaranteed by the theorem, it has already provided a great deal. The KAM theorem cuts a blaze into the fundamental problem of dynamics and so extends the boundary of the known world. Instead of the complete unknown—represented by the (classical) homogeneous ergodic donut—we find that under certain circumstances, the conditionally periodic tori persist and therefore provide elements of familiar and possibly predictable behavior. But as always, to the explorer, the most interesting part of any realm is the region beyond the blaze. 3.6. Beyond the Blaze What happens to the tori, under small perturbation? The resonant orbits, those plagued by the small divisors, are immediately destabilized by the perturbation. As Melnikov (1963) and Arnold (see Arnold and Avez, 1968) both discovered, these orbits are replaced by a sequence of alternating stable and unstable equilibrium points, which give rise to the homoclinic orbits—that infinitely complex lattice that Poincare would not even attempt to sketch. There are two quite interesting topological changes that occur in the space between the preserved tori. In Figure 3.2(a), we see again the unperturbed, concentric, mostly conditionally periodic tori, whose radii are all constants of the motion. But then, in Figure 3.2(b), we see the effect of adding the nonlinear perturbation and bringing the Hamiltonian into the realm of KAM. Of the three tori depicted in this figure, the outer and inner tori are the same as in the previous figure; they represent the preserved conditionally periodic tori. But between them winds a new mini-torus. Its surface is wrapped about what was a period-four orbit, so it winds around four times between the other two tori. The conditionally periodic tori that are "near" to resonant orbits undergo this amazing change. They retain their bounded, conditionally periodic character—meaning that trajectories remain fixed to their toroidal surface
3.6. Beyond the Blaze 67 (a) Unperturbed (b) Perturbed FIGURE 3.2. (a) The unperturbed, integrable Hamiltonian has a phase space that is a series of concentric tori. Each torus represents one possible orbit for the system with a unique set of initial conditions, (b) When a perturbed Hamiltonian satisfies the conditions of KAM, its toroidal phase space appears to have undergone a dramatic change: some of the conditionally periodic tori wrap themselves around the destroyed periodic orbits, creating mini-tori that are bound between the still-preserved tori. These new tori embody all the properties of the whole: they contain further mini-tori within them.
68 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" and their angular coordinates are individually periodic; but the surface is no longer strictly concentric with the other, still-preserved tori. Instead of being wrapped around the central circular axis of the entire torus, these surfaces now are wrapped around what had been the nearest resonant, periodic orbit, thereby forming a conditionally periodic torus that winds about between the preserved tori. The number of times it winds about depends upon the periodicity of the resonance about which it is wrapped. However, these mini-tori still wind around the central circular axis in the sense that the periodic orbits about which they are wound, were themselves wound around that circle. The mini-tori are both conditionally periodic—in the sense of their surface being filled by the orbit after long periods—and periodic—in the sense that they wind about the larger torus an integer number of times. Fractal Structure There is a very strong analogy between the larger structure of these KAM tori and the real number line—one that stems from the the small divisors problem. The small divisors, and so the periodic orbits, are related to the rational numbers, whereas the conditionally periodic tori are related to the irrational numbers. The real number line is a fractal in the sense that the interpenetration of the rational numbers and irrational numbers on the whole line reemerges at each scale of the rational numbers. The integers represent a certain scale of rational numbers, just as the strongest resonances in the physical system give rise to the most pathological small divisors. Given any two integers, there is another whole scale of rational fractions punctuating the irrational space between them, and so on for any two rational numbers ad infinitum. In the KAM torus, each of the strong resonances generates a mini-torus in perturbation. So there is a whole sequence of them punctuating the space of preserved conditionally periodic tori. But even more interesting is the fact that within the mini-tori, just as between any two integers, the entire structure of the whole re-emerges on another size scale. Within all the mini-tori, there are more mini-tori, as well as preserved conditionally periodic tori at each size scale of the small divisors. Heuristic Explanation of FPU Finally, in answer to the question, "How does KAM explain FPU?" please refer to Figure 3.3 for what follows: Fermi et al. used the unperturbed normal modes to track the evolution of the string. For the unperturbed Hamiltonian in FPU, the torus is the complete set of concentric shells of a very high dimension. In these figures, I must restrict my representation to two dimensions. Once the FPU Hamiltonian was perturbed, and if that perturbation was subject to the KAM theorem, then, if the perturbation
3.6. Beyond the Blaze 69 Plane Cross-sections of Tori FIGURE 3.3. The KAM torus provides us with a heuristic explanation for the FPU string: superimposing the before and after perturbation tori and assuming an arbitrary period-four mini-torus between the number one and number four normal-mode preserved tori, we see in the cut-away view at the bottom how an orbit might develop the quasi-periodic behavior found in FPU by following sequence ® through ®. The orbit passes through the four lowest modes and returns almost completely to the original mode configuration.
70 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" parameter were small enough, the resulting phase space (in action-angle variables) would be a perturbed KAM torus such as the one represented in Figure 3.2. In Figure 3.3, I have superimposed the cross-sections of the unperturbed and perturbed tori showing how a mini-torus might exist in the space between what had been the mode-one and mode-four tori. Although I have arbitrarily chosen a period-four mini-torus, the periodicity does not affect the explanation. Let us focus in on one cross-section of the mini-torus (they are all the same) in the lower right of the figure. As this plane is a cross-section to a toroidal surface, the system trajectory periodically passes either outward or inward through this surface, but always in the same direction. The points of passage are fixed to the circular ring; but they must march around the ring in a fixed succession. I have represented such a sequence with the letters ® through © . Beginning at ®, the system trajectory is in the configuration of mode one, which is the FPU initial condition. The trajectory then proceeds around the mini-torus, eventually returning to pass through this surface at point ©, which corresponds to what had been mode two (before the perturbation). Then the sequence continues through points © and (3), arriving at modes three and four, respectively. But here, the surface and the periodicity demand that the trajectory go to point ®, which is a turnaround in terms of the normal modes, because point ® is back near mode three. This behavior seems to correspond nicely with what we saw in the FPU simulation. But there is even more. Looking at point © , we see that the return to mode one is close but not exact. Nor can it be exact because this is a conditionally periodic orbit, so it can never return exactly to the same point. Thus at point® , we see the quasi-periodic return to the initial condition. How about that super-period found by Tuck and Menzel (1972), which gave so much trouble to both Jackson (1963b) and Zabusky (1962)? This secondary period can be explained by a close examination of the point © . Because of the orbit's periodicity about the ring, and just as the point (D fell short of point© , point® is slightly closer to point® than was point © . This progression toward the initial condition continues with the points ® , © , etc. If the periodicity were within a certain range, then we would see a very close return to the initial condition that might be described as follows: This trend continued so that by ~ 780 cycles [fundamental oscillations of mode one], the string was closer than ever before to its starting configuration, clearly the recurrence has a super- period. (Tuck and Menzel, 1972, p. 403) The FPU super-period returned the string to its closest approach to the initial condition on every sixteen of its fundamental quasi-periods. In this heuristic explanation, the existence of the super-period requires that after
3.6. Beyond the Blaze 71 sixteen times around the circle, the trajectory must come back very close to the point ®, and so getting very close to the initial condition before marching away again. The KAM explanation of FPU is very compelling; however, there are some serious technical difficulties in making the heuristic explanation into a rigorous mathematical one. So serious are these difficulties, that no one to this author's knowledge has to date succeeded in making it rigorous. As I mention in my qualifications above, just picking periodicities out of the air is unacceptable; the KAM torus corresponding to the perturbed FPU Hamiltonian would have to be derived from the model through canonical transformations. But perhaps even more difficult is the need to overcome the fact that the unperturbed FPU model cannot be used as a starting point because it does not satisfy the nondegeneracy conditions of either the Poincare (3.5) or KAM (3.9) theorems. It must be shown that the perturbed FPU Hamiltonian can be derived from some nondegenerate, integrate model. But so few integrable models exist that finding one that can be perturbed into the FPU model seems remote. However, as we will see in the next chapter, Ford developed some further convincing arguments for this explanation. Deterministic Chaos Two further issues that might or should come up when considering the KAM torus are the questions of what lies in that space between the preserved tori and in-between the mini-tori, and what did happen to the period orbits? Of course the answers to both questions are the same: that space is filled with a single orbit, the orbit that corresponds to the destroyed unperturbed periodic resonance. According to the theory devised by Poincare, Birkhoff, Arnold, and Melnikov, alternating with the stable elliptic points— those that give rise to the mini-tori—there is a set of hyperbolic points that cause the orbit in this region to act very erratically, we might even say pathological. Although it is bound between the conditionally periodic tori, the movement of this single trajectory appears to be chaotic (nonde- terministic). This is thought to be the beginning of ergodicity, and it is also the reason perturbation theory failed. The single trajectory that corresponded to the unperturbed periodic orbit—about which the mini-torus is now wrapped—is free to move about between those preserved tori, and it does so in an extremely erratic, unpredictable manner. Like the conditionally periodic orbit, this trajectory will completely fill its available space over an infinite amount of time; but unlike the conditionally periodic torus, its movement is aperiodic. Perturbation theory fails to predict the movement because of the small divisors and, as we will see, simulation also fails to track this movement. Although the trajectory is still deterministic, following as it does from the formalism of classical mechanics, it cannot be predicted very far into the
72 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" future, nor simulated very far, for reasons that we will discuss in Chapter 5. This is the behavior that has come to be known as deterministic chaos. It is not chaotic in the classical sense of no order; but its order cannot be used to predict its movement. This chaos was what Poincare recognized as the infinitely complex web of stochastic trajectories. Before the KAM theorem, it was thought that phase space was filled with tori, in the integrable case, or else it was a turbulent sea of ergodic trajectories. Clearly, Fermi et al. expected the perturbed FPU system to become ergodic, because it was apparently not integrable. The KAM theorem blazes a trail into the transition between integrability and ergodicity. One of the more profound philosophical implications of modern dynamics is that predictability and complete randomness can and often do coexist in one phase space. Within each mini-torus there are more mini-tori, and between them there is more deterministic chaos, and within those tori there are more tori. Mathematically, the KAM torus is a beautiful dynamical object, made possible by the numerical properties of the real numbers, and embodying the perfect interpenetration of order and disorder. It is a transitional object, because its composition depends on the value of the perturbation parameter; as the parameter is increased, more of the conditionally periodic tori are near the resonances and so the deterministic chaos seems to take up more and more of the phase space. According to classical intuition, we expect this process to lead to the homogeneous sea of ergodicity at some critical value; but then again, that intuition led to the original FPU problem. 3.7. The Henon and Heiles Simulation, 1964 In 1969, when Grayson Walker and Joseph Ford introduced the KAM theorem to the general physics community in their Physical Review article, they used extensively the results of some work performed five years earlier in theoretical astrophysics by Michel Henon and Carl Heiles. This latter work was significant not only because it demonstrated vividly some of the properties of KAM tori remarkably well, but because it also broke new ground in the techniques of simulation in dynamics. Looking back at our road map on page 3, we see that this work constitutes one of three independent starts in the FPU research program, because Henon and Heiles conceived and completed their project without ever having knowledge of either the KAM theorem or the FPU simulation. Yet the model they used and the behavior they observed had intimate relationships to both. In 1963, just after KAM became a theorem with the proofs by Arnold and Moser, astrophysicists Henon and Heiles (1964) were investigating the possibility of there being a hidden, third integral of the motion for the case of a star moving in a galactic axisymmetric potential. In their pa-
3.7. The Henon and Heiles Simulation, 1964 73 per "The Applicability of the Third Integral of Motion: Some Numerical Experiments," Henon and Heiles present a short history of the problem. There were stellar observations that might be accounted for by hypothesizing the existence of a third integral of the motion, beyond the two integrals of conserved energy and angular momentum. The absence of a third integral "implies that the dispersion of velocities should be the same in the direction of the galactic center and in the direction perpendicular to the galactic plane, whereas the observed dispersions have approximately a 2:1 ratio" (p. 73). Observational data of subject stars seemed to indicate that they might be bounded by one additional degree of freedom, resulting in asymmetrical dispersions. Just as in FPU, these researchers expected that only integrals of the motion could bind the trajectories of a model; without them, trajectories should move randomly throughout the available energy surface. r \ FPU I ^ > \ / r i u _IA , LI, .. hci iui i-nciico | Simulation 1 1964 I ~T~ Gustavson 1966 » - r Walker-Ford 1969 T-, Ford, Stoddard | 8 iTi im< 3r 1973 Toda 1967 f \ f H6non | 1 974 Henon and Heiles developed a model for the motion of a single star in a galactic field that included the following terms: H(x,y,x,y) = ^- + ^±^- +*2j/-^. (3.13) Because of the first two terms, the Hamiltonian looks just like a harmonic oscillator with two degrees of freedom, which has been perturbed by the two nonlinear terms on the right, both of degree three in the positions.
74 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" The Toda Lattice Much later, in 1972, Lunsford and Ford noticed that this potential could be derived by truncating the series expansion of the "Toda lattice" model, which describes a ring of particles constrained to the two-dimensional, circular surface and separated by nearest-neighbor, exponential forces: 2 24 L Jo where Px and Py correspond exactly to x and y in the Henon and Heiles model. Although three particles on a ring lattice may not seem to be related physically to stars in galaxies, the two models are intimately related mathematically. Furthermore, particles on a lattice does sound very much like particles on a string, and so we begin to see that this work indeed could be related to FPU. Nearly 10 years after this work with Carl Heiles, Michel Henon (1974) discovered the existence of a third integral binding the motion in the Toda lattice. Expressed as a constant function, this integral of the motion is F = Spx(pl - 3p2y) + (px + V3py)e2y~2^x + (Px - y/3py)e2v+2^ - 2pxe~4y. (3.15) Of course at the time of the Henon and Heiles simulation, no one had yet derived the integral for the Toda lattice, nor did Henon and Heiles mention that their Hamiltonian might be derived from or related to it. I mention the Toda lattice here because, as Lunsford and Ford (1972) and Ford, Stoddard and Turner (1973) later show, it contributes to the belief that the results of both FPU and Henon and Heiles are intimately related to the KAM theorem. However, there is an important distinction to be made. The FPU lattice had fixed endpoints, whereas the Toda lattice has implicitly periodic boundary conditions stemming from its ring structure. As we saw with Zabusky's soliton explanation of FPU on page 47, the difference between the type of boundary conditions used in each model makes a difference. Even though the Henon and Heiles model may be related to FPU by way of the Toda lattice, the two are not the same, because of the difference in boundary conditions. Developing the Simulation Whereas Fermi et al. expected to see ergodicity in their simulation, Henon and Heiles were hoping for just exactly the opposite effect. Their anticipated integral of the motion would have bound the trajectories tightly to the surface of a torus. But neither simulation obtained the anticipated results.
3.7. The Henon and Heiles Simulation, 1964 75 Henon and Heiles found behavior that seems to be a transitional effect, something between torus-bound orbits and ergodicity. For any model, each integral of the motion defines a surface in phase space to which all orbits must be bound; so each integral of the motion reduces the freedom of movement in the problem. The two known integrals of the motion in this problem restrict the trajectory to three dimensions. Furthermore, this Hamiltonian has a potential well for energies below a dissociation value (E = 1/6); as long as the constant energy is maintained below dissociation, trajectories are bound to remain within a known, finite region of the energy surface. Henon and Heiles chose to use the coordinates (x, y, y) for their work on this problem. For the purposes of simulation, they elected to look at two-dimensional cross-sections of this three-dimensional space. All of the figures show the phase-plane (y, y). The results of the Henon and Heiles simulation are clearly represented in Figures 3.4(a), (b), and (c). All three figures show a bounded area containing all the trajectories that result from maintaining the energy below the dissociation value. Because these figures are cross-sections (at x — 0), the trajectories appear as sets of points, where the trajectories pass through this surface every so often. The two classically extreme cases are as follows: If the orbits are ergodic, then they are not bound to any additional surface and so the surface of section should show the bounded region filled with a gas of points, uniformly distributed throughout. If there is an additional integral of the motion, binding the orbits to another surface, then the surface of section should show a series of nearly closed curves, just as the conditionally periodic tori appear to be circles in cross-section. But Henon and Heiles are not using action-angle variables, so their phase space is not a torus and their orbits will not be simple circles. Any tendency toward closed curves here would seem to indicate the presence of an additional integral of the motion. In Figure 3.4(a), the energy is set to E = 1/12, and apparently all the orbits that we can see are conditionally periodic because they remain on these closed curves. Although the perturbation is small, Henon and Heiles thought, just as did Fermi et al. before them, that it should still cause the trajectory to drift away from the torus, unless the third integral existed. This result seemed to verify that there must be a third integral of the motion that was keeping these orbits bound to the closed curves. But then, in Figure 3.4(b), at E = 1/8, the initial energy is larger and so the nonlinear terms have a much more pronounced effect. Here we find new orbit patterns that very much surprised Henon and Heiles. They write: All the isolated points in the figure correspond to one and the same trajectory, just as the points on one of the closed curves; but they behave in a completely different way. It is clearly impossible to draw any curve through them. They seem to be distributed at random, in an area left free between the closed
76 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" (a) (b) (c) FIGURE 3.4. Trajectories in the Henon and Heiles surface of section for energies (a) E = -^ shows very persistent level curves suggesting an inte- grable system; (b) E = | shows the mix of level curves and deterministic chaos well into the region of KAM stability; and (c) E = | at dissociation shows the breakdown of most, but not all level curves.
3.7. The Henon and Heiles Simulation, 1964 77 curves. The five little loops in the right of the diagram belong to the same trajectory; the successive points [of the trajectory] jump from one loop to the next. Let us call this feature a chain of islands. Other such chains have been found in various parts of the diagram. Each chain is associated with a stable periodic orbit; the q islands surround the q points which correspond to that orbit. The following properties are also suggested by our results: (1) there is an infinite number of islands; (2) the set of all the islands is dense everywhere; (3) but the islands do not cover the whole area since they become very small; there exists a "sea" between the islands, and the ergodic trajectory is dense everywhere on the sea. (p. 76) Figure 3.4(c), which shows the situation at the dissociation energy, demonstrates that the closed curves (the conditionally periodic orbits) have decayed even further and the so-called ergodic orbits expand further between the remaining closed curves. From this progression/degradation, Henon and Heiles concluded that there does exist a third integral of motion, although there also seems to be a critical energy at which the integral breaks down. The model that Henon and Heiles simulated probably did not have a third integral of the motion, strictly speaking. If it did, then the orbits would not have been free to move between the closed curves. If there were an integral that did indeed break down after some critical value of the energy, then once that value of energy was reached, the closed curves would be absent. Without recourse to analytical, topological methods, Henon and Heiles found in simulation the new behavior that seems to correspond to the KAM theory, because of the mixture of conditionally periodic orbits with bound ergodic orbits. From only their graphical surface of section, they interpreted their results as they witnessed the breakdown of the conditionally periodic tori, as the effect of the perturbation increased with the energy. They found that over a relatively short period of time the "integral" seemed to break down very quickly. Their description of what lies in the surface of section during breakdown is remarkably similar to what Melnikov (1963) and Arnold (see Arnold and Avez, 1968) speculated for the KAM torus using strictly analytical methods. Whereas Arnold, working from the Poincare-Birkhoff theorem on the existence of fixed points in an annulus (Birkhoff, 1913), deduced that the conditionally periodic tori surrounding the periodic orbits would change topologically to the mini-tori winding around within the remaining conditionally periodic tori, Melnikov, on the other hand, working from Poincare's development of the homoclinic point (1890, Vol. 3) develop theory concerning the homoclinic points surrounding the unstable fixed points—such as those between the KAM tori. The mini-tori appear in cross-section to be the "island chains" found and described by Henon and Heiles, and the homoclinic points are believed to
78 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" give rise to ergodic motion between the KAM tori. Without reference to Melnikov, Henon and Heiles reached the conclusion that the orbits between the preserved tori must be ergodic, by using the criterion of the exponential separation of orbit pairs. Integrable Models: The Undiscovered Country Walker and Ford (1969) claimed that the Henon and Heiles model was in fact a KAM system. Although the behavior of trajectories in the Henon and Heiles simulation seems to demonstrate very well the KAM properties, there are important requirements to be met in order to establish a proof that the Henon and Heiles model is a KAM system. By extension, the relationship between the Henon and Heiles model and the FPU model might tempt us to claim that the KAM theorem also provides the explanation for the FPU results. But these are bold claims and are very difficult to justify. In order to apply the KAM theorem to a model, the model must first meet the strict requirements of that theorem. The two important requirements for any application of the KAM theorem are that the unperturbed Hamiltonian be integrable and nondegenerate, according to (3.5). The nondegeneracy condition immediately eliminates models that are linear in the actions, which rules out the harmonic oscillator. Both the Henon and Heiles and the FPU Hamiltonians may be viewed as perturbed harmonic oscillators, because of the quadratic energy terms that are their basis. Fermi et al. explicitly chose this model in the first place, and then added the nonlinear terms. But this historical fact can cause some significant confusion as to an application of the KAM theorem. Whether or not a theorem applies to a model does not depend on the intentions of researchers. Yes, the harmonic oscillator is an integrable linear model and, as such, does not satisfy the nondegeneracy condition; but the full perturbed Hamiltonian of Henon and Heiles and FPU may still be KAM models, because the same Hamiltonians could result from perturbing some other integrable yet nonlinear Hamiltonian, such as the Toda lattice. This could be true regardless of the set of approximations that led experimenters to these models and regardless of what they thought they were modeling. The KAM theorem guarantees that as we slightly perturb the integrable system away from its state of integrability, some of the conditionally periodic tori will be preserved, but distorted. As a result, if we see KAM-like structures, such as in the Henon and Heiles simulation, then there is likely to be a nonlinear, integrable system "nearby," meaning that our model may be close in a mathematical sense to a nonlinear integrable model that we may not even know about. So by seeing KAM behavior when simulating a model that is obviously not a KAM system, we obtain an important indicator that there may be an undiscovered integrable Hamiltonian nearby. Simulation could act as a guide for analytical discovery.
3.7. The Henon and Heiles Simulation, 1964 79 If someone wants to prove rigorously that the KAM theorem applies to the Henon and Heiles model, then they must produce the unperturbed Hamiltonian that satisfies the conditions of the theorem and then add to it the perturbation terms with the result being the Henon and Heiles Hamiltonian; similarly for FPU. They could not begin with the harmonic oscillator as the unperturbed Hamiltonian. In addition, because the KAM theorem is best described in terms of the slightly-distorted, yet preserved tori, the unperturbed integrable Hamiltonian that is developed must display level curves of constant energy that have the same topology as the curves in the surface of section of the Henon and Heiles (or FPU) system, or it must be shown to be a canonical transformation away from such a model. Walker and Ford (1969) cited KAM stability for both the Henon and Heiles and the FPU models. Yet they did not provide the requisite conditions for a rigorous application of the KAM theorem for either case. They did not offer a nonlinear integrable Hamiltonian as the definite source for these models; instead, referencing the figures of Henon and Heiles and the follow-up article by Gustavson (1966), they claimed that: "One may use canonical transformation theory to show that this two-dimensional surface [in the figure from Henon and Heiles] is topologically equivalent to a KAM torus" (p. 419). They also claimed that "In view of the complexity of actually reducing the Henon and Heiles problem to manageable form, however, it is perhaps worthwhile illustrating by example the effects that even a simple canonical transformation can introduce" (p. 428). Whereas they proceeded to perform an analysis based on trying to transform their own constructed integrable Hamiltonian into a form that would produce level curves like those of the Henon and Heiles paper, they finally concluded: "In short, we have made it quite plausible that the Henon and Heiles Hamiltonian is only a coordinate transformation away from the analysis of sections II and III" (p. 429). Although their final figure (see Figure 3.5(a)) had some topological structure similar to the Henon and Heiles figures, their argument remained a plausibility argument and not a rigorous one. For the FPU case, Walker and Ford did not attempt even this much justification; instead, they offered a reference to the article by Izrailev and Chirikov (1966) relating FPU to KAM by way of the resonance-overlap criterion (see the discussion in the next chapter). Once again, although the claim was made for KAM as the probable explanation for FPU, the conditions for the theorem had not been established rigorously. Gustavson: First to see the KAM connection In his paper following the Henon and Heiles article, Gustavson (1966) discussed the Henon and Heiles results in terms of the KAM theorem, thus making his the first published statement calling upon such a connection:
80 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" (a) (b) INITIAL POINT X-U6.0) + -1.08,0) o-( 0,0) • -C-.I2.0) O-C0..I6) INTEGRAL VALUE .464 x Iff* -.389 xIff' -.164 x Iff* -310 > Iff* -330 * Iff* ! « - ZERO VELOCITY CURVE FIGURE 3.5. Reproducing the Henon and Heiles level curves: (a) Walker and Ford; and (b) Gustavson.
3.7. The Henon and Heiles Simulation, 1964 81 Recently, Kolmogorov, Arnold, and Moser obtained very important results that show the existence of invariant surfaces for nearly integrable Hamiltonian systems. Since the unperturbed Hamiltonian is integrable, a continuous family of invariant surfaces exists. Their results show that, for small perturbations, the majority of these invariant surfaces are preserved while a small number dissolve or break up. The surfaces that dissolve are densely intertwined with the surfaces that persist showing that no integral exists in the strict sense. We emphasize that these results are true only for sufficiently small perturbations and that no concrete results exist in the large. Numerical evidence, however, indicates that the surfaces do continue for large perturbation but then suddenly start to disappear completely. The results of Henon and Heiles are an example of this phenomenon. (Gustavson, 1966, p. 671) The author's intent was to construct an approximation to the integral representative of the invariant surfaces in the Henon and Heiles system. He did not claim to show that Henon and Heiles was a KAM system, and therefore he was under no obligation to produce the nonlinear Hamiltonian. But he does claim here that the unperturbed Hamiltonian is integrable, which can be read as a reference to the harmonic oscillator Hamiltonian. If so, then he seems to be unaware of the nondegeneracy condition of the KAM theorem. He did develop an eighth-degree polynomial expression for the invariant curves and he showed that they did indeed resemble the Henon and Heiles topology (see Figure 3.5(b)). From this expression, we might be able to develop the integrable Hamiltonian that would be required for proof of KAM for the Henon and Heiles system. Final Words Thanks to Henon's, as of then unpublished, (1974) identification of the exact form for the integral, Ford et al. (1973) were able to definitively announce the integrability of the Toda lattice model (3.14). Ford et al. (1973) also pointed out that the Henon and Heiles Hamiltonian (3.13) is exactly the truncation to third order of the series expansion of the terms in (3.14). This latter coincidence seems to suggest that Henon and Heiles might actually have begun from that model, although they do not mention the fact in their article. Further evidence for this argument is that it was Michel Henon himself who found the third integral for the Toda lattice. If we begin with the integrable, nondegenerate Toda lattice for three particles as our unperturbed Hamiltonian, then we may "add" the negative of all the terms beyond the third order in the series expansion as our perturbation, thereby arriving at the Henon and Heiles Hamiltonian as our perturbed model. Furthermore, Ford et al. (1973) showed that the level curves for the integrable
82 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise" Toda lattice do indeed match those of the Henon and Heiles model. In this way, the Henon and Heiles model satisfies the conditions for the KAM theorem, and so the behavior we see in the Henon and Heiles simulation does indeed exemplify KAM stability, just as expected by Gustavson, Walker, and Ford. The simulation strongly suggested the existence of the nearby integrable model, and eventually, the analytical methods came through. The three-body problem satisfies the conditions for the Poincare theorem, and also the KAM theorem. The three-body problem is nonintegrable in general, but may be near to an integrable system, such as the Toda lattice. Poincare saw the cause of deterministic chaos (homeoclinic orbits) and realized that these models could not be integrated. From that perspective, he believed that this problem could not be solved, in the sense of being integrated analytically. The Solar System's stability is a huge problem for dynamics, obviously involving many more bodies than three. As a Newtonian system with relatively few bodies, the Solar System provides the prototype for dynamical systems theory. On a long time scale, the Solar System is slowing down as the Sun radiates its mass and energy away; but on a shorter, more human time scale, the Solar System may be considered as an energy-conserving Hamiltonian system, which appears to be stable in the KAM sense: its randomness constrained to very small variations on the global predictability of planetary orbits.
Research Threads Come Together: Harmonic Convergence When the atoms are falling downward through empty space of their own weight, at indeterminate times and places they swerve a very little out of their downward course, just enough for you to call it a change of direction. If they did not swerve, then everything would fall straight down like raindrops through the void, no collisions would take place, and no impact of atoms upon each other, and Nature never would have produced anything at all. Lucretius (Book II, lines 217-224) I first learned of KAM sometime before the Ford-Waters paper. Boris [Chirikov] got the details of KAM a good while before I did; some Moscow mathematician explained Arnold's statement of KAM to him. Kruskal and Zabusky were initially hostile to Chirikov-Izrailev's article, but then news of KAM got through to them. The Ford-Walker paper was an effort to popularize KAM and to make the Chirikov overlap criterion understandable via Henon-Heiles level curves. Joseph Ford (Private communication) 4.1. The Story Continues Back in Chapter 2, I left off with the pre-KAM work by Waters and Ford (1966), and Zabusky and Kruskal (1965). Beginning in 1966, we find the news of KAM spreading across the physics community, influencing research directions, and forcing conclusions. In the first effort to bring the KAM theorem to bear on the FPU problem, Izrailev and Chirikov (1966) provided an explanation of the FPU boundedness (failure to approach equilibrium) in terms of a simplified, purely topological argument. Shortly thereafter, motivated in part by the work of Izrailev and Chirikov, Zabusky and Deem
84 4. Research Threads Come Together: Harmonic Convergence (1967) attempted to obtain equipartition of energy with the FPU Hamil- tonian by exciting high-frequency modes initially, and then following the subsequent energy transfer using a continuum description. Then—in what is considered to be the historical marker for the general awareness of KAM in the physics community—Walker and Ford (1969) published a sizable article in Physical Review, demonstrating the properties of KAM using the Henon and Heiles simulation as an example. In 1970, Ford again returns, this time with Gary Lunsford, to explore the irreversibility in weakly nonlinear systems based on arguments derived from Izrailev and Chirikov. Finally, in 1972 and 1973, Ford, Lunsford, Henon, Zabusky, Stoddard, and Turner brought together the close relationship between the Henon and Heiles Hamiltonian, the FPU problem, and the Toda lattice in a discussion of the ergodicity and integrability of these systems. 4.2. Izrailev and Chirikov, 1966 Fermi-Pasta-Ulam Simulation 1954 FPU first became associated with KAM in July 1966, when Izrailev and Chirikov published the article "Statistical Properties of a Nonlinear String," in which they attempted to "estimate the limits of stochasticity for a chain of oscillators." The authors thanked Stanislaw Ulam for sending them the FPU paper and they cited Kolmogorov's conjecture and Arnold's subsequent proof of the theorem. Although they did simply assume the applicability of KAM to the FPU problem without rigorously demonstrating the sufficient conditions, their goal was not to prove that FPU was KAM, but to determine the critical point (relative size of the perturbation) beyond which the KAM curves would break down and the FPU system would become "stochastic," a term which they applied to mean ergodic. These were bold
4.2. Izrailev and Chirikov, 1966 85 assumptions because the proof of ergodicity is very difficult, and no one had yet investigated what lay beyond the region of KAM stability. Historically, the significance of this article derives from the authors's assumption—and therefore their implicit acceptance of the idea—that KAM stability defines a boundary region between integrable systems and what appear to be ergodic statistical mechanical systems. Second, this work initiates the technique Ford called the "Chirikov overlap criterion," which is a way of using topology to predict when the KAM curves would reach critical breakdown. Izrailev and Chirikov began with the equations of motion for the cubic FPU perturbation dt2 dz2 1 + 3/3 (4.1) 10 o.ot\- />e 1 / 1 « ■ i i i i i i 1— v^S^ % ' 1 1 1 Mill Ofil OJ k/N — FIGURE 4.1. Estimated resonance overlap for the FPU chain: (I) region of KAM stability; (II) region of stochasticity; (a) limit of stochasticity for fc<n; (b) limit for n — k -C n; and (c) qualitative interpolation. The numerical values of the straight line a, b are given for n = 32. For simplicity, they considered only the two cases in which all the initial energy was placed in one mode, fc, which must be either a very low numbered mode (k <gi n) in the first case, or a very high numbered mode (n — k <C n) in the second case. In these two extreme cases, the mathematical work was reduced significantly and the authors were able to derive the following
86 4. Research Threads Come Together: Harmonic Convergence critical values: where (dx/dz)m represents the maximum value of the derivative. Before discussing the basis for these results, let us look at them in terms of the FPU problem. Using these limiting values of stochasticity, Izrailev and Chirikov created a plot of the two corresponding curves. Figure 4.1 shows a log-log plot of the critical value (4.2) versus the relative mode number (k/n) that is excited initially. The solid lines represent the two calculated limits—beyond which the FPU system should exhibit ergodicity. The dashed curve is an attempt to make a rough interpolation between these modal extremes. For two different sets of FPU initial conditions, Izrailev and Chirikov derived positions relative to the critical values and plotted those results as the circles numbered 1 and 2 in the figure. Their description of the results follow: It is interesting to note that the first case falls well within the region of Kolmogorov [KAM] stability, despite the large value of (3 = 8. The results of the numerical calculation exhibit in this case a clearly pronounced quasiperiodicity. The second case lies near the limit of stochasticity; even though the value of /? = 1/16 is very small, the seventh harmonic is still excited. The model pattern in this case bears little resemblance to quasiperi- odic motion, being more nearly like undeveloped stochasticity. (p. 32) The authors were very pleased with the correspondence between their results and those of FPU, that is, with the appearance of quasi-periodicity for some values of /? and with its disappearance for other values. Although it is unclear what is meant by "undeveloped stochasticity," it is certainly implicit that Izrailev and Chirikov assumed that the zone of KAM stability is parametrically transitional; the nearer to critical the system initializes, the more stochastic its behavior will be. From this result, the authors concluded further that if Fermi et al. had excited higher-numbered modes initially, then they might well have realized the ergodicity and equiparti- tion of energy that they expected, because these conditions would have initialized their system beyond the limit of stochasticity. Let us consider the basis for these estimates and see why the method of resonance overlap has been called a "crude approximation to the truth" by Ford and others. Recall from the previous chapter that the condition of oscillator resonance in the unperturbed Hamiltonian corresponds to a periodic torus in the action-angle variables. When perturbed, these resonances lead to the small divisors in the canonical transformations, which
4.2. Izrailev and Chirikov, 1966 87 Rotation (preserved CP tori) Oscillation (periodic mini-tori) Separatrix (deterministic chaos) Pendulum Action-Angle Phase Space Approaching Resonance Overlap Simplified KAM Torus Cross-Section FIGURE 4.2. The action-angle phase-space portrait (upper) is used to model the resonance structure of the KAM torus (lower left). When the critical value is reached, neighboring resonances at different radii approach the point where their separatrices cross (lower right). As the resonances overlap, the remaining conditionally periodic tori are destroyed throughout the entire phase space.
88 4. Research Threads Come Together: Harmonic Convergence in turn indicate when the periodic tori are destroyed and replaced by homoclinic orbits, which came to be associated with deterministic chaos. We also saw that, for sufficiently small perturbations, the nearby conditionally periodic tori undergo topological change to mini-tori bundles centered on the periodic orbits. The remaining orbits, with sufficiently incommensurate frequencies, continue as slightly perturbed conditionally periodic tori, and so remain as layers separating the various mini-tori. To understand the workings of the resonance overlap criterion, we must associate this KAM language with the phase-space portrait of the simple pendulum. In Figure 4.2,1 have drawn the phase-space portrait for the simple pendulum in action-angle variables. The angular variations along the horizontal axis correspond to the pendulum's angle away from the vertical. Clearly the behavior repeats itself periodically along the horizontal axis. Along the vertical axis the action corresponds to the momentum of the pendulum. There are two different types of motion possible, shown by the two orbit topologies in the figure. The circles/ellipses represent oscillation and the wavy lines represent rotation of the pendulum. The heavy black line separating the two types of motion is called the separatrix; it corresponds to the special initial conditions leading to the pendulum stopping at the top of its swing. The topological similarity between these pendulum structures and the resonance structure of the KAM tori indicates a significant correspondence. Comparing the upper and lower resonance diagrams in Figure 4.2, we see that the oscillation orbits of the pendulum correspond to the mini- tori of KAM, and the rotation orbits of the pendulum correspond to the preserved conditionally periodic KAM tori. The separatrix corresponds to the homoclinic orbit. With our structural analogy in place, we are in a position to see the workings of the resonance overlap criterion. As the energy represented by the perturbation terms in the Hamiltonian increases—either from having a larger energy to begin with, or from a larger perturbation parameter—more and more of the conditionally periodic tori undergo the topological change to mini-tori at all levels of the fractal structure. In our simple analogy, the size of the resonances increases throughout the torus. But the separatrix can never cross itself or another separatrix; so as the resonances get larger, the separatrices push outward until they would necessarily overlap, which they actually cannot do. At that critical point, theory predicts the destruction of all conditionally periodic tori. The critical parameter (4.2) depends upon the maximum value of the first derivative because the first derivative is related directly to the action (momentum), the maximum value of which gives the height or the so-called "half-width" of the resonance at its center. This method may be considered crude because of the nature of the analogy; the resonances in the KAM torus are definitely not the same as those of the simple pendulum. But the results have proven to be quite good and so this method provides a very good heuristic for analyzing these very complicated systems.
4.3. Zabusky and Deem, 1967 89 4.3. Zabusky and Deem, 1967 According to Izrailev and Chirikov's (1966) results, if Fermi et al. had excited higher-energy normal modes initially, then they might have seen the transition to ergodic behavior. This conjecture was put to the test by Zabusky and Deem (1967) in their article "Dynamics of Nonlinear Lattices I. Localized Optical Excitations, Acoustic Radiation, and Strong Nonlinear Behavior." Once again, Zabusky preferred the continuum description of the string, because, as they said in their article: "we may think of the lattice as a discretized representation of a continuum and thus view some of the phenomena as resulting from discretization; that is, the response of the system to short wavelength excitations." They excited the higher-frequency modes of the lattice initially by displacing alternate masses onto one of two initial smooth-curve displacement patterns—even-numbered masses on one curve and odd-numbered masses on the other (see Figure 4.3(a)). They allowed the string to evolve as they tracked the modal energy distribution in both the optical and acoustic modes. They expected to see the wave motion break up into some random condition in which waves of all types were propagating—which would indicate a tendency toward stochasticity. But to their surprise, after initially exciting the optical modes and setting the ratio of nonlinear-to-linear energy at 0.326, still they observed correlated motions of the particles, indicating the retention of some nonrandom behavior: This conclusion is somewhat contrary to our expectations, although it is analogous to the results of Fermi et al. in the case of low amplitude, acoustic initial excitations. In Fig. [4.3] we show the evolution of the modal energies. At ut = 0 the energy resides in the highest optical modes; after 0.05 of a linear period, Fig. [4.3(b)] shows the rapid cascade of energy to the low end of the spectrum, together with a rising importance of many central modes. Very soon afterwards, we reach a state whose average features are exemplified by Fig. [4.3(c)] (0.5 linear period). At this time 0.558 of H has been transferred to the central modes (20 < k < 180). At later times, one sees only small energy fluctuations about this nonuniform spectrum, indicating that our system does not reach a state of equilibrium having equipar- tition of energy among the various modes, (p. 142) [my figure numbers] Whereas the linkage between optical and acoustic modes did result in more pronounced energy-sharing, the lattice showed no tendency to randomize, nor did there arise a tendency toward equipartition of energy. But instead of turning a critical eye to the Chirikov overlap criterion, Zabusky and Deem simply concluded that:
90 4. Research Threads Come Together: Harmonic Convergence (a) A w,z t = 0 s *-+-•* tftgtiTi111111 1111 \ \ X EVEN MASSES (Wn) 1111 11 11 i i fty-m // ODD MASSES (Zn) -►X (b) io- a) cut = 20(0 05 Linear Period) 100 MODE NUMBER (c) IO"5 ACOUSTIC IV--.. x v*~ cut = 200 (0.50 Linear Period) 12-CURVE 47URVE 1 6-CURVE # ■• —; L- 1 »- - V •A 1 1 1 ^—Non-Linear (Upper Bou 3-CURVE V Contr nds on ibutions the H™-) ait = 0 / OPTICAL^/ 100 MODE NUMBER FIGURE 4.3. Exciting the optical modes: (a) placing alternate masses on two characteristics; (b) after 0.05 linear periods, the energy rushes down to the acoustic modes; and (c) after 0.5 linear periods, the energy returns to the middle modes.
4.4. Walker and Ford, 1969: Physical Review 91 These results are indicative of the trouble that can arise in the numerical simulation of continuum phenomena. The process of discretizing a continuum is equivalent to setting up a lattice to represent a continuum. We observe that if nonlinear processes cause energy to flow to the high wavenumbers, then because of the discrete nature of the lattice this energy can be fed back to the low wavenumbers that we are in fact trying to simulate, (p. 147) Instead of probing deeper into this contradiction of Izrailev and Chirikov's (1966) conclusion, Zabusky and Deem (1967) were more interested in getting a better understanding of a continuous string. Although in previous work, Zabusky had shown an interest in discretization problems to the extent that he and Kruskal were led to their discovery of the soliton, here he and Deem chose to move away from the problem, perhaps because it was simply a representation problem, rather than what he considered to be a physical problem. 4.4. Walker and Ford, 1969: Physical Review FPU Published 1965 Waters-Ford 1966 1 Hi 1 BJ Izrailev-Chirikov 1966r —■*• \ r Gustavson 1966 I 111 Zabi 1967 I Ills 1 | | J isky-Deem Toda L- | 1967 T Walker-Ford 1969 J In "Amplitude Instability and Ergodic Behavior for Conservative Nonlinear Oscillator Systems," (Physical Review, 1969), Grayson Walker and Joseph Ford attempted to "provide an elementary introduction to Kolmogorov- Arnold-Moser amplitude instability and to provide a verifiable scheme for predicting the onset of this instability" for the physics community—those
92 4. Research Threads Come Together: Harmonic Convergence who may not yet be familiar with KAM. Although many mathematical, plasma, and fluid physicists already knew about KAM and the work of Henon and Heiles by 1969, Walker and Ford wrote this article for everyone else in the community. In the Introduction, the authors described the two divergent expectations for the slightly perturbed Hamiltonian system: One approach assumes that the weak perturbation changes the unperturbed motion only to the extent of slightly shifting the frequencies of the motion and introducing small nonlinear harmonics. ... The second approach assumes that, even though weak, [the perturbation] has a profound and pathological effect on the unperturbed motion, converting it into ergodic motion, (pp. 416-417) Walker and Ford linked these two approaches to the work of Poincare in the former case and Fermi in the latter, highlighting the duality that a non- integrable perturbation would result in either predictable behavior with some distortion or else totally randomized movement. Walker and Ford recognized that the existing expectations formed an extreme dichotomy, so they introduced the KAM theorem as a possible middle path—a boundary more broad and complex than the traditional opposition: "A brief paper by Kolmogorov enunciated a theorem which can perhaps provide a cornerstone for linking the two aforementioned divergent views on the effects of the weak perturbation." And slightly later in the article: "In short, KAM theory indicates, but certainly does not prove, the existence of an amplitude instability for conservative nonlinear oscillator systems which permits a transition from motion which is predominantly conditionally periodic to motion which is predominantly ergodic" (p. 418). Fifteen years after Kol- mogorov's statement of the theorem, the news of KAM had reached the general physical audience. Walker and Ford introduced the action-angle variables and tori cross- sections in two degrees of freedom, and explained the topological arguments of KAM: that the unperturbed commensurate tori are destroyed by any nonzero size of the perturbation; that the conditionally-periodic tori are only slightly deformed for small values of the perturbation; that although the destroyed tori regions are everywhere dense, the majority of initial conditions lie on the distorted conditionally periodic tori; and finally, that, the relatively small set of initial conditions leading to motion not on preserved tori is pathologically interspersed between preserved tori. Moreover Arnold conjectured that the system phase-space trajectory in regions of the destroyed tori is quite complicated indeed, perhaps ergodically filling the destroyed region, (p. 417) Notice the significant change in Ford's attitude that came with his grasp of the KAM theorem. In this article, he described the small divisors, related
4.5. Ford and Lunsford, 1970 93 them to the resonance condition, and suggested that "in analogy, one would anticipate that the motion generated by this Hamiltonian in the overlapping resonant zones of destroyed tori is highly complicated, perhaps even ergodic" (p. 418). We saw previously that Ford had all but eliminated any expectations for ergodicity in this system. Working with Walker, he seems to be convinced of the possibility. To provide a set of standard techniques for numerically analyzing nonlinear Hamiltonians, Walker and Ford introduced the methods of Henon and Heiles, and they discussed the topological equivalence between the surface of section and the KAM tori. They described the Henon and Heiles results in terms of the KAM theory, as was done in Chapter 3. Although they pointed to the KAM theory as a possible solution to the FPU problem, they also mentioned Zabusky and Deem's failure to achieve ergodicity, which did seem to contradict the Izrailev and Chirikov prediction. Because this review article was designed to introduce KAM on known applications, Walker and Ford offered no further solution to the FPU problem, but they did point to the possibility of connections between FPU, Henon and Heiles, and KAM. This article provided a thorough introduction to the KAM theory for physicists, with resonances showing up as the island chains of Henon and Heiles, and further, the secondary and tertiary island chains which represent resonances within resonances. The amplitude instability they spoke of in the article refers to the overlapping of resonances, which results in the destruction of the intervening conditionally periodic tori. These topological changes in the surface of section accompany the instabilities in terms of the perturbation theory expansion. Believing that ergodicity might well lie beyond the KAM instabilities, Walker and Ford raised several questions in the conclusion indicating the need for further work, including: What effect would many degrees of freedom have on the energy at which the transition occurs? Whether the islands of order disappear altogether or whether statistical mechanics might have to find a basis in the "course graining" of phase space, in which ergodic trajectories are dense but do not completely fill the phase space? and Why do some systems, such as that studied by Zabusky and Deem, fail to achieve ergodicity as expected? 4.5. Ford and Lunsford, 1970 Very soon after the Walker and Ford article, Joseph Ford, working with Gary Lunsford and armed with his acceptance of at least the possibility of ergodicity in KAM systems, began to uncover the conditions for which we might expect "stochastic behavior, by which we mean that a trajectory wanders more or less randomly over part or most of the energy surface" (p. 59). Although KAM theory demonstrates that, in the region of KAM
94 4. Research Threads Come Together: Harmonic Convergence stability, most initial conditions must lie on the distorted conditionally periodic tori, and thereby are not going to be stochastic, Ford and Lunsford realized that this use of "most" is derived from measure theory—a purely mathematical result. They argued that physical systems are not bound to comply with the statistics of measure theory. In "Stochastic Behavior of Resonant Nearly Linear Oscillator Systems in the Limit of Zero Nonlinear Coupling" (1970), Ford and Lunsford investigated the nature of stochas- ticity for systems that were close enough to the resonances to be stochastic, that is, the unperturbed frequencies were nearly commensurate. Once again, in terms of KAM theory, these are the systems which lie in the destroyed tori region—in the homoclinic tangle between the preserved KAM tori. In this study, which is obviously motivated by the KAM-theoretic speculations of Arnold, we find an investigation of what trajectories will do, if they are initialized purposefully in the resonant zone. Although this had been speculated upon by Arnold in terms of the topology of the homo- clinic tangle, this would be the first numerical investigation on particular trajectories. Once again motivated to find a sound theoretical basis for statistical mechanics, Ford looked to the KAM instability: "Our intent is to demonstrate that widespread stochasticity can occur even for the lowest temperatures and the weakest nonlinearity" (p. 60). The two issues they addressed indicate their concern about the need to increase total energy to large, nonphysical values to achieve apparent stochasticity in the model, whereas it is known to occur at very low energies in physical systems. On face-value, KAM seems to require that the size of the nonlinearity be quite large to achieve the total breakdown assumed necessary for ergodicity. With this article, Ford and Lunsford demonstrated that neither condition is necessary for stochasticity and limited ergodicity in a KAM system. Ford and Lunsford based their argument and exposition on the concept of resonance overlap, because, although crude, it nonetheless provided a connection to the existence of destroyed tori, which made room for the stochastic trajectories. If we begin with a trajectory near enough to the unstable periodic orbits, then, even for a very small but nonzero perturbation, we should see stochasticity. Geometrically, the minimum number of degrees of freedom required to obtain stochasticity is three (with three we get chaos), so Ford and Lunsford chose to experiment with a system of three oscillators to demonstrate the independence of stochasticity from the size of the perturbation and the energy level. Beginning with the cubic three-particle Hamiltonian, Ford and Lunsford canonically transformed to the following action-angle (J, </>) form: H =J + 7 [ aJi J* cos(20! - (j>2) + /3(Ji J2J3)* cos(0i + 02 - 03) ]. (4.3) This form was particularly useful for their exposition because it contained explicitly the second constant of the motion J (where J = J\ + 2 J2 + 3 J3)
4.5. Ford and Lunsford, 1970 95 in the first term on the right-hand side, leaving only the two terms generated by the resonances. The first resonance, the so-called "three phonon" interaction (2o;i = u^), has a relative size proportional to a. The second resonance (a;i+u;2 = CJ3) has a relative size proportional to /?. The full phase space is six dimensional; the energy surface has five dimensions; and the additional constant of the motion restricts the freedom of the system trajectory to four dimensions. By varying the relative sizes of the resonances, using the parameters a and /?, but still maintaining the constant energy and the overall perturbation size (7), they varied effectively the amount of resonance overlap between the two physical resonances. In a sequence of cross-sectional views, shown in Figure 4.4, Ford and Lunsford simulated a number of trajectories—all initialized within the region between the two resonances. In all cases shown in Figure 4.4, /3 is held fixed and a is varied to alter the relationship between the two resonances. The figures are all two- dimensional cross-sectional views of the (#3, p3) phase plane. Figure 4.4(a) shows that at a — 0, there can be no resonance overlap because there is only one resonance. All initial conditions result in level curves (conditionally periodic tori). As a is increased, we see the sequential disruption of these level curves and the onset of stochasticity in Figures 4.4(b)-(d). Ford and Lunsford came away from this study satisfied that they had demonstrated the likelihood of stochasticity in weakly nonlinear oscillator systems with initial conditions which were physically most probable. In particular, they stressed that stochasticity could still continue into the limit 7 —► 0. They seem to have found a foundation for the expectation of ergodicity in systems of large numbers of particles, because increasing the number of particles of a dynamical system increases both the percentage of the energy surface accessible to the stochastic orbits and the number of resonant interactions that cause the stochasticity. Without proving ergodicity, which is an extremely difficult task, they made significant progress in their effort to replace a weak foundation for statistical mechanics—one based on the Poincare theorem on nonexistence of constants of motion— with a stronger one based on the instabilities arising from the resonance overlap of KAM theory: The origin of irreversibility for these oscillator systems perhaps finds its most fundamental description in terms of the exponential separation of pair-orbits. In a very real sense, this rapid pair-orbit separation represents that stirring of phase space which Gibbs envisioned as causing irreversibility. In particular, the rate at which these pair-orbits diverge gives at least one measure of entropy production in these systems. In the terms of information theory, the slight uncertainty in the initial state would grow with time to almost complete uncertainty of the final state, (p. 69)
96 4. Research Threads Come Together: Harmonic Convergence 1 Tick = 0.5 lTick=0.5 (c) Ps p, (d) FIGURE 4.4. The onset of stochasticity is seen in this sequence of ((73, P3) surface of section plots. In all cases, the size of 0 is held constant while a is varied to change the relative weight of the two resonances in the perturbation. The sequence is as follows: (a) a = 0, all level curves, no apparent stochasticity; (b) a = 0.01, the beginning of stochasticity; (c) a = 0.025, fewer level curves, increased stochasticity; and (d) a = 0.05, level curves occupy 50% of surface.
4.6. Lunsford and Ford, 1972 97 The nonexistence of constants of the motion leads us to expect free-ranging trajectories; but it does not provide a fundamental cause for ergodicity. It also supports a tendency to make an either-it's-predictable-or-it's-not assumption. But with KAM, we are seeing the rise of a more fundamental explanation for stochasticity—one based on the complex influence of resonance interactions, and measured by the exponential separation of pair- orbits. The KAM explanation allows for the existence of a natural transitory boundary zone between two opposite poles of the integrable, completely predictable and the ergodic, totally random. There remains a question about why some systems do not exhibit the total randomness that Ford and Lunsford expected to result from these resonant interactions. They explicitly addressed this issue in reference to the Zabusky (1967) work: "The extent to which stochastic orbits are truly random has not been resolved, and the extent to which these orbits may yield constant high-order correlations has only been partially investigated" (p. 70). The question of whether an orbit is truly random was also related, albeit indirectly, to the islands of order (the periodic mini-tori) that Arnold conjectured on and that Henon and Heiles found, persisting within the region of destroyed tori. Additionally, although the authors did not address it, the question presents itself as to why FPU, with such a large number of degrees of freedom, did not realize anything close to ergodicity. In light of that issue, remember that Ford (1961) found that the coincidental choice of the number of degrees of freedom in powers of two led Fermi et al. to examine orbits which could lie on only conditionally periodic tori and not near resonance. Although this article sheds a great deal of light on stochasticity and resonance overlap, the results may not be applicable to FPU. However, according to the results of Izrailev and Chirikov (1966), at least in one case, the conditions of resonance overlap were nearly met by FPU, and in that case, we saw the onset of "undeveloped stochasticity." Ford and Lunsford do address these issues in the next section. 4.6. Lunsford and Ford, 1972 With KAM theory and thoughts of ergodicity under their belt, and with the new work on iterative mappings by John Greene (1968), Lunsford and Ford returned with another article "On the Stability of Periodic Orbits for Nonlinear Oscillator Systems in Regions Exhibiting Stochastic Behavior" in which they investigated the Henon and Heiles Hamiltonian H(x, y,±,y) = *-±f- + X-^- + x2y - £, (4.4) for evidence of ergodicity. Using a Poincare surface of section, Henon and Heiles found what appeared to be islands of order surrounded by a stochastic sea. In the present article, Lunsford and Ford searched for evidence to
98 4. Research Threads Come Together: Harmonic Convergence support their belief that the stochastic sea is ergodic and mixing. They discovered that outside the outermost remaining KAM curve, all the fixed points at the center of what used to be islands of order, had become unstable. They related this fact to the fairly recent proof by Sinai (1963), that a hard sphere gas is ergodic and mixing, which implies that an all-repulsive- force Hamiltonian is also mixing. If all the fixed points in the stochastic sea are unstable, then perhaps the system may be ergodic—a conclusion they found to be a potential new foundation for statistical mechanics. Lunsford and Ford demonstrated successfully that for this Hamiltonian, there does indeed exist a fractal structure of alternating fixed points, which is the same structure underlying the mini-tori and homoclinic tangle in the KAM torus. In Chapter 3, we saw that as the energy in the Henon and Heiles Hamiltonian is increased toward the dissociation value (E = 1/6), these islands shrink and break up, apparently in accord with the predictions of the resonance overlap criterion. Although they could not prove it rigorously, Lunsford and Ford concluded that: An essential property of C-systems [ergodic and mixing] is that points on initially close orbits separate exponentially with time. Therefore, all periodic orbits for C-systems are unstable. As a consequence, the elliptic fixed points of [the Henon and Heiles surface of section] graphically demonstrate that Hamiltonian (4.4) is not a C-system. Nonetheless, [Henon and Heiles] reveals macroscopic unstable regions which are suggestive of emerging C-system behavior. In particular, Henon and Heiles have shown empirically that points on initially close orbits in such regions do exponentiate apart. Moreover, they show that the macroscopic unstable region includes most of phase space for sufficiently large energy, (p. 703) Notice the convenient return to the mathematical use of the word "most" here, as opposed to the pragmatic usage in the last article. They concluded that at sufficiently large values of the energy, the primary stable fixed points break up such that "most" of the phase space is covered with the seemingly ergodic stochastic sea. But they did indeed qualify their conclusion because they could not rule out the possibility that, because this is a fractal structure, "maverick" elliptic fixed points might still remain on some secondary or smaller scale. They were forced to argue for a kind of coarsely grained ergodicity: Our results, like those of Greene, are empirical and do not constitute a mathematical proof; moreover they do not preclude the existence of stable periodic orbits in the stochastic regions. However, they do make it clear that such periodic orbits, if they exist, are likely to affect only regions of very small measure in phase space. As a consequence in the macroscopic view
4.7. The Toda Lattice Is Integrable 99 of a physicist, the physical properties of stochastic regions like those of (4.4) are likely to be indistinguishable from those of a C-system. Thus in the stochastic sea between and outside KAM curves, which fills the entire energy surface, above the dissociation energy or beyond total resonance overlap, we might expect a coarsely grained brand of ergodicity in which pair orbits experience exponential separation due to the complex interactions in the homoclinic tangle. As a final note in their article, Lunsford and Ford clarified the connections between Henon and Heiles, FPU, and the Toda lattice, and so situated their work into the framework of this research program: In conclusion, let us emphasize that the Henon-Heiles system does provide insight into the generic behavior of systems with attractive forces. First, it is worth noting that any three-particle system with a Hamiltonian of the form H = 2^ + P| + P*] + V{Ql " °3) + V(Q2-Qi) + V(Q3-Q2) can be reduced to Hamiltonian [4.4] in the cubic approximation, provided that V(r) has a nonzero cubic term in its power series expansion. Thus the three-particle Fermi-Pasta-Ulam system and the three-particle Toda lattice are intimately related to Hamiltonian [4.4]. (p. 704) Herein lies the key to the convergence of this research program. For three- particle discrete lattices with attracting forces whose series expansions have the same cubic approximations, we can expect KAM behavior like that found by Henon and Heiles. Although the FPU lattice involves many more particles than three, the cubic FPU Hamiltonian satisfies these same conditions, and so there is good reason to expect similar KAM behavior in FPU. All of this work is tied together by connection to the Toda lattice, and so the resolution of the Toda lattice conundrum brings this research program to a close. 4.7. The Toda Lattice Is Integrable While pursuing the FPU problem from the continuum perspective, Zabusky and Kruskal (1965) were led to discover solitons in the integrable Korteweg- deVries (KdV) equation. In (1971), Zabusky showed that in the continuous limit, the discrete Toda lattice could also be described by KdV. Clearly,
100 4. Research Threads Come Together: Harmonic Convergence in the continuous limit, FPU is closely related to the Toda lattice; but, because of the continuous limit approximation required to obtain KdV, nothing more conclusive can be claimed for FPU in terms of KdV. 1 Zabusky 1971, 1973 Lunsford-Ford 1972 ~T~ Ford, Stoddard! & Turner 1973 I H6non | 1974 mmmmmmmi In the realm of discrete lattice dynamics, Lunsford and Ford (1972) showed that for just three particles, the cubic approximation of the Toda lattice Hamiltonian ZT - Px+Py J_ I" 2y+2v/3x , 2y-2y/3x , -4yl _ 1 2 24 r +C +e J 8' (4.5) is exactly the same as both the Henon and Heiles Hamiltonian and the three-particle case of the FPU Hamiltonian. But would the same be true for more than three particles in FPU? Because Henon and Heiles used only a three-particle Hamiltonian that is clearly the truncation of the exponential Toda lattice, there can be no doubt of the direct connection between them. But the same cannot be said for FPU. Furthermore, working from the topological method devised by Gustavson (1966), Walker and Ford (1969) demonstrated that the Toda lattice level curves had the same topology as those of Henon and Heiles. If the Toda lattice could be proven integrable, then there would be no doubt that the Henon and Heiles Hamiltonian is a nonintegrable perturbation of a nonlinear, integrable Hamiltonian. The behavior seen in simulation by Henon and Heiles would be a certified example of KAM stability. In 1973, Ford et al. used numerical methods to provide convincing evidence for the integrability of the Toda lattice with both three and six particles. Focusing on the exponential separation of pair orbits, they argued as follows: Let us [numerically] integrate two trajectories for Hamiltonian [4.5] which are initially very close together in [(#, px, y, py)] space. For integrable systems, the distance between these trajectories grows approximately linearly with time; while for nonintegrable systems, this distance grows exponentially, (p. 1553) Their results indicated that orbit-pairs in both cases separated approximately linearly with time, and so they argued for the integrability of the
4.7. The Toda Lattice Is Integrable 101 Toda lattice, knowing full well that this (inductive) evidence could not provide a proof in the mathematical (deductive) sense. Finally, in an interesting referential feedback loop, Ford et al. appended the following late-breaking information to their article: Here the main body of our original manuscript ended; however we utilize the opportunity given us by a reviewer's request for minor changes in the original m[a]nuscript to add the following exciting footnote. Motivated by reading a reprint of this paper, Professor M. Henon sought and found analytic expressions for n independent constants of the motion for the n-particle Toda lattice having periodic boundary conditions. Moreover since these n constants of the motion are in involution, Liouville's Theorem insures that the Toda lattice with periodic boundary conditions is a completely integrable system. Using Henon's results, it may be shown that the Toda lattice with fixed ends is also a completely integrable system. Henon will soon publish the details of [his] discovery in another place. Here we mention only that Hamiltonian [4.5] has the exact, independent constant of the motion [*(x, pxi y,py) = 8px(pl - Zp2y) + (px + V3py)e2y-2V*x (4.6) + (px ~ V3py)e2y+2V~3x - 2pxe-4y). Henon (1974) did publish his results and so certified by proof that the Toda lattice is one of those rare jewels in the catalogue of integrable dynamical systems. Announced at the literal heel of the Ford et al. article, this result verified that Henon and Heiles did see KAM stability in simulation. Because of these close connections to the Henon and Heiles Hamiltonian and its subsequently verified KAM behavior, and because of the compelling KAM explanation of FPU, the consensus in the field is that FPU is probably KAM stable. No other explanation fits the data as well, and whereas the KdV soliton solution is compelling, it is simply not applicable to a discrete lattice system. However, there must be a very important connection underlying the similarities between KdV and KAM, a connection that might lead us to conclude that KdV is the manifestation of KAM in the continuous limit. For nearly 20 years, the work stemming from the Fermi-Pasta-Ulam simulation of 1954 influenced and drove on this investigation, which I have called the FPU research program. Of course the FPU problem was just a continuation of the ongoing subject of dynamics. But in 1954, with one of the first computer simulations in dynamics and with Kolmogorov's powerful conjecture, the age of dynamical systems theory began. It is a new
102 4. Research Threads Come Together: Harmonic Convergence science stemming from the happy marriage of computer simulation and mathematical modeling, that has been blazing the trail into new regimes of dynamical behavior in what are still classical dynamical systems. In a very strong sense, dynamical systems theory is the logical continuation of classical mechanics; it does not begin from any new, or relativistic, or quantum mechanical theories of matter, but just good old Newtonian mechanics.
Part II: Philosophy But we do not admit that the eyes are deceived at all: for their task is to see where light and shadow are, but whether it is the same light, in fact, or not, and whether the shadow that passes by is the same one, or whether what I said before is indeed the case, these things can only be decided by the mind, for the eyes cannot discern the true nature of things. And so you must not blame the eyes for the fault of the mind. Lucretius (Book IV, lines 379-386) Now with the advent of large computers, sophisticated graphical algorithms and interactive terminals, we can undertake large-scale numerical simulations of these systems and probe those regions of parameter space that are not easily accessible to the theorist/analyst or experimentalist. ... This is a new approach to understanding the physical environment around us through formulating, exercising, comparing results and then improving mathematical models that we believe describe the real world. ... [This approach ] in nonlinear lattice dynamics began with the FPU attempt to determine a relaxation time. Norman Zabusky (1973)
5 Steps to an Epistemology of Simulation A computer simulation for these conditions gave recurrence and we were on our way to a better analytic understanding. Norman Zabusky (1967, p. 231) In cybernetics, mapping appears as a technique of explanation whenever a conceptual "model" is invoked or, more concretely, when a computer is used to simulate a complex communicational process. ... Outside of cybernetics, we look for explanation, but not for anything which would simulate logical proof. This simulation of proof is something new. We can say, however, with hindsight wisdom, that explanation by simulation of logical or mathematical proof was expectable. After all, the subject matter of cybernetics is not events and objects but the information "carried" by events and objects. Gregory Bateson (1972, p. 401) 5.1. Introduction In dynamics, we use models to represent the workings of a physical system. We arrive at the model by making approximations to fundamental force laws based on assumptions about the physics. The model is expressed mathematically as either a set of differential equations, or else as a Hamilto- nian function, from which we obtain differential equations using Hamilton's formalism. Every system of well-behaved differential equations always has a definite solution. Although some scientists often call this guaranteed solution the truth, I will, for clarity, refer to it as the true solution. As we have seen, the differential equations that characterize dynamical systems theory are not usually solvable analytically, making the true solution an elusive object of desire. In the absence of direct access to the true solution, we resort to numerical methods to create approximations to the true solution.
106 5. Steps to an Epistemology of Simulation Using numerical methods on a computer to study the solution to noninte- grable differential equations is one major use of simulation in physics. This form of simulation goes well beyond dynamics, as nonintegrable differential equations appear in a broad spectrum of application. This is the use of simulation that I assume in this chapter. The goal of simulation in dynamics is to learn about the true solution by simulating the trajectories of the associated differential equations. At best, simulation can tell us only about the true solution for the model, and it is the shortcomings of reaching that goal that are at issue here. Whether or not the true solution of a model actually relates to physical reality depends upon the fundamental laws used and the approximations made to obtain the model. As the simulation teaches us about the true solution, we make decisions about the adequacy of the model. Simulation's function is to reveal the properties of the true solution and to aid our decisions about how well the model suits the physical system. As doubts arise about the model's adequacy, but we are reasonably confident of the simulation, we must turn our attention to the approximations made in the model. The goal of this chapter is to make explicit what issues or aspects a simulation describes. Several times I will point out an instance of when a researcher indicates an implicit belief that the simulation does tell us something about physical reality. Such an assumption implies a trust in a long chain of inferences and approximations beginning from the adequacy of the fundamental theory itself, to the equivalency of an infinite series expansion of terms, then to the approximations that are made to obtain a model, and finally to the simulation and what it might be telling us about the model. First I discuss the historical significance of the FPU simulation as it marks the beginning of simulation's role as an experiment on the domain of mathematical models. I will clarify some of the significant similarities and differences between simulation and experiment. Then I take up the issue of how we come to believe that a simulation tells us anything about the true solution. The major source of divergence from the true solution in simulation is the necessary discretization of time in the numerical procedures. Although this approximation has emerged as the significant consideration when working with simulation, we must also consider Peter Galison's view that experimenters themselves often may be biased in interpreting the results of their experiment. Once I discuss these issues, I move to the more pragmatic discussion of delineating the strategies we may use to increase our belief in the simulation's approximation of the true solution. Working from the epistemology of experiment suggested by Allan Franklin, I extend five strategies of reasonable belief in experiment to the realm of simulation. After first introducing these strategies in their general form, I move back into a close reading of the articles of the FPU case study to show how they are put to use. Finally, I take up the issue of what a simulation can tell us
5.2. Hierarchy of Modeling 107 specifically about the difference between belief and proof, by studying the failure to prove the existence of integrals of motion using simulation. 5.2. Hierarchy of Modeling There are many different meanings of the term "model" in common usage. Dynamics uses models in three distinct layers of the investigative process. First, dynamicists conceive of a mechanical model to represent the components of the physical system; then they construct a mathematical model to represent the functional relationships between the variable components of the mechanical model; finally, they generalize the mathematical model to a parametric model class within which there is a continuum of specific mathematical models. As I will discuss in the next section, approximations must be made to obtain the model at both the mechanical and the mathematical levels. Fermi et al. (1955) modeled a one-dimensional, loaded string, with fixed ends, that was free to move only in the longitudinal direction. The springs between each mass point along the string would have to be described using some nondissipating, nonlinear force. Once the conceptual picture of a mechanical analogue is obtained, the much more difficult task remains to describe the dynamics of the system mathematically. The Henon and Heiles mechanical model was a single star moving in a nonlinear galactic potential field. However, once they constructed the mathematical model for this field, it was identical in form to the FPU model, when the latter is truncated to three moving mass points. It is interesting to note that many different mechanical models can result in the very same mathematical model. By the same token, the results of an analysis of one mathematical model can be generalized to many different physical systems. The Hamiltonian function is the foremost example of the statement of a mathematical model in dynamics. For conservative dynamical models, such as all the models in the case study, the Hamiltonian is the mathematical representation of the energy-like quantity that axiomatically remains constant throughout the evolution of the system. Thus conservation of the Hamiltonian is one major approximation inherent in all the mathematical models in this work; it is the approximation that separates the two major branches of dynamics: conservative and dissipative dynamics. This conservation is an axiom of the formalism; once assumed, it must be maintained throughout the evolution. The unperturbed Hamiltonian for the FPU model is simply the sum of potential and kinetic energy terms for a chain of coupled harmonic oscillators. The perturbed part of the FPU Hamiltonian consisted of one of the lowest nonlinear terms in the series expansion of Newtonian attraction between mass points. Interestingly, the Henon and Heiles mathematical
108 5. Steps to an Epistemology of Simulation model is also the same as the truncation to the cubic nonlinear term in the series expansion of the Toda lattice model, whose mechanical analogue is three moving mass points confined to a circular ring and separated by exponential forces. So again, the approximations made to models reduces them to terms that are common to many different physical situations. The mathematical models employed in dynamical systems theory usually contain some variable control parameter, whose value must be specified before any consideration of the evolution of the trajectory. Because each value of the control parameter constitutes a different physical situation and we will want to discuss all the models associated with the likely range of values of the control parameter, I adopt van Frassen's (1980) formal differentiation between a model and a model class. Each value of the control parameter represents a specific model within the model class. It is actually the entire model class that becomes the entity under scrutiny in dynamics. With this generalization, questions of structural stability emerge that concern the topological relationships between the trajectories of various models within the model class. As dynamicists study the behavior of the trajectories generated from one model, phase space becomes a map to regions of predictability, whose coordinates are the ranges of values of the independent variables—position and momentum. When we consider the mathematical model as but a cross-section of the larger model class, we realize the possibility of watching the topological evolution of phase space, where instead of time, evolution is modeled by the parameter of the model class. Dramatic changes in the topology of orbit surfaces in phase space during the evolution through the model class indicates a significant qualitative change in the model and a possible identification of a short-coming in some approximation, or else some dramatic change in the behavior of the physical system itself. Whereas we usually define the model class using a variable parameter on the size of the perturbation, the Henon and Heiles Hamiltonian is an exception. Henon and Heiles derived their Hamiltonian all at once, in one piece, instead of an unperturbed piece and a perturbation. Thus the relative size of their perturbation remained the same (unity); but this does not imply that the relative distribution of the total energy remained the same between the linear piece and the nonlinear piece. The initial amount of the total conserved energy functioned for Henon and Heiles as their control parameter. So whenever I speak of the Henon and Heiles work, I specify the energy level in relation to the breakdown of the KAM curves, which is not typical of these models. As they increased the energy toward the dissociation value, more and more of it became associated with the perturbation, corresponding to the increasing area of stochastic orbits. As Henon and Heiles increased the energy from very small toward the dissociation value, the lines defining the ellipses in the surface of section began to spread out, eventually breaking down altogether. For the more general KAM model, this region is associated with the size of the perturbation parameter, which
5.2. Hierarchy of Modeling 109 changes the significance of the perturbation for a given value of the energy. In the Henon and Heiles work, the perturbation was set to unity and the significance of the effect of the perturbation increased with increasing energy. Beyond the Henon and Heiles paper, we find the size of the perturbation parameter to be one of the most significant determinants for understanding the behavior of the model class. Whereas Zabusky (1962) was dissatisfied with Ford's (1961) analysis of the FPU problem because it "assumed that the amplitude of the modal oscillations did not change" (p. 110), Jackson (1963b) pinpointed the problem with Ford's work as being "not entirely relevant to the calculations of FPU" (p. 687) because the FPU case did not fall into what Jackson called "weak coupling," which is the case that Ford's analysis is applicable to. Jackson eliminated the FPU work from that classification because of the large number of particles and because of the size of the perturbation parameter. Thus the perturbation parameter size dramatically affects the approach of the perturbation theory analysis, such that Jackson could reject Ford's work because he missed this distinction. A second place in the case study that we find the perturbation size playing a key role is in the Izrailev and Chirikov (1966) paper. These authors used both the perturbation size and the number of particles as key factors in determining the critical breakdown of the KAM curves using the resonance overlap criteria. They demonstrated that KAM stability was applicable to two different cases of the FPU work, citing that the perturbation size was below the critical value for KAM breakdown in both cases. Finally, Ford and Lunsford (1970) tried to shift the emphasis away from the size of the perturbation parameter in order to "demonstrate that widespread stochasticity can occur even for the lowest temperatures and the weakest nonlinearity" (p. 60). They did so by arguing that most physical oscillator systems satisfy the resonance condition: N ]}T rifccjfc == 0. (5.1) By claiming that "low-order resonance linking all degrees of freedom is assumed to be ubiquitous in physical nearly linear oscillator systems" (p. 60), Ford and Lunsford try to move the study of dynamics back into the realm of physics, where arguments based on number theory tend to be inappropriate for realized physical systems. If a physical system is nearly resonant, then we know its trajectory will fall within a zone of resonance overlap, no matter how small we make the perturbation parameter, as long as it is greater than zero. It is Ford and Lunsford's (1970) contention that most physical systems actually do obtain this property. And it is through arguments like this one that we can recognize Ford's desire to pursue KAM as a basis for statistical mechanics and irreversibility in macroscopic deterministic systems.
110 5. Steps to an Epistemology of Simulation 5.3. Historical Significance The results of the FPU simulation seemed to cast doubt upon the hypothesis that when a Hamiltonian model does not possess a complete set of integrals of the motion, it must evolve toward thermal equilibrium—meaning that without a set of integrals binding the trajectories to predictable orbits, those trajectories should evolve into ergodic, totally random motion. In what I will be calling the "ergodic hypothesis" (not to be confused with the assumption about the different sizes of infinity in mathematics), dy- namicists believed that without integrals of the motion, dynamical systems must necessarily be ergodic (totally random). Whereas an experiment may falsify a hypothesis about a physical system, can a simulation lay claim to the same capability? This question brings into focus the subtle difference between simulation and experiment that we must remind ourselves of constantly when working in this area. Whereas an experiment tests predictions about a physical system fairly directly, a simulation can test predictions only about the behavior of a mathematical model. However, if the model is derived from a set of approximations to a well-corroborated theory, then the simulation tests both the theory and the approximations in conjunction. When the results in simulation go against the underlying hypothesis but we have reason to believe those results, then we cannot be certain immediately what has been falsified, the theory or the approximations. The FPU simulation was expected to verify a hypothesis about an entire class of mathematical models that result from an approximation to Newtonian theory, namely, the truncation of the full series expansion of the force of attraction between bodies. Because the hypothesis concerned the behavior of a class of mathematical models, which is related to physical theory only through an approximation, we may say that the results of the FPU simulation falsified that hypothesis about the expected properties of the model, as far as they could be examined in simulation. The implications of this result are twofold: For dynamics, instead of a simple, mutually exclusive and jointly exhaustive binary division of Hamiltonian models—either they are completely predictable or they are totally unpredictable—there exists a more complex set of possibilities which emerges from the negation of both of these properties. The second implication concerns the history and philosophy of science. By casting doubt on a strongly held hypothesis about the expected behavior of this class of Hamiltonian models, the FPU results established firmly a significant and independent place for simulation in the scientific method. Simulation may be used to falsify theoretical hypotheses about the properties of mathematical models in the absence of analytical methods. I will discuss the second of these issues first, so that I may introduce the historical framework for this chapter. The Poincare theorem (1890, Vol. 1, Chap. 5) on the nonexistence of analytic integrals of the motion for a certain class of nonlinear mathematical models had direct implications for the FPU Hamiltonian. Fermi et al. be-
5.4. Experiment 111 lieved that the nonlinear perturbed model they were simulating belonged to this class, and so it should not have been integrable. This assumption led them to believe that with no integrals of the motion, their model should evolve toward thermal equilibrium. In 1923, Fermi tried to prove a theorem that went one extra step beyond the Poincare theorem. He wanted to show that because this class of models could not have a complete set of integrals of the motion, which would prevent the trajectories from accessing the full energy surface, they must necessarily be ergodic. Although no other alternatives for trajectory- binding structures were known at the time, the proof of Fermi's theorem was not found to be convincing in and of itself (see Haar, 1954, Segre, 1965). Thirty years later, Fermi saw an opportunity to at least verify his hypothesis in simulation, in lieu of a proof for his theorem. Clearly the results of the FPU simulation not only failed to verify that hypothesis, they falsified it by providing a counterexample. However, there is some question about whether the Poincare theorem even applied to the model used by Fermi et al. If it did not, then Fermi et al.'s expectations for their model were even less well founded, because without a rigorous application of the Poincare theorem, no guarantee existed for the nonintegrability of their model. The requirements for the Poincare theorem are the same as those for the KAM theorem, as we saw in Chapter 3. The problem with applying either of these theorems to FPU is that the unperturbed Hamiltonian, as written by Fermi et al., is degenerate. But if the same perturbed Hamiltonian could result from the truncation of some other integrable, nondegenerate unperturbed Hamiltonian, such as the Toda lattice, then not only would the Poincare theorem apply, but so would the KAM theorem. In the history and philosophy of science, the general problem of experimental evidence that contradicts the hypothesis underlying the experiment itself is not new. However, the FPU problem is the first such case where the evidence came from the results of a simulation instead of an experiment. 5.4. Experiment In the volume of Fermi's collected works (Segre, 1965), Stanislaw Ulam wrote an introduction to the FPU article. Appropriately, from this piece we obtain an insight into Fermi et al.'s attitudes concerning the role of simulation: We decided to try a selection of problems for heuristic work where in absence of closed analytic solutions experimental work on a computing machine would perhaps contribute to the understanding of properties of solutions, (p. 977) Let us consider what is meant by the term "experimental" in this context. In what way is a simulation of a nonlinear dynamical system an experiment?
112 5. Steps to an Epistemology of Simulation Simulation provides us with observations of the implications of the model that are otherwise hidden. With an analytic solution, we can derive global properties for all of the trajectories mathematically and calculate the exact state of the system at any point in time, given an appropriate initial condition. But the number of dynamical systems that are integrable is small; most "real world" dynamical systems are not integrable. In simulation, we can learn about only those trajectories that we simulate and only for the evolution times that we choose to use and only to the numerical resolution of our computers. So the task of obtaining general properties of a model through simulation is similar to understanding a physical system by running experiments on different aspects of it. Because the true solution might not be obtained using analytical methods, the behavior of trajectories from this model may be unknown in advance of the simulation. Drawing from existing theory, dynamicists develop certain expectations; but until we see the behavior of the trajectories, we cannot know their actual properties. Theory also informs our expectations in an experiment and the outcome often brings surprises there. The true solution for this type of dynamics is occluded from us by our own inability to solve the equations analytically. In this way, simulation is somewhat analogous to experiment; it provides us with glimpses of the true solution. We must be circumspect about not only what we can see, but also what we are prepared to see. However, the similarities between experiment and simulation fall short of an equivalence relationship because they are used to investigate the distinctly different realms of physics and mathematics. If we can, for the moment, include the mathematical realm into the realm of nature, then simulation can be thought of as a subset of experiment. Simulation is an experiment into an unknown area of mathematical structure using a physical apparatus. The experimental apparatus in simulation includes both the computational machine and the graphical display device. An important difference between simulation and experiment is that very little of the theory of the apparatus comes into play when it comes to studying a model in simulation, but this can be a major issue in an experiment. However, the theory of the apparatus in simulation is not negligible; as we will see, there are important considerations to be made concerning the ability of the simulation to tell us about the true solution of a model. The single most important constraint on simulation is the discretization of the evolution of the trajectory. Because simulation is a special case of experiment, in the sense just described, we should expect to derive the beginnings of a philosophy of simulation from the existent philosophy of experiment, just as some of the techniques of experimental investigation form the basis of the methodology of simulation. Fermi et al. expected the trajectories of their nonlinearly perturbed string to begin from a certain region of phase space—determined by the initial conditions—and then slowly evolve further and further away from that region. The state of theory at that time caused Fermi et al. to have ex-
5.5. Epistemology 113 pectations that were not realized. When the existent theory provides no conclusive way to make definite predictions about what lies in the phase space of a model and the model is not integrable, then efforts must be made to verify the simulated results and then to develop new theory to account for those results. In this sense, the simulation and modeling of nonintegrable dynamical systems is experimental and heuristic. I interpret and will adopt Ulam's use of the word "heuristic" herein to mean exploratory—as in the absence of a predictive theory. The application of this term to simulation implies a similarity to one of the more interesting functions of experiment—to precede theory in the exploration of unknown territory. Given that most dynamical systems are not integrable, then there is quite a lot of territory to be explored, and possibly quite a lot of new physics to be discovered. 5.5. Epistemology How can we know that what a simulation describes is actually the true solution? This is the central question for an epistemology of simulation. We do not know the true solution a priori, although we know that it does exist; so we must establish strategies of testing the simulation to increase our belief in what it tells us about the true solution. Existing theory can aid or hinder us in this pursuit and I will discuss that topic in the next section. First, let us look at what can cause the solution of the discrete, iterative numerical integration procedure to diverge from the true solution. The conversion of partial differential equations to ordinary differential equations constitutes an approximation through the truncation of space; the infinite degrees of freedom of continuous spatial variables is often approximated by a finite number of discrete degrees of freedom. The result is a new model, and so the different behavior realized is a function of the approximation of the model and not the simulation. We see from this example that we can distinguish between approximations made in the model and those in simulation. The significant approximation that can cause a simulated trajectory to diverge from the true solution is the discretization of time from the infinitely divisible continuum to a discrete contiguum of finite-precision steps. Because FPU is the cornerstone of this work, let us look there for the initial application of this approximation. From the following quotation we can see that this approximation was considered to be the least significant of the approximations made by Fermi et al.: "The derivatives in time, of course, were replaced for the purpose of numerical work by difference expressions" (p. 986). Their reasoning is intuitively clear. On the one hand, discretization of time is necessitated by the numerical procedure and, on
114 5. Steps to an Epistemology of Simulation the other hand, such an approximation should not affect either the model or what can be said about the physical system. Although Fermi et al. may not have realized it, this approximation is significant exactly because it plays a role only in the simulation and not in the model nor in the physical system. Indeed, consider this statement by Ulam: In addition, such experiments on computing machines would have at least the virtue of having the postulates clearly stated. This is not always the case in an actual physical object or model where all the assumptions are not perhaps explicitly recognized. (Segre, 1965, p. 977) Ulam seems to imply that a simulation is able to provide us with more clear information about its respective truth than an experiment because the postulates are clearly stated in advance. But the discretization of time in a simulation can and sometimes does, under certain conditions, obscure the true solution. The discovery of this limitation and the development of a system of checks to recognize its effect will be discussed in our next tour through the case study. Why would we expect the discretization of time to have little, if any, physical meaning? Physically because a macroscopic system is deterministic, its behavior must be independent of how often we look at it, so having discrete-time steps should not make a difference there. Similarly for the model. However, we cannot obtain infinite precision in our calculations, so an iterative numerical procedure may lose some portion of the true solution on each iteration. Under most circumstances, we can contain this error by choosing an appropriate size for the time steps. But there are circumstances in which the trajectory is changing so rapidly that we cannot track it; the signal becomes lost in the noise of the numerical procedure. In other words, the changes in the trajectory are so fast that any prediction into the future based on the present state cannot follow the true solution, no matter how small we make the prediction interval. In simple terms, there are times when the simulation displays the true solution easily; but there are also times when a numerical procedure is not a transparent window on the true solution and when this occurs, the efficacy of the simulation deteriorates. Numerical integration must proceed in discrete steps; so not every point along a trajectory can be represented. An infinite number of points along the trajectory are lost in the grid of the numerical procedure. The significance of this loss varies, depending on how erratic the trajectory is. One characteristic feature of chaotic systems is the "exponential separation of pair orbits." In a particularly chaotic region of phase space, very small differences in the location of a point along the trajectory can lead to very different behavior. In a numerical procedure, the choice of precision can alter the behavior of the trajectory: adding or subtracting a single digit of
5.5. Epistemology 115 precision can result in totally different trajectories from what appear to be the same initial conditions. Each iteration of the numerical procedure is a prediction into the future—into the next time step—based on the state of the conjugate variables at the present instant. So making the time-step size small lessens the likelihood of a prediction error. In regions of "deterministic chaos," in which the trajectory changes its course rapidly, the numerical procedure generates noise in the simulation regardless of the time-step size. This level of noise grows until the signal— the true solution—gets lost in it. Clearly this is a problem of entropy. This effect is not a loss of determinism in our dynamical system, but a loss of our ability to make the determination with a simulation. The usefulness of our apparatus deteriorates as it increasingly generates noise. In other words, in regions of deterministic chaos, information entropy increases in simulation, even though the physical entropy of the system cannot, according to our assumptions. A related simulation problem is the randomness associated with trying to connect a physical system to a numerical initial condition. The state of the physical system can be measured only to some finite accuracy, the uncertainty of which can lead to randomness in simulation. To show how randomness can come about in the deterministic process of numerical mapping on a discrete grid, I turn to an explanation from Lichtenberg and Lieberman: Consider for example the one-dimensional mapping defined over the interval [0,1]: xn+i = 10xn, mod 1, with the initial condition: 0 < xq < 1. We divide the "phase space" 0 < xo < 1 into ten equal intervals having as labels aj the integer part of 10a;. Then for example the initial condition xq = 0.157643 ... through the mapping, generates the sequence {a{\ = 1, 5, 7, 6, 4, 3,... Is this sequence random? We see that the answer hinges on whether the initial condition x$ is random. However, by the previous theorem [Kolmogorov-Sinai entropy], "almost all" #o's are random, and thus the sequence is random. It is conjectured that the motion near homoclinic points with a separatrix layer is random in this sense. Conversely, the regular motion on a KAM surface is nonrandom according to the above definition. (1983, p. 275) Associating the model with a simulated trajectory requires an ability to track arbitrary initial conditions. As we can see from this example, the uncertainty in the accuracy of any measurement can quickly lead to randomness in the calculation of a numerical mapping on a discrete grid. So even though we can obtain many decimal places in a computer simulation, we cannot avoid this exponential growth of randomness in the numerical procedure. The move from the continuum (infinite precision) to the discrete
116 5. Steps to an Epistemology of Simulation grid is unavoidable, because in all numerical integration and mapping, we must choose some finite numerical grid of precision. Eventually, in regions of deterministic chaos, a simulated trajectory may, if it is sufficiently erratic, lose memory of its initial condition, meaning that if we were to reverse the calculation, the trajectory would not return to its initial condition. Because the deterministic fundamental theory is time-reversal invariant, this loss of initial conditions is a sure indicator that the efficacy of the simulation has been lost. However, chaos does not entail irreversibility and so this loss of memory situation may not occur even in a chaotic model; currently, the only sure signature of chaos is the exponential separation of orbits. Irreversibility defies our deterministic expectation on the system; so either the simulation has diverged from the true solution, or else the true solution is not what we expect. We can blame the discretization of the numerical procedure for this increase of information entropy. When regions of deterministic chaos are discovered, they enhance our knowledge of the true solution by guiding us to regions where macroscopic irreversibility seems to emerge from a deterministic system; but the connection between deterministic chaos in the simulation and physical irreversibility is yet to be made. At the same time, if our intention is to understand the evolution of particular solutions, then the efficacy is reduced because we can no longer track the evolution of an arbitrary trajectory. In these regions we can still learn about the general behavior of the solutions of the model, but it becomes difficult to know whether the simulation is telling us about the true solution or not, at the local level. Testing for reversibility seems to be a good way to measure the efficacy of our simulations. Another problem related to the finite-precision arithmetic of computer simulation is round-off error. The uncertainty in the last digit of a finite precision real number accumulates in multiple arithmetic calculations—such as many iterations of a mapping. Several studies of the effect of this round-off error have led to the conclusion (Lichtenberg and Lieberman, 1983, p. 277) that "chaotic motion observed in generic Hamiltonian systems is intrinsic to the dynamics and is not produced by discretization effects associated with finite precision arithmetic." In one of these studies (Rannou, 1973), the author converts a "continuous" mapping—one that is iterated in real- number representation—to a "discrete" mapping—one that is performed in integer representation. The advantage of a discrete mapping is that there is no round-off for integer arithmetic and so it cannot accumulate during the iterations as it does when digits of real numbers must be truncated. The surface of section is subdivided into a course-grained grid beforehand. The results of the randomness of this discrete mapping as compared to the associated continuous mapping showed very little difference in the characterization of ergodic and periodic orbits. In a second study (Greene, 1979), the author calculated the effective propagation of the round-off error in a Hamiltonian mapping exhibiting KAM properties and found that the error only propagated in a direction parallel to the curves (cross-sections of
5.6. Preconceptions 117 tori). In this case, the error could not cause confusion between stochastic and conditionally periodic regions. Although we might confuse the effects of round-off error in our procedure with the randomness of deterministic chaos, we can check our procedure by applying it to an integrable model and comparing its results against direct calculation, made possible by the global definition of the trajectory. 5.6. Preconceptions Apart from the question of whether the simulation is telling us about the true solution or not, we must consider how much of its behavior we are prepared to see. What we see in simulation may be biased strongly by what we expect to see. Peter Galison writes: the naive view that prior expectations do no more than "bias" the experimenter will not do. Theories, or rather the different levels of theory, do much more than tint an otherwise crystal- clear view of the world. In Maxwell's case, for instance, it as precisely his lack of a quantitative prediction that left him with no idea of how big an effect he was looking for. (Galison, 1987, p. 73) Galison argues that theoretical presuppositions function not only to "influence the outcome of an experiment" (p. 69), but they also create the situation for the experiment itself. By extension, these ideas carry over into the use of simulation. Galison cites the example of J.C. Maxwell's attempt to ascertain the gyromagnetic effect of electrical current (p-factor) experimentally (performed in 1863). The theory that led Maxwell to perform his experiment left him without a prediction of the scale of the phenomenon that he was looking for. Maxwell failed to find the effect at all: "Maxwell was at sea: he could not know even what order-of-magnitude effect to expect and consequently could neither calculate which background effects would be important nor predict how accurately he would have to measure the tilt of his machine" (p. 69). As an example from our case study, let us consider the Henon and Heiles simulation in light of their presuppositions. The ergodic hypothesis influenced Henon and Heiles to perform their simulation in the first place, as we see in the following explanation: For many years, it was assumed that 73 is nonisolating on the grounds that no third integral expressible in analytical form like I\ and I2 had been discovered, despite many efforts. But this assumption, as has been often remarked, is in conflict with the observed distribution of stellar velocities near the Sun; for it implies that the dispersion of velocities should be the same in the
118 5. Steps to an Epistemology of Simulation direction of the galactic center and in the direction perpendicular to the galactic plane, whereas the observed dispersions have approximately a 2:1 ratio. More recently, a number of galactic orbits have been computed numerically. Quite unexpectedly, all these orbits behaved as if they had not 2, but 3 isolating integrals. As a result, there was some change of opinion on the subject. Attempts were made to prove theoretically the existence of a third integral. ... In the present paper, we approach the problem again by numerical computation; but, in order to have more freedom of experimentation, we forget momentarily the astronomical origin of the problem and consider it in its general form. (Henon and Heiles, 1964, p. 73) The existing theory of galactic motion predicted only two integrals of the motion—the conservation of both energy and angular momentum. But the statistics drawn from observations seemed to indicate the existence of an additional constraint on the orbits of stars. According to the ergodic hypothesis, either there is an integral of the motion or else there is ergodicity. Henon and Heiles chose to use simulation on this perturbed Hamiltonian to find evidence of an integral where no such integral could be found analytically. Here we find the beginning space for simulation; it might be able to resolve a conflict between theory and observation by displaying empirical evidence of an isolating integral when no analytical integral could be discovered. We have already seen what Henon and Heiles found in their study. Let us focus on Figures 5.1(a), (b), and (c), and some of their explanation. I argue that their reliance on the ergodic hypothesis may have prevented them from seeing two important attributes in their simulation. Each of these figures can be examined in both a static sense, in the spatial relationships between the points as they are plotted, and then in a dynamic sense in the sequence of the successive points as they are plotted. Statically, once the simulation is stopped, we can look at the resulting picture and record observations. In this way, Henon and Heiles discovered the islands-of-order structures and they also recognized that the so-called third integral seemed to break down as the energy was increased from figure to figure. What they found in these static analyses, combined with their expectations from the ergodic hypothesis, guided what they looked for in their dynamic analysis. By "dynamic analysis," I simply mean making observations during the unfolding of the simulation. In the first figure, the orbits are clearly bound to tori, so the authors conclude that there is an integral of the motion causing this behavior. In accord with this conclusion, they look for and find an order to the unfolding of the points of the trajectory: It may be interesting to note that the successive points Pi, P2) P3, ••• rotate regularly around the curve. The figure is
5.6. Preconceptions 119 (a) (b) (c) FIGURE 5.1. Trajectories in the Henon and Heiles surface of section for energies: (a) E = 1/12 shows very persistent level curves suggesting an integrable system; (b) E = 1/8 shows the mix of level curves and deterministic chaos; and (c) E = 1/6 at the dissociation energy.
5.6. Preconceptions 121 but they behave in a completely different way. It is clearly impossible to draw any curve through them. They seem to be distributed at random, in an area left free between the closed curves. Most striking is the fact that this change of behavior seems to occur abruptly across some dividing line in the plane. (P- 76) Their analysis of this figure is strictly static. They saw ergodic trajectories and assumed there would be no order to their unfolding. Why they chose to look for it in the third figure instead of the second is unknown. Had they found the unfolding order of the stochastic trajectories in the second figure, they may have found a second important property. To their credit, they did find the very important property of exponential separation of pair orbits. Furthermore, they used this property to successfully differentiate between ergodic and bound trajectories. Even though they knew about the exponential separation of pair orbits, they did not know that this property could lead to irreversibility, which is a significant indicator of the loss of efficacy. If they had found the dynamic order to the stochastic unfolding of the ergodic orbits, then there is reason to believe that they would have made the connection between that order and the exponential separation of pair orbits. They made the point about periodic orbits returning to the initial condition, indicating that perhaps if they had recognized the unfolding order in the stochastic orbits, then they probably would have explored the reversibility of those orbits. The major flaw in this argument lies in the conspicuous absence of any mention of reversibility in Henon and Heiles, even for the periodic orbits. Had Henon and Heiles discussed the reversibility of the periodic orbits explicitly, then we would know that they thought about reversibility. Because they found no dynamic order in Figure 5.1(c), it did not occur to them to try reversing the trajectory. Their expectations—stemming from the ergodic hypothesis—probably prevented them from discovering the dynamic unfolding order in the ergodic trajectories and then, perhaps, by extension, the more significant property of irreversibility. But because this latter conclusion has significantly less rhetorical grounding, we must let it stand as an interesting speculation. My arguments are in no way intended to detract from the significance of the Henon and Heiles work and the remarkable discoveries they made. That they discovered the crucial property of a chaotic system, the exponential separation of pair orbits, is truly remarkable. My intent is only to point out that preconceived theoretical ideas can function to set the stage for a simulation and, simultaneously, they can operate to influence what we may find in simulation.
122 5. Steps to an Epistemology of Simulation 5.7. Strategies for Belief and Pursuit In the next section, we will reengage with the FPU case study for a close reading of the development and use of several strategies for obtaining reasonable belief in the results of simulation. First I present a summary of these strategies to outline what we may expect to find in what follows. At the same time, I will relate these strategies to those discussed in detail by Franklin (1986, Chap. 6 and 7) and (1990, Chap. 6). Franklin's work establishes the beginnings of an epistemology of experiment, based on implicit and common sense strategies used by scientists to increase their degree of reasonable belief in the results of experiment. Although I do not attempt in this single chapter to achieve the rigor of Franklin's extended work to establish a similar epistemology of simulation, I do develop a subset of these strategies which make their way naturally to simulation. My choice of elements to include in this subset stem from their manifestation in the case study. The methodology of experiment extends naturally and implicitly to simulation and so it follows that the implicit methods of rational belief would extend themselves to the realm of simulation as well. This is a reasonable expectation that allows for an epistemology of simulation to get a head start by importing the applicable elements from the epistemology of experiment. In the following sections, I adapt five of Franklin's nine strategies into the language of simulation; I choose this subset only because they seem particularly apt in the specific case study at hand. The first five strategies are as follows. Comparison We might expect that errors could result from the way the algorithm is implemented, for example, by not carefully avoiding round-off errors in iterative-loop calculations. But we do not expect to find much difference in the outcome of the same code run on two different machines; although this is a possibility, the uniformity of digital computing machines allows us to talk about the behavior of the simulation separately from that of the machine. As we will see, FPU performed checks on their results to avoid errors due to round-off. More generally, when there are properties of the true solution known a priori, then they can provide us with checks on the simulation during the procedure. Because all the systems dealt with in the case study are Hamiltonian, we know axiomatically that energy must be conserved. Thus in every simulation, we can expect to calculate the total energy at any cycle and obtain the same result. If the energy is not conserved, then we know the simulation is wrong, because the true solution must conserve energy by design. Let us list this strategy under a category called comparison: comparing the results of simulation with known properties of the true solution. In analogy to the epistemology of experiment, this strategy
5.7. Strategies for Belief and Pursuit 123 corresponds to what Franklin (1990, p. 104) calls "Reproducing artifacts that are known in advance to be present." In this case, conserved energy is an artifact of the model. This strategy is useful because it provides a running check on at least one property of the true solution while studying an unknown system. We might think of this as a dynamic strategy because it relates the simulation to the true solution at all times during the simulation. However, it is very important to note that this strategy provides only a necessary but not a sufficient condition for believing in the result. As I will explain further in the case study, a simulation may diverge from the true solution and still conserve energy. Calibration A similar but less dynamic strategy is one that we might use to test both the precision of the machine and the accuracy of the numerical procedure. This strategy, which I choose to call calibration (again in analogy to Franklin) entails using the apparatus (the machine and the procedure) to reproduce known results. This strategy involves simulating a model in which the true solution is known, one that is analytically solvable—such as the unperturbed Hamiltonian of the system in question. With this method, we can obtain belief that, to the degree of precision that we are operating, the simulation is producing its solution without numerical errors. This method is useful for checking on the effects of precision and the competency of the numerical integration techniques; but it is less dynamic than the first method because it must be run before or after the actual simulation. Even if we do calibrate a numerical technique with the unperturbed Hamiltonian, we cannot be assured that it is telling us about the true solution in the perturbed case because of the radical behavior of trajectories in regions of deterministic chaos. Although we can increase our belief in the numerical procedure using this method, we must always use other methods in addition to this strategy to obtain as much reasonable belief as we can about the simulation. Verification If observed behavior in the simulation does not violate any of the corresponding properties of the true solution, then what strategies may be employed to check on that behavior? In an experiment, we might look for independent verification by someone else performing the same experiment, or else we might try to devise a different experiment to observe the same phenomena. In dynamics, the analogous options seem to be: other experimenters simulating the same model, but using a different numerical procedure, or using the same procedure, but changing the precision or the size of the time step to see if the trajectory changes. As I mentioned above, sometimes using different precision can produce different results for the
124 5. Steps to an Epistemology of Simulation same trajectory, so if the same behavior is observed using different precision, then there is some reason to believe that the results are genuine. As for diflFerent numerical procedures, if different results are obtained on the same trajectory, then the results are suspect. But if the results are the same, then we obtain stronger belief in that the result is attributable to the true solution. If the results are verified by an independent numerical procedure, then we might be motivated to check further using other methods, especially if the results defy our expectations. Retrodiction Once some interesting new behavior is uncovered in simulation, it could very well be used to guide theoretical calculations to an independent verification in the form of perturbation theory analyses. We saw in both Ford and Jackson's early work, that they used theoretical means to derive reasonable belief in the simulation of FPU. I call this form of verification retrodiction—the retroactive affirmation of simulation results. This strategy not only provides a strong reason to believe that the results of simulation are genuine because of its reliance on perturbation theory, but it also points to a feedback-loop strategy of interdependent guidance between theory and simulation. This strategy corresponds to Franklin's "Using an independently well-corroborated theory of the phenomena to explain the results." Consistency Another of Franklin's strategies that carries over into the realm of simulation is "Using the results themselves to argue for their validity." In the FPU simulation, one of the interesting results that helped to establish reasonable belief that something important had been observed was the fact that the three different choices of perturbations had all resulted in the same anomalous behavior. This consistency argues for the validity of the common result, even though that result could not be explained by the existing theory. The remaining four of Franklin's nine strategies that have not been mentioned, although they might still find some application in this work are: (6) intervention, in which the experimenter manipulates the object under observation (we might include variation of model parameters here and the entire subject of structural stability); (7) the elimination of plausible sources of error and alternative explanations of the result; (8) using an apparatus based on a well-corroborated theory; and (9) using statistical arguments. I do not say these additional strategies do not apply to simulation, but only that I do not see them emerging immediately in the case study. A more complete work in this vein would necessarily invest significantly more time
5.8. Case Study I: Fermi-Pasta-Ulam 125 in the examination of the analogous role of these strategies for simulation, as well as deriving new strategies that apply only to simulation. 5.8. Case Study I: Fermi-Pasta-Ulam The many reasons to believe in the results of the FPU simulation form an interconnected network involving all five of the strategies that have been described. In addition to belief in the results, there is also the issue of pursuit. Why did these scientists pursue this subject? This is a question that is just beginning to find its place in the philosophy of science. In the case study, we see that something about the FPU results inspired several independent scientists to pursue the issue further. Certainly one prerequisite to pursuit in a case of anomalous results in simulation is a conflict between strong expectations and a contrary result that carries a suflScient amount of reasonable belief. However, this does not mean that scientists who choose to pursue a line of inquiry must believe in the anomalous results; they may choose to pursue it just exactly because they do not believe in the implications of the result. Belief and pursuit do not have to be aligned. Other issues that may enhance pursuit include the individual propensities of scientists. Implicit in a scientist's choice to pursue a simulation is perhaps the belief that it provides information about nature. But let me deal with the relationship between belief and pursuit in terms of the case study as at least a beginning. Perhaps to provide some contextual significance to the FPU paper, Ulam, in writing a new introduction to the FPU article in Fermi's Collected Works (Segre, 1965), included references to the work of Zabusky. In effect, he calls upon us to believe in the results of FPU by utilizing the epistemological strategy of retrodiction: One obtains from it [Zabusky's analytical work] another indication that the phenomenon discovered is not due to numerical accidents of the algorithm of the computing machine, but seems to constitute a real property of the dynamical system, (pp. 977- 978) By using the word "real" here, Ulam seems to indicate that the results of their simulation tell us something about physical reality. But, if we consider that he is a mathematician, then an alternative reading would be that by "dynamical systems" he means the mathematical model, and that the true solution is what is "real." So he may not be claiming anything about the reality of physical systems in the way that physicists are used to thinking about. Ford, Jackson, and Zabusky all chose to pursue the FPU model in their theoretical work. We know that the results were certainly a surprise because they conflicted with expectations based on the ergodic
126 5. Steps to an Epistemology of Simulation hypothesis. According to the above prerequisite criteria for pursuit, something about the FPU results provided these workers with sufficient belief that the results were reasonable. We must also consider how their work, in turn, increased the general belief in phenomena more general than just the FPU problem. The near-recurrence of initial conditions that FPU observed were in conflict with a strong expectation of ergodicity. This conflict provided the focus for the pursuits of Zabusky, Ford, and Jackson. Ford was looking for a firm basis for statistical mechanics—a basis for irreversibility that goes beyond phenomenology. Zabusky was dissatisfied with Ford's approach to the problem in 1961, and he saw this problem as a good way to collaborate with Kruskal, who privately had expressed interest in the FPU results (Zabusky, 1967). Because both Zabusky and Jackson knew and acknowledged the fact that the near-recurrence appeared in the independent calculations of Tuck and Menzel, we must recognize that this independent simulation on the same Hamiltonian served as an important verification, and that it generated enough reasonable belief in both Jackson and Zabusky to inspire them toward the development of new theory. Not only did Tuck's work recover the FPU results, but it extended those results to uncover the "super-period." Zabusky and Jackson both retrodicted the FPU recurrence from completely independent avenues of theoretical research and thus provided a very strong reasonable belief that this was a real property of the true solution; each independent result provided some degree of additional belief. Also, because Jackson studied a discrete model and Zabusky studied a continuous model, we find a case of the consistency of the results arguing for their validity. Once these articles were published, there emerged a very strong general belief that the FPU results did indeed uncover an important new property of nonintegrable models. Due to the regrettable death of Fermi, we are afforded an opportunity to look more closely at the FPU results, independently of any other results. Returning to Ulam's introduction to Fermi's Collected Works, we find him reporting that Fermi believed in the results of their simulation: He [Fermi] expressed to me the opinion that they really constituted a little discovery in providing intimations that the prevalent beliefs in the universality of "mixing and thermalization" in nonlinear systems may not be always justified. (Segre, 1965, p. 977) Clearly Fermi believed strongly in the results of their simulation, enough to question the generality of the ergodic hypothesis and, it may be conjectured, that this simulation was telling them something about reality. Because Fermi died a year after performing this simulation, he must have said this to Ulam before any other verification of the results occurred, indicating that something in the FPU work itself convinced him that these inexplicable results were genuine.
5.8. Case Study I: Fermi-Pasta-Ulam 127 In addition to using three different perturbations, which amounts to testing three different models, Fermi et al. hoped to obtain rational belief in their results by using the Hamiltonian conservation of energy as an effective strategy for checking on the accuracy of the numerics: The accuracy of the numerical work was checked by the constancy of the quantity representing the total energy. In some cases, for checking purposes, the corresponding linear problems were run and these behaved correctly within one percent or so, even after 10,000 or more cycles, (p. 986) Clearly this statement refers to the double strategy of calibration and comparison. Not only was the energy conservation checked for the perturbed system, but it was checked also in the unperturbed, linear system. Thus the authors checked the competency of the numerical procedure by convincing themselves that round-off error was not causing a change of energy in either system, which they knew should not dissipate energy. Also, by cross-checking in this way, they confirmed that the nonlinear, dispersive model did not contain a hidden energy-dissipating feature, which might invalidate their assumptions about the model. Although the use of different perturbations may seem to establish the strategy of independent verification, it does not because that strategy requires either the use of different numerical techniques, different precision or time-step size and, in addition, all of these need to be performed with the same model. FPU used three different models and therefore it seems as if the similarity in the results support arguments that pertain to the model and not the simulation. But the consistency of the results among the different models does provide strong reasonable belief in the results after all. If we are to believe Ulam's claim about Fermi's firm statement, then there must be an implicit reason to believe in these results which confounded the established theory. Indeed, we have here perhaps the original example of the strategy of "using the results themselves to argue for their validity." The consistency among the results on different perturbations strengthens the belief that the anomalous behavior is real and represents a heretofore unknown property of the true solution. Although they were able to discount errors due to the numerical procedure by applying the two strategies of calibration and comparison, these strategies themselves only provide a basis on which to discount gross numerical errors. But this other more implicit strategy explains why this problem generated the interest and attention of the larger community of physicists. The FPU bounded behavior had acquired strong belief as a real property of the basic perturbed model. In addition, this behavior presented an important conflict with the existing theory of dynamical systems. Both of these results seem to be necessary conditions for pursuit, and perhaps they explain some of the reasons this problem took on a continuing life throughout the period of the case study.
128 5. Steps to an Epistemology of Simulation 5.9. Case Study II: Henon and Heiles Now we move on to the seminal work of Henon and Heiles (1964). This article obtains significance both because it represents the beginning of standard methodologies in dynamics with relation to simulation and it is the first application of the KAM theorem to a model derived from a physical system—as was announced by Walker and Ford (1969). Lunsford and Ford (1972) demonstrated the connections between the FPU, Henon and Heiles, and Toda-lattice Hamiltonian, so that there is also this indirect connection to the problem at hand, in that they all address the "fundamental problem of dynamics." It is important to mention that this study was not motivated by the results of FPU. Henon and Heiles claim that the reason for their pursuit of this problem was the disparity between theoretical predictions and observations. But the results of this study converge with the FPU results as a further basis for continuing pursuit of the related anomalous behavior. Let us examine the strategy used by Henon and Heiles to obtain reasonable belief in the results of their simulation. The important method that Henon and Heiles chose to use in their simulation, which differed from Fermi et al., was representing the evolution of their model in a surface of section. This choice provided them with access to important topological structures that are obscurred by the modal-energy approach of Fermi et al. But to their credit, such an approach by Fermi et al. would have been very difficult considering the large number of degrees of freedom they had to consider. Fermi et al. utilized the two explicit strategies of calibration and comparison; both of which may be used on any simulation of this type because the unperturbed Hamiltonian is integrable and both the unperturbed and perturbed models are energy-conserving. Henon and Heiles also used two explicit strategies: they used comparison and independent verification, the latter instead of calibration. Once again, as did Fermi et al., they utilized the fact that energy must be conserved in this Hamiltonian system: As a check, some of the orbits were computed independently by each of us, using different computers and different integration schemes. The following results were obtained using the Runge- Kutta method; during the numerical integration the energy was observed to decrease very slightly (< |0.00003| for 150 orbits). (P- 75) In this instance, the authors used different algorithms and different computers to insure their results. Just on the basis of this article and FPU, there seems to be an implicit assumption that using two of the explicit strategies is adequate to achieve reasonable belief that the results do not contain artifacts of the simulation. However, we must keep in mind that belief is quite different from truth: even though scientists may believe in a result, it still may be different from the true solution. Henon and Heiles
5.10. Methodology 129 could have performed a calibration of their simulation by checking the energy-conservation of the unperturbed Hamiltonian; but they chose not to. Using different integration techniques to achieve the same trajectory constitutes an independent verification that the simulation describes the true solution. But, as I soon discuss, there exists the possibility that the true solution may still diverge from the simulation. No subset of strategies of reasonable belief provides sufficient conditions to guarantee finding the truth. The use of independent verification may be only slightly more successful than calibration as a strategy for reasonable belief, because calibration on the integrable model does not assure us that the technique is reliable for the perturbed model. One of the characteristics of the nonintegrable models that we have seen in this work is the exponential separation of pair orbits—an important result found through the use of simulation. This kind of radical behavior in the trajectory cannot occur in the integrable model, indicating a distinct difference between the two—a difference that could invalidate any assurance we obtain from a calibration procedure on the unperturbed model. Whereas an independent numerical procedure that reproduces the results consistently provides us with at least necessary conditions for the results to be descriptive of the true solution. But this strategy does not provide us with sufficient conditions to believe that the behavior is attributable to the true solution, because there may be an artifact generated by the discrete-time approximation inherent to all of these numerical procedures. We saw that in stochastic regions of the phase space of these models, the discrete nature of the numerical procedures leads to the loss of efficacy of the simulation. Even though two independent procedures may produce the same trajectory, this trajectory is not guaranteed to be representative of the true solution because both procedures rely on discrete iterations. Whereas FPU demonstrated that the boundedness in their simulation was not specific to the particular nonlinear perturbation—using the results to argue for their validity—the results of the Henon and Heiles work are limited in scope to this single perturbation. But taken together, the two works strengthen each other as they flesh out a more definite understanding of the bounded energy-sharing. Because the Henon and Heiles Hamiltonian overlaps with a special case of the FPU Hamiltonian, the two sets of results strengthen each other as independent verification that the bounded behavior is intrinsic to the true solution. 5.10. Methodology Consideration of the work of Ford (1961) and Jackson (1963b) adds little to our discussion of epistemological strategies for simulation because they
130 5. Steps to an Epistemology of Simulation dealt mainly with perturbation theory. However, that in itself, further indicates that the FPU simulation functioned to stimulate the development of new theory—one more function that simulation shares with experiment. Both Ford and Jackson returned to perturbation theory as a starting point for further theory development. The one other aspect of their work that seems appropriate to this discussion is their incorporation of simulation into their own work. Whereas Ford used the results of FPU only as reference data with which to align his research findings (retrodiction), Jackson integrated simulation into his work more completely. Evidence from Ford's article indicates that he, as a theorist, thought of simulation as an experiment: "Ulam, Fermi, and Pasta tried to illustrate the expected approach to equilibrium by observing the equipartition of energy among normal modes for a system of one-dimensional oscillators obeying equations of the type ... "(p. 387). Both the terms "illustrate" and "observe" indicate Ford's attitude toward simulation as an experiment. He treated the output of the simulation as if it were data from an experiment, and he presented his analytical analysis as a qualitative comparison to the "experimental" results of the simulation. Jackson went a step further in his use of simulation; although he too seems to have thought of it as an experiment on the model. In the second part of his two-part article, he wrote: In the present study, this theory [from Part I] is applied to a particular system of coupled oscillators and compared with results obtained from computer solutions of the equations of motion. ... A numerical analysis of such a system has been made previously by Fermi, Pasta, and Ulam, and, in order to make use of their results, we will restrict our study to one of the nonlinear interactions used by them. (1963b, p. 686) Whereas Jackson, like Ford, tried to explain the findings of the FPU simulation with perturbation theory, he went on to use his theory to predict the outcome of new calculations that extended the FPU model to new cases. Thus, by combining perturbation theory with numerical calculations of his own, Jackson synthesized the two approaches. To increase the reader's belief in his own results, Jackson first retrodicted the FPU results, thus extending the reasonable belief of simulated results to this own theory. He then extended those results beyond the bounds of the FPU experiment by predicting forward with his theory beyond the confines of the FPU results. He performed calculations on the model to confirm those predictions. By employing the FPU results to validate his own theoretical work, Jackson strengthened the acceptance of the FPU results. In essence, Jackson used the results of simulation to corroborate his own theory development. Here again a theorist treated simulation as an experiment, but in a more integrated form this time, using prediction from theory and a numerical
5.10. Methodology 131 experiment as validation for the theory. Jackson recognized that if theoretical predictions could not only account for the FPU recurrence, but also extend those results to new cases, then he could increase acceptance of his theoretical work. Consequently, his strategy had the additional effect of strengthening our reliance on simulation as a source of validation, because he used numerical calculations of his own to show that his extended FPU predictions would be borne out in simulation. Let us turn now to the collaborative work of Ford and Waters (1963, 1966), at which time simulations were beginning to play a more significant role as a heuristic tool in analytical research. In the first of these articles, the authors come close to claiming that their own computer calculations validated their predictions from perturbation theory: The amplitudes of these normal modes are predicted using perturbation theory; the validity of the calculations is then demonstrated using a computer for systems of 3 and 5 oscillators. The computer solutions indicate that these normal modes are stable in the sense that a system started near one of its normal modes remains near it. (p. 1294) Although they did not actually claim that the calculations validated their predictions, they did say that the calculations "demonstrate" that validity, which is an interesting and slippery distinction. They must have thought that claiming explicitly that a simulation could confirm a theory might be unjustified. They displayed some reservations about whether a simulation could be enough to validate theoretical calculations, which only points to the need for establishing an epistemology of simulation. Also note that, in the last sentence, they came up against an important problem that I will treat in the next section; implicit in their claim of "remains near it" is the qualification "as long as we ran the simulation," and "to the accuracy of the calculation." So they could not use the simulation as a proof of the existence of normal modes. In the second of these articles (Waters and Ford, 1966) we find the interplay between analytical work and simulation becoming a regular practice. In the following quotation, we can see an emerging interconnectedness between the strategies of retrodiction and calibration: In Fig. 1, the first-order approximate solution is compared with the numerical solution for a set of initial conditions which are rather close to those for the first periodic solution. ... The expansion parameters are ... , so that a first-order expansion about this periodic solution is expected to provide a rather good approximation to the actual solution. Figure 1 confirms that it does. (p. 401) Here they chose to predict solutions that come very close to the unperturbed solution. In this way, retrodiction may seem to be calibrated be-
132 5. Steps to an Epistemology of Simulation cause its predictions come close to* the unperturbed solution. The implicit argument is that for a small perturbation, the theory is working, so we may believe it has a chance of being correct for the more highly perturbed cases. 5.11. Irreversibility Waters and Ford explicitly specified their strategy for maintaining the accuracy of their calculations: "Certain checks, including a running calculation of the Hamiltonian and the reversal of several runs back to the initial conditions, indicate that the accuracy of the numerical solutions presented is better than 0.1%" (p. 401). In addition to the usual check on the conservation of energy, we find the first mention of the reversibility expectation, two years after the Henon and Heiles article. This additional feature, that we expect to be guaranteed by the assumptions of the deterministic model, provides a more sensitive test for the efficacy of the simulation. As we know, in the stochastic regions of the phase space of a nonintegrable model, orbit-pairs undergo exponential separation and the initial conditions are lost. Clearly, it had not yet been realized that the simulation can and will generate errors in its representation that are not due to any error in programming; but instead, are due to the nature of the orbits themselves, when linked with the discretization effects necessitated by simulation. But reversibility is not the only way of discovering the divergence of the simulation from the true solution in the exponential separation of pair-orbits. Obviously tracking a pair of nearby orbits is another way, and varying the size of the time step is yet another. In their Physical Review Letters article, Walker and Ford (1969) introduced the KAM theorem to the general physical audience. They cite the irreversibility of stochastic trajectories that Henon and Heiles did not think to observe: For the total integration times used, up to t = 1500 for the stable orbits, the energy was constant through six decimals. Moreover, at t — 1500, the velocities could be reversed and the motion integrated back to t = 0 maintaining four decimal accuracy. However, for the highly unstable orbits, the reversed integrations showed that the solutions were accurate only for t < 200 even though the energy continued to be constant at high accuracy, (p. 431) Walker and Ford tried to verify that the irreversibility is not due to numerical inaccuracy; but because this was a deterministic process, the irreversibility must have been due to the numerical implementation. Ford and Lunsford (1970) offered a possible explanation for this behavior that we should look at closely:
5.12. Proof 133 The origin of irreversibility for these oscillator systems perhaps finds its most fundamental description in terms of the exponential separation of pair-orbits. In a very real sense, this rapid pair-orbit separation represents that stirring of phase space which Gibbs envisioned as causing irreversibility. In particular, the rate at which these pair-orbits diverge gives at least one measure of entropy production in these systems. In terms of information theory, the slightest uncertainty in the initial state would grow with time to almost complete uncertainty of the final state, (p. 69) Here the authors associated the production of information entropy with the production of energy entropy. The entropy they describe is what I have been calling the loss of efficacy of the trajectory: the irreversibility of the trajectory causes a loss of the initial condition and so the trajectory no longer represents the deterministic evolution from the initial condition. Walker and Ford would have liked to connect this loss of information with the source of energy entropy in a physical system. But this conclusion seems unjustified because it lacks rigorous causal connections. We can conclude that the memory of the initial condition is lost, and that the exponential separation of pair-orbits does represent stochasticity. The connection between loss-of-efficacy in a simulation and physical irreversibility is not made. Once again we hear a researcher discussing the results of simulation in terms of physical reality. Clearly Walker and Ford believe that the loss of efficacy of the simulation is related directly to physical entropy production. Although Walker and Ford were able to identify the existence of this inherent limitation on the efficacy of simulation, they point out that no method for predicting its occurrence has yet been established: "While there can be little doubt that KAM instability is the source of the amplitude instabilities observed in the above computer experiments, the theorist's ability to predict the onset and completion of macroscopic instability is less certain" (p. 188). 5.12. Proof Dynamics is a multidisciplinary study in mathematics and physics. The manipulation of a model, the Hamiltonian formalism, and the KAM theorem are all mathematical, whereas the use of models and simulation to study the properties of physical systems is physics. But there is no clear separation between the two in any of this work. In physics, belief is a strong principle: we believe in the laws of physics or we believe in the results of an experiment. Proof is more the concern of mathematicians; logic and mathematics enable the possibility of proof. When we work with simulations of dynamical systems, questions of belief and proof become confused. In
134 5. Steps to an Epistemology of Simulation dynamics, certain properties of models can be proved, such as the KAM theorem. Once proved, these theorems become very important for the physical study of dynamics. In the case of the ergodic hypothesis, we saw that Fermi attempted a proof that was not convincing in the 1920s and in the work of this case study, we found that the ergodic hypothesis was finally laid to rest, partially because of the KAM theorem, which proved that boundary curves persisted well into the perturbed space of nonintegrable dynamical systems. I have discussed strategies for obtaining reasonable belief that a simulation displays properties of the true solution. Because the truth in this case is mathematical, proof seems to be required to assure, finally, whether certain properties exist for a model. This assumption has been expressed clearly by Henon and Heiles (1964): Are the curves found here exactly or only approximately invariant? What is the topological nature of the set of all the islands? Is it possible to compute the curves directly from the potential, without integrating all the orbits? The ultimate answer to such questions should rest on rigorous mathematical proofs, not on numerical experiments; but the mathematical approach to the problem does not seem too easy. (p. 79) Henon and Heiles were aware of the need for mathematical proof; yet they also point to the difficulty of finding that proof. In order to get an idea of what needs proving, simulation provides explorations into the space of models. But can simulation be used to provide a proof? The answer to this question depends heavily on how we frame the question. Each point of a trajectory is generated from the iterative numerical integration of equations of motion; if a trajectory displays a general property, and consistently retains that property throughout the simulation, then the question emerges: Will this property hold throughout the evolution? Similarly, there are an infinite number of possible initial conditions for simulated trajectories and we can simulate only a finite subset of them. What we see in simulation is merely a small piece of the space of a model. If every swan we see is white, may we conclude that all swans are white? The answer to this question depends on whether we make conclusions based on the evidence, or on the tenets of logic. A physicist might answer in the affirmative to this question because all the evidence supports that answer; but a logician must answer in the negative because deductive logic forbids such an inductive conclusion. Evidence from the Henon and Heiles article indicates the need for care in framing our questions so that the simulation can provide definitive answers. Consider the following excerpt from their text: If we follow the trajectory for an infinite time, there will be in general an infinite sequence of points. If there is no [additional] isolating integral, these points will fill an area [in the surface
5.12. Proof 135 of section]. But if there is [an additional] isolating integral, the points will lie on a curve. Thus we get a simple criterion for the existence of the second integral; it is sufficient to compute a number of points, plot them in the plane and see whether they lie on a curve or not. (p. 74) [my emphasis] Henon and Heiles make an interesting claim about sufficiency: If the trajectory remains on a curve, then there is an integral of the motion. Whereas any integral of the motion guarantees only bounded behavior, this particular integral, if it exists, fulfills the final requirement that makes this system completely integrable. If the points seem to be filling the bounded region, then we could sufficiently conclude that there is not an analytical integral of the motion in effect. But Henon and Heiles claim the converse, which is true only if the ergodic hypothesis is true. If we see a black swan, then we may conclude that not all swans are white; but, if we do not see a black swan, then we still cannot claim to know that all swans are white. However, because all the evidence supports it, then we may reasonably conclude that all swans are white. This conclusion stands until someone produces a nonwhite swan. Conclusions based on the evidence may turn out to be wrong, but conclusions based on logical principles can not. It would be nice to only make conclusions that cannot be wrong; but if logical guarantees were the only way of justifying a conclusion, then science would proceed very slowly indeed. If points continue to remain on what appears to be a curve throughout the simulation, no matter how long it is run, then there still remains the possibility that either the points will deviate at some later time, or that a different choice of initial conditions could result in new behavior. Logically, their claim would not be wrong if the ergodic hypothesis were true—that is, if there is only the two possibilities of either order or ergodicity, then anything remotely resembling order would cause us to conclude the existence of an isolating integral. To be correct in the light of the failure of the ergodic hypothesis, Henon and Heiles might have said: "Thus we get a simple criterion for the nonexistence of the second integral ... ." Simulation can disprove a hypothesis with the instantiation of any exception to the rule, but it cannot prove the hypothesis. The proof must come from some time-independent mathematical formalism, such as the complicated proofs of the KAM theorem. Thus we obtain a general limitation on what a simulation can prove. Given a hypothesis that says: "Property A holds for all time." Simulation can be used to disprove it by providing an exception, but it cannot be used alone to prove the hypothesis because it can provide only a finite number of specific instances. Whereas Henon and Heiles did not seem to be aware of this limitation to using the results of simulation as "sufficient" to demonstrate that properties of solutions would hold for all time, we know that Waters and Ford were conscious of it because they stated it explicitly:
136 5. Steps to an Epistemology of Simulation Our work relies heavily on numerical solutions to the equations of motion. Clearly, a computer can provide these solutions only for selected initial conditions and for relatively short time intervals. The analytical methods we have used to imply the long-time behavior are open to serious convergence questions, and it remains to be shown whether or not they provide an adequate treatment of the small-denominator problem. Nonetheless, whenever we have been provided with a check, we have found agreement between analytical and numerical solution. Consequently, it is with guarded optimism that we offer our conclusions. (Waters and Ford, 1966, p. 1304) Even though they made this cautionary statement, Waters and Ford had convinced themselves of their analytical predictions using the results of their simulation. In the process of using simulation to strengthen belief in analytical results, their respect for, or their belief in simulation's ability to provide real information was intensified. Here we see the difference between certainty as supplied by proof and belief as supported by evidence. At the end of the first article by Ford and Waters, we obtain an insight into the authors' recognition of simulation's value in this work: "The partial success of the perturbation methods we have used indicates that these systems may eventually succumb completely to analysis; although a computer is likely to remain a useful adjunct to analysis for some time to come" (Ford and Waters, 1963, p. 1305). From this statement we can infer that simulation had already become a vital tool for the theorist, because it provided both heuristic exploration into unmapped areas of a nonintegrable model, and limited, yet very directed, validity for predictions made during theoretical development. Again, we must recognize with caution that this validity is limited to particular ranges of perturbations and evolution times, but it does function to strengthen the reasonable belief in predictions in the absence of proof. What is also interesting in this statement is that, implicitly, if these systems do not ever succumb to mathematical proof, then computer simulation will be the most important tool for understanding properties of solutions. 5.13. Proof and Simulation Ford et al. (1973) pushed the use of simulation to its extreme as they tried to determine whether the Toda lattice was integrable or not. They designed many tests that could be examined in simulation. First they tried to determine whether there obtains, within computer accuracy, a constant of the motion. On the other hand, by numerically integrating the equations of motion for a variety of initial conditions, one can determine if
5.13. Proof and Simulation 137 the truncated, formal series is a valid constant of the motion to within computer accuracy. In most of our numerical integrations, this series was indeed found to be a constant of the motion exhibiting a time variation only slightly greater than the Hamiltonian itself, (p. 1549) Recall that from just this result, Henon and Heiles concluded the existence of an integral; but Ford et al. knew this was not sufficient. Next they tried to find a breakdown of curves in the surface of section, using methods similar to what Henon and Heiles used: Thinking that the somewhat similar Toda Hamiltonian [similar to Henon and Heiles] would also be nonintegrable, we chose to begin our investigation at E = 1, only to find smooth curves everywhere. Upping the energy to E = 256, then to E = 1024, and finally to E = 56,000 still yielded smooth curves despite the fact that, at these high energies, the nonlinear terms dominate the motion, (p. 1551) They searched for a breakdown of curves of successively higher energy, characteristic of KAM curves to disprove the hypothesis of an integral; but they failed to find any breakdowns. They also searched for fixed points corresponding to islands of order in the Henon and Heiles system; but again they failed to find these. Both of these negative results strongly indicated that these curves were due to integrals of the motion; yet, they are not sufficient proof in themselves. Finally, they looked for the characteristic exponential separation of pair orbits, which would indicate stochasticity; but they could find no evidence of exponential separation. All this evidence did conform with the thesis that the Toda-lattice system was integrable; yet they could conclude only: Thus two equally attractive possibilities exits. The Toda lattice is a member of the exception class of integrable oscillator systems, making it a rare jewel in physics—a physically interesting, nonlinear system which can in principle be analytically solved exactly. Alternatively, the Toda lattice is typical of the more general class of nonintegrable systems, thereby making it a welcome addition to a small but growing list of interesting physical systems which exhibit stochastic as well as nonstochas- tic behavior. In this paper, we attempt to decide between these alternatives by presenting the results of a series of distinct computer experiments. ... Contrary to our original expectations, all our empirical evidence indicates that the Toda lattice is integrable, an unfortunate result from the viewpoint of computer studies. For while a computer can be used to definitively prove nonintegrability, it cannot prove integrability.
138 5. Steps to an Epistemology of Simulation Using every test of integrability that simulation could provide, Ford et al. (1973) could conclude only that it is probable that the Toda lattice was integrable. They believed in that integrability, but because models fall into the realm of mathematics, they still had to defer proof to some means other than simulation. Henon finally proved that the Toda lattice was integrable. He did so, not with simulation, but by working out the form of the third integral analytically. Similarly, simulation cannot be used to prove the existence of ergodicity. One of the only systems proven to be ergodic is a system of hard spheres with a repulsive force, which Sinai (1963) proved to be both ergodic and mixing, without using simulation.
Appendix A Hamiltonian Dynamics: Language of Abstraction The advantages of the Hamiltonian formulation lies not in its use as a calculational tool, but rather in the deeper insight it affords into the formal structure of mechanics. The equal status accorded to coordinates and momenta as independent variables encourages a greater freedom in selecting the physical quantities to be designated as "coordinates" and "momenta." As a result we are led to newer, more abstract ways of representing the physical content of mechanics. Herbert Goldstein (1980, p. 378) Newton's equations of motion are second-order differential equations relating acceleration to influential forces. Often, it is preferable to consider first-order differential equations rather than second-order equations—where the order of the equation is determined by the order of the highest derivatives required by the equation. A set of n second-order equations may be converted to a set of 2n first-order equations by assuming the independence of a variable from its rate of change, which was justified by Lagrange in the eighteenth century. This independence gives rise to the need for a 2n- dimensional phase space to specify completely the system's state with a single point. The resulting system of 2n first-order differential equations is completely equivalent to the system of n second-order equations of Newton: 2n initial conditions are still required to specify the system completely. So what is obtained by going into this form of dynamics? The answer is mathematical elegance, symmetry, geometrical representation, and canonical transformations. Position variables in Hamiltonian dynamics may be something quite different from the spatial location of a particle. They may well be some set of generalized coordinates for which the description of the physical system is more wieldy—such as the standard use of angles in the case of the pendulum. But they could also be a set of perfectly valid quantities that have no simple physical interpretation. Of course, if the positions are generalized
140 Appendix A. Hamiltonian Dynamics: Language of Abstraction coordinates, then the momenta' are certainly something other than mass times the first derivative of position. In fact, the relationship between position and momentum is quite specific and is part of the very essence of Hamiltonian dynamics. The central element of this dynamics is the Hamiltonian function (if, often called simply "the Hamiltonian"), that is usually a statement of the total energy of the system, expressed as a function of positions {^}, momenta {pi}, and physical parameters related to the various energy potentials. Prom the Hamiltonian, we obtain the 2n equations of motion for the system by taking explicit derivatives of H with respect to the 2ra variables (n positions and n momenta). Given a Hamiltonian H(Qii • • • > (Zn> Pi> • • • jPn)j we may obtain Hamilton's equations of motion directly: dqi _ dH dpi __ dH dt dpi' dt dqi Notice the symmetry? Each pair of q^s and piS are locked together in an antisymmetrical embrace that is given the special name symplectic. The rates of change of both position and momentum are obtained directly from the Hamiltonian, from its rate of change with respect to the other variable. The variables are conjugated by way of the Hamiltonian, and the Hamilton equations of motion. The special symmetry between the conjugate variables and the Hamiltonian is given the name canonical As long as the dynamics is denned in this canonical relationship, the variables are kept on equal footing as the state variables in this form of dynamics and the initial assumptions of the model are retained by the trajectories. A.l. Topology and Phase-Space Trajectories In symplectic geometry, each degree of freedom contributes two dimensions to the overall system phase space. The phase space for any Hamiltonian system of n degrees of freedom is thus a 2n-dimensional Euclidean space. The system point changes coordinates as the system evolves in time, thereby generating the system trajectory. The possibility exists for the system point to remain fixed, in which case the point is called a fixed point. Once the system point obtains that value, it never leaves of its own accord—for example, the pendulum stopped at the bottom of its swing. There are two important topological properties of the system trajectory: as the system evolves, it cannot make discrete jumps in state variables, and so the trajectory is a continuous curve. Second, because the solution to the differential equations is unique, the trajectory in phase space may never cross itself. Once a trajectory passes through a particular point in phase space, then the path it travels away from that point is determined unambiguously for all time. If the trajectory reaches that same point again at (A.1)
A.2. Canonical Transformations 141 some later time, then it must follow the previously determined course away (again), which clearly leads into a periodic loop. A closed curved is formed and the system is periodic—every point on the closed curve is revisited periodically. The fixed point is a special case of a periodic trajectory. A.2. Canonical Transformations In Chapter 1, we saw that FPU chose to convert their model to the normal- mode representation to decouple the differential equations. Conveniently, this transformation also provided a set of generalized coordinates with which they (and we) could visualize the dynamics of the whole string. Each normal mode represents a physical, sinusoidal shape of the string. The transformation to normal modes is called a point transformation because it only involves transforming from one set of generalized coordinates {qi} to another {Qi} without involving the momenta in the transformation. Here the momenta transform trivially from {pi} to {Pi} because of the special case of the momenta being simply the first temporal derivative of the positions. But, in general, the canonical transformations of Hamiltonian dynamics involve both the coordinates and the momenta of the old system (of representation) in the derivation of the coordinates and momenta of the new system, all the while maintaining the system's canonical structure. For more detailed information on canonical transformations, see, for example, Goldstein (1980, Chap. 9). These transformations convey us to more abstract representations of the dynamical system, which may help us to understand better any hidden order principles (symmetries), that we might not have been able to recognize in the original representation. However, one drawback of employing such transformations is that we may lose track of the physical significance of the theoretical entities. A.3. Transforming the Unperturbed String In order to make clear the connections between the KAM theorem and the FPU problem, I develop, heuristically, the model of the discrete string in the Hamiltonian formalism. The unperturbed Hamiltonian or a system of n particles with linearized spring forces acting between them is 1 i=i In conformity with FPU, I've set the mass of each particle and the spring constant for each string-segment to unity, so they do not appear here. The difference between nearest-neighbor displacements represents the extension
142 Appendix A. Hamiltonian Dynamics: Language of Abstraction of each spring segment. On converting to the normal-mode representation using the substitution 3 = 1 7 = 1 x ' where k = 1,2,... ,n, the Hamiltonian becomes H(Q,P) = J2 fc=l ^{Pl + ulQD with Wfc „ . fifk\ 2 sin —- \2n) (A.3) (A.4) (A.5) Equation (A.4) is analogous to equation (1.5), except that I have substituted Q's for a's, and P's for the conjugate momenta a. In this new representation, the generalized coordinates are the normal modes themselves. Note the change effected between the two Hamiltonians (A.2) and (A.4). Each term in the sum of the new Hamiltonian contains reference only to variables of the same mode index fc, and so decoupling the equations of motion. Whereas the motion of one mass point along the string depends upon the location of its two neighbors (coupled), the motion of each mode (of the whole string) depends only on itself and its conjugate momentum (decoupled). Applying the formalism to this Hamiltonian results in the usual oscillatory motion of the normal modes: Qk(t) = -Q0cos(ujkt)] Pk(t) = P0sm(ukt). (A.6) In this completely integrable, unperturbed system, the normal modes do not exchange energy. The motion described by equations (A.6) is harmonic: both the position and the momentum of each mode moves in the familiar periodic pattern. A.4. Cyclic Coordinates If any of the variables Qk are absent from the Hamiltonian in any of its transformed manifestations, then, in that same manifestation, the specific conjugate variables Pk must be constants of the motion, and we say that the Hamiltonian is cyclic in the absent variable. Often called "first integrals
A.5. Liouville Integrability 143 of the motion," these constants result from integrations of equations (A.l). For example, assume that some Hamiltonian was independent of the variable Qi, such that H = H(__Q2l • • • , Qn, Pi,- ,Pn )—then the second of equations (A.l) tells us that tfi dff n , —— = — -rp- = 0 and so Pi = a constant = p\, at aQi so that H becomes H(—Q2,... , Qn, /?i»-P2» • • • »Pn )•> where /?i is a first integral of the motion and so remains conserved at its initial value throughout the evolution. This is exactly the process by which angular momentum is conserved in the two-body problem, because that Hamiltonian is independent of the angular orientation variable. A.5. Liouville Integrability When a dynamical system is completely integrable and its motion is fixed to a determinable trajectory that is the intersection of the surfaces defined by the n constants of the motion, such as the system of harmonic oscillators in the unperturbed FPU string, then there is a certain set of canonical coordinates that are especially useful for representing that system. Notice that when we transform our Hamiltonian to a form in which it is explicitly independent of all of the coordinates Qk, then we obtain the n explicitly defined integrals of the motion. We have H = H(/3i,... , /3n), where H and all the /3fc's are constant. From this vantage point, we are in a position to apply the first of Hamilton's equations (A.l) to every Qk to get dQk dH —— = -p-r- = another constant value = Uk, at opk which easily integrates to the linear function: Qk = Wk t + Qko- AH the positions Qk evolve in a linear way, beginning from the initial value Qko- In effect, the system is solved completely and the motion is absolutely predictable. This special case is called Liouville integrability. In Hamiltonian dynamics it is necessary to know only n constants of the motion to completely specify the evolution exactly—given the 2n initial conditions. A.6. The Action-Angle Variables Taking advantage of both the dual sinusoidal motion of the normal string modes and the Liouville integrability of the coupled harmonic oscillators, we can transform our system once again to a more elegant geometrical
144 Appendix A. Hamiltonian Dynamics: Language of Abstraction Elliptical Orbit Q = -Q0Cos(cot) P = P0Sin((ot) a = sqrt(2E) b = a/co Circular Orbit (rectangular) Q' = Qco P' = P r = sqrt(2E) Circular Orbit (polar) I=E/(o 0 = cot + 0O FIGURE A.l. Geometrical transformations: Looking at a single cross section of one mode we see: (a) The mode Q and its momentum P both orbit periodically resulting in an ellipse with fixed semimajor and semiminor axes a and b. (b) Each such ellipse easily transforms into a circle whose radius and area are directly related to the constant mode energy, (c) The circle with fixed radius is converted into polar coordinates, which are the action angle variables (/, B).
A.6. The Action-Angle Variables 145 representation in which the Hamiltonian is cyclic in all the position variables. As a result, all of the conjugate momenta become constants of the motion, and the position variables evolve linearly in time. As can be seen from equation (A.4), each mode energy is the sum of the kinetic and potential energies of that mode, represented by the bracketed terms of the sum. Setting each term equal to a new constant Ek, we have Ek = \{Pl + ulQl). Geometrically, this is the equation for an ellipse—a fact which suggests several quite interesting things. First, that both the positions and the momenta must be periodic; but equations (A.6) already tell us that. Second, that when we graph Pk versus Qk, as I have done in Figure A. 1(a), we obtain a plane ellipse, whose area is directly related to the conserved modal energy. Using a suitable substitution of variables we can easily transform an ellipse into a circle, as in Figure A.l(b). Each such circle, one for each mode, has a fixed radius that is also directly related to the modal energy, which in turn is a constant of the motion, determined by the initial conditions of the string. Thus we can see that if all the initial energy of the string were placed in mode one, as it was in FPU (unperturbed), then only the k = 1 circle would have a nonzero area. Remember that the phase space of our system is a 2n-dimensional space, of which a single (Qk, Pfc)-plane is but a small cross-section; but in this case, it is the only cross-section with nonzero area. The system point evolves continuously around this circle of fixed radius. As we can see from equations (A.6), both Q and P evolve with the angular frequency a;, defined by (A.5). So if we transform the system once again, to a polar-coordinate representation, only the angular variable would evolve. We transform to this set of polar coordinates using the transformation equations Qk = -\— cos(0fc), Pk = y/2ukIk sin(0fc). (A.7) V Wfc These equations actually transform the ellipses of Figure A. 1(a) directly into the circles of constant radius J and rotating angular coordinate 6 (Figure A. 1(c)). Looking at the inverse of these transformations (they are required to be invertible): /fc = St+iM = ^, 6k = ukt, (A.8) ZUk Uk we see that the radius variables Ik are all constants of the motion derived from the modal energies and that the angle variables 0k are linear functions of time. Thus the shape of the string, defining the location of every
146 Appendix A. Hamiltonian Dynamics: Language of Abstraction mass point along it, is easily detefmined for any time t—given the initial conditions—by a simple calculation of the angle. The variables Ik are traditionally called actions because they are the quantities minimized in the principle of least action, which was derived independently by Maupertuis, Euler, Lagrange, and Hamilton. According to Whittaker: The Principle of Least Action originated in Maupertuis' attempt (Mem. de VAcad., 1744, p. 417) to obtain for the corpuscular theory of light a theorem analogous to Fermat's 'Principle of Least Time.' Maupertuis' principle was established by Euler (Addit. II. p. 309 of his Methodus inveniendi lineas curvas, 1744) for the case of a single particle under a central force, and by Lagrange (Miscell. Taurin. II. (1760-1), Ouvres, I. p. 365) for much more general problems. (1937, p. 417) This variational principle states that of all the integral paths possible for the system trajectory, the one which minimizes the action integral will be the physical one. Hamilton derived his canonical equations from a similar principle. It is easily shown that the variable J results exactly from the action integral. The transformation to action-angle variables is canonical because it retains the structure of the canonical equations (A.l). Substituting the transformations (A.7) into the Hamiltonian (A.4), we obtain the new Hamiltonian n ff(/) = 5>fcJfc, (A.9) which is completely independent of the angle variables 6k> Now we apply the canonical equations of motion to this Hamiltonian to obtain and so Ik are all constants, and then clearly, -77- = 77T = u)k (a constant), (A.ll) at o±k which gives us finally, OkW^Wkt + Oko. (A.12) The new Hamiltonian (A.9) is a constant function of the action variables only, which in turn (A. 10), are themselves constants of the motion. Furthermore, the angle variables Ok are linear functions of time (A.12). The
A.7. Dynamics on a Torus 147 motion is represented by a point rotating around a circle of fixed radius with a constant angular frequency. However, the motion is confined to a plane circle only if n = 1 or if only a single mode is excited initially. In a full treatment, we must be concerned with cases for which n > 1. So now we move to representations of the motion in the action-angle variables when n > 1. A.7. Dynamics on a Torus The higher-dimensional analogue of this circular action-angle representation is motion on the surface of a torus. For visualization purposes we will consider the motion on a 2-torus, which is a four-dimensional surface (manifold) embedded within a three-dimensional space. Consider the example of two coupled, linear, harmonic oscillators. This system has two degrees of freedom (n = 2), which demands representation in a four-dimensional phase space. Because the energy is conserved, the energy surface is a (2n — 1 = 3)- dimensional manifold. Upon transforming the model to action-angle variables, we obtain a system of two actions, two angles, and the Hamilto- nian: H = uil\ + ^h- The actions and their associated angles (#i,#2) are the coordinates of a 2-torus, which I have represented graphically in Figure A.2—where arbitrarily I have arranged the coordinates such that Ii > I2. Because the actions are constant, the evolution of the system point is constrained to the surface of this torus and is described by the evolution of the angles according to (A. 12). These angles evolve linearly in time with normal-mode frequencies (o;i, 0/2). The energy surface is the three-dimensional Euclidean space in which we have embedded this torus. Clearly, because the system trajectory is confined to this torus, it is not free to wander through the entire energy space and so the system is not ergodic. The 2-torus is the highest-dimensional torus that most people can visualize, a fact that might present a problem when studying systems with n > 2. But, fortunately, the 2-torus is sufficient to represent most of the properties of the higher-dimensional analogues. The big step is going from the circle to the 2-torus. Even though our model is many dimensions more than 2, we will be using the 2-torus for all visualizations. Although we originally derived the action-angle variables from the normal modes, all quantities now must be defined as functions of the actions and angles. The Hamiltonian is a function of the actions only, as per (A.9). The total energy of the system is the sum of the modal energies (Ek = IkWk), that are constants determined by the initial values of /&. Because the actions are constant radii of the n-torus, once we set their values, we have determined the n-torus completely. The angles play no role in determining either the energy or the characteristics of the torus: every
148 Appendix A. Hamiltonian Dynamics: Language of Abstraction FIGURE A.2. In the 2-torus coordinate system, each of the two action radii shown is associated with an independent angle variable (not shown). trajectory on the n-torus will have the same energy as any other one on the same torus. Fixing the initial values of the angles 0k determines or selects a particular trajectory from the infinite number of possible trajectories, all with the same energy. Because the actions are constructed from both the positions and momenta of the normal modes—which are themselves composites of the coupled positions and momenta of the mass points along the string—we do need to fix the initial values of both the positions and the momenta of the modes to obtain the initial value of the actions. A.8. Commensurability: Two Types of Motion As mentioned earlier, a trajectory in phase space can be either an open curve (aperiodic) or a closed curve (periodic). Although this polarity still holds on the surface of the torus, we obtain something new in this representation. For our system of coupled harmonic oscillators, the trajectories must remain on the fixed surface of the torus; yet the trajectory can never cross itself. Furthermore, in any single-mode cross-section of the torus, the motion is circular and periodic in that mode. But even though every mode is periodic in its own right, the whole trajectory need not be. If the trajectory on the surface of the torus repeats itself, then it forms a closed curve on that surface. On the other hand, it is possible for the trajectory to wind around the torus in such a way that it never intercepts itself, and
A.8. Commensurability: Two Types of Motion 149 so never visits the same point twice. This second kind of trajectory must necessarily cover the entire surface of the torus, given an infinite amount of time; whereas the periodic trajectory can never cover the entire surface, in any amount of time. This second kind of trajectory is called conditionally periodic] it is periodic in every mode, but aperiodic overall. The distinction between these two types of trajectories plays a significant role in both the small denominators problem of perturbation theory and in the development of the KAM theorem. Whether or not a trajectory is periodic or conditionally periodic depends upon the relationship among the angular frequencies. In the case of the 2- torus, if the ratio of the two frequencies is rational, then the trajectory will return to the same point eventually. Each frequency Uk is the number of complete revolutions that the corresponding angle Ok makes in one unit of time. First, let us consider the case of n = 2, where, for convenience, u)\ > U2- If the ratio of the two frequencies ujxJuj^ is a rational number, then: — = — where j and m are integers, (A. 13) rau>i — ju)2 — 0. (A. 14) The period of mode two is 2n/uj2, which, by assumption, is the longer of the two periods. Whenever mode two completes j periods of its motion, mode one will have completed exactly m periods of its motion, and so the overall trajectory will come back to its initial point. In general, the motion of a system of n oscillators will be periodic only if the condition, n ]Trafca;fc=0, (A. 15) holds for some set of integers {m^}, in which the integers may be positive, negative, or zero, with the exception of all zeros. This condition is the straightforward generalization of the ratio of frequencies above. The frequencies in this case are said to be linearly dependent. If this condition is met, the frequencies are said to be with one another. Figures A.3 and A.4 represent several combinations of frequencies and the resulting trajectory on the 2-torus. In Figure A.3(a), the toroidal frequency is zero, so the orbit will not go around the torus; it simply makes a plane circular path in the poloidal cross-section. I have included the slinky-like three-quarter torus in the figure only to indicate the potential shape of the full torus; it should not be confused with the darker line representing the orbit. In Figure A.3(b), the situation is reversed, there is no motion in the poloidal direction; the orbit is again a cross-sectional circle, but now its the half-bagel cross-section. These are the simplest cases. In Figure A.4(a), as the orbit makes one complete period about the poloidal direction, it makes
150 Appendix A. Hamiltonian Dynamics: Language of Abstraction FIGURE A.3. Single orbits on tori, (a) One periodic orbit forming a circle about the poloidal cross-section; and (b) one periodic orbit forming a circle about the toroidal cross-section.
A.8. Commensurability: Two Types of Motion 151 FIGURE A.4. Multiple orbits on tori, (a) Once around the poloidal direction sees nine cycles about the torus, (b) While still periodic, this orbit in graphical representation begins to fill the surface of the torus as would one that is conditionally periodic.
152 Appendix A. Hamiltonian Dynamics: Language of Abstraction a full nine toroidal cycles. In Figure A.4(b), although the ratio of the two frequencies is still rational, the orbit is well on its way toward covering the surface of the torus. When the ratio of frequencies is irrational, the orthogonal components of the motion never complete whole numbers of revolutions at the same time as all the other components and the trajectory never meets itself. Thus an open curve results, which must fill the surface of the torus completely as t —► oo. A.9. Digital Representation In 1857, Jules Lissajous demonstrated that if one were to plot the sinusoidal components of two different frequencies on the horizontal and vertical axes of a Cartesian coordinate system, then the resulting curve as a function of time would be either closed or open, depending on the commensurability of the two frequencies. In each of the torus frames in the figure, the frequencies are commensurate and the resulting curves are closed. Therefore they cannot completely cover the torus. But only in theory could the curve in these plots remain open. I generated these plots on a digital computer, and thus the frequencies could only be defined to some finite precision. They can be only rational numbers, and as such, they could never generate an open curve on these graphs. But, in Figure A.4(b), the ratio consists of two not extremely large prime numbers—large in the sense that if they were much larger the resulting curve would cover the torus simply because of line-width, or, equivalently, the resolution of the graphical device. Whereas the resulting curve seems quite dense visually, it is far from an open curve. This is an important issue which both helps and hinders the usefulness of simulation. Open trajectories may never actually be generated on a digital computer; but, if the ratio of frequencies is sufficiently close to an irrational number within the precision of the machine, then the resulting curve will appear in every way to be an open one. If we are looking for open trajectories, then we must be careful to make use of this fact and define the "closeness" criteria properly. A. 10. Physical Reality and the Continuum The continuum of possible concentric tori of our model is structured the same as the continuum of the real line. Between each two irrational numbers there lies a rational and vice versa. Therefore between each two conditionally periodic tori, there lies a periodic torus. But, just as on the line, the measure of the set of all periodic tori is negligible as compared to the conditionally periodic tori, because of the countability of the rationals. Therefore most of the tori are conditionally periodic. But, if we attempt
A. 11. Perturbing the String 153 to represent these tori on a digital computer, then, in theory, we may only represent the periodic ones—leaving the majority out. However, the same situation holds when determining the initial conditions of our dynamical system. We must measure the initial state of our system using physical measuring devices which, like the digital computer, have a finite precision. Thus we can only have rational initial conditions for real physical systems, so it seems reasonable and desirable to only simulate those. We will soon see that when we add the nonintegrable perturbation to our model, the resulting dynamical behavior depends quite specifically on whether our particular unperturbed torus is periodic or conditionally periodic. It would seem to be a problem that we can represent only the periodic tori on the computer. But this problem has been overcome in part with techniques provided by Henri Poincare and George Birkhoff before digital computing even came into existence. A. 11. Perturbing the String We begin with the original Hamiltonian with a perturbation term, cubic in the spring extension, which corresponds to the quadratic force term in FPU: ff=5EW + («-«-i)2] +e£^-<fc-i)3. (A.16) I have substituted the small parameter e in lieu of the symbol a, used in FPU, to avoid confusion between it and the winding number. The first transformation we made on the original equations was to decouple them using the point transformation to normal modes (A.3). But when we apply that transformation rigorously to the perturbed Hamiltonian, we find that the equations no longer decouple. In the most general sense we obtain the following terms: H = J2\l(Pk+"iQl)} +eJ2AakQiQJQk. (A.17) k=l L ijk The perturbation term, even under transformation, continues to couple the different modes together. The triple sum provides links between the normal modes, Q&. The term A^ represents various combinations of constant coefficients that depend harmonically on the indices. If we place all the initial energy in one mode, then we should expect that energy to diffuse to the other modes via the coupling terms—limited in diffusion rate only by the relatively small size of the parameter e, which generally is kept less than one. Because of the coupling terms, the mode energies are no
154 Appendix A. Hamiltonian Dynamics: Language of Abstraction longer constants of the motion. We can see this fact more clearly once we transform our system over to the action-angle variables. The transformation (A.7) to action-angle variables that we used previously is still canonical, but the new Hamiltonian is quite a bit more complex than was the pleasingly simple form of (A.9), p. 146: H(I, 9) = VWfc +e^AijkJ^^coB(0i)co8(ej)cos(ek)9 t^i 7j£ V UiUJ^k (A.18) or, functionally, H(I,0) = Ho{I) + eH1(I,0) where c < 1. (A.19) This Hamiltonian is no longer independent of the angle variables 0^, and so the actions I are no longer constants of the motion. Further, the angles no longer evolve in the linear motion described by (A. 12). The resulting motion is extremely complicated. If the FPU model does satisfy the KAM theorem, then, for small values of the parameter e, the phase-space torus for this system might resemble those shown in Chapter 3.
Glossary Commensurate. In the normal-mode representation of a system of oscillators, each mode has a frequency. The question of interest concerning these frequencies is whether or not they form a linearly independent set—that is, whether Yl7=i m*u* ~ 0 or n°t> f°r arbitrary values of ra^. If this condition is true for any set of integers {m^}, then the frequencies are said to be commensurate and if not, they are said to be incommensurate. Commensurate frequencies are associated with resonance: the oscillator frequencies are multiples of one another and so they are more likely to share energy. In the action-angle representation of the system of linear oscillators, if the frequencies are commensurate, then the orbit will be periodic overall, forming a closed curve in phase space. If the frequencies are incommensurate, then the orbit will be conditionally periodic, forming an open curve that will fill the surface of a torus completely over an infinite period of time. See the Appendix for details concerning the torus representation of the unperturbed string, and then see Chapter 3 for details of commensurate frequencies in the perturbed Hamiltonian and KAM. Conjugate Variables. In Hamiltonian mechanics, each state variable, such as position, has a special counterpart called its conjugate momentum. Together these pairs of variables are called canonical conjugates or conjugate variables. They are intimately related to one another through the antisymmetrical form of Hamilton's equations of motion: dq_dB_ dp _ OH dt dp dt dq ' where H is the Hamiltonian and (#, p) is the pair of conjugate variables. These equations give the evolution in time of each of the two variables in terms of its conjugate. The most common pairs of conjugate variables
156 Glossary are: position-momentum [(x, x)\ (g, p)], angle-action [(0, J), (</>, J)], and time-energy [(£, H)]. Deterministic Chaos. Stemming far back into our cultural memory, the Greek word chaos means utter confusion and disorder, the dark void out of which the Universe formed. Until we began to separate the ideas of whether something was ordered and whether we could percieve that order, chaos was disorder. Now that we are faced with the very real post-enlightenment situation—that there can be determinism without determinability—we assign the term chaotic to mean without order, or cause, and distinguish it from random, which means that we cannot perceive any order. Furthermore, we now speak of chaotic dynamical systems (chaos in the popular sense), which are by first principles completely deterministic; yet we may not be able to predict future configurations, because of various physical constraints, such as finite computer memory, or finite accuracy in measurement. The single most important fact about chaotic dynamical systems is that they can exhibit simultaneously features of predictability and randomness. Deterministic chaos is when trajectories move so erratically that we are unable to simulate their future states. Ergodicity. At the extreme opposite end of the predictability spectrum from systems with known integrals of the motion is the condition called ergodicity, in which only the energy integral exists and the energy surface is freely accessible to all trajectories. Strictly speaking, a system is ergodic only if all trajectories are free to reach the entire phase space; but when we consider energy-conserving systems, we say that any arbitrary trajectory will eventually pass through an arbitrarily small neighborhood surrounding any arbitrary point on the energy surface. A consequence of ergodicity, which is often used as an alternate definition, is that any function of the phase space variables will have a time-averaged value (over infinite times) which is equal to the space average (over all phase space). Therefore conservative systems can never be truly ergodic by the strict definition, because the energy surface restricts the system's accessibility. But we speak of the condition "ergodic on the energy surface," or on any confining surface, as a restricted form of ergodicity. In the present work, ergodicity will always mean ergodic on the energy surface. Any well-defined set of connected initial conditions defines an area or volume in phase space. During the evolution, ergodicity requires that this area evolve all about phase space as a bundle of trajectories. Ergodicity does not require that it change shape very much—that is, it may retain its essential structural character throughout the evolution. However, although Liouville's theorem guarantees the conservation of the volume of the region under consideration, its shape may vary drastically (Goldstein 1980, p. 426ff). One possibility for an ergodic flow is that the volume does not retain its essential shape, but it deforms into very long attenuated
Glossary 157 strands that wind around and through the entire phase space. This structural change that goes beyond simple ergodicity is called "mixing" and it is the next level up in the hierarchy of stochasticity. After a long period of evolutionary time, any arbitrarily small sample of the phase space will contain some of the initially defined set of points (initial conditions), which is not true in general of an ergodic flow. Obviously, mixing implies ergodicity but not vice versa. It was not until 1963, when Sinai proved that a system of hard spheres possessed the property of mixing (mixing entails ergodicity) that anyone had proved ergodicity for a Hamiltonian system. And there is still some incredulity about the completeness of this proof. See Ford (1974), Arnold and Avez (1968), or Sinai (1963). Exponential Separation. Henon and Heiles discovered the very important property of the exponential separation of pair orbits. They found this behavior to be a good way to map the phase space of the stochastic region as they tried to determine where there were island chains and where there were ergodic trajectories. The ergodic trajectories were easily determined because they would separate from one another exponentially, whereas the trajectories on the island chains would separate only linearly. This property later became important as the most reliable indicator of deterministic chaos. Hamiltonian. There is an interlocked set of terms connected by the name of William Rowan Hamilton. Hamiltonian system refers to a dynamical system in which the quantity represented by the Hamiltonian function is conserved. The Hamiltonian is usually a statement of the total energy of the system, but this consequence depends upon the coordinate system used for the representation and whether or not time is represented explicitly in the problem. If the Hamiltonian contains velocity-dependent potentials (such as the classical magnetic field), then it is not the total energy. Usually a Hamiltonian may be transformed into one that does not contain these potentials, and so does not depend explicitly on the time variable. In this way, the Hamiltonian becomes a statement of the conserved energy and defines the constant energy surface. Hamiltonian dynamics usually refers to the formalism of conservative dynamics in which the second-order differential equations of motion, originally derived by Lagrange, are converted into pairs of first-order differential equations. The representations are equivalent, but the Hamiltonian method lends itself to representation in phase space and sets up an elegant system of paired canonical variables; position and momentum are set on equal footing in Hamiltonian dynamics. Using this formalism, Hamilton was able to show the equivalency between the principle of least time in optics and the principle of least action in dynamics. The Hamiltonian formalism was later adapted as the basis dynamical theory of quantum mechanics.
158 Glossary Homoclinic Tangle. This term functions as a touchstone or keyword in our work, taking us well out of our depth in terms of topological dynamics. The adjective homoclinic is an English variation on the term homoclinous, used by Poincare. It refers to the "feeding back into itself" behavior of a certain unstable manifold (homoclinic orbit), which we can associate with the separatrix in a dynamical system. Under certain very interesting conditions, the separatrix structure of a dynamical system can demand the existence of an infinite sequence of these points in a bounded region of phase space. The arrangement of these points forms the famous homoclinic tangle discovered by Poincare in the final chapter of the final volume of his magnum opus (1890). It is the infinitely complicated lattice that he would not even begin to draw. Because these orbits are unstable, they tend to repel trajectories from them, and in an infinitely complex web of these orbits, the trajectory can behave in a wildly erratic manner. This behavior makes it impossible to track such a trajectory in simulation, and so this web is one source of deterministic chaos. In a perturbed Hamiltonian dynamical system that is subject to the KAM theorem, the regions bounded between the mini-tori and the preserved conditionally periodic tori are filled with the homoclinic tangle. A brief but interesting explanation of the homoclinic tangle can be found in Appendix 1 of Ekeland (1988). Integrable. When the set of differential equations of a particular model can be integrated analytically, the model is said to be integrable. Integrable models possess a complete set of integrals of the motion, and therefore cannot be ergodic. The integrals of the motion lock the trajectories into determinable orbits, so there can be no chaos. Although related to nonlinearity, nonintegrable is the more significant characteristic of a dynamical model. Nonintegrability usually is a result of nonlinear terms in the equations of the model, but nonlinearity itself is not sufficient to guarantee nonintegrability. Although it is true that numerical integration is truly a form of integration, we can numerically integrate equations that are nonintegrable (analytically). Analytical integration means using the calculus to obtain a closed-form solution to the motion for any future time. If we can find a canonical transformation that eliminates all position coordinates from the statement of the Hamiltonian, and so make it a function of only momenta, then we know that those momenta will all be constants of the motion. In effect, the system will be completely solved and the motion will be completely predictable. This special situation is called Liouville integrability. In Hamiltonian mechanics it is necessary to know only n constants of the motion to specify completely the evolution exactly, even though there are 2n initial conditions. Integral of the Motion. If the Hamiltonian function for the system is independent of any state variables, then the associated conjugate vari-
Glossary 159 able may be integrated directly to obtain a constant of the motion, whose value is determined completely by the initial conditions. The term constant of the motion means that one of the n momenta are conserved all during the motion. In the associated phase space, each integral of the motion constrains one dimension of the system trajectory to a fixed surface, just like a point constrained to move around the surface of a circle with a constant radius. A Hamiltonian dynamical system with n degrees of freedom has a 2ra-dimensional phase space and up to 2n — 1 integrals of the motion. If all possible integrals of the motion exist, they fix the system's trajectory to a well-defined curve—the intersection of all the surfaces of constraint. These constants are often known as first integrals of the motion, because they each result from a single analytical integration of Hamilton's equations. If an integral of the motion exists, then it is implicate in the model— it does not matter if we are able to actually perform the integration or not. When we simulate the motion of a dynamical system, we may discover the existence of a surface of constraint in the behavior of the trajectories and so discover the existence of a corresponding integral of the motion. Mapping. This noun arises in topology from the action "to map an element of a set into another set." A mathematical function is a special case of a mapping, in that a number x is mapped into another number y. Both numbers are of the same set, the real numbers, and so we can speak of mapping a set into itself. A simulated trajectory is also a mapping, in that the dynamical equations map the system state from one point into the next. The equations of motion provide the formula for mapping phase space into itself. The term mapping is used to distinguish this topic from cartography, the making of maps, where a map represents a territory and mapping is a verb. A mapping is the set of equations for making the transformation between one set and another. One of the central topics of discussion in topology is the directional properties of mappings and the structure of the sets on either end. A surface of section is often generated using a mapping; instead of taking a single trajectory step by step, we can take a large region or bundle of trajectories step by step. Mapping finds itself in good company with simulation, because the discrete approximation procedure of numerical integration may be considered to be a form of mapping. Mode. Sinusoidal standing wave patterns that fit exactly n waves on the string between the ends, an infinite series of these modes with decreasing amplitude and increasing frequency can be summed to model the shape of any continuous string. A finite subset of the full infinite series can be used to approximate any disturbance up to any degree of accuracy necessary. Often called either normal modes or Fourier modes. The
160 Glossary process of composing or decomposing a function into the series of modes is known as Fourier analysis. Nonlinear. There are two distinct uses of this adjective in mathematics. Functionally nonlinear is when an algebraic equation has terms that are not linear—that is, they express a nonproportional relationship, such as y = x2. This is the sense implied when we say that the Hamil- tonian has nonlinear terms, such as all the perturbation terms in the FPU simulation. This is also what we mean when we speak of nonlinear dynamics. In this same sense, linear thinking means to make predictions based on a directly proportional relationship. On the other hand, differential equations are said to be nonlinear when two solutions to the same equation or set of equations, when added together, is not a solution to that same equation or set of equations. Phase Space. Phase space is a special subset of state space, wherein each pair of canonically conjugate variables (position and momentum) occupies one linked pair of dimensions in the space. Thus the phase space of a dynamical system with n degrees of freedom, having n position variables, must have n pairs of dimensions. A phase plane is a special cross- section of phase space representing a single pair of conjugate variables. The geometry of phase space is called symplectic geometry. Simulation. Simulations provide us with a way to analyze the properties of models without resorting to the very difficult techniques of perturbation theory. "Simulation in dynamics," means the entire process of numerically integrating a set of differential equations on a computer and displaying graphically for analysis the sequential solution states as an ordered set of points in a phase-space portrait. As these points unfold, we implicitly understand the successive dots to represent the evolution of a trajectory, or, in the case of FPU, the evolution of the normal-mode energy distribution. By observing the behavior of this unfolding trajectory, we infer properties of the model. State Space. A mathematical space spanned by Cartesian coordinates, state space may have any number of dimensions; but one point (set of coordinates) in state space must represent a single unique complete configuration of a dynamical system at a single point in time. The n- dimensional coordinates of a single point is a collection of numbers that holds one possible combination of the values of the n system variables. Any trajectory is a path through state space that defines the evolution of the system's configuration of variables. Stochasticity. This term has a lot of ambiguity about it. Many workers use stochastic to mean "that a trajectory wanders more or less randomly over part or most of the energy surface." This definition is the same as some aspects of ergodic on the energy surface; however, the terms
Glossary 161 are not interchangeable because ergodic has some very technical and specific requirements. The behavior defined above also seems to describe what later came to be known as deterministic chaos, but again there is a problem with making a strict substitution, bacause the latter term can be quite localized, such as being locked between two KAM curves. Deterministic chaos does not mean that a trajectory is free to wander the entire energy surface. The following definition seems less restrictive, yet useful when dealing with this term: a sequential and usually discrete process is stochastic when it involves a random selection (movement or decision) at each step in the sequence. Stochastic processes can result in a totally random sequence—such as white noise or random walk—or they can see spontaneous order emerge from the sequence. The deterministic chaos of dynamics is often called stochastic because simulation is a stepwise process that introduces randomness at each step—the true solution of the equations of motion are changing direction too rapidly for the size of the time step. Ergodic orbits appear to be stochastic in simulation. Structural Stability. The question of whether different models within a particular model type may be expected to exhibit similar behavior falls under the category of structural stability, an area of increasing importance in working with dynamical models. Abraham (1967) states the problem as follows: "If a dynamical system X has a known phase portrait P, and is then perturbed to a slightly different system X' (for example, changing the coefficients in its differential equation slightly), then is the new phase portrait P' close to P in some topological sense? This problem has an obvious importance, as in practice the qualitative information obtained for P is to be applied not to X, but to some nearby system X', because the coefficients of the equation are to be determined experimentally, and therefore, approximately." One extensive approach to formalizing the science of structural stability has been expounded by Thorn (1975). Surface of Section. Usually associated with the name Poincare, who invented them, the surface of section is a Cartesian plane—a very special two-dimensional cross-section of phase space, in which a state variable, such as position, and its conjugate momentum, define the two axes. Phase space is a 2n-dimensional space, and so there are n possible phase planes or surfaces of section. However, there is no specific requirement that a cross-section must be a phase plane; but, because of the special pairwise relationship between canonically conjugate variables, it is usually beneficial to select them this way. Surfaces of section lend themselves to very nicely to simulation, which needs a useful graphical interface with human experimenters.
162 Glossary This diagram shows what to expect from a surface of section in action-angle variables (on a torus). Thermal Equilibrium. Usually interpreted as a consequence of the second law of thermodynamics, many dynamical systems (usually particles) tend to move spontaneously toward thermal equilibrium, the state in which all particles have, statistically, the same amount of energy. For the discrete string with all of its energy initially configured in a single mode, it was expected that, as the string vibrated, its energy would disperse to all the modes, because the modes were connected by nonlinear terms added to the linear model. This dispersion tends to spread the energy out among all the modes until they all have equal amounts of energy. In the discrete string, the number of modes is finite, so it was expected that this equipartition of energy would occur fairly quickly.
References Abraham, R. (1967), Foundations of Mechanics, The Mathematical Physics Monograph Series, 1st edn, W.A. Benjamin, New York. Arnold, V. I. (1961), 'Generation of almost periodic motion from a family of periodic motions', Soviet Math. Dokl. 2, 501-503. Arnold, V. I. (1962), 'On the behavior of an adiabatic invariant under slow periodic variation of the Hamiltonian', Soviet Math. Dokl. 3, 136-140. Arnold, V. I. (1963a), 'Proof of a theorem of Kolmogorov on the invariance of quasi-periodic motions under small perturbations', Russian Math. Surveys 18(5), 9-36. Arnold, V. I. (1963b), 'Small denominators and problems of stability of motion in classical and celestial mechanics', Russian Math. Surveys 18(6), 85-191. Arnold, V. I. (1978), Mathematical Methods of Classical Mechanics, Vol. 60 of Graduate Texts in Mathematics, Springer-Verlag, New York, translated by K. Vogtmann and A. Weinstein. Arnold, V. I. (1983), Geometrical Methods in the Theory of Ordinary Differential Equations, Springer-Verlag, New York. Translated by Joseph Sztics. Arnold, V. I. and Avez, A. (1968), Ergodic Problems of Classical Mechanics, The Mathematical Physics Monograph Series, W.A. Benjamin, New York. Bai-Lin, H. (1984), Chaos, World Scientific, Singapore. Balescu, R. (1956), Bull. Acad. Roy. Belg. CI. Sci. 42.
164 References Balescu, R. (1975), Equilibrium and Nonequilibrium Statistical Mechanics, Wiley Interscience Series, Wiley, New York. Bateman, H. (1945), Hamilton's work in dynamics and its influence on modern thought, in 'A Collection of Papers in Memory of Sir William Rowan Hamilton', Scripta Mathematica, New York, pp. 51-63. Bateson, G. (1972), Steps to an Ecology of Mind, Chandler, Toronto. Berry, M. V. (1978), 'Regular and irregular motion', AIP Conf. Proc. 46, 16-120. Birkhoff, G. D. (1913), 'Proof of Poincare's geometric theorem', Amer. Math. Soc. Transl. 14, 14-22. Birkhoff, G. D. (1917), 'Dynamical systems with two degrees of freedom', Trans. Amer. Math. Soc. 18, 199-300. Birkhoff, G. D. (1931a), Troof of a recurrence theorem for strongly transitive systems', Proc. Nat. Acad. Sci. USA 17, 650-655. Birkhoff, G. D. (19316), Troof of the ergodic theorem', Proc. Nat. Acad. Sci. USA 17, 656-660. Birkhoff, G. D. (1966), Dynamical Systems, Vol. IX of American Mathematical Society Colloquium Publications, 2nd edn, American Mathematical Society, Providence, RI. First Published 1927. Bruns, H. (1887), Acta Math. XI, 25. Capek, M. (1961), The Philosophical Impact of Contemporary Physics, Van Nostrand, Princeton, NJ. Capek, M. (1987), The philosophical significance of piaget's researches on the genesis of the concept of time, in A. Shimony and D. Nails, eds, 'Naturalistic Epistemology, A Symposium of Two Decades', Reidel, Dordrecht, pp. 91-118. Vol. 100 of the Boston Studies in the Philosophy of Science series. Cartwright, N. (1983), How the Laws of Physics Lie, Clarendon Press, Oxford. Chirikov, B. V. (1960), 'Resonance processes in magnetic traps', Plasma Phys. 1, 253-260. Chirikov, B. V. (1979), 'A universal instability of many-dimensional oscillator systems', Phys. Rep. 52(5), 263-379. Chirikov, B. V., Izrailev, F. M. and Tayursky, V. A. (1973), 'Numerical experiments on the statistical behavior of dynamical systems with a few degrees of freedom', Comput. Phys. Comm. 5, 11-16.
References 165 Cooper, N. G., ed. (1989), From Cardinals to Chaos: Reflections on the Life and Legacy of Stanislaw Ulam, Cambridge University Press, Cambridge. Dragt and Finn (1976), 'Insolubility of trapped particle motion in a magnetic dipole field', J. Geophys. Rev. 81, 2327. Duhem, P. (1954), The Aim and Structure of Physical Theory, Princeton University Press, Princeton, NJ. Translated by Philip P. Wiener. Eckhardt, B., Ford, J. and Vivaldi, F. (1984), 'Analytically solvable dynamical systems which are not integrable', Physica D 13, 339-356. Ekeland, I. (1988), Mathematics and the Unexpected, University of Chicago Press, Chicago. Fermi, E. (1923), Beweis, dass ein mechanisches normalsystem im allge- meinen quasi-ergodisch ist, in Segre (1965), pp. 79-87. Article published in Phys. Zeits 24:261-265. Fermi, E., Pasta, J. and Ulam, S. (1955), Studies of nonlinear problem^, in Segre (1965), chapter 266, pp. 977-988. Although written in 1954, this article first appeared in this volume in 1965. Fomenko, A. (1988), Integrability and Nonintegrability in Geometry and Mechanics, Mathematics and Its Applications, Kluwer Academic, Dordrecht, translated by M. V. Tsaplina. Ford, J. (1961), 'Equipartition of energy for nonlinear systems', J. Math. Phys. 2(3), 387-393. Ford, J. (1974), The statistical mechanics of classical analytic dynamics, in E. G. D. Cohen, ed., 'Fundamental Problems in Statistical Mechanics III', North-Holland, Amsterdam, pp. 215-255. Proceedings of the International Summer School on Fundamental Problems in Statistical Mechanics III. Ford, J. (1992), 'The fermi-Pasta-Ulam problem: Paradox turns discovery', Phys. Rep. 213(5), 271-310. Ford, J. and Lunsford, G. (1970), 'Stochastic behavior of resonant nearly linear oscillator systems in the limit of zero nonlinear coupling', Phys. Rev. A 1(1), 59-70. Ford, J., Stoddard, D. and Turner, J. (1973), 'On the integrability of the Toda lattice', Prog. Theoret. Phys. 50(5), 1547-1560. Ford, J. and Waters, J. (1963), 'Computer studies of energy sharing and er- godicity for nonlinear oscillator systems', J. Math. Phys. 4(10), 1293- 1306.
166 References Franklin, A. (1986), The Negleet of Experiment, Cambridge University Press, Cambridge. Franklin, A. (1990), Experiment, Right or Wrong, Cambridge University Press, Cambridge. Galison, P. (1987), How Experiments End, University of Chicago Press, Chicago. Goldstein, H. (1980), Classical Mechanics, 2nd edn, Addison-Wesley, Reading, MA. Greene, J. M. (1968), 'Two-dimensional measure-preserving mappings', J. Math. Phys. 9(5), 760-768. Greene, J. M. (1979), 'A method for determining a stochastic transition', J. Math. Phys. 20(6), 1183-1201. Guckenheimer, J. and Holmes, P. (1986), Nonlinear Oscillations, Dynamical systems, and Bifurcations of Vector Fields, Vol. 42 of Applied Mathematical Sciences, 2nd edn, Springer-Verlag, New York. Gustavson, F. G. (1966), 'On constructing formal integrals of a Hamiltonian system near an equilibrium point', Astronom. J. 71(8), 670-686. Haar, D. T. (1954), Elements of Statistical Mechanics, Rinehart, New York. Contains English translation of Fermi's Ergodic Theorem in the Appendix. Hacking, I. (1983), Representing and Intervening, Cambridge University Press, Cambridge. Hankins, T. L. (1980), Sir William Rowan Hamilton, Johns Hopkins, Baltimore. Hausdorff (1962), Set Theory, Chelsea, New York. Hemmer, P. (1959), Dynamic and Stochastic Types of Motion in the Linear Chain, PhD thesis, Trondheim. Henon, M. (1974), 'Integrals of the Toda lattice', Phys. Rev. B9, 1921. Henon, M. and Heiles, C. (1964), 'The applicability of the third integral of motion: Some numerical experiments', Astronom. J. 69(1), 73-79. Infeld, E. (1981), 'Quantitive theory of the Fermi-Pasta-Ulam recurrence in the nonlinear Schrodinger equation', Phys. Rev. Lett. 47(10), 717-718. Izrailev, F. M. and Chirikov, B. V. (1966), 'Statistical properties of a nonlinear string', Soviet Phys. Dokl. 11(1), 30-32. Translated from Doklady Akademii Nauk SSSR, Vol. 166, No. 1, 1965.
References 167 Jackson, E. A. (1963a), 'Nonlinear coupled oscillators I: Perturbation theory; ergodic problem', J. Math. Phys. 4(4), 551-558. Jackson, E. A. (1963b), 'Nonlinear coupled oscillators II: Comparison of theory with computer simulations', J. Math. Phys. 4(5), 686-700. Jackson, E. A., Pasta, J. R. and Waters, J. F. (1968), 'Thermal conductivity of one-dimensional lattices', J. Comput. Phys. 2, 207-227. Jeffreys, W. H. (1966), Astronom. J. 71, 306. Kellert, S. H. (1993), In the Wake of Chaos, University of Chicago Press, Chicago. Khinchin, A. I. (1949), Mathematical Foundations of Statistical Mechanics, Dover, New York, translated by G. Gamow. Khintchine, A. Y. (1963), Continued Fractions, Noordhoff, Groningen, The Netherlands. Translated by Peter Wynn. Kolmogorov, A. N. (1954), Preservation of conditionally periodic movements with small change in the Hamiltonian function, in Chaos (Bai- Lin, 1984), pp. 81-86. From Los Alamos Scientific Laboratory Translation LA-TR-71-67 by Helen Dahlby. Originally: Akad. Nauk. S.S.S.R., Doklady 98, 527 (1954). Kolmogorov, A. N. (1958), 'The general theory of dynamical systems and classical mechanics', English Translation in Appendix D in Ralph Abraham, Foundations of Mechanics. Address to the 1957 International Congress of Mathematicians in Russian. Kroes, P. (1985), Time: Rs Structure and Role in Physical Theories, Vol. 179 of Synthese Library, Reidel, Dordrecht. Kruskal, M. D. and Zabusky, N. J. (1964), J. Math. Phys. 5, 231. Kryloff and Bogoliuboff (1947), Introduction to Nonlinear Mechanics, Edwards Brothers, Ann Arbor, MI. Landau, L. D. and Lifshitz, E. M. (1976), Mechanics, Vol. 1 of Course of Theoretical Physics, 3rd edn, Pergamon Press, Oxford. Translated J. B. Sykes and J. S. Bell. Lichtenberg, A. J. and Lieberman, M. A. (1983), Regular and Stochastic Motion, Vol. 38 of Applied Mathematical Sciences, Springer-Verlag, New York. Lucretius (1965), On the Nature of the Universe, Ungar, New York. Verse translation by James H. Mantinband, (circa 65 B.C.E.).
168 References Lunsford, G. and Ford, J. (1972), 'On the stability of periodic orbits for nonlinear oscillator systems in regions exhibiting stochastic behavior', J. Math. Phys. 13(5), 700-705. MacKay, R. S. and Meiss, J. D., eds (1987), Hamiltonian Dynamical Systems, Adam Hilger, Bristol. Marion, J. B. (1970), Classical Dynamics of Particles and Systems, 2nd edn, Academic Press, New York. Melnikov, V. K. (1963), 'On the stability of the center for time periodic perturbations', Trans. Moscow Math. Soc. 12, 1-57. This is a translation of Dokl. Akad. Nauk. SSSR, 144 (1962) and 148 (1963). Moser, J. (1962), 'On invariant curves of area-preserving mappings of an annulus', Nachr. Akad. Wiss. Gottingen Math. Phys. Kl. 2, 1-20. Moser, J. (1973), Stable and Random Motions in Dynamical Systems, Vol. 77 of Annals of Mathematics Studies, Princeton University Press, Princeton, NJ. The Hermann Weyl Lectures at The Institute for Advanced Study. Nicolis, G. and Prigogine, I. (1989), Exploring Complexity, W. H. Freeman, New York. Peterson, K. (1983), Ergodic Theory, Vol. 2 of Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge. Poincare, H. (1890), 'Sur les equations de la dynamique et le probleme de trois corps', Acta Math. 13, 1-270. Poincare, H. (1899), Les Methodes Nouvelles de la Mecanique Celeste,Vo\. I—III, Gauthier-Villars, Paris. The three volumes separately: 1890, 1892, 1899. Poincare, H. (1913), The Foundations of Science: Science and Hypothesis, The Value of Science, and Science and Method, Vol. 1 of Science and Education, Science Press, New York, translated by George Bruce Halsted. Poincare, H. (1963), Mathematics and Science: Last Essays, Dover, New York, translated by John W. Bolduc. Prigogine, I. (1980), Prom Being to Becoming: Time and Complexity in the Physical Sciences, W. H. Freeman, San Francisco. Prigogine, I. and Stengers, I. (1984), Order Out of Chaos, Bantam, New York.
References 169 Rannou, F. (1973), 'Numerical study of discrete plane area-preserving mappings', Astronom. Astrophys. 31, 289-301. Rayleigh, J. W. S. (1945), The Theory of Sound, 2nd edn, Dover, New York. Originally published in 1894. Reichenbach, H. (1958), The Philosophy of Space and Time, Dover, New York, translated by Maria Reichenbach and John Freund, Originally published in 1927 in German as Philosophic der Raum Zeit-Lehre. Sagdeev, R. Z., Usikov, D. A. and Zaslavsky, G. M. (1988), Nonlinear Physics: From the Pendulum to Turbulence and Chaos, Vol. IV of Contemporary Consepts in Physics, Harwood Academic, Chur. Translated by Igor R. Sagdeev. Saito, N. and Hirooka, H. (1967), 'Computer studies of ergodicity in coupled oscillators with anharmonic interaction', J. Phys. Soc. Japan 23(2), 167-171. Schmidt, G. and Bialek, J. (1982), 'Fractal diagrams for hamiltonian stochasticity', Physica D 5, 397-104. Scott, A. C., Chu, F. Y. F. and McLaughlin, D. W. (1973), 'The soliton: A new concept in applied science', Proc. IEEE 61(10), 1443-1483. Segre, E., ed. (1965), Collected Papers of Enrico Fermi, Vol. 2, University of Chicago Press, Chicago. Siegel, C. L. (1942), 'Iteration of analytic functions', Ann. of Math. 43(4), 607-612. Siegel, C. L. (1956), Vorlesungen iiber Himmelsmechanik, Springer-Verlag, Berlin. Sinai, Y. G. (1963), 'A system of hard spheres is ergodic and mixing', Soviet Math. Dokl. 4, 1818. See also Russian Math. Surv. 25, 137 (1970). Stewart, I. (1989), Does God Play Dice?, Blackwell, Oxford. Synge, J. L. (1945), The life and early work of Sir William Rowan Hamilton, in 'A Collection of Papers in Memory of Sir William Rowan Hamilton', Scripta Mathematica, New York, pp. 13-24. Tabor, M. (1989), Chaos and Integrability in Nonlinear Dynamics, Wiley, New York. Thorn, R. (1975), Structural Stability and Morphogenesis, Benjamin, Reading, MA.
170 References Toda, M. (1967), 'Vibration of a chain with nonlinear interaction', J. Phys. Soc. Japan 22(2), 431-436. Tuck, J. L. and Menzel, M. T. (1972), 'The superperiod of the nonlinear weighted string (FPU) problem', Adv. in Math. 9, 399-407. Turner, F. (1978), Poiesis: Time and artistic discourse, in J. T. Fraser, ed., 'The Study of Time III', Springer-Verlag, New York, pp. 614-633. Proceedings of the Third Conference of the International Society for the Study of Time. Ulam, S. M. (1960), A Collection of Mathematical Problems, Vol. 8 of In- terscience Tracts in Pure and Applied Mathematics, Interscience, New York. Ulam, S. M. (1976), Adventures of a Mathematician, Charles Scribner's Sons, New York. van Frassen, B. C. (1980), The Scientific Image, Clarendon Press, Oxford. Walker, G. H. and Ford, J. (1969), 'Amplitude instability and ergodic behavior for conservative nonlinear oscillator systems', Phys. Rev. 188(1), 416-432. Waters, J. and Ford, J. (1966), 'A method of solution for resonant nonlinear coupled oscillator systems', J. Math. Phys. 7(2), 399-403. Whittaker, E. T. (1937), A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, 4th edn, Cambridge University Press, Cambridge. The first edition was published in 1904. Wiggins, S. (1988), Global Bifurcations and Chaos, Vol. 73 of Applied Mathematical Sciences, Springer-Verlag, New York. Zabusky, N. J. (1962), 'Exact solution for the vibrations of a nonlinear continuous model string', J. Math. Phys. 3(5), 1028-1039. Zabusky, N. J. (1963), Phenomena associated with the oscillations of a nonlinear model string: The problem of Fermi, Pasta, and Ulam, in S. Drobot and P. A. Viebrock, eds, 'Mathematical Models in Physical Science', Conference at the University of Notre Dame, 1962, Prentice- Hall, Englewood Cliffs, NJ, pp. 99-133. Zabusky, N. J. (1967), A synergetic approach to problems of nonlinear dispersive wave propagation and interaction, in W. Ames, ed., 'Nonlinear Partial Differential Equations: A Symposium on Methods of Solution', University of Delaware, Academic Press, New York, pp. 223-258. Zabusky, N. J. (1971), Phys, Rev. Lett. 27, 1774.
References 171 Zabusky, N. J. (1973), 'Solitons and energy transport in nonlinear lattices', Comput. Phys. Comm. 5, 1-10. Zabusky, N. J. and Deem, G. (1967), 'Dynamics of nonlinear lattices: Localized optical excitations, acoustic radiation, and strong nonlinear behavior', J. Comp. Phys. 2, 126-153. Zabusky, N. J. and Kruskal, M. D. (1965), 'Interaction of "solitons" in a collisionless plasma and the recurrence of initial states', Phys. Rev. Lett. 15(6), 240-243.
Index action-angle variables 53, 56, 70, 88, 146, 147 amplitude instability 93 Arnold, V.I. 53 asymptotic series 58, 60, 64 Campbell, D. 30 canonical conjugates 14 coordinates 143 equations 146 transformations 58, 62, 79, 141 variables 52, 140 chaos, see deterministic chaos Chirikov, B.V. 85 closed curve 148 commensurate 59, 152, 155 commensurate frequencies, see frequencies conditionally periodic 26, 54, 59, 61, 65, 66, 68, 70, 71, 75, 88, 93, 97, 117, 149, 152 conjugate momentum 142 variables 140, 142, 155 constant of the motion, see integral of the motion continuous limit 44, 45 string 21, 43 trajectory 140 control parameter 108 critical parameter 88 cyclic, see Hamiltonian, cyclic Deem, G. 89 degree of freedom 93, 94, 140, 147 deterministic chaos 16, 43, 71, 115, 123, 156 dispersion 19 divisors, small 34, 57, 59, 62, 71 dynamical systems theory 26, 105 efficacy 116, 121, 132, 133 energy dissociation 99 sharing 25 surface 13, 52, 55, 73, 99, 147 ensembles 11 epistemology 113, 131 equations of dynamics 57 of motion 53, 140 equipartition 23, 41, 89
174 Index ergodic hypothesis 110, 117, 125, 134 ergodicity 13, 14, 35, 41, 56, 75, 93, 99, 116, 156 experimental mathematics 26 exponential separation 97, 100, 114, 121, 132, 133, 157 Fermi Enrico 9, 27 theorem 55, 111 fixed point 98, 140 Ford, J. 27,33 Fourier mode, see mode FPU approximation 18 asKAM 68 change of variables 19 chaos in 25 configuration 21 Hamiltonian 19, 54, 57, 99 initial conditions 22 lattice 74, 99 local behavior 21 model 56, 130 parameters 18, 21 quadratic terms 22 research program 32, 72, 101 retrodicting 4 simulation 11, 21, 57, 106, 110, 130 time step 21 unperturbed model 18 fractal 68, 98 frequencies commensurate 54, 66, 94, 149 linearly-dependent 34, 149 normal mode 19 resonant 59 unperturbed 94 generalized coordinates 139, 141, 142 Gustavson, F.G. 79 Hamilton, William Rowan 52 Hamiltonian 157 cyclic 143 dynamics 13, 33, 139, 143 function 105, 140 system 92 unperturbed 17, 141 harmonic motion of the string 142 oscillators 17, 56, 58, 73, 78, 143 Henon and Heiles 72-81, 100 heuristic 131 homoclinic orbits 66, 88 point 60, 77 tangle 94, 98, 99, 158 Hooke's law 17 information entropy 116 initial conditions 70, 143, 145 integrable 10, 13, 55, 58, 71, 158 integral of the motion 13, 21, 35, 42, 52, 54-57, 72, 74, 94, 110, 118, 135, 142, 143, 158 islands of order 93, 97, 118, 137 isolating 11, 23 iteration 21, 97 Izrailev, F.M. 85 Jackson, E.A. 37 KAM behavior 99 curves 98, 99, 108 explanation of FPU 71 instabilities 93 mini-tori 66, 67-71, 77, 88 stability 86, 94, 100, 109 theorem 4, 30, 41, 57, 64, 65, 92, 111, 133 tori 68, 72, 77, 88, 93, 98 Kolmogorov, A.N. 52, 61 Korteweg-de Vries 46, 99 Kruskal, M.D. 43
Index 175 Lagrange, J. 51, 139 level curves 95 Liouville integrability 143 Joseph 52 Lissajous, Jules 152 Lunsford, G.H. 93 mapping 159 measure theory 94 mechanics celestial 55, 57 classical 60, 102 Hamiltonian 58 Newtonian 102 statistical 12, 94, 98, 109 Menzel, M. 27 mode 159 acoustic 89 energy 20, 89, 145 Fourier 10 normal 10, 19, 21, 89, 141, 147 optical 86, 89 models in dynamics 107 Moser, J. 53 Newton's method of tangents 64 noise in a simulation 115 nondegeneracy condition 57, 78 nonlinear 160 normal mode, see mode numerical methods 100 open curve 148 oscillatory motion 142 overlap criterion, see resonance overlap partial differential equations 45 Pasta, J. 9 periodic boundary conditions 47 motion 148 orbits 77, 116 tori 33 trajectory 141 periodicity 21, 25 perturbation 92 parameter 108 theory 18, 32, 37, 52, 55, 57, 58, 59, 60, 71, 93, 124, 130 phase space 11, 14, 21, 52, 75, 98, 140, 160 Poincare Henri 10, 52, 92 recurrence 38 theorem 52, 55-57, 110 precision 152 pursuit 125 quasi-periodic 26, 28, 86 rational numbers 59, 68 resolution, graphical 152 resonance condition 41, 62, 93 orbits 66 overlap 85, 94, 95, 97, 109 primary 42 road map 2 round-off error 29, 116, 127 separatrix 88 series expansion 18, 99, 110 simulation 71, 106, 160 in FPU, see FPU Tuck and Menzel 27 sinusoidal motion 143 soliton 4, 30, 46, 91 state space 160 stochasticity 11, 12, 86, 89, 94, 95, 97, 98, 109, 160 Stoddard, D. 100 structural stability 108, 161 substitution of variables 145 super-period 28, 32, 70 surface of section 75, 77, 93, 97, 108, 116, 161 symplectic geometry 140 system trajectory 147
176 Index thermal equilibrium 11, 110, 162 three-body problem 17, 54, 82 time average 25 Toda lattice 74, 78, 82, 99, 101, 108, 128, 136 topological methods 52 transformations 66 tori conditionally-periodic 33 dynamics on 147 preserved 71 trajectory periodic 141 stochastic 72 system 147 true solution 105, 112, 124, 132 Tuck, J.L. 27 Turner, J.S. 100 two-body problem 14, 51 Ulam, S. 9 variables canonical, see canonical variables conjugate, see conjugate variables variational principle 146 Walker, G. 92 Waters, J. 41 wave equation 44 nonlinear 46 standing 21 Zabusky, N.J. 43