The Genesis of Simulation in Dynamics: Pursuing the Fermi–Pasta–Ulam Problem - Weissert T.P.

Автор: Weissert T.P.

Теги: mathematical physics mathematical modeling computer modeling nonlinear dynamics

ISBN: 0-387-98236-1

Год: 1997

Похожие

Computer Simulation of Dynamic Phenomena

Chaos and Integrability in Nonlinear Dynamics: An Introduction

Classical dynamics a contemporary approach.

Applied Nonlinear Dynamics: Analytical, Computational and Experimental Methods

Текст

The Genesis of Simulation in Dynamics

Springer
New York
Berlin
Heidelberg
Barcelona
Budapest
Hong Kong
London
Milan
Paris
Santa Clara
Singapore
Tokyo

Thomas P. Weissert
The Genesis of Simulation
in Dynamics
Pursuing the Fermi-Pasta-Ulam Problem
With 63 Illustrations
Springer

Thomas P. Weissert
The Agnes Irwin School
Rosemont, PA 19010
USA
Library of Congress Cataloging-in-Publication Data
Weissert, Thomas P.
The genesis of simulation in dynamics : pursuing the Fermi-Pasta-
Ulam problem / Thomas P. Weissert.
p. cm.
Includes bibliographical references and index.
ISBN 0-387-98236-1 (alk. paper)
1. Dynamics — Computer simulation. 2. Dynamics — Mathematical
models. I. Title.
QC133.W45 1997
53r.ir0113-DC21 97-9272
Printed on acid-free paper.
© 1997 Springer-Verlag New York, Inc.
All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New
York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly
analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or
hereafter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even
if the former are not especially identified, is not to be taken as a sign that such names, as
understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely
by anyone.
Production managed by Timothy Taylor; manufacturing supervised by Jacqui Ashri.
Camera-ready copy prepared from the author's L^I^jX files.
Printed and bound by R.R. Donnelley and Sons, Harrisonburg, PA.
Printed in the United States of America.
987654321
ISBN 0-387-98236-1 Springer-Verlag New York Berlin Heidelberg SPIN 10575007 (Hardcover)
ISBN 0-387-98237-X Springer-Verlag New York Berlin Heidelberg SPIN 10575017 (Softcover)

The tree that is nurtured
can not be wrought.
Our Mum and Dad conceived;
summer solstice they got.
Sister planted that seed;
she brought my mind to light.
Then my wife, ah my life,
she brings my heart delight.
My Lizzy, June flower,
my purpose is complete.
Oh joy, that tree is me.
This is for She.

Preface
We hear a lot about dynamical systems theory these days—"chaos
dynamics" is the popular idiom—but very little of this abstract discipline is
taught in undergraduate or even in graduate school, unless that happens
to be one's own field. People may have heard some of the terms, but they
may not understand what a dynamicist does. Just what does a dynamicist
do?
Although I've been through the training, I spend much more time
talking and thinking about dynamics than actually doing it, which is a
circumstance that affords me an interesting perspective. Some time after defending
my dissertation, when I stopped back by my alma mater on a visit through
Boulder, I ran into one of my advisors, a practicing dynamicist, and he was
excited to show me his new workstation and its resident software.
Hopefully, by relating what I experienced then, I may in part help to answer my
opening question.
He sat down in front of a very large screen, at least 19 inches across, so
closely, that it filled his field of view completely. At his fingertips were the
usual mouse and keyboard. He fired up the software and chose a model;
I think we were out to explore "just the standard map." As the program
began, the screen remained black, except for a small blinking cursor at the
middle. Resting the fingers of one hand on the keyboard, and the other
on the mouse, he tapped the keys, slid the mouse, and clicked its button.
Several streamlines appeared on the screen immediately, and they began to
flow, extending outward, waving—just colored lines on a black background,
squirming and dancing. As he swept the mouse across the pad while rapidly
clicking the button, the almost invisible cursor arc'd across the screen,
initializing new streamlines as it went. The new lines replaced some of
the others, as those decayed away, like the red trail of a flame moving in
the dark. I began to see the screen as a window onto a flowing fluid, its
undulating texture made manifest by the waving streamlines.

viii Preface
Without taking his eyes off the screen, he named the features as they
flowed by—islands of order in a sea of chaos, abstract landscapes in a
virtual realm. Every so often, he would type a key and so bump the control
parameter; the scene changed, eddies appeared, and the swirling became
more intense. I thought about vorticity, curl, and electromagnetic fields. I
saw the development of attractors, knots in the streamlines, point sources,
and singularities; they would come and go, becoming unstable as the
parameter changed. I saw catastrophe and understood about structural stability.
More sweeps, more lines; he created more whenever he needed more detail.
Out of nowhere a pair of stable and unstable fixed points emerged together
and begin to spread apart, appearing to me as I had always imagined pair-
production out of the quantum vacuum might look. I began to see all kinds
of abstract theorems and structures enacted across that screen.
But mostly what I saw was a dynamicist exploring phase space with
his senses; he watched it flow past, yet he controlled the flow. By moving
his hands, he navigated through virtual space; it was simulation in real
time. I sat back and watched the beginnings of virtual reality; but not an
imitation reality like the CD-ROM games you get now. This reality was
the abstract Cartesian space where points and lines represented evolving
dynamical systems. Yet by immersing his senses, sight and touch, the
dynamicist was climbing into phase space and taking it for a ride. This activity
is just part of what a dynamicist does these days; and so just maybe, by
going back to the beginning, we can see a little of how dynamical systems
theory came to be.
Acknowledgments. I would like to thank the following people who were
instrumental to this project's existence: Allan Franklin allowed me to study
philosophy in the face of zero job prospects, and he continued to encourage
me while I created and developed this project with no model to work from.
John Cary and Jim Meiss spent many long and I'm sure tedious hours
helping me understand the extremely abstract field of dynamics. Allan,
John, and Jim also read most of the contents of this book and did their best
to ensure that I got both the history and the physics right; any remaining
errors must have been inserted by me after their scrutiny. Professor Joseph
Ford was very kind and generous with his correspondence, providing me
with many references along the way. Working on the time scale with Marc
Weiss at the NIST while I researched this project was both a delight and a
significant means of support. Paul Harris sharpened my mind with countless
hours of theoretical conversations. Lois Cole gave me at least one very
nonlinear year of her life. Jo Alyson Parker taught me how to write, and
she continues to be an outstanding reader and editor. Tom von Foerster, my
editor at Springer-Verlag, believed in this project and in my ability to bring
it off. Finally, I must thank Elizabeth Parker Weissert, whose presence in
this world has reconciled me to myself.

Contents
Preface vii
List of Figures xiii
Introduction 1
Part I: History 7
1. The FPU Model and Simulation: "A Little Discovery" 9
1.1. Development 9
1.2. Dynamics to Statistical Mechanics 11
1.3. Surfaces of Constraint 13
1.4. Global Versus Local Analysis 15
1.5. Simulation 15
1.6. Loading the Nonlinear String 16
1.7. Modal Representation 19
1.8. Model Considerations 21
1.9. Results 23
1.10. Discussion Post Hoc 27
2. The FPU Research Program: Echoes on a String 31
2.1. The Threads of a Research Program 31
2.2. The Nonlinear Discrete Lattice 32
2.3. Ford, 1961 33
2.4. Jackson, 1963 37
2.5. Ford and Waters, 1963 41
2.6. The Continuous String 43
2.7. In the Continuous Limit 45
2.8. Discreteness as Viscosity 46

x Contents
2.9. The First Soliton Paper 47
3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes
the Surprise" 51
3.1. A Brief History of Dynamics 51
3.2. The Fundamental Problem of Dynamics 53
3.3. The Small Divisors Problem 57
3.4. Poincare to Kolmogorov 60
3.5. The Conjecture 64
3.6. Beyond the Blaze 66
3.7. The Henon and Heiles Simulation, 1964 72
4. Research Threads Come Together:
Harmonic Convergence 83
4.1. The Story Continues 83
4.2. Izrailev and Chirikov, 1966 84
4.3. Zabusky and Deem, 1967 89
4.4. Walker and Ford, 1969: Physical Review 91
4.5. Ford and Lunsford, 1970 93
4.6. Lunsford and Ford, 1972 97
4.7. The Toda Lattice Is Integrable 99
Part II: Philosophy 103
5. Steps to an Epistemology of Simulation 105
5.1. Introduction 105
5.2. Hierarchy of Modeling 107
5.3. Historical Significance 110
5.4. Experiment Ill
5.5. Epistemology 113
5.6. Preconceptions 117
5.7. Strategies for Belief and Pursuit 122
5.8. Case Study I: Fermi-Pasta-Ulam 125
5.9. Case Study II: Henon and Heiles 128
5.10. Methodology 129
5.11. Irreversibility 132
5.12. Proof 133
5.13. Proof and Simulation 136
Appendix
A. Hamiltonian Dynamics: Language of Abstraction 139
A.l. Topology and Phase-Space Trajectories 140
A.2. Canonical Transformations 141
A.3. Transforming the Unperturbed String 141

Contents xi
A.4. Cyclic Coordinates 142
A.5. Liouville Integrability 143
A.6. The Action-Angle Variables 143
A.7. Dynamics on a Torus 147
A.8. Commensurability: Two Types of Motion 148
A.9. Digital Representation 152
A.lO.Physical Reality and the Continuum 152
A. 11.Perturbing the String 153
Glossary 155
References 157
Index 167

List of Figures
I.l. Road Map to the FPU Research Program 3
1.1 The Discrete String 17
1.2 Normal-Mode Displacements 20
1.3 FPU-1: Quadratic Perturbation 24
1.4 FPU-9: Time-Averaged Energy Distribution 24
1.5 FPU-8: String Displacement 26
1.6 The FPU Super-Period 29
2.1 Ford's Modal Energy Plot 36
2.2 The Confused Question of Recurrence 39
2.3 From an Almost Constant of the Motion 43
3.1 Concentric Shells of Tori 63
3.2 Toroidal Phase Space in Perturbation 67
3.3 A Heuristic Explanation of FPU 69
3.4 The Henon and Heiles Original Figures 76
3.5 Reproducing Henon and Heiles Level Curves 80
4.1 Resonance Overlap and FPU 85
4.2 The Basis for Resonance Overlap 87
4.3 Initially Excited Optical Modes 90
4.4 Stochasticity and Resonance Overlap 96
5.1 The Henon and Heiles Original Figures 119
A.l Geometrical transformations 144
A.2 The 2-Torus Coordinate System 148
A.3 Single Orbits on Tori 150
A.4 Multiple Orbits on Tori 151

Introduction
Defining My Project
Although it is often referred to in the dynamics literature as a standard
reference, the Fermi-Pasta-Ulam problem (FPU) has not been
comprehensively documented anywhere in a way accessible to a nonspecialized
audience. Thus it is virtually unknown outside of dynamics; yet the FPU
problem marks a vital turning point in that field. In one of the very first
computer simulations, performed in the early 1950s, a simple nonlinear
string model exhibited behavior that completely baffled some very
prominent scientists. In effect, the FPU problem revitalized the field of nonlinear
dynamics, which had been dormant for nearly 50 years.
Classical dynamics had reached its peak by the latter part of the
nineteenth century. In fact, its very name became synonymous with an entire
scientific world view. The universe was thought to be as a mechanical clock,
or a completely deterministic book written in the language of the calculus.
Many scientists believed that all the major problems had been solved, and
that the only challenge remaining was to iron out details, which was indeed
a pretty depressing prospect. Where was the challenge in that? However,
at the very end of the nineteenth century, Henri Poincare took on the
three-body problem in his three-volume magnum opus. In those weighty
volumes, he developed the topological method of dynamics as we know it
and made great strides in the development of perturbation theory. But as
he reached the very end of the third volume, Poincare glimpsed a degree of
complexity in that low-order problem that was well beyond anything yet
encountered. The universal text of the classical paradigm had some missing
pages, text that might turn out to be indecipherable, even with the
calculus. Perturbation theory calculations demanded a great deal of time and
pedantic repetition; at the beginning of the twentieth century, any useful
result in nonlinear dynamics was well out of reach of the pencil and paper
dynamicist. Thus, these problems were shelved for half a century—until

2 Introduction
burgeoning computer technology brought the possibility of their solution
within reach.
Having worked with some of the first digital computers during the
Manhattan project, the renowned physicist Enrico Fermi recognized the
utility of their fast computational speed and relentless repetition for making
nonlinear dynamical calculations. Fermi, along with his associates John
Pasta and Stanislaw Ulam, expected that the harmonic linear string model,
perturbed with a small nonlinear term, would in simulation tend toward
equipartition of energy among the normal modes. But that outcome did
not occur; where they expected chaos, they found a new realm of order.
My self-appointed task, which was to gather the historical information
about FPU and develop it into a coherent form, itself proved to be a voyage
of discovery. When I began to study the problem, I asked two different
dynamicists what the solution was; one said "I thought it was KAM," and
the other said "wasn't it KdV?" Such divergent responses intrigued me;
clearly there was some shadow of a doubt about this so-called standard
problem. Then, as I starting reading and researching, I ran into the problem
of identifying the references. The very first such problem was that the
original FPU paper was never published in the traditional way, although
I was able to locate it in Fermi's collected works. After some thought,
I realized that I could not simply work forward in time because articles
refer to prior work only, nor could I work backward in time because of the
difficulty of finding starting points. Instead I zig-zagged back and forth in
time, tracing authors and all the prior articles to which each relevant article
referred. Ultimately I uncovered a network of related articles extending over
a 20-year period, 1953 through 1973. Beyond 1973, very little further work
was performed on the FPU model, indicating that interested parties were
satisfied by the conclusions that had been drawn, and that the leading edge
of research had moved on to new challenges.
Road Map to Research
Figure 1.1 represents the main body of publications documenting the FPU
research program. This complicated diagram functions as a road map to
our travels. I describe its basic structure here, and then treat it piecewise
throughout the coming discussion. I strongly recommend periodically
referring back to this main figure in order to see a thread within the context of
the whole program. For example, in the main figure, we see a feed from the
original Tuck and Menzel work—before its publication—over to Zabusky's
1963 article, indicating that the latter worked with full knowledge of the
former significant findings. This fact does not show up in the subfigure I
use later in the text, in the discussion of Zabusky's work.
Although they become densely interwoven in time, there are roughly five
threads running through this story, three of which began independently of
the others, and two that began as deviations from another thread. For ex-

Introduction 3
Fermi-Pasta-Ulam
Simulation 1954
Tuck-Menzel
[Simulation 1961
FPU Published
1965
Tuck-Menzel
Published 1972
(
The FPU Research Program
1954-1973
Zabusky
1962
m
Zabusky
1963
f f
Zabusky
Kruskal
1964-65
Zabusky
Deem
1967
Zabusky
1971
Zabusky
1973
iJackson
1963
Ford
1961
f ¥
Ford-Waters
1963
ft f
Waters-Ford
1966
ffffff
Walker-Ford
1969
Ford-Lunsford
1970
Lunsford-Ford
1972
[Ford, Stoddard
& Turner 1973
Pr t
>
Kolmogorov
Conjecture 1954
Moser
1962
Arnold
1963
Izrailev
Chirikov
1966
Toda
1967
Henon-Heiles
Simulation
1964
IT
Gustavson
1966
Greene
1968
? f
1974
FPU
Solitons Discrete Lattice
KAM
Figure I.l. The chronological history of the FPU research program,
including explicit inferences represented by arrows. Each box
represents a publication. There are three independent starting points.

4 Introduction
ample, Zabusky initiated his work in an attempt to solve the FPU mystery,
but his theory of solitons took on a significant life of its own. Of course
nothing happens completely in isolation, so by "independent," I mean that
these researchers were not aware of the work performed by the others at
the time of writing that particular article.
The chronology runs downward, beginning with 1954 at the top and
ending with 1973 at the bottom. Each box represents a specific contributing
journal article or sequence. The lines and arrows indicate the complicated
flow of information as it was acknowledged explicitly by the authors,
showing the a priori knowledge informing the work contained within each article.
The diagram does not include all articles by each author, but a sampling
of the most significant pieces—those that I discuss in this book.
The original FPU article—available to select circles but not actually
published until 1965—Tuck and Menzel's continuation of the original FPU
simulation, and the article that was eventually published in 1972 make up
one thread. Down the left-hand side of the diagram we see the sequence
of four boxes in this line. Beginning again at FPU, the line to Zabusky's
initial article (1962) indicates that the FPU problem informed and inspired
Zabusky's work on continuous wave theory using partial differential
equations. That work becomes the independent soliton (KdV) thread thereafter,
converging once again with the other FPU work at the Walker and Ford
Physical Review article (1969).
The passage from FPU to Ford (1961) begins the central FPU research
program—an attempt to retrodict the FPU simulation results using
perturbation theory on the discrete nonlinear lattice. The flow continues
downward with the papers by Ford, Jackson, and Waters, eventually intersecting
the KAM thread sometime before the Walker and Ford article.
The KAM thread begins independently with Kolmogorov's 1954
conjecture. It contributes significantly to the post-1966 work on FPU, and it
coincidentally adds to the significance of the year 1954 in the history of
dynamics. The KAM theorem was proved independently by the two
mathematicians Moser (1962) and Arnold (1963a) using quite different methods.
KAM was known to Izrailev and Chirikov (1966), and it passed directly into
the English-language physics community via the Walker and Ford Physical
Review article in 1969. Ford and Waters claim to have been aware of Kol-
mogorov's 1957 address to the International Congress of Mathematicians,
in which he repeated the conjecture of 1954. However, because they did
not see the direct implications of that conjecture for the FPU problem, I
indicate their citation with a dashed line.
Not so much a thread per se, the Izrailev and Chirikov (1966) article
stands out as the first to identify the results of FPU with the implications
of the KAM theorem. Although it came out in 1966, its explicit references
place it as following directly from the original FPU work without
reference to any of the American work. By that time, Ford, Waters, Jackson,
Zabusky, Tuck, and Menzel had all made contributions. Perhaps the de-

Introduction 5
lay in information exchange might be attributed to the strained relations
between the Soviet Union and the United States.
My Presentation
In the first chapter, I present the FPU work in detail, including discussions
of the thinking, expectations, and results of Fermi et al. The mathematical
development is elementary for the most part, and as such should be familiar
to anyone with an undergraduate background in physics; but it should
also be accessible to anyone interested in mathematical science and/or the
history and philosophy of dynamics. Beyond the first chapter, we follow
the trail of what I will be calling the FPU research program. Chapters
2 and 4 are very similar in that they track the thinking of dynamicists
through the 1960s, when new methods were developed and new discoveries
incorporated.
The material concerned with the FPU research program divides neatly
into two phases, which are punctuated by the arrival of the Kolmogorov-
Arnold-Moser (K AM) theorem. Because news of the K AM theorem reached
the main group of researchers in 1966,1 end my discussion of the first phase
of the FPU research program in Chapter 2 at that year, and proceed to
Chapter 3 which is devoted to a discussion of the KAM theorem itself,
its development, the dynamical problem it solves, and the nature of that
solution. Following the KAM material, I return to the second phase of FPU
work in Chapter 4 and continue to its end in 1973.
In Chapter 5, I take up the subject of the epistemology of simulation.
Computer simulation has become a new means of scientific investigation,
taking its place alongside theory and experiment. As we will see in this
exposition, the genesis of simulation is inextricably intertwined with the
rebirth of dynamics. For nearly 40 years now, computer simulation has
been an invaluable tool for mathematical modeling. Part and parcel with
these new instruments and new realms of phenomena, there arises a series
of new questions to be addressed: What can a simulation tell us about? How
is simulation like and unlike an experiment? Why should we believe what
the simulation tells us? And when can a simulation be used to verify or
reject an hypothesis? Focusing on the use of simulation in this case study,
I seek answers to these important questions. Contrary to what might be
assumed at first glance, simulation is far from being a transparent window
onto the phase space of nonintegrable models. However, by studying the
limitations of simulations, we increase the resolution of what we can learn
from them. In so doing, we are able make better choices about how to
design a useful simulation. Better simulations mean better understanding
of nonintegrable models, and better understanding of macroscopic physical
behavior.
In an attempt to familiarize the reader with the sometimes difficult
language and tools of dynamics, I have included both a long Appendix, in

6 Introduction
which I develop the model of the string in a language appropriate for
understanding the other developments discussed throughout the book, and a
Glossary, which explains many of the very specialized terms in the language
of dynamics. Dynamics is a big subject with many different subdisciplines.
The language and tools used by some workers may be quite different than
those used by others, and this situation was never more in force than in
the early days of dynamical systems theory. The list of glossary terms also
may be found in the table of contents for easy reference. If at any point
in the text, the reader gets bogged down in the formalism, I recommend a
short trip to the appropriate section of the Appendix.
At the heart of this project is a story of scientific discovery. It follows
the usual narrative line: simple expectations lead to a baffling result;
random historical events impede progress and access to information; and new
research tools lead to new vistas of physical behavior. I have enjoyed
researching this material, and I hope that readers enjoy the story as well as
the subject matter.

Part I: History
This terror of the mind, these shadows must be dispelled
not by the suns7 bright shafts, nor by the brilliant daylight,
but by an understanding of the laws of Nature.
Lucretius (Book I, lines 146-148)
The FPU paradox forces us to face some of our deepest insecurities.
Given the Hamiltonian for a system, what is the character of its
motion? ... Indeed, the ufull and final" explanation of FPU still
pends. It is this very fact which makes the FPU paradox such a
delightful pedagogical "skeleton" upon which to drape the evolving
story of nonlinear dynamics/chaos.
Joseph Ford (1992)

1
The FPU Model and Simulation:
"A Little Discovery"
... it makes an enormous difference how the atoms are placed,
and in what position they are brought together,
and what movements they give each other and receive,
and how the same atoms, with only a little change,
can make both "flames" and "firs," just as these two words
have similar elements, but differently arranged,
and so we speak of flames and firs by different names.
Lucretius (Book I, lines 908-914)
Let us say here that the results of our computations show features
which were, from the beginning, surprising to us.
Fermi, Pasta and Ulam (1955, p. 981)
1.1. Development
In the Spring of 1952, the MANIAC-I computing machine came on line at
Los Alamos. One of the first digital computers, it filled a large room, used
vast arrays of large, hot, glowing vacuum tubes for computation, and
processed box after box of permanently inscribed, single-command, disposable,
punched cards. That summer, in a professional configuration foreshadowing
the disciplinary alliance of the future, physicist Enrico Fermi, computer
scientist John Pasta, and mathematician Stanislaw Ulam, gathered together
to discuss the computation potential presented by the speed, accuracy, and
relentless automation of these new electronic marvels. Although these
machines were quite a bit slower than even the first personal computers of
the 1970s, to say nothing of the workstations most physicists use routinely
today, they could provide access to some of those physical problems that
had remained marginalized for 50 years because it took so long to
laboriously carry out numerical methods by hand. Fermi wanted to develop new

10 1. The FPU Model and Simulation: "A Little Discovery"
heuristic techniques for investigating nonlinear dynamical problems
"experimentally" by beginning with the simplest possible nonlinear problem,
solving it, and moving on to successively more complex problems, perhaps
even making an approach to one of the most difficult nonlinear problems
of all—turbulence.
Following the method of perturbations devised by Poincare in the 1890s,
Fermi et al. (1955) chose the one-dimensional vibrating string as an
unperturbed, integrable model. The linear-string results from first expanding the
Newtonian equations of force between discrete points along the string in
an infinite series, and then dropping all the nonlinear terms. The solution
of this linear model is known; it can be derived analytically and exactly; it
is the ubiquitous simple harmonic motion that allows for no dispersion of
the conserved energy. Whatever modal energy configuration exists at the
outset remains the same throughout the subsequent motion of the string.
Of course, real strings do not behave this way for very long; energy is both
dispersed among harmonic modes and dissipated to the surroundings. Even
when we ignore the fact that a real string dissipates energy all during its
motion and eventually comes to a stop, the energy in an energy-conserving
string does not remain in its initial configuration either. Whereas the linear
force terms of the expansion do not allow for energy-sharing, the nonlinear
model terms link the modes together and do allow for dispersion of the
energy between the modes.
From the solid, predictable ground of linear harmonic motion, Fermi et
al. proceeded cautiously into the unknown territory of computational
nonlinear dynamics. They added one small nonlinear force term to the linear
model. The resulting equations of motion could no longer be integrated
analytically. The shape of such a string, just barely different from the
linear model, could not be predicted accurately by anyone after only a few
hundred harmonic periods. Using numerical integration, which
approximates the evolution of the string in very small linear steps, Fermi et al.
programmed their computer to recursively approximate the solution of the
equations of motion for various initial configurations of the string.
In order to track the dispersion of energy from its initial configuration
along its passage through the higher harmonic modes, Fermi et al. used the
method of Fourier analysis. Any state or configuration of the discrete string
may be decomposed into a set of Fourier or normal modes, which is a set
of independent sinusoidal waveforms. In this way, they could observe the
distribution of energy among the modes during the simulation. This choice
of representation made the ideal initial condition clear, that is, place all
the initial energy into a single normal mode, then watch it disperse. They
even hoped to calculate a dispersion constant that represented the rate at
which the energy spread out.
Fermi et al. knew that, given any initial condition, a string would move
toward its most probable state—thermal equilibrium (chaos), wherein all
of the energy is distributed evenly among the accessible modes. Any term

1.2. Dynamics to Statistical Mechanics 11
added to the model that links together modes should act as a bridge for the
energy to reach higher modes. Energy should march through the sequence of
harmonic modes like champagne spilling down a pyramid of glasses. Fermi
et al. wanted to measure the rate of thermahzation, which they believed
was related to the relative strength of the additional nonlinear term.
Nothing in the theory of nonlinear dynamics led Fermi et al. to expect anything
other than a consistent march toward thermahzation; but for every run of
the simulation, beginning from several different initial conditions and three
different nonlinear perturbation terms, the system failed to thermalize.
After thousands of cycles of the calculations, corresponding to hundreds of
swings of the string, the energy never dispersed beyond the lowest few
normal modes, although strangely enough, it did seem to cycle around among
the modes that it could reach.
Not only did the FPU simulation clearly fail to relate the rate of
thermahzation to the size of the nonlinearity, but it chanced upon a new realm
of orderly behavior. This result is analogous to an unexpected result in a
carefully designed physical experiment; it demanded an explanation. The
search for such an explanation brought about a total reevaluation of our
expectations about nonlinear behavior. In so doing, the Fermi-Pasta-Ulam
problem initiated the study of computational nonlinear dynamics—what is
now called dynamical systems theory. No one expected the existence of
isolating structures in the phase space of conservative nonlinear
dynamical systems—structures that would inhibit the tendency toward thermal
equilibrium. Instead of chaos, some new form of order was at work. As
we will see, whatever was holding the energy in the lowest few harmonic
modes of the string, did so consistently, as if there were a modal energy
barrier. Eventually, the repercussions of this work led to an entirely new
understanding of complexity, a reinterpretation of our understanding of
order and stochasticity, and the discovery of whole new regimes of dynamics,
accompanied by new methods of exploring them.
1.2. Dynamics to Statistical Mechanics
What led Fermi to his choice of the nonlinear string? One of his many
early interests had been in the ergodic theory—the mathematical study of
the long-term average behavior of measure-preserving transformations. In
terms of physics, ergodic theory is closely allied to statistical mechanics.
Whereas classical dynamics studies the dynamical behavior of systems of
individual particles acting under classical laws of motion, using trajectories
to follow the deterministic evolution of the system, statistical mechanics
concerns itself with the statistical behavior of ensembles, or large numbers
of particles acting as a coordinated group, and does not recognize or focus
on the causality of individual elements. In this sense, statistical mechanics

12 1. The FPU Model and Simulation: "A Little Discovery"
is a phenomenological theory that assumes Newtonian mechanics as the
underlying fundamental theory. The elusive connection between these two
disparate descriptions of mechanical systems has been a source of some
concern, especially within a physics community driven by a tradition of
fundamental (deterministic and upwardly causal) theories. A fundamental
foundation for statistical mechanics would require a way to derive the
irreversibility of the second law of thermodynamics from deterministic causes,
which is a problem of fundamental difficulty. Fermi wanted to find the
smooth transition from classical dynamics to statistical mechanics.
Traditionally, the number of particles in a system has been used to
determine which branch of mechanics would be the best descriptive theory. As
the number of particles increases beyond the computational capabilities of
classical dynamics, researchers tend to turn to statistical mechanics for
description. However, the predictions of statistical mechanics are more valid
for systems with numbers of particles on the order of Avogadro's number
(1023) and significantly less valid as the number of particles goes down.
On the other hand, classical dynamics gets terribly complicated for any
system with more than two particles in them. Here the boundary region
takes shape: What can we say about the dynamics of systems with tens,
hundreds, and thousands of particles? As the number of particles increases,
the number of individual interaction terms increases exponentially and the
integrable (usually linear) approximation to the dynamics no longer
dominates the behavior. The nonlinear interaction terms, that may be ignored
no longer, couple the particles together in a web of interactions, increasing
in complexity until total stochasticity dominates the dynamics.
As of 1954, expectations for the dynamical behavior of systems beyond
those already understood—in the unexplored margins of classical science—
were informed by the following basic assumption: Because they are still
deterministic, nonlinear force terms can create only technical limitations on
the extent of our knowledge, due both to the failure of analytical methods,
and to our lack of computational capabilities. It was the second of these
conceived limitations that Fermi et al. were trying to overcome. Their
simulation was to be the first in a sequence that would demonstrate the
transition from the deterministic trajectory dynamics of classical mechanics to
the ensemble behavior of statistical mechanics. Fermi et al. knew that the
number of particles did not have to be large for the system to develop
stochastic behavior. Indeed, they were well aware of Poincare's discovery,
50 years prior, of deterministic chaos in the three-body problem. They were
convinced that just a few particles with sufficiently large nonlinear terms
would be enough to demonstrate the expected transition. Simulation was
intended to be a new instrument of perception; we could look into an area
of dynamical behavior that had been blocked off by our computational and
analytical failings. However, the transition from classical dynamical systems
at the simple end of the mechanical-complexity spectrum, to statistical
mechanics at the other, was turning out to be significantly less straightforward

1.3. Surfaces of Constraint 13
than dynamicists had anticipated. Nonintegrability in itself was known to
be a necessary but not a sufficient condition for ergodicity. The boundary
between "necessary" and "sufficient" conditions is characteristic of the new
frontier of dynamical behavior glimpsed by Fermi et al. in 1954.
1.3. Surfaces of Constraint
The project of dynamical systems theory is to map the region left open
between classical mechanics and statistical mechanics. We often speak of the
topological properties of the surfaces accessible to the system dynamics in
phase space. Phase space, defined in the Glossary, is the Cartesian
configuration space where we track the movement of the system trajectory over
time. Once we have converted the dynamical equations to discrete form for
computer simulation, we can treat them as mappings of phase space into
itself. In this way, we bring the full power of topology to bear in the study
of nonintegrable behavior.
The transition from classical dynamical systems to statistical mechanical
systems may be characterized by the successive breakdown of the integrals
of the motion. An integral of the motion corresponds to the ability to
integrate analytically one variable of the model. More significantly, an integral
of the motion indicates the existence of one additional surface of constraint
in phase space, restricting the trajectory to one fewer degree of freedom.
If a dynamical system is integrable, then it has a complete set of integrals
of the motion and so the trajectory is bound to a known surface (the curve
that is the intersection of all the surfaces of constraint). There can be no
uncertainty about its movement, no deterministic chaos, and no ergodicity.
Analytically integrable systems are completely predictable. If, however, one
integral of the motion breaks down, meaning that some parametric change
in the model's form reduces the number of integrals by one, then the
trajectory is freed from that surface of constraint and may move about in the
space delineated by all the other surfaces: one degree of freedom has been
unlocked. I emphasize that the trajectory is still deterministic, but we may
not be able to predict where in that space of its new freedom it will be
at any given time. The trajectory may be ergodic within any portion of
phase space (of at least three dimensions) that is bounded by surfaces of
constraint.
At the extreme opposite end from having a complete set of integrals,
there may be only one. In Hamiltonian dynamical systems, the quantity
represented by the Hamiltonian function (usually the total energy) is always
conserved, and thus it guarantees at least one surface of constraint to which
the trajectory must remain affixed. This surface is called the energy surface;
it is a subspace of phase space that is one dimension lower than the 2n
dimensions of the full system. Dynamics in any Hamiltonian system always

14 1. The FPU Model and Simulation: "A Little Discovery"
takes place on the energy surface. If total energy is the only conserved
quantity, then the trajectory is free to move about the (2n— l)-dimensional
energy surface such that its location at any time may be unpredictable.
This is the behavior known as ergodic on the energy surface, which is often
referred to as ergodicity.
An example of a system easily described in terms of its degrees of freedom
and integrals of the motion is the two-body problem—two point masses
acting under the mutual attraction of Newtonian gravity. This system has two
degrees of freedom: separation and rotation. Phase space has four
dimensions, and the energy surface is a three-dimensional subspace embedded
within these four dimensions. The Newtonian force of gravity is a central
force, so the Hamiltonian function depends on the separation variable only:
it is independent of the rotation variable. Thus angular momentum, which
is the variable that is canonically conjugate to rotation, is an integral of
the motion. Angular momentum in the two-body problem is always a
conserved quantity. This second integral restricts the system's trajectory by
one further dimension to a two-dimensional phase plane. Therefore most
of the energy surface is inaccessible to an arbitrary trajectory, which must
remain constrained to a phase plane of constant angular momentum. The
actual quantity of energy in the system is determined by the initial
conditions. This conserved quantity specifies a specific value for the orbital area
in the two-dimensional plane circumscribed by the system trajectory. For
a fixed area in a plane, the trajectory must be a closed curve, which may
always be transformed into an ellipse. This system is well ordered and
absolutely predictable. All of this work relates back to Kepler's laws of planetary
motion, one of which identifies planetary orbits as ellipses. Because a
single planet orbiting the sun constitutes the prototypical two-body problem,
the term "orbit" is traditionally applied to a trajectory with a number of
integrals of the motion, even if the dynamical system has nothing else to
do with orbiting bodies.
Integrable systems are predictable because trajectories move along a
curve that is the intersection of all those surfaces of constraint. When a
system is not integrable, then its trajectories are assumed to access freely
all of the energy surface; so it seems reasonable to expect ergodicity to
follow immediately from nonintegrability. But why should it follow that
nothing else besides integrals of the motion might exist to constrain the
movement of the trajectories? The answer is that integrals of the motion
are the only known guarantee of bounded behavior. If ergodicity were to
follow directly from the absence of integrals of the motion, then that boundary
between dynamics and statistical mechanics would be quite narrow. Fermi
et al.'s "ergodic hypothesis" assumed that when a Hamiltonian model did
not have a complete set of integrals of the motion, it would immediately
tend toward ergodicity.

1.4. Global Versus Local Analysis 15
1.4. Global Versus Local Analysis
The separation between Hamiltonian dynamics and statistical mechanics is
a local-versus-global distinction (Balescu, 1975, p. 714). At the local level,
Hamiltonian dynamics is concerned with questions about the evolution in
phase space of a single trajectory that results from the step-by-step
integration of the equations of motion, beginning from an arbitrary initial
condition. We ask the questions: Is the orbit periodic or conditionally
periodic? Which regions of phase space are accessible to which trajectories? We
can easily imagine several successive steps generalizing this inquiry toward
a more global discussion. We might consider the behavior of a bundle of
trajectories—a tube surrounding the first trajectory that corresponds to a
small neighborhood of initial conditions. At this level we would be
interested in questions concerning the local stability of trajectories—such as:
Will trajectories that begin arbitrarily close to the first remain close to it,
or will they diverge? This is the type of consideration Poincare took up and
that we will adopt in our consideration of the KAM theorem in Chapter
3. At a still more global level, we might be interested in the hydrodynamic
behavior of any region of phase space as we map it into itself successively.
Here we would naturally take up the question of phase-space properties
such as connectedness, density (in the measure-theoretic sense), ergodic-
ity, and mixing. This hydrodynamic study is the approach of statistical
mechanics. The evolution of single trajectories is of less concern than the
average behavior of the hydrodynamic flow of the specified region of phase
space. In the hydrodynamic approach, ergodicity implies that the region in
question will eventually move over all the accessible phase space.
1.5. Simulation
At each step in the numerical procedure of a simulation, we use
transformation equations derived from the equations of motion, to make a prediction
about the future state of the system, one that is an approximation over
some small interval of system time. We adjust the time-step size to obtain
either higher resolution when the trajectory is erratic, or lower resolution
when it appears well behaved. If the trajectory behaves very erratically,
then we want to use a very short time-step size; but when the trajectory
is changing direction very slowly, we may increase the time-step size to
obtain more information in a shorter time. The process of running iterative
numerical integrations with varying initial conditions and parameters to
map the phase space of the model, functions similar to a physical
experiment, except that instead of testing hypotheses or theory predictions on a
physical system, we try to follow the evolution of the mathematically true
solution of a set of differential equations. If the true solution is changing in

16 1. The FPU Model and Simulation: "A Little Discovery"
an erratic way, the predictions of the numerical procedure may lose track
of it, thus degrading the efficacy of the simulation.
Every digital computer must perform its calculations in discrete steps
rather than continuously. The numerical procedure is discrete also; we map
from one state of the system to the next via the prediction that is derived
from fitting a continuous curve or straight line to some of the previous
discrete states and projecting forward. We assume the system evolves
continuously, but we cannot represent that change without these intervening
discrete approximations. As this diagram shows:
True-Solution Simulation Reconstruction
we regain the impression of continuity by conceptually supplying a smooth
curve fit to the pure succession of static states of the system—each state
being a set of numbers calculated from the previous set. Because we do not
know what really occurs within the prediction interval, we must exercise
caution when we assume continuity between the states of the numerical
procedure. This necessary, discrete approximation enables the possible
divergence between the true solution and the simulated solution.
"Deterministic chaos" is the term we apply to the trajectory when we cannot follow
the true solution for any period of time, no matter how small we make the
time-step size. We know the true solution is deterministic, but it appears to
be random in our simulation. In Chapter 5, synthesizing from the seminal
work brought forth by the FPU problem, I lay the foundation for an episte-
mology of simulation, in which we try to understand when and if simulated
solutions are consistent with true solutions. If a system is ergodic, then we
may reasonably expect to see deterministic chaos in its simulation. Fermi
et al. expected the string to tend toward ergodicity, but what they found
did not appear to be random.
1.6. Loading the Nonlinear String
Fermi et al. wanted to begin with a simple problem. The one-dimensional,
linear, vibrating string has long been one of the standard problems
physicists are expected to solve as part of their early training. One elegant
aspect of the FPU problem is that the model is so fundamental and the
perturbation so obvious. Let us review the derivation of this model,
following Marion (1970), Tuck and Menzel (1972), Ford (1992), and Fermi
et al. (1955). The string is seen as a set of discrete mass points set along

1.6. Loading the Nonlinear String 17
a one-dimensional line, with fixed endpoints. The masses are all identical
and can move only longitudinally—that is, along the line in which they
lie. Thus the string is not like a violin string vibrating back and forth
(transversally), although that string can also be approximated with this
same model. Using Newton's law of gravitation as our force law would
require all the masses to attract all the other masses. However, instead of that
complicated affair, we assume that there is significant attraction between
only nearest-neighboring points. Thus the force on any single mass point
depends only on the two mass points nearby; motion is generated locally
by the interaction of three bodies. This fact forms the conceptual basis for
making connections between the well-known three-body problem and the
FPU problem. The relationship is definite, but it is complicated by the fact
that in the FPU problem, each three bodies in succession are coupled to
the next three bodies in the sequence by the two-point overlap. Further, we
assume this attraction to be linearly dependent on the distance between
the points. This approximation leads us to the Hooke's law linear force.
Figure 1.1 shows the arrangement of mass points along the string with
fixed endpoints. At equilibrium, each pair of mass points is separated by the
equilibrium spring length d; but this constant neatly drops out of the
equations. Using the Hooke's law force between nearest-neighbor mass points
and setting k = m = d =1, using an appropriate system of units for
convenience, the resulting equations of motion, from Newton's second law, take
the form
Qj = (qj+i-Qj) + (Qj-i-qj), j = l,2,...,n. (1.1)
The force on and the total acceleration of each mass point depends linearly
on the separation between itself and its two adjoining neighbors. As was
mentioned, this is the model for coupled harmonic oscillators, with a
completely predictable solution. We put our model into Hamiltonian form by
writing down the total energy of the system in terms of the positions qj
and momentums pj (where Pj = mqj) of all the mass points. The resulting
unperturbed Hamiltonian Hq is the sum of the kinetic energies of the mass
points and the potential energies of the springs between them:
L=(n+l)d
12 H J j+l n-1 n
FIGURE 1.1. The string is seen as a sequence of n identical discrete
mass points to, separated by identical springs with material constant k
and equilibrium length d.

18 1. The FPU Model and Simulation: "A Little Discovery"
Ho(q,P) = \ £ [p2 + (to ~ Qj-i)2} ■ (1.2)
The sums run up to n -f 1 because that is how many springs there are, so
we must also define pn+i = 0, as there are only n mass points.
The addition of a small nonlinear force term to the linear equations must
result in behavior that diverges from harmonic oscillation and approaches
somewhat that of the real string. This expectation derives from the validity
of the series expansion of fundamental force laws. But strict validity applies
only to the whole infinite series. Whenever we truncate to a finite set of
terms, the model no longer truly represents the fundamental force laws, but
only an approximation. By learning the effect of each successive, higher-
order term in the expansion—added to the previous terms—we hope to
construct a clear understanding of physical behavior. Fermi et al. assumed
that the addition of either cubic or quadratic terms to the linear string
would bring the system closer to the ergodic tendencies of a real string.
For the unperturbed model, the linear force term was used. The next
higher-ordered term in the series is the quadratic force term, then the cubic
term, and so forth. Although Fermi et al. used three different perturbation
terms separately in their work—a quadratic, a cubic, and a broken-linear
force term—the separate outcomes all had the same inexplicable feature. I
will concentrate on their application of only the first, lowest-order,
nonlinear perturbation term, which is representative of all of their results. When
added to the linear Hooke's law force, the quadratic perturbation yields
the new equations of motion as follows:
Qj = (tfj+i " Qj) + (Qj-i ~ Qj) + ai(Qj+i ~ Qj)2 ~ (Qj ~ Qj-i)%
j = l,2,...,32. (1.3)
Equation (1.3) contains the unperturbed linear force term (1.1) and the
additional term that is quadratic in the displacements q. In this instance,
I have replaced n by 32, the number of mass points actually used by Fermi
et al. in the first run of their simulation. In accordance with the normal
procedure for perturbation theory, we see introduced here a small
parameter a to control the strength of the perturbation term. When we model
real physical systems, this parameter should correspond to some
combination of physical characteristics, such as the elasticity of the string; but in
exploratory work, it merely gives us a way of varying the influence of the
perturbation on the overall behavior of the model. The total Hamiltonian
(H = Ho -f ocH{) consists of the unperturbed expression of (1.2), plus the
additional terms which are cubic in the displacements and scaled by the

1.7. Modal Representation 19
parameter a, as follows:
n+l r
3 = 1
H(q,p) = ^ I X (P2j + (Qj - Qj-i)2) + %j ~ Qj-if
(1.4)
1.7. Modal Representation
Because the solution to the unperturbed problem is the n-coupled harmonic
motion, which is most easily understood in terms of the normal (or Fourier)
modes, and because Fermi et al. were interested in measuring the dispersion
of energy among modes, conversion to the modal representation made sense.
The relationship between the displacements qj and the normal ak for the
unperturbed Hamiltonian are given by the following expressions:
ak = y^t^qis[n[iTl)' * = l»2,...,n, (1.5)
^ = ]f^l^akSin(iTl)' i = l,2,...,». (1-6)
Each mode defines a set of displacements qj for all the mass points along
the string, thereby constructing a unique shape for the overall string, one
that is linearly independent of all the other modes. With the normal modes
as a basis, any arbitrary shape of the discrete string can be represented as a
sum of modes. We may think of this shift from displacements qj to normal
modes a^ as a change of variables from one set of independent quantities
to another—from the local behavior of points along the string to the global
behavior or shape of the entire string. Figure 1.2 shows the displacements
of the 32 mass points along the string as determined by the first four
modes. Up and down displacement in this figure represent left and right
displacements along the string.
When we make the conversion from local displacements to global modes
using the transformation equations (1.5) and (1.6), the new unperturbed
Hamiltonian becomes:
#o(a, a) = - Y^("l + "14), where uk = 2sin I ^^j • (1-7)
The Uk are the frequencies of the normal modes. As the mode index k
increases from 1 to n, the argument of the sine function increases from nearly
zero to nearly 7r/2, the frequencies increase monotonically and approach 2,
so higher modes correspond to higher frequencies of string vibrations.
Notice that this expression shows the isolation between the modes. Each
successive term in the energy sum represents the energy of each independent

20
1. The FPU Model and Simulation: "A Little Discovery"
0.3
-0.3
Model -♦-
Mode 2 -+--
Mode 3 a-
Mode 4 ••*•••
10
15
20
25
30
FIGURE 1.2. The first four normal modes of the string. Each mode
defines the displacement qj of the n = 32 mass points. They take on the
familiar standing-wave patterns. Because the string has fixed endpoints,
each mode ak has fc-f 1 nodes (points where the string intersects the
horizontal axis).
mode. This kind of separation is not possible in the displacement
representation, in which each term of the sum depends on preceding and successive
terms as well.
When we add the effect of the perturbation, we also get additional terms
in the Hamiltonian. The full, perturbed Hamiltonian in modal form is as
follows (Ford and Waters, 1963):
.. it a,
H(a, a) = - J^(al 4- Jtal) + a Yl Ckimakaiam.
fc=l k,lfm=l
(1.8)
Here we see more clearly how the perturbation links together the modes.
The 3n additional terms allow energy to move from one mode to another
that is nearby in energy. The parameter a is kept small in FPU, such that
the quadratic perturbation force term is no more than one-tenth the size
of the linear term at maximum values of qj. The constants Ckim axe quite
complicated combinations of the transformation coefficients; we are not
concerned with them here. Whereas the modal transformation equations
do appear in the original FPU paper, this Hamiltonian does not. To obtain
the FPU simulation, we integrate numerically the (displacement) equations
of motion (1.3), and every so often, calculate the modal energy distribution
from (1.7).

1.8. Model Considerations 21
For the linear problem, the modal description is isomorphic to a
description in terms of the integrals of the motion; the energy is divided up in a
particular way initially, and it will be confined in that distribution by the
integrals of the motion. Instead of speaking about trajectories locked into
particular regions of phase space, as is so common in dynamics today, Fermi
et al. speak of the evolution of the modal energy. This description
highlights the global configuration of the string, while obscuring both the local
behavior of the displacements, and the evolution of the system trajectory
in phase space—a representation that will later dominate the field.
Without the perturbation, the string would maintain a standing wave
pattern, as in Figure 1.2. The motion would be periodic: every time a
multiple of T — 27r/ui occurred, the string would reproduce the initial
configuration, provided the initial configuration was given by a\. In the
perturbed problem, Fermi et al. expected the motion to drift away from
periodicity toward ergodicity. By calculating the modal energy at any time
in the evolution, Fermi et al. could speak of the evolution of the system
completely in terms of the energy distribution among normal modes.
1.8. Model Considerations
With their one-dimensional, discrete-string model, Fermi et al. had several
parameters with which to modulate their simulation. When choosing n, the
experimenters had to mediate between two criteria: whereas a larger
number of points would approximate the continuous string better, a smaller
number would require less calculation time for each iteration of the model.
While this simulation was going to be a demonstration of the computer's
usefulness in computational modeling, the speed of even the best machine
in 1953 was quite slow. Each machine-language instruction for the
computation had to be correctly encoded onto a single punch card. The program
consisted of boxes of cards that were fed into the machine, over and over.
For each iteration of the calculation, the cards were fed in again.
Attempting to simulate a continuum would, of course, require a large
number of particles, and we would assume that the larger the number, the
better. Even if we convince ourselves that there is no need to simulate
a continuum, we still would like a large number of particles to enhance
the approach to ergodic equilibrium or to obtain a better simulation of a
statistical mechanical system. In 1953, Fermi et al. were comfortable using
only 64, 32, or 16 particles on the very earliest computing machines. With
computing time as an ever-present constraint, Fermi et al. had to strike a
balance between the number of particles—thus the number of equations—
and the duration of the time step.
Because of the slow divergence from linear behavior, the size of the time
step can be on the scale of the recurrence time of the associated linear

22 1. The FPU Model and Simulation: "A Little Discovery"
problem to reveal the long-term average behavior of nearly linear systems,
or it can be made smaller to try to resolve rapidly diverging orbits in
strongly nonlinear systems. In addition to these considerations, care had
to be taken in choosing the time-step size because, in this instance, the
nonlinear system was being analyzed into the modes of the corresponding
linear problem. These modes have characteristic periods that force an upper
limit on the step size of the simulation for resolution and interpretation of
the evolution. Using the normal modes as an interpretive tool, Fermi et
al. had to run each configuration of their system through enough cycles
to allow for the tendency toward thermalization—that is, they had to run
through enough cycles to simulate several hundred periods of the initial
configuration. They chose to break up the period of the associated linear
model into a large number of time cycles, up to five hundred in some cases.
Thus, in various configurations, they ran the simulation anywhere from
14,000 cycles to 80,000 cycles. As a consequence, they chose n to be at
most 64, and in some cases, they also considered n = 16 and n — 32. The
fact that only powers of two were considered in FPU seems to have been
only a coincidental consequence of using binary arithmetic; but this fact
figures prominently in the later analysis by Ford (1961) as a significant
factor in his explanation of the results of FPU.
Given that we expect eventual thermalization, the choice of the initial
string configuration might not seem to make very much difference, because
all initial conditions should tend toward equipartition of the energy. For
ease of following the evolution of the system in their simulation, Fermi et al.
chose very simple combinations of the lowest normal modes, usually a single
half sine wave (mode one) and releasing the string from rest (pj = dfc =0).
Fixing the initial energy (H = E — constant) determines the value of a\
from (1.7), with o^ = 0 whenever k ^ 1. With this value of ai, the initial
values for the displacements qj may be obtained from (1.6). But once again,
a decision about initial conditions—namely the choice of initially exciting
only the lowest modes—came into question in a later work (Zabusky and
Deem, 1967), where it is shown that the initial excitation of optical (high-
frequency) modes results in better energy-sharing among modes, although
not in thermalization.
As mentioned, Fermi et al. chose values for the nonlinear parameter a to
keep the quadratic force terms (1.3) smaller than one-tenth the size of the
linear terms at maximum displacement. Because the corresponding energy
terms (1.8) were cubic in the displacements, they always represented less
than a just a few percent of the total energy, so they were ignored when
Fermi et al. calculated the modal energy distributions.

1.9. Results 23
1.9. Results
At the start of each simulation, the string did appear to be tending toward
thermalization, but as the simulation progressed, some odd things began to
happen. Most of the energy would strangely accumulate in one or another
mode, only to disperse again. Most significantly however, was the fact that
the energy apparently remained bound within the first five, lowest-energy
modes, those modes nearest in energy to the initial condition. There the
energy cycled about within those few modes in an almost periodic way.
Figure 1.3, taken from the original FPU paper, shows the results of their
first simulation. Using the quadratic perturbation, with all the energy
initially in mode one at time to = 0, the string was released from rest and
allowed to evolve numerically. The simulation ran for 30,000 cycles, which
corresponds to about 160 fundamental periods {T\ = 2tt/uji) of the first
mode. In the following passage, the authors describe the result:
Instead of a gradual, continuous flow of energy from the first
mode to the higher modes, all of the problems show an entirely
different behavior. Starting in one problem with a quadratic
force and a pure sine wave as the initial position of the string,
we indeed observe initially a gradual increase of energy in the
higher modes as predicted. Mode 2 starts increasing first,
followed by mode 3, and so on. Later on, however, this gradual
sharing of energy among successive modes ceases. Instead, it is
one or the other mode that predominates. For example, mode
2 decides, as it were, to increase rather rapidly at the cost of
all other modes and becomes predominant. At one time, it has
more energy than all the others put together! Then mode 3
undertakes this role [from 14,000 cycles to 19,000 cycles]. It is
only the first few modes which exchange energy among
themselves and they do this in a rather regular fashion. Finally, at a
later time mode 1 comes back to within one percent of its initial
value so that the system seems to be almost periodic. (Fermi
et al., 1955, p. 981)
Two very important problems emerge from these results. They show a
definite isolating behavior and the modal energy seems to evolve cyclically.
Figure 1.4 shows the time-averaged modal-energy distribution. Initially the
energy is in the improbable initial configuration, and instead of any
tendency toward equipartition of energy in all the modes—a plot of which
would show all the modes averaging toward the same low level of energy
(E/n)—we see a consistent pattern of energy partitioned unequally into
the first four modes. Each successively higher mode has less of the total
energy, and, as we saw in Figure 1.3, virtually no energy at all extends
beyond mode five.

24 1. The FPU Model and Simulation: "A Little Discovery'
100
0
1
\
\
2/
Id
\
\
\
\
\
5
3/
V
4
3r
5y
/
/
[
,2
A
\
\
h
fU\
(5
■A
I
/
/
/
/
\2
t
/
/
/
1-
\
20 30
f IN THOUSANDS OF CYCLES
FIGURE 1.3. Modal energy plot for the FPU string model with a
quadratic perturbation term: n = 32, a = \. The initial configuration
was a single half sine wave. About 30,000 computation cycles were
calculated.
QO
ou
20
m
0
T
\
^t
1
\
\
\
\
'
1
\t
L
i
V
\
/
t
K
\
\
=5^
<•_
4
\
>
^
-—
<
-»-_
2
40 80
t IN THOUSANDS OF CYCLES
FIGURE 1.4. Modal energy distribution averaged over time. The energy
settles down into a clear and consistent pattern, which is not consistent
with expectations of thermalization.

1.9. Results 25
In order to understand the implications of these results, imagine taking
two containers, one filled with compressed gas and the other empty. If you
connect these two together, the second law of thermodynamics—which is
based on the very same statistical principles of ergodicity and large
numbers of interacting particles—predicts correctly that the gas will spread
itself out over the entire volume until a single pressure obtains
throughout. Now imagine how surprising it would be if, instead of this predicted
outcome, the gas spread itself out along only the bottom third of the two
containers and remained that way. In a physical experiment, we would
expect to find a hidden variable at work, such as the force of gravity, or the
presence of an additional, different gas in the second container. But this
was a simulation, where "such experiments on computing machines would
have at least the virtue of having the postulates clearly stated. This is not
always the case in an actual physical object or model where all the
assumptions are not perhaps explicitly recognized" (S. Ulam, in his Introduction
to the FPU paper, Segre, 1965, p. 977). Where they expected to see the
chaos of randomness and the disorder of ergodicity, they found order, or,
as Fermi told Ulam, they made "a little discovery" (p. 977).
Anyone dealing with the FPU results must address the following
questions: What is the source of the isolating behavior? Is the nonlinear Hamil-
tonian actually integrable? But if it is, then why is there any energy sharing
at all? Could there be integrals of the motion beyond the total energy, yet
not a full set? What else can we learn from the FPU simulation?
Boundedness is only the first and most notable feature of FPU.
Looking at Figure 1.4, we also notice a tendency toward energy stratification.
Different modes have different energies that seem to depend on their mode
numbers in decreasing order. Remember that this is a time-average plot;
there can be variations in the energy distribution all along, so long as
the long-term behavior approaches a constant. Even white noise averages
to a constant value. This behavior in the time average is consistent with
periodicity. If the system energy visits each mode periodically, then the
time-average value of modal energy would be approximately constant. In
order to see this periodicity, let us return to Figure 1.3 on page 24.
As Fermi et al. point out, on the right-hand side of the graph, near 30,000
cycles, mode one returns to prominence, obtaining 99% of its initial value.
Between these peaks, there is a pattern: mode two appears in the middle
between the mode-one peaks, with mode-three peaks on either side of the
mode-two peak. The motion seems to exhibit something like periodicity,
returning every so often to one mode or another. But the return to any
mode is not complete. The apparent cycling between modes can also be seen
in Figure 1.5. Each mode represents a particular shape of the string that
can be represented by a plot of the displacements of the mass points along
the string (see Figure 1.2). At both 1000 cycles and 28,300 cycles, mode
one is dominant and we see the shape of the string as a single hump. As the
string evolves, the shape becomes some combination of the first five modes,

26 1. The FPU Model and Simulation: "A Little Discovery'
and every so often, one or another mode dominates the displacement, and
thus the energy.
As of 1953, there was no published theory to explain this distorted
periodicity. The linear model is strictly periodic and so strictly disallows energy-
sharing. The additional nonlinear term causes a distortion of the
periodicity of the linear model and allows for some degree of energy sharing, but
apparently it is not a complete breakdown of the integral-like structure
because there still seems to be something resembling periodicity. Fermi et
al. conjectured that this "almost periodic" behavior amounted to
something like "quasi-states" in an "almost linear" problem; such behavior later
became known in the literature as "quasi-periodicity"(Fermi et al., 1955,
p. 987). This new behavior constitues the first discovery for dynamical
systems theory—the first new feature characteristic of the region between
classical integrable dynamical systems and ergodic statistical mechanical
systems. The challenge to the world of dynamics was now to develop some
new theory to understand this new behavior, which was found in this first
excursion into experimental mathematics through computer simulation.
0 2 4 6 8 10 12 14 16
POSITION OF THE MASS POINT
FIGURE 1.5. The actual displacement of the string, showing the shape
of the string at various times (in cycles).

1.10. Discussion Post Hoc 27
1.10. Discussion Post Hoc
Posthumous Results
Enrico Fermi died of cancer on November 28, 1954; he was 53 years old.
The following year, atomic element 100 was synthesized for the first time
and named Fermium in his honor. Although he did see and discuss most
of the results of FPU with his coauthors in that year, he did not see all of
the results, nor did he see the final paper that was drafted as Document
LA-1940, and filed internally in the Los Alamos Reports in May 1955.
The ramifications on the general publication of FPU were, as Joseph Ford
writes:
Pasta and Ulam found themselves trapped: they clearly could
not publish without Fermi's name on the paper, but equally
they could not publish with Fermi's name on the paper, since
he had neither read nor approved it. This dilemma was never
resolved, and, as a consequence, the FPU results were never
published. However, the manuscript did finally reach the open
literature as part of Fermi's collected works, which appeared
some ten years after distribution of the original FPU preprint.
(1992, p. 275)
Copies of the preprint (LA-1940) circulated among the mathematical
physics community, and the results also spread by word-of-mouth dissemination.
Work on the FPU problem was slow to start as a consequence, and it was
some 6 years before the next step along the direct line of FPU was taken
in 1961.
Postponed Publication
In his Introduction to the 1965 printing of the FPU paper, Ulam mentioned
the existence of further evidence for the isolating behavior which they had
observed originally in 1953, but no such evidence existed in the literature in
1965. But back in 1961, the FPU simulation was continued at Los Alamos
by Jim Tuck and Mary Menzel. However, in a strange repetition of the
delayed FPU publication, Tuck and Menzel's results were not published
until 1972—11 years after the work was performed. Once again, like FPU,
the work had been performed, yet it was unavailable to the community
except in an unpublished form. For modern science, these are very long
time scales for the turnaround of results. But as computer simulation was
indeed a new branch of science, still in its infancy, it may not yet have been
clear what constituted a publishable result.

28 1. The FPU Model and Simulation: "A Little Discovery"
FPU Simulation J
1954 I
| f
I Tuck-Menzel
I Simulation 1961
FPU Published 1965 I
Tuck-Menzel
Published 1972
In more than one sense, these calculations were a direct continuation of
FPU, because one of the authors, Mary Menzel, nee Mary Tsingou, did the
actual computer coding of the original FPU simulation.
These continued calculations were performed in response to requests for
a published reference documenting the super-period of FPU. Because of
the issues raised in the original FPU simulation, we wondered whether
that apparent quasi-periodicity would continue, or perhaps fall off after a
significantly longer integration time. Maybe on each iteration of the
cycle, more and more energy would become dispersed. Given these questions
about the FPU results, it would seem as if extending the calculations of
FPU to longer times would be a promising endeavor. The results of Tuck
and Menzel are summarized in Figure 1.6.
The conjectured "quasi-states" (Segre, 1965, p. 987) were indeed
confirmed in that later study, in which it was found that the energy moved
around the lowest modes in an almost periodic fashion and that certain
cycles and "super-cycles" were observed:
In 1961, on more modern and faster machines, the original
problem was considered for still longer periods of time. It was found
by J. Tuck and M. Menzel that after one continues the
calculations from the first "return" of the system to its original
condition the return is not complete. The total energy is concentrated
again essentially in the first Fourier mode, but the remaining
one or two percent of the total energy is in higher modes. If one
continues the calculation, at the end of the next great cycle the
error (deviation from the original initial condition) is greater
and amounts to perhaps three percent. Continuing again one
finds the deviation increasing—after eight great cycles the
deviation amounts to some eight percent; but from that time on
an opposite development takes place! After eight more, i.e.,
sixteen, great cycles altogether, the system gets very close—better

1.10. Discussion Post Hoc 29
OSCILLATIONS AT FUNDAMENTAL FREQUENCY
FIGURE 1.6. The calculations by Tuck and Menzel extend the FPU
results to longer integration times, and so show the development of the
larger super-period.
than within one percent to the original state! This super-cycle
constitutes another surprising property of our nonlinear system.
(Segre, 1965, p. 978, Ulam's Introduction)
This quasi-periodic behavior brings up the question of round-off error in
the simulation. If perhaps some of the energy were lost due to the repetitive
calculations, then wouldn't this cause the incomplete return to the initial
state and thus disguise a true periodicity, which might result if the model
were actually integrable? Fortunately, however, we may dispense with this
problem in this case because Fermi et al. knew of the possibility for such
errors. The conservation of energy provided a convenient check on the
accuracy of the computation in that the sum of the energy in the normal modes
at any time should always equal the initial value. Fermi et al. recalculated
the total energy periodically to ensure that no significant losses interfered
with their results.
Postulated Solutions
In the following chapters, we will see several solutions to the FPU problem
that were suggested in the years following its nonpublication. But two main
results eventually dominated the field: One group of researchers came to
believe that FPU was a clear case of Kolmogorov-Arnold-Moser (KAM)
stability, while another saw FPU as an example of Korteweg-deVries (KdV)

30 1. The FPU Model and Simulation: "A Little Discovery"
solitons. Both of these phenomena are significant developments from
dynamical systems theory that emerged in the years following FPU, and they
are both intimately related to the FPU problem. In the following passage,
published well after the fact in 1989, David Campbell suggests that the
KAM theorem works as an explanation, indicating that the problem has
yet to be resolved conclusively, but there does seem to be a consensus within
the discipline:
At a conceptual level, then, the KAM theorem explains the non-
chaotic behavior and recurrences that so puzzled Fermi, Pasta,
and Ulam. Although the FPU chain had many (64) nonlinearly
coupled degrees of freedom, it was close enough (for the
parameter ranges studied) to an integrable system that the invariant
KAM tori and resulting pseudo-integrable properties dominated
the behavior over the times of measurement. (Cooper, 1989,
p. 244)
This solution, which is the isolation of phase space by invariant surfaces in
weakly nonlinear Hamiltonian systems, will be investigated in Chapter 3.
The second significant and certainly not unrelated solution to the FPU
conundrum is the possibility of a series of solitons—nonlinear, nondisper-
sive waves—which would also isolate the energy to low modes and would
account for the quasi-states:
The FPU problem is closely related to the soliton problem.
Since the nonlinear mechanical lattice investigated by Toda
admits periodic solution, a single lattice soliton can exist on the
system with periodic boundary conditions. This is a clear
example in which the nonlinearity is not causing the lattice to
become thermalized into a state of energy equipartition. (Scott,
Chu and McLaughlin, 1973, p. 1469)
Clearly the FPU problem provokes a lively discussion. Each group wants
to claim it for its own. Here the term "related to" disguises the fact that
solitons can exist only in continuous systems, yet the FPU model is
obviously a discrete one. But it points to an important tie between different
models and different perspectives.
Some aspects of these proposed solutions may be more or less attractive,
elegant, or applicable to the problem than others; but none of them to
date offers a complete explanation for the FPU problem in the rigid sense
of proof. The KAM theorem may offer us the best possible solution; but
no one has done the dirty work of proving it. Each solution in its own
right uncovers a rich vein of new dynamics research, and it might also be
said that each of these solutions describes essentially the same phenomenon
from a slightly different perspective.

2
The FPU Research Program:
Echoes on a String
And certainly the atoms did not move by volition,
nor did they place themselves by sharp intelligence,
nor did they agree what movements to produce,
but they, being many and moving about in many ways,
are constantly being buffeted and given motion,
and by trying every kind of combination
and motion, finally they fall into the arrangements
and the patterns of which the sum of things consists.
Lucretius (Book I, lines 1021-1028)
The results of the computer calculations of [FPU] thus appear to be
in sharp contrast, if not in actual contradiction, to the widely held
notions concerning the approach to equilibrium. ... Consequently,
aside from any element of contradiction, a number of physicists
have been puzzled by the failure of [the FPU Hamiltonian] to lead to
equilibrium behavior.
Joseph Ford (1961, p. 387)
2.1. The Threads of a Research Program
Even though the FPU results were not published until the two volumes
of Fermi's Collected Works came out 11 years after the fact (Segre, 1965),
word of the simulation did spread to the physics community by way of
private communication. At that time, physics research preceded along the
two accepted lines: theory and experiment. Computer simulation was a
new kind of research direction that did not yet have a plot of its own in
the field of scientific research. Being a set of numerical calculations, the
FPU problem seemed to fall into the category of theory. But a theoretical,
pencil-and-napkin physicist was not likely to be exploring the realm of
"experimental mathematics," such as that opened-up by FPU. Although

32 2. The FPU Research Program: Echoes on a String
no physical experiment had been performed, the problem did have many of
the effects usually associated with an experiment: theorists were faced with
the implicit demand to retrodict the anomalous results of FPU, just as any
scientist is challenged to explain the unexpected result of a valid physical
experiment. Furthermore, the FPU model was chosen exactly because it
was thought to be unsolvable using analytical means. Theory in dynamics
had been stalled for nearly 50 years, but here was a new and surprising
result that gave theorists renewed motivation. Further theoretical work
must come from that most difficult arena known as perturbation theory.
I use the term "research program" to mean something akin to a
"dynamical system," in that these lines of research are linked together by a common
interest and are mutually influenced by one another's results. However, I
do not want to convey in the term "program" the sense of "programming."
This research preceded stochastically and not according to some
preconceived plan.
The research following from the FPU simulation may be organized along
several lines, all within the discipline of mathematical physics. Already I
have discussed Tuck and Menzel's 1961 continuation of the FPU
calculations in the previous chapter (see page 27). Their discovery of the FPU
super-period served to further validate the FPU results and intensify the
need for an explanation.
2.2. The Nonlinear Discrete Lattice
FPU Simulation
1954
X
Jackson
1963 a, b
Ford
1961
1 C
1
Ford-Waters
1963,1966
Around 1965, word of the KAM theorem began to spread across the land,
effecting significant changes in this program of research. But before that
tide arrived, there were two distinguishable lines of research. Both involved
integrable approximations to the FPU model in an attempt to retrodict the
FPU recurrence; but they divided on the choice of model topology: Joseph
Ford, E. Atlee Jackson, and John Waters applied perturbation theory to

2.3. Ford, 1961 33
the nonlinear discrete lattice, while Norman J. Zabusky and M.D. Kruskal
sought an integrable solution to a continuous nonlinear wave equation.
Both groups worked through some very difficult theoretical calculations
with varying results. Most of this work found its home in the Journal of
Mathematical Physics.
2.3. Ford, 1961
In the first paper to address the FPU problem, "Equipartition of Energy
for Nonlinear Systems," Ford applied physical principles to obtain an
interpretation that followed from physical intuition:
A system of harmonic oscillators weakly coupled by nonlinear
forces will not achieve equipartition of energy as long as the
uncoupled frequencies Uk are linearly independent on the
integers, i.e., as long as there is no collection of integers {n^} for
which EriktOk = 0 other than all rik = 0. ... Physically, the
linear independence of the uncoupled frequencies means that
none of the interacting oscillators drives another at its resonant
frequency, and this lack of internal resonance precludes
appreciable energy-sharing in the limit as the coupling tends to zero.
(1961, p. 387)
Ford recognized that the unperturbed frequencies of the FPU model,
defined by the equations
a* = 2sin(|j0, (2.1)
must always be linearly dependent unless N equals a prime or a power of
2, in which case they are linearly independent (Hemmer, 1959). In FPU,
only N in powers of 2 were considered. Therefore, in all cases considered by
FPU, the unperturbed frequencies were linearly independent. What is the
significance of this fact? There are two ways of seeing its direct implications.
In terms of Hamiltonian dynamics (see the Appendix), in which we talk
of orbits on tori, the linear independence of the unperturbed frequencies
translates to the exclusion of periodic tori. In other words, because FPU
chose to consider N only in powers of 2 (undoubtedly because of the
connection between binary arithmetic and computers), all the trajectories they
considered, unperturbed, must necessarily lie on conditionally periodic tori;
no FPU initial condition could have resulted in a periodic, unperturbed
torus. Of course, for the unperturbed system, conditionally periodic tori
and periodic tori both correspond to periodic harmonic oscillations in the
positions and the momenta, because a full set of integrals of the motion

34 2. The FPU Research Program: Echoes on a String
bind the trajectories. But once the perturbation is turned on, the
topological difference between these types of tori becomes very significant, as we
know for the KAM theorem (see the discussion in the next chapter).
But Ford was not working with tori and he was not yet aware of Kol-
mogorov's conjecture, so he talked in terms of oscillators and resonance. He
turned to the physical model of coupled pendulums, from which he derived
his intuition that normal modes could share energy only if they drive each
other at resonance—that is, if the condition
J2nk"k~0 (2.2)
is met by the set of initial frequencies. Clearly, the equality here can be
achieved only with frequencies that are linearly dependent, which, as we
know now and Ford knew then, could not be satisfied by the FPU
frequencies. As a further heuristic argument to support this intuition, Ford made
use of the resonant denominators, which, as we will see in the next chapter,
play a central role in the development of the KAM theorem:
[The] solution clearly indicates that as a —► 0, there will be
no energy sharing unless some resonance denominator is zero,
i.e., unless the uncoupled frequencies are linearly dependent.
Moreover, the form of the resonance denominators [i.e., t^z+i"~
{u)r — ur+2i+i)2] almost demands an interpretation in terms of
internal resonance. (1961, p. 393)
The terms involving this denominator are all multiplied by the perturbation
parameter a, such that when the denominators vanish due to their linear
dependence, both numerator and denominator would tend toward zero in
the small perturbation limit, resulting in the possibility of terms that do
not vanish and so couple the modes together. However, because they are
not linearly dependent in this way, the FPU modes could never drive each
other at full resonance, and Ford concluded that appreciable energy-sharing
would be unlikely and should not have been expected by FPU.
But strict equality is not required to satisfy the approximation in (2.2),
so the question becomes more subtle and more difficult to answer: Can
the FPU unperturbed frequencies be nearly resonant? How close did the
approximation need to be to obtain appreciable energy-sharing? Ford
recognized that the size of the nonlinear coupling term in part determined
how large this approximation interval would be, and so it must be related
to the amount of energy-sharing:
Thus, as the [size of the perturbation] is increased from zero,
one would expect appreciable energy-sharing between higher
and higher modes. FPU happened to choose a value such that
appreciable energy sharing occurred among the first few modes.
(1961, p. 388)

2.3. Ford, 1961 35
Thus it would seem, once again, that because FPU coincidentally chose
a particular size of perturbation parameter, the energy was locked into
the lowest few modes. If the size of the perturbation were increased, Ford
believed that the energy would be shared among higher and higher modes,
eventually reaching equipartition. One cannot help but notice that Ford
has attributed the FPU results to two coincidental choices made by FPU
in the conditions of their model. Ford did consider the approximation in
the resonance condition:
For large N and small fc, however, we see that for uj\ « (J2/2 «
^3/3 « ..., the Uk are "almost" linearly dependent and, strictly
speaking, we should at least allow for the possibility of some
internal resonance, i.e., amplitude modulation. (1961, p. 390)
However, the perturbation techniques available to Ford at that time proved
to be nearly intractable when one had to include terms for the amplitude
modulation associated with near-resonance conditions. Without
considering amplitude modulation, Ford performed several calculations to find out
what to expect in the case of maximum possible internal energy-sharing—
i.e., when all the aVs are equal. He found, via an averaging technique, that
each mode almost always possesses \/N of the total energy, but his results
did not preclude the possibility of any particular oscillator from having all
of the energy (l/AT)th of the time. From this result, Ford concluded that,
at best, equipartition of energy can be achieved on the average, but not
ergodicity. He also tried to retrodict the recurrence times of FPU using
perturbation techniques. His findings are included in Figure 2.1, about which
he qualifies: "Quantitatively, then, the two solutions [his and FPU] are not
comparable; qualitatively, however, they are similar" (1961, p. 392).
Without using amplitude modulation terms, Ford had no way of showing
definitively whether FPU should or should not have appreciable energy-
sharing. Believing in the simulation results, Ford needed an explanation
for what might allow some energy-sharing, but not beyond the first few
modes. As was mentioned in the first chapter, the only known barriers in
phase space were due to integrals of the motion. Ford knew of a result from
Balescu to the effect that if the unperturbed frequencies are linearly
dependent, then nonlinear equations of this type cannot possess any analytic
constants of the motion other than the total energy (Balescu, 1956, p. 622).
From this result, Ford formulated certain expectations:
Consequently, when the uJk are linearly independent, as is true
for the FPU chain, we must anticipate the possibility of finding
analytic constants of the motion other than the total energy;
and in particular, one must anticipate the possibility that each
normal mode energy generates an analytic constant of the
motion. (1961, p. 388)

36 2. The FPU Research Program: Echoes on a String
A3 OOo it ooo
FIGURE 2.1. Ford's modal energy plot of the FPU chain (N = 32, a = \)
without considering amplitude instabilities.
Expectations such as this border on the fallacious: Let statement A =
"unperturbed frequencies are linearly dependent" and let B = "There are
no constants of the motion," then Balescu's statement is A —> B, a simple
inference. Ford posits "not A" (~ A) to get ~ B. But ~ A and (A —> B)
together provide no logically conclusive statement about B. All we can
say is that Balescu's result does not preclude constants of the motion for
FPU. By anticipating constants of motion because they are not logically
excluded, Ford demonstrates a strongly ingrained intuition that bounded
trajectories probably result from constants of the motion. Although Ford
did allow for some energy-sharing because of the approximation in the
resonance condition, he relied on the physical intuition that only resonance
in the unperturbed frequencies could lead to appreciable energy-sharing.
FPU could not have resonance, therefore it seemed likely that there must
be integrals of the motion corresponding to each of the modes, thus locking
the energy into those modes. Furthermore, as the perturbation parameter
is increased, the integrals might dissociate in an upward cascade leading to
more and more energy-sharing. The intuition is good and the results fair;
but little could be said conclusively based on Ford's analysis.
Both Zabusky (1963) and Jackson (1963b) offered some objections to
Ford's work. As discussed in the next section, Jackson's analysis divides
the behavior of nonlinear oscillators into two categories whose properties
depend on the size of the perturbation parameter. He agrees with Ford's

2.4. Jackson, 1963 37
observations concerning nonresonance due to the linear independence of
the unperturbed frequencies; but he goes on to point out that these results
are actually irrelevant to the FPU case, because they apply only to weak
nonlinear coupling, whereas the FPU simulations always fell into his
category of strong coupling (and large iV), where the linear independence of
the unperturbed frequencies play no role in the energy-sharing properties.
Zabusky called Ford's analysis incomplete because he worked without
considering the "amplitude instability" of FPU. Zabusky goes on to point out
that Ford's results (see Figure 2.1) show minima and maxima (of modal
energies) that are out of phase with the FPU results.
Finally, in order to dispel the problem with FPU's choice of N in powers
of two, Tuck and Menzel, also in 1961, performed a run of their simulation
using N = 17. The results were consistent with all the others in both their
work and in the original FPU simulation. Energy remained in the lowest few
modes and there appeared to be quasi-periodicity. Although these results
do not overturn Ford's work, in that 17 is still a prime number, they do
eliminate the powers-of-two connection.
2.4. Jackson, 1963
Two years after Ford's initial article, E. Atlee Jackson published a two-
article sequence (1963a, 1963b) following the same line of inquiry. In the
first, Jackson developed new perturbation methods that were not ordered
in powers of the perturbation parameters, as were traditional perturbation
expansions. Using these methods he claimed to show, in his second paper,
that
the nonergodic behavior of this [FPU] system does not result
simply from the incommensurability [linear independence] of
the uncoupled frequencies {uJk}, but also from the particular
form of mode interaction and the initial conditions used in all
calculations, both of which affect the coupled frequency
spectrum {nfc}. (1963b, p. 686)
Jackson established an iterative procedure for determining the "shifted"
frequencies Qk of the perturbed system, which are functions of the
perturbation parameter A and also N. For the FPU case, these shifted frequencies
are approximated by
nk~uk+r^)A2ku;k) (2.3)
where the constants Ak are related to the mode energy of mode k. Jackson
showed that the particular form of perturbed frequencies play a significant
role in the behavior of the perturbed system.

38 2. The FPU Research Program: Echoes on a String
Using the cubic FPU Hamiltonian perturbation,
H = H0 + XH1 = - ^(d| + (J2ka2k) + - ^2 Vjkiajakah (2.4)
k jkl
where LOk is still denned as in (2.1) and Vjki is symmetric in the indices,
Jackson denned two coupling-strength regimes: strong coupling (limA —>
oo) and weak coupling (limA —> 0). He claimed that for the conditions of
the FPU experiment, weak coupling corresponded to (A «C 0.1), and since
FPU used values of A larger than this (A = \ and A = 1), FPU must be
considered as a case of strong coupling.
Although Jackson discussed both what he called the "perfect chain" of
FPU as well as other, imperfect chains, I will restrict my discussion to
what is applicable to the FPU problem. According to Jackson, a perfect
chain is one with all identical spring constants. This was indeed one of
the simplifying assumptions of FPU; any real string or system of harmonic
oscillators will have variations between each of the springs. For such perfect
chains, Jackson found that "energy is only transmitted to the higher modes
once the intermediary modes have become excited" (1963b, p. 687). This
succession of modes does correspond to what we saw in the FPU results.
All the initial energy was placed in the first mode and then the string
was released. The energy initially flowed toward an ascending succession
of mode numbers. Jackson and Ford both shared in this expectation.
Jackson focused his analysis on the concept of the Poincare recurrence
time—the time for some initial condition to "nearly" recur. He defined the
recurrence time,
2tt N
t\ = q, xy where Q(m, A) = ]Pmfcftfc(A). (2.5)
Although this method seems useful to predict the return of initial
conditions—certainly important for any effort to explain what happened in
FPU—it is not a very informative mechanism for understanding what
would cause such a return. The subscript, A, indicates Jackson's contention
that the recurrence time depends on the size of the perturbation:
If t\ increases when A increases, this would mean that the
energy exchange would continue for a longer period of time when
the coupling is stronger—a feature which is presumably
necessary, even if not sufficient, for ergodic behavior. Actually what
is found to occur for the particular initial condition [of FPU],
is that r\ decreases as A increases. This behavior of t\ will be
shown to be dependent on the initial condition, and therefore is
not necessarily a general feature of such systems. (1963b, p. 687)
Once again we find that the FPU results must have come from an
exceptional case of initial conditions.

2.4. Jackson, 1963 39
(a)
OSCILLATORS. II
n r
(b)
0
1
\
\
2/
h
\
\
\
\
5
3/
y
4
3r
57]
.2
\
\
\
3A
|
v
V
\
\
V
A
/
\l
>
K4
lY
Y\
^
^
j
/
/
/
/
in
\
^
/
/
/
/
f
\
^
-1-
\
U
20 30
f IN THOUSANDS OF CYCLES
FIGURE 2.2. Compare Jackson's recurrence calculations (a) with the
appropriate FPU results (b). Notice the symmetry in Jackson's figure as
compared to the obvious asymmetry in the FPU figure.

40 2. The FPU Research Program: Echoes on a String
Jackson noted Ford's recognition that FPU used numbers of particles
only in multiples of 2, and he calculated a theoretical recurrence time for
the case of weak coupling, which while appropriate to Ford's analysis, it
was not, according to Jackson, applicable to the actual FPU work. Using
the FPU conditions (N = 32, A = 1/4, and A = 1), he found results that
were very poor estimates of the recurrence in comparison with FPU. In
particular, the recurrence times n and ti, differed from the FPU calcu-
4
lations by 150% to 400%. From this poor correlation, Jackson concluded
that, whereas Ford's analysis applied to the weak coupling case, the FPU
results indicated the need to consider strong coupling. In his summary of
the results for the case of strong coupling, Jackson found that he could
closely approximate the recurrence times of the relevant FPU case (see
Figure 2.2):
The present results show that the frequencies u(m) [= ]T rrikWk]
are not relevant in determining the periodic behavior of
coupled oscillators. Thus, the nonergodic behavior of this system
does not arise from any singular property of uj(m) but from at
least two other properties. The first is the fact that the relevant
frequencies fi(ra, A) [the denominator in (2.5), the perturbed
frequencies] tend to increase as A is increased. Thus, while the
amount of energy exchange is increased as A is increased, the
duration of this exchange decreases. This feature would not
preclude an ergodic behavior, except for the second property of this
system. This property, which is inherent in all perfect chains,
is that the energy is transmitted to higher modes only via the
intermediary modes. (1963b, p. 695)
Jackson concluded that perfect chains tend to have their lowest modes
preferentially excited, which suggests that if the higher modes had been excited
initially, then more widespread energy distribution might have resulted. In
Chapter 4, we will see this claim made again by Boris Chirikov, but from
quite a different analysis, and we will see it tested by Zabusky and Deem
(1967).
Several important points arise from Jackson's analysis. First, by focusing
on recurrence time, Jackson provided no insights into whether we might
expect ergodicity in any general sense. He seems to indicate that ergodicity
depends distinctly on the initial conditions; although he did mention that
the strength of the coupling plays an important role. His conclusions about
weak coupling show his confidence that ergodicity is unlikely to occur, but
he is much less forthcoming about what to expect in the case of stronger
coupling. Whereas Jackson did allow for the possibility of ergodicity in
this system, Ford, on the other hand, seems to have moved in the opposite
direction, i.e., eliminating the possibility of ergodicity at all.

2.5. Ford and Waters, 1963 41
2.5. Ford and Waters, 1963
Two months after Jackson submitted his second paper to the journal, Ford
submitted a second paper on the subject; this time coauthored with his
student John Waters. In this paper, "Computer Studies of Energy Sharing
and Ergodicity for Nonlinear Oscillator Systems," Ford and Waters
concentrated their efforts on weakly coupled systems with the now familiar
cubic Hamiltonian (compare with (2.4)),
1 n n
#==2^(^+U;^)+a ^ AJkiQjQkQi- (2.6)
k=l j,k,l=l
By "weakly coupled," the authors specifically meant that the coupling
terms seldom obtained more than 10% of the total energy. This condition
they guaranteed by setting all positions (g's) and momenta (p's) initially
to zero, except for the initial condition pi = \/3> and they kept a < 0.1.
This size of the perturbation term places their work well within the KAM
stability zone. Furthermore, Ford and Waters refined the definition of the
resonance condition:
Y^nkujk<a. (2.7)
They confined their work in this paper to this resonance region of the
weakly coupled nonlinear system. With intuition and without thorough
knowledge of KAM, Ford and Waters isolated the region of phase space
most significantly affected by the KAM theorem. Although Ford and
Waters did cite Kolmogorov's 1957 presentation to the International Congress
of Mathematicians: "Kolmogorov claims to have shown that nonlinear
systems, more general than the systems considered here, are not ergodic"
(Ford and Waters, 1963, p. 1294). It is clear that they did not yet
recognize the limitation of Kolmogorov's theorem to weakly coupled systems,
such as those they were concerned with here. Furthermore, no mention
was made of the recent proofs of the theorem by Moser (1962) and Arnold
(1963a). Ford and Waters cited what is essentially the KAM theorem to
claim that these weakly coupled systems are nonergodic in general; but the
KAM theorem is limited to precritical conditions and is inappropriate for
such a general statement.
Ford and Waters focused on the important distinction between equipar-
tition of energy among modes and ergodicity. The former is a property of
the energy distribution and the latter involves the average behavior of the
system trajectory over time and state space. They contended that weakly
coupled, nonlinear oscillator systems could come to equipartition of energy,
if the initial condition places the system near one of the resonances defined
by (2.7). Because of their belief that this system could never be ergodic,
they did not consider the possibility that this region might be a (KAM)

42 2. The FPU Research Program: Echoes on a String
boundary zone; instead, they thoiight of it in terms of the nearness to
resonance, so that related modes can share energy yet still retain their modal
identity. Using computer calculations, they demonstrated that indeed as
they moved the initial conditions closer and closer to one of the primary
resonances (set of frequencies which satisfies the equality contained in (2.2)
exactly), the energy sharing between the neighboring modes increased
significantly. Thus, to their satisfaction, they inferred that equipartition of
energy could occur in this system without ergodicity. This was an
important result, because it separated equipartition from the stronger property
of ergodicity, an especially important result for anyone who would like to
believe that nonlinear discrete systems could still have analytic solutions.
This paper was a significant step forward from Ford's previous effort, in
that they now accepted the possibility of energy-sharing in these systems.
The shift in emphasis from the original, strict, resonance condition (2.2)
to the new condition (2.7), that is dependent on the size of the
perturbation, acknowledges the influence of the perturbation on the energy-sharing
properties. Some region of phase space about the resonances, with width a,
allowed for energy-sharing, which is remarkably similar to the KAM result.
How can the normal modes share energy and yet retain their identities
once the perturbation is turned on? Ford and Waters turned their attention
to the possibility that a constant of the motion might persist into the
nonlinear regime. Beyond the energy integral, there would need to be an
additional integral of the motion to account for the continuation of the
modal boundaries. They tried numerically recalculating the value of the
hypothesized second constant of the motion (Whittaker's adelphic integral)
on each iteration of the numerical model. For a nonlinear, five oscillator
system, weakly coupled, they estimated the value of the theoretical constant
of the motion (see Figure 2.3).
We see that the integral of the motion is not exactly constant, but it
seems to be nearly constant in time. Ford and Waters concluded that this
figure demonstrated the existence of this constant. They further
conjectured that probably there exists a full set of constants of the motion
(making the system totally analytic and integrable) because this second constant
persisted into the weakly nonlinear region and because the normal modes
seemed to be stable there also. We xcan see from Ford and Water's
statement of this constant ($): $ = J\ J% cos(02 — 20i)+ a<l>i H that it is a
combination of the actions. But it is also a function of the angles, which
would require them to maintain a very special relationship, because they
clearly vary in time. But then, so do the actions, once the perturbation is
turned on.
Ford and Waters made no conclusion as to what to expect from this
system in the case of stronger coupling, and by doing so, they avoided the
issue of transition to ergodicity. This analysis is enlightening but static.
It is static because it only considers the class of weakly coupled systems
without then pushing their results to the limits of the class; but it is def-

2.6. The Continuous String 43
COUPLING ENERGIES (NONLINEAR SYSTEM)
♦It
.TOTAL COUPLING ENERGY
•RESONANT COUPLING ENERGV
—i—
20
10
30
40
TIME
50
60
70
80
FIGURE 2.3. Ford and Waters conclude the existence of the second
constant of the motion from the evidence of this plot.
initely enlightening because they came close to modeling the behavior of
these systems in the region of KAM stability. This explanation does not
allow for any mechanism for the constants of motion to vanish at higher
values of the perturbation parameter. With such a limited explanation,
once the perturbation was turned on, if the constants of motion did indeed
persist, then just changing its value would be an unlikely cause for them to
disappear. The switch from linear system to nonlinear system is a major
discontinuity in system type; it is at that threshold that qualitative change
must occur. The KAM theorem allows for this change and it provides a
logical transition in topology from integrable system to global deterministic
chaos.
2.6. The Continuous String
Unconvinced by the "conventional physics" approaches to the FPU problem
by both Ford and Jackson, Norman J. Zabusky and M.D. Kruskal, then of
Bell Laboratories and Princeton University, respectively, decided to tackle
the FPU problem from an alternative perspective, that is, as a continuous
string. Perhaps the mode recurrences could have resulted from the way in
which FPU "discretized" time for their lattice model. Fermi et al. (1955)
chose a loaded string instead of the continuous one because of the discrete
requirements of the simulation:

44 2. The FPU Research Program: Echoes on a String
Fermi-Pasta-Ulam
Simulation 1954
Zabusky 1962 I
i
Zabusky 1963 J
Zabusky-Kruskal §
1964-65 I
For the purposes of numerical work this continuum is replaced
by a finite number of points so that the partial differential
equations defining the motion of this string is replaced by a finite
number of total differential equations. ... The corresponding
partial differential equation obtained by letting the number of
particles become infinite is the usual wave equation plus
nonlinear terms of a complicated nature. (Segre, 1965, p. 979)
The mechanical picture of the macroscopic string began as a continuum;
but for simulation, the continuum was replaced by a lattice of point masses
separated by springs. Because the differential equations for a set of coupled
harmonic oscillators becomes the wave equation in the continuous limit,
we have a clear understanding of the affect of this approximation on the
unperturbed Hamiltonian. By extension, we might expect some analogous
smooth transition from the discrete perturbation to the continuous one;
but, as we see from Zabusky's work, this is not the case. Whereas the
continuous, nonlinear, perturbed Hamiltonian turns out to be integrable,
the discrete, perturbed Hamiltonian of FPU is explained best using the
properties of the KAM theorem, which applies to a discrete lattice. Whereas
"of a complicated nature" seems to indicate that Fermi et al. knew there
would be some effect, they also seemed to think this would be only a minor
perturbation to the "usual wave equation."
Zabusky was initially drawn to a study of the FPU string without the
truncation of space, so he directed his own work toward the theory of
continuous, partial differential equations and not simulation. He elected to
solve the continuous analogue of the FPU system, analytically. But I believe
he was very interested to see how the continuous solution would compare
with the results of FPU, thus showing that he believed in the simulation,
especially in light of the work by Tuck and Menzel (1972). At first Zabusky
chose to treat what he called "the lowest continuum limit" version of the
string, which means keeping only the lowest-order terms of the continuous
expansion of the FPU Hamiltonian.

2.7. In the Continuous Limit 45
2.7. In the Continuous Limit
In 1962, Zabusky published the paper "Exact Solution for the Vibrations
of a Nonlinear Continuous Model String," in the Journal of
Mathematical Physics. He attempted to explain the FPU results by resorting to the
partial differential equation that arise from taking the FPU equations to
the continuous limit, i.e., taking the number of lattice points to infinity
while simultaneously shrinking the spacing between the sites to zero. The
resulting partial differential equations are
Vtt = {l + eyx)aVxx, (2.8)
where e is the perturbation parameter, a is the degree of the nonlinearity,
yu is the second partial derivative of displacement with respect to time,
yxx is the second partial derivative of displacement with respect to position
along the string, and yx is the first partial derivative of displacement with
respect to position.
Zabusky applied a so-called "Riemann integration technique" to obtain
a solution to order e for this equation, which involved inverting the roles
of the dependent and independent variables, solving the resulting linear
equation with an integral and then reinverting to get the solution of y as a
function of x and t. But Zabusky ran into a problem with this technique:
"It is demonstrated that yx develops a discontinuity after an elapsed time
of order (1/e)" (1962, p. 1028). The continuous solution breaks down for
the FPU case even before the second mode achieves its first full maximum
(see Figure 2.2), which is well before the recurrence of the first mode.
Consequently, very few conclusions can be drawn from this analysis except
to try to account for the breakdown. Zabusky pointed to further work which
might be tried to penetrate the FPU recurrence:
The computations of Fermi et al. indicate that the vibrations
of a finite number N of coupled, nonlinear, equimass
particles do not develop such a discontinuity. Thus, a continuous
nonlinear system described by a partial differential equation of
second order cannot describe the vibrations of the equivalent
discrete system for "large" times. To account for the FPU
results by a continuum representation, one is led to include terms
which measure the discreteness or "graininess" of the medium.
These terms appear quite naturally if we retain quantities of
order (l/N2) that arise in the limiting process. These terms
involve higher derivatives (for example, ytttt, Vxxxx, etc.) and
should affect the vibrations most at those points on the string
where breakdown "tends" to occur. These terms are analogous
to the viscosity-like terms that are added to the lowest order
hydro-dynamic equations to prevent a discontinuity from
forming. (1962, p. 1039)

46 2. The FPU Research Program: Echoes on a String
2.8. Discreteness as Viscosity
After the 1962, 1963 papers, Zabusky went back to his original equations
and reconsidered the first-order nonlinear version of (2.8), with the
addition of the fourth-order derivatives, which he believed would represent the
discreteness of the FPU string:
-§ = (1 + eyx)yxx + ( — ) yxxxx, (2.9)
c2
where c2 = h2u^ is the wave speed in the linear limit. Zabusky then
introduced the variable substitution
-Ks[(i+*),-i]-')> <2l0)
which is an invariant quantity for (2.8). Employing this substitution and
transforming to a moving (Lagrangian) reference frame (x —► [x — c£]),
Zabusky arrived at the Korteweg-de Vries (KdV) equation:
uT 4- uux + 62uxxx = 0, (2.11)
, ect c2 h2
where r = —— and o = ——.
2 12e
This equation was first obtained by Korteweg and de Vries in 1895 to
describe hydrodynamic wave propagation in shallow water. The third-
order partial derivative in this equation, in terms of traditional wave
equations, represents a dispersive medium. Analyzing this equation, Kruskal
and Zabusky (1964) discovered the existence of nondispersive "solitary-
wave pulses," which became known to the plasma physics community as
"solitons." The unique and exciting property of the soliton is that several
of them can coexist in one medium and propagate through one another
without losing their identity. These solitons have the mathematical form
ti = tie* + (tio - tXoo)Sech2 i}X~^\ , (2.12)
where uq, Uqq, and xq are arbitrary constants and
i
'(Uo-Uoo)'
A = 6
C = Woo +
12
(UQ-Uoo)
3
u = U(x — ct).
In their attempt to include terms which would account for the discreteness
of the medium in the continuous partial differential equations, Kruskal and
Zabusky (1964) discovered a nonlinear wave with particle-like properties.

2.9. The First Soliton Paper 47
2.9. The First Soliton Paper
In 1965, Zabusky and Kruskal reported their soliton discovery in Physical
Review Letters, in a paper called "Interaction of 'Solitons' in a Collisionless
Plasma and the Recurrence of Initial States." They described the KdV
equation and the soliton solutions, including the important property of
"focusing," in which these wave-pulses pass through one another but do
not sum in the usual linear-wave superposition. Using this property of
the solitons, the authors claim to "give a phenomenological description
of the near recurrence to the initial state in numerical calculations for
a discretized weakly nonlinear string made by Fermi, Pasta, and Ulam."
Their explanation was:
In conclusion, we should emphasize that at Tr all the solitons
arrive almost in the same phase and almost reconstruct the
initial state through the nonlinear interaction. This process
proceeds onward, and at 2Tr one again has a "near recurrence"
which is not as good as the first recurrence. Tuck, at the Los
Alamos Scientific Laboratories, observed this phenomenon as
well as eventual "superrecurrences" in calculations for a
similar problem. We can understand these phenomena in terms of
soliton interactions. For t > Tr the successive focusings get
poorer due to solitons arriving more and more out of phase
with each other and then eventually gets better again when
their phase relationship changes. Furthermore, because the
solitons are remarkably stable entities, preserving their identity
through numerous interactions, one would expect this system
to exhibit thermalization only after extremely long times, if
ever. (Zabusky and Kruskal, 1965, p. 242)
This explanation is very appealing because it accounts for the super-period
and it provides a causal mechanism for the recurrence, both of which were
lacking in Jackson's explanation. If we have several solitons moving along
the string, independently of one another (except that they accelerate as
they pass through each other) and we understand these waves carry energy,
then we can imagine that as their relative positions change, the energy
passes from mode to mode.
As with any explanation, there are drawbacks. In all of their work,
Zabusky and Kruskal chose to use periodic boundary conditions, instead
of the fixed endpoints of the FPU problem. With periodic boundary
conditions, it is clear that waves will return periodically along the string, but
with fixed endpoints, the waves would have to reflect from the ends and
return, which is not the same process, and we have no idea how problematic
this would be for these new, nonlinear waves. No justification was made in
the 1965 paper; but later, one of the authors (Zabusky, 1967) did at least
address the switch in terms of the pre-KdV continuous equation (2.8):

48 2. The FPU Research Program: Echoes on a String
We showed that the lowest FPU modal energies agreed up until
breakdown with those obtained from a Fourier decomposition
of the waveform y(x, t). The analysis proved, trivially, that we
would also get breakdown if we took one Riemann invariant
finite and the other zero and replaced the fixed boundary
conditions by periodic or cyclic boundary conditions. The analysis
also showed that, although the phenomena are nonlinear,
signals propagating along the different characteristics did not
interact with each other to lowest order. This suggested that the
recurrence phenomena would be preserved for these analytically
easier-to-treat initial-boundary conditions. A computer
simulation for these conditions gave recurrence and we were on our
way to a better analytic understanding. (1967, p. 231)
In order to recreate the recurrence of FPU, Zabusky chose to go to the
periodic boundary conditions because they were easier to use. But, even
if they do get recurrence this way, does it mean that they would also get
recurrence if they used the more difficult, fixed boundary conditions? Later
in his development of soliton theory, Zabusky discussed the affect of the
boundary conditions on the recurrence time:
Note that it was important to use periodic boundary conditions.
This allowed the solitons to "circulate" until they were arrayed
for focusing. We now realize that had we chosen an initial
condition for the lattice problem which was neither a pure standing
wave nor a pure progressive wave, we would have obtained a
different focussing time for the solitons propagating along the
two characteristic directions. Therefore, the recurrence would
not have been as good as the one we observed. (1967, p. 244)
As for the energy conservation problem, in the 1965 paper, they mention
in a footnote that "the energy is almost conserved" (p. 243). It turns out
that they had to neglect some higher order terms and they achieved energy
conservation only to the fifth significant figure. So after a long time—many
cycles of the calculation—such as that necessary to reproduce the super-
cycles, it would seem possible that the energy loss might be significant
enough to cause a disagreement.
We can see from his decisions that Zabusky was stongly motivated to
understand the results of the FPU simulation and that it was more the
simulation results than the string that was interesting. Whereas FPU was
approximating a continuous string with discrete equations, Zabusky was
approximating the discrete approximation with continuous equations. FPU
was trying to model a real string and Zabusky was modeling a model. He
added higher-order terms to the continuous string to model the
discontinuities, or the "graininess" of the discretized string. He developed a new
model that incorporated terms usually considered to be velocity-dependent:

2.9. The First Soliton Paper 49
The dispersiveness of the beaded string can be treated in a
continuum representation by keeping all or some of the higher
derivative terms that were omitted in taking the
lowest-continuum limit. In recent years, much effort has gone into
resolving the singularities or breakdowns of hydrodynamic problems
by incorporating "viscosity-like" higher derivative terms. These
viscosity terms are dissipative and not dispersive in nature, and
one would not expect the mathematical techniques for treating
the former to be applicable to the latter. (1963, p. 128)
The additional terms Zabusky added to account for the discretization had
not been considered before because they were traditionally associated with
other effects, so it seemed unusual to consider them.
The approximation of a discrete system of oscillators is clearly not just
the usual "continuum limit," as Zabusky said: "There is a fundamental
disparity between the continuous and discrete representations of nonlinear,
physical systems" (1963, p. 101). There is a lot more to be learned in
probing the boundaries between discrete and continuous systems, beyond simply
taking the lowest-order, continuous limit. Zabusky had a set of standard
presuppositions about the effect of higher-order terms in the continuous
expansion that might have prevented him from discovering the soliton. By
putting his expectations aside in an effort to understand the FPU results,
Zabusky made an important discovery. We can see that he believed in the
behavior described by the FPU simulation, because he continued to change
his own continuous model in order to obtain results similar to those of FPU,
rather than try to understand the lowest-order continuum approximation
to the string, which was in disagreement with the FPU results.

The Kolmogorov-Arnold-Moser
Theorem: "Here Comes the Surprise"
In order to connect familiar with unfamiliar territory, in other
words to make the unfamiliar familiar, to be able to find his way
back, the explorer cuts a blaze precisely on the boundary of the
known world. ... The region from which the blaze is visible is
unfamiliar in the sense that the explorer has never seen it before,
but familiar in the sense that he still knows his way back from it,
and can safely venture there.
Frederick Turner (1978, p. 625)
But here comes the surprise ... All the isolated points correspond to
one and the same trajectory, just as the points on one of the closed
curves; but they behave in a completely different way. ... They
seem to be distributed at random, in an area left free between the
closed curves. Most striking is the fact that this change of behavior
seems to occur abruptly across some dividing line in the plane.
Michel Henon and Carl Heiles (1964, p. 76)
3.1. A Brief History of Dynamics
Around 1610, Galileo began the formalization of terrestrial mechanics,
while Kepler did the same for celestial mechanics. Sir Isaac Newton saw fit
to combine these into one subject, unifying them with his calculus. Newton
himself worked out the equations of motion for the two-body problem in
order to demonstrate his method (1687). In 1788, the great Italian-French
mathematician Joseph Louis Lagrange, who had been working with Euler
on the calculus of variations, was able to formalize his brilliant generalized
method of finding the equations of motion for any mechanical system—
the same method we use today. It was also Lagrange, and then Laplace,
who recognized the difficulty of understanding the motion of three bodies,
and so arose the question of the stability of the Solar System. In 1834, in

52 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
Dublin, Sir William Rowan Hamilton placed position and momentum on
equal footing as the canonical variables of dynamics, and he also showed
that the variational principle beneath dynamics was the same as the
principle of least time in optics, thereby uniting the two disparate formalisms at
the theoretical level. Soon after Hamilton's contribution, Joseph Liouville
proved a theorem implying that if energy was conserved in a system, then
any volume of initial conditions in phase space must be conserved
throughout the evolution. Liouville's theorem holds even if the system is ergodic
and highly complex. In the late nineteenth century, statistical mechanics
split off from dynamics to study the statistical behavior of large systems
of particles, whose highly complex behavior was out of reach of the
traditional method of following the paths of isolated trajectories. Beginning
with Liouville's theorem, statistical mechanics could define volumes of
initial conditions and study the so-called "hydrodynamics" of this conserved
volume through time.
At the end of the nineteenth century and into the twentieth, Henri
Poincare made dramatic progress in the topological methods of
dynamics. He applied the qualitative formalism of topology and geometry to the
study of phase-space trajectories. He began using the equations of
dynamics to define surfaces, such as the energy surface, in phase space. With these
definitions, the evolution of dynamics could be achieved through a series of
structural surface transformations—including changes in shape, dimension,
and topology. But Poincare also proved a general theorem (1890, p. 233)
showing that most dynamical systems do not possess a complete set of
analytic integrals of the motion; thus they could not be expressed as a closed
set of equations of motion—as were used to predict the far future of a
dynamical system. He also found that the perturbation methods available at
that time, which were used to approximate the solutions of nonintegrable
systems, were perhaps not reliable for long-term prediction either. His
findings were so conclusive that dynamicists could not see how to go beyond
the very few already-solved problems of dynamics. For half a century, while
general relativity and quantum mechanics were revolutionizing physics,
dynamics remained at nearly a standstill; although there were many notable
and important contributions, such as those of Birkhoff, Whittaker, and
Siegel.
Around 1954, on two fronts, new work in dynamics got under way. Even
though they were isolated from one another by language, culture, and
geography, these researchers began working on the same problem from two
quite different perspectives. In the United States, Fermi and his group put
the new digital computer to work simulating the simplest nonlinear
problem and were surprised by the results. In what was then the Soviet Union,
Andrei Kolmogorov had come to a conclusion about the behavior of
dynamical systems with small nonlinear perturbations. And although it remained
unproven until 1962, his conjecture (possible theorem yet to be proved) had
direct implications for the outcome of the FPU simulation. The failure of

3.2. The Fundamental Problem of Dynamics 53
FPU generated much continuing work for physicists throughout the 1960s
and early 1970s. At the same time, following the proofs of Kolmogorov's
conjecture by V.I. Arnold and Jiirgen Moser, applied mathematicians and
physicists began studying the implications of the theorem for nonintegrable
dynamical systems. Eventually, during the 1960s, these two threads came
together and provided the first pathway into the new dynamics.
_
Kolmogorov's
Conjecture
1954
\
'l
Arnold 1
1
96
3 1
^r
Moser l
1962
3.2. The Fundamental Problem of Dynamics
Poincare himself called the problem of studying perturbations of
conditionally periodic motions in a system given by the Hamil-
tonian
H = Ho(I) + eH1{I10), c<l,
(3.1)
in action-angle variables J and 0, the fundamental problem of
dynamics. Here Ho is the Hamiltonian of the unperturbed
problem, and eH\ a perturbation which is a 27r-periodic function
of the angle variables #i,... ,0n- In the unperturbed problem
(e = 0) the angles 0 change uniformly with constant frequencies
Uk =
dHn
dlh'
(3.2)
and all the action variables are first integrals.
(3-3)
We must investigate the phase curves of Hamilton's equations
dJ __d# d0 _ dH
dt~ do' dt~ or
in a phase space which is a direct product of a region in n-
dimensional space with coordinates J and the n-dimensional
torus with angular coordinates 0.
V.I. Arnold (1978, p. 400)

54 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
The proto-typical problem of this' class is the three-body problem, which
goes back to the very beginning of dynamics. Poincare pronounced it un-
solvable and we shall see why. The FPU Hamiltonian also falls into this
category. In the Appendix, I translate the FPU model into the language
of Hamiltonian dynamics. In that venue, the model becomes an
explicitly time-independent Hamiltonian function, expressed exactly the same as
(3.1)—which represents the total energy (conserved) for a system of
coupled oscillators. The action-angle variables (J, 0) suggest a visualization of
the dynamics as orbits on the surface of an n-dimensional torus in phase
space. For the unperturbed model, the actions {Ik} form a complete set of
n integrals of the motion, that imply the simple linear progression of the
angle variables 0k on the surface of the n-torus. The unperturbed motion
is of two types, both of which are periodic in each of the n degrees of
freedom. But one type is periodic overall and the other never visits the same
state twice. If the set of frequencies (3.2) are commensurate, satisfying the
equation
n
y] rrik Wk = 0, for any integer values of ra^, (3.4)
then the orbit will form a closed curve on the toroidal surface, periodically
repeating itself after an exact whole number of windings. However, if the
set of unperturbed frequencies are incommensurate, then the orbit will
be conditionally periodic—an open curve filling the entire toroidal surface
without ever returning to exactly the same configuration. When we turn
on the perturbation (by setting e > 0), the difference between periodic and
conditionally periodic unperturbed orbits, or more specifically, just how
close to commensurate the frequencies are, determines which of the two
radically different types of motion will result.
Perturbation Theory
Because these models are not usually integrable, the method of
perturbations was developed—mainly by Poincare—to deal with problems of this
nature. But this method has had some profound problems of its own that
needed to be overcome before progress could be made on the fundamental
problem of dynamics.
The procedure of perturbing an integrable system with small terms as
successive approximations to the full nonlinear dynamical system goes back
to the early days of dynamics. Newton began using approximations when
trying to understand the motion of the planets, which led him to realize
that his dynamical equations apparently could not be integrated explicitly
for more than two bodies interacting under the force of gravity. The study
of the next order of the problem, the famous three-body problem has a
long history of its own, which I will not attempt to summarize here. At

3.2. The Fundamental Problem of Dynamics 55
the end of the nineteenth century, Poincare formally developed a whole
series of perturbation methods to cope with the problems of celestial
mechanics. These elaborate methods were deemed necessary because Bruns
(1887, p. 70) proved a theorem establishing the non-existence of algebraic
integrals of the motion for the three-body problem—that is, any beyond
the known six integrals of motion for the center of mass of the system, the
three integrals of angular momentum, and the energy. Algebraic integrals
take the form /(#, p, t) = constant, where / is an unspecified algebraic
function. Two years later, Poincare proved a similar theorem that he says:
... is more general in a sense than that of Bruns, because
I demonstrate not only that there exists no algebraic integral,
but that there exists not even a transcendental uniform integral,
and not only that an integral cannot be uniform for all values
of the variables, but that it cannot remain uniform even in a
restricted domain. (1890, p. 256ff)
Thus it seemed that there could be no subsequent integrals of the
motion, which meant nonintegrability in general, and that perturbation
methods may be the only possible hope for studying apparently nonintegrable
dynamical systems. I say "apparently nonintegrable" because there is no
known way of determining that particular property for any general model
type. Whereas the problem of finding ways to integrate specific models is
still lively today, the number of known integrable models is quite limited.
According to Arnold:
The collection of solvable "integrable" problems that we have
at our disposal is not large (one-dimensional problems, motion
of a point in a central field, Eulerian and Langrangian motions
of a rigid body, the problem of two fixed centers, and motion
along geodesies on the ellipsoid). (1978, p. 399)
Yet most of what we have obtained in mechanics since its inception results
from these few cases.
With no further integrals of the motion, it seemed as if no one should have
expected isolating behavior in the nonintegrable problems of dynamics.
In an early work, Fermi (1923) tried to prove an ergodic theorem based
on Poincare's general result. His theorem claimed to show that if, in any
particular dynamical system, no integrals exist beyond the energy integral,
then that system must be ergodic on the energy surface. First he showed
that once a trajectory originated on the energy surface, it would remain
on that surface for all time—a result similar to the Poincare theorem.
But the second part of the theorem tried to show that, given a trajectory
that begins in some arbitrary small segment of the energy surface, after
some time it necessarily would pass arbitrarily close to any other point on
the surface (see Haar, 1954, pp. 358-359). In terms of the FPU system,

56 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
he believed that adding nonlinear perturbation terms would couple the
oscillators, resulting in ergodic energy-sharing. But his proof of the "quasi-
ergodic" theorem was said to be unconvincing. According to E. Serge, the
editor of Fermi's Collected Papers: "The proof of the ergodic theorem given
by Fermi is not considered rigorous from the mathematical point of view
and it is difficult to make it rigorous" (1965, p. 79). It is far from clear
whether Fermi believed his theorem to be as yet unproven, unprovable, or
doubted in any way; but it is clear from the FPU surprise that Fermi et al.
(1955) expected ergodic behavior in their simulation.
The Poincare Theorem and Its Application
The question of whether Fermi's theorem was or was not provable, and so
whether or not it guaranteed ergodicity in the FPU model is further and
more significantly confounded by the simple fact that the original Poincare
theorem itself does not apply to a system of coupled harmonic
oscillators, and so, strictly speaking, it does not apply to the FPU model at all.
Not only was there no actual guarantee of ergodicity, but there was also
no guarantee that there might not be additional integrals of the motion.
In Volume 1, Chapter 5 of New Methods of Celestial Mechanics (1899),
Poincare proved the theorem that forms the basis for many of the
expectations of nonintegrability discussed throughout this period, especially with
respect to the two simulated Hamiltonians of FPU and Henon and Heiles.
I restate the theorem as follows:
Given a Hamiltonian that may be expressed as
H = tf0+etfi4-e2#2 + -.. ,
that is analytic in e, /, 0, and periodic in 0 and satisfies
Hamilton's equations (3.3) and assuming that H0 is a function of n
variables Ik and does not depend on 0 and that Ho also satisfies
the nondegeneracy condition:
dij
=
d2H0
dlidlj
± 0, (3.5)
then there exists no uniform integral, other than H = constant,
analytic in e, /, and 0.
In action-angle variables, the unperturbed Hamiltonian for any system of
coupled harmonic oscillators is the sum of the action-times-the-frequency
of each mode,
n
H0(I) = Y,"kh. (3.6)
fc=i

3.3. The Small Divisors Problem 57
Notice that this expression is linear in the actions and so all second
derivatives will be identically zero. Therefore no Hamiltonian that is a system
of coupled harmonic oscillators can satisfy the condition of nondegener-
acy (3.5). But this condition is necessary for any strict application of the
Poincare theorem. When formulated as a perturbed chain of coupled
harmonic oscillators, the FPU Hamiltonian satisfies neither the Poincare
theorem, nor, as we will see, the KAM theorem. However, taken as a whole,
the FPU Hamiltonian with its nonlinear perturbation could be seen as
a perturbation of another, unknown, nonlinear dynamical system, whose
unperturbed part does satisfy the nondegeneracy condition, even though
Fermi et al. did not conceive of it that way. But the FPU Hamiltonian could
be close to one that does satisfy the Poincare theorem, and then perhaps
Fermi's theorem as well, if it is truly a theorem at all. Clearly, the FPU
simulation showed that the system did not have a complete set of integrals
of the motion, and at the same time, it was also not ergodic. Somewhere
between these two extremes lies an explanation and a new realm of dynamical
behavior.
3.3. The Small Divisors Problem
Difficulties encountered in celestial mechanics on account of the
existence of small divisors and approximate commensurabilities
of mean motions are connected with the very nature of things
and cannot be avoided.
H. Poincare (from Arnold, 1963b)
The KAM theorem appears to provide a long-awaited solution to a
problem that essentially blocked analytical studies of dynamical systems using
perturbation theory for the first 50 years of this century. In this section we
begin our study of the KAM theorem directly from the analytical
perspective, by reviewing the problem for which it provides the solution, and then
the solution itself.
Because the force equations of dynamics typically cannot be solved
analytically in closed form, the usual approach to finding a solution begins
by expanding the force functions in an infinite series about some small
parameter e. The point of making the expansion is to approximate the
dynamics by subsequently truncating the infinite expansion to a finite sum.
Hopefully these terms can be integrated in some way to predict the
behavior of the system in lieu of a complete solution. The first term is the
linear term, which is always integrable, and often is in itself a sufficient
approximation to the dynamics. But as an approximation, any truncation
of the full infinite series must diverge from the true behavior after some
period of time. But the difficulty with adding more terms lies in the fact
that, as we add them, we lose the simplicity of the solution. As we have

58 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
seen in the FPU problem, adding force terms to the linear harmonic
oscillator results in the loss of any analytical solution and requires new methods
of analysis. Classical perturbation theory, which operates in the realm of
energy and Hamiltonians—instead of forces—is the general name given to
this approach. It proceeds by considering the changes in dynamical systems
due to small, angle- and action-dependent terms added to the integrable
Hamiltonian—successive approximations to the full dynamical problem.
The additional terms are considered to be correction terms of order e—
where e is small and Hi is of period 2tt in 0. Usually the integrable part of
the Hamiltonian is normalized: we define it to be of order one. The smallness
parameter e is used to control the size of the perturbation or correction
terms, the first of which is typically the size of e to the first power. This
is the essential structure of any perturbation-theory analysis. Underlying
this method is the structural-stability assumption, that the behavior due
to the perturbation will be only a slight modification of the well-known
behavior of the integrable system model. In general, predictions made from
a Hamiltonian with this one additional term are valid up to times on the
order of t ~ 1/e. The example that was the main driving force for all of this
work was the stability of the Solar System: How long can we expect the
planets to remain in their orbits as they are today? Will they eventually
collide? Fall into the Sun? or Will they escape into interstellar space? In
order to answer such questions, We require predictions on astronomical
time scales, which requires huge reductions in the size of the correction
terms. The first step in answering this question is the study of the three-
body problem, the goal of most of Poincare's work in celestial mechanics.
Here in perturbation theory, we once again encounter the usefulness of
Hamiltonian mechanics. According to the approach of classical
perturbation theory, we would like to find a canonical transformation (J, 0) —>
(J, <j>) that transforms H, retaining the first term's angular independence,
while reducing the size of the correction term. Functionally, we want
H(I, 0) - ff(J, (j>) = H0(J) + 62Hi(J, 0), (3.7)
which maintains the angular independence of Ho and reduces the error to
order e2. Predictions based on this model would be accurate to times on
the order of t ~ 1/e2. Such a transformation would be a function of e and
it would eliminate the terms of order e by substituting terms of the smaller
order e2 (e < 1 => e2 < e). If we can find such a transformation, then,
ideally, we would like to further reduce the error by applying a series of such
canonical transformations, where each successive transformation reduces
the error by at least one power of e and extends the accuracy of predictions
to longer times. In the limit of an infinite number of these transformations,
if this sequence converges (reduces the error to zero without causing H
to become pathological), then we will have found a nonlinear, integrable
system. However, Siegel (1956) has shown that these so-called asymptotic
series do not converge in general. This then is the manifestation of the

3.3. The Small Divisors Problem 59
problem brought forth by the Poincare theorem: not only could we not
integrate the equations, but now there appears to be a severe problem with
the only other method we know about. It was this general convergence
failure of these series that had blocked any significant progress in dynamics
from Poincare (1899) up until Kolmogorov's 1954 paper. But before we
discuss modern dynamics, let us study this convergence difficulty in more
detail.
There are two related problems with the convergence of the asymptotic
series, both are due to the so-called small divisors (see the extensive review
of this subject in Arnold, 1963b). Each canonical transformation, required
to reduce the order of the error by one power of e, involves terms that take
the form of Fourier-series expansions, such as
S(J, 9) = Vj,ft + ^ p¥±- cos HT Mi J , (3.8)
i=i k^L.iK^ \ i )
This highly complicated, canonical-transformation generating function
involves one variable (0) from the old variables and one (J) from the new
set. The Bk terms are Fourier coefficients that depend on the new actions.
Notice that in the denominator of every term in the second sum we find the
very same commensurability condition (3.4) that distinguishes between the
two types of unperturbed motion—periodic and conditionally periodic. The
presence of this denominator is a consequence of both the Fourier series
expansion and the requirements of the canonical transformation theory. This
expression vanishes for some appropriate set of integer coefficients {fc?},
which means that the Fourier expansion terms "blow up," that is, they
cause the Hamiltonian to approach very large energies. This pathological
condition indicates that the series cannot converge. Frequencies naturally
commensurate in a physical system are called resonances because the
rational relationship between them represents a conduit for energy transport
among the modes. This is the same effect as when the forcing frequency of
a driven harmonic oscillator is in resonance with the natural frequency of
the oscillator; the super critical transport of energy can drive the
oscillator to the breaking point. But even if we knew that a specific system had
no naturally occurring physical resonances, these series would still diverge
mathematically, because the rational numbers are dense on the set of real
numbers, so there exists a set of coefficients that will make this
denominator approach zero. The traditional method of perturbation theory did not
provide a way to avoid a rational number in the denominator with each
additional correction. One of Kolmogorov's insights was just such a method
that would avoid the small divisors, by making certain adjustments with
each additional transformation.
Notice that even if the frequency sums were guaranteed to be irrational,
we would still see very large terms arising from "near rational" irrational
numbers, that is, irrational numbers whose ratios are very close to ra-

60 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
tional, and so cause the divisors to become very small, although
technically nonzero. But even if, for the moment, we could ignore the
rational numbers—for which the terms in the expansion actually could go to
infinity—then we would still have a problem with convergence. This
second problem is that in an asymptotic series, such as all those from classical
perturbation methods, the convergence is so slow in coming—one power
reduction in e per transformation step—that the terms of the series get
large faster than the reduction in e.
The divergence of these perturbation series actually tells us very little
about the dynamical system itself, because the energy of physical systems
does not go to infinity at resonance. It simply means that the methods
of analysis we are using are inappropriate under those conditions. The
model still possesses dynamics in those domains, but, before 1954, we had
not yet found a way to study the long-term behavior analytically. It was
Kolmogorov's major achievement to find a way out of both of these
impediments.
3.4. Poincare to Kolmogorov
Poincare applied topology and geometrical methods to the study of
dynamics; he developed the idea of phase space, and turned dynamics into the
study of surfaces. In classical dynamics, as in the entire classical episteme,
the world appeared to be extremely ordered and deterministic. For
classical dynamicists, phase space also seemed very simple and well structured,
because without sufficient analytical methods for treating nonintegrable
dynamics, they were seeing only a small part of a much larger realm, full
of strange new and vastly complex structures. While practically inventing
what we know of as perturbation theory, simply to obtain a set of tools
to tackle the three-body problem, Poincare, at the very end of his three-
volume magnum opus, realized the possibility of infinite complexity in this,
the first unsolved problem. He discovered the existence of what he called
a homoclinous point (dynamicists use the term "homoclinic point" these
days), which is a special kind of intersection of two boundary curves in
phase space. What follows is the often quoted passage from that work in
which Poincare expresses his glimpse of what lies in the future of dynamics:
When we try to represent the figure formed by these two curves
and their intersections in a finite number, each of which
corresponds to a doubly asymptotic solution, these intersections form
a type of trellis, tissue, or grid with infinitely serrated mesh.
Neither of the two curves must ever cut across itself again, but
it must bend back upon itself in a very complex manner in
order to cut across all the meshes in the grid an infinite number
of times.

3.4. Poincare to Kolmogorov 61
The complexity of this figure will be striking, and I shall not
even try to draw it. Nothing is more suitable for providing us
with an idea of the complex nature of the three-body problem,
and all of the problems of dynamics in general, where there
is no uniform integral and where the [asymptotic] series are
divergent. (Poincare, 1899, Vol. Ill, p. 389)
Perhaps the most significant aspect of this expression is that Poincare, with
whom we associate topology, geometry, and phase space, would not even
attempt to sketch such a figure. During the 50 years after Poincare wrote
these words, dynamicists knew that something existed in the phase space of
nonintegrable systems, perhaps something incredible, but it was completely
unfamiliar to them. Poincare had discovered the woods and had glimpsed
the demons, but it would take half a century before someone was ready to
venture in.
In 1954, Andrei Kolmogorov, a Russian mathematician of the first rank,
blazed the trail into Poincare's woods. He saw what must be "the extension
of the familiar into the unfamiliar." By then, the classical world-view had
been shattered by the successes of quantum mechanics, so dynamicists were
now able to look beyond simple, classical structures in phase space.
Classically, we had either an integrable system with nice, simple tori constraining
the conditionally periodic trajectories, or else we had nearly random
trajectories moving about with no discernible (predictable) pattern, in an ergodic
phase space (statistical mechanics). In 1954, the same year that John Pasta
and Stanislaw Ulam printed the FPU results in an unpublished Los Alamos
report, Kolmogorov published a conjecture in the Russian language
journal Soviet Math Doklady. Apparently very few Western dynamicists were
in the habit of reading untranslated Russian mathematics journals at that
time, because news of Kolmogorov's findings was not known to them until
the mid-1960s.
In 1957, Kolmogorov presented his conjecture in an address (also in
Russian) to the International Congress of Mathematicians; it was translated
into French and published as Theorie general des systemes dynamiques de
la mecanique classique (Sem. Janet, no. 6, Fac. Sci., Paris, 1958). The
address was finally published in English as Appendix D in Abraham (1967).
Whereas in the early 1960s the Doklady journal was selectively reprinted
in English translation in Russian Math Surveys (London: London
Mathematical Society), Kolmogorov's original 1954 article predated the starting
point (1958) for the Surveys. It was finally translated into English by Helen
Dahlby and printed as Los Alamos Scientific Laboratory Translation LA-
TR-71-67, and can now most easily be found in reprint in Bai-Lin (1984).
The news of KAM traveled slow, first to mathematicians working on classes
of dynamics more general than those of concern to most physicists, and then
to the physicists themselves when they began to realize its implications for
their work in the mid-1960s.

62 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
Dodging Small Divisors
In approaching the small-divisors problem, Kolmogorov must have reasoned
that if he could avoid the rational combinations of frequencies, then the
problem is reduced to accelerating the convergence of the asymptotic series.
But how could one ignore the rational numbers, which are dense on the real
line? Prom a measure-theoretic point of view, the rational numbers have
measure zero as compared to the real numbers. The rational numbers are
countable and, as such, the set of all of them is much smaller than the set
of all the irrational numbers. Kolmogorov realized that, given an arbitrary
real number, it was likely to be an irrational number, and so most tori (in
an integrable, unperturbed Hamiltonian system) are conditionally periodic,
or nonresonant.
One way to visualize the problem is as follows: The particular set of
frequencies {u)k} for any oscillator system depends on the values of the
initial conditions, which in turn, determine the actions {Ik} (constants
of the unperturbed motion), which are represented as the set of radii for
the hyper-torus (see the Appendix). Theoretically, most of the different
possible tori for any oscillator system are going to be conditionally periodic,
which, when perturbed, do not suffer the problem of small divisors. The
unperturbed surfaces of these tori are covered by the orbit after a long
time, making them seem like solid shells—each one a hollow (hyper-)donut
shape (Figure 3.1). Different values for the actions, stemming from different
initial conditions, result in larger or smaller shells. Thus we can imagine a
whole continuum of donut shells, one surrounding the next. The ones that
correspond to resonant frequencies, which, when perturbed, give rise to the
small divisors, are not solid shells, but single closed orbits winding within
the space defined by the surface of the torus.
In order to avoid the resonant tori and so the problem of small divisors,
we needed to stay away from initial conditions that led to actions and
frequencies that came close to satisfying (3.4). This task presented a difficulty
because traditional perturbation theory required us to fix an initial
condition and then perform canonical transformations to new coordinates. So
although we could have started with initial conditions far away from the
resonance condition, there was no way to assure that the frequencies would
still be nonresonant after any particular canonical transformation.
Kolmogorov's approach suggests that he thought of the tori as
structurally robust topological forms (shells). He recognized that the model
equations supported both of these distinctly different types of tori; just
as the real line supports two exclusive categories of real numbers, so the
perturbed equations should allow for two exclusive sets of topological
structures. Kolmogorov must have assumed that canonical transformations could
not change the structural character of these entities. Intuitively, he saw no
reason to believe that all conditionally periodic tori would simply vanish
with the addition of a small perturbation, because most of them were "far"

3.4. Poincare to Kolmogorov
63
Conditionally Periodic
Periodic: Freq = 9
0-^
Concentric shells
that are mostly
conditionally-periodic
FIGURE 3.1. Conditionally periodic tori are covered by a single
trajectory after a long period of time; periodic tori contain only a single closed
periodic orbit. The periodic or resonant tori form a small subset of the
continuum of possible tori, conceived as a sequence of concentric tori.

64 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
from the dangerous denominators. By "far," I mean that the linear sum
of the frequencies is sufficiently larger than zero for any integer values of
the coefficients as to be unaffected by the small divisors. If, as Fermi et al.
assumed, the presence of the nonlinear perturbation caused the system
dynamics to become ergodic, then the shells necessarily would be destroyed
or otherwise topologically altered, resulting in trajectories free to move
about anywhere within the space of the outermost shell, that is, the result
would be a homogeneous chaotic donut. The KAM theorem demands the
persistence of most of the internal structure of the unperturbed system of
conditionally periodic shells.
The mathematical method was the problem and not the topology.
Because Kolmogorov expected most of the conditionally periodic tori to
persist, he decided to lock onto one that was far from a periodic torus and
track it across each canonical transformation by going back each time to
modify the initial conditions as necessary to end up with a conditionally
periodic torus in the last transformation. This was his first breakthrough.
He could eliminate the periodic tori from consideration by breaking with
tradition and changing the initial conditions with each new transformation.
All that remained was to find a way to assure the convergence of the series.
Accelerated Convergence
Asymptotic series fail to converge because of their relatively slow rate of
convergence—the error decreasing by only one power of e per term.
Kolmogorov discovered a solution to this problem in Newton's method of
tangents. This method of finding the roots of a function (xi is a root of F
whenever F(xi) = 0) uses recursive approximations based on the slope of
the function—the tangent. In this method, accuracy doubles on each
iteration of the approximation—which means that the error must decrease by
quadratic order on each iteration (e2 —» e4 —» e8 —> • • •). Kolmogorov found
that if he could apply a similar method to replace the usual asymptotic
series of canonical transformations, then this quadratic reduction would be
fast enough to assure convergence. He convinced himself that this method
would in fact lead to convergence, and he alluded to it in his paper; but he
did not have to apply it rigorously because he never formally proved his
conjecture.
3.5. The Conjecture
Kolmogorov was convinced that he could prove his conjecture and so he
announced his findings in the 1954 paper, the two-dimensional version of
which is adapted here from the English translation in Abraham (1967):

3.5. The Conjecture 65
duji
=
d2H0
dlidlj
Let us suppose that the Jacobian of the frequencies u)i with
respect to the momenta Ii is nonzero:
? 0. (3.9)
It turns out that in this case, the partitioning of the region
in question of the four-dimensional space into two-dimensional
tori is basically stable with respect to small changes in H of the
form
H(I, 0, e) = H0(I) + eHtf, 0, c). (3.10)
To obtain a precise formulation, let us consider a region G
determined by the condition JgB, where B is a bounded region
in the plane of points /. Assuming that the functions H$ and
H\ are analytic and that the above condition is satisfied, we
can prove that, for arbitrary (3 > 0, there exists a 6 > 0 such
that, for |e| < <5, in the dynamical system
dt dli' dt dOi9 [ j
the entire region G except for a set of measure less than 0
consists of invariant two-dimensional tori on each of which, in
suitable (that is, depending analytically on (/, 6)) circular
coordinates 0i, </>2> the motion is determined by the equations
where &\ and &2 are constant on each torus, that is, they are
conditionally periodic with two periods.
Although Kolmogorov claimed that he could prove his theorem, he left
that rigorous exercise to younger men. Eight years after his initial
statement of the conjecture, Arnold (1963a) and Moser (1962) independently
published proofs of the theorem. Arnold used the super-convergent
perturbation theory suggested by Kolmogorov and required the perturbation to
be analytic (infinitely differentiable), just as Kolmogorov requires in his
statement. Moser's approach to the proof took the form of analytical twist
mappings and he required the perturbation to have derivatives to only the
333rd order!
According to the KAM theorem, most conditionally periodic tori remain
stable after the addition of a small perturbation eH\ to the integrable
Hamiltonian Hq. However, the action-radii will be dependent upon the
angle variables, instead of remaining constant throughout the evolution.

66 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
So instead of toroidal surfaces with fixed radii, these new surfaces have
angle-dependent radii, which means that the surfaces no longer retain the
ideal donut shape. They may have indentation or bumps, or they may have
to undergo much more radical topological transformations.
But by not explicitly mentioning them, the theorem also allows for
regions in which some tori are not preserved. This exceptional region must
contain all the tori affected by the small divisors, which in turn, must be
of a measure less than /?, as long as e < 6. This region contains all of what
remains of the periodic tori, as well as the conditionally periodic tori that
are associated with nearly commensurate frequencies. Furthermore, this
theorem implies that the size of the exceptional region (or regions, since
they need not be connected) is related to the size of the perturbation.
Although nothing more definite is guaranteed by the theorem, it has already
provided a great deal.
The KAM theorem cuts a blaze into the fundamental problem of
dynamics and so extends the boundary of the known world. Instead of the
complete unknown—represented by the (classical) homogeneous ergodic
donut—we find that under certain circumstances, the conditionally
periodic tori persist and therefore provide elements of familiar and possibly
predictable behavior. But as always, to the explorer, the most interesting
part of any realm is the region beyond the blaze.
3.6. Beyond the Blaze
What happens to the tori, under small perturbation? The resonant orbits,
those plagued by the small divisors, are immediately destabilized by the
perturbation. As Melnikov (1963) and Arnold (see Arnold and Avez, 1968)
both discovered, these orbits are replaced by a sequence of alternating
stable and unstable equilibrium points, which give rise to the homoclinic
orbits—that infinitely complex lattice that Poincare would not even
attempt to sketch. There are two quite interesting topological changes that
occur in the space between the preserved tori.
In Figure 3.2(a), we see again the unperturbed, concentric, mostly
conditionally periodic tori, whose radii are all constants of the motion. But then,
in Figure 3.2(b), we see the effect of adding the nonlinear perturbation and
bringing the Hamiltonian into the realm of KAM. Of the three tori depicted
in this figure, the outer and inner tori are the same as in the previous
figure; they represent the preserved conditionally periodic tori. But between
them winds a new mini-torus. Its surface is wrapped about what was a
period-four orbit, so it winds around four times between the other two tori.
The conditionally periodic tori that are "near" to resonant orbits undergo
this amazing change. They retain their bounded, conditionally periodic
character—meaning that trajectories remain fixed to their toroidal surface

3.6. Beyond the Blaze 67
(a) Unperturbed
(b) Perturbed
FIGURE 3.2. (a) The unperturbed, integrable Hamiltonian has a phase
space that is a series of concentric tori. Each torus represents one possible
orbit for the system with a unique set of initial conditions, (b) When
a perturbed Hamiltonian satisfies the conditions of KAM, its toroidal
phase space appears to have undergone a dramatic change: some of the
conditionally periodic tori wrap themselves around the destroyed periodic
orbits, creating mini-tori that are bound between the still-preserved tori.
These new tori embody all the properties of the whole: they contain
further mini-tori within them.

68 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
and their angular coordinates are individually periodic; but the surface is
no longer strictly concentric with the other, still-preserved tori. Instead of
being wrapped around the central circular axis of the entire torus, these
surfaces now are wrapped around what had been the nearest resonant,
periodic orbit, thereby forming a conditionally periodic torus that winds
about between the preserved tori. The number of times it winds about
depends upon the periodicity of the resonance about which it is wrapped.
However, these mini-tori still wind around the central circular axis in the
sense that the periodic orbits about which they are wound, were themselves
wound around that circle. The mini-tori are both conditionally periodic—in
the sense of their surface being filled by the orbit after long periods—and
periodic—in the sense that they wind about the larger torus an integer
number of times.
Fractal Structure
There is a very strong analogy between the larger structure of these KAM
tori and the real number line—one that stems from the the small
divisors problem. The small divisors, and so the periodic orbits, are related to
the rational numbers, whereas the conditionally periodic tori are related
to the irrational numbers. The real number line is a fractal in the sense
that the interpenetration of the rational numbers and irrational numbers
on the whole line reemerges at each scale of the rational numbers. The
integers represent a certain scale of rational numbers, just as the strongest
resonances in the physical system give rise to the most pathological small
divisors. Given any two integers, there is another whole scale of rational
fractions punctuating the irrational space between them, and so on for any
two rational numbers ad infinitum. In the KAM torus, each of the strong
resonances generates a mini-torus in perturbation. So there is a whole
sequence of them punctuating the space of preserved conditionally periodic
tori. But even more interesting is the fact that within the mini-tori, just
as between any two integers, the entire structure of the whole re-emerges
on another size scale. Within all the mini-tori, there are more mini-tori, as
well as preserved conditionally periodic tori at each size scale of the small
divisors.
Heuristic Explanation of FPU
Finally, in answer to the question, "How does KAM explain FPU?" please
refer to Figure 3.3 for what follows: Fermi et al. used the unperturbed
normal modes to track the evolution of the string. For the unperturbed
Hamiltonian in FPU, the torus is the complete set of concentric shells of
a very high dimension. In these figures, I must restrict my representation
to two dimensions. Once the FPU Hamiltonian was perturbed, and if that
perturbation was subject to the KAM theorem, then, if the perturbation

3.6. Beyond the Blaze 69
Plane Cross-sections of Tori
FIGURE 3.3. The KAM torus provides us with a heuristic explanation for
the FPU string: superimposing the before and after perturbation tori and
assuming an arbitrary period-four mini-torus between the number one
and number four normal-mode preserved tori, we see in the cut-away view
at the bottom how an orbit might develop the quasi-periodic behavior
found in FPU by following sequence ® through ®. The orbit passes
through the four lowest modes and returns almost completely to the
original mode configuration.

70 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
parameter were small enough, the resulting phase space (in action-angle
variables) would be a perturbed KAM torus such as the one represented
in Figure 3.2. In Figure 3.3, I have superimposed the cross-sections of the
unperturbed and perturbed tori showing how a mini-torus might exist in
the space between what had been the mode-one and mode-four tori.
Although I have arbitrarily chosen a period-four mini-torus, the periodicity
does not affect the explanation. Let us focus in on one cross-section of the
mini-torus (they are all the same) in the lower right of the figure. As this
plane is a cross-section to a toroidal surface, the system trajectory
periodically passes either outward or inward through this surface, but always in
the same direction. The points of passage are fixed to the circular ring; but
they must march around the ring in a fixed succession. I have represented
such a sequence with the letters ® through © .
Beginning at ®, the system trajectory is in the configuration of mode
one, which is the FPU initial condition. The trajectory then proceeds
around the mini-torus, eventually returning to pass through this surface
at point ©, which corresponds to what had been mode two (before the
perturbation). Then the sequence continues through points © and (3),
arriving at modes three and four, respectively. But here, the surface and the
periodicity demand that the trajectory go to point ®, which is a
turnaround in terms of the normal modes, because point ® is back near mode
three. This behavior seems to correspond nicely with what we saw in the
FPU simulation. But there is even more. Looking at point © , we see that
the return to mode one is close but not exact. Nor can it be exact because
this is a conditionally periodic orbit, so it can never return exactly to the
same point. Thus at point® , we see the quasi-periodic return to the initial
condition.
How about that super-period found by Tuck and Menzel (1972), which
gave so much trouble to both Jackson (1963b) and Zabusky (1962)? This
secondary period can be explained by a close examination of the point © .
Because of the orbit's periodicity about the ring, and just as the point
(D fell short of point© , point® is slightly closer to point® than was point
© . This progression toward the initial condition continues with the points
® , © , etc. If the periodicity were within a certain range, then we would
see a very close return to the initial condition that might be described as
follows:
This trend continued so that by ~ 780 cycles [fundamental
oscillations of mode one], the string was closer than ever before
to its starting configuration, clearly the recurrence has a super-
period. (Tuck and Menzel, 1972, p. 403)
The FPU super-period returned the string to its closest approach to the
initial condition on every sixteen of its fundamental quasi-periods. In this
heuristic explanation, the existence of the super-period requires that after

3.6. Beyond the Blaze 71
sixteen times around the circle, the trajectory must come back very close
to the point ®, and so getting very close to the initial condition before
marching away again.
The KAM explanation of FPU is very compelling; however, there are
some serious technical difficulties in making the heuristic explanation into
a rigorous mathematical one. So serious are these difficulties, that no one
to this author's knowledge has to date succeeded in making it rigorous. As
I mention in my qualifications above, just picking periodicities out of the
air is unacceptable; the KAM torus corresponding to the perturbed FPU
Hamiltonian would have to be derived from the model through canonical
transformations. But perhaps even more difficult is the need to overcome
the fact that the unperturbed FPU model cannot be used as a starting
point because it does not satisfy the nondegeneracy conditions of either
the Poincare (3.5) or KAM (3.9) theorems. It must be shown that the
perturbed FPU Hamiltonian can be derived from some nondegenerate,
integrate model. But so few integrable models exist that finding one that
can be perturbed into the FPU model seems remote. However, as we will
see in the next chapter, Ford developed some further convincing arguments
for this explanation.
Deterministic Chaos
Two further issues that might or should come up when considering the
KAM torus are the questions of what lies in that space between the
preserved tori and in-between the mini-tori, and what did happen to the period
orbits? Of course the answers to both questions are the same: that space is
filled with a single orbit, the orbit that corresponds to the destroyed
unperturbed periodic resonance. According to the theory devised by Poincare,
Birkhoff, Arnold, and Melnikov, alternating with the stable elliptic points—
those that give rise to the mini-tori—there is a set of hyperbolic points
that cause the orbit in this region to act very erratically, we might even
say pathological. Although it is bound between the conditionally periodic
tori, the movement of this single trajectory appears to be chaotic (nonde-
terministic). This is thought to be the beginning of ergodicity, and it is
also the reason perturbation theory failed. The single trajectory that
corresponded to the unperturbed periodic orbit—about which the mini-torus
is now wrapped—is free to move about between those preserved tori, and
it does so in an extremely erratic, unpredictable manner. Like the
conditionally periodic orbit, this trajectory will completely fill its available space
over an infinite amount of time; but unlike the conditionally periodic torus,
its movement is aperiodic.
Perturbation theory fails to predict the movement because of the small
divisors and, as we will see, simulation also fails to track this movement.
Although the trajectory is still deterministic, following as it does from the
formalism of classical mechanics, it cannot be predicted very far into the

72 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
future, nor simulated very far, for reasons that we will discuss in Chapter 5.
This is the behavior that has come to be known as deterministic chaos. It
is not chaotic in the classical sense of no order; but its order cannot be
used to predict its movement. This chaos was what Poincare recognized as
the infinitely complex web of stochastic trajectories.
Before the KAM theorem, it was thought that phase space was filled
with tori, in the integrable case, or else it was a turbulent sea of ergodic
trajectories. Clearly, Fermi et al. expected the perturbed FPU system to
become ergodic, because it was apparently not integrable. The KAM
theorem blazes a trail into the transition between integrability and ergodicity.
One of the more profound philosophical implications of modern dynamics
is that predictability and complete randomness can and often do coexist in
one phase space. Within each mini-torus there are more mini-tori, and
between them there is more deterministic chaos, and within those tori there
are more tori. Mathematically, the KAM torus is a beautiful dynamical
object, made possible by the numerical properties of the real numbers,
and embodying the perfect interpenetration of order and disorder. It is a
transitional object, because its composition depends on the value of the
perturbation parameter; as the parameter is increased, more of the
conditionally periodic tori are near the resonances and so the deterministic
chaos seems to take up more and more of the phase space. According to
classical intuition, we expect this process to lead to the homogeneous sea
of ergodicity at some critical value; but then again, that intuition led to
the original FPU problem.
3.7. The Henon and Heiles Simulation, 1964
In 1969, when Grayson Walker and Joseph Ford introduced the KAM
theorem to the general physics community in their Physical Review article,
they used extensively the results of some work performed five years earlier
in theoretical astrophysics by Michel Henon and Carl Heiles. This latter
work was significant not only because it demonstrated vividly some of the
properties of KAM tori remarkably well, but because it also broke new
ground in the techniques of simulation in dynamics. Looking back at our
road map on page 3, we see that this work constitutes one of three
independent starts in the FPU research program, because Henon and Heiles
conceived and completed their project without ever having knowledge of
either the KAM theorem or the FPU simulation. Yet the model they used
and the behavior they observed had intimate relationships to both.
In 1963, just after KAM became a theorem with the proofs by Arnold
and Moser, astrophysicists Henon and Heiles (1964) were investigating the
possibility of there being a hidden, third integral of the motion for the
case of a star moving in a galactic axisymmetric potential. In their pa-

3.7. The Henon and Heiles Simulation, 1964 73
per "The Applicability of the Third Integral of Motion: Some Numerical
Experiments," Henon and Heiles present a short history of the problem.
There were stellar observations that might be accounted for by
hypothesizing the existence of a third integral of the motion, beyond the two integrals
of conserved energy and angular momentum. The absence of a third
integral "implies that the dispersion of velocities should be the same in the
direction of the galactic center and in the direction perpendicular to the
galactic plane, whereas the observed dispersions have approximately a 2:1
ratio" (p. 73). Observational data of subject stars seemed to indicate that
they might be bounded by one additional degree of freedom, resulting in
asymmetrical dispersions. Just as in FPU, these researchers expected that
only integrals of the motion could bind the trajectories of a model; without
them, trajectories should move randomly throughout the available energy
surface.
r
\
FPU I
^ >
\
/
r i
u
_IA
, LI,
..
hci iui i-nciico |
Simulation 1
1964 I
~T~
Gustavson
1966
»
- r
Walker-Ford
1969
T-,
Ford, Stoddard |
8
iTi
im<
3r
1973
Toda
1967
f
\
f
H6non |
1
974
Henon and Heiles developed a model for the motion of a single star in a
galactic field that included the following terms:
H(x,y,x,y) = ^- + ^±^- +*2j/-^. (3.13)
Because of the first two terms, the Hamiltonian looks just like a harmonic
oscillator with two degrees of freedom, which has been perturbed by the
two nonlinear terms on the right, both of degree three in the positions.

74 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
The Toda Lattice
Much later, in 1972, Lunsford and Ford noticed that this potential could
be derived by truncating the series expansion of the "Toda lattice" model,
which describes a ring of particles constrained to the two-dimensional,
circular surface and separated by nearest-neighbor, exponential forces:
2 24 L Jo
where Px and Py correspond exactly to x and y in the Henon and Heiles
model. Although three particles on a ring lattice may not seem to be
related physically to stars in galaxies, the two models are intimately related
mathematically. Furthermore, particles on a lattice does sound very much
like particles on a string, and so we begin to see that this work indeed could
be related to FPU.
Nearly 10 years after this work with Carl Heiles, Michel Henon (1974)
discovered the existence of a third integral binding the motion in the Toda
lattice. Expressed as a constant function, this integral of the motion is
F = Spx(pl - 3p2y) + (px + V3py)e2y~2^x
+ (Px - y/3py)e2v+2^ - 2pxe~4y. (3.15)
Of course at the time of the Henon and Heiles simulation, no one had yet
derived the integral for the Toda lattice, nor did Henon and Heiles mention
that their Hamiltonian might be derived from or related to it. I mention the
Toda lattice here because, as Lunsford and Ford (1972) and Ford, Stoddard
and Turner (1973) later show, it contributes to the belief that the results
of both FPU and Henon and Heiles are intimately related to the KAM
theorem.
However, there is an important distinction to be made. The FPU
lattice had fixed endpoints, whereas the Toda lattice has implicitly periodic
boundary conditions stemming from its ring structure. As we saw with
Zabusky's soliton explanation of FPU on page 47, the difference between
the type of boundary conditions used in each model makes a difference.
Even though the Henon and Heiles model may be related to FPU by way
of the Toda lattice, the two are not the same, because of the difference in
boundary conditions.
Developing the Simulation
Whereas Fermi et al. expected to see ergodicity in their simulation, Henon
and Heiles were hoping for just exactly the opposite effect. Their anticipated
integral of the motion would have bound the trajectories tightly to the
surface of a torus. But neither simulation obtained the anticipated results.

3.7. The Henon and Heiles Simulation, 1964 75
Henon and Heiles found behavior that seems to be a transitional effect,
something between torus-bound orbits and ergodicity.
For any model, each integral of the motion defines a surface in phase
space to which all orbits must be bound; so each integral of the motion
reduces the freedom of movement in the problem. The two known integrals
of the motion in this problem restrict the trajectory to three dimensions.
Furthermore, this Hamiltonian has a potential well for energies below a
dissociation value (E = 1/6); as long as the constant energy is maintained
below dissociation, trajectories are bound to remain within a known, finite
region of the energy surface. Henon and Heiles chose to use the coordinates
(x, y, y) for their work on this problem. For the purposes of simulation, they
elected to look at two-dimensional cross-sections of this three-dimensional
space. All of the figures show the phase-plane (y, y).
The results of the Henon and Heiles simulation are clearly represented in
Figures 3.4(a), (b), and (c). All three figures show a bounded area
containing all the trajectories that result from maintaining the energy below the
dissociation value. Because these figures are cross-sections (at x — 0), the
trajectories appear as sets of points, where the trajectories pass through
this surface every so often. The two classically extreme cases are as follows:
If the orbits are ergodic, then they are not bound to any additional surface
and so the surface of section should show the bounded region filled with
a gas of points, uniformly distributed throughout. If there is an additional
integral of the motion, binding the orbits to another surface, then the
surface of section should show a series of nearly closed curves, just as the
conditionally periodic tori appear to be circles in cross-section. But Henon
and Heiles are not using action-angle variables, so their phase space is not
a torus and their orbits will not be simple circles. Any tendency toward
closed curves here would seem to indicate the presence of an additional
integral of the motion.
In Figure 3.4(a), the energy is set to E = 1/12, and apparently all the
orbits that we can see are conditionally periodic because they remain on
these closed curves. Although the perturbation is small, Henon and Heiles
thought, just as did Fermi et al. before them, that it should still cause the
trajectory to drift away from the torus, unless the third integral existed.
This result seemed to verify that there must be a third integral of the
motion that was keeping these orbits bound to the closed curves. But then,
in Figure 3.4(b), at E = 1/8, the initial energy is larger and so the nonlinear
terms have a much more pronounced effect. Here we find new orbit patterns
that very much surprised Henon and Heiles. They write:
All the isolated points in the figure correspond to one and the
same trajectory, just as the points on one of the closed curves;
but they behave in a completely different way. It is clearly
impossible to draw any curve through them. They seem to be
distributed at random, in an area left free between the closed

76 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
(a)
(b)
(c)
FIGURE 3.4. Trajectories in the Henon and Heiles surface of section for
energies (a) E = -^ shows very persistent level curves suggesting an inte-
grable system; (b) E = | shows the mix of level curves and deterministic
chaos well into the region of KAM stability; and (c) E = | at dissociation
shows the breakdown of most, but not all level curves.

3.7. The Henon and Heiles Simulation, 1964 77
curves. The five little loops in the right of the diagram belong
to the same trajectory; the successive points [of the trajectory]
jump from one loop to the next. Let us call this feature a chain
of islands. Other such chains have been found in various parts
of the diagram. Each chain is associated with a stable periodic
orbit; the q islands surround the q points which correspond to
that orbit. The following properties are also suggested by our
results: (1) there is an infinite number of islands; (2) the set of
all the islands is dense everywhere; (3) but the islands do not
cover the whole area since they become very small; there exists
a "sea" between the islands, and the ergodic trajectory is dense
everywhere on the sea. (p. 76)
Figure 3.4(c), which shows the situation at the dissociation energy,
demonstrates that the closed curves (the conditionally periodic orbits) have
decayed even further and the so-called ergodic orbits expand further between
the remaining closed curves. From this progression/degradation, Henon and
Heiles concluded that there does exist a third integral of motion, although
there also seems to be a critical energy at which the integral breaks down.
The model that Henon and Heiles simulated probably did not have a
third integral of the motion, strictly speaking. If it did, then the orbits
would not have been free to move between the closed curves. If there were
an integral that did indeed break down after some critical value of the
energy, then once that value of energy was reached, the closed curves would
be absent.
Without recourse to analytical, topological methods, Henon and Heiles
found in simulation the new behavior that seems to correspond to the
KAM theory, because of the mixture of conditionally periodic orbits with
bound ergodic orbits. From only their graphical surface of section, they
interpreted their results as they witnessed the breakdown of the
conditionally periodic tori, as the effect of the perturbation increased with the
energy. They found that over a relatively short period of time the
"integral" seemed to break down very quickly. Their description of what lies
in the surface of section during breakdown is remarkably similar to what
Melnikov (1963) and Arnold (see Arnold and Avez, 1968) speculated for
the KAM torus using strictly analytical methods. Whereas Arnold,
working from the Poincare-Birkhoff theorem on the existence of fixed points in
an annulus (Birkhoff, 1913), deduced that the conditionally periodic tori
surrounding the periodic orbits would change topologically to the mini-tori
winding around within the remaining conditionally periodic tori, Melnikov,
on the other hand, working from Poincare's development of the homoclinic
point (1890, Vol. 3) develop theory concerning the homoclinic points
surrounding the unstable fixed points—such as those between the KAM tori.
The mini-tori appear in cross-section to be the "island chains" found and
described by Henon and Heiles, and the homoclinic points are believed to

78 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
give rise to ergodic motion between the KAM tori. Without reference to
Melnikov, Henon and Heiles reached the conclusion that the orbits between
the preserved tori must be ergodic, by using the criterion of the exponential
separation of orbit pairs.
Integrable Models: The Undiscovered Country
Walker and Ford (1969) claimed that the Henon and Heiles model was in
fact a KAM system. Although the behavior of trajectories in the Henon
and Heiles simulation seems to demonstrate very well the KAM properties,
there are important requirements to be met in order to establish a proof
that the Henon and Heiles model is a KAM system. By extension, the
relationship between the Henon and Heiles model and the FPU model might
tempt us to claim that the KAM theorem also provides the explanation for
the FPU results. But these are bold claims and are very difficult to
justify. In order to apply the KAM theorem to a model, the model must first
meet the strict requirements of that theorem. The two important
requirements for any application of the KAM theorem are that the unperturbed
Hamiltonian be integrable and nondegenerate, according to (3.5).
The nondegeneracy condition immediately eliminates models that are
linear in the actions, which rules out the harmonic oscillator. Both the
Henon and Heiles and the FPU Hamiltonians may be viewed as perturbed
harmonic oscillators, because of the quadratic energy terms that are their
basis. Fermi et al. explicitly chose this model in the first place, and then
added the nonlinear terms. But this historical fact can cause some
significant confusion as to an application of the KAM theorem. Whether or
not a theorem applies to a model does not depend on the intentions of
researchers. Yes, the harmonic oscillator is an integrable linear model and, as
such, does not satisfy the nondegeneracy condition; but the full perturbed
Hamiltonian of Henon and Heiles and FPU may still be KAM models,
because the same Hamiltonians could result from perturbing some other
integrable yet nonlinear Hamiltonian, such as the Toda lattice. This could
be true regardless of the set of approximations that led experimenters to
these models and regardless of what they thought they were modeling.
The KAM theorem guarantees that as we slightly perturb the integrable
system away from its state of integrability, some of the conditionally
periodic tori will be preserved, but distorted. As a result, if we see KAM-like
structures, such as in the Henon and Heiles simulation, then there is likely
to be a nonlinear, integrable system "nearby," meaning that our model may
be close in a mathematical sense to a nonlinear integrable model that we
may not even know about. So by seeing KAM behavior when simulating
a model that is obviously not a KAM system, we obtain an important
indicator that there may be an undiscovered integrable Hamiltonian nearby.
Simulation could act as a guide for analytical discovery.

3.7. The Henon and Heiles Simulation, 1964 79
If someone wants to prove rigorously that the KAM theorem applies
to the Henon and Heiles model, then they must produce the unperturbed
Hamiltonian that satisfies the conditions of the theorem and then add to it
the perturbation terms with the result being the Henon and Heiles
Hamiltonian; similarly for FPU. They could not begin with the harmonic oscillator
as the unperturbed Hamiltonian. In addition, because the KAM theorem
is best described in terms of the slightly-distorted, yet preserved tori, the
unperturbed integrable Hamiltonian that is developed must display level
curves of constant energy that have the same topology as the curves in the
surface of section of the Henon and Heiles (or FPU) system, or it must be
shown to be a canonical transformation away from such a model.
Walker and Ford (1969) cited KAM stability for both the Henon and
Heiles and the FPU models. Yet they did not provide the requisite
conditions for a rigorous application of the KAM theorem for either case. They
did not offer a nonlinear integrable Hamiltonian as the definite source for
these models; instead, referencing the figures of Henon and Heiles and the
follow-up article by Gustavson (1966), they claimed that: "One may use
canonical transformation theory to show that this two-dimensional
surface [in the figure from Henon and Heiles] is topologically equivalent to a
KAM torus" (p. 419). They also claimed that "In view of the complexity of
actually reducing the Henon and Heiles problem to manageable form,
however, it is perhaps worthwhile illustrating by example the effects that even
a simple canonical transformation can introduce" (p. 428). Whereas they
proceeded to perform an analysis based on trying to transform their own
constructed integrable Hamiltonian into a form that would produce level
curves like those of the Henon and Heiles paper, they finally concluded: "In
short, we have made it quite plausible that the Henon and Heiles
Hamiltonian is only a coordinate transformation away from the analysis of sections
II and III" (p. 429). Although their final figure (see Figure 3.5(a)) had
some topological structure similar to the Henon and Heiles figures, their
argument remained a plausibility argument and not a rigorous one.
For the FPU case, Walker and Ford did not attempt even this much
justification; instead, they offered a reference to the article by Izrailev and
Chirikov (1966) relating FPU to KAM by way of the resonance-overlap
criterion (see the discussion in the next chapter). Once again, although
the claim was made for KAM as the probable explanation for FPU, the
conditions for the theorem had not been established rigorously.
Gustavson: First to see the KAM connection
In his paper following the Henon and Heiles article, Gustavson (1966)
discussed the Henon and Heiles results in terms of the KAM theorem, thus
making his the first published statement calling upon such a connection:

80 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
(a)
(b)
INITIAL POINT
X-U6.0)
+ -1.08,0)
o-( 0,0)
• -C-.I2.0)
O-C0..I6)
INTEGRAL VALUE
.464 x Iff*
-.389 xIff'
-.164 x Iff*
-310 > Iff*
-330 * Iff* !
« - ZERO VELOCITY CURVE
FIGURE 3.5. Reproducing the Henon and Heiles level curves: (a) Walker
and Ford; and (b) Gustavson.

3.7. The Henon and Heiles Simulation, 1964 81
Recently, Kolmogorov, Arnold, and Moser obtained very
important results that show the existence of invariant surfaces for
nearly integrable Hamiltonian systems. Since the unperturbed
Hamiltonian is integrable, a continuous family of invariant
surfaces exists. Their results show that, for small perturbations,
the majority of these invariant surfaces are preserved while a
small number dissolve or break up. The surfaces that dissolve
are densely intertwined with the surfaces that persist showing
that no integral exists in the strict sense. We emphasize that
these results are true only for sufficiently small perturbations
and that no concrete results exist in the large. Numerical
evidence, however, indicates that the surfaces do continue for large
perturbation but then suddenly start to disappear completely.
The results of Henon and Heiles are an example of this
phenomenon. (Gustavson, 1966, p. 671)
The author's intent was to construct an approximation to the integral
representative of the invariant surfaces in the Henon and Heiles system. He did
not claim to show that Henon and Heiles was a KAM system, and
therefore he was under no obligation to produce the nonlinear Hamiltonian. But
he does claim here that the unperturbed Hamiltonian is integrable, which
can be read as a reference to the harmonic oscillator Hamiltonian. If so,
then he seems to be unaware of the nondegeneracy condition of the KAM
theorem. He did develop an eighth-degree polynomial expression for the
invariant curves and he showed that they did indeed resemble the Henon
and Heiles topology (see Figure 3.5(b)). From this expression, we might
be able to develop the integrable Hamiltonian that would be required for
proof of KAM for the Henon and Heiles system.
Final Words
Thanks to Henon's, as of then unpublished, (1974) identification of the
exact form for the integral, Ford et al. (1973) were able to definitively
announce the integrability of the Toda lattice model (3.14). Ford et al. (1973)
also pointed out that the Henon and Heiles Hamiltonian (3.13) is exactly
the truncation to third order of the series expansion of the terms in (3.14).
This latter coincidence seems to suggest that Henon and Heiles might
actually have begun from that model, although they do not mention the fact
in their article. Further evidence for this argument is that it was Michel
Henon himself who found the third integral for the Toda lattice. If we begin
with the integrable, nondegenerate Toda lattice for three particles as our
unperturbed Hamiltonian, then we may "add" the negative of all the terms
beyond the third order in the series expansion as our perturbation, thereby
arriving at the Henon and Heiles Hamiltonian as our perturbed model.
Furthermore, Ford et al. (1973) showed that the level curves for the integrable

82 3. The Kolmogorov-Arnold-Moser Theorem: "Here Comes the Surprise"
Toda lattice do indeed match those of the Henon and Heiles model. In this
way, the Henon and Heiles model satisfies the conditions for the KAM
theorem, and so the behavior we see in the Henon and Heiles simulation does
indeed exemplify KAM stability, just as expected by Gustavson, Walker,
and Ford. The simulation strongly suggested the existence of the nearby
integrable model, and eventually, the analytical methods came through.
The three-body problem satisfies the conditions for the Poincare
theorem, and also the KAM theorem. The three-body problem is nonintegrable
in general, but may be near to an integrable system, such as the Toda
lattice. Poincare saw the cause of deterministic chaos (homeoclinic orbits) and
realized that these models could not be integrated. From that perspective,
he believed that this problem could not be solved, in the sense of being
integrated analytically. The Solar System's stability is a huge problem for
dynamics, obviously involving many more bodies than three. As a
Newtonian system with relatively few bodies, the Solar System provides the
prototype for dynamical systems theory. On a long time scale, the Solar
System is slowing down as the Sun radiates its mass and energy away; but
on a shorter, more human time scale, the Solar System may be considered
as an energy-conserving Hamiltonian system, which appears to be stable
in the KAM sense: its randomness constrained to very small variations on
the global predictability of planetary orbits.

Research Threads Come Together:
Harmonic Convergence
When the atoms are falling downward through empty space
of their own weight, at indeterminate times and places
they swerve a very little out of their downward course,
just enough for you to call it a change of direction.
If they did not swerve, then everything would fall
straight down like raindrops through the void, no collisions
would take place, and no impact of atoms upon each other,
and Nature never would have produced anything at all.
Lucretius (Book II, lines 217-224)
I first learned of KAM sometime before the Ford-Waters paper.
Boris [Chirikov] got the details of KAM a good while before I did;
some Moscow mathematician explained Arnold's statement of KAM
to him. Kruskal and Zabusky were initially hostile to
Chirikov-Izrailev's article, but then news of KAM got through to
them. The Ford-Walker paper was an effort to popularize KAM and
to make the Chirikov overlap criterion understandable via
Henon-Heiles level curves.
Joseph Ford (Private communication)
4.1. The Story Continues
Back in Chapter 2, I left off with the pre-KAM work by Waters and Ford
(1966), and Zabusky and Kruskal (1965). Beginning in 1966, we find the
news of KAM spreading across the physics community, influencing research
directions, and forcing conclusions. In the first effort to bring the KAM
theorem to bear on the FPU problem, Izrailev and Chirikov (1966) provided
an explanation of the FPU boundedness (failure to approach equilibrium)
in terms of a simplified, purely topological argument. Shortly thereafter,
motivated in part by the work of Izrailev and Chirikov, Zabusky and Deem

84 4. Research Threads Come Together: Harmonic Convergence
(1967) attempted to obtain equipartition of energy with the FPU Hamil-
tonian by exciting high-frequency modes initially, and then following the
subsequent energy transfer using a continuum description. Then—in what
is considered to be the historical marker for the general awareness of KAM
in the physics community—Walker and Ford (1969) published a sizable
article in Physical Review, demonstrating the properties of KAM using
the Henon and Heiles simulation as an example. In 1970, Ford again
returns, this time with Gary Lunsford, to explore the irreversibility in weakly
nonlinear systems based on arguments derived from Izrailev and Chirikov.
Finally, in 1972 and 1973, Ford, Lunsford, Henon, Zabusky, Stoddard, and
Turner brought together the close relationship between the Henon and
Heiles Hamiltonian, the FPU problem, and the Toda lattice in a discussion
of the ergodicity and integrability of these systems.
4.2. Izrailev and Chirikov, 1966
Fermi-Pasta-Ulam
Simulation 1954
FPU first became associated with KAM in July 1966, when Izrailev and
Chirikov published the article "Statistical Properties of a Nonlinear String,"
in which they attempted to "estimate the limits of stochasticity for a chain
of oscillators." The authors thanked Stanislaw Ulam for sending them the
FPU paper and they cited Kolmogorov's conjecture and Arnold's
subsequent proof of the theorem. Although they did simply assume the
applicability of KAM to the FPU problem without rigorously demonstrating the
sufficient conditions, their goal was not to prove that FPU was KAM, but to
determine the critical point (relative size of the perturbation) beyond which
the KAM curves would break down and the FPU system would become
"stochastic," a term which they applied to mean ergodic. These were bold

4.2. Izrailev and Chirikov, 1966 85
assumptions because the proof of ergodicity is very difficult, and no one had
yet investigated what lay beyond the region of KAM stability. Historically,
the significance of this article derives from the authors's assumption—and
therefore their implicit acceptance of the idea—that KAM stability
defines a boundary region between integrable systems and what appear to be
ergodic statistical mechanical systems. Second, this work initiates the
technique Ford called the "Chirikov overlap criterion," which is a way of using
topology to predict when the KAM curves would reach critical breakdown.
Izrailev and Chirikov began with the equations of motion for the cubic
FPU perturbation
dt2
dz2
1 + 3/3
(4.1)
10
o.ot\-
/>e
1 /
1 « ■ i i i i i i 1—
v^S^
%
' 1 1 1 Mill
Ofil OJ
k/N —
FIGURE 4.1. Estimated resonance overlap for the FPU chain: (I) region
of KAM stability; (II) region of stochasticity; (a) limit of stochasticity
for fc<n; (b) limit for n — k -C n; and (c) qualitative interpolation. The
numerical values of the straight line a, b are given for n = 32.
For simplicity, they considered only the two cases in which all the initial
energy was placed in one mode, fc, which must be either a very low numbered
mode (k <gi n) in the first case, or a very high numbered mode (n — k <C n)
in the second case. In these two extreme cases, the mathematical work
was reduced significantly and the authors were able to derive the following

86 4. Research Threads Come Together: Harmonic Convergence
critical values:
where (dx/dz)m represents the maximum value of the derivative. Before
discussing the basis for these results, let us look at them in terms of the FPU
problem. Using these limiting values of stochasticity, Izrailev and Chirikov
created a plot of the two corresponding curves. Figure 4.1 shows a log-log
plot of the critical value (4.2) versus the relative mode number (k/n) that is
excited initially. The solid lines represent the two calculated limits—beyond
which the FPU system should exhibit ergodicity. The dashed curve is an
attempt to make a rough interpolation between these modal extremes. For
two different sets of FPU initial conditions, Izrailev and Chirikov derived
positions relative to the critical values and plotted those results as the
circles numbered 1 and 2 in the figure. Their description of the results
follow:
It is interesting to note that the first case falls well within the
region of Kolmogorov [KAM] stability, despite the large value
of (3 = 8. The results of the numerical calculation exhibit in
this case a clearly pronounced quasiperiodicity. The second case
lies near the limit of stochasticity; even though the value of
/? = 1/16 is very small, the seventh harmonic is still excited. The
model pattern in this case bears little resemblance to quasiperi-
odic motion, being more nearly like undeveloped stochasticity.
(p. 32)
The authors were very pleased with the correspondence between their
results and those of FPU, that is, with the appearance of quasi-periodicity
for some values of /? and with its disappearance for other values. Although
it is unclear what is meant by "undeveloped stochasticity," it is certainly
implicit that Izrailev and Chirikov assumed that the zone of KAM stability
is parametrically transitional; the nearer to critical the system initializes,
the more stochastic its behavior will be. From this result, the authors
concluded further that if Fermi et al. had excited higher-numbered modes
initially, then they might well have realized the ergodicity and equiparti-
tion of energy that they expected, because these conditions would have
initialized their system beyond the limit of stochasticity.
Let us consider the basis for these estimates and see why the method
of resonance overlap has been called a "crude approximation to the truth"
by Ford and others. Recall from the previous chapter that the condition
of oscillator resonance in the unperturbed Hamiltonian corresponds to a
periodic torus in the action-angle variables. When perturbed, these
resonances lead to the small divisors in the canonical transformations, which

4.2. Izrailev and Chirikov, 1966 87
Rotation
(preserved CP tori)
Oscillation
(periodic mini-tori)
Separatrix
(deterministic chaos)
Pendulum Action-Angle
Phase Space
Approaching
Resonance
Overlap
Simplified KAM Torus
Cross-Section
FIGURE 4.2. The action-angle phase-space portrait (upper) is used to
model the resonance structure of the KAM torus (lower left). When the
critical value is reached, neighboring resonances at different radii
approach the point where their separatrices cross (lower right). As the
resonances overlap, the remaining conditionally periodic tori are destroyed
throughout the entire phase space.

88 4. Research Threads Come Together: Harmonic Convergence
in turn indicate when the periodic tori are destroyed and replaced by
homoclinic orbits, which came to be associated with deterministic chaos. We
also saw that, for sufficiently small perturbations, the nearby conditionally
periodic tori undergo topological change to mini-tori bundles centered on
the periodic orbits. The remaining orbits, with sufficiently incommensurate
frequencies, continue as slightly perturbed conditionally periodic tori, and
so remain as layers separating the various mini-tori. To understand the
workings of the resonance overlap criterion, we must associate this KAM
language with the phase-space portrait of the simple pendulum.
In Figure 4.2,1 have drawn the phase-space portrait for the simple
pendulum in action-angle variables. The angular variations along the horizontal
axis correspond to the pendulum's angle away from the vertical. Clearly
the behavior repeats itself periodically along the horizontal axis. Along the
vertical axis the action corresponds to the momentum of the pendulum.
There are two different types of motion possible, shown by the two orbit
topologies in the figure. The circles/ellipses represent oscillation and the
wavy lines represent rotation of the pendulum. The heavy black line
separating the two types of motion is called the separatrix; it corresponds to the
special initial conditions leading to the pendulum stopping at the top of its
swing. The topological similarity between these pendulum structures and
the resonance structure of the KAM tori indicates a significant
correspondence. Comparing the upper and lower resonance diagrams in Figure 4.2,
we see that the oscillation orbits of the pendulum correspond to the mini-
tori of KAM, and the rotation orbits of the pendulum correspond to the
preserved conditionally periodic KAM tori. The separatrix corresponds to
the homoclinic orbit. With our structural analogy in place, we are in a
position to see the workings of the resonance overlap criterion.
As the energy represented by the perturbation terms in the Hamiltonian
increases—either from having a larger energy to begin with, or from a larger
perturbation parameter—more and more of the conditionally periodic tori
undergo the topological change to mini-tori at all levels of the fractal
structure. In our simple analogy, the size of the resonances increases throughout
the torus. But the separatrix can never cross itself or another separatrix;
so as the resonances get larger, the separatrices push outward until they
would necessarily overlap, which they actually cannot do. At that critical
point, theory predicts the destruction of all conditionally periodic tori.
The critical parameter (4.2) depends upon the maximum value of the
first derivative because the first derivative is related directly to the action
(momentum), the maximum value of which gives the height or the so-called
"half-width" of the resonance at its center. This method may be considered
crude because of the nature of the analogy; the resonances in the KAM
torus are definitely not the same as those of the simple pendulum. But the
results have proven to be quite good and so this method provides a very
good heuristic for analyzing these very complicated systems.

4.3. Zabusky and Deem, 1967 89
4.3. Zabusky and Deem, 1967
According to Izrailev and Chirikov's (1966) results, if Fermi et al. had
excited higher-energy normal modes initially, then they might have seen
the transition to ergodic behavior. This conjecture was put to the test by
Zabusky and Deem (1967) in their article "Dynamics of Nonlinear Lattices
I. Localized Optical Excitations, Acoustic Radiation, and Strong Nonlinear
Behavior." Once again, Zabusky preferred the continuum description of the
string, because, as they said in their article: "we may think of the lattice
as a discretized representation of a continuum and thus view some of the
phenomena as resulting from discretization; that is, the response of the
system to short wavelength excitations." They excited the higher-frequency
modes of the lattice initially by displacing alternate masses onto one of two
initial smooth-curve displacement patterns—even-numbered masses on one
curve and odd-numbered masses on the other (see Figure 4.3(a)). They
allowed the string to evolve as they tracked the modal energy distribution in
both the optical and acoustic modes. They expected to see the wave motion
break up into some random condition in which waves of all types were
propagating—which would indicate a tendency toward stochasticity. But to
their surprise, after initially exciting the optical modes and setting the ratio
of nonlinear-to-linear energy at 0.326, still they observed correlated motions
of the particles, indicating the retention of some nonrandom behavior:
This conclusion is somewhat contrary to our expectations,
although it is analogous to the results of Fermi et al. in the case of
low amplitude, acoustic initial excitations. In Fig. [4.3] we show
the evolution of the modal energies. At ut = 0 the energy
resides in the highest optical modes; after 0.05 of a linear period,
Fig. [4.3(b)] shows the rapid cascade of energy to the low end of
the spectrum, together with a rising importance of many central
modes. Very soon afterwards, we reach a state whose average
features are exemplified by Fig. [4.3(c)] (0.5 linear period). At
this time 0.558 of H has been transferred to the central modes
(20 < k < 180). At later times, one sees only small energy
fluctuations about this nonuniform spectrum, indicating that our
system does not reach a state of equilibrium having equipar-
tition of energy among the various modes, (p. 142) [my figure
numbers]
Whereas the linkage between optical and acoustic modes did result in more
pronounced energy-sharing, the lattice showed no tendency to randomize,
nor did there arise a tendency toward equipartition of energy. But instead
of turning a critical eye to the Chirikov overlap criterion, Zabusky and
Deem simply concluded that:

90 4. Research Threads Come Together: Harmonic Convergence
(a)
A w,z
t = 0
s
*-+-•*
tftgtiTi111111 1111
\
\
X
EVEN MASSES (Wn)
1111 11 11 i i fty-m
// ODD MASSES (Zn)
-►X
(b)
io-
a) cut = 20(0 05 Linear Period)
100
MODE NUMBER
(c)
IO"5
ACOUSTIC
IV--..
x v*~
cut = 200 (0.50 Linear Period)
12-CURVE 47URVE
1 6-CURVE #
■• —;
L- 1 »-
- V
•A
1 1 1
^—Non-Linear
(Upper Bou
3-CURVE
V
Contr
nds on
ibutions
the H™-)
ait = 0 /
OPTICAL^/
100
MODE NUMBER
FIGURE 4.3. Exciting the optical modes: (a) placing alternate masses on
two characteristics; (b) after 0.05 linear periods, the energy rushes down
to the acoustic modes; and (c) after 0.5 linear periods, the energy returns
to the middle modes.

4.4. Walker and Ford, 1969: Physical Review 91
These results are indicative of the trouble that can arise in the
numerical simulation of continuum phenomena. The process of
discretizing a continuum is equivalent to setting up a lattice to
represent a continuum. We observe that if nonlinear processes
cause energy to flow to the high wavenumbers, then because of
the discrete nature of the lattice this energy can be fed back
to the low wavenumbers that we are in fact trying to simulate,
(p. 147)
Instead of probing deeper into this contradiction of Izrailev and Chirikov's
(1966) conclusion, Zabusky and Deem (1967) were more interested in
getting a better understanding of a continuous string. Although in previous
work, Zabusky had shown an interest in discretization problems to the
extent that he and Kruskal were led to their discovery of the soliton, here he
and Deem chose to move away from the problem, perhaps because it was
simply a representation problem, rather than what he considered to be a
physical problem.
4.4. Walker and Ford, 1969: Physical Review
FPU Published
1965
Waters-Ford
1966
1 Hi
1 BJ
Izrailev-Chirikov
1966r
—■*•
\
r
Gustavson
1966
I 111
Zabi
1967
I Ills 1
| | J
isky-Deem
Toda L-
| 1967 T
Walker-Ford
1969
J
In "Amplitude Instability and Ergodic Behavior for Conservative Nonlinear
Oscillator Systems," (Physical Review, 1969), Grayson Walker and Joseph
Ford attempted to "provide an elementary introduction to Kolmogorov-
Arnold-Moser amplitude instability and to provide a verifiable scheme for
predicting the onset of this instability" for the physics community—those

92 4. Research Threads Come Together: Harmonic Convergence
who may not yet be familiar with KAM. Although many mathematical,
plasma, and fluid physicists already knew about KAM and the work of
Henon and Heiles by 1969, Walker and Ford wrote this article for everyone
else in the community. In the Introduction, the authors described the two
divergent expectations for the slightly perturbed Hamiltonian system:
One approach assumes that the weak perturbation changes the
unperturbed motion only to the extent of slightly shifting the
frequencies of the motion and introducing small nonlinear
harmonics. ... The second approach assumes that, even though
weak, [the perturbation] has a profound and pathological effect
on the unperturbed motion, converting it into ergodic motion,
(pp. 416-417)
Walker and Ford linked these two approaches to the work of Poincare in the
former case and Fermi in the latter, highlighting the duality that a non-
integrable perturbation would result in either predictable behavior with
some distortion or else totally randomized movement. Walker and Ford
recognized that the existing expectations formed an extreme dichotomy, so
they introduced the KAM theorem as a possible middle path—a boundary
more broad and complex than the traditional opposition: "A brief paper
by Kolmogorov enunciated a theorem which can perhaps provide a
cornerstone for linking the two aforementioned divergent views on the effects of
the weak perturbation." And slightly later in the article: "In short, KAM
theory indicates, but certainly does not prove, the existence of an
amplitude instability for conservative nonlinear oscillator systems which permits
a transition from motion which is predominantly conditionally periodic to
motion which is predominantly ergodic" (p. 418). Fifteen years after Kol-
mogorov's statement of the theorem, the news of KAM had reached the
general physical audience.
Walker and Ford introduced the action-angle variables and tori cross-
sections in two degrees of freedom, and explained the topological arguments
of KAM: that the unperturbed commensurate tori are destroyed by any
nonzero size of the perturbation; that the conditionally-periodic tori are
only slightly deformed for small values of the perturbation; that although
the destroyed tori regions are everywhere dense, the majority of initial
conditions lie on the distorted conditionally periodic tori; and finally, that,
the relatively small set of initial conditions leading to motion
not on preserved tori is pathologically interspersed between
preserved tori. Moreover Arnold conjectured that the system
phase-space trajectory in regions of the destroyed tori is quite
complicated indeed, perhaps ergodically filling the destroyed
region, (p. 417)
Notice the significant change in Ford's attitude that came with his grasp of
the KAM theorem. In this article, he described the small divisors, related

4.5. Ford and Lunsford, 1970 93
them to the resonance condition, and suggested that "in analogy, one would
anticipate that the motion generated by this Hamiltonian in the
overlapping resonant zones of destroyed tori is highly complicated, perhaps even
ergodic" (p. 418). We saw previously that Ford had all but eliminated any
expectations for ergodicity in this system. Working with Walker, he seems
to be convinced of the possibility.
To provide a set of standard techniques for numerically analyzing
nonlinear Hamiltonians, Walker and Ford introduced the methods of Henon and
Heiles, and they discussed the topological equivalence between the surface
of section and the KAM tori. They described the Henon and Heiles
results in terms of the KAM theory, as was done in Chapter 3. Although
they pointed to the KAM theory as a possible solution to the FPU
problem, they also mentioned Zabusky and Deem's failure to achieve ergodicity,
which did seem to contradict the Izrailev and Chirikov prediction. Because
this review article was designed to introduce KAM on known applications,
Walker and Ford offered no further solution to the FPU problem, but they
did point to the possibility of connections between FPU, Henon and Heiles,
and KAM. This article provided a thorough introduction to the KAM
theory for physicists, with resonances showing up as the island chains of Henon
and Heiles, and further, the secondary and tertiary island chains which
represent resonances within resonances. The amplitude instability they spoke
of in the article refers to the overlapping of resonances, which results in the
destruction of the intervening conditionally periodic tori. These topological
changes in the surface of section accompany the instabilities in terms of the
perturbation theory expansion.
Believing that ergodicity might well lie beyond the KAM instabilities,
Walker and Ford raised several questions in the conclusion indicating the
need for further work, including: What effect would many degrees of
freedom have on the energy at which the transition occurs? Whether the islands
of order disappear altogether or whether statistical mechanics might have
to find a basis in the "course graining" of phase space, in which ergodic
trajectories are dense but do not completely fill the phase space? and Why do
some systems, such as that studied by Zabusky and Deem, fail to achieve
ergodicity as expected?
4.5. Ford and Lunsford, 1970
Very soon after the Walker and Ford article, Joseph Ford, working with
Gary Lunsford and armed with his acceptance of at least the possibility
of ergodicity in KAM systems, began to uncover the conditions for which
we might expect "stochastic behavior, by which we mean that a trajectory
wanders more or less randomly over part or most of the energy surface"
(p. 59). Although KAM theory demonstrates that, in the region of KAM

94 4. Research Threads Come Together: Harmonic Convergence
stability, most initial conditions must lie on the distorted conditionally
periodic tori, and thereby are not going to be stochastic, Ford and Lunsford
realized that this use of "most" is derived from measure theory—a purely
mathematical result. They argued that physical systems are not bound to
comply with the statistics of measure theory. In "Stochastic Behavior of
Resonant Nearly Linear Oscillator Systems in the Limit of Zero Nonlinear
Coupling" (1970), Ford and Lunsford investigated the nature of stochas-
ticity for systems that were close enough to the resonances to be
stochastic, that is, the unperturbed frequencies were nearly commensurate. Once
again, in terms of KAM theory, these are the systems which lie in the
destroyed tori region—in the homoclinic tangle between the preserved KAM
tori. In this study, which is obviously motivated by the KAM-theoretic
speculations of Arnold, we find an investigation of what trajectories will
do, if they are initialized purposefully in the resonant zone. Although this
had been speculated upon by Arnold in terms of the topology of the homo-
clinic tangle, this would be the first numerical investigation on particular
trajectories. Once again motivated to find a sound theoretical basis for
statistical mechanics, Ford looked to the KAM instability: "Our intent is to
demonstrate that widespread stochasticity can occur even for the lowest
temperatures and the weakest nonlinearity" (p. 60). The two issues they
addressed indicate their concern about the need to increase total energy to
large, nonphysical values to achieve apparent stochasticity in the model,
whereas it is known to occur at very low energies in physical systems.
On face-value, KAM seems to require that the size of the nonlinearity be
quite large to achieve the total breakdown assumed necessary for ergodicity.
With this article, Ford and Lunsford demonstrated that neither condition
is necessary for stochasticity and limited ergodicity in a KAM system.
Ford and Lunsford based their argument and exposition on the concept
of resonance overlap, because, although crude, it nonetheless provided a
connection to the existence of destroyed tori, which made room for the
stochastic trajectories. If we begin with a trajectory near enough to the
unstable periodic orbits, then, even for a very small but nonzero
perturbation, we should see stochasticity. Geometrically, the minimum number
of degrees of freedom required to obtain stochasticity is three (with three
we get chaos), so Ford and Lunsford chose to experiment with a system of
three oscillators to demonstrate the independence of stochasticity from the
size of the perturbation and the energy level.
Beginning with the cubic three-particle Hamiltonian, Ford and Lunsford
canonically transformed to the following action-angle (J, </>) form:
H =J + 7 [ aJi J* cos(20! - (j>2)
+ /3(Ji J2J3)* cos(0i + 02 - 03) ]. (4.3)
This form was particularly useful for their exposition because it contained
explicitly the second constant of the motion J (where J = J\ + 2 J2 + 3 J3)

4.5. Ford and Lunsford, 1970 95
in the first term on the right-hand side, leaving only the two terms
generated by the resonances. The first resonance, the so-called "three phonon"
interaction (2o;i = u^), has a relative size proportional to a. The second
resonance (a;i+u;2 = CJ3) has a relative size proportional to /?. The full phase
space is six dimensional; the energy surface has five dimensions; and the
additional constant of the motion restricts the freedom of the system
trajectory to four dimensions. By varying the relative sizes of the resonances,
using the parameters a and /?, but still maintaining the constant energy
and the overall perturbation size (7), they varied effectively the amount of
resonance overlap between the two physical resonances. In a sequence of
cross-sectional views, shown in Figure 4.4, Ford and Lunsford simulated a
number of trajectories—all initialized within the region between the two
resonances.
In all cases shown in Figure 4.4, /3 is held fixed and a is varied to
alter the relationship between the two resonances. The figures are all two-
dimensional cross-sectional views of the (#3, p3) phase plane. Figure 4.4(a)
shows that at a — 0, there can be no resonance overlap because there is
only one resonance. All initial conditions result in level curves
(conditionally periodic tori). As a is increased, we see the sequential disruption of
these level curves and the onset of stochasticity in Figures 4.4(b)-(d).
Ford and Lunsford came away from this study satisfied that they had
demonstrated the likelihood of stochasticity in weakly nonlinear
oscillator systems with initial conditions which were physically most probable.
In particular, they stressed that stochasticity could still continue into the
limit 7 —► 0. They seem to have found a foundation for the expectation of
ergodicity in systems of large numbers of particles, because increasing the
number of particles of a dynamical system increases both the percentage
of the energy surface accessible to the stochastic orbits and the number of
resonant interactions that cause the stochasticity. Without proving
ergodicity, which is an extremely difficult task, they made significant progress
in their effort to replace a weak foundation for statistical mechanics—one
based on the Poincare theorem on nonexistence of constants of motion—
with a stronger one based on the instabilities arising from the resonance
overlap of KAM theory:
The origin of irreversibility for these oscillator systems perhaps
finds its most fundamental description in terms of the
exponential separation of pair-orbits. In a very real sense, this rapid
pair-orbit separation represents that stirring of phase space
which Gibbs envisioned as causing irreversibility. In particular,
the rate at which these pair-orbits diverge gives at least one
measure of entropy production in these systems. In the terms
of information theory, the slight uncertainty in the initial state
would grow with time to almost complete uncertainty of the
final state, (p. 69)

96 4. Research Threads Come Together: Harmonic Convergence
1 Tick = 0.5 lTick=0.5
(c) Ps p, (d)
FIGURE 4.4. The onset of stochasticity is seen in this sequence of ((73, P3)
surface of section plots. In all cases, the size of 0 is held constant while
a is varied to change the relative weight of the two resonances in the
perturbation. The sequence is as follows: (a) a = 0, all level curves, no
apparent stochasticity; (b) a = 0.01, the beginning of stochasticity; (c)
a = 0.025, fewer level curves, increased stochasticity; and (d) a = 0.05,
level curves occupy 50% of surface.

4.6. Lunsford and Ford, 1972 97
The nonexistence of constants of the motion leads us to expect free-ranging
trajectories; but it does not provide a fundamental cause for ergodicity. It
also supports a tendency to make an either-it's-predictable-or-it's-not
assumption. But with KAM, we are seeing the rise of a more fundamental
explanation for stochasticity—one based on the complex influence of
resonance interactions, and measured by the exponential separation of pair-
orbits. The KAM explanation allows for the existence of a natural
transitory boundary zone between two opposite poles of the integrable,
completely predictable and the ergodic, totally random.
There remains a question about why some systems do not exhibit the
total randomness that Ford and Lunsford expected to result from these
resonant interactions. They explicitly addressed this issue in reference to
the Zabusky (1967) work: "The extent to which stochastic orbits are truly
random has not been resolved, and the extent to which these orbits may
yield constant high-order correlations has only been partially investigated"
(p. 70). The question of whether an orbit is truly random was also related,
albeit indirectly, to the islands of order (the periodic mini-tori) that Arnold
conjectured on and that Henon and Heiles found, persisting within the
region of destroyed tori. Additionally, although the authors did not address
it, the question presents itself as to why FPU, with such a large number of
degrees of freedom, did not realize anything close to ergodicity. In light of
that issue, remember that Ford (1961) found that the coincidental choice
of the number of degrees of freedom in powers of two led Fermi et al.
to examine orbits which could lie on only conditionally periodic tori and
not near resonance. Although this article sheds a great deal of light on
stochasticity and resonance overlap, the results may not be applicable to
FPU. However, according to the results of Izrailev and Chirikov (1966), at
least in one case, the conditions of resonance overlap were nearly met by
FPU, and in that case, we saw the onset of "undeveloped stochasticity."
Ford and Lunsford do address these issues in the next section.
4.6. Lunsford and Ford, 1972
With KAM theory and thoughts of ergodicity under their belt, and with
the new work on iterative mappings by John Greene (1968), Lunsford and
Ford returned with another article "On the Stability of Periodic Orbits for
Nonlinear Oscillator Systems in Regions Exhibiting Stochastic Behavior"
in which they investigated the Henon and Heiles Hamiltonian
H(x, y,±,y) = *-±f- + X-^- + x2y - £, (4.4)
for evidence of ergodicity. Using a Poincare surface of section, Henon and
Heiles found what appeared to be islands of order surrounded by a
stochastic sea. In the present article, Lunsford and Ford searched for evidence to

98 4. Research Threads Come Together: Harmonic Convergence
support their belief that the stochastic sea is ergodic and mixing. They
discovered that outside the outermost remaining KAM curve, all the fixed
points at the center of what used to be islands of order, had become
unstable. They related this fact to the fairly recent proof by Sinai (1963), that a
hard sphere gas is ergodic and mixing, which implies that an all-repulsive-
force Hamiltonian is also mixing. If all the fixed points in the stochastic sea
are unstable, then perhaps the system may be ergodic—a conclusion they
found to be a potential new foundation for statistical mechanics.
Lunsford and Ford demonstrated successfully that for this Hamiltonian,
there does indeed exist a fractal structure of alternating fixed points, which
is the same structure underlying the mini-tori and homoclinic tangle in
the KAM torus. In Chapter 3, we saw that as the energy in the Henon
and Heiles Hamiltonian is increased toward the dissociation value (E =
1/6), these islands shrink and break up, apparently in accord with the
predictions of the resonance overlap criterion. Although they could not
prove it rigorously, Lunsford and Ford concluded that:
An essential property of C-systems [ergodic and mixing] is that
points on initially close orbits separate exponentially with time.
Therefore, all periodic orbits for C-systems are unstable. As a
consequence, the elliptic fixed points of [the Henon and Heiles
surface of section] graphically demonstrate that Hamiltonian
(4.4) is not a C-system. Nonetheless, [Henon and Heiles] reveals
macroscopic unstable regions which are suggestive of emerging
C-system behavior. In particular, Henon and Heiles have shown
empirically that points on initially close orbits in such regions do
exponentiate apart. Moreover, they show that the macroscopic
unstable region includes most of phase space for sufficiently
large energy, (p. 703)
Notice the convenient return to the mathematical use of the word "most"
here, as opposed to the pragmatic usage in the last article. They concluded
that at sufficiently large values of the energy, the primary stable fixed points
break up such that "most" of the phase space is covered with the seemingly
ergodic stochastic sea. But they did indeed qualify their conclusion because
they could not rule out the possibility that, because this is a fractal
structure, "maverick" elliptic fixed points might still remain on some secondary
or smaller scale. They were forced to argue for a kind of coarsely grained
ergodicity:
Our results, like those of Greene, are empirical and do not
constitute a mathematical proof; moreover they do not preclude
the existence of stable periodic orbits in the stochastic regions.
However, they do make it clear that such periodic orbits, if
they exist, are likely to affect only regions of very small
measure in phase space. As a consequence in the macroscopic view

4.7. The Toda Lattice Is Integrable 99
of a physicist, the physical properties of stochastic regions like
those of (4.4) are likely to be indistinguishable from those of a
C-system.
Thus in the stochastic sea between and outside KAM curves, which fills
the entire energy surface, above the dissociation energy or beyond total
resonance overlap, we might expect a coarsely grained brand of ergodicity
in which pair orbits experience exponential separation due to the complex
interactions in the homoclinic tangle.
As a final note in their article, Lunsford and Ford clarified the connections
between Henon and Heiles, FPU, and the Toda lattice, and so situated their
work into the framework of this research program:
In conclusion, let us emphasize that the Henon-Heiles system
does provide insight into the generic behavior of systems with
attractive forces. First, it is worth noting that any three-particle
system with a Hamiltonian of the form
H = 2^ + P| + P*] + V{Ql " °3)
+ V(Q2-Qi) + V(Q3-Q2)
can be reduced to Hamiltonian [4.4] in the cubic approximation,
provided that V(r) has a nonzero cubic term in its power series
expansion. Thus the three-particle Fermi-Pasta-Ulam system
and the three-particle Toda lattice are intimately related to
Hamiltonian [4.4]. (p. 704)
Herein lies the key to the convergence of this research program. For three-
particle discrete lattices with attracting forces whose series expansions have
the same cubic approximations, we can expect KAM behavior like that
found by Henon and Heiles. Although the FPU lattice involves many more
particles than three, the cubic FPU Hamiltonian satisfies these same
conditions, and so there is good reason to expect similar KAM behavior in
FPU. All of this work is tied together by connection to the Toda lattice,
and so the resolution of the Toda lattice conundrum brings this research
program to a close.
4.7. The Toda Lattice Is Integrable
While pursuing the FPU problem from the continuum perspective, Zabusky
and Kruskal (1965) were led to discover solitons in the integrable Korteweg-
deVries (KdV) equation. In (1971), Zabusky showed that in the continuous
limit, the discrete Toda lattice could also be described by KdV. Clearly,

100 4. Research Threads Come Together: Harmonic Convergence
in the continuous limit, FPU is closely related to the Toda lattice; but,
because of the continuous limit approximation required to obtain KdV,
nothing more conclusive can be claimed for FPU in terms of KdV.
1
Zabusky
1971,
1973
Lunsford-Ford
1972
~T~
Ford, Stoddard!
& Turner 1973 I
H6non |
1974
mmmmmmmi
In the realm of discrete lattice dynamics, Lunsford and Ford (1972)
showed that for just three particles, the cubic approximation of the Toda
lattice Hamiltonian
ZT - Px+Py J_ I" 2y+2v/3x , 2y-2y/3x , -4yl _ 1
2 24 r +C +e J 8'
(4.5)
is exactly the same as both the Henon and Heiles Hamiltonian and the
three-particle case of the FPU Hamiltonian. But would the same be true
for more than three particles in FPU? Because Henon and Heiles used only
a three-particle Hamiltonian that is clearly the truncation of the
exponential Toda lattice, there can be no doubt of the direct connection between
them. But the same cannot be said for FPU. Furthermore, working from the
topological method devised by Gustavson (1966), Walker and Ford (1969)
demonstrated that the Toda lattice level curves had the same topology as
those of Henon and Heiles. If the Toda lattice could be proven integrable,
then there would be no doubt that the Henon and Heiles Hamiltonian is a
nonintegrable perturbation of a nonlinear, integrable Hamiltonian. The
behavior seen in simulation by Henon and Heiles would be a certified example
of KAM stability.
In 1973, Ford et al. used numerical methods to provide convincing
evidence for the integrability of the Toda lattice with both three and six
particles. Focusing on the exponential separation of pair orbits, they
argued as follows:
Let us [numerically] integrate two trajectories for Hamiltonian
[4.5] which are initially very close together in [(#, px, y, py)]
space. For integrable systems, the distance between these
trajectories grows approximately linearly with time; while for
nonintegrable systems, this distance grows exponentially, (p. 1553)
Their results indicated that orbit-pairs in both cases separated
approximately linearly with time, and so they argued for the integrability of the

4.7. The Toda Lattice Is Integrable 101
Toda lattice, knowing full well that this (inductive) evidence could not
provide a proof in the mathematical (deductive) sense.
Finally, in an interesting referential feedback loop, Ford et al. appended
the following late-breaking information to their article:
Here the main body of our original manuscript ended; however
we utilize the opportunity given us by a reviewer's request for
minor changes in the original m[a]nuscript to add the following
exciting footnote. Motivated by reading a reprint of this paper,
Professor M. Henon sought and found analytic expressions for n
independent constants of the motion for the n-particle Toda
lattice having periodic boundary conditions. Moreover since these
n constants of the motion are in involution, Liouville's Theorem
insures that the Toda lattice with periodic boundary conditions
is a completely integrable system. Using Henon's results, it may
be shown that the Toda lattice with fixed ends is also a
completely integrable system. Henon will soon publish the details
of [his] discovery in another place. Here we mention only that
Hamiltonian [4.5] has the exact, independent constant of the
motion
[*(x, pxi y,py) = 8px(pl - Zp2y)
+ (px + V3py)e2y-2V*x (4.6)
+ (px ~ V3py)e2y+2V~3x - 2pxe-4y).
Henon (1974) did publish his results and so certified by proof that the Toda
lattice is one of those rare jewels in the catalogue of integrable dynamical
systems. Announced at the literal heel of the Ford et al. article, this result
verified that Henon and Heiles did see KAM stability in simulation.
Because of these close connections to the Henon and Heiles Hamiltonian
and its subsequently verified KAM behavior, and because of the compelling
KAM explanation of FPU, the consensus in the field is that FPU is
probably KAM stable. No other explanation fits the data as well, and whereas
the KdV soliton solution is compelling, it is simply not applicable to a
discrete lattice system. However, there must be a very important
connection underlying the similarities between KdV and KAM, a connection that
might lead us to conclude that KdV is the manifestation of KAM in the
continuous limit.
For nearly 20 years, the work stemming from the Fermi-Pasta-Ulam
simulation of 1954 influenced and drove on this investigation, which I have
called the FPU research program. Of course the FPU problem was just a
continuation of the ongoing subject of dynamics. But in 1954, with one of
the first computer simulations in dynamics and with Kolmogorov's
powerful conjecture, the age of dynamical systems theory began. It is a new

102 4. Research Threads Come Together: Harmonic Convergence
science stemming from the happy marriage of computer simulation and
mathematical modeling, that has been blazing the trail into new regimes of
dynamical behavior in what are still classical dynamical systems. In a very
strong sense, dynamical systems theory is the logical continuation of
classical mechanics; it does not begin from any new, or relativistic, or quantum
mechanical theories of matter, but just good old Newtonian mechanics.

Part II: Philosophy
But we do not admit that the eyes are deceived at all:
for their task is to see where light and shadow are,
but whether it is the same light, in fact, or not,
and whether the shadow that passes by is the same one,
or whether what I said before is indeed the case,
these things can only be decided by the mind,
for the eyes cannot discern the true nature of things.
And so you must not blame the eyes for the fault of the mind.
Lucretius (Book IV, lines 379-386)
Now with the advent of large computers, sophisticated graphical
algorithms and interactive terminals, we can undertake large-scale
numerical simulations of these systems and probe those regions of
parameter space that are not easily accessible to the theorist/analyst
or experimentalist. ... This is a new approach to understanding the
physical environment around us through formulating, exercising,
comparing results and then improving mathematical models that we
believe describe the real world. ... [This approach ] in nonlinear
lattice dynamics began with the FPU attempt to determine a
relaxation time.
Norman Zabusky (1973)

5
Steps to an Epistemology
of Simulation
A computer simulation for these conditions gave recurrence and we
were on our way to a better analytic understanding.
Norman Zabusky (1967, p. 231)
In cybernetics, mapping appears as a technique of explanation
whenever a conceptual "model" is invoked or, more concretely,
when a computer is used to simulate a complex communicational
process. ... Outside of cybernetics, we look for explanation, but not
for anything which would simulate logical proof. This simulation of
proof is something new. We can say, however, with hindsight
wisdom, that explanation by simulation of logical or mathematical
proof was expectable. After all, the subject matter of cybernetics is
not events and objects but the information "carried" by events and
objects.
Gregory Bateson (1972, p. 401)
5.1. Introduction
In dynamics, we use models to represent the workings of a physical
system. We arrive at the model by making approximations to fundamental
force laws based on assumptions about the physics. The model is expressed
mathematically as either a set of differential equations, or else as a Hamilto-
nian function, from which we obtain differential equations using Hamilton's
formalism. Every system of well-behaved differential equations always has
a definite solution. Although some scientists often call this guaranteed
solution the truth, I will, for clarity, refer to it as the true solution. As we have
seen, the differential equations that characterize dynamical systems theory
are not usually solvable analytically, making the true solution an elusive
object of desire. In the absence of direct access to the true solution, we
resort to numerical methods to create approximations to the true solution.

106 5. Steps to an Epistemology of Simulation
Using numerical methods on a computer to study the solution to noninte-
grable differential equations is one major use of simulation in physics. This
form of simulation goes well beyond dynamics, as nonintegrable
differential equations appear in a broad spectrum of application. This is the use
of simulation that I assume in this chapter.
The goal of simulation in dynamics is to learn about the true solution
by simulating the trajectories of the associated differential equations. At
best, simulation can tell us only about the true solution for the model,
and it is the shortcomings of reaching that goal that are at issue here.
Whether or not the true solution of a model actually relates to physical
reality depends upon the fundamental laws used and the approximations
made to obtain the model. As the simulation teaches us about the true
solution, we make decisions about the adequacy of the model. Simulation's
function is to reveal the properties of the true solution and to aid our
decisions about how well the model suits the physical system. As doubts
arise about the model's adequacy, but we are reasonably confident of the
simulation, we must turn our attention to the approximations made in the
model. The goal of this chapter is to make explicit what issues or aspects
a simulation describes. Several times I will point out an instance of when
a researcher indicates an implicit belief that the simulation does tell us
something about physical reality. Such an assumption implies a trust in a
long chain of inferences and approximations beginning from the adequacy
of the fundamental theory itself, to the equivalency of an infinite series
expansion of terms, then to the approximations that are made to obtain a
model, and finally to the simulation and what it might be telling us about
the model.
First I discuss the historical significance of the FPU simulation as it
marks the beginning of simulation's role as an experiment on the domain
of mathematical models. I will clarify some of the significant similarities
and differences between simulation and experiment. Then I take up the
issue of how we come to believe that a simulation tells us anything about the
true solution. The major source of divergence from the true solution in
simulation is the necessary discretization of time in the numerical procedures.
Although this approximation has emerged as the significant consideration
when working with simulation, we must also consider Peter Galison's view
that experimenters themselves often may be biased in interpreting the
results of their experiment. Once I discuss these issues, I move to the more
pragmatic discussion of delineating the strategies we may use to increase
our belief in the simulation's approximation of the true solution. Working
from the epistemology of experiment suggested by Allan Franklin, I extend
five strategies of reasonable belief in experiment to the realm of simulation.
After first introducing these strategies in their general form, I move back
into a close reading of the articles of the FPU case study to show how they
are put to use. Finally, I take up the issue of what a simulation can tell us

5.2. Hierarchy of Modeling 107
specifically about the difference between belief and proof, by studying the
failure to prove the existence of integrals of motion using simulation.
5.2. Hierarchy of Modeling
There are many different meanings of the term "model" in common usage.
Dynamics uses models in three distinct layers of the investigative process.
First, dynamicists conceive of a mechanical model to represent the
components of the physical system; then they construct a mathematical model to
represent the functional relationships between the variable components of
the mechanical model; finally, they generalize the mathematical model to a
parametric model class within which there is a continuum of specific
mathematical models. As I will discuss in the next section, approximations must
be made to obtain the model at both the mechanical and the mathematical
levels.
Fermi et al. (1955) modeled a one-dimensional, loaded string, with fixed
ends, that was free to move only in the longitudinal direction. The springs
between each mass point along the string would have to be described
using some nondissipating, nonlinear force. Once the conceptual picture of a
mechanical analogue is obtained, the much more difficult task remains to
describe the dynamics of the system mathematically. The Henon and Heiles
mechanical model was a single star moving in a nonlinear galactic potential
field. However, once they constructed the mathematical model for this field,
it was identical in form to the FPU model, when the latter is truncated
to three moving mass points. It is interesting to note that many different
mechanical models can result in the very same mathematical model. By
the same token, the results of an analysis of one mathematical model can
be generalized to many different physical systems.
The Hamiltonian function is the foremost example of the statement of
a mathematical model in dynamics. For conservative dynamical models,
such as all the models in the case study, the Hamiltonian is the
mathematical representation of the energy-like quantity that axiomatically remains
constant throughout the evolution of the system. Thus conservation of the
Hamiltonian is one major approximation inherent in all the mathematical
models in this work; it is the approximation that separates the two major
branches of dynamics: conservative and dissipative dynamics. This
conservation is an axiom of the formalism; once assumed, it must be maintained
throughout the evolution.
The unperturbed Hamiltonian for the FPU model is simply the sum of
potential and kinetic energy terms for a chain of coupled harmonic
oscillators. The perturbed part of the FPU Hamiltonian consisted of one of
the lowest nonlinear terms in the series expansion of Newtonian attraction
between mass points. Interestingly, the Henon and Heiles mathematical

108 5. Steps to an Epistemology of Simulation
model is also the same as the truncation to the cubic nonlinear term in
the series expansion of the Toda lattice model, whose mechanical analogue
is three moving mass points confined to a circular ring and separated by
exponential forces. So again, the approximations made to models reduces
them to terms that are common to many different physical situations.
The mathematical models employed in dynamical systems theory
usually contain some variable control parameter, whose value must be specified
before any consideration of the evolution of the trajectory. Because each
value of the control parameter constitutes a different physical situation and
we will want to discuss all the models associated with the likely range of
values of the control parameter, I adopt van Frassen's (1980) formal
differentiation between a model and a model class. Each value of the control
parameter represents a specific model within the model class. It is actually
the entire model class that becomes the entity under scrutiny in
dynamics. With this generalization, questions of structural stability emerge that
concern the topological relationships between the trajectories of various
models within the model class. As dynamicists study the behavior of the
trajectories generated from one model, phase space becomes a map to
regions of predictability, whose coordinates are the ranges of values of the
independent variables—position and momentum. When we consider the
mathematical model as but a cross-section of the larger model class, we
realize the possibility of watching the topological evolution of phase space,
where instead of time, evolution is modeled by the parameter of the model
class. Dramatic changes in the topology of orbit surfaces in phase space
during the evolution through the model class indicates a significant
qualitative change in the model and a possible identification of a short-coming
in some approximation, or else some dramatic change in the behavior of
the physical system itself.
Whereas we usually define the model class using a variable parameter
on the size of the perturbation, the Henon and Heiles Hamiltonian is an
exception. Henon and Heiles derived their Hamiltonian all at once, in one
piece, instead of an unperturbed piece and a perturbation. Thus the
relative size of their perturbation remained the same (unity); but this does not
imply that the relative distribution of the total energy remained the same
between the linear piece and the nonlinear piece. The initial amount of the
total conserved energy functioned for Henon and Heiles as their control
parameter. So whenever I speak of the Henon and Heiles work, I specify the
energy level in relation to the breakdown of the KAM curves, which is not
typical of these models. As they increased the energy toward the
dissociation value, more and more of it became associated with the perturbation,
corresponding to the increasing area of stochastic orbits. As Henon and
Heiles increased the energy from very small toward the dissociation value,
the lines defining the ellipses in the surface of section began to spread out,
eventually breaking down altogether. For the more general KAM model,
this region is associated with the size of the perturbation parameter, which

5.2. Hierarchy of Modeling 109
changes the significance of the perturbation for a given value of the
energy. In the Henon and Heiles work, the perturbation was set to unity and
the significance of the effect of the perturbation increased with increasing
energy.
Beyond the Henon and Heiles paper, we find the size of the perturbation
parameter to be one of the most significant determinants for understanding
the behavior of the model class. Whereas Zabusky (1962) was dissatisfied
with Ford's (1961) analysis of the FPU problem because it "assumed that
the amplitude of the modal oscillations did not change" (p. 110), Jackson
(1963b) pinpointed the problem with Ford's work as being "not entirely
relevant to the calculations of FPU" (p. 687) because the FPU case did
not fall into what Jackson called "weak coupling," which is the case that
Ford's analysis is applicable to. Jackson eliminated the FPU work from that
classification because of the large number of particles and because of the
size of the perturbation parameter. Thus the perturbation parameter size
dramatically affects the approach of the perturbation theory analysis, such
that Jackson could reject Ford's work because he missed this distinction.
A second place in the case study that we find the perturbation size
playing a key role is in the Izrailev and Chirikov (1966) paper. These authors
used both the perturbation size and the number of particles as key factors in
determining the critical breakdown of the KAM curves using the resonance
overlap criteria. They demonstrated that KAM stability was applicable to
two different cases of the FPU work, citing that the perturbation size was
below the critical value for KAM breakdown in both cases.
Finally, Ford and Lunsford (1970) tried to shift the emphasis away from
the size of the perturbation parameter in order to "demonstrate that
widespread stochasticity can occur even for the lowest temperatures and the
weakest nonlinearity" (p. 60). They did so by arguing that most physical
oscillator systems satisfy the resonance condition:
N
]}T rifccjfc == 0. (5.1)
By claiming that "low-order resonance linking all degrees of freedom is
assumed to be ubiquitous in physical nearly linear oscillator systems" (p. 60),
Ford and Lunsford try to move the study of dynamics back into the realm
of physics, where arguments based on number theory tend to be
inappropriate for realized physical systems. If a physical system is nearly resonant,
then we know its trajectory will fall within a zone of resonance overlap,
no matter how small we make the perturbation parameter, as long as it is
greater than zero. It is Ford and Lunsford's (1970) contention that most
physical systems actually do obtain this property. And it is through
arguments like this one that we can recognize Ford's desire to pursue KAM as a
basis for statistical mechanics and irreversibility in macroscopic
deterministic systems.

110 5. Steps to an Epistemology of Simulation
5.3. Historical Significance
The results of the FPU simulation seemed to cast doubt upon the
hypothesis that when a Hamiltonian model does not possess a complete set of
integrals of the motion, it must evolve toward thermal equilibrium—meaning
that without a set of integrals binding the trajectories to predictable
orbits, those trajectories should evolve into ergodic, totally random motion.
In what I will be calling the "ergodic hypothesis" (not to be confused with
the assumption about the different sizes of infinity in mathematics), dy-
namicists believed that without integrals of the motion, dynamical systems
must necessarily be ergodic (totally random). Whereas an experiment may
falsify a hypothesis about a physical system, can a simulation lay claim to
the same capability? This question brings into focus the subtle difference
between simulation and experiment that we must remind ourselves of
constantly when working in this area. Whereas an experiment tests predictions
about a physical system fairly directly, a simulation can test predictions
only about the behavior of a mathematical model. However, if the model
is derived from a set of approximations to a well-corroborated theory, then
the simulation tests both the theory and the approximations in
conjunction. When the results in simulation go against the underlying hypothesis
but we have reason to believe those results, then we cannot be certain
immediately what has been falsified, the theory or the approximations.
The FPU simulation was expected to verify a hypothesis about an entire
class of mathematical models that result from an approximation to
Newtonian theory, namely, the truncation of the full series expansion of the force
of attraction between bodies. Because the hypothesis concerned the
behavior of a class of mathematical models, which is related to physical theory
only through an approximation, we may say that the results of the FPU
simulation falsified that hypothesis about the expected properties of the
model, as far as they could be examined in simulation. The implications of
this result are twofold: For dynamics, instead of a simple, mutually
exclusive and jointly exhaustive binary division of Hamiltonian models—either
they are completely predictable or they are totally unpredictable—there
exists a more complex set of possibilities which emerges from the negation
of both of these properties. The second implication concerns the history
and philosophy of science. By casting doubt on a strongly held hypothesis
about the expected behavior of this class of Hamiltonian models, the FPU
results established firmly a significant and independent place for
simulation in the scientific method. Simulation may be used to falsify theoretical
hypotheses about the properties of mathematical models in the absence of
analytical methods. I will discuss the second of these issues first, so that I
may introduce the historical framework for this chapter.
The Poincare theorem (1890, Vol. 1, Chap. 5) on the nonexistence of
analytic integrals of the motion for a certain class of nonlinear mathematical
models had direct implications for the FPU Hamiltonian. Fermi et al. be-

5.4. Experiment 111
lieved that the nonlinear perturbed model they were simulating belonged
to this class, and so it should not have been integrable. This assumption
led them to believe that with no integrals of the motion, their model should
evolve toward thermal equilibrium.
In 1923, Fermi tried to prove a theorem that went one extra step
beyond the Poincare theorem. He wanted to show that because this class of
models could not have a complete set of integrals of the motion, which
would prevent the trajectories from accessing the full energy surface, they
must necessarily be ergodic. Although no other alternatives for trajectory-
binding structures were known at the time, the proof of Fermi's theorem
was not found to be convincing in and of itself (see Haar, 1954, Segre, 1965).
Thirty years later, Fermi saw an opportunity to at least verify his
hypothesis in simulation, in lieu of a proof for his theorem. Clearly the results of
the FPU simulation not only failed to verify that hypothesis, they falsified
it by providing a counterexample. However, there is some question about
whether the Poincare theorem even applied to the model used by Fermi et
al. If it did not, then Fermi et al.'s expectations for their model were even
less well founded, because without a rigorous application of the Poincare
theorem, no guarantee existed for the nonintegrability of their model. The
requirements for the Poincare theorem are the same as those for the KAM
theorem, as we saw in Chapter 3. The problem with applying either of these
theorems to FPU is that the unperturbed Hamiltonian, as written by Fermi
et al., is degenerate. But if the same perturbed Hamiltonian could result
from the truncation of some other integrable, nondegenerate unperturbed
Hamiltonian, such as the Toda lattice, then not only would the Poincare
theorem apply, but so would the KAM theorem.
In the history and philosophy of science, the general problem of
experimental evidence that contradicts the hypothesis underlying the experiment
itself is not new. However, the FPU problem is the first such case where the
evidence came from the results of a simulation instead of an experiment.
5.4. Experiment
In the volume of Fermi's collected works (Segre, 1965), Stanislaw Ulam
wrote an introduction to the FPU article. Appropriately, from this piece
we obtain an insight into Fermi et al.'s attitudes concerning the role of
simulation:
We decided to try a selection of problems for heuristic work
where in absence of closed analytic solutions experimental work
on a computing machine would perhaps contribute to the
understanding of properties of solutions, (p. 977)
Let us consider what is meant by the term "experimental" in this context.
In what way is a simulation of a nonlinear dynamical system an experiment?

112 5. Steps to an Epistemology of Simulation
Simulation provides us with observations of the implications of the model
that are otherwise hidden. With an analytic solution, we can derive global
properties for all of the trajectories mathematically and calculate the exact
state of the system at any point in time, given an appropriate initial
condition. But the number of dynamical systems that are integrable is small;
most "real world" dynamical systems are not integrable. In simulation, we
can learn about only those trajectories that we simulate and only for the
evolution times that we choose to use and only to the numerical resolution
of our computers. So the task of obtaining general properties of a model
through simulation is similar to understanding a physical system by
running experiments on different aspects of it. Because the true solution might
not be obtained using analytical methods, the behavior of trajectories from
this model may be unknown in advance of the simulation. Drawing from
existing theory, dynamicists develop certain expectations; but until we see
the behavior of the trajectories, we cannot know their actual properties.
Theory also informs our expectations in an experiment and the outcome
often brings surprises there. The true solution for this type of dynamics is
occluded from us by our own inability to solve the equations analytically.
In this way, simulation is somewhat analogous to experiment; it provides
us with glimpses of the true solution. We must be circumspect about not
only what we can see, but also what we are prepared to see. However, the
similarities between experiment and simulation fall short of an equivalence
relationship because they are used to investigate the distinctly different
realms of physics and mathematics.
If we can, for the moment, include the mathematical realm into the realm
of nature, then simulation can be thought of as a subset of experiment.
Simulation is an experiment into an unknown area of mathematical structure
using a physical apparatus. The experimental apparatus in simulation
includes both the computational machine and the graphical display device.
An important difference between simulation and experiment is that very
little of the theory of the apparatus comes into play when it comes to
studying a model in simulation, but this can be a major issue in an experiment.
However, the theory of the apparatus in simulation is not negligible; as
we will see, there are important considerations to be made concerning the
ability of the simulation to tell us about the true solution of a model. The
single most important constraint on simulation is the discretization of the
evolution of the trajectory. Because simulation is a special case of
experiment, in the sense just described, we should expect to derive the beginnings
of a philosophy of simulation from the existent philosophy of experiment,
just as some of the techniques of experimental investigation form the basis
of the methodology of simulation.
Fermi et al. expected the trajectories of their nonlinearly perturbed string
to begin from a certain region of phase space—determined by the initial
conditions—and then slowly evolve further and further away from that
region. The state of theory at that time caused Fermi et al. to have ex-

5.5. Epistemology 113
pectations that were not realized. When the existent theory provides no
conclusive way to make definite predictions about what lies in the phase
space of a model and the model is not integrable, then efforts must be made
to verify the simulated results and then to develop new theory to account
for those results.
In this sense, the simulation and modeling of nonintegrable dynamical
systems is experimental and heuristic. I interpret and will adopt Ulam's
use of the word "heuristic" herein to mean exploratory—as in the absence
of a predictive theory. The application of this term to simulation implies
a similarity to one of the more interesting functions of experiment—to
precede theory in the exploration of unknown territory. Given that most
dynamical systems are not integrable, then there is quite a lot of territory
to be explored, and possibly quite a lot of new physics to be discovered.
5.5. Epistemology
How can we know that what a simulation describes is actually the true
solution? This is the central question for an epistemology of simulation.
We do not know the true solution a priori, although we know that it does
exist; so we must establish strategies of testing the simulation to increase
our belief in what it tells us about the true solution. Existing theory can
aid or hinder us in this pursuit and I will discuss that topic in the next
section.
First, let us look at what can cause the solution of the discrete,
iterative numerical integration procedure to diverge from the true solution. The
conversion of partial differential equations to ordinary differential equations
constitutes an approximation through the truncation of space; the infinite
degrees of freedom of continuous spatial variables is often approximated by
a finite number of discrete degrees of freedom. The result is a new model,
and so the different behavior realized is a function of the approximation
of the model and not the simulation. We see from this example that we
can distinguish between approximations made in the model and those in
simulation. The significant approximation that can cause a simulated
trajectory to diverge from the true solution is the discretization of time from
the infinitely divisible continuum to a discrete contiguum of finite-precision
steps.
Because FPU is the cornerstone of this work, let us look there for the
initial application of this approximation. From the following quotation we
can see that this approximation was considered to be the least significant
of the approximations made by Fermi et al.: "The derivatives in time,
of course, were replaced for the purpose of numerical work by difference
expressions" (p. 986). Their reasoning is intuitively clear. On the one hand,
discretization of time is necessitated by the numerical procedure and, on

114 5. Steps to an Epistemology of Simulation
the other hand, such an approximation should not affect either the model
or what can be said about the physical system. Although Fermi et al. may
not have realized it, this approximation is significant exactly because it
plays a role only in the simulation and not in the model nor in the physical
system. Indeed, consider this statement by Ulam:
In addition, such experiments on computing machines would
have at least the virtue of having the postulates clearly stated.
This is not always the case in an actual physical object or model
where all the assumptions are not perhaps explicitly recognized.
(Segre, 1965, p. 977)
Ulam seems to imply that a simulation is able to provide us with more
clear information about its respective truth than an experiment because
the postulates are clearly stated in advance. But the discretization of time
in a simulation can and sometimes does, under certain conditions, obscure
the true solution. The discovery of this limitation and the development of
a system of checks to recognize its effect will be discussed in our next tour
through the case study.
Why would we expect the discretization of time to have little, if any,
physical meaning? Physically because a macroscopic system is
deterministic, its behavior must be independent of how often we look at it, so having
discrete-time steps should not make a difference there. Similarly for the
model. However, we cannot obtain infinite precision in our calculations, so
an iterative numerical procedure may lose some portion of the true
solution on each iteration. Under most circumstances, we can contain this error
by choosing an appropriate size for the time steps. But there are
circumstances in which the trajectory is changing so rapidly that we cannot track
it; the signal becomes lost in the noise of the numerical procedure. In other
words, the changes in the trajectory are so fast that any prediction into
the future based on the present state cannot follow the true solution, no
matter how small we make the prediction interval. In simple terms, there
are times when the simulation displays the true solution easily; but there
are also times when a numerical procedure is not a transparent window
on the true solution and when this occurs, the efficacy of the simulation
deteriorates.
Numerical integration must proceed in discrete steps; so not every point
along a trajectory can be represented. An infinite number of points along
the trajectory are lost in the grid of the numerical procedure. The
significance of this loss varies, depending on how erratic the trajectory is. One
characteristic feature of chaotic systems is the "exponential separation of
pair orbits." In a particularly chaotic region of phase space, very small
differences in the location of a point along the trajectory can lead to very
different behavior. In a numerical procedure, the choice of precision can
alter the behavior of the trajectory: adding or subtracting a single digit of

5.5. Epistemology 115
precision can result in totally different trajectories from what appear to be
the same initial conditions. Each iteration of the numerical procedure is a
prediction into the future—into the next time step—based on the state of
the conjugate variables at the present instant. So making the time-step size
small lessens the likelihood of a prediction error.
In regions of "deterministic chaos," in which the trajectory changes its
course rapidly, the numerical procedure generates noise in the simulation
regardless of the time-step size. This level of noise grows until the signal—
the true solution—gets lost in it. Clearly this is a problem of entropy. This
effect is not a loss of determinism in our dynamical system, but a loss of our
ability to make the determination with a simulation. The usefulness of our
apparatus deteriorates as it increasingly generates noise. In other words, in
regions of deterministic chaos, information entropy increases in simulation,
even though the physical entropy of the system cannot, according to our
assumptions.
A related simulation problem is the randomness associated with trying
to connect a physical system to a numerical initial condition. The state
of the physical system can be measured only to some finite accuracy, the
uncertainty of which can lead to randomness in simulation. To show how
randomness can come about in the deterministic process of numerical
mapping on a discrete grid, I turn to an explanation from Lichtenberg and
Lieberman:
Consider for example the one-dimensional mapping defined over
the interval [0,1]: xn+i = 10xn, mod 1, with the initial
condition: 0 < xq < 1. We divide the "phase space" 0 < xo < 1
into ten equal intervals having as labels aj the integer part of
10a;. Then for example the initial condition xq = 0.157643 ...
through the mapping, generates the sequence {a{\ = 1, 5, 7, 6,
4, 3,... Is this sequence random? We see that the answer hinges
on whether the initial condition x$ is random. However, by the
previous theorem [Kolmogorov-Sinai entropy], "almost all" #o's
are random, and thus the sequence is random. It is conjectured
that the motion near homoclinic points with a separatrix layer
is random in this sense. Conversely, the regular motion on a
KAM surface is nonrandom according to the above definition.
(1983, p. 275)
Associating the model with a simulated trajectory requires an ability to
track arbitrary initial conditions. As we can see from this example, the
uncertainty in the accuracy of any measurement can quickly lead to
randomness in the calculation of a numerical mapping on a discrete grid. So
even though we can obtain many decimal places in a computer simulation,
we cannot avoid this exponential growth of randomness in the numerical
procedure. The move from the continuum (infinite precision) to the discrete

116 5. Steps to an Epistemology of Simulation
grid is unavoidable, because in all numerical integration and mapping, we
must choose some finite numerical grid of precision. Eventually, in regions of
deterministic chaos, a simulated trajectory may, if it is sufficiently erratic,
lose memory of its initial condition, meaning that if we were to reverse the
calculation, the trajectory would not return to its initial condition. Because
the deterministic fundamental theory is time-reversal invariant, this loss of
initial conditions is a sure indicator that the efficacy of the simulation has
been lost. However, chaos does not entail irreversibility and so this loss of
memory situation may not occur even in a chaotic model; currently, the
only sure signature of chaos is the exponential separation of orbits.
Irreversibility defies our deterministic expectation on the system; so
either the simulation has diverged from the true solution, or else the true
solution is not what we expect. We can blame the discretization of the
numerical procedure for this increase of information entropy. When regions of
deterministic chaos are discovered, they enhance our knowledge of the true
solution by guiding us to regions where macroscopic irreversibility seems
to emerge from a deterministic system; but the connection between
deterministic chaos in the simulation and physical irreversibility is yet to be
made. At the same time, if our intention is to understand the evolution of
particular solutions, then the efficacy is reduced because we can no longer
track the evolution of an arbitrary trajectory. In these regions we can still
learn about the general behavior of the solutions of the model, but it
becomes difficult to know whether the simulation is telling us about the true
solution or not, at the local level. Testing for reversibility seems to be a
good way to measure the efficacy of our simulations.
Another problem related to the finite-precision arithmetic of computer
simulation is round-off error. The uncertainty in the last digit of a finite
precision real number accumulates in multiple arithmetic calculations—such as
many iterations of a mapping. Several studies of the effect of this round-off
error have led to the conclusion (Lichtenberg and Lieberman, 1983, p. 277)
that "chaotic motion observed in generic Hamiltonian systems is intrinsic
to the dynamics and is not produced by discretization effects associated
with finite precision arithmetic." In one of these studies (Rannou, 1973),
the author converts a "continuous" mapping—one that is iterated in real-
number representation—to a "discrete" mapping—one that is performed in
integer representation. The advantage of a discrete mapping is that there
is no round-off for integer arithmetic and so it cannot accumulate during
the iterations as it does when digits of real numbers must be truncated.
The surface of section is subdivided into a course-grained grid beforehand.
The results of the randomness of this discrete mapping as compared to the
associated continuous mapping showed very little difference in the
characterization of ergodic and periodic orbits. In a second study (Greene, 1979),
the author calculated the effective propagation of the round-off error in a
Hamiltonian mapping exhibiting KAM properties and found that the
error only propagated in a direction parallel to the curves (cross-sections of

5.6. Preconceptions 117
tori). In this case, the error could not cause confusion between stochastic
and conditionally periodic regions. Although we might confuse the effects
of round-off error in our procedure with the randomness of deterministic
chaos, we can check our procedure by applying it to an integrable model
and comparing its results against direct calculation, made possible by the
global definition of the trajectory.
5.6. Preconceptions
Apart from the question of whether the simulation is telling us about the
true solution or not, we must consider how much of its behavior we are
prepared to see. What we see in simulation may be biased strongly by
what we expect to see. Peter Galison writes:
the naive view that prior expectations do no more than "bias"
the experimenter will not do. Theories, or rather the different
levels of theory, do much more than tint an otherwise crystal-
clear view of the world. In Maxwell's case, for instance, it as
precisely his lack of a quantitative prediction that left him with
no idea of how big an effect he was looking for. (Galison, 1987,
p. 73)
Galison argues that theoretical presuppositions function not only to
"influence the outcome of an experiment" (p. 69), but they also create the
situation for the experiment itself. By extension, these ideas carry over into
the use of simulation. Galison cites the example of J.C. Maxwell's attempt
to ascertain the gyromagnetic effect of electrical current (p-factor)
experimentally (performed in 1863). The theory that led Maxwell to perform his
experiment left him without a prediction of the scale of the phenomenon
that he was looking for. Maxwell failed to find the effect at all: "Maxwell
was at sea: he could not know even what order-of-magnitude effect to
expect and consequently could neither calculate which background effects
would be important nor predict how accurately he would have to measure
the tilt of his machine" (p. 69). As an example from our case study, let us
consider the Henon and Heiles simulation in light of their presuppositions.
The ergodic hypothesis influenced Henon and Heiles to perform their
simulation in the first place, as we see in the following explanation:
For many years, it was assumed that 73 is nonisolating on the
grounds that no third integral expressible in analytical form like
I\ and I2 had been discovered, despite many efforts. But this
assumption, as has been often remarked, is in conflict with the
observed distribution of stellar velocities near the Sun; for it
implies that the dispersion of velocities should be the same in the

118 5. Steps to an Epistemology of Simulation
direction of the galactic center and in the direction
perpendicular to the galactic plane, whereas the observed dispersions have
approximately a 2:1 ratio. More recently, a number of galactic
orbits have been computed numerically. Quite unexpectedly, all
these orbits behaved as if they had not 2, but 3 isolating
integrals. As a result, there was some change of opinion on the
subject. Attempts were made to prove theoretically the
existence of a third integral. ... In the present paper, we approach
the problem again by numerical computation; but, in order to
have more freedom of experimentation, we forget momentarily
the astronomical origin of the problem and consider it in its
general form. (Henon and Heiles, 1964, p. 73)
The existing theory of galactic motion predicted only two integrals of the
motion—the conservation of both energy and angular momentum. But the
statistics drawn from observations seemed to indicate the existence of an
additional constraint on the orbits of stars. According to the ergodic
hypothesis, either there is an integral of the motion or else there is ergodicity.
Henon and Heiles chose to use simulation on this perturbed Hamiltonian
to find evidence of an integral where no such integral could be found
analytically. Here we find the beginning space for simulation; it might be able
to resolve a conflict between theory and observation by displaying
empirical evidence of an isolating integral when no analytical integral could be
discovered.
We have already seen what Henon and Heiles found in their study. Let us
focus on Figures 5.1(a), (b), and (c), and some of their explanation. I argue
that their reliance on the ergodic hypothesis may have prevented them
from seeing two important attributes in their simulation. Each of these
figures can be examined in both a static sense, in the spatial relationships
between the points as they are plotted, and then in a dynamic sense in
the sequence of the successive points as they are plotted. Statically, once
the simulation is stopped, we can look at the resulting picture and record
observations. In this way, Henon and Heiles discovered the islands-of-order
structures and they also recognized that the so-called third integral seemed
to break down as the energy was increased from figure to figure. What they
found in these static analyses, combined with their expectations from the
ergodic hypothesis, guided what they looked for in their dynamic analysis.
By "dynamic analysis," I simply mean making observations during the
unfolding of the simulation. In the first figure, the orbits are clearly bound
to tori, so the authors conclude that there is an integral of the motion
causing this behavior. In accord with this conclusion, they look for and
find an order to the unfolding of the points of the trajectory:
It may be interesting to note that the successive points Pi,
P2) P3, ••• rotate regularly around the curve. The figure is

5.6. Preconceptions 119
(a)
(b)
(c)
FIGURE 5.1. Trajectories in the Henon and Heiles surface of section
for energies: (a) E = 1/12 shows very persistent level curves suggesting
an integrable system; (b) E = 1/8 shows the mix of level curves and
deterministic chaos; and (c) E = 1/6 at the dissociation energy.

5.6. Preconceptions 121
but they behave in a completely different way. It is clearly
impossible to draw any curve through them. They seem to be
distributed at random, in an area left free between the closed
curves. Most striking is the fact that this change of behavior
seems to occur abruptly across some dividing line in the plane.
(P- 76)
Their analysis of this figure is strictly static. They saw ergodic trajectories
and assumed there would be no order to their unfolding. Why they chose
to look for it in the third figure instead of the second is unknown.
Had they found the unfolding order of the stochastic trajectories in the
second figure, they may have found a second important property. To their
credit, they did find the very important property of exponential
separation of pair orbits. Furthermore, they used this property to successfully
differentiate between ergodic and bound trajectories.
Even though they knew about the exponential separation of pair orbits,
they did not know that this property could lead to irreversibility, which is
a significant indicator of the loss of efficacy. If they had found the dynamic
order to the stochastic unfolding of the ergodic orbits, then there is reason
to believe that they would have made the connection between that order
and the exponential separation of pair orbits. They made the point about
periodic orbits returning to the initial condition, indicating that perhaps if
they had recognized the unfolding order in the stochastic orbits, then they
probably would have explored the reversibility of those orbits.
The major flaw in this argument lies in the conspicuous absence of any
mention of reversibility in Henon and Heiles, even for the periodic orbits.
Had Henon and Heiles discussed the reversibility of the periodic orbits
explicitly, then we would know that they thought about reversibility. Because
they found no dynamic order in Figure 5.1(c), it did not occur to them to
try reversing the trajectory. Their expectations—stemming from the
ergodic hypothesis—probably prevented them from discovering the dynamic
unfolding order in the ergodic trajectories and then, perhaps, by extension,
the more significant property of irreversibility. But because this latter
conclusion has significantly less rhetorical grounding, we must let it stand as
an interesting speculation.
My arguments are in no way intended to detract from the significance of
the Henon and Heiles work and the remarkable discoveries they made. That
they discovered the crucial property of a chaotic system, the exponential
separation of pair orbits, is truly remarkable. My intent is only to point
out that preconceived theoretical ideas can function to set the stage for a
simulation and, simultaneously, they can operate to influence what we may
find in simulation.

122 5. Steps to an Epistemology of Simulation
5.7. Strategies for Belief and Pursuit
In the next section, we will reengage with the FPU case study for a close
reading of the development and use of several strategies for obtaining
reasonable belief in the results of simulation. First I present a summary of
these strategies to outline what we may expect to find in what follows. At
the same time, I will relate these strategies to those discussed in detail by
Franklin (1986, Chap. 6 and 7) and (1990, Chap. 6). Franklin's work
establishes the beginnings of an epistemology of experiment, based on implicit
and common sense strategies used by scientists to increase their degree of
reasonable belief in the results of experiment. Although I do not attempt
in this single chapter to achieve the rigor of Franklin's extended work to
establish a similar epistemology of simulation, I do develop a subset of
these strategies which make their way naturally to simulation. My choice
of elements to include in this subset stem from their manifestation in the
case study. The methodology of experiment extends naturally and
implicitly to simulation and so it follows that the implicit methods of rational
belief would extend themselves to the realm of simulation as well. This is
a reasonable expectation that allows for an epistemology of simulation to
get a head start by importing the applicable elements from the
epistemology of experiment. In the following sections, I adapt five of Franklin's nine
strategies into the language of simulation; I choose this subset only because
they seem particularly apt in the specific case study at hand. The first five
strategies are as follows.
Comparison
We might expect that errors could result from the way the algorithm is
implemented, for example, by not carefully avoiding round-off errors in
iterative-loop calculations. But we do not expect to find much difference
in the outcome of the same code run on two different machines; although
this is a possibility, the uniformity of digital computing machines allows us
to talk about the behavior of the simulation separately from that of the
machine.
As we will see, FPU performed checks on their results to avoid errors due
to round-off. More generally, when there are properties of the true solution
known a priori, then they can provide us with checks on the simulation
during the procedure. Because all the systems dealt with in the case study
are Hamiltonian, we know axiomatically that energy must be conserved.
Thus in every simulation, we can expect to calculate the total energy at
any cycle and obtain the same result. If the energy is not conserved, then
we know the simulation is wrong, because the true solution must conserve
energy by design. Let us list this strategy under a category called
comparison: comparing the results of simulation with known properties of the
true solution. In analogy to the epistemology of experiment, this strategy

5.7. Strategies for Belief and Pursuit 123
corresponds to what Franklin (1990, p. 104) calls "Reproducing artifacts
that are known in advance to be present." In this case, conserved energy
is an artifact of the model. This strategy is useful because it provides a
running check on at least one property of the true solution while studying
an unknown system. We might think of this as a dynamic strategy because
it relates the simulation to the true solution at all times during the
simulation. However, it is very important to note that this strategy provides
only a necessary but not a sufficient condition for believing in the result.
As I will explain further in the case study, a simulation may diverge from
the true solution and still conserve energy.
Calibration
A similar but less dynamic strategy is one that we might use to test both the
precision of the machine and the accuracy of the numerical procedure. This
strategy, which I choose to call calibration (again in analogy to Franklin)
entails using the apparatus (the machine and the procedure) to reproduce
known results. This strategy involves simulating a model in which the true
solution is known, one that is analytically solvable—such as the
unperturbed Hamiltonian of the system in question. With this method, we can
obtain belief that, to the degree of precision that we are operating, the
simulation is producing its solution without numerical errors. This method
is useful for checking on the effects of precision and the competency of
the numerical integration techniques; but it is less dynamic than the first
method because it must be run before or after the actual simulation. Even if
we do calibrate a numerical technique with the unperturbed Hamiltonian,
we cannot be assured that it is telling us about the true solution in the
perturbed case because of the radical behavior of trajectories in regions of
deterministic chaos. Although we can increase our belief in the numerical
procedure using this method, we must always use other methods in
addition to this strategy to obtain as much reasonable belief as we can about
the simulation.
Verification
If observed behavior in the simulation does not violate any of the
corresponding properties of the true solution, then what strategies may be
employed to check on that behavior? In an experiment, we might look
for independent verification by someone else performing the same
experiment, or else we might try to devise a different experiment to observe the
same phenomena. In dynamics, the analogous options seem to be: other
experimenters simulating the same model, but using a different numerical
procedure, or using the same procedure, but changing the precision or the
size of the time step to see if the trajectory changes. As I mentioned above,
sometimes using different precision can produce different results for the

124 5. Steps to an Epistemology of Simulation
same trajectory, so if the same behavior is observed using different
precision, then there is some reason to believe that the results are genuine.
As for diflFerent numerical procedures, if different results are obtained on
the same trajectory, then the results are suspect. But if the results are the
same, then we obtain stronger belief in that the result is attributable to the
true solution. If the results are verified by an independent numerical
procedure, then we might be motivated to check further using other methods,
especially if the results defy our expectations.
Retrodiction
Once some interesting new behavior is uncovered in simulation, it could
very well be used to guide theoretical calculations to an independent
verification in the form of perturbation theory analyses. We saw in both Ford
and Jackson's early work, that they used theoretical means to derive
reasonable belief in the simulation of FPU. I call this form of verification
retrodiction—the retroactive affirmation of simulation results. This
strategy not only provides a strong reason to believe that the results of
simulation are genuine because of its reliance on perturbation theory, but it
also points to a feedback-loop strategy of interdependent guidance between
theory and simulation. This strategy corresponds to Franklin's "Using an
independently well-corroborated theory of the phenomena to explain the
results."
Consistency
Another of Franklin's strategies that carries over into the realm of
simulation is "Using the results themselves to argue for their validity." In the
FPU simulation, one of the interesting results that helped to establish
reasonable belief that something important had been observed was the fact
that the three different choices of perturbations had all resulted in the same
anomalous behavior. This consistency argues for the validity of the
common result, even though that result could not be explained by the existing
theory.
The remaining four of Franklin's nine strategies that have not been
mentioned, although they might still find some application in this work are: (6)
intervention, in which the experimenter manipulates the object under
observation (we might include variation of model parameters here and the
entire subject of structural stability); (7) the elimination of plausible sources
of error and alternative explanations of the result; (8) using an apparatus
based on a well-corroborated theory; and (9) using statistical arguments. I
do not say these additional strategies do not apply to simulation, but only
that I do not see them emerging immediately in the case study. A more
complete work in this vein would necessarily invest significantly more time

5.8. Case Study I: Fermi-Pasta-Ulam 125
in the examination of the analogous role of these strategies for simulation,
as well as deriving new strategies that apply only to simulation.
5.8. Case Study I: Fermi-Pasta-Ulam
The many reasons to believe in the results of the FPU simulation form an
interconnected network involving all five of the strategies that have been
described. In addition to belief in the results, there is also the issue of
pursuit. Why did these scientists pursue this subject? This is a question that
is just beginning to find its place in the philosophy of science. In the case
study, we see that something about the FPU results inspired several
independent scientists to pursue the issue further. Certainly one prerequisite to
pursuit in a case of anomalous results in simulation is a conflict between
strong expectations and a contrary result that carries a suflScient amount of
reasonable belief. However, this does not mean that scientists who choose
to pursue a line of inquiry must believe in the anomalous results; they may
choose to pursue it just exactly because they do not believe in the
implications of the result. Belief and pursuit do not have to be aligned. Other
issues that may enhance pursuit include the individual propensities of
scientists. Implicit in a scientist's choice to pursue a simulation is perhaps
the belief that it provides information about nature. But let me deal with
the relationship between belief and pursuit in terms of the case study as at
least a beginning.
Perhaps to provide some contextual significance to the FPU paper, Ulam,
in writing a new introduction to the FPU article in Fermi's Collected Works
(Segre, 1965), included references to the work of Zabusky. In effect, he calls
upon us to believe in the results of FPU by utilizing the epistemological
strategy of retrodiction:
One obtains from it [Zabusky's analytical work] another
indication that the phenomenon discovered is not due to numerical
accidents of the algorithm of the computing machine, but seems
to constitute a real property of the dynamical system, (pp. 977-
978)
By using the word "real" here, Ulam seems to indicate that the results of
their simulation tell us something about physical reality. But, if we
consider that he is a mathematician, then an alternative reading would be
that by "dynamical systems" he means the mathematical model, and that
the true solution is what is "real." So he may not be claiming anything
about the reality of physical systems in the way that physicists are used to
thinking about. Ford, Jackson, and Zabusky all chose to pursue the FPU
model in their theoretical work. We know that the results were certainly
a surprise because they conflicted with expectations based on the ergodic

126 5. Steps to an Epistemology of Simulation
hypothesis. According to the above prerequisite criteria for pursuit,
something about the FPU results provided these workers with sufficient belief
that the results were reasonable. We must also consider how their work, in
turn, increased the general belief in phenomena more general than just the
FPU problem. The near-recurrence of initial conditions that FPU observed
were in conflict with a strong expectation of ergodicity. This conflict
provided the focus for the pursuits of Zabusky, Ford, and Jackson. Ford was
looking for a firm basis for statistical mechanics—a basis for
irreversibility that goes beyond phenomenology. Zabusky was dissatisfied with Ford's
approach to the problem in 1961, and he saw this problem as a good way
to collaborate with Kruskal, who privately had expressed interest in the
FPU results (Zabusky, 1967). Because both Zabusky and Jackson knew
and acknowledged the fact that the near-recurrence appeared in the
independent calculations of Tuck and Menzel, we must recognize that this
independent simulation on the same Hamiltonian served as an important
verification, and that it generated enough reasonable belief in both Jackson
and Zabusky to inspire them toward the development of new theory. Not
only did Tuck's work recover the FPU results, but it extended those
results to uncover the "super-period." Zabusky and Jackson both retrodicted
the FPU recurrence from completely independent avenues of theoretical
research and thus provided a very strong reasonable belief that this was a
real property of the true solution; each independent result provided some
degree of additional belief. Also, because Jackson studied a discrete model
and Zabusky studied a continuous model, we find a case of the consistency
of the results arguing for their validity. Once these articles were published,
there emerged a very strong general belief that the FPU results did indeed
uncover an important new property of nonintegrable models.
Due to the regrettable death of Fermi, we are afforded an opportunity to
look more closely at the FPU results, independently of any other results.
Returning to Ulam's introduction to Fermi's Collected Works, we find him
reporting that Fermi believed in the results of their simulation:
He [Fermi] expressed to me the opinion that they really
constituted a little discovery in providing intimations that the
prevalent beliefs in the universality of "mixing and thermalization"
in nonlinear systems may not be always justified. (Segre, 1965,
p. 977)
Clearly Fermi believed strongly in the results of their simulation, enough
to question the generality of the ergodic hypothesis and, it may be
conjectured, that this simulation was telling them something about reality.
Because Fermi died a year after performing this simulation, he must have
said this to Ulam before any other verification of the results occurred,
indicating that something in the FPU work itself convinced him that these
inexplicable results were genuine.

5.8. Case Study I: Fermi-Pasta-Ulam 127
In addition to using three different perturbations, which amounts to
testing three different models, Fermi et al. hoped to obtain rational belief in
their results by using the Hamiltonian conservation of energy as an effective
strategy for checking on the accuracy of the numerics:
The accuracy of the numerical work was checked by the
constancy of the quantity representing the total energy. In some
cases, for checking purposes, the corresponding linear problems
were run and these behaved correctly within one percent or so,
even after 10,000 or more cycles, (p. 986)
Clearly this statement refers to the double strategy of calibration and
comparison. Not only was the energy conservation checked for the perturbed
system, but it was checked also in the unperturbed, linear system. Thus
the authors checked the competency of the numerical procedure by
convincing themselves that round-off error was not causing a change of energy
in either system, which they knew should not dissipate energy. Also, by
cross-checking in this way, they confirmed that the nonlinear, dispersive
model did not contain a hidden energy-dissipating feature, which might
invalidate their assumptions about the model.
Although the use of different perturbations may seem to establish the
strategy of independent verification, it does not because that strategy
requires either the use of different numerical techniques, different precision
or time-step size and, in addition, all of these need to be performed with
the same model. FPU used three different models and therefore it seems as
if the similarity in the results support arguments that pertain to the model
and not the simulation. But the consistency of the results among the
different models does provide strong reasonable belief in the results after all.
If we are to believe Ulam's claim about Fermi's firm statement, then there
must be an implicit reason to believe in these results which confounded the
established theory. Indeed, we have here perhaps the original example of
the strategy of "using the results themselves to argue for their validity."
The consistency among the results on different perturbations strengthens
the belief that the anomalous behavior is real and represents a heretofore
unknown property of the true solution. Although they were able to discount
errors due to the numerical procedure by applying the two strategies of
calibration and comparison, these strategies themselves only provide a basis
on which to discount gross numerical errors. But this other more implicit
strategy explains why this problem generated the interest and attention of
the larger community of physicists. The FPU bounded behavior had
acquired strong belief as a real property of the basic perturbed model. In
addition, this behavior presented an important conflict with the existing
theory of dynamical systems. Both of these results seem to be necessary
conditions for pursuit, and perhaps they explain some of the reasons this
problem took on a continuing life throughout the period of the case study.

128 5. Steps to an Epistemology of Simulation
5.9. Case Study II: Henon and Heiles
Now we move on to the seminal work of Henon and Heiles (1964). This
article obtains significance both because it represents the beginning of
standard methodologies in dynamics with relation to simulation and it is the
first application of the KAM theorem to a model derived from a physical
system—as was announced by Walker and Ford (1969). Lunsford and Ford
(1972) demonstrated the connections between the FPU, Henon and Heiles,
and Toda-lattice Hamiltonian, so that there is also this indirect connection
to the problem at hand, in that they all address the "fundamental problem
of dynamics." It is important to mention that this study was not motivated
by the results of FPU. Henon and Heiles claim that the reason for their
pursuit of this problem was the disparity between theoretical predictions and
observations. But the results of this study converge with the FPU results
as a further basis for continuing pursuit of the related anomalous behavior.
Let us examine the strategy used by Henon and Heiles to obtain
reasonable belief in the results of their simulation. The important method that
Henon and Heiles chose to use in their simulation, which differed from Fermi
et al., was representing the evolution of their model in a surface of section.
This choice provided them with access to important topological structures
that are obscurred by the modal-energy approach of Fermi et al. But to
their credit, such an approach by Fermi et al. would have been very difficult
considering the large number of degrees of freedom they had to consider.
Fermi et al. utilized the two explicit strategies of calibration and
comparison; both of which may be used on any simulation of this type because
the unperturbed Hamiltonian is integrable and both the unperturbed and
perturbed models are energy-conserving. Henon and Heiles also used two
explicit strategies: they used comparison and independent verification, the
latter instead of calibration. Once again, as did Fermi et al., they utilized
the fact that energy must be conserved in this Hamiltonian system:
As a check, some of the orbits were computed independently by
each of us, using different computers and different integration
schemes. The following results were obtained using the Runge-
Kutta method; during the numerical integration the energy was
observed to decrease very slightly (< |0.00003| for 150 orbits).
(P- 75)
In this instance, the authors used different algorithms and different
computers to insure their results. Just on the basis of this article and FPU,
there seems to be an implicit assumption that using two of the explicit
strategies is adequate to achieve reasonable belief that the results do not
contain artifacts of the simulation. However, we must keep in mind that
belief is quite different from truth: even though scientists may believe in
a result, it still may be different from the true solution. Henon and Heiles

5.10. Methodology 129
could have performed a calibration of their simulation by checking the
energy-conservation of the unperturbed Hamiltonian; but they chose not
to. Using different integration techniques to achieve the same trajectory
constitutes an independent verification that the simulation describes the
true solution. But, as I soon discuss, there exists the possibility that the
true solution may still diverge from the simulation. No subset of strategies
of reasonable belief provides sufficient conditions to guarantee finding the
truth.
The use of independent verification may be only slightly more
successful than calibration as a strategy for reasonable belief, because calibration
on the integrable model does not assure us that the technique is reliable
for the perturbed model. One of the characteristics of the nonintegrable
models that we have seen in this work is the exponential separation of
pair orbits—an important result found through the use of simulation. This
kind of radical behavior in the trajectory cannot occur in the integrable
model, indicating a distinct difference between the two—a difference that
could invalidate any assurance we obtain from a calibration procedure on
the unperturbed model. Whereas an independent numerical procedure that
reproduces the results consistently provides us with at least necessary
conditions for the results to be descriptive of the true solution. But this strategy
does not provide us with sufficient conditions to believe that the behavior
is attributable to the true solution, because there may be an artifact
generated by the discrete-time approximation inherent to all of these numerical
procedures. We saw that in stochastic regions of the phase space of these
models, the discrete nature of the numerical procedures leads to the loss of
efficacy of the simulation. Even though two independent procedures may
produce the same trajectory, this trajectory is not guaranteed to be
representative of the true solution because both procedures rely on discrete
iterations.
Whereas FPU demonstrated that the boundedness in their simulation
was not specific to the particular nonlinear perturbation—using the results
to argue for their validity—the results of the Henon and Heiles work are
limited in scope to this single perturbation. But taken together, the two
works strengthen each other as they flesh out a more definite understanding
of the bounded energy-sharing. Because the Henon and Heiles Hamiltonian
overlaps with a special case of the FPU Hamiltonian, the two sets of
results strengthen each other as independent verification that the bounded
behavior is intrinsic to the true solution.
5.10. Methodology
Consideration of the work of Ford (1961) and Jackson (1963b) adds little
to our discussion of epistemological strategies for simulation because they

130 5. Steps to an Epistemology of Simulation
dealt mainly with perturbation theory. However, that in itself, further
indicates that the FPU simulation functioned to stimulate the development
of new theory—one more function that simulation shares with experiment.
Both Ford and Jackson returned to perturbation theory as a starting point
for further theory development. The one other aspect of their work that
seems appropriate to this discussion is their incorporation of simulation
into their own work. Whereas Ford used the results of FPU only as
reference data with which to align his research findings (retrodiction), Jackson
integrated simulation into his work more completely.
Evidence from Ford's article indicates that he, as a theorist, thought of
simulation as an experiment: "Ulam, Fermi, and Pasta tried to illustrate
the expected approach to equilibrium by observing the equipartition of
energy among normal modes for a system of one-dimensional oscillators
obeying equations of the type ... "(p. 387). Both the terms "illustrate" and
"observe" indicate Ford's attitude toward simulation as an experiment. He
treated the output of the simulation as if it were data from an experiment,
and he presented his analytical analysis as a qualitative comparison to the
"experimental" results of the simulation.
Jackson went a step further in his use of simulation; although he too
seems to have thought of it as an experiment on the model. In the second
part of his two-part article, he wrote:
In the present study, this theory [from Part I] is applied to
a particular system of coupled oscillators and compared with
results obtained from computer solutions of the equations of
motion. ... A numerical analysis of such a system has been
made previously by Fermi, Pasta, and Ulam, and, in order to
make use of their results, we will restrict our study to one of
the nonlinear interactions used by them. (1963b, p. 686)
Whereas Jackson, like Ford, tried to explain the findings of the FPU
simulation with perturbation theory, he went on to use his theory to predict the
outcome of new calculations that extended the FPU model to new cases.
Thus, by combining perturbation theory with numerical calculations of
his own, Jackson synthesized the two approaches. To increase the reader's
belief in his own results, Jackson first retrodicted the FPU results, thus
extending the reasonable belief of simulated results to this own theory. He
then extended those results beyond the bounds of the FPU experiment by
predicting forward with his theory beyond the confines of the FPU results.
He performed calculations on the model to confirm those predictions.
By employing the FPU results to validate his own theoretical work,
Jackson strengthened the acceptance of the FPU results. In essence, Jackson
used the results of simulation to corroborate his own theory development.
Here again a theorist treated simulation as an experiment, but in a more
integrated form this time, using prediction from theory and a numerical

5.10. Methodology 131
experiment as validation for the theory. Jackson recognized that if
theoretical predictions could not only account for the FPU recurrence, but also
extend those results to new cases, then he could increase acceptance of his
theoretical work. Consequently, his strategy had the additional effect of
strengthening our reliance on simulation as a source of validation, because
he used numerical calculations of his own to show that his extended FPU
predictions would be borne out in simulation.
Let us turn now to the collaborative work of Ford and Waters (1963,
1966), at which time simulations were beginning to play a more significant
role as a heuristic tool in analytical research. In the first of these articles,
the authors come close to claiming that their own computer calculations
validated their predictions from perturbation theory:
The amplitudes of these normal modes are predicted using
perturbation theory; the validity of the calculations is then
demonstrated using a computer for systems of 3 and 5 oscillators. The
computer solutions indicate that these normal modes are stable
in the sense that a system started near one of its normal modes
remains near it. (p. 1294)
Although they did not actually claim that the calculations validated their
predictions, they did say that the calculations "demonstrate" that validity,
which is an interesting and slippery distinction. They must have thought
that claiming explicitly that a simulation could confirm a theory might be
unjustified. They displayed some reservations about whether a simulation
could be enough to validate theoretical calculations, which only points to
the need for establishing an epistemology of simulation. Also note that, in
the last sentence, they came up against an important problem that I will
treat in the next section; implicit in their claim of "remains near it" is the
qualification "as long as we ran the simulation," and "to the accuracy of
the calculation." So they could not use the simulation as a proof of the
existence of normal modes.
In the second of these articles (Waters and Ford, 1966) we find the
interplay between analytical work and simulation becoming a regular practice.
In the following quotation, we can see an emerging interconnectedness
between the strategies of retrodiction and calibration:
In Fig. 1, the first-order approximate solution is compared with
the numerical solution for a set of initial conditions which are
rather close to those for the first periodic solution. ... The
expansion parameters are ... , so that a first-order expansion
about this periodic solution is expected to provide a rather good
approximation to the actual solution. Figure 1 confirms that it
does. (p. 401)
Here they chose to predict solutions that come very close to the
unperturbed solution. In this way, retrodiction may seem to be calibrated be-

132 5. Steps to an Epistemology of Simulation
cause its predictions come close to* the unperturbed solution. The implicit
argument is that for a small perturbation, the theory is working, so we
may believe it has a chance of being correct for the more highly perturbed
cases.
5.11. Irreversibility
Waters and Ford explicitly specified their strategy for maintaining the
accuracy of their calculations: "Certain checks, including a running calculation
of the Hamiltonian and the reversal of several runs back to the initial
conditions, indicate that the accuracy of the numerical solutions presented is
better than 0.1%" (p. 401). In addition to the usual check on the
conservation of energy, we find the first mention of the reversibility
expectation, two years after the Henon and Heiles article. This additional feature,
that we expect to be guaranteed by the assumptions of the deterministic
model, provides a more sensitive test for the efficacy of the simulation. As
we know, in the stochastic regions of the phase space of a nonintegrable
model, orbit-pairs undergo exponential separation and the initial conditions
are lost. Clearly, it had not yet been realized that the simulation can and
will generate errors in its representation that are not due to any error in
programming; but instead, are due to the nature of the orbits themselves,
when linked with the discretization effects necessitated by simulation. But
reversibility is not the only way of discovering the divergence of the
simulation from the true solution in the exponential separation of pair-orbits.
Obviously tracking a pair of nearby orbits is another way, and varying the
size of the time step is yet another.
In their Physical Review Letters article, Walker and Ford (1969)
introduced the KAM theorem to the general physical audience. They cite the
irreversibility of stochastic trajectories that Henon and Heiles did not think
to observe:
For the total integration times used, up to t = 1500 for the
stable orbits, the energy was constant through six decimals.
Moreover, at t — 1500, the velocities could be reversed and
the motion integrated back to t = 0 maintaining four decimal
accuracy. However, for the highly unstable orbits, the reversed
integrations showed that the solutions were accurate only for
t < 200 even though the energy continued to be constant at
high accuracy, (p. 431)
Walker and Ford tried to verify that the irreversibility is not due to
numerical inaccuracy; but because this was a deterministic process, the
irreversibility must have been due to the numerical implementation.
Ford and Lunsford (1970) offered a possible explanation for this behavior
that we should look at closely:

5.12. Proof 133
The origin of irreversibility for these oscillator systems perhaps
finds its most fundamental description in terms of the
exponential separation of pair-orbits. In a very real sense, this rapid
pair-orbit separation represents that stirring of phase space
which Gibbs envisioned as causing irreversibility. In
particular, the rate at which these pair-orbits diverge gives at least
one measure of entropy production in these systems. In terms
of information theory, the slightest uncertainty in the initial
state would grow with time to almost complete uncertainty of
the final state, (p. 69)
Here the authors associated the production of information entropy with
the production of energy entropy. The entropy they describe is what I
have been calling the loss of efficacy of the trajectory: the irreversibility of
the trajectory causes a loss of the initial condition and so the trajectory
no longer represents the deterministic evolution from the initial condition.
Walker and Ford would have liked to connect this loss of information with
the source of energy entropy in a physical system. But this conclusion seems
unjustified because it lacks rigorous causal connections. We can conclude
that the memory of the initial condition is lost, and that the
exponential separation of pair-orbits does represent stochasticity. The connection
between loss-of-efficacy in a simulation and physical irreversibility is not
made. Once again we hear a researcher discussing the results of simulation
in terms of physical reality. Clearly Walker and Ford believe that the loss of
efficacy of the simulation is related directly to physical entropy production.
Although Walker and Ford were able to identify the existence of this
inherent limitation on the efficacy of simulation, they point out that no method
for predicting its occurrence has yet been established: "While there can be
little doubt that KAM instability is the source of the amplitude
instabilities observed in the above computer experiments, the theorist's ability to
predict the onset and completion of macroscopic instability is less certain"
(p. 188).
5.12. Proof
Dynamics is a multidisciplinary study in mathematics and physics. The
manipulation of a model, the Hamiltonian formalism, and the KAM
theorem are all mathematical, whereas the use of models and simulation to
study the properties of physical systems is physics. But there is no clear
separation between the two in any of this work. In physics, belief is a strong
principle: we believe in the laws of physics or we believe in the results of an
experiment. Proof is more the concern of mathematicians; logic and
mathematics enable the possibility of proof. When we work with simulations
of dynamical systems, questions of belief and proof become confused. In

134 5. Steps to an Epistemology of Simulation
dynamics, certain properties of models can be proved, such as the KAM
theorem. Once proved, these theorems become very important for the
physical study of dynamics. In the case of the ergodic hypothesis, we saw that
Fermi attempted a proof that was not convincing in the 1920s and in the
work of this case study, we found that the ergodic hypothesis was finally
laid to rest, partially because of the KAM theorem, which proved that
boundary curves persisted well into the perturbed space of nonintegrable
dynamical systems. I have discussed strategies for obtaining reasonable
belief that a simulation displays properties of the true solution. Because the
truth in this case is mathematical, proof seems to be required to assure,
finally, whether certain properties exist for a model. This assumption has
been expressed clearly by Henon and Heiles (1964):
Are the curves found here exactly or only approximately
invariant? What is the topological nature of the set of all the islands?
Is it possible to compute the curves directly from the potential,
without integrating all the orbits? The ultimate answer to such
questions should rest on rigorous mathematical proofs, not on
numerical experiments; but the mathematical approach to the
problem does not seem too easy. (p. 79)
Henon and Heiles were aware of the need for mathematical proof; yet they
also point to the difficulty of finding that proof. In order to get an idea
of what needs proving, simulation provides explorations into the space of
models. But can simulation be used to provide a proof? The answer to this
question depends heavily on how we frame the question.
Each point of a trajectory is generated from the iterative numerical
integration of equations of motion; if a trajectory displays a general property,
and consistently retains that property throughout the simulation, then the
question emerges: Will this property hold throughout the evolution?
Similarly, there are an infinite number of possible initial conditions for simulated
trajectories and we can simulate only a finite subset of them. What we see
in simulation is merely a small piece of the space of a model. If every swan
we see is white, may we conclude that all swans are white? The answer
to this question depends on whether we make conclusions based on the
evidence, or on the tenets of logic. A physicist might answer in the
affirmative to this question because all the evidence supports that answer; but a
logician must answer in the negative because deductive logic forbids such
an inductive conclusion.
Evidence from the Henon and Heiles article indicates the need for care in
framing our questions so that the simulation can provide definitive answers.
Consider the following excerpt from their text:
If we follow the trajectory for an infinite time, there will be in
general an infinite sequence of points. If there is no [additional]
isolating integral, these points will fill an area [in the surface

5.12. Proof 135
of section]. But if there is [an additional] isolating integral, the
points will lie on a curve. Thus we get a simple criterion for the
existence of the second integral; it is sufficient to compute a
number of points, plot them in the plane and see whether they
lie on a curve or not. (p. 74) [my emphasis]
Henon and Heiles make an interesting claim about sufficiency: If the
trajectory remains on a curve, then there is an integral of the motion. Whereas
any integral of the motion guarantees only bounded behavior, this
particular integral, if it exists, fulfills the final requirement that makes this system
completely integrable. If the points seem to be filling the bounded region,
then we could sufficiently conclude that there is not an analytical integral
of the motion in effect. But Henon and Heiles claim the converse, which is
true only if the ergodic hypothesis is true. If we see a black swan, then we
may conclude that not all swans are white; but, if we do not see a black
swan, then we still cannot claim to know that all swans are white.
However, because all the evidence supports it, then we may reasonably conclude
that all swans are white. This conclusion stands until someone produces
a nonwhite swan. Conclusions based on the evidence may turn out to be
wrong, but conclusions based on logical principles can not. It would be nice
to only make conclusions that cannot be wrong; but if logical guarantees
were the only way of justifying a conclusion, then science would proceed
very slowly indeed. If points continue to remain on what appears to be a
curve throughout the simulation, no matter how long it is run, then there
still remains the possibility that either the points will deviate at some later
time, or that a different choice of initial conditions could result in new
behavior. Logically, their claim would not be wrong if the ergodic
hypothesis were true—that is, if there is only the two possibilities of either order
or ergodicity, then anything remotely resembling order would cause us to
conclude the existence of an isolating integral.
To be correct in the light of the failure of the ergodic hypothesis, Henon
and Heiles might have said: "Thus we get a simple criterion for the
nonexistence of the second integral ... ." Simulation can disprove a hypothesis
with the instantiation of any exception to the rule, but it cannot prove the
hypothesis. The proof must come from some time-independent
mathematical formalism, such as the complicated proofs of the KAM theorem. Thus
we obtain a general limitation on what a simulation can prove. Given a
hypothesis that says: "Property A holds for all time." Simulation can be used
to disprove it by providing an exception, but it cannot be used alone to
prove the hypothesis because it can provide only a finite number of specific
instances.
Whereas Henon and Heiles did not seem to be aware of this limitation to
using the results of simulation as "sufficient" to demonstrate that properties
of solutions would hold for all time, we know that Waters and Ford were
conscious of it because they stated it explicitly:

136 5. Steps to an Epistemology of Simulation
Our work relies heavily on numerical solutions to the
equations of motion. Clearly, a computer can provide these
solutions only for selected initial conditions and for relatively short
time intervals. The analytical methods we have used to imply
the long-time behavior are open to serious convergence
questions, and it remains to be shown whether or not they
provide an adequate treatment of the small-denominator problem.
Nonetheless, whenever we have been provided with a check, we
have found agreement between analytical and numerical
solution. Consequently, it is with guarded optimism that we offer
our conclusions. (Waters and Ford, 1966, p. 1304)
Even though they made this cautionary statement, Waters and Ford had
convinced themselves of their analytical predictions using the results of
their simulation. In the process of using simulation to strengthen belief in
analytical results, their respect for, or their belief in simulation's ability to
provide real information was intensified. Here we see the difference between
certainty as supplied by proof and belief as supported by evidence.
At the end of the first article by Ford and Waters, we obtain an insight
into the authors' recognition of simulation's value in this work: "The
partial success of the perturbation methods we have used indicates that these
systems may eventually succumb completely to analysis; although a
computer is likely to remain a useful adjunct to analysis for some time to come"
(Ford and Waters, 1963, p. 1305). From this statement we can infer that
simulation had already become a vital tool for the theorist, because it
provided both heuristic exploration into unmapped areas of a nonintegrable
model, and limited, yet very directed, validity for predictions made during
theoretical development. Again, we must recognize with caution that this
validity is limited to particular ranges of perturbations and evolution times,
but it does function to strengthen the reasonable belief in predictions in
the absence of proof. What is also interesting in this statement is that,
implicitly, if these systems do not ever succumb to mathematical proof, then
computer simulation will be the most important tool for understanding
properties of solutions.
5.13. Proof and Simulation
Ford et al. (1973) pushed the use of simulation to its extreme as they
tried to determine whether the Toda lattice was integrable or not. They
designed many tests that could be examined in simulation. First they tried
to determine whether there obtains, within computer accuracy, a constant
of the motion.
On the other hand, by numerically integrating the equations of
motion for a variety of initial conditions, one can determine if

5.13. Proof and Simulation 137
the truncated, formal series is a valid constant of the motion
to within computer accuracy. In most of our numerical
integrations, this series was indeed found to be a constant of the
motion exhibiting a time variation only slightly greater than
the Hamiltonian itself, (p. 1549)
Recall that from just this result, Henon and Heiles concluded the existence
of an integral; but Ford et al. knew this was not sufficient.
Next they tried to find a breakdown of curves in the surface of section,
using methods similar to what Henon and Heiles used:
Thinking that the somewhat similar Toda Hamiltonian [similar
to Henon and Heiles] would also be nonintegrable, we chose to
begin our investigation at E = 1, only to find smooth curves
everywhere. Upping the energy to E = 256, then to E = 1024,
and finally to E = 56,000 still yielded smooth curves despite the
fact that, at these high energies, the nonlinear terms dominate
the motion, (p. 1551)
They searched for a breakdown of curves of successively higher energy,
characteristic of KAM curves to disprove the hypothesis of an integral;
but they failed to find any breakdowns. They also searched for fixed points
corresponding to islands of order in the Henon and Heiles system; but again
they failed to find these. Both of these negative results strongly indicated
that these curves were due to integrals of the motion; yet, they are not
sufficient proof in themselves.
Finally, they looked for the characteristic exponential separation of pair
orbits, which would indicate stochasticity; but they could find no evidence
of exponential separation. All this evidence did conform with the thesis
that the Toda-lattice system was integrable; yet they could conclude only:
Thus two equally attractive possibilities exits. The Toda
lattice is a member of the exception class of integrable oscillator
systems, making it a rare jewel in physics—a physically
interesting, nonlinear system which can in principle be analytically
solved exactly. Alternatively, the Toda lattice is typical of the
more general class of nonintegrable systems, thereby making
it a welcome addition to a small but growing list of interesting
physical systems which exhibit stochastic as well as nonstochas-
tic behavior. In this paper, we attempt to decide between these
alternatives by presenting the results of a series of distinct
computer experiments. ... Contrary to our original expectations,
all our empirical evidence indicates that the Toda lattice is
integrable, an unfortunate result from the viewpoint of computer
studies. For while a computer can be used to definitively prove
nonintegrability, it cannot prove integrability.

138 5. Steps to an Epistemology of Simulation
Using every test of integrability that simulation could provide, Ford et al.
(1973) could conclude only that it is probable that the Toda lattice was
integrable. They believed in that integrability, but because models fall into
the realm of mathematics, they still had to defer proof to some means other
than simulation. Henon finally proved that the Toda lattice was integrable.
He did so, not with simulation, but by working out the form of the third
integral analytically. Similarly, simulation cannot be used to prove the
existence of ergodicity. One of the only systems proven to be ergodic is a
system of hard spheres with a repulsive force, which Sinai (1963) proved to
be both ergodic and mixing, without using simulation.

Appendix A
Hamiltonian Dynamics:
Language of Abstraction
The advantages of the Hamiltonian formulation lies not in its use
as a calculational tool, but rather in the deeper insight it affords
into the formal structure of mechanics. The equal status accorded to
coordinates and momenta as independent variables encourages a
greater freedom in selecting the physical quantities to be designated
as "coordinates" and "momenta." As a result we are led to newer,
more abstract ways of representing the physical content of
mechanics.
Herbert Goldstein (1980, p. 378)
Newton's equations of motion are second-order differential equations
relating acceleration to influential forces. Often, it is preferable to consider
first-order differential equations rather than second-order equations—where
the order of the equation is determined by the order of the highest
derivatives required by the equation. A set of n second-order equations may be
converted to a set of 2n first-order equations by assuming the independence
of a variable from its rate of change, which was justified by Lagrange in
the eighteenth century. This independence gives rise to the need for a 2n-
dimensional phase space to specify completely the system's state with a
single point. The resulting system of 2n first-order differential equations is
completely equivalent to the system of n second-order equations of
Newton: 2n initial conditions are still required to specify the system completely.
So what is obtained by going into this form of dynamics? The answer is
mathematical elegance, symmetry, geometrical representation, and
canonical transformations.
Position variables in Hamiltonian dynamics may be something quite
different from the spatial location of a particle. They may well be some set of
generalized coordinates for which the description of the physical system is
more wieldy—such as the standard use of angles in the case of the
pendulum. But they could also be a set of perfectly valid quantities that have no
simple physical interpretation. Of course, if the positions are generalized

140 Appendix A. Hamiltonian Dynamics: Language of Abstraction
coordinates, then the momenta' are certainly something other than mass
times the first derivative of position. In fact, the relationship between
position and momentum is quite specific and is part of the very essence of
Hamiltonian dynamics. The central element of this dynamics is the
Hamiltonian function (if, often called simply "the Hamiltonian"), that is usually
a statement of the total energy of the system, expressed as a function of
positions {^}, momenta {pi}, and physical parameters related to the
various energy potentials. Prom the Hamiltonian, we obtain the 2n equations
of motion for the system by taking explicit derivatives of H with respect
to the 2ra variables (n positions and n momenta). Given a Hamiltonian
H(Qii • • • > (Zn> Pi> • • • jPn)j we may obtain Hamilton's equations of motion
directly:
dqi _ dH dpi __ dH
dt dpi' dt dqi
Notice the symmetry? Each pair of q^s and piS are locked together in an
antisymmetrical embrace that is given the special name symplectic. The
rates of change of both position and momentum are obtained directly from
the Hamiltonian, from its rate of change with respect to the other variable.
The variables are conjugated by way of the Hamiltonian, and the Hamilton
equations of motion. The special symmetry between the conjugate variables
and the Hamiltonian is given the name canonical As long as the dynamics is
denned in this canonical relationship, the variables are kept on equal footing
as the state variables in this form of dynamics and the initial assumptions
of the model are retained by the trajectories.
A.l. Topology and Phase-Space Trajectories
In symplectic geometry, each degree of freedom contributes two dimensions
to the overall system phase space. The phase space for any Hamiltonian
system of n degrees of freedom is thus a 2n-dimensional Euclidean space. The
system point changes coordinates as the system evolves in time, thereby
generating the system trajectory. The possibility exists for the system point
to remain fixed, in which case the point is called a fixed point. Once the
system point obtains that value, it never leaves of its own accord—for
example, the pendulum stopped at the bottom of its swing.
There are two important topological properties of the system trajectory:
as the system evolves, it cannot make discrete jumps in state variables, and
so the trajectory is a continuous curve. Second, because the solution to the
differential equations is unique, the trajectory in phase space may never
cross itself. Once a trajectory passes through a particular point in phase
space, then the path it travels away from that point is determined
unambiguously for all time. If the trajectory reaches that same point again at
(A.1)

A.2. Canonical Transformations 141
some later time, then it must follow the previously determined course away
(again), which clearly leads into a periodic loop. A closed curved is formed
and the system is periodic—every point on the closed curve is revisited
periodically. The fixed point is a special case of a periodic trajectory.
A.2. Canonical Transformations
In Chapter 1, we saw that FPU chose to convert their model to the normal-
mode representation to decouple the differential equations. Conveniently,
this transformation also provided a set of generalized coordinates with
which they (and we) could visualize the dynamics of the whole string.
Each normal mode represents a physical, sinusoidal shape of the string.
The transformation to normal modes is called a point transformation
because it only involves transforming from one set of generalized coordinates
{qi} to another {Qi} without involving the momenta in the
transformation. Here the momenta transform trivially from {pi} to {Pi} because of
the special case of the momenta being simply the first temporal derivative
of the positions. But, in general, the canonical transformations of
Hamiltonian dynamics involve both the coordinates and the momenta of the old
system (of representation) in the derivation of the coordinates and
momenta of the new system, all the while maintaining the system's canonical
structure. For more detailed information on canonical transformations, see,
for example, Goldstein (1980, Chap. 9). These transformations convey us
to more abstract representations of the dynamical system, which may help
us to understand better any hidden order principles (symmetries), that we
might not have been able to recognize in the original representation.
However, one drawback of employing such transformations is that we may lose
track of the physical significance of the theoretical entities.
A.3. Transforming the Unperturbed String
In order to make clear the connections between the KAM theorem and the
FPU problem, I develop, heuristically, the model of the discrete string in
the Hamiltonian formalism. The unperturbed Hamiltonian or a system of
n particles with linearized spring forces acting between them is
1 i=i
In conformity with FPU, I've set the mass of each particle and the spring
constant for each string-segment to unity, so they do not appear here. The
difference between nearest-neighbor displacements represents the extension

142 Appendix A. Hamiltonian Dynamics: Language of Abstraction
of each spring segment. On converting to the normal-mode representation
using the substitution
3 = 1
7 = 1 x '
where k = 1,2,... ,n,
the Hamiltonian becomes
H(Q,P) = J2
fc=l
^{Pl + ulQD
with
Wfc
„ . fifk\
2 sin —-
\2n)
(A.3)
(A.4)
(A.5)
Equation (A.4) is analogous to equation (1.5), except that I have
substituted Q's for a's, and P's for the conjugate momenta a. In this new
representation, the generalized coordinates are the normal modes themselves.
Note the change effected between the two Hamiltonians (A.2) and (A.4).
Each term in the sum of the new Hamiltonian contains reference only to
variables of the same mode index fc, and so decoupling the equations of
motion. Whereas the motion of one mass point along the string depends upon
the location of its two neighbors (coupled), the motion of each mode (of
the whole string) depends only on itself and its conjugate momentum
(decoupled). Applying the formalism to this Hamiltonian results in the usual
oscillatory motion of the normal modes:
Qk(t) = -Q0cos(ujkt)] Pk(t) = P0sm(ukt).
(A.6)
In this completely integrable, unperturbed system, the normal modes do
not exchange energy. The motion described by equations (A.6) is harmonic:
both the position and the momentum of each mode moves in the familiar
periodic pattern.
A.4. Cyclic Coordinates
If any of the variables Qk are absent from the Hamiltonian in any of its
transformed manifestations, then, in that same manifestation, the specific
conjugate variables Pk must be constants of the motion, and we say that
the Hamiltonian is cyclic in the absent variable. Often called "first integrals

A.5. Liouville Integrability 143
of the motion," these constants result from integrations of equations (A.l).
For example, assume that some Hamiltonian was independent of the
variable Qi, such that H = H(__Q2l • • • , Qn, Pi,- ,Pn )—then the second of
equations (A.l) tells us that
tfi dff n ,
—— = — -rp- = 0 and so Pi = a constant = p\,
at aQi
so that H becomes H(—Q2,... , Qn, /?i»-P2» • • • »Pn )•> where /?i is a first
integral of the motion and so remains conserved at its initial value throughout
the evolution. This is exactly the process by which angular momentum is
conserved in the two-body problem, because that Hamiltonian is
independent of the angular orientation variable.
A.5. Liouville Integrability
When a dynamical system is completely integrable and its motion is fixed to
a determinable trajectory that is the intersection of the surfaces defined by
the n constants of the motion, such as the system of harmonic oscillators
in the unperturbed FPU string, then there is a certain set of canonical
coordinates that are especially useful for representing that system. Notice
that when we transform our Hamiltonian to a form in which it is explicitly
independent of all of the coordinates Qk, then we obtain the n explicitly
defined integrals of the motion. We have H = H(/3i,... , /3n), where H and
all the /3fc's are constant.
From this vantage point, we are in a position to apply the first of
Hamilton's equations (A.l) to every Qk to get
dQk dH
—— = -p-r- = another constant value = Uk,
at opk
which easily integrates to the linear function: Qk = Wk t + Qko- AH the
positions Qk evolve in a linear way, beginning from the initial value Qko- In
effect, the system is solved completely and the motion is absolutely
predictable. This special case is called Liouville integrability. In Hamiltonian
dynamics it is necessary to know only n constants of the motion to
completely specify the evolution exactly—given the 2n initial conditions.
A.6. The Action-Angle Variables
Taking advantage of both the dual sinusoidal motion of the normal string
modes and the Liouville integrability of the coupled harmonic oscillators,
we can transform our system once again to a more elegant geometrical

144 Appendix A. Hamiltonian Dynamics: Language of Abstraction
Elliptical Orbit
Q = -Q0Cos(cot)
P = P0Sin((ot)
a = sqrt(2E)
b = a/co
Circular Orbit
(rectangular)
Q' = Qco
P' = P
r = sqrt(2E)
Circular Orbit
(polar)
I=E/(o
0 = cot + 0O
FIGURE A.l. Geometrical transformations: Looking at a single cross
section of one mode we see: (a) The mode Q and its momentum P both orbit
periodically resulting in an ellipse with fixed semimajor and semiminor
axes a and b. (b) Each such ellipse easily transforms into a circle whose
radius and area are directly related to the constant mode energy, (c) The
circle with fixed radius is converted into polar coordinates, which are the
action angle variables (/, B).

A.6. The Action-Angle Variables 145
representation in which the Hamiltonian is cyclic in all the position
variables. As a result, all of the conjugate momenta become constants of the
motion, and the position variables evolve linearly in time. As can be seen
from equation (A.4), each mode energy is the sum of the kinetic and
potential energies of that mode, represented by the bracketed terms of the
sum. Setting each term equal to a new constant Ek, we have
Ek = \{Pl + ulQl).
Geometrically, this is the equation for an ellipse—a fact which suggests
several quite interesting things. First, that both the positions and the
momenta must be periodic; but equations (A.6) already tell us that. Second,
that when we graph Pk versus Qk, as I have done in Figure A. 1(a), we
obtain a plane ellipse, whose area is directly related to the conserved modal
energy. Using a suitable substitution of variables we can easily transform
an ellipse into a circle, as in Figure A.l(b). Each such circle, one for each
mode, has a fixed radius that is also directly related to the modal energy,
which in turn is a constant of the motion, determined by the initial
conditions of the string. Thus we can see that if all the initial energy of the string
were placed in mode one, as it was in FPU (unperturbed), then only the
k = 1 circle would have a nonzero area. Remember that the phase space of
our system is a 2n-dimensional space, of which a single (Qk, Pfc)-plane is
but a small cross-section; but in this case, it is the only cross-section with
nonzero area.
The system point evolves continuously around this circle of fixed radius.
As we can see from equations (A.6), both Q and P evolve with the angular
frequency a;, defined by (A.5). So if we transform the system once again, to
a polar-coordinate representation, only the angular variable would evolve.
We transform to this set of polar coordinates using the transformation
equations
Qk = -\— cos(0fc), Pk = y/2ukIk sin(0fc). (A.7)
V Wfc
These equations actually transform the ellipses of Figure A. 1(a) directly
into the circles of constant radius J and rotating angular coordinate 6
(Figure A. 1(c)). Looking at the inverse of these transformations (they are
required to be invertible):
/fc = St+iM = ^, 6k = ukt, (A.8)
ZUk Uk
we see that the radius variables Ik are all constants of the motion derived
from the modal energies and that the angle variables 0k are linear
functions of time. Thus the shape of the string, defining the location of every

146 Appendix A. Hamiltonian Dynamics: Language of Abstraction
mass point along it, is easily detefmined for any time t—given the initial
conditions—by a simple calculation of the angle.
The variables Ik are traditionally called actions because they are the
quantities minimized in the principle of least action, which was derived
independently by Maupertuis, Euler, Lagrange, and Hamilton. According
to Whittaker:
The Principle of Least Action originated in Maupertuis'
attempt (Mem. de VAcad., 1744, p. 417) to obtain for the
corpuscular theory of light a theorem analogous to Fermat's 'Principle
of Least Time.' Maupertuis' principle was established by
Euler (Addit. II. p. 309 of his Methodus inveniendi lineas curvas,
1744) for the case of a single particle under a central force, and
by Lagrange (Miscell. Taurin. II. (1760-1), Ouvres, I. p. 365)
for much more general problems. (1937, p. 417)
This variational principle states that of all the integral paths possible for
the system trajectory, the one which minimizes the action integral will be
the physical one. Hamilton derived his canonical equations from a similar
principle. It is easily shown that the variable J results exactly from the
action integral.
The transformation to action-angle variables is canonical because it
retains the structure of the canonical equations (A.l). Substituting the
transformations (A.7) into the Hamiltonian (A.4), we obtain the new
Hamiltonian
n
ff(/) = 5>fcJfc, (A.9)
which is completely independent of the angle variables 6k> Now we apply
the canonical equations of motion to this Hamiltonian to obtain
and so Ik are all constants, and then clearly,
-77- = 77T = u)k (a constant), (A.ll)
at o±k
which gives us finally,
OkW^Wkt + Oko. (A.12)
The new Hamiltonian (A.9) is a constant function of the action variables
only, which in turn (A. 10), are themselves constants of the motion.
Furthermore, the angle variables Ok are linear functions of time (A.12). The

A.7. Dynamics on a Torus 147
motion is represented by a point rotating around a circle of fixed radius
with a constant angular frequency. However, the motion is confined to a
plane circle only if n = 1 or if only a single mode is excited initially. In
a full treatment, we must be concerned with cases for which n > 1. So
now we move to representations of the motion in the action-angle variables
when n > 1.
A.7. Dynamics on a Torus
The higher-dimensional analogue of this circular action-angle
representation is motion on the surface of a torus. For visualization purposes we will
consider the motion on a 2-torus, which is a four-dimensional surface
(manifold) embedded within a three-dimensional space. Consider the example
of two coupled, linear, harmonic oscillators. This system has two degrees of
freedom (n = 2), which demands representation in a four-dimensional phase
space. Because the energy is conserved, the energy surface is a (2n — 1 = 3)-
dimensional manifold. Upon transforming the model to action-angle
variables, we obtain a system of two actions, two angles, and the Hamilto-
nian: H = uil\ + ^h- The actions and their associated angles (#i,#2)
are the coordinates of a 2-torus, which I have represented graphically in
Figure A.2—where arbitrarily I have arranged the coordinates such that
Ii > I2. Because the actions are constant, the evolution of the system
point is constrained to the surface of this torus and is described by the
evolution of the angles according to (A. 12). These angles evolve linearly
in time with normal-mode frequencies (o;i, 0/2). The energy surface is the
three-dimensional Euclidean space in which we have embedded this torus.
Clearly, because the system trajectory is confined to this torus, it is not
free to wander through the entire energy space and so the system is not
ergodic.
The 2-torus is the highest-dimensional torus that most people can
visualize, a fact that might present a problem when studying systems with
n > 2. But, fortunately, the 2-torus is sufficient to represent most of the
properties of the higher-dimensional analogues. The big step is going from
the circle to the 2-torus. Even though our model is many dimensions more
than 2, we will be using the 2-torus for all visualizations.
Although we originally derived the action-angle variables from the
normal modes, all quantities now must be defined as functions of the actions
and angles. The Hamiltonian is a function of the actions only, as per
(A.9). The total energy of the system is the sum of the modal energies
(Ek = IkWk), that are constants determined by the initial values of /&.
Because the actions are constant radii of the n-torus, once we set their
values, we have determined the n-torus completely. The angles play no role
in determining either the energy or the characteristics of the torus: every

148 Appendix A. Hamiltonian Dynamics: Language of Abstraction
FIGURE A.2. In the 2-torus coordinate system, each of the two action
radii shown is associated with an independent angle variable (not shown).
trajectory on the n-torus will have the same energy as any other one on the
same torus. Fixing the initial values of the angles 0k determines or selects
a particular trajectory from the infinite number of possible trajectories, all
with the same energy. Because the actions are constructed from both the
positions and momenta of the normal modes—which are themselves
composites of the coupled positions and momenta of the mass points along the
string—we do need to fix the initial values of both the positions and the
momenta of the modes to obtain the initial value of the actions.
A.8. Commensurability: Two Types of Motion
As mentioned earlier, a trajectory in phase space can be either an open
curve (aperiodic) or a closed curve (periodic). Although this polarity still
holds on the surface of the torus, we obtain something new in this
representation. For our system of coupled harmonic oscillators, the trajectories
must remain on the fixed surface of the torus; yet the trajectory can never
cross itself. Furthermore, in any single-mode cross-section of the torus, the
motion is circular and periodic in that mode. But even though every mode
is periodic in its own right, the whole trajectory need not be. If the
trajectory on the surface of the torus repeats itself, then it forms a closed
curve on that surface. On the other hand, it is possible for the trajectory
to wind around the torus in such a way that it never intercepts itself, and

A.8. Commensurability: Two Types of Motion 149
so never visits the same point twice. This second kind of trajectory must
necessarily cover the entire surface of the torus, given an infinite amount of
time; whereas the periodic trajectory can never cover the entire surface, in
any amount of time. This second kind of trajectory is called conditionally
periodic] it is periodic in every mode, but aperiodic overall. The distinction
between these two types of trajectories plays a significant role in both the
small denominators problem of perturbation theory and in the development
of the KAM theorem.
Whether or not a trajectory is periodic or conditionally periodic depends
upon the relationship among the angular frequencies. In the case of the 2-
torus, if the ratio of the two frequencies is rational, then the trajectory
will return to the same point eventually. Each frequency Uk is the number
of complete revolutions that the corresponding angle Ok makes in one unit
of time. First, let us consider the case of n = 2, where, for convenience,
u)\ > U2- If the ratio of the two frequencies ujxJuj^ is a rational number,
then:
— = — where j and m are integers, (A. 13)
rau>i — ju)2 — 0. (A. 14)
The period of mode two is 2n/uj2, which, by assumption, is the longer of
the two periods. Whenever mode two completes j periods of its motion,
mode one will have completed exactly m periods of its motion, and so the
overall trajectory will come back to its initial point.
In general, the motion of a system of n oscillators will be periodic only
if the condition,
n
]Trafca;fc=0, (A. 15)
holds for some set of integers {m^}, in which the integers may be
positive, negative, or zero, with the exception of all zeros. This condition is
the straightforward generalization of the ratio of frequencies above. The
frequencies in this case are said to be linearly dependent. If this condition
is met, the frequencies are said to be with one another.
Figures A.3 and A.4 represent several combinations of frequencies and the
resulting trajectory on the 2-torus. In Figure A.3(a), the toroidal frequency
is zero, so the orbit will not go around the torus; it simply makes a plane
circular path in the poloidal cross-section. I have included the slinky-like
three-quarter torus in the figure only to indicate the potential shape of the
full torus; it should not be confused with the darker line representing the
orbit. In Figure A.3(b), the situation is reversed, there is no motion in the
poloidal direction; the orbit is again a cross-sectional circle, but now its the
half-bagel cross-section. These are the simplest cases. In Figure A.4(a), as
the orbit makes one complete period about the poloidal direction, it makes

150 Appendix A. Hamiltonian Dynamics: Language of Abstraction
FIGURE A.3. Single orbits on tori, (a) One periodic orbit forming a circle
about the poloidal cross-section; and (b) one periodic orbit forming a
circle about the toroidal cross-section.

A.8. Commensurability: Two Types of Motion 151
FIGURE A.4. Multiple orbits on tori, (a) Once around the poloidal
direction sees nine cycles about the torus, (b) While still periodic, this orbit
in graphical representation begins to fill the surface of the torus as would
one that is conditionally periodic.

152 Appendix A. Hamiltonian Dynamics: Language of Abstraction
a full nine toroidal cycles. In Figure A.4(b), although the ratio of the two
frequencies is still rational, the orbit is well on its way toward covering
the surface of the torus. When the ratio of frequencies is irrational, the
orthogonal components of the motion never complete whole numbers of
revolutions at the same time as all the other components and the trajectory
never meets itself. Thus an open curve results, which must fill the surface
of the torus completely as t —► oo.
A.9. Digital Representation
In 1857, Jules Lissajous demonstrated that if one were to plot the sinusoidal
components of two different frequencies on the horizontal and vertical axes
of a Cartesian coordinate system, then the resulting curve as a function of
time would be either closed or open, depending on the commensurability of
the two frequencies. In each of the torus frames in the figure, the
frequencies are commensurate and the resulting curves are closed. Therefore they
cannot completely cover the torus. But only in theory could the curve in
these plots remain open. I generated these plots on a digital computer, and
thus the frequencies could only be defined to some finite precision. They
can be only rational numbers, and as such, they could never generate an
open curve on these graphs. But, in Figure A.4(b), the ratio consists of two
not extremely large prime numbers—large in the sense that if they were
much larger the resulting curve would cover the torus simply because of
line-width, or, equivalently, the resolution of the graphical device. Whereas
the resulting curve seems quite dense visually, it is far from an open curve.
This is an important issue which both helps and hinders the usefulness of
simulation. Open trajectories may never actually be generated on a digital
computer; but, if the ratio of frequencies is sufficiently close to an irrational
number within the precision of the machine, then the resulting curve will
appear in every way to be an open one. If we are looking for open
trajectories, then we must be careful to make use of this fact and define the
"closeness" criteria properly.
A. 10. Physical Reality and the Continuum
The continuum of possible concentric tori of our model is structured the
same as the continuum of the real line. Between each two irrational numbers
there lies a rational and vice versa. Therefore between each two
conditionally periodic tori, there lies a periodic torus. But, just as on the line,
the measure of the set of all periodic tori is negligible as compared to
the conditionally periodic tori, because of the countability of the rationals.
Therefore most of the tori are conditionally periodic. But, if we attempt

A. 11. Perturbing the String 153
to represent these tori on a digital computer, then, in theory, we may only
represent the periodic ones—leaving the majority out. However, the same
situation holds when determining the initial conditions of our dynamical
system. We must measure the initial state of our system using physical
measuring devices which, like the digital computer, have a finite precision.
Thus we can only have rational initial conditions for real physical systems,
so it seems reasonable and desirable to only simulate those. We will soon
see that when we add the nonintegrable perturbation to our model, the
resulting dynamical behavior depends quite specifically on whether our
particular unperturbed torus is periodic or conditionally periodic. It would
seem to be a problem that we can represent only the periodic tori on the
computer. But this problem has been overcome in part with techniques
provided by Henri Poincare and George Birkhoff before digital computing
even came into existence.
A. 11. Perturbing the String
We begin with the original Hamiltonian with a perturbation term, cubic
in the spring extension, which corresponds to the quadratic force term in
FPU:
ff=5EW + («-«-i)2] +e£^-<fc-i)3. (A.16)
I have substituted the small parameter e in lieu of the symbol a, used in
FPU, to avoid confusion between it and the winding number. The first
transformation we made on the original equations was to decouple them
using the point transformation to normal modes (A.3). But when we apply
that transformation rigorously to the perturbed Hamiltonian, we find that
the equations no longer decouple. In the most general sense we obtain the
following terms:
H = J2\l(Pk+"iQl)} +eJ2AakQiQJQk. (A.17)
k=l L
ijk
The perturbation term, even under transformation, continues to couple the
different modes together. The triple sum provides links between the normal
modes, Q&. The term A^ represents various combinations of constant
coefficients that depend harmonically on the indices. If we place all the
initial energy in one mode, then we should expect that energy to diffuse
to the other modes via the coupling terms—limited in diffusion rate only
by the relatively small size of the parameter e, which generally is kept
less than one. Because of the coupling terms, the mode energies are no

154 Appendix A. Hamiltonian Dynamics: Language of Abstraction
longer constants of the motion. We can see this fact more clearly once we
transform our system over to the action-angle variables.
The transformation (A.7) to action-angle variables that we used
previously is still canonical, but the new Hamiltonian is quite a bit more complex
than was the pleasingly simple form of (A.9), p. 146:
H(I, 9) = VWfc +e^AijkJ^^coB(0i)co8(ej)cos(ek)9
t^i 7j£ V UiUJ^k
(A.18)
or, functionally,
H(I,0) = Ho{I) + eH1(I,0) where c < 1. (A.19)
This Hamiltonian is no longer independent of the angle variables 0^, and
so the actions I are no longer constants of the motion. Further, the angles
no longer evolve in the linear motion described by (A. 12). The resulting
motion is extremely complicated. If the FPU model does satisfy the KAM
theorem, then, for small values of the parameter e, the phase-space torus
for this system might resemble those shown in Chapter 3.

Glossary
Commensurate. In the normal-mode representation of a system of
oscillators, each mode has a frequency. The question of interest concerning
these frequencies is whether or not they form a linearly independent
set—that is, whether Yl7=i m*u* ~ 0 or n°t> f°r arbitrary values of ra^.
If this condition is true for any set of integers {m^}, then the frequencies
are said to be commensurate and if not, they are said to be
incommensurate. Commensurate frequencies are associated with resonance:
the oscillator frequencies are multiples of one another and so they are
more likely to share energy. In the action-angle representation of the
system of linear oscillators, if the frequencies are commensurate, then the
orbit will be periodic overall, forming a closed curve in phase space. If
the frequencies are incommensurate, then the orbit will be
conditionally periodic, forming an open curve that will fill the surface of a torus
completely over an infinite period of time. See the Appendix for details
concerning the torus representation of the unperturbed string, and then
see Chapter 3 for details of commensurate frequencies in the perturbed
Hamiltonian and KAM.
Conjugate Variables. In Hamiltonian mechanics, each state variable,
such as position, has a special counterpart called its conjugate
momentum. Together these pairs of variables are called canonical conjugates
or conjugate variables. They are intimately related to one another
through the antisymmetrical form of Hamilton's equations of motion:
dq_dB_ dp _ OH
dt dp dt dq '
where H is the Hamiltonian and (#, p) is the pair of conjugate variables.
These equations give the evolution in time of each of the two variables
in terms of its conjugate. The most common pairs of conjugate variables

156 Glossary
are: position-momentum [(x, x)\ (g, p)], angle-action [(0, J), (</>, J)], and
time-energy [(£, H)].
Deterministic Chaos. Stemming far back into our cultural memory,
the Greek word chaos means utter confusion and disorder, the dark void
out of which the Universe formed. Until we began to separate the ideas
of whether something was ordered and whether we could percieve that
order, chaos was disorder. Now that we are faced with the very real
post-enlightenment situation—that there can be determinism without
determinability—we assign the term chaotic to mean without order, or
cause, and distinguish it from random, which means that we cannot
perceive any order. Furthermore, we now speak of chaotic dynamical
systems (chaos in the popular sense), which are by first principles
completely deterministic; yet we may not be able to predict future
configurations, because of various physical constraints, such as finite computer
memory, or finite accuracy in measurement. The single most important
fact about chaotic dynamical systems is that they can exhibit
simultaneously features of predictability and randomness. Deterministic chaos
is when trajectories move so erratically that we are unable to simulate
their future states.
Ergodicity. At the extreme opposite end of the predictability spectrum
from systems with known integrals of the motion is the condition called
ergodicity, in which only the energy integral exists and the energy
surface is freely accessible to all trajectories. Strictly speaking, a system is
ergodic only if all trajectories are free to reach the entire phase space; but
when we consider energy-conserving systems, we say that any arbitrary
trajectory will eventually pass through an arbitrarily small neighborhood
surrounding any arbitrary point on the energy surface. A consequence of
ergodicity, which is often used as an alternate definition, is that any
function of the phase space variables will have a time-averaged value (over
infinite times) which is equal to the space average (over all phase space).
Therefore conservative systems can never be truly ergodic by the strict
definition, because the energy surface restricts the system's accessibility.
But we speak of the condition "ergodic on the energy surface," or on any
confining surface, as a restricted form of ergodicity. In the present work,
ergodicity will always mean ergodic on the energy surface.
Any well-defined set of connected initial conditions defines an area or
volume in phase space. During the evolution, ergodicity requires that this
area evolve all about phase space as a bundle of trajectories. Ergodicity
does not require that it change shape very much—that is, it may retain
its essential structural character throughout the evolution. However,
although Liouville's theorem guarantees the conservation of the volume of
the region under consideration, its shape may vary drastically (Goldstein
1980, p. 426ff). One possibility for an ergodic flow is that the volume does
not retain its essential shape, but it deforms into very long attenuated

Glossary 157
strands that wind around and through the entire phase space. This
structural change that goes beyond simple ergodicity is called "mixing" and
it is the next level up in the hierarchy of stochasticity. After a long
period of evolutionary time, any arbitrarily small sample of the phase space
will contain some of the initially defined set of points (initial conditions),
which is not true in general of an ergodic flow. Obviously, mixing implies
ergodicity but not vice versa.
It was not until 1963, when Sinai proved that a system of hard spheres
possessed the property of mixing (mixing entails ergodicity) that
anyone had proved ergodicity for a Hamiltonian system. And there is still
some incredulity about the completeness of this proof. See Ford (1974),
Arnold and Avez (1968), or Sinai (1963).
Exponential Separation. Henon and Heiles discovered the very
important property of the exponential separation of pair orbits. They found
this behavior to be a good way to map the phase space of the
stochastic region as they tried to determine where there were island chains and
where there were ergodic trajectories. The ergodic trajectories were easily
determined because they would separate from one another exponentially,
whereas the trajectories on the island chains would separate only linearly.
This property later became important as the most reliable indicator of
deterministic chaos.
Hamiltonian. There is an interlocked set of terms connected by the
name of William Rowan Hamilton. Hamiltonian system refers to a
dynamical system in which the quantity represented by the Hamiltonian
function is conserved. The Hamiltonian is usually a statement of the total
energy of the system, but this consequence depends upon the coordinate
system used for the representation and whether or not time is represented
explicitly in the problem. If the Hamiltonian contains velocity-dependent
potentials (such as the classical magnetic field), then it is not the total
energy. Usually a Hamiltonian may be transformed into one that does
not contain these potentials, and so does not depend explicitly on the
time variable. In this way, the Hamiltonian becomes a statement of the
conserved energy and defines the constant energy surface.
Hamiltonian dynamics usually refers to the formalism of
conservative dynamics in which the second-order differential equations of motion,
originally derived by Lagrange, are converted into pairs of first-order
differential equations. The representations are equivalent, but the
Hamiltonian method lends itself to representation in phase space and sets up an
elegant system of paired canonical variables; position and momentum
are set on equal footing in Hamiltonian dynamics. Using this
formalism, Hamilton was able to show the equivalency between the principle
of least time in optics and the principle of least action in dynamics. The
Hamiltonian formalism was later adapted as the basis dynamical theory
of quantum mechanics.

158 Glossary
Homoclinic Tangle. This term functions as a touchstone or keyword
in our work, taking us well out of our depth in terms of topological
dynamics. The adjective homoclinic is an English variation on the term
homoclinous, used by Poincare. It refers to the "feeding back into
itself" behavior of a certain unstable manifold (homoclinic orbit), which
we can associate with the separatrix in a dynamical system. Under
certain very interesting conditions, the separatrix structure of a dynamical
system can demand the existence of an infinite sequence of these points
in a bounded region of phase space. The arrangement of these points
forms the famous homoclinic tangle discovered by Poincare in the
final chapter of the final volume of his magnum opus (1890). It is the
infinitely complicated lattice that he would not even begin to draw.
Because these orbits are unstable, they tend to repel trajectories from them,
and in an infinitely complex web of these orbits, the trajectory can
behave in a wildly erratic manner. This behavior makes it impossible to
track such a trajectory in simulation, and so this web is one source of
deterministic chaos. In a perturbed Hamiltonian dynamical system
that is subject to the KAM theorem, the regions bounded between the
mini-tori and the preserved conditionally periodic tori are filled with the
homoclinic tangle. A brief but interesting explanation of the homoclinic
tangle can be found in Appendix 1 of Ekeland (1988).
Integrable. When the set of differential equations of a particular model
can be integrated analytically, the model is said to be integrable.
Integrable models possess a complete set of integrals of the motion, and
therefore cannot be ergodic. The integrals of the motion lock the
trajectories into determinable orbits, so there can be no chaos. Although related
to nonlinearity, nonintegrable is the more significant characteristic of a
dynamical model. Nonintegrability usually is a result of nonlinear terms
in the equations of the model, but nonlinearity itself is not sufficient to
guarantee nonintegrability. Although it is true that numerical integration
is truly a form of integration, we can numerically integrate equations that
are nonintegrable (analytically). Analytical integration means using
the calculus to obtain a closed-form solution to the motion for any future
time.
If we can find a canonical transformation that eliminates all position
coordinates from the statement of the Hamiltonian, and so make it a
function of only momenta, then we know that those momenta will all
be constants of the motion. In effect, the system will be completely
solved and the motion will be completely predictable. This special
situation is called Liouville integrability. In Hamiltonian mechanics it is
necessary to know only n constants of the motion to specify completely
the evolution exactly, even though there are 2n initial conditions.
Integral of the Motion. If the Hamiltonian function for the system
is independent of any state variables, then the associated conjugate vari-

Glossary 159
able may be integrated directly to obtain a constant of the motion,
whose value is determined completely by the initial conditions. The term
constant of the motion means that one of the n momenta are
conserved all during the motion. In the associated phase space, each integral
of the motion constrains one dimension of the system trajectory to a
fixed surface, just like a point constrained to move around the surface of
a circle with a constant radius. A Hamiltonian dynamical system with n
degrees of freedom has a 2ra-dimensional phase space and up to 2n — 1
integrals of the motion. If all possible integrals of the motion exist, they
fix the system's trajectory to a well-defined curve—the intersection of
all the surfaces of constraint. These constants are often known as first
integrals of the motion, because they each result from a single
analytical integration of Hamilton's equations.
If an integral of the motion exists, then it is implicate in the model—
it does not matter if we are able to actually perform the integration or
not. When we simulate the motion of a dynamical system, we may
discover the existence of a surface of constraint in the behavior of the
trajectories and so discover the existence of a corresponding integral of
the motion.
Mapping. This noun arises in topology from the action "to map an
element of a set into another set." A mathematical function is a special
case of a mapping, in that a number x is mapped into another number y.
Both numbers are of the same set, the real numbers, and so we can speak
of mapping a set into itself. A simulated trajectory is also a mapping, in
that the dynamical equations map the system state from one point into
the next. The equations of motion provide the formula for mapping phase
space into itself. The term mapping is used to distinguish this topic from
cartography, the making of maps, where a map represents a territory and
mapping is a verb. A mapping is the set of equations for making the
transformation between one set and another. One of the central topics
of discussion in topology is the directional properties of mappings and
the structure of the sets on either end. A surface of section is often
generated using a mapping; instead of taking a single trajectory step
by step, we can take a large region or bundle of trajectories step by
step. Mapping finds itself in good company with simulation, because
the discrete approximation procedure of numerical integration may be
considered to be a form of mapping.
Mode. Sinusoidal standing wave patterns that fit exactly n waves on the
string between the ends, an infinite series of these modes with decreasing
amplitude and increasing frequency can be summed to model the shape
of any continuous string. A finite subset of the full infinite series can
be used to approximate any disturbance up to any degree of accuracy
necessary. Often called either normal modes or Fourier modes. The

160 Glossary
process of composing or decomposing a function into the series of modes
is known as Fourier analysis.
Nonlinear. There are two distinct uses of this adjective in
mathematics. Functionally nonlinear is when an algebraic equation has terms
that are not linear—that is, they express a nonproportional relationship,
such as y = x2. This is the sense implied when we say that the Hamil-
tonian has nonlinear terms, such as all the perturbation terms in the
FPU simulation. This is also what we mean when we speak of
nonlinear dynamics. In this same sense, linear thinking means to make
predictions based on a directly proportional relationship. On the other
hand, differential equations are said to be nonlinear when two solutions
to the same equation or set of equations, when added together, is not a
solution to that same equation or set of equations.
Phase Space. Phase space is a special subset of state space, wherein
each pair of canonically conjugate variables (position and momentum)
occupies one linked pair of dimensions in the space. Thus the phase space
of a dynamical system with n degrees of freedom, having n position
variables, must have n pairs of dimensions. A phase plane is a special cross-
section of phase space representing a single pair of conjugate variables.
The geometry of phase space is called symplectic geometry.
Simulation. Simulations provide us with a way to analyze the
properties of models without resorting to the very difficult techniques of
perturbation theory. "Simulation in dynamics," means the entire process of
numerically integrating a set of differential equations on a computer and
displaying graphically for analysis the sequential solution states as an
ordered set of points in a phase-space portrait. As these points unfold, we
implicitly understand the successive dots to represent the evolution of a
trajectory, or, in the case of FPU, the evolution of the normal-mode
energy distribution. By observing the behavior of this unfolding trajectory,
we infer properties of the model.
State Space. A mathematical space spanned by Cartesian coordinates,
state space may have any number of dimensions; but one point (set
of coordinates) in state space must represent a single unique complete
configuration of a dynamical system at a single point in time. The n-
dimensional coordinates of a single point is a collection of numbers that
holds one possible combination of the values of the n system variables.
Any trajectory is a path through state space that defines the evolution
of the system's configuration of variables.
Stochasticity. This term has a lot of ambiguity about it. Many workers
use stochastic to mean "that a trajectory wanders more or less randomly
over part or most of the energy surface." This definition is the same as
some aspects of ergodic on the energy surface; however, the terms

Glossary 161
are not interchangeable because ergodic has some very technical and
specific requirements. The behavior defined above also seems to describe
what later came to be known as deterministic chaos, but again there
is a problem with making a strict substitution, bacause the latter term
can be quite localized, such as being locked between two KAM curves.
Deterministic chaos does not mean that a trajectory is free to wander
the entire energy surface.
The following definition seems less restrictive, yet useful when dealing
with this term: a sequential and usually discrete process is
stochastic when it involves a random selection (movement or decision) at each
step in the sequence. Stochastic processes can result in a totally
random sequence—such as white noise or random walk—or they can see
spontaneous order emerge from the sequence. The deterministic chaos
of dynamics is often called stochastic because simulation is a stepwise
process that introduces randomness at each step—the true solution of
the equations of motion are changing direction too rapidly for the size of
the time step. Ergodic orbits appear to be stochastic in simulation.
Structural Stability. The question of whether different models
within a particular model type may be expected to exhibit similar
behavior falls under the category of structural stability, an area of increasing
importance in working with dynamical models. Abraham (1967) states
the problem as follows: "If a dynamical system X has a known phase
portrait P, and is then perturbed to a slightly different system X' (for
example, changing the coefficients in its differential equation slightly),
then is the new phase portrait P' close to P in some topological sense?
This problem has an obvious importance, as in practice the qualitative
information obtained for P is to be applied not to X, but to some nearby
system X', because the coefficients of the equation are to be determined
experimentally, and therefore, approximately." One extensive approach
to formalizing the science of structural stability has been expounded by
Thorn (1975).
Surface of Section. Usually associated with the name Poincare, who
invented them, the surface of section is a Cartesian plane—a very
special two-dimensional cross-section of phase space, in which a state
variable, such as position, and its conjugate momentum, define the
two axes. Phase space is a 2n-dimensional space, and so there are n
possible phase planes or surfaces of section. However, there is no specific
requirement that a cross-section must be a phase plane; but, because of the
special pairwise relationship between canonically conjugate variables,
it is usually beneficial to select them this way. Surfaces of section lend
themselves to very nicely to simulation, which needs a useful graphical
interface with human experimenters.

162 Glossary
This diagram shows what to expect from a surface of
section in action-angle variables (on a torus).
Thermal Equilibrium. Usually interpreted as a consequence of the
second law of thermodynamics, many dynamical systems (usually
particles) tend to move spontaneously toward thermal equilibrium, the state
in which all particles have, statistically, the same amount of energy. For
the discrete string with all of its energy initially configured in a single
mode, it was expected that, as the string vibrated, its energy would
disperse to all the modes, because the modes were connected by nonlinear
terms added to the linear model. This dispersion tends to spread the
energy out among all the modes until they all have equal amounts of
energy. In the discrete string, the number of modes is finite, so it was
expected that this equipartition of energy would occur fairly quickly.

References
Abraham, R. (1967), Foundations of Mechanics, The Mathematical Physics
Monograph Series, 1st edn, W.A. Benjamin, New York.
Arnold, V. I. (1961), 'Generation of almost periodic motion from a family
of periodic motions', Soviet Math. Dokl. 2, 501-503.
Arnold, V. I. (1962), 'On the behavior of an adiabatic invariant under slow
periodic variation of the Hamiltonian', Soviet Math. Dokl. 3, 136-140.
Arnold, V. I. (1963a), 'Proof of a theorem of Kolmogorov on the invariance
of quasi-periodic motions under small perturbations', Russian Math.
Surveys 18(5), 9-36.
Arnold, V. I. (1963b), 'Small denominators and problems of stability of
motion in classical and celestial mechanics', Russian Math. Surveys
18(6), 85-191.
Arnold, V. I. (1978), Mathematical Methods of Classical Mechanics, Vol. 60
of Graduate Texts in Mathematics, Springer-Verlag, New York,
translated by K. Vogtmann and A. Weinstein.
Arnold, V. I. (1983), Geometrical Methods in the Theory of Ordinary
Differential Equations, Springer-Verlag, New York. Translated by Joseph
Sztics.
Arnold, V. I. and Avez, A. (1968), Ergodic Problems of Classical Mechanics,
The Mathematical Physics Monograph Series, W.A. Benjamin, New
York.
Bai-Lin, H. (1984), Chaos, World Scientific, Singapore.
Balescu, R. (1956), Bull. Acad. Roy. Belg. CI. Sci. 42.

164 References
Balescu, R. (1975), Equilibrium and Nonequilibrium Statistical Mechanics,
Wiley Interscience Series, Wiley, New York.
Bateman, H. (1945), Hamilton's work in dynamics and its influence on
modern thought, in 'A Collection of Papers in Memory of Sir William
Rowan Hamilton', Scripta Mathematica, New York, pp. 51-63.
Bateson, G. (1972), Steps to an Ecology of Mind, Chandler, Toronto.
Berry, M. V. (1978), 'Regular and irregular motion', AIP Conf. Proc.
46, 16-120.
Birkhoff, G. D. (1913), 'Proof of Poincare's geometric theorem', Amer.
Math. Soc. Transl. 14, 14-22.
Birkhoff, G. D. (1917), 'Dynamical systems with two degrees of freedom',
Trans. Amer. Math. Soc. 18, 199-300.
Birkhoff, G. D. (1931a), Troof of a recurrence theorem for strongly
transitive systems', Proc. Nat. Acad. Sci. USA 17, 650-655.
Birkhoff, G. D. (19316), Troof of the ergodic theorem', Proc. Nat. Acad.
Sci. USA 17, 656-660.
Birkhoff, G. D. (1966), Dynamical Systems, Vol. IX of American
Mathematical Society Colloquium Publications, 2nd edn, American
Mathematical Society, Providence, RI. First Published 1927.
Bruns, H. (1887), Acta Math. XI, 25.
Capek, M. (1961), The Philosophical Impact of Contemporary Physics, Van
Nostrand, Princeton, NJ.
Capek, M. (1987), The philosophical significance of piaget's researches on
the genesis of the concept of time, in A. Shimony and D. Nails, eds,
'Naturalistic Epistemology, A Symposium of Two Decades', Reidel,
Dordrecht, pp. 91-118. Vol. 100 of the Boston Studies in the
Philosophy of Science series.
Cartwright, N. (1983), How the Laws of Physics Lie, Clarendon Press,
Oxford.
Chirikov, B. V. (1960), 'Resonance processes in magnetic traps', Plasma
Phys. 1, 253-260.
Chirikov, B. V. (1979), 'A universal instability of many-dimensional
oscillator systems', Phys. Rep. 52(5), 263-379.
Chirikov, B. V., Izrailev, F. M. and Tayursky, V. A. (1973), 'Numerical
experiments on the statistical behavior of dynamical systems with a
few degrees of freedom', Comput. Phys. Comm. 5, 11-16.

References 165
Cooper, N. G., ed. (1989), From Cardinals to Chaos: Reflections on the
Life and Legacy of Stanislaw Ulam, Cambridge University Press,
Cambridge.
Dragt and Finn (1976), 'Insolubility of trapped particle motion in a
magnetic dipole field', J. Geophys. Rev. 81, 2327.
Duhem, P. (1954), The Aim and Structure of Physical Theory, Princeton
University Press, Princeton, NJ. Translated by Philip P. Wiener.
Eckhardt, B., Ford, J. and Vivaldi, F. (1984), 'Analytically solvable
dynamical systems which are not integrable', Physica D 13, 339-356.
Ekeland, I. (1988), Mathematics and the Unexpected, University of Chicago
Press, Chicago.
Fermi, E. (1923), Beweis, dass ein mechanisches normalsystem im allge-
meinen quasi-ergodisch ist, in Segre (1965), pp. 79-87. Article
published in Phys. Zeits 24:261-265.
Fermi, E., Pasta, J. and Ulam, S. (1955), Studies of nonlinear problem^,
in Segre (1965), chapter 266, pp. 977-988. Although written in 1954,
this article first appeared in this volume in 1965.
Fomenko, A. (1988), Integrability and Nonintegrability in Geometry and
Mechanics, Mathematics and Its Applications, Kluwer Academic,
Dordrecht, translated by M. V. Tsaplina.
Ford, J. (1961), 'Equipartition of energy for nonlinear systems',
J. Math. Phys. 2(3), 387-393.
Ford, J. (1974), The statistical mechanics of classical analytic dynamics, in
E. G. D. Cohen, ed., 'Fundamental Problems in Statistical
Mechanics III', North-Holland, Amsterdam, pp. 215-255. Proceedings of the
International Summer School on Fundamental Problems in Statistical
Mechanics III.
Ford, J. (1992), 'The fermi-Pasta-Ulam problem: Paradox turns discovery',
Phys. Rep. 213(5), 271-310.
Ford, J. and Lunsford, G. (1970), 'Stochastic behavior of resonant nearly
linear oscillator systems in the limit of zero nonlinear coupling', Phys.
Rev. A 1(1), 59-70.
Ford, J., Stoddard, D. and Turner, J. (1973), 'On the integrability of the
Toda lattice', Prog. Theoret. Phys. 50(5), 1547-1560.
Ford, J. and Waters, J. (1963), 'Computer studies of energy sharing and er-
godicity for nonlinear oscillator systems', J. Math. Phys. 4(10), 1293-
1306.

166 References
Franklin, A. (1986), The Negleet of Experiment, Cambridge University
Press, Cambridge.
Franklin, A. (1990), Experiment, Right or Wrong, Cambridge University
Press, Cambridge.
Galison, P. (1987), How Experiments End, University of Chicago Press,
Chicago.
Goldstein, H. (1980), Classical Mechanics, 2nd edn, Addison-Wesley,
Reading, MA.
Greene, J. M. (1968), 'Two-dimensional measure-preserving mappings', J.
Math. Phys. 9(5), 760-768.
Greene, J. M. (1979), 'A method for determining a stochastic transition',
J. Math. Phys. 20(6), 1183-1201.
Guckenheimer, J. and Holmes, P. (1986), Nonlinear Oscillations,
Dynamical systems, and Bifurcations of Vector Fields, Vol. 42 of Applied
Mathematical Sciences, 2nd edn, Springer-Verlag, New York.
Gustavson, F. G. (1966), 'On constructing formal integrals of a Hamiltonian
system near an equilibrium point', Astronom. J. 71(8), 670-686.
Haar, D. T. (1954), Elements of Statistical Mechanics, Rinehart, New York.
Contains English translation of Fermi's Ergodic Theorem in the
Appendix.
Hacking, I. (1983), Representing and Intervening, Cambridge University
Press, Cambridge.
Hankins, T. L. (1980), Sir William Rowan Hamilton, Johns Hopkins,
Baltimore.
Hausdorff (1962), Set Theory, Chelsea, New York.
Hemmer, P. (1959), Dynamic and Stochastic Types of Motion in the Linear
Chain, PhD thesis, Trondheim.
Henon, M. (1974), 'Integrals of the Toda lattice', Phys. Rev. B9, 1921.
Henon, M. and Heiles, C. (1964), 'The applicability of the third integral of
motion: Some numerical experiments', Astronom. J. 69(1), 73-79.
Infeld, E. (1981), 'Quantitive theory of the Fermi-Pasta-Ulam recurrence in
the nonlinear Schrodinger equation', Phys. Rev. Lett. 47(10), 717-718.
Izrailev, F. M. and Chirikov, B. V. (1966), 'Statistical properties of a
nonlinear string', Soviet Phys. Dokl. 11(1), 30-32. Translated from Doklady
Akademii Nauk SSSR, Vol. 166, No. 1, 1965.

References 167
Jackson, E. A. (1963a), 'Nonlinear coupled oscillators I: Perturbation
theory; ergodic problem', J. Math. Phys. 4(4), 551-558.
Jackson, E. A. (1963b), 'Nonlinear coupled oscillators II: Comparison of
theory with computer simulations', J. Math. Phys. 4(5), 686-700.
Jackson, E. A., Pasta, J. R. and Waters, J. F. (1968), 'Thermal conductivity
of one-dimensional lattices', J. Comput. Phys. 2, 207-227.
Jeffreys, W. H. (1966), Astronom. J. 71, 306.
Kellert, S. H. (1993), In the Wake of Chaos, University of Chicago Press,
Chicago.
Khinchin, A. I. (1949), Mathematical Foundations of Statistical Mechanics,
Dover, New York, translated by G. Gamow.
Khintchine, A. Y. (1963), Continued Fractions, Noordhoff, Groningen, The
Netherlands. Translated by Peter Wynn.
Kolmogorov, A. N. (1954), Preservation of conditionally periodic
movements with small change in the Hamiltonian function, in Chaos (Bai-
Lin, 1984), pp. 81-86. From Los Alamos Scientific Laboratory
Translation LA-TR-71-67 by Helen Dahlby. Originally: Akad. Nauk. S.S.S.R.,
Doklady 98, 527 (1954).
Kolmogorov, A. N. (1958), 'The general theory of dynamical systems and
classical mechanics', English Translation in Appendix D in Ralph
Abraham, Foundations of Mechanics. Address to the 1957
International Congress of Mathematicians in Russian.
Kroes, P. (1985), Time: Rs Structure and Role in Physical Theories, Vol.
179 of Synthese Library, Reidel, Dordrecht.
Kruskal, M. D. and Zabusky, N. J. (1964), J. Math. Phys. 5, 231.
Kryloff and Bogoliuboff (1947), Introduction to Nonlinear Mechanics,
Edwards Brothers, Ann Arbor, MI.
Landau, L. D. and Lifshitz, E. M. (1976), Mechanics, Vol. 1 of Course
of Theoretical Physics, 3rd edn, Pergamon Press, Oxford. Translated
J. B. Sykes and J. S. Bell.
Lichtenberg, A. J. and Lieberman, M. A. (1983), Regular and Stochastic
Motion, Vol. 38 of Applied Mathematical Sciences, Springer-Verlag,
New York.
Lucretius (1965), On the Nature of the Universe, Ungar, New York. Verse
translation by James H. Mantinband, (circa 65 B.C.E.).

168 References
Lunsford, G. and Ford, J. (1972), 'On the stability of periodic orbits for
nonlinear oscillator systems in regions exhibiting stochastic behavior',
J. Math. Phys. 13(5), 700-705.
MacKay, R. S. and Meiss, J. D., eds (1987), Hamiltonian Dynamical
Systems, Adam Hilger, Bristol.
Marion, J. B. (1970), Classical Dynamics of Particles and Systems, 2nd
edn, Academic Press, New York.
Melnikov, V. K. (1963), 'On the stability of the center for time periodic
perturbations', Trans. Moscow Math. Soc. 12, 1-57. This is a translation
of Dokl. Akad. Nauk. SSSR, 144 (1962) and 148 (1963).
Moser, J. (1962), 'On invariant curves of area-preserving mappings of an
annulus', Nachr. Akad. Wiss. Gottingen Math. Phys. Kl. 2, 1-20.
Moser, J. (1973), Stable and Random Motions in Dynamical Systems,
Vol. 77 of Annals of Mathematics Studies, Princeton University Press,
Princeton, NJ. The Hermann Weyl Lectures at The Institute for
Advanced Study.
Nicolis, G. and Prigogine, I. (1989), Exploring Complexity, W. H. Freeman,
New York.
Peterson, K. (1983), Ergodic Theory, Vol. 2 of Cambridge Studies in
Advanced Mathematics, Cambridge University Press, Cambridge.
Poincare, H. (1890), 'Sur les equations de la dynamique et le probleme de
trois corps', Acta Math. 13, 1-270.
Poincare, H. (1899), Les Methodes Nouvelles de la Mecanique Celeste,Vo\.
I—III, Gauthier-Villars, Paris. The three volumes separately: 1890,
1892, 1899.
Poincare, H. (1913), The Foundations of Science: Science and
Hypothesis, The Value of Science, and Science and Method, Vol. 1 of Science
and Education, Science Press, New York, translated by George Bruce
Halsted.
Poincare, H. (1963), Mathematics and Science: Last Essays, Dover, New
York, translated by John W. Bolduc.
Prigogine, I. (1980), Prom Being to Becoming: Time and Complexity in the
Physical Sciences, W. H. Freeman, San Francisco.
Prigogine, I. and Stengers, I. (1984), Order Out of Chaos, Bantam, New
York.

References 169
Rannou, F. (1973), 'Numerical study of discrete plane area-preserving
mappings', Astronom. Astrophys. 31, 289-301.
Rayleigh, J. W. S. (1945), The Theory of Sound, 2nd edn, Dover, New
York. Originally published in 1894.
Reichenbach, H. (1958), The Philosophy of Space and Time, Dover, New
York, translated by Maria Reichenbach and John Freund, Originally
published in 1927 in German as Philosophic der Raum Zeit-Lehre.
Sagdeev, R. Z., Usikov, D. A. and Zaslavsky, G. M. (1988), Nonlinear
Physics: From the Pendulum to Turbulence and Chaos, Vol. IV of
Contemporary Consepts in Physics, Harwood Academic, Chur. Translated
by Igor R. Sagdeev.
Saito, N. and Hirooka, H. (1967), 'Computer studies of ergodicity in
coupled oscillators with anharmonic interaction', J. Phys. Soc. Japan
23(2), 167-171.
Schmidt, G. and Bialek, J. (1982), 'Fractal diagrams for hamiltonian
stochasticity', Physica D 5, 397-104.
Scott, A. C., Chu, F. Y. F. and McLaughlin, D. W. (1973), 'The soliton:
A new concept in applied science', Proc. IEEE 61(10), 1443-1483.
Segre, E., ed. (1965), Collected Papers of Enrico Fermi, Vol. 2, University
of Chicago Press, Chicago.
Siegel, C. L. (1942), 'Iteration of analytic functions', Ann. of Math.
43(4), 607-612.
Siegel, C. L. (1956), Vorlesungen iiber Himmelsmechanik, Springer-Verlag,
Berlin.
Sinai, Y. G. (1963), 'A system of hard spheres is ergodic and mixing', Soviet
Math. Dokl. 4, 1818. See also Russian Math. Surv. 25, 137 (1970).
Stewart, I. (1989), Does God Play Dice?, Blackwell, Oxford.
Synge, J. L. (1945), The life and early work of Sir William Rowan Hamilton,
in 'A Collection of Papers in Memory of Sir William Rowan Hamilton',
Scripta Mathematica, New York, pp. 13-24.
Tabor, M. (1989), Chaos and Integrability in Nonlinear Dynamics, Wiley,
New York.
Thorn, R. (1975), Structural Stability and Morphogenesis, Benjamin,
Reading, MA.

170 References
Toda, M. (1967), 'Vibration of a chain with nonlinear interaction',
J. Phys. Soc. Japan 22(2), 431-436.
Tuck, J. L. and Menzel, M. T. (1972), 'The superperiod of the nonlinear
weighted string (FPU) problem', Adv. in Math. 9, 399-407.
Turner, F. (1978), Poiesis: Time and artistic discourse, in J. T. Fraser,
ed., 'The Study of Time III', Springer-Verlag, New York, pp. 614-633.
Proceedings of the Third Conference of the International Society for
the Study of Time.
Ulam, S. M. (1960), A Collection of Mathematical Problems, Vol. 8 of In-
terscience Tracts in Pure and Applied Mathematics, Interscience, New
York.
Ulam, S. M. (1976), Adventures of a Mathematician, Charles Scribner's
Sons, New York.
van Frassen, B. C. (1980), The Scientific Image, Clarendon Press, Oxford.
Walker, G. H. and Ford, J. (1969), 'Amplitude instability and ergodic
behavior for conservative nonlinear oscillator systems', Phys. Rev.
188(1), 416-432.
Waters, J. and Ford, J. (1966), 'A method of solution for resonant nonlinear
coupled oscillator systems', J. Math. Phys. 7(2), 399-403.
Whittaker, E. T. (1937), A Treatise on the Analytical Dynamics of Particles
and Rigid Bodies, 4th edn, Cambridge University Press, Cambridge.
The first edition was published in 1904.
Wiggins, S. (1988), Global Bifurcations and Chaos, Vol. 73 of Applied
Mathematical Sciences, Springer-Verlag, New York.
Zabusky, N. J. (1962), 'Exact solution for the vibrations of a nonlinear
continuous model string', J. Math. Phys. 3(5), 1028-1039.
Zabusky, N. J. (1963), Phenomena associated with the oscillations of a
nonlinear model string: The problem of Fermi, Pasta, and Ulam, in
S. Drobot and P. A. Viebrock, eds, 'Mathematical Models in Physical
Science', Conference at the University of Notre Dame, 1962, Prentice-
Hall, Englewood Cliffs, NJ, pp. 99-133.
Zabusky, N. J. (1967), A synergetic approach to problems of nonlinear
dispersive wave propagation and interaction, in W. Ames, ed., 'Nonlinear
Partial Differential Equations: A Symposium on Methods of Solution',
University of Delaware, Academic Press, New York, pp. 223-258.
Zabusky, N. J. (1971), Phys, Rev. Lett. 27, 1774.

References 171
Zabusky, N. J. (1973), 'Solitons and energy transport in nonlinear lattices',
Comput. Phys. Comm. 5, 1-10.
Zabusky, N. J. and Deem, G. (1967), 'Dynamics of nonlinear lattices:
Localized optical excitations, acoustic radiation, and strong nonlinear
behavior', J. Comp. Phys. 2, 126-153.
Zabusky, N. J. and Kruskal, M. D. (1965), 'Interaction of "solitons" in a
collisionless plasma and the recurrence of initial states', Phys. Rev.
Lett. 15(6), 240-243.

Index
action-angle variables 53, 56,
70, 88, 146, 147
amplitude instability 93
Arnold, V.I. 53
asymptotic series 58, 60, 64
Campbell, D. 30
canonical
conjugates 14
coordinates 143
equations 146
transformations 58, 62, 79,
141
variables 52, 140
chaos, see deterministic chaos
Chirikov, B.V. 85
closed curve 148
commensurate 59, 152, 155
commensurate frequencies, see
frequencies
conditionally periodic 26, 54,
59, 61, 65, 66, 68, 70, 71, 75,
88, 93, 97, 117, 149, 152
conjugate
momentum 142
variables 140, 142, 155
constant of the motion, see
integral of the motion
continuous
limit 44, 45
string 21, 43
trajectory 140
control parameter 108
critical parameter 88
cyclic, see Hamiltonian, cyclic
Deem, G. 89
degree of freedom 93, 94, 140,
147
deterministic chaos 16, 43, 71,
115, 123, 156
dispersion 19
divisors, small 34, 57, 59, 62, 71
dynamical systems theory 26,
105
efficacy 116, 121, 132, 133
energy
dissociation 99
sharing 25
surface 13, 52, 55, 73, 99, 147
ensembles 11
epistemology 113, 131
equations
of dynamics 57
of motion 53, 140
equipartition 23, 41, 89

174 Index
ergodic hypothesis 110, 117,
125, 134
ergodicity 13, 14, 35, 41, 56, 75,
93, 99, 116, 156
experimental mathematics 26
exponential separation 97, 100,
114, 121, 132, 133, 157
Fermi
Enrico 9, 27
theorem 55, 111
fixed point 98, 140
Ford, J. 27,33
Fourier mode, see mode
FPU
approximation 18
asKAM 68
change of variables 19
chaos in 25
configuration 21
Hamiltonian 19, 54, 57, 99
initial conditions 22
lattice 74, 99
local behavior 21
model 56, 130
parameters 18, 21
quadratic terms 22
research program 32, 72, 101
retrodicting 4
simulation 11, 21, 57, 106,
110, 130
time step 21
unperturbed model 18
fractal 68, 98
frequencies
commensurate 54, 66, 94, 149
linearly-dependent 34, 149
normal mode 19
resonant 59
unperturbed 94
generalized coordinates 139,
141, 142
Gustavson, F.G. 79
Hamilton, William Rowan 52
Hamiltonian 157
cyclic 143
dynamics 13, 33, 139, 143
function 105, 140
system 92
unperturbed 17, 141
harmonic
motion of the string 142
oscillators 17, 56, 58, 73, 78,
143
Henon and Heiles 72-81, 100
heuristic 131
homoclinic
orbits 66, 88
point 60, 77
tangle 94, 98, 99, 158
Hooke's law 17
information entropy 116
initial conditions 70, 143, 145
integrable 10, 13, 55, 58, 71,
158
integral of the motion 13, 21,
35, 42, 52, 54-57, 72, 74, 94,
110, 118, 135, 142, 143, 158
islands of order 93, 97, 118, 137
isolating 11, 23
iteration 21, 97
Izrailev, F.M. 85
Jackson, E.A. 37
KAM
behavior 99
curves 98, 99, 108
explanation of FPU 71
instabilities 93
mini-tori 66, 67-71, 77, 88
stability 86, 94, 100, 109
theorem 4, 30, 41, 57, 64, 65,
92, 111, 133
tori 68, 72, 77, 88, 93, 98
Kolmogorov, A.N. 52, 61
Korteweg-de Vries 46, 99
Kruskal, M.D. 43

Index 175
Lagrange, J. 51, 139
level curves 95
Liouville
integrability 143
Joseph 52
Lissajous, Jules 152
Lunsford, G.H. 93
mapping 159
measure theory 94
mechanics
celestial 55, 57
classical 60, 102
Hamiltonian 58
Newtonian 102
statistical 12, 94, 98, 109
Menzel, M. 27
mode 159
acoustic 89
energy 20, 89, 145
Fourier 10
normal 10, 19, 21, 89, 141,
147
optical 86, 89
models in dynamics 107
Moser, J. 53
Newton's method of tangents 64
noise in a simulation 115
nondegeneracy condition 57, 78
nonlinear 160
normal mode, see mode
numerical methods 100
open curve 148
oscillatory motion 142
overlap criterion, see resonance
overlap
partial differential equations 45
Pasta, J. 9
periodic
boundary conditions 47
motion 148
orbits 77, 116
tori 33
trajectory 141
periodicity 21, 25
perturbation 92
parameter 108
theory 18, 32, 37, 52, 55, 57,
58, 59, 60, 71, 93, 124, 130
phase space 11, 14, 21, 52, 75,
98, 140, 160
Poincare
Henri 10, 52, 92
recurrence 38
theorem 52, 55-57, 110
precision 152
pursuit 125
quasi-periodic 26, 28, 86
rational numbers 59, 68
resolution, graphical 152
resonance
condition 41, 62, 93
orbits 66
overlap 85, 94, 95, 97, 109
primary 42
road map 2
round-off error 29, 116, 127
separatrix 88
series expansion 18, 99, 110
simulation 71, 106, 160
in FPU, see FPU
Tuck and Menzel 27
sinusoidal motion 143
soliton 4, 30, 46, 91
state space 160
stochasticity 11, 12, 86, 89, 94,
95, 97, 98, 109, 160
Stoddard, D. 100
structural stability 108, 161
substitution of variables 145
super-period 28, 32, 70
surface of section 75, 77, 93, 97,
108, 116, 161
symplectic geometry 140
system trajectory 147

176 Index
thermal equilibrium 11, 110,
162
three-body problem 17, 54, 82
time average 25
Toda lattice 74, 78, 82, 99, 101,
108, 128, 136
topological
methods 52
transformations 66
tori
conditionally-periodic 33
dynamics on 147
preserved 71
trajectory
periodic 141
stochastic 72
system 147
true solution 105, 112, 124, 132
Tuck, J.L. 27
Turner, J.S. 100
two-body problem 14, 51
Ulam, S. 9
variables
canonical, see canonical
variables
conjugate, see conjugate
variables
variational principle 146
Walker, G. 92
Waters, J. 41
wave
equation 44
nonlinear 46
standing 21
Zabusky, N.J. 43