G. Ludwig
Foundations
of Quantum Mechanics I
Springer-Verlag
New York Heidelberg Berlin
Translated by Carl A. Hein
G. Ludwig
Institut für Theoretische Physik
Universität Marburg
Renthof 7
Federal Republic of Germany
Carl A. Hein (Translator)
Formerly with
Massachusetts Institute of Technology
Lincoln Laboratory
Lexington, MA
U.S.A.
Editors
Wolf Beiglböck
Institut für Angewandte Mathematik
Universität Heidelberg
Im Neuenheimer Feld 5
D-6900 Heidelberg 1
Federal Republic of Germany
Elliott H. Lieb
Department of Physics
Joseph Henry Laboratories
Princeton University
Princeton, NJ 08540
U.S.A.
Tullio Regge
Istituto di Fisica Teorica
Università di Torino
C.so M. d’Azeglio, 46
10125 Torino
Italy
Walter Thirring
Institut für Theoretische Physik
der Universität Wien
Boltzmanngasse 5
A-1090 Wien
Austria
Library of Congress Cataloging in Publication Data
Ludwig, Günther, 1918-
Foundations of quantum mechanics.
(Texts and monographs in physics)
Translation of: Die Grundlagen der
Quantenmechanik.
Bibliography: p.
Includes index.
1. Quantum theory. I. Title. II. Series.
QC174.12.L8313 1982 530.1'2 82-10437
ISBN 0-387-11683-4 (v. 1)
Original German edition: Die Grundlagen der
Quantenmechanik. Berlin-Heidelberg-New York: Springer-Verlag, 1954.
© 1983 by Springer-Verlag New York, Inc.
All rights reserved. No part of this book may be translated or reproduced in any form
without written permission from Springer-Verlag, 175 Fifth Avenue, New York, New
York 10010, U.S.A.
Typeset by Composition House Ltd., Salisbury, England.
Printed and bound by R. R. Donnelley & Sons, Harrisonburg, VA.
Printed in the United States of America.
987654321
ISBN 0-387-11683-4 Springer-Verlag New York Heidelberg Berlin
ISBN 3-540-11683-4 Springer-Verlag Berlin Heidelberg New York
Dedicated to my wife
Preface
This book is the first volume of a two-volume work on the Foundations of
Quantum Mechanics, and is intended as a new edition of the author’s book
Die Grundlagen der Quantenmechanik [37] which was published in 1954.
In this two-volume work we will seek to obtain an improved formulation of
the interpretation of quantum mechanics based on experiments. The second
volume will appear shortly.
Since the publication of [37] there have been several attempts to develop
a basis for quantum mechanics which is, in large part, based upon the
work of J. von Neumann [38]. In particular, we mention the books of G. W.
Mackey [39], J. Jauch [40], C. Piron [41], M. Drieschner [9], and the
original work of S. P. Gudder [42], D. J. Foulis and C. H. Randall [43], and
N. Zierler [44]. Here we do not seek to compare these different formulations
of the foundations of quantum mechanics. We refer interested readers to
[45] for such comparisons.
In this book we shall seek only to develop a well-defined formulation for
the foundations of quantum mechanics and to examine the implications of
such a formulation towards the most important applications of quantum
mechanics in a consistent manner. This formulation will be based only on
the objective, that is, the so-called classical mode of description of the
apparatuses. In this respect this book represents a systematic mathematical,
as well as conceptual, formulation of the original viewpoint of N. Bohr in
which it is assumed that it is necessary to use the classical mode of description
in order to describe the measurement process in quantum mechanics (see,
for example, the extensive discussion in M. Jammer and E. Scheibe [14]).
In our approach to developing a formulation of the foundations of
quantum mechanics we shall not present a precise mathematical description
of the macroscopic measurement apparatus. Instead, we shall only assume
that there exists an objective characterization of the mode of operation of
the apparatus. In [13] we have shown that it is possible to derive quantum
mechanics without making reference to microsystems, by using only the
description of macrosystems in terms of state spaces. There the reader will
also find a derivation of the Hilbert space structure from general laws concerning
the interactions of macrosystems. In this book we will make use of
these results in III, §3 without proof.
At several places in this book the reader will find references to the book
Grundstrukturen einer Physikalischen Theorie [1]. Previous knowledge of
[1] is not necessary for an understanding of this book. Readers who are
familiar with [1] can easily recognize how the general structure of a physical
theory is realized for the case of quantum mechanics. Readers who wish to
study [1] later will have the advantage that they will have an example to
illustrate the general description in [1].
The formulation of quantum mechanics presented here is the last step
of developments since 1964 which the interested reader can find in [48].
(In [48] there is also some previous work which has led to the results described
here.) In a certain sense, this presentation together with [1] and [13]
represents a greatly improved “second edition” of [17].
In Appendices I-V we have provided a summary of important mathematical
results which may be unfamiliar to some readers. Appendix V will
appear in Volume II. For readers who are unfamiliar with the mathematical
results in Appendix V, we suggest that they “take them on faith” until
Volume II appears.
References in the text are made as follows: For references to other sections
of the same chapter, we shall only list the section number of the reference,
for example, §5.3. For references to other chapters, the chapter is also given;
for example, IV, §7.2 refers to Chapter IV, Section 7.2. The formulas are
numbered as follows: (5.7.10) refers to the 10th formula in Section 5.7 of the
current chapter. References to formulas in other chapters are given, for
example, by IV, (5.7.10). References to the Appendix are given by AIV, §2,
where AIV denotes Appendix IV.
I would like to express my deep gratitude to Mr. Carl A. Hein for the
difficult job of translating the manuscript from German to English. He had
the difficult assignment of finding suitable English language expressions for
the somewhat new and sometimes difficult concepts and ideas used in a new
conceptual framework for quantum mechanics. This was possible only
because of his deep understanding of the text. I would also like to thank him
for his patience in accommodating my wishes and a substantial number of
revisions in the German text while the translation was under way.
I hope that the present book together with [13] will lead to further interest
and research into the foundations of quantum mechanics, especially in the
direction of a relativistic theory for quantum mechanics, that is, relativistic
quantum field theory (see [46]).
Marburg, January 1982
G. Ludwig
Contents
CHAPTER I
The Problem: An Axiomatic Basis for Quantum Mechanics
1 The Axiomatic Formulation of a Physical Theory
2 The Fundamental Domain for Quantum Mechanics
3 The Measurement Problem
CHAPTER II
Microsystems, Preparation, and Registration Procedures
1 The Concept of a Physical Object
2 Selection Procedures
3 Statistical Selection Procedures
4 Physical Systems
4.1 Preparation Procedures
4.2 Registration Procedures
4.3 The Dependence of Registration upon Preparation
4.4 The Concept of a Physical System
4.5 The Structure of Probability Fields for Physical Systems
CHAPTER III
Ensembles and Effects
1 Combinations of Preparation and Registration Methods
2 Mixtures and Decompositions of Ensembles and Effects
3 General Laws: Preparation and Registration of Microsystems
4 Properties and Pseudoproperties
4.1 Properties and Physical Objects
4.2 Pseudoproperties
5 Ensembles and Effects in Quantum Mechanics
6 Decision Effects and Faces of K
CHAPTER IV
Coexistent Effects and Coexistent Decompositions
1 Coexistent Effects and Observables
1.1 Coexistent Registrations
1.2 Coexistent Effects
1.3 Commensurable Decision Effects
1.4 Observables
2 Structures in the Class of Observables
2.1 The Spaces $\mathcal{B}(\Sigma)$ and $\mathcal{B}'(\Sigma)$
2.2 Mixture Morphisms Corresponding to an Observable
2.3 The Kernel of an Observable; Mixture of Effects for an Observable
2.4 Mixtures and Decompositions of Observables
2.5 Measurement Scales for Observables
3 Coexistent and Complementary Observables
4 Realizations of Observables
5 Coexistent Decompositions of Ensembles
6 Complementary Decompositions of Ensembles
7 Realizations of Decompositions
8 Objective Properties and Pseudoproperties of Microsystems
8.1 Objective Properties of Microsystems and Superselection Rules
8.2 Pseudoproperties of Microsystems
8.3 Logic of Decision Effects?
CHAPTER V
Transformations of Registration and Preparation Procedures. Transformations of Effects and Ensembles
1 Morphisms for Selection Procedures
2 Morphisms of Statistical Selection Procedures
3 Morphisms of Preparation and Registration Procedures
4 Morphisms of Ensembles and Effects
4.1 Morphisms of Ensembles
4.2 Morphisms of Effects
4.3 Coexistent Operations and Coexistent Effects Morphisms
5 Isomorphisms and Automorphisms of Ensembles and Effects
CHAPTER VI
Representation of Groups by Means of Effect Automorphisms and Mixture Automorphisms
1 Homomorphic Maps of a Group $\mathcal{G}$ in the Group $\mathcal{A}$ of J^-continuous Effect Automorphisms
1.1 Generation of a Representation of $\mathcal{G}$ in $\mathcal{A}$ by Means of a Representation of $\mathcal{G}$ by $\Gamma$-Automorphisms
1.2 Some General Properties of a Representation of $\mathcal{G}$ in $\mathcal{A}$
1.3 Topologies on the Group $\mathcal{A}$
1.4 The Representation of $\mathcal{G}$ in Phase Space $\Gamma$
2 The $\mathcal{G}$-invariant Structure Corresponding to a Group Representation
3 Properties of Representations of $\mathcal{G}$ which are Dependent on the Special Structure of st{m) in Quantum Mechanics
3.1 The Topological Structure of the Group stm
3.2 The Topological Properties of a Representation of $\mathcal{G}$
3.3 Unitary and Anti-unitary Representations Up to a Factor
CHAPTER VII
The Galileo Group
1 The Galileo Group as a Set of Transformations of Registration Procedures Relative to Preparation Procedures
2 Irreducible Representations of the Galileo Group and Their Physical Meaning
3 Irreducible Representations of the Rotation Group
4 Position and Momentum Observables
5 Energy and Angular Momentum Observables
6 Time Observable?
7 Spatial Reflections (Parity Transformations)
8 The Problem of the Space $\mathcal{D}$ for Elementary Systems
9 The Problem of Differentiability
CHAPTER VIII
Composite Systems
1 Registrations and Effects of the Inner Structure
2 Composite Systems Consisting of Two Different Elementary Systems
3 Composite Systems Consisting of Two Identical Elementary Systems
4 Composite Systems Consisting of Electrons and Atomic Nuclei
5 The Hamiltonian Operator
6 Microsystems in External Fields
7 Criticism of the Description of Interaction in Quantum Mechanics and the Problem of the Space $\mathcal{D}$
APPENDIX I
Summary of Lattice Theory
1 Definition of a Lattice
2 Orthomodularity
3 Boolean Rings
4 Set Lattices
APPENDIX II
Remarks about Topological and Uniform Structures
1 Topological Spaces
2 Uniform Spaces
3 Baire Spaces
4 Connectedness
APPENDIX III
Banach Spaces
1 Linear Vector Spaces
2 Normed Vector Spaces and Banach Spaces
3 The Dual Space for a Banach Space
4 Weak Topologies
5 Linear Maps of Banach Spaces
6 Ordered Vector Spaces
APPENDIX IV
Operators in Hilbert Space
1 The Hilbert Space Structure Type
2 Orthogonal Systems and Closed Subspaces
3 The Banach Space of Bounded Operators
4 Bounded Linear Forms
5 The Banach Space
6 Projection Operators
7 Isometric and Unitary Operators
8 Spectral Representation of Self-adjoint and Unitary Operators
9 The Spectrum of Compact Self-adjoint Operators
10 Spectral Representation of Unbounded Self-adjoint Operators
11 The Trace as a Bilinear Form
12 Gleason’s Theorem
13 Isomorphisms and Anti-isomorphisms
14 Products of Hilbert Spaces
15 The Spaces Я{Ж19 ...) and ...)
References
List of Frequently Used Symbols
List of Axioms
Index
CHAPTER I
The Problem:
An Axiomatic Basis for Quantum Mechanics
The historical path of discovery for a new physical theory is, for the most
part, a complicated one. At first new concepts are tentatively introduced. By
a lengthy process, involving trial, error, insight, and revision, these concepts
are modified and become more clearly defined and familiar. As an understanding
of the postulated structure of the theory develops, it is possible by
careful application of the new concepts to learn how to avoid error and to
develop an interpretation of the new theory. Such has been the case for
quantum mechanics. In this book we shall not present a heuristic path to
quantum mechanics as a means of developing a theory of electrons, atoms,
... (in general: microsystems). Instead, we shall assume that the reader has
already had extensive contact with quantum mechanics, and has studied one
or more of the elementary texts. If this is not the case, we recommend that the
reader either study such a text before reading this book, or use one of the
elementary texts in conjunction with this book. In this way, the reader will
discover the vagueness inherent in the usual fundamental concepts which are
used to formulate quantum mechanics. For the reader who seeks an
elementary text which considers some of the problems to be discussed in this
book, we recommend Volume 3 of [2].
In the following chapters we shall attempt to clarify the fundamental
concepts of quantum mechanics and present a thorough and systematic
axiomatic formulation of quantum mechanics.
1 The Axiomatic Formulation of a Physical Theory
We cannot consider the general structure of a physical theory in detail in this
book. We refer readers who are interested in such questions to [1]. Here we
shall only describe in broad terms what we mean by an axiomatic basis of a
physical theory, and what we seek to accomplish when we formulate an
axiomatic basis for a physical theory.
A physical theory (abbreviated $\mathcal{PT}$) consists of three essential parts: a
mathematical theory ($\mathcal{MT}$), a domain of reality ($\mathcal{W}$), which we seek to describe
by $\mathcal{MT}$, and a set of mapping principles ($\mathcal{MAP}$), where the latter is needed in
order to describe the relationship between $\mathcal{W}$ and $\mathcal{MT}$. The mapping
principles also describe what is frequently called the interpretation of $\mathcal{MT}$. It
is here that we encounter the difficult problems in quantum mechanics.
Concepts such as observable, state, property, and object are ill-suited to serve
as a conceptual basis for an interpretation of quantum mechanics.
(1) We expect that the difficulties of interpretation of a $\mathcal{PT}$ should be
minimized when an axiomatic basis for $\mathcal{PT}$ is found.
A $\mathcal{MT}$ as part of a $\mathcal{PT}$ can, in principle, be arbitrarily defined. In the
historical evolution of physics, the mathematical theories seldom appear in
the form of an axiomatic basis. This situation frequently leads to conflicting
views concerning the interpretation of the theory. In particular there may be
disputes about a given structure in $\mathcal{MT}$. Is it accidental, thereby having
nothing to do with the real structure of the world as described by $\mathcal{W}$? In [1]
it is shown how it is possible to thoroughly treat such problems if $\mathcal{MT}$ is an
axiomatic basis. Here we shall only provide a brief exposition of the principle
of an axiomatic basis in order to apply it to quantum mechanics in the
following chapters. By doing so we shall give an explicit formulation of the
mapping principles (see II). In this manner the application of the principle of
an axiomatic basis can be understood without the more general and detailed
analysis of [1].
(2) The introduction of new physical concepts in the domain of a $\mathcal{PT}$
may be carried out in a simple and transparent manner when the $\mathcal{MT}$
is an axiomatic basis.
If $\mathcal{MT}$ is an axiomatic basis, then we can introduce new physical concepts
for which the physical meaning will naturally follow from the mapping
principles $\mathcal{MAP}$ and mathematical constructions in $\mathcal{MT}$. Again, it is not
necessary to study the general principles presented in [1] because it is not
difficult to understand the concrete derivation of the new physical concepts
(such as ensemble, effect, observable, position observable, etc.) which are
presented in the following chapters.
The reader who is familiar with the “stories” which are usually told in
order to explain (in a rough manner) the physical meaning of these concepts
will appreciate the conceptual clarity which results from the construction of a
$\mathcal{PT}$ with $\mathcal{MT}$ as an axiomatic basis.
What do we mean by the expression “axiomatic basis”? First it is essential
that this expression describe certain aspects of the form of $\mathcal{MT}$ itself. Thus it
is essential that the $\mathcal{MT}$ be studied in its own right, that is, detached from its
relationship to physics. Thus we postulate that $\mathcal{MT}$ should take the form of
what Bourbaki [4] calls a “theory of species of structure” $\Sigma$ and denotes by
$\mathcal{T}_\Sigma$. Since we are studying both physical and mathematical theories, we prefer
to write $\mathcal{MT}_\Sigma$ instead of $\mathcal{T}_\Sigma$. Thus an axiomatic basis should be of the form
$\mathcal{MT}_\Sigma$, where $\Sigma$ denotes the appropriate species of structure. The form of
$\mathcal{MT}_\Sigma$ within a $\mathcal{PT}$ is somewhat more specialized than that introduced by
Bourbaki in [4]. The specialization consists in the fact that, for the auxiliary
base sets of $\Sigma$, we shall only use the set of real numbers R and for $\mathcal{MT}$ we
require only the use of set theory.
It is not necessary for the reader to study either [4] or [1] because it is not
difficult to exhibit the nature of a $\mathcal{MT}_\Sigma$ and because in II (supplemented by
VII, §1 and axioms introduced later) we will explicitly construct the mathematical
description corresponding to quantum mechanics.
In $\mathcal{MT}_\Sigma$ we shall assume the usual formulations of mathematical logic and
set theory, and the usual formulation of the real number system R. Then we
shall introduce what we shall call base sets, for which no internal structure is
specified in advance. We obtain an internal structure for the base sets by
introducing relations on the sets, and by postulating axioms for them. This is
how we shall introduce the structure for the $\mathcal{MT}_\Sigma$ corresponding to quantum
mechanics beginning in II and continuing in VII, §1. A simple example for a
$\mathcal{MT}_\Sigma$ is a group, that is, a set with a relation (called multiplication) which
satisfies the axioms for a group. In addition to the requirement that $\mathcal{MT}$ be
of the form $\mathcal{MT}_\Sigma$ we must also make requirements concerning the relation
between the mapping principles $\mathcal{MAP}$ and $\mathcal{MT}_\Sigma$.
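The group example can be made concrete with a small sketch. It is an illustration only and not part of the formal development: the base set is taken to be finite, the multiplication is represented as a set of triples (a, b, c) read as "a times b equals c", and the function name is_group and the two sample relations are hypothetical names chosen here. The axioms are then checked directly on the undefined base set and the relation, which is the sense in which a structure is imposed on a set that has no internal structure in advance.

```python
from itertools import product


def is_group(base_set, relation):
    """Return True if `relation`, a set of triples (a, b, c) read as a*b = c,
    satisfies the group axioms on the finite `base_set`."""
    elems = list(base_set)

    # The relation must assign exactly one product in the base set to each pair.
    table = {}
    for a, b, c in relation:
        if a not in elems or b not in elems or c not in elems:
            return False
        if (a, b) in table:
            return False
        table[(a, b)] = c
    if len(table) != len(elems) ** 2:
        return False

    # Axiom: associativity, (a*b)*c = a*(b*c).
    for a, b, c in product(elems, repeat=3):
        if table[(table[(a, b)], c)] != table[(a, table[(b, c)])]:
            return False

    # Axiom: existence of a two-sided identity element.
    identities = [e for e in elems
                  if all(table[(e, a)] == a and table[(a, e)] == a for a in elems)]
    if not identities:
        return False
    e = identities[0]

    # Axiom: every element has a two-sided inverse.
    return all(any(table[(a, b)] == e and table[(b, a)] == e for b in elems)
               for a in elems)


# Illustrative base set and relations (assumptions made for this sketch).
base = range(4)
addition_mod_4 = {(a, b, (a + b) % 4) for a in base for b in base}
multiplication_mod_4 = {(a, b, (a * b) % 4) for a in base for b in base}

print(is_group(base, addition_mod_4))
print(is_group(base, multiplication_mod_4))
```

The same base set carries a group structure under the first relation but not under the second, since 0 has no inverse under multiplication modulo 4. The structure therefore resides in the relation and its axioms, not in the elements of the set.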
We shall use the concept of mapping principles instead of the concept of
interpretation because we require a neutral term, one which does not
evoke differing preconceived notions among various readers. We note
that inherent in the concept of mapping principles is that something is being
mapped. We shall only require that what is being mapped must be expressed
in terms of experiment and experience without the need for the application of
the new theory. We may find that we do need older theories, that is, theories
which are already known to express what is being mapped. We may illustrate
this requirement with the aid of a familiar theory. For example, it is possible
to specify the position of the planets without requiring the use of Newton’s
mechanics and Newton’s law of gravitation. The position of the planets at
different times provides the experimental material which can be compared to
Newton’s theory, that is, which is to be mapped into the mathematical
framework $\mathcal{MT}$ of Newton’s theory. We shall use the expression fundamental
domain $\mathcal{G}$ to denote those facts that can be specified in advance of a particular
$\mathcal{PT}$ and which are mapped into the mathematical framework of $\mathcal{MT}$.
In §2 we shall seek to describe the fundamental domain for the case of
quantum mechanics. The fundamental domain consists of all that can be
determined before the application of the theory. The fundamental domain
will later be expanded with the help of the theory to the domain of reality $\mathcal{W}$,
where the latter includes all real aspects of the world which can be described
by the theory.
The mapping principles are rules by which the elements of $\mathcal{G}$, that is, the
stated facts, may be translated into the language of $\mathcal{MT}_\Sigma$. We obtain an
axiomatic basis if and only if the translation of the determined facts (as
elements of $\mathcal{G}$) into the language of $\mathcal{MT}_\Sigma$ makes use only of the undefined
base sets and the undefined relations of $\Sigma$. The axioms required for $\mathcal{MT}_\Sigma$ are
not deduced from experiment, but are guessed from experiment by a process
involving trial and error, intuition, and insight. It is important to keep this in
mind in the following chapters (see also [1], §5). The axioms used in $\mathcal{MT}_\Sigma$
should not be derived from philosophical a priori principles (for example,
from forms of pure sensible intuition) believing that these principles and
forms are necessary to develop physics. Here we note that some authors [9]
take an a priori viewpoint concerning the fundamental structure of quantum
mechanics.
By an axiomatic basis we mean a realization of $\mathcal{MT}_\Sigma$ together with a
set of mapping principles $\mathcal{MAP}$ which have the form described above. It is
obvious that many problems concerning the physical meaning associated
with mathematical structures are easier to solve within the framework of an
axiomatic basis because we permit only sets and relations in $\Sigma$ (except for
certain idealizations) which are interpreted physically by use of the mapping
principles. Again, we refer interested readers to [1].
At this stage we expect that many readers will not find the above formulation
of the general structure of a physical theory clear. On the basis of the
formulation of quantum mechanics which will be presented in the following
chapters, we expect that such readers will appreciate the conceptual clarity of
our axiomatic formulation of quantum mechanics as compared to the usual
one. As we progress we also expect that such readers will obtain a better
understanding of the nature of an axiomatic basis. In this respect, this book
can also be used in order to obtain a better understanding of [1].
2 The Fundamental Domain for Quantum Mechanics
The simplicity we find in the case of classical mechanics results from the
following fact: It is possible to describe the measurement of the position of a
particle (point mass) at various times without making use of the laws of
mechanics. In other words, the formulation of a theory of measurement for a
position-time measurement does not require the use of the laws of mechanics
(see [2], II). Thus, the measured positions of a particle at different times are
appropriate for inclusion in the fundamental domain of classical mechanics.
In quantum mechanics the situation is completely different. Nevertheless,
in the historical development of quantum mechanics we find that there is a
strong tendency to imitate the procedures of classical mechanics, where it is
assumed that we know how it is possible to measure position, momentum,
and many other quantities. Here the term “observable” is used to describe
the quantities to be measured, and the measured values of the observables are
assumed to be the experimental material which is to be compared with the
theory. Unfortunately, it is not possible to state precisely what is meant by
the concept of an observable.
The origin of this difficulty is easily ascertained. As we have seen, in
classical mechanics it is possible to develop a theory of measurement of the
position of the particles at various times without the application of mechanics.
In this way it is possible to give meaning to the concept of the
observable position at various times and their measurements before developing
the theory of classical mechanics. In quantum mechanics this is not
possible, because there is no measurement apparatus for microsystems such
as electrons, atoms, molecules, etc., whose function can be explained without
the use of quantum theory. Thus we find that the concept of observables and
their corresponding measurement values are ill-suited for inclusion into the
fundamental domain for quantum mechanics. The claim that quantum
mechanics is only concerned with what can be measured (that is, the
measurement values obtained from a scale on a measurement apparatus) is
false because we cannot explain what the measurement values represent
without the use of quantum theory.
Here we shall not review the vast body of literature devoted to the theory
of measurement in quantum mechanics. Instead, we shall seek to obtain a
suitable substitute for the concept of an observable (and their associated
measurement values) for inclusion into the fundamental domain of quantum
mechanics.
A second concept—that of a “state”—is often used as an aid in the
interpretation of quantum mechanics. A microsystem is said to be in one of
its possible states. But what do we mean by the notion of a state? In classical
mechanics it is possible to characterize the state of a system by the positions
and velocities of the individual particles of the system at a given time, that is,
by a point in phase space. We shall not describe the various attempts to
develop a quantum mechanical notion of a state because it is clear that such a
notion will make explicit use of the structure of quantum mechanics. Thus we
find that the notion of a state is also ill-suited for inclusion in the
fundamental domain of quantum mechanics.
Thus, if we seek to formulate quantum mechanics in terms of an axiomatic
basis, we have little to begin with other than what an experimental physicist
would call experiments with a single microsystem. The term “single” is a
qualitative designation which is used only to differentiate these experiments
from those that treat a large number of interacting systems as a whole, that is,
a macrosystem composed of a collection of microsystems. The term “microsystem”
is also a qualitative designation which is used to emphasize the fact
that we do not assume that quantum mechanics is a suitable theory for the
description of macrosystems (for example, the earth). Indeed, quantum
mechanics is inadequate for a theoretical description of macrosystems. We
cannot discuss the problem of the relationship between quantum mechanics
and a more comprehensive theory of macrosystems in this book. The reader
will find an introduction to this problem in [2], XV, [5], [7], [13], [27] and
some comments in XVIII.
By experiments with “single” microsystems we do not mean that we
consider only a “single” experiment with a “single” microsystem. Since
statistics plays a central role in quantum mechanics (see II), we must consider
experiments with “large numbers” of microsystems. It is important to
understand that experiments with “large numbers” of microsystems can be
frequently understood in terms of repeated measurements with a single
microsystem. This situation is familiar to every experimental physicist. An
electron beam can be considered to be the result of a multiple process in
which a single electron is “produced” provided that the mutual interaction
between individual electrons can be neglected (see XVI). Even the so-called
ideal gas can be approximately treated as a collection of many single atoms
(see XV, §2) since the mutual interactions of the atoms in an ideal gas are
negligible.
For the fundamental domain of quantum mechanics we shall choose the
class of experiments with individual microsystems, and the relative frequencies
of the phenomena associated with multiple repetitions of these
experiments.
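As a minimal sketch of what the relative frequencies in this choice of fundamental domain amount to, suppose that each repetition of an experiment leaves one macroscopic record of what the apparatus indicated. The outcome labels "click" and "no click", the sample list of records, and the function name relative_frequencies are assumptions made for this illustration. The computation uses nothing but the recorded macroscopic facts, so it requires no quantum mechanics.

```python
from collections import Counter


def relative_frequencies(records):
    """Map each recorded macroscopic outcome to its relative frequency
    over all repetitions of the experiment."""
    if not records:
        raise ValueError("at least one repetition of the experiment is required")
    counts = Counter(records)
    total = len(records)
    return {outcome: n / total for outcome, n in counts.items()}


# Hypothetical records from eight repetitions of one and the same experiment.
records = ["click", "no click", "click", "click",
           "no click", "click", "click", "no click"]

frequencies = relative_frequencies(records)
print(frequencies)
```

For these eight records the outcome "click" occurs five times and "no click" three times, so the relative frequencies are 0.625 and 0.375. What such an indication means as a statement about a microsystem is a question for the theory and is not contained in the records themselves.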
In this book we shall not describe the vast variety of such experiments.
Several examples are briefly described in XI-XVII. In II we shall develop the
general structure of such experiments as the basis for the formulation of
quantum mechanics. In preparation for this task we find it necessary to make
the meaning of the expression “experiments with microsystems” more
precise. We shall begin by describing the structure of such experiments in
more detail.
In order to carry out experiments with individual microsystems it is
necessary to have such systems at hand. Often such microsystems can be
found in nature, for example, in interstellar space. There they are sufficiently
separated so that their mutual interactions can be neglected. In fact, their
mutual interactions will be smaller than what can be produced in many
experiments in the laboratory. On Earth such microsystems must be
produced in the laboratory. Often they may be obtained naturally, as for
example, from the decay of radioactive substances. Sometimes it is necessary
to produce them using a complicated apparatus which is very expensive to
build. Often a rarefied (ideal) gas will be suitable for many purposes. Here we
shall use the generic term preparation procedure to denote the various
methods of obtaining microsystems.
Thus some preparation procedures will require the use of a special
apparatus (a giant accelerator), while others will only require the use of the
sun, which emits such microsystems as light quanta, charged particles (solar
wind) and neutrinos.
We shall now state an important requirement for the development of an
axiomatic basis for quantum mechanics: It must be possible to describe the
structure of the preparation apparatus, and the time-dependent physical
process by which the preparation apparatus operates without the use of
quantum mechanics. In brief, we require that the so-called pre-theories for
quantum mechanics permit the description of the structure and the operation
of the apparatus and the characterization of the preparation procedure (for a
description of the concept of a pre-theory see [1]). In other words we require
that the preparation procedures belong to the fundamental domain of
quantum mechanics.
In order to prevent misconceptions concerning the characterization and
description of a preparation procedure, we find it necessary to give an
example of what is not part of the characterization of a preparation
procedure. If, for the purpose of illustration, electrons are to be produced,
then the specification of the spin of the prepared electron or the description of
the physical process of emission of the electron from, for example, a heated
cathode is not permitted as part of the description of the preparation
procedure. However, all macroscopic processes which take place in the
operation of the preparation apparatus, including those instructions which
can be stored on magnetic tapes and in other memory devices and executed
in sequence by a computer belong to the description of the preparation
procedures.
The concept of a preparation procedure permits the description of
complicated experimental arrangements such as one composed of an accelerator,
a target and a special selection apparatus which selects the desired
microsystem. In addition, it is possible to combine two or more preparation
procedures into a new preparation procedure. Such combinations of preparation
procedures are commonly found in scattering experiments (see
XVI).
At present the formulation of a preparation procedure may appear to be
too general. We find it necessary to impose an additional restriction—that of
reproducibility. The latter notion is related to the relative frequencies of the
various phenomena associated with repetitions of an individual experiment
(see II).
By making the assumption that a microsystem is produced in a preparation
procedure we do not mean that we know, in a particular case, which
preparation procedure was used to produce a particular microsystem. In a
test of a physical theory we require that only known facts are to be mapped by
means of the mapping principles into the mathematical language of the
theory $\mathcal{MT}_\Sigma$.
We shall now consider the following question: How do we use the
prepared microsystems to investigate the structure of microsystems? In the
second and crucial part of such experiments we require the use of an
apparatus which measures the microsystems and their structure. If we wish
to interpret the macroscopic physical processes associated with the second
apparatus as a measurement of the structure of a microsystem, we need to
make use of quantum mechanics. Therefore we permit only the inclusion of
the macroscopic processes associated with such an apparatus as part of the
fundamental domain of quantum mechanics. Would it then be correct to say
that the measured value obtained by a measuring apparatus may be
compared with the predictions of quantum theory? It is correct in that the
measurement values (scale values) are the result of macroscopic processes
associated with the apparatus, and are therefore parts of the fundamental
domain. What is not part of the fundamental domain is the interpretation of
these scale values as a measurement of a property of a microsystem (that is,
the “result” of the measurement of an “observable”).
Since, in our description of the fundamental domain, we cannot say what is
(or was) measured by the apparatus, we find it necessary to introduce the
expression registration apparatus (instead of measuring apparatus) to describe
the second part of the experimental arrangement.
It is in this sense that every experimental arrangement of an experiment
with a single microsystem consists of a preparation apparatus and a
registration apparatus. It is not necessary that the experimental arrangement
be man-made. Indeed, the preparation apparatus can be a star or a galaxy. In
order to prevent misunderstanding, it is necessary to note that the expression
“single microsystem” also applies to what is called a “composite microsystem”
in the sense of VIII. If, for example, we study electron-proton
scattering, the “single” microsystems are electron-proton pairs.
In this book we shall not describe the construction of a typical registration
apparatus. Instead, we shall give a few familiar examples: a scintillation
counter, or array of such counters, a cloud chamber, a bubble chamber, a
spectroscope, a photographic plate, etc. In addition, we give an example of a
registration apparatus which makes use of complicated electromagnetic
fields—such as a mass spectrograph.
It may be asked whether every experimental arrangement for an experiment
with a single microsystem consists of a preparation apparatus and a
registration apparatus. This is indeed the case. However, this does not mean
that it is possible to divide a complicated experiment uniquely into preparation
and registration parts. In Figure 1 we have an experimental arrangement
which consists of three parts. In part (1) we produce the microsystem
a. In part (2) we produce microsystem b as the result of the interaction
of the macroscopic apparatus (2) with a (where b can be the same as a).
The microsystem b is then registered by (3). We may consider (1) as the preparation
apparatus for the microsystem a, and (2) plus (3) as the registration
apparatus for a. We may also consider (1) plus (2) as the preparation apparatus
for system b and (3) as the registration apparatus for system b.
Thus we find that the preparation-registration structure provides a
conceptual basis for experiments with microsystems. It is possible to invent
experiments in which it is not possible to speak of preparation and
registration of microsystems. For example, consider an apparatus having two
parts (1) and (2), and suppose that they interact by exchanging microsystems,
and that the emission of microsystems by (1) is influenced by the microsystems
produced by (2) and vice versa. For such a system it is not possible to
specify which microsystem is the subject of the experiment. Is it the one
which goes from (1) to (2) or the one which goes from (2) to (1)? Here we do
not mean to suggest that a more comprehensive theory (which includes
quantum mechanics as a special case) will be unable to treat such complicated
interaction problems. We only suggest that such experiments are ill-suited
for the immediate goal of formulating an axiomatic basis for quantum
mechanics.
[Figure 1: An experimental arrangement of three parts (1), (2), (3). Part (1) is the preparation apparatus for a, and (2) together with (3) is the registration apparatus for a; alternatively, (1) together with (2) is the preparation apparatus for b, and (3) is the registration apparatus for b.]
By our reference to the possibility that the interaction between the
apparatus (1) and (2) need not be directed, we may be led to question the
assumption of the existence of microsystems, or at least to seek a better
understanding of the concept of a microsystem.
The inclusion of the preparation apparatus and the registration apparatus
(and the associated physical processes which can be described without the
use of quantum mechanics) into the fundamental domain is not affected by
the question of the existence of microsystems. The directedness of the
interaction can be described in terms of the pre-theory of quantum mechanics,
that is, without the need of quantum mechanical theory. Indeed, it
can be described without the need of the concept of a microsystem. We shall
not discuss these matters here; they are discussed in [2], XVI, [3], [6], [7],
and briefly in XVIII. In [13] the axiomatic basis of quantum mechanics is
formulated without the need for the assumption of the existence of microsystems.
There we find that no special assumption is needed in order to describe
the directed interaction between the preparation apparatus and the registration
apparatus by means of a “carrier of interaction.” In II we shall
introduce the fundamental set M of “interaction carriers.” The introduction
of M does not violate our intention to develop an axiomatic basis provided
we do not make additional assertions about the elements of M other than that
they depend on the preparation apparatus, the registration apparatus and
their associated macroscopic processes. The introduction of M will also
permit us to make the formulation of an axiomatic basis of quantum
mechanics given here easier to understand than the one given in [13]. Thus
the reader who has understood the foundations of the theory of microsystems
given here will obtain a better understanding of the formulation presented in
[2], XVI and the more detailed presentation given in [13]. In II we will call
the carriers of interactions (elements of M) “microsystems” even though this
word will denote special carriers of interaction which are characterized by the
axioms given in III, §3 and §5. The introduction of the set M of microsystems
should not be understood as implying that such microsystems exist in every
individual experiment because the “vacuum” can also be considered to be a
“type” of microsystem. Here we do not exclude the possibility that the
preparation apparatus does not always interact with a registration
apparatus.
In summary, we will present the axiomatic basis for quantum mechanics in
this book. The axiomatic basis will be constructed using the fundamental
domain which consists of all those aspects of the preparation and registration
procedures for microsystems which can be described without the use of
quantum mechanics.
3 The Measurement Problem
Have we, by our restriction to the fundamental domain described above,
eliminated the measurement problem which was described in the beginning
of §2? Have we eliminated the problem in such a way that the interaction
between the microsystem and the measurement apparatus can be analyzed
without the need for quantum theory? Of course not. What we have done is
to place the problem where it belongs—namely, with the developing theory.
The mapping principles are no longer burdened with the problem of
providing a theoretical description of either the effect of the microsystem on
the registration apparatus or the dependence of the microsystem on the
structure of the preparation apparatus.
What is the status of such a theoretical description in the arena of quantum
mechanics?
In II and III we shall begin our axiomatic formulation of quantum
mechanics by introducing structural rules governing the preparation and
registration processes. These structural rules will be very general, and will be
analogous to those found in thermostatics. These rules will not specify how
individual cases of preparation and registration are structured, just as the
fundamental structure of thermostatics does not specify the equation of state
for a given substance.
The theory presented in II-III and extended in VII, §1 describes the
preparation and registration of microsystems but is not as complete as we
might wish. Contrary to this wish for completeness, in IV we shall
proceed in the opposite direction. We shall attempt to eliminate, as far as
possible, the preparation and registration process in order to obtain a theory
of the structure of microsystems which is independent of accidental aspects of
the structure of the apparatuses. The extent to which this is possible will be
discussed in IV.
In thermodynamics the equation of state is obtained from experimental
data. By analogy with thermodynamics we will, in the present state of the
theory, take the mode of operation of the special preparation and registration
procedures from experiment, and make use of additional assumptions (see,
for example, XI, §1 and §2 and XVI, §1 and §2). This “taking” of special
structure from experiment is not without considerable cost. As a result, the
current status of the theory is not satisfactory. Thus we shall attempt to
describe the problems of preparation and registration more precisely. We
shall make such a detailed investigation of the problems of preparation and
registration in order to obtain a more comprehensive theory than that which
was developed in II-XVI.
We shall begin these investigations in XVII. There we shall find that
quantum mechanics cannot present a closed theory (more precisely—is not a
g.G.-closed theory in the sense of [1], §8 and §10) of the preparation and
registration process. This is perhaps disappointing. In fact, it merely
demonstrates the fact that quantum mechanics is not a theory which can
describe everything from a microsystem to a macrosystem.
In XVIII we shall analyze the situation in quantum mechanics in its
relationship to other physical theories. Thus, at the end of this book we
return to the problems posed at the beginning.
CHAPTER II
Microsystems, Preparation, and
Registration Procedures
We shall now present a “theoretical” description of experiments with
individual microsystems, a description which is expressed in terms of a
mathematical framework. We shall introduce mathematical entities to
which the individual microsystems and the experimental procedures which
are to be applied to them are to be mapped. We shall use the methodology
which was briefly outlined in I, and is developed in greater detail in [2], II
and in [1]. Here a familiarity with I will suffice in order to understand (at
least in an intuitive way) the relationship between the mathematical theory
and the physical reality it describes.
We shall develop the mathematical theory as systematically as possible. At
the beginning we shall introduce a number of axioms which we shall need in
order to formulate the foundations of quantum mechanics in a transparent
manner. Later in the development of the theory we shall only motivate the
selection of new axioms. Nevertheless the reader who is mathematically
inclined will find it easy to verify that the mathematical theory is of the form
$\mathcal{MT}_\Sigma$ as described in I, §1 and described in greater detail in [1]. The mapping
principles will be presented in a more intuitive manner. The reader who has
read the definitive presentation in [1], and is familiar with the notion of a “concise
formulation” presented there will be able to formulate the mapping principles
in a precise way. Even those readers who are satisfied with obtaining a more
intuitive understanding of the relationship between physics and mathematics
will not find it difficult to understand this “physical interpretation” of
quantum mechanics given in the following chapters because this presentation
is easier to understand than the usual one.
At various places we shall use expressions such as “physical reality” and
deduce new physical concepts while making explicit reference to the precise
formulations and methods of [1], §10 without making explicit applications of
them. We do so in order to keep the size of this book within reasonable
bounds, and not to stray from the intended scope of the book. In the title of
the present chapter we have introduced the term “microsystem” somewhat
prematurely, because we shall not define this concept until the following
chapter. The expression “physical system” or “carrier of interaction” would
be more appropriate in this chapter. However, since we are concerned only
with the applications of the general methods and concepts described here to
this special case, we shall use the expression “microsystems.”
This presentation is closely related to that found in [2], XIII. In fact, an
understanding of the subject matter presented in [2], XI-XIII will greatly
facilitate the understanding of the formulation of quantum mechanics
presented in this book. In this respect, this book is a continuation of
[2], XIII.
1 The Concept of a Physical Object
When we introduce a general concept such as a “physical object” we do not
intend to present an analysis of the meaning of such concepts as they are used
in physics. Instead, we intend to formulate the concept anew, independent of
the fact that the new formulation may not agree with the usual one in all
particulars.
In [1], §10.5 we set forth the requirement that the “new” physical concepts
are to be introduced into the mathematical theory $\mathcal{MT}_\Sigma$ by means of a set
together with a “structure.” Here the term “new” refers only to the definition
of the concept (that is, of the set and the structure). We have to assume that
we already know how to assign physical meaning to the structure terms and
to the elements of the set. How this may be done is illustrated by the concept
of a “physical system” which is defined in §4. For a detailed description of the
method we refer readers to [1], §10.5, §11, and §12.
We shall now consider a set M, the elements of which we wish to call
“physical objects.”
Here, in order to prevent misunderstandings, we warn the reader of the
opinion (see [1], §10.5) that the expression “physical object” is used to
describe all aspects of “physical reality” which are to be mapped to an
element of a set. In mathematics, a set and the elements of a set are often
loosely called “mathematical objects.” Thus, it should not be misleading to
use the expression “physical object” to refer to the physical reality associated
with these mathematical objects. Thus we may call an element of a set M (or
preferably the “physical reality” which is mapped to the element) a physical
object (see [1], §5 and §10.5) if and only if the “physical reality” associated
with the elements of M has (intuitively speaking) objective properties. In the
mathematical framework we shall express the notion of a property in the
following terms:
Let a structure $\mathcal{E}$ be defined on the set M as follows: Let $\mathcal{E} \subset \mathcal{P}(M)$, that is,
$\mathcal{E}$ is a collection of subsets of M. Let $a, b \in \mathcal{E}$; let $M \setminus a$ denote the complement
of a in M. Let $\mathcal{E}$ satisfy the following axioms:
AE 1. If $a \in \mathcal{E}$ then $M \setminus a \in \mathcal{E}$.
AE 2. If $a, b \in \mathcal{E}$ then $a \cap b \in \mathcal{E}$.
Physically, the elements of $\mathcal{E}$ represent definite properties. By this we mean
that the mapping principles must specify what aspects of “physical reality”
are to be mapped to the elements of $\mathcal{E}$ and M. The mapping principles must
also specify what real relationships between an element $x \in M$ and an
element $a \in \mathcal{E}$ are to be identified with the statement “x has the property a.”
The latter statement is mapped to the mathematical relation $x \in a$ where
$x \in M$ and $a \in \mathcal{E}$. In a more general context M and $\mathcal{E}$ may represent (in the
sense of [1], §10.5) sets of real but only indirectly determinable aspects of
“physical reality.” In other words, the statement “x has the property a” may
be only indirectly verified (see [1], §10.5 and §10.6).
The axioms AE 1 and AE 2 have the following intuitive meaning:
AE 1 states that all objects which do not have the property a share a
common property, which we denote by “not a.”
AE 2 states that all objects which have both properties a and b have a
common property, which we denote by $a \cap b$.
It is important to note that these statements do not constitute a proof of
these axioms. A careless reading of these axioms may lead the reader to
conclude that they are merely consequences of logic. Such is not the case;
AE 1 and AE 2 cannot be derived from the logical axioms of mathematics
(see [1], §4.3) and therefore must be asserted as axioms.
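That AE 1 and AE 2 are genuine requirements rather than logical truths can be illustrated on a finite toy model. The following Python sketch is an illustration only and not part of the theory: the four-element set `M`, the property named `red`, and the helper `is_property_structure` are hypothetical choices made for this example. It checks both axioms for a given collection of subsets and shows that an arbitrary collection of subsets need not satisfy them.

```python
def is_property_structure(M, E):
    """Return True if E (a collection of subsets of M) satisfies AE 1 and AE 2."""
    M = frozenset(M)
    E = {frozenset(a) for a in E}
    ae_1 = all(M - a in E for a in E)              # AE 1: closed under complement in M
    ae_2 = all(a & b in E for a in E for b in E)   # AE 2: closed under intersection
    return ae_1 and ae_2


# Hypothetical toy model: four objects and a single property "red".
M = {1, 2, 3, 4}
red = {1, 2}

E_good = [set(), red, M - red, M]   # contains "not red" and all intersections
E_bad = [red, M]                    # "not red" is missing, so AE 1 fails

print(is_property_structure(M, E_good))
print(is_property_structure(M, E_bad))
```

The second collection is a perfectly legitimate family of subsets, yet it violates AE 1. In this finite setting the axioms therefore restrict which families may serve as property structures.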
The concept of a “physical object” which is defined only by the elements of
a set M together with a structure $\mathcal{E}$ characterized by axioms AE 1 and AE 2
is too general. The above concept of a property is also too general. In
addition to describing the object itself, it may also be used to describe a
physical system with respect to its environment. But objective properties
should exhibit an independence of the environment. Therefore we shall find it
necessary to formulate the notion of “independence of the environment” in
terms of the mathematical theory.
It is customary to express this independence of the environment in the
following terms: The properties of a given system are “objective” and they
can be determined by suitable measurements. It is, however, not clear what
we mean by a “suitable measurement.”
We have not yet formulated the concept of “objectivity” (that is, independence
of the environment) in a mathematical way. It is clear that “M
together with $\mathcal{E}$” by itself is not sufficient for this formulation because the
“interaction with the environment” must first be described if we wish to
define the notion of “objective,” (that is, independent of the environment). In
§4 we shall define the notion of a “physical system” and describe the
interaction of the system with the environment. In III, §4.1 we shall continue
this discussion in order to obtain a suitable definition of a physical object.
We shall now proceed as if the term “objective property” has already been
defined in the theory. If $\mathcal{E}$ is a set of “objective” properties, we shall call the
elements of M “physical objects.”
After we have introduced the above definition of the concept of a “physical
object” it is important to put aside all intuition and preconceptions about
physical objects (despite the fact that they were used in order to formulate the
new concept—see [1], §5 and [2], III, §4) in order to obtain a correct
understanding of the new concept. Thus our notion of a “physical object” is
defined in terms of M, axioms AE 1 and AE 2, and a definition of the
notion of “objective” which will be introduced later. It is important to
emphasize the fact that this concept depends not only on the elements of the
set M but also on $\mathcal{E}$. In more precise terms we must speak of “physical
objects with respect to the property structure $\mathcal{E}$.” Such a distinction is not
necessary when the choice of $\mathcal{E}$ is clear and unambiguous.
We note the fact that axioms AE 1 and AE 2 are equivalent to the
statement that $\mathcal{E}$ is a Boolean ring of sets (see AI, §4).
For those readers who have not yet achieved an understanding of the
remarkable features of quantum mechanics it might appear that (intuitively,
on the basis of preconceived notions associated with macroscopic physical
objects) it is possible to construct quantum mechanics on the basis of “micro-objects,”
that is, upon $\mathcal{E}$, M, and AE 1 and AE 2 in such a way that the
behavior of micro-objects is completely determined by the properties as
defined by $\mathcal{E}$. What is meant by this complete determination is delineated in
III, §4.1. Such a procedure will lead to contradiction with experience, as we
shall find in IV, §8.1.
2 Selection Procedures
We shall now consider the discussion of experiments with microsystems
which we began in I, §2. We shall not begin by introducing a number of
intuitive concepts which establish a connection between measurements of
microsystems and “measurement of properties” (see III, §4.1). Instead we
shall proceed cautiously and seek only to obtain a mathematical representation
of the preparation and registration of microsystems. According to
the description of the experimental procedures presented in I, §2, the
preparation and registration procedures have a common attribute—they
result in the selection of microsystems. We shall now formulate this common
attribute in mathematical terms.
In mathematics it is often useful to first introduce the more general and
then the more specialized concepts, or in more precise terms, first the less rich
and then the more rich structure types. Then all theorems for the less rich are
also valid for the more rich structure types. Indeed, everywhere in a
mathematical theory where such a general structure is found, the theorems
deduced for this structure can be applied. Consider, for example, the
structure type “group” and its meaning in many different mathematical
theories.
On the basis of physical and mathematical considerations we shall now
introduce the structure type “selection procedures.”
We begin by introducing a set M. The elements of M shall be used as labels
for the microsystems. Therefore we shall loosely refer to M as the set of
microsystems (see, for example, [1], §5, §10, and §12 or [2], III, §4). We shall
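call a subset $\mathcal{S} \subset \mathcal{P}(M)$ a set of selection procedures provided the following
axioms are satisfied:
AS 1.1. If $a, b \in \mathcal{S}$, $a \subset b$ then $b \setminus a \in \mathcal{S}$.
AS 1.2. If $a, b \in \mathcal{S}$, then $a \cap b \in \mathcal{S}$.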
It is somewhat difficult to make the axioms for such a general concept as a
selection procedure plausible to physicists. The following remarks are
intended for this purpose.
First, it is evident that a structure $\mathcal{E}$ of “properties” is, on the basis of AE 1
and AE 2, also a structure of “selection procedures.” A structure of selection
procedures is equivalent to a set of properties (according to AS 1.1 and
AS 1.2) if and only if $M \in \mathcal{S}$.
The fact that every set $\mathcal{E}$ of properties is a set of selection procedures may
be intuitively expressed as follows: $\mathcal{E}$ consists of the selection procedures
which select according to the properties $a \in \mathcal{E}$. If we had introduced the
concept of a “property” as a special case of a selection procedure, then we
would say that, in physics, there are other methods of “selection” than
according to “objective properties.”
Mathematically, the distinction between $\mathcal{E}$ and $\mathcal{S}$ appears to be small. For
$\mathcal{S}$ we do not require that $M \in \mathcal{S}$. This small distinction permits us to extend
the domain of application of selection procedures beyond that of the
properties.
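The relation between the two structure types can be checked on small finite examples. The sketch below is a hypothetical illustration: the set `M`, the two families, and all function names are assumptions of this example. It tests AS 1.1 and AS 1.2 for a family of subsets and compares the result with the property axioms AE 1 and AE 2.

```python
def as_family(S):
    """Normalize an iterable of sets to a set of frozensets."""
    return {frozenset(a) for a in S}


def is_selection_structure(S):
    """AS 1.1: b minus a is in S whenever a is a subset of b; AS 1.2: a & b is in S."""
    S = as_family(S)
    as_1_1 = all(b - a in S for a in S for b in S if a <= b)
    as_1_2 = all(a & b in S for a in S for b in S)
    return as_1_1 and as_1_2


def is_property_structure(M, S):
    """AE 1: complement in M stays in S; AE 2: intersections stay in S."""
    M, S = frozenset(M), as_family(S)
    ae_1 = all(M - a in S for a in S)
    ae_2 = all(a & b in S for a in S for b in S)
    return ae_1 and ae_2


M = {1, 2, 3, 4}
S_without_M = [set(), {1}, {2}, {1, 2}]   # M itself is not a selection procedure
S_with_M = [set(), {1, 2}, {3, 4}, M]     # M is included

for S in (S_without_M, S_with_M):
    contains_M = frozenset(M) in as_family(S)
    print(contains_M, is_selection_structure(S), is_property_structure(M, S))
```

Both families satisfy AS 1.1 and AS 1.2, but only the one containing M also satisfies AE 1 and AE 2. This matches the statement that a structure of selection procedures is a property structure exactly when M belongs to it.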
In order to make AS 1 more plausible, let us suppose that the selection
procedures a, b, etc. are obtained by physical methods. Then a subset a of M
represents the set of systems x selected according to the procedure a for some
experiment. The set a is, in general, infinite. The set of systems obtained
experimentally from the procedure a is always finite, but can be arbitrarily
large. Since in principle we do not know how large this number can be, we
express this lack of knowledge in the mathematical framework by the
expression “infinite” (see [1], §6, §9 and [2], §5, §8). Intuitively speaking,
axiom AS 1.2 states that the set of all x selected according to both selection
procedures a and b, that is all $x \in a \cap b$, is a possible selection procedure. If
$a \subset b$ we say that the selection procedure a is “finer” than that of b. If $a \subset b$,
and if we eliminate (by means of the finer selection procedure a) the systems
associated with a, we obtain $b \setminus a$; AS 1.1 states that $b \setminus a$ is a possible selection
procedure.
Have we then not also asserted that all objects x e M which do not satisfy
the selection criterion of a can be “selected” on the basis that they do not
satisfy the selection criteria of a?
For “properties” it appears to be meaningful to assert that both a and its
complement M\a are properties, because the elements of a differ fundamentally
from those of M\a because the latter do not have the property a. For
selection procedures, however, it is physically unrealistic to make this
assertion. This is the case not only for micro-objects.
For example, let us consider a machine which produces steel spheres (ball
bearings). The machine can be considered to be a selection procedure a for
steel spheres M. The complementary set M\a is apparently characterized by
the fact that the spheres of M\a were not produced by the machine a. For the
set a we may make certain (technically important) assertions, while for M\a
we may say only that the elements of M\a were not selected according to a.
A similar case exists for the case of a modern electron accelerator. For the
“selected” set of electrons we may make important assertions about the
experiments for which the electrons are used. What assertions can we make
about the electrons which are not prepared by the electron accelerator?
Thus it is meaningful not to require that M be a selection procedure. The
addition of the axiom $M \in \mathcal{S}$ to AS 1.1 and AS 1.2 would not lead to a
contradiction to quantum mechanics. As the above example has shown, the
inclusion of the condition $M \in \mathcal{S}$ in the axioms for the structure “selection
procedure” is somewhat physically unrealistic. Therefore we will not add the
axiom $M \in \mathcal{S}$.
If we add the condition $M \in \mathcal{S}$ to axioms AS 1.1 and AS 1.2 then we find that
$\mathcal{S}$ will satisfy axioms AE 1 and AE 2 and would therefore be a property
structure. In physical terms (that is, on the basis of the mapping principles)
an element of $\mathcal{S}$ will not represent an intrinsic “objective” property of an
object x but only the “property” that x is selected according to the procedure
a. We say that the axiom systems AE 1 and AE 2 or AS 1.1 and AS 1.2 alone do
not suffice to describe the physical role of the elements of $\mathcal{E}$ and $\mathcal{S}$. Further
axioms will be needed in order to formulate the physical structures more
precisely.
We shall now state a number of definitions and theorems which we shall
need later.
D 2.1. $\mathcal{S}(a) = \{b \mid b \in \mathcal{S} \text{ and } b \subset a\}$.
According to AI, §4 and AS 1.1, 2 we find that $\mathcal{S}(a)$ is a Boolean ring of
sets with null element $\emptyset$ and unit element a. The set $\mathcal{S}$ itself need not
necessarily be a lattice (AI, §1 and §4) since given $a, b \in \mathcal{S}$, $a \cup b$ need not be
an element of $\mathcal{S}$.
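A small finite example may help to separate the two claims made here. In the sketch below, the family `S` and the helper `restrict` are hypothetical names chosen for illustration. The family satisfies AS 1.1 and AS 1.2, yet the union of two of its members is missing from it, while the restricted family below a fixed member a is closed under union and under complement relative to a.

```python
def restrict(S, a):
    """D 2.1 on a finite family: the members of S that are subsets of a."""
    a = frozenset(a)
    return {b for b in S if b <= a}


# Hypothetical structure of selection procedures on M = {1, 2, 3}.
S = {frozenset(s) for s in [set(), {1}, {2}, {1, 2}, {3}]}

# S satisfies AS 1.1 and AS 1.2 ...
is_structure = (all(x & y in S for x in S for y in S)
                and all(y - x in S for x in S for y in S if x <= y))

# ... but the union of {1} and {3} is not a member of S.
union_missing = (frozenset({1}) | frozenset({3})) not in S

# Restricted to a = {1, 2}, unions and relative complements stay inside.
a = frozenset({1, 2})
S_a = restrict(S, a)
closed_under_union = all(b | c in S_a for b in S_a for c in S_a)
closed_under_complement = all(a - b in S_a for b in S_a)

print(is_structure, union_missing, closed_under_union, closed_under_complement)
```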
Th. 2.1. Given a family of structures of selection procedures $\{\mathcal{S}_\lambda\}$, then
$\mathcal{S} = \bigcap_\lambda \mathcal{S}_\lambda$ is a structure of selection procedures.
The proof is a simple consequence of AS 1.1, 2.
Th. 2.2. For each subset $\Theta$ of $\mathcal{P}(M)$ there is a smallest structure of selection
procedures $\mathcal{S}$ (called the structure of selection procedures generated by $\Theta$)
which satisfies $\Theta \subset \mathcal{S}$.
Proof. $\mathcal{S}$ is the intersection of all structures of selection procedures $\mathcal{S}_\lambda$ which
satisfy $\Theta \subset \mathcal{S}_\lambda$. Since $\mathcal{P}(M)$ is itself a structure of selection procedures, the family
$\mathcal{S}_\lambda$ is nonempty.
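For a finite set M the generated structure can also be computed constructively, by closing the generating family under the two operations of AS 1.1 and AS 1.2 until nothing new appears. The following sketch assumes finiteness; the function name `generate` and the example generators are hypothetical. Every set it adds must belong to any structure containing the generators, so the result coincides with the intersection used in the proof.

```python
def generate(theta):
    """Smallest family containing theta that is closed under AS 1.1 and AS 1.2.

    Assumes theta is a finite collection of finite sets, so the loop terminates.
    """
    S = {frozenset(t) for t in theta}
    changed = True
    while changed:
        changed = False
        for a in list(S):
            for b in list(S):
                new = {a & b}          # AS 1.2: intersections
                if a <= b:
                    new.add(b - a)     # AS 1.1: differences of nested members
                if not new <= S:
                    S |= new
                    changed = True
    return S


# Hypothetical generators: two overlapping selection procedures.
generated = generate([{1, 2}, {2, 3}])
print(sorted(sorted(s) for s in generated))
```

Neither the union {1, 2, 3} nor the set {1, 3} is produced here, which again reflects that a structure of selection procedures need not be closed under unions.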
D 2.2. A set $\mathcal{S}_1$ of selection procedures for which $\mathcal{S}_1 \subset \mathcal{S}$ is said to be
coexistent relative to $c \in \mathcal{S}$ provided that $\mathcal{S}_1 \subset \mathcal{S}(c)$.
A set $\mathcal{E}$ of properties is a coexistent set of selection procedures relative to
M. Every subset of $\mathcal{E}$ is a coexistent set of selection procedures relative to M.
3 Statistical Selection Procedures
Earlier we have presented a mathematical description of the fundamental
phenomena associated with the selection of microsystems by means of an
apparatus. We shall now consider the mathematical formulation of the
second basis of quantum mechanics—statistics. The role of statistics is, of
course, not restricted to quantum mechanics. It is, however, an essential
component of quantum mechanics. For example, it is possible to resolve the
apparent contradiction between the particle and wave descriptions only by
the introduction of a statistical viewpoint; see, for example, [2], XI, §1.5-§1.7.
In an experimental context the role of statistics is made manifest by the
relative frequency with which a finer selection procedure $b \subset a$ selects relative
to a. By this we mean that if we select N systems $x_1, x_2, \ldots, x_N$ according to
the selection procedure a, and we obtain $N_1$ of these systems which also
satisfy the selection procedure b, then the relative frequency h is given by
$h = N_1/N$. We say that b is statistically dependent on a if the relative
frequency h is reproducible. By this we mean that “in physical approximation”
we obtain the same relative frequency (in the case of large numbers
N of systems) for various experiments involving selection according to a and
b. We shall use real numbers to mathematically represent these relative
frequencies.
Here we shall not consider the meaning of the expression “in physical
approximation” used above. We only note that we do not require that
$\alpha = N_1/N$ but only that they are approximately equal, $\alpha \approx N_1/N$ (where $\alpha$ is
the real number representing the frequency). The nature of this approximation
is discussed in [1], §11 where we have placed a particular emphasis
on the relationship between theory and experience.
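The notion of a reproducible relative frequency can be pictured with a simple simulation. The sketch below is a hypothetical illustration and does not model any physical apparatus: it assumes that each system selected by a independently satisfies b with a fixed probability `alpha`, and it uses Python's pseudo-random generator in place of an experiment. It computes the relative frequency h for several repetitions of the same experiment with large N.

```python
import random


def relative_frequency(n_systems, p_b_given_a, rng):
    """Simulate N systems selected by a; return h = N_1 / N for the finer procedure b."""
    n_1 = sum(1 for _ in range(n_systems) if rng.random() < p_b_given_a)
    return n_1 / n_systems


rng = random.Random(0)   # fixed seed so the sketch is repeatable
alpha = 0.3              # hypothetical real number representing the frequency

# Five repetitions of the "same" experiment, each with N = 100000 systems.
frequencies = [relative_frequency(100_000, alpha, rng) for _ in range(5)]
print(frequencies)
```

The individual values of h differ slightly from run to run and from `alpha`, which is the sense in which only approximate equality between the real number and the measured frequency is required.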
We shall now introduce the concept of a statistical selection procedure in
order to describe the statistical dependence of selection procedures.
A set $\mathcal{S} \subset \mathcal{P}(M)$ is called a structure of statistical selection procedures
provided that, in addition to AS 1, the following axiom is satisfied. Let
$\mathcal{T} = \{(a, b) \mid a, b \in \mathcal{S},\ b \subset a,\ a \neq \emptyset\}$; let $\lambda: \mathcal{T} \to [0, 1]$.
AS 2.1. If $a_1, a_2 \in \mathcal{S}$, $a_1 \cap a_2 = \emptyset$, $a_1 \cup a_2 \in \mathcal{S}$ then
$\lambda(a_1 \cup a_2, a_1) + \lambda(a_1 \cup a_2, a_2) = 1$.
AS 2.2. If $a_1, a_2, a_3 \in \mathcal{S}$, $a_1 \supset a_2 \supset a_3$, $a_2 \neq \emptyset$ then
$\lambda(a_1, a_3) = \lambda(a_1, a_2)\,\lambda(a_2, a_3)$.
AS 2.3. If $a_1, a_2 \in \mathcal{S}$, $a_1 \supset a_2$, $a_2 \neq \emptyset$ then $\lambda(a_1, a_2) \neq 0$.
The quantity $\lambda(a, b)$ is called the probability of b relative to a and represents
the relative frequency (as described above) with which b selects relative to a.
On the basis of this interpretation of $\lambda(a, b)$, the postulates AS 2.1-AS 2.3
are obvious (but not proven; see [1], §5 or [2], III, §4).
If $a_1 \cup a_2$ is a selection procedure then $a_1$ and $a_2$ are finer than $a_1 \cup a_2$. If
$a_1 \cap a_2 = \emptyset$ then $a_1$ and $a_2$ are mutually exclusive. Then if N systems are
selected according to $a_1 \cup a_2$ and if $N_1$ are selected according to $a_1$, $N_2$
according to $a_2$, then we must have $N = N_1 + N_2$, which, after division by N,
is what we obtain from AS 2.1. For three selection procedures $a_3 \subset a_2 \subset a_1$
let $N_1$ systems be selected according to $a_1$, $N_2$ of these $N_1$ systems according
to $a_2$ and $N_3$ of these $N_2$ systems according to $a_3$. Then we find that
$N_3/N_1 = (N_2/N_1)(N_3/N_2)$ which is in agreement with AS 2.2.
If $a_1 \supset a_2 \neq \emptyset$ then for N systems, selected according to $a_1$, a finite
number may be selected according to $a_2$, in agreement with AS 2.3.
We shall now consider a number of simple corollaries of AS 2. Let
$a_2 = \emptyset$. From AS 2.1 we obtain
$\lambda(a_1, a_1) + \lambda(a_1, \emptyset) = 1$. (3.1)
By multiplication with $\lambda(a_1, a_1)$ we obtain
$\lambda(a_1, a_1)^2 + \lambda(a_1, a_1)\,\lambda(a_1, \emptyset) = \lambda(a_1, a_1)$. (3.2)
According to AS 2.2,
$\lambda(a_1, a_1) + \lambda(a_1, \emptyset) = \lambda(a_1, a_1)$
and therefore
$\lambda(a_1, \emptyset) = 0$ (3.3)
and
$\lambda(a_1, a_1) = 1$. (3.4)
If a2 n a3 = 0, and if a2 cz al9 a3 cz ax from AS 2.1 it follows that
X(a2 и a39 a2) + X(a2 u a39 a3) = 1.
By multiplication with X(al9 a2 u a3) and application of AS 2.2 we obtain
X(al9 a2 u a3) = X(al9 a2) 4- X(al9 a3). (3.5)
If $a_3 \subset a_2 \subset a_1$, from (3.5) and $a_2 = a_3 \cup (a_2 \setminus a_3)$ we obtain
$\lambda(a_1, a_2) = \lambda(a_1, a_3) + \lambda(a_1, a_2 \setminus a_3)$
and therefore
$\lambda(a_1, a_2) \ge \lambda(a_1, a_3)$.
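The corollaries (3.3)-(3.5) can be checked on a toy model. The following Python sketch is an illustration only and is not taken from the text: it assumes a finite set of systems and takes $\lambda(a, b)$ to be the counting ratio $|b|/|a|$. The function name `lam` and the sample sets are hypothetical.

```python
from fractions import Fraction

def lam(a, b):
    """Assumed toy model: lambda(a, b) = |b| / |a| for b a subset of a, a nonempty."""
    a, b = frozenset(a), frozenset(b)
    if not a or not b <= a:
        raise ValueError("lambda(a, b) needs b to be a subset of a nonempty a")
    return Fraction(len(b), len(a))

a1 = frozenset(range(12))      # hypothetical selection procedure with 12 systems
a2 = frozenset({0, 1, 2, 3})   # finer than a1
a3 = frozenset({4, 5, 6})      # finer than a1 and disjoint from a2
empty = frozenset()

# AS 2.1 for the mutually exclusive pair a2, a3
as21 = lam(a2 | a3, a2) + lam(a2 | a3, a3)
# Corollaries (3.3) and (3.4)
eq33 = lam(a1, empty)
eq34 = lam(a1, a1)
# Corollary (3.5): additivity relative to a1
eq35_left = lam(a1, a2 | a3)
eq35_right = lam(a1, a2) + lam(a1, a3)
# AS 2.2 for the chain a1, a2 | a3, a2
as22_left = lam(a1, a2)
as22_right = lam(a1, a2 | a3) * lam(a2 | a3, a2)

print(as21, eq33, eq34, eq35_left, eq35_right, as22_left, as22_right)
```

Exact fractions are used so that the identities hold without rounding; in an actual experiment the frequencies would agree with $\lambda$ only approximately.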
D 3.1. A mapping $p$ of a Boolean ring $\Sigma$ (see AI, §3) into the unit interval
$[0, 1]$ which satisfies $p(e) = 1$ (where $e$ is the unit element of $\Sigma$) and for which the
condition $\sigma_1 \wedge \sigma_2 = 0$ implies that $p(\sigma_1 \vee \sigma_2) = p(\sigma_1) + p(\sigma_2)$ is
called an additive real measure on $\Sigma$.
From (3.5), AS 2.2, and AS 2.3 it is easy to obtain the following theorem:
Th. 3.1. $p(b) = \lambda(a, b)$ is an additive measure on the Boolean ring $\mathcal{S}(a)$; for
$a_3 \subset a_2 \subset a$ we obtain
$\lambda(a_2, a_3) = \dfrac{p(a_3)}{p(a_2)}$.
If we fix $a$, and consider all $b \subset a$, it is sufficient to consider the probability
function $p(b)$ since, on the Boolean ring $\mathcal{S}(a)$, we obtain all conditional
probabilities from the probability function $p(b)$.
In addition to the axioms AS 2.1-AS 2.3 we also propose the following:
AS 2.4.1. If $a_\nu \in \mathcal{S}$ is a decreasing sequence for which $\bigcap_\nu a_\nu = \emptyset$, and if
there exists an $a \in \mathcal{S}$ for which $a_1 \subset a$ (and thus $a_\nu \subset a$ for all $\nu$) then
$\lambda(a, a_\nu) \to 0$.
AS 2.4.2. For every totally ordered subset of $\mathcal{S}$ there exists an upper bound
in $\mathcal{S}$.
Axiom AS 2.4.1 is a generalization (in the sense of a mathematical
idealization) of the intuitively evident relation (3.3). If for the decreasing
sequence we have $a_\nu = \emptyset$ from some $\nu$ onward, from (3.3) it follows that
AS 2.4.1 is satisfied. In addition, AS 2.4.1 requires that if the sequence of
selection procedures $a_\nu$ becomes arbitrarily small in the sense of $\bigcap_\nu a_\nu = \emptyset$
then the probabilities $\lambda(a, a_\nu)$ must become arbitrarily small. A situation in
which $\lambda(a, a_\nu)$ tends to a nonzero limiting value while, for all practical
purposes, there are no more physical systems to be selected by $a_\nu$ for large $\nu$ is
physically unrealistic. Axiom AS 2.4.2 is physically realistic because it asserts
that, for a sequence of increasing selection procedures, there is a largest
selection procedure. Thus it becomes a substitute for the stronger, but less
realistic condition $M \in \mathcal{S}$. Here we refer readers who are interested in these
axioms and their implications towards a physically motivated “generalized”
probability theory to [1], §11 and [18] (see footnote 1).
Using axioms AS 2.1-AS 4 we may develop a theory of probability which
is similar to that of Kolmogorov (see [18]). We shall not derive any
additional results here. Here we shall only note that the probabilistic basis
for quantum statistical mechanics differs only slightly from the usual one. If
instead, we require that $M \in \mathcal{S}$, then we would find that $\mathcal{S} = \mathcal{S}(M)$ is a
Boolean ring, and that the relative probabilities would be completely
determined by the probability function $p(a) = \lambda(M, a)$. The basis of the
statistical selection procedures would then be identical to those of the
simplest “classical” probability theory.
We shall now introduce an important definition which we shall need later.
D 3.2. Let $\mathcal{S}$ be a structure of statistical selection procedures. A partition of
$a \in \mathcal{S}$ ($a = \bigcup_{i=1}^{n} b_i$, $b_i \neq \emptyset$, $b_i \in \mathcal{S}$, $b_i \cap b_j = \emptyset$, $i \neq j$) is called a decomposition
of $a$ in the $b_i$, and $a$ is called a mixture of the $b_i$. The $\lambda(a, b_i)$ are called
the weights of $b_i$ in $a$.
Since $\mathcal{S}(a)$ is a Boolean ring, the decomposition of $a$ is nothing other than
a disjoint partition of the unit element $a$ of $\mathcal{S}(a)$. With the additive measure
$p(b)$ defined on $\mathcal{S}(a)$ we obtain $\sum_{i=1}^{n} p(b_i) = 1$. The $p(b_i)$ describe the
“weights” by which the individual components $b_i$ occur in the decomposition.
If $N$ systems are selected according to $a$, and if $N_i$ are selected
according to each $b_i$, then the relation $p(b_i) = N_i/N$ must hold in “physical
approximation.”
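As a hedged illustration of D 3.2, the following Python sketch computes the weights of a decomposition in a toy model where $\lambda(a, b_i)$ is assumed to be the counting ratio $N_i/N$. The helper name `weights` and the sample data are hypothetical and do not come from the text.

```python
from fractions import Fraction

def weights(a, parts):
    """Weights lambda(a, b_i) of a decomposition of a (toy counting model).

    The parts must be nonempty, pairwise disjoint, and have union a.
    """
    a = frozenset(a)
    parts = [frozenset(p) for p in parts]
    if any(not p for p in parts):
        raise ValueError("every component b_i must be nonempty")
    disjoint = sum(len(p) for p in parts) == len(a)
    if not disjoint or frozenset().union(*parts) != a:
        raise ValueError("the parts must form a disjoint partition of a")
    return [Fraction(len(p), len(a)) for p in parts]

a = range(10)                                   # N = 10 systems selected by a
w = weights(a, [{0, 1, 2, 3, 4}, {5, 6, 7}, {8, 9}])
print(w, sum(w))
```

The weights add up to 1, as required of an additive measure on $\mathcal{S}(a)$.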
4 Physical Systems
We shall now consider the central topic of this chapter: the presentation of a
more precise formulation of the concept of a “physical system” and of a more
detailed description of the statistical selection procedures used for physical
systems. We again begin with a set M, the elements of which will be used to
represent “microsystems.”
4.1 Preparation Procedures
From the analysis of the experimental process which was presented in I, §2
we found that the “selection” of microsystems takes place in two distinct
ways: by preparation and by registration (that is, by selection according to
the result of the interaction of the microsystem with a registration apparatus).
Let us begin with the formulation of the mathematical structure
describing preparation.

1. It is possible to add the following axiom to AS 1.1-AS 2.4 without contradicting
experience:
AS 1.3. If $a, b \in \mathcal{S}$ and $a \cap b \neq \emptyset$ then $a \cup b \in \mathcal{S}$.
We shall not use this axiom in this book.
Let $\mathcal{Q} \subset \mathcal{P}(M)$ be a structure for which
APS 1. $\mathcal{Q}$ is a statistical selection procedure.
We shall call $\mathcal{Q}$ a “set of preparation procedures.” This designation is not a
consequence of APS 1, but is shorthand for a mapping principle which maps
certain facts onto elements of $\mathcal{Q}$. That is, the elements of $\mathcal{Q}$ will serve as images
(see, for example, [2], III, §4 or [1], §5) of certain definite technical facts and
processes by which microsystems can be produced in large numbers. Here we
do not permit the use of quantum mechanics in their description. The
mathematical relation $x \in a$ (where $a \in \mathcal{Q}$) is the translation of the statement:
$x$ is obtained by the preparation procedure $a$. Here it does not make any
difference whether this statement is a statement about the past, present, or
future (see the discussion in [1], §12).
We shall denote the probability function for $\mathcal{Q}$ (where $\mathcal{Q}$ satisfies APS 1) by
$\lambda_{\mathcal{Q}}$. The definitions and theorems of §3 are valid for $\mathcal{Q}$. Later we shall
consider the decomposition (as defined by D 3.2) of preparation procedures.
4.2 Registration Procedures
It is somewhat more difficult to present a mathematical formulation of the
registration process. The registration process is characterized by two steps:
(1) Construction and utilization of the registration apparatus.
(2) Selection according to the changes which occur (or do not occur) in
the registration apparatus.
Accordingly, we shall define an additional structure on $M$ by means of two
subsets of $\mathcal{P}(M)$: $\mathcal{R}$ and $\mathcal{R}_0$. For $\mathcal{R}$ and $\mathcal{R}_0$ we require that the following
axioms be satisfied:
APS 2. $\mathcal{R}$ is a selection procedure.
APS 3. $\mathcal{R}_0$ is a statistical selection procedure.
APS 4.1. $\mathcal{R}_0 \subset \mathcal{R}$.
APS 4.2. To each $b \in \mathcal{R}$ there exists a $b_0 \in \mathcal{R}_0$ for which $b \subset b_0$.
In order to develop the physical meaning of axioms APS 2-4 we shall first
state what is to be mapped onto the elements of $\mathcal{R}$ and $\mathcal{R}_0$. An element
$b_0 \in \mathcal{R}_0$ shall represent the construction and the use of a registration
apparatus. This can best be illustrated by an example. Let us consider a
counter. Then $b_0$ will be the set of all microsystems which are registered by
the counter. The mathematical relation $x \in b_0$ may be the translation of the
following statement: for the registration of $x$ the counter $b_0$ is applied. The
mapping principle may be expressed in more concise form as follows: $\mathcal{R}_0$ is
the set of registration methods.
For a given microsystem $x \in b_0$ the counter considered above may or may
not respond. Let $b_+$ be the selection procedure for those $x \in b_0$ for which the
counter responds. Here we find that $b_+ \subset b_0$. Let $b_-$ denote the set of those $x$
from $b_0$ for which the counter does not respond. Here we obtain $b_- = b_0 \setminus b_+$;
$b_+$, $b_-$ are elements of $\mathcal{R}$.
Generally we find that $\mathcal{R}$ contains not only the elements of $\mathcal{R}_0$ but also
those selection procedures which are finer than the elements of $\mathcal{R}_0$ and are
selected according to changes which occur (or do not occur) in the
registration apparatus by interaction with the microsystems. This situation is
described by axioms APS 4.1 and APS 4.2. The physical interpretation of the
elements of $\mathcal{R}$ may be expressed in more concise form as follows: $\mathcal{R}$ is the set
of registration procedures.
Axioms APS 4.1 and APS 4.2 do not permit us to conclude that the
selection procedures from $\mathcal{R}_0$ do not depend on the microsystems but depend
only on the apparatus and its “arbitrary” application.
The fact that the registration methods do not depend on the microsystems
is, in part, described by APS 3 where we assume that $\mathcal{R}_0$ is a statistical
selection procedure. Here axiom APS 3 requires that a finer registration
method is statistically dependent on a coarser one and is independent of the
subsequent influence of microsystems. Suppose that $b_0, c_0 \in \mathcal{R}_0$ and that
$c_0 \subset b_0$. Then the registration method $c_0$ satisfies stronger selection criteria
(namely that of $c_0$) than $b_0$ with respect to the apparatus. In an experiment,
this would result in the exclusion of those elements of $b_0$ which are not also
elements of $c_0$. Thus for $c_0$ there is a statistical dependence relative to $b_0$
which depends only on the “finer” selection $c_0$ of the apparatuses.
Experimental physicists will seek to minimize the effect of the “statistical
distribution” of the apparatuses because it will interfere with the statistical
distributions they wish to measure. Instead of speaking of the statistical
dependence of the elements $b_0 \in \mathcal{R}_0$ it is more common to include its
influence in the discussion of the so-called experimental errors in the
measurement. To an experimental physicist the above statistical distributions
of the apparatuses (the effect of which cannot be completely
eliminated) will appear as an error in the experimental measurement because
he tries, but cannot attain, registration methods $b_0 \in \mathcal{R}_0$ for which there are
no $c_0 \in \mathcal{R}_0$ for which $c_0 \neq \emptyset$, $c_0 \subset b_0$ and $c_0 \neq b_0$.
We shall denote the probability function for $\mathcal{R}_0$ by $\lambda_{\mathcal{R}_0}$.
We note that axiom APS 2 does not require that $\mathcal{R}$ be a structure of
statistical selection procedures. This crucial fact permits $\mathcal{R}$ to contain
selection procedures which are essentially conditioned by the microsystem.
The counter which is characterized by $b_0$ may or may not respond; thus $b_0$
can be partitioned into two subsets $b_+$, $b_-$ where $b_0 = b_+ \cup b_-$,
$b_+ \cap b_- = \emptyset$. In nature, however, there are no reproducible frequencies
$\lambda(b_0, b_+)$. Suppose that we pass $N$ microsystems $x_1, x_2, \ldots, x_N$ through the
counter, and that $N_+$ responses are obtained. By experience we find that
$N_+/N$ depends on any circumstances which affect the microsystems. We find
that the frequency $N_+/N$ is not reproducible on the basis of the registration
procedures alone. On the contrary, counters are used to determine which
kinds of microsystems are present. The actual frequency $N_+/N$ depends on
the preparation procedures used for the microsystems. Thus we find it
necessary to introduce additional postulates which depend on both the
preparation procedure and the registration procedure.
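The dependence of $N_+/N$ on the preparation can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the microsystems are represented by a preparation label and an "energy", and the counter $b_0$ is taken to respond above a fixed threshold. None of these names or numbers come from the text.

```python
from fractions import Fraction

# Hypothetical microsystems: (label of the preparation, energy in arbitrary units)
prep_a = [("a", e) for e in (1, 2, 3, 8, 9)]
prep_a2 = [("a2", e) for e in (7, 8, 9, 10, 11)]
THRESHOLD = 5   # assumed response threshold of the counter b0

def response_frequency(systems):
    """N_+/N for systems passed through the counter; b_+ = systems above threshold."""
    n_plus = sum(1 for _, energy in systems if energy > THRESHOLD)
    return Fraction(n_plus, len(systems))

freq_a = response_frequency(prep_a)              # systems prepared by a only
freq_a2 = response_frequency(prep_a2)            # systems prepared by a2 only
freq_mixed = response_frequency(prep_a + prep_a2)

print(freq_a, freq_a2, freq_mixed)
```

The same counter yields three different frequencies, so no single number $\lambda(b_0, b_+)$ can be attached to the registration procedure alone.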
4.3 The Dependence of Registration upon Preparation
We shall now consider the following question: Which combinations of $a \in \mathcal{Q}$
and $b_0 \in \mathcal{R}_0$ are physically meaningful? Unfortunately, this problem is not
trivial; indeed, in the usual formulation of quantum mechanics it is not even
mentioned. In an idealized form, part of this problem can be found in
axiomatic field theory as the so-called “causality postulate.”
How do we formulate this “combination problem”? In order to formulate
this problem in mathematical language we introduce the following
definition:
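D 4.3.1. Let $a \in \mathcal{Q}$, $b_0 \in \mathcal{R}_0$, $a \neq \emptyset$, $b_0 \neq \emptyset$; we say that $a$ and $b_0$ may be
combined provided that if $\tilde{a} \in \mathcal{Q}$, $\tilde{b}_0 \in \mathcal{R}_0$, $\emptyset \neq \tilde{a} \subset a$, $\emptyset \neq \tilde{b}_0 \subset b_0$, we find
that $\tilde{a} \cap \tilde{b}_0 \neq \emptyset$.
If $a$ and $b_0$ may be combined, then clearly $a \cap b_0 \neq \emptyset$. We now define the
following set:
$C = \{(a, b_0) \mid a \in \mathcal{Q},\ b_0 \in \mathcal{R}_0$ and $a$, $b_0$ may be combined$\}$. (4.3.1)
The following theorem is an immediate consequence of D 4.3.1.
Th. 4.3.1. If $\emptyset \neq \tilde{a} \subset a$, $\emptyset \neq \tilde{b}_0 \subset b_0$ and if $a, \tilde{a} \in \mathcal{Q}$; $b_0, \tilde{b}_0 \in \mathcal{R}_0$ and
$(a, b_0) \in C$ then we find that $(\tilde{a}, \tilde{b}_0) \in C$.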
Intuitively, the statement $\tilde{a} \cap \tilde{b}_0 \neq \emptyset$ represents the statement that it is
physically possible (see [1], §10) to select microsystems according to both $\tilde{a}$
and $\tilde{b}_0$. That is, the microsystems which were prepared according to the
preparation procedure $\tilde{a}$ can be used for the registration method $\tilde{b}_0$. Thus the
statement that $a$, $b_0$ may be combined means that for every finer preparation
procedure $\tilde{a} \subset a$ and every finer registration method $\tilde{b}_0 \subset b_0$ there are
always systems in $\tilde{a} \cap \tilde{b}_0$.
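The definition D 4.3.1 can be tested mechanically on finite data. The following Python sketch is a toy model under stated assumptions, not the author's construction: each microsystem is a pair (preparation index, method index), the pair (2, 1) is assumed to be physically excluded, and the names `M`, `Q`, `R0`, `prep`, `method` and `may_be_combined` are hypothetical.

```python
from itertools import chain, combinations, product

# Toy set of microsystems; the combination (2, 1) is assumed to be impossible.
M = frozenset((p, r) for p in range(3) for r in range(2)) - {(2, 1)}

def subsets(values):
    values = list(values)
    return chain.from_iterable(combinations(values, k) for k in range(len(values) + 1))

def prep(*ps):
    """Preparation procedure: all systems whose preparation index is in ps."""
    return frozenset(x for x in M if x[0] in ps)

def method(*rs):
    """Registration method: all systems whose method index is in rs."""
    return frozenset(x for x in M if x[1] in rs)

Q = {prep(*P) for P in subsets(range(3))}      # assumed set of preparation procedures
R0 = {method(*R) for R in subsets(range(2))}   # assumed set of registration methods

def may_be_combined(a, b0):
    """D 4.3.1: all nonempty finer procedures in Q and R0 must have common systems."""
    if not a or not b0:
        return False
    finer_a = [x for x in Q if x and x <= a]
    finer_b = [y for y in R0 if y and y <= b0]
    return all(x & y for x, y in product(finer_a, finer_b))

ok = may_be_combined(prep(0, 1), method(1))
blocked = may_be_combined(prep(0, 2), method(1))
overlap = bool(prep(0, 2) & method(1))
print(ok, blocked, overlap)
```

The pair `prep(0, 2)`, `method(1)` has a nonempty intersection but still may not be combined, because the finer procedure `prep(2)` shares no system with `method(1)`. This shows that $a \cap b_0 \neq \emptyset$ is necessary but not sufficient.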
The physical meaning of the combination problem will be discussed in
detail in III, §1. In VII, §1 we shall again return to this subject.
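The “largest possible” set $C$ is given by
$C = \mathcal{Q}' \times \mathcal{R}_0'$
where
D 4.3.2. $\mathcal{Q}' = \mathcal{Q} \setminus \{\emptyset\}$ and $\mathcal{R}_0' = \mathcal{R}_0 \setminus \{\emptyset\}$.
The condition $C = \mathcal{Q}' \times \mathcal{R}_0'$ is not always realistic. Thus, in the selection of
axioms for $C$ we have encountered a physical problem.
Why is the condition $C = \mathcal{Q}' \times \mathcal{R}_0'$ an unrealistic condition for
microsystems?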
In order to facilitate understanding, we shall now briefly describe an
additional structure which is needed in order to describe the relationship
between preparation and registration methods. A detailed exposition of this
structure will be given in VII, §1. It is concerned with the space-time
relationship between these experimental procedures. Since the microsystems
are produced by the preparation procedure it makes sense that the microobjects
are registered only after they are produced, and that the preparation
apparatus and the registration apparatus do not collide. These remarks show
that the question concerning axioms for the set $C$ is not without physical
significance. Later we shall see that, under certain circumstances, the
combination problem described by $C$ does not play an essential role in the
usual formulation of quantum mechanics.
We shall now begin by making a minimal requirement for C.
APS 5.1.1. To each $a \in \mathcal{Q}'$ there exists a $b_0 \in \mathcal{R}_0'$ such that $(a, b_0) \in C$.
This axiom states that it is physically possible (see [1], §10) to apply at
least one registration apparatus to the microsystems prepared by a. In III, §1
we will replace APS 5.1.1 by the stronger axiom APS 5.1.2.
We shall now consider a registration procedure $b \in \mathcal{R}$ where $b \neq \emptyset$ (we do
not require that $b \in \mathcal{R}_0$); $b \neq \emptyset$ means that there are microsystems which
may be selected according to the procedure $b$. All experience has shown that
it is possible to prepare such microsystems. Thus we assert
APS 5.2. To each $b \in \mathcal{R}$, $b \neq \emptyset$ there exists at least one $a \in \mathcal{Q}$ and a $b_0 \in \mathcal{R}_0$
which satisfies $b \subset b_0$, $(a, b_0) \in C$ and $a \cap b \neq \emptyset$.
Axiom APS 5.2 is stronger than APS 4.2.
We shall now consider the mathematical formulation of the statistical
dependence of the combined selection procedures. First we define
D 4.3.3.
$\Theta = \{c \mid c = a \cap b,\ a \in \mathcal{Q},\ b \in \mathcal{R}$, and for $a \neq \emptyset$ there
exists a $b_0 \in \mathcal{R}_0$ which satisfies $b \subset b_0$ and $(a, b_0) \in C\}$.
Clearly $\Theta$ is a subset of $\mathcal{P}(M)$. An element $a \cap b$ of $\Theta$ represents the set of
all microsystems which are prepared according to $a$ and registered according
to $b$.
From D 4.3.3 and Th. 4.3.1 it follows that:
Th. 4.3.2. Let $a, \tilde{a} \in \mathcal{Q}$, $\tilde{a} \subset a$ and suppose $b, \tilde{b} \in \mathcal{R}$, $\tilde{b} \subset b$. If $a \cap b \in \Theta$
then $\tilde{a} \cap \tilde{b} \in \Theta$.
D 4.3.4. Let $\mathcal{S}$ denote the smallest set of selection procedures for which
$\Theta \subset \mathcal{S}$ (see Th. 2.2).
In general we find that neither $\mathcal{Q} \subset \mathcal{S}$ nor $\mathcal{R} \subset \mathcal{S}$ holds!
We shall now formulate the requirement that the combination of preparation
and registration procedures leads to reproducible frequencies.
APS 6. We require that $\mathcal{S}$ (as defined by D 4.3.4) shall be a statistical
selection procedure.
Let $\lambda_{\mathcal{S}}$ denote the probability function for $\mathcal{S}$.
On the basis of their physical interpretation, $\lambda_{\mathcal{Q}}$, $\lambda_{\mathcal{R}_0}$ and $\lambda_{\mathcal{S}}$ cannot be
mutually independent. Experience has shown that, in the combination of
preparation procedures $a \in \mathcal{Q}$ and the registration methods $b_0 \in \mathcal{R}_0$, there is
no statistical dependence between the selection according to the preparation
procedures and the application of the registration method. We formulate the
statistical independence of the preparation procedure and the application of
registration methods by means of the following axioms:
APS 7. Let $a_1, a_2 \in \mathcal{Q}$; $b_{10}, b_{20} \in \mathcal{R}_0$ where $a_2 \subset a_1$, $b_{20} \subset b_{10}$ and
$(a_1, b_{10}) \in C$. Then
APS 7.1. $\lambda_{\mathcal{S}}(a_1 \cap b_{10}, a_2 \cap b_{10}) = \lambda_{\mathcal{Q}}(a_1, a_2)$;
APS 7.2. $\lambda_{\mathcal{S}}(a_1 \cap b_{10}, a_1 \cap b_{20}) = \lambda_{\mathcal{R}_0}(b_{10}, b_{20})$.
Note that if $b \in \mathcal{R}$, $b \notin \mathcal{R}_0$, and if $a_1, a_2 \in \mathcal{Q}$, $a_1 \cap b \neq \emptyset$ then, in general,
we find that $\lambda_{\mathcal{S}}(a_1 \cap b, a_2 \cap b) \neq \lambda_{\mathcal{Q}}(a_1, a_2)$.
Axiom APS 7.1 expresses the directedness of the interaction of the
preparation on the registration apparatus in the following way. The probability
associated with the process of the preparation apparatus does not
depend on those of the registration apparatus (that is, on $b_{10}$); see I, §2.
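A finite toy model shows what APS 7 asserts and why it fails for finer registration procedures. The Python sketch below rests on assumptions that are not in the text: microsystems are triples (preparation index `p`, method index `r`, internal index `k`), all three probability functions are taken to be counting ratios, and the response rule `k <= p` is invented so that the response depends on the preparation.

```python
from fractions import Fraction

# Toy microsystems (p, r, k): preparation index, method index, internal index.
M = frozenset((p, r, k) for p in range(3) for r in range(2) for k in range(4))

def lam(x, y):
    """Assumed counting model for lambda_Q, lambda_R0 and lambda_S alike."""
    x, y = frozenset(x), frozenset(y)
    if not x or not y <= x:
        raise ValueError("lam(x, y) needs y to be a subset of a nonempty x")
    return Fraction(len(y), len(x))

def prep(*ps):
    return frozenset(x for x in M if x[0] in ps)

def method(*rs):
    return frozenset(x for x in M if x[1] in rs)

a1, a2 = prep(0, 1, 2), prep(0, 1)        # a2 finer than a1
b10, b20 = method(0, 1), method(0)        # b20 finer than b10

aps71 = lam(a1 & b10, a2 & b10) == lam(a1, a2)
aps72 = lam(a1 & b10, a1 & b20) == lam(b10, b20)

# A registration procedure b outside R0: the (hypothetical) response depends on p.
b = frozenset(x for x in M if x[2] <= x[0])
finer = lam(a1 & b, a2 & b)

print(aps71, aps72, finer, lam(a1, a2))
```

In this model both parts of APS 7 hold for the registration methods, while for the response set `b` the relative frequency `finer` differs from $\lambda_{\mathcal{Q}}(a_1, a_2)$, in line with the note following APS 7.2.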
4.4 The Concept of a Physical System
We shall now replace the intuitive notion of a physical system by a well-defined
one, one defined in terms of the mathematical structure “preparation
and registration procedures” and its physical interpretation (as expressed in
terms of the mapping principles). Since the mapping principles are involved
in this concept, it is not a derived one, that is, a concept which is defined only
by terms and structures deduced in the mathematical theory. Such derived concepts will be
introduced later, for example, the derived concepts of ensemble, effect,
observable (see III and IV).
The expression “physical system” will represent those facts which are to be
mapped to elements of the set $M$ together with the well-defined physical
processes which are to be mapped to the elements of $\mathcal{Q}$, $\mathcal{R}$ and $\mathcal{R}_0$, whereby
the letters $\mathcal{Q}$, $\mathcal{R}$ and $\mathcal{R}_0$ denote not only the sets but also the structures
defined by axioms APS 1-7 (see footnote 1). In less precise terms we say that a physical
system is an element of the set $M$ endowed with the structure $\mathcal{Q}$, $\mathcal{R}$ and $\mathcal{R}_0$.
In III, §4.1 we shall examine the distinctions which must be made between
the concept of a “physical system” and the concept of a “physical object.” It is
important to note that the concept of a physical system is not necessarily
restricted to microsystems; see, for example [1], §12 and [2], XV-XVI and
[13].
The facts which we have called “physical systems” have some reality
beyond that of the “direct” interpretation of preparation and registration
procedures. This additional meaning results from axioms APS 1 and APS 7,
which imply that the preparation procedure is independent of the registration
procedure. Intuitively, this means that, in the preparation, “something”
is produced (that which we have called a physical system) which can
be detected by the registration apparatus. A more precise formulation of this
state of affairs is presented in [2], XVI and [13].
But the facts which we have called “physical systems” are so closely related
to the associated production and detection methods that the typical characterization
of physical objects (see §1 and III, §4.1) by objective properties of
the systems has not been possible until now.
The statement of facts which are self-existing in the sense that they do not
exert any influence on other things is self-contradictory. Such facts are
completely inaccessible. Nevertheless, in physics we endeavor to describe, as
completely as possible, portions of the real world as if these portions were
self-existing. The attempt to describe the real world in complete detail would
make physics impossible. Physics is possible only because we are able to
make structural assertions about portions of the real world without taking
into account the structure of the whole world in all its particulars. Only a few
“global structures” of the world as a whole are introduced into the
description of the physics of its parts, as, for example, the space-time
structure (and gravitation, which for sufficiently small regions of space can be
neglected due to the existence of local inertial reference systems).
We have made the assumption that the experiments composed of a
preparation and a registration procedure can be described as such portions
of the world using only space-time as a global structure of the world.
1. Here we shall require the use of axiom APS 5 in its strengthened form as presented in
III, §1. We shall use the expression “carriers of physical interaction” to denote the set M when
we require only APS 5 in the form presented above.
Is it possible to describe the physical systems as such “self-existing”
portions of the world? It would be possible, if the physical systems were
physical objects in the sense of §1 and III, §4.1. But since the microsystems
are not physical objects (see IV, §8.1), we shall try to eliminate “as far as
possible” details of the structure of the preparation and registration
apparatuses. This problem is treated in III and IV.
In the case of a microsystem, this problem is not as simple as in the case of
a macrosystem because the latter may be described as a physical object. The
concept of a physical system described above is too general because it is
applicable to both macro- and microsystems. The selection procedures in $\mathcal{S}$
describe a completely conventional “classical statistics” which does not
exhibit the “typical” quantum mechanical structure. At what point do we
make the transition to “quantum statistics” which appears to be so different?
This transition will be made in III, §3 where we introduce axiom AV 4s
instead of the postulate that the physical systems are physical objects.
We are able to describe the structure of a physical system only to the extent
that the physical system can be prepared and registered. This statement will
be formulated in mathematical terms by the following axiom:
APS 8.1. $M = \bigcup_{a \in \mathcal{Q}} a$.
APS 8.2. $M = \bigcup_{b \in \mathcal{R}} b$.
By APS 4.2 we find that APS 8.2 is equivalent to
$M = \bigcup_{b_0 \in \mathcal{R}_0} b_0$.
Axiom APS 8 does not describe a profound aspect of reality because the
preparation and registration procedures are arbitrary and permit “all
possibilities.” The axiom states that every physical system interacts with the
outside world at least once (APS 8.1) and again (APS 8.2).
4.5 The Structure of Probability Fields for Physical Systems
In this section we shall derive those theorems which may be obtained from
the above axioms (that is, without the axioms in III, §1) in order to reduce the
problem of the probability structure for the selection procedures of $\mathcal{S}$ to a
manageable subset. First, we shall consider how $\mathcal{S}$ is obtained from $\Theta$.
We shall now consider a finite collection $a_i \in \mathcal{Q}$; $i = 1, \ldots, n$ for which
$a_i \subset a \in \mathcal{Q}$, $a_i \cap a_j = \emptyset$ for $i \neq j$, and a corresponding collection $b_k \in \mathcal{R}$;
$k = 1, \ldots, m$ for which $b_k \subset b \in \mathcal{R}$, $b_k \cap b_l = \emptyset$ for $k \neq l$. Let $A$ be a subset
of ordered pairs $(i, k)$, $i = 1, \ldots, n$ and $k = 1, \ldots, m$. From $a \cap b \in \Theta$
and Th. 4.3.2 we obtain $a_i \cap b_k \in \Theta$. Furthermore, we find that
$\bigcup_{(i,k) \in A} a_i \cap b_k \subset a \cap b$.
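Let $\tilde{\mathcal{S}}$ denote the set of all $c \subset M$ which may be represented in the form
$c = \bigcup_{(i,k) \in A} a_i \cap b_k$ where the $a_i$, $b_k$, $A$ satisfy the above conditions. In
addition, let $\mathcal{S}'$ denote the following set:
$\mathcal{S}' = \{c \mid c = \bigcup_{i=1}^{n} c_i,\ c_i \cap c_k = \emptyset$ for $i \neq k$, $c_i \in \Theta$ and $c \subset d \in \Theta\}$. (4.5.1)
Then we find that:
Th. 4.5.1. $\tilde{\mathcal{S}} = \mathcal{S}' = \mathcal{S}$.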
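PROOF. Since $\mathcal{S}(a \cap b)$ is a Boolean ring, and since $a_i \cap b_k \in \Theta \subset \mathcal{S}$ and
$a_i \cap b_k \subset a \cap b \in \Theta$, we find that $c = \bigcup_{(i,k) \in A} a_i \cap b_k \in \mathcal{S}$. Thus we obtain
$\tilde{\mathcal{S}} \subset \mathcal{S}$. We also obtain $\tilde{\mathcal{S}} = \mathcal{S}$ when we prove that $\tilde{\mathcal{S}}$ is a system of selection
procedures, that is, we show that $\tilde{\mathcal{S}}$ satisfies both conditions AS 1.1 and AS 1.2.
From the definitions of $\tilde{\mathcal{S}}$ and $\mathcal{S}'$, from $a_i \cap b_k \in \Theta$ and $a \cap b \in \Theta$ it follows that
$\tilde{\mathcal{S}} \subset \mathcal{S}'$. Since $\mathcal{S}(d)$ is a Boolean ring, it follows that $\mathcal{S}' \subset \mathcal{S}$. Thus we need only
prove that $\tilde{\mathcal{S}} = \mathcal{S}$.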
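Let $c = \bigcup_{(i,k) \in A} a_i \cap b_k$ and $\tilde{c} = \bigcup_{(j,l) \in \tilde{A}} \tilde{a}_j \cap \tilde{b}_l$ be two elements of $\tilde{\mathcal{S}}$ represented
in the above form where $i = 1, \ldots, n$; $k = 1, \ldots, m$; $a_i \subset a$, $b_k \subset b$;
$j = 1, \ldots, \tilde{n}$; $l = 1, \ldots, \tilde{m}$; $\tilde{a}_j \subset \tilde{a}$, $\tilde{b}_l \subset \tilde{b}$. Then we obtain
$c \cap \tilde{c} = \bigcup_{(i,k) \in A,\ (j,l) \in \tilde{A}} (a_i \cap \tilde{a}_j \cap b_k \cap \tilde{b}_l)$. (4.5.2)
Let $a_{ij} = a_i \cap \tilde{a}_j$, $b_{kl} = b_k \cap \tilde{b}_l$. We find that $a_{ij} \in \mathcal{Q}$, $b_{kl} \in \mathcal{R}$ and $a_{ij} \subset a \cap \tilde{a} \in \mathcal{Q}$,
$b_{kl} \subset b \cap \tilde{b} \in \mathcal{R}$. Clearly $a_{ij} \cap a_{i'j'} = \emptyset$ for $(i, j) \neq (i', j')$; a similar result holds for
the $b_{kl}$. Thus, from (4.5.2) and Th. 4.3.2, it follows that $c \cap \tilde{c} \in \tilde{\mathcal{S}}$, that is, $\tilde{\mathcal{S}}$ satisfies
axiom AS 1.2.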
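In order to show that AS 1.1 is satisfied, we shall assume that $\tilde{c} \subset c$ and compute
$c \setminus \tilde{c}$.
Since $\tilde{c} \subset c$, we obtain $\tilde{c} = c \cap \tilde{c}$, from which we obtain, using (4.5.2),
$c \setminus \tilde{c} = \bigcup_{(i,k) \in A} \Big[ (a_i \cap b_k) \setminus \bigcup_{(j,l) \in \tilde{A}} (a_i \cap \tilde{a}_j \cap b_k \cap \tilde{b}_l) \Big]$. (4.5.3)
Since $\mathcal{Q}$ and $\mathcal{R}$ are selection procedures, we obtain $a_{i,\tilde{n}+1} \in \mathcal{Q}$, $b_{k,\tilde{m}+1} \in \mathcal{R}$, where
$a_{i,\tilde{n}+1} = a_i \setminus \bigcup_{j=1}^{\tilde{n}} (a_i \cap \tilde{a}_j)$,
$b_{k,\tilde{m}+1} = b_k \setminus \bigcup_{l=1}^{\tilde{m}} (b_k \cap \tilde{b}_l)$. (4.5.4)
In addition, we find that
$a_{ij} \subset a$ for $i = 1, \ldots, n$; $j = 1, \ldots, \tilde{n}, \tilde{n} + 1$,
$b_{kl} \subset b$ for $k = 1, \ldots, m$; $l = 1, \ldots, \tilde{m}, \tilde{m} + 1$
and for $j = 1, \ldots, \tilde{n} + 1$ we obtain
$a_{ij} \cap a_{i'j'} = \emptyset$ for $(i, j) \neq (i', j')$
and a similar result for the $b_{kl}$. (4.5.5a)
From (4.5.4) we conclude that
$a_i = \bigcup_{j=1}^{\tilde{n}+1} a_{ij}$, $\quad b_k = \bigcup_{l=1}^{\tilde{m}+1} b_{kl}$
and
$a_i \cap b_k = \bigcup_{j=1}^{\tilde{n}+1} \bigcup_{l=1}^{\tilde{m}+1} a_{ij} \cap b_{kl}$. (4.5.5b)
Let
$B = \{(j, l) \mid j = 1, \ldots, \tilde{n} + 1;\ l = 1, \ldots, \tilde{m} + 1\}$.
From (4.5.3) we obtain
$c \setminus \tilde{c} = \bigcup_{(i,k) \in A}\ \bigcup_{(j,l) \in B \setminus \tilde{A}} (a_{ij} \cap b_{kl})$. (4.5.6)
From (4.5.6) together with (4.5.5a) and (4.5.5b) we conclude that $c \setminus \tilde{c} \in \tilde{\mathcal{S}}$.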
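Th. 4.5.2. The function $\lambda_{\mathcal{S}}$ for $\mathcal{S}$ is uniquely determined by $\lambda_{\mathcal{Q}}$ and by the
special values $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$ for those $(a, b_0) \in C$ where $C$ is defined by
(4.3.1) and $b \in \mathcal{R}$ with $b \subset b_0$.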
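PROOF. Since $\mathcal{S} = \mathcal{S}'$ (by Th. 4.5.1) we may express the $c$, $\tilde{c}$ in $\lambda_{\mathcal{S}}(c, \tilde{c})$ in the form:
$c = \bigcup_{i=1}^{n} c_i$, $\quad \tilde{c} = \bigcup_{k=1}^{m} \tilde{c}_k$, (4.5.7)
where $c_i \in \Theta$, $\tilde{c}_k \in \Theta$ and $c_i \cap c_k = \emptyset$ for $i \neq k$, $\tilde{c}_k \cap \tilde{c}_l = \emptyset$ for $k \neq l$, where
$c \subset d \in \Theta$, $\tilde{c} \subset \tilde{d} \in \Theta$. Since $\tilde{c} \subset c \subset d$ we can set $\tilde{d} = d$ without loss of
generality.
From (4.5.7) and (3.5) we obtain
$\lambda_{\mathcal{S}}(c, \tilde{c}) = \sum_{k=1}^{m} \lambda_{\mathcal{S}}(c, \tilde{c}_k)$. (4.5.8)
Since $\tilde{c}_k \subset c \subset d \in \Theta$, from AS 2.2 we obtain
$\lambda_{\mathcal{S}}(d, \tilde{c}_k) = \lambda_{\mathcal{S}}(d, c)\,\lambda_{\mathcal{S}}(c, \tilde{c}_k)$.
Since $c \neq \emptyset$ (otherwise $\lambda_{\mathcal{S}}(c, \ldots)$ would not be defined) we find that, by AS 2.3,
$\lambda_{\mathcal{S}}(d, c) \neq 0$ and therefore
$\lambda_{\mathcal{S}}(c, \tilde{c}_k) = \dfrac{\lambda_{\mathcal{S}}(d, \tilde{c}_k)}{\lambda_{\mathcal{S}}(d, c)}$. (4.5.9)
Using (3.5) and (4.5.7) we find that
$\lambda_{\mathcal{S}}(d, c) = \sum_{i=1}^{n} \lambda_{\mathcal{S}}(d, c_i)$.
Thus (4.5.9) becomes
$\lambda_{\mathcal{S}}(c, \tilde{c}_k) = \dfrac{\lambda_{\mathcal{S}}(d, \tilde{c}_k)}{\sum_{i=1}^{n} \lambda_{\mathcal{S}}(d, c_i)}$. (4.5.10)
By substitution of (4.5.10) into (4.5.8) we obtain
$\lambda_{\mathcal{S}}(c, \tilde{c}) = \dfrac{\sum_{k=1}^{m} \lambda_{\mathcal{S}}(d, \tilde{c}_k)}{\sum_{i=1}^{n} \lambda_{\mathcal{S}}(d, c_i)}$. (4.5.11)
Since $d, c_i, \tilde{c}_k \in \Theta$, $\lambda_{\mathcal{S}}$ is determined by the special values $\lambda_{\mathcal{S}}(d, c)$ for $c \subset d$ and
$c, d \in \Theta$.
Let $d = a \cap b$, $a \in \mathcal{Q}$, $b \in \mathcal{R}$ and $c = \tilde{a} \cap \tilde{b}$, $\tilde{a} \in \mathcal{Q}$, $\tilde{b} \in \mathcal{R}$. Then if $c \subset d$ we obtain
$c = (a \cap \tilde{a}) \cap (b \cap \tilde{b})$.
That is, if $\hat{a} = a \cap \tilde{a}$ and $\hat{b} = b \cap \tilde{b}$, then $c = \hat{a} \cap \hat{b}$ where $\hat{a} \in \mathcal{Q}$, $\hat{b} \in \mathcal{R}$ and $\hat{a} \subset a$,
$\hat{b} \subset b$. Replacing $\tilde{a}$ by $\hat{a}$ and $\tilde{b}$ by $\hat{b}$, we may therefore assume that $\tilde{a} \subset a$ and $\tilde{b} \subset b$.
According to D 4.3.3 there exists a $b_0 \in \mathcal{R}_0$ for which $b \subset b_0$ and $(a, b_0) \in C$;
consequently $a \cap b_0 \in \Theta$ and $\tilde{b} \subset b \subset b_0$. Since $a \cap b \neq \emptyset$ (otherwise
$\lambda_{\mathcal{S}}(d, c) = \lambda_{\mathcal{S}}(a \cap b, \tilde{a} \cap \tilde{b})$ would not be defined) we find that $a \cap b_0 \neq \emptyset$ and
$\lambda_{\mathcal{S}}(a \cap b_0, a \cap b) \neq 0$.
Using the following equation (obtained from AS 2.2)
$\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)\,\lambda_{\mathcal{S}}(a \cap b, \tilde{a} \cap \tilde{b}) = \lambda_{\mathcal{S}}(a \cap b_0, \tilde{a} \cap \tilde{b})$
we obtain
$\lambda_{\mathcal{S}}(a \cap b, \tilde{a} \cap \tilde{b}) = \dfrac{\lambda_{\mathcal{S}}(a \cap b_0, \tilde{a} \cap \tilde{b})}{\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)}$. (4.5.12)
Similarly, from AS 2.2 it follows that
$\lambda_{\mathcal{S}}(a \cap b_0, \tilde{a} \cap \tilde{b}) = \lambda_{\mathcal{S}}(a \cap b_0, \tilde{a} \cap b_0)\,\lambda_{\mathcal{S}}(\tilde{a} \cap b_0, \tilde{a} \cap \tilde{b})$ (4.5.13)
because $\tilde{a} \cap b_0 \neq \emptyset$ whenever $\tilde{a} \neq \emptyset$ according to D 4.3.1.
If $\tilde{a} \neq \emptyset$, then by (4.5.13) and APS 7.1 we obtain
$\lambda_{\mathcal{S}}(a \cap b_0, \tilde{a} \cap \tilde{b}) = \lambda_{\mathcal{Q}}(a, \tilde{a})\,\lambda_{\mathcal{S}}(\tilde{a} \cap b_0, \tilde{a} \cap \tilde{b})$;
from (4.5.12) we obtain, for $\tilde{a} \neq \emptyset$,
$\lambda_{\mathcal{S}}(a \cap b, \tilde{a} \cap \tilde{b}) = \dfrac{\lambda_{\mathcal{Q}}(a, \tilde{a})\,\lambda_{\mathcal{S}}(\tilde{a} \cap b_0, \tilde{a} \cap \tilde{b})}{\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)}$. (4.5.14)
For the case in which $\tilde{a} = \emptyset$, we obtain
$\lambda_{\mathcal{S}}(a \cap b, \tilde{a} \cap \tilde{b}) = 0$.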
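The reduction formula (4.5.14) can be checked numerically. The following Python sketch is only an illustration under assumptions that do not come from the text: microsystems are triples (preparation index, method index, internal index), every probability function is a counting ratio, and the response rule `k <= p` is hypothetical. In the code, `a_t` and `b_t` stand for the finer procedures written with a tilde above.

```python
from fractions import Fraction

# Toy microsystems (p, r, k): preparation index, method index, internal index.
M = frozenset((p, r, k) for p in range(3) for r in range(2) for k in range(4))

def lam(x, y):
    """Assumed counting model: lam(x, y) = |y| / |x| for y a subset of nonempty x."""
    x, y = frozenset(x), frozenset(y)
    if not x or not y <= x:
        raise ValueError("lam(x, y) needs y to be a subset of a nonempty x")
    return Fraction(len(y), len(x))

def prep(*ps):
    return frozenset(x for x in M if x[0] in ps)

def method(*rs):
    return frozenset(x for x in M if x[1] in rs)

a, a_t = prep(0, 1, 2), prep(0, 1)                 # a_t finer than a
b0 = method(0)                                     # registration method
b = frozenset(x for x in b0 if x[2] <= x[0])       # hypothetical response set, b inside b0
b_t = frozenset(x for x in b if x[2] == 0)         # finer procedure, b_t inside b

# Left side of (4.5.14): general value of lambda_S on elements of Theta
lhs = lam(a & b, a_t & b_t)
# Right side: lambda_Q times two "special values" of the form lam(x & b0, x & y)
rhs = lam(a, a_t) * lam(a_t & b0, a_t & b_t) / lam(a & b0, a & b)

print(lhs, rhs)
```

Both sides evaluate to the same fraction, so in this model the general value is recovered from $\lambda_{\mathcal{Q}}$ and from values of the special form $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$ only.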
According to Th. 4.5.2, in physics it is only necessary to consider those
$\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$ for which $(a, b_0) \in C$, $b \in \mathcal{R}$, $b \subset b_0$; the role of an experiment
is to “measure” these special values of the function $\lambda_{\mathcal{S}}$. We now define
D 4.5.1. $\mathcal{F} = \{(b_0, b) \mid b_0 \in \mathcal{R}_0,\ b \in \mathcal{R},\ b \subset b_0\}$.
$\mathcal{F}$ is called the set of “effect processes” or “questions.”
D 4.5.2. $\mathcal{C} = \{(a, f) \mid f = (b_0, b) \in \mathcal{F}$ and $(a, b_0) \in C\}$,
where $C$ is defined by D 4.3.1.
D 4.5.3. On $\mathcal{C}$ we define the real function $\mu$ by
$\mu(a, f) = \mu(a, (b_0, b)) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$.
From Th. 4.5.2 it follows that $\lambda_{\mathcal{S}}$ is determined for all of $\mathcal{S}$ by $\lambda_{\mathcal{Q}}$ and the
values of $\mu$ on $\mathcal{C}$.
The function $\mu$ defined by D 4.5.3 plays a central role in the statistical
description of a physical system. We shall now briefly illustrate the nature of
the experiments which may be used to test the function $\mu$.
These experiments consist of a preparation procedure $a$ by which a single
system $x \in M$ is produced, and upon which a registration method $b_0$ is
applied. As the result of the interaction of the system with the apparatus, a
process characterized by $b$ may or may not occur. Let $x_1, \ldots, x_N$ denote $N$
systems which are prepared according to $a$ and to which $b_0$ is applied, that
is, $x_i \in a \cap b_0$ for $i = 1, \ldots, N$. Of these systems, suppose that $N_1$ systems
$x_{i_\nu}$, $\nu = 1, \ldots, N_1$ activate the process characterized by $b$, that is, $x_{i_\nu} \in b$.
According to the physical meaning of $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$, if the theory is applicable,
we must have approximate agreement of $N_1/N$ with $\mu(a, (b_0, b)) =
\mu(a, f) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$.
This basic experimental situation will be repeatedly encountered in the
comparison of quantum theory with experiment. We shall not give an
account of the numerous examples here. Instead we shall prove a number of
theorems about $\mu(a, f)$.
In order to formulate the following theorems we shall now define
D 4.5.4. $1_{b_0} = (b_0, b_0)$ and $0_{b_0} = (b_0, \emptyset)$; $1_{b_0}$ and $0_{b_0}$ are elements of $\mathcal{F}$.
D 4.5.5. Suppose that $b_0 \in \mathcal{R}_0$ is partitioned by $b_i$, $i = 1, \ldots, n$, that is,
$b_0 = \bigcup_{i=1}^{n} b_i$ and $b_i \cap b_j = \emptyset$ for $i \neq j$. We call $f_i = (b_0, b_i)$ a (disjoint)
partition of $1_{b_0}$ and write $1_{b_0} = \bigcup_{i=1}^{n} f_i$.
The following two theorems describe the central structural properties of
the probability function $\lambda_{\mathcal{S}}$.
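Th. 4.5.3. For the function $\mu$ the following conditions are satisfied:
(i) $0 \le \mu(a, f) \le 1$.
(ii) For each $a \in \mathcal{Q}'$ there exists an $f_0 \in \mathcal{F}$ for which $\mu(a, f_0) = 0$.
(iii) For each $a \in \mathcal{Q}'$ there exists an $f_1 \in \mathcal{F}$ for which $\mu(a, f_1) = 1$.
(iv) For a decomposition $a = \bigcup_{i=1}^{n} a_i$ (see D 3.2) the following condition
holds for all $f$ for which $(a, f) \in \mathcal{C}$:
$\mu(a, f) = \sum_{i=1}^{n} \lambda_i\,\mu(a_i, f)$,
where
$0 < \lambda_i = \lambda_{\mathcal{Q}}(a, a_i) \le 1$ and $\sum_{i=1}^{n} \lambda_i = 1$.
(v) Let $b_{01}, b_{02} \in \mathcal{R}_0$ and $b \subset b_{02} \subset b_{01}$, $f_1 = (b_{01}, b)$, $f_2 = (b_{02}, b)$.
Then the following condition holds for all $a$ for which $(a, f_1) \in \mathcal{C}$:
$\mu(a, f_1) = \lambda_{\mathcal{R}_0}(b_{01}, b_{02})\,\mu(a, f_2)$.
(vi) Let $(a, b_0) \in C$; if $1_{b_0} = \bigcup_{i=1}^{n} f_i$ (see D 4.5.5) then $\sum_{i=1}^{n} \mu(a, f_i) = 1$.
(vii) If $(a, f) \in \mathcal{C}$ where $f = (b_0, b)$ then $\mu(a, f) = 0$ if and only if
$a \cap b = \emptyset$.
(viii) Let $b_0 = \bigcup_{i=1}^{n} b_{0i}$ where $b_{0i} \in \mathcal{R}_0$ and $b_{0i} \cap b_{0k} = \emptyset$ for $i \neq k$ (that
is, $\bigcup_{i=1}^{n} b_{0i}$ is a decomposition of $b_0$ in $\mathcal{R}_0$). Then, for $f = (b_0, b)$,
$f_i = (b_{0i}, b_{0i} \cap b)$, and for every $a \in \mathcal{Q}$ which satisfies $(a, f) \in \mathcal{C}$ we
obtain
$\mu(a, f) = \sum_{i=1}^{n} \lambda_{\mathcal{R}_0}(b_0, b_{0i})\,\mu(a, f_i)$
and
$\sum_{i=1}^{n} \lambda_{\mathcal{R}_0}(b_0, b_{0i}) = 1$.
(In this case we say that the $f_i$ describe a decomposition of $f$ which is
induced by the decomposition of $b_0$.)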
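Conditions (iv) and (vi) can be illustrated in a finite toy model. The Python sketch below is not part of the text: it assumes microsystems given as triples (preparation index, method index, internal index), counting ratios for all probability functions, and an invented response rule `k <= p`. The function `mu` follows D 4.5.3 within this assumed model.

```python
from fractions import Fraction

# Toy microsystems (p, r, k): preparation index, method index, internal index.
M = frozenset((p, r, k) for p in range(3) for r in range(2) for k in range(4))

def lam(x, y):
    """Assumed counting model: lam(x, y) = |y| / |x| for y a subset of nonempty x."""
    x, y = frozenset(x), frozenset(y)
    if not x or not y <= x:
        raise ValueError("lam(x, y) needs y to be a subset of a nonempty x")
    return Fraction(len(y), len(x))

def prep(*ps):
    return frozenset(x for x in M if x[0] in ps)

def method(*rs):
    return frozenset(x for x in M if x[1] in rs)

def mu(a, b0, b):
    """mu(a, (b0, b)) = lambda_S(a & b0, a & b), as in D 4.5.3."""
    return lam(a & b0, a & b)

a = prep(0, 1, 2)
parts = [prep(0), prep(1, 2)]                      # decomposition of a (D 3.2)
b0 = method(0)
b = frozenset(x for x in b0 if x[2] <= x[0])       # hypothetical response set

# (iv): mu of the mixture equals the weighted sum over its components
mu_a = mu(a, b0, b)
mixture = sum(lam(a, a_i) * mu(a_i, b0, b) for a_i in parts)

# (vi): the partition of b0 into b and its complement gives total probability 1
total = mu(a, b0, b) + mu(a, b0, b0 - b)

print(mu_a, mixture, total)
```

The components have different response probabilities (1/4 and 5/8 in this model), and the mixture value 1/2 is their average with the weights $\lambda_i$.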
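PROOF. According to APS 5.1.1, to each $a \in \mathcal{Q}'$ there is at least one $b_0 \in \mathcal{R}_0$ for
which $(a, b_0) \in C$. For such a $b_0$, from (vi) it follows that (with $n = 1$) $f_1 = 1_{b_0}$, thus
proving (iii); similarly, let $f_0 = 0_{b_0}$ (that is, $b = \emptyset$). From (vii) we obtain (ii). We obtain a
proof of (i) from the corresponding property of $\lambda_{\mathcal{S}}$. Thus, it is only necessary to
prove (iv)-(viii).
(iv) Since $(a, f) \in \mathcal{C}$ where $f = (b_0, b)$, it follows that $(a, b_0) \in C$. Then by
Th. 4.3.1 we obtain $(a_i, b_0) \in C$, that is, $(a_i, f) \in \mathcal{C}$. Thus all the $\mu(a_i, f)$ are defined.
Thus, we find that
$1 = \lambda_{\mathcal{Q}}\Big(a, \bigcup_{i=1}^{n} a_i\Big) = \sum_{i=1}^{n} \lambda_{\mathcal{Q}}(a, a_i) = \sum_{i=1}^{n} \lambda_i$.
Since $a \cap b = \bigcup_{i=1}^{n} (a_i \cap b)$ we obtain
$\mu(a, f) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = \sum_{i=1}^{n} \lambda_{\mathcal{S}}(a \cap b_0, a_i \cap b)$
$= \sum_{i=1}^{n} \lambda_{\mathcal{S}}(a \cap b_0, a_i \cap b_0)\,\lambda_{\mathcal{S}}(a_i \cap b_0, a_i \cap b)$
$= \sum_{i=1}^{n} \lambda_{\mathcal{Q}}(a, a_i)\,\lambda_{\mathcal{S}}(a_i \cap b_0, a_i \cap b) = \sum_{i=1}^{n} \lambda_i\,\mu(a_i, f)$.
(v) From $(a, f_1) \in \mathcal{C}$ we obtain $(a, b_{01}) \in C$. From Th. 4.3.1 we obtain
$(a, b_{02}) \in C$ and therefore $(a, f_2) \in \mathcal{C}$. Thus we obtain
$\mu(a, f_1) = \lambda_{\mathcal{S}}(a \cap b_{01}, a \cap b) = \lambda_{\mathcal{S}}(a \cap b_{01}, a \cap b_{02})\,\lambda_{\mathcal{S}}(a \cap b_{02}, a \cap b)$
$= \lambda_{\mathcal{R}_0}(b_{01}, b_{02})\,\lambda_{\mathcal{S}}(a \cap b_{02}, a \cap b) = \lambda_{\mathcal{R}_0}(b_{01}, b_{02})\,\mu(a, f_2)$.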
34 II Microsystems, Preparation, and Registration Procedures
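(vi) Let $b_0 = \bigcup_{i=1}^{n} b_i$ be a disjoint partition of $b_0$, and let $(a, b_0) \in C$. Then we find that
$$1 = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_0) = \sum_{i=1}^{n} \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_i) = \sum_{i=1}^{n} \mu(a, f_i).$$
(vii) Let $(a, b_0) \in C$; then if $\mu(a, f) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = 0$, from AS 2.3 it follows that $a \cap b = \emptyset$. If $a \cap b \neq \emptyset$, then by (3.3) we obtain $\mu(a, f) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b) \neq 0$.
(viii) If $(a, b_0) \in C$ we obtain
$$\mu(a, f) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = \lambda_{\mathcal{S}}\Big(a \cap b_0, \bigcup_{i=1}^{n} (a \cap b_{0i} \cap b)\Big) = \sum_{i=1}^{n} \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_{0i} \cap b).$$
From AS 2.2, Th. 4.3.1 and APS 7.2 we obtain
$$\lambda_{\mathcal{S}}(a \cap b_0, a \cap b_{0i} \cap b) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_{0i})\, \lambda_{\mathcal{S}}(a \cap b_{0i}, a \cap b_{0i} \cap b) = \lambda_{\mathcal{R}_0}(b_0, b_{0i})\, \mu(a, f_i).$$
Since $b_0 = \bigcup_{i=1}^{n} b_{0i}$ we find that
$$1 = \lambda_{\mathcal{R}_0}(b_0, b_0) = \sum_{i=1}^{n} \lambda_{\mathcal{R}_0}(b_0, b_{0i}).$$
It is important now to show there is a partial converse to Th. 4.5.3.
Th. 4.5.4. Given $\lambda_{\mathcal{Q}}$; suppose there is a function $\mu(a, f)$ on $\mathcal{C}$ which satisfies conditions (i), (iv), (v), (vi), and (vii) of Th. 4.5.3. Then on $\mathcal{S}$ there exists one and only one probability function $\lambda_{\mathcal{S}}$ (that is, a function which satisfies AS 2) for which $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = \mu(a, f)$. This function $\lambda_{\mathcal{S}}$ satisfies the conditions APS 7 and condition (viii) of Th. 4.5.3. Also, $\lambda_{\mathcal{R}_0}$ is uniquely determined by $\mu$.
Proof (Based on a Communication by H. Neumann). We shall use the proof of Th. 4.5.2 in reverse. From the function $\mu: \mathcal{C} \to [0, 1]$ we shall seek a function $\lambda: \mathcal{T}_0 \to [0, 1]$, where $\mathcal{T}_0 = \{(d, c) \mid d, c \in \Theta \text{ where } c \subset d\}$, which is identical to the restriction of $\lambda_{\mathcal{S}}$ to $\mathcal{T}_0$. Then with the help of $\tilde{\mathcal{S}} = \mathcal{S}$ we will obtain the function $\lambda_{\mathcal{S}}: \mathcal{T}_{\mathcal{S}} \to [0, 1]$ with $\mathcal{T}_{\mathcal{S}} = \{(d, c) \mid d, c \in \mathcal{S} \text{ where } c \subset d\}$.
We shall begin with the second step, in order to better establish the conditions which $\lambda: \mathcal{T}_0 \to [0, 1]$ must satisfy in order that it may be extended to a probability function $\lambda_{\mathcal{S}}$. We will show that this extension is possible if $\lambda: \mathcal{T}_0 \to [0, 1]$ satisfies the following three conditions:
($\alpha$) If $c_1, c_2, c_3 \in \Theta$, $c_3 \subset c_2 \subset c_1$, $c_2 \neq \emptyset$ then
$$\lambda(c_1, c_3) = \lambda(c_1, c_2)\, \lambda(c_2, c_3).$$
($\beta$) If $c, c_i, \bigcup_{i=1}^{n} c_i \in \Theta$, $c \neq \emptyset$, $\bigcup_{i=1}^{n} c_i \subset c$, $c_i \cap c_j = \emptyset$ for $i \neq j$ then
$$\lambda\Big(c, \bigcup_{i=1}^{n} c_i\Big) = \sum_{i=1}^{n} \lambda(c, c_i).$$
($\gamma$) If $d, c \in \Theta$, $c \neq \emptyset$ then $\lambda(d, c) \neq 0$.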
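We now seek to define $\lambda_{\mathcal{S}}: \mathcal{T}_{\mathcal{S}} \to [0, 1]$ by (4.5.11), that is, by the equation
$$\lambda_{\mathcal{S}}(\tilde c, c) = \frac{\sum_{k=1}^{m} \lambda(d, c_k)}{\sum_{i=1}^{n} \lambda(d, \tilde c_i)} \tag{4.5.15}$$
where $\tilde c \subset d$, $d \in \Theta$ and $\tilde c \neq \emptyset$.
In order to use (4.5.15) as a definition, it is necessary that $\lambda(d, \tilde c_i) \neq 0$ for at least one value of $i$; otherwise, by ($\gamma$) we would have $\tilde c_i = \emptyset$ for all $i$, and $\tilde c = \emptyset$ in contradiction to the hypothesis that $\tilde c \neq \emptyset$.
We shall now show that $\lambda_{\mathcal{S}}$, as defined by (4.5.15), is unique. Suppose that, in addition to the decomposition $\tilde c = \bigcup_{i=1}^{n} \tilde c_i$ and $c = \bigcup_{k=1}^{m} c_k$, there is another decomposition $\tilde c = \bigcup_{j=1}^{n'} \tilde c'_j$ and $c = \bigcup_{l=1}^{m'} c'_l$. Then we find that
$$\tilde c_i = \tilde c \cap \tilde c_i = \bigcup_{j=1}^{n'} (\tilde c_i \cap \tilde c'_j) \quad \text{and} \quad c_k = c \cap c_k = \bigcup_{l=1}^{m'} (c_k \cap c'_l).$$
Then from ($\beta$) we conclude that
$$\lambda(d, \tilde c_i) = \sum_{j=1}^{n'} \lambda(d, \tilde c_i \cap \tilde c'_j), \qquad \lambda(d, c_k) = \sum_{l=1}^{m'} \lambda(d, c_k \cap c'_l)$$
and therefore
$$\lambda_{\mathcal{S}}(\tilde c, c) = \frac{\sum_{k,l} \lambda(d, c_k \cap c'_l)}{\sum_{i,j} \lambda(d, \tilde c_i \cap \tilde c'_j)}.$$
This formula is symmetric in the primed and unprimed quantities; in addition, it yields the same value for $\lambda_{\mathcal{S}}(\tilde c, c)$ if the primed quantities are used in the right side of the equation (4.5.15). Thus we find that (4.5.15) does not depend on the choice of the decomposition of $\tilde c$ and $c$. We must now show that it does not depend on the choice of $d$.
Let $\tilde c \subset d'$ where $d' \in \Theta$. Then $d' \cap d \in \Theta$ and $\tilde c \subset d \cap d'$. By ($\alpha$) we find that
$$\lambda(d, \tilde c_i) = \lambda(d, d' \cap d)\, \lambda(d' \cap d, \tilde c_i),$$
$$\lambda(d, c_k) = \lambda(d, d' \cap d)\, \lambda(d' \cap d, c_k)$$
and therefore, from (4.5.15), we obtain
$$\lambda_{\mathcal{S}}(\tilde c, c) = \frac{\sum_{k=1}^{m} \lambda(d' \cap d, c_k)}{\sum_{i=1}^{n} \lambda(d' \cap d, \tilde c_i)}.$$
This formula is symmetric in $d$ and $d'$; thus (4.5.15) is independent of the choice of $d$.
We shall now show that the function $\lambda_{\mathcal{S}}$ satisfies AS 2.1-AS 2.3.
If $c, \tilde c \in \mathcal{S}$, $c \cap \tilde c = \emptyset$ and $c_k, \tilde c_i \in \Theta$ then $\tilde c_i \cap c_k = \emptyset$ in (4.5.7) for all $i, k$. Since $c \cup \tilde c \in \mathcal{S} = \tilde{\mathcal{S}}$, there exists a $d \in \Theta$ with $c \cup \tilde c \subset d$. Since $c \cup \tilde c = \big[\bigcup_i \tilde c_i\big] \cup \big[\bigcup_k c_k\big]$, then by (4.5.15) we obtain
$$\lambda_{\mathcal{S}}(c \cup \tilde c, c) = \frac{\sum_k \lambda(d, c_k)}{\sum_i \lambda(d, \tilde c_i) + \sum_k \lambda(d, c_k)}$$
and
$$\lambda_{\mathcal{S}}(c \cup \tilde c, \tilde c) = \frac{\sum_i \lambda(d, \tilde c_i)}{\sum_i \lambda(d, \tilde c_i) + \sum_k \lambda(d, c_k)}.$$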
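Thus we find that $\lambda_{\mathcal{S}}(c \cup \tilde c, c) + \lambda_{\mathcal{S}}(c \cup \tilde c, \tilde c) = 1$, or, in other words, AS 2.1 is satisfied.
If $\hat c \subset c \subset \tilde c$ and $c \neq \emptyset$ we obtain
$$\lambda_{\mathcal{S}}(\tilde c, \hat c) = \frac{\sum_l \lambda(d, \hat c_l)}{\sum_i \lambda(d, \tilde c_i)} = \frac{\sum_k \lambda(d, c_k)}{\sum_i \lambda(d, \tilde c_i)} \cdot \frac{\sum_l \lambda(d, \hat c_l)}{\sum_k \lambda(d, c_k)} = \lambda_{\mathcal{S}}(\tilde c, c)\, \lambda_{\mathcal{S}}(c, \hat c).$$
Thus we find that AS 2.2 is satisfied.
If $c \neq \emptyset$, then at least one of the $c_k \neq \emptyset$; then according to the definition formula (4.5.15) $\lambda_{\mathcal{S}}(\tilde c, c) \neq 0$, and AS 2.3 is satisfied.
We note that the above result and Th. 4.5.2 show that a function $\lambda: \mathcal{T}_0 \to [0, 1]$ can be extended in a unique way to a probability function $\lambda_{\mathcal{S}}: \mathcal{T}_{\mathcal{S}} \to [0, 1]$ if the conditions ($\alpha$), ($\beta$), and ($\gamma$) are satisfied for $\lambda: \mathcal{T}_0 \to [0, 1]$.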
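It only remains to show that a function $\mu: \mathcal{C} \to [0, 1]$ which satisfies the conditions (i), (iv), (v), (vi), and (vii) can be extended to a function $\lambda: \mathcal{T}_0 \to [0, 1]$ which satisfies ($\alpha$), ($\beta$), and ($\gamma$) and APS 7.
We use the formula (4.5.14), that is,
$$\lambda(\tilde a \cap \tilde b, a \cap b) = \frac{\lambda_{\mathcal{Q}}(\tilde a, a)\, \mu(a, f)}{\mu(\tilde a, \tilde f)} \tag{4.5.16}$$
where $\tilde a \cap \tilde b \neq \emptyset$, $f = (b_0, b)$, $\tilde f = (b_0, \tilde b)$ and $\tilde b \subset b_0$, in order to define $\lambda$.
In order that (4.5.16) be well defined, it is necessary that $\mu(\tilde a, \tilde f) \neq 0$. From $\mu(\tilde a, \tilde f) = 0$ it follows by (vii) that $\tilde a \cap \tilde b = \emptyset$, in contradiction to the hypothesis $\tilde a \cap \tilde b \neq \emptyset$. In order to apply (4.5.16), $\mu(a, f)$ must be defined, that is, $a \subset \tilde a$, $a \neq \emptyset$ and $(a, b_0) \in C$.
According to D 4.3.3, $b_0 \in \mathcal{R}_0$ can be chosen so that $(\tilde a, b_0) \in C$ because $\tilde a \cap \tilde b \in \Theta$. Then, by Th. 4.2.1 it follows that $(a, b_0) \in C$.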
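In order to show that the function defined by (4.5.16) is unique, we shall assume that $\tilde a \cap \tilde b = \tilde a_1 \cap \tilde b_1$ and $a \cap b = a_1 \cap b_1$ where $a_1 \subset \tilde a_1$, $b_1 \subset \tilde b_1$. We must show that
$$\frac{\lambda_{\mathcal{Q}}(\tilde a, a)\, \mu(a, f)}{\mu(\tilde a, \tilde f)} = \frac{\lambda_{\mathcal{Q}}(\tilde a_1, a_1)\, \mu(a_1, f_1)}{\mu(\tilde a_1, \tilde f_1)} \tag{4.5.17}$$
where $f_1 = (b_{01}, b_1)$ and $\tilde f_1 = (b_{01}, \tilde b_1)$. If $a \cap b = a_1 \cap b_1 = \emptyset$, then the left side of (4.5.17) is zero because either $\lambda_{\mathcal{Q}}(\tilde a, a) = 0$ for $a = \emptyset$ or $\mu(a, f) = 0$ according to (vii); a similar result holds for $a_1, b_1$ on the right side. We need, therefore, only consider the case in which $a \cap b = a_1 \cap b_1 \neq \emptyset$. We rewrite the left side of (4.5.17) using (vi). We obtain
$$\mu(\tilde a, (b_0, \tilde b)) = \mu(\tilde a, (b_0, \tilde b \cap \tilde b_1)) + \mu(\tilde a, (b_0, \tilde b \setminus (\tilde b \cap \tilde b_1)))$$
and
$$\mu(a, (b_0, b)) = \mu(a, (b_0, b \cap b_1)) + \mu(a, (b_0, b \setminus (b \cap b_1))).$$
Since $a \cap b = a_1 \cap b_1$, we have $a \cap b = a \cap b \cap b_1$, that is, $a \cap (b \setminus (b \cap b_1)) = \emptyset$. By (vii), $\mu(a, (b_0, b \setminus (b \cap b_1))) = 0$ and $\mu(\tilde a, (b_0, \tilde b \setminus (\tilde b \cap \tilde b_1))) = 0$. The left-hand side of (4.5.17) is equal to
$$\frac{\lambda_{\mathcal{Q}}(\tilde a, a)\, \mu(a, (b_0, b \cap b_1))}{\mu(\tilde a, (b_0, \tilde b \cap \tilde b_1))}.$$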
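If $b_0 \cap b_{01} \neq \emptyset$, then by (v) we obtain:
$$\mu(a, (b_0, b \cap b_1)) = \lambda_{\mathcal{R}_0}(b_0, b_0 \cap b_{01})\, \mu(a, (b_0 \cap b_{01}, b \cap b_1)).$$
The condition $b_0 \cap b_{01} \neq \emptyset$ is satisfied because
$$b_0 \cap b_{01} \supset b \cap b_1 \supset a \cap b \cap a_1 \cap b_1 = a \cap b \neq \emptyset.$$
From (v) we also conclude that
$$\mu(\tilde a, (b_0, \tilde b \cap \tilde b_1)) = \lambda_{\mathcal{R}_0}(b_0, b_0 \cap b_{01})\, \mu(\tilde a, (b_0 \cap b_{01}, \tilde b \cap \tilde b_1)).$$
Using these results, the left-hand side of (4.5.17) is equal to
$$\frac{\lambda_{\mathcal{Q}}(\tilde a, a)\, \mu(a, (b_0 \cap b_{01}, b \cap b_1))}{\mu(\tilde a, (b_0 \cap b_{01}, \tilde b \cap \tilde b_1))}.$$
According to (iv), for all $f$ for which $(a, f) \in \mathcal{C}$ we obtain
$$\mu(a, f) = \lambda_{\mathcal{Q}}(a, a \cap a_1)\, \mu(a \cap a_1, f) + \lambda_{\mathcal{Q}}(a, a \setminus (a \cap a_1))\, \mu(a \setminus (a \cap a_1), f).$$
Since $a \cap b = a_1 \cap b_1$, it follows that $a \cap b \cap b_1 = (a \cap a_1) \cap (b \cap b_1)$, that is, $(a \setminus (a \cap a_1)) \cap (b \cap b_1) = \emptyset$. Thus, from (vii) we obtain
$$\mu(a, (b_0 \cap b_{01}, b \cap b_1)) = \lambda_{\mathcal{Q}}(a, a \cap a_1)\, \mu(a \cap a_1, (b_0 \cap b_{01}, b \cap b_1))$$
and
$$\mu(\tilde a, (b_0 \cap b_{01}, \tilde b \cap \tilde b_1)) = \lambda_{\mathcal{Q}}(\tilde a, \tilde a \cap \tilde a_1)\, \mu(\tilde a \cap \tilde a_1, (b_0 \cap b_{01}, \tilde b \cap \tilde b_1)).$$
Thus we obtain the following expression for the left side of (4.5.17):
$$\frac{\lambda_{\mathcal{Q}}(\tilde a, a)\, \lambda_{\mathcal{Q}}(a, a \cap a_1)\, \mu(a \cap a_1, (b_0 \cap b_{01}, b \cap b_1))}{\lambda_{\mathcal{Q}}(\tilde a, \tilde a \cap \tilde a_1)\, \mu(\tilde a \cap \tilde a_1, (b_0 \cap b_{01}, \tilde b \cap \tilde b_1))}.$$
For $\lambda_{\mathcal{Q}}$ we find that $\lambda_{\mathcal{Q}}(\tilde a, a)\, \lambda_{\mathcal{Q}}(a, a \cap a_1) = \lambda_{\mathcal{Q}}(\tilde a, a \cap a_1)$ and, since $\tilde a \cap \tilde a_1 \neq \emptyset$, we obtain
$$\lambda_{\mathcal{Q}}(\tilde a, a \cap a_1) = \lambda_{\mathcal{Q}}(\tilde a, \tilde a \cap \tilde a_1)\, \lambda_{\mathcal{Q}}(\tilde a \cap \tilde a_1, a \cap a_1).$$
Thus the left side of (4.5.17) may be rewritten as follows:
$$\frac{\lambda_{\mathcal{Q}}(\tilde a \cap \tilde a_1, a \cap a_1)\, \mu(a \cap a_1, (b_0 \cap b_{01}, b \cap b_1))}{\mu(\tilde a \cap \tilde a_1, (b_0 \cap b_{01}, \tilde b \cap \tilde b_1))}. \tag{4.5.18}$$
The expression (4.5.18) is symmetric under exchange of quantities having index 1 and those without index 1. The right side of (4.5.17) can be transformed in the same way into an expression which is similar to (4.5.18), proving that (4.5.16) is well defined.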
It now remains to show that the function $\lambda$ defined on $\mathcal{T}_0$ satisfies the conditions ($\alpha$), ($\beta$), ($\gamma$) and APS 7.
We shall now show that ($\alpha$) is satisfied:
Let $c_1 = a_1 \cap b_1$, $c_2 = a_2 \cap b_2 \neq \emptyset$, $c_3 = a_3 \cap b_3$ where we suppose that $c_3 \subset c_2 \subset c_1$ (and therefore find that, for example, $c_1 \cap c_2 = c_2$; that is, $c_2 = a_2 \cap b_2 = (a_1 \cap a_2) \cap (b_1 \cap b_2)$). We may therefore assume that $a_3 \subset a_2 \subset a_1$ and that $b_3 \subset b_2 \subset b_1$. According to APS 5.1.1, for $a_1$ there exists a $b_0$ such that $(a_1, b_0) \in C$; by Th. 4.3.1 we also find that $(a_2, b_0) \in C$ and (if $a_3 \neq \emptyset$) $(a_3, b_0) \in C$. If $a_3 = \emptyset$ then $c_3 = \emptyset$ and therefore $\lambda(c_1, c_3) = 0$, $\lambda(c_2, c_3) = 0$ and
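($\alpha$) is satisfied in a trivial way. Let us assume that $a_3 \neq \emptyset$; thus we find that $a_3 \cap b_0 \neq \emptyset$.
Since $a_1 \cap b_0 \neq \emptyset$ we may write
$$\lambda(c_1, c_2) = \frac{\lambda_{\mathcal{Q}}(a_1, a_2)\, \mu(a_2, (b_0, b_2))}{\mu(a_1, (b_0, b_1))}.$$
Thus we find that
$$\lambda(c_1, c_2)\, \lambda(c_2, c_3) = \frac{\lambda_{\mathcal{Q}}(a_1, a_2)\, \lambda_{\mathcal{Q}}(a_2, a_3)\, \mu(a_3, (b_0, b_3))}{\mu(a_1, (b_0, b_1))}.$$
From $\lambda_{\mathcal{Q}}(a_1, a_2)\, \lambda_{\mathcal{Q}}(a_2, a_3) = \lambda_{\mathcal{Q}}(a_1, a_3)$ we find that ($\alpha$) is satisfied.
In order to show that ($\beta$) holds, we define $c_i = a_i \cap b_i$ and $\bigcup_{i=1}^{n} c_i = c = a \cap b$. Since $c_i \subset c \subset \tilde c = \tilde a \cap \tilde b$ we may assume $a_i \subset a \subset \tilde a$ and $b_i \subset b \subset \tilde b$ as above. The $a_i$ generate a finite Boolean ring in $\mathcal{Q}(a)$; we denote the atoms of this finite subring by $a_\nu$. Similarly let $b_\mu$ denote the atoms of the finite subring obtained from $\mathcal{R}(b)$ and the $b_i$. Then we find that
$$a = \bigcup_{\nu} a_\nu, \qquad b = \bigcup_{\mu} b_\mu,$$
$$c_i = a_i \cap b_i = \bigcup_{\nu \in A_i,\ \mu \in B_i} a_\nu \cap b_\mu,$$
where $A_i$, $B_i$ are the sets of indices uniquely associated with $a_i$ and $b_i$, respectively.
From $c_i \cap c_j = \emptyset$ for $i \neq j$ it follows that
$$\bigcup_{\nu \in A_i \cap A_j,\ \mu \in B_i \cap B_j} a_\nu \cap b_\mu = \emptyset,$$
that is, $a_\nu \cap b_\mu = \emptyset$ for $\nu \in A_i \cap A_j$ and $\mu \in B_i \cap B_j$. Let $C_i = A_i \times B_i$; then for all pairs $(\nu, \mu) \in C_i \cap C_j$, we find that
$$a_\nu \cap b_\mu = \emptyset \quad \text{whenever } i \neq j. \tag{4.5.19}$$
We may describe the $c_i$ in terms of the $C_i$ as follows:
$$c_i = \bigcup_{(\nu, \mu) \in C_i} (a_\nu \cap b_\mu).$$
We obtain
$$\bigcup_{(\nu, \mu) \in \bigcup_i C_i} (a_\nu \cap b_\mu) = \bigcup_i c_i = c = a \cap b = \bigcup_{\nu, \mu} (a_\nu \cap b_\mu).$$
Let $C$ denote the set of all index pairs $(\nu, \mu)$. Then we find that
$$a_\nu \cap b_\mu = \emptyset \quad \text{for all } (\nu, \mu) \in C \setminus \bigcup_i C_i. \tag{4.5.20}$$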
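Since by APS 5.1.1 there exists a $b_0$ for which $(\tilde a, b_0) \in C$, by Th. 4.3.1 we find that $(a_\nu, b_0) \in C$. For this $b_0$ we find that
$$\lambda(\tilde c, c) = \frac{\lambda_{\mathcal{Q}}(\tilde a, a)\, \mu(a, (b_0, b))}{\mu(\tilde a, (b_0, \tilde b))}.$$
From $b_0 = (b_0 \setminus b) \cup \bigcup_\mu b_\mu$ and $b_0 = (b_0 \setminus b) \cup b$ it follows from (vi) that
$$1 = \mu(a, (b_0, b_0 \setminus b)) + \sum_\mu \mu(a, (b_0, b_\mu)),$$
$$1 = \mu(a, (b_0, b_0 \setminus b)) + \mu(a, (b_0, b))$$
and finally
$$\mu(a, (b_0, b)) = \sum_\mu \mu(a, (b_0, b_\mu)).$$
From $a = \bigcup_\nu a_\nu$, it follows from (iv) that
$$\mu(a, f) = \sum_\nu \lambda_{\mathcal{Q}}(a, a_\nu)\, \mu(a_\nu, f).$$
The preceding result, together with
$$\lambda_{\mathcal{Q}}(\tilde a, a)\, \lambda_{\mathcal{Q}}(a, a_\nu) = \lambda_{\mathcal{Q}}(\tilde a, a_\nu)$$
permits us to conclude that
$$\lambda(\tilde c, c) = \sum_{\nu, \mu} \frac{\lambda_{\mathcal{Q}}(\tilde a, a_\nu)\, \mu(a_\nu, (b_0, b_\mu))}{\mu(\tilde a, (b_0, \tilde b))}. \tag{4.5.21}$$
According to (vii), $\mu(a_\nu, (b_0, b_\mu)) = 0$ for $a_\nu \cap b_\mu = \emptyset$.
Using (4.5.19) and (4.5.20) we easily obtain
$$\lambda(\tilde c, c) = \sum_i \sum_{(\nu, \mu) \in C_i} \frac{\lambda_{\mathcal{Q}}(\tilde a, a_\nu)\, \mu(a_\nu, (b_0, b_\mu))}{\mu(\tilde a, (b_0, \tilde b))}.$$
If we show that (noting the definition of $C_i$)
$$\lambda(\tilde c, c_i) = \sum_{\nu \in A_i,\ \mu \in B_i} \frac{\lambda_{\mathcal{Q}}(\tilde a, a_\nu)\, \mu(a_\nu, (b_0, b_\mu))}{\mu(\tilde a, (b_0, \tilde b))} \tag{4.5.22}$$
then we have proven ($\beta$). Note that (4.5.22) has the same form as (4.5.21) and is therefore proven.
In order to show that ($\gamma$) holds, let $d = a_1 \cap b_1$ and $c = a_2 \cap b_2$ where we assume that $a_2 \subset a_1$, $b_2 \subset b_1$. From
$$\lambda(d, c) = \frac{\lambda_{\mathcal{Q}}(a_1, a_2)\, \mu(a_2, (b_0, b_2))}{\mu(a_1, (b_0, b_1))}$$
and (vii) it follows that ($\gamma$) is satisfied.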
We shall now show that APS 7.1 is satisfied. First, we note that
$$\lambda_{\mathcal{S}}(a_1 \cap b_{10}, a_2 \cap b_{10}) = \frac{\lambda_{\mathcal{Q}}(a_1, a_2)\, \mu(a_2, (b_{10}, b_{10}))}{\mu(a_1, (b_{10}, b_{10}))}.$$
According to (iii) (which follows from (vi)) we obtain
$$\mu(a_1, (b_{10}, b_{10})) = \mu(a_2, (b_{10}, b_{10})) = 1.$$
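Thus we obtain
$$\lambda_{\mathcal{S}}(a_1 \cap b_{10}, a_2 \cap b_{10}) = \lambda_{\mathcal{Q}}(a_1, a_2).$$
To show that APS 7.2 is satisfied, we note that
$$\lambda_{\mathcal{S}}(a_1 \cap b_{10}, a_1 \cap b_{20}) = \frac{\lambda_{\mathcal{Q}}(a_1, a_1)\, \mu(a_1, (b_{10}, b_{20}))}{\mu(a_1, (b_{10}, b_{10}))} = \mu(a_1, (b_{10}, b_{20})).$$
From (v) we find that
$$\mu(a_1, (b_{10}, b_{20})) = \lambda_{\mathcal{R}_0}(b_{10}, b_{20})\, \mu(a_1, (b_{20}, b_{20})) = \lambda_{\mathcal{R}_0}(b_{10}, b_{20}).$$
Thus we have also shown that $\lambda_{\mathcal{R}_0}$ is determined by $\mu$.
On the basis of the uniqueness proven in Th. 4.5.2, it follows from Th. 4.5.3 that the $\lambda_{\mathcal{S}}$ so defined satisfies condition (viii) of Th. 4.5.3.
Theorem 4.5.4 shows that in order to determine $\lambda_{\mathcal{S}}$ it is sufficient to test the function $\mu(a, f)$.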
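Conditions (iv), (v), and (vi) can be checked by hand in a small finite model. The following is a minimal sketch and not part of the original argument: it assumes a hypothetical counting model in which the systems are the integers 0..11 and the probability function is the relative frequency $|c|/|d|$. The sets `a`, `b0`, `b` and the helper names `lam` and `mu` are invented for illustration. In this toy model the weights are read off from intersections with the registration method, which is what APS 7.1 and APS 7.2 would identify with $\lambda_{\mathcal{Q}}$ and $\lambda_{\mathcal{R}_0}$.

```python
from fractions import Fraction

def lam(d, c):
    """Toy probability function: relative frequency of c within d (c must be a subset of d)."""
    if not d or not c <= d:
        raise ValueError("need a nonempty d and c a subset of d")
    return Fraction(len(c), len(d))

def mu(a, b0, b):
    """mu(a, f) for f = (b0, b), taken as lam(a & b0, a & b)."""
    return lam(a & b0, a & b)

# Hypothetical data (an assumption of this sketch): twelve systems labelled 0..11.
a = frozenset(range(12))                                      # preparation procedure
a_parts = [frozenset(range(0, 4)), frozenset(range(4, 12))]   # a = a_1 u a_2, disjoint
b0 = frozenset(range(2, 12))                                  # registration method
b = frozenset({2, 3, 5, 7, 11})                               # registration procedure inside b0

# (iv): mu(a, f) = sum_i lam_i * mu(a_i, f); weights taken as lam(a & b0, a_i & b0)
weights = [lam(a & b0, ai & b0) for ai in a_parts]
iv_lhs = mu(a, b0, b)
iv_rhs = sum(w * mu(ai, b0, b) for w, ai in zip(weights, a_parts))

# (vi): the frequencies over a disjoint partition of b0 add up to 1
b_parts = [frozenset({2, 3, 4}), frozenset(range(5, 12))]
vi_total = sum(mu(a, b0, bi) for bi in b_parts)

# (v): passing from b0 to the finer method b02 rescales mu by the factor
# lam(a & b0, a & b02), which plays the role of lambda_R0(b0, b02) here
b02 = frozenset(range(2, 8))
b_small = frozenset({2, 3, 5})
v_lhs = mu(a, b0, b_small)
v_rhs = lam(a & b0, a & b02) * mu(a, b02, b_small)
```

Exact fractions are used so that the equalities hold without rounding; in a frequency-counting model of this kind the conditions are bookkeeping identities, which matches the remark that they are rules for correct experimentation.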
The conditions (i), (vi), and (vii) are justified (in a trivial way) by the meaning of $\mu$ as the mathematical description of a frequency in an experiment, as we have already explained when axiom AS 2 was stated in §3.
Condition (iv) has the following intuitive interpretation: the preparation is not “influenced” by the registration (and therefore is one of the conditions defining the experiment). Otherwise, it would not be possible to conduct a test experiment.
Condition (v) states that a refinement of the registration method will be statistically independent of the preparation process; it is therefore also a condition which defines the experiment.
Conditions (i), (iv), (v), (vi), and (vii) are, for the most part, “rules for correct experimentation” rather than assertions about the physical system under investigation. The complete “information about the physical systems” is to be found in the function $\mu(a, f)$ over $\mathcal{C}$. This fact will justify our future, almost exclusive, interest in this function.
Yet this function $\mu$ also is not independent of some of the properties of the preparation and registration apparatuses. In §4.4 we shall attempt, as far as possible, to eliminate the properties of the apparatuses (that is, the properties of the function $\mu(a, f)$ which depend on the structure of the apparatuses). We would like to consider the preparation and registration processes as merely an aid to detect the physical systems and investigate their structure.
The “separation” of the preparation and registration apparatus on one side, and the physical systems on the other side, is not only nontrivial, but in the case of microsystems (that is, quantum mechanics) is not possible, at least in the desired or “expected” sense. The clarification of those concepts with which we seek to obtain the largest separation will be the goal of III and IV.
CHAPTER III
Ensembles and Effects
Many of the problems and difficulties encountered in the interpretation of quantum mechanics have the following source: the failure to clearly distinguish between a collection of microsystems obtained by means of a preparation procedure and an ensemble, where the latter is represented by a statistical (density) operator. We shall not describe these misunderstandings here. Instead we shall formulate definitions of the concepts of an “ensemble” and an “effect.” An effect is often called a “yes-no” measurement. Here we shall not yet make a distinction between an “effect” and a “decision effect”; such a distinction will be made in §3 and §6. In this book we shall show that the experiments which exhibit paradoxes for the usual interpretation of quantum mechanics can be described in a natural manner.
Here we again state that, in the formulation of quantum mechanics which will be presented here, neither the statistical operators (in particular, the projections onto a vector in Hilbert space, often called a state) nor the self-adjoint operators (which describe the so-called “observables”) will be used for direct comparison with experience, that is, with experiment. We shall describe the relationship between the mathematical description and experiment exclusively in terms of the preparation and registration procedures and the probability function $\lambda_{\mathcal{S}}$.
In the mapping principles (as described in [1], §5 or [2], III, §4) the fundamental sets are $M$, $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$ and the fundamental relations are $\lambda_{\mathcal{S}}(c, d) = \alpha$, $x \in a$ and $x \in b$. The concepts which will be introduced in III and IV will be derived concepts (see [1], §10). The additional axioms which will be introduced in III and IV are additional laws of nature in the sense of [1], §7.3 concerning $M$, $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$ and $\lambda_{\mathcal{S}}(c, d)$.
In subsequent chapters (for example, in VII) we shall introduce additional relations; these will, however, be related directly to the fundamental sets $M$, $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$. In addition, it is necessary to describe the apparatuses in more detail than is currently possible using only elements of $M$, $\mathcal{R}_0$, and $\mathcal{R}$. As we mentioned in II, this problem has not yet been solved, in general, for the case of quantum mechanics. In IV we shall again return to this problem. In IX and XVI we shall make a number of unsystematic special assumptions; in XVII we shall describe a method for the solution of a portion of this problem. A new aspect of this problem will be discussed in [13].
In III and IV we shall redefine the “usual” concepts such as “ensemble,” “state,” and “observable.” These concepts will no longer be dependent on the interpretation problems of quantum mechanics, because we have already stated the relationship between the mathematical description and experiment in II. In III and IV we shall outline (in the sense of [1], §10) certain parts of the reality domain of quantum mechanics and consider problems of the structure of microsystems within this framework.
1 Combinations of Preparation and Registration Methods
We shall now continue the discussion of the combination problem which was begun in II, §4.3. For the case in which $C = \mathcal{Q}' \times \mathcal{R}'_0$, axioms APS 5.1 and APS 5.2 become theorems. From $C = \mathcal{Q}' \times \mathcal{R}'_0$ it follows that $\mu(a, f)$ is defined for all $f \in \mathcal{F}$ and that an equivalence relation $a_1 \sim a_2$ is defined on $\mathcal{Q}'$ by the condition
$$\mu(a_1, f) = \mu(a_2, f) \quad \text{for all } f \in \mathcal{F}. \tag{1.1}$$
In our discussion of the combination problem we shall show that for microsystems, physically realistic assumptions guarantee the existence of an equivalence relation $a_1 \sim a_2$ in $\mathcal{Q}'$. Indeed, this will be proven in Th. 1.2. Assuming that Th. 1.2 holds, we shall introduce the following definition:
D 1.1. Let $\mathcal{K}$ denote the set of all equivalence classes (with respect to the equivalence relation $a_1 \sim a_2$ in $\mathcal{Q}'$). An element of $\mathcal{K}$ is called an ensemble or state; $\mathcal{K}$ is called the set of ensembles (set of states).
On the basis of the equivalence relation assumed above, we may define a function $\mu(w, f)$, $w \in \mathcal{K}$, as follows:
$$\mu(w, f) = \mu(a, f) \quad \text{where } a \in w.$$
Then $\mu(w, f)$ is defined for all pairs $(w, f)$ for which there exists an $a \in w$ for which $(a, f) \in \mathcal{C}$. Suppose that we may justify, on physical grounds, that $\mu$ is defined on all of $\mathcal{K} \times \mathcal{F}$ (that is, axiom APS 5.1.4, which will be introduced below, will be satisfied). Then an equivalence relation $f_1 \sim f_2$ on $\mathcal{F}$ will be defined by
$$f_1 \sim f_2 \quad \text{if and only if} \quad \mu(w, f_1) = \mu(w, f_2) \text{ for all } w \in \mathcal{K}.$$
We now introduce the following definition:
D 1.2. Let $\mathcal{L}$ denote the set of all equivalence classes (with respect to the equivalence relation $f_1 \sim f_2$) from $\mathcal{F}$. An element of $\mathcal{L}$ is called an effect and $\mathcal{L}$ is called the set of effects.
We obtain the following theorem:
Th. 1.1. Let $\tilde\mu$ be defined over $\mathcal{K} \times \mathcal{L}$ by $\tilde\mu(w, g) = \mu(w, f)$ where $w \in \mathcal{K}$, $g \in \mathcal{L}$, and $f \in g$. Then $\tilde\mu$ satisfies the following conditions:
(i) $0 \le \tilde\mu(w, g) \le 1$.
(ii) If $\tilde\mu(w_1, g) = \tilde\mu(w_2, g)$ for all $g \in \mathcal{L}$ then $w_1 = w_2$.
(iii) If $\tilde\mu(w, g_1) = \tilde\mu(w, g_2)$ for all $w \in \mathcal{K}$ then $g_1 = g_2$.
(iv) There exists a $g_0 \in \mathcal{L}$ such that $\tilde\mu(w, g_0) = 0$ for all $w \in \mathcal{K}$.
(v) There exists a $g_1 \in \mathcal{L}$ such that $\tilde\mu(w, g_1) = 1$ for all $w \in \mathcal{K}$.
The proof is a simple consequence of II, Th. 4.3.4.
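The passage from procedures to ensembles and effects is a quotient construction, and it can be made concrete on a finite table. The following is a minimal sketch under stated assumptions: the frequency table `mu`, the labels `a1`..`a3` and `f1`..`f4`, and the helpers `classes` and `mu_tilde` are all hypothetical, and the table is assumed to be defined for every pair (the situation of Th. 1.3).

```python
from fractions import Fraction as F

# Hypothetical frequency table mu[a][f]; the numbers are invented for illustration.
mu = {
    "a1": {"f1": F(1, 2), "f2": F(1, 2), "f3": F(0), "f4": F(1)},
    "a2": {"f1": F(1, 2), "f2": F(1, 2), "f3": F(0), "f4": F(1)},  # same statistics as a1
    "a3": {"f1": F(1, 4), "f2": F(1, 4), "f3": F(0), "f4": F(1)},
}

def classes(items, signature):
    """Group items whose signatures coincide; returns a sorted list of frozensets."""
    groups = {}
    for x in items:
        groups.setdefault(signature(x), set()).add(x)
    return sorted((frozenset(g) for g in groups.values()), key=sorted)

preps = sorted(mu)
effect_procs = sorted(mu["a1"])

# Ensembles: a ~ a' iff mu(a, f) = mu(a', f) for all f  (D 1.1)
ensembles = classes(preps, lambda a: tuple(mu[a][f] for f in effect_procs))

# Effects: f ~ f' iff mu(w, f) = mu(w, f') for all ensembles w  (D 1.2)
representatives = [min(w) for w in ensembles]
effects = classes(effect_procs, lambda f: tuple(mu[a][f] for a in representatives))

def mu_tilde(w, g):
    """Induced function on (ensemble, effect) pairs; any representatives give the same value."""
    return mu[min(w)][min(g)]
```

By construction two distinct classes differ on at least one argument, which is the content of conditions (ii) and (iii); the classes of `f3` and `f4` play the roles of $g_0$ and $g_1$ in (iv) and (v).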
For the following, it is useful to introduce the canonical mappings of $\mathcal{Q}'$ onto $\mathcal{K}$ and $\mathcal{F}$ onto $\mathcal{L}$.
D 1.3. Let $\varphi$ denote the canonical map which maps the elements $a \in \mathcal{Q}'$ to the equivalence classes $w \in \mathcal{K}$ to which $a$ belongs. Let $\psi$ denote the canonical map from $\mathcal{F}$ onto $\mathcal{L}$. Let $f = (b_0, b)$; in addition to $\psi(f)$ we shall also write $\psi(b_0, b)$.
The important concepts “ensemble” and “effect” require the existence of the equivalence relations $a_1 \sim a_2$ in $\mathcal{Q}'$ and $f_1 \sim f_2$ in $\mathcal{F}$. We shall now seek to formulate realistic axioms which will guarantee the existence of such equivalence relations. Here we shall place our emphasis on physical considerations rather than seeking the weakest possible axioms.
The intuitive basis underlying the axioms for $C$ is as follows: In order that a preparation procedure $a$ may be meaningfully combined with a registration method $b_0$ (that is, in order that the microsystem prepared according to the preparation procedure $a$ may be registered according to the method $b_0$) it is necessary that the registration according to $b_0$ occur “after” the preparation. The time-ordering of $b_0$ with respect to $a$ will be discussed in more detail in VII, §1. Here we note that the question whether $a \in \mathcal{Q}'$ may be combined with $b_0 \in \mathcal{R}'_0$ is not only a question about the physical possibility to position the apparatus for $a$ and $b_0$, but is also a stipulation that an element $x \in a$ is also an element of $b_0$ only if the registration of $x$ occurs after $x$ is prepared (see [2], XVI). Only in this way will the transformations which will be introduced in VII, §1 have a simple structure.
We see, therefore, that the equivalence relation $a_1 \sim a_2$ for preparation procedures depends on our stipulation concerning the possible combination of the preparation with registration procedures. $a_1 \sim a_2$ will be satisfied if
and only if the effect procedures which may be combined with $a_1$ and $a_2$ lead to the same frequencies. Thus the resulting equivalence relation depends strongly on the conditions imposed on the set $C$!
Let $\mathcal{R}'_{0T}$ denote the set of registration methods $b_0 \neq \emptyset$ which begin “later” than time $T$. Then we expect that, to each $a \in \mathcal{Q}'$, there exists a $T$ such that $(a, b_0) \in C$ for all $b_0 \in \mathcal{R}'_{0T}$. Obviously, the set of $\mathcal{R}'_{0T}$ for increasing $T$ is a (lower) directed set. We shall now formulate the following axiom:
APS 5.1.2. There exists a directed set (in the sense of inclusion) $\Gamma \subset \mathcal{P}(\mathcal{R}'_0)$ where $\emptyset \notin \Gamma$, so that to each $a \in \mathcal{Q}'$ there exists at least one element $\mathcal{R}'_{0\alpha} \in \Gamma$ such that $(a, b_0) \in C$ for all $b_0 \in \mathcal{R}'_{0\alpha}$.
This axiom is, by itself, too weak. From the directedness of $\Gamma$ it follows that, to every finite set $a_1, a_2, \ldots, a_n \in \mathcal{Q}'$ there exists a common element $\mathcal{R}'_{0\beta} \in \Gamma$ for which $(a_i, b_0) \in C$ for $i = 1, \ldots, n$ and all $b_0 \in \mathcal{R}'_{0\beta}$. It remains an open question whether each element (for example, $\mathcal{R}'_{0\beta}$) contains “sufficiently many” registration methods in order to sufficiently test a preparation procedure. If we consider the set $\mathcal{R}'_{0T}$ described above, our intuition may suggest that it should be possible to test $a$ completely by means of all $b_0$ from $\mathcal{R}'_{0T}$, that is, if
$$\mu(a_1, (b_0, b)) = \mu(a_2, (b_0, b))$$
for all $b_0 \in \mathcal{R}'_{0T}$ and $b \subset b_0$, then it follows that
$$\mu(a_1, f) = \mu(a_2, f)$$
for all $f \in \mathcal{F}$, providing that both $\mu(a_1, f)$ and $\mu(a_2, f)$ are defined (that is, $(a_1, f) \in \mathcal{C}$ and $(a_2, f) \in \mathcal{C}$).
For certain macrosystems this assumption will prove to be false (see below). For microsystems and certain “classical” macrosystems (systems of mass-points undergoing conservative forces) the above assumption is successful. We shall now formulate a new axiom (which will be stronger than APS 5.1.2):
APS 5.1.3. Let $\Gamma$ be a set satisfying APS 5.1.2. Suppose that the following condition is also satisfied: For $a_1, a_2 \in \mathcal{Q}'$ let $\mathcal{R}'_{0\beta}$ be an element of $\Gamma$ where $(a_1, b_0), (a_2, b_0) \in C$ for all $b_0 \in \mathcal{R}'_{0\beta}$. Suppose that $\mu(a_1, (b_0, b)) = \mu(a_2, (b_0, b))$ is satisfied for all $b_0 \in \mathcal{R}'_{0\beta}$ and all $b \subset b_0$. Then we require that $\mu(a_1, f) = \mu(a_2, f)$ for all $f \in \mathcal{F}$ for which $\mu(a_1, f)$ and $\mu(a_2, f)$ are defined.
Axioms APS 5.1.1-APS 5.1.3 are not mutually independent. We shall now state the relationships among these axioms:
If we assert APS 5.1.2 then we may prove APS 5.1.1 (given in II, §4.3). APS 5.1.3 is clearly a stronger condition than APS 5.1.2. If we assume APS 5.1.3 then APS 5.1.1 and APS 5.1.2 can be proven, and are therefore superfluous.
From APS 5.1.3 we now obtain the following theorem:
Th. 1.2. Let us define the relation $a_1 \sim a_2$ as follows: If $\mu(a_1, f) = \mu(a_2, f)$ for all $f \in \mathcal{F}$ for which $\mu(a_1, f)$ and $\mu(a_2, f)$ are defined then $a_1 \sim a_2$. The relation $a_1 \sim a_2$ is an equivalence relation.
Proof. We need only show that if $a_1 \sim a_2$, $a_2 \sim a_3$ then $a_1 \sim a_3$. According to APS 5.1.2, the directed set $\Gamma$ contains an element $\mathcal{R}'_{0\beta}$ with
$$(a_1, b_0), (a_2, b_0), (a_3, b_0) \in C \quad \text{for all } b_0 \in \mathcal{R}'_{0\beta}.$$
From $a_1 \sim a_2$, $a_2 \sim a_3$ we obtain
$$\mu(a_1, (b_0, b)) = \mu(a_2, (b_0, b)) = \mu(a_3, (b_0, b))$$
for all $b \subset b_0$ and $b_0 \in \mathcal{R}'_{0\beta}$. From
$$\mu(a_1, (b_0, b)) = \mu(a_3, (b_0, b))$$
for all $b \subset b_0 \in \mathcal{R}'_{0\beta}$ it follows that, according to APS 5.1.3,
$$\mu(a_1, f) = \mu(a_3, f)$$
for all $f \in \mathcal{F}$ for which $\mu(a_1, f)$ and $\mu(a_3, f)$ are defined. Thus we have shown that $a_1 \sim a_3$.
On the basis of Th. 1.2, for $w \in \mathcal{K}$ (where $\mathcal{K}$ is defined by D 1.1) we may define a function $\mu$ on a subset of $\mathcal{K} \times \mathcal{F}$:
$$\mu(w, f) = \mu(a, f) \quad \text{where } a \in w.$$
Thus $\mu$ is defined for all pairs $(w, f)$ for which there is an $a \in w$ for which $(a, f) \in \mathcal{C}$.
The “intuitive” considerations which have led to the formulation of axiom APS 5.1.3 have now led us to make additional assertions about the set $C$. We have seen that, for each $a$, there is a subset $\mathcal{R}'_{0T}$, the elements of which may be combined with $a$. If we now consider the various $a$ which belong to the same equivalence class, given an $a_1 \in w$ which can be combined with all $b_0 \in \mathcal{R}'_{0T_1}$ (that is, the preparation of microsystems is excluded after $T_1$), it may be possible to find an $a_2 \in w$ which excludes preparations after an earlier time $T_2$ ($T_2 < T_1$) but for all effects $(b_0, b)$, where $b_0 \in \mathcal{R}'_{0T_1}$, leads to the same frequencies as $a_1$. We would like to assert that for each $w$ and each time $T$ it is possible to find an $a \in w$ which may be combined with the registration method $b_0 \in \mathcal{R}'_{0T}$. Together with $\bigcup_T \mathcal{R}'_{0T} = \mathcal{R}'_0$ (where the union is taken over all time), we come to the postulate:
APS 5.1.4. To each $b_0 \in \mathcal{R}'_0$ in each class $w \in \mathcal{K}$ there exists an $a \in w$ such that $(a, b_0) \in C$. To each pair $a \in \mathcal{Q}'$, $f \in \mathcal{F}$ there exists an $f' \in \mathcal{F}$ such that $(a, f') \in \mathcal{C}$ and $\mu(w, f) = \mu(w, f')$ for all $w \in \mathcal{K}$ for which $\mu(w, f)$ and $\mu(w, f')$ are defined.
Th. 1.3. The function $\mu$ defined above is defined on $\mathcal{K} \times \mathcal{F}$.
Proof. We need to show that, to each $f \in \mathcal{F}$ and $w \in \mathcal{K}$ there is an $a \in w$ for which $(a, f) \in \mathcal{C}$. This is, however, guaranteed by APS 5.1.4 since if $f = (b_0, b)$, there is an $a \in w$ with $(a, b_0) \in C$, that is, $(a, f) \in \mathcal{C}$.
We now obtain the following theorem:
Th. 1.4. The relation $f_1 \sim f_2$ defined by the condition
$$\mu(w, f_1) = \mu(w, f_2) \quad \text{for all } w \in \mathcal{K}$$
is an equivalence relation. To each pair $a \in \mathcal{Q}'$, $f \in \mathcal{F}$ there exists an $f' \in \mathcal{F}$ such that $(a, f') \in \mathcal{C}$ and $f' \sim f$.
Thus, from D 1.2 we have finally proven Th. 1.1.
In the following (up to and including VI) we shall not consider the precise form of axioms APS 5.1.3 and APS 5.1.4. We shall, however, make use of the equivalence classes of $\mathcal{Q}'$ and $\mathcal{F}$, and the mappings $\varphi$ and $\psi$ defined by D 1.3. The special form of the axioms APS 5.1.3 and APS 5.1.4 has stood the test for the case of microsystems; the considerations presented in VII, §1 depend decisively on these axioms.
The situation is completely different for macrosystems which undergo irreversible processes. Let us select a reference time (which we denote by $t = 0$), and consider only those preparation procedures $a \in \mathcal{Q}$ which are completed before $t = 0$, and the registration procedures which begin after $t = 0$. Then, instead of APS 5.1.3 and APS 5.1.4 we may use the simpler axiom $C = \mathcal{Q}' \times \mathcal{R}'_0$. This is achieved at the expense that the considerations of VII, §1 are no longer applicable, since we may consider only those time translations $t \to t + \tau$ for which $\tau > 0$: a semigroup of time translations!
It is impossible to apply axioms APS 5.1.3 and APS 5.1.4 to irreversible macrosystems because axiom APS 5.1.3 is stronger than APS 5.1.2. The stronger axiom APS 5.1.3 appears to contradict our experience with irreversible macrosystems because the ability to distinguish between the different $a_1, a_2$ by means of the elements $(b_0, b)$ where $b_0 \in \mathcal{R}'_{0T}$ is reduced as $T$ increases. In particular, if $T$ is sufficiently large, the condition $\mu(a_1, (b_0, b)) = \mu(a_2, (b_0, b))$ for all $(b_0, b)$ for which $b_0 \in \mathcal{R}'_{0T}$ does not imply that $\mu(a_1, f) = \mu(a_2, f)$ for all $f \in \mathcal{F}$, even where $\mu(a_1, f)$ and $\mu(a_2, f)$ are defined. We have noted that APS 5.1.3 cannot be used for irreversible macrosystems. Its usefulness for microsystems, especially for the structures defined in VII, §1, is not self-evident! The distinction between macrosystems and microsystems must be carefully examined if we are to embed a theory of macrosystems into many-body quantum theory. As a result of the above considerations (macrosystems may be prepared only before $t = 0$), the set $\mathcal{Q}_m$ of the preparation procedures for macrosystems can only be a small subset of the set $\mathcal{Q}$ of preparation procedures for a many-body quantum theory; a similar situation holds also for $\mathcal{R}_{0m}$ and $\mathcal{R}_0$ (see [2], XV and [13]).
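The loss of distinguishing power with increasing $T$ can be illustrated by a classical relaxation model. The following is a minimal sketch and not a model taken from the text: it assumes a hypothetical two-state system that relaxes by a symmetric Markov step, and it measures how well any yes-no registration started after `steps` time steps can still tell two preparations apart. The function names and the parameter `stay` are invented for illustration.

```python
def evolve(p, steps, stay=0.9):
    """Apply a symmetric two-state Markov step `steps` times to the distribution p = (p0, p1)."""
    p0, p1 = p
    for _ in range(steps):
        p0, p1 = stay * p0 + (1 - stay) * p1, (1 - stay) * p0 + stay * p1
    return p0, p1

def distinguishability(pa, pb, steps):
    """Largest difference in the frequency of any yes-no registration made after `steps` steps.

    For two states this maximum equals |qa[0] - qb[0]|, where qa and qb are the evolved distributions.
    """
    qa, qb = evolve(pa, steps), evolve(pb, steps)
    return abs(qa[0] - qb[0])

# Two hypothetical preparations that are perfectly distinguishable at time 0.
prep_1, prep_2 = (1.0, 0.0), (0.0, 1.0)
gaps = [distinguishability(prep_1, prep_2, T) for T in (0, 5, 20, 50)]
```

In this sketch the gap shrinks by the factor `2 * stay - 1` per step, so registrations that begin late enough give nearly equal frequencies for both preparations although early registrations separate them. This is the kind of behavior that conflicts with APS 5.1.3.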
The explicit form of the axioms (either APS 5.1.3 and APS 5.1.4, or $C = \mathcal{Q}' \times \mathcal{R}'_0$) will not be important in the following, from here to VI inclusive. Here it is important to state that this choice permits us to introduce the concepts of an ensemble and an effect. In the transition from $\mathcal{Q}'$ to $\mathcal{K}$ and from $\mathcal{F}$ to $\mathcal{L}$ we lose (as desired) the special structure of the apparatuses as characterized by $a \in \mathcal{Q}'$ or $b_0 \in \mathcal{R}'_0$. Ensembles and effects then appear to depend more on the systems themselves than upon the preparation and registration apparatus. In
this respect, axioms APS 5.1.3 and 4 permit, at least, a partial “separation” of the system from the structure of the apparatuses in the sense of our intention which was expressed in II, §4.4 and at the end of II, §4.5. There remain two problems in the separation described above in defining the concept of ensemble and effect:
(1) In the transition from $\mathcal{Q}'$ to $\mathcal{K}$ and from $\mathcal{F}$ to $\mathcal{L}$ we must ask whether we have lost too much of the structure of the system itself.
(2) On the other hand, we must ask whether we have not included more of the apparatus structure than is necessary.
An ensemble $w$ is characterized only by its statistical distribution. Therefore it is often said that two sets of systems $a_1, a_2$ for which $\varphi(a_1) = \varphi(a_2) = w$ (that is, $a_1 \sim a_2$) are, in reality, equal and differ only in the details of the construction of the two different preparation apparatuses. In other words the two sets $a_1, a_2$ give the same results for all experiments (not only registration). In IV, §5 and §6 and in XVII, §4.4 we shall find that this statement is, in general, not true for microsystems. We find that much too much is lost in the transition from $\mathcal{Q}'$ to $\mathcal{K}$ and from $\mathcal{F}$ to $\mathcal{L}$. In IV we shall seek to recover the loss by introducing the concepts of observable and preparator.
On the other hand the notion of an effect may perhaps contain too much of the structure of the registration procedure; in general, an effect may also contain the probability that the associated effect process reacts upon a “property” of the microsystem. We would, of course, like to eliminate the “bad” registrations and keep the “good” ones, those which exhibit the “real” properties of the system. Is such a distinction possible?
A search to find a description of the microsystems in themselves, going beyond the concepts of ensembles and effects, may proceed in two different directions:
First, by considering the properties or pseudoproperties of systems (see §4), or
Second, by providing a very precise analysis of what we seek to describe by the concepts of observable and preparator (see VI). There we seek to identify those observables which precisely describe the physical systems, and only secondarily describe (if this is inevitable) the measurement (registration) apparatus. We also wish to identify those preparators which describe the structure of the prepared physical systems rather than the structure of the preparation apparatus.
2 Mixtures and Decompositions of Ensembles and Effects
The concept of an “ensemble” (or “state”) is uniquely defined by D 1.1 and
refers to a number of relatively trivial relationships. The usual intuitive
notion of an ensemble (state) does not agree in all details with that defined by
D 1.1.
An ensemble $w \in \mathcal{K}$ is not a set of microsystems because $w$ is not a subset of $M$; $w$ is a subset of $\mathcal{P}(M)$, that is, a class of subsets of $M$. The following fact is of crucial importance in quantum mechanics (but not limited to quantum mechanics). To each class $w$ there is more than one $a \in w$ (see IV, §5 and §6 for details). The substitution of the statement $a \in \mathcal{Q}'$ for the statement $w \in \mathcal{K}$ will quickly result in errors in both logic and intuition. Such errors may be avoided by using the mathematical structure associated with $M$, $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$ in order to describe microsystems.
For quantum mechanics it is important that, if $\varphi$ is the map defined in D 1.3, then the condition $\varphi(a_1) = \varphi(a_2)$ does not imply $a_1 = a_2$.
In the following we shall denote the function $\tilde\mu: \mathcal{K} \times \mathcal{L} \to \mathbb{R}$ obtained in Th. 1.1 by $\mu$ (the same symbol used for the corresponding function on $\mathcal{C}$); it will be clear which function is meant by the arguments. Thus we obtain
$$\mu(a, f) = \mu(\varphi(a), \psi(f)). \tag{2.1}$$
From II, Th. 4.3.4(iv) we obtain the following important theorem on the “decomposition of ensembles.”
Th. 2.1. Let $a = \bigcup_{i=1}^{n} a_i$ be a decomposition of the preparation procedure $a$. Then for all $g \in \mathcal{L}$ we obtain
$$\mu(\varphi(a), g) = \sum_{i=1}^{n} \lambda_i\, \mu(\varphi(a_i), g),$$
where $\lambda_i = \lambda_{\mathcal{Q}}(a, a_i)$, $0 < \lambda_i \le 1$ and $\sum_{i=1}^{n} \lambda_i = 1$.
Proof. According to II, Th. 4.3.4(iv), we find that for $\lambda_i = \lambda_{\mathcal{Q}}(a, a_i)$,
$$\mu(a, f) = \sum_{i=1}^{n} \lambda_i\, \mu(a_i, f)$$
for all $f \in \mathcal{F}$ for which $(a, f) \in \mathcal{C}$. Therefore, from Th. 1.3 we obtain
$$\mu(\varphi(a), f) = \sum_{i=1}^{n} \lambda_i\, \mu(\varphi(a_i), f)$$
for all $f \in \mathcal{F}$. Our assertion follows directly from Th. 1.1.
D 2.1. Suppose to $w \in \mathcal{K}$ there is a set of real numbers $\lambda_i$, $0 < \lambda_i \le 1$,
$i = 1, \ldots, n$, where $\sum_{i=1}^{n} \lambda_i = 1$, and a set $w_i \in \mathcal{K}$, $i = 1, \ldots, n$, for which the
following condition is satisfied for all $g \in \mathcal{L}$:
$$\mu(w, g) = \sum_{i=1}^{n} \lambda_i \, \mu(w_i, g). \tag{2.2}$$
Then (2.2) is called a decomposition of $w$ according to the $w_i$ with weights $\lambda_i$.
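A decomposition in the sense of (2.2) can be checked numerically in a toy model. The following Python sketch assumes, purely for illustration, that ensembles are finite probability vectors, that effects are vectors of response probabilities in [0, 1], and that the probability function is their dot product. The names `mu`, `mix`, and `is_decomposition` are hypothetical and do not appear in the text; the sketch illustrates only the convexity relation (2.2), not the underlying structure of preparation procedures.

```python
from fractions import Fraction

def mu(w, g):
    """Toy probability function (an assumption): dot product of an
    ensemble vector w with an effect vector g."""
    return sum(wk * gk for wk, gk in zip(w, g))

def mix(weights, ensembles):
    """Convex combination sum_i lambda_i * w_i of toy ensembles."""
    assert sum(weights) == 1 and all(0 < lam <= 1 for lam in weights)
    dim = len(ensembles[0])
    return tuple(
        sum(lam * w[k] for lam, w in zip(weights, ensembles))
        for k in range(dim)
    )

def is_decomposition(w, weights, ensembles, effects):
    """Check relation (2.2) on a finite list of toy effects."""
    return all(
        mu(w, g) == sum(lam * mu(wi, g) for lam, wi in zip(weights, ensembles))
        for g in effects
    )

w1 = (Fraction(1), Fraction(0), Fraction(0))
w2 = (Fraction(0), Fraction(1, 2), Fraction(1, 2))
lams = (Fraction(1, 3), Fraction(2, 3))
w = mix(lams, (w1, w2))
effects = [
    (Fraction(1), Fraction(0), Fraction(0)),
    (Fraction(1, 2), Fraction(1, 4), Fraction(1)),
]
```

In this toy model every mixture built with `mix` satisfies (2.2) by linearity of `mu`; exact rational arithmetic is used so that the equality test is not affected by rounding.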
We shall now introduce the notion of “mixing” of preparation procedures,
a notion which is the inverse of decomposition. We shall describe how we
may construct such a mixture $w$ using the apparatuses for the selection of the
components $w_i$.
We may build a new apparatus $A$ using the apparatus $A_1$ corresponding to
the preparation procedure $a_1$ and the apparatus $A_2$ corresponding to $a_2$ in
the following way. Suppose we have an apparatus $B$ which randomly
generates two states (+) and (-) where (+) occurs with frequency $\alpha$ and (-)
with frequency $1 - \alpha$. The arrangement of $B$, $A_1$, and $A_2$ is such that, upon
the occurrence of (+) the apparatus $A_1$ is used, and upon the occurrence of
(-) the apparatus $A_2$ is used. Note that $B$ is also a part of the “large”
apparatus $A$. If (+) occurs in the apparatus $A$ (that is, in its part $B$), a
preparation procedure $a'_1 \subset a$ is determined for which $\lambda_{\mathcal{Q}}(a, a'_1) = \alpha$. Then $a'_1$
is a “finer” preparation procedure than $a$, and is selected from $a$ only if (+)
occurs. Similarly, if (-) occurs, a preparation procedure $a'_2$ is determined and
$\lambda_{\mathcal{Q}}(a, a'_2) = 1 - \alpha$.
We may be tempted to set $a_1 = a'_1$ and $a_2 = a'_2$. If we do so, we make the
following error: the preparation procedures $a_1$ and $a_2$ are obtained from the
use of $A_1$ and $A_2$ independent of the random generator $B$ which controls
which of $A_1$ or $A_2$ in $A$ is used for the preparation.
From the construction of the apparatus $A$ we find that $a = a'_1 \cup a'_2$ and
$a'_1 \cap a'_2 = \emptyset$; that is, $a$ is a mixture of $a'_1$ and $a'_2$ with weights $\lambda_{\mathcal{Q}}(a, a'_1) = \alpha$
and $\lambda_{\mathcal{Q}}(a, a'_2) = 1 - \alpha$.
Since the selection according to $a'_1$ and $a_1$ is obtained with the same
apparatus $A_1$ (where in the case of $a'_1$ the apparatus $A_1$ is only a part of the
total apparatus $A$, but is otherwise unchanged) we expect that $\mathcal{Q}(a_1) =
\{a \mid a \in \mathcal{Q},\ a \subset a_1\}$ is isomorphic to $\mathcal{Q}(a'_1) = \{a' \mid a' \in \mathcal{Q},\ a' \subset a'_1\}$. That is,
there exists an isomorphism $i$ of the Boolean ring $\mathcal{Q}(a_1)$ onto $\mathcal{Q}(a'_1)$ for which
$$\varphi(ia) = \varphi(a) \quad \text{and} \quad \lambda_{\mathcal{Q}}(a_1, a) = \lambda_{\mathcal{Q}}(ia_1, ia) = \lambda_{\mathcal{Q}}(a'_1, ia).$$
Intuitively this isomorphism expresses the fact that $a_1$ and $a'_1$ have the same
“structure type.”
With the above motivation, we now introduce the following definitions:
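D 2.2. Two preparation procedures $a$ and $\tilde{a}$ are said to be isomorphic if
there is an isomorphism $i$ between the Boolean rings $\mathcal{Q}(a)$ and $\mathcal{Q}(\tilde{a})$ for which
$$\varphi(ia') = \varphi(a') \quad \text{and} \quad \lambda_{\mathcal{Q}}(a, a') = \lambda_{\mathcal{Q}}(ia, ia') \tag{2.3}$$
and if $(a, b_0) \in \mathcal{C}$ is equivalent to $(\tilde{a}, b_0) \in \mathcal{C}$.
D 2.3. A preparation procedure $a$ is called a direct mixture of $a_1$ and $a_2$ if
there are preparation procedures $a'_1$, $a'_2$, isomorphic to $a_1$, $a_2$ respectively, for which
$a'_1 \cap a'_2 = \emptyset$ and $a = a'_1 \cup a'_2$. $\alpha = \lambda_{\mathcal{Q}}(a, a'_1)$ and $1 - \alpha = \lambda_{\mathcal{Q}}(a, a'_2)$ are called the weights of
$a_1$ and $a_2$ in the direct mixture $a$.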
If $\varphi(a_1) \neq \varphi(a_2)$ then the weights $\alpha$ and $1 - \alpha$ in the direct mixture are
uniquely determined by $\varphi(a_1)$ and $\varphi(a_2)$. According to II, Th. 4.3.4(iv) and
D 2.3 we obtain
$$\mu(\varphi(a), g) = \alpha \, \mu(\varphi(a_1), g) + (1 - \alpha) \, \mu(\varphi(a_2), g). \tag{2.4}$$
Since $\varphi(a_1) \neq \varphi(a_2)$, there exists a $g$ such that
$$\mu(\varphi(a_1), g) \neq \mu(\varphi(a_2), g).$$
From (2.4) it follows that
$$\alpha = \frac{\mu(\varphi(a), g) - \mu(\varphi(a_2), g)}{\mu(\varphi(a_1), g) - \mu(\varphi(a_2), g)}.$$
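The way the weight is fixed by a single separating effect can be sketched in a few lines of Python. The function name `mixture_weight` and its arguments are hypothetical; the three arguments stand for the probabilities of one fixed effect in the mixture and in its two components, and the function simply solves the two-component mixing relation (2.4) for the weight of the first component.

```python
from fractions import Fraction

def mixture_weight(mu_a, mu_a1, mu_a2):
    """Solve mu_a = alpha * mu_a1 + (1 - alpha) * mu_a2 for alpha.

    Requires an effect that separates the two components,
    i.e. mu_a1 != mu_a2.
    """
    if mu_a1 == mu_a2:
        raise ValueError("the chosen effect does not separate the components")
    return (mu_a - mu_a2) / (mu_a1 - mu_a2)

# Example: components with probabilities 3/4 and 1/4, mixed with weight 2/5.
alpha = Fraction(2, 5)
mu_1, mu_2 = Fraction(3, 4), Fraction(1, 4)
mu_mixed = alpha * mu_1 + (1 - alpha) * mu_2
recovered = mixture_weight(mu_mixed, mu_1, mu_2)
```

Any effect with different probabilities in the two components yields the same weight, which is the uniqueness statement in the text.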
The experimental arrangement described above leads us to introduce the
following axiom:
AP 1. To each $a_1, a_2 \in \mathcal{Q}'$ and to each rational number $\alpha$, $0 < \alpha < 1$, there is
a direct mixture $a \in \mathcal{Q}'$ of $a_1$ and $a_2$ with weight $\alpha$ of $a_1$ in $a$.
From AP 1 we obtain the following theorem:
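Th. 2.2. Let $w_i \in \mathcal{K}$, and let $\lambda_i$, $i = 1, \ldots, n$, be rational numbers where $0 < \lambda_i$,
$\sum_{i=1}^{n} \lambda_i = 1$. Then there exists an $a \in \mathcal{Q}'$ and a decomposition $a = \bigcup_{i=1}^{n} a_i$,
$\varphi(a_i) = w_i$ and $\lambda_{\mathcal{Q}}(a, a_i) = \lambda_i$, for which:
$$\mu(w, g) = \sum_{i=1}^{n} \lambda_i \, \mu(w_i, g) \quad \text{for all } g \in \mathcal{L},$$
where $w = \varphi(a)$.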
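Proof. We use induction on $n$. For a set of $n + 1$ rational numbers $\lambda_1, \ldots, \lambda_{n+1}$
we consider the set of $n$ rational numbers
$$\alpha_i = \lambda_i \Big/ \sum_{k=1}^{n} \lambda_k \qquad (i = 1, \ldots, n).$$
According to the induction hypothesis, there exists an $\tilde{a} \in \mathcal{Q}'$ with a decomposition
$\tilde{a} = \bigcup_{i=1}^{n} \tilde{a}_i$ with $\varphi(\tilde{a}_i) = w_i$ and $\lambda_{\mathcal{Q}}(\tilde{a}, \tilde{a}_i) = \alpha_i$. Suppose that for $w_{n+1}$ there exists
an $\tilde{a}_{n+1}$ with $\varphi(\tilde{a}_{n+1}) = w_{n+1}$. According to AP 1, for $\tilde{a}$, $\tilde{a}_{n+1}$ there exists a direct
mixture $a$ of $\tilde{a}$ and $\tilde{a}_{n+1}$ with weight $\lambda_{n+1}$ of $\tilde{a}_{n+1}$ in $a$. By D 2.3 there is an $\hat{a}$ and an
$a_{n+1}$ for which $a = \hat{a} \cup a_{n+1}$, $\hat{a} \cap a_{n+1} = \emptyset$, $\lambda_{\mathcal{Q}}(a, \hat{a}) = 1 - \lambda_{n+1}$, $\lambda_{\mathcal{Q}}(a, a_{n+1}) =
\lambda_{n+1}$, where $\hat{a}$ is isomorphic to $\tilde{a}$ and $a_{n+1}$ is isomorphic to $\tilde{a}_{n+1}$. From the
decomposition $\tilde{a} = \bigcup_{i=1}^{n} \tilde{a}_i$ and the fact that $\hat{a}$ is isomorphic to $\tilde{a}$, it follows that
there is a decomposition $\hat{a} = \bigcup_{i=1}^{n} a_i$ isomorphic to $\tilde{a} = \bigcup_{i=1}^{n} \tilde{a}_i$.
From $a = \hat{a} \cup a_{n+1}$, $\hat{a} \cap a_{n+1} = \emptyset$ it follows that $a = \bigcup_{i=1}^{n+1} a_i$ is a decomposition
of $a$. The weights of $a_i$ are, for $i \le n$:
$$\lambda_{\mathcal{Q}}(a, a_i) = \lambda_{\mathcal{Q}}(a, \hat{a}) \, \lambda_{\mathcal{Q}}(\hat{a}, a_i)
= (1 - \lambda_{n+1}) \, \lambda_{\mathcal{Q}}(\tilde{a}, \tilde{a}_i) = (1 - \lambda_{n+1}) \, \alpha_i = \lambda_i,$$
where the relation (2.3) was used. For $i = n + 1$ we have $\lambda_{\mathcal{Q}}(a, a_{n+1}) = \lambda_{n+1}$. Thus, from
(2.3) we obtain $\varphi(a_i) = \varphi(\tilde{a}_i)$ and $\varphi(a_{n+1}) = \varphi(\tilde{a}_{n+1})$. Thus, with the use of Th. 2.1
the theorem is proven.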
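The bookkeeping of weights in this induction can be traced with a short Python sketch. It assumes only the arithmetic of the proof: an n-fold mixture is built by repeated two-component mixing, the first n weights are renormalized by their sum, and the weights in the final mixture are products of the weights along the way. The function name `iterated_binary_mixture` is hypothetical and the sketch tracks weights only, not the preparation procedures themselves.

```python
from fractions import Fraction

def iterated_binary_mixture(lams):
    """Weights of the components after building an n-fold mixture by
    repeated two-component mixing, following the induction step."""
    lams = [Fraction(lam) for lam in lams]
    assert sum(lams) == 1 and all(lam > 0 for lam in lams)
    if len(lams) == 1:
        return lams
    head, last = lams[:-1], lams[-1]
    total = sum(head)                          # equals 1 - last
    alphas = [lam / total for lam in head]     # renormalized weights alpha_i
    inner = iterated_binary_mixture(alphas)    # induction hypothesis
    # Mixing the n-fold mixture (weight 1 - last) with the new component
    # (weight last) multiplies the inner weights by 1 - last.
    return [(1 - last) * a for a in inner] + [last]

target = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
result = iterated_binary_mixture(target)
```

Because only rational operations occur, rational target weights stay rational at every step, which is consistent with the restriction to rational weights in the axiom.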
In AP 1 we have required only that $\alpha$ be a rational number. This has been
done with the wish that $\mathcal{Q}$ can be chosen to be a denumerable set (see [1], §9).
Th. 2.2 states that to every decomposition (2.2) of $w$ according to the $w_i$ with
weights $\lambda_i$ ($\lambda_i$ rational) there exists a preparation procedure $a \in \mathcal{Q}'$ and a
decomposition of $a$, $a = \bigcup_{i=1}^{n} a_i$, for which $\varphi(a) = w$, $\varphi(a_i) = w_i$ and
$\lambda_{\mathcal{Q}}(a, a_i) = \lambda_i$.
It is not difficult to see that AP 1 and Th. 2.2 say little about the structure
of microsystems. In fact AP 1 is more of an assertion about the construction
of the preparation apparatus. We have introduced AP 1 only because it will
illuminate the discussion about the physical assertions of quantum mechanics.
We shall now return to our discussion concerning the concepts of
preparation and registration procedures and their mixtures and
decompositions.
In order to avoid error in connection with AP 1 we find it necessary to
make the following remarks. For every decomposition $a = \bigcup_{i=1}^{n} a_i$ we may
be tempted to believe that the apparatus $A$ corresponding to $a$ must consist
of a random generator $B$ which selects the sub-apparatuses $A_i$ (which
correspond to the $a_i$). Such a description is incorrect for the following reason:
Although there may be indications on the apparatus $A$ by which the
selection procedures $a_i$ are determined, the total structure of the apparatus
may be such that it is impossible to uniquely define the component
apparatuses $A_i$ of $A$. For the case of quantum mechanics it is important to
note that there are decompositions $a = \bigcup_i a_i$ which do not correspond to
a partition of the apparatus $A$ into a random generator and components $A_i$.
If we replace AP 1 by the following somewhat stronger assertion we shall
find ourselves in contradiction with quantum mechanics: Suppose that there
is a decomposition of $w$ into $w_i$ according to D 2.1, and suppose that
$\varphi(a) = w$. Then there is a decomposition of $a$, $a = \bigcup_i a_i$, for which $\varphi(a_i) = w_i$
and $\lambda_{\mathcal{Q}}(a, a_i) = \lambda_i$.
In VI, §6 we shall find, even in the case in which the relation (2.2) holds,
that there are selection procedures $a$ which satisfy $\varphi(a) = w$ for which there
are no decompositions $a = \bigcup_i a_i$ for which $\varphi(a_i) = w_i$. Since AP 1 holds,
there must be another selection procedure $a'$ which satisfies $\varphi(a') = w$ and
permits a decomposition $a' = \bigcup_i a'_i$ where $\varphi(a'_i) = w_i$ and $\lambda_{\mathcal{Q}}(a', a'_i) = \lambda_i$.
Earlier we have attempted to formulate the concept of an ensemble in such
a way as to avoid errors in interpretation. In order to continue this effort it is
now necessary to examine the relationships between the concept of an effect
and other concepts which are frequently used in quantum mechanics.
By analogy with the notion of an ensemble, we emphasize that, according
to D 1.2, the expression effect denotes classes of effect processes $(b_0, b)$. Then
the map $\psi$ defined by D 1.3 maps many effect processes $(b_0, b)$ into an effect. In
this mapping, however, information is lost about the effect processes $(b_0, b)$
(see, for example, the description of “coexistent” effects in IV, §1). The effect
process is characterized by an “apparatus” (the registration method $b_0$)
and by the “detection response” (the registration procedure $b$). Thus
it is possible that two apparatuses $b_0^{(1)}$ and $b_0^{(2)}$ with substantially
different technical design will represent the same effect $g \in \mathcal{L}$:
$$\psi(b_0^{(1)}, b^{(1)}) = \psi(b_0^{(2)}, b^{(2)}) = g.$$
Let $b$ characterize the “detection response” for the apparatus $b_0$. Some
authors call the pair $(b_0, b)$ a “yes-no measurement.” In quantum mechanics
the concept of a “yes-no” measurement is usually explained in ordinary
language before it is used. Thus it will often be unclear whether the
expression “yes-no measurement” should refer to the elements $(b_0, b)$ of $\mathcal{F}$ or
to the elements $g = \psi(b_0, b)$ of $\mathcal{L}$. This ambiguity can easily lead to
misunderstandings. Often the expression “question” is used instead of
“yes-no measurement.” Here $(b_0, b)$ is interpreted as a question posed to the
micro-object; $x \in b$ corresponds to the answer “yes” and $x \in b_0 \setminus b$
corresponds to “no.” Here again it is not clear whether the concept “question”
should refer to an element of $\mathcal{F}$ or to an element of $\mathcal{L}$.
The expressions “yes-no measurement” and “question” are also used in a
more restricted sense. If care is not taken to see how different authors use
these expressions, great confusion will result. We find that the expressions
“yes-no measurement,” “question,” and “proposition” are used for the
elements of a subset of $\mathcal{L}$, that is, for special effects (which we shall call
“decision effects” and define in §3 and §6).
We now find it necessary to define the notions of mixture and decomposition
in reference to registration methods and effects; these notions will
later prove to be useful.
We shall now consider only the notion of decomposition of registration
methods which was defined in II, Th. 4.5.3(viii) (and not the more general
decomposition of registration procedures; for the latter, see the discussion on
observables presented in IV, §1.4).
The following theorem is an immediate consequence of II, Th. 4.5.3.
Th. 2.3. Let $b_0 = \bigcup_{i=1}^{n} b_{0i}$ be a decomposition of the registration method $b_0$
according to the $b_{0i}$ with weights $\lambda_i$. Then for an effect process $f = (b_0, b)$,
where $f_i = (b_{0i}, b_{0i} \cap b)$, the following equation holds:
$$\mu(w, \psi(f)) = \sum_{i=1}^{n} \lambda_i \, \mu(w, \psi(f_i)) \tag{2.5}$$
for all $w \in \mathcal{K}$.
D 2.4. Let $g \in \mathcal{L}$. Suppose that there is a set of real numbers $\lambda_i$ where
$0 < \lambda_i \le 1$, $\sum_{i=1}^{n} \lambda_i = 1$, and a set $g_i \in \mathcal{L}$ such that
$$\mu(w, g) = \sum_{i=1}^{n} \lambda_i \, \mu(w, g_i) \tag{2.6}$$
holds for all $w \in \mathcal{K}$; then (2.6) is called a decomposition of the effect $g$
according to the effects $g_i$ with weights $\lambda_i$.
Equation (2.5) represents a decomposition of $g = \psi(b_0, b)$ according to the
$g_i = \psi(b_{0i}, b_{0i} \cap b)$ with weights $\lambda_i = \lambda_{\mathcal{R}_0}(b_0, b_{0i})$.
The procedure described earlier for the construction of a preparation
apparatus $A$ from a random generator $B$ and two preparation apparatuses
$A_1$ and $A_2$ can also be applied directly to a similar procedure for registration
apparatuses. Let us construct a registration apparatus using a random
generator $B$ and two registration apparatuses $A_1$ and $A_2$. Since a registration
apparatus corresponds to an element of $\mathcal{R}'_0$, from the two registration
methods $b_{01}$ and $b_{02}$ we obtain a registration method $b_0$ having the
decomposition $b_0 = b'_{01} \cup b'_{02}$ where $b'_{01}$, $b'_{02}$ correspond to the apparatuses
$A_1$ and $A_2$, respectively.
By analogy to D 2.2 and D 2.3 we define:
D 2.5. Two registration methods $b_0$ and $b'_0$ are isomorphic if there is an
isomorphism $i$ of the Boolean ring $\mathcal{R}(b_0)$ to the Boolean ring $\mathcal{R}(b'_0)$ for which
$\psi(ib_0, ib) = \psi(b_0, b)$; $i$ is also an isomorphism of $\mathcal{R}_0(b_0)$ to $\mathcal{R}_0(b'_0)$ and
$(a, b_0) \in \mathcal{C}$ is equivalent to $(a, b'_0) \in \mathcal{C}$. (Here we note that $\mathcal{R}_0(b_0)$ is defined by
$\mathcal{R}_0(b_0) = \mathcal{R}_0 \cap \mathcal{R}(b_0)$.)
D 2.6. A registration method $b_0$ is said to be a direct mixture of the
registration methods $b_{01}$, $b_{02}$ if there are two registration methods $b'_{01}$, $b'_{02}$,
where $b'_{01}$ is isomorphic to $b_{01}$ and $b'_{02}$ is isomorphic to $b_{02}$, such that
$b'_{01} \cap b'_{02} = \emptyset$, $b_0 = b'_{01} \cup b'_{02}$. $\alpha = \lambda_{\mathcal{R}_0}(b_0, b'_{01})$ and $1 - \alpha = \lambda_{\mathcal{R}_0}(b_0, b'_{02})$
are called the weights of $b_{01}$, $b_{02}$ in the direct mixture $b_0$.
From D 2.6 and Th. 2.3 it follows that, for every $b \subset b_0$,
$$\mu(w, \psi(b_0, b)) = \alpha \, \mu(w, \psi(b'_{01}, b'_{01} \cap b))
+ (1 - \alpha) \, \mu(w, \psi(b'_{02}, b'_{02} \cap b)). \tag{2.7}$$
Let $b_1 \subset b_{01}$, $b_2 \subset b_{02}$. Since $b'_{01}$ and $b_{01}$ are isomorphic, and $b'_{02}$ and $b_{02}$
are isomorphic, there exists a $b'_1 \subset b'_{01}$ and a $b'_2 \subset b'_{02}$ such that $b_1$, $b'_1$ and
$b_2$, $b'_2$ are isomorphic, respectively, and $\psi(b_{01}, b_1) = \psi(b'_{01}, b'_1)$, $\psi(b_{02}, b_2) =
\psi(b'_{02}, b'_2)$. If $b = b'_1 \cup b'_2$, then from $b'_{01} \cap b'_{02} = \emptyset$ we obtain $b'_{01} \cap b =
b'_1$, $b'_{02} \cap b = b'_2$. From (2.7) it follows that
$$\mu(w, \psi(b_0, b)) = \alpha \, \mu(w, \psi(b_{01}, b_1)) + (1 - \alpha) \, \mu(w, \psi(b_{02}, b_2)). \tag{2.8}$$
Thus the effect $\psi(b_0, b)$ is a mixture of the effects $\psi(b_{01}, b_1)$ and
$\psi(b_{02}, b_2)$ in the ratio $\alpha$ to $(1 - \alpha)$.
We now introduce the following axiom:
AR 1. To each pair $b_{01}, b_{02} \in \mathcal{R}'_0$ and each rational number $\alpha$, $0 < \alpha < 1$,
there exists a direct mixture $b_0 \in \mathcal{R}'_0$ of $b_{01}$, $b_{02}$ with the weight $\alpha$ of $b_{01}$ in $b_0$.
From AR 1 we obtain:
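Th. 2.4. Let $g_i \in \mathcal{L}$, $i = 1, \ldots, n$, and let $\lambda_i > 0$, $i = 1, \ldots, n$, be rational
numbers for which $\sum_{i=1}^{n} \lambda_i = 1$. Then there exists a $b_0 \in \mathcal{R}'_0$ and a decomposition
$b_0 = \bigcup_{i=1}^{n} b_{0i}$ for which $b_{0i} \in \mathcal{R}'_0$, and there exists a $b \in \mathcal{R}$, $b \subset b_0$,
such that $\psi(b_{0i}, b_{0i} \cap b) = g_i$ and $\lambda_{\mathcal{R}_0}(b_0, b_{0i}) = \lambda_i$. Let $w \in \mathcal{K}$, $g =
\psi(b_0, b)$. Then
$$\mu(w, g) = \sum_{i=1}^{n} \lambda_i \, \mu(w, g_i).$$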
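Proof. The proof of this theorem is analogous to that of Th. 2.2, with the numbers
$\alpha_i$ defined as there. According to the
induction hypothesis there exists a $\tilde{b}_0 = \bigcup_{i=1}^{n} \tilde{b}_{0i}$ and a $\tilde{b} \subset \tilde{b}_0$ such that
$\psi(\tilde{b}_{0i}, \tilde{b}_{0i} \cap \tilde{b}) = g_i$ and $\lambda_{\mathcal{R}_0}(\tilde{b}_0, \tilde{b}_{0i}) = \alpha_i$. Choose $(\tilde{b}_{0,n+1}, \tilde{b}_{n+1})$ such that
$\psi(\tilde{b}_{0,n+1}, \tilde{b}_{n+1}) = g_{n+1}$. According to AR 1, to $\tilde{b}_0$ and $\tilde{b}_{0,n+1}$ there exists a direct
mixture $b_0$ of $\tilde{b}_0$, $\tilde{b}_{0,n+1}$ with weight $\lambda_{n+1}$ of $\tilde{b}_{0,n+1}$ in $b_0$. There also exists a $\hat{b}_0$ which
is isomorphic to $\tilde{b}_0$ and a $b_{0,n+1}$ isomorphic to $\tilde{b}_{0,n+1}$ such that
$$b_0 = \hat{b}_0 \cup b_{0,n+1},$$
$$\hat{b}_0 \cap b_{0,n+1} = \emptyset,$$
$$\lambda_{\mathcal{R}_0}(b_0, \hat{b}_0) = 1 - \lambda_{n+1},$$
$$\lambda_{\mathcal{R}_0}(b_0, b_{0,n+1}) = \lambda_{n+1}.$$
From the isomorphism between $\tilde{b}_0$ and $\hat{b}_0$ (see D 2.5) it follows that there is a
decomposition $\hat{b}_0 = \bigcup_{i=1}^{n} b_{0i}$ which is isomorphic to the decomposition
$\tilde{b}_0 = \bigcup_{i=1}^{n} \tilde{b}_{0i}$ for which
$$\psi(\tilde{b}_0, \tilde{b}_{0i}) = \psi(\hat{b}_0, b_{0i}).$$
From $b_0 = \hat{b}_0 \cup b_{0,n+1} = \bigcup_{k=1}^{n+1} b_{0k}$, it follows that, for $k \le n$,
$$\lambda_{\mathcal{R}_0}(b_0, b_{0k}) = \lambda_{\mathcal{R}_0}(b_0, \hat{b}_0) \, \lambda_{\mathcal{R}_0}(\hat{b}_0, b_{0k})$$
$$= (1 - \lambda_{n+1}) \, \lambda_{\mathcal{S}}(a \cap \hat{b}_0, a \cap b_{0k})$$
$$= (1 - \lambda_{n+1}) \, \lambda_{\mathcal{S}}(a \cap \tilde{b}_0, a \cap \tilde{b}_{0k})$$
$$= (1 - \lambda_{n+1}) \, \lambda_{\mathcal{R}_0}(\tilde{b}_0, \tilde{b}_{0k}) = (1 - \lambda_{n+1}) \, \alpha_k = \lambda_k.$$
In addition, $\lambda_{\mathcal{R}_0}(b_0, b_{0,n+1}) = \lambda_{n+1}$. To $\tilde{b} \subset \tilde{b}_0$ there exists a $\hat{b}$ ($\subset \hat{b}_0$)
isomorphic to $\tilde{b}$, for which
$$\lambda_{\mathcal{S}}(a \cap \tilde{b}_0, a \cap \tilde{b}_{0i} \cap \tilde{b}) = \lambda_{\mathcal{S}}(a \cap \hat{b}_0, a \cap b_{0i} \cap \hat{b}).$$
Thus we obtain
$$\lambda_{\mathcal{S}}(a \cap \tilde{b}_0, a \cap \tilde{b}_{0i} \cap \tilde{b}) = \lambda_{\mathcal{R}_0}(\tilde{b}_0, \tilde{b}_{0i}) \, \lambda_{\mathcal{S}}(a \cap \tilde{b}_{0i}, a \cap \tilde{b}_{0i} \cap \tilde{b})
= \alpha_i \, \lambda_{\mathcal{S}}(a \cap \tilde{b}_{0i}, a \cap \tilde{b}_{0i} \cap \tilde{b})$$
and
$$\lambda_{\mathcal{S}}(a \cap \hat{b}_0, a \cap b_{0i} \cap \hat{b}) = \lambda_{\mathcal{R}_0}(\hat{b}_0, b_{0i}) \, \lambda_{\mathcal{S}}(a \cap b_{0i}, a \cap b_{0i} \cap \hat{b})
= \alpha_i \, \lambda_{\mathcal{S}}(a \cap b_{0i}, a \cap b_{0i} \cap \hat{b}).$$
Thus we find that
$$g_i = \psi(\tilde{b}_{0i}, \tilde{b}_{0i} \cap \tilde{b}) = \psi(b_{0i}, b_{0i} \cap \hat{b}).$$
For $b_{n+1}$ (which is isomorphic to $\tilde{b}_{n+1}$) we obtain
$$g_{n+1} = \psi(\tilde{b}_{0,n+1}, \tilde{b}_{n+1}) = \psi(b_{0,n+1}, b_{n+1}).$$
From $b = \hat{b} \cup b_{n+1}$, since $b_{n+1} \subset b_{0,n+1}$ and $\hat{b} \subset \hat{b}_0$, we obtain
$$b_{0i} \cap b = b_{0i} \cap \hat{b} \quad \text{for } i \le n,$$
$$b_{0,n+1} \cap b = b_{n+1}.$$
Thus, from Th. 2.3 we finally obtain
$$\mu(a, (b_0, b)) = \sum_{i=1}^{n+1} \lambda_i \, \mu(a, (b_{0i}, b_{0i} \cap b)),$$
that is,
$$\mu(\varphi(a), g) = \sum_{i=1}^{n+1} \lambda_i \, \mu(\varphi(a), g_i).$$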
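Th. 2.4 states that for every decomposition (2.6) of $g$ into components $g_i$
with weights $\lambda_i$ ($\lambda_i$ rational) there exists a registration method $b_0 \in \mathcal{R}'_0$ with
decomposition $b_0 = \bigcup_{i=1}^{n} b_{0i}$ and detection response $b_i = b_{0i} \cap b \in \mathcal{R}(b_{0i})$
which satisfies
$$\psi\Big(b_0, \bigcup_i b_i\Big) = g, \quad \psi(b_{0i}, b_i) = g_i \quad \text{and} \quad \lambda_{\mathcal{R}_0}(b_0, b_{0i}) = \lambda_i.$$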
Axioms AR 1 and AP 1 have been introduced primarily for the purpose of
aiding the discussion of the physical meaning of certain aspects of quantum
mechanics. Here it is not necessary to remind the reader again of the possibility
of drawing incorrect conclusions from axiom AR 1 (these are analogous to those
described in the case of preparation procedures).
In closing this section we again state that axioms AP 1 and AR 1 are not
only applicable to the case of microsystems, but may be used for all systems in
physics. Of course, it is possible to choose stronger axioms than AP 1 and
AR 1 (see the discussion on coexistent decompositions and coexistent effects
in IV, §1 and §5 and the concept of a “physical object” which will be
introduced in §4.1) in such a way as to exclude the possibility of describing
microsystems.
The interest among theoreticians in the sets $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$ and their
physical meaning is divided. In a “classical world” in which the elements of
$M$ can be considered to be physical objects, the measurement
problems underlying $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$ are generally not of theoretical interest. It
is sometimes believed that it should be possible to construct a theory of
microsystems without making an inquiry into the physics of the measurement
process. In §4 and IV, §8.1 we shall find that this is not possible. We
shall eliminate much of the preparation and measurement process if we seek
to develop the theory with the aid of the sets $\mathcal{K}$ and $\mathcal{L}$ and the function
$\mu\colon \mathcal{K} \times \mathcal{L} \to [0, 1]$. In order that we may obtain the most general laws of
nature governing the processes of preparation and registration of microsystems
(analogous to the first and second laws of thermodynamics) in §3, it is
not necessary to use the sets $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$. That is, none of the individual
physical structures associated with apparatuses denoted by the elements of $\mathcal{Q}$
and $\mathcal{R}$ will be mapped into the mathematical theory $\mathcal{MT}_{\Sigma}$. This viewpoint
will appear to be sufficient, if not more than sufficient, to those whose interest
is in the description of microsystems. We shall consider this form of the theory
in the following chapters (up to and including XVI), paying close attention to
the difficulties inherent in such a viewpoint.
In XVII we shall seek to introduce additional structure on the sets $\mathcal{Q}$, $\mathcal{R}_0$,
and $\mathcal{R}$. The reader who is dissatisfied with the lack of structure on $\mathcal{Q}$, $\mathcal{R}_0$, and
$\mathcal{R}$ (that is, the treatment of the preparation and registration apparatuses as
“black boxes”), a dissatisfaction shared by the author, is referred to XVII,
XVIII, and [13].
3 General Laws:
Preparation and Registration of Microsystems
In this section we shall briefly digress and consider a few fundamental ideas
for the case of microsystems. This section may be skipped without loss of
continuity. We shall consider how the mathematical description of the set of
ensembles $\mathcal{K}$ and the set of effects $\mathcal{L}$ which will be formulated in §5 may be
deduced for the case of microsystems from physically motivated axioms. A
detailed presentation of this topic can be found in [13].
Because of the “finiteness of physics” (see [1], §9 or [2], III, §8) we shall
assume that the sets $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ are countable. Then $\mathcal{K}$ and $\mathcal{L}$ will be
countable.
The following theorem is a consequence of Th. 1.1.
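Th. 3.1. There exists a pair of real Banach spaces $\mathcal{B}$, $\mathcal{B}'$ (where $\mathcal{B}'$ is dual to
$\mathcal{B}$) and an embedding of $\mathcal{K}$ in $\mathcal{B}$ and $\mathcal{L}$ in $\mathcal{B}'$ (that is, $\mathcal{K}$, $\mathcal{L}$ can be identified
with subsets of $\mathcal{B}$ and $\mathcal{B}'$, respectively) for which the following conditions
hold:
(i) The canonical bilinear form $\langle w, g \rangle$ defined for the dual pair $\mathcal{B}$, $\mathcal{B}'$ is
identical to $\mu(w, g)$ on $\mathcal{K} \times \mathcal{L}$, that is,
$$\mu(w, g) = \langle w, g \rangle \big|_{\mathcal{K} \times \mathcal{L}}.$$
(ii) $\mathcal{B}$ is a base-norm space (see AIII, §6) with basis $K$ where $K$ is equal to
$\overline{\mathrm{co}}\, \mathcal{K}$ (where $\overline{\mathrm{co}}\, \mathcal{K}$ is the norm-closed convex set generated by $\mathcal{K}$). The
positive cone $\mathcal{B}_+$ generated by $K$ is closed. From Th. 2.2 it follows that
$K$ is the norm closure of $\mathcal{K}$; that is, the norm closure of $\mathcal{K}$ is already
convex.
(iii) The linear span of $\mathcal{L}$ is $\sigma(\mathcal{B}', \mathcal{B})$ dense in $\mathcal{B}'$ (for the $\sigma(\ldots)$-topology
see AIII, §4).
$\mathcal{B}$ and $\mathcal{B}'$ are also uniquely defined (up to isomorphism) by (ii)-(iii).
Since $\mathcal{K}$ is countable, it follows that $\mathcal{B}$ is separable.
The proof of this theorem can be found in [17] and [13].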
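We shall denote the dual form $\langle x, y \rangle$ for $\mathcal{B}$, $\mathcal{B}'$ by $\mu(x, y)$.
From Th. 3.1(ii) it follows that $\mathcal{B}'$ is an order unit space. Since
$0 \le \mu(w, g) \le 1$ for $w \in \mathcal{K}$ and $g \in \mathcal{L}$ we also obtain $0 \le \mu(w, g) \le 1$ for
$w \in K$ and $g \in \mathcal{L}$. This means that $\mathcal{L} \subset [0, 1]$ where $1$ is the order unit in $\mathcal{B}'$.
Let $L$ denote the $\sigma(\mathcal{B}', \mathcal{B})$ closure of $\mathcal{L}$ in $\mathcal{B}'$. From Th. 2.4 it follows that $L$ is
convex. Let $\mathcal{D}$ denote the norm-closure of the linear span of $\mathcal{L}$. Then $1 \in \mathcal{D}$
and $\mathcal{D}$ is a separable Banach subspace of $\mathcal{B}'$ ($\mathcal{D}$ is an order unit space). $\mathcal{D}$ is
$\sigma(\mathcal{B}', \mathcal{B})$ dense in $\mathcal{B}'$. $K$ is $\sigma(\mathcal{B}, \mathcal{D})$ precompact and $\sigma(\mathcal{B}, \mathcal{D})$ separable.
Let $\mathcal{D}'$ be the Banach space which is dual to $\mathcal{D}$; $\mathcal{D}'$ is a base-norm space.
We may identify the space $\mathcal{B}$ with a subspace of $\mathcal{D}'$. Let $\bar{K}$ denote the
$\sigma(\mathcal{D}', \mathcal{D})$-closure of $K$ in $\mathcal{D}'$. $\bar{K}$ is $\sigma(\mathcal{D}', \mathcal{D})$ compact and $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-compact.
For the compact sets $\bar{K}$ and $L$ the Krein-Milman theorem holds (AIII, §4).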
The topologies $\sigma(\mathcal{B}', \mathcal{B})$ and $\sigma(\mathcal{D}', \mathcal{D})$ have the following physical
interpretation: First, the topologies $\sigma(\mathcal{D}', \mathcal{D})$, $\sigma(\mathcal{D}', L \cap \mathcal{D})$ and $\sigma(\mathcal{D}', \mathcal{L})$ on $\bar{K}$
(and $K$) are identical since $\bar{K}$ is compact. The same is true for the topologies
$\sigma(\mathcal{B}', \mathcal{B})$, $\sigma(\mathcal{B}', K)$ and $\sigma(\mathcal{B}', \mathcal{K})$ on $L$ since $L$ is compact. The topologies
$\sigma(\mathcal{B}, \mathcal{L})$ on $K$ (or $\mathcal{K}$) and $\sigma(\mathcal{B}', \mathcal{K})$ on $L$ (or $\mathcal{L}$) describe the possibility of
“physically” distinguishing among ensembles (effects). We shall now illustrate
this for the case of ensembles: From $\mu(w_1, g) = \mu(w_2, g)$ for all $g \in \mathcal{L}$, it
follows that $w_1 = w_2$. Experimentally we can only use a finite number of
registration apparatuses with a finite number of “detection responses” (that
is, a finite number of $g$) in order to test whether two ensembles $w_1$ and $w_2$ are
different. In addition, we can only test to within a finite error whether
$\mu(w_1, g) = \mu(w_2, g)$. That is, for finitely many $g_1, g_2, \ldots, g_n$ and finite error
$\varepsilon > 0$ we can always test whether
$$|\mu(w_1, g_i) - \mu(w_2, g_i)| < \varepsilon \qquad (i = 1, 2, \ldots, n). \tag{3.1}$$
The inequalities (3.1) determine, for different $\varepsilon$, $n$, and $g_i$, the neighborhood
basis for the topology $\sigma(\mathcal{B}, \mathcal{L})$.
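The finite test (3.1) can be written out as a small Python sketch. It assumes a toy model in which ensembles and effects are finite vectors and the probability function is their dot product; the names `mu` and `indistinguishable` are hypothetical. The function reports whether two ensembles agree to within a given error on a given finite list of effects, which is the sense in which one lies in a basic neighborhood of the other.

```python
from fractions import Fraction

def mu(w, g):
    """Toy probability function (an assumption): dot product of an
    ensemble vector w with an effect vector g."""
    return sum(wk * gk for wk, gk in zip(w, g))

def indistinguishable(w1, w2, effects, eps):
    """Test (3.1): |mu(w1, g_i) - mu(w2, g_i)| < eps for finitely many g_i."""
    return all(abs(mu(w1, g) - mu(w2, g)) < eps for g in effects)

w1 = (Fraction(1, 2), Fraction(1, 2))
w2 = (Fraction(51, 100), Fraction(49, 100))
effects = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]

coarse = indistinguishable(w1, w2, effects, Fraction(1, 20))
fine = indistinguishable(w1, w2, effects, Fraction(1, 200))
```

With the coarse error bound the two toy ensembles cannot be told apart, while the finer bound separates them; enlarging the list of effects or shrinking the error bound shrinks the neighborhood.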
We shall now present additional axioms for $K$ and $L$. The physical
intuition upon which these axioms are based can be found in [13] and (in
part) in [17]. It is already clear, on the basis of the maps $\varphi$ of $\mathcal{Q}'$ into $\mathcal{B}$ and
$\psi$ of $\mathcal{F}$ into $\mathcal{B}'$, that the following additional axioms will represent indirect
assertions about the sets $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$.
In order to formulate additional axioms, we define the following:
D 3.1.
$$K_0(B) = \{\, w \mid w \in K \text{ and } \mu(w, g) = 0 \text{ for all } g \in B \subset L \,\},$$
$$K_1(B) = \{\, w \mid w \in K \text{ and } \mu(w, g) = 1 \text{ for all } g \in B \subset L \,\},$$
$$L_0(A) = \{\, g \mid g \in L \text{ and } \mu(w, g) = 0 \text{ for all } w \in A \subset K \,\}.$$
$K_0(B)$ and $K_1(B)$ are closed faces¹ of $K$, $L_0(A)$ is a $\sigma(\mathcal{B}', \mathcal{B})$-closed face of $L$.
If $B$ consists of only a single element $g$, instead of $K_0(B)$ we shall write $K_0(g)$
and similarly for $K_1$ and $L_0$.
¹ For the concept of a face, see §6.
It is easy to verify that the order relation $y_1 \le y_2$ in $\mathcal{B}'$ is equivalent to the
following relation
$$\mu(w, y_1) \le \mu(w, y_2) \quad \text{for all } w \in K.$$
We shall now state the first law of measurement as an axiom:
AV 1.1. To each pair $g_1, g_2 \in L$ there exists a $g_3 \in L$ for which $g_3 \ge g_1$,
$g_3 \ge g_2$ and $K_0(g_1) \cap K_0(g_2) \subset K_0(g_3)$.
AV 1.1 is equivalent to the statement that each $L_0(A)$ has a largest element,
which we denote by $eL_0(A)$ (see [17] and [13]).
All elements of $K$ (not only those of $\mathcal{K}$) are called ensembles; all elements
of $L$ (not only of $\mathcal{L}$) are called effects. The elements $eL_0(A)$ are called decision
effects. We shall denote the set of decision effects by $G$. Let $\partial_e L$ denote the set
of extreme points of $L$; we obtain $G \subset \partial_e L$ (see [17] and [13]).
For an arbitrary subset $\{A_\alpha\}$ of $\mathcal{P}(K)$ we find that the relationship
$L_0(\bigcup_\alpha A_\alpha) = \bigcap_\alpha L_0(A_\alpha)$ is satisfied. Thus we find that the set $\{L_0(A) \mid A \subset K\}$
is a complete lattice with respect to the partial order $\subset$ of set inclusion. Since
the map $L_0(A) \to eL_0(A)$ is an order isomorphism of $\{L_0(A) \mid A \subset K\}$ onto
$G$, we find that $G$ is a complete lattice with respect to the order induced on
$G \subset \mathcal{B}'$ by $\mathcal{B}'$.
For the second law, we propose the following:
AV 1.2s. L = [0,1].
From this axiom, it follows that the set $\{K_0(g) \mid g \in L\}$ coincides with the set
of so-called exposed¹ faces. We may also show that
$$\sup_{w \in K} \mu(w, e) = 1$$
for all $e \in G$ for which $e \neq 0$. Unfortunately, the following relation AV id
cannot be proven. It represents only a minor idealization. We shall introduce
it as an axiom.
AV id. To each $e \in G$, $e \neq 0$, there exists a $w \in K$ for which $\mu(w, e) = 1$.
This relation is equivalent to the assertion
$$e \in G \text{ implies } 1 - e \in G.$$
This relation may be used as an axiom instead of AV id.
Then we would find that the map $e \to e^{\perp} \overset{\mathrm{def}}{=} 1 - e$ is an orthocomplementation
in the lattice $G$ and that $G$ is orthomodular. The map $e \to K_1(e)$ is an
isomorphism between $G$ and the lattice of exposed faces of $K$.
¹ A face $F$ of $K$ is said to be exposed if and only if there exists a $y \in \mathcal{B}'$ for which
$$F = \{\, w \mid w \in K,\ \mu(w, y) = \sup_{w' \in K} \mu(w', y) \,\}.$$
We shall now define the notion of “distance” between two elements $e_1$, $e_2$
of $G$ (or the corresponding faces $K_1(e_1)$, $K_1(e_2)$ of $K$) as follows:
As an additional axiom, we assert the following:
AV 3. If $e_1, e_2, e_3 \in G$ and if $e_2 \le e_1 \le e_2 \vee e_3$ and $\Delta(e_1, e_3) \neq 0$ then
$e_1 = e_2$.
For the case in which G is a Boolean ring axiom AV 3 will be satisfied as
a theorem. “Classical systems” are often defined by the requirement that G be a
nonatomic Boolean ring. Instead of the nonatomic condition we shall require
that each face of К is infinite dimensional. The requirement that G be a
Boolean ring may be replaced by other equivalent assertions (see the general
discussion in [13]). In D 4.1.2 we shall define what we mean by the
expression “physical object.” The assertion that G is a Boolean ring may be
replaced by the requirement that the physical systems in M are physical
objects (for a proof see [1], §12.3).
The next axiom will permit us to distinguish between microsystems and
classical systems. We shall call this axiom the “law of quantization.”
AV 4s. Every exposed face of $K$ is the upper bound (the lattice union) of an
increasing sequence of exposed and finite-dimensional faces.
We extend this axiom by the following assertion:
AV 2f. Every finite-dimensional face of $K$ is exposed.
D 3.2. If axioms AV 1.1, AV 1.2s, AV 2f, AV id, AV 3, and AV 4s are
satisfied we then say that $M$ (together with the structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$) is a set of
microsystems.
The following important theorem holds (the proof will not be given here;
see [17] and [13]):
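The relations AV 1.1, AV 1.2s, AV 2f, AV id, AV 3, and AV 4s are
equivalent to the condition that the Banach spaces $\mathcal{B}$, $\mathcal{B}'$ can be identified
with the spaces $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$, $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ with $K$ as the basis of
$\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $L$ as the order interval $[0, 1]$ of $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$, where we
assume that the lattice-dimension of the irreducible parts of $G$ is not 2 or 3.
Here $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ are understood as they are
defined in AIV, §15, with the generalization that the number fields of the
Hilbert spaces $\mathcal{H}_i$ may be either the set of real numbers $\mathbb{R}$, the set of complex
numbers $\mathbb{C}$, or the set of quaternions $\mathbb{Q}$.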
There are physical arguments (that is, physical facts—see VIII, §2) which
permit us to exclude the cases R and Q. In §5 we shall assert this “end
&(ei> ei) = max 1 — ec)
result”—which is historically obtained by means of the correspondence
principle (see, for example [12], [2], XI, §1 and [2], XIII, §3) as an axiom for
microsystems. The reader who is willing to accept this “end result” as
“axioms for microsystems” and the accompanying structure of the set of
ensembles К and the set of effects L as a hypothesis which has been verified
will be able to follow the rest of this book without knowledge of this section.
Those readers who are interested in the problems described in this section are
again referred to references [17] and [13].
In order to dispel scepticism that the axioms which describe the sets
$M$, $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ and the function $\lambda_{\mathcal{S}}$ can restrict the probability function $\mu$ over
$\mathcal{K} \times \mathcal{L}$ more than that which is permitted by the above theorem (or AQ in
§5), it has been shown (in [8]) that for each function $\mu\colon \mathcal{K} \times \mathcal{L} \to [0, 1]$
which satisfies the theorem, it is possible to construct a model consisting of
sets $M$, $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ and a function $\lambda_{\mathcal{S}}$. This does not mean that there is only
one such construction possible; we must assume that it is possible to
construct many nonisomorphic models $M$, $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$, $\lambda_{\mathcal{S}}$ for a given function $\mu$.
4 Properties and Pseudoproperties
In the discussion of the concept of a physical object which was presented in
II, §1 we have left open what we mean by the expression “objective property.”
In this section we shall seek to clarify this and other questions.
Here we shall seek conceptual clarity. We shall, for the most part, only
sketch much of the mathematical content; much of the subject matter of this
section does not have a direct bearing on the problems treated in this book.
4.1 Properties and Physical Objects
If, in addition to the axioms APS 1-APS 7 we also add the conditions $M \in \mathcal{Q}$,
$M \in \mathcal{R}_0$ (and therefore $M \in \mathcal{R}$) then $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ and $\mathcal{S}$ (since $M = M \cap M \in \Theta$
implies $M \in \mathcal{S}$) are Boolean rings of sets. Each of these sets can be
considered to be a set of properties. These properties are, however, the
opposite of that which we have called “objective” because they refer to the
preparation and registration apparatuses rather than to the microsystems.
For example, $a \in \mathcal{Q}$ is the “property that the systems $x \in a$ are prepared
according to the procedure $a$.” An objective property should refer directly to
the microsystem itself and be independent of the preparation and registration
process. By this we mean that a set a which is selected by a preparation (and
similarly a set b which is selected by a registration) may be divided—
according to objective properties—into subsets, and that such a “part” of a
can be treated as if it were a fictitious preparation procedure.
We shall now seek to formulate this idea mathematically in terms of the
relationship between a set of objective properties and the sets $\mathcal{Q}$ and $\mathcal{R}$ in
order to precisely define the concept of an objective property. For this
purpose we shall use part of the general treatment which is presented in
[1], §12.3.
In addition to the structure defined on $M$ by $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ suppose that a set of
properties $\mathcal{E}$ is given (that is, $\mathcal{E} \subset \mathcal{P}(M)$ satisfies AE 1 and AE 2).
D 4.1.1. Let $\tilde{\mathcal{Q}}$ be the set of selection procedures generated by the set
$\{(a \cap p) \mid a \in \mathcal{Q},\ p \in \mathcal{E}\}$.
Since $M \in \mathcal{E}$ we find that $\mathcal{Q} \subset \tilde{\mathcal{Q}}$. We assert the following axiom:
AE 3. $\tilde{\mathcal{Q}}$ is a statistical selection procedure. For the probability function $\lambda_{\tilde{\mathcal{Q}}}$,
for $a_1, a_2 \in \mathcal{Q}$ we require that $\lambda_{\tilde{\mathcal{Q}}}(a_1, a_2) = \lambda_{\mathcal{Q}}(a_1, a_2)$.
We may consider $\tilde{\mathcal{Q}}$ to be an extended system of preparation procedures.
For example, $a \cap p$ is the extended preparation procedure which prepares
the system according to $a$ and results in the selection of those with property $p$
for further experimentation. Thus $\lambda_{\tilde{\mathcal{Q}}}(a, a \cap p)$ is the probability that a system
prepared according to $a$ also “exhibits” property $p$.
In complete analogy to II, Th. 4.5.1 it follows that $\tilde{\mathcal{Q}}$ is the set found in
II (4.5.1) provided that the set $\Theta$ is replaced by the set $\{(a \cap p) \mid a \in \mathcal{Q},\ p \in \mathcal{E}\}$.
In particular, to each $\tilde{a} \in \tilde{\mathcal{Q}}$ there exists an $a \in \mathcal{Q}$ for which $\tilde{a} \subset a$.
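The extended preparation procedures in $\tilde{\mathcal{Q}}$ represent idealized refinements
of the preparation procedures in $\mathcal{Q}$. This fact motivates the following
extension of II, D 4.3.1:
Let $\tilde{a} \in \tilde{\mathcal{Q}}$ and let $b_0 \in \mathcal{R}_0$; we say that $\tilde{a}$ may be combined with $b_0$ if there
exists an $a \in \mathcal{Q}$ such that $\tilde{a} \subset a$ and $(a, b_0) \in \mathcal{C}$.
We now define the following as an extension of II, D 4.3.1:
$$\tilde{\mathcal{C}} = \{\, (\tilde{a}, b_0) \mid \tilde{a} \in \tilde{\mathcal{Q}},\ b_0 \in \mathcal{R}_0 \text{ and } \tilde{a} \text{ may be combined with } b_0 \,\}.$$
Here we find that $\mathcal{C} \subset \tilde{\mathcal{C}}$.
By analogy with the sets $\Theta$, $\mathcal{S}$ defined in II, §4.3 we may also define the
sets $\tilde{\mathcal{Q}}'$, $\tilde{\Theta}$, $\tilde{\mathcal{S}}$. We would then obtain $\Theta \subset \mathcal{S} \subset \tilde{\mathcal{S}}$.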
By analogy with APS 6 we shall now introduce the following axiom:
AE 4.1. У is a statistical selection procedure. For the probability function
we find that if we replace й by J, by we find that APS 7.1, 2 holds.
In addition, for със1е9? с/ we find that
A^(ci, c2) = k^(cl9 c2\
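From AE 4.1 it follows that: if $(a, b_0) \in \mathcal{C}$ and $a \cap p \neq \emptyset$ then
$a \cap p \cap b_0 \neq \emptyset$.
For $\mathcal{C}$ and $\mathcal{E}$ we require that
AE 4.2. Let $b_0 \in \mathcal{R}_0$ and $p \in \mathcal{E}$. If $a \cap p = \emptyset$ for all $a$ for which $(a, b_0) \in \mathcal{C}$
then $p = \emptyset$.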
This axiom expresses the requirement that the combination problem and
the relation $p \in \mathcal{E}$ are mutually independent.
D 4.1.2. Let $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$, $\mathcal{E}$ be defined on $M$ according to the axioms
described above and those axioms given in II, §4.3 and III, §1. Then we shall
call $\mathcal{E}$ a set of virtual properties (virtual with respect to the structures
$\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$).
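Let $p_1, p_2 \in \mathcal{E}$ and let $b_0 \in \mathcal{R}_0$, $b \in \mathcal{R}$ and $b \subset b_0$. Then for $\lambda_{\tilde{\mathcal{S}}}$ we find that
$$\begin{aligned}
\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ a \cap p_1 \cap p_2 \cap b)
&= \lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ a \cap p_1 \cap p_2 \cap b_0)\,
   \lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap p_2 \cap b_0,\ a \cap p_1 \cap p_2 \cap b) \\
&= \lambda_{\tilde{\mathcal{Q}}}(a \cap p_1,\ a \cap p_1 \cap p_2)\,
   \lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap p_2 \cap b_0,\ a \cap p_1 \cap p_2 \cap b).
\end{aligned} \tag{4.1.1}$$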
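Thus we find that $\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ a \cap p_1 \cap p_2 \cap b)$ is determined by the
values of $\lambda_{\tilde{\mathcal{S}}}(\tilde a \cap b_0,\ \tilde a \cap b)$ where $\tilde a \in \tilde{\mathcal{Q}}'$ and $(b_0, b) \in \mathcal{F}$. On the basis of this
result, we introduce the following selection structure as a substitute for
$\mathcal{R}_0$ and $\mathcal{R}$: $\mathcal{R}_0$ is unchanged; instead of $\mathcal{R}$ we consider the set $\tilde{\mathcal{R}}$ of all selection
procedures generated by all $b \cap p$ where $b \in \mathcal{R}$ and $p \in \mathcal{E}$. We find that
$\mathcal{R}_0 \subset \mathcal{R} \subset \tilde{\mathcal{R}}$, and that the system of selection procedures generated by
the $\tilde a \cap \tilde b$ where $\tilde a \in \tilde{\mathcal{Q}}$, $\tilde b \in \tilde{\mathcal{R}}$ is identical to $\tilde{\mathcal{S}}$ (from which $\lambda_{\tilde{\mathcal{S}}}$ is determined
by (4.1.1)). It is easy to show that the system $\tilde{\mathcal{R}}$ can be considered to be an
extended system of registration procedures, that is, if $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ satisfy the
above axioms, then $\tilde{\mathcal{Q}}$, $\mathcal{R}_0$, $\tilde{\mathcal{R}}$ also do.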
Thus the function $\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ (a \cap p_1) \cap (b \cap p_2))$ takes on the following
very intuitive meaning. The “idealized refined” prepared systems in
$\tilde a = a \cap p_1$ will be registered by the method $b_0$ in such a way as to permit the
use of the “idealized refined” registration procedure $\tilde b = b \cap p_2$. For the
“idealized” registration procedure $\tilde b = b_0 \cap p_2$ we find that
$\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ a \cap p_1 \cap p_2 \cap b_0) = \lambda_{\tilde{\mathcal{Q}}}(a \cap p_1,\ a \cap p_1 \cap p_2)$ is equal to the
probability (which is independent of $b_0$) that the systems which are prepared
according to $a \cap p_1$ “have the property” $p_2$.
The requirements we have imposed on the set $\mathcal{E}$ of virtual properties
together with the structures $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ appear to be too weak for us to classify
them as “objective,” especially since $\mathcal{E}$ is not determined by $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$.
Therefore, we shall call the elements of the set $\mathcal{E}$ which we may add to $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$
“virtual properties”; when we wish to stress their “virtual” character, we shall
refer to them as “hidden properties.” The expression “hidden variables” is
often used instead of hidden properties. The reason for this name will now be
given. We note that, to each $x \in M$, the subset $\mathcal{E}(x) = \{p \mid p \in \mathcal{E} \text{ and } x \in p\}$
corresponds to an ultrafilter $\mathcal{E}(x)$ in the Boolean ring $\mathcal{E}$.
According to a theorem by Stone, each Boolean ring may be described in
terms of a set $\Pi$ in which each ultrafilter corresponds to a single point of $\Pi$.
To each $x \in M$ there is a point $\pi \in \Pi$; every $x \in M$ which corresponds to the
same ultrafilter $\mathcal{E}(x)$ is mapped to the same point $\pi$. $\Pi$ is called the space of
hidden variables. In this book we shall not attempt to formulate the problem
of hidden variables in mathematical terms. Instead, we shall only attempt to
formulate what we would intuitively call “nonhidden” or “measurable”
properties.
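The passage from systems $x$ to points of $\Pi$ can be illustrated on a finite toy model. The following is only a sketch under stated assumptions: the five-element set `M`, the two generating sets, and the helper names `generated_ring` and `hidden_variable_points` are hypothetical and are not taken from the text. In the finite case every ultrafilter is of the form $\mathcal{E}(x)$, so grouping the $x$ by $\mathcal{E}(x)$ yields the points of $\Pi$.

```python
def generated_ring(M, generators):
    """Close a family of subsets of M under union, intersection and complement."""
    M = frozenset(M)
    ring = {frozenset(), M} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        for p in list(ring):
            for q in list(ring):
                for r in (p | q, p & q, M - p):
                    if r not in ring:
                        ring.add(r)
                        changed = True
    return ring


def hidden_variable_points(M, ring):
    """Group the systems x by the family E(x) = {p in ring : x in p}."""
    points = {}
    for x in M:
        E_x = frozenset(p for p in ring if x in p)
        points.setdefault(E_x, set()).add(x)
    return list(points.values())


# Hypothetical toy data: five systems, two generating properties.
M = {1, 2, 3, 4, 5}
E = generated_ring(M, [{1, 2}, {2, 3}])
classes = hidden_variable_points(M, E)
```

In this toy model the systems 4 and 5 cannot be told apart by any property in the ring, so they are mapped to the same point of $\Pi$.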
The following condition appears to be obvious: $p \subset M$ is “measurable”
(and is therefore not hidden) if $b_0 \cap p \in \mathcal{R}$ for all $b_0 \in \mathcal{R}_0$. This condition is,
however, too strong because we may only be able to register a $p \subset M$
approximately.
In II, §3 (after introducing AS 2.4) we have stated that we may mathematically
extend the set of selection procedures by adding “idealized limiting
elements.” We shall do so now, not for the general case (see [18]) but
only for the case of registration procedures, for which we shall add certain
idealized registration procedures.
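D 4.1.3. A set $c \subset M$ is called an idealized registration procedure if there is
a $b_0 \in \mathcal{R}_0$ for which $c \subset b_0$ and
$$c = \bigcup_{b \in \mathcal{R},\ b \subset c} b, \qquad b_0 \setminus c = \bigcup_{b \in \mathcal{R},\ b \subset b_0 \setminus c} b.$$
Th. 4.1.1. The map $\psi$ (see D 1.3) of $\mathcal{F}$ into $\mathcal{L}$ (and therefore of $\mathcal{F}$ into $L$, where
$L$ is defined in §3 and §5) may be extended to an idealized registration
procedure $c$ as follows:
$$\psi(b_0, c) = \sup_{b \in \mathcal{R},\ b \subset c} \psi(b_0, b) = \inf_{b \in \mathcal{R},\ b_0 \supset b \supset c} \psi(b_0, b). \tag{4.1.2}$$
The function $\lambda_{\mathcal{S}}$ may be extended to $c$ as follows:
$$\lambda_{\mathcal{S}}(a \cap b_0,\ a \cap c) = \sup_{b \in \mathcal{R},\ b \subset c} \lambda_{\mathcal{S}}(a \cap b_0,\ a \cap b)
= \inf_{b \in \mathcal{R},\ b_0 \supset b \supset c} \lambda_{\mathcal{S}}(a \cap b_0,\ a \cap b). \tag{4.1.3}$$
The properties of the function $\lambda_{\mathcal{S}}$ are preserved when $\mathcal{R}$ is extended to
include the set of selection procedures generated by all $b \cap c$ and $b \cap (M \setminus c)$.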
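Proof. We shall only sketch the essential part of the proof, namely that
$$\sup_{b \in \mathcal{R},\ b \subset c} \psi(b_0, b) = \inf_{\tilde b \in \mathcal{R},\ b_0 \supset \tilde b \supset c} \psi(b_0, \tilde b).$$
Since the set of the $b \subset c$ is upwardly directed (in the sense of the order relation $\subset$
of set inclusion) the set of the $\psi(b_0, b)$ is also upwardly directed in $\mathcal{B}'$. Since
$\psi(b_0, b) \in L$ and $L$ is compact, the sup and inf exist (and are also limits in the
$\sigma(\mathcal{B}', \mathcal{B})$ topology) and are in $L$.
From $b \subset c \subset \tilde b \subset b_0$ it follows that $\psi(b_0, b) \le \psi(b_0, \tilde b)$ and
$$\sup_{b} \psi(b_0, b) \le \inf_{\tilde b} \psi(b_0, \tilde b).$$
The condition $c \subset \tilde b \subset b_0$ is equivalent to the condition $b_0 \setminus \tilde b \subset b_0 \setminus c$. Since
$$b_0 \setminus c = \bigcup_{b' \in \mathcal{R},\ b' \subset b_0 \setminus c} b'
= \bigcup_{\tilde b \in \mathcal{R},\ b_0 \supset \tilde b \supset c} (b_0 \setminus \tilde b)
= b_0 \setminus \bigcap_{\tilde b \in \mathcal{R},\ b_0 \supset \tilde b \supset c} \tilde b$$
it follows that
$$c = \bigcap_{\tilde b \in \mathcal{R},\ b_0 \supset \tilde b \supset c} \tilde b.$$
Thus we obtain
$$\emptyset = c \cap (b_0 \setminus c)
= \Bigl[\bigcap_{\tilde b \in \mathcal{R},\ b_0 \supset \tilde b \supset c} \tilde b\Bigr] \cap \Bigl[\bigcap_{b \in \mathcal{R},\ b \subset c} (b_0 \setminus b)\Bigr]
= \bigcap_{b, \tilde b \in \mathcal{R},\ b_0 \supset \tilde b \supset c \supset b} (\tilde b \setminus b).$$
Since the set of $b$ is upwardly directed, and the set of $\tilde b$ is downwardly directed, the
set of $\tilde b \setminus b$ is downwardly directed. By AS 2.4.1 we find that
$$\inf_{b, \tilde b} \psi(b_0, \tilde b \setminus b) = 0$$
and therefore
$$\sup_{b \in \mathcal{R},\ b \subset c} \psi(b_0, b) = \inf_{\tilde b \in \mathcal{R},\ b_0 \supset \tilde b \supset c} \psi(b_0, \tilde b).$$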
D 4.1.4. We say that the set $p \subset M$ may be ideally registered if $b_0 \cap p$ is an
idealized registration procedure for each $b_0 \in \mathcal{R}_0$.
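Th. 4.1.2. $p$ may be ideally registered if and only if
$$p = \bigcup_{b \in \mathcal{R},\ b \subset p} b \quad\text{and}\quad M \setminus p = \bigcup_{b \in \mathcal{R},\ b \subset M \setminus p} b. \tag{4.1.4}$$
Proof. If $p$ may be ideally registered then for each $b_0 \in \mathcal{R}_0$ we obtain:
$$b_0 \cap p = \bigcup_{b \in \mathcal{R},\ b \subset b_0 \cap p} b \quad\text{and}\quad
b_0 \cap (M \setminus p) = b_0 \setminus (b_0 \cap p) = \bigcup_{b \in \mathcal{R},\ b \subset b_0 \setminus (b_0 \cap p)} b. \tag{4.1.5}$$
According to APS 8.2 and APS 4.2 we have $\bigcup_{b_0 \in \mathcal{R}_0} b_0 = M$. Thus, from
APS 4.2 it follows that
$$p = M \cap p = \bigcup_{b_0 \in \mathcal{R}_0} (b_0 \cap p)
= \bigcup_{b_0 \in \mathcal{R}_0}\ \bigcup_{b \in \mathcal{R},\ b \subset b_0 \cap p} b
= \bigcup_{b \in \mathcal{R},\ b \subset p} b. \tag{4.1.6}$$
In a similar way we obtain
$$M \setminus p = \bigcup_{b \in \mathcal{R},\ b \subset M \setminus p} b. \tag{4.1.7}$$
Conversely, if (4.1.6) and (4.1.7) hold, it immediately follows that $p$ may be ideally registered.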
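Condition (4.1.4) is easy to test when everything is finite. The following sketch assumes a hypothetical four-element set `M` and a hypothetical family `R` of registration procedures; it ignores the remaining axioms on $\mathcal{R}_0$ and $\mathcal{R}$ and only illustrates the two unions in (4.1.4).

```python
def union_of_contained(family, target):
    """Union of all members of `family` that are subsets of `target`."""
    out = frozenset()
    for b in family:
        if b <= target:
            out |= b
    return out


def ideally_registrable(p, M, R):
    """Check condition (4.1.4) for a finite toy family R."""
    return (union_of_contained(R, p) == p
            and union_of_contained(R, M - p) == M - p)


# Hypothetical toy data, not taken from the text.
M = frozenset(range(4))
R = [frozenset(s) for s in [(0,), (1,), (0, 1), (2, 3)]]

ok_single = ideally_registrable(frozenset({0}), M, R)
ok_pair = ideally_registrable(frozenset({0, 1}), M, R)
not_ok = ideally_registrable(frozenset({2}), M, R)
```

Here the set {2} fails the test because no registration procedure of the toy family lies inside it, so it cannot be exhausted from within.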
For a set $p$ which may be ideally registered the effects $\psi(b_0, b_0 \cap p) \in L$ and
$\psi(b_0, b \cap p) \in L$ where $b \subset b_0$ are uniquely defined.
Th. 4.1.3. The set $\mathcal{E}_r$ of all sets which may be ideally registered is a Boolean
ring.
The proof of this theorem is easy, and is left to the reader.
By analogy with the case of registration, we shall define the notion of a set
which may be ideally prepared. By analogy to D 4.1.2 we define:
D 4.1.5. We say that a set $p \subset M$ may be ideally prepared if
$$p = \bigcup_{a \in \mathcal{Q},\ a \subset p} a \quad\text{and}\quad M \setminus p = \bigcup_{a \in \mathcal{Q},\ a \subset M \setminus p} a. \tag{4.1.8}$$
The statements made above for sets which may be ideally registered are also
valid for sets which may be ideally prepared.
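Th. 4.1.4. The function $\varphi: \mathcal{Q}' \to \mathcal{K} \subset K$ has a unique extension to the set of
all $a \cap p \neq \emptyset$ where $a \in \mathcal{Q}'$ and $p$ may be ideally prepared. The extension is
given by
$$\varphi(a \cap p) = w\,\mu(w, 1)^{-1},$$
where
$$w = \sup_{\tilde a \in \mathcal{Q},\ \tilde a \subset a \cap p} \lambda_{\mathcal{Q}}(a, \tilde a)\,\varphi(\tilde a)
= \inf_{\tilde a \in \mathcal{Q},\ a \supset \tilde a \supset a \cap p} \lambda_{\mathcal{Q}}(a, \tilde a)\,\varphi(\tilde a). \tag{4.1.9}$$
Proof. The proof proceeds in a similar manner as Th. 4.1.2, where it is only
important to note that sup and inf exist in the sense of the norm in $\mathcal{B}$. For, if
$w_1, w_2 \in K$ (for definition of $K$, see AIII, §6), and $w_1 \ge w_2$ it follows that
$$\|w_1 - w_2\| = \mu(w_1 - w_2, 1).$$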
The following theorem follows directly from Th. 4.1.3.
Th. 4.1.5. The set $\mathcal{E}_p$ of all sets which may be ideally prepared is a Boolean
ring of sets.
Let $\mathcal{E}_m$ denote the set $\mathcal{E}_r \cap \mathcal{E}_p$, that is, the set of all sets which may be both
ideally prepared and ideally registered. Clearly $\mathcal{E}_m$ is a Boolean ring of sets.
Th. 4.1.6. For each $p \in \mathcal{E}_m$ the map $T_p$ on $\mathcal{K}$ (for definition of $K$, see
AIII, §6) which is defined by $\varphi(a) \to \lambda_{\mathcal{Q}}(a, a \cap p)\,\varphi(a \cap p)$ is norm-continuous
and has a unique extension $T_p$ on $\mathcal{B}$ which is linear and norm-continuous.
This map is uniquely determined by the following equation
$$\mu(w, \psi(b_0, b \cap p)) = \mu(T_p w, \psi(b_0, b)), \tag{4.1.10}$$
which is valid for all $w \in K$ and all $(b_0, b) \in \mathcal{F}$. In addition, the following
equations hold:
$$T_{M \setminus p} = 1 - T_p, \qquad T_{p_1 \cap p_2} = T_{p_1} T_{p_2} = T_{p_2} T_{p_1}. \tag{4.1.11}$$
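Proof. If $w = \varphi(a)$ (that is, $w \in \mathcal{K}$) it follows that:
$$\begin{aligned}
\mu(w, \psi(b_0, b \cap p)) &= \lambda_{\mathcal{S}}(a \cap b_0,\ a \cap b \cap p) \\
&= \lambda_{\mathcal{S}}(a \cap b_0,\ a \cap p \cap b_0)\,\lambda_{\mathcal{S}}(a \cap b_0 \cap p,\ a \cap p \cap b) \\
&= \lambda_{\mathcal{Q}}(a, a \cap p)\,\mu(\varphi(a \cap p), \psi(b_0, b)) \\
&= \mu(T_p \varphi(a), \psi(b_0, b)).
\end{aligned}$$
Thus (4.1.10) is proven for $w \in \mathcal{K}$. From (4.1.10), if $\sum_{i=1}^{n} \alpha_i w_i = 0$, $w_i \in \mathcal{K}$, it follows
that $\sum_{i=1}^{n} \alpha_i T_p w_i = 0$. Therefore $T_p$ can be uniquely extended as a linear map to all
of the linear span of $\mathcal{K}$ in $\mathcal{B}$. If $T_p$ is norm-continuous,
then it can be extended to all of $\mathcal{B}$.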
In order to prove that $T_p$ is norm-continuous, we shall assume that $\mathrm{co}\,\mathcal{L}$ is
$\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$ (which is the case for quantum mechanics; see §3 or §5). Since
$[-1, 1] = 2L - 1$, for $w_1, w_2 \in \mathcal{K}$ we find that
$$\|T_p w_1 - T_p w_2\| = \sup_{g \in L} \mu(T_p w_1 - T_p w_2,\ 2g - 1)
\le |\mu(T_p w_1 - T_p w_2, 1)| + 2 \sup_{g \in L} \mu(T_p w_1 - T_p w_2, g).$$
From (4.1.10), for $w \in \mathcal{K}$ we find that, for the special case $b = b_0$,
$$\mu(T_p w, 1) = \mu(T_p w, \psi(b_0, b_0)) = \mu(w, \psi(b_0, b_0 \cap p)) \tag{4.1.12}$$
and we obtain:
$$|\mu(T_p w_1 - T_p w_2, 1)| = |\mu(w_1 - w_2, \psi(b_0, b_0 \cap p))| \le \|w_1 - w_2\|.$$
Since $\mathrm{co}\,\mathcal{L}$ is dense in $L$ we obtain
$$\sup_{g \in L} \mu(T_p w_1 - T_p w_2, g) = \sup_{g \in \mathcal{L}} \mu(T_p w_1 - T_p w_2, g).$$
For $g \in \mathcal{L}$ there exists a $(b_0, b) \in \mathcal{F}$ with $\psi(b_0, b) = g$. Therefore we find that
$$\sup_{g \in L} \mu(T_p w_1 - T_p w_2, g) = \sup_{(b_0, b) \in \mathcal{F}} \mu(T_p w_1 - T_p w_2, \psi(b_0, b))$$
and, since
$$\mu(T_p w_1 - T_p w_2, \psi(b_0, b)) = \mu(w_1 - w_2, \psi(b_0, b \cap p)) \le \|w_1 - w_2\|$$
we finally obtain
$$\|T_p w_1 - T_p w_2\| \le 3\,\|w_1 - w_2\|,$$
whereupon we have proven the norm-continuity of $T_p$. Thus $T_p$ is defined on all of $\mathcal{B}$.
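From (4.1.10) and
$$\psi(b_0, b \cap p) + \psi(b_0, b \cap (M \setminus p)) = \psi(b_0, b)$$
it follows that
$$\begin{aligned}
\mu(w, \psi(b_0, b \cap (M \setminus p))) &= \mu(w, \psi(b_0, b)) - \mu(w, \psi(b_0, b \cap p)) \\
&= \mu(w, \psi(b_0, b)) - \mu(T_p w, \psi(b_0, b)) \\
&= \mu((1 - T_p) w, \psi(b_0, b)).
\end{aligned}$$
Thus we find that
$$T_{M \setminus p} = 1 - T_p.$$
Since $T_p$ is defined on all of $\mathcal{B}$, for $w \in K$ it follows that
$$\mu(w, \psi(b_0, b \cap p)) = \mu(T_p w, \psi(b_0, b)).$$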
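From
$$\psi(b_0, b \cap p_1 \cap p_2) = \sup_{\tilde b \in \mathcal{R},\ \tilde b \subset p_1 \cap p_2 \cap b} \psi(b_0, \tilde b)
= \sup_{b', \tilde b \in \mathcal{R},\ b' \subset b \cap p_1,\ \tilde b \subset b \cap p_2} \psi(b_0, b' \cap \tilde b)$$
it follows that
$$\psi(b_0, b \cap p_1 \cap p_2) = \sup_{\tilde b \in \mathcal{R},\ \tilde b \subset p_2 \cap b} \psi(b_0, p_1 \cap \tilde b).$$
Thus we obtain
$$\begin{aligned}
\mu(w, \psi(b_0, b \cap p_1 \cap p_2)) &= \sup_{\tilde b \subset p_2 \cap b} \mu(w, \psi(b_0, p_1 \cap \tilde b)) \\
&= \sup_{\tilde b \subset p_2 \cap b} \mu(T_{p_1} w, \psi(b_0, \tilde b)) = \mu(T_{p_1} w, \psi(b_0, b \cap p_2)) \\
&= \mu(T_{p_2} T_{p_1} w, \psi(b_0, b)).
\end{aligned}$$
From which we obtain
$$T_{p_1 \cap p_2} = T_{p_1} T_{p_2} = T_{p_2} T_{p_1}.$$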
D 4.1.6. The set $\mathcal{E}_m$ of all subsets $p \subset M$ which may be both ideally
registered and ideally prepared is called the set of objective properties of $M$.
We note that D 4.1.6 makes sense because $\mathcal{E}_m$ is a Boolean ring.
From (4.1.12) it follows that the map $T_p'$ in $\mathcal{B}'$ which is dual to $T_p$ satisfies
the equation
$$\chi(p) \overset{\text{def}}{=} \psi(b_0, b_0 \cap p) = T_p' 1 \tag{4.1.13}$$
and therefore $\psi(b_0, b_0 \cap p)$ is independent of $b_0$. If $p_1 \cap p_2 = \emptyset$, from
(4.1.13) it is easy to show that
$$\chi(p_1 \vee p_2) = \chi(p_1) + \chi(p_2). \tag{4.1.14}$$
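If, instead of $\mathcal{E}$ we use $\mathcal{E}_m$, and we use the sets $\tilde{\mathcal{Q}}$, $\tilde{\mathcal{R}}$, etc. which were defined
with the help of $\mathcal{E}$, it follows that AE 3 and AE 4.1, 2 are theorems and that
$$\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ a \cap p_1 \cap b \cap p_2)
= \frac{\mu(T_{p_1} \varphi(a),\ \psi(b_0, b \cap p_2))}{\mu(T_{p_1} \varphi(a),\ 1)} \tag{4.1.15}$$
and
$$\lambda_{\tilde{\mathcal{Q}}}(a, a \cap p) = \mu(T_p \varphi(a), 1) = \mu(\varphi(a), \chi(p)) \tag{4.1.16}$$
are satisfied. Thus $\mathcal{E}_m$ is, in the sense of D 4.1.2, also a set of virtual
properties.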
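The algebra of the maps $T_p$ can be checked by hand in a classical toy model. The following sketch is an illustration under assumptions that are not taken from the text: $M$ is a hypothetical four-point set, an ensemble $w$ is a probability weight on $M$, $T_p$ is restriction of $w$ to $p$, and $\chi(p)$ is the indicator function of $p$. In that model (4.1.11), (4.1.12) and (4.1.16) reduce to elementary identities.

```python
from fractions import Fraction


def T(p, w):
    """Toy classical model: T_p keeps the weight of w on p and drops the rest."""
    return {x: (wx if x in p else Fraction(0)) for x, wx in w.items()}


def mu(w, g):
    """Pairing mu(w, g) = sum_x w(x) g(x) for a function g on M."""
    return sum(wx * g(x) for x, wx in w.items())


def chi(p):
    """Effect chi(p): the indicator function of p."""
    return lambda x: 1 if x in p else 0


def one(x):
    """The unit effect."""
    return 1


# Hypothetical toy data.
M = {0, 1, 2, 3}
w = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
p1, p2 = {0, 1}, {1, 2}

# mu(T_p w, 1) = mu(w, chi(p)), compare (4.1.12) and (4.1.16)
prob_p1 = mu(T(p1, w), one)

# Conditional probability of p2 given p1, compare (4.1.15) with b = b0
cond = mu(T(p1, w), chi(p2)) / mu(T(p1, w), one)
```

The commutation $T_{p_1} T_{p_2} = T_{p_2} T_{p_1}$ holds trivially here because both sides restrict $w$ to $p_1 \cap p_2$.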
In D 4.1.6 we have called the set $\mathcal{E}_m$ the set of objective properties because
we believe that the mathematical structure for $\mathcal{E}_m$ characterizes what we
mean intuitively by the expression “objective property.” By this we mean that
the relations AE 3 and AE 4.1, 2 describe the condition that the properties of
the system be independent of the preparation and registration procedures.
The condition that $p \in \mathcal{E}_m$ may be both ideally prepared and registered says
that $p$ is not “hidden.”
We shall call those physical systems which are completely described by the
set $\mathcal{E}_m$ of objective properties physical objects. How can we find a mathematically
precise definition of the expression “completely described”? The set $\mathcal{E}_m$
can be so “small” that different ensembles $w_1, w_2 \in K$ cannot be distinguished
by means of $\mu(w, \chi(p))$. By “completely described” we mean that $\mathcal{E}_m$ is
sufficiently “large” that the $w \in K$ can be distinguished by the $p \in \mathcal{E}_m$. In
mathematical terms we may express this idea as follows: $\chi(\mathcal{E}_m)$ separates $K$,
that is, if $\mu(w_1, \chi(p)) = \mu(w_2, \chi(p))$ for all $p \in \mathcal{E}_m$ then $w_1 = w_2$. Thus we define:
D 4.1.7. A set $M$ of physical systems is called a set of physical objects if $\chi(\mathcal{E}_m)$
separates the $w \in K$.¹
In IV, §8.1 we shall find that microsystems are not physical objects. In
[1], §12.3 we have shown what structures $K$ and $L$ must have in order to
describe physical objects.
Intuitively, on the basis of the formulation of the set $\mathcal{E}_m$, it follows that $\mathcal{E}_m$
represents “real physical facts.” On the basis of the analysis presented in
[1], §10 and in [1], §12.3 we have shown that $\mathcal{E}_m$ does represent a set of real
physical facts.
For a set $\mathcal{E}$ of virtual properties (defined according to D 4.1.2) we may
suppose that $\mathcal{E} \subset \mathcal{E}_r$. Then we may also define the set $\tilde{\mathcal{E}}_p$ of all sets which may
be ideally prepared with respect to $\tilde{\mathcal{Q}}$ (defined according to D 4.1.1). Then,
trivially, we find that $\mathcal{E} \subset \tilde{\mathcal{E}}_p$ and that $\mathcal{E} \subset \tilde{\mathcal{E}}_p \cap \mathcal{E}_r$. Then $\tilde{\mathcal{E}} = \tilde{\mathcal{E}}_p \cap \mathcal{E}_r$ is a
set of imagined properties which at least may be ideally registered. The set $\tilde{\mathcal{E}}$
is, in general, not uniquely determined (there are many such sets possible).
Thus $\tilde{\mathcal{E}}$ is not a set of real physical facts (in the sense of [1], §10), or in other
words, is not a set of physically real and objective properties. If such a set $\tilde{\mathcal{E}}$
is so large that the equivalence classes of $\mathcal{Q}'$ defined by the equivalence
relation $a_1 \sim a_2$: $\lambda_{\tilde{\mathcal{Q}}}(a_1, a_1 \cap p) = \lambda_{\tilde{\mathcal{Q}}}(a_2, a_2 \cap p)$ for all $p \in \tilde{\mathcal{E}}$ are finer than
those determined by the $f \in \mathcal{F}$, then $\tilde{\mathcal{E}}$ cannot exist for microsystems (see
IV, §8.1).
1. More precisely: if it is a certain hypothesis that $\chi(\mathcal{E}_m)$ separates the $w \in K$ (see [1], §12.3).
As we found above, microsystems are not physical objects. This fact has
created a scandal. Radical assertions—such as “objective knowledge is
impossible”—have been proposed. However, the fact that microsystems are
not physical objects is itself objective knowledge.
The fact that microsystems are not physical objects does not mean that
each separation of microsystems from the preparation and registration
procedures is impossible and that in each experiment the complicated
structure of each apparatus must be taken into account. It only means that
we must abandon the notion of a microscopic “object,” one to which we have
been accustomed.
In the following section we shall use a different notion, that of a
pseudoproperty, in order to separate the microsystems from the preparation
and registration apparatuses.
4.2 Pseudoproperties
In quantum mechanics we find that the role of a Boolean ring $\mathcal{E}$ of sets is
replaced by another structure $\mathcal{E}_{ps}$. For $\mathcal{E}_{ps} \subset \mathcal{P}(M)$ the following axioms are
satisfied:
APE 1. $\mathcal{E}_{ps}$ is a lattice with respect to the order $\subset$ (set inclusion) with largest
element $M$ (AI, §4).
APE 2. To each $p \in \mathcal{E}_{ps}$ the set
$$D_p = \{q \mid q \in \mathcal{E}_{ps} \text{ and } q \subset M \setminus p\}$$
has a greatest element, which we denote by $p^*$. For $p_1 \supset p_2$, $p_1 \neq p_2$ we
require that $D_{p_1} \neq D_{p_2}$.
A set $\mathcal{E}_{ps} \subset \mathcal{P}(M)$ which satisfies APE 1 and APE 2 is called a structure of
species pseudoproperties.
Th. 4.2.1. From APE 1 and APE 2 it follows that $\mathcal{E}_{ps}$ is an orthocomplemented
lattice.
Proof. We must prove the following three relations (AI, D 1.2):
(i) If $p \subset q$ then $q^* \subset p^*$.
(ii) $(p^*)^* = p$.
(iii) $p \wedge p^* = \emptyset$.
(i) From $p \subset q$ it follows that $D_q \subset D_p$ and we obtain $q^* \subset p^*$.
(ii) From $p^* \subset M \setminus p$ (and therefore $p \subset M \setminus p^*$) it follows that $p \in D_{p^*}$. Let $q \supset p$
where $q \in D_{p^*}$. Then we would find that $q \subset M \setminus p^*$; that is, $p^* \subset M \setminus q$ and
$p^* \in D_q$, in other words $D_p \subset D_q$. Since $D_q \subset D_p$ we obtain $D_p = D_q$. From
APE 2 we obtain $p = q$. Thus we obtain $(p^*)^* = p$.
(iii) Since $p^* \subset M \setminus p$ we find that $p \cap p^* = \emptyset$, and therefore $p \wedge p^* = \emptyset$ (from
which we conclude that $\mathcal{E}_{ps}$ has a least element, the empty set $\emptyset$).
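A small finite example may help to see that APE 1 and APE 2 admit structures which are not Boolean rings. The six-element family `E` below is a hypothetical illustration, not an example from the text; the helper names are likewise assumptions. The sketch computes $D_p$, $p^*$, and the lattice operations by brute force.

```python
def D(p, E, M):
    """D_p = {q in E : q is a subset of M minus p}."""
    return {q for q in E if q <= M - p}


def greatest(family):
    """Return the member containing all other members, or None if there is none."""
    for q in family:
        if all(r <= q for r in family):
            return q
    return None


def star(p, E, M):
    """p* is the greatest element of D_p (APE 2)."""
    return greatest(D(p, E, M))


def meet(p, q, E):
    """Largest element of E contained in both p and q."""
    return greatest({r for r in E if r <= p and r <= q})


def join(p, q, E):
    """Smallest element of E containing both p and q."""
    uppers = {r for r in E if p <= r and q <= r}
    for r in uppers:
        if all(r <= s for s in uppers):
            return r
    return None


# Hypothetical toy structure: two incompatible two-valued "pseudoproperties".
M = frozenset({0, 1, 2, 3})
E = {frozenset(s) for s in [(), (0, 1), (2, 3), (0, 2), (1, 3), (0, 1, 2, 3)]}

a, b, c = frozenset({0, 1}), frozenset({2, 3}), frozenset({0, 2})
lhs = meet(a, join(b, c, E), E)                  # a meet (b join c)
rhs = join(meet(a, b, E), meet(a, c, E), E)      # (a meet b) join (a meet c)
```

In this toy family the relations (i) to (iii) hold, but the distributive law fails (`lhs` and `rhs` differ), so the family is an orthocomplemented lattice that is not a Boolean ring of sets.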
Conditions APE 1 and APE 2 are natural generalizations of AE 1 and
AE 2. The latter require that if $p_1, p_2 \in \mathcal{E}$, then $p_1 \cap p_2$ and $p_1 \cup p_2 \in \mathcal{E}$; in
addition, if $p \in \mathcal{E}$ then $M \setminus p \in \mathcal{E}$. The former require that to $p_1, p_2 \in \mathcal{E}_{ps}$ there
exists a largest element $p_3 \in \mathcal{E}_{ps}$ for which $p_3 \subset p_1 \cap p_2$ and a smallest
element $p_4 \in \mathcal{E}_{ps}$ for which $p_4 \supset p_1 \cup p_2$; also, to $p \in \mathcal{E}_{ps}$ there exists a largest
element $p_5 \in \mathcal{E}_{ps}$ for which $p_5 \subset M \setminus p$. The conditions APE 1 and APE 2 are
as strong as possible without requiring that conditions AE 1 and AE 2 be
satisfied.
Of course, AE 1 and AE 2 say little about the system under consideration
unless we relate the structure $\mathcal{E}$ to the physically motivated structures $\mathcal{Q}$, $\mathcal{R}_0$,
and $\mathcal{R}$. The same is true for APE 1 and APE 2. We must now relate the
structure $\mathcal{E}_{ps}$ to $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$.
Since we do not intend to develop the notion of “hidden” pseudoproperties,
we shall proceed to develop the analogs of equations (4.1.4) and (4.1.8).
Of course, we cannot assume these equations directly without obtaining the
set $\mathcal{E}_m$ as a result.
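From (4.1.4) and (4.1.8) it follows that
$$p = \Bigl[\bigcup_{a \in \mathcal{Q},\ a \subset p} a\Bigr] \cup \Bigl[\bigcup_{b \in \mathcal{R},\ b \subset p} b\Bigr]. \tag{4.2.1}$$
This equation suggests the following generalization: For each subset $c \subset M$
we may define
$$\pi(c) = \Bigl[\bigcup_{a \in \mathcal{Q},\ a \subset c} a\Bigr] \cup \Bigl[\bigcup_{b \in \mathcal{R},\ b \subset c} b\Bigr] \tag{4.2.2}$$
from which it follows that
$$\pi(c) = \Bigl[\bigcup_{a \in \mathcal{Q},\ a \subset \pi(c)} a\Bigr] \cup \Bigl[\bigcup_{b \in \mathcal{R},\ b \subset \pi(c)} b\Bigr] \tag{4.2.3}$$
that is, $\pi(c)$ is an element having the form (4.2.1).
For each set $p$ of the form (4.2.1) we define
$$p^* = \pi(M \setminus p). \tag{4.2.4}$$
From (4.2.4) it follows that $(p^*)^* \supset p$ and that $p_1 \supset p_2$ implies $p_1^* \subset p_2^*$.
Thus it follows from $(p^*)^* \supset p$ that $[(p^*)^*]^* \subset p^*$. From $(p^*)^* \supset p$, if we
replace $p$ with $p^*$ we obtain $[(p^*)^*]^* \supset p^*$. Thus we find that $(p^*)^{**} = p^*$.
Let $\Pi$ denote the set of all $p^*$ where $p$ is of the form (4.2.1). Then $\Pi$ is the
set of all $p$ of the form (4.2.1) for which $p^{**} = p$.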
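The construction of $\pi(c)$, $p^*$ and $\Pi$ in (4.2.2) to (4.2.4) can be carried out mechanically for finite families. The sketch below uses hypothetical toy families `Q` and `R` on a six-element set; they are chosen only for illustration and are not claimed to satisfy the axioms for preparation and registration procedures.

```python
from itertools import combinations


def pi(c, Q, R):
    """pi(c): union of all members of Q and R contained in c, as in (4.2.2)."""
    out = frozenset()
    for s in list(Q) + list(R):
        if s <= c:
            out |= s
    return out


def star(p, M, Q, R):
    """p* = pi(M minus p), as in (4.2.4)."""
    return pi(M - p, Q, R)


def subsets(M):
    """All subsets of the finite set M."""
    items = sorted(M)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


# Hypothetical toy data.
M = frozenset(range(6))
Q = [frozenset(s) for s in [(0, 1), (2, 3), (0, 1, 2, 3), (4,)]]
R = [frozenset(s) for s in [(1, 2), (4, 5), (5,)]]

forms = {pi(c, Q, R) for c in subsets(M)}        # all sets of the form (4.2.1)
Pi = {star(p, M, Q, R) for p in forms}           # all p* with p of that form
closed = {p for p in forms
          if star(star(p, M, Q, R), M, Q, R) == p}  # all p with p** = p
```

For any choice of the finite families the two descriptions of $\Pi$ agree, that is, `Pi == closed`, which mirrors the last statement of the paragraph above.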
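For $p \in \Pi$ we shall use the following abbreviated notation:
$$p_p = \bigcup_{a \in \mathcal{Q},\ a \subset p} a \quad\text{and}\quad p_r = \bigcup_{b \in \mathcal{R},\ b \subset p} b. \tag{4.2.5}$$
We obtain
$$p_p = \bigcup_{a \in \mathcal{Q},\ a \subset p_p} a, \qquad p_r = \bigcup_{b \in \mathcal{R},\ b \subset p_r} b \quad\text{and}\quad p = p_p \cup p_r. \tag{4.2.6}$$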
Th. 4.2.2. $\Pi$ is a structure of species pseudoproperties; in addition, the
following equations are satisfied:
$$(p_1 \wedge p_2)_p = p_{1p} \cap p_{2p}, \qquad (p_1 \wedge p_2)_r = p_{1r} \cap p_{2r}. \tag{4.2.7}$$
Proof. If $p_1 \supset p_2$ and $p_1^* = p_2^*$ it follows that $p_1 = p_2$; then, by definition (4.2.4) the
relation APE 2 is proven.
According to (4.2.2), $\pi(p_1 \cap p_2)$ is the largest element $p$ of the form (4.2.1) for
which $p \subset p_1$ and $p \subset p_2$. It only remains to show that $\pi(p_1 \cap p_2) \in \Pi$, that is
$[\pi(p_1 \cap p_2)]^{**} = \pi(p_1 \cap p_2)$. From $\pi(p_1 \cap p_2) \subset p_1$ it follows that
$[\pi(p_1 \cap p_2)]^{**} \subset p_1^{**} = p_1$. Similarly it follows that $[\pi(p_1 \cap p_2)]^{**} \subset p_2$; thus we
find that $[\pi(p_1 \cap p_2)]^{**} \subset p_1 \cap p_2$. We therefore obtain $[\pi(p_1 \cap p_2)]^{**} \subset
\pi(p_1 \cap p_2)$. Since $[\pi(p_1 \cap p_2)]^{**} \supset \pi(p_1 \cap p_2)$ is trivial, we have proven that
$\pi(p_1 \cap p_2) \in \Pi$.
We shall now prove that $[\pi(p_1^* \cap p_2^*)]^*$ is the smallest element in $\Pi$ containing
$p_1$ and $p_2$. From $p \supset p_1$ and $p \supset p_2$ it follows that $p^* \subset p_1^*$ and $p^* \subset p_2^*$, and
therefore we obtain $p^* \subset \pi(p_1^* \cap p_2^*)$ from which we conclude that
$[\pi(p_1^* \cap p_2^*)]^* \subset p$.
Thus we find that $\Pi$ is an orthocomplemented lattice. Using $\wedge$ and $\vee$ for the
lattice operations we obtain
$$p_1 \wedge p_2 = \pi(p_1 \cap p_2), \qquad p_1 \vee p_2 = [\pi(p_1^* \cap p_2^*)]^*.$$
From the definition of $\pi$ we also find that
$$(p_1 \wedge p_2)_p = \bigcup_{a \in \mathcal{Q},\ a \subset p_1 \cap p_2} a; \qquad (p_1 \wedge p_2)_r = \bigcup_{b \in \mathcal{R},\ b \subset p_1 \cap p_2} b.$$
From these results we obtain
$$(p_1 \wedge p_2)_p \subset p_{1p}, \qquad (p_1 \wedge p_2)_p \subset p_{2p}$$
and
$$(p_1 \wedge p_2)_p \subset p_{1p} \cap p_{2p}.$$
If $a_1 \in \mathcal{Q}$, $a_1 \subset p_1$ and $a_2 \in \mathcal{Q}$, $a_2 \subset p_2$ we find that $a_1 \cap a_2 \in \mathcal{Q}$ and $a_1 \cap a_2 \subset
p_1 \cap p_2$ and therefore obtain
$$(p_1 \wedge p_2)_p \supset \bigcup_{a_1 \in \mathcal{Q},\ a_1 \subset p_1}\ \bigcup_{a_2 \in \mathcal{Q},\ a_2 \subset p_2} a_1 \cap a_2 = p_{1p} \cap p_{2p}.$$
Thus we obtain the first equation in (4.2.7). The second equation is obtained
similarly.
Let $A_p = \{\varphi(a) \mid a \in \mathcal{Q}',\ a \subset p\}$. From D 3.1 it follows from $p \cap p^* = \emptyset$
that $a \cap b = \emptyset$ for all $a \subset p$ and $b \subset p^*$, that is, $\psi(b_0, b) \in L_0(A_p)$ for all
$b \subset p^*$. Thus we also obtain $\psi(b_0, b) \in L_0(A_{p^*})$ for all $b \subset p$.
From the definition
$$\psi(b_0, b_0 \cap p_r) = \sup_{b \subset b_0 \cap p_r} \psi(b_0, b)$$
it follows that $\psi(b_0, b_0 \cap p_r) \in L_0(A_{p^*})$. For the case in which $p_r = p_p$ (and
thus $p_r = p_p = p$) (that is, in the case of objective properties), then for all
$a \subset p$, for $a \cap p_r = a$ we obtain
$$\lambda_{\mathcal{S}}(a \cap b_0,\ a \cap b_0 \cap p_r) = 1$$
from which we conclude that $\psi(b_0, b_0 \cap p_r) \in L_1(A_p)$.¹
For a pseudoproperty we shall require that there exists at least one
registration method $b_0$ for which $\psi(b_0, b_0 \cap p_r) \in L_1(A_p)$, from which it
follows that
$$L_0(A_{p^*}) \cap L_1(A_p) \neq \emptyset. \tag{4.2.8}$$
The fact that there is no $b \in \mathcal{R}$ for which $b \subset p$ and $b \subset p^*$, that is, there is no
$\psi(b_0, b)$ for which $\psi(b_0, b) \in L_0(A_p)$ and $\psi(b_0, b) \in L_0(A_{p^*})$ leads us to impose
the following additional condition for pseudoproperties:
$$L_0(A_p \cup A_{p^*}) = L_0(A_p) \cap L_0(A_{p^*}) = \{0\}. \tag{4.2.9}$$
We note that (4.2.9) is satisfied for objective properties because, for each
$a \in \mathcal{Q}'$ the relation $a = (a \cap p) \cup (a \cap p^*)$ holds where $p^* = M \setminus p$ (that is,
$\varphi(a) = \lambda \varphi(a \cap p) + (1 - \lambda) \varphi(a \cap p^*)$ holds where $\lambda = \lambda_{\mathcal{Q}}(a, a \cap p)$).
We could not show that the set of all $p$ for which there exists a $b_0$ for which
$\psi(b_0, b_0 \cap p_r) \in L_1(A_p)$ and for which (4.2.9) is satisfied is an orthocomplemented
sublattice of $\Pi$. For this reason it is necessary to define:
D 4.2.1. An orthocomplemented sublattice $\mathcal{E}_{ps}$ of $\Pi$ which satisfies the
conditions
(i) to each $p \in \mathcal{E}_{ps}$ there exists a $b_0 \in \mathcal{R}_0$ such that $\psi(b_0, b_0 \cap p_r) \in L_1(A_p)$,
and
(ii) the relation (4.2.9) is satisfied for all $p \in \mathcal{E}_{ps}$,
is called a set of actual physical pseudoproperties.
D 4.2.2. A set $\mathcal{E}_{ps}$ of actual physical pseudoproperties is said to be sufficient
if the set of all $\psi(b_0, b_0 \cap p_r) \in L_1(A_p)$ for $p \in \mathcal{E}_{ps}$ separates the elements of $K$.
Whether such sets $\mathcal{E}_{ps}$ satisfying D 4.2.1 and D 4.2.2 exist for a given
theory cannot be determined in general. For microsystems the existence of
such sets is certain; see IV, §8.2.
1. By analogy to D 3.1 we define
$L_1(A) = \{g \mid g \in L \text{ and } \mu(w, g) = 1 \text{ for all } w \in A \subset K\}$.
5 Ensembles and Effects in Quantum Mechanics
At the end of §3 we suggested that the “end result” obtained from the
fundamental laws of preparation and registration given in §3 may also be
formulated as an axiom. This axiom may be motivated by a reasoning
process which uses the so-called correspondence principle and certain aspects
of the measurement process (see I, XVII, [2], XI, §1 and [2], XIII, §3). We
shall now formulate the fundamental axiom of quantum mechanics (which
according to §3 may be obtained as a theorem from axioms AV 1.1, AV 1.2s,
AV 2f, AV id, AV 3, and AV 4s) as follows:
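AQ. There exists an injective mapping $\beta$ of $\mathcal{K}$ into the basis $K$ of
$\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and an injective mapping $\gamma$ of $\mathcal{L}$ into the order unit interval
$[0, 1]$ in $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ for which
$$\mu(w, g) = \mathrm{tr}((\beta w)(\gamma g))$$
holds for $w \in \mathcal{K}$, $g \in \mathcal{L}$ where $\beta\mathcal{K}$ is (norm) dense in $K$ and $\gamma\mathcal{L}$ is $\sigma(\mathcal{B}', \mathcal{B})$
dense in $[0, 1]$.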
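To make the trace formula in AQ concrete, the following sketch evaluates $\mathrm{tr}(wg)$ for a single two-dimensional Hilbert space. The matrices `w` and `g` are hypothetical examples chosen for illustration: `w` is self-adjoint, positive, and has trace one, and `g` is a self-adjoint matrix between 0 and the unit operator.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def mu(w, g):
    """mu(w, g) = tr(w g); the value is real for self-adjoint w and g."""
    value = trace(matmul(w, g))
    return value.real


# Hypothetical ensemble (trace 1, determinant 0.125 > 0, hence positive).
w = [[0.75, 0.25j], [-0.25j, 0.25]]
# Hypothetical effect with eigenvalues 0.5 and 1.0, both in [0, 1].
g = [[0.5, 0.0], [0.0, 1.0]]
unit = [[1.0, 0.0], [0.0, 1.0]]

probability = mu(w, g)
normalization = mu(w, unit)
```

The value `probability` lies in the interval [0, 1], and `normalization` equals 1, in agreement with the role of the unit operator as order unit.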
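Both Banach spaces $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and the canonical
bilinear form $\mathrm{tr}(uv)$ where $u \in \mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $v \in \mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ are
defined in AIV, §15. In AIV, §15 it will also be shown that $\mathcal{B}'(\mathcal{H}_1, \ldots)$ and
$\mathcal{B}(\mathcal{H}_1, \ldots)$ may be identified as subsets of the Banach algebra $\mathcal{A}(\mathcal{H}_1, \ldots)$.
$\mathcal{B}(\mathcal{H}_1, \ldots)$ is a base-norm space and $\mathcal{B}'(\mathcal{H}_1, \ldots)$ is its dual Banach space (and
therefore is an order unit space). The order unit in $\mathcal{B}'(\mathcal{H}_1, \ldots)$ is the unit
operator 1 in $\mathcal{A}(\mathcal{H}_1, \ldots)$.
In the following we shall often make use of the fact that $\mathcal{B}(\mathcal{H}_1, \ldots)$ and
$\mathcal{B}'(\mathcal{H}_1, \ldots)$ are subsets of $\mathcal{A}(\mathcal{H}_1, \ldots)$. Where products of elements from
$\mathcal{B}(\mathcal{H}_1, \ldots)$ and $\mathcal{B}'(\mathcal{H}_1, \ldots)$ occur, they are to be understood in the sense of the
algebra $\mathcal{A}(\mathcal{H}_1, \ldots)$. (For the formulation of axiom AQ, see also [2], XIII, §3;
the above formulation of axiom AQ is somewhat weaker than that given in
[2], XIII, §3. The stronger form in [2], XIII, §3 was chosen for its simplicity.)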
On the basis of AQ we shall identify $\mathcal{K}$ with $\beta\mathcal{K}$ and $\mathcal{L}$ with $\gamma\mathcal{L}$ and write
$\mu(w, g) = \mathrm{tr}(wg)$. Then the maps $\varphi$ of $\mathcal{Q}'$ into $\mathcal{K}$ and $\psi$ of $\mathcal{F}$ into $\mathcal{L}$
correspond to maps of $\mathcal{Q}'$ into $K$ and $\mathcal{F}$ into $[0, 1]$. Instead of $[0, 1]$ we shall
write $L$. According to AQ the set $\varphi\mathcal{Q}'$ is dense in $K$ and $\psi\mathcal{F}$ is dense in $L$. We
shall call the elements of $K$ ensembles and the elements of $L$ effects. Let $\mathcal{D}$
denote the norm-closed subspace of $\mathcal{B}'(\mathcal{H}_1, \ldots)$ generated by $\mathcal{L}$. Then $\mathcal{D}$ is a
Banach space and $\mathcal{B}(\mathcal{H}_1, \ldots)$ may be identified with a subspace of $\mathcal{D}'$. $\mathcal{D}$ is
norm-separable; $\mathcal{B}'$ is not. We shall seek to characterize $\mathcal{D}$ by axioms in
VII, §8. Let $\bar K$ denote the $\sigma(\mathcal{D}', \mathcal{D})$ closure of $K$ in $\mathcal{D}'$. For the case in which $\mathcal{D}$
is the set of real elements of a C*-algebra the mathematical situation for
$\mathcal{D}$, $\bar K \subset \mathcal{D}'$ has been thoroughly researched (see, for example, [30]). Note that
the so-called “set of states” in the theory of a C*-algebra is denoted here by $\bar K$
instead of $K$; for the problem of $\bar K$ see VI.
In order to prevent misunderstandings about the physical meaning of
axiom AQ we make the following comments: The countable subsets $\mathcal{K}$ of
$K \subset \bar K$ and $\mathcal{L}$ of $L$ are, in certain topologies, dense in $K$ or $L$, respectively.
Here $K$, $\bar K$, and $L$ are mathematical extensions or “completions” analogous
to the completion of the set of rational numbers by the real numbers. We
now pose the “converse” question: Are the “physical sets” $\mathcal{K}$ and $\mathcal{L}$
“special” subsets of $K$, $\bar K$, and $L$? If so, how? An important tool in the
resolution of this question is the description of the physical distinguishability
of ensembles and effects which we have described in §3. This fact suggests
that, in addition to the sets $\mathcal{K}$ and $\mathcal{L}$, the topologies generated on $\mathcal{K}$ by $\mathcal{L}$
and on $\mathcal{L}$ by $\mathcal{K}$ by means of equations of the form (3.1) will also be
important. These topologies are identical to $\sigma(\mathcal{D}', \mathcal{D})$ on $K$ and $\bar K$ and to
$\sigma(\mathcal{B}', \mathcal{B})$ on $L$. Note that the assumption that the ensembles may be
distinguishable by the topology $\sigma(\mathcal{B}, \mathcal{B}')$ is false because the topology
$\sigma(\mathcal{B}, \mathcal{B}')$ is actually “finer” on $K$ than is $\sigma(\mathcal{D}', \mathcal{D})$.
Since two elements of $\bar K$ may be physically distinguishable only in the
sense of the topology $\sigma(\mathcal{D}', \mathcal{D})$, no subset of $\bar K$, that is, no special $\mathcal{K}$ has
special physical significance. However, $K$ itself is of special significance as a
subset of $\bar K$ only because the topology generated by $K$ on $L$ (which is
identical to $\sigma(\mathcal{B}', \mathcal{B})$) correctly describes the physical distinguishability of
effects. We note, however, that every subset of $K$ which is dense (with
respect to the norm) in $K$ generates the same topologies on $L$ as do $\mathcal{K}$, $K$,
and $\mathcal{B}$. In summary, we find that $\mathcal{K}$ (as a subset of $K$) and $\mathcal{L}$ (as a subset of
$L$) are not uniquely determined by the physics; every subset of $K$ which is
norm-dense in $K$ (including $K$ itself) and every subset of $L \cap \mathcal{D}$ which is
norm-dense in $L \cap \mathcal{D}$ (also $L \cap \mathcal{D}$ itself) may be used as the set of “physical”
ensembles or effects, in the sense that the topologies which these two sets
generate on each other are correct for “physical distinguishability” (see also the general
discussion in [1], §10.5).
We may, therefore, choose sets of ensembles and effects of the above type on the basis of
our mathematical convenience without “changing the physics.”
In this book $K \subset \mathcal{B}(\mathcal{H}_1, \ldots)$ and $L \subset \mathcal{B}'(\mathcal{H}_1, \ldots)$ will play a central role
(we shall discuss $\mathcal{D}$, $\mathcal{D}'$ only briefly in VI and VIII), because the mathematics
for Hilbert spaces is extensively developed and the methods developed
for the dual spaces $\mathcal{D}$, $\mathcal{D}'$ are, at present, too cumbersome for practical
computation. The criterion for mathematical accessibility may change in
time as new mathematical methods are developed (recall the development
of the Hamilton-Jacobi formulation of mechanics).
If we introduce the axiom AQ as a substitute for a set of axioms such as
that given in §3, we find it desirable to substitute the following definition for
D 3.2:
D 5.1. If axiom AQ is satisfied, then we shall call $M$ (together with the
structures $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$) a set of microsystems.
On the basis of the above formulation of the foundations of quantum
mechanics it is clear that the Hilbert space (as a complex vector space) does
not directly describe a physical structure. Instead it is a computational tool
which permits us to cleverly handle the structure of the convex set $K$. Since
the positive affine functionals on $K$ are identical to the elements of the
positive cone of $\mathcal{B}'$ (see AIII, §6), it is the structure of $K$ alone (and the choice
of topology $\sigma(\mathcal{B}, \mathcal{D})$ on $K$) which determines the physical structure of
microsystems. The structure of $K$ also contains the so-called “wave character”
of microsystems; the Schrödinger wave functions (the elements of a
Hilbert space) are only mathematical tools for obtaining a better “handle” on
this “wave character.” Only the elements of $K$ are “physically real” (in the
sense of [1], §10 and [2], III, §9). The elements of the Hilbert spaces $\mathcal{H}_i$ are
not. The elements of $\mathcal{H}_i$ form only a particular representation basis for the
elements of $K$; a representation analogous to the special coordinates used for
the representation of the orbit of mass-points in mechanics.
The role of Hilbert space as a representation tool for physically important
quantities will be developed in IX.
6 Decision Effects and Faces of K
In II, Th. 4.5.2(iv) and in III, Th. 2.1 we have examined the notion of a
decomposition of preparation procedures and defined the notion of a
mixture of ensembles by (2.2). Th. 2.2 states that to each “mixture” of
ensembles $w_i$ with weights $\lambda_i$ there is an analogous “mixture” $a$ of preparation
procedures $a_i$.
The notions of ensemble, mixture, and decomposition may be expressed in
simple terms if we use AQ. If we identify the elements $w \in \mathcal{K}$ with the
elements $w \in K \subset \mathcal{B}$ it follows that (3.2) is equivalent to
$$w = \sum_{i=1}^{n} \lambda_i w_i, \tag{6.1}$$
where (6.1) is written as an equation among elements of the Banach space $\mathcal{B}$.
Equation (6.1) states that $w$ is a convex combination of the $w_i$. Since the set $K$
is convex, every convex combination of elements $w_i \in K$ is also an element of
$K$.
Let $0 < \lambda < 1$ and let $w_1, w_2 \in K$. Then $w = \lambda w_1 + (1 - \lambda) w_2$ is a “mixture”
of $w_1, w_2$ (in ratio $\lambda$ to $1 - \lambda$). If $w \in K$ can be written in the form
$w = \lambda w_1 + (1 - \lambda) w_2$ where $w_1, w_2 \in K$, $0 < \lambda < 1$ then the decomposition
$w = \lambda w_1 + (1 - \lambda) w_2$ represents $w$ as a “mixture” of the ensembles $w_1, w_2$.
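A mixture in the sense of (6.1) can be computed directly for small matrices. The following sketch uses hypothetical 2×2 ensembles `w1`, `w2` and a hypothetical effect `g`, with exact rational arithmetic; it illustrates that a mixture again has trace one and that $\mathrm{tr}(wg)$ is affine in $w$.

```python
from fractions import Fraction as F


def mix(weights, ensembles):
    """Convex combination sum_i lambda_i w_i of equally sized square matrices."""
    assert sum(weights) == 1 and all(l >= 0 for l in weights)
    n = len(ensembles[0])
    return [[sum(l * w[i][j] for l, w in zip(weights, ensembles))
             for j in range(n)] for i in range(n)]


def tr_prod(w, g):
    """tr(w g), computed without forming the full matrix product."""
    n = len(w)
    return sum(w[i][k] * g[k][i] for i in range(n) for k in range(n))


# Hypothetical ensembles (both positive with trace 1) and effect.
w1 = [[F(1), F(0)], [F(0), F(0)]]
w2 = [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]]
g = [[F(1, 3), F(0)], [F(0), F(1)]]
unit = [[F(1), F(0)], [F(0), F(1)]]

lam = F(1, 4)
w = mix([lam, 1 - lam], [w1, w2])

lhs = tr_prod(w, g)
rhs = lam * tr_prod(w1, g) + (1 - lam) * tr_prod(w2, g)
```

Both `lhs` and `rhs` evaluate to the same rational number, so the probabilities of the mixture are the weighted averages of the probabilities of its components.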
The norm-closed subsets of $K$ which are invariant under mixtures and
decompositions play an important role. A norm-closed subset $C$ of $K$ is
invariant under mixtures and decomposition if for $w_1, w_2 \in C$ we obtain
$\lambda w_1 + (1 - \lambda) w_2 \in C$ for all $\lambda$ which satisfy $0 < \lambda < 1$ and if from $w \in C$,
$w = \lambda w_1 + (1 - \lambda) w_2$ (where $0 < \lambda < 1$ and $w_1, w_2 \in K$) it follows that
$w_1, w_2 \in C$.
The norm-closed subsets of $K$ which are closed under mixtures and
decomposition are known as norm-closed “faces” in the mathematical theory
of convex sets. The “faces” of $K$ obtain their physical meaning from the
concepts of mixture and decomposition. We only consider the norm-closed
faces because (from the discussion presented in §5) a subset $C$ of $K$ is not
physically distinguishable from its norm-closure.
Of special importance are those faces of $K$ which consist only of a single
point (and are therefore norm-closed). Such a face is called an extreme point
of $K$ and is characterized as follows: An element $w$ is an extreme point of $K$ if
and only if when $w = \lambda w_1 + (1 - \lambda) w_2$, $w_1, w_2 \in K$, and $0 < \lambda < 1$ it follows
that $w = w_1 = w_2$. The “physical meaning” of an extreme point of $K$ is that it
represents an “irreducible ensemble,” often called a “pure state.” We shall
not use this expression in this book because we do not wish to create false
associations with the notion of a pure state, for example, the false notion
that all microsystems $x \in a$ for irreducible $\varphi(a)$ are “identical” while only
those $x \in a$ for which $\varphi(a)$ is reducible can be nonidentical. Ontological
concepts such as “identical” and “nonidentical” can easily lead to contradictions
with experiment.
D 6.1. The set of extreme points of a convex set $K$ will be frequently
denoted by $\partial_e K$. The set of norm-closed faces of $K$ will be denoted by $\mathcal{W}(K)$.
What are the elements of $\partial_e K$ and of $\mathcal{W}(K)$?
Th. 6.1. Every closed face $C$ of $K$ can be written as $C(w)$ for a suitably chosen
$w \in K$ where $C(w)$ is the smallest closed face which contains $w$.
Proof. Since $\mathcal{B}$ is separable, $K$ and $C$ are also separable (see AIV, §15). Thus, in $C$
there is a norm-dense denumerable subset $\{w_i\}$. Thus there are real numbers
$0 < \lambda_i < 1$, $\sum_i \lambda_i = 1$ for which $w = \sum_i \lambda_i w_i \in C$ since $C$ is norm-closed and
convex. We shall now show that $C = C(w)$. From $w \in C$ it follows that $C(w) \subset C$.
From $w = \lambda_k w_k + (1 - \lambda_k) w_k'$ where
$$w_k' = (1 - \lambda_k)^{-1} \sum_{i \neq k} \lambda_i w_i \in K,$$
it follows that $w_k \in C(w)$. Since all $w_k$ belong to $C(w)$ and are dense in $C$, and $C(w)$ is
norm-closed, it follows that $C \subset C(w)$.
D 6.2. The set of elements $e \in L$ for which $e$ is a projection operator in the
algebra $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ (that is, $e^2 = e$ and $e^+ = e$) (see AIV, §15) will be denoted
by $G$.
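In the following we shall make use of the concise notation which is presented
in AIV, §15: $\mathcal{H} = \bigcup_\nu \mathcal{H}_\nu$ is the “sum” of the sets $\mathcal{H}_1, \mathcal{H}_2, \ldots$ etc. Every
element of $\mathcal{H}$ is characterized by an index $\nu$ and a $\varphi \in \mathcal{H}_\nu$. $\mathcal{H}$ is not a vector
space. The elements of $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ can be considered to be operators in
$\mathcal{H}$. As “subspaces” of $\mathcal{H}$ we shall denote only those subsets $\mathcal{T}$ of $\mathcal{H}$ for which
$\mathcal{T} = \bigcup_\nu \mathcal{T}_\nu$ where the $\mathcal{T}_\nu$ are closed linear subspaces of $\mathcal{H}_\nu$. $\mathcal{T}_1 \perp \mathcal{T}_2$ means
that for $\mathcal{T}_1 = \bigcup_\nu \mathcal{T}_{1\nu}$ and $\mathcal{T}_2 = \bigcup_\nu \mathcal{T}_{2\nu}$ the relations $\mathcal{T}_{1\nu} \perp \mathcal{T}_{2\nu}$ hold for all
$\nu$. $\mathcal{T}^\perp$ is the subspace $\bigcup_\nu \mathcal{T}_\nu^\perp$.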
Th. 6.2. To each face $C$ of $K$ there is a uniquely determined $e \in G$ for which
$w \in C$ is equivalent to $w = ewe$. The map so defined of all faces of $\phi(K)$ in $G$
is an order isomorphism of $\phi(K)$ onto $G$.
Proof. According to Th. 6.1, $C = C(w)$ where $w = \sum_i \lambda_i w_i$ and $\{w_i\}$ is dense in $C$.
Let $\mathcal{T}$ denote the support of $w$, that is, $\mathcal{T}$ is the space which is orthogonal to the
eigenspace of $w$ which has eigenvalue 0 (see AIV, §11). Let $e$ be the projection
operator for $\mathcal{T}$. Thus we find that $w = ewe$. The condition $w = ewe$ is equivalent to
the condition $(1 - e) w (1 - e) = 0$ (see AIV, §6). Using the decomposition which is
found in the proof of Th. 6.1
$$w = \lambda_k w_k + (1 - \lambda_k) w_k'$$
it follows that
$$0 = (1 - e) w (1 - e) = \lambda_k (1 - e) w_k (1 - e) + (1 - \lambda_k)(1 - e) w_k' (1 - e).$$
Since the operators $(1 - e) w_k (1 - e)$ and $(1 - e) w_k' (1 - e)$ are positive operators in
$\mathcal{B}(\mathcal{H}_1, \ldots)$ it follows that $(1 - e) w_k (1 - e) = 0$ and therefore $w_k = e w_k e$. Since $\{w_k\}$
is dense in $C$, it follows that $\tilde{w} = e \tilde{w} e$ for all $\tilde{w} \in C$.
Let $\tilde{w} \in K$ and $\tilde{w} = e \tilde{w} e$. The eigenvectors (which correspond to nonzero
eigenvalues) of $\tilde{w}$ lie in the projection space of $e$. Since $C$ is convex and
norm-closed, we obtain $\tilde{w} \in C$ if, for all elements $\varphi \in \mathcal{T}$, the corresponding
projections $P_\varphi$ lie in $C$ (for $P_\varphi$ see AIV, §6). By analogy with the proof in Th. 6.1
it can be proven that for every eigenelement $\varphi_\alpha$ of $w$ having a nonzero eigenvalue,
the corresponding $P_{\varphi_\alpha}$ lies in $C$. The $\{\varphi_\alpha\}$ span all of $\mathcal{T}$. From
$$\frac{1}{n} \sum_{i=1}^{n} P_{\varphi_{\alpha_i}} = \frac{1}{n} \sum_{i=1}^{n} P_{\chi_i},$$
where the $\{\chi_i\}$ is an orthogonal system which spans the same subspace as the $\{\varphi_{\alpha_i}\}$,
it follows that $P_\varphi$ lies in $C$ for each $\varphi$ from a finite-dimensional subspace of $\mathcal{T}$
spanned by elements of $\{\varphi_\alpha\}$. Since each $\varphi$ from $\mathcal{T}$ may be approximated by a $\varphi'$
from such a finite-dimensional subspace of $\mathcal{T}$ (and, therefore, $P_\varphi$ is approximated
in the norm topology by $P_{\varphi'}$ (see AIV, §11)) it follows that all $P_\varphi$ for which $\varphi \in \mathcal{T}$ lie
in $C$.
Thus we have shown that $C$ consists of all $w$ for which $w = ewe$. In this way the
element $e$ which corresponds to $C$ is uniquely determined because the set of those $w$
for which $w = ewe$ contains all $P_\varphi$ for which $\varphi$ is contained in the projection space
for $e$, and the projection spaces for different $e$ are different.
It is easy to show that the converse holds, that is, for each $e \in G$ the set of all $w$ for
which $w = ewe$ is a norm-closed face of $K$.
It is easily shown that $e \geq \tilde{e}$ is equivalent to $C \supset \tilde{C}$. Therefore the
correspondence between the elements $e \in G$ and $C \in \phi(K)$ is an order isomorphism of $\phi(K)$
onto $G$.
$\phi(K)$ is a complete lattice, since $K \in \phi(K)$ and (as is easily shown) for
every set $\{C_\alpha\}$ of closed faces, $\bigcap_\alpha C_\alpha$ is again a closed face. By the order
isomorphism of $\phi(K)$ and $G$ it follows that $G$ is a complete lattice (which also
follows directly from AIV, §6 because the set of subspaces of $\mathcal{H}$ is a complete
lattice (see AIV, §2)).
Th. 6.3. The condition that the relation $w = ewe$ holds for $w \in K$ and $e \in G$ is
equivalent to the condition $\mu(w, e) = \operatorname{tr}(we) = 1$, that is, $w \in K_1(e)$ where
$K_1(e)$ is defined by D 3.1.
PROOF. From $w = ewe$ it follows that
$$\mu(w, e) = \operatorname{tr}(we) = \operatorname{tr}(we^2) = \operatorname{tr}(ewe) = \operatorname{tr}(w) = 1.$$
From $\mu(w, e) = 1$ it follows that
$$\mu(w, 1 - e) = \operatorname{tr}(w(1 - e)) = 0$$
and
$$\operatorname{tr}(w(1 - e)) = \operatorname{tr}((1 - e) w (1 - e)) = 0.$$
Since $(1 - e) w (1 - e) \geq 0$ it follows that $(1 - e) w (1 - e) = 0$ and $w = ewe$ (see
AIV, §6).
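The equivalence in Th. 6.3 can be checked on a small example. The following is only a sketch under simplifying assumptions: a single two-dimensional Hilbert space, exact rational matrices, and the hypothetical helper names `matmul`, `trace`, `in_face` and `probability`, none of which appear in the text.

```python
from fractions import Fraction as Fr

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

# Projection e onto the first basis vector of a 2-dimensional space.
e = [[Fr(1), Fr(0)], [Fr(0), Fr(0)]]

# w_in satisfies w = e w e; w_out is a proper mixture and does not.
w_in = [[Fr(1), Fr(0)], [Fr(0), Fr(0)]]
w_out = [[Fr(3, 4), Fr(0)], [Fr(0), Fr(1, 4)]]

def in_face(w, e):
    """True if w = e w e."""
    return matmul(matmul(e, w), e) == w

def probability(w, e):
    """mu(w, e) = tr(w e)."""
    return trace(matmul(w, e))
```

In this sketch `in_face(w, e)` holds exactly for the ensemble whose `probability(w, e)` equals 1, in line with the theorem.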
Using the notation of D 4.1, from Th. 6.2 and Th. 6.3 we find that every
$C \in \phi(K)$ is of the form $C = K_1(e)$ for some $e \in G$. Thus we obtain the first
part of the following theorem:
Th. 6.4. An order isomorphism $\phi(K) \leftrightarrow G$ is defined by $C = K_1(e)$, $C \in \phi(K)$
and $e \in G$. An anti-order isomorphism $\phi(K) \leftrightarrow G$ is defined by $C = K_0(e)$
with $C \in \phi(K)$. In particular,
$$K_1(e_1) \cap K_1(e_2) = K_1(e_1) \wedge K_1(e_2) = K_1(e_1 \wedge e_2),$$
$$K_1(e_1) \vee K_1(e_2) = K_1(e_1 \vee e_2),$$
$$K_0(e_1) \cap K_0(e_2) = K_0(e_1) \wedge K_0(e_2) = K_0(e_1 \vee e_2), \text{ and}$$
$$K_0(e_1) \vee K_0(e_2) = K_0(e_1 \wedge e_2).$$
PROOF. We need only prove that $e \to K_0(e)$ is an anti-isomorphism of $G$ onto
$\phi(K)$. This follows directly from the fact that $e \to 1 - e$ is an anti-isomorphism
of $G$ onto itself and $K_0(e) = K_1(1 - e)$.
Th. 6.5. The set $\partial_e K$ is equal to the set of all $P_\varphi$.
Proof. According to AIV, §11, for each $w \in K$ we have the spectral representation
$$w = \sum_\alpha \lambda_\alpha P_{\varphi_\alpha}. \qquad (6.2)$$
From $\mu(w, 1) = 1$ it follows that $\sum_\alpha \lambda_\alpha = 1$. Clearly $w$ can be an extreme point of $K$
only if only one $\lambda_\alpha \neq 0$ in (6.2), that is, $w$ is of the form $P_\varphi$ for some $\varphi$.
$P_\varphi$ is also an extreme point of $K$; from $P_\varphi = \lambda w_1 + (1 - \lambda) w_2$, $0 < \lambda < 1$, it
follows that, for $e = 1 - P_\varphi$,
$$0 = \lambda e w_1 e + (1 - \lambda) e w_2 e.$$
Since $e w_{1,2} e \geq 0$ it follows that
$$e w_1 e = 0 \text{ and } e w_2 e = 0.$$
From $e w_1 e = 0$ it follows that $w_1 e = 0$ (see AIV, §6); thus, from $e = 1 - P_\varphi$ we
obtain $w_1 = \eta P_\varphi$. Since $\operatorname{tr}(w_1) = 1$ it follows that $\eta = 1$. Similarly, it follows that
$w_2 = P_\varphi$.
In addition to the faces of $K$ (which are special sets of ensembles), the
elements of $G$ (effects) also have a special physical meaning. The physical
meaning of the elements of $L$ is established by the map $\psi$ of $\mathcal{F}$ into $L$ (as
presented in §1, §3, and §5). In many formulations of quantum mechanics
there is an implicit assumption that every experiment $(b_0, b) \in \mathcal{F}$
corresponds to a $\psi(b_0, b) \in G$. In these formulations $(b_0, b)$ is an “observable”
having measurement values 0, 1 and must be identified with a projection
operator $e$ because only projection operators have eigenvalues 0, 1 (see, for
example, [2], XI, §2.1).
Actually, the statement that all $\psi(b_0, b)$ are elements of $G$ is incorrect. On
the contrary, an experimental physicist will not know in advance whether
$\psi(b_0, b)$ at least approximately corresponds to an $e \in G$ or whether he has
only registered an effect $g = \psi(b_0, b) \in L$ (see XVII and [2], XII). We shall
now seek to obtain a physically meaningful definition of the notion of a
decision effect, that is, one which relates to the probability function $\mu(w, g)$.
For this purpose we shall use D 3.1.
For $A \subset K$, $L_0(A)$ is the set of effects for which there will be no detection
response for any $w \in A$. $K_0 L_0(A)$ is a closed face of $K$ with $A \subset K_0 L_0(A)$.
For $C = K_0 L_0(A)$ we find that $L_0(A) = L_0(C)$. An element $g_1 \in L$ is said to be
more “sensitive” than $g_2 \in L$ if $g_1 \geq g_2$, that is, if $\mu(w, g_1) \geq \mu(w, g_2)$ for all
$w \in K$. Then the effect $g_1$ will respond more frequently than the effect $g_2$. Is
there a most sensitive effect in the set $L_0(A) = L_0(C)$? If such an effect exists,
and if it cannot respond to the $w \in A$, it will respond to each $w \in K$ “as
frequently as possible.”
We shall now use a few examples to clarify this situation. A filter for light
may be considered to be a $b_0$; here $b$ is the set of light-quanta which are
absorbed by the filter $b_0$. If $g = \psi(b_0, b) \in L_0(A)$ then $w \in A$ represents the
“light” which passes through the filter without absorption. Consider, for
example, the absorption coefficient of the filter $b_0$ as a function of frequency
(see Figure 2). All the light which contains only those frequencies for which
the absorption coefficient is zero passes through $b_0$ without absorption. By
“combining” several filters of type $b_0$ it is possible to construct filters for
which the absorption coefficient is approximately 0 or 1 and which pass all light
which passes through $b_0$ without absorption. Such a filter represents a type of
“maximally sensitive” filter which satisfies the additional condition that all
light $w \in A$ passes through it without absorption.
The following definition is motivated by the above example:
D 6.3. If $L_0(A)$ has a maximal element $e$, then $e$ is called a decision effect. We
shall denote the largest element of $L_0(A)$ by $eL_0(A)$.
Th. 6.6. Every $L_0(A)$ has a maximal element $eL_0(A)$; the set of decision effects
is equal to $G$ (see D 6.2). In addition $G = \partial_e L$; $L_0(A) = L_0 K_0(eL_0(A))$.
Proof. As we have seen earlier $L_0(A) = L_0(C)$ where $C$ is a closed face of $K$.
According to Th. 6.4 $C = K_1(\bar{e})$ where $\bar{e} \in G$. For $e = 1 - \bar{e}$ we find that $C = K_0(e)$
and we therefore obtain $L_0(A) = L_0 K_0(e)$. Thus we find that $e \in L_0(A)$.
From $g \in L_0(A)$ it follows that $\operatorname{tr}(wg) = 0$ for all $w \in K_0(e)$; in particular,
$\operatorname{tr}(P_\varphi g) = 0$ for all $P_\varphi \in K_0(e) = K_1(\bar{e})$. Since $\bar{e} = 1 - e$, from $P_\varphi \in K_1(\bar{e})$ it follows
that $(1 - e)\varphi = \varphi$. Thus we obtain $0 = \operatorname{tr}(P_\varphi g) = \langle \varphi, g\varphi \rangle$ for all $\varphi$ for which
$(1 - e)\varphi = \varphi$. Since $g \geq 0$ we obtain $g\varphi = 0$ for all $\varphi$ in the projection space of
$1 - e$. From this we conclude that $g(1 - e) = 0$, that is, $g = ge$. This result,
together with the adjoint equation $g = eg$, yields $g = ege$. From $g \leq 1$ it follows
that $ege \leq e$ and also $g \leq e$. Therefore $e$ is the maximal element of $L_0(A) =
L_0 K_0(e)$. Each $e \in G$ is obtained as a maximal element of $L_0(A)$, namely from $L_0(A)$
where $A = K_0(e)$.
Suppose that $e \in G$, $g_1, g_2 \in L$, $0 < \lambda < 1$ and
$$e = \lambda g_1 + (1 - \lambda) g_2.$$
Then it follows that $g_1, g_2 \in L_0 K_0(e)$ and consequently $g_1 \leq e$ and $g_2 \leq e$. If $g_1 \neq e$
(or $g_2 \neq e$) then $\lambda g_1 + (1 - \lambda) g_2 \neq e$; therefore $g_1 = g_2 = e$. Thus $G \subset \partial_e L$. For
$g \in L$, from $1 - g \leq 1$ it follows that $0 \leq (1 - g)^2 = 1 - 2g + g^2 \leq 1 - g \leq 1$. We
therefore obtain $0 \leq 2g - g^2$ and $0 \leq g^2 \leq 1$. For $g' = 2g - g^2$ and $g'' = g^2$ we
find that $g', g'' \in L$ and that $g = \frac{1}{2} g' + \frac{1}{2} g''$. If $g \in \partial_e L$ then $g' = g'' = g$, that is,
$g = g^2$ and $g \in G$. Thus $\partial_e L = G$.
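The splitting $g = \frac{1}{2}(2g - g^2) + \frac{1}{2} g^2$ used in the last step can be illustrated numerically. The sketch below assumes, purely for simplicity, that $g$ is diagonal, so that it can be represented by its list of eigenvalues; the function names are hypothetical and not taken from the text.

```python
from fractions import Fraction as Fr

def split_effect(eigs):
    """Eigenvalues of g' = 2g - g^2 and g'' = g^2 for a diagonal effect g."""
    g_prime = [2 * x - x * x for x in eigs]
    g_dprime = [x * x for x in eigs]
    return g_prime, g_dprime

def is_effect(eigs):
    """A diagonal operator is an effect if all eigenvalues lie in [0, 1]."""
    return all(0 <= x <= 1 for x in eigs)

def is_projection(eigs):
    """A diagonal operator is a projection if every eigenvalue is 0 or 1."""
    return all(x * x == x for x in eigs)

g = [Fr(1, 2), Fr(1)]                  # an effect that is not a projection
g_prime, g_dprime = split_effect(g)
midpoint = [(a + b) / 2 for a, b in zip(g_prime, g_dprime)]

e = [Fr(0), Fr(1)]                     # a projection
e_prime, e_dprime = split_effect(e)
```

For the non-projection `g` the two parts differ from `g`, so `g` is a proper mixture; for the projection `e` both parts coincide with `e`, as the proof requires of an extreme point.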
D 6.4. For $e \in G$ we shall let $e^\perp$ denote the element $1 - e$.
Th. 6.7. $e^\perp = 1 - e$ is the projection operator onto the subspace of $\mathcal{H}$ which is
orthogonal to $e\mathcal{H}$. The correspondence $e \to e^\perp$ is an orthocomplementation in
the lattice $G$. The isomorphism $e \to K_1(e)$ of Th. 6.4 defines an analogous
orthocomplementation on $\phi(K)$ for which $K_1(e)^\perp = K_1(e^\perp) = K_0(e)$.
PROOF. The first part of this theorem is a direct consequence of AIV, §6. In order to
show that $e \to e^\perp$ is an orthocomplementation it is necessary to show that (see
AI, D 1.2):
(i) From $e_1 \leq e_2$ it follows that $e_1^\perp \geq e_2^\perp$.
(ii) $(e^\perp)^\perp = e$.
(iii) $e^\perp \wedge e = 0$.
(i) and (ii) follow directly from $e^\perp = 1 - e$. (iii) follows directly from the fact that
$e^\perp \wedge e$ is the projection operator on the intersection of the projection spaces of $e^\perp$
and $e$.
D 6.5. We shall denote the relation $e_1 \leq 1 - e_2 = e_2^\perp$ by $e_1 \perp e_2$.
Since $e_1 \leq 1 - e_2$ is equivalent to $e_2 \leq 1 - e_1$, the relation $e_1 \perp e_2$ is
symmetric. If $e_1 \leq e$ and $e_2 \leq e^\perp$ then $e_2 \leq e^\perp \leq e_1^\perp$ and $e_1 \perp e_2$.
Th. 6.8. If $e_1 \perp e_2$ then $e_1 e_2 = 0$ and $e_1 + e_2 = e_1 \vee e_2$.
PROOF. Since $e_1 \perp e_2$ and $e_1 e_2 = 0$ are equivalent (see AIV, §6) it easily follows that
$(e_1 + e_2)^2 = e_1 + e_2$, that is, $e_1 + e_2 \in G$. It is easy to see that $K_0(e_1 + e_2) =
K_0(e_1) \cap K_0(e_2)$; from Th. 6.4 we obtain $K_0(e_1) \cap K_0(e_2) = K_0(e_1 \vee e_2)$. From
$K_0(e_1 + e_2) = K_0(e_1 \vee e_2)$ and from Th. 6.4 it follows that $e_1 + e_2 = e_1 \vee e_2$.
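A minimal numerical illustration of Th. 6.8, assuming a two-dimensional real Hilbert space and exact rational arithmetic. The two projections and the helpers `matmul` and `matadd` are chosen here only for illustration.

```python
from fractions import Fraction as Fr

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

half = Fr(1, 2)
e1 = [[half, half], [half, half]]      # projection onto the direction (1, 1)
e2 = [[half, -half], [-half, half]]    # projection onto the direction (1, -1)

product = matmul(e1, e2)   # vanishes because e1 is orthogonal to e2
total = matadd(e1, e2)     # e1 + e2, which should again be a projection
```

Here `total` is the unit matrix, the projection onto the span of the two directions, which is what $e_1 \vee e_2$ should be for this pair.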
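D 6.6. An orthocomplemented lattice is said to be orthomodular if, for
elements $a, b, c$ of the lattice with $c \perp b$ and $a \leq c$, the following relation holds:
$$(a \vee b) \wedge c = (a \wedge c) \vee (b \wedge c) = a$$
(see AI, D 2.3).
Th. 6.9. $G$ is orthomodular.
PROOF. From $a \wedge c \leq a$, $b \wedge c \leq b$ it follows that $(a \wedge c) \vee (b \wedge c) \leq a \vee b$. From
$a \wedge c \leq c$, $b \wedge c \leq c$ it follows that $(a \wedge c) \vee (b \wedge c) \leq c$; thus we obtain
$$(a \wedge c) \vee (b \wedge c) \leq (a \vee b) \wedge c.$$
For $G$ the orthomodularity is equivalent to that found in the case of the subspaces
$\mathfrak{a}, \mathfrak{b}, \mathfrak{c}$ of a Hilbert space. Thus, for $\mathfrak{b} \perp \mathfrak{c}$ and $\mathfrak{a} \subset \mathfrak{c}$ it is necessary to show that the
relation $(\mathfrak{a} \vee \mathfrak{b}) \cap \mathfrak{c} \subset \mathfrak{a}$ holds. Since $\mathfrak{a} \perp \mathfrak{b}$ the vectors in $\mathfrak{a} \vee \mathfrak{b}$ have the form
$\varphi = \varphi_1 + \varphi_2$, where $\varphi_1 \in \mathfrak{a}$ and $\varphi_2 \in \mathfrak{b}$. Therefore we find that $\varphi_1 \perp \varphi_2$.
Since $\mathfrak{c} \perp \mathfrak{b}$, for the projection $P_\mathfrak{c}$ on the subspace $\mathfrak{c}$ we find that $P_\mathfrak{c} \varphi_2 = 0$. In
addition $P_\mathfrak{c} \varphi = P_\mathfrak{c} \varphi_1 = \varphi_1$ (the last equality follows from $\mathfrak{a} \subset \mathfrak{c}$). For each
$\varphi \in (\mathfrak{a} \vee \mathfrak{b}) \cap \mathfrak{c}$ we have $\varphi \in \mathfrak{c}$, that is, $P_\mathfrak{c} \varphi = \varphi$ and also $\varphi = \varphi_1$; we obtain $\varphi \in \mathfrak{a}$.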
Another proof of Th. 6.9 follows directly from AI, Th. 2.2. (See remarks at
the end of AI, §2.)
Th. 6.10. Decreasing and increasing sequences of elements of $G$ converge in
the $\sigma(\mathcal{B}', \mathcal{B})$ topology towards elements of $G$; an increasing sequence $e_\nu$
converges towards $\bigvee_\nu e_\nu$ and a decreasing sequence $e_\nu$ converges towards
$\bigwedge_\nu e_\nu$.
Proof. The convergence of increasing and decreasing sequences in $L$ follows in
general from AIII, §6. Let $e_\nu$ be a decreasing sequence with $e_\nu \to g \in L$. From
$e_\nu \geq \bigwedge_\mu e_\mu$ it follows that $g \geq \bigwedge_\nu e_\nu$. From $e_\nu \geq e_\mu$ for all $\mu \geq \nu$, in the limit
we obtain $e_\nu \geq g$ for all $\nu$. We obtain $K_0(g) \supset K_0(e_\nu)$ for all $\nu$ and, from Th. 6.4, we
obtain $K_0(g) \supset \bigvee_\nu K_0(e_\nu) = K_0(\bigwedge_\nu e_\nu)$, that is, $g \in L_0 K_0(\bigwedge_\nu e_\nu)$. Therefore, from
Th. 6.6 we obtain $g \leq \bigwedge_\nu e_\nu$. Thus we find that $g = \bigwedge_\nu e_\nu$.
If the sequence $e_\nu$ is increasing, then the sequence $e_\nu^\perp = 1 - e_\nu$ is decreasing and
$1 - e_\nu \to \bigwedge_\nu e_\nu^\perp$. Therefore we obtain $e_\nu \to 1 - \bigwedge_\nu e_\nu^\perp = (\bigwedge_\nu e_\nu^\perp)^\perp = \bigvee_\nu e_\nu$.
CHAPTER IV
Coexistent Effects and
Coexistent Decompositions
The structure of the theory of microsystems presented in this book permits us
to make a fundamental characterization of the concept of microsystems
without making use of the familiar basic concepts of property and
observable. The notions of property and pseudoproperty which were introduced in
III, §4 are not fundamental concepts but are derived concepts, derived from
the more fundamental concepts of preparation and registration procedure. In
III, §4 we have outlined one possible way in which it is possible to begin to
separate the notion of a microsystem from the inherent structure associated
with preparation and registration procedures. We shall now consider a
second way to accomplish this separation. These two methods will be
compared in §8.
We shall now bring out more clearly the difference between the deductive
approach presented here and the approaches which are based on the notions of
“property” and “observables,” so that the problems which will be discussed in
this chapter can be formulated more precisely.
Many formulations of quantum mechanics begin by making certain
specific assumptions about the properties of microsystems: It is assumed that
the properties of microsystems may be ascertained by measurement, and may
(at least, “after” the measurement) be attributed to the microsystems.
Similarly, an observable is considered to be a “measurable quantity” which
(again, on the basis of measurement) may be attributed to the microsystems.
For instance, it is assumed that, in the measurement of position, we are able
to “detect” the position of the microsystem, and that the latter is one of its
properties. In other words, it is assumed that the proposition “the
microsystem has the measured position” is already meaningful.
We have not used the concepts described above in our formulation of
quantum mechanics in II and III because we were not convinced that their
meaning can be readily determined. For instance, we do not assume that a
proposition such as “the microsystem has the measured position” is already
meaningful. Instead, we have only made use of the fundamental concepts,
those of preparation (represented by $\mathcal{Q}$) and registration (represented by
$\mathcal{R}_0$, $\mathcal{R}$), concepts which have immediate meaning because they can be
explained in terms of “pre-theories” relative to quantum mechanics (for the
meaning of the notion of a pre-theory see [1], §5 and [2], §4). It is possible to
deduce the concepts of properties and pseudoproperties of microsystems
only after the development of the theory (see §8).
The microsystems make their presence known by the “response” $b \in \mathcal{R}$ of
the “apparatus” $b_0 \in \mathcal{R}_0$. Except for a few general comments made in III, §4,
what these responses permit us to conclude about the microsystems
themselves remains open. The investigations in III, §4 do not represent a
systematic path from the preparation and registration methods to the actual
structure of microsystems. Instead, they only raise the question whether it is
possible to correctly formulate the intuitive notion of a “property” in terms of
the structures of preparation and registration procedures. It was not possible
to systematically answer the question as to how much the “response” $b \in \mathcal{R}$
of the apparatus $b_0 \in \mathcal{R}_0$ is due to the microsystem $x \in b$ and how much is
due to the apparatus $b_0$. We shall now turn to these problems and questions
without (!) prejudging the issue by making arbitrary “assumptions
concerning the nature of microsystems.” For a realistic clarification it is necessary to
obtain all concepts directly from the experiments with microsystems which
are described by the structures $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ (see the more general treatment in
[1], §10). The physical meaning of quantum mechanics may be obtained then
only with the use of elements of the sets $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$. We shall not introduce a
syntax for assertions about the microsystems themselves; we shall only
consider mathematically formulated relations in the form described in §8.3.
1 Coexistent Effects and Observables
The concept of an observable is a very useful concept in quantum mechanics.
It will be derived from the concepts described by the sets $\mathcal{R}_0$, $\mathcal{R}$. In order to do
this, we shall now examine the physical significance of the substructures
$\mathcal{R}(b_0)$ of the structures $\mathcal{R}_0$, $\mathcal{R}$.
1.1 Coexistent Registrations
Earlier we have introduced the expression “effect process” to denote a pair
$(b_0, b)$ where $b_0 \in \mathcal{R}_0$, $b \in \mathcal{R}$ and $b \subset b_0$. To such a pair the map $\psi$ assigns an
effect $\psi(b_0, b) \in L \subset \mathcal{B}'(\mathcal{H}_1, \ldots)$. In III we have only been concerned with the
mapping of a single effect process $f = (b_0, b)$ onto an element
$g = \psi(b_0, b) \in L$. Here it is important to note that there are, in general, many
such $b \subset b_0$ for an apparatus $b_0$. Let $\mathcal{R}(b_0)$ denote the set of $b \in \mathcal{R}$ for which
$b \subset b_0$. In experiments with microsystems we find that, in most cases, the
number of registrations $b$ for which $b \subset b_0$ is overwhelming. An approximately
exhaustive description of those cases cannot be given in a book, much less in
a few lines. For the purpose of illustration we shall only consider two typical
experiments.
In the first example we shall consider an array of counters in which each
microsystem may activate the response of some of them. Let $b_0$ denote the
array of counters. Let us characterize $b$ as follows: three specifically chosen
counters will respond; four other specifically chosen counters will not. In this
case $\mathcal{R}(b_0)$ will be familiar to those readers who are familiar with modern
electronic technology: $\mathcal{R}(b_0)$ is the Boolean switching algebra of the set of
possible responses of the counters. For the case in which $b_0$ consists of a pair
of counters we may describe $\mathcal{R}(b_0)$ as follows: $b_1$ is the registration for which
the first counter responds; $b_2$ is the registration for which the second counter
responds. Here $b_1 \cap b_2$ corresponds to the case in which counters 1 and 2
have both responded; $b_0 \setminus b_1$ corresponds to the case in which counter 1 has
not responded; similarly for $b_0 \setminus b_2$. Here $(b_0 \setminus b_1) \cap b_2$, $(b_0 \setminus b_2) \cap b_1$, $b_1 \cup b_2$,
etc. are described similarly. By including $b_0$ and $\emptyset$ we find that $\mathcal{R}(b_0)$ has
exactly 16 elements.
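The count of 16 can be reproduced by enumerating the Boolean algebra generated by the two counters. The sketch below is an illustration only: it models each registration as a set of outcome pairs (counter 1 responded, counter 2 responded), which is a modelling assumption made here, and the names `atoms`, `all_unions`, `b0`, `b1`, `b2` are hypothetical.

```python
from itertools import combinations

# The four atoms of the algebra: which of the two counters responded.
atoms = [(r1, r2) for r1 in (True, False) for r2 in (True, False)]

def all_unions(atoms):
    """Every element of the algebra is a union of atoms (as a frozenset)."""
    return {frozenset(c) for k in range(len(atoms) + 1)
            for c in combinations(atoms, k)}

algebra = all_unions(atoms)
b0 = frozenset(atoms)                         # the whole registration method
b1 = frozenset(a for a in atoms if a[0])      # counter 1 responded
b2 = frozenset(a for a in atoms if a[1])      # counter 2 responded
```

With four atoms there are $2^4 = 16$ unions, including the empty set and `b0`; expressions such as `(b0 - b1) & b2` or `b1 | b2` all land in `algebra`.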
In the second example we shall consider a cloud chamber (or a bubble
chamber) $b_0$. Here the number of possible registrations is overwhelming.
Every condensation droplet (or vapor bubble) may itself be a registration $b$;
the ionization trail (or bubble trail) may also be one. If a magnetic field is
applied, then the radius of a circular particle trail will be a “scale value” for
certain of these $b$'s. The scale numbers are nothing more than indices which
serve to order the overwhelming range of possible registrations $b \subset b_0$. Scale
values, even when they are very practical, are of a secondary nature. The
primary concept is that of registration possibilities.
These two examples also demonstrate the following fact: The different
elements from $\mathcal{R}(b_0)$ need not occur “simultaneously” because $b \subset b_0$ does
not necessarily correspond to a “point in time,” but instead, may correspond
to an extended process, such as a joint response of two counters in an array
where one responds later than the other, or to the entire ionization trail as in
a cloud chamber, which persists for a certain length of time. The joint
registration of elements $b \in \mathcal{R}(b_0)$ has nothing to do with “simultaneous”
registration or “simultaneous” measurement. In quantum theory such
expressions as “simultaneously measurable” and “not simultaneously
measurable” are often used. These expressions are often misleading and
misunderstood. We shall discuss this subject in more detail in VII, XVII, and
XVIII. Now we shall state only that the joint registration of the $b \in \mathcal{R}(b_0)$ has
nothing to do with “simultaneity.” In order to reduce such misunderstanding we
shall choose another concept (which we have already introduced in II, D 2.2) for
the description of the physical situation represented by $\mathcal{R}(b_0)$.
D 1.1.1. The registration procedures $b \in \mathcal{R}(b_0)$ are said to be coexistent with
respect to the registration method $b_0 \in \mathcal{R}_0$. The $(b_0, b) \in \mathcal{F}$ which
correspond to the same $b_0$ will be called coexistent effect processes.
A subset $A \subset \mathcal{F}$ is a set of coexistent effect processes if and only if all
elements in $A$ have the same first component $b_0$.
How is the structure of coexistent effect processes affected by the
mapping $\psi$ of $\mathcal{F}$ into $L \subset \mathcal{B}'(\mathcal{H}_1, \ldots)$?
1.2 Coexistent Effects
If we have a set of coexistent effect processes $(b_0, b)$ then the corresponding
set of effects $\psi(b_0, b)$ is a subset of the set of effects $\psi(b_0, b)$ containing all the
$b \in \mathcal{R}(b_0)$. For a fixed $b_0$ the map $\psi(b_0, b)$ defines a map $\psi_0: \mathcal{R}(b_0) \to L$. We
shall now investigate the properties of this map.
The following definition is often used:
D 1.2.1. Let $F$ be a map of the Boolean ring $\Sigma$ (with unit element $\varepsilon$) into the
order interval $[0, u]$ of an ordered vector space. Let $F$ satisfy the following
conditions:
$$F(\varepsilon) = u,$$
$$F(\sigma_1 \vee \sigma_2) = F(\sigma_1) + F(\sigma_2)$$
for the case in which $\sigma_1 \wedge \sigma_2 = 0$. Then the map $F$ is called an (additive)
measure on $\Sigma$.
For the special case in which $[0, u]$ is the interval $[0, 1] \subset \mathbb{R}$ we shall call $F$
(as defined in D 1.2.1) a real measure (see II, D 3.1).
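Th. 1.2.1. For fixed $b_0$ the map $\psi_0(b) = \psi(b_0, b)$ of $\mathcal{R}(b_0)$ into
$[0, 1] = L \subset \mathcal{B}'(\mathcal{H}_1, \ldots)$ is an additive measure on $\mathcal{R}(b_0)$.
Proof. According to APS 5.1.4, to each $w \in \mathcal{K}$ there exists an $a \in w$ for which
$a \cap b_0 \neq \emptyset$. Let $b_1, b_2 \in \mathcal{R}(b_0)$ and suppose that $b_1 \cap b_2 = \emptyset$. We then obtain:
$$\lambda_\mathcal{S}(a \cap b_0, a \cap (b_1 \cup b_2)) = \lambda_\mathcal{S}(a \cap b_0, (a \cap b_1) \cup (a \cap b_2))$$
$$= \lambda_\mathcal{S}(a \cap b_0, a \cap b_1) + \lambda_\mathcal{S}(a \cap b_0, a \cap b_2).$$
We therefore obtain
$$\mu(\varphi(a), \psi(b_0, b_1 \cup b_2)) = \mu(\varphi(a), \psi(b_0, b_1)) + \mu(\varphi(a), \psi(b_0, b_2)).$$
Since, by AQ, $\varphi\mathcal{Q}'$ is dense in $K$, for all $w \in K$ we obtain
$$\mu(w, \psi(b_0, b_1 \cup b_2)) = \mu(w, \psi(b_0, b_1)) + \mu(w, \psi(b_0, b_2)) \qquad (1.2.1)$$
from which we conclude that
$$\psi_0(b_1 \cup b_2) = \psi_0(b_1) + \psi_0(b_2). \qquad (1.2.2)$$
Since $\psi(b_0, b_0) = 1$ we obtain $\psi_0(b_0) = 1$.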
Equations (1.2.1) and (1.2.2) are equivalent; intuitively they express the
additivity of the frequencies for the responses of “$b_1$ or $b_2$” for the case in
which $b_1$ and $b_2$ cannot both respond. That is, they are a direct
consequence of the “switch” from $b_1$ and $b_2$ to “$b_1$ or $b_2$.” Put differently,
these equations are a direct consequence of the intuitively motivated equation
II (3.5) and the axioms AS 2 for statistical selection procedures. On the basis
of previous axioms it follows that there are no restrictive conditions for the
map $\psi_0$ of $\mathcal{R}(b_0)$ in $L$ other than those imposed by equation (1.2.2) and the
condition $\psi_0(b_0) = 1$. Thus, it will be possible to impose axiom AOb later.
We shall now introduce the following definition which is motivated by
Th. 1.2.1.
D 1.2.2. A set $A \subset L$ is called a set of coexistent effects if there exists a
Boolean ring $\Sigma$ with additive measure $F: \Sigma \to L$ for which $A \subset F\Sigma$.
In order to simplify what follows, we shall introduce the following
definition:
D 1.2.3. An additive measure $F$ on $\Sigma$ is said to be effective if $F(\sigma) = 0$
implies that $\sigma = 0$.
We shall now state and prove the following theorem:
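Th. 1.2.2. Let $F: \Sigma \to L$ be an additive measure; let $\Sigma_0 = \{\sigma \mid \sigma \in \Sigma$ and
$F(\sigma) = 0\}$. Let $\sigma \in \tilde{\sigma}$ where $\tilde{\sigma} \in \Sigma/\Sigma_0$; define $\tilde{F}(\tilde{\sigma}) = F(\sigma)$. Then $\tilde{F}$ is an
effective additive measure on the Boolean ring $\Sigma/\Sigma_0$.
PROOF. First we shall show that $\Sigma_0$ is an ideal in $\Sigma$: Suppose that $\sigma_1 \leq \sigma$ and
$F(\sigma) = 0$; then from $F(\sigma_1) \leq F(\sigma)$ it follows that $F(\sigma_1) = 0$. Let $\sigma = \sigma_1 \vee \sigma_2$ and
suppose that $F(\sigma_1) = F(\sigma_2) = 0$. Then from $\sigma = \sigma_1 \vee (\sigma_2 \wedge \sigma_1^*)$ and from
$\sigma_2 \wedge \sigma_1^* \leq \sigma_2$ it follows that $F(\sigma) = 0$. Therefore $\Sigma/\Sigma_0$ is a Boolean ring (see AI, end
of §3). From $\sigma_1, \sigma_2 \in \tilde{\sigma} \in \Sigma/\Sigma_0$ we obtain $\sigma_1 \wedge \sigma_2^* \in \Sigma_0$ and $\sigma_1^* \wedge \sigma_2 \in \Sigma_0$ from
which it follows that
$$F(\sigma_1) = F(\sigma_1 \wedge \sigma_2^*) + F(\sigma_1 \wedge \sigma_2)$$
$$= F(\sigma_1^* \wedge \sigma_2) + F(\sigma_1 \wedge \sigma_2) = F(\sigma_2).$$
Therefore, for $\sigma \in \tilde{\sigma}$, $\tilde{F}(\tilde{\sigma}) = F(\sigma)$ defines a function $\tilde{F}: \Sigma/\Sigma_0 \to L$. Let $\tilde{\varepsilon}$ denote the
class containing the unit element $\varepsilon$ of $\Sigma$. Then we obtain $\tilde{F}(\tilde{\varepsilon}) = 1$. If $\tilde{\sigma}_1 \wedge \tilde{\sigma}_2 = 0$,
then, for any pair of representatives $\sigma_1 \in \tilde{\sigma}_1$, $\sigma_2 \in \tilde{\sigma}_2$ we find that $\sigma_1 \wedge \sigma_2 \in \Sigma_0$.
Hence we find that
$$F(\sigma_1 \vee \sigma_2) = F(\sigma_1) + F(\sigma_1^* \wedge \sigma_2)$$
$$= F(\sigma_1) + F(\sigma_1 \wedge \sigma_2) + F(\sigma_1^* \wedge \sigma_2)$$
$$= F(\sigma_1) + F(\sigma_2)$$
and finally
$$\tilde{F}(\tilde{\sigma}_1 \vee \tilde{\sigma}_2) = \tilde{F}(\tilde{\sigma}_1) + \tilde{F}(\tilde{\sigma}_2).$$
From $\tilde{F}(\tilde{\sigma}) = 0$ it follows that $F(\sigma) = 0$ for all $\sigma \in \tilde{\sigma}$, and therefore $\sigma \in \Sigma_0$, that is, $\tilde{\sigma}$
is the null element of $\Sigma/\Sigma_0$.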
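The passage from $F$ to the effective measure on the quotient can be traced on a finite example. The sketch below makes several simplifying assumptions that are not in the text: the Boolean ring is the power set of three points, the measure is real-valued (standing in for an $L$-valued one), and the third point carries zero weight. All names are illustrative.

```python
from fractions import Fraction as Fr
from itertools import combinations

points = (0, 1, 2)
weight = {0: Fr(1, 2), 1: Fr(1, 2), 2: Fr(0)}   # point 2 carries no weight

def F(sigma):
    """Additive measure: total weight of the points in sigma."""
    return sum((weight[p] for p in sigma), Fr(0))

# The Boolean ring: all subsets of the three points.
sigma_ring = [frozenset(c) for k in range(4) for c in combinations(points, k)]

# The ideal of null elements, F(sigma) = 0.
null_ideal = {s for s in sigma_ring if F(s) == 0}

def cls(sigma):
    """Equivalence class of sigma modulo the null ideal."""
    return frozenset(t for t in sigma_ring if (sigma ^ t) in null_ideal)

quotient = {cls(s) for s in sigma_ring}

def F_tilde(c):
    """Value of F on a class; it does not depend on the representative."""
    values = {F(s) for s in c}
    assert len(values) == 1
    return values.pop()
```

The eight subsets collapse to four classes, and the only class with measure zero is the class of the empty set, so `F_tilde` is effective on the quotient.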
Therefore, if we wish to investigate coexistent effects, Th. 1.2.2 implies that
we need only consider those Boolean rings having effective measures. Clearly
$\psi$ maps a set of coexistent effect processes into a set of coexistent effects. In
addition, it is easy to show that the map $\psi_0: \mathcal{R}(b_0) \to L$ is an effective measure
on $\mathcal{R}(b_0)$ because from $\psi_0(b) = 0$ (that is, $\psi(b_0, b) = 0$) it follows that
$\lambda_\mathcal{S}(a \cap b_0, a \cap b) = 0$ for all $a \in \mathcal{Q}'$ for which $(a, b_0) \in C$ (where $C$ is defined
in II (4.3.7)). Thus $a \cap b = \emptyset$ for all $a \in \mathcal{Q}'$ for which $(a, b_0) \in C$. According
to APS 5.2 this is possible only if $b = \emptyset$.
Definition D 1.2.2 has the following essential advantage over the special
situation $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$: We do not have to be concerned with the question
whether, given a set $A$ of coexistent effects, there exists a registration method
$b_0 \in \mathcal{R}_0$ for which $A \subset \psi_0 \mathcal{R}(b_0)$. In §4 we shall assert that in an
“approximate” sense, to each Boolean ring $\Sigma$ having an effective measure $F: \Sigma \to L$
there is a “realization” described by a $\mathcal{R}(b_0)$; the complication of the
“approximate realization” will be eliminated if we consider general effective
measures on a Boolean ring.
In AI, §3 we show how to define two operations $\dotplus$ and $\cdot$ for a Boolean ring
which satisfy the rules of a commutative algebra. In addition, we show how
to define a generalized Boolean ring (without unit element) using $\dotplus$ and $\cdot$. A
unit element exists if and only if there exists an $\varepsilon$ for which $\varepsilon \cdot \sigma = \sigma$ for all
elements $\sigma$ of the Boolean ring.
D 1.2.4. A map $F$ of a generalized Boolean ring $\Sigma$ into $[0, 1]$ is called an
additive measure if, for $\sigma_1 \cdot \sigma_2 = 0$ the following equation is
satisfied:
$$F(\sigma_1 \dotplus \sigma_2) = F(\sigma_1) + F(\sigma_2).$$
Note. In the case in which $\Sigma$ has a unit element $\varepsilon$, we do not require that
$F(\varepsilon) = 1$!
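Th. 1.2.3. Let $\Sigma$ be a generalized Boolean ring and let $F$ be an additive
measure on $\Sigma$. Then there exists an extension $\hat{\Sigma}$ of $\Sigma$ and a (normed) measure
$\hat{F}$ which is a continuation of $F$ on $\hat{\Sigma}$.
PROOF. Let $E$ be the two-element Boolean ring which consists of the zero element $0$
and the unit element $\varepsilon$. Let $\hat{\Sigma} = E \times \Sigma$ and let $\dotplus$ and $\cdot$ be defined as follows:
$$(\varepsilon, \sigma_1) \dotplus (\varepsilon, \sigma_2) = (0, \sigma_1 \dotplus \sigma_2),$$
$$(\varepsilon, \sigma_1) \dotplus (0, \sigma_2) = (\varepsilon, \sigma_1 \dotplus \sigma_2),$$
$$(0, \sigma_1) \dotplus (0, \sigma_2) = (0, \sigma_1 \dotplus \sigma_2),$$
$$(\varepsilon, \sigma_1) \cdot (\varepsilon, \sigma_2) = (\varepsilon, \sigma_1 \dotplus \sigma_2 \dotplus \sigma_1 \cdot \sigma_2),$$
$$(\varepsilon, \sigma_1) \cdot (0, \sigma_2) = (0, \sigma_2 \dotplus \sigma_1 \cdot \sigma_2),$$
$$(0, \sigma_1) \cdot (0, \sigma_2) = (0, \sigma_1 \cdot \sigma_2).$$
$\hat{\Sigma}$ is a Boolean ring; $\Sigma$ may be identified with the subset of all $(0, \sigma)$ of $\hat{\Sigma}$, that is, $\hat{\Sigma}$ is
an extension of $\Sigma$. Let
$$\hat{F}(0, \sigma) = F(\sigma), \qquad \hat{F}(\varepsilon, \sigma) = 1 - F(\sigma).$$
$\hat{F}$ is an additive measure on $\hat{\Sigma}$ which coincides with $F$ on $\Sigma$.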
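The adjunction of a unit element can be tried out on a concrete generalized Boolean ring. In the sketch below the ring is taken to be the finite subsets of the natural numbers with symmetric difference as sum and intersection as product (an assumption made here for illustration), the flag `True` stands for the adjoined unit component, and the weights $2^{-(n+1)}$ are an arbitrary choice that keeps $F$ inside $[0, 1]$.

```python
from fractions import Fraction as Fr

def add(x, y):
    """Sum in the extended ring: componentwise (XOR, symmetric difference)."""
    (a, s), (b, t) = x, y
    return (a != b, s ^ t)

def mul(x, y):
    """Product in the extended ring, following the rules of the proof."""
    (a, s), (b, t) = x, y
    if a and b:
        return (True, s ^ t ^ (s & t))
    if a:
        return (False, t ^ (s & t))
    if b:
        return (False, s ^ (s & t))
    return (False, s & t)

def F(s):
    """Additive measure on finite sets of naturals with weights 2^-(n+1)."""
    return sum((Fr(1, 2 ** (n + 1)) for n in s), Fr(0))

def F_hat(x):
    """Extension of F: elements with the unit flag get 1 - F."""
    flag, s = x
    return 1 - F(s) if flag else F(s)

unit = (True, frozenset())
zero = (False, frozenset())
x = (False, frozenset({0, 2}))        # an element of the original ring
y = (True, frozenset({0, 1, 2}))      # plays the role of a complement
```

In this example `unit` acts as a unit for `mul`, `x` and `y` have product `zero`, and `F_hat` is additive on their sum.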
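From Th. 1.2.3 it follows that a set $A \subset L$ is a set of coexistent effects if
there is a generalized Boolean ring $\Sigma$ and an additive measure $F$ on $\Sigma$ for
which $A \subset F\Sigma$.
We shall now consider the special case in which $A$ consists of a pair of
elements $g_1, g_2$. We then obtain the theorem:
Th. 1.2.4. The following conditions are equivalent:
(i) $g_1, g_2$ are coexistent.
(ii) There exist three elements $g_1', g_2', g_{12} \in L$ for which $g_1 = g_1' + g_{12}$,
$g_2 = g_2' + g_{12}$ and $g_1' + g_2' + g_{12} = g_1 + g_2' = g_2 + g_1' \in L$.
PROOF. (i) ⇒ (ii). Let $\Sigma$ be a generalized Boolean ring with $g_1, g_2 \in \{F(\sigma) \mid \sigma \in \Sigma\}$. Here
we find that $F(\sigma_1) = g_1$, $F(\sigma_2) = g_2$. Now let us consider the following additional
elements in $\Sigma$: $\sigma_1 \cdot \sigma_2$, $\sigma_1 \dotplus \sigma_1 \cdot \sigma_2$, $\sigma_2 \dotplus \sigma_1 \cdot \sigma_2$, and $\sigma_1 \vee \sigma_2 = \sigma_1 \dotplus \sigma_2 \dotplus \sigma_1 \cdot \sigma_2$.
From the additivity of $F(\sigma)$, and from $F(\sigma_1 \cdot \sigma_2) = g_{12}$, $F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) = g_1'$, and
$F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = g_2'$ it follows that
$$g_1 = F(\sigma_1) = F((\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) \dotplus \sigma_1 \cdot \sigma_2)$$
$$= F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_1 \cdot \sigma_2) = g_1' + g_{12};$$
similarly we obtain $g_2 = g_2' + g_{12}$. In addition, we obtain
$$F(\sigma_1 \vee \sigma_2) = F(\sigma_1 \dotplus (\sigma_2 \dotplus \sigma_1 \cdot \sigma_2)) = F(\sigma_1) + F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2)$$
$$= g_1 + g_2'.$$
(ii) ⇒ (i). Let us consider a set $\Pi$ consisting of three elements (1), (2), and (3),
and let $\Sigma = \mathcal{P}(\Pi)$. For the elementary sets $\sigma_1 = \{(1)\}$, $\sigma_2 = \{(2)\}$, $\sigma_3 = \{(3)\}$ we set
$F(\sigma_1) = g_1'$, $F(\sigma_2) = g_2'$, $F(\sigma_3) = g_{12}$. It is easy to show that $F$ determines an
additive (not necessarily normed) measure satisfying $F(\sigma_1 \dotplus \sigma_3) = g_1$,
$F(\sigma_2 \dotplus \sigma_3) = g_2$.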
The proof of this theorem is particularly instructive because it shows that
$g_1, g_2$ does not necessarily uniquely determine the remaining values of the
measure $F(\sigma)$. For the case in which $\Sigma = \mathcal{P}(\Pi)$ we may obtain different
measures because there will be different $g_{12} \in L$ for which $g_1 - g_{12}$, $g_2 - g_{12}$
and $g_1 + g_2 - g_{12} \in L$. If, for example, $g_1 + g_2 \in L$ then we may choose
$g_{12} = 0$. If we can find another $g_{12} \neq 0$, $g_{12} \in L$ for which $g_1 - g_{12}$,
$g_2 - g_{12} \in L$, we may then construct two different measures $F(\sigma)$ on
$\Sigma = \mathcal{P}(\Pi)$. It is easy to give such an example. Consider three arbitrary
elements $\tilde{g}_1, \tilde{g}_2, \tilde{g}_3 \in L$ and choose $g_1, g_2$ as follows:
$$g_1 = \tfrac{1}{3} \tilde{g}_1 + \tfrac{1}{6} \tilde{g}_3,$$
$$g_2 = \tfrac{1}{3} \tilde{g}_2 + \tfrac{1}{6} \tilde{g}_3. \qquad (1.2.3)$$
Then we obtain $g_1 + g_2 = \tfrac{1}{3}(\tilde{g}_1 + \tilde{g}_2 + \tilde{g}_3) \in L$; alternatively we can set
$g_{12} = \tfrac{1}{6} \tilde{g}_3$.
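The non-uniqueness of $g_{12}$ can be checked numerically. The sketch below reads the coefficients in (1.2.3) as 1/3 and 1/6, which is an assumption about the garbled source, though one consistent with the stated sum $g_1 + g_2$. It further assumes diagonal effects, represented by their eigenvalue lists, and the three starting effects `t1`, `t2`, `t3` are arbitrary illustrative values.

```python
from fractions import Fraction as Fr

def is_effect(g):
    """A diagonal operator is an effect if all eigenvalues lie in [0, 1]."""
    return all(0 <= x <= 1 for x in g)

def sub(g, h):
    return [x - y for x, y in zip(g, h)]

def add(g, h):
    return [x + y for x, y in zip(g, h)]

def scale(c, g):
    return [c * x for x in g]

def admissible(g1, g2, g12):
    """Check that g12, g1 - g12, g2 - g12 and g1 + g2 - g12 are all effects."""
    return all(is_effect(h) for h in
               (g12, sub(g1, g12), sub(g2, g12), sub(add(g1, g2), g12)))

# Three arbitrary diagonal effects (illustrative values only).
t1 = [Fr(1), Fr(0), Fr(1, 2)]
t2 = [Fr(0), Fr(1), Fr(1, 2)]
t3 = [Fr(1), Fr(1), Fr(1)]

g1 = add(scale(Fr(1, 3), t1), scale(Fr(1, 6), t3))
g2 = add(scale(Fr(1, 3), t2), scale(Fr(1, 6), t3))

choice_a = [Fr(0)] * 3            # g12 = 0
choice_b = scale(Fr(1, 6), t3)    # g12 = (1/6) * t3
```

Both `choice_a` and `choice_b` pass `admissible`, so the same pair of effects admits two different joint effects and hence two different measures on the three-point power set.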
In the following section we shall return to consider the fact that two effects
do not necessarily uniquely determine an effect $g_{12}$ for which “both effects $g_1$
and $g_2$ jointly respond.”
We shall now mention the following special cases of Th. 1.2.4: Each of the
following conditions is sufficient for $g_1$ and $g_2$ to be coexistent:
(1) $g_1 + g_2 \in L$ (set $g_{12} = 0$).
(2) $g_1 \geq g_2$ (set $g_{12} = g_2$).
1.3 Commensurable Decision Effects
The notion of a decision effect, which was introduced in III, §6 plays an
important role in quantum mechanics. Often the “role” of decision effects is
over-estimated—resulting in the exclusion of realistic situations for all of the
effects. In order to avoid clouding the issue by introducing “additional
hypotheses” or “opinions” we shall proceed step by step to develop the
special role of a decision effect. We shall use only the assertions about
decision effects which were introduced in III, §6.
We shall now begin with the following obvious definition:
D 1.3.1. A set $A \subset G$ (that is, a set $A$ of decision effects) is said to be
commensurable if there exists a Boolean ring $\Sigma$ with additive measure
$F: \Sigma \to G$ for which $A \subset F\Sigma$.
The reader should note that, according to this definition, there is a map $F$
of $\Sigma$ into $G$. According to D 1.2.2 a set $A \subset G$ is coexistent if there exists a
Boolean ring $\Sigma'$ with additive measure $F': \Sigma' \to L$ for which $A \subset F'\Sigma'$. Thus,
a set of commensurable decision effects is also (since $G \subset L$) coexistent. Does
the converse hold?
The Boolean ring $\Sigma$ with measure $F: \Sigma \to G$ represents at least an
“idealized” (in the case in which $\Sigma \xrightarrow{F} G$ can only be approximately “realized”
by a $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$; see §4) registration method corresponding to decision
effects associated with the realizable registrations. It was the practice of
theoretical physicists to permit only the use of those measurement methods
for which $\psi(b_0, b) \in G$. Certainly, this is a particularly interesting special case of
a measurement. We note, however, that the general measurement methods
for which the $\psi(b_0, b)$ are only effects are usable measurements. Indeed, they are
the only realistic measurements.
In Th. 1.2.4 we have analyzed the factual content of two coexistent effects.
Now we shall consider the special case in which one of the two effects is a
decision effect.
1 Coexistent Effects and Observables 91
Th. 1.3.1. For $g \in L$, $e \in G$ the following conditions are equivalent:
(i) $g$, $e$ are coexistent.
(ii) $g = g_1 + g_2$ where $g_1, g_2 \in L$ and $g_1 \le e$, $g_2 \le e^\perp$ (in this partition of
$g$, $g_1$ and $g_2$ are uniquely determined).
(iii) $e = g_1' + g_3$ where $g_1', g_3 \in L$ and $g_1' \le g$, $g_3 \le 1 - g$ (in this partition
of $e$, $g_1'$ and $g_3$ are uniquely determined and $g_1' = g_1$ where $g_1$ is defined
in (ii)).
(iv) $eg = ge$.
PROOF. According to Th. 1.2.4 (i) is equivalent to the condition that there exist
$g_1, g_2, g_3 \in L$ such that $g = g_1 + g_2$, $e = g_1 + g_3$ and $g_1 + g_2 + g_3 \in L$. From this
it follows that $g_1 \le e$ and $1 \ge g_1 + g_2 + g_3 = g_2 + e$, that is, $g_2 \le 1 - e = e^\perp$; thus we
obtain (i) $\Rightarrow$ (ii).
Conversely, if $g = g_1 + g_2$ where $g_1 \le e$, $g_2 \le e^\perp$ then $g_3 = e - g_1 \in L$ and
$e = g_1 + g_3$ and $g_1 + g_2 + g_3 = g_2 + e \le e^\perp + e = 1$, that is, $g_1 + g_2 + g_3 \in L$.
Thus we have shown that (ii) $\Rightarrow$ (i).
Let $g = g_1 + g_2 = \tilde g_1 + \tilde g_2$ where $g_1, \tilde g_1 \le e$ and $g_2, \tilde g_2 \le e^\perp$. Then we obtain
$g_1 - \tilde g_1 = \tilde g_2 - g_2$ and
$$0 \le e - g_1 \le e + \tilde g_1 - g_1 = e + g_2 - \tilde g_2 \le e + g_2 \le e + e^\perp \le 1.$$
Therefore $g_3 = e + \tilde g_1 - g_1 \in L$. Since $\tilde g_1 \le e$ we obtain
$K_0(\tilde g_1) \supset K_0(e)$; similarly we obtain $K_0(g_1) \supset K_0(e)$. From these results it follows
that $K_0(g_3) \supset K_0(e)$, that is, $g_3 \in L_0K_0(e)$. From $K_0(e) \subset K_0(g_3)$ (that is,
$g_3 \in L_0K_0(e)$) and from III, Th. 6.6 it follows that $g_3 \le e$. From these results it
follows that $\tilde g_1 \le g_1$; similarly we may also derive $g_1 \le \tilde g_1$. Therefore we obtain
$\tilde g_1 = g_1$ and $\tilde g_2 = g_2$. Therefore we have proven the uniqueness of the
decomposition given in (ii).
From the above decomposition $g = g_1 + g_2$ and $e = g_1 + g_3$ it follows that
$g_1 \le g$ and $g_3 = e - g_1 = e - g + g_2$. Since $g_2 \le e^\perp$ (see (ii)) it follows that
$g_3 \le e - g + e^\perp = 1 - g$. Thus we have shown that (i) $\Rightarrow$ (iii). Conversely, let
$e = g_1' + g_3$ where $g_1' \le g$ and $g_3 \le 1 - g$ and let $g_2' = g - g_1'$. Then we obtain
$g_2' = g - e + g_3 \le g - e + 1 - g = 1 - e = e^\perp$. Thus we have shown that
(iii) $\Rightarrow$ (ii).
Since we may derive the relation (ii) out of (iii) for the special case $g_1' = g_1$ it
follows that the partition in (iii) is unique.
It now suffices to show that (ii) $\Leftrightarrow$ (iv). From $g_1 \le e$ it follows that (see AIV, §6)
$g_1 = e g_1 e$; from $g_2 \le e^\perp$ it follows that $g_2 = e^\perp g_2 e^\perp$. Therefore, from (ii) we obtain
$ge = e g_1 e = eg$. Conversely, if $ge = eg$, we define $g_1 = ege$, $g_2 = e^\perp g e^\perp$. Then,
since $e$ and $g$ commute, we obtain
$$g = (e + e^\perp) g (e + e^\perp) = ege + e^\perp g e^\perp = g_1 + g_2.$$
In addition, we obtain $g_1 = ege \le e 1 e = e$; similarly we obtain $g_2 \le e^\perp$. Thus we
have shown that (ii) holds.
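The equivalence (ii) $\Leftrightarrow$ (iv) can be illustrated numerically. The sketch below assumes a four-dimensional Hilbert space, a randomly chosen projection $e$ of rank 2, and effects modelled as Hermitian matrices with spectrum in $[0,1]$; the names `random_effect` and `leq` are hypothetical helpers, not notation from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, rank = 4, 2

def random_effect(n, rng):
    """Random Hermitian matrix with spectrum in [0, 1]."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    h = a @ a.conj().T
    return h / np.linalg.eigvalsh(h)[-1]

def leq(a, b, tol=1e-9):
    """Operator inequality a <= b."""
    diff = b - a
    return bool(np.linalg.eigvalsh((diff + diff.conj().T) / 2)[0] >= -tol)

# Random unitary from a QR decomposition; e projects onto its first `rank` columns.
q, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))
q1, q2 = q[:, :rank], q[:, rank:]
e = q1 @ q1.conj().T
e_perp = np.eye(dim) - e

# An effect that commutes with e: block diagonal with respect to e and e_perp.
g = q1 @ random_effect(rank, rng) @ q1.conj().T \
    + q2 @ random_effect(dim - rank, rng) @ q2.conj().T

g1 = e @ g @ e
g2 = e_perp @ g @ e_perp

commutes = bool(np.allclose(e @ g, g @ e))
splits = bool(np.allclose(g, g1 + g2))
bounded = leq(g1, e) and leq(g2, e_perp)

# A generic effect h does not commute with e; then e h e + e_perp h e_perp != h.
h = random_effect(dim, rng)
generic_commutes = bool(np.allclose(e @ h, h @ e))
generic_splits = bool(np.allclose(h, e @ h @ e + e_perp @ h @ e_perp))
print(commutes, splits, bounded, generic_commutes, generic_splits)
```

For the commuting effect the partition $g = ege + e^\perp g e^\perp$ with $g_1 \le e$, $g_2 \le e^\perp$ is recovered; for the generic effect neither commutativity nor the partition holds.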
The commutativity of two coexistent effects does not hold in general! This
fact may be verified using example (1.2.1) where $\tilde g_1, \tilde g_2, \tilde g_3$ may be arbitrary
elements of $L$.
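A short numerical check of this remark, again only as a sketch: effects are modelled as random Hermitian matrices with spectrum in $[0,1]$ (an assumption for illustration), and $g_1$, $g_2$ are built from three arbitrary effects with the coefficients $\tfrac13$, $\tfrac16$ of the example in §1.2. The pair is coexistent because $g_1 + g_2 \le 1$, yet the commutator does not vanish.

```python
import numpy as np

def random_effect(dim, rng):
    """Random Hermitian matrix with spectrum in [0, 1]."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    h = a @ a.conj().T
    return h / np.linalg.eigvalsh(h)[-1]

rng = np.random.default_rng(2)
gt1, gt2, gt3 = (random_effect(3, rng) for _ in range(3))
g1 = gt1 / 3 + gt3 / 6
g2 = gt2 / 3 + gt3 / 6

# Coexistence witness: g1 + g2 is itself an effect, so g12 = 0 is admissible.
largest = float(np.linalg.eigvalsh(g1 + g2)[-1])

# Frobenius norm of the commutator [g1, g2].
comm_norm = float(np.linalg.norm(g1 @ g2 - g2 @ g1))
print(largest <= 1.0 + 1e-9, comm_norm)
```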
Th. 1.3.2. Let $e_1, e_2 \in G$ be two coexistent decision effects. Suppose that
they satisfy the decomposition $e_1 = g_1 + g_2$, $e_2 = g_1 + g_3$ where
$g_1, g_2, g_3 \in L$ and $g_1 + g_2 + g_3 \in L$ (equivalent to the coexistence
requirement). Then $g_1, g_2, g_3$ are uniquely determined by $e_1, e_2$ and $g_1 = e_1 \wedge e_2$,
$g_2 = e_1 \wedge e_2^\perp$, $g_3 = e_1^\perp \wedge e_2$ and $g_1, g_2, g_3 \in G$. If $\Sigma$ is a Boolean ring with
additive measure $F: \Sigma \to L$ for which $F(\sigma_1) = e_1$, $F(\sigma_2) = e_2$, it follows that
$F(\sigma_1 \wedge \sigma_2) = e_1 \wedge e_2$, $F(\sigma_1 \dotplus \sigma_1\sigma_2) = e_1 \wedge e_2^\perp$ and $F(\sigma_2 \dotplus \sigma_1\sigma_2) =
e_2 \wedge e_1^\perp$.
PROOF. According to Th. 1.3.1 (ii) $g_1 \le e_2$, $g_2 \le e_2^\perp$, and $g_1 \le e_1$, $g_3 \le e_1^\perp$.
Therefore it follows that $g_1 \in L_0K_0(e_1)$ and $g_1 \in L_0K_0(e_2)$, that is,
$g_1 \in L_0K_0(e_1) \cap L_0K_0(e_2) = L_0(A)$ where $A = K_0(e_1) \cup K_0(e_2)$. According to
III, Th. 6.6 $L_0(A) = L_0(C)$ where $C$ is the face generated by $A$, that is,
$C = K_0(e_1) \vee K_0(e_2)$. According to III, Th. 6.4 $K_0(e_1) \vee K_0(e_2) = K_0(e_1 \wedge e_2)$ and
therefore $g_1 \in L_0(A) = L_0K_0(e_1 \wedge e_2)$, that is, $g_1 \le e_1 \wedge e_2$.
Similarly, it follows that $g_2 \le e_1 \wedge e_2^\perp$, $g_3 \le e_1^\perp \wedge e_2$. Therefore $e_1 = g_1 + g_2 \le
(e_1 \wedge e_2) + (e_1 \wedge e_2^\perp)$. Since $e_1 \wedge e_2 \le e_2$ and $e_1 \wedge e_2^\perp \le e_2^\perp$, we find that
$e_1 \wedge e_2 \perp e_1 \wedge e_2^\perp$; thus, by III, Th. 6.8 we obtain $(e_1 \wedge e_2) + (e_1 \wedge e_2^\perp) =
(e_1 \wedge e_2) \vee (e_1 \wedge e_2^\perp) \le e_1$. From $e_1 = g_1 + g_2 \le (e_1 \wedge e_2) + (e_1 \wedge e_2^\perp) \le e_1$, it
follows that $g_1 = e_1 \wedge e_2$, $g_2 = e_1 \wedge e_2^\perp$ because $g_1 \le e_1 \wedge e_2$, $g_2 \le e_1 \wedge e_2^\perp$.
Similarly, it follows that $g_3 = e_1^\perp \wedge e_2$. The rest of the theorem follows directly
from the proof of Th. 1.2.4.
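Th. 1.3.3. If $F: \Sigma \to G$ is an additive, effective measure on the Boolean ring $\Sigma$,
then $F$ is an isomorphic map of the Boolean ring $\Sigma$ onto the Boolean
sublattice $F\Sigma$ of $G$.
PROOF. According to Th. 1.3.2, for each pair of elements $\sigma_1, \sigma_2 \in \Sigma$ we have
$F(\sigma_1 \wedge \sigma_2) = F(\sigma_1) \wedge F(\sigma_2)$. Furthermore, for $\sigma \in \Sigma$ we obtain $1 = F(\varepsilon) =
F(\sigma \vee \sigma^*) = F(\sigma) + F(\sigma^*)$, that is, $F(\sigma^*) = 1 - F(\sigma) = F(\sigma)^\perp$. Since $\sigma_1 \vee \sigma_2 = (\sigma_1^* \wedge \sigma_2^*)^*$
it follows that $F(\sigma_1 \vee \sigma_2) = F(\sigma_1) \vee F(\sigma_2)$. Thus $F$ is a lattice homomorphism of $\Sigma$
into $G$ for which $F(\sigma^*) = F(\sigma)^\perp$. In addition, $F$ is injective: From $F(\sigma_1) = F(\sigma_2)$
it follows that $F(\sigma_1 \wedge \sigma_2) = F(\sigma_1) \wedge F(\sigma_2) = F(\sigma_1) = F(\sigma_2)$ and from
$$F(\sigma_1) = F((\sigma_1 \wedge \sigma_2) \vee (\sigma_1 \wedge \sigma_2^*)) = F(\sigma_1 \wedge \sigma_2) + F(\sigma_1 \wedge \sigma_2^*)$$
we obtain the relation $F(\sigma_1 \wedge \sigma_2^*) = 0$. Since $F$ is effective, it follows that
$\sigma_1 \wedge \sigma_2^* = 0$ and therefore $\sigma_1 = \sigma_1 \wedge \sigma_2$. Similarly we find that $\sigma_2 = \sigma_1 \wedge \sigma_2$, that
is, $\sigma_1 = \sigma_2$. Thus $F$ is an isomorphism onto the sublattice $F\Sigma$ of $G$ and $F\Sigma$ is a
Boolean sublattice of $G$.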
Th. 1.3.4. The following six conditions are equivalent for decision effects:
(i) $e_1, e_2$ are coexistent.
(ii) $e_1, e_2$ are commensurable.
(iii) The orthocomplemented sublattice $\Gamma$ of $G$ generated by $e_1, e_2$ is a
Boolean ring.
(iv) $e_2 = (e_1 \wedge e_2) \vee (e_2 \wedge e_1^\perp)$.
(v) $e_1 e_2 = e_2 e_1$.
(vi) $e_1 e_2 = e_1 \wedge e_2$.
Let $\Sigma \xrightarrow{F} L$ be the Boolean ring defined according to (i) where $F$ is the
effective measure. Then there exist two elements $\sigma_1, \sigma_2 \in \Sigma$ for which
$F(\sigma_1) = e_1$, $F(\sigma_2) = e_2$. Let $\Sigma^0$ be the Boolean subring of $\Sigma$ generated by
$\sigma_1, \sigma_2$. The restriction of $F$ to $\Sigma^0$ is an isomorphism of $\Sigma^0$ onto $\Gamma$ (where $\Gamma$ is
defined by (iii)). The identity map $i$ of $\Gamma$ into itself is an additive measure on
the Boolean ring $\Gamma$ for which $\Gamma \xrightarrow{i} G$.
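Proof. According to (i) there exists a Boolean ring $\Sigma$ with an effective measure
$F: \Sigma \to L$ (see Th. 1.2.2). According to Th. 1.3.2, there exist two elements $\sigma_1, \sigma_2$ for
which $F(\sigma_1) = e_1$, $F(\sigma_2) = e_2$ and the following condition holds:
$F(\sigma_1 \wedge \sigma_2) = e_1 \wedge e_2$, $F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) = e_1 \wedge e_2^\perp$, $F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = e_2 \wedge e_1^\perp$. We
shall now show that the Boolean subring $\Sigma^0$ of $\Sigma$ generated by $\sigma_1, \sigma_2$ will be
mapped by $F$ into $G$.
Since $1 = F(\varepsilon) = F(\sigma \vee \sigma^*) = F(\sigma) + F(\sigma^*)$ we obtain $F(\sigma^*) = 1 - F(\sigma)$. If, in
addition, $F(\sigma) \in G$, then we also find $F(\sigma^*) \in G$. It suffices to show that the
following eight elements $0$, $\sigma_1$, $\sigma_2$, $\sigma_1 \wedge \sigma_2$, $\sigma_1 \dotplus \sigma_1 \cdot \sigma_2$, $\sigma_2 \dotplus \sigma_1 \cdot \sigma_2$,
$\sigma_1 \dotplus \sigma_1 \cdot \sigma_2 \dotplus \sigma_2 \dotplus \sigma_1 \cdot \sigma_2$ and $\sigma_1 \vee \sigma_2 = \sigma_1 \dotplus \sigma_1 \cdot \sigma_2 \dotplus \sigma_2$ are mapped into
$G$ since the remaining eight elements of $\Sigma^0$ are complements of the above elements.
We obtain:
$$F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2 \dotplus \sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = (e_1 \wedge e_2^\perp) + (e_2 \wedge e_1^\perp) \in G,$$
since $e_1 \wedge e_2^\perp \perp e_2 \wedge e_1^\perp$. Similarly, we obtain
$$F(\sigma_1 \vee \sigma_2) = F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_2) = (e_1 \wedge e_2^\perp) + e_2 \in G,$$
since $e_1 \wedge e_2^\perp \perp e_2$.
Therefore the map $\Sigma^0 \xrightarrow{F} G$ is an additive measure on $\Sigma^0$ with range in $G$, whereby
(i) $\Rightarrow$ (ii) is proven.
With the application of Th. 1.3.3 to $\Sigma^0 \xrightarrow{F} G$ we find that $\Gamma = F\Sigma^0$ is isomorphic
to $\Sigma^0$, thereby proving (ii) $\Rightarrow$ (iii).
If $\Gamma$ (as defined in (iii)) is a Boolean ring, then from $e, \tilde e \in \Gamma$ and $e \wedge \tilde e = 0$ it
follows that $e \le \tilde e^\perp$ because $e = e \wedge (\tilde e \vee \tilde e^\perp) = (e \wedge \tilde e) \vee (e \wedge \tilde e^\perp) = e \wedge \tilde e^\perp$.
Therefore, by III, Th. 6.8 we find that $e \vee \tilde e = e + \tilde e$ and that the identity map of
$\Gamma$ onto itself is an additive effective measure. Since $\Gamma \subset G$ we find that $e_1, e_2$ are
also commensurable. That is, we have shown that (iii) $\Rightarrow$ (ii).
As we have shown above after D 1.3.1, (ii) $\Rightarrow$ (i). In Th. 1.3.2 we have shown that
(i) $\Rightarrow$ (iv). If (iv) is satisfied, that is, $e_2 = (e_1 \wedge e_2) \vee (e_2 \wedge e_1^\perp)$, then, by III, Th. 6.8
we obtain $e_2 = g_1 + g_2$ where $g_1 = e_1 \wedge e_2$, $g_2 = e_2 \wedge e_1^\perp$. Or, expressed
differently, we obtain $g_1 \le e_1$, $g_2 \le e_1^\perp$. By Th. 1.3.1 it then follows that (iv) $\Rightarrow$ (i).
From Th. 1.3.1 (iv) it follows that (v) $\Leftrightarrow$ (i); (vi) $\Leftrightarrow$ (v) is proven in AIV, §6.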
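Conditions (iv), (v) and (vi) lend themselves to a direct numerical test for projections on a finite-dimensional Hilbert space. The following sketch is an illustration only: decision effects are modelled as real orthogonal projection matrices, the lattice meet is computed as the projection onto the kernel of $(1 - p) + (1 - q)$, and the function names are hypothetical.

```python
import numpy as np

def meet(p, q, tol=1e-9):
    """Projection onto ran(p) ∩ ran(q), i.e. onto the kernel of (1 - p) + (1 - q)."""
    n = p.shape[0]
    vals, vecs = np.linalg.eigh(2 * np.eye(n) - p - q)
    k = vecs[:, vals < tol]          # orthonormal basis of the kernel (may be empty)
    return k @ k.conj().T

def join(p, q):
    """p ∨ q = (p^⊥ ∧ q^⊥)^⊥."""
    one = np.eye(p.shape[0])
    return one - meet(one - p, one - q)

def condition_iv(e1, e2):
    one = np.eye(e1.shape[0])
    return bool(np.allclose(e2, join(meet(e1, e2), meet(e2, one - e1))))

def condition_v(e1, e2):
    return bool(np.allclose(e1 @ e2, e2 @ e1))

def condition_vi(e1, e2):
    return bool(np.allclose(e1 @ e2, meet(e1, e2)))

# Commuting pair: diagonal projections in three dimensions.
c1 = np.diag([1.0, 1.0, 0.0])
c2 = np.diag([0.0, 1.0, 1.0])

# Non-commuting pair: two distinct, non-orthogonal lines in the plane.
n1 = np.array([[1.0, 0.0], [0.0, 0.0]])
v = np.array([[1.0], [1.0]]) / np.sqrt(2.0)
n2 = v @ v.T

commuting = (condition_iv(c1, c2), condition_v(c1, c2), condition_vi(c1, c2))
noncommuting = (condition_iv(n1, n2), condition_v(n1, n2), condition_vi(n1, n2))
print(commuting, noncommuting)
```

In this toy model the three conditions hold together for the commuting pair and fail together for the non-commuting pair, as the theorem requires.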
We shall now consider some special cases of commensurable decision
effects. If $e_1 \le e_2$, then, by arguments presented at the end of §1.2 we
conclude that $e_1, e_2$ are coexistent and are therefore commensurable.
Similarly, if $e_1 + e_2 \le 1$ (that is, $e_1 + e_2 \in L$) we conclude that $e_1, e_2$ are
coexistent and are therefore commensurable since $e_1 + e_2 \le 1$ is equivalent
to $e_1 \le 1 - e_2 = e_2^\perp$, that is, $e_1 \perp e_2$. Hence, from $e_1 \perp e_2$ it follows that $e_1$
and $e_2$ are commensurable. If $e_1, e_2$ are commensurable and $e_1 \wedge e_2 = 0$, it
then follows from Th. 1.3.4 (iv) that $e_1 \perp e_2$ and, consequently, by
III, Th. 6.8, $e_1 \vee e_2 = e_1 + e_2$.
We shall now develop general criteria for the characterization of
commensurable sets of decision effects. For this purpose we shall state and prove
the following theorem:
Th. 1.3.5. Let $A_1 \subset G$, $A_2 \subset G$ and suppose that each pair $e_1 \in A_1$,
$e_2 \in A_2$ is commensurable. Then $\bigwedge_{e \in A_2} e$ and $\bigvee_{e \in A_2} e$ are commensurable
with each $e_1 \in A_1$. Suppose that $e \in G$ is commensurable with each $e_1 \in A_1$;
then $e^\perp$ is commensurable with each $e_1 \in A_1$.
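PROOF. The statement that $e$, $e_1$ are commensurable is equivalent to the condition
that $e_1 = (e_1 \wedge e) \vee (e_1 \wedge e^\perp)$. This relation is symmetric in $e$ and $e^\perp$ because
$e = (e^\perp)^\perp$. Thus, $e^\perp$, $e_1$ are commensurable.
Since $e_1, e_2$ are commensurable, from Th. 1.3.4 it follows that
$e_2 = (e_2 \wedge e_1) \vee (e_2 \wedge e_1^\perp)$ and we therefore obtain
$$\bigvee_{e \in A_2} e = \Big[\bigvee_{e \in A_2} (e \wedge e_1)\Big] \vee \Big[\bigvee_{e \in A_2} (e \wedge e_1^\perp)\Big].$$
Since $G$ is orthomodular (see III, D 6.6 and AI, §2) it follows that
$$\Big[\bigvee_{e \in A_2} e\Big] \wedge e_1 = \Big[\Big[\bigvee_{e \in A_2} (e \wedge e_1)\Big] \vee \Big[\bigvee_{e \in A_2} (e \wedge e_1^\perp)\Big]\Big] \wedge e_1 = \bigvee_{e \in A_2} (e \wedge e_1).$$
A similar result holds if we substitute $e_1^\perp$ for $e_1$. Thus we obtain
$$\bigvee_{e \in A_2} e = \Big[\Big(\bigvee_{e \in A_2} e\Big) \wedge e_1\Big] \vee \Big[\Big(\bigvee_{e \in A_2} e\Big) \wedge e_1^\perp\Big].$$
Thus, using Th. 1.3.4(iv) we conclude that $e_1$, $\bigvee_{e \in A_2} e$ are commensurable. Since
$(\bigvee_{e \in A_2} e^\perp)^\perp = \bigwedge_{e \in A_2} e$, the same result follows for $e_1$, $\bigwedge_{e \in A_2} e$.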
Th. 1.3.6. For $A \subset G$ the following conditions are equivalent:
(i) $A$ is coexistent.
(ii) $A$ is commensurable.
(iii) Each pair $e_1, e_2 \in A$ is coexistent.
(iv) Each pair $e_1, e_2 \in A$ is commensurable.
(v) The orthocomplemented sublattice $\Gamma_A$ generated by $A$ is a Boolean ring.
(vi) The complete orthocomplemented sublattice $\hat\Gamma_A$ generated by $A$ is a
Boolean ring.
PROOF. (iii) $\Rightarrow$ (iv) is a direct consequence of Th. 1.3.4. (ii) $\Rightarrow$ (i) and (i) $\Rightarrow$ (iii) are
trivial. If we show that (iv) $\Rightarrow$ (ii) we will have proven that (i), (ii), (iii), and (iv) are
equivalent.
Suppose that (iv) is satisfied. Every element of the sublattice $\Gamma_A$ described in (v)
may be obtained by the finite application of the operators $\wedge$, $\vee$, and $\perp$ to the
elements of $A$. According to Th. 1.3.5 every element of $\Gamma_A$ is commensurable with
every element of $A$. Again, from Th. 1.3.5 it follows that every pair of elements of $\Gamma_A$
is commensurable. Thus, from AI, Th. 3.2 it follows that $\Gamma_A$ is a Boolean ring.
Since from $e_1 \wedge e_2 = 0$ and $e_1, e_2$ commensurable (see the discussion preceding
Th. 1.3.5) it follows that $e_1 \vee e_2 = e_1 + e_2$, the identity map of $\Gamma_A$ is an
additive measure. Therefore, since $A \subset \Gamma_A$, the set $A$ is a set of commensurable
decision effects. Thus we have proven (iv) $\Rightarrow$ (ii), (iv) $\Rightarrow$ (v), and (v) $\Rightarrow$ (ii).
(v) $\Rightarrow$ (vi). By Zorn’s lemma it follows that the set $S$ of Boolean subrings $\Gamma'$ for
which $A \subset \Gamma' \subset G$ contains a maximal element $\Gamma_m$. $\Gamma_m$ must be a complete Boolean
ring; otherwise, there would be a subset $B \subset \Gamma_m$ for which the element
$\tilde e = \bigvee_{e \in B} e \in G$ would not lie in $\Gamma_m$. Then, by Th. 1.3.5 and by (ii) $\Leftrightarrow$ (iv) we would
find that $\{\Gamma_m, \tilde e\}$ would be a set of commensurable decision effects which, according
to (v), would be contained in a Boolean ring. This contradicts the fact that $\Gamma_m$ is
maximal. $\hat\Gamma_A$ must be a sublattice of $\Gamma_m$ and must therefore be a Boolean ring.
(vi) $\Rightarrow$ (v) is trivial.
Th. 1.3.7. Let $e_\nu$ be a sequence of commensurable decision effects which
converges in the $\sigma(\mathcal{B}', \mathcal{B})$ topology towards $e \in G$. Then $e$ is commensurable
with all $\tilde e \in G$ which are commensurable with the $e_\nu$. In addition, it is possible
to choose a subsequence $e_{\nu_k}$ such that $e = \bigwedge_{n=1}^\infty \bar e_n$ where $\bar e_n = \bigvee_{k=n}^\infty e_{\nu_k}$.
(From this result we obtain the following special case: Every complete
Boolean subring of $G$ is $\sigma(\mathcal{B}', \mathcal{B})$ closed in $G$.)
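PROOF. From III, Th. 6.10 $\bar e_n \to \hat e$ in the $\sigma(\mathcal{B}', \mathcal{B})$ topology where $\hat e = \bigwedge_{n=1}^\infty \bar e_n$.
Since $\bar e_n \ge e_{\nu_k}$ for all $k \ge n$, in the limit $k \to \infty$ we find that $\bar e_n \ge e$ and therefore
find that $\hat e \ge e$. If we show that, in the $\sigma(\mathcal{B}', \mathcal{B})$ topology, $\bar e_n \to e$ then from
$\bar e_n \ge \hat e \ge e$ we would obtain $\hat e = e$.
First we shall show that $e$ is commensurable with all the $\tilde e \in G$ which are
commensurable with the $e_\nu$. If $\tilde e$ is commensurable with the $e_\nu$ then, by Th. 1.3.4(iv)
and III, Th. 6.8 we obtain
$$\tilde e = (e_\nu \wedge \tilde e) + (\tilde e \wedge e_\nu^\perp),$$
$$e_\nu = (e_\nu \wedge \tilde e) + (e_\nu \wedge \tilde e^\perp).$$
Since $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-compact (see AIII, §4 and §6), we may select a subsequence $e_{\nu_\alpha}$
from $e_\nu$ for which $e_{\nu_\alpha} \wedge \tilde e \to g_1 \in L$, $e_{\nu_\alpha} \wedge \tilde e^\perp \to g_2 \in L$, $\tilde e \wedge e_{\nu_\alpha}^\perp \to g_3 \in L$, and
therefore $\tilde e = g_1 + g_3$, $e = g_1 + g_2$. From $e_{\nu_\alpha} \wedge \tilde e \le e_{\nu_\alpha}$ and from $e_{\nu_\alpha} \wedge \tilde e \le \tilde e$ it follows
that $g_1 \le e$ and $g_1 \le \tilde e$. Hence it follows that $K_0(g_1) \supset K_0(e)$ and $K_0(g_1) \supset K_0(\tilde e)$.
Therefore, from III, Th. 6.4 it follows that
$$K_0(g_1) \supset K_0(e) \vee K_0(\tilde e) = K_0(e \wedge \tilde e).$$
According to III, Th. 6.6 it follows that $g_1 \le e \wedge \tilde e$. Similarly we may prove that
$g_2 \le e \wedge \tilde e^\perp$ and $g_3 \le \tilde e \wedge e^\perp$. From these results we obtain
$$g_1 + g_2 + g_3 \le [(e \wedge \tilde e) \vee (e \wedge \tilde e^\perp)] + (\tilde e \wedge e^\perp).$$
Since $(e \wedge \tilde e) \vee (e \wedge \tilde e^\perp) \le e$ and $\tilde e \wedge e^\perp \le e^\perp$ we obtain
$$g_1 + g_2 + g_3 \le e + e^\perp = 1.$$
Therefore, by Th. 1.2.4, $e$ and $\tilde e$ are coexistent and, by Th. 1.3.4, are
commensurable.
Since we have assumed that the $e_\nu$ are all commensurable, $e$ is therefore
commensurable with all $e_\nu$, and, by Th. 1.3.5, is also commensurable with $\hat e$ and the
$\bar e_n$. Since $e$ is commensurable with the $\bar e_n = \bigvee_{k=n}^\infty e_{\nu_k}$ and with the $e_{\nu_k}$ we therefore
obtain (see the proof of Th. 1.3.5)
$$\bar e_n \wedge e^\perp = \bigvee_{k=n}^\infty (e_{\nu_k} \wedge e^\perp).$$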
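Since the $e_{\nu_k} \wedge e^\perp$ are commensurable, we find that
$$\bar e_n \wedge e^\perp \le \sum_{k=n}^\infty (e_{\nu_k} \wedge e^\perp).$$
According to AIII, §4 the $\sigma(\mathcal{B}', \mathcal{B})$-topology on $L$ may be characterized by a norm
$\|\ldots\|_\sigma$. For this norm we find that
$$\|\bar e_n \wedge e^\perp\|_\sigma \le \sum_{k=n}^\infty \|e_{\nu_k} \wedge e^\perp\|_\sigma.$$
Since $\bar e_n \ge e$ we find that $\bar e_n - e = \bar e_n \wedge e^\perp$ and we therefore obtain
$$\|\bar e_n - e\|_\sigma \le \sum_{k=n}^\infty \|e_{\nu_k} \wedge e^\perp\|_\sigma.$$
Since $L$ is compact, we may choose a subsequence $e_{\nu_k}$ from the $e_\nu$ such that
$e_{\nu_k} \wedge e^\perp \to g \in L$ in the $\sigma(\mathcal{B}', \mathcal{B})$-topology. From $e_{\nu_k} \wedge e^\perp \le e_{\nu_k}$ it follows that, in
the limit, $g \le e$; from $e_{\nu_k} \wedge e^\perp \le e^\perp$ it follows that $g \le e^\perp$.
Hence, from the above we may conclude that $g \le e^\perp \wedge e$, that is, $e_{\nu_k} \wedge e^\perp \to 0$.
Thus it is possible to select a subsequence such that $\|e_{\nu_k} \wedge e^\perp\|_\sigma \le (\tfrac{1}{2})^k$ from
which we conclude that $\|\bar e_n - e\|_\sigma \to 0$ and $\bar e_n \to e$.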
D 1.3.2. The set of all $e \in G$ which are commensurable with all the elements
of $G$ is called the center $Z$ of $G$.
Th. 1.3.8. $Z$ is the set of all $e = (E_1, E_2, \ldots)$ for which the $E_\nu$ in $\mathcal{H}_\nu$ are
either 0 or 1.
PROOF. According to Th. 1.3.4 $Z$ is the set of all $e = (E_1, E_2, \ldots)$ which commute
with all the other $\tilde e \in G$, which proves the assertion.
According to Th. 1.3.6(vi) it follows that $Z$ is a complete Boolean subring
of $G$ (which can be easily proven directly). $Z$ is atomic, with atoms
$q_i = (0, 0, \ldots, 1_i, \ldots)$ where the components are equal to 0 except for the $i$th
position.
The fact that Z is atomic is a consequence of AV 4s from III, §3. The
atomic character of Z is characteristic for microsystems.
1.4 Observables
In §1.1 and §1.2 we have implicitly presented the structure upon which the
observable concept will be based. The concept of an observable is none other
than an abstract idealization of the structure represented by the map
$\mathcal{R}(b_0) \xrightarrow{\psi_0} L$. Before we proceed to give a precise definition of the notion of an
observable we shall make a number of preliminary remarks in order to
reduce the possibility of misunderstandings. Many readers will find it
somewhat surprising that we shall say little about “measurement values,”
“measurement scales” and the like when we introduce the notion of an
observable. Is a “measurement” necessarily quantitative? In response to this
question, we emphasize that it is important for the reader to put aside the
notion that the essential aspect of physics is that of quantitative measurement.
Otherwise, the reader will not obtain a correct understanding of the methods
of theoretical physics. Parameterization of registration procedures can be
very “convenient,” “practical,” and “useful,” but it has no fundamental
meaning in reference to the mapping of physical reality by means of the
mathematical structures. This becomes evident in the fact that it is possible,
in principle, to record all measurements digitally and store them in a
computer. In addition, the structure of a Boolean ring (for example, of $\mathcal{R}(b_0)$)
becomes more tractable if we do not insist on imposing a more or less
arbitrary parameterization of the Boolean ring. Finally, the abstract
structure of a Boolean ring is more transparent than the “usual” parametric
formulation.
At this point we could define an observable directly in terms of $\mathcal{R}(b_0)$ and
the map $\psi_0: \mathcal{R}(b_0) \to L$. This approach will, however, lead to a number of
mathematically inconvenient structures. Instead, we shall proceed in an
abstract manner in analogy to the definition of coexistent effects. The first
approach would be to define an observable by means of a Boolean ring $\Sigma$ and
an effective measure $F: \Sigma \to L$ (see [2], XIII, D 5.6; in [2] we have presented
this definition. There we did not find it necessary to discuss the process of
completion of $\Sigma$). For this approach, in order to make the concept more
realistic physically, and to simplify the mathematical treatment, we shall
impose a number of additional conditions upon the notion of an observable.
Mathematically, in order to formulate several theorems more simply, it is
always very convenient to make the sets complete (relative to the uniform
structure). We shall apply such a completion to the ring $\Sigma$.
Th. 1.4.1. Let $w \in K$; the sets
$$N_{w,\varepsilon} = \{(\sigma_1, \sigma_2) \mid \sigma_1, \sigma_2 \in \Sigma,\ \mu(w, F(\sigma_1 \dotplus \sigma_2)) < \varepsilon\}$$
form a fundamental system of sets for the uniform structure $U_g$ (see AII, §2)
of the Boolean ring $\Sigma$ with effective measure $F: \Sigma \to L$. $U_g$ separates $\Sigma$
because $F$ is effective. $U_g$ is metrizable, with metric
$d(\sigma_1, \sigma_2) = \mu(w_0, F(\sigma_1 \dotplus \sigma_2))$ where we choose an “effective” $w_0$ (such an
effective $w_0$ exists according to III, Th. 6.1), that is, a $w_0$ for which
$C(w_0) = K$.
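PROOF. For $w_0 \in K$ we will have proved the theorem if we can show that $d(\sigma_1, \sigma_2)$
is a metric, and that, to each pair $w$, $\varepsilon$ there exists an $\varepsilon'$ for which $\mu(w_0, g) < \varepsilon'$
implies that $\mu(w, g) < \varepsilon$ for all $g \in L$.
Since $F(\sigma_1 \dotplus \sigma_2) + F(\sigma_2 \dotplus \sigma_3) = F(\sigma_1 \dotplus \sigma_3) + 2F((\sigma_1 \dotplus \sigma_2) \cdot (\sigma_2 \dotplus \sigma_3))$ we
obtain $F(\sigma_1 \dotplus \sigma_3) \le F(\sigma_1 \dotplus \sigma_2) + F(\sigma_2 \dotplus \sigma_3)$. Then we find that $d(\sigma_1, \sigma_2)$ satisfies the
triangle inequality. From $d(\sigma_1, \sigma_2) = 0$ it follows that $\mu(w_0, F(\sigma_1 \dotplus \sigma_2)) = 0$ and we
therefore obtain $\mu(w, F(\sigma_1 \dotplus \sigma_2)) = 0$ for all $w \in C(w_0) = K$, that is,
$F(\sigma_1 \dotplus \sigma_2) = 0$. Since $F$ is effective, it follows that $\sigma_1 \dotplus \sigma_2 = 0$, that is, $\sigma_1 = \sigma_2$.
Thus $d(\sigma_1, \sigma_2)$ is a metric.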
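We shall now show that we may choose a special effective $w_0 \in K$ such that, to
each pair $w$, $\varepsilon$ there exists an $\varepsilon'$ for which $\mu(w_0, g) < \varepsilon'$ implies $\mu(w, g) < \varepsilon$ for all
$g \in L$. For this purpose, we introduce a denumerable subset $\{w_\nu\}$ which is dense in
$K$ and define $w_0 = \sum_{\nu=1}^\infty \lambda_\nu w_\nu$, $\lambda_\nu > 0$, $\sum_{\nu=1}^\infty \lambda_\nu = 1$.
Then, for $w \in K$, $w_\rho \in \{w_\nu\}$ we obtain
$$\mu(w, g) \le |\mu(w - w_\rho, g)| + \mu(w_\rho, g) \le \|w - w_\rho\| + \mu(w_\rho, g).$$
Since $\mu(w_0, g) = \sum_{\nu=1}^\infty \lambda_\nu \mu(w_\nu, g)$, we find that
$$\lambda_\rho\, \mu(w_\rho, g) \le \mu(w_0, g)$$
and we obtain
$$\mu(w, g) \le \|w - w_\rho\| + \lambda_\rho^{-1} \mu(w_0, g).$$
If we now choose $w_\rho$ so that $\|w - w_\rho\| < \varepsilon/2$ and choose $\varepsilon' < \lambda_\rho(\varepsilon/2)$ then we
obtain $\mu(w, g) < \varepsilon$.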
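The metric $d(\sigma_1, \sigma_2) = \mu(w_0, F(\sigma_1 \dotplus \sigma_2))$ can be made concrete in a finite toy model. The sketch below rests on assumptions that are not in the text: $\Sigma$ is the power set of a three-element index set (so that $\dotplus$ is the symmetric difference), $F$ sums the elements of a randomly generated POVM, $\mu(w, g) = \mathrm{tr}(wg)$, and the maximally mixed ensemble is taken as the effective $w_0$.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
dim, n_atoms = 3, 3

# Assumed toy model: positive matrices rescaled so that they sum to the identity.
raw = []
for _ in range(n_atoms):
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    raw.append(a @ a.conj().T)
vals, vecs = np.linalg.eigh(sum(raw))
inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.conj().T
atoms = [inv_sqrt @ r @ inv_sqrt for r in raw]

def F(sigma):
    """Additive measure: F(sigma) is the sum of the POVM elements indexed by sigma."""
    out = np.zeros((dim, dim), dtype=complex)
    for i in sigma:
        out = out + atoms[i]
    return out

w0 = np.eye(dim) / dim            # maximally mixed ensemble, used as an effective w0

def mu(w, g):
    return float(np.real(np.trace(w @ g)))

def d(s1, s2):
    return mu(w0, F(s1 ^ s2))     # s1 ^ s2 is the symmetric difference

ring = [frozenset(c) for r in range(n_atoms + 1)
        for c in itertools.combinations(range(n_atoms), r)]

triangle_ok = all(d(a, c) <= d(a, b) + d(b, c) + 1e-12
                  for a in ring for b in ring for c in ring)
separates = all(d(a, b) > 1e-12 for a in ring for b in ring if a != b)
print(triangle_ok, separates)
```

In this model the triangle inequality holds for all triples and distinct elements have positive distance, which is what the proof establishes in general.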
From Th. 1.4.3 and Th. 2.1.11 it follows that the metrics
$$d_1(\sigma_1, \sigma_2) = \mu(w_1, F(\sigma_1 \dotplus \sigma_2))$$
and
$$d_2(\sigma_1, \sigma_2) = \mu(w_2, F(\sigma_1 \dotplus \sigma_2))$$
are equivalent for effective pairs $w_1, w_2 \in K$.
Here it is important to note that the following theorems (up to Th. 2.1.11)
only require that there exists a single $w_0$ for which the metric defined in
Th. 1.4.1 generates the uniform structure $U_g$!
Let $\Sigma_g$ denote $\Sigma$ endowed with the uniform structure $U_g$ defined in
Th. 1.4.1.
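Th. 1.4.2. The map $F: \Sigma_g \to L$ (where $L$ is endowed with the uniform
structure defined by $\sigma(\mathcal{B}', \mathcal{B})$) and the maps $(\sigma_1, \sigma_2) \to \sigma_1 \dotplus \sigma_2$ and
$(\sigma_1, \sigma_2) \to \sigma_1 \cdot \sigma_2$ of $\Sigma_g \times \Sigma_g$ into $\Sigma_g$ are uniformly continuous.
PROOF. The uniform structure on $L$ determined by $\sigma(\mathcal{B}', \mathcal{B})$ is the initial structure
for the maps $L \xrightarrow{\mu(w, g)} [0, 1] \subset \mathbb{R}$ for $w \in K$ because every $y \in \mathcal{B}$ may be expressed
in the form $y = \alpha w_1 - \beta w_2$ where $w_1, w_2 \in K$. If the composite map
$\Sigma_g \xrightarrow{F} L \xrightarrow{\mu(w, g)} [0, 1]$ is uniformly continuous, that is, the map $\mu(w, F(\sigma))$ is
uniformly continuous for each $w$, then the map $F$ will be uniformly continuous (see
AII, §2). From
$$F(\sigma_1) + F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = F(\sigma_1 \vee \sigma_2) = F(\sigma_2) + F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2)$$
and from
$$F(\sigma_1 \dotplus \sigma_2) = F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2)$$
it follows that
$$|\mu(w, F(\sigma_1) - F(\sigma_2))| \le \mu(w, F(\sigma_1 \dotplus \sigma_2)),$$
thus proving the uniform continuity of $\mu(w, F(\sigma))$.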
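From
$$F(\sigma_1 \dotplus \tilde\sigma_1 \dotplus \sigma_2 \dotplus \tilde\sigma_2) \le F(\sigma_1 \dotplus \tilde\sigma_1) + F(\sigma_2 \dotplus \tilde\sigma_2)$$
it is easy to show that $(\sigma_1, \sigma_2) \to \sigma_1 \dotplus \sigma_2$ is uniformly continuous. From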
F(<ri 4 + F(g2 + d2) > 2F(((T1 + + ^2))
= 2F((T1 • (T2 4 (Ti -(T2 4 -^2 + £1 -£2)
= 2F(<71 • (T2 4 £1 • ^2) + Щ°1 • ^2 + ^1 • ^2)
it follows that
F{c1 • (T2 4 ffi • ff2) ^ 2fF(^i + <*i) + F(<t2 + ^2)]»
thus proving that the map $(\sigma_1, \sigma_2) \to \sigma_1 \cdot \sigma_2$ is uniformly continuous.
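The estimates behind this proof can be checked exhaustively in a small finite model. The sketch below assumes that $\Sigma$ is the power set of a three-element set, that $F$ sums the elements of a random POVM, and that $\mu(w, g) = \mathrm{tr}(wg)$ for a random density matrix $w$; all names are hypothetical. The bound tested for the product, $m(\sigma_1 \sigma_2 \dotplus \tilde\sigma_1 \tilde\sigma_2) \le m(\sigma_1 \dotplus \tilde\sigma_1) + m(\sigma_2 \dotplus \tilde\sigma_2)$ with $m(\sigma) = \mu(w, F(\sigma))$, is a set-theoretic estimate that suffices for uniform continuity in this model; it is not claimed to be the exact inequality used in the text.

```python
import itertools
import numpy as np

rng = np.random.default_rng(4)
dim, n_atoms = 3, 3

def random_psd(rng):
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return a @ a.conj().T

# Assumed toy POVM: positive matrices rescaled so that they sum to the identity.
raw = [random_psd(rng) for _ in range(n_atoms)]
vals, vecs = np.linalg.eigh(sum(raw))
inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.conj().T
atoms = [inv_sqrt @ r @ inv_sqrt for r in raw]

w = random_psd(rng)
w = w / np.trace(w).real                                 # density matrix standing in for w in K
t = [float(np.real(np.trace(w @ g))) for g in atoms]     # mu(w, F({i}))

def m(sigma):
    """m(sigma) = mu(w, F(sigma)), using additivity of F over disjoint atoms."""
    return sum(t[i] for i in sigma)

ring = [frozenset(c) for r in range(n_atoms + 1)
        for c in itertools.combinations(range(n_atoms), r)]
quads = list(itertools.product(ring, repeat=4))
eps = 1e-12

# |mu(w, F(s1)) - mu(w, F(s2))| <= mu(w, F(s1 + s2)), with + the ring addition
lipschitz = all(abs(m(a) - m(b)) <= m(a ^ b) + eps for a in ring for b in ring)
# continuity of the ring addition
sum_ok = all(m((a ^ a2) ^ (b ^ b2)) <= m(a ^ a2) + m(b ^ b2) + eps
             for a, a2, b, b2 in quads)
# continuity of the ring product (bound assumed for this sketch)
prod_ok = all(m((a & b) ^ (a2 & b2)) <= m(a ^ a2) + m(b ^ b2) + eps
              for a, a2, b, b2 in quads)
print(lipschitz, sum_ok, prod_ok)
```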
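Th. 1.4.3. The uniform completion $\hat\Sigma_g$ of $\Sigma_g$ is a (lattice-theoretically)
complete Boolean ring. The additive measure $F$ may be extended on $\hat\Sigma_g$ to an
additive measure $\hat\Sigma_g \to L$. If $F$ is effective on $\Sigma_g$ then its extension is effective
on $\hat\Sigma_g$. For each subset $\Gamma \subset \hat\Sigma_g$ a denumerable subset $\sigma_\nu \in \Gamma$ can be chosen
such that $\bigvee_{\sigma \in \Gamma} \sigma = \bigvee_\nu \sigma_\nu$ (and similarly for $\bigwedge$). In addition $\bigvee_{\nu=1}^n \sigma_\nu \to
\bigvee_{\nu=1}^\infty \sigma_\nu$. Both $F$ and $m(\sigma) = \mu(w, F(\sigma))$ are also $\sigma$-additive measures on $\hat\Sigma_g$.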
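Proof. From Th. 1.4.2 and from the general theorems about the extension of
uniformly continuous maps we find that $\hat\Sigma_g$ is a Boolean ring and that the map
$\Sigma_g \xrightarrow{F} L$ may be extended as a uniformly continuous map because $L$ is
$\sigma(\mathcal{B}', \mathcal{B})$-complete since it is $\sigma(\mathcal{B}', \mathcal{B})$-compact. Similarly, it is easy to show that the
extension of $F$ is an additive measure on $\hat\Sigma_g$ since $\hat\sigma_1, \hat\sigma_2 \in \hat\Sigma_g$ may be approximated
arbitrarily well by $\sigma_1, \sigma_2 \in \Sigma_g$; if $\hat\sigma_1 \cdot \hat\sigma_2 = 0$ then $\sigma_1 \cdot \sigma_2$ will approximate the
null-element arbitrarily well. Similarly it follows that $F$ is effective on $\hat\Sigma_g$ when it is
effective on $\Sigma_g$.
In order to prove the lattice-theoretical completeness of $\hat\Sigma_g$ it suffices to show
that, for a $\Gamma \subset \hat\Sigma_g$ the upper bound $\bigvee_{\sigma \in \Gamma} \sigma$ exists because if $\tilde\Gamma$ is the set of all $\tilde\sigma$ for
which $\tilde\sigma \le \sigma$ for all $\sigma \in \Gamma$ then we obtain $\bigwedge_{\sigma \in \Gamma} \sigma = \bigvee_{\tilde\sigma \in \tilde\Gamma} \tilde\sigma$.
Let $\Phi$ denote the set of all finite subsets of $\Gamma$. Then $\bigvee_{\sigma \in \Gamma} \sigma = \bigvee_{\varphi \in \Phi} \sigma_\varphi$ where
$\sigma_\varphi = \bigvee_{\sigma \in \varphi} \sigma$. Since $\hat\Sigma_g$ is a Boolean ring, $\sigma_\varphi \in \hat\Sigma_g$; the set of $\sigma_\varphi$, $\varphi \in \Phi$ is a directed
subset of $\hat\Sigma_g$. From the additivity of the measure $m_0(\sigma) = \mu(w_0, F(\sigma))$ where $w_0$ is
defined in Th. 1.4.1, we may conclude that the set of numbers $m_0(\sigma_\varphi)$ is upwardly
directed; since $m_0(\sigma_\varphi) \le 1$ it is also bounded. The set of $m_0(\sigma_\varphi)$ therefore has an
upper bound which we denote by $\alpha$.
If $\varphi_1, \varphi_2 \in \Phi$ and if $\varphi \in \Phi$, $\varphi \subset \varphi_1$, $\varphi \subset \varphi_2$ then
$$m_0(\sigma_{\varphi_1} \dotplus \sigma_{\varphi_2}) = m_0(\sigma_{\varphi_1} \dotplus \sigma_\varphi \dotplus \sigma_\varphi \dotplus \sigma_{\varphi_2}) \le m_0(\sigma_{\varphi_1} \dotplus \sigma_\varphi) + m_0(\sigma_\varphi \dotplus \sigma_{\varphi_2}).$$
Furthermore, since $\sigma_\varphi \le \sigma_{\varphi_1}$, $\sigma_\varphi \le \sigma_{\varphi_2}$ we find that
$$m_0(\sigma_{\varphi_1} \dotplus \sigma_\varphi) = m_0(\sigma_{\varphi_1}) - m_0(\sigma_\varphi) \le \alpha - m_0(\sigma_\varphi);$$
a similar expression holds for $\sigma_{\varphi_2}$. Thus we obtain $m_0(\sigma_{\varphi_1} \dotplus \sigma_{\varphi_2}) \le 2[\alpha - m_0(\sigma_\varphi)]$.
Using $d(\ldots)$ from Th. 1.4.1 we obtain
$$d(\sigma_{\varphi_1}, \sigma_{\varphi_2}) \le 2[\alpha - m_0(\sigma_\varphi)].$$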
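Since $\alpha$ is the upper limit of all $m_0(\sigma_\varphi)$, the directed set $\sigma_\varphi$, $\varphi \in \Phi$ converges towards
an element $\sigma_\Phi \in \hat\Sigma_g$.
Since $\sigma_\varphi \cdot \sigma_{\varphi_1} = \sigma_\varphi$ for $\varphi_1 \supset \varphi$ we obtain $\sigma_\Phi \cdot \sigma_\varphi = \sigma_\varphi$, that is, $\sigma_\Phi \ge \sigma_\varphi$ for all $\varphi \in \Phi$. If
$\tilde\sigma \ge \sigma_\varphi$ for all $\varphi \in \Phi$, then from $\tilde\sigma \cdot \sigma_\varphi = \sigma_\varphi$ it follows that, in the limit, $\tilde\sigma \cdot \sigma_\Phi = \sigma_\Phi$ and
also $\sigma_\Phi \le \tilde\sigma$. Therefore $\sigma_\Phi = \bigvee_{\varphi \in \Phi} \sigma_\varphi = \bigvee_{\sigma \in \Gamma} \sigma$.
As a special case of the above result, for a denumerable set $\{\sigma_\nu\}$ we obtain
$$\bar\sigma_n = \bigvee_{\nu=1}^n \sigma_\nu \to \bigvee_{\nu=1}^\infty \sigma_\nu;$$
since $\bar\sigma_n$ is an increasing sequence, bounded from above, we therefore obtain
$$\bar\sigma_n \to \bigvee_{m=1}^\infty \bar\sigma_m = \bigvee_{\nu=1}^\infty \sigma_\nu.$$
We shall now show that, in general, it is possible to choose a denumerable subset
$\{\sigma_\mu\}$ from $\Gamma$ such that $\bigvee_{\sigma \in \Gamma} \sigma = \bigvee_{\mu=1}^\infty \sigma_\mu$. First we can choose a denumerable
sequence $\varphi_\nu$ from $\Phi$ such that $d(\sigma_{\varphi_\nu}, \sigma_\Phi) \to 0$. For $\tilde\sigma = \bigvee_\nu \sigma_{\varphi_\nu}$ we find that
$\sigma_{\varphi_\nu} \le \tilde\sigma \le \sigma_\Phi = \bigvee_{\varphi \in \Phi} \sigma_\varphi$. Therefore, as above, since $\sigma_{\varphi_\nu} \to \sigma_\Phi$, it follows that
$\sigma_\Phi \le \tilde\sigma \le \sigma_\Phi$; that is, $\tilde\sigma = \sigma_\Phi$. Thus, using the denumerable set $\gamma = \bigcup_\nu \varphi_\nu \subset \Gamma$ we
obtain $\bigvee_{\sigma \in \Gamma} \sigma = \sigma_\Phi = \tilde\sigma = \bigvee_\nu \sigma_{\varphi_\nu} = \bigvee_{\sigma \in \gamma} \sigma$.
$F$ is $\sigma$-additive if for a denumerable sequence $\sigma_\nu \in \hat\Sigma_g$ for which $\sigma_\nu \cdot \sigma_\mu = 0$ for $\nu \neq \mu$
$$F\Big(\bigvee_\nu \sigma_\nu\Big) = \sum_\nu F(\sigma_\nu).$$
Let $\bar\sigma_n = \bigvee_{\nu=1}^n \sigma_\nu$; then we obtain $F(\bar\sigma_n) = \sum_{\nu=1}^n F(\sigma_\nu)$. We must therefore show
that $F(\bar\sigma_n)$ converges towards $F(\bigvee_\nu \sigma_\nu)$ in the $\sigma(\mathcal{B}', \mathcal{B})$-topology. Above we have
already shown that $\bar\sigma_n \to \bigvee_\nu \sigma_\nu$; from the continuity of $F$ this assertion is proven.
From the $\sigma$-additivity of $F$ it easily follows that $m(\sigma) = \mu(w, F(\sigma))$ is also $\sigma$-additive.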
Th. 1.4.4. Let $\Sigma$ be a (lattice-theoretically) complete Boolean ring, let $m_0$ be a
$\sigma$-additive effective measure $\Sigma \to [0, 1] \subset \mathbb{R}$ and let $U_g$ be the uniform
structure induced by $d(\sigma_1, \sigma_2) = m_0(\sigma_1 \dotplus \sigma_2)$. Then $\Sigma$ is $U_g$-complete. Then,
given a convergent sequence $\sigma_\nu \to \sigma$ a subsequence $\sigma_{\nu_i}$ can be chosen such that
the decreasing sequence $\bar\sigma_m = \bigvee_{i=m}^\infty \sigma_{\nu_i}$ converges to $\sigma$ and $\sigma = \bigwedge_m \bar\sigma_m$.
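PROOF. $\Sigma$ is $U_g$-complete if every Cauchy sequence $\sigma_\nu$ converges to a limit $\sigma \in \Sigma$.
From $\sigma_\nu$ we select a (yet to be determined) subsequence $\sigma_{\nu_k}$ and (since $\Sigma$ is
lattice-theoretically complete) construct $\bar\sigma_n = \bigvee_{k=n}^\infty \sigma_{\nu_k}$ and $\sigma = \bigwedge_{n=1}^\infty \bar\sigma_n$. We will now
show that $\sigma_\nu \to \sigma$. Since $\sigma_\nu$ is a Cauchy sequence, it suffices to show that, for a
subsequence, $\sigma_{\nu_k} \to \sigma$, that is, $d(\sigma_{\nu_k}, \sigma) \to 0$.
Suppose that $m_0$ is $\sigma$-additive and that $\tilde\sigma_\mu$ is a decreasing sequence which satisfies
$\bigwedge_\mu \tilde\sigma_\mu = 0$. We claim that $m_0(\tilde\sigma_\mu) \to 0$. From
$$\tilde\sigma_1 = (\tilde\sigma_1 \dotplus \tilde\sigma_2) \vee (\tilde\sigma_2 \dotplus \tilde\sigma_3) \vee \cdots \vee (\tilde\sigma_{m-1} \dotplus \tilde\sigma_m) \vee \cdots$$
it follows, on the basis of $\sigma$-additivity, that
$$m_0(\tilde\sigma_1 \dotplus \tilde\sigma_2) + m_0(\tilde\sigma_2 \dotplus \tilde\sigma_3) + \cdots + m_0(\tilde\sigma_{m-1} \dotplus \tilde\sigma_m) \to m_0(\tilde\sigma_1).$$
Since $m_0(\tilde\sigma_\mu \dotplus \tilde\sigma_{\mu+1}) = m_0(\tilde\sigma_\mu) - m_0(\tilde\sigma_{\mu+1})$ the left-hand side of the above relation
is equal to $m_0(\tilde\sigma_1) - m_0(\tilde\sigma_m)$. Therefore it follows that $m_0(\tilde\sigma_m) \to 0$.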
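From $\sigma = \bigwedge_{n=1}^\infty \bar\sigma_n$ it follows that $\tilde\sigma_n = \bar\sigma_n \dotplus \sigma$ is a decreasing sequence for which
$\bigwedge_{n=1}^\infty \tilde\sigma_n = 0$. Therefore $d(\bar\sigma_n, \sigma) = m_0(\bar\sigma_n \dotplus \sigma) \to 0$. From $\bar\sigma_m = \sigma_{\nu_m} \vee (\bigvee_{i=m+1}^\infty \sigma_{\nu_i})$
it follows that
$$\bar\sigma_m \dotplus \sigma_{\nu_m} \le \bigvee_{i=m+1}^\infty (\sigma_{\nu_i} \dotplus \sigma_{\nu_m}) = \bigvee_{i=m+1}^\infty \Big[(\sigma_{\nu_i} \dotplus \sigma_{\nu_m}) \dotplus (\sigma_{\nu_i} \dotplus \sigma_{\nu_m}) \cdot \bigvee_{j=m+1}^{i-1} (\sigma_{\nu_j} \dotplus \sigma_{\nu_m})\Big].$$
Thus, since $m_0$ is $\sigma$-additive, it follows that
$$m_0(\bar\sigma_m \dotplus \sigma_{\nu_m}) \le \sum_{i=m+1}^\infty m_0\Big[(\sigma_{\nu_i} \dotplus \sigma_{\nu_m}) \dotplus (\sigma_{\nu_i} \dotplus \sigma_{\nu_m}) \cdot \bigvee_{j=m+1}^{i-1} (\sigma_{\nu_j} \dotplus \sigma_{\nu_m})\Big] \le \sum_{i=m+1}^\infty m_0(\sigma_{\nu_i} \dotplus \sigma_{\nu_m}) = \sum_{i=m+1}^\infty d(\sigma_{\nu_i}, \sigma_{\nu_m}).$$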
Since $\sigma_\nu$ is a Cauchy sequence, we may select a subsequence such that
$d(\sigma_{\nu_k}, \sigma_\mu) < 1/2^k$ for $\mu > \nu_k$. Thus we obtain
<*(<?«, ^ z 4a°-
k=m+lL
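From $d(\sigma_{\nu_m}, \sigma) \le d(\sigma_{\nu_m}, \bar\sigma_m) + d(\bar\sigma_m, \sigma)$ it follows that $d(\sigma_{\nu_m}, \sigma) \to 0$ and therefore
also $\sigma_\nu \to \sigma$. Since $\sigma \in \Sigma$, $\Sigma$ is $U_g$-complete.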
In reference to Th. 1.4.3 it is important to note that the following situation
is possible: For a subset $\Gamma \subset \Sigma_g$ there exists an upper bound in $\Sigma_g$ which is
usually denoted by \Jff e r a. We shall, however, denote this upper bound in
Ъд by Vffer*7- The same subset Г, as a subset of t,g has, according to
Th. 1.4.3, an upper bound in 1Lg, which in Th. 1.4.3 was denoted by \/аеГ a.
Note that it is possible that
V а Ф V a-
<r e Г or e Г
In this sense it is possible that $F$ is $\sigma$-additive in $\hat\Sigma_g$, but is not $\sigma$-additive in
$\Sigma_g$!
The map $\Sigma \xrightarrow{F} L$ represents an idealization of the situation $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$.
For $\Sigma$, as an idealization of $\mathcal{R}(b_0)$, the uniform structure $U_g$ which was
introduced in Th. 1.4.1 also appears to be physically meaningful. $U_g$
distinguishes two elements $\sigma_1, \sigma_2$ from $\Sigma$ with the aid of finitely many $w \in K$
by means of the probabilities $\mu(w, F(\sigma_1 \dotplus \sigma_2))$ where $\sigma_1 \dotplus \sigma_2$ is that
“registration” in which only one of the two registrations $\sigma_1, \sigma_2$ has responded.
It is premature, however, to identify $U_g$ with the uniform structure of the
“physical imprecision” associated with $\Sigma$ (this uniform structure of “physical
imprecision” is treated in a general setting in [1], §6 and §9) because of the
following unexpected property of $U_g$:
In general $\hat\Sigma_g$ is not compact, as we would expect it to be (according to [1], §9)
for a uniform structure of physical imprecision. If we postulate that $\Sigma$ is
denumerable, then $\hat\Sigma_g$ will be separable. $\mathcal{R}(b_0)$ is denumerable, in agreement
with the assumptions made at the beginning of III, §3 about M9£90t.
Since $\hat\Sigma_g$ need not be compact, we would like to obtain another uniform
structure which would provide a better map of the physical imprecision of the
distinguishability of the elements of $\Sigma$ than that provided by $U_g$.
For the special case $\Sigma \xrightarrow{F} G$ the Boolean ring $\Sigma$ is isomorphic to the image
of $\Sigma$ in $G$. The uniform structure which describes the physical imprecision in $G$
is that generated by $\sigma(\mathcal{B}', \mathcal{B})$. The latter may be transferred to $\Sigma$ as the initial
structure which corresponds to the map $F$. This result would suggest that it is
reasonable to use the initial uniform structure on $\Sigma$ (which is generated by
the map $\Sigma \xrightarrow{F} L$) as the uniform structure of physical imprecision. But the
initial uniform structure does not, in general, separate $\Sigma$; for $\Sigma = \mathcal{R}(b_0)$, $F = \psi_0$
we obtain (by means of $F(\sigma_1) = F(\sigma_2)$) the same equivalence classes of
elements $(b_0, b)$ which we have considered in III, §1.
The uniform structure $U_g$ permits us to compare each $\sigma$ with all the other
$\sigma$'s. By analogy to the fact that we can test a $\sigma$ only with a finite number of
$w \in K$ we may only compare a $\sigma$ with a finite number of the $\tilde\sigma \in \Sigma$. This fact
leads us to suggest that we adopt the initial uniform structure generated by
the maps $\sigma \to F_{\tilde\sigma}(\sigma) = F(\sigma \cdot \tilde\sigma)$ (that is, the weakest uniform structure for
which the maps $F_{\tilde\sigma}$ are uniformly continuous for each $\tilde\sigma$; see AII, §2) as the
uniform structure $U_p$ of physical imprecision. Since the uniform structure
generated by $\sigma(\mathcal{B}', \mathcal{B})$ is identical to the initial structure defined by the maps
$L \xrightarrow{\mu(w, g)} [0, 1] \subset \mathbb{R}$ for all $w \in K$, $U_p$ is equal to the initial structure which is
generated by the maps:
$$\Sigma \xrightarrow{\mu(w, F(\sigma \cdot \tilde\sigma))} [0, 1] \quad \text{for all } w \in K,\ \tilde\sigma \in \Sigma.$$
We therefore “test” a $\sigma$ by means of $U_p$ using a finite number of the $\tilde\sigma \in \Sigma$ and
a finite number of the $w \in K$ (see the general treatment presented in [1], §9).
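According to Th. 1.4.2 the maps $\Sigma_g \xrightarrow{F(\sigma \cdot \tilde\sigma)} L$ (for fixed $\tilde\sigma$) are uniformly
continuous. Therefore $U_g$ is finer than $U_p$. If $F$ is effective (which we shall
always assume; see Th. 1.2.2) then $U_p$ will also separate $\Sigma$ because if
$F(\sigma_1 \cdot \tilde\sigma) = F(\sigma_2 \cdot \tilde\sigma)$ for all $\tilde\sigma \in \Sigma$ then, for $\tilde\sigma = \sigma_2$ we obtain $F(\sigma_2) = F(\sigma_1 \cdot \sigma_2)$.
Thus, from $F(\sigma_2) = F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_1 \cdot \sigma_2)$ it follows that
$F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = 0$, that is, $\sigma_2 \dotplus \sigma_1 \cdot \sigma_2 = 0$ and hence $\sigma_2 = \sigma_1 \cdot \sigma_2$. Similarly we
may derive (for $\tilde\sigma = \sigma_1$) $\sigma_1 = \sigma_1 \cdot \sigma_2$. Therefore we obtain $\sigma_1 = \sigma_2$.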
The map $F(\sigma \cdot \tilde\sigma)$ may be extended to all of $\hat\Sigma_g$ as a uniformly continuous
map by means of the extension of $F$ upon $\hat\Sigma_g$ in such a way that $U_p$ will be
defined in all of $\hat\Sigma_g$. The uniform structure $U_p$ is coarser than that of $U_g$ and
is the initial structure for the maps $\hat\Sigma_g \xrightarrow{F_{\tilde\sigma}} L$ where $F_{\tilde\sigma}(\sigma) = F(\sigma \cdot \tilde\sigma)$ for all
$\tilde\sigma \in \Sigma_g$. Would we obtain a finer initial structure on $\hat\Sigma_g$ if we admit all $\hat\sigma \in \hat\Sigma_g$?
In fact we obtain the same structure $U_p$ as the initial structure for the maps
$\hat\Sigma_g \xrightarrow{F_{\hat\sigma}} L$ for all $\hat\sigma \in \hat\Sigma_g$. This becomes evident from the following estimate for
$\hat\sigma \in \hat\Sigma_g$ and $\tilde\sigma \in \Sigma_g$, which is obtained by analogy with the proof of Th. 1.4.2.
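$$|\mu(w, F(\sigma_1 \cdot \tilde\sigma) - F(\sigma_2 \cdot \tilde\sigma))| \le |\mu(w, F(\sigma_1 \cdot \tilde\sigma) - F(\sigma_1 \cdot \bar\sigma))|$$
$$\qquad + |\mu(w, F(\sigma_1 \cdot \bar\sigma) - F(\sigma_2 \cdot \bar\sigma))|$$
$$\qquad + |\mu(w, F(\sigma_2 \cdot \bar\sigma) - F(\sigma_2 \cdot \tilde\sigma))|;$$
$$|\mu(w, F(\sigma \cdot \tilde\sigma) - F(\sigma \cdot \bar\sigma))| \le \mu(w, F(\sigma \cdot \tilde\sigma \dotplus \sigma \cdot \bar\sigma))$$
$$\qquad = \mu(w, F(\sigma \cdot (\tilde\sigma \dotplus \bar\sigma))) \le \mu(w, F(\tilde\sigma \dotplus \bar\sigma))$$
and we therefore obtain
$$|\mu(w, F(\sigma_1 \cdot \tilde\sigma) - F(\sigma_2 \cdot \tilde\sigma))| \le 2\mu(w, F(\tilde\sigma \dotplus \bar\sigma)) + |\mu(w, F(\sigma_1 \cdot \bar\sigma) - F(\sigma_2 \cdot \bar\sigma))|.$$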
The previous discussion shows that if we impose the condition $\hat\Sigma_g = \Sigma$ (that is,
that $\Sigma$ is complete with respect to $U_g$) we do not lose any physical generality,
but we do gain mathematical simplicity.
We define:
D 1.4.1. Let $\Sigma$ be a Boolean ring with additive measure $F: \Sigma \to L$ such that
$\Sigma$ is complete with respect to the uniform structure $U_g$ defined in Th. 1.4.1 (or
equivalently by Th. 1.4.3 and Th. 1.4.4, where $F$ is $\sigma$-additive and $\Sigma$ is lattice
theoretically complete). We define the uniform structure of "physical imprecision"
$U_p$ on $\Sigma$ as the initial uniform structure generated by the maps
$\Sigma \xrightarrow{F_{\tilde\sigma}} L$, $F_{\tilde\sigma}(\sigma) = F(\sigma \cdot \tilde\sigma)$, $\tilde\sigma \in \Sigma$, where $L$ is endowed with the uniform structure
generated by $\sigma(\mathcal{B}', \mathcal{B})$. We shall let $\Sigma_p$ denote $\Sigma$ together with the uniform
structure $U_p$ ($\Sigma_g$ is therefore complete; $\Sigma_p$ need not necessarily be complete).
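Th. 1.4.5. $\Sigma_p$ is precompact (see AII, §2). The topologies generated by $U_p$
and $U_g$ are identical.
Proof. Since $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-compact, $\Sigma_p$ is precompact (see AII, §2). Since $U_g$ is finer
than $U_p$, it is only necessary to show that the identity map $\Sigma_p \to \Sigma_g$ is continuous
(not necessarily uniformly continuous!) in order to prove that the topologies
generated by $U_p$ and $U_g$ are identical. Now, for fixed $\sigma$ we have:
$$d(\sigma, \sigma_1) = \mu(w_0, F(\sigma \dotplus \sigma_1)) = \mu(w_0, F(\sigma) + F(\sigma_1) - 2F(\sigma \cdot \sigma_1))$$
$$\qquad \le 2|\mu(w_0, F(\sigma) - F(\sigma \cdot \sigma_1))| + |\mu(w_0, F(\sigma_1) - F(\sigma))|.$$
Therefore, for the case in which $\tilde\sigma_1 = \sigma$, $\tilde\sigma_2 = \varepsilon$ ($\varepsilon$ = unit element) we have
$$d(\sigma, \sigma_1) \le 2|\mu(w_0, F(\sigma \cdot \tilde\sigma_1) - F(\sigma_1 \cdot \tilde\sigma_1))|$$
$$\qquad + |\mu(w_0, F(\sigma \cdot \tilde\sigma_2) - F(\sigma_1 \cdot \tilde\sigma_2))|$$
from which we conclude that the identity map $\Sigma_p \to \Sigma_g$ is continuous.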
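The identity $F(\sigma \dotplus \sigma_1) = F(\sigma) + F(\sigma_1) - 2F(\sigma \cdot \sigma_1)$ used in this proof is not derived in the text. A short derivation, assuming only the additivity of $F$ on disjoint elements and writing $\dotplus$ for the symmetric difference, may be sketched as follows:

```latex
% Split sigma and sigma_1 into disjoint parts relative to their product:
%   sigma   = (sigma   \dotplus sigma sigma_1) \vee sigma sigma_1,
%   sigma_1 = (sigma_1 \dotplus sigma sigma_1) \vee sigma sigma_1.
\begin{align*}
F(\sigma)   &= F(\sigma \dotplus \sigma\cdot\sigma_1) + F(\sigma\cdot\sigma_1), \\
F(\sigma_1) &= F(\sigma_1 \dotplus \sigma\cdot\sigma_1) + F(\sigma\cdot\sigma_1), \\
F(\sigma \dotplus \sigma_1)
  &= F(\sigma \dotplus \sigma\cdot\sigma_1) + F(\sigma_1 \dotplus \sigma\cdot\sigma_1) \\
  &= F(\sigma) + F(\sigma_1) - 2F(\sigma\cdot\sigma_1) \\
  &= 2\bigl[F(\sigma) - F(\sigma\cdot\sigma_1)\bigr] + \bigl[F(\sigma_1) - F(\sigma)\bigr].
\end{align*}
```

The last line is the regrouping to which the triangle inequality is applied in the estimate for $d(\sigma, \sigma_1)$.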
According to Th. 1.4.5 the structure $\Sigma_p$, $\hat\Sigma_p$ is of the form which has been
discussed more generally in [1], §9. $U_p$ is somewhat "more physical" than $U_g$.
$\Sigma_p$ is precompact (totally bounded), but the structure of this Boolean ring
may not necessarily be extended to the completion $\hat\Sigma_p$. We note that $U_g$ is
characterized by the fact that $\hat\Sigma_g = \Sigma_g$ is a complete Boolean ring; in order to
obtain a Boolean ring upon completion it is methodologically desirable to
introduce $U_g$ and not only $U_p$.
Since we wish to describe an idealization of the situation $\mathcal{R}(b_0) \to L$ by
$\Sigma \xrightarrow{F} L$ and since $\mathcal{R}(b_0)$ is denumerable, it seems reasonable to introduce the
notion of an observable in the following way:
D 1.4.2. By an observable we mean a pair of objects $(\Sigma, F)$ where $\Sigma$ is a
Boolean ring and $F$ is an additive effective measure $F: \Sigma \to L$ for which $\Sigma$ is
complete and separable with respect to the uniform structure $U_g$ (defined by
Th. 1.4.1 using $F$). We shall denote the observable $(\Sigma, F)$ also by $\Sigma \xrightarrow{F} L$.
According to D 1.4.2, $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ are considered to be the
"same" observable if there is an isomorphism $\Sigma_1 \xrightarrow{i} \Sigma_2$ of the Boolean rings
for which $F_1(\sigma) = F_2(i\sigma)$.
Th. 1.4.6. $\Sigma \xrightarrow{F} L$ is an observable if and only if $\Sigma$ is lattice theoretically
complete, $F$ is $\sigma$-additive and there exists a denumerable Boolean sublattice
$\Sigma_a$ of $\Sigma$ whose (lattice theoretical) completion in $\Sigma$ is equal to $\Sigma$.
Proof. According to Th. 1.4.3 $\Sigma$ (of D 1.4.2) is lattice theoretically complete and $F$
is $\sigma$-additive. Since $\Sigma$ is $U_g$-separable (according to D 1.4.2) there exists a
denumerable subset $A \subset \Sigma$ which is $U_g$-dense in $\Sigma$. The Boolean ring $\Sigma_A$ generated
by $A$ is denumerable. The lattice theoretical completion of $\Sigma_A$ in $\Sigma$ is a (lattice
theoretically) complete Boolean ring $\bar\Sigma_A \subset \Sigma$. According to Th. 1.4.4 $\bar\Sigma_A$ is
$U_g$-complete. Since $\bar\Sigma_A$ is $U_g$-dense in $\Sigma$ and since $\bar\Sigma_A$ is $U_g$-complete, $\bar\Sigma_A = \Sigma$.
Conversely, if $\Sigma$ is lattice theoretically complete and $F$ is $\sigma$-additive, then by
Th. 1.4.4 $\Sigma$ is $U_g$-complete. We shall now show that $\Sigma_a$ is $U_g$-dense in $\Sigma$. Let $\hat\Sigma_{ag}$ be
the completion of $\Sigma_a$ with respect to $U_g$. Since $\Sigma$ is $U_g$-complete, $\hat\Sigma_{ag} \subset \Sigma$. Note that
$\hat\Sigma_{ag}$ is, according to Th. 1.4.3, lattice theoretically complete; therefore the lattice
theoretical completion $\Sigma$ of $\Sigma_a$ lies in $\hat\Sigma_{ag}$, that is, $\Sigma \subset \hat\Sigma_{ag}$. Therefore $\hat\Sigma_{ag} = \Sigma$ and
$\Sigma_a$ is $U_g$-dense in $\Sigma$.
In previous theorems we have only assumed that $F$ maps the Boolean ring
$\Sigma$ into $L$. For the special case in which $\Sigma \xrightarrow{F} G$ it is, in principle,
conceivable that the extension of the map $F$ to the completion $\hat\Sigma_g$ will yield
points which are not elements of $G$. The following theorem rules out this
possibility.
Th. 1.4.7. Suppose that, for a Boolean ring $\Sigma$, an additive measure
$F: \Sigma \to G$ is given. Then, for the extension $F: \hat\Sigma_g \to L$, $F\hat\Sigma_g$ is contained in
the $\sigma(\mathcal{B}', \mathcal{B})$-closure of $F\Sigma$ in $G$.
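Proof. According to Th. 1.4.1 $U_g$ is metrizable. Thus it suffices to show that, for a
sequence $\sigma_\nu \in \Sigma$ satisfying $d(\sigma_\nu, \sigma) \to 0$ for $\sigma \in \hat\Sigma_g$, $F(\sigma_\nu) \in G$ implies $F(\sigma) \in G$.
Since $F(\sigma_\nu) \to F(\sigma)$ in the $\sigma(\mathcal{B}', \mathcal{B})$-topology, we have then also proven that $F\hat\Sigma_g$ is in
the $\sigma(\mathcal{B}', \mathcal{B})$-closure of $F\Sigma$ in $G$.
Since $F(\sigma_\nu)$ converges towards $F(\sigma)$, in order to prove that $F(\sigma) \in G$ it suffices to
consider a subsequence $\sigma_{\nu_i}$, where the latter is chosen such that (using the notation
of Th. 1.4.4)
$$\sigma = \bigwedge_{m} \tilde\sigma_m \quad \text{with} \quad \tilde\sigma_m = \bigvee_{i=m}^{\infty} \sigma_{\nu_i}.$$
Then we obtain $F(\tilde\sigma_m) \to F(\sigma)$.
According to Th. 1.4.3, for
$$\tilde\sigma_{m,N} = \bigvee_{i=m}^{N} \sigma_{\nu_i}$$
we have $\tilde\sigma_{m,N} \to \tilde\sigma_m$ as $N \to \infty$; we therefore obtain
$$F(\tilde\sigma_{m,N}) \to F(\tilde\sigma_m).$$
Since $\tilde\sigma_{m,N} \in \Sigma$ we obtain $F(\tilde\sigma_{m,N}) \in G$. Therefore, by III, Th. 6.10, $F(\tilde\sigma_m) \in G$. $F(\tilde\sigma_m)$ is
a decreasing sequence; therefore, by III, Th. 6.10 we obtain $F(\sigma) \in G$.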
Th. 1.4.7 shows that in the case in which $\Sigma \xrightarrow{F} G$ we may assume that $\Sigma_g$ is
complete without any loss of generality. Thus we define:
D 1.4.3. A decision observable is a pair $(\Sigma, F)$ where $\Sigma$ is a Boolean ring and
$F$ is an additive effective measure $F: \Sigma \to G$ for which $\Sigma$ is complete with
respect to the uniform structure $U_g$. (We do not require that $\Sigma$ is
$U_g$-separable; this fact is a result of Th. 1.4.8.)
Th. 1.4.8. For a decision observable, $\Sigma_g$ is always separable. $F$ is an
isomorphism onto the image set $F\Sigma \subset G$; for each decision observable, $\Sigma$ may
be identified with a $U_g$-closed Boolean sublattice of $G$. Each (lattice
theoretically) complete Boolean sublattice of $G$ is also $U_g$-complete, and
conversely, so that decision observables may be identified with (lattice
theoretically) complete Boolean sublattices of $G$. The uniform structure $U_p$
defined in D 1.4.1 is identical to the uniform structure on $\Sigma$ generated by
$\sigma(\mathcal{B}', \mathcal{B})$. The topology on $\Sigma$ generated by $U_g$ is identical to the $\sigma(\mathcal{B}', \mathcal{B})$
topology on $\Sigma$ (therefore $\Sigma$ is $\sigma(\mathcal{B}', \mathcal{B})$-closed in $G$, since $\Sigma$ is $U_g$-complete; see
Th. 1.3.7).
PROOF. On the basis of Th. 1.3.3 $\Sigma$ may be identified with $F\Sigma$. According to
Th. 1.4.4 the $U_g$-complete Boolean sublattices of $G$ and the lattice-theoretically
complete sublattices of $G$ are the same. Therefore it is only necessary to show that
every complete Boolean sublattice of $G$ is $U_g$-separable and that the uniform
structure $U_p$ is identical to that generated by $\sigma(\mathcal{B}', \mathcal{B})$.
Let $\Sigma$ be a complete Boolean sublattice of $G$. In order to prove that $\Sigma$ is
$U_g$-separable, we recall that $\mathcal{B}'$ is $\sigma(\mathcal{B}', \mathcal{B})$-separable (see AIII, §4). Therefore $\Sigma$ is
$\sigma(\mathcal{B}', \mathcal{B})$-separable. Therefore there exists a countable subset $A \subset \Sigma$ which is
$\sigma(\mathcal{B}', \mathcal{B})$-dense in $\Sigma$. The Boolean subring $\Sigma_A$ (where $\Sigma_A$ is generated by $A$
according to Th. 1.3.6(v)) is denumerable; let $\bar\Sigma_A$ denote the complete Boolean
subring generated by $A$ according to Th. 1.3.6(vi). Since $\Sigma$ is complete, we obtain
$$\Sigma_A \subset \bar\Sigma_A \subset \Sigma.$$
Since $A$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $\Sigma$ and since $\bar\Sigma_A$ and $\Sigma$ are, according to Th. 1.3.7,
$\sigma(\mathcal{B}', \mathcal{B})$-closed in $G$, we obtain $\bar\Sigma_A = \Sigma$. Since the lattice theoretical completion $\bar\Sigma_A$
of $\Sigma_A$ is equal to $\Sigma$, then, according to Th. 1.4.6, $\Sigma$ is $U_g$-separable.
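Since $\Sigma$ is identified with a subset of $G$, the map $F$ becomes the identity map and
the structure $U_p$ (defined by D 1.4.1) is the initial structure generated by the maps
$F_{\tilde e}(e) = e \wedge \tilde e$ (for all $\tilde e \in \Sigma$) of $\Sigma$ into $G$. For $\tilde e = 1$ we obtain the identity map; thus
$U_p$ is finer than the uniform structure defined by $\sigma(\mathcal{B}', \mathcal{B})$. If, however, the maps $F_{\tilde e}$
(for fixed $\tilde e$) are uniformly continuous with respect to $\sigma(\mathcal{B}', \mathcal{B})$ then $U_p$ is identical to
the uniform structure generated by $\sigma(\mathcal{B}', \mathcal{B})$.
The fact that $F_{\tilde e}$ is uniformly continuous is a consequence of
$$\mu(w, \tilde e \wedge e_1 - \tilde e \wedge e_2) = \mu(w', e_1 - e_2), \qquad (1.4.1)$$
where $w'$ depends on $w$ and $\tilde e$. It therefore remains to prove (1.4.1).
Since $\tilde e, e_1, e_2$ are commensurable, then, by Th. 1.3.4(v) and (vi) we obtain
$\tilde e \wedge e_1 = \tilde e e_1 = e_1 \tilde e = \tilde e e_1 \tilde e$ and, correspondingly, $\tilde e \wedge e_2 = \tilde e e_2 = e_2 \tilde e = \tilde e e_2 \tilde e$. For
$w' = \tilde e w \tilde e$, from $\mu(w, g) = \mathrm{tr}(wg)$ we obtain $\mu(w, \tilde e(e_1 - e_2)\tilde e) = \mu(\tilde e w \tilde e, e_1 - e_2)$, thus
proving (1.4.1).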
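Relation (1.4.1) rests on the cyclic property of the trace. The following sketch checks it numerically for one hand-picked example; the $3 \times 3$ density matrix `w` and the diagonal (hence commuting) projections `e_tilde`, `e1`, `e2` are assumptions chosen for illustration, and the small matrix helpers are hypothetical names, not part of the text.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def subtract(a, b):
    """Entrywise difference a - b."""
    n = len(a)
    return [[a[i][j] - b[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def diagonal(entries):
    """Diagonal matrix; with entries 0.0 or 1.0 it is a projection."""
    n = len(entries)
    return [[entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Assumed example: a symmetric, positive matrix of trace 1 (an ensemble w).
w = [[0.5, 0.2, 0.0],
     [0.2, 0.3, 0.1],
     [0.0, 0.1, 0.2]]
e_tilde = diagonal([1.0, 1.0, 0.0])
e1 = diagonal([1.0, 0.0, 1.0])
e2 = diagonal([0.0, 1.0, 1.0])

diff = subtract(e1, e2)
# Left side of (1.4.1): mu(w, e~(e1 - e2)e~) = tr(w e~ (e1 - e2) e~)
lhs = trace(matmul(w, matmul(e_tilde, matmul(diff, e_tilde))))
# Right side of (1.4.1): mu(e~ w e~, e1 - e2) = tr(e~ w e~ (e1 - e2))
rhs = trace(matmul(matmul(e_tilde, matmul(w, e_tilde)), diff))
```

Here `w_prime = e_tilde w e_tilde` is in general not normalized; (1.4.1) only needs it to depend on `w` and `e_tilde` alone.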
The fact that $U_p$ is identical to the uniform structure defined by $\sigma(\mathcal{B}', \mathcal{B})$ on
$\Sigma \subset G$ is only an assurance that we do not provide two different uniform
structures for physical imprecision, because, as a subset of $G$, $\Sigma$ already has
the uniform structure of physical imprecision defined by $\sigma(\mathcal{B}', \mathcal{B})$ (see III, §3).
Th. 1.4.8 permits us to characterize the decision observables entirely by
complete Boolean subrings $\Sigma$ of $G$. In this way the Boolean operations in $\Sigma$
immediately represent the switching-algebra of an idealized registration
apparatus, the measurement apparatus for the observable. However, this
"simplification" which naively identifies the measurements of decision
observables with subsets $\Sigma$ of decision effects makes the explanation of the
measurement problem more difficult.
This difficulty is increased by the fact that (as we will see in §2.5) the
"usual" concept of an observable is identical with what we called a decision
observable and that we are accustomed to view only these decision
observables. We note, however, that our notion of an observable is more general
and realistic as a deduced concept; since it is a deduced concept we do not
need to make vague statements such as "an observable is something that can
be measured." Its physical meaning may be obtained from its definition as an
idealization of the situation $\mathcal{R}(b_0) \to L$ to which we have already given a
physical interpretation: a conceptual joining of the registration procedures
$b \in \mathcal{R}(b_0)$ and the set of possible frequencies $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$ for the
various preparation procedures $a \in \mathcal{Q}'$. In this description, the problem of
finding a "measurement method" $b_0 \in \mathcal{R}_0$ which permits us to at least
approximately "measure" the observable $\Sigma \xrightarrow{F} L$ need no longer be assigned
to a domain between theory and experiment which may only be
described by "words" and not by a theory. Instead, it becomes a question
which must be treated in a theory which is perhaps more comprehensive than
quantum mechanics because the physical interpretation of quantum
mechanics does not depend upon the concept of an observable but depends
instead on the actual physical methods of preparation and registration. In §4
we shall begin with a step by step discussion of the problem of the
measurement of an observable. This topic will be treated in more detail in
XVII and XVIII. Nevertheless, the solution of the measurement problem
concerning the macroscopic signals of a macroscopic apparatus will not be
treated in this book. This problem has been solved in [13], X where the
compatibility of an extrapolated quantum mechanics for “many particles”
with the macroscopic description of the macroscopic measurement and
preparation apparatuses is demonstrated. Such a macroscopic description of
the apparatuses was used as a starting point for the foundation of quantum
mechanics presented in II.
2 Structures in the Class of Observables
By introducing the concept of an observable we seek to eliminate a portion of
the structure associated with a particular apparatus from the registration
process. For registration methods only the abstract structure of a Boolean
ring together with an additive measure $F: \Sigma \to L$ remains. Without any
additional analysis it is already evident that the concept of an observable
already contains too much of the structure of the registration methods. We
should therefore seek to eliminate “unnecessary” and “bad” registration
methods. Is it possible that such an elimination procedure will lead us to the
concept of a decision observable? If this is the case, then the decision
observables are the “true” measurements of the microsystems and exhibit the
real structure of the microsystems.
These and similar questions make it necessary for us to examine the
concept of an observable more closely, as we shall do in this section. The
reader who is not interested in a deeper analysis of the concept of an
observable may omit this section. This is possible because, in §1 we have
already introduced a number of important theorems which have a close
connection with the analysis of §2.
2.1 The Spaces $\mathcal{B}(\Sigma)$ and $\mathcal{B}'(\Sigma)$
Let $\Sigma$ be a Boolean ring and let $m_0: \Sigma \to [0,1] \subset \mathbb{R}$ be an additive real
effective measure. Let $\Sigma$ be complete with respect to $U_g$ (that is, $\Sigma$ together
with the uniform structure defined by the metric $d(\sigma_1, \sigma_2) = m_0(\sigma_1 \dotplus \sigma_2)$). Let
$w_0$ be defined as in Th. 1.4.1; then, according to §1.4 we may choose $m_0$ as
follows:
$$m_0(\sigma) = \mu(w_0, F(\sigma)).$$
According to Th. 1.4.4 a (lattice theoretically) complete Boolean ring $\Sigma$
with a $\sigma$-additive effective measure is $U_g$-complete.
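As a small illustration of the metric $d(\sigma_1, \sigma_2) = m_0(\sigma_1 \dotplus \sigma_2)$, the sketch below checks the metric axioms by brute force in a finite model. The model is an assumption for illustration only: $\Sigma$ is the power set of a finite set, $\dotplus$ is the symmetric difference of sets, and $m_0$ is given by point weights; `powerset`, `d` and `is_metric` are hypothetical names.

```python
from fractions import Fraction
from itertools import combinations

def powerset(points):
    """All subsets of `points`, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def d(m0, s1, s2):
    """d(s1, s2) = m0(s1 symmetric-difference s2)."""
    return sum(m0[p] for p in s1 ^ s2)

def is_metric(m0):
    """Brute-force check of definiteness, symmetry and the triangle inequality."""
    ring = powerset(m0)
    for a in ring:
        for b in ring:
            if (d(m0, a, b) == 0) != (a == b):
                return False            # d vanishes exactly on the diagonal
            if d(m0, a, b) != d(m0, b, a):
                return False            # symmetry
            for c in ring:
                if d(m0, a, c) > d(m0, a, b) + d(m0, b, c):
                    return False        # triangle inequality
    return True

# Assumed examples: an effective measure and one with a weightless point.
effective = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
degenerate = {0: Fraction(1), 1: Fraction(0)}
```

With a weightless point, `d` is only a pseudometric, which shows where the effectiveness of $m_0$ enters.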
We shall now recapitulate some of the results obtained in §1.
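Th. 2.1.1. To each set $\Gamma \subset \Sigma$ there is a denumerable subset $\{\sigma_\nu\} \subset \Gamma$ for
which $\bigvee_{\sigma \in \Gamma} \sigma = \bigvee_\nu \sigma_\nu$. $\tilde\sigma_n = \bigvee_{\nu=1}^{n} \sigma_\nu$ is an increasing sequence for which
$\bigvee_n \tilde\sigma_n = \bigvee_\nu \sigma_\nu = \bigvee_{\sigma \in \Gamma} \sigma$ and for which $\tilde\sigma_n \to \bigvee_{\sigma \in \Gamma} \sigma$ (in the topology
generated by $U_g$). For every increasing sequence $\sigma_\nu$ we obtain $\sigma_\nu \to \bigvee_\nu \sigma_\nu$. A
similar result holds for $\bigwedge_{\sigma \in \Gamma} \sigma$.
Proof. See Th. 1.4.3.
Th. 2.1.2. Let $\sigma_\nu$ be a convergent sequence (in the topology generated by $U_g$)
and suppose $\sigma_\nu \to \sigma$. Then we may choose a subsequence $\sigma_{\nu_i}$ such that, for
$\tilde\sigma_m = \bigvee_{i=m}^{\infty} \sigma_{\nu_i}$, the following relationships hold:
$$\tilde\sigma_m \to \sigma = \bigwedge_m \tilde\sigma_m.$$
Proof. See Th. 1.4.4.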
D 2.1.1. A real function $x(\sigma)$ over $\Sigma$ is said to be a signed additive (or signed
$\sigma$-additive) measure over $\Sigma$ if there exists a real number $c$ such that $|x(\sigma)| \le c$
for all $\sigma$ and if, for $\sigma = \bigvee_\nu \sigma_\nu$, $\sigma_\nu \wedge \sigma_\mu = 0$ for $\nu \ne \mu$, the equation
$$x(\sigma) = \sum_\nu x(\sigma_\nu) \qquad (2.1.1)$$
holds for finitely (or countably) many $\sigma_\nu$.
All theorems about signed $\sigma$-additive measures will also hold for
$\sigma$-additive measures.
Th. 2.1.3. Let $x$ be a signed, additive measure over $\Sigma$. Then the following
conditions are equivalent:
(i) $x$ is $\sigma$-additive.
(ii) If $\sigma_n$ is a decreasing sequence for which $\bigwedge_n \sigma_n = 0$ then $x(\sigma_n) \to 0$.
(iii) If $\sigma_n$ is a decreasing sequence for which $\bigwedge_n \sigma_n = \sigma$ then $x(\sigma_n) \to x(\sigma)$.
(iv) If $\sigma_n$ is an increasing sequence for which $\bigvee_n \sigma_n = \sigma$ then $x(\sigma_n) \to x(\sigma)$.
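Proof. (i) ⇒ (iv). Assume that $\sigma_n$ is defined according to (iv). Let $\sigma_0 = 0$; then
$\sigma = \bigvee_{n=0}^{\infty} (\sigma_{n+1} \dotplus \sigma_n)$. Then by (i) it follows that
$$x(\sigma) = \lim_{m \to \infty} \sum_{n=0}^{m} [x(\sigma_{n+1}) - x(\sigma_n)] = \lim_{m \to \infty} x(\sigma_{m+1}).$$
(iv) ⇒ (iii). Let $\sigma_n$ be defined according to (iii). Then $\sigma_n^*$ is an increasing sequence for which
$\bigvee_n \sigma_n^* = \sigma^*$. According to (iv) it follows that $x(\sigma_n^*) \to x(\sigma^*)$. Since
$x(\sigma^*) + x(\sigma) = x(\varepsilon)$ it follows that $x(\sigma_n) \to x(\sigma)$.
(iii) ⇒ (ii) is trivial, since (ii) is a special case of (iii).
(ii) ⇒ (i). Let $\sigma = \bigvee_{\nu=1}^{\infty} \sigma_\nu$ where $\sigma_\nu \wedge \sigma_\mu = 0$ for $\nu \ne \mu$. The sequence
$\tilde\sigma_n = \sigma \dotplus \bigvee_{\nu=1}^{n} \sigma_\nu$ is a decreasing sequence with $\bigwedge_n \tilde\sigma_n = 0$, for which, according to (ii),
$$x\Bigl(\sigma \dotplus \bigvee_{\nu=1}^{n} \sigma_\nu\Bigr) \to 0.$$
Since
$$x\Bigl(\sigma \dotplus \bigvee_{\nu=1}^{n} \sigma_\nu\Bigr) = x(\sigma) - x\Bigl(\bigvee_{\nu=1}^{n} \sigma_\nu\Bigr) = x(\sigma) - \sum_{\nu=1}^{n} x(\sigma_\nu),$$
it follows that $x(\sigma) = \sum_{\nu=1}^{\infty} x(\sigma_\nu)$.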
Th. 2.1.4. Let $m$ be an additive measure over $\Sigma$. Then the following
conditions are equivalent:
(i) $m$ is $\sigma$-additive,
(ii) $m$ is continuous.
PROOF. According to Th. 2.1.3, the condition that $m$ is $\sigma$-additive can be replaced
by one of the other conditions (ii)-(iv) in Th. 2.1.3.
(ii) ⇒ (i). For an increasing sequence $\sigma_n$ for which $\bigvee_n \sigma_n = \sigma$, from Th. 2.1.1 it
follows that $\sigma_n \to \sigma$; therefore, from (ii) we obtain $m(\sigma_n) \to m(\sigma)$. Thus, from
Th. 2.1.3(iv) we have proven (i).
(i) ⇒ (ii). Let $\sigma_\nu$ be a convergent sequence for which $\sigma_\nu \to \sigma$. Since
$0 \le m(\sigma_\nu) \le 1$ the sequence $m(\sigma_\nu)$ is convergent if and only if it has only one accumulation
point. If $\alpha$ is any accumulation point we may choose a subsequence $\sigma_{\nu_k}$ such that
$m(\sigma_{\nu_k}) \to \alpha$. According to Th. 2.1.2 a subsequence of $\sigma_{\nu_k}$ can be chosen (for simplicity
we shall also use $\sigma_{\nu_k}$ to denote the subsequence) such that, if $\tilde\sigma_m = \bigvee_{i=m}^{\infty} \sigma_{\nu_i}$, then
$\sigma = \bigwedge_m \tilde\sigma_m$. Since $m$ is $\sigma$-additive, from Th. 2.1.3(iii) it follows that $m(\tilde\sigma_m) \to m(\sigma)$.
Since $\tilde\sigma_m \ge \sigma_{\nu_m}$ we obtain $m(\tilde\sigma_m) \ge m(\sigma_{\nu_m}) \to \alpha$. Therefore $m(\sigma) \ge \alpha$.
From the subsequence $\sigma_{\nu_k}$, since $\sigma_{\nu_k}^* \to \sigma^*$, according to Th. 2.1.2 a subsequence
can be chosen (which we shall again denote by $\sigma_{\nu_k}$) such that
$$\sigma^* = \bigwedge_m \tilde\sigma_m \quad \text{with} \quad \tilde\sigma_m = \bigvee_{i=m}^{\infty} \sigma_{\nu_i}^*.$$
Thus, we obtain
$$\sigma = \bigvee_m \tilde\sigma_m^* \quad \text{where} \quad \tilde\sigma_m^* = \bigwedge_{i=m}^{\infty} \sigma_{\nu_i}.$$
Since $m$ is $\sigma$-additive, from Th. 2.1.3(iv) we obtain
$$m(\tilde\sigma_m^*) \to m(\sigma).$$
Since $m(\tilde\sigma_m^*) \le m(\sigma_{\nu_m})$ and $m(\sigma_{\nu_m}) \to \alpha$ it follows that $m(\sigma) \le \alpha$. Therefore $\alpha = m(\sigma)$.
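Th. 2.1.5. To each $\sigma$-additive measure $m$ there exists one and only one $\sigma_s \in \Sigma$
for which $m(\sigma_s) = 1$ and $m(\sigma) \ne 0$ for all $\sigma$ for which $0 \ne \sigma \le \sigma_s$.
PROOF. Let $\Gamma = \{\sigma \mid \sigma \in \Sigma \text{ and } m(\sigma) = 1\}$. Then, from Th. 2.1.1 it follows that there
is a countable subset $\{\sigma_\nu\}$ of $\Gamma$ such that
$$\sigma_s \overset{\text{def}}{=} \bigwedge_{\sigma \in \Gamma} \sigma = \bigwedge_\nu \sigma_\nu.$$
We shall now show that $\tilde\sigma_n = \bigwedge_{\nu=1}^{n} \sigma_\nu \in \Gamma$. We need only show that if $\sigma_1, \sigma_2 \in \Gamma$
then it follows that $\sigma_1 \wedge \sigma_2 \in \Gamma$. From $m(\sigma_2) = 1$ it follows that $m(\sigma_2^*) = 0$ and
therefore $m(\sigma_1 \wedge \sigma_2^*) = 0$.
From $1 = m(\sigma_1) = m(\sigma_1 \wedge \sigma_2^*) + m(\sigma_1 \wedge \sigma_2)$, it follows that $m(\sigma_1 \wedge \sigma_2) = 1$, that
is, $\sigma_1 \wedge \sigma_2 \in \Gamma$. Since $\sigma_s = \bigwedge_n \tilde\sigma_n$, from $m(\tilde\sigma_n) = 1$ and Th. 2.1.3(iii) it follows that
$m(\sigma_s) = 1$. Thus $\Gamma$ contains a least element, namely $\sigma_s$.
Suppose that $\sigma \le \sigma_s$ where $m(\sigma) = 0$; then it follows that $m(\sigma_s \dotplus \sigma) =
m(\sigma_s) - m(\sigma) = 1$ and we therefore obtain $\sigma_s \dotplus \sigma \in \Gamma$. Since $\sigma_s$ is the smallest
element of $\Gamma$ and since $\sigma_s \dotplus \sigma \le \sigma_s$ it follows that $\sigma_s \dotplus \sigma = \sigma_s$, that is, $\sigma = 0$.
From $m(\tilde\sigma) = 1$ it follows that $\tilde\sigma \in \Gamma$ and therefore $\tilde\sigma \ge \sigma_s$. Let $\sigma = \tilde\sigma \dotplus \sigma_s$. Then
we obtain $m(\sigma) = m(\tilde\sigma) - m(\sigma_s) = 0$. If, in addition, $\tilde\sigma \ne \sigma_s$ then there exists a
$\sigma$, $0 \ne \sigma \le \tilde\sigma$, with $m(\sigma) = 0$. Thus we have proven the uniqueness of $\sigma_s$.
D 2.1.2. We shall call $\sigma_s$ (as defined in Th. 2.1.5) the support of $m$.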
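The construction of the support in the proof of Th. 2.1.5 (the meet of all elements of measure 1) can be followed literally in a finite model. This is a sketch under assumptions: $\Sigma$ is the power set of a finite set, $m$ is given by point weights summing to 1, and the names `powerset`, `measure` and `support` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def powerset(points):
    """All subsets of `points`, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def measure(weights, sigma):
    """m(sigma): sum of the point weights in sigma."""
    return sum(weights[p] for p in sigma)

def support(weights):
    """Meet (intersection) of all sigma with m(sigma) = 1, as in the proof."""
    result = frozenset(weights)          # the unit element has measure 1
    for sigma in powerset(weights):
        if measure(weights, sigma) == 1:
            result &= sigma
    return result

# Assumed example: point "c" carries no weight.
m = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
sigma_s = support(m)
```

In this model the support is simply the set of points of positive weight, and every non-empty subset of it has non-zero measure, as the theorem requires.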
Th. 2.1.6. To each signed $\sigma$-additive measure $x$ on $\Sigma$ there is exactly one
partition $\varepsilon = \sigma_+ \vee \sigma_- \vee \sigma_0$ (that is, $\sigma_+ \wedge \sigma_- = \sigma_+ \wedge \sigma_0 = \sigma_- \wedge \sigma_0 = 0$)
for which $x(\sigma) > 0$ for all $\sigma$ for which $0 \ne \sigma \le \sigma_+$, and $x(\sigma) < 0$ for all $\sigma$ for
which $0 \ne \sigma \le \sigma_-$, and $x(\sigma) = 0$ for all $\sigma \le \sigma_0$.
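PROOF. Let $\Lambda = \{\sigma \mid x(\tilde\sigma) < 0 \text{ for all } \tilde\sigma,\ 0 \ne \tilde\sigma \le \sigma\}$; let $\sigma_- = \bigvee_{\sigma \in \Lambda} \sigma$. We shall now
show that $\sigma_- \in \Lambda$: Let $\tilde\sigma$ satisfy $0 \ne \tilde\sigma \le \sigma_-$. Then there must be a $\sigma_1 \in \Lambda$ for which
$\sigma_1 \wedge \tilde\sigma \ne 0$; thus we obtain $x(\sigma_1 \wedge \tilde\sigma) < 0$. Since $\tilde\sigma \le \sigma_-$ we obtain $\tilde\sigma = \tilde\sigma \wedge \sigma_- =
\bigvee_{\sigma \in \Lambda} (\sigma \wedge \tilde\sigma)$. According to Th. 2.1.1 there exists a countable subset $\{\sigma_\nu\} \subset \Lambda$ for
which $\tilde\sigma = \bigvee_{\nu=1}^{\infty} (\sigma_\nu \wedge \tilde\sigma)$. We may choose $\sigma_1$ such that $x(\sigma_1 \wedge \tilde\sigma) < 0$. We may
rewrite $\tilde\sigma_n = \bigvee_{\nu=1}^{n} (\sigma_\nu \wedge \tilde\sigma)$ recursively in the form:
$$\tilde\sigma_n = (\sigma_1 \wedge \tilde\sigma) \vee \hat\sigma_2 \vee \cdots \vee \hat\sigma_n,$$
where
$$\hat\sigma_m = (\sigma_m \wedge \tilde\sigma) \wedge \Bigl[\bigvee_{\nu=1}^{m-1} (\sigma_\nu \wedge \tilde\sigma)\Bigr]^*,$$
from which it follows that
$$x(\tilde\sigma_n) = x(\sigma_1 \wedge \tilde\sigma) + \sum_{\nu=2}^{n} x(\hat\sigma_\nu).$$
From $\hat\sigma_\nu \le \sigma_\nu$ it follows that $x(\hat\sigma_\nu) \le 0$. Therefore $x(\tilde\sigma_n) \le x(\sigma_1 \wedge \tilde\sigma)$. From
Th. 2.1.3(iv) it follows that $x(\tilde\sigma) \le x(\sigma_1 \wedge \tilde\sigma) < 0$. Thus we obtain $\sigma_- \in \Lambda$.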
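Therefore $\Lambda$ contains a greatest element $\sigma_-$ and $\Lambda = \{\sigma \mid 0 \ne \sigma \le \sigma_-\}$. We shall
now show that $x(\sigma) \ge 0$ for all $\sigma \le \sigma_-^*$. Let $\sigma' \le \sigma_-^*$ and let $x(\sigma') < 0$. Since $\sigma' \le \sigma_-^*$
we find that $\sigma' \notin \Lambda$, from which it follows that there exists a $\sigma_1$ such that
$0 \ne \sigma_1 \le \sigma'$ for which $x(\sigma_1) \ge 0$. Let $n_1$ be the smallest positive integer for which
there exists a $\sigma_1 \le \sigma'$ for which $x(\sigma_1) \ge 1/n_1$. Then
$$x(\sigma' \dotplus \sigma_1) = x(\sigma') - x(\sigma_1) < 0 \quad \text{and} \quad \sigma' \dotplus \sigma_1 \le \sigma' \le \sigma_-^*.$$
We find a similar situation holds for $\sigma' \dotplus \sigma_1$: Let $n_2$ be the smallest positive integer
for which there exists a $\sigma_2$ for which $\sigma_2 \le \sigma' \dotplus \sigma_1$ and $x(\sigma_2) \ge 1/n_2$.
Continuing in a similar fashion, from $\tilde\sigma = \sigma' \dotplus \bigvee_\nu \sigma_\nu$ we obtain
$$x(\tilde\sigma) = x(\sigma') - \sum_\nu x(\sigma_\nu) \le x(\sigma') - \sum_\nu \frac{1}{n_\nu} < 0.$$
Thus we obtain $1/n_\nu \to 0$. If $\sigma \le \tilde\sigma$, then from the construction of the $\sigma_\nu$ it follows that
$x(\sigma) \le 0$. Thus $m(\sigma) = (1/x(\tilde\sigma))\,x(\tilde\sigma \wedge \sigma)$ is a positive $\sigma$-additive measure. On the
support $\sigma_s$ of $m$ we find that $\sigma_s \le \tilde\sigma$; since $m(\sigma) > 0$ for all $\sigma$ for which $0 \ne \sigma \le \sigma_s$ we
obtain $\sigma_s \in \Lambda$ and therefore $\sigma_s \le \sigma_-$, in contradiction to $\sigma_s \le \tilde\sigma \le \sigma' \le \sigma_-^*$. Thus
$m_+(\sigma) = (1/x(\sigma_-^*))\,x(\sigma_-^* \wedge \sigma)$ is a positive $\sigma$-additive measure, for $x(\sigma_-^*) > 0$.
Let $\sigma_+$ be the support of $m_+$. Then $\sigma_+ \le \sigma_-^*$. For all $\sigma$ for which $0 \ne \sigma \le \sigma_+$ we
obtain $x(\sigma) > 0$. If $\sigma \le \sigma_0 = (\sigma_+ \vee \sigma_-)^*$ then $x(\sigma) = x(\sigma \wedge (\sigma_+ \vee \sigma_-)^*) =
x(\sigma \wedge \sigma_+^* \wedge \sigma_-^*) = x(\sigma_-^*)\,m_+(\sigma \wedge \sigma_+^*) = 0$ because $\sigma_+$ is the support of $m_+$.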
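Suppose that, in addition to $\varepsilon = \sigma_+ \vee \sigma_- \vee \sigma_0$, there exists a second partition of
the same type $\varepsilon = \sigma_+' \vee \sigma_-' \vee \sigma_0'$. Then we would obtain $\sigma_-' \in \Lambda$ and therefore
$\sigma_-' \le \sigma_-$. Since $x(\sigma) \ge 0$ for all $\sigma$ for which $\sigma \le \sigma_+' \vee \sigma_0'$, and therefore for all $\sigma$ for which
$\sigma \le (\sigma_+' \vee \sigma_0') \wedge \sigma_-$, we must obtain $(\sigma_+' \vee \sigma_0') \wedge \sigma_- = 0$. Thus we obtain
$\sigma_-' = \sigma_-$ and therefore $\sigma_+ \vee \sigma_0 = \sigma_+' \vee \sigma_0'$; from the uniqueness of the support we
therefore obtain $\sigma_+ = \sigma_+'$ and $\sigma_0 = \sigma_0'$.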
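Th. 2.1.7. Every signed $\sigma$-additive measure $x$ on $\Sigma$ may be written in the form
$x = \alpha m_1 - \beta m_2$ where $\alpha, \beta \ge 0$ and $m_1, m_2$ are $\sigma$-additive measures such
that the supports $\sigma_{s1}, \sigma_{s2}$ of $m_1, m_2$ satisfy the relation $\sigma_{s1} \wedge \sigma_{s2} = 0$; $\alpha$, $\beta$,
$m_1$, and $m_2$ are uniquely determined.
PROOF. From the partition $\varepsilon = \sigma_+ \vee \sigma_- \vee \sigma_0$ from Th. 2.1.6 we obtain the
following result: With
$$m_1(\sigma) = \frac{1}{x(\sigma_+)}\,x(\sigma_+ \wedge \sigma), \qquad m_2(\sigma) = \frac{1}{x(\sigma_-)}\,x(\sigma_- \wedge \sigma),$$
$$\alpha = x(\sigma_+), \qquad \beta = -x(\sigma_-),$$
we obtain the desired decomposition and $\sigma_{s1} = \sigma_+$, $\sigma_{s2} = \sigma_-$.
Conversely, suppose that we are given a decomposition which satisfies Th. 2.1.7. Since
$x(\sigma) < 0$ for all $\sigma$ for which $0 \ne \sigma \le \sigma_{s2}$, $\sigma_{s2} \le \sigma_-$ is obtained from Th. 2.1.6. Since
$x(\sigma) = \alpha m_1(\sigma) \ge 0$ for all $\sigma$ for which $\sigma \le \sigma_- \wedge \sigma_{s2}^*$, we must have $\sigma_- \wedge \sigma_{s2}^* = 0$.
Therefore $\sigma_{s2} = \sigma_-$. Thus we obtain $\alpha m_1(\sigma) = x(\sigma_-^* \wedge \sigma)$ and therefore obtain
$\sigma_{s1} = \sigma_+$.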
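The partition of Th. 2.1.6 and the decomposition $x = \alpha m_1 - \beta m_2$ of Th. 2.1.7 become elementary in a finite model, where a signed measure is given by signed point weights. The sketch below is an illustration under that assumption; `jordan_decomposition` is a hypothetical name, and the example weights are invented.

```python
from fractions import Fraction

def jordan_decomposition(x):
    """Split signed point weights x into (sigma_+, sigma_-, sigma_0, alpha, beta, m1, m2).

    sigma_+ / sigma_- / sigma_0 collect the points of positive / negative / zero
    weight; alpha = x(sigma_+), beta = -x(sigma_-); m1, m2 are the normalized
    measures (None if the corresponding part vanishes).
    """
    pos = frozenset(p for p, v in x.items() if v > 0)
    neg = frozenset(p for p, v in x.items() if v < 0)
    zero = frozenset(x) - pos - neg
    alpha = sum((x[p] for p in pos), Fraction(0))
    beta = -sum((x[p] for p in neg), Fraction(0))
    m1 = {p: (x[p] / alpha if p in pos else Fraction(0)) for p in x} if alpha else None
    m2 = {p: (-x[p] / beta if p in neg else Fraction(0)) for p in x} if beta else None
    return pos, neg, zero, alpha, beta, m1, m2

# Assumed example of a signed measure on four points.
x = {"a": Fraction(3, 5), "b": Fraction(-3, 10), "c": Fraction(0), "d": Fraction(1, 10)}
pos, neg, zero, alpha, beta, m1, m2 = jordan_decomposition(x)
```

In this finite setting $\alpha + \beta$ coincides with the sum of the absolute values of the weights, which is the value that Th. 2.1.9 assigns to the norm $\|x\|$.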
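D 2.1.3. Let $\mathcal{B}(\Sigma)$ denote the set of all signed $\sigma$-additive measures on $\Sigma$; let
$K(\Sigma)$ denote the set of all $\sigma$-additive measures on $\Sigma$.
For every finite set $\{x_\nu\} \subset \mathcal{B}(\Sigma)$ the sum $x = \sum_\nu a_\nu x_\nu$ is a signed $\sigma$-additive
measure. This result, together with Th. 2.1.7, yields the following theorem:
Th. 2.1.8. $\mathcal{B}(\Sigma)$ is a linear vector space. The linear hull of $K(\Sigma)$ is $\mathcal{B}(\Sigma)$.
Th. 2.1.9. The absolute convex set generated by $K(\Sigma)$, that is, the set
$V = \bigcup_{0 \le \lambda \le 1} [\lambda K(\Sigma) - (1 - \lambda)K(\Sigma)]$, defines a norm in $\mathcal{B}(\Sigma)$ by means of
its Minkowski functional.¹ For this norm $\|x\| = \alpha + \beta$ where $\alpha, \beta$ are
uniquely defined by Th. 2.1.7. With the positive cone $\mathcal{B}_+(\Sigma)$ given by
$$\mathcal{B}_+(\Sigma) = \{x \mid x(\sigma) \ge 0 \text{ for all } \sigma \in \Sigma\}$$
$\mathcal{B}(\Sigma)$ becomes an ordered vector space. For $x \ge 0$ we obtain $\|x\| = x(\varepsilon)$; for
$x \ge 0$, $x \ne 0$ we obtain $x = x(\varepsilon)m$ where $m = (1/x(\varepsilon))x \in K(\Sigma)$. Therefore $K(\Sigma)$ is
the base for the cone $\mathcal{B}_+(\Sigma)$ and $\mathcal{B}(\Sigma)$ is a base-norm space (see AIII, §6).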
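PROOF. It suffices to show that the Minkowski functional for $V$ is equal to $\alpha + \beta$
where $\alpha, \beta$ are obtained from Th. 2.1.7, because $V$ is then absorbing and the
Minkowski functional is equal to zero only if $\alpha = \beta = 0$, that is, $x = 0$.
The Minkowski functional for $V$ is equal to $\alpha + \beta$ since, if $x = \alpha m_1 - \beta m_2 =
\lambda[\mu \tilde m_1 - (1 - \mu)\tilde m_2]$ where $\alpha, \beta, m_1, m_2$ are defined in Th. 2.1.7 and $0 \le \mu \le 1$,
$\tilde m_1, \tilde m_2 \in K(\Sigma)$, it follows that $\lambda \ge \alpha + \beta$. Let $\sigma_{s1}, \sigma_{s2}$ be the supports of $m_1$ and $m_2$,
respectively. Then it follows that
$$\alpha + \beta = x(\sigma_{s1}) - x(\sigma_{s2}) = \lambda[\mu(\tilde m_1(\sigma_{s1}) - \tilde m_1(\sigma_{s2}))$$
$$\qquad - (1 - \mu)(\tilde m_2(\sigma_{s1}) - \tilde m_2(\sigma_{s2}))].$$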
1. If $V$ is convex and if $x \in V$ implies also $-x \in V$, the Minkowski functional is defined by
$p(x) = \inf\{\lambda \mid \lambda^{-1}x \in V\}$.
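Since $\tilde m_1(\sigma_{s1}) + \tilde m_1(\sigma_{s2}) = \tilde m_1(\sigma_{s1} \vee \sigma_{s2}) \le \tilde m_1(\varepsilon) = 1$ and since $\tilde m_2(\sigma_{s1}) + \tilde m_2(\sigma_{s2}) \le 1$
it follows that
$$|\mu(\tilde m_1(\sigma_{s1}) - \tilde m_1(\sigma_{s2})) - (1 - \mu)(\tilde m_2(\sigma_{s1}) - \tilde m_2(\sigma_{s2}))| \le 1.$$
Thus $\lambda \ge \alpha + \beta$. The remainder of the theorem is easy to prove.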
Th. 2.1.10. $\mathcal{B}(\Sigma)$ is a Banach space.
PROOF. Let $x_n$ be a Cauchy sequence in $\mathcal{B}(\Sigma)$. Let $x = \alpha m_1 - \beta m_2$ (where
$\alpha, \beta, m_1, m_2$ are given by Th. 2.1.7). It then follows that $|x(\sigma)| \le \alpha m_1(\sigma) + \beta m_2(\sigma) \le
\alpha + \beta = \|x\|$. Therefore the real numbers $x_n(\sigma)$, for fixed $\sigma$, form a Cauchy
sequence. From $x_n(\sigma) \to x(\sigma)$ a real function $x$ is defined on $\Sigma$ which (as may easily
be shown) is additive. From
$$|x_n(\sigma)| \le |x_n(\sigma) - x_m(\sigma)| + |x_m(\sigma)| \le \|x_n - x_m\| + \|x_m\|$$
it follows that there exists a real number $c$ for which $|x_n(\sigma)| \le c$ for all $n$ and $\sigma$.
Therefore $|x(\sigma)| \le c$, that is, $x$ is a signed additive measure on $\Sigma$ for which $|x(\sigma)| \le c$
for all $\sigma \in \Sigma$.
We shall now show that $x$ is also $\sigma$-additive, and is therefore an element of $\mathcal{B}(\Sigma)$.
According to Th. 2.1.3 $x$ is $\sigma$-additive if, for any decreasing sequence $\sigma_\nu$ satisfying
$\bigwedge_\nu \sigma_\nu = 0$, it follows that $x(\sigma_\nu) \to 0$. In the following estimate
$$|x(\sigma_\nu)| \le |x(\sigma_\nu) - x_n(\sigma_\nu)| + |x_n(\sigma_\nu) - x_m(\sigma_\nu)| + |x_m(\sigma_\nu)|$$
$$\qquad \le |x(\sigma_\nu) - x_n(\sigma_\nu)| + \|x_n - x_m\| + |x_m(\sigma_\nu)|$$
we may choose $N$ such that, for $n, m > N$, $\|x_n - x_m\| < \epsilon$. Next, we hold $m > N$
fixed, and choose $\nu$ so large that $|x_m(\sigma_\nu)| < \epsilon$ ($x_m$ is $\sigma$-additive). Then, for fixed $\nu$ we
choose $n$ so large that $|x(\sigma_\nu) - x_n(\sigma_\nu)| < \epsilon$. Therefore, we obtain $|x(\sigma_\nu)| \to 0$.
Let $\sigma_+^n, \sigma_-^n$ be the elements which, according to Th. 2.1.6, correspond to the
signed measures $(x_n - x)$. Then by Th. 2.1.9 and Th. 2.1.7 we obtain
$$\|x_n - x\| = x_n(\sigma_+^n) - x(\sigma_+^n) - [x_n(\sigma_-^n) - x(\sigma_-^n)]$$
$$\qquad \le |x_n(\sigma_+^n) - x_m(\sigma_+^n)| + |x_m(\sigma_+^n) - x(\sigma_+^n)|$$
$$\qquad\quad + |x_n(\sigma_-^n) - x_m(\sigma_-^n)| + |x_m(\sigma_-^n) - x(\sigma_-^n)|$$
$$\qquad \le 2\|x_n - x_m\| + |x_m(\sigma_+^n) - x(\sigma_+^n)| + |x_m(\sigma_-^n) - x(\sigma_-^n)|.$$
We now choose $N$ such that $\|x_n - x_m\| < \epsilon/2$ for all $n, m > N$. Then, holding $n$
fixed, we choose $m$ so large that
$$|x_m(\sigma_+^n) - x(\sigma_+^n)| < \epsilon, \qquad |x_m(\sigma_-^n) - x(\sigma_-^n)| < \epsilon.$$
From this, it follows that $\|x_n - x\| \to 0$.
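Th. 2.1.11. Let $\tilde m$ be an effective $\sigma$-additive measure. Then
$$K(\Sigma) = \overline{\mathrm{co}}\{\tilde m_{\tilde\sigma} \mid \tilde m_{\tilde\sigma}(\sigma) = \tilde m(\tilde\sigma)^{-1}\tilde m(\sigma \wedge \tilde\sigma),\ \tilde\sigma \in \Sigma,\ \tilde\sigma \ne 0\},$$
where $\overline{\mathrm{co}}\{\ldots\}$ is the norm-closure of $\mathrm{co}\{\ldots\}$, that is, the convex subset of
$K(\Sigma)$ generated by all the measures $\tilde m_{\tilde\sigma}$ is dense (in norm) in $K(\Sigma)$.
Let $m_1, m_2$ be two effective $\sigma$-additive measures. Then the corresponding
metrics $d_1(\sigma_1, \sigma_2) = m_1(\sigma_1 \dotplus \sigma_2)$ and $d_2(\sigma_1, \sigma_2) = m_2(\sigma_1 \dotplus \sigma_2)$ are equivalent,
that is, the uniform structure $U_g$ of $\Sigma$ will be generated by
$d(\sigma_1, \sigma_2) = m(\sigma_1 \dotplus \sigma_2)$ where $m$ is any effective $\sigma$-additive measure.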
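Proof. For $m \in K(\Sigma)$ and for $\tilde m$ as defined in Th. 2.1.11, the $x_\lambda = m - \lambda\tilde m$, for real
numbers $\lambda \ge 0$, are elements of $\mathcal{B}(\Sigma)$. For each $x_\lambda$ we may define $\sigma_-(\lambda)$ according to
Th. 2.1.6; clearly $\lambda_1 > \lambda_2$ implies $\sigma_-(\lambda_1) \ge \sigma_-(\lambda_2)$. We shall now show that
$\bigvee_{\lambda > 0} \sigma_-(\lambda) = \varepsilon$. Suppose not; then, for $\sigma = [\bigvee_{\lambda > 0} \sigma_-(\lambda)]^* \ne 0$ we would obtain
$\sigma \le \sigma_-(\lambda)^*$ for all $\lambda > 0$, that is, $x_\lambda(\sigma) \ge 0$ for all $\lambda > 0$. Therefore we would obtain
$m(\sigma) \ge \lambda\tilde m(\sigma)$ for all $\lambda > 0$.
Since $\tilde m(\sigma) \ne 0$, this contradicts the fact that $m(\sigma) \le 1$. Let $\delta > 0$; then define
$\lambda_n = n\delta$ for $n = 0, 1, 2, \ldots$ and $\sigma_n = \sigma_-(n\delta) \dotplus \sigma_-((n - 1)\delta)$. Thus we obtain
$\sigma_n \wedge \sigma_m = 0$ for $n \ne m$ and $\bigvee_{n=1}^{\infty} \sigma_n = \varepsilon$.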
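Let $\tilde m_n(\sigma) = \tilde m(\sigma_n)^{-1}\tilde m(\sigma \wedge \sigma_n)$; we define
$$x_{\delta,N} = \sum_{n=1}^{N} (n - 1)\delta\,\tilde m(\sigma_n)\,\tilde m_n.$$
Clearly $x_{\delta,N} \in \mathcal{B}_+(\Sigma)$ and $\tilde m_n \in K(\Sigma)$. Since
$$m(\sigma) = \sum_{n=1}^{\infty} m(\sigma \wedge \sigma_n)$$
it follows that
$$m(\sigma) - x_{\delta,N}(\sigma) = \sum_{n=1}^{N} [m(\sigma \wedge \sigma_n) - (n - 1)\delta\,\tilde m(\sigma \wedge \sigma_n)] + \sum_{n=N+1}^{\infty} m(\sigma \wedge \sigma_n).$$
We obtain $\sigma \wedge \sigma_n \le \sigma_-(n\delta)$ and $\sigma \wedge \sigma_n \le \sigma_-((n - 1)\delta)^*$, from which it follows that
$$m(\sigma \wedge \sigma_n) - n\delta\,\tilde m(\sigma \wedge \sigma_n) \le 0,$$
$$m(\sigma \wedge \sigma_n) - (n - 1)\delta\,\tilde m(\sigma \wedge \sigma_n) \ge 0.$$
From this we obtain
$$0 \le m(\sigma \wedge \sigma_n) - (n - 1)\delta\,\tilde m(\sigma \wedge \sigma_n) \le \delta\,\tilde m(\sigma \wedge \sigma_n)$$
and
$$0 \le m(\sigma) - x_{\delta,N}(\sigma) \le \delta \sum_{n=1}^{N} \tilde m(\sigma \wedge \sigma_n) + \sum_{n=N+1}^{\infty} m(\sigma \wedge \sigma_n)
\le \delta\,\tilde m(\sigma) + \sum_{n=N+1}^{\infty} m(\sigma \wedge \sigma_n).$$
Since $0 \le m(\sigma) - x_{\delta,N}(\sigma)$ we obtain $\|m - x_{\delta,N}\| = m(\varepsilon) - x_{\delta,N}(\varepsilon)$ (see Th. 1.2.9).
It follows that
$$\|m - x_{\delta,N}\| \le \delta + \sum_{n=N+1}^{\infty} m(\sigma_n).$$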
We may now choose $\delta < \epsilon/4$, and $N$ so large that $\sum_{n=N+1}^{\infty} m(\sigma_n) < \epsilon/4$, where the
latter is possible because of the convergence of the series
$$\sum_{n=1}^{\infty} m(\sigma_n) = m(\varepsilon) = 1.$$
Therefore we obtain $\|m - x_{\delta,N}\| < \epsilon/2$. Since $\|m\| = 1$ it follows that $1 - \epsilon/2 \le
\|x_{\delta,N}\| \le 1 + \epsilon/2$ and
$$\Bigl\| m - \frac{x_{\delta,N}}{\|x_{\delta,N}\|} \Bigr\| < \epsilon.$$
Since $x_{\delta,N} \ge 0$, $m_{\delta,N} = x_{\delta,N}/\|x_{\delta,N}\|$ is an element of $K(\Sigma)$ and $m_{\delta,N}$ is a convex
linear combination of the $\tilde m_n$.
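The layered approximation $x_{\delta,N}$ can be made concrete in a finite model. The sketch below is an illustration under assumptions: measures are point weights on a finite set, $\tilde m$ has strictly positive weights (effectiveness), a point lies in the layer $\sigma_n$ when $n$ is the smallest integer with $m - n\delta\tilde m < 0$ at that point, and the norm is taken as the sum of absolute weights (the finite-model form of $\alpha + \beta$). All function names are hypothetical.

```python
from fractions import Fraction

def layer_index(m_p, mt_p, delta):
    """Smallest n >= 1 with m_p - n*delta*mt_p < 0: the layer sigma_n of the point.

    Terminates because mt_p > 0 (mt is assumed effective).
    """
    n = 1
    while m_p - n * delta * mt_p >= 0:
        n += 1
    return n

def approximation(m, mt, delta, N):
    """Point weights of x_{delta,N}: (n-1)*delta*mt on layer n <= N, zero beyond."""
    x = {}
    for p in m:
        n = layer_index(m[p], mt[p], delta)
        x[p] = (n - 1) * delta * mt[p] if n <= N else Fraction(0)
    return x

def distance(m, x):
    """Norm of m - x in the finite model: sum of absolute weight differences."""
    return sum(abs(m[p] - x[p]) for p in m)

def tail(m, mt, delta, N):
    """Sum of m(sigma_n) over the layers n > N."""
    return sum((m[p] for p in m if layer_index(m[p], mt[p], delta) > N), Fraction(0))

# Assumed example: mt uniform on four points, m concentrated unevenly.
mt = {p: Fraction(1, 4) for p in range(4)}
m = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(0)}
delta = Fraction(3, 10)
```

For this example, `distance(m, approximation(m, mt, delta, 10))` stays below `delta`, and truncating at a smaller `N` adds at most the tail, in line with the estimate $\|m - x_{\delta,N}\| \le \delta + \sum_{n>N} m(\sigma_n)$.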
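Suppose that $m$ is effective. We shall now show that the metrics
$d(\sigma_1, \sigma_2) = m(\sigma_1 \dotplus \sigma_2)$ and $\tilde d(\sigma_1, \sigma_2) = \tilde m(\sigma_1 \dotplus \sigma_2)$ are equivalent. By symmetry
between $m$ and $\tilde m$ we need only show that, for each $\epsilon > 0$ there exists an $\tilde\epsilon > 0$
such that $\tilde d(\sigma_1, \sigma_2) < \tilde\epsilon$ implies that $d(\sigma_1, \sigma_2) < \epsilon$. For $x_{\delta,N}$, as defined above, we
obtain
$$m(\sigma_1 \dotplus \sigma_2) = m(\sigma_1 \dotplus \sigma_2) - x_{\delta,N}(\sigma_1 \dotplus \sigma_2) + x_{\delta,N}(\sigma_1 \dotplus \sigma_2)$$
$$\qquad \le \|m - x_{\delta,N}\| + \sum_{n=1}^{N} (n - 1)\delta\,\tilde m(\sigma_n \cdot \sigma_1 \dotplus \sigma_n \cdot \sigma_2)$$
$$\qquad \le \|m - x_{\delta,N}\| + \Bigl[\sum_{n=1}^{N} (n - 1)\delta\Bigr]\tilde m(\sigma_1 \dotplus \sigma_2).$$
We now choose $x_{\delta,N}$ (and therefore choose $\delta$, $N$) such that $\|m - x_{\delta,N}\| < \epsilon/2$.
Then we choose $\tilde\epsilon < (\epsilon/2)\big/\sum_{n=1}^{N} (n - 1)\delta$. Then, for $\tilde d(\sigma_1, \sigma_2) < \tilde\epsilon$ it follows that
$d(\sigma_1, \sigma_2) < \epsilon$.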
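Th. 2.1.12. $\mathcal{B}(\Sigma)$ is separable if $\Sigma$ is separable.
PROOF. Suppose $\{\sigma_\nu\}$ is a countable set which is dense in $\Sigma$, that is, to each $\tilde\sigma \in \Sigma$,
for arbitrary $\epsilon > 0$ there exists a $\sigma_\nu$ such that $d(\tilde\sigma, \sigma_\nu) = m_0(\tilde\sigma \dotplus \sigma_\nu) < \epsilon$. Let
$m_{\tilde\sigma}(\sigma) = m_0(\tilde\sigma)^{-1}m_0(\tilde\sigma \wedge \sigma)$ where $m_0$ is the effective measure (see the beginning of
§2.1). According to Th. 2.1.8 and Th. 2.1.11 it suffices to show that, to each $m_{\tilde\sigma}$
there exists a $m_{\sigma_\nu}$ for which $\|m_{\tilde\sigma} - m_{\sigma_\nu}\|$ is arbitrarily small. This fact is a
consequence of the following estimates, in which $m_0(\tilde\sigma \wedge \sigma)$ denotes the measure $\sigma \to m_0(\tilde\sigma \wedge \sigma)$:
$$\|m_{\tilde\sigma} - m_{\sigma_\nu}\| \le \frac{1}{m_0(\tilde\sigma)}\|m_0(\tilde\sigma \wedge \sigma) - m_0(\sigma_\nu \wedge \sigma)\| + \Bigl|\frac{1}{m_0(\tilde\sigma)} - \frac{1}{m_0(\sigma_\nu)}\Bigr|\,\|m_0(\sigma_\nu \wedge \sigma)\|$$
$$\qquad \le \frac{1}{m_0(\tilde\sigma)}\|m_0(\tilde\sigma \wedge \sigma) - m_0(\sigma_\nu \wedge \sigma)\| + \Bigl|\frac{1}{m_0(\tilde\sigma)} - \frac{1}{m_0(\sigma_\nu)}\Bigr|.$$
From
$$m_0(\tilde\sigma \wedge \sigma) - m_0(\sigma_\nu \wedge \sigma) = m_0(\tilde\sigma \wedge \sigma_\nu \wedge \sigma) + m_0((\tilde\sigma \dotplus \tilde\sigma \wedge \sigma_\nu) \wedge \sigma)$$
$$\qquad - m_0(\tilde\sigma \wedge \sigma_\nu \wedge \sigma) - m_0((\sigma_\nu \dotplus \tilde\sigma \wedge \sigma_\nu) \wedge \sigma)$$
it follows that
$$\|m_0(\tilde\sigma \wedge \sigma) - m_0(\sigma_\nu \wedge \sigma)\| \le \|m_0((\tilde\sigma \dotplus \tilde\sigma \wedge \sigma_\nu) \wedge \sigma)\| + \|m_0((\sigma_\nu \dotplus \tilde\sigma \wedge \sigma_\nu) \wedge \sigma)\|$$
$$\qquad = m_0(\tilde\sigma \dotplus \tilde\sigma \cdot \sigma_\nu) + m_0(\sigma_\nu \dotplus \tilde\sigma \cdot \sigma_\nu) = d(\tilde\sigma, \sigma_\nu).$$
Therefore we obtain
$$\|m_{\tilde\sigma} - m_{\sigma_\nu}\| \le \frac{1}{m_0(\tilde\sigma)}\,d(\tilde\sigma, \sigma_\nu) + \Bigl|\frac{1}{m_0(\tilde\sigma)} - \frac{1}{m_0(\sigma_\nu)}\Bigr|.$$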
We shall now examine the properties of the dual Banach space $\mathcal{B}'(\Sigma)$
corresponding to $\mathcal{B}(\Sigma)$.
Let $l \in \mathcal{B}'(\Sigma)$, that is, $l$ is a bounded linear functional over $\mathcal{B}(\Sigma)$. The norm
of $l$ is given by
$$\|l\| = \sup_{\|x\| \le 1} |l(x)|.$$
It is easy to show that
$$\|l\| = \sup\{|l(m)| \mid m \in K(\Sigma)\}.$$
The positive cone $\mathcal{B}'_+(\Sigma)$ in $\mathcal{B}'(\Sigma)$ is defined as the set of all $l$ for which
$l(m) \ge 0$ for $m \in K(\Sigma)$.
The unit sphere in $\mathcal{B}'(\Sigma)$ may then be described by the order interval
$[-\mathbf{1}, \mathbf{1}]$ where $\mathbf{1}$ is the functional for which $\mathbf{1}(m) = 1$ for all $m \in K(\Sigma)$.
Therefore $\mathbf{1}(x) = \alpha - \beta$ (where $\alpha, \beta$ are defined in Th. 2.1.7).
The unit sphere $[-\mathbf{1}, \mathbf{1}]$ is $\sigma(\mathcal{B}', \mathcal{B})$-compact. Thus it satisfies the Krein-Milman
theorem (see AIII, §4) which says that a compact convex set is spanned by its
set of extreme points, here $\partial_e[-\mathbf{1}, \mathbf{1}]$. The unit sphere is isomorphic to the set
$L(\Sigma) = [0, \mathbf{1}]$. What are the elements of $\partial_e L(\Sigma)$?
For fixed $\sigma \in \Sigma$, $l_\sigma(m) = m(\sigma)$ is a bounded linear functional which satisfies
$l_\sigma \in L(\Sigma)$. We shall now show that $\partial_e L(\Sigma)$ is precisely the set of all $l_\sigma$ where
$\sigma \in \Sigma$.
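Th. 2.1.13. $l_\sigma \in \partial_e L(\Sigma)$; $\sigma_1 \ne \sigma_2 \Rightarrow l_{\sigma_1} \ne l_{\sigma_2}$.
Proof. Let $l_{\sigma_1} = \lambda l + (1 - \lambda)l'$ where $0 < \lambda < 1$ and $l, l' \in L(\Sigma)$. Then, for all $m$ for which
$m(\sigma_1) = 1$ it follows that $l(m) = 1$; for all $m$ satisfying $m(\sigma_1) = 0$ it follows that
$l(m) = 0$. Each $m$ can be written as follows:
$$m = m(\sigma_1)\,m_{\sigma_1} + m(\sigma_1^*)\,m_{\sigma_1^*}$$
where
$$m_{\tilde\sigma}(\sigma) = \frac{m(\tilde\sigma \wedge \sigma)}{m(\tilde\sigma)}.$$
Therefore, since $l(m_{\sigma_1}) = 1$ and $l(m_{\sigma_1^*}) = 0$ it follows that
$$l(m) = m(\sigma_1)\,l(m_{\sigma_1}) + m(\sigma_1^*)\,l(m_{\sigma_1^*}) = m(\sigma_1) = l_{\sigma_1}(m).$$
For the second statement,
$$l_{\sigma_1} = l_{\sigma_2} \Rightarrow m_0(\sigma_1)^{-1}m_0(\sigma_1 \wedge \sigma_1) = m_0(\sigma_1)^{-1}m_0(\sigma_1 \wedge \sigma_2)$$
$$\qquad \Rightarrow m_0(\sigma_1 \dotplus \sigma_1 \wedge \sigma_2) = 0 \Rightarrow \sigma_1 = \sigma_1 \wedge \sigma_2$$
and similarly $\sigma_2 = \sigma_1 \wedge \sigma_2$, that is, $\sigma_1 = \sigma_2$.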
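The fact that each element of $\partial_e L(\Sigma)$ is of the form $l_\sigma$ is obtained as a
corollary of the important "spectral theorem" for $l \in \mathcal{B}'(\Sigma)$. In order to prove
this fact we introduce the following definition and notation:
Let the canonical dual form for $\mathcal{B}(\Sigma)$, $\mathcal{B}'(\Sigma)$ be denoted by $(x, y)$. Therefore every
linear functional $y \in \mathcal{B}'(\Sigma)$ may be expressed as a function over $\mathcal{B}(\Sigma)$ in terms of
$(x, y)$ for fixed $y$. Let
$$s(y \mid \sigma) = \sup\{(m, y) \mid m \in K(\Sigma) \text{ and } m(\sigma) = 1\},$$
$$\Gamma_\alpha(y) = \{\sigma \mid s(y \mid \sigma) \le \alpha\},$$
$$\sigma(y \le \alpha) = \bigvee_{\sigma \in \Gamma_\alpha(y)} \sigma. \qquad (2.1.2)$$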
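We claim that $\sigma(y \le \alpha) \in \Gamma_\alpha(y)$, that is, $\sigma(y \le \alpha)$ is the largest element of $\Gamma_\alpha(y)$.
In order to show this result, we shall first show that, if $\sigma_1, \sigma_2 \in \Gamma_\alpha$ then
$\sigma_1 \vee \sigma_2 \in \Gamma_\alpha$.
The following is obvious:
$$s(y \mid \sigma_1 \vee \sigma_2) \ge s(y \mid \sigma_1), \qquad s(y \mid \sigma_1 \vee \sigma_2) \ge s(y \mid \sigma_2). \qquad (2.1.3)$$
Let $\tilde\sigma_1 = \sigma_1 \dotplus \sigma_1 \cdot \sigma_2 \le \sigma_1$; we then obtain $\sigma_1 \vee \sigma_2 = \tilde\sigma_1 \vee \sigma_2$ and
$\tilde\sigma_1 \wedge \sigma_2 = 0$ from which we conclude that
$$m(\sigma_1 \vee \sigma_2) = m(\tilde\sigma_1) + m(\sigma_2).$$
Suppose that $m(\sigma_1 \vee \sigma_2) = 1$. If $m(\tilde\sigma_1) = 1$ (or $m(\tilde\sigma_1) = 0$ and therefore
$m(\sigma_2) = 1$) it follows that $(m, y) \le s(y \mid \sigma_1)$ (or $\le s(y \mid \sigma_2)$). If $m(\tilde\sigma_1) \ne 1$, $\ne 0$,
then, setting $\lambda = m(\tilde\sigma_1)$ we obtain:
$$m_1(\sigma) = \frac{1}{\lambda}\,m(\tilde\sigma_1 \wedge \sigma), \qquad m_2(\sigma) = \frac{1}{1 - \lambda}\,m(\sigma_2 \wedge \sigma);$$
$m_1, m_2$ are elements of $K(\Sigma)$ and $m_1(\sigma_1) = 1$, $m_2(\sigma_2) = 1$. Since
$m(\sigma_1 \vee \sigma_2) = 1$ we obtain
$$m(\sigma) = m((\sigma_1 \vee \sigma_2) \wedge \sigma) = \lambda m_1(\sigma) + (1 - \lambda)m_2(\sigma).$$
From this result it follows that
$$(m, y) = \lambda(m_1, y) + (1 - \lambda)(m_2, y) \le \lambda s(y \mid \sigma_1) + (1 - \lambda)s(y \mid \sigma_2).$$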
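This result also holds (as we have seen above) for $\lambda = 0$ and $\lambda = 1$. Therefore we obtain:
$$s(y \mid \sigma_1 \vee \sigma_2) \le \sup_{0 \le \lambda \le 1} [\lambda\, s(y \mid \sigma_1) + (1 - \lambda)\, s(y \mid \sigma_2)] = \max[s(y \mid \sigma_1), s(y \mid \sigma_2)].$$
Combining this result with (2.1.3) we obtain:
$$s(y \mid \sigma_1 \vee \sigma_2) = \max[s(y \mid \sigma_1), s(y \mid \sigma_2)]$$
thus proving that, if $\sigma_1, \sigma_2 \in \Gamma_\alpha$ then $\sigma_1 \vee \sigma_2 \in \Gamma_\alpha$. Therefore we have shown that, for $\sigma_\nu \in \Gamma_\alpha$ we obtain
$$\bigvee_{\nu = 1}^{m} \sigma_\nu \in \Gamma_\alpha.$$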
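This result, together with Th. 2.1.1, shows that there is an increasing sequence $\sigma_n \in \Gamma_\alpha$ such that $\sigma(y \le \alpha) = \bigvee_n \sigma_n$. Let $m \in K(\Sigma)$ satisfy $m(\sigma(y \le \alpha)) = 1$. Then, from Th. 2.1.3(iv) it follows that
$$m(\sigma_n) \to 1. \qquad (2.1.4)$$
Clearly, only finitely many $m(\sigma_n)$ can be equal to $0$. These can be omitted, and we may require that $m(\sigma_n) \neq 0$. Then
$$m_n(\sigma) = \frac{m(\sigma_n \wedge \sigma)}{m(\sigma_n)}$$
is an element of $K(\Sigma)$ which satisfies $m_n(\sigma_n) = 1$. For the complement $\sigma_n^*$ of $\sigma_n$ we find that
$$m_n^*(\sigma) = \frac{m(\sigma_n^* \wedge \sigma)}{m(\sigma_n^*)}$$
is an element of $K(\Sigma)$ if $m(\sigma_n) \neq 1$.
If $m(\sigma_n) = 1$ then $(m, y) \le s(y \mid \sigma_n) \le \alpha$. Let $m(\sigma_n) \neq 1$ for all $n$. Then from
$$m = m(\sigma_n)\, m_n + (1 - m(\sigma_n))\, m_n^*$$
it follows that
$$(m, y) = m(\sigma_n)(m_n, y) + (1 - m(\sigma_n))(m_n^*, y) \le m(\sigma_n)\, s(y \mid \sigma_n) + (1 - m(\sigma_n))\, \|y\| \le m(\sigma_n)\, \alpha + (1 - m(\sigma_n))\, \|y\|.$$
Therefore, together with (2.1.4) it follows that (again, as in the case in which $m(\sigma_n) = 1$) $(m, y) \le \alpha$. For an $m \in K(\Sigma)$ for which $m(\sigma(y \le \alpha)) = 1$ we thus obtain $(m, y) \le \alpha$, that is, $\sigma(y \le \alpha) \in \Gamma_\alpha$.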
Thus, we obtain
$$\alpha_1 \ge \alpha_2 \Rightarrow \sigma(y \le \alpha_1) \ge \sigma(y \le \alpha_2). \qquad (2.1.5)$$
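The construction of $s(y \mid \sigma)$, $\Gamma_\alpha(y)$ and $\sigma(y \le \alpha)$ can be made concrete in a small finite model. The following is only an illustrative sketch under assumptions that are not part of the text: $\Sigma$ is taken to be the set of all subsets of a four-point set `X`, $y$ is a real function on `X`, and the names `subsets`, `s`, `gamma` and `spectral` are hypothetical helpers. In this model the supremum over measures with $m(\sigma) = 1$ is attained by a point mass, so it reduces to a maximum.

```python
from itertools import combinations

# Hypothetical finite model: Sigma = all subsets of X, y = a real function on X.
X = range(4)
y = {0: -1.0, 1: 0.25, 2: 0.25, 3: 2.0}

def subsets(points):
    """All subsets of a finite set, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def s(y, sigma):
    """s(y | sigma): sup of (m, y) over probability measures m with m(sigma) = 1.
    On a finite set this is the maximum of y over sigma."""
    return max(y[x] for x in sigma)

def gamma(y, alpha):
    """Gamma_alpha(y): the zero element together with all sigma with s(y | sigma) <= alpha."""
    return [sig for sig in subsets(X) if not sig or s(y, sig) <= alpha]

def spectral(y, alpha):
    """sigma(y <= alpha): the join (here the union) of all elements of Gamma_alpha(y)."""
    out = frozenset()
    for sig in gamma(y, alpha):
        out |= sig
    return out
```

In this sketch `spectral(y, alpha)` is simply the set of points on which `y` does not exceed `alpha`; it belongs to `gamma(y, alpha)` and grows with `alpha`, in line with (2.1.5).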
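Let $\sigma^*(y \le \alpha)$ denote the complement of $\sigma(y \le \alpha)$. We will now show that the relation $(m, y) > \alpha$ is satisfied for all $m \in K(\Sigma)$ which satisfy $m(\sigma^*(y \le \alpha)) = 1$.
Suppose that there exists an $m \in K(\Sigma)$ which satisfies $m(\sigma^*(y \le \alpha)) = 1$ and $(m, y) \le \alpha$. Let $\tilde y = y - \alpha 1$; then $(m, y) \le \alpha$ is equivalent to $(m, \tilde y) \le 0$. Let $\tilde\sigma$ be the support of $m$ (see D 2.1.2). Then $\tilde\sigma \le \sigma^*(y \le \alpha)$, $m(\tilde\sigma) = 1$ and $m(\sigma) \neq 0$ for all $\sigma$ satisfying $0 \neq \sigma \le \tilde\sigma$.
Let $n_1$ be the smallest positive integer for which there exists a $\sigma_1$ satisfying $\sigma_1 \le \tilde\sigma$ and $(x_{\sigma_1}, \tilde y) > 1/n_1$, where $x_{\sigma_1}(\sigma) = m(\sigma_1 \wedge \sigma)$. For $\tilde\sigma \dotplus \sigma_1$ we may make the same assumption as we made for $\tilde\sigma$: Let $n_2$ be the smallest positive integer for which there exists a $\sigma_2 \le \tilde\sigma \dotplus \sigma_1$ such that $(x_{\sigma_2}, \tilde y) > 1/n_2$. Proceeding in this fashion we obtain a sequence $\sigma_\nu$ for which $\sigma_\nu \wedge \sigma_\mu = 0$ for $\nu \neq \mu$ such that $\bigvee_\nu \sigma_\nu \le \tilde\sigma$.
Let $\bar\sigma = \tilde\sigma \dotplus \bigvee_\nu \sigma_\nu$; we will now show that $\bar\sigma \neq 0$. From $\tilde\sigma = \bar\sigma \vee \bigvee_\nu \sigma_\nu$ it follows that
$$m(\sigma) = m(\sigma \wedge \tilde\sigma) = m(\sigma \wedge \bar\sigma) + \sum_\nu m(\sigma_\nu \wedge \sigma). \qquad (2.1.6)$$
Since
$$\Big\| m - x_{\bar\sigma} - \sum_{\nu = 1}^{N} x_{\sigma_\nu} \Big\| = m(\varepsilon) - m(\bar\sigma) - \sum_{\nu = 1}^{N} m(\sigma_\nu) \to 0,$$
in the norm-topology of $\mathcal{B}(\Sigma)$ we obtain
$$m = x_{\bar\sigma} + \sum_{\nu = 1}^{\infty} x_{\sigma_\nu} \qquad (2.1.7)$$
and therefore obtain
$$(m, \tilde y) = (x_{\bar\sigma}, \tilde y) + \sum_\nu (x_{\sigma_\nu}, \tilde y) \ge (x_{\bar\sigma}, \tilde y) + \sum_\nu \frac{1}{n_\nu}. \qquad (2.1.8)$$
If $\bar\sigma = 0$ (and therefore $x_{\bar\sigma} = 0$) then we would obtain $(m, \tilde y) > 0$ in contradiction to $(m, \tilde y) \le 0$. Therefore $\bar\sigma \neq 0$ and $m(\bar\sigma) \neq 0$ (since $\bar\sigma \le \tilde\sigma$ and $\tilde\sigma$ is the support of $m$). From (2.1.8) it follows that $\sum_\nu 1/n_\nu$ is convergent and therefore $1/n_\nu \to 0$.
We must have $(x_\sigma, \tilde y) \le 0$ for all $\sigma \le \bar\sigma$, otherwise there would be a $\sigma \le \bar\sigma$ for which $(x_\sigma, \tilde y) = \epsilon > 0$, contradicting the definition of $n_\nu$ and $1/n_\nu \to 0$.
For all $\sigma$ satisfying $0 \neq \sigma \le \bar\sigma$ we find that $x_{\bar\sigma}(\sigma) = m(\bar\sigma \wedge \sigma) = m(\sigma) \neq 0$. The measure $m_{\bar\sigma} \in K(\Sigma)$ given by
$$m_{\bar\sigma} = m(\bar\sigma)^{-1} x_{\bar\sigma}$$
is therefore “effective” in $\bar\sigma$, that is, $m_{\bar\sigma}(\sigma) \neq 0$ for all $\sigma$ for which $0 \neq \sigma \le \bar\sigma$.
Thus, in the same manner as in Th. 2.1.11 it follows that $\mathrm{co}\{m_\sigma \mid \sigma \le \bar\sigma\}$ is norm-dense in the set of all $m$ for which $m(\bar\sigma) = 1$. Therefore, for all $m$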
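satisfying $m(\bar\sigma) = 1$ we obtain $(m, \tilde y) \le 0$, that is, $(m, y) = (m, \tilde y + \alpha 1) = (m, \tilde y) + \alpha \le \alpha$; or, in other words, $\bar\sigma \in \Gamma_\alpha$ and therefore $\bar\sigma \le \sigma(y \le \alpha)$, in contradiction to $\bar\sigma \le \tilde\sigma \le \sigma^*(y \le \alpha)$.
Therefore, we have shown that $(m, y) > \alpha$ is satisfied for all $m$ satisfying $m(\sigma^*(y \le \alpha)) = 1$.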
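We see, therefore, that $\sigma(y \le \alpha) = 0$ for all $\alpha < -\|y\|$ and $\sigma(y \le \alpha) = \varepsilon$ for $\alpha \ge \|y\|$.
D 2.1.4. The totally ordered subset $\{\sigma(y \le \alpha)\} \subset \Sigma$ is called the spectral family of $y \in \mathcal{B}'(\Sigma)$.
Th. 2.1.14. The spectral family $\sigma(y \le \alpha)$ is right-continuous, that is, $\sigma(y \le \alpha + \epsilon) \to \sigma(y \le \alpha)$ for $\epsilon > 0$ and $\epsilon \to 0$.
PROOF. According to Th. 2.1.1 it suffices to show that, for $\tilde\sigma = \bigwedge_{\epsilon > 0} \sigma(y \le \alpha + \epsilon)$, the relation $\tilde\sigma = \sigma(y \le \alpha)$ holds. Immediately it follows that $\sigma(y \le \alpha) \le \tilde\sigma$. On the other hand
$$s(y \mid \tilde\sigma) = \sup\{(m, y) \mid m \in K(\Sigma) \text{ and } m(\tilde\sigma) = 1\}.$$
Since $\tilde\sigma \le \sigma(y \le \alpha + \epsilon)$, from $(m, y) \le \alpha + \epsilon$ for each $\epsilon > 0$ we obtain $(m, y) \le \alpha$ for each $m \in K(\Sigma)$ with $m(\tilde\sigma) = 1$, that is, $\tilde\sigma \le \sigma(y \le \alpha)$.
D 2.1.5. A map $\mathbb{R} \xrightarrow{p} \Sigma$ satisfying the conditions
$$\alpha_2 > \alpha_1 \Rightarrow p(\alpha_2) \ge p(\alpha_1); \qquad p(\alpha) = 0 \text{ for } \alpha < -c,$$
$p(\alpha) = \varepsilon$ for $\alpha > c$ for some $c$, and satisfying $p(\alpha +) = p(\alpha)$ ($p$ is right-continuous) is called a (generalized) spectral family.
Th. 2.1.15. (Spectral Theorem). For each spectral family there is a linear bounded functional $l \in \mathcal{B}'(\Sigma)$ defined by
$$l(m) = \int \alpha \, dm(p(\alpha)) \qquad (2.1.9a)$$
such that
$$\sigma(l \le \alpha) = p(\alpha). \qquad (2.1.10)$$
For the special case in which $y \in \mathcal{B}'(\Sigma)$, $p(\alpha) = \sigma(y \le \alpha)$, the $l$ in (2.1.9a) is equal to $y$. We may write (2.1.9a) ($l_\sigma$ is defined in Th. 2.1.13) as follows:
$$l = \int \alpha \, dl_{p(\alpha)} \qquad (2.1.9b)$$
where the integral is to be considered to be a limit in the norm of $\mathcal{B}'(\Sigma)$.
PROOF. $l(m)$ is a bounded linear functional if it is affine and bounded on $K(\Sigma)$; this result follows directly from (2.1.9a).
Let $m(p(\alpha_1)) = 1$; then $m(p(\alpha)) = 1$ for $\alpha \ge \alpha_1$. From (2.1.9a) it follows that $l(m) = (m, l) \le \alpha_1$, that is, $p(\alpha_1) \le \sigma(l \le \alpha_1)$.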
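Now let $m(\sigma(l \le \alpha_1)) = 1$. From (2.1.9a) it follows that
$$(m, l) = \int \alpha \, dm(p(\alpha) \wedge \sigma(l \le \alpha_1)). \qquad (2.1.11)$$
Suppose that $p(\alpha_1) \neq \sigma(l \le \alpha_1)$; then there exists a $\delta > 0$ such that $p(\alpha_1 + \delta) \neq \sigma(l \le \alpha_1)$ since $p$ is continuous from the right. Therefore $p(\alpha_1 + \delta) \wedge \sigma(l \le \alpha_1) \neq \sigma(l \le \alpha_1)$ or $p(\alpha_1 + \delta) > \sigma(l \le \alpha_1)$. If $p(\alpha_1 + \delta) > \sigma(l \le \alpha_1)$ for all $\delta > 0$ it would follow that $p(\alpha_1 +) = p(\alpha_1) = \sigma(l \le \alpha_1)$.
If there exists a $\delta$ such that $p(\alpha_1 + \delta) \wedge \sigma(l \le \alpha_1) \neq \sigma(l \le \alpha_1)$ then $\sigma = \sigma(l \le \alpha_1) \dotplus p(\alpha_1 + \delta) \wedge \sigma(l \le \alpha_1) \neq 0$ and $\sigma \le \sigma(l \le \alpha_1)$. Let us choose an $m \in K(\Sigma)$ such that $m(\sigma) = 1$. Then it follows that
$$m(p(\alpha) \wedge \sigma(l \le \alpha_1)) = 0 \text{ for } \alpha \le \alpha_1 + \delta,$$
$$m(p(c) \wedge \sigma(l \le \alpha_1)) = m(\sigma(l \le \alpha_1)) = 1$$
and, therefore, from (2.1.11) $l(m) \ge \alpha_1 + \delta$, in contradiction to the fact that $\sigma \le \sigma(l \le \alpha_1)$.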
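If $\alpha_1 > \alpha_2$, then, from the definition of $\sigma(y \le \alpha)$ we obtain the inequality
$$\alpha_2\, m(\sigma(y \le \alpha_1) \dotplus \sigma(y \le \alpha_2)) \le (x_{\sigma(y \le \alpha_1) \dotplus \sigma(y \le \alpha_2)}, y) \le \alpha_1\, m(\sigma(y \le \alpha_1) \dotplus \sigma(y \le \alpha_2)), \qquad (2.1.12)$$
where $x_\sigma$ is defined by $x_\sigma(\sigma') = m(\sigma \wedge \sigma')$.
We may rewrite (2.1.12) in the form
$$\alpha_2 [m(\sigma(y \le \alpha_1)) - m(\sigma(y \le \alpha_2))] \le (x_{\sigma(y \le \alpha_1) \dotplus \sigma(y \le \alpha_2)}, y) \le \alpha_1 [m(\sigma(y \le \alpha_1)) - m(\sigma(y \le \alpha_2))].$$
For a partition of the real axis for which $\alpha_{n+1} > \alpha_n$, with
$$\Delta_n m = m(\sigma(y \le \alpha_{n+1})) - m(\sigma(y \le \alpha_n))$$
and
$$m = \sum_n x_{\sigma(y \le \alpha_{n+1}) \dotplus \sigma(y \le \alpha_n)},$$
we obtain the inequality
$$\sum_n \alpha_n \Delta_n m \le (m, y) \le \sum_n \alpha_{n+1} \Delta_n m. \qquad (2.1.13a)$$
With $\Delta_n l = l_{\sigma(y \le \alpha_{n+1})} - l_{\sigma(y \le \alpha_n)}$ this inequality can be written in the form ($\le$ is in the order in $\mathcal{B}'(\Sigma)$!)
$$\sum_n \alpha_n \Delta_n l \le y \le \sum_n \alpha_{n+1} \Delta_n l \qquad (2.1.13b)$$
because (2.1.13a) holds for all $m \in K(\Sigma)$.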
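The sandwich (2.1.13a) can be checked numerically in a finite model. The sketch below rests on assumptions that are not from the text: $m$ is a probability vector on a four-point set, $y$ is a real function on the same set, and `m_spectral`, `sandwich` and `uniform_partition` are hypothetical helper names. The partition is assumed to start below the smallest value of $y$ and to end at or above the largest one.

```python
# Hypothetical finite model: m is a probability vector on X, y a real function on X.
m = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
y = {0: -1.0, 1: 0.25, 2: 0.25, 3: 2.0}

def m_spectral(alpha):
    """m(sigma(y <= alpha)) in the finite model."""
    return sum(m[x] for x in m if y[x] <= alpha)

def sandwich(partition):
    """Lower and upper sums of (2.1.13a) for an increasing partition alpha_0 < ... < alpha_N."""
    lower = upper = 0.0
    for a_n, a_next in zip(partition, partition[1:]):
        delta = m_spectral(a_next) - m_spectral(a_n)  # Delta_n m
        lower += a_n * delta
        upper += a_next * delta
    return lower, upper

def uniform_partition(lo, hi, n):
    """n equal steps from lo to hi."""
    return [lo + (hi - lo) * k / n for k in range(n + 1)]

pairing = sum(m[x] * y[x] for x in m)  # (m, y)
```

Since the increments $\Delta_n m$ sum to one in this model, the gap between the upper and the lower sum equals the step width, so both sums approach $(m, y)$ as the partition is refined.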
Thus it follows that the left- and right-hand sides of (2.1.13b) converge (with respect to the norm of $\mathcal{B}'(\Sigma)$), as the maximum interval length tends to zero, towards the same limit
$$\int \alpha \, dl_{\sigma(y \le \alpha)}, \qquad (2.1.14)$$
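Th. 2.1.16. $\partial_e L(\Sigma) = \{l_\sigma \mid \sigma \in \Sigma\}$, where $l_\sigma(m) = m(\sigma)$.
PROOF. On the basis of Th. 2.1.13 it is only necessary to show that there are no extreme points of $L$ other than the $l_\sigma$. If $0 \le y \le 1$, then, from the spectral theorem it follows that:
$$(m, y) = \int_{-\delta}^{1} \alpha \, dm(\sigma(y \le \alpha)), \qquad (2.1.15)$$
where $\delta > 0$ can be arbitrarily chosen. We define two elements $y_1, y_2 \in L(\Sigma)$ by means of the following equations:
$$(m, y_1) = \int (2\alpha - \alpha^2) \, dm(\sigma(y \le \alpha)) \qquad (2.1.16)$$
and
$$(m, y_2) = \int \alpha^2 \, dm(\sigma(y \le \alpha)) \qquad (2.1.17)$$
for which $y = \tfrac{1}{2} y_1 + \tfrac{1}{2} y_2$; $y$ can be an extreme point of $L(\Sigma)$ if and only if $y_1 = y_2 = y$, that is, if $\sigma(y \le \alpha) = \sigma(y_1 \le \alpha) = \sigma(y_2 \le \alpha)$ for all $\alpha$. On the basis of (2.1.15)-(2.1.17) this is the case only if $\sigma(y \le \alpha) = \sigma(y = 0)$ for all $\alpha$ for which $0 \le \alpha < 1$. Thus
$$(m, y) = m(\varepsilon \dotplus \sigma(y = 0)) = l_{\varepsilon \dotplus \sigma(y = 0)}(m). \qquad (2.1.18)$$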
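The splitting $y = \tfrac{1}{2} y_1 + \tfrac{1}{2} y_2$ used in this proof is easy to try out when $y$ is a function with values in $[0, 1]$ on a finite set. The following sketch assumes such a finite model, which is not part of the text; there (2.1.16) and (2.1.17) reduce to the pointwise formulas $y_1 = 2y - y^2$ and $y_2 = y^2$, and `split` and `is_extreme` are hypothetical names.

```python
from fractions import Fraction as Fr

def split(y):
    """Return (y1, y2) with y1 = 2y - y^2 and y2 = y^2, applied pointwise."""
    y1 = {x: 2 * v - v * v for x, v in y.items()}
    y2 = {x: v * v for x, v in y.items()}
    return y1, y2

def is_extreme(y):
    """Criterion from the proof in the finite model: y is extreme iff y1 == y2 == y."""
    y1, y2 = split(y)
    return y1 == y and y2 == y

# A function taking the value 1/2 somewhere is a proper mixture of y1 and y2.
example = {0: Fr(0), 1: Fr(1, 2), 2: Fr(1)}
example_y1, example_y2 = split(example)
```

In this model the criterion holds exactly for the functions taking only the values 0 and 1, that is, for the indicator functions of the subsets, which play the role of the $l_\sigma$.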
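Th. 2.1.17. The map $\sigma \to l_\sigma$ is an order isomorphism of $\Sigma$ onto $\partial_e L(\Sigma)$.
PROOF. From Th. 2.1.16 and Th. 2.1.13 it is obvious that the mapping $\sigma \to l_\sigma$ is a bijection. Thus, it is clear that $\sigma_1 \ge \sigma_2 \Leftrightarrow l_{\sigma_1} \ge l_{\sigma_2}$, where $l_{\sigma_1} \ge l_{\sigma_2}$ is the order relation in $\mathcal{B}'(\Sigma)$, that is, $(m, l_{\sigma_1}) = m(\sigma_1) \ge (m, l_{\sigma_2}) = m(\sigma_2)$ for all $m \in K(\Sigma)$.
On the basis of Th. 2.1.17 we may identify $\Sigma$ with $\partial_e L(\Sigma)$ and write $(m, l_\sigma) = (m, \sigma)$.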
Since we may identify a representation of a Boolean ring with a measure space (see [31] and §2.5) and the elements of $\mathcal{B}'(\Sigma)$ with measurable, essentially bounded functions (where two functions which differ only on a set of zero measure are considered to be the “same” element of $\mathcal{B}'(\Sigma)$), the abstract space $\mathcal{B}'(\Sigma)$ is sometimes called the space of measurable functions. We will not, however, make use of this representation of $\Sigma$ and of this notion of $\mathcal{B}'(\Sigma)$.
We shall now state (without proof) an interesting theorem which is concerned with the axioms introduced in III.
Th. 2.1.18. If the lattice $G$ which was introduced in III, §3 is distributive then it follows that (without making use of AV 3 and AV Vs in III, §3) the following isomorphisms hold for the Banach spaces defined in III, §3:
$$\mathcal{B} \to \mathcal{B}(\Sigma), \quad \mathcal{B}' \to \mathcal{B}'(\Sigma) \quad \text{where } G \to \partial_e L(\Sigma)$$
(and therefore $G \to \Sigma$), $L \to L(\Sigma)$, $K \to K(\Sigma)$ for a suitably chosen $\Sigma$. For a proof, see [13], VII, §5.3.
2.2 Mixture Morphisms Corresponding to an Observable
According to Th. 1.4.3, for each observable $\Sigma \xrightarrow{F} L$, there exists a $\sigma$-additive measure $m$ defined by $m(\sigma) = \mu(w, F(\sigma))$ for each $w \in K$; for $w_0$ (from Th. 1.4.1) $m_0(\sigma) = \mu(w_0, F(\sigma))$ is effective. In this way we clearly obtain a map of $K$ (as a subset of $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$) into $\mathcal{B}(\Sigma)$ (see also [17], p. 380) as follows:
$$K \to K(\Sigma), \qquad w \to \mu(w, F(\sigma)). \qquad (2.2.1)$$
We shall now find it useful to investigate such maps. In V, §4.1 these maps will be studied in a more general setting. The discussion presented in this section may serve to motivate the studies presented in V. We will therefore use several general results from V here. From V, D 4.1.1 it is easy to show that the map (2.2.1) is a mixture morphism (abbreviation: mi-morphism). According to V, Th. 4.1.2 such a mi-morphism uniquely defines a map $S$ of $\mathcal{B}(\mathcal{H}_1, \ldots)$ into $\mathcal{B}(\Sigma)$ which is norm-continuous.
Earlier we have called a $w \in K(\mathcal{H}_1, \ldots)$ effective if $C(w) = K$.
Th. 2.2.1 (H. Neumann [21]). The mi-morphism $S$ defined by (2.2.1) transforms effective ensembles $w \in K(\mathcal{H}_1, \ldots)$ into effective measures $m \in K(\Sigma)$. The adjoint map $S'$ of $\mathcal{B}'(\Sigma)$ into $\mathcal{B}'(\mathcal{H}_1, \ldots)$ maps (according to V, Th. 4.1.3) $L(\Sigma)$ into $L(\mathcal{H}_1, \ldots)$. The restriction of $S'$ to $\partial_e L(\Sigma)$ is identical to the vector measure $F\colon \Sigma \to L$ provided that we identify $\Sigma$ with $\partial_e L(\Sigma)$ (according to Th. 2.1.17). $S$ is uniquely defined by the restriction of $S'$ to $\partial_e L(\Sigma)$.
PROOF. Let $w$ be effective. From $m(\sigma) = \mu(w, F(\sigma)) = 0$ it follows that $F(\sigma) = 0$; if $F(\sigma) = 0$ then $\sigma = 0$ since $F$ was assumed to be effective.
Since $\sigma \in \partial_e L(\Sigma) = \Sigma$, and if $Sw = m$, it follows that $\mu(w, S'\sigma) = (Sw, \sigma) = (m, \sigma) = m(\sigma)$ and, according to (2.2.1), $\mu(w, S'\sigma) = \mu(w, F(\sigma))$ for all $w \in K$. Thus we obtain $S'\sigma = F(\sigma)$.
Suppose that we have two mi-morphisms $S_1$ and $S_2$ for which $S_1'$ and $S_2'$ are identical on $\partial_e L(\Sigma)$. Then, for all $w \in K(\mathcal{H}_1, \ldots)$ we obtain $0 = \mu(w, (S_1' - S_2')\sigma) = (S_1 w - S_2 w, \sigma)$ for all $\sigma \in \partial_e L(\Sigma)$. According to either the Krein-Milman theorem or the spectral theorem (Th. 2.1.15) $L(\Sigma) = \overline{\mathrm{co}}\, \partial_e L(\Sigma)$. Thus it follows that $(S_1 w - S_2 w, h) = 0$ for all $h \in L(\Sigma)$; since $L(\Sigma)$ spans all of $\mathcal{B}'(\Sigma)$, the same result holds also for all $h \in \mathcal{B}'(\Sigma)$. It follows that $S_1 w = S_2 w$ for all $w \in K(\mathcal{H}_1, \ldots)$ from which we conclude that $S_1 = S_2$.
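The relation between $F$, $S$ and $S'$ can be illustrated in a toy model. Everything specific in the following sketch is an assumption made for illustration and does not come from the text: ensembles are real symmetric $2 \times 2$ density matrices, $\mu(w, g)$ is taken to be the trace of $wg$, $\Sigma$ is the set of subsets of `{0, 1, 2}`, and the atom effects `F_ATOM` as well as the helper names `S` and `S_adj` are hypothetical.

```python
from fractions import Fraction as Fr

# Hypothetical toy model: 2x2 real symmetric matrices stored as nested tuples.
I2 = ((Fr(1), Fr(0)), (Fr(0), Fr(1)))
F_ATOM = {
    0: ((Fr(1, 4), Fr(1, 4)), (Fr(1, 4), Fr(1, 4))),    # (1/2)|+><+|
    1: ((Fr(1, 4), Fr(-1, 4)), (Fr(-1, 4), Fr(1, 4))),  # (1/2)|-><-|
    2: ((Fr(1, 2), Fr(0)), (Fr(0), Fr(1, 2))),          # (1/2) * identity
}

def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def scale(c, a):
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))

def mu(w, g):
    """mu(w, g) = trace(w g), the assumed probability function."""
    return sum(w[i][j] * g[j][i] for i in range(2) for j in range(2))

ZERO = scale(Fr(0), I2)

def F(sigma):
    """Additive measure: F(sigma) is the sum of the atom effects in sigma."""
    out = ZERO
    for x in sigma:
        out = add(out, F_ATOM[x])
    return out

def S(w):
    """S w: the measure on the atoms given by x -> mu(w, F({x})), as in (2.2.1)."""
    return {x: mu(w, F_ATOM[x]) for x in F_ATOM}

def S_adj(f):
    """Adjoint S' f = sum over x of f(x) F({x}), for a function f on the atoms."""
    out = ZERO
    for x, fx in f.items():
        out = add(out, scale(fx, F_ATOM[x]))
    return out

w = ((Fr(3, 4), Fr(1, 4)), (Fr(1, 4), Fr(1, 4)))  # a density matrix (trace 1, positive)
```

In this sketch `S_adj` applied to the indicator function of a subset returns `F` of that subset, which mirrors the statement that $S'$ restricted to $\partial_e L(\Sigma)$ is $F$, and `mu(w, S_adj(f))` agrees with the pairing of `S(w)` with `f`.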
Th. 2.2.2 (H. Neumann [21]). If $S$ is a mi-morphism of $K(\mathcal{H}_1, \ldots)$ into $K(\Sigma)$ which maps effective ensembles into effective measures, then the restriction of the adjoint map $S'$ to $\partial_e L(\Sigma)$ defines a $\sigma$-additive effective measure $F$ on $\Sigma$ such that $F(\sigma) = S'\sigma$ and $F(\varepsilon) = 1$ (where $\varepsilon$ is the unit element of $\Sigma$). In addition, $\Sigma$ is complete if $\mathcal{U}_g$ is the uniform structure determined by $F$.
PROOF. According to V, Th. 4.1.3 $S'$ maps the set $L(\Sigma)$ into $L(\mathcal{H}_1, \ldots)$. We note that if $\sigma_1 \wedge \sigma_2 = 0$ we obtain $m(\sigma_1 \vee \sigma_2) = m(\sigma_1) + m(\sigma_2)$, from which it follows
that $\sigma_1 \vee \sigma_2 = \sigma_1 + \sigma_2$, where the $\sigma_i$ are elements of $\partial_e L(\Sigma) \subset \mathcal{B}'(\Sigma)$. Thus it follows that $S'(\sigma_1 \vee \sigma_2) = S'\sigma_1 + S'\sigma_2$; thus $S'\sigma$ is an additive measure on $\Sigma$. If we define $F(\sigma) = S'\sigma$, then, for $\sigma \in \partial_e L(\Sigma)$, $m = Sw$ we obtain $m(\sigma) = (m, \sigma) = (Sw, \sigma) = \mu(w, S'\sigma) = \mu(w, F(\sigma))$.
For $\sigma = \varepsilon$ we find that $1 = m(\varepsilon) = \mu(w, F(\varepsilon))$ for all $w \in K(\mathcal{H}_1, \ldots)$ and we therefore obtain $F(\varepsilon) = 1$.
For $w_0$ defined by Th. 1.4.1 we find that $Sw_0$ is an effective $\sigma$-additive measure $m_0$ on $\Sigma$ for which $\mathcal{U}_g$ is the uniform structure generated by $d(\sigma_1, \sigma_2) = m_0(\sigma_1 \dotplus \sigma_2)$. Thus, by Th. 1.4.4, $\Sigma$ is $\mathcal{U}_g$-complete.
Th. 2.2.1 and Th. 2.2.2 show that it is possible to uniquely characterize $\sigma$-additive measures $F\colon \Sigma \to L$ on complete Boolean rings $\Sigma$ by mi-morphisms $S\colon K(\mathcal{H}_1, \ldots) \to K(\Sigma)$ which transform effective ensembles $w$ into effective measures. $S'\colon L(\Sigma) \to L(\mathcal{H}_1, \ldots)$ represents a type of extension of the map $F\colon \Sigma \to L(\mathcal{H}_1, \ldots)$ because we may identify $\Sigma$ with $\partial_e L(\Sigma)$ and $S'$ is equal to $F$ on $\partial_e L(\Sigma)$. Since $S'$ is linear, $S'$ maps the convex set $L(\Sigma)$ (which is generated by $\partial_e L(\Sigma)$) onto a convex subset of $L(\mathcal{H}_1, \ldots)$. What is the physical interpretation of these convex subsets?
2.3 The Kernel of an Observable;
Mixture of Effects for an Observable
In III, §2 we discussed the use of the concept of a random generator in the formulation of the concept of the direct mixture of two registration methods. We shall now consider the special case in which $b_{01} = b_{02}$. We begin by defining the following special case of III, D 2.6.
D 2.3.1. A $\tilde b_0$ is said to be a direct mixture of the same $b_0$ if there exist two registration methods $b'_{01}$, $b'_{02}$ such that $b'_{01}$ is isomorphic to $b_0$, $b'_{02}$ is isomorphic to $b_0$, $b'_{01} \cap b'_{02} = \emptyset$ and $\tilde b_0 = b'_{01} \cup b'_{02}$. We call $\alpha$ and $1 - \alpha$ the weights of $b'_{01}$ and $b'_{02}$ in the direct mixture $\tilde b_0$.
The direct mixture of $b_0$ with itself is nothing other than an extension of the Boolean ring $\mathcal{R}(b_0)$ to a Boolean ring $\mathcal{R}(\tilde b_0)$ with the aid of a random generator such that all convex combinations of effects $\psi(b_0, b)$ appear in the ratio $\alpha$ to $1 - \alpha$. According to III (2.8), to each pair $b_1, b_2 \subset b_0$ there exists a $\tilde b \subset \tilde b_0$ such that
$$\psi(\tilde b_0, \tilde b) = \alpha\, \psi(b_0, b_1) + (1 - \alpha)\, \psi(b_0, b_2). \qquad (2.3.1)$$
From III, Th. 2.4 we obtain the following special case:
Th. 2.3.1. Let $b_0$ be a registration method, and let $\lambda_i > 0$ be a set of rational numbers such that $\sum_{i=1}^{n} \lambda_i = 1$. Then there exists a registration method $\tilde b_0$ such that, in $\mathcal{R}(\tilde b_0)$ there exists a $\tilde b$ for every series of $b_i \in \mathcal{R}(b_0)$ for which
$$\psi(\tilde b_0, \tilde b) = \sum_{i=1}^{n} \lambda_i\, \psi(b_0, b_i). \qquad (2.3.2)$$
It is not difficult to obtain mixtures of effects from the extension of the ring $\mathcal{R}(b_0)$ by means of a random generator. This procedure apparently does not lead to new information. In Th. 2.3.3 we shall show that we may obtain an abstract version of Th. 2.3.1.
The converse question is more interesting: Let $\Sigma \xrightarrow{F} L$ be an observable. Suppose there exist three elements $\sigma$, $\sigma_1$, $\sigma_2$ which satisfy the following equation:
$$F(\sigma) = \alpha F(\sigma_1) + (1 - \alpha) F(\sigma_2) \qquad (2.3.3)$$
(which is an abstract version of (2.3.1)). Is it possible to “shrink” $\Sigma$ in such a way as to eliminate $\sigma$?
We have placed these descriptions of mixtures of effects in the foreground in order to make it easier for the reader to visualize the physics underlying the following abstract discussion about convex combinations and to understand the desire to seek “small” observables.
We shall begin the abstract discussion with a definition:
D 2.3.2. Let $F\colon \Sigma \to L$ be an additive measure on the Boolean ring $\Sigma$. (We do not require that $\Sigma = \Sigma_g$ or that $\Sigma_g$ be separable.) Let $\overline{\mathrm{co}}(F\Sigma)$ denote the $\sigma(\mathcal{B}', \mathcal{B})$-convex closure of the set $F\Sigma$; we shall call it the convex range of the measure $F$. If $\Sigma \xrightarrow{F} L$ is an observable, then we shall call $\overline{\mathrm{co}}(F\Sigma)$ the convex range of the observable.
According to the previous discussion, in order to make an “economical” measurement, it is not necessary to observe a “response” of a $\sigma$ for which there exists a pair $\sigma_1, \sigma_2$ satisfying (2.3.3). Conversely, if we are interested in the physical meaning of mixtures of effects, by using direct mixtures we may introduce arbitrarily many convex combinations of the form (2.3.3). It is possible to prove the following: To each $\Sigma \xrightarrow{F} L$ there exists an extension $\Sigma'$ of $\Sigma$, with $F$ extended to $F'\colon \Sigma' \to L$, for which $\overline{\mathrm{co}}(F\Sigma) = \overline{F'\Sigma'}$ (see [21]); this result has the following intuitive meaning: we may introduce arbitrary convex combinations (corresponding to the physical notion of a direct mixture given in Th. 2.3.1) if we are willing to “enlarge” the Boolean ring.
Since $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-compact, $\overline{\mathrm{co}}(F\Sigma)$ is, as a closed subset of $L$, also $\sigma(\mathcal{B}', \mathcal{B})$-compact. According to the Krein-Milman theorem $\overline{\mathrm{co}}(F\Sigma)$ is generated by its set of extreme points $\partial_e \overline{\mathrm{co}}(F\Sigma)$. Physically this set $\partial_e \overline{\mathrm{co}}(F\Sigma)$ is the essential set of effects for the observable $\Sigma \xrightarrow{F} L$. We therefore define:
D 2.3.3. $\partial_e \overline{\mathrm{co}}(F\Sigma)$ is called the extreme kernel of the range of the measure $F\colon \Sigma \to L$. If $\Sigma \xrightarrow{F} L$ is an observable, then it is called the extreme kernel of the observable.
D 2.3.4. Two observables $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ are said to be convex equivalent if $\overline{\mathrm{co}}(F_1 \Sigma_1) = \overline{\mathrm{co}}(F_2 \Sigma_2)$.
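On the basis of the Krein-Milman theorem the following assertions are equivalent:
(i) $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ are convex equivalent.
(ii) $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ have the same extreme kernel.
Physically, in order that we do not have to measure anything more than is necessary, it is desirable to make an observable “as small as possible.” Mathematically, this corresponds to the question whether there is a “smallest” range of an observable, given a particular convex range. The answer is based upon the following theorems:
Th. 2.3.2. Let $F\colon \Sigma \to L$ be an effective measure on the Boolean ring $\Sigma$ (we do not require that $\Sigma = \Sigma_g$ or that $\Sigma_g$ is separable). Then $\overline{\mathrm{co}}(F\Sigma) = \overline{\mathrm{co}}(F\Sigma_g)$.
PROOF. From Th. 1.4.3 it follows that, since $F$ is continuous (according to Th. 1.4.2), $F\Sigma_g \subset \overline{F\Sigma}$, from which the assertion follows.
Th. 2.3.3. To each additive measure $F\colon \Sigma \to L$ there exists a convex equivalent observable $\Sigma_1 \xrightarrow{F_1} L$. If desirable, this observable may be chosen such that $\overline{\mathrm{co}}(F\Sigma) = \overline{F_1 \Sigma_1}$ (where $\overline{A}$ is the $\sigma(\mathcal{B}', \mathcal{B})$-closure of $A$).
Proof. Since $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-separable, we may choose a countable subset $\{\sigma_\nu\} \subset \Sigma$ such that $\{F(\sigma_\nu)\}$ is dense in $F\Sigma$. Thus $\overline{\mathrm{co}}(F\Sigma) = \overline{\mathrm{co}}(\{F(\sigma_\nu)\})$. The $\sigma_\nu$ generate a countable Boolean subring $\Sigma'$ of $\Sigma$ such that $\overline{\mathrm{co}}(F\Sigma) = \overline{\mathrm{co}}(F\Sigma')$. $\Sigma_1 = \Sigma'_g$ is separable, hence the extension $F_1$ of $F$ (as a function over $\Sigma'$) onto $\Sigma'_g$ defines an observable $\Sigma_1 \xrightarrow{F_1} L$. According to Th. 2.3.2 $\overline{\mathrm{co}}(F_1 \Sigma_1) = \overline{\mathrm{co}}(F\Sigma')$; therefore $\Sigma_1 \xrightarrow{F_1} L$ is convex equivalent to $\Sigma \xrightarrow{F} L$.
If we wish to obtain $\overline{\mathrm{co}}(F\Sigma) = \overline{F_1 \Sigma_1}$ we then extend $F\colon \Sigma \to L$ to $F''\colon \Sigma'' \to L$ for which $\overline{\mathrm{co}}(F\Sigma) = \overline{F''\Sigma''}$ (see above and [21]). Then, as above, we may construct the observable $\Sigma_1 \xrightarrow{F_1} L$ by means of a countable subset $\{\sigma_\nu\} \subset \Sigma''$ for which $\{F''(\sigma_\nu)\}$ is dense in $F''\Sigma''$.
In order to investigate the convex range we may always assume that $\Sigma = \Sigma_g$, that is, that $\Sigma$ is complete. This makes it possible to use the results of §2.2. We then obtain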
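Th. 2.3.4 (H. Neumann [21]). Let $F\colon \Sigma \to L$ be an effective additive measure, and let $\Sigma = \Sigma_g$. Let $S\colon K(\mathcal{H}_1, \ldots) \to K(\Sigma)$ be the mi-morphism defined by (2.2.1) and let $S'$ be the dual map (see Th. 2.2.1) $S'\colon L(\Sigma) \to L(\mathcal{H}_1, \ldots)$. Then the following conditions are satisfied:
(i) $S'L(\Sigma) = \overline{\mathrm{co}}(F\Sigma)$.
(ii) To each extreme point $g_e \in \partial_e \overline{\mathrm{co}}(F\Sigma) = \partial_e S'L(\Sigma)$ there is one and only one $\sigma \in \Sigma$ for which $g_e = F(\sigma)$.
(iii) $G \cap \overline{\mathrm{co}}(F\Sigma) = G \cap \partial_e \overline{\mathrm{co}}(F\Sigma) = G \cap (F\Sigma)$.
PROOF. Since $S'$ is continuous with respect to the $\sigma(\mathcal{B}', \mathcal{B})$ topologies and $L(\Sigma)$ is compact, we find that $S'L(\Sigma)$ is compact. Since $L(\Sigma)$ is convex, $S'L(\Sigma)$ is convex. Then, since by Th. 2.2.1 $\Sigma$ may be identified with $\partial_e L(\Sigma)$, we find that $F\Sigma \subset S'L(\Sigma)$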
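and therefore obtain $\overline{\mathrm{co}}(F\Sigma) \subset S'L(\Sigma)$. Then, since by the Krein-Milman theorem $L(\Sigma) = \overline{\mathrm{co}}\, \partial_e L(\Sigma)$ and $F$ is continuous, $S'L(\Sigma) = S'\, \overline{\mathrm{co}}\, \partial_e L(\Sigma) = \overline{\mathrm{co}}\, S' \partial_e L(\Sigma) = \overline{\mathrm{co}}(F\Sigma)$, whereby we have proven (i).
Let $g_e \in \partial_e \overline{\mathrm{co}}(F\Sigma) = \partial_e S'L(\Sigma)$. The set $S'^{-1} g_e \cap L(\Sigma)$, that is, the set of all $f \in L(\Sigma)$ with $S'f = g_e$, forms a closed face of $L(\Sigma)$, as we will show: $S'^{-1} g_e$ is the inverse image of a closed set and is therefore closed. Therefore $S'^{-1} g_e \cap L(\Sigma)$ is also closed. It is easy to verify that $S'^{-1} g_e \cap L(\Sigma)$ is convex. Let $f \in S'^{-1} g_e \cap L(\Sigma)$, $f_1, f_2 \in L(\Sigma)$ and suppose that
$$f = \lambda f_1 + (1 - \lambda) f_2 \quad \text{where } 0 < \lambda < 1.$$
Then
$$S'f = g_e = \lambda S'f_1 + (1 - \lambda) S'f_2.$$
Since $g_e$ is an extreme point, $S'f_1 = S'f_2 = g_e$, that is, $f_1, f_2 \in S'^{-1} g_e \cap L(\Sigma)$.
According to the Krein-Milman theorem the face $S'^{-1} g_e \cap L(\Sigma)$ contains an extreme point of $L(\Sigma)$ which, according to Th. 2.2.1, may be identified with a $\sigma \in \Sigma$. Therefore there exists a $\sigma \in \Sigma$ for which $F(\sigma) = g_e$.
Suppose that, for $\sigma_1, \sigma_2 \in \Sigma$ we have $F(\sigma_1) = F(\sigma_2) = g_e$. Since $F$ is additive, it follows that
$$g_e = F(\sigma_1) = F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_1 \cdot \sigma_2),$$
$$g_e = F(\sigma_2) = F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_1 \cdot \sigma_2).$$
We therefore obtain
$$\tfrac{1}{2} F(\sigma_1 \vee \sigma_2) + \tfrac{1}{2} F(\sigma_1 \wedge \sigma_2) = \tfrac{1}{2} [F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_1 \cdot \sigma_2) + F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2)] + \tfrac{1}{2} F(\sigma_1 \cdot \sigma_2) = g_e.$$
Since $g_e$ is an extreme point, the following condition must be satisfied:
$$F(\sigma_1 \vee \sigma_2) = F(\sigma_1 \wedge \sigma_2) = g_e.$$
Thus it follows that
$$F(\sigma_1 \dotplus \sigma_2) = F(\sigma_1 \vee \sigma_2) - F(\sigma_1 \wedge \sigma_2) = 0.$$
Since $F$ is effective, it follows that $\sigma_1 \dotplus \sigma_2 = 0$, that is, $\sigma_1 = \sigma_2$. Since $G \subset \partial_e L(\mathcal{H}_1, \ldots)$ (see III, Th. 6.6) we obtain $G \cap \overline{\mathrm{co}}(F\Sigma) = G \cap \partial_e \overline{\mathrm{co}}(F\Sigma)$. Then, since by (ii) $\partial_e \overline{\mathrm{co}}(F\Sigma) \subset F\Sigma$ we obtain
$$G \cap \partial_e \overline{\mathrm{co}}(F\Sigma) \subset G \cap F\Sigma \subset G \cap \overline{\mathrm{co}}(F\Sigma).$$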
D 2.3.5. Let $F\colon \Sigma \to L$ where $\Sigma$ is $\mathcal{U}_g$-complete. Then the subset
$$N = \{\sigma \mid \sigma \in \Sigma \text{ and } F(\sigma) \in \partial_e \overline{\mathrm{co}}(F\Sigma)\}$$
is called the kernel of the measure $F$. If $\Sigma \xrightarrow{F} L$ is an observable, then $N$ is called the kernel of the observable $\Sigma \xrightarrow{F} L$.
According to Th. 2.3.4 the map $\sigma \to F(\sigma)$ is a bijection of $N$ onto $\partial_e \overline{\mathrm{co}}(F\Sigma)$.
For the experimental technique of measurement the kernel $N$ of an observable is the essential component. For example, in order to determine
the frequencies $\mu(w, F(\sigma))$ for all $\sigma \in \Sigma$ we need only measure the “responses” corresponding to the “indications” $\sigma \in N$. In §2.4 we shall consider this topic in more detail.
From Th. 2.3.4(iii) we find that we do not introduce any new additional decision effects into $F\Sigma$ by taking the closure $\overline{\mathrm{co}}$ of the domain $F\Sigma$ (provided, of course, that $\Sigma$ is $\mathcal{U}_g$-closed). For the special case in which $F\Sigma \subset G$ it follows, since $\Sigma \xrightarrow{F} G$ is, by definition D 1.4.3, a decision observable, that
$$F\Sigma = G \cap F\Sigma = G \cap \overline{F\Sigma} = G \cap \overline{\mathrm{co}}(F\Sigma)$$
and therefore
$F\Sigma$ is $\sigma(\mathcal{B}', \mathcal{B})$-closed in $G$.
Thus we obtain an alternative proof of the portion of Th. 1.3.7 which is enclosed by brackets; see also Th. 1.4.8. Therefore, the kernel $N$ of a decision observable is identical to $\Sigma$, and we may therefore identify $\Sigma$, and therefore $N$, with $F\Sigma$. According to Th. 1.4.8 $N = \Sigma$ is $\mathcal{U}_g$-separable.
We shall now show that the above result holds in general.
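Th. 2.3.5. The kernel $N$ of a measure $F\colon \Sigma \to L$ (where $\Sigma$ is complete in $\mathcal{U}_g$) is $\mathcal{U}_g$-separable (in general $\Sigma$ is not $\mathcal{U}_g$-separable; see [11]). There exists an observable $\Sigma_1 \xrightarrow{F_1} L$ where $\Sigma_1 \subset \Sigma$ and $F_1 = F|_{\Sigma_1}$ for which the kernel $N_1 = N$ (where $N$ is the kernel of $F\colon \Sigma \to L$). The complete Boolean ring $\Sigma_2 \subset \Sigma$ generated by $N$ is separable; therefore $\Sigma_2 \xrightarrow{F_2} L$ where $F_2 = F|_{\Sigma_2}$ is an observable. If $\partial_e \overline{\mathrm{co}}(F\Sigma) \subset G$, then $\partial_e \overline{\mathrm{co}}(F\Sigma)$ is a complete Boolean sublattice of $G$ and $\Sigma_2 \xrightarrow{F_2} L$ is an isomorphism $\Sigma_2 \to \partial_e \overline{\mathrm{co}}(F\Sigma)$.
PROOF. If we construct $\Sigma_1$ as in the proof of Th. 2.3.3, then it follows that $\Sigma_1 \subset \Sigma$ and that $\Sigma_1 \xrightarrow{F_1} L$ is an observable (that is, $\Sigma_1$ is $\mathcal{U}_g$-separable). According to Th. 2.3.3, for this observable we have $\overline{\mathrm{co}}(F\Sigma) = \overline{\mathrm{co}}(F\Sigma_1)$ and therefore $\partial_e \overline{\mathrm{co}}(F\Sigma) = \partial_e \overline{\mathrm{co}}(F\Sigma_1)$. Since $\Sigma_1$ also satisfies the assumptions of Th. 2.3.4, $F(\sigma)$ is, according to Th. 2.3.4(ii), a bijection of $N_1 \subset \Sigma_1$ onto $\partial_e \overline{\mathrm{co}}(F\Sigma_1)$. Since $F(\sigma)$ is also a bijection of $N \subset \Sigma$ onto $\partial_e \overline{\mathrm{co}}(F\Sigma)$ it follows that $N_1 = N$. Since $\Sigma_1$ is $\mathcal{U}_g$-separable, we find that $N = N_1$, as a subset of $\Sigma_1$, is $\mathcal{U}_g$-separable. Thus $\Sigma_2$ is also separable. From Th. 1.4.7 it follows that the last part of Th. 2.3.5 holds.
According to Th. 2.3.5 the complete Boolean ring $\Sigma_2$ generated by the kernel $N$ (which exists according to Th. 2.3.5) yields the smallest possible observable for which nothing is lost with respect to the measure $F\colon \Sigma \to L$, since $\partial_e \overline{\mathrm{co}}(F_2 \Sigma_2) = \partial_e \overline{\mathrm{co}}(F\Sigma)$. Thus, as far as the physics of the situation is concerned, it suffices to consider only those observables $\Sigma \xrightarrow{F} L$ for which $\Sigma$ is the complete Boolean ring generated by the kernel $N$.
D 2.3.6. Let $N$ be the kernel corresponding to the observable $\Sigma \xrightarrow{F} L$. If the complete Boolean ring generated by $N$ is equal to $\Sigma$, then we shall call $\Sigma \xrightarrow{F} L$ a kernel observable.
Since the kernel $N$ of a decision observable is identical to $\Sigma$, every decision observable is a kernel observable.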
D 2.3.7. Let $\Sigma \xrightarrow{F} L$ be an observable. The observable $\Sigma_2 \xrightarrow{F_2} L$ generated by the kernel $N$ of $\Sigma \xrightarrow{F} L$ is called the associated kernel observable corresponding to $\Sigma \xrightarrow{F} L$.
For a given observable, for measurement purposes it is sufficient to consider only the associated kernel observable and the associated measurements. If we consider only the class of kernel observables, we have taken an additional step towards our goal of describing “measurements” in terms of the structure of microsystems: the exclusion of mixtures of effects from the measurement. Have we actually attained this goal? Suppose that the relation $\partial_e \overline{\mathrm{co}}(F\Sigma) \subset G$ holds for an observable $\Sigma \xrightarrow{F} L$. Then the kernel is a complete Boolean ring (see Th. 1.4.8), and the kernel observable is therefore a decision observable $N \xrightarrow{F_1} G$, where $F_1 = F|_N$. It would suffice to consider only decision observables if all kernel observables were decision observables. If this were really the case, then we could say that the decision effects are the “true” measurements of the structure of microsystems. Unfortunately, as we shall find in the remainder of this chapter, this is not the case.
In this section we have already seen that it is sufficient to consider only observables (in particular, kernel observables) instead of more general measures $F\colon \Sigma \to L$.
To close this section, we shall now state the following theorem, which is a direct consequence of the previous theorems:
Th. 2.3.6. A set $A \subset L$ is a set of coexistent effects if and only if there exists an observable (which we may also assert is a kernel observable) $\Sigma \xrightarrow{F} L$ for which $A \subset \overline{\mathrm{co}}(F\Sigma)$.
2.4 Mixtures and Decompositions of Observables
We shall now, according to the implications of the previous section, only consider observables, as described by $\mathcal{U}_g$-separable Boolean rings.
We shall now seek to define, in mathematical terms, the following intuitive idea: A registration method $b_{01}$ registers “more” than a second $b_{02}$. For this purpose, we again consider, as an idealization of $\mathcal{R}(b_{01})$ and $\mathcal{R}(b_{02})$, two observables $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ and the following possibility: Suppose that we are given a homomorphism $h$ of a Boolean ring $\Sigma_1$ into the Boolean ring $\Sigma_2$ for which the following diagram is commutative:
$$\Sigma_1 \xrightarrow{\;h\;} \Sigma_2, \qquad F_1\colon \Sigma_1 \to L, \quad F_2\colon \Sigma_2 \to L, \qquad F_1 = F_2 \circ h. \qquad (2.4.1)$$
A homomorphism $h$ is an isomorphism of $\Sigma_1$ onto the Boolean subring $h\Sigma_1$ of $\Sigma_2$ if and only if no element of $\Sigma_1$ other than the zero element is mapped to the zero element of $\Sigma_2$. If this is the case, then from $h(\sigma_1) = h(\sigma_2)$ it
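follows that $h(\sigma_1 \dotplus \sigma_2) = h(\sigma_1) \dotplus h(\sigma_2) = 0$ and $\sigma_1 \dotplus \sigma_2 = 0$ and that $\sigma_1 = \sigma_2$. $h^{-1}$ exists, and is also a homomorphism, where the latter can be proven in the following way. From $\sigma_3 = h^{-1}[h(\sigma_1) \vee h(\sigma_2)]$ and $h(\sigma_1 \vee \sigma_2) = h(\sigma_1) \vee h(\sigma_2)$ it follows that $\sigma_3 = (\sigma_1 \vee \sigma_2)$ and so on.
If (2.4.1) is commutative, and if $\Sigma_1 \xrightarrow{F_1} L$, $\Sigma_2 \xrightarrow{F_2} L$ are observables, that is, $F_1, F_2$ are effective, then if $h(\sigma) = 0$ it follows that $\sigma = 0$: According to (2.4.1) it follows that if $h(\sigma) = 0$ then $F_1(\sigma) = F_2(h\sigma) = F_2(0) = 0$; therefore $\sigma = 0$. $h$ is an isomorphism of $\Sigma_1$ onto $h\Sigma_1 \subset \Sigma_2$.
We will prove that, in addition, $h$ is not only an isomorphism of the Boolean ring $\Sigma_1$ onto $h\Sigma_1 \subset \Sigma_2$, but is also an isomorphism of the uniform structures of the two rings. These uniform structures are generated by
$$d_1(\sigma_1, \sigma_2) = \mu(w_0, F_1(\sigma_1 \dotplus \sigma_2))$$
and
$$d_2(\sigma_1, \sigma_2) = \mu(w_0, F_2(\sigma_1 \dotplus \sigma_2)),$$
where $w_0$ is defined in Th. 1.4.1.
Since $F_2(h(\sigma_1) \dotplus h(\sigma_2)) = F_2 h(\sigma_1 \dotplus \sigma_2) = F_1(\sigma_1 \dotplus \sigma_2)$ we find that
$$d_1(\sigma_1, \sigma_2) = d_2(h(\sigma_1), h(\sigma_2)).$$
Thus $h\Sigma_1$ is $\mathcal{U}_g$-complete, as is $\Sigma_1$, and is, according to Th. 1.4.3, a complete Boolean sublattice of $\Sigma_2$, and $h$ is also an isomorphism of the complete Boolean lattices $\Sigma_1$ and $h\Sigma_1$. On the basis of (2.4.1) we may identify $\Sigma_1 \xrightarrow{F_1} L$ with $h\Sigma_1 \xrightarrow{F_2} L$. We shall not, however, do this in the following.
D 2.4.1. Let $(\Sigma_1, F_1)$ and $(\Sigma_2, F_2)$ be a pair of observables. We shall say that $\Sigma_2 \xrightarrow{F_2} L$ is more extensive than $\Sigma_1 \xrightarrow{F_1} L$ (written $(\Sigma_1, F_1) \prec (\Sigma_2, F_2)$) if there is a homomorphism $h$ (of the type described above) for which the diagram (2.4.1) is commutative.
This definition is the precise form of the following statement: $\Sigma_2 \xrightarrow{F_2} L$ “measures more” than $\Sigma_1 \xrightarrow{F_1} L$.
The relation $\prec$ is a pre-order: Suppose that $(\Sigma_1, F_1) \prec (\Sigma_2, F_2)$ and $(\Sigma_2, F_2) \prec (\Sigma_3, F_3)$. Then, from the commutative diagram
$$\Sigma_1 \to \Sigma_2 \to \Sigma_3, \qquad F_i\colon \Sigma_i \to L \quad (i = 1, 2, 3) \qquad (2.4.2)$$
it immediately follows that $(\Sigma_1, F_1) \prec (\Sigma_3, F_3)$.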
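The relation "more extensive than" can be checked by hand in a small classical model. The sketch below is only an illustration under assumptions that are not from the text: effects are vectors in $[0, 1]^2$, $\Sigma_2$ is the set of subsets of `{0, 1, 2}`, $\Sigma_1$ is the set of subsets of `{"a", "b"}`, and the atom effects `E1`, `E2` and the map `H_ATOM` are hypothetical choices in which `"a"` stands for the coarse-grained outcome `{0, 1}`.

```python
from fractions import Fraction as Fr
from itertools import combinations

def powerset(points):
    """All subsets of a finite set, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

# Hypothetical classical model: effects are vectors in [0, 1]^2.
E2 = {0: (Fr(1, 2), Fr(0)), 1: (Fr(1, 4), Fr(1, 3)), 2: (Fr(1, 4), Fr(2, 3))}
E1 = {"a": (Fr(3, 4), Fr(1, 3)), "b": (Fr(1, 4), Fr(2, 3))}
H_ATOM = {"a": frozenset({0, 1}), "b": frozenset({2})}

def measure(atoms, sigma):
    """Additive measure: componentwise sum of the atom effects contained in sigma."""
    return tuple(sum((atoms[x][k] for x in sigma), Fr(0)) for k in range(2))

def F1(sigma):
    return measure(E1, sigma)

def F2(sigma):
    return measure(E2, sigma)

def h(sigma):
    """Homomorphism Sigma_1 -> Sigma_2 induced by the images of the atoms."""
    out = frozenset()
    for x in sigma:
        out |= H_ATOM[x]
    return out

SIGMA1 = powerset(E1)
```

Here `h` respects symmetric difference and intersection, sends only the zero element to zero, and satisfies `F1(s) == F2(h(s))` for every `s` in `SIGMA1`, so in this toy model $(\Sigma_1, F_1) \prec (\Sigma_2, F_2)$ in the sense of D 2.4.1.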
D 2.4.2. A pair of observables $(\Sigma_1, F_1)$ and $(\Sigma_2, F_2)$ are said to be equivalent if $(\Sigma_1, F_1) \prec (\Sigma_2, F_2)$ and $(\Sigma_2, F_2) \prec (\Sigma_1, F_1)$.
It is easy to verify the fact that if $(\Sigma_1, F_1) \prec (\Sigma_2, F_2)$ then the following relationships hold: $F_1 \Sigma_1 \subset F_2 \Sigma_2$ and $\overline{\mathrm{co}}(F_1 \Sigma_1) \subset \overline{\mathrm{co}}(F_2 \Sigma_2)$. Thus, if two observables are equivalent, then they are convex equivalent (see D 2.3.4).
If $(\Sigma_1, F_1)$ is the kernel observable associated with the observable $(\Sigma, F)$ then $(\Sigma_1, F_1) \prec (\Sigma, F)$ and $(\Sigma_1, F_1)$ is convex-equivalent to $(\Sigma, F)$. Let $(\Sigma_2, F_2) \prec (\Sigma, F)$ and let $(\Sigma_2, F_2)$ be convex-equivalent to $(\Sigma, F)$. Then, from (2.4.1) it follows that
$$\overline{\mathrm{co}}(F h \Sigma_2) = \overline{\mathrm{co}}(F_2 \Sigma_2) = \overline{\mathrm{co}}(F\Sigma)$$
and therefore
$$\partial_e \overline{\mathrm{co}}(F h \Sigma_2) = \partial_e \overline{\mathrm{co}}(F\Sigma).$$
From this result, and as a consequence of Th. 2.3.4, it follows that the kernels $N_1$ and $N$ of both observables obey the relation $h N_1 = N$. If $(\Sigma_1, F_1)$ is the kernel observable associated with $(\Sigma, F)$ then $h\Sigma_1 \subset \Sigma_2$. In this sense the kernel observable associated with $(\Sigma, F)$ is the “smallest” observable $(\Sigma_2, F_2)$ which is convex-equivalent to $(\Sigma, F)$ and which satisfies $(\Sigma_2, F_2) \prec (\Sigma, F)$. This result exhibits the exceptional status of kernel observables.
The physical description of the registration process motivates the need to study an additional relationship between observables:
Let $b_0 = \bigcup_{i=1}^{n} b_{0i}$ be a decomposition of the registration method $b_0$. Then, according to III, Th. 2.3 we obtain
$$\psi(b_0, b) = \sum_{i=1}^{n} \lambda_i\, \psi(b_{0i}, b_{0i} \cap b), \qquad (2.4.3)$$
where $\lambda_i = \lambda_{\mathcal{R}_0}(b_0, b_{0i})$. We may also obtain $\lambda_i$ by means of the function $\mu(w, g)$ as follows:
$$\lambda_{\mathcal{R}_0}(b_0, b_{0i}) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_{0i})$$
for any $a$ for which $a \cap b_0 \neq \emptyset$, that is,
$$\lambda_i = \lambda_{\mathcal{R}_0}(b_0, b_{0i}) = \mu(w, \psi(b_0, b_{0i}))$$
holds for all $w$, that is, $\mu(w, \psi(b_0, b_{0i})) = \lambda_i$ is independent of $w$. Thus it follows that
$$\psi(b_0, b_{0i}) = \lambda_i 1. \qquad (2.4.4)$$
In equation (2.4.4) we meet, for the first time, an example of a general result (which will be discussed in XVII, §2.3) that we will not obtain any information about a microsystem from an effect of the form $\lambda 1$. This is self-evident in the case of (2.4.4) since $b_{0i}$, as well as $b_0$, are, as elements of $\mathcal{R}_0$, statistically independent of the preparation; (2.4.4) is a direct consequence of APS 7.2.
According to the discussions of III, §2 we may make arbitrary direct mixtures of registration methods. The direct mixtures are of great importance for the interpretation of quantum mechanics as well as for the selection
of axioms in the sense of III, §3. In III, §2 we have already noted that, in
general, an experimental physicist does not create mixtures of registration
methods in an experiment. In gёneral, as far as possible, he makes decom¬
positions. In making direct mixtures we lose almost nothing in the way of
experimental results (see III, §2) providing the weights are not too small.
We note, however, that the use of direct mixtures makes experiments more
difficult without making even a small improvement with respect to the
results. On the other hand, the decomposition of registration procedures, if
carried out experimentally by improving the tolerances of the apparatus
represents a refinement of the results, because it then becomes possible to
measure the effect processes (b0i, b0i n b) instead of only (b0, b) by means of
the improved apparatus (see III, §2).
If it is possible to measure the effects $\psi(b_{0i}, b_{0i} \cap b)$ then the measurement
of the mixtures $\psi(b_0, b)$ is not interesting because no new information is
obtained. Thus it is reasonable to seek, in mathematical terms, a solution of
the problem of decompositions, making the transition from the “special”
situation $\mathcal{R}(b_0) \to L$ to the idealized case $\Sigma \xrightarrow{F} L$.
For this purpose we see that the decomposition described by (2.4.3) may be
characterized in the two following different ways:
(1) We may consider $\psi_0(b) = \psi(b_0, b)$ and $\psi_{0i}(b) = \psi(b_{0i}, b_{0i} \cap b)$ to be
additive measures on the same Boolean ring $\mathcal{R}(b_0)$. Then (2.4.3)
describes the decomposition of measures on $\mathcal{R}(b_0)$ as follows:
$$\psi_0(b) = \sum_i \lambda_i \psi_{0i}(b). \tag{2.4.5}$$
(2) Let us consider the additive measures $\psi_0\colon \mathcal{R}(b_0) \to L$ and
$\psi_{0i} = \psi(b_{0i}, b_{0i} \cap b)\colon \mathcal{R}(b_{0i}) \to L$. Let $j_i$ be the injective map of the
subset $\mathcal{R}(b_{0i})$ into $\mathcal{R}(b_0)$. Then the following diagram is commutative:
$$\mathcal{R}(b_{0i}) \xrightarrow{\;j_i\;} \mathcal{R}(b_0), \qquad \psi_0 \circ j_i = \lambda_i \psi_{0i}\colon \mathcal{R}(b_{0i}) \to L.$$
These two characterizations suggest the following two idealizations concerning
observables:
1'. We begin with a single observable $\Sigma \xrightarrow{F} L$ and consider a decomposition
of $F$ into additive measures $F_i\colon \Sigma \to L$ of the form
$$F(\sigma) = \sum_{i=1}^{n} \lambda_i F_i(\sigma) \quad \text{where } \lambda_i > 0, \quad \sum_{i=1}^{n} \lambda_i = 1. \tag{2.4.6}$$
The $F_i$ need not be effective measures. It is easy to show that the uniform
structure $U_g$, defined by the metric
$$d(\sigma_1, \sigma_2) = \mu(w_0, F(\sigma_1 \dotplus \sigma_2))$$
(where $w_0$ is defined in Th. 1.4.1), is finer than the uniform structures $U_{g_i}$
defined by the metrics
$$d_i(\sigma_1, \sigma_2) = \mu(w_0, F_i(\sigma_1 \dotplus \sigma_2)).$$
Indeed, (2.4.6) gives $\lambda_i F_i \le F$ and hence $\lambda_i d_i \le d$.
Since the $F_i$ are uniformly continuous with respect to the $U_{g_i}$ (see Th. 1.4.2),
they are also uniformly continuous with respect to $U_g$ and, as is the case for $F$,
they are also $\sigma$-additive (because $\sigma$-additivity is equivalent to the condition
that $F_i(\sigma_\nu) \to 0$ for decreasing sequences which satisfy $\sigma_\nu \to 0$; see Th. 2.1.3).
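2'. Let $\Sigma \xrightarrow{F} L$ and $\Sigma_i \xrightarrow{F_i} L$ $(i = 1, 2, \ldots)$ be observables. In addition,
suppose that there exist isomorphic maps $h_i$ of $\Sigma_i$ onto an interval $[0, \eta_i]$ of $\Sigma$,
where $\eta_i \wedge \eta_j = 0$ for $i \neq j$ and $\bigvee_i \eta_i = \varepsilon$, for which the following diagrams
commute:
$$\Sigma_i \xrightarrow{\;h_i\;} \Sigma, \qquad F h_i = \lambda_i F_i\colon \Sigma_i \to L. \tag{2.4.7}$$
We then say that the observable $\Sigma \xrightarrow{F} L$ is a mixture of the observables
$\Sigma_i \xrightarrow{F_i} L$ with weights $\lambda_i$. From the diagram (2.4.7) it follows that
$F h_i(\varepsilon_i) = \lambda_i \mathbf{1}$, that is, $F(\eta_i) = \lambda_i \mathbf{1}$ and therefore $\lambda_i > 0$. Since the $\eta_i$ are a
decomposition of $\varepsilon$, we obtain $\mathbf{1} = F(\varepsilon) = \sum_i F(\eta_i) = \sum_i \lambda_i \mathbf{1}$, that is,
$\sum_i \lambda_i = 1$.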
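Condition 2' idealizes in a most precise way the decomposition of a
$b_0 \in \mathcal{R}_0$ into different $b_{0i} \in \mathcal{R}_0$ as described by condition (2). From 2' it follows
that 1' holds as follows: Let
$$\tilde F_i(\sigma) = \lambda_i^{-1} F(\eta_i \wedge \sigma). \tag{2.4.8}$$
Since $\eta_i \wedge \sigma \le \eta_i$, (2.4.8) defines an additive measure over $\Sigma$ with values in $L$. From
$\sigma = \bigvee_i (\eta_i \wedge \sigma)$ it follows that $F(\sigma) = \sum_i F(\eta_i \wedge \sigma)$. According to the diagram
(2.4.7) and definition (2.4.8) it follows that $F(\sigma) = \sum_i \lambda_i \tilde F_i(\sigma)$, which is a
decomposition of the form (2.4.6).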
The question whether a structure of the form 1' exists is mathematically
simpler than the question whether a structure of the form 2' exists. 1' is
somewhat weaker than 2' (we have already seen that 1' follows directly from
2'), but physically it is not much weaker, as we shall see with the help of the
following theorem:
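Th. 2.4.1. Suppose that, for the measure $F$ of an observable $\Sigma \xrightarrow{F} L$ there
exists a partition of the form (2.4.6). Then there exists an observable
$(\tilde\Sigma, \tilde F) \succ (\Sigma, F)$, a homomorphism $\Sigma \xrightarrow{h} \tilde\Sigma$ and a partition of the unit element
$\tilde\varepsilon$ of $\tilde\Sigma$ of the form
$$\tilde\varepsilon = \bigvee_i \tilde\eta_i \quad \text{where } \tilde\eta_i \wedge \tilde\eta_j = 0 \text{ for } i \neq j,$$
such that the following diagrams commute:
$$\Sigma \xrightarrow{\;h_i\;} \tilde\Sigma, \qquad \tilde F h_i = \lambda_i F_i\colon \Sigma \to L \quad (i = 1, 2, \ldots, n), \tag{2.4.9}$$
where $h_i(\sigma) = h(\sigma) \wedge \tilde\eta_i$.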
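PROOF. As we have already seen (in 1' and the subsequent discussion) the $F_i$ are
$\sigma$-additive measures over $\Sigma$. The sets
$$J_i = \{\sigma \mid \sigma \in \Sigma \text{ and } F_i(\sigma) = 0\}$$
are complete Boolean subrings of $\Sigma$; the factor rings $\Sigma_i = \Sigma/J_i$ are complete
Boolean rings. A $\sigma$-additive effective measure $\Sigma_i \xrightarrow{\hat F_i} L$ is defined by
$$\hat F_i(\rho) = F_i(\sigma) \quad \text{for any } \sigma \in \rho \in \Sigma_i. \tag{2.4.10}$$
Therefore we find that the $\Sigma_i \xrightarrow{\hat F_i} L$ are observables.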
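Using the $\Sigma_i = \Sigma/J_i$ we construct the product $\tilde\Sigma$ of the $\Sigma_i$. The elements of $\tilde\Sigma$ are
therefore the $n$-tuples $(\sigma_1, \sigma_2, \ldots, \sigma_n)$ with the ordering $(\sigma_1, \sigma_2, \ldots) \le (\sigma_1', \sigma_2', \ldots)$ if
and only if $\sigma_i \le \sigma_i'$ $(i = 1, 2, \ldots, n)$ (see AI, §3). Thus $\tilde\Sigma$ is complete. Using the $\hat F_i$
defined by (2.4.10) we define the following measure $\tilde F$ on $\tilde\Sigma$:
$$\tilde F(\sigma_1, \ldots, \sigma_n) = \sum_{i=1}^{n} \lambda_i \hat F_i(\sigma_i).$$
This measure is, as in the case of the $\hat F_i$, $\sigma$-additive. Therefore $\tilde\Sigma \xrightarrow{\tilde F} L$ is an
observable. Let $\tilde\eta_i = (0, 0, \ldots, \varepsilon_i, \ldots)$. Then we obtain the decomposition
$\tilde\varepsilon = \bigvee_i \tilde\eta_i$ of the unit element $\tilde\varepsilon$ of $\tilde\Sigma$ and
$$\tilde F(\sigma_1, \ldots, \sigma_n) = \sum_{i=1}^{n} \tilde F(\tilde\eta_i \wedge (\sigma_1, \ldots, \sigma_n)) = \sum_{i=1}^{n} \tilde F(0, 0, \ldots, \sigma_i, \ldots) = \sum_{i=1}^{n} \lambda_i \hat F_i(\sigma_i).$$
The map $\sigma_i \to (0, 0, \ldots, \sigma_i, 0, \ldots)$ is a homomorphism $\tilde h_i$ in the sense of 2' for
which the diagram
$$\Sigma_i \xrightarrow{\;\tilde h_i\;} \tilde\Sigma, \qquad \tilde F \tilde h_i = \lambda_i \hat F_i$$
is commutative. Thus we have a structure of type 2' between the $\Sigma_i$ and $\tilde\Sigma$.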
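Using the canonical homomorphism $\Sigma \xrightarrow{k_i} \Sigma/J_i$, from $\tilde h_i k_i$ we obtain a
homomorphism of $\Sigma$ into $\tilde\Sigma$ for which the diagram
$$\Sigma \xrightarrow{\;\tilde h_i k_i\;} \tilde\Sigma, \qquad \tilde F \tilde h_i k_i = \lambda_i F_i \tag{2.4.11}$$
is commutative. Let $h(\sigma) = \bigvee_{i=1}^{n} \tilde h_i k_i(\sigma)$; $h$ then defines a homomorphism $\Sigma \xrightarrow{h} \tilde\Sigma$
which satisfies the conditions of Th. 2.4.1. The fact that $h$ is a homomorphism
follows directly from
$$h(\sigma) = (k_1(\sigma), k_2(\sigma), \ldots, k_n(\sigma)).$$
Thus, we obtain
$$h_i(\sigma) = h(\sigma) \wedge \tilde\eta_i = \tilde h_i k_i(\sigma),$$
from which it follows that (2.4.11) transforms into (2.4.9).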
It is easy to verify that condition 2' is equivalent to the representation of
the lattice $\Sigma$ as a product which was used in the proof of Th. 2.4.1. Using the
$\eta_i$ in 2', for each $\sigma \in \Sigma$ we may form the decomposition $\sigma = \bigvee_{i=1}^{n} (\sigma \wedge \eta_i)$,
which may be rewritten in “component form”
$$\sigma = (\sigma \wedge \eta_1, \sigma \wedge \eta_2, \ldots, \sigma \wedge \eta_n).$$
The $h_i$ in (2.4.7) are nothing other than isomorphisms by which we may
identify the $\Sigma_i$ with the Boolean ring of the $i$th component.
We may also represent condition 2' as follows: $\Sigma$ permits a “representation”
as a product of the $\Sigma_i$ as follows:
$$\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_n) \quad \text{where } \sigma_i \in \Sigma_i,$$
where $F(\sigma)$ may be represented in terms of the $\Sigma_i \xrightarrow{F_i} L$ in the following
form:
$$F(\sigma) = \sum_{i=1}^{n} \lambda_i F_i(\sigma_i).$$
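From the diagram (2.4.9) it follows that
$$\tilde F(\tilde\eta_i) = \tilde F(h(\varepsilon) \wedge \tilde\eta_i) = \tilde F h_i(\varepsilon) = \lambda_i F_i(\varepsilon) = \lambda_i \mathbf{1}.$$
The observable $\tilde\Sigma \xrightarrow{\tilde F} L$ in Th. 2.4.1 may be represented as a product
$$\tilde\sigma = (\tilde\sigma \wedge \tilde\eta_1, \ldots, \tilde\sigma \wedge \tilde\eta_n)$$
and
$$\tilde F(\tilde\sigma) = \sum_{i=1}^{n} \lambda_i \tilde F_i(\tilde\sigma \wedge \tilde\eta_i),$$
where the $\tilde F_i$ are defined by
$$\tilde F_i(\tilde\sigma \wedge \tilde\eta_i) = \lambda_i^{-1} \tilde F(\tilde\sigma \wedge \tilde\eta_i).$$
Therefore the observable $\tilde\Sigma \xrightarrow{\tilde F} L$ may be decomposed according to the
“refined registration methods” $\tilde\eta_i$. Th. 2.4.1 states that if the measure $F$ for
the observable $\Sigma \xrightarrow{F} L$ satisfies the decomposition (2.4.6), then by the map $h$
this observable can be identified with a portion (namely with $h\Sigma \subset \tilde\Sigma$) of the
observable $\tilde\Sigma \xrightarrow{\tilde F} L$, because the decomposition of $\tilde\Sigma \xrightarrow{\tilde F} L$ according to the
refined registration procedures $\tilde\eta_i$ leads to a partition of the restriction of the
measure $\tilde F$ to $h\Sigma$ (which is identical to $F$ on $\Sigma$) of the form (2.4.6). Therefore a
decomposition of the form (2.4.6) can always be considered to be a
decomposition of a “more extensive” observable according to a set of refined
registration methods.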
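
The construction in the proof of Th. 2.4.1 can be traced on a finite example. The sketch below is only an illustration under stated assumptions: effects are modelled as pairs of numbers in [0, 1] (diagonal operators on a two-dimensional space), the ring consists of the subsets of the hypothetical atom set {a, b}, and the names `parts`, `F_tilde`, `eta` and `h` are introduced here and are not notation from the text. Dropping the atoms on which a component vanishes plays the role of passing to the factor ring.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Assumption: effects are pairs of numbers in [0, 1]; ONE is the unit effect 1.
ZERO, ONE = (Fr(0), Fr(0)), (Fr(1), Fr(1))

def add(g, h):
    return tuple(x + y for x, y in zip(g, h))

def scale(c, g):
    return tuple(c * x for x in g)

def measure(atom_values, sigma):
    """Additive measure: sum of the effects attached to the atoms in sigma."""
    total = ZERO
    for atom in sigma:
        total = add(total, atom_values[atom])
    return total

def subsets(atoms):
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

# Hypothetical decomposition F = lam[0]*F_1 + lam[1]*F_2 of the form (2.4.6).
lam = [Fr(1, 3), Fr(2, 3)]
parts = [
    {"a": ONE, "b": ZERO},                             # F_1 vanishes on atom b
    {"a": (Fr(0), Fr(1, 2)), "b": (Fr(1), Fr(1, 2))},  # F_2
]
F = {a: add(scale(lam[0], parts[0][a]), scale(lam[1], parts[1][a])) for a in "ab"}

# Atoms of the product ring: pairs (i, atom) with F_i(atom) != 0
# (finite analogue of the factor rings Sigma / J_i).
F_tilde = {(i, a): scale(lam[i], Fi[a])
           for i, Fi in enumerate(parts) for a in Fi if Fi[a] != ZERO}
eta = [frozenset(x for x in F_tilde if x[0] == i) for i in range(len(parts))]

def h(sigma):
    """Homomorphism sigma -> (k_1(sigma), ..., k_n(sigma)) into the product ring."""
    return frozenset(x for x in F_tilde if x[1] in sigma)

for sigma in subsets("ab"):
    # The extended measure restricted to h(Sigma) reproduces F.
    assert measure(F_tilde, h(sigma)) == measure(F, sigma)
    for i, Fi in enumerate(parts):
        # Commutativity as in (2.4.9): F~(h(sigma) ^ eta_i) = lambda_i F_i(sigma).
        assert measure(F_tilde, h(sigma) & eta[i]) == scale(lam[i], measure(Fi, sigma))
```

In this toy case the product ring has three atoms, and each component unit carries the weight of its component, in agreement with the relation derived from (2.4.9).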
Hence, in experimental situations it is desirable to seek those observables
for which the measure $F$ permits no proper decompositions of the form
(2.4.6). For this purpose we shall now introduce the following definition:
D 2.4.3. An observable $\Sigma \xrightarrow{F} L$ is said to be irreducible if, for any decomposition
of $F$ in the following form
$$F(\sigma) = \lambda F_1(\sigma) + (1 - \lambda) F_2(\sigma), \tag{2.4.12}$$
where $0 < \lambda < 1$ and $F_1, F_2$ are additive measures $F_1, F_2\colon \Sigma \to L$, it follows
that $F_1 = F_2 = F$.
It immediately follows that every decision observable is irreducible.
In complete analogy to the set $K(\Sigma)$ which was defined in Th. 2.1.1 we shall
now introduce a set $K(\Sigma, L)$ as the set of all $\sigma$-additive measures $F\colon \Sigma \to L$ on
the complete Boolean ring $\Sigma$. Then $K(\Sigma, L)$ is a convex set; the measure $F$ for
an irreducible observable $\Sigma \xrightarrow{F} L$ is clearly an extreme point of $K(\Sigma, L)$. On
the basis of (2.2.1), Th. 2.2.1, and Th. 2.2.2 (see the remarks following
Th. 2.2.2) we may also identify $K(\Sigma, L)$ with the set of all mi-morphisms (not
only those which transform effective $w$ into effective measures)
$K(\mathcal{H}_1, \ldots) \to K(\Sigma)$. An investigation of the mathematical structure of $K(\Sigma, L)$
remains to be done; in particular, the structure of $\partial_e K(\Sigma, L)$ is not yet
known. In any case, every $F$ for which $\Sigma \xrightarrow{F} G$ belongs to $\partial_e K(\Sigma, L)$.
An experimental physicist must strive to measure irreducible observables
(or at least “approximately irreducible” observables). We shall not describe
what we mean here by “approximately irreducible” observables. To do so
it would be necessary to introduce a uniform structure of physical imprecision
in $K(\Sigma, L)$ (for the general structure see [1], §9) with respect to which
$K(\Sigma, L)$ is precompact (totally bounded).
We shall now consider the following question: Is irreducibility preserved in
the transition to a more extensive observable? This need not be the case; in
the following theorem we shall find that if the more extensive observable is
reducible, then each of its components is more extensive than the irreducible
observable.
Th. 2.4.2. Let $\Sigma \xrightarrow{F} L$ be irreducible and let $\tilde\Sigma \xrightarrow{\tilde F} L$ be more extensive than
$\Sigma \xrightarrow{F} L$ and reducible in the sense of 2'. Then every component $\tilde\Sigma_i \xrightarrow{\tilde F_i} L$ is also
more extensive than $\Sigma \xrightarrow{F} L$.
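PROOF. According to 2' and diagram (2.4.7) we may identify $\tilde\Sigma_i$ with $[0, \tilde\eta_i] \subset \tilde\Sigma$
and $\tilde F_i$ with $\lambda_i^{-1} \tilde F$ restricted to $[0, \tilde\eta_i]$. Since $\tilde\Sigma \xrightarrow{\tilde F} L$ is more extensive than
$\Sigma \xrightarrow{F} L$ there exists a homomorphism $h\colon \Sigma \to \tilde\Sigma$ such that
$$\tilde F h = F. \tag{2.4.13}$$
The map $h_i(\sigma) = h(\sigma) \wedge \tilde\eta_i$ is a homomorphism $\Sigma \to [0, \tilde\eta_i] \subset \tilde\Sigma$. On the
basis of the identification of $[0, \tilde\eta_i]$ with $\tilde\Sigma_i$ we obtain
$$\tilde F h_i(\sigma) = \tilde F(h(\sigma) \wedge \tilde\eta_i) = \lambda_i \tilde F_i(h(\sigma) \wedge \tilde\eta_i) = \lambda_i \tilde F_i h_i(\sigma). \tag{2.4.14}$$
On the other hand, on the basis of (2.4.13) we find that
$$F(\sigma) = \tilde F h(\sigma) = \sum_i \tilde F(h(\sigma) \wedge \tilde\eta_i).$$
Since $F$ is irreducible, it follows that
$$\tilde F(h(\sigma) \wedge \tilde\eta_i) = \lambda_i F(\sigma).$$
Equation (2.4.14) transforms into
$$F(\sigma) = \tilde F_i h_i(\sigma). \tag{2.4.15}$$
Equation (2.4.15) says that the diagram
$$\Sigma \xrightarrow{\;h_i\;} \tilde\Sigma_i, \qquad \tilde F_i h_i = F\colon \Sigma \to L$$
is commutative, and thus we find that $\tilde\Sigma_i \xrightarrow{\tilde F_i} L$ is more extensive than $\Sigma \xrightarrow{F} L$.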
Therefore Th. 2.4.2 guarantees that, in the transition from an irreducible
observable to a more extensive observable (at least approximately) we may
make use of irreducible observables. In IX, §1 we shall study such transitions
for decision observables which are always irreducible.
If we combine our wish for irreducible observables with the wish for kernel
observables, then the “desired” measurements would be irreducible kernel
observables. The irreducible kernel observable characterizes, so to say, the
“optimum apparatus without unnecessary redundancy” which we would seek
to approximate in an experiment.
We may also wish to determine whether it is reasonable to interpret the
irreducible kernel observables as “measurements of the structure of microsystems.”
Are all irreducible kernel observables also decision observables?
According to the remarks following D 2.3.6 and the remarks following
D 2.4.3 every decision observable is an irreducible kernel observable.
If all irreducible kernel observables were also decision observables, then we
could justifiably assert that every measurement made by a sufficiently refined
apparatus (neglecting unnecessary mixtures of registrations) is traceable to
a measurement of decision observables. It should be mentioned that this is
indeed the case for “classical” physical systems. However, in the measurement
of microsystems there are irreducible kernel observables which are not
decision observables. We shall now exhibit an example:
Let $\varphi, \chi$ be two normalized orthogonal vectors in one of the Hilbert spaces
$\mathcal{H}_i$, for example, $\mathcal{H}_1$. Let us consider the following effects:
$$g_1 = \alpha(P_\varphi, 0, 0, \ldots), \qquad g_2 = \alpha(P_\psi, 0, 0, \ldots), \tag{2.4.16}$$
where
$$\psi = \frac{1}{\sqrt{2}}(\varphi + \chi) \quad \text{and} \quad \alpha = 2 - \sqrt{2}.$$
Below we shall show that $g_1 + g_2 \le \mathbf{1}$. Thus $g_1, g_2$ are coexistent.
Let $\Sigma$ be the Boolean ring consisting of the subsets of a set of three elements.
We assign the following measures to the atoms $\eta_1, \eta_2, \eta_3$ of $\Sigma$:
$$\eta_1 \to g_1, \qquad \eta_2 \to g_2, \qquad \eta_3 \to \mathbf{1} - g_1 - g_2.$$
In this way we determine an observable $\Sigma \xrightarrow{F} L$ for which the remaining three
elements of $\Sigma$ (excluding 0 and $\varepsilon$) correspond to
$$\eta_1 \cup \eta_2 \to g_1 + g_2, \qquad \eta_1 \cup \eta_3 \to \mathbf{1} - g_2, \qquad \eta_2 \cup \eta_3 \to \mathbf{1} - g_1.$$
$\Sigma \xrightarrow{F} L$ is a kernel observable with N = T, since $\mathrm{co}(F\Sigma)$ is a three-dimensional
parallelepiped: the four points $0, g_1, g_2, g_1 + g_2$ generate a two-dimensional
parallelogram. If we add $\mathbf{1} - g_1 - g_2$ to these four points we obtain the
remaining four points of the parallelepiped $\mathrm{co}(F\Sigma)$.
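Now we need only show that $\Sigma \xrightarrow{F} L$ is irreducible. Let $0 < \lambda < 1$,
$F(\sigma) = \lambda F_1(\sigma) + (1 - \lambda) F_2(\sigma)$. For $\sigma = \eta_1$ we obtain the following equation
for the components in $\mathcal{H}_1$ (the other components are 0):
$$\alpha P_\varphi = \lambda F_1(\eta_1) + (1 - \lambda) F_2(\eta_1). \tag{2.4.17}$$
For each vector $\tau \perp \varphi$ it follows from $F_1(\eta_1) \ge 0$, $F_2(\eta_1) \ge 0$ that
$\langle \tau, F_1(\eta_1)\tau \rangle = 0$. Thus, since $F_1(\eta_1) \ge 0$ we must obtain $F_1(\eta_1)\tau = 0$.
Therefore $F_1(\eta_1) = \varepsilon_1 P_\varphi$. In the same way it follows that $F_1(\eta_2) = \varepsilon_2 P_\psi$. If we
then show that $\varepsilon_1 = \varepsilon_2 = \alpha$, we would then easily obtain $F_1(\sigma) = F(\sigma)$ for all
$\sigma \in \Sigma$, that is, $\Sigma \xrightarrow{F} L$ is irreducible.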
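To this purpose we shall consider all $\varepsilon_1, \varepsilon_2$ for which
$$\eta_1 \to \varepsilon_1(P_\varphi, 0, \ldots), \qquad \eta_2 \to \varepsilon_2(P_\psi, 0, \ldots)$$
determine an additive measure $\Sigma \to L$. This is the case only if
$\varepsilon_1 P_\varphi + \varepsilon_2 P_\psi \le \mathbf{1}$ (as an operator in $\mathcal{H}_1$).
Let us consider the operator $\varepsilon_1 P_\varphi + \varepsilon_2 P_\psi$. It is easy to determine the $\varepsilon_1, \varepsilon_2$
for which this operator takes on 1 as an eigenvalue: $\varepsilon_1, \varepsilon_2$ must satisfy the
condition
$$\varepsilon_1 + \varepsilon_2 - \tfrac{1}{2}\varepsilon_1 \varepsilon_2 - 1 = 0. \tag{2.4.18}$$
(For the case in which $\varepsilon_1 = \varepsilon_2$ we obtain the special values
$\varepsilon_1 = \varepsilon_2 = \alpha = 2 - \sqrt{2}$.) In the following diagram (Figure 3) we illustrate the
domain in the $(\varepsilon_1, \varepsilon_2)$ plane for which $\varepsilon_1 P_\varphi \ge 0$, $\varepsilon_2 P_\psi \ge 0$ and
$\varepsilon_1 P_\varphi + \varepsilon_2 P_\psi \le \mathbf{1}$ by means of shading. It is bounded by the curve (2.4.18).
This domain is convex, and $\varepsilon_1 = \varepsilon_2 = \alpha$ is an extreme point. Thus, by (2.4.17)
it follows that $\varepsilon_1 = \varepsilon_2 = \alpha$.
Figure 3. The shaded domain in the $(\varepsilon_1, \varepsilon_2)$ plane.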
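
The eigenvalue condition (2.4.18) can be checked by a short computation. The following is a sketch under the assumption that the operators are restricted to the two-dimensional subspace spanned by $\varphi$ and $\chi$ (both projections vanish on its orthogonal complement) and written as matrices in the basis $(\varphi, \chi)$; the symbols $A$ and $t$ are introduced here for the illustration only.

```latex
\begin{align*}
P_\varphi &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
P_\psi = \tfrac12 \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
A = \varepsilon_1 P_\varphi + \varepsilon_2 P_\psi
  = \begin{pmatrix} \varepsilon_1 + \tfrac12 \varepsilon_2 & \tfrac12 \varepsilon_2 \\
                    \tfrac12 \varepsilon_2 & \tfrac12 \varepsilon_2 \end{pmatrix}, \\
\operatorname{tr} A &= \varepsilon_1 + \varepsilon_2, \qquad
\det A = \tfrac12 \varepsilon_1 \varepsilon_2, \qquad
\det(A - t\,\mathbf{1}) = t^2 - (\varepsilon_1 + \varepsilon_2)\, t + \tfrac12 \varepsilon_1 \varepsilon_2 .
\end{align*}
% t = 1 is an eigenvalue exactly when the characteristic polynomial vanishes at t = 1:
\[
1 - \varepsilon_1 - \varepsilon_2 + \tfrac12 \varepsilon_1 \varepsilon_2 = 0
\iff
\varepsilon_1 + \varepsilon_2 - \tfrac12 \varepsilon_1 \varepsilon_2 - 1 = 0 .
\]
% Symmetric case: the eigenvalues of A are alpha (1 +- 1/sqrt 2).
\[
\varepsilon_1 = \varepsilon_2 = \alpha: \quad
\alpha^2 - 4\alpha + 2 = 0, \quad \alpha = 2 \pm \sqrt{2}, \quad
\text{and } A \le \mathbf{1} \text{ selects } \alpha = 2 - \sqrt{2}.
\]
```

For $\alpha = 2 - \sqrt{2}$ the two eigenvalues are $1$ and $3 - 2\sqrt{2}$, so the bound $A \le \mathbf{1}$ is attained; for $\alpha = 2 + \sqrt{2}$ the larger eigenvalue exceeds 1.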
The reader should show that the observable constructed in (2.4.16) is
reducible for $\alpha = \tfrac{1}{2}$, and, in the sense of Th. 2.4.1, can be described as a
mixture of two decision observables.
According to Th. 2.4.2, the observable constructed using $\alpha = 2 - \sqrt{2}$ also
cannot be “decomposed” by making a transition to a more
extensive observable. If we had an apparatus which measures the observable
characterized by (2.4.16) we would need only two yes-no responses, namely,
$\eta_1$ and $\eta_2$. These responses are mutually exclusive, that is, they never occur
together. The maximum frequency (for an ensemble) for the response $\eta_1$ is
not equal to 1 but $\alpha = 2 - \sqrt{2}$; similarly for $\eta_2$. There are, however,
ensembles for which we “always” obtain at least one of the two responses
$\eta_1, \eta_2$ since $\alpha(P_\varphi + P_\psi)$ has the eigenvalue 1! The observable cannot be
improved. It is very instructive to clarify the notions of coexistence and
observable for this example, and to realize that observables which are not
decision observables are not necessarily the result of “poor” or “unskilled”
experimentation.
In §3 we shall find that the observable described above is complementary
to the decision observables $(P_\varphi, 0, \ldots)$ and $(P_\psi, 0, \ldots)$; we shall discuss this
fact in more detail in §3.
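
The statements about response frequencies can be checked numerically. The sketch below assumes a real two-dimensional model in which $\varphi = (1, 0)$, $\chi = (0, 1)$ and pure ensembles are unit vectors; the function names are hypothetical and chosen for this illustration only.

```python
import math

ALPHA = 2 - math.sqrt(2)

# Assumed real 2-dimensional model: phi = (1, 0), chi = (0, 1),
# psi = (phi + chi) / sqrt(2).
PHI = (1.0, 0.0)
PSI = (1 / math.sqrt(2), 1 / math.sqrt(2))

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return (v[0] / n, v[1] / n)

def response_frequencies(v, alpha=ALPHA):
    """Frequencies of the responses eta_1, eta_2, eta_3 in the pure ensemble v."""
    p1 = alpha * dot(v, PHI) ** 2   # expectation of alpha * P_phi
    p2 = alpha * dot(v, PSI) ** 2   # expectation of alpha * P_psi
    return p1, p2, 1.0 - p1 - p2

# Ensemble along the bisector of phi and psi: an eigenvector of
# alpha * (P_phi + P_psi) for the eigenvalue 1, so eta_3 never responds.
bisector = normalize((PHI[0] + PSI[0], PHI[1] + PSI[1]))
on_bisector = response_frequencies(bisector)

# Ensemble phi: eta_1 reaches its maximal frequency alpha, which is below 1.
on_phi = response_frequencies(PHI)
```

On the bisector ensemble one of the two responses $\eta_1$, $\eta_2$ always occurs, each with frequency one half, although neither response alone can ever reach frequency 1.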
2.5 Measurement Scales for Observables
In the definition of an observable we have not discussed the notion of a
measurement scale. In practical applications measurement scales often (but
not always) play an important role. They are always somewhat arbitrary, but
they are also very practical. In the case of a well-developed theory, a well-
chosen measurement scale can both be very practical and involve some of the
physical structures of the theory; see, for example, the observables defined by
the infinitesimal transformations of a group (see VII); here a particular scale
is preferred. Rectangular coordinates represent a particular selection of scales
which are often preferred to arbitrary coordinates in three dimensions. Since
we have not developed a detailed formulation of quantum mechanics until
this chapter, it is not possible to develop specific measurement scales in this
chapter. This fact must be emphasized, otherwise the reader may think that
we have already formulated the selection of a measurement scale in
mathematical terms. What do we mean by a measurement scale?
According to §2.1 and §2.2, to an observable $\Sigma \xrightarrow{F} L$ there exists a space
$\mathcal{B}(\Sigma)$ together with the set $K(\Sigma)$ of $\sigma$-additive measures over $\Sigma$. The set $\Sigma$
symbolizes the set of registrations $b \in \mathcal{R}(b_0)$. In general, by a measurement
scale we mean a sequence of numbers such that the fact that the measurement
values fall within an interval of the scale can be represented by one of the
$b \subset b_0$. We shall now replace this intuitive idea by a mathematically correct
definition.
D 2.5.1. An element $y \in \mathcal{B}'(\Sigma)$ is called a measurement scale for the complete
Boolean ring $\Sigma$.
The expression “random variable” is frequently used instead of “measurement
scale.” We choose not to use the former expression because the term
“random” only obscures the meaning, since it can be taken to suggest
pure chance. In physics the concept of pure chance, as a
fundamental concept, is without meaning.
We wish to clarify the fact that D 2.5.1 corresponds to the intuitive
meaning of a measurement scale described above. At first $y$ is an affine
functional on $K(\Sigma)$. In §2.1 we have already said that we may identify $\mathcal{B}'(\Sigma)$
with the set of measurable functions. But, as we mentioned in §2.1, we shall
not take this mathematical route to obtain a representation of a Boolean
ring.
The spectral family defined in D 2.1.4 makes the following definition
plausible:
D 2.5.2. Let $\alpha_1 < \alpha_2$; we define $\sigma(y \le \alpha_2) \dotplus \sigma(y \le \alpha_1)$ to be the registration
for which the scale value of $y$ lies in the interval $(\alpha_1, \alpha_2]$.
This definition will be justified by the following facts (let $m \in K(\Sigma)$):
$$m(\sigma(y \le \alpha_2) \dotplus \sigma(y \le \alpha_1)) = 1 \Rightarrow \alpha_1 \le (m, y) \le \alpha_2,$$
$$m(\sigma^*(y \le \alpha_2)) = 1 \Rightarrow (m, y) \ge \alpha_2, \qquad m(\sigma(y \le \alpha_1)) = 1 \Rightarrow (m, y) \le \alpha_1.$$
D 2.5.3. Let $\Sigma(y)$ denote the complete Boolean ring generated by
$\{\sigma(y \le \alpha) \mid \alpha \in \mathbf{R}\}$. We shall call $\Sigma(y)$ the ring of registrations generated by the
measurement scale $y$.
Therefore it “suffices” to measure the probabilities for all registrations
$\sigma(y \le \alpha)$ in order to determine (in principle) the probabilities for all $\sigma \in \Sigma(y)$.
Thus, for the purpose of measurement it is “economical” to introduce
measurement scales. In the following we shall consider this topic in more
detail.
The following concepts will be important in the discussion which follows:
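D 2.5.4. The set
$$\mathrm{Sp}(y) = \{\alpha \mid \sigma(y \le \alpha + \varepsilon) \dotplus \sigma(y \le \alpha - \varepsilon) \neq 0 \text{ for all } \varepsilon > 0\}$$
is called the spectrum of the scale $y$.
The set
$$\mathrm{Sp}_d(y) = \{\alpha \mid \sigma(y \le \alpha) \dotplus \sigma(y \le \alpha-) \neq 0\}$$
is called the discrete spectrum of the scale $y$.
The set
$$\mathrm{Sp}_c(y) = \{\alpha \mid [\sigma(y \le \alpha + \varepsilon) \dotplus \sigma(y \le \alpha)] \vee [\sigma(y \le \alpha - \varepsilon) \dotplus \sigma(y \le \alpha-)] \neq 0 \text{ for all } \varepsilon > 0\}$$
is called the continuous spectrum of the scale $y$.
It is easy to see that $\mathrm{Sp}(y)$ is a closed subset of $\mathbf{R}$.
Since there always exists an interval about $\alpha \notin \mathrm{Sp}(y)$ for which
$\sigma(y \le \alpha + \varepsilon) \dotplus \sigma(y \le \alpha - \varepsilon) = 0$, the registration of a scale value in
$(\alpha - \varepsilon, \alpha + \varepsilon]$ is “impossible” (see D 2.5.2). Therefore we shall call $\mathrm{Sp}(y)$ the
set of “possible measurement values” for the scale $y$.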
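
For a finite Boolean ring the definitions D 2.5.2 to D 2.5.4 can be made concrete. The sketch below assumes that the ring consists of the subsets of a finite set of atoms and that a scale is given by a real value on each atom (a simplification: in this finite case the spectrum coincides with the discrete spectrum and the continuous spectrum is empty). The atom names and function names are hypothetical.

```python
def sigma_le(y, alpha):
    """Spectral family: the registration sigma(y <= alpha)."""
    return frozenset(a for a, value in y.items() if value <= alpha)

def sigma_lt(y, alpha):
    """Left limit sigma(y <= alpha-) of the spectral family."""
    return frozenset(a for a, value in y.items() if value < alpha)

def interval_registration(y, a1, a2):
    """Registration 'scale value lies in (a1, a2]' in the sense of D 2.5.2."""
    return sigma_le(y, a2) ^ sigma_le(y, a1)   # ^ is the Boolean ring sum

def discrete_spectrum(y):
    """All alpha at which the spectral family jumps."""
    candidates = sorted(set(y.values()))
    return [al for al in candidates if sigma_le(y, al) ^ sigma_lt(y, al)]

def atoms_of_generated_ring(y):
    """Atoms of the ring generated by the scale: one jump per spectral value."""
    return [sigma_le(y, al) ^ sigma_lt(y, al) for al in discrete_spectrum(y)]

# Hypothetical scale on three atoms; r2 and r3 carry the same scale value.
y = {"r1": -1.0, "r2": 1.0, "r3": 1.0}
spectrum = discrete_spectrum(y)
atoms = atoms_of_generated_ring(y)
impossible = interval_registration(y, -0.5, 0.5)
```

Because the scale does not separate `r2` from `r3`, the ring it generates has only two atoms and is strictly smaller than the full ring of subsets; a scale with three distinct values would generate the whole ring.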
Those readers who are already accustomed to the intricacies of quantum
mechanics will at first be somewhat surprised to find that such an essential
concept as the “spectrum of possible measurement values” apparently
depends only on the arbitrary choice of a Boolean ring and a scale y. The
spectrum of a (conventional) quantum mechanical observable (for example,
the energy) exhibits the typical quantum mechanical structure—namely, the
structure of discrete measurement values. The following remarks are in
order:
(1) The Boolean ring $\Sigma$ of an observable $\Sigma \xrightarrow{F} L$ is not at all “arbitrary.”
For example, according to Th. 1.4.8, for a decision observable $F$ we
may identify $\Sigma$ with the set of decision effects $F\Sigma \subset G$.
(2) The choice of a scale $y$ for a given $\Sigma$ appears to be arbitrary. If,
however, we seek to use a scale $y$ for which $\Sigma(y)$ is “as large as
possible,” for example, $\Sigma(y) = \Sigma$, then the scale $y$ exhibits the structure
of $\Sigma$.
One purpose (but not the only one) for the introduction of a measurement
scale is to manage only with the set $\{\sigma(y \le \alpha)\}$ instead of the complete
Boolean ring. That is, the introduction of a measurement scale corresponds
to the “need” to “measure only as much as is necessary,” as we have already
described above.
(3) It is possible that the choice of a measurement scale for a decision
observable does not correspond to the question of the introduction of
the most “practical” information about a measurement apparatus.
Instead, if the decision observable is defined as an infinitesimal
transformation, then the physical meaning of the scale is obtained
from the transformations. We may express this fact as follows: The
Boolean algebra of the decision observable is generated by the
spectral family $E(\lambda)$ of a one-parameter unitary group given by
$$U_\tau = \int e^{i\lambda\tau}\, dE(\lambda) = e^{iA\tau} \quad \text{where } A = \int \lambda\, dE(\lambda).$$
Thus the $\lambda$-scale is determined by $U_\tau$. Observables which are defined in
terms of infinitesimal transformations will be discussed in VII and
VIII.
If $\Sigma \xrightarrow{F} L$ is an observable, and $y$ is a measurement scale, then $\Sigma(y) \xrightarrow{F} L$ will
be an observable for which $(\Sigma(y), F) \prec (\Sigma, F)$ holds (see D 2.4.1).
D 2.5.5. If $\Sigma \xrightarrow{F} L$ is an observable, and if $y$ is a measurement scale, then we
shall call $\Sigma(y) \xrightarrow{F} L$ the partial observable generated by the scale $y$.
We may expand the above concept to a finite collection of scales as
follows: Let $\Sigma(y_1, y_2, \ldots, y_n)$ denote the Boolean ring generated by the
$\sigma(y_i \le \alpha)$, $i = 1, 2, \ldots, n$. Let $\Sigma(y_1, y_2, \ldots, y_n) \xrightarrow{F} L$ denote the partial observable
generated by the scales $y_1, y_2, \ldots, y_n$.
In physics it is customary to select a set of “practical” scales in order to
obtain $\Sigma(y_1, y_2, \ldots, y_n) = \Sigma$.
Here we shall not consider the case in which $\Sigma(y)$ is “smaller” than $\Sigma$. Such
cases can be treated using the methods discussed in §2.1 and §2.2. Instead, we
shall consider only the case in which $\Sigma(y_1, \ldots, y_n) = \Sigma$. This is sufficient (see
Th. 2.5.6) because, for separable Boolean rings $\Sigma$ there always exists a scale $y$
for which $\Sigma(y) = \Sigma$, and in §2.2 we have seen that we may restrict ourselves to
observables, that is, to separable Boolean rings.
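Th. 2.5.1. To each separable complete Boolean ring $\Sigma$ there exists a totally
ordered subset $\Lambda$ such that the smallest complete Boolean ring containing $\Lambda$ is
equal to $\Sigma$.
Proof. Since $\Sigma$ is separable, $\Sigma$ contains a countable dense subset $\Gamma$. Using the
elements $\gamma_\nu$ of $\Gamma$ we may recursively define the following totally ordered subsets of
$\Sigma$:
$$\Lambda_1\colon\ 0 \le \gamma_1,$$
$$\Lambda_2\colon\ 0 \le \gamma_1 \cdot \gamma_2 \le \gamma_1 \le \gamma_1 \dotplus (\gamma_2 \dotplus \gamma_1) \cdot \gamma_2,$$
and with
$$\Lambda_n\colon\ 0 \le \pi_1 \le \pi_2 \le \cdots \le \pi_\nu \le \pi_{\nu+1} \le \cdots \le \pi_m$$
we set
$$\Lambda_{n+1}\colon\ 0 \le \pi_1 \cdot \gamma_{n+1} \le \pi_1 \le \pi_1 \dotplus (\pi_2 \dotplus \pi_1) \cdot \gamma_{n+1} \le \pi_2 \le \cdots \le \pi_\nu \le \pi_\nu \dotplus (\pi_{\nu+1} \dotplus \pi_\nu) \cdot \gamma_{n+1} \le \pi_{\nu+1} \le \cdots \le \pi_m \le \pi_m \dotplus (\gamma_{n+1} \dotplus \pi_m) \cdot \gamma_{n+1}.$$
We therefore obtain
$$\gamma_{n+1} = \pi_1 \cdot \gamma_{n+1} \dotplus \pi_1 \dotplus [\pi_1 \dotplus (\pi_2 \dotplus \pi_1) \cdot \gamma_{n+1}] \dotplus \pi_2 \dotplus [\pi_2 \dotplus (\pi_3 \dotplus \pi_2) \cdot \gamma_{n+1}] \dotplus \cdots \dotplus \pi_m \dotplus [\pi_m \dotplus (\gamma_{n+1} \dotplus \pi_m) \cdot \gamma_{n+1}].$$
Thus, recursively, we find that the Boolean ring generated by the $\Lambda_n$ contains all
of the elements $\gamma_1, \gamma_2, \ldots, \gamma_n$. Clearly the Boolean ring generated by $\gamma_1, \gamma_2, \ldots, \gamma_n$
contains the set $\Lambda_n$.
Since $\Lambda_{n+1} \supset \Lambda_n$ and since each $\Lambda_n$ contains a finite number of elements, the set
$\Lambda = \bigcup_n \Lambda_n$ is countable. We see that $\Lambda$ is totally ordered, because any two
elements of $\Lambda$, for sufficiently high $n$, are elements of a $\Lambda_n$. The Boolean ring $\Sigma_\Lambda$
generated by $\Lambda$ is therefore also countable. Since $\Gamma \subset \Sigma_\Lambda$ and $\Gamma$ is dense in $\Sigma$, $\Sigma_\Lambda$ is
dense in $\Sigma$, that is, the complete Boolean ring generated by $\Lambda$ is equal to $\Sigma$.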
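
The recursive refinement used in this proof can be tried out on finite sets. The sketch below assumes that ring elements are represented by Python `frozenset` objects, so that `^` is the ring sum and `&` the product; the chain is stored with its least element 0 (the empty set) included, and the function names are hypothetical.

```python
def refine_chain(chain, gamma):
    """One refinement step: insert new elements so that gamma lies in the generated ring."""
    refined = [chain[0]]

    def push(x):
        # Skip repetitions so that the chain stays strictly increasing.
        if x != refined[-1]:
            refined.append(x)

    for lo, hi in zip(chain, chain[1:]):
        push(lo ^ ((hi ^ lo) & gamma))       # pi_v + (pi_{v+1} + pi_v) . gamma
        push(hi)
    top = chain[-1]
    push(top ^ ((gamma ^ top) & gamma))      # pi_m + (gamma + pi_m) . gamma
    return refined

def build_chain(gammas):
    chain = [frozenset()]                    # start from the element 0
    for g in gammas:
        chain = refine_chain(chain, frozenset(g))
    return chain

def in_generated_ring(chain, sigma):
    """True if sigma is a union of differences of consecutive chain elements."""
    blocks = [hi - lo for lo, hi in zip(chain, chain[1:])]
    covered = frozenset().union(*blocks)
    return sigma <= covered and all(b <= sigma or not (b & sigma) for b in blocks)

# Hypothetical generating elements over the atoms 0..5.
gammas = [{0, 1, 2}, {2, 3}, {1, 3, 5}]
chain = build_chain(gammas)
```

Each step keeps the chain totally ordered, and every generating element becomes a union of the differences of consecutive chain elements, which is the finite counterpart of the statement that the ring generated by the chain contains all generating elements.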
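Th. 2.5.2. To each separable complete Boolean ring $\Sigma$ there exists a totally
ordered and closed subset $\Lambda$ such that the smallest complete Boolean ring
containing $\Lambda$ is equal to $\Sigma$.
PROOF. On the basis of Th. 2.5.1 it suffices to show that the closure $\bar\Lambda$ of $\Lambda$ is totally
ordered.
Let $\sigma_\nu$, $\sigma'_\mu$ be two sequences for which $\sigma_\nu, \sigma'_\mu \in \Lambda$ and $\sigma_\nu \to \sigma$, $\sigma'_\mu \to \sigma'$. We must
show that either $\sigma \le \sigma'$ or $\sigma' \le \sigma$. If there exists an $N$ such that, for all $\nu, \mu > N$,
the relation $\sigma_\nu \le \sigma'_\mu$ is satisfied, then $\sigma \le \sigma'$. If there exists an $N$ such that, for all
$\nu, \mu > N$, the relation $\sigma_\nu \ge \sigma'_\mu$ is satisfied, then $\sigma \ge \sigma'$. If no such $N$ exists, then, for
each $N$ there are two pairs $\sigma_{\nu_1}, \sigma'_{\mu_1}$ and $\sigma_{\nu_2}, \sigma'_{\mu_2}$ for which $\sigma_{\nu_1} \le \sigma'_{\mu_1}$ and $\sigma_{\nu_2} \ge \sigma'_{\mu_2}$.
Thus we obtain four subsequences $\sigma_{\nu_1}, \sigma_{\nu_2}, \sigma'_{\mu_1}, \sigma'_{\mu_2}$. From $\sigma_{\nu_1} \le \sigma'_{\mu_1}$ it follows that,
in the limit, $\sigma \le \sigma'$. Similarly, from $\sigma_{\nu_2} \ge \sigma'_{\mu_2}$ it follows that, in the limit, $\sigma \ge \sigma'$.
Therefore we obtain $\sigma = \sigma'$.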
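Th. 2.5.3. Let $\Lambda$ be a totally ordered subset of $\Sigma$, for which $\varepsilon \in \Lambda$. Let $\Sigma_\Lambda$ be
the smallest Boolean subring of $\Sigma$ containing $\Lambda$. Then $\Sigma_\Lambda$ is the set of all
elements of the form $\sigma = \sigma_1 \dotplus \sigma_2 \dotplus \cdots \dotplus \sigma_n$, where $\sigma_\nu \in \Lambda$; the representation of $\sigma$ as
a sum of elements of $\Lambda$ is unique.
Proof. Since the product of finitely many elements of $\Lambda$ is again an element of $\Lambda$
(namely, the smallest of these elements), every $\sigma \in \Sigma_\Lambda$ is of the form
$\sigma = \sigma_1 \dotplus \sigma_2 \dotplus \cdots \dotplus \sigma_n$ where $\sigma_\nu \in \Lambda$.
We will now show that this representation is unique. Let
$$\sigma_1 \dotplus \cdots \dotplus \sigma_n = \sigma'_1 \dotplus \cdots \dotplus \sigma'_m.$$
Then we obtain
$$\sigma_1 \dotplus \cdots \dotplus \sigma_n \dotplus \sigma'_1 \dotplus \cdots \dotplus \sigma'_m = 0.$$
In the last sum let us suppose that the quantities are ordered ($\Lambda$ is totally ordered).
Then we obtain a sum of the form $\tilde\sigma_1 \dotplus \tilde\sigma_2 \dotplus \cdots = 0$ with $\tilde\sigma_{\nu+1} \le \tilde\sigma_\nu$. We may write
this sum as follows:
$$(\tilde\sigma_1 \dotplus \tilde\sigma_2) \dotplus (\tilde\sigma_3 \dotplus \tilde\sigma_4) \dotplus \cdots = 0. \tag{2.5.1}$$
For the $\eta_n = \tilde\sigma_{2n-1} \dotplus \tilde\sigma_{2n}$ $(n = 1, 2, \ldots)$ we obtain $\eta_n \cdot \eta_m = 0$ for $n \neq m$. Thus
(2.5.1) takes on the following form:
$$\eta_1 \dotplus \eta_2 \dotplus \cdots = 0. \tag{2.5.2}$$
If we multiply (2.5.2) by $\eta_m$, then it follows that $\eta_m = 0$, that is, $\tilde\sigma_{2n-1} \dotplus \tilde\sigma_{2n} = 0$ for
all $n$ and therefore
$$\tilde\sigma_{2n-1} = \tilde\sigma_{2n} \quad \text{for all } n. \tag{2.5.3}$$
Since all $\sigma_\nu$ are different, and the $\sigma'_\mu$ are different, equation (2.5.3) can hold only if
the $\sigma_\nu$ and $\sigma'_\mu$ are pairwise identical.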
Th. 2.5.4. Let $\Lambda$ be a totally ordered and closed subset of a complete Boolean
ring $\Sigma$. Then $\Lambda$ is compact, and will be mapped by an effective $\sigma$-additive
measure $m$ homeomorphically to a closed subset $\omega$ of $[0, 1]$.
PROOF. Let $\sigma_1, \sigma_2 \in \Lambda$, $\sigma_1 \neq \sigma_2$. Then either $\sigma_1 > \sigma_2$ or $\sigma_1 < \sigma_2$; thus it follows that
$m(\sigma_1 \dotplus \sigma_2) = |m(\sigma_1) - m(\sigma_2)|$ and, since $m$ is effective, $m(\sigma_1 \dotplus \sigma_2) \neq 0$. Since $m$ is
$\sigma$-additive and effective, then, according to Th. 2.1.11 the uniform structure $U_g$ on $\Sigma$
(and therefore also on $\Lambda$) is defined by the metric $d(\sigma_1, \sigma_2) = m(\sigma_1 \dotplus \sigma_2)$. Since
$m(\sigma_1 \dotplus \sigma_2) = |m(\sigma_1) - m(\sigma_2)|$, the uniform structure $U_g$ on $\Lambda$ is equal to the initial
uniform structure for the map $m\colon \Lambda \to [0, 1]$. Since the image $\omega$ of $\Lambda$ is precompact
(totally bounded) (AII, §2), $\Lambda$ is precompact (totally bounded). However, since $\Lambda$ is
complete (because it is closed in $\Sigma$), $\Lambda$ is compact, and therefore the image $\omega$ of $\Lambda$
is a compact subset of $[0, 1]$. Thus $\sigma \to m(\sigma)$ is a homeomorphic map of $\Lambda$ onto $\omega$.
It is always possible to include the null and unit elements in $\Lambda$. In the
following we shall assume that they have been included.
We note that Th. 2.5.3 holds, and that $\omega$ contains 0 and 1. It is not difficult
to construct a spectral family from a totally ordered closed set $\Lambda$ in $\Sigma$. To this
purpose, we begin by introducing the map $\Lambda \xrightarrow{m} \omega \subset [0, 1]$ which we
obtained from Th. 2.5.4.
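Th. 2.5.5. Let $\tau(\lambda) = \sup\{\alpha \mid 0 \le \alpha \le \lambda \text{ and } \alpha \in \omega\}$; $\tau(\lambda)$ is an upper continuous
map defined on the interval $[0, 1]$ which increases from 0 to 1. The
map $\sigma(\lambda)$, defined by $\sigma \in \Lambda$ for which $m(\sigma) = \tau(\lambda)$, is a spectral family for
which $\sigma(0) = 0$ and $\sigma(1) = \varepsilon$. The set $\{\sigma(\lambda)\}$ is equal to $\Lambda$.
Proof. From the definition it immediately follows that $\tau(\lambda)$ is increasing and that
$\tau(0) = 0$, $\tau(1) = 1$ since $0 \in \omega$ and $1 \in \omega$. From $\tau(\lambda) \le \lambda$ it follows that
$\tau(\lambda + \varepsilon) \le \lambda + \varepsilon$. Therefore, if $\lambda \in \omega$ it follows that $\tau(\lambda+) = \lambda = \tau(\lambda)$. If $\lambda \notin \omega$, then
the closed subset $\{\alpha \mid \alpha > \lambda \text{ and } \alpha \in \omega\}$ has an infimum $\beta$ such that $\beta \in \omega$ since $\omega$ is
closed. We obtain $\beta > \lambda$ since $\lambda \notin \omega$. For an $\varepsilon$ satisfying $0 < \varepsilon < \beta - \lambda$ it follows
that $\tau(\lambda + \varepsilon) = \tau(\lambda)$ and therefore $\tau(\lambda+) = \tau(\lambda)$. Therefore $\tau(\lambda)$ is continuous from
above.
Since the values of the function $\tau(\lambda)$ lie in $\omega$, and $\Lambda$ is homeomorphically mapped
onto $\omega$ by $m$, $m^{-1}\tau$ is a map of $[0, 1]$ into $\Lambda$ which we shall denote by $\sigma(\lambda)$. Therefore
$\sigma(\lambda)$ is uniquely defined by $m(\sigma(\lambda)) = \tau(\lambda)$. Thus we obtain $\sigma(0) = 0$ and $\sigma(1) = \varepsilon$.
Since $m^{-1}\tau$ is increasing and upper continuous, $\sigma(\lambda) = m^{-1}\tau(\lambda)$ is therefore a spectral
family.
Since, for $\lambda \in \omega$, the relation $\tau(\lambda) = \lambda$ holds, we obtain $\{\sigma(\lambda)\} = \Lambda$.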
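
The passage from a closed chain to a spectral family can be illustrated on a finite chain. The sketch below assumes a hypothetical chain of subsets of the atoms a, b, c, d and the normalized counting measure as the effective measure; the image of the chain is then a finite (hence closed) subset of [0, 1], and the function `tau` is computed with the standard `bisect` module.

```python
from bisect import bisect_right
from fractions import Fraction as Fr

# Hypothetical totally ordered chain containing the null and unit elements.
chain = [frozenset(), frozenset("a"), frozenset("abc"), frozenset("abcd")]

def m(sigma):
    """Effective additive measure (assumed here: normalized counting measure)."""
    return Fr(len(sigma), 4)

omega = [m(s) for s in chain]        # image of the chain, sorted increasingly
inverse = {m(s): s for s in chain}   # m is injective on the chain

def tau(lam):
    """Largest element of omega that does not exceed lam (lam in [0, 1])."""
    return omega[bisect_right(omega, lam) - 1]

def sigma(lam):
    """Spectral family: the chain element whose measure equals tau(lam)."""
    return inverse[tau(lam)]

family = {sigma(Fr(k, 8)) for k in range(9)}
```

The function `tau` is a step function that is constant between consecutive points of the image and jumps exactly at them, which is the finite form of continuity from above, and the values of `sigma` exhaust the chain.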
Th. 2.5.6. There exists a measurement scale $y$ such that $\Sigma(y) = \Sigma$.
PROOF. Choose the totally ordered and closed set $\Lambda = \bar\Lambda$ according to Th. 2.5.2.
Then, according to Th. 2.5.5 we obtain a spectral family $\{\sigma(\lambda)\}$ where $\Sigma$ is the
smallest complete Boolean ring containing $\{\sigma(\lambda)\}$. The linear functional $y$ obtained
from the spectral family by (2.1.9) therefore satisfies the relation $\Sigma(y) = \Sigma$ (on the
basis of Th. 2.1.15).
Th. 2.5.1-2.5.6 show that it is sufficient to consider the spectral families $\sigma(\lambda)$
and the corresponding functionals $y \in \mathcal{B}'(\Sigma)$.
A spectral family $\sigma(\lambda)$ is a totally ordered subset of $\Sigma$ but is not necessarily
closed. We obtain the closure $\overline{\{\sigma(\lambda)\}}$ of $\{\sigma(\lambda)\}$ by adding all of the limit points
$\sigma(\lambda-)$ to $\{\sigma(\lambda)\}$. Theorem 2.5.3 is also applicable to nonclosed totally ordered
subsets of $\Sigma$. Let $\Lambda = \{\sigma(\lambda)\}$, that is, $\Lambda$ is a spectral family, and $y$ is its
corresponding unique measurement scale. It is easy to obtain a representation
of $\Sigma_\Lambda$ in terms of subsets of the real axis.
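Th. 2.5.7. Let $\Lambda$ be the spectral family of $y$ and let $\mathrm{Sp}(y)$ be the spectrum of $y$.
Then the bijection $\sigma(\lambda) \leftrightarrow (-\infty, \lambda] \cap \mathrm{Sp}(y)$ of $\Lambda$ into subsets of the real axis
can be uniquely extended to an isomorphism of $\Sigma_\Lambda$ to the Boolean ring of sets
$P_j$ generated by the set
$$\{(-\infty, \lambda] \cap \mathrm{Sp}(y) \mid \lambda \in \mathbf{R}\}.$$
Proof. The proof is obtained directly from Th. 2.5.3. It is only necessary to order
the elements in the sum $\sigma(\lambda_1) \dotplus \sigma(\lambda_2) \dotplus \cdots \dotplus \sigma(\lambda_n)$ such that $\lambda_{\nu+1} < \lambda_\nu$. Then we consider
$$\sigma(\lambda_1) \dotplus \cdots \dotplus \sigma(\lambda_n) = [\sigma(\lambda_1) \dotplus \sigma(\lambda_2)] \vee [\sigma(\lambda_3) \dotplus \sigma(\lambda_4)] \vee \cdots.$$
Each bracket, for example, $[\sigma(\lambda_3) \dotplus \sigma(\lambda_4)]$, corresponds to an interval
$(\lambda_4, \lambda_3] \cap \mathrm{Sp}(y)$. The elements of $P_j$ are unions of finitely many such intervals
$(\alpha, \beta] \cap \mathrm{Sp}(y)$.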
In practice, a representation of $\Sigma(y)$ in terms of subsets of $\mathbf{R}$ which is not
one-to-one is more commonly used. Such a representation is obtained as
follows:
For $\lambda_2 < \lambda_1$ let the interval $I = (\lambda_2, \lambda_1]$ correspond to the element
$\sigma(I) = \sigma(\lambda_1) \dotplus \sigma(\lambda_2)$. Let $k$ be an arbitrary subset of $\mathbf{R}$. We define a
covering $u$ of $k$ to be a denumerable set of such intervals $I_\nu$ for which $k \subset \bigcup_\nu I_\nu$.
To each subset $k \subset \mathbf{R}$ there is a corresponding element $\bar\sigma(k)$ of $\Sigma$ defined by
$\bar\sigma(k) = \bigwedge_u \bigvee_\nu \sigma(I_\nu)$. If $u$ is a covering of $k$ and $v$ is a covering of $\mathbf{R} - k$ then
$u \cup v$ is a covering of all of $\mathbf{R}$. It is easy to show that, for every covering of $\mathbf{R}$,
the relation $\bigvee_\nu \sigma(I_\nu) = \varepsilon$ holds. Thus it follows that $\bar\sigma(k) \vee \bar\sigma(\mathbf{R} - k) = \varepsilon$. We
shall call a set $k$ “measurable” if $\bar\sigma(k) \wedge \bar\sigma(\mathbf{R} - k) = 0$. Let $P$ denote the set of
all “measurable” subsets of $\mathbf{R}$. We shall show that $P$ is a $\sigma$-complete Boolean
ring of sets. In general $P$ need not be a complete Boolean ring! A Boolean
ring $P$ of sets is said to be $\sigma$-complete if the union and intersection of
countably many sets in $P$ is an element of $P$.
Th. 2.5.8. The set $P$ described above is a $\sigma$-complete Boolean ring of sets for
which $\mathbf{R} \in P$. The map $k \to \bar\sigma(k)\colon P \to \Sigma(y)$ is a surjective $\sigma$-homomorphism
of $P$ onto $\Sigma(y)$, that is, a homomorphism in which $P$ and $\Sigma(y)$ are $\sigma$-complete
Boolean rings. Let $J$ be the kernel of this map, that is,
$J = \{k \mid k \in P,\ \bar\sigma(k) = 0\}$. Then the mapping $P \to \Sigma(y)$ may be expressed in
canonical form as follows:
$$P \to P/J \to \Sigma(y),$$
where $P/J \to \Sigma(y)$ is an isomorphism. In particular, $P/J$ is also a complete
Boolean ring. Every interval $I = (\lambda_2, \lambda_1]$ is an element of $P$; in particular, we
obtain $\bar\sigma(I) = \sigma(\lambda_1) \dotplus \sigma(\lambda_2)$.
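Proof. In order to prove that $P$ is a $\sigma$-complete Boolean ring of sets it suffices to
show that if $k \in P$ then $\mathbf{R} - k \in P$, and that if $\{k_\nu\}$, $k_\nu \in P$, is a countable set, then
$\bigcup_\nu k_\nu \in P$, because $\bigcap_\nu k_\nu = \mathbf{R} - \bigcup_\nu (\mathbf{R} - k_\nu)$.
If $k \in P$, then, from the definition it follows that $\bar\sigma(k) \wedge \bar\sigma(\mathbf{R} - k) = 0$ and
therefore $\mathbf{R} - k \in P$ and $\bar\sigma(\mathbf{R} - k) = \bar\sigma^*(k)$.
We will now show that, if $k_\nu \in P$ then
$$\bigcup_\nu k_\nu \in P \quad \text{and} \quad \bar\sigma\Bigl(\bigcup_\nu k_\nu\Bigr) = \bigvee_\nu \bar\sigma(k_\nu).$$
Since $\mathbf{R} - \bigcup_\nu k_\nu = \bigcap_\nu (\mathbf{R} - k_\nu)$, from the definition of $\bar\sigma$ it easily follows that
$\bar\sigma(\mathbf{R} - \bigcup_\nu k_\nu) \le \bar\sigma(\mathbf{R} - k_\nu)$ for all $\nu$, that is,
$\bar\sigma(\mathbf{R} - \bigcup_\nu k_\nu) \le \bigwedge_\nu \bar\sigma(\mathbf{R} - k_\nu) = \bigwedge_\nu \bar\sigma^*(k_\nu)$.
Since $\bar\sigma(\bigcup_\nu k_\nu) \vee \bar\sigma(\mathbf{R} - \bigcup_\nu k_\nu) = \varepsilon$ we obtain
$\bar\sigma^*(\mathbf{R} - \bigcup_\nu k_\nu) \le \bar\sigma(\bigcup_\nu k_\nu)$, that is, $\bar\sigma(\bigcup_\nu k_\nu) \ge [\bigwedge_\nu \bar\sigma^*(k_\nu)]^* = \bigvee_\nu \bar\sigma(k_\nu)$. If we show
that $\bar\sigma(\bigcup_\nu k_\nu) \le \bigvee_\nu \bar\sigma(k_\nu)$ then we obtain $\bar\sigma(\bigcup_\nu k_\nu) \wedge \bar\sigma(\mathbf{R} - \bigcup_\nu k_\nu) = 0$, that is,
$\bigcup_\nu k_\nu \in P$ and $\bar\sigma(\bigcup_\nu k_\nu) = \bigvee_\nu \bar\sigma(k_\nu)$.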
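Let $U_\nu$ be a covering of $k_\nu$. Then $V = \bigcup_\nu U_\nu$ is a covering of $\bigcup_\nu k_\nu$ because $V$ is,
like the $U_\nu$, a countable set of intervals. Therefore we obtain
$$\bar\sigma\Bigl(\bigcup_\nu k_\nu\Bigr) \le \bigwedge_{V = \bigcup_\nu U_\nu} \ \bigvee_{\nu,\mu} \sigma(I_\mu^{(\nu)}),$$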
146 IV Coexistent Effects and Coexistent Decompositions
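where the $I_\mu^{(\nu)}$ are, for fixed $\nu$, the intervals for $U_\nu$ and $\bigwedge$ is taken over all $V$ of the form
$V = \bigcup_\nu U_\nu$. We will now show that the $U_\nu$ may be chosen such that $\bigvee_\nu \bigvee_\mu \sigma(I_\mu^{(\nu)})$
will be “arbitrarily close” to $\bigvee_\nu \sigma(k_\nu)$, that is, with the metric $d$ from $\Sigma$, for every
$\varepsilon > 0$ we may make
$$d\Bigl(\bigvee_\nu \bigvee_\mu \sigma(I_\mu^{(\nu)}),\ \bigvee_\nu \sigma(k_\nu)\Bigr) \le \varepsilon$$
for a suitable choice of $U_\nu$.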
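For this purpose we shall show some relations for a subset $k$: if $U_1 = \{I_\mu^{(1)}\}$ and
$U_2 = \{I_\nu^{(2)}\}$ are two coverings of $k$, it then follows immediately that
$V = \{I_\mu^{(1)} \cap I_\nu^{(2)}\}$ is a covering of $k$. From this result we obtain
From which we obtain $\sigma_V = \sigma_{U_1} \wedge \sigma_{U_2}$.
Similarly, for a finite number of coverings $U_1, U_2, \ldots, U_n$ there exists a covering
$V$ such that $\sigma_V = \bigwedge_{\nu=1}^{n} \sigma_{U_\nu}$.
According to Th. 2.1.1, from $\bar\sigma(k) = \bigwedge_U \sigma_U$ it follows that there is a sequence $U_\nu$
of coverings such that for $\sigma_n = \bigwedge_{\nu=1}^{n} \sigma_{U_\nu}$ the relationships $\bar\sigma(k) = \bigwedge_n \sigma_n$ and
$\sigma_n \to \bar\sigma(k)$ hold. Since, to $U_\nu$, $\nu = 1, 2, \ldots, n$ there exists a covering $V_n$ such that
$\sigma_{V_n} = \bigwedge_{\nu=1}^{n} \sigma_{U_\nu}$, there exists a sequence of coverings $V_n$ for which $\sigma_{V_n}$ is decreasing
and converges towards $\bar\sigma(k)$, and satisfies $\bar\sigma(k) = \bigwedge_n \sigma_{V_n}$.
For a given $\varepsilon > 0$ we may choose such a covering $U_\nu$ for each $k_\nu$ such that
$$d(\sigma_{U_\nu}, \sigma(k_\nu)) \le \frac{\varepsilon}{2^\nu}.$$
For a covering $U = \{I_\nu\}$ we shall use the following abbreviation:
$$\sigma_U = \bigvee_\nu \sigma(I_\nu).$$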
Since
and
v L м J v
V
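we obtain
$$d\Bigl(\bigvee_\nu \sigma_{U_\nu},\ \bigvee_\nu \sigma(k_\nu)\Bigr) \le \sum_{\nu=1}^{\infty} \frac{\varepsilon}{2^\nu} = \varepsilon.$$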
2 Structures in the Class of Observables 147
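Thus it follows that
$$\bigwedge_{V = \bigcup_\nu U_\nu} \ \bigvee_{\nu,\mu} \sigma(I_\mu^{(\nu)}) = \bigvee_\nu \sigma(k_\nu),$$
from which we have proven that
$$\bar\sigma\Bigl(\bigcup_\nu k_\nu\Bigr) \le \bigvee_\nu \sigma(k_\nu).$$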
With this result we have also proven that the map $P \to \Sigma(y)$ is a $\sigma$-homomorphism.
The fact that the map $P \to \Sigma(y)$ is surjective follows from
Th. 2.1.1 and Th. 2.1.2 because the elements of $\Sigma(y)$ may be obtained from $\Sigma_\lambda$ by
the join $\vee$ and meet $\wedge$ of denumerably many elements. $P/J \to \Sigma(y)$ is therefore an
order isomorphism and therefore $P/J$ is isomorphic to $\Sigma(y)$, that is, $P/J$ is a
complete Boolean ring.
It is easy to see that $I = (\lambda_1, \lambda_2]$ is an element of $P$ and that $\sigma(I) = \sigma(\lambda_1) \dotplus \sigma(\lambda_2)$.
If $\Sigma(y)$ is equal to $\Sigma$ then we have obtained a representation of $\Sigma$ in terms of
subsets of scale values of the measurement scale $y$.
For $k \in P$, $\sigma(k)$ is the “registration” for which the scale value of $y$ lies in the
set $k$. It directly follows that $R \setminus \mathrm{Sp}(y) \in J$, that is, only scale values of the
spectrum of $y$ will be registered.
It is easy to verify that the discrete spectrum $\mathrm{Sp}_d$ is the set of those values $\lambda$
for which $\sigma(k) \ne 0$ with $k$ as the set consisting only of the single point $\lambda$. The set
of these $\sigma(k)$ is equal to the set of atoms in $\Sigma(y)$.
It is not difficult to extend Th. 2.5.8 to the case of finitely many scales
$y_1, \ldots, y_n$ and $\Sigma(y_1, y_2, \ldots, y_n)$. $P$ is then a $\sigma$-complete Boolean ring of
sets in the $n$-dimensional space $R^n$. In physics we frequently find that we
choose $n$ scales (instead of one) for which $\Sigma(y_1, y_2, \ldots, y_n) = \Sigma$ because $n$
scales often prove to be both practical and theoretically useful (in specific
situations which we will discuss later, for example, in VII and VIII).
Here we have treated the problem of measurement scales in terms of a
structure which may be “added” to the structure of a complete Boolean ring
$\Sigma$. Thus we see that the measurement scale is not necessarily involved in the
concept of an observable. In this way it is clear that the measurement scale is
nothing other than a preferred method which permits an overview of $\Sigma$ (or, in
particular, of $\mathcal{R}(b_0)$). $\mathcal{R}(b_0)$ is of primary interest for experiment; the
measurement scale for the apparatus $b_0$ is only a very practical tool, or may
have an additional physical meaning (see the discussion of (3) above)
concerning transformations.
In D 2.5.4 we have defined different components of the spectrum $\mathrm{Sp}(y)$ for
the measurement scale $y$. If $m$ is a $\sigma$-additive effective measure on $\Sigma$ then for
$k \in P$, $\mu(k) = m(\sigma(k))$ defines a $\sigma$-additive measure $\mu$ on $P$ for which the
sets of $\mu$-measure 0 coincide with the elements of $J$. $(R, P, \mu)$ is then a so-called
measure space. We may separate $\mathrm{Sp}(y)$ into three disjoint components:
$k_d = \mathrm{Sp}_d(y)$, $k_{cc}$ and $k_{sc}$ where $\mu_d(k) = \mu(k \cap k_d)$ is a discrete measure,
$\mu_{cc}(k) = \mu(k \cap k_{cc})$ is absolutely continuous with respect to Lebesgue
measure, and $\mu_{sc}(k) = \mu(k \cap k_{sc})$ is singular with respect to Lebesgue
measure, that is, the Lebesgue measure of $k_{sc}$ is 0. This decomposition of
$\mathrm{Sp}(y)$ does not depend on the $\sigma$-additive effective measure $m$ on $\Sigma$.
Let $\{\sigma(\lambda)\}$ be a spectral family. Then, by the map $F$: $\Sigma \to L$ which
corresponds to an observable, there exists a totally ordered subset $\{F(\lambda)\}$ of $L$
defined by $F(\lambda) = F(\sigma(\lambda))$. Therefore $F(\lambda)$ is an increasing function $R \to L$
which is continuous from above: $F(\lambda+) = F(\lambda)$. We shall then call $F(\lambda)$ a
spectral family of effects. Earlier we have constructed a $\sigma$-complete Boolean
ring of sets $P$ using the spectral family $\{\sigma(\lambda)\}$, and we have seen that $P/J$ is
isomorphic to $\Sigma$. The construction of $P$ is, however, possible without
knowledge of $\sigma(\lambda)$, if we make use of the spectral family of effects $\{F(\lambda)\}$.
We may proceed by using any totally ordered subset $l \subset L$. The closure $\bar l$
of $l$ in the $\sigma(\mathcal{B}', \mathcal{B})$-topology is totally ordered. This fact may be proven
analogously to the proof of Th. 2.5.2. If $W$ is an effective ensemble from
$K(\mathcal{H}_1, \ldots)$ then $\bar l$ will be mapped injectively into $[0, 1]$ by the map
$F \to \mu(W, F)$ in complete analogy to Th. 2.5.4. By introducing the parameter
$\tau$ from Th. 2.5.5 we may obtain a spectral family of effects from $\bar l$; thus, it
suffices to start with such a spectral family of effects.
If we are given a spectral family of effects $\{F(\lambda)\}$ then we may use the
methods leading to Th. 2.5.8 to obtain an analogous result.
For the interval $I = (\lambda_2, \lambda_1]$ we define $F(I) = F(\lambda_1) - F(\lambda_2)$. Therefore we
obtain $F(I) \in L$. Furthermore, let $F(k) = \inf_U \bigl(\sum_\nu F(I_\nu)\bigr)$. This infimum exists
because the set of $F_U = \sum_\nu F(I_\nu)$ is a lower-directed set (here we have used the
following result from Th. 2.5.8: if $U_1 = \{I_\mu^{(1)}\}$ and $U_2 = \{I_\nu^{(2)}\}$ are coverings
of $k$, then $V = \{I_\mu^{(1)} \cap I_\nu^{(2)}\}$ is a covering of $k$). We obtain $F(k) + F(R \setminus k) \ge 1$.
A set $k$ is said to be measurable if $F(R \setminus k) = 1 - F(k)$. Let $P$ denote the set of
measurable $k$. Clearly $P$ is a $\sigma$-complete Boolean ring and $k \to F(k)$ is a $\sigma$-additive
measure. Let $J$ be the set of all $k$ having $F$-measure 0. Then
$\Sigma_0 = P/J$ is a complete Boolean ring and $\Sigma_0$ together with the map
$F$: $\Sigma_0 \to L$ defined by $F$: $P \to P/J \to L$ is an observable.
For the special case in which $F(\lambda)$ is determined by $F(\sigma(\lambda))$, $P$ and $J$ are
identical to the sets $P$, $J$ of Th. 2.5.8. That is, by using $\Sigma_0$ we may recover the
Boolean ring $\Sigma$ and the mapping $F$: $\Sigma \to L$ from which we obtain the spectral
families $\{\sigma(\lambda)\}$ and $\{F(\lambda) = F(\sigma(\lambda))\}$. In this sense each totally ordered subset
of $L$ uniquely defines an observable, and each observable may be obtained in
this way.
Therefore it is not surprising that we often use a spectral family of effects
instead of the total observable. The spectral family of effects not only
determines the observable but also determines the measurement scale $y$ of the
observable by means of (2.1.9b) where $\Sigma_0 = P/J$ replaces $\Sigma$ and $p(\alpha) \in \Sigma_0$ is
the class of elements of $P$ which belong to the interval $(-\infty, \alpha]$. For $k \in P$ we
find that $\mu(W, F(k))$ is equal to the probability of obtaining a scale value of $y$
from the set $k$ in the case in which the observable $\Sigma_0 \xrightarrow{F} L$ was measured and
the ensemble $W$ was prepared. It is, however, not yet clear how we may obtain
apparatuses which will measure the desired observable $\Sigma \xrightarrow{F} L$ and prepare
the desired ensemble $W$. These problems will be discussed first in §4 and §7
and then later in XVII and XVIII.
For decision observables the above relationships between $\Sigma$, the spectral
family $\{\sigma(\lambda)\}$ and the measures $E$: $\Sigma \to G$ are somewhat simplified. According
to Th. 1.4.8 $\Sigma$ may be directly identified with a subset of $G$ such that all
results from Th. 2.5.1-Th. 2.5.8 are applicable to decision observables
provided that we consider $\Sigma$ as a subset of $G$. In particular, the map
$P \to \Sigma(y)$ is also a map of $P$ into $G$. Therefore spectral families of decision
effects uniquely determine a decision observable with scale, and $\Sigma$ may be
chosen as the complete orthocomplemented sublattice of $G$ generated by the
spectral family $\{E(\lambda)\}$ where the identity map of $\Sigma$ onto itself is the measure
$\Sigma \to G$ of the observable. The scale $y$ corresponding to the spectral set is
determined by
For decision observables the maps S and S' have the special properties
described in §2.2.
Th. 2.5.9. For a decision observable $S'$ is injective and $S K(\mathcal{H}_1, \ldots) = K(\Sigma)$. If
we identify $\Sigma$ with a subset of $G$, then we obtain $S'^{-1}$ from the spectral
representation of the operator $A = S'y \in \mathcal{B}'(\mathcal{H}_1, \ldots)$:
$$A = \int \lambda \, dE(\lambda), \tag{2.5.4}$$
where $S'^{-1}A = y$ has the form
$$y = \int \lambda \, dE(\lambda). \tag{2.5.5}$$
PROOF. We begin by noting that equations (2.5.4) and (2.5.5) are valid as a limit (in
norm) in $\mathcal{B}'(\mathcal{H}_1, \ldots)$ and $\mathcal{B}'(\Sigma)$, respectively. According to Th. 2.1.15 each
$y \in \mathcal{B}'(\Sigma)$ may be written in the form (2.5.5). Since $S'E = E$ for $E \in \Sigma \subset G$ and since
$S'$ is continuous in norm, it follows that $S'y$ is equal to $A$. Since both spectral
representations (2.5.4) and (2.5.5) are unique (for the operator $A$, see AIV, §8 and
§15 and for $y$, Th. 2.1.15) we therefore find that $S'$ is injective and $S'^{-1}(A)$ (where $A$
is defined for $E(\lambda) \in \Sigma \subset G$ by (2.5.4)) is equal to $y$ by (2.5.5).
$S K(\mathcal{H}_1, \ldots) = K(\Sigma)$ follows from Th. 2.1.11 as follows: each $m \in K(\Sigma)$ may,
according to Th. 2.1.11, be approximated (in norm) arbitrarily well by $x_{\delta,N}$. For an
arbitrary effective $w \in K(\mathcal{H}_1, \ldots)$, $m = Sw$ is an effective $m \in K(\Sigma)$. For this $m$, $x_{\delta,N}$ is
of the form
$$x_{\delta,N}(\sigma) = \sum_{n=1}^{N} (n - 1)\delta \, m(\sigma \wedge \sigma_n),$$
where we now consider $\sigma$, $\sigma_n$ to be elements of $\Sigma \subset G$. Thus $m(\sigma \wedge \sigma_n) = m(\sigma\sigma_n)$,
where $\sigma\sigma_n$ are products of the “operators” $\sigma$, $\sigma_n$ of $G$. From $m(\sigma) = \mu(w, \sigma) = \mathrm{tr}(w\sigma)$
it follows that (since $\sigma$ commutes with $\sigma_n$ where $\sigma, \sigma_n \in \Sigma$!):
$$m(\sigma \wedge \sigma_n) = \mathrm{tr}(w\sigma\sigma_n) = \mathrm{tr}(w\sigma\sigma_n^2) = \mathrm{tr}(w\sigma_n\sigma\sigma_n) = \mathrm{tr}(\sigma_n w \sigma_n \sigma).$$
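Thus we obtain
$$x_{\delta,N} = \sum_{n=1}^{N} (n - 1)\delta \, S(\sigma_n w \sigma_n) = S \sum_{n=1}^{N} (n - 1)\delta \, \sigma_n w \sigma_n.$$
According to the proof of Th. 2.1.11, $\|x_{\delta,N}\|^{-1} x_{\delta,N} \in K(\Sigma)$ converges in norm to $m$.
Thus $\|z_{\delta,N}\|^{-1} z_{\delta,N} \in K(\mathcal{H}_1, \ldots)$ with $z_{\delta,N} = \sum_{n=1}^{N} (n - 1)\delta \, \sigma_n w \sigma_n$ converges in norm
towards a $\tilde w \in K(\mathcal{H}_1, \ldots)$. Thus we obtain $m = S\tilde w$.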
According to Th. 2.5.9 we may therefore identify the space $\mathcal{B}'(\Sigma)$ with the
subspace of $\mathcal{B}'(\mathcal{H}_1, \ldots)$ of all operators of the form (2.5.4) where $E(\lambda)$ is a
spectral family of the complete Boolean sublattice $\Sigma$ of $G$. $\mathcal{B}'(\Sigma)$ is an abelian
algebra. $K(\Sigma)$ arises from the partition of $K(\mathcal{H}_1, \ldots)$ into equivalence classes
where all the $w \in K(\mathcal{H}_1, \ldots)$ belong to the same class if the $w$ cannot be
distinguished by means of $\mathrm{tr}(we)$ where $e \in \Sigma$, that is, the linear forms $\mathrm{tr}(wg)$
agree on the subspace $\mathcal{B}'(\Sigma)$ of $\mathcal{B}'(\mathcal{H}_1, \ldots)$.
If, for one or more scales $y_1, \ldots, y_n$ we have $\Sigma(y_1, \ldots, y_n) = \Sigma$ (according
to Th. 2.5.6 we may always choose a scale $y$ such that $\Sigma(y) = \Sigma$) then the
corresponding operators $A_1, \ldots, A_n$ and the decision observables and the
scales $y_1, \ldots, y_n$ are uniquely determined. Therefore we may uniquely
characterize a decision observable with measurement scales by a finite
number of commuting operators $A_1, \ldots, A_n \in \mathcal{B}'(\mathcal{H}_1, \ldots)$. The spectral
families of $A_1, \ldots, A_n$ generate the corresponding complete Boolean ring
$\Sigma \subset G$.
We therefore define
D 2.5.6. Let $A \in \mathcal{B}'(\mathcal{H}_1, \ldots)$; the decision observable and its corresponding
scale which is uniquely determined by $A$ is called a scale observable.
We shall often use the expression “$A$ is a scale observable” or, more briefly,
“$A$ is an observable.” Thus we have explained the connection with the
usual language of quantum mechanics in which the self-adjoint operators are
called observables. With this explanation we expect that, at least “in principle,”
misunderstandings are impossible.
The brief characterization of a decision observable having a scale by a
single operator $A \in \mathcal{B}'(\mathcal{H}_1, \ldots)$ is not applicable to more general observables.
In such cases, to each scale $y \in \mathcal{B}'(\Sigma)$ there corresponds an operator
$A = S'y \in \mathcal{B}'(\mathcal{H}_1, \ldots)$ but the operator $A$ does not permit us to reconstruct
the Boolean ring $\Sigma$ and the scale $y$!
Since, for fixed $\Sigma$, we may introduce different measurement scales
$y \in \mathcal{B}'(\Sigma)$, the question arises as to what scale we obtain when we replace $\lambda$ by
a new scale value $f(\lambda)$ where $f(\lambda)$ is a real function. Let us consider the set
$$k(\alpha) = \{\lambda \mid f(\lambda) \le \alpha\}. \tag{2.5.6}$$
For a registration the question: is $f(\lambda) \le \alpha$? is meaningful only if the set
$k(\alpha) \in P$, that is, $k(\alpha)$ is measurable. Then the registration $f(\lambda) \le \alpha$ can be
associated with the element $\sigma(k(\alpha)) \in \Sigma$. In this way we may obtain the precise
meaning of the “renaming” of the scale $\lambda$ by $f(\lambda)$, and we find that it is
meaningful to consider only those $f(\lambda)$ for which the sets $k(\alpha)$ of (2.5.6) are
measurable, that is, are elements of $P$. Such functions are usually called
measurable functions.
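The renaming of a scale can be illustrated on a finite spectrum, where every subset is measurable. The sketch below is an illustration only: the three scale values, their probabilities in `WEIGHTS` and the function names are hypothetical. It builds the set $k(\alpha)$ of (2.5.6) and uses it to obtain the probability that the renamed scale value $f(\lambda)$ does not exceed $\alpha$.

```python
from fractions import Fraction

# Hypothetical finite spectrum: scale value -> probability of registering it.
WEIGHTS = {-1: Fraction(1, 4), 0: Fraction(1, 4), 1: Fraction(1, 2)}


def k(f, alpha):
    """The set k(alpha) = {lambda | f(lambda) <= alpha}, restricted to the spectrum."""
    return frozenset(lam for lam in WEIGHTS if f(lam) <= alpha)


def prob_renamed_at_most(f, alpha):
    """Probability that the renamed scale value f(lambda) is <= alpha."""
    return sum((WEIGHTS[lam] for lam in k(f, alpha)), Fraction(0))


def square(lam):
    """Example renaming f(lambda) = lambda**2."""
    return lam * lam
```

With `square` as the renaming, the values $-1$ and $+1$ are merged into the single new scale value 1, so the renamed scale distinguishes fewer registrations than the original one.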
The element $\tilde y$ corresponding to the renaming of the scale defined by $f(\lambda)$ is
given by
$$\tilde y = \int \alpha \, d\sigma(k(\alpha)). \tag{2.5.7}$$
Since $\sigma(k(\alpha))$ may be obtained from the $\sigma(y \le \beta)$ we may therefore write
(2.5.7) as follows:
$$\tilde y = \int f(\lambda) \, d\sigma(y \le \lambda), \tag{2.5.8}$$
where (2.5.8) is defined in terms of (2.5.7). Instead of (2.5.8) we sometimes use
the abbreviation $\tilde y = f(y)$. Since the integrals (2.5.7) and (2.5.8) exist as limits
in the norm, $f(\lambda)$ must be essentially bounded, that is, a $c$ can be found such
that the sets $R$ and
$$\{\lambda \mid -c \le f(\lambda) \le c\}$$
differ only by a set of measure zero.
Th. 2.5.10. Let $y$ be a scale for which $\Sigma(y) = \Sigma$. Then, to each $\tilde y \in \mathcal{B}'(\Sigma)$ there
exists a measurable, essentially bounded $f(\lambda)$ for which $\tilde y = f(y)$.
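PROOF. We seek an $f(\lambda)$ for which $\sigma(\tilde y \le \alpha) = \sigma(k(\alpha))$ where $k(\alpha)$ satisfies (2.5.6). Let $\alpha$
be a rational number. For this purpose we choose an arbitrary $\tilde k(\alpha)$ from the class
of subsets of $P$ which is, according to Th. 2.5.8, in 1:1 correspondence with
$\sigma(\tilde y \le \alpha)$. For rational $\beta < \alpha$, since $\sigma(\tilde y \le \beta) \le \sigma(\tilde y \le \alpha)$, we find that the set
$$j(\alpha, \beta) = \tilde k(\beta) \dotplus \tilde k(\alpha) \cap \tilde k(\beta)$$
is a set of measure zero. Therefore, since the set of rational $\beta$ which satisfy $\beta < \alpha$ is
denumerable, we find that the set $\bigcup_{\beta < \alpha} j(\alpha, \beta)$ has measure zero. From this result it
follows that the set
$$\hat k(\alpha) = \tilde k(\alpha) \cup \bigcup_{\beta < \alpha} j(\alpha, \beta)$$
belongs to the same class as $\tilde k(\alpha)$. For the $\hat k(\alpha)$ we find that $\alpha_1 > \alpha_2 \Rightarrow \hat k(\alpha_1) \supset \hat k(\alpha_2)$.
Thus, for rational $\alpha$ we have obtained a set of $\hat k(\alpha) \in P$ which increases with $\alpha$ and
satisfies $\sigma(\hat k(\alpha)) = \sigma(\tilde y \le \alpha)$. Since the set of $\alpha > \beta$ is denumerable, for each $\beta$ a set
$k(\beta) = \bigcap_{\alpha > \beta} \hat k(\alpha) \in P$ is determined.
From Th. 2.5.8 it follows that
$$\sigma(k(\beta)) = \bigwedge_{\alpha > \beta} \sigma(\hat k(\alpha)) = \bigwedge_{\alpha > \beta} \sigma(\tilde y \le \alpha) = \sigma(\tilde y \le \beta),$$
where we obtain the last expression from the fact that the spectral set $\sigma(\tilde y \le \beta)$ is
upper continuous.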
Thus the desired function $f(\lambda)$ is obtained as follows:
$$f(\lambda) = \inf\{\beta \mid \lambda \in k(\beta)\}.$$
This theorem may be transformed from the scale у into an analogous
theorem about its corresponding scale observable.
Th. 2.5.11. If a decision observable is completely determined by the scale
observable $A$ (that is, the spectral family of $A$ generates all of $\Sigma \subset G$) then all
of the scale observables corresponding to the decision observable $A$ are
functions $f(A)$ of $A$.
PROOF. By analogy to (2.5.6) and (2.5.8) $f(A)$ is defined by
$$f(A) = \int f(\alpha) \, dE(\alpha),$$
where $E(\alpha)$ is the spectral family of $A$. The theorem follows directly from the map $S'$
of $\mathcal{B}'(\Sigma)$ into $\mathcal{B}'(\mathcal{H}_1, \ldots)$.
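In finite dimensions the integral over the spectral family reduces to a finite sum $f(A) = \sum_i f(\lambda_i) E_i$ over the eigenprojectors $E_i$ of $A$. The following sketch checks this for a $2 \times 2$ example; the matrix $A$, its projectors `E1`, `E3` and all function names are hypothetical choices made for illustration and are not taken from the text.

```python
def mat_add(x, y):
    """Entrywise sum of two 2x2 matrices."""
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def scale(c, x):
    """Multiply a 2x2 matrix by a number."""
    return [[c * x[i][j] for j in range(2)] for i in range(2)]


def mat_mul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][m] * y[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
E1 = [[0.5, -0.5], [-0.5, 0.5]]  # projector for the eigenvalue 1
E3 = [[0.5, 0.5], [0.5, 0.5]]    # projector for the eigenvalue 3
SPECTRAL = [(1.0, E1), (3.0, E3)]


def apply_function(f, spectral):
    """f(A) = sum_i f(lambda_i) E_i, the finite analog of the spectral integral."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for lam, proj in spectral:
        result = mat_add(result, scale(f(lam), proj))
    return result


A = apply_function(lambda lam: lam, SPECTRAL)
A_SQUARED = apply_function(lambda lam: lam * lam, SPECTRAL)
```

For $f(\lambda) = \lambda^2$ the spectral sum agrees with the ordinary matrix product `mat_mul(A, A)`, which is the finite-dimensional content of the statement that renaming the scale values by $f$ yields the operator $f(A)$.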
The fact that the scale values of the observable f(A) are obtained from the
scale values of A by means of the real function / is not a mysterious result of
the correspondence principle, where the latter provided the basis for the
initial development of quantum mechanics more than 50 years ago. It is,
instead, a consequence of the definition of a scale observable, that is, of the
mapping $S'$ of a scale $y$ onto an operator $A = S'y$ from $\mathcal{B}'(\mathcal{H}_1, \ldots)$. Clearly
the concept of a measurement scale is not a quantum mechanical concept,
but arises from the registration methods described by $\mathcal{R}(b_0)$ and idealized by
means of $\Sigma$. Thus we have clarified and explained the usual, more or less
intuitively based methods entirely on the basis of the comparison of theory
and experience in terms of $a \in \mathcal{Q}'$, $b_0 \in \mathcal{R}_0$, $b \in \mathcal{R}$.
From this viewpoint it is no longer remarkable that one and the same
operator A can represent the following different things: a scale observable, an
effect (if $0 \le A \le 1$) or even an ensemble (in the case in which $0 \le A$ and
$\mathrm{tr}(A) = 1$). A mathematical term does not, in itself, have any physical
meaning. The physical meaning is obtained when we, in addition, state what
it represents. Here it is hoped that the formulation presented above will
eliminate many of these “apparent” problems.
3 Coexistent and Complementary Observables
If we apply a registration method $b_0$, then by $\mathcal{R}(b_0) \to L$ (more precisely the
$\mathcal{U}$-completion of $\mathcal{R}(b_0)$) an observable is determined. In an experiment it is
possible that we may only be interested in a Boolean subring $\Sigma_1$ of $\mathcal{R}(b_0)$;
then $\Sigma_1 \to L$ is, so to say, a type of partial observable of $\mathcal{R}(b_0) \to L$. If
$\Sigma_2 \to L$ is a second such partial observable, then $\Sigma_1 \to L$ and $\Sigma_2 \to L$ will
“both” be measured by the registration method $b_0$. Thus we are led to the
following general definition.
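D 3.1. Two observables $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ are said to be coexistent if
there exists an observable $\Sigma \xrightarrow{F} L$ and two homomorphisms $h_1$, $h_2$ such that
the following diagram is commutative:
$$\Sigma_1 \xrightarrow{h_1} \Sigma \xleftarrow{h_2} \Sigma_2, \qquad F_1 = F \circ h_1, \quad F_2 = F \circ h_2, \quad F\colon \Sigma \to L.$$
In §2.4 we have shown that we may identify $\Sigma_1$ with $h_1\Sigma_1 \subset \Sigma$, and $F_1$ with
$F|_{h_1\Sigma_1}$; and similarly for $\Sigma_2$. From D 3.1 it follows that, in particular, the set
$$\{F \mid F = F_1(\sigma_1) \text{ for } \sigma_1 \in \Sigma_1 \text{ or } F = F_2(\sigma_2) \text{ for } \sigma_2 \in \Sigma_2\}$$
is a set of coexistent effects.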
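A finite example may make D 3.1 concrete. In the sketch below, which is an illustration under stated assumptions and not part of the text, $\Sigma$ is the ring of subsets of four joint outcomes $(i, k)$ with $i, k = \pm 1$, the joint observable $F$ is given by the hypothetical effects `JOINT`, and $h_1$, $h_2$ send an outcome of one factor to the set of all joint outcomes containing it. The two coexistent observables $F_1 = F \circ h_1$ and $F_2 = F \circ h_2$ then appear as marginals of the joint one.

```python
import math

# Hypothetical unsharpness parameter; for A = 1/sqrt(2) the joint effects stay positive.
A = 1 / math.sqrt(2)
IDENT = [[1.0, 0.0], [0.0, 1.0]]
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]
SIGMA_Z = [[1.0, 0.0], [0.0, -1.0]]


def combo(c0, cx, cz):
    """Return c0*1 + cx*sigma_x + cz*sigma_z as a 2x2 matrix."""
    return [[c0 * IDENT[i][j] + cx * SIGMA_X[i][j] + cz * SIGMA_Z[i][j]
             for j in range(2)] for i in range(2)]


def add(x, y):
    """Entrywise sum of two 2x2 matrices."""
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


# Joint observable F on the four outcomes (i, k), i, k in {+1, -1}.
JOINT = {(i, k): combo(0.25, 0.25 * A * i, 0.25 * A * k)
         for i in (1, -1) for k in (1, -1)}


def marginal(index, value):
    """F composed with h_1 (index 0) or h_2 (index 1): sum over the other outcome."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for outcome, effect in JOINT.items():
        if outcome[index] == value:
            total = add(total, effect)
    return total


def is_positive(m, tol=1e-12):
    """A symmetric 2x2 matrix is positive semidefinite iff trace and determinant are >= 0."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace >= -tol and det >= -tol


def close(x, y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))
```

The marginals come out as $\tfrac12(1 \pm A\sigma_x)$ and $\tfrac12(1 \pm A\sigma_z)$. These are coexistent observables that are not decision observables, since their effects are not projectors.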
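If $\Sigma_1 \xrightarrow{F_1} G$ and $\Sigma_2 \xrightarrow{F_2} G$ are decision observables, then, according to
Th. 1.4.8 we may identify $\Sigma_1$ and $\Sigma_2$ with Boolean sublattices of $G$. $\Sigma_1 \xrightarrow{F_1} G$
and $\Sigma_2 \xrightarrow{F_2} G$ may be coexistent only if $\{\Sigma_1, \Sigma_2\}$ is a subset of a complete
Boolean ring $\Sigma \subset G$. For such a $\Sigma$ the diagram in D 3.1 is trivially satisfied.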
D 3.2. Two coexistent decision observables are also said to be
commensurable.
The above results therefore show that :
Th. 3.1. Two decision observables are commensurable only if their combined
images form a set of commensurable decision effects.
From Th. 2.5.9 and Th. 1.3.4(v) we obtain the following extension:
Th. 3.2. Two decision observables with scales are commensurable only if the
scale observables commute.
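The commutation criterion of Th. 3.2 is easy to test numerically in finite dimensions. The following sketch is an illustration only; the three $2 \times 2$ matrices and the function names are hypothetical examples, not objects defined in the text.

```python
def mat_mul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][m] * y[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def commute(x, y):
    """True when the two operators commute, i.e. xy = yx."""
    return mat_mul(x, y) == mat_mul(y, x)


# Hypothetical scale observables on a two-dimensional Hilbert space.
SIGMA_Z = [[1, 0], [0, -1]]
SIGMA_X = [[0, 1], [1, 0]]
DIAGONAL = [[2, 0], [0, 3]]
```

Here `SIGMA_Z` and `DIAGONAL` commute and so satisfy the criterion for commensurability, while `SIGMA_Z` and `SIGMA_X` do not commute.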
Thus we have arrived at the well-known and common
characterization of commensurable observables without the need to make
use of the usual long-winded discussion of what is meant by joint measurements.
Here we have intentionally avoided the use of the expression
“simultaneous measurement.” We shall return to this problem and its
accompanying misconceptions in XVII, §2 and XVIII, §4. At this point we
have defined the notions of coexistence and commensurability of
observables in D 3.1 and D 3.2, respectively, only in the form of an
idealization. We must also look into the question concerning the realization
of these idealized definitions; this will be done in §4.
If two observables are coexistent, then, according to D 3.1 there exists an
observable $\Sigma \xrightarrow{F} L$ for which $F_1\Sigma_1 \subset F\Sigma$ and $F_2\Sigma_2 \subset F\Sigma$. We shall also use a
concept which will characterize the “extreme case” of noncoexistence between
two observables. For this purpose we shall define the subset
$$S = \{(\lambda_1 1, \lambda_2 1, \ldots) \mid 0 \le \lambda_i \le 1\}$$
of L. Obviously the elements of S are those which are coexistent relative to
each element of L.
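D 3.3. Two observables $\Sigma_1 \xrightarrow{F_1} L$, $\Sigma_2 \xrightarrow{F_2} L$ are said to be complementary if
and, for each observable $\Sigma \xrightarrow{F} L$, it follows that
$$F_1\Sigma_1 \cap F\Sigma \subset S \quad\text{or}\quad F_2\Sigma_2 \cap F\Sigma \subset S.$$
It is not difficult to see that two decision observables are complementary
only if, for $e_1 \in F_1\Sigma_1$ and $e_2 \in F_2\Sigma_2$ with $e_1$, $e_2$ commensurable, it follows
that either $e_1$ or $e_2 \in Z$ where $Z$ is the center of $G$ (the definition of $Z$ is given
in D 1.3.2).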
4 Realizations of Observables
We have simplified the analysis of the structure of observables by using the
idealized version $\Sigma \xrightarrow{F} L$ of the definition of an observable instead of the
realistic one $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$. We must now ask whether it is possible to realize
(in the sense of the construction of a measurement apparatus) the observable
$\Sigma \xrightarrow{F} L$, that is, whether there exists a $b_0 \in \mathcal{R}_0$ for which the following diagram
commutes:
$$\Sigma \xrightarrow{h} \mathcal{R}(b_0) \xrightarrow{\psi_0} L, \qquad F = \psi_0 \circ h,$$
where $h$ is an isomorphism. The requirement that, to each observable $\Sigma \xrightarrow{F} L$
there exists a $\mathcal{R}(b_0)$ such that the above diagram is satisfied, is clearly too
strong since we have assumed that $\mathcal{Q}'$, $\mathcal{R}$ are denumerable (see III, §3).
The following axiom governing “approximate” realizations may be added
to the previous axioms:
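AOb. To each observable $\Sigma \xrightarrow{F} L$ and each finite Boolean subring $\tilde\Sigma$ of $\Sigma$
and each $\sigma(\mathcal{B}', \mathcal{B})$-neighborhood $U$ of $0 \in \mathcal{B}'(\mathcal{H}_1, \ldots)$ there exists a $\mathcal{R}(b_0)$ and
a homomorphism $h$: $\tilde\Sigma \to \mathcal{R}(b_0)$ such that
$$\psi_0 h(\sigma) - F(\sigma) \in U$$
for all $\sigma \in \tilde\Sigma$ where $\psi_0$ is the measure $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$.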
If we do not explicitly require axiom AOb, then, in any case, we may
consider AOb to be a “certain” hypothesis in the sense of [1], §10.1; we shall
not attempt to establish this result here. The proof that AOb is a “certain”
hypothesis implies that the addition of axiom AOb does not lead to any
contradictions in the mathematical theory. As we have explained in
[1], §10.4, it is only a matter of taste whether AOb is added as an axiom,
except, if on the basis of experience there is strong evidence that, in nature,
there are strong barriers to the possibility that apparatuses satisfying AOb
can actually be constructed for all observables $\Sigma \xrightarrow{F} L$ (see [1], §10). But we
have no indication of such barriers for the case of microsystems. However,
AOb is wrong in the case of an extrapolated quantum mechanics for “many
particles” (see [13], X).
Axiom AOb expresses the following statement: Each observable can be
measured “approximately” (note that $\sigma(\mathcal{B}', \mathcal{B})$ determines the structure of
physical imprecision in $L$; see III, §3, and in general [1], §9).
The structure described by AOb by which we may “approximately”
measure an observable is essential for the application of quantum mechanics.
Most of the important observables in quantum mechanics, for example,
position, momentum, and angular momentum can only be approximately
measured. A noteworthy illustration of this fact is the
Stern-Gerlach experiment, in which the angular momentum of an atom is
measured. In fact, the angular momentum is only approximately measured,
because the procedure represents a measurement for angular momentum
only for a subset of the ensembles (a subset for which the neighborhood U
can be specified). This situation may be more clearly understood in the
presentation of the theory of this experiment in [2], XI, §7.2. The angular
momentum will be measured only for such ensembles which pass through the
magnetic field in a particular way. For ensembles which, for example, do not
pass through the apparatus, the apparatus does not make any measurements
of angular momentum!
The definition of coexistence of two observables given in D 3.1 has, on the
basis of AOb, the following consequence: there exists a measurement method
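$b_0 \in \mathcal{R}_0$ by which it is possible to (at least approximately) make a joint
measurement of both of them. This can be clearly inferred from the diagram
in D 1.3.1 as follows.
Let $\tilde\Sigma_1$ be a finite Boolean subring of $\Sigma_1$, and similarly, let $\tilde\Sigma_2$ be such for
$\Sigma_2$. Then $h_1\tilde\Sigma_1$ and $h_2\tilde\Sigma_2$ generate a finite Boolean subring $\tilde\Sigma$ in $\Sigma$, to which,
according to AOb, given a neighborhood $U$ there exists a $\mathcal{R}(b_0)$ satisfying
$h\tilde\Sigma \subset \mathcal{R}(b_0)$ and $\psi_0 h(\sigma) - F(\sigma) \in U$. From this result it follows that
$\tilde\Sigma_1 \xrightarrow{hh_1} \mathcal{R}(b_0)$, $\tilde\Sigma_2 \xrightarrow{hh_2} \mathcal{R}(b_0)$ and $\psi_0 h h_1(\sigma) - F_1(\sigma) \in U$ for all $\sigma$ from $\tilde\Sigma_1$ and
$\psi_0 h h_2(\sigma) - F_2(\sigma) \in U$ for all $\sigma$ from $\tilde\Sigma_2$. Thus $b_0$ is an approximate measurement
method for $\Sigma_1 \xrightarrow{F_1} L$ as well as for $\Sigma_2 \xrightarrow{F_2} L$, that is, the coexistent
observables $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ can be (approximately) jointly measured
by using the single method $b_0$.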
Of course, the converse of this situation does not hold. It is indeed possible
to make joint approximate measurements of two noncoexistent (even
complementary) observables. Naturally both cannot be measured with
arbitrary accuracy. For example, it is even possible for an apparatus to
measure both position and momentum in such a way that the error of
measurement of position $\Delta q$ and that of momentum $\Delta p$ satisfy $\Delta p \, \Delta q < 1/2$
for a certain subset of ensembles. This can occur only for certain subsets of
ensembles. This fact has led to much unnecessary confusion. The source of
this confusion is the false interpretation that such an attribution of position
and momentum to single microsystems is forbidden by the Heisenberg
uncertainty relations (see §8.2).
Axiom AOb states that it is “physically possible” (see [1], §10.4) that every
observable may be approximately measured. However, AOb does not show
how we may find the appropriate measurement b0. The theory presented
here cannot be used for this purpose because it does not contain any
mathematical description of the technical construction of the “apparatus”
$b_0 \in \mathcal{R}_0$. In XVII we shall undertake a partial step in the direction of this
“construction problem” for the apparatus $b_0$. In XVIII we shall be introduced
to some of the deep problems found in this area. In [13], X these
problems are solved in principle.
The fact that the theory presented here does not yet provide a description
of the physical structure for the apparatus b0 gives rise to another deficiency
of quantum mechanics. We shall now provide a brief description of this
deficiency.
Suppose that we are given a detailed description of the physical structure
for a given apparatus (for example, a particle counter). Then, using the
mapping axioms given in [1], §5 we may express this fact by identifying the
apparatus with a particular b0. Apparatuses for which the internal physical
structure is actually different will correspond to different b0’s.
According to the definition III, D 1.3, the map $\psi$ should be obtained
directly from knowledge of the internal physical structure of the apparatus.
But only the existence of this map is assured in the theory. It remains open
which $g = \psi(b_0, b)$ corresponds to a given definite experimental effect
process. In this situation we may choose to “guess” which $g \in L$ corresponds
(at least, approximately) to a given $(b_0, b)$ and to “add” the resulting
guesses to the theory as axioms. In XI and XVI we will use this approach and
the defects of this approach will become very evident. In XVII we shall
describe a method by which it may be possible to make some progress in the
area. A valid solution of the problem of obtaining an approximate determination
of the map $\psi$ is, however, not attained in XVII. A general
method for finding $\psi$ will be given in [13], X.
5 Coexistent Decompositions of Ensembles
In other formulations of quantum mechanics the question whether it is
possible to make joint measurements of pairs of observables has played an
important role. The question whether it is possible to make joint preparations
is, however, either usually ignored or it is treated as a minor part of the
measurement problem. Thus, in the literature, a primary emphasis is placed
upon the discussion (in a somewhat limited way) of idealized measurements
of the first kind. These idealized measurements are concerned with preparations
of a well-defined form (see XVII, §5). We may clarify many of the
fundamental questions in quantum mechanics by separating the question
about the possibility of making joint preparations from the problem of
registration. This question can be posed in a natural manner using the
methodology presented here. To do so only requires that we begin with the
methods of III, §2.
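Using III, Th. 2.1 and the identification of $\varphi(a)$ with the elements of $K$
according to AQ, for a decomposition $a = \bigcup_{i=1}^{n} a_i$ of a preparation procedure
$a$ we obtain
$$\varphi(a) = \sum_{i=1}^{n} \lambda_i \varphi(a_i), \tag{5.1}$$
where $\lambda_i = \lambda_{\mathcal{Q}'}(a, a_i)$, $0 \le \lambda_i \le 1$ and $\sum_{i=1}^{n} \lambda_i = 1$.
If we have two decompositions of the same $a \in \mathcal{Q}'$
$$a = \bigcup_{i=1}^{n} a_i = \bigcup_{k=1}^{m} \tilde a_k$$
we then obtain
$$\varphi(a) = \sum_{i=1}^{n} \lambda_i \varphi(a_i) = \sum_{k=1}^{m} \tilde\lambda_k \varphi(\tilde a_k).$$
We may create a new decomposition from the above decompositions as
follows:
$$a = {\bigcup_{i,k}}' \, (a_i \cap \tilde a_k). \tag{5.2}$$
Here we shall use the prime $'$ to indicate that we shall not take the union (or
summation) over those $(i, k)$ for which $a_i \cap \tilde a_k = \emptyset$. Thus, from (5.2) we
obtain
$$\varphi(a) = {\sum_{i,k}}' \, \lambda_{ik} \varphi(a_i \cap \tilde a_k), \tag{5.3}$$
where
$$\lambda_{ik} = \lambda_{\mathcal{Q}'}(a, a_i \cap \tilde a_k).$$
In addition, we find that
$$\varphi(a_i) = {\sum_{k}}' \, \lambda_k^{(i)} \varphi(a_i \cap \tilde a_k)$$
and
$$\varphi(\tilde a_k) = {\sum_{i}}' \, \tilde\lambda_i^{(k)} \varphi(a_i \cap \tilde a_k),$$
where
$$\lambda_k^{(i)} = \lambda_{\mathcal{Q}'}(a_i, a_i \cap \tilde a_k), \qquad \tilde\lambda_i^{(k)} = \lambda_{\mathcal{Q}'}(\tilde a_k, a_i \cap \tilde a_k).$$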
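The common refinement (5.2) can be sketched with finite sets. The example below is only an illustration: a preparation procedure is modelled as a finite set of prepared systems, and the weight function `weight` (relative size of a part within a whole) is a hypothetical stand-in for the relative frequency appearing in (5.1); neither the sets nor the names come from the text.

```python
from fractions import Fraction


def refine(first, second):
    """Common refinement: all non-empty intersections of a block of each decomposition."""
    return {(i, k): a & b
            for i, a in enumerate(first)
            for k, b in enumerate(second)
            if a & b}


def weight(whole, part):
    """Hypothetical stand-in for the relative frequency: |part| / |whole|."""
    return Fraction(len(part), len(whole))


# Hypothetical preparation procedure with two decompositions.
a = frozenset(range(12))
FIRST = [frozenset(range(0, 6)), frozenset(range(6, 12))]
SECOND = [frozenset(range(0, 4)), frozenset(range(4, 6)), frozenset(range(6, 12))]
REFINED = refine(FIRST, SECOND)
```

Empty intersections are skipped, which corresponds to the primed union and sum. In this model the weights of the refined blocks add up to the weights of the blocks they refine.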
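The decomposition may be expressed in a very simple way if, in addition to
$\varphi$, we introduce the maps
$$\varphi_a(\tilde a) = \lambda_{\mathcal{Q}'}(a, \tilde a)\varphi(\tilde a)$$
of $\mathcal{Q}'(a) = \{\tilde a \mid \tilde a \in \mathcal{Q}', \tilde a \subset a\}$ into $K$.
Th. 5.1. The map $\varphi_a$ with $\varphi_a(\tilde a) = \lambda_{\mathcal{Q}'}(a, \tilde a)\varphi(\tilde a)$ of $\mathcal{Q}'(a)$ in $K$ is an additive
measure over the Boolean ring $\mathcal{Q}'(a)$ which satisfies $\varphi_a(a) = \varphi(a) \in K$.
Proof. Let $\tilde a = \tilde a_1 \cup \tilde a_2$ where $\tilde a_1 \cap \tilde a_2 = \emptyset$; let $\tilde a_3 = a \setminus \tilde a$. Then it follows that
$a = \bigcup_{i=1}^{3} \tilde a_i$ is a decomposition for which according to (5.1)
$\varphi(a) = \varphi_a(\tilde a_1) + \varphi_a(\tilde a_2) + \varphi_a(\tilde a_3)$. In addition, $a = \tilde a \cup \tilde a_3$ is a decomposition, so
that we obtain $\varphi(a) = \varphi_a(\tilde a) + \varphi_a(\tilde a_3)$. Comparing the two expressions gives
$\varphi_a(\tilde a) = \varphi_a(\tilde a_1) + \varphi_a(\tilde a_2)$.
D 5.1. A $\tilde w \in K$ satisfying $\tilde w \le w \in K$ is called a mixture component or
component of $w$.
If $\tilde w$ is a component of $w$, then $w - \tilde w$ also is, because $0 \le w - \tilde w \le w$;
$w = \tilde w + (w - \tilde w)$ is therefore a decomposition of $w$ into the components $\tilde w$
and $(w - \tilde w)$.
Two decompositions of the same preparation procedure $a =
\bigcup_{i=1}^{n} a_i = \bigcup_{k=1}^{m} \tilde a_k$ generate two decompositions of the ensemble $\varphi(a)$ as
follows:
$$\varphi(a) = \sum_{i=1}^{n} \varphi_a(a_i) = \sum_{k=1}^{m} \varphi_a(\tilde a_k),$$
where the components $\varphi_a(a_i)$, $\varphi_a(\tilde a_k)$ lie in the range of the additive measure $\varphi_a$
on the Boolean ring $\mathcal{Q}'(a)$.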
D 5.2. Two decompositions of an ensemble
$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde w_k \quad\text{where } w_i, \tilde w_k \in K$$
are said to be coexistent if there exists a Boolean ring $\Sigma$ and an additive
measure $\Sigma \xrightarrow{W} K$ for which $W(\varepsilon) = w$ and $w_i, \tilde w_k \in W\Sigma$.
Two decompositions of a preparation procedure $a$ result in coexistent
decompositions of $\varphi(a)$. In §7 we shall return to the problem of the realization
of coexistent decompositions.
D 5.3. A set $A \subset K$ is called a set of coexistent components of $w$ if there
exists a Boolean ring $\Sigma$ and an additive measure $\Sigma \xrightarrow{W} K$ such that $W(\varepsilon) = w$
and $A \subset W\Sigma$.
By analogy with the case of coexistent effects we need only consider
effective measures. Here it is reasonable to consider an idealization of $\Sigma \xrightarrow{W} K$
obtained by mathematical completion. This possibility follows directly from
the following theorem.
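Th. 5.2. Let $W$ be an effective additive measure on the Boolean ring $\Sigma \xrightarrow{W} K$
which satisfies $W(\varepsilon) = w \in K$. Then $m_0(\sigma) = \|W(\sigma)\| = \mu(W(\sigma), 1)$ is an
effective additive measure satisfying $m_0(\varepsilon) = 1$, and $d(\sigma_1, \sigma_2) = m_0(\sigma_1 \dotplus \sigma_2)$
is a metric in $\Sigma$ for which $W$ is uniformly continuous as a map in the Banach
space $\mathcal{B}$.
PROOF. From $\sigma = \sigma_1 \vee \sigma_2$ and $\sigma_1 \wedge \sigma_2 = 0$ it follows that $W(\sigma) = W(\sigma_1) + W(\sigma_2)$
and $m_0(\sigma) = \mu(W(\sigma), 1) = \mu(W(\sigma_1), 1) + \mu(W(\sigma_2), 1) = m_0(\sigma_1) + m_0(\sigma_2)$. Since
$W(\varepsilon) \in K$ we obtain $m_0(\varepsilon) = \mu(W(\varepsilon), 1) = 1$. $m_0$ is effective, because if $m_0(\sigma) = 0$ it
follows that $\mu(W(\sigma), 1) = 0$ and therefore $\|W(\sigma)\| = 0$, that is, $W(\sigma) = 0$; since $W$ is,
by assumption, effective, we obtain $\sigma = 0$.
From
$$W(\sigma_1) + W(\sigma_2 \dotplus \sigma_1\sigma_2) = W(\sigma_1 \vee \sigma_2) = W(\sigma_2) + W(\sigma_1 \dotplus \sigma_1\sigma_2)$$
and
$$W(\sigma_1 \dotplus \sigma_2) = W(\sigma_1 \dotplus \sigma_1\sigma_2) + W(\sigma_2 \dotplus \sigma_1\sigma_2)$$
it follows that, for all $g \in L$:
$$|\mu(W(\sigma_1) - W(\sigma_2), g)| \le \mu(W(\sigma_1 \dotplus \sigma_2), g) \le \mu(W(\sigma_1 \dotplus \sigma_2), 1) = d(\sigma_1, \sigma_2).$$
On the other hand, we have
$$\|W(\sigma_1) - W(\sigma_2)\| = \sup_{y \in [-1, 1]} \mu(W(\sigma_1) - W(\sigma_2), y).$$
Since $[-1, 1] = 2L - 1$, for $y = 2g - 1$ we obtain
$$\mu(W(\sigma_1) - W(\sigma_2), y) = 2\mu(W(\sigma_1) - W(\sigma_2), g) - \mu(W(\sigma_1) - W(\sigma_2), 1)$$
and thus we find that
$$\|W(\sigma_1) - W(\sigma_2)\| \le 3 \sup_{g \in L} |\mu(W(\sigma_1) - W(\sigma_2), g)|.$$
Therefore we obtain
$$\|W(\sigma_1) - W(\sigma_2)\| \le 3\, d(\sigma_1, \sigma_2).$$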
From Th. 5.2 it follows that, as is the case in §1.4, it is possible to complete
$\Sigma$ and extend $W$ on the completion of $\Sigma$. Then $W$ becomes a $\sigma$-additive
measure on the completion of $\Sigma$. If $\Sigma$ is a (lattice-theoretically) complete
Boolean ring, and if $W$ is a $\sigma$-additive measure then $\Sigma$ is complete with
respect to the metric
$$d(\sigma_1, \sigma_2) = \mu(W(\sigma_1 \dotplus \sigma_2), 1).$$
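The metric of Th. 5.2 can be checked on a small classical example. In the sketch below the Boolean ring is the set of subsets of three points and $W(\sigma)$ restricts a probability vector to $\sigma$; the vector `P` and the function names are hypothetical and serve only as an illustration. The symmetric difference of sets plays the role of $\sigma_1 \dotplus \sigma_2$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical classical ensemble: a probability vector on three points.
P = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}


def W(sigma):
    """Additive measure: restrict the probability vector to the subset sigma."""
    return {i: (P[i] if i in sigma else Fraction(0)) for i in P}


def m0(sigma):
    """m0(sigma) = norm of W(sigma), here the total weight of sigma."""
    return sum(W(sigma).values(), Fraction(0))


def d(s1, s2):
    """d(s1, s2) = m0 of the symmetric difference of s1 and s2."""
    return m0(s1 ^ s2)


def norm_diff(s1, s2):
    """Norm of W(s1) - W(s2), here the sum of absolute differences."""
    return sum((abs(W(s1)[i] - W(s2)[i]) for i in P), Fraction(0))


SUBSETS = [frozenset(c) for n in range(len(P) + 1) for c in combinations(P, n)]
```

In this commutative example the norm of $W(\sigma_1) - W(\sigma_2)$ equals $d(\sigma_1, \sigma_2)$, so the bound by $3\,d(\sigma_1, \sigma_2)$ from the proof holds with room to spare.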
We define the following analog of an observable:
D 5.4. A Boolean ring $\Sigma$ with effective measure $W: \Sigma \to \check{K}$ for which
$W(\varepsilon) = w \in K$ and for which $\Sigma$ (in the metric determined by $W$) is complete
and separable is called a preparator of $w$.
It follows that two decompositions of an ensemble
$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde{w}_k$$
are coexistent only if there exists a preparator $\Sigma \xrightarrow{W} \check{K}$ such that $w_i, \tilde{w}_k \in W\Sigma$.
The mathematical similarity between preparator and observable runs
much deeper, and depends upon the following theorem:
Th. 5.3. For fixed $w \in \check{K}$ each $\tilde{w} \in \check{K}$ satisfying $\tilde{w} \le w$ can be written in the
form $\tilde{w} = w^{1/2} g w^{1/2}$, where $g \in L$. Let $g \in L$; then, for $w \in \check{K}$: $w^{1/2} g w^{1/2} \in \check{K}$.
The correspondence $g \to w^{1/2} g w^{1/2}$ is an order-isomorphism of $[0, e]$ onto
$[0, w] \subset \check{K}$ where $e$ is the decision effect satisfying $K_1(e) = C(w)$.
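PROOF. From $\tilde{w} = w^{1/2} g w^{1/2}$ and from $0 \le g \le 1$ it follows that
$\langle\varphi, \tilde{w}\varphi\rangle = \langle w^{1/2}\varphi, g w^{1/2}\varphi\rangle \le \langle w^{1/2}\varphi, w^{1/2}\varphi\rangle = \langle\varphi, w\varphi\rangle$, that is, $\tilde{w} \le w$. Since
$\langle w^{1/2}\varphi, g w^{1/2}\varphi\rangle \ge 0$ we obtain $\tilde{w} \ge 0$.
Let $0 \le \tilde{w} \le w$; we shall now consider the support $\mathfrak{r}$ of $w$. The domain of
definition of the operator $w^{-1/2}$ is dense in $\mathfrak{r}$ because, if $w = \sum_\nu w_\nu P_{\varphi_\nu}$ and $w_\nu \ne 0$,
then the $\varphi_\nu$ form a complete orthonormal basis for $\mathfrak{r}$, and all vectors of the form
$\sum_{\nu=1}^{n} \varphi_\nu a_\nu$ lie in the domain of definition of $w^{-1/2}$. If we then write $A = \tilde{w}^{1/2} w^{-1/2}$,
then $A$ is defined in a dense set in $\mathfrak{r}$; in addition, since $\tilde{w} \le w$ we obtain
$$\|A\varphi\|^2 = \langle A\varphi, A\varphi\rangle = \langle w^{-1/2}\varphi, \tilde{w} w^{-1/2}\varphi\rangle \le \langle w^{-1/2}\varphi, w w^{-1/2}\varphi\rangle = \langle\varphi, \varphi\rangle,$$
that is, $\|A\| \le 1$. Clearly $A$ may be extended to all of $\mathfrak{r}$. Therefore, for all $\varphi \in \mathfrak{r}$ we
obtain
$$A w^{1/2}\varphi = \tilde{w}^{1/2} w^{-1/2} w^{1/2}\varphi = \tilde{w}^{1/2}\varphi$$
and thus we find that
$$\tilde{w}^{1/2} = A w^{1/2}.$$
Thus we obtain
$$\tilde{w} = \tilde{w}^{1/2}\tilde{w}^{1/2} = (\tilde{w}^{1/2})^{+}\tilde{w}^{1/2} = w^{1/2} A^{+} A w^{1/2}.$$
If we define $g$ by $A^{+}A = g$, we then obtain $\tilde{w} = w^{1/2} g w^{1/2}$. Since (see AIV, §3 and §4)
$\|A^{+}\| = \|A\|$ and $\|A^{+}A\| \le \|A^{+}\|\,\|A\|$ we therefore find that $\|g\| \le 1$ and therefore
$0 \le g \le 1$.
From the operator equation in $\mathfrak{r}$: $w^{1/2} g_1 w^{1/2} = w^{1/2} g_2 w^{1/2}$ it follows that
$\langle w^{1/2}\varphi, (g_1 - g_2) w^{1/2}\varphi\rangle = 0$ for all $w^{1/2}\varphi$. Since $w^{1/2}\mathfrak{r}$ is dense in $\mathfrak{r}$ we obtain
$\langle\psi, (g_1 - g_2)\psi\rangle = 0$ for all $\psi \in \mathfrak{r}$ and thus we find that $g_1 - g_2 = 0$ is satisfied as
an operator equation in $\mathfrak{r}$.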
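The construction in Th. 5.3 can be illustrated with finite matrices. The sketch below assumes a finite-dimensional Hilbert space and a density matrix $w$ of rank 3 in dimension 4; the helper names `psd_sqrt`, `effect_for_component`, and `demo` are hypothetical. It recovers $g$ from a component $\tilde{w} \le w$ by inverting $w^{1/2}$ on the support of $w$, which is the finite-dimensional counterpart of the operator $A^{+}A$ in the proof.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite Hermitian matrix via its eigen-decomposition."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def effect_for_component(w, w_tilde, tol=1e-12):
    """Return g with w_tilde = w^(1/2) g w^(1/2), built on the support of w."""
    vals, vecs = np.linalg.eigh(w)
    keep = vals > tol
    # w^(-1/2) restricted to the support of w (pseudo-inverse square root)
    inv_sqrt = (vecs[:, keep] / np.sqrt(vals[keep])) @ vecs[:, keep].conj().T
    return inv_sqrt @ w_tilde @ inv_sqrt

def demo(seed=1, dim=4, rank=3):
    """Build w, a component w_tilde <= w, and the effect g recovered from them."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank))
    w = x @ x.conj().T
    w = w / np.trace(w).real                     # density matrix with rank-3 support
    sqrt_w = psd_sqrt(w)
    # pick an effect g0 with 0 <= g0 <= 1 and form the component w_tilde <= w
    y = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    h = y @ y.conj().T
    g0 = h / np.linalg.eigvalsh(h).max()
    w_tilde = sqrt_w @ g0 @ sqrt_w
    g = effect_for_component(w, w_tilde)
    return w, w_tilde, g, sqrt_w
```

The recovered `g` lives on the support of `w` (it equals $e\,g_0\,e$ for the support projection $e$), which reflects the fact that the correspondence is one-to-one only on $[0, e]$.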
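Th. 5.4. Let $\Sigma \xrightarrow{W} \check{K}$ be an additive measure with $W(\varepsilon) = w \in K$. Then an
additive measure $\eta$ is defined in terms of the bijection $\chi$ and the following
diagram as follows:
$$\Sigma \xrightarrow{W} [0, w] \xrightarrow{\chi} [0, e], \qquad \eta = \chi \circ W: \Sigma \to [0, e],$$
where $\chi: \tilde{w} \to g$ with $\tilde{w} = w^{1/2} g w^{1/2}$, and where $e \in G$ and $K_1(e) = C(w)$.
The uniform structure $U_g$ defined by $\eta$ is the same as the one defined by $W$.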
PROOF. The fact that $\eta$ is an additive measure follows directly from Th. 5.3. $U_g$ is
determined by the metric
$$d_\eta(\sigma_1, \sigma_2) = \mu(w_0, \eta(\sigma_1 + \sigma_2))$$
for a $w_0$ for which $C(w_0) = K$ (see Th. 1.4.1). For example, we may choose $w_0$ as
follows:
$$w_0 = \tfrac{1}{2}\tilde{w} + \tfrac{1}{2}w,$$
where $w$ is defined in Th. 5.4 and $\tilde{w} \in K_0(e)$ and $C(\tilde{w}) = K_0(e)$. Thus, since
$0 \le \eta(\sigma) \le e$ we obtain
$$d_\eta(\sigma_1, \sigma_2) = \tfrac{1}{2}\mu(w, \eta(\sigma_1 + \sigma_2)).$$
On the other hand, the metric for $W$ is given by
$$d_W(\sigma_1, \sigma_2) = \mu(W(\sigma_1 + \sigma_2), 1) = \mu(w^{1/2}\eta(\sigma_1 + \sigma_2)w^{1/2}, 1) = \mu(w, \eta(\sigma_1 + \sigma_2)).$$
According to Th. 5.4, if we admit, as observables, the $\Sigma \xrightarrow{\eta} L$ for which
$\eta(\varepsilon) = e \in G$ and $e \ne 1$, then we have established a 1:1 correspondence
between the preparators $\Sigma \xrightarrow{W} \check{K}$ and the observables $\Sigma \xrightarrow{\eta} L$. This
correspondence permits us to transfer, by analogy, all theorems about
observables to theorems about preparators. We leave the proof of this result
as an exercise for the reader. We may, without difficulty, also define the
following concepts for preparators: a preparator for an ensemble $w$ is more
extensive than another for the same ensemble, and: two preparators for the
same ensemble are coexistent (see also §6). These concepts are completely
analogous to those defined for observables. We shall now explain the
distinction between preparator and observable. The distinction depends on
the physical meaning and, mathematically, upon the structure of the measure
$$W(\sigma) = w^{1/2}\eta(\sigma)w^{1/2}$$
which is somewhat different from $\eta(\sigma)$.
The desire not to consider “unnecessary” convex combinations of the $\eta(\sigma)$
is analogous to the desire not to consider unnecessary mixtures of the $W(\sigma)$;
here the case of decompositions is more interesting physically than that of
mixtures. We shall therefore seek preparators for which $\Sigma \xrightarrow{\eta} L$ corresponds
to a kernel observable.
We find, however, that the decomposition of observables described in §2.4
has an alternative interpretation when applied to $\eta$. Suppose, for example,
that
$$\eta(\sigma) = \lambda\eta_1(\sigma) + (1 - \lambda)\eta_2(\sigma) \tag{5.4}$$
holds for all $\sigma \in \Sigma$. Then it follows that
$$W(\sigma) = \lambda W_1(\sigma) + (1 - \lambda)W_2(\sigma). \tag{5.5}$$
Equation (5.5) represents a decomposition of each ensemble $W(\sigma)$. We note,
however, that this decomposition fulfils, on the basis of (5.5), an additional
condition. Since $\eta(\varepsilon) = e = \lambda\eta_1(\varepsilon) + (1 - \lambda)\eta_2(\varepsilon)$ we find that
$\eta_1(\varepsilon) = \eta_2(\varepsilon) = e$ and we therefore obtain $W_1(\varepsilon) = W_2(\varepsilon) = w$, that is, the
ensemble $w$ underlying the preparator is not decomposed by (5.5).
A decomposition which does not, in general, satisfy this condition may be
obtained, for example, from a partition $\varepsilon = \sigma_1 \vee \sigma_2$, $\sigma_1 \wedge \sigma_2 = 0$ of the form
$$W(\sigma) = W(\sigma_1 \wedge \sigma) + W(\sigma_2 \wedge \sigma) = \lambda W_1(\sigma) + (1 - \lambda)W_2(\sigma),$$
where $\lambda = \mu(W(\sigma_1), 1)$, $W_1(\sigma) = \lambda^{-1}W(\sigma_1 \wedge \sigma)$ and
$W_2(\sigma) = (1 - \lambda)^{-1}W(\sigma_2 \wedge \sigma)$.
In these cases, we find that, in general, $W_1(\varepsilon) = \lambda^{-1}W(\sigma_1)$ is not equal to
$W(\varepsilon) = w$. If, in (5.5), we have $W_1(\varepsilon) = W_2(\varepsilon) = W(\varepsilon) = w$, then, according to
Th. 5.4 it follows that $\eta_1(\sigma)$, $\eta_2(\sigma)$ are uniquely determined with $\eta_1(\varepsilon) = \eta_2(\varepsilon)
= e$ and must also satisfy (5.4).
Decompositions of the form (5.5) for which $W_1(\varepsilon) = W_2(\varepsilon) = W(\varepsilon) = w$ are
called decompositions of the preparator. One of the arguments presented in
§2.4 for observables is not applicable because, in the set of preparation
procedures, there is no special subset of the type $\mathcal{R}_0$ in $\mathcal{R}$. Despite this fact,
the decompositions of preparators still suggest, in a way analogous to the
case of observables, the goal of realizing irreducible preparators
experimentally as far as possible.
If $\Sigma \xrightarrow{\eta} L$ is a decision observable (with the exception that we do not
require that $\eta(\varepsilon) = e = 1$) then the corresponding preparator is irreducible.
The preparator for which $\Sigma \xrightarrow{\eta} L$ is a decision observable plays a theoretically
distinguished role.
D 5.5. A preparator $\Sigma \xrightarrow{W} \check{K}$ is called a decision preparator if the corresponding
observable $\Sigma \xrightarrow{\eta} L$ is a decision observable.
According to Th. 1.4.8, for a decision observable we may identify $\Sigma$ with a
Boolean ring $\eta\Sigma$. A decision preparator is uniquely defined by a Boolean
subring $\Sigma$ of $G$ (possibly with an $e \ne 1$ as unit element; $e$ is the support of $w$)
together with the map
$$W(e) = w^{1/2} e w^{1/2} \tag{5.6}$$
of $\Sigma$ into $\check{K}$ for $e \in \Sigma$. If $e$ is the unit element, that is, the support of $w$, we find that
$W(e) = w^{1/2} e w^{1/2} = w^{1/2} w^{1/2} = w$ (as we would expect).
In the desire to eliminate the lack of symmetry between preparator and
observable, one often encounters the following statement: the ensemble
$w = (1, 1, \ldots) = 1$ corresponds to the case of “complete ignorance,” and the
fact that $w$ is not an element of $K$ is regarded only as a matter of mathematical
“inconvenience.”
For $w = 1$ equation (5.6) may be formally transformed into $W(e) = e$ and
we may come to the false conclusion that a preparator is nothing other than a
decomposition of $w = 1$ with respect to a (decision) observable. According to
the formulation of quantum mechanics developed here we cannot dismiss $w$
in (5.6); $w$ is not a measure of knowledge or ignorance, but is only a
mathematical symbol $w = \varphi(a)$ corresponding to an apparatus (!) $a$ by means
of the map $\varphi$ of $\mathcal{Q}'$ into $K$. Such an apparatus $a$ does not exist for the case in
which $\varphi(a)$ is approximately equal to 1. On the contrary, we may approximately
represent each $w = \sum_\nu w_\nu P_{\varphi_\nu}$ as a finite sum $\sum_{\nu=1}^{N} \tilde{w}_\nu P_{\varphi_\nu}$, where
$\tilde{w}_\nu = w_\nu \left(\sum_{\mu=1}^{N} w_\mu\right)^{-1}$. In physical terms each ensemble which we may obtain
from a preparation procedure has, for all practical purposes, a
finite-dimensional support. This is a very important aspect of the structure of
microsystems, and is a direct consequence of axiom AV 4s.
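The quality of the finite-sum approximation can be made concrete. In the following sketch the spectrum $w_\nu \propto 2^{-\nu}$ (cut off at 60 terms) is an assumed example, and `truncate` and `trace_distance` are hypothetical helper names. Since $w$ and its truncation share the eigenvectors $\varphi_\nu$, the trace-norm distance reduces to an $\ell^1$ distance of the eigenvalue sequences, and it equals twice the discarded weight.

```python
import numpy as np

def truncate(weights, n):
    """Keep the n largest eigenvalues and renormalize them to sum to 1."""
    w = np.sort(np.asarray(weights, dtype=float))[::-1]
    kept = w[:n] / w[:n].sum()
    truncated = np.concatenate([kept, np.zeros(len(w) - n)])
    return truncated, w

def trace_distance(weights, n):
    """Trace-norm distance between w and its rank-n truncation (same eigenbasis)."""
    truncated, w = truncate(weights, n)
    return float(np.abs(truncated - w).sum())

# Assumed example spectrum: w_nu proportional to 2^(-nu), nu = 1..60
spectrum = 0.5 ** np.arange(1, 61)
spectrum = spectrum / spectrum.sum()
```

For this spectrum the distance decays like $2 \cdot 2^{-N}$, so a modest number of terms already reproduces $w$ to high accuracy.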
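The following important question is often discussed: For a given observable
$\Sigma \xrightarrow{F} L$ is it possible to find a preparator $\tilde{\Sigma} \xrightarrow{W} \check{K}$ of $w$ for which $w$ can be
decomposed in such a way that there is no dispersion with respect to the
observable?
Let $\Sigma \xrightarrow{F} L$ be an observable, and let $\tilde{\Sigma} \xrightarrow{W} \check{K}$ be a preparator. Suppose that,
to each $\sigma \in \Sigma$ there exists a $\tilde{\sigma} \in \tilde{\Sigma}$ for which
$$\mu(W(\tilde{\sigma}), 1 - F(\sigma)) = 0 \quad\text{and}\quad \mu(W(\varepsilon + \tilde{\sigma}), F(\sigma)) = 0, \tag{5.7}$$
that is, the mixture component $W(\tilde{\sigma})$ triggers the response for the effect $F(\sigma)$
with certainty and the mixture component $W(\varepsilon + \tilde{\sigma})$ with certainty does not
trigger the response for $F(\sigma)$.
D 5.6. A preparator $\tilde{\Sigma} \xrightarrow{W} \check{K}$ is said to be dispersion-free with respect to the
observable $\Sigma \xrightarrow{F} L$ if to each $\sigma \in \Sigma$ there exists a $\tilde{\sigma} \in \tilde{\Sigma}$ for which (5.7) is
satisfied.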
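What relationships must be satisfied between a preparator and an
observable in order that the preparator be dispersion-free with respect to the
observable?
Let $e_w$ denote the support of $w = W(\varepsilon)$. Then, since $W(\tilde{\sigma}) \le W(\varepsilon)$ we obtain
$W(\tilde{\sigma}) = e_w W(\tilde{\sigma}) e_w$ for all $\tilde{\sigma}$ and therefore find that
$$\mu(W(\tilde{\sigma}), g) = \mu(e_w W(\tilde{\sigma}) e_w, g) = \mu(W(\tilde{\sigma}), e_w g e_w)$$
for all $W(\tilde{\sigma})$ and $g \in L$. An additive measure $\Sigma \xrightarrow{\tilde{F}} L$ (which is not necessarily
effective) is defined by $\tilde{F}(\sigma) = e_w F(\sigma) e_w$. Eq. (5.7) is satisfied for $\tilde{F}(\sigma)$ as well as
for $F(\sigma)$:
$$\mu(W(\tilde{\sigma}), 1 - \tilde{F}(\sigma)) = \mu(W(\tilde{\sigma}), e_w - \tilde{F}(\sigma)) = 0 \quad\text{and}\quad \mu(W(\varepsilon + \tilde{\sigma}), \tilde{F}(\sigma)) = 0. \tag{5.8}$$
Let $\lambda = \mu(W(\tilde{\sigma}), 1)$; then from (5.8) it follows that $\lambda^{-1}W(\tilde{\sigma}) \in K_1(\tilde{F}(\sigma))$ and
$(1 - \lambda)^{-1}W(\varepsilon + \tilde{\sigma}) \in K_0(\tilde{F}(\sigma))$. For $e(\sigma)$ and $e_0(\sigma) \in G$ with $K_1(\tilde{F}(\sigma)) = K_1(e(\sigma))$,
$K_0(\tilde{F}(\sigma)) = K_0(e_0(\sigma))$ it follows that
$$e(\sigma) \le \tilde{F}(\sigma) \le e_0(\sigma). \tag{5.9}$$
Since $w = W(\tilde{\sigma}) + W(\varepsilon + \tilde{\sigma})$ it follows that
$$C(w) = C(\lambda^{-1}W(\tilde{\sigma})) \vee C((1 - \lambda)^{-1}W(\varepsilon + \tilde{\sigma})),$$
that is,
$$K_1(e_w) \subset K_1(e(\sigma)) \vee K_0(e_0(\sigma)),$$
from which it follows that
$$K_1(e_w) \subset K_1(e(\sigma)) \vee K_1(e_0(\sigma)^{\perp}) = K_1(e(\sigma) \vee e_0(\sigma)^{\perp}).$$
Thus it follows that
$$e_w \le e(\sigma) \vee e_0(\sigma)^{\perp}.$$
Since, according to (5.9), $e(\sigma) \le e_0(\sigma)$ we obtain
$$e(\sigma) \vee e_0(\sigma)^{\perp} = e(\sigma) + e_0(\sigma)^{\perp} = e(\sigma) + 1 - e_0(\sigma) = (e_0(\sigma) - e(\sigma))^{\perp}.$$
Therefore $e_w \perp e_0(\sigma) - e(\sigma)$.
Thus, according to (5.9), we have $\tilde{F}(\sigma) = e(\sigma) + (\tilde{F}(\sigma) - e(\sigma))$, where
$0 \le \tilde{F}(\sigma) - e(\sigma) \le e_0(\sigma) - e(\sigma)$. According to the definition of $\tilde{F}$ we obtain
$e(\sigma) \le e_w$. Therefore we obtain $\tilde{F}(\sigma) - e(\sigma) = e_w(\tilde{F}(\sigma) - e(\sigma))e_w \le
e_w(e_0(\sigma) - e(\sigma))e_w = 0$ and hence we find that $\tilde{F}(\sigma) = e(\sigma)$. Therefore the
measure $\tilde{F}(\sigma)$ is a projection-valued measure $\Sigma \xrightarrow{\tilde{F}} G$ with $\tilde{F}(\varepsilon) = e_w$.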
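From $e_w F(\sigma) e_w = e(\sigma) \in G$, it follows that, for a $\varphi \in e(\sigma)\mathcal{H}_i$ such that
$\|\varphi\| = 1$:
$$F(\sigma)\varphi = (1 - e_w)F(\sigma)\varphi + e_w F(\sigma)\varphi = (1 - e_w)F(\sigma)\varphi + e_w F(\sigma) e_w\varphi = (1 - e_w)F(\sigma)\varphi + e(\sigma)\varphi = (1 - e_w)F(\sigma)\varphi + \varphi.$$
Since $(1 - e_w)F(\sigma)\varphi \perp \varphi = e_w\varphi$ it follows that
$$\|F(\sigma)\varphi\|^2 = \|(1 - e_w)F(\sigma)\varphi\|^2 + 1.$$
Since $0 \le F(\sigma) \le 1$ we obtain $\|F(\sigma)\| \le 1$ and therefore $\|F(\sigma)\varphi\|^2 \le 1$.
Therefore we find that $(1 - e_w)F(\sigma)\varphi = 0$ for all $\varphi \in e(\sigma)\mathcal{H}_i$, and therefore
$F(\sigma)\varphi = \varphi$, that is,
$$F(\sigma)e(\sigma) = e(\sigma). \tag{5.10}$$
For $\varphi \in (e_w - e(\sigma))(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ we find that
$$\langle\varphi, F(\sigma)\varphi\rangle = \langle e_w\varphi, F(\sigma)e_w\varphi\rangle = \langle\varphi, e_w F(\sigma) e_w\varphi\rangle = \langle\varphi, e(\sigma)\varphi\rangle = 0$$
and (since $F(\sigma) \ge 0$) $F(\sigma)\varphi = 0$, that is,
$$F(\sigma)(e_w - e(\sigma)) = 0. \tag{5.11}$$
From (5.10) and (5.11) it follows that
$$F(\sigma)e_w = e(\sigma)$$
and therefore
$$(1 - e_w)F(\sigma)e_w = 0, \qquad e_w F(\sigma) = e(\sigma) \qquad\text{and}\qquad F(\sigma) = e(\sigma) + (1 - e_w)F(\sigma)(1 - e_w). \tag{5.12}$$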
The measure $F(\sigma)$ may therefore be expressed as a sum of two terms, one
equal to $e(\sigma)$ and the other orthogonal to the support of $w$. In order
that a preparator of $w$ be dispersion-free with respect to the observable
$\Sigma \xrightarrow{F} L$, $F(\sigma)$ must therefore be, relative to $w$, a measure of decision effects
$\Sigma \xrightarrow{\tilde{F}} G$. For every such $\Sigma \xrightarrow{\tilde{F}} G$, does there exist a preparator of $w$ which is
dispersion-free with respect to $\Sigma \xrightarrow{\tilde{F}} G$?
For $\tilde{F}(\sigma) = e(\sigma)$ equation (5.8) is equivalent to
$$w^{1/2}\eta(\tilde{\sigma})w^{1/2}(e_w - e(\sigma)) = 0$$
and
$$w^{1/2}(1 - \eta(\tilde{\sigma}))w^{1/2}e(\sigma) = 0,$$
that is, equivalent to
$$w e(\sigma) = w^{1/2}\eta(\tilde{\sigma})w^{1/2}e(\sigma) = w^{1/2}\eta(\tilde{\sigma})w^{1/2} \tag{5.13}$$
from which we obtain the adjoint equation:
$$e(\sigma)w = w^{1/2}\eta(\tilde{\sigma})w^{1/2} = w e(\sigma).$$
Therefore we find that $e(\sigma)$ commutes with $w$ and with $w^{1/2}$, so that, from
(5.13), it follows that
$$w^{1/2}\eta(\tilde{\sigma})w^{1/2} = w^{1/2}e(\sigma)w^{1/2}. \tag{5.14}$$
Since $e(\sigma) \le e_w$, from Th. 5.3 it follows that $e(\sigma) = \eta(\tilde{\sigma})$, that is, $\eta\tilde{\Sigma} \supset \tilde{F}\Sigma$.
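Instead of $\Sigma \xrightarrow{\tilde{F}} G$ we may consider the Boolean subring $\tilde{F}\Sigma$ of $G$ and
determine $\tilde{F}\Sigma \xrightarrow{W} \check{K}$ by
$$W(\tilde{F}(\sigma)) = w^{1/2}\tilde{F}(\sigma)w^{1/2}. \tag{5.15}$$
For a preparator of $w$, instead of $\tilde{\Sigma} \xrightarrow{W} \check{K}$ we may use $(\tilde{F}\Sigma) \xrightarrow{W} \check{K}$ where $W$ is
defined by (5.15). A preparator can be dispersion-free with respect to $\Sigma \xrightarrow{F} L$
only if (5.15) results in dispersion-free ensembles, that is, if the $\tilde{F}(\sigma)$ commute
with $w$ for all $\sigma$. Then we would obtain
$$\mu(w^{1/2}e(\sigma)w^{1/2}, 1 - e(\sigma)) = 0$$
and
$$\mu(w^{1/2}(1 - e(\sigma))w^{1/2}, e(\sigma)) = 0.$$
Therefore, if and only if $w$ commutes with the decision observable
determined by $\Sigma \xrightarrow{\tilde{F}} [0, e_w]$, there exists exactly one preparator of $w$ which is
dispersion-free with respect to $\Sigma \xrightarrow{\tilde{F}} [0, e_w]$.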
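The role of commutativity can be seen in a two-dimensional example. The sketch below is illustrative only: the matrices `w_commuting` and `w_generic` and the helper names `psd_sqrt` and `dispersion` are assumptions introduced here, and $w$ has full support so that $e_w = 1$. It evaluates the two quantities $\mu(w^{1/2}e\,w^{1/2}, 1 - e)$ and $\mu(w^{1/2}(1 - e)w^{1/2}, e)$ for the decision preparator (5.15).

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite Hermitian matrix via its eigen-decomposition."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def dispersion(w, e):
    """Return (mu(W(e), 1 - e), mu(W(1 - e), e)) for W(e) = w^(1/2) e w^(1/2)."""
    s = psd_sqrt(w)
    one = np.eye(w.shape[0])
    first = np.trace(s @ e @ s @ (one - e)).real
    second = np.trace(s @ (one - e) @ s @ e).real
    return first, second

# Assumed 2x2 example: e projects onto the first basis vector
e = np.diag([1.0, 0.0])
w_commuting = np.diag([0.7, 0.3])                   # commutes with e
w_generic = np.array([[0.7, 0.2], [0.2, 0.3]])      # does not commute with e
```

Both quantities vanish for `w_commuting` and are strictly positive for `w_generic`, in line with the commutation condition derived above.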
On the other hand, to each decision observable $\Sigma \xrightarrow{\tilde{F}} [0, e_w]$ there exists a
$w \in K$ (with support $e_w$) which commutes with the observable, provided that
the decision observable has only a discrete spectrum. According to D 2.5.4
and Th. 1.4.8 this result is equivalent to the condition that the Boolean
subring $\tilde{F}\Sigma$ of $G$ is atomic. This result follows from the fact that $w$ has only
discrete eigenvalues and that each eigenspace has finite dimension.
Therefore, there are no preparators which are dispersion-free with respect
to an observable which has a continuous spectrum. This fact plays an
important role in the investigation of the so-called “measurements of the first
kind.” This is a typical aspect of “quantum structures,” that is, of the
structure of microsystems since, in classical theories, to any decision
observable there exist dispersion-free preparators. Thus it is understandable that
some physicists have not only sought to weaken the distinction between
observables and preparators, but have also sought (by means of “tricks”) the
“elimination” of the typical “quantum mechanical structures.” For the case of
microsystems the preparation and registration processes are no longer
“parallel” as is the case for macrosystems. Therefore the implications for the
so-called measurements of the first kind appear to be highly idealized (see
XVII, §5). However, on the basis of II we shall find that it is not necessary to
require the existence of measurements of the first kind as a basis for quantum
mechanics.
6 Complementary Decompositions of Ensembles
In complete analogy to the definition of the notion of coexistent observables
we introduce the following definition:
D 6.1. A preparator $\Sigma \xrightarrow{W} \check{K}$ of the ensemble $w$ is said to be more extensive
than another preparator $\Sigma_1 \xrightarrow{W_1} \check{K}$ for the same ensemble if there exists a
homomorphism $h$ for which the following diagram commutes:
$$\Sigma_1 \xrightarrow{h} \Sigma \xrightarrow{W} \check{K}, \qquad W_1 = W \circ h.$$
D 6.2. Two preparators $\Sigma_1 \xrightarrow{W_1} \check{K}$, $\Sigma_2 \xrightarrow{W_2} \check{K}$ for the same ensemble $w$ are
said to be coexistent if there exists a preparator $\Sigma \xrightarrow{W} \check{K}$ for the same ensemble
which is more extensive than the preparators $\Sigma_1 \xrightarrow{W_1} \check{K}$ and $\Sigma_2 \xrightarrow{W_2} \check{K}$.
Therefore we find that, for two coexistent preparators, there exists a
preparator $\Sigma \xrightarrow{W} \check{K}$ and two homomorphisms $h_1$, $h_2$ such that the following
diagram commutes:
$$\Sigma_1 \xrightarrow{h_1} \Sigma \xleftarrow{h_2} \Sigma_2, \qquad W_1 = W \circ h_1, \quad W_2 = W \circ h_2. \tag{6.1}$$
We shall now provide a physical interpretation of this result: $\Sigma \xrightarrow{W} \check{K}$
represents (in idealized form) a preparation procedure $a$ for which $\Sigma_1$, $\Sigma_2$ may
be considered to be Boolean subrings of $\mathcal{Q}(a)$, that is, to be parts of a single
preparation procedure and its possible decompositions.
For the observables which (formally) correspond to the preparators there
exists a diagram analogous to (6.1) as follows:
$$\Sigma_1 \xrightarrow{h_1} \Sigma \xleftarrow{h_2} \Sigma_2, \qquad \eta_1 = \eta \circ h_1, \quad \eta_2 = \eta \circ h_2, \tag{6.2}$$
where $\eta$, $\eta_1$, $\eta_2$ map into $L$ and $\eta(\varepsilon) = \eta_1(\varepsilon) = \eta_2(\varepsilon)$ is the support of $w$.
In the sense of diagram (6.2) the corresponding observables are, by D 3.1,
coexistent observables.
The situation for two preparators is substantially different from the case
of observables with respect to the problem of “perfect” noncoexistence. Such
“perfectly” noncoexistent preparators play an important role in those problems
of quantum mechanics which are associated with epistemology
(theory of cognition).
We shall now characterize (in idealized form) the following situation: For
$w \in K$ suppose that we are given two decompositions of $w$:
$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde{w}_k \tag{6.3}$$
where $w_i, \tilde{w}_k \in \check{K}$. Suppose that there exists a pair of preparation procedures
$a$ and $\tilde{a}$ for which $\varphi(a) = \varphi(\tilde{a}) = w$, and that $a$, $\tilde{a}$ may be decomposed as
follows: $a = \bigcup_{i=1}^{n} a_i$, $\tilde{a} = \bigcup_{k=1}^{m} \tilde{a}_k$, where $\varphi_a(a_i) = w_i$, $\varphi_{\tilde{a}}(\tilde{a}_k) = \tilde{w}_k$. If
$a \cap \tilde{a} = \emptyset$ then the preparation procedures $a$ and $\tilde{a}$ have nothing
in common: it is not possible to prepare a single microsystem which could
be considered to be prepared according to both $a$ and $\tilde{a}$. The preparation
procedures $a$, $\tilde{a}$ are mutually exclusive, even though $a$ and $\tilde{a}$ result in the
preparation of the same (!) ensemble $w = \varphi(a) = \varphi(\tilde{a})$.
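Suppose that $a \cap \tilde{a} \ne \emptyset$; then there exists a common Boolean subring
$\mathcal{Q}(a \cap \tilde{a})$ of $\mathcal{Q}(a)$ and $\mathcal{Q}(\tilde{a})$. We shall now try to express, in a mathematically
idealized form, the fact that two preparators do not have such a “common
part.”
From the preparator $\Sigma \xrightarrow{W} \check{K}$ of an ensemble $w$ we may easily obtain new
preparators as follows: Let $[0, \eta]$ be an interval from $\Sigma$. For $\lambda = \mu(W(\eta), 1)$
we find that $[0, \eta] \xrightarrow{\lambda^{-1}W} \check{K}$ is a preparator of the ensemble $w_0 = \lambda^{-1}W(\eta)$.
We shall call the preparator obtained in this way the preparator canonically
determined by $[0, \eta]$.
D 6.3. Let $\Sigma_1 \xrightarrow{W_1} \check{K}$ and $\Sigma_2 \xrightarrow{W_2} \check{K}$ be two preparators for the same
ensemble $w$. These preparators are said to be disjoint if, for any pair of
intervals $[0, \eta_1] \subset \Sigma_1$ and $[0, \eta_2] \subset \Sigma_2$, the following condition cannot be
satisfied: Let $\lambda_1 = \mu(W_1(\eta_1), 1)$, $\lambda_2 = \mu(W_2(\eta_2), 1)$; then $\lambda_1^{-1}W_1(\eta_1) =
\lambda_2^{-1}W_2(\eta_2)$ and the preparators canonically defined by $[0, \eta_1]$ and $[0, \eta_2]$
are coexistent.
Let $\Sigma_1$, $\Sigma_2$ denote the closures of $\mathcal{Q}(a)$ and $\mathcal{Q}(\tilde{a})$, respectively. Suppose that
$\varphi(a) = \varphi(\tilde{a}) = w$, $W_1 = \varphi_a$, $W_2 = \varphi_{\tilde{a}}$. Then if $\mathcal{Q}(a) \xrightarrow{\varphi_a} \check{K}$ and $\mathcal{Q}(\tilde{a}) \xrightarrow{\varphi_{\tilde{a}}} \check{K}$
(more precisely, their completions) are disjoint, we find that $a \cap \tilde{a} = \emptyset$,
because otherwise the intervals $[0, a \cap \tilde{a}] \subset \mathcal{Q}(a)$ and $[0, a \cap \tilde{a}] \subset \mathcal{Q}(\tilde{a})$ would canonically
determine the same preparator, and would therefore be (trivially) coexistent.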
D 6.4. Let $\Sigma_1 \xrightarrow{W_1} \check{K}$ and $\Sigma_2 \xrightarrow{W_2} \check{K}$ be a pair of preparators. We say that
they are complementary if any preparator which is more extensive than
$\Sigma_1 \xrightarrow{W_1} \check{K}$ is disjoint from any preparator which is more extensive than
$\Sigma_2 \xrightarrow{W_2} \check{K}$.
D 6.5. Two decompositions of an ensemble
$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde{w}_k$$
are said to be complementary if each pair of preparators $\Sigma_1 \xrightarrow{W_1} \check{K}$, $\Sigma_2 \xrightarrow{W_2} \check{K}$
for which $w_i \in W_1\Sigma_1$, $\tilde{w}_k \in W_2\Sigma_2$ is complementary.
How may we determine whether two decompositions are complementary?
Th. 6.1. Two decompositions of an ensemble
$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde{w}_k$$
are complementary only if, for each pair $w_i$, $\tilde{w}_k$ the following condition is
satisfied:
$$w_0 \in \check{K},\quad w_0 \le w_i,\quad w_0 \le \tilde{w}_k \;\Rightarrow\; w_0 = 0.$$
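PROOF. We shall assume that the decompositions are not complementary. Then
there exists a pair of preparators $\Sigma_1 \xrightarrow{W_1} \check{K}$, $\Sigma_2 \xrightarrow{W_2} \check{K}$ for which $w_i \in W_1\Sigma_1$,
$\tilde{w}_k \in W_2\Sigma_2$ and a pair of coexistent intervals $[0, \eta_1] \subset \Sigma_1$, $[0, \eta_2] \subset \Sigma_2$. For $\sigma_i \in \Sigma_1$
satisfying $W_1(\sigma_i) = w_i$ and for $\tilde{\sigma}_k \in \Sigma_2$ satisfying $W_2(\tilde{\sigma}_k) = \tilde{w}_k$ we obtain the following
diagram:
$$[0, \eta_1] \xrightarrow{h_1} \Sigma \xleftarrow{h_2} [0, \eta_2], \qquad \lambda_1^{-1}W_1 = W \circ h_1, \quad \lambda_2^{-1}W_2 = W \circ h_2.$$
Let $\tau_i = h_1(\sigma_i \wedge \eta_1)$ and $\rho_k = h_2(\tilde{\sigma}_k \wedge \eta_2)$. From this diagram it follows that
$$W h_1(\sigma_i \wedge \eta_1) = \lambda_1^{-1}W_1(\sigma_i \wedge \eta_1) = W(\tau_i),$$
$$W h_2(\tilde{\sigma}_k \wedge \eta_2) = \lambda_2^{-1}W_2(\tilde{\sigma}_k \wedge \eta_2) = W(\rho_k).$$
From $\bigvee_i \sigma_i = \varepsilon_1$ it follows that $\bigvee_i (\sigma_i \wedge \eta_1) = \eta_1$ and we obtain $\bigvee_i h_1(\sigma_i \wedge \eta_1) =
\bigvee_i \tau_i = h_1(\eta_1) = \varepsilon$. Similarly we obtain $\bigvee_k \rho_k = \varepsilon$ and find that $\bigvee_{i,k} (\tau_i \wedge \rho_k) = \varepsilon$.
Thus we obtain
$$W(\varepsilon) = \lambda_1^{-1}W_1(\eta_1) = \lambda_2^{-1}W_2(\eta_2) = \sum_{i,k} W(\tau_i \wedge \rho_k).$$
Therefore we cannot have all $W(\tau_i \wedge \rho_k) = 0$. Suppose that $W(\tau_1 \wedge \rho_1) \ne 0$. Then
we obtain $w_1 = W_1(\sigma_1) \ge W_1(\sigma_1 \wedge \eta_1) = \lambda_1 W(\tau_1) \ge \lambda_1 W(\tau_1 \wedge \rho_1)$ and similarly
$\tilde{w}_1 \ge \lambda_2 W(\tau_1 \wedge \rho_1)$. Therefore there exists a $w_0 \in \check{K}$, $w_0 \ne 0$ for which $w_0 \le w_1$,
$w_0 \le \tilde{w}_1$.
In order to prove the converse, we assume that there exists a $w_0 \in \check{K}$ for which
$w_0 \ne 0$, $w_0 \le w_1$, $w_0 \le \tilde{w}_1$. We then introduce the following Boolean rings: Let $\Sigma_1$
denote the set of all subsets of a set of $(n + 1)$ elements which we shall denote by
$0, 1, \ldots, n$. The one-element subsets are the atoms of $\Sigma_1$. For these atoms
$\alpha_0, \alpha_1, \ldots, \alpha_n$ we define $\Sigma_1 \xrightarrow{W_1} \check{K}$ as follows:
$$W_1(\alpha_0) = w_0, \quad W_1(\alpha_1) = w_1 - w_0, \quad W_1(\alpha_2) = w_2, \quad \ldots, \quad W_1(\alpha_n) = w_n.$$
In the same way we define $\Sigma_2 \xrightarrow{W_2} \check{K}$ with atoms $\beta_0, \beta_1, \ldots, \beta_m$:
$$W_2(\beta_0) = w_0, \quad W_2(\beta_1) = \tilde{w}_1 - w_0, \quad W_2(\beta_2) = \tilde{w}_2, \quad \ldots, \quad W_2(\beta_m) = \tilde{w}_m.$$
The interval $[0, \beta_0]$ is then isomorphic to the interval $[0, \alpha_0]$. Let $\mathbf{2}$ denote the
Boolean ring consisting only of two elements, the zero and unit elements.
We then obtain the following diagram:
$$[0, \alpha_0] \longrightarrow \mathbf{2} \longleftarrow [0, \beta_0], \qquad \mathbf{2} \xrightarrow{W} \check{K},$$
where $W(\varepsilon) = \lambda_{12}^{-1} w_0$ and $\lambda_{12} = \mu(w_0, 1)$.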
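Let $e_1$, $e_2$ denote the supports of $w_1 \in \check{K}$ and $w_2 \in \check{K}$, respectively. If
$e_1 \wedge e_2 = 0$ then there does not exist a $w_0 \in \check{K}$, $w_0 \ne 0$, such that $w_0 \le w_1$ and
$w_0 \le w_2$, because, since $w_0 \le w_1$ and $w_0 \le w_2$, the relations $e_0 \le e_1$ and
$e_0 \le e_2$ must hold for the support $e_0$ of $w_0$. The converse is not true in
general: Even if $e_1 \wedge e_2 \ne 0$ it is possible that $w_0 \le w_1$, $w_0 \le w_2$, $w_0 \in \check{K}$
force the condition $w_0 = 0$. We shall now give a simple example: For a
Hilbert space $\mathcal{H}$ and a complete orthonormal basis $\varphi_\nu$ in $\mathcal{H}$ let
$$w_1 = 3\sum_{\nu=1}^{\infty} \frac{1}{4^\nu} P_{\varphi_\nu}, \qquad w_2 = P_\varphi, \quad\text{where}\quad \varphi = \sum_{\nu=1}^{\infty} \frac{1}{2^{\nu/2}}\varphi_\nu.$$
Then the supports of $w_1$ and $w_2$ are $1$ and $P_\varphi$, respectively, and we find that
$1 \wedge P_\varphi = P_\varphi \ne 0$. A $w_0 \in \check{K}$ for which $w_0 \le w_1$ and $w_0 \le w_2$ must therefore
have $P_\varphi$ as support, that is, must have the form $\lambda P_\varphi$ for $\lambda \ge 0$. For $\lambda P_\varphi \le w_1$ it
follows that, for all $\psi \in \mathcal{H}$
$$\lambda\langle\psi, P_\varphi\psi\rangle = \lambda|\langle\varphi, \psi\rangle|^2 \le \langle\psi, w_1\psi\rangle.$$
In particular, for all $\varphi_\mu$ we must have
$$\lambda|\langle\varphi, \varphi_\mu\rangle|^2 = \lambda\frac{1}{2^\mu} \le \langle\varphi_\mu, w_1\varphi_\mu\rangle = \frac{3}{4^\mu},$$
that is, $\lambda \le 3/2^\mu$ for all $\mu$, that is, $\lambda = 0$.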
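The example can be explored numerically by truncating the basis to the first $n$ vectors. The sketch below assumes this truncation (the truncated $\varphi$ is not renormalized) and uses the standard fact that, for invertible positive $w_1$, the inequality $\lambda|\varphi\rangle\langle\varphi| \le w_1$ holds exactly when $\lambda\langle\varphi, w_1^{-1}\varphi\rangle \le 1$; the names `max_lambda` and `is_below` are hypothetical. In the truncated model the largest admissible $\lambda$ is $3/(2^{n+1} - 2)$, which tends to zero as $n$ grows, consistent with $\lambda = 0$ in the infinite-dimensional case.

```python
import numpy as np

def max_lambda(n):
    """Largest lambda with lambda * |phi><phi| <= w1 in the n-dimensional truncation."""
    nu = np.arange(1, n + 1)
    w1 = 3.0 / 4.0 ** nu                  # eigenvalues of w1 in the basis phi_nu
    phi = 1.0 / 2.0 ** (nu / 2.0)         # components of phi (not renormalized)
    # lambda * |phi><phi| <= w1  iff  lambda * <phi, w1^(-1) phi> <= 1
    return 1.0 / float(np.sum(phi ** 2 / w1))

def is_below(lam, n, tol=1e-12):
    """Check lambda * |phi><phi| <= w1 directly via the smallest eigenvalue."""
    nu = np.arange(1, n + 1)
    w1 = np.diag(3.0 / 4.0 ** nu)
    phi = 1.0 / 2.0 ** (nu / 2.0)
    return bool(np.linalg.eigvalsh(w1 - lam * np.outer(phi, phi)).min() >= -tol)
```

Calling `max_lambda(n)` for increasing `n` shows the geometric decay of the admissible weight.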
Each $w \in K$ may be expressed as a mixture of the extreme points of $K$, a
result which follows from the spectral decomposition theorem for operators
of trace class (see AIV, §9 and §11). Thus it follows that
$$w = \sum_\nu w_\nu P_{\varphi_\nu}, \qquad w_\nu > 0, \tag{6.4}$$
where the $\varphi_\nu$ ($\nu = 1, \ldots$) are pairwise orthonormal vectors from $\mathcal{H} = \bigcup_i \mathcal{H}_i$.
We will now show that any $w \in K$ for which the spectral representation
(6.4) contains a sum of two $P_{\varphi_\nu}$ whose $\varphi_\nu$ are from the same $\mathcal{H}_i$ has
complementary decompositions. It suffices to show this for one Hilbert space
only, that is, for $\mathcal{H} = \mathcal{H}_1$. We shall show that for $\mathcal{H} = \mathcal{H}_1$ a $w \in K(\mathcal{H})$ has
complementary decompositions if $w$ is not an extreme point of $K(\mathcal{H})$.
According to our previous discussion, two decompositions
$$w = \sum_\nu w_\nu P_{\varphi_\nu} = \sum_\mu \tilde{w}_\mu P_{\chi_\mu} \tag{6.5}$$
are complementary if each $P_{\chi_\mu}$ is different from each $P_{\varphi_\nu}$. Then $P_{\varphi_\nu} \wedge P_{\chi_\mu} = 0$
for all pairs $\nu$, $\mu$. We need only show that each $w$ which is not an extreme
point (that is, $w \ne P_\varphi$) has two decompositions (6.5).
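We shall first prove this result for the case in which the support of $w$ is
two-dimensional.
Therefore, let $w = \lambda P_\varphi + (1 - \lambda)P_\psi$, where $\varphi \perp \psi$ and $0 < \lambda < 1$. Let $\chi$ be
another vector in the plane spanned by $\{\varphi, \psi\}$. We claim there exists another
decomposition of the form:
$$w = \mu P_\chi + (1 - \mu)P_\eta$$
for some $0 < \mu < 1$ and $\eta \in \mathcal{H}$.
For $\chi = \varphi a + \psi b$, $|a|^2 + |b|^2 = 1$ we need only set
$$\mu = \frac{\lambda(1 - \lambda)}{\lambda|b|^2 + (1 - \lambda)|a|^2}, \qquad \eta = \frac{\varphi\lambda\bar{b} - \psi(1 - \lambda)\bar{a}}{[\lambda^2|b|^2 + (1 - \lambda)^2|a|^2]^{1/2}}.$$
Let $a \ne 0$, $b \ne 0$; then the decompositions
$$w = \lambda P_\varphi + (1 - \lambda)P_\psi = \mu P_\chi + (1 - \mu)P_\eta$$
are complementary.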
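The two-dimensional construction can be verified directly with $2 \times 2$ matrices. In this sketch $\varphi$ and $\psi$ are represented by the standard basis vectors, so that $w = \mathrm{diag}(\lambda, 1 - \lambda)$; the function names `second_decomposition` and `projector` are hypothetical, and the complex conjugates on $a$, $b$ in $\eta$ are the choice that makes the identity hold for complex coefficients.

```python
import numpy as np

def second_decomposition(lam, a, b):
    """Given w = lam*P_phi + (1-lam)*P_psi and chi = a*phi + b*psi, return (mu, chi, eta)."""
    chi = np.array([a, b], dtype=complex)
    mu = lam * (1 - lam) / (lam * abs(b) ** 2 + (1 - lam) * abs(a) ** 2)
    eta = np.array([lam * np.conj(b), -(1 - lam) * np.conj(a)], dtype=complex)
    eta = eta / np.sqrt(lam ** 2 * abs(b) ** 2 + (1 - lam) ** 2 * abs(a) ** 2)
    return mu, chi, eta

def projector(v):
    """Projection onto the one-dimensional subspace spanned by the unit vector v."""
    return np.outer(v, v.conj())
```

For example, `second_decomposition(0.3, 0.6, 0.8j)` yields a weight `mu` strictly between 0 and 1 and vectors `chi`, `eta` that both differ from the basis vectors, so the resulting decomposition shares no component with the spectral one.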
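If the support of $w$ is three-dimensional, that is, $w = \lambda_1 P_{\varphi_1} + \lambda_2 P_{\varphi_2} + \lambda_3 P_{\varphi_3}$
where $\langle\varphi_i, \varphi_k\rangle = 0$ for $i \ne k$, we may write
$$w = (\lambda_1 + \rho)\left(\frac{\lambda_1}{\lambda_1 + \rho}P_{\varphi_1} + \frac{\rho}{\lambda_1 + \rho}P_{\varphi_2}\right) + (\lambda_3 + \lambda_2 - \rho)\left(\frac{\lambda_2 - \rho}{\lambda_3 + \lambda_2 - \rho}P_{\varphi_2} + \frac{\lambda_3}{\lambda_3 + \lambda_2 - \rho}P_{\varphi_3}\right)$$
where $0 < \rho < \lambda_2$. We may then apply the result proven above to each of the
ensembles
$$\frac{\lambda_1}{\lambda_1 + \rho}P_{\varphi_1} + \frac{\rho}{\lambda_1 + \rho}P_{\varphi_2}, \qquad \frac{\lambda_2 - \rho}{\lambda_3 + \lambda_2 - \rho}P_{\varphi_2} + \frac{\lambda_3}{\lambda_3 + \lambda_2 - \rho}P_{\varphi_3}.$$
We then obtain four vectors, $\chi_1$, $\eta_1$ in the plane spanned by $\{\varphi_1, \varphi_2\}$ and
$\chi_2$, $\eta_2$ in the plane spanned by $\{\varphi_2, \varphi_3\}$, and a decomposition of the form
$$w = \mu_1 P_{\chi_1} + \mu_1' P_{\eta_1} + \mu_2 P_{\chi_2} + \mu_2' P_{\eta_2}$$
which is complementary to the decomposition
$$w = \lambda_1 P_{\varphi_1} + \lambda_2 P_{\varphi_2} + \lambda_3 P_{\varphi_3}.$$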
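The three-dimensional step can likewise be checked with matrices. The sketch below assumes $w = \mathrm{diag}(\lambda_1, \lambda_2, \lambda_3)$ in the basis $\varphi_1, \varphi_2, \varphi_3$, a splitting parameter `rho` with $0 < \rho < \lambda_2$, and the same coefficients $a = 0.6$, $b = 0.8$ in both planes; all function names are hypothetical. It applies the two-dimensional construction to each of the two pair ensembles and returns the four weighted vectors.

```python
import numpy as np

def split_pair(lam, a, b):
    """Two-dimensional step: (mu, chi, eta) for the ensemble lam*P_1 + (1-lam)*P_2."""
    mu = lam * (1 - lam) / (lam * abs(b) ** 2 + (1 - lam) * abs(a) ** 2)
    chi = np.array([a, b], dtype=complex)
    eta = np.array([lam * np.conj(b), -(1 - lam) * np.conj(a)], dtype=complex)
    eta = eta / np.linalg.norm(eta)
    return mu, chi, eta

def embed(v, i, j, dim=3):
    """Place a two-component vector into coordinates i, j of a dim-dimensional space."""
    out = np.zeros(dim, dtype=complex)
    out[i], out[j] = v[0], v[1]
    return out

def decompose_3d(l1, l2, l3, rho, a=0.6, b=0.8):
    """Return four (weight, unit vector) pairs whose weighted projectors sum to diag(l1, l2, l3)."""
    terms = []
    # (total weight of the pair ensemble, weight of its first vector, coordinates)
    blocks = [(l1 + rho, l1 / (l1 + rho), 0, 1),
              (l3 + l2 - rho, (l2 - rho) / (l3 + l2 - rho), 1, 2)]
    for total, lam, i, j in blocks:
        mu, chi, eta = split_pair(lam, a, b)
        terms.append((total * mu, embed(chi, i, j)))
        terms.append((total * (1 - mu), embed(eta, i, j)))
    return terms
```

Each returned vector has two nonvanishing components, so none of the four projectors coincides with $P_{\varphi_1}$, $P_{\varphi_2}$, or $P_{\varphi_3}$.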
Finally, if $w = \sum_\nu \lambda_\nu P_{\varphi_\nu}$, where $\langle\varphi_\nu, \varphi_\mu\rangle = 0$ for $\nu \ne \mu$, we may apply the
above proof to the parts $\lambda_1 P_{\varphi_1} + \lambda_2 P_{\varphi_2}$, $\lambda_3 P_{\varphi_3} + \lambda_4 P_{\varphi_4}, \ldots$; if we obtain an
odd number of terms in the sum, then we may apply the result obtained for
the three-dimensional case to the final three terms.
Thus we find that complementary decompositions are not unusual.
The structure of complementary decompositions has led to a continuing
discussion about the relationship between epistemology (theory of cognition)
and quantum theory. In particular, many are inclined to reject quantum theory.
It is easy to see that the arguments used to reject quantum theory depend on
an impermissible identification of the preparation procedure $a \in \mathcal{Q}'$ with the
ensemble $\varphi(a) \in K$. Such an automatic identification appears to be natural in
the usual formulation of quantum mechanics because the usual formulation
takes into account only the set $K$ but not the sets $M$ or $\mathcal{Q}$, nor the canonical
mapping $\varphi$ of $\mathcal{Q}'$ into $K$.
7 Realizations of Decompositions
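If, for a preparator $\Sigma \xrightarrow{W} \check{K}$ there exists a preparation procedure $a \in \mathcal{Q}'$ and a
homomorphism $h$ of $\Sigma$ into $\mathcal{Q}(a)$ such that the following diagram is
commutative:
$$\Sigma \xrightarrow{h} \mathcal{Q}(a) \xrightarrow{\varphi_a} \check{K}, \qquad W = \varphi_a \circ h,$$
then we may identify $\Sigma$ with the Boolean subring $h\Sigma$ of $\mathcal{Q}(a)$. If this is the case
we may call the preparation procedure $a$ (together with $\mathcal{Q}(a)$ and $\varphi_a$) a
realization of the preparator $\Sigma \xrightarrow{W} \check{K}$. The requirement that to each preparator
there exists a realization in this sense is too strong (see also §4). In
complete analogy to §4 we now impose the following requirement:
APr. To each preparator $\Sigma \xrightarrow{W} \check{K}$ and each finite Boolean subring $\tilde{\Sigma}$ of $\Sigma$
and each $\sigma(\mathcal{D})$ neighborhood $U$ of 0 there exists a $\mathcal{Q}(a) \xrightarrow{\varphi_a} \check{K}$ and a
homomorphism $\tilde{\Sigma} \xrightarrow{h} \mathcal{Q}(a)$ such that $\varphi_a h(\sigma) - W(\sigma) \in U$ for all $\sigma \in \tilde{\Sigma}$. (For the
space $\mathcal{D}$ see III, §3.)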
APr means that we may realize (in physical approximation) each
preparator and, therefore, each decomposition of a $w \in K$. In other words, it
is “physically possible” to (approximately) realize each preparator. We note
that APr does not, however, provide us with any information about how
we may, in an actual situation, build the apparatus. Such information cannot
be obtained from the theory presented here because it does not contain a
mathematical description of the structure of the apparatus. In XVIII we shall
take a number of small steps in the direction of the investigation of the
construction problem, in which we shall consider “transpreparation”
processes.
The problems concerning the maps $\psi$ which were described in §4 are
practically the same for the map $\varphi: \mathcal{Q}' \to K$. If the construction of a
preparation apparatus is known to us, we still do not have any theoretical
tools by which we may compute the elements $\varphi(a) \in K$ corresponding to the
preparation procedures $a$. By analogy to the case of registration procedures
(see §4) we may guess, on the basis of a known classical theory, together with
the use of a correspondence principle, the ensemble $w = \varphi(a)$ and add this
result formally as an axiom. At first we have no other choice, even though
this procedure may not be very satisfactory (see, for example, XI-XVI).
The development of experimental methods for preparation and registration
has, up to now, taken place in a manner similar to that described
above, using classical theories, statistical theories, and quantum mechanics.
This first step allows us to “estimate” the $\varphi(a)$ and $\psi(b_0, b)$. Then, by varying
the experiment, that is, by using different combinations of preparation and
registration procedures, we seek to improve the values of $\varphi(a)$ and $\psi(b_0, b)$.
Therefore, in physics it is not the case that there are no adequate means of
determining the functional relationships between the preparation and
registration apparatuses and their corresponding $\varphi(a)$ and $\psi(b_0, b)$. The only
failure is that there is no comprehensive and systematic theory of the
macroscopic preparation and measurement apparatuses.
In XVII we shall show that there is at least one realistic theoretical route
by which, using measurement collisions, measurement transformations,
transpreparations, and starting with poorly known values of $\varphi(a)$ before
collision and $\psi(b_0, b)$ after collision, we may obtain theoretically computed
and very precise values of $\varphi(\tilde{a})$ and $\psi(\tilde{b}_0, \tilde{b})$ for new preparation procedures $\tilde{a}$
or new effect processes $(\tilde{b}_0, \tilde{b})$, respectively. It may be that the method
described in XVII for the “improvement” is the only “practical” possibility
for experimentation, and that the “desired” comprehensive theory (see
[13], X) is more a requirement for epistemology (see XVIII).
8 Objective Properties and Pseudoproperties of Microsystems
In the literature of quantum mechanics the discussion about properties of
microsystems, about propositions about microsystems, and about problems
of logic in connection with such propositions about microsystems has taken
on vast proportions. In this book we cannot attempt to provide an overview
of all these topics. Here we shall only discuss the problem about the
properties of microsystems in terms of the formulation presented in III, §4.
8.1 Objective Properties of Microsystems and
Superselection Rules
We shall now seek to describe the structure $S_m$ of objective properties which
was defined in III, D 4.1.6.
From III (4.1.11) it follows that, for each $g \in L$ together with the mapping
$T_p'$ (which is dual to $T_p$) of $L$ into itself (from V, §4.1 we find that $T_p$ is an
operation; thus, for $T_p$ it follows that V, Th. 4.1.3 holds):
$$g = T_p' g + T_{M \setminus p}' g. \tag{8.1.1}$$
From III (4.1.13) it follows that
$$\chi(p) = T_p' 1 = T_p' g + T_p'(1 - g). \tag{8.1.2}$$
In addition we find that
$$T_p' g + T_{M \setminus p}' g + T_p'(1 - g) = T_{M \setminus p}' g + T_p' 1 \le T_{M \setminus p}' 1 + T_p' 1 = 1. \tag{8.1.3}$$
From (8.1.1), (8.1.2), (8.1.3), and Th. 1.2.4 it follows that $g$ and $\chi(p)$ are
coexistent. Therefore $\chi(p)$ is coexistent with each $g \in L$, and is therefore
coexistent with each $e \in G$. Thus, according to Th. 1.3.4, $\chi(p)$ commutes with
each $e \in G$. Therefore $\chi(p)$ has the following form:
$$T_p' 1 = \chi(p) = (\lambda_1 1, \lambda_2 1, \ldots). \tag{8.1.4}$$
Since $T_p' g + T_{M \setminus p}' g = g$, for each $g \in L$ we obtain $T_p' g \le g$ and find that
$$T_p'(0, 0, \ldots, g_i, \ldots) = (0, 0, \ldots, g_i', 0, \ldots),$$
where $g_i' \le g_i$. Since $T_p'$ is linear, from the previous result and (8.1.4) it
follows that
$$T_p'(0, 0, \ldots, 1, \ldots) = (0, 0, \ldots, \lambda_i 1, 0, \ldots). \tag{8.1.5}$$
From which we find that
$$T_p'^{\,2}(0, 0, \ldots, 1, \ldots) = (0, 0, \ldots, \lambda_i^2 1, \ldots) \quad\text{and}\quad T_p'^{\,2} 1 = (\lambda_1^2 1, \ldots, \lambda_i^2 1, \ldots). \tag{8.1.6}$$
From III (4.1.11) we obtain the following special case: $T_p^2 = T_{p \cap p} = T_p$;
from this result, and from (8.1.6) and (8.1.4) it follows that either $\lambda_i = 0$ or $\lambda_i = 1$.
Therefore $\chi(p)$ is an element of the center $Z$ (see Th. 1.3.8) of $G$. The map
$S_m \to L$ therefore maps into $Z$ as follows:
$$S_m \xrightarrow{\chi} Z. \tag{8.1.7}$$
$Z$ is a complete Boolean ring. $\chi$ must then be an isomorphism of $S_m$ onto a
Boolean subring of $Z$ since the following condition is satisfied: if $p \ne \emptyset$ then
we must have $\chi(p) \ne 0$, since if $\lambda_{\mathcal{Q}}(a, a \cap p) = \mu(\varphi(a), \chi(p)) = 0$ we would
obtain $a \cap p = \emptyset$ for all $a \in \mathcal{Q}'$ and therefore $p \cap \bigcup_{a \in \mathcal{Q}'} a = \emptyset$, and, from
APS 8.1 we would obtain $p = \emptyset$.
It is a certain hypothesis in the sense of [1], §10.1 that the map (8.1.7) is
surjective. Since we do not wish to discuss the hypotheses which are
described in [1], §10.1, we shall formulate, as an axiom, the condition that
(8.1.7) is surjective. Then it follows that $\chi$ is an isomorphism between $S_m$ and
$Z$.
According to III (4.1.8), for $p \in S_m$ we obtain
$$p = \bigcup_{a \in \mathcal{Q}',\; a \subset p} a.$$
From $a \subset p$ it follows that $\lambda_{\mathcal{Q}}(a, a \cap p) = \lambda_{\mathcal{Q}}(a, a) = 1$; conversely,
from $\lambda_{\mathcal{Q}}(a, a \cap p) = 1$, it follows that $\lambda_{\mathcal{Q}}(a, a \cap (M \setminus p)) = 0$ since
$\lambda_{\mathcal{Q}}(a, a \cap p) + \lambda_{\mathcal{Q}}(a, a \cap (M \setminus p)) = 1$. Thus it follows that $a \cap (M \setminus p) = \emptyset$
and therefore $a \subset p$, where the latter is equivalent to the condition that
$\mu(\varphi(a), \chi(p)) = 1$, that is, $\varphi(a) \in K_1(\chi(p))$. Therefore, for each $e \in Z$ an objective
property is defined by
$$e \to p(e) = \bigcup_{a \in \mathcal{Q}',\; \varphi(a) \in K_1(e)} a \tag{8.1.8}$$
where
$$\chi(p(e)) = e.$$
Thus we find that (8.1.8) is the inverse mapping of (8.1.7).
Thus it is understandable that we may sometimes say (somewhat
incorrectly) that $Z$ itself is a collection of objective properties of microsystems.
Since $\mu(w_1, e) = \mu(w_2, e)$ for all $e \in Z$ does not imply that $w_1 = w_2$, it
follows that the microsystems (as described in terms of III, D 4.1.7) are not
physical objects.
Of special importance are the atoms of the center $Z$ (see the end of §1.3),
because we could always assume that each microsystem $x$ has one and only
one property which is characterized by an atom of $Z$. Let $Z_A$ denote the set of
atoms of $Z$. Then we obtain
$$M = \bigcup_{e \in Z_A} p(e) \tag{8.1.9}$$
where $p(e)$ is defined by (8.1.8).
If $e_1, e_2 \in Z_A$ and if $e_1 \ne e_2$ we then find that
$$p(e_1) \cap p(e_2) = \emptyset. \tag{8.1.10}$$
In physics it is customary to use “names” to denote objective properties $p(e)$, $e \in Z_A$, such as “$p(e_1)$ is the set of electrons,” “$p(e_2)$ is the set of hydrogen atoms” or “$p(e_3)$ is the set of helium atoms,” etc. The introduction of these “names” requires a more extensive characterization of the individual elements of $Z_A$ than we have previously seen, that is, there must be additional axioms which are needed to describe how (that is, by means of which apparatus) we may produce or prepare the individual $e \in Z_A$. An experimental physicist can easily provide us with several apparatuses by which we may “produce” or detect, for example, electrons or hydrogen atoms.
Although we are unable, at this stage of the development of the theory, to assign characteristic names to the atoms $Z_A$ of the center, we now find it desirable to introduce the concept of a system type.
D 8.1.1. The elements $e \in Z_A$ are called system types; for $e \in Z_A$ we shall call the set $p(e)$ the set of systems of type $e$.
Using (8.1.9) and (8.1.10), for each $a \in \mathcal{Q}'$ we obtain the following decomposition:
$$a = \bigcup_{e \in Z_A} (a \cap p(e)). \tag{8.1.11}$$
Thus we obtain
$$\varphi(a) = \sum_{e} \lambda(e)\, \varphi(a \cap p(e)) \tag{8.1.12}$$
where $\lambda(e) = \operatorname{tr}(\varphi(a) e)$. Equation (8.1.12) is nothing other than a decomposition of each $w \in K$ into components with respect to the different $\mathcal{H}_\nu$ as follows: Let $w = (W_1, W_2, \ldots)$ and $w_\nu = (0, 0, \ldots, W_\nu, \ldots)$. Then $w = \sum_\nu w_\nu$. This decomposition of $w$ is uniquely determined by the condition $w_\nu \operatorname{tr}(w_\nu)^{-1} \in K_1(e_\nu)$. This decomposition is coexistent with all other decompositions of $w$. Thus it is understandable that in physics we are, in most cases, concerned only with the individual system types. This is evident from the fact that each registration procedure $b$ may be decomposed as follows
$$b = \bigcup_{e \in Z_A} (b \cap p(e)) \tag{8.1.13}$$
from which it follows that
$$\psi(b_0, b) = \sum_{e \in Z_A} \psi(b_0, b \cap p(e)). \tag{8.1.14}$$
Equation (8.1.14) is nothing other than a decomposition of each effect $F = (F_1, F_2, \ldots)$ into its components $F_\nu \leq e_\nu$. Here, for a given system type only the probability $\operatorname{tr}(W_\nu F_\nu)$ is of interest.
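The decomposition of an ensemble into its system-type components can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the two 2×2 sector components `W1`, `W2`, the effect component `F1`, and the helper names `trace`, `matmul`, and `decompose` are hypothetical illustrations and do not appear in the text. Exact rational arithmetic is used so that the weights can be compared exactly.

```python
from fractions import Fraction as Fr

def trace(m):
    """Trace of a square matrix given as a list of rows."""
    return sum(m[i][i] for i in range(len(m)))

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def decompose(w):
    """Split w = (W_1, W_2, ...) into pairs (weight, normalized component).

    The weight of sector nu is tr(W_nu); sectors of weight zero are skipped.
    """
    parts = []
    for W in w:
        lam = trace(W)
        if lam == 0:
            continue
        parts.append((lam, [[x / lam for x in row] for row in W]))
    return parts

# Hypothetical ensemble with two system types (assumed values).
W1 = [[Fr(1, 6), Fr(0)], [Fr(0), Fr(1, 6)]]   # tr W1 = 1/3
W2 = [[Fr(2, 3), Fr(0)], [Fr(0), Fr(0)]]      # tr W2 = 2/3
w = [W1, W2]

parts = decompose(w)
weights = [lam for lam, _ in parts]

# Hypothetical effect component F_1 acting in sector 1 only.
F1 = [[Fr(1), Fr(0)], [Fr(0), Fr(0)]]
prob_sector_1 = trace(matmul(W1, F1))          # tr(W_1 F_1)
```

In this sketch the weights play the role of $\lambda(e)$ in (8.1.12), and each normalized component has trace one, as required of an ensemble of a single system type.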
The set $\mathcal{E}_m$ of objective properties, as well as the set $Z_A$ of system types, are clearly related to the concept of “superselection rules.” We may (in a formal mathematical sense) construct a Hilbert space $\mathcal{H} = \sum_i \oplus\, \mathcal{H}_i$ from a direct sum of the individual Hilbert spaces $\mathcal{H}_i$, where the latter, according to AIV, §15, determine the algebra $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$. Then $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ will be identified with a subalgebra of $\mathcal{B}(\mathcal{H})$. This subalgebra is characterized by the condition that all $A$ in $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ commute with the projections $P_i$ onto the individual subspaces $\mathcal{H}_i$ of $\mathcal{H}$ (see AIV, §15). The $P_i$ are, however, the atoms of the center $Z$. Each $P_i$ is then a “superselection rule” since the $P_i$ commute with all “actual” decision observables in $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and are therefore, in all circumstances, invariant quantities. Invariant “under all circumstances” is only an imprecise formulation of what we have previously called “objective properties.”
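The commutation property that characterizes the subalgebra can be checked on a small finite-dimensional example. This is only a sketch under stated assumptions: the sector dimensions (2 and 1), the matrix entries, and the helper names `block_diag`, `sector_projection`, and `commutes` are hypothetical and chosen purely for illustration.

```python
def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block_diag(blocks):
    """Assemble square blocks into one block-diagonal matrix."""
    n = sum(len(b) for b in blocks)
    out = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                out[offset + i][offset + j] = x
        offset += len(b)
    return out

def sector_projection(dims, k):
    """Projection onto the k-th direct summand of a space with sector sizes dims."""
    n = sum(dims)
    start = sum(dims[:k])
    return [[1 if i == j and start <= i < start + dims[k] else 0
             for j in range(n)] for i in range(n)]

def commutes(a, b):
    return matmul(a, b) == matmul(b, a)

dims = [2, 1]                                   # assumed sector dimensions
A = block_diag([[[1, 2], [3, 4]], [[5]]])       # element of the subalgebra
P = [sector_projection(dims, k) for k in range(len(dims))]

# An operator that connects the two sectors (not block diagonal).
B = [[0, 0, 1],
     [0, 0, 0],
     [1, 0, 0]]

A_commutes_with_all_P = all(commutes(A, p) for p in P)
B_commutes_with_P0 = commutes(B, P[0])
```

Here `A` commutes with every sector projection, whereas `B`, which maps one sector into the other, does not; only operators of the first kind belong to the subalgebra described above.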
In closing we shall now ask whether there exists a set $\mathcal{E} = \mathcal{E}_p \cap \mathcal{E}_r$ (as defined at the end of III, §4.1) which is so large that the equivalence relation on $\mathcal{Q}'$ defined by $a_1 \approx a_2$: $\{\lambda_{\mathcal{Q}}(a_1, a_1 \cap p) = \lambda_{\mathcal{Q}}(a_2, a_2 \cap p)$ for all $p \in \mathcal{E}\}$ is finer than that defined by the $f \in \mathcal{F}$. From [1], §12.3 it follows that such a set does not exist, because we may apply the results of that section to $\mathcal{Q}'$, $\mathcal{R}_0$, $\mathcal{R}$ and obtain the following result.
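There exists a complete Boolean ring $\Sigma$ and a pair of maps $\mathcal{Q}' \to K(\Sigma)$ and $\mathcal{F} \to L(\Sigma)$, where the image of $\mathcal{Q}'$ is dense in $K(\Sigma)$ and the image of $\mathcal{F}$ is dense in $L(\Sigma)$. Since the image of $\mathcal{F}$ is dense, a $\sigma$-continuous and surjective map $L(\Sigma) \to L$ is defined by the maps $\mathcal{F} \to L(\Sigma)$ and $\mathcal{F} \xrightarrow{\psi} L$.
According to §2 this would mean that all effects would be coexistent (in particular, $G$ would be a Boolean ring), in contradiction to our axioms for microsystems.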
8.2 Pseudoproperties of Microsystems
In §8.1 we have found that microsystems are not micro-objects, that is, they do not have a sufficient set of objective properties. We shall now consider the analogous question concerning the physically realizable pseudoproperties of microsystems (see III, D 4.2.1). Let $\mathcal{E}_{ps}$ denote the set of physically realizable pseudoproperties. For $p \in \mathcal{E}_{ps}$ we obtain the following result from III, §4.2:
Let $K_0L_0(A_p) = K_1(e) = K_0(e^\perp)$ where $e \in G$, and let $K_0L_0(A_{p^*}) = K_1(e^*) = K_0(e^{*\perp})$ where $e^* \in G$.
Then, according to III (4.2.8) there exists a $g \in L_0(A_{p^*})$ and $g \in L_1(A_p)$ for which
$$g \leq e^{*\perp} = 1 - e^* \quad \text{and} \quad g \geq e. \tag{8.2.1}$$
From which it follows that
$$e \leq e^{*\perp}, \quad \text{that is,} \quad e \perp e^* \quad \text{and} \quad e + e^* \leq 1. \tag{8.2.2}$$
From III (4.2.9) it follows that
$$0 = L_0(A_p) \cap L_0(A_{p^*}) = L_0K_0(e^{*\perp}) \cap L_0K_0(e^\perp) = L_0K_0(e^{*\perp} \wedge e^\perp)$$
and therefore $e^{*\perp} \wedge e^\perp = 0$, that is,
$$e^* \vee e = 1. \tag{8.2.3}$$
From (8.2.2) and (8.2.3) we obtain
$$e + e^* = 1, \quad \text{that is,} \quad e^* = e^\perp. \tag{8.2.4}$$
From (8.2.1) it follows that there exists a particular $b_0 \in \mathcal{R}_0$ such that
$$g = e = \psi(b_0, b_0 \cap p_r).$$
From $A_p \subset K_1(e)$ it follows that for all $a \subset p$
$$\varphi(a) \in K_1(e). \tag{8.2.5}$$
From $\psi(b_0, b) \in L_0(A_{p^*})$ it follows that, for all $b \subset p$
$$\psi(b_0, b) \leq e^{*\perp} = e. \tag{8.2.6}$$
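Conversely, if $\varphi(a) \in K_1(e)$, then it follows that
$$0 = \mu(\varphi(a), e^\perp) \geq \mu(\varphi(a), \psi(b_0, b))$$
for all $b \subset p^*$. Therefore we obtain
$$\lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = 0$$
and
$$a \cap b = \emptyset \ \text{for all } b \subset p^*, \quad \text{that is,} \quad a \cap (p^*)_r = \emptyset. \tag{8.2.7}$$
Suppose that $a \cap (p^*)_p \neq \emptyset$; then there exists an $\tilde{a} \subset p^*$ such that $a \cap \tilde{a} \neq \emptyset$. For $a' = a \cap \tilde{a}$, since $a' \subset a$ we clearly obtain $\varphi(a') \in K_1(e)$. Since $a' \subset \tilde{a} \subset p^*$ we obtain
$$\varphi(a') \in A_{p^*} \subset K_0(e^{*\perp}) = K_0(e).$$
Thus we are led to a contradiction, from which it follows that
$$a \cap (p^*)_p = \emptyset. \tag{8.2.8}$$
From (8.2.7) and (8.2.8) it follows that $a \cap p^* = \emptyset$, that is, $a \subset M \setminus p^*$ and therefore $a \subset p$.
Therefore we may strengthen (8.2.5) as follows:
$$a \subset p \iff \varphi(a) \in K_1(e). \tag{8.2.9}$$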
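Suppose that $\psi(b_0, b) \leq e$. Then, for $\varphi(a) \in A_{p^*} \subset K_0(e)$ we obtain
$$0 = \mu(\varphi(a), \psi(b_0, b)) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b),$$
that is, $a \cap b = \emptyset$ for all $a \subset p^*$, from which it follows that
$$b \cap (p^*)_p = \emptyset. \tag{8.2.10}$$
Suppose that $b \cap (p^*)_r \neq \emptyset$; then there would be a $\tilde{b} \subset p^*$ for which $b' = b \cap \tilde{b} \neq \emptyset$. Since $b' \subset b$ we obtain $\psi(b_0, b') \leq \psi(b_0, b) \leq e$. Since $b' \subset \tilde{b}$ there exists a $\tilde{b}_0$ for which $\psi(\tilde{b}_0, b') \in L_0(A_p) = L_0K_0(e^\perp)$ and we therefore obtain $\psi(b_0, b') \leq e^\perp$. From $\psi(b_0, b') \leq e$ and $\psi(b_0, b') \leq e^\perp$ it follows that $\psi(b_0, b') = 0$ in contradiction to $b' \neq \emptyset$.
Therefore we obtain
$$b \cap (p^*)_r = \emptyset. \tag{8.2.11}$$
From (8.2.10) and (8.2.11) it follows that $b \cap p^* = \emptyset$, that is, $b \subset M \setminus p^*$ and we obtain $b \subset p$.
Therefore we may strengthen (8.2.6) as follows:
$$b \subset p \iff \psi(b_0, b) \leq e. \tag{8.2.12}$$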
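From (8.2.12) it follows that
$$p_r = \bigcup_{b \in \mathcal{R}_e} b \tag{8.2.13}$$
where
$$\mathcal{R}_e = \{b \mid b \in \mathcal{R} \text{ and there exists a } b_0 \in \mathcal{R}_0 \text{ for which } \psi(b_0, b) \leq e\}.$$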
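From (8.2.9) we obtain
$$p_p = \bigcup_{a \in \mathcal{Q}_e} a \quad \text{where} \quad \mathcal{Q}_e = \{a \mid a \in \mathcal{Q}' \text{ and } \varphi(a) \in K_1(e)\}. \tag{8.2.14}$$
Finally we obtain
$$p = \Big( \bigcup_{a \in \mathcal{Q}_e} a \Big) \cup \Big( \bigcup_{b \in \mathcal{R}_e} b \Big). \tag{8.2.15}$$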
To each physically realizable pseudoproperty $p$ there exists a corresponding $e \in G$ which is obtained from the equation $K_0L_0(A_p) = K_1(e)$ and defines a map $\mathcal{E}_{ps} \to G$. This map is injective because, according to (8.2.15), $p$ is uniquely determined by its image $e \in G$.
We will now show that, for an arbitrary $e \in G$, the corresponding $p$ (which we shall denote by $p(e)$) given by (8.2.15) is an element of $\Pi$ (where the latter is defined by III, §4.2).
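By using the method of proof of (8.2.7)-(8.2.11) it may be shown that, if $a \in \mathcal{Q}_e$, then for all $\tilde{a} \in \mathcal{Q}_{e^\perp}$ and all $\tilde{b} \in \mathcal{R}_{e^\perp}$ it follows that $a \cap \tilde{a} = \emptyset$, $a \cap \tilde{b} = \emptyset$, that is, $a \cap p(e^\perp) = \emptyset$. Similarly, for $b \in \mathcal{R}_e$ it follows that $b \cap p(e^\perp) = \emptyset$. Therefore $p(e) \cap p(e^\perp) = \emptyset$. This is equivalent to $p(e^\perp) \subset p(e)^*$ where $p(e)^* = \pi(M \setminus p(e))$, where $\pi(c)$ is defined by III (4.2.2).
From $a \subset p(e)$ it follows that $a \cap p(e^\perp) = \emptyset$ and hence $a \cap b = \emptyset$ for all $b \in \mathcal{R}_{e^\perp}$, that is, $\mu(\varphi(a), \psi(b_0, b)) = 0$ for all $\psi(b_0, b) \leq e^\perp$. If
$$\sup_{b \in \mathcal{R}_{e^\perp}} \psi(b_0, b) = e^\perp \tag{8.2.16}$$
then it follows that $\mu(\varphi(a), e^\perp) = 0$ and therefore $a \in \mathcal{Q}_e$.
From $b \subset p(e)$ it follows that $b \cap p(e^\perp) = \emptyset$ and hence $b \cap a = \emptyset$ for all $a \in \mathcal{Q}_{e^\perp}$, that is, $\mu(\varphi(a), \psi(b_0, b)) = 0$ for all $a \in \mathcal{Q}_{e^\perp}$.
Let $A_e = \{\varphi(a) \mid a \in \mathcal{Q}',\ \varphi(a) \in K_1(e)\}$; if
$$L_0(A_{e^\perp}) = L_0K_1(e^\perp) = L_0K_0(e), \tag{8.2.17}$$
that is, if the face generated by $A_{e^\perp}$ is equal to $K_1(e^\perp)$, then it follows that $\psi(b_0, b) \in L_0K_0(e)$ and hence $\psi(b_0, b) \leq e$, that is, $b \in \mathcal{R}_e$.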
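Thus we obtain
$$p(e) = \Big( \bigcup_{\substack{a \in \mathcal{Q}' \\ a \subset p(e)}} a \Big) \cup \Big( \bigcup_{\substack{b \in \mathcal{R} \\ b \subset p(e)}} b \Big),$$
that is, $p(e)$ has the form III (4.2.2).
At the same time we have also proven that if $a \subset M \setminus p(e^\perp)$ and $b \subset M \setminus p(e^\perp)$ it also follows that $a \subset p(e)$ and $b \subset p(e)$. Thus, from $p(e^\perp) \subset p(e)^*$ (see above) and from $a \subset M \setminus p(e)^*$, $b \subset M \setminus p(e)^*$ we also obtain $a \subset p(e)$ and $b \subset p(e)$, that is, $p(e) = p(e)^{**}$. Therefore $p(e)$ is an element of $\Pi$. Furthermore it follows that $p(e)^* = p(e^\perp)$ since (8.2.16) and (8.2.17) are also satisfied if we replace $e$ by $e^\perp$.
From $K_1(e_1) \cap K_1(e_2) = K_1(e_1 \wedge e_2)$ it follows that $\mathcal{Q}_{e_1} \cap \mathcal{Q}_{e_2} = \mathcal{Q}_{e_1 \wedge e_2}$. From $g \leq e_1$ and $g \leq e_2$ it follows that $g \leq e_1 \wedge e_2$ and $\mathcal{R}_{e_1} \cap \mathcal{R}_{e_2} = \mathcal{R}_{e_1 \wedge e_2}$. Thus $p(e_1) \wedge p(e_2) = p(e_1 \wedge e_2)$.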
Let $e \in G$; then condition (8.2.16), that is,
$$\sup_{b \in \mathcal{R}_e} \psi(b_0, b) = e$$
is satisfied, if (in correspondence to the assumption from III, D 4.2.1) to $e$ there exists a $b_0 \in \mathcal{R}_0$ for which
$$\psi(b_0, b_0 \cap p(e)_r) = e. \tag{8.2.18}$$
The condition (8.2.17) is satisfied for an $e \in G$, that is, $L_0(A_e) = L_0K_1(e)$, if there exists an $a \in \mathcal{Q}'$ for which
$$C(\varphi(a)) = K_1(e) \tag{8.2.19}$$
where $C(\varphi(a))$ is the face generated by $\varphi(a)$ (see also III, §3).
If the elements $e$ of an orthocomplemented sublattice $G_p$ of $G$ satisfy conditions (8.2.18) and (8.2.19) then $\mathcal{E}_{ps} = \{p(e) \mid e \in G_p\}$ is an orthocomplemented sublattice of $\Pi$ which is isomorphic to $G_p$. It is a certain hypothesis that such a sublattice $G_p$ of $G$ exists which is $\sigma(\mathcal{B}', \mathcal{B})$ dense in $G$. Since, as in §8.1, we shall not discuss certain hypotheses, here we shall assert the existence of such a sublattice $G_p$ as an axiom.
For such a $G_p$, $\mathcal{E}_{ps} = \{p(e) \mid e \in G_p\}$ is a sufficient system (see III, D 4.2.2) of physically realizable pseudoproperties.
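From (8.1.8) and $\chi(M \setminus p(e)) = p(e^\perp)$ (where $\chi$ is defined by §8.1) it follows that we may always assume that $Z \subset G_p$. We therefore assume that $Z \subset G_p$. Thus we obtain
$$\mathcal{E}_m = \{p(e) \mid e \in Z\} \subset \mathcal{E}_{ps}.$$
We may characterize $\mathcal{E}_m$ as a subset of $\mathcal{E}_{ps}$ as follows:
$$\mathcal{E}_m = \{p \mid p \in \mathcal{E}_{ps} \text{ and } p^* = M \setminus p\}.$$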
PROOF. For $e \in Z$, with $p(e)$ defined by (8.1.8), we find that $\chi(M \setminus p(e)) = p(e^\perp)$ and therefore $p(e)^* = M \setminus p(e)$. Conversely, suppose that $p(e) \in \mathcal{E}_{ps}$ and $p(e)^* = M \setminus p(e)$, but that $e \in Z$ is not satisfied. Since $G_p$ is dense in $G$ there exists an $\tilde{e} \in G_p$ which is not commensurable with $e$. From $(\tilde{e} \wedge e) \vee (\tilde{e} \wedge e^\perp) \leq \tilde{e}$ and since $e$ and $\tilde{e}$ are not commensurable, it follows that $e_1 = \tilde{e} - [(\tilde{e} \wedge e) \vee (\tilde{e} \wedge e^\perp)] \neq 0$. Since $G_p$ is an orthocomplemented lattice we also obtain $e_1 \in G_p$. In addition, we find that $e_1 \wedge e = 0$ and $e_1 \wedge e^\perp = 0$.
Since $(p(e_1) \wedge p(e))_r = p(e_1)_r \cap p(e)_r$ and $p(e_1) \wedge p(e) = p(e_1 \wedge e) = p(0) = \emptyset$ we therefore find that $p(e_1)_r \cap p(e)_r = \emptyset$. Similarly we obtain $p(e_1)_r \cap p(e^\perp)_r = \emptyset$, $p(e_1)_p \cap p(e)_p = \emptyset$ and $p(e_1)_p \cap p(e^\perp)_p = \emptyset$. Therefore we finally obtain $p(e_1) \cap p(e) = \emptyset$ and $p(e_1) \cap p(e^\perp) = \emptyset$. Since $p(e^\perp) = p(e)^* = M \setminus p(e)$ it follows that $p(e_1) = \emptyset$ in contradiction to $e_1 \neq 0$. Therefore, for $e \notin Z$ we find that $p(e)^* \neq M \setminus p(e)$.
Since $G_p$ is dense in $G$, in addition to designating the set $\mathcal{E}_{ps} = \{p(e) \mid e \in G_p\}$ as a set of pseudoproperties of microsystems, we shall also refer to $G$ itself (somewhat imprecisely) as a set of pseudoproperties. The last designation is, however, often misunderstood. We have introduced the
set $\mathcal{E}_{ps}$ in order to be more precise. We may reduce difficulty in interpretation if we translate “$x \in p$” for $p \in \mathcal{E}_{ps}$ into normal language by the expression “$x$ has the pseudoproperty $p$.”
8.3 Logic of Decision Effects?
Now that we have discussed the meaning of decision effects in various
circumstances we shall now briefly discuss a number of expressions which
frequently lead to misunderstandings in quantum mechanics.
Since the set $G$ is, in some respects, analogous to the set of properties of a classical system, it is common to refer to an element $e \in G$ as a property (and not more precisely as a pseudoproperty). Since we are not accustomed to describing individual microsystems mathematically (as we do in this book) we frequently express the above assertion in ordinary language as follows: “a particular microsystem has the property $e$.” In seeking to give such statements a verifiable meaning it was recognized that if the assertion “the microsystem has the property $e_1$ and has the property $e_2$” is replaced by the assertion “the microsystem has the property $e_1 \wedge e_2$” then doubts about the validity of the usual two-valued logic arise. A similar case exists if the negation of the proposition “the microsystem has the property $e$” is replaced by the proposition “the microsystem has the property $e^\perp$.” In this way a nonstandard logic of propositions was derived in which the logical operations are (in a sense) parallel to the lattice operations in $G$. This parallelism may be expressed as follows:
$$\text{and} \leftrightarrow \wedge, \qquad \text{or} \leftrightarrow \vee, \qquad \text{not} \leftrightarrow \perp. \tag{8.3.1}$$
The lattice G is often called a quantum logic—even in the case in which the
relationships (8.3.1) are not strictly required or “believed.”
In the discussion of propositions concerning the properties of microsystems an alternative possibility proceeds from the idea that, in quantum mechanics, it is not possible to formulate so-called “objective” propositions of the form “the microsystem has the property $e$.” Instead, it is suggested that we may only formulate “subjective” propositions, such as, “I know that the microsystem under consideration has the property $e$.” Here there exist two different types of negation: “I do not know that the microsystem has the property $e$” and “I know that the microsystem has the property $e^\perp$.” Here “I do not know ...” can be considered to be an imprecise form of the proposition “I do not know for certain whether the microsystem in question has the property $e$ or $e^\perp$; there is a probability $\alpha$ (possibly subjective) that the microsystem has the property $e$ (and $1 - \alpha$ for the property $e^\perp$).” In this way attempts have been made to develop a “probability logic.”
The notion of a “probability logic” will not be discussed here (see, for
example, [16] and [9]). Instead, we shall continue the development of the
fundamental description of individual microsystems in mathematical terms,
terms which are in some correspondence to the more intuitive ideas
formulated above.
Instead of $G$ we shall consider the dense subset $G_p$ of $G$ which was introduced at the end of §8.2, and the corresponding set of physically realizable pseudoproperties $\mathcal{E}_{ps}$ together with the isomorphic map $e \to p(e)$. Let $e \in G_p$; we shall now express the relationship
$$x \in p(e) \tag{8.3.2}$$
in ordinary language as follows: “The microsystem $x$ has the pseudoproperty $e$.” Here it is important to note that the ordinary language formulation should not be construed to mean anything other than the relation described by (8.3.2). In this formulation (8.3.2) is primarily a relation in the mathematical description of a physical theory, that is, it has a physical interpretation. We should not, however, make the mistake of using the ordinary language description of (8.3.2) as an alternative interpretation in addition to that already given in II.
We shall now proceed as follows: According to the methodology presented in [1] certain relationships in a mathematical theory $\mathcal{MT}$ (as a part of a physical theory $\mathcal{PT}$) may be considered to be a representation of real physical facts. Here it is important to emphasize that, in addition to the mathematical formulation of $\mathcal{PT}$, there does not exist another type of “proposition” formulated in $\mathcal{PT}$. The mathematical formulation (for example, (8.3.2)) is, on the basis of the physical interpretation of the fundamental sets (for example, $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ and the real function in II), considered to be an assertion about reality. Thus the logic used in $\mathcal{PT}$ is only that of the mathematical theory. In this way we obtain, in a natural way, certain mathematical assertions of a real character (see [1], §10 or [2], III, §9). We shall now describe this situation using (8.3.2) without making use of the general formulation presented in [1], §10.
In the mathematical framework we cannot assign the “values” true or false
to the relation (8.3.2). The meaning and importance of relations such as
(8.3.2) in a mathematical theory is much more complicated and requires a
more precise analysis. Here we shall impose all of the axioms previously
introduced and also those introduced in subsequent chapters (for example,
VII).
We shall begin by providing a logical analysis of relations of the form
(8.3.2). Here, by the expression “logical analysis” we mean an analysis in the
sense of a mathematical theory.
We may logically “combine” two relations of the form (8.3.2) by means of a logical conjunction “and” as follows:
$$x \in p(e_1) \ \text{and} \ x \in p(e_2), \tag{8.3.3}$$
where (8.3.3) is equivalent to the relation
$$x \in p(e_1) \cap p(e_2). \tag{8.3.4}$$
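We note that, in general
$$p(e_1) \cap p(e_2) \neq p(e_1 \wedge e_2). \tag{8.3.5}$$
We note that
$$p(e) = p(e)_p \cup p(e)_r$$
and
$$\begin{aligned}
p(e_1) \cap p(e_2) &= (p(e_1)_p \cup p(e_1)_r) \cap (p(e_2)_p \cup p(e_2)_r) \\
&= (p(e_1)_p \cap p(e_2)_p) \cup (p(e_1)_p \cap p(e_2)_r) \cup (p(e_1)_r \cap p(e_2)_p) \cup (p(e_1)_r \cap p(e_2)_r) \\
&= p(e_1 \wedge e_2)_p \cup p(e_1 \wedge e_2)_r \cup (p(e_1)_p \cap p(e_2)_r) \cup (p(e_1)_r \cap p(e_2)_p) \\
&= p(e_1 \wedge e_2) \cup (p(e_1)_p \cap p(e_2)_r) \cup (p(e_1)_r \cap p(e_2)_p).
\end{aligned}$$
For the special case in which $e_1 \wedge e_2 = 0$ we find that
$$p(e_1) \cap p(e_2) = (p(e_1)_p \cap p(e_2)_r) \cup (p(e_1)_r \cap p(e_2)_p)$$
need not be empty! For example, $p(e_1)_p \cap p(e_2)_r \neq \emptyset$ if there exists a preparation procedure $a$ for which $\varphi(a) \in K_1(e_1)$ and $\mu(\varphi(a), e_2) \neq 0$ because, for $\psi(b_0, b_0 \cap p(e_2)_r) = e_2$ (see (8.2.18)) we must then have
$$\lambda_{\mathcal{S}}(a \cap b_0, a \cap b_0 \cap p(e_2)_r) \neq 0,$$
that is, $a \cap p(e_2)_r \neq \emptyset$.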
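The situation $e_1 \wedge e_2 = 0$ together with $\mu(\varphi(a), e_2) \neq 0$ for an ensemble in $K_1(e_1)$ already occurs in a two-dimensional Hilbert space. The following is a minimal sketch under stated assumptions: the real two-dimensional space, the vectors `v1` and `v2`, and the helper names are hypothetical illustrations, and $\mu(w, e)$ is taken to be $\operatorname{tr}(we)$. For rank-one projections the lattice meet is zero exactly when the two ranges are different rays, which the sketch checks via linear independence.

```python
from fractions import Fraction as Fr

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def projector(v):
    """Rank-one projection onto the ray spanned by the real vector v."""
    norm_sq = v[0] * v[0] + v[1] * v[1]
    return [[Fr(v[i] * v[j], norm_sq) for j in range(2)] for i in range(2)]

def mu(w, e):
    """Probability of the effect e in the ensemble w, assumed to be tr(w e)."""
    return trace(matmul(w, e))

v1 = (1, 0)        # assumed direction for e_1
v2 = (1, 1)        # assumed direction for e_2
e1 = projector(v1)
e2 = projector(v2)

# Two distinct rays intersect only in the zero vector, so e1 ^ e2 = 0.
meet_is_zero = (v1[0] * v2[1] - v1[1] * v2[0]) != 0

w = e1             # pure ensemble lying in K_1(e1)
prob_e1 = mu(w, e1)
prob_e2 = mu(w, e2)
```

With these assumed values the ensemble yields $e_1$ with certainty and still yields $e_2$ with probability one half, although the meet of the two decision effects vanishes.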
For example, let $e_1$, $e_2$ be the following decision effects: $e_1$: the momentum lies within a compact region $W$ of momentum space; $e_2$: the position lies within a compact region $V$ of position space (see VII, §4). Then the set (8.3.4) is nonempty because it is possible to prepare microsystems from $p(e_1)_p$ (that is, with momentum in $W$) which may be registered according to $p(e_2)_r$ (with position in $V$). If we express $x \in p(e_1)$, $x \in p(e_2)$ as follows: “$x$ has momentum in $W$” and “$x$ has position in $V$,” respectively, then (8.3.4) is equivalent to the logical conjunction (8.3.3) which says that “$x$ has momentum in $W$ and position in $V$.” For sufficiently small domains $W$ and $V$ the latter statement may appear to contradict the Heisenberg uncertainty relation. We shall now find that this contradiction is only an apparent one.
The following objection is frequently made, particularly in the case of position and momentum: “after” the registration of the position (that is, for the elements of $p(e_2)_r$) the momentum has been changed. This is indeed the case. This fact is, however, irrelevant to the interpretation of quantum mechanics. Quantum mechanics makes assertions concerning the interaction between the preparation apparatus and the registration apparatus which results from microsystems. Therefore all such assertions are concerned with the microsystems “between” preparation and registration; here (8.3.3) represents an assertion which is both correct and important. (In XVII, §4 we shall
examine the “trans-preparation” process and obtain a number of conclusions concerning the “passage” of microsystems through the registration apparatus. We note, however, that these special processes (special relative to the more general process of preparation) are not needed for the interpretation of quantum mechanics.)
We shall now show by an example that the “strength” of the disturbance of a system which occurs during the registration process is not an important issue in the interpretation of quantum mechanics. Let us consider a “classical” system, for example, a bullet which is fired by a gun (as the preparation apparatus) and produces a hole in a target (the registration apparatus). No one will object to the use of the expressions “position of the bullet” and “momentum of the bullet” in the description of the bullet immediately before it is “registered” by the target. Here the “strength” of the influence of the target is unimportant; for example, the bullet may become embedded in the target if the target is a metal plate.
The real distinction between macro- and microsystems lies in the structure of the convex set $K$ which, for macrosystems, is completely different from that for microsystems (see, for example, the remarks in III, §3).
The following claim is often made: the “classical” mode of description is made possible whenever the disturbance of the measurement can be neglected. This claim is false, and avoids the actual problem, making an understanding of the problem more difficult.
Indeed, it is correct to say that every registration disturbs, both in the case of a classical system and in that of a quantum mechanical system. Whether a system need be described classically or quantum mechanically has nothing to do with the “strength” of the disturbance in the registration process. Without disturbance there would hardly be any systems because without disturbances we cannot prepare and register, that is, cannot “extract” the system from its surroundings and make observations; without such interaction we cannot talk about systems at all.
For classical systems the set $\mathcal{E}_m$ of objective properties is “sufficiently large” that we may interpret the preparation and registration procedures by means of the objective properties (this topic was outlined in III, §4), even though the interaction during registration may be very large! (See the remarks in III, §4 and the discussion in [1], §12.)
We note that (8.3.5) holds in general. This fact has led to much unnecessary speculation. It is not difficult to see that the expression “and” is used in different senses in $e_1 \wedge e_2$ and $p(e_1) \cap p(e_2)$, respectively. The mathematical formulation used here does not allow the usage of such imprecise language.
The same can be examined if we consider the negation of (8.3.2)
$$x \notin p(e). \tag{8.3.6}$$
(8.3.6) is equivalent to
$$x \in M \setminus p(e). \tag{8.3.7}$$
In general we find that
$$M \setminus p(e) \neq p(e^\perp). \tag{8.3.8}$$
In §8.2 we have seen that
$$M \setminus p(e) = p(e^\perp)$$
if and only if $e \in Z$. Thus, from (8.3.7) we find that, in general, the two statements “$x$ does not have the pseudoproperty $e$” and “$x$ has the pseudoproperty $e^\perp$” are not equivalent. Thus, in “ordinary” language we may easily encounter the following difficulty: Suppose, for example, that $e$ is the decision effect that the position lies in the domain $V$. Then, from the statement “$x$ does not lie in $V$” we may easily (and incorrectly!) conclude that “the position of $x$ lies in $V'$,” where $V'$ is the complement of the set $V$; $e^\perp$ is the decision effect for the statement “the position of $x$ lies in $V'$.”
In ordinary language it is evident that we may easily arrive at contradictions with logic. The mathematical language of relation (8.3.2) does not permit such confusion.
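An ensemble-level analogue of this distinction can be made explicit in a small computation. This is only a sketch under stated assumptions: it works with ensembles $w$ rather than with the sets $M \setminus p(e)$ and $p(e^\perp)$ of the text, it takes $\mu(w, e) = \operatorname{tr}(we)$ on a real two-dimensional space, and the names `classify`, `projector`, and the chosen vectors are hypothetical. It shows that failing to lie in $K_1(e)$ does not place an ensemble in $K_1(e^\perp)$.

```python
from fractions import Fraction as Fr

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def projector(v):
    """Rank-one projection onto the ray spanned by the real vector v."""
    norm_sq = v[0] * v[0] + v[1] * v[1]
    return [[Fr(v[i] * v[j], norm_sq) for j in range(2)] for i in range(2)]

def classify(w, e):
    """Return 'e' if tr(w e) = 1, 'e_perp' if tr(w e) = 0, else 'neither'."""
    p = trace(matmul(w, e))
    if p == 1:
        return "e"
    if p == 0:
        return "e_perp"
    return "neither"

e = projector((1, 0))            # assumed decision effect
w_in_e = projector((1, 0))       # ensemble in K_1(e)
w_in_e_perp = projector((0, 1))  # ensemble in K_1(e_perp)
w_superposed = projector((1, 1)) # ensemble in neither face

labels = [classify(w, e) for w in (w_in_e, w_in_e_perp, w_superposed)]
```

The third ensemble is "not certainly $e$" without being "certainly $e^\perp$", which is the kind of case that the ordinary-language negation overlooks.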
Let $\mathcal{E}_{pb}$ be the smallest Boolean ring of sets which is generated by $\mathcal{E}_{ps}$. $\mathcal{E}_{pb}$ contains (in the sense of equivalences (8.3.3), (8.3.4) and (8.3.6), (8.3.7)) all possible “logical relations” of the pseudoproperties in $\mathcal{E}_{ps}$. The Boolean ring $\mathcal{E}_{pb}$ is a reflection of the “ordinary” logic. This fact demonstrates that the question about the properties of a microsystem is not a question of logic because the elements of $\mathcal{E}_{pb}$ may be called properties of the microsystem, and $\mathcal{E}_{pb}$ contains “sufficiently many” properties since $\mathcal{E}_{ps}$ already contains sufficiently many pseudoproperties. The problem is therefore not associated with the construction of a Boolean ring of sets $\mathcal{E} \subset \mathcal{P}(M)$ but with the question posed in III, §1 about “objective properties.” If we define the term “objective” in the sense of III, §4.1 then $\mathcal{E}_{pb}$ is clearly not a set of “objective” properties because the set of objective properties is (according to §8.1 and §8.2) the subset $\mathcal{E}_m$ of $\mathcal{E}_{ps}$. Clearly $\mathcal{E}_m$ is not a sufficient set of properties.
Since $\mathcal{E}_m$ is a Boolean ring of sets, the “logical operations” do not force us to leave the set $\mathcal{E}_m$, and we may, without hesitation, speak about these objective properties in ordinary language as, for example, “$x$ is an electron.” Using the following properties, we shall show that the elements of $\mathcal{E}_{pb}$ are not “objective” properties. Let $p = p(e)$. Then the following element
$$M \setminus p = (M \setminus p_p) \cap (M \setminus p_r)$$
belongs to $\mathcal{E}_{pb}$. Let us consider two preparation procedures $a_1$ and $a_2$ which belong to the same ensemble, that is, $\varphi(a_1) = \varphi(a_2)$. Suppose that $\mu(\varphi(a_1), e) \neq 0$, $\neq 1$. According to §6 it is possible that $a_1$ can be complementary to $a_2$ (see the EPR paradox in XVII, §4.4). Let $a_2$ have a decomposition for which $\tilde{a}_2 \subset a_2$ and $\varphi(\tilde{a}_2) \in K_1(e)$; $a_1$ may not have such a decomposition. Therefore it follows that $a_1 \cap p_p = \emptyset$, $a_2 \cap p_p \neq \emptyset$. Similarly according to §3 there may exist a pair of registration procedures $b_0^1$ and $b_0^2$ for which $b_0^1 \cap p_r = \emptyset$ and $\psi(b_0^2, b_0^2 \cap p_r) = e$; $b_0^1$ is a registration procedure for which the corresponding observable is not coexistent with any effect $g \leq e$.
Thus we obtain
$$a_1 \cap b_0^1 \subset M \setminus p,$$
that is, for all systems $x \in a_1 \cap b_0^1$ we obtain $x \notin p$. On the other hand, from $\mu(\varphi(a_1), e) = \mu(\varphi(a_1), \psi(b_0^2, b_0^2 \cap p_r)) \neq 0$ we obtain $a_1 \cap b_0^2 \cap p_r \neq \emptyset$, that is, in $a_1$ there exists a system $x$ for which $x \in p$. Therefore, by the “application of the registration method $b_0^1$” alone (that is, without any selection according to a registration $b$!) we have selected the “property” $M \setminus p$ from the systems of type $a_1$, although $a_1$ also contains elements of $p$. The “property” $M \setminus p$ is therefore not “objective” because it depends upon the application of a registration method.
Here we again make the remark that we do not use the designation
“objective property” in a meaning opposite to that of “subjective,” but (as we
have already expressed in an intuitive way in II, §1) in the sense of
“independent from the methods of preparation and registration,” that is,
objective in the sense of the properties ascribed to the systems. The opposite
of “objective” properties is therefore “relative” properties, not subjective
meaning and knowledge about properties.
There is a compelling intuitive idea that, although the microsystems are emitted by the preparation apparatus, they exist independently after the emission (that is, no interaction exists after emission between the microsystems and the preparation apparatus), and that the microsystems later act upon the registration apparatus on the basis of their inherent structure, that is, their “objective properties.” The desire to retain this idea is so strong that many of us may wish to adopt the “hidden” properties hypothesis.
In the following discussion we shall ignore properties which are “completely hidden” in the sense that they have nothing to do with the preparation and registration processes. The above idea may yet have an additional meaning: It is perhaps not possible to construct a sufficiently good preparation procedure in order to produce systems with definite specified objective properties. Otherwise, the registration process may be deficient in this respect: the objective properties may only be partially registered. It is in this sense that we shall attempt to mathematically formulate the notion of “hidden objective properties.” This notion will be somewhat different from that described in III, §4.1.
In addition to the structure previously introduced, we shall consider an additional structure $\mathcal{E}_h$ for which $\mathcal{E}_h \subset \mathcal{P}(M)$, where $\mathcal{E}_h$ is a Boolean ring of sets. The elements of $\mathcal{E}_h$ will represent the hidden properties.
For $p = p(e)$, $e \in G_p$ there exists a registration method $b_0$ for which $\psi(b_0, b_0 \cap p_r) = e$ (see §8.2). Then we obtain
$$\lambda_{\mathcal{S}}(a \cap p_p \cap b_0, a \cap p_p \cap b_0 \cap p_r) = 1,$$
that is, $a \cap p_p \cap b_0 \subset p_r$. Since the relation $(b_0 \cap p_r) \cup (b_0 \cap p^*_r) = b_0$ is satisfied for this $b_0$, for each system registered according to $b_0$ either $b_0 \cap p_r$ or $b_0 \cap p^*_r$ will “respond.” We shall attempt to interpret this situation in the following way: $p = p_p \cup p_r$ possibly does not include all systems
which have the same objective property which is made evident by the response $b_0 \cap p_r$ during the registration process $b_0$. Suppose there exists an $\varepsilon \in \mathcal{E}_h$ for which $\varepsilon \supset p_r$ such that for systems in $\varepsilon$ the registration response is always obtained for $b_0 \cap p_r$ and therefore a response is never observed for $p^*_r$, that is, $\varepsilon \cap p^*_r = \emptyset$. Since the systems in $a \cap p_p$ are such that $b_0 \cap p_r$ responds with certainty, for all $a$ we should find that $a \cap p_p \subset \varepsilon$ and therefore $p_p \subset \varepsilon$. Since the systems in $a \cap p^*_p$ are such that $b_0 \cap p_r$ does not respond with certainty we should find that $\varepsilon \cap p^*_p = \emptyset$. We then find that $\varepsilon \supset p_r$ and $\varepsilon \supset p_p$ are equivalent to $\varepsilon \supset p_p \cup p_r = p$; $\varepsilon \cap p^*_r = \emptyset$ and $\varepsilon \cap p^*_p = \emptyset$ are equivalent to $\varepsilon \cap (p^*_p \cup p^*_r) = \varepsilon \cap p^* = \emptyset$.
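These speculations suggest that the following axiom is desirable:
AH 1. To each $p \in \mathcal{E}_{ps}$ there exists an $\varepsilon \in \mathcal{E}_h$ for which $\varepsilon \supset p$ and $\varepsilon \cap p^* = \emptyset$.
Here $\varepsilon \cap p^* = \emptyset$ is equivalent to $M \setminus \varepsilon \supset p^*$. Since $\mathcal{E}_{ps} = p(G_p)$ we obtain: To each $e \in G_p$ there exists an $\varepsilon \in \mathcal{E}_h$ for which $\varepsilon \supset p(e)$ and $M \setminus \varepsilon \supset p(e^\perp)$.
In practical terms axiom AH 1 is not very restrictive because it is satisfied by $\mathcal{E}_h = \mathcal{E}_{ps}$ itself for the case in which $\varepsilon = p(e)$.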
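The following idea is more restrictive: Suppose that $e_1$, $e_2$ are two decision effects in $G_p$ for which $e_1 \perp e_2$. Then we may think of a registration method $b_0$ for which $\psi(b_0, b_0 \cap p_{1r}) = e_1$, $\psi(b_0, b_0 \cap p_{2r}) = e_2$ where $p_1 = p(e_1)$, $p_2 = p(e_2)$. Then from $p_{1r} \cap p_{2r} = \emptyset$ and
$$b_0 = (b_0 \cap p_{1r}) \cup (b_0 \cap p_{2r}) \cup [b_0 \setminus (b_0 \cap (p_{1r} \cup p_{2r}))]$$
we obtain
$$\psi(b_0, b_0 \setminus (b_0 \cap (p_{1r} \cup p_{2r}))) = e_3 = 1 - (e_1 + e_2)$$
and therefore
$$b_0 \setminus (b_0 \cap (p_{1r} \cup p_{2r})) = b_0 \cap p_{3r},$$
where $p_3 = p(e_3)$. This registration method $b_0$ permits us to separate the systems according to $p_{1r}$, $p_{2r}$, $p_{3r}$. For three objective properties $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$ for which exactly one of $b_0 \cap p_{1r}$, $b_0 \cap p_{2r}$, or $b_0 \cap p_{3r}$ will respond with certainty we should therefore have $\varepsilon_1 \cup \varepsilon_2 \cup \varepsilon_3 = M$ because, for each system, in every case one of the “responses” $b_0 \cap p_{1r}$, $b_0 \cap p_{2r}$, $b_0 \cap p_{3r}$ must occur. In this way we are led to the following axiom:
AH 2. For $e_1, e_2, e_3 \in G_p$ and $e_1 + e_2 + e_3 = 1$, $\varepsilon_1, \varepsilon_2, \varepsilon_3 \in \mathcal{E}_h$ where $\varepsilon_i \supset p(e_i)$ ($i = 1, 2, 3$) we require that $\varepsilon_1 \cup \varepsilon_2 \cup \varepsilon_3 = M$.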
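Let us define a map $\phi: G_p \to \mathcal{P}(M)$ as follows:
$$\phi(e) = \bigcap_{\substack{\varepsilon \in \mathcal{E}_h \\ \varepsilon \supset p(e)}} \varepsilon.$$
Then it easily follows that
$$e_1 \leq e_2 \implies \phi(e_1) \subset \phi(e_2) \tag{8.3.9}$$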
and, from AH 1 it follows that:
$$\phi(e^\perp) \subset M \setminus \phi(e). \tag{8.3.10}$$
If $e_1, e_2, e_3 \in G_p$ and $e_1 + e_2 + e_3 = 1$, then, from AH 2 it follows that
$$\phi(e_1) \cup \phi(e_2) \cup \phi(e_3) = M. \tag{8.3.11}$$
If we assume that if $e \in G_p$ then $[0, e] \cap G_p$ is dense in $[0, e]$ we may then prove that the existence of a map $\phi$ satisfying (8.3.9), (8.3.10), and (8.3.11) leads to contradictions (for the case in which $G_p = G$ an elementary proof can be found in [2], XVIII). We shall not present discussions here of any weaker hypothesis for “hidden properties.” Instead, we shall make an analysis of the structure $\mathcal{E}_{ps}$ of “physically real” pseudoproperties and their physical interpretation.
From (8.2.4) it follows that $p_p$ depends upon the preparation procedure because $a = (a \cap p_p) \cup (a \setminus a \cap p_p)$ is a decomposition of $a$ which must be coexistent with all possible decompositions in $\mathcal{Q}'(a)$. Similarly, from (8.2.3) it follows that $p_r$ depends on the registration method since $b_0 = (b_0 \cap p_r) \cup (b_0 \setminus b_0 \cap p_r)$ must be coexistent with all the other registrations in $\mathcal{R}(b_0)$. The discussions in previous sections about the structure of preparators and observables are therefore applicable to the sets $a \cap p_p$ and $b_0 \cap p_r$, respectively, as substructures. The elements of $\mathcal{E}_{ps}$ represent only a part of the structures of preparators and observables. The analysis of the preparators and the observables has shown that if we consider only $\mathcal{E}_{ps}$ then we lose part of the general structure of the interaction transfer mechanism from the preparation to the registration systems, where this mechanism is independent of the special technical construction of the preparation and registration apparatuses as described by the corresponding irreducible kernel observables or preparators, respectively. We cannot speak of microsystems except in the context of preparation and registration even if we neglect as many of the “accidental” properties of the preparation and registration apparatuses as possible. We have also found that even the structure described by the pseudoproperties in $\mathcal{E}_{ps}$ actually refers back to the preparation and registration processes. Only the “objective” properties in $\mathcal{E}_m$ can be separated from the preparation and registration procedures. $\mathcal{E}_m$ is not sufficient. Therefore it would be necessary to attempt to demonstrate the subjectivity of every assertion about nature. Such attempts would greatly exceed the real procedures in physics and contradict our original intention to base the description of microsystems on preparation and registration procedures which can be described in an “objective” form. In II we have argued that the interpretation of quantum mechanics depends only on the mode of description of the preparation and registration processes, a description which is given already before any knowledge of microsystems and quantum mechanics. This is especially important if we wish to consider the “physical possibility” of assertions of the form (8.3.2), that is, the question whether it is possible to realize situations in which the assertion (8.3.2) is true.
The structure of assertions of the form (8.3.2) for an unspecified $x$ and an
unspecified $e$ is meaningful only if we are interested in studying the logical
operations, as we have done earlier. We are, however, also interested in
assertions of the form (8.3.2) in those cases in which $x$ and $e$ are particular
specified elements. In order to emphasize this definiteness we shall modify the
notation of (8.3.2) by using the subscript 1 as follows:
$$x_1 \in p(e_1). \tag{8.3.12}$$
By requiring that $x_1$ and $e_1$ be definite elements we mean that $x_1$ and $e_1$ are
already defined before we use the relation (8.3.12) in our mathematical
framework. For $x_1$ this may be achieved by requiring that $x_1$ is already a
label for an actual system in the context of an actual experiment which has
been carried out and the result of which is written down in the mathematical
framework before we may use (8.3.12). In [1] the mathematical scheme of the
theory in which previous experimental results have been incorporated is
denoted by $\mathcal{MTA}$. In brief we say that $x_1$ is a definite element if $x_1$ is already
a label appearing in $\mathcal{MTA}$ before we add the relation (8.3.12) to $\mathcal{MTA}$.
$e_1$ is a definite element if, for instance, it is defined as the decision effect to
find the position of the system in a given (that is, in a technically determined)
region Y in the laboratory system, a decision effect which will be defined in
VII, §4.
It is possible to add the relation (8.3.12) to the mathematical scheme
$\mathcal{MTA}$ and determine whether
(1) (8.3.12) may be derived as a theorem in $\mathcal{MTA}$.
(2) The negation of (8.3.12) may be derived as a theorem in $\mathcal{MTA}$.
(3) Either (8.3.12) or its negation may be added to $\mathcal{MTA}$ without
producing a contradiction (naturally both cannot).
In case (1) we say that “$x_1$ actually has the pseudoproperty $e_1$.” In case (2)
we say that “$x_1$ actually does not have the pseudoproperty $e_1$.” Earlier we
have seen that case (2) is not equivalent to the statement that “$x_1$ actually has
the pseudoproperty $e_1^\perp$,” except in the case in which $e_1 \in Z$; then $e_1$ would be
an objective property. In case (3) we say that “$x_1$ may possibly have the
property $e_1$” or “... not have $e_1$.” Every mathematician knows that case (3)
can occur in a mathematical theory such as $\mathcal{MTA}$. The existence of case (3)
has nothing to do with a “new” mathematical logic. The existence of case (3)
is possible using “normal” mathematics and “normal” logic. Only some
physicists have difficulty with case (3) and have the opinion that it may be
necessary to introduce a new logic in physics in order to interpret case (3).
The source of these difficulties lies in the fact that we always (often
unconsciously) assume that the elements of $S_{ps}$ are like the inherent
properties of the system, so that for $x \in M$ and $p \in S_{ps}$ only $x \in p$ or $x \notin p$
will be true. For the elements $p \in S_m$ we may, in fact, make such a claim, but
not for the elements $p \in S_{ps}$. Case (3) is essential! Case (3) is possible only
because the elements $p \in S_{ps}$ are not “objective” properties but, more
generally, are assertions about the microsystems relative to the preparation
and registration processes. For this reason it is important to examine case (3)
in more detail.
We shall assume that the system is prepared according to some
preparation procedure $a_1$. In this way we may introduce the relation $x_1 \in a_1$
into the mathematical theory as an experimental result. On the basis of the
construction of the preparation apparatus corresponding to $a_1$ (the
corresponding information cannot be represented in terms of the theory
described here; see XVII) and on the basis of additional experiments we
may determine $\varphi(a_1)$. If $\varphi(a_1)$ is determined by experiment then the value of
$\mu(\varphi(a_1), e_1)$ is determined by the theory.
(A) Suppose $\mu(\varphi(a_1), e_1) = 1$. In this case we obtain $a_1 \in \mathcal{Q}_{e_1}$ and obtain
(8.3.12) as a theorem.
(B) Suppose that $\mu(\varphi(a_1), e_1) = 0$. In this case $\mu(\varphi(a_1), e_1^\perp) = 1$, that is,
$x_1 \in p(e_1^\perp)$ and $x_1 \notin p(e_1)$ are theorems.
Therefore we find that (A) corresponds to case (1) and (B) corresponds to
case (2).
(C) Suppose that $\mu(\varphi(a_1), e_1) \ne 1$ and $\ne 0$. To comment on this case the
actual experimental situation is essential since we are not dealing
with an imaginary microsystem. $x_1$ is an actual microsystem. How
was the preparation of the system carried out? Has the system $x_1$
already been registered? Has a registration method already been
applied to $x_1$? ... etc. ...?
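The three-way split (A), (B), (C) depends only on the value of the probability for the prepared ensemble and the decision effect. The following minimal sketch spells this out; the function name `classify` and the use of exact fractions are illustrative assumptions and are not part of the theory described here.

```python
from fractions import Fraction


def classify(mu):
    """Return the case label for a given value mu = mu(phi(a_1), e_1).

    "A": (8.3.12) is a theorem                 -> case (1)
    "B": the negation of (8.3.12) is a theorem -> case (2)
    "C": the preparation alone decides nothing; the experimental
         situation has to be examined further
    """
    if not 0 <= mu <= 1:
        raise ValueError("a probability must lie in [0, 1]")
    if mu == 1:
        return "A"
    if mu == 0:
        return "B"
    return "C"


if __name__ == "__main__":
    for value in (Fraction(1), Fraction(0), Fraction(1, 3)):
        print(value, classify(value))
```

Only label "C" requires the further discussion of registration methods; the labels "A" and "B" are settled by the preparation procedure alone.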
We shall first consider the case in which a registration method has not yet
been applied. Then the only experimental facts are $x_1 \in a_1$ and (possibly)
for some of the $a_\nu \subset a_1$ we have also obtained $x_1 \in a_\nu$. Since the relationship
$x_1 \in a_\nu$ can be experimentally verified only for finitely many $a_\nu$ we find that
$$a = \bigcap_\nu a_\nu \tag{8.3.13}$$
is an element of $\mathcal{Q}(a_1)$ and we obtain $x_1 \in a$. If $\mu(\varphi(a), e_1) = 1$ or if
$\mu(\varphi(a), e_1) = 0$ we obtain cases (A) and (B), respectively. Here it is important
to remark that in the case in which not all the $x_1 \in a_\nu$ are included in the
“protocol” of the experiment (for example, it is possible that some of these
relations have been overlooked) in reality there exists such an $a$ satisfying
(8.3.13) even though we may fail to take it into account in $\mathcal{MTA}$. Thus it is
possible that, in the case $\mu(\varphi(a), e_1) = 1$, “$x_1$ really has the
pseudoproperty $p(e_1)$” although we may be unaware of this fact. Here we find that the
$a \in \mathcal{Q}(a_1)$ to which the system $x$ belongs is determined by the Boolean ring
$\mathcal{Q}(a_1)$ even in the case in which an experimental result $x \in a$ for an $a \in \mathcal{Q}(a_1)$ has
been overlooked. The incompleteness associated with the introduction of the
experimental results into a mathematical theory (that is, the incompleteness
of the axioms set down in [1], §5 and [2], III, §4) may
permit “possibilities” which nature does not allow. For example, it is possible
that the relation $\mu(\varphi(a), e_1) = 1$ is satisfied for a particular experimental $a$ but
that we have only observed that $x \in a_1$ (where $a_1 \supset a$) where $\mu(\varphi(a_1), e_1) < 1$;
then the “possibility” remains that $x_1 \notin p(e_1)$ while, in reality, $x_1 \in p(e_1)$. This
“possibility” remains only if we have overlooked the fact that $x_1$ actually
satisfies the finer preparation procedure $a \subset a_1$.
We are not interested in the possibilities which are caused by the
incompleteness of the axioms Jtstftf*.
Therefore we shall now assume that the experimental situation is described
by the relationship $x_1 \in a$, where $a$ is defined by (8.3.13) and that $\mu(\varphi(a), e_1)$
has been calculated, and found not to be equal to either 0 or 1.
Then the relationship (8.3.12) or its negation can be introduced into the
theory without producing a contradiction. In such a case, possibility (3)
holds: “it is possible that $x_1$ has the pseudoproperty $p(e_1)$.” What are the
possibilities engendered in this case?
Since we have assumed that $x_1 \in a$ is the best possible observation (that is,
we have excluded the possibility of an incomplete “protocol”), and since
$\mu(\varphi(a), e_1) \ne 1$, we must have $a \cap p_{1p} = \emptyset$ (where $p_1 = p(e_1)$); otherwise
in $\mathcal{Q}(a_1)$ there would be selection procedures which are finer than $a$ which we
have overlooked. By assuming that $a \cap p_{1p} = \emptyset$ we will simplify the following
discussions.
From $a \cap p_{1p} = \emptyset$ and $x_1 \in a$ it follows that $x_1 \notin p_{1p}$. Then
$x_1 \in p_1 = p(e_1)$ is equivalent to $x_1 \in p_{1r}$. Since $\mu(\varphi(a), e_1) \ne 0$ we may add the
relation $x_1 \in p_{1r}$ without contradiction. It would be false, however, to say
that $x_1$ has the pseudoproperty $p_1$ with probability $\mu(\varphi(a), e_1)$ because the
probability that $x_1 \in p_1$ is realized in an experiment depends on the
following:
(a) The registration method $b_0$ which we apply to $x_1$ can be chosen
arbitrarily. Thus the “possibility” that $x_1 \in p_1$ may be obtained from $x_1 \in a$
depends upon the possibility of the arbitrary(!) choice of $b_0$. For the choice of
$b_0$ there are no probabilities. Here $b_0$ is freely “available” (for this concept see
[1], §11 and §12).
Suppose that a particular choice $b_0^{(1)}$ is made, that is, as an additional
experimental fact $x_1 \in b_0^{(1)}$ is observed. By the introduction of the relation
$x_1 \in b_0^{(1)}$ it becomes necessary to alter our assessment of the relation (8.3.12)
as follows:
(a1) $b_0^{(1)}$ has been chosen such that $b_0^{(1)} \cap p_{1r} = \emptyset$. Such a choice has
occurred if the observable corresponding to $\mathcal{R}(b_0^{(1)}) \xrightarrow{\psi} L$ is complementary
to the observable $\{0, e_1, e_1^\perp, 1\}$ (see §3). Then, by the choice(!) of such a
registration method $b_0^{(1)}$, from $x_1 \in b_0^{(1)}$ we obtain the following result as a
theorem in $\mathcal{MTA}$: $x_1 \notin p_{1r}$, and, since $x_1 \notin p_{1p}$, we also obtain $x_1 \notin p_1$.
Therefore we obtain the case that “$x_1$ actually does not have the
pseudoproperty $p_1$.”
In this case (a1) it is easy to see that the elements $p \in S_{ps}$ cannot represent
properties of “isolated” microsystems because they are unconditionally
connected with the possibilities inherent in the preparation and registration
processes because the application of a registration method(!) can make it
impossible that $x_1 \in p_1$.
(a2) Let $b_0^{(1)}$ be chosen such that $b_0^{(1)} \cap p_{1r} \ne \emptyset$. Here a registration has
not yet taken place. Then $x_1 \in b_0^{(1)}$ does not prohibit the addition of the
relation $x_1 \in p_{1r}$. In this case we have “$x_1$ may possibly have the
pseudoproperty $p_1$.”
The “possibilities” for $x_1 \in p_1$ cannot be further influenced because there
exist reproducible frequencies $\mu(\varphi(a), \psi(b_0^{(1)}, b))$ for the various registrations
$b \subset b_0^{(1)}$. Therefore, with a degree of justification, we may call
$$\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r}))$$
the “probability for $x_1 \in p_1$.” Since $\psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})$ is, in general, smaller
than $e_1$, it is possible that
$$\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})) < \mu(\varphi(a), e_1)\,!$$
Indeed, it is possible that
$$\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})) = 0$$
even in the case in which $b_0^{(1)} \cap p_{1r} \ne \emptyset$.
If $\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})) = 0$ then “$x_1$ does not actually have the
pseudoproperty $p_1$.” If $\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})) \ne 0$ then “$x_1$ may possibly have the
pseudoproperty $p_1$.”
(a2m) According to APE 3.4 we may choose $b_0^{(1)}$ such that
$$b_0^{(1)} = (b_0^{(1)} \cap p_{1r}) \cup (b_0^{(1)} \cap p_{1r}^\perp).$$
Then we obtain $\psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r}) = e_1$. If $b_0^{(1)}$ is chosen in this way, then
$x_1 \in p_1$ is possible with the maximal probability $\mu(\varphi(a), e_1)$.
In case (a2) and in the special case (a2m), provided that
$\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})) \ne 0$, there exists a registration $b_+ \subset b_0^{(1)} \cap p_{1r}$. If
the experiment is carried out, then either $x_1 \in b_0^{(1)} \cap p_{1r}$ or
$x_1 \in b_0^{(1)} \setminus (b_0^{(1)} \cap p_{1r})$ will be observed upon registration from the registration
apparatus.
For the experimental result $x_1 \in b_0^{(1)} \cap p_{1r}$ we therefore obtain (8.3.12) as a
theorem: “$x_1$ has the pseudoproperty $p_1$” and the possible pseudoproperty $p_1$
of $x_1$ has been realized. Otherwise, if we experimentally obtained the result
$x_1 \in b_0^{(1)} \setminus (b_0^{(1)} \cap p_{1r})$ then we obtain the statement $x_1 \notin p_1$ as a theorem, that
is, “$x_1$ does not actually have the pseudoproperty $p_1$.” Of course, this does
not mean that the relation $x_1 \in p_1^\perp = p(e_1^\perp)$ must be satisfied. If $b_0^{(1)}$ had been
chosen according to the case (a2m) then, from the experimental result
$$x_1 \in b_0^{(1)} \setminus (b_0^{(1)} \cap p_{1r})$$
we would indeed obtain $x_1 \in p_1^\perp = p(e_1^\perp)$ as a theorem, that is, “$x_1$ actually has
the pseudoproperty $p_1^\perp$.”
The analysis presented above shows the complicated structure of case (3) in
which $x_1 \in p_1$ and $x_1 \notin p_1$ can be introduced without contradiction. This
analysis shows the essential point that the knowledge of the subject does not
play a role, and that, at most, the incompleteness of the “protocol of an
experiment” (as, for example, stored in a computer) may leave additional
“possibilities” open, and that the latter are actually established by the
experiment if they are not established by the protocol. This analysis shows
that quantum mechanics permits the description of all experimental
situations even for individual microsystems without, as we have found, the
need for the introduction of a new logic, providing that we do not consider
mathematical logic to be unusual because of the occurrence of case (3).
The fact that we do not need to introduce a new logic does not mean that it
is not possible to introduce new language together with a new logic. For
example, it is possible to express relations such as (8.3.2) in a new language
which expresses more of the ontological character of such expressions than
does the formal mathematical language. For example, it is possible to
formulate expressions like (8.3.2) as follows: “The microsystem x has the
pseudoproperty e” (see the discussion following (8.3.2)). This “new” linguistic
formulation can be considered to be interpreted by means of (8.3.2). Using
this and similar expressions as “elementary propositions” it is possible to
obtain new propositions by means of logical operations; these new logical
operations need not be the logical operations of . Instead, they may be
introduced on the basis of a dialog (see [16]), in which the verification of a
proposition corresponds to what we have described above by: The
proposition is physically verified on the basis of an experiment (see [1], §10.4).
According to the previous discussion in §8.3 we are now in the position to
correctly explain the physical meaning of many of the famous quantum
mechanical facts. We will first consider the uncertainty relation, often
called the Heisenberg Uncertainty Relation. It has played a great role
in the conceptual development of quantum mechanics, but is often loaded
with considerable historical ballast; we shall now present a review of this
topic.
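We shall begin by briefly proving the following mathematical theorem:
Let $e_1, e_2 \in G$, let $\alpha_1 = \mathrm{tr}(we_1)$, $\alpha_2 = \mathrm{tr}(we_2)$. Then there exist $e_1, e_2$ such
that for each $w \in K$ at least one of the following two relations is false:
$$\mathrm{tr}(w(e_1 - \alpha_1 1)^2) = \alpha_1 - \alpha_1^2 < \tfrac{1}{16}, \qquad \mathrm{tr}(w(e_2 - \alpha_2 1)^2) = \alpha_2 - \alpha_2^2 < \tfrac{1}{16}. \tag{8.3.14}$$
First we shall present the proof for the case of a single Hilbert space, that
is, for $\mathcal{B} = \mathcal{B}(\mathcal{H})$. Let us consider a complete orthonormal basis which we
divide into two sets $\varphi_\nu$ $(\nu = 1, 2, \ldots)$ and $\psi_\mu$ $(\mu = 1, 2, \ldots)$. Let
$$e_1 = \sum_\nu P_{\varphi_\nu} \quad \text{and} \quad 1 - e_1 = e_1^\perp = \sum_\mu P_{\psi_\mu}.$$
Let
$$\chi_\nu = \tfrac{1}{\sqrt{2}}(\varphi_\nu + \psi_\nu), \quad \chi'_\nu = \tfrac{1}{\sqrt{2}}(\varphi_\nu - \psi_\nu), \quad e_2 = \sum_\nu P_{\chi_\nu}, \quad e_2^\perp = \sum_\nu P_{\chi'_\nu}.$$
Let us assume that the relation $\alpha_1 - \alpha_1^2 < \frac{1}{16}$ is satisfied for $w$. Then it follows
that either $\alpha_1 < \frac{1}{8}$ or $(1 - \alpha_1) < \frac{1}{8}$, that is, either $\mathrm{tr}(we_1) < \frac{1}{8}$ or $\mathrm{tr}(we_1^\perp) < \frac{1}{8}$.
We shall only consider the case $\mathrm{tr}(we_1) < \frac{1}{8}$; the proof of the other is similar.
From $\mathrm{tr}(we_1) < \frac{1}{8}$ it follows that
$$\begin{aligned}
\tfrac{1}{8} > \mathrm{tr}(we_1) &= \sum_\nu \|\sqrt{w}\,\varphi_\nu\|^2 \ge \tfrac{1}{2} \sum_\nu \left(\|\sqrt{w}\,\chi_\nu\| - \|\sqrt{w}\,\chi'_\nu\|\right)^2 \\
&= \tfrac{1}{2} \sum_\nu \|\sqrt{w}\,\chi_\nu\|^2 + \tfrac{1}{2} \sum_\nu \|\sqrt{w}\,\chi'_\nu\|^2 - \sum_\nu \|\sqrt{w}\,\chi_\nu\|\,\|\sqrt{w}\,\chi'_\nu\| \\
&\ge \tfrac{1}{2}\,\mathrm{tr}(we_2) + \tfrac{1}{2}\,\mathrm{tr}(we_2^\perp) - \mathrm{tr}(we_2)^{1/2}\,\mathrm{tr}(w(1 - e_2))^{1/2} \\
&= \tfrac{1}{2} - \alpha_2^{1/2}(1 - \alpha_2)^{1/2}
\end{aligned}$$
and we obtain
$$\alpha_2 - \alpha_2^2 = \alpha_2(1 - \alpha_2) > \left(\tfrac{1}{2} - \tfrac{1}{8}\right)^2 = \left(\tfrac{3}{8}\right)^2 > \left(\tfrac{1}{4}\right)^2 = \tfrac{1}{16}.$$
The proof may be extended to the case of more than one Hilbert space.
Since $G_p$ is dense in $G$, the theorem also applies for two suitable $e_1, e_2 \in G_p$.
Since at least one of the relations (8.3.14) must fail for every $w$, it represents
an uncertainty relationship between the two decision effects $e_1$ and $e_2$. We
have stated the case for a pair of decision effects in order to show that the
uncertainty relations have nothing to do with the accuracy of measurements!
In order to discover the physical meaning of the relations (8.3.14) we shall
now rewrite them in terms of preparation and registration procedures. For
$p(e_1) = p_1$ and $p(e_2) = p_2$ let two registration procedures $b_0^{(1)}$ and $b_0^{(2)}$ be
chosen such that $\psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r}) = e_1$ and $\psi(b_0^{(2)}, b_0^{(2)} \cap p_{2r}) = e_2$ are
satisfied.
For each preparation procedure $a$ from
$$\lambda_{\mathcal{S}}(a \cap b_0^{(1)}, a \cap b_0^{(1)} \cap p_{1r}) < \tfrac{1}{8}$$
or
$$\lambda_{\mathcal{S}}(a \cap b_0^{(1)}, a \cap b_0^{(1)} \cap p_{1r}^\perp) < \tfrac{1}{8}$$
it follows that
$$\lambda_{\mathcal{S}}(a \cap b_0^{(2)}, a \cap b_0^{(2)} \cap p_{2r}) > \tfrac{1}{8}$$
and
$$\lambda_{\mathcal{S}}(a \cap b_0^{(2)}, a \cap b_0^{(2)} \cap p_{2r}^\perp) > \tfrac{1}{8}$$
must be satisfied.
$$\lambda_{\mathcal{S}}(a \cap b_0^{(1)}, a \cap b_0^{(1)} \cap p_{1r}) < \tfrac{1}{8}$$
states that, for the registration $b_0^{(1)} \cap p_{1r}$, at most $\frac{1}{8}$ of the systems prepared
according to the procedure $a$ will respond. Correspondingly,
$$\lambda_{\mathcal{S}}(a \cap b_0^{(1)}, a \cap b_0^{(1)} \cap p_{1r}^\perp) < \tfrac{1}{8}$$
states that, in the registration $b_0^{(1)} \cap p_{1r}$, more than $\frac{7}{8}$ of all systems prepared
according to $a$ will respond.
A “high” value for the response or nonresponse of $b_0^{(1)} \cap p_{1r}$ for the
systems obtained from the preparation procedure $a$ leads automatically to a
“low” value for the response as well as for the nonresponse of $b_0^{(2)} \cap p_{2r}$, that
is, the frequency of response must lie between $\frac{1}{8}$ and $\frac{7}{8}$.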
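The content of the theorem can be illustrated numerically in the simplest possible setting. The sketch below is only an illustration under stated assumptions: it uses a two-dimensional Hilbert space (one vector $\varphi$ and one vector $\psi$ instead of two infinite families), restricts itself to pure ensembles on a finite grid, and takes $e_2$ to be the projection onto $(\varphi + \psi)/\sqrt{2}$. The function names are hypothetical. In this toy case the sum of the two dispersions is never smaller than $\frac{1}{4}$, so at least one of them is never smaller than $\frac{1}{8}$; since $\alpha - \alpha^2$ is concave in $w$, mixtures cannot do better than pure ensembles.

```python
import cmath
import math


def dispersions(theta, gamma):
    """Return (alpha1 - alpha1^2, alpha2 - alpha2^2) for a pure ensemble.

    The ensemble is the projection onto v = (cos theta, e^{i gamma} sin theta).
    e1 projects onto phi = (1, 0); e2 projects onto (phi + psi)/sqrt(2),
    where psi = (0, 1).
    """
    v0 = math.cos(theta)
    v1 = cmath.exp(1j * gamma) * math.sin(theta)
    alpha1 = abs(v0) ** 2
    alpha2 = abs((v0 + v1) / math.sqrt(2)) ** 2
    return alpha1 * (1 - alpha1), alpha2 * (1 - alpha2)


def smallest_larger_dispersion(steps=200):
    """Minimum over a grid of pure ensembles of max(dispersion1, dispersion2)."""
    best = 1.0
    for i in range(steps + 1):
        theta = 0.5 * math.pi * i / steps
        for j in range(steps + 1):
            gamma = 2.0 * math.pi * j / steps
            d1, d2 = dispersions(theta, gamma)
            best = min(best, max(d1, d2))
    return best


if __name__ == "__main__":
    # The result stays well above the bound 1/16 used in (8.3.14).
    print(smallest_larger_dispersion() > 1 / 16)
```

The grid search is not a proof; it only shows that no pure ensemble on the grid makes both dispersions smaller than $\frac{1}{16}$ for this particular pair of decision effects.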
The uncertainty relations expressed by the relation (8.3.14) express the
following statement for possible preparation apparatuses: It is impossible to
experimentally produce a preparation procedure for which the frequencies $\alpha_1$
and $\alpha_2$ satisfy both $\alpha_1 - \alpha_1^2 < \frac{1}{16}$ and $\alpha_2 - \alpha_2^2 < \frac{1}{16}$. This relation says
nothing about the possibility of building registration apparatuses. On the
contrary, we have assumed that we have made two experiments $a \cap b_0^{(1)}$ and
$a \cap b_0^{(2)}$ where we permit $b_0^{(1)} \cap b_0^{(2)} = \emptyset$. From the uncertainty relations we
cannot say anything about the possibility of obtaining joint measurements of
$e_1$ and $e_2$. In fact, if $e_1$ and $e_2$ satisfy (8.3.14) then they are not
commensurable in the sense of D 1.3.1. Then, if $\psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r}) = e_1$ and if
$\psi(b_0^{(2)}, b_0^{(2)} \cap p_{2r}) = e_2$ and if $b_0^{(1)} \cap b_0^{(2)} = b_0 \ne \emptyset$ it follows that:
$$\psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r}) = \lambda_{\mathcal{R}_0}(b_0^{(1)}, b_0)\,\psi(b_0, b_0 \cap p_{1r}) = e_1$$
and
$$\psi(b_0^{(2)}, b_0^{(2)} \cap p_{2r}) = \lambda_{\mathcal{R}_0}(b_0^{(2)}, b_0)\,\psi(b_0, b_0 \cap p_{2r}) = e_2.$$
From
$$1 \ge \psi(b_0, b_0 \cap p_{1r}) \ge \frac{e_1}{\lambda_{\mathcal{R}_0}(b_0^{(1)}, b_0)}$$
it follows that $\lambda_{\mathcal{R}_0}(b_0^{(1)}, b_0) = 1$, that is, $b_0 = b_0^{(1)} = b_0^{(2)}$, resulting in the fact
that $e_1$ and $e_2$ are coexistent, in contradiction to the fact that $e_1$ and $e_2$ do not
commute. Therefore we must have $b_0^{(1)} \cap b_0^{(2)} = \emptyset$, that is, the two
registration methods must be mutually exclusive.
The above clarification of this concept is necessary because we are
accustomed to intuitively drawing more or less correct conclusions from the
uncertainty relations. These uncertainty relations are usually formulated for
scale observables which, according to D 2.5.6, are always decision
observables.
Suppose that $A$ and $B$ are two such scale observables for which
$A, B \in \mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$. $A$ and $B$ are therefore “bounded” operators (see
AIV, §15). The “dispersion” of measurement values of $A$ in the ensemble $w$ is
defined by
$$\mathrm{Str}(A) = \mathrm{tr}(wA'^2), \tag{8.3.15}$$
where $A' = A - 1\,\mathrm{tr}(wA)$.
Here $\mathrm{tr}(wA)$ is the so-called expectation value of $A$ in the ensemble $w$, that
is, it is approximately the experimental mean value $\bar{a} = (1/N) \sum_{\nu=1}^{N} a_\nu$ of the
measurement results $a_\nu$ for a large number $N$ of repeated experiments. $\mathrm{Str}(A)$
is the mean of the square of the deviation, that is, it is experimentally the
approximate value of the mean value $(1/N) \sum_{\nu=1}^{N} (a_\nu - \bar{a})^2$. We may make
the physical meaning more clear with the aid of a preparation procedure $a$ for
which $\varphi(a) = w$ and a registration method $b_0$ for which $\mathcal{R}(b_0) \xrightarrow{\psi} L$
represents a very good approximation for the scale decision observable $A$ (for the
realization of an observable, see §4).
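The relation between the theoretical quantities $\mathrm{tr}(wA)$, $\mathrm{Str}(A)$ and the experimental mean values can be made concrete with a small worked example. The following sketch rests on assumptions made purely for illustration: a two-valued scale observable with measurement values $+1$ and $-1$, a diagonal ensemble with weights $\frac{3}{4}$ and $\frac{1}{4}$, and an invented record of $N = 8$ results whose frequencies happen to match the weights exactly.

```python
from fractions import Fraction


def mean(values):
    """Experimental mean value (1/N) * sum of the a_nu."""
    return sum(values) / len(values)


def mean_square_deviation(values):
    """Experimental value (1/N) * sum of (a_nu - mean)^2."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


# Hypothetical record of N = 8 measurement results.
RESULTS = [Fraction(1)] * 6 + [Fraction(-1)] * 2

# Diagonal ensemble w = diag(3/4, 1/4) and observable A = diag(+1, -1).
WEIGHTS = [Fraction(3, 4), Fraction(1, 4)]
EIGENVALUES = [Fraction(1), Fraction(-1)]

# tr(wA) and Str(A) = tr(w A'^2) for diagonal w and A.
EXPECTATION = sum(w * a for w, a in zip(WEIGHTS, EIGENVALUES))
DISPERSION = sum(w * (a - EXPECTATION) ** 2 for w, a in zip(WEIGHTS, EIGENVALUES))

if __name__ == "__main__":
    print(mean(RESULTS) == EXPECTATION)
    print(mean_square_deviation(RESULTS) == DISPERSION)
```

In a real experiment the agreement is only approximate and improves with growing $N$; the record above was chosen so that the comparison is exact.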
For $\Delta A = \sqrt{\mathrm{Str}(A)}$ and the corresponding equation (8.3.15) for an
observable $B$ it follows that:
$$(\Delta A)(\Delta B) = \sqrt{\mathrm{tr}(wA'^2)}\,\sqrt{\mathrm{tr}(wB'^2)}.$$
Let $D = A' + i\alpha B'$ where $\alpha$ is real. Then $D^+D$ is self-adjoint and $D^+D \ge 0$.
Thus it follows that
$$\mathrm{tr}(wD^+D) \ge 0$$
and, for all $\alpha$ we have
$$(\Delta A)^2 + \alpha^2(\Delta B)^2 + \alpha\,\mathrm{tr}(wC) \ge 0, \tag{8.3.16}$$
where
$$C = i(A'B' - B'A') = i(AB - BA).$$
In order that (8.3.16) is satisfied, it is necessary and sufficient that the
minimum of (8.3.16) with respect to $\alpha$ is non-negative; we therefore obtain
$$(\Delta A)(\Delta B) \ge \tfrac{1}{2}\,|\mathrm{tr}(wC)|. \tag{8.3.17}$$
(If $\Delta B = 0$ we exchange $A$ with $B$ in the derivation; if both $\Delta A$ and $\Delta B = 0$
then from (8.3.16) it follows that $\mathrm{tr}(wC) = 0$ from which (8.3.17) is satisfied).
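The inequality (8.3.17) can be checked numerically for small matrices. The sketch below is an illustration only: it assumes a two-dimensional Hilbert space, takes the Pauli matrices $\sigma_x$ and $\sigma_y$ as stand-ins for the bounded observables $A$ and $B$, and uses two diagonal ensembles chosen for convenience. All function and constant names are hypothetical.

```python
import math


def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(x):
    return x[0][0] + x[1][1]


def centered(a, w):
    """Return A' = A - 1 tr(wA)."""
    expectation = trace(matmul(w, a))
    return [[a[i][j] - (expectation if i == j else 0) for j in range(2)]
            for i in range(2)]


def dispersion(a, w):
    """Return Str(A) = tr(w A'^2) as a real number."""
    ap = centered(a, w)
    return complex(trace(matmul(w, matmul(ap, ap)))).real


def both_sides(a, b, w):
    """Return ((Delta A)(Delta B), |tr(wC)| / 2) with C = i(AB - BA)."""
    ab, ba = matmul(a, b), matmul(b, a)
    c = [[1j * (ab[i][j] - ba[i][j]) for j in range(2)] for i in range(2)]
    lhs = math.sqrt(max(dispersion(a, w), 0.0) * max(dispersion(b, w), 0.0))
    rhs = 0.5 * abs(trace(matmul(w, c)))
    return lhs, rhs


SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
MIXED = [[0.75, 0], [0, 0.25]]  # hypothetical mixed ensemble
PURE = [[1, 0], [0, 0]]         # pure ensemble for which equality holds

if __name__ == "__main__":
    for w in (MIXED, PURE):
        lhs, rhs = both_sides(SIGMA_X, SIGMA_Y, w)
        print(lhs >= rhs - 1e-12)
```

For the mixed ensemble the left side exceeds the right side, while for the pure ensemble both sides coincide, which is an instance of the equality case of (8.3.17).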
If equality holds in (8.3.17), then it follows that there exists an $\alpha$ for which
$\mathrm{tr}(wD^+D) = \mathrm{tr}(D\sqrt{w}\sqrt{w}D^+) = 0$. This is equivalent to the condition that
$D\sqrt{w} = 0$ and also $Dw = 0$. If there exists an $\alpha$ such that $Dw = 0$ then we
obtain equality in (8.3.17). For $w = \sum_\nu w_\nu P_{\varphi_\nu}$ (according to AIV, §11) we
find that $Dw = 0$ is equivalent to the statement: $D\varphi_\nu = 0$ for all $\varphi_\nu$ (for which
$w_\nu \ne 0$). This result follows from
$$0 = \mathrm{tr}(D^+Dw) = \sum_\nu w_\nu\,\mathrm{tr}(D^+DP_{\varphi_\nu}) = \sum_\nu w_\nu \|D\varphi_\nu\|^2.$$
Thus we have seen that the “uncertainty relations” (8.3.16) and (8.3.17) are
determined essentially by the noncommutativity of the operators A and В,
and, according to §3, are determined by the fact that A and В are not
commensurable. The fact that A and В are not commensurable does not
directly appear in the derivation of the uncertainty relations, but only
indirectly, in terms of the mathematical structure of noncommutativity.
The best known case of the uncertainty relation (8.3.16) is the case in which
$A$ represents the position observable $Q$, and $B$ represents the corresponding
momentum observable $P$; $P$ and $Q$ will be described in detail in VII, §4. For
$P$ and $Q$ (see VII (4.22)) we obtain
PQ — QP = — il,
that is, С = — il; therefore (8.3.16) takes the form
$$(\Delta P)(\Delta Q) \ge \tfrac{1}{2}; \tag{8.3.18}$$
this is the famous Heisenberg uncertainty relation which has played an
important role in the evolution of quantum mechanics.
(Since $P$ and $Q$ are not bounded observables, the above proof does not
apply. Let $w = \sum_\nu w_\nu P_{\varphi_\nu}$. If for $w_\nu \ne 0$ all the $\varphi_\nu$ lie in the domain of
definition of the operators $P^2$ and $Q^2$, then the above derivation will be valid.
If, for example, one of the $\varphi_\nu$, say $\varphi_1$, does not lie in the domain of definition
of $P^2$ (or $Q^2$) then $\mathrm{tr}(wP)$ need not exist; we may then consider the operator
$P' = P - p1$ with arbitrary values of $p$ and obtain that $\mathrm{Str}(P)$ is infinite. Since $\mathrm{Str}(Q)$
cannot be equal to zero, equation (8.3.18) will still be satisfied.)
The following conclusions are often made in connection with the
Heisenberg uncertainty relation:
“The position and momentum of a particle cannot be simultaneously
determined with arbitrary precision. The measurement uncertainties $\Delta P$ and
$\Delta Q$ for a simultaneous measurement must satisfy (8.3.18).”
Or, somewhat more concisely:
“Position and momentum cannot be simultaneously measured to arbitrary
accuracy.”
These assertions are half-truths, and are not valid conclusions of equation
(8.3.18).
First, the expression “time” does not appear in either (8.3.18) or (8.3.16).
The meaning of the expression “simultaneous” in this context is unclear. If we
conclude that position and momentum may be measured to arbitrary
accuracy at different times, that conclusion will be false (see VII, §6 and
XVII). As we have already seen, (8.3.16) is only indirectly concerned with the
fact that $A$ and $B$ are not commensurable. As we have seen in §1-§4 the
concept of commensurability is defined in an entirely different manner than is
the uncertainty principle.
Again, it is important to note that $\Delta A$ and $\Delta B$ have nothing to do with
measurement imprecisions; on the contrary, in the derivation of (8.3.16) it
was assumed that they were measured with “ideal precision.” What does it
mean to measure an observable “imprecisely”?
Such a concept does not appear in this chapter. Have we overlooked part
of the structure of the registration process? No, this is not the case. It was
essential, in the introduction of the concept of a registration procedure
$b \in \mathcal{R}$, that the registration is precise, whether $x \in b$ or $x \notin b$. Therefore there
are no “imprecise” registrations. What then does an experimental physicist
mean by the expressions “measurement error,” “measurement imprecisions,”
etc.? He is making a comparison between a “real” observable $\mathcal{R}(b_0) \xrightarrow{\psi} L$
and the desired observable $\Sigma \xrightarrow{F} L$; we have already discussed this problem in
§4. Here the “measurement error” is understood to be the “difference”
between $\mathcal{R}(b_0) \xrightarrow{\psi} L$ and $\Sigma \xrightarrow{F} L$ which is characterized in AOb (see §4) by the
differences $\psi_0 k(\sigma) - F(\sigma)$. If $\Sigma \xrightarrow{F} L$ is a scale decision observable, then the
experimental physicist seeks to describe the difference between his real
apparatus, as described by $\mathcal{R}(b_0) \xrightarrow{\psi} L$, and a “real scale,” by “errors”
between the real scale of the apparatus and the ideal scale of the scale
observable he wished to measure. This “error” depends upon the ensemble
used in the experiment. This subject was discussed in §4 in connection with
axiom AOb. The discussion of errors is a typical experimental problem
because it is not related to the theoretical postulates of AOb but is concerned
with the construction of the real experimental apparatus.
Thus, when we say that we can only make imprecise joint measurements of
both $P$ and $Q$, the assertion is made that, for a real apparatus $b_0$ having two
scales $x$ and $y$, the corresponding partial observables of $\mathcal{R}(b_0) \xrightarrow{\psi} L$ may only
approximately measure the ideal observables $P$ and $Q$ with errors, where the
errors are “somewhat similar” to those described by (8.3.18). To analyze
these “errors” more precisely is somewhat more difficult than may at first
appear. At least we know that the well-defined quantities $\Delta P$ and $\Delta Q$ in
(8.3.18) are not measurement errors.
Although $\Delta P$ and $\Delta Q$ are often falsely interpreted as measurement errors,
it is probable that a relationship which is similar to (8.3.18) will be obtained if
we define a suitable notion of a measurement error. This is certainly not
surprising. However, the derivation of a relation which is similar to (8.3.18)
for the “measurement errors” of “approximate $P$” and “approximate $Q$” is
much more difficult and, in all probability, cannot in general be carried out.
For this reason we shall not proceed further in this direction (see [19]).
Certain related problems, such as the problem of a sequence of
measurements or the problem posed in III, §1 (that the registration must occur “after”
the preparation), and that the registration must take place in another region
of space than the preparation cannot be clarified using only the axioms
presented here. This is due in part to the fact that we have not built into the
theory described here a description of space and time. Such a clarification is
extremely desirable, since the role of space and time in quantum mechanics is
of great importance. Such a clarification is possible only after we have
investigated the transformation properties of preparation and registration
procedures in V-VIII.
CHAPTER V
Transformations of Registration and
Preparation Procedures.
Transformations of Effects and Ensembles
In II-IV we have been primarily motivated by physical considerations. In
this chapter we shall be concerned with questions which play an important
role in all mathematical theories—the definition and examination of the role
of morphisms. The concepts presented in II-IV were motivated by physical
considerations; here again the mathematical structure will reflect the
physical situation. In this book we shall not investigate the underlying general
mathematical problem itself because we are interested in the physical
significance of the morphisms. We have already encountered this problem in
previous chapters and have already anticipated some of the applications of
morphisms. In VII we shall describe another important application; in XVII
we shall become familiar with additional important physical examples of
morphisms.
1 Morphisms for Selection Procedures
At first it may appear to be desirable to consider mappings of the set M into
itself. In classical physics we are accustomed to studying transformations of
state space (for example, phase space in classical mechanics). Here we note
that we may not identify the state space of classical theories with the set of
physical systems. For example, it is possible that many systems have the
same state. In classical theories the notion of the transformation of systems
is, in general, physically vague. In quantum mechanics, except for the maps
considered in XVII, §4.1, we also find that there are no physically interesting
maps of M into itself. For these reasons we shall not consider maps of the set
M into itself.
On the other hand, the mapping of selection procedures is of substantial
interest. Let $\mathcal{S}_1$ and $\mathcal{S}_2$ be two systems of selection procedures on the sets $M_1$
and $M_2$.
D 1.1. Let $h: \mathcal{S}_1 \to \mathcal{S}_2$ be a map satisfying the following conditions:
$$h(a \cap b) = h(a) \cap h(b),$$
$$h(a \setminus b) = h(a) \setminus h(b) \quad \text{for } a \supset b.$$
Such a map will be called an sp-morphism.
For $b \subset a$ (hence $a \cap b = b$) from $h(a \cap b) = h(a) \cap h(b)$ it follows that
$h(a) \cap h(b) = h(b)$, that is, $h(a) \supset h(b)$. Thus $h$ preserves the order relation
and, if the first requirement is satisfied, the second requirement of D 1.1 is
meaningful because $b \subset a$ implies $h(b) \subset h(a)$. If $a, b \in \mathcal{S}_1$ and $a \cup b \in \mathcal{S}_1$, then
from $h(a) \subset h(a \cup b)$ and $h(b) \subset h(a \cup b)$ we obtain $h(a) \cup h(b) \subset h(a \cup b)$;
on the other hand, from $h((a \cup b) \setminus a) = h(a \cup b) \setminus h(a)$ and $(a \cup b) \setminus a \subset b$ it follows
that $h(a \cup b) \setminus h(a) \subset h(b)$ and that
$$h(a) \cup [h(a \cup b) \setminus h(a)] = h(a \cup b) \subset h(a) \cup h(b).$$
Therefore we obtain $h(a \cup b) = h(a) \cup h(b)$.
In accord with the usual terminology we shall say that a bijective mapping
$h$ is an sp-isomorphism if both $h$ and $h^{-1}$ are sp-morphisms.
Th. 1.1. Let $N_1$ be a subset of $M_1$, let $N_2 =$ and let $\mathcal{S}_1$ be a system of
selection procedures. Then the set
$$\mathcal{S}_2 = \{b \mid b = a \cap N_2 \text{ and } a \in \mathcal{S}_1\}$$
is a system of selection procedures and the mapping
$$h(a) = a \cap N_2$$
is an sp-morphism.
Proof. The proof is simple and left to the reader.
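For a finite example the claim of Th. 1.1 can be checked by brute force. The sketch below is an illustration under assumptions: the set $M_1$ has five elements, every subset of $M_1$ is admitted as a selection procedure, and $N_2$ is an arbitrarily chosen subset. The names `subsets`, `is_sp_morphism` and `preserves_unions` are hypothetical helpers.

```python
from itertools import combinations


def subsets(base):
    """All subsets of a finite set, as frozensets."""
    items = sorted(base)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


M1 = frozenset(range(5))
N2 = frozenset({0, 1, 2})  # hypothetical choice of the subset N_2
S1 = subsets(M1)           # here every subset counts as a selection procedure


def h(a):
    """The map h(a) = a intersected with N_2."""
    return a & N2


def is_sp_morphism(mapping, system):
    """Check both conditions of D 1.1 on all pairs of the system."""
    for a in system:
        for b in system:
            if mapping(a & b) != mapping(a) & mapping(b):
                return False
            if b <= a and mapping(a - b) != mapping(a) - mapping(b):
                return False
    return True


def preserves_unions(mapping, system):
    """Check h(a u b) = h(a) u h(b) whenever a u b lies in the system."""
    members = set(system)
    return all(mapping(a | b) == mapping(a) | mapping(b)
               for a in system for b in system if (a | b) in members)


if __name__ == "__main__":
    print(is_sp_morphism(h, S1), preserves_unions(h, S1))
```

The second check corresponds to the consequence $h(a \cup b) = h(a) \cup h(b)$ derived from D 1.1.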
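Th. 1.2. If $h$ is an sp-morphism and if
$$\mathcal{J} = \{a \mid a \in \mathcal{S}_1 \text{ and } h(a) = \emptyset\}$$
then $\mathcal{J}$ satisfies the following properties:
(1) $a \in \mathcal{J}$, $b \in \mathcal{S}_1$ and $b \subset a \Rightarrow b \in \mathcal{J}$.
(2) $a_1, a_2 \in \mathcal{J}$, $a_1 \cup a_2 \in \mathcal{S}_1 \Rightarrow a_1 \cup a_2 \in \mathcal{J}$.
Proof. (1) Since $h$ is order preserving, from $h(b) \subset h(a) = \emptyset$ it follows that
$h(b) = \emptyset$.
(2) We obtain $h(a_1 \cup a_2) = h(a_1) \cup h(a_2) = \emptyset \cup \emptyset = \emptyset$.
D 1.2. A subset $\mathcal{J} \subset \mathcal{S}$ is said to be an ideal in $\mathcal{S}$ providing that the
following conditions are satisfied:
(1) $a \in \mathcal{J}$, $b \in \mathcal{S}$ and $b \subset a \Rightarrow b \in \mathcal{J}$.
(2) $a_1, a_2 \in \mathcal{J}$, $a_1 \cup a_2 \in \mathcal{S} \Rightarrow a_1 \cup a_2 \in \mathcal{J}$.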
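The two conditions of D 1.2 translate directly into a finite check. The following sketch assumes, purely for illustration, that the system consists of all subsets of a four-element set and that the sp-morphism is intersection with a hypothetical subset `N`; the kernel of that map is then tested against both conditions.

```python
from itertools import combinations


def subsets(base):
    """All subsets of a finite set, as frozensets."""
    items = sorted(base)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_ideal(candidate, system):
    """Check conditions (1) and (2) of D 1.2 for a finite system."""
    cand, members = set(candidate), set(system)
    if not cand <= members:
        return False
    # (1) every element of the system below an element of the ideal
    #     belongs to the ideal
    for a in cand:
        for b in members:
            if b <= a and b not in cand:
                return False
    # (2) unions of ideal elements that lie in the system belong to the ideal
    for a1 in cand:
        for a2 in cand:
            if (a1 | a2) in members and (a1 | a2) not in cand:
                return False
    return True


S = subsets(range(4))
N = frozenset({0, 1})                      # hypothetical subset
KERNEL = [a for a in S if not (a & N)]     # all a with h(a) empty

if __name__ == "__main__":
    print(is_ideal(KERNEL, S))
```

A set such as `[frozenset({2})]` fails condition (1) because the empty set lies below it but is missing.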
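Th. 1.3. If $h$ is an sp-morphism then, by means of the ideal $\mathcal{J}$
described in Th. 1.2 it is possible to decompose the map $h$ as follows:
$$\mathcal{S}_1 \to \mathcal{S}_1/\mathcal{J} \xrightarrow{\ i\ } \mathcal{S}_2$$
where $i$ is an injection.
Proof. From $h(a_1) = h(a_2)$ it follows that $h(a_1 \cap a_2) = h(a_1) \cap h(a_2) = h(a_1) =
h(a_2)$ and we obtain $h(a_1 \setminus a_1 \cap a_2) = h(a_1) \setminus h(a_1 \cap a_2) = \emptyset$, $h(a_2 \setminus a_1 \cap a_2) = \emptyset$,
that is, $a_1 \setminus a_1 \cap a_2,\ a_2 \setminus a_1 \cap a_2 \in \mathcal{J}$. An equivalence relation $a_1 \sim a_2$ is defined by
$a_1 \setminus a_1 \cap a_2,\ a_2 \setminus a_1 \cap a_2 \in \mathcal{J}$. We obtain the following identity:
$$a_1 \setminus (a_1 \cap a_3) = [(a_1 \setminus a_1 \cap a_2) \setminus ((a_1 \setminus a_1 \cap a_2) \cap a_3)] \cup [a_1 \cap (a_2 \setminus a_2 \cap a_3)].$$
From $a_1 \setminus a_1 \cap a_2 \in \mathcal{J}$ and $a_2 \setminus a_2 \cap a_3 \in \mathcal{J}$ it follows that $a_1 \setminus a_1 \cap a_3 \in \mathcal{J}$. Similarly,
from $a_3 \setminus a_3 \cap a_2 \in \mathcal{J}$ and $a_2 \setminus a_2 \cap a_1 \in \mathcal{J}$ it follows that $a_3 \setminus a_3 \cap a_1 \in \mathcal{J}$.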
2 Morphisms of Statistical Selection Procedures
For many applications the probability function obeys certain laws under sp-
morphisms. We shall now formulate these laws.
If a g S?9 h is an sp-morphism and У is the ideal defined in Th. 1.2, then the
set of all a n a for which ae/isa subset of У which we shall denote by
У(а). We therefore obtain У {a) = ^(a) n У. For У {a) we therefore obtain
ax c= a2, a2 e У (a) => ax e У{а\
al9 a2 g У (a) => ax u a2 e ./(a)
since if a1? a2 g У (a) then it follows that ax\j a2e У since ax и a2 c= a!
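D 2.1. An ideal $\mathcal{J}$ is said to be closed with respect to a statistical selection procedure $\mathcal{S}$ if $\sup_{\tilde a \in \mathcal{J}(a)} \lambda(a, \tilde a) = 1$ implies the relation $a \in \mathcal{J}$.
The condition $\sup_{\tilde a \in \mathcal{J}(a)} \lambda(a, \tilde a) = 1$ means that there exists an $\tilde a \in \mathcal{J}$ for which the probability for the selection $a \setminus \tilde a$ is, for all practical purposes, equal to zero.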
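D 2.2. An sp-morphism $h$ of a statistical selection procedure $\mathcal{S}_1$ in a statistical selection procedure $\mathcal{S}_2$ is said to be an ssp-morphism if the ideal $\mathcal{J}$ is closed and if, for $a_1 \supset a_2$, the following condition holds:
$\lambda_2(h(a_1), h(a_2)) = \dfrac{\alpha_2}{\alpha_1}\,\lambda_1(a_1, a_2)$,
where $\alpha_1, \alpha_2$ are defined as follows:
$\alpha_i = 1 - \sup_{\tilde a \in \mathcal{J}(a_i)} \lambda_1(a_i, \tilde a) = \inf_{\tilde a \in \mathcal{J}(a_i)} \lambda_1(a_i, a_i \setminus \tilde a)$.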
Since the ideal $\mathcal{J}$ is closed, it follows that the condition $h(a) = \emptyset$ is equivalent to the condition
$\alpha = 1 - \sup_{\tilde a \in \mathcal{J}(a)} \lambda_1(a, \tilde a) = 0$.
For $h(a_1) \neq \emptyset$ we therefore have $\alpha_1 \neq 0$; therefore the condition given for $\lambda_1, \lambda_2$ is well defined.
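The condition of D 2.2 may be verified on a finite example. The sketch below is only an illustration under assumptions which do not come from the text: $\mathcal{S}_1$ is taken to be the set of all subsets of a four-element set, $\lambda_1(a, b)$ is assumed to be the ratio of hypothetical positive weights, $h(a) = a \cap N_2$ as in Th. 1.1, and $\lambda_2$ is given by the same weights restricted to $N_2$. Exact fractions are used so that the equality can be tested without rounding.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical positive weights on M1 = {0, 1, 2, 3}.
weights = {0: Fraction(1), 1: Fraction(2), 2: Fraction(3), 3: Fraction(4)}
N2 = frozenset({0, 1})                 # retained part
N1 = frozenset(weights) - N2           # part mapped to the empty set

def w(a):
    return sum((weights[x] for x in a), Fraction(0))

def lam(a, b):
    """Assumed probability function lambda(a, b) for b ⊂ a, a nonempty."""
    assert b <= a and a
    return w(b) / w(a)

def h(a):
    return a & N2

def subsets(a):
    items = sorted(a)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def alpha(a):
    """alpha = 1 - sup of lambda(a, t) over the ideal elements t contained in a."""
    return 1 - max(lam(a, t) for t in subsets(a & N1))

def condition_holds():
    """Check lambda_2(h(a1), h(a2)) = (alpha_2 / alpha_1) * lambda_1(a1, a2)."""
    for a1 in subsets(frozenset(weights)):
        if not h(a1):
            continue                   # alpha(a1) = 0, nothing is required
        for a2 in subsets(a1):
            if not a2:
                continue
            lhs = lam(h(a1), h(a2))
            rhs = alpha(a2) / alpha(a1) * lam(a1, a2)
            if lhs != rhs:
                return False
    return True

print(condition_holds(), alpha(frozenset({0, 2})))
```

In this model $\alpha(a)$ is the relative weight of $a \cap N_2$ in $a$, so that $\alpha(a) = 0$ precisely when $h(a) = \emptyset$, in agreement with the remark above.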
If an ssp-homomorphism is an sp-isomorphism then it is also an ssp-isomorphism, since for each $a \neq \emptyset$ we obtain $\alpha = 1$ and therefore
$\lambda_2(h(a_1), h(a_2)) = \lambda_1(a_1, a_2)$.
Conversely, if $\alpha = 1$ for all $a \neq \emptyset$, then it follows that $\lambda_1(a, \tilde a) = 0$ for all $\tilde a \in \mathcal{J}(a)$. If $a \in \mathcal{J}$ and if $a \neq \emptyset$ then $a \in \mathcal{J}(a)$ and therefore $\lambda_1(a, a) = 1$, in contradiction to $\lambda_1(a, \tilde a) = 0$ for all $\tilde a \in \mathcal{J}(a)$. Therefore $\mathcal{J}$ contains only the null set, that is, the ssp-homomorphism is an ssp-isomorphism.
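D 2.3. A subset $\mathcal{S}_1 \subset \mathcal{S}$ of a selection procedure $\mathcal{S}$ is called a separated part of $\mathcal{S}$ if $\mathcal{S}_1$ is a selection procedure and if, for each pair of elements $a_1 \in \mathcal{S}_1$, $a_2 \in \mathcal{S} \setminus \mathcal{S}_1$, the intersection $a_1 \cap a_2 = \emptyset$.
It is easy to see that if $\mathcal{S}_1$ is a separated part of $\mathcal{S}$ then $\mathcal{S}_2 = \mathcal{S} \setminus \mathcal{S}_1$ is also a separated part.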
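Th. 2.1. Let $h$ be an ssp-morphism of $\mathcal{S}_1$ into $\mathcal{S}_2$. If the relation $\lambda_2(h(a_1), h(a_2)) = \lambda_1(a_1, a_2)$ is satisfied for the case $a_1 \supset a_2$ and $h(a_1) \neq \emptyset$, then $\mathcal{J}$ is a separated part of $\mathcal{S}_1$ and $h$ is an ssp-isomorphism of $\mathcal{S}_1' = \mathcal{S}_1 \setminus \mathcal{J}$ onto a partial selection procedure $h\mathcal{S}_1 = \mathcal{S}_2' \subset \mathcal{S}_2$.
Proof. Let $a \notin \mathcal{J}$ and $\tilde a \in \mathcal{J}$. If $a \cap \tilde a \neq \emptyset$ then, since $h(a \cap \tilde a) = \emptyset$, we obtain
$0 = \lambda_2(h(a), h(a \cap \tilde a)) = \lambda_1(a, a \cap \tilde a) \neq 0$,
which is a contradiction. Therefore $\mathcal{J}$ is a separated part of $\mathcal{S}_1$.
For $a \in \mathcal{S}_1'$ we have $h(a) \neq \emptyset$ for $a \neq \emptyset$, and $h$ is an sp-isomorphism of $\mathcal{S}_1'$ onto $\mathcal{S}_2' = h\mathcal{S}_1' = h\mathcal{S}_1$. As a result the probabilities are invariant; therefore $h$ is an ssp-isomorphism of $\mathcal{S}_1'$ onto $\mathcal{S}_2'$.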
3 Morphisms of Preparation and Registration Procedures
We now turn from the general case to the case of the preparation and
registration procedures which are important for quantum mechanics.
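We shall now assume that we are given $M_1, \mathcal{Q}_1, \mathcal{R}_{01}, \mathcal{R}_1$ and $M_2, \mathcal{Q}_2, \mathcal{R}_{02}, \mathcal{R}_2$.
D 3.1. An ssp-morphism of $\mathcal{Q}_1$ into $\mathcal{Q}_2$, where $\mathcal{Q}_1$ and $\mathcal{Q}_2$ are statistical selection procedures, is called a preparation morphism (abbreviated p-morphism). By analogy with D 2.2 we define $\alpha(a) = \inf_{\tilde a \in \mathcal{J}(a)} \lambda_{\mathcal{Q}_1}(a, a \setminus \tilde a)$, where $\mathcal{J}(a)$ is defined in §2.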
The p-morphisms and the p-automorphisms shall play a particularly
important role.
D 3.2. A p-morphism $h$ will be said to be recording-invariant (r-invariant) if $\varphi_1(a_1) = \varphi_1(a_2)$ implies that $\varphi_2(h(a_1)) = \varphi_2(h(a_2))$ and $\alpha(a_1) = \alpha(a_2)$; here $\varphi_1$ and $\varphi_2$ are defined according to III, D 3.1.
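Th. 3.1. For an r-invariant p-morphism $h$ a map $S$ on $\mathcal{K}_1 = \varphi_1\mathcal{Q}_1'$ is defined by $S\varphi_1(a) = \alpha(a)\,\varphi_2(h(a))$; for each decomposition $a = \bigcup_{\nu=1}^{n} a_\nu$ with $a_\nu \cap a_\mu = \emptyset$ for $\nu \neq \mu$ it satisfies
$S\varphi_1(a) = \sum_\nu \lambda_\nu\, S\varphi_1(a_\nu)$,
where $\lambda_\nu = \lambda_{\mathcal{Q}_1}(a, a_\nu)$. If $h$ is a p-isomorphism, then $\alpha(a) = 1$ and
$\varphi_1\mathcal{Q}_1' = \mathcal{K}_1 \xrightarrow{S} \mathcal{K}_2 \subset K_2$.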
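Proof. Clearly $S$ is well defined. From $a = \bigcup_{\nu=1}^{n} a_\nu$ it follows that $h(a) = \bigcup_{\nu=1}^{n} h(a_\nu)$. Since $a_\nu \cap a_\mu = \emptyset$ for $\nu \neq \mu$ we also obtain $h(a_\nu) \cap h(a_\mu) = \emptyset$. By definition we have
$S\varphi_1(a) = \alpha(a)\,\varphi_2(h(a))$.
For a decomposition $a = \bigcup_{\nu=1}^{n} a_\nu$ it follows that
$\varphi_2(h(a)) = \sum_\nu \lambda_{\mathcal{Q}_2}(h(a), h(a_\nu))\,\varphi_2(h(a_\nu))$,
and
$S\varphi_1(a) = \alpha(a)\,\varphi_2(h(a)) = \alpha(a) \sum_\nu \lambda_{\mathcal{Q}_2}(h(a), h(a_\nu))\,\varphi_2(h(a_\nu))$.
According to D 2.2 we have
$\alpha(a)\,\lambda_{\mathcal{Q}_2}(h(a), h(a_\nu)) = \alpha(a_\nu)\,\lambda_{\mathcal{Q}_1}(a, a_\nu)$
and we obtain
$S\varphi_1(a) = \sum_\nu \lambda_{\mathcal{Q}_1}(a, a_\nu)\,\alpha(a_\nu)\,\varphi_2(h(a_\nu)) = \sum_\nu \lambda_{\mathcal{Q}_1}(a, a_\nu)\,S\varphi_1(a_\nu)$.
From $\varphi_1(a) = \sum_\nu \lambda_{\mathcal{Q}_1}(a, a_\nu)\,\varphi_1(a_\nu)$ (according to III, Th. 3.1 and the equivalence of III (3.2) and III (6.1)) we obtain the desired result.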
The map $S$ defined in Th. 3.1 is (according to III, Th. 3.2) a rational affine map of the rational convex set $\varphi_1\mathcal{Q}_1'$ (which is dense in $K_1$) into $K_2$.
D 3.3. We shall call an sp-morphism $h$ of $\mathcal{R}_1$ into $\mathcal{R}_2$ for which the restriction to $\mathcal{R}_{01}$ is an ssp-isomorphism of $\mathcal{R}_{01}$ onto $\mathcal{R}_{02}$ a recording morphism (abbreviation: r-morphism).
From an r-morphism h we may easily obtain a (canonical) map of effect
processes; this map we shall also denote by h:
D 3.4. For $(b_0, b) \in \mathcal{F}_1$ we define the map $\mathcal{F}_1 \xrightarrow{h} \mathcal{F}_2$ by $h(b_0, b) = (h(b_0), h(b))$.
D 3.5. An r-morphism $h$ is said to be preparation-invariant (p-invariant) if $\psi_1(f_1) = \psi_1(f_2)$ implies that $\psi_2(h(f_1)) = \psi_2(h(f_2))$. $\psi_1$ and $\psi_2$ correspond to III, D 3.1.
Th. 3.2. For a p-invariant r-morphism $h$ a map $\psi_1\mathcal{F}_1 \xrightarrow{T} L_2$ is defined by $T\psi_1(f) = \psi_2(h(f))$ which satisfies $T1 = 1$. For the decomposition of the unit effect $1$ (see III, D 4.5.5)
lb„ = Ш
i
we obtain
$1 = \sum_i \psi_1(f_i)$ and $1 = \sum_i T\psi_1(f_i)$.
Proof. The proof is similar to the proof of Th. 3.1.
According to III, Th. 2.4, $T$ is a rational affine map of the rational convex set $\psi_1\mathcal{F}_1$ in $L_2$.
In many physical applications an r-morphism is not only p-invariant but, in addition, if $f_1, f_2$ are hardly distinguishable by testing with preparation procedures (even if $\psi(f_1) \neq \psi(f_2)$) then the same is true for the images $h(f_1)$ and $h(f_2)$. We therefore define:
D 3.6. An r-morphism $h$ is said to be preparation-continuous (p-continuous) if to each $\varepsilon > 0$ and $a \in \mathcal{Q}_2'$ there exists a $\delta > 0$ and a finite number of $a_i \in \mathcal{Q}_1'$ such that
$|\mu_2(\varphi_2(a), \psi_2(h(f))) - \mu_2(\varphi_2(a), \psi_2(h(\tilde f)))| < \varepsilon$
whenever
$|\mu_1(\varphi_1(a_i), \psi_1(f)) - \mu_1(\varphi_1(a_i), \psi_1(\tilde f))| < \delta$
for all $a_i$. An r-isomorphism is said to be a p-continuous r-isomorphism if both $h$ and $h^{-1}$ are p-continuous.
It is easy to see that a p-continuous r-morphism is also p-invariant.
An analogous definition can also be made for p-morphisms:
D 3.7. A p-morphism $h$ is said to be recording-continuous (r-continuous) if to each $\varepsilon > 0$ and $f \in \mathcal{F}_2$ there exists a $\delta > 0$ and a finite number of $f_i \in \mathcal{F}_1$ such that
$|\mu_2(\alpha(a)\varphi_2(h(a)), \psi_2(f)) - \mu_2(\alpha(\tilde a)\varphi_2(h(\tilde a)), \psi_2(f))| < \varepsilon$
whenever
$|\mu_1(\varphi_1(a), \psi_1(f_i)) - \mu_1(\varphi_1(\tilde a), \psi_1(f_i))| < \delta$
for all $f_i$. A p-isomorphism is said to be an r-continuous p-isomorphism if both $h$ and $h^{-1}$ are r-continuous.
Again it is easy to see that an r-continuous p-morphism is also r-invariant.
D 3.8. A p-isomorphism $h\colon \mathcal{Q}_1 \to \mathcal{Q}_2$ is said to be dual to an r-isomorphism $h'\colon \mathcal{R}_2 \to \mathcal{R}_1$ if $(h(a), b_0) \in C_2$ is equivalent to $(a, h'(b_0)) \in C_1$ and if
$\mu_2(h(a), (b_0, b)) = \mu_1(a, h'(b_0, b))$. (3.1)
Here $C_1$ and $C_2$ are defined by analogy with $C$ in II (4.3.1).
Th. 3.3. If a p-isomorphism $h\colon \mathcal{Q}_1 \to \mathcal{Q}_2$ and an r-isomorphism $h'\colon \mathcal{R}_2 \to \mathcal{R}_1$ are dual, then $h^{-1}$ and $h'^{-1}$ are also dual.
Proof. From $(h^{-1}(a), b_0) \in C_1$ it follows, for $a' = h^{-1}(a)$ and $b_0' = h'^{-1}(b_0)$, that $(a', h'(b_0')) \in C_1$ and, according to D 3.8, $(h(a'), b_0') \in C_2$, that is, $(a, h'^{-1}(b_0)) \in C_2$. In the same way it follows from $(a, h'^{-1}(b_0)) \in C_2$ that $(h^{-1}(a), b_0) \in C_1$.
From (3.1) it follows that for $a' = h^{-1}(a)$ and $b_0' = h'^{-1}(b_0)$, $b' = h'^{-1}(b)$
$\mu_1(h^{-1}(a), (b_0, b)) = \mu_1(a', h'(b_0', b')) = \mu_2(h(a'), (b_0', b')) = \mu_2(a, h'^{-1}(b_0, b))$.
Th. 3.4 (H. Neumann). If a p-isomorphism $h$ and an r-isomorphism $h'$ are dual, then $h$ is an r-continuous p-isomorphism and $h'$ is a p-continuous r-isomorphism.
Proof. According to Th. 3.3 we need only show that $h$ is r-continuous and that $h'$ is p-continuous.
Since $h$ is a p-isomorphism, we find that $\alpha(a) = 1$. According to D 3.7 it suffices to show that
$\mu_2(\varphi_2(h(a)), \psi_2(b_0, b)) = \mu_1(\varphi_1(a), \psi_1(h'(b_0, b)))$. (3.2)
First we shall show that $h$ is r-invariant. Suppose that $a' \sim a$ and that $h(a') \nsim h(a)$. Then there exists a $(b_0, b)$ such that $(h(a'), b_0) \in C_2$, $(h(a), b_0) \in C_2$ and
$\mu_2(h(a'), (b_0, b)) \neq \mu_2(h(a), (b_0, b))$.
Thus it follows that $(a', h'(b_0)) \in C_1$ and $(a, h'(b_0)) \in C_1$ and $\mu_1(a', h'(b_0, b)) \neq \mu_1(a, h'(b_0, b))$, which contradicts $a' \sim a$.
According to APS 5.1.4 there exists an $a' \in \varphi_1(a)$ satisfying $(a', h'(b_0)) \in C_1$. Then, according to D 3.8,
$\mu_1(\varphi_1(a), \psi_1(h'(b_0, b))) = \mu_1(a', h'(b_0, b)) = \mu_2(h(a'), (b_0, b)) = \mu_2(\varphi_2(h(a')), \psi_2(b_0, b))$.
Since $h$ is r-invariant, from $a' \sim a$ it follows that $h(a') \sim h(a)$, that is, $\varphi_2(h(a')) = \varphi_2(h(a))$.
According to D 3.6 the relation (3.2) suffices to show that $h'$ is p-continuous (observe that $h'$ is a map $\mathcal{R}_2 \to \mathcal{R}_1$ and not $\mathcal{R}_1 \to \mathcal{R}_2$ as in D 3.6!).
4 Morphisms of Ensembles and Effects
Since r-invariant p-morphisms and p-invariant r-morphisms always occur in
applications, it is understandable that our emphasis in the investigation of
morphisms in quantum mechanics will be concerned with morphisms of
ensembles and effects.
4.1 Morphisms of Ensembles
D 4.1.1. An affine mapping $S$ of $K_1$ into $K_2$ is called a mixture morphism (mi-morphism).
D 4.1.2. An affine map $S$ of $\check{K}_1$ into $\check{K}_2$ is called an operation.
Th. 4.1.1. A rational affine and norm-continuous map $S$ of a (rational affine) set $\mathcal{K}_1$ which is dense in $K_1$ into $\check{K}_2$ may be uniquely extended to an operation $\check{K}_1 \to \check{K}_2$.
Proof. Since $S$ is norm-continuous, $S$ may be extended as an affine mapping onto the norm closure of $\mathcal{K}_1$, and therefore onto the whole of $K_1$.
Since a $\check{w} \in \check{K}_1$ may be written in the form $\lambda w$ where $0 \le \lambda \le 1$, $w \in K_1$, and $S$ is affine in $K_1$, $S$ may be extended onto all of $\check{K}_1$ by means of the equation $S(\lambda w) = \lambda S(w)$. Thus this extension of $S$ is an operation $\check{K}_1 \to \check{K}_2$.
Th. 4.1.2. An operation $S$ of $\check{K}_1$ into $\check{K}_2$ may be uniquely extended as a linear mapping of $\mathcal{B}_1$ in $\mathcal{B}_2$ with norm $\|S\| \le 1$. Every mixture morphism $K_1 \xrightarrow{S} K_2$ has a unique extension as a linear map of $\mathcal{B}_1$ in $\mathcal{B}_2$ for which $\|S\| = 1$; in particular, every mixture morphism can be extended in this way to an operation.
Every positive norm-continuous linear map $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ with $\|S\| \le 1$ (restricted to $\check{K}_1$) is an operation. Every positive linear map $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ is norm-continuous and $\|S\|^{-1}S$ is an operation.
Proof. Since $\mathcal{B}_1$ is spanned by $K_1$, $S$ can be extended to $\mathcal{B}_1$. For $w_1 \in K_1$, since $Sw_1 \in \check{K}_2$, we have $\|Sw_1\| \le 1$, and for $x = \alpha w_1 - \beta w_2$ with $\|x\| = \alpha + \beta$ we may conclude that $\|Sx\| \le \|x\|$; it follows that $S$ is norm-continuous and $\|S\| \le 1$. In this way we find that every positive map satisfying $\|S\| \le 1$ is an operation, because $\check{K}$ is the intersection of the unit sphere with the positive cone.
Since every positive linear map is norm-continuous, this result holds in general (see AIII, §6).
Thus we see that a bijective mixture morphism $K_1 \xrightarrow{S} K_2$ is a mixture isomorphism, that is, $S^{-1}$ is a mixture morphism. Thus it follows that a bijective operation $\check{K}_1 \xrightarrow{S} \check{K}_2$ is a mixture isomorphism, that is, $K_1 \xrightarrow{S} K_2$.
Th. 4.1.3. To each operation $S$ there exists a dual map $S'$ of $\mathcal{B}_2'$ in $\mathcal{B}_1'$ for which $L_2 \xrightarrow{S'} L_1$. $S'$ is $\sigma(\mathcal{B}_2', \mathcal{B}_2)$-$\sigma(\mathcal{B}_1', \mathcal{B}_1)$ continuous; $S$ is a mixture morphism if and only if $S'1 = 1$.
Proof. The fact that $S'$ exists and is $\sigma(\mathcal{B}_2', \mathcal{B}_2)$-$\sigma(\mathcal{B}_1', \mathcal{B}_1)$-continuous follows from Th. 4.1.2, that is, from the fact that $S$ is norm-continuous (see AIII, §5). From $K_1 \xrightarrow{S} K_2$ it follows that $\mu_2(Sw, 1) = 1 = \mu_1(w, S'1)$ holds for all $w \in K_1$, and we therefore obtain $S'1 = 1$. If $S'1 = 1$ then, for all $w \in K_1$, it follows that $\mu_2(Sw, 1) = \mu_1(w, S'1) = \mu_1(w, 1) = 1$ and we therefore obtain $Sw \in K_2$.
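The content of Th. 4.1.3 can be made concrete in a finite classical model. The sketch below is an illustration under assumptions that are not part of the text: ensembles are probability vectors, effects are vectors with components in $[0, 1]$, $\mu(w, g)$ is the ordinary scalar product, an operation is a matrix with nonnegative entries whose column sums do not exceed one, and the dual map is the transposed matrix. The numerical entries are hypothetical.

```python
from fractions import Fraction as F

# Columns index the three input states, rows the two output states.
S_op = [[F(1, 2), F(0), F(1, 4)],
        [F(1, 2), F(1), F(1, 4)]]    # last column sums to 1/2: operation only
S_mix = [[F(1, 2), F(0), F(1, 4)],
         [F(1, 2), F(1), F(3, 4)]]   # all columns sum to 1: mixture morphism

def apply(S, w):
    """(S w)_i = sum_j S[i][j] * w[j]."""
    return [sum(s * v for s, v in zip(row, w)) for row in S]

def dual_apply(S, g):
    """(S' g)_j = sum_i S[i][j] * g[i], the transposed action on effects."""
    return [sum(S[i][j] * g[i] for i in range(len(S)))
            for j in range(len(S[0]))]

def mu(w, g):
    return sum(a * b for a, b in zip(w, g))

unit = [F(1), F(1)]                  # the unit effect on the output side
w = [F(1, 3), F(1, 3), F(1, 3)]      # an ensemble on the input side
g = [F(1, 2), F(1)]                  # an effect on the output side

left = mu(apply(S_op, w), g)         # mu_2(S w, g)
right = mu(w, dual_apply(S_op, g))   # mu_1(w, S' g)
print(left == right, dual_apply(S_op, unit), dual_apply(S_mix, unit))
```

In this model $S'1$ is the vector of column sums, so the criterion $S'1 = 1$ singles out exactly the matrices that preserve normalization.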
Th. 4.1.4. For an mi-morphism $S$ the following statements are equivalent:
(i) $S$ is a mixture isomorphism.
(ii) $K_1 \xrightarrow{S} K_2$ is injective and $SK_1$ is dense in norm in $K_2$.
(iii) $L_2 \xrightarrow{S'} L_1$ is bijective.
(iv) $S'$ is an isomorphic map of the Banach spaces.
(v) $S$ is an isomorphic map of the Banach spaces.
Proof. (i) ⇒ (ii) is trivial.
(ii) ⇒ (iii). Since $SK_1$ is norm-dense in $K_2$ it follows that $(S\mathcal{B}_1)^{\perp} = 0$, which is equivalent to the condition that $S'$ is injective. Since $SK_1$ is norm-dense in $K_2$ it follows that
$\|S'y\| = \sup_{w \in K_1} |\mu_1(w, S'y)| = \sup_{w \in K_1} |\mu_2(Sw, y)| = \sup_{w \in K_2} |\mu_2(w, y)| = \|y\|$,
that is, $S'$ is norm-preserving. Thus it follows that $S'\mathcal{B}_2'$ is a norm-closed subspace of $\mathcal{B}_1'$.
We shall now show that if $K_1 \xrightarrow{S} K_2$ is injective then $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ is also injective: Every $x \in \mathcal{B}_1$ can be written in the form $x = \alpha w_1 - \beta w_2$, where $\alpha, \beta \ge 0$ and $w_1, w_2 \in K_1$. Then from $0 = Sx = \alpha Sw_1 - \beta Sw_2$ and from $Sw_1, Sw_2 \in K_2$ it follows that $\alpha = \beta$ and that $Sw_1 = Sw_2$. Thus it follows that $w_1 = w_2$ and finally $x = 0$.
Since $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ is injective, it follows that $S'\mathcal{B}_2'$ is $\sigma$-dense in $\mathcal{B}_1'$. Since the unit sphere $[-1, 1]$ is $\sigma$-compact, the set $A = S'[-1, 1]$ is therefore compact and convex. Since $S'$ preserves the norm we obtain $A = S'\mathcal{B}_2' \cap [-1, 1]$, and $S'\mathcal{B}_2'$ is therefore also $\sigma$-closed (see AIII, §4), so that $S'\mathcal{B}_2' = \mathcal{B}_1'$ and $S'[-1, 1] = [-1, 1]$. Since, according to Th. 4.1.3, $S'1 = 1$ and $L = \tfrac{1}{2}(1 + [-1, 1])$, it follows that $S'L_2 = L_1$ and (iii) is satisfied.
(iii) ⇒ (iv). If $S'$ is a bijective map of $L_2$ onto $L_1$ then (since, according to Th. 4.1.3, $S'1 = 1$ and $[-1, 1] = 2L - 1$) it follows that $S'$ is also a bijective map of the unit spheres onto each other. Hence (iv) holds.
(iv) ⇒ (v). We shall now show that $(S')^{-1}$ is $\sigma$-continuous: $(S')^{-1}$, as the inverse of the $\sigma$-continuous bijective map of the $\sigma$-compact unit sphere, is $\sigma$-continuous on the unit sphere and is therefore $\sigma$-continuous everywhere (see AIII, §5). Thus it follows that $S^{-1}$ exists and satisfies $(S^{-1})' = (S')^{-1}$. Since $S'$ maps the unit sphere bijectively, we obtain $\|Sx\| = \|x\|$, which proves (v).
(v) ⇒ (i). The existence of $(S')^{-1}$ and $(S')^{-1} = (S^{-1})'$ is clear. Then, according to Th. 4.1.3, $S'1 = 1$ and we obtain $(S')^{-1}1 = 1$. Let $w \in K_2$; since $SK_1 \subset K_2$, from $\|S^{-1}w\| = \|w\| = 1$ it follows that $\mu_1(S^{-1}w, 1) = \mu_2(w, 1) = 1$, and we obtain $S^{-1}K_2 \subset K_1$.
D 4.1.3. An operation (or mixture morphism) is said to be $\mathcal{D}$-continuous when, as a map $\check{K}_1 \to \check{K}_2$, it is continuous with respect to the topologies $\sigma(\check{K}_1, \mathcal{D}_1)$, $\sigma(\check{K}_2, \mathcal{D}_2)$; the spaces $\mathcal{D}$ are defined in III, §4 and §5.
As a mapping $K_1 \to K_2$ a $\mathcal{D}$-continuous mixture morphism may be naturally extended as a $\mathcal{D}$-continuous operation.
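Th. 4.1.5. Every $\mathcal{D}$-continuous rational affine map $\mathcal{K}_1 \xrightarrow{S} \mathcal{K}_2$ can be extended to a mapping $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ which is norm-continuous and, as a mapping $K_1 \xrightarrow{S} K_2$, is also $\mathcal{D}$-continuous.
Proof. By definition, the norm in $\mathcal{D}'$ is equal to
$\|x\| = \sup\{\mu(x, y) \mid \|y\| \le 1,\ y \in \mathcal{D}\}$.
Since $\mathcal{D} \cap L$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$ (see III, §5) and since the set $\{y \mid \|y\| \le 1,\ y \in \mathcal{D}\}$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in the unit sphere of $\mathcal{B}'$, for elements of $\mathcal{B} \subset \mathcal{D}'$ the norm in $\mathcal{D}'$ is identical to the norm in $\mathcal{B}$.
We shall now show that the unit sphere of $\mathcal{D}'$ is equal to the convex set generated by $K^{\sigma} \cup (-K^{\sigma})$.
The unit sphere in $\mathcal{D}'$ is the set which is bipolar to the set $K \cup (-K)$ in the dual pair $(\mathcal{D}', \mathcal{D})$. Therefore the unit sphere of $\mathcal{D}'$ is equal to the $\sigma(\mathcal{D}', \mathcal{D})$-closed convex set generated by $K^{\sigma} \cup (-K^{\sigma})$. Since $K^{\sigma}$ and $(-K^{\sigma})$ are $\sigma(\mathcal{D}', \mathcal{D})$-compact (as bounded closed sets in $\mathcal{D}'$; see AIII, §4) the convex set generated by $K^{\sigma} \cup (-K^{\sigma})$ is already compact, and is therefore $\sigma(\mathcal{D}', \mathcal{D})$-closed, that is, equal to the unit sphere of $\mathcal{D}'$.
Let $1 \in \mathcal{D}$; since the relation $\mu(w, 1) = 1$ for all $w \in K^{\sigma}$ follows from the same relation for all $w \in K$, it follows that $\|w\| = 1$ for all $w \in K^{\sigma}$.
Since the closures $K_1^{\sigma}$ of $K_1$ in $\mathcal{D}_1'$ and $K_2^{\sigma}$ of $K_2$ in $\mathcal{D}_2'$ are compact, $S$ may be extended as a $\mathcal{D}$-continuous map $K_1^{\sigma} \xrightarrow{S} K_2^{\sigma}$, where this extension becomes an affine map. Since $\mathcal{B}_1$ is the linear span of $K_1$, $S$ may be extended to a map $\mathcal{B}_1 \to \mathcal{D}_2'$.
For $x \in \mathcal{B}_1$ and $x = \alpha w_1 - \beta w_2$, where $w_1, w_2 \in K_1$ and $\|x\| = \alpha + \beta$, it follows that
$\|Sx\| \le \alpha\|Sw_1\| + \beta\|Sw_2\| \le \alpha + \beta$
because the norm $\|w'\| \le 1$ for elements $w' \in K_2^{\sigma}$. $S$ is therefore norm-continuous as a map $\mathcal{B}_1 \to \mathcal{D}_2'$.
Since $K_1$ is norm-complete, $\mathcal{K}_1$ is norm-dense in $K_1$ and $K_2$ is norm-complete, from $\mathcal{K}_1 \xrightarrow{S} K_2$ we obtain $K_1 \xrightarrow{S} K_2$, that is, $S$ is an operation. According to Th. 4.1.1 and Th. 4.1.2 it follows that $S$ is a norm-continuous map $\mathcal{B}_1 \to \mathcal{B}_2$.
Since $K_1^{\sigma} \xrightarrow{S} K_2^{\sigma}$ is $\mathcal{D}$-continuous, the map $K_1 \xrightarrow{S} K_2$ is $\mathcal{D}$-continuous.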
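Th. 4.1.6. For a $\mathcal{D}$-continuous affine map $K_1 \xrightarrow{S} K_2$ the extension of $S$ on $\mathcal{B}_1$ is continuous in the topologies $\sigma(\mathcal{B}_1, \mathcal{D}_1)$ and $\sigma(\mathcal{B}_2, \mathcal{D}_2)$ or, equivalently, $S'$ maps the subspace $\mathcal{D}_2$ of $\mathcal{B}_2'$ in $\mathcal{D}_1$.
Proof. First we shall show that each element $x \in \mathcal{D}'$ may be written in the form $x = \alpha w_1 - \beta w_2$, where $w_1, w_2 \in K^{\sigma}$ and $\|x\| = \alpha + \beta$. If $x' = \|x\|^{-1}x$ then $x'$ is in the unit sphere of $\mathcal{D}'$ and, since $\mathrm{co}(K^{\sigma} \cup (-K^{\sigma}))$ is the unit sphere in $\mathcal{D}'$, is therefore of the form $x' = \lambda w_1 - (1 - \lambda)w_2$, where $0 \le \lambda \le 1$ and $w_1, w_2 \in K^{\sigma}$. Thus it follows that $x = \alpha w_1 - \beta w_2$ with $\alpha = \|x\|\lambda$, $\beta = \|x\|(1 - \lambda)$, that is, $\alpha + \beta = \|x\|$. Thus it follows that the set of all $\alpha w_1 - \beta w_2$, where $w_1, w_2 \in K$, is $\sigma(\mathcal{D}', \mathcal{D})$ dense in $\mathcal{D}'$, that is, $\mathcal{B}$ is $\sigma(\mathcal{D}', \mathcal{D})$ dense in $\mathcal{D}'$.
We may define a $\sigma(\mathcal{D}_1', \mathcal{D}_1)$-continuous affine function on $K_1$ by means of the equation $l(w) = \mu_2(Sw, y)$ (for fixed $y \in \mathcal{D}_2$) because $S$ is continuous in the topologies $\sigma(\mathcal{D}_1', \mathcal{D}_1)$ on $K_1$ and $\sigma(\mathcal{D}_2', \mathcal{D}_2)$ on $K_2$. In this way we may obtain an extension of $l(w)$ as a $\sigma$-continuous affine function on all of $K_1^{\sigma}$.
Since each $x \in \mathcal{D}_1'$ is of the form $x = \alpha w_1 - \beta w_2$, where $\|x\| = \alpha + \beta$ and $w_1, w_2 \in K_1^{\sigma}$, $l$ may be extended on all of $\mathcal{D}_1'$ as a norm-continuous linear form because
$|l(x)| \le \alpha|l(w_1)| + \beta|l(w_2)| \le (\alpha + \beta) \sup_{w \in K_1^{\sigma}} |l(w)| = \|x\| \sup_{w \in K_1^{\sigma}} |l(w)|$.
We shall now show that $l$ is, in addition, $\sigma(\mathcal{D}_1', \mathcal{D}_1)$-continuous as a linear form over $\mathcal{D}_1'$. Here $l$ is $\sigma(\mathcal{D}_1', \mathcal{D}_1)$-continuous if $l$ is $\sigma(\mathcal{D}_1', \mathcal{D}_1)$-continuous on the unit sphere of $\mathcal{D}_1'$ (see AIII, §4). Thus we need only show that $|l(x)| < \delta$ for $\|x\| \le 1$ and $x$ in a suitable $\sigma(\mathcal{D}_1', \mathcal{D}_1)$-neighborhood of 0.
We may write $x$ in the form $x = \alpha w_1 - \beta w_2$ where $\alpha + \beta = \|x\| \le 1$ and $w_1, w_2 \in K_1^{\sigma}$. Thus it follows that
$x = \tfrac{1}{2}(\alpha + \beta)(w_1 - w_2) + (\alpha - \beta)(\tfrac{1}{2}w_1 + \tfrac{1}{2}w_2)$.
Since $\mu_1(w, 1) = 1$ for all $w \in K_1^{\sigma}$, it follows that
$x = \tfrac{1}{2}\|x\|(w_1 - w_2) + \mu_1(x, 1)(\tfrac{1}{2}w_1 + \tfrac{1}{2}w_2)$.
Thus
$|l(x)| \le \tfrac{1}{2}\|x\|\,|l(w_1 - w_2)| + \lambda\,|\mu(x, 1)|$,
where $\lambda = \sup_{w \in K_1^{\sigma}} |l(w)|$.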
We shall now choose the g(3\9 3X) neighborhood of 0 as follows: Since I is
a(3'l9 ^J-continuous in K\, there are finitely many yt e 31 for which
|l(w — w)| < <5/2 providing that \p(w — w, yt)| <1. For e = <5/22 we define
yt = (l/e)y*. We choose the c(3'l9 3X) neighborhoods of 0 as follows: \p(x9 y,)| < *
for all i and \p(x9 1)| < с/4, where a is the smallest of the numbers \\yt\\ and e. We
will now prove that \l(x)\ < S if ||x|| < 1 and if x lies in the specified neighborhood
ofO:
(a) If ||x|| < e, then since 1/^ — w2)| < 22 it follows that
|/(x)| < sX Л—X < 2sX = <5.
4
(b) If ||x|| > e then, since ||x|| < 1, it follows that
\l(x)\ < - Wl)\ + 4<^ ^ Ш™1 - w2)| + el
and from
Кх, yd = - w2, у,) + ц(х, l^K + w2, yd
it follows that
2 1
\Kwi - w2) у()| < -r—r\Kx, 3>i)l + Mx, 1)1 + w2. J>i)l
Wl \\x\\
2 л 11 11111
^ -Ых, Л)1 + то-ЧуА < —z + = -»
e 4 s s 2 2 s s
that is, \p(w1 — w2, y*)| < 1. Therefore we obtain 1/^ — w2)| < d/2 and
<5
|/(x)| < - + sX = <5.
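Thus we have shown that $l$ is $\sigma(\mathcal{D}_1', \mathcal{D}_1)$-continuous in all of $\mathcal{D}_1'$, and that there exists a $\tilde y \in \mathcal{D}_1$ for which
$l(x) = \mu_2(Sx, y) = \mu_1(x, \tilde y) = \mu_1(x, S'y)$,
from which it follows that $S'y \in \mathcal{D}_1$ for $y \in \mathcal{D}_2$, that is, $S'\mathcal{D}_2 \subset \mathcal{D}_1$. This is equivalent to the condition that $S$ is, as a map $\mathcal{D}_1' \to \mathcal{D}_2'$, continuous in the topologies $\sigma(\mathcal{D}_1', \mathcal{D}_1)$ and $\sigma(\mathcal{D}_2', \mathcal{D}_2)$.
Since $\mathcal{B}_1 \subset \mathcal{D}_1'$ and $\mathcal{B}_2 \subset \mathcal{D}_2'$, the same statement holds for the map $S\colon \mathcal{B}_1 \to \mathcal{B}_2$.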
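Th. 4.1.7. Let $h$ be an r-continuous p-morphism. A rational affine and $\mathcal{D}$-continuous map $\mathcal{K}_1 \xrightarrow{S} K_2$ is defined by the equation $S\varphi_1(a) = \alpha(a)\varphi_2(h(a))$. This map may be extended to an affine $\mathcal{D}$-continuous map $K_1 \xrightarrow{S} K_2$. The continuation $S$ is, as a map $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$, both norm-continuous and $\mathcal{D}$-continuous, that is, $S'\mathcal{D}_2 \subset \mathcal{D}_1$. If $h$ is a p-isomorphism, then $K_1 \xrightarrow{S} K_2$ is bijective.
Proof. From Th. 3.1 and from the conclusions following Th. 3.1 it follows that $\mathcal{K}_1 \xrightarrow{S} K_2$ is a rational affine map. From the fact that $h$ is r-continuous and the fact that the topologies $\sigma(K, \mathcal{D})$ and $\sigma(K, \mathcal{L})$ are (with $\mathcal{L} = \psi\mathcal{F}$) identical (since $\mathcal{L}$ is norm-dense in $L \cap \mathcal{D}$) it follows that $\mathcal{K}_1 \xrightarrow{S} K_2$ is $\mathcal{D}$-continuous. According to Th. 4.1.5 it follows that $K_1 \xrightarrow{S} K_2$ is affine and $\mathcal{D}$-continuous and that $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ is norm-continuous. According to Th. 4.1.6 it follows that $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ is also $\mathcal{D}$-continuous and $S'\mathcal{D}_2 \subset \mathcal{D}_1$.
If $h$ is a p-isomorphism, then, according to Th. 3.1, $\mathcal{K}_1 \xrightarrow{S} K_2$. Since $h$ is a p-isomorphism, we obtain $S\mathcal{K}_1 = \mathcal{K}_2$. For $h^{-1}$ there is an $\tilde S$ with $\tilde S\mathcal{K}_2 = \mathcal{K}_1$ and $S\tilde S = 1$, $\tilde S S = 1$ on $\mathcal{K}_2$ (resp. $\mathcal{K}_1$). Since $S$ and $\tilde S$ are norm-continuous, and $\mathcal{K}_1$, $\mathcal{K}_2$ are norm-dense in $K_1$ and $K_2$, respectively, for the extensions we obtain $K_1 \xrightarrow{S} K_2$, $K_2 \xrightarrow{\tilde S} K_1$ and $S\tilde S = 1$, $\tilde S S = 1$.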
4.2 Morphisms of Effects
D 4.2.1. Let $T$ be a mapping of $L_1$ into $L_2$; if the relation
$T(g_1 + g_2) = Tg_1 + Tg_2$
is satisfied for all $g_1, g_2, g_1 + g_2 \in L_1$, then we shall call the map $T$ an effect morphism. $T$ is said to be $\mathcal{B}$-continuous if it is continuous as a map $L_1 \to L_2$ in the topologies $\sigma(\mathcal{B}_1', \mathcal{B}_1)$, $\sigma(\mathcal{B}_2', \mathcal{B}_2)$.
Th. 4.2.1. A mapping $T$ of $L_1$ into $L_2$ which satisfies the relation $T(g_1 + g_2) = Tg_1 + Tg_2$ for $g_1, g_2, g_1 + g_2 \in L_1$ has a unique extension to $\mathcal{B}_1'$ and satisfies $\|T\| \le 1$. $T$ is positive.
Proof. From $g_1, g_2 \in L_1$ and $g_2 \ge g_1$ it follows that $g_2 = g_1 + (g_2 - g_1)$ and that $Tg_2 = Tg_1 + T(g_2 - g_1)$, that is, $Tg_2 \ge Tg_1$ and $T(g_2 - g_1) = Tg_2 - Tg_1$. For $g \in L_1$ and $ng \in L_1$ (for integer values of $n$) we obtain, by induction, $T(ng) = nTg$. For integer $m$ and $g_1 \in L_1$ we obtain $g = g_1/m \in L_1$ and $g_1 = mg$; therefore we find that $Tg_1 = mTg$, that is, $T(g_1/m) = (Tg_1)/m$. Thus, for $ng/m \in L_1$ we obtain $T(ng/m) = nT(g/m) = nT(g)/m$. Thus, for all rational numbers $\lambda$, for $g \in L_1$, $\lambda g \in L_1$, we obtain $T(\lambda g) = \lambda Tg$. If $\lambda$ is irrational, then to each $\varepsilon > 0$ we may choose two rational numbers $\lambda_1, \lambda_2$ such that $\lambda_1 < \lambda < \lambda_2$ and $\lambda_2 - \lambda_1 < \varepsilon$. Let $\lambda_2 g \in L_1$. Since $\lambda_1 g \le \lambda g \le \lambda_2 g$ we obtain $\lambda_1 Tg \le T(\lambda g) \le \lambda_2 Tg$. Since $\varepsilon$ was arbitrary it follows that $T(\lambda g) = \lambda Tg$. If $\lambda$ is irrational and $\lambda g \in L_1$, but $(\lambda + \varepsilon)g \notin L_1$ for each $\varepsilon > 0$, we then choose $g_1 = g - \delta g$, where $0 < \delta < 1$. Then, for irrational numbers $\lambda$ we also obtain $T(\lambda g_1) = \lambda Tg_1$ and $T(\lambda\delta g) = \lambda T(\delta g)$, because for sufficiently small $\varepsilon > 0$ we always obtain $(\lambda + \varepsilon)g_1 \in L_1$ and $(\lambda + \varepsilon)\delta g \in L_1$. Since $g = g_1 + \delta g$ we obtain $Tg = Tg_1 + T(\delta g)$ and, since $\lambda g = \lambda g_1 + \lambda\delta g$, we also obtain $T(\lambda g) = T(\lambda g_1) + T(\lambda\delta g) = \lambda Tg_1 + \lambda T(\delta g) = \lambda Tg$.
Therefore $T$ is an affine mapping of $L_1$ into $L_2$. Since $\mathcal{B}_1'$ is the linear span of $L_1$ (for example, each $y \in \mathcal{B}_1'$ may be written in the form $y = \alpha g - \beta 1$, where $g \in L_1$), $T$ may be extended as a linear map from $\mathcal{B}_1'$ into $\mathcal{B}_2'$.
Since the unit sphere of $\mathcal{B}_1'$ is equal to $2L_1 - 1$ (and similarly for $\mathcal{B}_2'$) and since $L_1 \xrightarrow{T} L_2$, we find that $T$ maps the unit sphere of $\mathcal{B}_1'$ into the unit sphere of $\mathcal{B}_2'$, that is, $\|T\| \le 1$.
Each $y \in \mathcal{B}_{1+}'$ is of the form $\lambda g$, where $\lambda \ge 0$, $g \in L_1$; therefore $T$ is a positive mapping.
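Th. 4.2.2. If $T$ is a $\mathcal{B}$-continuous effect morphism, then the extension of the map $T$ onto $\mathcal{B}_1'$ is also $\sigma(\mathcal{B}_1', \mathcal{B}_1)$-$\sigma(\mathcal{B}_2', \mathcal{B}_2)$-continuous.
Proof. Since the unit sphere of $\mathcal{B}_1'$ is equal to $2L_1 - 1$, the map $T$ is $\sigma$-continuous on the unit sphere. Then, according to AIII, §5, $T$ is $\sigma$-continuous in all of $\mathcal{B}_1'$.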
Th. 4.2.3. If $T$ is a linear positive mapping of $\mathcal{B}_1'$ into $\mathcal{B}_2'$ then $T$ is norm-continuous. $T$ maps $L_1$ into $L_2$ if and only if $T1 \in L_2$.
Proof. The norm-continuity of $T$ is a result of general theorems (see AIII, §6). We may easily prove this result directly. For $y \in \mathcal{B}_1'$ we define
$\beta = -\inf\{\mu(w, y) \mid w \in K_1\}$
providing that the inf is less than zero, otherwise set $\beta = 0$. We also define
$\alpha = \sup\{\mu(w, y) \mid w \in K_1\}$.
Then $\alpha \ge -\beta$ and
$\|y\| = \sup\{|\mu(w, y)| \mid w \in K_1\} = \max\{|\alpha|, \beta\}$.
If we set
$g = \dfrac{1}{\alpha + \beta}\,(y + \beta 1)$
then $0 \le \mu(w, g) \le 1$ for all $w \in K_1$, that is, $g \in L_1$. Since $T$ is a positive map we obtain $0 \le Tg \le T1$ and obtain
$Ty = T((\alpha + \beta)g - \beta 1) = (\alpha + \beta)Tg - \beta T1 \le (\alpha + \beta)T1 - \beta T1 = \alpha T1$
and
$Ty \ge -\beta T1$.
Thus it follows that $\|Ty\| \le \|T1\|\,\|y\|$, that is, $T$ is norm-continuous and (since 1 is an element of the unit sphere) we finally obtain $\|T\| = \|T1\|$.
If $L_1 \xrightarrow{T} L_2$ we obtain $T1 \in L_2$. Conversely, if $T1 \in L_2$ then, for $g_1 \in L_1$, that is, $0 \le g_1 \le 1$, we obtain (since $T$ is positive) $0 \le Tg_1 \le T1$. Since $T1 \in L_2$ we obtain $T1 \le 1$ and therefore $Tg_1 \in L_2$.
From $\|T\| = \|T1\|$ it follows that for each positive mapping $T$ the map $\|T\|^{-1}T$ maps the set $L_1$ into $L_2$.
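The identity $\|T\| = \|T1\|$ of Th. 4.2.3 may be checked numerically in a finite classical model. The sketch below is only an illustration under assumptions that are not taken from the text: the spaces $\mathcal{B}_1'$, $\mathcal{B}_2'$ are replaced by real vectors with the maximum norm, a positive map is a matrix with nonnegative (hypothetical) entries, and the operator norm is computed by evaluating the map on the vertices of the unit cube, where the maximum of a convex function is attained.

```python
from fractions import Fraction as F
from itertools import product

# A positive map from three-component vectors to two-component vectors.
T = [[F(1, 2), F(1, 4), F(0)],
     [F(1, 3), F(1, 3), F(1, 3)]]

def apply(T, y):
    return [sum(t * v for t, v in zip(row, y)) for row in T]

def sup_norm(y):
    return max(abs(v) for v in y)

def operator_norm(T):
    """Maximum of ||T y|| over the vertices of the unit cube."""
    n = len(T[0])
    return max(sup_norm(apply(T, y))
               for y in product((F(-1), F(1)), repeat=n))

one = [F(1)] * 3
norm_T = operator_norm(T)
norm_T1 = sup_norm(apply(T, one))

# T1 has all components in [0, 1], so T maps effects to effects.
maps_effects = all(0 <= c <= 1 for c in apply(T, one))
print(norm_T, norm_T1, maps_effects)
```

Rescaling $T$ by a factor larger than one would take $T1$ out of $L_2$; dividing by $\|T\|$ then restores a map of $L_1$ into $L_2$, as stated in the remark following the proof.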
To each $\sigma(\mathcal{B}_1', \mathcal{B}_1)$-$\sigma(\mathcal{B}_2', \mathcal{B}_2)$-continuous mapping $T$ of $\mathcal{B}_1'$ into $\mathcal{B}_2'$ there exists an adjoint mapping $T'$ of $\mathcal{B}_2$ into $\mathcal{B}_1$ (see AIII, §5) for which
$\mu_2(x, Ty) = \mu_1(T'x, y)$.
If $T$ maps $L_1$ into $L_2$ then, for $w \in K_2$, we obtain
$0 \le \mu_2(w, Ty) \le 1$
and also
$0 \le \mu_1(T'w, y) \le 1$
for all $y \in L_1$, that is, $T'w \in \check{K}_1$. $T'$ is therefore an operation. The $\mathcal{B}$-continuous effect morphisms and the operations therefore uniquely correspond to adjoint mappings, as we have already seen in §4.1. In particular, $T'$ is a mixture morphism if and only if $T1 = 1$.
In closing this section we shall now give criteria for the $\mathcal{B}$-continuity of an effect morphism.
Th. 4.2.4. An effect morphism $T$ is $\mathcal{B}$-continuous if each $\sigma(\mathcal{B}_1', \mathcal{B}_1)$-convergent sequence $g_\nu$ in $L_1$ (therefore $g_\nu \to g \in L_1$) satisfies $Tg_\nu \to Tg$ in the $\sigma(\mathcal{B}_2', \mathcal{B}_2)$ topology.
Proof. The $\sigma(\mathcal{B}', \mathcal{B})$-topology is metrizable (see AIII, §4) and it is therefore only necessary to consider sequences.
Th. 4.2.5. An effect morphism $T$ is $\mathcal{B}$-continuous if and only if for each decreasing sequence of decision effects $e_\nu$ satisfying $\bigwedge_\nu e_\nu = 0$ the relation $Te_\nu \to 0$ holds in the $\sigma(\mathcal{B}_2', \mathcal{B}_2)$-topology.
Proof. Since $e_\nu$ is decreasing it follows that $Te_\nu$ is decreasing in $L_2$ and therefore converges in the $\sigma(\mathcal{B}_2', \mathcal{B}_2)$ topology: $Te_\nu \to g \in L_2$. Thus the hypothesis of the theorem is equivalent to the condition that $\bigwedge_\nu e_\nu = 0$ implies that $g = 0$.
If $e_\nu$ is an increasing sequence satisfying $\bigvee_\nu e_\nu = e$, it follows that $e - e_\nu$ is a decreasing sequence, and that $T(e - e_\nu) = Te - Te_\nu \to 0$, that is, $Te_\nu$ converges to $Te$. If $e_n$ is a sequence of pairwise orthogonal decision effects satisfying $\sum_{n=1}^{\infty} e_n = e$, then $e_\nu = \sum_{n=1}^{\nu} e_n$ is an increasing sequence satisfying $\bigvee_\nu e_\nu = e$. Therefore, in the $\sigma(\mathcal{B}_2', \mathcal{B}_2)$ topology, $Te_\nu = \sum_{n=1}^{\nu} Te_n \to Te$, that is, $T(\sum_{n=1}^{\infty} e_n) = \sum_{n=1}^{\infty} Te_n$. Thus, for each $w_2 \in K_2$ we obtain
$\mu_2\bigl(w_2, T\sum_{n=1}^{\infty} e_n\bigr) = \sum_{n=1}^{\infty} \mu_2(w_2, Te_n)$.
Thus $m(e) = \mu_2(w_2, Te)$ is a $\sigma$-additive measure on the lattice $G_1$. According to Gleason's theorem (see AIV, §12) there exists a uniquely determined $w_1 \in K_1$ for which
$m(e) = \mu_2(w_2, Te) = \mu_1(w_1, e)$.
A mapping $K_2 \xrightarrow{S} K_1$ is defined by $w_2 \to w_1$, where it is clearly evident that it is affine, that is, $S$ is a mixture morphism. We need only show that $S' = T$, that is,
$\mu_2(w, Tg) = \mu_2(w, S'g)$
for all $w \in K_2$ and all $g \in L_1$. We then have $\mu_1(Sw, e) = \mu_2(w, Te)$ and therefore $\mu_2(w, Te) = \mu_2(w, S'e)$ for all $w \in K_2$ and for all $e \in G_1$.
Thus we obtain
$\mu_2(w, Tx) = \mu_2(w, S'x)$
for all $x$ in the linear space spanned by $G_1$. Since $T$ is norm-continuous, the same is true for all $x$ in the norm-closed subspace of $\mathcal{B}_1'$ spanned by $G_1$. According to the spectral representation theorem (see AIV, §8) the norm-closed subspace of $\mathcal{B}_1'$ spanned by $G_1$ is $\mathcal{B}_1'$ itself.
The converse is easy to see. If $e_\nu$ is a decreasing sequence of decision effects, then $e_\nu$ converges towards $e$ in the $\sigma(\mathcal{B}_1', \mathcal{B}_1)$ topology. If $\bigwedge_\nu e_\nu = 0$, from III, Th. 6.10 or IV, Th. 1.3.7 we find $e = 0$, that is, $e_\nu \to 0$. Therefore from the $\mathcal{B}$-continuity of $T$ it follows that $Te_\nu \to 0$.
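Th. 4.2.6. A p-continuous r-morphism $h$ determines a $\mathcal{B}$-continuous effect morphism $T\colon \mathcal{B}_1' \to \mathcal{B}_2'$ by means of the map $\psi_1\mathcal{F}_1 \xrightarrow{T} L_2$, where the latter is given by Th. 3.2. $T'$ is a $\mathcal{D}$-continuous mixture morphism.
Proof. According to D 3.6 the map $\psi_1\mathcal{F}_1 \xrightarrow{T} L_2$ is $\mathcal{B}$-continuous since $\varphi\mathcal{Q}'$ is dense (in norm) in $K$. Thus $T$ can be uniquely extended onto the $\sigma$-compact set $L_1$, in which $\psi_1\mathcal{F}_1$ is $\sigma$-dense, because $L_2$ is $\sigma$-compact. It follows that $L_1 \xrightarrow{T} L_2$. Since $\psi_1\mathcal{F}_1 \xrightarrow{T} L_2$ is, according to the remark following Th. 3.2, a rational affine map, $L_1 \xrightarrow{T} L_2$ is an affine map. Therefore $T$ may be extended as a linear map $\mathcal{B}_1' \to \mathcal{B}_2'$. Thus, according to D 4.2.1, $T$ is a $\mathcal{B}$-continuous effect morphism. According to Th. 3.2, $T1 = 1$. $T'$ is therefore a mixture morphism.
Since $T$ is norm-continuous, and since $T\psi_1\mathcal{F}_1 \subset \psi_2\mathcal{F}_2 \subset \mathcal{D}_2 \cap L_2$, and since $\psi_1\mathcal{F}_1$ is norm-dense in $\mathcal{D}_1 \cap L_1$, we obtain $\mathcal{D}_1 \cap L_1 \xrightarrow{T} \mathcal{D}_2 \cap L_2$. Since $\mathcal{D} \cap L$ spans the space $\mathcal{D}$ we obtain $\mathcal{D}_1 \xrightarrow{T} \mathcal{D}_2$. Therefore $T'$ is $\mathcal{D}$-continuous.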
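Th. 4.2.7. Let $h$ be a p-isomorphism and let $h'$ be an r-isomorphism; let $h$ be dual to $h'$ (D 3.8). Then the maps $S\varphi_1(a) = \varphi_2(h(a))$ and $T\psi_2(f) = \psi_1(h'(f))$ determine a $\mathcal{D}$-continuous linear map $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$, where $K_1 \xrightarrow{S} K_2$ is bijective, and a $\mathcal{B}$-continuous linear map $\mathcal{B}_2' \xrightarrow{T} \mathcal{B}_1'$, where $L_2 \xrightarrow{T} L_1$ is bijective, $T = S'$. In addition, $T\mathcal{D}_2 = \mathcal{D}_1$.
The proof follows directly from Th. 3.4, Th. 4.1.7, Th. 4.2.6, and Th. 4.1.4.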
4.3 Coexistent Operations and Coexistent Effect Morphisms
The norm-continuous maps $S$ of $\mathcal{B}_1$ into $\mathcal{B}_2$ form a Banach algebra $\mathcal{A}(\mathcal{B}_1, \mathcal{B}_2)$ and therefore form a Banach space (see AIII, §6) with the norm
$\|S\| = \sup\{\|Sx\| \mid x \in \mathcal{B}_1,\ \|x\| \le 1\} = \sup\{\|Sw\| \mid w \in K_1\}$.
In $\mathcal{A}(\mathcal{B}_1, \mathcal{B}_2)$ a positive cone $\mathcal{A}_+(\mathcal{B}_1, \mathcal{B}_2)$ is defined by $S \ge 0$ or, equivalently, $\{Sx \ge 0 \text{ for all } x \ge 0\}$.
$\mathcal{A}(\mathcal{B}_1, \mathcal{B}_2)$ is not only complete with respect to the norm topology but also with respect to the topology of simple convergence, that is, for a sequence $S_n$ with $S_n w$ norm-convergent in $\mathcal{B}_2$ for all $w \in K_1$ there exists an $S \in \mathcal{A}(\mathcal{B}_1, \mathcal{B}_2)$ satisfying $\|S_n w - Sw\| \to 0$ for all $w \in K_1$ and therefore $\|S_n x - Sx\| \to 0$ for all $x \in \mathcal{B}_1$.
Th. 4.3.1. A map $S \in \mathcal{A}(\mathcal{B}_1, \mathcal{B}_2)$ is an operation if and only if $S$ is positive and $\|S\| \le 1$.
Proof. From $\mu(Sw, 1) = \|Sw\|$ for $w \in K_1$ and $S$ positive, and $\|S\| = \sup\{\|Sw\| \mid w \in K_1\}$, it follows that $\|Sw\| \le 1$, that is, $Sw \in \check{K}_2$.
D 4.3.1. We shall denote the set of operations, that is, the intersection of $\mathcal{A}_+(\mathcal{B}_1, \mathcal{B}_2)$ with the unit sphere, by $\Pi$.
D 4.3.2. An additive mapping of a Boolean ring $\Sigma \xrightarrow{\chi} \Pi$ for which $\chi(\varepsilon)$ is a mixture morphism (where $\varepsilon$ is the unit element of $\Sigma$) will be called an operation measure.
For an effective ensemble $w_0$ we may define a uniform structure in $\Sigma$ by means of the metric
$d(\sigma_1, \sigma_2) = \mu(\chi(\sigma_1 + \sigma_2)w_0, 1)$, (4.3.1)
which is identical to that defined in IV, Th. 5.2 provided that we set
$W(\sigma) = \chi(\sigma)w_0$ (4.3.2)
or is identical with that defined by IV, Th. 1.4.1 if we rewrite (4.3.1) in the form
$d(\sigma_1, \sigma_2) = \mu(w_0, \chi'(\sigma_1 + \sigma_2)1)$ (4.3.3)
and set $F(\sigma) = \chi'(\sigma)1$.
Th. 4.3.2. The mapping $\Sigma \xrightarrow{\chi} \Pi$ is uniformly continuous with respect to the metric in $\Sigma$ and the uniform structure of simple convergence in $\Pi$.
Proof. According to IV, Th. 5.2, for each fixed $w$ the map $\Sigma \to \check{K}_2$, $\sigma \mapsto \chi(\sigma)w$, is uniformly continuous with respect to the norm topology in $\check{K}_2$. This implies that the map $\Sigma \xrightarrow{\chi} \Pi$ is uniformly continuous with respect to the uniform structure of simple convergence in $\Pi$.
From Th. 4.3.2 and the fact that $\Pi$ is complete in the topology of simple convergence, it follows directly from IV, Th. 1.4.3 that $\Sigma$ can be completed and that $\chi$ can be extended on the completion.
Th. 4.3.3. $\Sigma \xrightarrow{\chi} \Pi$ together with a complete and separable $\Sigma$ determines for each $w \in K_1$ a preparator of the ensemble $\chi(\varepsilon)w$ by means of the map $\Sigma \to \check{K}_2$, $\sigma \mapsto \chi(\sigma)w$.
Proof. The proof is a simple corollary of IV, D 5.4 and the preceding results.
The map $\Sigma \xrightarrow{\chi} \Pi$ is uniquely defined by the maps $\Sigma \to \check{K}_2$, $\sigma \mapsto \chi(\sigma)w$, for all $w \in K_1$. We therefore define:
D 4.3.3. We shall call an additive measure $\Sigma \xrightarrow{\chi} \Pi$ on a complete separable Boolean ring $\Sigma$ a trans-preparator.
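Th. 4.3.4. Let $\Sigma$ be complete and separable, and let an additive measure $\chi$ be defined by $\Sigma \xrightarrow{\chi} \Pi$. Then, assigning to each $\chi(\sigma)$ (as a map $\mathcal{B}_1 \to \mathcal{B}_2$) the adjoint map $\chi'(\sigma)$ (as a map $\mathcal{B}_2' \to \mathcal{B}_1'$), we may define an additive map $\Sigma \xrightarrow{\chi'} P$ (where $P$ is the set of $\mathcal{B}$-continuous effect morphisms). For each $g \in L_2$ an additive measure is defined by $\Sigma \to L_1$, $\sigma \mapsto \chi'(\sigma)g$. For $g = 1$ the map $\Sigma \to L_1$, $\sigma \mapsto \chi'(\sigma)1$, is an observable.
Proof. The proof follows simply, noting that $\chi'(\varepsilon)1 = 1$ because $\chi(\varepsilon)w \in K_2$.
D 4.3.4. The map $\Sigma \xrightarrow{\chi'} P$ which is conjugate to a trans-preparator $\Sigma \xrightarrow{\chi} \Pi$ is called the adjoint effect transformer to $\Sigma \xrightarrow{\chi} \Pi$. The observable $\Sigma \to L_1$, $\sigma \mapsto \chi'(\sigma)1$, is said to be the observable associated with the trans-preparator.
D 4.3.5. A set $\mathcal{A}$ of operations is said to be coexistent if there exists a Boolean ring $\Sigma$ and an additive measure $\Sigma \xrightarrow{\chi} \Pi$ for which $\mathcal{A} \subset \chi\Sigma$.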
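The notions of operation measure and associated observable may be illustrated by a finite classical model. The sketch below is a hypothetical example and not taken from the text: the Boolean ring is assumed to be the set of subsets of the two outcomes "yes" and "no", each outcome carries a matrix with nonnegative entries (an operation on two-component probability vectors), and the matrices add up to a matrix with unit column sums, so that the value on the whole outcome set is a mixture morphism. The associated observable is obtained from the column sums, which represent the adjoint map applied to the unit effect.

```python
from fractions import Fraction as F

# Hypothetical operations attached to the two atoms of the Boolean ring.
ATOMS = {
    "yes": [[F(1, 2), F(0)], [F(0), F(1, 4)]],
    "no":  [[F(1, 2), F(0)], [F(0), F(3, 4)]],
}

def chi(sigma):
    """Additive measure: sum of the atom operations contained in sigma."""
    total = [[F(0), F(0)], [F(0), F(0)]]
    for key in sigma:
        total = [[t + a for t, a in zip(t_row, a_row)]
                 for t_row, a_row in zip(total, ATOMS[key])]
    return total

def apply(S, w):
    return [sum(s * v for s, v in zip(row, w)) for row in S]

def observable(sigma):
    """Adjoint of chi(sigma) applied to the unit effect: the column sums."""
    S = chi(sigma)
    return [sum(S[i][j] for i in range(2)) for j in range(2)]

w = [F(1, 3), F(2, 3)]                          # an ensemble
p_yes_from_operation = sum(apply(chi({"yes"}), w))
p_yes_from_observable = sum(a * b for a, b in zip(w, observable({"yes"})))
print(p_yes_from_operation, p_yes_from_observable,
      observable({"yes", "no"}))
```

The two operations attached to "yes" and "no" are coexistent in the sense of D 4.3.5, since both lie in the range of one additive measure.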
We shall not carry out a further analysis of the trans-preparators similar to the analysis of observables and preparators presented in IV. In XVII, §4 we shall present a number of applications in order to become familiar with the concepts presented in §4.3.
5 Isomorphisms and Automorphisms of Ensembles and Effects
In quantum mechanics automorphisms of effects and ensembles play an
important role. For this reason we must present a careful and precise
explanation of the structure of these maps. In addition, we shall investigate
the question of the possibility of the extension of bijective maps, for example,
isomorphisms of decision effects onto Banach spaces, that is, to effect
morphisms.
D 5.1. If T is a bijective map of L1 onto L2 which satisfies the following
condition:
(g1, g2 ∈ L1; g1 ≤ g2 ⇔ Tg1 ≤ Tg2)
then we say that T is an order isomorphism of L1 onto L2.
In D 5.1 we therefore consider the structure type: L as an ordered set.
Since 1 is the largest and 0 is the smallest element of L1 (and correspondingly
for L2) it follows that T1 = 1 and T0 = 0.
Since 1 — g corresponds to the case in which g does not respond, the
following comprehensive structure type is suggested: L is an ordered set with
a dual automorphism g → g* = 1 − g. Thus we obtain the following
restricted set of isomorphisms satisfying D 5.1:
D 5.2. An order isomorphism T of L1 onto L2 is called a *-isomorphism if
T(1 − g) = 1 − Tg for all g ∈ L1.
The same considerations can be repeated if we consider the subset G of L as
an ordered set (as a lattice) and, in the second case, as an ordered set with
the dual automorphism e → e⊥. The isomorphisms of G as an ordered set are
known as lattice isomorphisms.
D 5.3. A lattice isomorphism T of G1 onto G2 is said to be a ⊥-isomorphism
of G1 onto G2 provided T(e⊥) = (Te)⊥.
We may also consider the following structure type: L together with the
map (g1, g2) → g1 + g2 where g1 + g2 ≤ 1.
According to D 4.2.1 an isomorphism of L with respect to this structure
type is called an effect isomorphism. If an effect isomorphism L1 → L2 is
ℬ-continuous, then so is the inverse L2 → L1, since L1, L2 are compact (see
§4.2).
If L1 = L2 or if G1 = G2 we then use the expression automorphism instead
of isomorphism.
Th. 5.1. Every effect isomorphism T of L1 onto L2 is a *-isomorphism and T
may be uniquely extended as a linear mapping of ℬ'1 onto ℬ'2; then T will be
a norm preserving isomorphism of the Banach spaces ℬ'1 and ℬ'2. If T is
ℬ-continuous then T' is a mixture isomorphism of K2 onto K1 and an
isomorphism of the Banach spaces ℬ2, ℬ1, and T⁻¹ is also ℬ-continuous. If T
is ℬ-continuous then the restriction of T onto G1 is a ⊥-isomorphism of G1
onto G2.
PROOF. From g + (1 − g) = 1 it follows that Tg + T(1 − g) = T1. From Th. 4.2.1
it follows that T is positive and preserves the order. Therefore T1 = 1 and
T(1 − g) = 1 − Tg, that is, T is a *-isomorphism. From Th. 4.2.1 it follows that the
map T may be extended on ℬ'1 as a bijective mapping from ℬ'1 to ℬ'2; in addition,
since ‖Ty‖ ≤ ‖y‖ and since ‖T⁻¹y‖ ≤ ‖y‖ we find that T preserves the norm, and it
follows that T is an isomorphism of the Banach spaces.
If T is ℬ-continuous, then, according to §4.2, T' is a mixture morphism of K2
into K1. According to Th. 4.1.4, T' is a mixture isomorphism and T': ℬ2 → ℬ1 is
an isomorphism of the Banach spaces. Since (T⁻¹)' = (T')⁻¹ it follows that T⁻¹ is
ℬ-continuous.
Let e ∈ G1. Then, from μ(w, Te) = μ(T'w, e) it follows that w ∈ K0(Te) ⇔
T'w ∈ K0(e). Therefore there exists an e' ∈ G2 such that K0(Te) = K0(e') and
Te ≤ e'. From this result it follows that e ≤ T⁻¹e'. From w ∈ K0(e') = K0(Te) it
follows that μ(w, e') = 0 and that μ(w, TT⁻¹e') = μ(T'w, T⁻¹e') = 0, that is,
T⁻¹e' ∈ L0K0(e) and therefore T⁻¹e' ≤ e. Therefore T⁻¹e' = e, that is,
Te = e' ∈ G2.
The last property may be proved more easily if we use the fact that G is the set of
extreme points of L (see III, Th. 6.6).
From T(λg1 + (1 − λ)g2) = λTg1 + (1 − λ)Tg2 it follows that, under a bijective
linear mapping of L1 onto L2, the extreme points of L1 are bijectively mapped onto
the extreme points of L2. Since T(1 − e) = 1 − Te it follows that (Te)⊥ = T(e⊥).
We may obtain a stronger result than Th. 5.1 if we use the fact that for
g ∈ L the spectral representation
g = ∫ λ de(λ)    (5.1)
holds (see AIV, §8).
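As an illustration only (not part of the original text), the spectral representation (5.1) can be written out under the simplifying assumption that the Hilbert space is finite-dimensional, so that g has finitely many eigenvalues. The symbols λ_k and p_k below are names introduced for this sketch.

```latex
% Assumption: finite-dimensional case; \lambda_k (eigenvalues) and p_k
% (eigenprojections) are illustrative names, not notation from the text.
g = \int \lambda \, de(\lambda) = \sum_{k=1}^{n} \lambda_k \, p_k ,
\qquad
e(\lambda) = \sum_{\lambda_k \le \lambda} p_k ,
\qquad
0 \le \lambda_1 < \dots < \lambda_n \le 1 ,
\quad p_k p_l = \delta_{kl} \, p_k , \quad \sum_{k} p_k = 1 .
% g is a decision effect (g \in G) exactly when every eigenvalue is 0 or 1:
g^2 = \sum_{k} \lambda_k^2 \, p_k = g
\iff \lambda_k^2 = \lambda_k \ \text{for all } k
\iff \lambda_k \in \{0, 1\} .
```

In this finite-dimensional picture the criterion "the spectrum contains only 0 and 1" is simply the statement that g is a projection.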
Th. 5.2. The restriction of a *-isomorphism L1 → L2 to G1 is a ⊥-isomorphism
G1 → G2.
Proof. Let e ∈ G1; suppose that Te = g ∈ L2. Then we obtain T(1 − e) = 1 − g.
Since g' ≤ e and g' ≤ 1 − e = e⊥ implies that g' = 0 it follows that, except for the
null element, there exists no element in L2 which is smaller than g and 1 − g. From
(5.1) it follows that this is the case if and only if the spectrum of g contains only 0
and 1, that is, g ∈ G2. Therefore G1 is mapped by T isomorphically onto G2. Since
Te⊥ = T(1 − e) = 1 − Te = (Te)⊥, T is a ⊥-isomorphism of G1 onto G2.
We now have the following situation: An effect isomorphism of L1 onto L2
is also a *-isomorphism of L1 onto L2.
Every *-isomorphism restricted to G1 becomes a ⊥-isomorphism of the
lattices G1, G2.
We now turn to the converse problem: When is it possible to extend a
⊥-isomorphism of G1 onto G2 to a *-isomorphism of L1 → L2? When is a
*-isomorphism of L1 → L2 an effect isomorphism? ... etc.
Th. 5.3. Let T be a mapping of H1 into G2 where H1 is a σ(ℬ'1, ℬ1)-dense
subset of G1. Suppose that to each w ∈ K2 there exists an x' ∈ ℬ1 such that
μ2(w, Te) = μ1(x', e) for all e ∈ H1. Then T has a unique extension as a
linear σ(ℬ'1, ℬ1)-σ(ℬ'2, ℬ2)-continuous map of ℬ'1 into ℬ'2. In addition
TL1 ⊂ L2, that is, T is a ℬ-continuous effect morphism.
Proof. By hypothesis μ2(w, Te) is, as a function defined on H1, the restriction of a
linear σ(ℬ'1, ℬ1)-continuous functional over ℬ'1. x' is uniquely determined because
H1 is σ(ℬ'1, ℬ1)-dense in G1 and ℬ'1 is the σ(ℬ'1, ℬ1)-closure of the linear span of
G1.
From 0 ≤ μ2(w, Te) ≤ 1 it follows that 0 ≤ μ1(x', e) ≤ 1 on H1 and thus
0 ≤ μ1(x', g) ≤ 1 on the σ(ℬ'1, ℬ1)-closed convex set L1 which is generated by
G1 and therefore also by H1. Thus it follows that x' = λw' where 0 ≤ λ ≤ 1 and
w' ∈ K1, that is, x' ∈ Ǩ1.
It is easy to see that an operation S is defined as a mapping of K2 into Ǩ1 by
w → x', and that this map may be extended to a map of ℬ2 into ℬ1. The dual map
S' corresponding to S is identical on H1 to the previously defined map T. Hence we
find that S' is the extension of T on ℬ'1, and that this extension is unique because T
is σ(ℬ'1, ℬ1)-σ(ℬ'2, ℬ2)-continuous and the space spanned by H1 is σ(ℬ'1, ℬ1)-
dense in ℬ'1. Therefore S is equal to T', the map which is dual to T. Since T' is an
operation, T is itself an effect morphism.
Th. 5.4. If, for the map T = S' which was defined in Th. 5.3, the set G2 lies in
the σ(ℬ'2, ℬ2)-closure of TG1, then T is surjective as a mapping of L1 onto L2
and of ℬ'1 onto ℬ'2 and, in addition, T' is a mixture morphism K2 → K1 and
T' is an injective map of ℬ2 into ℬ1.
PROOF. Since G1 ⊂ L1 and TG1 ⊂ TL1 and since TL1 is σ(ℬ'2, ℬ2)-compact
(because T is ℬ-continuous and L1 is σ(ℬ'1, ℬ1)-compact!) the σ(ℬ'2, ℬ2)-closure of
TG1 lies in TL1. Then, according to our hypothesis, G2 ⊂ TL1. Since TL1 is
convex and compact, and since L2 = co G2, we therefore obtain TL1 = L2. Since
L2 spans all of ℬ'2 we obtain Tℬ'1 = ℬ'2. Since T preserves the order and TL1 = L2
we obtain T1 = 1. Thus it follows that μ(T'w, 1) = μ(w, T1) = μ(w, 1) = 1, that is,
T'K2 ⊂ K1. T' is injective since Tℬ'1 = ℬ'2.
If, in addition, T is injective on L1, then T is a ℬ-continuous effect
isomorphism and T' is a mixture isomorphism.
Is it possible to determine whether the extension of T onto L1 is injective
merely by looking at the map T only on G1?
Th. 5.5. If we assume the hypothesis of Th. 5.3 and the assumption
TG1 ⊂ G2, then T is injective on ℬ'1 if and only if for e ∈ G1 and e ≠ 0 it
follows that Te ≠ 0.
Proof. Since y = αg − β1, for each y ∈ ℬ'1 we obtain
y = ∫ λ de(λ)
from which it follows that
Ty = ∫ λ dTe(λ).
Since T preserves the order, Te(λ) is, for increasing λ, an increasing sequence of
elements of G2. Since Te ≠ 0 for e ≠ 0 it follows that Te(λ1) − Te(λ2) ≠ 0 for
e(λ1) − e(λ2) ≠ 0 and we therefore obtain Ty ≠ 0 for y ≠ 0.
Th. 5.6. Let e_ν be a pairwise orthogonal set of elements of G1. Then, for a
⊥-isomorphism T (and for any lattice homomorphism satisfying T(e⊥) = (Te)⊥)
of G1 into G2 the Te_ν are pairwise orthogonal and T(Σ_ν e_ν) = Σ_ν Te_ν.
PROOF. From e1 ⊥ e2 (that is, e1 ≤ e2⊥) it follows that Te1 ≤ T(e2⊥) = (Te2)⊥, that
is, Te1 ⊥ Te2. Since the Te_ν are pairwise orthogonal and since T is a lattice
homomorphism it follows that T(Σ_ν e_ν) = T(⋁_ν e_ν) = ⋁_ν (Te_ν) = Σ_ν Te_ν.
Th. 5.7. Let T be a lattice homomorphism of G1 into G2 for which Te⊥ = (Te)⊥
(for example, a ⊥-isomorphism of G1 onto G2). Then, by applying Gleason's
theorem (see AIV, §12) to G1 it follows that T may be uniquely extended as a
linear σ(ℬ'1, ℬ1)-σ(ℬ'2, ℬ2)-continuous map T: ℬ'1 → ℬ'2. The extension of
T is a ℬ-continuous effect morphism.
If TG1 is σ(ℬ'2, ℬ2)-dense in G2 then T is a surjective map of L1 onto L2
and of ℬ'1 onto ℬ'2 and T' is injective as a map of ℬ2 into ℬ1.
If Te ≠ 0 for e ≠ 0, e ∈ G1, then T is a ℬ-continuous effect isomorphism.
PROOF. For all w ∈ K2, μ2(w, Te) is a positive function over G1 which, according to
Th. 5.6, satisfies μ2(w, T(Σ_ν e_ν)) = Σ_ν μ2(w, Te_ν) and μ2(w, T1) = 1. Then, according
to Gleason's theorem there exists an x ∈ ℬ1 for which μ(w, Te) = μ(x, e). The
remainder of the theorem follows directly from Th. 5.3, Th. 5.4, and Th. 5.5.
Th. 5.8. The restriction of an order isomorphism L1 → L2 to G1 results in a
lattice isomorphism of G1 onto G2.
Proof. Since both T and T⁻¹ are order isomorphisms, we need only show that
TG1 ⊂ G2.
Let A(G) denote the set of atoms of the lattice G. Let p ∈ G1 be an atom of G1.
We will show that Tp ∈ A(G2). We note that all g ∈ L1 which satisfy g ≤ p are of the
form λp, where 0 ≤ λ ≤ 1. The set {g | g ≤ p} is therefore totally ordered, and so is
the set {g' | g' ≤ Tp} since T is an order isomorphism of L1 onto L2. This is the case
only if Tp = αp', where p' ∈ A(G2). Suppose that α ≠ 1; then T⁻¹p' ≥ p and
T⁻¹p' ≠ p and T⁻¹p' = βp'', where p'' ∈ A(G1), which is impossible. Therefore
α = 1, that is, Tp ∈ A(G2). Therefore T generates a bijective mapping of A(G1) onto
A(G2).
If e ∈ G1 then for every p ∈ A(G1) for which p ≤ e we obtain Tp ≤ Te. If g ≥ e_λ
for λ ∈ Λ we obtain K1(g) ⊃ K1(e_λ) and K1(g) ⊃ ⋃_{λ∈Λ} K1(e_λ) = K1(⋁_Λ e_λ), from
which it follows that g ≥ ⋁_{λ∈Λ} e_λ. Therefore we obtain:
Te ≥ ⋁ Tp = e' ∈ G2   (join over all p ≤ e, p ∈ A(G1));
a similar result holds for T⁻¹:
T⁻¹e' ≥ ⋁ T⁻¹q = e'' ∈ G1   (join over all q ≤ e', q ∈ A(G2)).
Since Tp ∈ A(G2) and Tp ≤ e' holds for all p ≤ e, it follows that all p ∈ A(G1) for
which p ≤ e are elements of the set of all T⁻¹q (where q ≤ e', q ∈ A(G2)), that is,
e'' ≥ ⋁ p = e   (join over all p ≤ e, p ∈ A(G1)).
Since Te ≥ e' we obtain e ≥ T⁻¹e' ≥ e''; therefore we obtain e = e'' and
Te = Te'' ≤ e', and we finally obtain Te = e'.
Th. 5.9. If, in addition to the hypothesis of Th. 5.8, we also assume that the
Tg, T(1 − g) are coexistent for all g ∈ L1, then the restriction of T on
G1 → G2 is a ⊥-isomorphism.
PROOF. Since {Tg, T(1 − g)} is coexistent, for g = e ∈ G1 it follows that {Te, T(e⊥)}
are coexistent and are therefore commensurable. Since e ∧ e⊥ = 0 we also obtain
(Te) ∧ (Te⊥) = 0 and also T(e⊥) ⊥ Te. Since e ∨ e⊥ = 1 we obtain Te ∨ Te⊥ = 1
and finally T(e⊥) = (Te)⊥.
Th. 5.10. Let G1, G2 be two atomic lattices, and let A(G1), A(G2) denote their
sets of atoms, respectively. Let T be a bijective mapping of A(G1) onto A(G2)
for which both T and T⁻¹ map orthogonal atoms into orthogonal atoms, that
is, T is an isomorphic map of A(G1) onto A(G2) with respect to the species
of structure determined by ⊥. Then T may be uniquely extended to a
⊥-isomorphism from G1 → G2.
Proof. For each e ∈ G1 the set of all atoms p for which p ≤ e is uniquely determined
and satisfies
e = ⋁ p   (join over all p ≤ e, p ∈ A(G1)).
For each order isomorphic mapping T of G1 onto G2 the following equation holds:
Te = ⋁ (Tp)   (join over all p ≤ e, p ∈ A(G1)).
For this reason we define the extension of T (first defined only on A(G1)) as follows:
Te = ⋁ (Tp)   (join over all p ≤ e, p ∈ A(G1)).
Then it follows that e1 ≤ e2 implies that Te1 ≤ Te2.
For the set of atoms q ≤ e⊥ it follows that Te⊥ = ⋁_{q≤e⊥} (Tq). Since p ⊥ q it
follows that Tp ⊥ Tq; we therefore obtain Te⊥ ⊥ Te. Since there exists a complete
system of pairwise orthogonal p_ν, q_μ satisfying p_ν ≤ e and q_μ ≤ e⊥ and
(⋁_ν p_ν) ∨ (⋁_μ q_μ) = 1, we find that
(⋁_ν Tp_ν) ∨ (⋁_μ Tq_μ) = 1;
otherwise there would be an atom r which would be orthogonal to all Tp_ν, Tq_μ so
that T⁻¹r would be orthogonal to all p_ν, q_μ, in contradiction to the condition
(⋁_ν p_ν) ∨ (⋁_μ q_μ) = 1. From (⋁_ν Tp_ν) ∨ (⋁_μ Tq_μ) = 1 and ⋁_ν Tp_ν ≤ Te and
⋁_μ Tq_μ ≤ Te⊥ it then follows that Te⊥ = (Te)⊥, ⋁_ν Tp_ν = Te and ⋁_μ Tq_μ = Te⊥.
Suppose there exists an atom r for which r ≤ Te. Then we also obtain r ⊥ Te⊥,
that is, r is orthogonal to all Tq for which q ≤ e⊥. In this way T⁻¹r is orthogonal
to all q ≤ e⊥ and we obtain T⁻¹r ≤ e. That is, each atom r ≤ Te may be obtained as
the image Tp of an atom p ≤ e. Thus Te1 = Te2 implies that e1 = e2 and
e = ⋁ T⁻¹r   (join over all r ≤ Te, r ∈ A(G2))
is proven.
Since the procedure presented above can also be applied to the extension of the
map T⁻¹ of A(G2) to A(G1) and
e = ⋁ T⁻¹r   (join over all r ≤ Te, r ∈ A(G2))
holds, it follows that T is a bijective map from G1 onto G2.
Th. 5.11. Let T be an order automorphism of L onto itself (for example, an
effect automorphism). Therefore we obtain TG = G. Let q_ν denote the atoms
of the center Z of G (see IV, end of §1.3). For this T there exists a bijective
mapping p of the integers (a permutation) so that T = Σ_ν T_ν, such that
T_μ q_ν = δ_μν q_p(ν) is satisfied. The T_ν are order isomorphisms of L_ν onto L_p(ν)
where the L_ν are equal to [0, 1] in ℬ'(ℋ_ν); ℬ'(ℋ_ν) is therefore isomorphic to
the subspace of all operators of the form (0, 0, ..., A_ν, ...) in ℬ'(ℋ1, ℋ2, ...).
Each order automorphism T may, in this case, therefore be represented as a
"direct sum" of order isomorphisms L_ν → L_p(ν) for the irreducible L_ν, L_p(ν). If
T is a *-automorphism, then the T_ν are *-isomorphisms. If T is an effect
automorphism, then the T_ν are effect isomorphisms. If T is ℬ-continuous, then
the T_ν are also ℬ-continuous.
Proof. Let G_ν be the sublattice of all e ∈ G for which e ≤ q_ν. Each e ∈ G may be
uniquely written in the form e = Σ_ν e_ν where e_ν ∈ G_ν. The following sums are to be
understood in this way. From e ≤ q_ν it follows that Te ≤ Tq_ν = Σ_μ (Tq_ν)_μ. For
Te = Σ_μ (Te)_μ it follows that (Te)_μ ≤ (Tq_ν)_μ for all e ∈ G_ν. Therefore T generates an
order isomorphic map of G_ν on the lattice G̃ of all e' = Σ_μ e_μ where e_μ ≤ (Tq_ν)_μ. If
(Tq_ν)_μ ≠ 0 for more than one μ then the lattice G̃ would be reducible, in
contradiction to the assumption that G_ν is irreducible. Therefore there exists, for
each ν, precisely one μ = p(ν) for which (Tq_ν)_μ ≠ 0. Since the same is also true for
T⁻¹ we must have Tq_ν = q_μ in addition to Tq_ν ≤ q_μ, and p(ν) must be a bijective
map of the set of the ν. Therefore we obtain Tq_ν = q_p(ν).
Since g ∈ L and g ≤ q_ν it follows that Tg ≤ Tq_ν = q_p(ν); therefore T generates an
order isomorphic map T_ν of L_ν onto L_p(ν). Therefore it remains only to show that
(for g = Σ_ν g_ν and g_ν ≤ q_ν) Tg = Σ_ν Tg_ν in order to set T = Σ_ν T_ν, where we obtain
T_ν g_μ = 0 for ν ≠ μ.
Since Tg = Σ_μ (Tg)_μ we must show that (Tg)_p(ν) = Tg_ν. Since (Tg)_μ ≤ q_μ we find
that T⁻¹(Tg)_p(ν) = g'_ν ≤ q_ν where Tg'_ν = (Tg)_p(ν). It remains to show that g'_ν = g_ν.
Thus from (Tg)_p(ν) ≤ Tg it follows that g'_ν = T⁻¹(Tg)_p(ν) ≤ g. From g_ν ≤ g it
follows that Tg_ν ≤ Tg and since Tg_ν ≤ q_p(ν) we obtain Tg_ν ≤ (Tg)_p(ν), that is,
g_ν ≤ T⁻¹(Tg)_p(ν) = g'_ν; therefore g'_ν = g_ν.
If T is a *-automorphism, then T(1 − g) = 1 − Tg. If g ≤ q_ν and therefore
g' = q_ν − g ≤ q_ν, then 1 − g = q_ν − g + Σ_{μ≠ν} q_μ. Thus it follows that
T(1 − g) = T(q_ν − g) + Σ_{μ≠ν} q_p(μ) = 1 − Tg = Σ_μ q_μ − Tg
         = Σ_μ q_p(μ) − Tg = Σ_{μ≠ν} q_p(μ) + q_p(ν) − Tg.
Since g ≤ q_ν we obtain Tg ≤ q_p(ν) and therefore obtain T(q_ν − g) = q_p(ν) − Tg, that
is, T_ν is a *-isomorphism since q_ν or q_p(ν) are the unit elements of L_ν or L_p(ν).
From g = g1 + g2 it follows that g_ν = g_1ν + g_2ν. Thus it follows that, for an
effect automorphism, Tg = Tg1 + Tg2 and therefore
Σ_ν T_ν g = Σ_ν T_ν g1 + Σ_ν T_ν g2,
that is,
Σ_ν T_ν g_ν = Σ_ν T_ν g_1ν + Σ_ν T_ν g_2ν
and we obtain (since the partition is unique) T_ν g_ν = T_ν g_1ν + T_ν g_2ν, that is, T_ν is an
effect isomorphism.
The ℬ-continuity of T_ν follows from that of T, in which T is applied to such g
having only a single component g = g_ν which is different from 0.
On the basis of Th. 5.11 we are therefore interested in *-isomorphisms or
effect isomorphisms between two irreducible systems L1 and L2, that is,
L1 ⊂ ℬ'(ℋ1), L2 ⊂ ℬ'(ℋ2).
Since G is a Gleason lattice, and the spectral representation theorem holds
for ℬ', each ⊥-automorphism of G may be continued to an automorphism T
of the Banach space ℬ' which is ℬ-continuous, and the adjoint
mapping T' is an automorphism of ℬ. We obtain T = Σ_ν T_ν and T' = Σ_ν T'_ν
where the T_ν are isomorphisms of ℬ'_ν on ℬ'_p(ν) and T'_ν are the adjoint
isomorphisms of ℬ_p(ν) and ℬ_ν (ℬ_ν = ℬ(ℋ_ν)).
From the preceding theorems we obtain the following special case:
Th. 5.12. Each ⊥-isomorphism T of G1 ⊂ ℬ'(ℋ1) onto G2 ⊂ ℬ'(ℋ2) may be
uniquely extended to a ℬ-continuous effect isomorphism T where the latter is
also an isomorphism of the Banach space ℬ'1 = ℬ'(ℋ1) onto ℬ'2 = ℬ'(ℋ2).
The adjoint map T' is an isomorphism of the Banach space ℬ2 onto ℬ1 where
T' is a mixture isomorphism K2 → K1.
For φ ∈ ℋ and ‖φ‖ = 1 let P_φ be the projection operator P_φ f = φ⟨φ, f⟩.
Then P_φ may be considered to be an element of K ⊂ ℬ(ℋ) as well as of
G ⊂ ℬ'(ℋ). The following theorem is to be understood in this sense where
T: ℬ'1 → ℬ'2, T': ℬ2 → ℬ1 and T'P_φ means P_φ ∈ K2 ⊂ ℬ2 = ℬ(ℋ2) and
TP_ψ means P_ψ ∈ G1 ⊂ ℬ'1 = ℬ'(ℋ1).
Th. 5.13. Let T' be a mixture isomorphism. Then T'P_φ = T⁻¹P_φ. Since T⁻¹P_φ
is an atom, T'P_φ = P_ψ, where ψ is determined by T and φ up to a factor e^{iα}
(α is real).
Proof. tr((T'P_φ)e) = tr(P_φ Te). For e = T⁻¹P_φ = P_ψ it therefore follows that
tr((T'P_φ)P_ψ) = 1. For e = 1 − P_ψ we obtain Te = 1 − TP_ψ = 1 − P_φ and therefore
tr((T'P_φ)(1 − P_ψ)) = 0. Thus we obtain T'P_φ = P_ψ = T⁻¹P_φ.
If e1 ⊥ e2 then Te1 ⊥ Te2 and T⁻¹e1 ⊥ T⁻¹e2. Therefore we obtain:
Th. 5.14. For T'P_φν = P_ψν and pairwise orthogonal φ_ν, the ψ_ν are pairwise
orthogonal. With T'P_φν = P_ψν and pairwise orthogonal ψ_ν the φ_ν are also
pairwise orthogonal.
Th. 5.15. From w = Σ_ν λ_ν P_φν (each w ∈ K may be written in this form with
pairwise orthogonal φ_ν) and T'P_φν = T⁻¹P_φν = P_ψν it follows that
T'w = Σ_ν λ_ν P_ψν.
Proof. The proof follows directly from the fact that T' is linear and norm-
continuous as a mixture morphism.
Th. 5.16. Let T'P_φ1 = P_ψ1, T'P_φ2 = P_ψ2 (that is, P_φ1 = TP_ψ1, P_φ2 = TP_ψ2).
Then |⟨φ1, φ2⟩|² = |⟨ψ1, ψ2⟩|².
Proof. tr(P_ψ1 P_ψ2) = tr((T'P_φ1)P_ψ2) = tr(P_φ1(TP_ψ2)) = tr(P_φ1 P_φ2).
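The invariance stated in Th. 5.16 can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes a two-dimensional Hilbert space, models an isomorphism by a unitary matrix U and an anti-isomorphism by complex conjugation followed by U, and uses arbitrarily chosen vectors. All function names are hypothetical.

```python
import cmath
import math


def inner(a, b):
    """Inner product <a, b>, antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(inner(v, v).real)
    return [x / norm for x in v]


def apply(U, v):
    """Apply the matrix U to the vector v."""
    return [sum(U[i][j] * v[j] for j in range(len(v))) for i in range(len(U))]


def apply_anti(U, v):
    """Assumed model of an anti-isomorphism: complex conjugation, then U."""
    return apply(U, [x.conjugate() for x in v])


def transition_probability(a, b):
    """|<a, b>|^2, which equals tr(P_a P_b) for unit vectors a, b."""
    return abs(inner(a, b)) ** 2


theta = 0.7                # arbitrary illustrative angle
phase = cmath.exp(0.3j)    # arbitrary illustrative phase
U = [[phase * math.cos(theta), -math.sin(theta)],
     [phase * math.sin(theta), math.cos(theta)]]   # a unitary 2x2 matrix

psi1 = normalize([1 + 2j, 0.5 - 1j])
psi2 = normalize([0.3j, 2 + 0j])

p_before = transition_probability(psi1, psi2)
p_unitary = transition_probability(apply(U, psi1), apply(U, psi2))
p_anti = transition_probability(apply_anti(U, psi1), apply_anti(U, psi2))

print(p_before, p_unitary, p_anti)
```

The three printed numbers agree up to rounding: both the unitary and the anti-unitary map leave |⟨φ1, φ2⟩|² unchanged, although only the unitary one preserves the inner product itself.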
Th. 5.17. Let T be an isomorphism of A(G1) onto A(G2) (see Th. 5.10). Then T
may be uniquely extended as a ⊥-isomorphism of G1 → G2. For P_φ1 = TP_ψ1
and P_φ2 = TP_ψ2 we obtain
|⟨φ1, φ2⟩|² = |⟨ψ1, ψ2⟩|².
The proof is a direct consequence of Th. 5.10, Th. 5.12, and Th. 5.16.
Th. 5.18. To each isomorphism T of A(G1) onto A(G2) there exists an
isomorphism or anti-isomorphism U of ℋ1 onto ℋ2 which satisfies
TP_ψ = UP_ψU⁻¹,
that is,
TP_ψ = P_φ and φ = Uψ,
where
ψ ∈ ℋ1, φ ∈ ℋ2.
Proof. Let ψ_ν be a complete orthonormal basis for ℋ1. A complete orthonormal
basis φ_ν for ℋ2 is defined by TP_ψν = P_φν, where the φ_ν are uniquely determined
except for factors of the form e^{iα_ν}.
For ψ0 = Σ_{ν=1}^∞ (1/2^{ν/2})ψ_ν and TP_ψ0 = P_φ0 and φ0 = Σ_ν x_ν φ_ν it follows, since
|⟨φ0, φ_ν⟩|² = |⟨ψ0, ψ_ν⟩|², that |x_ν|² = 1/2^ν. The factors e^{iα_ν} for the φ_ν are arbitrary,
and may therefore be chosen such that φ0 = Σ_ν (1/2^{ν/2})φ_ν.
We will now set ψ = Σ_ν a_ν ψ_ν and set φ = Σ_ν b_ν φ_ν where TP_ψ = P_φ. We will now
show that all b_ν = a_ν or all b_ν = ā_ν.
From |⟨φ_ν, φ⟩|² = |⟨ψ_ν, ψ⟩|² it follows that |a_ν| = |b_ν|. Since T may be extended
as a ⊥-isomorphism on all of G1 (by Th. 5.17) we shall consider the map T of the
following projection operators: P_ψ, P = Σ'_ν P_ψν (where the ' means that we perform
the sum only on a certain subset N' of the natural numbers) and P⊥. We obtain
(P_ψ ∨ P⊥) ∧ P = P_χ where χ = Pψ ‖Pψ‖⁻¹. For Q = Σ'_ν P_φν = TP we obtain
(P_φ ∨ Q⊥) ∧ Q = TP_χ = P_χ'
where
χ' = Qφ ‖Qφ‖⁻¹.
That is, if ψ is mapped onto φ then every partial sum Σ'_ν a_ν ψ_ν is mapped onto
Σ'_ν b_ν φ_ν up to a normalization factor, which is, since |a_ν| = |b_ν|, identical:
‖Pψ‖ = ‖Qφ‖.
From
|⟨ψ0, Pψ⟩| ‖Pψ‖⁻¹ = |⟨φ0, Qφ⟩| ‖Qφ‖⁻¹
and since ‖Pψ‖ = ‖Qφ‖ it follows that
|Σ'_ν (1/2^{ν/2}) a_ν| = |Σ'_ν (1/2^{ν/2}) b_ν|    (5.2)
for every sum Σ' over a subset N' of the ν. Since a factor in φ and ψ is arbitrary, and
|a_ν| = |b_ν|, we may choose a1 = b1 real and nonzero (in the case in which
a1 = b1 = 0 we choose a different ν ≠ 1). From (5.2) we obtain especially:
|(1/2^{1/2}) a1 + (1/2^{ν/2}) a_ν| = |(1/2^{1/2}) a1 + (1/2^{ν/2}) b_ν|
for all ν. Thus it follows that either a_ν = b_ν or ā_ν = b_ν. For each two different ν, μ
from (5.2) it follows that
|(1/2^{ν/2}) a_ν + (1/2^{μ/2}) a_μ| = |(1/2^{ν/2}) b_ν + (1/2^{μ/2}) b_μ|.
If a_ν is not real and b_ν = a_ν, then it follows that b_μ = a_μ; if a_ν is not real and if
b_ν = ā_ν, then it follows that b_μ = ā_μ. Thus we can either have all b_ν = a_ν or all
b_ν = ā_ν.
We must distinguish between two different cases:
(1) ψ1 + iψ2 is mapped onto φ1 + iφ2.
(2) ψ1 + iψ2 is mapped onto φ1 − iφ2.
Case (1). Since every partial sum is mapped into a partial sum, it follows that
ψ1 + iψ2 + ψ_ν → φ1 + iφ2 + φ_ν and therefore we obtain ψ_ν + iψ2 → φ_ν + iφ2.
From this result we obtain ψ_ν + iψ2 + iψ_μ → φ_ν + iφ2 + iφ_μ and also
ψ_ν + iψ_μ → φ_ν + iφ_μ for arbitrary pairs ν, μ.
From ψ_ν + iψ_μ → φ_ν + iφ_μ it follows that (for real a, b)
ψ_ν + iψ_μ + (a + ib)ψ_ρ → φ_ν + iφ_μ + (a + ib)φ_ρ
and thus we obtain ψ_ν + (a + ib)ψ_ρ → φ_ν + (a + ib)φ_ρ. Let Σ_ν c_ν ψ_ν be an arbitrary
vector. We shall assume that c_ν1 is real and nonzero. Since c_ν1 ψ_ν1 + c_μ ψ_μ →
c_ν1 φ_ν1 + c_μ φ_μ for all μ, it follows that Σ_ν c_ν ψ_ν → Σ_ν c_ν φ_ν.
Case (2). In the same way as in Case (1) it is easy to show that if
ψ_ν + iψ_μ → φ_ν − iφ_μ
for all pairs, then ψ_ν + (a + ib)ψ_μ → φ_ν + (a − ib)φ_μ and therefore
Σ_ν c_ν ψ_ν → Σ_ν c̄_ν φ_ν.
An isomorphic map U of ℋ1 onto ℋ2 is defined by U(Σ_ν c_ν ψ_ν) = Σ_ν c_ν φ_ν; an
anti-isomorphic map U is defined by U(Σ_ν c_ν ψ_ν) = Σ_ν c̄_ν φ_ν (see AIV, §13). In both
cases TP_ψ = P_φ where φ = Uψ. For an f ∈ ℋ2 we obtain
(TP_ψ)f = P_φ f = φ⟨φ, f⟩ = Uψ⟨Uψ, f⟩.
For f = Ug we have ⟨Uψ, Ug⟩ = ⟨ψ, g⟩ if U is an isomorphism (or ⟨Uψ, Ug⟩ =
⟨g, ψ⟩ in the case in which U is an anti-isomorphism). Thus it follows that
(TP_ψ)f = Uψ⟨ψ, g⟩ (or = (Uψ)⟨g, ψ⟩) and therefore (TP_ψ)f = UP_ψU⁻¹f (or
= (Uψ)⟨g, ψ⟩ = U[ψ⟨ψ, g⟩] = UP_ψU⁻¹f), so that TP_ψ = UP_ψU⁻¹ is proven for
both cases. In the last bracket we used the fact that, for an anti-isomorphic map,
U(aψ) = āUψ.
From Th. 5.17 it follows that:
Th. 5.19. The mapping T extended to all of G1 (or the ⊥-isomorphism T of G1
onto G2) has the form Te = UeU⁻¹ where U is either an isomorphism or
anti-isomorphism of ℋ1 onto ℋ2.
Th. 5.20. Every isomorphism or anti-isomorphism U of ℋ1 onto ℋ2 generates
a ⊥-isomorphism of G1 onto G2 by means of the equation Te = UeU⁻¹. U is
determined up to a phase factor e^{iα} by T.
PROOF. It is easy to see that e → U1 e U1⁻¹ is a ⊥-isomorphism of G1 onto G2. Let
U1 e U1⁻¹ = U2 e U2⁻¹. Then we obtain U2⁻¹U1 e = e U2⁻¹U1. The unitary map U2⁻¹U1
of ℋ1 onto itself commutes with all e ∈ G1, and we therefore obtain U2⁻¹U1 = e^{iα}1,
from which we obtain U1 = U2 e^{iα}1 = e^{±iα}U2 (where + or − depends on whether
U1, U2 are isomorphisms or anti-isomorphisms).
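Th. 5.19 and Th. 5.20 can be illustrated by a small numerical experiment. This is a sketch under stated assumptions and not part of the original text: the Hilbert space is taken to be two-dimensional, an anti-isomorphism is modeled as U = VK with V unitary and K complex conjugation, and all names and parameter values are hypothetical. The sketch checks that Te = UeU⁻¹ is again a projection, that T(1 − e) = 1 − Te, and that replacing U by e^{iα}U does not change T.

```python
import cmath
import math


def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def dagger(A):
    """Conjugate transpose of A."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]


def conj(A):
    """Entrywise complex conjugate of A."""
    return [[x.conjugate() for x in row] for row in A]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol
               for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def T_iso(U, e):
    """T e = U e U^{-1} for a unitary matrix U (isomorphism case)."""
    return matmul(matmul(U, e), dagger(U))


def T_anti(V, e):
    """T e = U e U^{-1} for the anti-unitary U = V K (assumed model)."""
    return matmul(matmul(V, conj(e)), dagger(V))


theta, beta, alpha = 0.4, 1.1, 0.9   # arbitrary illustrative parameters
V = [[complex(math.cos(theta)), -math.sin(theta) * cmath.exp(1j * beta)],
     [complex(math.sin(theta)), math.cos(theta) * cmath.exp(1j * beta)]]
one = [[1 + 0j, 0j], [0j, 1 + 0j]]
e = [[0.5 + 0j, -0.5j], [0.5j, 0.5 + 0j]]   # projection onto (1, i)/sqrt(2)

results = {}
for name, T in (("isomorphism", T_iso), ("anti-isomorphism", T_anti)):
    Te = T(V, e)
    results[name] = {
        "projection": close(matmul(Te, Te), Te) and close(dagger(Te), Te),
        "orthocomplement": close(T(V, sub(one, e)), sub(one, Te)),
        "phase_independent": close(T(scale(cmath.exp(1j * alpha), V), e), Te),
    }
print(results)
```

All three checks come out true in both cases, which is the finite-dimensional content of the statement that U is fixed by T only up to a phase factor.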
Th. 5.21. The ⊥-isomorphism Te = UeU⁻¹ of G1 onto G2 may be extended
to a ℬ-continuous effect isomorphism of L1 upon L2 by means of the equation
Tg = UgU⁻¹. The uniquely determined isomorphism of ℬ'1 onto ℬ'2 described
in Th. 5.7 is given by Ty = UyU⁻¹. T'x = U⁻¹xU is the isomorphism
of ℬ2 onto ℬ1 and is a mixture isomorphism of K2 onto K1.
PROOF. The fact that Ty = UyU⁻¹ is an effect isomorphism and is an isomorphism
of ℬ'1 onto ℬ'2 is clear. tr((T'w)e) = tr(w(Te)) = tr(wUeU⁻¹); if U is unitary, it then
follows that tr((T'w)e) = tr(U⁻¹wUe) and, therefore, T'w = U⁻¹wU. Since
tr(w(Te)) is real, from AIV, §13 it follows that, for an anti-isomorphism,
tr(wUeU⁻¹) = tr(U⁻¹wUe) and we therefore obtain T'w = U⁻¹wU.
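The duality tr((T'w)g) = tr(w(Tg)) used in the proof of Th. 5.21 can be verified on a small example. The sketch below is illustrative only: it assumes 2×2 matrices, models an anti-isomorphism as U = VK (V unitary, K complex conjugation), in which case U⁻¹wU becomes the entrywise conjugate of V†wV, and uses arbitrarily chosen w and g. Function names are hypothetical.

```python
import cmath
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]


def conj(A):
    return [[x.conjugate() for x in row] for row in A]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def T_iso(U, g):
    """Effect map T g = U g U^{-1}, U unitary."""
    return matmul(matmul(U, g), dagger(U))


def T_adj_iso(U, w):
    """Adjoint map T' w = U^{-1} w U, U unitary."""
    return matmul(matmul(dagger(U), w), U)


def T_anti(V, g):
    """Effect map for the anti-unitary U = V K (assumed model)."""
    return matmul(matmul(V, conj(g)), dagger(V))


def T_adj_anti(V, w):
    """Adjoint map U^{-1} w U = K V^{-1} w V K, i.e. conj(V^dagger w V)."""
    return conj(matmul(matmul(dagger(V), w), V))


theta, beta = 0.4, 1.1   # arbitrary illustrative parameters
V = [[complex(math.cos(theta)), -math.sin(theta) * cmath.exp(1j * beta)],
     [complex(math.sin(theta)), math.cos(theta) * cmath.exp(1j * beta)]]
w = [[0.7 + 0j, 0.2 - 0.1j], [0.2 + 0.1j, 0.3 + 0j]]   # density matrix, trace 1
g = [[0.6 + 0j, 0.1 + 0.2j], [0.1 - 0.2j, 0.5 + 0j]]   # effect, 0 <= g <= 1

gap_iso = abs(trace(matmul(T_adj_iso(V, w), g)) - trace(matmul(w, T_iso(V, g))))
gap_anti = abs(trace(matmul(T_adj_anti(V, w), g)) - trace(matmul(w, T_anti(V, g))))
norm_iso = trace(T_adj_iso(V, w))
norm_anti = trace(T_adj_anti(V, w))

print(gap_iso, gap_anti, norm_iso, norm_anti)
```

Both gaps vanish up to rounding, and T'w keeps trace 1 in either case, consistent with T' being a mixture isomorphism.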
Th. 5.22. If the restriction of a *-automorphism L → L, where L ⊂ ℬ'(ℋ),
onto G satisfies Te = e, then Tg = g is satisfied, that is, T is the identity map
of L onto itself.
PROOF. Since Tp = p for all p ∈ A(G), it follows that, by analogy with the proof of
Th. 5.8, T(λp) = τ(λ, p)p for 0 ≤ λ ≤ 1. We obtain τ(0, p) = 0, τ(1, p) = 1 and
τ(λ, p) increases monotonically with λ.
Let p_ν be finitely many pairwise orthogonal elements of A(G). It follows that, for
all g = Σ_ν λ_ν p_ν where 0 < λ_ν ≤ 1, g⁻¹ (where g⁻¹ is the reciprocal operator to
g) exists in the subspace (Σ_ν p_ν)ℋ. We now consider all atoms P_φ ≤ Σ_ν p_ν, that is,
all P_φ for which φ ∈ (Σ_ν p_ν)ℋ. We seek all values of λ, 0 ≤ λ ≤ 1, for which g ≥ λP_φ.
For the maximal λ for which g ≥ λP_φ there exists a χ (for which P_φχ ≠ 0) with
gχ = λP_φχ. With η = gχ (that is, χ = g⁻¹η) we obtain P_φη = η and λ⁻¹η =
P_φ g⁻¹P_φ η and we therefore obtain λ⁻¹ = ⟨φ, g⁻¹φ⟩. From this result it follows
that for g = Σ_{ν≠μ} p_ν + βp_μ the following relationship holds:
λ⁻¹ = (1 − ‖p_μφ‖²) + β⁻¹‖p_μφ‖²
and we therefore obtain
‖p_μφ‖² = β(1 − λ)λ⁻¹(1 − β)⁻¹.
Since Σ_{ν≠μ} p_ν ≤ g ≤ Σ_ν p_ν for g = Σ_{ν≠μ} p_ν + βp_μ, we therefore
obtain
Σ_{ν≠μ} p_ν ≤ Tg ≤ Σ_ν p_ν,
that is,
Tg = Σ_{ν≠μ} p_ν + σ_μ(β)p_μ.
τ(λ, P_φ) must be the maximal value of a τ for which Tg ≥ τP_φ. From Tg ≥ τ(λ, P_φ)P_φ,
where τ(λ, P_φ) is the maximal τ, it follows that τ(λ, P_φ)⁻¹ = ⟨φ, (Tg)⁻¹φ⟩ and we
obtain
τ(λ, P_φ)⁻¹ = (1 − ‖p_μφ‖²) + σ_μ(β)⁻¹‖p_μφ‖²,
that is,
‖p_μφ‖² = σ_μ(β)(1 − τ(λ, P_φ)) / [τ(λ, P_φ)(1 − σ_μ(β))].
If we insert the above value of ‖p_μφ‖², it follows that
σ_μ(β)(1 − β) / [β(1 − σ_μ(β))] = τ(λ, P_φ)(1 − λ) / [λ(1 − τ(λ, P_φ))].    (5.3)
In this formula we shall hold p_μ fixed, while we consider φ to vary in (Σ_ν p_ν)ℋ;
we may set the value
β = λ‖p_μφ‖² / [1 − λ(1 − ‖p_μφ‖²)]
in the above equation. The left-hand side depends therefore only on λ and ‖p_μφ‖²;
therefore the right side can depend only on λ and ‖p_μφ‖². If the Hilbert space ℋ is
more than two-dimensional then the right-hand side is independent of all other
components ‖p_νφ‖² (ν ≠ μ). Since the definition of τ(λ, P_φ) does not depend on the
choice of p_μ it follows that the right side is independent of each ‖p_λφ‖² for every
arbitrary atom p_λ in G and is therefore independent of P_φ (for all φ ∈ (Σ_ν p_ν)ℋ).
On the other hand we have
λ = β / [β(1 − ‖p_μφ‖²) + ‖p_μφ‖²].
If in (5.3) we hold β on the left-hand side fixed, then the expression on the left side
is a constant, while λ on the right-hand side (which is independent of P_φ) may vary,
and for fixed β ≠ 0, may vary between 1 and β. The right side depends only on λ
and is therefore constant in the interval 1 > λ > β. Since β is arbitrary in
1 > β > 0 it follows that the right side is constant for 1 > λ > 0 (for λ = 0 we
obtain τ(0, P_φ) = 0). Since the left side of (5.3) is positive, there exists a constant
a > 0 such that
τ(λ, P_φ) = aλ / (1 + λa − λ)
(this is also the case for λ = 0!).
τ(λ, P_φ) is therefore independent of P_φ for all φ ∈ (Σ_ν p_ν)ℋ; τ(λ, P_φ) = τ(λ).
Since each pair φ1, φ2 together with a φ ∈ (Σ_ν p_ν)ℋ lies in a finite-dimensional
subspace of ℋ we therefore obtain
T(λP_φ) = τ(λ)P_φ for all φ ∈ ℋ.
If e ∈ G and if g = αe, where 0 ≤ α ≤ 1, then g is uniquely determined by
g ≥ αP_φ for all φ ∈ eℋ and g ≱ (α + ε)P_φ for ε > 0. Therefore Tg ≥ τ(α)P_φ for all
φ ∈ eℋ and Tg ≱ τ(α + ε)P_φ. Since g ≤ e we obtain Tg ≤ e and therefore obtain
T(αe) = τ(α)e.
Let g = (1 − e) + βe; then, for all φ, g ≥ λP_φ, where λ⁻¹ = ⟨φ, g⁻¹φ⟩ =
(1 − ‖eφ‖²) + β⁻¹‖eφ‖², and we obtain Tg ≥ τ(λ)P_φ. Thus it follows that
Tg = (1 − e) + τ(β)e because for Tg = (1 − e) + τ(β)e it follows that τ(λ)⁻¹ =
(1 − ‖eφ‖²) + τ(β)⁻¹‖eφ‖², which, for the above value of λ and the form of the
function τ, leads to an identity.
Since T is a *-automorphism of L, we must have T(1 − αe) = 1 − τ(α)e. Since
1 − αe = (1 − e) + (1 − α)e, it follows that T(1 − αe) = (1 − e) + τ(1 − α)e, that
is, 1 − τ(α) = τ(1 − α), from which we obtain a = 1 and τ(λ) = λ.
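The last step can be spelled out. The following short derivation is added for the reader's convenience and is not part of the original proof; it only assumes the form τ(λ) = aλ/(1 + λa − λ) with a > 0 obtained above.

```latex
% Assumption: \tau(\lambda) = a\lambda / (1 + \lambda a - \lambda), \ a > 0.
% A convenient equivalent form, which also shows why the identity for
% g = (1 - e) + \beta e holds:
\tau(\lambda)^{-1} - 1 = \frac{1 + (a-1)\lambda}{a\lambda} - 1
                       = \frac{1}{a}\left(\lambda^{-1} - 1\right).
% The *-automorphism property requires 1 - \tau(\alpha) = \tau(1 - \alpha):
1 - \tau(\alpha) = \frac{1 - \alpha}{1 + (a-1)\alpha},
\qquad
\tau(1 - \alpha) = \frac{a(1 - \alpha)}{a - (a-1)\alpha}.
% Equating both sides for 0 < \alpha < 1 and cross-multiplying:
a - (a-1)\alpha = a + a(a-1)\alpha
\;\Longrightarrow\; (a-1)(a+1)\,\alpha = 0
\;\Longrightarrow\; a = 1, \qquad \tau(\lambda) = \lambda .
```

Since a > 0, the factor a + 1 cannot vanish, so a = 1 is the only solution.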
Since each g ∈ L is determined by the maximal λ for which g ≥ λP_φ for all φ, we
obtain Tg = g.
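The key formula of the proof, λ⁻¹ = ⟨φ, g⁻¹φ⟩ for the maximal λ with g ≥ λP_φ, can be checked numerically. The sketch below is illustrative and not taken from the text: it assumes an invertible 2×2 effect g and a unit vector φ chosen arbitrarily, and it uses the fact that a Hermitian 2×2 matrix with positive trace is positive semidefinite exactly when its determinant is nonnegative. Function names are hypothetical.

```python
import math


def max_lambda(g, phi):
    """Largest lambda with g >= lambda * P_phi for an invertible 2x2 effect g.

    Uses the relation 1/lambda = <phi, g^{-1} phi>.
    """
    (a, b), (c, d) = g
    det = a * d - b * c
    g_inv = [[d / det, -b / det], [-c / det, a / det]]
    g_inv_phi = [g_inv[i][0] * phi[0] + g_inv[i][1] * phi[1] for i in range(2)]
    expectation = sum(p.conjugate() * q for p, q in zip(phi, g_inv_phi))
    return 1.0 / expectation.real


def shifted_det(g, phi, lam):
    """det(g - lam * P_phi), where P_phi = |phi><phi|."""
    m = [[g[i][j] - lam * phi[i] * phi[j].conjugate() for j in range(2)]
         for i in range(2)]
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real


g = [[0.6 + 0j, 0.1 + 0.2j], [0.1 - 0.2j, 0.5 + 0j]]   # illustrative effect
s = 1 / math.sqrt(2)
phi = [s + 0j, s * 1j]                                  # unit vector (1, i)/sqrt(2)

lam = max_lambda(g, phi)
print(lam)
print(shifted_det(g, phi, lam - 1e-3),   # positive: g - lam*P_phi still >= 0
      shifted_det(g, phi, lam),          # zero up to rounding: boundary case
      shifted_det(g, phi, lam + 1e-3))   # negative: positivity is lost
```

For the values chosen here ⟨φ, g⁻¹φ⟩ = 3, so the maximal λ is 1/3 up to rounding; the determinant of g − λP_φ changes sign exactly at this value.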
Th. 5.23. Each *-isomorphism T of L1 onto L2 (where L1 ⊂ ℬ'(ℋ1),
L2 ⊂ ℬ'(ℋ2)) has the form Tg = UgU⁻¹ with an isomorphic or
anti-isomorphic map U of ℋ1 onto ℋ2.
Proof. The restriction of T onto G1 is a ⊥-isomorphism. On G1 we therefore
obtain Te = UeU⁻¹ with either an isomorphism or anti-isomorphism U of
ℋ1 onto ℋ2. T̃g = U⁻¹(Tg)U is a *-automorphism of L1 onto itself with T̃e = e for
e ∈ G1. According to Th. 5.22, T̃ is the identity map on all of L1; therefore
Tg = UgU⁻¹.
Here we have closed the circle: all *-isomorphisms of L1 onto L2 are
determined in the above way. We have formulated these theorems such that
we can begin the circle at any point, for example, with the isomorphic maps
of A(G1) onto A(G2) and with Th. 5.17 or with all ⊥-isomorphisms of G1
onto G2.
Th. 5.23 is not valid for reducible L1, L2, so that, in general, a
*-isomorphism is not necessarily an effect isomorphism. If ℋ1 = ℋ2 and
L1 = L2, G1 = G2, we need only replace "iso" with "auto" in all the preceding
theorems. An automorphism U of a Hilbert space onto itself will also be
called a unitary mapping; similarly an anti-automorphism U will be
called an anti-unitary map.
Before closing this chapter we shall consider the *-automorphisms T of the
whole reducible system L ⊂ ℬ'(ℋ1, ℋ2, ...). Each y ∈ ℬ'(ℋ1, ℋ2, ...) can be
considered to be an operator y = Σ_ν y_ν where each y_ν operates in ℋ_ν.
Let U_p(ν)ν denote the isomorphism (or anti-isomorphism) of ℋ_ν onto ℋ_p(ν).
Then we obtain
Ty = Σ_ν U_p(ν)ν y_ν U_p(ν)ν⁻¹.    (5.4)
If we embed ℋ = ⋃_ν ℋ_ν into the Hilbert space
Σ_ν ⊕ ℋ_ν,
then we may consider a y ∈ ℬ'(ℋ1, ℋ2, ...) to be an operator having the form
yφ = Σ_ν y_ν φ_ν
where
φ = Σ_ν φ_ν and φ_ν ∈ ℋ_ν.
If the U_p(ν)ν are all isomorphisms, then a unitary operator is defined by
Uφ = Σ_ν U_p(ν)ν φ_ν    (5.5)
and we obtain
Ty = UyU⁻¹.    (5.6)
In general (that is, if not all the U_p(ν)ν are isomorphisms or not all are
anti-isomorphisms) the mapping defined by (5.5) is neither unitary nor
anti-unitary.
The map T' which is adjoint to T is given by
T'x = Σ_ν U_p(ν)ν⁻¹ x_p(ν) U_p(ν)ν    (5.7)
or, according to (5.5), is given by
T'x = U⁻¹xU.    (5.8)
Here it is important to note that U is not only defined except for a factor e^{iα}
by (5.5) and (5.6) but that with
Ũφ = Σ_ν e^{iα_ν} U_p(ν)ν φ_ν    (5.9)
we have also
Ty = UyU⁻¹ = ŨyŨ⁻¹.    (5.10)
With the proof of the preceding theorems we have, at the same time, also
investigated the structure of mixture isomorphisms because, according to
§4.1 and §4.2, the mixture isomorphisms S and ℬ-continuous effect
isomorphisms T correspond uniquely by the equations S' = T and T' = S.
Therefore we do not need to derive any additional theorems for mixture
isomorphisms.
In deriving the structure of mixture isomorphisms we have, after all, not
needed all the preceding theorems. These theorems have served to show
"how few" assumptions about the maps L1 → L2 or G1 → G2 or
A(G1) → A(G2) were already sufficient to determine the ℬ-continuous effect
isomorphisms. If we had begun with the mixture isomorphism S, then we
would only need the following theorems for T = S':
(1) The restriction of T onto G1 is a ⊥-isomorphism G1 → G2 (see
Th. 5.1).
(2) T maps A(G1) bijectively onto A(G2) and, correspondingly, T⁻¹ maps
A(G2) onto A(G1), where orthogonal atoms are mapped onto
orthogonal atoms (this result follows directly from (1)).
(3) Th. 5.11.
(4) Th. 5.13, Th. 5.14, Th. 5.15, and Th. 5.16.
(5) Th. 5.18.
(6) T'w = Sw = U⁻¹wU follows directly from Th. 5.21 and Th. 5.7.
Th. 5.22 and 5.23 were not needed.
With T' = S and for x ∈ ℬ(ℋ1, ℋ2, ...), from (5.7) we finally obtain
Sx = Σ_ν U_p(ν)ν⁻¹ x_p(ν) U_p(ν)ν.    (5.11)
A mixture isomorphism is 𝒟-continuous if T = S' transforms the space 𝒟
into itself. If we assume that the center Z of G is a subset of 𝒟 then the
components y_ν of a y ∈ 𝒟 form a Banach subspace 𝒟_ν of ℬ'(ℋ_ν). Then S is
𝒟-continuous only if U_p(ν)ν y_ν U_p(ν)ν⁻¹ ∈ 𝒟_p(ν) for y_ν ∈ 𝒟_ν.
Th. 5.24. Every p-continuous r-isomorphism h determines (by means of the
map ψℱ1 → ψℱ2 ⊂ L2 defined in Th. 3.2) a ℬ-continuous effect isomorphism
T which maps 𝒟1 isomorphically onto 𝒟2 (as Banach spaces). T' is a
𝒟-continuous mixture isomorphism.
Proof. According to Th. 4.2.6 we need only show that T: L1 → L2 is bijective and
that T: 𝒟1 → 𝒟2 is bijective. Since ψℱ1 is σ-dense in L1 and ψℱ2 is σ-dense in L2, it
follows from ψℱ1 → ψℱ2 that T: L1 → L2 is surjective. For the map T̃ corresponding
to h⁻¹ on ψℱ1 and ψℱ2 we obtain TT̃ = 1 and T̃T = 1. Since T̃: L2 → L1 is
surjective, it follows that T: L1 → L2 is injective. Since T̃: 𝒟2 → 𝒟1 and T̃ = T⁻¹ we
find that T: 𝒟1 → 𝒟2 is bijective.
CHAPTER VI
Representation of Groups by Means of Effect
Automorphisms and Mixture Automorphisms
In the previous chapter we have seen that there is a one-to-one relationship
between the $\mathcal{B}$-continuous effect automorphisms and mixture automorphisms
which is defined by the adjoint maps. In this chapter we shall
investigate the representation of groups by means of $\mathcal{B}$-continuous effect
automorphisms. If we make the transition from the $\mathcal{B}$-continuous effect
automorphisms to the mixture automorphisms, then we obtain the
corresponding “adjoint” representation. Thereby, to each representation of a
group by mixture automorphisms we obtain a corresponding “adjoint”
representation in terms of $\mathcal{B}$-continuous effect automorphisms. Here we
shall only consider representations by means of $\mathcal{B}$-continuous effect
automorphisms because the representation of a group by means of mixture
automorphisms would only result in an unnecessary repetition of the results
derived here.
1 Homomorphic Maps of a Group $\mathcal{G}$ in the Group $\mathcal{A}$ of
$\mathcal{B}$-continuous Effect Automorphisms
From the fact that the product of two effect automorphisms is an effect
automorphism and that both the identity and the inverse of a $\mathcal{B}$-continuous
effect automorphism are again such, it easily follows that the set of
$\mathcal{B}$-continuous effect automorphisms forms a group. By the expression
"representation of a group $\mathcal{G}$ by means of $\mathcal{B}$-continuous effect automorphisms"
we mean a (group) homomorphism of $\mathcal{G}$ into the group $\mathcal{A}$ of $\mathcal{B}$-continuous
effect automorphisms.
If we are given a map of a set $M$ into the set $\mathcal{A}$, then we may easily
construct a mapping $M \times \mathcal{B}' \to \mathcal{B}'$ in the following way: For $a \in M$ and
$a \to T$ ($T \in \mathcal{A}$) we set $(a, y) \to Ty$. For convenience we shall often use the
abbreviation $ay$ for the image $Ty$ of $(a, y)$, provided that in the particular
circumstance there exists a particular fixed map $a \to T$. In addition, for
the image set $\{ay \mid a \in M\}$ we will use the abbreviation $My$; similarly,
for $\{ay \mid a \in M,\ y \in \mathcal{B}'\}$ we will use the abbreviation $M\mathcal{B}'$, and we write
$\{ay \mid a \in M,\ y \in L\} = ML$.
Therefore, for $a, b \in \mathcal{G}$, for a representation of a group $\mathcal{G}$ we obtain
$a(by) = (ab)y$ for all $y \in L$ (and therefore all $y \in \mathcal{B}'$). Thus it follows ($e$ the unit
element of $\mathcal{G}$) that $a(ey - y) = 0$; since $a$ is an effect automorphism it follows
that $ey = y$. From $a^{-1}a = e$ it then follows that $a^{-1}ay = y$, that is, $a^{-1}$ is the
inverse effect automorphism relative to $a$.
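The short argument above can be written out step by step. The following block is only an expanded restatement, using nothing beyond the homomorphism property $a(by) = (ab)y$, the linearity of $a$ on $\mathcal{B}'$, and the injectivity of the effect automorphism $a$:

```latex
\begin{align*}
a(ey) &= (ae)y = ay
  && \text{(homomorphism property with } b = e\text{)} \\
\Rightarrow\quad a(ey - y) &= 0
  && \text{(linearity of } a\text{)} \\
\Rightarrow\quad ey &= y
  && \text{(} a \text{ is injective)} \\
a^{-1}(ay) &= (a^{-1}a)y = ey = y
  && \text{(so } a^{-1} \text{ acts as the inverse of } a\text{)}
\end{align*}
```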
1.1 Generation of a Representation of $\mathcal{G}$ in $\mathcal{A}$ by Means of a
Representation of $\mathcal{G}$ by r-Automorphisms
In applications we frequently find that a representation of a group $\mathcal{G}$ is given
in terms of r-automorphisms. We will therefore assume that to each element
$a \in \mathcal{G}$ there corresponds an r-automorphism $\mathcal{F} \xrightarrow{a} \mathcal{F}$ which satisfies
$a_1(a_2 f) = (a_1 a_2)f$ (for all $f \in \mathcal{F}$). Thus, for $a_2 = e$ we obtain $a(ef) = af$. Since
$a$ is injective, it follows that $ef = f$, that is, $e$ is the identity map in $\mathcal{F}$. From
$(a^{-1}a)f = ef = f$ it follows that $a^{-1}$ is the inverse r-automorphism which
corresponds to $a$.
If, for all $a \in \mathcal{G}$, the maps $\mathcal{F} \xrightarrow{a} \mathcal{F}$ are p-continuous, then $a$ and also $a^{-1}$
are, according to V, D 3.6, p-continuous r-automorphisms.
If we are given a representation of $\mathcal{G}$ in terms of p-continuous
r-automorphisms, then according to V, Th. 5.24, there exists a representation
of $\mathcal{G}$ by means of $\mathcal{B}$-continuous effect automorphisms, that is, a
representation of $\mathcal{G}$ in $\mathcal{A}$. In addition, according to V, Th. 5.24, for all $a$ in $\mathcal{G}$ we
obtain $a\mathcal{D} = \mathcal{D}$, that is, $\mathcal{G}$ leaves $\mathcal{D}$ invariant.
If to each r-automorphism $a \in \mathcal{G}$ there exists a dual p-automorphism $a'$
then, according to V, Th. 3.4, all $a$ are p-continuous. For the representation
of $\mathcal{G}$ in $\mathcal{A}$ obtained in this way the mixture automorphisms which
correspond to the p-automorphisms are precisely the adjoint maps
corresponding to the representation elements of $\mathcal{G}$ in $\mathcal{A}$.
In the following we shall only consider topological groups $\mathcal{G}$ (since we may
consider finite and countably infinite groups to be topological groups under
the discrete topology, the following considerations are also valid for such
groups).
The topology of a topological group is uniquely determined by the
neighborhood filter of the unit element $e$ (see AV, §10.1). A topological group
is given if the neighborhood filter of the unit element satisfies the conditions
below; these conditions will be interpreted “physically.” By this we mean that
the neighborhood filter of the unit element should (ideally) relate physical
imprecision to the group elements which will be physically interpreted as
being distinguishable from the unit element (see [1], §5 and §9). Therefore it is
“physically” reasonable to postulate the following conditions:
TG 1. To each neighborhood $U$ of the unit element there exists a
neighborhood $V$ for which $VV \subset U$.
TG 2. To each neighborhood $U$ of the unit element there exists a
neighborhood $V$ for which $V^{-1} \subset U$.
$VV$ is defined as the set of all $ab$ for which $a, b \in V$. TG 1 therefore says that
the product of two elements $a, b$ is not distinguishable from the unit element
(with the imprecision represented by the set $U$) if both elements are “near
enough” to the unit element.
Similarly TG 2 says that if $a$ is not “distinguishable” from the unit element
then $a^{-1}$ is also not “distinguishable” from the unit element.
TG 3. To each $a \in \mathcal{G}$ and each neighborhood $U$ of the unit element there
exists another neighborhood $V$ for which $V \subset aUa^{-1}$, that is, $a^{-1}Va \subset U$.
TG 3 says the following: if a group element $b$ cannot be distinguished from the unit
element (with imprecision defined by $V$) and we consider $a^{-1}ba$, that is, we
apply $b$ “at the location $a$” and then transform back by $a$, then we obtain an
element which is not distinguishable from the unit element.
A neighborhood filter of the unit element (or a basis for such a
neighborhood filter) which satisfies TG 1-TG 3 makes $\mathcal{G}$ into a topological group
for which a neighborhood basis of an element $a \in \mathcal{G}$ is given by the sets $aU$, where $U$
runs over the neighborhoods of the unit element (AV, §10.1).
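As a simple illustration of TG 1-TG 3 (an assumed example, not taken from the text), consider the additive group of real numbers, written additively so that $VV$ becomes $V + V$ and $aUa^{-1}$ becomes $a + U - a$, with the intervals $U_\varepsilon = (-\varepsilon, \varepsilon)$ as a basis of the neighborhood filter of the unit element $0$:

```latex
% Illustrative example (assumption): G = (R, +), U_eps = (-eps, eps).
\begin{align*}
\text{TG 1:}\quad & V = U_{\varepsilon/2}: \quad
   V + V = \{a + b \mid |a|, |b| < \varepsilon/2\} \subset U_\varepsilon, \\
\text{TG 2:}\quad & V = U_\varepsilon: \quad -V = U_\varepsilon, \\
\text{TG 3:}\quad & V = U_\varepsilon: \quad
   a + U_\varepsilon - a = U_\varepsilon \quad \text{for every } a \in \mathbb{R}.
\end{align*}
% Neighborhood basis of a point a: the intervals a + U_eps = (a - eps, a + eps).
```

Because this group is commutative, TG 3 is trivially satisfied; the condition only becomes a genuine restriction for noncommutative groups.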
With the neighborhood filter of the unit element there are two (possibly
coinciding) uniform structures defined on $\mathcal{G}$ which are compatible with the
group operations.
A right-handed (or left-handed) uniform structure is defined by the family
of sets $\{(a, b) \mid ba^{-1} \in U\}$ (or $\{(a, b) \mid a^{-1}b \in U\}$, respectively), where $U$ is a
neighborhood of the unit element. The topologies associated with the left- and
right-handed uniform structures are identical with the original topology of
the group.
If a group $\mathcal{G}$ is complete with respect to the right-handed uniform
structure, then it is also complete with respect to the left-handed uniform
structure, and conversely (AV, §10.2). We may complete $\mathcal{G}$ as a group if the
map $a \to a^{-1}$ transforms a Cauchy filter of a right-handed uniform structure
into a Cauchy filter of the right-handed uniform structure (AV, §10.2). We
will assume that this is the case for all physically meaningful groups. Then,
instead of $\mathcal{G}$ we may use its completion. For this reason we shall assume that
all groups which have a “physical interpretation” are complete. (Finite and
denumerable groups with discrete topologies are complete.) Every locally
compact group is complete (AV, §10.2).
In addition to the so-called “finiteness of physics” condition (see [1], §9) we
shall require that physically interpretable groups also satisfy the following
condition:
$\mathcal{G}$ is separable, and the neighborhood filter of the unit element has a
denumerable basis. Then it follows that the uniform structure of $\mathcal{G}$ is
metrizable (see AV, §10.2). Then $\mathcal{G}$ is a Baire space (see AII, §3) since $\mathcal{G}$ is
complete.
Thus we assume that “physical” groups $\mathcal{G}$ will always be topological groups
which are metrizable, complete and separable relative to the corresponding
unique right- and left-handed uniform structures.
As we mentioned above, finite and denumerable groups with discrete
topologies satisfy the above requirements.
We have provided a simple physical interpretation for the neighborhood
filter of the unit element and derived two uniform structures from the
compatibility condition; we have also found that we may consider the group
to be complete with respect to this uniform structure. This does not, however,
mean that one of these uniform structures describes the physical imprecision
for the whole group, that is, the physical imprecision which permits us to
compare two arbitrary group elements. Physically it is an important
distinction whether we are only able to consider elements $b_2 \in \mathcal{G}$ which are
neighbors of a fixed element $b_1$ or whether we have a procedure by which we may
compare arbitrary pairs $(b_1, b_2)$.
The fact that we have considered the set $aUa^{-1}$ where $U$ is an arbitrary
neighborhood of the unit element does not contradict the assumption that
only elements in the vicinity of the unit element may be compared. From the
above symmetry assumption we may at most conclude that we cannot better
distinguish elements of $U$ if we perform a translation of the unit to the
location $a$. Our assumption about the uniform structure of physical
imprecision for the group implies the opposite: We cannot compare a pair of
arbitrary elements as well as we can two elements in the vicinity of the
unit element. As we have explained in [1], §9, for the uniform structure of
physical imprecision it is desirable that the set under consideration (in this
case $\mathcal{G}$) is precompact.
For a “physical” group we shall now assume that, in addition, there exists
a further uniform structure $ph$ of physical imprecision on $\mathcal{G}$ which is
weaker than the above and for which $\mathcal{G}_{ph}$ (that is, $\mathcal{G}$ together with the uniform
structure $ph$) is precompact and metrizable and generates the same topology,
that is, the topology of $\mathcal{G}$ which is compatible with the group operations. In
addition, for fixed $a$ the maps $b \to ab$ and $b \to ba$ should be $ph$-uniformly
continuous.
Since the topologies of $\mathcal{G}$ and $\mathcal{G}_{ph}$ are identical, $\mathcal{G}_{ph}$ is separable.
In AV, §10.2 we have proven that such a structure $ph$ always exists for $\mathcal{G}$. If
$\mathcal{G}$ is compact, then $ph$ and both the left- and right-handed uniform structures
of $\mathcal{G}$ are identical because a compact space has only a single uniform
structure. If $\mathcal{G}$ is not compact, then the uniform structure $ph$ is not uniquely
determined by $\mathcal{G}$ and by the above requirements. Then $ph$ requires an
additional physical structure which is not determined by the neighborhood
filter of the unit element.
The completion $\hat{\mathcal{G}}_{ph}$ of $\mathcal{G}_{ph}$ is compact and is called the physical
compactification of $\mathcal{G}$. In general $\hat{\mathcal{G}}_{ph}$ is not a group. If $\mathcal{G}$ is compact, then $\mathcal{G}$ is its
own compactification. Although $\hat{\mathcal{G}}_{ph}$ is not necessarily a group, the maps
$b \to ab$ and $b \to ba$ may be, for fixed $a \in \mathcal{G}$, uniquely extended onto $\hat{\mathcal{G}}_{ph}$ as
$ph$-uniformly continuous maps.
The above results are also valid for finite and denumerably infinite groups
with the discrete topology; finite groups are already compact in this sense;
denumerably infinite groups $\mathcal{G}$ must yet be made compact, where the
topology induced by the compactification $\hat{\mathcal{G}}_{ph}$ on $\mathcal{G}$ is the discrete topology!
In cases in which misunderstandings are unlikely, we will often write $\hat{\mathcal{G}}$
instead of $\hat{\mathcal{G}}_{ph}$.
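Th. 1.1.1. The set $\hat{\mathcal{G}}$ may be partitioned under (left and right) multiplication
with the elements of $\mathcal{G}$ into the invariant subsets $\mathcal{G}$ and $\hat{\mathcal{G}} \setminus \mathcal{G}$.
Proof. If $b \in \hat{\mathcal{G}}$ and $b_\nu \to b$ with $b_\nu \in \mathcal{G}$ then for $a \in \mathcal{G}$ it follows that $ab_\nu \to ab$.
Suppose $ab = c \in \mathcal{G}$; then from $ab_\nu \to c$ it also follows that $a^{-1}(ab_\nu) \to
a^{-1}c = b \in \mathcal{G}$.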
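A concrete noncompact case may help to visualize the physical compactification. The following is an assumed illustration, not an example given in the text: take the additive group of real numbers and let the uniform structure $ph$ be defined by the metric $d_{ph}$ below, which amounts to the familiar two-point compactification of the real line.

```latex
% Illustrative example (assumption, not from the text): G = (R, +).
\begin{align*}
d_{ph}(a, b) &= |\tanh a - \tanh b| \le |a - b|
  && \text{($ph$ is weaker than the group uniform structure)} \\
(\mathbb{R}, d_{ph}) &\cong (-1, 1) \subset [-1, 1]
  && \text{(precompact; completion } \hat{\mathcal{G}}_{ph} \cong [-\infty, +\infty]\text{)} \\
\tanh(a + b) &= \frac{\tanh a + \tanh b}{1 + \tanh a \, \tanh b}
  && \text{(continuous in } t = \tanh b \in [-1, 1]\text{, with } \pm 1 \mapsto \pm 1\text{)}
\end{align*}
% Hence b -> a + b extends to the compactification by fixing -infinity and
% +infinity, and the extension is ph-uniformly continuous. The two invariant
% subsets of Th. 1.1.1 are R itself and the remainder {-infinity, +infinity}.
```

In this sketch the compactification is not a group, since the two added points have no inverses, yet every translation by a fixed $a$ still acts on it, in agreement with Th. 1.1.1.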
The physical meaning of the uniform structure $ph$ should also be evident in
the representation of $\mathcal{G}$.
Since, by assumption, $\mathcal{G}$ is complete and $\mathcal{F}$ is denumerable (see III, §3), $\mathcal{G}$
cannot, in general, be represented by p-continuous r-automorphisms. Since
$\mathcal{G}$ is separable, there exists a countable dense subgroup $\tilde{\mathcal{G}}$ in $\mathcal{G}$. We shall now
assume that there exists a representation of such a subgroup $\tilde{\mathcal{G}}$ by means of
p-continuous r-automorphisms.
It is reasonable to impose the following additional condition: If $a$ is an
element of $\tilde{\mathcal{G}}$ in the neighborhood of the unit element, then for all $w \in \mathcal{K}$
it is, for all practical purposes, not possible to distinguish between the
probabilities $\mu(w, \psi(f))$ and $\mu(w, \psi(af))$, that is, $f$ and $af$ yield, for all
preparation procedures, the same probabilities if $a$ is sufficiently close to $e$. In
the same way, for fixed $w \in \mathcal{K}$ and arbitrary $f$ the above probabilities cannot be
distinguished. We shall formulate this in the following mathematical axiom:
AG 1. To each $f \in \mathcal{F}$ and $\varepsilon > 0$ there exists a neighborhood $U_\varepsilon$ of the unit
element in $\tilde{\mathcal{G}}$, such that
$$|\mu(w, \psi(f)) - \mu(w, \psi(af))| < \varepsilon$$
for all $w \in \mathcal{K}$ and $a \in U_\varepsilon$. To each $w \in \mathcal{K}$ and $\delta > 0$ there exists a
neighborhood $U_\delta$ of the unit element in $\tilde{\mathcal{G}}$ such that
$$|\mu(w, \psi(f)) - \mu(w, \psi(af))| < \delta$$
for all $f \in \mathcal{F}$ and $a \in U_\delta$.
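If AG 1 holds, then the following theorems hold:
Th. 1.1.2. For fixed $y \in \mathcal{D}$ the map $\tilde{\mathcal{G}} \to \mathcal{D}$ defined by $a \to ay$ is uniformly
continuous in the norm topology of $\mathcal{D}$. For fixed $x \in \mathcal{B}$ the map $\tilde{\mathcal{G}} \to \mathcal{B}$
defined by $a \to a'x$ is uniformly continuous in the norm topology of $\mathcal{B}$.
Proof. Since $\mathcal{L} = \psi\mathcal{F}$ is norm-dense in $L \cap \mathcal{D}$, it is sufficient to prove the first
assertion for $y \in \mathcal{L}$ as follows:
$$\begin{aligned}
\|a_1\psi(f) - a_2\psi(f)\| &= \sup_{w \in \mathcal{K}} |\mu(w, a_1[\psi(f) - a_1^{-1}a_2\psi(f)])| \\
&= \sup_{w \in \mathcal{K}} |\mu(a_1'w, \psi(f) - \psi(a_1^{-1}a_2 f))| \\
&= \sup_{w \in \mathcal{K}} |\mu(w, \psi(f) - \psi(a_1^{-1}a_2 f))|.
\end{aligned}$$
The last step uses $a_1'\mathcal{K} = \mathcal{K}$. According to AG 1 we therefore obtain
$$\|a_1\psi(f) - a_2\psi(f)\| < \varepsilon$$
for $a_1^{-1}a_2 \in U_\varepsilon$.
Since $\mathcal{K}$ is norm-dense in $K$ it suffices to prove the second assertion for $x \in \mathcal{K}$
($\mathcal{L}$ is $\sigma$-dense in $L$):
$$\begin{aligned}
\|a_1'w - a_2'w\| &= \sup_{g \in L} \mu(a_1'w - a_2'w, 2g - 1) \\
&= 2 \sup_{g \in L} \mu(a_1'w - a_2'w, g) \\
&= 2 \sup_{f \in \mathcal{F}} \mu(w, \psi(a_1 f) - \psi(a_2 f)) \\
&= 2 \sup_{f \in \mathcal{F}} \mu(w, \psi(f) - \psi(a_2 a_1^{-1} f)).
\end{aligned}$$
According to AG 1 we therefore obtain $\|a_1'w - a_2'w\| < \delta$ for $a_2 a_1^{-1} \in U_\delta$.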
Th. 1.1.3. By continuous extension of the map $\tilde{\mathcal{G}} \to \mathcal{A}$ described in
Th. 1.1.2 we may obtain a representation of $\mathcal{G}$ by means of $\mathcal{B}$-continuous
effect automorphisms which maps $\mathcal{D}$ into itself. For $x \in \mathcal{B}$, $y \in \mathcal{D}$ the maps
$\mathcal{G} \to \mathbf{R}$ defined by $a \to \mu(x, ay)$ are uniformly continuous.
Proof. Since $\tilde{\mathcal{G}} \to \mathcal{D}$ is uniformly continuous and $\mathcal{D}$ is complete with respect to
the norm it follows that its continuous extension defines, for each $y \in \mathcal{D}$, a map
$\mathcal{G} \to \mathcal{D}$. If we write this map as $a \to ay$, then it follows that $a$ is linear. Since $\mathcal{L}$ is
norm-dense in $L \cap \mathcal{D}$, it follows that $a(L \cap \mathcal{D}) \subset L \cap \mathcal{D}$ and that $a$ is
norm-continuous. The map $y \to ay$ is also $\mathcal{B}$-continuous. This result follows from
Th. 1.1.2, since, for all $a \in \tilde{\mathcal{G}}$ it follows that $\mu(x, ay) = \mu(a'x, y)$ and that the
mapping $a \to a'x$ is uniformly continuous with respect to the norm topology in $\mathcal{B}$
and therefore can be extended to a map $\mathcal{G} \to \mathcal{B}$ for which, to each $a \in \mathcal{G}$, a mapping
$a'$ of $\mathcal{B}$ into $\mathcal{B}$ is determined. From $\mu(a'x, y) = \mu(x, ay)$ for $a \in \tilde{\mathcal{G}}$ it follows that, in
the limit, $\mu(a'x, y) = \mu(x, ay)$ for all $a \in \mathcal{G}$, from which we have proven the
$\mathcal{B}$-continuity of $a$.
Since $a$ (as a map of $\mathcal{D} \cap L$ into itself) is $\mathcal{B}$-continuous, it can be extended onto the
compact set $L$. Therefore $a$ is a $\mathcal{B}$-continuous effect morphism. Since $a'\mathcal{K}$ is, for all
$a \in \tilde{\mathcal{G}}$, equal to $\mathcal{K}$ (and therefore norm-dense in $K$) it follows that $a'\mathcal{K}$ is
norm-dense in $K$ for all $a \in \mathcal{G}$, from which we conclude that, according to V, Th. 4.1.4,
$a$ is an effect automorphism.
Since the map $a \to \mu(x, ay)$ is uniformly continuous as a map $\tilde{\mathcal{G}} \to \mathbf{R}$, its
extension on $\mathcal{G}$ is also uniformly continuous.
We must now make the uniform structure $ph$ of physical imprecision
evident in the representations. We have seen that the mapping
$a \to \mu(w, \psi(af))$ of $\tilde{\mathcal{G}}$ into $\mathbf{R}$ is uniformly continuous. If the elements of $\tilde{\mathcal{G}}$ are
to be physically distinguishable according to $ph$, then it is necessary to
impose the following strong requirement:
AG 2. For each $w \in \mathcal{K}$ and $f \in \mathcal{F}$ the mapping $\tilde{\mathcal{G}}_{ph} \to \mathbf{R}$ defined by
$a \to \mu(w, \psi(af))$ is uniformly continuous.
From AG 2 we obtain, by continuous extension:
Th. 1.1.4. The mapping $\mathcal{G}_{ph} \to \mathbf{R}$ defined by $a \to \mu(x, ay)$ for $x \in \mathcal{B}$, $y \in \mathcal{D}$
is uniformly continuous.
1.2 Some General Properties of a Representation of $\mathcal{G}$ in $\mathcal{A}$
In the previous section it has become evident that the representation of
groups by means of $\mathcal{B}$-continuous effect morphisms plays an important role.
If such a representation is generated by r-automorphisms then this
“generation” will only play a role for the interpretation question (see, for example,
VII). Since neither the sets $\mathcal{Q}$ and $\mathcal{F}$ nor the sets $\mathcal{K}$ and $\mathcal{L}$ are axiomatically
fixed, except that for $\mathcal{K}$ each denumerable set in $K$ which is norm-dense in $K$,
and for $\mathcal{L}$ each denumerable $\sigma$-dense set in $L$, may be used, we are led to
concentrate our efforts on questions about the representation of a group on
representations in $\mathcal{A}$. Unfortunately, the representation theory of groups by
means of positive automorphisms of base-norm spaces has not yet been
sufficiently developed for us to be able to present a comprehensive outline of
even a part of a “general” representation theory. This is, in part, due to the
fact that the relationships between special structures of base-norm spaces (for
example, the special structures formulated by axioms AV 1.1-AV 4 in III, §4)
and the structure of the automorphism groups are not well known (see, for
example, the attempts in this direction presented in [20]). For this reason in
this book we shall often refer to the special structure of these automorphism
groups as groups of $\mathcal{B}$-continuous effect automorphisms such as those which
were considered for $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ in V. Now, in the remaining part of
this section we shall consider a number of general and easily formulated
properties (which will later be seen to be “physically” meaningful).
D 1.2.1. A $g \in L$ is called a $\mathcal{G}$-invariant effect if $\mathcal{G}g = g$. An $e \in G$ is said to
be a $\mathcal{G}$-invariant decision effect if $\mathcal{G}e = e$.
It is easy to verify that $\lambda 1$ is, for all $0 \le \lambda \le 1$, a $\mathcal{G}$-invariant effect, and that
$0$ and $1$ are $\mathcal{G}$-invariant decision effects.
For $a \in \mathcal{G}$ it follows from $ag = g$ and the uniqueness of the spectral
representation of $g$ that $ae(\lambda) = e(\lambda)$ holds for the spectral family $e(\lambda)$ of $g$.
Therefore it suffices to consider only invariant decision effects.
D 1.2.2. A representation is said to be irreducible if $0$ and $1$ are the only
invariant decision effects.
A representation is therefore irreducible only if the only invariant effects
are given by $\lambda 1$ where $0 \le \lambda \le 1$.
As we have already mentioned, to each $a$ we may consider the adjoint $a'$; in
this way $\mathcal{G}$ is defined as a set of transformations in $K$ and $\mathcal{B}$.
D 1.2.3. Two representations $\mathcal{G}\mathcal{B}'_1$ and $\mathcal{G}\mathcal{B}'_2$ are said to be equivalent if
there exists a $\mathcal{B}$-continuous effect isomorphism $T$ of $\mathcal{B}'_1$ onto $\mathcal{B}'_2$ for which
$Ta = aT$ for all $a \in \mathcal{G}$.
By analogy with the considerations in §1.1 we shall assume that the $a \in \mathcal{G}$
transform the subspace $\mathcal{D}$ into itself, that is,
(D 1) $\mathcal{G}\mathcal{D} \subset \mathcal{D}$.
Since $\mathcal{D}$ has not otherwise been specified by means of the axioms (except that
$\mathcal{D}$ is norm-separable and that $\mathcal{D} \cap L$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$; see III, §3) we
shall now proceed in the opposite way and seek to obtain conditions for $\mathcal{D}$
with the aid of the group representations. Therefore (D 1) is one of the
conditions which $\mathcal{D}$ must satisfy.
According to III, §3 we will assume that $\mathcal{D}$ is norm-separable. Here we
shall note that the following definitions and concepts will apply to every
norm-closed subspace $\mathcal{D}$ of $\mathcal{B}'$ for which $\mathcal{D} \cap L$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$ and which is
$\mathcal{G}$-invariant.
Since $\mathcal{G}$ transforms the space $\mathcal{D}$ into itself, $\mathcal{G}$ is defined on all of $\mathcal{D}'$ as
mixture automorphisms of $K_\sigma$ ($K_\sigma$ is the $\sigma(\mathcal{D}', \mathcal{D})$-closure of $K$ in $\mathcal{D}'$).
In analogy to the considerations presented in §1.1 we shall, in addition,
assume that the whole group $\mathcal{G}$ is represented in $\mathcal{A}$. From axioms AG 1 and
AG 2 it follows that there are certain continuity properties of this
representation; here we shall only use two of these (!): According to Th. 1.1.3 we
obtain:
(D 2) For $x \in \mathcal{B}$, $y \in \mathcal{D}$ the maps $\mathcal{G} \to \mathbf{R}$ defined by $a \to \mu(x, ay)$ are
continuous.
According to Th. 1.1.4 we obtain:
(D 3) For $x \in \mathcal{B}$, $y \in \mathcal{D}$ the maps $\mathcal{G}_{ph} \to \mathbf{R}$ defined by $a \to \mu(x, ay)$ are
uniformly continuous.
It is obvious that (D 3) $\Rightarrow$ (D 2). For a representation of $\mathcal{G}$ in $\mathcal{A}$ we
shall not yet assume that a representation must be obtained by means of
r-automorphisms in the way described in §1.1. We will only assume that $\mathcal{G}$ is
complete and that (D 1) and either (D 2) or (D 3) are satisfied. We will then find
that, for a representation of $\mathcal{G}$ in $\mathcal{A}$, the properties which have been derived
from AG 1 in §1.1 automatically hold.
Th. 1.2.1. If, according to (D 2), $\mu(x, ay)$ is continuous at $e \in \mathcal{G}$ ($e$ the unit
element of $\mathcal{G}$) as a function on $\mathcal{G}$ for each pair $(x, y) \in \mathcal{B} \times \mathcal{D}$, then the
mapping $\mathcal{G} \to \mathcal{B}$ defined by $a \to a'x$ is norm-continuous.
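Proof. Let $x \in \mathcal{B}$ be fixed. From $|\mu(a'x - x, y)| = |\mu(x, ay) - \mu(x, y)| < \varepsilon$ if $a$ is in a
suitable neighborhood of the unit element, it follows that the map $a \to a'x$ is
$\sigma(\mathcal{B}, \mathcal{D})$-continuous at $e$. Thus since
$$\mu(a'x - b'x, y) = \mu(b'^{-1}a'x - x, by)$$
it follows that the map $a \to a'x$ is $\sigma(\mathcal{B}, \mathcal{D})$-continuous in all of $\mathcal{G}$.
We now consider the set $\mathcal{G}x$ (the set of all $a'x$, $a \in \mathcal{G}$) as a subset of $\mathcal{B}$ with the
topologies induced by the norm and by $\sigma(\mathcal{B}, \mathcal{D})$. Since $\mathcal{B}$ is norm-separable there
exists a denumerable subset $\{a_\nu\} \subset \mathcal{G}$ for which $\{a_\nu'x\}$ is norm-dense in $\mathcal{G}x$. We
define the spheres $K^\varepsilon_\nu$ in $\mathcal{G}x$:
$$K^\varepsilon_\nu = \{x' \mid x' \in \mathcal{G}x,\ \|x' - a_\nu'x\| \le \varepsilon\}.$$
Let $M^\varepsilon_\nu$ denote the inverse image of $K^\varepsilon_\nu$ under $a \to a'x$, that is,
$$M^\varepsilon_\nu = \{a \mid a \in \mathcal{G} \text{ where } a'x \in K^\varepsilon_\nu\}.$$
Since the $a_\nu'x$ are norm-dense in $\mathcal{G}x$ we obtain $\bigcup_\nu K^\varepsilon_\nu = \mathcal{G}x$ from which it follows
that $\bigcup_\nu M^\varepsilon_\nu = \mathcal{G}$.
The set $\mathcal{G}x$ may be considered to be a subset of $\mathcal{D}'$, since the norm of $\mathcal{D}'$ on $\mathcal{B}$
agrees with that of $\mathcal{B}$ (see proof of V, Th. 4.1.6). $K^\varepsilon_\nu$ can be considered to be the
intersection of spheres in $\mathcal{D}'$ with $\mathcal{G}x$. Since the unit sphere of $\mathcal{D}'$ is
$\sigma(\mathcal{D}', \mathcal{D})$-compact and therefore closed, $K^\varepsilon_\nu$ is $\sigma(\mathcal{D}', \mathcal{D})$-closed in $\mathcal{G}x$, that is, $K^\varepsilon_\nu$ is
$\sigma(\mathcal{B}, \mathcal{D})$-closed in $\mathcal{G}x$. Since the map $a \to a'x$ is $\sigma(\mathcal{B}, \mathcal{D})$-continuous, the inverse
image $M^\varepsilon_\nu$ is closed in $\mathcal{G}$. Since $\mathcal{G}$ is a Baire space (see AII, §3) it follows from
$\bigcup_\nu M^\varepsilon_\nu = \mathcal{G}$ that there exists a set $M^\varepsilon_{\nu_0}$ which contains an open subset of $\mathcal{G}$.
Therefore there exists an element $a \in M^\varepsilon_{\nu_0}$ and a neighborhood $U(a)$ of $a$ for which
$U(a) \subset M^\varepsilon_{\nu_0}$, that is, for all $b \in U(a)$ we obtain $a'x \in K^\varepsilon_{\nu_0}$ and $b'x \in K^\varepsilon_{\nu_0}$ and
therefore $\|b'x - a'x\| \le 2\varepsilon$. $U(a)a^{-1}$ is a neighborhood $V$ of the unit element. Since
the norm is preserved by mixture automorphisms, for all $c \in V$ we obtain
$\|c'x - x\| \le 2\varepsilon$, whereby we have proven the norm-continuity of $c \to c'x$ at the point
$e \in \mathcal{G}$. Thus we easily obtain the norm-continuity in $\mathcal{G}$.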
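The final step of the proof, passing from the neighborhood $U(a)$ of $a$ to the neighborhood $V = U(a)a^{-1}$ of the unit element, can be made explicit. The block below is a sketch of that step only; it assumes the rule $(ab)' = b'a'$ for adjoints, which follows from $\mu(x, a(by)) = \mu(a'x, by) = \mu(b'a'x, y)$, and the fact that every mixture automorphism preserves the norm.

```latex
% c = b a^{-1} with b in U(a); adjoints reverse the order: (ab)' = b'a'.
\begin{align*}
c' &= (ba^{-1})' = (a^{-1})'\,b' = a'^{-1}b', \\
c'x - x &= a'^{-1}b'x - a'^{-1}a'x = a'^{-1}(b'x - a'x), \\
\|c'x - x\| &= \|a'^{-1}(b'x - a'x)\| = \|b'x - a'x\|.
\end{align*}
```

The right-hand side is exactly the quantity that was bounded for $b \in U(a)$, so the same bound holds for $\|c'x - x\|$ on all of $V$.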
Th. 1.2.2. If $a \to a'x$ is norm-continuous for all $x \in \mathcal{B}$ (for fixed $x$), then
$a \to ay$ is $\sigma(\mathcal{B}', \mathcal{B})$-continuous for all $y \in \mathcal{B}'$.
The proof follows directly from
$$|\mu(x, ay - by)| = |\mu(a'x - b'x, y)| \le \|a'x - b'x\|\,\|y\|.$$
Th. 1.2.2 says that the map $a \to \mu(x, ay)$ is continuous for each pair
$(x, y) \in \mathcal{B} \times \mathcal{B}'$ on all of $\mathcal{G}$.
The following theorem is a corollary of Th. 1.2.1 and Th. 1.2.2:
Th. 1.2.3. If $a \to \mu(x, ay)$ is continuous for all $(x, y) \in \mathcal{B} \times \mathcal{D}$ it follows that it
is continuous for all $(x, y) \in \mathcal{B} \times \mathcal{B}'$.
In the same manner we may prove the following theorems:
Th. 1.2.4. If $a \to \mu(x, ay)$ is continuous for all $(x, y) \in \mathcal{B} \times \mathcal{D}$ it follows that
$a \to ay$ is norm-continuous for all $y \in \mathcal{D}$.
Th. 1.2.5. If $a \to ay$ is norm-continuous for all $y \in \mathcal{D}$ it follows that $a \to a'x$ is
$\sigma(\mathcal{D}', \mathcal{D})$-continuous for all $x \in \mathcal{D}'$.
Th. 1.2.6. If $a \to \mu(x, ay)$ is continuous for all $(x, y) \in \mathcal{B} \times \mathcal{D}$ then it is
continuous for all $(x, y) \in \mathcal{D}' \times \mathcal{D}$.
In all these theorems only (D 1) and (D 2) are required. According to (D 3),
$\mu(x, ay)$ is $ph$-uniformly continuous for all $(x, y) \in \mathcal{B} \times \mathcal{D}$. In general we
should expect that $\mu(x, ay)$ is $ph$-continuous neither for all $(x, y) \in \mathcal{D}' \times \mathcal{D}$ nor
for all $(x, y) \in \mathcal{B} \times \mathcal{B}'$.
Therefore there exists a complete symmetry with respect to the representation
of groups between the spaces $\mathcal{B}$ and $\mathcal{D}$ and their corresponding extensions
$\mathcal{B} \subset \mathcal{D}'$ and $\mathcal{D}$, or $\mathcal{B}$ and $\mathcal{B}' \supset \mathcal{D}$. Previously we have generally not
made use of the dual pair $\mathcal{D}', \mathcal{D}$ together with $\mathcal{B} \subset \mathcal{D}'$ as mathematical tools
for the representation of physical problems of quantum mechanics. It is,
however, possible that there exist new methods for the mathematical
treatment of many problems using the methods of C*-algebras if $\mathcal{D}$ is a
C*-algebra.
We will now examine the possible consequences of (D 3). In order to
provide a comprehensive formulation of the following processes we now
introduce the following spaces $\Delta_n$, $\Delta_{ph}$ and $\Delta$:
D 1.2.4. $\Delta_n$ is the set of all $y \in \mathcal{B}'$ for which the map $a \to ay$ of $\mathcal{G}$ into $\mathcal{B}'$ is
norm-continuous. $\Delta_{ph}$ is the set of all $y \in \mathcal{B}'$ for which the maps $a \to \mu(x, ay)$
are $ph$-uniformly continuous for all $x \in \mathcal{B}$. $\Delta = \Delta_{ph} \cap \Delta_n$.
According to (D 3) and Th. 1.2.4 it follows that $\mathcal{D} \subset \Delta \subset \mathcal{B}'$.
Th. 1.2.7. $\Delta_n$, $\Delta_{ph}$ and $\Delta$ are norm-closed subspaces of $\mathcal{B}'$ which are invariant
with respect to $\mathcal{G}$.
Proof. The fact that $\Delta_{ph}$ is a subspace of $\mathcal{B}'$ is clear. Let $y \in \Delta_{ph}$ and let $c \in \mathcal{G}$; then
$\mu(x, acy)$ is, as a function of $a$, $ph$-uniformly continuous since both maps
$a \to ac \to \mu(x, acy)$ are $ph$-uniformly continuous; therefore $cy \in \Delta_{ph}$.
If $y_\nu \to y$ is a norm-convergent sequence for which $y_\nu \in \Delta_{ph}$ then
$$\begin{aligned}
|\mu(x, ay - by)| &\le |\mu(x, a(y - y_\nu))| + |\mu(x, ay_\nu - by_\nu)| + |\mu(x, b(y_\nu - y))| \\
&\le 2\|x\|\,\|y - y_\nu\| + |\mu(x, ay_\nu - by_\nu)|
\end{aligned}$$
from which it follows that $\mu(x, ay)$ is $ph$-uniformly continuous.
For $\Delta_n$ the same result follows more easily from the continuity of the maps
$a \to ac \to (ac)y$ and from
$$\begin{aligned}
\|ay - by\| &\le \|a(y - y_\nu)\| + \|ay_\nu - by_\nu\| + \|b(y_\nu - y)\| \\
&\le 2\|y - y_\nu\| + \|ay_\nu - by_\nu\|.
\end{aligned}$$
By analogy to D 1.2.4 we define:
D 1.2.5. Let $\Xi_n$ be the set of all $x \in \mathcal{D}'$ for which the mapping $a \to a'x$ of $\mathcal{G}$
into $\mathcal{D}'$ is norm-continuous. Let $\Xi_{ph}$ denote the set of all $x \in \mathcal{D}'$ for which the
maps $a \to \mu(x, ay) = \mu(a'x, y)$ are $ph$-continuous for all $y \in \mathcal{D}$. Let
$\Xi = \Xi_n \cap \Xi_{ph}$.
According to (D 3) and Th. 1.2.1, $\mathcal{B} \subset \Xi \subset \mathcal{D}'$. By analogy with Th. 1.2.7
it follows that:
Th. 1.2.8. $\Xi_n$, $\Xi_{ph}$ and $\Xi$ are norm-closed subspaces of $\mathcal{D}'$ which are invariant
relative to $\mathcal{G}$.
Since $\mathcal{D}$ has not previously been restricted by means of axioms, we may
now reverse the sequence of ideas presented above as follows: We begin with
a topologically complete group $\mathcal{G}$ and a representation for which the map
$\mathcal{G} \to \mathcal{G}x$ is norm-continuous for all $x \in \mathcal{B}$. The “physical” uniform structure
$ph$ on $\mathcal{G}$ and the space $\mathcal{D}$ have previously not been specified. We shall now
define topologies on $\mathcal{G}$ by means of its representations as follows:
D 1.2.6. We shall call the initial topology on $\mathcal{G}$ for which the maps
$\mathcal{G} \to \mathcal{B}$ defined by $a \to a'x$
are continuous for all $x \in \mathcal{B}$ (with respect to the norm topology in $\mathcal{B}$) the
$(\mathcal{B})$-topology.
The original topology in $\mathcal{G}$ is therefore finer than the $(\mathcal{B})$-topology.
If the representation of $\mathcal{G}$ is not true (that is, not faithful), then there exists an
invariant subgroup $\mathcal{N}$ of $\mathcal{G}$ which, in the representation of $\mathcal{G}$, is mapped onto the
identity. The representation of the factor group $\mathcal{G}/\mathcal{N}$ is then true. The fact that the
elements of $\mathcal{N}$ behave like the identity with respect to quantum mechanics
has the physical meaning that the transformations in $\mathcal{N}$ are possibly
meaningful outside the domain of microsystems, but in the domain of
microsystems are equivalent to the identity. Thus for microsystems only the
group $\mathcal{G}/\mathcal{N}$ is physically meaningful. From this point on we shall assume
that the representation of $\mathcal{G}$ is true (this assertion we may not change,
because in certain “subdomains” of the domain of microsystems nontrue
representations may occur! See VII, end of §2). From the above arguments it
seems reasonable, on physical grounds, to identify the topology of the group
$\mathcal{G}$ with the $(\mathcal{B})$-topology. First, we must determine whether $\mathcal{G}$ together with
the $(\mathcal{B})$-topology is always a topological group.
Th. 1.2.9. $\mathcal{G}$ together with the $(\mathcal{B})$-topology is a topological group (because
the representation of $\mathcal{G}$ was assumed to be true, the $(\mathcal{B})$-topology is
separating). $\mathcal{G}$ is separable and metrizable in the $(\mathcal{B})$-topology.
Proof. Since $a'$ preserves the norm, TG 1 follows from
$$\begin{aligned}
\|a'b'x - x\| &\le \|a'b'x - a'x\| + \|a'x - x\| \\
&= \|a'(b'x - x)\| + \|a'x - x\| = \|b'x - x\| + \|a'x - x\|.
\end{aligned}$$
TG 2 follows from
$$\|a'^{-1}x - x\| = \|a'(a'^{-1}x - x)\| = \|x - a'x\|.$$
TG 3 follows from the fact that $\|b'x_1 - x_1\| < \varepsilon$ is equivalent to
$$\|a'^{-1}b'a'x_2 - x_2\| < \varepsilon$$
for $x_2 = a'^{-1}x_1$ since
$$\|a'^{-1}b'a'x_2 - x_2\| = \|b'a'x_2 - a'x_2\| = \|b'x_1 - x_1\|.$$
Since $\mathcal{G}$ was assumed to be separable in the original topology, $\mathcal{G}$ is separable in
the $(\mathcal{B})$-topology.
Every subset $A \subset \mathcal{B}$ for which the closed subspace spanned by $A$ is equal to $\mathcal{B}$
generates the same initial topology as does $\mathcal{B}$. This result follows easily for linear
combinations of elements of $A$ and can be proven using the inequalities:
$$\begin{aligned}
\|a_1'x - a_2'x\| &\le \|a_1'x - a_1'\tilde{x}\| + \|a_1'\tilde{x} - a_2'\tilde{x}\| + \|a_2'\tilde{x} - a_2'x\| \\
&= 2\|x - \tilde{x}\| + \|a_1'\tilde{x} - a_2'\tilde{x}\|.
\end{aligned}$$
Since $\mathcal{B}$ is separable, there exists a denumerable set $A$ and therefore a denumerable
neighborhood basis of the unit element in $\mathcal{G}$.
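The metrizability asserted in Th. 1.2.9 can be made concrete. The following is one standard construction, offered as a sketch under stated assumptions rather than as the author's argument: the sequence $x_k$ is an assumed enumeration of a denumerable set $A$ of the kind used at the end of the proof.

```latex
% Sketch under assumptions: \{x_k\}_{k \ge 1} is a denumerable subset of
% \mathcal{B} whose closed linear span is \mathcal{B} (the set A of the proof).
\begin{align*}
d(a, b) &= \sum_{k=1}^{\infty} 2^{-k} \min\bigl(1, \|a'x_k - b'x_k\|\bigr), \\
d(ac, bc) &= \sum_{k=1}^{\infty} 2^{-k} \min\bigl(1, \|c'(a'x_k - b'x_k)\|\bigr) = d(a, b).
\end{align*}
% The second line uses (ac)' = c'a' and the fact that c' preserves the norm.
% d(a, b) = 0 forces a'x = b'x for all x in \mathcal{B}, hence a = b when the
% representation is true.
```

Under these assumptions $d$ generates the initial topology of the maps $a \to a'x_k$, which by the argument of the proof is the same as the initial topology of all maps $a \to a'x$, and $d$ is invariant under multiplication from the right.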
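For a subset $\Lambda \subset \mathcal{B}'$ let the norm-closed subspace of $\mathcal{B}'$ spanned by $\Lambda$ be
denoted by $\mathcal{D}_\Lambda$.
D 1.2.7. Let $\Lambda \subset \mathcal{B}'$ be a subset of $\mathcal{B}'$ which satisfies the condition
$\mathcal{G}\Lambda \subset \mathcal{D}_\Lambda$. We shall call the initial uniform structure for which the maps
$\mathcal{G} \to \mathcal{B}'$ defined by $a \to ay$,
$\mathcal{G} \to \mathcal{B}'$ defined by $a \to a^{-1}y$
are uniformly continuous for all $y \in \Lambda$ with respect to the $\sigma(\mathcal{B}', \mathcal{B})$-topology
in $\mathcal{B}'$ the $\Lambda$-uniform structure on $\mathcal{G}$. We shall call the topology determined by
the $\Lambda$-uniform structure the $\Lambda$-topology on $\mathcal{G}$.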
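It is easy to see that, for all $y \in \mathcal{D}_\Lambda$, the maps $a \to ay$ and $a \to a^{-1}y$ are
uniformly continuous. Since $\|ay - a\tilde{y}\| = \|y - \tilde{y}\|$ it follows from $\mathcal{G}\Lambda \subset \mathcal{D}_\Lambda$
that $\mathcal{G}\mathcal{D}_\Lambda = \mathcal{D}_\Lambda$, that is, $\mathcal{D}_\Lambda$ is an invariant subspace of $\mathcal{B}'$. Thus it easily
follows that $ay = y$ for all $y \in \mathcal{D}_\Lambda$ is equivalent to $ay = y$ for all $y \in \Lambda$. A
representation will be said to be $\Lambda$-true if from $ay = y$ for all $y \in \Lambda$ it follows that
$a = e$.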
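The invariance of the subspace spanned by $\Lambda$ is stated above without details. A possible way to fill them in is sketched below; it assumes only that each $a$ is linear and norm-preserving on $\mathcal{B}'$, as used in the text, and that $\mathcal{D}_\Lambda$ denotes the norm-closed span of $\Lambda$.

```latex
% y in D_Lambda: choose y_n in the linear span of Lambda with y_n -> y in norm.
\begin{align*}
a y_n &\in \mathcal{D}_\Lambda
  && \text{(linearity of } a \text{ and } \mathcal{G}\Lambda \subset \mathcal{D}_\Lambda\text{)} \\
\|ay - ay_n\| &= \|y - y_n\| \to 0
  && \text{(so } ay \in \mathcal{D}_\Lambda\text{, since } \mathcal{D}_\Lambda \text{ is norm-closed)} \\
\mathcal{D}_\Lambda &= a(a^{-1}\mathcal{D}_\Lambda) \subset a\mathcal{D}_\Lambda \subset \mathcal{D}_\Lambda
  && \text{(apply the first two lines to } a^{-1} \text{ and to } a\text{)}
\end{align*}
```

The same approximation argument shows that $ay = y$ on $\Lambda$ extends first to the linear span and then, by continuity, to all of $\mathcal{D}_\Lambda$.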
Th. 1.2.10. The maps of $\mathcal{G}$ onto itself given by $a \to a^{-1}$, $a \to ab$ and $a \to ba$
(for fixed $b$) are uniformly continuous with respect to the $\Lambda$-uniform
structure.
Proof. Since the composite maps $a \to a^{-1} \to a^{-1}y$ and $a \to a^{-1} \to (a^{-1})^{-1}y = ay$,
that is, the maps $a \to a^{-1}y$ and $a \to ay$, are uniformly continuous, $a \to a^{-1}$ is
uniformly continuous. If $a \to ab \to aby$ and $a \to ab \to (ab)^{-1}y = b^{-1}a^{-1}y$
are (for fixed $b$) uniformly continuous, it follows that $a \to ab$ is also such. The
uniform continuity of $a \to aby$ is obtained with $\tilde{y} = by$ from $a \to aby = a\tilde{y}$ since
$\tilde{y} \in \mathcal{D}_\Lambda$. The uniform continuity of $a \to b^{-1}a^{-1}y$ follows from
$$\mu(x, b^{-1}a^{-1}y) = \mu(b'^{-1}x, a^{-1}y) = \mu(\tilde{x}, a^{-1}y),$$
where $\tilde{x} = b'^{-1}x \in \mathcal{B}$, and from the uniform continuity of $a \to a^{-1}y$.
In the same way we obtain the uniform continuity of $a \to ba$ from $a \to ba \to bay$ and
$a \to ba \to a^{-1}b^{-1}y$ with $\tilde{y} = b^{-1}y$ and $\tilde{x} = b'x$.
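Th. 1.2.11. The $(\mathcal{B})$-topology on $\mathcal{G}$ is finer than the $\Lambda$-topology. $\mathcal{G}$ is therefore
also separable in the $\Lambda$-topology.
Proof. Let $\mathcal{G}$, together with the $(\mathcal{B})$-topology (or $\Lambda$-topology) be denoted by $\mathcal{G}_\mathcal{B}$ (or
$\mathcal{G}_\Lambda$, respectively). We must now show that the identity map $\mathcal{G}_\mathcal{B} \to \mathcal{G}_\Lambda$ is continuous.
$\mathcal{G}_\mathcal{B} \to \mathcal{G}_\Lambda$ is continuous provided that the composite maps
$\mathcal{G}_\mathcal{B} \to \mathcal{G}_\Lambda \xrightarrow{a \to ay} \mathcal{B}'$ and $\mathcal{G}_\mathcal{B} \to \mathcal{G}_\Lambda \xrightarrow{a \to a^{-1}y} \mathcal{B}'$
are continuous. This follows, however, from
$$|\mu(x, a_1y - a_2y)| = |\mu(a_1'x - a_2'x, y)| \le \|a_1'x - a_2'x\|\,\|y\|$$
and Th. 1.2.9.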
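Th. 1.2.12. The set $\Delta_n \cap L$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$ and the set $\Delta_n$ is therefore
$\sigma(\mathcal{B}', \mathcal{B})$-dense in $\mathcal{B}'$.
Proof. For the $P_\varphi$ for which $\varphi \in \bigcup_\nu \mathcal{H}_\nu$ we find that, in the norm of
$\mathcal{B}$, $\|a'P_\varphi - P_\varphi\| < \varepsilon$ whenever $a$ is in a suitable neighborhood $U$ (either in the
original topology or in the $(\mathcal{B})$-topology) of the unit element. Since
$a'P_\varphi = P_{\tilde{\varphi}} = a^{-1}P_\varphi$ (see V, Th. 5.13), denoting the norms in $\mathcal{B}$ and $\mathcal{B}'$ by
$\|\cdots\|_\mathcal{B}$ and $\|\cdots\|_{\mathcal{B}'}$, respectively, from the relation
$$\|P_{\tilde{\varphi}} - P_\varphi\|_{\mathcal{B}'} \le 2\|P_{\tilde{\varphi}} - P_\varphi\|_\mathcal{B}$$
we obtain $\|a^{-1}P_\varphi - P_\varphi\|_{\mathcal{B}'} < 2\varepsilon$ for $a \in U$. Therefore for $y = P_\varphi \in L$ we find that
the map $a \to ay$ is norm-continuous (since $a \to a^{-1}$ is continuous). Therefore the
map $a \to ay$ is norm-continuous for all finite linear combinations of elements of the
form $P_\varphi$. The set of all such finite linear combinations is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$.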
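Since $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-separable there exist denumerably many $y_\nu \in \Lambda_n \cap L$
which are $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$. Since $\mathcal{G}_{\mathcal{B}}$ is separable, there exist denumerably
many $b_\rho$ (the unit element of $\mathcal{G}$ may be among the $b_\rho$) which are dense in the
$(\mathcal{B})$-topology in $\mathcal{G}$. The set $\Lambda$ of all $b_\rho y_\nu$ is then denumerable, and we find that
$\Lambda \subset L$ and $\Lambda$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$. Here we note that the set of the $y_\nu$ does not
need to be $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$, provided only that the set of the $b_\rho y_\nu$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$.
Since $y_\nu \in \Lambda_n$ we find that $\|b y_\nu - b_\rho y_\nu\| < \varepsilon$ for $b \in \mathcal{G}$ and for $b_\rho$
sufficiently close to $b$. Therefore $\mathcal{G}\Lambda$ is contained in the norm-closure of $\Lambda$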
and therefore we find that $\mathcal{G}\Lambda \subset \mathcal{D}_\Lambda$. $\Lambda$ therefore satisfies the conditions in
D 1.2.7, is denumerable, and is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$. Thus we find that $\mathcal{D}_\Lambda$
is norm-separable.
We may therefore use the space $\mathcal{D}_\Lambda$ as the space $\mathcal{D}$ of the theory if the
“physical” uniform structure ph is finer than the $\Lambda$-uniform structure. If the
$\Lambda$-topology is equal to the original topology on the group $\mathcal{G}$, then we may
identify the physical uniform structure with the $\Lambda$-uniform structure, that is,
by the choice of the set $\Lambda$ we fix the choice of the physical uniform structure
ph. In this way we recognize the close relationship between the designation of
the space $\mathcal{D}$ and the designation of the uniform structure ph of “physical
imprecision” on $\mathcal{G}$.
In practice it is often easy to show that the $\Lambda$-topology is identical to the
original topology on $\mathcal{G}$. In applications $\mathcal{G}$ is at most a locally compact group.
If we can find a $\Lambda$-neighborhood of the identity which is compact in the
original topology, then the original topology and the weaker $\Lambda$-topology
coincide in this neighborhood, and therefore have the same neighborhood
system of the identity. The proof that there is such a $\Lambda$-neighborhood
is very simple for the case of Lie groups (see VII, §8). In the
following we will always assume that there exists a $\Lambda$-set such that the
$\Lambda$-topology coincides with the original topology. Then the $(\mathcal{D})$-topology also
coincides with the original topology.
With this result we note that we have not yet solved the problem of the
space $\mathcal{D}$. For example, we may set $\mathcal{D} = \mathcal{D}_\Lambda$, but we may yet find that it is
possible to have different sets $\Lambda$ and also different spaces $\mathcal{D}_\Lambda$ which result in
the same $\Lambda$-topology on $\mathcal{G}$. If, for example, $\mathcal{G}$ is compact, then, for all possible
$\Lambda$-sets the $\Lambda$-topologies will coincide with the original topology on $\mathcal{G}$ and $\mathcal{D}_\Lambda$ is
equal to $\Lambda_n$. This is not so surprising: If we needed only to consider compact
groups in physics, then we would always be able to choose finite-dimensional
Hilbert spaces $\mathcal{H}_\nu$ (see VII, §3 and AV, §10) and the problem of the selection
of a space $\mathcal{D}$ would be nonexistent.
After the construction (with the physical uniform structure ph being the
$\Lambda$-uniform structure) we obtain $\Lambda_n \supset \mathcal{D}_\Lambda$. The conditions under which
$\Lambda_n = \mathcal{D}_\Lambda$ have not yet been investigated.
The introduction of the $\Lambda$-uniform structure in D 1.1.7 now appears to be
unsymmetric with respect to $\mathcal{B}$ and $\mathcal{D} = \mathcal{D}_\Lambda$. This is, however, not the case,
because the $\Lambda$-uniform structure is also the initial structure for which the
following maps $\mathcal{G} \to \mathbb{R}$ of the form
$$a \to \mu(x, ay) = \mu(a'x, y),$$
$$a \to \mu(x, a^{-1}y) = \mu(a'^{-1}x, y)$$
are uniformly continuous for all $x \in \mathcal{B}$ and $y \in \mathcal{D} = \mathcal{D}_\Lambda$, that is, for which all
maps of the form
$$a \to a'x \quad\text{and}\quad a \to a'^{-1}x$$
are uniformly continuous with respect to the $\sigma(\mathcal{B}, \mathcal{D})$-topology in $\mathcal{B}$.
Therefore the above considerations are symmetric between 3 and In
particular, the question arises whether and under what conditions is it
possible that S = (where S is defined by D 1.1.5).
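1.3 Topologies on the Group $\mathcal{A}$
Since the elements of $\mathcal{A}$ are maps, there are innumerable possibilities for the
introduction of a topology on $\mathcal{A}$. We shall select three possibilities; these
have already been encountered in §1.2.
To each $T \in \mathcal{A}$ there is a corresponding map $\mathcal{B} \times \mathcal{D} \to \mathbb{R}$ defined by
$\mu(x, Ty)$. A separating uniform structure on $\mathcal{A}$ is defined by means of the
uniform structure of normal convergence on $\mathcal{B} \times \mathcal{D}$ (that is, the initial
structure for all the maps $\mathcal{A} \to \mathbb{R}$ defined by $T \to \mu(x, Ty)$); we shall let
$\mathcal{A}_{(\mathcal{B}\mathcal{D})}$ denote $\mathcal{A}$ together with this uniform structure.
In the same way $\mu(x, Ty)$ determines a mapping $\mathcal{B} \times \mathcal{B}' \to \mathbb{R}$ which, by
analogy, determines $\mathcal{A}_{(\mathcal{B}\mathcal{B}')}$.
A mapping $\mathcal{B} \to \mathcal{B}$ is defined by means of the adjoint map $T'$ as follows:
$x \to T'x$. If we use the norm topology of $\mathcal{B}$ in the image set, then a
separating uniform structure in $\mathcal{A}$ is defined by means of the normal
convergence of this map. We shall let $\mathcal{A}_{(\mathcal{B})}$ denote $\mathcal{A}$ endowed with this
uniform structure.
It is easy to see that $\mathcal{A}_{(\mathcal{B})}$ is finer than $\mathcal{A}_{(\mathcal{B}\mathcal{B}')}$ and $\mathcal{A}_{(\mathcal{B}\mathcal{B}')}$ is finer than
$\mathcal{A}_{(\mathcal{B}\mathcal{D})}$.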
We may now express (D 2) and (D 3) as follows: The representation maps
^ or Урн are uniformly continuous.
From (D 3) it does not follow that the maps <&ph —► si^^ can be extended
onto all of §ph since si{m^6J) is, in general, not complete.
From the uniform continuity of <&ph —> siin a trivial manner it follows
that the map ^ —► si^^ is simply continuous.
Th. 1.2.2 states that, from the continuity of ^ —► siKm^ it follows that the
map ^ —► sim is continuous. Th. 1.2.3 is then only a trivial corollary
because the map ^ —► si^^ is continuous.
From the proof of Th. 1.2.10 it follows that in the special case in which we
choose $\mathcal{G} = \mathcal{A}_{(\mathcal{B})}$ we obtain the first part of the following theorem:
Th. 1.3.1. $\mathcal{A}_{(\mathcal{B})}$ is a topological group and is metrizable. $\mathcal{A}_{(\mathcal{B})}$ is separable.
$\mathcal{A}_{(\mathcal{B})}$ is complete.
Proof. In order to show that $\mathcal{A}_{(\mathcal{B})}$ is separable, choose a denumerable set $\{x_\nu\}$
which is norm-dense in $\mathcal{B}$ and, for each $\nu$, choose a denumerable set $T_{\nu\lambda} \in \mathcal{A}$ for
which $\{T'_{\nu\lambda}x_\nu\}$ is norm-dense in $\mathcal{A}'x_\nu$. Then $\{T_{\nu\lambda}\}$ is a dense set in $\mathcal{A}_{(\mathcal{B})}$, which
follows directly from
$$\|(T' - T'_{\nu\lambda})x\| \le \|(T' - T'_{\nu\lambda})x_\nu\| + \|(T' - T'_{\nu\lambda})(x - x_\nu)\|$$
$$\le \|(T' - T'_{\nu\lambda})x_\nu\| + 2\|x - x_\nu\|$$
(choose $\nu$ such that $\|x - x_\nu\| < \varepsilon/4$, and then choose $\lambda$ such that
$\|(T' - T'_{\nu\lambda})x_\nu\| < \varepsilon/2$).
The fact that $\mathcal{A}_{(\mathcal{B})}$ is complete follows from the general theorem that a sequence
$T_\nu$ for which $T'_\nu x$ is a Cauchy sequence for each $x$ converges towards a $T \in \mathcal{A}$ (from
$T'_\nu x \to T'x$, where $T'$ is a norm-continuous map of $\mathcal{B}$ into itself, it follows that if $T'_\nu$ is
positive, then $T'$ is positive, and from $\|T'_\nu\| = 1$ it also follows that $\|T'\| = 1$).
If, in addition to the requirements that $\mathcal{G}$ be separable and metrizable, we
also require that it be locally compact, then with Th. 1.3.1 the following
important mathematical theorem may be proven:
If we endow $\mathcal{G}$ and $\mathcal{A}_{(\mathcal{B})}$ with the Borel structure generated by the open sets and
if the map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ is measurable with respect to this structure, then the
map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ is also continuous (for proof see [10]). This theorem
therefore shows that if $\mathcal{G}$ is locally compact then we may start with a much
weaker requirement than that $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ is continuous. All groups which
occur in quantum mechanics are locally compact.
In V we have seen that the group sd is “physically too large.” Only the
subgroup sdm can be physically meaningful where sdm is the set of all T in
sd for which T3 cz 3.
Therefore is a separable metrizable topological group.
Th. 1.3.2. The maps T -4 Ту for у e 3 of sd^ in @ tire norm-continuous.
Proof. This result follows directly from Th. 1.2.5, in which we replace ^ by sd^.
On sdw we may introduce the topology of normal convergence of the maps
sdm -4 3 (with the norm-topology in 3); this topology will be called the (3)-
topology and sdm together with this topology will be denoted by sd\f j. From
Th. 1.3.2 it follows that on sdm the (J^-topology is finer than the (3)-
topology.
The results which we have obtained for the space 08 may also be obtained
for the space 3, and we obtain:
Th. 1.3.3. The maps of sd[|J into 08 given by T—► Tx for xe 08 are norm-
continuous.
From Th. 1.3.1, Th. 1.3.2, and Th. 1.3.3 it easily follows that:
Th. 1.3.4. On the topological group jd{^ the (3)-topology is equal to the (08)-
topology.
The representations of a group which are of interest to us are therefore the
continuous representation maps ^ —► which are (according to (D 3))
uniformly continuous as maps
1.4 The Representation of $\mathcal{G}$ in Phase Space $\Gamma$
The properties of group representations in phase space $\Gamma$ are seldom
investigated in quantum mechanics. Since this topic is of importance in
understanding the relationship between quantum mechanics and classical
mechanics we shall present a brief description of the fundamentals of the
phase space representations.
The $\sigma(\mathcal{D}', \mathcal{D})$-closure $K_\sigma$ of $K$ in $\mathcal{D}'$ is $\sigma(\mathcal{D}', \mathcal{D})$-compact. Therefore, the
convex set $K_\sigma$ is, according to the Krein-Milman theorem, generated by the
set $\partial_e K_\sigma$ of its extreme points.
D 1.4.1. We shall call the set $\partial_e K_\sigma$ together with the uniform structure
generated by the $\sigma(\mathcal{D}', \mathcal{D})$-topology the phase space (which we denote by $\Gamma$).
Г is precompact as a subset of Kff. Since 2' is separable and metrizable in
the o{00', ^)-topology, Г is separable and metrizable, and its points describe
in a physically meaningful way (see [1], §9) the “idealized” preparation
possibilities for the systems in M.
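Since the elements of $\mathcal{G}$ can also be considered to be mixture automorphisms
of $K_\sigma$, $\Gamma$ is $\mathcal{G}$-invariant. According to Th. 1.2.6, for $a \in \mathcal{G}$, $\gamma \in \Gamma$ the
map $a \to a'\gamma$ is continuous. In addition, we find that
Th. 1.4.1. The mapping of $\mathcal{G} \times \Gamma \to \Gamma$ defined by $(a, \gamma) \to a'\gamma$ is continuous.
Proof. For fixed $\tilde{a} \in \mathcal{G}$ and fixed $\tilde{\gamma} \in \Gamma$ and for $y \in \mathcal{D}$ we obtain
$$|\mu(a'\gamma - \tilde{a}'\tilde{\gamma}, y)| \le |\mu(a'\gamma - a'\tilde{\gamma}, y)| + |\mu(a'\tilde{\gamma} - \tilde{a}'\tilde{\gamma}, y)|$$
$$\le |\mu(\gamma - \tilde{\gamma}, ay)| + |\mu(\tilde{\gamma}, ay - \tilde{a}y)|$$
$$\le |\mu(\gamma - \tilde{\gamma}, ay)| + \|ay - \tilde{a}y\|.$$
From this it follows, since $a \to ay$ is norm-continuous, that the map
$\mathcal{G} \times \Gamma \to \Gamma$ is continuous.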
D 1.4.2. We shall call the representation of $\mathcal{G}$ on $\Gamma$ by means of point
transformations the associated phase space representation of $\mathcal{G}$ corresponding
to the original representation of $\mathcal{G}$.
For quantum mechanics the structure of the associated phase space
representation is surprisingly unfamiliar. In the case of “physical objects” (see
III, §4.1 and [1], §12) we are able to determine the phase space Г by means of
a particular choice of the group ^ (Galileo group or a direct product of
Galileo groups) and the uniform structure ph of physical imprecision, by
requiring that Q = A (see [21]). Here for Г we obtain the usual Г-space of
classical mechanics. The axioms in III, §4 and the specification of the group
^ permit us, therefore, to deduce the “usual” phase space of classical
mechanics in the case of the description of physical objects.
An analogous description for quantum mechanics is, up to now, not
commonly in use because the set Г = deKa is, at present, mathematically not
as accessible. For quantum mechanics there is no pressing necessity to
investigate the set deKa as in the case for classical mechanics, because deK is
not only nonempty, but it also satisfies (in the norm-closure of В): со 8eK =
K. Since 9 is the dual Banach space corresponding to 5£cr where the latter is
the subspace of all compact operators of 9' it follows that in quantum
mechanics К has the property со 8eK = К (see AIV, §11). 5£cr is norm-
separable. It is not difficult to show that $£cr <z A„. In a pure formal way we
may choose 9 = $£cr; then we would find that 1 ф 9; however the selection of
9 as the norm-closed subspace of 9 spanned by 1 and i?cr does not appear
to be physically meaningful.
The above considerations show why $\partial_e K$, that is, the set of all $P_\varphi$, is used
as a substitute for the phase space $\Gamma$. For this reason we must put up with
the fact that for a decision scale observable $A$ which has a continuous
spectrum there cannot be an element of $\partial_e K$, that is, a $P_\varphi$, for which
$\mu(P_\varphi, (A - \alpha 1)^2) = 0$ (for $\alpha$ in the continuous spectrum). For “physical”
decision observables $A$, that is, for $A \in \mathcal{D}$, there exists, for each $\alpha$ in the
continuous spectrum, an element $w \in \partial_e K_\sigma$ for which $\mu(w, (A - \alpha 1)^2) = 0$!
2 The $\mathcal{G}$-invariant Structure Corresponding to a
Group Representation
As we have seen in V, §5, every $\mathcal{B}$-continuous effect automorphism is
uniquely determined by a $\perp$-automorphism of $G$ and every $\perp$-automorphism
of $G$ determines a $\mathcal{B}$-continuous effect automorphism. Thus the representation
of $\mathcal{G}$ by means of $\mathcal{B}$-continuous effect isomorphisms is uniquely
determined by means of the representation of $\mathcal{G}$ determined by $\perp$-automorphisms
of $G$.
Of special importance are two subsets of $G$, first the set of $\mathcal{G}$-invariant
decision effects (see D 1.2.1).
Th. 2.1. The set of $\mathcal{G}$-invariant decision effects forms a complete orthocomplemented
sublattice of $G$ which is $\sigma(\mathcal{B}', \mathcal{B})$-closed in $G$.
Proof. Follows directly from the fact that each element $a \in \mathcal{G}$ defines a $\perp$-automorphism
of $G$ which is $\sigma(\mathcal{B}', \mathcal{B})$-continuous by means of the map $e \to ae$
($e \in G$).
D 2.1. Let $G(\mathcal{G})$ denote the set of $\mathcal{G}$-invariant decision effects, $L(\mathcal{G})$ denote
the set of $\mathcal{G}$-invariant effects and $\mathcal{B}'(\mathcal{G})$ denote the set of $\mathcal{G}$-invariant elements
of $\mathcal{B}'$.
Th. 2.2. $L(\mathcal{G}) = \overline{\mathrm{co}}\, G(\mathcal{G})$ where the closure is to be taken with respect to the
$\sigma(\mathcal{B}', \mathcal{B})$-topology. $\mathcal{B}'(\mathcal{G})$ is the $\sigma(\mathcal{B}', \mathcal{B})$-closed subspace of $\mathcal{B}'$ which is
spanned by $G(\mathcal{G})$.
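Proof. If $A \subset \mathcal{B}'$ is a set of $\mathcal{G}$-invariant elements in $\mathcal{B}'$, then $\overline{\mathrm{co}}(A)$ and the
$\sigma(\mathcal{B}', \mathcal{B})$-closed subspace spanned by $A$ are also sets of $\mathcal{G}$-invariant elements because the
elements of $\mathcal{A}$ are $\mathcal{B}$-continuous effect automorphisms. If $g$ is a $\mathcal{G}$-invariant effect,
then from the uniqueness of the spectral representation of $g$ it follows that
$g \in \overline{\mathrm{co}}\, G(\mathcal{G})$. Thus for a $\mathcal{G}$-invariant $y \in \mathcal{B}'$ it follows that $y$ lies in the
$\sigma(\mathcal{B}', \mathcal{B})$-closed subspace spanned by $G(\mathcal{G})$.
Let $\mathcal{B}'(\mathcal{G})^\perp$ denote the set of all $x \in \mathcal{B}$ for which $\mu(x, y) = 0$ for all $y \in \mathcal{B}'(\mathcal{G})$.
Then $\mathcal{B}/\mathcal{B}'(\mathcal{G})^\perp$ is a Banach space (see [33]).
D 2.2. Let $\mathcal{B}(\mathcal{G})$ be an abbreviation for $\mathcal{B}/\mathcal{B}'(\mathcal{G})^\perp$.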
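Th. 2.3. $\mathcal{B}(\mathcal{G})$ is a base-norm space and $\mathcal{B}'(\mathcal{G})$ may be identified with the dual
space of $\mathcal{B}(\mathcal{G})$ by means of the map
$$\langle \hat{x}, y \rangle = \mu(x, y),$$
where $x \in \hat{x} \in \mathcal{B}(\mathcal{G})$ and $y \in \mathcal{B}'(\mathcal{G})$ (here $\langle \hat{x}, y \rangle$ is the canonical bilinear form to
$\mathcal{B}(\mathcal{G})$ and the dual space for $\mathcal{B}(\mathcal{G})$). $\mathcal{B}'(\mathcal{G})$ is an order unit space.
Proof. Since $\mathcal{B}'(\mathcal{G})$ is $\sigma(\mathcal{B}', \mathcal{B})$-closed, $\mathcal{B}'(\mathcal{G})$ can be identified with the dual space of
$\mathcal{B}(\mathcal{G})$ (see [33]). $\mathcal{B}(\mathcal{G})$ is a base norm space if $\mathcal{B}'(\mathcal{G})$ is an order unit space (see
AIII, §6 and [33]). The unit sphere of $\mathcal{B}'(\mathcal{G})$ is given by $\mathcal{B}'(\mathcal{G}) \cap (2L - 1)$. Since
$1 \in \mathcal{B}'(\mathcal{G})$ and $L \cap \mathcal{B}'(\mathcal{G}) = L(\mathcal{G})$ we obtain $\mathcal{B}'(\mathcal{G}) \cap (2L - 1) = 2L(\mathcal{G}) - 1$, that is,
it is equal to the order interval $[-1, 1]$ in $\mathcal{B}'(\mathcal{G})$.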
D 2.3. Let $K(\mathcal{G})$ denote the basis of $\mathcal{B}(\mathcal{G})$.
Do К{У\ Ь{У\ SfifS) satisfy all the axioms and theorems which have
been formulated in III, §3? We will not pursue this question for the general
case. In the case of the special quantum mechanical structure of К and L, we
shall return to the question of the structure of К(У\ ЦУ) in VII, §2 and
VIII, §1 for physically important groups
3 Properties of Representations of $\mathcal{G}$ which are Dependent on
the Special Structure of $\mathcal{A}$ in Quantum Mechanics
In this section we shall use the special structure of $\mathcal{A}$ which was described in
V, and we present an outline of the properties of a representation of $\mathcal{G}$.
3.1 The Topological Structure of the Group
According to V (5.4) each element $T \in \mathcal{A}$ has the form
Ту = Z (3.1.1)
V
where the $U_{p(\nu)\nu}$ are isomorphic or anti-isomorphic maps of $\mathcal{H}_\nu$ upon $\mathcal{H}_{p(\nu)}$,
and each operator $T$ of the form (3.1.1) is an element of $\mathcal{A}$. $T$ uniquely
determines the permutation $p$ of the indices $\nu$ and the $U_{p(\nu)\nu}$ up to a phase
factor $e^{i\alpha_\nu}$.
Therefore $T$ uniquely determines the subset of those $\nu$ for which $U_{p(\nu)\nu}$ is
an anti-isomorphic map. We shall denote this subset of the indices by $I$.
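D 3.1.1. Let $\mathcal{A}_{(\mathcal{B})}(p, I)$ denote the subset of all $T \in \mathcal{A}_{(\mathcal{B})}$ which determine the
same permutation $p$ and the same index subset $I$, endowed with the topology
induced by $\mathcal{A}_{(\mathcal{B})}$.
Th. 3.1.1. The topology of $\mathcal{A}_{(\mathcal{B})}(p, I)$ is identical to the initial topology generated
by the maps $d_{\varphi_1\varphi_2}$:
$$T \to |\langle \varphi_2, U_{p(\nu)\nu}\varphi_1 \rangle|$$
(where $\varphi_1 \in \mathcal{H}_\nu$ and $\varphi_2 \in \mathcal{H}_{p(\nu)}$) for all $\varphi_1, \varphi_2$.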
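Proof. First we shall show that $d_{\varphi_1\varphi_2}$ is a continuous mapping of $\mathcal{A}_{(\mathcal{B})}(p, I)$ in $\mathbb{R}$.
The map $T \to T'x$ of $\mathcal{A}_{(\mathcal{B})}$ in $\mathcal{B}$ is continuous for each $x \in \mathcal{B}$. The map
$x \to \mu(x, y)$ of $\mathcal{B}$ in $\mathbb{R}$ is continuous for each $y \in \mathcal{B}'$. Therefore the map
$$T \to \mu(T'x, y) = \mu(x, Ty)$$
of $\mathcal{A}_{(\mathcal{B})} \to \mathbb{R}$ is continuous for each $(x, y) \in \mathcal{B} \times \mathcal{B}'$; in particular, the map
$$T \to \mu(P_{\varphi_2}, TP_{\varphi_1}) = \mathrm{tr}(P_{\varphi_2} T P_{\varphi_1})$$
$$= \mathrm{tr}(P_{\varphi_2} P_{U_{p(\nu)\nu}\varphi_1}) = |\langle \varphi_2, U_{p(\nu)\nu}\varphi_1 \rangle|^2$$
is continuous, and therefore the map $d_{\varphi_1\varphi_2}$ is also continuous.
The initial topology corresponding to the maps $d_{\varphi_1\varphi_2}$ is therefore coarser than the
$(\mathcal{B})$-topology. We will show that it is also finer, that is, for a sequence $T^{(n)}$ the
relation $T^{(n)\prime}x \to T'x$ is satisfied in the norm topology for every $x \in \mathcal{B}$ if
$d_{\varphi_1\varphi_2}(T^{(n)}) \to d_{\varphi_1\varphi_2}(T)$ for all $\varphi_1, \varphi_2$.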
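Since each $x \in \mathcal{B}$ has the form $x = \sum_\nu \lambda_\nu P_{\varphi_\nu}$, where $\sum_\nu |\lambda_\nu| < \infty$, from
$$\|(T^{(n)\prime} - T')x\| \le \sum_{\nu=1}^{N} |\lambda_\nu| \, \|(T^{(n)\prime} - T')P_{\varphi_\nu}\| + 2 \sum_{\nu=N+1}^{\infty} |\lambda_\nu|$$
it follows that $T^{(n)\prime}x \to T'x$ for all $x \in \mathcal{B}$ if $T^{(n)\prime}P_\varphi \to T'P_\varphi$ for all $P_\varphi$.
From V (5.7) for all $\varphi \in \mathcal{H}_{p(\nu)}$ it follows that
$$T'P_\varphi = P_\psi \quad\text{with}\quad \psi = U^{+}_{p(\nu)\nu}\varphi. \qquad (3.1.2)$$
Therefore, letting $P_{\psi_n} = T^{(n)\prime}P_\varphi$ we obtain (where $\|\ldots\|$ is the norm of $\mathcal{B}$)
$$\|(T^{(n)\prime} - T')P_\varphi\| = \|P_{\psi_n} - P_\psi\|.$$
In the spectral representation of $P_{\psi_n} - P_\psi$ there are only two (nondegenerate)
eigenvalues $\alpha_1^{(n)}$ and $\alpha_2^{(n)}$; therefore, for the norm in $\mathcal{B}$ we obtain:
$$\|P_{\psi_n} - P_\psi\|_s = |\alpha_1^{(n)}| + |\alpha_2^{(n)}|.$$
Therefore $\|P_{\psi_n} - P_\psi\|_s \to 0$ is equivalent to $|\alpha_1^{(n)}| + |\alpha_2^{(n)}| \to 0$ and therefore
also to $(\alpha_1^{(n)})^2 + (\alpha_2^{(n)})^2 \to 0$, that is, $\mathrm{tr}((P_{\psi_n} - P_\psi)^2) \to 0$. From
$\mathrm{tr}((P_{\psi_n} - P_\psi)^2) = 2 - 2|\langle \psi_n, \psi \rangle|^2$ it follows that $T^{(n)\prime}x \to T'x$ for all $x \in \mathcal{B}$ if
$$|\langle \psi_n, \psi \rangle| \to 1 \qquad (3.1.3)$$
for all $\varphi \in \mathcal{H}$.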
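The trace identity used in this step can be checked numerically. The following is only a sketch in plain Python: the helper names (`inner`, `projector`, `both_sides`) and the three-dimensional test vectors are illustrative assumptions, not part of the text. Since $P_{\psi_n} - P_\psi$ is traceless, its two nonzero eigenvalues are $\pm\sqrt{1 - |\langle \psi_n, \psi \rangle|^2}$, which is consistent with the value of the trace computed below.

```python
import math

def inner(u, v):
    """Inner product <u, v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def normalize(u):
    """Rescale u to unit length."""
    norm = math.sqrt(inner(u, u).real)
    return [a / norm for a in u]

def projector(u):
    """Matrix of the rank-one projection P_u = |u><u| for a unit vector u."""
    return [[a * b.conjugate() for b in u] for a in u]

def trace_of_square(m):
    """tr(m m) for a square matrix m given as nested lists."""
    n = len(m)
    return sum(m[i][k] * m[k][i] for i in range(n) for k in range(n))

def both_sides(psi_n, psi):
    """Return (tr((P_psi_n - P_psi)^2), 2 - 2 |<psi_n, psi>|^2)."""
    psi_n, psi = normalize(psi_n), normalize(psi)
    p_n, p = projector(psi_n), projector(psi)
    diff = [[x - y for x, y in zip(row_n, row)] for row_n, row in zip(p_n, p)]
    lhs = trace_of_square(diff).real
    rhs = 2 - 2 * abs(inner(psi_n, psi)) ** 2
    return lhs, rhs

# Hypothetical test family: psi_n approaches psi as t -> 0.
psi = [1 + 0j, 0j, 0j]
for t in (1.0, 0.1, 0.01):
    psi_n = [complex(math.cos(t)), 1j * math.sin(t), 0j]
    lhs, rhs = both_sides(psi_n, psi)
    print(f"t={t}: lhs={lhs:.3e}, rhs={rhs:.3e}")
```

Both columns agree up to rounding and tend to zero exactly when the overlap $|\langle \psi_n, \psi \rangle|$ tends to 1, which is the content of (3.1.3).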
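For $\varphi \in \mathcal{H}_{p(\nu)}$ we obtain
$$|\langle \psi_n, \psi \rangle| = |\langle U^{(n)+}_{p(\nu)\nu}\varphi, U^{+}_{p(\nu)\nu}\varphi \rangle|$$
$$= |\langle U_{p(\nu)\nu}\psi, U^{(n)}_{p(\nu)\nu}\psi \rangle|,$$
where $\psi$ is given in (3.1.2). If the sequence $T^{(n)}$ is convergent in the initial topology
corresponding to the $d_{\varphi_1\varphi_2}$ we therefore obtain:
$$|\langle \varphi_2, U^{(n)}_{p(\nu)\nu}\varphi_1 \rangle| \to |\langle \varphi_2, U_{p(\nu)\nu}\varphi_1 \rangle|,$$
where $\varphi_1 = \psi$ and $\varphi_2 = U_{p(\nu)\nu}\psi$, that is,
$$|\langle U_{p(\nu)\nu}\psi, U^{(n)}_{p(\nu)\nu}\psi \rangle| \to |\langle U_{p(\nu)\nu}\psi, U_{p(\nu)\nu}\psi \rangle| = 1,$$
whereby (3.1.3) is proven.
It is easy to show that the Borel structure corresponding to the initial
topology is equal to the initial Borel structure, that is, according to Th. 3.1.1:
The Borel structure of $\mathcal{A}_{(\mathcal{B})}(p, I)$ is the initial Borel structure associated with the
maps $d_{\varphi_1\varphi_2}$. On the basis of the theorem which was mentioned at the
conclusion of Th. 1.3.1 we therefore obtain:
If, for a locally compact group $\mathcal{G}$, the maps $a \to |\langle \varphi_1, U_{p(\nu)\nu}(a)\varphi_2 \rangle|$ of $\mathcal{G}$ in $\mathbb{R}$
are measurable (where $U_{p(\nu)\nu}(a)$ is the map of $\mathcal{H}_\nu$ in $\mathcal{H}_{p(\nu)}$ described by (3.1.1)
corresponding to $a$) then $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ is continuous.
In the following, for the most part, we shall only use the following fact
which is a simple corollary of Th. 3.1.1: $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ is continuous if and only if,
for each $\varphi_1, \varphi_2 \in \mathcal{H}$ the maps
$$a \to |\langle \varphi_1, U_{p(\nu)\nu}(a)\varphi_2 \rangle|$$
are continuous.
Th. 3.1.2. The $\mathcal{A}_{(\mathcal{B})}(p, I)$ are the connected components of $\mathcal{A}_{(\mathcal{B})}$. In particular,
$\mathcal{A}_{(\mathcal{B})}(1, \emptyset)$ (where 1 is the unit permutation) is the component which is
connected to the unit element of $\mathcal{A}_{(\mathcal{B})}$.
Proof. The fact that the set $\mathcal{A}_{(\mathcal{B})}(1, \emptyset)$ is connected to the unit element follows
directly from the spectral representation
$$U = \int_0^{2\pi} e^{i\omega} \, dE(\omega)$$
of a unitary operator and from the facts that the function $\langle \varphi, U_t\varphi \rangle$ is continuous in $t$ (where
$U_t = \int_0^{2\pi} e^{it\omega} \, dE(\omega)$) and that $\langle \varphi, U_0\varphi \rangle = 1$, $\langle \varphi, U_1\varphi \rangle = \langle \varphi, U\varphi \rangle$.
Therefore, if $\mathcal{K}$ is the component which is connected to the unit element, then
$$\mathcal{A}_{(\mathcal{B})}(1, \emptyset) \subset \mathcal{K}.$$
Since $\mathcal{K}$ is connected, it follows that, for each neighborhood $\mathcal{V}$ of the unit
element of $\mathcal{A}_{(\mathcal{B})}$, all elements of the subgroup $\mathcal{K}$ may be represented as
products of finitely many elements of $\mathcal{V} \cap \mathcal{K}$ (see AV, §10).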
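Let $\varphi_1, \varphi_2, \ldots, \varphi_n$ be normed vectors for which $\varphi_\nu \in \mathcal{H}_\nu$. For $\mathcal{V}$ we choose the
neighborhood determined by
$$1 - |\langle \varphi_\nu, U_{p(\nu)\nu}\varphi_\nu \rangle| < \delta \quad\text{for } \nu = 1, 2, \ldots, n.$$
From this result it follows that $p(\nu) = \nu$ for $\nu = 1, 2, \ldots, n$, that is, for all elements
in $\mathcal{K}$ the relation $p(\nu) = \nu$ holds for $\nu = 1, 2, \ldots, n$. Since $n$ was arbitrary it follows that
for all elements of $\mathcal{K}$ the permutation $p$ is equal to the unit permutation 1.
We choose the neighborhood $\mathcal{V}$ which is determined by
$$1 - |\langle \psi_\nu, U_{\alpha\alpha}\psi_\nu \rangle| < \delta \quad (\nu = 1, 2, 3, 4),$$
where $\psi_1, \psi_2$ are unit vectors in $\mathcal{H}_\alpha$ and where $\psi_3 = (1/\sqrt{2})(\psi_1 + \psi_2)$, $\psi_4 =
(1/\sqrt{2})(\psi_1 + i\psi_2)$. It is easy to show that $U_{\alpha\alpha}$ is unitary. Thus it follows that $U_{\alpha\alpha}$
must be unitary for all products of elements in $\mathcal{V}$ and also for all elements of $\mathcal{K}$.
Since $\alpha$ was arbitrary, it therefore follows that, for all elements of $\mathcal{K}$, the set of
indices $I = \emptyset$. Therefore $\mathcal{K} = \mathcal{A}_{(\mathcal{B})}(1, \emptyset)$.
It is well known that $\mathcal{K} = \mathcal{A}_{(\mathcal{B})}(1, \emptyset)$ is an invariant subgroup of $\mathcal{A}_{(\mathcal{B})}$ (see
AV, §10.1). It is easy to show that the cosets of $\mathcal{A}_{(\mathcal{B})}(1, \emptyset)$ are precisely the
$\mathcal{A}_{(\mathcal{B})}(p, I)$. Thus the $\mathcal{A}_{(\mathcal{B})}(p, I)$ are the connected components of $\mathcal{A}_{(\mathcal{B})}$ (see
AV, §10.1).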
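Under the following rule for multiplication:
$$(p_1, I_1)(p_2, I_2) = (p_1 p_2, \; p_2^{-1} I_1 + I_2)$$
(where $A + B$ is the symmetric difference $(A \setminus A \cap B) \cup (B \setminus A \cap B)$) the
elements $(p, I)$ form a group $\mathcal{F}$; it is easy to see that this group is isomorphic
to the factor group $\mathcal{A}_{(\mathcal{B})}/\mathcal{A}_{(\mathcal{B})}(1, \emptyset)$.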
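That this multiplication rule really defines a group can be confirmed by brute force for a small number of indices. The sketch below assumes three indices $\nu = 0, 1, 2$ and represents a permutation as a tuple `p` with `p[v]` the image of `v`; all function names are illustrative. It reads $p_2^{-1} I_1$ as the preimage $\{\nu : p_2(\nu) \in I_1\}$, which is the set of indices on which the first factor of the composite acts anti-isomorphically.

```python
from itertools import combinations, permutations

N = 3  # number of indices v = 0, 1, 2 (an illustrative assumption)

def compose(p1, p2):
    """(p1 p2)(v) = p1(p2(v)); a permutation is a tuple with p[v] = image of v."""
    return tuple(p1[p2[v]] for v in range(N))

def inverse(p):
    """Inverse permutation."""
    inv_p = [0] * N
    for v, w in enumerate(p):
        inv_p[w] = v
    return tuple(inv_p)

def preimage(p, subset):
    """p^{-1} I = {v : p(v) in I}."""
    return frozenset(v for v in range(N) if p[v] in subset)

def mul(x, y):
    """(p1, I1)(p2, I2) = (p1 p2, p2^{-1} I1 + I2), '+' = symmetric difference."""
    (p1, i1), (p2, i2) = x, y
    return (compose(p1, p2), preimage(p2, i1) ^ i2)

def inv(x):
    """Candidate inverse (p^{-1}, p I), where p I is the image of I under p."""
    p, i = x
    return (inverse(p), frozenset(p[v] for v in i))

perms = list(permutations(range(N)))
subsets = [frozenset(c) for r in range(N + 1) for c in combinations(range(N), r)]
elements = [(p, s) for p in perms for s in subsets]
identity = (tuple(range(N)), frozenset())

def is_group():
    """Check associativity, the unit element, and inverses exhaustively."""
    assoc = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                for x in elements for y in elements for z in elements)
    unit = all(mul(identity, x) == x == mul(x, identity) for x in elements)
    inverses = all(mul(x, inv(x)) == identity == mul(inv(x), x) for x in elements)
    return assoc and unit and inverses

print(len(elements), is_group())
```

For three indices the group has $3! \cdot 2^3$ elements. The rule is that of a semidirect product of the permutation group with the group of index subsets under symmetric difference.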
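3.2 The Topological Properties of a Representation of $\mathcal{G}$
Let $A$ be the component of $\mathcal{G}$ which is connected to the unit element of $\mathcal{G}$. $A$ is
an invariant subgroup; therefore for the homomorphic map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ we
obtain $A \to \mathcal{A}_{(\mathcal{B})}(1, \emptyset)$ (see AII, §4). Since $\mathcal{G}$ is separable, the factor group
$\mathcal{G}/A$ is at most denumerable and is a discrete group.
Let $\mathcal{J}$ denote the set of those elements in $\mathcal{G}$ which are mapped into
$\mathcal{A}_{(\mathcal{B})}(1, \emptyset)$. Clearly $A \subset \mathcal{J}$. It is easy to show that $\mathcal{J}$ is a subgroup of $\mathcal{G}$.
For $j \in \mathcal{J}$ and $a \in \mathcal{G}$ the product $aja^{-1}$ (considered as an element of $\mathcal{A}_{(\mathcal{B})}$) is
a map for which $p = 1$ and $I = \emptyset$, that is, $a\mathcal{J}a^{-1} \subset \mathcal{J}$ for all $a \in \mathcal{G}$. $\mathcal{J}$ is
therefore an invariant subgroup in $\mathcal{G}$; then, according to the isomorphism
theorem (AV, §4) we obtain:
$$\mathcal{G}/\mathcal{J} \cong (\mathcal{G}/A)/(\mathcal{J}/A).$$
From the mapping $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ we obtain a homomorphic map $\mathcal{G}/\mathcal{J} \to
\mathcal{A}_{(\mathcal{B})}/\mathcal{A}_{(\mathcal{B})}(1, \emptyset)$. An element of $\mathcal{G}/\mathcal{J}$ is the union of all those cosets in $\mathcal{G}/A$
which are mapped by means of the homomorphic map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ into one
and the same $\mathcal{A}_{(\mathcal{B})}(p, I)$. An element of $(\mathcal{G}/A)/(\mathcal{J}/A)$ is precisely the set of those
cosets of $\mathcal{G}/A$ which are mapped by the homomorphic map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ into
one and the same $\mathcal{A}_{(\mathcal{B})}(p, I)$. Therefore, on the basis of the isomorphism
theorem a homomorphism $(\mathcal{G}/A)/(\mathcal{J}/A) \to \mathcal{A}_{(\mathcal{B})}/\mathcal{A}_{(\mathcal{B})}(1, \emptyset)$ is defined by the
map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$, and, consequently, we may identify $\mathcal{A}_{(\mathcal{B})}/\mathcal{A}_{(\mathcal{B})}(1, \emptyset)$ with the
group $\mathcal{F}$ which was given at the end of §3.1.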
In nonrelativistic quantum mechanics we only consider those representations
in which the group $\mathcal{G}$ is mapped onto portions of the form
$\mathcal{A}_{(\mathcal{B})}(1, I)$, that is, for $p \ne 1$ there are no images of elements in portions of
the form $\mathcal{A}_{(\mathcal{B})}(p, I)$. In nonrelativistic quantum mechanics it is also possible
to use representations of groups in which there exist images of group
elements in portions of the form $\mathcal{A}_{(\mathcal{B})}(p, I)$ where $p \ne 1$ (!). Such representations
are of little physical significance in nonrelativistic quantum
mechanics. The situation is somewhat different in the case of relativistic
quantum mechanics. Here, on physical grounds, it is meaningful, for
example, to choose a transformation of a portion of $\mathcal{A}_{(\mathcal{B})}(p, \emptyset)$ where $p \ne 1$
as the homomorphic image of reflection (parity inversion).
In this book we shall only be concerned with nonrelativistic quantum
mechanics for the following two reasons:
(1) By analogy with the case of the Galileo group (which is described in
VII) we could consider the problem of the representation of the
Poincaré group (since it has been solved, as is the case of the Galileo
group); however, the interaction problem has been solved in nonrelativistic
quantum mechanics by the consideration of combined
systems (see VIII). The interaction problem has not yet been satisfactorily
solved in relativistic quantum mechanics, that is, for “elementary
particle theory.” We shall again discuss this problem in VII and
VIII.
(2) This book should provide an insight into the structure of a “closed”
physical theory. For this purpose we must abstain from the consideration
of fragments of other theories (not only that of relativistic
quantum mechanics) although these fragments would fit into the
context of II-VI. See, for example, the discussion with respect to the
case of nuclear physics in VII, §2.
Since we shall only make use of the partitions $\mathcal{A}_{(\mathcal{B})}(1, I)$ of $\mathcal{A}_{(\mathcal{B})}$ it is
therefore tedious to use the whole spaces $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$
for computational purposes. It suffices to consider for each $\mathcal{H}_\nu$ the spaces
$\mathcal{B}(\mathcal{H}_\nu)$, $\mathcal{B}'(\mathcal{H}_\nu)$ separately, together with the group of all $\mathcal{B}$-continuous
effect isomorphisms of $L(\mathcal{H}_\nu)$. In more general circumstances, or in circumstances
in which it is clear which system type (as characterized by an atom of
the center, see IV, §8) we are concerned with, we shall ignore the index $\nu$ for
$\mathcal{H}_\nu$ and, instead, consider $\mathcal{B}(\mathcal{H})$, $\mathcal{B}'(\mathcal{H})$ together with the subsets $K(\mathcal{H})$,
$L(\mathcal{H})$. Then the group $\mathcal{A}_{(\mathcal{B})}$ will consist of two connected parts: $\mathcal{A}_{(\mathcal{B})}(u)$,
which is connected to 1 and consists of effect isomorphisms described by
unitary transformations, and $\mathcal{A}_{(\mathcal{B})}(a)$, which is the set of effect isomorphisms
described by anti-unitary transformations in $\mathcal{H}$. The subgroup $\mathcal{J}$ of $\mathcal{G}$
therefore permits a representation in $\mathcal{A}_{(\mathcal{B})}(u)$. Only elements of the cosets $a\mathcal{J}$
for which $a\mathcal{J} \ne \mathcal{J}$ can have images in $\mathcal{A}_{(\mathcal{B})}(a)$.
Readers who are not well versed in physics will have some difficulty
accepting such a “reduced” description of physical experiments because, to a
given preparation procedure $a \in \mathcal{Q}'$ it is possible to find a
$\varphi(a) \in K(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ which has nonzero components in more than one
$K(\mathcal{H}_\nu)$, that is,
$$\varphi(a) = (W_1, W_2, \ldots),$$
where more than one of the $W_\nu$ are nonzero. Such a preparation apparatus
(described by $a$) clearly does not produce only microsystems of a single type.
However, mixtures of different system types in the ratios
$$\mathrm{tr}(W_1) : \mathrm{tr}(W_2) : \mathrm{tr}(W_3) : \ldots$$
are “physically trivial” and are therefore uninteresting. Similarly, for those
effects $\psi(b_0, b) = (F_1, F_2, \ldots)$ having more than one nonzero $F_\nu$ we find that
the probabilities are computed according to the mixture formula
$$\mu(\varphi(a), \psi(b_0, b)) = \sum_\nu \mathrm{tr}(W_\nu F_\nu).$$
Here only the individual terms $\mathrm{tr}(W_\nu F_\nu)$ are of interest. Combinations of
several system types are only of interest if there are “physically interesting”
effect isomorphisms in $\mathcal{A}_{(\mathcal{B})}(p, I)$ for which $p \ne 1$. Such is the case only in
relativistic quantum mechanics. For the present we shall only be interested in
the simplified description of quantum mechanics in which only a single
Hilbert space is used, considering different system types separately.
3.3 Unitary and Anti-unitary Representations Up to a Factor
Each element $a$ of the group $\mathcal{G}$ (considered as an element of $\mathcal{A}_{(\mathcal{B})}$ by means of
the representation in $\mathcal{A}_{(\mathcal{B})}$, where $\mathcal{B} = \mathcal{B}(\mathcal{H})$) corresponds to a unitary or
anti-unitary transformation of $\mathcal{H}$ into itself as follows:
$$a \to U(a). \qquad (3.3.1)$$
$U(a)$ is determined by $a$ up to a factor (which depends upon $a$). We may
require that the unit element $e$ of $\mathcal{G}$ corresponds to the unit operator 1 in $\mathcal{H}$:
$$e \to 1. \qquad (3.3.2)$$
From the representation properties of $\mathcal{G}$ in $\mathcal{A}_{(\mathcal{B})}$ it follows that $U(ab)$
must determine (according to (3.3.1)) the same transformation in $\mathcal{A}_{(\mathcal{B})}$
as does $U(a)U(b)$, that is, $U(ab)$ and $U(a)U(b)$ can differ only by a factor of
magnitude 1:
$$U(ab) = \omega(a, b)\,U(a)U(b), \qquad (3.3.3)$$
where $|\omega(a, b)| = 1$.
Conversely, if we have a representation (3.3.1) of a group $\mathcal{G}$ for which
(3.3.2) and (3.3.3) hold, then to each $a \in \mathcal{G}$ there exists a unique element in
$\mathcal{A}_{(\mathcal{B})}$ corresponding to $U(a)$ which we denote simply by $a$.
Let $\mathcal{U}(\mathcal{H})$ denote the group of unitary and anti-unitary transformations
of $\mathcal{H}$.
A map of $\mathcal{G}$ into $\mathcal{U}(\mathcal{H})$ of the form (3.3.1) for which (3.3.2) and (3.3.3) are
valid will be called a unitary (or anti-unitary) representation of $\mathcal{G}$ up to a
factor. Each unitary (anti-unitary) representation of $\mathcal{G}$ up to a factor uniquely
corresponds to a representation of $\mathcal{G}$ in $\mathcal{A}_{(\mathcal{B})}$. Each representation of $\mathcal{G}$ in
$\mathcal{A}_{(\mathcal{B})}$ corresponds to a representation up to a factor, where two representations
$U_1$, $U_2$ determine the same representation in $\mathcal{A}_{(\mathcal{B})}$ if
$$U_1(a) = e^{i\delta(a)}\,U_2(a), \qquad (3.3.4)$$
where $\delta(a)$ is a real function.
For $a = e$ or $b = e$, from (3.3.3) and (3.3.2) it follows that
$$\omega(e, b) = \omega(a, e) = 1. \qquad (3.3.5)$$
From (3.3.3) it follows that
$$U(a(bc)) = \omega(a, bc)\,U(a)U(bc)$$
$$= \omega(a, bc)\,\omega(b, c)\,U(a)U(b)U(c)$$
and
$$U((ab)c) = \omega(ab, c)\,U(ab)U(c)$$
$$= \omega(ab, c)\,\omega(a, b)\,U(a)U(b)U(c)$$
and we therefore obtain
$$\omega(a, bc)\,\omega(b, c) = \omega(ab, c)\,\omega(a, b). \qquad (3.3.6)$$
The relations (3.3.5) and (3.3.6) are, in a sense, characteristic for the
multipliers $\omega(a, b)$ because, to each “solution” of (3.3.5) and (3.3.6) there also
exists a representation up to a factor (see [10]).
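A small concrete case may help to fix ideas. The sketch below is an illustration and not taken from the text: it assumes the group $Z_2 \times Z_2$ with elements $a = (m, n)$ and the unitary matrices $U(m, n) = X^m Z^n$ built from two Pauli matrices. Because $ZX = -XZ$, the products $U(a)U(b)$ reproduce $U(ab)$ only up to a sign, and that sign is a multiplier in the sense of (3.3.3). The names `gmul`, `U`, `omega` and `check_multiplier` are hypothetical helpers.

```python
from itertools import product

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    """Scalar multiple c * a of a 2x2 matrix."""
    return [[c * x for x in row] for row in a]

ONE = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

GROUP = list(product((0, 1), repeat=2))  # elements a = (m, n) of Z_2 x Z_2
E = (0, 0)                               # unit element

def gmul(a, b):
    """Group multiplication: componentwise addition modulo 2."""
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

def U(a):
    """U(m, n) = X^m Z^n, a unitary 2x2 matrix."""
    m, n = a
    out = ONE
    if m:
        out = matmul(out, X)
    if n:
        out = matmul(out, Z)
    return out

def omega(a, b):
    """Multiplier with U(ab) = omega(a, b) U(a) U(b); here a sign."""
    return (-1) ** (a[1] * b[0])

def check_multiplier():
    """Verify (3.3.2), (3.3.3), (3.3.5) and (3.3.6) for all group elements."""
    ok_unit = U(E) == ONE
    ok_rep = all(U(gmul(a, b)) == scale(omega(a, b), matmul(U(a), U(b)))
                 for a in GROUP for b in GROUP)
    ok_norm = all(omega(E, a) == 1 == omega(a, E) for a in GROUP)
    ok_cocycle = all(
        omega(a, gmul(b, c)) * omega(b, c) == omega(gmul(a, b), c) * omega(a, b)
        for a in GROUP for b in GROUP for c in GROUP)
    return ok_unit and ok_rep and ok_norm and ok_cocycle

print(check_multiplier())
```

All entries are integers, so the comparisons are exact. Only unitary operators occur in this example; the anti-unitary case is not covered by the sketch.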
From (3.3.4) it follows that the multipliers $\omega_1$ and $\omega_2$ for two representations
$U_1$, $U_2$ which correspond to the same representation in $\mathcal{A}_{(\mathcal{B})}$
satisfy the equation:
$$\omega_2(a, b) = \omega_1(a, b)\,e^{i[\delta(a) + \delta(b) - \delta(ab)]}. \qquad (3.3.7)$$
Two multipliers $\omega_1$ and $\omega_2$ are said to be equivalent if there exists a real
function $\delta$ on $\mathcal{G}$ such that (3.3.7) holds. Here we note that nothing more
about $\omega(a, b)$ has been assumed: nothing about continuity, measurability,
etc. has been assumed.
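The passage from (3.3.4) to (3.3.7) can be traced numerically in the simplest setting. The sketch below is an assumption-laden illustration: it takes the cyclic group $Z_5$, an ordinary one-dimensional unitary representation $U_1$ (so that $\omega_1 = 1$), and an arbitrarily chosen real function $\delta$ with $\delta(e) = 0$ so that (3.3.2) is preserved. The names `U1`, `U2`, `multiplier` and `predicted` are hypothetical.

```python
import cmath
import math
import random

N = 5  # cyclic group Z_5, written additively (an illustrative assumption)

def gmul(a, b):
    """Group multiplication in Z_N."""
    return (a + b) % N

def U1(a):
    """Ordinary one-dimensional representation; its multiplier is 1."""
    return cmath.exp(2j * math.pi * a / N)

random.seed(0)
# Arbitrary real phase function with delta(e) = 0, so that U2(e) = 1.
delta = [0.0] + [random.uniform(0.0, 2.0 * math.pi) for _ in range(N - 1)]

def U2(a):
    """Rephased representation, chosen so that U1(a) = exp(i delta(a)) U2(a)."""
    return cmath.exp(-1j * delta[a]) * U1(a)

def multiplier(U, a, b):
    """omega(a, b) defined by U(ab) = omega(a, b) U(a) U(b)."""
    return U(gmul(a, b)) / (U(a) * U(b))

def predicted(a, b):
    """Right-hand side of (3.3.7)."""
    phase = delta[a] + delta[b] - delta[gmul(a, b)]
    return multiplier(U1, a, b) * cmath.exp(1j * phase)

max_error = max(abs(multiplier(U2, a, b) - predicted(a, b))
                for a in range(N) for b in range(N))
print(max_error)
```

The printed deviation is of the order of rounding error. Since $\delta$ was chosen at random, the resulting $\omega_2$ has no regularity beyond (3.3.5) and (3.3.6), which illustrates why a convenient representative of each equivalence class has to be selected.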
Therefore the problem remains to put forward a clever choice of special
multiplier from a class of equivalent multipliers.
Thus the problem of finding the representation of a group $\mathcal{G}$ in $\mathcal{A}_{(\mathcal{B})}$ is
equivalent to the problem of finding all unitary (or anti-unitary) representations
up to a factor and the selection of a particularly “simple”
multiplier from each equivalence class of multipliers.
In accord with Th. 3.1.1 we now propose the following continuity
assumption for the representation map $U: \mathcal{G} \to \mathcal{U}(\mathcal{H})$:
$$a \to |\langle \varphi, U(a)\psi \rangle| \qquad (3.3.8)$$
is continuous for all $\varphi, \psi \in \mathcal{H}$.
The problem posed above can be solved for a certain type of group which
contains all “physically relevant” groups. A more precise description of the
solution of this representation problem will require a special book, and
would result in the loss of continuity of the train of thought. Since
monographs already exist on this topic (see, for example, [10]), in the
appendix AV, §10 of this book we shall only present a brief summary for the
case of a compact group in order that we may obtain a better understanding
of the rotation groups.
In closing this section we shall now characterize the concepts introduced at
the beginning of §1.2 in terms of the form of a representation up to a factor.
A $\mathcal{G}$-invariant effect (as defined in D 1.2.1) is therefore an element $g$ which
commutes with all $U(a)$, $a \in \mathcal{G}$, that is, $U(a)g = gU(a)$.
Therefore, according to D 1.1.2 a representation is irreducible if the only
operators which commute with all the $U(a)$, $a \in \mathcal{G}$, are multiples of the 1-
operator. Otherwise, to each operator $A$ which commutes with all $U(a)$ we
would have $A^+U(a)^+ = U(a)^+A^+$, that is, $A^+U(a^{-1}) = U(a^{-1})A^+$; then all
$U(a)$ would commute with the operator $A^+$ and hence with the self-adjoint
operator $A + A^+$. Then there would be a projection operator $E$ ($\ne 0$, $\ne 1$) (for
example, from the spectral family of $A + A^+$) which commutes with all $U(a)$.
If a representation is not irreducible, then there exists a projection
operator $E$ ($\ne 0$, $\ne 1$) which commutes with all $U(a)$; this is equivalent to the
condition that there exists an invariant subspace of $\mathcal{H}$ (different from $\mathcal{H}$ and
$\{0\}$), namely, the projection space belonging to $E$.
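The link between a commuting projection and an invariant subspace can be seen in a two-dimensional toy case. The following sketch is an illustration under stated assumptions: the group is $Z_2$, represented on $\mathbb{C}^2$ by the identity and the matrix that swaps the two basis vectors, and $E$ is the projection onto the line spanned by $(1, 1)$. The names `REP`, `commutes_with_rep` and `is_projection` are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(m, v):
    """Apply a 2x2 matrix to a vector."""
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

ONE = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
SWAP = [[0, 1], [1, 0]]

# Representation of Z_2 = {0, 1} on C^2 (an illustrative assumption).
REP = {0: ONE, 1: SWAP}

HALF = Fraction(1, 2)
E = [[HALF, HALF], [HALF, HALF]]  # projection onto the span of (1, 1)

def commutes_with_rep(op):
    """True if op commutes with every U(a) of the representation."""
    return all(matmul(op, u) == matmul(u, op) for u in REP.values())

def is_projection(op):
    """True if op is idempotent and symmetric (all entries are real here)."""
    transpose = [[op[j][i] for j in range(2)] for i in range(2)]
    return matmul(op, op) == op and transpose == op

print(is_projection(E), commutes_with_rep(E), E != ZERO, E != ONE)

# The range of E is invariant: U(a) maps (1, 1) to a vector fixed by E.
v = [1, 1]
print(all(apply(E, apply(u, v)) == apply(u, v) for u in REP.values()))
```

Exact rational arithmetic is used so that the equality tests are not affected by rounding. The representation is therefore reducible: the line spanned by $(1, 1)$ is an invariant subspace different from $\{0\}$ and from the whole space.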
With the help of V (5.3) we may, in principle, rewrite condition D 1.2.3 for
equivalent representations. We will do this for the case of two representations
in $\mathcal{B}'(\mathcal{H})$ and $\mathcal{B}'(\tilde{\mathcal{H}})$ because these are the only significant ones in
nonrelativistic quantum mechanics.
Suppose that two representations by means of $\mathcal{B}$-continuous effect isomorphisms
are given in terms of representations up to a factor $U(a)$ in $\mathcal{H}$ and
$\tilde{U}(a)$ in $\tilde{\mathcal{H}}$. They are equivalent if there exists an isomorphism (or anti-isomorphism)
$V$ of $\tilde{\mathcal{H}}$ onto $\mathcal{H}$ such that
$$\tilde{U}(a)\,y\,\tilde{U}(a)^{-1} = V^{-1}U(a)V\,y\,V^{-1}U(a)^{-1}V \qquad (3.3.9)$$
holds for all $y \in \mathcal{B}'(\tilde{\mathcal{H}})$. From (3.3.9) we find that the above is equivalent to
$$\tilde{U}(a) = e^{i\delta(a)}\,V^{-1}U(a)V, \qquad (3.3.10)$$
where $\delta(a)$ is a real function on $\mathcal{G}$. If $\tilde{\mathcal{H}} = \mathcal{H}$ then $V$ is a unitary or anti-unitary
operator in $\mathcal{H}$. By clever selection of the factors for a representation
we may, according to (3.3.10), obtain $\delta(a) = 0$. Then (3.3.10) reduces to the
“usual” form for the definition of equivalence of two unitary representations.
In particular, it follows from (3.3.10) for $V = 1$ that two representations
which differ only by factors of the form $e^{i\delta(a)}$ (and which, according
to (3.3.7), correspond to equivalent multipliers) are equivalent.
CHAPTER VII
The Galileo Group
In the investigation of microsystems an important role is played by a
particular component of the physical-technical structure of those experiments
which are composed of a preparation procedure and a registration
method. Every experimental physicist is acquainted with this component,
which underlies the problem of fixing the spatial and time relationships
between the preparation and registration apparatuses. Earlier we have
briefly described this question in the definition of C in II (4.3.1) and in III, §1.
We must now introduce a corresponding mathematical structure into our
mathematical formulation which describes the spatial and time relationships
between the preparation and registration apparatuses.
1 The Galileo Group as a Set of Transformations of
Registration Procedures Relative to Preparation Procedures
Underlying each element $b_0 \in \mathcal{R}_0$ there exists a whole technology of the
construction of the apparatus to which the registration method $b_0$ belongs.
Here it is not possible to discuss the technology involved. In this respect a
large series of pre-theories (see [1], [2], III and [13]) is required for quantum
mechanics. It is, however, essential that quantum mechanics itself is not used
in the physical description of b0. Here it is possible to raise objections to this
assertion; in XVII we will examine such problems in more detail. In [2], XVI
it will be obvious what we mean by the above assertions. In [13] this
problem is treated in considerable detail.
Although we do not need to go into all of the details of the construction of
$b_0$, we must, however, present a brief explanation of the special technical
character associated with the pre-theory of space-time measurement.
The preparation and registration procedures always refer to a space-time
reference system—an inertial system (or approximately inertial system) (see,
for example, [2], II, VII, and IX). Every experimental physicist is aware of
the importance of the spatial and temporal relationships of the registration
apparatus relative to the preparation apparatus. Here it is often necessary to
use the most modern measurement techniques. It is important that the
technical specification of such a space-time reference system has nothing to
do with quantum mechanics. The fact that this situation is often not
sufficiently understood is, in part, the cause for many conceptual errors in
quantum mechanics.
We shall now describe what we believe to be the correct meaning of space
and time in quantum mechanics: Space and time coordinates refer only to
the preparation and registration apparatuses and do not(!!) refer to the
individual microsystems. In the formulation of quantum mechanics presented
in this book we have not introduced the position of a microsystem as a
basic concept because it is not clear how such a concept can be defined in
terms of the pre-theories. The mapping principles do not permit us to use
concepts which cannot be defined by the pre-theories (see [1] and [2], III).
Instead of the “position of a microsystem” only the spatial relationship
between the preparation and registration apparatuses is defined by means of
the pre-theories, and, in this respect, is permitted by the mapping
principles.
We shall not provide a precise mathematical picture of the
placement of the registration apparatus relative to the laboratory spatial
coordinate system. Here we shall be interested only in a particular aspect of
this structure, which is described below. This description is a substitute for a
more complicated description in terms of pre-theories (see [13]).
We shall assume that a registration method $b_0 \in \mathcal{R}_0$ is not only characterized
by the “inner” structure of the registration apparatus but also by its
position and motion relative to the space-time laboratory reference system.
In this way we obtain a new structure in the domain of the registration
methods and the corresponding registration procedures. Two registration
methods b01 and b02 can only differ in their position and motion in the
laboratory reference system if the corresponding apparatuses do not differ in
their internal structure. We shall now introduce a mathematical description
of such a situation between two registration methods.
How can two such registration methods differ? If we use the Newtonian
space-time structure, they may differ only by a Galileo transformation. If the
space-time structure of special relativity is used, they will differ by one of the
transformations of the Poincare group. For a discussion of the Galileo group
see [2], VI, §1.2; for the Poincare group see [2], IX, §4.3. Here the physical
meaning of a group transformation is that b01 may be obtained from b02 by
means of a spatial translation (in the reference system under consideration)
or by a time translation, or that the apparatus for b01 moves with constant
velocity relative to that of b02.
A “time translation” by the time $\tau$ of $b_{01}$ relative to $b_{02}$ means that the
apparatus corresponding to $b_{01}$ is placed into operation at a time interval $\tau$
later than the apparatus corresponding to $b_{02}$; for example, a voltage is
turned on at a later time $\tau$. Later we shall find that the problem of time
displacement is not trivial with respect to the combination problem of a
preparation and a registration apparatus described in II, §4.3.
We will consider only the mathematical formulation of the structure
described above for the case of the Galileo group $\mathcal{G}$. The formulation of the
analogous case for the Poincare group is similar and trivial. $\mathcal{G}$ is a locally
compact, separable topological group. The elements of $\mathcal{G}$ can be given by the
transformations (see [2], VI (1.2.1)) as follows:
$$x'_\nu = \sum_{\mu=1}^{3} \alpha_{\nu\mu} x_\mu + \delta_\nu t + \eta_\nu \quad (\nu = 1, 2, 3),$$
$$t' = t + \gamma, \tag{1.1}$$
where $A = (\alpha_{\nu\mu})$ is the matrix of a spatial rotation, that is, $A' = A^{-1}$. We shall
consider only transformations (1.1) which can be continuously transformed
into the unit element, that is, those for which the determinant $|A| = 1$.
We may represent an element characterized by (1.1) as follows:
$$(A, \vec{\delta}, \vec{\eta}, \gamma). \tag{1.2}$$
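As an illustration only, the action (1.1) and the composition of two elements written in the form (1.2) can be sketched in a few lines of Python. The composition rule used here is obtained by applying (1.1) twice, under the assumed convention that in a product g1 g2 the element g2 acts first; the helper names (`rot_z`, `act`, `compose`) are hypothetical and do not come from the text.

```python
import math

def rot_z(angle):
    """Rotation matrix about the 3-axis (orthogonal, determinant +1)."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def mat_vec(A, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return tuple(sum(A[i][j] * v[j] for j in range(3)) for i in range(3))

def mat_mul(A, B):
    """Product of two 3x3 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def act(g, x, t):
    """Apply g = (A, delta, eta, gamma) to the event (x, t) as in (1.1)."""
    A, delta, eta, gamma = g
    Ax = mat_vec(A, x)
    x_new = tuple(Ax[i] + delta[i] * t + eta[i] for i in range(3))
    return x_new, t + gamma

def compose(g1, g2):
    """Group product g1 g2 (assumed convention: g2 acts first, then g1)."""
    A1, d1, e1, c1 = g1
    A2, d2, e2, c2 = g2
    A1d2 = mat_vec(A1, d2)
    A1e2 = mat_vec(A1, e2)
    delta = tuple(d1[i] + A1d2[i] for i in range(3))
    # The term d1 * c2 arises because g1 boosts at the already shifted time.
    eta = tuple(e1[i] + A1e2[i] + d1[i] * c2 for i in range(3))
    return (mat_mul(A1, A2), delta, eta, c1 + c2)
```

With this convention, acting with `compose(g1, g2)` gives the same event as acting first with `g2` and then with `g1`, which is the defining property of the product.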
We now make the following assertion: There exists at least one denumerable
subgroup $\hat{\mathcal{G}} \subset \mathcal{G}$ (we can choose $\hat{\mathcal{G}}$ to be the set of all transformations
(1.1) where $\alpha_{\nu\mu}, \delta_\nu, \eta_\nu, \gamma$ are rational numbers) such that, for all $g \in \hat{\mathcal{G}}$, there
exists a map $\mathcal{R} \xrightarrow{g} \mathcal{R}$ (which we shall denote by $g$) for which $g(\mathcal{R}_0) \subset \mathcal{R}_0$. Its
physical meaning (as we mentioned earlier) is that, for $b_0 \in \mathcal{R}_0$, the method
$g b_0$ is obtained from $b_0$ by means of the Galileo transformation $g$ and that for
$b \in \mathcal{R}$, $b \subset b_0$ the procedure $g b$ is precisely that which is obtained from $b$ by
means of the Galileo transformation $g$. The “meaning” described here is a
mapping principle (in the sense of [1], §5 or [2], III, §4) for the relation
mathematically described by $\mathcal{R} \xrightarrow{g} \mathcal{R}$. This short outline must suffice for the
present. It is important, however, to note that the “physical meaning” of $g$
and $g b_0$, $g b$ is already determined by means of the pre-theories! The reason
why we consider only a denumerable subgroup $\hat{\mathcal{G}}$ lies in the
“finiteness of physics” assumption (see [1], §9 or [2], III, §8) that $\mathcal{R}$ must be
denumerable (see VI, §1.1).
We may express the fact that the inner structure of the registration
apparatus remains unchanged by the application of the Galileo transformation
by asserting that the mapping $\mathcal{R} \xrightarrow{g} \mathcal{R}$ is an r-automorphism. The
fact that it is essential to consider only Galileo transformations is made clear
by the fact that the acceleration of an apparatus can modify its inner
structure. Therefore an apparatus $b_{01}$ which is accelerated relative to an
apparatus $b_{02}$ cannot be characterized by means of an r-automorphism.
We now present a summary of the above considerations in axiomatic
form:
For each $g \in \hat{\mathcal{G}}$ there exists an r-automorphism $\mathcal{R} \xrightarrow{g} \mathcal{R}$ and we obtain a
representation of the group $\hat{\mathcal{G}}$ by means of r-automorphisms.
This axiom can be directly obtained as a theorem from the pre-theories—
see, for example, [13].
(For the preparation and registration of macrosystems this assertion about
the possibility of representation of the group $\hat{\mathcal{G}}$ by means of r-automorphisms
is not correct for all time translations $\gamma$ because we have chosen the time
point $t = 0$ to be the time before which the preparation is complete and after
which the registration begins (see III, §1, [2], XV, and [13]). In [13] it is
shown that the time translation of the registration apparatus makes sense
only for $\gamma > 0$.)
Our description of the application of Galilean transformations for registration
procedures can also be directly carried out for preparation
procedures. For such a transformation we shall write $a \to ga$. The transformations
of registration procedures and preparation procedures are not
mutually independent. For a pair $a \in \mathcal{Q}'$, $b_0 \in \mathcal{R}_0'$ a Galileo transformation
$g$ gives rise to an experiment $(a, g b_0)$ if $(a, g b_0) \in \mathcal{C}$. If, instead, we
transform the preparation procedure by means of $g^{-1}$, then we obtain the
pair $(g^{-1}a, b_0)$. Here $(a, g b_0)$ and $(g^{-1}a, b_0)$ differ only in the fact that the
complete experiment $(a, g b_0)$ is transformed by $g$ relative to $(g^{-1}a, b_0)$, while
the “relative” position of the preparation and registration apparatus is the
same in both cases. For this reason the following assertions are “almost
trivial”:
$$(a, g b_0) \in \mathcal{C} \Leftrightarrow (g^{-1}a, b_0) \in \mathcal{C}$$
and
$$\mu(a, g(b_0, b)) = \mu(g^{-1}a, (b_0, b)).$$
From V, Th. 3.4 it follows directly that $\mathcal{R} \xrightarrow{g} \mathcal{R}$ is a p-continuous
r-automorphism and that $\mathcal{Q} \xrightarrow{g} \mathcal{Q}$ is an r-continuous p-automorphism.
We may combine the above considerations by asserting that $\hat{\mathcal{G}}$ may be
represented by means of p-continuous r-automorphisms, namely the
maps $\mathcal{R} \xrightarrow{g} \mathcal{R}$.
According to VI, §1.1 it follows that to each $g$ there corresponds a unique
$\mathcal{B}$-continuous effect automorphism; this, in turn, yields a representation
$\hat{\mathcal{G}} \to \mathcal{A}$ of $\hat{\mathcal{G}}$ in the group $\mathcal{A}$ of $\mathcal{B}$-continuous effect automorphisms
(see VI, §1.2). In addition the elements of $\mathcal{A}$ which correspond to
elements of $\hat{\mathcal{G}}$ leave the subspace $\mathcal{D}$ of $\mathcal{B}'$ invariant.
For the representation of # by means of p-continuous r-automorphisms
we shall require that AG 1 from VI, §1.1 holds. AG 1 is the mathematical
expression for the condition that small errors in adjusting the registration
apparatus in space and time cannot be detected by means of the probability
distributions. In principle this is nothing other than the assumption that
small errors are of a statistical character.
In this way we may, therefore, by VI, §1.1 and §1.2 consider the representation
of the complete Galileo group $\mathcal{G}$ in $\mathcal{A}$ where $\mathcal{G} \to \mathcal{A}$ is continuous
according to VI, §1.3. According to VI, §3.3 we may also consider separate
representations of the Galileo group (up to a factor) for each of the Hilbert
spaces $\mathcal{H}_\nu$ of different “system types” because the Galileo group (where we
assume that the determinant of the rotation matrix is +1) is connected. For
these representations we may impose the continuity condition VI, (3.3.8).
2 Irreducible Representations of the Galileo Group and
Their Physical Meaning
The irreducible representations of the Galileo group play a fundamental
physical role. As we have found at the end of the previous section, we can
consider each system type and its corresponding Hilbert space $\mathcal{H}_\nu$ separately.
D 2.1. A system type (IV, D 8.1.1) is said to be “elementary” if its corresponding
representation of the Galileo group (that is, its representation up
to a factor in $\mathcal{H}_\nu$) is irreducible; otherwise, it is said to be “composite.”
Instead of $x \in p(e)$ (where $e$ is an elementary system type) we may sometimes
speak (less precisely) simply of elementary systems $x$ of type $e$ (for $p(e)$ see
IV, §8.1).
This definition is mathematically clear. However, its physical interpretation
may be misunderstood. For this reason we shall make a number of
explanatory remarks.
Every theory refers to a certain “fundamental domain” where it is usable
(see, for example, [2], III, §2 and §4 or [1], §3 and §5). If we then have a more
comprehensive theory, then the corresponding fundamental domain will
probably be larger (see [2], III, §7 or [1], §8). If at a given point in the
development of a theory there is a certain fundamental domain, the theory
will describe certain real factual content—for example, atomic nuclei—as
elementary systems. In a more comprehensive theory these systems need
not necessarily be elementary. For example, if, in the fundamental domain,
only low-energy processes are admitted, then the atomic nuclei may be
described as elementary systems (for example, in atomic and molecular
physics). If we extend the fundamental domain by admitting such processes
as nuclear reactions, then we must use more comprehensive theories.
In such theories the atomic nuclei (except for the neutron and proton)
must be described as composite systems. We must be careful and avoid
the mistakes made by considering the concepts in a physical theory to
be absolute, instead of understanding that they are part of the description of
a certain fundamental domain. In physics we often restrict the fundamental
domain and consider simplified approximate theories for such restricted
fundamental domains. In such a simplified and approximate theory a
complete atom may be considered to be an “elementary” system. The fact
that this situation does not result in a contradiction can be seen in the fact
that in the treatment of composite systems we find that the “center of mass”
of a composite system itself behaves as if it were an elementary system.
Since we may consider each system type separately—as we have already
discussed in VI, §3—in the following we shall always consider only a single
Hilbert space $\mathcal{H}$.
As we mentioned in VI, §3.3, we cannot present an exact derivation of the
irreducible representations of the Galileo group here, because it would
require an entire book to do so. The reader who is interested in this task is
referred to [10]. The derivation presented there can be directly applied here
because the continuity of the maps, VI (3.3.8), is the central assumption in
[10] for the derivation of the possible inequivalent representations (if only
the weaker requirement is used that the maps, VI (3.3.8), be measurable, then
continuity follows for locally compact groups). Here again we
note that the assumptions made in §1 are completely sufficient to apply all the
theorems presented in [10]. Every irreducible representation of the Galileo
group in terms of $\mathcal{B}$-continuous effect isomorphisms can be given by the
corresponding unitary representation up to a factor. We shall now give a
brief summary of the structure of such representations.
The Galileo group (1.1) contains the abelian group of spatial translations
and the “proper” Galileo transformations $\vec{\delta}$ as subgroups. For a one-parameter
group (for example, the translations in the 1-direction with the
parameter $\eta_1$) the factors may be so chosen that we obtain a unitary
representation as follows:
Let $\alpha$ denote the parameter of the group element, that is,
$$a(\alpha_1)\, a(\alpha_2) = a(\alpha_1 + \alpha_2).$$
From VI (3.3.5) and (3.3.6), where
$$\omega(a(\alpha_1), a(\alpha_2)) = e^{i\gamma(\alpha_1, \alpha_2)},$$
it follows that
$$\gamma(0, \alpha) = \gamma(\alpha, 0) = 0, \tag{2.1}$$
$$\gamma(\alpha_1, \alpha_2 + \alpha_3) + \gamma(\alpha_2, \alpha_3) = \gamma(\alpha_1 + \alpha_2, \alpha_3) + \gamma(\alpha_1, \alpha_2). \tag{2.2}$$
According to VI (3.3.7) the question arises whether there exists a $\delta(\alpha) = \delta(a(\alpha))$
for which $\omega_2(a, b) = 1$, that is,
$$\gamma(\alpha_1, \alpha_2) + \delta(\alpha_1) + \delta(\alpha_2) - \delta(\alpha_1 + \alpha_2) = 0. \tag{2.3}$$
On the basis of condition (2.1) it is always possible to find such a $\delta(\alpha)$. This
can be proven without additional assumptions (see [10]). This can be easily
seen if we assume that $\gamma$ is twice differentiable, and we differentiate (2.2) first
with respect to $\alpha_1$ and then with respect to $\alpha_3$.
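The following Python sketch illustrates the easy converse direction only: any exponent of the form $\gamma(\alpha_1, \alpha_2) = \delta(\alpha_1 + \alpha_2) - \delta(\alpha_1) - \delta(\alpha_2)$ with $\delta(0) = 0$ satisfies (2.1), (2.2) and (2.3). It is not the proof from [10] that every solution has this form. The particular function `delta` is an arbitrary, hypothetical choice.

```python
import math

def delta(alpha):
    """Hypothetical real phase function with delta(0) = 0."""
    return 0.3 * alpha ** 2 + math.sin(alpha)

def gamma(alpha1, alpha2):
    """Exponent of the multiplier, built so that (2.3) holds by construction."""
    return delta(alpha1 + alpha2) - delta(alpha1) - delta(alpha2)

def defect_2_2(a1, a2, a3):
    """Left side minus right side of the functional equation (2.2)."""
    lhs = gamma(a1, a2 + a3) + gamma(a2, a3)
    rhs = gamma(a1 + a2, a3) + gamma(a1, a2)
    return lhs - rhs

def defect_2_3(a1, a2):
    """Left side of (2.3); it vanishes when delta removes the factor."""
    return gamma(a1, a2) + delta(a1) + delta(a2) - delta(a1 + a2)
```

Both defects vanish up to rounding for arbitrary real arguments, and `gamma(0, a)` and `gamma(a, 0)` vanish because `delta(0) = 0`.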
For a family of unitary operators $U(\alpha)$ with $U(0) = 1$ and $U(\alpha_1 + \alpha_2) =
U(\alpha_1) U(\alpha_2)$ it follows (see [35]) that there exists a spectral family $E(k)$ for
which
$$U(\alpha) = \int_{-\infty}^{\infty} e^{ik\alpha}\, dE(k). \tag{2.4}$$
From (2.4) it follows that there exists a (not necessarily bounded) self-adjoint
operator
$$K = \int_{-\infty}^{\infty} k\, dE(k), \tag{2.5}$$
where
$$U(\alpha) = e^{iK\alpha}. \tag{2.6}$$
The above procedure can also be carried out for the three parameter
abelian group $(1, 0, \vec{\eta}, 0)$ if we only replace $\alpha$ by a three-dimensional vector $\vec{\eta}$.
Then it follows that we may choose $U(1, 0, \vec{\eta}, 0)$ in such a way that we obtain
a representation without factors, that is, the $U(1, 0, \vec{\eta}, 0)$ form an abelian
group. From (2.6) we find that there exist self-adjoint operators $K_1, K_2, K_3$
(or $\vec{K}$) for which
$$U(1, 0, \vec{\eta}, 0) = e^{i\vec{K}\cdot\vec{\eta}}. \tag{2.7}$$
Since the $U(1, 0, \vec{\eta}, 0)$ form an abelian group, the $K_\nu$ must commute. $\vec{K}$ is
not, however, uniquely determined by the choice of factors. It is easy to see
that the $U(1, 0, \vec{\eta}, 0)$ are uniquely determined up to factors of the form $e^{i\vec{\kappa}\cdot\vec{\eta}}$
where $\vec{\kappa}$ is an arbitrary vector. Therefore $\vec{K}$ is determined only up to an additive
term $\vec{\kappa} 1$.
For the elements of the Galileo group we find that
$$(1, 0, A\vec{\eta}, 0) = (A, 0, 0, 0)(1, 0, \vec{\eta}, 0)(A, 0, 0, 0)^{-1}. \tag{2.8}$$
For the representation we obtain
$$U(A, 0, 0, 0)\, U(1, 0, \vec{\eta}, 0)\, U(A, 0, 0, 0)^{-1} = \lambda(A, \vec{\eta})\, U(1, 0, A\vec{\eta}, 0). \tag{2.9}$$
The multipliers $\lambda(A, \vec{\eta})$ do not depend on the choice of factors for
$U(A, 0, 0, 0)$. We may set $\lambda(A, \vec{\eta}) = 1$ by making a suitable choice of the factor
$e^{i\vec{\kappa}\cdot\vec{\eta}}$ as follows: From (2.9) it easily follows (where $\lambda(A, \vec{\eta}) = e^{ig(A, \vec{\eta})}$) that
$$g(A, \vec{\eta}_1 + \vec{\eta}_2) = g(A, \vec{\eta}_1) + g(A, \vec{\eta}_2)$$
and
$$g(A_1 A_2, \vec{\eta}) = g(A_2, \vec{\eta}) + g(A_1, A_2 \vec{\eta}).$$
From the first equation it follows that
$$g(A, \vec{\eta}) = \vec{h}(A)\cdot\vec{\eta}$$
and from the second equation we obtain
$$\vec{h}(A_1 A_2) = \vec{h}(A_2) + A_2^{-1}\, \vec{h}(A_1).$$
This equation fixes $\vec{h}(A)$ up to an arbitrary vector $\vec{h}(A_0)$ for an $A_0$ for which
$\vec{h}(A_0) \neq 0$. The solution up to an arbitrary vector is given by
$$\vec{h}(A) = \vec{\kappa} - A^{-1}\vec{\kappa}.$$
Thus, from (2.9) it follows that the factors $e^{i\vec{\kappa}\cdot\vec{\eta}}$ of the $U(1, 0, \vec{\eta}, 0)$ can be
chosen such that the $\lambda(A, \vec{\eta})$ are equal to 1.
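As a numerical sanity check (a sketch, not a derivation), one can confirm that a vector function of the form h(A) = kappa - A^{-1} kappa satisfies the composition rule h(A1 A2) = h(A2) + A2^{-1} h(A1) for rotations. The vector `KAPPA`, the two sample rotations and all helper names are hypothetical choices; the inverse of a rotation matrix is taken to be its transpose.

```python
import math

def rot_z(angle):
    """Rotation about the 3-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def rot_x(angle):
    """Rotation about the 1-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return ((1.0, 0.0, 0.0), (0.0, c, -s), (0.0, s, c))

def transpose(A):
    """Transpose of a 3x3 matrix (equal to the inverse for a rotation)."""
    return tuple(tuple(A[j][i] for j in range(3)) for i in range(3))

def mat_vec(A, v):
    return tuple(sum(A[i][j] * v[j] for j in range(3)) for i in range(3))

def mat_mul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

KAPPA = (0.3, -1.2, 0.7)  # arbitrary (hypothetical) vector kappa

def h(A):
    """h(A) = kappa - A^{-1} kappa."""
    inv_A_kappa = mat_vec(transpose(A), KAPPA)
    return tuple(KAPPA[i] - inv_A_kappa[i] for i in range(3))

def composition_defect(A1, A2):
    """Largest component of h(A1 A2) - h(A2) - A2^{-1} h(A1)."""
    lhs = h(mat_mul(A1, A2))
    shifted = mat_vec(transpose(A2), h(A1))
    rhs = tuple(h(A2)[i] + shifted[i] for i in range(3))
    return max(abs(lhs[i] - rhs[i]) for i in range(3))
```

Two rotations about different axes are used in the check so that the non-commutativity of the rotation group is actually exercised.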
From (2.9), using (2.7), we find that
$$U(A, 0, 0, 0)\, \vec{K}\, U(A, 0, 0, 0)^{-1} = A^{-1}\vec{K}. \tag{2.10}$$
By choice of $\lambda(A, \vec{\eta}) = 1$ in (2.9) the operator vector $\vec{K}$ is uniquely
determined.
The same procedure can be applied to the subgroup of the proper Galileo
transformations $(1, \vec{\delta}, 0, 0)$. Again the factors can be chosen such that this
subgroup has a unitary representation:
$$U(1, \vec{\delta}, 0, 0) = e^{i\sum_{\nu=1}^{3} X_\nu \delta_\nu} = e^{i\vec{X}\cdot\vec{\delta}} \tag{2.11}$$
with mutually commuting self-adjoint operators $X_\nu$ which, by analogy with
(2.10), satisfy the equation
$$U(A, 0, 0, 0)\, \vec{X}\, U(A, 0, 0, 0)^{-1} = A^{-1}\vec{X} \tag{2.12}$$
and $\vec{X}$ is uniquely determined.
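Although all elements $(1, \vec{\delta}, \vec{\eta}, 0)$ form an abelian subgroup of the Galileo
group, it is not necessary that the $U(1, \vec{\delta}, 0, 0)$ commute with the $U(1, 0, \vec{\eta}, 0)$
because from
$$(1, \vec{\delta}, 0, 0)(1, 0, \vec{\eta}, 0) = (1, 0, \vec{\eta}, 0)(1, \vec{\delta}, 0, 0)$$
it only follows that
$$e^{iX_\nu\delta_\nu}\, e^{iK_\mu\eta_\mu} = \lambda_{\nu\mu}(\delta_\nu, \eta_\mu)\, e^{iK_\mu\eta_\mu}\, e^{iX_\nu\delta_\nu}, \tag{2.13}$$
where $|\lambda_{\nu\mu}| = 1$. Since we have no more free choice of factors for $U(1, \vec{\delta}, 0, 0)$
and $U(1, 0, \vec{\eta}, 0)$ we must yet specify what coefficients can occur in the
equation (2.13). We will now simplify the answer to this question by
assuming that the coefficients are differentiable, although no additional
assumptions are necessary (see [10]). If we multiply (2.13) on the left with
$e^{-iK_\mu\eta_\mu}$, differentiate with respect to $\eta_\mu$ and then set $\eta_\mu = 0$ we obtain
$$-iK_\mu e^{iX_\nu\delta_\nu} + i e^{iX_\nu\delta_\nu} K_\mu = \left(\frac{\partial\lambda_{\nu\mu}}{\partial\eta_\mu}\right)_{\eta_\mu = 0} e^{iX_\nu\delta_\nu}. \tag{2.14}$$
If we then multiply on the right with $e^{-iX_\nu\delta_\nu}$, then differentiate with respect to
$\delta_\nu$ and finally set $\delta_\nu = 0$ we then obtain
$$K_\mu X_\nu - X_\nu K_\mu = \left(\frac{\partial^2\lambda_{\nu\mu}}{\partial\delta_\nu\,\partial\eta_\mu}\right)_{\delta_\nu = 0,\ \eta_\mu = 0} 1. \tag{2.15}$$
From (2.10) and (2.12) we obtain
$$\left(\frac{\partial^2\lambda_{\nu\mu}}{\partial\delta_\nu\,\partial\eta_\mu}\right)_{\delta_\nu = 0,\ \eta_\mu = 0} = i m\, \delta_{\nu\mu},$$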
where $m$ must be real, because $X_\nu$ and $K_\mu$ are self-adjoint. Therefore from
(2.15) we obtain
$$K_\mu X_\nu - X_\nu K_\mu = i m\, \delta_{\nu\mu}\, 1. \tag{2.16}$$
We may choose $m \ge 0$ because $m$ and $-m$ lead to equivalent representations
of $\mathcal{G}$ in $\mathcal{A}$: $m$ transforms into $-m$ by means of an anti-unitary
transformation $V$, because $i$ transforms into $-i$:
$$(V K_\mu V^{-1})(V X_\nu V^{-1}) - (V X_\nu V^{-1})(V K_\mu V^{-1}) = i(-m)\, \delta_{\nu\mu}\, 1.$$
We have to distinguish between two cases: $m = 0$ and $m \neq 0$. For $m \neq 0$ the
$K_\nu$ do not commute with the $X_\nu$. However, (2.16) is only an abbreviated
notation for (2.13):
$$e^{i\vec{X}\cdot\vec{\delta}}\, e^{i\vec{K}\cdot\vec{\eta}} = e^{im\vec{\delta}\cdot\vec{\eta}}\, e^{i\vec{K}\cdot\vec{\eta}}\, e^{i\vec{X}\cdot\vec{\delta}}. \tag{2.17}$$
The result of these considerations (we again mention) is valid without any
assumption about $\lambda_{\nu\mu}$ (see [10]). Different values of $m$ lead to inequivalent
representations, because the factors in $U(1, \vec{\delta}, 0, 0)$ and $U(1, 0, \vec{\eta}, 0)$ cannot be
varied further and the number $|m|$ remains unchanged under unitary or
anti-unitary transformations.
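The elements $(1, 0, 0, \gamma)$ form an abelian subgroup; therefore there exists a
unitary representation
$$U(1, 0, 0, \gamma) = e^{iH\gamma}, \tag{2.20}$$
where $H$ is a self-adjoint operator, which is uniquely determined up to an
additive term $\varepsilon 1$.
From $(A, 0, 0, 0)(1, 0, 0, \gamma)(A^{-1}, 0, 0, 0) = (1, 0, 0, \gamma)$ it follows that
$$U(A, 0, 0, 0)\, e^{iH\gamma}\, U(A, 0, 0, 0)^{-1} = \alpha(A)\, e^{iH\gamma},$$
where $\alpha(A_1 A_2) = \alpha(A_1)\alpha(A_2)$. Since the only one-dimensional representation
of the rotation group is the identity (see §3) it follows that $\alpha = 1$, that is,
$$U(A, 0, 0, 0)\, H\, U(A, 0, 0, 0)^{-1} = H. \tag{2.21}$$
Equation (2.21) describes the rotation invariance of $H$. From
$$(1, 0, \vec{\eta}, 0)(1, 0, 0, \gamma)(1, 0, \vec{\eta}, 0)^{-1} = (1, 0, 0, \gamma)$$
it follows that
$$e^{iK_\nu\eta_\nu}\, e^{iH\gamma}\, e^{-iK_\nu\eta_\nu} = \beta_\nu(\eta_\nu, \gamma)\, e^{iH\gamma}.$$
From the preceding results and from (2.10) it follows that $\beta_\nu = 1$, that is, $H$
and the $K_\nu$ commute.
From
$$(1, 0, 0, \gamma)(1, \vec{\delta}, 0, 0)(1, 0, 0, \gamma)^{-1} = (1, \vec{\delta}, -\gamma\vec{\delta}, 0)$$
it follows that
$$e^{iH\gamma}\, e^{i\vec{X}\cdot\vec{\delta}}\, e^{-iH\gamma} = \rho(\gamma, \vec{\delta})\, e^{i\vec{X}\cdot\vec{\delta}}\, e^{-i\vec{K}\cdot\vec{\delta}\gamma}.$$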
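If we differentiate with respect to $\gamma$ and set $\gamma = 0$ we obtain (for $\rho(0, \vec{\delta}) = 1$)
$$iH e^{i\vec{X}\cdot\vec{\delta}} - i e^{i\vec{X}\cdot\vec{\delta}} H = \left(\frac{\partial\rho}{\partial\gamma}\right)_{\gamma = 0} e^{i\vec{X}\cdot\vec{\delta}} - i e^{i\vec{X}\cdot\vec{\delta}}\, \vec{K}\cdot\vec{\delta}.$$
Multiplying on the left with $e^{-i\vec{X}\cdot\vec{\delta}}$, differentiating with respect to $\delta_\nu$ and
setting $\vec{\delta} = 0$ we obtain
$$X_\nu H - H X_\nu = \left(\frac{\partial^2\rho}{\partial\gamma\,\partial\delta_\nu}\right)_{\vec{\delta} = 0,\ \gamma = 0} - iK_\nu. \tag{2.22}$$
According to (2.12) we must obtain
$$\left(\frac{\partial^2\rho}{\partial\gamma\,\partial\delta_\nu}\right)_{\vec{\delta} = 0,\ \gamma = 0} = 0.$$
Then combining the results of (2.22) with (2.17) we obtain
$$H = \frac{1}{2m}\vec{K}^2 + H_i, \tag{2.23}$$
whereby we find that the $X_\nu$ and $K_\mu$ commute with $H_i$.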
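The step leading to (2.23) can be checked by a short commutator computation. The following is only a sketch: it uses (2.16), the relation $X_\nu H - H X_\nu = -iK_\nu$ (which reappears as (2.40)), and it introduces $H_i$ as the difference $H - \frac{1}{2m}\vec{K}^2$.

```latex
\begin{align*}
X_\nu K_\mu - K_\mu X_\nu &= -\,i m\,\delta_{\nu\mu}\,1
  && \text{(from (2.16))} \\
X_\nu \vec{K}^2 - \vec{K}^2 X_\nu
  &= \sum_{\mu} \bigl[ (X_\nu K_\mu - K_\mu X_\nu) K_\mu
     + K_\mu (X_\nu K_\mu - K_\mu X_\nu) \bigr]
   = -\,2 i m\, K_\nu \\
X_\nu \frac{\vec{K}^2}{2m} - \frac{\vec{K}^2}{2m} X_\nu &= -\,i K_\nu \\
X_\nu H_i - H_i X_\nu
  &= (X_\nu H - H X_\nu)
     - \Bigl( X_\nu \frac{\vec{K}^2}{2m} - \frac{\vec{K}^2}{2m} X_\nu \Bigr) = 0,
  \qquad H_i := H - \frac{1}{2m}\vec{K}^2 .
\end{align*}
```

Since $H$ and $\vec{K}^2$ both commute with the $K_\mu$, the difference $H_i$ commutes with the $K_\mu$ as well, in line with the statement following (2.23).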
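From (2.17) it follows that the Hilbert space $\mathcal{H}$ can be represented in the
following way, that is, $\mathcal{H}$ together with the operators $e^{i\vec{X}\cdot\vec{\delta}}$, $e^{i\vec{K}\cdot\vec{\eta}}$ is isomorphic
to the following form:
$$\mathcal{H} = \mathcal{H}_b \times \mathcal{H}_i, \tag{2.24}$$
where $\mathcal{H}_b$ can be chosen as the space $\mathcal{L}^2(R^3, dk_1\, dk_2\, dk_3)$. Then it is easy to
obtain
$$e^{i\vec{K}\cdot\vec{\eta}} \to e^{i\vec{K}\cdot\vec{\eta}} \times 1, \qquad e^{i\vec{X}\cdot\vec{\delta}} \to e^{i\vec{X}\cdot\vec{\delta}} \times 1. \tag{2.25}$$
On the right side of the above equation $\vec{K}$ and $\vec{X}$ are operators only in $\mathcal{H}_b$, and for
$\varphi(\vec{k}) \in \mathcal{L}^2(R^3, dk_1\, dk_2\, dk_3)$ we obtain
$$e^{i\vec{K}\cdot\vec{\eta}}\varphi(\vec{k}) = e^{i\vec{\eta}\cdot\vec{k}}\varphi(\vec{k}), \tag{2.26}$$
$$e^{i\vec{X}\cdot\vec{\delta}}\varphi(\vec{k}) = \varphi(\vec{k} + m\vec{\delta}). \tag{2.27}$$
Thus it follows that any operator which commutes with all the operators
$e^{i\vec{X}\cdot\vec{\delta}}$, $e^{i\vec{K}\cdot\vec{\eta}}$ must have the form $1 \times A$. In particular, we may write (2.23) in the
form
$$H = \frac{1}{2m}\vec{K}^2 \times 1 + 1 \times H_i. \tag{2.28}$$
In this way we obtain
$$\vec{K}^2\varphi(\vec{k}) = \vec{k}^2\varphi(\vec{k}) \tag{2.29}$$
and
$$e^{i(1/2m)\vec{K}^2\gamma}\varphi(\vec{k}) = e^{(i/2m)\vec{k}^2\gamma}\varphi(\vec{k}). \tag{2.30}$$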
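A small Python sketch can make the realization (2.26), (2.27) concrete. It models the two operator families as maps on functions of a single real variable k (a one-dimensional simplification assumed here for brevity) and compares $e^{iX\delta} e^{iK\eta}$ with $e^{im\delta\eta} e^{iK\eta} e^{iX\delta}$, the exchange rule that follows from (2.26) and (2.27). The mass value `M`, the sample function `phi` and the helper names are hypothetical.

```python
import cmath

M = 2.0  # hypothetical value of the mass parameter m (any nonzero real)

def exp_iK(eta):
    """Operator e^{iK eta}: multiplication by e^{i eta k}, cf. (2.26)."""
    def op(phi):
        return lambda k: cmath.exp(1j * eta * k) * phi(k)
    return op

def exp_iX(delta):
    """Operator e^{iX delta}: the shift k -> k + m delta, cf. (2.27)."""
    def op(phi):
        return lambda k: phi(k + M * delta)
    return op

def phi(k):
    """Sample wave function; any function of k would serve."""
    return cmath.exp(-k * k) * (1.0 + 0.5j * k)

def weyl_defect(delta, eta, k):
    """|e^{iX d} e^{iK e} phi - e^{i m d e} e^{iK e} e^{iX d} phi| at the point k."""
    lhs = exp_iX(delta)(exp_iK(eta)(phi))(k)
    rhs = cmath.exp(1j * M * delta * eta) * exp_iK(eta)(exp_iX(delta)(phi))(k)
    return abs(lhs - rhs)

def plain_commutation_defect(delta, eta, k):
    """Same comparison without the phase factor; nonzero in general for m != 0."""
    lhs = exp_iX(delta)(exp_iK(eta)(phi))(k)
    rhs = exp_iK(eta)(exp_iX(delta)(phi))(k)
    return abs(lhs - rhs)
```

The first defect vanishes for all arguments, whereas the second does not, which is the finite form of the statement that the $K_\nu$ and $X_\nu$ fail to commute for $m \neq 0$.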
The proof of these relationships is given in [10]; certain aspects of the
proof can be found in IX.
On the basis of (2.26) and (2.27) it follows that the space $\mathcal{H}_b$ is irreducible
relative to the transformations $e^{i\vec{K}\cdot\vec{\eta}}$ and $e^{i\vec{X}\cdot\vec{\delta}}$.
We have not yet given an explicit description of the representation of the
subgroup $\mathcal{D}_3$ of spatial rotations. In the previously given relations such as
(2.10), (2.12), and (2.21) the choice of factors in $U(A, 0, 0, 0)$ was arbitrary.
Since $\mathcal{D}_3$ is compact, each irreducible representation of $\mathcal{D}_3$ is (up to a factor)
finite-dimensional (see [10] and AV, §10), and each representation (up to a
factor) in $\mathcal{H}$ can be reduced to the form
$$\mathcal{H} = \sum_n \oplus\, \mathcal{H}_n,$$
where the $\mathcal{H}_n$ are invariant with respect to the representation of $\mathcal{D}_3$ and are
irreducible subspaces of $\mathcal{H}$. Since $\mathcal{D}_3$ is compact each representation is, up to
a factor, equivalent to a normal unitary representation of the covering group
$\tilde{\mathcal{D}}_3$ of $\mathcal{D}_3$ (see [10] and AV, §10.7).
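For $A \in \mathcal{D}_3$ we may define an operator $V(A)$ in $\mathcal{H}_b$ by
$$V(A)\varphi(\vec{k}) = \varphi(A^{-1}\vec{k}). \tag{2.31}$$
It is easy to see that $V(A)$ is unitary, and that $V(A)$ defines a representation of
$\mathcal{D}_3$. We then obtain
$$V(A)\, \vec{K}\, V(A)^{-1} = A^{-1}\vec{K} \tag{2.32}$$
and, from (2.27) we obtain
$$V(A)\, \vec{X}\, V(A)^{-1} = A^{-1}\vec{X}. \tag{2.33}$$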
From (2.10) and (2.12) it follows that
$$R(A) = V(A)^{-1}\, U(A, 0, 0, 0) \tag{2.34}$$
commutes with all $e^{i\vec{X}\cdot\vec{\delta}}$ and $e^{i\vec{K}\cdot\vec{\eta}}$, that is, is of the form $R(A)$: $1 \times R(A)$, and
that we may write
$$U(A, 0, 0, 0) = V(A) \times R(A). \tag{2.35}$$
Since the $U(A, 0, 0, 0)$ form a representation of $\mathcal{D}_3$ up to a factor, this must
also be true of $R(A)$. Then, from (2.21) and (2.28) it follows that (since $V(A)$
commutes with $\vec{K}^2$):
$$H_i R(A) = R(A) H_i, \tag{2.36}$$
that is, $H_i$ commutes with all $R(A)$.
Therefore we may obtain an irreducible representation of the Galileo
group (for $m \neq 0$) only if the representation of $\mathcal{D}_3$ by means of $R(A)$, that is,
in $\mathcal{H}_i$, is irreducible. Thus it follows that $\mathcal{H}_i$ is finite dimensional and (from
(2.36)) we find that we must have
$$H_i = \lambda 1, \tag{2.37}$$
that is, $H$ is, according to (2.28), uniquely determined up to an additive
constant which, by choosing a suitable factor, can be set equal to 0.
The irreducible representation space $\mathcal{H}_i$ is called the “spin space” of the
elementary system; for elementary systems we shall use the notation $\mathfrak{r}_s$
instead of $\mathcal{H}_i$, that is, where we use the symbol $\mathfrak{r}_s$ we shall be considering the
spin space of an elementary system.
In §3 we shall consider the representations of $\mathcal{D}_3$ in $\mathcal{H}_b$ by means of $V(A)$
and in $\mathfrak{r}_s$ by means of $R(A)$ separately, because these representations play a
central role in the applications of quantum mechanics. We will see that each
irreducible representation is uniquely characterized by a number
$s = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$ and that this number will be used as an index for $\mathfrak{r}_s$.
Thus we find that the Galileo group as a group of transformations of
registration procedures relative to preparation procedures leads, without any
additional assumptions other than those introduced in §1 (!), to the following
structure which is of central importance in quantum mechanics:
For the case $m \neq 0$, to each type of elementary system there correspond two
parameters $m$ and $s$ and the corresponding irreducible representation of the
Galileo group. The necessarily infinite-dimensional Hilbert space $\mathcal{H}$ of such
a type of elementary systems can be written in the form $\mathcal{H}_b \times \mathfrak{r}_s$ where $\mathcal{H}_b =
\mathcal{L}^2(R^3, dk_1\, dk_2\, dk_3)$ and the rules for the transformation of the Galileo
group are given by (2.26), (2.27), (2.35), and (2.31) and by
$$e^{iH\gamma} = e^{i(1/2m)\vec{K}^2\gamma} \times 1. \tag{2.38}$$
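The interplay of the time translation with the proper Galileo transformations can also be checked numerically. The sketch below again uses a one-dimensional variable k (an assumption made for brevity) and verifies the relation $e^{iH\gamma} e^{iX\delta} e^{-iH\gamma} = \rho(\gamma, \delta)\, e^{iX\delta} e^{-iK\delta\gamma}$ with $H = K^2/2m$. In this particular realization the factor comes out as $\rho(\gamma, \delta) = e^{im\delta^2\gamma/2}$; that closed form is computed for this sketch and is not quoted from the text. `M`, `phi` and the helper names are hypothetical.

```python
import cmath

M = 1.5  # hypothetical mass parameter m != 0

def exp_iK(eta):
    """e^{iK eta}: multiplication by e^{i eta k}."""
    def op(phi):
        return lambda k: cmath.exp(1j * eta * k) * phi(k)
    return op

def exp_iX(delta):
    """e^{iX delta}: the shift k -> k + m delta."""
    def op(phi):
        return lambda k: phi(k + M * delta)
    return op

def exp_iH(gamma):
    """e^{iH gamma} with H = K^2 / 2m: multiplication by e^{i k^2 gamma / 2m}."""
    def op(phi):
        return lambda k: cmath.exp(1j * k * k * gamma / (2.0 * M)) * phi(k)
    return op

def rho(gamma, delta):
    """Phase factor found for this realization (computed here, an assumption)."""
    return cmath.exp(1j * M * delta * delta * gamma / 2.0)

def phi(k):
    """Sample wave function."""
    return cmath.exp(-k * k) * (1.0 - 0.25j * k)

def boost_time_defect(gamma, delta, k):
    """|e^{iHg} e^{iXd} e^{-iHg} phi - rho e^{iXd} e^{-iK d g} phi| at the point k."""
    lhs = exp_iH(gamma)(exp_iX(delta)(exp_iH(-gamma)(phi)))(k)
    rhs = rho(gamma, delta) * exp_iX(delta)(exp_iK(-delta * gamma)(phi))(k)
    return abs(lhs - rhs)
```

The factor satisfies $\rho(0, \delta) = 1$, and its mixed second derivative with respect to $\gamma$ and $\delta$ vanishes at the origin, consistent with the conditions used in the derivation of (2.23).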
In §4 we will see that (2.16) is equivalent to the Heisenberg uncertainty
relations.
The typical quantum mechanical structure obtained from the representation
of the Galileo group by means of $\mathcal{B}$-continuous effect automorphisms
is not a consequence (!) of the introduction (§1) of the structure of the
Galileo transformations as transformations of registration procedures.
Everything introduced in §1 also holds for classical systems. The distinction
between classical systems and microsystems lies exclusively in axiom AV 4s
in III, §3. If, on the contrary, we make the assumption that the systems under
consideration are physical objects (see the remarks in III, §3 following
AV 4s), then it follows for elementary systems that they are “mass points
which move with constant velocity in a straight line between the preparation
and registration apparatus” (for proof see [21]).
The fact that every elementary quantum mechanical system type uniquely
determines two parameters m and s does not, of course, mean that every
elementary system type is uniquely characterized by these two parameters. It
is possible to give other “objective” properties in addition to m and s for
elementary systems (for objective properties, see IV, §8.1). The fact that m
and s are objective properties follows directly from the definition that they
are parameters which correspond to atoms in the center Z of G.
D 2.2. The parameter m is called the mass of the elementary system type;
the parameter s is called the spin of the elementary system type.
We shall later compare (see VIII, §6 and XVII, §6.2) the concept of “mass”
described above with the “usual” concept. The meaning of the parameter s
will be explained in §3 and §5.
The fact that in this formulation of quantum mechanics there is no
“constant” $\hbar = h/2\pi$ (where $h$ is Planck’s constant) is not a defect. On the
contrary, it merely expresses the fact that this formulation considers only the
essential structure of quantum mechanics. This structure shows that the
quantum mechanical laws are not invariant under transformations $m \to \lambda m$
where $\lambda > 0$. Because all previous classical theories exhibit this invariance, it
seems advantageous to introduce a particular unit of mass in classical
physics. For quantum mechanics it appears to be somewhat “artificial” to
introduce a special unit of mass; if this were done, then it would be necessary
to introduce a factor between the “natural” unit in quantum mechanics and
the “artificial” unit in classical mechanics. The natural unit is $(\mathrm{cm})^{-1}$ if the
velocity of light is taken to be 1, that is, if the time is also measured in cm.
Then we would have
$$1\ (\mathrm{cm})^{-1} \mathrel{\hat{=}} \hbar\ (\mathrm{gram}), \tag{2.39}$$
where gram is the usual mass unit. The conversion factor in (2.39) can be
found if we measure, for example, the mass of a hydrogen atom in $(\mathrm{cm})^{-1}$ and
determine what the mass of a $(\mathrm{cm})^3$ of water is compared to that of a
hydrogen atom (see XVII, §6.2).
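To give a feeling for the size of the conversion factor in (2.39), the following Python sketch converts between grams and $(\mathrm{cm})^{-1}$ with the velocity of light set to 1. The numerical constants are standard CGS values inserted here as assumptions (they do not appear in the text), and the function names are hypothetical.

```python
# Assumed constants in CGS units (not taken from the text).
HBAR_ERG_S = 1.054571817e-27   # reduced Planck constant in erg * s
C_CM_PER_S = 2.99792458e10     # velocity of light in cm / s

# With c = 1 the conversion factor of (2.39) is hbar / c, in gram * cm.
HBAR_OVER_C_G_CM = HBAR_ERG_S / C_CM_PER_S

def grams_to_inverse_cm(mass_in_grams):
    """Mass parameter m in (cm)^-1 corresponding to a mass given in grams."""
    return mass_in_grams / HBAR_OVER_C_G_CM

def inverse_cm_to_grams(m_in_inverse_cm):
    """Mass in grams corresponding to a mass parameter m given in (cm)^-1."""
    return m_in_inverse_cm * HBAR_OVER_C_G_CM

# Approximate mass of a hydrogen atom in grams (assumed value).
HYDROGEN_ATOM_G = 1.6735e-24

hydrogen_in_inverse_cm = grams_to_inverse_cm(HYDROGEN_ATOM_G)
```

With these values the mass parameter of a hydrogen atom is of the order of $10^{13}$ to $10^{14}$ $(\mathrm{cm})^{-1}$, which shows how far the “natural” unit lies from everyday mass units.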
We have not yet discussed the case $m = 0$ in (2.16). In [10] the remaining
possible irreducible representations of the Galileo group are given. Here we
briefly mention these representations and make it experimentally
evident that such systems are not found in nature; this evidence will be
formulated in terms of the following axiom: $m \neq 0$. (Light quanta
cannot be described in terms of the Galileo group; here the representations of
the Poincare group must be used; see the remarks in §1 and at the end of
§2.)
For $m = 0$ it follows from (2.16) that all $K_\nu$ and $X_\mu$ commute. With respect
to the above derivation not much is changed; it is only necessary to set
$m = 0$. This matters only in the transition between (2.22) and (2.23), that is, in
the determination of $H$ from the equation (2.22):
$$\vec{X} H - H \vec{X} = -i\vec{K}. \tag{2.40}$$
Since $\vec{X}$ and $H$ commute with $\vec{K}$ we may treat $\vec{K}$ in (2.40) like a number.
From (2.40) by analogy with (2.17) and (2.16) it follows that
$$e^{i\vec{X}\cdot\vec{\delta}}\, e^{iH\gamma}\, e^{-i\vec{X}\cdot\vec{\delta}} = e^{i(H\gamma + \vec{K}\cdot\vec{\delta}\,\gamma)}. \tag{2.41}$$
From (2.41) it follows that we may take (for an irreducible representation) the
space of square integrable functions $\varphi(k_0, \vec{k})$ with fixed $|\vec{k}| = r$ with
integration measure $dk_0\, r\, d\omega$ (where $d\omega$ is an element of solid angle or a
surface element of the unit sphere) for $\mathcal{H}_b$ in (2.24). From (2.10) it follows that
all directions are needed for $\vec{k}$. The equations for the representation
operators are given by
$$e^{i\vec{K}\cdot\vec{\eta}}\varphi(k_0, \vec{k}) = e^{i\vec{\eta}\cdot\vec{k}}\varphi(k_0, \vec{k}),$$
$$e^{iH\gamma}\varphi(k_0, \vec{k}) = e^{ik_0\gamma}\varphi(k_0, \vec{k}),$$
$$e^{i\vec{X}\cdot\vec{\delta}}\varphi(k_0, \vec{k}) = \varphi(k_0 + \vec{k}\cdot\vec{\delta}, \vec{k}).$$
For $r = 0$, $\vec{X}$ commutes with $H$. For such an irreducible representation
$e^{iH\gamma}$ reduces to a multiple of the unit operator.
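The representation operators for m = 0 can be checked against (2.41) in the same spirit as before. In the sketch below k is treated as a single real number rather than a vector of fixed length r (a simplification assumed for brevity), and the operators act on functions of the pair (k0, k). The sample function `phi` and the helper names are hypothetical.

```python
import cmath

def exp_iK(eta):
    """e^{iK eta}: multiplication by e^{i eta k}."""
    def op(phi):
        return lambda k0, k: cmath.exp(1j * eta * k) * phi(k0, k)
    return op

def exp_iH(gamma):
    """e^{iH gamma}: multiplication by e^{i k0 gamma}."""
    def op(phi):
        return lambda k0, k: cmath.exp(1j * k0 * gamma) * phi(k0, k)
    return op

def exp_iX(delta):
    """e^{iX delta}: the shift k0 -> k0 + k delta at fixed k."""
    def op(phi):
        return lambda k0, k: phi(k0 + k * delta, k)
    return op

def phi(k0, k):
    """Sample function of (k0, k)."""
    return cmath.exp(-k0 * k0) * (1.0 + 0.3j * k0 * k)

def defect_2_41(delta, gamma, k0, k):
    """|e^{iXd} e^{iHg} e^{-iXd} phi - e^{i(H + K d) g} phi| at the point (k0, k)."""
    lhs = exp_iX(delta)(exp_iH(gamma)(exp_iX(-delta)(phi)))(k0, k)
    rhs = cmath.exp(1j * (k0 + k * delta) * gamma) * phi(k0, k)
    return abs(lhs - rhs)

def k_x_commutation_defect(delta, eta, k0, k):
    """|e^{iXd} e^{iKe} phi - e^{iKe} e^{iXd} phi|; zero because m = 0."""
    lhs = exp_iX(delta)(exp_iK(eta)(phi))(k0, k)
    rhs = exp_iK(eta)(exp_iX(delta)(phi))(k0, k)
    return abs(lhs - rhs)
```

Setting k = 0 in this sketch makes the shift trivial, so that the boost commutes with the time translation, which mirrors the remark about the case r = 0.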
The following experimental evidence shows that there are no physical
systems which correspond to the representation for which $m = 0$.
For $r = 0$ it follows that all effects are invariant under time displacements
(since $e^{iH\gamma}$ commutes with all $F \in L$). With the exception of a “vacuum” there
is no physical system known (see below) for which such a “time invariance” is
valid.
For $r \neq 0$ we have $|\vec{k}| = r$. Suppose we produce an ensemble $W$ for which
$\mathrm{tr}(W\, U(1, 0, \vec{\eta}, 0)\, F\, U(1, 0, \vec{\eta}, 0)^{-1})$ changes slowly with $\eta_1, \eta_2$ for all $F$. Then
it follows that this expression will also vary weakly with $\eta_3$. This contradicts
experience because experience has shown that it is possible to make
ensembles which depend weakly on displacements $\eta_1, \eta_2$ but strongly on
displacements $\eta_3$ in the third direction (in all scattering problems we seek to
produce ensembles which weakly depend on $\eta_1, \eta_2$ in order that a “beam” of
systems can be directed in the third direction; see XVI, §6.3).
Often the following argument is also introduced: For m = 0 there exists no
decision observable for position in the sense discussed in §4. This objection is
more than questionable because it is practically impossible to prove
experimentally the existence of the decision observable for position (constructed in
§4) because the latter is an idealization which is only obtained approximately
(with difficulty) in terms of real constructible registration methods $b_0$ (in the
sense of IV, §4). The nonexistence of a decision observable for position will
therefore not immediately contradict all known experiments because it is
conceivable that there yet exists a position observable (in the general sense,
that is, a measure $\Sigma \xrightarrow{F} L$ with $\Sigma$ as the ring of the “position domain” in the
sense of §4) which describes what is measured (see [22]), where the case of a
light quantum is a nice example. The assertion of the existence of a decision
observable for position on arbitrary a priori grounds will absolutely
contradict the concepts of physics as carried out here for the example of
quantum mechanics and as described in general in [1]. Such a priori
principles are not admissible in the development of an axiomatic basis for
quantum mechanics. This, of course, does not mean that intuitive concepts
such as the concept of “position” cannot be used in order to guess (or
“discover”) a $\mathcal{PT}$.
There exists a trivial irreducible representation of the Galileo group: the
identity in a one-dimensional Hilbert space. The only other additional
elementary system types are those whose corresponding Hilbert spaces are
one-dimensional. There are no objective properties (IV, §8.1) which are
experimentally known which permit us to distinguish such elementary
system types whose corresponding Hilbert spaces are one-dimensional. We
therefore impose the axiom that there exists only a single one-dimensional
Hilbert space $\mathcal{H}_\nu$. For this one-dimensional Hilbert space we shall use the
index 0: $\mathcal{H}_0$. The corresponding system type will be called the “vacuum.” The
corresponding objective property (as a subset of $M$) will be denoted by $M_0$.
According to IV, §8.1 we therefore obtain
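$$M_0 = \Bigl[\bigcup_{\substack{a \in \mathcal{Q}' \\ \varphi(a) \in K_1(e_0)}} a\Bigr] \cup \Bigl[\bigcup_{\substack{b \in \mathcal{R} \\ \psi(b_0, b) \le e_0}} b\Bigr]$$
where $e_0$ is the atom of the center $Z$ of $G$ which projects onto $\mathcal{H}_0$.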
The language used by the physicist for the situation $x \in M_0$ is that “no”
microsystem is “present,” and the set $M \setminus M_0$ is often called the set of “proper”
microsystems. For the formulation presented in II it is conceptually
important not to exclude the possibility that $x \in M_0$ because it is conceptually
impossible to describe the set $M \setminus M_0$ before the introduction of the concept of
a “system type.” Therefore the set $M$ in II is only an aid to mediate between
the preparation and registration. Conceptually this would be clearer if we
constructed quantum mechanics without the aid of a set $M$ of microsystems
and instead used only mathematical structures which describe the preparation
and registration apparatuses and the connection between them (see [3],
[2], XVI, and [13]).
The discussions in §1 and §2 may also be carried out for the case of the
Poincaré group. The same is true for parts of the discussion in §4 to §7,
but not for the considerations in VIII. The experiments of “elementary
particle” physics lead us to suspect that there are actually no “elementary”
systems in the sense of D 2.1 (with respect to the Poincaré group). The
concept of an elementary particle (as introduced in §2) appears to have a
meaningful application only in the realm of nonrelativistic physics of
microsystems.
3 Irreducible Representations of the Rotation Group
Since the unitary representations (up to a factor) of the rotation group are
equivalent to the unitary representations of its covering group (see [10]
and AV, §10.7) we may restrict our consideration to the unitary
representations of the covering group. Since the covering group is compact, all
irreducible representations are finite dimensional (see [10] and AV, §10.6).
Let $\mathcal{H}$ be a finite-dimensional Hilbert space and $U(A)$ a unitary
representation of the covering group in $\mathcal{H}$. For a rotation $A$
of angle $\alpha$ about the 3-axis the equation $U(A) = U(\alpha)$ defines the
representation of a one-parameter group satisfying $U(\alpha_1 + \alpha_2) = U(\alpha_1)U(\alpha_2)$ for
which an infinitesimal rotation $J_3$ is defined by
$$U(\alpha) = e^{\alpha J_3}.$$
Since $\mathcal{H}$ is finite dimensional, $J_3$ is defined in all of $\mathcal{H}$. Since $U$ is unitary, $iJ_3$
is a self-adjoint operator. In the same way infinitesimal rotations $J_1, J_2$ are
defined for the axes 1 and 2. We therefore set
$$L_\nu = iJ_\nu \quad (\nu = 1, 2, 3). \tag{3.1}$$
Thus the $L_\nu$ are self-adjoint operators.
From the representation property of the $U(A)$ it follows that (see
AV, §10.5) the $J_\nu$ in $\mathcal{H}$ satisfy the same commutation relations as the
corresponding infinitesimal rotations in the rotation group itself, that is,
$$J_\nu J_\mu - J_\mu J_\nu = J_\rho \quad\text{where } (\nu, \mu, \rho) = (1, 2, 3),\ (2, 3, 1),\ \text{or } (3, 1, 2). \tag{3.2}$$
For the $L_\nu$ it follows that
$$L_\nu L_\mu - L_\mu L_\nu = iL_\rho \quad\text{where } (\nu, \mu, \rho) = (1, 2, 3),\ (2, 3, 1),\ \text{or } (3, 1, 2). \tag{3.3}$$
Since $\mathcal{H}$ is finite dimensional, the $L_\nu$ have (as is the case for all self-adjoint
operators in $\mathcal{H}$) a discrete spectrum of eigenvalues and a complete
orthonormal basis of eigenvectors. For that reason it is easy to carry out the following
computations.
We replace $L_1$, $L_2$ by means of the operators
$$N = L_1 + iL_2 \quad\text{and}\quad N^+ = L_1 - iL_2 \tag{3.4}$$
and we define
$$L^2 = L_1^2 + L_2^2 + L_3^2. \tag{3.5}$$
It follows that
$$L_3N - NL_3 = N, \qquad L_3N^+ - N^+L_3 = -N^+ \tag{3.6}$$
and
$$NN^+ = L^2 + L_3 - L_3^2, \qquad N^+N = L^2 - L_3 - L_3^2. \tag{3.7}$$
If $v$ is an arbitrary eigenvector of $L_3$ in $\mathcal{H}$ which satisfies
$$L_3v = \mu v \tag{3.8}$$
then, from (3.6) it follows that:
$$L_3Nv = (NL_3 + N)v = (\mu + 1)Nv \tag{3.9}$$
and
$$L_3N^+v = (N^+L_3 - N^+)v = (\mu - 1)N^+v. \tag{3.10}$$
If $Nv \neq 0$ then $Nv$ is an eigenvector of $L_3$ with eigenvalue $(\mu + 1)$; if
$N^+v \neq 0$, then $N^+v$ is an eigenvector of $L_3$ with eigenvalue $(\mu - 1)$. If we apply
$N$ repeatedly we obtain increasing eigenvalues of $L_3$ provided that we do not
obtain the null vector. Since $\mathcal{H}$ is finite dimensional there exists an integer $n$
such that
$$N^nv \neq 0, \qquad L_3N^nv = (\mu + n)N^nv \qquad\text{and}\qquad N^{n+1}v = 0.$$
Let $N^nv$ be denoted by $u_j$ ($j = \mu + n$); we therefore obtain
$$L_3u_j = ju_j \quad\text{and}\quad Nu_j = 0. \tag{3.11}$$
From (3.11) it follows that $N^+Nu_j = 0$ and from (3.7) and (3.11) we obtain
$$L^2u_j = j(j + 1)u_j. \tag{3.12}$$
Since we may assume that $u_j$ is normalized, we may recursively define:
$$N^+u_m = \tau_m u_{m-1}, \qquad m = j, j - 1, \ldots \tag{3.13}$$
where $\tau_m$ is so chosen that $\|u_m\| = 1$ for all $m$. Since we have required only
that $\|u_m\| = 1$, we (arbitrarily) choose $\tau_m$ to be real and $> 0$. The sequence of
the $u_m$ terminates as soon as $N^+u_{m'} = 0$ for some value $m'$. From (3.10) we find
for all $u_m$ defined according to (3.13) that
$$L_3u_m = mu_m. \tag{3.14}$$
From the commutation relations (3.3), and intuitively, from the fact that $L^2$ is
the square of the magnitude of a vector, it can easily be seen that $L^2$
commutes with the rotations and, therefore, with all $J_\nu$. Therefore $L^2$
commutes also with $N$ and $N^+$. From (3.12) and (3.13) it follows that
$$L^2u_m = j(j + 1)u_m \tag{3.15}$$
for all $m$.
Since $\mathcal{H}$ is finite dimensional, for a particular value $m'$ (for which
$u_{m'} \neq 0$) the relationship $N^+u_{m'} = 0$ must hold. From (3.7) and
(3.15) it follows that $m'(m' - 1) = j(j + 1)$. Since, according to the definition
of the $u_m$, the relation $m' \leq j$ holds, we therefore obtain $m' = -j$.
Conversely, from
$$\|N^+u_{-j}\|^2 = \langle u_{-j}, NN^+u_{-j}\rangle = \langle u_{-j}, (L^2 + L_3 - L_3^2)u_{-j}\rangle = 0$$
it follows that $N^+u_{-j} = 0$.
The sequence of the $u_m$ runs as follows: $m = j, j - 1, \ldots, -j$. Since the
number $(2j + 1)$ must be an integer, we find that $j$ may only take on integer or
half-integer values: $0, \tfrac{1}{2}, 1, \tfrac{3}{2}, 2, \ldots$
We obtain the normalization condition from (3.13) as follows:
$$\tau_m^2 = \|N^+u_m\|^2 = \langle u_m, NN^+u_m\rangle$$
and, from (3.7), (3.14), and (3.15) (since we have chosen $\tau_m$ to be positive real)
we obtain:
$$\tau_m = \sqrt{j(j + 1) + m - m^2} = \sqrt{(j + m)(j - m + 1)}. \tag{3.16}$$
Equation (3.13) then becomes
$$N^+u_m = \sqrt{(j + m)(j - m + 1)}\,u_{m-1}. \tag{3.17}$$
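The construction of $L_3$ and $N^+$ in (3.14) and (3.17) can be checked numerically. The following is a minimal sketch, not part of the original text: it builds the matrices in the basis $u_j, u_{j-1}, \ldots, u_{-j}$, takes $N$ to be the adjoint of $N^+$, and tests the relations (3.6), (3.7) and (3.15). The function names and the parameter `two_j` (the integer $2j$) are illustrative assumptions.

```python
from math import sqrt, isclose


def ladder_matrices(two_j):
    """Matrices of L3, N, N+ in the basis u_j, u_{j-1}, ..., u_{-j}.

    `two_j` (the integer 2j) is a hypothetical parameter name. The entry
    M[row][col] is the component of M u_col along u_row.
    """
    dim = two_j + 1
    j = two_j / 2
    L3 = [[0.0] * dim for _ in range(dim)]
    N = [[0.0] * dim for _ in range(dim)]
    Np = [[0.0] * dim for _ in range(dim)]
    for k in range(dim):
        m = j - k
        L3[k][k] = m                              # (3.14)
        if k + 1 < dim:
            tau = sqrt((j + m) * (j - m + 1))     # (3.16)
            Np[k + 1][k] = tau                    # (3.17): N+ u_m = tau_m u_{m-1}
            N[k][k + 1] = tau                     # N is the adjoint of N+
    return L3, N, Np


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][c] for k in range(n)) for c in range(n)] for i in range(n)]


def combine(ca, a, cb, b):
    """Return the matrix ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][c] + cb * b[i][c] for c in range(n)] for i in range(n)]


def close(a, b, tol=1e-9):
    n = len(a)
    return all(isclose(a[i][c], b[i][c], abs_tol=tol) for i in range(n) for c in range(n))


def relations_hold(two_j):
    L3, N, Np = ladder_matrices(two_j)
    j = two_j / 2
    dim = two_j + 1
    NNp, NpN, L3sq = matmul(N, Np), matmul(Np, N), matmul(L3, L3)
    # By (3.4), L1^2 + L2^2 = (N N+ + N+ N) / 2, so L^2 needs no complex numbers.
    Lsq = combine(1.0, combine(0.5, NNp, 0.5, NpN), 1.0, L3sq)
    casimir = [[j * (j + 1) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    ok_36 = close(combine(1.0, matmul(L3, N), -1.0, matmul(N, L3)), N)
    ok_37 = close(NNp, combine(1.0, combine(1.0, Lsq, 1.0, L3), -1.0, L3sq))
    ok_315 = close(Lsq, casimir)
    return ok_36 and ok_37 and ok_315
```

Calling `relations_hold(two_j)` for `two_j = 0, 1, 2, ...` covers $j = 0, \tfrac{1}{2}, 1, \ldots$; real matrices suffice because the $\tau_m$ were chosen real.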
Since $N$ is the adjoint operator to $N^+$ and since the $u_m$ are orthonormal it
follows that
$$Nu_m = \sqrt{(j - m)(j + m + 1)}\,u_{m+1}. \tag{3.18}$$
The subspace spanned by the $u_m$ is therefore invariant under $N$, $N^+$, $L_3$
and therefore under the $J_\nu$. Since the set of operators given by
$$U(A) = e^{\sum_\nu \alpha_\nu J_\nu} \tag{3.19}$$
contains the representative operators for all rotations $A$ (see AV, §10.4) the
subspace spanned by the $u_m$ is invariant under all the operators $U(A)$. Since the
representation in $\mathcal{H}$ is irreducible, the $u_m$ span all of $\mathcal{H}$.
If, conversely, we abstractly construct a space $\mathcal{H}_j$ by specifying the $(2j + 1)$
vectors $u_m$ ($m = -j, -j + 1, \ldots, j$) as a complete orthonormal basis in $\mathcal{H}_j$
and define the operators $L_3$, $N$, $N^+$ by the equations (3.14), (3.17), and (3.18)
and define the operators $L_1$, $L_2$ by (3.4) then, for the $L_\nu$ it follows that the
commutation relations (3.3) hold. Then, for the $J_\nu$ defined by (3.1) we find
that the commutation rules (3.2) hold. Each $A$ in the covering group can be described by a
rotation axis and a rotation angle, that is, by three parameters $\alpha_1, \alpha_2, \alpha_3$ (see
AV, §10.7). Since the $J_\nu$ satisfy the commutation relations (3.2) it follows that
the $U(A)$ form a representation of the covering group. That this
representation is irreducible follows from the construction because to every
invariant subspace there exists an eigenvector of $L_3$ which must coincide with
one of the $u_m$, from which we find that $N$ and $N^+$ “generate” the entire space.
In the sequel we shall denote the above irreducible representation in $\mathcal{H}_j$ by
$\mathcal{D}_j$ (in particular, for the simplification of the discussion in XI-XVI). $\mathcal{D}_j$ also
characterizes a class of equivalent representations,
independent of the vector space in which it is realized.
The representations $\mathcal{D}_j$ may be obtained in a purely algebraic fashion,
without the use of the theorems of Lie groups. The latter approach is often of
great practical value. For this reason we shall explicitly derive the
representation $\mathcal{D}_{1/2}$ of the covering group in $\mathcal{H}_{1/2}$.
We shall denote the basis vectors $u_{+1/2}$, $u_{-1/2}$ of $\mathcal{H}_{1/2}$ by $u_+$ and $u_-$. In
$\mathcal{H}_{1/2}$ we find that, according to (3.14), (3.17), and (3.18), the infinitesimal
rotations satisfy the equations:
$$J_1u_+ = -\tfrac{i}{2}u_-, \qquad J_1u_- = -\tfrac{i}{2}u_+,$$
$$J_2u_+ = \tfrac{1}{2}u_-, \qquad J_2u_- = -\tfrac{1}{2}u_+, \tag{3.20}$$
$$J_3u_+ = -\tfrac{i}{2}u_+, \qquad J_3u_- = \tfrac{i}{2}u_-.$$
Thus from (3.2) it follows that the following equations hold for $\sigma_\nu = 2iJ_\nu$:
$$\sigma_\nu\sigma_\mu - \sigma_\mu\sigma_\nu = 2i\sigma_\rho \quad\text{where } (\nu, \mu, \rho) = (1, 2, 3),\ (2, 3, 1),\ \text{or } (3, 1, 2). \tag{3.21}$$
The $\sigma_\nu$ are self-adjoint and satisfy the equations:
$$\sigma_\nu\sigma_\mu + \sigma_\mu\sigma_\nu = 0 \ \text{ if } \nu \neq \mu \quad\text{and}\quad \sigma_\nu^2 = \mathbf{1}. \tag{3.22}$$
If we again use the parameters $\alpha_1, \alpha_2, \alpha_3$ for the rotation $A$ where
$\alpha = \sqrt{\alpha_1^2 + \alpha_2^2 + \alpha_3^2}$ represents the rotation angle and $w_\nu = \alpha_\nu/\alpha$ the
components of the rotation axis, then from (3.19) it follows that
$$U(A) = e^{-i(\alpha/2)\sum_\nu w_\nu\sigma_\nu}. \tag{3.23}$$
From (3.22) it follows that
$$\Bigl(\sum_\nu w_\nu\sigma_\nu\Bigr)^2 = \mathbf{1} \tag{3.24}$$
and we therefore obtain
$$U(A) = e^{-i(\alpha/2)\sum_\nu w_\nu\sigma_\nu} = \mathbf{1}\cos\frac{\alpha}{2} - i\sum_\nu w_\nu\sigma_\nu\,\sin\frac{\alpha}{2}. \tag{3.25}$$
The nontrivial element of the fundamental group corresponds to a
continuous variation of the angle $\alpha$ from 0 to $2\pi$ (see AV, §10.7). From (3.25)
it follows that for this element
$$U(A) = -\mathbf{1}.$$
The operator $U(A) = \mathbf{1}$ corresponds to $\cos(\alpha/2) = 1$, that is, $\alpha = 4\pi n$. All
these values of $\alpha$ correspond to the unit element of the covering group.
Therefore the representation in $\mathcal{H}_{1/2}$ is an isomorphic representation of the covering group.
We shall now use algebraic methods to obtain the above result.
First we shall show that the $U(A)$ contain all unitary operators in $\mathcal{H}_{1/2}$
which have determinant 1. From (3.25) and (3.20) it immediately follows that
the matrix of $U(A)$ is given by:
$$\begin{pmatrix} \cos(\alpha/2) - iw_3\sin(\alpha/2) & -(iw_1 + w_2)\sin(\alpha/2) \\ (-iw_1 + w_2)\sin(\alpha/2) & \cos(\alpha/2) + iw_3\sin(\alpha/2) \end{pmatrix}. \tag{3.26}$$
This is a unitary matrix with determinant 1.
Conversely, let
$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \tag{3.27}$$
be a unitary matrix having determinant 1. Then we must also have:
$$|a_{11}|^2 + |a_{12}|^2 = 1, \quad |a_{21}|^2 + |a_{22}|^2 = 1, \quad a_{11}\bar{a}_{21} + a_{12}\bar{a}_{22} = 0, \quad a_{11}a_{22} - a_{12}a_{21} = 1. \tag{3.28}$$
From the third equation of (3.28) it follows that
$$a_{21} = \lambda\bar{a}_{12}, \qquad a_{22} = -\lambda\bar{a}_{11}$$
because, from the first equation, it is not possible that both $a_{11}$ and $a_{12}$ be
zero. From the second equation of (3.28) it follows that
$$|\lambda|^2(|a_{12}|^2 + |a_{11}|^2) = |\lambda|^2 = 1.$$
From the fourth equation of (3.28) we finally obtain
$$-\lambda|a_{11}|^2 - \lambda|a_{12}|^2 = -\lambda = 1.$$
Therefore (3.27) has the form
$$\begin{pmatrix} a_{11} & a_{12} \\ -\bar{a}_{12} & \bar{a}_{11} \end{pmatrix} \quad\text{with}\quad |a_{11}|^2 + |a_{12}|^2 = 1. \tag{3.29}$$
The matrix (3.29) is unitary with determinant 1. If, in (3.29) we then set
$a_{11} = \beta_4 - i\beta_3$, $a_{12} = -(\beta_2 + i\beta_1)$ where the $\beta_\nu$ are real, we obtain
$$\begin{pmatrix} \beta_4 - i\beta_3 & -(\beta_2 + i\beta_1) \\ \beta_2 - i\beta_1 & \beta_4 + i\beta_3 \end{pmatrix} \quad\text{with}\quad \beta_1^2 + \beta_2^2 + \beta_3^2 + \beta_4^2 = 1. \tag{3.30}$$
Otherwise the $\beta_\nu$ may be freely chosen. From the auxiliary condition in (3.30)
we may introduce an angle $\alpha$ and set
$$\beta_4 = \cos(\alpha/2).$$
We introduce the $w_\nu$ by means of the equations
$$\beta_\nu = w_\nu\sin(\alpha/2) \quad\text{for } \nu = 1, 2, 3.$$
The auxiliary condition in (3.30) then reduces to
$$w_1^2 + w_2^2 + w_3^2 = 1.$$
Therefore the matrix in (3.30) takes on the form (3.26).
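The properties claimed for the matrix (3.26) are easy to confirm numerically. The sketch below is an illustration only, with hypothetical function names: it builds (3.26) for a given axis and angle and checks unitarity, the determinant, and the values at $\alpha = 2\pi$ and $\alpha = 4\pi$.

```python
from math import cos, sin, sqrt, pi


def su2_matrix(axis, alpha):
    """Matrix (3.26) for the rotation axis `axis` (normalized here) and angle alpha."""
    norm = sqrt(sum(c * c for c in axis))
    w1, w2, w3 = (c / norm for c in axis)
    c, s = cos(alpha / 2), sin(alpha / 2)
    return [[complex(c, -w3 * s), complex(-w2 * s, -w1 * s)],
            [complex(w2 * s, -w1 * s), complex(c, w3 * s)]]


def determinant(u):
    return u[0][0] * u[1][1] - u[0][1] * u[1][0]


def is_unitary(u, tol=1e-12):
    """Check U U+ = 1 entry by entry."""
    for r in range(2):
        for c in range(2):
            entry = sum(u[r][k] * u[c][k].conjugate() for k in range(2))
            if abs(entry - (1 if r == c else 0)) > tol:
                return False
    return True


def is_multiple_of_identity(u, factor, tol=1e-12):
    """Check U = factor * 1."""
    return all(abs(u[r][c] - (factor if r == c else 0)) <= tol
               for r in range(2) for c in range(2))
```

For any axis, `su2_matrix(axis, 2 * pi)` is $-\mathbf{1}$ and `su2_matrix(axis, 4 * pi)` is $+\mathbf{1}$, up to rounding.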
The group of two-dimensional unitary operators with determinant 1 is
often called SU2. SU2 is isomorphic to the covering group by the correspondence (3.25). We
shall now show that this is the case directly by algebraic methods:
For $\sigma_x = \sum_\nu x_\nu\sigma_\nu$ where the $x_\nu$ are real we find that $\sigma_x$ is self-adjoint and
that $\sigma_x^2 = (\sum_\nu x_\nu^2)\mathbf{1}$. If $\rho$ is a self-adjoint operator in $\mathcal{H}_{1/2}$ and $\mathrm{tr}(\rho) = 0$ we
find that $\rho = \sum_\nu a_\nu\sigma_\nu$ where the $a_\nu$ are real, as follows: Since the four operators
$\sigma_\nu$, $\mathbf{1}$ form a complete linearly independent system of operators in $\mathcal{H}_{1/2}$ every
operator $\rho$ has the form $\rho = \sum_\nu a_\nu\sigma_\nu + a_0\mathbf{1}$; from $\mathrm{tr}(\rho) = 0$ it follows that
$a_0 = 0$. From $\rho = \rho^+$ it follows that $\bar{a}_\nu = a_\nu$. Therefore, for a unitary
operator $U$ in $\mathcal{H}_{1/2}$ it follows that
$$U\sigma_\nu U^+ = \sum_\mu \alpha_{\mu\nu}\sigma_\mu$$
where the $\alpha_{\mu\nu}$ are real. Thus it follows that
$$U\sigma_xU^+ = \sigma_{x'} \quad\text{with}\quad x'_\mu = \sum_\nu \alpha_{\mu\nu}x_\nu.$$
From
$$\Bigl(\sum_\mu x_\mu'^2\Bigr)\mathbf{1} = \sigma_{x'}^2 = U\sigma_x^2U^+ = \Bigl(\sum_\nu x_\nu^2\Bigr)\mathbf{1}$$
it follows that $(\alpha_{\mu\nu})$ is the matrix of a three-dimensional rotation. It is easy to
see that the correspondence $U \to (\alpha_{\mu\nu})$ is a representation of SU2 by means of
three-dimensional rotations. That this correspondence is surjective onto the rotation group
can be easily seen as follows:
For $U$ defined according to equation (3.26), if we set $w_1 = w_2 = 0$ we then
obtain for the matrix $(\alpha_{\mu\nu})$ a rotation about the 3-axis. For $w_2 = w_3 = 0$ we
obtain a rotation about the 1-axis. Any other rotation can be obtained by
multiplications of rotations about the 1-axis and the 3-axis (see a discussion
of Euler angles, for example, [2], VI, §3.1).
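The correspondence between SU2 and three-dimensional rotations can be made concrete with a short computation. The sketch below is an illustration under stated assumptions: it uses the standard Pauli matrices for the $\sigma_\nu$, reads the coefficients of $U\sigma_\nu U^+$ in the basis $\sigma_\mu$ off with the trace formula $\tfrac{1}{2}\mathrm{tr}(\sigma_\mu U\sigma_\nu U^+)$ (which relies on $\mathrm{tr}(\sigma_\mu\sigma_\kappa) = 2\delta_{\mu\kappa}$), and uses hypothetical function names.

```python
from math import cos, sin, sqrt

# Standard Pauli matrices, assumed here as the matrices of sigma_1, sigma_2, sigma_3.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def su2_matrix(axis, alpha):
    """Matrix (3.26) for the rotation axis `axis` (normalized here) and angle alpha."""
    norm = sqrt(sum(c * c for c in axis))
    w1, w2, w3 = (c / norm for c in axis)
    c, s = cos(alpha / 2), sin(alpha / 2)
    return [[complex(c, -w3 * s), complex(-w2 * s, -w1 * s)],
            [complex(w2 * s, -w1 * s), complex(c, w3 * s)]]


def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]


def dagger(a):
    return [[complex(a[c][r]).conjugate() for c in range(2)] for r in range(2)]


def rotation_from_su2(u):
    """Real 3x3 matrix with entries (1/2) tr(sigma_mu U sigma_nu U+), row mu, column nu."""
    ud = dagger(u)
    rot = []
    for mu in range(3):
        row = []
        for nu in range(3):
            m = mul(SIGMA[mu], mul(u, mul(SIGMA[nu], ud)))
            row.append(((m[0][0] + m[1][1]) / 2).real)
        rot.append(row)
    return rot
```

With `axis = (0, 0, 1)` the result is the rotation by $\alpha$ about the 3-axis, and $U$ and $-U$ give the same rotation matrix, which is the two-to-one character of the correspondence.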
Since SU2 is isomorphic to the covering group we may therefore obtain all unitary
representations of the covering group as unitary representations of SU2. The irreducible
unitary representations of SU2 may also be constructed simply using
algebraic methods.
With the help of $u_+$, $u_-$ we may easily define a $(v + 1)$-dimensional vector
space $\mathcal{T}_{v/2}$ which is generated by the basis vectors $u_+^v$, $u_+^{v-1}u_-$,
$u_+^{v-2}u_-^2$, ..., $u_-^v$, the vectors of which are all homogeneous polynomials
of degree $v$ in the unknowns $u_+$, $u_-$. For $U \in$ SU2 there exists a
representation of SU2 by means of linear transformations in $\mathcal{T}_{v/2}$ generated
by $U(u_+^\alpha u_-^\beta) = (Uu_+)^\alpha(Uu_-)^\beta$. It is a simple matter to define an inner product
in $\mathcal{T}_{v/2}$ in such a way that the above representation is unitary. The definition
of this inner product is suggested by the following considerations:
If $a_+u_+ + a_-u_-$ is a vector in $\mathcal{H}_{1/2}$, then under a unitary transformation in
$\mathcal{H}_{1/2}$ the expression $\|a_+u_+ + a_-u_-\|^2 = \bar{a}_+a_+ + \bar{a}_-a_-$ will be invariant;
the same result therefore holds for the expression
$$(\bar{a}_+a_+ + \bar{a}_-a_-)^v = \sum_{r=0}^{v} \frac{v!}{r!\,(v - r)!}\,\bar{a}_+^{\,v-r}\bar{a}_-^{\,r}\,a_+^{v-r}a_-^{r}. \tag{3.31}$$
An arbitrary vector in $\mathcal{T}_{v/2}$ has the form
$$\sum_{r=0}^{v} c_r\,u_+^{v-r}u_-^{r}. \tag{3.32}$$
The coefficients $c_r$ are transformed in the representation of SU2 in the same
way as the coefficients of
$$(a_+u_+ + a_-u_-)^v = \sum_{r=0}^{v} \frac{v!}{r!\,(v - r)!}\,a_+^{v-r}a_-^{r}\,u_+^{v-r}u_-^{r},$$
that is, in the same way as
$$\frac{v!}{r!\,(v - r)!}\,a_+^{v-r}a_-^{r}. \tag{3.33}$$
From (3.31) it follows that the expression
$$\frac{1}{v!}\sum_{r=0}^{v} r!\,(v - r)!\,\bar{c}_rc_r \tag{3.34}$$
is an invariant for all such transformations. We shall now introduce the
following set of basis vectors in $\mathcal{T}_{v/2}$:
$$v_m = \frac{u_+^{v-r}u_-^{r}}{\sqrt{r!\,(v - r)!}} = \frac{u_+^{j+m}u_-^{j-m}}{\sqrt{(j + m)!\,(j - m)!}}, \tag{3.35}$$
where $j = v/2$, $m = j - r$ (that is, $m = -j, -j + 1, \ldots, j$) and define an inner
product $\langle v_m, v_{m'}\rangle = \delta_{mm'}$. Thus we find that the above representation of SU2
in $\mathcal{T}_j$ is unitary.
We will now show that the above representation is identical with $\mathcal{D}_j$, which
was derived from infinitesimal transformations in $\mathcal{H}_j$. For this purpose we
shall now compute the infinitesimal transformations in $\mathcal{T}_j$. For the $J_\nu$ in
$\mathcal{H}_{1/2}$ we find that
$$J_\nu = \Bigl[\frac{d}{d\alpha}U_\nu(\alpha)\Bigr]_{\alpha=0},$$
where $U_\nu(\alpha)$ is given by (3.26) with $w_\mu = 0$ for $\mu \neq \nu$. It follows
that for the $J_\nu$ in $\mathcal{T}_j$ we obtain
$$J_\nu v_m = \Bigl[\frac{d}{d\alpha}\,\frac{(U_\nu(\alpha)u_+)^{j+m}(U_\nu(\alpha)u_-)^{j-m}}{\sqrt{(j + m)!\,(j - m)!}}\Bigr]_{\alpha=0}
= \frac{(j + m)\,u_+^{j+m-1}u_-^{j-m}\,J_\nu u_+}{\sqrt{(j + m)!\,(j - m)!}} + \frac{(j - m)\,u_+^{j+m}u_-^{j-m-1}\,J_\nu u_-}{\sqrt{(j + m)!\,(j - m)!}}.$$
Since the $J_\nu$ in $\mathcal{H}_{1/2}$ are given in (3.20) we obtain
$$J_3v_m = -imv_m,$$
$$(J_1 + iJ_2)v_m = -iNv_m = -i\sqrt{(j - m)(j + m + 1)}\,v_{m+1}, \tag{3.36}$$
$$(J_1 - iJ_2)v_m = -iN^+v_m = -i\sqrt{(j + m)(j - m + 1)}\,v_{m-1},$$
that is, the relations (3.14), (3.17), and (3.18) are satisfied. Since SU2 is
isomorphic to the covering group, the representations in $\mathcal{H}_j$ and $\mathcal{T}_j$ are identical.
By means of the algebraic construction of the representation $\mathcal{D}_j$ in $\mathcal{T}_j$ we
may easily determine the values of $j$ for which the representation is unique (single valued)
and the values of $j$ which correspond to a “multiple valued” representation of
the rotation group. For this purpose we need only the representation of the element $-\mathbf{1}$ of
SU2 in $\mathcal{T}_j$. Since $\mathcal{T}_j$ consists of the homogeneous polynomials of degree $2j$
we find that $-\mathbf{1}$ will be represented by $(-1)^{2j}\mathbf{1}$. The integral
values of $j$ therefore result in unique representations of the rotation group; the half-integer
values lead to two-valued representations of the rotation group.
All reducible representations up to a factor of the rotation group in a Hilbert space $\mathcal{H}$ may
(since the covering group is compact) be decomposed into irreducible representations which
decompose $\mathcal{H}$ into a direct sum
$$\mathcal{H} = \sum_\mu \oplus\,\mathcal{H}_\mu \tag{3.37}$$
where each subspace $\mathcal{H}_\mu$ is invariant and irreducible with respect to the
representation. However, not any arbitrary irreducible representation (up to
a factor) can occur in the $\mathcal{H}_\mu$. Since the representation in $\mathcal{H}$ has to be a
representation up to a factor, the fundamental group must be
represented isomorphically in each of the $\mathcal{H}_\mu$ (see AV, §10.7). A
representation up to a factor of the rotation group in $\mathcal{H}$ contains either only representations with
half-integer $j$ or only those with integer values of $j$.
It now remains to show that we have determined all of the irreducible
representations of the covering group, that is, of SU2, that is, all representations (up to a
factor) of the rotation group, in terms of the representations $\mathcal{D}_j$ in the $\mathcal{T}_j$. This we shall show on the
basis of the completeness of the characters of the representations in the $\mathcal{T}_j$ as
class functions in SU2 (see AV, §10.5). If $U \in$ SU2 then there exists a $V \in$ SU2
such that $W = VUV^{-1}$ has the form
$$Wu_+ = e^{i\alpha}u_+, \qquad Wu_- = e^{-i\alpha}u_-. \tag{3.38}$$
This follows from the fact that $U$ must have two orthogonal eigenvectors
$v_1, v_2$ with eigenvalues $e^{i\alpha}, e^{i\beta}$. $V$ need only be chosen as a transformation
which transforms $v_1$ into $u_+$ and $v_2$ into $u_-$. If the determinant of $V$ is not
equal to 1, we may achieve this by multiplying $V$ by a
factor; such a multiplication of $V$ does not change $VUV^{-1}$. Since the
determinant of $W$ and that of $U$ must be equal to 1 we must have $e^{i\beta} = e^{-i\alpha}$.
It follows that two transformations from SU2 which have the same
eigenvalues belong to the same class of conjugate elements. We will run through
the different classes when the parameter $\alpha$ runs through the values between 0
and $\pi$ because the pair $e^{i\alpha}, e^{-i\alpha}$ of eigenvalues will then run through all possible
values. The character $\chi_j$ of the transformations $W$ in $\mathcal{T}_j$ is determined as follows. From the
equation
$$W(u_+^{j+m}u_-^{j-m}) = (Wu_+)^{j+m}(Wu_-)^{j-m}$$
it follows that
$$\chi_j(W) = e^{i2\alpha j} + e^{i2\alpha(j-1)} + \cdots + e^{-i2\alpha j}. \tag{3.39}$$
Therefore
$$\chi_0(W) = 1,$$
$$\tfrac{1}{2}\chi_{1/2}(W) = \cos\alpha,$$
$$\tfrac{1}{2}[\chi_1(W) - \chi_0(W)] = \cos 2\alpha,$$
$$\tfrac{1}{2}[\chi_{3/2}(W) - \chi_{1/2}(W)] = \cos 3\alpha, \quad\ldots$$
The functions $\cos(n\alpha)$ for $n = 0, 1, 2, \ldots$ in the interval $0 \leq \alpha \leq \pi$ form a
complete function system, so that there exist no additional irreducible
representations of SU2.
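The character formula (3.39) and the cosine identities listed after it can be checked directly. The following is a small sketch with hypothetical function names; `two_j` stands for the integer $2j$.

```python
from math import cos, isclose


def character(two_j, alpha):
    """chi_j(W) of (3.39) for W with eigenvalues exp(+i alpha), exp(-i alpha).

    The exponents are 2*alpha*m for m = j, j-1, ..., -j, that is,
    alpha*(two_j - 2k) for k = 0, ..., two_j; the imaginary parts cancel in
    pairs, so only the cosines are summed.
    """
    return sum(cos(alpha * (two_j - 2 * k)) for k in range(two_j + 1))


def cosine_from_characters(n, alpha):
    """Express cos(n*alpha) through characters, as in the list following (3.39)."""
    if n == 0:
        return character(0, alpha)
    if n == 1:
        return 0.5 * character(1, alpha)
    return 0.5 * (character(n, alpha) - character(n - 2, alpha))
```

The general pattern is $\tfrac{1}{2}[\chi_{n/2} - \chi_{(n-2)/2}] = \cos n\alpha$, because the two characters differ only in the outermost pair of terms $e^{\pm in\alpha}$.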
As we have seen in §2, to an elementary system type there corresponds a
spin space in which an irreducible representation (up to a factor) of the
rotation group is given. The spin space must then be isomorphic to one of the $\mathcal{T}_j$. As an index $s$ we use the
same index as in the case of the $\mathcal{T}_j$, that is, the spin $s$ (see D 2.2) can take on the
values $0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$
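The representation (2.35) of the rotation group in $\mathcal{H}$ is not irreducible even
if $R(A)$ is the operator which represents an irreducible representation $\mathcal{D}_s$ in the spin space.
With this the problem arises of reducing the representation given by (2.35).
For this purpose we will reduce the representation given by $V(A)$ in $\mathcal{H}_b$.
In order to derive some frequently used formulas we shall now consider the
isomorphic map (3.40) (see AIV, §13), $\varphi(k) \to \psi(r)$,
from the space $\mathcal{L}^2(R^3, dk_1\,dk_2\,dk_3)$ to $\mathcal{L}^2(R^3, dx_1\,dx_2\,dx_3)$. For simplicity
we shall also denote this space by $\mathcal{H}_b$ since $\mathcal{L}^2(R^3, dk_1\,dk_2\,dk_3)$ and
$\mathcal{L}^2(R^3, dx_1\,dx_2\,dx_3)$ can be considered to be different representations (see
XI, §2) of “the same” Hilbert space $\mathcal{H}_b$.
It is easy to see that, according to (3.40),
$$V(A)\varphi(k) = \varphi(A^{-1}k) \to \psi(A^{-1}r) = \tilde{V}(A)\psi(r),$$
where $\tilde{V}(A)$ is the image of the operator $V(A)$ in $\mathcal{L}^2(R^3, dx_1\,dx_2\,dx_3)$. Instead
of $\tilde{V}(A)$ we shall write $V(A)$. The representation of the rotation group in $\mathcal{H}_b$ which we must
reduce is therefore given by
$$V(A)\psi(r) = \psi(A^{-1}r). \tag{3.41}$$
Since (3.41) is a unique representation of the rotation group, in the reduced representation
we may only have integer values of $j$.
Consider a rotation $A$ about the 3-axis of angle $\alpha$, that is,
$$V(A)\psi(x_1, x_2, x_3) = \psi(x_1\cos\alpha + x_2\sin\alpha,\ -x_1\sin\alpha + x_2\cos\alpha,\ x_3);$$
thus, for an infinitesimal rotation we obtain:
$$J_3\psi(r) = \Bigl(x_2\frac{\partial}{\partial x_1} - x_1\frac{\partial}{\partial x_2}\Bigr)\psi(r).$$
For the $L_\nu = iJ_\nu$ it follows that, in general:
$$L_\nu = x_\mu\,\frac{1}{i}\frac{\partial}{\partial x_\rho} - x_\rho\,\frac{1}{i}\frac{\partial}{\partial x_\mu}, \tag{3.42}$$
where $(\nu, \mu, \rho)$ is a cyclic permutation of $(1, 2, 3)$.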
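The space $\mathcal{H}_b = \mathcal{L}^2(R^3, dx_1\,dx_2\,dx_3)$ may be represented in the following
form by means of polar coordinates (see AIV, §14):
$$\mathcal{H}_b = \mathcal{H}_r \times \mathcal{H}_{\theta,\varphi}, \tag{3.43a}$$
where
$$\mathcal{H}_r = \mathcal{L}^2(R_+, r^2\,dr), \qquad \mathcal{H}_{\theta,\varphi} = \mathcal{L}^2(\Omega, d\omega), \tag{3.43b}$$
where $\Omega$ is the surface of the unit sphere and $d\omega$ is the surface element (solid
angle) $d\omega = \sin\theta\,d\theta\,d\varphi$. With respect to (3.43a) $V(A)$ takes on the form
$$V(A):\ \mathbf{1} \times V(A). \tag{3.44}$$
We shall now reduce the representation $V(A)$ in $\mathcal{H}_{\theta,\varphi}$.
Similarly, the $L_\nu$ in (3.42) must take the form (3.44). By conversion to polar
coordinates, for $L_3$ and $N^+ = L_1 - iL_2$ we obtain:
$$L_3 = \frac{1}{i}\frac{\partial}{\partial\varphi}, \tag{3.45}$$
$$N^+ = e^{-i\varphi}\Bigl(-\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\Bigr). \tag{3.46}$$
Let the components of a unit vector be given by
$$e_1 = \sin\theta\cos\varphi, \qquad e_2 = \sin\theta\sin\varphi, \qquad e_3 = \cos\theta.$$
Then, by the Weierstrass approximation theorem the set of all $e_1^{\alpha_1}e_2^{\alpha_2}e_3^{\alpha_3}$
(where the $\alpha_\nu \geq 0$ are integers) span the entire space $\mathcal{L}^2(\Omega, d\omega)$. The $e_1^{\alpha_1}e_2^{\alpha_2}e_3^{\alpha_3}$
with fixed sum $\alpha_1 + \alpha_2 + \alpha_3 = l$ span a finite-dimensional subspace $\mathcal{H}^l_{\theta,\varphi}$ of
$\mathcal{H}_{\theta,\varphi}$. Since the $e_\nu$ transform linearly under rotations, $\mathcal{H}^l_{\theta,\varphi}$ is obviously an
invariant subspace under $V(A)$. We now seek the irreducible subspace in $\mathcal{H}^l_{\theta,\varphi}$
which contains the largest eigenvalue of $L_3$. According to (3.45) the
eigenvectors of $L_3$ obviously have the form $e^{im\varphi}g(\theta)$. If, instead of $e_1, e_2, e_3$ we
introduce the three functions $e = e_1 + ie_2 = e^{i\varphi}\sin\theta$, $\bar{e} = e_1 - ie_2 =
e^{-i\varphi}\sin\theta$, $e_3 = \cos\theta$, then $\mathcal{H}^l_{\theta,\varphi}$ will be spanned by all $e^{\beta_1}\bar{e}^{\beta_2}e_3^{\beta_3}$ for which
$\beta_1 + \beta_2 + \beta_3 = l$. The $e^{\beta_1}\bar{e}^{\beta_2}e_3^{\beta_3}$ are then precisely the eigenvectors of $L_3$ with
eigenvalues $\beta_1 - \beta_2$. Therefore the largest eigenvalue of $L_3$ is obtained when
$\beta_1 = l$ and $\beta_2 = \beta_3 = 0$. Its value is $l$. The eigenvalue $l$ of $L_3$ is
nondegenerate in $\mathcal{H}^l_{\theta,\varphi}$. The corresponding eigenfunction is
$$u_l = c_l\,e^{il\varphi}(\sin\theta)^l. \tag{3.47}$$
Since $N = L_1 + iL_2$ cannot take us out of $\mathcal{H}^l_{\theta,\varphi}$ we must have $Nu_l = 0$ and
the following equation must be satisfied:
$$L^2u_l = l(l + 1)u_l. \tag{3.48}$$
With the aid of $N^+$ we may, using (3.17), obtain the desired irreducible
representation space of the form $\mathcal{T}_l$ which is spanned by the $u_m$.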
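Since $u_m$ lies in $\mathcal{H}^l_{\theta,\varphi}$ and $L_3u_m = mu_m$, $u_m$ must be a linear combination of
vectors of the form $e^{\beta_1}\bar{e}^{\beta_2}e_3^{\beta_3}$ where $\beta_1 - \beta_2 = m$, $\beta_1 + \beta_2 + \beta_3 = l$:
$$u_m = c_m\,e^{im\varphi}(\sin\theta)^{-m}Q_m(\cos\theta), \tag{3.49}$$
where $Q_m$ is a polynomial in $\cos\theta$.
With $N^+$ defined by (3.46) and from (3.17) we obtain
$$N^+u_m = c_m\,e^{i(m-1)\varphi}(\sin\theta)^{-m+1}Q'_m(\cos\theta)
= \sqrt{(l + m)(l - m + 1)}\,u_{m-1}
= \sqrt{(l + m)(l - m + 1)}\,c_{m-1}\,e^{i(m-1)\varphi}(\sin\theta)^{-m+1}Q_{m-1}(\cos\theta),$$
where $Q'(\xi)$ is the derivative of $Q(\xi)$ with respect to $\xi$. Therefore the $Q_m$ may
be recursively defined by means of $Q'_m = Q_{m-1}$ and, for the normalization
constant $c_m$ we obtain the recursion formula:
$$c_m = \sqrt{(l + m)(l - m + 1)}\,c_{m-1}. \tag{3.50}$$
From (3.47) it follows that $Q_l(\xi) = (1 - \xi^2)^l$ and we therefore obtain:
$$Q_m(\xi) = \frac{d^{\,l-m}}{d\xi^{\,l-m}}(1 - \xi^2)^l. \tag{3.51}$$
From (3.50) it follows that $c_m$ takes on the value (with a yet to be determined
normalization constant $a$):
$$c_m = a\sqrt{\frac{(l + m)!}{(l - m)!}}. \tag{3.52}$$
The factor $a$ may be determined by means of the normalization condition
$$\int |u_0|^2 \sin\theta\,d\theta\,d\varphi = 1. \tag{3.53}$$
The integral (3.53) may be recursively calculated (see, for example,
[2], XI, §5.3). Condition (3.53) fixes $a$ only up to a phase factor; for real positive $a$ we obtain
$$a = \frac{1}{2^l\,l!}\sqrt{\frac{2l + 1}{4\pi}}. \tag{3.54}$$
The functions $u_m$ are generally known as “spherical harmonics”; the
customary notation for them is $Y_{lm}(\theta, \varphi)$. From (3.49), (3.52), and (3.54) we
finally obtain
$$Y_{lm}(\theta, \varphi) = a\sqrt{\frac{(l + m)!}{(l - m)!}}\;e^{im\varphi}(\sin\theta)^{-m}Q_{lm}(\cos\theta), \tag{3.55}$$
where
$$Q_{lm}(\xi) = \frac{d^{\,l-m}}{d\xi^{\,l-m}}(1 - \xi^2)^l.$$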
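The normalization of the functions built from (3.49), (3.51) and (3.52) can be verified with exact rational arithmetic. The sketch below is an illustration under an explicit assumption: it takes $|a|^2 = (2l + 1)/(4\pi\,(2^l l!)^2)$, the value that the condition (3.53) yields, and checks that $\int |u_m|^2\,d\omega = 1$ for every $m$. All function names are hypothetical. Polynomials in $\xi = \cos\theta$ are stored as coefficient lists, lowest degree first, and the factor $\pi$ cancels between $|a|^2$ and the $\varphi$ integration.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for k, b in enumerate(q):
            out[i + k] += a * b
    return out


def poly_diff(p):
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def power_of_one_minus_x2(n):
    out = [Fraction(1)]
    for _ in range(n):
        out = poly_mul(out, [Fraction(1), Fraction(0), Fraction(-1)])
    return out


def divide_by_one_minus_x2(p):
    """Exact division p / (1 - x^2); raises if the remainder is not zero."""
    q = []
    for k, c in enumerate(p):
        q.append(c + (q[k - 2] if k >= 2 else 0))   # from p_k = q_k - q_{k-2}
    if any(q[-2:]):
        raise ValueError("polynomial is not divisible by 1 - x^2")
    return q[:-2]


def integrate(p):
    """Integral of the polynomial over [-1, 1]; odd powers drop out."""
    return sum(Fraction(2) * c / (k + 1) for k, c in enumerate(p) if k % 2 == 0)


def norm_squared(l, m):
    """Integral of |u_m|^2 over the unit sphere, as an exact Fraction."""
    q = power_of_one_minus_x2(l)
    for _ in range(l - m):
        q = poly_diff(q)                         # Q_m of (3.51)
    integrand = poly_mul(q, q)                   # Q_m^2, still to be multiplied by (1 - x^2)^(-m)
    if m >= 0:
        for _ in range(m):
            integrand = divide_by_one_minus_x2(integrand)
    else:
        integrand = poly_mul(integrand, power_of_one_minus_x2(-m))
    # 2*pi*|c_m|^2 with c_m from (3.52) and the assumed |a|^2; pi cancels.
    prefactor = (Fraction(2 * l + 1, 2 * (2 ** l * factorial(l)) ** 2)
                 * Fraction(factorial(l + m), factorial(l - m)))
    return prefactor * integrate(integrand)
```

That the same constant $a$ normalizes every $u_m$ reflects the fact that the recursion (3.50) was derived from the norm-preserving relation (3.17).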
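The $Y_{lm}$ span an irreducible subspace $\mathcal{T}_l$ of $\mathcal{H}_{\theta,\varphi}$. We will now show that
$$\mathcal{H}_{\theta,\varphi} = \sum_{l=0}^{\infty} \oplus\,\mathcal{T}_l, \tag{3.56}$$
that is, the $Y_{lm}$ completely span $\mathcal{H}_{\theta,\varphi}$. For this purpose it is sufficient to show
that
$$\mathcal{H}^l_{\theta,\varphi} = \mathcal{T}_l \oplus \mathcal{T}_{l-2} \oplus \mathcal{T}_{l-4} \oplus \cdots. \tag{3.57}$$
In fact, we may consider the $\mathcal{T}_{l-2}, \mathcal{T}_{l-4}, \ldots$ to be homogeneous polynomials
of degree $l$; for example, obtained from multiplication of polynomials of
degree $(l - 2)$ by $1 = e_1^2 + e_2^2 + e_3^2$ (and similarly for $\mathcal{T}_{l-4}$). The right side of
(3.57) is therefore a subspace of $\mathcal{H}^l_{\theta,\varphi}$. Since, for different $l$ the $\mathcal{T}_l$ must be
orthogonal, the dimension of the right-hand side must be equal to
$$[2l + 1] + [2(l - 2) + 1] + [2(l - 4) + 1] + \cdots
= [(l + 1) + l] + [(l - 1) + (l - 2)] + [(l - 3) + (l - 4)] + \cdots.$$
The dimension $n(l)$ of $\mathcal{H}^l_{\theta,\varphi}$ is equal to the number of the $e_1^{\alpha_1}e_2^{\alpha_2}e_3^{\alpha_3}$ for which
$\alpha_1 + \alpha_2 + \alpha_3 = l$. Thus it follows that $n(l + 1) = n(l) + [1 + l] + 1$.
Therefore $n(l) = [l + 1] + l + [l - 1] + \cdots$. Thus we have proven (3.57)
and (3.56).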
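The dimension count used for (3.57) can be confirmed by brute force. The following short sketch (function names are illustrative) counts the monomials $e_1^{\alpha_1}e_2^{\alpha_2}e_3^{\alpha_3}$ of total degree $l$ and compares the result with the sum of the dimensions $2l' + 1$ for $l' = l, l - 2, l - 4, \ldots$

```python
def monomial_count(l):
    """Number n(l) of exponent triples (a1, a2, a3) >= 0 with a1 + a2 + a3 = l."""
    return sum(1 for a1 in range(l + 1) for a2 in range(l + 1 - a1))


def harmonic_dimension_sum(l):
    """Sum of (2l' + 1) over l' = l, l - 2, l - 4, ... (down to 0 or 1)."""
    return sum(2 * k + 1 for k in range(l, -1, -2))
```

Both functions return $(l + 1)(l + 2)/2$, and `monomial_count` satisfies the recursion $n(l + 1) = n(l) + l + 2$ quoted in the text.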
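By (3.42) and (3.44) the reduction of the representation in $\mathcal{H}_b$ is very
simple. Choose a complete orthonormal basis $\chi_\nu(r)$ in $\mathcal{H}_r$. The $\psi_{\nu lm}(r) =
\chi_\nu(r)Y_{lm}(\theta, \varphi)$ span (for fixed $\nu$, $l$) an irreducible subspace $\mathcal{H}_{b\nu l}$ of $\mathcal{H}_b$. From
(3.56) we therefore obtain
$$\mathcal{H}_b = \sum_{\nu,l} \oplus\,\mathcal{H}_{b\nu l}. \tag{3.58}$$
The reduction of the representation $U(A) = V(A) \times R(A)$ in the product of $\mathcal{H}_b$ with the spin space (the spin space is
an irreducible representation space) will be postponed until XI, §10. It is easy
to see that this reduction is achieved if the representation $V(A) \times R(A)$ can be
reduced in the product of $\mathcal{T}_l$ with the spin space. For a representation in a product space we write a
$\times$-sign, for example, $\mathcal{D}_{j_1} \times \mathcal{D}_{j_2}$ for a product representation in the product space
$\mathcal{T}_{j_1} \times \mathcal{T}_{j_2}$ where the representation operators have the form $V(A) \times R(A)$ and
$\mathcal{T}_{j_1}$ is irreducible with respect to the $V(A)$ and $\mathcal{T}_{j_2}$ is irreducible with respect to the $R(A)$.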
4 Position and Momentum Observables
In §2 and §3 we encountered infinitesimal transformations. Using the latter
we defined self-adjoint operators (not necessarily bounded) such as $K$, $X$ and
H in §2 and L in §3. According to IV, D 2.5.6 to each such operator there
corresponds a scale observable (which, according to D 2.5.6 is also a decision
observable). These scale observables are often given names. The introduction
of observables by means of infinitesimal transformations is common practice
in quantum mechanics. In this way the problem of how these observables are
to be measured is often ignored.
In an analogous procedure in classical mechanics the observables are
functions in the $\Gamma$-space $(p_\nu, q_\nu)$ of the system. Since the pre-theories of
classical mechanics should define how position and momentum, and finally
how $p_\nu, q_\nu$, are to be measured, we should be satisfied if we are able to specify
the observables as functions in $\Gamma$-space; then their measurement will be
described in terms of the pre-theories of classical mechanics.
The situation in quantum mechanics is totally different. Here the
pre-theories make possible the description of registration methods and
registration procedures. However, the correspondence is itself
determined by quantum mechanics! Here the problem described in IV, §4
appears in clear focus. Here we must concede that the specification of self-
adjoint operators and their corresponding scale observables is primarily of a
“pure theoretical nature.” According to the theory there should be “in
principle” approximate measurement methods in the sense of IV, §4 for these
observables. However, it appears that they cannot be obtained from
previously defined theories!
The quantity $X_\nu$ (multiplied by $m^{-1}$) introduced in §2 is often called the
“position observable at time $t = 0$.” However, from the theory we can say neither
how this may be measured nor whether a particular measurement method
$b_0$ measures these $X_\nu$ (even approximately). With respect to the latter point
we may make an additional step in that direction in the following way:
In the above definition of $m^{-1}X_\nu$ as the “position at time $t = 0$” it is
unclear not only what is meant by the expression “position” (unless we are
willing to accept this expression as a mere name without any meaning) but
also what the physical meaning of the expression “at time t = 0” should be.
At present there are many different and varied conceptions and (apparent)
interpretations about this problem in circulation. Here we shall only mention
that the expression “at time t = 0” cannot mean (as we shall find in more
precise terms in XVII) that the “measurement takes place at time t = 0.”
Such a “point in time” at which a measurement takes place does not exist.
On this basis other priorities will take precedence over the definition of a
decision observable for “position at time $t$.” Such a definition will necessarily
refer in a precise manner to a laboratory fixed space-time reference system as
defined in §1 for the physical interpretation of the Galileo group. Consider an
apparatus $b_0$ for which the scale response refers to the spatial domain, that is,
the scale of the measurement apparatus defines an isomorphism between
$\mathcal{R}(b_0)$ and the “Boolean ring $\Sigma$ of a region in three-dimensional space.” Since
we cannot expect that there is a real apparatus with a rigorous isomorphism
between $\mathcal{R}(b_0)$ and $\Sigma$ we proceed instead by making an idealization, that is,
with an observable $\Sigma \xrightarrow{F} L$ where $\Sigma$ is the “Boolean ring of the spatial
region.” How can we define $\Sigma$ mathematically?
Let $A$ be the $\sigma$-algebra of the Lebesgue measurable sets of $R^3$ (in this case
the three-dimensional space defined by the laboratory fixed spatial reference
system, that is, $R^3$ is the set of coordinate triples $(x_1, x_2, x_3)$). Let $J$ be the
family of the sets of measure 0. $\Sigma = A/J$ is a complete Boolean ring, which we
shall call the Boolean ring of the “spatial region.” Each element $\sigma \in \Sigma$
therefore represents a possible response $b$ of the idealized measurement
apparatus $b_0$.
In the sense of IV, D 2.5.5 $\Sigma = \Sigma(x_1, x_2, x_3)$ where the $x_\nu$ are the
measurable spatial coordinates in the laboratory reference system under
consideration. These coordinates have nothing to do with quantum mechanics.
They are defined by classical measurement techniques and procedures.
$\Sigma(x_1, x_2, x_3) \xrightarrow{F} L$ is therefore, in the sense of IV, §2.5, an observable with the
sufficient scales $x_1, x_2, x_3$.
We now seek to sharpen our assertion about the desired position
observable. Let us set $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$, that is, let us consider a decision
observable.
Since there is often much misunderstanding concerning the meaning of
measurement, that is, of the registrations $b$ which correspond to the
$\sigma \in \Sigma(x_1, x_2, x_3)$, we again stress the fact that the registrations $b$ on the
apparatus under consideration do not have an immediate connection with
the technical aspects of the measurement of the coordinates $x_1, x_2, x_3$!
If, for example, a registration $b$ corresponds to
$$\sigma = \{(x_1, x_2, x_3) \mid x_\nu^0 - \varepsilon < x_\nu < x_\nu^0 + \varepsilon\},$$
where $\varepsilon$ is small, this means only that the apparatus $b_0$ records (for example,
with the aid of a computer) the “measurement values” $(x_1^0, x_2^0, x_3^0)$. It does
not mean (!) that $x_1^0, x_2^0, x_3^0$ are the technically measured spatial coordinates
of some macroscopic event or process associated with the registration $b$.
The “responses” of the measurement apparatus have nothing to do with the
technical aspects of measurement of spatial position. However, the usage of
a measurement apparatus characterized by $b_0$ has much to do with the
technical aspects of spatial measurement, as we have explained in §1 and
which we shall now discuss.
The measurement of the “position of a microsystem at time $t$” (assuming
that such an observable $\Sigma \xrightarrow{E} G$ exists which satisfies all the requirements
which we will impose on it) by an apparatus is therefore different from, for
example, the technical measurement of the position of a space ship at time $t$.
The technical measurement of a space ship is explainable without the use of
Newtonian mechanics. The construction of the desired position
measurement apparatus $b_0$ for microsystems cannot be explained without quantum
mechanics; the position of a microsystem is only indirectly measurable in
quantum mechanics (in the sense of the discussions in [1], §10 or [2], III, §9).
We shall now seek to define the position observable in this indirect way.
It must be possible to adjust the spatial position of the apparatus $b_0$
relative to the laboratory system in order that Galileo transformations have
the meaning which was described in §1, for instance, that of a spatial
translation $(1, 0, \eta, 0)$. We shall now investigate the requirements which shall
be imposed on $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ in order to relate indirectly the
“measurement values” $x_1, x_2, x_3$ of $b$ to the spatial coordinates. Intuitively we find that
if we try to interpret the registration $b$ for the apparatus $b_0$ as the
“determination” of the position in $\sigma$ then the apparatus $b'_0$ which is obtained
from $b_0$ by means of a translation $(1, 0, \eta, 0)$ and the corresponding response
$b'$ must correspond to the determination that the position is in $\sigma' = \sigma + \eta$
(where $\sigma + \eta$ is, of course, the spatial domain for which $\sigma$ is displaced by $\eta$).
According to §1 and §2 we obtain
$$\psi(b'_0, b') = U(1, 0, \eta, 0)\,\psi(b_0, b)\,U(1, 0, \eta, 0)^{-1}. \tag{4.1}$$
This equation should also be satisfied if $\psi(b_0, b)$ is replaced by $E(\sigma)$ and
$\psi(b'_0, b')$ is replaced by $E(\sigma + \eta)$, since $E(\sigma)$ is the idealization of $\psi(b_0, b)$.
We now impose our first requirement upon $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$:
$$U(1, 0, \eta, 0)\,E(\sigma)\,U(1, 0, \eta, 0)^{-1} = E(\sigma + \eta). \tag{4.2}$$
For rotations we may make a similar argument; we require that
$$U(A, 0, 0, 0)\,E(\sigma)\,U(A, 0, 0, 0)^{-1} = E(A\sigma), \tag{4.3}$$
where $A\sigma$ is the domain which is obtained by rotating $\sigma$ by $A$.
The requirement (4.2) is “in principle” experimentally verifiable in the form
(4.1) as follows: If we have built an apparatus and set it relative to the
laboratory system, then there is a corresponding $b_0$. Then we easily obtain $b'_0$
(macroscopically) as the spatial translation of the apparatus $b_0$. Here (4.1)
can be checked approximately by probability measurements for different
$a \in \mathcal{Q}'$. Thus it is clear that the requirements (4.2) and (4.3) refer to
registrations and preparations, as we have described in II and in §1.
As we have found in IV, §2.5, the decision observable $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ is
uniquely determined by the scale observables which correspond to the scales
as follows:
$$Q_\nu = \int \lambda \, dE_\nu(\lambda). \tag{4.4}$$
Then, from IV, §2.5 we obtain
$$E_\nu(\lambda) = E(\sigma_\nu(\lambda)) \quad\text{where}\quad \sigma_\nu(\lambda) = \{(x_1, x_2, x_3) \mid x_\nu < \lambda\}. \tag{4.5}$$
The operators $Q_\nu$ introduced in (4.4) are self-adjoint (not bounded) and are
not defined in all of $\mathcal{H}$. Nevertheless these operators uniquely determine the
corresponding spectral families $E_\nu(\lambda)$ (see AIV, §10) and we therefore also
obtain (as proven in IV, §2.5) the complete observable $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$.
Later we shall return to the discussion of the use of infinitely extended scales.
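As an illustration of (4.4) and (4.5), the following minimal sketch replaces the Hilbert space by functions on a finite one-dimensional grid. This discretization is an assumption made only for illustration; the grid, the mesh width `h`, and all function names are hypothetical and do not appear in the text. The spectral family $E(\lambda)$ becomes the diagonal projection onto the grid points with $x < \lambda$, and a Riemann-Stieltjes sum approximating $\int \lambda\, dE(\lambda)$ recovers multiplication by $x$ up to the mesh width.

```python
# Finite-grid sketch of Q = ∫ λ dE(λ) with E(λ) = projection onto {x < λ}.
# Diagonal operators are stored as lists of their diagonal entries.

def spectral_projection(grid, lam):
    """Diagonal of E(lam): 1 where the grid point satisfies x < lam, else 0."""
    return [1.0 if x < lam else 0.0 for x in grid]

def stieltjes_sum(grid, lambdas):
    """Diagonal of sum_k lambda_k * (E(lambda_k) - E(lambda_{k-1}))."""
    diag = [0.0] * len(grid)
    previous = spectral_projection(grid, lambdas[0])
    for lam in lambdas[1:]:
        current = spectral_projection(grid, lam)
        for j in range(len(grid)):
            diag[j] += lam * (current[j] - previous[j])
        previous = current
    return diag

h = 0.1                                            # mesh width of the λ-partition (hypothetical)
grid = [-0.99 + 0.037 * j for j in range(54)]      # sample points x_j inside (-1, 1)
lambdas = [-1.0 + h * k for k in range(-1, 22)]    # partition from -1.1 to 1.1
q_diag = stieltjes_sum(grid, lambdas)

# Each diagonal entry of the sum differs from x_j by at most the mesh width.
max_error = max(abs(q - x) for q, x in zip(q_diag, grid))
print(max_error <= h)
```

Refining the partition (smaller `h`) makes the sum converge to the multiplication operator, which is the finite-grid counterpart of the statement that the spectral family determines the scale observable.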
The requirements (4.2) and (4.3) for the observable $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ are
not sufficient to determine this observable uniquely. In addition, we must
also take into account the use of the laboratory time scale $t$.
We shall now again consider the intuitive idea that $b_0$ represents a
measurement apparatus for which the measurement result can be interpreted
as the registration of “position at time $t$.” What should this mean? In order to
answer this question we shall now consider the original apparatus $b_0$
together with its response $b$ and a second experiment $b''_0$ and response $b''$,
where $b''_0$ and $b_0$ are identical except that $b''_0$ moves relative to $b_0$ with velocity
$\vec{\delta}$ in such a manner that, with respect to the laboratory time scale, the
apparatus $b''_0$ takes the same spatial position as $b_0$ does at the time $t$. It is
obvious that the apparatuses for $b_0$ and $b''_0$ cannot be applied to the same
experiment. Two experimental series, one with $b_0$, the other with $b''_0$, must be
carried out in order to measure the frequencies of the responses. Then $b''_0$ may
be obtained by the application of the following Galileo transformation to $b_0$:
$$x''_\nu = x_\nu + \delta_\nu (t' - t), \tag{4.6}$$
where $t'$ denotes the running laboratory time.
Equation (4.6) may be written as a product as follows:
$$(1, 0, 0, t)(1, \vec{\delta}, 0, 0)(1, 0, 0, -t). \tag{4.7}$$
The form (4.7) permits us to use the operators described in §2.
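The factorization (4.7) can be checked with a small sketch. It assumes the usual action of a Galileo transformation $(A, \vec{\delta}, \vec{\eta}, \gamma)$ on an event, $\vec{x}' = A\vec{x} + \vec{\delta}t + \vec{\eta}$, $t' = t + \gamma$, restricted to $A = 1$, and it assumes that in a product the rightmost factor acts first. Both conventions, and the names `Galileo`, `apply`, and `compose`, are assumptions of this illustration rather than definitions quoted from §1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Galileo:
    """Galileo transformation (1, delta, eta, gamma) without rotation (assumed form)."""
    delta: tuple   # velocity boost
    eta: tuple     # spatial translation
    gamma: float   # time translation

    def apply(self, x, t):
        """Assumed action: x' = x + delta*t + eta, t' = t + gamma."""
        x_new = tuple(xi + d * t + e for xi, d, e in zip(x, self.delta, self.eta))
        return x_new, t + self.gamma

def compose(g2, g1):
    """Group product g2 * g1, meaning: apply g1 first, then g2."""
    delta = tuple(a + b for a, b in zip(g2.delta, g1.delta))
    eta = tuple(e1 + d2 * g1.gamma + e2
                for e1, d2, e2 in zip(g1.eta, g2.delta, g2.eta))
    return Galileo(delta, eta, g1.gamma + g2.gamma)

zero = (0.0, 0.0, 0.0)
t_fixed = 2.5                      # the time t at which both apparatuses coincide
delta = (0.3, -1.0, 0.2)           # relative velocity (sample values)

time_shift = Galileo(zero, zero, t_fixed)
boost = Galileo(delta, zero, 0.0)
time_shift_back = Galileo(zero, zero, -t_fixed)

# (1, 0, 0, t)(1, delta, 0, 0)(1, 0, 0, -t), cf. (4.7)
g = compose(time_shift, compose(boost, time_shift_back))

x, tau = (1.0, 2.0, 3.0), 4.0      # sample event at running time tau
x_new, tau_new = g.apply(x, tau)
expected = tuple(xi + d * (tau - t_fixed) for xi, d in zip(x, delta))
print(x_new, expected, tau_new)
```

Under these conventions the product leaves the time coordinate unchanged and displaces the position by $\vec{\delta}(t' - t)$, so the moved apparatus coincides with the original one exactly at the running time $t' = t$.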
If $b_0$ “registers the position at time $t$” by the response $b$, then $b''_0$ should
do the same, because both apparatuses will be in coincidence at time $t$. We
will therefore require that $\psi(b_0, b) = \psi(b''_0, b'')$. We may transform $\psi(b_0, b)$ to
its idealized version $E(\sigma)$ and therefore require that
$$U(1, 0, 0, t)\,U(1, \vec{\delta}, 0, 0)\,U(1, 0, 0, t)^{-1}\,E(\sigma)\,U(1, 0, 0, t)\,U(1, \vec{\delta}, 0, 0)^{-1}\,U(1, 0, 0, t)^{-1} = E(\sigma), \tag{4.8}$$
where we have used the form (4.7) of the Galileo transformation (4.6).
We shall now show that there exists exactly one decision observable
$\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ which satisfies the conditions (4.2), (4.3), and (4.8). This
observable will be called “position at time $t$” and we shall denote the
corresponding scale observables (4.4) by $Q_\nu(t)$. The real physical
meaning of the time parameter $t$ is then characterized by (4.8), because (4.8) can
only be satisfied by a single $t$. The time $t$ has nothing to do with the time of
occurrence of the macroscopic response $b$; in addition it has nothing to do
with the notion of the “time of measurement.” Such a “time of measurement”
does not refer to an instant of time, but rather to a time interval in which the
interaction between the microsystem and the apparatus takes place. Rather,
the time $t$ is that for which the moving apparatus $b''_0$ comes into coincidence
with the stationary apparatus $b_0$. Therefore, $t$ may be determined
macroscopically and has nothing to do with the temporal evolution of the
interaction of the microsystem with the measurement apparatus; $t$ is
obtained from the spatial alignment of the two registration methods $b_0$ and
$b''_0$. Since we are unable to measure the position of a microsystem in a
technical sense, it is not meaningful to speak of a measurement “at time $t$.”
Since the $x_1, x_2, x_3$ and $t$ are actually parameters of the underlying reference
system and are only adjustable parameters for the registration apparatus
relative to the reference system, they are not observables in the quantum
mechanical sense, as these are defined and studied in IV.
We shall now prove the above assertion that the $Q_\nu(t)$ are uniquely defined
(for a precise mathematical formulation and proof see [10] and [22]). For
this purpose we shall first consider the special case in which $t = 0$. Then from
(4.8) we obtain
$$U(1, \vec{\delta}, 0, 0)\,E(\sigma)\,U(1, \vec{\delta}, 0, 0)^{-1} = E(\sigma), \tag{4.9}$$
that is, $U(1, \vec{\delta}, 0, 0)$ (of (2.11)) commutes with $E(\sigma)$.
From (4.2) it follows that, with (2.9),
$$e^{i\vec{K}\cdot\vec{\eta}}\,E(\sigma)\,e^{-i\vec{K}\cdot\vec{\eta}} = E(\sigma + \vec{\eta}). \tag{4.10}$$
Thus, for all $Q_\nu(0)$ from (4.4) it follows that
$$e^{i\vec{K}\cdot\vec{\eta}}\,\vec{Q}(0)\,e^{-i\vec{K}\cdot\vec{\eta}} = \vec{Q}(0) - \vec{\eta}\,\mathbf{1}. \tag{4.11}$$
From (2.17) we obtain a similar equation
$$e^{i\vec{K}\cdot\vec{\eta}}\,\vec{X}\,e^{-i\vec{K}\cdot\vec{\eta}} = \vec{X} - m\vec{\eta}\,\mathbf{1}. \tag{4.12}$$
Therefore $\vec{Y} = \vec{Q}(0) - m^{-1}\vec{X}$ is an operator which commutes with
$U(1, \vec{\delta}, 0, 0)$ and with $U(1, 0, \vec{\eta}, 0)$. Therefore we obtain
$$\vec{Y} = \mathbf{1} \times \vec{Y}$$
(where on the right-hand side $\vec{Y}$ denotes an operator in $\mathfrak{r}_s$), that is,
$$\vec{Q}(0) = m^{-1}\vec{X} \times \mathbf{1} + \mathbf{1} \times \vec{Y}. \tag{4.13}$$
From (4.3) it follows that, using (2.35),
$$R(A)\,\vec{Y}\,R(A)^{-1} = A^{-1}\vec{Y}. \tag{4.14}$$
Since the $Q_\nu(0)$ mutually commute, the $Y_\nu$ must also commute. Since $\mathfrak{r}_s$ is
finite dimensional, the $Y_\nu$ have a common system of eigenspaces which span
all of $\mathfrak{r}_s$. From (4.14) it follows that the $R(A)\,Y_\nu\,R(A)^{-1}$ are linear combinations of
the $Y_\nu$ and therefore the common eigenspaces of the $Y_\nu$ are also eigenspaces of the
$R(A)\,Y_\nu\,R(A)^{-1}$, that is, the $R(A)$ leave the eigenspaces of the $Y_\nu$ invariant. Since the
representation of the rotation group in $\mathfrak{r}_s$ is irreducible, there are no proper
invariant subspaces. Therefore $Y_\nu = \lambda_\nu \mathbf{1}$. From (4.14) it follows that
$\lambda_1 = \lambda_2 = \lambda_3 = 0$ and therefore, according to (4.13), we obtain
$$\vec{Q}(0) = m^{-1}\vec{X}. \tag{4.15}$$
Conversely, if we define a decision observable
$\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ according to IV, §2.5 by means of (4.15), then the so-defined
observable will satisfy conditions (4.2), (4.3), and (4.8).
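The step from (4.10) to (4.11) can be made explicit by a change of variable in the spectral integral (4.4). The following is a sketch for a single component, using only (4.4), (4.5), and (4.10); the shorthand $U_\eta$ and the choice of a translation along the $\nu$-th axis are introduced here for illustration.

```latex
% Shorthand (not from the text): U_\eta := e^{i\vec{K}\cdot\vec{\eta}}
% with \vec{\eta} = \eta\,\vec{e}_\nu, a translation along the \nu-th axis.
\begin{align*}
U_\eta\, E_\nu(\lambda)\, U_\eta^{-1}
  &= U_\eta\, E(\sigma_\nu(\lambda))\, U_\eta^{-1}
   = E(\sigma_\nu(\lambda) + \vec{\eta})
   = E_\nu(\lambda + \eta), \\
U_\eta\, Q_\nu(0)\, U_\eta^{-1}
  &= \int \lambda \, dE_\nu(\lambda + \eta)
   = \int (\lambda' - \eta)\, dE_\nu(\lambda')
   = Q_\nu(0) - \eta\,\mathbf{1}.
\end{align*}
```

The first line uses that displacing the half-space $\{x_\nu < \lambda\}$ by $\eta$ along the $\nu$-th axis gives $\{x_\nu < \lambda + \eta\}$; the second line substitutes $\lambda' = \lambda + \eta$.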
If $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ is the position observable for time $t = 0$ and we define
(using (2.38))
$$\tilde{E}(\sigma) = U(1, 0, 0, t)\,E(\sigma)\,U(1, 0, 0, t)^{-1} = e^{iHt}\,E(\sigma)\,e^{-iHt},$$
then, from (4.9) it follows that
$$U(1, \vec{\delta}, 0, 0)\,U(1, 0, 0, t)^{-1}\,\tilde{E}(\sigma)\,U(1, 0, 0, t)\,U(1, \vec{\delta}, 0, 0)^{-1} = U(1, 0, 0, t)^{-1}\,\tilde{E}(\sigma)\,U(1, 0, 0, t),$$
that is, (4.8) holds for $\tilde{E}(\sigma)$. It is easy to prove that $\tilde{E}(\sigma)$ also satisfies
conditions (4.2) and (4.3). The scale observables
$$Q_\nu(t) = e^{iHt}\,Q_\nu(0)\,e^{-iHt} \tag{4.16}$$
therefore exactly satisfy the conditions for the desired observables “position
at time $t$.” Conversely, the $Q_\nu(t)$ must have the form (4.16), since
the observable defined by
$$e^{-iHt}\,Q_\nu(t)\,e^{iHt}$$
must satisfy the conditions (4.2), (4.3), and (4.9).
Thus, in this way we have uniquely defined the decision observable
“position at time $t$” for elementary systems.
We have not yet constructed a registration method $b_0$ which will permit (in
the sense of IV, §4) an approximate realization of the “position observable at
time $t$.” We have, however, described a type of experiment which may be used
in order to determine whether a given registration method approximates the
“position observable at time $t$.”
A similar method can be used for the definition of a “momentum
observable.” We begin with a decision observable $\Sigma(p_1, p_2, p_3) \xrightarrow{E} G$ and
impose the following requirements: The observable is invariant under spatial
translations:
$$U(1, 0, \vec{\eta}, 0)\,E(\sigma)\,U(1, 0, \vec{\eta}, 0)^{-1} = E(\sigma). \tag{4.17}$$
Under rotations we require that
$$U(A, 0, 0, 0)\,E(\sigma)\,U(A, 0, 0, 0)^{-1} = E(A\sigma). \tag{4.18}$$
Under proper Galileo transformations we require that
$$U(1, \vec{\delta}, 0, 0)\,E(\sigma)\,U(1, \vec{\delta}, 0, 0)^{-1} = E(\sigma - m\vec{\delta}). \tag{4.19}$$
Here (4.19) corresponds to the intuitive idea that the motion of the
registration apparatus with velocity $\vec{\delta}$ results in a change of momentum of the
microsystems relative to the moving registration apparatus of magnitude
$(-m\vec{\delta})$.
The observable $\Sigma(p_1, p_2, p_3) \xrightarrow{E} G$ can be characterized by three scale
observables
$$P_\nu = \int \lambda \, dE_\nu(\lambda) \tag{4.20}$$
with respect to the scales $p_1, p_2, p_3$.
From (4.17) it follows that the $P_\nu$ commute with the $K_\nu$. From (2.17) and
(4.19) it follows that the
$$Z_\nu = P_\nu + K_\nu$$
commute with the $X_\nu$ and the $K_\nu$, that is, we must obtain
$$P_\nu = -K_\nu \times \mathbf{1} + \mathbf{1} \times Z_\nu.$$
Similarly, as in the case of (4.13), from (4.18) it follows that we must have
$$Z_\nu = 0.$$
Therefore, a momentum observable is uniquely determined by the requirements
(4.17), (4.18), and (4.19); it is characterized by the scale observables
$$P_\nu = -K_\nu \times \mathbf{1}. \tag{4.21}$$
It is easy to see that $e^{iH\gamma}$ commutes with the $P_\nu$, that is, a time translation of
the idealized registration methods corresponding to the $P_\nu$ does not produce
any change in the observables.
From (2.16), (4.15), and (4.21), we obtain the famous Heisenberg commutation
relation for position and momentum
$$P_\nu Q_\mu - Q_\mu P_\nu = \frac{1}{i}\,\delta_{\nu\mu}\,\mathbf{1}. \tag{4.22}$$
Since Planck’s constant does not appear in the theory formulated here (as
we have already discussed), it is clear that we have correctly formulated the
fundamental structure of quantum mechanics. Equation (4.22) is an indirect
consequence of the representations of the Galileo group and is therefore only
a consequence of axiom AV 4s in III, §3. For a discussion of the corresponding
Heisenberg uncertainty relations, see IV, §8.3.
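The relation (4.22) can be checked in a concrete model. The sketch below assumes the familiar one-dimensional realization in which $Q$ acts as multiplication by $x$ and $P$ as $\frac{1}{i}\frac{d}{dx}$ on polynomials. This realization, the coefficient-list encoding, and the function names are assumptions for illustration; the text itself obtains (4.22) from the group representation. For a sample polynomial $f$ the code confirms $(PQ - QP)f = \frac{1}{i} f$.

```python
# Polynomials are lists of complex coefficients: f(x) = sum(c[n] * x**n).

def apply_Q(c):
    """Multiplication by x: shifts every coefficient up by one degree."""
    return [0j] + [complex(a) for a in c]

def apply_P(c):
    """(1/i) d/dx applied to the polynomial with coefficients c."""
    derivative = [n * complex(a) for n, a in enumerate(c)][1:]
    return [a / 1j for a in derivative] or [0j]

def subtract(a, b):
    """Coefficient-wise difference a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0j] * (n - len(a))
    b = b + [0j] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

f = [2.0, -1.0, 0.5, 3.0]                      # sample polynomial 2 - x + 0.5 x^2 + 3 x^3
commutator_f = subtract(apply_P(apply_Q(f)), apply_Q(apply_P(f)))
expected = [complex(a) / 1j for a in f]        # (1/i) f
print(commutator_f)
print(expected)
```

Polynomials are used because the relation cannot hold for finite matrices (the trace of a commutator vanishes), which is one way to see that unbounded operators are unavoidable here.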
These observables $P_\nu$, $Q_\nu$ provide our first example for the use of
“unbounded” self-adjoint operators. In quantum mechanics the meaning of
unbounded operators, their domain of definition, and the precise formulation
of the relation (4.22) are often treated as a mystery. Here we have found that
there is, in principle, nothing unusual underlying the introduction of the
unbounded operators $P_\nu$, $Q_\nu$ described above. The conceptual structure of
quantum mechanics has nothing to do with this occurrence of unbounded
operators. It arises exclusively (and, for the most part, effortlessly) from a
mathematical idealization which has no particular physical meaning (see, for
example, [1], §9 and [2], III, §8). If, for example, we describe the laboratory
reference system by Euclidean geometry, it is then practical to use the
Euclidean rectangular coordinates $x_\nu$ as scales, although, in principle, it is not
necessary; we could instead use other finite scales. Since the lack of finiteness
for the scale $x_\nu$ has no physical meaning, it follows that in the real world
arbitrarily large $x_\nu$ have no physical content (see, for example, [1], §9 and
[2], IX, X).
Therefore unbounded self-adjoint operators for scale observables occur if
we introduce unbounded scales. The scales for such observables are only a
practical tool, as we have described in IV, §2.5.
There is, however, a second case where unbounded self-adjoint operators
(and therefore also their spectra) have, in a natural way, a physical meaning.
This situation arises if the self-adjoint operators occur as infinitesimal
operators in the representation of a group which has a physical
interpretation. We have encountered such self-adjoint operators of the form
$X_\nu$, $K_\nu$, $H$, $L$ in §2 and §3. Here the spectrum, that is, the set of scale
values, is of crucial importance for the structure of the corresponding group
representations.
Both of the above viewpoints—scales as only a practical tool for the
ordering of a Boolean ring of an observable (as described in IV, §2.5) and
scales as a characteristic of an infinitesimal transformation—are often
confused with each other. In particular, this occurs when the infinitesimal
transformations are, because of their representations in terms of self-adjoint
operators, called observables and are given special names.
Naturally it is permissible to relate some scale observables in this way with
infinitesimal transformations; here the scales are, on the basis of their
definition, no longer arbitrary. Often a sufficient distinction is not made
between the case where, in an experiment, a registration method $b_0$ is used
with a scale which corresponds to a theoretically defined observable, and the
case in which the registrations which were carried out reflect the
transformations and indirectly permit conclusions concerning the spectrum of the
infinitesimal transformation. In the applications presented in this book we
shall be careful to point out which are preparations, which are registrations,
and which are transformations.
In closing this section we shall now make a few remarks concerning the
parallel case in which the Galileo group is replaced by the
Poincaré group. Obviously (4.8) cannot be applied to the relativistic case
because it is not possible to bring two moving systems into “coincidence.”
Here it can be shown that for elementary systems with nonzero mass there
exist position decision observables which satisfy conditions analogous to
(4.2) and (4.3). These position observables are, however, not uniquely determined.
For light quanta (systems of zero mass) there is no such position decision
observable. This is, however, not an argument against the theory. If, in an
experiment with light quanta, something similar to a position is measured,
this measurement does not correspond to a decision observable [22]. Here
we have an example which shows that it is somewhat risky to discuss only the
observable concept which we have called a decision observable. Such a
restricted concept of an observable (namely that of a decision observable)
would not fit all essential experimental procedures.
5 Energy and Angular Momentum Observables
In this section we shall consider two observables for which the theory of
measurement methods is less well known than is the case for the position and
momentum observables described in §4. First, for purely formal reasons, we
may call the observable $H$ of infinitesimal time translations (2.20),
$$U(1, 0, 0, \gamma) = e^{iH\gamma}, \tag{5.1}$$
the energy observable. This does not mean that (as we have already
mentioned in §4) we are able to construct a $b_0$ for which $\psi(b_0, b)$ corresponds
to an approximate measurement of $H$ (in the sense of IV, §4).
If there exist elementary systems, then according to (2.38) $H$ is a function of
$\vec{K}$ and therefore also of $\vec{P}$:
$$H = \frac{1}{2m}\vec{K}^2 = \frac{1}{2m}\vec{P}^2. \tag{5.2}$$
$H$ is therefore (in the sense of IV, D 2.5.5 and IV, D 2.5.6) a scale partial
observable for the momentum observable, that is, $H$ is automatically
determined by the measurement of momentum. This is no longer the case for
composite systems, as we shall find in VIII, §1.
For certain systems experimental physicists have suitable registration
methods for the measurement of $H$. Often, however, $H$ can only be
determined experimentally in an indirect way, namely by means of
$U(1, 0, 0, \gamma)$. For this reason we must use the notion of an energy
observable $H$ carefully in applications.
The operators $L_\nu$ are defined in terms of $U(A, 0, 0, 0)$ by (3.1) in a similar
manner as $H$ is defined in terms of $U(1, 0, 0, \gamma)$. The scale observables $L_\nu$ are
called the components of angular momentum. Again, we are not told how
these observables may be measured. In [2], XI, §7.2 and [2], XII, §2.2 we
show how angular momentum can be measured by means of the Stern-Gerlach
experiment. In many applications, however, the role of the components
of angular momentum as infinitesimal rotations is more
important; a typical example is given by atomic spectra, which are
discussed in XI-XIV.
For elementary systems (2.35) holds, where $R(A)$ are the operators of
irreducible representations $D_s$ in $\mathfrak{r}_s$. In this way $U(A, 0, 0, 0)$ defines not only
the total angular momentum (denoted by $\vec{J}$) but, in addition, $V(A)$ defines the
orbital angular momentum (denoted by $\vec{L}$, as an operator in $\mathcal{H}_b$) and $R(A)$
defines the spin angular momentum (denoted by $\vec{S}$, as an operator in $\mathfrak{r}_s$). By
differentiation of (2.35) we obtain
$$\vec{J} = \vec{L} \times \mathbf{1} + \mathbf{1} \times \vec{S}. \tag{5.3}$$
$\vec{L}$, as an operator in $L^2(\mathbb{R}^3, dx_1\,dx_2\,dx_3)$, is given by (3.42). $\vec{S}$, as an
operator in $\mathfrak{r}_s$, is given by (3.14), (3.15), (3.17), and (3.18) with $s$ instead of $j$.
Here we say that the spin has the fixed value $s$, by which we mean that,
according to (3.15), in $\mathfrak{r}_s$ the relation $\vec{S}^2 = s(s + 1)\mathbf{1}$ holds.
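For the simplest nontrivial case $s = 1/2$, the relation $\vec{S}^2 = s(s+1)\mathbf{1}$ can be verified directly. The sketch assumes the standard representation $S_\nu = \tfrac{1}{2}\sigma_\nu$ by Pauli matrices, which is presumably what (3.14) to (3.18) reduce to for $s = 1/2$; those equations are not reproduced in this section, so this choice is an assumption, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    """Entry-wise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

# Spin-1/2 operators S_nu = sigma_nu / 2 (assumed standard Pauli representation).
S1 = [[0, 0.5], [0.5, 0]]
S2 = [[0, -0.5j], [0.5j, 0]]
S3 = [[0.5, 0], [0, -0.5]]

s = 0.5
S_squared = add(add(matmul(S1, S1), matmul(S2, S2)), matmul(S3, S3))
casimir = [[s * (s + 1), 0], [0, s * (s + 1)]]   # s(s+1) times the unit matrix

print(S_squared == casimir)
```

The check only confirms the algebraic statement; as the text stresses, it gives no instruction for building an apparatus that measures spin.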
These mathematical formulas for L, S, J provide no instructions for the
construction of measurement apparatuses for these observables. In the
theory of atomic spectra the infinitesimal transformations characterized by
the L, S, J play a very important role (see XI-XIV).
6 Time Observable?
In the literature of quantum mechanics the discussion about the so-called
“time observable” has reached vast and overwhelming proportions. Most of
this discussion rests upon a misunderstanding of the concept of an observable.
In the “usual” interpretation of quantum mechanics (the one most
frequently heard by the student) the observable concept is used as a
fundamental concept (see, for example, [2], XI, §1.7 where we have pointed
out the inadequacy of this interpretation). Great lengths are often taken
in order to provide an intuitive justification of this concept of an
observable, and discussion of the difficulties associated with this concept
is avoided in order not to frighten the student excessively.
In this approach the observable concept is introduced as a “quantity
measured by an observer” or as a “measurable quantity,” etc. Clearly, in the
laboratory there exist clocks by which we may “measure time”; therefore
time should also be an observable. In order to counter all such erroneous
interpretations of quantum mechanics we have laid the foundations of the
interpretation of quantum mechanics in II and extended this interpretation
by the structure introduced in VII, §1. Hence it clearly follows that
measurements with “meter sticks” and “clocks” do not constitute measurements of
quantum mechanical observables. For this reason we have developed the
concept of an observable as a derived concept in IV (for a discussion of
derived concepts see [1], §10). Meter sticks and clocks are used only for the
purpose of adjusting and calibrating preparation and registration
apparatuses.
Therefore in quantum mechanics the spatial coordinates $x_1, x_2, x_3$ and the
time coordinate $t$ are only parameters given by the laboratory reference
system! As we have found in §4, the measurement of the position observable
$Q_\nu(t)$ does not mean that at time $t$ (clock time) the coordinates $x_1, x_2, x_3$ of a
microsystem are measured by means of meter sticks, because the microsystem
as such is “not there” in the sense that it can be measured in this way. The
microsystem may only be detected by producing a response in a registration
apparatus.
The introduction of the position observable $Q_\nu(t)$ in §4 clearly shows that it
is concerned only with the possibilities of (idealized) registrations. The claim
that the coordinates $x_1, x_2, x_3$ are defined as the measured values of the $Q_\nu(t)$
is a misunderstanding of quantum mechanics. In fact it is just the opposite:
the technical process by which the coordinates $x_1, x_2, x_3$ are defined in the
laboratory system must be explained independently of quantum mechanics.
After it is understood that an observable $\Sigma \xrightarrow{E} G$ (here $\Sigma$ is the Boolean ring
for the region of space under consideration) is determined by certain
requirements, it is reasonable to also choose the previously defined $x_1, x_2, x_3$
as the scales for this observable. Therefore the $x_1, x_2, x_3$ are definite scales for
the Boolean ring $\Sigma$ of the region of space under consideration and are already
determined by the pre-theories. After $x_1, x_2, x_3$ are defined, the quantum
mechanics of the registration process comes into play as a map $\Sigma \xrightarrow{E} G$.
The next question which we would like to ask, and which is meaningful in the
context of quantum mechanics, is whether there exists a decision observable
$\Sigma(t) \xrightarrow{E} G$, where $\Sigma(t)$ is the Boolean ring of the “time domain,” which
satisfies the following reasonable condition (by analogy with (4.2)):
$$U(1, 0, 0, \gamma)\,E(\sigma)\,U(1, 0, 0, \gamma)^{-1} = E(\sigma + \gamma). \tag{6.1}$$
Using the spectral family $E(\lambda) = E(\sigma)$ for $\sigma = \{t \mid t < \lambda\}$, from (6.1) we obtain
the relation
$$U(1, 0, 0, \gamma)\,E(\lambda)\,U(1, 0, 0, \gamma)^{-1} = E(\lambda + \gamma), \tag{6.2}$$
from which, with
$$U(1, 0, 0, \gamma) = e^{iH\gamma}$$
and
$$e^{iT\alpha} = \int e^{i\lambda\alpha}\,dE(\lambda),$$
we obtain the relation
$$e^{iH\gamma}\,e^{iT\alpha}\,e^{-iH\gamma} = e^{-i\gamma\alpha}\,e^{iT\alpha},$$
which we can also write in the form
$$e^{-iT\alpha}\,e^{iH\gamma}\,e^{iT\alpha} = e^{-i\gamma\alpha}\,e^{iH\gamma}.$$
If we let $\tilde{E}(\omega)$ denote the spectral family of $H$ it follows that
$$e^{-iT\alpha}\,\tilde{E}(\omega)\,e^{iT\alpha} = \tilde{E}(\omega + \alpha).$$
If (for $\omega_2 > \omega_1$) $\tilde{E}(\omega_2) - \tilde{E}(\omega_1) \ne 0$ then we also find that
$$\tilde{E}(\omega_2 + \alpha) - \tilde{E}(\omega_1 + \alpha) \ne 0.$$
Since $\alpha$ may be chosen arbitrarily, it follows that the spectrum of $H$ extends
from $-\infty$ to $+\infty$, in contradiction to (2.38).
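The last step can be condensed as follows. This is a sketch that restates the argument at the level of the generator $H$, using only the relations derived above from (6.2); $\operatorname{spec}$ denotes the spectrum, a notation introduced here for illustration.

```latex
\begin{align*}
e^{-iT\alpha}\, e^{iH\gamma}\, e^{iT\alpha} = e^{-i\gamma\alpha}\, e^{iH\gamma}
  \ \text{ for all } \gamma
  \quad&\Longrightarrow\quad
  e^{-iT\alpha}\, H\, e^{iT\alpha} = H - \alpha\,\mathbf{1}, \\
\text{unitary equivalence}
  \quad&\Longrightarrow\quad
  \operatorname{spec}(H) = \operatorname{spec}(H) - \alpha
  \ \text{ for all real } \alpha .
\end{align*}
% A nonempty subset of the real line that is invariant under all shifts is the
% whole line, which is incompatible with H being bounded from below.
```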
Therefore a decision observable $\Sigma(t) \xrightarrow{E} G$ which satisfies (6.2) does not
exist. For elementary systems this is purely a consequence of the axioms cited
in III, §3 and the conditions imposed on Galileo transformations of the
registration procedures in §1. It does, however, also hold for composite
systems because the Hamiltonian operators of time translation
$$U(1, 0, 0, \gamma) = e^{iH\gamma}$$
are, in all cases, bounded from below (see VIII, §5). Why, however, should an
observable which satisfies (6.2) exist? Is it only because it is “desirable,” even
though such an observable is not realizable?
If we were to succeed in constructing an apparatus which measures a “time
observable” satisfying (6.2), it would be necessary to go beyond quantum
mechanics to a more comprehensive theory which permits an apparatus which
registers the “desired” observable. Clearly a registration apparatus which
contradicts the Heisenberg uncertainty relations or contradicts the assertion
of the nonexistence of an observable $\Sigma(t) \xrightarrow{E} G$ which satisfies (6.2) has not yet been
constructed. Therefore we may consider the nonexistence of the observable
$\Sigma(t) \xrightarrow{E} G$ satisfying (6.2) as a statement about the structure of the real world.
Clearly there exist apparatuses (for example, a particle counter) which
register the time upon the detection of a particle. Is such a counter a
realization of a type of time observable?
This is indeed correct. Let $b_0$ denote such a particle counter (including its
spatial orientation with respect to the preparation apparatus); then we
apparently can register whether a response signal has occurred in the time
interval between $t_1$ and $t_2$. Let $b_n$ be the registration that the counter has not
responded at all; then the various $b \subset b_0 \setminus b_n$ register the time domain in
which the signal has occurred. If we consider the ideal case where the length
of the signal can be ignored, then it is reasonable to proceed from $\mathcal{R}(b_0)$ to the
following Boolean ring $\tilde{\Sigma}(t)$: In $\tilde{\Sigma}(t)$ there is a particular $\sigma_n \in \tilde{\Sigma}(t)$ which is an
atom of $\tilde{\Sigma}(t)$; the set $\{\sigma \mid \sigma \le \varepsilon \dotplus \sigma_n\}$ ($\varepsilon$ is the unit element of $\tilde{\Sigma}(t)$) forms a
Boolean ring (with $\varepsilon \dotplus \sigma_n$ as the unit element) which is isomorphic to the
Boolean ring of the time domain which was denoted by $\Sigma(t)$. Therefore $\mathcal{R}(b_0)$
can be considered to be an approximation of $\tilde{\Sigma}(t)$, where the $\psi(b_0, b)$
represent an approximation to an observable $\tilde{\Sigma}(t) \xrightarrow{F} L$. Therefore $\tilde{\Sigma}(t) \xrightarrow{F} L$
is obviously a type of “time observable.”
Does this observable satisfy the following relationship
$$U(1, 0, 0, \gamma)\,F(\sigma)\,U(1, 0, 0, \gamma)^{-1} = F(\sigma + \gamma), \tag{6.3}$$
which is analogous to (6.1)? For real counters the $\psi(b_0, b)$ cannot, of course,
exactly satisfy (6.3) because the counter can only be turned on for a finite
time, and is therefore usable only for certain registrations $b$ which are not
exactly at the beginning or the end of the “on” cycle of the counter. For this
reason we should find that (6.3) is satisfied by $\psi(b_0, b) = F(\sigma)$ if $\gamma$ is
sufficiently small. Therefore it is conceivable to require (6.3) for the
idealization $\tilde{\Sigma}(t) \xrightarrow{F} L$. The following additional idealization is, according to the
previous discussions, not allowed: $\tilde{\Sigma}(t) \xrightarrow{F} L$ cannot be a decision
observable!
Apparently this is precisely the point where errors are often made. Since
we are often only familiar with observables which are decision observables, it
is often thought that the signal for a counter is a “yes-no” response of a
decision observable, that is, that the registration “the signal occurs in the
interval $t_1$ to $t_2$” must correspond to a projection operator (in our notation, to
a $\psi(b_0, b) \in G$). This is clearly an error arising from an inadequate
interpretation of the mathematical framework of quantum mechanics.
In order to show that there exist observables $\tilde{\Sigma}(t) \xrightarrow{F} L$ which satisfy (6.3)
we shall now give an example. We note that (6.3) is equivalent to the
following equation (which is analogous to (6.2))
$$U(1, 0, 0, \gamma)\,F(\lambda)\,U(1, 0, 0, \gamma)^{-1} = F(\lambda + \gamma), \tag{6.4}$$
so that we need only to exhibit a general spectral family which satisfies (6.4).
In $\mathcal{H}_b = L^2(\mathbb{R}^3, dx_1\,dx_2\,dx_3)$ we set
$$F(\lambda) = \alpha \int_{-\infty}^{\lambda} e^{iHt}\,e^{-r^2/\rho^2}\,e^{-iHt}\,dt, \tag{6.5}$$
where $e^{-r^2/\rho^2}$ is the operator consisting of multiplication by $e^{-r^2/\rho^2}$. Since
$e^{-r^2/\rho^2}$ is a positive operator, the same is true for $e^{iHt}\,e^{-r^2/\rho^2}\,e^{-iHt}$. Therefore
the operator
$$\int_{\lambda_1}^{\lambda_2} e^{iHt}\,e^{-r^2/\rho^2}\,e^{-iHt}\,dt \qquad (\lambda_1 < \lambda_2)$$
is always a positive operator. If we show that
$$\int_{-\infty}^{\infty} e^{iHt}\,e^{-r^2/\rho^2}\,e^{-iHt}\,dt \tag{6.6}$$
is a bounded operator, then $\alpha$ in (6.5) can be chosen such that $F(\infty) \le \mathbf{1}$.
With $F(\sigma_n) = \mathbf{1} - F(\infty)$ we obtain an example for an observable $\tilde{\Sigma}(t) \xrightarrow{F} L$
which satisfies (6.3), because from (6.5) it follows that
$$e^{iH\gamma}\,F(\lambda)\,e^{-iH\gamma} = \alpha \int_{-\infty}^{\lambda} e^{iH(t+\gamma)}\,e^{-r^2/\rho^2}\,e^{-iH(t+\gamma)}\,dt = \alpha \int_{-\infty}^{\lambda+\gamma} e^{iH\tau}\,e^{-r^2/\rho^2}\,e^{-iH\tau}\,d\tau = F(\lambda + \gamma).$$
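In order to show that (6.6) is a bounded operator we transform from
$L^2(\mathbb{R}^3, dx_1\,dx_2\,dx_3)$ to $L^2(\mathbb{R}^3, dk_1\,dk_2\,dk_3)$ using the inverse formula to
(3.40)
$$\varphi(\vec{k}) = \frac{1}{(2\pi)^{3/2}} \int e^{-i\vec{k}\cdot\vec{x}}\,\psi(\vec{x})\,dx_1\,dx_2\,dx_3.$$
In this space the operator (6.6) takes on the form
$$\varphi(\vec{k}) \to \varphi'(\vec{k}) = \alpha_1 \int dt \int e^{(i/2m)k^2 t}\,e^{-(\rho^2/4)|\vec{k}-\vec{k}'|^2}\,e^{-(i/2m)k'^2 t}\,\varphi(\vec{k}')\,dk'_1\,dk'_2\,dk'_3, \tag{6.7}$$
where the factor $\alpha_1$ is not of any further interest. The positive operator (6.6) is
bounded if
$$\langle \varphi(\vec{k}), \varphi'(\vec{k}) \rangle = \int \overline{\varphi(\vec{k})}\,\varphi'(\vec{k})\,dk_1\,dk_2\,dk_3 \le C \int |\varphi(\vec{k})|^2\,dk_1\,dk_2\,dk_3$$
is satisfied for a particular value of $C$. It is sufficient to show that this
condition is satisfied for continuous $\varphi(\vec{k})$ because the latter are dense in $\mathcal{H}_b$.
For continuous $\varphi(\vec{k})$ we may carry out the integration over $t$ in (6.7) by means of the
Fourier transform. If we introduce polar coordinates for $\vec{k}$ and $\vec{k}'$ we obtain
$$\langle \varphi(\vec{k}), \varphi'(\vec{k}) \rangle = \alpha_2 \int \overline{\varphi(k, \theta, \phi)}\,e^{-(k^2\rho^2/4)|\vec{e}-\vec{e}'|^2}\,\varphi(k, \theta', \phi')\,k^3\,dk\,d\omega'\,d\omega, \tag{6.8}$$
where $d\omega$, $d\omega'$ are area elements of the unit sphere (or elements of solid
angle) and $\vec{e}$, $\vec{e}'$ are unit vectors in the directions $\theta, \phi$ and $\theta', \phi'$.
If we introduce the expansion
$$\varphi(k, \theta, \phi) = \sum_{l,m} \chi_{l,m}(k)\,Y^l_m(\theta, \phi)$$
into (6.8) we must compute the following integrals:
$$\int\!\!\int \overline{Y^l_m(\theta, \phi)}\,e^{-(k^2\rho^2/4)|\vec{e}-\vec{e}'|^2}\,Y^{l'}_{m'}(\theta', \phi')\,d\omega\,d\omega'. \tag{6.9}$$
Since the operator defined by the integral kernel
$$e^{-(k^2\rho^2/4)|\vec{e}-\vec{e}'|^2},$$
as an operator on functions on the unit sphere, commutes with rotations, the
$Y^l_m$ must be eigenfunctions of this operator, where for fixed $l$ all $Y^l_m$ must
correspond to the same eigenvalue:
$$\int e^{-(k^2\rho^2/4)|\vec{e}-\vec{e}'|^2}\,Y^l_m(\theta', \phi')\,d\omega' = c_l(k)\,Y^l_m(\theta, \phi). \tag{6.10}$$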
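We only need to estimate $c_l(k)$. For this purpose we shall set $m = 0$ in (6.10).
Then, according to (3.55), $Y^l_0$ is real. Let $\theta_m$ denote the location of the
maximum of $Y^l_0(\theta)$ and let $Y^l_0(\theta_m) = d_l$. For $\vec{e} = \vec{e}_m$ in the direction $\theta_m$, from
(6.10) it follows that
$$c_l(k)\,d_l = \int e^{-(k^2\rho^2/4)|\vec{e}_m - \vec{e}'|^2}\,Y^l_0(\theta')\,d\omega' \le d_l \int e^{-(k^2\rho^2/4)|\vec{e}_m - \vec{e}'|^2}\,d\omega'. \tag{6.11}$$
Since the last integral is independent of the direction of $\vec{e}_m$ we may choose $\vec{e}_m$
to be the direction of the polar axis, and by setting $\xi = \cos\theta$ we obtain
$$\int e^{-(k^2\rho^2/4)|\vec{e}_m - \vec{e}'|^2}\,d\omega' = 2\pi \int_{-1}^{1} e^{-(k^2\rho^2/2)(1-\xi)}\,d\xi = \frac{4\pi}{k^2\rho^2}\left(1 - e^{-k^2\rho^2}\right).$$
Thus, with the above result and (6.11) we obtain, for $c_l(k)$:
$$0 < c_l(k) \le \frac{4\pi}{k^2\rho^2}\left(1 - e^{-k^2\rho^2}\right). \tag{6.12}$$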
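The closed form used for (6.12) can be checked numerically. The sketch below evaluates $2\pi\int_{-1}^{1} e^{-(a/2)(1-\xi)}\,d\xi$ with $a = k^2\rho^2$ by composite Simpson quadrature and compares it with $\frac{4\pi}{a}(1 - e^{-a})$. The function names and the sample values of $a$ are arbitrary choices made for illustration.

```python
import math

def solid_angle_integral(a, n=2000):
    """2*pi * integral over xi in [-1, 1] of exp(-(a/2)*(1 - xi)), Simpson's rule.

    Here a stands for k**2 * rho**2 and xi = cos(theta); n must be even.
    """
    h = 2.0 / n
    total = 0.0
    for j in range(n + 1):
        xi = -1.0 + j * h
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += weight * math.exp(-0.5 * a * (1.0 - xi))
    return 2.0 * math.pi * total * h / 3.0

def closed_form(a):
    """Right-hand side (4*pi/a) * (1 - exp(-a))."""
    return 4.0 * math.pi / a * (1.0 - math.exp(-a))

for a in (0.1, 1.0, 7.5):
    print(a, solid_angle_integral(a), closed_form(a))
```

The exponent uses $|\vec{e}_m - \vec{e}'|^2 = 2(1 - \xi)$ for unit vectors, which is where the factor $k^2\rho^2/2$ in the integrand comes from.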
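Combining (6.8) with (6.9)-(6.12) we obtain
$$\langle \varphi(\vec{k}), \varphi'(\vec{k}) \rangle \le \alpha_3 \sum_{l,m} \int_0^\infty |\chi_{l,m}(k)|^2\,\frac{1 - e^{-k^2\rho^2}}{k}\,k^2\,dk. \tag{6.13}$$
From
$$\frac{1 - e^{-k^2\rho^2}}{k\rho} \le 1$$
we finally obtain
$$\langle \varphi(\vec{k}), \varphi'(\vec{k}) \rangle \le C \sum_{l,m} \int_0^\infty |\chi_{l,m}(k)|^2\,k^2\,dk = C \int |\varphi(\vec{k})|^2\,dk_1\,dk_2\,dk_3.$$
Thus we have proven that the operator (6.6) is bounded.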
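The elementary inequality $(1 - e^{-k^2\rho^2})/(k\rho) \le 1$ used in the last step can be checked as follows. With $x = k\rho$ the sketch scans $g(x) = (1 - e^{-x^2})/x$ on a grid; the grid and the names are illustrative assumptions, and a scan is of course a plausibility check rather than a proof (the analytic reason is given in the comments).

```python
import math

def g(x):
    """g(x) = (1 - exp(-x**2)) / x for x > 0, where x stands for k * rho."""
    return -math.expm1(-x * x) / x

# Analytic reason for g(x) <= 1:
#   for 0 < x <= 1:  1 - exp(-x^2) <= x^2 <= x,
#   for x >= 1:      1 - exp(-x^2) <  1  <= x.
xs = [0.001 * j for j in range(1, 20001)]      # grid on (0, 20]
g_max = max(g(x) for x in xs)
x_at_max = max(xs, key=g)
print(g_max, x_at_max)
```

Since the maximum of $g$ lies well below 1, the constant $C$ in the final estimate can be taken proportional to $\alpha_3\rho$.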
For a real counter we neither have $\psi(b_0, b) \in G$ nor does (6.3) hold for all $\sigma$
and $\gamma$. Therefore no experimental evidence exists against the assertion of the
theory that there is no observable $\Sigma(t) \xrightarrow{E} G$ which satisfies (6.1). The
resistance to this fact is analogous to the resistance to the entropy theorem:
it contradicts cherished beliefs.
The existence of the observable “position at time $t$” introduced in §4
represents a type of position-time measurement. In contrast with the usual
situation in classical mechanics, the observables “position at time $t_1$” and
“position at time $t_2$” are, for $t_1 \ne t_2$, not commensurable (in the sense of
IV, D 3.1 and D 3.2). This follows simply from the fact that $Q_\nu(t_1)$ and
$Q_\nu(t_2)$ do not commute for $t_1 \ne t_2$ (see IV, Th. 3.2).
From (4.16), (4.21), and (2.41) it follows that
$$Q_\nu(t_2) = Q_\nu(t_1) + \frac{t_2 - t_1}{m}\,P_\nu \tag{6.14}$$
and therefore that
$$Q_\nu(t_2)\,Q_\mu(t_1) - Q_\mu(t_1)\,Q_\nu(t_2) = \frac{t_2 - t_1}{m}\bigl(P_\nu\,Q_\mu(t_1) - Q_\mu(t_1)\,P_\nu\bigr) = \frac{t_2 - t_1}{im}\,\delta_{\nu\mu}\,\mathbf{1}. \tag{6.15}$$
Therefore, there is no possible way to jointly measure the positions of a
microsystem at two different times! This does not, however, mean that a
microsystem cannot exist having the two pseudoproperties “position in $\sigma_1$ at
time $t_1$” and “position in $\sigma_2$ at time $t_2$” (IV, §8.3); for example, $x$ was prepared
having a position in $\sigma_1$ at time $t_1$ and was registered as having a position in $\sigma_2$
at time $t_2$.
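Relation (6.14) is the familiar free motion of the position operator. As a sketch, assuming $H = \frac{1}{2m}\vec{P}^2$ for an elementary system (cf. (2.38), which is not reproduced in this section) and the commutation relation (4.22) in the form $P_\mu Q_\nu - Q_\nu P_\mu = \frac{1}{i}\delta_{\mu\nu}\mathbf{1}$, differentiating (4.16) with respect to $t$ gives:

```latex
\begin{align*}
\frac{d}{dt}\,Q_\nu(t)
  &= e^{iHt}\; i\,[H,\,Q_\nu(0)]\; e^{-iHt}, \\
i\,[H,\,Q_\nu(0)]
  &= \frac{i}{2m}\sum_\mu \Bigl( P_\mu\,[P_\mu,\,Q_\nu(0)] + [P_\mu,\,Q_\nu(0)]\,P_\mu \Bigr)
   = \frac{i}{2m}\cdot\frac{2}{i}\,P_\nu
   = \frac{1}{m}\,P_\nu , \\
Q_\nu(t_2) - Q_\nu(t_1)
  &= \int_{t_1}^{t_2} e^{iHt}\,\frac{1}{m}\,P_\nu\,e^{-iHt}\,dt
   = \frac{t_2 - t_1}{m}\,P_\nu .
\end{align*}
```

The last equality uses that $e^{iHt}$ commutes with the $P_\nu$, as noted in §4.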
7 Spatial Reflections (Parity Transformations)
Up to now we have considered the Galileo group as the group which is
continuously connected to the unit element. This group can be given a
well-defined meaning in terms of transformations of the registration apparatus. In
practical applications, however, discontinuous transformations, such as the
space reflection $r$: $x'_\nu = -x_\nu$, play an important role. If we expand the
Galileo group $\mathcal{G}$ by admitting transformations $A$ for which the determinant
$|A| = -1$, we obtain a new group $\mathcal{G}(r)$ which can be decomposed into two
disconnected components: $\mathcal{G}$ as an invariant subgroup of $\mathcal{G}(r)$, and the coset
$r\mathcal{G}$, where $r$ is the spatial reflection transformation. If we choose, we may
replace the reflection $r$ by a transformation in which only one of the
coordinates is reversed, that is, by
$$r_1: \quad x'_1 = -x_1, \quad x'_2 = x_2, \quad x'_3 = x_3. \tag{7.1}$$
We then obtain $r\mathcal{G} = r_1\mathcal{G}$.
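That $r$ and $r_1$ generate the same coset follows from $r = r_1 D$, where $D$ is the rotation by $\pi$ about the 1-axis and therefore an element of the connected group. A short check with $3 \times 3$ matrices (the helper names are hypothetical):

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

r = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]       # full space reflection x' = -x
r1 = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]        # reflection of the first coordinate, (7.1)
D = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]        # rotation by pi about the 1-axis

print(matmul3(r1, D) == r, det3(D), det3(r), det3(r1))
```

Because $D$ has determinant $+1$ it is a proper rotation, so $r$ and $r_1$ differ only by an element of the original group.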
We may obtain a physical interpretation of the entire group $\mathcal{G}(r)$ if we give
meaning to one of the transformations $r$ or $r_1$. Here we are confronted by a
problem which has often been neglected. While it is clear what it means to
rotate or translate an apparatus, it is not clear what it means to subject an
apparatus to a reflection $r$. The transformation (7.1) does not establish what
should be done with an entire apparatus.
We can visualize (7.1) as reflecting the apparatus in a mirror located at the
(2, 3) plane. Here we would see the mirror image of the apparatus. The
mirror image is, however, not an actual apparatus. That the
production of an apparatus which is the mirror image of the original
apparatus is not trivial can be seen from the example of a person: it may
well be impossible.
However, the mirror image only establishes the spatial organization of the
components of the apparatus by means of the transformation (7.1).
How do such things as electric charges, electric and magnetic fields,
etc., change in the apparatus? The transformation (7.1) therefore does not
establish how we should determine the corresponding transformations of
registration apparatuses. The arbitrariness of the application of (7.1) to an
apparatus is not sufficiently noted, and plays an important role for the
so-called “elementary particle physics.”
In nonrelativistic quantum mechanics the action of the reflection $r$ (it is
mathematically simpler to deal with $r$ rather than with $r_1$) is defined as
follows: for the apparatus $b_0$ a new apparatus is built in which the spatial
organization of the components and the spatial placement are changed in the
sense of the transformation $r$ without changing the charges present. In spite
of the objections implicit in the example of a person described above, we
assume that we may build such a “reflected” apparatus. Axiomatically we
require that for $r$ there exists a p-continuous r-automorphism such that,
together with the interpretation of elements of $\mathcal{G}$ in §1, we obtain a
representation of $\mathcal{G}(r)$ by means of p-continuous r-automorphisms.
From the representation of $\mathcal{G}(r)$ by means of p-continuous r-automorphisms
there arises a representation by means of $\mathcal{B}$-continuous effect automorphisms.
Since $\mathcal{G}$ is not connected with $r\mathcal{G}$, according to VI, §3.2 we cannot
exclude the possibility that the elements of $r\mathcal{G}$ transform one system type into
another. Only experience will lead us to impose the requirement that the $\mathcal{B}$-continuous
effect automorphism corresponding to $r$ transforms an $F$ of the
form $(0, 0, \ldots, F_\nu, 0, \ldots)$ into an $F$ of the same form $(0, 0, \ldots, F_\nu, 0, \ldots)$.
This is equivalent to the condition that the effect automorphism corresponding
to $r$ leave the “objective properties” (IV, §8.1) invariant. Thus, according
to VI, §3.2 and §3.3 we may again restrict ourselves to a Hilbert space of a
single elementary system type and consider the representation of $\mathcal{G}(r)$ in $\mathcal{H}$ by
means of unitary or anti-unitary operators up to a factor. According to
IV, §3.2 we must be able to represent $\mathcal{G}$ by means of unitary transformations,
that is, we may assume the previous results about the representation of $\mathcal{G}$. In
principle the elements of $r\mathcal{G}$ may also be represented by means of anti-unitary
operators; in particular, the same is true of $r$ itself. Let $U(r)$ denote the
operator representing $r$.
From $(A, \delta, \eta, \gamma)\,r = r\,(A, -\delta, -\eta, \gamma)$ it follows that
$$U(A, \delta, \eta, \gamma)\,U(r)\,U(A, -\delta, -\eta, \gamma)^{-1} = \lambda(A, \delta, \eta, \gamma)\,U(r). \tag{7.2}$$
For $\varphi(k) \in \mathcal{L}^2(R^3, dk_1\,dk_2\,dk_3)$ we define a unitary operator in
$\mathcal{H} = \mathcal{H}_b \times \mathfrak{r}_s$ by $R \times 1$, where
$$R\varphi(k) = \varphi(-k).$$
It follows then, by simple computation, that
$$U(A, \delta, \eta, \gamma)(R \times 1) = (R \times 1)\,U(A, -\delta, -\eta, \gamma).$$
If we multiply (7.2) on the right by $R \times 1$ we obtain:
$$U(A, \delta, \eta, \gamma)\,U(r)(R \times 1)\,U(A, \delta, \eta, \gamma)^{-1} = \lambda(A, \delta, \eta, \gamma)\,U(r)(R \times 1). \tag{7.3}$$
Thus it follows that the $\lambda(A, \delta, \eta, \gamma)$ form a one-dimensional unitary
representation of the Galileo group and therefore must be equal to 1. Thus
(7.3) states that $U(r)(R \times 1)$ commutes with all $U(A, \delta, \eta, \gamma)$. It follows that for
an elementary system $U(r)(R \times 1)$ cannot be anti-unitary; therefore $U(r)$ is
not anti-unitary; therefore $U(r)(R \times 1)$ must be a multiple of the unit
operator. Since a factor in $U(r)$ is arbitrary, we may therefore choose this
factor such that
$$U(r) = R \times 1. \tag{7.4}$$
Here we stress the fact that (7.4) means that in the spin space U(r) behaves
like the unit operator.
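The content of (7.4) can be illustrated on a finite grid. The sketch below is only an illustration under stated assumptions: a one-dimensional symmetric momentum grid stands in for $R^3$, spin 1/2 is chosen as an example, and all variable names are hypothetical.

```python
import numpy as np

n = 7                      # odd number of grid points, symmetric about k = 0 (illustrative)
k = np.linspace(-3, 3, n)  # one-dimensional momentum grid standing in for R^3
R = np.eye(n)[::-1]        # (R phi)(k) = phi(-k): reverses the order of the grid values
spin_dim = 2               # spin 1/2, chosen only as an example
U_r = np.kron(R, np.eye(spin_dim))   # U(r) = R x 1, cf. (7.4)

phi = np.exp(-(k - 1.0) ** 2)        # a wave packet centred at k = +1
reflected = R @ phi                  # the same packet, now centred at k = -1

dim = n * spin_dim
is_unitary = np.allclose(U_r.conj().T @ U_r, np.eye(dim))
is_involution = np.allclose(U_r @ U_r, np.eye(dim))

# Any spin operator 1 x S commutes with U(r), because U(r) acts as the
# unit operator in the spin space.
S = np.array([[0.0, 1.0], [1.0, 0.0]])
one_x_S = np.kron(np.eye(n), S)
commutes_with_spin = np.allclose(U_r @ one_x_S, one_x_S @ U_r)
```

In this discretization the reflection only reverses the momentum argument; the spin components of a state are carried along unchanged.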
We have given a physical meaning to the reflection as a transformation of
registration procedures. In the applications of quantum mechanics we will
use additional unitary symmetry transformations. Not all of these can be
physically interpreted. This is true, for example, for many of the permutations
used in VIII, §4 and XII-XV. The application of such symmetry
transformations is legitimate if it illuminates the mathematical structure for a
problem.
In addition to the spatial reflection r we shall often consider another type
of reflection transformation—“motion reversal.” We will introduce this
transformation and its physical meaning in X, §4.
8 The Problem of the Space $\mathcal{D}$ for Elementary Systems
In this section we shall consider the problem of the space $\mathcal{D}$, which was
previously discussed in VI, only for the case of elementary systems. For
composite systems we shall present a brief discussion in VIII, §7.
A single type of elementary system is described by a single Hilbert space
$\mathcal{H}$ and an irreducible representation of the Galileo group characterized by
mass $m$ and spin $s$.
The following attempt to introduce the space $\mathcal{D}$ is particularly fascinating
for physics: For the Galileo group we shall define the uniform structure
which characterizes physical imprecision and is denoted by ph in VI, §1.1. It
is physically reasonable to do this in the following manner: In both the
three-dimensional spaces of $\eta$ and $\delta$ we introduce a uniform structure in an analogous
way. We shall now write the formula only for $\eta$.
The equations $e = \eta/|\eta|$ and $\rho = \arctan(|\eta|)$ define a bijective map $\eta \leftrightarrow \rho e$
of the infinite three-dimensional space onto the finite three-dimensional
space of points within a sphere of radius $\pi/2$. We define the uniform structure
ph in the space of $\eta$ as the Euclidean uniform structure in this sphere. It is
easy to verify that the uniform structure ph in the space of $\eta$ also generates the
Euclidean topology. It is easy to compactify the space of $\eta$ with respect to
ph: we need only add the surface of the sphere of radius $\pi/2$ to
the space of $\rho e$.
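The map $\eta \leftrightarrow \rho e$ and the resulting loss of distinguishability at large $|\eta|$ can be sketched numerically. The function names `compactify`, `decompactify` and `ph_distance` are illustrative choices, not notation from the text; the distance is simply the Euclidean distance inside the sphere of radius $\pi/2$.

```python
import numpy as np

def compactify(eta):
    """Map eta in R^3 to rho*e inside the ball of radius pi/2."""
    eta = np.asarray(eta, dtype=float)
    norm = np.linalg.norm(eta)
    if norm == 0.0:
        return np.zeros(3)
    return np.arctan(norm) * eta / norm

def decompactify(point):
    """Inverse map: recover eta from rho*e with 0 <= rho < pi/2."""
    point = np.asarray(point, dtype=float)
    rho = np.linalg.norm(point)
    if rho == 0.0:
        return np.zeros(3)
    return np.tan(rho) * point / rho

def ph_distance(eta1, eta2):
    """Euclidean distance between the images of eta1 and eta2 in the ball."""
    return float(np.linalg.norm(compactify(eta1) - compactify(eta2)))

near = ph_distance([1, 0, 0], [2, 0, 0])             # moderate |eta|
far = ph_distance([1000, 0, 0], [2000, 0, 0])        # same direction, large |eta|
direction = ph_distance([1000, 0, 0], [0, 1000, 0])  # different directions, large |eta|
round_trip = decompactify(compactify([3.0, -4.0, 12.0]))
```

Two large translations along the same direction are almost indistinguishable in this distance, while two large translations along different directions remain well separated.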
We proceed in the same way for $\delta$ and for the time translation $\gamma$. Thus we
obtain a uniform structure ph in $\mathcal{G}$. This expresses the fact that, for large $|\eta|$,
the physical distinguishability of a pair of group elements $(1, 0, \eta_1, 0)$
and $(1, 0, \eta_2, 0)$ is good only with respect to the direction $e$; for increasing $|\eta|$
physical distinguishability deteriorates. In this way we eliminate the idealized
“infinity” from the transformations. A similar situation is found for the case
of the transformations $(1, \delta, 0, 0)$ and for the time translations $(1, 0, 0, \gamma)$.
The more rapid the motion of the registration apparatus relative to the
laboratory reference system, the more certain is the direction of motion, but
the less certain is the absolute magnitude of the velocity. In a
similar way the magnitude of the displacement becomes less certain for large
time translations.
Therefore it is physically reasonable that the real registration procedures
are such that the probabilities under displacement of the apparatus for large
$\eta$ do not depend strongly upon $|\eta|$, and similarly for large $\delta$ and for large $\gamma$. We
may therefore assert that $\mathcal{D}$ is identical to the space $\Delta$ of VI, D 1.2.4. To this
end we need to prove that the space $\Delta$ is norm-separable, as we assumed. We
shall not prove this here; see [19] for the proof.
For elementary systems the space $\Delta$ has not yet been sufficiently analyzed.
In addition, $\Delta'$ (the dual Banach space to $\Delta$, in which $\mathcal{B}$ can be embedded) and
the closure $\overline{K}^{\sigma}$ of $K$ in $\Delta'$ in the $\sigma(\Delta', \Delta)$-topology are not well known (see [19]).
Perhaps in the structure of $\partial_e \overline{K}^{\sigma}$ there is a path to a new mathematical
method for the treatment of quantum mechanical problems.
Since $\Delta$ is norm-separable there are good physical reasons for the assertion
that $\mathcal{D} = \Delta$. Since the structure of $\Delta$ is not known in detail, the reader should
be aware of additional possibilities for the definition of $\mathcal{D}$.
For the above reason, we shall now proceed in the opposite direction (as
shown in VI, §1.2): Using the subset $\Lambda$ we choose $\mathcal{D} = \mathcal{D}_\Lambda$ and we choose the
$\Lambda$-uniform structure (VI, D 1.2.7) in $\mathcal{G}$ as the ph-uniform structure. As a result
we lose the physical intuition for ph, but we do realize the possibility to freely
choose $\Lambda$ within certain limitations.
A first but somewhat radical choice for $\Lambda$ consists of the selection
$\Lambda = \mathcal{D}_\Lambda$, where we construct $\mathcal{D}$ from the position, momentum, and angular
momentum observables. Consider the set of continuous functions $\chi$ in $R^3$
with compact support. Then we construct the norm-closed algebra generated
by the $\chi(P)$, $\chi(Q)$ and all the $U(A, 0, 0, 0)$. This algebra is norm-separable
by construction. For $\mathcal{D}_\Lambda$ we choose the subspace of all self-adjoint
operators from this algebra. Since the representation of the Galileo group is
irreducible, $\mathcal{D}_\Lambda$ separates the elements of $K$ and is therefore $\sigma(\mathcal{B}', \mathcal{B})$-dense in
$\mathcal{B}'$.
It is easy to see that the $\Lambda$-uniform structure on $\mathcal{G}$ generates the same
topology as does the original topology on $\mathcal{G}$. Thus we need only construct a
$\Lambda$-neighborhood of the unit element of $\mathcal{G}$ which is compact in the original
topology, that is, a subset in $\mathcal{G}$ which is bounded in $\eta$, $\delta$, $\gamma$. Thus, together
with $U$ as the representative transformation we need only consider
$$\mathrm{tr}(U^{+} P_\varphi U \chi(P)), \qquad \mathrm{tr}(U^{+} P_\psi U \chi(Q))$$
for a $\varphi$ for which $\langle k \mid \varphi \rangle$ is concentrated in momentum space, a $\psi$ for which
$\langle r \mid \psi \rangle$ is concentrated in position space,$^1$ and a $\chi$ which is concentrated in
$R^3$.
By means of the algebraic construction we have obtained the possibility for
the selection of $\mathcal{D}$; however, we note that because of its algebraic nature, it
does not have a clear intuitive physical interpretation. An essential aspect of
the description of quantum mechanics presented in this book is that
algebraic operations such as products of operators in Hilbert space do not
have a clearly evident physical interpretation.
We note, however, that even for this choice of $\mathcal{D}$ the structures of $\mathcal{D}'$,
$\overline{K}^{\sigma}$, and $\partial_e \overline{K}^{\sigma}$ are not well understood (see [19]).
The following approach would be more satisfying. Consider the previously
described set of continuous functions $\chi$ in $R^3$ satisfying $0 \le \chi \le 1$ and having
compact support. Then it is plausible that if we consider $Q(0)$ to be the
1. For $\langle k \mid \varphi \rangle$, $\langle r \mid \psi \rangle$ see IX, §5 and §6.
“position operator at time $t = 0$” then we are able to register the effect
$\chi(Q(0))$. Then, by time displacement, we are able to measure the effects
$$e^{iHt}\chi(Q(0))e^{-iHt} = \chi(Q(t)) = \chi\!\left(Q(0) + \frac{t}{m}P\right).$$
Let us choose Л to be the set of these effects. Clearly ^A c= 3)K. A is norm-
separable. The investigations in [23] indirectly show that A is J^-dense
in Ь{Ж)\ we may therefore choose 2 = (see VI, §1.2). We may therefore
choose the $\Lambda$-uniform structure as the physical uniform structure on $\mathcal{G}$. It
remains to show whether it generates the same topology on $\mathcal{G}$ as does the
original topology.
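The form of the time-displaced position operator used for these effects can be checked by a short computation. The sketch assumes, for a free elementary system, $H = P^2/2m$ up to an additive constant, the commutation relation $[Q_\mu, P_\nu] = i\delta_{\mu\nu}$, and units with $\hbar = 1$; these conventions are assumptions made here for illustration.

```latex
\begin{aligned}
Q(t) &= e^{iHt}\,Q(0)\,e^{-iHt}, \\
\frac{d}{dt}Q(t) &= e^{iHt}\; i\Big[\frac{P^2}{2m},\,Q(0)\Big]\, e^{-iHt}
  = e^{iHt}\,\frac{i}{2m}\,(-2i\,P)\,e^{-iHt} = \frac{1}{m}\,P, \\
Q(t) &= Q(0) + \frac{t}{m}\,P .
\end{aligned}
```

Since $e^{iHt}$ is unitary, the functional calculus then gives $e^{iHt}\chi(Q(0))e^{-iHt} = \chi(Q(t))$ for every continuous $\chi$.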
Here we have pointed out several problems of a fundamental nature
because we must describe the real measurement possibilities (that is, the
actual ability to distinguish between different ensembles). Here we have not
successfully obtained a unique “solution of these physical questions” by
means of the formulation of axioms concerning $\mathcal{D}$ (see also the discussion of
this problem in [19]). It is also important to state the open questions in order
to give a better estimate of the current state of the theory.
9 The Problem of Differentiability
Differentiable functions, differentiable manifolds, etc. appear in many mathematical
formulations $\mathcal{MT}$ of physical theories. Is differentiability an
essential component of physics (that is, does differentiability represent an
aspect of the structure of reality) or is it an artifact and a convenient
mathematical idealization?
There has been much philosophical discussion about this question. If a
physical theory is not based upon an axiomatic basis (see I, §1 or [1], §7.3)
then there is little that can be said about this problem. Since we used an
axiomatic basis for the development of quantum mechanics presented here,
we are able to apply the methods described in [1], §10. However, the
question whether differentiability is not merely a mathematical idealization
still remains.
Without using the methods of [1], §10 we may explicitly determine how
differentiability arises in the mathematical formulation $\mathcal{MT}$ of quantum
mechanics. Obviously the structure introduced in II-VII, §1 does not have
any axioms about differentiability. Note that the Galileo group is initially
introduced as a topological group, where the group structure and the
topology (more precisely, the uniform structure ph) reflect certain aspects of
reality.
By selecting a particular parameterization of the Galileo group we obtain a
differentiable manifold: $\mathcal{G}$ then becomes a Lie group. Pontrjagin [24] has
shown that, for compact and finite-dimensional groups, there always exist
such parameters by which the group $\mathcal{G}$ can be made into a Lie group. This is
also the case for many locally compact groups, in particular, for the Galileo
group [25].
For these groups the structure “differentiable manifold” may be derived
from the group structure and the topology. Therefore it is not necessary to
introduce additional axioms. This fact is important for the following two
reasons:
First, the structure “differentiable manifolds” represents a structure
(clearly, in idealized form) which represents certain aspects of reality. Second,
this structure represents nothing which is not already present in the topology
(more accurately in the uniform structure ph of physical imprecision) together
with the group structure.
Nevertheless it is, of course, correct that the differentiability structure
represents an idealization about reality, the basis of which is nothing other
than the idealization of a topological group. Let us consider the case of a
translation group. For a given translation a it is always possible to find
another translation b, the square of which generates the given translation,
that is, $b^2 = a$. However, is this true for smaller and smaller translations?
Consider, for example, translations of an apparatus in the laboratory, as we
have done in §1. Obviously we cannot give an answer to this question at the
present time. We can express our lack of knowledge in mathematical terms
by the following idealization: There exist arbitrarily small translations; but
in order to proceed away from idealizations it is necessary to introduce
“uncertainty sets” in the neighborhood of the unit element of the group, that
is, make $\mathcal{G}$ into a topological group (see VI, §1). The idealization of group
elements which are “arbitrarily close” to the unit element leads to the
differentiability structure for the Galileo group.
The above assertion that the differentiable structure represents “something”
of the structure of reality may yet be made more precise as follows: It
expresses in idealized form the fact that the group structure describes what
happens in the neighborhood of the unit element idealized in the form of an
infinitesimal transformation, that is, in terms of the Lie algebra which
corresponds to the structure of the group. The well-known mathematical
result that the Lie algebra determines the group locally is only a mathematical
expression of this fact.
If we have recognized the physical meaning of the mathematical structure,
we could hardly then quarrel about whether it is physically correct to use
Hilbert space for the description of quantum mechanics, or whether it is more
correct to use subspaces in which all finite products of the operators
$K$, $X$, $L$, $H$ are defined. The answer to such a dispute is very simple: It is not
a physical problem but a question about the method of computation, that is,
concerning which mathematical methods are best suited for the solution of
physical problems. Methods can be judged only by their usefulness. Here the
use of Hilbert space is already a mathematical mode of description which
permits us to avoid the structure of the “original” set $K$ of ensembles, of the
set $L$ of effects, and of the form $\mu(w, g)$ for the probability function. Thus it is
permissible to introduce new methods which facilitate the solution of
practical problems such as those which will be described in IX. Such
investigations have been undertaken extensively using the Gel'fand space
triple as a tool. Here we shall refer readers to the literature [26] because we
cannot describe all possible more or less practical methods especially when
they do not result in any new physical structure. Unfortunately it is not
always easy to determine from the literature whether we are dealing only
with practical methods or with new physical structures.
A problem of physical meaning can be directly related to the differentiability
problem, namely in the area of the problem of the space $\mathcal{D}$ described in
§8.
In $\mathcal{B}(\mathcal{H})$ as well as in $\mathcal{B}'(\mathcal{H})$ we may accentuate the “differentiable”
elements by asserting that $W \in K$ is “differentiable” if
$$i(WK - KW),\quad i(WX - XW),\quad i(WL - LW),\quad i(WH - HW)$$
are elements of $\mathcal{B}(\mathcal{H})$; similarly, $F \in L$ is “differentiable” if
$$i(FK - KF),\quad i(FX - XF),\quad i(FL - LF),\quad i(FH - HF)$$
are elements of $\mathcal{B}'(\mathcal{H})$.
While the subset of differentiable ensembles has only a practical meaning
and does not have physical meaning, the set of differentiable effects can be
given a physical meaning, depending on whether the space $\mathcal{D}$ contains this set, or
whether the norm-closed subspace of $\mathcal{B}'$ spanned by this set is the space $\mathcal{D}$.
CHAPTER VIII
Composite Systems
The truly great achievement of quantum mechanics is not its successful
treatment of elementary systems, the basis for which was presented in VII,
but its successful description of composite systems. According to VI, D 2.1
the representation of the Galileo group in Hilbert space for composite
systems is reducible. In addition, there exist decision effects $E \in G(\mathcal{H})$ which
are different from 0 and 1 and which are left invariant under transformations of
the Galileo group.
1 Registrations and Effects of the Inner Structure
We shall first consider the structure of those effects which are left invariant
under the Galileo group as a whole, or are left invariant under subgroups of
the Galileo group. We have already become familiar with some of the general
properties of the set of these effects in VI, §2. In the case of the Galileo group
and its subgroups we can yet say something more about the structure of these
invariant effects. For this purpose we shall now consider some of the results
from VII, §2. It is easy to verify that in the derivations up to VII (2.23) no use
has been made of the fact that the representation is irreducible. The decisive
next step in VII, §2 was that the Hilbert space $\mathcal{H}$ can be represented in the
form (2.24) where the operators $K$ and $X$ obey the operator rules (2.25). (We
have not proven these results in VII, §2; an indirect elementary proof will be
provided in the discussion in IX, §5. For a proof which uses the group
representation or the algebra generated by the exponentials of $iK$ and $iX$ see, for example,
[10] and [28].)
Thus, for composite systems we may use the form
$$\mathcal{H} = \mathcal{H}_b \times \mathcal{H}_i \tag{1.1}$$
of Hilbert space from VII (2.24) and use the operator rules for the operators
$P = -K$ and $Q = (1/M)X$ in $\mathcal{H}_b$ from VII, §2 together with the commutation
relation VII (2.17), where we replace $m$ by $M$. For the definition of $Q$ we have
already made use of the requirement that, for composite systems, the
parameter M which occurs in VII (2.17) is also nonzero, a result which is not
contradicted by any experiment in the fundamental domain of quantum
mechanics (that is, in atomic and molecular physics).
The observable P will be called the “total momentum” and the observable
Q will be called the “position of the center of mass” for the composite system.
These observables $P$ and $Q$ are, however, not uniquely determined by the
conditions set down in VII, §4 (of course, P and Q satisfy all those
requirements).
According to VII (2.28) the Hamiltonian operator $H$ is given by
$$H = \frac{1}{2M}P^2 \times 1 + 1 \times H_i. \tag{1.2}$$
The term $(1/2M)P^2$ is called the “kinetic energy” observable of the system; $H_i$
is called the Hamiltonian operator of the “inner structure” or the observable
of “rest energy,” that is, the energy of the microsystems if they were prepared
in such a way that the kinetic energy is (approximately) zero.
According to VII (2.35), for rotations we obtain
$$U(A, 0, 0, 0) = V(A) \times R(A), \tag{1.3}$$
where, according to VII (2.31), for $\mathcal{H}_b = \mathcal{L}^2(R^3, dk_1\,dk_2\,dk_3)$ we obtain
$$V(A)\varphi(k) = \varphi(A^{-1}k) \tag{1.4}$$
or, for $\mathcal{H}_b = \mathcal{L}^2(R^3, dx_1\,dx_2\,dx_3)$, from VII (3.41) we obtain
$$V(A)\psi(r) = \psi(A^{-1}r). \tag{1.5}$$
The $R(A)$ generate a representation of the rotation group in $\mathcal{H}_i$.
For composite systems it is no longer the case that these representations in
$\mathcal{H}_i$ are irreducible. Thus we find that VII (2.37) no longer holds. Instead, we
only have
$$R(A)H_i = H_i R(A), \tag{1.6}$$
which describes the rotation invariance of $H_i$.
Thus we have already identified the structures which can be deduced from
the structure of the Galileo group (considered as a transformation group of
registration procedures) which was introduced in VII, §1.
We will now introduce new terminology, describe its usage, and, in
addition, obtain some consequences from (1.6).
An effect is invariant under translations $(1, 0, \eta, 0)$ and proper Galileo
transformations $(1, \delta, 0, 0)$ if and only if it is of the form $1 \times F$. Such effects will
be called “inner structure effects.” $F$ need not commute with either the $R(A)$
or with $H_i$. Such an effect does not depend on the location or the velocity of the
registration apparatus corresponding to $F$ but may depend on the orientation
in space or on the time the apparatus is switched on.
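One direction of this characterization can be illustrated with finite matrices. The sketch below is only an illustration: random Hermitian matrices stand in for a momentum component in $\mathcal{H}_b$, for $H_i$, and for the inner part $F$ of an effect (its spectrum is not normalized to $[0, 1]$), and the dimensions, the mass value, and all names are hypothetical. A finite truncation cannot represent the canonical commutation relations, so the converse statement is not tested here.

```python
import numpy as np

rng = np.random.default_rng(0)
nb, ni = 5, 3   # illustrative dimensions of truncated H_b and H_i

def random_hermitian(n):
    """Random Hermitian matrix used as a stand-in for a self-adjoint operator."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

P = random_hermitian(nb)     # stand-in for one component of the total momentum
H_i = random_hermitian(ni)   # stand-in for the Hamiltonian of the inner structure
F = random_hermitian(ni)     # stand-in for the inner part of an effect
M = 2.0                      # total mass, arbitrary illustrative value

I_b, I_i = np.eye(nb), np.eye(ni)
kinetic = np.kron(P @ P / (2 * M), I_i)
H = kinetic + np.kron(I_b, H_i)   # form (1.2)
effect = np.kron(I_b, F)          # form 1 x F

def commutator(a, b):
    return a @ b - b @ a

commutes_with_momentum = np.allclose(commutator(effect, np.kron(P, I_i)), 0)
commutes_with_kinetic = np.allclose(commutator(effect, kinetic), 0)
commutes_with_H = np.allclose(commutator(effect, H), 0)
```

The operator $1 \times F$ commutes with everything acting only in $\mathcal{H}_b$, but for a generic $F$ it does not commute with $H$, which reflects the possible dependence on the time at which the apparatus is switched on.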
It is easy to see that the position and momentum observables will be
uniquely determined if the requirements for elementary systems are supplemented
by the following additional requirement: position and momentum
observables are coexistent with all inner structure effects.
In the sense of the terminology described above elementary systems also
have, according to VII, §2, an “inner structure,” one which is very simple:
spin, where the latter is described by the irreducible representation of
the rotation group in the spin space; for $H_i$, VII (2.37) then holds, where $H_i$ is a constant having no
particular physical significance.
By analogy with the expression “inner structure effect” we shall call an
observable $\Sigma \to L$ an “inner structure observable” if the elements of the range
of the measure are of the form $1 \times F$. An “inner structure scale observable”
(according to IV, D 2.4.5 a scale observable is always a decision observable)
is uniquely defined by a self-adjoint operator of the form $1 \times S$. For the
angular momentum operator $J$ (for the definition see VII, §5) which corresponds
to the infinitesimal rotation $R(A)$ the quantity $1 \times J$ is an inner
structure observable. Since the representation of the rotation group can be completely
reduced in $\mathcal{H}_i$, we may write $\mathcal{H}_i$ as a direct sum as
follows:
$$\mathcal{H}_i = \sum_j \oplus\, \mathcal{H}^j, \tag{1.7}$$
where $\mathcal{H}^j$ is the eigenspace of the operator $J^2$ with eigenvalue $j(j + 1)$. Since
the $R(A)$ form a representation up to a factor of the rotation group (see VII, §2) the number $j$
in the sum must be either an integer or half integer.
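The decomposition (1.7) can be made concrete in the smallest nontrivial case. As an illustration only, the sketch takes a four-dimensional inner space built from two spin-1/2 factors (an example chosen here, not one given in the text) and reads off the values of $j$ from the eigenvalues $j(j+1)$ of $J^2$.

```python
import numpy as np

# Spin-1/2 matrices (units with hbar = 1).
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
I2 = np.eye(2)

# Total angular momentum J = S x 1 + 1 x S on the four-dimensional inner space.
J = [np.kron(s, I2) + np.kron(I2, s) for s in (sx, sy, sz)]
J2 = sum(j @ j for j in J)

# Eigenvalues of J^2 are j(j+1); solve for j.
eigenvalues = np.round(np.linalg.eigvalsh(J2), 10)
j_values = (-1 + np.sqrt(1 + 4 * eigenvalues)) / 2
print(sorted(j_values.tolist()))
```

The eigenvalue 0 occurs once and the eigenvalue 2 three times, so this inner space splits into a one-dimensional subspace with $j = 0$ and a three-dimensional subspace with $j = 1$.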
The operators $1 \times F$ are all effects which are invariant under the action of
the subgroup $(1, \delta, \eta, 0)$. The effects which are left invariant under the action
of the subgroup $(A, \delta, \eta, 0)$ are therefore of the form $1 \times F$, where $F$ leaves
each of the subspaces $\mathcal{H}^j$ invariant. Let $F_j$ denote the part of $F$ which acts in
$\mathcal{H}^j$.
Since $\mathcal{H}^j$ can itself be completely reduced with respect to irreducible
representations, we may introduce a complete orthonormal basis $u_m^{(\nu)}$, where
$m = -j, -j + 1, \ldots, j$, where for fixed $\nu$ the $u_m^{(\nu)}$ span an irreducible subspace and the $u_m^{(\nu)}$
transform as described in VII, §3. That means nothing other than that $\mathcal{H}^j$
can be written in the form
$$\mathcal{H}^j = \mathfrak{r}_j \times \mathfrak{h}^j, \tag{1.8}$$
where the restriction of the operators $R(A)$ on $\mathcal{H}^j$ have the form $R_j(A) \times 1$
with respect to (1.8) (see the general questions related to (1.7) and (1.8)
concerning operators which commute with a completely reducible representation
in AIV, §14). Since the $F_j$ commute with all the $R_j(A) \times 1$, they
must have the form $1 \times F_j$ with respect to (1.8). Thus all the effects (and
therefore all scale observables) are known which are invariant with respect to
the subgroup of all $(A, \delta, \eta, 0)$, that is, the corresponding registration
apparatuses may be arbitrarily translated, given arbitrary velocities, and given arbitrary
orientations in space. They are the set of all effects of the form $1 \times F$ (with
respect to (1.1)) where $F$ (with respect to (1.7)) transforms each subspace $\mathcal{H}^j$
into itself and in $\mathcal{H}^j$ takes on the form $1 \times F_j$ (with respect to (1.8)).
This is particularly true for the case of the rest energy observable $H_i$ (see
(1.6)): To each angular momentum eigenvalue $j$ there is a part $H_i^j$ of the
operator $H_i$ for rest energy, where $H_i^j$ operates in $\mathfrak{h}^j$. For an effect which is
invariant under the entire Galileo group the corresponding $F_j$ must also
commute with the $H_i^j$.
Without violating any of the previously introduced structures in the theory
we may therefore begin with a series of arbitrary Hilbert spaces $\mathfrak{h}^j$, define
arbitrary self-adjoint operators $H_i^j$, and then construct the $\mathcal{H}^j$ using the
spaces $\mathfrak{r}_j$ constructed in VII, §3 for an irreducible representation $D_j$ of the rotation group
according to (1.8). From the $\mathcal{H}^j$ (here we observe that only whole or half
integer values of $j$ may occur) we may construct $\mathcal{H}_i$ according to (1.7) and
finally construct $\mathcal{H}$ according to (1.1).
In this manner we find that the previously introduced structures yield a
theory which does not contradict experience, but does contain too much
“arbitrariness” to be useful in clarifying the structure of atoms and molecules.
In the sense of [1], §10.3 or of [2], III, §7-§9 the preceding theory is not g.G.-closed:
not everything which is possible in the theory really occurs. This
represents a challenge to strengthen the theory by means of additional
structures and axioms (that is, in the sense of [1], §8 or [2], III, §7) in order to
proceed to a standard extension (“Standarderweiterung” in [1], §8).
2 Composite Systems Consisting of
Two Different Elementary Systems
In VII, §2 we have found that an elementary system can be characterized by
the mass m, the spin s, and perhaps (in case there is more than one elementary
system type having the same mass and same spin) other discrete quantities.
In nonrelativistic quantum mechanics it is possible to characterize all
elementary systems by the mass m, spin s, and electric charge; we shall
discuss such a description below. In most cases the charge is uniquely
determined by the mass and the spin. At present there is no satisfactory
physical theory which accurately predicts the masses of elementary particles.
In quantum mechanics the masses of the elementary systems are “determined
by experiment.”
What do we mean by “experimentally determine” in the context of a
physical theory?
It means that we take the (finitely many!) results of experiments and, after
expressing them in mathematical form, “add” them (possibly as axioms) to
the mathematical framework of the theory. A more precise description of this
process is given in [1], §5 or in [2], III, §4. From these experimental results
we then seek to deduce the value of the mass (as lying within a certain
interval of the real line—the interval describing the so-called experimental
errors). Since an analogous situation exists in the case of classical mechanics,
there is some discussion of this topic in [1], §10.5. It is not necessary to
present a detailed discussion in order to establish the fact that a few parameters
of a theory can be determined by proceeding “backwards” from
experiment; such is the case for the mass $m$ and the spin $s$ for elementary
systems. Certainly it is desirable to have a theory which theoretically
determines these parameters, so that we may be able to determine whether
the theory is or is not contradicted by experiment. If we had
such a theory, we would have established a more comprehensive
description of the structure of the real world. If such a theory is not at hand
then we must be satisfied with experimentally determining which elementary
systems have which values of mass and spin. Here we shall only consider a
restricted but nevertheless vast area of knowledge: that of atoms and
molecules. Here it suffices to introduce, as elementary systems:
(1) The so-called electrons, with $m = 4.13 \times 10^{9}\ \mathrm{cm}^{-1}$, $s = \tfrac{1}{2}$, and charge $e$,
where $e$ is the negative of the so-called elementary unit of charge; in
the units used here $e$ is dimensionless and has magnitude $\approx \sqrt{1/137}$.
(2) The different atomic nuclei, each with a specific mass $M$ (which is more than
1800 times the electron mass) and positively charged with charge
given by $Ze$, where $Z$ is an integer, the so-called charge number of the
nucleus. The spin of the different nuclei can be experimentally
determined. There exist tables of nuclei labeled with the corresponding
values of $M$, $Z$, and $s$.
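The numerical values quoted for the electron can be compared with standard constants. The sketch below rests on an assumption that is not stated in the text: that a mass given in $\mathrm{cm}^{-1}$ means the inverse Compton wavelength $mc/h$. The SI values are rounded modern values and are likewise not taken from the text.

```python
import math

# Rounded SI values of the constants (not taken from the text).
m_e = 9.1093837e-31      # electron mass in kg
c = 2.99792458e8         # speed of light in m/s
h = 6.62607015e-34       # Planck constant in J s
alpha = 1 / 137.036      # fine-structure constant (approximate)

# Assumption: the mass is quoted as the inverse Compton wavelength m c / h.
inverse_compton_per_cm = m_e * c / h / 100.0   # convert m^-1 to cm^-1
charge_magnitude = math.sqrt(alpha)            # |e| in units where e^2 = alpha

print(f"{inverse_compton_per_cm:.3e} cm^-1, |e| ~ {charge_magnitude:.4f}")
```

Under this assumption the mass comes out at about $4.12 \times 10^{9}\ \mathrm{cm}^{-1}$, close to the value quoted in item (1), and the magnitude of the charge is about 0.085.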
Certainly there exists a theory of atomic nuclei which permits the values of
M, Z, and s to be correctly assigned and which also permits the description of
the structure of nuclei. The theory of atomic nuclei permits the use of the
formulation of quantum mechanics presented here in which we consider
protons and neutrons as elementary systems. However, the problem of
constructing the Hamiltonian operator for the composite nucleus is much
more difficult than is the case for atoms and molecules; the case for the latter
may be handled by means of postulating axioms (see (5.8)). For this reason
we shall not be concerned with the application of quantum mechanics for
problems in nuclear physics. Since we seek to develop a more illuminating
discussion of the fundamentals of a physical theory, we shall confine our
interest to the fundamental domain of atomic and molecular physics, where
we may treat the atomic nuclei as elementary systems. On the other hand it is
important to note that physicists generally seek to extend a successful theory
to new areas, that is, to break new ground. Such efforts are attempts to
discover new structure laws obtained from using known theories combined
with the aid of intuitive guesswork. We shall not discuss such attempts in this
book; the interested reader is referred to the presentation in certain sections
in [2].
For the desired application domain we shall assume that the elementary
systems in the theory consist of electrons and nuclei, where the different types
of nuclei are listed in tabular form. The electrons are completely (with respect
to the above domain of application) characterized by their mass. The nuclei
are uniquely characterized by their mass and charge. Electrons have spin $\tfrac{1}{2}$;
the nuclear spin will not play a role in the applications presented in this
book, since we shall consider those experiments in which the nuclear spin does
not play a role, or its effect is very small. It is, however, not difficult to extend
the theory developed here to take into account the effect of the spin of the
nucleus (hyperfine structure of spectral lines).
We shall also identify the experiments which prove that the electron spin is
$\tfrac{1}{2}$ in the development of applications (XI, §9 and §11).
We have presented the above overview of elementary systems in order to
formulate additional structure axioms for composite systems. If the reader
has studied physics for a few semesters he will have encountered the well-known
“fact” that all atoms consist of “atomic nuclei and electrons.” What, however,
should such a statement mean? It is a problem of the theory to subject such
statements to scrutiny and to formulate them more precisely—by this we
mean that axioms should be introduced in the mathematical framework
which provide meaning for the short-hand statement that, for example, an
atom consists of a nucleus and a number of electrons.
In order to clarify the fundamentals we shall begin by considering a
composite system consisting of two elementary systems. Let $\mathcal{H}_1$ and $\mathcal{H}_2$
denote the Hilbert spaces for the elementary systems of different types (1) and
(2). Let $\mathcal{H}_3$ denote the Hilbert space of the composite system. Previously
there existed no relationship between the different Hilbert spaces $\mathcal{H}_\nu$ in
$(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ which were introduced in AQ (III, §5) or by the theorem in
III, §3. We now wish to impose additional requirements which will establish
connections between these $\mathcal{H}_\nu$. We shall therefore begin by examining the
relationships between $\mathcal{H}_1$, $\mathcal{H}_2$ (as Hilbert spaces of two elementary systems),
and $\mathcal{H}_3$ (as the Hilbert space of the composite system).
Each effect $F = (F_1, F_2, F_3, F_4, \ldots)$ has three components $F_1$, $F_2$, $F_3$
which refer to these three system types; similarly each ensemble
$W = (W_1, W_2, W_3, W_4, \ldots)$ has three corresponding components $W_1$, $W_2$, $W_3$.
We may consider $W_1$ as an element of $\mathcal{B}(\mathcal{H}_1)$, $F_1$ of $\mathcal{B}'(\mathcal{H}_1)$, and similarly for
(2) and (3).
From experience we have found that it is not difficult to carry out
experiments with pairs of systems, one each of types (1) and (2). Every
scattering experiment of a system (1) on a system (2) is of this form: pairs,
each consisting of one system (1) and one system (2), must be prepared as a
new system (3). We will consider the experimental situation of the scattering
process in some detail in XVI, but only after we have explained what we mean
by a composite system (3) of the pair (1) and (2). For a mathematical basis it
is certainly desirable to use only structures which are directly related to the
experimental situation (as is the case for scattering experiments), and to
derive from these original structures the other structures which describe the
situation that (3) is composed of (1) and (2). However, this route has not as
yet led to success.
We shall therefore proceed to extrapolate from a few experimentally
motivated structures without being certain whether we will eventually
encounter contradictions with experience.
If we prepare pairs of systems (1) and (2) where, for example, the partner
(1) of each pair is on the moon and the other partner (2) is on the earth, then
it would appear to be possible to register the effects of system (1)
independently of the registration of system (2). For certain preparations it
also seems to make sense that there are effects $F_3$ for the pairs which
correspond to one of the effects caused by the partner (1).
Physicists often prefer to paraphrase the expression “for a specified
preparation” as follows: For systems “without interaction” certain effects F
of a “pair” are singled out as those which are caused by the “partner” (1).
Before we proceed to extrapolate this “structure” onto systems with
interaction we shall first seek to describe this intuitive structure
mathematically.
The obvious method is to introduce a new structure, a map $T_1: L_1 \to L_3$,
with the following physical interpretation: $F_3 = T_1 F_1$ is precisely the effect
of the pairs (3) which is actuated by (1) and corresponds to the effect $F_1$ of the
elementary system (1). We shall also introduce a corresponding map
$T_2: L_2 \to L_3$. We shall now introduce the following axioms about $T_1$ and $T_2$ in
order to describe mathematically the experimental situation “pairs without
interaction” of which we have given an intuitive description above.
(α) $T_1$ and $T_2$ are σ-continuous injective effect morphisms.
If the systems (1) and (2) are prepared widely separated then we may obviously
measure “together” both the effects caused by (1) and (2). We therefore
require that:
(β) Each effect in $T_1 L_1$ is coexistent with each effect in $T_2 L_2$.
Furthermore, for decision effects $E_1$ for (1), we cannot reasonably expect that
there exists an $F_3 \in L_{30}K_{30}(T_1 E_1)$ which is more sensitive
than $T_1 E_1$ only because the partner (2) is far removed from (1). We therefore
require that:
(γ) $T_1 G_1 \subset G_3$, $T_2 G_2 \subset G_3$.
In order to express the fact that system (3) does not contain anything in
addition to systems (1) and (2) we require that:
(δ) With the exception of the $\lambda 1$ there exist no effects which are coexistent with
all $F \in T_1 L_1 \cup T_2 L_2$.
From these results it follows (for the proof see [29]) that $\mathcal{H}_3$ can be written in
the following form:
$$\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2 \tag{2.1}$$
and that the maps $T_1$, $T_2$ have the form
$$T_1(F_1) = F_1 \times 1, \qquad T_2(F_2) = 1 \times F_2. \tag{2.2}$$
Here we note (see the proof in [29]) that conditions (α)-(δ) cannot be
satisfied if the number field for the Hilbert spaces is either $\mathbf{R}$ or $\mathbf{Q}$; it must be
$\mathbf{C}$ (see III, §3 and §5).
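The form (2.2) and requirement (β) can be illustrated in a finite-dimensional toy model. The following is a minimal sketch only: the dimensions 2 and 3, the matrices `F1` and `F2`, and the helper names are assumptions chosen for illustration, and commutativity is used as a sufficient criterion for coexistence.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


# Hypothetical effects (self-adjoint, spectrum inside [0, 1]) for the toy systems.
F1 = [[Fraction(1, 2), Fraction(1, 4)],
      [Fraction(1, 4), Fraction(1, 2)]]
F2 = [[Fraction(1, 3), 0, 0],
      [0, Fraction(1, 2), Fraction(1, 4)],
      [0, Fraction(1, 4), Fraction(1, 2)]]

T1F1 = kron(F1, identity(3))   # T_1(F_1) = F_1 x 1
T2F2 = kron(identity(2), F2)   # T_2(F_2) = 1 x F_2

# The two images commute, and their product is F_1 x F_2.
commute = matmul(T1F1, T2F2) == matmul(T2F2, T1F1)
product_is_kron = matmul(T1F1, T2F2) == kron(F1, F2)
```

Exact rational arithmetic is used so that the equalities can be checked without rounding tolerances.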
Without going into the proof cited above, we may base our structure on
(2.1), and define $T_1$, $T_2$ by means of (2.2). Then we may easily prove that the
relations (α)-(δ) hold. The structure characterized by means of (α)-(δ) or by
(2.1) and (2.2) is therefore only a representation of the physical situation of a
“pair without interaction.” In scattering theory (described in XVI) the
partners “before” and “after” scattering have practically no interaction. Then
there exists such a structure $T_1^{(i)}$, $T_2^{(i)}$ which describes the effects of the partner
(1) or (2) before scattering and a similar structure $T_1^{(f)}$, $T_2^{(f)}$ which describes the
effects of the partner (1) or (2) after the scattering. For scattering theory it is
decisively important that $T_1^{(f)} \neq T_1^{(i)}$, $T_2^{(f)} \neq T_2^{(i)}$, a result which we shall
examine more closely in XVI, §1 and §4.4.
For the nonrelativistic theory of interaction, as it is applied in
quantum mechanics, it is essential that we extrapolate the structure $T_1$, $T_2$
to the case of interaction. As the scattering experiments have shown, we
cannot specify any “fixed” structure $T_1$, $T_2$, because in the case of interaction
an effect which is triggered only by partner (1) need not be coexistent with
every effect which is triggered only by partner (2), as is required in (β). This is
the case because the measurement process on partner (1) will influence, as a
result of the interaction between the partners, the effect triggered
later by partner (2). These intuitive ideas suggest the following attempt at
extrapolation.
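AZ. For each time point $t$ of the laboratory time scale there exist maps
$T_{1t}$, $T_{2t}$ which satisfy (α)-(δ). For the Galileo transformation $U_\gamma = U(1, 0, 0, \gamma)$
in $\mathcal{H}_3$ and the corresponding Galileo transformations $U_\gamma^{(1)}$, $U_\gamma^{(2)}$ in $\mathcal{H}_1$ and $\mathcal{H}_2$
we require that
$$T_{i(t+\gamma)}\bigl(U_\gamma^{(i)} F_i U_\gamma^{(i)+}\bigr) = U_\gamma (T_{it} F_i) U_\gamma^{+}$$
for $i = 1, 2$.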
The intuitive idea here is that the effects $T_{1t}F_1$ are such that they will be
triggered by partner (1) if (1) is placed into interaction with a measurement
apparatus for a very short time at time $t$. Here we assume that there exist
measurement apparatuses having interaction processes which act “momentarily”
between the microsystem being measured and the apparatus and that
during the brief interval during which the interaction takes place there is no
noticeable change caused by the interaction between the partners, that is, if
$\Delta t$ is the duration of the interaction, then
$$T_{1t}\bigl(U_{\Delta t}^{(1)} F_1 U_{\Delta t}^{(1)+}\bigr) \approx U_{\Delta t} (T_{1t} F_1) U_{\Delta t}^{+}.$$
These considerations show that in the case of elementary particle physics we
cannot expect to satisfy such a condition. The long-range electromagnetic
interaction of atomic and molecular structure and the small value of the
elementary charge $e \approx \sqrt{1/137}$ combine to make the use of AZ possible.
The requirement imposed by (AZ)
$$T_{1(t+\gamma)}\bigl(U_\gamma^{(1)} F_1 U_\gamma^{(1)+}\bigr) = U_\gamma (T_{1t} F_1) U_\gamma^{+}$$
is self-evident on the basis of the physical interpretation of the Galileo
group presented in VII. If $T_{1t}F_1$ is an effect for which a measurement
interaction takes place at time $t$ then $U_\gamma (T_{1t}F_1) U_\gamma^{+}$ is the effect which is
produced if the measurement apparatus is applied at a time $\gamma$ later,
that is, when the measurement interaction takes place at time $t + \gamma$;
this is again an effect triggered by partner (1) which corresponds
to the effect $T_{1(t+\gamma)}(U_\gamma^{(1)} F_1 U_\gamma^{(1)+})$, as if system (1) had been triggered at
the displaced time. The requirement
$$T_{1(t+\gamma)}\bigl(U_\gamma^{(1)} F_1 U_\gamma^{(1)+}\bigr) = U_\gamma (T_{1t} F_1) U_\gamma^{+}$$
is mathematically only a definition for $T_{1t}$ if $T_{10}$ (that is, $T_{1t}$ for $t = 0$) is
known.
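The remark that the requirement is only a definition of $T_{1t}$ once $T_{10}$ is known can be made explicit. The following short derivation is a sketch under the assumption that the requirement of AZ reads as written above and that $T_{10}$ has the form (2.2); the auxiliary symbol $G_1$ is introduced here only for the substitution.

```latex
% Set t = 0 and \gamma = t in the requirement of AZ (G_1 is an arbitrary effect of system (1)):
\begin{align*}
T_{1t}\bigl(U^{(1)}_t G_1 U^{(1)+}_t\bigr) &= U_t\,(T_{10} G_1)\,U_t^{+}. \\
% Choose G_1 = U^{(1)+}_t F_1 U^{(1)}_t and use T_{10} G_1 = G_1 \times 1 from (2.2):
T_{1t} F_1 &= U_t \bigl[(U^{(1)+}_t F_1 U^{(1)}_t) \times 1\bigr] U_t^{+}. \\
% Special case without interaction, U_t = U^{(1)}_t \times U^{(2)}_t:
T_{1t} F_1 &= F_1 \times 1 \quad \text{for all } t.
\end{align*}
```

In the special case without interaction the map does not depend on $t$; a time dependence of $T_{1t}$ therefore arises only when $U_t$ does not factorize.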
According to (2.1) and (2.2) there exists a product representation
$\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$ corresponding to the maps $T_{10}$, $T_{20}$; a similar situation also
holds for $T_{1t}$, $T_{2t}$. Since the product representation of $\mathcal{H}_3$ obtained in this
way changes with time, it is not well suited for practical problems. In X we
shall become familiar with the reformulation of the time variability described
by $U_\gamma$ in the form of the Schrödinger picture and the interaction picture,
where the product representation of $\mathcal{H}_3$ can be considered not to change with
time.
Now that we have introduced the maps $T_{1t}$, $T_{2t}$ we must warn the reader
about a possible error concerning their physical interpretation: If $b_0$ is a
registration method by which the effect $F_1$ can be registered for systems of
type (1), that is, if for a registration procedure $b \subset b_0$
$$\psi(b_0, b) = (F_1, F_2, F_3, \ldots),$$
it does not follow (even in the case $F_2 = 0$!) that there exists a time $t$ for
which $F_3 = T_{1t}F_1$. The registration procedure which registers the effect $F_1$ for
the systems of type (1) need not register the effect $T_{1t}F_1$ for composite
systems of type (3). If, however, $b_0$ is a registration method in the sense
described above, that is, one which for all practical purposes is in
interaction only at time $t$, then for the case $F_2 = 0$ we would expect that
$$F_3 = T_{1t}F_1.$$
We may extend the requirement AZ as follows: There exist registration
methods $b_0$ and registration procedures $b$ for which
$$\psi(b_0, b) = (F_1, 0, T_{1t}F_1, \ldots).$$
On the basis of (2.1) and (2.2) “at time $t = 0$” we may, in a formal sense(!),
transfer the rules for Galileo transformations for $\mathcal{H}_1$, $\mathcal{H}_2$ onto $\mathcal{H}_3$, and
even carry out different Galileo transformations in $\mathcal{H}_1$ and $\mathcal{H}_2$. That is, for
the group $\mathcal{G} \times \mathcal{G}$ (see AV, §5) with elements
$((A_1, \delta_1, \eta_1, \gamma_1), (A_2, \delta_2, \eta_2, \gamma_2))$ we define
$$U[(A_1, \delta_1, \eta_1, \gamma_1), (A_2, \delta_2, \eta_2, \gamma_2)] = U^{(1)}(A_1, \delta_1, \eta_1, \gamma_1) \times U^{(2)}(A_2, \delta_2, \eta_2, \gamma_2), \tag{2.3}$$
where $F_1 \times 1$ transforms into
$$U[(A_1, \ldots), (A_2, \ldots)](F_1 \times 1)U[(A_1, \ldots), (A_2, \ldots)]^{+} = U^{(1)}(A_1, \ldots)F_1 U^{(1)}(A_1, \ldots)^{+} \times 1. \tag{2.4}$$
A similar result holds for $1 \times F_2$. Here $F_1 \times 1$ transforms according to the
first Galileo transformation $(A_1, \ldots)$ and $1 \times F_2$ according to the other
transformation $(A_2, \ldots)$.
Is the formal expression (2.4) physically meaningful?
If we consider a pair (1), (2) of systems which do not interact (for
example, (1) and (2) may be widely separated), then it appears to be possible to
subject the registration methods for (1) and (2) to separate(!) Galileo
transformations (which are not very distant from the unit element). However,
for the case of interaction we cannot simply combine two different registration
methods for systems (1) and (2) into a new registration method for
the pair (3). Indeed $T_{10}F_1 = F_1 \times 1$ and $T_{20}F_2 = 1 \times F_2$ are coexistent, that
is, there exist a registration method $b_0$ and registration procedures $b_1 \subset b_0$,
$b_2 \subset b_0$ such that
$$\psi(b_0, b_1) = (\cdot, \cdot, F_1 \times 1, \ldots),$$
$$\psi(b_0, b_2) = (\cdot, \cdot, 1 \times F_2, \ldots).$$
However, we do not know how to construct the apparatus for $b_0$. At least we
do not expect that parts of the apparatus for $b_0$ can be subjected to distinct
Galileo transformations. Thus it appears to be more reasonable not to
interpret the transformations in (2.3) generally as transformations of
registration methods.
On the contrary it appears only reasonable to investigate the relationship
between the Galileo transformations of the entire registration method and
their representations in the Hilbert space $\mathcal{H}_3$ for systems of type (3). From the
extrapolation of systems (1) and (2) without interaction we obtain the
following answer which appears to be reasonable: In (2.3) we require that
$(A_2, \delta_2, \eta_2, \gamma_2) = (A_1, \delta_1, \eta_1, \gamma_1)$, that is, the representation of the Galileo
group in $\mathcal{H}_3$ with respect to the product representation of $\mathcal{H}_3$ “at time $t = 0$”
($\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$) is given as follows:
$$U(A, \delta, \eta, \gamma) = U^{(1)}(A, \delta, \eta, \gamma) \times U^{(2)}(A, \delta, \eta, \gamma). \tag{2.5}$$
Equation (2.5) cannot hold in general, that is, it leads to contradictions
with experience (for example, for scattering processes—see XVI) since for
time translations it describes systems “without interaction.” The structure of
the representation of the Galileo group for composite systems which was
described in general in §1 may not therefore be extended by means of the
severe condition (2.5). The fact that it is possible to solve the interaction
problem for the case of nonrelativistic quantum mechanics by means of a
small change of (2.5) has led to overwhelming success in the applications of
quantum mechanics. However, the fact that an analogous solution is not
possible for relativistic quantum mechanics suggests that such a similar
closed theory for elementary particles does not exist.
The physically functional but not very elegant solution of the interaction
problem for nonrelativistic quantum mechanics is obtained as follows: With
respect to the product representation of $\mathcal{H}_3$ at time $t = 0$, (2.5) is required
only for $\gamma = 0$, that is, for the subgroup of the Galileo group introduced in
§1 we require that:
$$U(A, \delta, \eta, 0) = U^{(1)}(A, \delta, \eta, 0) \times U^{(2)}(A, \delta, \eta, 0). \tag{2.6}$$
Thus, for the infinitesimal transformations “at time $t = 0$” it follows that:
$$\begin{aligned}
K &= K_1 \times 1 + 1 \times K_2, \\
X &= X_1 \times 1 + 1 \times X_2, \\
J &= J_1 \times 1 + 1 \times J_2,
\end{aligned} \tag{2.7}$$
where $J$, $J_1$, $J_2$ are the angular momentum operators. From the first two
equations and VII (2.16) we obtain the commutation relations
$$\begin{aligned}
K_\mu X_\nu - X_\nu K_\mu &= iM\delta_{\nu\mu}\, 1 \\
&= (K_{1\mu}X_{1\nu} - X_{1\nu}K_{1\mu}) \times 1 + 1 \times (K_{2\mu}X_{2\nu} - X_{2\nu}K_{2\mu}) \\
&= im_1\delta_{\nu\mu}\, 1 \times 1 + 1 \times im_2\delta_{\nu\mu}\, 1.
\end{aligned}$$
It follows that $M = m_1 + m_2$; the mass $M$ of the composite system (3) is
equal to the sum of the masses of the two systems (1) and (2).
For the position and momentum observables given by VII, §4, that is, for
$$P_1 = -K_1, \quad P_2 = -K_2, \quad Q_1 = \frac{1}{m_1} X_1, \quad Q_2 = \frac{1}{m_2} X_2,$$
from (2.7) it follows that the total momentum defined in §1 is given by
$$P = P_1 \times 1 + 1 \times P_2 \tag{2.8}$$
and the position of the center of mass $Q = X/M$ is given by
$$Q = \frac{m_1 Q_1 \times 1 + m_2\, 1 \times Q_2}{m_1 + m_2}. \tag{2.9}$$
The term “center of mass” arises from the form of the right side of (2.9).
If (2.5) were correct, then it would follow that
$$H = \frac{1}{2m_1} P_1^2 \times 1 + 1 \times \frac{1}{2m_2} P_2^2. \tag{2.10}$$
In the case in which misunderstandings are unlikely we shall use the more
familiar notation of the physicist and write (2.10) in the form:
$$H = \frac{1}{2m_1} P_1^2 + \frac{1}{2m_2} P_2^2. \tag{2.11}$$
If $H$ in (1.2) has the form (2.11) we then say that it describes the situation “no
interaction.” We write (1.2) in the form
$$H = \frac{1}{2m_1} P_1^2 + \frac{1}{2m_2} P_2^2 + H_j, \tag{2.12}$$
where $H_j$ is called the “interaction operator.”
In (2.12) we may replace $P_1$, $P_2$ by $P$ in (2.8) and by the following new
operator:
$$P_r = \frac{m_2}{m_1 + m_2} P_1 - \frac{m_1}{m_1 + m_2} P_2. \tag{2.13}$$
Here we call $P_r$ the relative momentum. Then (2.12) transforms into
$$H = \frac{1}{2M} P^2 + \frac{1}{2m} P_r^2 + H_j \quad \text{where} \quad m = \frac{m_1 m_2}{m_1 + m_2}; \tag{2.14}$$
$m$ is called the reduced mass. It is easy to verify that $P_r$ commutes with $X$ and
with all of the $U(1, \delta, 0, 0)$. Therefore, relative to $\mathcal{H}_3 = \mathcal{H}_b \times \mathcal{H}_i$, that is,
relative to (1.1), we obtain
$$P_r = 1 \times P_r. \tag{2.15}$$
Since (1.2) holds, $H_j$ must have the form
$$H_j = 1 \times H_j \tag{2.16}$$
relative to (1.1), from which it follows by (1.2) that
$$H_i = \frac{1}{2m} P_r^2 + H_j. \tag{2.17}$$
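The passage from (2.12) to (2.14) rests on a purely algebraic identity for the kinetic terms. Since $P_1 \times 1$ and $1 \times P_2$ commute, the identity can be checked with ordinary numbers for a single component. The following is a minimal sketch; the function name and the sample values are assumptions for illustration only.

```python
from fractions import Fraction


def split_kinetic(m1, m2, p1, p2):
    """Return both sides of p1^2/2m1 + p2^2/2m2 = P^2/2M + Pr^2/2m."""
    M = m1 + m2                      # total mass
    m = m1 * m2 / M                  # reduced mass, as in (2.14)
    P = p1 + p2                      # total momentum, as in (2.8)
    Pr = (m2 * p1 - m1 * p2) / M     # relative momentum, as in (2.13)
    lhs = p1**2 / (2 * m1) + p2**2 / (2 * m2)
    rhs = P**2 / (2 * M) + Pr**2 / (2 * m)
    return lhs, rhs


# Hypothetical sample values; exact rationals avoid rounding issues.
lhs, rhs = split_kinetic(Fraction(1), Fraction(1836), Fraction(3, 7), Fraction(-5, 2))
```

The cross term between $P$ and $P_r$ drops out precisely because the coefficients in (2.13) are weighted by the masses.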
(The notation of the physicist permits us to consider two different
representations of the same Hilbert space as a product space without changing
symbols or indices, namely $\mathcal{H}_3 = \mathcal{H}_b \times \mathcal{H}_i = \mathcal{H}_1 \times \mathcal{H}_2$, and to write the
corresponding operators accordingly. Here we must always take care to note
which representation of a product space is used in formulas like (2.7) and
(2.16)!)
In addition to the center of mass position, we introduce the relative
position operator
$$Q_r = Q_1 - Q_2; \tag{2.18}$$
it is easy to show that $Q_r$ commutes with $Q$ and $P$, that is, it is of the form
$$Q_r = 1 \times Q_r \tag{2.19}$$
relative to (1.1). On the other hand it follows that the components of $Q_r$ and
$P_r$ satisfy the same commutation relations as the position and momentum
operators of an elementary system. (2.20)
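The commutation properties of $Q_r$ and $P_r$ can be verified directly from (2.13), (2.8), (2.9) and $Q_r = Q_1 - Q_2$. The derivation below is a sketch under one stated assumption: the single-system commutators are written as $[Q_{i\mu}, P_{j\nu}] = c\,\delta_{ij}\delta_{\mu\nu}\,1$, where the constant $c$ is fixed by the conventions of VII and is left open here.

```latex
% Assumption: [Q_{i\mu}, P_{j\nu}] = c\,\delta_{ij}\delta_{\mu\nu}\,1, and M = m_1 + m_2.
\begin{align*}
[Q_{r\mu}, P_{r\nu}]
  &= \Bigl[Q_{1\mu} - Q_{2\mu},\ \tfrac{m_2}{M}P_{1\nu} - \tfrac{m_1}{M}P_{2\nu}\Bigr]
   = \tfrac{m_2}{M}\,c\,\delta_{\mu\nu} + \tfrac{m_1}{M}\,c\,\delta_{\mu\nu}
   = c\,\delta_{\mu\nu}, \\
[Q_{r\mu}, P_{\nu}]
  &= [Q_{1\mu} - Q_{2\mu},\ P_{1\nu} + P_{2\nu}]
   = c\,\delta_{\mu\nu} - c\,\delta_{\mu\nu} = 0, \\
[Q_{\mu}, P_{r\nu}]
  &= \Bigl[\tfrac{m_1}{M}Q_{1\mu} + \tfrac{m_2}{M}Q_{2\mu},\ \tfrac{m_2}{M}P_{1\nu} - \tfrac{m_1}{M}P_{2\nu}\Bigr]
   = \tfrac{m_1 m_2 - m_2 m_1}{M^2}\,c\,\delta_{\mu\nu} = 0 .
\end{align*}
```

The relative pair thus has the same commutator as a single elementary system, while it commutes with the center of mass pair $Q$, $P$.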
The “coordinate transformation” of $Q_1, P_1, Q_2, P_2$ to $Q, P, Q_r, P_r$ leads to
the following new notation: For
$$\mathcal{H}_1 = \mathcal{H}_{1b} \times r_{1s}, \qquad \mathcal{H}_2 = \mathcal{H}_{2b} \times r_{2s}$$
we obtain
$$\mathcal{H}_1 \times \mathcal{H}_2 = (\mathcal{H}_{1b} \times \mathcal{H}_{2b}) \times r_{1s} \times r_{2s} \tag{2.21}$$
and
$$\mathcal{H}_{1b} \times \mathcal{H}_{2b} = \mathcal{H}_b \times \mathcal{H}_{rb}, \tag{2.22}$$
where $Q$, $P$ are operators in $\mathcal{H}_b$ and $Q_r$, $P_r$ are operators in $\mathcal{H}_{rb}$. From (2.22),
(2.21), and (1.1) it follows that
$$\mathcal{H}_i = \mathcal{H}_{rb} \times r_{1s} \times r_{2s}. \tag{2.23}$$
If the subsystem (2) has no spin, or if we can neglect the spin of (2) (a more
precise formulation of what is meant by being able to “neglect” spin is given
in [2], XI, §7.5) then we can ignore the space $r_{2s}$ in (2.23) and we obtain
$$\mathcal{H}_i = \mathcal{H}_{rb} \times r_{1s}. \tag{2.24}$$
Not only the form (2.24) but also the algebra of the operators $P_r$, $Q_r$ and
the rotations (as we shall later prove) are completely equivalent to those of
the elementary system (1), with the exception of the Hamiltonian operator $H_i$
described by (2.17) and the mass factor $m$ ($m$ is the reduced mass in (2.14)). It
is now necessary to consider the above assertion about rotations.
From $U(A) = U^{(1)}(A) \times U^{(2)}(A)$ relative to (2.1) it follows that with
$$U^{(1)}(A) = V^{(1)}(A) \times R^{(1)}(A), \qquad U^{(2)}(A) = V^{(2)}(A) \times R^{(2)}(A)$$
we have
$$U(A) = (V^{(1)}(A) \times V^{(2)}(A)) \times R^{(1)}(A) \times R^{(2)}(A)$$
relative to (2.21). Here $V^{(1)}(A) \times V^{(2)}(A)$ is an operator in $\mathcal{H}_{1b} \times \mathcal{H}_{2b}$. It is
easy to show that from the change in the product representation according to
(2.22) we obtain
$$V^{(1)}(A) \times V^{(2)}(A) = V(A) \times V_r(A), \tag{2.25}$$
where the left (right) side of (2.25) corresponds to the left (right) side of (2.22).
The operators $V^{(1)}$, $V^{(2)}$, $V$, $V_r$ behave in the manner which we have generally
described for an orbit space in VII, §3.
According to (2.23) in $\mathcal{H}_i$ we have the representation
$$R(A) = V_r(A) \times R^{(1)}(A) \times R^{(2)}(A), \tag{2.26}$$
where $R(A)$ is defined according to (1.3). For the angular momentum, from
(2.7) together with VII (5.3) we obtain
$$J = J_1 + J_2 = L_1 + L_2 + S_1 + S_2 = L + L_r + S_1 + S_2, \tag{2.27}$$
where $L$ is the orbital angular momentum in $\mathcal{H}_b$ and $L_r$ that in $\mathcal{H}_{rb}$. Since the $V(A)$
and $V_r(A)$ behave in the same way as described in VII, §3 for an orbit space,
from VII (3.42) it follows that
$$L = Q \times P, \qquad L_r = Q_r \times P_r. \tag{2.28}$$
Therefore, if we may neglect the spin $s_2$ of the elementary system (2) we
then obtain a description in $\mathcal{H}_i$ which is equivalent to that of an elementary
system of mass $m$, spin $s_1$ and having the Hamiltonian operator (2.17). Such a
description is applicable to a system consisting of an electron as system (1)
and an atomic nucleus as system (2) (see XI, §3).
The structure for the dynamics of composite systems is therefore determined
when the operator $H_j$ is explicitly given. The form of the
Hamiltonian operator will be discussed later in this book.
In closing we emphasize the fact that the observables $Q_1 \times 1$, $1 \times Q_2$,
$P_1 \times 1$, $1 \times P_2$ for the representation $\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$ refer to the time $t = 0$.
Thus the operator
$$Q^{(1)}(0) = Q_1 \times 1$$
corresponds to the “position of the subsystem (1) at time $t = 0$”; the “position
of the subsystem (1) at time $t$,” using the interpretation of the time
translation operator
$$U_\gamma = U(1, 0, 0, \gamma),$$
corresponds to the operator
$$Q^{(1)}(t) = U_t Q^{(1)}(0) U_t^{+} = U_t (Q_1 \times 1) U_t^{+},$$
which does not take the form $A \times 1$ with respect to the product representation
$\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$ defined by $T_{10}$, $T_{20}$. It does, however, have this
form with respect to the product representation corresponding to $T_{1t}$, $T_{2t}$. A
similar situation is found for the momenta of the subsystems (1) and (2). The
momenta of the subsystems (and the angular momentum) are no longer
constant with time!
3 Composite Systems Consisting of
Two Identical Elementary Systems
If the system types (1) and (2) are identical, then the intuitive reasoning which
led to (2.1) and (2.2) is fundamentally incorrect because it is impossible to
distinguish the “effect of subsystem (1)” from that of “subsystem (2).”
The following intuitive approach to this problem has been fruitful:
Let us suppose that we have constructed a product space at time $t = 0$
from the Hilbert space $\mathcal{H}_1$ for a system of type (1) as follows:
$$\mathcal{H}_1^2 = \mathcal{H}_1 \times \mathcal{H}_1 \tag{3.1}$$
(see §4 for more details). Here, of course, it does not make sense to apply (2.2).
Since both systems are identical, all “effects” are invariant under
exchange of the two systems and, consequently, (3.1) cannot be the “correct”
Hilbert space for the composite system. How can we formulate this
invariance in mathematical terms? For this purpose we define an exchange
operator $P$ as a linear operator in $\mathcal{H}_1^2$ as follows:
$$P\,\phi_\nu(1)\phi_\mu(2) = \phi_\mu(1)\phi_\nu(2), \tag{3.2}$$
where $\phi_\nu$ is a complete orthonormal basis in $\mathcal{H}_1$.
It is easy to verify (see §4) that the operator $P$ defined in (3.2) is
independent of the complete orthonormal basis used in (3.2).
In addition $P$ is obviously unitary, and satisfies $P^2 = 1$. Therefore we may
partition $\mathcal{H}_1^2$ into two subspaces $\{\mathcal{H}_1^2\}_+$ and $\{\mathcal{H}_1^2\}_-$ where $\{\mathcal{H}_1^2\}_+$ is the
eigenspace of $P$ with eigenvalue $+1$ and $\{\mathcal{H}_1^2\}_-$ is the eigenspace of $P$ with
eigenvalue $-1$. An operator $A$ in $\mathcal{H}_1^2$ is said to be symmetric if
$$PA = AP. \tag{3.3}$$
All effects of systems (3) therefore should be (intuitively speaking) symmetric
operators, that is, $\{\mathcal{H}_1^2\}_+$ and $\{\mathcal{H}_1^2\}_-$ are invariant subspaces with respect to
symmetric effects. Since the set of symmetric effects is a proper subset of the
set $L(\mathcal{H}_1^2)$, $\mathcal{H}_1^2$ cannot be a Hilbert space for the system type (3)
because $L(\mathcal{H}_3)$ is the set of effects for system (3). However, we may identify $\mathcal{H}_3$
with $\{\mathcal{H}_1^2\}_+$ or with $\{\mathcal{H}_1^2\}_-$ because it is clear that every self-adjoint operator
which leaves $\{\mathcal{H}_1^2\}_+$ or $\{\mathcal{H}_1^2\}_-$ invariant is also symmetric.
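The exchange operator of (3.2) and the splitting into the two eigenspaces can be made concrete when the single-system space is replaced by a finite-dimensional one. The following is a minimal sketch under that assumption; the dimension `d` and all function names are hypothetical and serve only as illustration.

```python
def matmul(a, b):
    """Product of two square matrices of equal size (lists of rows)."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def exchange_matrix(d):
    """Matrix of P in the product basis e_a(1) e_b(2), stored at index a*d + b."""
    n = d * d
    P = [[0] * n for _ in range(n)]
    for a in range(d):
        for b in range(d):
            P[b * d + a][a * d + b] = 1   # P e_a(1) e_b(2) = e_b(1) e_a(2)
    return P


def eigenspace_dimensions(d):
    """Dimensions of the +1 and -1 eigenspaces of P.

    Because P^2 = 1, the trace of P equals dim(+) - dim(-),
    while dim(+) + dim(-) = d^2.
    """
    P = exchange_matrix(d)
    trace = sum(P[i][i] for i in range(d * d))
    return (d * d + trace) // 2, (d * d - trace) // 2


P3 = exchange_matrix(3)
identity9 = [[1 if i == j else 0 for j in range(9)] for i in range(9)]
is_involution = matmul(P3, P3) == identity9
dims = eigenspace_dimensions(3)
```

For a $d$-dimensional single-system space the two dimensions are $d(d+1)/2$ and $d(d-1)/2$, which is what the trace argument reproduces.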
Both of these intuitively motivated possibilities have proven to be
successful. We shall now formulate the additional structure needed to
describe systems of type (3) which are composed of pairs of identical systems
of type (1).
Case (+):
$$\mathcal{H}_3 = \{\mathcal{H}_1^2\}_+ \tag{3.4}$$
together with the map $T_0$ (that is, $T_t$ for $t = 0$) of $L_1 = L(\mathcal{H}_1)$ into $L_3 = L(\mathcal{H}_3)$
defined by:
$$T_0(F_1) = \tfrac{1}{2}(F_1 \times 1 + 1 \times F_1). \tag{3.5}$$
Case (−):
$$\mathcal{H}_3 = \{\mathcal{H}_1^2\}_- \tag{3.6}$$
together with the map $T_0$ of $L_1 = L(\mathcal{H}_1)$ into $L_3 = L(\mathcal{H}_3)$ defined by:
$$T_0(F_1) = \tfrac{1}{2}(F_1 \times 1 + 1 \times F_1). \tag{3.7}$$
Here (3.5) and (3.7) are the analogs of (2.2). $T_0(F_1)$ is the effect which is
triggered by one of the subsystems of type (1) as if the other were not present.
In §2 we have already mentioned the fact that the physical interpretation of
this requirement is problematical.
Which of the cases (+) or (—) are we to choose? That depends on the
elementary system type (1) and—according to experience and certain results
from quantum field theory—depends only on the spin of the system type (1).
If the spin takes on an integer value, then case (+) is to be chosen; then we
call such systems “Bose systems.” If the spin takes on half-integer values then
case (—) is to be chosen; here the systems are called “Fermi systems.”
Many of the ideas in §2 are applicable to the cases (+) and (—). Clearly
(2.3) is not applicable; nevertheless the formula (2.6) is applicable. Thus the
formulas (2.7)-(2.12) remain applicable and we find that H and P must
commute. For the application of the material following (2.13) we must be
cautious.
If we define, in a formal manner, the following using (2.13) (here $m_1 = m_2$!):
$$P_r = \tfrac{1}{2}(P_1 - P_2), \tag{3.8}$$
then $P_r$ cannot be an observable of system (3) because it is not symmetric. Of
course, (2.14) is applicable because $P_r^2$ is symmetric.
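The two statements about symmetry can be checked in one line each. The sketch below assumes the notation $p$ (not used in the text) for the momentum operator of a single system of type (1), so that $P_1 = p \times 1$ and $P_2 = 1 \times p$, and uses the property $P (A \times B) P^{+} = B \times A$ of the exchange operator $P$ of (3.2).

```latex
% Assumption: P_1 = p \times 1, P_2 = 1 \times p; P is the exchange operator of (3.2).
\begin{align*}
P\,P_r\,P^{+} &= \tfrac12\,P\,(p \times 1 - 1 \times p)\,P^{+}
              = \tfrac12\,(1 \times p - p \times 1) = -P_r, \\
P\,P_r^2\,P^{+} &= (P\,P_r\,P^{+})^2 = (-P_r)^2 = P_r^2 .
\end{align*}
```

Hence $P_r$ anticommutes with the exchange operator and violates (3.3), whereas $P_r^2$ satisfies it.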
We must now consider the decomposition
$$\mathcal{H}_3 = \{\mathcal{H}_1^2\}_\pm = \mathcal{H}_b \times \mathcal{H}_i. \tag{3.9}$$
$\mathcal{H}_b$ is defined as in §2, but $\mathcal{H}_i$ is not of the form (2.19). What is the structure of
$\mathcal{H}_i$ in (3.9)?
This we may discover most simply by analogy to (3.8) if we assume the
formal definition of (2.18). Then, corresponding to (2.21) and (2.22) we
obtain:
$$\mathcal{H}_1^2 = \mathcal{H}_{1b}^2 \times r_{1s}^2 = \mathcal{H}_b \times \mathcal{H}_{rb} \times r_{1s}^2. \tag{3.10}$$
How does the exchange operator $P$ behave? To answer this question we shall
define $P_s$ in an obvious way as an exchange operator in $r_{1s}^2$ and the parity
operator $R$ in $\mathcal{H}_{rb} = L^2(\mathbf{R}^3, dx_1\,dx_2\,dx_3)$ (see VII, §7) as follows:
$$R\,\psi(x_1, x_2, x_3) = \psi(-x_1, -x_2, -x_3). \tag{3.11}$$
Then, according to the representation on the right side of (3.10) we obtain
$$P = 1 \times R \times P_s. \tag{3.12}$$
Therefore $\mathcal{H}_i$ in $\mathcal{H}_3 = \mathcal{H}_b \times \mathcal{H}_i$ is the subspace of $\mathcal{H}_{rb} \times r_{1s}^2$ which
corresponds to the eigenvalue $+1$ of $R \times P_s$ for Bose systems and $-1$ for Fermi
systems. The operators $H_i$ and $H_j$ must, of course, commute with $R \times P_s$.
If $\mathcal{H}_{rb+}$ is the eigenspace of $R$ with eigenvalue $+1$ and $\mathcal{H}_{rb-}$ that for
eigenvalue $-1$, then we obtain
$$\mathcal{H}_i = \mathcal{H}_{rb+} \times \{r_{1s}^2\}_+ \oplus \mathcal{H}_{rb-} \times \{r_{1s}^2\}_- \tag{3.13}$$
for Bose systems; for Fermi systems we obtain
$$\mathcal{H}_i = \mathcal{H}_{rb-} \times \{r_{1s}^2\}_+ \oplus \mathcal{H}_{rb+} \times \{r_{1s}^2\}_-. \tag{3.14}$$
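The content of (3.13) and (3.14) is a pairing rule between the parity of the relative orbital part and the exchange symmetry of the spin part. The following minimal sketch tabulates that rule; the function name and the tuple layout are assumptions for illustration, and the spin space of a system of spin $s$ is taken to have dimension $2s + 1$.

```python
from fractions import Fraction


def internal_sectors(spin):
    """List the allowed (orbital_parity, spin_symmetry, spin_dimension) sectors.

    orbital_parity is the eigenvalue of R, spin_symmetry the eigenvalue of P_s.
    Their product must be +1 for integer spin (Bose) and -1 for
    half-integer spin (Fermi). Empty spin sectors are omitted.
    """
    spin = Fraction(spin)
    d = int(2 * spin + 1)                       # dimension of one spin space
    statistics = +1 if spin.denominator == 1 else -1
    spin_dims = {+1: d * (d + 1) // 2,          # symmetric spin states
                 -1: d * (d - 1) // 2}          # antisymmetric spin states
    sectors = []
    for spin_symmetry in (+1, -1):
        orbital_parity = statistics * spin_symmetry
        if spin_dims[spin_symmetry] > 0:
            sectors.append((orbital_parity, spin_symmetry, spin_dims[spin_symmetry]))
    return sectors


electron_pair = internal_sectors(Fraction(1, 2))
spinless_pair = internal_sectors(0)
```

For two spin-1/2 systems this pairs the three symmetric spin states with odd relative parity and the single antisymmetric spin state with even relative parity.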
4 Composite Systems Consisting of Electrons and Atomic Nuclei
Now that we have introduced the characteristic structure for the concept of
composite systems for the case of two systems we shall seek to apply this
structure without additional discussion to the case of larger numbers of
subsystems. For applications of quantum mechanics to atoms and molecules
we need only consider the case in which the elementary subsystems are
electrons and atomic nuclei. We shall now present the construction of the
structure of a “composite system” in the form of a prescription since this form
is the most transparent.
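First Step: We consider the different elementary system types from which
the system under consideration is composed. Let the system types be denoted
by (1), (2), ..., (k). For each of these let $n_\nu$ denote the number of systems of
type (ν) which appear in the composite system.
Second Step: For each system type (ν) we then construct, with the aid of the
Hilbert spaces $\mathcal{H}_\nu$, the Hilbert space $\mathcal{H}_{n_\nu} = \{\mathcal{H}_\nu^{n_\nu}\}_+$ or $\mathcal{H}_{n_\nu} = \{\mathcal{H}_\nu^{n_\nu}\}_-$
depending on whether the spin of the system type (ν) is an integer or a
half-integer.
Third Step: We then construct the Hilbert space of the composite system as
follows:
$$\mathcal{H} = \mathcal{H}_{n_1} \times \mathcal{H}_{n_2} \times \cdots \times \mathcal{H}_{n_k}. \tag{4.1}$$
For the product representation (4.1) the corresponding maps (at time $t = 0$)
of $L(\mathcal{H}_\nu)$ into $L(\mathcal{H})$ are given by:
$$T_\nu F_\nu = 1 \times 1 \times \cdots \times F \times \cdots \times 1,$$
where $F \in L(\mathcal{H}_{n_\nu})$ is in the νth position. With respect to $\mathcal{H}_{n_\nu}$ the symmetric
operator $F$ in $\mathcal{H}_\nu^{n_\nu} = \mathcal{H}_\nu \times \mathcal{H}_\nu \times \cdots \times \mathcal{H}_\nu$ is defined by
$$F = \frac{1}{n_\nu}[F_\nu \times 1 \times \cdots \times 1 + 1 \times F_\nu \times \cdots \times 1 + \cdots + 1 \times 1 \times \cdots \times F_\nu].$$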
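The three steps can be followed numerically if each single-system Hilbert space is replaced by a finite-dimensional one. The sketch below is an illustration under that assumption only (the actual spaces are infinite-dimensional); it uses the standard dimension counts for the symmetric and antisymmetric subspaces of an n-fold product, and all names are hypothetical.

```python
from math import comb


def symmetrized_dimension(d, n, half_integer_spin):
    """Dimension of {H^n}_- (half-integer spin) or {H^n}_+ (integer spin)
    for a d-dimensional single-system space H (second step)."""
    if half_integer_spin:
        return comb(d, n)            # antisymmetric subspace
    return comb(d + n - 1, n)        # symmetric subspace


def composite_dimension(system_types):
    """Dimension of the product (4.1) (third step).

    system_types: list of (d, n, half_integer_spin), one entry per
    system type found in the first step.
    """
    total = 1
    for d, n, half_integer_spin in system_types:
        total *= symmetrized_dimension(d, n, half_integer_spin)
    return total


# Hypothetical example: two Fermi systems with d = 4 and one Bose system with d = 3.
example_dimension = composite_dimension([(4, 2, True), (3, 1, False)])
```

The antisymmetric count vanishes as soon as `n` exceeds `d`, which is the finite-dimensional situation in which one of the subspaces can be zero.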
It is only necessary to make the second step more mathematically precise.
In order not to keep track of multiple indices we shall now consider one
elementary system type with Hilbert space $\mathcal{H}$ and the n-fold product space
$\mathcal{H}^n$ of $\mathcal{H}$ with itself, which we define as follows:
We define $n$ Hilbert spaces $\mathcal{H}^{(1)}, \mathcal{H}^{(2)}, \ldots, \mathcal{H}^{(n)}$; suppose that for each $\mathcal{H}^{(\alpha)}$
we are given an isomorphic map $V_\alpha$ of $\mathcal{H}$ onto $\mathcal{H}^{(\alpha)}$. Here we shall consider
the pair of effects $F \in L(\mathcal{H})$ and $V_\alpha F V_\alpha^{-1} \in L(\mathcal{H}^{(\alpha)})$ to be identical (by
definition), that is, on the basis of the mapping principles they are interpreted
as being identical. Such isomorphisms are, for the most part, defined by
selecting a complete orthonormal basis $\phi_\nu$ in $\mathcal{H}$ and considering the images
$V_\alpha\phi_\nu$ in $\mathcal{H}^{(\alpha)}$. The $V_\alpha\phi_\nu$ are usually denoted by $\phi_\nu(\alpha)$. Such isomorphisms will
be used in XII-XV.
$\mathcal{H}^n$ is then defined by $\mathcal{H}^{(1)} \times \mathcal{H}^{(2)} \times \cdots \times \mathcal{H}^{(n)}$ together with the
specification of the isomorphisms $V_\alpha$. In such a product $\mathcal{H}^n$ we may define linear
permutation operators which also play an important role in practical
applications (XII-XV).
Using the above defined $\phi_\nu(\alpha)$,
$$\Psi_{\nu_1\nu_2\ldots\nu_n} = \phi_{\nu_1}(1)\phi_{\nu_2}(2)\cdots\phi_{\nu_n}(n) \tag{4.2}$$
defines a complete orthonormal basis in $\mathcal{H}^n$. If $P$ is a permutation among a
set of $n$ items (as defined in AV, §9) we define an operator in $\mathcal{H}^n$ by the
equation
$$P\,\Psi_{\nu_1\nu_2\ldots\nu_n} = \Psi_{\mu_1\mu_2\ldots\mu_n}, \tag{4.3}$$
where the $\mu_1, \mu_2, \ldots, \mu_n$ are obtained from the permutation of the “n indices”
$\nu_1, \nu_2, \ldots, \nu_n$ (see AV, §9). The operator $P$ is (according to (4.3)) a unitary
operator, since it transforms a complete orthonormal basis into another (in
this case it only changes the order of the basis vectors).
The definition (4.3) of the operator $P$ does not depend on the particular
choice of the basis $\phi_\nu$ but only on the product representation. The
independence of the definition (4.3) from the choice of a basis follows directly
from the fact that each vector of the form $\chi_1(1)\chi_2(2)\cdots\chi_n(n)$ is transformed
into a vector $\chi_{\alpha_1}(1)\chi_{\alpha_2}(2)\cdots\chi_{\alpha_n}(n)$ by $P$, where $P$ permutes the $n$ symbols
$1, 2, \ldots, n$ into $\alpha_1, \alpha_2, \ldots, \alpha_n$. This can be proved very simply as follows: For
a vector
$$\chi = \sum_{\nu_1 \ldots \nu_n} \Psi_{\nu_1\ldots\nu_n}\, a_{\nu_1\ldots\nu_n}$$
we obtain
$$P\chi = \sum_{\nu_1 \ldots \nu_n} \Psi_{\mu_1\ldots\mu_n}\, a_{\nu_1\ldots\nu_n}.$$
If $\eta_\rho$ is another complete orthonormal basis in $\mathcal{H}$ for which
$$\eta_\rho = \sum_\nu \phi_\nu b_{\nu\rho}$$
then we have
$$X_{\rho_1\ldots\rho_n} = \eta_{\rho_1}(1)\cdots\eta_{\rho_n}(n) = \sum_{\nu_1\ldots\nu_n} \Psi_{\nu_1\ldots\nu_n}\, b_{\nu_1\rho_1} b_{\nu_2\rho_2}\cdots b_{\nu_n\rho_n}$$
and therefore obtain
$$P X_{\rho_1\ldots\rho_n} = \sum_{\nu_1\ldots\nu_n} \Psi_{\mu_1\ldots\mu_n}\, b_{\nu_1\rho_1}\cdots b_{\nu_n\rho_n},$$
where the $\mu_1, \ldots, \mu_n$ are obtained using $P$ on the $\nu_1, \ldots, \nu_n$. If we apply the
permutation $P$ to the $b_{\nu_1\rho_1}, \ldots, b_{\nu_n\rho_n}$ we do not change the value of the
product; they appear in the sequence $b_{\mu_1\sigma_1}, \ldots, b_{\mu_n\sigma_n}$ where the $\sigma_1, \ldots, \sigma_n$ are
obtained from the $\rho_1, \ldots, \rho_n$ by means of the permutation $P$, so that we
obtain
$$P X_{\rho_1\ldots\rho_n} = \sum_{\mu_1\ldots\mu_n} \Psi_{\mu_1\ldots\mu_n}\, b_{\mu_1\sigma_1}\cdots b_{\mu_n\sigma_n} = X_{\sigma_1\ldots\sigma_n}.$$
The operators $P$ form a unitary representation of the symmetric group $S_n$
(see AV, §9).
In this case the general form of an operator $A$ which commutes with all the
operators $P$ may easily be determined according to AIV, §14 in the following
way: To each element $e_\alpha^{(\nu)}$ of the group ring (AV, §7, §9, and §10.6) there exists
a corresponding operator $E_\alpha^{(\nu)}$ in $\mathcal{H}^n$. The operators $E_\alpha^{(\nu)}$ and $E^{(\nu)} = \sum_\alpha E_\alpha^{(\nu)}$
are projection operators. Since $e = \sum_{\nu,\alpha} e_\alpha^{(\nu)}$ we therefore obtain
$\sum_{\nu,\alpha} E_\alpha^{(\nu)} = \sum_\nu E^{(\nu)} = 1$. From this partition of unity the Hilbert space $\mathcal{H}^n$
decomposes into subspaces $t^{(\nu)} = E^{(\nu)}(\mathcal{H}^n)$ and the $t^{(\nu)}$, in turn, decompose
into subspaces $t_\alpha^{(\nu)} = E_\alpha^{(\nu)}(\mathcal{H}^n)$: $\mathcal{H}^n = \sum_\nu \oplus\, t^{(\nu)}$ and $t^{(\nu)} = \sum_\alpha \oplus\, t_\alpha^{(\nu)}$. The
Hilbert spaces $t_\alpha^{(\nu)}$ (for fixed ν) are transformed into themselves by the
operator $A$; we may represent $t^{(\nu)}$ as follows: $t^{(\nu)} = d^{(\nu)} \times h^{(\nu)}$, where the
operators $P$ are represented in the finite-dimensional space $h^{(\nu)}$ by the νth
irreducible representation. The operators $A$ leave $t^{(\nu)}$ invariant and have the
form $(A^{(\nu)} \times 1)$ in $t^{(\nu)}$, where 1 is the unit operator in $h^{(\nu)}$ and $A^{(\nu)}$ is an
operator in $d^{(\nu)}$; the permutation operators have the form $(1 \times P^{(\nu)})$ in $t^{(\nu)}$,
where $P^{(\nu)}$ denotes the operator for the νth irreducible representation in $h^{(\nu)}$
and 1 is the unit operator in $d^{(\nu)}$. The operators $A^{(\nu)}$ in $d^{(\nu)}$ can be completely
arbitrary. Thus it follows that the general form of all operators which
commute with all the $P$ (here $\sum \oplus$ represents the fact that the operators $A$
and $P$ leave the spaces $t^{(\nu)}$ invariant) will be given by:
$$A = \sum_\nu \oplus\, (A^{(\nu)} \times 1), \qquad P = \sum_\nu \oplus\, (1 \times P^{(\nu)}). \tag{4.4}$$
In addition we find that unbounded operators which commute with all the $P$
also have the form $A = \sum_\nu \oplus\, (A^{(\nu)} \times 1)$, since the domain of definition of $A$ is
transformed by the permutation operators into itself. Therefore we may
construct the domain of definition of $A^{(\nu)}$ by use of $E_\alpha^{(\nu)}$ on the domain of
definition of $A$. The problem of the spectral decomposition of $A$ is also solved
if we know the spectral decomposition of the finitely many $A^{(\nu)}$. Therefore to each
irreducible representation (ν) of the permutation group there exists a
spectrum which corresponds to $A$ (provided that none of the $t^{(\nu)}$ is equal to
zero, a situation which can occur for a finite-dimensional Hilbert space $\mathcal{H}$;
see the presentation for the spin space and the Hilbert space in XIV, §7).
We shall now consider the possibility that one of the spaces $d^{(\nu)}$ has been
chosen as the Hilbert space $\mathcal{H}_n$ for $n$ identical systems. From experience we
find that there are only two possible cases: either $P^{(\nu)}$ belongs to the
symmetric representation (that is, the identity representation) or it belongs to
the antisymmetric representation of the permutation group. Since these
representations are one-dimensional we may identify $d^{(\nu)}$ with $t^{(\nu)}$, that is,
$t^{(\nu)} = \{\mathcal{H}^n\}_+$ or $t^{(\nu)} = \{\mathcal{H}^n\}_-$. We shall now impose the following postulate:
For elementary systems with integer values of spin we choose
$\mathcal{H}_n = \{\mathcal{H}^n\}_+$; for elementary systems with half-integer values of spin we
choose $\mathcal{H}_n = \{\mathcal{H}^n\}_-$.
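For the antisymmetric representation the operator $E^{(\nu)}$ has the form (see
AV, §9):
$$E^{(\nu)} = \frac{1}{n!}\sum_P (-1)^P P, \tag{4.5}$$
where $(-1)^P$ is $+1$ or $-1$ depending on whether $P$ is an even or an odd
permutation. For the $\Psi_{\nu_1\ldots\nu_n}$ defined according to (4.2) we find that the
$$\Gamma_{\nu_1\ldots\nu_n} = \frac{1}{n!}\sum_P (-1)^P P\,\Psi_{\nu_1\ldots\nu_n} \tag{4.6}$$
span the space $\{\mathcal{H}^n\}_-$. The $\Gamma_{\nu_1\ldots\nu_n}$ are equal to zero if two of the indices are
equal. The $\Gamma_{\nu_1\ldots\nu_n}$ (except for sign) are identical to $\Gamma_{\mu_1\ldots\mu_n}$ if the $\mu_1, \ldots, \mu_n$
are obtained from the $\nu_1, \ldots, \nu_n$ by a permutation. For the inner product of
two $\Gamma_{\nu_1\ldots\nu_n}$ (here $R = P^{-1}Q$) we obtain
$$\begin{aligned}
\langle \Gamma_{\nu_1\ldots\nu_n}, \Gamma_{\mu_1\ldots\mu_n} \rangle
&= \frac{1}{(n!)^2}\sum_{P,Q} (-1)^{PQ} \langle P\phi_{\nu_1}(1)\cdots\phi_{\nu_n}(n),\ Q\phi_{\mu_1}(1)\cdots\phi_{\mu_n}(n) \rangle \\
&= \frac{1}{(n!)^2}\sum_{P}\sum_{R} (-1)^{R} \langle \phi_{\nu_1}(1)\cdots\phi_{\nu_n}(n),\ R\phi_{\mu_1}(1)\cdots\phi_{\mu_n}(n) \rangle \\
&= \frac{1}{n!}\sum_{R} (-1)^{R} \langle \phi_{\nu_1}(1)\cdots\phi_{\nu_n}(n),\ R\phi_{\mu_1}(1)\cdots\phi_{\mu_n}(n) \rangle.
\end{aligned}$$
From the above remark it suffices to choose one index sequence $\nu_1, \ldots, \nu_n$
from all the index sequences obtained by permutation, together with the
corresponding $\Gamma_{\nu_1\ldots\nu_n}$. If $\mu_1, \mu_2, \ldots, \mu_n$ is a “different” index sequence,
then there must be a $\nu_i$ which does not occur in $\mu_1, \mu_2, \ldots, \mu_n$; then $\Gamma_{\nu_1\nu_2\ldots\nu_n}$
will be orthogonal to $\Gamma_{\mu_1\mu_2\ldots\mu_n}$. If both index sequences are identical (that is,
$\nu_i = \mu_i$) then (since no pair of indices $\nu_i$ are equal and therefore only the
summand for which $R = 1$ yields a nonzero term) we obtain
$$\|\Gamma_{\nu_1\ldots\nu_n}\|^2 = \frac{1}{n!}. \tag{4.7}$$
In this way the so-called Slater determinant
$$\gamma_{\nu_1\ldots\nu_n} = \frac{1}{\sqrt{n!}}\sum_P (-1)^P P\,\Psi_{\nu_1\ldots\nu_n}
= \frac{1}{\sqrt{n!}}
\begin{vmatrix}
\phi_{\nu_1}(1) & \phi_{\nu_1}(2) & \cdots & \phi_{\nu_1}(n) \\
\phi_{\nu_2}(1) & \phi_{\nu_2}(2) & \cdots & \phi_{\nu_2}(n) \\
\vdots & \vdots & & \vdots \\
\phi_{\nu_n}(1) & \phi_{\nu_n}(2) & \cdots & \phi_{\nu_n}(n)
\end{vmatrix} \tag{4.8}$$
defines a complete orthonormal basis in $\{\mathcal{H}^n\}_-$, where for each $\gamma_{\nu_1\ldots\nu_n}$ only
one index sequence is chosen from the index sequences obtained by
permutation.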
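The properties of the antisymmetrized vectors stated above (vanishing for repeated indices, sign change under permutation of the index sequence, squared norm $1/n!$) can be checked on small examples. The following is a minimal sketch: a vector is represented as a dictionary from index tuples to coefficients in the product basis (4.2), and all function names are assumptions for illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sign(perm):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def antisymmetrize(indices):
    """Coefficients of (1/n!) * sum_P (-1)^P P Psi_indices in the product basis."""
    n = len(indices)
    vec = {}
    for perm in permutations(range(n)):
        key = tuple(indices[p] for p in perm)
        vec[key] = vec.get(key, 0) + Fraction(sign(perm), factorial(n))
    # Drop basis vectors whose coefficients cancelled.
    return {key: c for key, c in vec.items() if c != 0}


def inner(u, v):
    """Inner product of two real coefficient vectors."""
    return sum(c * v.get(key, 0) for key, c in u.items())


gamma_012 = antisymmetrize((0, 1, 2))
gamma_102 = antisymmetrize((1, 0, 2))
gamma_001 = antisymmetrize((0, 0, 1))
gamma_013 = antisymmetrize((0, 1, 3))
```

Multiplying such a vector by $\sqrt{n!}$ gives the normalized combination that appears as the Slater determinant.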
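For the identity representation the operator $E^{(\nu)}$ has the simple form
(AV, §9):
$$E^{(\nu)} = \frac{1}{n!} \sum_P P. \tag{4.9}$$
We obtain a basis for $\{\mathcal{H}^n\}_+$ from the
$\Psi_{\nu_1 \cdots \nu_n}$ as follows:
$$\Omega_{\nu_1 \cdots \nu_n} = \frac{1}{n!} \sum_P P\, \Psi_{\nu_1 \cdots \nu_n}. \tag{4.10}$$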
The $\Omega_{\nu_1 \cdots \nu_n}$ are never equal to zero; however, we obtain
the same element when we permute the index sequence. Therefore, for the
$\Omega_{\nu_1 \cdots \nu_n}$ we must select only one index sequence from all
the possible index sequences obtained by permutation, that is, according to
(4.11) we do not pay attention to the sequence of the indices
$\nu_1, \ldots, \nu_n$ and only consider index sequences to be different if
different indices are present. It is easy to see that the
$\Omega_{\nu_1 \cdots \nu_n}$ which differ in the index sequences are
orthogonal.
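In order to obtain normalized vectors we compute
$\|\Omega_{\nu_1 \cdots \nu_n}\|^2$. In order to do this we must know how
many indices are identical. Since we do not consider the order we shall
assume
$$\nu_1, \nu_2, \ldots, \nu_n = \mu_1, \mu_2, \ldots, \mu_{n_1};\ \rho_1, \rho_2, \ldots, \rho_{n_2};\ \ldots;\ \eta_1, \eta_2, \ldots, \eta_{n_\nu}, \tag{4.11}$$
where the $\mu_i$, the $\rho_i$, ..., the $\eta_i$ are each identical among
themselves and $n_1 + n_2 + \cdots + n_\nu = n$. Therefore we obtain (here
$R = P^{-1}Q$):
$$\begin{aligned}
\|\Omega_{\nu_1 \cdots \nu_n}\|^2
&= \frac{1}{(n!)^2} \sum_{P,Q}
   \langle P\,\varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n),\;
           Q\,\varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n) \rangle \\
&= \frac{1}{(n!)^2} \sum_P \sum_R
   \langle \varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n),\;
           R\,\varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n) \rangle \\
&= \frac{1}{n!} \sum_R
   \langle \varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n),\;
           R\,\varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n) \rangle .
\end{aligned}$$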
The inner products in the last sum vanish unless
$$R(\mu_1, \mu_2, \ldots, \eta_{n_\nu}) = (\mu_1, \mu_2, \ldots, \eta_{n_\nu});$$
if this is the case then they are equal to 1. In the sum there are as many
terms equal to 1 as there are permutations which transform the series (4.11)
into itself; there are exactly $n_1!\, n_2! \cdots n_\nu!$ such permutations.
Therefore we obtain
$$\|\Omega_{\nu_1 \cdots \nu_n}\|^2 = \frac{1}{n!}\, n_1!\, n_2! \cdots n_\nu!. \tag{4.12}$$
Therefore, for a complete orthonormal basis for $\{\mathcal{H}^n\}_+$ we
obtain:
$$\omega_{\nu_1 \cdots \nu_n} = \frac{1}{\sqrt{n!\, n_1!\, n_2! \cdots n_\nu!}} \sum_P P\, \varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n). \tag{4.13}$$
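The counting argument behind (4.12) can be verified by brute force for short
index sequences. This is a sketch under the same assumption as before (a
product vector is represented by its tuple of indices); the function names
are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def symmetrize(indices):
    """Coefficients of (1/n!) sum_P P Psi in the orthonormal product basis."""
    n = len(indices)
    counts = Counter()
    for perm in permutations(range(n)):
        counts[tuple(indices[p] for p in perm)] += 1
    return {key: Fraction(c, factorial(n)) for key, c in counts.items()}


def norm_squared(vec):
    """Squared norm of a vector with real coefficients in an orthonormal basis."""
    return sum(c * c for c in vec.values())


def predicted_norm_squared(indices):
    """Right-hand side of (4.12): n_1! n_2! ... divided by n!."""
    product = 1
    for multiplicity in Counter(indices).values():
        product *= factorial(multiplicity)
    return Fraction(product, factorial(len(indices)))
```

For the index sequence `(0, 0, 1)`, for example, both `norm_squared` and
`predicted_norm_squared` give 2! 1! / 3! = 1/3.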
We will not carry out the separation of the center of mass as we have done
in §2 and §3. We shall do this later in our discussion of applications.
We shall also not consider the detailed analysis of the representation of the
Galileo group since, by analogy to (2.6), the transformations are given as a
product of $\sum_a n_a$ factors. The time translation operator in
$\mathcal{H}$ from (4.1) is given by
$$U(1, 0, 0, \gamma) = e^{i\gamma H}, \tag{4.14}$$
where $H$ must commute with translations, rotations, and permutation
operators.
In closing we shall now formulate the following axiom:
Every system type is either elementary (either an electron or an atomic
nucleus) or consists of a composite of a finite number of electrons and atomic
nuclei. Here the number $n_1$ of electrons and the numbers $n_2, n_3, \ldots$
of atomic nuclei of identical type uniquely define the type of the composite
system.
This axiom is, in part, a description of microsystems and, in part, reflects
the boundary of the fundamental domain to which we shall apply the theory.
This axiom places no restriction on the numbers $n_1$ and $n_a$. There exist
grave doubts, indeed, strong evidence from experience, that the formal theory
for system types cannot be g.G.-closed for systems with very large values of
$n_a$, that is, must be extended by a more comprehensive theory (see, for
example, [2], XV and [13]). This means that eventually we must place into
question predictions of the theory for large $n_a$ concerning physical
possibilities (see [1], §10). We may only obtain reference points for what is
possible in nature for small values of $n_a$. The larger the value of $n_a$
the larger the possible discrepancy between the theoretical possibilities and
the actual possibilities. We introduced these critical remarks in order to
urge the reader to critically examine all the structure inherent in the
theory.
5 The Hamiltonian Operator
The representation of the Galileo group is known for the case of composite
systems (in the sense of a generalization of (2.6)) if the time translation
operator (that is, the Hamiltonian) in (4.14) is known. $H$ can therefore be
given for every system type. According to VII (5.2) for elementary systems we
have
$$H = \frac{1}{2m} P^2. \tag{5.1}$$
What is H for the case of composite systems?
The first answer to this question is disappointing. More precisely, there
exists no $H$, that is, the description already presented in §4 can only
represent an approximation of experience, because it is not difficult to give
examples of experiments which contradict (4.14). The physicist can rapidly
determine where the deviations from (4.14) arise: As long as the emission of
light (that is, of electromagnetic radiation) plays a role, (4.14) can only
be an approximation. For elementary systems (5.1) is in agreement with
experience.
We may strive to obtain a better description than that given in §4 in two
different ways.
The first attempt seeks to add the electromagnetic radiation to the
microsystem, that is, the action carriers from the preparation to the
registration systems. Then we must give up the picture of a composite system
of the type described in §4, since, for example, the composite system “atom”
can emit new systems—light quanta. This method, known under the
designation “Quantum Electrodynamics” has led to fantastic success. It is,
however, not clear today whether it yields a mathematically correct
description. For the latter reason we shall not consider this theory here; we
shall only consider mathematically correct theories. Therefore we shall now
consider the second attempt, in which we must take into consideration
inadequate physical concepts and mathematically unpleasant formulations.
In the second approach (4.14) appears as a “first” approximation which
yields good results as long as the radiation of light does not play a significant
role. For this first approximation we will give a Hamiltonian operator
without giving any additional theoretical reasons. The first estimate for H
was obtained from the correspondence principle which can be found in every
quantum mechanics textbook (see, for example, [2], XI). The most general
estimate is obtained as an approximation from quantum electrodynamics; in
(5.8) we shall give such an estimate for $H$. In the domain of the theory
presented here (5.8) will therefore be an axiom. The emission of radiation will
therefore be described in a second “approximation” step in the following
way.
We may consider the registration of the light emitted by an atom or a
molecule as an effect triggered by the atom or molecule. This conception is
correct in terms of the meaning of the formulation of quantum mechanics
presented here and also of the meaning of quantum electrodynamics. In XVII
we shall see how the effects for a system can be produced “with the help” of
other systems. According to quantum electrodynamics we may indeed
express an effect produced by a light quantum as an effect which is produced
by the atom or molecule (by analogy with the considerations of XVII, §2). In
XI, §1 we will outline a route by which we may guess an operator $F \in L$
which describes these effects. The form of this effect operator can be
considered to be an axiom in the context of the theory presented here.
In the second approximation step the effect of the emission of a quantum of
radiation is treated by introducing a correction to (4.14). According to (4.14)
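the frequency for an effect $F$ and for its time displacement by $\gamma$ is
calculated as follows:
$$\operatorname{tr}(WF) \quad \text{and} \quad \operatorname{tr}(WF_\gamma) = \operatorname{tr}(W e^{iH\gamma} F e^{-iH\gamma}). \tag{5.2}$$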
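The required correction of (5.2) may be simply determined by realizing that
$W$ is also somewhat dependent on $\gamma$, that is, instead of writing the
right-hand side of (5.2) as above we write
$$\operatorname{tr}(W_\gamma F_\gamma) = \operatorname{tr}(W_\gamma e^{iH\gamma} F e^{-iH\gamma}) \tag{5.3}$$
and for $W_\gamma$ we introduce a differential equation of the form:
$$\frac{dW_\gamma}{d\gamma} = \sum_{n,m} \left[ A_{nm} W_\gamma A_{nm}^+ - \tfrac{1}{2} A_{nm}^+ A_{nm} W_\gamma - \tfrac{1}{2} W_\gamma A_{nm}^+ A_{nm} \right], \tag{5.4}$$
where the $A_{nm}$ are operators in the Hilbert space of the composite
systems. From (5.4) it follows that
$$\frac{d}{d\gamma} \operatorname{tr}(W_\gamma) = 0 \tag{5.5}$$
and
$$\frac{d}{d\gamma} \langle \varphi, W_\gamma \varphi \rangle = \sum_{n,m} \left[ \langle A_{nm}^+ \varphi, W_\gamma A_{nm}^+ \varphi \rangle - \tfrac{1}{2} \langle \varphi, A_{nm}^+ A_{nm} W_\gamma \varphi \rangle - \tfrac{1}{2} \langle W_\gamma \varphi, A_{nm}^+ A_{nm} \varphi \rangle \right]. \tag{5.6}$$
Since for $\gamma = 0$ the expression
$\langle \varphi, W_\gamma \varphi \rangle \geq 0$, it could only become
negative if there exists a value $\gamma_0$ of $\gamma$ for which
$\langle \varphi, W_{\gamma_0} \varphi \rangle = 0$ and
$[(d/d\gamma) \langle \varphi, W_\gamma \varphi \rangle]_{\gamma_0} < 0$.
From $\langle \varphi, W_{\gamma_0} \varphi \rangle = 0$ it follows that
$W_{\gamma_0} \varphi = 0$ and therefore, by (5.6),
$[(d/d\gamma) \langle \varphi, W_\gamma \varphi \rangle]_{\gamma_0} \geq 0$.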
The form (5.4) therefore guarantees that a mixture morphism
$S_\gamma: K \to K$ (which depends on $\gamma$) is defined by
$W_0 \to W_\gamma$; it may not, however, be a mixture automorphism! We may
therefore rewrite (5.3) as follows:
$$\operatorname{tr}((S_\gamma W) e^{iH\gamma} F e^{-iH\gamma}) = \operatorname{tr}(W [S_\gamma'(e^{iH\gamma} F e^{-iH\gamma})]). \tag{5.7}$$
For the case of radiation emission we shall first give an explicit equation for
(5.4) in XI (1.18).
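A small numerical sketch can make the two properties used above concrete: an
equation whose right-hand side is a sum of terms
$A W A^+ - \tfrac{1}{2} A^+ A W - \tfrac{1}{2} W A^+ A$ leaves the trace of $W$
unchanged and keeps $W$ positive. Everything specific in the example is an
assumption made for illustration and is not taken from the text: the
two-level system, the single operator `a` (a hypothetical decay operator),
the initial `w0`, and the crude explicit Euler integration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def rhs(w, ops):
    """Sum over the operators A of A W A+ - (1/2) A+ A W - (1/2) W A+ A."""
    n = len(w)
    out = [[0j] * n for _ in range(n)]
    for a in ops:
        ad = dagger(a)
        awad = matmul(matmul(a, w), ad)
        ada = matmul(ad, a)
        adaw = matmul(ada, w)
        wada = matmul(w, ada)
        for i in range(n):
            for j in range(n):
                out[i][j] += awad[i][j] - 0.5 * adaw[i][j] - 0.5 * wada[i][j]
    return out


def evolve(w, ops, gamma, steps=2000):
    """Integrate dW/dgamma = rhs(W) from 0 to gamma with explicit Euler steps."""
    h = gamma / steps
    n = len(w)
    for _ in range(steps):
        d = rhs(w, ops)
        w = [[w[i][j] + h * d[i][j] for j in range(n)] for i in range(n)]
    return w


# Hypothetical example data: one decay operator and a positive W of trace 1.
a = [[0j, 1 + 0j], [0j, 0j]]
w0 = [[0.25 + 0j, 0.25 + 0j], [0.25 + 0j, 0.75 + 0j]]
w_final = evolve(w0, [a], 2.0)
```

The trace of `rhs(w0, [a])` vanishes by the cyclic property of the trace,
which is the content of (5.5); the evolved matrix `w_final` still has trace 1
and a positive determinant.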
There is a great similarity between (5.3) and (5.4) and the interaction
picture given in X, §3 except that the quantity Sy described above is not an
automorphism.
Now that we have presented an overview of the second method which we
shall use in X-XVI of this book we shall make a few remarks concerning
approximations. The Hamiltonian operator given in (5.8) is so complex that
it is essential that simplified approximations be used for practical
applications of the theory. Physicists are so accustomed to this situation
that they
have no difficulty in accepting it. Often mathematicians will not accept an
approximation until they have obtained formulas for the estimation of
errors. Physicists are seldom concerned with such estimates because they can
compare the approximate theory directly with experiment. Since every
physical theory is not an exact picture of reality, every physicist must learn, in
terms of experience, how good the theory is. A conceptually detailed
discussion about the methods of approximation can be found in [1], §8.
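The Hamiltonian operator which is well suited for systems consisting of $n_1$
electrons and $n_2$ atomic nuclei of the same type with charge $Z$ and mass
$M$ is given by (here Latin indices are summation indices for electrons, Greek
indices are summation indices over the nuclei)
$$H = H_0 + H_1 + \cdots + H_6,$$
where the individual terms are given by
$$\begin{aligned}
H_0 = {} & \frac{1}{2m} \sum_i (P_i - eA(r_i, t))^2 + \frac{1}{2M} \sum_\nu (P_\nu - Z|e|A(r_\nu, t))^2 \\
& + \sum_{i<k} \frac{e^2}{r_{ik}} + \sum_{\nu<\mu} \frac{Z^2 e^2}{r_{\nu\mu}} - \sum_{i,\nu} \frac{Z e^2}{r_{i\nu}} \\
& + \sum_i e\varphi(r_i, t) + \sum_\nu Z|e|\varphi(r_\nu, t);
\end{aligned}$$
[The explicit expressions for $H_1, \ldots, H_6$ are not legible in the
source. Of their content only the following can be recognized: sums over
electron pairs $i<k$ with powers of $1/r_{ik}$, terms of the form
$(r_{ik} \times P_i) \cdot S_k$, and terms containing $\delta(r_{i\nu})$.]
(5.8)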
Here we have also taken into account “external fields” where the latter are
described by means of a given vector potential $A(r, t)$ and a scalar
potential $\varphi(r, t)$. We will systematically treat the problem of
external fields in §6; we have included them in (5.8) for completeness. In
(5.8) we have denoted the position operators by $\vec r_i$ or $\vec r_\nu$ and
$\vec r_{ik} = \vec r_i - \vec r_k$, $r_{ik} = |\vec r_{ik}|$, etc.; the spin
operators for the $i$th electron are denoted by $S_i$. $B = \operatorname{curl} A$
is the external magnetic field.
The form of (5.8) can hardly be obtained from the correspondence
principle. It follows directly from quantum electrodynamics (where the latter
unfortunately does not have a correct mathematical representation).
In (5.8) we have only considered a single type of atomic nucleus. It is trivial
to extend (5.8) to the case of several types of nuclei with different charges Z.
In addition, it is important to note that, mathematically, $H$ as defined by
(5.8) is not well defined. Since $H$ is an unbounded operator it is necessary
to specify its domain of definition. We shall not do so here. We only note
that it must be a subset of the well-defined domain of definition of the
kinetic energy
$$\frac{1}{2m} \sum_i P_i^2 + \frac{1}{2M} \sum_\nu P_\nu^2.$$
Difficulties in the specification of the domain of definition of H arise from
singularities of the “position functions” in (5.8). In order to find a physically
meaningful definition of H consider the point charge of an electron or
nucleus as the limit of a charged sphere of fixed charge as the radius of the
sphere tends to 0. This picture is helpful in overcoming some of the difficulties
associated with the possible multiple meaning of (5.8), especially with respect
to the $\delta$-function $\delta(r_{i\nu})$. In this book we shall not deal with the mathematically
complicated problem of the determination of the domain of definition of the
operator H. In the problems presented in XII-XIV we shall occasionally
make reference to this problem.
According to VII, §7 the parity operator $r$ can be represented in the form
$$U(r) \times U(r) \times \cdots \times U(r), \tag{5.9}$$
where $U(r)$ is given by VII (7.4). In this way we obtain a representation of
the complete Galileo group $\mathcal{G}(r)$ since the Hamiltonian operator
(5.8) (excluding the case of external fields) commutes with the
transformations (5.9). This commutativity plays an important role in the
investigation of atomic spectra presented in XI-XIV.
6 Microsystems in External Fields
A frequently used and, for many experiments, very important approximation
is that of the external fields. We shall briefly discuss this method and
its application.
In many experiments the composite system consists of a microsystem
(electron, atom, molecule) and a much larger macroscopic system. If a
microsystem enters the structure of a macrosystem—such as a particle in a
counter—then we cannot use the approximation of the external field
mentioned above. If, however, the microsystem remains away from the
atomic structure of the macrosystem then the interaction can be described to
good approximation with the help of the external field. We also require that
the microsystem does not produce such changes which alter the external field
produced by the macrosystem.
From this first, yet somewhat provisional limitation of the application
domain for the method of the external field, we find that we leave the domain
of application of quantum mechanics with its different discrete (!) system
types; since we have incorporated macrosystems which can be described
objectively in terms of external fields, as we have outlined in connection with
axiom AV 4 in III, §3. Nevertheless it is possible to retain this approximation
in the “domain” of quantum mechanics (see [1], §12.3), precisely because in
this approximation the macrosystems appear only in the form of given fields,
and are not influenced by the microsystems themselves. The influence of
macrosystems by microsystems, after all, is decisively important for the
“measurement process,” that is, for a more precise physical analysis of the
registration of microsystems. We have only superficially described these
registration processes in terms of the structure $\mathcal{R}_0$, $\mathcal{R}$ introduced in II, §4.2 (a
precise description is given in [13]).
As exterior fields we consider only electromagnetic fields which may be
described by a vector potential $A(r, t)$ and scalar potential
$\varphi(r, t)$. Here $r$, $t$ are the technically specified laboratory
space-time coordinates. Here $(A, \varphi)$ are considered to be fields which
were specified in terms of the experimental arrangement. The electromagnetic
field is obtained from $(A, \varphi)$ in the usual way:
$$E = -\frac{\partial A}{\partial t} - \operatorname{grad} \varphi, \qquad B = \operatorname{curl} A. \tag{6.1}$$
In terms of quantum mechanics we now have a “composite” system
consisting of a microsystem of type (1) and a field $(A, \varphi)$. We may
characterize such a system by the “index” (1) $(A, \varphi)$. Since the type
(1) system will be held fixed in the following considerations, it suffices to
characterize a system type of composite systems by using $(A, \varphi)$ as an
“index.” The “discrete” index for system types is then replaced by the
“continuous” index $(A, \varphi)$. Two different function pairs
$(A(r, t), \varphi(r, t))$ will therefore be considered to be different
indices. Corresponding to this, an effect procedure $(b_0, b)$ does not
determine a single effect $F \in L(\mathcal{H})$, where $\mathcal{H}$ is the
Hilbert space of the microsystems (as a subsystem of the composite system
consisting of the microsystem and the field $(A, \varphi)$), but an operator
which depends on $A$, $\varphi$: $\psi(b_0, b) = F(A, \varphi)$, where
$(A, \varphi)$ occurs instead of the discrete index in the formula:
$$\psi(b_0, b) = (F_1, F_2, \ldots, F_i, \ldots).$$
$F(A, \varphi)$ is, as an operator in $\mathcal{H}$, a function of the field
$(A, \varphi)$.
The fact that an effect procedure $(b_0, b)$ generally determines an effect
operator $F(A, \varphi) \in L(\mathcal{H})$ which depends upon a field
$(A, \varphi)$ is usually neglected in quantum mechanics. Such neglect can
lead to difficulties in the interpretation of the Galileo group.
If we wish to be more general we must introduce a $\sigma$-ring $\Sigma$ of
measurable subsets in the space of the fields $(A, \varphi)$, and to each
individual preparation procedure there will correspond an entire
$\sigma$-additive measure $\Sigma \to K(\mathcal{H})$ as an ensemble. For
such a measure the probability will be computed by the formula (6.2),
where the integration over $(A, \varphi)$ corresponds to the summation over
the different system types in the formula $\operatorname{tr}(W_i F_i)$ for
the probabilities. Here, however, we shall return to the usual description,
where we assume that the field $(A, \varphi)$ is, for all practical purposes,
uniquely specified by the experiment, that is, was uniquely prepared, that
is, the dispersion of the field can be neglected. Then the measure $W$ is
nonzero only in the neighborhood of a single point in the space of
$(A, \varphi)$. Instead of (6.2),
$$\operatorname{tr}(W F(A, \varphi)) \tag{6.3}$$
is the probability that the effect $F(A, \varphi)$ will occur for the
“specified field” $(A, \varphi)$. For the same preparation procedure $a$ for
which $\varphi(a) = W$ the probability for the effect procedure $(b_0, b)$
for which $\psi(b_0, b) = F(A, \varphi)$ will depend on the field
$(A, \varphi)$. The condition that $\varphi(a) = W$ is independent of the
field is correct only if we prepare the microsystems in a space region
outside the field.
After these introductory remarks we will begin with (6.3), $W = \varphi(a)$
and $F(A, \varphi) = \psi(b_0, b)$ as the starting point for the description
of the microsystems under consideration in an external field. In this way we
obtain a description which can be carried out using the Hilbert space
$\mathcal{H}$ of the microsystems.
The essential structure which must be added to (6.3) consists of the
specification of the Galileo group as the transformation group of the
registration methods relative to the preparation procedures and therefore also
relative to the external field $(A, \varphi)$ as was physically interpreted in VII, §1.
Here we observe that this representation cannot be the representation of
the Galileo group in the Hilbert space for system (1) (that is, without
external fields) because the effects $F(A, \varphi)$ span a different Banach
space than $\mathcal{B}(\mathcal{H})$ does, because the set of
$F(A, \varphi)$ is a set of maps from the space of the fields $(A, \varphi)$
into $\mathcal{B}(\mathcal{H})$! Here we shall not proceed any further towards
a more precise determination of this Banach space of maps; we would have to
introduce a uniform structure into the space of $(A, \varphi)$ and require
that the maps be uniformly continuous. A representation of the Galileo group
in the space of the functions $F(A, \varphi)$ is not uniquely determined,
even less so than in the space $\mathcal{B}(\mathcal{H})$ of composite
microsystems.
For a certain range of applications the following method leads to the
determination of the representation of the Galileo group, that is, leads to a
useful theory. The method is copied from that presented in §2.
To each time $t$ there exists a map $T_t^{(1)}$ of $L(\mathcal{H})$ into the
set of effects $F(A, \varphi)$ for which
$$T_t^{(1)} F = F_t, \qquad T_{t+\gamma}^{(1)} F = F_{t+\gamma} = V(1, 0, 0, \gamma) F_t, \tag{6.4}$$
where $V(D, \delta, \eta, \gamma)$ represents the Galileo transformation of
the $F(A, \varphi)$. According to the second requirement in (6.4) it is
sufficient to specify $T_t^{(1)}$ for $t = 0$.
The physical interpretation is that $F_t = T_t^{(1)} F$ represents an effect
for a registration apparatus which is only in interaction with the
microsystems for a short time at time $t$. The structure introduced by (6.4)
is probably physically useful only when it is possible to make such
relatively “quick” measurements, where we must later specify what we mean by
“quick.”
By analogy with §2 we require that for system (2) (that is, the field
$(A, \varphi)$) we are able to register the field independently of the
microsystem at each time $t$. We express this as follows: The special effects
$$F_t(A, \varphi) = \mathbf{1} f_t(A, \varphi), \tag{6.5}$$
where $f_t(A, \varphi)$ are real functions of the field (and its first
derivatives) at time $t$, are to be interpreted as effects which depend
“only” on the field at time $t$. This does not mean that the signal which
corresponds to the effect (6.5) occurs at time $t$ but that the registration
apparatus is only influenced by the behavior of the field during a short time
span around $t$.
Let us now consider (6.4) and (6.5) for the special case where $t = 0$:
$$F_0 = T_0^{(1)} F, \tag{6.6}$$
$$F_0(A, \varphi) = \mathbf{1} f_0(A, \varphi). \tag{6.7}$$
Here $F_0$ in (6.6) corresponds to the effect $F_1 \times \mathbf{1}$ from §2
and $\mathbf{1} f_0$ corresponds to the effect $\mathbf{1} \times F_2$ from
§2. The effects $F_1 \times F_2$ from §2 here correspond to an effect of the
form $F_0 f_0(A, \varphi)$.
It is reasonable to transform (2.6) into the following form. Let
$U_1(D, \delta, \eta, 0)$ denote the representative of the Galileo
transformation for the microsystem (1) without an external field. How may we
introduce the Galileo transformations of the effects $f_0(A, \varphi)$, that
is, of the “effects of the pure fields” (without microsystems)? This
formulation must be based on the physical interpretation of these effects.
If, for example, the registration apparatus characterized by
$f_0(A, \varphi)$ is translated by $\eta$ then it is apparently subjected to
the field
$$A'(r, t) = A(r + \eta, t), \qquad \varphi'(r, t) = \varphi(r + \eta, t), \tag{6.8}$$
that is, it responds as if the field (6.8) is acting on the original
apparatus. The translated registration apparatus then corresponds to an
$f_0'$ for which
$$f_0'(A, \varphi) = f_0(A', \varphi'), \tag{6.9}$$
where $A'$, $\varphi'$ were defined in (6.8). We now generally define a
Galileo transformation of the field by
$$k(D, \delta, \eta, \gamma)(A, \varphi) = (A', \varphi'), \tag{6.10}$$
where
$$k(1, 0, \eta, \gamma)(A(r, t), \varphi(r, t)) = (A(r - \eta, t - \gamma), \varphi(r - \eta, t - \gamma)), \tag{6.11}$$
$$k(D, 0, 0, 0)(A(r, t), \varphi(r, t)) = (D A(D^{-1} r, t), \varphi(D^{-1} r, t)). \tag{6.12}$$
The $k(1, \delta, 0, 0)$ are to be defined in a similar way as an
approximation of the proper Lorentz transformations of $A$ and $\varphi$. We
will not enter into a discussion of such approximations.
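The relation between a translated apparatus and a transformed field can be
illustrated with a toy model. The sketch below makes several simplifying
assumptions that are not in the text: only the scalar potential is treated,
space is one-dimensional, and the "apparatus" `f0` is a hypothetical reading
of the field at $r = 0$, $t = 0$ clipped to the interval $[0, 1]$.

```python
def k(eta, gamma):
    """Field transformation for a translation eta and time shift gamma:
    the new field at (r, t) is the old field at (r - eta, t - gamma)."""
    def transform(phi):
        return lambda r, t: phi(r - eta, t - gamma)
    return transform


def k_inverse(eta, gamma):
    """Inverse of k(eta, gamma)."""
    def transform(phi):
        return lambda r, t: phi(r + eta, t + gamma)
    return transform


def f0(phi):
    """Hypothetical field effect: depends only on phi at r = 0, t = 0."""
    return min(1.0, max(0.0, phi(0.0, 0.0)))


def translated(f, eta):
    """Apparatus translated by eta: it responds to the field phi as the
    original apparatus responds to phi'(r, t) = phi(r + eta, t)."""
    return lambda phi: f(k_inverse(eta, 0.0)(phi))


def example_field(r, t):
    """Hypothetical scalar potential used only for this illustration."""
    return 0.5 + 0.1 * r + 0.05 * t
```

In this toy model `translated(f0, 2.0)(example_field)` equals the field value
at $r = 2$, as one expects for an apparatus moved to that position, and
`k_inverse` undoes `k` at every sample point.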
We now define the transformation of $f_0(A, \varphi)$ which corresponds to
the transformation $(D, \delta, \eta, 0)$ as a generalization of (6.9):
$$f_0'(A, \varphi) = f_0(k^{-1}(D, \delta, \eta, 0)(A, \varphi)). \tag{6.13}$$
We transform (2.6) for this case as follows:
$$V(D, \delta, \eta, 0) F_0 f_0(A, \varphi) = U_1(D, \delta, \eta, 0) F_0 U_1^+(D, \delta, \eta, 0)\, f_0(k^{-1}(D, \delta, \eta, 0)(A, \varphi)). \tag{6.14}$$
According to (6.14) $V(D, \delta, \eta, 0) F_0 f_0$ takes on the same form
$F' f'$ as does $F_0 f_0$: the $T_0^{(1)} F = F_0$ and the $\mathbf{1} f_0$
are separately transformed by the Galileo transformations without time
translation. If we were to require that (6.14) also holds for the case of a
time translation, that would mean that the external field exerts no influence
on the microsystem. By analogy with the case described in §2 we define
$$V(1, 0, 0, \tau) F_0 f_0(A, \varphi) = U_\tau F_0 U_\tau^+\, f_0(k^{-1}(1, 0, 0, \tau)(A, \varphi)), \tag{6.15}$$
where the unitary family of operators $U_\tau$ ($U_0 = 1$) depends on the
field $A$, $\varphi$. For $U_\tau$ we require that the following differential
equation be satisfied:
$$\frac{dU_\tau}{d\tau} = i H_\tau U_\tau, \tag{6.16}$$
where $H_\tau$ is a time-dependent Hamiltonian operator.
We will often make use of the special effects $T_0^{(1)} F = F_0$ defined in
(6.15) for which we write $V(1, 0, 0, t) F_0 = F_t$. From (6.15) we find that
$$F_t = U_t F_0 U_t^+. \tag{6.17}$$
From (6.16) and (6.17) we obtain the following differential equation for
$F_t$:
$$\frac{dF_t}{dt} = i(H_t F_t - F_t H_t). \tag{6.18}$$
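The step from $F_t = U_t F_0 U_t^+$ to the commutator equation can be checked
numerically in the simplest situation. The sketch below assumes, purely for
illustration, a time-independent and diagonal Hamiltonian
$H = \operatorname{diag}(E_1, E_2)$, so that $U_t = e^{iHt}$ is diagonal as
well; the function names and the numbers used are hypothetical.

```python
import cmath


def evolve_effect(f0, energies, t):
    """F_t = U_t F_0 U_t^+ with U_t = exp(i H t) and H = diag(energies)."""
    n = len(energies)
    return [[cmath.exp(1j * (energies[j] - energies[k]) * t) * f0[j][k]
             for k in range(n)] for j in range(n)]


def commutator_rhs(f, energies):
    """i (H F - F H) for the diagonal H = diag(energies)."""
    n = len(energies)
    return [[1j * (energies[j] - energies[k]) * f[j][k] for k in range(n)]
            for j in range(n)]


def central_difference(f0, energies, t, h=1e-5):
    """Numerical derivative of F_t with respect to t."""
    plus = evolve_effect(f0, energies, t + h)
    minus = evolve_effect(f0, energies, t - h)
    n = len(energies)
    return [[(plus[j][k] - minus[j][k]) / (2 * h) for k in range(n)]
            for j in range(n)]


# Hypothetical example data: a 2 x 2 effect and two energy eigenvalues.
f_initial = [[0.5, 0.25], [0.25, 0.5]]
levels = [0.0, 1.5]
```

For these data the finite-difference derivative of $F_t$ agrees with
$i(H F_t - F_t H)$ up to the discretization error. A time-dependent $H_t$
would require solving the differential equation for $U_t$ instead of using
the exponential.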
According to the above interpretation of $T_t$ and (6.4), $F_t$ is an effect
which is triggered only by the microsystem “at time $t$.” If we apply the
registration method for the effect $F_t$ at a time $\gamma$ later, then we
would register the effect $F_{t+\gamma}$.
We may now make the transition from the effects to a scale observable as
follows: Let $A_0$ denote the self-adjoint operator which corresponds to the
measurement of a scale observable for a microsystem “at time zero.” Here “at
time zero” means that the interaction for the registration method takes place,
for all practical purposes, only at $t = 0$ (in the time scale of the
laboratory). If, instead, we make the measurement “at time $t$” (that is, we
displace the time relative to 0) we therefore measure the observable $A_t$
which (according to (6.17)) satisfies the equation
$$A_t = U_t A_0 U_t^+ \tag{6.19}$$
and, according to (6.18), satisfies the differential equation
$$\frac{dA_t}{dt} = i(H_t A_t - A_t H_t). \tag{6.20}$$
Examples of such special observables are the position and momentum
observables of microsystems: Whether the system of type (1) under
consideration is itself an elementary or a composite system in the sense of
§2-§4, in every case the position and momentum observables $Q_i(0)$, $P_i(0)$
for the individual components of system (1) are defined by the representation
$U(1, \delta, \eta, 0)$. Here $Q_i(0)$ and $P_i(0)$ are these observables at
time $t = 0$. By analogy with (6.19) we obtain
$$P_i(t) = U_t P_i(0) U_t^+, \qquad Q_i(t) = U_t Q_i(0) U_t^+. \tag{6.21}$$
From (6.21) it follows that the components $P_{i\nu}(t)$, $Q_{i\nu}(t)$ of
$P_i(t)$, $Q_i(t)$ satisfy the same commutation relations as the
$P_{i\nu}(0)$, $Q_{i\nu}(0)$:
$$\begin{aligned}
P_{i\nu}(t) Q_{k\mu}(t) - Q_{k\mu}(t) P_{i\nu}(t) &= \frac{1}{i}\, \delta_{ik} \delta_{\nu\mu}, \\
Q_{i\nu}(t) Q_{k\mu}(t) - Q_{k\mu}(t) Q_{i\nu}(t) &= 0, \\
P_{i\nu}(t) P_{k\mu}(t) - P_{k\mu}(t) P_{i\nu}(t) &= 0.
\end{aligned} \tag{6.22}$$
The Heisenberg commutation relations are therefore satisfied “for all time,”
but not, of course, for $P_{i\nu}(t)$, $Q_{k\mu}(t')$, where $t \neq t'$.
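The reason the relations hold at every single time is just the unitarity of
$U_t$. As a short worked step (here $c\,\mathbf{1}$ stands for whatever
multiple of the identity appears on the right-hand side at $t = 0$; the symbol
$c$ is introduced only for this illustration):

```latex
\begin{aligned}
P_{i\nu}(t)\,Q_{k\mu}(t) - Q_{k\mu}(t)\,P_{i\nu}(t)
 &= U_t P_{i\nu}(0) U_t^{+}\, U_t Q_{k\mu}(0) U_t^{+}
  - U_t Q_{k\mu}(0) U_t^{+}\, U_t P_{i\nu}(0) U_t^{+} \\
 &= U_t \bigl[ P_{i\nu}(0) Q_{k\mu}(0) - Q_{k\mu}(0) P_{i\nu}(0) \bigr] U_t^{+} \\
 &= U_t \,(c\,\mathbf{1})\, U_t^{+} = c\,\mathbf{1}.
\end{aligned}
```

The cancellation uses $U_t^{+} U_t = \mathbf{1}$. For two different times the
corresponding product $U_t^{+} U_{t'}$ is in general not the identity, so the
same argument does not go through.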
In (6.15) we may consider generalized effects $F_0(A, \varphi)$ which are not
only of the form $F_0 f_0(A, \varphi)$. We obtain
$$V(1, 0, 0, \tau) F_0(A, \varphi) = U_\tau F_0(k^{-1}(1, 0, 0, \tau)(A, \varphi)) U_\tau^+. \tag{6.23}$$
Here $F_0(A, \varphi)$ depends only on the fields $A$, $\varphi$ (and their
first derivatives) at time $t = 0$. $F_0(k^{-1}(1, 0, 0, \tau)(A, \varphi))$
depends only on the fields at time $t = \tau$. We shall now simplify the
notation of (6.23) as follows:
$$F_t(A, \varphi) = V(1, 0, 0, t) F_0(A, \varphi). \tag{6.24}$$
$F_t(A, \varphi)$ is therefore an effect which is measured by a registration
method which is only affected by the microsystem and the field at time $t$.
The meaning of the left side of (6.24) is expressed by the right side of
equation (6.23) as follows: $F_t$ is the time-dependent function
$U_t F_0 U_t^+$ which depends on the functions $A$, $\varphi$ (and their first
derivatives) at time $t$. By differentiation, from (6.23) we obtain
$$\frac{d}{dt}(F_t(A, \varphi)) = i[H_t F_t(A, \varphi) - F_t(A, \varphi) H_t] + U_t \frac{d}{dt} F_0(k^{-1}(1, 0, 0, t)(A, \varphi))\, U_t^+. \tag{6.25}$$
The second term on the right-hand side of (6.25) is nothing other than the
partial derivative of $F_t(A, \varphi)$ with respect to the $t$ which appears
in the fields $A$, $\varphi$. If we write this derivative in the form
$(\partial/\partial t) F_t(A, \varphi)$ then (6.25) is transformed into
$$\frac{d}{dt}(F_t(A, \varphi)) = i(H_t F_t - F_t H_t) + \frac{\partial F_t}{\partial t}. \tag{6.26}$$
For a corresponding scale observable $B_t(A, \varphi)$ (where the latter is a
time-dependent function of the fields and their first derivatives at time
$t$) it follows that
$$\frac{dB_t}{dt} = i(H_t B_t - B_t H_t) + \frac{\partial B_t}{\partial t}. \tag{6.27}$$
The frequently used simplified notation (6.27) can often lead to
misunderstanding if it is not precisely explained. (6.27) is often referred
to as “the quantum mechanical equation of motion” because it appears to be
the “most general” form of the Galileo transformation for a simple time
translation. Certainly (6.27) is (in a formal sense) the most general form
since we may derive the others as special cases (for example, for the case
$A = 0$, $\varphi = 0$ which corresponds to the case of no external field)
from (6.27). We note, however, that physically the applications of (6.27) are
limited to the case of the external field approximation. For the general case
of measurement (6.27) is, on the contrary, not sufficiently general.
In (6.26) we considered effects which “respond only at time $t$.” In fact,
realistic effects are somewhat more complicated; they are complicated
functionals of the entire field $A$, $\varphi$. Since, however, they do not
easily permit a formulation in terms of simple equations analogous to (6.26)
and (6.27) we shall restrict our consideration to the effects and observables
described by (6.26) and (6.27) and leave the problem of the measurement of
such special effects to the experimental physicist.
We shall now define the Hamiltonian operator $H_t$ in (6.16) more precisely.
In principle we have already given it in (5.8). Here we need only state
which operators $P_i$, $Q_i = r_i$, $P_\nu$, $Q_\nu = r_\nu$ in (5.8) are to
be used. If the external fields do not depend on time $t$ (that is, we have
constant external fields) we may choose the operators $P_i, \ldots, Q_\nu$ to
be the operators defined at $t = 0$, that is, we choose $H_t = H_0$ to be
constant with time. We may then choose the $P_i, \ldots, Q_\nu$ at any other
time since, for $A_t = H_t$ in (6.20) we obtain
$$H_t = H(P(t), Q(t)) = H(P(0), Q(0)).$$
For $U_t$ it then follows that
$$U_t = e^{iHt}. \tag{6.28}$$
If the external fields $A$, $\varphi$ are time dependent, then in (5.8) we
must choose in $H_t$ the $P_i, \ldots, Q_\nu$ and the spins $S_i$ as the
$P_i(t), \ldots, Q_\nu(t)$, $S_i(t)$. In this way we obtain a representation
of the Galileo group for systems in external fields. Here $H_t$ is itself an
observable of the form $B_t$ described in (6.26). We therefore find that:
$$\frac{dH_t}{dt} = i(H_t H_t - H_t H_t) + \frac{\partial H_t}{\partial t} = \frac{\partial H_t}{\partial t}. \tag{6.29}$$
Here we shall not proceed further into the problem of the domain of
definition of Ht and therefore the existence of solutions for the differential
equation (6.16). In physics it is generally assumed that a family of unitary
transformations Ut exists and that the domain of definition of Ht is the same
as the region where Ut is differentiable. These physical ideas do not, of course,
lead to a solution of the mathematical problem for a more or less well defined
Ht given by (6.16) and (5.8). Here we refer the reader to [11] and [36].
In closing, we shall now state the form of an infinitesimal translation for
the special effects introduced in (6.17). With the total momentum $P(t)$ of
system (1) it follows that for an infinitesimal $\eta$:
$$V(1, 0, \eta, 0) F_t = F_t + i \eta \cdot [P(t) F_t - F_t P(t)]. \tag{6.30}$$
We have given this formula in order to warn about possible errors in the
use of $P(0)$ in (6.30) for all $F_t$. It is of decisive importance that
$V(\ldots)$ does not give a representation of the Galileo group by means of
transformations in $\mathcal{B}(\mathcal{H})$ but gives a representation by
means of transformations in the Banach space of the mappings of the space of
the fields into the space $\mathcal{B}(\mathcal{H})$!
The “external fields” provide a useful means by which the mass of charged
systems can be measured, especially for the case of elementary systems. For
example, for an electron the Hamiltonian operator (5.8) (for small values of
the momentum and neglecting the spin term in $H_5$) is given by
$$H = \frac{1}{2m} (P - eA)^2 + e\varphi. \tag{6.31}$$
In XVII, §6.2 we shall briefly outline how we may develop measurement
methods by which the quantity $e/m$ (in cm) may be measured by the
deflection produced by the field. If we then succeed in measuring the
elementary charge (by measuring $e/M$ for macroscopic bodies and then
measuring $M$ with the aid of other forces in units of gm; Millikan
experiment and other modern experiments) we then obtain Planck’s constant as
a conversion factor between the units gm and cm$^{-1}$ (see VII (2.39)). Here
we, of course, assume that $m$ is previously known in units of cm$^{-1}$ from
atomic processes (from the structure of atomic spectra (according to XI-XIV)
or more directly from the motion of electrons).
7 Criticism of the Description of Interaction in
Quantum Mechanics and the Problem of the Space $\mathcal{D}$
Since the position and momentum observables for elementary systems could
be introduced in the manner described in VII, §4 it is, in principle, possible to
determine whether a registration method measures position or momentum
for an elementary system on the basis of the interpretation of the preparation
and registration procedures described in II and the interpretation of the
Galileo group described in VII, §1. For the case of the position and
momentum observables defined in §2 for the case of individual systems in a
composite system the situation is problematical, since the subsystems
undergo mutual interaction.
The introduction of these observables was only a consequence of the newly
introduced structure (2.1) for the Hilbert space of a composite system. Since
every Hilbert space is isomorphic to any other Hilbert space of the same
dimension we find that $K(\mathcal{H}_1)$ is isomorphic to $K(\mathcal{H}_2)$
and $L(\mathcal{H}_1)$ is isomorphic to $L(\mathcal{H}_2)$, including the
bilinear form $\operatorname{tr}(wg)$. A mathematical structure of the form
(2.1) can be introduced for any Hilbert space. Such a structure has no
physical significance as long as this structure is not connected with an
additional physical interpretation. In §2 we have attempted to give such an
interpretation by singling out the set of all effects of the form
$F_1 \times F_2$ for this purpose. However, this procedure is physically
questionable.
Is, however, such a procedure absolutely necessary? If $L(\mathcal{H}_1)$,
$L(\mathcal{H}_2)$, and $L(\mathcal{H}_1 \times \mathcal{H}_2)$ are isomorphic,
how may we correctly make such a distinction?
The structure (2.1) makes possible a heuristic procedure for the formulation
of the Hamiltonian operator according to (5.8); as we shall see in XI-XIV the
eigenvalue spectrum of this Hamiltonian operator is in excellent agreement
with experience. Where else is the structure (2.1) useful? Is this structure
really needed as part of the structure associated with the physics? Does the
subset of effects $F_1 \times F_2$ have any real physical significance? Can we
simply forget about (2.1) once we have determined the spectral family of $H$,
that is, of $e^{iHt}$?
Certainly it is possible to register the subsystems once
they are no longer in interaction. Scattering theory, described in XVI, is
based on this experimental fact. In this sense we can also say that, for
large times (or large distances), the introduction of (2.1) correctly
describes the asymptotic structure. It is not clear, however, whether it is
physically meaningful to extend such an asymptotic structure to situations in
which the subsystems are in close interaction. Indeed, it may be meaningless
in those circumstances to speak of “subsystems” because the concept of a
subsystem depends upon (2.1)! In elementary particle physics it is physically
meaningless (except as a crude approximation) to speak of subsystems of a
system with respect to high energy scattering experiments.
In “nonrelativistic” quantum mechanics described here there are additional
experimental facts which are explainable in terms of the structure
(2.1). For example, the angular momentum structure is, according to §1, quite
open; however, on the basis of experiments with external fields (see, for
example, the Zeeman effect in XIV, §6) it is in agreement with the structure
(2.1) (and its extension to the case of more than two subsystems). The
intensities of spectral lines, calculated in XIV, §5, are in agreement with the
structure given by (2.1). In addition, the structure of molecules (see XV) and
therefore (indirectly) the theoretical description of the most important facts in
chemistry are a consequence of the structures which were introduced in §2-§6.
In spite of all these successes there remains a cluster of problems which
have already been cited in §2. Is it necessary to impose the condition of
“relatively short measurement interaction times” in order to obtain a basis
for (2.1)? This does not appear to be the case, but cannot be proven. The
considerations in XVII appear to mean that there is no particular time (even
approximately) required for a measurement providing that the Hamiltonian
operator does not explicitly depend upon the time. The case in which we
have a time varying external field remains problematical, as we have found in
the discussion in §6. If we exclude the case of external fields it appears that
the structure (2.1) is only needed asymptotically and that the extrapolation of
(2.1) is a useful tool for such physical problems as atomic spectra including
light emission (XI-XV) and the most complicated scattering problems (not
only those described in XVI).
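Certainly it is possible that the structure (2.1) or, better, the structure
(2.3) indirectly determines a physical structure, namely the space $\mathcal{D}$.
By analogy with VII, §8 we may seek to obtain the space $\mathcal{D}$ with the
aid of the direct product of two (or more) Galileo groups. Since the spaces
$\mathcal{B}'(\mathcal{H}_1)$, $\mathcal{B}'(\mathcal{H}_2)$, and
$\mathcal{B}'(\mathcal{H}_1 \times \mathcal{H}_2)$ are isomorphic, the
additional structure of the subspaces
$\mathcal{D}_1 \subset \mathcal{B}'(\mathcal{H}_1)$,
$\mathcal{D}_2 \subset \mathcal{B}'(\mathcal{H}_2)$, and
$\mathcal{D} \subset \mathcal{B}'(\mathcal{H}_1 \times \mathcal{H}_2)$
describes an essential difference between, for example,
$\mathcal{D}_1 \subset \mathcal{B}'(\mathcal{H}_1)$ and
$\mathcal{D} \subset \mathcal{B}'(\mathcal{H}_1 \times \mathcal{H}_2)$.
The convex sets $K(\mathcal{H}_1)$ and $K(\mathcal{H}_1 \times \mathcal{H}_2)$
are certainly isomorphic, that is, there exists a mixture isomorphism
$K(\mathcal{H}_1) \to K(\mathcal{H}_1 \times \mathcal{H}_2)$; however, it is
possible that an isomorphism does not exist which is
$\sigma(K(\mathcal{H}_1), \mathcal{D}_1)$-$\sigma(K(\mathcal{H}_1 \times \mathcal{H}_2), \mathcal{D})$-continuous!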
In the case of classical mechanics the analogous fact has been known for a
long time: The set $K$ is the set of all measures which are absolutely
continuous with respect to Lebesgue measure, that is, which can be described
by a measurable density function $\rho(x)$ in the $\Gamma$-space. The set $L$
is the set of all measurable functions $f$ for which $0 \le f \le 1$. For the
space $\mathcal{D}$ we may choose the space spanned by the 1-function and the
set of all continuous $f$ with compact support. For a single mass point the
space $\Gamma_1$ is six-dimensional. For two mass points the composite space
$\Gamma$ is twelve-dimensional. If $K_1$, $L_1$ correspond to $\Gamma_1$ then
there certainly are mixture isomorphisms $K_1 \to K$ but none which are
$\sigma(K_1, \mathcal{D}_1)$-$\sigma(K, \mathcal{D})$-continuous, because the
dual map must map the set $\mathcal{D}_1$ onto $\mathcal{D}$, which is
impossible because of the different dimensions of $\Gamma_1$ and $\Gamma$: a
$\sigma(K_1, \mathcal{D}_1)$-$\sigma(K, \mathcal{D})$-continuous isomorphism
must lead to a topologically homeomorphic map of $\Gamma_1$ onto $\Gamma$ (the
extremal points of $\overline{K}$ may be identified in the
$\sigma(\mathcal{D}', \mathcal{D})$-topology with the topological space
$\Gamma$!).
It is conceivable that we may be able to solve the problem of composite
systems in quantum mechanics by proceeding in the opposite direction,
that is, by supplementing the assertions in §1 with axioms about the
space $\mathcal{D}$. Perhaps it is possible to carry out the introduction of
the structure
“composite system” in an improved manner with respect to the sense of an
“axiomatic basis” in this way rather than the route we have used in §2—§6.
Here, by the “sense of an axiomatic basis” we mean that new relations are
introduced only when these relations can be physically interpreted on the
basis of pre-theories (see [47]). Until such a clarification is achieved, it
must be recognized that the concept of a composite system (insofar as it
goes beyond the exposition of §1) is unfortunately one of the least
well-explained concepts of quantum theory.
APPENDIX I
Summary of Lattice Theory
1 Definition of a Lattice
A set $M$ is said to be a partially ordered set (poset) if there is a relation
defined among pairs $(a, b) \in M \times M$ (which we denote by $\le$) which
satisfies the following axioms:
(1) $a \le a$ for all $a \in M$,
(2) $a \le b$, $b \le c \Rightarrow a \le c$,
(3) $a \le b$, $b \le a \Rightarrow a = b$.
$M$ is said to be totally ordered if, for each pair $a, b \in M$, either
$a \le b$ or $b \le a$.
If $N$ is a subset of an ordered set $M$, then an element $a \in M$ is said to
be an upper bound of $N$ if $b \le a$ for all $b \in N$. We say that the upper
bound $a \in M$ is the least upper bound for $N$ if $a \le c$ for every upper
bound $c$ of $N$. Similarly, a lower bound for $N$ is an element $d \in M$ for
which $d \le b$ for all $b \in N$; we say that a lower bound $c \in M$ is the
greatest lower bound for $N$ if $d \le c$ for every lower bound $d$ of $N$. If
the greatest lower bound or least upper bound of the set $N$ exists, it is
uniquely determined by the set $N$.
A lattice is a partially ordered set $M$ for which every finite subset $N$ has
a least upper bound and a greatest lower bound. By induction, it is easy to
show that $M$ is a lattice if each pair of elements $a, b$ has a least upper
bound (denoted by $a \vee b$) and a greatest lower bound (denoted by
$a \wedge b$).
Instead of $a \le b$ we frequently write $b \ge a$. The least upper bound of
$N$ is denoted by $\bigvee_{a \in N} a$, the greatest lower bound is denoted by
$\bigwedge_{a \in N} a$.
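The definitions above can be tried out on a small finite example. The following is a minimal sketch, assuming the divisors of 12 ordered by divisibility as the poset; the function names and the example set are illustrative choices and do not come from the text.

```python
from itertools import product


def least_upper_bound(elems, leq, subset):
    """Return the least upper bound of `subset` in (elems, leq), or None."""
    uppers = [u for u in elems if all(leq(b, u) for b in subset)]
    least = [a for a in uppers if all(leq(a, c) for c in uppers)]
    return least[0] if least else None


def greatest_lower_bound(elems, leq, subset):
    """Dual notion: the least upper bound with respect to the reversed order."""
    return least_upper_bound(elems, lambda x, y: leq(y, x), subset)


def is_lattice(elems, leq):
    """A poset is a lattice if every pair of elements has both bounds."""
    return all(
        least_upper_bound(elems, leq, [a, b]) is not None
        and greatest_lower_bound(elems, leq, [a, b]) is not None
        for a, b in product(elems, repeat=2)
    )


def divides(a, b):
    """Hypothetical example order: a <= b means that a divides b."""
    return b % a == 0


DIVISORS_OF_12 = [1, 2, 3, 4, 6, 12]
```

In this example `least_upper_bound(DIVISORS_OF_12, divides, [4, 6])` is the least common multiple and `greatest_lower_bound` is the greatest common divisor, while the poset `[1, 2, 3]` with the same order is not a lattice because 2 and 3 have no common upper bound.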
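According to the definitions it is an easy matter to show that:
$a = a \wedge a = a \vee a$, $a \vee b = b \vee a$,
$(a \wedge b) \wedge c = a \wedge (b \wedge c)$,
$(a \vee b) \vee c = a \vee (b \vee c)$,
$a \vee (a \wedge b) = a$, $a \wedge (a \vee b) = a$,
$a \le b \Leftrightarrow a \vee b = b$, $a \le b \Leftrightarrow a \wedge b = a$.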
Th. 1.1. Let $M$ be a set, and let two binary operations $\vee$, $\wedge$ be
defined on $M$ which satisfy the following conditions:
(1) $a \wedge b = b \wedge a$, $a \vee b = b \vee a$;
(2) $a \vee (b \vee c) = (a \vee b) \vee c$,
$a \wedge (b \wedge c) = (a \wedge b) \wedge c$;
(3) $a \vee (a \wedge b) = a$, $a \wedge (a \vee b) = a$,
then the relation $\le$ defined by $a \le b$ whenever $a \wedge b = a$ is a
partial order and $M$ is a lattice.
Proof. From the first formula in (3) it follows that
$a = a \vee (a \wedge (a \vee b))$, and from the second formula in (3) it
follows that $a = a \vee a$; similarly it follows that $a \wedge a = a$.
Therefore we obtain $a \le a$.
From $a \le b$ and $b \le a$ it directly follows that $a = b$.
If $a \le b$ and $b \le c$, that is, if $a \wedge b = a$ and $b \wedge c = b$,
then from (2):
$a \wedge c = (a \wedge b) \wedge c = a \wedge (b \wedge c) = a \wedge b = a$,
that is, $a \le c$. Therefore $\le$ is a partial order.
If $a \le b$, then from $a \wedge b = a$ and from (3) it follows that
$a \vee b = (a \wedge b) \vee b = b$. If $a \vee b = b$ then from (3) it
follows that $a \wedge b = a \wedge (a \vee b) = a$, that is, $a \le b$.
Therefore $a \le b$ is equivalent to $a \vee b = b$.
From (3) it follows directly that $a \wedge b \le a$ and $a \wedge b \le b$.
If $c \le a$ and $c \le b$, then it follows that
$c \wedge (a \wedge b) = (c \wedge a) \wedge b = c \wedge b = c$, that is,
$c \le a \wedge b$. Therefore $a \wedge b$ is the greatest lower bound of
$a, b$. In a similar way it follows that $a \vee b$ is the least upper bound
of $a, b$.
A lattice M is said to be complete if each subset N of M has a least upper bound
and a greatest lower bound.
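The content of Th. 1.1 can be checked numerically on a small example. The sketch below assumes, purely for illustration, that the two operations are the greatest common divisor and the least common multiple on the divisors of 12; it verifies conditions (1)-(3) by brute force and then defines the order from the operation $\wedge$ as in the theorem.

```python
from itertools import product
from math import gcd

ELEMS = [1, 2, 3, 4, 6, 12]  # hypothetical example set


def meet(a, b):
    """Illustrative choice for the operation written as a wedge in Th. 1.1."""
    return gcd(a, b)


def join(a, b):
    """Illustrative choice for the operation written as a vee in Th. 1.1."""
    return a * b // gcd(a, b)


def satisfies_axioms(elems):
    """Brute-force check of conditions (1), (2), (3) of Th. 1.1."""
    for a, b, c in product(elems, repeat=3):
        if meet(a, b) != meet(b, a) or join(a, b) != join(b, a):
            return False
        if join(a, join(b, c)) != join(join(a, b), c):
            return False
        if meet(a, meet(b, c)) != meet(meet(a, b), c):
            return False
        if join(a, meet(a, b)) != a or meet(a, join(a, b)) != a:
            return False
    return True


def leq(a, b):
    """The order of Th. 1.1: a <= b whenever meet(a, b) == a."""
    return meet(a, b) == a
```

For this choice the derived order coincides with divisibility, and `leq(a, b)` agrees with `join(a, b) == b`, as shown in the proof.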
Th. 1.2. If for every subset N of a partially ordered set M there exists a
greatest lower bound (least upper bound) then there exists a least upper
bound (greatest lower bound), that is, M is a complete lattice.
Proof. Suppose for each subset $N$ there exists a least upper bound. Let $a$
denote the least upper bound of the set
$R = \{b \mid b \le c \text{ for all } c \in N\}$. Here we find that $a$ is
the greatest lower bound of the set $N$ if we show that $a \in R$: From
$c \in N$ it follows that $c \ge b$ for all $b \in R$ and we therefore obtain
$c \ge a$.
In a similar way we may prove that every subset has a least upper bound.
The hypothesis in Th. 1.2 requires that the empty set has a least upper
bound, that is, there exists a minimal element in M; corresponding to the
existence of the greatest lower bound there exists a maximal element in M.
We denote the minimal and maximal elements by 0 and 1 (or e) respectively.
D 1.1. If, for a given element $a \in M$ there exists an element $a' \in M$
for which $a \wedge a' = 0$ and $a \vee a' = 1$ then $a'$ is called a
complement of $a$. A lattice $M$ is said to be complemented if each element
$a \in M$ has a complement.
Here we note that $a$ is also a complement of $a'$.
D 1.2. A bijective map $a \to a^\perp$ of $M$ into itself is said to be an
orthocomplementation if the following conditions are satisfied:
(1) $a \le b \Rightarrow b^\perp \le a^\perp$,
(2) $(a^\perp)^\perp = a$,
(3) $a \wedge a^\perp = 0$.
An orthocomplemented lattice is a lattice with unit and null element for
which, in addition, there exists an orthocomplementation structure. Often the
orthocomplement of $a$ will be denoted by $a^*$ instead of by $a^\perp$.
Remark. For a given lattice there can be more than one orthocomplementation!
For a subset $N$ we define the following subset
$N^\perp = \{a \mid a = b^\perp,\ b \in N\}$.
Th. 1.3. In an orthocomplemented lattice $a \vee a^\perp = 1$, that is,
$a^\perp$ is a complement of $a$. In addition
$(a \vee b)^\perp = a^\perp \wedge b^\perp$ and
$(a \wedge b)^\perp = a^\perp \vee b^\perp$.
If the greatest lower bound $\bigwedge_{a \in N} a$ exists for the subset $N$,
then $\bigvee_{b \in N^\perp} b$ exists, and
$\bigvee_{b \in N^\perp} b = (\bigwedge_{a \in N} a)^\perp$. A similar result
holds for the least upper bound.
PROOF. From D 1.2, (1) and (2) it follows that
$b^\perp \le a^\perp \Rightarrow a \le b$. From this result it directly follows
that the map $a \to a^\perp$ maps least upper (greatest lower) bounds into
greatest lower (least upper) bounds, and we have proven the second part of the
theorem. In particular, $0^\perp = 1$. Thus from D 1.2, (3) it follows that
$a \vee a^\perp = (a^\perp \wedge a)^\perp = 0^\perp = 1$.
D 1.3. We say that $a$ is orthogonal to $b$ (written $a \perp b$) if
$a \le b^\perp$.
From D 1.2, (1) and (2) it directly follows that
$a \le b^\perp \Rightarrow b \le a^\perp$, that is, the relation $a \perp b$
is symmetric.
D 1.4. A lattice $M$ is said to be distributive, if, for all $a, b, c \in M$
the following conditions are satisfied:
(1) $(a \vee b) \wedge c = (a \wedge c) \vee (b \wedge c)$,
(2) $(a \wedge b) \vee c = (a \vee c) \wedge (b \vee c)$.
In the following we shall abbreviate conditions (1) and (2) in D 1.4 by
$D(a, b, c)$ and $D^*(a, b, c)$ respectively.
Th. 1.4. A lattice $M$ is distributive if one of the relations $D(a, b, c)$ or
$D^*(a, b, c)$ is satisfied for all $a, b, c \in M$.
Proof. If $D(a, b, c)$ holds for all $a, b, c \in M$, then it follows that
$(a \vee c) \wedge (b \vee c) = (a \wedge (b \vee c)) \vee (c \wedge (b \vee c))
= (a \wedge (b \vee c)) \vee c = (a \wedge b) \vee (a \wedge c) \vee c
= (a \wedge b) \vee c$, that is, $D^*(a, b, c)$. In the same way we may show
that if $D^*(a, b, c)$ holds for all $a, b, c \in M$ then $D(a, b, c)$ holds.
2 Orthomodularity
D 2.1. A pair of elements $a, b$ is said to be a modular pair, if for all
$c \le b$ the relation $D(c, a, b)$ is satisfied, that is, if, for all
$c \le b$ the relation
$(c \vee a) \wedge b = c \vee (a \wedge b)$
is satisfied. We shall denote a modular pair $a, b$ by $M(a, b)$. A lattice
$M$ is said to be modular if $M(a, b)$ is satisfied for all $a, b$.
We will now assume that $M$ is an orthocomplemented lattice.
D 2.2. A pair $a, b$ is said to be compatible (abbreviated $C(a, b)$) if
$a = (a \wedge b) \vee (a \wedge b^\perp)$
is satisfied (see also IV, Th. 1.3.4(iv)).
It follows that $a \perp b \Rightarrow C(a, b)$;
$C(a, b) \Rightarrow C(a, b^\perp)$; $a \le b \Rightarrow C(a, b)$, since
$a \wedge b = a$ and $a \wedge b^\perp = a \wedge b \wedge b^\perp = 0$.
Th. 2.1. The following statements are equivalent:
(i) For all pairs $a, b$ satisfying $a \perp b$, $M(a, b)$ holds.
(ii) For all $a$ we obtain $M(a, a^\perp)$.
(iii) For all pairs $a, b$ satisfying $a \le b$, $C(b, a)$ holds, that is,
$b = a \vee (b \wedge a^\perp)$.
(iv) For all pairs $a, b$ satisfying $a \le b$
$a = b \wedge (a \vee b^\perp)$.
(v) For all pairs $a, b$: $C(a, b) \Rightarrow C(b, a)$.
(vi) For each triple $a, b, c$ satisfying $a \perp b$, $a \perp c$, the
following condition holds:
$a \vee b = a \vee c \Rightarrow b = c$.
(vii) For each triple $a, b, c$ satisfying $a \perp b$, $a \perp c$, the
following condition holds:
$a \vee b = a \vee c$ and $b \le c \Rightarrow b = c$.
(viii) For each $a$ the following condition holds: $b \perp a$ and
$a \vee b = 1 \Rightarrow b = a^\perp$, that is, for each $a$ there exists
only one orthogonal complement.
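PROOF. (i) $\Rightarrow$ (ii) is clear, since $a \perp a^\perp$.
(ii) $\Rightarrow$ (iii). From $a \le b$ it follows that
$b^\perp \le a^\perp$ and from $M(a, a^\perp)$ it follows that
$(b^\perp \vee a) \wedge a^\perp = b^\perp \vee (a \wedge a^\perp) = b^\perp$;
therefore, by applying $\perp$ we finally obtain
$b = a \vee (b \wedge a^\perp)$.
(iii) $\Rightarrow$ (iv). If $a \le b$ then $b^\perp \le a^\perp$ and,
according to (iii), $a^\perp = b^\perp \vee (a^\perp \wedge b)$, from which it
follows that $a = b \wedge (a \vee b^\perp)$. (iv) $\Rightarrow$ (iii) can be
proven in the same way.
(iii) $\Rightarrow$ (v). From $C(a, b)$, that is, from
$a = (a \wedge b) \vee (a \wedge b^\perp)$ it follows that
$a^\perp = (a^\perp \vee b^\perp) \wedge (a^\perp \vee b)$ and
$b \wedge a^\perp = b \wedge (a^\perp \vee b^\perp)$. Hence it follows that
$(b \wedge a) \vee (b \wedge a^\perp)
= (b \wedge a) \vee (b \wedge (a^\perp \vee b^\perp))
= (b \wedge a) \vee (b \wedge (a \wedge b)^\perp)$; since
$b \wedge a \le b$, from (iii) we finally obtain
$(b \wedge a) \vee (b \wedge (b \wedge a)^\perp) = b$.
(v) $\Rightarrow$ (vi). Assuming (v) holds, we obtain
$C(a, b) \Rightarrow C(a, b^\perp) \Rightarrow C(b^\perp, a) \Rightarrow
C(b^\perp, a^\perp) \Rightarrow C(a^\perp, b^\perp)$. From
$a \vee b = a \vee c$, $b \perp a$ and $c \perp a$ it follows that
$a^\perp \wedge b^\perp = a^\perp \wedge c^\perp$. Since
$a \perp b \Rightarrow C(a, b) \Rightarrow C(b^\perp, a^\perp)$ it follows
that $b^\perp = (b^\perp \wedge a) \vee (b^\perp \wedge a^\perp)
= a \vee (b^\perp \wedge a^\perp) = a \vee (c^\perp \wedge a^\perp)$.
Similarly it follows that $c^\perp = a \vee (c^\perp \wedge a^\perp)$ and
therefore $b^\perp = c^\perp$, that is, $b = c$.
(vi) $\Rightarrow$ (vii) is trivial.
(vii) $\Rightarrow$ (viii). From $a \vee b = 1 = a \vee a^\perp$ and
$b \perp a$, that is, $b \le a^\perp$, we obtain the special case in which
$b = a^\perp$.
(viii) $\Rightarrow$ (iii): We assume (as in (iii)) $a \le b$. We define
$d = a \vee (b \wedge a^\perp)$. Thus we obtain $d \le b$ and therefore
$d \wedge b^\perp = (d \wedge b) \wedge b^\perp = 0$. On the other hand,
$b^\perp \vee d = b^\perp \vee a \vee (b \wedge a^\perp)
= (b \wedge a^\perp)^\perp \vee (b \wedge a^\perp) = 1$. From (viii) it
follows that $d = (b^\perp)^\perp = b$, that is,
$b = a \vee (b \wedge a^\perp)$.
(iii) $\Rightarrow$ (i). According to (i) we assume that $a \perp b$ and
$c \le b$. Since $a \perp b$, $M(a, b)$ is equivalent to
$(c \vee a) \wedge b = c$. Since $c \le b$ and $c \le c \vee a$ we find that
$(c \vee a) \wedge b \ge c$. According to (iii), for
$g = ((c \vee a) \wedge b) \wedge c^\perp$ we obtain the relation
$(c \vee a) \wedge b = c \vee g$. Since $(c \vee a) \wedge b \le b$ we obtain
$g \le b \wedge c^\perp$, that is, $b^\perp \vee c \le g^\perp$. Since
$(c \vee a) \wedge b \le c \vee a$ it follows that $g \le c \vee a$ and, since
$a \perp b$, that is, $a \le b^\perp$, we obtain
$g \le c \vee b^\perp \le g^\perp$, from which we conclude $g = 0$, that is,
$c = (c \vee a) \wedge b$.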
D 2.3. An orthocomplemented lattice is said to be orthomodular if it
satisfies one of the conditions (i)-(viii) of Th. 2.1.
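Condition (iii) of Th. 2.1 is easy to test by brute force on a finite orthocomplemented lattice. The following sketch assumes two standard six-element examples that are not discussed in the text: a lattice with four mutually incomparable elements between 0 and 1 (called `MO2` here) and a hexagon formed by two chains. The class name, element labels, and method names are hypothetical.

```python
from itertools import product


class FiniteOrtholattice:
    """Finite orthocomplemented lattice given by generating order pairs."""

    def __init__(self, elems, order_pairs, perp):
        self.elems = list(elems)
        self.perp = dict(perp)
        # Build the partial order as the reflexive-transitive closure.
        self.le = {(x, x) for x in self.elems} | set(order_pairs)
        changed = True
        while changed:
            changed = False
            for (x, y), (u, v) in product(list(self.le), repeat=2):
                if y == u and (x, v) not in self.le:
                    self.le.add((x, v))
                    changed = True

    def leq(self, x, y):
        return (x, y) in self.le

    def join(self, x, y):
        uppers = [u for u in self.elems if self.leq(x, u) and self.leq(y, u)]
        return next(u for u in uppers if all(self.leq(u, w) for w in uppers))

    def meet(self, x, y):
        lowers = [l for l in self.elems if self.leq(l, x) and self.leq(l, y)]
        return next(l for l in lowers if all(self.leq(w, l) for w in lowers))

    def is_orthomodular(self):
        # Th. 2.1 (iii): a <= b implies b = a v (b ^ a_perp).
        return all(
            self.join(a, self.meet(b, self.perp[a])) == b
            for a, b in product(self.elems, repeat=2)
            if self.leq(a, b)
        )

    def is_distributive(self):
        # D 1.4 (1): (a v b) ^ c = (a ^ c) v (b ^ c).
        return all(
            self.meet(self.join(a, b), c)
            == self.join(self.meet(a, c), self.meet(b, c))
            for a, b, c in product(self.elems, repeat=3)
        )


ELEMS = ["0", "a", "b", "a'", "b'", "1"]
PERP = {"0": "1", "1": "0", "a": "a'", "a'": "a", "b": "b'", "b'": "b"}

# Four mutually incomparable elements between 0 and 1.
MO2 = FiniteOrtholattice(
    ELEMS, [("0", x) for x in ELEMS[1:5]] + [(x, "1") for x in ELEMS[1:5]], PERP
)
# Hexagon: two chains 0 < a < b < 1 and 0 < b' < a' < 1.
HEXAGON = FiniteOrtholattice(
    ELEMS,
    [("0", "a"), ("a", "b"), ("b", "1"), ("0", "b'"), ("b'", "a'"), ("a'", "1")],
    PERP,
)
```

In this sketch `MO2` is orthomodular without being distributive, while `HEXAGON` is orthocomplemented but violates (iii) for the pair $a \le b$, since there $b \wedge a^\perp = 0$.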
D 2.4. Let $M$ be an orthocomplemented lattice. A real function
$m: M \to [0, 1]$ is said to be a normed orthomeasure on $M$ if
(1) $m(1) = 1$,
(2) $a \perp b \Rightarrow m(a \vee b) = m(a) + m(b)$.
Th. 2.2. If there exists a set $K$ of orthomeasures on $M$ such that
$m(a) = m(b)$ for all $m \in K$ implies the relation $a = b$ (that is, $K$
separates $M$) then $M$ is orthomodular.
PROOF. According to Th. 2.1(viii) let $a \vee b = a \vee a^\perp = 1$ with
$b \perp a$. Thus, for all $m \in K$, it follows that
$m(a) + m(b) = m(a) + m(a^\perp)$, that is, $m(b) = m(a^\perp)$ and we obtain
$b = a^\perp$.
Th. 2.2 is directly applicable to the lattice G of decision effects (see
III, D 6.2, III, D 6.3 and III, D 6.6) if we set m(e) = tr(we). G is therefore
orthomodular.
3 Boolean Rings
D 3.1. A complemented distributive lattice is called a Boolean lattice or a
Boolean ring (for the designation “Ring” also see D 3.2).
Th. 3.1. In a Boolean ring $M$ each element $a$ has exactly one complement
$a'$. The mapping $a \to a'$ is an orthocomplementation.
Proof. Suppose, therefore, that $a \wedge x = a \wedge y = 0$ and
$a \vee x = a \vee y = 1$. It then follows that
$x = x \wedge 1 = x \wedge (a \vee y) = (x \wedge a) \vee (x \wedge y)
= x \wedge y$. Similarly we obtain $y = x \wedge y$ and therefore $x = y$.
Since $a$ is also a complement of $a'$, the mapping $a \to a'$ is bijective
and $(a')' = a$. According to D 1.2 we need only prove
$a \le b \Rightarrow b' \le a'$. From $a \le b$ it follows that
$a = a \wedge b$ and therefore $a \wedge b' = a \wedge b \wedge b' = 0$. Thus
we obtain $b' = b' \wedge 1 = b' \wedge (a \vee a')
= (b' \wedge a) \vee (b' \wedge a') = b' \wedge a'$, that is, $b' \le a'$.
The fact that a Boolean ring is orthomodular is trivial, because it satisfies
the distributive law. In a Boolean ring we may therefore use $a^\perp$ to
denote the complement of $a$. Instead of $a^\perp$ we may often use $a^*$.
Th. 3.2. An orthocomplemented orthomodular lattice is a Boolean ring if and
only if every pair of elements is compatible (see D 2.2).
Proof. The fact that $C(a, b)$ holds for all pairs $a, b$ in a Boolean ring
follows from $a = a \wedge 1 = a \wedge (b \vee b^\perp)$ and the distributive
law. In order to prove the converse, we shall now show that:
$C(a, b) \Leftrightarrow a \wedge b = a \wedge (b \vee a^\perp)$. $C(a, b)$
means that $a = (a \wedge b) \vee (a \wedge b^\perp)$ from which it follows
that $a^\perp = (a \wedge b)^\perp \wedge (a^\perp \vee b)$. From this it
follows that $a \wedge (a^\perp \vee b) \wedge (a \wedge b)^\perp = 0$. Since
$a \wedge b \le a \wedge (b \vee a^\perp)$, from Th. 2.1(iii) it follows that
$a \wedge (b \vee a^\perp)
= (a \wedge b) \vee (a \wedge (b \vee a^\perp) \wedge (a \wedge b)^\perp)
= a \wedge b$. Conversely, from $a \wedge b = a \wedge (b \vee a^\perp)$ we
obtain the following relationships:
$(a \wedge b) \vee (a \wedge b^\perp)
= (a \wedge (b \vee a^\perp)) \vee (a \wedge b^\perp)
= (a \wedge b^\perp) \vee (a \wedge (a \wedge b^\perp)^\perp) = a$
where the latter is obtained from $a \wedge b^\perp \le a$ and Th. 2.1(iii).
Since $a \wedge c \le a \vee b$, $a \wedge c \le c$, $b \wedge c \le a \vee b$,
$b \wedge c \le c$ we find that
$(a \vee b) \wedge c \ge (a \wedge c) \vee (b \wedge c)$. According to
Th. 2.1(iii) we therefore obtain
$(a \vee b) \wedge c = (a \wedge c) \vee (b \wedge c) \vee
[(a \vee b) \wedge c \wedge ((a \wedge c) \vee (b \wedge c))^\perp]$. We need
only show that the expression $z$ in the square brackets is equal to 0. Next
it follows that
$z = (a \vee b) \wedge c \wedge ((a \wedge c) \vee (b \wedge c))^\perp
= (a \vee b) \wedge c \wedge (a^\perp \vee c^\perp) \wedge (b^\perp \vee c^\perp)$.
From $C(c, a^\perp)$ we find
$c \wedge a^\perp = c \wedge (a^\perp \vee c^\perp)$ and we therefore obtain
$z = (a \vee b) \wedge c \wedge a^\perp \wedge (b^\perp \vee c^\perp)$. From
$C(c, b^\perp)$ we obtain
$z = (a \vee b) \wedge a^\perp \wedge c \wedge b^\perp
= c \wedge (a \vee b) \wedge (a \vee b)^\perp = 0$.
Th. 3.3. An orthocomplemented orthomodular lattice is a Boolean ring if and
only if each element a has only one complement.
PROOF. The first part of this theorem follows directly from Th. 3.1. According
to Th. 3.2 we need only show that each pair of elements is compatible if each
element has only one complement.
According to Th. 2.1(iii) we obtain
$(a \wedge b) \vee (a \wedge (a \wedge b)^\perp) = a$.
Let $c = a \wedge (a \wedge b)^\perp$; we obtain $c \wedge b = 0$ and, from
$(a \wedge b)^\perp \ge b^\perp$ we obtain
$c \wedge b^\perp = a \wedge (b^\perp \wedge (a \wedge b)^\perp)
= a \wedge b^\perp$.
$a$ and $b$ are therefore compatible if $c \wedge b^\perp = c$, that is, if
$c$ and $b$ are compatible.
For $d = b \vee c$ and $e = c \vee d^\perp$ we obtain
$b \vee e = b \vee c \vee d^\perp = d \vee d^\perp = 1$, and, according to
Th. 2.1(iii) it follows that $e = d^\perp \vee (e \wedge d)$. Since
$c \perp d^\perp$, from Th. 2.1(vi) it follows that $c = e \wedge d$. From
$b \le d$ it follows that
$b \wedge e = b \wedge d \wedge e = b \wedge c = 0$.
Therefore $e$ is a complement of $b$. Since $b$ has only one complement, we
find that $e = b^\perp$ and therefore $c \le b^\perp$, that is,
$c \wedge b^\perp = c$.
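Th. 3.4. If the least upper bound $\bigvee_{a \in N} a$ (greatest lower bound
$\bigwedge_{a \in N} a$) exists for a subset $N$ of a Boolean ring $M$, then
for each $b \in M$ the least upper bound $\bigvee_{a \in N} (b \wedge a)$
(greatest lower bound $\bigwedge_{a \in N} (b \vee a)$) exists, and satisfies
the distributive law:
$b \wedge \Big( \bigvee_{a \in N} a \Big) = \bigvee_{a \in N} (b \wedge a)$;
$b \vee \Big( \bigwedge_{a \in N} a \Big) = \bigwedge_{a \in N} (b \vee a)$.
PROOF. We must show that $x = b \wedge (\bigvee_{a \in N} a)$ is the least
upper bound of the set of all $b \wedge a$ for which $a \in N$. Since
$b \wedge a \le b$ and $b \wedge a \le \bigvee_{d \in N} d$ for all
$a \in N$ it follows that $x \ge b \wedge a$ for all $a \in N$. Let
$g \ge b \wedge a$ for all $a \in N$. We need to show that $g \ge x$. Since
$g \ge b \wedge a$ implies $g \wedge b \ge b \wedge a$, it suffices to show
that, for $u \ge b \wedge a$ with $u \le b$ it follows that $u \ge x$. Suppose
that $v \ge b^\perp \wedge a$ for all $a \in N$ and $v \le b^\perp$. Thus it
follows that $u \vee v \ge (b \wedge a) \vee (b^\perp \wedge a) = a$ for all
$a \in N$, that is, $u \vee v \ge \bigvee_{a \in N} a$.
From $b \wedge u = u$ and $b \wedge v = 0$ it follows that
$u = b \wedge u = (b \wedge u) \vee (b \wedge v) = b \wedge (u \vee v)
\ge b \wedge \bigvee_{a \in N} a = x$.
From Th. 1.3, from
$b \vee \bigwedge_{a \in N} a
= \Big( b^\perp \wedge \bigvee_{a \in N} a^\perp \Big)^\perp
= \Big[ \bigvee_{a \in N} (b^\perp \wedge a^\perp) \Big]^\perp
= \bigwedge_{a \in N} (b \vee a)$
we obtain the second part of the theorem.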
The following relations are frequently defined in a Boolean ring:
D 3.2. $a \cdot b = a \wedge b$,
$a + b = (a \wedge b^\perp) \vee (b \wedge a^\perp)$.
It is easy to show that a Boolean ring together with the operations $\cdot$
and $+$ is a (commutative) ring (algebra) for which $a \cdot a = a$ and
$a + a = 0$.
We may express the operation $\vee$ by means of $\cdot$ and $+$ as follows:
$a \vee b = a + b + a \cdot b$.
Conversely, let $M$ be a commutative ring (with unit element) with the
operations $\cdot$ and $+$ for which $a \cdot a = a$ and $a + a = 0$; if we
define $a \wedge b = a \cdot b$ and $a \vee b = a + b + a \cdot b$, then from
Th. 1.1(1), (2), (3) it follows that $M$ is a lattice. It is easy to show that
this lattice is distributive, and that $1 + a$ is the complement of $a$.
A distributive complemented lattice can therefore be well characterized as
a commutative ring (with unit element) for which $a \cdot a = a$ and
$a + a = 0$.
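The correspondence between lattice and ring operations can be illustrated on the Boolean ring of all subsets of a three-element set. The sketch below assumes that subsets are encoded as bit masks, so that $\cdot$ becomes bitwise AND and $+$ becomes bitwise XOR; the encoding and all names are illustrative assumptions, not part of the text.

```python
N_BITS = 3                      # hypothetical size of the underlying set
UNIT = (1 << N_BITS) - 1        # the element 1 (the full set)
ELEMS = range(UNIT + 1)         # all subsets, encoded as bit masks


def mul(a, b):
    """Ring product of D 3.2: the lattice operation 'and' (intersection)."""
    return a & b


def add(a, b):
    """Ring sum of D 3.2: the symmetric difference."""
    return a ^ b


def join(a, b):
    """Lattice join recovered from the ring operations: a + b + a.b."""
    return add(add(a, b), mul(a, b))


def complement(a):
    """The complement of a, written 1 + a in the ring."""
    return add(UNIT, a)
```

With this encoding `join(a, b)` agrees with the bitwise OR, and `complement(a)` satisfies both complement conditions of D 1.1 for every element.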
The following more general concept of a Boolean ring is often used:
D 3.3. A lattice is said to be relatively complemented if to an $a \ge b$
there exists a $c$ such that $b \wedge c = 0$, $b \vee c = a$. $c$ is called
the relative complement of $b$ with respect to $a$.
D 3.4. A distributive relatively complemented lattice is called a generalized
Boolean ring.
In a generalized Boolean ring the relative complement is uniquely
determined.
In a generalized Boolean ring we may obtain a commutative algebra by
defining $a \cdot b = a \wedge b$ and $a + b = \tilde{a} \vee \tilde{b}$, where
$\tilde{a}$ is the relative complement of $a \wedge b$ in $a$ and $\tilde{b}$
is the relative complement of $a \wedge b$ in $b$. A generalized Boolean ring
is a Boolean ring if and only if it has a unit element.
From two Boolean rings $M_1$, $M_2$ (and from a finite number of Boolean
rings) it is possible to construct new Boolean rings in two different ways.
In the first way we are given a pair of arbitrary partially ordered sets
$M_1$, $M_2$; on the product set $M = M_1 \times M_2$ we introduce the
following partial order:
$(a_1, b_1) \le (a_2, b_2)$: $a_1 \le a_2$, $b_1 \le b_2$.
It is easy to see that $M$ is a (complete) lattice whenever $M_1$ and $M_2$
are (complete) lattices. In particular, we obtain
$(a_1, b_1) \wedge (a_2, b_2) = (a_1 \wedge a_2, b_1 \wedge b_2)$,
$(a_1, b_1) \vee (a_2, b_2) = (a_1 \vee a_2, b_1 \vee b_2)$.
If $M_1$ and $M_2$ have 1 and 0 elements, then so does $M$:
$1 = (1_1, 1_2)$, $0 = (0_1, 0_2)$.
If $M_1$ and $M_2$ are complemented, then so is $M$:
$(a_1, a_2)' = (a_1', a_2')$.
If $M_1$ and $M_2$ are distributive, then so is $M$. Therefore, if $M_1$ and
$M_2$ are Boolean rings, then so is $M$.
The second way to construct a Boolean ring $M$ from $M_1$ and $M_2$ may be
carried out most simply with the aid of the algebraic operations $+$ and
$\cdot$. This construction is analogous to the technical “logical” switching
circuits: from $M_1$ and $M_2$ we construct the “free” algebra (ring) which
consists of all possible formal sums and products of elements in $M_1$ and
$M_2$ by means of finitely many operations $+$ and $\cdot$. In this case $M$
is not, in general, equal to the product set.
The elements of $M$ can be represented by all formal finite sums of the form
$\sum_i a_i^{(1)} \cdot a_i^{(2)}$,
where $a_i^{(1)} \in M_1$, $a_i^{(2)} \in M_2$ and
$a_i^{(1)} \cdot a_j^{(1)} = 0$, $a_i^{(2)} \cdot a_j^{(2)} = 0$ for
$i \ne j$. Then, by means of the operations $+$ and $\cdot$ we may obtain new
sums as follows:
For $(\sum_i a_i^{(1)} \cdot a_i^{(2)}) + (\sum_j b_j^{(1)} \cdot b_j^{(2)})$
we do not immediately obtain $a_i^{(1)} \cdot b_j^{(1)} = 0$, etc. We can,
however, attain this result stepwise provided that, instead of $a_i^{(1)}$,
$b_j^{(1)}$ we use new elements of $M_1$: $a_i^{(1)} \cdot b_j^{(1)}$,
$a_i^{(1)} + a_i^{(1)} \cdot b_j^{(1)}$,
$b_j^{(1)} + a_i^{(1)} \cdot b_j^{(1)}$.
D 3.5. A subset $I$ of a lattice $M$ is said to be an ideal if the following
conditions are satisfied:
(1) $a \in I$ and $b \le a \Rightarrow b \in I$,
(2) $a_1, a_2 \in I \Rightarrow a_1 \vee a_2 \in I$.
Th. 3.5. If $M$ is a generalized Boolean ring, then $I$ is also algebraically
an ideal of $M$, that is, the following conditions are satisfied:
$a \in I$ and $b \in M \Rightarrow a \cdot b \in I$,
$a_1, a_2 \in I \Rightarrow a_1 + a_2 \in I$.
Conversely, an algebraic ideal $I$ is also an ideal in the sense of D 3.5.
PROOF. Let $I$ be an ideal according to D 3.5. From $a \in I$ and
$a \cdot b = a \wedge b \le a$ it follows that $a \cdot b \in I$. From
$a_1, a_2 \in I$ and $a_1 + a_2 \le a_1 \vee a_2$ it follows that
$a_1 + a_2 \in I$. Let $I$ be an algebraic ideal. From $a \in I$ and
$b \le a$ it follows that $b = b \wedge a = b \cdot a \in I$; from
$a_1, a_2 \in I$ it follows that
$a_1 \vee a_2 = (a_1 + a_2) + a_1 \cdot a_2 \in I$.
From the fact that $I$ is also an algebraic ideal, it follows that $M/I$ is a
Boolean ring.
An equivalence relation $b_1 \sim b_2$ is defined by $b_1 = b_2 + a$ where
$a \in I$. Since $b_1 = b_2 + a \Leftrightarrow b_1 + b_2 = a$ we may define
$b_1 \sim b_2$ by $b_1 + b_2 \in I$.
The set of classes $M/I$ can easily be seen to be a Boolean ring as follows:
$c_1 \cdot c_2$ is equal to the class of $b_1 \cdot b_2$ for $b_1 \in c_1$,
$b_2 \in c_2$.
$c_1 + c_2$ is equal to the class of $b_1 + b_2$ for $b_1 \in c_1$,
$b_2 \in c_2$.
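The construction of the classes $M/I$ can be made concrete for the Boolean ring of subsets of a three-element set. The following sketch assumes the same bit-mask encoding as a convenient illustration, with $I$ taken to be the ideal of all elements below a fixed element `GENERATOR`; these names and the choice of ideal are hypothetical.

```python
N_BITS = 3
UNIT = (1 << N_BITS) - 1
ELEMS = range(UNIT + 1)
GENERATOR = 0b001  # hypothetical choice: I = {b | b <= GENERATOR}


def in_ideal(a):
    """a lies in I exactly when a <= GENERATOR, i.e. a . GENERATOR = a."""
    return (a & GENERATOR) == a


def equivalent(b1, b2):
    """b1 ~ b2 is defined by b1 + b2 in I (the sum is bitwise XOR here)."""
    return in_ideal(b1 ^ b2)


def representative(b):
    """A canonical member of the class of b: remove the part of b inside I."""
    return b & ~GENERATOR & UNIT


def classes():
    """Group all elements of M into their equivalence classes."""
    grouped = {}
    for b in ELEMS:
        grouped.setdefault(representative(b), set()).add(b)
    return grouped
```

For this choice of ideal there are four classes with two elements each, and the class of a product or a sum does not depend on the chosen members, which is what makes the operations on $M/I$ well defined.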
4 Set Lattices
If $X$ is a set and $M \subset \mathcal{P}(X)$ ($\mathcal{P}(X)$ is the power
set of $X$) then $M$ is a partially ordered set with respect to the set
theoretical relation $\subset$ of inclusion. If $M$ is a lattice, we then
speak of a set lattice. Here it is important to note that set theoretical
union $\cup$ and intersection $\cap$ will not necessarily correspond to the
lattice least upper bound $\vee$ and greatest lower bound $\wedge$,
respectively, that is, for $a, b \in M$ it is not necessarily true that
$a \wedge b = a \cap b$ and $a \vee b = a \cup b$.
We do find, however, that $a \wedge b \subset a \cap b$ and
$a \vee b \supset a \cup b$.
Th. 4.1. If $M$ has a maximal element, and if, for every $N \subset M$,
$\bigcap_{a \in N} a \in M$, then $M$ is a complete set lattice and
$a \wedge b = a \cap b$,
$a \vee b = \bigcap_{c \in M,\ a \cup b \subset c} c$.
Similarly, if $M$ contains a minimal element and if, for every
$N \subset M$, $\bigcup_{a \in N} a \in M$, then
$a \vee b = a \cup b$ and
$a \wedge b = \bigcup_{c \in M,\ c \subset a \cap b} c$.
The proof of this theorem is a simple consequence of Th. 1.2.
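A small set lattice shows how the least upper bound can differ from the set theoretical union. The sketch below assumes a hypothetical family `M` of subsets of $\{1, 2, 3\}$ which is closed under intersection and contains a maximal element, and computes the join by the first formula of Th. 4.1.

```python
# Hypothetical family of subsets of {1, 2, 3}, closed under intersection.
M = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2, 3})]


def closed_under_intersection(family):
    """Check that the intersection of any two members is again a member."""
    return all((a & b) in family for a in family for b in family)


def meet(family, a, b):
    """For such a family the greatest lower bound is the intersection."""
    return a & b


def join(family, a, b):
    """Least upper bound: intersection of all members containing a union b."""
    containing = [c for c in family if (a | b) <= c]
    return frozenset.intersection(*containing)
```

Here the join of `{1}` and `{2}` is the whole set `{1, 2, 3}`, which properly contains the union `{1, 2}`, because `{1, 2}` is not a member of the family.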
If $M$ is complemented, then in general the complement $a'$ of $a \in M$ is
not necessarily equal to the set complement $e \setminus a$ of $a$ in the
maximal element $e$ of $M$. If $M$ contains the empty set, then it follows
that $a' \subset e \setminus a$.
Let $M$ be a Boolean ring of sets, let $c \in M$ and let
$c \subset e \setminus a$ where $e$ is the maximal element of $M$. Then
$c \wedge a = 0$. Using $a^\perp$ we find that
$(c \vee a^\perp) \wedge a = (c \wedge a) \vee (a^\perp \wedge a) = 0$ and
$(c \vee a^\perp) \vee a = c \vee e = e$, that is, $c \vee a^\perp$ is also a
complement; by the uniqueness of the complement we obtain
$c \vee a^\perp = a^\perp$, that is, $c \subset a^\perp$. If the empty set is
an element of $M$ then we must have $a^\perp \cap a = \emptyset$ and therefore
$a^\perp \subset e \setminus a$. Then the set
$\{c \mid c \in M \text{ and } c \subset e \setminus a\}$ has a maximal
element, namely $a^\perp$.
APPENDIX II
Remarks about Topological and
Uniform Structures
Here we shall provide a brief summary of some of the concepts and principal
results which are used in various places in this book. For the proof of these
theorems, see [32].
1 Topological Spaces
A topological space consists of a set $X$ together with a structure
$\mathcal{O} \subset \mathcal{P}(X)$ which satisfies the following properties:
(1) $\mathcal{O}$ contains the intersection of any finite collection of
elements of $\mathcal{O}$.
(2) $\mathcal{O}$ contains arbitrary unions of elements of $\mathcal{O}$.
$\mathcal{O}$ is called the set of open sets. We often say that $\mathcal{O}$
defines a topology on $X$. The complements of sets in $\mathcal{O}$ are called
closed sets. The “interior” $A^0$ of a set $A$ is defined as the union of all
open subsets of $A$. $A^0$ is open. The closure $\overline{A}$ of
$A \subset X$ is defined as the intersection of all closed sets which contain
$A$. $B$ is said to be dense in $A$ if $B \subset A$ and
$A \subset \overline{B}$. $X$ is said to be separable if there exists a
countable subset in $X$ which is dense in $X$.
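For a finite set the definitions of open sets, interior, and closure can be evaluated directly. The following sketch assumes a hypothetical three-point space with a chain of open sets; for a finite family it suffices to test pairwise intersections and unions together with the empty set and the whole space.

```python
from itertools import combinations

POINTS = frozenset({1, 2, 3})   # hypothetical example space
OPENS = [frozenset(), frozenset({1}), frozenset({1, 2}), POINTS]


def is_topology(points, opens):
    """Finite check of axioms (1) and (2) for a family of open sets."""
    opens = set(opens)
    if frozenset() not in opens or frozenset(points) not in opens:
        return False
    return all(
        (a & b) in opens and (a | b) in opens
        for a, b in combinations(opens, 2)
    )


def interior(opens, subset):
    """Union of all open sets contained in `subset`."""
    result = frozenset()
    for o in opens:
        if o <= subset:
            result |= o
    return result


def closure(points, opens, subset):
    """Intersection of all closed sets containing `subset`."""
    result = frozenset(points)
    for o in opens:
        closed = frozenset(points) - o
        if subset <= closed:
            result &= closed
    return result
```

In this example the one-point set `{1}` is dense in the whole space, since the only closed set containing it is the space itself.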
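A filter in $X$ is a nonempty subset $\mathcal{F} \subset \mathcal{P}(X)$ with
$\emptyset \notin \mathcal{F}$ for which $A \in \mathcal{F}$ and
$B \supset A \Rightarrow B \in \mathcal{F}$, and $A \in \mathcal{F}$ and
$B \in \mathcal{F} \Rightarrow A \cap B \in \mathcal{F}$.
We may also, in an equivalent manner, define a topology by means of a
neighborhood structure, as follows: To each $x \in X$ there corresponds a
filter $\mathcal{F}_x$, the so-called neighborhood filter of $x$. For the
$\mathcal{F}_x$ we require that $x \in U$ for all $U \in \mathcal{F}_x$; to
each $U \in \mathcal{F}_x$ there exists a $V \in \mathcal{F}_x$ such that for
each $y \in V$ the relation $U \in \mathcal{F}_y$ is satisfied.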
We may show the equivalence in the following way:
(1) Let $X$ and $\mathcal{O}$ be given. We define $\mathcal{F}_x$ as the set
of all $A$ for which there exists a $B \in \mathcal{O}$ for which
$x \in B \subset A$. The $\mathcal{F}_x$ then satisfy the requirements for a
neighborhood structure.
(2) Let $X$ be given together with a neighborhood structure. We define
$\mathcal{O}$ as the set of all $A$ for which
$y \in A \Rightarrow A \in \mathcal{F}_y$; the set $A$ is therefore a
neighborhood of each of its points.
It is a simple matter to show that the neighborhood structure defined
according to procedure (1) using $X$ and $\mathcal{O}$ is identical to that
which defines this $\mathcal{O}$ according to procedure (2). Similarly, if we
begin with a neighborhood structure in $X$ and define the set $\mathcal{O}$ of
open sets according to procedure (2) and if we then define a neighborhood
structure using $\mathcal{O}$ according to procedure (1), we then obtain the
same structure we began with.
Let $X$ and $Y$ be topological spaces. A mapping $f: X \to Y$ is said to be
continuous if, for every open set $A$ of $Y$ the set
$f^{-1}(A) = \{x \mid f(x) \in A\}$ is open. This is equivalent to the
condition that for every closed set $A$, $f^{-1}(A)$ is closed.
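The definition of continuity by inverse images can be checked mechanically for maps between finite spaces. The sketch below assumes a hypothetical two-point space in which only the point 1 forms an open one-point set; maps are represented as dictionaries, and all names are illustrative.

```python
POINTS = frozenset({0, 1})  # hypothetical two-point space
OPENS = {frozenset(), frozenset({1}), POINTS}


def preimage(f, domain, subset):
    """The set of all x in the domain with f(x) in `subset`."""
    return frozenset(x for x in domain if f[x] in subset)


def is_continuous(f, domain, domain_opens, codomain_opens):
    """f is continuous if the inverse image of every open set is open."""
    return all(preimage(f, domain, o) in domain_opens for o in codomain_opens)


IDENTITY = {0: 0, 1: 1}
SWAP = {0: 1, 1: 0}
CONSTANT = {0: 0, 1: 0}
```

The identity and the constant map are continuous, whereas the map exchanging the two points is not, because the inverse image of the open set `{1}` is the set `{0}`, which is not open.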
A filter $\mathcal{G}$ is said to be finer than a filter $\mathcal{F}$ if
$\mathcal{G} \supset \mathcal{F}$. We say that a filter $\mathcal{G}$ in $X$
converges to an $x \in X$ if $\mathcal{G}$ is finer than the neighborhood
filter $\mathcal{F}_x$. If $x_n$ is a sequence, then
$\mathcal{G} = \{A \mid A \text{ contains all } x_n \text{ except for a finite number}\}$
is a filter. We say that the sequence $x_n$ converges towards $x$ and write
$x_n \to x$ if the corresponding filter $\mathcal{G}$ converges to $x$. It
follows that $x_n \to x$ if and only if each $U \in \mathcal{F}_x$ contains
all $x_n$ (with the exception of a finite number). For a filter $\mathcal{G}$
in $X$, $x \in X$ is called an accumulation point of $\mathcal{G}$ if
$x \in \overline{A}$ for all $A \in \mathcal{G}$. $x$ is called an
accumulation point of the sequence $x_n$ if it is an accumulation point of the
corresponding filter; this is the case only if, to each
$U \in \mathcal{F}_x$ there exists an infinity of elements $x_{n_i}$ of the
sequence for which $x_{n_i} \in U$.
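We say that a topology $\mathcal{T}_2$ on $X$ is finer than a topology
$\mathcal{T}_1$ on $X$ if $\mathcal{O}_2 \supset \mathcal{O}_1$ for the
corresponding sets of open sets. If $\mathcal{T}_\lambda$ is a family of
topologies on $X$ then $\mathcal{O} = \bigcap_\lambda \mathcal{O}_\lambda$
defines the finest topology $\mathcal{T}_u$ which is coarser than each of the
$\mathcal{T}_\lambda$. The fact that the set
$\mathcal{O} = \bigcap_\lambda \mathcal{O}_\lambda$ satisfies the axioms for
open sets has, as a consequence of AI, §4, the result that the topologies on
$X$ form a complete lattice, since $\mathcal{P}(X)$ satisfies all the axioms
for a set $\mathcal{O}$. Therefore, to each family $\mathcal{T}_\lambda$ there
exists a coarsest topology $\mathcal{T}_0$ which is finer than all the
$\mathcal{T}_\lambda$. $\mathcal{T}_u$ is called the greatest lower bound of
all the $\mathcal{T}_\lambda$ and $\mathcal{T}_0$ is the least upper bound of
the $\mathcal{T}_\lambda$.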
Suppose we are given a set $X$ together with a family $X_k$ of topological
spaces and a set of maps $f_k: X \to X_k$. The coarsest topology in $X$ for
which all the maps $f_k$ are continuous is called the initial topology
generated by the maps $f_k: X \to X_k$. If $Y$ is a topological space, then
the mapping $g: Y \to X$ ($X$ with the initial topology) is continuous if and
only if all composite maps $Y \xrightarrow{g} X \xrightarrow{f_k} X_k$ of $Y$
into $X_k$ are continuous.
If $A$ is a subset of the space $X$ with the topology $\mathcal{T}$, we then
denote the initial topology on $A$ defined by the canonical injection of $A$
in $X$ as the induced topology on $A$ or (more simply) the topology
$\mathcal{T}$ on $A$. The open sets of this topology are precisely the
intersections of open sets $B$ of $X$ with $A$.
Let $X_k$ be a family of topological spaces, and let $X$ be the Cartesian
product of the $X_k$. We define the product topology on $X$ to be the initial
topology defined by the set of projections $X \to X_k$.
A topological space $X$ is said to be a Hausdorff space if to each pair of
different points $x, y$ there exist neighborhoods $U_x$ and $U_y$ of $x$ and
$y$ such that $U_x \cap U_y = \emptyset$.
2 Uniform Spaces
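A uniform structure on a set $X$ is a subset $\mathcal{N}$ of
$\mathcal{P}(X \times X)$ which satisfies the following axioms:
(1) $\mathcal{N}$ is a filter.
We define: $W^{-1} = \{(y, x) \mid (x, y) \in W\}$,
$V \cdot W = \{(x, z) \mid \text{there exists a } y \in X \text{ for which }
(x, y) \in V,\ (y, z) \in W\}$ and $\Delta = \{(x, x) \mid x \in X\}$.
(2) $W \in \mathcal{N} \Rightarrow \Delta \subset W$.
(3) $W \in \mathcal{N} \Rightarrow W^{-1} \in \mathcal{N}$.
(4) $W \in \mathcal{N} \Rightarrow$ there exists a $V \in \mathcal{N}$ such
that $V \cdot V \subset W$.
The elements of $\mathcal{N}$ are called vicinities.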
A subset Z of $\mathcal{P}(X \times X)$ is called a fundamental system of vicinities or the
basis of a uniform structure if the filter generated by Z satisfies axioms (2)-(4).
A topological structure is defined by a uniform structure in the following
way: Let W range over $\mathcal{N}$; it is easy to show that, for each x, the sets
$\{y \mid (x, y) \in W\}$ form a filter $\mathcal{F}_x$ for which $x \in U$ for all $U \in \mathcal{F}_x$, and that, to
each $U \in \mathcal{F}_x$ there exists a $V \in \mathcal{F}_x$ such that $U \in \mathcal{F}_y$ for each $y \in V$. This
topology is called the topology generated by $\mathcal{N}$. A topology is said to be
uniformizable if there exists a uniform structure which generates the
topology.
A uniform structure $\mathcal{N}$ is said to be separated if $\bigcap_{W \in \mathcal{N}} W = \Delta$. For the
topology generated by $\mathcal{N}$ to be separated it is necessary and sufficient that
$\mathcal{N}$ is separated.
A mapping $f: X \to Y$ between two uniform spaces X and Y is said to be uniformly
continuous if, to each vicinity V of Y there exists a vicinity W of X such
that $f(W) \subset V$ where f(W) is defined by $f(W) = \{(f(x), f(y)) \mid (x, y) \in W\}$.
Let $\mathcal{N}_1$ and $\mathcal{N}_2$, $\mathcal{N}_1 \subset \mathcal{N}_2$, be two uniform structures; we then say that $\mathcal{N}_1$
is coarser than $\mathcal{N}_2$ and $\mathcal{N}_2$ is finer than $\mathcal{N}_1$. If $\mathcal{N}_\lambda$ is a collection of uniform
structures on X and $\mathcal{N} = \bigcap_\lambda \mathcal{N}_\lambda$, it is easy to show that $\mathcal{N}$ is also a uniform
structure. If X is a set, $X_\lambda$ are uniform spaces, and $f_\lambda$ are maps $X \to X_\lambda$, then
there is a coarsest uniform structure for which all mappings $f_\lambda$ are uniformly
continuous. This is the initial uniform structure for the maps $f_\lambda: X \to X_\lambda$. A
mapping g of a uniform space Y into X with the initial uniform structure is
uniformly continuous if and only if all the composite maps $f_\lambda \circ g: Y \to X_\lambda$
are uniformly continuous. The topology corresponding to the initial uniform
structure on X is precisely the initial topology for which all maps $f_\lambda: X \to X_\lambda$
are continuous.
The product uniform structure on the product set $X = \prod_\lambda X_\lambda$ is the initial
uniform structure in which all projections $X \to X_\lambda$ are uniformly continuous.
The induced uniform structure on a subset A of a uniform space is
the initial structure for which the canonical injection $A \to X$ is uniformly
continuous.
A filter $\mathcal{F}$ in a uniform space is called a Cauchy filter if for each vicinity W
there exists an element $F \in \mathcal{F}$ for which $F \times F \subset W$. X is said to be complete
if every Cauchy filter converges to a point in X. For each uniform space X it
is possible to construct a complete separating(!) uniform space $\hat{X}$.
If X is itself separating, then we may identify X with a dense subset of $\hat{X}$; $\hat{X}$
is then uniquely determined (up to an isomorphism). $\hat{X}$ is called the
completion of X.
A subset A of a separating complete uniform space X is a complete space if
and only if A is closed in X.
If X, Y are separating uniform spaces and Y is complete, then a uniformly
continuous map $X \to Y$ has a unique extension $\hat{X} \to Y$.
A metric on the set X is a real valued function $d: X \times X \to \mathbf{R}$ which satisfies
the following conditions: $d(x, y) \geq 0$; $d(x, y) = 0 \Leftrightarrow x = y$; $d(x, y) = d(y, x)$; and the so-called
triangle inequality $d(x, z) \leq d(x, y) + d(y, z)$. A metric space is a set X
together with a metric d.
It is easy to show that a metric defines a basis for a uniform structure by
the sets $W_\varepsilon = \{(x, y) \mid d(x, y) < \varepsilon\}$. This uniform structure is called the uniform
structure generated by the metric. A uniform space X is said to be metrizable
if there is a metric which generates its uniform structure. X is metrizable if
and only if X is separating and there exists a denumerable basis for the
uniform structure. A topological space is said to be metrizable if there exists a
metric for which the topology is that generated by the uniform structure
generated by the metric.
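The vicinity axioms can be checked by hand on a finite sample. The following Python sketch is only an illustration and not part of the original text: the helper names `vicinity`, `compose`, and `inverse`, the sample points, and the value of ε are assumptions chosen for the example. It builds $W_\varepsilon$ for the metric d(x, y) = |x − y| and verifies axioms (2)-(4), taking $V = W_{\varepsilon/2}$ in axiom (4).

```python
from itertools import product

def vicinity(points, d, eps):
    """W_eps = {(x, y) | d(x, y) < eps}, restricted to a finite point set."""
    return {(x, y) for x, y in product(points, repeat=2) if d(x, y) < eps}

def compose(V, W):
    """V o W = {(x, z) | there is a y with (x, y) in V and (y, z) in W}."""
    return {(x, z) for (x, y1) in V for (y2, z) in W if y1 == y2}

def inverse(W):
    """W^{-1} = {(y, x) | (x, y) in W}."""
    return {(y, x) for (x, y) in W}

# Hypothetical sample: eleven equally spaced points of [0, 1].
points = [i / 10 for i in range(11)]
d = lambda x, y: abs(x - y)
eps = 0.35

W = vicinity(points, d, eps)
V = vicinity(points, d, eps / 2)
diagonal = {(x, x) for x in points}

print(diagonal <= W)        # axiom (2): the diagonal lies in W
print(inverse(W) == W)      # axiom (3): here W is even symmetric
print(compose(V, V) <= W)   # axiom (4): V o V is contained in W
```

Axiom (4) holds here because of the triangle inequality: two steps of length less than ε/2 give a total distance less than ε.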
A sequence $\{x_n\}$ is called a Cauchy sequence if the corresponding filter $\mathcal{F}$ is
a Cauchy filter, that is, if to each vicinity W there exists an integer N such
that $(x_n, x_m) \in W$ for n, m > N. A sequence in a metric space is therefore a
Cauchy sequence if and only if, to each $\varepsilon > 0$ there exists an N such that
$d(x_n, x_m) < \varepsilon$ for n, m > N. A metric space X is complete if and only if every
Cauchy sequence has a limit element in X.
In a separating topological space the following three conditions are
equivalent:
(1) Every covering of X by open sets contains a finite subcovering.
(2) The intersection of a set of closed sets is nonempty if and only if every
finite subset of these closed sets has nonempty intersection.
(3) Every filter has an accumulation point in X.
If X satisfies one of these properties, then X is said to be a compact space.
Every compact space is uniformizable, and the corresponding uniform
structure is uniquely determined by the topology, and is the uniform
structure of the neighborhood filter of $\Delta$ in the topological space $X \times X$. In
this way X becomes a complete separating uniform space. A separating
uniform space is said to be precompact if its completion is compact. X is
precompact if and only if to each vicinity W there exists a finite number of
points $x_\nu \in X$ such that $\bigcup_\nu W(x_\nu) = X$ where $W(x_\nu) = \{x \mid (x, x_\nu) \in W\}$.
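For a metric vicinity $W_\varepsilon$ the covering condition says that finitely many ε-balls cover the space. The following Python sketch illustrates this on a finite sample of the interval [0, 1]; it is an assumption-laden illustration rather than a construction from the text, and the names `finite_net` and `is_covered`, the greedy selection rule, and the sample are all hypothetical.

```python
def finite_net(points, d, eps):
    """Greedily choose centers x_v so that the sets
    W(x_v) = {x | d(x, x_v) < eps} cover all of `points`."""
    centers = []
    for p in points:
        # p becomes a new center only if no existing center covers it
        if all(d(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

def is_covered(points, centers, d, eps):
    """True if every point lies in some W(x_v)."""
    return all(any(d(p, c) < eps for c in centers) for p in points)

sample = [i / 100 for i in range(101)]   # finite sample of [0, 1]
dist = lambda x, y: abs(x - y)
net = finite_net(sample, dist, 0.25)

print(len(net), is_covered(sample, net, dist, 0.25))
```

Every point is either covered by an earlier center or becomes a center itself, so the covering property holds by construction; the number of centers stays small because the sample is bounded.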
Every closed subset A of a compact space X is compact. A subset A of a
topological space X (X is not necessarily compact) is said to be relatively
compact if its closure is compact. The product $X = \prod_\lambda X_\lambda$ of compact spaces
is compact (Tychonov's theorem). If $f: X \to Y$ is a continuous map of a
compact space X into a separating topological space, then f(X) is a compact
subset of Y. For the special case in which f is bijective, X and Y are
homeomorphic. Thus, with the aid of the identity mapping $X \to X$ we find
that if X is compact with respect to the topology $\mathcal{T}_1$ and $\mathcal{T}_2$ is a coarser,
separating topology on X then $\mathcal{T}_2$ is identical to $\mathcal{T}_1$.
A subset A of a precompact space is precompact; the product $X = \prod_\lambda X_\lambda$
of precompact spaces $X_\lambda$ is precompact. X with the initial uniform structure
generated by the maps $f_\lambda: X \to X_\lambda$ where the $X_\lambda$ are compact is precompact. If
the family of the $f_\lambda$ is denumerable, then X is also metrizable.
A compact and metrizable space is separable.
A separating topological space X is said to be locally compact if each point
x has a compact neighborhood.
3 Baire Spaces
A subset A of a topological space X is said to be nowhere dense in X if the
interior of the closure $\bar{A}$ of A is empty. A is said to be meager in X if A is the
union of a denumerable set of nowhere dense subsets. A topological space is
said to be a Baire space if every nonempty open set is not meager.
If a Baire space X is the union of denumerably many closed sets $A_\nu$, then at
least one of the $A_\nu$ must contain a nonempty open set; otherwise, all the $A_\nu$ would be
nowhere dense and X itself would be meager, although X is open.
Every locally compact and every complete metrizable space is a Baire
space.
4 Connectedness
A topological space X is said to be connected if it is not the union of two
open nonempty disjoint sets. This is equivalent to the condition that X is not
the union of two closed nonempty disjoint sets.
If $X = X_1 \cup X_2$, where $X_1$, $X_2$ are open and $X_1 \cap X_2 = \emptyset$ then $X_1$, $X_2$
are also closed.
A subset $A \subset X$ is said to be connected if A together with the topology
induced by X is a connected space.
If A is a connected subset of X and f is a continuous mapping $X \to Y$, then f(A)
is a connected subset of Y.
If $A_\lambda$ is a family of connected subsets of X and if $\bigcap_\lambda A_\lambda \neq \emptyset$ then $\bigcup_\lambda A_\lambda$ is
a connected subset. If x is an element of X, then the union of all
connected A which contain x is a connected set, and is the largest connected
set containing x. This set is called the connected component of x, and is a
closed set.
Two connected components (of x and y) are either identical or disjoint,
that is, the connected components partition X into equivalence classes.
A topological space is said to be locally connected if each point has a
fundamental system of connected neighborhoods, that is, if to each neighborhood
U of a point x there exists a connected neighborhood V of x such that
$V \subset U$.
A space is said to be linearly connected if, for each pair of points x, y there
exists a continuous path from x to y, that is, a continuous map $f: [0, 1] \to X$
such that f(0) = x and f(1) = y.
APPENDIX III
Banach Spaces
We cannot develop the theory of Banach spaces here. Instead, we shall only
briefly present a summary of important results without proof in order that
readers who are not familiar with these results will be able to find them in
other books, for example, in [33].
1 Linear Vector Spaces
A linear vector space X over the field K is an additive abelian group (that is,
to each pair $x_1, x_2 \in X$ there corresponds a sum $x_1 + x_2 \in X$; the following
axioms are also satisfied:
$x_1 + (x_2 + x_3) = (x_1 + x_2) + x_3$,
$x_1 + x_2 = x_2 + x_1$,
there exists an element 0 such that 0 + x = x for all $x \in X$, to each $x \in X$
there exists a $y \in X$ for which x + y = 0; we write y = −x and
$x_1 + (-x_2) = x_1 - x_2$). In addition the elements of the field K define maps of
X into X which satisfy the following axioms (here let $\alpha, \beta \in K$ and let e
denote the unit element of K):
$ex = x$, $\alpha(\beta x) = (\alpha\beta)x$,
$\alpha(x_1 + x_2) = \alpha x_1 + \alpha x_2$, $(\alpha + \beta)x = \alpha x + \beta x$.
We shall only consider the two following cases: K = R (where R is the
field of real numbers) and K = C (where C is the field of complex numbers).
We shall assume that the reader is familiar with the simple computation
rules which follow directly from these axioms.
2 Normed Vector Spaces and Banach Spaces
In a linear vector space X (over the field K = R or C) the norm is a real
function $\|x\|$ over X which satisfies the following properties:
(1) $\|x\| \geq 0$ and $\|x\| = 0$ only for x = 0.
(2) $\|x + y\| \leq \|x\| + \|y\|$.
(3) $\|\alpha x\| = |\alpha| \, \|x\|$ where $|\alpha|$ is the absolute magnitude of $\alpha$.
From (3) it follows that $\|-x\| = \|x\|$. From (2) it follows that $\|x - y\| \geq
| \|x\| - \|y\| |$. If a norm is defined on X, X is called a normed space.
$d(x, y) = \|x - y\|$ defines a metric (see AII, §2) and therefore a uniform
structure and a topology. The operations $X \times X \to X$ and $K \times X \to X$
are uniformly continuous.
The notion of a Cauchy sequence has already been defined in AII, §2. A
normed space X is complete if for every Cauchy sequence $x_n$ there exists a
limit point $x \in X$, that is, $x_n \to x$. A complete normed space is called a
Banach space. Every normed space may be completed to form a Banach
space.
The set of all $x \in B$ such that $\|x\| \leq 1$ is called the unit sphere $B_{[1]}$ of B and
is closed in the norm topology.
3 The Dual Space for a Banach Space
A mapping $l: X \to K$ for which $l(x_1 + x_2) = l(x_1) + l(x_2)$ and $l(\alpha x) = \alpha l(x)$
(where K is the field of scalars for the vector space X) is called a linear form
or a linear functional.
A linear form is continuous over a normed vector space if it is continuous
for x = 0. This is equivalent to the condition that l is bounded, that is, there
exists a real number c such that
$|l(x)| \leq c\|x\|$.
A bounded linear form is clearly uniformly continuous, and therefore has
a continuous extension from the normed vector space onto its completion.
Let X' denote the set of continuous linear forms; we find that $X' = \hat{X}'$. X' is
the dual space for X.
If we introduce the norm in X' as follows:
$\|l\| = \sup_{x \neq 0} \frac{|l(x)|}{\|x\|} = \sup_{\|x\| \leq 1} |l(x)| = \sup_{\|x\| \leq 1} l(x)$
(the last equality is only valid if K = R), then X' is a Banach space because it
is easy to verify the fact that, for a Cauchy sequence $l_n$, the values $l_n(x)$ form a convergent sequence for
each x and that $l(x) = \lim_{n \to \infty} l_n(x)$ is an element of X' with $\|l_n - l\| \to 0$.
The fact that X' separates X, that is, from $l(x_1) = l(x_2)$ for all $l \in X'$ it
follows that $x_1 = x_2$, follows directly from the Hahn-Banach theorem. Since
$|l(x)| \leq \|l\| \, \|x\|$ it follows that
$\|x\| \geq \sup_{l \in X'} \frac{|l(x)|}{\|l\|}$.
If $\|x\| > 1$ then, according to the Hahn-Banach theorem there exists an l
satisfying $|l(x)| > |l(y)|$ for $\|y\| \leq 1$ and we therefore find that $|l(x)| > \|l\|$. Thus
we obtain
$\|x\| > 1 \Rightarrow \sup_{l \in X'} \frac{|l(x)|}{\|l\|} > 1$.
Thus for $x = (1 + \varepsilon) x' / \|x'\|$, for arbitrary $\varepsilon > 0$ it follows that
$(1 + \varepsilon) \sup_{l \in X'} \frac{|l(x')|}{\|l\|} > \|x'\|$
and we therefore obtain
$\|x\| = \sup_{l \in X'} \frac{|l(x)|}{\|l\|}$.
A bilinear form (the canonical bilinear form of the dual pair X, X') is defined
on $X \times X'$ by $\langle x, l \rangle = l(x)$. For $x \in X$, $y \in X'$, $\langle x, y \rangle$ is, for fixed x, a
norm-continuous linear form over X'. X'', the set of all norm-continuous linear
forms over X', is, in general, larger than X.
4 Weak Topologies
Let X be a Banach space, X' the dual Banach space for X and let $\langle x, y \rangle$ be
the canonical bilinear form for the pair X, X'.
$\sigma(X', X)$ is the initial topology in X' for which the maps $X' \to \mathbf{R}$ (or C), $y \mapsto \langle x, y \rangle$,
are continuous for all $x \in X$. Similarly $\sigma(X, X')$ is the initial topology in X for
which the maps $X \to \mathbf{R}$ (or C), $x \mapsto \langle x, y \rangle$, are continuous for all $y \in X'$. These $\sigma(\ldots)$
topologies are often called the weak topologies corresponding to the dual
pair X, X' because they are weaker than the norm topologies.
The set of all continuous linear forms over X in the $\sigma(X, X')$-topology is
given by the $y \in X'$. Similarly, the set of all continuous linear forms over X' in
the $\sigma(X', X)$-topology is given by the $x \in X$. The sets of continuous linear
forms over X in both the norm topology and the $\sigma(X, X')$-topology are
therefore identical.
Uniform structures are defined by means of the weak topology; for
example, the uniform structure defined by $\sigma(X, X')$ with vicinities
$\{(x, y) \mid x - y \in U$ where U is a neighborhood of 0$\}$.
Thus every weakly continuous linear form is also uniformly continuous.
Let X be a linear vector space over the field of real numbers. A subset K of
X is called a convex set, if, for $x_1, x_2 \in K$, $\lambda x_1 + (1 - \lambda)x_2 \in K$ for $0 \leq \lambda \leq 1$.
For $A \subset X$ let co A denote the smallest convex set in X which contains A.
The unit sphere of a Banach space is convex.
Let X be a Banach space; let $\overline{\mathrm{co}}\, A$ and $\overline{\mathrm{co}}^{\sigma} A$ denote the smallest convex sets
closed in the norm-topology and the $\sigma(X, X')$-topology, respectively, which
contain A. In X we find that $\overline{\mathrm{co}}^{\sigma} A = \overline{\mathrm{co}}\, A$. This is not the case in X'!
However, the unit sphere $X'_{[1]}$ of X' is not only $\sigma(X', X)$-closed but also
$\sigma(X', X)$-compact. This result follows from the fact that the maps $y \mapsto \langle x, y \rangle$,
$x \in X_{[1]}$, send $X'_{[1]}$ into $[-1, 1] \subset \mathbf{R}$, that is, the images under these maps are relatively compact
(see AII, §2) and that $X'_{[1]}$ is the polar set to $X_{[1]}$ (that is, the set of all $y \in X'$
for which $\langle x, y \rangle \leq 1$ for all $x \in X_{[1]}$) which is $\sigma(X', X)$-closed. The convex set
generated by one (or finitely many) compact sets is compact, and therefore is
closed!
An extreme point of a convex set A is an $x \in A$ for which the relation
$x = \lambda x_1 + (1 - \lambda)x_2$, $x_1, x_2 \in A$, $0 < \lambda < 1$ is satisfied only for $x_1 = x_2 = x$.
We shall denote the set of extreme points of A by $\partial_e A$. If A is compact, then
$A = \overline{\mathrm{co}}(\partial_e A)$ according to the Krein-Milman theorem. Therefore we obtain
$X'_{[1]} = \overline{\mathrm{co}}^{\sigma}(\partial_e X'_{[1]})$.
A linear form over X' is $\sigma(X', X)$-continuous if and only if it is
$\sigma(X', X)$-continuous over the unit sphere. This corresponds to the fact that a linear
subspace of X' is $\sigma(X', X)$-closed when its intersection with the unit sphere is
$\sigma(X', X)$-closed.
The topologies $\sigma(X', X)$ and $\sigma(X', X_{[1]})$ are identical on X', where $\sigma(X', X_{[1]})$
is the initial topology for which the maps $X' \to \mathbf{R}$, $y \mapsto \langle x, y \rangle$, are continuous for all
$x \in X_{[1]}$. If A is a subset of X for which $\mathrm{co}(A \cup -A) = X_{[1]}$ then the
topologies $\sigma(X', X)$ and $\sigma(X', A)$ are also identical on $X'_{[1]}$. This result follows
directly from the fact that both the $\sigma(X', X_{[1]})$- and $\sigma(X', A)$-topologies are
weaker than the $\sigma(X', X)$-topology and are also separating, and must
therefore coincide on a $\sigma(X', X)$-compact set. If X is norm-separable we can
choose A to be denumerable. We define a norm in X' as follows:
$\|y\|_A = \sum_{x \in A} \lambda_x |\langle x, y \rangle|$,
where the $\lambda_x > 0$ and $\sum_{x \in A} \lambda_x < \infty$. Again it is easy to verify that the
topology determined by the norm $\|y\|_A$ coincides with $\sigma(X', X)$ on $X'_{[1]}$.
If X is norm-separable then the $\sigma(X', X)$-topology on $X'_{[1]}$ is compact and
metrizable, and therefore, according to AII, §2, $X'_{[1]}$ is separable in the
$\sigma(X', X)$-topology. Thus X' is also separable in the $\sigma(X', X)$-topology.
5 Linear Maps of Banach Spaces
If $X_1$ and $X_2$ are linear vector spaces, then a map T from $X_1$ into $X_2$ is said to
be linear if T(x + y) = T(x) + T(y) and $T(\alpha x) = \alpha T(x)$. If $X_1$ and $X_2$ are
Banach spaces, then T is continuous with respect to the norm topology only
if there exists a real number C such that
$\|Tx\| \leq C\|x\|$ for all x.
Then T is also said to be bounded.
If T is bounded, then $\langle Tx, y \rangle = \langle x, y' \rangle$ is a norm-continuous linear form
over $X_1$, that is, it defines a $y' \in X_1'$. It is easy to verify that $y \to y'$ defines a
linear and bounded map $T': X_2' \to X_1'$ which we call the dual map to T or the
adjoint map. T' is continuous with respect to the $\sigma(X_2', X_2)$-$\sigma(X_1', X_1)$-topologies.
The following important relation is satisfied: $\langle Tx, y \rangle =
\langle x, T'y \rangle$.
A linear map $S: X_2' \to X_1'$ is $\sigma$-continuous only if there exists a bounded
linear map $T: X_1 \to X_2$ for which S = T'. S is $\sigma$-continuous in $X_2'$ only if it is
$\sigma$-continuous on the unit sphere of $X_2'$.
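In finite dimensions the dual map is simply the transposed matrix. The following Python sketch is a minimal illustration under that assumption, with $X_1 = \mathbf{R}^3$, $X_2 = \mathbf{R}^2$ and the pairing taken as the ordinary dot product; the helper names and the sample matrix are hypothetical. It checks the relation $\langle Tx, y \rangle = \langle x, T'y \rangle$ for one pair of vectors.

```python
def apply(T, x):
    """Apply the matrix T (given as a list of rows) to the vector x."""
    return [sum(t * xj for t, xj in zip(row, x)) for row in T]

def transpose(T):
    """The matrix of the dual map T' in the dual bases."""
    return [list(col) for col in zip(*T)]

def pairing(x, y):
    """Canonical bilinear form <x, y> on R^n and its dual."""
    return sum(a * b for a, b in zip(x, y))

# Hypothetical example: T maps R^3 into R^2.
T = [[1, 2, 0],
     [0, -1, 3]]
x = [1, 0, 2]      # element of X_1
y = [4, -1]        # element of X_2'

lhs = pairing(apply(T, x), y)               # <Tx, y>
rhs = pairing(x, apply(transpose(T), y))    # <x, T'y>
print(lhs, rhs)
```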
6 Ordered Vector Spaces
A linear vector space X over the field of real numbers R is said to be ordered
if it is a partially ordered set in the sense of AI, §1 and if the following
conditions are satisfied:
$x_1 \geq x_2 \Rightarrow x_1 + x \geq x_2 + x$ for all $x \in X$,
$x \geq 0$, $\alpha \geq 0 \Rightarrow \alpha x \geq 0$.
A convex cone C is defined as a convex set which has the property that if
$x \in C$ then so does $\lambda x$ for all $\lambda \geq 0$. The cone C is said to be proper if
$C \cap (-C) = \{0\}$.
It is easy to see that the specification of an order structure in X is
equivalent to the specification of a proper cone C as follows:
$x \geq 0$ if and only if $x \in C$.
This cone C is called the positive cone; the positive cone of X is often denoted
by $X_+$.
A convex subset K of C is said to be a basis for the cone C if to each $x \in C$,
$x \neq 0$, there exists exactly one number $\lambda(x)$ such that $\lambda(x)x \in K$. The set $\hat{K}$
defined by $\hat{K} = \bigcup_{0 \leq \lambda \leq 1} \lambda K$ is equal to the set $\{x \mid x \in C$ and $x \leq w$ for some $w \in K\}$
and is called the truncated cone generated by the base K.
An ordered Banach space X is said to be base-normed if there exists a basis
K for the cone $X_+$ for which K is norm-closed and $X_{[1]} = \mathrm{co}(K \cup (-K))$. It
is a simple matter to show that $X_+$ will also be norm-closed. In addition, it
can be shown (see [33]) that X is generating, that is, $X = X_+ - X_+$.
An affine functional on the convex set K is a map $f: K \to \mathbf{R}$ for which
$w_1, w_2 \in K$, $0 \leq \lambda \leq 1 \Rightarrow f(\lambda w_1 + (1 - \lambda)w_2) = \lambda f(w_1) + (1 - \lambda)f(w_2)$. It is
easy to show that each affine functional on a basis K of a base-norm Banach
space may be uniquely extended to all of X as a linear functional because
$X = X_+ - X_+$ and $X_+ = \bigcup_{\lambda \geq 0} \lambda K$. There is a 1:1 correspondence between
the linear functionals over X and the affine functionals over K.
From $X_{[1]} = \mathrm{co}(K \cup -K)$ it directly follows that $\|w\| = 1$ for all $w \in K$ and
that $\|x\|^{-1} x \in K$ for all $x \in X_+$, $x \neq 0$.
Since $X_+$ is generating, each $x \in X$ may be expressed in the form
$x = \alpha w_1 - \beta w_2$, where $w_1, w_2 \in K$ and $\alpha, \beta \geq 0$. Thus it follows that $\|x\| \leq
\alpha + \beta$. To each $\varepsilon > 0$ we may choose $w_1, w_2, \alpha, \beta$ such that $\|x\| \geq
\alpha + \beta - \varepsilon$ (see [33]). We say that X has the minimal decomposition
property if $w_1, w_2, \alpha, \beta$ may be chosen so that $\|x\| = \alpha + \beta$. All the examples
of base-norm spaces in this book satisfy the minimal decomposition
property.
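A simple finite-dimensional instance may help fix ideas. The following Python sketch is an illustration only and does not come from the text: it assumes $X = \mathbf{R}^n$ with the norm $\|x\| = \sum_i |x_i|$, the cone $X_+$ of vectors with nonnegative components, and the base K of nonnegative vectors whose components sum to 1. For this choice the positive and negative parts of x give a minimal decomposition. The function names are hypothetical.

```python
def l1_norm(x):
    """Norm of the assumed base-normed space R^n: the sum of |x_i|."""
    return sum(abs(t) for t in x)

def minimal_decomposition(x):
    """Return (alpha, w1, beta, w2) with x = alpha*w1 - beta*w2,
    w1 and w2 in the base K, and alpha + beta = ||x||."""
    pos = [max(t, 0.0) for t in x]       # positive part of x
    neg = [max(-t, 0.0) for t in x]      # negative part of x
    alpha, beta = sum(pos), sum(neg)
    e1 = [1.0] + [0.0] * (len(x) - 1)    # arbitrary element of K, used if a part vanishes
    w1 = [t / alpha for t in pos] if alpha > 0 else e1
    w2 = [t / beta for t in neg] if beta > 0 else e1
    return alpha, w1, beta, w2

x = [0.5, -1.5, 2.0, 0.0]
alpha, w1, beta, w2 = minimal_decomposition(x)
rebuilt = [alpha * a - beta * b for a, b in zip(w1, w2)]
print(alpha, beta, l1_norm(x), rebuilt)
```

Here α and β are the total masses of the positive and negative parts, so $\alpha + \beta$ equals the norm exactly and no ε is needed.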
If $x_n \in X_+$ is a bounded increasing (or decreasing) sequence, then there
exists an $x \in X_+$ for which $x_n \to x$. This follows directly from the fact that for
n > m, $x_n - x_m \in X_+$ and therefore there exists a $w \in K$ such that $x_n =
x_m + \lambda w$, where $\lambda \geq 0$, and with $x_m = \|x_m\| w_m$ ($w_m \in K$) it follows that
$x_n = (\|x_m\| + \lambda) \left( \frac{\|x_m\|}{\|x_m\| + \lambda} w_m + \frac{\lambda}{\|x_m\| + \lambda} w \right)$,
and we therefore obtain $\|x_n\| = \|x_m\| + \lambda = \|x_m\| + \|x_n - x_m\|$.
For the norm in the dual Banach space X' we obtain
$\|y\| = \sup\{|\langle x, y \rangle| \mid x \in X_{[1]} = \mathrm{co}(K \cup -K)\}$
$= \sup\{|\langle x, y \rangle| \mid x \in K\}$.
$X_+$ determines a polar cone in X' as follows:
$X'_+ = \{y \mid \langle x, y \rangle \geq 0$ for all $x \in X_+\}$
$= \{y \mid \langle x, y \rangle \geq 0$ for all $x \in K\}$.
$X'_+$ is not only $\sigma(X', X)$-closed, but is also $\sigma(X', X)$-complete (see [33])
because all positive linear functionals over X are norm-continuous. $X'_+$
determines an order for X' because from $y \in X'_+ \cap -X'_+$ it easily follows that
y = 0 from $X = X_+ - X_+$.
If l(x) is a bounded linear functional on K, that is, if $|l(x)| \leq c$ for all $x \in K$,
then from $x = \alpha w_1 - \beta w_2$ (where $w_1, w_2 \in K$) and $\|x\| \geq \alpha + \beta - \varepsilon$ (see
above) it follows that $|l(x)| \leq \alpha |l(w_1)| + \beta |l(w_2)| \leq (\alpha + \beta)c \leq c\|x\| + \varepsilon c$ for arbitrary
$\varepsilon > 0$ and therefore $|l(x)| \leq c\|x\|$. Every linear functional which is bounded
on K (and hence each bounded affine functional on K) is therefore an element
of X'.
l(w) = 1 for all $w \in K$ defines an element of X' which we shall denote by 1.
The unit sphere of X' is therefore equal to
$X'_{[1]} = \{y \mid |\langle w, y \rangle| \leq 1$ for all $w \in K\}$
$= \{y \mid -1 \leq \langle w, y \rangle \leq 1$ for all $w \in K\}$
$= \{y \mid -1 \leq y \leq 1\}$.
We shall denote the set of all y for which $y_1 \leq y \leq y_2$ by $[y_1, y_2]$ and call it
the order interval generated by $y_1$ and $y_2$. Therefore we obtain $X'_{[1]} =
[-1, 1]$. Because of this property we shall call X' an order unit space.
From $X'_{[1]} = [-1, 1]$ it easily follows that X' is generating, that is,
$X' = X'_+ - X'_+$.
If $y_n \in X'_+$ is a decreasing sequence, then there exists a $y \in X'_+$ to which the
sequence converges in the $\sigma(X', X)$-topology, a result which follows directly
from the fact that every set of the form $[0, \alpha 1]$ is compact in the
$\sigma(X', X)$-topology.
In the same manner in which every positive linear functional over X is norm
continuous, every positive linear map T (that is, $Tx \geq 0$ for $x \geq 0$) of a base
norm space $X_1$ into a base norm space $X_2$ is norm-continuous (see [33]).
Therefore, according to §5 the adjoint map $T': X_2' \to X_1'$ exists. T' is therefore
also positive.
The set of norm continuous maps $T: X_1 \to X_2$ together with the norm
$\|T\| = \sup\{\|Tx\| \mid \|x\| \leq 1\}$
forms a Banach space Y because a Cauchy sequence is also uniformly
convergent. Y becomes an ordered vector space by means of the cone
$Y_+ = \{T \mid T$ is positive$\}$.
APPENDIX IV
Operators in Hilbert Space
Since the mathematics of Hilbert space is an essential tool in quantum
mechanics, we shall briefly outline the proofs of important theorems. In
particular, we shall provide a few examples of the application of a number of
general theorems in AIII.
1 The Hilbert Space Structure Type
A Hilbert space is:
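(I) A linear vector space $\mathcal{H}$ over a field K (as defined in AIII, §1). Here
we shall only consider the case in which K = C, the field of complex
numbers.
(II) There is a map, the so-called inner product, defined on $\mathcal{H} \times \mathcal{H} \to \mathbf{C}$
which is denoted by $\langle x, y \rangle$ and satisfies the following axioms:
for $\alpha \in \mathbf{C}$: $\langle x, \alpha y \rangle = \alpha \langle x, y \rangle$,
$\langle x, y_1 + y_2 \rangle = \langle x, y_1 \rangle + \langle x, y_2 \rangle$,
$\langle x, y \rangle = \overline{\langle y, x \rangle}$; $\langle x, x \rangle \geq 0$, = 0 only if x = 0.
From $\langle x, y \rangle = \overline{\langle y, x \rangle}$ it follows that $\langle x, x \rangle$ is real. From the axioms
it easily follows that $\langle \alpha x, y \rangle = \bar{\alpha} \langle x, y \rangle$, $\langle x_1 + x_2, y \rangle =
\langle x_1, y \rangle + \langle x_2, y \rangle$, $\langle x, 0 \rangle = \langle 0, x \rangle = 0$.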
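Two vectors $x, y \in \mathcal{H}$ are said to be orthogonal if $\langle x, y \rangle = 0$. If $x \neq 0$ and
if we define a vector z by
$y = \frac{\langle x, y \rangle}{\langle x, x \rangle} x + z$
then z satisfies $\langle z, x \rangle = 0$ and we obtain
$\langle y, y \rangle = \frac{|\langle x, y \rangle|^2}{\langle x, x \rangle} + \langle z, z \rangle$
from which we obtain the Schwarz inequality:
$\langle x, x \rangle \langle y, y \rangle \geq |\langle x, y \rangle|^2$
which is also valid for the case x = 0. Here the equality is satisfied if and only
if z = 0, that is, $y = \lambda x$.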
If we define $\|x\| = \langle x, x \rangle^{1/2}$, then, from the Schwarz inequality, $|\langle x, y \rangle| \leq
\|x\| \, \|y\|$; from $\|x - y\|^2 = \|x\|^2 + \|y\|^2 - \langle x, y \rangle - \langle y, x \rangle$ we obtain the
triangle inequality $\|x + y\| \leq \|x\| + \|y\|$ and $\|x - y\| \geq | \|x\| - \|y\| |$.
Therefore $\|\ldots\|$ satisfies conditions (1)-(3) for a norm from AIII, §2.
Therefore $\mathcal{H}$ is a normed space with norm $\|\ldots\|$. The convergence $x_n \to x$ of
a sequence is defined in the sense of the norm, that is, $\|x_n - x\| \to 0$. With
the help of the Schwarz inequality it easily follows that the inner product
$\mathcal{H} \times \mathcal{H} \to \mathbf{C}$ is a continuous map, that is, from $x_n \to x$ and $y_m \to y$ we
obtain the relation $\langle x_n, y_m \rangle \to \langle x, y \rangle$.
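The decomposition of y used in deriving the Schwarz inequality can be checked numerically in $\mathbf{C}^3$. The following Python sketch is an illustration under stated assumptions: the inner product is taken as $\langle x, y \rangle = \sum_\nu \bar{x}_\nu y_\nu$ (linear in the second argument, as in axiom (II)), and the sample vectors and helper names are hypothetical.

```python
def inner(x, y):
    """<x, y> = sum of conj(x_v) * y_v: linear in y, conjugate-linear in x."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(x):
    """||x|| = <x, x>^(1/2); <x, x> is real, so its real part is taken."""
    return inner(x, x).real ** 0.5

# Hypothetical sample vectors in C^3.
x = [1 + 2j, 0.5j, -1.0 + 0j]
y = [2 - 1j, 3 + 0j, 1j]

# Decompose y = (<x, y>/<x, x>) x + z; the remainder z is orthogonal to x.
c = inner(x, y) / inner(x, x)
z = [b - c * a for a, b in zip(x, y)]

print(abs(inner(x, z)))                        # numerically zero
print(abs(inner(x, y)) <= norm(x) * norm(y))   # Schwarz inequality
s = [a + b for a, b in zip(x, y)]
print(norm(s) <= norm(x) + norm(y))            # triangle inequality
```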
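A pre-Hilbert space is defined as a set $\mathcal{H}$ which satisfies all the above
axioms except $\langle x, x \rangle = 0 \Rightarrow x = 0$. From $\langle x, x \rangle = 0$ it follows directly that
$\langle \alpha x, \alpha x \rangle = 0$. With $\langle x, y \rangle = |\langle x, y \rangle| e^{i\delta}$ it follows that
$0 \leq \|x - e^{-i\delta} y\|^2 = \|x\|^2 + \|y\|^2 - 2|\langle x, y \rangle|$   (1.1)
and from $\langle x, x \rangle = 0$ and $\langle y, y \rangle = 0$ we obtain $\langle x, y \rangle = 0$ and therefore
$\|x + y\|^2 = 0$. Therefore the set $\mathcal{N}_0 = \{x \mid \langle x, x \rangle = \|x\| = 0\}$ is a subspace of
$\mathcal{H}$. From (1.1) it follows, with $\|y\| = 0$ and with $\lambda x$ ($\lambda > 0$) instead of x, that
$0 \leq \lambda^2 \|x\|^2 - 2\lambda |\langle x, y \rangle|$
for all $\lambda > 0$ and therefore $\langle x, y \rangle = 0$ for $x \in \mathcal{H}$ and $y \in \mathcal{N}_0$. Therefore it
easily follows that $x_1 \sim x_2$ if $\|x_1 - x_2\| = 0$ defines an equivalence relation,
and the value of $\langle x, y \rangle$ depends only on the equivalence classes to which x
and y belong. In this way $\mathcal{H}/\mathcal{N}_0$ is a linear vector space over C which satisfies
II.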
We now present the third axiom for a Hilbert space :
(III) $\mathcal{H}$ is a Banach space, that is, it is complete with respect to the norm.
Every Cauchy sequence in $\mathcal{H}$ therefore has a limit element. Each
noncomplete space $\mathcal{H}$ which satisfies (I) and (II) can be completed (see
AII, §2).
We shall only consider Hilbert spaces which satisfy the axiom:
(IV) $\mathcal{H}$ is separable.
(For the notion of separability, see AII, §1.)
A subset G of $\mathcal{H}$ is called a linear basis (or Hamel basis) if the span of G is
dense in $\mathcal{H}$. From (IV) it follows that there exist denumerable linear bases. If
there exists a countable or a finite linear basis G then all finite sums $\sum_\nu \alpha_\nu x_\nu$
where $x_\nu \in G$ and the $\alpha_\nu$ are rational complex numbers form a denumerable
subset which is dense in $\mathcal{H}$, that is, (IV) is satisfied.
The smallest cardinality of a linear basis of $\mathcal{H}$ is called the linear
dimension of $\mathcal{H}$ which, according to (IV), can be only either finite or
denumerable.
We shall now give two important examples of Hilbert spaces (to prove that
these examples satisfy axioms (I)-(IV) see [34]).
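Let $\mathcal{H}$ be the set of all complex number sequences $x = (\alpha_1, \alpha_2, \ldots)$ for
which $\sum_\nu |\alpha_\nu|^2 < \infty$. For $x = (\alpha_1, \alpha_2, \ldots)$, $y = (\beta_1, \beta_2, \ldots)$, we define $x + y =
(\alpha_1 + \beta_1, \alpha_2 + \beta_2, \ldots)$, $\alpha x = (\alpha\alpha_1, \alpha\alpha_2, \ldots)$ and $\langle x, y \rangle = \sum_\nu \bar{\alpha}_\nu \beta_\nu$, where the
convergence of $\sum_\nu \bar{\alpha}_\nu \beta_\nu$ is easily proven with the aid of the Schwarz
inequality.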
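For the second example we consider a σ-ring $\mathcal{A}$ of subsets of a set M, that
is, $\mathcal{A}$ is a Boolean ring with respect to the intersection, the union, and
complements, and the union of countably many elements of $\mathcal{A}$ is again an
element of $\mathcal{A}$ (for such an example consider the σ-ring in IV, §2.5). Suppose
$M \in \mathcal{A}$.
On $\mathcal{A}$ let a σ-additive real measure $\mu$ be defined for which $\mu(\eta) \geq 0$, and
where we permit $\mu(\eta) = \infty$ for some $\eta$. Let $\mu(\emptyset) = 0$. For a sequence $\eta_i \in \mathcal{A}$
satisfying $\eta_i \cap \eta_j = \emptyset$ for $i \neq j$ we therefore obtain
$\mu\left(\bigcup_i \eta_i\right) = \sum_i \mu(\eta_i)$.
Thus it follows that $\mu(\eta) \geq \mu(\sigma)$ for $\eta \supset \sigma$ since $\eta = \sigma \cup (\eta \cap \sigma^*)$ where $\sigma^*$ is
the complement of $\sigma$.
If $\mu(M) = +\infty$ then there may exist a sequence $\eta_i$ such that $\eta_{i+1} \supset \eta_i$, $\mu(\eta_i)$
is finite and $M = \bigcup_i \eta_i$. In addition this sequence may be chosen such that
every $\mathcal{A}_i = \{\sigma \mid \sigma \in \mathcal{A}, \sigma \subset \eta_i\}$ is separable with respect to the metric (this
metric is described in IV, §1.4)
$d(\sigma_1, \sigma_2) = \mu(\sigma_1 + \sigma_2) = \mu(\sigma_1 \cap \sigma_2^*) + \mu(\sigma_2 \cap \sigma_1^*)$.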
A complex function $f: M \to \mathbf{C}$ is said to be quadratically integrable if it is
measurable and
$\int_M |f(x)|^2 \, d\mu(x) < \infty$.
Two functions $f_1, f_2$ are said to be equivalent if the set $\{x \mid f_1(x) \neq f_2(x)\}$ is of
$\mu$-measure zero, or, equivalently
$\int_M |f_1(x) - f_2(x)|^2 \, d\mu(x) = 0$.
We define the Hilbert space $\mathcal{H}$ as the set of all classes of equivalent functions.
Since $f(x) = f_1(x) + f_2(x)$, $\alpha f(x)$, and
$\int_M \overline{f_1(x)} f_2(x) \, d\mu(x)$   (1.2)
depend only on the classes, the operations $f_1(x) + f_2(x)$, $\alpha f(x)$ make $\mathcal{H}$ into a
complex vector space, and an inner product $\langle f_1, f_2 \rangle$ is defined by (1.2).
It can be proven (see [34]) that, under the above assumptions about
$(M, \mathcal{A}, \mu)$, $\mathcal{H}$ is a separable Hilbert space. We shall denote this Hilbert space
by $\mathcal{L}^2(M, d\mu)$.
An example for $(M, \mathcal{A}, \mu)$ is obtained by choosing M to be the set R of
real numbers, $\mathcal{A}$ the set of Lebesgue measurable sets and $\mu$ as Lebesgue
measure. For the case of Lebesgue measure it is customary to replace $d\mu(x)$ in
the above integral by the simpler notation dx.
2 Orthogonal Systems and Closed Subspaces
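A sequence of vectors $x_\nu$ for which $\langle x_\nu, x_\mu \rangle = \delta_{\nu\mu}$ (= 1 for $\nu = \mu$, 0 for $\nu \neq \mu$)
is called a normed orthogonal (orthonormal) system. The elements of an
orthonormal system are linearly independent because if $\sum_{\nu=1}^{N} \lambda_\nu x_\nu = 0$, then by
taking an inner product with $x_\mu$ we obtain $\lambda_\mu = \langle x_\mu, \sum_{\nu=1}^{N} \lambda_\nu x_\nu \rangle = 0$. For
each $x \in \mathcal{H}$ a vector p is defined by $x = \sum_{\nu=1}^{N} \langle x_\nu, x \rangle x_\nu + p$ which satisfies
$\langle x_\nu, p \rangle = 0$ for all $\nu \leq N$. Thus it follows that $\|x\|^2 =
\sum_{\nu=1}^{N} |\langle x_\nu, x \rangle|^2 + \|p\|^2$ and we obtain Bessel's inequality $\sum_{\nu=1}^{N} |\langle x_\nu, x \rangle|^2 \leq
\|x\|^2$ and we therefore obtain $\sum_{\nu=1}^{\infty} |\langle x_\nu, x \rangle|^2 < \infty$ from which we conclude
that $\langle x_\nu, x \rangle \to 0$. Since $\|\sum_{\nu=n}^{m} x_\nu \langle x_\nu, x \rangle\|^2 = \sum_{\nu=n}^{m} |\langle x_\nu, x \rangle|^2$ the sum
$\sum_{\nu=1}^{\infty} x_\nu \langle x_\nu, x \rangle$ converges in norm, and we therefore obtain
$\sum_{\nu=1}^{\infty} |\langle x_\nu, x \rangle|^2 \leq \|x\|^2$.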
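We shall now show that the expression $\|x - \sum_{\nu=1}^{N} \alpha_\nu x_\nu\|$ takes on its
minimum value when $\alpha_\nu = \langle x_\nu, x \rangle$ because, for p defined above and
$q = x - \sum_{\nu=1}^{N} \alpha_\nu x_\nu$ it follows that $q = \sum_{\nu=1}^{N} (\langle x_\nu, x \rangle - \alpha_\nu)x_\nu + p$ and we
therefore obtain
$\|q\|^2 = \sum_{\nu=1}^{N} |\langle x_\nu, x \rangle - \alpha_\nu|^2 + \|p\|^2$.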
An orthonormal system is said to be an orthonormal basis if it is a basis for
$\mathcal{H}$. For an orthonormal basis we therefore obtain
$x = \sum_\nu x_\nu \langle x_\nu, x \rangle$
since the $\sum_\nu \alpha_\nu x_\nu$ are dense in $\mathcal{H}$ and, according to a previous result,
$\|p\| \leq \|q\|$.
The cardinality of an orthonormal basis is equal to the dimension of $\mathcal{H}$.
Since an orthonormal basis is a linear basis, it immediately follows that its
cardinality cannot be less than the dimension of $\mathcal{H}$.
Next we show that the cardinality of an orthonormal basis cannot be
greater than denumerable. According to (IV) there exists a denumerable set
$\{y_\nu\}$ which is dense in $\mathcal{H}$. Therefore, to each x of an orthonormal basis there
exists a $y_{\nu(x)}$ such that $\|y_{\nu(x)} - x\| < \tfrac{1}{2}\sqrt{2}$. To each pair of different $x_1, x_2$ of an
orthonormal basis $y_{\nu(x_1)}$ and $y_{\nu(x_2)}$ must be different because if $y_{\nu(x_1)} = y_{\nu(x_2)}$
then it follows that
$\|x_1 - x_2\| \leq \|x_1 - y_{\nu(x_1)}\| + \|x_2 - y_{\nu(x_2)}\| < \sqrt{2}$
in contradiction to $\|x_1 - x_2\|^2 = \|x_1\|^2 + \|x_2\|^2 = 2$. Therefore every orthonormal
basis is at most countable. Thus, in the case in which the
dimension of $\mathcal{H}$ is denumerable, the theorem is proven.
Let the dimension of $\mathcal{H}$ equal n (finite), then a basis consists of finitely
many $y_1, \ldots, y_n$. Thus it follows that each $x \in \mathcal{H}$ can be written in the form
$x = \sum_{\nu=1}^{n} \alpha_\nu y_\nu$ and thus there cannot be more than n linearly independent
vectors in $\mathcal{H}$. Thus an orthonormal basis can have at most n elements.
We will now show that if M is an arbitrary denumerable subset of $\mathcal{H}$ then
it is possible to construct an orthonormal set of vectors which has the same
linear span as does M. For $y_\nu \in M$ set $x_1 = y_1/\|y_1\|$, $x_2 = p_2/\|p_2\|$ where
$p_2 = y_2 - \langle x_1, y_2 \rangle x_1$ providing that $p_2 \neq 0$ (if $p_2 = 0$ then $y_2$ and $x_1$ are
linearly dependent, and we can simply eliminate $y_2$ from M and renumber
the elements $y_n$). We will now assume that M is a linearly independent set.
Recursively, we may set
$x_n = p_n/\|p_n\|$ where $p_n = y_n - \sum_{\nu=1}^{n-1} \langle x_\nu, y_n \rangle x_\nu$;
this procedure is known as the Schmidt orthogonalization procedure. It is
easy to verify that the set of $x_\nu$ form an orthonormal system which has the same
linear span as does M.
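The recursion can be written out directly for vectors in $\mathbf{C}^n$. The following Python sketch is a minimal illustration, assuming the inner product $\sum_\nu \bar{x}_\nu y_\nu$; the function name `schmidt` and the tolerance `tol`, which stands in for the exact test $p_n \neq 0$, are assumptions of the example. Linearly dependent vectors are dropped, as described for $y_2$ above.

```python
def inner(x, y):
    """<x, y> = sum of conj(x_v) * y_v, linear in the second argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(x):
    return inner(x, x).real ** 0.5

def schmidt(vectors, tol=1e-12):
    """Orthonormalize `vectors`: p_n = y_n - sum_v <x_v, y_n> x_v and
    x_n = p_n / ||p_n||; vectors with ||p_n|| <= tol are eliminated."""
    basis = []
    for y in vectors:
        p = list(y)
        for x in basis:
            c = inner(x, y)                        # coefficient <x_v, y_n>
            p = [pi - c * xi for pi, xi in zip(p, x)]
        n = norm(p)
        if n > tol:                                # y is independent of the earlier vectors
            basis.append([pi / n for pi in p])
    return basis

# Hypothetical input: the third vector is the sum of the first two.
ys = [[1 + 0j, 1j, 0j], [1 + 0j, 0j, 1 + 0j], [2 + 0j, 1j, 1 + 0j]]
xs = schmidt(ys)
print(len(xs))
print([[round(abs(inner(a, b)), 12) for b in xs] for a in xs])
```

The printed matrix of absolute values of inner products is the identity, which is the statement $\langle x_\nu, x_\mu \rangle = \delta_{\nu\mu}$.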
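$\mathcal{T}$ is called a closed subspace if $\mathcal{T}$ is a subspace and is closed in norm. It
follows that $\mathcal{T}$ is complete. If $\mathcal{T}$ is a subspace, then the closure of $\mathcal{T}$ in $\mathcal{H}$ is a
closed subspace. It follows directly that the intersection of arbitrarily many
closed subspaces of $\mathcal{H}$ is a closed subspace. According to AI, Th. 4.1 the
closed subspaces of $\mathcal{H}$ form a complete lattice where $\mathcal{T}_1 \wedge \mathcal{T}_2 = \mathcal{T}_1 \cap \mathcal{T}_2$
and $\mathcal{T}_1 \vee \mathcal{T}_2$ is equal to the intersection of all closed subspaces $\mathcal{T}$ for which
$\mathcal{T} \supset \mathcal{T}_1$ and $\mathcal{T} \supset \mathcal{T}_2$.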
If M is a subset of $\mathcal{H}$ let (M) denote the subspace generated by M and [M]
be the closed subspace generated by M. Therefore we find that [M] is the
intersection of all closed subspaces $\mathcal{T}$ for which $\mathcal{T} \supset M$ and [M] is therefore
equal to the closure of (M).
If p is orthogonal to all elements of M, it directly follows that it is
orthogonal to all elements of (M), and from the continuity of the inner
product, is also orthogonal to all elements in the closure of (M), that is, of
[M]. In this way it follows that all elements p which are orthogonal to M
form a closed subspace which we denote by $M^\perp$. Therefore we find that
$M^\perp = (M)^\perp = [M]^\perp$. In addition it follows that $[M] \subset (M^\perp)^\perp$. Later we
shall find that $[M] = (M^\perp)^\perp$.
First we shall show that if $\mathcal{T}$ is a closed subspace, then each $x \in \mathcal{H}$ may be
uniquely represented in the form $x = q + p$ where $q \in \mathcal{T}$ and $p \in \mathcal{T}^\perp$. Since
the uniqueness is trivial, we need only demonstrate the representation. For
$x \in \mathcal{T}$ we obtain $q = x$ and $p = 0$. We now consider the case in which $x \notin \mathcal{T}$.
Then $\inf_{y \in \mathcal{T}} \|x - y\| = \rho \neq 0$. Therefore there exists a sequence $y_\nu \in \mathcal{T}$ for
which $\|x - y_\nu\| \to \rho$. We use the identity
$$\|y_\nu - y_\mu\|^2 = 2\|y_\nu - x\|^2 + 2\|y_\mu - x\|^2 - \|y_\nu + y_\mu - 2x\|^2.$$
Since $\tfrac{1}{2}(y_\nu + y_\mu) \in \mathcal{T}$ we obtain $\|\tfrac{1}{2}(y_\nu + y_\mu) - x\| \ge \rho$,
from which we conclude that
$$\|y_\nu - y_\mu\|^2 \le 2\|y_\nu - x\|^2 + 2\|y_\mu - x\|^2 - 4\rho^2.$$
From this it follows that the $y_\nu$ form a Cauchy sequence and that $y_\nu \to q \in \mathcal{T}$,
from which it follows that $\|x - y_\nu\| \to \|x - q\|$ and finally $\|x - q\| = \rho$. For
$p = x - q$ it is only necessary to show that $p \in \mathcal{T}^\perp$, that is, $\langle h, p\rangle = 0$ for all
$h \in \mathcal{T}$.
For $p = h\langle h, p\rangle/\|h\|^2 + r$ we find that $\|r\| \le \|p\| = \rho$. For $p = x - q$ it
follows that $r = x - (q + h\langle h, p\rangle/\|h\|^2)$. Since $q + h\langle h, p\rangle/\|h\|^2 \in \mathcal{T}$ we
must have $\|r\| \ge \rho$. Therefore $\|r\| = \rho = \|p\|$ and since $\|p\|^2 = \|r\|^2 +
|\langle h, p\rangle|^2/\|h\|^2$, we finally obtain $\langle h, p\rangle = 0$.
If $x \in (M^\perp)^\perp = ([M]^\perp)^\perp$, then from $x = p + q$, $q \in [M]$ and $p \in [M]^\perp$ it
follows that $\langle x, p\rangle = 0$, so that $\langle x, p\rangle = \langle q + p, p\rangle = \|p\|^2 = 0$ from which
it follows that $x \in [M]$, and we have shown that $[M] = (M^\perp)^\perp$.
Let $G$ be a basis in $\mathcal{H}$ and let $\mathcal{T}$ be a closed subspace in $\mathcal{H}$; then each
element $x$ in $G$ can be written in the form $x = p + q$ where $q \in \mathcal{T}$ and
$p \in \mathcal{T}^\perp$. Let the set of $q$ obtained in this way be denoted by $G_{\mathcal{T}}$; similarly let
the set of $p$ be denoted by $G_{\mathcal{T}^\perp}$. It easily follows that $G_{\mathcal{T}}$ is a basis for $\mathcal{T}$ and
$G_{\mathcal{T}^\perp}$ is a basis for $\mathcal{T}^\perp$, and that $G_{\mathcal{T}} \cup G_{\mathcal{T}^\perp}$ is a basis for $\mathcal{H}$. With the help of
the Schmidt orthogonalization procedure it is easy to show that it is possible
to select an orthonormal basis, the elements $x_\nu$ of which are either
elements of $\mathcal{T}$ or of $\mathcal{T}^\perp$.
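We say that $\mathcal{T}_1$ and $\mathcal{T}_2$ are orthogonal (written $\mathcal{T}_1 \perp \mathcal{T}_2$) if
$\mathcal{T}_1 \subset \mathcal{T}_2^\perp$ and therefore $\mathcal{T}_2 \subset \mathcal{T}_1^\perp$ holds. If $\mathcal{T}_1 \perp \mathcal{T}_2$ then the set of all
$x + y$, where $x \in \mathcal{T}_1$ and $y \in \mathcal{T}_2$, is a closed subspace. Here we need only
prove that the subspace is closed.
From
$$\|x_n + y_n - (x_m + y_m)\|^2 = \|x_n - x_m\|^2 + \|y_n - y_m\|^2$$
it follows directly that the $x_n$ and the $y_n$ form Cauchy sequences if the $x_n + y_n$
form a Cauchy sequence.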
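For $\mathcal{T}_1 \perp \mathcal{T}_2$ we therefore obtain $\mathcal{T}_1 \vee \mathcal{T}_2 = \mathcal{T}_1 + \mathcal{T}_2$. If $\mathcal{T}_1 \perp \mathcal{T}_2$ we
then write $\mathcal{T}_1 \oplus \mathcal{T}_2$ instead of $\mathcal{T}_1 + \mathcal{T}_2$. From AI, D 1.2 it follows that the
operation $\perp$ in the lattice of closed subspaces of $\mathcal{H}$ is an
orthocomplementation.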
3 The Banach Space of Bounded Operators
A linear map $A: \mathcal{H}_1 \to \mathcal{H}_2$ is called a linear operator, or more simply an
operator, and satisfies $A(\alpha x) = \alpha Ax$ and $A(x_1 + x_2) = Ax_1 + Ax_2$. $A$ is
continuous if and only if there exists a number $C$ for which $\|Ax\| \le C\|x\|$. A
continuous operator is therefore also called a bounded operator (see also
AIII, §5).
The values $\langle x, Ax\rangle$ for all $x \in \mathcal{H}$ uniquely determine an operator $A$. This
fact is a direct consequence of the following simple identity:
$$4\langle x, Ay\rangle = \langle x + y, A(x + y)\rangle - \langle x - y, A(x - y)\rangle
- i\langle x + iy, A(x + iy)\rangle + i\langle x - iy, A(x - iy)\rangle.$$
$A$ is uniquely determined by the matrix $a_{\nu\mu} = \langle x_\nu, Ax_\mu\rangle$ with respect to a
complete orthonormal basis $\{x_\nu\}$ since $Ax_\mu = \sum_\nu x_\nu\langle x_\nu, Ax_\mu\rangle$.
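As a numerical illustration of the identity, the following sketch recovers an off-diagonal value $\langle x, Ay\rangle$ from the four diagonal values alone. It assumes a finite-dimensional space with matrices stored as nested lists; the helper names (`inner`, `apply`, `quad`, `polarized`) and the sample matrix are hypothetical, and the inner product is taken conjugate-linear in the first argument, as in the text.

```python
def inner(x, y):
    """Inner product <x, y>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def quad(A, z):
    """Diagonal value <z, A z>."""
    return inner(z, apply(A, z))

def polarized(A, x, y):
    """Recover <x, A y> using only values of the form <z, A z>."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    plus_i = [a + 1j * b for a, b in zip(x, y)]
    minus_i = [a - 1j * b for a, b in zip(x, y)]
    return (quad(A, plus) - quad(A, minus)
            - 1j * quad(A, plus_i) + 1j * quad(A, minus_i)) / 4

A = [[1 + 2j, 3], [-1j, 4 - 1j]]   # arbitrary, not self-adjoint
x = [1 + 1j, 2]
y = [0.5, -1j]
direct = inner(x, apply(A, y))
recovered = polarized(A, x, y)
```

The identity uses complex scalars in an essential way; over a real space the diagonal values determine only the symmetric part of $A$.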
An operator $A = A_1 + A_2$ is defined by $Ax = A_1x + A_2x$ and is bounded
if both $A_1$ and $A_2$ are bounded. We find that the operator $A = A_1A_2$, defined by
$Ax = A_1(A_2x)$, is bounded if $A_1$ and $A_2$ are bounded. For $\alpha \in \mathbb{C}$, $\alpha A$ is defined
by $(\alpha A)x = \alpha(Ax)$. Let $\mathcal{L}(\mathcal{H})$ denote the set of bounded operators of $\mathcal{H}$.
$\mathcal{L}(\mathcal{H})$ is therefore a vector space over $\mathbb{C}$ and is also an algebra. The unit
element is the operator $1x = x$, the null element is the operator $0x = 0$.
It is easy to see that $\|A\| = \sup_{\|x\| \le 1} \|Ax\|$ defines a norm in $\mathcal{L}(\mathcal{H})$. The
fact that $\mathcal{L}(\mathcal{H})$ is a Banach space follows directly from general theorems;
however, we shall show that this is the case below.
In addition to the norm-topology in $\mathcal{L}(\mathcal{H})$ we may also introduce the
pointwise topology as follows: We say that a sequence $A_n \in \mathcal{L}(\mathcal{H})$ converges
pointwise if, for each $x \in \mathcal{H}$, the $A_nx$ form a Cauchy sequence. From
$\|A_nx - A_mx\| \le \|A_n - A_m\| \|x\|$ it directly follows that every Cauchy sequence
in the norm topology also converges pointwise.
If $A_n$ is a pointwise convergent sequence, then a linear operator is defined
by $A_nx \to y$ where $y = Ax$. $A$ is then also bounded, that is, $A \in \mathcal{L}(\mathcal{H})$.
PROOF. Now we shall show that the $A_nx$ are uniformly bounded, that is, there exists
a $D$ such that $\|A_nx\| \le D\|x\|$ for all $n$. For this purpose it is sufficient to show that
there exists a $y$ and a sphere $K_\delta(y) = \{x \mid \|x - y\| \le \delta\}$, $\delta > 0$, such that $\|A_nx'\| \le a$
for all $x' \in K_\delta(y)$ and all $n$, because for arbitrary $x \neq 0$
$$\|A_nx\| = \frac{\|x\|}{\delta}\left\|A_n\!\left(y + \delta\frac{x}{\|x\|}\right) - A_ny\right\| \le \frac{2a}{\delta}\|x\|.$$
If such a sphere $K_\delta(y)$ does not exist, then we may stepwise construct the following
sequence: first, find a $y_1$ and an $n_1$ such that $\|A_{n_1}y_1\| > 2$. From the continuity of
$A_{n_1}$ we can find a sphere $K_{\rho_1}(y_1)$ for which $\rho_1 < 1$ and $\|A_{n_1}x\| > 1$ for $x \in K_{\rho_1}(y_1)$.
Then we may find in the interior of $K_{\rho_1}(y_1)$ a $y_2$ and $n_2$ which satisfy $\|A_{n_2}y_2\| > 3$
together with a sphere $K_{\rho_2}(y_2) \subset K_{\rho_1}(y_1)$ where $\rho_2 < \tfrac{1}{2}$ and $\|A_{n_2}x\| > 2$ for
$x \in K_{\rho_2}(y_2)$. In this way we obtain a sequence of spheres $K_{\rho_\nu}(y_\nu) \subset K_{\rho_{\nu-1}}(y_{\nu-1})$ for
which $\rho_\nu < 1/\nu$ and $\|A_{n_\nu}x\| > \nu$ for $x \in K_{\rho_\nu}(y_\nu)$. Thus for $\mu > \nu$ we obtain
$\|y_\mu - y_\nu\| < 2\rho_\nu$ and therefore there exists a $y$ for which $y_\nu \to y$ and $y \in K_{\rho_\nu}(y_\nu)$ for
all $\nu$. Here $\|A_{n_\nu}y\| > \nu$ in contradiction to the fact that $A_ny$ is convergent.
From $\|A_nx\| \le D\|x\|$ it follows that, in the limit, $\|Ax\| \le D\|x\|$. In the same way we
may prove that, for a sequence of bounded linear forms $l_n: \mathcal{H} \to \mathbb{C}$ which converges
pointwise, $|l_n(x)| \le D\|x\|$ holds for all $n$, and hence that a bounded linear form $l$ is defined by
$l_n(x) \to l(x)$.
For a pointwise convergent sequence $A_nx \to Ax$ we write $A_n \to A$; for
norm-convergence, $\|A_n - A\| \to 0$, we write $A_n \Rightarrow A$. If $A_n$ is a norm-Cauchy sequence,
then, for all $x$, $A_nx$ is a Cauchy sequence. Therefore, there exists an $A$ for
which $A_n \to A$. We now show that $A_n \Rightarrow A$ as follows:
Let $A'_n = A - A_n$; $A'_n$ is also a norm-Cauchy sequence. If we choose $N$
such that, for $n, m > N$, the relationship $\|A'_n - A'_m\| < \varepsilon$ holds, then for
$\|x\| \le 1$ we obtain:
$$\|A'_nx\| \le \|(A'_n - A'_m)x\| + \|A'_mx\| \le \|A'_n - A'_m\| + \|A'_mx\| < \varepsilon + \|A'_mx\|.$$
For fixed $x$ we obtain $\|A'_mx\| \to 0$ as $m \to \infty$, and therefore it follows that $\|A'_nx\| \le \varepsilon$ for
$n > N$ and arbitrary $x$ with $\|x\| \le 1$. Therefore $\|A'_n\| = \sup_{\|x\| \le 1}\|A'_nx\| \le \varepsilon$, that is,
$\|A'_n\| = \|A - A_n\| \to 0$.
We have shown that $\mathcal{L}(\mathcal{H})$ is a Banach space. It is easy to show that
$\|AB\| \le \|A\| \|B\|$. If this relation holds in an algebra which is also a normed
space, then it is called a normed algebra; if it is also complete, it is called a
Banach algebra.
From $A_n \Rightarrow A$, $B_m \Rightarrow B$ (norm-convergence) it easily follows that $A_nB_m \Rightarrow AB$. From
$A_n \to A$, $B_m \to B$ (pointwise convergence) it follows that $A_nB_m \to AB$, which follows directly from
$$\|A_nB_mx - ABx\| \le \|A_n(B_m - B)x\| + \|(A_n - A)Bx\|
\le D\|(B_m - B)x\| + \|(A_n - A)Bx\| \to 0.$$
4 Bounded Linear Forms
As in the case of a Banach space (see AIII, §3) we may also investigate
bounded linear forms for $\mathcal{H}$. We shall now show that if $l: \mathcal{H} \to \mathbb{C}$ is a bounded
linear form, then there exists a $y \in \mathcal{H}$ for which $l(x) = \langle y, x\rangle$.
It is easy to see that the set $\mathcal{T}_0 = \{h \mid l(h) = 0\}$ is a closed subspace of $\mathcal{H}$. If
$\mathcal{T}_0 = \mathcal{H}$ it follows that $l(x) = 0 = \langle 0, x\rangle$. If $\mathcal{T}_0 \neq \mathcal{H}$, then there exists,
according to §2, a $y \neq 0$ which is orthogonal to $\mathcal{T}_0$; therefore $l(y) \neq 0$. For
$p = x - (l(x)/l(y))y$ it follows that $l(p) = 0$, that is, $p \in \mathcal{T}_0$ and therefore
$\langle p, y\rangle = 0$. From $x = (l(x)/l(y))y + p$ it follows that, by taking the inner
product with $h = (\overline{l(y)}/\|y\|^2)y$, we obtain $\langle h, x\rangle = l(x)$. From the Schwarz
inequality it easily follows that $\langle h, x\rangle$ is a bounded linear form for all $h \in \mathcal{H}$.
Since $\langle h_1, x\rangle = \langle h_2, x\rangle$ for all $x$ implies that $h_1 = h_2$, it follows that the map
$l \to h$ with $l(x) = \langle h, x\rangle$ is a bijective map $\mathcal{H}' \to \mathcal{H}$ where $\mathcal{H}'$ is, in the sense
of AIII, §3, the dual Banach space to $\mathcal{H}$. For this correspondence we obtain
$l_1 + l_2 \to h_1 + h_2$ and $\alpha l \to \bar{\alpha}h$. From $\sup_{\|x\| \le 1} |\langle h, x\rangle| = \|h\|$ it follows that
the norm defined on $\mathcal{H}'$ corresponds, according to this bijective mapping, to
the norm in $\mathcal{H}$.
A convergent sequence in $\mathcal{H}'$ (in the sense of the topology $\sigma(\mathcal{H}', \mathcal{H})$ in
AIII, §4) corresponds to a sequence $y_n$ in $\mathcal{H}$ for which $\langle y_n, x\rangle$ converges
pointwise. Such a sequence is called a weakly convergent sequence.
According to §3, to each pointwise convergent sequence $l_n$ of linear forms
there exists a bounded linear form $l$ which satisfies $l_n(x) \to l(x)$ for all $x$ as a
limit and there exists a $C$ such that $|l_n(x)| \le C\|x\|$. Therefore, for each
weakly convergent sequence $y_n$ in $\mathcal{H}$ there exists a $C$ such that $\|y_n\| \le C$ and
a limit element $y$ towards which the $y_n$ converge (we denote the weak
convergence of $y_n$ by $y_n \rightharpoonup y$). $\mathcal{H}$ is therefore sequence-complete.
In §2 we saw that the relation $\langle x_\nu, x\rangle \to 0$ is satisfied for the elements $x_\nu$ of
an orthonormal basis. If a sequence $x_\nu$ satisfies the relations $\langle x_\nu, x_\mu\rangle = 0$ for
$\mu \neq \nu$ and there exists a $C$ such that $\|x_\nu\| \le C$ for all $\nu$, then it follows that
$x_\nu \rightharpoonup 0$.
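From the general theorems in AIII, §4 it follows that every bounded set in
$\mathcal{H}$ is weakly relatively compact. This result can also be easily shown directly: If
$M$ is a bounded set (that is, $\|y\| \le C$ for all $y \in M$), then the set of $\langle y, x\rangle$ for
fixed $x$ is bounded for $y \in M$ since $|\langle y, x\rangle| \le \|y\| \|x\| \le C\|x\|$. Let $\mathcal{N}$ be a
denumerable dense subset of $\mathcal{H}$ and let $x_\nu$ ($\nu = 1, 2, \ldots$) denote the elements
of $\mathcal{N}$. In the usual way we may choose a sequence $y_\mu^{(1)} \in M$ for which
$\langle y_\mu^{(1)}, x_1\rangle$ converges; from the sequence of the $y_\mu^{(1)}$ we may choose a
subsequence $y_\mu^{(2)}$ for which $\langle y_\mu^{(2)}, x_2\rangle$ converges, etc. For the diagonal
sequence $y_\mu^{(\mu)}$, $\langle y_\mu^{(\mu)}, x_\nu\rangle$ converges for every fixed $x_\nu \in \mathcal{N}$. We will show
that this situation holds for all $x \in \mathcal{H}$. This follows from
$$|\langle y_\mu^{(\mu)} - y_\rho^{(\rho)}, x\rangle| \le |\langle y_\mu^{(\mu)} - y_\rho^{(\rho)}, x_\nu\rangle| + |\langle y_\mu^{(\mu)} - y_\rho^{(\rho)}, x - x_\nu\rangle|
\le |\langle y_\mu^{(\mu)} - y_\rho^{(\rho)}, x_\nu\rangle| + 2C\|x - x_\nu\|,$$
if we first choose $x_\nu$ such that $2C\|x - x_\nu\| < \varepsilon$ and then, for fixed $x_\nu$, choose
$N$ such that for $\mu, \rho > N$ we obtain $|\langle y_\mu^{(\mu)} - y_\rho^{(\rho)}, x_\nu\rangle| < \varepsilon$.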
If $A$ is a bounded operator, then $\langle y, Ax\rangle$ is, for fixed $y$, a bounded linear
form over $x$; therefore there exists a $y' \in \mathcal{H}$ for which $\langle y, Ax\rangle = \langle y', x\rangle$.
Since $y'$ is uniquely determined, $y \to y'$ defines a map $\mathcal{H} \to \mathcal{H}$; it is easy to
verify the fact that the map is linear. We denote the operator defined by this
map by $A^+$. Thus $A^+$ is determined by $\langle y, Ax\rangle = \langle A^+y, x\rangle$.
Since
$$\sup_{\|y\| \le 1} \|A^+y\| = \sup_{\|y\| \le 1}\, \sup_{\|x\| \le 1} |\langle A^+y, x\rangle|$$
it easily follows that $A^+$ is bounded, and that $\|A^+\| = \|A\|$. It is also easy to
show that $(\alpha A)^+ = \bar{\alpha}A^+$, $(A + B)^+ = A^+ + B^+$, $(AB)^+ = B^+A^+$ and
$(A^+)^+ = A$.
We call $A^+$ the adjoint operator corresponding to $A$. An operator $A$ is said
to be self-adjoint (or Hermitian) if $A^+ = A$.
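In a finite-dimensional space with an orthonormal basis, the adjoint is represented by the conjugate transpose of the matrix. The following sketch checks the defining relation $\langle y, Ax\rangle = \langle A^+y, x\rangle$ and the rule $(AB)^+ = B^+A^+$ on sample data; the helper names and the sample matrices are hypothetical.

```python
def inner(x, y):
    """Inner product <x, y>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def apply(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose: entry (j, i) is the conjugate of A[i][j]."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

A = [[1 + 1j, 2], [0, 3 - 2j]]
B = [[0, 1j], [1, 1]]
x = [1, 2 - 1j]
y = [1j, 1]

lhs = inner(y, apply(A, x))            # <y, A x>
rhs = inner(apply(adjoint(A), y), x)   # <A+ y, x>
left = adjoint(matmul(A, B))           # (AB)+
right = matmul(adjoint(B), adjoint(A)) # B+ A+
```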
An operator $A$ is said to be compact (or completely continuous) if $A$ maps
the unit sphere (and therefore every bounded set) onto a relatively compact
set (with respect to the norm). We shall now show that the above condition is
equivalent to the condition that for every sequence $y_\nu \rightharpoonup 0$ the relation
$Ay_\nu \to 0$ is satisfied, that is, $\|Ay_\nu\| \to 0$.
If $A$ is compact, then for each denumerable set $\{y_\nu\}$ for which $\|y_\nu\| \le 1$ the
set $Ay_\nu$ has an accumulation point in the norm-topology. Since, for a weakly
convergent sequence $y_\nu \rightharpoonup 0$, $\langle y_\nu, x\rangle$ is uniformly bounded, there exists a
number $C$ such that $\|y_\nu\| \le C$. If we consider the sequence $y_\nu C^{-1}$ instead of
$y_\nu$, we may then assume that the $y_\nu$ are elements of the unit sphere, and that
the $Ay_\nu$ must therefore have an accumulation point $y$ in the norm topology.
Therefore there exists a subsequence $y_{\nu_i}$ such that $Ay_{\nu_i} \to y$. Thus, for
arbitrary $x \in \mathcal{H}$ we obtain $\langle Ay_{\nu_i}, x\rangle \to \langle y, x\rangle$, and since $y_{\nu_i} \rightharpoonup 0$ we also
obtain $\langle Ay_{\nu_i}, x\rangle = \langle y_{\nu_i}, A^+x\rangle \to 0$; therefore $y = 0$. Since $0$ is thus the only
possible accumulation point, the sequence $Ay_\nu$ must converge: $Ay_\nu \to 0$.
Conversely, we assume that for $y_\nu \rightharpoonup 0$ it follows that $Ay_\nu \to 0$. Let $z_\nu$ be
an arbitrary denumerable subset of the unit sphere. $A$ will be compact if the
$Az_\nu$ have an accumulation point in the norm topology. Since the unit sphere
is weakly compact, there exists a subsequence $z_{\nu_i}$ which is weakly convergent:
$z_{\nu_i} \rightharpoonup z$, where $\|z\| \le 1$; for $y_{\nu_i} = \tfrac{1}{2}(z_{\nu_i} - z)$ we obtain $\|y_{\nu_i}\| \le 1$ and $y_{\nu_i} \rightharpoonup 0$,
from which we obtain $Ay_{\nu_i} \to 0$, that is, $Az_{\nu_i} \to Az$. The $Az_\nu$ therefore have
an accumulation point (in norm).
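Let $\mathcal{L}_c(\mathcal{H})$ denote the set of all compact operators in $\mathcal{L}(\mathcal{H})$. It is easy to
verify that $\mathcal{L}_c(\mathcal{H})$ is a linear subspace of $\mathcal{L}(\mathcal{H})$. $\mathcal{L}_c(\mathcal{H})$ is, moreover, closed
with respect to the norm topology and is itself a Banach space.
PROOF. Let $A_n \in \mathcal{L}_c(\mathcal{H})$ and let $A_n \Rightarrow A$ (norm-convergence). Let $y_\nu \rightharpoonup 0$; then $\|A_ny_\nu\| \to 0$ for each $n$. From
$$\|Ay_\nu\| \le \|(A - A_n)y_\nu\| + \|A_ny_\nu\| \le \|A - A_n\| \|y_\nu\| + \|A_ny_\nu\|$$
it follows that, since $\|y_\nu\| \le C$ for a suitable value of $C$,
$$\|Ay_\nu\| \le \|A - A_n\|C + \|A_ny_\nu\|$$
from which we conclude that $\|Ay_\nu\| \to 0$.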
5 The Banach Space $\mathcal{L}_r(\mathcal{H})$
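The set of self-adjoint bounded operators is evidently a linear vector space
over the field of real numbers $\mathbb{R}$. We may also consider $\mathcal{L}(\mathcal{H})$ to be a vector
space over $\mathbb{R}$, and the set of self-adjoint operators forms a subspace of $\mathcal{L}(\mathcal{H})$
which we shall denote by $\mathcal{L}_r(\mathcal{H})$. If $\mathcal{L}_r(\mathcal{H})$ is norm-closed in $\mathcal{L}(\mathcal{H})$ then
$\mathcal{L}_r(\mathcal{H})$ will be a Banach space. This will be the case if the limit (in norm) of a
sequence of self-adjoint operators is a self-adjoint operator.
Since $\|A^+\| = \|A\|$, it follows from $A_n \Rightarrow A$ (norm-convergence) that $A_n^+ \Rightarrow A^+$ (this
does not necessarily follow from pointwise convergence $A_n \to A$). $A_n^+ = A_n$ then implies $A^+ = A$.
Therefore $\mathcal{L}_r(\mathcal{H})$ is a Banach space. $\mathcal{L}_r(\mathcal{H})$ is, however, also closed with respect to
pointwise convergence, because from $A_n \to A$ and from $\langle x, A_ny\rangle = \langle A_nx, y\rangle$
it follows that $\langle x, Ay\rangle = \langle Ax, y\rangle$. Since $\mathcal{L}_c(\mathcal{H})$ is a Banach subspace of
$\mathcal{L}(\mathcal{H})$, $\mathcal{L}_{cr}(\mathcal{H}) = \mathcal{L}_c(\mathcal{H}) \cap \mathcal{L}_r(\mathcal{H})$ is a Banach space.
For $A \in \mathcal{L}_r(\mathcal{H})$ we find that $\overline{\langle x, Ax\rangle} = \langle Ax, x\rangle = \langle x, Ax\rangle$, that is,
$\langle x, Ax\rangle$ is real.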
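We may introduce a partial order in $\mathcal{L}_r(\mathcal{H})$ as follows: $A \ge B$ if $\langle x, Ax\rangle \ge
\langle x, Bx\rangle$ for all $x \in \mathcal{H}$. Here we note that $\mathcal{L}_r(\mathcal{H})$ is, in the sense of AIII, §6, an
ordered vector space with positive cone
$$\mathcal{L}_r^+(\mathcal{H}) = \{A \mid A \in \mathcal{L}_r(\mathcal{H}) \text{ and } A \ge 0\}.$$
It is easy to show that $\mathcal{L}_r^+(\mathcal{H})$ is closed in the norm topology.
We will now show that $\mathcal{L}_r(\mathcal{H})$ is an order unit space (see AIII, §6), that is,
the unit sphere of $\mathcal{L}_r(\mathcal{H})$ is the order interval
$$[-1, 1] = \{A \mid -1 \le A \le 1\}.$$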
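From $\|Ax\| \le C\|x\|$ it directly follows that $|\langle x, Ax\rangle| \le \|x\| \|Ax\| \le
C\|x\|^2 = C\langle x, x\rangle$, that is, $-C1 \le A \le C1$. Conversely, if $\mu_1 1 \le A \le \mu_2 1$,
then for $Ax \neq 0$ (the case $Ax = 0$ is trivial), setting
$y = \|Ax\|^{-1/2}\|x\|^{1/2}Ax = aAx$, we obtain
$$\|Ax\|^2 = \Big\langle \tfrac{1}{a}Ax, y\Big\rangle
= \tfrac{1}{4}\Big[\Big\langle A\big(\tfrac{1}{a}x + y\big), \tfrac{1}{a}x + y\Big\rangle - \Big\langle A\big(\tfrac{1}{a}x - y\big), \tfrac{1}{a}x - y\Big\rangle\Big]$$
$$\le \tfrac{1}{4}\mu\Big[\Big\|\tfrac{1}{a}x + y\Big\|^2 + \Big\|\tfrac{1}{a}x - y\Big\|^2\Big] = \mu\|x\| \|Ax\|,$$
where $\mu = \max\{|\mu_1|, |\mu_2|\}$. Therefore we obtain $\|Ax\| \le \mu\|x\|$. For $C = 1$,
$\mu_1 = -1$, $\mu_2 = 1$ we obtain our assertion about the unit sphere of $\mathcal{L}_r(\mathcal{H})$.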
If $A$ and $B$ are commuting self-adjoint operators, then $AB = BA$ is a
self-adjoint operator. In the special case in which $B \ge 0$ then $A^2B =
ABA = BA^2 \ge 0$ since $\langle x, ABAx\rangle = \langle (Ax), B(Ax)\rangle$. We will now show
that, from $A \ge 0$, $B \ge 0$ and $AB = BA$, it follows that $AB \ge 0$.
Since $A$ is bounded, there exists a number $c$ such that $\|Ax\| \le c\|x\|$. For
$A_1 = c^{-1}A$ we have previously found that $0 \le A_1 \le 1$. Recursively we define
$A_{n+1} = A_n - A_n^2$; we shall now show that $0 \le A_n \le 1$ by induction:
$$A_{n+1} = A_n^2(1 - A_n) + A_n(1 - A_n)^2 \ge 0$$
and
$$1 - A_{n+1} = (1 - A_n) + A_n^2 \ge 0$$
providing $0 \le A_n \le 1$. From $A_1 = \sum_{\nu=1}^{n} A_\nu^2 + A_{n+1}$ it follows that, since
$A_{n+1} \ge 0$, $\sum_{\nu=1}^{n} \|A_\nu x\|^2 \le \langle x, A_1x\rangle$ and we find that $\|A_\nu x\| \to 0$, that is,
$A_\nu \to 0$. Thus we find that $A_1 = \sum_{\nu=1}^{\infty} A_\nu^2$. Each $A_\nu$ is self-adjoint and
commutes with $B$, as can easily be proven by induction. Thus $AB = cA_1B =
c\sum_{\nu=1}^{\infty} A_\nu^2B \ge 0$.
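The telescoping behind this series can be checked numerically in the simplest case, a single number $a_1$ with $0 \le a_1 \le 1$ (for instance one eigenvalue of a diagonal operator). The sketch below is only an illustration of the recursion; the function name `square_series` and the starting value are hypothetical.

```python
def square_series(a1, steps):
    """Iterate a_{n+1} = a_n - a_n**2 and accumulate the sum of a_nu**2."""
    a, total = a1, 0.0
    for _ in range(steps):
        total += a * a      # add a_nu**2 to the partial sum
        a = a - a * a       # a_{nu+1} = a_nu - a_nu**2
    return total, a

# After n steps: a_1 = sum_{nu <= n} a_nu**2 + a_{n+1}, with a_{n+1} -> 0.
total, remainder = square_series(0.8, 10000)
```

The remainder decreases roughly like $1/n$, so the series converges slowly; the argument in the text needs only that it converges.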
From the preceding results we obtain the following important convergence
properties of monotone sequences of commuting operators $A_n \in \mathcal{L}_r(\mathcal{H})$.
Suppose that $A_n$ is monotonically decreasing and suppose that $A_n \ge 0$ for all
$n$. According to the above theorem $(A_m - A_n)A_m \ge 0$ and $A_n(A_m - A_n) \ge 0$
for $m < n$, that is,
$$\langle x, A_m^2x\rangle \ge \langle x, A_mA_nx\rangle \ge \langle x, A_n^2x\rangle.$$
The sequence of the $\langle x, A_mA_nx\rangle$ must therefore converge to the same value
as the monotonically decreasing sequence $\langle x, A_n^2x\rangle = \|A_nx\|^2$ so that
$\|(A_m - A_n)x\|^2 = \langle x, A_m^2x\rangle + \langle x, A_n^2x\rangle - 2\langle x, A_mA_nx\rangle \to 0$. The $A_nx$
therefore form a Cauchy sequence, and there exists an $A$ such that $A_n \to A$.
From $\langle x, A_nx\rangle \ge 0$ it follows that $\langle x, Ax\rangle \ge 0$, that is, $A \ge 0$.
If $B_n$ is a monotonically decreasing (or increasing) sequence of commuting
self-adjoint operators and there exists an operator $B'$ which commutes with
all $B_n$ and satisfies $B_n \ge B'$ (or $B_n \le B'$), then by considering the sequence
$A_n = B_n - B'$ (or $A_n = B' - B_n$) we obtain the result that there exists a $B$
such that $B_n \to B$ and $B \ge B'$ (or $B \le B'$).
6 Projection Operators
Let $\mathcal{T}$ be a closed subspace of $\mathcal{H}$. Since each $x \in \mathcal{H}$ has a unique partition
$x = q + p$, where $q \in \mathcal{T}$, $p \in \mathcal{T}^\perp$ (see §2), the relation $q = Px$ defines a linear
operator $P$, which we call a projection operator on $\mathcal{T}$. Since $\|x\|^2 =
\|q\|^2 + \|p\|^2$ and since $x = q$ for $x \in \mathcal{T}$ we find that $\|P\| = 1$ (providing that
$P$ is not equal to 0, that is, $\mathcal{T} \neq \{0\}$). From $q \in \mathcal{T}$ it follows that $P^2x =
Pq = Px$, that is, $P^2 = P$. For a corresponding partition of $y$, $y = r + s$,
$r \in \mathcal{T}$, $s \in \mathcal{T}^\perp$, it follows that
$$\langle y, Px\rangle = \langle r + s, q\rangle = \langle r, q\rangle = \langle r, q + p\rangle = \langle Py, x\rangle,$$
that is, $P \in \mathcal{L}_r(\mathcal{H})$.
Conversely, if $P \in \mathcal{L}_r(\mathcal{H})$ and $P^2 = P$, then there exists a closed subspace
$\mathcal{T}$ upon which $P$ projects. Thus the set $\mathcal{T} = P\mathcal{H}$ is a closed subspace.
The fact that $P$ projects upon $\mathcal{T}$ follows directly from the identity
$x = Px + (1 - P)x$; it is easy to show that $(1 - P)x \in \mathcal{T}^\perp$. Thus we find that
$1 - P$ is a projection operator upon $\mathcal{T}^\perp$. We therefore write $1 - P = P^\perp$.
Thus we find that $P \to P\mathcal{H}$ is a bijection of the set of projection operators
on the set of all closed subspaces of $\mathcal{H}$.
For the special case in which $\mathcal{T}$ is the one-dimensional subspace spanned
by $y$ (with $\|y\| = 1$) we shall often denote the projection operator on $\mathcal{T}$ by
$P_y$. We obtain $P_yx = y\langle y, x\rangle$.
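For a concrete finite-dimensional picture of $P_y$, the sketch below builds the matrix with entries $y_i\bar{y}_j$, which acts as $x \mapsto y\langle y, x\rangle$, and checks that it is idempotent and self-adjoint and that $x - P_yx$ is orthogonal to $y$. The helper names and the sample vectors are hypothetical.

```python
def inner(x, y):
    """Inner product <x, y>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def apply(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def projector(y):
    """Matrix of P_y x = y <y, x> for a unit vector y."""
    return [[yi * complex(yj).conjugate() for yj in y] for yi in y]

y = [0.6, 0.8j]                      # unit vector: 0.36 + 0.64 = 1
P = projector(y)
x = [1 + 2j, -3]

Px = apply(P, x)
p = [a - b for a, b in zip(x, Px)]   # component in the orthogonal complement
P2 = matmul(P, P)
```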
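If, for a pair of closed subspaces, $\mathcal{T}_1 \supset \mathcal{T}_2$, then from $x =
q_1 + p_1 = q_2 + p_2$ where $q_1 \in \mathcal{T}_1$, $p_1 \in \mathcal{T}_1^\perp$, $q_2 \in \mathcal{T}_2$ and $p_2 \in \mathcal{T}_2^\perp$ it
follows that $p_1 \in \mathcal{T}_2^\perp$. For the partition $q_1 = r_2 + s_2$ where $r_2 \in \mathcal{T}_2$ and
$s_2 \in \mathcal{T}_2^\perp$ it follows that $x = r_2 + (s_2 + p_1)$, where $r_2 \in \mathcal{T}_2$ and $s_2 + p_1 \in \mathcal{T}_2^\perp$.
Therefore $q_2 = r_2$ and $p_2 = s_2 + p_1$. If $P_1$ is the projector onto $\mathcal{T}_1$ and $P_2$ is
the projector onto $\mathcal{T}_2$ we then obtain $P_1x = P_2x + (1 - P_2)P_1x$, $P_2x =
P_2P_1x$. Thus it follows that $\|P_1x\|^2 \ge \|P_2x\|^2$ and $P_2 = P_2P_1$. Since $P_i^2 = P_i$
we find that $\|P_1x\|^2 \ge \|P_2x\|^2$ is equivalent to $P_1 \ge P_2$. Since $P_2$ is
self-adjoint, from $P_2 = P_2P_1$ it follows that $P_2 = P_2^+ = P_1P_2$, that is, $P_1$, $P_2$
commute.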
If, conversely, the projection operators $P_1$, $P_2$ satisfy $P_1 \ge P_2$ then it follows
that
$$\|(P_2 - P_1P_2)x\|^2 = \langle P_2x - P_1P_2x, P_2x - P_1P_2x\rangle$$
$$= \langle P_2x, P_2x\rangle + \langle P_1P_2x, P_1P_2x\rangle - \langle P_1P_2x, P_2x\rangle - \langle P_2x, P_1P_2x\rangle$$
$$= \|P_2x\|^2 - \|P_1P_2x\|^2$$
and since $\|P_1y\|^2 \ge \|P_2y\|^2$ we obtain
$$\|(P_2 - P_1P_2)x\|^2 = \|P_2P_2x\|^2 - \|P_1P_2x\|^2 \le 0,$$
that is, $P_2 - P_1P_2 = 0$. Thus, as above, it follows that $P_2 = P_2P_1 = P_1P_2$.
For $x \in P_2\mathcal{H}$, from $P_2 = P_1P_2$ it follows that $x = P_1x$ and therefore
$x \in P_1\mathcal{H}$, that is, $P_1\mathcal{H} \supset P_2\mathcal{H}$. The above bijection between the projection
operators and the corresponding projection spaces is therefore an order
isomorphism.
From $P_2 = P_1P_2$ it follows that $(P_1 - P_2)^2 = P_1 - P_2$ and therefore
$P_1 - P_2 \ge 0$, that is, $P_1 \ge P_2$. If $(P_1 - P_2)^2 = P_1 - P_2$ and therefore $P_1 \ge P_2$
then it also follows that $P_2 = P_1P_2 = P_2P_1$.
The following conditions are therefore equivalent:
$$P_2 = P_1P_2, \quad P_2 = P_2P_1, \quad (P_1 - P_2)^2 = P_1 - P_2, \quad P_1 \ge P_2, \quad P_1\mathcal{H} \supset P_2\mathcal{H}.$$
For two projection operators $P$ and $Q$ we denote the projector onto
$(P\mathcal{H}) \cap (Q\mathcal{H})$ by $P \wedge Q$ and the projector onto $(P\mathcal{H}) \vee (Q\mathcal{H})$ by
$P \vee Q$. Because of the above order isomorphism, the set of projection
operators is, according to §2, an orthocomplemented lattice.
We will now show that $PQ = QP$ is equivalent to $P \wedge Q = PQ$. From
$PQ = QP$ it follows that $(PQ)^2 = PQ$ and $(QP)^2 = QP$. Thus it follows that
$PQ\mathcal{H} \subset P\mathcal{H}$ and $PQ\mathcal{H} = QP\mathcal{H} \subset Q\mathcal{H}$ and we obtain $PQ\mathcal{H} \subset
P\mathcal{H} \wedge Q\mathcal{H}$. If $x \in P\mathcal{H} \wedge Q\mathcal{H}$ then it follows that $x = Px = Qx$
and that $x = PQx \in PQ\mathcal{H}$.
Conversely, if $PQ = P \wedge Q$, then $PQ$ is self-adjoint, that is, $(PQ)^+ =
QP = PQ$.
In general $PQ$ is not self-adjoint and is not a projection.
For the special case in which $PQ = 0$ it follows that $PQ = QP$ and
$Q = (1 - P)Q$ and therefore $Q\mathcal{H} \subset (P\mathcal{H})^\perp$, that is, $Q\mathcal{H}$ is orthogonal to
$P\mathcal{H}$. It follows that $P + Q$ is the projection operator onto $P\mathcal{H} \oplus Q\mathcal{H} =
P\mathcal{H} \vee Q\mathcal{H}$. If $P + Q \le 1$, then it follows that $P \le 1 - Q$ and that $P =
P(1 - Q)$, that is, $PQ = 0$, $P\mathcal{H} \perp Q\mathcal{H}$ and $P + Q$ is a projection onto
$P\mathcal{H} \oplus Q\mathcal{H}$.
If $P_n$ is a decreasing (or increasing) sequence of projection operators, then
from §5 it follows that $P_n$ converges, $P_n \to P$, since $P_n \ge 0$ ($P_n \le 1$). From
$P_n^2 = P_n$ it follows that $P^2 = P$, that is, $P$ is also a projection operator. It is
easy to show that $P\mathcal{H} = \bigcap_n (P_n\mathcal{H})$ (or $P\mathcal{H} = \bigvee_n (P_n\mathcal{H})$).
If $P_n$ is a sequence of projection operators, then $\sum_{n=1}^{N} P_n$ is an increasing
sequence. If $\sum_{n=1}^{N} P_n \le 1$ for all $N$ then $\sum_{n=1}^{\infty} P_n$ exists and is $\le 1$. It is easy to
show that the condition $\sum_{n=1}^{\infty} P_n \le 1$ is equivalent to the condition that the $P_n$
are pairwise orthogonal, that is, $P_nP_m = 0$ for $n \neq m$. Thus we find that
$\sum_n P_n$ is a projection operator on $\bigvee_n (P_n\mathcal{H}) = \sum_n \oplus P_n\mathcal{H}$.
If $P$ is a projection operator and $A \in \mathcal{L}_r(\mathcal{H})$, then $AP = PA$ is equivalent
to $A = PAP + (1 - P)A(1 - P)$, where the latter can be proven easily with
the aid of the identity
$$A = PAP + (1 - P)AP + PA(1 - P) + (1 - P)A(1 - P).$$
If $A = PA$ (or $A = AP$) then since $A \in \mathcal{L}_r(\mathcal{H})$ it follows that $A = (PA)^+ = AP$
(or $A = PA$). $A$ therefore commutes with $P$ and we obtain $A = PA =
P^2A = PAP$. If $PA = 0$, then it follows that $PA = AP = 0$ and we obtain
$(1 - P)A = A(1 - P) = (1 - P)A(1 - P) = A$.
If $0 \le A \le P$, then for $x \in (1 - P)\mathcal{H}$ it easily follows that $\langle x, Ax\rangle = 0$.
Since $A \ge 0$, according to §5 we may write $A$ in the form $A = c\sum_\nu A_\nu^2$.
From $\langle x, Ax\rangle = 0$ it easily follows that $A_\nu x = 0$ and $Ax = 0$, that is,
$A(1 - P) = 0$. Thus we obtain
$$A = AP = PA = PAP.$$
7 Isometric and Unitary Operators
A linear operator $V$ is said to be isometric if $\|Vx\| = \|x\|$ for all $x \in \mathcal{H}$. Thus
we find that $V \in \mathcal{L}(\mathcal{H})$ and $\|V\| = 1$. From $\langle Vx, Vx\rangle = \langle x, V^+Vx\rangle = \langle x, x\rangle$
it follows that $V^+V = 1$ (see §3) and from $V^+V = 1$ we obtain an isometry.
For the operator $P = VV^+$ it follows that $P^+ = P$ and $P^2 = P$, that is,
$VV^+$ is a projection operator. If the Hilbert space $\mathcal{H}$ is finite dimensional,
then we must have $V^+ = V^{-1}$ and therefore $P = 1$. For infinite-dimensional
$\mathcal{H}$ it is possible that $P \neq 1$.
$P = 1$ is equivalent to the statement that $V^+$ is also an isometry. An
operator $U$ for which both $U$ and $U^+$ are isometric is said to be unitary.
Therefore $U$ is unitary if and only if $UU^+ = U^+U = 1$.
If $V$ is isometric, then $V\mathcal{H}$ is a closed subspace of $\mathcal{H}$ because, from the
continuity of $V$ and $V^+$, it follows from $Vx_n \to y$ that $V^+Vx_n \to V^+y$, that is,
$x_n \to V^+y = y'$ and that $Vx_n \to Vy' \in V\mathcal{H}$. From $P = VV^+$ it follows easily
that $P\mathcal{H} \subset V\mathcal{H}$. From $\mathcal{H} \supset V\mathcal{H}$ it follows that $P\mathcal{H} \supset PV\mathcal{H} =
VV^+V\mathcal{H} = V\mathcal{H}$. Therefore we obtain $P\mathcal{H} = V\mathcal{H}$. $P$ is therefore a
projection onto the image of $V$. An isometric operator is therefore unitary if and
only if $V\mathcal{H} = \mathcal{H}$.
If the isometric operator $V$ has a right-inverse, that is, there exists a $V'$ such
that $VV' = 1$, then it follows that $V^+VV' = V^+$ and $V' = V^+$, that is,
$VV^+ = 1$. The statement that an isometric operator $V$ is unitary is equivalent
to the statement that $V$ has a right inverse.
8 Spectral Representation of Self-adjoint and Unitary Operators
We shall now demonstrate the following theorem:
Let $A \in \mathcal{L}_r^+(\mathcal{H})$; then there exists a unique $B \in \mathcal{L}_r^+(\mathcal{H})$ such that $B^2 = A$;
all operators which commute with $A$ also commute with $B$. ($B$ is called the
positive root of $A$, $B = A^{1/2}$.)
Proof. From $B_1 \in \mathcal{L}_r^+(\mathcal{H})$ and $B_1^2 = A_1 = \|A\|^{-1}A$, it follows that for
$B = \|A\|^{1/2}B_1$, $B^2 = \|A\|B_1^2 = A$ and $B \in \mathcal{L}_r^+(\mathcal{H})$; from $B^2 = A$ and $B \in \mathcal{L}_r^+(\mathcal{H})$
and for $B_1 = \|A\|^{-1/2}B$ it follows that $B_1^2 = A_1$ and $B_1 \in \mathcal{L}_r^+(\mathcal{H})$. Since $\|A\|^{-1}A \le 1$ it
suffices to prove the theorem for $A \le 1$. For this purpose we
define a sequence $B_n$ as follows: $B_0 = 0$, $B_{n+1} = B_n + \tfrac{1}{2}(A - B_n^2)$. Thus it follows
that all operators which commute with $A$ also commute with all $B_n$. $B_n$ is an
increasing sequence which satisfies $0 \le B_n \le 1$, because from $1 - B_{n+1} =
\tfrac{1}{2}(1 - B_n)^2 + \tfrac{1}{2}(1 - A)$ it follows that $1 - B_{n+1} \ge 0$; by the induction hypothesis it
follows from $B_{n+1} - B_n = \tfrac{1}{2}(B_n - B_{n-1})[(1 - B_{n-1}) + (1 - B_n)]$ that the relation
$B_{n+1} - B_n \ge 0$ holds. The sequence $B_n$ converges; therefore we obtain $B_n \to B$
where $0 \le B \le 1$. From $B_{n+1} = B_n + \tfrac{1}{2}(A - B_n^2)$ it follows that, in the limit,
$B^2 = A$. Since the $B_n$ commute with all operators which commute with $A$, the same
holds for $B$.
In order to prove uniqueness, we assume that $C \ge 0$ and that $C^2 = A$. From
$AC = C^2C = CC^2 = CA$, $C$ commutes with $A$, and therefore also commutes with
$B$. Earlier we have shown that there exist positive roots $B^{1/2}$ and $C^{1/2}$. For $x \in \mathcal{H}$
and $y = (B - C)x$ it follows that
$$\|B^{1/2}y\|^2 + \|C^{1/2}y\|^2 = \langle y, By\rangle + \langle y, Cy\rangle = \langle y, (B + C)y\rangle
= \langle y, (B + C)(B - C)x\rangle = \langle y, (B^2 - C^2)x\rangle = 0.$$
Therefore we obtain $B^{1/2}y = C^{1/2}y = 0$, from which we conclude that $By = 0$ and
$Cy = 0$. Thus we obtain $\|(B - C)x\|^2 = \langle x, (B - C)^2x\rangle = \langle x, (B - C)y\rangle = 0$,
that is, $(B - C)x = 0$. Since $x$ was arbitrary, we have proven $B = C$.
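The iteration used in the existence proof can be tried out on a small matrix. The sketch below applies $B_{n+1} = B_n + \tfrac{1}{2}(A - B_n^2)$, $B_0 = 0$, to an assumed real symmetric $2 \times 2$ example whose eigenvalues ($0.3$ and $0.7$) lie in $[0, 1]$; the function names, the sample matrix, and the fixed step count are illustrative choices, not part of the text.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def positive_root(A, steps=200):
    """Iterate B_{n+1} = B_n + (A - B_n^2)/2 starting from B_0 = 0.

    Assumes 0 <= A <= 1 in the operator order, as in the proof.
    """
    n = len(A)
    B = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        B2 = matmul(B, B)
        B = [[B[i][j] + 0.5 * (A[i][j] - B2[i][j]) for j in range(n)]
             for i in range(n)]
    return B

A = [[0.5, 0.2], [0.2, 0.5]]   # eigenvalues 0.3 and 0.7
B = positive_root(A)
BB = matmul(B, B)              # should reproduce A up to rounding
```

Every $B_n$ is a polynomial in $A$, which is why each iterate commutes with everything that commutes with $A$. Convergence is only linear and becomes slow for eigenvalues of $A$ near 0.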
Now let $B(\mu) \ge 0$; let $B(\mu)^2 = (A - \mu 1)^2$. $B(\mu)$ is therefore uniquely
determined and commutes with all operators which commute with $A$.
Let $A(\mu) = A - \mu 1$, $A(\mu)_+ = \tfrac{1}{2}(B(\mu) + A(\mu))$ and $A(\mu)_- = \tfrac{1}{2}(B(\mu) - A(\mu))$.
Therefore $A(\mu) = A(\mu)_+ - A(\mu)_-$ and $B(\mu) = A(\mu)_+ + A(\mu)_-$. Since $B(\mu)$ and
$A(\mu)$ commute, we obtain
$$A(\mu)_+A(\mu)_- = A(\mu)_-A(\mu)_+ = \tfrac{1}{4}(B(\mu)^2 - A(\mu)^2) = 0.$$
Let $\mathcal{T}_\mu$ denote the set $\{x \mid x \in \mathcal{H} \text{ and } A(\mu)_+x = 0\}$. From the continuity
of $A(\mu)_+$ it follows that $\mathcal{T}_\mu$ is a closed subspace of $\mathcal{H}$. Let $E(\mu)$
denote the projector onto $\mathcal{T}_\mu$. We therefore obtain $A(\mu)_+E(\mu) = 0$.
Since $A(\mu)_+A(\mu)_- = 0$ we therefore obtain $A(\mu)_-\mathcal{H} \subset \mathcal{T}_\mu$, that is,
$E(\mu)A(\mu)_- = A(\mu)_-$.
If $C \in \mathcal{L}_r(\mathcal{H})$ commutes with $A$, then it also commutes with $A(\mu)_+$
and $A(\mu)_-$. For $y \in \mathcal{T}_\mu$ it then follows that $A(\mu)_+Cy = CA(\mu)_+y = 0$, that is,
$Cy \in \mathcal{T}_\mu$. Thus it follows that $CE(\mu) = E(\mu)CE(\mu)$. Computing the adjoint
operator we obtain $E(\mu)C = CE(\mu)$.
Let $C = A(\mu)_-$; from $E(\mu)A(\mu)_- = A(\mu)_-$ it follows that $A(\mu)_-E(\mu) =
A(\mu)_-$. Similarly, from $A(\mu)_+E(\mu) = 0$ we obtain $E(\mu)A(\mu)_+ = 0$. Thus, from
$0 \le E(\mu) \le 1$ and $B(\mu) \ge 0$ it follows that
$$\begin{aligned}
0 &\le E(\mu)B(\mu) = B(\mu)E(\mu) = E(\mu)[A(\mu)_+ + A(\mu)_-] = A(\mu)_-,\\
0 &\le [1 - E(\mu)]B(\mu) = A(\mu)_+,\\
&E(\mu)A(\mu) = -A(\mu)_-, \qquad (1 - E(\mu))A(\mu) = A(\mu)_+.
\end{aligned} \tag{8.1}$$
For $\lambda < \mu$ we obtain $A(\lambda) - A(\mu) \ge 0$; thus, from $A(\lambda)_- \ge 0$ we obtain
$A(\lambda)_+ - A(\mu)_+ + A(\mu)_- \ge 0$. By multiplication with $A(\mu)_+$ we obtain
$$A(\mu)_+[A(\lambda)_+ - A(\mu)_+ + A(\mu)_-] \ge 0,$$
that is, since $A(\mu)_+A(\mu)_- = 0$ we obtain $A(\mu)_+A(\lambda)_+ \ge (A(\mu)_+)^2$, and we find
that $\langle x, A(\mu)_+A(\lambda)_+x\rangle \ge \|A(\mu)_+x\|^2$ for all $x$. From $A(\lambda)_+x = 0$ it therefore
follows that $A(\mu)_+x = 0$, that is, $\mathcal{T}_\lambda \subset \mathcal{T}_\mu$, which is equivalent to $E(\lambda) \le E(\mu)$
and to $E(\mu)E(\lambda) = E(\lambda)$.
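The fact that, for $A$, there exist two constants $\alpha$, $\beta$ for which $\alpha 1 \le A \le \beta 1$
implies that $A(\lambda) \ge 0$ for $\lambda \le \alpha$ and therefore $B(\lambda) = A(\lambda)$ for $\lambda \le \alpha$ because
the positive root of $A(\lambda)^2$ is unique; therefore we obtain $A(\lambda)_+ = A(\lambda)$.
Since $\langle x, A(\lambda)x\rangle = \langle x, A(\lambda)_+x\rangle \ge (\alpha - \lambda)\|x\|^2$ it follows that for $\lambda < \alpha$,
$A(\lambda)_+x = 0$ only for $x = 0$, and we therefore obtain $E(\lambda) = 0$ for $\lambda < \alpha$. For
$\lambda \ge \beta$ it follows that $-A(\lambda) \ge 0$ and therefore $B(\lambda) = -A(\lambda)$, that is, $A(\lambda)_+ = 0$,
from which we conclude that $E(\lambda) = 1$ for $\lambda \ge \beta$.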
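Since for $\lambda < \mu$, $E(\lambda) \le E(\mu)$, $E(\mu) - E(\lambda)$ is a projection operator onto the
space $\mathcal{T}_\mu \cap \mathcal{T}_\lambda^\perp$. Let $E(J)$ denote $E(\mu) - E(\lambda)$; from the relation (8.1) we
obtain:
$$0 \le A(\lambda)_+E(J) = A(\lambda)(1 - E(\lambda))E(J) = A(\lambda)E(J) = (A - \lambda 1)E(J),$$
$$0 \le A(\mu)_-E(J) = -A(\mu)E(\mu)E(J) = -A(\mu)E(J) = (\mu 1 - A)E(J).$$
Thus it follows that
$$\lambda E(J) \le AE(J) \le \mu E(J). \tag{8.2}$$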
We shall denote the limit of $E(\mu)$ as $\mu \to \lambda$ (with $\mu > \lambda$) by $E(\lambda+)$. Let $Q(\lambda) =
E(\lambda+) - E(\lambda)$. From (8.2) we obtain
$$AQ(\lambda) = \lambda Q(\lambda), \quad\text{that is,}\quad A(\lambda)Q(\lambda) = 0.$$
In addition it follows that
$$A(\lambda)_+Q(\lambda) = (1 - E(\lambda))A(\lambda)Q(\lambda) = 0$$
and we therefore find that $Q(\lambda)x \in \mathcal{T}_\lambda$ for all $x$, that is, $E(\lambda)Q(\lambda) = Q(\lambda)$. From
$E(\lambda)E(J) = 0$ it follows that, in the limit $\mu \to \lambda$, $E(\lambda)Q(\lambda) = 0$; thus we find
that $Q(\lambda) = 0$ and therefore we obtain $E(\lambda+) = E(\lambda)$, that is, $E(\lambda)$ is
continuous from above.
Thus, for $P(\mu) = E(\mu) - E(\mu-)$, for $\lambda \to \mu$, from (8.2) we obtain
$$AP(\mu) = \mu P(\mu). \tag{8.3}$$
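If we partition the real interval $[\alpha - \delta, \beta]$ (for $\alpha$, $\beta$ used above, and arbitrarily
small $\delta > 0$) into subintervals $J_k = (\lambda_k, \lambda_{k+1}]$, where $E(J_k) =
E(\lambda_{k+1}) - E(\lambda_k)$, from (8.2) we obtain
$$\sum_k \lambda_kE(J_k) \le \sum_k AE(J_k) \le \sum_k \lambda_{k+1}E(J_k).$$
If the maximal length of the intervals $J_k$ is equal to $\varepsilon$, then from $\sum_k E(J_k) =
E(\beta) - E(\alpha - \delta) = 1$ it follows that
$$0 \le \sum_k (\lambda_{k+1} - \lambda_k)E(J_k) \le \varepsilon 1$$
and therefore
$$0 \le \sum_k \lambda_{k+1}E(J_k) - A \le \varepsilon 1, \qquad 0 \le A - \sum_k \lambda_kE(J_k) \le \varepsilon 1,$$
from which it follows that
$$\Big\|A - \sum_k \lambda_{k+1}E(J_k)\Big\| \le \varepsilon \quad\text{and}\quad \Big\|A - \sum_k \lambda_kE(J_k)\Big\| \le \varepsilon.$$
Thus $\sum_k \lambda_kE(J_k)$ and $\sum_k \lambda_{k+1}E(J_k)$ converge, as $\varepsilon \to 0$, in norm to $A$. The
limit of the sum is written as the integral
$$A = \int_{\alpha - \delta}^{\beta} \lambda \, dE(\lambda). \tag{8.4}$$
(8.4) is called the spectral representation of $A$.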
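In finite dimensions the spectral family is a step function, and the approximating sums can be computed directly. The following sketch assumes the example matrix $A$ with rows $(2, 1)$ and $(1, 2)$, whose eigenvalues $1$ and $3$ and orthonormal eigenvectors are entered by hand rather than computed; it forms $E(\lambda)$ as the sum of eigenprojectors with eigenvalue $\le \lambda$ and evaluates the upper sum $\sum_k \lambda_{k+1}E(J_k)$ over a partition. All names are hypothetical.

```python
import math

# Assumed example: A = [[2, 1], [1, 2]], eigenvalues 1 and 3, with the
# orthonormal eigenvectors written out explicitly.
s = 1 / math.sqrt(2)
eigenpairs = [(1.0, [s, -s]), (3.0, [s, s])]
ZERO = [[0.0, 0.0], [0.0, 0.0]]

def projector(y):
    """Real case of P_y: entries y_i * y_j."""
    return [[yi * yj for yj in y] for yi in y]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def E(lam):
    """Spectral family: sum of eigenprojectors with eigenvalue <= lam."""
    out = ZERO
    for mu, y in eigenpairs:
        if mu <= lam:
            out = add(out, projector(y))
    return out

def upper_sum(points):
    """Sum over k of lambda_{k+1} * (E(lambda_{k+1}) - E(lambda_k))."""
    out = ZERO
    for lo, hi in zip(points, points[1:]):
        jump = add(E(hi), scale(-1.0, E(lo)))
        out = add(out, scale(hi, jump))
    return out

points = [0.5 + 0.001 * k for k in range(3001)]   # partition of [0.5, 3.5]
approx = upper_sum(points)
```

With mesh $\varepsilon = 0.001$ the upper sum differs from $A$ by at most about $\varepsilon$ in norm, in line with the estimate above; only the two subintervals containing an eigenvalue contribute.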
From (8.4) for a polynomial $p$ it follows that
$$p(A) = \int p(\lambda) \, dE(\lambda).$$
In addition, for a function $f(\lambda)$ which is
measurable relative to the projection valued measure defined by $E(\lambda)$ (see
IV, §2.5) we may define the function $f(A)$ as follows:
$$f(A) = \int f(\lambda) \, dE(\lambda).$$
In particular, for
$$\eta(\lambda) = \begin{cases} 1 & \text{for } \lambda_1 < \lambda \le \lambda_2, \\ 0 & \text{otherwise,} \end{cases}$$
it follows that $\eta(A) = E(\lambda_2) - E(\lambda_1)$.
From (8.4) it easily follows that the solution of the eigenvalue problem
$Ax = \mu x$ (or $(A - \mu 1)x = 0$) is equivalent to
$$\|(A - \mu 1)x\|^2 = \int (\lambda - \mu)^2 \, d\|E(\lambda)x\|^2 = 0$$
and we therefore obtain $x \in P(\mu)\mathcal{H}$ ($P(\mu)$ is defined in (8.3)). The eigenvalues
are therefore the values at which $E(\mu)$ is discontinuous.
The uniqueness of the spectral family follows from the representation
(8.4). Let $E'(\lambda)$ be a second spectral family which has projection operators
which are increasing and continuous from above and satisfy (8.4). Then it
follows that
$$A(\mu)_+ = \int_\mu^\beta (\lambda - \mu) \, dE'(\lambda).$$
The space $\mathcal{T}_\mu$ is therefore the set of all $x$ for which
$$\int_\mu^\beta (\lambda - \mu) \, d\|E'(\lambda)x\|^2 = 0. \tag{8.5}$$
From this it follows that
$$E'(\mu + \varepsilon)x = x$$
for each $\varepsilon > 0$. Since $E'(\lambda)$ is continuous from above, we therefore obtain
$E'(\mu)x = x$ for all $x \in \mathcal{T}_\mu$, that is, for all $x = E(\mu)y$, where $y \in \mathcal{H}$. Therefore
$E'(\mu)E(\mu) = E(\mu)$, that is, $E'(\mu) \ge E(\mu)$. Conversely, if $E'(\mu)x = x$, then (8.5) is
satisfied, that is, $A(\mu)_+x = 0$ and we obtain $x \in \mathcal{T}_\mu$. Therefore $E'(\mu)\mathcal{H} \subset \mathcal{T}_\mu$,
that is, $E'(\mu) \le E(\mu)$. Thus we have proven that $E'(\mu) = E(\mu)$.
If $U$ is a unitary operator, then $A = \tfrac{1}{2}(U + U^+)$ and $B = (1/2i)(U - U^+)$
are two commuting self-adjoint operators which satisfy $\|A\| \le 1$, $\|B\| \le 1$.
From $UU^+ = 1$ it follows that $A^2 + B^2 = 1$, that is, $B^2 = 1 - A^2$. For $A$
there exists a spectral representation
$$A = \int_{-1}^{1} \mu \, dE(\mu).$$
Thus we obtain $B^2 = \int_{-1}^{1} (1 - \mu^2) \, dE(\mu)$.
For a partition of $B$ into positive and negative parts $B = B_+ - B_-$ we
obtain $B^2 = B_+^2 + B_-^2$. Let $F$ denote the projection operator onto the
subspace of all $x$ for which $B_-x = 0$. Then $B_+ = BF$ and $B_- = -B(1 - F)$
and we obtain $B_+^2 = B^2F$ and $B_-^2 = B^2(1 - F)$, from which we obtain
$$B_+ = \int_{-1}^{1} \sqrt{1 - \mu^2} \, dE(\mu)\, F, \qquad B_- = \int_{-1}^{1} \sqrt{1 - \mu^2} \, dE(\mu)\, (1 - F).$$
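For $\mu \pm i\sqrt{1 - \mu^2} = e^{i\varphi}$ and
$$G(\varphi) = \begin{cases} (1 - E(\mu))F & \text{for } 0 \le \varphi \le \pi, \\ E(\mu)(1 - F) + F & \text{for } \pi < \varphi \le 2\pi \end{cases}$$
we obtain
$$U = \int_0^{2\pi} e^{i\varphi} \, dG(\varphi).$$
For $D(\varphi) = G(\varphi+)$ we find that $D(\varphi)$ is continuous from above, and that
$$U = \int_0^{2\pi} e^{i\varphi} \, dD(\varphi). \tag{8.6}$$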
9 The Spectrum of Compact Self-adjoint Operators
If $A$ is self-adjoint and compact (see §4) then $A$ can only have a discrete
spectrum, that is, the spectral family of $A$ cannot be continuously increasing.
In addition, the eigenspaces corresponding to nonzero eigenvalues, that is, all
$P(\mu)\mathcal{H}$ which satisfy (8.3) for nonzero values of $\mu$, can only have finite
dimension, and the sequence of eigenvalues must converge towards 0.
Proof. If $E(\lambda)$ were continuously increasing, or if there existed an infinite dimensional
eigenspace corresponding to a nonzero eigenvalue, then there would exist an
interval $\lambda$ to $\lambda + \varepsilon$ for which $[E(\lambda + \varepsilon) - E(\lambda-)]$ was infinite dimensional and
$\lambda + \varepsilon < 0$ or $\lambda > 0$. Therefore there would be an infinite sequence
$x_\nu \in [E(\lambda + \varepsilon) - E(\lambda-)]\mathcal{H}$ for which $\langle x_\nu, x_\mu\rangle = \delta_{\nu\mu}$ so that $x_\nu \rightharpoonup 0$ (see §4). Since $A$
is compact, we must have $Ax_\nu \to 0$. From (8.4), it would then follow that
$$\|Ax_\nu\|^2 = \int \mu^2 \, d\|E(\mu)x_\nu\|^2 \ge \min\{\lambda^2, (\lambda + \varepsilon)^2\} \neq 0$$
in contradiction to $\|Ax_\nu\| \to 0$.
Therefore we obtain, for the eigenvalues $\mu$ of $A$,
$$A = \sum_\mu \mu P(\mu) \tag{9.1}$$
where 0 is the only accumulation point for the eigenvalues $\mu$ and $P(\mu)$ is finite
dimensional for $\mu \neq 0$. Therefore we may choose a complete orthonormal basis $x_\nu$,
for which (9.1) is transformed into
$$A = \sum_\nu \mu_\nu P_{x_\nu} \quad\text{where}\quad \mu_\nu \to 0. \tag{9.2}$$
We shall now show that if (9.2) holds, then $A$ is compact. According to §4 it is
sufficient to consider a sequence $y_n \rightharpoonup 0$ (weakly) with $\|y_n\| \le 1$.
From (9.2) it follows that (here we set $M_N = \max\{|\mu_\nu| : \nu > N\}$):
$$\|Ay_n\|^2 = \sum_\nu \mu_\nu^2 |\langle x_\nu, y_n\rangle|^2
  \le \sum_{\nu=1}^{N} \mu_\nu^2 |\langle x_\nu, y_n\rangle|^2 + M_N^2.$$
Now we choose $N$ such that $M_N < \varepsilon$ and then let $n \to \infty$, and we obtain
$\langle x_\nu, y_n\rangle \to 0$. Therefore we obtain $\|Ay_n\| \to 0$.
From (9.2) it follows that $\mathcal{L}_{cr}(\mathcal{H})$ is the norm-closed subspace of
$\mathcal{L}_r(\mathcal{H})$ generated by the set of all $P_x$. A projection operator $P$ is an
element of $\mathcal{L}_{cr}(\mathcal{H})$ if and only if $P\mathcal{H}$ is finite dimensional.
10 Spectral Representation of Unbounded Self-adjoint Operators
In general, an unbounded linear operator $A$ in $\mathcal{H}$ is not defined in all of $\mathcal{H}$.
Let $\mathcal{D}_A$ denote the domain of definition of $A$, that is, $A$ is defined as the map
$\mathcal{D}_A \to \mathcal{H}$. From linearity, we may assume that $\mathcal{D}_A$ is a subspace (but not
necessarily a closed subspace). Let $\mathcal{W}_A = A\mathcal{D}_A$ denote the range of $A$.
$\mathcal{W}_A$ is also a subspace of $\mathcal{H}$.
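If $E(\lambda)$ is a spectral family of projection operators, that is,
$E(\lambda_1) \ge E(\lambda_2)$ for $\lambda_1 \ge \lambda_2$, $E(\lambda+) = E(\lambda)$ and
$E(\lambda) \to 0$ for $\lambda \to -\infty$, $E(\lambda) \to 1$ for $\lambda \to \infty$, then
$$A_{\alpha\beta} = \int_\alpha^\beta \lambda \, dE(\lambda)$$
defines an operator $A_{\alpha\beta}$ for which $\|A_{\alpha\beta}\| \le \max(|\alpha|, |\beta|)$.
For an $x \in \mathcal{H}$ and for $\alpha \to -\infty$ and $\beta \to \infty$, $A_{\alpha\beta}x$ is
convergent if there exists a $c$ for which $\|A_{\alpha\beta}x\| \le c$ for all $\alpha, \beta$.
In particular, for all $x$ for which
$$\int_{-\infty}^{\infty} \lambda^2 \, d\|E(\lambda)x\|^2 < \infty \tag{10.1}$$
there exists an operator $A$ defined by
$$Ax = \int_{-\infty}^{\infty} \lambda \, dE(\lambda)x. \tag{10.2}$$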
The operator $A$ therefore has a natural definition domain consisting of the set
of vectors which satisfy (10.1). If there does not exist a $\lambda$ for which either
$E(\lambda) = 1$ or $0$, then it is easy to show that $\mathcal{D}_A \ne \mathcal{H}$. $\mathcal{D}_A$ is,
however, dense in $\mathcal{H}$.
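For a map $\mathcal{D}_A \to \mathcal{H}$ we may consider the graph as a subset of the
topological product space $\mathcal{H} \times \mathcal{H}$, that is, the set of all pairs $(x, Ax)$ for
which $x \in \mathcal{D}_A$. Here it is particularly advantageous to consider the topological
product of $\mathcal{H}$ with itself to be a Hilbert space $\mathcal{H} \oplus \mathcal{H} = \mathcal{H}^2$ where
$(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$, $\alpha(x, y) = (\alpha x, \alpha y)$ and
$\langle (x_1, y_1), (x_2, y_2)\rangle = \langle x_1, x_2\rangle + \langle y_1, y_2\rangle$.
The graph $\mathcal{G}_A$ of $A$ is the set of all vectors of the form $(x, Ax) \in \mathcal{H}^2$.
$\mathcal{G}_A$ is a subspace of $\mathcal{H}^2$ if and only if $A$ is linear. Conversely, each subspace
$\mathcal{G}$ of $\mathcal{H}^2$ for which there is no element of the form $(0, y)$ with $y \ne 0$ defines a
linear operator $A$ for which $\mathcal{D}_A$ is the set of first components of the elements of
$\mathcal{G}$.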
The operator $A$ is said to be closed if $\mathcal{G}_A$ is a closed subspace. This is
equivalent to the condition that for $x_n \in \mathcal{D}_A$, $x_n \to x$ and $Ax_n \to y$ it follows
that $(x, y) \in \mathcal{G}_A$, that is, $x \in \mathcal{D}_A$ and $Ax = y$.
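For an operator $A$ we define $A^+$ as follows: $\mathcal{D}_{A^+}$ is the set of all $x$ for which
there exists an $x'$ for which $\langle x, Ay\rangle = \langle x', y\rangle$ for all $y \in \mathcal{D}_A$.
$A^+x = x'$ is therefore defined only if from $\langle z, y\rangle = 0$ for all $y \in \mathcal{D}_A$ it
follows that $z = 0$, that is, if $\mathcal{D}_A$ is dense in $\mathcal{H}$.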
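If we define a unitary operator $U$ in $\mathcal{H}^2$ by $U(x, y) = (y, -x)$, we find that
$U^2 = -1$. The graph of $A^+$ is precisely $\mathcal{G}_{A^+} = (U\mathcal{G}_A)^\perp$. $A^+$ is therefore
defined only if $(U\mathcal{G}_A)^\perp$ is a graph, that is, if it contains no element $(0, y)$ for
which $y \ne 0$, that is, if $\mathcal{D}_A$ is dense in $\mathcal{H}$. Since $\mathcal{G}_{A^+} = (U\mathcal{G}_A)^\perp$
is a closed subspace, $A^+$ is closed.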
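We shall now show that if $A$ is itself closed, then $\mathcal{D}_{A^+}$ is also dense in $\mathcal{H}$. If
$\mathcal{D}_{A^+}$ is not dense in $\mathcal{H}$ then there exists a $y \in \mathcal{H}$, $y \ne 0$, for which
$y \perp \mathcal{D}_{A^+}$. We therefore obtain $(0, y) \perp U\mathcal{G}_{A^+}$, that is,
$(0, y) \in (U\mathcal{G}_{A^+})^\perp = [U(U\mathcal{G}_A)^\perp]^\perp = [(U^2\mathcal{G}_A)^\perp]^\perp
= U^2\mathcal{G}_A = \mathcal{G}_A$ in contradiction to the assumption that $\mathcal{G}_A$ is a
graph.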
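If $\mathcal{D}_{A^+}$ is dense in $\mathcal{H}$ (but $A$ is not necessarily closed!) then $A^{++}$ exists, and
it follows that
$$\mathcal{G}_{A^{++}} = [U(U\mathcal{G}_A)^\perp]^\perp = (\mathcal{G}_A)^{\perp\perp} = [\mathcal{G}_A],$$
that is, $\mathcal{G}_{A^{++}}$ is the closed subspace generated by $\mathcal{G}_A$.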
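We say that $B$ is an extension of $A$ (written $B \supset A$) if $\mathcal{G}_B \supset \mathcal{G}_A$.
$A^{++}$ is therefore, if it exists (that is, if $\mathcal{D}_{A^+}$ is dense in $\mathcal{H}$), an
extension of $A$: $A^{++} \supset A$. $A^{++}$ is closed. If $B \supset A$, and if $B$ is closed, then
$\mathcal{G}_B \supset \mathcal{G}_A$ and we therefore obtain $\mathcal{G}_B \supset [\mathcal{G}_A]$, that is,
$B \supset A^{++}$.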
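We define $A + B$ by $\mathcal{D}_{A+B} = \mathcal{D}_A \cap \mathcal{D}_B$ and $(A + B)x = Ax + Bx$. We
define $AB$ by means of $\mathcal{D}_{AB}$ as the set of all $x \in \mathcal{D}_B$ for which $Bx \in \mathcal{D}_A$
and $(AB)x = A(Bx)$. If $Ax = 0$ only for $x = 0$, then we can define $A^{-1}$ by
$\mathcal{D}_{A^{-1}} = \mathcal{W}_A$ and $A^{-1}(Ax) = x$. It is easy to show that $0A \subset 0$,
$A(BC) = (AB)C$, $(A + B)C = AC + BC$, $A(B + C) \supset AB + AC$, $(AB)^{-1} = B^{-1}A^{-1}$ if
$A^{-1}$, $B^{-1}$ exist; $(A + B)^+ \supset A^+ + B^+$; $(AB)^+ \supset B^+A^+$.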
$A$ is said to be self-adjoint if $A^+ = A$. Clearly $\mathcal{D}_A$ must be dense in $\mathcal{H}$. $A$ is
said to be symmetric if $\mathcal{D}_A$ is dense in $\mathcal{H}$ and $\langle x, Ay\rangle = \langle Ax, y\rangle$
for all $x, y \in \mathcal{D}_A$. If $A$ is symmetric, then $A^+ \supset A$. If $A$ is self-adjoint, then it
is also symmetric. From $A^+ \supset A$ it follows that for a symmetric $A$, $\mathcal{D}_{A^+}$ is dense in
$\mathcal{H}$ and that $A^{++}$ exists and satisfies $A^{++} \subset A^+ = (A^+)^{++} = (A^{++})^+$;
therefore $A^{++}$ is also symmetric. A symmetric $A$ is said to be maximal if $A$
has no proper symmetric extension. A maximal $A$ must therefore satisfy $A^{++} = A$. A
self-adjoint operator is maximal since $B \supset A$ implies $B^+ \subset A^+ = A$ and
$B \subset B^+ \subset A$, that is, $B = A$. If $A^{++} = A^+$ then $A^{++}$, the closure of $A$, is
self-adjoint. Then we say that $A$ is essentially self-adjoint. It is easy to show that
the operator $A$ defined by (10.2) is self-adjoint. If we formulate quantum
mechanics in the manner presented in this book, then unbounded operators
occur in the representation of Lie groups, namely, as infinitesimal
transformations (see VII and VIII). As such, they are defined on the basis of
representation theory (see VII, VIII and [10]) in the form (10.2). However,
the introduction of unbounded scale observables (for example: the position
and momentum observables introduced in VII) occurs on account of the fact
that first a decision observable $E \to G$ was defined and then a spectral family
is defined. Then the operator $A$ is introduced according to (10.2), so to speak,
as a "condensation" of the spectral family. For the formulation of quantum
mechanics presented here it is sufficient to consider only (10.2). For
applications it is important to note that the so-called Hamiltonian operator
$H$, that is, the operator for the infinitesimal time translation, is only
determined by the Galileo group for elementary systems (see VII, §2). For
composite systems, on the other hand, it is necessary to discover the
Hamiltonian $H$. In this way we discover an operator which is not always well
defined, as, for example, we find in VIII (5.8), that is, $\mathcal{D}_H$ is not fully known,
and even less so the spectral family $E(\lambda)$, which is representable in terms of a
time displacement by the unitary operator
$$U(t) = \int e^{i\lambda t} \, dE(\lambda) = e^{iHt}.$$
Often we must seek to find an extension for an operator $H$ which was only
given as a symmetric operator. The "correct" extension can only be obtained
on the basis of "physical" considerations. The extension must be a
self-adjoint operator (or at least essentially self-adjoint); only in this way is it
possible to uniquely define the spectral family and the operator $e^{iHt}$ as a
unitary operator. The problem of the definition of $H$ is especially important
for the term scheme of atoms (see XI-XV) and for scattering theory (see XVI).
In the following we shall consider the circumstances under which a
symmetric operator can be extended to a self-adjoint operator. According to the
previous considerations it therefore follows that the maximal symmetric
operators which are not self-adjoint are "physically useless."
We now proceed from a symmetric operator $A$ and seek to define
$(A - i1)(A + i1)^{-1}$. This form is motivated by the Cayley transform
$w = (z - i)/(z + i)$ which maps the real $z$-axis onto the unit circle in the
$w$-plane.
For $x \in \mathcal{D}_A$ we obtain
$$\|(A + i1)x\|^2 = \|Ax\|^2 + \|x\|^2 \ge \|x\|^2. \tag{10.3}$$
Thus we find that $(A + i1)x = 0$ only for $x = 0$, that is, $(A + i1)^{-1}$ exists
where $\mathcal{D}_{(A + i1)^{-1}} = (A + i1)\mathcal{D}_A$. Therefore the operator
$$U = (A - i1)(A + i1)^{-1} \tag{10.4}$$
is well defined with $\mathcal{D}_U = (A + i1)\mathcal{D}_A$. It directly follows that
$\mathcal{W}_U = (A - i1)\mathcal{D}_A$. Each $y \in \mathcal{D}_U$, therefore, is of the form $y = (A + i1)x$
where $x \in \mathcal{D}_A$ and, for $y' = Uy$ it follows that $y' = (A - i1)x$. Thus, from (10.3) it
follows that $\|Uy\| = \|y\|$, that is, $U$ is an isometric operator on $\mathcal{D}_U$ (if
$\mathcal{D}_U = \mathcal{H}$ then $U$ is, in the sense of §7, an isometric operator).
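The scalar version of the Cayley transform can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample points are hypothetical, and the real numbers merely stand in for spectral values of a symmetric operator. It illustrates that real $z$ are mapped onto the unit circle and that $z = i(1 + w)/(1 - w)$ inverts the map.

```python
def cayley(z: complex) -> complex:
    """Scalar Cayley transform w = (z - i) / (z + i)."""
    return (z - 1j) / (z + 1j)


def inverse_cayley(w: complex) -> complex:
    """Inverse map z = i (1 + w) / (1 - w), defined for w != 1."""
    return 1j * (1 + w) / (1 - w)


# Hypothetical sample points on the real axis (stand-ins for spectral values).
samples = [-100.0, -2.5, -1.0, 0.0, 0.3, 1.0, 7.0, 1.0e4]

images = [cayley(z) for z in samples]
moduli = [abs(w) for w in images]                 # each is 1 up to rounding
recovered = [inverse_cayley(w) for w in images]   # each reproduces its sample

for z, w, r in zip(samples, images, recovered):
    print(f"z = {z:>9}  |w| = {abs(w):.12f}  recovered z = {r.real:.6f}")
```

The point $w = 1$ is never reached for finite real $z$, which mirrors the role of the inverse of $1 - U$ in the operator setting.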
From $Uy = (A - i1)x$ and $y = (A + i1)x$ it follows that $x = (1 - U)y/2i$
and that $Ax = (1 + U)y/2$. $(1 - U)^{-1}$ exists, because from $(1 - U)y = 0$ it
follows that $x = 0$ and that $y = (A + i1)0 = 0$. Thus we find that
$$A = i(1 + U)(1 - U)^{-1} \tag{10.5}$$
since $\mathcal{D}_A = (1 - U)\mathcal{D}_U$.
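Conversely, suppose that $U$ is an isometric operator in a subspace $\mathcal{D}_U$.
Then it follows that $\langle x, y\rangle = \langle Ux, Uy\rangle$ for all $x, y \in \mathcal{D}_U$. Since $U$ is
isometric, $U^{-1}$ is defined in $\mathcal{W}_U$. From $(1 - U)y = 0$ it follows that for
$z \in (1 - U^{-1})\mathcal{W}_U = (1 - U^{-1})U\mathcal{D}_U = (U - 1)\mathcal{D}_U = (1 - U)\mathcal{D}_U$, that is, for
$z = (1 - U^{-1})x$ and $x \in \mathcal{W}_U$ we obtain
$$\langle z, y\rangle = \langle x, y\rangle - \langle U^{-1}x, y\rangle = \langle x, y\rangle - \langle x, Uy\rangle
  = \langle x, (1 - U)y\rangle = 0.$$
If $(1 - U)\mathcal{D}_U$ is dense in $\mathcal{H}$, then from $\langle z, y\rangle = 0$ for all
$z \in (1 - U)\mathcal{D}_U$ it follows that $y = 0$.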
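If $U$ is an isometric operator in $\mathcal{D}_U$ and $(1 - U)\mathcal{D}_U$ is dense in $\mathcal{H}$, then from
(10.5) an operator $A$ is defined for which $\mathcal{D}_A = (1 - U)\mathcal{D}_U$. Therefore $\mathcal{D}_A$ is
dense in $\mathcal{H}$ and $A^+$ is defined. For $x, y \in \mathcal{D}_A$, that is, $x = (1 - U)v$,
$y = (1 - U)w$, where $w, v \in \mathcal{D}_U$, it follows that
$$\langle x, Ay\rangle = \langle (1 - U)v, i(1 + U)w\rangle = i\langle v, w\rangle - i\langle Uv, Uw\rangle
  + i\langle v, Uw\rangle - i\langle Uv, w\rangle = i\langle v, Uw\rangle - i\langle Uv, w\rangle$$
and
$$\langle Ax, y\rangle = \langle i(1 + U)v, (1 - U)w\rangle = -i\langle v, w\rangle + i\langle Uv, Uw\rangle
  + i\langle v, Uw\rangle - i\langle Uv, w\rangle = i\langle v, Uw\rangle - i\langle Uv, w\rangle,$$
that is, $\langle x, Ay\rangle = \langle Ax, y\rangle$; $A$ is therefore symmetric.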
For an operator $A$ defined as above by $U$, it follows that (10.4) holds.
Symmetric operators $A$ and isometric operators $U$ for which $(1 - U)\mathcal{D}_U$ is
dense in $\mathcal{H}$ are uniquely related by (10.4) and (10.5) and $\mathcal{D}_U = (A + i1)\mathcal{D}_A$
and $\mathcal{D}_A = (1 - U)\mathcal{D}_U$. Each proper symmetric extension of $A$ leads to a
proper extension of $U$ and vice versa. In this way the problem of symmetric
extensions of $A$ is directly related to that of isometric extensions of $U$.
From (10.3) it follows that for $z_n = (A + i1)x_n$ the sequence $z_n$ is
convergent only if the sequences $Ax_n$ and $x_n$ are convergent. If $A$ is closed, then
$\mathcal{D}_U$ is closed (and since $U$ is continuous, $U$ is also closed). Therefore $\mathcal{D}_U$ and
$\mathcal{W}_U$ are closed subspaces of $\mathcal{H}$. Conversely, if $\mathcal{D}_U$ is closed (and therefore
$\mathcal{W}_U$ is also closed) then it follows that $A$ is closed. Since $A^{++}$ is the smallest closed
extension of $A$, $A^{++}$ corresponds to that extension of $U$ for which the
domain of definition is the closure of $\mathcal{D}_U$; this extension of $U$ is uniquely
determined. For this reason we need only consider closed symmetric
operators $A$. $\mathcal{D}_U$ and $\mathcal{W}_U$ are therefore closed subspaces; $\mathcal{D}_U^\perp$ and
$\mathcal{W}_U^\perp$ are called the defect spaces of $A$. For $x \in \mathcal{D}_U^\perp$ and $y \in \mathcal{D}_A$ it
follows that $\langle x, (A + i1)y\rangle = 0$, that is, $\langle x, Ay\rangle = \langle ix, y\rangle$. In this way
$\mathcal{D}_U^\perp \subset \mathcal{D}_{A^+}$ and $A^+x = ix$ for $x \in \mathcal{D}_U^\perp$. In the same way, for
$z \in \mathcal{W}_U^\perp$ we obtain $A^+z = -iz$. If, conversely, $A^+x = ix$, then, for all
$y \in \mathcal{D}_A$ it follows that $\langle x, (A + i1)y\rangle = 0$,
that is, $x \in \mathcal{D}_U^\perp$. Since $A = A^+$ for a self-adjoint operator, $\langle x, Ax\rangle$ is real; so
that we conclude that $\mathcal{D}_U^\perp = \{0\}$, $\mathcal{W}_U^\perp = \{0\}$, that is,
$\mathcal{D}_U = \mathcal{W}_U = \mathcal{H}$.
According to §7 this is equivalent to the condition that $U$ is unitary.
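We now assume the converse, that is, that $U$ is unitary, that is,
$\mathcal{D}_U = \mathcal{W}_U = \mathcal{H}$; then $A$ is self-adjoint. In order to show this we prove the
following for a closed and symmetric $A$:
$\mathcal{D}_{A^+} = \mathcal{D}_A + \mathcal{D}_U^\perp + \mathcal{W}_U^\perp$. For
$x \in \mathcal{D}_{A^+}$ we partition $(A^+ + i1)x$ into components in $\mathcal{D}_U$ and in
$\mathcal{D}_U^\perp$, that is, $(A^+ + i1)x = (A + i1)z + y$ where $z \in \mathcal{D}_A$ and
$y \in \mathcal{D}_U^\perp$. Since $Az = A^+z$ and $A^+y = iy$ it follows that
$(A^+ + i1)x = (A^+ + i1)z + (A^+ + i1)y'$ where $y' = (1/2i)y$. Therefore
$A^+(x - z - y') = -i(x - z - y')$, that is, $x - z - y' \in \mathcal{W}_U^\perp$. Therefore
$x = z + y' + r$, where $z \in \mathcal{D}_A$, $y' \in \mathcal{D}_U^\perp$ and $r \in \mathcal{W}_U^\perp$. Thus we
have proven $\mathcal{D}_{A^+} = \mathcal{D}_A + \mathcal{D}_U^\perp + \mathcal{W}_U^\perp$.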
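If $\mathcal{D}_U^\perp + \mathcal{W}_U^\perp \ne \{0\}$, then $\mathcal{D}_A$ is a proper subset of
$\mathcal{D}_{A^+}$ because for $y \in \mathcal{D}_U^\perp$, $r \in \mathcal{W}_U^\perp$, from the assumption that
$y + r \in \mathcal{D}_A$ it follows that $(A + i1)(y + r) = (A^+ + i1)(y + r) = 2iy$ and
$(A - i1)(y + r) = (A^+ - i1)(y + r) = -2ir$. Since $\mathcal{D}_U = (A + i1)\mathcal{D}_A$ we obtain
$y \in \mathcal{D}_U$ and, since $y \in \mathcal{D}_U^\perp$: $y = 0$. From $(A - i1)\mathcal{D}_A = \mathcal{W}_U$ it also
follows that $r = 0$. Therefore the partition $x = z + y + r$, where $x \in \mathcal{D}_{A^+}$, uniquely
defines $z \in \mathcal{D}_A$, $y \in \mathcal{D}_U^\perp$, $r \in \mathcal{W}_U^\perp$ and we find that
$\mathcal{D}_A + \mathcal{D}_U^\perp + \mathcal{W}_U^\perp$ is a direct sum of subspaces.
If $U$ is unitary, then $A^+ = A$.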
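Since there is a 1:1 correspondence between the extensions of the isometric
operators $\mathcal{D}_U \to \mathcal{W}_U$ and the extensions of $A$, we need only investigate the
possibilities of finding isometric extensions of $U$. From the isometry it easily
follows that, for an extension $V$ of $U$, $V$ maps $\mathcal{D}_V \cap \mathcal{D}_U^\perp$ isometrically onto
$\mathcal{W}_V \cap \mathcal{W}_U^\perp$, that is, for $\mathcal{T} = \mathcal{D}_V \cap \mathcal{D}_U^\perp$ and
$\mathcal{S} = \mathcal{W}_V \cap \mathcal{W}_U^\perp$ the mapping $\mathcal{T} \to \mathcal{S}$ is an isomorphism (in the sense
of §13). Therefore $\mathcal{T}$ and $\mathcal{S}$ have the same dimension. Conversely, if $\mathcal{T}$ and
$\mathcal{S}$ are subspaces having the same dimension for which $\mathcal{T} \perp \mathcal{D}_U$,
$\mathcal{S} \perp \mathcal{W}_U$, then we obtain an isometric extension $V$ of $U$ with the aid of one of
infinitely many isomorphic maps $\mathcal{T} \to \mathcal{S}$ by $V(x + y) = Ux + Vy$ for
$x \in \mathcal{D}_U$, $y \in \mathcal{T}$, which satisfies $\mathcal{D}_V = \mathcal{D}_U \oplus \mathcal{T}$,
$\mathcal{W}_V = \mathcal{W}_U \oplus \mathcal{S}$.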
Therefore, $A$ has a self-adjoint extension if and only if $\mathcal{D}_U^\perp$ and $\mathcal{W}_U^\perp$ have
the same dimension. If $A$ is not self-adjoint, then there exist infinitely many
self-adjoint extensions of $A$. Therefore, if we wish to discover a Hamiltonian
operator $H$ (for example, in the form VIII (5.8)), it is not sufficient to show
that $H$ is symmetric in a certain domain of definition $\mathcal{D}_H$. Indeed, if the defect
spaces $\mathcal{D}_U^\perp$, $\mathcal{W}_U^\perp$ do not have the same dimension, the operator cannot be used
as a Hamiltonian operator. If, however, the dimensions of $\mathcal{D}_U^\perp$ and $\mathcal{W}_U^\perp$ are the
same, then we are still not finished, as long as $\mathcal{D}_U \ne \mathcal{H}$. If
$\mathcal{D}_U^\perp = \{0\} = \mathcal{W}_U^\perp$, then $H^{++}$ is a self-adjoint operator, that is, $H$ is
essentially self-adjoint. Such an $H$ is as good as a self-adjoint $H$ because there is a
uniquely defined closure of $H$ which is self-adjoint. The search procedure for $H$ can only
be carried out using "physical considerations" because, in the case that $H$ is not
essentially self-adjoint, there are infinitely many possible extensions. If $H$ is essentially
self-adjoint, then we can use the operator $H^{++}$ instead of the operator $H$.
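The condition that $A$ is self-adjoint is equivalent to the condition that $U$ is
unitary. If $U$ is unitary then, according to §8 there exists a spectral family
$D(\varphi)$ which satisfies (8.6). From (10.5) it follows that
$$A = \int_0^{2\pi} i\,\frac{1 + e^{i\varphi}}{1 - e^{i\varphi}} \, dD(\varphi)
    = \int_0^{2\pi} \left(-\cot\frac{\varphi}{2}\right) dD(\varphi)$$
because from $\mathcal{D}_A = (1 - U)\mathcal{H}$ for $x = (1 - U)y$ (with arbitrary $y$) we obtain
$D(\varphi)x = \int_0^{\varphi} (1 - e^{i\varphi'}) \, dD(\varphi')y$ and we obtain
$$\|D(\varphi)x\|^2 = \int_0^{\varphi} |1 - e^{i\varphi'}|^2 \, d\|D(\varphi')y\|^2
  = \int_0^{\varphi} 4\sin^2\frac{\varphi'}{2} \, d\|D(\varphi')y\|^2.$$
From which it follows that
$$\int_0^{2\pi} \cot^2\frac{\varphi}{2} \, d\|D(\varphi)x\|^2
  = \int_0^{2\pi} 4\cot^2\frac{\varphi}{2}\,\sin^2\frac{\varphi}{2} \, d\|D(\varphi)y\|^2
  = \int_0^{2\pi} 4\cos^2\frac{\varphi}{2} \, d\|D(\varphi)y\|^2 < \infty,$$
that is, $\int_0^{2\pi} (-\cot(\varphi/2)) \, dD(\varphi)x$ exists. For $\lambda = -\cot(\varphi/2)$ and
$E(\lambda) = D(-2\,\mathrm{arccot}\,\lambda)$ it follows that
$$A = \int_{-\infty}^{\infty} \lambda \, dE(\lambda), \tag{10.6}$$
where the integral is defined as an operator in $\mathcal{D}_A = (1 - U)\mathcal{H}$. Since
$i1 + \int \lambda \, dE(\lambda)$ maps the subspace $(1 - U)\mathcal{H}$ surjectively onto $\mathcal{H}$, (10.6)
cannot converge for vectors other than those of $\mathcal{D}_A = (1 - U)\mathcal{H}$.
The uniqueness of the spectral family follows in a similar fashion as that
described in §8.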
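The change of variable from the angle $\varphi$ to the spectral parameter of $A$ rests on two scalar identities, $i(1 + e^{i\varphi})/(1 - e^{i\varphi}) = -\cot(\varphi/2)$ and $|1 - e^{i\varphi}|^2 = 4\sin^2(\varphi/2)$. The following is a small numerical sketch of both, with hypothetical function names and sample angles chosen only for illustration.

```python
import cmath
import math


def cayley_symbol(phi: float) -> complex:
    """i (1 + e^{i phi}) / (1 - e^{i phi}) for 0 < phi < 2 pi."""
    u = cmath.exp(1j * phi)
    return 1j * (1 + u) / (1 - u)


def minus_cot_half(phi: float) -> float:
    """-cot(phi / 2), written as -cos/sin so that it stays finite at phi = pi."""
    return -math.cos(phi / 2) / math.sin(phi / 2)


# Hypothetical sample angles strictly between 0 and 2 pi.
phis = [0.1, 0.5, 1.0, 2.0, math.pi, 4.0, 6.0]

# Gap between the two sides of the first identity.
symbol_gap = [abs(cayley_symbol(p) - minus_cot_half(p)) for p in phis]

# Gap between the two sides of |1 - e^{i phi}|^2 = 4 sin^2(phi / 2).
modulus_gap = [
    abs(abs(1 - cmath.exp(1j * p)) ** 2 - 4 * math.sin(p / 2) ** 2) for p in phis
]

print(max(symbol_gap), max(modulus_gap))
```

Both gaps are at the level of floating-point rounding, which is consistent with the integrand $4\cot^2(\varphi/2)\sin^2(\varphi/2) = 4\cos^2(\varphi/2)$ being bounded.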
11 The Trace as a Bilinear Form
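Let $A \in \mathcal{L}_r^+(\mathcal{H})$ and let $x_\nu$ be a complete orthonormal basis. The sum
$$\sum_\nu \langle x_\nu, Ax_\nu\rangle \tag{11.1}$$
defines a real number $\ge 0$ or $+\infty$. For another complete orthonormal basis
$y_\mu$ we obtain
$$\langle y_\mu, Ay_\mu\rangle = \|A^{1/2}y_\mu\|^2 = \sum_\nu |\langle x_\nu, A^{1/2}y_\mu\rangle|^2
  = \sum_\nu |\langle y_\mu, A^{1/2}x_\nu\rangle|^2$$
where $A^{1/2}$ is the positive square root of $A$. Thus it follows that
$$\sum_{\mu=1}^{N} \langle y_\mu, Ay_\mu\rangle
  = \sum_{\mu=1}^{N} \sum_{\nu=1}^{\infty} |\langle y_\mu, A^{1/2}x_\nu\rangle|^2
  \le \sum_{\nu=1}^{\infty} \sum_{\mu=1}^{\infty} |\langle y_\mu, A^{1/2}x_\nu\rangle|^2
  = \sum_\nu \|A^{1/2}x_\nu\|^2 = \sum_\nu \langle x_\nu, Ax_\nu\rangle,$$
and, if we exchange the $x_\nu$ and $y_\mu$, we obtain
$$\sum_{\mu=1}^{\infty} \langle y_\mu, Ay_\mu\rangle = \sum_{\nu=1}^{\infty} \langle x_\nu, Ax_\nu\rangle.$$
We find that (11.1) is independent of the choice of $x_\nu$; we call this invariant
the trace of $A$, and denote it by $\mathrm{tr}(A)$; for $A \in \mathcal{L}_r^+(\mathcal{H})$, $\mathrm{tr}(A)$ is a
nonnegative real number or $+\infty$.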
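In finite dimensions the basis independence of the sum (11.1) can be verified directly. The sketch below is only an illustration under assumed data: the positive matrix `A` and the second orthonormal basis are hypothetical choices, and the helper names are not taken from the text.

```python
import math

# Hypothetical positive definite symmetric matrix on R^3 (leading minors 2, 5, 4.5).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 0.5],
     [0.0, 0.5, 1.0]]


def matvec(m, v):
    """Matrix-vector product for lists of rows."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def dot(u, v):
    """Real inner product."""
    return sum(a * b for a, b in zip(u, v))


def diagonal_sum(m, basis):
    """Sum of <x, m x> over the vectors x of an orthonormal basis."""
    return sum(dot(x, matvec(m, x)) for x in basis)


s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
standard_basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
other_basis = [(1 / s3, 1 / s3, 1 / s3),
               (1 / s2, -1 / s2, 0.0),
               (1 / s6, 1 / s6, -2 / s6)]

trace_standard = diagonal_sum(A, standard_basis)
trace_other = diagonal_sum(A, other_basis)
print(trace_standard, trace_other)
```

Both sums agree with the sum of the diagonal entries of `A`; in the infinite-dimensional case the positivity of $A$ is what permits the exchange of the two summations.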
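For an arbitrary $A \in \mathcal{L}_r(\mathcal{H})$ we define
$$\|A\|_s = \mathrm{tr}(\sqrt{A^2}) = \mathrm{tr}(A_+) + \mathrm{tr}(A_-), \tag{11.2}$$
where $\sqrt{A^2}$ is the positive root and $A_+$, $A_-$ are the positive and negative
parts of $A$.
We shall denote by $\mathcal{B}(\mathcal{H})$ the subset of all $A \in \mathcal{L}_r(\mathcal{H})$ for which
$\|A\|_s < +\infty$. The operators in $\mathcal{B}(\mathcal{H})$ are called "operators of trace class."
For $A \in \mathcal{B}(\mathcal{H})$ both $\mathrm{tr}(A_+)$ and $\mathrm{tr}(A_-)$ are finite. Therefore
$\mathrm{tr}(A)$ exists because
$\mathrm{tr}(A) = \sum_\nu \langle x_\nu, Ax_\nu\rangle = \mathrm{tr}(A_+) - \mathrm{tr}(A_-)$. We will now prove that
(11.2) is a norm in $\mathcal{B}(\mathcal{H})$ and that $\mathcal{B}(\mathcal{H})$ is a base norm space (see AIII, §6)
and a Banach space.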
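We will now show that
$$\|A\|_s = \sup_{E_1, E_2} \{\mathrm{tr}(E_1AE_1) - \mathrm{tr}(E_2AE_2)\}, \tag{11.3}$$
where the supremum is taken over all projection operators $E_1, E_2$ with finite
dimension of $E_1\mathcal{H}$, $E_2\mathcal{H}$, so that $\|A\|_s$ is a norm. It easily follows that for
$A \in \mathcal{L}_r(\mathcal{H})$ and finite-dimensional $E\mathcal{H}$ the expressions $\mathrm{tr}(EA_+E)$,
$\mathrm{tr}(EA_-E)$ and $\mathrm{tr}(EAE)$ exist and are finite. In particular, for $E = P_x$ we obtain
$$\mathrm{tr}(P_xAP_x) = \mathrm{tr}(P_xA) = \mathrm{tr}(AP_x) = \langle x, Ax\rangle.$$
For $A \in \mathcal{L}_r^+(\mathcal{H})$ we obtain $EAE \in \mathcal{L}_r^+(\mathcal{H})$. For finite or infinite dimensional
$E\mathcal{H}$ it easily follows that $\mathrm{tr}(EAE) \le \mathrm{tr}(A)$. Let $E(\lambda)$ denote the spectral family
of $A \in \mathcal{L}_r(\mathcal{H})$; then $A_- = -E(0)AE(0)$ and $A_+ = (1 - E(0))A(1 - E(0))$.
From (11.2) it follows that
$$\|A\|_s = \mathrm{tr}((1 - E(0))A(1 - E(0))) - \mathrm{tr}(E(0)AE(0)).$$
Let $x_\nu$ be a complete orthonormal basis for $(1 - E(0))\mathcal{H} = (E(0)\mathcal{H})^\perp$ and let
$y_\mu$ be a complete orthonormal basis for $E(0)\mathcal{H}$; it therefore follows that
$$\|A\|_s = \sum_\nu \langle x_\nu, Ax_\nu\rangle - \sum_\mu \langle y_\mu, Ay_\mu\rangle.$$
Let $E_{1N}$, $E_{2M}$ be the projections onto the space spanned by $x_1, \ldots, x_N$ and
$y_1, \ldots, y_M$, respectively. We therefore find that
$$\|A\|_s = \lim_{\substack{N \to \infty \\ M \to \infty}}
  [\mathrm{tr}(E_{1N}AE_{1N}) - \mathrm{tr}(E_{2M}AE_{2M})].$$
Therefore, using the supremum from (11.3) we obtain
$$\sup_{E_1, E_2} \{\mathrm{tr}(E_1AE_1) - \mathrm{tr}(E_2AE_2)\} \ge \|A\|_s.$$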
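Since $0 \le \mathrm{tr}(EA_+E) \le \mathrm{tr}(A_+)$ and $0 \le \mathrm{tr}(EA_-E) \le \mathrm{tr}(A_-)$ it follows that
$$\mathrm{tr}(EAE) = \mathrm{tr}(EA_+E) - \mathrm{tr}(EA_-E) \le \mathrm{tr}(A_+),$$
$$-\mathrm{tr}(EAE) = -\mathrm{tr}(EA_+E) + \mathrm{tr}(EA_-E) \le \mathrm{tr}(A_-)$$
and we obtain $\mathrm{tr}(E_1AE_1) - \mathrm{tr}(E_2AE_2) \le \mathrm{tr}(A_+) + \mathrm{tr}(A_-)$ from which it
follows that $\sup_{E_1, E_2} \{\mathrm{tr}(E_1AE_1) - \mathrm{tr}(E_2AE_2)\} \le \|A\|_s$. Therefore we have
proven (11.3). $\mathcal{B}(\mathcal{H})$ is therefore a normed space with norm $\|\cdot\|_s$. $\mathcal{B}(\mathcal{H})$ is
also a Banach space.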
From $\|A\| = \sup_{\|x\| \le 1} |\langle x, Ax\rangle|$ it is easy to show that $\|A\|_s \ge \|A\|$. Thus it
follows that a Cauchy sequence $A_\nu \in \mathcal{B}(\mathcal{H})$ with respect to the norm $\|\cdot\|_s$ is
also a Cauchy sequence with respect to the norm $\|\cdot\|$. Therefore there exists
an $A \in \mathcal{L}_r(\mathcal{H})$ such that $A_\nu \to A$. We will now show that $\|A - A_\nu\|_s \to 0$
(from this result and $\|A\|_s \le \|A - A_\nu\|_s + \|A_\nu\|_s$ it follows that $\|A\|_s < \infty$).
We now assume (11.3) and consider the case in which $E\mathcal{H}$ has dimension
$N$:
$$|\mathrm{tr}(E(A - A_\nu)E)| \le |\mathrm{tr}(E(A - A_\rho)E)| + |\mathrm{tr}(E(A_\rho - A_\nu)E)|
  \le \|A - A_\rho\| N + \|A_\rho - A_\nu\|_s.$$
For a given $\varepsilon > 0$ choose $M$ such that $\|A_\rho - A_\nu\|_s < \varepsilon$ for $\rho, \nu > M$.
Therefore it follows that
$$|\mathrm{tr}(E(A - A_\nu)E)| \le \varepsilon + N\|A - A_\rho\| \tag{11.4}$$
for all $\rho, \nu > M$ and all $E$ (dimension of $E\mathcal{H} \le N$), where $M$ does not depend
on $N$.
Thus, for two projections $E_1, E_2$ for fixed $\nu > M$ we obtain
$$\mathrm{tr}(E_1(A - A_\nu)E_1) - \mathrm{tr}(E_2(A - A_\nu)E_2) \le 2\varepsilon + (N_1 + N_2)\|A - A_\rho\|,$$
where $N_1, N_2$ are the dimensions of $E_1\mathcal{H}$ and $E_2\mathcal{H}$. For $\rho \to \infty$ it follows
that
$$\mathrm{tr}(E_1(A - A_\nu)E_1) - \mathrm{tr}(E_2(A - A_\nu)E_2) \le 2\varepsilon$$
and we obtain $\|A - A_\nu\|_s \le 2\varepsilon$ for all $\nu > M$. Therefore $\|A - A_\nu\|_s$ is finite,
and $\|A - A_\nu\|_s \to 0$. Therefore $\mathcal{B}(\mathcal{H})$ is a Banach space.
Thus for $A \in \mathcal{B}(\mathcal{H})$ it is necessary and sufficient that $A$ has the form (9.2)
(that is, it has only a discrete spectrum) and $\sum_\nu |\mu_\nu| < \infty$ (that is, $A$ must also
be compact).
Proof. From (9.2) it follows that $\|A\|_s = \sum_\nu |\mu_\nu|$. If $A \in \mathcal{B}(\mathcal{H})$ and if, as in §9, for
the spectral representation, there exists an interval $\lambda, \lambda + \varepsilon$ for which
$E(\lambda + \varepsilon) - E(\lambda)$ is infinite dimensional, then for a complete orthonormal basis $x_\nu$
from $[E(\lambda + \varepsilon) - E(\lambda)]\mathcal{H}$ it follows that
$$\|A\|_s \ge \sum_\nu |\langle x_\nu, Ax_\nu\rangle| = \infty.$$
Therefore $A$ must have the form (9.2).
$\mathcal{B}(\mathcal{H})$ is an ordered vector space with positive cone $\mathcal{B}_+(\mathcal{H})$ of all $A \ge 0$.
The set
$$K = \{W \mid W \ge 0, \ \mathrm{tr}(W) = 1\} \tag{11.5}$$
is a base for this cone (see AIII, §6) because from $A \ge 0$, $A \ne 0$ it follows that
$A\|A\|_s^{-1} = A(\mathrm{tr}(A))^{-1} \in K$. For an $A \in \mathcal{B}(\mathcal{H})$ it follows that
$A = \|A_+\|_s W_1 - \|A_-\|_s W_2$ where $W_1 = A_+\|A_+\|_s^{-1}$ and
$W_2 = A_-\|A_-\|_s^{-1}$; $K$
therefore has the minimal decomposition property (see AIII, §6). Thus it
follows directly that the unit sphere of $\mathcal{B}(\mathcal{H})$ is equal to $\mathrm{co}(K \cup -K)$ and
that $\mathcal{B}(\mathcal{H})$ is a base norm space. We shall now show that $\mathcal{B}(\mathcal{H})$ is
norm-separable. It suffices to show that $K$ is norm-separable. From
$W = \sum_\nu w_\nu P_{x_\nu}$ where $w_\nu \ge 0$, $\sum_\nu w_\nu = 1$ for $W \in K$, it suffices to show that
the set of all $P_x$ in $K$ is norm-separable. Since $P_{x_1} - P_{x_2}$ is nonzero only in a
two-dimensional space generated by $x_1$ and $x_2$, it easily follows that
$\|P_{x_1} - P_{x_2}\|_s \le 2\|P_{x_1} - P_{x_2}\|$ where $\|P_{x_1} - P_{x_2}\|$ is the operator norm in
$\mathcal{L}(\mathcal{H})$. Since $\|P_{x_1} - P_{x_2}\| \le 2\|x_1 - x_2\|$ it therefore follows that
$\|P_{x_1} - P_{x_2}\|_s \le 4\|x_1 - x_2\|$. Therefore, the set of $P_x$ is norm-separable as a subset
of $\mathcal{B}(\mathcal{H})$ if the set of $x$ for which $\|x\| = 1$ is separable as a subset of $\mathcal{H}$. This is
indeed the case, because $\mathcal{H}$ is separable.
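The chain of estimates for the difference of two one-dimensional projections can be illustrated in the real plane. The following sketch is an assumption-laden toy case (real two-dimensional vectors, hypothetical helper names): it computes the trace norm and operator norm of the difference of two projectors from the eigenvalues of a symmetric 2x2 matrix and compares them with the distance of the unit vectors.

```python
import math


def projector(x):
    """Matrix of the projection onto the unit vector x in R^2."""
    return [[x[0] * x[0], x[0] * x[1]], [x[1] * x[0], x[1] * x[1]]]


def eigenvalues_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix in closed form."""
    mean = (m[0][0] + m[1][1]) / 2
    radius = math.hypot((m[0][0] - m[1][1]) / 2, m[0][1])
    return mean - radius, mean + radius


def trace_norm(m):
    return sum(abs(e) for e in eigenvalues_sym2(m))


def operator_norm(m):
    return max(abs(e) for e in eigenvalues_sym2(m))


x1 = (1.0, 0.0)
rows = []  # (trace norm, operator norm, vector distance) per sample angle
for angle in [0.0, 0.01, 0.3, 1.0, math.pi / 2, 2.5, math.pi]:
    x2 = (math.cos(angle), math.sin(angle))
    diff = [[a - b for a, b in zip(r1, r2)]
            for r1, r2 in zip(projector(x1), projector(x2))]
    distance = math.hypot(x1[0] - x2[0], x1[1] - x2[1])
    rows.append((trace_norm(diff), operator_norm(diff), distance))

for t, o, d in rows:
    print(f"trace norm {t:.6f}  operator norm {o:.6f}  distance {d:.6f}")
```

In this toy case the first inequality is attained with equality, since the difference of the two projectors has eigenvalues of equal magnitude and opposite sign.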
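A central and most important result is that, for $A \in \mathcal{B}(\mathcal{H})$, $B \in \mathcal{L}_r(\mathcal{H})$, the
expression $\mathrm{tr}(AB)$ is a continuous bilinear form on $\mathcal{B}(\mathcal{H}) \times \mathcal{L}_r(\mathcal{H})$ and that
each continuous linear form over $\mathcal{B}(\mathcal{H})$ is of the form $\mathrm{tr}(AB)$ with suitably
chosen $B \in \mathcal{L}_r(\mathcal{H})$. Therefore $\mathcal{L}_r(\mathcal{H})$ may be identified with the dual Banach
space $\mathcal{B}'(\mathcal{H})$ for $\mathcal{B}(\mathcal{H})$, and we find that $|\mathrm{tr}(AB)| \le \|A\|_s \|B\|$.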
We now prove this result in several steps. First, we show that
$$\sum_\nu \langle x_\nu, ABx_\nu\rangle = \sum_\nu \langle x_\nu, BAx_\nu\rangle = \mathrm{tr}(AB) = \mathrm{tr}(BA)$$
is convergent and is independent of the choice $x_\nu$ of a complete orthonormal
basis. It suffices to show that this result is satisfied for $A_+$ and $A_-$ where
$A = A_+ - A_-$, that is, for an $A \ge 0$.
Suppose $|\mathrm{tr}(A_+B)| \le \|A_+\|_s\|B\|$ and $|\mathrm{tr}(A_-B)| \le \|A_-\|_s\|B\|$; then it directly
follows that $|\mathrm{tr}(AB)| \le \|A\|_s\|B\|$.
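We shall now show that
$$\sum_\nu \langle x_\nu, A^{1/2}BA^{1/2}x_\nu\rangle = \mathrm{tr}(A^{1/2}BA^{1/2}) \tag{11.6}$$
is convergent. This result follows directly from
$$\sum_{\nu=1}^{\infty} |\langle x_\nu, A^{1/2}BA^{1/2}x_\nu\rangle|
  = \sum_{\nu=1}^{\infty} |\langle A^{1/2}x_\nu, BA^{1/2}x_\nu\rangle|
  \le \|B\| \sum_{\nu=1}^{\infty} \|A^{1/2}x_\nu\|^2
  \le \|B\| \, \mathrm{tr}(A) = \|B\| \, \|A\|_s.$$
Substituting $B_+$ for $B$, we find that $\mathrm{tr}(A^{1/2}B_+A^{1/2}) < \infty$, that is,
$A^{1/2}B_+A^{1/2} \in \mathcal{B}_+(\mathcal{H})$. Similarly we find that $A^{1/2}B_-A^{1/2} \in \mathcal{B}_+(\mathcal{H})$ and we
obtain $A^{1/2}BA^{1/2} \in \mathcal{B}(\mathcal{H})$, and we have proven (11.6).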
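We will now show that
$$\sum_\nu \langle x_\nu, A^{1/2}BA^{1/2}x_\nu\rangle = \sum_\nu \langle x_\nu, BAx_\nu\rangle. \tag{11.7}$$
If this is the case, the right-hand side is independent of the orthonormal basis
because the left-hand side is, according to (11.6). That is, the existence of
$\mathrm{tr}(BA)$ follows from (11.7). Since $A^{1/2}BA^{1/2}$ is self-adjoint, the left-hand side
of (11.7) is real, and therefore so is the right-hand side. Since
$\langle x_\nu, BAx_\nu\rangle = \overline{\langle x_\nu, ABx_\nu\rangle}$, from the existence of $\mathrm{tr}(BA)$ the existence
of $\mathrm{tr}(AB)$ is guaranteed and $\mathrm{tr}(AB) = \mathrm{tr}(BA)$. From the previous results we conclude that
$|\mathrm{tr}(BA)| \le \|A\|_s\|B\|$.
In order to prove (11.7) we need only show that, for $N \to \infty$,
$$\sum_\nu \sum_{\mu=1}^{N} \langle A^{1/2}x_\nu, x_\mu\rangle \langle x_\mu, BA^{1/2}x_\nu\rangle
  \to \sum_\nu \sum_{\mu=1}^{\infty} \langle A^{1/2}x_\nu, x_\mu\rangle \langle x_\mu, BA^{1/2}x_\nu\rangle
  = \sum_\nu \langle A^{1/2}x_\nu, BA^{1/2}x_\nu\rangle = \mathrm{tr}(A^{1/2}BA^{1/2}). \tag{11.8}$$
Then, since
$$\sum_{\mu=1}^{N} \sum_\nu \langle x_\mu, BA^{1/2}x_\nu\rangle \langle x_\nu, A^{1/2}x_\mu\rangle
  = \sum_{\mu=1}^{N} \langle x_\mu, BA^{1/2}A^{1/2}x_\mu\rangle
  \to \sum_{\mu=1}^{\infty} \langle x_\mu, BAx_\mu\rangle,$$
we have proven (11.7).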
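In order to prove (11.8) we introduce a projection $E_N$ for the subspace
spanned by $x_1, \ldots, x_N$. Clearly $E_N \to 1$. Thus (11.8) becomes
$$\sum_\nu \langle A^{1/2}x_\nu, E_N BA^{1/2}x_\nu\rangle \to \sum_\nu \langle A^{1/2}x_\nu, BA^{1/2}x_\nu\rangle.$$
(11.8) will be proven if we can show that
$$\sum_\nu \langle A^{1/2}x_\nu, (1 - E_N)BA^{1/2}x_\nu\rangle \to 0. \tag{11.9}$$
From
$$|\langle A^{1/2}x_\nu, (1 - E_N)BA^{1/2}x_\nu\rangle|
  = |\langle (1 - E_N)A^{1/2}x_\nu, (1 - E_N)BA^{1/2}x_\nu\rangle|
  \le \|(1 - E_N)A^{1/2}x_\nu\| \, \|B\| \, \|A^{1/2}x_\nu\|$$
we obtain
$$\Bigl|\sum_\nu \langle A^{1/2}x_\nu, (1 - E_N)BA^{1/2}x_\nu\rangle\Bigr|
  \le \|B\| \sum_\nu \|(1 - E_N)A^{1/2}x_\nu\| \, \|A^{1/2}x_\nu\|
  \le \|B\| \sum_{\nu=1}^{M} \|(1 - E_N)A^{1/2}x_\nu\| \, \|A^{1/2}x_\nu\|
  + \|B\| \sum_{\nu=M+1}^{\infty} \|A^{1/2}x_\nu\|^2.$$
Since $\sum_\nu \|A^{1/2}x_\nu\|^2 = \mathrm{tr}(A)$ exists, it is possible to find an $M$ such that
$\|B\| \sum_{\nu=M+1}^{\infty} \|A^{1/2}x_\nu\|^2 < \varepsilon$. Fix $M$ and choose $N$ so large that
$\|B\| \sum_{\nu=1}^{M} \|(1 - E_N)A^{1/2}x_\nu\| \, \|A^{1/2}x_\nu\| < \varepsilon$, from which (11.9) is proven.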
Now we must show the converse, that each continuous linear form $l$ on $\mathcal{B}(\mathcal{H})$ is of the
form $l(A) = \mathrm{tr}(AB)$ for a suitably chosen $B \in \mathcal{L}_r(\mathcal{H})$.
Next we shall show that we may easily extend $l(A)$ onto the set of "all"
operators of trace class. We say that $A \in \mathcal{L}(\mathcal{H})$ is an operator of trace class if
$A_1 = \frac{1}{2}(A + A^+)$ and $A_2 = (1/2i)(A - A^+)$ are elements of $\mathcal{B}(\mathcal{H})$. We extend
$l$ as follows:
$$\tilde{l}(A) = l(A_1) + i\,l(A_2).$$
It is a simple matter to show that $\tilde{l}$ is a linear form. For $\tilde{l}$ it follows from
$|l(A)| \le c\|A\|_s$ for $A \in \mathcal{B}(\mathcal{H})$ that, in general,
$|\tilde{l}(A)| \le |l(A_1)| + |l(A_2)| \le c(\|A_1\|_s + \|A_2\|_s)$.
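We now define an operator which we denote by $x\langle y|$ as follows:
$z \to x\langle y, z\rangle$. It follows that $(x\langle y|)^+ = y\langle x|$. Thus we obtain
$$\tilde{l}(x\langle y|) = l\bigl(\tfrac{1}{2}(x\langle y| + y\langle x|)\bigr)
  + i\,l\bigl(\tfrac{1}{2i}(x\langle y| - y\langle x|)\bigr).$$
The operator $\frac{1}{2}(x\langle y| + y\langle x|)$ acts only on the two-dimensional subspace
spanned by $x$ and $y$, that is, on the orthogonal complement of this subspace the operator
is a null operator. Thus it follows that
$\|\frac{1}{2}(x\langle y| + y\langle x|)\|_s \le 2\|\frac{1}{2}(x\langle y| + y\langle x|)\|$. We may easily estimate
the operator norm in $\mathcal{L}(\mathcal{H})$ by $\|\frac{1}{2}(x\langle y| + y\langle x|)\| \le \|x\| \, \|y\|$. In this way it
follows that $\|\frac{1}{2}(x\langle y| + y\langle x|)\|_s \le 2\|x\| \, \|y\|$. Similarly it follows that
$\|(1/2i)(x\langle y| - y\langle x|)\|_s \le 2\|x\| \, \|y\|$. Thus we obtain
$|\tilde{l}(x\langle y|)| \le 4c\|x\| \, \|y\|$.
$\tilde{l}(x\langle y|)$ is, therefore, for fixed $y$ a bounded linear form on $\mathcal{H}$. According to §4
there exists a $z \in \mathcal{H}$ for which $\tilde{l}(x\langle y|) = \langle z, x\rangle$. From
$|\tilde{l}(x\langle y|)| \le 4c\|x\| \, \|y\|$ it follows that $|\langle z, x\rangle| \le 4c\|x\| \, \|y\|$ and we obtain
$\langle z, z\rangle \le 4c\|z\| \, \|y\|$, that is, $\|z\| \le 4c\|y\|$.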
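A linear operator $B$ is defined by $y \to z$ which, on account of $\|z\| \le 4c\|y\|$,
satisfies the relation $\|B\| \le 4c$, that is, $B \in \mathcal{L}(\mathcal{H})$. Therefore we obtain
$$\tilde{l}(x\langle y|) = \langle By, x\rangle = \langle y, B^+x\rangle.$$
Since $x\langle x| = P_x$ (for $\|x\| = 1$) is self-adjoint, $\tilde{l}(x\langle x|) = l(x\langle x|) = \langle Bx, x\rangle$ is
real, that is, $\langle Bx, x\rangle = \langle x, Bx\rangle$. With the aid of the identity
$$\langle By, x\rangle = \tfrac{1}{4}\langle B(x + y), x + y\rangle - \tfrac{1}{4}\langle B(x - y), x - y\rangle
  + \tfrac{i}{4}\langle B(x + iy), x + iy\rangle - \tfrac{i}{4}\langle B(x - iy), x - iy\rangle$$
(valid for an inner product which is antilinear in the first and linear in the second argument)
it easily follows that $\langle By, x\rangle = \langle y, Bx\rangle$ and $B = B^+$, that is, $B \in \mathcal{L}_r(\mathcal{H})$.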
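A polarization identity of this kind, which recovers $\langle By, x\rangle$ from the quadratic form $u \to \langle Bu, u\rangle$, can be checked numerically. The sketch below assumes an inner product that is antilinear in its first argument and linear in its second; the matrix `B` and the vectors `x`, `y` are arbitrary hypothetical test data, and the signs of the two imaginary terms depend on this convention.

```python
def inner(u, v):
    """Inner product, antilinear in the first argument, linear in the second."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def apply(m, v):
    """Matrix-vector product for lists of rows."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def quad(m, u):
    """Quadratic form <m u, u>."""
    return inner(apply(m, u), u)


def combine(x, y, c):
    """The vector x + c y."""
    return [a + c * b for a, b in zip(x, y)]


def polarized(m, x, y):
    """<m y, x> rebuilt from four values of the quadratic form."""
    return 0.25 * (quad(m, combine(x, y, 1)) - quad(m, combine(x, y, -1))
                   + 1j * quad(m, combine(x, y, 1j))
                   - 1j * quad(m, combine(x, y, -1j)))


# Hypothetical test data: an arbitrary (not necessarily self-adjoint) 2x2 matrix.
B = [[1 + 2j, 0.5 - 1j], [3j, -2 + 0.25j]]
x = [1 - 1j, 2 + 0.5j]
y = [0.3 + 4j, -1 + 1j]

direct = inner(apply(B, y), x)
via_quadratic_forms = polarized(B, x, y)
print(direct, via_quadratic_forms)
```

Since the identity holds for every bounded operator, a real quadratic form forces $\langle By, x\rangle = \langle y, Bx\rangle$, which is the step used in the argument.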
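Since $l$ is continuous over $\mathcal{B}(\mathcal{H})$ and each $A \in \mathcal{B}(\mathcal{H})$ is of the form
$A = \sum_\nu \mu_\nu P_{x_\nu}$, where the sums converge in the trace norm, from
$l(P_{x_\nu}) = \langle x_\nu, Bx_\nu\rangle$ it follows that
$l(A) = \sum_\nu \mu_\nu \langle x_\nu, Bx_\nu\rangle = \sum_\nu \langle Ax_\nu, Bx_\nu\rangle = \mathrm{tr}(AB)$.
In $\mathcal{B}'(\mathcal{H})$ the norm is defined by $\|l\| = \sup_{\|A\|_s \le 1} |l(A)|$. For $l(A) = \mathrm{tr}(AB)$ it is
simple to show that $\|l\| = \|B\|$. Thus we have proven that $\mathcal{B}'(\mathcal{H}) = \mathcal{L}_r(\mathcal{H})$.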
Since $\mathcal{B}(\mathcal{H})$ is norm separable, $\mathcal{B}'(\mathcal{H})$ is separable in the
$\sigma(\mathcal{B}'(\mathcal{H}), \mathcal{B}(\mathcal{H}))$-topology (see AIII, §4).
In closing our investigation of $\mathcal{B}'(\mathcal{H})$ we shall now show that every affine
positive functional $l: K \to R_+$ is of the form $l(W) = \mathrm{tr}(WB)$ for some
$B \in \mathcal{B}'_+(\mathcal{H}) = \mathcal{L}_r^+(\mathcal{H})$.
We say that $l$ is an affine functional over $K$ if $W = \lambda W_1 + (1 - \lambda)W_2$,
$0 \le \lambda \le 1$ implies that $l(W) = \lambda l(W_1) + (1 - \lambda)l(W_2)$. It is easy to verify that $l$
has a unique linear extension on all of $\mathcal{B}(\mathcal{H})$ because $\mathcal{B}(\mathcal{H})$ is spanned by $K$. If $l$ is
positive on $K$, then $l$ is positive on $\mathcal{B}_+(\mathcal{H})$. If a linear functional $l$ satisfies
$l: \mathcal{B}_+(\mathcal{H}) \to R_+$, then it is said to be positive. For $\mathcal{B}(\mathcal{H})$ we may easily prove
the theorem mentioned in AIII, §6 that a positive linear functional is
continuous.
Let $W_1 = A_+\|A_+\|_s^{-1} \in K$, $W_2 = A_-\|A_-\|_s^{-1} \in K$; we obtain
$A = \|A_+\|_s W_1 - \|A_-\|_s W_2$. From this result and from $\|W\|_s = 1$ for $W \in K$ it
is easy to show that $l$ is continuous if and only if there exists a number $\alpha$ for which
$|l(W)| \le \alpha$ for all $W \in K$. For a positive $l(W)$ we need therefore only show that
$l(W) \le \alpha$ for some $\alpha$.
Suppose such an $\alpha$ does not exist. Then there exists a sequence
$W_n \in K$ for which $l(W_n) > 2^n$. Clearly $W = \sum_{n=1}^{\infty} 2^{-n}W_n$ is an element of $K$.
For $V_N = \sum_{n=1}^{N} 2^{-n}W_n$ it follows that $l(V_N) > N$. Since
$W - V_N = \sum_{n=N+1}^{\infty} 2^{-n}W_n \ge 0$ it follows that $l(W) \ge l(V_N) > N$, which, as
$N \to \infty$, leads to a contradiction to the fact that $l(W)$ is defined.
Therefore every positive affine functional $l$ on $K$ is of the form
$l(W) = \mathrm{tr}(WB)$ where $B \in \mathcal{L}_r(\mathcal{H})$. Since $l$ is positive, it immediately follows
that $B$ is positive.
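Let us consider the Banach subspace $\mathcal{L}_{cr}(\mathcal{H})$ (see §5) of $\mathcal{B}'(\mathcal{H}) = \mathcal{L}_r(\mathcal{H})$;
then, for each $A \in \mathcal{B}(\mathcal{H})$, $\mathrm{tr}(AB)$ defines a bounded linear form over $\mathcal{L}_{cr}(\mathcal{H})$.
Since $A_1 = A_2$ follows simply from $\mathrm{tr}(A_1B) = \mathrm{tr}(A_2B)$ for all $B \in \mathcal{L}_{cr}(\mathcal{H})$ (all
$P_x$ are elements of $\mathcal{L}_{cr}(\mathcal{H})$), $\mathcal{B}(\mathcal{H})$ is therefore a subspace of the Banach space
which is dual to $\mathcal{L}_{cr}(\mathcal{H})$. We will now show that $\mathcal{B}(\mathcal{H})$ is equal to the Banach
space which is dual to $\mathcal{L}_{cr}(\mathcal{H})$.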
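Let $l$ be a bounded linear form over $\mathcal{L}_{cr}(\mathcal{H})$; therefore, since
$\mathcal{B}(\mathcal{H}) \subset \mathcal{L}_{cr}(\mathcal{H})$ (as a set of operators), $l$ is a linear form over $\mathcal{B}(\mathcal{H})$. Since
$|l(B)| \le c\|B\|$ and $\|B\|_s \ge \|B\|$, $l$ is also bounded as a linear form over $\mathcal{B}(\mathcal{H})$,
that is, there exists an $A \in \mathcal{L}_r(\mathcal{H})$ such that $l(B) = \mathrm{tr}(AB)$ for all
$B \in \mathcal{B}(\mathcal{H}) \subset \mathcal{L}_{cr}(\mathcal{H})$; in particular, $l(P_x) = \langle x, Ax\rangle$ for all
$P_x \in \mathcal{L}_{cr}(\mathcal{H})$. Since each $B \in \mathcal{L}_{cr}(\mathcal{H})$ is of the form
$B = \sum_\nu \mu_\nu P_{x_\nu}$ where $|\mu_\nu| \to 0$ we find that
$\|B - \sum_{\nu=1}^{N} \mu_\nu P_{x_\nu}\| \to 0$ as $N \to \infty$ and we therefore obtain
$l(B) = \sum_{\nu=1}^{\infty} \mu_\nu l(P_{x_\nu}) = \sum_{\nu=1}^{\infty} \mu_\nu \langle x_\nu, Ax_\nu\rangle$. Since
$\|B\| = \max_\nu\{|\mu_\nu|\}$ we therefore obtain
$|\sum_{\nu=1}^{\infty} \mu_\nu \langle x_\nu, Ax_\nu\rangle| \le c\|B\| = c \max_\nu\{|\mu_\nu|\}$ for any complete
orthonormal basis and for all sequences $\mu_\nu$ for which $\mu_\nu \to 0$. By appropriate choice of
the $\mu_\nu \ge 0$ and $x_\nu$, namely for $x_\nu$ as a complete orthonormal basis for the
space $(1 - E(0))\mathcal{H}$ where $E(\lambda)$ is the spectral family of $A$ and $\mu_\nu \ge \mu_{\nu+1}$, we
obtain
$$\sum_{\nu=1}^{\infty} \mu_\nu \langle x_\nu, Ax_\nu\rangle \le c\,\mu_1.$$
By choosing $\mu_1 = \mu_2 = \cdots = \mu_N = 1$ and $\mu_\nu = 0$ for $\nu > N$ we obtain
$\sum_{\nu=1}^{N} \langle x_\nu, Ax_\nu\rangle \le c$ for all $N$ and we therefore obtain $\mathrm{tr}(A_+) < \infty$. Similarly
it follows that $\mathrm{tr}(A_-) < \infty$ and we find that $A \in \mathcal{B}(\mathcal{H})$.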
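It is easy to show that the norm in the Banach space which is dual to
$\mathcal{L}_{cr}(\mathcal{H})$ is identical to the norm $\|\cdot\|_s$ for $\mathcal{B}(\mathcal{H})$.
From the general theorems in AIII, §4 it follows that the unit sphere
$\mathrm{co}(K \cup -K)$ is compact in the $\sigma(\mathcal{B}(\mathcal{H}), \mathcal{L}_{cr}(\mathcal{H}))$-topology. Since
$\mathcal{B}_+(\mathcal{H})$ is also $\sigma(\mathcal{B}(\mathcal{H}), \mathcal{L}_{cr}(\mathcal{H}))$-closed, the intersection of $\mathcal{B}_+(\mathcal{H})$
with $\mathrm{co}(K \cup -K)$, that is, the set $\check{K} = \bigcup_{0 \le \lambda \le 1} \lambda K$ (see also AIII, §6)
is compact in the $\sigma(\mathcal{B}(\mathcal{H}), \mathcal{L}_{cr}(\mathcal{H}))$-topology.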
12 Gleason’s Theorem
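If $E_n$ is a collection of pairwise orthogonal projection operators then,
according to §6 there exists a projection operator $E$ for which
$$E = \sum_n E_n. \tag{12.1}$$
For $W \in K$, according to §11 we find that
$$W = \sum_\nu w_\nu P_{x_\nu},$$
where $w_\nu \ge 0$ and $\sum_\nu w_\nu = 1$. For $\bar{E}_N = \sum_{n=1}^{N} E_n$ it follows that
$$0 \le \mathrm{tr}(WE) - \sum_{n=1}^{N} \mathrm{tr}(WE_n) = \mathrm{tr}\Bigl(W\Bigl(E - \sum_{n=1}^{N} E_n\Bigr)\Bigr)
  = \sum_{\nu=1}^{M} w_\nu \langle x_\nu, (E - \bar{E}_N)x_\nu\rangle
  + \sum_{\nu=M+1}^{\infty} w_\nu \langle x_\nu, (E - \bar{E}_N)x_\nu\rangle
  \le \sum_{\nu=1}^{M} w_\nu \|(E - \bar{E}_N)x_\nu\|^2 + \sum_{\nu=M+1}^{\infty} w_\nu.$$
By choosing $M$ sufficiently large and, for fixed $M$, letting $N$ become infinite, it
follows that
$$\mathrm{tr}(WE) = \sum_{n=1}^{\infty} \mathrm{tr}(WE_n). \tag{12.2}$$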
Let $G$ denote the set of projection operators. For $W \in K$ there exists a
positive real function $\mu: G \to R_+$ given by $\mu(E) = \mathrm{tr}(WE)$ for which $\mu(1) = 1$
and for pairwise orthogonal $E_n$ we obtain $\mu(\sum_n E_n) = \sum_n \mu(E_n)$. Gleason's
theorem says that the converse holds: from $\mu: G \to R_+$, $\mu(1) = 1$ and
$\mu(\sum_n E_n) = \sum_n \mu(E_n)$ it follows that $\mu(E) = \mathrm{tr}(WE)$ for some $W \in K$.
From $\mu(\sum_n E_n) = \sum_n \mu(E_n)$ it follows for an orthonormal system $\{x_\nu\}$
that $\mu(\sum_\nu P_{x_\nu}) = \sum_\nu \mu(P_{x_\nu})$. Conversely, if the last equation holds for all
orthonormal systems of vectors, then the first is satisfied for any set of pairwise
orthogonal $E_n$. Since $\mu(1) = 1$, it follows that, for any complete orthonormal
basis, $\sum_\nu \mu(P_{x_\nu}) = 1$. For a subspace $\mathcal{T} = E\mathcal{H}$ and for an orthonormal
basis $\{x_\nu\}$ in $\mathcal{T}$ it follows that $\sum_\nu \mu(P_{x_\nu}) = \mu(E) \le 1$.
Let $A(G)$ denote the set of all $P_x$ ($A(G)$ is the set of all atoms of the lattice $G$;
see V, §5). Suppose we are given a real function $\mu: A(G) \to [0, 1]$ where
$\sum_\nu \mu(P_{x_\nu}) = 1$ for all complete orthonormal bases $\{x_\nu\}$. $\mu$ can be extended in
one and only one way to all of $G$ such that $\mu(\sum_n E_n) = \sum_n \mu(E_n)$ holds.
We must therefore prove that $\sum_\nu \mu(P_{x_\nu})$ yields the same value for each
complete orthonormal basis in $E\mathcal{H}$; in this way we may define $\mu(E)$. This,
however, follows from the fact that if $\{x_\nu\}$ and $\{y_\nu\}$ are two complete orthonormal
bases of $E\mathcal{H}$ and if $\{z_\rho\}$ is a complete orthonormal basis for $E^\perp\mathcal{H}$, then
$\{x_\nu, z_\rho\}$ and $\{y_\nu, z_\rho\}$ each form a complete orthonormal basis for $\mathcal{H}$, for which
$\sum_\nu \mu(P_{x_\nu}) + \sum_\rho \mu(P_{z_\rho}) = \sum_\nu \mu(P_{y_\nu}) + \sum_\rho \mu(P_{z_\rho}) = 1$. It is
easy to verify that such an extension of $\mu$ onto $G$ satisfies the condition
$\mu(\sum_n E_n) = \sum_n \mu(E_n)$.
It therefore suffices to consider maps $\mu: A(G) \to [0, 1]$ for which
$\sum_\nu \mu(P_{x_\nu}) = 1$ for each complete orthonormal basis $\{x_\nu\}$ in $\mathcal{H}$.
Let $\mathcal{S}$ denote the surface of the unit sphere in $\mathcal{H}$, that is,
$\mathcal{S} = \{x \mid x \in \mathcal{H}$ and $\|x\| = 1\}$. A function $m: \mathcal{S} \to R$ is called a frame
function of weight $c$ if, for every complete orthonormal basis $\{x_\nu\}$, the relation
$\sum_\nu m(x_\nu) = c$ holds. Since it is possible to replace one of the $x_\nu$, for example,
$x_{\nu_0}$, by $e^{i\alpha}x_{\nu_0}$ it follows that $m(e^{i\alpha}x) = m(x)$; in particular $m(-x) = m(x)$.
For a frame function $m$ we have not assumed that $m(x)$ is positive! If $m(x) \ge 0$ and
$c = 1$, then the function $\mu: A(G) \to [0, 1]$ given by $\mu(P_x) = m(x)$ satisfies
$\sum_\nu \mu(P_{x_\nu}) = 1$ for all complete orthonormal bases $\{x_\nu\}$
of $\mathcal{H}$ and defines a function $\mu: G \to [0, 1]$ for which $\mu(\sum_n E_n) = \sum_n \mu(E_n)$.
The restriction of a positive frame function of weight 1 for $\mathcal{H}$ to the surface
of the unit sphere of a subspace $\mathcal{T}$ is a frame function for $\mathcal{T}$ of weight
$c = \sum_\nu m(y_\nu)$, where $\{y_\nu\}$ is a complete orthonormal basis for $\mathcal{T}$.
A frame function m is said to be regular if m(x) = <x, Ax} for some
A e ЩЖ). The weight of <x, Ax} is equal to tr (A). For a regular positive
frame function of weight 1 it follows that there exists a unique measure
G A [0,1] for which ii(E) = tr (WE) where We K.
It suffices to prove that every positive frame function is regular.
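The statement that a regular frame function $\langle x, Ax\rangle$ has weight $\mathrm{tr}(A)$ can be checked numerically in $\mathbb{R}^3$. The following is only a sketch: the matrix `A` and the rotation angles are arbitrary illustrative choices, and the helper names are ours, not the book's.

```python
import math

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def quad_form(A, x):
    # m(x) = <x, A x> for a real symmetric A
    return sum(x[i] * A[i][j] * x[j] for i in range(3) for j in range(3))

def frame_sum(A, R):
    # the columns of a rotation matrix R form a complete orthonormal basis of R^3
    cols = [[R[i][j] for i in range(3)] for j in range(3)]
    return sum(quad_form(A, x) for x in cols)

# illustrative symmetric matrix (deliberately not positive)
A = [[2.0, 1.0, 0.0], [1.0, -1.0, 0.5], [0.0, 0.5, 3.0]]
trace_A = sum(A[i][i] for i in range(3))
angles = [(0.0, 0.0), (0.3, 1.1), (2.0, -0.7)]
sums = [frame_sum(A, matmul(rot_z(a), rot_x(b))) for a, b in angles]
```

Each entry of `sums` agrees with `trace_A` up to rounding, whichever orthonormal basis is used; positivity of `A` plays no role here.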
We shall carry out the proof in several steps using Gleason's approach.
First we shall show that:
Every continuous (and not necessarily positive) frame function in
Euclidean space $\mathbb{R}^3$ is regular. ($\mathbb{R}^3$ is the real Hilbert space of three
dimensions.)
Proof. Let $\mathcal{S}$ denote the surface of the unit sphere of $\mathbb{R}^3$; let $C(\mathcal{S})$ denote the set of
all continuous functions $\mathcal{S} \xrightarrow{f} \mathbb{R}$. Since $\mathcal{S}$ is compact, $\|f\| = \sup_{x \in \mathcal{S}} |f(x)| < \infty$.
With the norm $\|f\|$, $C(\mathcal{S})$ is a Banach space. Let $C_+(\mathcal{S})$ denote the set of all
positive functions (with $C_+(\mathcal{S})$, $C(\mathcal{S})$ becomes an ordered vector space; see
AIII, §6). Let $F$ be the set of continuous frame functions; clearly $F \subset C(\mathcal{S})$; $F$ is a
subspace of $C(\mathcal{S})$. It is easy to verify that $F$ is closed, that is, $F$ is a Banach space.
Let $\mathcal{D}$ denote the group of rotations in $\mathbb{R}^3$. Let $D \in \mathcal{D}$; we obtain a
representation of $\mathcal{D}$ in $C(\mathcal{S})$ by $U_D f(x) = f(D^{-1}x)$. Clearly $F$ is an invariant
subspace of $C(\mathcal{S})$ under this representation, that is, $F$ is itself a representation
space for $\mathcal{D}$. According to VII, §3 we may represent $F$ in terms of the
subspaces $\mathcal{T}_l$ (where $\mathcal{T}_l$ is defined in VII, §3), that is, $F = \sum_l \oplus (F \cap \mathcal{T}_l)$. For
which values of $l$ is $F \cap \mathcal{T}_l \ne \{0\}$?
For $f \in \mathcal{T}_1$ we have $f(-x) = -f(x)$. Since for $m \in F$ we must have $m(-x) = m(x)$
we find that $F \cap \mathcal{T}_1 = \{0\}$. According to VII, §3, $\mathcal{T}_0 + \mathcal{T}_2$ is the set of all
homogeneous polynomials of second degree, and $F \cap (\mathcal{T}_0 + \mathcal{T}_2)$ is therefore
the set of all real homogeneous polynomials of second degree. These have the
form $\langle x, Ax\rangle$ with (real) symmetric $A$.
Now we need only show that $\mathcal{T}_l \cap F = \{0\}$ for $l \ge 3$. If $\mathcal{T}_l \cap F \ne \{0\}$, then
$\mathcal{T}_l \cap F$ would be equal to the set of all real functions in $\mathcal{T}_l$. According to
VII (3.55) all functions of the form
$$\cos m\varphi\,(\sin\theta)^{-m} Q_m^l(\cos\theta), \qquad \sin m\varphi\,(\sin\theta)^{-m} Q_m^l(\cos\theta),$$
where $m = -l, \ldots, l$, must belong to $\mathcal{T}_l \cap F$. The restriction of these
functions onto the subspace determined by $\theta = \pi/2$ will also be frame
functions. For $\theta = \pi/2$, that is, for $\sin\theta = 1$ and $\cos\theta = 0$, all functions of
the form
$$\cos m\varphi \quad\text{and}\quad \sin m\varphi \quad\text{for } m = -l, \ldots, l$$
must be frame functions. Therefore, for $\cos m\varphi$ we must have
$$\cos m\varphi + \cos m(\varphi + \pi/2) = \text{const.}$$
This is only possible for $m = 0$ or for $1 + \cos m\pi/2 = 0$, which cannot be the
case for all $m = -l, \ldots, l$ for $l \ge 3$.
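The last step of the proof can be made concrete by sampling $\varphi$ and testing for which $m$ the sum $\cos m\varphi + \cos m(\varphi + \pi/2)$ is constant. This is a numerical sketch only; the sample count and tolerance are arbitrary assumptions.

```python
import math

def pair_sum_is_constant(m, samples=360, tol=1e-9):
    # True if cos(m*phi) + cos(m*(phi + pi/2)) is the same at all sampled phi
    values = [math.cos(m * p) + math.cos(m * (p + math.pi / 2))
              for p in (2 * math.pi * k / samples for k in range(samples))]
    return max(values) - min(values) < tol

def criterion(m, tol=1e-9):
    # the condition stated in the text: m = 0 or 1 + cos(m*pi/2) = 0
    return m == 0 or abs(1 + math.cos(m * math.pi / 2)) < tol

admissible = [m for m in range(0, 9) if pair_sum_is_constant(m)]
```

Among $m = 0, \ldots, 8$ only $m = 0, 2, 6$ pass, so $m = 1$ already violates the condition; this is why it cannot hold for all $m = -l, \ldots, l$.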
We shall now show that every positive frame function on the surface $\mathcal{S}$ of
the unit sphere in $\mathbb{R}^3$ is continuous.
The proof can be carried out in a simple way. Let us consider the polar
coordinates introduced above. Let $N$ denote the closed subset $0 \le \theta \le \pi/2$ of
$\mathcal{S}$. A frame function is, since $m(x) = m(-x)$, uniquely determined by its
values on $N$. The set of points $\theta = $ constant is called a circle (parallel) of
latitude. Through each point $x \in N$ (with the exception of the pole $\theta = 0$)
there exists a uniquely determined great circle which, at $x$, has the same
tangent as the latitude circle passing through $x$. Let $H(x)$ denote the set of all
points on this great circle. Clearly $x \in H(x)$.
Next we shall show that for $z \in N$ ($z$ is not the pole) the set
$$X(z) = \{x \mid x \in N, \text{ there exists a } y \text{ such that } y \in H(x) \text{ and } z \in H(y)\}$$
has a nonempty interior.
Proof. It suffices to choose the point $z$ with rectangular coordinates
$(\sin\theta, 0, \cos\theta)$ for which $0 < \theta \le \pi/2$. The set $L$ of all $y$ for which $z \in H(y)$ is
precisely the set of all $(\xi, \eta, \zeta)$ for which $\phi \overset{\text{def}}{=} (\xi^2 + \eta^2)\cos\theta - \xi\zeta\sin\theta = 0$. If $x$ is
a point for which the quadratic form $\phi$ is negative, then $L \cap H(x) \ne \emptyset$ because
$\phi \ge 0$ if $\zeta = 0$. Thus $X(z)$ contains the open set of all points for which $\phi < 0$. This is
nonempty because, for $\xi = \sin\theta'$, $\eta = 0$, $\zeta = \cos\theta'$ where $0 < \theta' < \theta$: $\phi < 0$.
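The quadratic form $\phi$ can be cross-checked against the geometric definition of $H(y)$: the great circle $H(y)$ lies in the plane spanned by $y$ and the tangent $(-\eta, \xi, 0)$ of the latitude circle at $y$, so $z \in H(y)$ exactly when $z$ is orthogonal to the normal of that plane. The sketch below uses arbitrarily chosen angles and an arbitrary unit vector; all names are illustrative.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def phi(y, theta):
    # phi = (xi^2 + eta^2) cos(theta) - xi * zeta * sin(theta)
    xi, eta, zeta = y
    return (xi ** 2 + eta ** 2) * math.cos(theta) - xi * zeta * math.sin(theta)

def normal_of_H(y):
    # normal of the plane containing the great circle H(y)
    xi, eta, zeta = y
    return cross(y, (-eta, xi, 0.0))

theta = 1.0
z = (math.sin(theta), 0.0, math.cos(theta))
y = (0.48, 0.6, 0.64)             # a unit vector in N
lhs = dot(normal_of_H(y), z)      # vanishes exactly when z lies on H(y)
rhs = phi(y, theta)

theta_p = 0.4                     # 0 < theta' < theta
x = (math.sin(theta_p), 0.0, math.cos(theta_p))
phi_x = phi(x, theta)             # equals sin(theta') * sin(theta' - theta)
```

The two expressions `lhs` and `rhs` coincide for every `y`, and `phi_x` is negative, which is the point used to show that the open set $\phi < 0$ is nonempty.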
Let $\mathrm{osc}(f, X) = \sup\{f(x) \mid x \in X\} - \inf\{f(x) \mid x \in X\}$. Let $f$ be a frame
function on the unit sphere in $\mathbb{R}^3$. Suppose there is a point $x$ with a
neighborhood $U$ for which $\mathrm{osc}(f, U) = \alpha$. Then each point $y$ of the
great circle which has $x$ as a pole (that is, in all directions $y$ which are
orthogonal to the $x$ direction) has a neighborhood $V$ for which
$\mathrm{osc}(f, V) \le 2\alpha$.
Proof. We introduce polar coordinates for which $x$ is the "north pole." There
exists an $\varepsilon > 0$ such that all points for which $\theta < \varepsilon$ belong to $U$. Let $x_0$ be a point
for which $\theta = \pi/2$ and $\varphi = \varphi_0$. Let $y_0$ be the point for which $\theta = (\pi + \varepsilon)/2$ and
$\varphi = \varphi_0$. The point $y$ with $\theta = \varepsilon/2$, $\varphi = \varphi_0$ is orthogonal to $y_0$; similarly $x$ is
orthogonal to $x_0$. The points $x, y, x_0, y_0$ lie on a great circle; $x$ and $y$ lie in $U$. If,
instead of $x_0$, we choose another point $z$ from a sufficiently small neighborhood $V$ of
$x_0$, then the following two points $z'$, $y'$ lie in $U$: $z'$ and $y'$ are the points on the great
circle passing through $z$ and $y_0$ which are orthogonal to $z$ and $y_0$, respectively.
For $z_1, z_2 \in V$ let $z_1', y_1', z_2', y_2'$ denote the points chosen in the above manner.
For the frame function $f$ it therefore follows that
$$f(y_0) + f(y_i') = f(z_i) + f(z_i') \qquad (i = 1, 2).$$
By subtraction, we obtain
$$|f(z_1) - f(z_2)| = |f(y_1') - f(y_2') + f(z_2') - f(z_1')| \le 2\alpha$$
since $y_i'$, $z_i'$ lie in $U$. Therefore we find that $\mathrm{osc}(f, V) \le 2\alpha$.
We now prove the following lemma:
If, for a frame function $f$ on the unit sphere $\mathcal{S}$ in $\mathbb{R}^3$ there exists an open set $U$
for which $\mathrm{osc}(f, U) = \alpha$ then, to each point $y \in \mathcal{S}$ there exists a
neighborhood $W$ for which $\mathrm{osc}(f, W) \le 4\alpha$.
Proof. For $x \in U$ and $y \in \mathcal{S}$ there exists a point $z$ which is orthogonal to $x$ and $y$. $z$
therefore lies on the great circle with $x$ as the pole, and there exists a neighborhood
$V$ of $z$ such that $\mathrm{osc}(f, V) \le 2\alpha$. $y$ lies on the great circle with $z$ as the pole, so there
exists a neighborhood $W$ of $y$ for which $\mathrm{osc}(f, W) \le 2\,\mathrm{osc}(f, V) \le 4\alpha$.
We are now in a position to prove that every positive frame function on the
unit sphere $\mathcal{S}$ in $\mathbb{R}^3$ is continuous.
Let $g \ge 0$ be a frame function. Then the function $f$ defined by $f(x) =
g(x) - \inf_y g(y)$ is also a frame function. Clearly $f(x) \ge 0$ and $\inf_x f(x) = 0$.
Therefore, to each $\varepsilon > 0$ there exists a point $y$ for which $f(y) < \varepsilon$. Let us
choose a polar coordinate system with $y$ as the north pole. To each $x$ there
exists a point $x'$ which has the same coordinate $\theta$ as does $x$ but for which $\varphi$
differs by $\pi/2$. Clearly $h(x) = f(x) + f(x')$ is a frame function for which $h \ge 0$.
If $f(x)$ is of weight $c$, then $h(x)$ is of weight $2c$.
For the special case in which $x$ has the coordinates $\theta = \pi/2$ and $\varphi = \varphi_0$
the vectors $y, x, x'$ are an orthonormal basis for $\mathbb{R}^3$, and we find that
$h(y) + h(x) + h(x') = 2c$. Since $f(-x) = f(x)$ we find that $h(x') = f(x') + f(-x) = h(x)$,
and, since $h(y) = 2f(y)$, we find that $h(x) = c - h(y)/2 = c - f(y)$. Therefore we find that $h$ is
constant on the great circle with $y$ as the pole.
For an arbitrary point $x \in N \setminus \{y\}$, $H(x)$ is well defined. $H(x)$ intersects the
great circle with pole $y$ in the point $z$ which is orthogonal to $x$ (if $x$ is on the
great circle with pole $y$, then such a $z$ also exists). Therefore we find that
$2c \ge h(x) + h(z) = h(x) + c - f(y)$ and we obtain
$$h(x) \le c + f(y) < c + \varepsilon \quad\text{for all } x \in N \setminus \{y\}.$$
Let $u \in H(x) \cap N$ and $u' \in H(x) \cap N$ where $u'$ is orthogonal to $u$. It follows
that $h(x) + c - f(y) = h(x) + h(z) = h(u) + h(u') \le h(u) + c + \varepsilon$ and thus we
obtain $h(x) \le h(u) + 2\varepsilon$ for all $x \in N \setminus \{y\}$ and $u \in H(x)$.
Let $\beta = \inf\{h(x) \mid x \in N \setminus \{y\}\}$; let $z \in N \setminus \{y\}$ be such that $h(z) < \beta + \varepsilon$. For
$x \in X(z)$ (where $X(z)$ was defined in the first lemma) there exists a $v$ with $v \in H(x)$
and $z \in H(v)$. According to the above inequality we obtain
$$h(x) \le h(v) + 2\varepsilon \quad\text{and}\quad h(v) \le h(z) + 2\varepsilon$$
and
$$\beta \le h(x) \le h(z) + 4\varepsilon < \beta + 5\varepsilon.$$
The set $X(z)$ has a nonempty interior, that is, there exists an open set $U$ (as a
subset of $X(z)$) for which $\beta \le h(x) < \beta + 5\varepsilon$ for all $x \in U$; thus we obtain
$\mathrm{osc}(h, U) \le 5\varepsilon$. According to the previous lemma for the point $y$ there exists a
neighborhood $V$ for which $\mathrm{osc}(h, V) \le 20\varepsilon$. Since $h(y) = 2f(y) < 2\varepsilon$ we
obtain $\sup\{h(x) \mid x \in V\} = \mathrm{osc}(h, V) + \inf\{h(x) \mid x \in V\} < 20\varepsilon + 2\varepsilon$. Since
$0 \le f \le h$ we therefore obtain $\mathrm{osc}(f, V) \le 22\varepsilon$. Thus, according to the
previous lemma each point $x \in \mathcal{S}$ has a neighborhood $W$ for which
$\mathrm{osc}(f, W) \le 88\varepsilon$. Since $\varepsilon$ can be chosen arbitrarily small, $f$ is therefore
continuous.
We have proven the following: Every positive frame function on the unit
sphere $\mathcal{S}$ in $\mathbb{R}^3$ is regular. We shall now extend this result to complex Hilbert
spaces and to higher dimensions.
A subset $\mathcal{R}$ of a Hilbert space $\mathcal{H}$ is called a real subspace if for each $x,
y \in \mathcal{R}$, $x + y \in \mathcal{R}$ and for each $\alpha \in \mathbb{R}$, $\alpha x \in \mathcal{R}$. Note that $\mathcal{R}$ need not be a
subspace of $\mathcal{H}$ if $\mathcal{H}$ is a vector space over $\mathbb{C}$. $\mathcal{R}$ is said to be completely real if
the inner product on $\mathcal{R} \times \mathcal{R}$ only takes on real values. If $X \subset \mathcal{H}$ is a set for
which the inner product on $X \times X$ takes on only real values, then the real
linear span of $X$ is a completely real subspace. The real linear span of an
orthonormal basis is therefore a completely real subspace. If the complex
subspace generated by a completely real subspace $\mathcal{R}$ is dense in $\mathcal{H}$, then a
complete orthonormal basis of $\mathcal{R}$ is also a complete orthonormal basis of $\mathcal{H}$.
Then the restriction of every frame function of $\mathcal{H}$ onto $\mathcal{R}$ is a frame function
for $\mathcal{R}$.
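We now prove that a frame function $f$ on a complex two-dimensional
Hilbert space for which $f \ge 0$ and which is regular on every completely real
subspace is regular in the entire Hilbert space.
Proof. Let $c$ denote the weight of $f$ and let $d = \sup\{f(x) \mid \|x\| = 1\}$ where
$0 \le d \le c$ since $f \ge 0$. We will now show that $f$ takes on the value $d$.
There exists a sequence $x_n$ for which $\|x_n\| = 1$ and $f(x_n) \to d$. Since the surface of
the unit sphere is compact, we may assume that the sequence $x_n$ converges: $x_n \to y$.
For $\lambda_n = \langle x_n, y\rangle / |\langle x_n, y\rangle|$ we find that $|\lambda_n| = 1$ and $\lambda_n \to 1$. Thus it follows that
$\lambda_n x_n \to y$ and $f(\lambda_n x_n) = f(x_n)$. Since $\langle y, \lambda_n x_n\rangle$ is real, for each $n$ both vectors $y$, $\lambda_n x_n$
lie in a completely real subspace $\mathcal{R}_n$. If $x, y \in \mathcal{R}_n$ satisfy $\|x\| = \|y\| = 1$, then, for a
symmetric $A$ we obtain
$$|\langle x, Ax\rangle - \langle y, Ay\rangle| = |\langle x - y, A(x + y)\rangle| \le 2\|A\|\,\|x - y\|.$$
Since $f$ is regular on $\mathcal{R}_n$ and $0 \le f \le c$, it follows that $|f(\lambda_n x_n) - f(y)| \le 2c\|\lambda_n x_n - y\| \to 0$
and therefore $f(y) = d$. We define
$$F(x) = \begin{cases} \|x\|^2 f(x/\|x\|) & \text{for } x \ne 0, \\ 0 & \text{for } x = 0. \end{cases}$$
Note that $F(y) = f(y) = d$. If $z$ is orthogonal to $y$ and if $\|z\| = 1$ then it follows that
$F(z) = f(z) = c - f(y) = c - d$. $F(x)$ is a real quadratic form on the completely real
subspace of $\mathcal{H}$ generated by $y$ and $z$ which, on the unit sphere, takes on its maximum $d$ at $y$. Therefore,
for real $\alpha, \beta$ we find that
$$F(\alpha y + \beta z) = \alpha^2 F(y) + \beta^2 F(z) = \alpha^2 d + \beta^2 (c - d).$$
For complex nonzero $\lambda, \mu$ it follows that
$$F(\lambda y + \mu z) = F\Big(\frac{|\lambda|}{\lambda}(\lambda y + \mu z)\Big) = F(|\lambda| y + |\mu| z'),$$
where $z' = (\mu|\lambda| / |\mu|\lambda) z$. Clearly $z'$ is orthogonal to $y$ and $\|z'\| = 1$. Therefore we
obtain
$$F(\lambda y + \mu z) = |\lambda|^2 d + |\mu|^2 (c - d).$$
It is easy to show that the above formula is correct for $\lambda = 0$ or $\mu = 0$. Therefore
$F(x) = \langle x, Tx\rangle$ where $T$ is the diagonal matrix
$$T = \begin{pmatrix} d & 0 \\ 0 & c - d \end{pmatrix}$$
with respect to the complete orthonormal basis $\{y, z\}$. Therefore $f$ is regular.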
The extension of the preceding result to arbitrary dimension follows from
the following theorem:
If $f$ is a frame function in $\mathcal{H}$ for which $f \ge 0$ and $f$ is regular on each
two-dimensional subspace, then $f$ is regular on the entire $\mathcal{H}$.
Proof. Again we define
$$F(x) = \begin{cases} \|x\|^2 f(x/\|x\|) & \text{for } x \ne 0, \\ 0 & \text{for } x = 0. \end{cases}$$
On each two-dimensional subspace $\mathcal{T}$, $F(x) = \langle x, A_{\mathcal{T}} x\rangle$ where $A_{\mathcal{T}}$ is a self-adjoint
operator which is defined in $\mathcal{T}$. We define a function $\mathcal{H} \times \mathcal{H} \xrightarrow{G} \mathbb{C}$ as follows:
$$G(x, y) = \langle x, A_{\mathcal{T}} y\rangle,$$
where $\mathcal{T}$ is the subspace generated by $x$ and $y$.
Now we must show that this definition is meaningful. If $x$ and $y$ are linearly
independent, then $\mathcal{T}$ is uniquely determined by $x$ and $y$ and therefore $G(x, y)$ is
uniquely defined. If $y = \lambda x$ then for $\mathcal{T}$ we may select an arbitrary two-dimensional
subspace which contains $x$. It then follows that $G(x, \lambda x) = \lambda\langle x, A_{\mathcal{T}} x\rangle = \lambda F(x)$,
which is independent of the choice of $\mathcal{T}$. Since $A_{\mathcal{T}}$ is self-adjoint in $\mathcal{T}$ we find
that $G$ satisfies the following relationships for all $x, y \in \mathcal{H}$ and $\alpha \in \mathbb{C}$:
$$G(x, \alpha y) = \alpha G(x, y),$$
$$G(y, x) = \overline{G(x, y)},$$
$$4\,\mathrm{Re}\,G(x, y) = F(x + y) - F(x - y),$$
$$2F(x) + 2F(y) = F(x + y) + F(x - y).$$
Thus it follows that
$$G(x, y) + G(x, z) = \mathrm{Re}\,G(x, y) + \mathrm{Re}\,G(x, z) + i\,\mathrm{Im}\,G(x, y) + i\,\mathrm{Im}\,G(x, z).$$
Since
$$i\,\mathrm{Im}\,G(x, y) = i\,\mathrm{Re}(-iG(x, y)) = i\,\mathrm{Re}\,G(x, -iy)$$
it follows that
$$G(x, y) + G(x, z) = \mathrm{Re}\,G(x, y) + \mathrm{Re}\,G(x, z) + i\,\mathrm{Re}\,G(x, -iy) + i\,\mathrm{Re}\,G(x, -iz).$$
From the above relationships it follows that
$$8\,\mathrm{Re}\,G(x, y) + 8\,\mathrm{Re}\,G(x, z) = 2F(x + y) - 2F(x - y) + 2F(x + z) - 2F(x - z)$$
$$= F(2x + y + z) - F(2x - y - z)$$
$$= 4\,\mathrm{Re}\,G(2x, y + z) = 8\,\mathrm{Re}\,G(x, y + z).$$
Therefore we obtain
$$G(x, y) + G(x, z) = \mathrm{Re}\,G(x, y + z) + i\,\mathrm{Re}\,G(x, -i(y + z)) = G(x, y + z).$$
In addition it follows that
$$|G(x, y)| \le |\mathrm{Re}\,G(x, y)| + |\mathrm{Im}\,G(x, y)| = |\mathrm{Re}\,G(x, y)| + |\mathrm{Re}\,G(x, -iy)|$$
$$\le \tfrac14 F(x + y) + \tfrac14 F(x - y) + \tfrac14 F(x + iy) + \tfrac14 F(x - iy).$$
From the definition of $F$, and from $0 \le f \le c$, where $c$ is the weight of $f$, it follows
that:
$$|G(x, y)| \le \tfrac14 c\,[\|x + y\|^2 + \|x - y\|^2 + \|x + iy\|^2 + \|x - iy\|^2] = c(\|x\|^2 + \|y\|^2).$$
For unit vectors $x, y$ we therefore obtain $|G(x, y)| \le 2c$.
$G$ is therefore a bounded linear form; thus, according to §4 there exists a
bounded operator $A$ such that $G(x, y) = \langle x, Ay\rangle$. Since $G(x, y) = \overline{G(y, x)}$ we
conclude that $A = A^+$, that is, $A$ is self-adjoint; from $G(x, x) \ge 0$ we conclude that
$A \ge 0$.
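The two relations $4\,\mathrm{Re}\,G(x, y) = F(x + y) - F(x - y)$ and $\mathrm{Im}\,G(x, y) = \mathrm{Re}\,G(x, -iy)$ show that the form $G$ is already fixed by the values of $F$. As a sketch, assuming a two-dimensional complex space and an arbitrarily chosen self-adjoint matrix `A` (all names below are illustrative), $G(x, y) = \langle x, Ay\rangle$ can be recovered from $F(x) = \langle x, Ax\rangle$ alone:

```python
def inner(x, y):
    # <x, y>, antilinear in x and linear in y (the convention used in the text)
    return sum(a.conjugate() * b for a, b in zip(x, y))

def apply(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def F(A, x):
    # F(x) = <x, A x>, which is real for self-adjoint A
    return inner(x, apply(A, x)).real

def add(x, y, s=1):
    # x + s*y
    return [a + s * b for a, b in zip(x, y)]

def G_from_F(A, x, y):
    re = (F(A, add(x, y)) - F(A, add(x, y, -1))) / 4        # Re G(x, y)
    im = (F(A, add(x, y, -1j)) - F(A, add(x, y, 1j))) / 4   # Re G(x, -iy) = Im G(x, y)
    return complex(re, im)

A = [[2.0, 1 - 2j], [1 + 2j, 0.5]]      # illustrative self-adjoint matrix
x = [1 + 1j, 0.5 - 2j]
y = [-0.3j, 2 + 0.25j]
direct = inner(x, apply(A, y))          # <x, A y>
recovered = G_from_F(A, x, y)           # the same value, built from F only
```

In the proof the direction is reversed: $F$ is given, $G$ is defined through it, and additivity and boundedness of $G$ then yield the operator $A$.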
We now prove the following theorem:
Every frame function $f$ on a Hilbert space $\mathcal{H}$ for which $f \ge 0$ is regular when
the dimension of $\mathcal{H}$ is $\ge 3$.
Proof. $f$ is a frame function on each completely real subspace. Each completely real
two-dimensional subspace is also a subspace of a three-dimensional completely real
subspace. In each completely real three-dimensional subspace (that is, in $\mathbb{R}^3$) $f$ is
regular, and is therefore regular on every completely real two-dimensional subspace. Thus, from
the last two theorems, it follows that $f$ is regular on each two-dimensional
subspace, and therefore in all of $\mathcal{H}$.
13 Isomorphisms and Anti-isomorphisms
Let $\mathcal{H}_1, \mathcal{H}_2$ be two Hilbert spaces, where we permit $\mathcal{H}_1 = \mathcal{H}_2$. A map
$\mathcal{H}_1 \xrightarrow{A} \mathcal{H}_2$ is said to be linear if $A(x + y) = Ax + Ay$ and $A(\alpha x) = \alpha A(x)$. $A$ is
said to be antilinear if we substitute $A(\alpha x) = \bar\alpha A(x)$ for the previous equation.
$A$ is continuous if and only if it is bounded, that is, if there exists a number $c$
for which $\|Ax\| \le c\|x\|$.
$\mathcal{H}_1 \xrightarrow{U} \mathcal{H}_2$ is said to be an isomorphism if $U$ is linear and bijective and
$\|Ux\| = \|x\|$. If $\mathcal{H}_1 = \mathcal{H}_2$ then, according to §7, $U$ is an isomorphism if it is
unitary.
If for $\mathcal{H}_1 \xrightarrow{V} \mathcal{H}_2$ only $\|Vx\| = \|x\|$ holds then, from the results of §7, it follows that
$V$ is an isomorphism onto a closed subspace of $\mathcal{H}_2$.
$\mathcal{H}_1 \xrightarrow{U} \mathcal{H}_2$ is said to be an anti-isomorphism if $U$ is antilinear and bijective,
and $\|Ux\| = \|x\|$. For the special case in which $\mathcal{H}_1 = \mathcal{H}_2$ we say that $U$
is anti-unitary. For a complete orthonormal basis $\{x_\nu\}$ for $\mathcal{H}$ a special
anti-unitary operator is defined by $U_s x_\nu = x_\nu$; it follows that $U_s(\sum_\nu \alpha_\nu x_\nu) =
\sum_\nu \bar\alpha_\nu x_\nu$. If $\mathcal{H}_1 \xrightarrow{U} \mathcal{H}_2$ is an anti-isomorphism, then it follows that if $U_1$ is an
anti-unitary operator in $\mathcal{H}_1$ then $\tilde U = UU_1$ is an isomorphism $\mathcal{H}_1 \xrightarrow{\tilde U} \mathcal{H}_2$.
Both $U_1$ and $U_1^{-1}$ are anti-unitary. Therefore every anti-isomorphism $U$ can
be represented as a product $\tilde U U_1^{-1}$ of an isomorphism and an arbitrary
anti-unitary map.
If $U_2$ is an anti-unitary map in $\mathcal{H}_2$ it follows that $\tilde U = U_2 U$ is an
isomorphism and that $U$ may be represented in the form $U_2^{-1}\tilde U$.
As expected, the product of two antilinear maps $\mathcal{H}_1 \xrightarrow{A_1} \mathcal{H}_2 \xrightarrow{A_2} \mathcal{H}_3$ is a linear
map $\mathcal{H}_1 \xrightarrow{A_2 A_1} \mathcal{H}_3$.
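If $U$ is a unitary map, then for $A \in \mathscr{B}(\mathcal{H})$ and $B \in \mathscr{B}'(\mathcal{H})$ it follows that
$$\mathrm{tr}(AU^+BU) = \sum_\nu \langle x_\nu, AU^+BUx_\nu\rangle = \sum_\nu \langle Ux_\nu, UAU^+BUx_\nu\rangle = \sum_\nu \langle y_\nu, UAU^+By_\nu\rangle,$$
where $x_\nu$ is a complete orthonormal basis and $y_\nu = Ux_\nu$. Since $U$ is unitary,
the set $y_\nu$ is a complete orthonormal basis, and we find that
$$\mathrm{tr}(AU^+BU) = \mathrm{tr}(UAU^+B).$$
An analogous proof is applicable to the case in which $U$ is anti-unitary. From
$\|Ux\|^2 = \langle Ux, Ux\rangle = \|x\|^2 = \langle x, x\rangle$ it follows that if we replace $x$ by $x + y$
and $x + iy$, then we obtain
$$\langle x, y\rangle = \langle Uy, Ux\rangle = \overline{\langle Ux, Uy\rangle}$$
from which we obtain:
$$\mathrm{tr}(AU^{-1}BU) = \sum_\nu \langle x_\nu, AU^{-1}BUx_\nu\rangle = \sum_\nu \overline{\langle Ux_\nu, UAU^{-1}BUx_\nu\rangle} = \sum_\nu \overline{\langle y_\nu, UAU^{-1}By_\nu\rangle}.$$
Since $A \in \mathscr{B}(\mathcal{H})$, for $A' = UAU^{-1}$ it follows that
$$\langle x, A'y\rangle = \langle x, UAU^{-1}y\rangle = \langle AU^{-1}y, U^{-1}x\rangle = \langle U^{-1}y, AU^{-1}x\rangle = \langle UAU^{-1}x, y\rangle = \langle A'x, y\rangle$$
and we find that $A' \in \mathscr{B}(\mathcal{H})$. Therefore $\mathrm{tr}(A'B)$ is real, and we find that
$\mathrm{tr}(AU^{-1}BU) = \mathrm{tr}(UAU^{-1}B)$ holds for anti-unitary $U$.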
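The anti-unitary case can be illustrated in $\mathbb{C}^2$. The sketch below assumes an anti-unitary map of the special form "complex conjugation followed by a unitary matrix `W`"; the matrices `W`, `A`, `B` are arbitrary illustrative choices and do not come from the text.

```python
import math

def inner(x, y):
    # <x, y>, antilinear in x and linear in y
    return sum(a.conjugate() * b for a, b in zip(x, y))

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

def dagger(M):
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def conj(x):
    return [a.conjugate() for a in x]

s = 1 / math.sqrt(2)
W = [[s + 0j, s * 1j], [s * 1j, s + 0j]]      # a unitary matrix

def U(x):
    # anti-unitary map: complex conjugation followed by W
    return apply(W, conj(x))

def U_inv(x):
    # inverse of U
    return conj(apply(dagger(W), x))

A = [[1 + 0j, 2 - 1j], [2 + 1j, 3 + 0j]]      # self-adjoint
B = [[0.5 + 0j, 1j], [-1j, 2 + 0j]]           # self-adjoint

def trace(T):
    # trace of a linear map T, evaluated in the standard basis
    basis = [[1 + 0j, 0j], [0j, 1 + 0j]]
    return sum(inner(e, T(e)) for e in basis)

t1 = trace(lambda x: apply(A, U_inv(apply(B, U(x)))))   # tr(A U^-1 B U)
t2 = trace(lambda x: U(apply(A, U_inv(apply(B, x)))))   # tr(U A U^-1 B)

u, v = [1 + 2j, -1j], [0.5 + 0j, 3 - 1j]
anti_unitary_ok = abs(inner(U(u), U(v)) - inner(v, u)) < 1e-12
```

Both composite maps contain two antilinear factors and are therefore linear, so their traces are well defined; the two traces agree and are real, as derived above.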
14 Products of Hilbert Spaces
Let $\mathcal{H}_1$ and $\mathcal{H}_2$ be two Hilbert spaces. We construct the so-called product
space $\mathcal{H} = \mathcal{H}_1 \times \mathcal{H}_2$ (where $\mathcal{H}$ is not the Cartesian product of $\mathcal{H}_1$ and $\mathcal{H}_2$,
a fact that is not reflected in the usual notation) in the following way:
Let $x \in \mathcal{H}_1$, $y \in \mathcal{H}_2$; $f(x, y)$ is said to be an anti-bilinear form if $f(x_1 + x_2, y) =
f(x_1, y) + f(x_2, y)$, $f(x, y_1 + y_2) = f(x, y_1) + f(x, y_2)$, $f(\alpha x, y) = \bar\alpha f(x, y)$ and
$f(x, \alpha y) = \bar\alpha f(x, y)$.
An example of an anti-bilinear form is given by $f(x, y) = \langle x, z_1\rangle\langle y, z_2\rangle$
where $z_1 \in \mathcal{H}_1$, $z_2 \in \mathcal{H}_2$. This anti-bilinear form is often denoted by $z_1 z_2$.
An anti-bilinear form is continuous if and only if there exists a $c$ for which
$|f(x, y)| \le c\|x\|\|y\|$. $f = z_1 z_2$ is continuous with $c = \|z_1\|\|z_2\|$.
The set of all continuous anti-bilinear forms is a linear vector space over $\mathbb{C}$.
From $|f(x, y)| \le c\|x\|\|y\|$ and from §4 it follows that, for fixed $y$, $f(x, y) = \langle x, z\rangle$ for some
$z \in \mathcal{H}_1$. An antilinear map $\mathcal{H}_2 \xrightarrow{A} \mathcal{H}_1$ which satisfies $\|Ay\| \le c\|y\|$ is defined
by $y \to z$. Therefore $f(x, y) = \langle x, Ay\rangle$. For $f = z_1 z_2$ we obtain $Ay =
z_1\langle y, z_2\rangle$. Since $A$ is bounded, there exists, according to §6, a bounded
antilinear operator $\mathcal{H}_1 \xrightarrow{A'} \mathcal{H}_2$ for which $\langle x, Ay\rangle = \langle y, A'x\rangle$. If $A$ is a
bounded antilinear operator then $\langle x, Ay\rangle$ is a bounded anti-bilinear form.
Therefore there is a bijective map between bounded anti-bilinear forms and
bounded antilinear operators $\mathcal{H}_2 \xrightarrow{A} \mathcal{H}_1$.
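From $\langle x, Ay\rangle = \langle y, A'x\rangle$ it follows that, for $y = A'z$, the operator $AA'$
defined by $\langle x, AA'z\rangle$ is bounded and linear in $\mathcal{H}_1$; the same holds for $A'A$
in $\mathcal{H}_2$. From $\langle x, AA'z\rangle = \langle A'z, A'x\rangle = \langle AA'x, z\rangle$ it follows that $AA'$ is
self-adjoint. For $z = x$ it follows that $\langle x, AA'x\rangle = \langle A'x, A'x\rangle = \|A'x\|^2$
and $\langle y, A'Ay\rangle = \langle Ay, Ay\rangle = \|Ay\|^2$. Therefore $AA' \in \mathscr{L}_{r+}(\mathcal{H}_1)$ and
$A'A \in \mathscr{L}_{r+}(\mathcal{H}_2)$. For a complete orthonormal basis $\{x_\nu\}$ for $\mathcal{H}_1$ we obtain
$\mathrm{tr}(AA') = \sum_\nu \langle x_\nu, AA'x_\nu\rangle = \sum_\nu \|A'x_\nu\|^2$. For a complete orthonormal basis
$\{y_\nu\}$ for $\mathcal{H}_2$ we may define an antilinear operator $V$ (anti-isometry; see §13) by
$x_\nu = Vy_\nu$.
Thus $A'V \in \mathscr{L}(\mathcal{H}_2)$ and $\mathrm{tr}(AA') = \sum_\nu \|A'Vy_\nu\|^2 = \mathrm{tr}((A'V)^+A'V)$. It is easy
to show that $(A'V)^+ = V'A$. Therefore $\mathrm{tr}(AA') = \mathrm{tr}((A'V)^+(A'V)) =
\mathrm{tr}((A'V)(A'V)^+) = \mathrm{tr}((V'A)^+V'A) = \sum_\nu \|V'Ay_\nu\|^2$. Since $\|V'x\| = \|x\|$ we
finally obtain $\mathrm{tr}(AA') = \sum_\nu \|Ay_\nu\|^2 = \mathrm{tr}(A'A)$.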
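We shall now introduce an inner product on the set of all those
anti-bilinear forms $f$ for which $\mathrm{tr}(AA') = \mathrm{tr}(A'A) < \infty$ as follows: Let $f_1 \leftrightarrow A_1$,
$f_2 \leftrightarrow A_2$; we define $\langle f_1, f_2\rangle = \mathrm{tr}(A_2'A_1)$. In this way the above set becomes a
Hilbert space, the so-called product space $\mathcal{H} = \mathcal{H}_1 \times \mathcal{H}_2$.
For the special case in which $f = z_1 z_2$ we obtain $Ay = z_1\langle y, z_2\rangle$,
$A'x = z_2\langle x, z_1\rangle$ and we therefore obtain $A'Ay = z_2\|z_1\|^2\langle z_2, y\rangle$ and
$\mathrm{tr}(A'A) = \|z_1\|^2\|z_2\|^2$. For $f_1 = z_1^{(1)}z_2^{(1)}$ and for $f_2 = z_1^{(2)}z_2^{(2)}$ it follows that
$$A_2'A_1y = z_2^{(2)}\langle z_1^{(1)}, z_1^{(2)}\rangle\langle z_2^{(1)}, y\rangle$$
from which we find that
$$\langle f_1, f_2\rangle = \langle z_1^{(1)}, z_1^{(2)}\rangle\langle z_2^{(1)}, z_2^{(2)}\rangle.$$
Let $\{x_\nu\}$ and $\{y_\mu\}$ be complete orthonormal bases for $\mathcal{H}_1$ and $\mathcal{H}_2$,
respectively. From $f(x, y) = \langle x, Ay\rangle$ it follows that $f = \sum_{\nu\mu} a_{\nu\mu}x_\nu y_\mu$;
similarly, for $g(x, y) = \langle x, By\rangle$ it follows that $g = \sum_{\nu\mu} b_{\nu\mu}x_\nu y_\mu$. We therefore
obtain $\langle g, f\rangle = \sum_{\nu\mu} \bar b_{\nu\mu}a_{\nu\mu}$. Therefore the set of $x_\nu y_\mu$ is a complete
orthonormal basis in $\mathcal{H} = \mathcal{H}_1 \times \mathcal{H}_2$.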
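The inner product of two product forms can be verified directly from the antilinear operators that represent them. The following sketch takes $\mathcal{H}_1 = \mathbb{C}^2$ and $\mathcal{H}_2 = \mathbb{C}^3$ with arbitrarily chosen vectors; it assumes the convention $\langle f_1, f_2\rangle = \mathrm{tr}(A_2'A_1)$ with $Ay = z_1\langle y, z_2\rangle$ and $A'x = z_2\langle x, z_1\rangle$, and all function names are illustrative.

```python
def inner(x, y):
    # <x, y>, antilinear in x and linear in y
    return sum(a.conjugate() * b for a, b in zip(x, y))

def scale(c, v):
    return [c * a for a in v]

def form_operators(z1, z2):
    # for f = z1 z2:  A y = z1 <y, z2>  (H2 -> H1),  A' x = z2 <x, z1>  (H1 -> H2)
    A = lambda y: scale(inner(y, z2), z1)
    A_prime = lambda x: scale(inner(x, z1), z2)
    return A, A_prime

def trace_on_H2(T, dim):
    # trace of a linear map on H2 = C^dim in the standard basis
    basis = [[1 + 0j if i == k else 0j for i in range(dim)] for k in range(dim)]
    return sum(inner(e, T(e)) for e in basis)

# illustrative vectors: z1_* in H1 = C^2, z2_* in H2 = C^3
z1_a, z2_a = [1 + 1j, 2 + 0j], [0.5j, 1 + 0j, -1 + 2j]
z1_b, z2_b = [3 - 1j, 1j], [2 + 0j, 1 - 1j, 0.5 + 0j]

A1, A1p = form_operators(z1_a, z2_a)
A2, A2p = form_operators(z1_b, z2_b)

lhs = trace_on_H2(lambda y: A2p(A1(y)), 3)          # tr(A2' A1)
rhs = inner(z1_a, z1_b) * inner(z2_a, z2_b)         # product of the two inner products
norm_sq = trace_on_H2(lambda y: A1p(A1(y)), 3)      # tr(A'A) for f = z1 z2
```

Here `lhs` and `rhs` agree, and `norm_sq` equals $\|z_1\|^2\|z_2\|^2$, so product forms behave like tensor products of vectors under this inner product.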
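We will now show that for a given $f \in \mathcal{H}_1 \times \mathcal{H}_2$ we can choose a complete
orthonormal basis $x_\nu y_\mu$ such that
$$f = \sum_\nu \lambda_\nu x_\nu y_\nu \quad\text{where } \lambda_\nu \ge 0. \qquad (14.1)$$
Since $A'A \in \mathscr{B}(\mathcal{H}_2)$, $A'A$ can be expressed in the form $A'A = \sum_\nu \mu_\nu P_{y_\nu}$. For
$Ay_\nu = z_\nu$ it follows that
$$AA'z_\nu = AA'Ay_\nu = \mu_\nu Ay_\nu = \mu_\nu z_\nu.$$
If $z_\nu \ne 0$, then $x_\nu = z_\nu/\|z_\nu\|$ is a normed eigenvector of $AA'$ with eigenvalue
$\mu_\nu$. Clearly $\|z_\nu\|^2 = \langle Ay_\nu, Ay_\nu\rangle = \langle y_\nu, A'Ay_\nu\rangle = \mu_\nu$. Since $A'A$ is positive,
$\mu_\nu \ge 0$. From $f(x, y) = \langle x, Ay\rangle$ it follows that for $y = \sum_\nu y_\nu\langle y_\nu, y\rangle$ we can
write
$$f(x, y) = \sum_\nu \langle x, z_\nu\rangle\langle y, y_\nu\rangle = \sum_\nu \sqrt{\mu_\nu}\,\langle x, x_\nu\rangle\langle y, y_\nu\rangle$$
from which we conclude that $f$ has the form (14.1).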
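A bounded linear operator $B$ in $\mathcal{H} = \mathcal{H}_1 \times \mathcal{H}_2$ is completely defined
by the special values $Bz_1z_2$ where $z_1 \in \mathcal{H}_1$, $z_2 \in \mathcal{H}_2$. For each $A_1 \in \mathscr{L}(\mathcal{H}_1)$
and $A_2 \in \mathscr{L}(\mathcal{H}_2)$ an operator $A_1 \times A_2 \in \mathscr{L}(\mathcal{H}_1 \times \mathcal{H}_2)$ is defined by
$(A_1 \times A_2)z_1z_2 = (A_1z_1)(A_2z_2)$ and satisfies $\|A_1 \times A_2\| = \|A_1\|\|A_2\|$. It is a
simple matter to show that $(A_1 \times A_2)(B_1 \times B_2) = (A_1B_1) \times (A_2B_2)$. As a
special case we obtain $(A_1 \times 1)(1 \times B_2) = (A_1 \times B_2) = (1 \times B_2)(A_1 \times 1)$.
Thus we find that $A \times 1$ and $1 \times B$ commute. Under certain circumstances
there is a partial converse.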
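In finite dimensions the operator $A_1 \times A_2$ is represented, in the product basis $x_\nu y_\mu$, by the Kronecker product of the two matrices. The product rule and the commutation of $A \times 1$ with $1 \times B$ can then be checked exactly with integer matrices. The sketch below uses arbitrary illustrative matrices; `kron` is our own helper, not a library call.

```python
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def kron(P, Q):
    # matrix of P x Q in the product basis; rows are indexed by (i, k), columns by (j, l)
    return [[P[i][j] * Q[k][l] for j in range(len(P[0])) for l in range(len(Q[0]))]
            for i in range(len(P)) for k in range(len(Q))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# operators in H1 = C^2
A1, B1 = [[1, 2], [3, 4]], [[0, 1], [-1, 2]]
# operators in H2 = C^3
A2 = [[2, 0, 1], [1, 1, 0], [0, 3, -1]]
B2 = [[1, 1, 0], [0, 2, 1], [4, 0, 1]]

# (A1 x A2)(B1 x B2) = (A1 B1) x (A2 B2)
mixed_ok = matmul(kron(A1, A2), kron(B1, B2)) == kron(matmul(A1, B1), matmul(A2, B2))

# (A1 x 1)(1 x B2) = A1 x B2 = (1 x B2)(A1 x 1)
left = matmul(kron(A1, identity(3)), kron(identity(2), B2))
right = matmul(kron(identity(2), B2), kron(A1, identity(3)))
commute_ok = left == right == kron(A1, B2)
```

Because all entries are integers, both comparisons are exact rather than approximate.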
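Let $M \subset \mathscr{L}(\mathcal{H})$ and suppose that $M$ has the property that if $A \in M$ then
$A^+ \in M$. If $\mathcal{T}$ is a subspace which is invariant under $M$ (that is, $A\mathcal{T} \subset \mathcal{T}$ for all $A \in M$),
then so is $\mathcal{T}^\perp$, since if $u \in \mathcal{T}^\perp$, for $A \in M$ and $x \in \mathcal{T}$ it follows that $\langle x, Au\rangle =
\langle A^+x, u\rangle = 0$ since $A^+x \in \mathcal{T}$. Let $M$ be a set and suppose we are given two
maps $M \xrightarrow{j_1} \mathscr{L}(\mathcal{H}_1)$ and $M \xrightarrow{j_2} \mathscr{L}(\mathcal{H}_2)$. A bounded linear map $\mathcal{H}_1 \xrightarrow{B} \mathcal{H}_2$ is
called a homomorphism with respect to the operator domain $M$ if $B(j_1x) =
(j_2x)B$ for all $x \in M$. Often the functions $j_1$ and $j_2$ are ignored, and we
consider the elements of $M$ as operators in $\mathcal{H}_1$ and $\mathcal{H}_2$, where we observe that
two different elements of $M$ can correspond to the same operator in $\mathcal{H}_1$ and
$\mathcal{H}_2$. In this sense the elements of a subset $M \subset \mathscr{L}(\mathcal{H})$ can be considered to be
operators in an invariant subspace $\mathcal{T}$ of $\mathcal{H}$. In order to emphasize the fact
that the same operator domain $M$ is under consideration for both $\mathcal{H}_1$ and $\mathcal{H}_2$
we shall often write $(M)\mathcal{H}_1$ and $(M)\mathcal{H}_2$.
Two Hilbert spaces $(M)\mathcal{T}_1$ and $(M)\mathcal{T}_2$ (for example, two invariant
subspaces of $\mathcal{H}$ for which $M \subset \mathscr{L}(\mathcal{H})$) are said to be isomorphic if there
exists an isomorphism $\mathcal{T}_1 \xrightarrow{U} \mathcal{T}_2$ such that $Ax_1 = U^{-1}AUx_1$ for all
$A \in M$, that is, if $UA = AU$ holds (as maps of $\mathcal{T}_1$ into $\mathcal{T}_2$) for all $A \in M$.
A space $(M)\mathcal{T}$ is said to be irreducible if it contains no invariant subspaces
(other than $\{0\}$ and $\mathcal{T}$). If $\mathcal{T}_1$ and $\mathcal{T}_2$ are isomorphic and if $\mathcal{T}_1$ is irreducible,
then $\mathcal{T}_2$ is also irreducible. If $\mathcal{T}_1$, $\mathcal{T}_2$ are irreducible, and $\mathcal{T}_1 \xrightarrow{U_1} \mathcal{T}_2$,
$\mathcal{T}_1 \xrightarrow{U_2} \mathcal{T}_2$ are two isomorphic maps, then for $U_2^{-1}U_1$ from $U_1A = AU_1$ and
$U_2A = AU_2$ it follows that $U_2^{-1}U_1A = U_2^{-1}AU_1 = AU_2^{-1}U_1$, that is, the
unitary operator $V = U_2^{-1}U_1$ commutes in $\mathcal{T}_1$ with all operators of $M$ (as
operators in $\mathcal{T}_1$). Therefore the spectral family of $V$ commutes with all
operators in $M$. If $V \ne e^{i\alpha}1$ then there exists an element $E(\varphi)$ of the spectral
family for which $0 \ne E(\varphi) \ne 1$. From $AE(\varphi) = E(\varphi)A$ it follows that $E(\varphi)\mathcal{T}_1$
is an invariant subspace of $\mathcal{T}_1$, in contradiction to the irreducibility of $\mathcal{T}_1$.
Therefore $U_2^{-1}U_1 = e^{i\alpha}1$, that is, $U_1 = e^{i\alpha}U_2$. Two isomorphic maps can
therefore differ only by a "phase factor."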
We will now show that if $\mathcal{T}$ is an irreducible space and if $B \in \mathscr{L}(\mathcal{T})$
commutes with all operators in $M$ (here we then say that $B$ commutes with
$M$) then $B = \tau 1$. From $A \in M$, $BA = AB$ it follows that $A^+B^+ = B^+A^+$;
since $A^+ \in M$ whenever $A \in M$, $B^+$ commutes with $M$. Therefore $B + B^+$
and $i(B - B^+)$ commute with $M$. If a self-adjoint operator $D$ commutes with
$M$, then $M$ commutes with its spectral family. If a projection operator $P$
commutes with $M$, then $P\mathcal{T}$ is an invariant subspace. Since $\mathcal{T}$ is irreducible,
the elements of the spectral family can only be equal to 0 or 1; therefore
$D = \lambda 1$. Therefore $B + B^+ = \tau_1 1$ and $i(B - B^+) = \tau_2 1$ and therefore
$B = \tau 1$.
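A small finite-dimensional illustration of this result: on $\mathbb{C}^2$ the set $M = \{\sigma_x, \sigma_z\}$ of two Pauli matrices is self-adjoint and has no common invariant one-dimensional subspace, so $\mathbb{C}^2$ is irreducible under it. The sketch below (our own example, restricted to a small grid of integer matrices for the sake of an exact search) finds every matrix on that grid commuting with $M$.

```python
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(P, Q):
    PQ, QP = matmul(P, Q), matmul(Q, P)
    return [[PQ[i][j] - QP[i][j] for j in range(2)] for i in range(2)]

def commutes_with_all(B, M):
    return all(commutator(B, A) == [[0, 0], [0, 0]] for A in M)

sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]
M = [sigma_x, sigma_z]      # both self-adjoint, so M contains A+ together with A

# exhaustive search over a small grid of integer matrices (an illustrative restriction)
rng = range(-2, 3)
commutant = [[[a, b], [c, d]] for a in rng for b in rng for c in rng for d in rng
             if commutes_with_all([[a, b], [c, d]], M)]
only_multiples = all(B[0][1] == 0 and B[1][0] == 0 and B[0][0] == B[1][1]
                     for B in commutant)
```

Only the five multiples of the unit matrix on the grid survive, in agreement with $B = \tau 1$.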
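If the irreducible spaces $(M)\mathcal{T}_1$ and $(M)\mathcal{T}_2$ are not isomorphic, then it
follows that for a bounded linear map $\mathcal{T}_1 \xrightarrow{B} \mathcal{T}_2$ which satisfies $AB = BA$ for
all $A \in M$, that is, for a homomorphism $B$, $B = 0$. Conversely, $\mathcal{T}_1$ and $\mathcal{T}_2$ are
isomorphic if there exists a homomorphism $B \ne 0$.
Proof. $l(y) = \langle Bx, By\rangle$ defines a bounded linear form in $\mathcal{T}_1$, from which, according
to §4, there exists a $z \in \mathcal{T}_1$ for which $l(y) = \langle z, y\rangle$. By $z = Dx$ an operator $D \in \mathscr{L}(\mathcal{T}_1)$ is
defined which satisfies $l(y) = \langle Dx, y\rangle$. From $\langle Bx, By\rangle = \overline{\langle By, Bx\rangle}$ it follows
that $\langle Dx, y\rangle = \overline{\langle Dy, x\rangle} = \langle x, Dy\rangle$, that is, $D^+ = D$, and from $\langle Bx, Bx\rangle \ge 0$ it follows
that $D \ge 0$. Therefore $\langle Bx, By\rangle = \langle x, Dy\rangle$. From $AB = BA$ for all
$A \in M$ it follows, since $A^+ \in M$, that $\langle x, ADy\rangle = \langle A^+x, Dy\rangle = \langle BA^+x, By\rangle =
\langle A^+Bx, By\rangle = \langle Bx, ABy\rangle = \langle Bx, BAy\rangle = \langle x, DAy\rangle$ and therefore we obtain
$AD = DA$. Therefore $D = \lambda 1$ and from $D = D^+ \ge 0$ we obtain $\lambda \ge 0$.
If $\lambda \ne 0$ then for $U = \lambda^{-1/2}B$ it follows that $\langle Ux, Uy\rangle = \langle x, y\rangle$. $U$ is therefore an
isometric map of $\mathcal{T}_1$ into $\mathcal{T}_2$. Thus $U$ is an isomorphic map of $\mathcal{T}_1$ onto $U\mathcal{T}_1 \subset \mathcal{T}_2$.
$U\mathcal{T}_1$ is, on account of the commutativity of $U$ with $M$, an invariant subspace. Since
$\mathcal{T}_2$ is irreducible, we must have $U\mathcal{T}_1 = \mathcal{T}_2$ and therefore $(M)\mathcal{T}_1$ and $(M)\mathcal{T}_2$ are
isomorphic, in contradiction to the fact that we assumed that they were not
isomorphic. Therefore $\lambda = 0$, that is, $\|Bx\|^2 = 0$ and therefore $B = 0$.
From the same proof it also follows that if $(M)\mathcal{T}_1$ is irreducible, but $(M)\mathcal{T}_2$ is
not necessarily irreducible, then $B = \lambda U$ (with a suitable number $\lambda$) where $U$ is an isomorphic map of
$\mathcal{T}_1$ onto the invariant subspace $U\mathcal{T}_1$ of $\mathcal{T}_2$. Since $\mathcal{T}_1$ is irreducible, $U\mathcal{T}_1$
is irreducible.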
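$(M)\mathcal{H}$ is said to be completely reducible if, in every invariant subspace of
$\mathcal{H}$, there exists an irreducible (invariant) subspace. The set $\{R \mid R$ is a set of
pairwise orthogonal irreducible invariant subspaces$\}$ satisfies the conditions
of Zorn's lemma (with respect to the partial order of inclusion). Thus it
follows that there exists a maximal element $R_m$ and that the elements $\mathcal{T}_\nu$ of $R_m$
satisfy the relation $\mathcal{H} = \sum_\nu \oplus \mathcal{T}_\nu$. Let $P_\nu$ be the projectors on $\mathcal{T}_\nu$. Then
$\sum_\nu P_\nu = 1$ and, for each $A \in M$, $A = \sum_\nu P_\nu AP_\nu$. We shall now investigate the
structure of a $B \in \mathscr{L}(\mathcal{H})$ which commutes with $M$. It is easy to show that
$AB = BA$ is equivalent to $P_\nu AP_\nu P_\nu BP_\mu = P_\nu BP_\mu P_\mu AP_\mu$ for all pairs $\nu, \mu$. If
$\mathcal{T}_\nu$, $\mathcal{T}_\mu$ are not isomorphic spaces, then, as above, it follows that $P_\nu BP_\mu = 0$. If
$\mathcal{T}_\nu$ and $\mathcal{T}_\mu$ are isomorphic, then there exists an isomorphic map $\mathcal{T}_\mu \xrightarrow{U_{\nu\mu}} \mathcal{T}_\nu$
from which it therefore follows that $P_\nu AP_\nu U_{\nu\mu} = U_{\nu\mu}P_\mu AP_\mu$. $F = U_{\nu\mu}^{-1}P_\nu BP_\mu$
therefore commutes with $P_\mu AP_\mu$ and therefore, as an operator in $\mathcal{T}_\mu$, is a
multiple of the unit operator, that is, $F = \tau_{\nu\mu}P_\mu$; we find that
$P_\nu BP_\mu = \tau_{\nu\mu}U_{\nu\mu}P_\mu$. Thus it follows that
$$B = \sum_{\nu,\mu} P_\nu BP_\mu = {\sum_{\nu,\mu}}' \tau_{\nu\mu}U_{\nu\mu}P_\mu, \qquad (14.2)$$
where $\sum'$ is only taken over such pairs $\nu, \mu$ for which $\mathcal{T}_\nu$ is isomorphic to $\mathcal{T}_\mu$.
The set of $\mathcal{T}_\nu$ can be divided into classes, whereby two $\mathcal{T}_\nu$ belong to the same
class when they are isomorphic. Let $\mathcal{S}_\alpha = \sum_\nu^{(\alpha)} \oplus \mathcal{T}_\nu$ where the sum is taken
over the $\mathcal{T}_\nu$ belonging to the same class labeled with the index $\alpha$. We find that
$\mathcal{H} = \sum_\alpha \oplus \mathcal{S}_\alpha$. We will now show that the $\mathcal{S}_\alpha$ are uniquely determined.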
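If, in addition, $\mathcal{H} = \sum_\rho \oplus \mathcal{T}_\rho'$ and $\mathcal{S}_\alpha' = \sum_\rho^{(\alpha)} \oplus \mathcal{T}_\rho'$, where the index $\alpha$
means that $\mathcal{S}_\alpha'$ contains precisely those $\mathcal{T}_\rho'$ which are isomorphic to the $\mathcal{T}_\nu$ from $\mathcal{S}_\alpha$,
then the projection $P_\rho'$ onto a $\mathcal{T}_\rho'$ in $\mathcal{S}_\alpha'$ commutes with $M$.
$P_\rho'P_\nu$ is therefore a mapping $\mathcal{T}_\nu \to \mathcal{T}_\rho'$ which commutes with $M$, and is
therefore equal to zero if $\mathcal{T}_\nu$ and $\mathcal{T}_\rho'$ are not isomorphic. Therefore it follows
that, for $P_\rho'$ instead of $B$ in (14.2),
$$P_\rho' = \sum_{\nu\mu}^{(\alpha)} \tau_{\nu\mu}U_{\nu\mu}P_\mu,$$
where we only sum over those pairs $\nu, \mu$ for which $\mathcal{T}_\nu, \mathcal{T}_\mu \subset \mathcal{S}_\alpha$. Thus it follows
that $\mathcal{T}_\rho' \subset \mathcal{S}_\alpha$, that is, $\mathcal{S}_\alpha' \subset \mathcal{S}_\alpha$. In the same way we also obtain $\mathcal{S}_\alpha \subset \mathcal{S}_\alpha'$, that is,
$\mathcal{S}_\alpha = \mathcal{S}_\alpha'$.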
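We shall now consider one of the $\mathcal{S}_\alpha = \sum_\nu^{(\alpha)} \oplus \mathcal{T}_\nu$; let $\mathcal{T}_{\nu_0}$ be one of the $\mathcal{T}_\nu$.
Let $\mathcal{T}_{\nu_0} \xrightarrow{U_\nu} \mathcal{T}_\nu$ be isomorphisms. Then $U_{\nu\mu} = U_\nu U_\mu^{-1}$ is an isomorphism
$\mathcal{T}_\mu \to \mathcal{T}_\nu$. Let $Q_\alpha$ be the projection onto $\mathcal{S}_\alpha$; from (14.2) we obtain
$$Q_\alpha BQ_\alpha = \sum_{\nu\mu}^{(\alpha)} \tau_{\nu\mu}U_{\nu\mu}P_\mu. \qquad (14.3)$$
Let $x_{\rho\nu_0}$ be a complete orthonormal basis in $\mathcal{T}_{\nu_0}$; then the $x_{\rho\mu} = U_\mu x_{\rho\nu_0}$ (for
fixed $\mu$) form a complete orthonormal basis for $\mathcal{T}_\mu$. From (14.3) it follows that
$$Q_\alpha BQ_\alpha x_{\rho\mu} = \sum_\nu^{(\alpha)} \tau_{\nu\mu}x_{\rho\nu}. \qquad (14.4)$$
For $A \in M$, from $Q_\alpha AQ_\alpha = \sum_\nu^{(\alpha)} P_\nu AP_\nu$ and from $P_\nu AP_\nu U_\nu = U_\nu P_{\nu_0}AP_{\nu_0}$ (see
above, setting $U_{\nu\nu_0} = U_\nu$) it follows that $P_\nu AP_\nu = U_\nu P_{\nu_0}AP_{\nu_0}U_\nu^{-1}$. Thus,
for $Ax_{\rho\nu_0} = \sum_\sigma x_{\sigma\nu_0}a_{\sigma\rho}$ it follows that
$$Ax_{\rho\mu} = \sum_\sigma x_{\sigma\mu}a_{\sigma\rho}. \qquad (14.5)$$
We now introduce two Hilbert spaces: $\mathcal{H}_1^{(\alpha)}$, which has the dimension of the
$\mathcal{T}_\nu \subset \mathcal{S}_\alpha$, and $\mathcal{H}_2^{(\alpha)}$, which has the dimension equal to the number of the
$\mathcal{T}_\nu \subset \mathcal{S}_\alpha$. Let $z_\rho$, $m_\mu$ be complete orthonormal bases for $\mathcal{H}_1^{(\alpha)}$ and $\mathcal{H}_2^{(\alpha)}$,
respectively; to each operator $A \in M$ we assign an operator $A^{(\alpha)}$ as follows:
$$A^{(\alpha)}z_\rho = \sum_\sigma z_\sigma a_{\sigma\rho}. \qquad (14.6)$$
Then $x_{\rho\mu} \to z_\rho m_\mu$ defines an isomorphism $\mathcal{S}_\alpha \to \mathcal{H}_1^{(\alpha)} \times \mathcal{H}_2^{(\alpha)}$ if, to the
operator $Q_\alpha AQ_\alpha$, that is, to the operator $A$ in $\mathcal{S}_\alpha$, we assign the operator
$A^{(\alpha)} \times 1$ where $A^{(\alpha)}$ is defined by (14.6). Here we often use the simpler notation
$\mathcal{S}_\alpha = \mathcal{H}_1^{(\alpha)} \times \mathcal{H}_2^{(\alpha)}$ and, for $A \in M$, $Q_\alpha AQ_\alpha = A^{(\alpha)} \times 1$. From (14.4) it follows
that $Q_\alpha BQ_\alpha = 1 \times B^{(\alpha)}$ where $B^{(\alpha)}$ is an operator in $\mathcal{H}_2^{(\alpha)}$.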
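We may therefore write $\mathcal{H}$ in the form
$$\mathcal{H} = \sum_\alpha \oplus \mathcal{H}_1^{(\alpha)} \times \mathcal{H}_2^{(\alpha)}, \qquad (14.7)$$
where the $A \in M$ are given by
$$A = \sum_\alpha Q_\alpha(A^{(\alpha)} \times 1)Q_\alpha. \qquad (14.8)$$
All operators which commute with $M$ therefore have the form
$$B = \sum_\alpha Q_\alpha(1 \times B^{(\alpha)})Q_\alpha, \qquad (14.9)$$
where the $B^{(\alpha)} \in \mathscr{L}(\mathcal{H}_2^{(\alpha)})$ are arbitrary. The $\mathcal{H}_1^{(\alpha)}$ are irreducible with respect
to the operators $A^{(\alpha)}$, and $\mathcal{H}_1^{(\alpha)}$, $\mathcal{H}_1^{(\beta)}$ are, for $\alpha \ne \beta$, not isomorphic with
respect to $A^{(\alpha)}$ and $A^{(\beta)}$.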
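If $B$ is a homomorphism of $(M)\mathcal{H}$ into $(M)\mathcal{T}$ and if $(M)\mathcal{H}$ is completely
reducible, then, as we have shown above, a self-adjoint operator $D$ in $\mathcal{H}$
which commutes with all operators in $M$ is defined by $\langle x, Dx\rangle = \langle Bx, Bx\rangle$.
According to (14.9)
$$D = \sum_\alpha Q_\alpha(1 \times D^{(\alpha)})Q_\alpha.$$
From $\langle Bx, Bx\rangle = \langle x, Dx\rangle$ it also follows that $\langle By, Bx\rangle = \langle y, Dx\rangle$. For
$x_1 \in \mathcal{H}_1^{(\alpha)}$, $x_2 \in \mathcal{H}_2^{(\alpha)}$, $y_1 \in \mathcal{H}_1^{(\beta)}$, $y_2 \in \mathcal{H}_2^{(\beta)}$ it therefore follows that
$$\langle By_1y_2, Bx_1x_2\rangle = \delta_{\alpha\beta}\langle y_1, x_1\rangle\langle y_2, D^{(\alpha)}x_2\rangle. \qquad (14.10)$$
The subspaces $Q_\alpha\mathcal{H}$ will therefore, for different $\alpha$, be mapped onto orthogonal
subspaces (or $\{0\}$). Therefore (here $\bar A$ is the closure of $A$) we obtain
$$\overline{B\mathcal{H}} = {\sum_\alpha}' \oplus \overline{BQ_\alpha\mathcal{H}}, \qquad (14.11)$$
where the prime indicates that the sum is to take place only over the
$BQ_\alpha\mathcal{H} \ne \{0\}$. It is therefore sufficient to examine the structure of the
individual $\overline{BQ_\alpha\mathcal{H}}$.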
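Let $\mathcal{H}_{20}^{(\alpha)}$ denote the subspace of $\mathcal{H}_2^{(\alpha)}$ which is the eigenspace of $D^{(\alpha)}$ for
which the eigenvalue is 0, and let $\hat{\mathcal{H}}_2^{(\alpha)} = \mathcal{H}_2^{(\alpha)} \ominus \mathcal{H}_{20}^{(\alpha)}$ be its orthogonal
complement. From (14.10), for $\beta = \alpha$, it follows that
$$Bx_1x_2 = U^{(\alpha)}(x_1\,D^{(\alpha)1/2}x_2)$$
defines an isomorphism $U^{(\alpha)}$ of $\mathcal{H}_1^{(\alpha)} \times \hat{\mathcal{H}}_2^{(\alpha)}$ onto the subspace $\overline{BQ_\alpha\mathcal{H}}$. $U^{(\alpha)}$ defines a
product representation of $\overline{BQ_\alpha\mathcal{H}} = \mathcal{T}_1^{(\alpha)} \times \mathcal{T}_2^{(\alpha)}$. $U^{(\alpha)}$ therefore defines a pair
of isomorphisms $U_1^{(\alpha)}$: $\mathcal{H}_1^{(\alpha)} \to \mathcal{T}_1^{(\alpha)}$ and $U_2^{(\alpha)}$: $\hat{\mathcal{H}}_2^{(\alpha)} \to \mathcal{T}_2^{(\alpha)}$ by $U^{(\alpha)} =
U_1^{(\alpha)} \times U_2^{(\alpha)}$. On the subspace
$$\hat{\mathcal{H}} = \sum_\alpha \oplus \mathcal{H}_1^{(\alpha)} \times \hat{\mathcal{H}}_2^{(\alpha)}$$
the relation
$$B = {\sum_\alpha}' \tilde Q_\alpha(U_1^{(\alpha)} \times U_2^{(\alpha)}D^{(\alpha)1/2})Q_\alpha$$
is satisfied; $B$ is the null operator on the subspace in $\mathcal{H}$ which is orthogonal
to $\hat{\mathcal{H}}$. An operator $A$ in $M$ has the following form
$$A = {\sum_\alpha}' \tilde Q_\alpha(\tilde A^{(\alpha)} \times 1)\tilde Q_\alpha$$
with respect to the decomposition of the invariant subspace $\overline{B\mathcal{H}}$ described
by (14.11):
$$\overline{B\mathcal{H}} = {\sum_\alpha}' \oplus \mathcal{T}_1^{(\alpha)} \times \mathcal{T}_2^{(\alpha)},$$
where $\tilde Q_\alpha$ is the projection operator on $\mathcal{T}_1^{(\alpha)} \times \mathcal{T}_2^{(\alpha)}$.
In addition, for $A^{(\alpha)}$ in (14.8) we obtain
$$U_1^{(\alpha)}A^{(\alpha)} = \tilde A^{(\alpha)}U_1^{(\alpha)}.$$
Therefore the irreducible spaces $\mathcal{H}_1^{(\alpha)}$ and $\mathcal{T}_1^{(\alpha)}$ are isomorphic. An isomorphism
of $(M)\hat{\mathcal{H}}$ onto $(M)\overline{B\mathcal{H}}$ is defined by the $U^{(\alpha)} = U_1^{(\alpha)} \times U_2^{(\alpha)}$.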
For finite-dimensional $\mathcal{H}$ it is easy to verify that an operator system
$M$ which contains both $A$ and $A^+$ is always completely reducible (in an
invariant, reducible subspace there must be invariant subspaces of smaller
dimension). In general this is not the case for infinite-dimensional $\mathcal{H}$, so that,
in applications, it is necessary to obtain a special proof of the complete
reducibility of $M$.
15 The Spaces $\mathscr{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $\mathscr{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$
The results of the previous sections can easily be extended to sequences of
Hilbert spaces. Here we shall only give a brief outline of the proofs.
Suppose that we are given a sequence Жи Ж2,... of Hilbert spaces. Let us
define Ж as a disjoint union of the sets Жп, that is,
Ж = \)Жп. (15.1)
tl
Жп is therefore a subset of Ж. We find that, in the sense of subsets of Ж\
Жп n Жт = 0 for n Ф т. Ж is clearly not a linear vector space! An inner
product between two elements of Ж is therefore only defined if the elements
belong to the same subset Жп. Otherwise, two elements which belong to
different Жп are said to be orthogonal.
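An orthonormal set in $\mathcal{H}$ is a set $\{x_\nu\}$ of pairwise orthogonal elements of
$\mathcal{H}$ for which $\|x_\nu\| = 1$. Such a set is said to be complete if its intersection with
$\mathcal{H}_n$ is, for each $n$, a complete orthonormal basis for $\mathcal{H}_n$. Each $x \in \mathcal{H}$ may be
expressed in terms of a complete orthonormal basis as follows: $x$ belongs to
one of the $\mathcal{H}_n$ and may therefore be expressed in terms of the portion $\{x_\nu^{(n)}\}$ of
the complete orthonormal basis belonging to $\mathcal{H}_n$ in the form described in §2.
A closed subspace of $\mathcal{H}$ is a subset $\mathcal{T}$ for which all the $\mathcal{T}_n = \mathcal{T} \cap \mathcal{H}_n$ are
closed subspaces of $\mathcal{H}_n$. $\mathcal{T}$ is therefore the disjoint union of the $\mathcal{T}_n$, that is,
$\mathcal{T} = \bigcup_n \mathcal{T}_n$. Let $\mathcal{T}_n^\perp$ denote the orthogonal subspace of $\mathcal{T}_n$ in $\mathcal{H}_n$. We now
define $\mathcal{T}^\perp = \bigcup_n \mathcal{T}_n^\perp$. Let $\mathcal{T}^{(1)} = \bigcup_n \mathcal{T}_n^{(1)}$ and $\mathcal{T}^{(2)} = \bigcup_n \mathcal{T}_n^{(2)}$. The greatest
lower bound (infimum) $\mathcal{T}^{(1)} \wedge \mathcal{T}^{(2)}$ of $\mathcal{T}^{(1)}$ and $\mathcal{T}^{(2)}$ is given by
$\mathcal{T}^{(1)} \wedge \mathcal{T}^{(2)} = \mathcal{T}^{(1)} \cap \mathcal{T}^{(2)} = \bigcup_n (\mathcal{T}_n^{(1)} \cap \mathcal{T}_n^{(2)})$; the least upper bound
(supremum) $\mathcal{T}^{(1)} \vee \mathcal{T}^{(2)}$ of $\mathcal{T}^{(1)}$ and $\mathcal{T}^{(2)}$ is given by
$\mathcal{T}^{(1)} \vee \mathcal{T}^{(2)} = \bigcup_n (\mathcal{T}_n^{(1)} \vee \mathcal{T}_n^{(2)})$. It is easy to verify that the set of closed subspaces of $\mathcal{H}$ is a
complete orthocomplemented lattice.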
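An operator $A$ in $\mathcal{H}$ is defined as a sequence $A = (A_1, A_2, \ldots)$ of operators
$A_n$ in $\mathcal{H}_n$ which satisfies the condition: for $x \in \mathcal{H}_n$, $Ax = A_n x$. $A$ is said to be
bounded if there exists a $c$ for which $\|Ax\| \le c\|x\|$. It follows that
$$\|A\| = \sup_{\|x\| \le 1} \|Ax\| = \sup_n \|A_n\|.$$
We denote the set of bounded operators in $\mathcal{H}$ by $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$. It is easy to
show that $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is a Banach algebra, provided that the product of $A$
and $B$ is defined by $AB = (A_1 B_1, A_2 B_2, \ldots)$; the adjoint $A^+$ of $A$ is defined
by $A^+ = (A_1^+, A_2^+, \ldots)$. $A$ is said to be self-adjoint if $A^+ = A$.
The set $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ of all self-adjoint operators of $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is an
ordered Banach space where $A \ge 0$ if $A_n \ge 0$ for all $n$. The unit sphere of
$\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is the order interval $[-1, 1]$ where $1$ is the unit operator
$1x = x$ for all $x$. $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is therefore an order unit space. For $A \ge 0$
we define the positive square root:
$$A^{1/2} = (A_1^{1/2}, A_2^{1/2}, \ldots),$$
where the $A_n^{1/2}$ are positive roots of $A_n$.
If $\mathcal{T} = \bigcup_n \mathcal{T}_n$ is a closed subspace of $\mathcal{H}$ then to each $x \in \mathcal{H}$ there is a
corresponding $q \in \mathcal{T}$ given by $q = P_n x$ for $x \in \mathcal{H}_n$. Thus $q = Px$ where
$P = (P_1, P_2, \ldots)$ and the $P_n$ are the projection operators onto $\mathcal{T}_n$. It follows that $P^+ = P$
and $P^2 = P$. Conversely, if for $P \in \mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ $P^2 = P$, then a closed
subspace of $\mathcal{H}$ is defined by $\mathcal{T} = P\mathcal{H}$, and $P = (P_1, P_2, \ldots)$, $P_n^2 = P_n$ and
$\mathcal{T} = \bigcup_n \mathcal{T}_n$ where $\mathcal{T}_n = P_n \mathcal{H}_n$. For $x \in \mathcal{H}$, $\|x\| = 1$ the projection operator
on the subspace generated by $x$, $P_x$, is given as follows (for $x \in \mathcal{H}_n$):
$$P_x = (0, \ldots, 0, P_x^{(n)}, 0, \ldots),$$
where $P_x^{(n)}$, in the $n$th position, is the projection operator in $\mathcal{H}_n$ onto the
one-dimensional subspace spanned by $x$.
The relationships derived in §6 may be directly applied to the set of
projection operators in $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$.
An operator $V \in \mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is said to be isometric if for
$V = (V_1, V_2, \ldots)$ all $V_n$ are isometric. $U$ is said to be unitary if $U$ and $U^+$ are
isometric. All the results from §7 are directly applicable to the isometric and
unitary operators in $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$.
From §8 it follows that if $A \in \mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ where $A = (A_1, A_2, \ldots)$ and
if $E_n(\lambda)$ is the spectral family of $A_n$ and if $E(\lambda) = (E_1(\lambda), E_2(\lambda), \ldots)$ then
$$A = \int \lambda \, dE(\lambda),$$
where the integral is norm-convergent. A similar result holds for unitary
operators.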
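The componentwise rules for product, adjoint, and norm of operator sequences can be tried out in a minimal sketch. It rests on a strong simplifying assumption that is not part of the text: each $A_n$ is diagonal in a fixed orthonormal basis of a finite-dimensional $\mathcal{H}_n$, so a block is stored as its list of diagonal entries. The names `product`, `adjoint`, `norm`, and `is_self_adjoint` are hypothetical.

```python
from typing import List

Block = List[complex]     # diagonal entries of A_n in a fixed basis of H_n (assumption)
Operator = List[Block]    # the sequence A = (A_1, A_2, ...)


def product(a: Operator, b: Operator) -> Operator:
    """AB = (A_1 B_1, A_2 B_2, ...), computed component by component."""
    return [[p * q for p, q in zip(an, bn)] for an, bn in zip(a, b)]


def adjoint(a: Operator) -> Operator:
    """A^+ = (A_1^+, A_2^+, ...); for diagonal blocks this conjugates the entries."""
    return [[p.conjugate() for p in an] for an in a]


def norm(a: Operator) -> float:
    """||A|| = sup_n ||A_n||; a diagonal block has norm max |entry|."""
    return max(max(abs(p) for p in an) for an in a)


def is_self_adjoint(a: Operator) -> bool:
    """A is self-adjoint when A^+ = A."""
    return adjoint(a) == a


# Two blocks of different dimension, one per space.
A = [[1.0, -3.0], [2.0, 0.5, 0.0]]
B = [[0.5, 1.0], [1j, 2.0, 4.0]]
```

With these values `A` is self-adjoint and `B` is not, and the norm of the product is bounded by the product of the norms, as required in a Banach algebra.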
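An operator $A = (A_1, A_2, \ldots)$ is said to be compact if all the $A_n$ are
compact. For a compact operator it follows that (9.2) holds.
Let $\{x_\nu\}$ be a complete orthonormal basis for $\mathcal{H}$; let $A \in \mathcal{A}_{r+}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$.
The expression (11.1) is independent of the choice of the complete orthonormal basis; we
denote it by $\operatorname{tr}(A)$. For $A = (A_1, A_2, \ldots)$
$$\operatorname{tr}(A) = \sum_n \operatorname{tr}(A_n). \tag{15.2}$$
Let $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ denote the set of operators from $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ for which
$$\|A\|_s = \operatorname{tr}(\sqrt{A^2}) = \operatorname{tr}(A_+) + \operatorname{tr}(A_-) = \sum_n \operatorname{tr}(\sqrt{A_n^2}) = \sum_n \operatorname{tr}(A_{n+}) + \sum_n \operatorname{tr}(A_{n-})$$
is finite. Thus we obtain
$$\|A\|_s = \sum_n \|A_n\|_s,$$
where $\|A_n\|_s$ is the norm in $\mathcal{B}(\mathcal{H}_n)$. $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is norm-separable because
all the $\mathcal{B}(\mathcal{H}_n)$ are norm-separable.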
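The additivity of the trace and of the trace norm over the components can be checked numerically in a small sketch. As a simplifying assumption not made in the text, every block is taken to be self-adjoint and diagonal in one fixed basis, so it is stored as a list of real eigenvalues, and the operator `B` used in the bound is diagonal in the same basis. All function names are hypothetical.

```python
from typing import List

Block = List[float]       # real eigenvalues of a self-adjoint block A_n (assumption)
Operator = List[Block]    # the sequence A = (A_1, A_2, ...)


def trace(a: Operator) -> float:
    """tr(A) = sum over n of tr(A_n)."""
    return sum(sum(an) for an in a)


def positive_part(a: Operator) -> Operator:
    """A_+ keeps the positive eigenvalues of each block."""
    return [[max(p, 0.0) for p in an] for an in a]


def negative_part(a: Operator) -> Operator:
    """A_- keeps the moduli of the negative eigenvalues, so A = A_+ - A_-."""
    return [[max(-p, 0.0) for p in an] for an in a]


def trace_norm(a: Operator) -> float:
    """||A||_s = sum over n of ||A_n||_s, with ||A_n||_s the sum of |eigenvalues|."""
    return sum(sum(abs(p) for p in an) for an in a)


def operator_norm(b: Operator) -> float:
    """||B|| = sup over n of ||B_n||."""
    return max(max(abs(p) for p in bn) for bn in b)


def trace_of_product(a: Operator, b: Operator) -> float:
    """tr(AB) = sum over n of tr(A_n B_n), for blocks diagonal in a common basis."""
    return sum(sum(p * q for p, q in zip(an, bn)) for an, bn in zip(a, b))


def in_base(w: Operator) -> bool:
    """W belongs to the base K when W >= 0 and tr(W) = 1."""
    return all(p >= 0 for wn in w for p in wn) and trace(w) == 1.0


A = [[0.5, -0.25], [0.25, 0.0, -1.0]]
B = [[2.0, 1.0], [-1.0, 3.0, 0.5]]
W = [[0.5, 0.0], [0.25, 0.25, 0.0]]
```

The sample values are exactly representable in binary floating point, so the equalities can be compared without a tolerance; for general data a tolerance would be needed.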
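It is easy to show that $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is an ordered Banach space and is, in
addition, a base norm space where the basis $K$ is the set of all
$W = (W_1, W_2, \ldots) \ge 0$ for which
$$\operatorname{tr}(W) = \sum_n \operatorname{tr}(W_n) = 1.$$
For $B \in \mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$, $l(A) = \operatorname{tr}(AB)$ is a bounded linear form over
$\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ for which $\operatorname{tr}(AB) \le \|A\|_s \|B\|$, where the latter follows directly
from $\operatorname{tr}(AB) = \sum_n \operatorname{tr}(A_n B_n)$.
Conversely, if $l$ is a bounded linear form over $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ then, from
$$A = (A_1, A_2, \ldots) = (A_1, 0, \ldots) + (0, A_2, 0, \ldots) + \cdots,$$
where this sum converges in the norm, it follows that
$$l(A) = \sum_n l_n(A_n),$$
where $l_n(A_n)$ is a bounded linear form over $\mathcal{B}(\mathcal{H}_n)$. From $|l(A)| \le c\|A\|_s$ we
obtain $|l_n(A_n)| \le c\|A_n\|_s$. According to §11 we obtain $l_n(A_n) = \operatorname{tr}(A_n B_n)$ where
$B_n$ is a bounded self-adjoint operator in $\mathcal{H}_n$ and $\|B_n\| \le c$. Therefore we obtain
$$l(A) = \sum_n l_n(A_n) = \sum_n \operatorname{tr}(A_n B_n) = \operatorname{tr}(AB),$$
where $B = (B_1, B_2, \ldots)$ and $\|B\| \le c$.
The dual Banach space $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ which corresponds to
$\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ may therefore be identified with $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$. By an
extension of the results obtained in §11 we find that $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is
separable in the $\sigma(\mathcal{B}'(\ldots), \mathcal{B}(\ldots))$-topology. In addition, the results from §11
about affine functionals over $K$ remain valid for $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$.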
It is easy to verify that Gleason’s theorem is applicable to the set
$G(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ of projection operators from $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$.
Since almost all theorems for a single Hilbert space are applicable to
$\mathcal{H} = \bigcup_n \mathcal{H}_n$, in the text we have referred readers to those sections in AIV
where the theorems have been proven for the case of a single Hilbert space.
We shall assume that the reader is able to extend these theorems to
$\mathcal{H} = \bigcup_n \mathcal{H}_n$ using the contents of §15.
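Here it is important to note that it is often useful to imbed $\mathcal{H} = \bigcup_n \mathcal{H}_n$
into a Hilbert space
$$\mathcal{H}_s = \sum_n \oplus\, \mathcal{H}_n,$$
that is, the elements of $\mathcal{H}_s$ are defined as sequences $x = (x_1, x_2, \ldots)$ where
$x_n \in \mathcal{H}_n$ and $\sum_n \|x_n\|^2 < \infty$. The inner product in $\mathcal{H}_s$ is defined by
$$\langle x, y \rangle = \sum_n \langle x_n, y_n \rangle.$$
An element $x \in \mathcal{H} = \bigcup_n \mathcal{H}_n$ is an element of one of the $\mathcal{H}_n$ and is identified
with $(0, 0, \ldots, x, \ldots)$ where $x$ is in the $n$th position. We may identify
$A = (A_1, A_2, \ldots) \in \mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ with the following operator in $\mathcal{L}(\mathcal{H}_s)$:
$$A(x_1, x_2, \ldots) = \sum_n A_n x_n = (A_1 x_1, A_2 x_2, \ldots).$$
In this way $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$, $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ become subsets of $\mathcal{L}(\mathcal{H}_s)$ and
$\mathcal{L}_r(\mathcal{H}_s)$ respectively, and, as it is easy to show, $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ becomes a
subset of $\mathcal{B}(\mathcal{H}_s)$. If $P_n$ is the projection operator onto the subspace $\mathcal{H}_n$ of $\mathcal{H}_s$
then $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is precisely the set of those operators in $\mathcal{L}_r(\mathcal{H}_s)$ which
commute with all $P_n$. In the same manner, $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is the set of all
operators in $\mathcal{B}(\mathcal{H}_s)$ which commute with all $P_n$. The basis $K(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ of
the base norm space $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is the subset of all those operators from
$K(\mathcal{H}_s)$ (the basis for the base norm space $\mathcal{B}(\mathcal{H}_s)$) which commute with all the
$P_n$.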
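The characterization of operator sequences as those operators on the direct sum that commute with all $P_n$ can be illustrated with small real matrices. This is a sketch under the assumption of finitely many finite-dimensional summands; the helper names `embed`, `projection`, and `commutes_with_all` are hypothetical, and the summands are counted from 0 in the code.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two square matrices of equal size."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def embed(blocks: List[Matrix]) -> Matrix:
    """Place the blocks A_1, A_2, ... on the diagonal of one matrix acting on H_s."""
    size = sum(len(block) for block in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(block)
    return out


def projection(dims: List[int], n: int) -> Matrix:
    """P_n: projection of H_s onto the n-th summand (n counted from 0 here)."""
    size = sum(dims)
    start = sum(dims[:n])
    return [[1.0 if i == j and start <= i < start + dims[n] else 0.0
             for j in range(size)] for i in range(size)]


def commutes_with_all(a: Matrix, dims: List[int]) -> bool:
    """True when a P_n = P_n a for every summand."""
    return all(matmul(a, projection(dims, n)) == matmul(projection(dims, n), a)
               for n in range(len(dims)))


dims = [2, 1]
A = embed([[[1.0, 2.0], [2.0, 3.0]], [[5.0]]])   # image of a sequence (A_1, A_2)
C = [row[:] for row in A]
C[0][2] = C[2][0] = 1.0                          # couples the two summands
```

The embedded sequence `A` commutes with every projection, while `C`, which has entries linking the two summands, does not and therefore does not come from an operator sequence.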
Of course, the imbedding described above has no physical meaning for the
physically interpretable structure introduced in III, §3 and §5. This
imbedding can, however, lead to important computational techniques.
References
[1] G. Ludwig. Die Grundstrukturen einer physikalischen Theorie. Berlin-Heidelberg-
New York: Springer-Verlag, 1978.
[2] G. Ludwig. Einführung in die Grundlagen der Theoretischen Physik, Vols. I-IV.
Braunschweig: Vieweg, 1974-1979.
[3] G. Ludwig. Quantum theory as a theory of interactions between macroscopic
systems which can be described objectively. Erkenntnis, 16, 359-387 (1981).
[4] N. Bourbaki. Theory of Sets. Paris: Hermann; Reading, Mass.: Addison-Wesley,
1968.
[5] G. Ludwig. Makroskopische Systeme und Quantenmechanik. Notes in Math.
Phys., Vol. 5; Marburg, 1972.
[6] G. Ludwig. Meß- und Präparierprozesse. Notes in Math. Phys., Vol. 6; Marburg,
1972.
[7] G. Ludwig. Measuring and preparing processes. In Foundations of Quantum
Mechanics and Ordered Linear Spaces. Springer Lecture Notes in Physics, Vol. 29,
1974.
[8] H. Neumann. A mathematical model for a set of microsystems. Int. J. Theor.
Phys., 17, 3, 1978.
[9] M. Drieschner. Voraussage-Wahrscheinlichkeit-Objekt. Springer Lecture Notes
in Physics, Vol. 99, 1979.
[10] V. S. Varadarajan. Geometry of Quantum Theory, Vol. II. New York: Van
Nostrand Reinhold, 1970.
[11] O. M. Nikodym. Sur l’existence d’une mesure parfaitement additive et non
séparable. Mem. Acad. Roy. Belgique, XVII, 1939.
[12] P. Jordan. Anschauliche Quantentheorie. Berlin-Heidelberg-New York: Springer-
Verlag, 1936.
[13] G. Ludwig. An Axiomatic Basis for Quantum Mechanics. New York-Heidelberg-
Berlin: Springer-Verlag, 1983.
[14] M. Jammer. The Philosophy of Quantum Mechanics. New York: Wiley, 1977.
E. Scheibe. The Logical Analysis of Quantum Mechanics. New York: Pergamon
Press, 1973.
[15] L. Kanthack. In preparation.
[16] P. Mittelstaedt. Quantum Logic. Dordrecht: Reidel, 1978.
[17] G. Ludwig. Deutung des Begriffs “physikalische Theorie” und axiomatische
Grundlegung der Hilbertraumstruktur der Quantenmechanik durch Hauptsätze des
Messens. Springer Lecture Notes in Physics, Vol. 4, 1970.
[18] H. Neumann. Idealizations of Preparation and Registration Procedures. In
preparation.
[19] O. Melsheimer and R. Werner. In preparation.
[20] H. J. Schmidt. Coordinatization of certain convex sets in axiomatic quantum
theory. Doctoral thesis; Marburg Univ., 1975.
[21] H. Neumann. Classical systems and observables in quantum mechanics. Comm.
Math. Phys., 23, 100 (1971).
H. Neumann. Classical Systems in Quantum Mechanics and Their Representations
in Topological Spaces. Notes in Math. Phys., Vol. 10; Marburg, 1972.
H. Neumann. On the Representation of Classical Systems. Springer Lecture Notes
in Physics, Vol. 29, 1974, pp. 316-321.
[22] K. Kraus. Position observables of the photon. In The Uncertainty Principle and
Foundations of Quantum Mechanics, W. C. Price and S. S. Chissick (Eds.). New
York: Wiley, 1977.
H. Neumann. Transformation properties of observables. Helv. Phys. Acta, 45,
811 (1972).
[23] W. Band and J. L. Park. Quantum state determination: Quorum for a particle in
one dimension. Amer. J. Phys., 47, 188 (1979).
[24] L. S. Pontrjagin. Topological Groups. London: Gordon & Breach, 1966.
[25] H. Yamabe. A generalization of a theorem of Gleason. Ann. Math. (2), 58, 351
(1953).
[26] A. Bohm. The Rigged Hilbert Space and Quantum Mechanics. Springer Lecture
Notes in Physics, Vol. 78, 1978.
[27] G. Ludwig. The connection between the objective description of macrosystems
and quantum mechanics of “many particles”. In Essays in Honor of Wolfgang
Yourgrau, Alwyn van der Merwe (Ed.). New York: Plenum Press, 1982.
[28] G. W. Mackey. Induced Representations and Quantum Mechanics. Reading, Mass.:
Benjamin, 1968.
[29] A. Hartkämper and H. Neumann. Private communication.
[30] S. Sakai. C*-algebras and W*-algebras. New York-Heidelberg-Berlin: Springer-
Verlag, 1971.
F. W. Shultz. Pure states as a dual object for C*-algebras. Commun. Math. Phys.,
82, 497 (1982).
J. Dixmier. C*-algebras. Amsterdam: North-Holland, 1977.
W. Arveson. An Introduction to C*-algebras. New York-Heidelberg-Berlin:
Springer-Verlag, 1976.
[31] D. Kappos. Probability Algebras and Stochastic Spaces. New York: Academic
Press, 1969.
[32] N. Bourbaki. General Topology. Paris: Hermann; Reading, Mass: Addison-
Wesley, 1966.
[33] H. H. Schaefer. Topological Vector Spaces. New York-Heidelberg-Berlin:
Springer-Verlag, 1971.
G. Jameson. Ordered Linear Spaces. Springer Lecture Notes in Math., Vol. 141,
1970.
K. Ng. Partially Ordered Topological Vector Spaces. Oxford: Clarendon Press,
1973.
R. Cristescu. Ordered Vector Spaces and Linear Operators. Tunbridge Wells,
Kent: Abacus Press, 1976.
[34] P. R. Halmos. Introduction to Hilbert Space. New York: Chelsea, 1957.
M. Reed and B. Simon. Methods of Modern Mathematical Physics, Vol. I, §II.
New York: Academic Press, 1972.
J. Weidmann. Linear Operators in Hilbert Space. New York-Heidelberg-Berlin:
Springer-Verlag, 1980.
[35] M. H. Stone. Linear Transformations in Hilbert Space. Providence, RI: American
Mathematical Society, 1932.
M. Reed and B. Simon. Methods of Modern Mathematical Physics, Vol. I, §VIII.
New York: Academic Press, 1972.
[36] W. O. Amrein. Nonrelativistic Quantum Dynamics. Dordrecht: Reidel, 1981.
[37] G. Ludwig. Die Grundlagen der Quantenmechanik. Berlin-Heidelberg-New York:
Springer-Verlag, 1954.
[38] G. Birkhoff and J. von Neumann. The logic of quantum mechanics. Ann. Math.,
37, 823 (1936).
[39] G. W. Mackey. Mathematical Foundations of Quantum Mechanics. Reading, Mass.:
Benjamin, 1963.
[40] J. M. Jauch. Foundations of Quantum Mechanics. Reading, Mass.: Addison-Wesley,
1973.
[41] C. Piron. Foundations of Quantum Physics. New York: Benjamin, 1976.
[42] S. P. Gudder. Stochastic Methods in Quantum Physics. Amsterdam: North-
Holland, 1979.
[43] D. J. Foulis and C. H. Randall. Operational statistics, I: Basic concepts. J. Math.
Phys., 13, 1667-1675 (1972).
C. H. Randall and D. J. Foulis. Operational statistics, II: Manuals of operations
and their logics. J. Math. Phys., 14, 1472-1480 (1973).
D. J. Foulis and C. H. Randall. Empirical logic and tensor products; and Operational
statistics and tensor products. In: Interpretations and Foundations of
Quantum Theory, H. Neumann (Ed.). Mannheim: B. I. Wissenschaftsverlag, 1981;
and references cited therein.
[44] N. Zierler. Axioms for nonrelativistic quantum mechanics. Pacific J. Math., 11,
1151 (1961).
[45] G. Ludwig. The measuring process and an axiomatic foundation of quantum
mechanics (Appendix of this article). In: Rendiconti della Scuola Internazionale
di Fisica “Enrico Fermi”, IL Corso. New York: Academic Press, 1971.
G. Ludwig and H. Neumann. Connections between different approaches to the
foundations of quantum mechanics. In Interpretations and Foundations of Quantum
Theory, H. Neumann (Ed.). Mannheim: B. I. Wissenschaftsverlag, 1981.
[46] H. Gerstberger, H. Neumann and R. Werner. Makroskopische Kausalität und
relativistische Quantenmechanik. In: Grundlagenprobleme der modernen Physik,
J. Nitsch, J. Pfarr and E. W. Stachow (Eds.). Mannheim: B. I. Wissenschaftsverlag,
1981.
H. Neumann and R. Werner. Causality between preparation and registration
processes in relativistic quantum theory. To appear.
[47] G. Ludwig. Axiomatische Basis einer physikalischen Theorie und theoretische
Begriffe. Z. Allg. Wiss., XII/1, 55 (1981).
[48] G. Ludwig. Der Meßprozeß. Z. Phys., 135, 483-511 (1953).
G. Ludwig. Zur Deutung der Beobachtung in der Q. M. Phys. Bl., 11, 489-494
(1955).
G. Ludwig. Zum Ergodensatz und zum Begriff der makroskopischen Observablen.
Z. Naturforsch., 12a, 662-663 (1957).
G. Ludwig. Zum Ergodensatz und zum Begriff der makroskopischen Observablen,
I. Z. Phys., 150, 346-375 (1958).
G. Ludwig. Zum Ergodensatz und zum Begriff der makroskopischen Observablen,
II. Z. Phys., 152, 98-115 (1958).
G. Ludwig. Axiomatic quantum statistics of macroscopic systems (Ergodic
theory). Rendiconti della Scuola Internazionale di Fisica “Enrico Fermi”, XIV
Corso. New York: Academic Press, 1960, pp. 57-132.
G. Ludwig. Gelöste und ungelöste Probleme des Meßprozesses in der Quantenmechanik.
In: Werner Heisenberg und die Physik unserer Zeit. Braunschweig 1961,
pp. 150-181.
G. Ludwig. Zur Begründung der Thermodynamik auf Grund der Quantenmechanik.
Z. Phys., 171, 476-486 (1963).
G. Ludwig. Zur Begründung der Thermodynamik auf Grund der Quantenmechanik
II, Masterequation. Z. Phys., 173, 232-240 (1963).
G. Ludwig. Versuch einer axiomatischen Grundlegung der Quantenmechanik und
allgemeinerer physikalischer Theorie. Z. Phys., 181, 233-260 (1964).
G. Ludwig. An axiomatic foundation of quantum mechanics on a nonsubjective
basis. In: Quantum Theory and Reality. New York-Heidelberg-Berlin: Springer-
Verlag, 1967, pp. 98-104.
G. Ludwig. Attempt of an axiomatic foundation of quantum mechanics and more
general theories, II. Commun. Math. Phys., 4, 331-348 (1967).
G. Ludwig. Hauptsätze über das Messen als Grundlage der Hilbert-Raum-
Struktur der Quantenmechanik. Z. Naturforsch., 22a, 1303-1323 (1967).
G. Ludwig. Ein weiterer Hauptsatz über das Messen als Grundlage der Hilbert-
Raum-Struktur der Quantenmechanik. Z. Naturforsch., 22a, 1324-1327 (1967).
G. Ludwig. Attempt of an axiomatic foundation of quantum mechanics and more
general theories, III. Commun. Math. Phys., 9, 1-12 (1968).
G. Dahn. Attempt of an axiomatic foundation of quantum mechanics and more
general theories, IV. Commun. Math. Phys., 9, 192-211 (1968).
P. Stolz. Attempt of an axiomatic foundation of quantum mechanics and more
general theories, V. Commun. Math. Phys., 11, 303-313 (1969).
G. Ludwig. [17].
P. Stolz. Attempt of an axiomatic foundation of quantum mechanics and more
general theories, VI. Commun. Math. Phys., 23, 117-126 (1971).
G. Ludwig. The measuring process and an axiomatic foundation of quantum
mechanics. In: Foundations of Quantum Mechanics. Rendiconti della Scuola
Internazionale di Fisica “Enrico Fermi”, IL Corso. New York: Academic Press,
1971, pp. 287-317.
G. Ludwig. A Physical Interpretation of an Axiom within an Axiomatic Approach to
Quantum Mechanics and a New Formulation of this Axiom as a General Covering
Condition. Notes in Math. Phys., Vol. 1; Marburg, 1971.
G. Ludwig. Transformationen von Gesamtheiten und Effekten. Notes in Math.
Phys., Vol. 4; Marburg, 1971. 61 pages.
G. Ludwig. [5].
G. Ludwig. [6].
G. Ludwig. An improved formulation of some theorems and axioms in the
axiomatic foundation of the Hilbert space structure of quantum mechanics.
Commun. Math. Phys., 26, 78-86 (1972).
G. Ludwig. Why a new approach to find quantum theory? In: The Physicist's
Conception of Nature. Dordrecht-Boston: Reidel, 1973, pp. 702-708.
G. Ludwig. [7].
G. Ludwig. Measurement as a Process of Interaction Between Macroscopic
Systems. Notes in Math. Phys., Vol. 14; Marburg, 1974.
G. Ludwig. A theoretical description of single microsystems. In: The Uncertainty
Principle and Foundations of Quantum Mechanics, W. C. Price and S. S. Chissick,
(Eds.). New York: Wiley, 1977.
G. Ludwig. An axiomatic basis of quantum mechanics. In: Interpretations and
Foundations of Quantum Theory, H. Neumann (Ed.). Mannheim: B. I. Wissenschaftsverlag,
1981.
List of Frequently Used Symbols
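$\mathcal{Q}$    set of preparation procedures    22
$\lambda_{\mathcal{Q}}$    probability function for $\mathcal{Q}$    22
$\mathcal{R}_0$    set of registration methods    23
$\lambda_{\mathcal{R}_0}$    probability function for $\mathcal{R}_0$    23
$\mathcal{R}$    set of registration procedures    23
$\mathcal{S}$    set of selection procedures generated by […]    26
$\lambda_{\mathcal{S}}$    probability function for $\mathcal{S}$    26
$\mathcal{F}$    set of effect processes    31
$\mathcal{K}$, $K$    sets of ensembles    42, 56
$\mathcal{L}$, $L$    sets of effects    43, 57
$\varphi$    canonical map $\mathcal{Q}' \to \mathcal{K} \subset K$    43, 57
$\psi$    canonical map $\mathcal{F} \to \mathcal{L} \subset L$    43, 57
$\mu$    probability function $\mathcal{K} \times \mathcal{L} \to [0, 1]$ and $K \times L \to [0, 1]$    48
$\operatorname{Str}(A)$    dispersion of measurement value for observable $A$    195
$\operatorname{co} A$    smallest convex set containing the set $A$    362
$\overline{\operatorname{co}}\, A$    smallest convex set containing the set $A$ (closed in the norm topology)    56, 140, 362
$\overline{\operatorname{co}}^{\sigma} A$    smallest convex set containing the set $A$ (closed in the $\sigma(X, Z)$-topology)    362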
List of Axioms
AS 1.1 16
AS 1.2 16
AS 1.3 21
AS 2.1 19
AS 2.2 19
AS 2.3 19
AS 2.4.1 20
AS 2.4.2 20
APS 1 22
APS 2 22
APS 3 22
APS 4.1 22
APS 4.2 22
APS 5.1.1 25
APS 5.1.2 44
APS 5.1.3 44
APS 5.1.4 45
APS 5.2 25
APS 6 26
APS 7 26
APS 8.1 28
APS 8.2 28
AP 1 50
AR 1 53
AV 1.1 58
AV 1.2s 58
AV 2f 59
AVid 58
AV 3 59
AV 4s 59
AE 1 14
AE 2 14
AE 3 61
AE 4.1 61
AE 4.2 61
APE 1 69
APE 2 69
AQ 73
AOb 154
APr 172
AH 1 187
AH 2 187
Index
accumulation point 354
actual physical pseudoproperties 72
additive measure 86, 88
additive real measure 20
adjoint operator 375
affine functional 363
angular momentum 293
anti-bilinear form 405
associated kernel, observable 128
axiomatic basis 2
Baire space 357
Banach space 359
base (for the cone) 363
base-normed space 363
$\mathcal{B}$-continuous 211
Bessel’s inequality 369
bilinear form 390
Boolean lattice 348
Boolean ring 348
Borel structure 251
bounded 360
linear form 373
operator 372
carrier of interaction 13, 27
Cauchy filter 356
Cauchy sequence 356, 371-373
center (of a lattice) 75
center of mass 308
closed subspace 57, 370
closure 353
coarser than 354, 355
coexistent 86, 153, 167
coexistent components 158
coexistent effect processes 86
coexistent effect morphisms 214
coexistent effects 84, 87
coexistent morphisms 214
coexistent observables 84
coexistent registrations 84
commensurable 90, 153
compact operator 375
complement 345
complementary 154, 168, 345
complete 356
complete lattice 344
completion 356
completely continuous 375
composite system type 262
connected 357
connected component 358
continuity 354
continuous spectrum 140
convergence 354
convex cone 363
convex equivalent 124
convex range 124
convex set 362
covering group 272
$\mathcal{D}$-continuous 205, 208
decision effect 58, 75, 79, 290
decision observable 105
decision preparator 162
decomposition 48, 52, 75, 131
dense 353
detection response 57
direct mixture 49
discrete spectrum 140
dispersion-free 163
distributive lattice 345
domain of definition 385
domain of reality 2
dual 205
dual space 360
effect 41, 43
effect morphism 211
effect process 84
effect transformer 216
effective 87
elementary system type 262
energy observable 292
ensemble 41, 42
equivalent observables 129
equivalent group representations 238
essentially self-adjoint 386
extreme kernel 124
face 57,75
filter 353
finer than 354
finiteness of physics 56
fundamental domain 3
Galileo group 258, 260
generalized Boolean ring 350
generating 363
Gleason’s theorem 397
Hamiltonian operator 331
for inner structure 308
Hausdorffspace 355
Heisenberg commutation relation 291
Heisenberg uncertainty relation 193,
197, 294
Hermitian operator 375
hidden properties 186
Hilbert space 366
ideal 201
induced topology 354
initial topology 354
initial uniform structure 355
inner product 366
inner structure
effects 309
observable 309
interaction carrier 9, 27
interior 353
irreducible 135
isometry 379, 389
joint measurement 156
joint preparations 156
kernel observable 127
kinetic energy observable 308
Krein-Milman theorem 57, 124-126,
362
lattice 343, 345, 346, 350
law of quantization 59
Lie group 304
linear form 360
linear functional 360
linear operator 372
linear vector space 111, 359
linearly connected 358
locally compact 357
may be combined 24
maximal 386
meager 357
measure 86
measurement scale 139
metric 356
metrizable 356
mi-morphism 122, 206
Minkowski functional 111
mixture 75
mixture morphism 122, 206
modular 346
modular pair 346
momentum
observable 290, 317
more extensive 129, 166
morphisms for selection
procedures 199
morphisms of effects 211
morphisms of ensembles 206
neighborhood
filter 353
structure 353
norm 360
norm closure 57
normed space 360
normed orthogonal system 369
nowhere dense 357
objective property 13, 60, 67, 68,
173
observable 103
operation 206
measure 215
open set 353
operators of trace class 391
orbital angular momentum 293
order isomorphism 216
order unit space 364
ordered linear vector space 363
orthocomplementation 345
orthocomplemented lattice 345
orthogonal 367
orthomeasure 347
orthonormal basis 369
orthonormal system 369
parity transformations 299
partial observable 141
partially ordered set 343
$\perp$-isomorphism 217
phase space 247
physical approximation 18
physical object 55, 68
physical system 27
physical theory (PT) 2
Poincare group 253, 259, 260, 292
pointwise convergence 372
poset 343
position
observable 289, 290, 317
of center of mass 317
p-automorphism 203
p-continuous 204
p-invariant 204
p-isomorphism 205
p-morphism 203
precompact 57, 357
preparation-continuous 204
preparation-invariant 204
preparation-morphism 203
preparation procedure 6, 22
preparator 160
probability 19
product uniform structure 356
products of Hilbert spaces 405
projection operator 377
property 14
pseudoproperty 69, 177
quantum logic 181
random variable 139
r-continuous 204
r-invariant 203, 204
r-isomorphism 205
recording-continuous 205
recording-invariant 203, 204
registration apparatus 8
registration methods 23
registration procedures 23
relative momentum operator 318
relative position operator 318
relatively compact 357
relatively complemented lattice
350
rest energy observable 308
rotation group 272
scale observable 150
Schmidt orthogonalization
procedure 370
Schwarz inequality 367
selection procedures 16
self-adjoint 386
separable 353
separated part 202
set lattice 352
signed additive measure 108
Slater determinant 326
spatial reflection 299
spectral representation 382
spectral theorem 119
spectrum 140
sp-morphism 200
ssp-morphism 202
spin angular momentum 293
state 42
statistical selection procedures 18,
19
Stern-Gerlach experiment 293
Stone’s theorem 62
symmetric operator 386
system type 175, 262
theory of species of structure 3
topology 354, 355
topological space 353
total momentum 308, 317
trace 391
trace class operators 391
transpreparation processes 172
trans-preparator 215
Tychonov’s theorem 357
uncertainty relations 193
uniform continuity 355
uniform space 355
uniform structure 355
uniformizable space 355
unitary operator 379
unitary representation 255
unitary representation (up to a
factor) 255
vicinities 355
weak convergence 374
weak topologies 361
This is an advanced textbook on quantum mechanics. Its special
feature is a derivation of the basic structures from macroscopically
describable preparation and registration procedures. For more than
three decades the author has made brilliant contributions to the
foundations of quantum physics; this is the first textbook that
presents his theory. Mathematically thorough and rigorous, this
volume documents important progress toward the axiomatic
formulation of quantum theory. The first volume concentrates on the
fundamental structures, while the second treats “conventional
quantum mechanics,” which gains new depth in the author’s
thorough interpretation of its basic principles.
ISBN 0-387-11683-4
ISBN 3-540-11683-4