Author: Ludwig G.

Tags: physics, mechanics

ISBN: 0-387-11683-4

Year: 1983

Text
G. Ludwig
Foundations
of Quantum Mechanics I
Springer-Verlag
New York Heidelberg Berlin


G. Ludwig
Foundations of Quantum Mechanics I
Translated by Carl A. Hein
Springer-Verlag
New York Heidelberg Berlin
G. Ludwig
Institut für Theoretische Physik, Universität Marburg, Renthof 7, Federal Republic of Germany

Editors

Wolf Beiglböck
Institut für Angewandte Mathematik, Universität Heidelberg, Im Neuenheimer Feld 5, D-6900 Heidelberg 1, Federal Republic of Germany

Tullio Regge
Istituto di Fisica Teorica, Università di Torino, C.so M. d’Azeglio, 46, 10125 Torino, Italy

Carl A. Hein (Translator)
Formerly with Massachusetts Institute of Technology, Lincoln Laboratory, Lexington, MA, U.S.A.

Elliott H. Lieb
Department of Physics, Joseph Henry Laboratories, Princeton University, Princeton, NJ 08540, U.S.A.

Walter Thirring
Institut für Theoretische Physik der Universität Wien, Boltzmanngasse 5, A-1090 Wien, Austria

Library of Congress Cataloging in Publication Data
Ludwig, Günther, 1918-
Foundations of quantum mechanics.
(Texts and monographs in physics)
Translation of: Die Grundlagen der Quantenmechanik.
Bibliography: p.
Includes index.
1. Quantum theory. I. Title. II. Series.
QC174.12.L8313 1982 530.1'2 82-10437
ISBN 0-387-11683-4 (v. 1)

Original German edition: Die Grundlagen der Quantenmechanik. Berlin-Heidelberg-New York: Springer-Verlag, 1954.

© 1983 by Springer-Verlag New York, Inc.
All rights reserved. No part of this book may be translated or reproduced in any form without written permission from Springer-Verlag, 175 Fifth Avenue, New York, New York 10010, U.S.A.

Typeset by Composition House Ltd., Salisbury, England.
Printed and bound by R. R. Donnelley & Sons, Harrisonburg, VA.
Printed in the United States of America.

9 8 7 6 5 4 3 2 1

ISBN 0-387-11683-4 Springer-Verlag New York Heidelberg Berlin
ISBN 3-540-11683-4 Springer-Verlag Berlin Heidelberg New York
Dedicated to my wife
Preface

This book is the first volume of a two-volume work on the Foundations of Quantum Mechanics, and is intended as a new edition of the author’s book Die Grundlagen der Quantenmechanik [37], which was published in 1954. In this two-volume work we will seek to obtain an improved formulation of the interpretation of quantum mechanics based on experiments. The second volume will appear shortly.

Since the publication of [37] there have been several attempts to develop a basis for quantum mechanics which is, in large part, based upon the work of J. von Neumann [38]. In particular, we mention the books of G. W. Mackey [39], J. Jauch [40], C. Piron [41], M. Drieschner [9], and the original work of S. P. Gudder [42], D. J. Foulis and C. H. Randall [43], and N. Zierler [44]. Here we do not seek to compare these different formulations of the foundations of quantum mechanics. We refer interested readers to [45] for such comparisons.

In this book we shall seek only to develop a well-defined formulation for the foundations of quantum mechanics and to examine, in a consistent manner, the implications of such a formulation for the most important applications of quantum mechanics. This formulation will be based only on the objective, that is, the so-called classical mode of description of the apparatuses. In this respect this book represents a systematic mathematical, as well as conceptual, formulation of the original viewpoint of N. Bohr, in which it is assumed that it is necessary to use the classical mode of description in order to describe the measurement process in quantum mechanics (see, for example, the extensive discussion in M. Jammer and E. Scheibe [14]).

In our approach to developing a formulation of the foundations of quantum mechanics we shall not present a precise mathematical description of the macroscopic measurement apparatus. Instead, we shall only assume that there exists an objective characterization of the mode of operation of the apparatus. In [13] we have shown that it is possible to derive quantum mechanics without making reference to microsystems, by using only the description of macrosystems in terms of state spaces. There the reader will also find a derivation of the Hilbert space structure from general laws concerning the interactions of macrosystems. In this book we will make use of these results in III, §3 without proof.

At several places in this book the reader will find references to the book Grundstrukturen einer Physikalischen Theorie [1]. Previous knowledge of [1] is not necessary for an understanding of this book. Readers who are familiar with [1] can easily recognize how the general structure of a physical theory is realized for the case of quantum mechanics. Readers who wish to study [1] later will have the advantage of already having an example to illustrate the general description in [1].

The formulation of quantum mechanics presented here is the last step of developments since 1964 which the interested reader can find in [48]. (In [48] there is also some previous work which has led to the results described here.) In a certain sense, this presentation together with [1] and [13] represents a greatly improved “second edition” of [17].

In Appendices I-V we have provided a summary of important mathematical results which may be unfamiliar to some readers. Appendix V will appear in Volume II. We suggest that readers who are unfamiliar with the mathematical results in Appendix V “take them on faith” until Volume II appears.

References in the text are made as follows. For references to other sections of the same chapter, we shall only list the section number of the reference, for example, §5.3. For references to other chapters, the chapter is also given; for example, IV, §7.2 refers to Chapter IV, Section 7.2. The formulas are numbered as follows: (5.7.10) refers to the 10th formula in Section 5.7 of the current chapter. References to formulas in other chapters are given, for example, by IV, (5.7.10). References to the Appendix are given by AIV, §2, where AIV denotes Appendix IV.

I would like to express my deep gratitude to Mr. Carl A. Hein for the difficult job of translating the manuscript from German to English. He had the difficult assignment of finding suitable English language expressions for the somewhat new and sometimes difficult concepts and ideas used in a new conceptual framework for quantum mechanics. This was possible only because of his deep understanding of the text. I would also like to thank him for his patience in accommodating my wishes and a substantial number of revisions in the German text while the translation was under way.

I hope that the present book together with [13] will lead to further interest and research into the foundations of quantum mechanics, especially in the direction of a relativistic theory for quantum mechanics, that is, relativistic quantum field theory (see [46]).

Marburg, January 1982
G. Ludwig
Contents

CHAPTER I
The Problem: An Axiomatic Basis for Quantum Mechanics
1 The Axiomatic Formulation of a Physical Theory
2 The Fundamental Domain for Quantum Mechanics
3 The Measurement Problem

CHAPTER II
Microsystems, Preparation, and Registration Procedures
1 The Concept of a Physical Object
2 Selection Procedures
3 Statistical Selection Procedures
4 Physical Systems
  4.1 Preparation Procedures
  4.2 Registration Procedures
  4.3 The Dependence of Registration upon Preparation
  4.4 The Concept of a Physical System
  4.5 The Structure of Probability Fields for Physical Systems

CHAPTER III
Ensembles and Effects
1 Combinations of Preparation and Registration Methods
2 Mixtures and Decompositions of Ensembles and Effects
3 General Laws: Preparation and Registration of Microsystems
4 Properties and Pseudoproperties
  4.1 Properties and Physical Objects
  4.2 Pseudoproperties
5 Ensembles and Effects in Quantum Mechanics
6 Decision Effects and Faces of K

CHAPTER IV
Coexistent Effects and Coexistent Decompositions
1 Coexistent Effects and Observables
  1.1 Coexistent Registrations
  1.2 Coexistent Effects
  1.3 Commensurable Decision Effects
  1.4 Observables
2 Structures in the Class of Observables
  2.1 The Spaces ЩЛ) and Щ2)
  2.2 Mixture Morphisms Corresponding to an Observable
  2.3 The Kernel of an Observable; Mixture of Effects for an Observable
  2.4 Mixtures and Decompositions of Observables
  2.5 Measurement Scales for Observables
3 Coexistent and Complementary Observables
4 Realizations of Observables
5 Coexistent Decompositions of Ensembles
6 Complementary Decompositions of Ensembles
7 Realizations of Decompositions
8 Objective Properties and Pseudoproperties of Microsystems
  8.1 Objective Properties of Microsystems and Superselection Rules
  8.2 Pseudoproperties of Microsystems
  8.3 Logic of Decision Effects?

CHAPTER V
Transformations of Registration and Preparation Procedures. Transformations of Effects and Ensembles
1 Morphisms for Selection Procedures
2 Morphisms of Statistical Selection Procedures
3 Morphisms of Preparation and Registration Procedures
4 Morphisms of Ensembles and Effects
  4.1 Morphisms of Ensembles
  4.2 Morphisms of Effects
  4.3 Coexistent Operations and Coexistent Effects Morphisms
5 Isomorphisms and Automorphisms of Ensembles and Effects

CHAPTER VI
Representation of Groups by Means of Effect Automorphisms and Mixture Automorphisms
1 Homomorphic Maps of a Group ^ in the Group si of J^-continuous Effect Automorphisms
  1.1 Generation of a Representation of ^ in si by Means of a Representation of ^ by r-Automorphisms
  1.2 Some General Properties of a Representation of ^ in si
  1.3 Topologies on the Group si
  1.4 The Representation of ^ in Phase Space Γ
2 The ^-invariant Structure Corresponding to a Group Representation
3 Properties of Representations of ^ which are Dependent on the Special Structure of st{m) in Quantum Mechanics
  3.1 The Topological Structure of the Group stm
  3.2 The Topological Properties of a Representation of ^
  3.3 Unitary and Anti-unitary Representations Up to a Factor

CHAPTER VII
The Galileo Group
1 The Galileo Group as a Set of Transformations of Registration Procedures Relative to Preparation Procedures
2 Irreducible Representations of the Galileo Group and Their Physical Meaning
3 Irreducible Representations of the Rotation Group
4 Position and Momentum Observables
5 Energy and Angular Momentum Observables
6 Time Observable?
7 Spatial Reflections (Parity Transformations)
8 The Problem of the Space 3> for Elementary Systems
9 The Problem of Differentiability

CHAPTER VIII
Composite Systems
1 Registrations and Effects of the Inner Structure
2 Composite Systems Consisting of Two Different Elementary Systems
3 Composite Systems Consisting of Two Identical Elementary Systems
4 Composite Systems Consisting of Electrons and Atomic Nuclei
5 The Hamiltonian Operator
6 Microsystems in External Fields
7 Criticism of the Description of Interaction in Quantum Mechanics and the Problem of the Space 3f

APPENDIX I
Summary of Lattice Theory
1 Definition of a Lattice
2 Orthomodularity
3 Boolean Rings
4 Set Lattices

APPENDIX II
Remarks about Topological and Uniform Structures
1 Topological Spaces
2 Uniform Spaces
3 Baire Spaces
4 Connectedness

APPENDIX III
Banach Spaces
1 Linear Vector Spaces
2 Normed Vector Spaces and Banach Spaces
3 The Dual Space for a Banach Space
4 Weak Topologies
5 Linear Maps of Banach Spaces
6 Ordered Vector Spaces

APPENDIX IV
Operators in Hilbert Space
1 The Hilbert Space Structure Type
2 Orthogonal Systems and Closed Subspaces
3 The Banach Space of Bounded Operators
4 Bounded Linear Forms
5 The Banach Space
6 Projection Operators
7 Isometric and Unitary Operators
8 Spectral Representation of Self-adjoint and Unitary Operators
9 The Spectrum of Compact Self-adjoint Operators
10 Spectral Representation of Unbounded Self-adjoint Operators
11 The Trace as a Bilinear Form
12 Gleason’s Theorem
13 Isomorphisms and Anti-isomorphisms
14 Products of Hilbert Spaces
15 The Spaces Я{Ж19 ...) and ...)

References
List of Frequently Used Symbols
List of Axioms
Index
CHAPTER I
The Problem: An Axiomatic Basis for Quantum Mechanics

The historical path of discovery for a new physical theory is, for the most part, a complicated one. At first new concepts are tentatively introduced. By a lengthy process, involving trial, error, insight, and revision, these concepts are modified and become more clearly defined and familiar. As an understanding of the postulated structure of the theory develops, it is possible by careful application of the new concepts to learn how to avoid error and to develop an interpretation of the new theory. Such has been the case for quantum mechanics.

In this book we shall not present a heuristic path to quantum mechanics as a means of developing a theory of electrons, atoms, ... (in general: microsystems). Instead, we shall assume that the reader has already had extensive contact with quantum mechanics, and has studied one or more of the elementary texts. If this is not the case, we recommend that the reader either study such a text before reading this book, or use one of the elementary texts in conjunction with this book. In this way, the reader will discover the vagueness inherent in the usual fundamental concepts which are used to formulate quantum mechanics. For the reader who seeks an elementary text which considers some of the problems to be discussed in this book, we recommend Volume 3 of [2].

In the following chapters we shall attempt to clarify the fundamental concepts of quantum mechanics and present a thorough and systematic axiomatic formulation of quantum mechanics.
1 The Axiomatic Formulation of a Physical Theory

We cannot consider the general structure of a physical theory in detail in this book. We refer readers who are interested in such questions to [1]. Here we shall only describe in broad terms what we mean by an axiomatic basis of a physical theory, and what we seek to accomplish when we formulate an axiomatic basis for a physical theory.

A physical theory (abbreviated $\mathcal{PT}$) consists of three essential parts: a mathematical theory ($\mathcal{MT}$), a domain of reality ($\mathcal{W}$), which we seek to describe by $\mathcal{MT}$, and a set of mapping principles ($\mathcal{MAP}$), where the latter is needed in order to describe the relationship between $\mathcal{W}$ and $\mathcal{MT}$. The mapping principles also describe what is frequently called the interpretation of $\mathcal{MT}$. It is here that we encounter the difficult problems in quantum mechanics. Concepts such as observable, state, property, and object are ill-suited to serve as a conceptual basis for an interpretation of quantum mechanics.

(1) We expect that the difficulties of interpretation of a $\mathcal{PT}$ should be minimized when an axiomatic basis for $\mathcal{PT}$ is found. An $\mathcal{MT}$ as part of a $\mathcal{PT}$ can, in principle, be arbitrarily defined. In the historical evolution of physics, the mathematical theories seldom appear in the form of an axiomatic basis. This situation frequently leads to conflicting views concerning the interpretation of the theory. In particular there may be disputes about a given structure in $\mathcal{MT}$. Is it accidental, thereby having nothing to do with the real structure of the world as described by $\mathcal{W}$? In [1] it is shown how it is possible to thoroughly treat such problems if $\mathcal{MT}$ is an axiomatic basis. Here we shall only provide a brief exposition of the principle of an axiomatic basis in order to apply it to quantum mechanics in the following chapters. By doing so we shall give an explicit formulation of the mapping principles (see II).
In this manner the application of the principle of an axiomatic basis can be understood without the more general and detailed analysis of [1].

(2) The introduction of new physical concepts in the domain of a $\mathcal{PT}$ may be carried out in a simple and transparent manner when the $\mathcal{MT}$ is an axiomatic basis. If $\mathcal{MT}$ is an axiomatic basis, then we can introduce new physical concepts for which the physical meaning will naturally follow from the mapping principles $\mathcal{MAP}$ and mathematical constructions in $\mathcal{MT}$. Again, it is not necessary to study the general principles presented in [1] because it is not difficult to understand the concrete derivation of the new physical concepts (such as ensemble, effect, observable, position observable, etc.) which are presented in the following chapters. The reader who is familiar with the “stories” which are usually told in order to explain (in a rough manner) the physical meaning of these concepts will appreciate the conceptual clarity which results from the construction of a $\mathcal{PT}$ with $\mathcal{MT}$ as an axiomatic basis.
What do we mean by the expression “axiomatic basis”? First it is essential that this expression describe certain aspects of the form of $\mathcal{MT}$ itself. Thus it is essential that the $\mathcal{MT}$ be studied in its own right, that is, detached from its relationship to physics. Thus we postulate that $\mathcal{MT}$ should take the form of what Bourbaki [4] calls a “theory of species of structure” $\Sigma$ and denotes by $\mathcal{T}_\Sigma$. Since we are studying both physical and mathematical theories, we prefer to write $\mathcal{MT}_\Sigma$ instead of $\mathcal{T}_\Sigma$. Thus an axiomatic basis should be of the form $\mathcal{MT}_\Sigma$, where $\Sigma$ denotes the appropriate species of structure.

The form of $\mathcal{MT}_\Sigma$ within a $\mathcal{PT}$ is somewhat more specialized than that introduced by Bourbaki in [4]. The specialization consists in the fact that, for the auxiliary base sets of $\Sigma$, we shall only use the set of real numbers $\mathbb{R}$, and for $\mathcal{MT}$ we require only the use of set theory. It is not necessary for the reader to study either [4] or [1] because it is not difficult to exhibit the nature of an $\mathcal{MT}_\Sigma$ and because in II (supplemented by VII, §1 and axioms introduced later) we will explicitly construct the mathematical description corresponding to quantum mechanics.

In $\mathcal{MT}_\Sigma$ we shall assume the usual formulations of mathematical logic and set theory, and the usual formulation of the real number system $\mathbb{R}$. Then we shall introduce what we shall call base sets, for which no internal structure is specified in advance. We obtain an internal structure for the base sets by introducing relations on the sets, and by postulating axioms for them. This is how we shall introduce the structure for the $\mathcal{MT}_\Sigma$ corresponding to quantum mechanics, beginning in II and continuing in VII, §1. A simple example of an $\mathcal{MT}_\Sigma$ is a group, that is, a set with a relation (called multiplication) which satisfies the axioms for a group.

In addition to the requirement that $\mathcal{MT}$ be of the form $\mathcal{MT}_\Sigma$, we must also make requirements concerning the relation between the mapping principles $\mathcal{MAP}$ and $\mathcal{MT}_\Sigma$.
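The group example mentioned above can be made concrete for a finite base set. The following is only an illustrative sketch and is not part of the original text: the function name `is_group` and the two sample operations are hypothetical, and the base set is taken to be finite so that the axioms can be checked by enumeration. The base set carries no structure of its own; the structure comes entirely from the relation (here a binary operation) and the axioms imposed on it.

```python
from itertools import product


def is_group(elements, op):
    """Check the group axioms for a finite base set with a binary operation."""
    elements = list(elements)

    # Closure: the operation must not lead outside the base set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False

    # Associativity: (a*b)*c == a*(b*c) for all triples.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False

    # Identity: some e with e*a == a*e == a for all a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]

    # Inverses: every a has some b with a*b == b*a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)


def add_mod4(a, b):
    return (a + b) % 4


def mul_mod4(a, b):
    return (a * b) % 4


print(is_group(range(4), add_mod4))  # addition modulo 4 satisfies the axioms
print(is_group(range(4), mul_mod4))  # multiplication modulo 4 does not (0 has no inverse)
```

The same base set {0, 1, 2, 3} is a group under one operation and not under the other, which illustrates that the internal structure is specified by the relations and axioms rather than by the set itself.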
We shall use the concept of mapping principles instead of the concept of interpretation because we require a neutral term, one which does not evoke differing preconceived notions among various readers. We note that inherent in the concept of mapping principles is that something is being mapped. We shall only require that what is being mapped must be expressed in terms of experiment and experience without the need for the application of the new theory. We may find that we do need older theories, that is, theories which are already known, in order to express what is being mapped.

We may illustrate this requirement with the aid of a familiar theory. For example, it is possible to specify the position of the planets without requiring the use of Newton’s mechanics and Newton’s law of gravitation. The position of the planets at different times provides the experimental material which can be compared to Newton’s theory, that is, which is to be mapped into the mathematical framework $\mathcal{MT}$ of Newton’s theory.

We shall use the expression fundamental domain $\mathcal{G}$ to denote those facts that can be specified in advance of a particular $\mathcal{PT}$ and which are mapped into the mathematical framework of $\mathcal{MT}_\Sigma$. In §2 we shall seek to describe the fundamental domain for the case of quantum mechanics. The fundamental domain consists of all that can be determined before the application of the theory. The fundamental domain will later be expanded with the help of the theory to the domain of reality $\mathcal{W}$, where the latter includes all real aspects of the world which can be described by the theory.

The mapping principles are rules by which the elements of $\mathcal{G}$, that is, the stated facts, may be translated into the language of $\mathcal{MT}_\Sigma$. We obtain an axiomatic basis if and only if the translation of the determined facts (as elements of $\mathcal{G}$) into the language of $\mathcal{MT}_\Sigma$ makes use only of the undefined base sets and the undefined relations of $\Sigma$.

The axioms required for $\mathcal{MT}_\Sigma$ are not deduced from experiment, but are guessed from experiment by a process involving trial and error, intuition, and insight. It is important to keep this in mind in the following chapters (see also [1], §5). The axioms used in $\mathcal{MT}_\Sigma$ should not be derived from philosophical a priori principles (for example, from forms of pure sensible intuition) in the belief that these principles and forms are necessary to develop physics. Here we note that some authors [9] take an a priori viewpoint concerning the fundamental structure of quantum mechanics.

By an axiomatic basis we mean a realization of $\mathcal{MT}_\Sigma$ together with a set of mapping principles which have the form described above. It is obvious that many problems concerning the physical meaning associated with mathematical structures are easier to solve within the framework of an axiomatic basis because we permit only sets and relations in $\Sigma$ (except for certain idealizations) which are interpreted physically by use of the mapping principles. Again, we refer interested readers to [1].

At this stage we expect that many readers will not find the above formulation of the general structure of a physical theory clear.
On the basis of the formulation of quantum mechanics which will be presented in the following chapters, we expect that such readers will appreciate the conceptual clarity of our axiomatic formulation of quantum mechanics as compared to the usual one. As we progress we also expect that such readers will obtain a better understanding of the nature of an axiomatic basis. In this respect, this book can also be used in order to obtain a better understanding of [1].

2 The Fundamental Domain for Quantum Mechanics

The simplicity we find in the case of classical mechanics results from the following fact: It is possible to describe the measurement of the position of a particle (point mass) at various times without making use of the laws of mechanics. In other words, the formulation of a theory of measurement for a position-time measurement does not require the use of the laws of mechanics (see [2], II). Thus, the measured positions of a particle at different times are appropriate for inclusion in the fundamental domain of classical mechanics.

In quantum mechanics the situation is completely different. Nevertheless, in the historical development of quantum mechanics we find that there is a strong tendency to imitate the procedures of classical mechanics, where it is assumed that we know how it is possible to measure position, momentum, and many other quantities. Here the term “observable” is used to describe the quantities to be measured, and the measured values of the observables are assumed to be the experimental material which is to be compared with the theory. Unfortunately, it is not possible to state precisely what is meant by the concept of an observable.

The origin of this difficulty is easily ascertained. As we have seen, in classical mechanics it is possible to develop a theory of measurement of the position of the particles at various times without the application of mechanics. In this way it is possible to give meaning to the concept of the observable position at various times and their measurements before developing the theory of classical mechanics. In quantum mechanics this is not possible, because there is no measurement apparatus for microsystems such as electrons, atoms, molecules, etc., whose function can be explained without the use of quantum theory. Thus we find that the concept of observables and their corresponding measurement values are ill-suited for inclusion into the fundamental domain for quantum mechanics. The claim that quantum mechanics is only concerned with what can be measured (that is, the measurement values obtained from a scale on a measurement apparatus) is false because we cannot explain what the measurement values represent without the use of quantum theory.

Here we shall not review the vast body of literature devoted to the theory of measurement in quantum mechanics. Instead, we shall seek to obtain a suitable substitute for the concept of an observable (and their associated measurement values) for inclusion into the fundamental domain of quantum mechanics.

A second concept, that of a “state”, is often used as an aid in the interpretation of quantum mechanics. A microsystem is said to be in one of its possible states. But what do we mean by the notion of a state?
In classical mechanics it is possible to characterize the state of a system by the positions and velocities of the individual particles of the system at a given time, that is, by a point in phase space. We shall not describe the various attempts to develop a quantum mechanical notion of a state because it is clear that such a notion will make explicit use of the structure of quantum mechanics. Thus we find that the notion of a state is also ill-suited for inclusion in the fundamental domain of quantum mechanics.

Thus, if we seek to formulate quantum mechanics in terms of an axiomatic basis, we have little to begin with other than what an experimental physicist would call experiments with a single microsystem. The term “single” is a qualitative designation which is used only to differentiate these experiments from those that treat a large number of interacting systems as a whole, that is, a macrosystem composed of a collection of microsystems. The term “microsystem” is also a qualitative designation which is used to emphasize the fact that we do not assume that quantum mechanics is a suitable theory for the description of macrosystems (for example, the earth). Indeed, quantum mechanics is inadequate for a theoretical description of macrosystems. We cannot discuss the problem of the relationship between quantum mechanics and a more comprehensive theory of macrosystems in this book. The reader will find an introduction to this problem in [2], XV, [5], [7], [13], [27] and some comments in XVIII.

By experiments with “single” microsystems we do not mean that we consider only a “single” experiment with a “single” microsystem. Since statistics plays a central role in quantum mechanics (see II), we must consider experiments with “large numbers” of microsystems. It is important to understand that experiments with “large numbers” of microsystems can frequently be understood in terms of repeated measurements with a single microsystem. This situation is familiar to every experimental physicist. An electron beam can be considered to be the result of a multiple process in which a single electron is “produced”, provided that the mutual interaction between individual electrons can be neglected (see XVI). Even the so-called ideal gas can be approximately treated as a collection of many single atoms (see XV, §2) since the mutual interactions of the atoms in an ideal gas are negligible.

For the fundamental domain of quantum mechanics we shall choose the class of experiments with individual microsystems, and the relative frequencies of the phenomena associated with multiple repetitions of these experiments. In this book we shall not describe the vast variety of such experiments. Several examples are briefly described in XI-XVII. In II we shall develop the general structure of such experiments as the basis for the formulation of quantum mechanics. In preparation for this task we find it necessary to make the meaning of the expression “experiments with microsystems” more precise. We shall begin by describing the structure of such experiments in more detail.

In order to carry out experiments with individual microsystems it is necessary to have such systems at hand.
Often such microsystems can be found in nature, for example, in interstellar space. There they are sufficiently separated so that their mutual interactions can be neglected. In fact, their mutual interactions will be smaller than what can be produced in many experiments in the laboratory. On Earth such microsystems must be produced in the laboratory. Often they may be obtained naturally, as for example, from the decay of radioactive substances. Sometimes it is necessary to produce them using a complicated apparatus which is very expensive to build. Often a rarefied (ideal) gas will be suitable for many purposes.

Here we shall use the generic term preparation procedure to denote the various methods of obtaining microsystems. Thus some preparation procedures will require the use of a special apparatus (a giant accelerator), while others will only require the use of the sun, which emits such microsystems as light quanta, charged particles (solar wind) and neutrinos.

We shall now state an important requirement for the development of an axiomatic basis for quantum mechanics: It must be possible to describe the structure of the preparation apparatus, and the time-dependent physical process by which the preparation apparatus operates, without the use of quantum mechanics. In brief, we require that the so-called pre-theories for quantum mechanics permit the description of the structure and the operation of the apparatus and the characterization of the preparation procedure (for a description of the concept of a pre-theory see [1]). In other words we require that the preparation procedures belong to the fundamental domain of quantum mechanics.

In order to prevent misconceptions concerning the characterization and description of a preparation procedure, we find it necessary to give an example of what is not part of the characterization of a preparation procedure. If, for the purpose of illustration, electrons are to be produced, then the specification of the spin of the prepared electron or the description of the physical process of emission of the electron from, for example, a heated cathode is not permitted as part of the description of the preparation procedure. However, all macroscopic processes which take place in the operation of the preparation apparatus, including those instructions which can be stored on magnetic tapes and in other memory devices and executed in sequence by a computer, belong to the description of the preparation procedures.

The concept of a preparation procedure permits the description of complicated experimental arrangements such as one composed of an accelerator, a target and a special selection apparatus which selects the desired microsystem. In addition, it is possible to combine two or more preparation procedures into a new preparation procedure. Such combinations of preparation procedures are commonly found in scattering experiments (see XVI).

At present the formulation of a preparation procedure may appear to be too general. We find it necessary to impose an additional restriction: that of reproducibility.
The latter notion is related to the relative frequencies of the various phenomena associated with repetitions of an individual experiment (see II).

By making the assumption that a microsystem is produced in a preparation procedure we do not mean that we know, in a particular case, which preparation procedure was used to produce a particular microsystem. In a test of a physical theory we require that only known facts are to be mapped by means of the mapping principles into the mathematical language of the theory $\mathcal{MT}_\Sigma$.

We shall now consider the following question: How do we use the prepared microsystems to investigate the structure of microsystems? In the second and crucial part of such experiments we require the use of an apparatus which measures the microsystems and their structure. If we wish to interpret the macroscopic physical processes associated with the second apparatus as a measurement of the structure of a microsystem, we need to make use of quantum mechanics. Therefore we include only the macroscopic processes associated with such an apparatus in the fundamental domain of quantum mechanics. Would it then be correct to say
that the measured value obtained by a measuring apparatus may be compared with the predictions of quantum theory? It is correct in that the measurement values (scale values) are the result of macroscopic processes associated with the apparatus, and are therefore parts of the fundamental domain. What is not part of the fundamental domain is the interpretation of these scale values as a measurement of a property of a microsystem (that is, the “result” of the measurement of an “observable”).

Since, in our description of the fundamental domain, we cannot say what is (or was) measured by the apparatus, we find it necessary to introduce the expression registration apparatus (instead of measuring apparatus) to describe the second part of the experimental arrangement. It is in this sense that every experimental arrangement of an experiment with a single microsystem consists of a preparation apparatus and a registration apparatus. It is not necessary that the experimental arrangement be man-made. Indeed, the preparation apparatus can be a star or a galaxy.

In order to prevent misunderstanding, it is necessary to note that the expression “single microsystem” also applies to what is called a “composite microsystem” in the sense of VIII. If, for example, we study electron-proton scattering, the “single” microsystems are electron-proton pairs.

In this book we shall not describe the construction of a typical registration apparatus. Instead, we shall give a few familiar examples: a scintillation counter, or array of such counters, a cloud chamber, a bubble chamber, a spectroscope, a photographic plate, etc. In addition, we give an example of a registration apparatus which makes use of complicated electromagnetic fields: the mass spectrograph.

It may be questioned whether every experimental arrangement for an experiment with a single microsystem consists of a preparation apparatus and a registration apparatus.
This is indeed the case. However, this does not mean that it is possible to uniquely divide a complicated experiment into preparation and registration parts. In Figure 1 we have an experimental arrangement which consists of three parts. In part (1) we produce the microsystem a. In part (2) we produce microsystem b as the result of the interaction of the macroscopic apparatus (2) with a (where b can be the same as a). The microsystem b is then registered by (3). We may consider (1) as the preparation apparatus for the microsystem a, and (2) plus (3) as the registration apparatus for a. We may also consider (1) plus (2) as the preparation apparatus for system b and (3) as the registration apparatus for system b. Thus we find that the preparation-registration structure provides a conceptual basis for experiments with microsystems.

It is possible to invent experiments in which it is not possible to speak of preparation and registration of microsystems. For example, consider an apparatus having two parts (1) and (2), and suppose that they interact by exchanging microsystems, and that the emission of microsystems by (1) is influenced by the microsystems produced by (2) and vice versa. For such a system it is not possible to specify which microsystem is the subject of the experiment. Is it the one which goes from (1) to (2) or the one which goes from (2) to (1)? Here we do
not mean to suggest that a more comprehensive theory (which includes quantum mechanics as a special case) will be unable to treat such complicated interaction problems. We only suggest that such experiments are ill-suited for the immediate goal of formulating an axiomatic basis for quantum mechanics.

[Figure 1: An experimental arrangement of three parts (1), (2), (3). Part (1) is the preparation apparatus for a, with (2) plus (3) the registration apparatus for a; alternatively, (1) plus (2) is the preparation apparatus for b, with (3) the registration apparatus for b.]

By our reference to the possibility that the interaction between the apparatus (1) and (2) need not be directed, we may be led to question the assumption of the existence of microsystems, or at least to seek a better understanding of the concept of a microsystem. The inclusion of the preparation apparatus and the registration apparatus (and the associated physical processes which can be described without the use of quantum mechanics) into the fundamental domain is not affected by the question of the existence of microsystems. The directedness of the interaction can be described in terms of the pre-theory of quantum mechanics, that is, without the need of quantum mechanical theory. Indeed, it can be described without the need of the concept of a microsystem. We shall not discuss these matters here; they are discussed in [2], XVI, [3], [6], [7], and briefly in XVIII. In [13] the axiomatic basis of quantum mechanics is formulated without the need for the assumption of the existence of microsystems.
There we find that no special assumption is needed in order to describe the directed interaction between the preparation apparatus and the registration apparatus by means of a “carrier of interaction.”

In II we shall introduce the fundamental set M of “interaction carriers.” The introduction of M does not violate our intention to develop an axiomatic basis, provided we make no assertions about the elements of M other than that they depend on the preparation apparatus, the registration apparatus and their associated macroscopic processes. The introduction of M will also permit us to make the formulation of an axiomatic basis of quantum mechanics given here easier to understand than the one given in [13]. Thus the reader who has understood the foundations of the theory of microsystems given here will obtain a better understanding of the formulation presented in
[2], XVI and the more detailed presentation given in [13]. In II we will call the carriers of interactions (elements of M) “microsystems” even though this word will denote special carriers of interaction which are characterized by the axioms given in III, §3 and §5. The introduction of the set M of microsystems should not be understood as implying that such microsystems exist in every individual experiment because the “vacuum” can also be considered to be a “type” of microsystem. Here we do not exclude the possibility that the preparation apparatus does not always interact with a registration apparatus.

In summary, we will present the axiomatic basis for quantum mechanics in this book. The axiomatic basis will be constructed using the fundamental domain which consists of all those aspects of the preparation and registration procedures for microsystems which can be described without the use of quantum mechanics.

3 The Measurement Problem

Have we, by our restriction to the fundamental domain described above, eliminated the measurement problem which was described in the beginning of §2? Have we eliminated the problem in such a way that the interaction between the microsystem and the measurement apparatus can be analyzed without the need for quantum theory? Of course not. What we have done is to place the problem where it belongs, namely, with the developing theory. The mapping principles are no longer burdened with the problem of providing a theoretical description of either the effect of the microsystem on the registration apparatus or the dependence of the microsystem on the structure of the preparation apparatus.

What is the status of such a theoretical description in the arena of quantum mechanics? In II and III we shall begin our axiomatic formulation of quantum mechanics by introducing structural rules governing the preparation and registration processes.
These structural rules will be very general, and will be analogous to those found in thermostatics. These rules will not specify how individual cases of preparation and registration are structured, just as the fundamental structure of thermostatics does not specify the equation of state for a given substance.

The theory presented in II-III and extended in VII, §1 describes the preparation and registration of microsystems but is not as complete as we might wish. Contrary to this wish for completeness, in IV we shall proceed in the opposite direction. We shall attempt to eliminate, as far as possible, the preparation and registration process in order to obtain a theory of the structure of microsystems which is independent of accidental aspects of the structure of the apparatuses. The extent to which this is possible will be discussed in IV.
In thermodynamics the equation of state is obtained from experimental data. By analogy with thermodynamics we will, in the present state of the theory, take the mode of operation of the special preparation and registration procedures from experiment, and make use of additional assumptions (see, for example, XI, §1 and §2 and XVI, §1 and §2).

This “taking” of special structure from experiment is not without considerable cost. As a result, the current status of the theory is not satisfactory. Thus we shall attempt to describe the problems of preparation and registration more precisely. We shall make such a detailed investigation of the problems of preparation and registration in order to obtain a more comprehensive theory than that which was developed in II-XVI. We shall begin these investigations in XVII. There we shall find that quantum mechanics cannot present a closed theory (more precisely, is not a g.G.-closed theory in the sense of [1], §8 and §10) of the preparation and registration process. This is perhaps disappointing. In fact, it merely demonstrates that quantum mechanics is not a theory which can describe everything from a microsystem to a macrosystem.

In XVIII we shall analyze the situation in quantum mechanics in its relationship to other physical theories. Thus, at the end of this book we return to the problems posed at the beginning.
CHAPTER II
Microsystems, Preparation, and Registration Procedures

We shall now present a “theoretical” description of experiments with individual microsystems, a description which is expressed in terms of a mathematical framework. We shall introduce mathematical entities to which the individual microsystems and the experimental procedures which are to be applied to them are to be mapped. We shall use the methodology which was briefly outlined in I, and is developed in greater detail in [2], II and in [1]. Here a familiarity with I will suffice in order to understand (at least in an intuitive way) the relationship between the mathematical theory and the physical reality it describes.

We shall develop the mathematical theory as systematically as possible. At the beginning we shall introduce a number of axioms which we shall need in order to formulate the foundations of quantum mechanics in a transparent manner. Later in the development of the theory we shall only motivate the selection of new axioms. Nevertheless the reader who is mathematically inclined will find it easy to verify that the mathematical theory is of the form $\mathcal{MT}_\Sigma$ as described in I, §1 and described in greater detail in [1].

The mapping principles will be presented in a more intuitive manner. The reader who has read the definitive presentation in [1], and is familiar with the notion of a “concise formulation” presented there, will be able to formulate the mapping principles in a precise way. Even those readers who are satisfied with obtaining a more intuitive understanding of the relationship between physics and mathematics will not find it difficult to understand the “physical interpretation” of quantum mechanics given in the following chapters, because this presentation is easier to understand than the usual one.
At various places we shall use expressions such as “physical reality” and deduce new physical concepts while making explicit reference to the precise formulations and methods of [1], §10 without making explicit applications of them. We do so in order to keep the size of this book within reasonable bounds, and not to stray from the intended scope of the book.

In the title of the present chapter we have introduced the term “microsystem” somewhat prematurely, because we shall not define this concept until the following chapter. The expression “physical system” or “carrier of interaction” would be more appropriate in this chapter. However, since we are concerned only with the applications of the general methods and concepts described here to this special case, we shall use the expression “microsystems.”

This presentation is closely related to that found in [2], XIII. In fact, an understanding of the subject matter presented in [2], XI-XIII will greatly facilitate the understanding of the formulation of quantum mechanics presented in this book. In this respect, this book is a continuation of [2], XIII.

1 The Concept of a Physical Object

When we introduce a general concept such as a “physical object” we do not intend to present an analysis of the meaning of such concepts as they are used in physics. Instead, we intend to formulate the concept anew, independent of the fact that the new formulation may not agree with the usual one in all particulars.

In [1], §10.5 we set forth the requirement that the “new” physical concepts are to be introduced into the mathematical theory $\mathcal{MT}_\Sigma$ by means of a set together with a “structure.” Here the term “new” refers only to the definition of the concept (that is, of the set and the structure). We have to assume that we already know how to assign physical meaning to the structure terms and to the elements of the set.
How this may be done is illustrated by the concept of a “physical system” which is defined in §4. For a detailed description of the method we refer readers to [1], §10.5, §11, and §12.

We shall now consider a set M, the elements of which we wish to call “physical objects.” Here, in order to prevent misunderstandings, we caution the reader against the opinion (see [1], §10.5) that the expression “physical object” is used to describe all aspects of “physical reality” which are to be mapped to an element of a set. In mathematics, a set and the elements of a set are often loosely called “mathematical objects.” Thus, it should not be misleading to use the expression “physical object” to refer to the physical reality associated with these mathematical objects.

Thus we may call an element of a set M (or preferably the “physical reality” which is mapped to the element) a physical object (see [1], §5 and §10.5) if and only if the “physical reality” associated with the elements of M has (intuitively speaking) objective properties. In the
mathematical framework we shall express the notion of a property in the following terms: Let a structure $\mathcal{E}$ be defined on the set M as follows: Let $\mathcal{E} \subset \mathcal{P}(M)$, that is, $\mathcal{E}$ is a collection of subsets of M. Let $a, b \in \mathcal{E}$; let $M \setminus a$ denote the complement of a in M. Let $\mathcal{E}$ satisfy the following axioms:

AE 1. If $a \in \mathcal{E}$ then $M \setminus a \in \mathcal{E}$.

AE 2. If $a, b \in \mathcal{E}$ then $a \cap b \in \mathcal{E}$.

Physically, the elements of $\mathcal{E}$ represent definite properties. By this we mean that the mapping principles must specify what aspects of “physical reality” are to be mapped to the elements of $\mathcal{E}$ and M. The mapping principles must also specify what real relationships between an element $x \in M$ and an element $a \in \mathcal{E}$ are to be identified with the statement “x has the property a.” The latter statement is mapped to the mathematical relation $x \in a$ where $x \in M$ and $a \in \mathcal{E}$.

In a more general context M and $\mathcal{E}$ may represent (in the sense of [1], §10.5) sets of real but only indirectly determinable aspects of “physical reality.” In other words, the statement “x has the property a” may be only indirectly verified (see [1], §10.5 and §10.6).

The axioms AE 1 and AE 2 have the following intuitive meaning: AE 1 states that all objects which do not have the property a share a common property, which we denote by “not a.” AE 2 states that all objects which have both properties a and b have a common property, which we denote by $a \cap b$.

It is important to note that these statements do not constitute a proof of these axioms. A careless reading of these axioms may lead the reader to conclude that they are merely consequences of logic. Such is not the case; AE 1 and AE 2 cannot be derived from the logical axioms of mathematics (see [1], §4.3) and therefore must be asserted as axioms.

The concept of a “physical object” which is defined only by the elements of a set M together with a structure $\mathcal{E}$ characterized by axioms AE 1 and AE 2 is too general.
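The two axioms can be checked mechanically on a finite example. The following is only a minimal sketch, not part of the theory: the four-element set `M`, the families `E_good` and `E_bad`, and the function name `is_property_structure` are hypothetical choices made for illustration.

```python
def is_property_structure(M, E):
    """Check AE 1 and AE 2 for a family E of subsets of a finite set M."""
    M = frozenset(M)
    E = {frozenset(a) for a in E}
    # AE 1: the complement of every property is again a property.
    ae_1 = all(M - a in E for a in E)
    # AE 2: the intersection of any two properties is again a property.
    ae_2 = all(a & b in E for a in E for b in E)
    return ae_1 and ae_2


# Hypothetical toy data: four labelled objects and two candidate families.
M = {1, 2, 3, 4}
E_good = [set(), {1, 2}, {3, 4}, {1, 2, 3, 4}]
E_bad = [set(), {1, 2}, {1, 2, 3, 4}]  # the complement {3, 4} is missing

print(is_property_structure(M, E_good))  # True
print(is_property_structure(M, E_bad))   # False
```

The second family fails AE 1 only; this shows that the axioms restrict the choice of the family and do not hold automatically for every collection of subsets.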
The above concept of a property is also too general. In addition to describing the object itself, it may also be used to describe a physical system with respect to its environment. But objective properties should exhibit an independence of the environment. Therefore we shall find it necessary to formulate the notion of “independence of the environment” in terms of the mathematical theory.

It is customary to express this independence of the environment in the following terms: The properties of a given system are “objective” and they can be determined by suitable measurements. It is, however, not clear what we mean by a “suitable measurement.”

We have not yet formulated the concept of “objectivity” (that is, independence of the environment) in a mathematical way. It is clear that “M together with $\mathcal{E}$” by itself is not sufficient for this formulation because the
“interaction with the environment” must first be described if we wish to define the notion of “objective” (that is, independent of the environment). In §4 we shall define the notion of a “physical system” and describe the interaction of the system with the environment. In III, §4.1 we shall continue this discussion in order to obtain a suitable definition of a physical object.

We shall now proceed as if the term “objective property” has already been defined in the theory. If $\mathcal{E}$ is a set of “objective” properties, we shall call the elements of M “physical objects.”

After we have introduced the above definition of the concept of a “physical object” it is important to put aside all intuition and preconceptions about physical objects (despite the fact that they were used in order to formulate the new concept; see [1], §5 and [2], III, §4) in order to obtain a correct understanding of the new concept. Thus our notion of a “physical object” is defined in terms of M, $\mathcal{E}$, axioms AE 1 and AE 2, and a definition of the notion of “objective” which will be introduced later.

It is important to emphasize the fact that this concept depends not only on the elements of the set M but also on $\mathcal{E}$. In more precise terms we must speak of “physical objects with respect to the property structure $\mathcal{E}$.” Such a distinction is not necessary when the choice of $\mathcal{E}$ is clear and unambiguous.

We note the fact that axioms AE 1 and AE 2 are equivalent to the statement that $\mathcal{E}$ is a Boolean ring of sets (see AI, §4).

For those readers who have not yet achieved an understanding of the remarkable features of quantum mechanics it might appear that (intuitively, on the basis of preconceived notions associated with macroscopic physical objects) it is possible to construct quantum mechanics on the basis of “micro-objects,” that is, upon $\mathcal{E}$, M, and AE 1 and AE 2 in such a way that the behavior of micro-objects is completely determined by the properties as defined by $\mathcal{E}$.
What is meant by this complete determination is delineated in III, §4.1. Such a procedure will lead to contradiction with experience, as we shall find in IV, §8.1.

2 Selection Procedures

We shall now consider the discussion of experiments with microsystems which we began in I, §2. We shall not begin by introducing a number of intuitive concepts which establish a connection between measurements of microsystems and “measurement of properties” (see III, §4.1). Instead we shall proceed cautiously and seek only to obtain a mathematical representation of the preparation and registration of microsystems.

According to the description of the experimental procedures presented in I, §2, the preparation and registration procedures have a common attribute: they result in the selection of microsystems. We shall now formulate this common attribute in mathematical terms.

In mathematics it is often useful to first introduce the more general and then the more specialized concepts, or in more precise terms, first the less rich
and then the more rich structure types. Then all theorems for the less rich are also valid for the more rich structure types. Indeed, everywhere in a mathematical theory where such a general structure is found, the theorems deduced for this structure can be applied. Consider, for example, the structure type “group” and its meaning in many different mathematical theories.

On the basis of physical and mathematical considerations we shall now introduce the structure type “selection procedures.” We begin by introducing a set M. The elements of M shall be used as labels for the microsystems. Therefore we shall loosely refer to M as the set of microsystems (see, for example, [1], §5, §10, and §12 or [2], III, §4).

We shall call a subset $\mathcal{S} \subset \mathcal{P}(M)$ a set of selection procedures provided the following axioms are satisfied:

AS 1.1. If $a, b \in \mathcal{S}$, $a \subset b$ then $b \setminus a \in \mathcal{S}$.

AS 1.2. If $a, b \in \mathcal{S}$, then $a \cap b \in \mathcal{S}$.

It is somewhat difficult to make the axioms for such a general concept as a selection procedure plausible to physicists. The following remarks are intended for this purpose.

First, it is evident that a structure $\mathcal{E}$ of “properties” is, on the basis of AE 1 and AE 2, also a structure of “selection procedures.” A structure of selection procedures is equivalent to a set of properties (according to AS 1.1 and AS 1.2) if and only if $M \in \mathcal{S}$. The fact that every set $\mathcal{E}$ of properties is a set of selection procedures may be intuitively expressed as follows: $\mathcal{E}$ consists of the selection procedures which select according to the properties $a \in \mathcal{E}$. If we had introduced the concept of a “property” as a special case of a selection procedure, then we would say that, in physics, there are other methods of “selection” than according to “objective properties.”

Mathematically, the distinction between $\mathcal{E}$ and $\mathcal{S}$ appears to be small. For $\mathcal{S}$ we do not require that $M \in \mathcal{S}$.
This small distinction permits us to extend the domain of application of selection procedures beyond that of the properties.

In order to make AS 1 more plausible, let us suppose that the selection procedures a, b, etc. are obtained by physical methods. Then a subset a of M represents the set of systems x selected according to the procedure a for some experiment. The set a is, in general, infinite. The set of systems obtained experimentally from the procedure a is always finite, but can be arbitrarily large. Since in principle we do not know how large this number can be, we express this lack of knowledge in the mathematical framework by the expression “infinite” (see [1], §6, §9 and [2], §5, §8).

Intuitively speaking, axiom AS 1.2 states that the set of all x selected according to both selection procedures a and b, that is, all $x \in a \cap b$, is a possible selection procedure.

If $a \subset b$ we say that the selection procedure a is “finer” than that of b. If $a \subset b$, and if we eliminate (by means of the finer selection procedure a) the systems
associated with a, we obtain $b \setminus a$; AS 1.1 states that $b \setminus a$ is a possible selection procedure.

Have we then not also asserted that all objects $x \in M$ which do not satisfy the selection criterion of a can be “selected” on the basis that they do not satisfy the selection criteria of a?

For “properties” it appears to be meaningful to assert that both a and its complement $M \setminus a$ are properties, because the elements of a differ fundamentally from those of $M \setminus a$ in that the latter do not have the property a. For selection procedures, however, it is physically unrealistic to make this assertion. This is the case not only for micro-objects. For example, let us consider a machine which produces steel spheres (ball bearings). The machine can be considered to be a selection procedure a for steel spheres M. The complementary set $M \setminus a$ is apparently characterized by the fact that the spheres of $M \setminus a$ were not produced by the machine a. For the set a we may make certain (technically important) assertions, while for $M \setminus a$ we may say only that the elements of $M \setminus a$ were not selected according to a.

A similar situation exists for a modern electron accelerator. For the “selected” set of electrons we may make important assertions about the experiments for which the electrons are used. What assertions can we make about the electrons which are not prepared by the electron accelerator?

Thus it is meaningful not to require that M be a selection procedure. The addition of the axiom $M \in \mathcal{S}$ to AS 1.1 and AS 1.2 would not lead to a contradiction with quantum mechanics. As the above example has shown, the inclusion of the condition $M \in \mathcal{S}$ among the axioms for the structure “selection procedure” is somewhat physically unrealistic. Therefore we will not add the axiom $M \in \mathcal{S}$.

If we add the condition $M \in \mathcal{S}$ to axioms AS 1.1 and AS 1.2 then we find that $\mathcal{S}$ will satisfy axioms AE 1 and AE 2 and would therefore be a property structure.
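The relation between the two structure types can be made concrete on a small finite example. This is a hedged sketch only: the set `M`, the two families, and both function names are hypothetical illustrations, and inclusion is read as allowing equality.

```python
def is_selection_structure(S):
    """Check AS 1.1 and AS 1.2 for a finite family S of sets."""
    S = {frozenset(a) for a in S}
    # AS 1.1: for a contained in b, the relative difference b \ a is in S.
    as_1_1 = all(b - a in S for a in S for b in S if a <= b)
    # AS 1.2: the intersection of two selection procedures is in S.
    as_1_2 = all(a & b in S for a in S for b in S)
    return as_1_1 and as_1_2


def is_property_structure(M, E):
    """Check AE 1 (complements) and AE 2 (intersections) relative to M."""
    M = frozenset(M)
    E = {frozenset(a) for a in E}
    return all(M - a in E for a in E) and all(a & b in E for a in E for b in E)


M = {1, 2, 3, 4}
S_without_M = [set(), {1}, {2}, {1, 2}]          # M itself is not selected
S_with_M = [set(), {1, 2}, {3, 4}, {1, 2, 3, 4}]  # M belongs to the family

print(is_selection_structure(S_without_M))    # True
print(is_property_structure(M, S_without_M))  # False
print(is_selection_structure(S_with_M))       # True
print(is_property_structure(M, S_with_M))     # True
```

In the first family nothing is said about the systems 3 and 4 beyond the fact that no procedure selects them, which mirrors the steel-sphere example.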
In physical terms (that is, on the basis of the mapping principles) an element of $\mathcal{S}$ will not represent an intrinsic “objective” property of an object x but only the “property” that x is selected according to the procedure a. We note that the axiom systems AE 1 and AE 2 or AS 1.1 and AS 1.2 alone do not suffice to describe the physical role of the elements of $\mathcal{E}$ and $\mathcal{S}$. Further axioms will be needed in order to formulate the physical structures more precisely.

We shall now state a number of definitions and theorems which we shall need later.

D 2.1. $\mathcal{S}(a) = \{b \mid b \in \mathcal{S} \text{ and } b \subset a\}$.

According to AI, §4 and AS 1.1, AS 1.2 we find that $\mathcal{S}(a)$ is a Boolean ring of sets with null element $\emptyset$ and unit element a. The set $\mathcal{S}$ itself need not necessarily be a lattice (AI, §1 and §4) since, given $a, b \in \mathcal{S}$, $a \cup b$ need not be an element of $\mathcal{S}$.

Th. 2.1. Given a family of structures of selection procedures $\{\mathcal{S}_\lambda\}$, then $\mathcal{S} = \bigcap_\lambda \mathcal{S}_\lambda$ is a structure of selection procedures.
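D 2.1, the remark on unions, and Th. 2.1 can be illustrated on finite families. The sketch below rests on assumptions: the families `S` and `T` and the helper names `is_selection_structure` and `restrict` are hypothetical and chosen only to keep the example small.

```python
def is_selection_structure(S):
    """Check AS 1.1 and AS 1.2 for a finite family S of frozensets."""
    as_1_1 = all(b - a in S for a in S for b in S if a <= b)
    as_1_2 = all(a & b in S for a in S for b in S)
    return as_1_1 and as_1_2


def restrict(S, a):
    """D 2.1: the family S(a) of all b in S that are contained in a."""
    return {b for b in S if b <= a}


def fs(*items):
    """Shorthand for building a frozenset."""
    return frozenset(items)


S = {fs(), fs(1), fs(2)}
print(is_selection_structure(S))  # True
print(fs(1) | fs(2) in S)         # False: the union {1, 2} is not in S

# S(a) for a = {1}: closed under complement relative to a, with unit a.
S_a = restrict(S, fs(1))
print(S_a == {fs(), fs(1)})                # True
print(all(fs(1) - b in S_a for b in S_a))  # True

# Th. 2.1 on two structures: their common part is again a structure.
T = {fs(), fs(1), fs(3), fs(1, 3)}
print(is_selection_structure(S & T))  # True
```

The family `S` is closed under relative differences and intersections but not under unions, so it is a structure of selection procedures without being a lattice of sets.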
The proof of Th. 2.1 is a simple consequence of AS 1.1 and AS 1.2.

Th. 2.2. For each subset $\Theta$ of $\mathcal{P}(M)$ there is a smallest structure of selection procedures $\mathcal{S}$ (called the structure of selection procedures generated by $\Theta$) which satisfies $\Theta \subset \mathcal{S}$.

Proof. $\mathcal{S}$ is the intersection of all structures of selection procedures $\mathcal{S}_\lambda$ which satisfy $\Theta \subset \mathcal{S}_\lambda$. Since $\mathcal{P}(M)$ is itself a structure of selection procedures, the family $\mathcal{S}_\lambda$ is nonempty.

D 2.2. A set $\tilde{\mathcal{S}}$ of selection procedures for which $\tilde{\mathcal{S}} \subset \mathcal{S}$ is said to be coexistent relative to $c \in \mathcal{S}$ provided that $\tilde{\mathcal{S}} \subset \mathcal{S}(c)$.

A set $\mathcal{E}$ of properties is a coexistent set of selection procedures relative to M. Every subset of $\mathcal{E}$ is a coexistent set of selection procedures relative to M.

3 Statistical Selection Procedures

Earlier we have presented a mathematical description of the fundamental phenomena associated with the selection of microsystems by means of an apparatus. We shall now consider the mathematical formulation of the second basis of quantum mechanics: statistics. The role of statistics is, of course, not restricted to quantum mechanics. It is, however, an essential component of quantum mechanics. For example, it is possible to resolve the apparent contradiction between the particle and wave descriptions only by the introduction of a statistical viewpoint; see, for example, [2], XI, §1.5-§1.7.

In an experimental context the role of statistics is made manifest by the relative frequency with which a finer selection procedure $b \subset a$ selects relative to a. By this we mean that if we select N systems $x_1, x_2, \ldots, x_N$ according to the selection procedure a, and we obtain $N_1$ of these systems which also satisfy the selection procedure b, then the relative frequency h is given by $h = N_1/N$. We say that b is statistically dependent on a if the relative frequency h is reproducible.
By this we mean that “in physical approximation” we obtain the same relative frequency (in the case of large numbers N of systems) for various experiments involving selection according to a and b. We shall use real numbers to mathematically represent these relative frequencies. Here we shall not consider the meaning of the expression “in physical approximation” used above. We only note that we do not require that $\alpha = N_1/N$ but only that they are approximately equal, $\alpha \approx N_1/N$ (where $\alpha$ is the real number representing the frequency). The nature of this approximation is discussed in [1], §11 where we have placed a particular emphasis on the relationship between theory and experience.

We shall now introduce the concept of a statistical selection procedure in order to describe the statistical dependence of selection procedures.
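The notion of a reproducible relative frequency can be illustrated by a small simulation. This is a sketch under a strong assumption that is not made in the text: each of the N systems selected by a is taken to satisfy b independently with a fixed hypothetical value `ALPHA`. The function name and the seeds are likewise illustrative.

```python
import random


def relative_frequency(n_systems, alpha, seed):
    """Return h = N_1 / N for n_systems systems selected according to a.

    Assumption for illustration only: each system independently also
    satisfies the finer procedure b with probability alpha.
    """
    rng = random.Random(seed)
    n_1 = sum(1 for _ in range(n_systems) if rng.random() < alpha)
    return n_1 / n_systems


ALPHA = 0.3  # hypothetical real number representing the frequency

# Two independent "experiments" (different seeds) for increasing N.
for n in (100, 10_000, 1_000_000):
    h_first = relative_frequency(n, ALPHA, seed=1)
    h_second = relative_frequency(n, ALPHA, seed=2)
    print(n, h_first, h_second, abs(h_first - h_second))
```

For large N the two runs should agree with each other and with `ALPHA` only approximately, which is the sense of the relation $\alpha \approx N_1/N$ above; exact equality is not to be expected.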
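A set $\mathcal{S} \subset \mathcal{P}(M)$ is called a structure of statistical selection procedures provided that, in addition to AS 1, the following axiom is satisfied. Let $\mathcal{T} = \{(a, b) \mid a, b \in \mathcal{S},\ b \subset a,\ a \neq \emptyset\}$; let $\lambda: \mathcal{T} \to [0, 1]$.

AS 2.1. If $a_1, a_2 \in \mathcal{S}$, $a_1 \cap a_2 = \emptyset$, $a_1 \cup a_2 \in \mathcal{S}$ then $\lambda(a_1 \cup a_2, a_1) + \lambda(a_1 \cup a_2, a_2) = 1$.

AS 2.2. If $a_1, a_2, a_3 \in \mathcal{S}$, $a_1 \supset a_2 \supset a_3$, $a_2 \neq \emptyset$ then $\lambda(a_1, a_3) = \lambda(a_1, a_2)\,\lambda(a_2, a_3)$.

AS 2.3. If $a_1, a_2 \in \mathcal{S}$, $a_1 \supset a_2$, $a_2 \neq \emptyset$ then $\lambda(a_1, a_2) \neq 0$.

The quantity $\lambda(a, b)$ is called the probability of b relative to a and represents the relative frequency (as described above) with which b selects relative to a. On the basis of this interpretation of $\lambda(a, b)$, the postulates AS 2.1-AS 2.3 are obvious (but not proven; see [1], §5 or [2], III, §4).

If $a_1 \cup a_2$ is a selection procedure then $a_1$ and $a_2$ are finer than $a_1 \cup a_2$. If $a_1 \cap a_2 = \emptyset$ then $a_1$ and $a_2$ are mutually exclusive. Then if N systems are selected according to $a_1 \cup a_2$, and if $N_1$ are selected according to $a_1$ and $N_2$ according to $a_2$, then we must have $N = N_1 + N_2$, which, after division by N, is what we obtain from AS 2.1.

For three selection procedures $a_3 \subset a_2 \subset a_1$ let $N_1$ systems be selected according to $a_1$, $N_2$ of these $N_1$ systems according to $a_2$ and $N_3$ of these $N_2$ systems according to $a_3$. Then we find that

$$\frac{N_3}{N_1} = \frac{N_2}{N_1} \cdot \frac{N_3}{N_2},$$

which is in agreement with AS 2.2.

If $a_1 \supset a_2 \neq \emptyset$ then for N systems selected according to $a_1$, a finite number may be selected according to $a_2$, in agreement with AS 2.3.

We shall now consider a number of simple corollaries of AS 2. Let $a_2 = \emptyset$. From AS 2.1 we obtain

$$\lambda(a_1, a_1) + \lambda(a_1, \emptyset) = 1. \tag{3.1}$$

By multiplication with $\lambda(a_1, a_1)$ we obtain

$$\lambda(a_1, a_1)^2 + \lambda(a_1, a_1)\,\lambda(a_1, \emptyset) = \lambda(a_1, a_1). \tag{3.2}$$

According to AS 2.2 (applied to $a_1 \supset a_1 \supset a_1$ and to $a_1 \supset a_1 \supset \emptyset$) we have $\lambda(a_1, a_1)^2 = \lambda(a_1, a_1)$ and $\lambda(a_1, a_1)\,\lambda(a_1, \emptyset) = \lambda(a_1, \emptyset)$, so that (3.2) becomes $\lambda(a_1, a_1) + \lambda(a_1, \emptyset) = \lambda(a_1, a_1)$ and therefore

$$\lambda(a_1, \emptyset) = 0 \tag{3.3}$$

and, by (3.1),

$$\lambda(a_1, a_1) = 1. \tag{3.4}$$

If $a_2 \cap a_3 = \emptyset$, and if $a_2 \subset a_1$, $a_3 \subset a_1$, from AS 2.1 it follows that

$$\lambda(a_2 \cup a_3, a_2) + \lambda(a_2 \cup a_3, a_3) = 1.$$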
By multiplication with $\lambda(a_1, a_2 \cup a_3)$ and application of AS 2.2 we obtain

$$\lambda(a_1, a_2 \cup a_3) = \lambda(a_1, a_2) + \lambda(a_1, a_3). \tag{3.5}$$
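The corollaries (3.3)-(3.5) can be checked on a finite toy model. The following Python sketch assumes, purely for illustration, that $\lambda(a, b)$ is the counting ratio $|b|/|a|$ on subsets of a twelve-element set; the function name `lam` and the particular sets are hypothetical and do not come from the text. It is one simple model of AS 2.1-AS 2.3, not the general case.

```python
from fractions import Fraction


def lam(a, b):
    """Toy probability of b relative to a: the fraction of a's elements in b.

    Assumption for illustration only: lambda is the counting ratio |b| / |a|.
    As in the definition of P, b must be a subset of a and a must be nonempty.
    """
    a, b = frozenset(a), frozenset(b)
    if not a or not b.issubset(a):
        raise ValueError("lam(a, b) requires b to be a subset of a nonempty a")
    return Fraction(len(b), len(a))


# Hypothetical selection procedures on M = {0, ..., 11}.
a1 = frozenset(range(12))
a2 = frozenset({0, 1, 2, 3, 4})   # a2 is finer than a1
a3 = frozenset({7, 8, 9})         # a3 is finer than a1 and disjoint from a2
a4 = frozenset({0, 1})            # a4 is finer than a2

# AS 2.1: two mutually exclusive procedures exhaust their union.
print(lam(a2 | a3, a2) + lam(a2 | a3, a3))

# AS 2.2: chain rule for a4 inside a2 inside a1.
print(lam(a1, a4), lam(a1, a2) * lam(a2, a4))

# (3.5): additivity for disjoint a2, a3 inside a1.
print(lam(a1, a2 | a3), lam(a1, a2) + lam(a1, a3))
```

Exact rational arithmetic (`Fraction`) is used so that the identities hold exactly rather than up to rounding.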
If $a_3 \subset a_2 \subset a_1$, from (3.5) and $a_2 = a_3 \cup (a_2 \setminus a_3)$ we obtain $\lambda(a_1, a_2) = \lambda(a_1, a_3) + \lambda(a_1, a_2 \setminus a_3)$ and therefore $\lambda(a_1, a_2) \geq \lambda(a_1, a_3)$.

D 3.1. A mapping $\mu$ of a Boolean ring $\Sigma$ (see AI, §3) into the unit interval $[0, 1]$ which satisfies $\mu(e) = 1$ (where $e$ is the unit element of $\Sigma$) and for which $\sigma_1 \wedge \sigma_2 = 0$ implies $\mu(\sigma_1 \vee \sigma_2) = \mu(\sigma_1) + \mu(\sigma_2)$ is called an additive real measure on $\Sigma$.

From (3.5), AS 2.2, and AS 2.3 it is easy to obtain the following theorem:

Th. 3.1. $\mu(b) = \lambda(a, b)$ is an additive measure on the Boolean ring $\mathcal{S}(a)$; for $a_3 \subset a_2 \subset a$ with $a_2 \neq \emptyset$ we obtain

$$\lambda(a_2, a_3) = \frac{\mu(a_3)}{\mu(a_2)}.$$

If we fix $a$ and consider all $b \subset a$, it is sufficient to consider the probability function $\mu(b)$ since, on the Boolean ring $\mathcal{S}(a)$, we obtain all conditional probabilities from the probability function $\mu(b)$.

In addition to the axioms AS 2.1-AS 2.3 we also propose the following:

AS 2.4.1. If $a_\nu \in \mathcal{S}$ is a decreasing sequence for which $\bigcap_\nu a_\nu = \emptyset$, and if there exists an $a \in \mathcal{S}$ for which $a_1 \subset a$ (and thus $a_\nu \subset a$ for all $\nu$), then $\lambda(a, a_\nu) \to 0$.

AS 2.4.2. For every totally ordered subset of $\mathcal{S}$ there exists an upper bound in $\mathcal{S}$.

Axiom AS 2.4.1 is a generalization (in the sense of a mathematical idealization) of the intuitively evident relation (3.3). If for the decreasing sequence we have $a_\nu = \emptyset$ from some $\nu$ onward, from (3.3) it follows that AS 2.4.1 is satisfied. In addition, AS 2.4.1 requires that if the sequence of selection procedures $a_\nu$ becomes arbitrarily small in the sense of $\bigcap_\nu a_\nu = \emptyset$ then the probabilities $\lambda(a, a_\nu)$ must become arbitrarily small. A situation in which $\lambda(a, a_\nu)$ tends to a nonzero limiting value while, for all practical purposes, there are no more physical systems to be selected by $a_\nu$ for large $\nu$ is physically unrealistic. Axiom AS 2.4.2 is physically realistic because it asserts that, for a sequence of increasing selection procedures, there is a largest selection procedure.
Thus it becomes a substitute for the stronger, but less realistic, condition $M \in \mathcal{S}$. Here we refer readers who are interested in these
axioms and their implications towards a physically motivated “generalized” probability theory to [1], §11 and [18] (see footnote 1). Using axioms AS 2.1-AS 2.4 we may develop a theory of probability which is similar to that of Kolmogorov (see [18]). We shall not derive any additional results here. We only note that the probabilistic basis for quantum statistical mechanics differs only slightly from the usual one. If, instead, we were to require that $M \in \mathcal{S}$, then we would find that $\mathcal{S} = \mathcal{S}(M)$ is a Boolean ring, and that the relative probabilities would be completely determined by the probability function $\mu(a) = \lambda(M, a)$. The basis of the statistical selection procedures would then be identical to that of the simplest “classical” probability theory.

We shall now introduce an important definition which we shall need later.

D 3.2. Let $\mathcal{S}$ be a structure of statistical selection procedures. A partition of $a \in \mathcal{S}$ ($a = \bigcup_{i=1}^{n} b_i$, $b_i \neq \emptyset$, $b_i \in \mathcal{S}$, $b_i \cap b_j = \emptyset$ for $i \neq j$) is called a decomposition of $a$ in the $b_i$, and $a$ is called a mixture of the $b_i$. The $\lambda(a, b_i)$ are called the weights of $b_i$ in $a$.

Since $\mathcal{S}(a)$ is a Boolean ring, the decomposition of $a$ is nothing other than a disjoint partition of the unit element $a$ of $\mathcal{S}(a)$. With the additive measure $\mu(b)$ defined on $\mathcal{S}(a)$ we obtain $\sum_{i=1}^{n} \mu(b_i) = 1$. The $\mu(b_i)$ describe the “weights” with which the individual components $b_i$ occur in the decomposition. If $N$ systems are selected according to $a$, and if $N_i$ of them are selected according to each $b_i$, then the relation $\mu(b_i) = N_i/N$ must hold in “physical approximation.”

4 Physical Systems

We shall now consider the central topic of this chapter: the presentation of a more precise formulation of the concept of a “physical system” and of a more detailed description of the statistical selection procedures used for physical systems.
We again begin with a set $M$, the elements of which will be used to represent “microsystems.”

[Footnote 1, on the axioms of §3: It is possible to add the following axiom to AS 1.1-AS 2.4 without contradicting experience: AS 1.3. If $a, b \in \mathcal{S}$ and $a \cap b \neq \emptyset$ then $a \cup b \in \mathcal{S}$. We shall not use this axiom in this book.]

4.1 Preparation Procedures

From the analysis of the experimental process which was presented in I, §2 we found that the “selection” of microsystems takes place in two distinct ways: by preparation and by registration (that is, by selection according to
the result of the interaction of the microsystem with a registration apparatus). Let us begin with the formulation of the mathematical structure describing preparation. Let $\mathcal{Q} \subset \mathcal{P}(M)$ be a structure for which

APS 1. $\mathcal{Q}$ is a structure of statistical selection procedures.

We shall call $\mathcal{Q}$ a “set of preparation procedures.” This designation is not a consequence of APS 1, but is shorthand for a mapping principle which maps certain facts onto elements of $\mathcal{Q}$. That is, the elements of $\mathcal{Q}$ will serve as images (see, for example, [2], III, §4 or [1], §5) of certain definite technical facts and processes by which microsystems can be produced in large numbers. Here we do not permit the use of quantum mechanics in their description. The mathematical relation $x \in a$ (where $a \in \mathcal{Q}$) is the translation of the statement: $x$ is obtained by the preparation procedure $a$. Here it does not make any difference whether this statement is a statement about the past, present, or future (see the discussion in [1], §12).

We shall denote the probability function for $\mathcal{Q}$ (where $\mathcal{Q}$ satisfies APS 1) by $\lambda_{\mathcal{Q}}$. The definitions and theorems of §3 are valid for $\mathcal{Q}$. Later we shall consider the decomposition (as defined by D 3.2) of preparation procedures.

4.2 Registration Procedures

It is somewhat more difficult to present a mathematical formulation of the registration process. The registration process is characterized by two steps:

(1) Construction and utilization of the registration apparatus.
(2) Selection according to the changes which occur (or do not occur) in the registration apparatus.

Accordingly, we shall define an additional structure on $M$ by means of two subsets of $\mathcal{P}(M)$: $\mathcal{R}$ and $\mathcal{R}_0$. For $\mathcal{R}$ and $\mathcal{R}_0$ we require that the following axioms be satisfied:

APS 2. $\mathcal{R}$ is a structure of selection procedures.

APS 3. $\mathcal{R}_0$ is a structure of statistical selection procedures.

APS 4.1. $\mathcal{R}_0 \subset \mathcal{R}$.

APS 4.2. To each $b \in \mathcal{R}$ there exists a $b_0 \in \mathcal{R}_0$ for which $b \subset b_0$.
In order to develop the physical meaning of axioms APS 2-4 we shall first state what is to be mapped onto the elements of $\mathcal{R}$ and $\mathcal{R}_0$. An element $b_0 \in \mathcal{R}_0$ shall represent the construction and the use of a registration
apparatus. This can best be illustrated by an example. Let us consider a counter. Then $b_0$ will be the set of all microsystems which are registered by the counter. The mathematical relation $x \in b_0$ may be taken as the translation of the following statement: for the registration of $x$ the counter $b_0$ is applied. The mapping principle may be expressed in more concise form as follows: $\mathcal{R}_0$ is the set of registration methods.

For a given microsystem $x \in b_0$ the counter considered above may or may not respond. Let $b_+$ be the selection procedure for those $x \in b_0$ for which the counter responds. Here we find that $b_+ \subset b_0$. Let $b_-$ denote the set of those $x$ from $b_0$ for which the counter does not respond. Here we obtain $b_- = b_0 \setminus b_+$; $b_+$ and $b_-$ are elements of $\mathcal{R}$. Generally we find that $\mathcal{R}$ contains not only the elements of $\mathcal{R}_0$ but also those selection procedures which are finer than the elements of $\mathcal{R}_0$ and which select according to changes which occur (or do not occur) in the registration apparatus through interaction with the microsystems. This situation is described by axioms APS 4.1 and APS 4.2. The physical interpretation of the elements of $\mathcal{R}$ may be expressed in more concise form as follows: $\mathcal{R}$ is the set of registration procedures.

Axioms APS 4.1 and APS 4.2 do not permit us to conclude that the selection procedures from $\mathcal{R}_0$ do not depend on the microsystems but depend only on the apparatus and its “arbitrary” application. The fact that the registration methods do not depend on the microsystems is, in part, described by APS 3, where we assume that $\mathcal{R}_0$ is a structure of statistical selection procedures. Here axiom APS 3 requires that a finer registration method is statistically dependent on a coarser one and is independent of the subsequent influence of microsystems. Suppose that $b_0, c_0 \in \mathcal{R}_0$ and that $c_0 \subset b_0$. Then the registration method $c_0$ satisfies stronger selection criteria (namely those of $c_0$) than $b_0$ with respect to the apparatus.
In an experiment, this would result in the exclusion of those elements of $b_0$ which are not also elements of $c_0$. Thus for $c_0$ there is a statistical dependence relative to $b_0$ which depends only on the “finer” selection $c_0$ of the apparatuses. Experimental physicists will seek to minimize the effect of the “statistical distribution” of the apparatuses because it will interfere with the statistical distributions they wish to measure. Instead of speaking of the statistical dependence of the elements $b_0 \in \mathcal{R}_0$, it is more common to include its influence in the discussion of the so-called experimental errors in the measurement. To an experimental physicist the above statistical distributions of the apparatuses (the effect of which cannot be completely eliminated) will appear as an error in the experimental measurement, because he tries to attain, but cannot attain, registration methods $b_0 \in \mathcal{R}_0$ for which there is no $c_0 \in \mathcal{R}_0$ with $c_0 \neq \emptyset$, $c_0 \subset b_0$, and $c_0 \neq b_0$.

We shall denote the probability function for $\mathcal{R}_0$ by $\lambda_{\mathcal{R}_0}$.

We note that axiom APS 2 does not require that $\mathcal{R}$ be a structure of statistical selection procedures. This crucial fact permits $\mathcal{R}$ to contain selection procedures which are essentially conditioned by the microsystem. The counter which is characterized by $b_0$ may or may not respond; thus $b_0$
can be partitioned into two subsets $b_+$, $b_-$ where $b_0 = b_+ \cup b_-$, $b_+ \cap b_- = \emptyset$. In nature, however, there are no reproducible frequencies $\lambda(b_0, b_+)$. Suppose that we pass $N$ microsystems $x_1, x_2, \ldots, x_N$ through the counter, and that $N_+$ responses are obtained. By experience we find that $N_+/N$ depends on any circumstances which affect the microsystems. We find that the frequency $N_+/N$ is not reproducible on the basis of the registration procedures alone. On the contrary, counters are used to determine which kinds of microsystems are present. The actual frequency $N_+/N$ depends on the preparation procedures used for the microsystems. Thus we find it necessary to introduce additional postulates which depend on both the preparation procedure and the registration procedure.

4.3 The Dependence of Registration upon Preparation

We shall now consider the following question: Which combinations of $a \in \mathcal{Q}$ and $b_0 \in \mathcal{R}_0$ are physically meaningful? Unfortunately, this problem is not trivial; indeed, in the usual formulation of quantum mechanics it is not even mentioned. In an idealized form, part of this problem can be found in axiomatic field theory as the so-called “causality postulate.”

How do we formulate this “combination problem”? In order to formulate this problem in mathematical language we introduce the following definition:

D 4.3.1. Let $a \in \mathcal{Q}$, $b_0 \in \mathcal{R}_0$, $a \neq \emptyset$, $b_0 \neq \emptyset$; we say that $a$ and $b_0$ may be combined provided that for all $\tilde{a} \in \mathcal{Q}$, $\tilde{b}_0 \in \mathcal{R}_0$ with $\emptyset \neq \tilde{a} \subset a$, $\emptyset \neq \tilde{b}_0 \subset b_0$ we find that $\tilde{a} \cap \tilde{b}_0 \neq \emptyset$.

If $a$ and $b_0$ may be combined, then clearly $a \cap b_0 \neq \emptyset$. We now define the following set:

$$C = \{(a, b_0) \mid a \in \mathcal{Q},\ b_0 \in \mathcal{R}_0 \text{ and } a, b_0 \text{ may be combined}\}. \tag{4.3.1}$$

The following theorem is an immediate consequence of D 4.3.1.

Th. 4.3.1. If $\emptyset \neq \tilde{a} \subset a$, $\emptyset \neq \tilde{b}_0 \subset b_0$, and if $a, \tilde{a} \in \mathcal{Q}$; $b_0, \tilde{b}_0 \in \mathcal{R}_0$ and $(a, b_0) \in C$, then we find that $(\tilde{a}, \tilde{b}_0) \in C$.
Intuitively, the statement $\tilde{a} \cap \tilde{b}_0 \neq \emptyset$ represents the statement that it is physically possible (see [1], §10) to select microsystems according to both $\tilde{a}$ and $\tilde{b}_0$. That is, the microsystems which were prepared according to the preparation procedure $\tilde{a}$ can be used for the registration method $\tilde{b}_0$. Thus the statement that $a$, $b_0$ may be combined means that for every finer preparation procedure $\tilde{a} \subset a$ and every finer registration method $\tilde{b}_0 \subset b_0$ there are always systems in $\tilde{a} \cap \tilde{b}_0$. The physical meaning of the combination problem will be discussed in detail in III, §1. In VII, §1 we shall again return to this subject.
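Definition D 4.3.1 and Th. 4.3.1 can be illustrated on finite families of sets. The Python sketch below is a toy model under stated assumptions: the three "systems", the families `Q` and `R0`, and the helper name `may_be_combined` are all hypothetical and chosen only so that one preparation (label `"B"`) can never meet one registration method (label `"Y"`).

```python
from itertools import product


def may_be_combined(a, b0, Q, R0):
    """D 4.3.1 for finite families: every nonempty finer a~ in Q and every
    nonempty finer b0~ in R0 must have a nonempty intersection."""
    if not a or not b0:
        return False
    finer_a = [x for x in Q if x and x.issubset(a)]
    finer_b = [y for y in R0 if y and y.issubset(b0)]
    return all(x & y for x, y in product(finer_a, finer_b))


# Hypothetical systems, labelled (preparation, registration method).
# No system carries the labels ("B", "Y"): that combination is impossible.
M = frozenset({("A", "X"), ("A", "Y"), ("B", "X")})

a_A = frozenset(x for x in M if x[0] == "A")
a_B = frozenset(x for x in M if x[0] == "B")
a_all = a_A | a_B
b_X = frozenset(x for x in M if x[1] == "X")
b_Y = frozenset(x for x in M if x[1] == "Y")
b_all = b_X | b_Y

Q = [frozenset(), a_A, a_B, a_all]    # toy preparation procedures
R0 = [frozenset(), b_X, b_Y, b_all]   # toy registration methods

# The set C of (4.3.1).
C = {(a, b0) for a in Q for b0 in R0 if may_be_combined(a, b0, Q, R0)}

# a_all and b_all intersect, yet they may not be combined, because the
# finer pair (a_B, b_Y) has no common systems.
print(bool(a_all & b_all), (a_all, b_all) in C)
```

In this model a nonempty intersection $a \cap b_0$ is necessary but not sufficient for combinability, which is the point of requiring the condition for all finer procedures.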
The “largest possible” set $C$ is given by $C = \mathcal{Q}' \times \mathcal{R}'_0$ where

D 4.3.2. $\mathcal{Q}' = \mathcal{Q} \setminus \{\emptyset\}$ and $\mathcal{R}'_0 = \mathcal{R}_0 \setminus \{\emptyset\}$.

The condition $C = \mathcal{Q}' \times \mathcal{R}'_0$ is not always realistic. Thus, in the selection of axioms for $C$ we have encountered a physical problem.

Why is the condition $C = \mathcal{Q}' \times \mathcal{R}'_0$ an unrealistic condition for microsystems? In order to facilitate understanding, we shall now briefly describe an additional structure which is needed in order to describe the relationship between preparation and registration methods. A detailed exposition of this structure will be given in VII, §1. It is concerned with the space-time relationship between these experimental procedures. Since the microsystems are produced by the preparation procedure, it makes sense that the micro-objects are registered only after they are produced, and that the preparation apparatus and the registration apparatus do not collide.

These remarks show that the question concerning axioms for the set $C$ is not without physical significance. Later we shall see that, under certain circumstances, the combination problem described by $C$ does not play an essential role in the usual formulation of quantum mechanics. We shall now begin by making a minimal requirement for $C$.

APS 5.1.1. To each $a \in \mathcal{Q}'$ there exists a $b_0 \in \mathcal{R}_0$ such that $(a, b_0) \in C$.

This axiom states that it is physically possible (see [1], §10) to apply at least one registration apparatus to the microsystems prepared by $a$. In III, §1 we will replace APS 5.1.1 by the stronger axiom APS 5.1.2.

We shall now consider a registration procedure $b \in \mathcal{R}$ where $b \neq \emptyset$ (we do not require that $b \in \mathcal{R}_0$); $b \neq \emptyset$ means that there are microsystems which may be selected according to the procedure $b$. All experience has shown that it is possible to prepare such microsystems. Thus we assert

APS 5.2. To each $b \in \mathcal{R}$, $b \neq \emptyset$, there exists at least one $a \in \mathcal{Q}$ and a $b_0 \in \mathcal{R}_0$ which satisfy $b \subset b_0$, $(a, b_0) \in C$ and $a \cap b \neq \emptyset$.

Axiom APS 5.2 is stronger than APS 4.2.
We shall now consider the mathematical formulation of the statistical dependence of the combined selection procedures. First we define

D 4.3.3. $\Theta = \{c \mid c = a \cap b,\ a \in \mathcal{Q},\ b \in \mathcal{R}$, and for $a \neq \emptyset$ there exists a $b_0 \in \mathcal{R}_0$ which satisfies $b \subset b_0$ and $(a, b_0) \in C\}$.
Clearly $\Theta$ is a subset of $\mathcal{P}(M)$. An element $a \cap b$ of $\Theta$ represents the set of all microsystems which are prepared according to $a$ and registered according to $b$. From D 4.3.3 and Th. 4.3.1 it follows that:

Th. 4.3.2. Let $a, \tilde{a} \in \mathcal{Q}$, $\tilde{a} \subset a$ and suppose $b, \tilde{b} \in \mathcal{R}$, $\tilde{b} \subset b$. If $a \cap b \in \Theta$ then $\tilde{a} \cap \tilde{b} \in \Theta$.

D 4.3.4. Let $\mathcal{S}$ denote the smallest set of selection procedures for which $\Theta \subset \mathcal{S}$ (see Th. 2.2).

In general we find that neither $\mathcal{Q} \subset \mathcal{S}$ nor $\mathcal{R} \subset \mathcal{S}$ holds!

We shall now formulate the requirement that the combination of preparation and registration procedures leads to reproducible frequencies.

APS 6. We require that $\mathcal{S}$ (as defined by D 4.3.4) shall be a structure of statistical selection procedures.

Let $\lambda_{\mathcal{S}}$ denote the probability function for $\mathcal{S}$. On the basis of their physical interpretation, $\lambda_{\mathcal{Q}}$, $\lambda_{\mathcal{R}_0}$ and $\lambda_{\mathcal{S}}$ cannot be mutually independent. Experience has shown that, in the combination of preparation procedures $a \in \mathcal{Q}$ and registration methods $b_0 \in \mathcal{R}_0$, there is no statistical dependence between the selection according to the preparation procedures and the application of the registration method. We formulate the statistical independence of the preparation procedure and the application of registration methods by means of the following axioms:

APS 7. Let $a_1, a_2 \in \mathcal{Q}$, $b_{10}, b_{20} \in \mathcal{R}_0$ where $a_2 \subset a_1$, $b_{20} \subset b_{10}$ and $(a_1, b_{10}) \in C$. Then

APS 7.1. $\lambda_{\mathcal{S}}(a_1 \cap b_{10}, a_2 \cap b_{10}) = \lambda_{\mathcal{Q}}(a_1, a_2)$;

APS 7.2. $\lambda_{\mathcal{S}}(a_1 \cap b_{10}, a_1 \cap b_{20}) = \lambda_{\mathcal{R}_0}(b_{10}, b_{20})$.

Note that if $b \in \mathcal{R}$, $b \notin \mathcal{R}_0$, and if $a_1, a_2 \in \mathcal{Q}$ with $a_2 \subset a_1$ and $a_1 \cap b \neq \emptyset$ then, in general, we find that $\lambda_{\mathcal{S}}(a_1 \cap b, a_2 \cap b) \neq \lambda_{\mathcal{Q}}(a_1, a_2)$.

Axiom APS 7.1 expresses the directedness of the interaction from the preparation to the registration apparatus in the following way. The probability associated with the process of the preparation apparatus does not depend on those of the registration apparatus (that is, on $b_{10}$); see I, §2.
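The content of APS 7 can be made concrete with a small counting model. The Python sketch below assumes, for illustration only, that systems are labelled by a preparation and a registration method, that the number of systems with a given pair of labels factorizes (this factorization is the independence assumption), and that all probability functions are counting ratios. All labels, counts, and function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical multiplicities of preparations and registration methods.
prep_counts = {"p1": 2, "p2": 3, "p3": 1}
method_counts = {"m1": 1, "m2": 4, "m3": 2}

# One system per (preparation, method, copy index i, copy index j):
# the number of systems labelled (p, m) is prep_counts[p] * method_counts[m].
M = frozenset(
    (p, m, i, j)
    for p, m in product(prep_counts, method_counts)
    for i in range(prep_counts[p])
    for j in range(method_counts[m])
)


def lam(a, b):
    """Counting-ratio probability of b relative to a (b inside nonempty a)."""
    if not a or not b.issubset(a):
        raise ValueError("lam(a, b) requires b to be a subset of a nonempty a")
    return Fraction(len(b), len(a))


def prepared(labels):
    return frozenset(x for x in M if x[0] in labels)


def method(labels):
    return frozenset(x for x in M if x[1] in labels)


a1, a2 = prepared({"p1", "p2"}), prepared({"p1"})   # a2 finer than a1
b10, b20 = method({"m1", "m2"}), method({"m1"})     # b20 finer than b10

# APS 7.1 and APS 7.2 in the toy model: each pair should agree.
print(lam(a1 & b10, a2 & b10), lam(a1, a2))
print(lam(a1 & b10, a1 & b20), lam(b10, b20))

# A registration procedure outside R0: counter m1 "responds" only to
# systems prepared by p1, so its frequency depends on the preparation.
b_plus = frozenset(x for x in b20 if x[0] == "p1")
print(lam(a1 & b_plus, a2 & b_plus), lam(a1, a2))
```

The last line illustrates the remark following APS 7: for a response set such as `b_plus`, which belongs to the registration procedures but not to the registration methods, the analogue of APS 7.1 generally fails.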
4.4 The Concept of a Physical System

We shall now replace the intuitive notion of a physical system by a well-defined one, one defined in terms of the mathematical structure “preparation and registration procedures” and its physical interpretation (as expressed in
terms of the mapping principles). Since the mapping principles are involved in this concept, it is not a derived one, that is, a concept which is defined only by terms and structures deduced within the mathematical theory. Such derived concepts will be introduced later, for example, the derived concepts of ensemble, effect, and observable (see III and IV).

The expression “physical system” will represent those facts which are to be mapped to elements of the set $M$ together with the well-defined physical processes which are to be mapped to the elements of $\mathcal{Q}$, $\mathcal{R}$, and $\mathcal{R}_0$, whereby the letters $\mathcal{Q}$, $\mathcal{R}$, and $\mathcal{R}_0$ denote not only the sets but also the structures defined by axioms APS 1-7 (see footnote 1). In less precise terms we say that a physical system is an element of the set $M$ endowed with the structure $\mathcal{Q}$, $\mathcal{R}$, and $\mathcal{R}_0$. In III, §4.1 we shall examine the distinctions which must be made between the concept of a “physical system” and the concept of a “physical object.”

It is important to note that the concept of a physical system is not necessarily restricted to microsystems; see, for example, [1], §12, [2], XV-XVI, and [13].

The facts which we have called “physical systems” have some reality beyond that of the “direct” interpretation of preparation and registration procedures. This additional meaning results from axioms APS 1 and APS 7, which imply that the preparation procedure is independent of the registration procedure. Intuitively, this means that, in the preparation, “something” is produced (that which we have called a physical system) which can be detected by the registration apparatus. A more precise formulation of this state of affairs is presented in [2], XVI and [13].

But the facts which we have called “physical systems” are so closely related to the associated production and detection methods that the typical characterization of physical objects (see §1 and III, §4.1) by objective properties of the systems has not been possible until now.
The statement of facts which are self-existing in the sense that they do not exert any influence on other things is self-contradictory. Such facts are completely inaccessible. Nevertheless, in physics we endeavor to describe, as completely as possible, portions of the real world as if these portions were self-existing. The attempt to describe the real world in complete detail would make physics impossible. Physics is possible only because we are able to make structural assertions about portions of the real world without taking into account the structure of the whole world in all its particulars. Only a few “global structures” of the world as a whole are introduced into the description of the physics of its parts, as, for example, the space-time structure (and gravitation, which for sufficiently small regions of space can be neglected due to the existence of local inertial reference systems). We have made the assumption that the experiments composed of a preparation and a registration procedure can be described as such portions of the world using only space-time as a global structure of the world.

[Footnote 1, on axioms APS 1-7: Here we shall require the use of axiom APS 5 in its strengthened form as presented in III, §1. We shall use the expression “carriers of physical interaction” to denote the set $M$ when we require only APS 5 in the form presented above.]
Is it possible to describe the physical systems as such “self-existing” portions of the world? It would be possible if the physical systems were physical objects in the sense of §1 and III, §4.1. But since the microsystems are not physical objects (see IV, §8.1), we shall try to eliminate “as far as possible” details of the structure of the preparation and registration apparatuses. This problem is treated in III and IV. In the case of a microsystem, this problem is not as simple as in the case of a macrosystem because the latter may be described as a physical object.

The concept of a physical system described above is too general because it is applicable to both macro- and microsystems. The selection procedures in $\mathcal{S}$ describe a completely conventional “classical statistics” which does not exhibit the “typical” quantum mechanical structure. At what point do we make the transition to “quantum statistics,” which appears to be so different? This transition will be made in III, §3, where we introduce axiom AV 4s instead of the postulate that the physical systems are physical objects.

We are able to describe the structure of a physical system only to the extent that the physical system can be prepared and registered. This statement will be formulated in mathematical terms by the following axiom:

APS 8.1. $M = \bigcup_{a \in \mathcal{Q}} a$.

APS 8.2. $M = \bigcup_{b \in \mathcal{R}} b$.

By APS 4.2 we find that APS 8.2 is equivalent to

$$M = \bigcup_{b_0 \in \mathcal{R}_0} b_0.$$

Axiom APS 8 does not describe a profound aspect of reality because the preparation and registration procedures are arbitrary and permit “all possibilities.” The axiom states that every physical system interacts with the outside world at least once (APS 8.1) and again (APS 8.2).
4.5 The Structure of Probability Fields for Physical Systems

In this section we shall derive those theorems which may be obtained from the above axioms (that is, without the axioms in III, §1) in order to reduce the problem of the probability structure for the selection procedures of $\mathcal{S}$ to a manageable subset.

First, we shall consider how $\mathcal{S}$ is obtained from $\Theta$. We shall now consider a finite collection $a_i \in \mathcal{Q}$, $i = 1, \ldots, n$, for which $a_i \subset a \in \mathcal{Q}$, $a_i \cap a_j = \emptyset$ for $i \neq j$, and a corresponding collection $b_k \in \mathcal{R}$, $k = 1, \ldots, m$, for which $b_k \subset b \in \mathcal{R}$, $b_k \cap b_l = \emptyset$ for $k \neq l$. Let $A$ be a subset of the ordered pairs $(i, k)$, $i = 1, \ldots, n$ and $k = 1, \ldots, m$. From $a \cap b \in \Theta$ and Th. 4.3.2 we obtain $a_i \cap b_k \in \Theta$. Furthermore, we find that

$$\bigcup_{(i,k) \in A} a_i \cap b_k \subset a \cap b.$$
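Let $\mathfrak{S}$ denote the set of all $c \subset M$ which may be represented in the form $c = \bigcup_{(i,k) \in A} a_i \cap b_k$ where the $a_i$, $b_k$, $A$ satisfy the above conditions. In addition, let $\mathcal{S}'$ denote the following set:

$$\mathcal{S}' = \Bigl\{c \Bigm| c = \bigcup_i c_i,\ c_i \cap c_k = \emptyset \text{ for } i \neq k,\ c_i \in \Theta \text{ and } c \subset d \in \Theta\Bigr\}. \tag{4.5.1}$$

Then we find that:

Th. 4.5.1. $\mathfrak{S} = \mathcal{S}' = \mathcal{S}$.

PROOF. Since $\mathcal{S}(a \cap b)$ is a Boolean ring, and since $a_i \cap b_k \in \Theta \subset \mathcal{S}$ and $a_i \cap b_k \subset a \cap b \in \Theta$, we find that $c = \bigcup_{(i,k) \in A} a_i \cap b_k \in \mathcal{S}$. Thus we obtain $\mathfrak{S} \subset \mathcal{S}$. We also obtain $\mathfrak{S} = \mathcal{S}$ when we prove that $\mathfrak{S}$ is a system of selection procedures, that is, when we show that $\mathfrak{S}$ satisfies both conditions AS 1.1 and AS 1.2.

From the definitions of $\mathfrak{S}$ and $\mathcal{S}'$, from $a_i \cap b_k \in \Theta$ and $a \cap b \in \Theta$ it follows that $\mathfrak{S} \subset \mathcal{S}'$. Since $\mathcal{S}(d)$ is a Boolean ring, it follows that $\mathcal{S}' \subset \mathcal{S}$. Thus we need only prove that $\mathfrak{S} = \mathcal{S}$.

Let $c = \bigcup_{(i,k) \in A} a_i \cap b_k$ and $\tilde{c} = \bigcup_{(j,l) \in \tilde{A}} \tilde{a}_j \cap \tilde{b}_l$ be two elements of $\mathfrak{S}$ represented in the above form where $i = 1, \ldots, n$; $k = 1, \ldots, m$; $a_i \subset a$, $b_k \subset b$; $j = 1, \ldots, \tilde{n}$; $l = 1, \ldots, \tilde{m}$; $\tilde{a}_j \subset \tilde{a}$, $\tilde{b}_l \subset \tilde{b}$. Then we obtain

$$c \cap \tilde{c} = \bigcup_{\substack{(i,k) \in A \\ (j,l) \in \tilde{A}}} (a_i \cap \tilde{a}_j \cap b_k \cap \tilde{b}_l). \tag{4.5.2}$$

Let $a_{ij} = a_i \cap \tilde{a}_j$, $b_{kl} = b_k \cap \tilde{b}_l$. We find that $a_{ij} \in \mathcal{Q}$, $b_{kl} \in \mathcal{R}$ and $a_{ij} \subset a \cap \tilde{a} \in \mathcal{Q}$, $b_{kl} \subset b \cap \tilde{b} \in \mathcal{R}$. Clearly $a_{ij} \cap a_{i'j'} = \emptyset$ for $(i, j) \neq (i', j')$; a similar result holds for the $b_{kl}$. Thus, from (4.5.2) and Th. 4.3.2, it follows that $c \cap \tilde{c} \in \mathfrak{S}$, that is, $\mathfrak{S}$ satisfies axiom AS 1.2.

In order to show that AS 1.1 is satisfied, we shall assume that $\tilde{c} \subset c$ and compute $c \setminus \tilde{c}$. Since $\tilde{c} \subset c$, we obtain $\tilde{c} = c \cap \tilde{c}$, from which we obtain, using (4.5.2),

$$c \setminus \tilde{c} = \bigcup_{(i,k) \in A} \Bigl[(a_i \cap b_k) \setminus \bigcup_{(j,l) \in \tilde{A}} (a_i \cap \tilde{a}_j \cap b_k \cap \tilde{b}_l)\Bigr]. \tag{4.5.3}$$

Since $\mathcal{Q}$ and $\mathcal{R}$ are selection procedures, we obtain $a_{i,\tilde{n}+1} \in \mathcal{Q}$, $b_{k,\tilde{m}+1} \in \mathcal{R}$, where

$$a_{i,\tilde{n}+1} = a_i \setminus \bigcup_{j=1}^{\tilde{n}} (a_i \cap \tilde{a}_j), \qquad b_{k,\tilde{m}+1} = b_k \setminus \bigcup_{l=1}^{\tilde{m}} (b_k \cap \tilde{b}_l). \tag{4.5.4}$$

In addition, we find that $a_{ij} \subset a$ for $i = 1, \ldots, n$; $j = 1, \ldots, \tilde{n}, \tilde{n}+1$, that $b_{kl} \subset b$ for $k = 1, \ldots, m$; $l = 1, \ldots, \tilde{m}, \tilde{m}+1$, and that for $j = 1, \ldots, \tilde{n}+1$ we obtain

$$a_{ij} \cap a_{i'j'} = \emptyset \text{ for } (i, j) \neq (i', j'), \text{ and a similar result for the } b_{kl}. \tag{4.5.5a}$$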
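From (4.5.4) we conclude that

$$a_i = \bigcup_{j=1}^{\tilde{n}+1} a_{ij}, \qquad b_k = \bigcup_{l=1}^{\tilde{m}+1} b_{kl}$$

and

$$a_i \cap b_k = \bigcup_{j=1}^{\tilde{n}+1} \bigcup_{l=1}^{\tilde{m}+1} a_{ij} \cap b_{kl}. \tag{4.5.5b}$$

Let $B = \{(j, l) \mid j = 1, \ldots, \tilde{n}+1;\ l = 1, \ldots, \tilde{m}+1\}$. From (4.5.3) we obtain

$$c \setminus \tilde{c} = \bigcup_{(i,k) \in A}\ \bigcup_{(j,l) \in B \setminus \tilde{A}} (a_{ij} \cap b_{kl}). \tag{4.5.6}$$

From (4.5.6) together with (4.5.5a) and (4.5.5b) we conclude that $c \setminus \tilde{c} \in \mathfrak{S}$. □

Th. 4.5.2. The function $\lambda_{\mathcal{S}}$ for $\mathcal{S}$ is uniquely determined by $\lambda_{\mathcal{Q}}$ and by the special values $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$ for those $(a, b_0) \in C$, where $C$ is defined by (4.3.1), and $b \in \mathcal{R}$ with $b \subset b_0$.

PROOF. Since $\mathcal{S} = \mathcal{S}'$ (by Th. 4.5.1) we may express the $c$, $\tilde{c}$ in $\lambda_{\mathcal{S}}(c, \tilde{c})$ in the form:

$$c = \bigcup_{i=1}^{n} c_i, \qquad \tilde{c} = \bigcup_{k=1}^{m} \tilde{c}_k, \tag{4.5.7}$$

where $c_i \in \Theta$, $\tilde{c}_k \in \Theta$ and $c_i \cap c_k = \emptyset$ for $i \neq k$, $\tilde{c}_k \cap \tilde{c}_l = \emptyset$ for $k \neq l$, and where $c \subset d \in \Theta$, $\tilde{c} \subset \tilde{d} \in \Theta$. Since $\tilde{c} \subset c \subset d$ we can set $\tilde{d} = d$ without loss of generality. From (4.5.7) and (3.5) we obtain

$$\lambda_{\mathcal{S}}(c, \tilde{c}) = \sum_{k=1}^{m} \lambda_{\mathcal{S}}(c, \tilde{c}_k). \tag{4.5.8}$$

Since $\tilde{c}_k \subset c \subset d \in \Theta$, from AS 2.2 we obtain

$$\lambda_{\mathcal{S}}(d, \tilde{c}_k) = \lambda_{\mathcal{S}}(d, c)\,\lambda_{\mathcal{S}}(c, \tilde{c}_k).$$

Since $c \neq \emptyset$ (otherwise $\lambda_{\mathcal{S}}(c, \ldots)$ would not be defined) we find that, by AS 2.3, $\lambda_{\mathcal{S}}(d, c) \neq 0$ and therefore

$$\lambda_{\mathcal{S}}(c, \tilde{c}_k) = \frac{\lambda_{\mathcal{S}}(d, \tilde{c}_k)}{\lambda_{\mathcal{S}}(d, c)}. \tag{4.5.9}$$

Using (3.5) and (4.5.7) we find that

$$\lambda_{\mathcal{S}}(d, c) = \sum_{i=1}^{n} \lambda_{\mathcal{S}}(d, c_i).$$

Thus (4.5.9) becomes

$$\lambda_{\mathcal{S}}(c, \tilde{c}_k) = \frac{\lambda_{\mathcal{S}}(d, \tilde{c}_k)}{\sum_{i=1}^{n} \lambda_{\mathcal{S}}(d, c_i)}. \tag{4.5.10}$$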
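By substitution of (4.5.10) into (4.5.8) we obtain

$$\lambda_{\mathcal{S}}(c, \tilde{c}) = \frac{\sum_{k=1}^{m} \lambda_{\mathcal{S}}(d, \tilde{c}_k)}{\sum_{i=1}^{n} \lambda_{\mathcal{S}}(d, c_i)}. \tag{4.5.11}$$

Since $d, c_i, \tilde{c}_k \in \Theta$, $\lambda_{\mathcal{S}}$ is determined by the special values $\lambda_{\mathcal{S}}(d, c)$ for $c \subset d$ and $c, d \in \Theta$. Let $d = a \cap b$, $a \in \mathcal{Q}$, $b \in \mathcal{R}$ and $c = a' \cap b'$, $a' \in \mathcal{Q}$, $b' \in \mathcal{R}$. Then if $c \subset d$ we obtain

$$c = (a \cap a') \cap (b \cap b').$$

That is, if $\tilde{a} = a \cap a'$ and $\tilde{b} = b \cap b'$, then $c = \tilde{a} \cap \tilde{b}$ where $\tilde{a} \in \mathcal{Q}$, $\tilde{b} \in \mathcal{R}$ and $\tilde{a} \subset a$, $\tilde{b} \subset b$. According to D 4.3.3 there exists a $b_0 \in \mathcal{R}_0$ for which $b \subset b_0$ and $(a, b_0) \in C$; consequently $a \cap b_0 \in \Theta$ and $\tilde{b} \subset b \subset b_0$. Since $a \cap b \neq \emptyset$ (otherwise $\lambda_{\mathcal{S}}(d, c) = \lambda_{\mathcal{S}}(a \cap b, \tilde{a} \cap \tilde{b})$ would not be defined), $a \cap b_0 \neq \emptyset$ and $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b) \neq 0$. Using the following equation (obtained from AS 2.2)

$$\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)\,\lambda_{\mathcal{S}}(a \cap b, \tilde{a} \cap \tilde{b}) = \lambda_{\mathcal{S}}(a \cap b_0, \tilde{a} \cap \tilde{b})$$

we obtain

$$\lambda_{\mathcal{S}}(a \cap b, \tilde{a} \cap \tilde{b}) = \frac{\lambda_{\mathcal{S}}(a \cap b_0, \tilde{a} \cap \tilde{b})}{\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)}. \tag{4.5.12}$$

Similarly, from AS 2.2 it follows that

$$\lambda_{\mathcal{S}}(a \cap b_0, \tilde{a} \cap \tilde{b}) = \lambda_{\mathcal{S}}(a \cap b_0, \tilde{a} \cap b_0)\,\lambda_{\mathcal{S}}(\tilde{a} \cap b_0, \tilde{a} \cap \tilde{b}) \tag{4.5.13}$$

because $\tilde{a} \cap b_0 \neq \emptyset$ whenever $\tilde{a} \neq \emptyset$ according to D 4.3.1. If $\tilde{a} \neq \emptyset$, then by (4.5.13) and APS 7.1 we obtain

$$\lambda_{\mathcal{S}}(a \cap b_0, \tilde{a} \cap \tilde{b}) = \lambda_{\mathcal{Q}}(a, \tilde{a})\,\lambda_{\mathcal{S}}(\tilde{a} \cap b_0, \tilde{a} \cap \tilde{b});$$

from (4.5.12) we obtain, for $\tilde{a} \neq \emptyset$,

$$\lambda_{\mathcal{S}}(a \cap b, \tilde{a} \cap \tilde{b}) = \frac{\lambda_{\mathcal{Q}}(a, \tilde{a})\,\lambda_{\mathcal{S}}(\tilde{a} \cap b_0, \tilde{a} \cap \tilde{b})}{\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)}. \tag{4.5.14}$$

For the case in which $\tilde{a} = \emptyset$, we obtain $\lambda_{\mathcal{S}}(a \cap b, \tilde{a} \cap \tilde{b}) = 0$. □

According to Th. 4.5.2, in physics it is only necessary to consider those $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$ for which $(a, b_0) \in C$, $b \in \mathcal{R}$, $b \subset b_0$; the role of an experiment is to “measure” these special values of the function $\lambda_{\mathcal{S}}$.

We now define

D 4.5.1. $\mathcal{F} = \{(b_0, b) \mid b_0 \in \mathcal{R}_0,\ b \in \mathcal{R},\ b \subset b_0\}$. $\mathcal{F}$ is called the set of “effect processes” or “questions.”

D 4.5.2. $\mathcal{C} = \{(a, f) \mid f = (b_0, b) \in \mathcal{F} \text{ and } (a, b_0) \in C\}$, where $C$ is defined by (4.3.1).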
D 4.5.3. On $\mathcal{C}$ we define the real function $\mu$ by

$$\mu(a, f) = \mu(a, (b_0, b)) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b).$$

From Th. 4.5.2 it follows that $\lambda_{\mathcal{S}}$ is determined for all of $\mathcal{S}$ by $\lambda_{\mathcal{Q}}$ and the values of $\mu$ on $\mathcal{C}$.

The function $\mu$ defined by D 4.5.3 plays a central role in the statistical description of a physical system. We shall now briefly illustrate the nature of the experiments which may be used to test the function $\mu$. These experiments consist of a preparation procedure $a$ by which a single system $x \in M$ is produced, and upon which a registration method $b_0$ is applied. As the result of the interaction of the system with the apparatus, a process characterized by $b$ may or may not occur. Let $x_1, \ldots, x_N$ denote $N$ systems which are prepared according to $a$ and to which $b_0$ is applied, that is, $x_i \in a \cap b_0$ for $i = 1, \ldots, N$. Of these systems, suppose that $N_+$ systems $x_{i_\nu}$, $\nu = 1, \ldots, N_+$, activate the process characterized by $b$, that is, $x_{i_\nu} \in b$. According to the physical meaning of $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$, if the theory is applicable, we must have approximate agreement of $N_+/N$ with

$$\mu(a, (b_0, b)) = \mu(a, f) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b).$$

This basic experimental situation will be repeatedly encountered in the comparison of quantum theory with experiment. We shall not give an account of the numerous examples here. Instead we shall prove a number of theorems about $\mu(a, f)$.

In order to formulate the following theorems we shall now define

D 4.5.4. $1_{b_0} = (b_0, b_0)$ and $0_{b_0} = (b_0, \emptyset)$; $1_{b_0}$ and $0_{b_0}$ are elements of $\mathcal{F}$.

D 4.5.5. Suppose that $b_0 \in \mathcal{R}_0$ is partitioned by $b_i$, $i = 1, \ldots, n$, that is, $b_0 = \bigcup_{i=1}^{n} b_i$ and $b_i \cap b_j = \emptyset$ for $i \neq j$. We call $f_i = (b_0, b_i)$ a (disjoint) partition of $1_{b_0}$ and write $1_{b_0} = \bigcup_{i=1}^{n} f_i$.

The following two theorems describe the central structural properties of the probability function $\lambda_{\mathcal{S}}$.

Th. 4.5.3. For the function $\mu$ the following conditions are satisfied:

(i) $0 \leq \mu(a, f) \leq 1$.

(ii) For each $a \in \mathcal{Q}'$ there exists an $f_0 \in \mathcal{F}$ for which $\mu(a, f_0) = 0$.

(iii) For each $a \in \mathcal{Q}'$ there exists an $f_1 \in \mathcal{F}$ for which $\mu(a, f_1) = 1$.

(iv) For a decomposition $a = \bigcup_{i=1}^{n} a_i$ (see D 3.2) the following condition holds for all $f$ for which $(a, f) \in \mathcal{C}$:

$$\mu(a, f) = \sum_{i=1}^{n} \lambda_i\,\mu(a_i, f),$$

where the weights $\lambda_i = \lambda_{\mathcal{Q}}(a, a_i)$ are positive, do not exceed 1, and satisfy $\sum_{i=1}^{n} \lambda_i = 1$.
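(v) Let $b_{01}, b_{02} \in \mathcal{R}_0$ and $b \subset b_{02} \subset b_{01}$, $f_1 = (b_{01}, b)$, $f_2 = (b_{02}, b)$. Then the following condition holds for all $a$ for which $(a, f_1) \in \mathcal{C}$:

$$\mu(a, f_1) = \lambda_{\mathcal{R}_0}(b_{01}, b_{02})\,\mu(a, f_2).$$

(vi) Let $(a, b_0) \in C$; if $1_{b_0} = \bigcup_{i=1}^{n} f_i$ (see D 4.5.5) then

$$\sum_{i=1}^{n} \mu(a, f_i) = 1.$$

(vii) If $(a, f) \in \mathcal{C}$ where $f = (b_0, b)$ then $\mu(a, f) = 0$ if and only if $a \cap b = \emptyset$.

(viii) Let $b_0 = \bigcup_{i=1}^{n} b_{0i}$ where $b_{0i} \in \mathcal{R}_0$ and $b_{0i} \cap b_{0k} = \emptyset$ for $i \neq k$ (that is, $\bigcup_{i=1}^{n} b_{0i}$ is a decomposition of $b_0$ in $\mathcal{R}_0$). Then, for $f = (b_0, b)$, $f_i = (b_{0i}, b_{0i} \cap b)$, and for every $a \in \mathcal{Q}$ which satisfies $(a, f) \in \mathcal{C}$ we obtain

$$\mu(a, f) = \sum_{i=1}^{n} \lambda_{\mathcal{R}_0}(b_0, b_{0i})\,\mu(a, f_i) \quad\text{and}\quad \sum_{i=1}^{n} \lambda_{\mathcal{R}_0}(b_0, b_{0i}) = 1.$$

(In this case we say that the $f_i$ describe a decomposition of $f$ which is induced by the decomposition of $b_0$.)

PROOF. According to APS 5.1.1, to each $a \in \mathcal{Q}'$ there is at least one $b_0 \in \mathcal{R}_0$ for which $(a, b_0) \in C$. For such a $b_0$, from (vi) it follows (with $n = 1$ and $f_1 = 1_{b_0}$) that $\mu(a, f_1) = 1$, thus proving (iii); similarly, let $f_0 = 0_{b_0}$ (that is, $b = \emptyset$). From (vii) we obtain (ii). We obtain a proof of (i) from the corresponding property of $\lambda_{\mathcal{S}}$. Thus, it is only necessary to prove (iv)-(viii).

(iv) Since $(a, f) \in \mathcal{C}$ where $f = (b_0, b)$, it follows that $(a, b_0) \in C$. Then by Th. 4.3.1 we obtain $(a_i, b_0) \in C$, that is, $(a_i, f) \in \mathcal{C}$. Thus all the $\mu(a_i, f)$ are defined. Thus, we find that

$$1 = \lambda_{\mathcal{Q}}\Bigl(a, \bigcup_{i=1}^{n} a_i\Bigr) = \sum_{i=1}^{n} \lambda_{\mathcal{Q}}(a, a_i) = \sum_{i=1}^{n} \lambda_i.$$

Since $a \cap b = \bigcup_{i=1}^{n} (a_i \cap b)$ we obtain, using (3.5), AS 2.2, and APS 7.1,

$$\mu(a, f) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = \sum_{i=1}^{n} \lambda_{\mathcal{S}}(a \cap b_0, a_i \cap b) = \sum_{i=1}^{n} \lambda_{\mathcal{S}}(a \cap b_0, a_i \cap b_0)\,\lambda_{\mathcal{S}}(a_i \cap b_0, a_i \cap b) = \sum_{i=1}^{n} \lambda_{\mathcal{Q}}(a, a_i)\,\lambda_{\mathcal{S}}(a_i \cap b_0, a_i \cap b) = \sum_{i=1}^{n} \lambda_i\,\mu(a_i, f).$$

(v) From $(a, f_1) \in \mathcal{C}$ we obtain $(a, b_{01}) \in C$. From Th. 4.3.1 we obtain $(a, b_{02}) \in C$ and therefore $(a, f_2) \in \mathcal{C}$. Thus we obtain, using AS 2.2 and APS 7.2,

$$\mu(a, f_1) = \lambda_{\mathcal{S}}(a \cap b_{01}, a \cap b) = \lambda_{\mathcal{S}}(a \cap b_{01}, a \cap b_{02})\,\lambda_{\mathcal{S}}(a \cap b_{02}, a \cap b) = \lambda_{\mathcal{R}_0}(b_{01}, b_{02})\,\lambda_{\mathcal{S}}(a \cap b_{02}, a \cap b) = \lambda_{\mathcal{R}_0}(b_{01}, b_{02})\,\mu(a, f_2).$$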
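(vi) Let $b_0 = \bigcup_{i=1}^{n} b_i$ be a disjoint partition of $b_0$, and let $(a, b_0) \in C$. Then we find that

$$1 = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_0) = \sum_{i=1}^{n} \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_i) = \sum_{i=1}^{n} \mu(a, f_i).$$

(vii) Let $(a, b_0) \in C$. If $\mu(a, f) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = 0$, then from AS 2.3 it follows that $a \cap b = \emptyset$. If $a \cap b = \emptyset$, then by (3.3) we obtain $\mu(a, f) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = 0$.

(viii) If $(a, b_0) \in C$ we obtain

$$\mu(a, f) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = \lambda_{\mathcal{S}}\Bigl(a \cap b_0, \bigcup_{i=1}^{n} (a \cap b_{0i} \cap b)\Bigr) = \sum_{i=1}^{n} \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_{0i} \cap b).$$

From AS 2.2, Th. 4.3.1 and APS 7.2 we obtain

$$\lambda_{\mathcal{S}}(a \cap b_0, a \cap b_{0i} \cap b) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_{0i})\,\lambda_{\mathcal{S}}(a \cap b_{0i}, a \cap b_{0i} \cap b) = \lambda_{\mathcal{R}_0}(b_0, b_{0i})\,\mu(a, f_i).$$

Since $b_0 = \bigcup_{i=1}^{n} b_{0i}$ we find that

$$1 = \lambda_{\mathcal{R}_0}(b_0, b_0) = \sum_{i=1}^{n} \lambda_{\mathcal{R}_0}(b_0, b_{0i}). \qquad \square$$

It is important now to show that there is a partial converse to Th. 4.5.3.

Th. 4.5.4. Given $\lambda_{\mathcal{Q}}$; suppose there is a function $\mu(a, f)$ on $\mathcal{C}$ which satisfies conditions (i), (iv), (v), (vi), and (vii) of Th. 4.5.3. Then on $\mathcal{S}$ there exists one and only one probability function $\lambda_{\mathcal{S}}$ (that is, a function which satisfies AS 2) for which $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = \mu(a, f)$. This function $\lambda_{\mathcal{S}}$ satisfies the conditions APS 7 and condition (viii) of Th. 4.5.3. Also, $\lambda_{\mathcal{R}_0}$ is uniquely determined by $\mu$.

Proof (Based on a Communication by H. Neumann). We shall use the proof of Th. 4.5.2 in reverse. From the function $\mu\colon \mathcal{C} \to [0, 1]$ we shall seek a function $\lambda\colon P_\Theta \to [0, 1]$, where $P_\Theta = \{(d, c) \mid d, c \in \Theta \text{ where } c \subset d\}$, which is identical to the restriction of $\lambda_{\mathcal{S}}$ to $P_\Theta$. Then with the help of $\mathcal{S}' = \mathcal{S}$ we will obtain the function $\lambda_{\mathcal{S}}\colon P_{\mathcal{S}} \to [0, 1]$ with $P_{\mathcal{S}} = \{(d, c) \mid d, c \in \mathcal{S} \text{ where } c \subset d\}$.

We shall begin with the second step, in order to better establish the conditions which $\lambda\colon P_\Theta \to [0, 1]$ must satisfy in order that it may be extended to a probability function $\lambda_{\mathcal{S}}$. We will show that this extension is possible if $\lambda\colon P_\Theta \to [0, 1]$ satisfies the following three conditions:

($\alpha$) If $c_1, c_2, c_3 \in \Theta$, $c_3 \subset c_2 \subset c_1$, $c_2 \neq \emptyset$ then

$$\lambda(c_1, c_3) = \lambda(c_1, c_2)\,\lambda(c_2, c_3).$$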
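($\beta$) If $c, c_i, \bigcup_{i=1}^{n} c_i \in \Theta$, $c \neq \emptyset$, $\bigcup_{i=1}^{n} c_i \subset c$, $c_i \cap c_j = \emptyset$ for $i \neq j$ then
$$\lambda\Bigl(c, \bigcup_{i=1}^{n} c_i\Bigr) = \sum_{i=1}^{n} \lambda(c, c_i).$$
($\gamma$) If $d, c \in \Theta$, $c \neq \emptyset$ then $\lambda(d, c) \neq 0$.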
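We now seek to define $\lambda_{\mathcal{S}}$ by (4.5.11), that is, by the equation
$$\lambda_{\mathcal{S}}(c, \tilde{c}) = \frac{\sum_{k} \lambda(d, \tilde{c}_k)}{\sum_{i} \lambda(d, c_i)} \tag{4.5.15}$$
where $c \subset d$, $d \in \Theta$ and $c \neq \emptyset$. In order to use (4.5.15) as a definition, it is necessary that $\lambda(d, c_i) \neq 0$ for at least one value of $i$; otherwise, by ($\gamma$) we would have $c_i = \emptyset$ for all $i$, and $c = \emptyset$, in contradiction to the hypothesis that $c \neq \emptyset$.

We shall now show that $\lambda_{\mathcal{S}}$, as defined by (4.5.15), is unique. Suppose that, in addition to the decomposition $c = \bigcup_{i=1}^{n} c_i$ and $\tilde{c} = \bigcup_{k=1}^{m} \tilde{c}_k$, there is another decomposition $c = \bigcup_{j=1}^{n'} c'_j$ and $\tilde{c} = \bigcup_{l=1}^{m'} \tilde{c}'_l$. Then we find that
$$c_i = c \cap c_i = \bigcup_{j=1}^{n'} (c_i \cap c'_j) \quad\text{and}\quad \tilde{c}_k = \tilde{c} \cap \tilde{c}_k = \bigcup_{l=1}^{m'} (\tilde{c}_k \cap \tilde{c}'_l).$$
Then from ($\beta$) we conclude that
$$\lambda(d, c_i) = \sum_{j=1}^{n'} \lambda(d, c_i \cap c'_j), \qquad \lambda(d, \tilde{c}_k) = \sum_{l=1}^{m'} \lambda(d, \tilde{c}_k \cap \tilde{c}'_l)$$
and therefore
$$\lambda_{\mathcal{S}}(c, \tilde{c}) = \frac{\sum_{k,l} \lambda(d, \tilde{c}_k \cap \tilde{c}'_l)}{\sum_{i,j} \lambda(d, c_i \cap c'_j)}.$$
This formula is symmetric in the primed and unprimed quantities; in addition, it yields the same value for $\lambda_{\mathcal{S}}(c, \tilde{c})$ if the primed quantities are used in the right side of equation (4.5.15). Thus we find that (4.5.15) does not depend on the choice of the decomposition of $c$ and $\tilde{c}$.

We must now show that it does not depend on the choice of $d$. Let $c \subset d'$ where $d' \in \Theta$. Then $d' \cap d \in \Theta$ and $c \subset d \cap d'$. By ($\alpha$) we find that
$$\lambda(d, c_i) = \lambda(d, d' \cap d)\,\lambda(d' \cap d, c_i), \qquad \lambda(d, \tilde{c}_k) = \lambda(d, d' \cap d)\,\lambda(d' \cap d, \tilde{c}_k)$$
and, from (4.5.15), we obtain
$$\lambda_{\mathcal{S}}(c, \tilde{c}) = \frac{\sum_{k} \lambda(d' \cap d, \tilde{c}_k)}{\sum_{i} \lambda(d' \cap d, c_i)}.$$
This formula is symmetric in $d$ and $d'$; thus (4.5.15) is independent of the choice of $d$.

We shall now show that the function $\lambda_{\mathcal{S}}$ satisfies AS 2.1-AS 2.3. If $c, \tilde{c} \in \mathcal{S}$, $c \cap \tilde{c} = \emptyset$ and $c_i, \tilde{c}_k \in \Theta$, then $c_i \cap \tilde{c}_k = \emptyset$ in (4.5.7) for all $i, k$. Since $c \cup \tilde{c} \in \mathcal{S}$, there exists a $d \in \Theta$ with $c \cup \tilde{c} \subset d$. Since $c \cup \tilde{c} = \bigl[\bigcup_i c_i\bigr] \cup \bigl[\bigcup_k \tilde{c}_k\bigr]$, then by (4.5.15) we obtain
$$\lambda_{\mathcal{S}}(c \cup \tilde{c}, c) = \frac{\sum_i \lambda(d, c_i)}{\sum_i \lambda(d, c_i) + \sum_k \lambda(d, \tilde{c}_k)} \quad\text{and}\quad \lambda_{\mathcal{S}}(c \cup \tilde{c}, \tilde{c}) = \frac{\sum_k \lambda(d, \tilde{c}_k)}{\sum_i \lambda(d, c_i) + \sum_k \lambda(d, \tilde{c}_k)}.$$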
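Thus we find that $\lambda_{\mathcal{S}}(c \cup \tilde{c}, c) + \lambda_{\mathcal{S}}(c \cup \tilde{c}, \tilde{c}) = 1$, or, in other words, AS 2.1 is satisfied. If $\tilde{\tilde{c}} \subset \tilde{c} \subset c$ and $\tilde{c} \neq \emptyset$, with a decomposition $\tilde{\tilde{c}} = \bigcup_l \tilde{\tilde{c}}_l$, we obtain
$$\lambda_{\mathcal{S}}(c, \tilde{c})\,\lambda_{\mathcal{S}}(\tilde{c}, \tilde{\tilde{c}}) = \frac{\sum_k \lambda(d, \tilde{c}_k)}{\sum_i \lambda(d, c_i)} \cdot \frac{\sum_l \lambda(d, \tilde{\tilde{c}}_l)}{\sum_k \lambda(d, \tilde{c}_k)} = \lambda_{\mathcal{S}}(c, \tilde{\tilde{c}}).$$
Thus we find that AS 2.2 is satisfied. If $\tilde{c} \neq \emptyset$, then at least one of the $\tilde{c}_k \neq \emptyset$; then according to the definition formula (4.5.15), $\lambda_{\mathcal{S}}(c, \tilde{c}) \neq 0$, and AS 2.3 is satisfied.

We note that the above result and Th. 4.5.2 show that a function $\lambda$ on pairs from $\Theta$ can be extended in a unique way to a probability function $\lambda_{\mathcal{S}}$ if the conditions ($\alpha$), ($\beta$), and ($\gamma$) are satisfied for $\lambda$.

It only remains to show that a function $\mu: \mathcal{C} \to [0, 1]$ which satisfies the conditions (i), (iv), (v), (vi), and (vii) can be extended to a function $\lambda$ on pairs from $\Theta$ which satisfies ($\alpha$), ($\beta$), and ($\gamma$) and APS 7. We use the formula (4.5.14), that is,
$$\lambda(a \cap b, \tilde{a} \cap \tilde{b}) = \frac{\lambda_{\mathcal{Q}}(a, \tilde{a})\,\mu(\tilde{a}, \tilde{f})}{\mu(a, f)} \tag{4.5.16}$$
where $a \cap b \neq \emptyset$, $f = (b_0, b)$, $\tilde{f} = (b_0, \tilde{b})$ and $b \subset b_0$, in order to define $\lambda$. In order that (4.5.16) be well defined, it is necessary that $\mu(a, f) \neq 0$. From $\mu(a, f) = 0$ it would follow, by (vii), that $a \cap b = \emptyset$, in contradiction to the hypothesis $a \cap b \neq \emptyset$. In order to apply (4.5.16), $\mu(\tilde{a}, \tilde{f})$ must be defined, that is, $\tilde{a} \subset a$, $\tilde{a} \neq \emptyset$ and $(\tilde{a}, b_0) \in C$. According to D 4.3.3, $b_0 \in \mathcal{R}_0'$ can be chosen so that $(a, b_0) \in C$ because $a \cap b \in \Theta$. Then, by Th. 4.2.1, it follows that $(\tilde{a}, b_0) \in C$.

In order to show that the function defined by (4.5.16) is unique, we shall assume that $a \cap b = a_1 \cap b_1$ and $\tilde{a} \cap \tilde{b} = \tilde{a}_1 \cap \tilde{b}_1$ where $\tilde{a}_1 \subset a_1$, $\tilde{b}_1 \subset b_1$. We must show that
$$\frac{\lambda_{\mathcal{Q}}(a, \tilde{a})\,\mu(\tilde{a}, \tilde{f})}{\mu(a, f)} = \frac{\lambda_{\mathcal{Q}}(a_1, \tilde{a}_1)\,\mu(\tilde{a}_1, \tilde{f}_1)}{\mu(a_1, f_1)} \tag{4.5.17}$$
where $f_1 = (b_{01}, b_1)$ and $\tilde{f}_1 = (b_{01}, \tilde{b}_1)$. If $\tilde{a} \cap \tilde{b} = \tilde{a}_1 \cap \tilde{b}_1 = \emptyset$, then the left side of (4.5.17) is zero because either $\lambda_{\mathcal{Q}}(a, \tilde{a}) = 0$ for $\tilde{a} = \emptyset$ or $\mu(\tilde{a}, \tilde{f}) = 0$ according to (vii); a similar result holds for $\tilde{a}_1$, $\tilde{b}_1$ on the right side. We need, therefore, only consider the case in which $\tilde{a} \cap \tilde{b} = \tilde{a}_1 \cap \tilde{b}_1 \neq \emptyset$. We rewrite the left side of (4.5.17) using (vi).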
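We obtain
$$\mu(\tilde{a}, (b_0, \tilde{b})) = \mu(\tilde{a}, (b_0, \tilde{b} \cap \tilde{b}_1)) + \mu(\tilde{a}, (b_0, \tilde{b} \setminus (\tilde{b} \cap \tilde{b}_1)))$$
and
$$\mu(a, (b_0, b)) = \mu(a, (b_0, b \cap b_1)) + \mu(a, (b_0, b \setminus (b \cap b_1))).$$
Since $a \cap b = a_1 \cap b_1$, we have $a \cap b = a \cap b \cap b_1$, that is, $a \cap (b \setminus (b \cap b_1)) = \emptyset$. By (vii), $\mu(a, (b_0, b \setminus (b \cap b_1))) = 0$ and, in the same way, $\mu(\tilde{a}, (b_0, \tilde{b} \setminus (\tilde{b} \cap \tilde{b}_1))) = 0$. The left-hand side of (4.5.17) is therefore equal to
$$\frac{\lambda_{\mathcal{Q}}(a, \tilde{a})\,\mu(\tilde{a}, (b_0, \tilde{b} \cap \tilde{b}_1))}{\mu(a, (b_0, b \cap b_1))}.$$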
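If $b_0 \cap b_{01} \neq \emptyset$, then by (v) we obtain:
$$\mu(a, (b_0, b \cap b_1)) = \lambda_{\mathcal{R}_0}(b_0, b_0 \cap b_{01})\,\mu(a, (b_0 \cap b_{01}, b \cap b_1)).$$
The condition $b_0 \cap b_{01} \neq \emptyset$ is satisfied because $b_0 \cap b_{01} \supset b \cap b_1 \supset a \cap b \cap a_1 \cap b_1 = a \cap b \neq \emptyset$. From (v) we also conclude that
$$\mu(\tilde{a}, (b_0, \tilde{b} \cap \tilde{b}_1)) = \lambda_{\mathcal{R}_0}(b_0, b_0 \cap b_{01})\,\mu(\tilde{a}, (b_0 \cap b_{01}, \tilde{b} \cap \tilde{b}_1)).$$
Using these results, the left-hand side of (4.5.17) is equal to
$$\frac{\lambda_{\mathcal{Q}}(a, \tilde{a})\,\mu(\tilde{a}, (b_0 \cap b_{01}, \tilde{b} \cap \tilde{b}_1))}{\mu(a, (b_0 \cap b_{01}, b \cap b_1))}.$$
According to (iv), for all $f$ for which $(a, f) \in \mathcal{C}$ we obtain
$$\mu(a, f) = \lambda_{\mathcal{Q}}(a, a \cap a_1)\,\mu(a \cap a_1, f) + \lambda_{\mathcal{Q}}(a, a \setminus a \cap a_1)\,\mu(a \setminus a \cap a_1, f).$$
Since $a \cap b = a_1 \cap b_1$, it follows that $a \cap b \cap b_1 = (a \cap a_1) \cap (b \cap b_1)$, that is, $(a \setminus a \cap a_1) \cap (b \cap b_1) = \emptyset$. Thus, from (vii) we obtain
$$\mu(a, (b_0 \cap b_{01}, b \cap b_1)) = \lambda_{\mathcal{Q}}(a, a \cap a_1)\,\mu(a \cap a_1, (b_0 \cap b_{01}, b \cap b_1))$$
and
$$\mu(\tilde{a}, (b_0 \cap b_{01}, \tilde{b} \cap \tilde{b}_1)) = \lambda_{\mathcal{Q}}(\tilde{a}, \tilde{a} \cap \tilde{a}_1)\,\mu(\tilde{a} \cap \tilde{a}_1, (b_0 \cap b_{01}, \tilde{b} \cap \tilde{b}_1)).$$
Thus we obtain the following expression for the left side of (4.5.17):
$$\frac{\lambda_{\mathcal{Q}}(a, \tilde{a})\,\lambda_{\mathcal{Q}}(\tilde{a}, \tilde{a} \cap \tilde{a}_1)\,\mu(\tilde{a} \cap \tilde{a}_1, (b_0 \cap b_{01}, \tilde{b} \cap \tilde{b}_1))}{\lambda_{\mathcal{Q}}(a, a \cap a_1)\,\mu(a \cap a_1, (b_0 \cap b_{01}, b \cap b_1))}.$$
For $\lambda_{\mathcal{Q}}$ we find that $\lambda_{\mathcal{Q}}(a, \tilde{a})\,\lambda_{\mathcal{Q}}(\tilde{a}, \tilde{a} \cap \tilde{a}_1) = \lambda_{\mathcal{Q}}(a, \tilde{a} \cap \tilde{a}_1)$ and, since $a \cap a_1 \neq \emptyset$, we obtain
$$\lambda_{\mathcal{Q}}(a, \tilde{a} \cap \tilde{a}_1) = \lambda_{\mathcal{Q}}(a, a \cap a_1)\,\lambda_{\mathcal{Q}}(a \cap a_1, \tilde{a} \cap \tilde{a}_1).$$
Thus the left side of (4.5.17) may be rewritten as follows:
$$\frac{\lambda_{\mathcal{Q}}(a \cap a_1, \tilde{a} \cap \tilde{a}_1)\,\mu(\tilde{a} \cap \tilde{a}_1, (b_0 \cap b_{01}, \tilde{b} \cap \tilde{b}_1))}{\mu(a \cap a_1, (b_0 \cap b_{01}, b \cap b_1))}. \tag{4.5.18}$$
The expression (4.5.18) is symmetric under exchange of quantities having index 1 and those without index 1. The right side of (4.5.17) can be transformed in the same way into an expression which is similar to (4.5.18), proving that (4.5.16) is well defined.

It now remains to show that the function $\lambda$ defined on pairs from $\Theta$ satisfies the conditions ($\alpha$), ($\beta$), ($\gamma$) and APS 7.

We shall now show that ($\alpha$) is satisfied: Let $c_1 = a_1 \cap b_1$, $c_2 = a_2 \cap b_2 \neq \emptyset$, $c_3 = a_3 \cap b_3$ where we suppose that $c_3 \subset c_2 \subset c_1$ (and therefore find that, for example, $c_1 \cap c_2 = c_2$; that is, $c_2 = a_2 \cap b_2 = (a_1 \cap a_2) \cap (b_1 \cap b_2)$). We may therefore assume that $a_3 \subset a_2 \subset a_1$ and that $b_3 \subset b_2 \subset b_1$. According to APS 5.1.1, for $a_1$ there exists a $b_0$ such that $(a_1, b_0) \in C$; by Th.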
4.3.1 we also find that $(a_2, b_0) \in C$ and (if $a_3 \neq \emptyset$) $(a_3, b_0) \in C$. If $a_3 = \emptyset$ then $c_3 = \emptyset$ and therefore $\lambda(c_1, c_3) = 0$, $\lambda(c_2, c_3) = 0$ and
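($\alpha$) is satisfied in a trivial way. We shall now assume that $a_3 \neq \emptyset$ and therefore $a_3 \cap b_0 \neq \emptyset$. Since $a_1 \cap b_0 \neq \emptyset$ we may write
$$\lambda(c_1, c_2) = \frac{\lambda_{\mathcal{Q}}(a_1, a_2)\,\mu(a_2, (b_0, b_2))}{\mu(a_1, (b_0, b_1))}.$$
Thus we find that
$$\lambda(c_1, c_2)\,\lambda(c_2, c_3) = \frac{\lambda_{\mathcal{Q}}(a_1, a_2)\,\lambda_{\mathcal{Q}}(a_2, a_3)\,\mu(a_3, (b_0, b_3))}{\mu(a_1, (b_0, b_1))}.$$
From $\lambda_{\mathcal{Q}}(a_1, a_2)\,\lambda_{\mathcal{Q}}(a_2, a_3) = \lambda_{\mathcal{Q}}(a_1, a_3)$ we find that ($\alpha$) is satisfied.

In order to show that ($\beta$) holds, we define $c_i = a_i \cap b_i$ and $\bigcup_{i=1}^{n} c_i = \tilde{c} = \tilde{a} \cap \tilde{b}$. Since $c_i \subset \tilde{c} \subset c = a \cap b$ we may assume $a_i \subset \tilde{a} \subset a$ and $b_i \subset \tilde{b} \subset b$ as above. The $a_i$ generate a finite Boolean ring in $\mathcal{Q}(\tilde{a})$; we denote the atoms of this finite subring by $\tilde{a}_\nu$. Similarly, let $\tilde{b}_\mu$ denote the atoms of the finite subring obtained from $\mathcal{R}(\tilde{b})$ and the $b_i$. Then we find that
$$\tilde{a} = \bigcup_{\nu} \tilde{a}_\nu, \qquad \tilde{b} = \bigcup_{\mu} \tilde{b}_\mu, \qquad a_i = \bigcup_{\nu \in A_i} \tilde{a}_\nu, \qquad b_i = \bigcup_{\mu \in B_i} \tilde{b}_\mu,$$
where $A_i$, $B_i$ are the sets of indices uniquely associated with $a_i$ and $b_i$, respectively. We obtain
$$c_i = a_i \cap b_i = \bigcup_{\nu \in A_i,\ \mu \in B_i} \tilde{a}_\nu \cap \tilde{b}_\mu.$$
From $c_i \cap c_j = \emptyset$ for $i \neq j$ it follows that
$$\bigcup_{\nu \in A_i \cap A_j,\ \mu \in B_i \cap B_j} \tilde{a}_\nu \cap \tilde{b}_\mu = \emptyset,$$
that is, $\tilde{a}_\nu \cap \tilde{b}_\mu = \emptyset$ for $\nu \in A_i \cap A_j$ and $\mu \in B_i \cap B_j$. Let $C_i = A_i \times B_i$; then for all pairs $(\nu, \mu) \in C_i \cap C_j$, we find that
$$\tilde{a}_\nu \cap \tilde{b}_\mu = \emptyset \quad\text{whenever } i \neq j. \tag{4.5.19}$$
We may describe the $c_i$ in terms of the $C_i$ as follows:
$$c_i = \bigcup_{(\nu, \mu) \in C_i} (\tilde{a}_\nu \cap \tilde{b}_\mu).$$
Let $C$ denote the set of all index pairs $(\nu, \mu)$. We obtain
$$\bigcup_{(\nu, \mu) \in \bigcup_i C_i} (\tilde{a}_\nu \cap \tilde{b}_\mu) = \bigcup_i c_i = \tilde{c} = \tilde{a} \cap \tilde{b} = \bigcup_{(\nu, \mu) \in C} (\tilde{a}_\nu \cap \tilde{b}_\mu).$$
Then we find that
$$\tilde{a}_\nu \cap \tilde{b}_\mu = \emptyset \quad\text{for all } (\nu, \mu) \in C \setminus \bigcup_i C_i. \tag{4.5.20}$$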
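Since by APS 5.1.1 there exists a $b_0$ for which $(a, b_0) \in C$, by Th. 4.3.1 we find that $(\tilde{a}_\nu, b_0) \in C$. For this $b_0$ we find that
$$\lambda(c, \tilde{c}) = \frac{\lambda_{\mathcal{Q}}(a, \tilde{a})\,\mu(\tilde{a}, (b_0, \tilde{b}))}{\mu(a, (b_0, b))}.$$
From $b_0 = (b_0 \setminus \tilde{b}) \cup \bigcup_\mu \tilde{b}_\mu$ and $b_0 = (b_0 \setminus \tilde{b}) \cup \tilde{b}$ it follows from (vi) that
$$1 = \mu(\tilde{a}, (b_0, b_0 \setminus \tilde{b})) + \sum_{\mu} \mu(\tilde{a}, (b_0, \tilde{b}_\mu)), \qquad 1 = \mu(\tilde{a}, (b_0, b_0 \setminus \tilde{b})) + \mu(\tilde{a}, (b_0, \tilde{b}))$$
and finally
$$\mu(\tilde{a}, (b_0, \tilde{b})) = \sum_{\mu} \mu(\tilde{a}, (b_0, \tilde{b}_\mu)).$$
From $\tilde{a} = \bigcup_\nu \tilde{a}_\nu$, it follows from (iv) that
$$\mu(\tilde{a}, f) = \sum_{\nu} \lambda_{\mathcal{Q}}(\tilde{a}, \tilde{a}_\nu)\,\mu(\tilde{a}_\nu, f).$$
The preceding result, together with $\lambda_{\mathcal{Q}}(a, \tilde{a})\,\lambda_{\mathcal{Q}}(\tilde{a}, \tilde{a}_\nu) = \lambda_{\mathcal{Q}}(a, \tilde{a}_\nu)$, permits us to conclude that
$$\lambda(c, \tilde{c}) = \sum_{\nu, \mu} \frac{\lambda_{\mathcal{Q}}(a, \tilde{a}_\nu)\,\mu(\tilde{a}_\nu, (b_0, \tilde{b}_\mu))}{\mu(a, (b_0, b))}. \tag{4.5.21}$$
According to (vii), $\mu(\tilde{a}_\nu, (b_0, \tilde{b}_\mu)) = 0$ for $\tilde{a}_\nu \cap \tilde{b}_\mu = \emptyset$. Using (4.5.19) and (4.5.20) we easily obtain
$$\lambda(c, \tilde{c}) = \sum_{i} \sum_{(\nu, \mu) \in C_i} \frac{\lambda_{\mathcal{Q}}(a, \tilde{a}_\nu)\,\mu(\tilde{a}_\nu, (b_0, \tilde{b}_\mu))}{\mu(a, (b_0, b))}.$$
If we show that (noting the definition of $C_i$)
$$\lambda(c, c_i) = \sum_{\nu \in A_i,\ \mu \in B_i} \frac{\lambda_{\mathcal{Q}}(a, \tilde{a}_\nu)\,\mu(\tilde{a}_\nu, (b_0, \tilde{b}_\mu))}{\mu(a, (b_0, b))} \tag{4.5.22}$$
then we have proven ($\beta$). Note that (4.5.22) has the same form as (4.5.21) and is therefore proven.

In order to show that ($\gamma$) holds, let $d = a_1 \cap b_1$ and $c = a_2 \cap b_2$ where we assume that $a_2 \subset a_1$, $b_2 \subset b_1$. From
$$\lambda(d, c) = \frac{\lambda_{\mathcal{Q}}(a_1, a_2)\,\mu(a_2, (b_0, b_2))}{\mu(a_1, (b_0, b_1))}$$
and (vii) it follows that ($\gamma$) is satisfied.

We shall now show that APS 7.1 is satisfied. First, we note that
$$\lambda_{\mathcal{S}}(a_1 \cap b_{10}, a_2 \cap b_{10}) = \frac{\lambda_{\mathcal{Q}}(a_1, a_2)\,\mu(a_2, (b_{10}, b_{10}))}{\mu(a_1, (b_{10}, b_{10}))}.$$
According to (iii) (which follows from (vi)) we obtain $\mu(a_1, (b_{10}, b_{10})) = \mu(a_2, (b_{10}, b_{10})) = 1$.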
Thus we obtain $\lambda_{\mathcal{S}}(a_1 \cap b_{10}, a_2 \cap b_{10}) = \lambda_{\mathcal{Q}}(a_1, a_2)$.

To show that APS 7.2 is satisfied, we note that
$$\lambda_{\mathcal{S}}(a_1 \cap b_{10}, a_1 \cap b_{20}) = \frac{\lambda_{\mathcal{Q}}(a_1, a_1)\,\mu(a_1, (b_{10}, b_{20}))}{\mu(a_1, (b_{10}, b_{10}))} = \mu(a_1, (b_{10}, b_{20})).$$
From (v) we find that
$$\mu(a_1, (b_{10}, b_{20})) = \lambda_{\mathcal{R}_0}(b_{10}, b_{20})\,\mu(a_1, (b_{20}, b_{20})) = \lambda_{\mathcal{R}_0}(b_{10}, b_{20}).$$
Thus we have also shown that $\lambda_{\mathcal{R}_0}$ is determined by $\mu$. On the basis of the uniqueness proven in Th. 4.5.2, it follows from Th. 4.5.3 that the $\lambda_{\mathcal{S}}$ so defined satisfies condition (viii) of Th. 4.5.3.

Theorem 4.5.4 shows that in order to determine $\lambda_{\mathcal{S}}$ it is sufficient to test the function $\mu(a, f)$.

The conditions (i), (vi), and (vii) are justified (in a trivial way) by the meaning of $\mu$ as the mathematical description of a frequency in an experiment, as we have already explained when axiom AS 2 was stated in §3. Condition (iv) has the following intuitive interpretation: the preparation is not "influenced" by the registration (and therefore is one of the conditions defining the experiment). Otherwise, it would not be possible to conduct a test experiment. Condition (v) states that a refinement of the registration method will be statistically independent of the preparation process, and is therefore also a condition which defines the experiment.

Conditions (i), (iv), (v), (vi), and (vii) are, for the most part, "rules for correct experimentation" rather than assertions about the physical system under investigation. The complete "information about the physical systems" is to be found in the function $\mu(a, f)$ over $\mathcal{C}$. This fact will justify our future, almost exclusive, interest in this function. Yet this function $\mu$ also is not independent of some of the properties of the preparation and registration apparatuses. In §4.4 we shall attempt, as far as possible, to eliminate the properties of the apparatuses (that is, the properties of the function $\mu(a, f)$ which depend on the structure of the apparatuses).
We would like to consider the preparation and registration processes as merely an aid to detect the physical systems and investigate their structure. The "separation" of the preparation and registration apparatus on one side, and the physical systems on the other side, is not only nontrivial; in the case of microsystems (that is, quantum mechanics) it is not possible, at least not in the desired or "expected" sense. The clarification of those concepts with which we seek to obtain the largest possible separation will be the goal of III and IV.
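The conditions of Th. 4.5.3 can be checked by hand in a small finite model. The following Python sketch is only an illustration under stated assumptions, not part of the theory: the "microsystems" are hypothetical labelled tuples, the probability function is taken to be the relative frequency of a subset, and the registration method `b0` is assumed to register every prepared system, so that $a \cap b_0 = a$ and the independence requirement APS 7.1 holds trivially. All names (`lam`, `mu`, `make_procedure`) are introduced here for illustration.

```python
from fractions import Fraction


def lam(c, c_sub):
    """Relative frequency of the subset c_sub within the non-empty finite set c."""
    if not c or not c_sub <= c:
        raise ValueError("need a non-empty c and c_sub contained in c")
    return Fraction(len(c_sub), len(c))


def mu(a, f):
    """mu(a, f) = lam(a & b0, a & b) for an effect procedure f = (b0, b)."""
    b0, b = f
    return lam(a & b0, a & b)


def make_procedure(label, n_yes, n_no):
    """Hypothetical preparation procedure: systems tagged with their later outcome."""
    yes = {(label, "yes", k) for k in range(n_yes)}
    no = {(label, "no", k) for k in range(n_no)}
    return frozenset(yes | no)


a1 = make_procedure("A1", 3, 1)   # 4 systems, 3 of them respond "yes"
a2 = make_procedure("A2", 2, 6)   # 8 systems, 2 of them respond "yes"
a = a1 | a2                       # decomposition a = a1 u a2 with a1, a2 disjoint

b0 = a                            # assumption: every prepared system is registered
b_yes = frozenset(x for x in b0 if x[1] == "yes")
b_no = b0 - b_yes
f = (b0, b_yes)

# Condition (iv): mu(a, f) is the weighted sum over the decomposition of a.
mixture_value = lam(a, a1) * mu(a1, f) + lam(a, a2) * mu(a2, f)

# Condition (vi): the values over a partition of b0 add up to one.
partition_sum = mu(a, (b0, b_yes)) + mu(a, (b0, b_no))

# Condition (vii): mu vanishes when a and b do not intersect.
empty_value = mu(a, (b0, frozenset()))

print(mu(a, f), mixture_value, partition_sum, empty_value)
```

In this toy model the weights are $\lambda(a, a_1) = 1/3$ and $\lambda(a, a_2) = 2/3$, so condition (iv) reduces to the elementary identity $\tfrac{1}{3}\cdot\tfrac{3}{4} + \tfrac{2}{3}\cdot\tfrac{1}{4} = \tfrac{5}{12}$.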
CHAPTER III
Ensembles and Effects

Many of the problems and difficulties encountered in the interpretation of quantum mechanics have the following source: the failure to clearly distinguish between a collection of microsystems obtained by means of a preparation procedure and an ensemble, where the latter is represented by a statistical (density) operator. We shall not describe these misunderstandings here. Instead we shall formulate definitions of the concepts of an "ensemble" and an "effect." An effect is often called a "yes-no" measurement. Here we shall not yet make a distinction between an "effect" and a "decision effect"; such a distinction will be made in §3 and §6. In this book we shall show that the experiments which exhibit paradoxes for the usual interpretation of quantum mechanics can be described in a natural manner.

Here we again state that, in the formulation of quantum mechanics which will be presented here, neither the statistical operators (in particular, the projections onto a vector in Hilbert space, often called a state) nor the self-adjoint operators (which describe the so-called "observables") will be used for direct comparison with experience, that is, with experiment. We shall describe the relationship between the mathematical description and experiment exclusively in terms of the preparation and registration procedures and the probability function $\lambda_{\mathcal{S}}$. In the mapping principles (as described in [1], §5 or [2], III, §4) the fundamental sets are $M$, $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$, and the fundamental relations are $\lambda_{\mathcal{S}}(c, d) = \alpha$, $x \in a$ and $x \in b$.

The concepts which will be introduced in III and IV will be derived concepts (see [1], §10). The additional axioms which will be introduced in III and IV are additional laws of nature in the sense of [1], §7.3 concerning $M$, $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$ and $\lambda_{\mathcal{S}}(c, d)$.
In subsequent chapters (for example, in VII) we shall introduce additional relations; these will, however, be related directly to the fundamental sets $M$, $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$. In addition, it is necessary to describe the apparatuses in more detail than is currently possible using only elements of $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$. As we mentioned in II, this problem has not yet been solved, in general, for the case of quantum mechanics. In IV we shall again return to this problem. In IX and XVI we shall make a number of unsystematic special assumptions; in XVII we shall describe a method for the solution of a portion of this problem. A new aspect of this problem will be discussed in [13].

In III and IV we shall redefine the "usual" concepts such as "ensemble," "state," and "observable." These concepts will no longer be dependent on the interpretation problems of quantum mechanics, because we have already stated the relationship between the mathematical description and experiment in II. In III and IV we shall outline (in the sense of [1], §10) certain parts of the reality domain of quantum mechanics and consider problems of the structure of microsystems within this framework.

1 Combinations of Preparation and Registration Methods

We shall now continue the discussion of the combination problem which was begun in II, §4.3. For the case in which $C = \mathcal{Q}' \times \mathcal{R}_0'$, axioms APS 5.1 and APS 5.2 become theorems. From $C = \mathcal{Q}' \times \mathcal{R}_0'$ it follows that $\mu(a, f)$ is defined for all $f \in \mathcal{F}$ and that an equivalence relation $a_1 \sim a_2$ is defined on $\mathcal{Q}'$ by the condition
$$\mu(a_1, f) = \mu(a_2, f) \quad\text{for all } f \in \mathcal{F}. \tag{1.1}$$
In our discussion of the combination problem we shall show that for microsystems, physically realistic assumptions guarantee the existence of an equivalence relation $a_1 \sim a_2$ in $\mathcal{Q}'$. Indeed, this will be proven in Th. 1.2. Assuming that Th. 1.2 holds, we shall introduce the following definition:

D 1.1.
Let $\mathcal{K}$ denote the set of all equivalence classes (with respect to the equivalence relation $a_1 \sim a_2$ in $\mathcal{Q}'$). An element of $\mathcal{K}$ is called an ensemble or state; $\mathcal{K}$ is called the set of ensembles (set of states).

On the basis of the equivalence relation assumed above, we may define a function $\tilde{\mu}(w, f)$, $w \in \mathcal{K}$, as follows:
$$\tilde{\mu}(w, f) = \mu(a, f) \quad\text{where } a \in w.$$
Then $\tilde{\mu}$ is defined for all pairs $(w, f)$ for which there exists an $a \in w$ for which $(a, f) \in \mathcal{C}$.

Suppose that we may justify, on physical grounds, that $\tilde{\mu}$ is defined on all of $\mathcal{K} \times \mathcal{F}$ (that is, axiom APS 5.1.4, which will be introduced below, will be satisfied). Then an equivalence relation $f_1 \sim f_2$ on $\mathcal{F}$ will be defined by: $f_1 \sim f_2$ if and only if $\tilde{\mu}(w, f_1) = \tilde{\mu}(w, f_2)$ for all $w \in \mathcal{K}$.
We now introduce the following definition:

D 1.2. Let $\mathcal{L}$ denote the set of all equivalence classes (with respect to the equivalence relation $f_1 \sim f_2$) from $\mathcal{F}$. An element of $\mathcal{L}$ is called an effect and $\mathcal{L}$ is called the set of effects.

We obtain the following theorem:

Th. 1.1. Let $\tilde{\tilde{\mu}}$ be defined over $\mathcal{K} \times \mathcal{L}$ by $\tilde{\tilde{\mu}}(w, g) = \tilde{\mu}(w, f)$ where $w \in \mathcal{K}$, $g \in \mathcal{L}$, and $f \in g$. Then $\tilde{\tilde{\mu}}$ satisfies the following conditions:
(i) $0 \le \tilde{\tilde{\mu}}(w, g) \le 1$.
(ii) If $\tilde{\tilde{\mu}}(w_1, g) = \tilde{\tilde{\mu}}(w_2, g)$ for all $g \in \mathcal{L}$ then $w_1 = w_2$.
(iii) If $\tilde{\tilde{\mu}}(w, g_1) = \tilde{\tilde{\mu}}(w, g_2)$ for all $w \in \mathcal{K}$ then $g_1 = g_2$.
(iv) There exists a $g_0 \in \mathcal{L}$ such that $\tilde{\tilde{\mu}}(w, g_0) = 0$ for all $w \in \mathcal{K}$.
(v) There exists a $g_1 \in \mathcal{L}$ such that $\tilde{\tilde{\mu}}(w, g_1) = 1$ for all $w \in \mathcal{K}$.

The proof is a simple consequence of II, Th. 4.3.4.

For the following, it is useful to introduce the canonical mappings of $\mathcal{Q}'$ onto $\mathcal{K}$ and $\mathcal{F}$ onto $\mathcal{L}$.

D 1.3. Let $\varphi$ denote the canonical map which maps the elements $a \in \mathcal{Q}'$ to the equivalence classes $w \in \mathcal{K}$ to which $a$ belongs. Let $\psi$ denote the canonical map from $\mathcal{F}$ onto $\mathcal{L}$. Let $f = (b_0, b)$; in addition to $\psi(f)$ we shall also write $\psi(b_0, b)$.

The important concepts "ensemble" and "effect" require the existence of the equivalence relations $a_1 \sim a_2$ in $\mathcal{Q}'$ and $f_1 \sim f_2$ in $\mathcal{F}$. We shall now seek to formulate realistic axioms which will guarantee the existence of such equivalence relations. Here we shall place our emphasis on physical considerations rather than seeking the weakest possible axioms.

The intuitive basis underlying the axioms for $C$ is as follows: In order that a preparation procedure $a$ may be meaningfully combined with a registration method $b_0$ (that is, in order that the microsystem prepared according to the preparation procedure $a$ may be registered according to the method $b_0$) it is necessary that the registration according to $b_0$ occur "after" the preparation. The time-ordering of $b_0$ with respect to $a$ will be discussed in more detail in VII, §1.
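The passage from procedures to ensembles and effects in D 1.1-D 1.3 can be made concrete for a finite table of frequencies. The sketch below is a hedged illustration only: the table `mu`, its labels and its values are hypothetical, and it assumes that $\mu$ is defined for every pair $(a, f)$ (the situation $C = \mathcal{Q}' \times \mathcal{R}_0'$). Rows with identical values are grouped into ensembles, columns with identical values into effects, and the dictionaries `phi` and `psi` play the role of the canonical maps.

```python
from collections import defaultdict
from fractions import Fraction as F

# Hypothetical, fully defined table mu[a][f] of frequencies.
mu = {
    "a1": {"f1": F(1, 2), "f2": F(1, 2), "f3": F(1)},
    "a2": {"f1": F(1, 2), "f2": F(1, 2), "f3": F(1)},
    "a3": {"f1": F(1, 4), "f2": F(1, 4), "f3": F(1)},
}
procedures = sorted(mu)
effect_procedures = sorted(mu["a1"])


def equivalence_classes(items, profile):
    """Group items whose profiles (tuples of mu-values) coincide."""
    groups = defaultdict(list)
    for item in items:
        groups[profile(item)].append(item)
    return sorted(groups.values())


# a1 ~ a2  iff  mu(a1, f) = mu(a2, f) for all f   (ensembles)
ensembles = equivalence_classes(
    procedures, lambda a: tuple(mu[a][f] for f in effect_procedures)
)
# f1 ~ f2  iff  the values agree on every ensemble, hence on every a   (effects)
effects = equivalence_classes(
    effect_procedures, lambda f: tuple(mu[a][f] for a in procedures)
)

# Canonical maps: procedure -> index of its class.
phi = {a: i for i, cls in enumerate(ensembles) for a in cls}
psi = {f: j for j, cls in enumerate(effects) for f in cls}

# The reduced function on (ensemble, effect) pairs; the assert checks that it
# does not depend on the representatives chosen.
reduced = {}
for a in procedures:
    for f in effect_procedures:
        key = (phi[a], psi[f])
        assert reduced.setdefault(key, mu[a][f]) == mu[a][f]

print(ensembles, effects, reduced)
```

Here `a1` and `a2` are distinct procedures that define the same ensemble, and the reduced table separates distinct ensembles, in line with conditions (ii) and (iii) of Th. 1.1.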
Here we note that the question whether $a \in \mathcal{Q}$ may be combined with $b_0 \in \mathcal{R}_0$ is not only a question about the physical possibility to position the apparatus for $a$ and $b_0$, but is also a stipulation that an element $x \in a$ is also an element of $b_0$ only if the registration of $x$ occurs after $x$ is prepared (see [2], XVI). Only in this way will the transformations which will be introduced in VII, §1 have a simple structure.

We see, therefore, that the equivalence relation $a_1 \sim a_2$ for preparation procedures depends on our stipulation concerning the possible combination of the preparation with registration procedures. $a_1 \sim a_2$ will be satisfied if
and only if the effect procedures which may be combined with $a_1$ and $a_2$ lead to the same frequencies. Thus the resulting equivalence relation depends strongly on the conditions imposed on the set $C$.

Let $\mathcal{R}'_{0T}$ denote the set of registration methods $b_0 \neq \emptyset$ which begin "later" than time $T$. Then we expect that, to each $a \in \mathcal{Q}'$, there exists a $T$ such that $(a, b_0) \in C$ for all $b_0 \in \mathcal{R}'_{0T}$. Obviously, the set of $\mathcal{R}'_{0T}$ for increasing $T$ is a (lower) directed set. We shall now formulate the following axiom:

APS 5.1.2. There exists a directed set (in the sense of inclusion) $\Gamma \subset \mathcal{P}(\mathcal{R}_0')$ where $\emptyset \notin \Gamma$, so that to each $a \in \mathcal{Q}'$ there exists at least one element $\mathcal{R}'_{0\alpha} \in \Gamma$ such that $(a, b_0) \in C$ for all $b_0 \in \mathcal{R}'_{0\alpha}$.

This axiom is, by itself, too weak. From the directedness of $\Gamma$ it follows that, to every finite set $a_1, a_2, \ldots, a_n \in \mathcal{Q}'$ there exists a common element $\mathcal{R}'_{0\beta} \in \Gamma$ for which $(a_i, b_0) \in C$ for $i = 1, \ldots, n$ and all $b_0 \in \mathcal{R}'_{0\beta}$. It remains an open question whether each element (for example, $\mathcal{R}'_{0\beta}$) contains "sufficiently many" registration methods in order to sufficiently test a preparation procedure.

If we consider the set $\mathcal{R}'_{0T}$ described above, our intuition may suggest that it should be possible to test $a$ completely by means of all $b_0$ from $\mathcal{R}'_{0T}$, that is, if $\mu(a_1, (b_0, b)) = \mu(a_2, (b_0, b))$ for all $b_0 \in \mathcal{R}'_{0T}$ and $b \subset b_0$, then it follows that $\mu(a_1, f) = \mu(a_2, f)$ for all $f \in \mathcal{F}$, providing that both $\mu(a_1, f)$ and $\mu(a_2, f)$ are defined (that is, $(a_1, f) \in \mathcal{C}$ and $(a_2, f) \in \mathcal{C}$). For certain macrosystems this assumption will prove to be false (see below). For microsystems and certain "classical" macrosystems (systems of mass-points undergoing conservative forces) the above assumption is successful. We shall now formulate a new axiom (which will be stronger than APS 5.1.2):

APS 5.1.3. Let $\Gamma$ be a set satisfying APS 5.1.2. Suppose that the following condition is also satisfied: For $a_1, a_2 \in \mathcal{Q}'$ let $\mathcal{R}'_{0\beta}$ be an element of $\Gamma$ where $(a_1, b_0), (a_2, b_0) \in C$ for all $b_0 \in \mathcal{R}'_{0\beta}$.
Suppose that $\mu(a_1, (b_0, b)) = \mu(a_2, (b_0, b))$ is satisfied for all $b_0 \in \mathcal{R}'_{0\beta}$ and all $b \subset b_0$. Then we require that $\mu(a_1, f) = \mu(a_2, f)$ for all $f \in \mathcal{F}$ for which $\mu(a_1, f)$ and $\mu(a_2, f)$ are defined.

Axioms APS 5.1.1-APS 5.1.3 are not mutually independent. We shall now state the relationships among these axioms: If we assert APS 5.1.2 then we may prove APS 5.1.1 (given in II, §4.3). APS 5.1.3 is clearly a stronger condition than APS 5.1.2. If we assume APS 5.1.3 then APS 5.1.1 and APS 5.1.2 can be proven, and are therefore superfluous.

From APS 5.1.3 we now obtain the following theorem:
Th. 1.2. Let us define the relation $a_1 \sim a_2$ as follows: If $\mu(a_1, f) = \mu(a_2, f)$ for all $f \in \mathcal{F}$ for which $\mu(a_1, f)$ and $\mu(a_2, f)$ are defined, then $a_1 \sim a_2$. The relation $a_1 \sim a_2$ is an equivalence relation.

Proof. We need only show that if $a_1 \sim a_2$, $a_2 \sim a_3$ then $a_1 \sim a_3$. According to APS 5.1.2, the directed set $\Gamma$ contains an element $\mathcal{R}'_{0\beta}$ with $(a_1, b_0), (a_2, b_0), (a_3, b_0) \in C$ for all $b_0 \in \mathcal{R}'_{0\beta}$. From $a_1 \sim a_2$, $a_2 \sim a_3$ we obtain
$$\mu(a_1, (b_0, b)) = \mu(a_2, (b_0, b)) = \mu(a_3, (b_0, b))$$
for all $b \subset b_0$ and $b_0 \in \mathcal{R}'_{0\beta}$. From $\mu(a_1, (b_0, b)) = \mu(a_3, (b_0, b))$ for all $b \subset b_0 \in \mathcal{R}'_{0\beta}$ it follows, according to APS 5.1.3, that $\mu(a_1, f) = \mu(a_3, f)$ for all $f \in \mathcal{F}$ for which $\mu(a_1, f)$ and $\mu(a_3, f)$ are defined. Thus we have shown that $a_1 \sim a_3$.

On the basis of Th. 1.2, for $w \in \mathcal{K}$ (where $\mathcal{K}$ is defined by D 1.1) we may define a function $\tilde{\mu}$ on a subset of $\mathcal{K} \times \mathcal{F}$:
$$\tilde{\mu}(w, f) = \mu(a, f) \quad\text{where } a \in w.$$
Thus $\tilde{\mu}$ is defined for all pairs $(w, f)$ for which there is an $a \in w$ for which $(a, f) \in \mathcal{C}$.

The "intuitive" considerations which have led to the formulation of axiom APS 5.1.3 have now led us to make additional assertions about the set $C$. We have seen that, for each $a$, there is a subset $\mathcal{R}'_{0T}$, the elements of which may be combined with $a$. If we now consider the various $a$ which belong to the same equivalence class, then, given an $a_1 \in w$ which can be combined with all $b_0 \in \mathcal{R}'_{0T_1}$ (that is, the preparation of microsystems is excluded after $T_1$), it may be possible to find an $a_2 \in w$ which excludes preparations after an earlier time $T_2$ ($T_2 < T_1$) but which, for all effects $(b_0, b)$ where $b_0 \in \mathcal{R}'_{0T_1}$, leads to the same frequencies as $a_1$. We would like to assert that for each $w$ and each time $T$ it is possible to find an $a \in w$ which may be combined with the registration methods $b_0 \in \mathcal{R}'_{0T}$. Together with $\bigcup_T \mathcal{R}'_{0T} = \mathcal{R}_0'$ (where the union is taken over all time), we come to the postulate:

APS 5.1.4. To each $b_0 \in \mathcal{R}_0'$, in each class $w \in \mathcal{K}$ there exists an $a \in w$ such that $(a, b_0) \in C$.
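To each pair $a \in \mathcal{Q}'$, $f \in \mathcal{F}$ there exists an $f' \in \mathcal{F}$ such that $(a, f') \in \mathcal{C}$ and $\tilde{\mu}(w, f) = \tilde{\mu}(w, f')$ for all $w \in \mathcal{K}$ for which $\tilde{\mu}(w, f)$ and $\tilde{\mu}(w, f')$ are defined.

Th. 1.3. The function $\tilde{\mu}$ defined above is defined on $\mathcal{K} \times \mathcal{F}$.

Proof. We need to show that, to each $f \in \mathcal{F}$ and $w \in \mathcal{K}$ there is an $a \in w$ for which $(a, f) \in \mathcal{C}$. This is, however, guaranteed by APS 5.1.4 since, if $f = (b_0, b)$, there is an $a \in w$ with $(a, b_0) \in C$, that is, $(a, f) \in \mathcal{C}$.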
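The reason Th. 1.2 needs an axiom at all is that "equal wherever both values are defined" is not automatically transitive when $\mu$ is only partially defined. The following Python sketch illustrates this with a deliberately artificial table; the labels and values are hypothetical, and the table is not a model of the axioms: it has no common set of registration methods on which all three procedures can be tested, which is exactly what APS 5.1.2 and APS 5.1.3 are designed to provide.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical partial table: mu is defined only on the listed pairs (a, f).
mu = {
    ("a1", "f1"): F(1, 2), ("a2", "f1"): F(1, 2),
    ("a2", "f2"): F(1, 3), ("a3", "f2"): F(1, 3),
    ("a1", "f3"): F(1, 4), ("a3", "f3"): F(3, 4),
}
procedures = ["a1", "a2", "a3"]


def defined_for(a):
    """Effect procedures f for which mu(a, f) is defined."""
    return {f for (x, f) in mu if x == a}


def agree(x, y):
    """x ~ y in the sense of Th. 1.2: equal values wherever both are defined."""
    common = defined_for(x) & defined_for(y)
    return all(mu[x, f] == mu[y, f] for f in common)


# Triples (x, y, z) with x ~ y and y ~ z but not x ~ z.
violations = [
    (x, y, z)
    for x, y, z in product(procedures, repeat=3)
    if agree(x, y) and agree(y, z) and not agree(x, z)
]
print(violations)
```

In this table `a1` and `a3` can only be compared through `f3`, where they differ, while each of them agrees with `a2` on the single effect procedure they share with it. Requiring a common, sufficiently rich testing set rules out such configurations.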
We now obtain the following theorem:

Th. 1.4. The relation $f_1 \sim f_2$ defined by the condition $\tilde{\mu}(w, f_1) = \tilde{\mu}(w, f_2)$ for all $w \in \mathcal{K}$ is an equivalence relation. To each pair $a \in \mathcal{Q}'$, $f \in \mathcal{F}$ there exists an $f' \in \mathcal{F}$ such that $(a, f') \in \mathcal{C}$ and $f' \sim f$.

Thus, from D 1.2 we have finally proven Th. 1.1.

In the following (up to and including VI) we shall not consider the precise form of axioms APS 5.1.3 and APS 5.1.4. We shall, however, make use of the equivalence classes of $\mathcal{Q}'$ and $\mathcal{F}$, and the mappings $\varphi$ and $\psi$ defined by D 1.3.

The special form of the axioms APS 5.1.3 and APS 5.1.4 has stood the test for the case of microsystems; the considerations presented in VII, §1 depend decisively on these axioms. The situation is completely different for macrosystems which undergo irreversible processes.

Let us select a reference time (which we denote by $t = 0$), and consider only those preparation procedures $a \in \mathcal{Q}$ which are completed before $t = 0$, and the registration procedures which begin after $t = 0$. Then, instead of APS 5.1.3 and APS 5.1.4 we may use the simpler axiom $C = \mathcal{Q}' \times \mathcal{R}_0'$. This is achieved at the expense that the considerations of VII, §1 are no longer applicable, since we may consider only those time translations $t \to t + \tau$ for which $\tau > 0$: a semi-group of time translations!

It is impossible to apply axioms APS 5.1.3 and APS 5.1.4 to irreversible macrosystems because axiom APS 5.1.3 is stronger than APS 5.1.2. The stronger axiom APS 5.1.3 appears to contradict our experience with irreversible macrosystems because the ability to distinguish between different $a_1, a_2$ by means of the elements $(b_0, b)$ where $b_0 \in \mathcal{R}'_{0T}$ is reduced as $T$ increases. In particular, if $T$ is sufficiently large, the condition $\mu(a_1, (b_0, b)) = \mu(a_2, (b_0, b))$ for all $(b_0, b)$ for which $b_0 \in \mathcal{R}'_{0T}$ does not imply that $\mu(a_1, f) = \mu(a_2, f)$ for all $f \in \mathcal{F}$, even where $\mu(a_1, f)$ and $\mu(a_2, f)$ are defined.

We have noted that APS 5.1.3 cannot be used for irreversible macrosystems.
Its usefulness for microsystems, especially for the structures defined in VII, §1, is not self-evident! The distinction between macrosystems and microsystems must be carefully examined if we are to embed a theory of macrosystems into many-body quantum theory. As a result of the above considerations (macrosystems may be prepared only before $t = 0$), the set $\mathcal{Q}_m$ of the preparation procedures for macrosystems can only be a small subset of the set $\mathcal{Q}$ of preparation procedures for a many-body quantum theory; a similar situation holds also for $\mathcal{R}_{0m}$ and $\mathcal{R}_0$ (see [2], XV and [13]).

The explicit form of the axioms (either APS 5.1.3 and APS 5.1.4, or $C = \mathcal{Q}' \times \mathcal{R}_0'$) will not be important in the following, from here to VI inclusive. Here it is important to state that this choice permits us to introduce the concepts of an ensemble and an effect.

In the transition from $\mathcal{Q}'$ to $\mathcal{K}$ and from $\mathcal{F}$ to $\mathcal{L}$ we lose (as desired) the special structure of the apparatuses as characterized by $a \in \mathcal{Q}'$ or $b_0 \in \mathcal{R}_0'$. Ensembles and effects then appear to depend more on the systems themselves than upon the preparation and registration apparatus. In
this respect, axioms APS 5.1.3 and APS 5.1.4 permit, at least, a partial "separation" of the system from the structure of the apparatuses in the sense of our intention which was expressed in II, §4.4 and at the end of II, §4.5.

There remain two problems in the separation described above in defining the concepts of ensemble and effect:
(1) In the transition from $\mathcal{Q}'$ to $\mathcal{K}$ and from $\mathcal{F}$ to $\mathcal{L}$ we must ask whether we have lost too much of the structure of the system itself.
(2) On the other hand, we must ask whether we have not included more of the apparatus structure than is necessary.

An ensemble $w$ is characterized only by its statistical distribution. Therefore it is often said that two sets of systems $a_1, a_2$ for which $\varphi(a_1) = \varphi(a_2) = w$ (that is, $a_1 \sim a_2$) are, in reality, equal and differ only in the details of the construction of the two different preparation apparatuses. In other words, the two sets $a_1, a_2$ give the same results for all experiments (not only registration). In IV, §5 and §6 and in XVII, §4.4 we shall find that this statement is, in general, not true for microsystems. We find that much too much is lost in the transition from $\mathcal{Q}'$ to $\mathcal{K}$ and from $\mathcal{F}$ to $\mathcal{L}$. In IV we shall seek to recover the loss by introducing the concepts of observable and preparator.

On the other hand, the notion of an effect may perhaps contain too much of the structure of the registration procedure; in general, an effect may also contain the probability that the associated effect process reacts upon a "property" of the microsystem. We would, of course, like to eliminate the "bad" registrations and keep the "good" ones, those which exhibit the "real" properties of the system. Is such a distinction possible?
A search to find a description of the microsystems in themselves, going beyond the concepts of ensembles and effects, may proceed in two different directions:

First, by considering the properties or pseudoproperties of systems (see §4), or

Second, by providing a very precise analysis of what we seek to describe by the concepts of observable and preparator (see VI). There we seek to identify those observables which precisely describe the physical systems, and only secondarily describe (if this is inevitable) the measurement (registration) apparatus. We also wish to identify those preparators which describe the structure of the prepared physical systems rather than the structure of the preparation apparatus.

2 Mixtures and Decompositions of Ensembles and Effects

The concept of an "ensemble" (or "state") is uniquely defined by D 1.1 and refers to a number of relatively trivial relationships. The usual intuitive notion of an ensemble (state) does not agree in all details with that defined by D 1.1.
An ensemble $w \in \mathcal{K}$ is not a set of microsystems because $w$ is not a subset of $M$; $w$ is a subset of $\mathcal{P}(M)$, that is, a class of subsets of $M$. The following fact is of crucial importance in quantum mechanics (but not limited to quantum mechanics): To each class $w$ there is more than one $a \in w$ (see IV, §5 and §6 for details). The substitution of the statement $a \in \mathcal{Q}'$ for the statement $w \in \mathcal{K}$ will quickly result in errors in both logic and intuition. Such errors may be avoided by using the mathematical structure associated with $M$, $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$ in order to describe microsystems.

For quantum mechanics it is important that, if $\varphi$ is the map defined in D 1.3, then the condition $\varphi(a_1) = \varphi(a_2)$ does not imply $a_1 = a_2$.

In the following we shall denote the function $\tilde{\tilde{\mu}}: \mathcal{K} \times \mathcal{L} \to \mathbb{R}$ obtained in Th. 1.1 by $\mu$ (the same symbol used for the corresponding function on $\mathcal{C}$); it will be clear which function is meant by the arguments. Thus we obtain
$$\mu(a, f) = \mu(\varphi(a), \psi(f)). \tag{2.1}$$
From II, Th. 4.3.4(iv) we obtain the following important theorem on the "decomposition of ensembles."

Th. 2.1. Let $a = \bigcup_{i=1}^{n} a_i$ be a decomposition of the preparation procedure $a$. Then for all $g \in \mathcal{L}$ we obtain
$$\mu(\varphi(a), g) = \sum_{i=1}^{n} \lambda_i\,\mu(\varphi(a_i), g),$$
where $\lambda_i = \lambda_{\mathcal{Q}}(a, a_i)$, $0 < \lambda_i \le 1$ and $\sum_{i=1}^{n} \lambda_i = 1$.

Proof. According to II, Th. 4.3.4(iv), we find that for $\lambda_i = \lambda_{\mathcal{Q}}(a, a_i)$,
$$\mu(a, f) = \sum_{i=1}^{n} \lambda_i\,\mu(a_i, f)$$
for all $f \in \mathcal{F}$ for which $(a, f) \in \mathcal{C}$. Therefore, from Th. 1.3 we obtain
$$\tilde{\mu}(\varphi(a), f) = \sum_{i=1}^{n} \lambda_i\,\tilde{\mu}(\varphi(a_i), f)$$
for all $f \in \mathcal{F}$. Our assertion follows directly from Th. 1.1.

D 2.1. Suppose to $w \in \mathcal{K}$ there is a set of real numbers $\lambda_i$, $0 < \lambda_i \le 1$, $i = 1, \ldots, n$, where $\sum_{i=1}^{n} \lambda_i = 1$, and a set $w_i \in \mathcal{K}$, $i = 1, \ldots, n$, for which the following condition is satisfied for all $g \in \mathcal{L}$:
$$\mu(w, g) = \sum_{i=1}^{n} \lambda_i\,\mu(w_i, g). \tag{2.2}$$
Then (2.2) is called a decomposition of $w$ according to the $w_i$ with weights $\lambda_i$.

We shall now introduce the notion of "mixing" of preparation procedures, a notion which is the inverse of decomposition.
We shall describe how we may construct such a mixture $w$ using the apparatuses for the selection of the components $w_i$.
We may build a new apparatus $A$ using the apparatus $A_1$ corresponding to the preparation procedure $a_1$ and the apparatus $A_2$ corresponding to $a_2$ in the following way. Suppose we have an apparatus $B$ which randomly generates two states $(+)$ and $(-)$, where $(+)$ occurs with frequency $\alpha$ and $(-)$ with frequency $1 - \alpha$. The arrangement of $B$, $A_1$, and $A_2$ is such that, upon the occurrence of $(+)$ the apparatus $A_1$ is used, and upon the occurrence of $(-)$ the apparatus $A_2$ is used. Note that $B$ is also a part of the "large" apparatus $A$. If $(+)$ occurs in the apparatus $A$ (that is, in its part $B$), a preparation procedure $a'_1 \subset a$ is determined for which $\lambda_{\mathcal{Q}}(a, a'_1) = \alpha$. Then $a'_1$ is a "finer" preparation procedure than $a$, and is selected from $a$ only if $(+)$ occurs. Similarly, if $(-)$ occurs, a preparation procedure $a'_2$ occurs and $\lambda_{\mathcal{Q}}(a, a'_2) = 1 - \alpha$.

We may be tempted to set $a_1 = a'_1$ and $a_2 = a'_2$. If we do so, we make the following error: the preparation procedures $a_1$ and $a_2$ are obtained from the use of $A_1$ and $A_2$ independent of the random generator $B$ which controls which of $A_1$ or $A_2$ in $A$ is used for the preparation.

From the construction of the apparatus $A$ we find that $a = a'_1 \cup a'_2$ and $a'_1 \cap a'_2 = \emptyset$; that is, $a$ is a mixture of $a'_1$ and $a'_2$ with weights $\lambda_{\mathcal{Q}}(a, a'_1) = \alpha$ and $\lambda_{\mathcal{Q}}(a, a'_2) = 1 - \alpha$. Since the selection according to $a'_1$ and $a_1$ is obtained with the same apparatus $A_1$ (where in the case of $a'_1$ the apparatus $A_1$ is only a part of the total apparatus $A$, but is otherwise unchanged) we expect that $\mathcal{Q}(a_1) = \{\tilde{a} \mid \tilde{a} \in \mathcal{Q},\ \tilde{a} \subset a_1\}$ is isomorphic to $\mathcal{Q}(a'_1) = \{a' \mid a' \in \mathcal{Q},\ a' \subset a'_1\}$. That is, there exists an isomorphism $i$ of the Boolean ring $\mathcal{Q}(a_1)$ onto $\mathcal{Q}(a'_1)$ for which $\varphi(i\tilde{a}) = \varphi(\tilde{a})$ and $\lambda_{\mathcal{Q}}(a_1, \tilde{a}) = \lambda_{\mathcal{Q}}(i a_1, i\tilde{a}) = \lambda_{\mathcal{Q}}(a'_1, i\tilde{a})$. Intuitively this isomorphism expresses the fact that $a_1$ and $a'_1$ have the same "structure type."

With the above motivation, we now introduce the following definitions:

D 2.2.
Two preparation procedures $a$ and $\tilde{a}$ are said to be isomorphic if there is an isomorphism $i$ between the Boolean rings $\mathcal{Q}(a)$ and $\mathcal{Q}(\tilde{a})$ for which

$$\varphi(ia') = \varphi(a') \quad\text{and}\quad \lambda_{\mathcal{Q}}(a, a') = \lambda_{\mathcal{Q}}(\tilde{a}, ia') \tag{2.3}$$

and if $(a, b_0) \in C$ is equivalent to $(\tilde{a}, b_0) \in C$.

D 2.3. A preparation procedure $a$ is called a direct mixture of $a_1$ and $a_2$ if there are preparation procedures $a_1'$, $a_2'$, isomorphic to $a_1$ and $a_2$ respectively, for which $a_1' \cap a_2' = \emptyset$ and $a = a_1' \cup a_2'$. $\alpha = \lambda_{\mathcal{Q}}(a, a_1')$ and $1 - \alpha = \lambda_{\mathcal{Q}}(a, a_2')$ are called the weights of $a_1$ and $a_2$ in the direct mixture $a$.

If $\varphi(a_1) \neq \varphi(a_2)$ then the weights $\alpha$ and $1 - \alpha$ in the direct mixture are uniquely determined by $\varphi(a_1)$ and $\varphi(a_2)$. According to II, Th. 4.3(iv) we obtain
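From D 2.3 we obtain

$$\mu(\varphi(a), g) = \alpha\,\mu(\varphi(a_1), g) + (1 - \alpha)\,\mu(\varphi(a_2), g). \tag{2.4}$$

Since $\varphi(a_1) \neq \varphi(a_2)$, there exists a $g$ such that $\mu(\varphi(a_1), g) \neq \mu(\varphi(a_2), g)$. From (2.4) it follows that

$$\alpha = \frac{\mu(\varphi(a), g) - \mu(\varphi(a_2), g)}{\mu(\varphi(a_1), g) - \mu(\varphi(a_2), g)}.$$

The experimental arrangement described above leads us to introduce the following axiom:

AP 1. To each $a_1, a_2 \in \mathcal{Q}'$ and to each rational number $\alpha$, $0 < \alpha < 1$, there is a direct mixture $a \in \mathcal{Q}'$ of $a_1$ and $a_2$ with weight $\alpha$ of $a_1$ in $a$.

From AP 1 we obtain the following theorem:

Th. 2.2. Let $w_i \in \mathcal{K}$, and let $\lambda_i$, $i = 1, \ldots, n$, be rational numbers where $0 < \lambda_i$, $\sum_{i=1}^{n} \lambda_i = 1$. Then there exists an $a \in \mathcal{Q}'$ and a decomposition $a = \bigcup_{i=1}^{n} a_i$, $\varphi(a_i) = w_i$ and $\lambda_{\mathcal{Q}}(a, a_i) = \lambda_i$, for which

$$\mu(w, g) = \sum_{i=1}^{n} \lambda_i\,\mu(w_i, g) \quad\text{for all } g \in \mathcal{L},$$

where $w = \varphi(a)$.

Proof. We use induction on $n$. For a set of $n + 1$ rational numbers $\lambda_1, \ldots, \lambda_{n+1}$ we consider the set of $n$ rational numbers

$$\alpha_i = \lambda_i \Big/ \sum_{k=1}^{n} \lambda_k \quad (i = 1, \ldots, n).$$

According to the induction hypothesis, there exists an $\bar{a} \in \mathcal{Q}'$ with a decomposition $\bar{a} = \bigcup_{i=1}^{n} \bar{a}_i$ with $\varphi(\bar{a}_i) = w_i$ and $\lambda_{\mathcal{Q}}(\bar{a}, \bar{a}_i) = \alpha_i$. Suppose that for $w_{n+1}$ there exists an $\bar{a}_{n+1}$ with $\varphi(\bar{a}_{n+1}) = w_{n+1}$. According to AP 1, for $\bar{a}$, $\bar{a}_{n+1}$ there exists a direct mixture $a$ of $\bar{a}$ and $\bar{a}_{n+1}$ with weight $\lambda_{n+1}$ of $\bar{a}_{n+1}$ in $a$. By D 2.3 there is an $\tilde{a}$ and an $a_{n+1}$ for which

$$a = \tilde{a} \cup a_{n+1}, \quad \tilde{a} \cap a_{n+1} = \emptyset, \quad \lambda_{\mathcal{Q}}(a, \tilde{a}) = 1 - \lambda_{n+1}, \quad \lambda_{\mathcal{Q}}(a, a_{n+1}) = \lambda_{n+1},$$

where $\tilde{a}$ is isomorphic to $\bar{a}$ and $a_{n+1}$ is isomorphic to $\bar{a}_{n+1}$. From the decomposition $\bar{a} = \bigcup_{i=1}^{n} \bar{a}_i$ and the fact that $\tilde{a}$ is isomorphic to $\bar{a}$, it follows that there is a decomposition $\tilde{a} = \bigcup_{i=1}^{n} a_i$ isomorphic to $\bar{a} = \bigcup_{i=1}^{n} \bar{a}_i$. From $a = \tilde{a} \cup a_{n+1}$, $\tilde{a} \cap a_{n+1} = \emptyset$ it follows that $a = \bigcup_{i=1}^{n+1} a_i$ is a decomposition of $a$. The weights of the $a_i$ are, for $i \leq n$:

$$\lambda_{\mathcal{Q}}(a, a_i) = \lambda_{\mathcal{Q}}(a, \tilde{a})\,\lambda_{\mathcal{Q}}(\tilde{a}, a_i) = (1 - \lambda_{n+1})\,\lambda_{\mathcal{Q}}(\bar{a}, \bar{a}_i) = (1 - \lambda_{n+1})\,\alpha_i = \lambda_i,$$

where the relation (2.3) was used. For $i = n + 1$ we have $\lambda_{\mathcal{Q}}(a, a_{n+1}) = \lambda_{n+1}$. From (2.3) we also obtain $\varphi(a_i) = \varphi(\bar{a}_i)$ and $\varphi(a_{n+1}) = \varphi(\bar{a}_{n+1})$. Thus, with the use of Th. 2.1 the theorem is proven.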
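The induction in the proof of Th. 2.2 reduces an $n$-component mixture to a chain of two-component direct mixtures of the kind guaranteed by AP 1. The following is a minimal sketch of that bookkeeping only, under the assumption that a mixture can be represented by its list of rational weights; the function names `binary_steps` and `compose` are hypothetical and do not appear in the text.

```python
from fractions import Fraction


def binary_steps(weights):
    """Split target weights (lambda_1, ..., lambda_n) into two-component steps.

    Step k mixes the mixture built so far with the next component; the
    returned Fraction is the weight of the *new* component in that step,
    mirroring the role of lambda_{n+1} in the proof of Th. 2.2.
    """
    if sum(weights) != 1 or any(w <= 0 for w in weights):
        raise ValueError("weights must be positive rationals summing to 1")
    steps = []
    running = weights[0]
    for w in weights[1:]:
        running += w
        # Relative weight of the new component inside the partial mixture.
        steps.append(w / running)
    return steps


def compose(steps):
    """Rebuild the final weights by applying the binary mixtures in order."""
    result = [Fraction(1)]
    for s in steps:
        # Old components are rescaled by (1 - s); the new one gets weight s.
        result = [(1 - s) * r for r in result] + [s]
    return result


weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
steps = binary_steps(weights)
print(steps)
print(compose(steps) == weights)
```

Each step rescales the earlier weights by $(1 - \lambda_{n+1})$, which is the identity $(1 - \lambda_{n+1})\,\alpha_i = \lambda_i$ used in the proof. Exact rational arithmetic is used because AP 1 is stated only for rational weights.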
In AP 1 we have required only that $\alpha$ be a rational number. This has been done with the wish that $\mathcal{Q}$ can be chosen to be a denumerable set (see [1], §9).

Th. 2.2 states that to every decomposition (2.2) of $w$ according to the $w_i$ with weights $\lambda_i$ ($\lambda_i$ rational) there exists a preparation procedure $a \in \mathcal{Q}'$ and a decomposition of $a$, $a = \bigcup_{i=1}^{n} a_i$, for which $\varphi(a) = w$, $\varphi(a_i) = w_i$ and $\lambda_{\mathcal{Q}}(a, a_i) = \lambda_i$.

It is not difficult to see that AP 1 and Th. 2.2 say little about the structure of microsystems. In fact AP 1 is more of an assertion about the construction of the preparation apparatus. We have introduced AP 1 only because it will illuminate the discussion about the physical assertions of quantum mechanics.

We shall now return to our discussion concerning the concepts of preparation and registration procedures and their mixtures and decompositions. In order to avoid error in connection with AP 1 we find it necessary to make the following remarks. For every decomposition $a = \bigcup_{i=1}^{n} a_i$ we may be tempted to believe that the apparatus $A$ corresponding to $a$ must consist of a random generator $B$ which selects the sub-apparatuses $A_i$ (which correspond to the $a_i$). Such a description is incorrect for the following reason: Although there may be indications on the apparatus $A$ by which the selection procedures $a_i$ are determined, the total structure of the apparatus may be such that it is impossible to uniquely define the component apparatuses $A_i$ of $A$. For the case of quantum mechanics it is important to note that there are decompositions $a = \bigcup_i a_i$ which do not correspond to a partition of the apparatus $A$ into a random generator and components $A_i$.

If we replace AP 1 by the following somewhat stronger assertion we shall find ourselves in contradiction with quantum mechanics: Suppose that there is a decomposition of $w$ into $w_i$ according to D 2.1, and suppose that $\varphi(a) = w$.
Then there is a decomposition of $a$, $a = \bigcup_i a_i$, for which $\varphi(a_i) = w_i$ and $\lambda_{\mathcal{Q}}(a, a_i) = \lambda_i$.

In VI, §6 we shall find, even in the case in which the relation (2.2) holds, that there are selection procedures $a$ which satisfy $\varphi(a) = w$ for which there are no decompositions $a = \bigcup_i a_i$ for which $\varphi(a_i) = w_i$. Since AP 1 holds, there must be another selection procedure $a'$ which satisfies $\varphi(a') = w$ and permits a decomposition $a' = \bigcup_i a_i'$ where $\varphi(a_i') = w_i$ and $\lambda_{\mathcal{Q}}(a', a_i') = \lambda_i$.

Earlier we have attempted to formulate the concept of an ensemble in such a way as to avoid errors in interpretation. In order to continue this effort it is now necessary to examine the relationships between the concept of an effect and other concepts which are frequently used in quantum mechanics.

By analogy with the notion of an ensemble, we emphasize that, according to D 1.2, the expression effect denotes classes of effect processes $(b_0, b)$. Then the map $\psi$ defined by D 1.3 maps many effect processes $(b_0, b)$ into an effect. In this mapping, however, information is lost about the effect processes $(b_0, b)$; see, for example, the description of "coexistent" effects in IV, §1. The effect process is characterized by an "apparatus" (the registration method $b_0$)
and by the "detection response" (the registration procedure $b$). Thus it is possible that two apparatuses $b_0^{(1)}$ and $b_0^{(2)}$ with substantially different technical design will represent the same effect $g \in \mathcal{L}$:

$$\psi(b_0^{(1)}, b^{(1)}) = \psi(b_0^{(2)}, b^{(2)}) = g.$$

Let $b$ characterize the "detection response" for the apparatus $b_0$. Some authors call the pair $(b_0, b)$ a "yes-no measurement." In quantum mechanics the concept of a "yes-no" measurement is usually explained in ordinary language before it is used. Thus it will often be unclear whether the expression "yes-no measurement" should refer to the elements $(b_0, b)$ of $\mathcal{F}$ or to the elements $g = \psi(b_0, b)$ of $\mathcal{L}$. This ambiguity can easily lead to misunderstandings. Often the expression "question" is used instead of "yes-no measurement." Here $(b_0, b)$ is interpreted as a question posed to the micro-object; $x \in b$ corresponds to the answer "yes" and $x \in b_0 \setminus b$ corresponds to "no." Here again it is not clear whether the concept "question" should refer to an element of $\mathcal{F}$ or to an element of $\mathcal{L}$.

The expressions "yes-no measurement" and "question" are also used in a more restricted sense. If care is not taken to see how different authors use these expressions, great confusion will result. We find that the expressions "yes-no measurement," "question," and "proposition" are used for the elements of a subset of $\mathcal{L}$, that is, for special effects (which we shall call "decision effects" and define in §3 and §6).

We now find it necessary to define the notions of mixture and decomposition in reference to registration methods and effects; these notions will later prove to be useful. We shall now consider only the notion of decomposition of registration methods which was defined in II, Th. 4.5.3(viii) (and not the more general decomposition of registration procedures; for the latter, see the discussion on observables presented in IV, §1.4). The following theorem is an immediate consequence of II, Th. 4.5.3.

Th. 2.3.
Let $b_0 = \bigcup_{i=1}^{n} b_{0i}$ be a decomposition of the registration method $b_0$ according to the $b_{0i}$ with weights $\lambda_i$. Then for an effect process $f = (b_0, b)$, where $f_i = (b_{0i}, b_{0i} \cap b)$, the following equation holds:

$$\mu(w, \psi(f)) = \sum_{i=1}^{n} \lambda_i\,\mu(w, \psi(f_i)) \tag{2.5}$$

for all $w \in \mathcal{K}$.

D 2.4. Let $g \in \mathcal{L}$. If there is a set of real numbers $\lambda_i$ where $0 < \lambda_i < 1$, $\sum_{i=1}^{n} \lambda_i = 1$, and a set $g_i \in \mathcal{L}$ such that

$$\mu(w, g) = \sum_{i=1}^{n} \lambda_i\,\mu(w, g_i) \tag{2.6}$$

holds for all $w \in \mathcal{K}$, then (2.6) is called a decomposition of the effect $g$ according to the effects $g_i$ with weights $\lambda_i$.
Equation (2.5) represents a decomposition of $g = \psi(b_0, b)$ according to the $g_i = \psi(b_{0i}, b_{0i} \cap b)$ with weights $\lambda_i = \lambda_{\mathcal{R}_0}(b_0, b_{0i})$.

The procedure described earlier for the construction of a preparation apparatus $A$ from a random generator $B$ and two preparation apparatuses $A_1$ and $A_2$ can also be applied directly to a similar procedure for registration apparatuses. Let us construct a registration apparatus using a random generator $B$ and two registration apparatuses $A_1$ and $A_2$. Since a registration apparatus corresponds to an element of $\mathcal{R}_0'$, from the two registration methods $b_{01}$ and $b_{02}$ we obtain a registration method $b_0$ having the decomposition $b_0 = b_{01}' \cup b_{02}'$ where $b_{01}'$, $b_{02}'$ correspond to the apparatuses $A_1$ and $A_2$, respectively.

By analogy to D 2.2 and D 2.3 we define:

D 2.5. Two registration methods $b_0$ and $b_0'$ are isomorphic if there is an isomorphism $i$ of the Boolean ring $\mathcal{R}(b_0)$ to the Boolean ring $\mathcal{R}(b_0')$ for which $\psi(ib_0, ib) = \psi(b_0, b)$; $i$ is also an isomorphism of $\mathcal{R}_0(b_0)$ to $\mathcal{R}_0(b_0')$, and $(a, b_0) \in C$ is equivalent to $(a, b_0') \in C$. (Here we note that $\mathcal{R}_0(b_0)$ is defined by $\mathcal{R}_0(b_0) = \mathcal{R}_0 \cap \mathcal{R}(b_0)$.)

D 2.6. A registration method $b_0$ is said to be a direct mixture of the registration methods $b_{01}$, $b_{02}$ if there are two registration methods $b_{01}'$, $b_{02}'$, where $b_{01}'$ is isomorphic to $b_{01}$ and $b_{02}'$ is isomorphic to $b_{02}$, such that $b_{01}' \cap b_{02}' = \emptyset$, $b_0 = b_{01}' \cup b_{02}'$. $\alpha = \lambda_{\mathcal{R}_0}(b_0, b_{01}')$ and $1 - \alpha = \lambda_{\mathcal{R}_0}(b_0, b_{02}')$ are called the weights of $b_{01}$, $b_{02}$ in the direct mixture $b_0$.

From D 2.6 and Th. 2.3 it follows that, for every $b \subset b_0$,

$$\mu(w, \psi(b_0, b)) = \alpha\,\mu(w, \psi(b_{01}', b_{01}' \cap b)) + (1 - \alpha)\,\mu(w, \psi(b_{02}', b_{02}' \cap b)). \tag{2.7}$$

Let $b_1 \subset b_{01}$, $b_2 \subset b_{02}$. Since $b_{01}'$ and $b_{01}$ are isomorphic, and $b_{02}'$ and $b_{02}$ are isomorphic, there exists a $b_1' \subset b_{01}'$ and a $b_2' \subset b_{02}'$ such that $b_1$, $b_1'$ and $b_2$, $b_2'$ are isomorphic, respectively, and $\psi(b_{01}, b_1) = \psi(b_{01}', b_1')$, $\psi(b_{02}, b_2) = \psi(b_{02}', b_2')$. If $b = b_1' \cup b_2'$, then from $b_{01}' \cap b_{02}' = \emptyset$ we obtain $b_{01}' \cap b = b_1'$, $b_{02}' \cap b = b_2'$.
From (2.7) it follows that

$$\mu(w, \psi(b_0, b)) = \alpha\,\mu(w, \psi(b_{01}, b_1)) + (1 - \alpha)\,\mu(w, \psi(b_{02}, b_2)). \tag{2.8}$$

Thus the effect $\psi(b_0, b)$ is a mixture of the effects $\psi(b_{01}, b_1)$ and $\psi(b_{02}, b_2)$ in the ratio $\alpha$ to $(1 - \alpha)$. We now introduce the following axiom:

AR 1. To each pair $b_{01}, b_{02} \in \mathcal{R}_0'$ and each rational number $\alpha$, $0 < \alpha < 1$, there exists a direct mixture $b_0 \in \mathcal{R}_0'$ of $b_{01}$, $b_{02}$ with the weight $\alpha$ of $b_{01}$ in $b_0$.
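The random-generator construction behind (2.8) can be illustrated numerically. The sketch below assumes, purely for illustration, that the two component apparatuses respond with fixed probabilities `mu1` and `mu2` for a given ensemble; the names `mixed_response`, `recover_weight`, and `simulate` are hypothetical. It evaluates the right-hand side of (2.8), inverts it for the weight in the same way as the formula following (2.4), and compares with a seeded simulation of the generator $B$.

```python
import random
from fractions import Fraction


def mixed_response(alpha, mu1, mu2):
    """Right-hand side of (2.8): response probability of the direct mixture."""
    return alpha * mu1 + (1 - alpha) * mu2


def recover_weight(mu_mix, mu1, mu2):
    """Solve (2.8) for alpha; requires the two component values to differ."""
    if mu1 == mu2:
        raise ValueError("the components respond identically; alpha is undetermined")
    return (mu_mix - mu2) / (mu1 - mu2)


def simulate(alpha, mu1, mu2, trials, seed=0):
    """Relative frequency of responses when a random generator picks the apparatus."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # The generator B yields (+) with frequency alpha and selects apparatus 1.
        p = float(mu1) if rng.random() < float(alpha) else float(mu2)
        if rng.random() < p:
            hits += 1
    return hits / trials


alpha, mu1, mu2 = Fraction(1, 4), Fraction(1, 2), Fraction(1, 10)
mu_mix = mixed_response(alpha, mu1, mu2)
print(mu_mix, recover_weight(mu_mix, mu1, mu2))
print(simulate(alpha, mu1, mu2, trials=20000))
```

The exact part uses rational arithmetic; the simulated frequency only approximates the exact value and depends on the seed and the number of trials.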
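From AR 1 we obtain:

Th. 2.4. Let $g_i \in \mathcal{L}$, $i = 1, \ldots, n$, and let $\lambda_i > 0$, $i = 1, \ldots, n$, be rational numbers for which $\sum_{i=1}^{n} \lambda_i = 1$. Then there exists a $b_0 \in \mathcal{R}_0'$ and a decomposition $b_0 = \bigcup_{i=1}^{n} b_{0i}$ for which $b_{0i} \in \mathcal{R}_0'$, and there exists a $b \in \mathcal{R}$, $b \subset b_0$, such that $\psi(b_{0i}, b_{0i} \cap b) = g_i$ and $\lambda_{\mathcal{R}_0}(b_0, b_{0i}) = \lambda_i$. Let $w \in \mathcal{K}$, $g = \psi(b_0, b)$. Then

$$\mu(w, g) = \sum_{i=1}^{n} \lambda_i\,\mu(w, g_i).$$

Proof. The proof of this theorem is analogous to that of Th. 2.2. According to the induction hypothesis there exists a $\bar{b}_0 = \bigcup_{i=1}^{n} \bar{b}_{0i}$ and a $\bar{b} \subset \bar{b}_0$ such that $\psi(\bar{b}_{0i}, \bar{b}_{0i} \cap \bar{b}) = g_i$ and $\lambda_{\mathcal{R}_0}(\bar{b}_0, \bar{b}_{0i}) = \alpha_i$. Choose $(\bar{b}_{0\,n+1}, \bar{b}_{n+1})$ such that $\psi(\bar{b}_{0\,n+1}, \bar{b}_{n+1}) = g_{n+1}$. According to AR 1, to $\bar{b}_0$ and $\bar{b}_{0\,n+1}$ there exists a direct mixture $b_0$ of $\bar{b}_0$, $\bar{b}_{0\,n+1}$ with weight $\lambda_{n+1}$ of $\bar{b}_{0\,n+1}$ in $b_0$. There also exists a $\tilde{b}_0$ which is isomorphic to $\bar{b}_0$ and a $b_{0\,n+1}$ isomorphic to $\bar{b}_{0\,n+1}$ such that

$$b_0 = \tilde{b}_0 \cup b_{0\,n+1}, \quad \tilde{b}_0 \cap b_{0\,n+1} = \emptyset, \quad \lambda_{\mathcal{R}_0}(b_0, \tilde{b}_0) = 1 - \lambda_{n+1}, \quad \lambda_{\mathcal{R}_0}(b_0, b_{0\,n+1}) = \lambda_{n+1}.$$

From the isomorphism between $\bar{b}_0$ and $\tilde{b}_0$ (see D 2.5) it follows that there is a decomposition $\tilde{b}_0 = \bigcup_{i=1}^{n} b_{0i}$ which is isomorphic to the decomposition $\bar{b}_0 = \bigcup_{i=1}^{n} \bar{b}_{0i}$, for which $\psi(\bar{b}_0, \bar{b}_{0i}) = \psi(\tilde{b}_0, b_{0i})$. From $b_0 = \tilde{b}_0 \cup b_{0\,n+1} = \bigcup_{k=1}^{n+1} b_{0k}$ it follows that, for $k \leq n$,

$$\begin{aligned}
\lambda_{\mathcal{R}_0}(b_0, b_{0k}) &= \lambda_{\mathcal{R}_0}(b_0, \tilde{b}_0)\,\lambda_{\mathcal{R}_0}(\tilde{b}_0, b_{0k}) = (1 - \lambda_{n+1})\,\lambda_{\mathcal{S}}(a \cap \tilde{b}_0, a \cap b_{0k}) \\
&= (1 - \lambda_{n+1})\,\lambda_{\mathcal{S}}(a \cap \bar{b}_0, a \cap \bar{b}_{0k}) = (1 - \lambda_{n+1})\,\lambda_{\mathcal{R}_0}(\bar{b}_0, \bar{b}_{0k}) = (1 - \lambda_{n+1})\,\alpha_k = \lambda_k.
\end{aligned}$$

In addition, $\lambda_{\mathcal{R}_0}(b_0, b_{0\,n+1}) = \lambda_{n+1}$. To $\bar{b} \subset \bar{b}_0$ there exists a $\tilde{b}$ ($\subset \tilde{b}_0$) isomorphic to $\bar{b}$, for which

$$\lambda_{\mathcal{S}}(a \cap \bar{b}_0, a \cap \bar{b}_{0i} \cap \bar{b}) = \lambda_{\mathcal{S}}(a \cap \tilde{b}_0, a \cap b_{0i} \cap \tilde{b}).$$

Thus we obtain

$$\lambda_{\mathcal{S}}(a \cap \bar{b}_0, a \cap \bar{b}_{0i} \cap \bar{b}) = \lambda_{\mathcal{R}_0}(\bar{b}_0, \bar{b}_{0i})\,\lambda_{\mathcal{S}}(a \cap \bar{b}_{0i}, a \cap \bar{b}_{0i} \cap \bar{b}) = \alpha_i\,\lambda_{\mathcal{S}}(a \cap \bar{b}_{0i}, a \cap \bar{b}_{0i} \cap \bar{b})$$

and

$$\lambda_{\mathcal{S}}(a \cap \tilde{b}_0, a \cap b_{0i} \cap \tilde{b}) = \lambda_{\mathcal{R}_0}(\tilde{b}_0, b_{0i})\,\lambda_{\mathcal{S}}(a \cap b_{0i}, a \cap b_{0i} \cap \tilde{b}) = \alpha_i\,\lambda_{\mathcal{S}}(a \cap b_{0i}, a \cap b_{0i} \cap \tilde{b}).$$

Thus we find that

$$g_i = \psi(\bar{b}_{0i}, \bar{b}_{0i} \cap \bar{b}) = \psi(b_{0i}, b_{0i} \cap \tilde{b}).$$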
For $b_{n+1}$ (which is isomorphic to $\bar{b}_{n+1}$) we obtain

$$g_{n+1} = \psi(\bar{b}_{0\,n+1}, \bar{b}_{n+1}) = \psi(b_{0\,n+1}, b_{n+1}).$$

From $b = \tilde{b} \cup b_{n+1}$, since $b_{n+1} \subset b_{0\,n+1}$ and $\tilde{b} \subset \tilde{b}_0$, we obtain

$$b_{0i} \cap b = b_{0i} \cap \tilde{b} \quad\text{for } i \leq n, \qquad b_{0\,n+1} \cap b = b_{n+1}.$$

Thus, from Th. 2.3 we finally obtain

$$\mu(\varphi(a), \psi(b_0, b)) = \sum_{i=1}^{n+1} \lambda_i\,\mu(\varphi(a), \psi(b_{0i}, b_{0i} \cap b)),$$

that is,

$$\mu(\varphi(a), g) = \sum_{i=1}^{n+1} \lambda_i\,\mu(\varphi(a), g_i).$$

Th. 2.4 states that for every decomposition (2.6) of $g$ into components $g_i$ with weights $\lambda_i$ ($\lambda_i$ rational) there exists a registration method $b_0 \in \mathcal{R}_0'$ with decomposition $b_0 = \bigcup_{i=1}^{n} b_{0i}$ and detection response $b_i = b_{0i} \cap b \in \mathcal{R}(b_{0i})$ which satisfies $\psi(b_0, \bigcup_i b_i) = g$, $\psi(b_{0i}, b_i) = g_i$ and $\lambda_{\mathcal{R}_0}(b_0, b_{0i}) = \lambda_i$.

Axioms AR 1 and AP 1 have been introduced primarily for the purpose of aiding the discussion of the physical meaning of certain aspects of quantum mechanics. Here it is not necessary to remind the reader again of the possibility of drawing incorrect conclusions from axiom AR 1 (these are analogous to those described in the case of preparation procedures).

In closing this section we again state that axioms AP 1 and AR 1 are not only applicable to the case of microsystems, but may be used for all systems in physics. Of course, it is possible to choose stronger axioms than AP 1 and AR 1 (see the discussion on coexistent decompositions and coexistent effects in IV, §1 and §5 and the concept of a "physical object" which will be introduced in §4.1) in such a way as to exclude the possibility of describing microsystems.

The interest among theoreticians in the sets $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$ and their physical meaning is divided. In a "classical world" in which the elements of $M$ can be considered to be the set of physical objects, the measurement problems underlying $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$ are generally not of theoretical interest. It is sometimes believed that it should be possible to construct a theory of microsystems without making an inquiry into the physics of the measurement process.
In §4 and IV, §8.1 we shall find that this is not possible.

We shall eliminate much of the preparation and measurement process if we seek to develop the theory with the aid of the sets $\mathcal{K}$ and $\mathcal{L}$ and the function $\mu\colon \mathcal{K} \times \mathcal{L} \to [0, 1]$. In order that we may obtain the most general laws of nature governing the processes of preparation and registration of microsystems (analogous to the first and second laws of thermodynamics) in §3, it is not necessary to use the sets $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$. That is, none of the individual
physical structures associated with apparatuses denoted by the elements of $\mathcal{Q}$ and $\mathcal{R}$ will be mapped into the mathematical theory $\mathcal{MT}_\Sigma$. This viewpoint will appear to be sufficient, if not more than sufficient, to those whose interest is in the description of microsystems. We shall consider this form of the theory in the following chapters (up to and including XVI), paying close attention to the difficulties inherent in such a viewpoint. In XVII we shall seek to introduce additional structure on the sets $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$. The reader who is dissatisfied with the lack of structure on $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$ (that is, the treatment of the preparation and registration apparatuses as "black boxes"), a viewpoint shared by the author, is referred to XVII, XVIII, and [13].

3 General Laws: Preparation and Registration of Microsystems

In this section we shall briefly digress and consider a few fundamental ideas for the case of microsystems. This section may be skipped without loss of continuity.

We shall consider how the mathematical description of the set of ensembles $\mathcal{K}$ and the set of effects $\mathcal{L}$ which will be formulated in §5 may be deduced for the case of microsystems from physically motivated axioms. A detailed presentation of this topic can be found in [13].

Because of the "finiteness of physics" (see [1], §9 or [2], III, §8) we shall assume that the sets $M$, $\mathcal{Q}$, $\mathcal{R}$ are countable. Then $\mathcal{K}$ and $\mathcal{L}$ will be countable. The following theorem is a consequence of Th. 1.1.

Th. 3.1. There exists a pair of real Banach spaces $\mathcal{B}$, $\mathcal{B}'$ (where $\mathcal{B}'$ is dual to $\mathcal{B}$) and an embedding of $\mathcal{K}$ in $\mathcal{B}$ and $\mathcal{L}$ in $\mathcal{B}'$ (that is, $\mathcal{K}$, $\mathcal{L}$ can be identified with subsets of $\mathcal{B}$ and $\mathcal{B}'$, respectively) for which the following conditions hold:

(i) The canonical bilinear form $\langle w, g \rangle$ defined for the dual pair $\mathcal{B}$, $\mathcal{B}'$ is identical to $\mu(w, g)$ on $\mathcal{K} \times \mathcal{L}$, that is, $\mu(w, g) = \langle w, g \rangle|_{\mathcal{K} \times \mathcal{L}}$.
(ii) $\mathcal{B}$ is a base-norm space (see AIII, §6) with basis $K$ where $K$ is equal to $\overline{\mathrm{co}}\,\mathcal{K}$ (where $\overline{\mathrm{co}}\,\mathcal{K}$ is the norm-closed convex set generated by $\mathcal{K}$). The positive cone $\mathcal{B}_+$ generated by $K$ is closed. From Th. 2.2 it follows that $K$ is the norm closure of $\mathcal{K}$, that is, the norm closure of $\mathcal{K}$ is already convex.

(iii) The linear span of $\mathcal{L}$ is $\sigma(\mathcal{B}', \mathcal{B})$ dense in $\mathcal{B}'$ (for the $\sigma(\ldots)$-topology see AIII, §4).

$\mathcal{B}$ and $\mathcal{B}'$ are also uniquely defined (up to isomorphism) by (ii)-(iii). Since $\mathcal{K}$ is countable, it follows that $\mathcal{B}$ is separable.

The proof of this theorem can be found in [17] and [13].
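We shall denote the dual form $\langle x, y \rangle$ for $\mathcal{B}$, $\mathcal{B}'$ by $\mu(x, y)$. From Th. 3.1(ii) it follows that $\mathcal{B}'$ is an order unit space. Since $0 \leq \mu(w, g) \leq 1$ for $w \in \mathcal{K}$ and $g \in \mathcal{L}$ we also obtain $0 \leq \mu(w, g) \leq 1$ for $w \in K$ and $g \in \mathcal{L}$. This means that $\mathcal{L} \subset [0, \mathbf{1}]$ where $\mathbf{1}$ is the order unit in $\mathcal{B}'$.

Let $L$ denote the $\sigma(\mathcal{B}', \mathcal{B})$ closure of $\mathcal{L}$ in $\mathcal{B}'$. From Th. 2.4 it follows that $L$ is convex. Let $\mathcal{D}$ denote the norm-closure of the linear span of $\mathcal{L}$. Then $\mathbf{1} \in \mathcal{D}$ and $\mathcal{D}$ is a separable Banach subspace of $\mathcal{B}'$ ($\mathcal{D}$ is an order unit space). $\mathcal{D}$ is $\sigma(\mathcal{B}', \mathcal{B})$ dense in $\mathcal{B}'$. $K$ is $\sigma(\mathcal{B}, \mathcal{D})$ precompact and $\sigma(\mathcal{B}, \mathcal{D})$ separable. Let $\mathcal{D}'$ be the Banach space which is dual to $\mathcal{D}$; $\mathcal{D}'$ is a base-norm space. We may identify the space $\mathcal{B}$ with a subspace of $\mathcal{D}'$. Let $\bar{K}$ denote the $\sigma(\mathcal{D}', \mathcal{D})$-closure of $K$ in $\mathcal{D}'$. $\bar{K}$ is $\sigma(\mathcal{D}', \mathcal{D})$ compact and $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-compact. For the compact sets $\bar{K}$ and $L$ the Krein-Milman theorem holds (AIII, §4).

The topologies $\sigma(\mathcal{B}, \mathcal{D})$ and $\sigma(\mathcal{B}', \mathcal{B})$ have the following physical interpretation: First, the topologies $\sigma(\mathcal{B}, \mathcal{D})$, $\sigma(\mathcal{B}, L \cap \mathcal{D})$ and $\sigma(\mathcal{B}, \mathcal{L})$ on $K$ (and $\bar{K}$) are identical since $\bar{K}$ is compact. The same is true for the topologies $\sigma(\mathcal{B}', \mathcal{B})$, $\sigma(\mathcal{B}', K)$ and $\sigma(\mathcal{B}', \mathcal{K})$ on $L$ since $L$ is compact. The topologies $\sigma(\mathcal{B}, \mathcal{L})$ on $K$ (or $\mathcal{K}$) and $\sigma(\mathcal{B}', \mathcal{K})$ on $L$ (or $\mathcal{L}$) describe the possibility of "physically" distinguishing among ensembles (effects). We shall now illustrate this for the case of ensembles:

From $\mu(w_1, g) = \mu(w_2, g)$ for all $g \in \mathcal{L}$, it follows that $w_1 = w_2$. Experimentally we can only use a finite number of registration apparatuses with a finite number of "detection responses" (that is, a finite number of $g$) in order to test whether two ensembles $w_1$ and $w_2$ are different. In addition, we can only test to within a finite error whether $\mu(w_1, g) = \mu(w_2, g)$. That is, for finitely many $g_1, g_2, \ldots, g_n$ and finite error $\varepsilon > 0$ we can always test whether

$$|\mu(w_1, g_i) - \mu(w_2, g_i)| < \varepsilon \quad (i = 1, 2, \ldots, n). \tag{3.1}$$

The inequalities (3.1) determine, for different $\varepsilon$, $n$, and $g_i$, the neighborhood basis for the topology $\sigma(\mathcal{B}, \mathcal{L})$.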
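The finite test (3.1) can be made concrete in a toy model. The sketch below assumes, purely for illustration, that ensembles are probability vectors on a finite set and effects are vectors with entries in $[0, 1]$, with $\mu(w, g)$ given by their dot product; this finite "classical" model and the names `mu` and `within_neighborhood` are assumptions, not part of the text.

```python
from fractions import Fraction


def mu(w, g):
    """Probability of the effect g in the ensemble w (toy model: a dot product)."""
    return sum(wi * gi for wi, gi in zip(w, g))


def within_neighborhood(w1, w2, effects, eps):
    """True if w1 and w2 satisfy (3.1) for every listed effect and error eps."""
    return all(abs(mu(w1, g) - mu(w2, g)) < eps for g in effects)


half, quarter = Fraction(1, 2), Fraction(1, 4)
w1 = (half, half, Fraction(0))
w2 = (half, quarter, quarter)

g1 = (1, 0, 0)  # responds only to the first outcome
g2 = (0, 1, 0)  # responds only to the second outcome
g3 = (0, 1, 1)  # responds to the second or third outcome

# g1 and g3 cannot separate w1 from w2, however small eps is chosen.
print(within_neighborhood(w1, w2, [g1, g3], Fraction(1, 100)))
# Adding g2 separates them once eps is below the difference 1/4.
print(within_neighborhood(w1, w2, [g1, g2, g3], Fraction(1, 10)))
```

Two distinct ensembles may thus lie in the same neighborhood for one finite choice of effects and error and be separated by another, which is the sense in which (3.1) generates the topology.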
We shall now present additional axioms for $K$ and $L$. The physical intuition upon which these axioms are based can be found in [13] and (in part) in [17]. It is already clear, on the basis of the maps $\varphi$ of $\mathcal{Q}'$ into $\mathcal{B}$ and $\psi$ of $\mathcal{F}$ into $\mathcal{B}'$, that the following additional axioms will represent indirect assertions about the sets $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$.

In order to formulate additional axioms, we define the following:

D 3.1.

$$\begin{aligned}
K_0(B) &= \{w \mid w \in K \text{ and } \mu(w, g) = 0 \text{ for all } g \in B \subset L\}, \\
K_1(B) &= \{w \mid w \in K \text{ and } \mu(w, g) = 1 \text{ for all } g \in B \subset L\}, \\
L_0(A) &= \{g \mid g \in L \text{ and } \mu(w, g) = 0 \text{ for all } w \in A \subset K\}.
\end{aligned}$$

$K_0(B)$ and $K_1(B)$ are closed faces¹ of $K$; $L_0(A)$ is a $\sigma(\mathcal{B}', \mathcal{B})$-closed face of $L$. If $B$ consists of only a single element $g$, instead of $K_0(B)$ we shall write $K_0(g)$, and similarly for $K_1$ and $L_0$.

1. For the concept of a face, see §6.
It is easy to verify that the order relation $y_1 \leq y_2$ in $\mathcal{B}'$ is equivalent to the following relation:

$$\mu(w, y_1) \leq \mu(w, y_2) \quad\text{for all } w \in K.$$

We shall now state the first law of measurement as an axiom:

AV 1.1. To each pair $g_1, g_2 \in L$ there exists a $g_3 \in L$ for which $g_3 \geq g_1$, $g_3 \geq g_2$ and $K_0(g_1) \cap K_0(g_2) \subset K_0(g_3)$.

AV 1.1 is equivalent to the statement that each $L_0(A)$ has a largest element, which we denote by $eL_0(A)$ (see [17] and [13]). All elements of $K$ (not only those of $\mathcal{K}$) are called ensembles; all elements of $L$ (not only of $\mathcal{L}$) are called effects. The elements $eL_0(A)$ are called decision effects. We shall denote the set of decision effects by $G$. Let $\partial_e L$ denote the set of extreme points of $L$; we obtain $G \subset \partial_e L$ (see [17] and [13]).

For an arbitrary subset $\{A_\alpha\}$ of $\mathcal{P}(K)$ we find that the relationship

$$L_0\Bigl(\bigcup_\alpha A_\alpha\Bigr) = \bigcap_\alpha L_0(A_\alpha)$$

is satisfied. Thus we find that the set $\{L_0(A) \mid A \subset K\}$ is a complete lattice with respect to the partial order $\subset$ of set inclusion. Since the map $L_0(A) \to eL_0(A)$ is an order isomorphism of $\{L_0(A) \mid A \subset K\}$ onto $G$, we find that $G$ is a complete lattice with respect to the order induced on $G \subset \mathcal{B}'$ by $\mathcal{B}'$.

For the second law, we propose the following:

AV 1.2s. $L = [0, \mathbf{1}]$.

From this axiom, it follows that the set $\{K_0(g) \mid g \in L\}$ coincides with the set of so-called exposed¹ faces. We may also show that

$$\sup_{w \in K} \mu(w, e) = 1$$

for all $e \in G$ for which $e \neq 0$.

Unfortunately, the following relation AV id cannot be proven. It represents only a minor idealization. We shall introduce it as an axiom.

AV id. To each $e \in G$, $e \neq 0$, there exists a $w \in K$ for which $\mu(w, e) = 1$.

This relation is equivalent to the assertion

$$e \in G \text{ implies } \mathbf{1} - e \in G.$$

This relation may be used as an axiom instead of AV id. Then we would find that the map $e \to e^{\perp} \overset{\text{def}}{=} \mathbf{1} - e$ is an orthocomplementation in the lattice $G$ and that $G$ is orthomodular. The map $e \to K_1(e)$ is an isomorphism between $G$ and the lattice of exposed faces of $K$.

1. A face $F$ of $K$ is said to be exposed if and only if there exists a $y \in \mathcal{B}'$ for which $F = \{w \mid w \in K,\ \mu(w, y) = \sup_{w' \in K} \mu(w', y)\}$.
We shall now define the notion of "distance" between two elements $e_1$, $e_2$ of $G$ (or the corresponding faces $K_1(e_1)$, $K_1(e_2)$ of $K$) as follows:

[The displayed definition of $\Delta(e_1, e_2)$ is damaged in the source; the surviving fragment reads "$\Delta(e_1, e_2) = \max \ldots 1 - e_2)$".]

As an additional axiom, we assert the following:

AV 3. If $e_1, e_2, e_3 \in G$ and if $e_2 \leq e_1 \leq e_2 \vee e_3$ and $\Delta(e_1, e_3) \neq 0$ then $e_1 = e_2$.

For the case in which $G$ is a Boolean ring, axiom AV 3 will be satisfied as a theorem.

"Classical systems" are often defined by the requirement that $G$ be a nonatomic Boolean ring. Instead of the nonatomic condition we shall require that each face of $K$ is infinite dimensional. The requirement that $G$ be a Boolean ring may be replaced by other equivalent assertions (see the general discussion in [13]). In D 4.1.2 we shall define what we mean by the expression "physical object." The assertion that $G$ is a Boolean ring may be replaced by the requirement that the physical systems in $M$ are physical objects (for a proof see [1], §12.3).

The next axiom will permit us to distinguish between microsystems and classical systems. We shall call this axiom the "law of quantization."

AV 4s. Every exposed face of $K$ is the upper bound (the lattice union) of an increasing sequence of exposed and finite-dimensional faces.

We extend this axiom by the following assertion:

AV 2f. Every finite-dimensional face of $K$ is exposed.

D 3.2. If axioms AV 1.1, AV 1.2s, AV 2f, AV id, AV 3, and AV 4s are satisfied we then say that $M$ (together with the structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$) is a set of microsystems.

The following important theorem holds (the proof will not be given here; see [17] and [13]):

The relations AV 1.1, AV 1.2s, AV 2f, AV id, AV 3, and AV 4s are equivalent to the condition that the Banach spaces $\mathcal{B}$, $\mathcal{B}'$ can be identified with the spaces $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$, $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ with $K$ as the basis of $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $L$ as the order interval $[0, \mathbf{1}]$ of $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$, where we assume that the lattice-dimension of the irreducible parts of $G$ is not 2 or 3.

Here $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ are understood as they are defined in AIV, §15, with the generalization that the number fields of the Hilbert spaces $\mathcal{H}_i$ may be either the set of real numbers $\mathbf{R}$, the set of complex numbers $\mathbf{C}$, or the set of quaternions $\mathbf{Q}$. There are physical arguments (that is, physical facts; see VIII, §2) which permit us to exclude the cases $\mathbf{R}$ and $\mathbf{Q}$. In §5 we shall assert this "end
result" (which is historically obtained by means of the correspondence principle; see, for example, [12], [2], XI, §1 and [2], XIII, §3) as an axiom for microsystems. The reader who is willing to accept this "end result" as "axioms for microsystems" and the accompanying structure of the set of ensembles $K$ and the set of effects $L$ as a hypothesis which has been verified will be able to follow the rest of this book without knowledge of this section. Those readers who are interested in the problems described in this section are again referred to references [17] and [13].

In order to dispel scepticism that the axioms which describe the sets $M$, $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ and the function $\lambda_{\mathcal{S}}$ can restrict the probability function $\mu$ over $\mathcal{K} \times \mathcal{L}$ more than that which is permitted by the above theorem (or AQ in §5), it has been shown (in [8]) that for each function $\mu\colon \mathcal{K} \times \mathcal{L} \to [0, 1]$ which satisfies the theorem, it is possible to construct a model consisting of sets $M$, $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ and a function $\lambda_{\mathcal{S}}$. This does not mean that there is only one such construction possible; we must assume that it is possible to construct many nonisomorphic models $M$, $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$, $\lambda_{\mathcal{S}}$ for a given function $\mu$.

4 Properties and Pseudoproperties

In the discussion of the concept of a physical object which was presented in II, §1 we have left open what we mean by the expression "objective property." In this section we shall seek to clarify this and other questions. Here we shall seek conceptual clarity. We shall, for the most part, only sketch much of the mathematical content; much of the subject matter of this section does not have a direct bearing on the problems treated in this book.

4.1 Properties and Physical Objects

If, in addition to the axioms APS 1-APS 7, we also add the conditions $M \in \mathcal{Q}$, $M \in \mathcal{R}_0$ (and therefore $M \in \mathcal{R}$) then $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ and $\mathcal{S}$ (since $M = M \cap M$ implies $M \in \mathcal{S}$) are Boolean rings of sets. Each of these sets can be considered to be a set of properties.
These properties are, however, the opposite of that which we have called "objective" because they refer to the preparation and registration apparatuses rather than to the microsystems. For example, $a \in \mathcal{Q}$ is the "property that the systems $x \in a$ are prepared according to the procedure $a$."

An objective property should refer directly to the microsystem itself and be independent of the preparation and registration process. By this we mean that a set $a$ which is selected by a preparation (and similarly a set $b$ which is selected by a registration) may be divided, according to objective properties, into subsets, and that such a "part" of $a$ can be treated as if it were a fictitious preparation procedure.

We shall now seek to formulate this idea mathematically in terms of the relationship between a set of objective properties and the sets $\mathcal{Q}$ and $\mathcal{R}$ in order to precisely define the concept of an objective property. For this
purpose we shall use part of the general treatment which is presented in [1], §12.3.

In addition to the structure defined on $M$ by $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ suppose that a set of properties $\mathcal{E}$ is given (that is, $\mathcal{E} \subset \mathcal{P}(M)$ satisfies AE 1 and AE 2).

D 4.1.1. Let $\tilde{\mathcal{Q}}$ be the set of selection procedures generated by the set $\{(a \cap p) \mid a \in \mathcal{Q},\ p \in \mathcal{E}\}$.

Since $M \in \mathcal{E}$ we find that $\mathcal{Q} \subset \tilde{\mathcal{Q}}$. We assert the following axiom:

AE 3. $\tilde{\mathcal{Q}}$ is a statistical selection procedure. For the probability function $\lambda_{\tilde{\mathcal{Q}}}$, for $a_1, a_2 \in \mathcal{Q}$ we require that $\lambda_{\tilde{\mathcal{Q}}}(a_1, a_2) = \lambda_{\mathcal{Q}}(a_1, a_2)$.

We may consider $\tilde{\mathcal{Q}}$ to be an extended system of preparation procedures. For example, $a \cap p$ is the extended preparation procedure which prepares the system according to $a$ and results in the selection of those with property $p$ for further experimentation. Thus $\lambda_{\tilde{\mathcal{Q}}}(a, a \cap p)$ is the probability that a system prepared according to $a$ also "exhibits" property $p$.

In complete analogy to II, Th. 4.5.1 it follows that $\tilde{\mathcal{Q}}$ is the set found in II (4.5.1) providing that the set $\Theta$ is replaced by the set $\{(a \cap p) \mid a \in \mathcal{Q},\ p \in \mathcal{E}\}$. In particular, to each $\tilde{a} \in \tilde{\mathcal{Q}}$ there exists an $a \in \mathcal{Q}$ for which $a \supset \tilde{a}$. The extended preparation procedures in $\tilde{\mathcal{Q}}$ represent idealized refinements of the preparation procedures in $\mathcal{Q}$. This fact motivates the following extension of II, D 4.3.1:

Let $\tilde{a} \in \tilde{\mathcal{Q}}$ and let $b_0 \in \mathcal{R}_0$; we say that $\tilde{a}$ may be combined with $b_0$ if there exists an $a \in \mathcal{Q}$ such that $a \supset \tilde{a}$ and $(a, b_0) \in C$. We now define the following as an extension of II, D 4.3.1:

$$\tilde{C} = \{(\tilde{a}, b_0) \mid \tilde{a} \in \tilde{\mathcal{Q}},\ b_0 \in \mathcal{R}_0 \text{ and } \tilde{a} \text{ may be combined with } b_0\}.$$

Here we find that $C \subset \tilde{C}$. By analogy with the sets $\Theta$, $\mathcal{S}$ defined in II, §4.3 we may also define the sets $\tilde{\mathcal{Q}}'$, $\tilde{\Theta}$, $\tilde{\mathcal{S}}$. We would then obtain $\Theta \subset \tilde{\Theta}$, $\mathcal{S} \subset \tilde{\mathcal{S}}$. By analogy with APS 6 we shall now introduce the following axiom:

AE 4.1. $\tilde{\mathcal{S}}$ is a statistical selection procedure. For the probability function $\lambda_{\tilde{\mathcal{S}}}$ we find that if we replace $\mathcal{Q}$ by $\tilde{\mathcal{Q}}$ and $\mathcal{S}$ by $\tilde{\mathcal{S}}$, APS 7.1, 2 holds.
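In addition, for $c_2 \subset c_1$ with $c_1, c_2 \in \mathcal{S} \subset \tilde{\mathcal{S}}$ we find that $\lambda_{\tilde{\mathcal{S}}}(c_1, c_2) = \lambda_{\mathcal{S}}(c_1, c_2)$.

From AE 4.1 it follows that: if $(a, b_0) \in C$ and $a \cap p \neq \emptyset$ then $a \cap p \cap b_0 \neq \emptyset$. For $C$ and $\mathcal{E}$ we require that

AE 4.2. Let $b_0 \in \mathcal{R}_0'$ and $p \in \mathcal{E}$. If $a \cap p = \emptyset$ for all $a$ for which $(a, b_0) \in C$ then $p = \emptyset$.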
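This axiom expresses the requirement that the combination problem and the relation $p \in \mathcal{E}$ are mutually independent.

D 4.1.2. Let $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$, $\mathcal{E}$ be defined on $M$ according to the axioms described above and those axioms given in II, §4.3 and III, §1. Then we shall call $\mathcal{E}$ a set of virtual properties (virtual with respect to the structures $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$).

Let $p_1, p_2 \in \mathcal{E}$ and let $b_0 \in \mathcal{R}_0$, $b \in \mathcal{R}$ and $b \subset b_0$. Then for $\lambda_{\tilde{\mathcal{S}}}$ we find that

$$\begin{aligned}
\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ a \cap p_1 \cap p_2 \cap b)
&= \lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ a \cap p_1 \cap p_2 \cap b_0)\;\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap p_2 \cap b_0,\ a \cap p_1 \cap p_2 \cap b) \\
&= \lambda_{\tilde{\mathcal{Q}}}(a \cap p_1,\ a \cap p_1 \cap p_2)\;\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap p_2 \cap b_0,\ a \cap p_1 \cap p_2 \cap b).
\end{aligned} \tag{4.1.1}$$

Thus we find that $\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ a \cap p_1 \cap p_2 \cap b)$ is determined by the values of $\lambda_{\tilde{\mathcal{S}}}(\tilde{a} \cap b_0,\ \tilde{a} \cap b)$ where $\tilde{a} \in \tilde{\mathcal{Q}}'$ and $(b_0, b) \in \mathcal{F}$.

On the basis of this result, we introduce the following selection structure as a substitute for $\mathcal{R}_0$ and $\mathcal{R}$: $\mathcal{R}_0$ is unchanged; instead of $\mathcal{R}$ we consider the set $\tilde{\mathcal{R}}$ of all selection procedures generated by all $b \cap p$ where $b \in \mathcal{R}$ and $p \in \mathcal{E}$. We find that $\mathcal{R}_0 \subset \mathcal{R} \subset \tilde{\mathcal{R}}$, and that the system of selection procedures generated by the $\tilde{a} \cap \tilde{b}$ where $\tilde{a} \in \tilde{\mathcal{Q}}$, $\tilde{b} \in \tilde{\mathcal{R}}$ is identical to $\tilde{\mathcal{S}}$ (from which $\lambda_{\tilde{\mathcal{S}}}$ is determined by (4.1.1)). It is easy to show that the system $\tilde{\mathcal{Q}}$, $\mathcal{R}_0$, $\tilde{\mathcal{R}}$ can be considered to be an extended system of preparation and registration procedures, that is, if $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ satisfy the above axioms, then $\tilde{\mathcal{Q}}$, $\mathcal{R}_0$, $\tilde{\mathcal{R}}$ also do.

Thus the function $\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ (a \cap p_1) \cap (b \cap p_2))$ takes on the following very intuitive meaning. The "idealized refined" prepared systems in $\tilde{a} = a \cap p_1$ will be registered by the method $b_0$ in such a way as to permit the use of the "idealized refined" registration procedure $\tilde{b} = b \cap p_2$.

For the "idealized" registration procedure $\tilde{b} = b_0 \cap p_2$ we find that

$$\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\ a \cap p_1 \cap p_2 \cap b_0) = \lambda_{\tilde{\mathcal{Q}}}(a \cap p_1,\ a \cap p_1 \cap p_2)$$

is equal to the probability (which is independent of $b_0$) that the systems which are prepared according to $a \cap p_1$ "have the property" $p_2$.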
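The first equality in (4.1.1) is the product rule for nested selections. It can be checked in a toy model in which, as an assumption made only for this sketch, $M$ is a finite set and the probability of a finer selection relative to a coarser one is the ratio of their cardinalities; the name `lam` and the concrete sets are hypothetical. The second equality of (4.1.1), which expresses independence of $b_0$, rests on AE 4.1 and is not a property of arbitrary counting models, so it is not tested here.

```python
from fractions import Fraction


def lam(coarse, fine):
    """Relative frequency of the finer selection within the coarser one."""
    if not coarse:
        raise ValueError("the coarser selection must be nonempty")
    if not fine <= coarse:
        raise ValueError("the finer selection must be a subset of the coarser one")
    return Fraction(len(fine), len(coarse))


M = set(range(24))
a = set(M)                         # preparation procedure: all systems
p1 = {x for x in M if x % 2 == 0}  # virtual property 1: even label
p2 = {x for x in M if x % 3 == 0}  # virtual property 2: label divisible by 3
b0 = {x for x in M if x < 18}      # registration method
b = {x for x in M if x < 12}       # registration procedure, b is a subset of b0

lhs = lam(a & p1 & b0, a & p1 & p2 & b)
rhs = lam(a & p1 & b0, a & p1 & p2 & b0) * lam(a & p1 & p2 & b0, a & p1 & p2 & b)
print(lhs, rhs)
```

Here the selection $a \cap p_1 \cap b_0$ is refined in two stages, first by $p_2$ and then by $b$, and the two relative frequencies multiply to the frequency of the combined refinement.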
The requirements we have imposed on the set $\mathcal{E}$ of virtual properties together with the structures $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ appear to be too weak for us to classify them as "objective," especially since $\mathcal{E}$ is not determined by $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$. Therefore, we shall call the elements of the set $\mathcal{E}$ which we may add to $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ "virtual properties"; when we wish to stress their "virtual" character, we shall refer to them as "hidden properties."

The expression "hidden variables" is often used instead of hidden properties. The reason for this name will now be given. We note that, to each $x \in M$, the subset $\mathcal{E}(x) = \{p \mid p \in \mathcal{E} \text{ and } x \in p\}$ is an ultrafilter in the Boolean ring $\mathcal{E}$. According to a theorem by Stone, each Boolean ring may be described in terms of a set $\Pi$ in which each ultrafilter corresponds to a single point of $\Pi$. To each $x \in M$ there is a point $\pi \in \Pi$; every $x \in M$ which corresponds to the same ultrafilter $\mathcal{E}(x)$ is mapped to the same point $\pi$. $\Pi$ is called the space of
hidden variables.

In this book we shall not attempt to formulate the problem of hidden variables in mathematical terms. Instead, we shall only attempt to formulate what we would intuitively call “nonhidden” or “measurable” properties. The following condition appears to be obvious: $p \subset M$ is “measurable” (and is therefore not hidden) if $b_0 \cap p \in \mathcal{R}$ for all $b_0 \in \mathcal{R}_0$. This condition is, however, too strong because we may only be able to register a $p \subset M$ approximately.

In II, §3 (after introducing AS 2.4) we have stated that we may mathematically extend the set of selection procedures by adding “idealized limiting elements.” We shall do so now, not for the general case (see [18]), but only for the case of registration procedures, for which we shall add certain idealized registration procedures.

D 4.1.3. A set $c \subset M$ is called an idealized registration procedure if there is a $b_0 \in \mathcal{R}_0$ for which $c \subset b_0$ and

$$
c = \bigcup_{\substack{b \in \mathcal{R} \\ b \subset c}} b, \qquad
b_0 \setminus c = \bigcup_{\substack{b \in \mathcal{R} \\ b \subset b_0 \setminus c}} b.
$$

Th. 4.1.1. The map $\psi$ (see D 1.3) of $\mathcal{F}$ into $\mathcal{L}$ (and therefore of $\mathcal{F}$ in $L$ where $L$ is defined in §3 and §5) may be extended to an idealized registration procedure $c$ as follows:

$$
\psi(b_0, c) = \sup_{\substack{b \in \mathcal{R} \\ b \subset c}} \psi(b_0, b)
= \inf_{\substack{b \in \mathcal{R} \\ b_0 \supset b \supset c}} \psi(b_0, b).
\tag{4.1.2}
$$

The function $\lambda_{\mathcal{S}}$ may be extended to $c$ as follows:

$$
\lambda_{\mathcal{S}}(a \cap b_0, a \cap c)
= \sup_{\substack{b \in \mathcal{R} \\ b \subset c}} \lambda_{\mathcal{S}}(a \cap b_0, a \cap b)
= \inf_{\substack{b \in \mathcal{R} \\ b_0 \supset b \supset c}} \lambda_{\mathcal{S}}(a \cap b_0, a \cap b).
\tag{4.1.3}
$$

The properties of the function $\lambda_{\mathcal{S}}$ are preserved when $\mathcal{R}$ is extended to include the set of selection procedures generated by all $b \cap c$ and $b \cap (M \setminus c)$.

Proof. We shall only sketch the essential part of the proof, namely that

$$
\sup_{\substack{b \in \mathcal{R} \\ b \subset c}} \psi(b_0, b)
= \inf_{\substack{\tilde{b} \in \mathcal{R} \\ b_0 \supset \tilde{b} \supset c}} \psi(b_0, \tilde{b}).
$$

Since the set of the $b \subset c$ is upwardly directed (in the sense of the order relation $\subset$ of set inclusion) the set of the $\psi(b_0, b)$ is also upwardly directed in $\mathcal{B}'$. Since $\psi(b_0, b) \in L$ and $L$ is compact, the sup and inf exist (and are also limits in the $\sigma(\mathcal{B}', \mathcal{B})$ topology) and are in $L$.
From $b \subset c \subset \tilde{b} \subset b_0$ it follows that $\psi(b_0, b) \le \psi(b_0, \tilde{b})$ and

$$
\sup_{b} \psi(b_0, b) \le \inf_{\tilde{b}} \psi(b_0, \tilde{b}).
$$
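The condition $c \subset \tilde{b} \subset b_0$ is equivalent to the condition $b_0 \setminus \tilde{b} \subset b_0 \setminus c$. Since

$$
b_0 \setminus c
= \bigcup_{\substack{b' \in \mathcal{R} \\ b' \subset b_0 \setminus c}} b'
= \bigcup_{\substack{\tilde{b} \in \mathcal{R} \\ b_0 \supset \tilde{b} \supset c}} (b_0 \setminus \tilde{b})
= b_0 \setminus \Bigl( \bigcap_{\substack{\tilde{b} \in \mathcal{R} \\ b_0 \supset \tilde{b} \supset c}} \tilde{b} \Bigr)
$$

it follows that

$$
c = \bigcap_{\substack{\tilde{b} \in \mathcal{R} \\ b_0 \supset \tilde{b} \supset c}} \tilde{b}.
$$

Thus we obtain

$$
\emptyset = c \cap (b_0 \setminus c)
= \Bigl( \bigcap_{\substack{\tilde{b} \in \mathcal{R} \\ b_0 \supset \tilde{b} \supset c}} \tilde{b} \Bigr)
  \cap \Bigl( \bigcap_{\substack{b \in \mathcal{R} \\ b \subset c}} (b_0 \setminus b) \Bigr)
= \bigcap_{\substack{b, \tilde{b} \in \mathcal{R} \\ b_0 \supset \tilde{b} \supset c \supset b}} (\tilde{b} \setminus b).
$$

Since the set of $b$ is upwardly directed, and the set of $\tilde{b}$ is downwardly directed, the set of $\tilde{b} \setminus b$ is downwardly directed. By AS 2.4.1 we find that

$$
\inf_{b, \tilde{b}} \psi(b_0, \tilde{b} \setminus b) = 0
$$

and therefore

$$
\sup_{\substack{b \in \mathcal{R} \\ b \subset c}} \psi(b_0, b)
= \inf_{\substack{\tilde{b} \in \mathcal{R} \\ b_0 \supset \tilde{b} \supset c}} \psi(b_0, \tilde{b}).
$$

D 4.1.4. We say that the set $p \subset M$ may be ideally registered if $b_0 \cap p$ is an idealized registration procedure for each $b_0 \in \mathcal{R}_0$.

Th. 4.1.2. $p$ may be ideally registered if and only if

$$
p = \bigcup_{\substack{b \in \mathcal{R} \\ b \subset p}} b
\quad\text{and}\quad
M \setminus p = \bigcup_{\substack{b \in \mathcal{R} \\ b \subset M \setminus p}} b.
\tag{4.1.4}
$$

Proof. If $p$ may be ideally registered then for each $b_0 \in \mathcal{R}_0$ we obtain:

$$
b_0 \cap p = \bigcup_{\substack{b \in \mathcal{R} \\ b \subset b_0 \cap p}} b
\quad\text{and}\quad
b_0 \cap (M \setminus p) = \bigcup_{\substack{b \in \mathcal{R} \\ b \subset b_0 \cap (M \setminus p)}} b,
\qquad b_0 \cap (M \setminus p) = b_0 \setminus (b_0 \cap p).
\tag{4.1.5}
$$

According to APS 8.2 and APS 4.2 we have $\bigcup_{b_0 \in \mathcal{R}_0} b_0 = M$. Thus, from APS 4.2 it follows that

$$
p = M \cap p = \bigcup_{b_0 \in \mathcal{R}_0} (b_0 \cap p)
= \bigcup_{b_0 \in \mathcal{R}_0} \; \bigcup_{\substack{b \in \mathcal{R} \\ b \subset b_0 \cap p}} b
= \bigcup_{\substack{b \in \mathcal{R} \\ b \subset p}} b.
\tag{4.1.6}
$$

In a similar way we obtain

$$
M \setminus p = \bigcup_{\substack{b \in \mathcal{R} \\ b \subset M \setminus p}} b.
\tag{4.1.7}
$$

Conversely, from (4.1.6) and (4.1.7), which together are (4.1.4), it immediately follows that $p$ may be ideally registered.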
For a set $p$ which may be ideally registered the effects $\psi(b_0, b_0 \cap p) \in L$ and $\psi(b_0, b \cap p) \in L$ where $b \subset b_0$ are uniquely defined.

Th. 4.1.3. The set $\mathcal{E}_r$ of all sets which may be ideally registered is a Boolean ring.

The proof of this theorem is easy, and is left to the reader.

By analogy with the case of registration, we shall define the notion of a set which may be ideally prepared. By analogy to D 4.1.2 we define:

D 4.1.5. We say that a set $p \subset M$ may be ideally prepared if

$$
p = \bigcup_{\substack{a \in \mathcal{Q} \\ a \subset p}} a
\quad\text{and}\quad
M \setminus p = \bigcup_{\substack{a \in \mathcal{Q} \\ a \subset M \setminus p}} a.
\tag{4.1.8}
$$

The statements made above for sets which may be ideally registered are also valid for sets which may be ideally prepared.

Th. 4.1.4. The function $\varphi\colon \mathcal{Q}' \to \mathcal{K} \subset K$ has a unique extension to the set of all $a \cap p \ne \emptyset$ where $a \in \mathcal{Q}'$ and $p$ may be ideally prepared. The extension is given by $\varphi(a \cap p) = w\,\mu(w, 1)^{-1}$, where

$$
w = \sup_{\substack{\tilde{a} \in \mathcal{Q}' \\ \tilde{a} \subset a \cap p}} \lambda_{\mathcal{Q}}(a, \tilde{a})\,\varphi(\tilde{a})
= \inf_{\substack{\tilde{a} \in \mathcal{Q}' \\ a \supset \tilde{a} \supset a \cap p}} \lambda_{\mathcal{Q}}(a, \tilde{a})\,\varphi(\tilde{a}).
\tag{4.1.9}
$$

Proof. The proof proceeds in a similar manner as Th. 4.1.2, where it is only important to note that sup and inf exist in the sense of the norm in $\mathcal{B}$. For, if $w_1, w_2 \in \check{K}$ (for definition of $\check{K}$, see AIII, §6), and $w_1 \ge w_2$ it follows that $\|w_1 - w_2\| = \mu(w_1 - w_2, 1)$.

The following theorem follows directly from Th. 4.1.3.

Th. 4.1.5. The set $\mathcal{E}_p$ of all sets which may be ideally prepared is a Boolean ring of sets.

Let $\mathcal{E}_m$ denote the set $\mathcal{E}_r \cap \mathcal{E}_p$, that is, the set of all sets which may be both ideally prepared and ideally registered. Clearly $\mathcal{E}_m$ is a Boolean ring of sets.

Th. 4.1.6. For each $p \in \mathcal{E}_m$ the map $T_p\colon \mathcal{K} \to \check{K}$ (for definition of $\check{K}$, see AIII, §6) which is defined by $\varphi(a) \to \lambda_{\mathcal{Q}}(a, a \cap p)\,\varphi(a \cap p)$ is norm-continuous and has a unique extension $T_p$ on $\mathcal{B}$ which is linear and norm-continuous.
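This map is uniquely determined by the following equation

$$
\mu(w, \psi(b_0, b \cap p)) = \mu(T_p w, \psi(b_0, b)),
\tag{4.1.10}
$$

which is valid for all $w \in K$ and all $(b_0, b) \in \mathcal{F}$. In addition, the following equations hold:

$$
T_{M \setminus p} = 1 - T_p, \qquad T_{p_1 \cap p_2} = T_{p_1} T_{p_2} = T_{p_2} T_{p_1}.
\tag{4.1.11}
$$

Proof. If $w = \varphi(a)$ (that is, $w \in \mathcal{K}$) it follows that:

$$
\begin{aligned}
\mu(w, \psi(b_0, b \cap p))
&= \lambda_{\mathcal{S}}(a \cap b_0,\; a \cap b \cap p) \\
&= \lambda_{\mathcal{S}}(a \cap b_0,\; a \cap p \cap b_0)\,\lambda_{\mathcal{S}}(a \cap b_0 \cap p,\; a \cap p \cap b) \\
&= \lambda_{\mathcal{Q}}(a, a \cap p)\,\mu(\varphi(a \cap p), \psi(b_0, b)) \\
&= \mu(T_p \varphi(a), \psi(b_0, b)).
\end{aligned}
$$

Thus (4.1.10) is proven for $w \in \mathcal{K}$. From (4.1.10), if $\sum_{i=1}^{n} \alpha_i w_i = 0$, $w_i \in \mathcal{K}$, it follows that $\sum_{i=1}^{n} \alpha_i T_p w_i = 0$. Therefore $T_p$ can be uniquely extended as a linear map to the linear span of $\mathcal{K}$ in $\mathcal{B}$. If $T_p$ is norm-continuous, then it can be extended to all of $\mathcal{B}$.

In order to prove that $T_p$ is norm-continuous, we shall assume that $\operatorname{co} \mathcal{L}$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$ (which is the case for quantum mechanics, see §3 or §5). Since $[-1, 1] = 2L - 1$, for $w_1, w_2 \in \mathcal{K}$ we find that

$$
\begin{aligned}
\|T_p w_1 - T_p w_2\|
&= \sup_{g \in L} \mu(T_p w_1 - T_p w_2,\; 2g - 1) \\
&\le |\mu(T_p w_1 - T_p w_2, 1)| + 2 \sup_{g \in L} \mu(T_p w_1 - T_p w_2, g).
\end{aligned}
$$

From (4.1.10), for $w \in K$ we find that, for the special case $b = b_0$,

$$
\mu(T_p w, 1) = \mu(T_p w, \psi(b_0, b_0)) = \mu(w, \psi(b_0, b_0 \cap p))
\tag{4.1.12}
$$

and we obtain:

$$
|\mu(T_p w_1 - T_p w_2, 1)| = |\mu(w_1 - w_2, \psi(b_0, b_0 \cap p))| \le \|w_1 - w_2\|.
$$

Since $\operatorname{co} \mathcal{L}$ is dense in $L$ we obtain

$$
\sup_{g \in L} \mu(T_p w_1 - T_p w_2, g) = \sup_{g \in \mathcal{L}} \mu(T_p w_1 - T_p w_2, g).
$$

For $g \in \mathcal{L}$ there exists a $(b_0, b) \in \mathcal{F}$ with $\psi(b_0, b) = g$. Therefore we find that

$$
\sup_{g \in L} \mu(T_p w_1 - T_p w_2, g) = \sup_{(b_0, b) \in \mathcal{F}} \mu(T_p w_1 - T_p w_2, \psi(b_0, b))
$$

and, since

$$
\mu(T_p w_1 - T_p w_2, \psi(b_0, b)) = \mu(w_1 - w_2, \psi(b_0, b \cap p)) \le \|w_1 - w_2\|
$$

we finally obtain

$$
\|T_p w_1 - T_p w_2\| \le 3\,\|w_1 - w_2\|,
$$

whereupon we have proven the norm-continuity of $T_p$. Thus $T_p$ is defined on all of $\mathcal{B}$.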
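From (4.1.10) and $\psi(b_0, b \cap p) + \psi(b_0, b \cap (M \setminus p)) = \psi(b_0, b)$ it follows that

$$
\begin{aligned}
\mu(w, \psi(b_0, b \cap (M \setminus p)))
&= \mu(w, \psi(b_0, b)) - \mu(w, \psi(b_0, b \cap p)) \\
&= \mu(w, \psi(b_0, b)) - \mu(T_p w, \psi(b_0, b)) \\
&= \mu((1 - T_p) w, \psi(b_0, b)).
\end{aligned}
$$

Thus we find that $T_{M \setminus p} = 1 - T_p$.

Since $T_p$ is defined on all of $\mathcal{B}$, for $w \in K$ it follows that $\mu(w, \psi(b_0, b \cap p)) = \mu(T_p w, \psi(b_0, b))$. From

$$
\psi(b_0, b \cap p_1 \cap p_2)
= \sup_{\substack{b' \in \mathcal{R} \\ b' \subset p_1 \cap p_2 \cap b}} \psi(b_0, b')
= \sup_{\substack{\tilde{b}, b' \in \mathcal{R} \\ \tilde{b} \subset b \cap p_1,\; b' \subset b \cap p_2}} \psi(b_0, \tilde{b} \cap b')
$$

it follows that

$$
\psi(b_0, b \cap p_1 \cap p_2) = \sup_{\substack{b' \in \mathcal{R} \\ b' \subset p_2 \cap b}} \psi(b_0, p_1 \cap b').
$$

Thus we obtain

$$
\begin{aligned}
\mu(w, \psi(b_0, b \cap p_1 \cap p_2))
&= \sup_{b' \subset p_2 \cap b} \mu(w, \psi(b_0, p_1 \cap b')) \\
&= \sup_{b' \subset p_2 \cap b} \mu(T_{p_1} w, \psi(b_0, b'))
 = \mu(T_{p_1} w, \psi(b_0, b \cap p_2)) \\
&= \mu(T_{p_2} T_{p_1} w, \psi(b_0, b)).
\end{aligned}
$$

From which we obtain $T_{p_1 \cap p_2} = T_{p_1} T_{p_2} = T_{p_2} T_{p_1}$.

D 4.1.6. The set $\mathcal{E}_m$ of all subsets $p \subset M$ which may be both ideally registered and ideally prepared is called the set of objective properties of $M$.

We note that D 4.1.6 makes sense because $\mathcal{E}_m$ is a Boolean ring. From (4.1.12) it follows that the map $T_p'$ in $\mathcal{B}'$ which is dual to $T_p$ satisfies the equation

$$
\chi(p) \overset{\text{def}}{=} \psi(b_0, b_0 \cap p) = T_p' 1
\tag{4.1.13}
$$

and therefore $\psi(b_0, b_0 \cap p)$ is independent of $b_0$. If $p_1 \cap p_2 = \emptyset$, from (4.1.13) it is easy to show that

$$
\chi(p_1 \vee p_2) = \chi(p_1) + \chi(p_2).
\tag{4.1.14}
$$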
If, instead of $\mathcal{E}$ we use $\mathcal{E}_m$, and we use the sets $\tilde{\mathcal{Q}}$, $\tilde{\mathcal{R}}$, etc. which were defined with the help of $\mathcal{E}$, it follows that AE 3 and AE 4.1.2 are theorems and that

$$
\lambda_{\tilde{\mathcal{S}}}(a \cap p_1 \cap b_0,\; a \cap p_1 \cap b \cap p_2)
= \frac{\mu(T_{p_1} \varphi(a),\, \psi(b_0, b \cap p_2))}{\mu(T_{p_1} \varphi(a),\, 1)}
\tag{4.1.15}
$$

and

$$
\lambda_{\tilde{\mathcal{Q}}}(a, a \cap p) = \mu(T_p \varphi(a), 1) = \mu(\varphi(a), \chi(p))
\tag{4.1.16}
$$

are satisfied. Thus $\mathcal{E}_m$ is, in the sense of D 4.1.2, also a set of virtual properties.

In D 4.1.6 we have called the set $\mathcal{E}_m$ the set of objective properties because we believe that the mathematical structure for $\mathcal{E}_m$ characterizes what we mean intuitively by the expression “objective property.” By this we mean that the relations AE 3 and AE 4.1.2 describe the condition that the properties of the system be independent of the preparation and registration procedures. The condition that $p \in \mathcal{E}_m$ may be both ideally prepared and registered says that $p$ is not “hidden.”

We shall call those physical systems which are completely described by the set $\mathcal{E}_m$ of objective properties physical objects. How can we find a mathematically precise definition of the expression “completely described”? The set $\mathcal{E}_m$ can be so “small” that different ensembles $w_1, w_2 \in K$ cannot be distinguished by means of $\mu(w, \chi(p))$. By “completely described” we mean that $\mathcal{E}_m$ is sufficiently “large” that the $w \in K$ can be distinguished by the $p \in \mathcal{E}_m$. In mathematical terms we may express this idea as follows: $\chi(\mathcal{E}_m)$ separates $K$, that is, if $\mu(w_1, \chi(p)) = \mu(w_2, \chi(p))$ for all $p \in \mathcal{E}_m$ then $w_1 = w_2$. Thus we define:

D 4.1.7. A set $M$ of physical systems is called a set of physical objects if $\chi(\mathcal{E}_m)$ separates the $w \in K$.¹

In IV, §8.1 we shall find that microsystems are not physical objects. In [1], §12.3 we have shown what structures $K$ and $L$ must have in order to describe physical objects.
1. More precisely: if it is a certain hypothesis that $\chi(\mathcal{E}_m)$ separates the $w \in K$ (see [1], §12.3).

Intuitively, on the basis of the formulation of the set $\mathcal{E}_m$, it follows that $\mathcal{E}_m$ represents “real physical facts.” On the basis of the analysis presented in [1], §10 and in [1], §12.3 we have shown that $\mathcal{E}_m$ does represent a set of real physical facts.

For a set $\mathcal{E}$ of virtual properties (defined according to D 4.1.2) we may suppose that $\mathcal{E} \subset \mathcal{E}_r$. Then we may also define the set $\tilde{\mathcal{E}}_p$ of all sets which may be ideally prepared with respect to $\tilde{\mathcal{Q}}$ (defined according to D 4.1.1). Then, trivially, we find that $\mathcal{E} \subset \tilde{\mathcal{E}}_p$ and that $\mathcal{E} \subset \tilde{\mathcal{E}}_p \cap \mathcal{E}_r$. Then $\tilde{\mathcal{E}} = \tilde{\mathcal{E}}_p \cap \mathcal{E}_r$ is a set of imagined properties which at least may be ideally registered. The set $\tilde{\mathcal{E}}$ is, in general, not uniquely determined (there are many such sets possible). Thus $\tilde{\mathcal{E}}$ is not a set of real physical facts (in the sense of [1], §10), or in other words, is not a set of physically real and objective properties. If such a set $\tilde{\mathcal{E}}$
is so large that the equivalence classes of $\mathcal{Q}'$ defined by the equivalence relation

$$
a_1 \sim a_2\colon \quad \lambda_{\tilde{\mathcal{Q}}}(a_1, a_1 \cap p) = \lambda_{\tilde{\mathcal{Q}}}(a_2, a_2 \cap p) \quad \text{for all } p \in \tilde{\mathcal{E}}
$$

are finer than those determined by the $f \in \mathcal{F}$, then $\tilde{\mathcal{E}}$ cannot exist for microsystems (see IV, §8.1).

As we found above, microsystems are not physical objects. This fact has created a scandal. Radical assertions, such as “objective knowledge is impossible,” have been proposed. However, the fact that microsystems are not physical objects is itself objective knowledge.

The fact that microsystems are not physical objects does not mean that each separation of microsystems from the preparation and registration procedures is impossible and that in each experiment the complicated structure of each apparatus must be taken into account. It only means that we must abandon the notion of a microscopic “object,” one to which we have been accustomed. In the following section we shall use a different notion, that of a pseudoproperty, in order to separate the microsystems from the preparation and registration apparatuses.

4.2 Pseudoproperties

In quantum mechanics we find that the role of a Boolean ring $\mathcal{E}$ of sets is replaced by another structure $\mathcal{E}_{ps}$. For $\mathcal{E}_{ps} \subset \mathcal{P}(M)$ the following axioms are satisfied:

APE 1. $\mathcal{E}_{ps}$ is a lattice with respect to the order $\subset$ (set inclusion) with largest element $M$ (AI, §4).

APE 2. To each $p \in \mathcal{E}_{ps}$ the set $D_p = \{q \mid q \in \mathcal{E}_{ps} \text{ and } q \subset M \setminus p\}$ has a greatest element, which we denote by $p^*$. For $p_1 \supset p_2$, $p_1 \ne p_2$ we require that $D_{p_1} \ne D_{p_2}$.

A set $\mathcal{E}_{ps} \subset \mathcal{P}(M)$ which satisfies APE 1 and APE 2 is called a structure of species pseudoproperties.

Th. 4.2.1. From APE 1 and APE 2 it follows that $\mathcal{E}_{ps}$ is an orthocomplemented lattice.

Proof. We must prove the following three relations (AI, D 1.2):
(i) If $p \subset q$ then $q^* \subset p^*$.
(ii) $(p^*)^* = p$.
(iii) $p \wedge p^* = \emptyset$.
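(i) From $p \subset q$ it follows that $D_q \subset D_p$ and we obtain $q^* \subset p^*$.
(ii) From $p^* \subset M \setminus p$ (and therefore $p \subset M \setminus p^*$) it follows that $p \in D_{p^*}$. Let $q \supset p$ where $q \in D_{p^*}$. Then we would find that $q \subset M \setminus p^*$; that is, $p^* \subset M \setminus q$ and $p^* \in D_q$, in other words $D_p \subset D_q$. Since $D_q \subset D_p$ we obtain $D_p = D_q$. From APE 2 we obtain $p = q$. Thus we obtain $(p^*)^* = p$.
(iii) Since $p^* \subset M \setminus p$ we find that $p \cap p^* = \emptyset$, and therefore $p \wedge p^* = \emptyset$ (from which we conclude that $\mathcal{E}_{ps}$ has a least element, the empty set $\emptyset$).

Conditions APE 1 and APE 2 are natural generalizations of AE 1 and AE 2. The latter require that if $p_1, p_2 \in \mathcal{E}$, then $p_1 \cap p_2$ and $p_1 \cup p_2 \in \mathcal{E}$; in addition, if $p \in \mathcal{E}$ then $M \setminus p \in \mathcal{E}$. The former require that to $p_1, p_2 \in \mathcal{E}_{ps}$ there exists a largest element $p_3 \in \mathcal{E}_{ps}$ for which $p_3 \subset p_1 \cap p_2$ and a smallest element $p_4 \in \mathcal{E}_{ps}$ for which $p_4 \supset p_1 \cup p_2$; also, to $p \in \mathcal{E}_{ps}$ there exists a largest element $p_5 \in \mathcal{E}_{ps}$ for which $p_5 \subset M \setminus p$. The conditions APE 1 and APE 2 are as strong as possible without requiring that conditions AE 1 and AE 2 be satisfied.

Of course, AE 1 and AE 2 say little about the system under consideration unless we relate the structure $\mathcal{E}$ to the physically motivated structures $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$. The same is true for APE 1 and APE 2. We must now relate the structure $\mathcal{E}_{ps}$ to $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$.

Since we do not intend to develop the notion of “hidden” pseudoproperties, we shall proceed to develop the analogs of equations (4.1.4) and (4.1.8). Of course, we cannot assume these equations directly without obtaining the set $\mathcal{E}_m$ as a result. From (4.1.4) and (4.1.8) it follows that

$$
p = \Bigl[ \bigcup_{\substack{a \in \mathcal{Q} \\ a \subset p}} a \Bigr] \cup \Bigl[ \bigcup_{\substack{b \in \mathcal{R} \\ b \subset p}} b \Bigr].
\tag{4.2.1}
$$

This equation suggests the following generalization: For each subset $c \subset M$ we may define

$$
\pi(c) = \Bigl[ \bigcup_{\substack{a \in \mathcal{Q} \\ a \subset c}} a \Bigr] \cup \Bigl[ \bigcup_{\substack{b \in \mathcal{R} \\ b \subset c}} b \Bigr]
\tag{4.2.2}
$$

from which it follows that

$$
\pi(c) = \Bigl[ \bigcup_{\substack{a \in \mathcal{Q} \\ a \subset \pi(c)}} a \Bigr] \cup \Bigl[ \bigcup_{\substack{b \in \mathcal{R} \\ b \subset \pi(c)}} b \Bigr]
\tag{4.2.3}
$$

that is, $\pi(c)$ is an element having the form (4.2.1). For each set $p$ of the form (4.2.1) we define

$$
p^* = \pi(M \setminus p).
\tag{4.2.4}
$$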
From (4.2.4) it follows that $(p^*)^* \supset p$ and that $p_1 \subset p_2$ implies $p_2^* \subset p_1^*$. Thus it follows from $(p^*)^* \supset p$ that $[(p^*)^*]^* \subset p^*$. From $(p^*)^* \supset p$, if we replace $p$ with $p^*$ we obtain $[(p^*)^*]^* \supset p^*$. Thus we find that $(p^*)^{**} = p^*$.

Let $\Pi$ denote the set of all $p^*$ where $p$ is of the form (4.2.1). Then $\Pi$ is the set of all $p$ of the form (4.2.1) for which $p^{**} = p$.
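For $p \in \Pi$ we shall use the following abbreviated notation:

$$
p_p = \bigcup_{\substack{a \in \mathcal{Q} \\ a \subset p}} a
\quad\text{and}\quad
p_r = \bigcup_{\substack{b \in \mathcal{R} \\ b \subset p}} b.
\tag{4.2.5}
$$

We obtain

$$
p_p = \bigcup_{\substack{a \in \mathcal{Q} \\ a \subset p_p}} a, \qquad
p_r = \bigcup_{\substack{b \in \mathcal{R} \\ b \subset p_r}} b
\quad\text{and}\quad
p = p_p \cup p_r.
\tag{4.2.6}
$$

Th. 4.2.2. $\Pi$ is a structure of species pseudoproperties; in addition, the following equations are satisfied:

$$
(p_1 \wedge p_2)_p = p_{1p} \cap p_{2p}, \qquad
(p_1 \wedge p_2)_r = p_{1r} \cap p_{2r}.
\tag{4.2.7}
$$

Proof. If $p_1 \supset p_2$ and $p_1^* = p_2^*$ it follows that $p_1 = p_1^{**} = p_2^{**} = p_2$; then, by definition (4.2.4) the relation APE 2 is proven.

According to (4.2.2), $\pi(p_1 \cap p_2)$ is the largest element $p$ of the form (4.2.1) for which $p \subset p_1$ and $p \subset p_2$. It only remains to show that $\pi(p_1 \cap p_2) \in \Pi$, that is $[\pi(p_1 \cap p_2)]^{**} = \pi(p_1 \cap p_2)$. From $\pi(p_1 \cap p_2) \subset p_1$ it follows that $[\pi(p_1 \cap p_2)]^{**} \subset p_1^{**} = p_1$. Similarly it follows that $[\pi(p_1 \cap p_2)]^{**} \subset p_2$; thus we find that $[\pi(p_1 \cap p_2)]^{**} \subset p_1 \cap p_2$. We therefore obtain $[\pi(p_1 \cap p_2)]^{**} \subset \pi(p_1 \cap p_2)$. Since $[\pi(p_1 \cap p_2)]^{**} \supset \pi(p_1 \cap p_2)$ is trivial, we have proven that $\pi(p_1 \cap p_2) \in \Pi$.

We shall now prove that $[\pi(p_1^* \cap p_2^*)]^*$ is the smallest element in $\Pi$ containing $p_1$ and $p_2$. From $p \supset p_1$ and $p \supset p_2$ it follows that $p^* \subset p_1^*$ and $p^* \subset p_2^*$, and therefore we obtain $p^* \subset \pi(p_1^* \cap p_2^*)$ from which we conclude that $[\pi(p_1^* \cap p_2^*)]^* \subset p$.

Thus we find that $\Pi$ is an orthocomplemented lattice. Using $\wedge$ and $\vee$ for the lattice operations we obtain

$$
p_1 \wedge p_2 = \pi(p_1 \cap p_2), \qquad p_1 \vee p_2 = [\pi(p_1^* \cap p_2^*)]^*.
$$

From the definition of $\pi$ we also find that

$$
(p_1 \wedge p_2)_p = \bigcup_{\substack{a \in \mathcal{Q} \\ a \subset p_1 \cap p_2}} a; \qquad
(p_1 \wedge p_2)_r = \bigcup_{\substack{b \in \mathcal{R} \\ b \subset p_1 \cap p_2}} b.
$$

From these results we obtain $(p_1 \wedge p_2)_p \subset p_{1p}$, $(p_1 \wedge p_2)_p \subset p_{2p}$ and $(p_1 \wedge p_2)_p \subset p_{1p} \cap p_{2p}$. If $a_1 \in \mathcal{Q}$, $a_1 \subset p_1$ and $a_2 \in \mathcal{Q}$, $a_2 \subset p_2$ we find that $a_1 \cap a_2 \in \mathcal{Q}$ and $a_1 \cap a_2 \subset p_1 \cap p_2$ and therefore obtain

$$
(p_1 \wedge p_2)_p \supset \bigcup_{\substack{a_1 \in \mathcal{Q} \\ a_1 \subset p_1}} \; \bigcup_{\substack{a_2 \in \mathcal{Q} \\ a_2 \subset p_2}} (a_1 \cap a_2) = p_{1p} \cap p_{2p}.
$$

Thus we obtain the first equation in (4.2.7). The second equation is obtained similarly.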
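The construction of $\Pi$ from (4.2.2) and (4.2.4) can be tried out on a small finite model. The following is only a sketch under stated assumptions: the four-element set `M` and the families `Q` and `R` are hypothetical toy data chosen for illustration (they are not taken from the text and are not claimed to satisfy all axioms for preparation and registration procedures), and the function names `pi`, `star`, `meet` and `join` are introduced here for convenience.

```python
from itertools import combinations

# Hypothetical toy data (an assumption, not from the text): a four-element
# set M with two small families of subsets standing in for Q and R.
M = frozenset(range(4))
Q = [frozenset({0}), frozenset({0, 1})]
R = [frozenset({1}), frozenset({2}), frozenset({2, 3})]


def pi(c):
    """pi(c) as in (4.2.2): union of all members of Q and R contained in c."""
    out = frozenset()
    for s in Q + R:
        if s <= c:
            out |= s
    return out


def star(p):
    """p* = pi(M \\ p), as in (4.2.4)."""
    return pi(M - p)


def subsets(m):
    """Yield every subset of m as a frozenset."""
    items = sorted(m)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


# Sets "of the form (4.2.1)" are exactly the fixed points of pi.
form_421 = {p for p in subsets(M) if pi(p) == p}

# Pi is the set of all p* with p of the form (4.2.1).
Pi = {star(p) for p in form_421}


def meet(p1, p2):
    """p1 ^ p2 = pi(p1 n p2)."""
    return pi(p1 & p2)


def join(p1, p2):
    """p1 v p2 = [pi(p1* n p2*)]*."""
    return star(pi(star(p1) & star(p2)))


for p in sorted(Pi, key=lambda s: (len(s), sorted(s))):
    print(sorted(p), "->", sorted(star(p)))
```

In this toy model the set $\{2\}$ is of the form (4.2.1) but does not belong to $\Pi$, since $\{2\}^{**} = \{2, 3\}$; this shows that the condition $p^{**} = p$ is a genuine restriction.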
Let $A_p = \{\varphi(a) \mid a \in \mathcal{Q}',\ a \subset p\}$. From D 3.1 it follows from $p \cap p^* = \emptyset$ that $a \cap b = \emptyset$ for all $a \subset p$ and $b \subset p^*$, that is, $\psi(b_0, b) \in L_0(A_p)$ for all $b \subset p^*$. Thus we also obtain $\psi(b_0, b) \in L_0(A_{p^*})$ for all $b \subset p$. From the definition

$$
\psi(b_0, b_0 \cap p_r) = \sup_{b \subset b_0 \cap p_r} \psi(b_0, b)
$$

it follows that $\psi(b_0, b_0 \cap p_r) \in L_0(A_{p^*})$.

For the case in which $p_r = p_p$ (and thus $p_r = p_p = p$) (that is, in the case of objective properties), then for all $a \subset p$, for $a \cap p_r = a$ we obtain

$$
\lambda_{\mathcal{S}}(a \cap b_0,\; a \cap b_0 \cap p_r) = 1
$$

from which we conclude that $\psi(b_0, b_0 \cap p_r) \in L_1(A_p)$.¹

For a pseudoproperty we shall require that there exists at least one registration method $b_0$ for which $\psi(b_0, b_0 \cap p_r) \in L_1(A_p)$, from which it follows that

$$
L_0(A_{p^*}) \cap L_1(A_p) \ne \emptyset.
\tag{4.2.8}
$$

The fact that there is no $b \in \mathcal{R}$ for which $b \subset p$ and $b \subset p^*$, that is, there is no $\psi(b_0, b)$ for which $\psi(b_0, b) \in L_0(A_p)$ and $\psi(b_0, b) \in L_0(A_{p^*})$, leads us to impose the following additional condition for pseudoproperties:

$$
L_0(A_p \cup A_{p^*}) = L_0(A_p) \cap L_0(A_{p^*}) = \{0\}.
\tag{4.2.9}
$$

We note that (4.2.9) is satisfied for objective properties because, for each $a \in \mathcal{Q}'$ the relation $a = (a \cap p) \cup (a \cap p^*)$ holds where $p^* = M \setminus p$ (that is, $\varphi(a) = \lambda \varphi(a \cap p) + (1 - \lambda) \varphi(a \cap p^*)$ holds where $\lambda = \lambda_{\mathcal{Q}}(a, a \cap p)$).

We could not show that the set of all $p$ for which there exists a $b_0$ for which $\psi(b_0, b_0 \cap p_r) \in L_1(A_p)$ and for which (4.2.9) is satisfied is an orthocomplemented sublattice of $\Pi$. For this reason it is necessary to define:

D 4.2.1. An orthocomplemented sublattice $\mathcal{E}_{ps}$ of $\Pi$ which satisfies the conditions
(i) to each $p \in \mathcal{E}_{ps}$ there exists a $b_0 \in \mathcal{R}_0$ such that $\psi(b_0, b_0 \cap p_r) \in L_1(A_p)$, and
(ii) the relation (4.2.9) is satisfied for all $p \in \mathcal{E}_{ps}$,
is called a set of actual physical pseudoproperties.

D 4.2.2. A set $\mathcal{E}_{ps}$ of actual physical pseudoproperties is said to be sufficient if the set of all $\psi(b_0, b_0 \cap p_r) \in L_1(A_p)$ for $p \in \mathcal{E}_{ps}$ separates the elements of $K$.
Whether such sets $\mathcal{E}_{ps}$ satisfying D 4.2.1 and D 4.2.2 exist for a given theory cannot be determined in general. For microsystems the existence of such sets is certain (see IV, §8.2).

1. By analogy to D 3.1 we define $L_1(A) = \{g \mid g \in L \text{ and } \mu(w, g) = 1 \text{ for all } w \in A \subset K\}$.
5 Ensembles and Effects in Quantum Mechanics

At the end of §3 we suggested that the “end result” obtained from the fundamental laws of preparation and registration given in §3 may also be formulated as an axiom. This axiom may be motivated by a reasoning process which uses the so-called correspondence principle and certain aspects of the measurement process (see I, XVII, [2], XI, §1 and [2], XIII, §3).

We shall now formulate the fundamental axiom of quantum mechanics (which according to §3 may be obtained as a theorem from axioms AV 1.1, AV 1.2s, AV 2f, AV id, AV 3, and AV 4s) as follows:

AQ. There exists an injective mapping $\beta$ of $\mathcal{K}$ into the basis $K$ of $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and an injective mapping $\gamma$ of $\mathcal{L}$ into the order unit interval $[0, 1]$ in $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ for which

$$
\mu(w, g) = \operatorname{tr}((\beta w)(\gamma g))
$$

holds for $w \in \mathcal{K}$, $g \in \mathcal{L}$ where $\beta \mathcal{K}$ is (norm) dense in $K$ and $\gamma \mathcal{L}$ is $\sigma(\mathcal{B}', \mathcal{B})$ dense in $[0, 1]$.

Both Banach spaces $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and the canonical bilinear form $\operatorname{tr}(uv)$ where $u \in \mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $v \in \mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ are defined in AIV, §15. In AIV, §15 it will also be shown that $\mathcal{B}'(\mathcal{H}_1, \ldots)$ and $\mathcal{B}(\mathcal{H}_1, \ldots)$ may be identified as subsets of the Banach algebra $\mathcal{A}(\mathcal{H}_1, \ldots)$. $\mathcal{B}(\mathcal{H}_1, \ldots)$ is a base-norm space and $\mathcal{B}'(\mathcal{H}_1, \ldots)$ is its dual Banach space (and therefore is an order unit space). The order unit in $\mathcal{B}'(\mathcal{H}_1, \ldots)$ is the unit operator $1$ in $\mathcal{A}(\mathcal{H}_1, \ldots)$. In the following we shall often make use of the fact that $\mathcal{B}(\mathcal{H}_1, \ldots)$ and $\mathcal{B}'(\mathcal{H}_1, \ldots)$ are subsets of $\mathcal{A}(\mathcal{H}_1, \ldots)$. Where products of elements from $\mathcal{B}(\mathcal{H}_1, \ldots)$ and $\mathcal{B}'(\mathcal{H}_1, \ldots)$ occur, they are to be understood in the sense of the algebra $\mathcal{A}(\mathcal{H}_1, \ldots)$. (For the formulation of axiom AQ, see also [2], XIII, §3; the above formulation of axiom AQ is somewhat weaker than that given in [2], XIII, §3. The stronger form in [2], XIII, §3 was chosen for its simplicity.)

On the basis of AQ we shall identify $\mathcal{K}$ with $\beta \mathcal{K}$ and $\mathcal{L}$ with $\gamma \mathcal{L}$ and write $\mu(w, g) = \operatorname{tr}(wg)$. Then the maps $\varphi$ of $\mathcal{Q}'$
into $\mathcal{K}$ and $\psi$ of $\mathcal{F}$ into $\mathcal{L}$ correspond to maps of $\mathcal{Q}'$ in $K$ and $\mathcal{F}$ into $[0, 1]$. Instead of $[0, 1]$ we shall write $L$. According to AQ the set $\varphi \mathcal{Q}'$ is dense in $K$ and $\psi \mathcal{F}$ is dense in $L$. We shall call the elements of $K$ ensembles and the elements of $L$ effects.

Let $\mathcal{D}$ denote the norm-closed subspace of $\mathcal{B}'(\mathcal{H}_1, \ldots)$ generated by $\mathcal{L}$. Then $\mathcal{D}$ is a Banach space and $\mathcal{B}(\mathcal{H}_1, \ldots)$ may be identified with a subspace of $\mathcal{D}'$. $\mathcal{D}$ is norm-separable; $\mathcal{B}'$ is not. We shall seek to characterize $\mathcal{D}$ by axioms in VII, §8.

Let $\bar{K}$ denote the $\sigma(\mathcal{D}', \mathcal{D})$ closure of $K$ in $\mathcal{D}'$. For the case in which $\mathcal{D}$ is the set of real elements of a C*-algebra the mathematical situation for $\mathcal{D}$, $\bar{K} \subset \mathcal{D}'$ has been thoroughly researched (see, for example, [30]). Note that the so-called “set of states” in the theory of a C*-algebra is denoted here by $\bar{K}$ instead of $K$; for the problem of $\bar{K}$ see VI.

In order to prevent misunderstandings about the physical meaning of axiom AQ we make the following comments: The countable subsets $\mathcal{K}$ of
$K \subset \bar{K}$ and $\mathcal{L}$ of $L$ are, in certain topologies, dense in $\bar{K}$ or $L$, respectively. Here $K$, $\bar{K}$, and $L$ are mathematical extensions or “completions” analogous to the completion of the set of rational numbers by the real numbers.

We now pose the “converse” question: Are the “physical sets” $\mathcal{K}$ and $\mathcal{L}$ “special” subsets of $K$, $\bar{K}$, and $L$? If so, how? An important tool in the resolution of this question is the description of the physical distinguishability of ensembles and effects which we have described in §3. This fact suggests that, in addition to the sets $\mathcal{K}$ and $\mathcal{L}$, the topologies generated on $\mathcal{K}$ by $\mathcal{L}$ and on $\mathcal{L}$ by $\mathcal{K}$ by means of equations of the form (3.1) will also be important. These topologies are identical to $\sigma(\mathcal{D}', \mathcal{D})$ on $K$ and $\bar{K}$ and to $\sigma(\mathcal{B}', \mathcal{B})$ on $L$. Note that the assumption that the ensembles may be distinguishable by the topology $\sigma(\mathcal{B}, \mathcal{B}')$ is false because the topology $\sigma(\mathcal{B}, \mathcal{B}')$ is actually “finer” on $K$ than is $\sigma(\mathcal{D}', \mathcal{D})$.

Since two elements of $\bar{K}$ may be physically distinguishable only in the sense of the topology $\sigma(\mathcal{D}', \mathcal{D})$, no subset of $\bar{K}$, that is, no special $\mathcal{K}$, has special physical significance. However, $K$ itself is of special significance as a subset of $\bar{K}$ only because the topology generated by $K$ on $L$ (which is identical to $\sigma(\mathcal{B}', \mathcal{B})$) correctly describes the physical distinguishability of effects. We note, however, that every subset of $K$ which is dense (with respect to the norm) in $K$ generates the same topologies on $L$ as do $\mathcal{K}$, $K$, and $\mathcal{B}$.

In summary, we find that $\mathcal{K}$ (as a subset of $K$) and $\mathcal{L}$ (as a subset of $L$) are not uniquely determined by the physics; every subset $\tilde{K}$ of $K$ which is norm-dense in $K$ (including $K$ itself) and every subset $\tilde{L}$ of $L \cap \mathcal{D}$ which is norm-dense in $L \cap \mathcal{D}$ (also $L \cap \mathcal{D}$ itself) may be used as the set of “physical” ensembles or effects in the sense that the topologies generated by $\tilde{K}$ on $\tilde{L}$ and by $\tilde{L}$ on $\tilde{K}$ are correct for “physical distinguishability” (see also the general discussion in [1], §10.5).
We may, therefore, choose sets $\tilde{K}$ and $\tilde{L}$ of the above type on the basis of our mathematical convenience without “changing the physics.” In this book $K \subset \mathcal{B}(\mathcal{H}_1, \ldots)$ and $L \subset \mathcal{B}'(\mathcal{H}_1, \ldots)$ will play a central role (we shall discuss $\mathcal{D}$, $\mathcal{D}'$ only briefly in VI and VIII), because the mathematics for Hilbert spaces is extensively developed, and because the methods developed for the dual spaces $\mathcal{D}$, $\mathcal{D}'$ are, at present, too cumbersome for practical computation. The criterion for mathematical accessibility may change in time as new mathematical methods are developed (recall the development of the Hamilton-Jacobi formulation of mechanics).

If we introduce the axiom AQ as a substitute for a set of axioms such as that given in §3, we find it desirable to substitute the following definition for D 3.2:

D 5.1. If axiom AQ is satisfied, then we shall call $M$ (together with the structures $\mathcal{Q}$, $\mathcal{R}_0$, and $\mathcal{R}$) a set of microsystems.

On the basis of the above formulation of the foundations of quantum mechanics it is clear that the Hilbert space (as a complex vector space) does not directly describe a physical structure. Instead it is a computational tool
which permits us to cleverly handle the structure of the convex set $K$. Since the positive affine functionals on $K$ are identical to the elements of the positive cone of $\mathcal{B}'$ (see AIII, §6), it is the structure of $K$ alone (and the choice of topology $\sigma(\mathcal{B}, \mathcal{D})$ on $K$) which determines the physical structure of microsystems. The structure of $K$ also contains the so-called “wave character” of microsystems; the Schrödinger wave functions, the elements of a Hilbert space, are only mathematical tools for obtaining a better “handle” on this “wave character.” Only the elements of $K$ are “physically real” (in the sense of [1], §10 and [2], III, §9). The elements of the Hilbert spaces $\mathcal{H}_i$ are not. The elements of $\mathcal{H}_i$ form only a particular representation basis for the elements of $K$, a representation analogous to the special coordinates used for the representation of the orbit of mass-points in mechanics. The role of Hilbert space as a representation tool for physically important quantities will be developed in IX.

6 Decision Effects and Faces of K

In II, Th. 4.5.2(iv) and in III, Th. 2.1 we have examined the notion of a decomposition of preparation procedures and defined the notion of a mixture of ensembles by (2.2). Th. 2.2 states that to each “mixture” of ensembles $w_i$ with weights $\lambda_i$ there is an analogous “mixture” $a$ of preparation procedures $a_i$.

The notions of ensemble, mixture, and decomposition may be expressed in simple terms if we use AQ. If we identify the elements $w \in \mathcal{K}$ with the elements $w \in K \subset \mathcal{B}$ it follows that (3.2) is equivalent to

$$
w = \sum_{i=1}^{n} \lambda_i w_i,
\tag{6.1}
$$

where (6.1) is written as an equation among elements of the Banach space $\mathcal{B}$. Equation (6.1) states that $w$ is a convex combination of the $w_i$. Since the set $K$ is convex, every convex combination of elements $w_i \in K$ is also an element of $K$.

Let $0 < \lambda < 1$ and let $w_1, w_2 \in K$. Then $w = \lambda w_1 + (1 - \lambda) w_2$ is a “mixture” of $w_1$, $w_2$ (in ratio $\lambda$ to $1 - \lambda$).
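As a small numerical illustration of (6.1) for $n = 2$, the following sketch represents ensembles as real symmetric $2 \times 2$ matrices with trace 1 and checks that a mixture is again an ensemble. The matrices, the weight 0.25 and the helper names `mix`, `trace`, `is_positive` and `is_ensemble` are illustrative assumptions, not taken from the text.

```python
# Minimal sketch (illustrative values only): ensembles as real symmetric
# 2x2 density matrices, stored as nested lists.

def mix(lam, w1, w2):
    """Return the mixture lam*w1 + (1 - lam)*w2, cf. (6.1) with n = 2."""
    return [[lam * w1[i][j] + (1 - lam) * w2[i][j] for j in range(2)]
            for i in range(2)]


def trace(w):
    """Trace of a 2x2 matrix."""
    return w[0][0] + w[1][1]


def is_positive(w, tol=1e-12):
    """A symmetric 2x2 matrix is positive iff its trace and determinant are >= 0."""
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    return trace(w) >= -tol and det >= -tol


def is_ensemble(w, tol=1e-12):
    """Membership test for K in this toy setting: positive with trace 1."""
    return is_positive(w, tol) and abs(trace(w) - 1.0) < tol


w1 = [[1.0, 0.0], [0.0, 0.0]]   # projection onto the first basis vector
w2 = [[0.5, 0.5], [0.5, 0.5]]   # projection onto (1, 1)/sqrt(2)
w = mix(0.25, w1, w2)           # mixture in ratio 0.25 to 0.75

print(is_ensemble(w1), is_ensemble(w2), is_ensemble(w))
```

The same check with a weight outside the interval from 0 to 1 can fail positivity, which is why (6.1) is restricted to convex combinations.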
If $w \in K$ can be written in the form $w = \lambda w_1 + (1 - \lambda) w_2$ where $w_1, w_2 \in K$, $0 < \lambda < 1$, then the decomposition $w = \lambda w_1 + (1 - \lambda) w_2$ represents $w$ as a “mixture” of the ensembles $w_1$, $w_2$.

The norm-closed subsets of $K$ which are invariant under mixtures and decompositions play an important role. A norm-closed subset $C$ of $K$ is invariant under mixtures and decomposition if for $w_1, w_2 \in C$ we obtain $\lambda w_1 + (1 - \lambda) w_2 \in C$ for all $\lambda$ which satisfy $0 < \lambda < 1$, and if from $w \in C$, $w = \lambda w_1 + (1 - \lambda) w_2$ (where $0 < \lambda < 1$ and $w_1, w_2 \in K$) it follows that $w_1, w_2 \in C$.

The norm-closed subsets of $K$ which are closed under mixtures and decomposition are known as norm-closed “faces” in the mathematical theory of convex sets. The “faces” of $K$ obtain their physical meaning from the
concepts of mixture and decomposition. We only consider the norm-closed faces because (from the discussion presented in §5) a subset $C$ of $K$ is not physically distinguishable from its norm-closure.

Of special importance are those faces of $K$ which consist only of a single point (and are therefore norm-closed). Such a face is called an extreme point of $K$ and is characterized as follows: An element $w$ is an extreme point of $K$ if and only if from $w = \lambda w_1 + (1 - \lambda) w_2$, $w_1, w_2 \in K$, and $0 < \lambda < 1$ it follows that $w = w_1 = w_2$.

The “physical meaning” of an extreme point of $K$ is that it represents an “irreducible ensemble,” often called a “pure state.” We shall not use this expression in this book because we do not wish to create false associations with the notion of a pure state, for example, the false notion that all microsystems $x \in a$ for irreducible $\varphi(a)$ are “identical” while only those $x \in a$ for which $\varphi(a)$ is reducible can be nonidentical. Ontological concepts such as “identical” and “nonidentical” can easily lead to contradictions with experiment.

D 6.1. The set of extreme points of a convex set $K$ will be frequently denoted by $\partial_e K$. The set of norm-closed faces of $K$ will be denoted by $\mathcal{W}(K)$.

What are the elements of $\partial_e K$ and of $\mathcal{W}(K)$?

Th. 6.1. Every closed face $C$ of $K$ can be written as $C(w)$ for a suitably chosen $w \in K$ where $C(w)$ is the smallest closed face which contains $w$.

Proof. Since $\mathcal{B}$ is separable, $K$ and $C$ are also separable (see AIV, §15). Thus, in $C$ there is a norm-dense denumerable subset $\{w_i\}$, $w_i \in C$. Choose real numbers $0 < \lambda_i < 1$ with $\sum_i \lambda_i = 1$; then $w = \sum_i \lambda_i w_i \in C$ since $C$ is norm-closed and convex. We shall now show that $C = C(w)$. From $w \in C$ it follows that $C(w) \subset C$. From $w = \lambda_k w_k + (1 - \lambda_k) w_k'$ where

$$
w_k' = (1 - \lambda_k)^{-1} \sum_{i \ne k} \lambda_i w_i \in K,
$$

it follows that $w_k \in C(w)$. Since all $w_k$ belong to $C(w)$ and are dense in $C$, and $C(w)$ is norm-closed, it follows that $C \subset C(w)$.

D 6.2.
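The set of elements $e \in L$ for which $e$ is a projection operator in the algebra $\mathcal{A}(\mathcal{H}_1, \ldots)$ (that is, $e^2 = e$ and $e^+ = e$) (see AIV, §15) will be denoted by $G$.

In the following we shall make use of the concise notation which is presented in AIV, §15: $\mathcal{H} = \bigcup_\nu \mathcal{H}_\nu$ is the “sum” of the sets $\mathcal{H}_1, \mathcal{H}_2, \ldots$ etc. Every element of $\mathcal{H}$ is characterized by an index $\nu$ and a $\varphi \in \mathcal{H}_\nu$. $\mathcal{H}$ is not a vector space. The elements of $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ can be considered to be operators in $\mathcal{H}$. As “subspaces” of $\mathcal{H}$ we shall denote only those subsets $\mathcal{T}$ of $\mathcal{H}$ for which $\mathcal{T} = \bigcup_\nu \mathcal{T}_\nu$ where the $\mathcal{T}_\nu$ are closed linear subspaces of $\mathcal{H}_\nu$. $\mathcal{T}_1 \perp \mathcal{T}_2$ means that for $\mathcal{T}_1 = \bigcup_\nu \mathcal{T}_{1\nu}$ and $\mathcal{T}_2 = \bigcup_\nu \mathcal{T}_{2\nu}$ the relations $\mathcal{T}_{1\nu} \perp \mathcal{T}_{2\nu}$ hold for all $\nu$. $\mathcal{T}^\perp$ is the subspace $\bigcup_\nu \mathcal{T}_\nu^\perp$.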
Th. 6.2. To each closed face $C$ of $K$ there is a uniquely determined $e \in G$ for which $w \in C$ is equivalent to $w = ewe$. The map so defined of all faces of $\mathcal{W}(K)$ in $G$ is an order isomorphism of $\mathcal{W}(K)$ onto $G$.

Proof. According to Th. 6.1, $C = C(w)$ where $w = \sum_i \lambda_i w_i$ and $\{w_i\}$ is dense in $C$. Let $\mathcal{T}$ denote the support of $w$, that is, $\mathcal{T}$ is the space which is orthogonal to the eigenspace of $w$ which has eigenvalue 0 (see AIV, §11). Let $e$ be the projection operator for $\mathcal{T}$. Thus we find that $w = ewe$. The condition $w = ewe$ is equivalent to the condition $(1 - e) w (1 - e) = 0$ (see AIV, §6). Using the decomposition which is found in the proof of Th. 6.1,

$$
w = \lambda_k w_k + (1 - \lambda_k) w_k',
$$

it follows that

$$
0 = (1 - e) w (1 - e) = \lambda_k (1 - e) w_k (1 - e) + (1 - \lambda_k)(1 - e) w_k' (1 - e).
$$

Since the operators $(1 - e) w_k (1 - e)$ and $(1 - e) w_k' (1 - e)$ are positive operators in $\mathcal{B}(\mathcal{H}_1, \ldots)$ it follows that $(1 - e) w_k (1 - e) = 0$ and therefore $w_k = e w_k e$. Since $\{w_k\}$ is dense in $C$, it follows that $\tilde{w} = e \tilde{w} e$ for all $\tilde{w} \in C$.

Conversely, let $\tilde{w} \in K$ and $\tilde{w} = e \tilde{w} e$. The eigenvectors (which correspond to nonzero eigenvalues) of $\tilde{w}$ lie in the projection space of $e$. Since $C$ is convex and norm-closed, we obtain $\tilde{w} \in C$ provided that for all elements $\varphi \in \mathcal{T}$ the corresponding projections $P_\varphi$ lie in $C$ (for $P_\varphi$ see AIV, §6).

By analogy with the proof in Th. 6.1 it can be proven that for every eigenelement $\varphi_\alpha$ of $w$ having a nonzero eigenvalue, the corresponding $P_{\varphi_\alpha}$ lies in $C$. The $\{\varphi_\alpha\}$ span all of $\mathcal{T}$. From

$$
\frac{1}{n} \sum_{i=1}^{n} P_{\varphi_{\alpha_i}} = \frac{1}{n} \sum_{i=1}^{n} P_{\eta_i},
$$

where the $\{\eta_i\}$ is an orthogonal system which spans the same subspace as the $\{\varphi_{\alpha_i}\}$, it follows that $P_\varphi$ lies in $C$ for each $\varphi$ from a finite-dimensional subspace $\mathcal{T}'$ spanned by elements of $\{\varphi_\alpha\}$. Since each $\varphi$ from $\mathcal{T}$ may be approximated by a $\varphi'$ from such a finite-dimensional subspace of $\mathcal{T}$ (and, therefore, $P_\varphi$ is approximated in the norm-topology by $P_{\varphi'}$ (see AIV, §11)) it follows that all $P_\varphi$ for which $\varphi \in \mathcal{T}$ lie in $C$. Thus we have shown that $C$ consists of all $w$ for which $w = ewe$.
In this way the element $e$ which corresponds to $C$ is uniquely determined, because the set of those $w$ for which $w = ewe$ contains all $P_\varphi$ for which $\varphi$ is contained in the projection space of $e$, and the projection spaces of different $e$ are different.

It is easy to show that the converse holds: for each $e \in G$ the set of all $w$ for which $w = ewe$ is a norm-closed face of $K$.

It is easily shown that $e_1 \ge e_2$ is equivalent to $C_1 \supseteq C_2$. Therefore the correspondence between the elements $e \in G$ and $C$ is an order isomorphism of $\phi(K)$ onto $G$.

$\phi(K)$ is a complete lattice, since $K \in \phi(K)$ and (as is easily shown) for every set $\{C_\alpha\}$ of closed faces, $\bigcap_\alpha C_\alpha$ is a closed face. By the order isomorphism of $\phi(K)$ and $G$ it follows that $G$ is a complete lattice (which also follows directly from AIV, §6, because the set of subspaces of $\mathcal{H}$ is a complete lattice; see AIV, §2).
Th. 6.3. The condition that the relation $w = ewe$ holds for $w \in K$ and $e \in G$ is equivalent to the condition $\mu(w, e) = \operatorname{tr}(we) = 1$, that is, $w \in K_1(e)$ where $K_1(e)$ is defined by D 3.1.

Proof. From $w = ewe$ it follows that $\mu(w, e) = \operatorname{tr}(we) = \operatorname{tr}(we^2) = \operatorname{tr}(ewe) = \operatorname{tr}(w) = 1$. From $\mu(w, e) = 1$ it follows that $\mu(w, 1 - e) = \operatorname{tr}(w(1 - e)) = 0$ and $\operatorname{tr}(w(1 - e)) = \operatorname{tr}((1 - e)w(1 - e)) = 0$. Since $(1 - e)w(1 - e) \ge 0$ it follows that $(1 - e)w(1 - e) = 0$ and $w = ewe$ (see AIV, §6).

Using the notation of D 4.1, from Th. 6.2 and Th. 6.3 we find that every $C \in \phi(K)$ is of the form $C = K_1(e)$ for some $e \in G$. Thus we obtain the first part of the following theorem:

Th. 6.4. An order isomorphism $\phi(K) \leftrightarrow G$ is defined by $C = K_1(e)$, $C \in \phi(K)$ and $e \in G$. An anti-order isomorphism $\phi(K) \leftrightarrow G$ is defined by $C = K_0(e)$ with $C \in \phi(K)$. In particular,

$$K_1(e_1) \cap K_1(e_2) = K_1(e_1) \wedge K_1(e_2) = K_1(e_1 \wedge e_2), \qquad K_1(e_1) \vee K_1(e_2) = K_1(e_1 \vee e_2),$$

$$K_0(e_1) \cap K_0(e_2) = K_0(e_1) \wedge K_0(e_2) = K_0(e_1 \vee e_2), \qquad K_0(e_1) \vee K_0(e_2) = K_0(e_1 \wedge e_2).$$

Proof. We need only prove that $e \to K_0(e)$ is an anti-isomorphism of $G$ onto $\phi(K)$. This follows directly from the fact that $e \to 1 - e$ is an anti-isomorphism of $G$ onto itself and $K_0(e) = K_1(1 - e)$.

Th. 6.5. The set $\partial_e K$ is equal to the set of all $P_\varphi$.

Proof. According to AIV, §11, for each $w \in K$ we have the spectral representation

$$w = \sum_\alpha \lambda_\alpha P_{\varphi_\alpha}. \tag{6.2}$$

From $\mu(w, 1) = 1$ it follows that $\sum_\alpha \lambda_\alpha = 1$. Clearly $w$ can be an extreme point of $K$ only if exactly one $\lambda_\alpha \ne 0$ in (6.2), that is, if $w$ is of the form $P_\varphi$ for some $\varphi$. $P_\varphi$ is also an extreme point of $K$: from $P_\varphi = \lambda w_1 + (1 - \lambda) w_2$, $0 < \lambda < 1$, it follows that, for $e = 1 - P_\varphi$,

$$0 = \lambda e w_1 e + (1 - \lambda) e w_2 e.$$

Since $e w_{1,2} e \ge 0$ it follows that $e w_1 e = 0$ and $e w_2 e = 0$.
From $e w_1 e = 0$ it follows that $w_1 e = 0$ (see AIV, §6); thus, from $e = 1 - P_\varphi$ we obtain $w_1 = \eta P_\varphi$. Since $\operatorname{tr}(w_1) = 1$ it follows that $\eta = 1$. Similarly, it follows that $w_2 = P_\varphi$.

In addition to the faces of $K$ (which are special sets of ensembles), the elements of $G$ (effects) also have a special physical meaning. The physical meaning of the elements of $L$ is established by the map $\psi$ of $\mathcal{F}$ into $L$ (as presented in §1, §3, and §5). In many formulations of quantum mechanics there is an implicit assumption that every experiment $(b_0, b) \in \mathcal{F}$ corresponds to a $\psi(b_0, b) \in G$. In these formulations $(b_0, b)$ is an "observable" having measurement values 0, 1 and must be identified with a projection operator $e$, because only projection operators have eigenvalues 0, 1 (see, for example, [2], XI, §2.1). Actually, the statement that all $\psi(b_0, b)$ are elements of $G$ is incorrect. On the contrary, an experimental physicist will not know in advance whether $\psi(b_0, b)$ corresponds, at least approximately, to an $e \in G$ or whether he has only registered an effect $g = \psi(b_0, b) \in L$ (see XVII and [2], XII).

We shall now seek to obtain a physically meaningful definition of the notion of a decision effect, that is, one which relates to the probability function $\mu(w, g)$. For this purpose we shall use D 3.1. For $A \subset K$, $L_0(A)$ is the set of effects for which there will be no detection response for any $w \in A$. $K_0 L_0(A)$ is a closed face of $K$ with $A \subset K_0 L_0(A)$. For $C = K_0 L_0(A)$ we find that $L_0(A) = L_0(C)$.

An element $g_1 \in L$ is said to be more "sensitive" than $g_2 \in L$ if $g_1 \ge g_2$, that is, if $\mu(w, g_1) \ge \mu(w, g_2)$ for all $w \in K$. Then the effect $g_1$ will respond more frequently than the effect $g_2$. Is there a most sensitive effect in the set $L_0(A) = L_0(C)$? If such an effect exists, then, although it cannot respond to any $w \in A$, it will respond to each $w \in K$ "as frequently as possible." We shall now use an example to clarify this situation.
A filter for light may be considered to be a $b_0$; here $b$ is the set of light quanta which are absorbed by the filter $b_0$. If $g = \psi(b_0, b) \in L_0(A)$ then $w \in A$ represents the "light" which passes through the filter without absorption. Consider, for example, the absorption coefficient of the filter $b_0$ as a function of frequency (see Figure 2). All the light which contains only those frequencies for which the absorption coefficient is zero passes through $b_0$ without absorption. By "combining" several filters of type $b_0$ it is possible to construct filters for which the absorption coefficient is approximately 0 or 1 and which pass all light that passes through $b_0$ without absorption. Such a filter represents a type of "maximally sensitive" filter which satisfies the additional condition that all light $w \in A$ passes through it without absorption.

The following definition is motivated by the above example:

D 6.3. If $L_0(A)$ has a maximal element $e$, then $e$ is called a decision effect. We shall denote the largest element of $L_0(A)$ by $eL_0(A)$.

Th. 6.6. Every $L_0(A)$ has a maximal element $eL_0(A)$; the set of decision effects is equal to $G$ (see D 6.2). In addition $G = \partial_e L$ and $L_0(A) = L_0 K_0(eL_0(A))$.
Proof. As we have seen earlier, $L_0(A) = L_0(C)$ where $C$ is a closed face of $K$. According to Th. 6.4, $C = K_1(\bar{e})$ where $\bar{e} \in G$. For $e = 1 - \bar{e}$ we find that $C = K_0(e)$ and we therefore obtain $L_0(A) = L_0 K_0(e)$. Thus we find that $e \in L_0(A)$.

From $g \in L_0(A)$ it follows that $\operatorname{tr}(wg) = 0$ for all $w \in K_0(e)$; in particular, $\operatorname{tr}(P_\varphi g) = 0$ for all $P_\varphi \in K_0(e) = K_1(\bar{e})$. Since $\bar{e} = 1 - e$, from $P_\varphi \in K_1(\bar{e})$ it follows that $(1 - e)\varphi = \varphi$. Thus we obtain $0 = \operatorname{tr}(P_\varphi g) = \langle \varphi, g\varphi \rangle$ for all $\varphi$ for which $(1 - e)\varphi = \varphi$. Since $g \ge 0$ we obtain $g\varphi = 0$ for all $\varphi$ in the projection space of $1 - e$. From this we conclude that $g(1 - e) = 0$, that is, $g = ge$. This result, together with the adjoint equation $g = eg$, yields $g = ege$. From $g \le 1$ it follows that $ege \le e$ and hence $g \le e$. Therefore $e$ is the maximal element of $L_0(A) = L_0 K_0(e)$.

Each $e \in G$ is obtained as a maximal element of some $L_0(A)$, namely of $L_0(A)$ where $A = K_0(e)$.

Suppose that $e \in G$, $g_1, g_2 \in L$, $0 < \lambda < 1$ and $e = \lambda g_1 + (1 - \lambda) g_2$. Then it follows that $g_1, g_2 \in L_0 K_0(e)$ and consequently $g_1 \le e$ and $g_2 \le e$. If $g_1 \ne e$ (or $g_2 \ne e$) then $\lambda g_1 + (1 - \lambda) g_2 \ne e$; therefore $g_1 = g_2 = e$. Thus $G \subset \partial_e L$.

For $g \in L$, from $0 \le 1 - g \le 1$ it follows that $0 \le (1 - g)^2 = 1 - 2g + g^2 \le 1 - g \le 1$. We therefore obtain $0 \le 2g - g^2 \le 1$ and $0 \le g^2 \le 1$. For $g' = 2g - g^2$ and $g'' = g^2$ we find that $g', g'' \in L$ and that $g = \tfrac{1}{2} g' + \tfrac{1}{2} g''$. If $g \in \partial_e L$ then $g' = g'' = g$, that is, $g = g^2$ and $g \in G$. Thus $\partial_e L = G$.

D 6.4. For $e \in G$ we shall let $e^\perp$ denote the element $1 - e$.

Th. 6.7. $e^\perp = 1 - e$ is the projection operator onto the subspace of $\mathcal{H}$ which is orthogonal to $e\mathcal{H}$. The correspondence $e \to e^\perp$ is an orthocomplementation in
the lattice $G$. The isomorphism $e \to K_1(e)$ of Th. 6.4 defines an analogous orthocomplementation on $\phi(K)$ for which $K_1(e)^\perp = K_1(e^\perp) = K_0(e)$.

Proof. The first part of this theorem is a direct consequence of AIV, §6. In order to show that $e \to e^\perp$ is an orthocomplementation it is necessary to show that (see AI, D 1.2):

(i) From $e_1 \le e_2$ it follows that $e_1^\perp \ge e_2^\perp$.
(ii) $(e^\perp)^\perp = e$.
(iii) $e^\perp \wedge e = 0$.

(i) and (ii) follow directly from $e^\perp = 1 - e$. (iii) follows directly from the fact that $e^\perp \wedge e$ is the projection operator onto the intersection of the projection spaces of $e^\perp$ and $e$.

D 6.5. We shall denote the relation $e_1 \le 1 - e_2 = e_2^\perp$ by $e_1 \perp e_2$.

Since $e_1 \le 1 - e_2$ is equivalent to $e_2 \le 1 - e_1$, the relation $e_1 \perp e_2$ is symmetric. If $e_1 \le e$ and $e_2 \le e^\perp$ then $e_2 \le e^\perp \le e_1^\perp$ and $e_1 \perp e_2$.

Th. 6.8. If $e_1 \perp e_2$ then $e_1 e_2 = 0$ and $e_1 + e_2 = e_1 \vee e_2$.

Proof. Since $e_1 \perp e_2$ and $e_1 e_2 = 0$ are equivalent (see AIV, §6) it easily follows that $(e_1 + e_2)^2 = e_1 + e_2$, that is, $e_1 + e_2 \in G$. It is easy to see that $K_0(e_1 + e_2) = K_0(e_1) \cap K_0(e_2)$; from Th. 6.4 we obtain $K_0(e_1) \cap K_0(e_2) = K_0(e_1 \vee e_2)$. From $K_0(e_1 + e_2) = K_0(e_1 \vee e_2)$ and from Th. 6.4 it follows that $e_1 + e_2 = e_1 \vee e_2$.

D 6.6. An orthocomplemented lattice is said to be orthomodular if, for elements $a, b, c$ of the lattice with $c \perp b$ and $a \le c$, the following relation holds:

$$(a \vee b) \wedge c = (a \wedge c) \vee (b \wedge c) = a$$

(see AI, D 2.3).

Th. 6.9. $G$ is orthomodular.

Proof. From $a \wedge c \le a$, $b \wedge c \le b$ it follows that $(a \wedge c) \vee (b \wedge c) \le a \vee b$. From $a \wedge c \le c$, $b \wedge c \le c$ it follows that $(a \wedge c) \vee (b \wedge c) \le c$; thus we obtain $(a \wedge c) \vee (b \wedge c) \le (a \vee b) \wedge c$. For $G$ the orthomodularity is equivalent to that found in the case of the subspaces $\mathfrak{a}, \mathfrak{b}, \mathfrak{c}$ of a Hilbert space. Thus, for $\mathfrak{b} \perp \mathfrak{c}$ and $\mathfrak{a} \subset \mathfrak{c}$ it is necessary to show that the relation $(\mathfrak{a} \vee \mathfrak{b}) \cap \mathfrak{c} \subset \mathfrak{a}$ holds. Since $\mathfrak{a} \perp \mathfrak{b}$, the vectors in $\mathfrak{a} \vee \mathfrak{b}$ have the form $\varphi = \varphi_1 + \varphi_2$, where $\varphi_1 \in \mathfrak{a}$ and $\varphi_2 \in \mathfrak{b}$. Therefore we find that $\varphi_1 \perp \varphi_2$. Since $\mathfrak{c} \perp \mathfrak{b}$, for the projection $P_{\mathfrak{c}}$ onto the subspace $\mathfrak{c}$ we find that $P_{\mathfrak{c}} \varphi_2 = 0$.
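In addition $P_{\mathfrak{c}} \varphi = P_{\mathfrak{c}} \varphi_1 = \varphi_1$ (the last equality follows from $\mathfrak{a} \subset \mathfrak{c}$). For each $\varphi \in (\mathfrak{a} \vee \mathfrak{b}) \cap \mathfrak{c}$ we have $\varphi \in \mathfrak{c}$, that is, $P_{\mathfrak{c}} \varphi = \varphi$ and hence $\varphi = \varphi_1$; we therefore obtain $\varphi \in \mathfrak{a}$.

Another proof of Th. 6.9 follows directly from AI, Th. 2.2. (See the remarks at the end of AI, §2.)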
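The statement of Th. 6.8 can be checked in a small finite-dimensional case. The following is only a sketch under assumptions that are not part of the text: the Hilbert space is taken to be real and three-dimensional, the helper names (`projector`, `is_projection`, and so on) are hypothetical, and exact rational arithmetic is used so that equalities can be tested directly. It shows that for two projections with orthogonal ranges the product vanishes and the sum is again a projection, while for non-orthogonal ranges the sum is not a projection.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def projector(v):
    """Rank-one projection onto the real vector v, with exact rational entries."""
    norm2 = sum(Fraction(x) * x for x in v)
    return [[Fraction(x) * y / norm2 for y in v] for x in v]


def is_projection(e):
    """True if e is symmetric and idempotent (e^2 = e)."""
    n = len(e)
    symmetric = all(e[i][j] == e[j][i] for i in range(n) for j in range(n))
    return symmetric and matmul(e, e) == e


e1 = projector([1, 0, 0])
e2 = projector([0, 1, 1])   # range orthogonal to the range of e1
e3 = projector([1, 1, 0])   # range NOT orthogonal to the range of e1

zero = [[Fraction(0)] * 3 for _ in range(3)]

orthogonal_product_is_zero = matmul(e1, e2) == zero
orthogonal_sum_is_projection = is_projection(add(e1, e2))
nonorthogonal_sum_is_projection = is_projection(add(e1, e3))

print(orthogonal_product_is_zero, orthogonal_sum_is_projection,
      nonorthogonal_sum_is_projection)
```

In this example `e1 + e2` is the projection onto the two-dimensional subspace spanned by both ranges, which is what $e_1 \vee e_2$ denotes in the lattice of subspaces.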
Th. 6.10. Decreasing and increasing sequences of elements of $G$ converge in the $\sigma(\mathcal{B}', \mathcal{B})$ topology towards elements of $G$; an increasing sequence $e_\nu$ converges towards $\bigvee_\nu e_\nu$ and a decreasing sequence $e_\nu$ converges towards $\bigwedge_\nu e_\nu$.

Proof. The convergence of increasing and decreasing sequences in $L$ follows in general from AIII, §6. Let $e_\nu$ be a decreasing sequence. From $e_\nu \to g \in L$ and $e_\nu \ge \bigwedge_\nu e_\nu$ it follows that $g \ge \bigwedge_\nu e_\nu$. From $e_\nu \ge e_\mu$ for all $\mu \ge \nu$, in the limit we obtain $e_\nu \ge g$ for all $\nu$. We obtain $K_0(g) \supseteq K_0(e_\nu)$ for all $\nu$ and, from Th. 6.4, $K_0(g) \supseteq \bigvee_\nu K_0(e_\nu) = K_0(\bigwedge_\nu e_\nu)$, that is, $g \in L_0 K_0(\bigwedge_\nu e_\nu)$. Therefore, from Th. 6.6 we obtain $g \le \bigwedge_\nu e_\nu$. Thus we find that $g = \bigwedge_\nu e_\nu$.

If the sequence $e_\nu$ is increasing, then the sequence $e_\nu^\perp = 1 - e_\nu$ is decreasing and $1 - e_\nu \to \bigwedge_\nu e_\nu^\perp$. Therefore we obtain $e_\nu \to 1 - \bigwedge_\nu e_\nu^\perp = (\bigwedge_\nu e_\nu^\perp)^\perp = \bigvee_\nu e_\nu$.
CHAPTER IV
Coexistent Effects and Coexistent Decompositions

The structure of the theory of microsystems presented in this book permits us to make a fundamental characterization of the concept of microsystems without making use of the familiar basic concepts of property and observable. The notions of property and pseudoproperty which were introduced in III, §4 are not fundamental concepts but derived concepts, derived from the more fundamental concepts of preparation and registration procedure.

In III, §4 we outlined one possible way to begin to separate the notion of a microsystem from the inherent structure associated with preparation and registration procedures. We shall now consider a second way to accomplish this separation. These two methods will be compared in §8.

We shall first bring out more clearly the difference between the deductive approach presented here and the approaches which are based on the notions of "property" and "observable," so that we can formulate more precisely the problems which will be discussed in this chapter.

Many formulations of quantum mechanics begin by making certain specific assumptions about the properties of microsystems: it is assumed that the properties of microsystems may be ascertained by measurement, and may (at least "after" the measurement) be attributed to the microsystems. Similarly, an observable is considered to be a "measurable quantity" which (again, on the basis of measurement) may be attributed to the microsystems. For instance, it is assumed that, in the measurement of position, we are able to "detect" the position of the microsystem, and that the latter is one of its
properties. In other words, it is assumed that the proposition "the microsystem has the measured position" is already meaningful.

We have not used the concepts described above in our formulation of quantum mechanics in II and III because we were not convinced that their meaning can be readily determined. For instance, we do not assume that a proposition such as "the microsystem has the measured position" is already meaningful. Instead, we have only made use of the fundamental concepts, those of preparation (represented by $\mathcal{Q}$) and registration (represented by $\mathcal{R}_0$, $\mathcal{R}$). These concepts have immediate meaning because they can be explained in terms of "pre-theories" relative to quantum mechanics (for the meaning of the notion of a pre-theory see [1], §5 and [2], §4). It is possible to deduce the concepts of properties and pseudoproperties of microsystems only after the development of the theory (see §8).

The microsystems make their presence known by the "response" $b \in \mathcal{R}$ of the "apparatus" $b_0 \in \mathcal{R}_0$. Except for a few general comments made in III, §4, what these responses permit us to conclude about the microsystems themselves remains open. The investigations in III, §4 do not represent a systematic path from the preparation and registration methods to the actual structure of microsystems. Instead, they only raise the question whether it is possible to correctly formulate the intuitive notion of a "property" in terms of the structures of preparation and registration procedures. It was not possible to answer systematically the question of how much of the "response" $b \in \mathcal{R}$ of the apparatus $b_0 \in \mathcal{R}_0$ is due to the microsystem $x \in b$ and how much is due to the apparatus $b_0$.

We shall now turn to these problems and questions without (!)
prejudging the issue by making arbitrary "assumptions concerning the nature of microsystems." For a realistic clarification it is necessary to obtain all concepts directly from the experiments with microsystems which are described by the structures $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ (see the more general treatment in [1], §10). The physical meaning of quantum mechanics may then be obtained only with the use of elements of the sets $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$. We shall not introduce a syntax for assertions about the microsystems themselves; we shall only consider mathematically formulated relations in the form described in §8.3.

1 Coexistent Effects and Observables

The concept of an observable is a very useful concept in quantum mechanics. It will be derived from the concepts described by the sets $\mathcal{R}_0$, $\mathcal{R}$. In order to do this, we shall now examine the physical significance of the substructures $\mathcal{R}(b_0)$ of the structures $\mathcal{R}_0$, $\mathcal{R}$.

1.1 Coexistent Registrations

Earlier we introduced the expression "effect process" to denote a pair $(b_0, b)$ where $b_0 \in \mathcal{R}_0$, $b \in \mathcal{R}$ and $b \subset b_0$. To such a pair the map $\psi$ assigns an effect $\psi(b_0, b) \in L \subset \mathcal{B}'(\mathcal{H}_1, \ldots)$. In III we have only been concerned with the
mapping of a single effect process $f = (b_0, b)$ onto an element $g = \psi(b_0, b) \in L$. Here it is important to note that there are, in general, many such $b \subset b_0$ for an apparatus $b_0$. Let $\mathcal{R}(b_0)$ denote the set of $b \in \mathcal{R}$ for which $b \subset b_0$.

In experiments with microsystems we find that, in most cases, the number of registrations for which $b \subset b_0$ is overwhelming. Even an approximately exhaustive description of these cases cannot be given in a book, much less in a few lines. For the purpose of illustration we shall only consider two typical experiments.

In the first example we shall consider an array of counters in which each microsystem may trigger the response of some of them. Let $b_0$ denote the array of counters. Let us characterize a $b$ as follows: three specifically chosen counters will respond; four other specifically chosen counters will not. In this case $\mathcal{R}(b_0)$ will be familiar to readers acquainted with modern electronic technology: $\mathcal{R}(b_0)$ is the Boolean switching algebra of the set of possible responses of the counters. For the case in which $b_0$ consists of a pair of counters we may describe $\mathcal{R}(b_0)$ as follows: $b_1$ is the registration for which the first counter responds; $b_2$ is the registration for which the second counter responds. Here $b_1 \cap b_2$ corresponds to the case in which counters 1 and 2 have both responded; $b_0 \setminus b_1$ corresponds to the case in which counter 1 has not responded; similarly for $b_0 \setminus b_2$. The registrations $(b_0 \setminus b_1) \cap b_2$, $(b_0 \setminus b_2) \cap b_1$, $b_1 \cup b_2$, etc. are described similarly. By including $b_0$ and $\emptyset$ we find that $\mathcal{R}(b_0)$ has exactly 16 elements.

In the second example we shall consider a cloud chamber (or a bubble chamber) $b_0$. Here the number of possible registrations is overwhelming. Every condensation droplet (or vapor bubble) may itself be a registration $b$. The ionization trail (or bubble trail) may also be one. If a magnetic field is applied, then the radius of a circular particle trail will be a "scale value" for certain of these $b$'s.
The scale numbers are nothing more than indices which serve to order the overwhelming range of possible registrations $b \subset b_0$. Scale values, even when they are very practical, are of a secondary nature. The primary concept is that of registration possibilities.

These two examples also demonstrate the following fact: the different elements of $\mathcal{R}(b_0)$ need not occur "simultaneously," because $b \subset b_0$ does not necessarily correspond to a "point in time" but may instead correspond to an extended process, such as a joint response of two counters in an array where one responds later than the other, or the entire ionization trail in a cloud chamber, which persists for a certain length of time. The joint registration of elements $b \in \mathcal{R}(b_0)$ has nothing to do with "simultaneous" registration or "simultaneous" measurement.

In quantum theory such expressions as "simultaneously measurable" and "not simultaneously measurable" are often used. These expressions are often misleading and misunderstood. We shall discuss this subject in more detail in VII, XVII, and XVIII. Here we state only that the joint registration of the $b \in \mathcal{R}(b_0)$ has nothing to do with "simultaneity." In order to reduce such misunderstanding we shall choose another concept (which we have already introduced in II, D 2.2) for the description of the physical situation represented by $\mathcal{R}(b_0)$.
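The count of 16 elements for a pair of counters can be reproduced by brute force. The sketch below rests on a modelling assumption that is not made in the text: each registered microsystem is labelled by its response pattern $(r_1, r_2)$, so that $b_0$ is the set of all four patterns and $b_1$, $b_2$ are the subsets in which the first or second counter responds. The function name `generated_algebra` is hypothetical.

```python
from itertools import product

# Illustrative labelling (an assumption): a pattern (r1, r2) records whether
# counter 1 and counter 2 responded (1) or not (0).
b0 = frozenset(product((0, 1), repeat=2))       # the registration method
b1 = frozenset(p for p in b0 if p[0] == 1)      # counter 1 responds
b2 = frozenset(p for p in b0 if p[1] == 1)      # counter 2 responds


def generated_algebra(unit, generators):
    """Close the generators under intersection, union and complement in unit."""
    elements = {unit, frozenset()} | set(generators)
    while True:
        new = set(elements)
        for x in elements:
            new.add(unit - x)
            for y in elements:
                new.add(x & y)
                new.add(x | y)
        if new == elements:
            return elements
        elements = new


R_b0 = generated_algebra(b0, [b1, b2])
print(len(R_b0))   # 16
```

The four atoms of this algebra are the patterns "both respond," "only counter 1," "only counter 2," and "neither," which is why the algebra has $2^4 = 16$ elements.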
D 1.1.1. The registration procedures $b \in \mathcal{R}(b_0)$ are said to be coexistent with respect to the registration method $b_0 \in \mathcal{R}_0$. The $(b_0, b) \in \mathcal{F}$ which correspond to the same $b_0$ will be called coexistent effect processes.

A subset $A \subset \mathcal{F}$ is a set of coexistent effect processes if and only if all elements in $A$ have the same first component $b_0$. How is the structure of coexistent effect processes affected by the mapping $\psi$ of $\mathcal{F}$ into $L \subset \mathcal{B}'(\mathcal{H}_1, \ldots)$?

1.2 Coexistent Effects

If we have a set of coexistent effect processes $(b_0, b)$, then the corresponding set of effects $\psi(b_0, b)$ is a subset of the set of all effects $\psi(b_0, b)$ with $b \in \mathcal{R}(b_0)$. For a fixed $b_0$ the map $\psi(b_0, b)$ defines a map $\psi_0 : \mathcal{R}(b_0) \to L$. We shall now investigate the properties of this map. The following definition is often used:

D 1.2.1. Let $F$ be a map of the Boolean ring $\Sigma$ (with unit element $\varepsilon$) into the order interval $[0, u]$ of an ordered vector space. Let $F$ satisfy the following conditions: $F(\varepsilon) = u$, and $F(\sigma_1 \vee \sigma_2) = F(\sigma_1) + F(\sigma_2)$ for the case in which $\sigma_1 \wedge \sigma_2 = 0$. Then the map $F$ is called an (additive) measure on $\Sigma$.

For the special case in which $[0, u]$ is the interval $[0, 1] \subset \mathbb{R}$ we shall call $F$ (as defined in D 1.2.1) a real measure (see II, D 3.1).

Th. 1.2.1. For fixed $b_0$ the map $\psi_0(b) = \psi(b_0, b)$ of $\mathcal{R}(b_0)$ into $[0, 1] = L \subset \mathcal{B}'(\mathcal{H}_1, \ldots)$ is an additive measure on $\mathcal{R}(b_0)$.

Proof. According to APS 5.1.4, to each $w \in \mathcal{K}$ there exists an $a \in w$ for which $a \cap b_0 \ne \emptyset$. Let $b_1, b_2 \in \mathcal{R}(b_0)$ and suppose that $b_1 \cap b_2 = \emptyset$. We then obtain:

$$\lambda_{\mathcal{S}}(a \cap b_0, a \cap (b_1 \cup b_2)) = \lambda_{\mathcal{S}}(a \cap b_0, (a \cap b_1) \cup (a \cap b_2)) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_1) + \lambda_{\mathcal{S}}(a \cap b_0, a \cap b_2).$$

We therefore obtain

$$\mu(\varphi(a), \psi(b_0, b_1 \cup b_2)) = \mu(\varphi(a), \psi(b_0, b_1)) + \mu(\varphi(a), \psi(b_0, b_2)).$$

Since, by AQ, $\varphi \mathcal{Q}'$ is dense in $K$, for all $w \in K$ we obtain

$$\mu(w, \psi(b_0, b_1 \cup b_2)) = \mu(w, \psi(b_0, b_1)) + \mu(w, \psi(b_0, b_2)), \tag{1.2.1}$$

from which we conclude that

$$\psi_0(b_1 \cup b_2) = \psi_0(b_1) + \psi_0(b_2). \tag{1.2.2}$$

Since $\psi(b_0, b_0) = 1$ we obtain $\psi_0(b_0) = 1$.
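The additivity stated in Th. 1.2.1 mirrors the elementary additivity of relative frequencies for registrations that cannot both occur. As a hedged illustration only, the following sketch uses an invented record of ten response patterns for a pair of counters (the data and all names are hypothetical, not taken from the text) and compares a disjoint pair of registrations with an overlapping pair.

```python
from fractions import Fraction

# Hypothetical record of response patterns (r1, r2) for ten microsystems.
record = [(1, 0), (0, 0), (0, 1), (1, 0), (1, 1),
          (0, 0), (0, 1), (1, 0), (0, 0), (1, 1)]


def frequency(b):
    """Relative frequency with which the registration b (a set of patterns) occurs."""
    return Fraction(sum(1 for r in record if r in b), len(record))


only_1 = {(1, 0)}              # counter 1 responds, counter 2 does not
only_2 = {(0, 1)}              # counter 2 responds, counter 1 does not
b1 = {(1, 0), (1, 1)}          # counter 1 responds
b2 = {(0, 1), (1, 1)}          # counter 2 responds

# Disjoint registrations: the frequencies add.
disjoint_additive = frequency(only_1 | only_2) == frequency(only_1) + frequency(only_2)
# Overlapping registrations: the joint responses are counted twice, so they do not.
overlap_additive = frequency(b1 | b2) == frequency(b1) + frequency(b2)

print(disjoint_additive, overlap_additive)
```

The restriction to $b_1 \cap b_2 = \emptyset$ in the theorem corresponds to the first comparison; the second shows why the restriction is needed.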
Equations (1.2.1) and (1.2.2) are equivalent; intuitively they express the additivity of the frequencies for the responses of "$b_1$ or $b_2$" for the case in which $b_1$ and $b_2$ cannot both respond. In other words, it is a direct consequence of the "switch" from $b_1$ and $b_2$ to "$b_1$ or $b_2$." Thus these equations are a direct consequence of the intuitively motivated equation II (3.5) and the axioms AS 2 for statistical selection procedures.

On the basis of the previous axioms it follows that there are no restrictive conditions for the map $\psi_0$ of $\mathcal{R}(b_0)$ into $L$ other than those imposed by equation (1.2.2) and the condition $\psi_0(b_0) = 1$. Thus, it will be possible to impose axiom AOb later.

We shall now introduce the following definition, which is motivated by Th. 1.2.1.

D 1.2.2. A set $A \subset L$ is called a set of coexistent effects if there exists a Boolean ring $\Sigma$ with additive measure $F : \Sigma \to L$ for which $A \subset F\Sigma$.

In order to simplify what follows, we shall introduce the following definition:

D 1.2.3. An additive measure $F$ on $\Sigma$ is said to be effective if $F(\sigma) = 0$ implies that $\sigma = 0$.

We shall now state and prove the following theorem:

Th. 1.2.2. Let $F : \Sigma \to L$ be an additive measure; let $\Sigma_0 = \{\sigma \mid \sigma \in \Sigma \text{ and } F(\sigma) = 0\}$. Let $\sigma \in \tilde{\sigma}$ where $\tilde{\sigma} \in \Sigma/\Sigma_0$; define $\tilde{F}(\tilde{\sigma}) = F(\sigma)$. Then $\tilde{F}$ is an effective additive measure on the Boolean ring $\Sigma/\Sigma_0$.

Proof. First we shall show that $\Sigma_0$ is an ideal in $\Sigma$: Suppose that $\sigma_1 \le \sigma$ and $F(\sigma) = 0$; then from $F(\sigma_1) \le F(\sigma)$ it follows that $F(\sigma_1) = 0$. Let $\sigma = \sigma_1 \vee \sigma_2$ and suppose that $F(\sigma_1) = F(\sigma_2) = 0$. Then from $\sigma = \sigma_1 \vee (\sigma_2 \wedge \sigma_1^*)$ and from $\sigma_2 \wedge \sigma_1^* \le \sigma_2$ it follows that $F(\sigma) = 0$. Therefore $\Sigma/\Sigma_0$ is a Boolean ring (see AI, end of §3).

From $\sigma_1, \sigma_2 \in \tilde{\sigma} \in \Sigma/\Sigma_0$ we obtain $\sigma_1 \wedge \sigma_2^* \in \Sigma_0$ and $\sigma_1^* \wedge \sigma_2 \in \Sigma_0$, from which it follows that

$$F(\sigma_1) = F(\sigma_1 \wedge \sigma_2^*) + F(\sigma_1 \wedge \sigma_2) = F(\sigma_1^* \wedge \sigma_2) + F(\sigma_1 \wedge \sigma_2) = F(\sigma_2).$$

Therefore, for $\sigma \in \tilde{\sigma}$, $\tilde{F}(\tilde{\sigma}) = F(\sigma)$ defines a function $\tilde{F} : \Sigma/\Sigma_0 \to L$. Let $\tilde{\varepsilon}$ denote the class containing the unit element $\varepsilon$ of $\Sigma$. Then we obtain $\tilde{F}(\tilde{\varepsilon}) = 1$.
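If $\tilde{\sigma}_1 \wedge \tilde{\sigma}_2 = 0$, then, for any pair of representatives $\sigma_1 \in \tilde{\sigma}_1$, $\sigma_2 \in \tilde{\sigma}_2$ we find that $\sigma_1 \wedge \sigma_2 \in \Sigma_0$. Hence we find that

$$F(\sigma_1 \vee \sigma_2) = F(\sigma_1) + F(\sigma_1^* \wedge \sigma_2) = F(\sigma_1) + F(\sigma_1 \wedge \sigma_2) + F(\sigma_1^* \wedge \sigma_2) = F(\sigma_1) + F(\sigma_2)$$

and finally

$$\tilde{F}(\tilde{\sigma}_1 \vee \tilde{\sigma}_2) = \tilde{F}(\tilde{\sigma}_1) + \tilde{F}(\tilde{\sigma}_2).$$

From $\tilde{F}(\tilde{\sigma}) = 0$ it follows that $F(\sigma) = 0$ for all $\sigma \in \tilde{\sigma}$, and therefore $\sigma \in \Sigma_0$, that is, $\tilde{\sigma}$ is the null element of $\Sigma/\Sigma_0$.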
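The construction in Th. 1.2.2 can be followed on a very small example. The sketch below is an illustration under stated assumptions: the Boolean ring is taken to be the power set of a three-point set, the measure is real-valued (weights $\tfrac12, \tfrac12, 0$) rather than effect-valued, and the names `cls` and `F_tilde` are hypothetical. Two sets are identified when their symmetric difference lies in the null ideal $\Sigma_0$.

```python
from fractions import Fraction
from itertools import combinations

points = (0, 1, 2)
# Assumed weights: the point 2 carries measure zero, so F is not effective.
weight = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0)}

# The Boolean ring Sigma: all subsets of the three points.
sigma = [frozenset(c) for r in range(len(points) + 1)
         for c in combinations(points, r)]


def F(s):
    """Additive measure of the set s."""
    return sum((weight[p] for p in s), Fraction(0))


# Sigma_0: the ideal of elements of measure zero.
null_ideal = [s for s in sigma if F(s) == 0]


def cls(s):
    """Class of s in Sigma/Sigma_0: all t whose symmetric difference with s is null."""
    return frozenset(t for t in sigma if F(s ^ t) == 0)


quotient = {cls(s) for s in sigma}


def F_tilde(c):
    """Measure induced on a class; well defined because F is constant on c."""
    values = {F(s) for s in c}
    assert len(values) == 1
    return values.pop()


zero_class = cls(frozenset())
effective = all(F_tilde(c) != 0 or c == zero_class for c in quotient)

print(len(null_ideal), len(quotient), effective)
```

Here the eight subsets collapse to four classes, and the only class of measure zero is the null class, so the induced measure is effective in the sense of D 1.2.3.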
Therefore, if we wish to investigate coexistent effects, Th. 1.2.2 implies that we need only consider those Boolean rings having effective measures.

Clearly $\psi$ maps a set of coexistent effect processes into a set of coexistent effects. In addition, it is easy to show that the map $\psi_0 : \mathcal{R}(b_0) \to L$ is an effective measure on $\mathcal{R}(b_0)$, because from $\psi_0(b) = 0$ (that is, $\psi(b_0, b) = 0$) it follows that $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = 0$ for all $a \in \mathcal{Q}'$ for which $(a, b_0) \in C$ (where $C$ is defined in II (4.3.7)). Thus $a \cap b = \emptyset$ for all $a \in \mathcal{Q}'$ for which $(a, b_0) \in C$. According to APS 5.2 this is possible only if $b = \emptyset$.

Definition D 1.2.2 has the following essential advantage over the special situation $\psi_0 : \mathcal{R}(b_0) \to L$: we do not have to be concerned with the question whether, given a set $A$ of coexistent effects, there exists a registration method $b_0 \in \mathcal{R}_0$ for which $A \subset \psi_0 \mathcal{R}(b_0)$. In §4 we shall assert that, in an "approximate" sense, to each Boolean ring $\Sigma$ having an effective measure $F : \Sigma \to L$ there is a "realization" described by an $\mathcal{R}(b_0)$; the complication of the "approximate realization" will be eliminated if we consider general effective measures on a Boolean ring.

In AI, §3 we show how to define two operations $\dotplus$ and $\cdot$ for a Boolean ring which satisfy the rules of a commutative algebra. In addition, we show how to define a generalized Boolean ring (without unit element) using $\dotplus$ and $\cdot$. A unit element exists if and only if there exists an $\varepsilon$ for which $\varepsilon \cdot \sigma = \sigma$ for all elements $\sigma$ of the Boolean ring.

D 1.2.4. A map $F$ of a generalized Boolean ring $\Sigma$ into $[0, 1]$ is called an additive measure if, for $\sigma_1 \cdot \sigma_2 = 0$, the following equation is satisfied:

$$F(\sigma_1 \dotplus \sigma_2) = F(\sigma_1) + F(\sigma_2).$$

Note. In the case in which $\Sigma$ has a unit element $\varepsilon$, we do not require that $F(\varepsilon) = 1$!

Th. 1.2.3. Let $\Sigma$ be a generalized Boolean ring and let $F$ be an additive measure on $\Sigma$. Then there exists an extension $\hat{\Sigma}$ of $\Sigma$ and a (normed) measure $\hat{F}$ on $\hat{\Sigma}$ which is a continuation of $F$.

Proof.
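Let $E$ be the two-element Boolean ring which consists of the zero element $0$ and the unit element $\varepsilon$. Let $\hat{\Sigma} = E \times \Sigma$ and let $\dotplus$ and $\cdot$ be defined as follows:

$$(\varepsilon, \sigma_1) \dotplus (\varepsilon, \sigma_2) = (0, \sigma_1 \dotplus \sigma_2),$$
$$(\varepsilon, \sigma_1) \dotplus (0, \sigma_2) = (\varepsilon, \sigma_1 \dotplus \sigma_2),$$
$$(0, \sigma_1) \dotplus (0, \sigma_2) = (0, \sigma_1 \dotplus \sigma_2),$$
$$(\varepsilon, \sigma_1) \cdot (\varepsilon, \sigma_2) = (\varepsilon, \sigma_1 \dotplus \sigma_2 \dotplus \sigma_1 \cdot \sigma_2),$$
$$(\varepsilon, \sigma_1) \cdot (0, \sigma_2) = (0, \sigma_2 \dotplus \sigma_1 \cdot \sigma_2),$$
$$(0, \sigma_1) \cdot (0, \sigma_2) = (0, \sigma_1 \cdot \sigma_2).$$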
$\hat{\Sigma}$ is a Boolean ring; $\Sigma$ may be identified with the subset of all $(0, \sigma)$ of $\hat{\Sigma}$, that is, $\hat{\Sigma}$ is an extension of $\Sigma$. Let $\hat{F}(0, \sigma) = F(\sigma)$, $\hat{F}(\varepsilon, \sigma) = 1 - F(\sigma)$. $\hat{F}$ is an additive measure on $\hat{\Sigma}$ which coincides with $F$ on $\Sigma$.

From Th. 1.2.3 it follows that a set $A \subset L$ is a set of coexistent effects if there is a generalized Boolean ring $\Sigma$ and an additive measure $F$ on $\Sigma$ for which $A \subset F\Sigma$.

We shall now consider the special case in which $A$ consists of a pair of elements $g_1, g_2$. We then obtain the theorem:

Th. 1.2.4. The following conditions are equivalent:

(i) $g_1, g_2$ are coexistent.
(ii) There exist three elements $g_1', g_2', g_{12} \in L$ for which $g_1 = g_1' + g_{12}$, $g_2 = g_2' + g_{12}$ and $g_1' + g_2' + g_{12} = g_1 + g_2' = g_2 + g_1' \in L$.

Proof. (i) ⇒ (ii). Let $\Sigma$ be a generalized Boolean ring with $g_1, g_2 \in \{F(\sigma) \mid \sigma \in \Sigma\}$, and let $\sigma_1, \sigma_2 \in \Sigma$ with $F(\sigma_1) = g_1$, $F(\sigma_2) = g_2$. Now let us consider the following additional elements in $\Sigma$: $\sigma_1 \cdot \sigma_2$, $\sigma_1 \dotplus \sigma_1 \cdot \sigma_2$, $\sigma_2 \dotplus \sigma_1 \cdot \sigma_2$, and $\sigma_1 \vee \sigma_2 = \sigma_1 \dotplus \sigma_2 \dotplus \sigma_1 \cdot \sigma_2$. From the additivity of $F(\sigma)$, and from $F(\sigma_1 \cdot \sigma_2) = g_{12}$, $F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) = g_1'$, and $F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = g_2'$ it follows that

$$g_1 = F(\sigma_1) = F((\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) \dotplus \sigma_1 \cdot \sigma_2) = F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_1 \cdot \sigma_2) = g_1' + g_{12};$$

similarly we obtain $g_2 = g_2' + g_{12}$. In addition, we obtain

$$F(\sigma_1 \vee \sigma_2) = F(\sigma_1 \dotplus (\sigma_2 \dotplus \sigma_1 \cdot \sigma_2)) = F(\sigma_1) + F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = g_1 + g_2' \in L.$$

(ii) ⇒ (i). Let us consider a set $\Pi$ consisting of three elements (1), (2), and (3), and let $\Sigma = \mathcal{P}(\Pi)$. For the elementary sets $\sigma_1 = \{(1)\}$, $\sigma_2 = \{(2)\}$, $\sigma_3 = \{(3)\}$ we set $F(\sigma_1) = g_1'$, $F(\sigma_2) = g_2'$, $F(\sigma_3) = g_{12}$. It is easy to show that $F$ determines an additive (not necessarily normed) measure satisfying $F(\sigma_1 \dotplus \sigma_3) = g_1$, $F(\sigma_2 \dotplus \sigma_3) = g_2$.

The proof of this theorem is particularly instructive because it shows that $g_1, g_2$ do not necessarily uniquely determine the remaining values of the measure $F(\sigma)$. For the case in which $\Sigma = \mathcal{P}(\Pi)$ we may obtain different measures because there may be different $g_{12} \in L$ for which $g_1 - g_{12}$, $g_2 - g_{12}$ and $g_1 + g_2 - g_{12} \in L$.
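If, for example, $g_1 + g_2 \in L$ then we may choose $g_{12} = 0$. If we can find another $g_{12} \ne 0$, $g_{12} \in L$ for which $g_1 - g_{12}$, $g_2 - g_{12} \in L$, we may then construct two different measures $F(\sigma)$ on $\Sigma = \mathcal{P}(\Pi)$. It is easy to give such an example. Consider three arbitrary elements $\tilde{g}_1, \tilde{g}_2, \tilde{g}_3 \in L$ and choose $g_1, g_2$ as follows:

$$g_1 = \tfrac{1}{3} \tilde{g}_1 + \tfrac{1}{6} \tilde{g}_3, \qquad g_2 = \tfrac{1}{3} \tilde{g}_2 + \tfrac{1}{6} \tilde{g}_3. \tag{1.2.3}$$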
Then we obtain $g_1 + g_2 = \tfrac{1}{3}(\tilde{g}_1 + \tilde{g}_2 + \tilde{g}_3) \in L$, so that we may set $g_{12} = 0$; alternatively we can set $g_{12} = \tfrac{1}{6} \tilde{g}_3$.

In the following section we shall return to consider the fact that two effects do not necessarily uniquely determine an effect $g_{12}$ for which "both effects $g_1$ and $g_2$ jointly respond."

We shall now mention the following special case of Th. 1.2.4: each of the following conditions is sufficient for $g_1$ and $g_2$ to be coexistent:

(1) $g_1 + g_2 \in L$ (set $g_{12} = 0$).
(2) $g_1 \ge g_2$ (set $g_{12} = g_2$).

1.3 Commensurable Decision Effects

The notion of a decision effect, which was introduced in III, §6, plays an important role in quantum mechanics. Often the "role" of decision effects is overestimated, with the result that realistic situations, which involve general effects, are excluded. In order to avoid clouding the issue by introducing "additional hypotheses" or "opinions," we shall proceed step by step to develop the special role of a decision effect. We shall use only the assertions about decision effects which were introduced in III, §6.

We shall now begin with the following obvious definition:

D 1.3.1. A set $A \subset G$ (that is, a set $A$ of decision effects) is said to be commensurable if there exists a Boolean ring $\Sigma$ with additive measure $F : \Sigma \to G$ for which $A \subset F\Sigma$.

The reader should note that, according to this definition, $F$ is a map of $\Sigma$ into $G$. According to D 1.2.2 a set $A \subset G$ is coexistent if there exists a Boolean ring $\Sigma'$ with additive measure $F' : \Sigma' \to L$ for which $A \subset F'\Sigma'$. Thus, a set of commensurable decision effects is also (since $G \subset L$) coexistent. Does the converse hold?

The Boolean ring $\Sigma$ with measure $F : \Sigma \to G$ represents an at least "idealized" registration method (in case $\Sigma \to G$ can only be approximately "realized" by an $\mathcal{R}(b_0) \to L$; see §4) in which decision effects correspond to the realizable registrations. It was the practice of theoretical physicists to permit only the use of those measurement methods for which $\psi(f) \in G$.
Certainly, this is a particularly interesting special case of a measurement. We note, however, that the general measurement methods, for which the $\psi(b_0, b)$ are only effects, are also usable measurements. Indeed, they are the only realistic measurements.

In Th. 1.2.4 we have analyzed the factual content of two coexistent effects. Now we shall consider the special case in which one of the two effects is a decision effect.
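The non-uniqueness of $g_{12}$ in Th. 1.2.4 can be made concrete for the choice (1.2.3). The sketch below is a hedged illustration under assumptions not made in the text: effects are modelled as real symmetric $2 \times 2$ matrices with exact rational entries, $\tilde{g}_1$ and $\tilde{g}_2$ are two non-commuting rank-one projections, $\tilde{g}_3$ is the unit operator, and the helper names are hypothetical. It checks that both $g_{12} = 0$ and $g_{12} = \tfrac16 \tilde{g}_3$ satisfy condition (ii) of Th. 1.2.4, although $g_1$ and $g_2$ do not commute.

```python
from fractions import Fraction as Fr

ONE = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
ZERO = [[Fr(0), Fr(0)], [Fr(0), Fr(0)]]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def psd(m):
    """Positive semidefiniteness of a real symmetric 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return m[0][0] >= 0 and m[1][1] >= 0 and det >= 0


def is_effect(g):
    """True if g is symmetric and 0 <= g <= 1."""
    return g[0][1] == g[1][0] and psd(g) and psd(sub(ONE, g))


# Assumed choices for the three "arbitrary" effects in (1.2.3).
t1 = [[Fr(1), Fr(0)], [Fr(0), Fr(0)]]            # projection onto (1, 0)
t2 = scale(Fr(1, 2), [[Fr(1), Fr(1)], [Fr(1), Fr(1)]])  # projection onto (1, 1)
t3 = ONE

g1 = add(scale(Fr(1, 3), t1), scale(Fr(1, 6), t3))
g2 = add(scale(Fr(1, 3), t2), scale(Fr(1, 6), t3))


def admissible(g12):
    """Condition (ii) of Th. 1.2.4 for a candidate joint effect g12."""
    parts = (g12, sub(g1, g12), sub(g2, g12), sub(add(g1, g2), g12))
    return all(is_effect(p) for p in parts)


both_admissible = admissible(ZERO) and admissible(scale(Fr(1, 6), t3))
commute = mul(g1, g2) == mul(g2, g1)

print(both_admissible, commute)
```

With these choices two different values of $g_{12}$ are admissible, so two different measures on $\mathcal{P}(\Pi)$ exist, and the coexistent pair $g_1, g_2$ is not commuting.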
Th. 1.3.1. For $g \in L$, $e \in G$ the following conditions are equivalent:

(i) $g$, $e$ are coexistent.
(ii) $g = g_1 + g_2$ where $g_1, g_2 \in L$ and $g_1 \le e$, $g_2 \le e^\perp$ (in this partition of $g$, $g_1$ and $g_2$ are uniquely determined).
(iii) $e = g_1' + g_3$ where $g_1', g_3 \in L$ and $g_1' \le g$, $g_3 \le 1 - g$ (in this partition of $e$, $g_1'$ and $g_3$ are uniquely determined and $g_1' = g_1$ where $g_1$ is defined in (ii)).
(iv) $eg = ge$.

Proof. According to Th. 1.2.4, (i) is equivalent to the condition that there exist $g_1, g_2, g_3 \in L$ such that $g = g_1 + g_2$, $e = g_1 + g_3$ and $g_1 + g_2 + g_3 \in L$. From this it follows that $g_1 \le e$ and $1 \ge g_1 + g_2 + g_3 = g_2 + e$, that is, $g_2 \le 1 - e$; thus we obtain (i) ⇒ (ii).

Conversely, if $g = g_1 + g_2$ where $g_1 \le e$, $g_2 \le e^\perp$, then $g_3 = e - g_1 \in L$ and $e = g_1 + g_3$ and $g_1 + g_2 + g_3 = g_2 + e \le e^\perp + e = 1$, that is, $g_1 + g_2 + g_3 \in L$. Thus we have shown that (ii) ⇒ (i).

Let $g = g_1 + g_2 = \tilde{g}_1 + \tilde{g}_2$ where $g_1, \tilde{g}_1 \le e$ and $g_2, \tilde{g}_2 \le e^\perp$. Then we obtain $g_1 - \tilde{g}_1 = \tilde{g}_2 - g_2$ and

$$0 \le e - g_1 \le e + \tilde{g}_1 - g_1 = e + g_2 - \tilde{g}_2 \le e + g_2 \le e + e^\perp \le 1.$$

Therefore $\tilde{g}_3 = e + \tilde{g}_1 - g_1 \in L$. Since $\tilde{g}_1 \le e$ we obtain $K_0(\tilde{g}_1) \supseteq K_0(e)$; similarly we obtain $K_0(g_1) \supseteq K_0(e)$. From these results it follows that $K_0(\tilde{g}_3) \supseteq K_0(e)$, that is, $\tilde{g}_3 \in L_0 K_0(e)$. From $\tilde{g}_3 \in L_0 K_0(e)$ and from III, Th. 6.6 it follows that $\tilde{g}_3 \le e$. From these results it follows that $\tilde{g}_1 \le g_1$; similarly we may also derive $g_1 \le \tilde{g}_1$. Therefore we obtain $g_1 = \tilde{g}_1$ and $g_2 = \tilde{g}_2$. Thus we have proven the uniqueness of the decomposition given in (ii).

From the above decomposition $g = g_1 + g_2$ and $e = g_1 + g_3$ it follows that $g_1 \le g$ and $g_3 = e - g_1 = e - g + g_2$. Since $g_2 \le e^\perp$ (see (ii)) it follows that $g_3 \le e - g + e^\perp = 1 - g$. Thus we have shown that (i) ⇒ (iii).

Conversely, let $e = g_1' + g_3$ where $g_1' \le g$ and $g_3 \le 1 - g$, and let $g_2' = g - g_1'$. Then we obtain $g_2' = g - e + g_3 \le g - e + 1 - g = 1 - e = e^\perp$. Thus we have shown that (iii) ⇒ (ii).
Since we may derive the relation (ii) from (iii) for the special case $g_1' = g_1$, it follows that the partition in (iii) is unique.

It now suffices to show that (ii) ⇔ (iv). From $g_1 \le e$ it follows that (see AIV, §6) $g_1 = e g_1 e$; from $g_2 \le e^\perp$ it follows that $g_2 = e^\perp g_2 e^\perp$. Therefore, from (ii) we obtain $ge = e g_1 e = eg$.

Conversely, if $ge = eg$, we define $g_1 = ege$, $g_2 = e^\perp g e^\perp$. Then, since $e$ and $g$ commute, we obtain

$$g = (e + e^\perp) g (e + e^\perp) = ege + e^\perp g e^\perp = g_1 + g_2.$$

In addition, we obtain $g_1 = ege \le e 1 e = e$; similarly we obtain $g_2 \le e^\perp$. Thus we have shown that (ii) holds.

The commutativity of two coexistent effects does not hold in general! This fact may be verified using example (1.2.3), where $\tilde{g}_1, \tilde{g}_2, \tilde{g}_3$ may be arbitrary elements of $L$.

Th. 1.3.2. Let $e_1, e_2 \in G$ be two coexistent decision effects. Suppose that they satisfy the decomposition $e_1 = g_1 + g_2$, $e_2 = g_1 + g_3$ where
$g_1, g_2, g_3 \in L$ and $g_1 + g_2 + g_3 \in L$ (equivalent to the coexistence requirement). Then $g_1, g_2, g_3$ are uniquely determined by $e_1, e_2$, namely $g_1 = e_1 \wedge e_2$, $g_2 = e_1 \wedge e_2^\perp$, $g_3 = e_1^\perp \wedge e_2$, and $g_1, g_2, g_3 \in G$. If $\Sigma$ is a Boolean ring with additive measure $F: \Sigma \to L$ for which $F(\sigma_1) = e_1$, $F(\sigma_2) = e_2$, it follows that $F(\sigma_1 \wedge \sigma_2) = e_1 \wedge e_2$, $F(\sigma_1 \dotplus \sigma_1 \sigma_2) = e_1 \wedge e_2^\perp$ and $F(\sigma_2 \dotplus \sigma_1 \sigma_2) = e_2 \wedge e_1^\perp$.

PROOF. According to Th. 1.3.1 (ii), $g_1 \le e_2$, $g_2 \le e_2^\perp$, and $g_1 \le e_1$, $g_3 \le e_1^\perp$. Therefore it follows that $g_1 \in L_0 K_0(e_1)$ and $g_1 \in L_0 K_0(e_2)$, that is, $g_1 \in L_0 K_0(e_1) \cap L_0 K_0(e_2) = L_0(A)$ where $A = K_0(e_1) \cup K_0(e_2)$. According to III, Th. 6.6, $L_0(A) = L_0(C)$ where $C$ is the face generated by $A$, that is, $C = K_0(e_1) \vee K_0(e_2)$. According to III, Th. 6.4, $K_0(e_1) \vee K_0(e_2) = K_0(e_1 \wedge e_2)$ and therefore $g_1 \in L_0(A) = L_0 K_0(e_1 \wedge e_2)$, that is, $g_1 \le e_1 \wedge e_2$. Similarly, it follows that $g_2 \le e_1 \wedge e_2^\perp$, $g_3 \le e_1^\perp \wedge e_2$. Therefore $e_1 = g_1 + g_2 \le (e_1 \wedge e_2) + (e_1 \wedge e_2^\perp)$. Since $e_1 \wedge e_2 \le e_2$ and $e_1 \wedge e_2^\perp \le e_2^\perp$, we find that $e_1 \wedge e_2 \perp e_1 \wedge e_2^\perp$; thus, by III, Th. 6.8 we obtain $(e_1 \wedge e_2) + (e_1 \wedge e_2^\perp) = (e_1 \wedge e_2) \vee (e_1 \wedge e_2^\perp) \le e_1$. From $e_1 = g_1 + g_2 \le (e_1 \wedge e_2) + (e_1 \wedge e_2^\perp) \le e_1$ it follows that $g_1 = e_1 \wedge e_2$, $g_2 = e_1 \wedge e_2^\perp$ because $g_1 \le e_1 \wedge e_2$, $g_2 \le e_1 \wedge e_2^\perp$. Similarly, it follows that $g_3 = e_1^\perp \wedge e_2$. The rest of the theorem follows directly from the proof of Th. 1.2.4.

Th. 1.3.3. If $F: \Sigma \to G$ is an additive, effective measure on the Boolean ring $\Sigma$, then $F$ is an isomorphic map of the Boolean ring $\Sigma$ onto the Boolean sublattice $F\Sigma$ of $G$.

PROOF. According to Th. 1.3.2, for each pair of elements $\sigma_1, \sigma_2 \in \Sigma$ we have $F(\sigma_1 \wedge \sigma_2) = F(\sigma_1) \wedge F(\sigma_2)$. Furthermore, for $\sigma \in \Sigma$ we obtain $1 = F(\varepsilon) = F(\sigma \vee \sigma^*) = F(\sigma) + F(\sigma^*)$, that is, $F(\sigma^*) = 1 - F(\sigma) = F(\sigma)^\perp$. Since $\sigma_1 \vee \sigma_2 = (\sigma_1^* \wedge \sigma_2^*)^*$ it follows that $F(\sigma_1 \vee \sigma_2) = F(\sigma_1) \vee F(\sigma_2)$. Thus $F$ is a lattice homomorphism of $\Sigma$ into $G$ for which $F(\sigma^*) = F(\sigma)^\perp$. In addition, $F$ is injective: From $F(\sigma_1) = F(\sigma_2)$ it follows that $F(\sigma_1 \wedge \sigma_2) = F(\sigma_1) \wedge F(\sigma_2) = F(\sigma_1) = F(\sigma_2)$, and from $F(\sigma_1) = F((\sigma_1 \wedge \sigma_2) \vee (\sigma_1 \wedge \sigma_2^*)) = F(\sigma_1 \wedge \sigma_2) + F(\sigma_1 \wedge \sigma_2^*)$ we obtain the relation $F(\sigma_1 \wedge \sigma_2^*) = 0$. Since $F$ is effective, it follows that $\sigma_1 \wedge \sigma_2^* = 0$ and therefore $\sigma_1 = \sigma_1 \wedge \sigma_2$. Similarly we find that $\sigma_2 = \sigma_1 \wedge \sigma_2$, that is, $\sigma_1 = \sigma_2$. Thus $F$ is an isomorphism onto the sublattice $F\Sigma$ of $G$, and $F\Sigma$ is a Boolean sublattice of $G$.

Th. 1.3.4. The following six conditions are equivalent for decision effects:
(i) $e_1, e_2$ are coexistent.
(ii) $e_1, e_2$ are commensurable.
(iii) The orthocomplemented sublattice $\Gamma$ of $G$ generated by $e_1, e_2$ is a Boolean ring.
(iv) $e_1 = (e_1 \wedge e_2) \vee (e_1 \wedge e_2^\perp)$.
(v) $e_1 e_2 = e_2 e_1$.
(vi) $e_1 e_2 = e_1 \wedge e_2$.
Let $\Sigma \xrightarrow{F} L$ be the Boolean ring defined according to (i) where $F$ is the effective measure. Then there exist two elements $\sigma_1, \sigma_2 \in \Sigma$ for which
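$F(\sigma_1) = e_1$, $F(\sigma_2) = e_2$. Let $\Sigma^0$ be the Boolean subring of $\Sigma$ generated by $\sigma_1, \sigma_2$. The restriction of $F$ to $\Sigma^0$ is an isomorphism of $\Sigma^0$ onto $\Gamma$ (where $\Gamma$ is defined by (iii)). The identity map $i$ of $\Gamma$ into itself is an additive measure on the Boolean ring $\Gamma$, $\Gamma \xrightarrow{i} G$.

PROOF. According to (i) there exists a Boolean ring $\Sigma$ with an effective measure $F: \Sigma \to L$ (see Th. 1.2.2). According to Th. 1.3.2, there exist two elements $\sigma_1, \sigma_2$ for which $F(\sigma_1) = e_1$, $F(\sigma_2) = e_2$ and the following conditions hold: $F(\sigma_1 \wedge \sigma_2) = e_1 \wedge e_2$, $F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) = e_1 \wedge e_2^\perp$, $F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = e_2 \wedge e_1^\perp$. We shall now show that the Boolean subring $\Sigma^0$ of $\Sigma$ generated by $\sigma_1, \sigma_2$ is mapped by $F$ into $G$. Since $1 = F(\varepsilon) = F(\sigma \vee \sigma^*) = F(\sigma) + F(\sigma^*)$ we obtain $F(\sigma^*) = 1 - F(\sigma)$. If, in addition, $F(\sigma) \in G$, then we also find $F(\sigma^*) \in G$. It therefore suffices to show that the following eight elements are mapped into $G$: $0$, $\sigma_1$, $\sigma_2$, $\sigma_1 \wedge \sigma_2$, $\sigma_1 \dotplus \sigma_1 \cdot \sigma_2$, $\sigma_2 \dotplus \sigma_1 \cdot \sigma_2$, $\sigma_1 \dotplus \sigma_1 \cdot \sigma_2 \dotplus \sigma_2 \dotplus \sigma_1 \cdot \sigma_2$ and $\sigma_1 \vee \sigma_2 = \sigma_1 \dotplus \sigma_1 \cdot \sigma_2 \dotplus \sigma_2$, since the remaining eight elements of $\Sigma^0$ are complements of the above elements. We obtain $F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2 \dotplus \sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = (e_1 \wedge e_2^\perp) + (e_2 \wedge e_1^\perp) \in G$, since $e_1 \wedge e_2^\perp \perp e_2 \wedge e_1^\perp$. Similarly, we obtain $F(\sigma_1 \vee \sigma_2) = F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_2) = (e_1 \wedge e_2^\perp) + e_2 \in G$, since $e_1 \wedge e_2^\perp \perp e_2$. Therefore the map $\Sigma^0 \xrightarrow{F} G$ is an additive measure on $\Sigma^0$ with range in $G$, whereby (i) ⇒ (ii) is proven.

Applying Th. 1.3.3 to $\Sigma^0 \xrightarrow{F} G$ we find that $\Gamma = F\Sigma^0$ is isomorphic to $\Sigma^0$, thereby proving (ii) ⇒ (iii).

If $\Gamma$, as defined in (iii), is a Boolean ring, then from $e, \bar e \in \Gamma$ and $e \wedge \bar e = 0$ it follows that $e \le \bar e^\perp$ because $e = e \wedge (\bar e \vee \bar e^\perp) = (e \wedge \bar e) \vee (e \wedge \bar e^\perp) = e \wedge \bar e^\perp$. Therefore, by III, Th. 6.8 we find that $e \vee \bar e = e + \bar e$ and that the identity map of $\Gamma$ onto itself is an additive effective measure. Since $\Gamma \subset G$ we find that $e_1, e_2$ are also commensurable. That is, we have shown that (iii) ⇒ (ii).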
As we have shown above after D 1.3.1, (ii) ⇒ (i). In Th. 1.3.2 we have shown that (i) ⇒ (iv). If (iv) is satisfied, that is, $e_1 = (e_1 \wedge e_2) \vee (e_1 \wedge e_2^\perp)$, then, by III, Th. 6.8 we obtain $e_1 = g_1 + g_2$ where $g_1 = e_1 \wedge e_2$, $g_2 = e_1 \wedge e_2^\perp$. Expressed differently, we obtain $g_1 \le e_2$, $g_2 \le e_2^\perp$. By Th. 1.3.1 it then follows that (iv) ⇒ (i). From Th. 1.3.1 (iv) it follows that (v) ⇔ (i); (vi) ⇔ (v) is proven in AIV, §6.

We shall now consider some special cases of commensurable decision effects. If $e_1 \le e_2$, then, by arguments presented at the end of §1.2, we conclude that $e_1, e_2$ are coexistent and are therefore commensurable. Similarly, if $e_1 + e_2 \le 1$ (that is, $e_1 + e_2 \in L$) we conclude that $e_1, e_2$ are coexistent and are therefore commensurable. Since $e_1 + e_2 \le 1$ is equivalent to $e_1 \le 1 - e_2 = e_2^\perp$, that is, to $e_1 \perp e_2$, it follows from $e_1 \perp e_2$ that $e_1$ and $e_2$ are commensurable. If $e_1, e_2$ are commensurable and $e_1 \wedge e_2 = 0$, it then follows from Th. 1.3.4 (iv) that $e_1 \perp e_2$ and, consequently, by III, Th. 6.8, $e_1 \vee e_2 = e_1 + e_2$.
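The criteria of Th. 1.3.1 and Th. 1.3.4 can be checked by hand in the smallest nontrivial case. The following is only a sketch under assumptions that are not part of the text: effects are modelled as real symmetric $2 \times 2$ matrices with exact rational entries, and the names `split`, `commute`, `g_diag`, `g_off` are hypothetical. It forms the candidate parts $g_1 = ege$ and $g_2 = e^\perp g e^\perp$ from the proof of Th. 1.3.1 and tests whether they add up to $g$.

```python
from fractions import Fraction as Fr

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def complement(e):
    """Orthocomplement e_perp = 1 - e."""
    n = len(e)
    return [[(1 if i == j else 0) - e[i][j] for j in range(n)] for i in range(n)]

def commute(a, b):
    return matmul(a, b) == matmul(b, a)

def split(g, e):
    """Candidate parts g1 = e g e and g2 = e_perp g e_perp (proof of Th. 1.3.1)."""
    ep = complement(e)
    return matmul(matmul(e, g), e), matmul(matmul(ep, g), ep)

half = Fr(1, 2)
e1 = [[Fr(1), Fr(0)], [Fr(0), Fr(0)]]             # decision effect (projection)
e2 = [[half, half], [half, half]]                 # projection not commuting with e1
g_diag = [[Fr(1, 3), Fr(0)], [Fr(0), Fr(3, 4)]]   # effect commuting with e1
g_off = [[half, Fr(1, 4)], [Fr(1, 4), half]]      # effect not commuting with e1

g1, g2 = split(g_diag, e1)
h1, h2 = split(g_off, e1)
print(commute(g_diag, e1), add(g1, g2) == g_diag)
print(commute(g_off, e1), add(h1, h2) == g_off)
print(commute(e1, complement(e1)), commute(e1, e2))
```

In this model the decomposition $g = g_1 + g_2$ is recovered exactly for the commuting pair and fails for the non-commuting one, in line with (ii) ⇔ (iv); the last line contrasts the orthogonal pair $e_1, e_1^\perp$ with the non-commuting pair $e_1, e_2$.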
We shall now develop general criteria for the characterization of commensurable sets of decision effects. For this purpose we shall state and prove the following theorem:

Th. 1.3.5. Let $A_1 \subset G$, $A_2 \subset G$ and suppose that each pair $e_1 \in A_1$, $e_2 \in A_2$ is commensurable. Then $\bigwedge_{e \in A_2} e$ and $\bigvee_{e \in A_2} e$ are commensurable with each $e_1 \in A_1$. Suppose that $e \in G$ is commensurable with each $e_1 \in A_1$; then $e^\perp$ is commensurable with each $e_1 \in A_1$.

PROOF. The statement that $e$, $e_1$ are commensurable is equivalent to the condition that $e_1 = (e_1 \wedge e) \vee (e_1 \wedge e^\perp)$. This relation is symmetric in $e$ and $e^\perp$ because $e = (e^\perp)^\perp$. Thus $e^\perp$, $e_1$ are commensurable.

Since $e_1, e_2$ are commensurable, from Th. 1.3.4 it follows that $e_2 = (e_2 \wedge e_1) \vee (e_2 \wedge e_1^\perp)$ and we therefore obtain
$$\bigvee_{e \in A_2} e = \Big[\bigvee_{e \in A_2} (e \wedge e_1)\Big] \vee \Big[\bigvee_{e \in A_2} (e \wedge e_1^\perp)\Big].$$
Since $G$ is orthomodular (see III, D 6.6 and AI, §2) it follows that
$$\Big[\bigvee_{e \in A_2} e\Big] \wedge e_1 = \Big[\Big[\bigvee_{e \in A_2} (e \wedge e_1)\Big] \vee \Big[\bigvee_{e \in A_2} (e \wedge e_1^\perp)\Big]\Big] \wedge e_1 = \bigvee_{e \in A_2} (e \wedge e_1).$$
A similar result holds if we substitute $e_1^\perp$ for $e_1$. Thus we obtain
$$\bigvee_{e \in A_2} e = \Big[\Big(\bigvee_{e \in A_2} e\Big) \wedge e_1\Big] \vee \Big[\Big(\bigvee_{e \in A_2} e\Big) \wedge e_1^\perp\Big].$$
Thus, using Th. 1.3.4(iv) we conclude that $e_1$, $\bigvee_{e \in A_2} e$ are commensurable. Since $\big(\bigvee_{e \in A_2} e^\perp\big)^\perp = \bigwedge_{e \in A_2} e$, the same result follows for $e_1$, $\bigwedge_{e \in A_2} e$.

Th. 1.3.6. For $A \subset G$ the following conditions are equivalent:
(i) $A$ is coexistent.
(ii) $A$ is commensurable.
(iii) Each pair $e_1, e_2 \in A$ is coexistent.
(iv) Each pair $e_1, e_2 \in A$ is commensurable.
(v) The orthocomplemented sublattice $\Gamma_A$ generated by $A$ is a Boolean ring.
(vi) The complete orthocomplemented sublattice $\hat\Gamma_A$ generated by $A$ is a Boolean ring.

PROOF. (iii) ⇒ (iv) is a direct consequence of Th. 1.3.4. (ii) ⇒ (i) and (i) ⇒ (iii) are trivial. If we show that (iv) ⇒ (ii) we will have proven that (i), (ii), (iii), and (iv) are equivalent. Suppose that (iv) is satisfied.
Every element of the sublattice $\Gamma_A$ described in (v) may be obtained by the finite application of the operations $\wedge$, $\vee$, and $\perp$ to the elements of $A$. According to Th. 1.3.5 every element of $\Gamma_A$ is commensurable with every element of $A$. Again, from Th. 1.3.5 it follows that every pair of elements of $\Gamma_A$ is commensurable. Thus, from AI, Th. 3.2 it follows that $\Gamma_A$ is a Boolean ring. If $e_1, e_2 \in \Gamma_A$ satisfy $e_1 \wedge e_2 = 0$, then, since $e_1, e_2$ are commensurable (see the discussion preceding Th. 1.3.5), it follows that $e_1 \vee e_2 = e_1 + e_2$; hence the identity map of $\Gamma_A$ is an additive measure. Therefore, since $A \subset \Gamma_A$, the set $A$ is a set of commensurable decision effects. Thus we have proven (iv) ⇒ (ii), (iv) ⇒ (v), and (v) ⇒ (ii).
(v) ⇒ (vi). By Zorn's lemma it follows that the set $S$ of Boolean subrings $\Gamma'$ for which $A \subset \Gamma' \subset G$ contains a maximal element $\Gamma_m$. $\Gamma_m$ must be a complete Boolean ring; otherwise there would be a subset $B \subset \Gamma_m$ for which the element $e = \bigvee_{\bar e \in B} \bar e \in G$ would not lie in $\Gamma_m$. Then, by Th. 1.3.5 and by (ii) ⇔ (iv), we would find that $\{\Gamma_m, e\}$ would be a set of commensurable decision effects which, according to (v), would be contained in a Boolean ring. This contradicts the fact that $\Gamma_m$ is maximal. $\hat\Gamma_A$ must be a sublattice of $\Gamma_m$ and must therefore be a Boolean ring. (vi) ⇒ (v) is trivial.

Th. 1.3.7. Let $e_\nu$ be a sequence of commensurable decision effects which converges in the $\sigma(\mathcal{B}', \mathcal{B})$ topology towards $e \in G$. Then $e$ is commensurable with all $\tilde e \in G$ which are commensurable with the $e_\nu$. In addition, it is possible to choose a subsequence $e_{\nu_k}$ such that $e = \bigwedge_{n=1}^{\infty} \bar e_n$ where $\bar e_n = \bigvee_{k=n}^{\infty} e_{\nu_k}$. (From this result we obtain the following special case: Every complete Boolean subring of $G$ is $\sigma(\mathcal{B}', \mathcal{B})$ closed in $G$.)

PROOF. From III, Th. 6.10, $\bar e_n \to \bar e$ in the $\sigma(\mathcal{B}', \mathcal{B})$ topology where $\bar e = \bigwedge_{n=1}^{\infty} \bar e_n$. Since $\bar e_n \ge e_{\nu_k}$ for all $k \ge n$, in the limit $k \to \infty$ we find that $\bar e_n \ge e$ and therefore that $\bar e \ge e$. If we show that $\bar e_n \to e$ in the $\sigma(\mathcal{B}', \mathcal{B})$ topology, then from $\bar e_n \ge \bar e \ge e$ we would obtain $e = \bar e$.

First we shall show that $e$ is commensurable with all the $\tilde e \in G$ which are commensurable with the $e_\nu$. If $\tilde e$ is commensurable with the $e_\nu$ then, by Th. 1.3.4(iv) and III, Th. 6.8, we obtain $\tilde e = (e_\nu \wedge \tilde e) + (\tilde e \wedge e_\nu^\perp)$, $e_\nu = (e_\nu \wedge \tilde e) + (e_\nu \wedge \tilde e^\perp)$. Since $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-compact (see AIII, §4 and §6), we may select a subsequence $e_{\nu_\alpha}$ from $e_\nu$ for which $e_{\nu_\alpha} \wedge \tilde e \to g_1 \in L$, $e_{\nu_\alpha} \wedge \tilde e^\perp \to g_2 \in L$, $\tilde e \wedge e_{\nu_\alpha}^\perp \to g_3 \in L$, and therefore $\tilde e = g_1 + g_3$, $e = g_1 + g_2$. From $e_{\nu_\alpha} \wedge \tilde e \le e_{\nu_\alpha}$ and from $e_{\nu_\alpha} \wedge \tilde e \le \tilde e$ it follows that $g_1 \le e$ and $g_1 \le \tilde e$. Hence it follows that $K_0(g_1) \supset K_0(e)$ and $K_0(g_1) \supset K_0(\tilde e)$. Therefore, from III, Th. 6.4 it follows that $K_0(g_1) \supset K_0(e) \vee K_0(\tilde e) = K_0(e \wedge \tilde e)$. According to III, Th.
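6.6 it follows that $g_1 \le e \wedge \tilde e$. Similarly we may prove that $g_2 \le e \wedge \tilde e^\perp$ and $g_3 \le \tilde e \wedge e^\perp$. From these results we obtain $g_1 + g_2 + g_3 \le [(e \wedge \tilde e) \vee (e \wedge \tilde e^\perp)] + (\tilde e \wedge e^\perp)$. Since $(e \wedge \tilde e) \vee (e \wedge \tilde e^\perp) \le e$ and $\tilde e \wedge e^\perp \le e^\perp$ we obtain $g_1 + g_2 + g_3 \le e + e^\perp = 1$. Therefore, by Th. 1.2.4, $e$ and $\tilde e$ are coexistent and, by Th. 1.3.4, are commensurable.

Since we have assumed that the $e_\nu$ are all commensurable, $e$ is therefore commensurable with all $e_\nu$ and, by Th. 1.3.5, is also commensurable with $\bar e$ and the $\bar e_n$. Since $e$ is commensurable with the $\bar e_n = \bigvee_{k=n}^{\infty} e_{\nu_k}$ and with the $e_{\nu_k}$, we therefore obtain (see the proof of Th. 1.3.5)
$$\bar e_n \wedge e^\perp = \bigvee_{k=n}^{\infty} (e_{\nu_k} \wedge e^\perp).$$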
Since the $e_{\nu_k} \wedge e^\perp$ are commensurable, we find that
$$\bar e_n \wedge e^\perp \le \sum_{k=n}^{\infty} (e_{\nu_k} \wedge e^\perp).$$
According to AIII, §4 the $\sigma(\mathcal{B}', \mathcal{B})$-topology on $L$ may be characterized by a norm $\|\cdot\|_\sigma$. For this norm we find that
$$\|\bar e_n \wedge e^\perp\|_\sigma \le \sum_{k=n}^{\infty} \|e_{\nu_k} \wedge e^\perp\|_\sigma.$$
Since $\bar e_n \ge e$ we find that $\bar e_n - e = \bar e_n \wedge e^\perp$ and we therefore obtain
$$\|\bar e_n - e\|_\sigma \le \sum_{k=n}^{\infty} \|e_{\nu_k} \wedge e^\perp\|_\sigma.$$
Since $L$ is compact, we may choose a subsequence $e_{\nu_k}$ from the $e_\nu$ such that $e_{\nu_k} \wedge e^\perp \to g \in L$ in the $\sigma(\mathcal{B}', \mathcal{B})$-topology. From $e_{\nu_k} \wedge e^\perp \le e_{\nu_k}$ it follows that, in the limit, $g \le e$; from $e_{\nu_k} \wedge e^\perp \le e^\perp$ it follows that $g \le e^\perp$. Hence we may conclude that $g \le e^\perp \wedge e = 0$, that is, $e_{\nu_k} \wedge e^\perp \to 0$. Thus it is possible to select a subsequence such that $\|e_{\nu_k} \wedge e^\perp\|_\sigma \le (\tfrac{1}{2})^k$, from which we conclude that $\|\bar e_n - e\|_\sigma \to 0$ and $\bar e_n \to e$.

D 1.3.2. The set of all $e \in G$ which are commensurable with all the elements of $G$ is called the center $Z$ of $G$.

Th. 1.3.8. $Z$ is the set of all $e = (E_1, E_2, \ldots)$ for which the $E_\nu$ in $H_\nu$ are either 0 or 1.

PROOF. According to Th. 1.3.4, $Z$ is the set of all $e = (E_1, E_2, \ldots)$ which commute with all the other $\tilde e \in G$, which proves the assertion.

According to Th. 1.3.6(vi) it follows that $Z$ is a complete Boolean subring of $G$ (which can easily be proven directly). $Z$ is atomic, with atoms $q_i = (0, 0, \ldots, 1_i, \ldots)$ where the components are equal to 0 except for the $i$th position. The fact that $Z$ is atomic is a consequence of AV 4s from III, §3. The atomic character of $Z$ is characteristic for microsystems.

1.4 Observables

In §1.1 and §1.2 we have implicitly presented the structure upon which the observable concept will be based. The concept of an observable is none other than an abstract idealization of the structure represented by the map $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$. Before we proceed to give a precise definition of the notion of an observable we shall make a number of preliminary remarks in order to reduce the possibility of misunderstandings.
Many readers will find it somewhat surprising that we shall say little about “measurement values,” “measurement scales” and the like when we introduce the notion of an observable. Is a “measurement” necessarily quantitative? In response to this
question, we emphasize that it is important for the reader to put aside the notion that the essential aspect of physics is that of quantitative measurement. Otherwise, the reader will not obtain a correct understanding of the methods of theoretical physics. Parameterization of registration procedures can be very "convenient," "practical," and "useful," but it has no fundamental meaning in reference to the mapping of physical reality by means of the mathematical structures. This becomes evident in the fact that it is possible, in principle, to record all measurements digitally and store them in a computer. In addition, the structure of a Boolean ring (for example, of $\mathcal{R}(b_0)$) becomes more tractable if we do not insist on imposing a more or less arbitrary parameterization of the Boolean ring. Finally, the abstract structure of a Boolean ring is more transparent than the "usual" parametric formulation.

At this point we could define an observable directly in terms of $\mathcal{R}(b_0)$ and the map $\psi_0: \mathcal{R}(b_0) \to L$. This approach will, however, lead to a number of mathematically inconvenient structures. Instead, we shall proceed in an abstract manner in analogy to the definition of coexistent effects.

The first approach would be to define an observable by means of a Boolean ring $\Sigma$ and an effective measure $F: \Sigma \to L$ (see [2], XIII, D 5.6; in [2] we have presented this definition. There we did not find it necessary to discuss the process of completion of $\Sigma$). For this approach, in order to make the concept more realistic physically, and to simplify the mathematical treatment, we shall impose a number of additional conditions upon the notion of an observable.

Mathematically, in order to formulate several theorems more simply, it is always very convenient to make the sets complete (relative to the uniform structure). We shall apply such a completion to the ring $\Sigma$.

Th. 1.4.1. Let $w \in K$; the sets $N_{w,\varepsilon} = \{(\sigma_1, \sigma_2) \mid \sigma_1, \sigma_2 \in \Sigma,\ \mu(w, F(\sigma_1 \dotplus \sigma_2)) < \varepsilon\}$ form a fundamental system of sets for the uniform structure $U_g$ (see AII, §2) of the Boolean ring $\Sigma$ with effective measure $F: \Sigma \to L$. $U_g$ separates $\Sigma$ because $F$ is effective. $U_g$ is metrizable, with metric $d(\sigma_1, \sigma_2) = \mu(w_0, F(\sigma_1 \dotplus \sigma_2))$ where we choose an "effective" $w_0$ (such an effective $w_0$ exists according to III, Th. 6.1), that is, a $w_0$ for which $C(w_0) = K$.

PROOF. For $w_0 \in K$ we will have proved the theorem if we can show that $d(\sigma_1, \sigma_2)$ is a metric, and that to each pair $w, \varepsilon$ there exists an $\varepsilon'$ for which $\mu(w_0, g) < \varepsilon'$ implies that $\mu(w, g) < \varepsilon$ for all $g \in L$.

Since $F(\sigma_1 \dotplus \sigma_2) + F(\sigma_2 \dotplus \sigma_3) = F(\sigma_1 \dotplus \sigma_3) + 2F((\sigma_1 \dotplus \sigma_2) \cdot (\sigma_2 \dotplus \sigma_3))$ we obtain $F(\sigma_1 \dotplus \sigma_3) \le F(\sigma_1 \dotplus \sigma_2) + F(\sigma_2 \dotplus \sigma_3)$. Hence $d(\sigma_1, \sigma_2)$ satisfies the triangle inequality. From $d(\sigma_1, \sigma_2) = 0$ it follows that $\mu(w_0, F(\sigma_1 \dotplus \sigma_2)) = 0$ and we therefore obtain $\mu(w, F(\sigma_1 \dotplus \sigma_2)) = 0$ for all $w \in C(w_0) = K$, that is, $F(\sigma_1 \dotplus \sigma_2) = 0$. Since $F$ is effective, it follows that $\sigma_1 \dotplus \sigma_2 = 0$, that is, $\sigma_1 = \sigma_2$. Thus $d(\sigma_1, \sigma_2)$ is a metric.
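The metric $d(\sigma_1, \sigma_2) = \mu(w_0, F(\sigma_1 \dotplus \sigma_2))$ can be made concrete in a finite toy model. The sketch below rests on assumptions that are not in the text: $\Sigma$ is taken to be the ring of all subsets of a three-point set, ring addition is the symmetric difference, and a strictly positive weight (the hypothetical table `WEIGHT`) plays the role of $\sigma \mapsto \mu(w_0, F(\sigma))$ for an effective $w_0$. It checks the triangle inequality, the separation property, and the identity $m(x) + m(y) = m(x \dotplus y) + 2m(x \cdot y)$ from which the triangle inequality follows.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical strictly positive weights: they stand in for mu(w0, F(.)) on atoms.
WEIGHT = {"a": Fr(1, 2), "b": Fr(1, 3), "c": Fr(1, 6)}

def m0(sigma):
    """Additive measure of a subset (exact rational arithmetic)."""
    return sum((WEIGHT[p] for p in sigma), Fr(0))

def d(s1, s2):
    """d(s1, s2) = m0(s1 + s2), where + is the symmetric difference."""
    return m0(s1 ^ s2)

# All 8 elements of the Boolean ring of subsets of {a, b, c}.
SIGMA = [frozenset(c) for r in range(len(WEIGHT) + 1)
         for c in combinations(WEIGHT, r)]

triangle_ok = all(d(a, c) <= d(a, b) + d(b, c)
                  for a in SIGMA for b in SIGMA for c in SIGMA)
separates = all((d(a, b) == 0) == (a == b) for a in SIGMA for b in SIGMA)
identity_ok = all(d(a, b) + d(b, c) == d(a, c) + 2 * m0((a ^ b) & (b ^ c))
                  for a in SIGMA for b in SIGMA for c in SIGMA)
print(triangle_ok, separates, identity_ok)
```

Separation depends on every weight being strictly positive, which is the finite counterpart of choosing an effective $w_0$ together with an effective $F$; setting one weight to zero would make `separates` fail while leaving the triangle inequality intact.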
We shall now show that we may choose a special effective $w_0 \in K$ such that, to each pair $w, \varepsilon$ there exists an $\varepsilon'$ for which $\mu(w_0, g) < \varepsilon'$ implies $\mu(w, g) < \varepsilon$ for all $g \in L$. For this purpose, we introduce a denumerable subset $\{w_\nu\}$ which is dense in $K$ and define $w_0 = \sum_{\nu=1}^{\infty} \lambda_\nu w_\nu$, $\lambda_\nu > 0$, $\sum_{\nu=1}^{\infty} \lambda_\nu = 1$. Then, for $w \in K$, $w_\rho \in \{w_\nu\}$ we obtain
$$\mu(w, g) \le |\mu(w - w_\rho, g)| + \mu(w_\rho, g) \le \|w - w_\rho\| + \mu(w_\rho, g).$$
Since $\mu(w_0, g) = \sum_{\nu=1}^{\infty} \lambda_\nu \mu(w_\nu, g)$, we find that $\lambda_\rho \mu(w_\rho, g) \le \mu(w_0, g)$ and we obtain
$$\mu(w, g) \le \|w - w_\rho\| + \lambda_\rho^{-1} \mu(w_0, g).$$
If we now choose $w_\rho$ so that $\|w - w_\rho\| < \varepsilon/2$ and choose $\varepsilon' < \lambda_\rho (\varepsilon/2)$ then we obtain $\mu(w, g) < \varepsilon$.

From Th. 1.4.3 and Th. 2.1.11 it follows that the metrics $d_1(\sigma_1, \sigma_2) = \mu(w_1, F(\sigma_1 \dotplus \sigma_2))$ and $d_2(\sigma_1, \sigma_2) = \mu(w_2, F(\sigma_1 \dotplus \sigma_2))$ are equivalent for effective pairs $w_1, w_2 \in K$. Here it is important to note that the following theorems (up to Th. 2.1.11) only require that there exists a single $w_0$ for which the metric defined in Th. 1.4.1 generates the uniform structure $U_g$!

Let $\Sigma_g$ denote $\Sigma$ endowed with the uniform structure $U_g$ defined in Th. 1.4.1.

Th. 1.4.2. The map $F: \Sigma_g \to L$ (where $L$ is endowed with the uniform structure defined by $\sigma(\mathcal{B}', \mathcal{B})$) and the maps $(\sigma_1, \sigma_2) \to \sigma_1 \dotplus \sigma_2$ and $(\sigma_1, \sigma_2) \to \sigma_1 \cdot \sigma_2$ of $\Sigma_g \times \Sigma_g$ into $\Sigma_g$ are uniformly continuous.

PROOF. The uniform structure on $L$ determined by $\sigma(\mathcal{B}', \mathcal{B})$ is the initial structure for the maps $L \xrightarrow{\mu(w, g)} [0, 1] \subset \mathbb{R}$ for $w \in K$ because every $y \in \mathcal{B}$ may be expressed in the form $y = \alpha w_1 - \beta w_2$ where $w_1, w_2 \in K$. If the composite map $\Sigma_g \xrightarrow{F} L \xrightarrow{\mu(w, g)} [0, 1]$ is uniformly continuous, that is, if the map $\sigma \to \mu(w, F(\sigma))$ is uniformly continuous for each $w$, then the map $F$ will be uniformly continuous (see AII, §2). From
$$F(\sigma_1) + F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = F(\sigma_1 \vee \sigma_2) = F(\sigma_2) + F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2)$$
and from
$$F(\sigma_1 \dotplus \sigma_2) = F(\sigma_1 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2)$$
it follows that
$$|\mu(w, F(\sigma_1) - F(\sigma_2))| \le \mu(w, F(\sigma_1 \dotplus \sigma_2)),$$
thus proving the uniform continuity of $\mu(w, F(\sigma))$.
1 Coexistent Effects and Observables 99 From Ffal + £1 + <*2 + o2) < F(<T 1 + <*i) + F{G2 4 (T2) it is easy to show that (<rl9 a2) —>a1 + a2 is uniformly continuous. From F(<ri 4 + F(g2 + d2) > 2F(((T1 + + ^2)) = 2F((T1 • (T2 4 (Ti -(T2 4 -^2 + £1 -£2) = 2F(<71 • (T2 4 £1 • ^2) + Щ°1 • ^2 + ^1 • ^2) it follows that F{c1 • (T2 4 ffi • ff2) ^ 2fF(^i + <*i) + F(<t2 + ^2)]» thus proving that the map (<rl9 <r2) -kj1-(t2 is uniformly continuous. Th. 1.4.3. The uniform completion tg of 2,g is a (lattice-theoretically) complete Boolean ring. The additive measure F may be extended on tg to an additive measure Ъд -> L. IfF is effective on 2,g then its extension is effective on tlg. For each subset Г с Ъд a denumerable subset ov e Г can be so chosen such that V^er*7 = \Д (and similarly for Д). In addition \/I=i \Д® 1 av • Both В and m(o) = p(w9 F(o))9 are also o-additive measures on Ъд. Proof. From Th. 1.4.2 and from the general theorems about the extension of uniformly continuous maps we find that tg is a Boolean ring and that the map хДь may be extended as a uniformly continuous map because L is o(08\ $)- complete since it is <r(@f9 ^-compact. Similarly, it is easy to show that the extension of F is an additive measure on tg since ol9 o2 g tg may be approximated arbitrarily well by al9 a2 g if c1 • <r2 =0 then • a2 will approximate the null- element arbitrarily well. Similarly it follows that F is effective on tg when it is effective on . In order to prove the lattice-theoretical completeness of tg it suffices to show that, for а Г c= tg the upper bound \/аеГ<т exists because if Г is the set of all a for which <7 < <7 for all <7 g Г then we obtain еГ а = \Jser a. Let ф denote the set of all finite subsets of Г. Then \/аеГ<т = \Де</> % where % = V<reo»<T- Since tg is a Boolean ring <7Ф g tg the set of <7Ф, q> g ф is a directed subset of t,g. From the additivity of the measure m0(<7) = p(w0, F(<r)) where w0 is defined in Th. 
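1.4.1, we may conclude that the set of numbers $m_0(\sigma_\varphi)$ is upwardly directed; since $m_0(\sigma_\varphi) \le 1$ it is also bounded. The set of $m_0(\sigma_\varphi)$ therefore has a least upper bound which we denote by $\alpha$. If $\varphi_1, \varphi_2 \in \Phi$ and if $\varphi \in \Phi$, $\varphi \subset \varphi_1$, $\varphi \subset \varphi_2$, then
$$m_0(\sigma_{\varphi_1} \dotplus \sigma_{\varphi_2}) = m_0(\sigma_{\varphi_1} \dotplus \sigma_\varphi \dotplus \sigma_\varphi \dotplus \sigma_{\varphi_2}) \le m_0(\sigma_{\varphi_1} \dotplus \sigma_\varphi) + m_0(\sigma_\varphi \dotplus \sigma_{\varphi_2}).$$
Furthermore, since $\sigma_\varphi \le \sigma_{\varphi_1}$, $\sigma_\varphi \le \sigma_{\varphi_2}$ we find that
$$m_0(\sigma_{\varphi_1} \dotplus \sigma_\varphi) = m_0(\sigma_{\varphi_1}) - m_0(\sigma_\varphi) \le \alpha - m_0(\sigma_\varphi);$$
a similar expression holds for $\sigma_{\varphi_2}$. Thus we obtain $m_0(\sigma_{\varphi_1} \dotplus \sigma_{\varphi_2}) \le 2[\alpha - m_0(\sigma_\varphi)]$. Using $d(\ldots)$ from Th. 1.4.1 we obtain
$$d(\sigma_{\varphi_1}, \sigma_{\varphi_2}) \le 2[\alpha - m_0(\sigma_\varphi)].$$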
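Since $\alpha$ is the least upper bound of all $m_0(\sigma_\varphi)$, the directed set $\sigma_\varphi$, $\varphi \in \Phi$, converges towards an element $\sigma_\Phi$. Since $\sigma_\varphi \cdot \sigma_{\varphi_1} = \sigma_\varphi$ for $\varphi_1 \supset \varphi$ we obtain $\sigma_\Phi \cdot \sigma_\varphi = \sigma_\varphi$, that is, $\sigma_\Phi \ge \sigma_\varphi$ for all $\varphi \in \Phi$. If $\sigma \ge \sigma_\varphi$ for all $\varphi \in \Phi$, then from $\sigma \cdot \sigma_\varphi = \sigma_\varphi$ it follows that, in the limit, $\sigma \cdot \sigma_\Phi = \sigma_\Phi$ and also $\sigma_\Phi \le \sigma$. Therefore $\sigma_\Phi = \bigvee_{\varphi \in \Phi} \sigma_\varphi = \bigvee_{\sigma \in \Gamma} \sigma$.

As a special case of the above result, for a denumerable set $\{\sigma_\nu\}$ we obtain
$$\bar\sigma_n = \bigvee_{\nu=1}^{n} \sigma_\nu \to \bigvee_{\nu=1}^{\infty} \sigma_\nu;$$
since $\bar\sigma_n$ is an increasing sequence, bounded from above, we therefore obtain
$$\bar\sigma_n \to \bigvee_{m=1}^{\infty} \bar\sigma_m = \bigvee_{\nu=1}^{\infty} \sigma_\nu.$$
We shall now show that, in general, it is possible to choose a denumerable subset $\{\sigma_\mu\}$ from $\Gamma$ such that $\bigvee_{\sigma \in \Gamma} \sigma = \bigvee_{\mu=1}^{\infty} \sigma_\mu$. First we can choose a denumerable sequence $\varphi_\nu$ from $\Phi$ such that $d(\sigma_{\varphi_\nu}, \sigma_\Phi) \to 0$. For $\bar\sigma = \bigvee_\nu \sigma_{\varphi_\nu}$ we find that $\sigma_{\varphi_\nu} \le \bar\sigma \le \sigma_\Phi = \bigvee_{\varphi \in \Phi} \sigma_\varphi$. Therefore, as above, since $\sigma_{\varphi_\nu} \to \sigma_\Phi$, it follows that $\sigma_\Phi \le \bar\sigma \le \sigma_\Phi$; that is, $\bar\sigma = \sigma_\Phi$. Thus, using the denumerable set $\gamma = \bigcup_\nu \varphi_\nu \subset \Gamma$, we obtain $\bigvee_{\sigma \in \Gamma} \sigma = \sigma_\Phi = \bar\sigma = \bigvee_\nu \sigma_{\varphi_\nu} = \bigvee_{\sigma \in \gamma} \sigma$.

$F$ is $\sigma$-additive if, for a denumerable sequence $\sigma_\nu \in \hat\Sigma_g$ for which $\sigma_\nu \cdot \sigma_\mu = 0$ for $\nu \ne \mu$,
$$F\Big(\bigvee_\nu \sigma_\nu\Big) = \sum_\nu F(\sigma_\nu).$$
Let $\bar\sigma_n = \bigvee_{\nu=1}^{n} \sigma_\nu$; then we obtain $F(\bar\sigma_n) = \sum_{\nu=1}^{n} F(\sigma_\nu)$. We must therefore show that $F(\bar\sigma_n)$ converges towards $F(\bigvee_\nu \sigma_\nu)$ in the $\sigma(\mathcal{B}', \mathcal{B})$-topology. Above we have already shown that $\bar\sigma_n \to \bigvee_\nu \sigma_\nu$; from the continuity of $F$ the assertion follows. From the $\sigma$-additivity of $F$ it easily follows that $m(\sigma) = \mu(w, F(\sigma))$ is also $\sigma$-additive.

Th. 1.4.4. Let $\Sigma$ be a (lattice-theoretically) complete Boolean ring, let $m_0$ be a $\sigma$-additive effective measure $\Sigma \to [0, 1] \subset \mathbb{R}$ and let $U_g$ be the uniform structure induced by $d(\sigma_1, \sigma_2) = m_0(\sigma_1 \dotplus \sigma_2)$. Then $\Sigma$ is $U_g$-complete. Moreover, given a convergent sequence $\sigma_\nu \to \sigma$, a subsequence $\sigma_{\nu_i}$ can be chosen such that the decreasing sequence $\bar\sigma_m = \bigvee_{i=m}^{\infty} \sigma_{\nu_i}$ converges to $\sigma$ and $\sigma = \bigwedge_m \bar\sigma_m$.

PROOF. $\Sigma$ is $U_g$-complete if every Cauchy sequence $\sigma_\nu$ converges to a limit $\sigma \in \Sigma$.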
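From $\sigma_\nu$ we select a (yet to be determined) subsequence $\sigma_{\nu_k}$ and (since $\Sigma$ is lattice-theoretically complete) construct $\bar\sigma_n = \bigvee_{k=n}^{\infty} \sigma_{\nu_k}$ and $\sigma = \bigwedge_{n=1}^{\infty} \bar\sigma_n$. We will now show that $\sigma_\nu \to \sigma$. Since $\sigma_\nu$ is a Cauchy sequence, it suffices to show this for a subsequence, $\sigma_{\nu_k} \to \sigma$, that is, $d(\sigma_{\nu_k}, \sigma) \to 0$.

Suppose that $m_0$ is $\sigma$-additive and that $\sigma_\mu$ is a decreasing sequence which satisfies $\bigwedge_\mu \sigma_\mu = 0$. We claim that $m_0(\sigma_\mu) \to 0$. From
$$\sigma_1 = (\sigma_1 \dotplus \sigma_2) \vee (\sigma_2 \dotplus \sigma_3) \vee \cdots \vee (\sigma_{m-1} \dotplus \sigma_m) \vee \cdots$$
it follows, on the basis of $\sigma$-additivity, that
$$m_0(\sigma_1 \dotplus \sigma_2) + m_0(\sigma_2 \dotplus \sigma_3) + \cdots = m_0(\sigma_1).$$
Since $m_0(\sigma_\mu \dotplus \sigma_{\mu+1}) = m_0(\sigma_\mu) - m_0(\sigma_{\mu+1})$, the sum of the first $m - 1$ terms on the left-hand side of the above equation is equal to $m_0(\sigma_1) - m_0(\sigma_m)$. Therefore it follows that $m_0(\sigma_m) \to 0$.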
1 Coexistent Effects and Observables 101 From <7 = Д °= x <7„ it follows that an — an -j- о is a decreasing sequence for which A“=i К = 0. Therefore d(Sn, a) = m0(S„ + (x)-»0. From 5m = <rVm v (V<" „+1 <0 it follows that 00 00 + °vm < v к + о= V i = m + 1 i = m + Thus, since m0 is (7-additive, it follows that ^ Z + ffVm + к + ff»j- V Kj + o-Vm) i = m+1 L j = m +1 J 00 00 ^ Z 'Mo(oVi + = Z °vj- i = m + 1 i = m+ 1 Since (7V is a Cauchy sequence, we may select a subsequence such that d((TVk, <7Д) < 1/2* for p> vk. Thus we obtain <*(<?«, ^ z 4a°- k=m+lL From d((iVm, a) < d(crVm, <tw) + d(dm, a) it follows that d(aVm, a)—► 0 and therefore also (7V —» (7. Since (7 e Z, Z is (/^-complete. In reference to Th. 1.4.3 it is important to note that the following situation is possible: For a subset ГсИ9 there exists an upper bound in Ъд which is usually denoted by \Jff e r a. We shall, however, denote this upper bound in Ъд by Vffer*7- The same subset Г, as a subset of t,g has, according to Th. 1.4.3, an upper bound in 1Lg, which in Th. 1.4.3 was denoted by \/аеГ a. Note that it is possible that V а Ф V a- <r e Г or e Г In this sense it is possible that F is a-additive in t.g9 but is not a-additive in S*! The map represents an idealization of the situation $(b0)-^+L. For £, as an idealization of ЩЬ0), the uniform structure Ug which was introduced in Th. 1.4.1 also appears to be physically meaningful. Ug distinguishes two elements <rl9 o2 from Ъ with the aid of finitely many w e К by means of the probabilities p(w, F(o1 + g2)) where ax + o2 that “re¬ gistration” in which only one of the two registrations <7l5 o2 have responded. 
It is premature, however, to identify $U_g$ with the uniform structure of the "physical imprecision" associated with $\Sigma$ (this uniform structure of "physical imprecision" is treated in a general setting in [1], §6 and §9) because of the following unexpected property of $U_g$: In general $\hat\Sigma_g$ is not compact, whereas compactness is what we would expect (according to [1], §9) for a uniform structure of physical imprecision.

If we postulate that $\Sigma$ is denumerable, then $\hat\Sigma_g$ will be separable. $\mathcal{R}(b_0)$ is denumerable, in agreement with the assumptions made at the beginning of III, §3 about $M$, $\mathcal{Q}$, $\mathcal{R}$.

Since $\hat\Sigma_g$ need not be compact, we would like to obtain another uniform structure which would provide a better map of the physical imprecision of the distinguishability of the elements of $\Sigma$ than that provided by $U_g$.
For the special case $\Sigma \xrightarrow{F} G$ the Boolean ring $\Sigma$ is isomorphic to the image of $\Sigma$ in $G$. The uniform structure which describes the physical imprecision in $G$ is that generated by $\sigma(\mathcal{B}', \mathcal{B})$. The latter may be transferred to $\Sigma$ as the initial structure which corresponds to the map $F$. This result would suggest that it is reasonable to use the initial uniform structure on $\Sigma$ (which is generated by the map $\Sigma \xrightarrow{F} L$) as the uniform structure of physical imprecision. But the initial uniform structure does not, in general, separate $\Sigma$; for $\Sigma = \mathcal{R}(b_0)$, $F = \psi_0$ we obtain (by means of $F(\sigma_1) = F(\sigma_2)$) the same equivalence classes of elements $(b_0, b)$ which we have considered in III, §1.

The uniform structure $U_g$ permits us to compare each $\sigma$ with all the other $\sigma$'s. By analogy to the fact that we can test a $\sigma$ only with a finite number of $w \in K$, we may only compare a $\sigma$ with a finite number of the $\hat\sigma \in \Sigma$. This fact leads us to adopt the initial uniform structure generated by the maps $\sigma \to F_{\hat\sigma}(\sigma) = F(\sigma \cdot \hat\sigma)$, that is, the weakest uniform structure for which the maps $F_{\hat\sigma}$ are uniformly continuous for each $\hat\sigma$ (see AII, §2), as the uniform structure $U_p$ of physical imprecision. Since the uniform structure generated by $\sigma(\mathcal{B}', \mathcal{B})$ is identical to the initial structure defined by the maps $L \xrightarrow{\mu(w, g)} [0, 1] \subset \mathbb{R}$ for all $w \in K$, $U_p$ is equal to the initial structure which is generated by the maps
$$\Sigma \xrightarrow{\mu(w, F(\sigma \cdot \hat\sigma))} [0, 1]$$
for all $w \in K$, $\hat\sigma \in \Sigma$. We therefore "test" a $\sigma$ by means of $U_p$ using a finite number of the $\hat\sigma \in \Sigma$ and a finite number of the $w \in K$ (see the general treatment presented in [1], §9).

According to Th. 1.4.2 the maps $\Sigma_g \xrightarrow{F(\sigma \cdot \hat\sigma)} L$ (for fixed $\hat\sigma$) are uniformly continuous. Therefore $U_g$ is finer than $U_p$. If $F$ is effective (which we shall always assume, see Th. 1.2.2) then $U_p$ will also separate $\Sigma$, because if $F(\sigma_1 \cdot \hat\sigma) = F(\sigma_2 \cdot \hat\sigma)$ for all $\hat\sigma \in \Sigma$ then, for $\hat\sigma = \sigma_2$, we obtain $F(\sigma_2) = F(\sigma_1 \cdot \sigma_2)$.
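Thus, from $F(\sigma_2) = F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) + F(\sigma_1 \cdot \sigma_2)$ it follows that $F(\sigma_2 \dotplus \sigma_1 \cdot \sigma_2) = 0$, that is, $\sigma_2 \dotplus \sigma_1 \cdot \sigma_2 = 0$ and hence $\sigma_2 = \sigma_1 \cdot \sigma_2$. In the same way we may derive (for $\hat\sigma = \sigma_1$) $\sigma_1 = \sigma_1 \cdot \sigma_2$. Therefore we obtain $\sigma_1 = \sigma_2$.

The map $F(\sigma \cdot \hat\sigma)$ may be extended to all of $\hat\Sigma_g$ as a uniformly continuous map by means of the extension of $F$ upon $\hat\Sigma_g$, in such a way that $U_p$ will be defined in all of $\hat\Sigma_g$. The uniform structure $U_p$ is coarser than $U_g$ and is the initial structure for the maps $\hat\Sigma_g \xrightarrow{F_{\hat\sigma}} L$ where $F_{\hat\sigma}(\sigma) = F(\sigma \cdot \hat\sigma)$ for all $\hat\sigma \in \Sigma_g$. Would we obtain a finer initial structure on $\hat\Sigma_g$ if we admit all $\hat\sigma \in \hat\Sigma_g$? In fact we obtain the same structure $U_p$ as the initial structure for the maps $\hat\Sigma_g \xrightarrow{F_{\hat\sigma}} L$ for all $\hat\sigma \in \hat\Sigma_g$. This becomes evident from the following estimate for $\tilde\sigma \in \hat\Sigma_g$ and $\hat\sigma \in \Sigma_g$, which is obtained by analogy with the proof of Th. 1.4.2:
$$|\mu(w, F(\sigma_1 \cdot \tilde\sigma) - F(\sigma_2 \cdot \tilde\sigma))| \le |\mu(w, F(\sigma_1 \cdot \tilde\sigma) - F(\sigma_1 \cdot \hat\sigma))| + |\mu(w, F(\sigma_1 \cdot \hat\sigma) - F(\sigma_2 \cdot \hat\sigma))| + |\mu(w, F(\sigma_2 \cdot \hat\sigma) - F(\sigma_2 \cdot \tilde\sigma))|;$$
$$|\mu(w, F(\sigma \cdot \tilde\sigma) - F(\sigma \cdot \hat\sigma))| \le \mu(w, F(\sigma \cdot \tilde\sigma \dotplus \sigma \cdot \hat\sigma)) = \mu(w, F(\sigma \cdot (\tilde\sigma \dotplus \hat\sigma))) \le \mu(w, F(\tilde\sigma \dotplus \hat\sigma))$$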
and we therefore obtain
$$|\mu(w, F(\sigma_1 \cdot \tilde\sigma) - F(\sigma_2 \cdot \tilde\sigma))| \le 2\mu(w, F(\tilde\sigma \dotplus \hat\sigma)) + |\mu(w, F(\sigma_1 \cdot \hat\sigma) - F(\sigma_2 \cdot \hat\sigma))|.$$
The previous discussion shows that if we impose the condition $\hat\Sigma_g = \Sigma_g$ we do not lose any physical generality, but we do gain mathematical simplicity. We define:

D 1.4.1. Let $\Sigma$ be a Boolean ring with additive measure $F: \Sigma \to L$ such that $\Sigma$ is complete with respect to the uniform structure $U_g$ defined in Th. 1.4.1 (or equivalently, by Th. 1.4.3 and Th. 1.4.4, such that $F$ is $\sigma$-additive and $\Sigma$ is lattice theoretically complete). We define the uniform structure of "physical imprecision" $U_p$ on $\Sigma$ as the initial uniform structure generated by the maps $\Sigma \xrightarrow{F_{\hat\sigma}} L$, $F_{\hat\sigma}(\sigma) = F(\sigma \cdot \hat\sigma)$, $\hat\sigma \in \Sigma$, where $L$ is endowed with the uniform structure generated by $\sigma(\mathcal{B}', \mathcal{B})$. We shall let $\Sigma_p$ denote $\Sigma$ together with the uniform structure $U_p$ ($\Sigma_g$ is therefore complete; $\Sigma_p$ need not necessarily be complete).

Th. 1.4.5. $\Sigma_p$ is precompact (see AII, §2). The topologies generated by $U_p$ and $U_g$ are identical.

PROOF. Since $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-compact, $\Sigma_p$ is precompact (see AII, §2). Since $U_g$ is finer than $U_p$, it is only necessary to show that the identity map $\Sigma_p \to \Sigma_g$ is continuous (not necessarily uniformly continuous!) in order to prove that the topologies generated by $U_p$ and $U_g$ are identical. Now, for fixed $\sigma$ we have
$$d(\sigma, \sigma_1) = \mu(w_0, F(\sigma \dotplus \sigma_1)) = \mu(w_0, F(\sigma) + F(\sigma_1) - 2F(\sigma \cdot \sigma_1)) \le 2|\mu(w_0, F(\sigma) - F(\sigma \cdot \sigma_1))| + |\mu(w_0, F(\sigma_1) - F(\sigma))|.$$
Therefore, for the case in which $\hat\sigma_1 = \sigma$, $\hat\sigma_2 = \varepsilon$ ($\varepsilon$ = unit element) we have
$$d(\sigma, \sigma_1) \le 2|\mu(w_0, F(\sigma \cdot \hat\sigma_1) - F(\sigma_1 \cdot \hat\sigma_1))| + |\mu(w_0, F(\sigma \cdot \hat\sigma_2) - F(\sigma_1 \cdot \hat\sigma_2))|,$$
from which we conclude that the identity map $\Sigma_p \to \Sigma_g$ is continuous.

According to Th. 1.4.5 the structure $\Sigma_p$, $\hat\Sigma_p$ is of the form which has been discussed more generally in [1], §9. $U_p$ is somewhat "more physical" than $U_g$. $\Sigma_p$ is precompact (totally bounded), but the structure of this Boolean ring may not necessarily be extended to the completion $\hat\Sigma_p$.
We note that $U_g$ is characterized by the fact that $\hat\Sigma_g = \Sigma_g$ is a complete Boolean ring; in order to obtain a Boolean ring upon completion it is methodologically desirable to introduce $U_g$ and not only $U_p$.

Since we wish to describe an idealization of the situation $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$ by $\Sigma \xrightarrow{F} L$ and since $\mathcal{R}(b_0)$ is denumerable, it seems reasonable to introduce the notion of an observable in the following way:

D 1.4.2. By an observable we mean a pair of objects $(\Sigma, F)$ where $\Sigma$ is a Boolean ring and $F$ is an additive effective measure $F: \Sigma \to L$ for which $\Sigma$ is complete and separable with respect to the uniform structure $U_g$ (defined by Th. 1.4.1 using $F$). We shall denote the observable $(\Sigma, F)$ also by $\Sigma \xrightarrow{F} L$.
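A finite toy instance may help to fix the terms of D 1.4.2. The sketch below is an illustration under assumptions that are not taken from the text: $\Sigma$ is the ring of subsets of three hypothetical outcomes (`"up"`, `"down"`, `"none"`), and each effect is a diagonal $2 \times 2$ matrix stored as the tuple of its diagonal entries. For a finite $\Sigma$ completeness and separability with respect to $U_g$ hold trivially, so only additivity, normalization and effectiveness of $F$ are checked, together with whether $F$ takes its values in $L$ or already in $G$.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical effects attached to the three atoms of Sigma; each tuple holds
# the diagonal entries of a diagonal 2x2 matrix g with 0 <= g <= 1.
ATOM_EFFECT = {
    "up":   (Fr(3, 4), Fr(1, 4)),
    "down": (Fr(1, 4), Fr(1, 2)),
    "none": (Fr(0),    Fr(1, 4)),
}
ZERO = (Fr(0), Fr(0))
ONE = (Fr(1), Fr(1))

def F(sigma):
    """Additive measure: sum of the effects of the outcomes contained in sigma."""
    total = ZERO
    for outcome in sigma:
        total = tuple(t + x for t, x in zip(total, ATOM_EFFECT[outcome]))
    return total

# All 8 elements of the Boolean ring of subsets of the outcome set.
SIGMA = [frozenset(c) for r in range(len(ATOM_EFFECT) + 1)
         for c in combinations(ATOM_EFFECT, r)]

additive = all(F(a | b) == tuple(x + y for x, y in zip(F(a), F(b)))
               for a in SIGMA for b in SIGMA if not (a & b))
normalized = F(frozenset(ATOM_EFFECT)) == ONE       # F(unit element) = 1
effective = all((F(s) == ZERO) == (len(s) == 0) for s in SIGMA)
in_L = all(all(0 <= x <= 1 for x in F(s)) for s in SIGMA)
decision_valued = all(all(x in (0, 1) for x in F(s)) for s in SIGMA)
print(additive, normalized, effective, in_L, decision_valued)
```

With these particular numbers $F$ is an additive effective measure into $L$ whose values are not all decision effects, so the pair models an observable in the sense of D 1.4.2 whose range does not lie in $G$.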
According to D 1.4.2, $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ are considered to be the "same" observable if there is an isomorphism $\Sigma_1 \xrightarrow{i} \Sigma_2$ of the Boolean rings for which $F_1(\sigma) = F_2(i\sigma)$.

Th. 1.4.6. $\Sigma \xrightarrow{F} L$ is an observable if and only if $\Sigma$ is lattice theoretically complete, $F$ is $\sigma$-additive and there exists a denumerable Boolean sublattice $\Sigma_a$ of $\Sigma$ whose (lattice theoretical) completion in $\Sigma$ is equal to $\Sigma$.

PROOF. According to Th. 1.4.3, $\Sigma$ (of D 1.4.2) is lattice theoretically complete and $F$ is $\sigma$-additive. Since $\Sigma$ is $U_g$-separable (according to D 1.4.2) there exists a denumerable subset $A \subset \Sigma$ which is $U_g$-dense in $\Sigma$. The Boolean ring $\Sigma_A$ generated by $A$ is denumerable. The lattice theoretical completion of $\Sigma_A$ in $\Sigma$ is a (lattice theoretically) complete Boolean ring $\hat\Sigma_A \subset \Sigma$. According to Th. 1.4.4, $\hat\Sigma_A$ is $U_g$-complete. Since $\hat\Sigma_A$ is $U_g$-dense in $\Sigma$ and since $\hat\Sigma_A$ is $U_g$-complete, $\hat\Sigma_A = \Sigma$.

Conversely, if $\Sigma$ is lattice theoretically complete and $F$ is $\sigma$-additive, then by Th. 1.4.4 $\Sigma$ is $U_g$-complete. We shall now show that $\Sigma_a$ is $U_g$-dense in $\Sigma$. Let $\hat\Sigma_{ag}$ be the completion of $\Sigma_a$ with respect to $U_g$. Since $\Sigma$ is $U_g$-complete, $\hat\Sigma_{ag} \subset \Sigma$. Note that $\hat\Sigma_{ag}$ is, according to Th. 1.4.3, lattice theoretically complete; therefore the lattice theoretical completion $\Sigma$ of $\Sigma_a$ lies in $\hat\Sigma_{ag}$, that is, $\Sigma \subset \hat\Sigma_{ag}$. Therefore $\hat\Sigma_{ag} = \Sigma$ and $\Sigma_a$ is $U_g$-dense in $\Sigma$.

In previous theorems we have only assumed that $F$ maps the Boolean ring $\Sigma$ into $L$. For the special case $\Sigma \xrightarrow{F} G$ it is, in principle, conceivable that the extension of the map $F$ to the completion $\hat\Sigma_g$ will yield points which are not elements of $G$. The following theorem rules out this possibility.

Th. 1.4.7. Suppose that, for a Boolean ring $\Sigma$, an additive measure $F: \Sigma \to G$ is given. Then, for the extension $F: \hat\Sigma_g \to L$, $F\hat\Sigma_g$ is contained in the $\sigma(\mathcal{B}', \mathcal{B})$-closure of $F\Sigma$ in $G$.

PROOF. According to Th. 1.4.1, $U_g$ is metrizable. Thus it suffices to show, for a sequence $\sigma_\nu \in \Sigma$ satisfying $d(\sigma_\nu, \sigma) \to 0$ for $\sigma \in \hat\Sigma_g$, that if $F(\sigma_\nu) \in G$ then $F(\sigma) \in G$.
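Since $F(\sigma_\nu) \to F(\sigma)$ in the $\sigma(\mathcal{B}', \mathcal{B})$-topology, we have then also proven that $F\Sigma_g$ is in the $\sigma(\mathcal{B}', \mathcal{B})$-closure of $F\Sigma$ in $G$. Since $F(\sigma_\nu)$ converges towards $F(\sigma)$, in order to prove that $F(\sigma) \in G$ it suffices to consider a subsequence $\sigma_{\nu_i}$, where the latter is chosen such that (using the notation of Th. 1.4.4)

$$\sigma = \bigwedge_{m=1}^{\infty} \bar{\sigma}_m, \qquad \bar{\sigma}_m = \bigvee_{i=m}^{\infty} \sigma_{\nu_i}.$$

Then we obtain $F(\bar{\sigma}_m) \to F(\sigma)$. According to Th. 1.4.3, with

$$\bar{\sigma}_{m,N} = \bigvee_{i=m}^{N} \sigma_{\nu_i}$$

we therefore obtain $F(\bar{\sigma}_{m,N}) \to F(\bar{\sigma}_m)$ for $N \to \infty$. Since $\bar{\sigma}_{m,N} \in \Sigma$ we obtain $F(\bar{\sigma}_{m,N}) \in G$. Therefore, by III, Th. 6.10, $F(\bar{\sigma}_m) \in G$. $F(\bar{\sigma}_m)$ is a decreasing sequence; therefore, by III, Th. 6.10 we obtain $F(\sigma) \in G$.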
Th. 1.4.7 shows that in the case in which $\Sigma \xrightarrow{F} G$ we may assume that $\Sigma$ is complete without any loss of generality. Thus we define:

D 1.4.3. A decision observable is a pair $(\Sigma, F)$ where $\Sigma$ is a Boolean ring and $F$ is an additive effective measure $F: \Sigma \to G$ for which $\Sigma$ is complete with respect to the uniform structure $\mathcal{U}_g$. (We do not require that $\Sigma$ is $\mathcal{U}_g$-separable; this fact is a result of Th. 1.4.8.)

Th. 1.4.8. For a decision observable $\Sigma$ is always $\mathcal{U}_g$-separable. $F$ is an isomorphism onto the image set $F\Sigma \subset G$; for each decision observable $\Sigma$ may be identified with a $\mathcal{U}_g$-closed Boolean sublattice of $G$. Each (lattice theoretically) complete Boolean sublattice of $G$ is also $\mathcal{U}_g$-complete, and conversely, so that decision observables may be identified with (lattice theoretically) complete Boolean sublattices of $G$. The uniform structure $\mathcal{U}_p$ defined in D 1.4.1 is identical to the uniform structure on $\Sigma$ generated by $\sigma(\mathcal{B}', \mathcal{B})$. The topology on $\Sigma$ generated by $\mathcal{U}_g$ is identical to the $\sigma(\mathcal{B}', \mathcal{B})$ topology on $\Sigma$ (therefore $\Sigma$ is $\sigma(\mathcal{B}', \mathcal{B})$-closed in $G$ since $\Sigma$ is $\mathcal{U}_g$-complete; see Th. 1.3.7).

Proof. On the basis of Th. 1.3.3 $\Sigma$ may be identified with $F\Sigma$. According to Th. 1.4.4 the $\mathcal{U}_g$-complete Boolean sublattices of $G$ and the lattice theoretically complete sublattices of $G$ are the same. Therefore it is only necessary to show that every complete Boolean sublattice of $G$ is $\mathcal{U}_g$-separable and that the uniform structure $\mathcal{U}_p$ is identical to that generated by $\sigma(\mathcal{B}', \mathcal{B})$.

Let $\Sigma$ be a complete Boolean sublattice of $G$. In order to prove that $\Sigma$ is $\mathcal{U}_g$-separable, we recall that $\mathcal{B}'$ is $\sigma(\mathcal{B}', \mathcal{B})$-separable (see AIII, §4). Therefore $\Sigma$ is $\sigma(\mathcal{B}', \mathcal{B})$-separable, and there exists a countable subset $A \subset \Sigma$ which is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $\Sigma$. The Boolean subring $\Sigma_A$ (generated by $A$ according to Th. 1.3.6(v)) is denumerable; let $\hat{\Sigma}_A$ denote the complete Boolean subring generated by $A$ (according to Th. 1.3.6(vi)). Since $\Sigma$ is complete, we obtain $\Sigma_A \subset \hat{\Sigma}_A \subset \Sigma$.
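Since $A$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $\Sigma$ and since $\hat{\Sigma}_A$ and $\Sigma$ are, according to Th. 1.3.7, $\sigma(\mathcal{B}', \mathcal{B})$-closed in $G$, we obtain $\hat{\Sigma}_A = \Sigma$. Since the lattice theoretical completion $\hat{\Sigma}_A$ of $\Sigma_A$ is equal to $\Sigma$, then, according to Th. 1.4.6, $\Sigma$ is $\mathcal{U}_g$-separable.

Since $\Sigma$ is identified with a subset of $G$, the map $F$ becomes the identity map and the structure $\mathcal{U}_p$ (defined by D 1.4.1) is the initial structure generated by the maps $F_{\bar{e}}(e) = e \wedge \bar{e}$ (for all $\bar{e} \in \Sigma$) of $\Sigma$ into $G$. For $\bar{e} = 1$ we obtain the identity map; thus $\mathcal{U}_p$ is finer than the uniform structure defined by $\sigma(\mathcal{B}', \mathcal{B})$. If, however, the maps $F_{\bar{e}}$ (for fixed $\bar{e}$) are uniformly continuous with respect to $\sigma(\mathcal{B}', \mathcal{B})$ then $\mathcal{U}_p$ is identical to the uniform structure generated by $\sigma(\mathcal{B}', \mathcal{B})$. The fact that $F_{\bar{e}}$ is uniformly continuous is a consequence of

$$\mu(w, \bar{e} \wedge e_1 - \bar{e} \wedge e_2) = \mu(w', e_1 - e_2), \qquad (1.4.1)$$

where $w'$ depends on $w$ and $\bar{e}$. It therefore remains to prove (1.4.1). Since $\bar{e}, e_1, e_2$ are commensurable, then, by Th. 1.3.4(v) and (vi) we obtain $\bar{e} \wedge e_1 = \bar{e} e_1 = e_1 \bar{e} = \bar{e} e_1 \bar{e}$ and, correspondingly, $\bar{e} \wedge e_2 = \bar{e} e_2 = e_2 \bar{e} = \bar{e} e_2 \bar{e}$. For $w' = \bar{e} w \bar{e}$, from $\mu(w, g) = \mathrm{tr}(wg)$ we obtain

$$\mu(w, \bar{e}(e_1 - e_2)\bar{e}) = \mu(\bar{e} w \bar{e}, e_1 - e_2),$$

thus proving (1.4.1).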
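The last step of the proof rests only on the cyclic invariance of the trace together with the commutativity of the decision effects involved. The following is a minimal numerical sketch of (1.4.1), assuming a three-dimensional Hilbert space, commuting (diagonal) projections for $\bar{e}, e_1, e_2$, and a hand-picked matrix `W` standing in for the ensemble $w$; all names and numbers are illustrative assumptions, not taken from the text. Note that $w' = \bar{e} w \bar{e}$ is in general not normalized.

```python
from fractions import Fraction as Fr

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matsub(a, b):
    """Entrywise difference a - b."""
    n = len(a)
    return [[a[i][j] - b[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def diag(*entries):
    """Diagonal matrix; with entries 0/1 this is a projection."""
    n = len(entries)
    return [[Fr(entries[i]) if i == j else Fr(0) for j in range(n)]
            for i in range(n)]

def mu(w, g):
    """mu(w, g) = tr(w g), as used in the proof."""
    return trace(matmul(w, g))

# Hypothetical example data: W is symmetric, positive, with trace 1.
W = [[Fr(1, 2), Fr(1, 4), Fr(0)],
     [Fr(1, 4), Fr(1, 4), Fr(0)],
     [Fr(0), Fr(0), Fr(1, 4)]]
E_BAR = diag(1, 1, 0)   # plays the role of e-bar
E1 = diag(1, 0, 1)
E2 = diag(0, 1, 1)

def both_sides(w, e_bar, e1, e2):
    """Return (left side, right side) of (1.4.1) for commuting projections."""
    # For commuting projections the meet e_bar ^ e is the product e_bar e.
    lhs = mu(w, matsub(matmul(e_bar, e1), matmul(e_bar, e2)))
    w_prime = matmul(matmul(e_bar, w), e_bar)   # w' = e_bar w e_bar
    rhs = mu(w_prime, matsub(e1, e2))
    return lhs, rhs

LHS, RHS = both_sides(W, E_BAR, E1, E2)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.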
The fact that $\mathcal{U}_p$ is identical to the uniform structure defined by $\sigma(\mathcal{B}', \mathcal{B})$ on $\Sigma \subset G$ is only an assurance that we do not provide two different uniform structures for physical imprecision, because, as a subset of $G$, $\Sigma$ already has the uniform structure of physical imprecision defined by $\sigma(\mathcal{B}', \mathcal{B})$ (see III, §3).

Th. 1.4.8 permits us to characterize the decision observables entirely by complete Boolean subrings $\Sigma$ of $G$. In this way the Boolean operations in $\Sigma$ immediately represent the switching-algebra of an idealized registration apparatus, namely the measurement apparatus for the observable. However, this “simplification”, which naively identifies the measurements of decision observables with subsets $\Sigma$ of decision effects, makes the explanation of the measurement problem more difficult.

This difficulty is increased by the fact that (as we will see in §2.5) the “usual” concept of an observable is identical with what we called a decision observable and that we are accustomed to view only these decision observables. We note, however, that our notion of an observable is more general and realistic as a deduced concept; since it is a deduced concept we do not need to make vague statements such as “an observable is something that can be measured.” Its physical meaning may be obtained from its definition as an idealization of the situation $\mathcal{R}(b_0) \to L$ to which we have already given a physical interpretation: a conceptual joining of the registration procedures $b \in \mathcal{R}(b_0)$ and the set of possible frequencies $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$ for the various preparation procedures $a \in \mathcal{Q}'$.

In this description, the problem of finding a “measurement method” $b_0 \in \mathcal{R}_0$ which permits us to at least approximately “measure” the observable $\Sigma \xrightarrow{F} L$ need no longer be assigned to a domain between theory and experiment which may only be described by “words” and not by a theory.
Instead, it becomes a question which must be treated in a theory which is perhaps more comprehensive than quantum mechanics because the physical interpretation of quantum mechanics does not depend upon the concept of an observable but depends instead on the actual physical methods of preparation and registration.

In §4 we shall begin with a step by step discussion of the problem of the measurement of an observable. This topic will be treated in more detail in XVII and XVIII. Nevertheless, the solution of the measurement problem concerning the macroscopic signals of a macroscopic apparatus will not be treated in this book. This problem has been solved in [13], X where the compatibility of an extrapolated quantum mechanics for “many particles” with the macroscopic description of the macroscopic measurement and preparation apparatuses is demonstrated. Such a macroscopic description of the apparatuses was used as a starting point for the foundation of quantum mechanics presented in II.

2 Structures in the Class of Observables

By introducing the concept of an observable we seek to eliminate a portion of the structure associated with a particular apparatus from the registration process. For registration methods only the abstract structure of a Boolean
ring together with an additive measure $F: \Sigma \to L$ remains. Without any additional analysis it is already evident that the concept of an observable still contains too much of the structure of the registration methods. We should therefore seek to eliminate “unnecessary” and “bad” registration methods. Is it possible that such an elimination procedure will lead us to the concept of a decision observable? If this is the case, then the decision observables are the “true” measurements of the microsystems and exhibit the real structure of the microsystems. These and similar questions make it necessary for us to examine the concept of an observable more closely, as we shall do in this section.

The reader who is not interested in a deeper analysis of the concept of an observable may omit this section. This is possible because, in §1 we have already introduced a number of important theorems which have a close connection with the analysis of §2.

2.1 The Spaces $\mathcal{B}(\Sigma)$ and $\mathcal{B}'(\Sigma)$

Let $\Sigma$ be a Boolean ring and let $m_0: \Sigma \to [0, 1] \subset \mathbb{R}$ be an additive real effective measure. Let $\Sigma$ be complete with respect to $\mathcal{U}_g$ (that is, $\Sigma$ together with the uniform structure defined by the metric $d(\sigma_1, \sigma_2) = m_0(\sigma_1 \dotplus \sigma_2)$). Let $w_0$ be defined as in Th. 1.4.1; then, according to §1.4 we may choose $m_0$ as follows: $m_0(\sigma) = \mu(w_0, F(\sigma))$. According to Th. 1.4.4 a (lattice theoretically) complete Boolean ring $\Sigma$ with a $\sigma$-additive effective measure is $\mathcal{U}_g$-complete.

We shall now recapitulate some of the results obtained in §1.

Th. 2.1.1. To each set $\Gamma \subset \Sigma$ there is a denumerable subset $\{\sigma_\nu\} \subset \Gamma$ for which $\bigvee_{\sigma \in \Gamma} \sigma = \bigvee_\nu \sigma_\nu$. $\bar{\sigma}_n = \bigvee_{\nu=1}^{n} \sigma_\nu$ is an increasing sequence for which $\bigvee_n \bar{\sigma}_n = \bigvee_\nu \sigma_\nu = \bigvee_{\sigma \in \Gamma} \sigma$ and for which $\bar{\sigma}_n \to \bigvee_{\sigma \in \Gamma} \sigma$ (in the topology generated by $\mathcal{U}_g$). For every increasing sequence $\sigma_\nu$ we obtain $\sigma_\nu \to \bigvee_\nu \sigma_\nu$. A similar result holds for $\bigwedge_{\sigma \in \Gamma} \sigma$.

Proof. See Th. 1.4.3.

Th. 2.1.2. Let $\sigma_\nu$ be a convergent sequence (in the topology generated by $\mathcal{U}_g$) and suppose $\sigma_\nu \to \sigma$.
Then we may choose a subsequence $\sigma_{\nu_i}$ such that, for $\bar{\sigma}_m = \bigvee_{i=m}^{\infty} \sigma_{\nu_i}$, the following relationships hold: $\bar{\sigma}_m \to \sigma = \bigwedge_m \bar{\sigma}_m$.

Proof. See Th. 1.4.4.
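For a finite Boolean ring the metric $d(\sigma_1, \sigma_2) = m_0(\sigma_1 \dotplus \sigma_2)$ used above can be checked directly. The sketch below assumes that $\Sigma$ is the ring of all subsets of a three-element set, that $\dotplus$ is the symmetric difference, and that $m_0$ is given by hypothetical positive weights on the atoms (positivity is what makes the measure effective); none of these concrete choices come from the text.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical atom weights of an effective measure m0 (all positive, sum 1).
WEIGHTS = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

def elements(atoms):
    """All elements of the finite Boolean ring: every subset of the atoms."""
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def measure(sigma):
    """m0(sigma): sum of the weights of the atoms contained in sigma."""
    return sum((WEIGHTS[a] for a in sigma), Fraction(0))

def distance(s1, s2):
    """d(s1, s2) = m0(s1 + s2), with + the symmetric difference."""
    return measure(s1 ^ s2)

def is_metric(ring):
    """Brute-force check of symmetry, definiteness and triangle inequality."""
    for s in ring:
        for t in ring:
            if distance(s, t) != distance(t, s):
                return False
            if (distance(s, t) == 0) != (s == t):
                return False
            for u in ring:
                if distance(s, u) > distance(s, t) + distance(t, u):
                    return False
    return True

RING = elements(WEIGHTS)
```

If one of the weights were zero, `is_metric` would fail on definiteness, which reflects why effectiveness of the measure is required for $d$ to separate the elements of the ring.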
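D 2.1.1. A real function $x(\sigma)$ over $\Sigma$ is said to be a signed additive (or signed $\sigma$-additive) measure over $\Sigma$ if there exists a real number $c$ such that $|x(\sigma)| < c$ for all $\sigma$ and if, for $\sigma = \bigvee_\nu \sigma_\nu$ with $\sigma_\nu \wedge \sigma_\mu = 0$ for $\nu \neq \mu$, the equation

$$x(\sigma) = \sum_\nu x(\sigma_\nu) \qquad (2.1.1)$$

holds for finitely (or countably) many $\sigma_\nu$.

All theorems about signed $\sigma$-additive measures will also hold for $\sigma$-additive measures.

Th. 2.1.3. Let $x$ be a signed, additive measure over $\Sigma$. Then the following conditions are equivalent:
(i) $x$ is $\sigma$-additive.
(ii) If $\sigma_n$ is a decreasing sequence for which $\bigwedge_n \sigma_n = 0$ then $x(\sigma_n) \to 0$.
(iii) If $\sigma_n$ is a decreasing sequence for which $\bigwedge_n \sigma_n = \sigma$ then $x(\sigma_n) \to x(\sigma)$.
(iv) If $\sigma_n$ is an increasing sequence for which $\bigvee_n \sigma_n = \sigma$ then $x(\sigma_n) \to x(\sigma)$.

Proof. (i) ⇒ (iv). Assume that $\sigma_n$ is defined according to (iv). Let $\sigma_0 = 0$; then $\sigma = \bigvee_{n=0}^{\infty} (\sigma_{n+1} \dotplus \sigma_n)$. Then by (i) it follows that

$$x(\sigma) = \lim_{m \to \infty} \sum_{n=0}^{m} [x(\sigma_{n+1}) - x(\sigma_n)] = \lim_{m \to \infty} x(\sigma_{m+1}).$$

(iv) ⇒ (iii). Let $\sigma_n$ be defined according to (iii). Then $\sigma_n^*$ is an increasing sequence for which $\bigvee_n \sigma_n^* = \sigma^*$. According to (iv) it follows that $x(\sigma_n^*) \to x(\sigma^*)$. Since $x(\sigma^*) + x(\sigma) = x(\varepsilon)$ it follows that $x(\sigma_n) \to x(\sigma)$.

(iii) ⇒ (ii) is trivial, since (ii) is a special case of (iii).

(ii) ⇒ (i). Let $\sigma = \bigvee_{\nu=1}^{\infty} \sigma_\nu$ where $\sigma_\nu \wedge \sigma_\mu = 0$ for $\nu \neq \mu$. The sequence $\bar{\sigma}_n = \sigma \dotplus \bigvee_{\nu=1}^{n} \sigma_\nu$ is a decreasing sequence with $\bigwedge_n \bar{\sigma}_n = 0$, for which, according to (ii),

$$x\Big(\sigma \dotplus \bigvee_{\nu=1}^{n} \sigma_\nu\Big) \to 0.$$

Since

$$x\Big(\sigma \dotplus \bigvee_{\nu=1}^{n} \sigma_\nu\Big) = x(\sigma) - x\Big(\bigvee_{\nu=1}^{n} \sigma_\nu\Big) = x(\sigma) - \sum_{\nu=1}^{n} x(\sigma_\nu),$$

it follows that $x(\sigma) = \sum_{\nu=1}^{\infty} x(\sigma_\nu)$.

Th. 2.1.4. Let $m$ be an additive measure over $\Sigma$. Then the following conditions are equivalent:
(i) $m$ is $\sigma$-additive,
(ii) $m$ is continuous.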
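Proof. According to Th. 2.1.3, the condition that $m$ is $\sigma$-additive can be replaced by one of the other conditions (ii)-(iv) in Th. 2.1.3.

(ii) ⇒ (i). For an increasing sequence $\sigma_n$ for which $\bigvee_n \sigma_n = \sigma$, from Th. 2.1.1 it follows that $\sigma_n \to \sigma$; therefore, from (ii) we obtain $m(\sigma_n) \to m(\sigma)$. Thus, from Th. 2.1.3(iv) we have proven (i).

(i) ⇒ (ii). Let $\sigma_\nu$ be a convergent sequence for which $\sigma_\nu \to \sigma$. Since $0 \le m(\sigma_\nu) \le 1$ the sequence $m(\sigma_\nu)$ is convergent if and only if it has only one accumulation point. If $\alpha$ is any accumulation point we may choose a subsequence $\sigma_{\nu_k}$ such that $m(\sigma_{\nu_k}) \to \alpha$. According to Th. 2.1.2 a subsequence of $\sigma_{\nu_k}$ can be chosen (for simplicity we shall also use $\sigma_{\nu_k}$ to denote the subsequence) such that, if $\bar{\sigma}_m = \bigvee_{i=m}^{\infty} \sigma_{\nu_i}$, then $\sigma = \bigwedge_m \bar{\sigma}_m$. Since $m$ is $\sigma$-additive, from Th. 2.1.3(iii) it follows that $m(\bar{\sigma}_m) \to m(\sigma)$. Since $\bar{\sigma}_m \ge \sigma_{\nu_m}$ we obtain $m(\bar{\sigma}_m) \ge m(\sigma_{\nu_m}) \to \alpha$. Therefore $m(\sigma) \ge \alpha$.

From the subsequence $\sigma_{\nu_k}$, since $\sigma_{\nu_k}^* \to \sigma^*$, according to Th. 2.1.2 a subsequence can be chosen (which we shall again denote by $\sigma_{\nu_k}$) such that

$$\sigma^* = \bigwedge_m \tilde{\sigma}_m \quad \text{with} \quad \tilde{\sigma}_m = \bigvee_{i=m}^{\infty} \sigma_{\nu_i}^*.$$

Thus, we obtain

$$\sigma = \bigvee_m \tilde{\sigma}_m^* \quad \text{where} \quad \tilde{\sigma}_m^* = \bigwedge_{i=m}^{\infty} \sigma_{\nu_i}.$$

Since $m$ is $\sigma$-additive, from Th. 2.1.3(iv) we obtain $m(\tilde{\sigma}_m^*) \to m(\sigma)$. Since $m(\tilde{\sigma}_m^*) \le m(\sigma_{\nu_m})$ and $m(\sigma_{\nu_m}) \to \alpha$ it follows that $m(\sigma) \le \alpha$. Therefore $\alpha = m(\sigma)$.

Th. 2.1.5. To each $\sigma$-additive measure $m$ there exists one and only one $\sigma_s \in \Sigma$ for which $m(\sigma_s) = 1$ and $m(\sigma) \neq 0$ for all $\sigma$ for which $0 \neq \sigma \le \sigma_s$.

Proof. Let $\Gamma = \{\sigma \mid \sigma \in \Sigma \text{ and } m(\sigma) = 1\}$. Then, from Th. 2.1.1 it follows that there is a countable subset $\{\sigma_\nu\}$ of $\Gamma$ such that

$$\sigma_s \overset{\text{def}}{=} \bigwedge_{\sigma \in \Gamma} \sigma = \bigwedge_\nu \sigma_\nu.$$

We shall now show that $\bar{\sigma}_n = \bigwedge_{\nu=1}^{n} \sigma_\nu \in \Gamma$. We need only show that if $\sigma_1, \sigma_2 \in \Gamma$ then it follows that $\sigma_1 \wedge \sigma_2 \in \Gamma$. From $m(\sigma_2) = 1$ it follows that $m(\sigma_2^*) = 0$ and therefore $m(\sigma_1 \wedge \sigma_2^*) = 0$. From

$$1 = m(\sigma_1) = m(\sigma_1 \wedge \sigma_2^*) + m(\sigma_1 \wedge \sigma_2),$$

it follows that $m(\sigma_1 \wedge \sigma_2) = 1$, that is, $\sigma_1 \wedge \sigma_2 \in \Gamma$. Since $\sigma_s = \bigwedge_n \bar{\sigma}_n$, from $m(\bar{\sigma}_n) = 1$ and Th. 2.1.3(iii) it follows that $m(\sigma_s) = 1$.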
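Thus $\Gamma$ contains a least element, namely $\sigma_s$. Suppose that $\sigma \le \sigma_s$ where $m(\sigma) = 0$; then it follows that $m(\sigma_s \dotplus \sigma) = m(\sigma_s) - m(\sigma) = 1$ and we therefore obtain $\sigma_s \dotplus \sigma \in \Gamma$. Since $\sigma_s$ is the smallest element of $\Gamma$ and since $\sigma_s \dotplus \sigma \le \sigma_s$ it follows that $\sigma_s \dotplus \sigma = \sigma_s$, that is, $\sigma = 0$.

To prove uniqueness, let $\tilde{\sigma}$ be any element with $m(\tilde{\sigma}) = 1$. Then it follows that $\tilde{\sigma} \in \Gamma$ and therefore $\tilde{\sigma} \ge \sigma_s$. Let $\sigma = \tilde{\sigma} \dotplus \sigma_s$. Then we obtain $m(\sigma) = m(\tilde{\sigma}) - m(\sigma_s) = 0$. If, in addition, $\tilde{\sigma} \neq \sigma_s$ then there exists a $\sigma$, $0 \neq \sigma \le \tilde{\sigma}$, with $m(\sigma) = 0$. Thus we have proven the uniqueness of $\sigma_s$.

D 2.1.2. We shall call $\sigma_s$ (as defined in Th. 2.1.5) the support of $m$.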
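In a finite Boolean ring the construction of Th. 2.1.5 can be carried out literally: the support is the meet of all elements of measure 1. The following sketch assumes the ring of all subsets of a four-element set and a hypothetical measure `M` given by atom weights (two of them zero); these concrete data are illustrative assumptions only.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical measure on the subsets of {a, b, c, d}; two atoms have weight 0.
M = {"a": Fraction(3, 4), "b": Fraction(0),
     "c": Fraction(1, 4), "d": Fraction(0)}

def elements(atoms):
    """All subsets of the given atoms, as frozensets."""
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def measure(sigma, weights):
    return sum((weights[a] for a in sigma), Fraction(0))

def support(weights):
    """Meet of Gamma = {sigma | m(sigma) = 1}, following the proof of Th. 2.1.5."""
    gamma = [s for s in elements(weights) if measure(s, weights) == 1]
    result = frozenset(weights)          # the unit element
    for s in gamma:
        result &= s                      # meet = intersection
    return result

SUPPORT = support(M)

def effective_on(sigma, weights):
    """True if m(tau) != 0 for every nonzero tau <= sigma."""
    return all(measure(t, weights) != 0 for t in elements(sigma) if t)
```

In this finite setting the support is simply the set of atoms with nonzero weight; the brute-force meet is used only to mirror the argument of the proof.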
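Th. 2.1.6. To each signed $\sigma$-additive measure $x$ on $\Sigma$ there is exactly one partition $\varepsilon = \sigma_+ \vee \sigma_- \vee \sigma_0$ (that is, $\sigma_+ \wedge \sigma_- = \sigma_+ \wedge \sigma_0 = \sigma_- \wedge \sigma_0 = 0$) for which $x(\sigma) > 0$ for all $\sigma$ for which $0 \neq \sigma \le \sigma_+$, $x(\sigma) < 0$ for all $\sigma$ for which $0 \neq \sigma \le \sigma_-$, and $x(\sigma) = 0$ for all $\sigma \le \sigma_0$.

Proof. Let $A = \{\sigma \mid x(\tilde{\sigma}) < 0 \text{ for all } \tilde{\sigma},\ 0 \neq \tilde{\sigma} \le \sigma\}$; let $\sigma_- = \bigvee_{\sigma \in A} \sigma$. We shall now show that $\sigma_- \in A$: Let $\tilde{\sigma}$ satisfy $0 \neq \tilde{\sigma} \le \sigma_-$. Then there must be a $\sigma_1 \in A$ for which $\sigma_1 \wedge \tilde{\sigma} \neq 0$; thus we obtain $x(\sigma_1 \wedge \tilde{\sigma}) < 0$. Since $\tilde{\sigma} \le \sigma_-$ we obtain $\tilde{\sigma} = \tilde{\sigma} \wedge \sigma_- = \bigvee_{\sigma \in A} (\sigma \wedge \tilde{\sigma})$. According to Th. 2.1.1 there exists a countable subset $\{\sigma_\nu\} \subset A$ for which $\tilde{\sigma} = \bigvee_\nu (\sigma_\nu \wedge \tilde{\sigma})$. We may choose $\sigma_1$ such that $x(\sigma_1 \wedge \tilde{\sigma}) < 0$. We may rewrite $\bar{\sigma}_n = \bigvee_{\nu=1}^{n} (\sigma_\nu \wedge \tilde{\sigma})$ recursively in the form

$$\bar{\sigma}_n = (\sigma_1 \wedge \tilde{\sigma}) \vee \tilde{\sigma}_2 \vee \cdots \vee \tilde{\sigma}_n, \quad \text{where} \quad \tilde{\sigma}_m = (\sigma_m \wedge \tilde{\sigma}) \wedge \Big[\bigvee_{\nu=1}^{m-1} (\sigma_\nu \wedge \tilde{\sigma})\Big]^*,$$

from which it follows that

$$x(\bar{\sigma}_n) = x(\sigma_1 \wedge \tilde{\sigma}) + \sum_{\nu=2}^{n} x(\tilde{\sigma}_\nu).$$

From $\tilde{\sigma}_\nu \le \sigma_\nu$ it follows that $x(\tilde{\sigma}_\nu) \le 0$. Therefore $x(\bar{\sigma}_n) \le x(\sigma_1 \wedge \tilde{\sigma})$. From Th. 2.1.3(iv) it follows that $x(\tilde{\sigma}) \le x(\sigma_1 \wedge \tilde{\sigma}) < 0$. Thus we obtain $\sigma_- \in A$. Therefore $A$ contains a greatest element $\sigma_-$ and $A = \{\sigma \mid 0 \neq \sigma \le \sigma_-\}$.

We shall now show that $x(\sigma) \ge 0$ for all $\sigma \le \sigma_-^*$. Let $\sigma' \le \sigma_-^*$ and let $x(\sigma') < 0$. Since $\sigma' \le \sigma_-^*$ we find that $\sigma' \notin A$, from which it follows that there exists a $\sigma_1$ such that $0 \neq \sigma_1 \le \sigma'$ for which $x(\sigma_1) > 0$. Let $n_1$ be the smallest positive integer for which there exists a $\sigma_1 \le \sigma'$ for which $x(\sigma_1) > 1/n_1$. Then $x(\sigma' \dotplus \sigma_1) = x(\sigma') - x(\sigma_1) < 0$ and $\sigma' \dotplus \sigma_1 \le \sigma' \le \sigma_-^*$. We find that a similar situation holds for $\sigma' \dotplus \sigma_1$: Let $n_2$ be the smallest positive integer for which there exists a $\sigma_2$ for which $\sigma_2 \le \sigma' \dotplus \sigma_1$ and $x(\sigma_2) > 1/n_2$. Continuing in a similar fashion, from $\tilde{\sigma} = \sigma' \dotplus \bigvee_\nu \sigma_\nu$ we obtain

$$x(\tilde{\sigma}) = x(\sigma') - \sum_\nu x(\sigma_\nu) < x(\sigma') - \sum_\nu \frac{1}{n_\nu} < 0.$$

Thus we obtain $1/n_\nu \to 0$. If $\sigma \le \tilde{\sigma}$, then from the construction of the $\sigma_\nu$ it follows that $x(\sigma) \le 0$. Thus $m(\sigma) = (1/x(\tilde{\sigma}))\, x(\tilde{\sigma} \wedge \sigma)$ is a positive $\sigma$-additive measure.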
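For the support $\sigma_s$ of $m$ we find that $\sigma_s \le \tilde{\sigma}$; since $m(\sigma) > 0$ for all $\sigma$ for which $0 \neq \sigma \le \sigma_s$ we obtain $\sigma_s \in A$ and therefore $\sigma_s \le \sigma_-$, in contradiction to $\sigma_s \le \tilde{\sigma} \le \sigma' \le \sigma_-^*$.

Thus $m_+(\sigma) = (1/x(\sigma_-^*))\, x(\sigma_-^* \wedge \sigma)$ is a positive $\sigma$-additive measure (in the case $x(\sigma_-^*) > 0$). Let $\sigma_+$ be the support of $m_+$. Then $\sigma_+ \le \sigma_-^*$. For all $\sigma$ for which $0 \neq \sigma \le \sigma_+$ we obtain $x(\sigma) > 0$. If $\sigma \le \sigma_0 = (\sigma_+ \vee \sigma_-)^*$ then

$$x(\sigma) = x(\sigma \wedge (\sigma_+ \vee \sigma_-)^*) = x(\sigma \wedge \sigma_+^* \wedge \sigma_-^*) = x(\sigma_-^*)\, m_+(\sigma \wedge \sigma_+^*) = 0$$

because $\sigma_+$ is the support of $m_+$.

Suppose that, in addition to $\varepsilon = \sigma_+ \vee \sigma_- \vee \sigma_0$, there exists a second partition of the same type, $\varepsilon = \sigma_+' \vee \sigma_-' \vee \sigma_0'$. Then we would obtain $\sigma_-' \in A$ and therefore $\sigma_-' \le \sigma_-$. Since $x(\sigma) \ge 0$ for all $\sigma$ for which $\sigma \le \sigma_+' \vee \sigma_0'$, and therefore for all $\sigma$ for which $\sigma \le (\sigma_+' \vee \sigma_0') \wedge \sigma_-$, we must obtain $(\sigma_+' \vee \sigma_0') \wedge \sigma_- = 0$. Thus we obtain $\sigma_-' = \sigma_-$ and therefore $\sigma_+ \vee \sigma_0 = \sigma_+' \vee \sigma_0'$; from the uniqueness of the support we therefore obtain $\sigma_+ = \sigma_+'$ and $\sigma_0 = \sigma_0'$.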
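The partition of Th. 2.1.6 can be computed by brute force in a finite Boolean ring, following the steps of the proof: first $\sigma_-$ as the join of the set $A$, then $\sigma_+$ as the support of $x$ restricted to $\sigma_-^*$. The sketch below assumes the ring of all subsets of a five-element set and a hypothetical signed measure `X` given by atom values; these data are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical signed measure, given by its values on the atoms.
X = {"a": Fraction(2), "b": Fraction(-1), "c": Fraction(0),
     "d": Fraction(-3), "e": Fraction(1, 2)}

def elements(atoms):
    """All subsets of the given atoms, as frozensets."""
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def value(sigma, x):
    return sum((x[a] for a in sigma), Fraction(0))

def hahn_partition(x):
    """Return (sigma_plus, sigma_minus, sigma_zero) as in Th. 2.1.6."""
    unit = frozenset(x)
    ring = elements(x)
    # A = {sigma | x(tau) < 0 for all nonzero tau <= sigma}; contains 0.
    a_set = [s for s in ring
             if all(value(t, x) < 0 for t in elements(s) if t)]
    sigma_minus = frozenset().union(*a_set)        # join of A
    comp = unit - sigma_minus                      # complement of sigma_minus
    total = value(comp, x)
    # Support of x restricted to comp: meet of all s <= comp with x(s) = x(comp).
    sigma_plus = comp
    for s in elements(comp):
        if value(s, x) == total:
            sigma_plus &= s
    sigma_zero = unit - sigma_plus - sigma_minus
    return sigma_plus, sigma_minus, sigma_zero

SIGMA_PLUS, SIGMA_MINUS, SIGMA_ZERO = hahn_partition(X)
```

When $x(\sigma_-^*) = 0$ the empty set already satisfies the support condition, so the same loop returns $\sigma_+ = 0$ without a special case.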
Th. 2.1.7. Every signed $\sigma$-additive measure $x$ on $\Sigma$ may be written in the form $x = \alpha m_1 - \beta m_2$ where $\alpha, \beta \ge 0$ and $m_1, m_2$ are $\sigma$-additive measures such that the supports $\sigma_{s1}, \sigma_{s2}$ of $m_1, m_2$ satisfy the relation $\sigma_{s1} \wedge \sigma_{s2} = 0$; $\alpha$, $\beta$, $m_1$, and $m_2$ are uniquely determined.

Proof. From the partition $\varepsilon = \sigma_+ \vee \sigma_- \vee \sigma_0$ of Th. 2.1.6 we obtain the following result: With

$$m_1(\sigma) = \frac{1}{x(\sigma_+)}\, x(\sigma_+ \wedge \sigma), \quad m_2(\sigma) = \frac{1}{x(\sigma_-)}\, x(\sigma_- \wedge \sigma), \quad \alpha = x(\sigma_+), \quad \beta = -x(\sigma_-),$$

we obtain the desired decomposition and $\sigma_{s1} = \sigma_+$, $\sigma_{s2} = \sigma_-$.

Conversely, suppose that we are given a decomposition which satisfies Th. 2.1.7. Since $x(\sigma) < 0$ for all $\sigma$ for which $0 \neq \sigma \le \sigma_{s2}$, we obtain $\sigma_{s2} \le \sigma_-$ from Th. 2.1.6. Since, for all $\sigma$ for which $\sigma \le \sigma_- \wedge \sigma_{s2}^*$, $x(\sigma) = \alpha m_1(\sigma) \ge 0$, we must have $\sigma_- \wedge \sigma_{s2}^* = 0$. Therefore $\sigma_{s2} = \sigma_-$. Thus we obtain $\alpha m_1(\sigma) = x(\sigma_-^* \wedge \sigma)$ and therefore obtain $\sigma_{s1} = \sigma_+$.

D 2.1.3. Let $\mathcal{B}(\Sigma)$ denote the set of all signed $\sigma$-additive measures on $\Sigma$; let $K(\Sigma)$ denote the set of all $\sigma$-additive measures on $\Sigma$.

For every finite set $\{x_\nu\} \subset \mathcal{B}(\Sigma)$ the sum $x = \sum_\nu a_\nu x_\nu$ is a signed $\sigma$-additive measure. This result, together with Th. 2.1.7, yields the following theorem:

Th. 2.1.8. $\mathcal{B}(\Sigma)$ is a linear vector space. The linear hull of $K(\Sigma)$ is $\mathcal{B}(\Sigma)$.

Th. 2.1.9. The absolute convex set generated by $K(\Sigma)$, that is, the set $V = \bigcup_{0 \le \lambda \le 1} [\lambda K(\Sigma) - (1 - \lambda) K(\Sigma)]$, defines a norm in $\mathcal{B}(\Sigma)$ by means of its Minkowski functional.¹ For this norm $\|x\| = \alpha + \beta$ where $\alpha, \beta$ are uniquely defined by Th. 2.1.7. With the positive cone $\mathcal{B}_+(\Sigma)$ given by $\mathcal{B}_+(\Sigma) = \{x \mid x(\sigma) \ge 0 \text{ for all } \sigma \in \Sigma\}$, $\mathcal{B}(\Sigma)$ becomes an ordered vector space. For $x \ge 0$ we obtain $\|x\| = x(\varepsilon)$; for $x \ge 0$, $x \neq 0$ we obtain $x = x(\varepsilon) m$ where $m = (1/x(\varepsilon))\, x \in K(\Sigma)$. Therefore $K(\Sigma)$ is the base for the cone $\mathcal{B}_+(\Sigma)$ and $\mathcal{B}(\Sigma)$ is a base-norm space (see AIII, §6).

Proof. It suffices to show that the Minkowski functional for $V$ is equal to $\alpha + \beta$ where $\alpha, \beta$ are obtained from Th. 2.1.7, because $V$ is then absorbing and the Minkowski functional is equal to zero only if $\alpha = \beta = 0$, that is, $x = 0$.
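The Minkowski functional for $V$ is equal to $\alpha + \beta$ since, if $x = \alpha m_1 - \beta m_2 = \lambda[\mu \tilde{m}_1 - (1 - \mu) \tilde{m}_2]$ where $\alpha, \beta, m_1, m_2$ are defined in Th. 2.1.7 and $0 \le \mu \le 1$, $\tilde{m}_1, \tilde{m}_2 \in K(\Sigma)$, it follows that $\lambda \ge \alpha + \beta$. Let $\sigma_{s1}, \sigma_{s2}$ be the supports of $m_1$ and $m_2$, respectively. Then it follows that

$$\alpha + \beta = x(\sigma_{s1}) - x(\sigma_{s2}) = \lambda[\mu(\tilde{m}_1(\sigma_{s1}) - \tilde{m}_1(\sigma_{s2})) - (1 - \mu)(\tilde{m}_2(\sigma_{s1}) - \tilde{m}_2(\sigma_{s2}))].$$

1. If $V$ is convex and if $x \in V$ implies also $-x \in V$, the Minkowski functional is defined by $p(x) = \inf\{\lambda \mid \lambda^{-1} x \in V\}$.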
Since $\tilde{m}_1(\sigma_{s1}) + \tilde{m}_1(\sigma_{s2}) = \tilde{m}_1(\sigma_{s1} \vee \sigma_{s2}) \le \tilde{m}_1(\varepsilon) = 1$ and since $\tilde{m}_2(\sigma_{s1}) + \tilde{m}_2(\sigma_{s2}) \le 1$ it follows that

$$|\mu(\tilde{m}_1(\sigma_{s1}) - \tilde{m}_1(\sigma_{s2})) - (1 - \mu)(\tilde{m}_2(\sigma_{s1}) - \tilde{m}_2(\sigma_{s2}))| \le 1.$$

Thus $\lambda \ge \alpha + \beta$. The remainder of the theorem is easy to prove.

Th. 2.1.10. $\mathcal{B}(\Sigma)$ is a Banach space.

Proof. Let $x_n$ be a Cauchy sequence in $\mathcal{B}(\Sigma)$. For any $x = \alpha m_1 - \beta m_2$ (where $\alpha, \beta, m_1, m_2$ are given by Th. 2.1.7) it follows that

$$|x(\sigma)| \le \alpha m_1(\sigma) + \beta m_2(\sigma) \le \alpha + \beta = \|x\|.$$

Therefore the real numbers $x_n(\sigma)$, for fixed $\sigma$, form a Cauchy sequence. By $x_n(\sigma) \to x(\sigma)$ a real function $x$ is defined on $\Sigma$ which (as may easily be shown) is additive. From

$$|x_n(\sigma)| \le |x_n(\sigma) - x_m(\sigma)| + |x_m(\sigma)| \le \|x_n - x_m\| + \|x_m\|$$

it follows that there exists a real number $c$ for which $|x_n(\sigma)| < c$ for all $n$ and $\sigma$. Therefore $|x(\sigma)| \le c$, that is, $x$ is a signed additive measure on $\Sigma$ for which $|x(\sigma)| \le c$ for all $\sigma \in \Sigma$.

We shall now show that $x$ is also $\sigma$-additive, and is therefore an element of $\mathcal{B}(\Sigma)$. According to Th. 2.1.3 $x$ is $\sigma$-additive if, for any decreasing sequence $\sigma_\nu$ satisfying $\bigwedge_\nu \sigma_\nu = 0$, it follows that $x(\sigma_\nu) \to 0$. In the inequality

$$|x(\sigma_\nu)| \le |x(\sigma_\nu) - x_n(\sigma_\nu)| + |x_n(\sigma_\nu) - x_m(\sigma_\nu)| + |x_m(\sigma_\nu)| \le |x(\sigma_\nu) - x_n(\sigma_\nu)| + \|x_n - x_m\| + |x_m(\sigma_\nu)|$$

we may choose $N$ such that, for $n, m > N$, $\|x_n - x_m\| < \epsilon$. Next, we hold $m > N$ fixed, and choose $\nu$ so large that $|x_m(\sigma_\nu)| < \epsilon$ ($x_m$ is $\sigma$-additive). Then, for fixed $\nu$ we choose $n$ so large that $|x(\sigma_\nu) - x_n(\sigma_\nu)| < \epsilon$. Therefore, we obtain $|x(\sigma_\nu)| \to 0$.

Let $\sigma_+^n, \sigma_-^n$ be the elements which, according to Th. 2.1.6, correspond to the signed measures $(x_n - x)$. Then by Th. 2.1.9 and Th. 2.1.7 we obtain

$$\|x_n - x\| = [x_n(\sigma_+^n) - x(\sigma_+^n)] - [x_n(\sigma_-^n) - x(\sigma_-^n)]$$
$$\le |x_n(\sigma_+^n) - x_m(\sigma_+^n)| + |x_m(\sigma_+^n) - x(\sigma_+^n)| + |x_n(\sigma_-^n) - x_m(\sigma_-^n)| + |x_m(\sigma_-^n) - x(\sigma_-^n)|$$
$$\le 2\|x_n - x_m\| + |x_m(\sigma_+^n) - x(\sigma_+^n)| + |x_m(\sigma_-^n) - x(\sigma_-^n)|.$$

We now choose $N$ such that $\|x_n - x_m\| < \epsilon/2$ for all $n, m > N$. Then, holding $n$ fixed, we choose $m$ so large that

$$|x_m(\sigma_+^n) - x(\sigma_+^n)| < \epsilon, \qquad |x_m(\sigma_-^n) - x(\sigma_-^n)| < \epsilon.$$

From this, it follows that $\|x_n - x\| \to 0$.
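The decomposition $x = \alpha m_1 - \beta m_2$ and the norm $\|x\| = \alpha + \beta$ become elementary in a finite Boolean ring. The sketch below assumes the ring of all subsets of a five-element set and a hypothetical signed measure `X` given by atom values; it reads off $\sigma_+$ and $\sigma_-$ from the signs of the atoms (a shortcut that is only valid in this finite setting) and checks the bound $|x(\sigma)| \le \|x\|$ used in the proof of Th. 2.1.10.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical signed measure, given by its values on the atoms.
X = {"a": Fraction(2), "b": Fraction(-1), "c": Fraction(0),
     "d": Fraction(-3), "e": Fraction(1, 2)}

def elements(atoms):
    """All subsets of the given atoms, as frozensets."""
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def value(sigma, weights):
    return sum((weights[a] for a in sigma), Fraction(0))

def jordan_decomposition(x):
    """Return (alpha, m1, beta, m2) with x = alpha*m1 - beta*m2.

    Assumes x has both a positive and a negative part, so that the
    normalized measures m1 and m2 are well defined.
    """
    alpha = sum((v for v in x.values() if v > 0), Fraction(0))   # x(sigma_plus)
    beta = -sum((v for v in x.values() if v < 0), Fraction(0))   # -x(sigma_minus)
    m1 = {a: (v / alpha if v > 0 else Fraction(0)) for a, v in x.items()}
    m2 = {a: (-v / beta if v < 0 else Fraction(0)) for a, v in x.items()}
    return alpha, m1, beta, m2

ALPHA, M1, BETA, M2 = jordan_decomposition(X)
NORM = ALPHA + BETA                     # base norm of x

def bounded_by_norm(x, norm):
    """Check |x(sigma)| <= ||x|| for every element of the ring."""
    return all(abs(value(s, x)) <= norm for s in elements(x))
```

Here `M1` and `M2` are probability weights with disjoint supports, so `NORM` coincides with the sum of the absolute values of the atom values of `X`.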
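Th. 2.1.11. Let $\tilde{m}$ be an effective $\sigma$-additive measure. Then

$$K(\Sigma) = \overline{\mathrm{co}}\{\tilde{m}_{\tilde{\sigma}} \mid \tilde{m}_{\tilde{\sigma}}(\sigma) = \tilde{m}(\tilde{\sigma})^{-1} \tilde{m}(\sigma \wedge \tilde{\sigma}),\ \tilde{\sigma} \in \Sigma,\ \tilde{\sigma} \neq 0\},$$

where $\overline{\mathrm{co}}\{\ldots\}$ is the norm-closure of $\mathrm{co}\{\ldots\}$, that is, the convex subset of $K(\Sigma)$ generated by all the measures $\tilde{m}_{\tilde{\sigma}}$ is dense (in norm) in $K(\Sigma)$.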
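Let $m_1, m_2$ be two effective $\sigma$-additive measures. Then the corresponding metrics $d_1(\sigma_1, \sigma_2) = m_1(\sigma_1 \dotplus \sigma_2)$ and $d_2(\sigma_1, \sigma_2) = m_2(\sigma_1 \dotplus \sigma_2)$ are equivalent, that is, the uniform structure $\mathcal{U}_g$ of $\Sigma$ will be generated by $d(\sigma_1, \sigma_2) = m(\sigma_1 \dotplus \sigma_2)$ where $m$ is any effective $\sigma$-additive measure.

Proof. For $m \in K(\Sigma)$ and for $\tilde{m}$ as defined in Th. 2.1.11, the $x_\lambda = m - \lambda \tilde{m}$, for real numbers $\lambda > 0$, are elements of $\mathcal{B}(\Sigma)$. For each $x_\lambda$ we may define $\sigma_-(\lambda)$ according to Th. 2.1.6; clearly $\lambda_1 > \lambda_2$ implies $\sigma_-(\lambda_1) \ge \sigma_-(\lambda_2)$. We shall now show that $\bigvee_{\lambda > 0} \sigma_-(\lambda) = \varepsilon$. Suppose not; then, for $\sigma = [\bigvee_{\lambda > 0} \sigma_-(\lambda)]^* \neq 0$ we would obtain $\sigma \le \sigma_-(\lambda)^*$ for all $\lambda > 0$, that is, $x_\lambda(\sigma) \ge 0$ for all $\lambda > 0$. Therefore we would obtain $m(\sigma) \ge \lambda \tilde{m}(\sigma)$ for all $\lambda > 0$. Since $\tilde{m}(\sigma) \neq 0$, this contradicts the fact that $m(\sigma) \le 1$.

Let $\delta > 0$; then define $\lambda_n = n\delta$ for $n = 0, 1, 2, \ldots$ and $\sigma_n = \sigma_-(n\delta) \dotplus \sigma_-((n-1)\delta)$. Thus we obtain $\sigma_n \wedge \sigma_m = 0$ for $n \neq m$ and $\bigvee_{n=1}^{\infty} \sigma_n = \varepsilon$. Let $\tilde{m}_n(\sigma) = \tilde{m}(\sigma_n)^{-1} \tilde{m}(\sigma \wedge \sigma_n)$; we define

$$x_{\delta,N} = \sum_{n=1}^{N} (n-1)\delta\, \tilde{m}(\sigma_n)\, \tilde{m}_n.$$

Clearly $x_{\delta,N} \in \mathcal{B}_+(\Sigma)$ and $\tilde{m}_n \in K(\Sigma)$. Since

$$m(\sigma) = \sum_{n=1}^{\infty} m(\sigma \wedge \sigma_n)$$

it follows that

$$m(\sigma) - x_{\delta,N}(\sigma) = \sum_{n=1}^{N} [m(\sigma \wedge \sigma_n) - (n-1)\delta\, \tilde{m}(\sigma \wedge \sigma_n)] + \sum_{n=N+1}^{\infty} m(\sigma \wedge \sigma_n).$$

We obtain $\sigma \wedge \sigma_n \le \sigma_-(n\delta)$ and $\sigma \wedge \sigma_n \le \sigma_-((n-1)\delta)^*$, from which it follows that

$$m(\sigma \wedge \sigma_n) - n\delta\, \tilde{m}(\sigma \wedge \sigma_n) \le 0, \qquad m(\sigma \wedge \sigma_n) - (n-1)\delta\, \tilde{m}(\sigma \wedge \sigma_n) \ge 0.$$

From this we obtain

$$0 \le m(\sigma \wedge \sigma_n) - (n-1)\delta\, \tilde{m}(\sigma \wedge \sigma_n) \le \delta\, \tilde{m}(\sigma \wedge \sigma_n)$$

and

$$0 \le m(\sigma) - x_{\delta,N}(\sigma) \le \delta \sum_{n=1}^{N} \tilde{m}(\sigma \wedge \sigma_n) + \sum_{n=N+1}^{\infty} m(\sigma \wedge \sigma_n) \le \delta\, \tilde{m}(\sigma) + \sum_{n=N+1}^{\infty} m(\sigma \wedge \sigma_n).$$

Since $0 \le m(\sigma) - x_{\delta,N}(\sigma)$ we obtain $\|m - x_{\delta,N}\| = m(\varepsilon) - x_{\delta,N}(\varepsilon)$ (see Th. 2.1.9). It follows that

$$\|m - x_{\delta,N}\| \le \delta + \sum_{n=N+1}^{\infty} m(\sigma_n).$$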
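We may now choose $\delta < \epsilon/4$, and $N$ so large that $\sum_{n=N+1}^{\infty} m(\sigma_n) < \epsilon/4$, where the latter is possible because of the convergence of the series

$$\sum_{n=1}^{\infty} m(\sigma_n) = m(\varepsilon) = 1.$$

Therefore we obtain $\|m - x_{\delta,N}\| < \epsilon/2$. Since $\|m\| = 1$ it follows that

$$1 - \epsilon/2 \le \|x_{\delta,N}\| \le 1 + \epsilon/2 \quad \text{and} \quad \Big\| m - \frac{x_{\delta,N}}{\|x_{\delta,N}\|} \Big\| < \epsilon.$$

Since $x_{\delta,N} \ge 0$, $m_{\delta,N} = x_{\delta,N}/\|x_{\delta,N}\|$ is an element of $K(\Sigma)$ and $m_{\delta,N}$ is a convex linear combination of the $\tilde{m}_n$.

Suppose that $m$ is effective. We shall now show that the metrics $d(\sigma_1, \sigma_2) = m(\sigma_1 \dotplus \sigma_2)$ and $\tilde{d}(\sigma_1, \sigma_2) = \tilde{m}(\sigma_1 \dotplus \sigma_2)$ are equivalent. By symmetry between $m$ and $\tilde{m}$ we need only show that, for each $\epsilon > 0$ there exists an $\tilde{\epsilon} > 0$ such that $\tilde{d}(\sigma_1, \sigma_2) < \tilde{\epsilon}$ implies that $d(\sigma_1, \sigma_2) < \epsilon$. For $x_{\delta,N}$, as defined above, we obtain

$$m(\sigma_1 \dotplus \sigma_2) = m(\sigma_1 \dotplus \sigma_2) - x_{\delta,N}(\sigma_1 \dotplus \sigma_2) + x_{\delta,N}(\sigma_1 \dotplus \sigma_2)$$
$$\le \|m - x_{\delta,N}\| + \sum_{n=1}^{N} (n-1)\delta\, \tilde{m}(\sigma_n \wedge \sigma_1 \dotplus \sigma_n \wedge \sigma_2) \le \|m - x_{\delta,N}\| + \Big[\sum_{n=1}^{N} (n-1)\delta\Big] \tilde{m}(\sigma_1 \dotplus \sigma_2).$$

We now choose $x_{\delta,N}$ (and therefore choose $\delta, N$) such that $\|m - x_{\delta,N}\| < \epsilon/2$. Then we choose $\tilde{\epsilon} < \epsilon / \big(2 \sum_{n=1}^{N} (n-1)\delta\big)$. Then, for $\tilde{d}(\sigma_1, \sigma_2) < \tilde{\epsilon}$ it follows that $d(\sigma_1, \sigma_2) < \epsilon$.

Th. 2.1.12. $\mathcal{B}(\Sigma)$ is separable if $\Sigma$ is separable.

Proof. Suppose $\{\sigma_\nu\}$ is a countable set which is dense in $\Sigma$, that is, to each $\tilde{\sigma} \in \Sigma$, for arbitrary $\epsilon > 0$ there exists a $\sigma_\nu$ such that $d(\tilde{\sigma}, \sigma_\nu) = m_0(\tilde{\sigma} \dotplus \sigma_\nu) < \epsilon$. Let $m_{\tilde{\sigma}}(\sigma) = m_0(\tilde{\sigma})^{-1} m_0(\tilde{\sigma} \wedge \sigma)$ where $m_0$ is the effective measure (see the beginning of §2.1). According to Th. 2.1.8 and Th. 2.1.11 it suffices to show that, to each $m_{\tilde{\sigma}}$ there exists an $m_{\sigma_\nu}$ for which $\|m_{\tilde{\sigma}} - m_{\sigma_\nu}\|$ is arbitrarily small. This fact is a consequence of the following estimates:

$$\|m_{\tilde{\sigma}} - m_{\sigma_\nu}\| \le \frac{1}{m_0(\tilde{\sigma})} \|m_0(\tilde{\sigma} \wedge \sigma) - m_0(\sigma_\nu \wedge \sigma)\| + \Big|\frac{1}{m_0(\tilde{\sigma})} - \frac{1}{m_0(\sigma_\nu)}\Big|\, \|m_0(\sigma_\nu \wedge \sigma)\|$$
$$\le \frac{1}{m_0(\tilde{\sigma})} \|m_0(\tilde{\sigma} \wedge \sigma) - m_0(\sigma_\nu \wedge \sigma)\| + \Big|\frac{1}{m_0(\tilde{\sigma})} - \frac{1}{m_0(\sigma_\nu)}\Big|.$$

From

$$m_0(\tilde{\sigma} \wedge \sigma) - m_0(\sigma_\nu \wedge \sigma) = m_0(\tilde{\sigma} \wedge \sigma_\nu \wedge \sigma) + m_0((\tilde{\sigma} \dotplus \tilde{\sigma} \wedge \sigma_\nu) \wedge \sigma) - m_0(\tilde{\sigma} \wedge \sigma_\nu \wedge \sigma) - m_0((\sigma_\nu \dotplus \tilde{\sigma} \wedge \sigma_\nu) \wedge \sigma)$$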
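it follows that

$$\|m_0(\tilde{\sigma} \wedge \sigma) - m_0(\sigma_\nu \wedge \sigma)\| \le \|m_0((\tilde{\sigma} \dotplus \tilde{\sigma} \wedge \sigma_\nu) \wedge \sigma)\| + \|m_0((\sigma_\nu \dotplus \tilde{\sigma} \wedge \sigma_\nu) \wedge \sigma)\|$$
$$= m_0(\tilde{\sigma} \dotplus \tilde{\sigma} \wedge \sigma_\nu) + m_0(\sigma_\nu \dotplus \tilde{\sigma} \wedge \sigma_\nu) = d(\tilde{\sigma}, \sigma_\nu).$$

Therefore we obtain

$$\|m_{\tilde{\sigma}} - m_{\sigma_\nu}\| \le \frac{1}{m_0(\tilde{\sigma})}\, d(\tilde{\sigma}, \sigma_\nu) + \Big|\frac{1}{m_0(\tilde{\sigma})} - \frac{1}{m_0(\sigma_\nu)}\Big|.$$

We shall now examine the properties of the dual Banach space $\mathcal{B}'(\Sigma)$ corresponding to $\mathcal{B}(\Sigma)$. Let $l \in \mathcal{B}'(\Sigma)$, that is, $l$ is a bounded linear functional over $\mathcal{B}(\Sigma)$. The norm of $l$ is given by

$$\|l\| = \sup_{\|x\| \le 1} |l(x)|.$$

It is easy to show that $\|l\| = \sup\{|l(m)| \mid m \in K(\Sigma)\}$. The positive cone $\mathcal{B}'_+(\Sigma)$ in $\mathcal{B}'(\Sigma)$ is defined as the set of all $l$ for which $l(m) \ge 0$ for $m \in K(\Sigma)$. The unit sphere in $\mathcal{B}'(\Sigma)$ may then be described by the order interval $[-\mathbf{1}, \mathbf{1}]$ where $\mathbf{1}$ is the functional for which $\mathbf{1}(m) = 1$ for all $m \in K(\Sigma)$. Therefore $\mathbf{1}(x) = \alpha - \beta$ (where $\alpha, \beta$ are defined in Th. 2.1.7).

The unit sphere $[-\mathbf{1}, \mathbf{1}]$ is $\sigma(\mathcal{B}'(\Sigma), \mathcal{B}(\Sigma))$-compact. Thus it satisfies the Krein-Milman theorem (see AIII, §4) which says that a convex set is spanned by its set of extreme points $\partial_e[-\mathbf{1}, \mathbf{1}]$. The unit sphere is isomorphic to the set $L(\Sigma) = [0, \mathbf{1}]$. What are the elements of $\partial_e L(\Sigma)$? For fixed $\sigma \in \Sigma$, $l_\sigma(m) = m(\sigma)$ is a bounded linear functional which satisfies $l_\sigma \in L(\Sigma)$. We shall now show that $\partial_e L(\Sigma)$ is precisely the set of all $l_\sigma$ where $\sigma \in \Sigma$.

Th. 2.1.13. $l_\sigma \in \partial_e L(\Sigma)$; $\sigma_1 \neq \sigma_2 \Rightarrow l_{\sigma_1} \neq l_{\sigma_2}$.

Proof. Let $l_{\sigma_1} = \lambda l + (1 - \lambda) l'$ where $0 < \lambda < 1$ and $l, l' \in L(\Sigma)$. Then, for all $m$ for which $m(\sigma_1) = 1$ it follows that $l(m) = 1$; for all $m$ satisfying $m(\sigma_1) = 0$ it follows that $l(m) = 0$. Each $m$ can be written as follows:

$$m = m(\sigma_1)\, m_{\sigma_1} + m(\sigma_1^*)\, m_{\sigma_1^*}, \quad \text{where} \quad m_{\tilde{\sigma}}(\sigma) = \frac{m(\tilde{\sigma} \wedge \sigma)}{m(\tilde{\sigma})}.$$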
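Therefore, since $l(m_{\sigma_1}) = 1$ and $l(m_{\sigma_1^*}) = 0$ it follows that

$$l(m) = m(\sigma_1)\, l(m_{\sigma_1}) + m(\sigma_1^*)\, l(m_{\sigma_1^*}) = m(\sigma_1) = l_{\sigma_1}(m),$$

that is, $l = l_{\sigma_1}$. Furthermore,

$$l_{\sigma_1} = l_{\sigma_2} \Rightarrow m_0(\sigma_1)^{-1} m_0(\sigma_1 \wedge \sigma_1) = m_0(\sigma_1)^{-1} m_0(\sigma_1 \wedge \sigma_2) \Rightarrow m_0(\sigma_1 \dotplus \sigma_1 \wedge \sigma_2) = 0 \Rightarrow \sigma_1 = \sigma_1 \wedge \sigma_2$$

and similarly $\sigma_2 = \sigma_1 \wedge \sigma_2$, that is, $\sigma_1 = \sigma_2$.

The fact that each element of $\partial_e L(\Sigma)$ is of the form $l_\sigma$ is obtained as a corollary of the important “spectral theorem” for $l \in \mathcal{B}'(\Sigma)$. In order to prove this fact we introduce the following definition and notation:

Let the canonical dual form for $\mathcal{B}(\Sigma), \mathcal{B}'(\Sigma)$ be denoted by $(x, y)$. Therefore every linear functional $y$ may be expressed as a function over $\mathcal{B}(\Sigma)$ in terms of $(x, y)$ for fixed $y$. Let

$$s(y \mid \sigma) = \sup\{(m, y) \mid m \in K(\Sigma) \text{ and } m(\sigma) = 1\},$$
$$\Gamma_\alpha(y) = \{\sigma \mid s(y \mid \sigma) \le \alpha\}, \qquad \sigma(y \le \alpha) = \bigvee_{\sigma \in \Gamma_\alpha(y)} \sigma. \qquad (2.1.2)$$

We claim that $\sigma(y \le \alpha) \in \Gamma_\alpha(y)$, that is, $\sigma(y \le \alpha)$ is the largest element of $\Gamma_\alpha(y)$. In order to show this result, we shall first show that, if $\sigma_1, \sigma_2 \in \Gamma_\alpha$ then $\sigma_1 \vee \sigma_2 \in \Gamma_\alpha$. The following is obvious:

$$s(y \mid \sigma_1 \vee \sigma_2) \ge s(y \mid \sigma_1), \qquad s(y \mid \sigma_1 \vee \sigma_2) \ge s(y \mid \sigma_2). \qquad (2.1.3)$$

Let $\tilde{\sigma}_1 = \sigma_1 \dotplus \sigma_1 \wedge \sigma_2 \le \sigma_1$; we then obtain $\sigma_1 \vee \sigma_2 = \tilde{\sigma}_1 \vee \sigma_2$ and $\tilde{\sigma}_1 \wedge \sigma_2 = 0$, from which we conclude that

$$m(\sigma_1 \vee \sigma_2) = m(\tilde{\sigma}_1) + m(\sigma_2).$$

Suppose that $m(\sigma_1 \vee \sigma_2) = 1$. If $m(\tilde{\sigma}_1) = 1$ (or $m(\tilde{\sigma}_1) = 0$ and therefore $m(\sigma_2) = 1$) it follows that $(m, y) \le s(y \mid \sigma_1)$ (or $\le s(y \mid \sigma_2)$). If $m(\tilde{\sigma}_1) \neq 1, 0$, then, setting $\lambda = m(\tilde{\sigma}_1)$ we obtain:

$$m_1(\sigma) = \frac{1}{\lambda}\, m(\tilde{\sigma}_1 \wedge \sigma), \qquad m_2(\sigma) = \frac{1}{1 - \lambda}\, m(\sigma_2 \wedge \sigma);$$

$m_1, m_2$ are elements of $K(\Sigma)$ and $m_1(\sigma_1) = 1$, $m_2(\sigma_2) = 1$. Since $m(\sigma_1 \vee \sigma_2) = 1$ we obtain

$$m(\sigma) = m((\tilde{\sigma}_1 \vee \sigma_2) \wedge \sigma) = \lambda m_1(\sigma) + (1 - \lambda) m_2(\sigma).$$

From this result it follows that

$$(m, y) = \lambda(m_1, y) + (1 - \lambda)(m_2, y) \le \lambda s(y \mid \sigma_1) + (1 - \lambda) s(y \mid \sigma_2).$$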
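This result also holds (as we have seen above) for $\lambda = 0$ and $\lambda = 1$. Therefore we obtain:

$$s(y \mid \sigma_1 \vee \sigma_2) \le \sup_{0 \le \lambda \le 1} [\lambda s(y \mid \sigma_1) + (1 - \lambda) s(y \mid \sigma_2)] = \max[s(y \mid \sigma_1), s(y \mid \sigma_2)].$$

Combining this result with (2.1.3) we obtain:

$$s(y \mid \sigma_1 \vee \sigma_2) = \max[s(y \mid \sigma_1), s(y \mid \sigma_2)],$$

thus proving that, if $\sigma_1, \sigma_2 \in \Gamma_\alpha$ then $\sigma_1 \vee \sigma_2 \in \Gamma_\alpha$. Therefore we have shown that, for $\sigma_\nu \in \Gamma_\alpha$ we obtain

$$\bigvee_{\nu=1}^{n} \sigma_\nu \in \Gamma_\alpha.$$

This result, together with Th. 2.1.1, shows that there is an increasing sequence $\sigma_n \in \Gamma_\alpha$ such that $\sigma(y \le \alpha) = \bigvee_n \sigma_n$. Let $m \in K(\Sigma)$ satisfy $m(\sigma(y \le \alpha)) = 1$. Then, from Th. 2.1.3(iv) it follows that

$$m(\sigma_n) \to 1. \qquad (2.1.4)$$

Clearly, we may have only finitely many $m(\sigma_n) = 0$. These can be omitted, and we may require that $m(\sigma_n) \neq 0$. Then

$$m_n(\sigma) = \frac{m(\sigma_n \wedge \sigma)}{m(\sigma_n)}$$

is an element of $K(\Sigma)$ which satisfies $m_n(\sigma_n) = 1$. For the complement $\sigma_n^*$ of $\sigma_n$ we find that

$$m_n^*(\sigma) = \frac{m(\sigma_n^* \wedge \sigma)}{m(\sigma_n^*)}$$

is an element of $K(\Sigma)$ if $m(\sigma_n) \neq 1$. If $m(\sigma_n) = 1$ then $(m, y) \le s(y \mid \sigma_n) \le \alpha$. Let $m(\sigma_n) \neq 1$ for all $n$. Then from

$$m = m(\sigma_n)\, m_n + (1 - m(\sigma_n))\, m_n^*$$

it follows that

$$(m, y) = m(\sigma_n)(m_n, y) + (1 - m(\sigma_n))(m_n^*, y) \le m(\sigma_n)\, s(y \mid \sigma_n) + (1 - m(\sigma_n)) \|y\| \le m(\sigma_n)\, \alpha + (1 - m(\sigma_n)) \|y\|.$$

Therefore, together with (2.1.4) it follows that (again, as in the case in which $m(\sigma_n) = 1$) $(m, y) \le \alpha$. Thus for every $m \in K(\Sigma)$ for which $m(\sigma(y \le \alpha)) = 1$ we obtain $(m, y) \le \alpha$, that is, $\sigma(y \le \alpha) \in \Gamma_\alpha$.

Thus, we obtain

$$\alpha_1 > \alpha_2 \Rightarrow \sigma(y \le \alpha_1) \ge \sigma(y \le \alpha_2). \qquad (2.1.5)$$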
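Let $\sigma^*(y \le \alpha)$ denote the complement of $\sigma(y \le \alpha)$. We will now show that the relation $(m, y) > \alpha$ is satisfied for all $m \in K(\Sigma)$ which satisfy $m(\sigma^*(y \le \alpha)) = 1$. Suppose that there exists an $m \in K(\Sigma)$ which satisfies $m(\sigma^*(y \le \alpha)) = 1$ and $(m, y) \le \alpha$. Let $\tilde{y} = y - \alpha \mathbf{1}$; then $(m, y) \le \alpha$ is equivalent to $(m, \tilde{y}) \le 0$. Let $\tilde{\sigma}$ be the support of $m$ (see D 2.1.2). Then $\tilde{\sigma} \le \sigma^*(y \le \alpha)$, $m(\tilde{\sigma}) = 1$ and $m(\sigma) \neq 0$ for all $\sigma$ satisfying $0 \neq \sigma \le \tilde{\sigma}$.

Let $n_1$ be the smallest positive integer for which there exists a $\sigma_1$ satisfying $\sigma_1 \le \tilde{\sigma}$ and $(x_{\sigma_1}, \tilde{y}) > 1/n_1$, where $x_{\bar{\sigma}}(\sigma) = m(\bar{\sigma} \wedge \sigma)$. For $\tilde{\sigma} \dotplus \sigma_1$ we may make the same assumption as we made for $\tilde{\sigma}$: Let $n_2$ be the smallest positive integer for which there exists a $\sigma_2 \le \tilde{\sigma} \dotplus \sigma_1$ such that $(x_{\sigma_2}, \tilde{y}) > 1/n_2$. Proceeding in this fashion we obtain a sequence $\sigma_\nu$ for which $\sigma_\nu \wedge \sigma_\mu = 0$ for $\nu \neq \mu$ such that $\bigvee_\nu \sigma_\nu \le \tilde{\sigma}$.

Let $\hat{\sigma} = \tilde{\sigma} \dotplus \bigvee_\nu \sigma_\nu$; we will now show that $\hat{\sigma} \neq 0$. From $\tilde{\sigma} = \hat{\sigma} \vee \bigvee_\nu \sigma_\nu$ it follows that

$$m(\sigma) = m(\tilde{\sigma} \wedge \sigma) = m(\hat{\sigma} \wedge \sigma) + \sum_\nu m(\sigma_\nu \wedge \sigma). \qquad (2.1.6)$$

Since

$$\Big\| m - x_{\hat{\sigma}} - \sum_{\nu=1}^{N} x_{\sigma_\nu} \Big\| = m(\varepsilon) - m(\hat{\sigma}) - \sum_{\nu=1}^{N} m(\sigma_\nu) \to 0,$$

in the norm-topology of $\mathcal{B}(\Sigma)$ we obtain

$$m = x_{\hat{\sigma}} + \sum_{\nu=1}^{\infty} x_{\sigma_\nu} \qquad (2.1.7)$$

and therefore obtain

$$(m, \tilde{y}) = (x_{\hat{\sigma}}, \tilde{y}) + \sum_\nu (x_{\sigma_\nu}, \tilde{y}) \ge (x_{\hat{\sigma}}, \tilde{y}) + \sum_\nu \frac{1}{n_\nu}. \qquad (2.1.8)$$

If $\hat{\sigma} = 0$ (and therefore $x_{\hat{\sigma}} = 0$) then we would obtain $(m, \tilde{y}) > 0$ in contradiction to $(m, \tilde{y}) \le 0$. Therefore $\hat{\sigma} \neq 0$ and $m(\hat{\sigma}) \neq 0$ (since $\hat{\sigma} \le \tilde{\sigma}$ and $\tilde{\sigma}$ is the support of $m$). From (2.1.8) it follows that $\sum_\nu 1/n_\nu$ is convergent and therefore $1/n_\nu \to 0$. We must have $(x_\sigma, \tilde{y}) \le 0$ for all $\sigma \le \hat{\sigma}$, otherwise there would be a $\sigma \le \hat{\sigma}$ for which $(x_\sigma, \tilde{y}) = \epsilon > 0$, contradicting the definition of the $n_\nu$ and $1/n_\nu \to 0$. For all $\sigma$ satisfying $0 \neq \sigma \le \hat{\sigma}$ we find that $x_{\hat{\sigma}}(\sigma) = m(\hat{\sigma} \wedge \sigma) = m(\sigma) \neq 0$. The measure $m_{\hat{\sigma}} \in K(\Sigma)$ given by

$$m_{\hat{\sigma}}(\sigma) = \frac{m(\hat{\sigma} \wedge \sigma)}{m(\hat{\sigma})}$$

is therefore “effective” in $\hat{\sigma}$, that is, $m_{\hat{\sigma}}(\sigma) \neq 0$ for all $\sigma$ for which $0 \neq \sigma \le \hat{\sigma}$. Thus, in the same manner as in Th. 2.1.11 it follows that $\mathrm{co}\{m_\sigma \mid \sigma \le \hat{\sigma}\}$ is norm-dense in the set of all $m$ for which $m(\hat{\sigma}) = 1$. Therefore, for all $m$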
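satisfying $m(\hat{\sigma}) = 1$ we obtain $(m, \tilde{y}) \le 0$, that is,

$$(m, y) = (m, \tilde{y} + \alpha \mathbf{1}) = (m, \tilde{y}) + \alpha \le \alpha;$$

or, in other words, $\hat{\sigma} \in \Gamma_\alpha$ and therefore $\hat{\sigma} \le \sigma(y \le \alpha)$, in contradiction to $\hat{\sigma} \le \tilde{\sigma} \le \sigma^*(y \le \alpha)$. Therefore, we have shown that $(m, y) > \alpha$ is satisfied for all $m$ satisfying $m(\sigma^*(y \le \alpha)) = 1$.

We see, therefore, that $\sigma(y \le \alpha) = 0$ for all $\alpha < -\|y\|$ and $\sigma(y \le \alpha) = \varepsilon$ for $\alpha \ge \|y\|$.

D 2.1.4. The totally ordered subset $\{\sigma(y \le \alpha)\} \subset \Sigma$ is called the spectral family of $y \in \mathcal{B}'(\Sigma)$.

Th. 2.1.14. The spectral family $\sigma(y \le \alpha)$ is right-continuous, that is, $\sigma(y \le \alpha + \epsilon) \to \sigma(y \le \alpha)$ for $\epsilon > 0$ and $\epsilon \to 0$.

Proof. According to Th. 2.1.1 it suffices to show that, for $\tilde{\sigma} = \bigwedge_{\epsilon > 0} \sigma(y \le \alpha + \epsilon)$ the relation $\tilde{\sigma} = \sigma(y \le \alpha)$ holds. Immediately it follows that $\sigma(y \le \alpha) \le \tilde{\sigma}$. On the other hand

$$s(y \mid \tilde{\sigma}) = \sup\{(m, y) \mid m \in K(\Sigma) \text{ and } m(\tilde{\sigma}) = 1\}.$$

Since $\tilde{\sigma} \le \sigma(y \le \alpha + \epsilon)$, from $(m, y) \le \alpha + \epsilon$ for each $\epsilon > 0$ we obtain $(m, y) \le \alpha$ for each $m \in K(\Sigma)$ with $m(\tilde{\sigma}) = 1$, that is, $\tilde{\sigma} \le \sigma(y \le \alpha)$.

D 2.1.5. A map $\mathbb{R} \xrightarrow{p} \Sigma$ satisfying the conditions

$$\alpha_2 > \alpha_1 \Rightarrow p(\alpha_2) \ge p(\alpha_1); \qquad p(\alpha) = 0 \text{ for } \alpha < -c, \quad p(\alpha) = \varepsilon \text{ for } \alpha \ge c$$

for some $c$, and satisfying $p(\alpha+) = p(\alpha)$ ($p$ is right-continuous) is called a (generalized) spectral family.

Th. 2.1.15 (Spectral Theorem). For each spectral family there is a linear bounded functional $l \in \mathcal{B}'(\Sigma)$ defined by

$$l(m) = \int \alpha\, dm(p(\alpha)) \qquad (2.1.9a)$$

such that

$$\sigma(l \le \alpha) = p(\alpha). \qquad (2.1.10)$$

For the special case in which $y \in \mathcal{B}'(\Sigma)$ and $p(\alpha) = \sigma(y \le \alpha)$, $l$ in (2.1.9a) is equal to $y$. We may write (2.1.9a) ($l_\sigma$ is defined in Th. 2.1.13) as follows:

$$l = \int \alpha\, dl_{p(\alpha)}, \qquad (2.1.9b)$$

where the integral is to be considered to be a limit in the norm of $\mathcal{B}'(\Sigma)$.

Proof. $l(m)$ is a bounded linear functional if it is affine and bounded on $K(\Sigma)$; this result follows directly from (2.1.9a).

Let $m(p(\alpha_1)) = 1$; then $m(p(\alpha)) = 1$ for $\alpha \ge \alpha_1$. From (2.1.9a) it follows that $l(m) = (m, l) \le \alpha_1$, that is, $p(\alpha_1) \le \sigma(l \le \alpha_1)$.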
120 IV Coexistent Effects and Coexistent Decompositions Nowletm(<7(/ < aj) = 1. From (2.1.9a) it follows that (m, /) = j* a dm(p(a) л g(1 < aj). (2.1.11) Suppose that p(aj Ф a(l < ax); then there exists a S > 0 such that p{otl + S) Ф a(l < (zj since p is continuous from the right. Therefore P(ai + <$) a a(l < ai) Ф a(l < or p^ + <5) > a(l < a^. If p(+ <5) > cr(Z < oti) for all <5 > 0 it would follow that p^ + ) = p(a^ = a(l < a^. If there exists a S such that p{oLx + 6) л <т(/ < aj ф g(1 < aj then a = <т(/ < o^) + p(ai + <5) л g(1 < oil) ф 0 and a < g(1 < а^. Let us choose а m e K(Z) such that m(<r) = 1. Then it follows that m(p(а) л <т(/ < aj) = 0 for a < oq + S, m(p(c) л a(l < ax)) = m(<j(/ < a^) = 1 and, therefore, from (2.1.11) /(m) > + S in contradiction to the fact that a < a(l < If a! > a2, then, from the definition of a(y < a) we obtain the inequality а2т(сг(у < oct) + <r(y < a2)) < < < at) + a(y < a2)), (2.1.12) where xa is defined by xa{d) = m(o л <т). We may rewrite (2.1.12) in the form a2[m(ff(y < at)) - m(a(y < a2))] < (jc...,y) < a1[m((T(j' < at)) - m(o(y < a2))]. For a partition of the real axis for which a„+1 > a„ for A„m = m(a(y < a„+1)) - m(<r(y < aj) and m ~ Z Л‘<т(У<«и+ n we obtain the inequality Z a„A„m < (m, y) < £ “„+Am- (2.1.13a) n n With A„Z = la(y^an+1) ~~ l<r(yz<xn) ^s inequality can be written in the form (< is in the order in J*'(2)!) Z«aAJ<yZ*n+i*nl (2.1.13b) n n because (2.1.13a) holds for all m e K(Z). Thus it folfows that the left- and l right-hand sides of (2.1.13b) converge (with respect to the norm of &'(L))—as the maximum interval length tends zero— towards the same limit dLy^, (2.1.14)
Th. 2.1.16. $\partial_e L(\Sigma) = \{ l_\sigma \mid \sigma \in \Sigma \}$, where $l_\sigma(m) = m(\sigma)$.

PROOF. On the basis of Th. 2.1.13 it is only necessary to show that $L(\Sigma)$ has no extreme points other than the $l_\sigma$. If $0 \le y \le 1$, then from the spectral theorem it follows that

$$\langle m, y \rangle = \int_{-\delta}^{1} \alpha \, dm(\sigma(y \le \alpha)), \tag{2.1.15}$$

where $\delta > 0$ can be arbitrarily chosen. We define two elements $y_1, y_2 \in L(\Sigma)$ by means of the following equations:

$$\langle m, y_1 \rangle = \int_{-\delta}^{1} (2\alpha - \alpha^2) \, dm(\sigma(y \le \alpha)) \tag{2.1.16}$$

and

$$\langle m, y_2 \rangle = \int_{-\delta}^{1} \alpha^2 \, dm(\sigma(y \le \alpha)), \tag{2.1.17}$$

for which $y = \tfrac{1}{2} y_1 + \tfrac{1}{2} y_2$; $y$ can be an extreme point of $L(\Sigma)$ if and only if $y_1 = y_2 = y$, that is, if $\sigma(y \le \alpha) = \sigma(y_1 \le \alpha) = \sigma(y_2 \le \alpha)$ for all $\alpha$. On the basis of (2.1.15)-(2.1.17) this is the case only if $\sigma(y \le \alpha) = \sigma(y = 0)$ for all $\alpha$ for which $0 \le \alpha < 1$. Thus

$$\langle m, y \rangle = m(\varepsilon + \sigma(y = 0)) = l_{\varepsilon + \sigma(y = 0)}(m). \tag{2.1.18}$$

Th. 2.1.17. The map $\sigma \to l_\sigma$ is an order isomorphism of $\Sigma$ onto $\partial_e L(\Sigma)$.

PROOF. From Th. 2.1.16 and Th. 2.1.13 it is obvious that the mapping is a bijection. Thus, it is clear that $\sigma_1 \ge \sigma_2 \Leftrightarrow l_{\sigma_1} \ge l_{\sigma_2}$, where $l_{\sigma_1} \ge l_{\sigma_2}$ is the order relation in $\mathscr{B}'(\Sigma)$, that is, $\langle m, l_{\sigma_1} \rangle = m(\sigma_1) \ge \langle m, l_{\sigma_2} \rangle = m(\sigma_2)$ for all $m \in K(\Sigma)$.

On the basis of Th. 2.1.17 we may identify $\Sigma$ with $\partial_e L(\Sigma)$ and write $\langle m, l_\sigma \rangle = \langle m, \sigma \rangle$. Since we may identify a representation of a Boolean ring with a measure space (see [31] and §2.5) and the elements of $\mathscr{B}'(\Sigma)$ with measurable, essentially bounded functions (where two functions which differ only on a set of zero measure are considered to be the "same" element of $\mathscr{B}'(\Sigma)$), the abstract space $\mathscr{B}'(\Sigma)$ is sometimes called the space of measurable functions. We will not, however, make use of this representation of $\Sigma$ and of this notion of $\mathscr{B}'(\Sigma)$.

We shall now state (without proof) an interesting theorem which is concerned with the axioms introduced in III.

Th. 2.1.18. 
If the lattice $G$ which was introduced in III, §3 is distributive, then it follows (without making use of AV 3 and AV 4s in III, §3) that the following isomorphisms hold for the Banach spaces defined in III, §3: $\mathscr{B} \to \mathscr{B}(\Sigma)$, $\mathscr{B}' \to \mathscr{B}'(\Sigma)$, where $G \to \partial_e L(\Sigma)$ (and therefore $G \to \Sigma$), $L \to L(\Sigma)$, $K \to K(\Sigma)$ for a suitably chosen $\Sigma$. For a proof, see [13], VII, §5.3.
2.2 Mixture Morphisms Corresponding to an Observable

According to Th. 1.4.3, for each observable $\Sigma \xrightarrow{F} L$ and each $w \in K$ there exists a $\sigma$-additive measure $m$ defined by $m(\sigma) = \mu(w, F(\sigma))$; for $w_0$ (from Th. 1.4.1) the measure $m_0(\sigma) = \mu(w_0, F(\sigma))$ is effective. In this way we clearly obtain a map of $K$ (as a subset of $\mathscr{B}(\mathscr{H}_1, \mathscr{H}_2, \ldots)$) into $\mathscr{B}(\Sigma)$ (see also [17], p. 380) as follows:

$$K \to K(\Sigma), \qquad w \to \mu(w, F(\sigma)). \tag{2.2.1}$$

We shall now find it useful to investigate such maps. In V, §4.1 these maps will be studied in a more general setting. The discussion presented in this section may serve to motivate the studies presented in V. We will therefore use several general results from V here.

From V, D 4.1.1 it is easy to show that the map (2.2.1) is a mixture morphism (abbreviation: mi-morphism). According to V, Th. 4.1.2 such a mi-morphism uniquely defines a norm-continuous map $S$ of $\mathscr{B}(\mathscr{H}_1, \ldots)$ into $\mathscr{B}(\Sigma)$. Earlier we called a $w \in K(\mathscr{H}_1, \ldots)$ effective if $C(w) = K$.

Th. 2.2.1 (H. Neumann [21]). The mi-morphism $S$ defined by (2.2.1) transforms effective ensembles $w \in K(\mathscr{H}_1, \ldots)$ into effective measures $m \in K(\Sigma)$. The adjoint map $S'$ of $\mathscr{B}'(\Sigma)$ into $\mathscr{B}'(\mathscr{H}_1, \ldots)$ maps (according to V, Th. 4.1.3) $L(\Sigma)$ into $L(\mathscr{H}_1, \ldots)$. The restriction of $S'$ to $\partial_e L(\Sigma)$ is identical to the vector measure $F\colon \Sigma \to L$, provided that we identify $\Sigma$ with $\partial_e L(\Sigma)$ (according to Th. 2.1.17). $S$ is uniquely defined by the restriction of $S'$ to $\partial_e L(\Sigma)$.

PROOF. Let $w$ be effective. From $m(\sigma) = \mu(w, F(\sigma)) = 0$ it follows that $F(\sigma) = 0$; if $F(\sigma) = 0$ then $\sigma = 0$ since $F$ was assumed to be effective.

For $\sigma \in \partial_e L(\Sigma) = \Sigma$ and $Sw = m$ it follows that $\mu(w, S'\sigma) = \langle Sw, \sigma \rangle = \langle m, \sigma \rangle = m(\sigma)$ and, according to (2.2.1), $\mu(w, S'\sigma) = \mu(w, F(\sigma))$ for all $w \in K$. Thus we obtain $S'\sigma = F(\sigma)$.

Suppose that we have two mi-morphisms $S_1$ and $S_2$ for which $S_1'$ and $S_2'$ are identical on $\partial_e L(\Sigma)$. Then, for all $w \in K(\mathscr{H}_1, \ldots)$ we obtain $0 = \mu(w, (S_1' - S_2')\sigma) = \langle S_1 w - S_2 w, \sigma \rangle$ for all $\sigma \in \partial_e L(\Sigma)$. 
According to either the Krein-Milman theorem or the spectral theorem (Th. 2.1.15), $L(\Sigma) = \overline{\mathrm{co}}\, \partial_e L(\Sigma)$. Thus it follows that $\langle S_1 w - S_2 w, h \rangle = 0$ for all $h \in L(\Sigma)$; since $L(\Sigma)$ spans all of $\mathscr{B}'(\Sigma)$, the same result holds for all $h \in \mathscr{B}'(\Sigma)$. It follows that $S_1 w = S_2 w$ for all $w \in K(\mathscr{H}_1, \ldots)$, from which we conclude that $S_1 = S_2$.

Th. 2.2.2 (H. Neumann [21]). If $S$ is a mi-morphism of $K(\mathscr{H}_1, \ldots)$ into $K(\Sigma)$ which maps effective ensembles into effective measures, then the restriction of the adjoint map $S'$ to $\partial_e L(\Sigma)$ defines a $\sigma$-additive effective measure $F$ on $\Sigma$ such that $F(\sigma) = S'\sigma$ and $F(\varepsilon) = 1$ (where $\varepsilon$ is the unit element of $\Sigma$). In addition, $\Sigma$ is $\mathcal{U}_g$-complete if $\mathcal{U}_g$ is the uniform structure determined by $F$.

PROOF. According to V, Th. 4.1.3, $S'$ maps the set $L(\Sigma)$ into $L(\mathscr{H}_1, \ldots)$. We note that if $\sigma_1 \wedge \sigma_2 = 0$ we obtain $m(\sigma_1 \vee \sigma_2) = m(\sigma_1) + m(\sigma_2)$, from which it follows
that $\sigma_1 \vee \sigma_2 = \sigma_1 + \sigma_2$, where the $\sigma_i$ are regarded as elements of $\partial_e L(\Sigma) \subset \mathscr{B}'(\Sigma)$. Thus it follows that $S'(\sigma_1 \vee \sigma_2) = S'\sigma_1 + S'\sigma_2$; thus $S'\sigma$ is an additive measure on $\Sigma$. If we define $F(\sigma) = S'\sigma$, then, for $\sigma \in \partial_e L(\Sigma)$ and $m = Sw$, we obtain $m(\sigma) = \langle m, \sigma \rangle = \langle Sw, \sigma \rangle = \mu(w, S'\sigma) = \mu(w, F(\sigma))$. For $\sigma = \varepsilon$ we find that $1 = m(\varepsilon) = \mu(w, F(\varepsilon))$ for all $w \in K(\mathscr{H}_1, \ldots)$ and we therefore obtain $F(\varepsilon) = 1$.

For $w_0$ defined by Th. 1.4.1 we find that $S w_0$ is an effective $\sigma$-additive measure $m_0$ on $\Sigma$ for which $\mathcal{U}_g$ is the uniform structure generated by $d(\sigma_1, \sigma_2) = m_0(\sigma_1 + \sigma_2)$. Thus, by Th. 1.4.4, $\Sigma$ is $\mathcal{U}_g$-complete.

Th. 2.2.1 and Th. 2.2.2 show that it is possible to uniquely characterize $\sigma$-additive measures $F\colon \Sigma \to L$ on complete Boolean rings $\Sigma$ by mi-morphisms $S\colon K(\mathscr{H}_1, \ldots) \to K(\Sigma)$ which transform effective ensembles $w$ into effective measures. $S'\colon L(\Sigma) \to L(\mathscr{H}_1, \ldots)$ represents a type of extension of the map $F\colon \Sigma \to L(\mathscr{H}_1, \ldots)$ because we may identify $\Sigma$ with $\partial_e L(\Sigma)$ and $S'$ is equal to $F$ on $\partial_e L(\Sigma)$.

Since $S'$ is linear, $S'$ maps the convex set $L(\Sigma)$ (which is generated by $\partial_e L(\Sigma)$) onto a convex subset of $L(\mathscr{H}_1, \ldots)$. What is the physical interpretation of these convex subsets?

2.3 The Kernel of an Observable; Mixture of Effects for an Observable

In III, §2 we discussed the use of the concept of a random generator in the formulation of the concept of the direct mixture of two registration methods. We shall now consider the special case in which $b_{01} = b_{02}$. We begin by defining the following special case of III, D 2.6.

D 2.3.1. A $\tilde{b}_0$ is said to be a direct mixture of $b_0$ with itself if there exist two registration methods $b'_{01}$, $b'_{02}$ such that $b'_{01}$ is isomorphic to $b_0$, $b'_{02}$ is isomorphic to $b_0$, $b'_{01} \cap b'_{02} = \emptyset$, and $\tilde{b}_0 = b'_{01} \cup b'_{02}$. We call $\alpha$ and $1 - \alpha$ the weights of $b'_{01}$ and $b'_{02}$ in the direct mixture $\tilde{b}_0$. 
The direct mixture of $b_0$ with itself is nothing other than an extension of the Boolean ring $\mathscr{R}(b_0)$ to a Boolean ring $\mathscr{R}(\tilde{b}_0)$ with the aid of a random generator such that all convex combinations of effects $\psi(b_0, b)$ appear in the ratio $\alpha$ to $1 - \alpha$. According to III (2.8), to each pair $b_1, b_2 \subset b_0$ there exists a $\tilde{b} \subset \tilde{b}_0$ such that

$$\psi(\tilde{b}_0, \tilde{b}) = \alpha\, \psi(b_0, b_1) + (1 - \alpha)\, \psi(b_0, b_2). \tag{2.3.1}$$

From III, Th. 2.4 we obtain the following special case:

Th. 2.3.1. Let $b_0$ be a registration method, and let $\lambda_i > 0$ $(i = 1, \ldots, n)$ be a set of rational numbers such that $\sum_{i=1}^{n} \lambda_i = 1$. Then there exists a registration method $\tilde{b}_0$ such that, in $\mathscr{R}(\tilde{b}_0)$, there exists a $\tilde{b}$ for every sequence of $b_i \in \mathscr{R}(b_0)$ for which

$$\psi(\tilde{b}_0, \tilde{b}) = \sum_{i=1}^{n} \lambda_i\, \psi(b_0, b_i). \tag{2.3.2}$$
It is not difficult to obtain mixtures of effects from the extension of the ring $\mathscr{R}(b_0)$ by means of a random generator. This procedure apparently does not lead to new information. In Th. 2.3.3 we shall show that we may obtain an abstract version of Th. 2.3.1.

The converse question is more interesting: Let $\Sigma \xrightarrow{F} L$ be an observable. Suppose there exist three elements $\sigma, \sigma_1, \sigma_2$ which satisfy the following equation:

$$F(\sigma) = \alpha F(\sigma_1) + (1 - \alpha) F(\sigma_2) \tag{2.3.3}$$

(which is an abstract version of (2.3.1)). Is it possible to "shrink" $\Sigma$ in such a way as to eliminate $\sigma$?

We have placed these descriptions of mixtures of effects in the foreground in order to make it easier for the reader to visualize the physics underlying the following abstract discussion about convex combinations and to understand the desire to seek "small" observables. We shall begin the abstract discussion with a definition:

D 2.3.2. Let $F\colon \Sigma \to L$ be an additive measure on the Boolean ring $\Sigma$. (We do not require that $\Sigma = \hat{\Sigma}_g$ or that $\Sigma_g$ be separable.) Let $\overline{\mathrm{co}}(F\Sigma)$ denote the $\sigma(\mathscr{B}', \mathscr{B})$-closed convex hull of the set $F\Sigma$; we shall call it the convex range of the measure $F$. If $\Sigma \xrightarrow{F} L$ is an observable, then we shall call $\overline{\mathrm{co}}(F\Sigma)$ the convex range of the observable.

According to the previous discussion, in order to make an "economical" measurement, it is not necessary to observe a "response" of a $\sigma$ for which there exists a pair $\sigma_1, \sigma_2$ satisfying (2.3.3). Conversely, if we are interested in the physical meaning of mixtures of effects, by using direct mixtures we may introduce arbitrarily many convex combinations of the form (2.3.3). It is possible to prove the following: To each $\Sigma \xrightarrow{F} L$ there exists an extension $\Sigma'$ of $\Sigma$ and a measure $\Sigma' \xrightarrow{F'} L$ extending $F$ for which $\mathrm{co}(F\Sigma) = F'\Sigma'$ (see [21]); this result has the following intuitive meaning: we may introduce arbitrary convex combinations (corresponding to the physical notion of a direct mixture given in Th. 2.3.1) if we are willing to "enlarge" the Boolean ring. 
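The idea of enlarging the Boolean ring with a random generator can be sketched in a finite toy model. The following is only an illustration under strong assumptions that are not part of the text: the Boolean ring is the power set of a finite set of atoms, effects are represented by pairs of numbers in $[0, 1]$ (functions on a two-point classical state space), and the names `measure` and `enlarge` are hypothetical helpers. Each atom is paired with a coin outcome `"H"` or `"T"` of weight $\alpha$ or $1 - \alpha$, and an element of the enlarged ring then realizes the mixture (2.3.3).

```python
from fractions import Fraction

def measure(atom_effects, subset):
    """Additive measure: sum the effects attached to the atoms in `subset`."""
    dim = len(next(iter(atom_effects.values())))
    total = (Fraction(0),) * dim
    for atom in subset:
        total = tuple(t + x for t, x in zip(total, atom_effects[atom]))
    return total

def enlarge(atom_effects, alpha):
    """Atoms of the enlarged ring: every atom paired with a coin outcome.

    The coin is the (assumed) random generator with weights alpha and 1 - alpha.
    """
    weights = {"H": alpha, "T": 1 - alpha}
    return {
        (atom, coin): tuple(w * x for x in effect)
        for atom, effect in atom_effects.items()
        for coin, w in weights.items()
    }

# Hypothetical two-atom measure F; the effects of the atoms sum to (1, 1).
F = {0: (Fraction(1, 3), Fraction(1)), 1: (Fraction(2, 3), Fraction(0))}
alpha = Fraction(1, 4)
F_ext = enlarge(F, alpha)

# The element {(0, H), (1, T)} of the enlarged ring realizes the mixture
# alpha * F({0}) + (1 - alpha) * F({1}).
mixed = measure(F_ext, {(0, "H"), (1, "T")})
expected = tuple(alpha * a + (1 - alpha) * b for a, b in zip(F[0], F[1]))
```

In this sketch the original ring is embedded by sending an atom to the pair of its two coin copies, and the enlarged measure agrees with `F` on that image, so nothing is lost by the enlargement.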
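Since $L$ is $\sigma(\mathscr{B}', \mathscr{B})$-compact, $\overline{\mathrm{co}}(F\Sigma)$ is, as a closed subset of $L$, also $\sigma(\mathscr{B}', \mathscr{B})$-compact. According to the Krein-Milman theorem, $\overline{\mathrm{co}}(F\Sigma)$ is generated by its set of extreme points $\partial_e \overline{\mathrm{co}}(F\Sigma)$. Physically, this set $\partial_e \overline{\mathrm{co}}(F\Sigma)$ is the essential set of effects for the observable $\Sigma \xrightarrow{F} L$. We therefore define:

D 2.3.3. $\partial_e \overline{\mathrm{co}}(F\Sigma)$ is called the extreme kernel of the range of the measure $F\colon \Sigma \to L$. If $\Sigma \xrightarrow{F} L$ is an observable, then it is called the extreme kernel of the observable.

D 2.3.4. Two observables $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ are said to be convex equivalent if $\overline{\mathrm{co}}(F_1\Sigma_1) = \overline{\mathrm{co}}(F_2\Sigma_2)$.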
On the basis of the Krein-Milman theorem the following assertions are equivalent:

(i) $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ are convex equivalent.
(ii) $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ have the same extreme kernel.

Physically, in order that we do not have to measure anything more than is necessary, it is desirable to make an observable "as small as possible." Mathematically, this corresponds to the question whether there is a "smallest" range of an observable, given a particular convex range. The answer is based upon the following theorems:

Th. 2.3.2. Let $F\colon \Sigma \to L$ be an effective measure on the Boolean ring $\Sigma$ (we do not require that $\Sigma = \hat{\Sigma}_g$ or that $\Sigma_g$ is separable). Then $\overline{\mathrm{co}}(F\Sigma) = \overline{\mathrm{co}}(F\hat{\Sigma}_g)$.

PROOF. From Th. 1.4.3 it follows that, since $F$ is continuous (according to Th. 1.4.2), $F\hat{\Sigma}_g \subset \overline{F\Sigma}$, from which the assertion follows.

Th. 2.3.3. To each additive measure $F\colon \Sigma \to L$ there exists a convex equivalent observable $\Sigma_1 \xrightarrow{F_1} L$. If desirable, this observable may be chosen such that $\overline{\mathrm{co}}(F\Sigma) = \overline{F_1\Sigma_1}$ (where $\overline{A}$ is the $\sigma(\mathscr{B}', \mathscr{B})$-closure of $A$).

PROOF. Since $L$ is $\sigma(\mathscr{B}', \mathscr{B})$-separable, we may choose a countable subset $\{\sigma_\nu\} \subset \Sigma$ such that $\{F(\sigma_\nu)\}$ is dense in $F\Sigma$. Thus $\overline{\mathrm{co}}(F\Sigma) = \overline{\mathrm{co}}(\{F(\sigma_\nu)\})$. The $\sigma_\nu$ generate a countable Boolean subring $\Sigma'$ of $\Sigma$ such that $\overline{\mathrm{co}}(F\Sigma) = \overline{\mathrm{co}}(F\Sigma')$. $\Sigma_1 = \hat{\Sigma}'_g$ is separable, hence the extension $F_1$ of $F$ (as a function over $\Sigma'$) onto $\hat{\Sigma}'_g$ defines an observable $\Sigma_1 \xrightarrow{F_1} L$. According to Th. 2.3.2, $\overline{\mathrm{co}}(F_1\Sigma_1) = \overline{\mathrm{co}}(F\Sigma')$; therefore $\Sigma_1 \xrightarrow{F_1} L$ is convex equivalent to $\Sigma \xrightarrow{F} L$.

If we wish to obtain $\overline{\mathrm{co}}(F\Sigma) = \overline{F_1\Sigma_1}$ we then extend $F\colon \Sigma \to L$ to $F''\colon \Sigma'' \to L$ for which $\mathrm{co}(F\Sigma) = F''\Sigma''$ (see above and [21]). Then, as above, we may construct the observable $\Sigma_1 \xrightarrow{F_1} L$ by means of a countable subset $\{\sigma_\nu\} \subset \Sigma''$ for which $\{F''(\sigma_\nu)\}$ is dense in $F''\Sigma''$.

In order to investigate the convex range we may always assume that $\Sigma = \hat{\Sigma}_g$, that is, that $\Sigma$ is complete. This makes it possible to use the results of §2.2. We then obtain

Th. 2.3.4 (H. Neumann [21]). Let $F\colon \Sigma \to L$ be an effective additive measure, and let $\Sigma = \hat{\Sigma}_g$. 
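Let $S\colon K(\mathscr{H}_1, \ldots) \to K(\Sigma)$ be the mi-morphism defined by (2.2.1) and let $S'$ be the dual map (see Th. 2.2.1) $S'\colon L(\Sigma) \to L(\mathscr{H}_1, \ldots)$. Then the following conditions are satisfied:

(i) $S'L(\Sigma) = \overline{\mathrm{co}}(F\Sigma)$.
(ii) To each extreme point $g_e \in \partial_e \overline{\mathrm{co}}(F\Sigma) = \partial_e S'L(\Sigma)$ there is one and only one $\sigma \in \Sigma$ for which $g_e = F(\sigma)$.
(iii) $G \cap \overline{\mathrm{co}}(F\Sigma) = G \cap \partial_e \overline{\mathrm{co}}(F\Sigma) = G \cap (F\Sigma)$.

PROOF. Since $S'$ is continuous with respect to the $\sigma(\mathscr{B}', \mathscr{B})$ topologies and $L(\Sigma)$ is compact, we find that $S'L(\Sigma)$ is compact. Since $L(\Sigma)$ is convex, $S'L(\Sigma)$ is convex. Then, since by Th. 2.2.1 $\Sigma$ may be identified with $\partial_e L(\Sigma)$, we find that $F\Sigma \subset S'L(\Sigma)$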
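and therefore obtain $\overline{\mathrm{co}}(F\Sigma) \subset S'L(\Sigma)$. Then, since by the Krein-Milman theorem $L(\Sigma) = \overline{\mathrm{co}}\, \partial_e L(\Sigma)$ and $S'$ is continuous,

$$S'L(\Sigma) = S'\, \overline{\mathrm{co}}\, \partial_e L(\Sigma) = \overline{\mathrm{co}}\, S' \partial_e L(\Sigma) = \overline{\mathrm{co}}(F\Sigma),$$

whereby we have proven (i).

Let $g_e \in \partial_e \overline{\mathrm{co}}(F\Sigma) = \partial_e S'L(\Sigma)$. The set $S'^{-1} g_e \cap L(\Sigma)$, that is, the set of all $f \in L(\Sigma)$ with $S'f = g_e$, forms a closed face of $L(\Sigma)$, as we will show: $S'^{-1} g_e$ is the inverse image of a closed set and is therefore closed. Therefore $S'^{-1} g_e \cap L(\Sigma)$ is also closed. It is easy to verify that $S'^{-1} g_e \cap L(\Sigma)$ is convex. Let $f \in S'^{-1} g_e \cap L(\Sigma)$, $f_1, f_2 \in L(\Sigma)$ and suppose that $f = \lambda f_1 + (1 - \lambda) f_2$ where $0 < \lambda < 1$. Then $S'f = g_e = \lambda S'f_1 + (1 - \lambda) S'f_2$. Since $g_e$ is an extreme point, $S'f_1 = S'f_2 = g_e$, that is, $f_1, f_2 \in S'^{-1} g_e \cap L(\Sigma)$.

According to the Krein-Milman theorem the face $S'^{-1} g_e \cap L(\Sigma)$ contains an extreme point of $L(\Sigma)$ which, according to Th. 2.2.1, may be identified with a $\sigma \in \Sigma$. Therefore there exists a $\sigma \in \Sigma$ for which $F(\sigma) = g_e$.

Suppose that, for $\sigma_1, \sigma_2 \in \Sigma$ we have $F(\sigma_1) = F(\sigma_2) = g_e$. Since $F$ is additive, it follows that

$$g_e = F(\sigma_1) = F(\sigma_1 + \sigma_1 \cdot \sigma_2) + F(\sigma_1 \cdot \sigma_2),$$
$$g_e = F(\sigma_2) = F(\sigma_2 + \sigma_1 \cdot \sigma_2) + F(\sigma_1 \cdot \sigma_2).$$

We therefore obtain

$$\tfrac{1}{2} F(\sigma_1 \vee \sigma_2) + \tfrac{1}{2} F(\sigma_1 \wedge \sigma_2) = \tfrac{1}{2} \left[ F(\sigma_1 + \sigma_1 \cdot \sigma_2) + F(\sigma_1 \cdot \sigma_2) + F(\sigma_2 + \sigma_1 \cdot \sigma_2) \right] + \tfrac{1}{2} F(\sigma_1 \cdot \sigma_2) = g_e.$$

Since $g_e$ is an extreme point, the following condition must be satisfied: $F(\sigma_1 \vee \sigma_2) = F(\sigma_1 \wedge \sigma_2) = g_e$. Thus it follows that

$$F(\sigma_1 + \sigma_2) = F(\sigma_1 \vee \sigma_2) - F(\sigma_1 \wedge \sigma_2) = 0.$$

Since $F$ is effective, it follows that $\sigma_1 + \sigma_2 = 0$, that is, $\sigma_1 = \sigma_2$.

Since $G \subset \partial_e L(\mathscr{H}_1, \ldots)$ (see III, Th. 6.6) we obtain $G \cap \overline{\mathrm{co}}(F\Sigma) = G \cap \partial_e \overline{\mathrm{co}}(F\Sigma)$. Then, since by (ii) $\partial_e \overline{\mathrm{co}}(F\Sigma) \subset F\Sigma$, we obtain $G \cap \partial_e \overline{\mathrm{co}}(F\Sigma) \subset G \cap F\Sigma \subset G \cap \overline{\mathrm{co}}(F\Sigma)$.

D 2.3.5. Let $F\colon \Sigma \to L$ where $\Sigma$ is $\mathcal{U}_g$-complete. Then the subset

$$N = \{ \sigma \mid \sigma \in \Sigma \text{ and } F(\sigma) \in \partial_e \overline{\mathrm{co}}(F\Sigma) \}$$

is called the kernel of the measure $F$. If $\Sigma \xrightarrow{F} L$ is an observable, then $N$ is called the kernel of the observable $\Sigma \xrightarrow{F} L$.

According to Th. 2.3.4 the map $\sigma \to F(\sigma)$ is a bijection of $N$ onto $\partial_e \overline{\mathrm{co}}(F\Sigma)$. 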
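The kernel of D 2.3.5 can be computed by hand in a small finite model. The sketch below is an illustration only and rests on assumptions that do not come from the text: the Boolean ring is the power set of three atoms, effects are pairs of rational numbers in $[0, 1]$ (functions on a two-point classical state space), and the extreme points of the convex range are found with a planar convex hull routine. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def cross(o, a, b):
    """Signed area test for the turn o -> a -> b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def extreme_points(points):
    """Extreme points of the convex hull of planar points (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return set(pts)

    def half_hull(seq):
        hull = []
        for p in seq:
            # "<= 0" also discards collinear points, which are mixtures.
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    return set(lower[:-1] + upper[:-1])

def subsets(atoms):
    """All elements of the power-set Boolean ring over `atoms`."""
    atoms = list(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def measure(atom_effects, subset):
    """Additive measure F(sigma) as a pair of rationals."""
    x = sum((atom_effects[a][0] for a in subset), Fraction(0))
    y = sum((atom_effects[a][1] for a in subset), Fraction(0))
    return (x, y)

# Hypothetical measure: atoms 0 and 1 split one effect with a fair coin.
half = Fraction(1, 2)
F = {0: (half, Fraction(0)), 1: (half, Fraction(0)), 2: (Fraction(0), Fraction(1))}

ring = subsets(F)
extreme = extreme_points([measure(F, s) for s in ring])
kernel = {s for s in ring if measure(F, s) in extreme}
```

In this toy case the elements `{0}` and `{1}` are both mapped to the midpoint of the segment from $(0, 0)$ to $(1, 0)$, so they drop out, and the kernel is the four-element ring generated by `{0, 1}` and `{2}`. Each extreme point is the image of exactly one element, in line with Th. 2.3.4(ii).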
For the experimental technique of measurement the kernel N of an observable is the essential component. For example, in order to determine
the frequencies $\mu(w, F(\sigma))$ for all $\sigma \in \Sigma$ we need only measure the "responses" corresponding to the "indications" $\sigma \in N$. In §2.4 we shall consider this topic in more detail.

From Th. 2.3.4(iii) we find that we do not introduce any new additional decision effects into $F\Sigma$ by taking the closed convex hull $\overline{\mathrm{co}}$ of the range $F\Sigma$ (provided, of course, that $\Sigma$ is $\mathcal{U}_g$-closed). For the special case in which $F\Sigma \subset G$ it follows, since $\Sigma \xrightarrow{F} G$ is, by definition D 1.4.3, a decision observable, that

$$F\Sigma = G \cap F\Sigma = G \cap \overline{F\Sigma} = G \cap \overline{\mathrm{co}}(F\Sigma)$$

and therefore $F\Sigma$ is $\sigma(\mathscr{B}', \mathscr{B})$-closed in $G$. Thus we obtain an alternative proof of the portion of Th. 1.3.7 which is enclosed by brackets; see also Th. 1.4.8. Therefore, the kernel $N$ of a decision observable is identical to $\Sigma$, and we may therefore identify $\Sigma$, and therefore $N$, with $F\Sigma$. According to Th. 1.4.8, $N = \Sigma$ is $\mathcal{U}_g$-separable. We shall now show that the above result holds in general.

Th. 2.3.5. The kernel $N$ of a measure $F\colon \Sigma \to L$ (where $\Sigma$ is complete in $\mathcal{U}_g$) is $\mathcal{U}_g$-separable (in general $\Sigma$ is not $\mathcal{U}_g$-separable; see [11]). There exists an observable $\Sigma_1 \xrightarrow{F_1} L$ where $\Sigma_1 \subset \Sigma$ and $F_1 = F|_{\Sigma_1}$ for which the kernel $N_1 = N$ (where $N$ is the kernel of $F\colon \Sigma \to L$). The complete Boolean ring $\Sigma_2 \subset \Sigma$ generated by $N$ is separable; therefore $\Sigma_2 \xrightarrow{F_2} L$ where $F_2 = F|_{\Sigma_2}$ is an observable. If $\partial_e \overline{\mathrm{co}}(F\Sigma) \subset G$, then $\partial_e \overline{\mathrm{co}}(F\Sigma)$ is a complete Boolean sublattice of $G$ and $\Sigma_2 \xrightarrow{F_2} L$ is an isomorphism $\Sigma_2 \to \partial_e \overline{\mathrm{co}}(F\Sigma)$.

PROOF. If we construct $\Sigma_1$ as in the proof of Th. 2.3.3, then it follows that $\Sigma_1 \subset \Sigma$ and that $\Sigma_1 \xrightarrow{F_1} L$ is an observable (that is, $\Sigma_1$ is $\mathcal{U}_g$-separable). According to Th. 2.3.3, for this observable we have $\overline{\mathrm{co}}(F\Sigma) = \overline{\mathrm{co}}(F_1\Sigma_1)$ and therefore $\partial_e \overline{\mathrm{co}}(F\Sigma) = \partial_e \overline{\mathrm{co}}(F_1\Sigma_1)$. Since $\Sigma_1$ also satisfies the assumptions of Th. 2.3.4, $F_1(\sigma)$ is, according to Th. 2.3.4(ii), a bijection of $N_1 \subset \Sigma_1$ onto $\partial_e \overline{\mathrm{co}}(F_1\Sigma_1)$. Since $F(\sigma)$ is also a bijection of $N \subset \Sigma$ onto $\partial_e \overline{\mathrm{co}}(F\Sigma)$ it follows that $N_1 = N$. Since $\Sigma_1$ is $\mathcal{U}_g$-separable, we find that $N = N_1$, as a subset of $\Sigma_1$, is $\mathcal{U}_g$-separable. Thus $\Sigma_2$ is also separable. 
From Th. 1.4.7 it follows that the last part of Th. 2.3.5 holds.

According to Th. 2.3.5 the complete Boolean ring $\Sigma_2$ generated by the kernel $N$ (which exists according to Th. 2.3.5) yields the smallest possible observable for which nothing is lost with respect to the measure $F\colon \Sigma \to L$, since $\partial_e \overline{\mathrm{co}}(F_2\Sigma_2) = \partial_e \overline{\mathrm{co}}(F\Sigma)$. Thus, as far as the physics of the situation is concerned, it suffices to consider only those observables $\Sigma \xrightarrow{F} L$ for which $\Sigma$ is the complete Boolean ring generated by the kernel $N$.

D 2.3.6. Let $N$ be the kernel corresponding to the observable $\Sigma \xrightarrow{F} L$. If the complete Boolean ring generated by $N$ is equal to $\Sigma$, then we shall call $\Sigma \xrightarrow{F} L$ a kernel observable.

Since the kernel $N$ of a decision observable is identical to $\Sigma$, every decision observable is a kernel observable.
D 2.3.7. Let $\Sigma \xrightarrow{F} L$ be an observable. The observable $\Sigma_2 \xrightarrow{F_2} L$ generated by the kernel $N$ of $\Sigma \xrightarrow{F} L$ is called the associated kernel observable corresponding to $\Sigma \xrightarrow{F} L$.

For a given observable, for measurement purposes it is sufficient to consider only the associated kernel observable and the associated measurements. If we consider only the class of kernel observables, we have taken an additional step towards our goal of describing "measurements" in terms of the structure of microsystems: the exclusion of mixtures of effects from the measurement. Have we actually attained this goal?

Suppose that the relation $\partial_e \overline{\mathrm{co}}(F\Sigma) \subset G$ holds for an observable $\Sigma \xrightarrow{F} L$. Then the kernel is a complete Boolean ring (see Th. 1.4.8), and the kernel observable is therefore a decision observable $N \xrightarrow{F_1} G$, where $F_1 = F|_N$. It would suffice to consider only decision observables if all kernel observables were decision observables. If this were really the case, then we could say that the decision effects are the "true" measurements of the structure of microsystems. Unfortunately, as we shall find in the remainder of this chapter, this is not the case.

In this section we have already seen that it is sufficient to consider only observables (in particular, kernel observables) instead of more general measures $F\colon \Sigma \to L$. To close this section, we shall now state the following theorem, which is a direct consequence of the previous theorems:

Th. 2.3.6. A set $A \subset L$ is a set of coexistent effects if and only if there exists an observable $\Sigma \xrightarrow{F} L$ (which we may also take to be a kernel observable) for which $A \subset \overline{\mathrm{co}}(F\Sigma)$.

2.4 Mixtures and Decompositions of Observables

We shall now, in accordance with the previous section, consider only observables, that is, measures on $\mathcal{U}_g$-separable Boolean rings. We shall now seek to define, in mathematical terms, the following intuitive idea: A registration method $b_{01}$ registers "more" than a second $b_{02}$. 
For this purpose, we again consider, as an idealization of $\mathscr{R}(b_{01})$ and $\mathscr{R}(b_{02})$, two observables $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ and the following possibility: Suppose that we are given a homomorphism $h$ of the Boolean ring $\Sigma_1$ into the Boolean ring $\Sigma_2$ for which the following diagram is commutative:

$$\Sigma_1 \xrightarrow{\;h\;} \Sigma_2, \qquad F_1 = F_2 \circ h \colon \Sigma_1 \to L. \tag{2.4.1}$$

A homomorphism $h$ is an isomorphism of $\Sigma_1$ onto the Boolean subring $h\Sigma_1$ of $\Sigma_2$ if and only if no element of $\Sigma_1$ other than the zero element is mapped to the zero element of $\Sigma_2$. If this is the case, then from $h(\sigma_1) = h(\sigma_2)$ it
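follows that $h(\sigma_1 + \sigma_2) = h(\sigma_1) + h(\sigma_2) = 0$, hence $\sigma_1 + \sigma_2 = 0$, that is, $\sigma_1 = \sigma_2$. $h^{-1}$ exists, and is also a homomorphism, where the latter can be proven in the following way. From $\sigma_3 = h^{-1}[h(\sigma_1) \vee h(\sigma_2)]$ and $h(\sigma_1 \vee \sigma_2) = h(\sigma_1) \vee h(\sigma_2)$ it follows that $\sigma_3 = \sigma_1 \vee \sigma_2$, and so on.

If (2.4.1) is commutative, and if $\Sigma_1 \xrightarrow{F_1} L$, $\Sigma_2 \xrightarrow{F_2} L$ are observables, that is, $F_1, F_2$ are effective, then if $h(\sigma) = 0$ it follows that $\sigma = 0$: According to (2.4.1) it follows that if $h(\sigma) = 0$ then $F_1(\sigma) = F_2(h(\sigma)) = F_2(0) = 0$; therefore $\sigma = 0$. $h$ is an isomorphism of $\Sigma_1$ onto $h\Sigma_1 \subset \Sigma_2$.

We will prove that, in addition, $h$ is not only an isomorphism of the Boolean ring $\Sigma_1$ onto $h\Sigma_1 \subset \Sigma_2$, but is also an isomorphism of the uniform structures of the two rings. These uniform structures are generated by

$$d_1(\sigma_1, \sigma_2) = \mu(w_0, F_1(\sigma_1 + \sigma_2)) \quad\text{and}\quad d_2(\tilde{\sigma}_1, \tilde{\sigma}_2) = \mu(w_0, F_2(\tilde{\sigma}_1 + \tilde{\sigma}_2)),$$

where $w_0$ is defined in Th. 1.4.1. Since $F_2(h(\sigma_1) + h(\sigma_2)) = F_2 h(\sigma_1 + \sigma_2) = F_1(\sigma_1 + \sigma_2)$ we find that

$$d_1(\sigma_1, \sigma_2) = d_2(h(\sigma_1), h(\sigma_2)). \tag{2.4.2}$$

Thus $h\Sigma_1$ is $\mathcal{U}_g$-complete, as is $\Sigma_1$, and is, according to Th. 1.4.3, a complete Boolean sublattice of $\Sigma_2$, and $h$ is also an isomorphism of the complete Boolean lattices $\Sigma_1$ and $h\Sigma_1$.

On the basis of (2.4.1) we may identify $\Sigma_1 \xrightarrow{F_1} L$ with $h\Sigma_1 \xrightarrow{F_2} L$. We shall not, however, do this in the following.

D 2.4.1. Let $(\Sigma_1, F_1)$ and $(\Sigma_2, F_2)$ be a pair of observables. We shall say that $\Sigma_2 \xrightarrow{F_2} L$ is more extensive than $\Sigma_1 \xrightarrow{F_1} L$ (written $(\Sigma_1, F_1) \prec (\Sigma_2, F_2)$) if there is a homomorphism $h$ (of the type described above) for which the diagram (2.4.1) is commutative.

This definition is the precise form of the following statement: $\Sigma_2 \xrightarrow{F_2} L$ "measures more" than $\Sigma_1 \xrightarrow{F_1} L$.

The relation $\prec$ is a pre-order: Suppose that $(\Sigma_1, F_1) \prec (\Sigma_2, F_2)$ and $(\Sigma_2, F_2) \prec (\Sigma_3, F_3)$. Then, from the commutative diagram

$$\Sigma_1 \xrightarrow{\;h_1\;} \Sigma_2 \xrightarrow{\;h_2\;} \Sigma_3, \qquad F_1 = F_2 \circ h_1, \quad F_2 = F_3 \circ h_2,$$

it immediately follows that $(\Sigma_1, F_1) \prec (\Sigma_3, F_3)$.

D 2.4.2. A pair of observables $(\Sigma_1, F_1)$ and $(\Sigma_2, F_2)$ are said to be equivalent if $(\Sigma_1, F_1) \prec (\Sigma_2, F_2)$ and $(\Sigma_2, F_2) \prec (\Sigma_1, F_1)$.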
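The commutativity condition of D 2.4.1 is easy to check mechanically in a finite toy model. The sketch below assumes, purely for illustration, that both Boolean rings are power sets of finite atom sets, that effects are pairs of rationals in $[0, 1]$, and that the homomorphism $h$ is the preimage map of a surjection `pi` from the atoms of the finer ring to the atoms of the coarser one. The names `pullback` and `commutes` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def subsets(atoms):
    """All elements of the power-set Boolean ring over `atoms`."""
    atoms = list(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def measure(atom_effects, subset):
    """Additive measure: sum the effects attached to the atoms in `subset`."""
    dim = len(next(iter(atom_effects.values())))
    total = (Fraction(0),) * dim
    for atom in subset:
        total = tuple(t + x for t, x in zip(total, atom_effects[atom]))
    return total

def pullback(pi, sigma):
    """h(sigma): the preimage of sigma under the atom map pi."""
    return frozenset(a for a, image in pi.items() if image in sigma)

def commutes(F1, F2, pi):
    """Check F1(sigma) == F2(h(sigma)) for every sigma in the coarse ring."""
    return all(measure(F2, pullback(pi, s)) == measure(F1, s)
               for s in subsets(F1))

# Hypothetical fine observable with three atoms (effects sum to (1, 1)).
F2 = {
    0: (Fraction(1, 2), Fraction(0)),
    1: (Fraction(1, 4), Fraction(1, 3)),
    2: (Fraction(1, 4), Fraction(2, 3)),
}
# Coarse observable obtained by merging atoms 0 and 1 into "a".
pi = {0: "a", 1: "a", 2: "b"}
F1 = {"a": (Fraction(3, 4), Fraction(1, 3)), "b": (Fraction(1, 4), Fraction(2, 3))}

# A measure with the two outcomes swapped does not fit the same diagram.
F1_swapped = {"a": F1["b"], "b": F1["a"]}
```

Here `commutes(F1, F2, pi)` holds, so in this model the three-atom observable is more extensive than the two-atom one, whereas the swapped measure fails the test for the same `pi`.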
It is easy to verify the fact that if $(\Sigma_1, F_1) \prec (\Sigma_2, F_2)$ then the following relationships hold: $F_1\Sigma_1 \subset F_2\Sigma_2$ and $\overline{\mathrm{co}}(F_1\Sigma_1) \subset \overline{\mathrm{co}}(F_2\Sigma_2)$. Thus, if two observables are equivalent, then they are convex equivalent (see D 2.3.4).

If $(\Sigma_1, F_1)$ is the kernel observable associated with the observable $(\Sigma, F)$ then $(\Sigma_1, F_1) \prec (\Sigma, F)$ and $(\Sigma_1, F_1)$ is convex-equivalent to $(\Sigma, F)$. Let $(\Sigma_2, F_2) \prec (\Sigma, F)$ and let $(\Sigma_2, F_2)$ be convex-equivalent to $(\Sigma, F)$. Then, from (2.4.1) it follows that $\overline{\mathrm{co}}(F h\Sigma_2) = \overline{\mathrm{co}}(F_2\Sigma_2) = \overline{\mathrm{co}}(F\Sigma)$ and therefore $\partial_e \overline{\mathrm{co}}(F h\Sigma_2) = \partial_e \overline{\mathrm{co}}(F\Sigma)$. From this result, and as a consequence of Th. 2.3.4, it follows that the kernels $N_2$ and $N$ of both observables obey the relation $hN_2 = N$. If $(\Sigma_1, F_1)$ is the kernel observable associated with $(\Sigma, F)$ then $\Sigma_1 \subset h\Sigma_2$. In this sense the kernel observable associated with $(\Sigma, F)$ is the "smallest" observable $(\Sigma_2, F_2)$ which is convex-equivalent to $(\Sigma, F)$ and which satisfies $(\Sigma_2, F_2) \prec (\Sigma, F)$. This result exhibits the exceptional status of kernel observables.

The physical description of the registration process motivates the need to study an additional relationship between observables: Let $b_0 = \bigcup_{i=1}^{n} b_{0i}$ be a decomposition of the registration method $b_0$. Then, according to III, Th. 2.3 we obtain

$$\psi(b_0, b) = \sum_{i=1}^{n} \lambda_i\, \psi(b_{0i}, b_{0i} \cap b), \tag{2.4.3}$$

where $\lambda_i = \lambda_{\mathscr{R}_0}(b_0, b_{0i})$. We may also obtain $\lambda_i$ by means of the function $\mu(w, g)$ as follows: $\lambda_{\mathscr{R}_0}(b_0, b_{0i}) = \lambda_{\mathscr{S}}(a \cap b_0, a \cap b_{0i})$ for any $a$ for which $a \cap b_0 \ne \emptyset$, that is, $\lambda_i = \lambda_{\mathscr{R}_0}(b_0, b_{0i}) = \mu(w, \psi(b_0, b_{0i}))$ holds for all $w$, that is, $\mu(w, \psi(b_0, b_{0i})) = \lambda_i$ is independent of $w$. Thus it follows that

$$\psi(b_0, b_{0i}) = \lambda_i \mathbf{1}. \tag{2.4.4}$$

In equation (2.4.4) we meet, for the first time, an example of a general result (which will be discussed in XVII, §2.3) that we will not obtain any information about a microsystem from an effect of the form $\lambda \mathbf{1}$. 
This is self-evident in the case of (2.4.4) since $b_{0i}$, as well as $b_0$, are, as elements of $\mathscr{R}_0$, statistically independent of the preparation; (2.4.4) is a direct consequence of APS 7.2.

According to the discussions of III, §2 we may make arbitrary direct mixtures of registration methods. The direct mixtures are of great importance for the interpretation of quantum mechanics as well as for the selection
of axioms in the sense of III, §3. In III, §2 we have already noted that, in general, an experimental physicist does not create mixtures of registration methods in an experiment. In general, as far as possible, he makes decompositions.

In making direct mixtures we lose almost nothing in the way of experimental results (see III, §2) provided the weights are not too small. We note, however, that the use of direct mixtures makes experiments more difficult without making even a small improvement with respect to the results. On the other hand, the decomposition of registration procedures, if carried out experimentally by improving the tolerances of the apparatus, represents a refinement of the results, because it then becomes possible to measure the effect processes $(b_{0i}, b_{0i} \cap b)$ instead of only $(b_0, b)$ by means of the improved apparatus (see III, §2). If it is possible to measure the effects $\psi(b_{0i}, b_{0i} \cap b)$ then the measurement of the mixtures $\psi(b_0, b)$ is not interesting because no new information is obtained.

Thus it is reasonable to seek, in mathematical terms, a solution of the problem of decompositions, making the transition from the "special" situation $\mathscr{R}(b_0) \to L$ to the idealized case $\Sigma \xrightarrow{F} L$. For this purpose we see that the decomposition described by (2.4.3) may be characterized in the two following different ways:

(1) We may consider $\psi_0(b) = \psi(b_0, b)$ and $\psi_{0i}(b) = \psi(b_{0i}, b_{0i} \cap b)$ to be additive measures on the same Boolean ring $\mathscr{R}(b_0)$. Then (2.4.3) describes the decomposition of measures on $\mathscr{R}(b_0)$ as follows:

$$\psi_0(b) = \sum_i \lambda_i\, \psi_{0i}(b). \tag{2.4.5}$$

(2) Let us consider the additive measures $\psi_0\colon \mathscr{R}(b_0) \to L$ and $\tilde{\psi}_{0i} = \psi(b_{0i}, b_{0i} \cap b)\colon \mathscr{R}(b_{0i}) \to L$. Let $j_i$ be the injective map of the subset $\mathscr{R}(b_{0i})$ into $\mathscr{R}(b_0)$. Then the following diagram is commutative:

$$\mathscr{R}(b_{0i}) \xrightarrow{\;j_i\;} \mathscr{R}(b_0), \qquad \lambda_i \tilde{\psi}_{0i} = \psi_0 \circ j_i \colon \mathscr{R}(b_{0i}) \to L.$$

These two characterizations suggest the following two idealizations concerning observables:

1'. 
We begin with a single observable $\Sigma \xrightarrow{F} L$ and consider a decomposition of $F$ into additive measures $F_i\colon \Sigma \to L$ of the form

$$F(\sigma) = \sum_{i=1}^{n} \lambda_i F_i(\sigma), \quad\text{where } \lambda_i > 0, \quad \sum_{i=1}^{n} \lambda_i = 1. \tag{2.4.6}$$

The $F_i$ need not be effective measures. It is easy to show that the uniform structure $\mathcal{U}_g$, where the latter is defined by the metric $d(\sigma_1, \sigma_2) = \mu(w_0, F(\sigma_1 + \sigma_2))$
(where $w_0$ is defined in Th. 1.4.1), is finer than the uniform structures $\mathcal{U}_{g_i}$, where the latter are defined by the metrics $d_i(\sigma_1, \sigma_2) = \mu(w_0, F_i(\sigma_1 + \sigma_2))$. Since the $F_i$ are uniformly continuous with respect to the $\mathcal{U}_{g_i}$ (see Th. 1.4.2), they are also uniformly continuous with respect to $\mathcal{U}_g$ and, as is the case for $F$, they are also $\sigma$-additive (because $\sigma$-additivity is equivalent to the condition that $F_i(\sigma_\nu) \to 0$ for decreasing sequences which satisfy $\sigma_\nu \to 0$; see Th. 2.1.3).

2'. Let $\Sigma \xrightarrow{F} L$ and $\Sigma_i \xrightarrow{F_i} L$ $(i = 1, 2, \ldots)$ be observables. In addition, suppose that there exist isomorphic maps $h_i$ of $\Sigma_i$ onto an interval $[0, \eta_i]$ of $\Sigma$, where $\eta_i \wedge \eta_j = 0$ for $i \ne j$ and $\bigvee_i \eta_i = \varepsilon$, for which the following diagrams commute:

$$\Sigma_i \xrightarrow{\;h_i\;} \Sigma, \qquad \lambda_i F_i = F \circ h_i \colon \Sigma_i \to L. \tag{2.4.7}$$

We then say that the observable $\Sigma \xrightarrow{F} L$ is a mixture of the observables $\Sigma_i \xrightarrow{F_i} L$ with weights $\lambda_i$.

From the diagram (2.4.7) it follows that $F h_i(\varepsilon_i) = \lambda_i \mathbf{1}$ (where $\varepsilon_i$ is the unit element of $\Sigma_i$), that is, $F(\eta_i) = \lambda_i \mathbf{1}$ and therefore $\lambda_i > 0$. Since the $\eta_i$ are a decomposition of $\varepsilon$, we obtain $\mathbf{1} = F(\varepsilon) = \sum_i F(\eta_i) = \sum_i \lambda_i \mathbf{1}$, that is, $\sum_i \lambda_i = 1$.

Condition 2' idealizes in a most precise way the decomposition of a $b_0 \in \mathscr{R}_0$ into different $b_{0i} \in \mathscr{R}_0$ as described by condition (2).

From 2' it follows that 1' holds as follows: Let

$$\lambda_i \tilde{F}_i(\sigma) = F(\eta_i \wedge \sigma). \tag{2.4.8}$$

Since $\eta_i \wedge \sigma \le \eta_i$, (2.4.8) defines an additive measure over $\Sigma$. From $\sigma = \bigvee_i (\eta_i \wedge \sigma)$ it follows that $F(\sigma) = \sum_i F(\eta_i \wedge \sigma)$. According to the diagram (2.4.7) and definition (2.4.8) it follows that $F(\sigma) = \sum_i \lambda_i \tilde{F}_i(\sigma)$, which is a decomposition of the form (2.4.6).

The question whether a structure of the form 1' exists is mathematically simpler than the question whether a structure of the form 2' exists. 1' is somewhat weaker than 2' (we have already seen that 1' follows directly from 2'), but physically it is not much weaker, as we shall see with the help of the following theorem:

Th. 2.4.1. Suppose that, for the measure $F$ of an observable $\Sigma \xrightarrow{F} L$ there exists a partition of the form (2.4.6). 
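Then there exists an observable $(\tilde{\Sigma}, \tilde{F}) \succ (\Sigma, F)$, a homomorphism $\Sigma \xrightarrow{h} \tilde{\Sigma}$ and a partition of the unit element $\tilde{\varepsilon}$ of $\tilde{\Sigma}$ of the form $\tilde{\varepsilon} = \bigvee_{i=1}^{n} \tilde{\eta}_i$, where $\tilde{\eta}_i \wedge \tilde{\eta}_j = 0$ for $i \ne j$,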
2 Structures in the Class of Observables 133 such that the following diagrams commute: 2 > 1 /t (i = 1,2,, и), (2.4.9) L wtoere Я;(<т) = й(<т) л . PROOF. As we have already seen (in Y and the subsequent discussion) the Ft are o- additive measures over Z. The sets Jt = {o-1 <7 g Z and Ff(o-) = 0} are complete Boolean subrings of Z; the factor rings Z* = Z/J* are complete Boolean rings. A ^-additive effective measure Zf L is defined by Ft(p) = Ft(cr) for any asps Zf (2.4.10) Therefore we find that the Zf L are observables. Using the Zf = Z/J* we construct the product 2 of the Z*. The elements of 2 are therefore the n-tuples (<rl9 <r2,..., o-J with the ordering (al9 <j2,...) < (ol9 <t2, ...) if and only if ot < dt (i = 1, 2,..., n) (see AI, §3). Thus 2 is complete. Using the Ft defined by (2.4.10) we define the following measure F on 2: <0 = £ лад. £ This measure is, as in the case of the Fi9 сг-additive. Therefore Z —is an observable. Let rjt = (0,0,..., ei9...). Then we obtain the decomposition s = \/i fh of the unit element e of 2 and n F(ou ...,<>•„)= £ F(fj{ л (alt..., ff„)) i = 1 = £ F(0,0,..., о;,...) i = 1 = £ w i = 1 The map oj—►((), 0,..., <7f, 0,...) is a homomorphism ht in the sense of 2' for which the diagram z* 2 is commutative. Thus we have a structure of type 2' between Z* and 2. Using the canonical homomorphism Z-^Z/Jf, from hiki we obtain a homo¬ morphism of Z into 2 for which the diagram
134 IV Coexistent Effects and Coexistent Decompositions is commutative. Let h((j) = V"=i h then defines a homomorphism £ Д £ which satisfies the conditions of Th. 2.4.1. The fact that h is a homomorphism follows directly from h(a) = h2k2(o\ ..., Thus, we obtain 4°) = Ы?) л if, = htki(a) from which it fpllows that (2.4.11) transforms into (2.4.9). It is easy to verify that condition 2' is equivalent to the representation of the lattice 2 as a product which was used in the proof of Th. 2.4.1. Using the rii in 2', for each aelwe may form the decomposition a = V?=i(a л */*•)> which may be rewritten in “component form” a = (а л ri^a л ri2,.. .,v л rjn). The ^ in (2.4.7) are nothing other than isomorphisms by which we may identify the £f with the Boolean ring of the i th component. We may also represent condition 2' as follows: £ permits a “repre¬ sentation” as a product of the £f as follows: <7 = (<Tl5 <T2,..., <7„) where e £*, where F(a) may be represented in terms of the £f —> L in the following form: m = t i— 1 From the diagram (2.4.9) it follows that Find = F{m A nd = Fht(e) = Я,^(е) = A,l. The observable £ —> L in Th. 2.4.1 may be represented as a product а = (а^ал^...5ал^) and F(a) = £ л >/,), i = 1 where the are defined by Ff(ff л i/i) = A i/i). w p Therefore the observable £—>L may be decomposed according to the “refined registration methods” Th. 2.4.1 states that if the measure F for the observable £ Д L satisfies the decomposition (2.4.6), then by the map h this observable can be identified with a portion (namelj with hL c= £) of the observable £ —> L because the decomposition of £ —> L according to the refined registration procedures fjt leads to a partition of the restriction of the measure F on hL (which is identical to F on £) of the form (2.4.6). Therefore a
decomposition of the form (2.4.6) can always be considered to be a decomposition of a "more extensive" observable according to a set of refined registration methods.

Hence, in experimental situations it is desirable to seek those observables for which the measure $F$ permits no proper decompositions of the form (2.4.6). For this purpose we shall now introduce the following definition:

D 2.4.3. An observable $\Sigma \xrightarrow{F} L$ is said to be irreducible if, for any decomposition of $F$ in the following form

$$F(\sigma) = \lambda F_1(\sigma) + (1 - \lambda) F_2(\sigma), \tag{2.4.12}$$

where $0 < \lambda < 1$ and $F_1, F_2$ are additive measures $F_1, F_2: \Sigma \to L$, it follows that $F_1 = F_2 = F$.

It immediately follows that every decision observable is irreducible.

In complete analogy to the set $K(\Sigma)$ which was defined in Th. 2.1.1 we shall now introduce a set $K(\Sigma, L)$ as the set of all $\sigma$-additive measures $F: \Sigma \to L$ on the complete Boolean ring $\Sigma$. Then $K(\Sigma, L)$ is a convex set; the measure $F$ for an irreducible observable is clearly an extreme point of $K(\Sigma, L)$. On the basis of (2.2.1), Th. 2.2.1, and Th. 2.2.2 (see the remarks following Th. 2.2.2) we may also identify $K(\Sigma, L)$ with the set of all mi-morphisms $K(\mathcal{H}_1, \ldots) \to K(\Sigma)$ (not only those which transform effective $w$ into effective measures). An investigation of the mathematical structure of $K(\Sigma, L)$ remains to be done; in particular, the structure of $\partial_e K(\Sigma, L)$ is not yet known. In each case we find that all $F$ for which $\Sigma \xrightarrow{F} G$ belong to $\partial_e K(\Sigma, L)$.

An experimental physicist must strive to measure irreducible observables (or at least "approximately irreducible" observables). We shall not describe here what we mean by "approximately irreducible" observables. For this purpose it is necessary to introduce a uniform structure of physical imprecision in $K(\Sigma, L)$ (for the general structure see [1], §9) with respect to which $K(\Sigma, L)$ is precompact (totally bounded).
We shall now consider the following question: Is irreducibility preserved in the transition to a more extensive observable? This need not be the case; in the following theorem we shall find, however, that if the more extensive observable is reducible, then each of its components is itself more extensive than the irreducible observable.

Th. 2.4.2. Let $\Sigma \xrightarrow{F} L$ be irreducible and let $\tilde\Sigma \xrightarrow{\tilde F} L$ be more extensive than $\Sigma \xrightarrow{F} L$ and reducible in the sense of 2'. Then every component $\tilde\Sigma_i \xrightarrow{\tilde F_i} L$ is also more extensive than $\Sigma \xrightarrow{F} L$.

PROOF. According to 2' and diagram (2.4.7) we may identify $\tilde\Sigma_i$ with $[0, \eta_i] \subset \tilde\Sigma$ and $\tilde F_i$ with $\lambda_i^{-1} \tilde F$. Since $\tilde\Sigma \xrightarrow{\tilde F} L$ is more extensive than $\Sigma \xrightarrow{F} L$ there exists a homomorphism $h$ such that
the diagram (2.4.13), which expresses $F = \tilde F h$, is commutative. The map $h_i(\sigma) = h(\sigma) \wedge \eta_i$ is a homomorphism $\Sigma \to \tilde\Sigma_i = [0, \eta_i] \subset \tilde\Sigma$. On the basis of the identification of $[0, \eta_i]$ with $\tilde\Sigma_i$ we obtain

$$\tilde F_i h_i(\sigma) = \tilde F_i(h(\sigma) \wedge \eta_i) = \lambda_i^{-1} \tilde F(h(\sigma) \wedge \eta_i). \tag{2.4.14}$$

On the other hand, on the basis of (2.4.13) we find that

$$F(\sigma) = \tilde F h(\sigma) = \sum_i \tilde F(h(\sigma) \wedge \eta_i).$$

Since $F$ is irreducible, it follows that $\tilde F(h(\sigma) \wedge \eta_i) = \lambda_i F(\sigma)$. Equation (2.4.14) transforms into

$$F(\sigma) = \tilde F_i h_i(\sigma). \tag{2.4.15}$$

Equation (2.4.15) says that the diagram formed by $\Sigma \xrightarrow{h_i} \tilde\Sigma_i \xrightarrow{\tilde F_i} L$ and $\Sigma \xrightarrow{F} L$ is commutative, and thus we find that $\tilde\Sigma_i \xrightarrow{\tilde F_i} L$ is more extensive than $\Sigma \xrightarrow{F} L$.

Therefore Th. 2.4.2 guarantees that, in the transition from an irreducible observable to a more extensive observable, we may (at least approximately) make use of irreducible observables. In IX, §1 we shall study such transitions for decision observables, which are always irreducible.

If we combine our wish for irreducible observables with the wish for kernel observables, then the "desired" measurements would be irreducible kernel observables. The irreducible kernel observable characterizes, so to say, the "optimum apparatus without unnecessary redundancy" which we would seek to approximate in an experiment.

We may also wish to determine whether it is reasonable to interpret the irreducible kernel observables as "measurements of the structure of microsystems." Are all irreducible kernel observables also decision observables? According to the remarks following D 2.3.6 and the remarks following D 2.4.3 every decision observable is an irreducible kernel observable. If all irreducible kernel observables were also decision observables, then we could justifiably assert that every measurement made by a sufficiently refined apparatus (neglecting unnecessary mixtures of registrations) is traceable to the measurement of decision observables. It should be mentioned that this is indeed the case for "classical" physical systems.
However, in the measurement of microsystems there are irreducible kernel observables which are not decision observables. We shall now exhibit an example:

Let $\varphi, \chi$ be two normalized orthogonal vectors in one of the Hilbert spaces $\mathcal{H}_i$, for example, $\mathcal{H}_1$. Let us consider the following effects:

$$g_1 = \alpha(P_\varphi, 0, 0, \ldots), \qquad g_2 = \alpha(P_\psi, 0, 0, \ldots), \tag{2.4.16}$$
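where

$$\psi = \frac{1}{\sqrt{2}}(\varphi + \chi) \quad \text{and} \quad \alpha = 2 - \sqrt{2}.$$

Later we shall show that $g_1 + g_2 \le 1$ (see below). Thus $g_1, g_2$ are coexistent. Let $\Sigma$ be the Boolean ring consisting of the subsets of a set of three elements. We assign the following measures to the atoms $\eta_1, \eta_2, \eta_3$ of $\Sigma$:

$$\eta_1 \to g_1, \qquad \eta_2 \to g_2, \qquad \eta_3 \to 1 - g_1 - g_2.$$

In this way we determine an observable $\Sigma \xrightarrow{F} L$ for which the remaining three elements of $\Sigma$ (excluding $0$ and $\varepsilon$) correspond to

$$\eta_1 \cup \eta_2 \to g_1 + g_2, \qquad \eta_1 \cup \eta_3 \to 1 - g_2, \qquad \eta_2 \cup \eta_3 \to 1 - g_1.$$

$\Sigma \xrightarrow{F} L$ is a kernel observable with N = T, since $\mathrm{co}(F\Sigma)$ is a three-dimensional parallelepiped: the four points $0, g_1, g_2, g_1 + g_2$ generate a two-dimensional parallelogram. If we add $1 - g_1 - g_2$ to these four points we obtain the remaining four points of the parallelepiped $\mathrm{co}(F\Sigma)$.

Now we need only show that $\Sigma \xrightarrow{F} L$ is irreducible. Let $0 < \lambda < 1$, $F(\sigma) = \lambda F_1(\sigma) + (1 - \lambda) F_2(\sigma)$. For $\sigma = \eta_1$ we obtain the following equation for the components in $\mathcal{H}_1$ (the other components are 0):

$$\alpha P_\varphi = \lambda F_1(\eta_1) + (1 - \lambda) F_2(\eta_1). \tag{2.4.17}$$

For each vector $t \perp \varphi$ it follows from $F_1(\eta_1) \ge 0$, $F_2(\eta_1) \ge 0$ that $\langle t, F_1(\eta_1) t \rangle = 0$. Thus, since $F_1(\eta_1) \ge 0$, we must obtain $F_1(\eta_1) t = 0$. Therefore $F_1(\eta_1) = \varepsilon_1 P_\varphi$. In the same way it follows that $F_1(\eta_2) = \varepsilon_2 P_\psi$. If we then show that $\varepsilon_1 = \varepsilon_2 = \alpha$, we easily obtain $F_1(\sigma) = F(\sigma)$ for all $\sigma \in \Sigma$, that is, $\Sigma \xrightarrow{F} L$ is irreducible.

To this purpose we shall consider all $\varepsilon_1, \varepsilon_2$ for which

$$\eta_1 \to \varepsilon_1(P_\varphi, 0, \ldots), \qquad \eta_2 \to \varepsilon_2(P_\psi, 0, \ldots)$$

determine an additive measure $\Sigma \to L$. This is the case only if $\varepsilon_1 P_\varphi + \varepsilon_2 P_\psi \le 1$ (as an operator in $\mathcal{H}_1$).

Let us consider the operator $\varepsilon_1 P_\varphi + \varepsilon_2 P_\psi$. It is easy to determine the $\varepsilon_1, \varepsilon_2$ for which this operator takes on 1 as an eigenvalue: $\varepsilon_1, \varepsilon_2$ must satisfy the condition

$$\varepsilon_1 + \varepsilon_2 - \tfrac{1}{2} \varepsilon_1 \varepsilon_2 - 1 = 0. \tag{2.4.18}$$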
(For the case in which $\varepsilon_1 = \varepsilon_2$ we obtain the special values $\varepsilon_1 = \varepsilon_2 = \alpha = 2 - \sqrt{2}$.) In the following diagram (Figure 3) we illustrate, by means of shading, the domain in the $(\varepsilon_1, \varepsilon_2)$ plane for which $\varepsilon_1 P_\varphi \ge 0$, $\varepsilon_2 P_\psi \ge 0$ and $\varepsilon_1 P_\varphi + \varepsilon_2 P_\psi \le 1$. It is bounded by the curve (2.4.18). This domain is convex, and $\varepsilon_1 = \varepsilon_2 = \alpha$ is an extreme point. Thus, by (2.4.17) it follows that $\varepsilon_1 = \varepsilon_2 = \alpha$.

[Figure 3: the shaded admissible domain in the $(\varepsilon_1, \varepsilon_2)$ plane, bounded by the curve (2.4.18).]

The reader should show that the observable constructed in (2.4.16) is reducible for $\alpha = \tfrac{1}{2}$ and, in the sense of Th. 2.4.1, can be described as a mixture of two decision observables.

According to Th. 2.4.2, the observable constructed using $\alpha = 2 - \sqrt{2}$ also cannot be "decomposed" by making a transition to a more extensive observable.

If we had an apparatus which measures the observable characterized by (2.4.16) we would need only two yes-no responses, namely, $\eta_1$ and $\eta_2$. These responses are mutually exclusive, that is, they never occur together. The maximum frequency (for an ensemble) for the response $\eta_1$ is not equal to 1 but to $\alpha = 2 - \sqrt{2}$; similarly for $\eta_2$. There are, however, ensembles for which we "always" obtain at least one of the two responses $\eta_1, \eta_2$, since $\alpha(P_\varphi + P_\psi)$ has the eigenvalue 1! The observable cannot be improved.

It is very instructive to clarify the notions of coexistence and observable for this example, and to realize that observables which are not decision observables are not necessarily the result of "poor" or "unskilled" experimentation. In §3 we shall find that the observable described above is complementary to the decision observables $(P_\varphi, 0, \ldots)$ and $(P_\psi, 0, \ldots)$; we shall discuss this fact in more detail in §3.
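The numerical claims of this example can be checked with a small sketch. It assumes that everything takes place in the two-dimensional subspace of $\mathcal{H}_1$ spanned by $\varphi$ and $\chi$, represents the projectors as $2 \times 2$ matrices, and uses NumPy; the function names are illustrative only and do not come from the text.

```python
import numpy as np

# Assumption: restrict to span{phi, chi}; the projectors vanish elsewhere.
phi = np.array([1.0, 0.0])
chi = np.array([0.0, 1.0])
psi = (phi + chi) / np.sqrt(2.0)

def projector(v):
    """Orthogonal projector onto the unit vector v."""
    return np.outer(v, v)

P_phi, P_psi = projector(phi), projector(psi)
alpha = 2.0 - np.sqrt(2.0)

def top_eigenvalue(e1, e2):
    """Largest eigenvalue of e1*P_phi + e2*P_psi (eigvalsh sorts ascending)."""
    return np.linalg.eigvalsh(e1 * P_phi + e2 * P_psi)[-1]

def boundary(e1, e2):
    """Left-hand side of condition (2.4.18)."""
    return e1 + e2 - 0.5 * e1 * e2 - 1.0

# alpha*(P_phi + P_psi) should have largest eigenvalue 1, so g1 + g2 <= 1.
lam_max = top_eigenvalue(alpha, alpha)

# Solve (2.4.18) for e2 and confirm that the eigenvalue 1 is reached on the curve.
curve = []
for e1 in np.linspace(0.0, 1.0, 11):
    e2 = (1.0 - e1) / (1.0 - 0.5 * e1)
    curve.append((e1, e2, top_eigenvalue(e1, e2)))

# For alpha = 1/2 the third effect is an equal mixture of 1 - P_phi and 1 - P_psi,
# i.e. the observable is a mixture of two decision observables.
one = np.eye(2)
g3_half = one - 0.5 * P_phi - 0.5 * P_psi
mixture = 0.5 * (one - P_phi) + 0.5 * (one - P_psi)
```

For $\alpha = \tfrac{1}{2}$ the two decision observables in the mixture are $\{P_\varphi, 0, 1 - P_\varphi\}$ and $\{0, P_\psi, 1 - P_\psi\}$ on the atoms $\eta_1, \eta_2, \eta_3$, each with weight $\tfrac{1}{2}$.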
2.5 Measurement Scales for Observables

In the definition of an observable we have not discussed the notion of a measurement scale. In practical applications measurement scales often (but not always) play an important role. They are always somewhat arbitrary, but they are also very practical. In the case of a well-developed theory, a well-chosen measurement scale can both be very practical and involve some of the physical structures of the theory; see, for example, the observables defined by the infinitesimal transformations of a group (see VII); here a particular scale is preferred. Rectangular coordinates represent a particular selection of scales which is often preferred to arbitrary coordinates in three dimensions. Since a detailed formulation of quantum mechanics has not yet been developed at this point, it is not possible to develop specific measurement scales in this chapter. This fact must be emphasized; otherwise the reader may think that we have already formulated the selection of a measurement scale in mathematical terms.

What do we mean by a measurement scale? According to §2.1 and §2.2, to an observable there exists a space $\mathcal{B}(\Sigma)$ together with the set $K(\Sigma)$ of $\sigma$-additive measures over $\Sigma$. The set $\Sigma$ symbolizes the set of registrations $b \in \mathcal{R}(b_0)$. In general, by a measurement scale we mean a sequence of numbers such that the fact that the measurement values fall within an interval of the scale can be represented by one of the $b \subset b_0$. We shall now replace this intuitive idea by a mathematically correct definition.

D 2.5.1. An element $y \in \mathcal{B}'(\Sigma)$ is called a measurement scale for the complete Boolean ring $\Sigma$.

The expression "random variable" is frequently used instead of "measurement scale." We choose not to use the former expression because the term "random" only obscures the meaning, since it can be associated with the meaning of pure chance.
In physics the concept of pure chance, as a fundamental concept, is without meaning.

We wish to clarify the fact that D 2.5.1 corresponds to the intuitive meaning of a measurement scale described above. At first $y$ is an affine functional on $K(\Sigma)$. In §2.1 we have already said that we may identify $\mathcal{B}'(\Sigma)$ with the set of measurable functions. But, as we mentioned in §2.1, we shall not take this mathematical route to obtain a representation of a Boolean ring. The spectral family defined in D 2.1.4 makes the following definition plausible:

D 2.5.2. Let $\alpha_1 < \alpha_2$; we define $\sigma(y \le \alpha_2) \dotplus \sigma(y \le \alpha_1)$ to be the registration for which the scale value of $y$ lies in the interval $(\alpha_1, \alpha_2]$.

This definition will be justified by the following facts (let $m \in K(\Sigma)$):

$$m(\sigma(y \le \alpha_2) \dotplus \sigma(y \le \alpha_1)) = 1 \;\Rightarrow\; \alpha_1 \le \langle m, y \rangle \le \alpha_2,$$
$$m(\sigma^*(y \le \alpha_2)) = 1 \;\Rightarrow\; \langle m, y \rangle \ge \alpha_2,$$
$$m(\sigma(y \le \alpha_1)) = 1 \;\Rightarrow\; \langle m, y \rangle \le \alpha_1.$$
D 2.5.3. Let $\Sigma(y)$ denote the complete Boolean ring generated by $\{\sigma(y \le \alpha) \mid \alpha \in \mathbf{R}\}$. We shall call $\Sigma(y)$ the ring of registrations generated by the measurement scale $y$.

Therefore it "suffices" to measure the probabilities for all registrations $\sigma(y \le \alpha)$ in order to determine (in principle) the probabilities for all $\sigma \in \Sigma(y)$. Thus, for the purpose of measurement it is "economical" to introduce measurement scales. In the following we shall consider this topic in more detail.

The following concepts will be important in the discussion which follows:

D 2.5.4. The set

$$\mathrm{Sp}(y) = \{\alpha \mid \sigma(y \le \alpha + \varepsilon) \dotplus \sigma(y \le \alpha - \varepsilon) \ne 0 \text{ for all } \varepsilon > 0\}$$

is called the spectrum of the scale $y$. The set

$$\mathrm{Sp}_d(y) = \{\alpha \mid \sigma(y \le \alpha) \dotplus \sigma(y \le \alpha-) \ne 0\}$$

is called the discrete spectrum of the scale $y$. The set

$$\mathrm{Sp}_c(y) = \{\alpha \mid [\sigma(y \le \alpha + \varepsilon) \dotplus \sigma(y \le \alpha)] \vee [\sigma(y \le \alpha - \varepsilon) \dotplus \sigma(y \le \alpha-)] \ne 0 \text{ for all } \varepsilon > 0\}$$

is called the continuous spectrum of the scale $y$.

It is easy to see that $\mathrm{Sp}(y)$ is a closed subset of $\mathbf{R}$. Since for $\alpha \notin \mathrm{Sp}(y)$ there always exists an interval about $\alpha$ for which $\sigma(y \le \alpha + \varepsilon) \dotplus \sigma(y \le \alpha - \varepsilon) = 0$, the registration of a scale value in $(\alpha - \varepsilon, \alpha + \varepsilon]$ is "impossible" (see D 2.5.2). Therefore we shall call $\mathrm{Sp}(y)$ the set of "possible measurement values" for the scale $y$.

Those readers who are already accustomed to the intricacies of quantum mechanics will at first be somewhat surprised to find that such an essential concept as the "spectrum of possible measurement values" apparently depends only on the arbitrary choice of a Boolean ring and a scale $y$. The spectrum of a (conventional) quantum mechanical observable (for example, the energy) exhibits the typical quantum mechanical structure, namely, the structure of discrete measurement values. The following remarks are in order:

(1) The Boolean ring $\Sigma$ of an observable $\Sigma \xrightarrow{F} L$ is not at all "arbitrary." For example, according to Th.
1.4.8, for a decision observable $F$ we may identify $\Sigma$ with the set of decision effects $F\Sigma \subset G$.

(2) The choice of a scale $y$ for a given $\Sigma$ appears to be arbitrary. If, however, we seek to use a scale $y$ for which $\Sigma(y)$ is "as large as possible," for example, $\Sigma(y) = \Sigma$, then the scale $y$ exhibits the structure of $\Sigma$.
One purpose (but not the only one) for the introduction of a measurement scale is to manage with only the set $\{\sigma(y \le \alpha)\}$ instead of the complete Boolean ring. That is, the introduction of a measurement scale corresponds to the "need" to "measure only as much as is necessary," as we have already described above.

(3) It is possible that the choice of a measurement scale for a decision observable does not correspond to the question of the introduction of the most "practical" information about a measurement apparatus. Instead, if the decision observable is defined as an infinitesimal transformation, then the physical meaning of the scale is obtained from the transformations. We may express this fact as follows: The Boolean algebra of the decision observable is generated by the spectral family $E(\lambda)$ of a one-parameter unitary group given by

$$U_\tau = \int e^{i\lambda\tau}\, dE(\lambda) = e^{iA\tau}, \quad \text{where } A = \int \lambda\, dE(\lambda).$$

Thus the $\lambda$-scale is determined by $U_\tau$. Observables which are defined in terms of infinitesimal transformations will be discussed in VII and VIII.

If $\Sigma \xrightarrow{F} L$ is an observable, and $y$ is a measurement scale, then $\Sigma(y) \xrightarrow{F} L$ will be an observable for which $(\Sigma(y), F) \prec (\Sigma, F)$ holds (see D 2.4.1).

D 2.5.5. If $\Sigma \xrightarrow{F} L$ is an observable, and if $y$ is a measurement scale, then we shall call $\Sigma(y) \xrightarrow{F} L$ the partial observable generated by the scale $y$.

We may extend the above concept to a finite collection of scales as follows: Let $\Sigma(y_1, y_2, \ldots, y_n)$ denote the Boolean ring generated by the $\sigma(y_i \le \alpha)$, $i = 1, 2, \ldots, n$. Let $\Sigma(y_1, y_2, \ldots, y_n) \xrightarrow{F} L$ denote the partial observable generated by the scales $y_1, y_2, \ldots, y_n$.

In physics it is customary to select a set of "practical" scales in order to obtain $\Sigma(y_1, y_2, \ldots, y_n) = \Sigma$. Here we shall not consider the case in which $\Sigma(y)$ is "smaller" than $\Sigma$. Such cases can be treated using the methods discussed in §2.1 and §2.2. Instead, we shall consider only the case in which $\Sigma(y_1, \ldots, y_n) = \Sigma$. This is sufficient (see Th.
2.5.6) because, for separable Boolean rings $\Sigma$, there always exists a scale $y$ for which $\Sigma(y) = \Sigma$, and in §2.2 we have seen that we may restrict ourselves to observables, that is, to separable Boolean rings.

Th. 2.5.1. To each separable complete Boolean ring $\Sigma$ there exists a totally ordered subset $A$ such that the smallest complete Boolean ring containing $A$ is equal to $\Sigma$.
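PROOF. Since $\Sigma$ is separable, $\Sigma$ contains a countable dense subset $\Gamma$. Using the elements $\gamma_\nu$ of $\Gamma$ we may recursively define the following totally ordered subsets of $\Sigma$:

$$A_1: \; 0 \le \gamma_1,$$
$$A_2: \; 0 \le \gamma_1 \cdot \gamma_2 \le \gamma_1 \le \gamma_1 \dotplus (\gamma_2 \dotplus \gamma_1) \cdot \gamma_2,$$

and with

$$A_n: \; 0 \le \pi_1 \le \pi_2 \le \cdots \le \pi_\nu \le \pi_{\nu+1} \le \cdots \le \pi_m$$

we set

$$A_{n+1}: \; 0 \le \pi_1 \cdot \gamma_{n+1} \le \pi_1 \le \pi_1 \dotplus (\pi_2 \dotplus \pi_1) \cdot \gamma_{n+1} \le \pi_2 \le \cdots \le \pi_\nu \le \pi_\nu \dotplus (\pi_{\nu+1} \dotplus \pi_\nu) \cdot \gamma_{n+1} \le \pi_{\nu+1} \le \cdots \le \pi_m \le \pi_m \dotplus (\gamma_{n+1} \dotplus \pi_m) \cdot \gamma_{n+1}.$$

We therefore obtain

$$\gamma_{n+1} = \pi_1 \cdot \gamma_{n+1} \dotplus \pi_1 \dotplus [\pi_1 \dotplus (\pi_2 \dotplus \pi_1) \cdot \gamma_{n+1}] \dotplus \pi_2 \dotplus [\pi_2 \dotplus (\pi_3 \dotplus \pi_2) \cdot \gamma_{n+1}] \dotplus \cdots \dotplus \pi_m \dotplus [\pi_m \dotplus (\gamma_{n+1} \dotplus \pi_m) \cdot \gamma_{n+1}].$$

Thus, recursively, we find that the Boolean ring generated by $A_n$ contains all of the elements $\gamma_1, \gamma_2, \ldots, \gamma_n$. Clearly the Boolean ring generated by $\gamma_1, \gamma_2, \ldots, \gamma_n$ contains the set $A_n$. Since $A_{n+1} \supset A_n$ and since each $A_n$ contains a finite number of elements, the set $A = \bigcup_n A_n$ is countable. $A$ is totally ordered, because any two elements of $A$ are, for sufficiently high $n$, elements of one $A_n$. The Boolean ring $\Sigma_A$ generated by $A$ is therefore also countable. Since $\Gamma \subset \Sigma_A$ and $\Gamma$ is dense in $\Sigma$, $\Sigma_A$ is dense in $\Sigma$, that is, the complete Boolean ring generated by $A$ is equal to $\Sigma$.

Th. 2.5.2. To each separable complete Boolean ring $\Sigma$ there exists a totally ordered and closed subset $A$ such that the smallest complete Boolean ring containing $A$ is equal to $\Sigma$.

PROOF. On the basis of Th. 2.5.1 it suffices to show that the closure $\bar A$ of $A$ is totally ordered. Let $\sigma_\nu$, $\sigma'_\mu$ be two sequences for which $\sigma_\nu, \sigma'_\mu \in A$ and $\sigma_\nu \to \sigma$, $\sigma'_\mu \to \sigma'$. We must show that either $\sigma \le \sigma'$ or $\sigma' \le \sigma$. If there exists an $N$ such that, for all $\nu, \mu > N$, the relation $\sigma_\nu \le \sigma'_\mu$ is satisfied, then $\sigma \le \sigma'$. If there exists an $N$ such that, for all $\nu, \mu > N$, the relation $\sigma_\nu \ge \sigma'_\mu$ is satisfied, then $\sigma \ge \sigma'$. If no such $N$ exists, then for each $N$ there are two pairs $\sigma_{\nu_1}, \sigma'_{\mu_1}$ and $\sigma_{\nu_2}, \sigma'_{\mu_2}$ for which $\sigma_{\nu_1} \le \sigma'_{\mu_1}$ and $\sigma_{\nu_2} \ge \sigma'_{\mu_2}$. Thus we obtain four subsequences $\sigma_{\nu_1}, \sigma_{\nu_2}, \sigma'_{\mu_1}, \sigma'_{\mu_2}$. From $\sigma_{\nu_1} \le \sigma'_{\mu_1}$ it follows that, in the limit, $\sigma \le \sigma'$.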
Similarly, from $\sigma_{\nu_2} \ge \sigma'_{\mu_2}$ it follows that, in the limit, $\sigma \ge \sigma'$. Therefore we obtain $\sigma = \sigma'$.

Th. 2.5.3. Let $A$ be a totally ordered subset of $\Sigma$ for which $\varepsilon \in A$. Let $\Sigma_A$ be the smallest Boolean subring of $\Sigma$ containing $A$. Then $\Sigma_A$ is the set of all elements of the form $\sigma = \sum_{\nu=1}^{n} \dotplus\, \sigma_\nu$, where $\sigma_\nu \in A$; the representation of $\sigma$ as a sum of elements of $A$ is unique.
PROOF. Since the product of finitely many elements of $A$ is again an element of $A$ (namely, the smallest of these elements), every $\sigma \in \Sigma_A$ is of the form $\sigma = \sum_{\nu=1}^{n} \dotplus\, \sigma_\nu$ where $\sigma_\nu \in A$. We will now show that this representation is unique. Let

$$\sum_{\nu=1}^{n} \dotplus\, \sigma_\nu = \sum_{\mu=1}^{m} \dotplus\, \sigma'_\mu.$$

Then we obtain

$$\sum_{\nu=1}^{n} \dotplus\, \sigma_\nu \dotplus \sum_{\mu=1}^{m} \dotplus\, \sigma'_\mu = 0.$$

In the last sum let us suppose that the quantities are ordered ($A$ is totally ordered). Then we obtain a sum of the form $\sum_i \dotplus\, \tilde\sigma_i = 0$ with $\tilde\sigma_{i+1} \le \tilde\sigma_i$. We may write this sum as follows:

$$(\tilde\sigma_1 \dotplus \tilde\sigma_2) \dotplus (\tilde\sigma_3 \dotplus \tilde\sigma_4) \dotplus \cdots = 0. \tag{2.5.1}$$

For the $\eta_n = \tilde\sigma_{2n-1} \dotplus \tilde\sigma_{2n}$ $(n = 1, 2, \ldots)$ we obtain $\eta_n \cdot \eta_m = 0$ for $n \ne m$. Thus (2.5.1) takes on the following form:

$$\sum_n \dotplus\, \eta_n = 0. \tag{2.5.2}$$

If we multiply (2.5.2) by $\eta_m$, then it follows that $\eta_m = 0$, that is, $\tilde\sigma_{2n-1} \dotplus \tilde\sigma_{2n} = 0$ for all $n$ and therefore

$$\tilde\sigma_{2n-1} = \tilde\sigma_{2n} \quad \text{for all } n. \tag{2.5.3}$$

Since all $\sigma_\nu$ are different, and the $\sigma'_\mu$ are different, equation (2.5.3) can hold only if the $\sigma_\nu$ and $\sigma'_\mu$ are pairwise identical.

Th. 2.5.4. Let $A$ be a totally ordered and closed subset of a complete Boolean ring $\Sigma$. Then $A$ is compact, and will be mapped by an effective $\sigma$-additive measure $m$ homeomorphically to a closed subset $\omega$ of $[0, 1]$.

PROOF. Let $\sigma_1, \sigma_2 \in A$, $\sigma_1 \ne \sigma_2$. Then either $\sigma_1 > \sigma_2$ or $\sigma_1 < \sigma_2$; thus it follows that $m(\sigma_1 \dotplus \sigma_2) = |m(\sigma_1) - m(\sigma_2)|$ and, since $m$ is effective, $m(\sigma_1 \dotplus \sigma_2) \ne 0$. Since $m$ is $\sigma$-additive and effective, then, according to Th. 2.1.11, the uniform structure $U_g$ on $\Sigma$ (and therefore also on $A$) is defined by the metric $d(\sigma_1, \sigma_2) = m(\sigma_1 \dotplus \sigma_2)$. Since $m(\sigma_1 \dotplus \sigma_2) = |m(\sigma_1) - m(\sigma_2)|$, the uniform structure $U_g$ on $A$ is equal to the initial uniform structure for the map $m: A \to [0, 1]$. Since the image $\omega$ of $A$ is precompact (totally bounded) (AII, §2), $A$ is precompact (totally bounded). However, since $A$ is $U_g$-complete (because it is closed in $\Sigma$), $A$ is compact, and therefore the image $\omega$ of $A$ is a compact subset of $[0, 1]$. Thus $\sigma \to m(\sigma)$ is a homeomorphic map of $A$ onto $\omega$.

It is always possible to include the null and unit elements in $A$.
In the following we shall assume that they have been included. We note that Th. 2.5.3 holds, and that $\omega$ contains 0 and 1.

It is not difficult to construct a spectral family from a totally ordered closed set $A$ in $\Sigma$. To this purpose, we begin by introducing the map $A \to \omega \subset [0, 1]$ which we obtained from Th. 2.5.4.
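The chain construction from the proof of Th. 2.5.1 and the metric identity used in Th. 2.5.4 can be tried out on a finite Boolean ring. The following sketch assumes that $\Sigma$ is the ring of all subsets of a six-element set, with symmetric difference as ring addition and intersection as ring product, and that $m$ is the normalized counting measure; the set, the elements $\gamma_\nu$ and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def refine_chain(chain, gamma):
    """One step A_n -> A_{n+1} of the recursion in the proof of Th. 2.5.1.

    chain: increasing list of frozensets that starts with the empty set (the 0).
    '^' is ring addition (symmetric difference), '&' is the ring product.
    """
    refined = []
    for lo, hi in zip(chain, chain[1:]):
        refined.append(lo)
        refined.append(lo ^ ((hi ^ lo) & gamma))      # pi_v + (pi_{v+1} + pi_v).gamma
    last = chain[-1]
    refined.append(last)
    refined.append(last ^ ((gamma ^ last) & gamma))   # pi_m + (gamma + pi_m).gamma
    # The list is non-decreasing, so repeated elements are adjacent; drop them.
    out = []
    for s in refined:
        if not out or s != out[-1]:
            out.append(s)
    return out

def build_chain(gammas):
    """Totally ordered set generated step by step from gamma_1, gamma_2, ..."""
    chain = [frozenset()]
    for g in gammas:
        chain = refine_chain(chain, frozenset(g))
    return chain

def ring_sums(chain):
    """All finite ring sums of nonzero chain elements (cf. Th. 2.5.3)."""
    nonzero = [s for s in chain if s]
    sums = set()
    for r in range(len(nonzero) + 1):
        for combo in combinations(nonzero, r):
            total = frozenset()
            for s in combo:
                total ^= s
            sums.add(total)
    return sums

# Hypothetical finite example.
X = frozenset(range(6))
gammas = [frozenset({0, 1, 2}), frozenset({2, 3}), frozenset({1, 4, 5})]
chain = build_chain(gammas)
sums = ring_sums(chain)

def m(s):
    """Normalized counting measure; it is effective (nonzero on nonempty sets)."""
    return Fraction(len(s), len(X))
```

In this example every $\gamma_\nu$ appears among the ring sums of chain elements, distinct selections of nonzero chain elements give distinct sums, and $m(\sigma_1 \dotplus \sigma_2) = |m(\sigma_1) - m(\sigma_2)|$ holds on the chain, which is what makes $m$ injective and order preserving there.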
Th. 2.5.5. Let $\tau(\lambda) = \sup\{\alpha \mid 0 \le \alpha \le \lambda \text{ and } \alpha \in \omega\}$; $\tau(\lambda)$ is an upper continuous map defined on the interval $[0, 1]$ which increases from 0 to 1. The map $\sigma(\lambda)$, defined by $\sigma \in A$ for which $m(\sigma) = \tau(\lambda)$, is a spectral family for which $\sigma(0) = 0$ and $\sigma(1) = \varepsilon$. The set $\{\sigma(\lambda)\}$ is equal to $A$.

PROOF. From the definition it immediately follows that $\tau(\lambda)$ is increasing and that $\tau(0) = 0$, $\tau(1) = 1$ since $0 \in \omega$ and $1 \in \omega$. From $\tau(\lambda) \le \lambda$ it follows that $\tau(\lambda + \varepsilon) \le \lambda + \varepsilon$. Therefore, if $\lambda \in \omega$ it follows that $\tau(\lambda+) = \lambda = \tau(\lambda)$. If $\lambda \notin \omega$, then the subset $\{\alpha \mid \alpha > \lambda \text{ and } \alpha \in \omega\}$ has an infimum $\beta$ such that $\beta \in \omega$, since $\omega$ is closed. We obtain $\beta > \lambda$ since $\lambda \notin \omega$. For an $\varepsilon$ satisfying $0 < \varepsilon < \beta - \lambda$ it follows that $\tau(\lambda + \varepsilon) = \tau(\lambda)$ and therefore $\tau(\lambda+) = \tau(\lambda)$. Therefore $\tau(\lambda)$ is continuous from above.

Since the values of the function $\tau(\lambda)$ lie in $\omega$, and $A$ is homeomorphically mapped onto $\omega$ by $m$, $m^{-1}\tau$ is a map of $[0, 1]$ into $A$ which we shall denote by $\sigma(\lambda)$. Therefore $\sigma(\lambda)$ is uniquely defined by $m(\sigma(\lambda)) = \tau(\lambda)$. Thus we obtain $\sigma(0) = 0$ and $\sigma(1) = \varepsilon$. Since $m^{-1}$ is increasing and upper continuous, $\sigma(\lambda) = m^{-1}\tau(\lambda)$ is therefore a spectral family. Since, for $\lambda \in \omega$, the relation $\tau(\lambda) = \lambda$ holds, we obtain $\{\sigma(\lambda)\} = A$.

Th. 2.5.6. There exists a measurement scale $y$ such that $\Sigma(y) = \Sigma$.

PROOF. Choose the totally ordered and closed set $A = \bar A$ according to Th. 2.5.2. Then, according to Th. 2.5.5, we obtain a spectral family $\{\sigma(\lambda)\}$ where $\Sigma$ is the smallest complete Boolean ring containing $\{\sigma(\lambda)\}$. The linear functional $y$ obtained from the spectral family by (2.1.9) therefore satisfies the relation $\Sigma(y) = \Sigma$ (on the basis of Th. 2.1.15).

Th. 2.5.1 to Th. 2.5.6 show that it is sufficient to consider the spectral family $\sigma(\lambda)$ and the corresponding functionals $y \in \mathcal{B}'(\Sigma)$. A spectral family $\sigma(\lambda)$ is a totally ordered subset of $\Sigma$ but is not necessarily closed. We obtain the closure $\overline{\{\sigma(\lambda)\}}$ of $\{\sigma(\lambda)\}$ by adding all of the limit points $\sigma(\lambda-)$ to $\{\sigma(\lambda)\}$.
Theorem 2.5.3 is also applicable to nonclosed totally ordered subsets of $\Sigma$. Let $A = \{\sigma(\lambda)\}$, that is, $A$ is a spectral family, and $y$ is its corresponding unique measurement scale. It is easy to obtain a representation of $\Sigma_A$ in terms of subsets of the real axis.

Th. 2.5.7. Let $A$ be the spectral family of $y$ and let $\mathrm{Sp}(y)$ be the spectrum of $y$. Then the bijection $\sigma(\lambda) \leftrightarrow (-\infty, \lambda] \cap \mathrm{Sp}(y)$ of $A$ into subsets of the real axis can be uniquely extended to an isomorphism of $\Sigma_A$ to the Boolean ring of sets $P_j$ generated by the set $\{(-\infty, \lambda] \cap \mathrm{Sp}(y) \mid \lambda \in \mathbf{R}\}$.

PROOF. The proof is obtained directly from Th. 2.5.3. It is only necessary to order the elements in the sum $\sum_{\nu=1}^{n} \dotplus\, \sigma(\lambda_\nu)$ such that $\lambda_{\nu+1} < \lambda_\nu$. Then we consider

$$\sum_{\nu=1}^{n} \dotplus\, \sigma(\lambda_\nu) = [\sigma(\lambda_1) \dotplus \sigma(\lambda_2)] \vee [\sigma(\lambda_3) \dotplus \sigma(\lambda_4)] \vee \cdots.$$

Each bracket, for example, $[\sigma(\lambda_3) \dotplus \sigma(\lambda_4)]$, corresponds to an interval $(\lambda_4, \lambda_3] \cap \mathrm{Sp}(y)$. The elements of $P_j$ are unions of finitely many such intervals $(\alpha, \beta] \cap \mathrm{Sp}(y)$.
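The correspondence of Th. 2.5.7 can be made concrete for a finite scale. The sketch below assumes a hypothetical set of four atomic registrations with assigned scale values; it builds $\sigma(\lambda)$, the spectrum and the discrete spectrum of D 2.5.4, and the sets $(-\infty, \lambda] \cap \mathrm{Sp}(y)$ and $(\lambda_4, \lambda_3] \cap \mathrm{Sp}(y)$. All names and values are illustrative.

```python
# Hypothetical finite example: four registrations (atoms) with scale values.
scale = {"a": -1.0, "b": 0.5, "c": 0.5, "d": 2.0}

def sigma(lam):
    """sigma(y <= lam): all atoms whose scale value does not exceed lam."""
    return frozenset(x for x, v in scale.items() if v <= lam)

def sigma_minus(lam):
    """sigma(y <= lam-): limit from below, atoms with scale value below lam."""
    return frozenset(x for x, v in scale.items() if v < lam)

# In a finite ring the spectrum is the set of scale values that actually occur,
# and every point of it is a jump of the spectral family (D 2.5.4).
spectrum = sorted(set(scale.values()))
discrete_spectrum = [a for a in spectrum if sigma(a) ^ sigma_minus(a)]

def half_line(lam):
    """(-inf, lam] intersected with Sp(y)."""
    return frozenset(v for v in spectrum if v <= lam)

def interval(lam_lo, lam_hi):
    """(lam_lo, lam_hi] intersected with Sp(y)."""
    return frozenset(v for v in spectrum if lam_lo < v <= lam_hi)

def scale_values(element):
    """Image of a ring element under the map of Th. 2.5.7."""
    return frozenset(scale[x] for x in element)

def bracket(lam_hi, lam_lo):
    """sigma(lam_hi) + sigma(lam_lo), with '^' as ring addition."""
    return sigma(lam_hi) ^ sigma(lam_lo)
```

For instance, `scale_values(bracket(1.0, -1.0))` and `interval(-1.0, 1.0)` describe the same subset of the spectrum, which is the statement that each bracket corresponds to a half-open interval intersected with $\mathrm{Sp}(y)$.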
In practice, a representation of $\Sigma(y)$ in terms of subsets of $\mathbf{R}$ which is not one-to-one is more commonly used. Such a representation is obtained as follows:

For $\lambda_2 < \lambda_1$ let the interval $I = (\lambda_2, \lambda_1]$ correspond to the element $\sigma(I) = \sigma(\lambda_1) \dotplus \sigma(\lambda_2)$. Let $k$ be an arbitrary subset of $\mathbf{R}$. We define a covering $u$ of $k$ to be a denumerable set of such intervals $I_\nu$ for which $k \subset \bigcup_\nu I_\nu$. To each subset $k \subset \mathbf{R}$ there is a corresponding element $\bar\sigma(k)$ of $\Sigma$ defined by $\bar\sigma(k) = \bigwedge_u \bigvee_\nu \sigma(I_\nu)$.

If $u$ is a covering of $k$ and $v$ is a covering of $\mathbf{R} - k$ then $u \cup v$ is a covering of all of $\mathbf{R}$. It is easy to show that, for every covering of $\mathbf{R}$, the relation $\bigvee_\nu \sigma(I_\nu) = \varepsilon$ holds. Thus it follows that $\bar\sigma(k) \vee \bar\sigma(\mathbf{R} - k) = \varepsilon$. We shall call a set $k$ "measurable" if $\bar\sigma(k) \wedge \bar\sigma(\mathbf{R} - k) = 0$.

Let $P$ denote the set of all "measurable" subsets of $\mathbf{R}$. We shall show that $P$ is a $\sigma$-complete Boolean ring of sets. In general $P$ need not be a complete Boolean ring! A Boolean ring $P$ of sets is said to be $\sigma$-complete if the union and intersection of countably many sets in $P$ is an element of $P$.

Th. 2.5.8. The set $P$ described above is a $\sigma$-complete Boolean ring of sets for which $\mathbf{R} \in P$. The map $k \to \bar\sigma(k)$: $P \to \Sigma(y)$ is a surjective $\sigma$-homomorphism of $P$ onto $\Sigma(y)$, that is, a homomorphism in which $P$ and $\Sigma(y)$ are $\sigma$-complete Boolean rings. Let $J$ be the kernel of this map, that is, $J = \{k \mid k \in P, \bar\sigma(k) = 0\}$. Then the mapping $P \to \Sigma(y)$ may be expressed in canonical form as follows:

$$P \to P/J \to \Sigma(y),$$

where $P/J \to \Sigma(y)$ is an isomorphism. In particular, $P/J$ is also a complete Boolean ring. Every interval $I = (\lambda_2, \lambda_1]$ is an element of $P$; in particular, we obtain $\bar\sigma(I) = \sigma(\lambda_1) \dotplus \sigma(\lambda_2)$.

PROOF. In order to prove that $P$ is a $\sigma$-complete Boolean ring of sets it suffices to show that if $k \in P$ then $\mathbf{R} - k \in P$ and, in addition, that if $\{k_\nu\}$, $k_\nu \in P$, is a countable set, then $\bigcup_\nu k_\nu \in P$, because $\bigcap_\nu k_\nu = \mathbf{R} - \bigcup_\nu (\mathbf{R} - k_\nu)$.
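If $k \in P$, then from the definition it follows that $\bar\sigma(k) \wedge \bar\sigma(\mathbf{R} - k) = 0$ and therefore $\mathbf{R} - k \in P$ and $\bar\sigma(\mathbf{R} - k) = \bar\sigma^*(k)$. We will now show that, if $k_\nu \in P$, then

$$\bigcup_\nu k_\nu \in P \quad \text{and} \quad \bar\sigma\Big(\bigcup_\nu k_\nu\Big) = \bigvee_\nu \bar\sigma(k_\nu).$$

Since $\mathbf{R} - \bigcup_\nu k_\nu = \bigcap_\nu (\mathbf{R} - k_\nu)$, from the definition of $\bar\sigma$ it easily follows that $\bar\sigma(\mathbf{R} - \bigcup_\nu k_\nu) \le \bar\sigma(\mathbf{R} - k_\nu)$ for all $\nu$, that is,

$$\bar\sigma\Big(\mathbf{R} - \bigcup_\nu k_\nu\Big) \le \bigwedge_\nu \bar\sigma(\mathbf{R} - k_\nu) = \bigwedge_\nu \bar\sigma^*(k_\nu).$$

Since $\bar\sigma(\bigcup_\nu k_\nu) \vee \bar\sigma(\mathbf{R} - \bigcup_\nu k_\nu) = \varepsilon$ we obtain $\bar\sigma^*(\mathbf{R} - \bigcup_\nu k_\nu) \le \bar\sigma(\bigcup_\nu k_\nu)$, that is,

$$\bar\sigma\Big(\bigcup_\nu k_\nu\Big) \ge \Big[\bigwedge_\nu \bar\sigma^*(k_\nu)\Big]^* = \bigvee_\nu \bar\sigma(k_\nu).$$

If we show that $\bar\sigma(\bigcup_\nu k_\nu) \le \bigvee_\nu \bar\sigma(k_\nu)$ then we obtain $\bar\sigma(\bigcup_\nu k_\nu) \wedge \bar\sigma(\mathbf{R} - \bigcup_\nu k_\nu) = 0$, that is, $\bigcup_\nu k_\nu \in P$ and $\bar\sigma(\bigcup_\nu k_\nu) = \bigvee_\nu \bar\sigma(k_\nu)$.

Let $U_\nu$ be a covering of $k_\nu$. Then $V = \bigcup_\nu U_\nu$ is a covering of $\bigcup_\nu k_\nu$ because $V$ is, like the $U_\nu$, a countable set of intervals. Therefore we obtain

$$\bar\sigma\Big(\bigcup_\nu k_\nu\Big) \le \bigwedge_{V = \bigcup_\nu U_\nu} \; \bigvee_{\nu, \mu} \sigma(I_\mu^{(\nu)}),$$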
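where the $I_\mu^{(\nu)}$ are, for fixed $\nu$, the intervals of $U_\nu$, and $\bigwedge$ is taken over all $V$ of the form $V = \bigcup_\nu U_\nu$. We will now show that the $U_\nu$ may be chosen such that $\bigvee_\nu \bigvee_\mu \sigma(I_\mu^{(\nu)})$ will be "arbitrarily close" to $\bigvee_\nu \bar\sigma(k_\nu)$, that is, with the metric $d$ from $\Sigma$, for every $\varepsilon > 0$ we may make

$$d\Big(\bigvee_\nu \bigvee_\mu \sigma(I_\mu^{(\nu)}), \; \bigvee_\nu \bar\sigma(k_\nu)\Big) < \varepsilon$$

for a suitable choice of the $U_\nu$.

For a covering $U = \{I_\mu\}$ we shall use the following abbreviation:

$$\sigma_U = \bigvee_\mu \sigma(I_\mu).$$

We first show some relations for a subset $k$: If $U_1 = \{I_\mu^{(1)}\}$ and $U_2 = \{I_\rho^{(2)}\}$ are two coverings of $k$, it then follows immediately that $V = \{I_\mu^{(1)} \cap I_\rho^{(2)}\}$ is a covering of $k$. From this result we obtain

$$\sigma_V = \bigvee_{\mu, \rho} \sigma(I_\mu^{(1)} \cap I_\rho^{(2)}) = \bigvee_{\mu, \rho} \sigma(I_\mu^{(1)}) \wedge \sigma(I_\rho^{(2)}) = \Big[\bigvee_\mu \sigma(I_\mu^{(1)})\Big] \wedge \Big[\bigvee_\rho \sigma(I_\rho^{(2)})\Big],$$

from which we obtain $\sigma_V = \sigma_{U_1} \wedge \sigma_{U_2}$. Similarly, for a finite number of coverings $U_1, U_2, \ldots, U_n$ there exists a covering $V$ such that $\sigma_V = \bigwedge_{\nu=1}^{n} \sigma_{U_\nu}$.

According to Th. 2.1.1, from $\bar\sigma(k) = \bigwedge_U \sigma_U$ it follows that there is a sequence $U_\nu$ of coverings such that for $\tilde\sigma_n = \bigwedge_{\nu=1}^{n} \sigma_{U_\nu}$ the relationships $\bar\sigma(k) = \bigwedge_n \tilde\sigma_n$ and $\tilde\sigma_n \to \bar\sigma(k)$ hold. Since, to $U_\nu$, $\nu = 1, 2, \ldots, n$, there exists a covering $V_n$ such that $\sigma_{V_n} = \bigwedge_{\nu=1}^{n} \sigma_{U_\nu}$, there exists a sequence of coverings $V_n$ for which $\sigma_{V_n}$ is decreasing and converges towards $\bar\sigma(k)$, and satisfies $\bar\sigma(k) = \bigwedge_n \sigma_{V_n}$.

For a given $\varepsilon > 0$ we may therefore choose such a covering $U_\nu$ for each $k_\nu$ such that

$$d(\sigma_{U_\nu}, \bar\sigma(k_\nu)) < \frac{\varepsilon}{2^\nu}.$$

Since

$$\bar\sigma(k_\nu) \le \sigma_{U_\nu} \quad \text{and} \quad \bigvee_\nu \sigma_{U_\nu} \dotplus \bigvee_\nu \bar\sigma(k_\nu) \le \bigvee_\nu \big[\sigma_{U_\nu} \dotplus \bar\sigma(k_\nu)\big],$$

we obtain

$$d\Big(\bigvee_\nu \sigma_{U_\nu}, \; \bigvee_\nu \bar\sigma(k_\nu)\Big) \le \sum_\nu d(\sigma_{U_\nu}, \bar\sigma(k_\nu)) < \varepsilon \sum_{\nu=1}^{\infty} \frac{1}{2^\nu} = \varepsilon.$$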
Thus it follows that

$$\bigwedge_{V = \bigcup_\nu U_\nu} \; \bigvee_{\nu, \mu} \sigma(I_\mu^{(\nu)}) = \bigvee_\nu \bar\sigma(k_\nu),$$

from which we have proven that

$$\bar\sigma\Big(\bigcup_\nu k_\nu\Big) \le \bigvee_\nu \bar\sigma(k_\nu).$$

With this result we have also proven that the map $P \to \Sigma(y)$ is a $\sigma$-homomorphism. The fact that the map $P \to \Sigma(y)$ is surjective follows from Th. 2.1.1 and Th. 2.1.2 because the elements of $\Sigma(y)$ may be obtained from $\Sigma_A$ by the join $\vee$ and meet $\wedge$ of denumerably many elements. $P/J \to \Sigma(y)$ is therefore an order isomorphism and therefore $P/J$ is isomorphic to $\Sigma(y)$, that is, $P/J$ is a complete Boolean ring. It is easy to see that $I = (\lambda_2, \lambda_1]$ is an element of $P$ and that $\bar\sigma(I) = \sigma(\lambda_1) \dotplus \sigma(\lambda_2)$.

If $\Sigma(y)$ is equal to $\Sigma$ then we have obtained a representation of $\Sigma$ in terms of subsets of scale values of the measurement scale $y$. For $k \in P$, $\bar\sigma(k)$ is the "registration" for which the scale value of $y$ lies in the set $k$. It directly follows that $\mathbf{R} \setminus \mathrm{Sp}(y) \in J$, that is, only scale values of the spectrum of $y$ will be registered.

It is easy to verify that the discrete spectrum $\mathrm{Sp}_d$ is the set of those values $\lambda$ for which $\bar\sigma(k) \ne 0$ with $k$ as the set consisting only of the single point $\lambda$. The set of these $\bar\sigma(k)$ is equal to the set of atoms in $\Sigma(y)$.

It is not difficult to extend Th. 2.5.8 to the case of finitely many scales $y_1, \ldots, y_n$ and $\Sigma(y_1, y_2, \ldots, y_n)$. $P$ is then a $\sigma$-complete Boolean ring of sets in the $n$-dimensional space $\mathbf{R}^n$. In physics we frequently choose $n$ scales (instead of one) for which $\Sigma(y_1, y_2, \ldots, y_n) = \Sigma$, because $n$ scales often prove to be both practical and theoretically useful (in specific situations which we will discuss later, for example, in VII and VIII).

Here we have treated the problem of measurement scales in terms of a structure which may be "added" to the structure of a complete Boolean ring $\Sigma$. Thus we see that the measurement scale is not necessarily involved in the concept of an observable.
In this way it is clear that the measurement scale is nothing other than a preferred method which permits an overview of $\Sigma$ (or, in particular, of $\mathcal{R}(b_0)$). $\mathcal{R}(b_0)$ is of primary interest for experiment; the measurement scale for the apparatus $b_0$ is only a very practical tool, or may have an additional physical meaning concerning transformations (see the discussion of (3) above).

In D 2.5.4 we have defined different components of the spectrum $\mathrm{Sp}(y)$ for the measurement scale $y$. If $m$ is a $\sigma$-additive effective measure on $\Sigma$ then, for $k \in P$, $\mu(k) = m(\bar\sigma(k))$ defines a $\sigma$-additive measure $\mu$ on $P$ for which the sets of $\mu$-measure 0 coincide with the elements of $J$. $(\mathbf{R}, P, \mu)$ is then a so-called measure space. We may separate $\mathrm{Sp}(y)$ into three disjoint components: $k_d = \mathrm{Sp}_d(y)$, $k_{cc}$ and $k_{sc}$, where $\mu_d(k) = \mu(k \cap k_d)$ is a discrete measure, $\mu_{cc}(k) = \mu(k \cap k_{cc})$ is absolutely continuous with respect to Lebesgue measure, and $\mu_{sc}(k) = \mu(k \cap k_{sc})$ is singular with respect to Lebesgue
measure, that is, the Lebesgue measure of $k_{sc}$ is 0. This decomposition of $\mathrm{Sp}(y)$ does not depend on the $\sigma$-additive effective measure $m$ on $\Sigma$.

Let $\{\sigma(\lambda)\}$ be a spectral family. Then, by the map $F: \Sigma \to L$ which corresponds to an observable, there exists a totally ordered subset $\{F(\lambda)\}$ of $L$ defined by $F(\lambda) = F(\sigma(\lambda))$. Therefore $F(\lambda)$ is an increasing function $\mathbf{R} \to L$ which is continuous from above: $F(\lambda+) = F(\lambda)$. We shall then call $F(\lambda)$ a spectral family of effects.

Earlier we have constructed a $\sigma$-complete Boolean ring of sets $P$ using the spectral family $\{\sigma(\lambda)\}$, and we have seen that $P/J$ is isomorphic to $\Sigma$. The construction of $P$ is, however, possible without knowledge of $\sigma(\lambda)$, if we make use of the spectral family of effects $\{F(\lambda)\}$.

We may proceed by using any totally ordered subset $l \subset L$. The closure $\bar l$ of $l$ in the $\sigma(\mathcal{B}', \mathcal{B})$-topology is totally ordered. This fact may be proven analogously to the proof of Th. 2.5.2. If $W$ is an effective ensemble from $K(\mathcal{H}_1, \ldots)$ then $\bar l$ will be mapped injectively into $[0, 1]$ by the map $F \to \mu(W, F)$, in complete analogy to Th. 2.5.4. By introducing the parameter $\tau$ from Th. 2.5.5 we may obtain a spectral family of effects from $\bar l$; thus, it suffices to start with such a spectral family of effects.

If we are given a spectral family of effects $\{F(\lambda)\}$ then we may use the methods leading to Th. 2.5.8 to obtain an analogous result. For the interval $I = (\lambda_2, \lambda_1]$ we define $F(I) = F(\lambda_1) - F(\lambda_2)$. Therefore we obtain $F(I) \in L$. Furthermore, let $F(k) = \inf_U \big(\sum_\nu F(I_\nu)\big)$. This infimum exists because the set of $F_U = \sum_\nu F(I_\nu)$ is a lower-directed set (here we have used the following result from Th. 2.5.8: if $U_1 = \{I_\mu^{(1)}\}$ and $U_2 = \{I_\rho^{(2)}\}$ are coverings of $k$, then $V = \{I_\mu^{(1)} \cap I_\rho^{(2)}\}$ is a covering of $k$). We obtain $F(k) + F(\mathbf{R} \setminus k) \ge 1$. A set $k$ is said to be measurable if $F(\mathbf{R} \setminus k) = 1 - F(k)$. Let $P$ denote the set of measurable $k$. Clearly $P$ is a $\sigma$-complete Boolean ring and $k \to F(k)$ is a $\sigma$-additive measure.
Let $J$ be the set of all $k$ having $F$-measure 0. Then $\Sigma_0 = P/J$ is a complete Boolean ring, and $\Sigma_0$ together with the map $F: \Sigma_0 \to L$ defined by $F: P \to P/J \to L$ is an observable. For the special case in which $F(\lambda)$ is determined by $F(\sigma(\lambda))$, $P$ and $J$ are identical to the sets $P$, $J$ of Th. 2.5.8. That is, by using $\Sigma_0$ we may recover the Boolean ring $\Sigma$ and the mapping $F: \Sigma \to L$ from which we obtained the spectral families $\{\sigma(\lambda)\}$ and $\{F(\lambda) = F(\sigma(\lambda))\}$.

In this sense each totally ordered subset of $L$ uniquely defines an observable, and each observable may be obtained in this way. Therefore it is not surprising that we often use a spectral family of effects instead of the total observable. The spectral family of effects not only determines the observable but also determines the measurement scale $y$ of the observable by means of (2.1.9b), where $\Sigma_0 = P/J$ replaces $\Sigma$ and $\rho(\alpha) \in \Sigma_0$ is the class of elements of $P$ which belong to the interval $(-\infty, \alpha]$.

For $k \in P$ we find that $\mu(W, F(k))$ is equal to the probability of obtaining a scale value of $y$ from the set $k$ in the case in which the observable $\Sigma_0 \xrightarrow{F} L$ was measured and the ensemble $W$ was prepared. It is, however, not yet clear how we may obtain apparatuses which measure the desired observable $\Sigma \xrightarrow{F} L$ and prepare the desired ensemble $W$. These problems will be discussed first in §4 and §7 and then later in XVII and XVIII.
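As an illustration of how a spectral family of effects yields probabilities for scale intervals, the following sketch reuses the three effects of the example (2.4.16) in the two-dimensional subspace spanned by $\varphi$ and $\chi$. The scale values $-1, 0, 1$ assigned to $\eta_1, \eta_2, \eta_3$, the ensemble $W$ and all function names are hypothetical; the probability is computed as $\mu(W, F) = \mathrm{tr}(WF)$.

```python
import numpy as np

# Effects of example (2.4.16), restricted to span{phi, chi} (an assumption).
phi = np.array([1.0, 0.0])
chi = np.array([0.0, 1.0])
psi = (phi + chi) / np.sqrt(2.0)
alpha = 2.0 - np.sqrt(2.0)

g1 = alpha * np.outer(phi, phi)
g2 = alpha * np.outer(psi, psi)
g3 = np.eye(2) - g1 - g2

# Hypothetical scale: eta_1 -> -1, eta_2 -> 0, eta_3 -> +1.
scale_values = [-1.0, 0.0, 1.0]
effects = [g1, g2, g3]

def F(lam):
    """Spectral family of effects: sum of all effects with scale value <= lam."""
    selected = (g for v, g in zip(scale_values, effects) if v <= lam)
    return sum(selected, np.zeros((2, 2)))

def F_interval(lam2, lam1):
    """Effect F(I) = F(lam1) - F(lam2) for the interval I = (lam2, lam1]."""
    return F(lam1) - F(lam2)

def probability(W, effect):
    """mu(W, F) = tr(W F) for a density matrix W."""
    return float(np.trace(W @ effect).real)

# Hypothetical ensemble: the pure state chi.
W = np.outer(chi, chi)
p_minus = probability(W, F_interval(-1.5, -0.5))   # scale value -1
p_zero = probability(W, F_interval(-0.5, 0.5))     # scale value 0
p_plus = probability(W, F_interval(0.5, 1.5))      # scale value +1
```

The three interval probabilities add up to $\mathrm{tr}(W) = 1$ because $F(\lambda)$ increases from $0$ to $\mathbf{1}$ along the scale.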
For decision observables the above relationships between $\Sigma$, the spectral family $\{\sigma(\lambda)\}$ and the measure $E: \Sigma \to G$ are somewhat simplified. According to Th. 1.4.8, $\Sigma$ may be directly identified with a subset of $G$, so that all results from Th. 2.5.1 to Th. 2.5.8 are applicable to decision observables provided that we consider $\Sigma$ as a subset of $G$. In particular, the map $P \to \Sigma(y)$ is also a map of $P$ into $G$. Therefore spectral families of decision effects uniquely determine a decision observable with scale, and $\Sigma$ may be chosen as the complete orthocomplemented sublattice of $G$ generated by the spectral family $\{E(\lambda)\}$, where the identity map of $\Sigma$ onto itself is the measure $\Sigma \to G$ of the observable. The scale $y$ corresponding to the spectral family is determined by the spectral integral of the form (2.5.5).

For decision observables the maps $S$ and $S'$ have the special properties described in §2.2.

Th. 2.5.9. For a decision observable $S'$ is injective and $SK(\mathcal{H}_1, \ldots) = K(\Sigma)$. If we identify $\Sigma$ with a subset of $G$, then we obtain $S'^{-1}$ from the spectral representation of the operator $A = S'y \in \mathcal{B}'(\mathcal{H}_1, \ldots)$:

$$A = \int \lambda \, dE(\lambda), \tag{2.5.4}$$

where $S'^{-1}A = y$ has the form

$$y = \int \lambda \, dE(\lambda) \quad \text{with } E(\lambda) \in \Sigma. \tag{2.5.5}$$

PROOF. We begin by noting that equations (2.5.4) and (2.5.5) are valid as a limit (in norm) in $\mathcal{B}'(\mathcal{H}_1, \ldots)$ and $\mathcal{B}'(\Sigma)$, respectively. According to Th. 2.1.15 each $y \in \mathcal{B}'(\Sigma)$ may be written in the form (2.5.5). Since $S'E = E$ for $E \in \Sigma \subset G$ and since $S'$ is continuous in norm, it follows that $S'y$ is equal to $A$. Since both spectral representations (2.5.4) and (2.5.5) are unique (for the operator $A$, see AIV, §8 and §15, and for $y$, Th. 2.1.15) we therefore find that $S'$ is injective and that $S'^{-1}(A)$ (where $A$ is defined for $E(\lambda) \in \Sigma \subset G$ by (2.5.4)) is equal to $y$ by (2.5.5).

$SK(\mathcal{H}_1, \ldots) = K(\Sigma)$ follows from Th. 2.1.11 as follows: each $\tilde{m} \in K(\Sigma)$ may, according to Th. 2.1.11, be approximated (in norm) arbitrarily well by $\|\chi_{\delta,N}\|^{-1} \chi_{\delta,N}$. For an arbitrary effective $w \in K(\mathcal{H}_1, \ldots)$, $m = Sw$ is an effective $m \in K(\Sigma)$. For this $m$, $\chi_{\delta,N}$ is of the form

$$\chi_{\delta,N}(\sigma) = \sum_{n=1}^{N} (n - 1)\delta \, m(\sigma \wedge \sigma_n),$$

where we now consider $\sigma$, $\sigma_n$ to be elements of $\Sigma \subset G$. Thus $m(\sigma \wedge \sigma_n) = m(\sigma\sigma_n)$, where $\sigma\sigma_n$ are products of the "operators" $\sigma$, $\sigma_n$ of $G$. From $m(\sigma) = \mu(w, \sigma) = \mathrm{tr}(w\sigma)$ it follows that (since $\sigma$ commutes with $\sigma_n$, where $\sigma, \sigma_n \in \Sigma$):

$$m(\sigma \wedge \sigma_n) = \mathrm{tr}(w\sigma\sigma_n) = \mathrm{tr}(w\sigma\sigma_n^2) = \mathrm{tr}(w\sigma_n\sigma\sigma_n) = \mathrm{tr}(\sigma_n w \sigma_n \sigma).$$
Thus we obtain

$$\chi_{\delta,N} = \sum_{n=1}^{N} (n - 1)\delta \, S(\sigma_n w \sigma_n) = S\left( \sum_{n=1}^{N} (n - 1)\delta \, \sigma_n w \sigma_n \right).$$

According to the proof of Th. 2.1.11, $\|\chi_{\delta,N}\|^{-1} \chi_{\delta,N} \in K(\Sigma)$ converges in norm to $\tilde{m}$. Thus $\|z_{\delta,N}\|^{-1} z_{\delta,N} \in K(\mathcal{H}_1, \ldots)$ with $z_{\delta,N} = \sum_{n=1}^{N} (n - 1)\delta \, \sigma_n w \sigma_n$ converges in norm towards a $\tilde{w} \in K(\mathcal{H}_1, \ldots)$. Thus we obtain $\tilde{m} = S\tilde{w}$.

According to Th. 2.5.9 we may therefore identify the space $\mathcal{B}'(\Sigma)$ with the subspace of $\mathcal{B}'(\mathcal{H}_1, \ldots)$ of all operators of the form (2.5.4) where $E(\lambda)$ is a spectral family of the complete Boolean sublattice $\Sigma$ of $G$. $\mathcal{B}'(\Sigma)$ is an abelian algebra. $K(\Sigma)$ arises from the partition of $K(\mathcal{H}_1, \ldots)$ into equivalence classes, where $w$'s from $K(\mathcal{H}_1, \ldots)$ belong to the same class if they cannot be distinguished by means of $\mathrm{tr}(we)$ with $e \in \Sigma$, that is, if the linear forms $\mathrm{tr}(wg)$ agree on the subspace $\mathcal{B}'(\Sigma)$ of $\mathcal{B}'(\mathcal{H}_1, \ldots)$.

If, for one or more scales $y_1, \ldots, y_n$, we have $\Sigma(y_1, \ldots, y_n) = \Sigma$ (according to Th. 2.5.6 we may always choose a scale $y$ such that $\Sigma(y) = \Sigma$), then the corresponding operators $A_1, \ldots, A_n$ uniquely determine the decision observable and the scales $y_1, \ldots, y_n$. Therefore we may uniquely characterize a decision observable with measurement scales by a finite number of commuting operators $A_1, \ldots, A_n \in \mathcal{B}'(\mathcal{H}_1, \ldots)$. The spectral families of $A_1, \ldots, A_n$ generate the corresponding complete Boolean ring $\Sigma \subset G$. We therefore define

D 2.5.6. Let $A \in \mathcal{B}'(\mathcal{H}_1, \ldots)$; the decision observable together with its corresponding scale, which is uniquely determined by $A$, is called a scale observable.

We shall often use the expression "$A$ is a scale observable" or, more briefly, "$A$ is an observable." Thus we have explained the connection with the usual language of quantum mechanics, in which the self-adjoint operators are called observables. With this explanation we expect that, at least "in principle," misunderstandings are impossible.

The brief characterization of a decision observable having a scale by a single operator $A \in \mathcal{B}'(\mathcal{H}_1, \ldots)$ is not applicable to more general observables.
In such cases, to each scale $y \in \mathcal{B}'(\Sigma)$ there corresponds an operator $A = S'y \in \mathcal{B}'(\mathcal{H}_1, \ldots)$, but the operator $A$ does not permit us to reconstruct the Boolean ring $\Sigma$ and the scale $y$!

Since, for fixed $\Sigma$, we may introduce different measurement scales $y \in \mathcal{B}'(\Sigma)$, the question arises as to what scale we obtain when we replace $\lambda$ by a new scale value $f(\lambda)$, where $f(\lambda)$ is a real function. Let us consider the set

$$k(\alpha) = \{\lambda \mid f(\lambda) \leq \alpha\}. \tag{2.5.6}$$

For a registration the question "is $f(\lambda) \leq \alpha$?" is meaningful only if the set $k(\alpha) \in P$, that is, $k(\alpha)$ is measurable. Then the registration $f(\lambda) \leq \alpha$ can be
associated with the element $\sigma(k(\alpha)) \in \Sigma$. In this way we may obtain the precise meaning of the "renaming" of the scale $\lambda$ by $f(\lambda)$, and we find that it is meaningful to consider only those $f(\lambda)$ for which the sets $k(\alpha)$ of (2.5.6) are measurable, that is, are elements of $P$. Such functions are usually called measurable functions. The element $\tilde{y}$ corresponding to the renaming of the scale defined by $f(\lambda)$ is given by

$$\tilde{y} = \int \alpha \, d\sigma(k(\alpha)). \tag{2.5.7}$$

Since $\sigma(k(\alpha))$ may be obtained from the $\sigma(y \leq \beta)$ we may therefore write (2.5.7) as follows:

$$\tilde{y} = \int f(\lambda) \, d\sigma(y \leq \lambda), \tag{2.5.8}$$

where (2.5.8) is defined in terms of (2.5.7). Instead of (2.5.8) we sometimes use the abbreviation $\tilde{y} = f(y)$.

Since the integrals (2.5.7) and (2.5.8) exist as limits in the norm, $f(\lambda)$ must be essentially bounded, that is, a $c$ can be found such that the sets $\mathbf{R}$ and $\{\lambda \mid -c \leq f(\lambda) \leq c\}$ differ only by a set of measure zero.

Th. 2.5.10. Let $y$ be a scale for which $\Sigma(y) = \Sigma$. Then, to each $\tilde{y} \in \mathcal{B}'(\Sigma)$ there exists a measurable, essentially bounded $f(\lambda)$ for which $\tilde{y} = f(y)$.

PROOF. We seek an $f(\lambda)$ for which $\sigma(\tilde{y} \leq \alpha) = \sigma(k(\alpha))$, where $k(\alpha)$ satisfies (2.5.6). Let $\alpha$ be a rational number. We choose an arbitrary $\tilde{k}(\alpha)$ from the class of subsets of $P$ which is, according to Th. 2.5.8, in 1:1 correspondence with $\sigma(\tilde{y} \leq \alpha)$. For rational $\beta < \alpha$, since $\sigma(\tilde{y} \leq \beta) \leq \sigma(\tilde{y} \leq \alpha)$, we find that the set

$$j(\alpha, \beta) = \tilde{k}(\beta) + \tilde{k}(\alpha) \cap \tilde{k}(\beta)$$

is a set of measure zero. Therefore, since the set of rational $\beta$ which satisfy $\beta < \alpha$ is denumerable, we find that the set $\bigcup_{\beta < \alpha} j(\alpha, \beta)$ has measure zero. From this result it follows that the set

$$\hat{k}(\alpha) = \tilde{k}(\alpha) \cup \bigcup_{\beta < \alpha} j(\alpha, \beta)$$

belongs to the same class as $\tilde{k}(\alpha)$. For the $\hat{k}(\alpha)$ we find that $\alpha_1 > \alpha_2 \Rightarrow \hat{k}(\alpha_1) \supseteq \hat{k}(\alpha_2)$. Thus, for rational $\alpha$ we have obtained a set of $\hat{k}(\alpha) \in P$ which increases with $\alpha$ and satisfies $\sigma(\hat{k}(\alpha)) = \sigma(\tilde{y} \leq \alpha)$. Since the set of rational $\alpha > \beta$ is denumerable, for each $\beta$ a set $k(\beta) = \bigcap_{\alpha > \beta} \hat{k}(\alpha) \in P$ is determined. From Th.
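2.5.8 it follows that

$$\sigma(k(\beta)) = \bigwedge_{\alpha > \beta} \sigma(\hat{k}(\alpha)) = \bigwedge_{\alpha > \beta} \sigma(\tilde{y} \leq \alpha) = \sigma(\tilde{y} \leq \beta),$$

where we obtain the last expression from the fact that the spectral family $\sigma(\tilde{y} \leq \beta)$ is upper continuous.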
Thus the desired function $f(\lambda)$ is obtained as follows:

$$f(\lambda) = \inf\{\beta \mid \lambda \in k(\beta)\}.$$

This theorem may be transferred from the scale $y$ to an analogous theorem about its corresponding scale observable.

Th. 2.5.11. If a decision observable is completely determined by the scale observable $A$ (that is, the spectral family of $A$ generates all of $\Sigma \subset G$) then all of the scale observables corresponding to this decision observable are functions $f(A)$ of $A$.

PROOF. By analogy to (2.5.6) and (2.5.8), $f(A)$ is defined by

$$f(A) = \int f(\alpha) \, dE(\alpha),$$

where $E(\alpha)$ is the spectral family of $A$. The theorem follows directly from the map $S'$ of $\mathcal{B}'(\Sigma)$ into $\mathcal{B}'(\mathcal{H}_1, \ldots)$.

The fact that the scale values of the observable $f(A)$ are obtained from the scale values of $A$ by means of the real function $f$ is not a mysterious result of the correspondence principle, which provided the basis for the initial development of quantum mechanics more than 50 years ago. It is, instead, a consequence of the definition of a scale observable, that is, of the mapping $S'$ of a scale $y$ onto an operator $A = S'y$ from $\mathcal{B}'(\mathcal{H}_1, \ldots)$. Clearly the concept of a measurement scale is not a quantum mechanical concept, but arises from the registration methods described by $\mathcal{R}(b_0)$ and idealized by means of $\Sigma$.

Thus we have clarified and explained the usual, more or less intuitively based, methods entirely on the basis of the comparison of theory and experience in terms of $a \in \mathcal{Q}'$, $b_0 \in \mathcal{R}_0$, $b \in \mathcal{R}$. From this viewpoint it is no longer remarkable that one and the same operator $A$ can represent the following different things: a scale observable, an effect (if $0 \leq A \leq 1$) or even an ensemble (in the case in which $0 \leq A$ and $\mathrm{tr}(A) = 1$). A mathematical term does not, in itself, have any physical meaning. The physical meaning is obtained only when we state, in addition, what it represents. It is hoped that the formulation presented above will eliminate many of these "apparent" problems.
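The renaming of a scale by a function $f$ can be tried out on a finite-dimensional example. The sketch below is an illustration under stated assumptions and not part of the original text: the Hilbert space is two-dimensional and real, the spectral integral for $f(A)$ reduces to a finite sum over two hypothetical spectral projections, and the names `SPECTRUM`, `function_of`, `is_effect` and `is_ensemble` are invented for this example.

```python
# Sketch (assumption: 2-dimensional real Hilbert space, so the spectral
# integral f(A) = int f(alpha) dE(alpha) becomes a finite sum).

P_PLUS = [[0.5, 0.5], [0.5, 0.5]]      # projection onto (1, 1)/sqrt(2)
P_MINUS = [[0.5, -0.5], [-0.5, 0.5]]   # projection onto (1, -1)/sqrt(2)

# Hypothetical scale observable: scale values 0.25 and 0.75.
SPECTRUM = [(0.25, P_PLUS), (0.75, P_MINUS)]


def function_of(f, spectrum):
    """Return sum_k f(lambda_k) E_k, the finite analog of the spectral integral."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for lam, proj in spectrum:
        for i in range(2):
            for j in range(2):
                result[i][j] += f(lam) * proj[i][j]
    return result


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def is_effect(spectrum):
    """0 <= A <= 1: all scale values lie in [0, 1]."""
    return all(0.0 <= lam <= 1.0 for lam, _ in spectrum)


def is_ensemble(spectrum, tol=1e-12):
    """0 <= A and tr(A) = 1."""
    trace = sum(lam * (proj[0][0] + proj[1][1]) for lam, proj in spectrum)
    return all(lam >= 0.0 for lam, _ in spectrum) and abs(trace - 1.0) < tol


A = function_of(lambda lam: lam, SPECTRUM)               # the operator A itself
A_squared = function_of(lambda lam: lam * lam, SPECTRUM)  # renamed scale f = lam^2
```

For this particular choice the matrix `A` passes both `is_effect` and `is_ensemble`, which illustrates the remark that the same operator may stand for a scale observable, an effect, or an ensemble, depending on what it is declared to represent.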
3 Coexistent and Complementary Observables

If we apply a registration method $b_0$, then by $\mathcal{R}(b_0) \to L$ (more precisely, by the $\mathcal{U}_g$-completion of $\mathcal{R}(b_0)$) an observable is determined. In an experiment it is possible that we may only be interested in a Boolean subring $\Sigma_1$ of $\mathcal{R}(b_0)$; then $\Sigma_1 \to L$ is, so to say, a type of partial observable of $\mathcal{R}(b_0) \to L$. If $\Sigma_2 \to L$ is a second such partial observable, then $\Sigma_1 \to L$ and $\Sigma_2 \to L$ will "both" be measured by the registration method $b_0$. Thus we are led to the following general definition.
D 3.1. Two observables $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ are said to be coexistent if there exists an observable $\Sigma \xrightarrow{F} L$ and two homomorphisms $h_1$, $h_2$ such that the following diagram is commutative:

$$\Sigma_1 \xrightarrow{h_1} \Sigma \xleftarrow{h_2} \Sigma_2, \qquad F \circ h_1 = F_1, \quad F \circ h_2 = F_2 \quad \text{(as maps into } L\text{)}.$$

In §2.4 we have shown that we may identify $\Sigma_1$ with $h_1\Sigma_1 \subset \Sigma$, and $F_1$ with $F|_{h_1\Sigma_1}$; and similarly for $\Sigma_2$.

From D 3.1 it follows that, in particular, the set $\{F \mid F = F_1(\sigma_1) \text{ for } \sigma_1 \in \Sigma_1 \text{ or } F = F_2(\sigma_2) \text{ for } \sigma_2 \in \Sigma_2\}$ is a set of coexistent effects.

If $\Sigma_1 \xrightarrow{F_1} G$ and $\Sigma_2 \xrightarrow{F_2} G$ are decision observables, then, according to Th. 1.4.8, we may identify $\Sigma_1$ and $\Sigma_2$ with Boolean sublattices of $G$. $\Sigma_1 \to G$ and $\Sigma_2 \to G$ may be coexistent only if $\{\Sigma_1, \Sigma_2\}$ is a subset of a complete Boolean ring $\Sigma \subset G$. For such a $\Sigma$ the diagram in D 3.1 is trivially satisfied.

D 3.2. Two coexistent decision observables are also said to be commensurable.

The above results therefore show that:

Th. 3.1. Two decision observables are commensurable only if their combined images form a set of commensurable decision effects.

From Th. 2.5.9 and Th. 1.3.4(v) we obtain the following extension:

Th. 3.2. Two decision observables with scales are commensurable only if the scale observables commute.

Thus we have arrived at the well-known and common characterization of commensurable observables without the need to make use of the usual long-winded discussion of what is meant by joint measurements. Here we have intentionally avoided the use of the expression "simultaneous measurement." We shall return to this problem and its accompanying misconceptions in XVII, §2 and XVIII, §4. At this point we have defined the notions of coexistence and commensurability of observables in D 3.1 and D 3.2, respectively, only in the form of an idealization. We must also look into the question concerning the realization of these idealized definitions; this will be done in §4.

If two observables are coexistent, then, according to D 3.1, there exists an observable $\Sigma \xrightarrow{F} L$ for which $F_1\Sigma_1 \subset F\Sigma$ and $F_2\Sigma_2 \subset F\Sigma$.
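The commutation criterion of Th. 3.2 is easy to check numerically in a toy case. The following sketch is not part of the original text; it assumes a two-dimensional real Hilbert space, and the operators `A`, `B`, `C` as well as the helper `commute` are hypothetical examples chosen only to show one commuting and one non-commuting pair.

```python
# Sketch (assumption: 2x2 real symmetric matrices as scale observables).

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commute(a, b, tol=1e-12):
    """True if the commutator ab - ba vanishes up to the tolerance."""
    ab, ba = matmul(a, b), matmul(b, a)
    return all(abs(ab[i][j] - ba[i][j]) <= tol
               for i in range(2) for j in range(2))


# A and B are built from the same pair of decision effects diag(1, 0) and
# diag(0, 1), only with different scale values.
A = [[1.0, 0.0], [0.0, 2.0]]
B = [[5.0, 0.0], [0.0, -1.0]]

# C is built from the decision effect projecting onto (1, 1)/sqrt(2),
# which does not belong to the Boolean ring generated by A.
C = [[1.5, 1.5], [1.5, 1.5]]

a_b_commute = commute(A, B)
a_c_commute = commute(A, C)
```

Here `A` and `B` share their spectral projections, so they describe two scales on one decision observable, while `A` and `C` fail the commutation test.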
We shall also use a concept which will characterize the "extreme case" of noncoexistence between two observables. For this purpose we shall define the subset

$$S = \{(\lambda_1 \mathbf{1}, \lambda_2 \mathbf{1}, \ldots) \mid 0 \leq \lambda_i \leq 1\}$$
of $L$. Obviously the elements of $S$ are those which are coexistent relative to each element of $L$.

D 3.3. Two observables $\Sigma_1 \xrightarrow{F_1} L$, $\Sigma_2 \xrightarrow{F_2} L$ are said to be complementary if [...] and, for each observable $\Sigma \xrightarrow{F} L$, it follows that $F_1\Sigma_1 \cap F\Sigma \subset S$ or $F_2\Sigma_2 \cap F\Sigma \subset S$.

It is not difficult to see that two decision observables are complementary only if, for $e_1 \in F_1\Sigma_1$ and $e_2 \in F_2\Sigma_2$ with $e_1$, $e_2$ commensurable, it follows that either $e_1$ or $e_2 \in Z$, where $Z$ is the center of $G$ (the definition of $Z$ is given in D 1.3.2).

4 Realizations of Observables

We have simplified the analysis of the structure of observables by using the idealized version $\Sigma \xrightarrow{F} L$ of the definition of an observable instead of the realistic one $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$. We must now ask whether it is possible to realize (in the sense of the construction of a measurement apparatus) the observable $\Sigma \xrightarrow{F} L$, that is, whether there exists a $b_0 \in \mathcal{R}_0$ for which the following diagram commutes:

$$\Sigma \xrightarrow{h} \mathcal{R}(b_0), \qquad \psi_0 \circ h = F \quad \text{(as maps into } L\text{)},$$

where $h$ is an isomorphism.

The requirement that, to each observable, there exists an $\mathcal{R}(b_0)$ such that the above diagram is satisfied is clearly too strong, since we have assumed that $\mathcal{Q}$, $\mathcal{R}$ are denumerable (see III, §3). The following axiom governing "approximate" realizations may be added to the previous axioms:

AOb. To each observable $\Sigma \xrightarrow{F} L$, each finite Boolean subring $\tilde{\Sigma}$ of $\Sigma$ and each $\sigma(\mathcal{B}', \mathcal{D})$-neighborhood $U$ of $0 \in \mathcal{B}'(\mathcal{H}_1, \ldots)$ there exists an $\mathcal{R}(b_0)$ and a homomorphism $\tilde{\Sigma} \xrightarrow{h} \mathcal{R}(b_0)$ such that $\psi_0 h(\sigma) - F(\sigma) \in U$ for all $\sigma \in \tilde{\Sigma}$, where $\psi_0$ is the measure $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$.

If we do not explicitly require axiom AOb, then, in any case, we may consider AOb to be a "certain" hypothesis in the sense of [1], §10.1; we shall not attempt to establish this result here. The proof that AOb is a "certain" hypothesis implies that the addition of axiom AOb does not lead to any contradictions in the mathematical theory. As we have explained in
4 Realizations of Observables 155 [1], §10.4, it is only a matter of taste whether AOb is added as an axiom, except, if on the basis of experience there is strong evidence that, in nature, there are strong barriers to the possibility that apparatuses satisfying AOb can actually be constructed for all observables >L (see [1], §10). But we have no indication of such barriers for the case of microsystems. However, AOb is wrong in the case of an extrapolated quantum mechanics for “many particles” (see [13], X). Axiom AOb expresses the following statement: Each observable can be measured “approximately” (note that 38) determines the structure of physical imprecision in L—see III, §3, and in general [1], §9). The structure described by AOb by which we may “approximately” measure an observable is essential for the application of quantum mechanics. Most of the important observables in quantum mechanics, for example, position, momentum, and angular momentum can only be approximately measured. An noteworthy experiment illustrating this fact is given by the Stern-Gerlach experiment in which the angular momentum of an atom is measured. In fact, the angular momentum is only approximately measured, because the procedure represents a measurement for angular momentum only for a subset of the ensembles (a subset for which the neighborhood U can be specified). This situation may be more clearly understood in the presentation of the theory of this experiment in [2], XI, §7.2. The angular momentum will be measured only for such ensembles which pass through the magnetic field in a particular way. For ensembles which, for example, do not pass through the apparatus, the apparatus does not make any measurements of angular momentum! The definition of coexistence of two observables given in D 3.1 has, on the basis of AOb, the following consequence: there exists a measurement method b0 g 38о by which it is possible to (at least approximately) make a joint measurement of both of them. 
This can be clearly inferred from the diagram in D 3.1 as follows. Let $\tilde{\Sigma}_1$ be a finite Boolean subring of $\Sigma_1$, and similarly, let $\tilde{\Sigma}_2$ be such for $\Sigma_2$. Then $h_1\tilde{\Sigma}_1$ and $h_2\tilde{\Sigma}_2$ generate a finite Boolean subring $\tilde{\Sigma}$ in $\Sigma$, to which, according to AOb, given a neighborhood $U$, there exists an $\mathcal{R}(b_0)$ satisfying $h\tilde{\Sigma} \subset \mathcal{R}(b_0)$ and $\psi_0 h(\sigma) - F(\sigma) \in U$. From this result it follows that $\tilde{\Sigma}_1 \xrightarrow{hh_1} \mathcal{R}(b_0)$, $\tilde{\Sigma}_2 \xrightarrow{hh_2} \mathcal{R}(b_0)$ and $\psi_0 hh_1(\sigma) - F_1(\sigma) \in U$ for all $\sigma$ from $\tilde{\Sigma}_1$ and $\psi_0 hh_2(\sigma) - F_2(\sigma) \in U$ for all $\sigma$ from $\tilde{\Sigma}_2$. Thus $b_0$ is an approximate measurement method for $\Sigma_1 \xrightarrow{F_1} L$ as well as for $\Sigma_2 \xrightarrow{F_2} L$, that is, the coexistent observables $\Sigma_1 \xrightarrow{F_1} L$ and $\Sigma_2 \xrightarrow{F_2} L$ can be (approximately) jointly measured by using the single method $b_0$.

Of course, the converse of this situation does not hold. It is indeed possible to make joint approximate measurements of two noncoexistent (even complementary) observables. Naturally both cannot be measured with arbitrary accuracy. For example, it is even possible that an apparatus will be able to measure both position and momentum in such a way that the error of measurement of position $\Delta q$ and that of momentum $\Delta p$ satisfy $\Delta p \, \Delta q < 1/2$
for a certain subset of ensembles. This can occur only for certain subsets of ensembles. This fact has led to much unnecessary confusion. The source of this confusion is the false interpretation that such an attribution of position and momentum to single microsystems is forbidden by the Heisenberg uncertainty relations (see §8.2).

Axiom AOb states that it is "physically possible" (see [1], §10.4) that every observable may be approximately measured. However, AOb does not show how we may find the appropriate measurement method $b_0$. The theory presented here cannot be used for this purpose because it does not contain any mathematical description of the technical construction of the "apparatus" $b_0 \in \mathcal{R}_0$. In XVII we shall undertake a partial step in the direction of this "construction problem" for the apparatus $b_0$. In XVIII we shall be introduced to some of the deep problems found in this area. In [13], X these problems are solved in principle.

The fact that the theory presented here does not yet provide a description of the physical structure of the apparatus $b_0$ gives rise to another deficiency of quantum mechanics. We shall now provide a brief description of this deficiency.

Suppose that we are given a detailed description of the physical structure of a given apparatus (for example, a particle counter). Then, using the mapping axioms given in [1], §5, we may express this fact by identifying the apparatus with a particular $b_0$. Apparatuses for which the internal physical structure is actually different will correspond to different $b_0$'s. According to the definition III, D 1.3 of $\psi$, the map $\psi$ should be obtained directly from knowledge of the internal physical structure of the apparatus. But only the existence of this map is assured in the theory. It remains open which $g = \psi(b_0, b)$ corresponds to a given definite experimental effect process.
In this situation we may choose to "guess" which $g \in L$ corresponds (at least approximately) to a given $(b_0, b)$ and to "add" the resulting guesses to the theory as axioms. In XI and XVI we will use this approach, and its defects will become very evident. In XVII we shall describe a method by which it may be possible to make some progress in this area. A valid solution of the problem of obtaining an approximate determination of the map $\psi$ is, however, not attained in XVII. A general method for finding $\psi$ will be given in [13], X.

5 Coexistent Decompositions of Ensembles

In other formulations of quantum mechanics the question whether it is possible to make joint measurements of pairs of observables has played an important role. The question whether it is possible to make joint preparations is, however, either usually ignored or treated as a minor part of the measurement problem. Thus, in the literature, the primary emphasis is placed upon the discussion (in a somewhat limited way) of idealized measurements of the first kind. These idealized measurements are concerned with preparations of a well-defined form (see XVII, §5). We may clarify many of the fundamental questions in quantum mechanics by separating the question about the possibility of making joint preparations from the problem of registration.

This question can be posed in a natural manner using the methodology presented here. To do so only requires that we begin with the methods of III, §2. Using III, Th. 2.1 and the identification of $\varphi(a)$ with the elements of $K$ according to AQ, for a decomposition $a = \bigcup_{i=1}^{n} a_i$ of a preparation procedure $a$ we obtain

$$\varphi(a) = \sum_{i=1}^{n} \lambda_i \varphi(a_i), \tag{5.1}$$

where $\lambda_i = \lambda_{\mathcal{Q}}(a, a_i)$, $0 \leq \lambda_i \leq 1$ and $\sum_{i=1}^{n} \lambda_i = 1$. If we have two decompositions of the same $a \in \mathcal{Q}'$,

$$a = \bigcup_{i=1}^{n} a_i = \bigcup_{k=1}^{m} \tilde{a}_k,$$

we then obtain

$$\varphi(a) = \sum_{i=1}^{n} \lambda_i \varphi(a_i) = \sum_{k=1}^{m} \tilde{\lambda}_k \varphi(\tilde{a}_k).$$

We may create a new decomposition from the above decompositions as follows:

$$a = {\bigcup_{i,k}}' (a_i \cap \tilde{a}_k). \tag{5.2}$$

Here we use the prime $'$ to indicate that we do not take the union (or summation) over those $(i, k)$ for which $a_i \cap \tilde{a}_k = \emptyset$. Thus, from (5.2) we obtain

$$\varphi(a) = {\sum_{i,k}}' \lambda_{ik} \varphi(a_i \cap \tilde{a}_k), \tag{5.3}$$

where $\lambda_{ik} = \lambda_{\mathcal{Q}}(a, a_i \cap \tilde{a}_k)$. In addition, we find that

$$\varphi(a_i) = {\sum_{k}}' \lambda_{\mathcal{Q}}(a_i, a_i \cap \tilde{a}_k) \, \varphi(a_i \cap \tilde{a}_k)$$

and

$$\varphi(\tilde{a}_k) = {\sum_{i}}' \lambda_{\mathcal{Q}}(\tilde{a}_k, a_i \cap \tilde{a}_k) \, \varphi(a_i \cap \tilde{a}_k).$$
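The decomposition may be expressed in a very simple way if, in addition to $\varphi$, we introduce the maps $\varphi_a(\tilde{a}) = \lambda_{\mathcal{Q}}(a, \tilde{a})\varphi(\tilde{a})$ of $\mathcal{Q}(a) = \{\tilde{a} \mid \tilde{a} \in \mathcal{Q}, \tilde{a} \subset a\}$ into $\check{K}$.

Th. 5.1. The map $\varphi_a$ with $\varphi_a(\tilde{a}) = \lambda_{\mathcal{Q}}(a, \tilde{a})\varphi(\tilde{a})$ of $\mathcal{Q}(a)$ into $\check{K}$ is an additive measure over the Boolean ring $\mathcal{Q}(a)$ which satisfies $\varphi_a(a) = \varphi(a) \in K$.

PROOF. Let $\tilde{a} = a_1 \cup a_2$ where $a_1 \cap a_2 = \emptyset$. Let $a_3 = a \setminus \tilde{a}$. Then it follows that $a = \bigcup_{i=1}^{3} a_i$ is a decomposition for which, according to (5.1),

$$\varphi(a) = \varphi_a(a_1) + \varphi_a(a_2) + \varphi_a(a_3).$$

In addition, $a = \tilde{a} \cup a_3$ is a decomposition, so that we obtain

$$\varphi(a) = \varphi_a(\tilde{a}) + \varphi_a(a_3).$$

Comparing the two expressions gives $\varphi_a(\tilde{a}) = \varphi_a(a_1) + \varphi_a(a_2)$.

D 5.1. A $\tilde{w} \in \check{K}$ satisfying $\tilde{w} \leq w \in K$ is called a mixture component or component of $w$.

If $\tilde{w}$ is a component of $w$, then $w - \tilde{w}$ also is, because $0 \leq w - \tilde{w} \leq w$; $w = \tilde{w} + (w - \tilde{w})$ is therefore a decomposition of $w$ into the components $\tilde{w}$ and $(w - \tilde{w})$.

Two decompositions of the same preparation procedure $a = \bigcup_{i=1}^{n} a_i = \bigcup_{k=1}^{m} \tilde{a}_k$ generate two decompositions of the ensemble $\varphi(a)$ as follows:

$$\varphi(a) = \sum_{i=1}^{n} \varphi_a(a_i) = \sum_{k=1}^{m} \varphi_a(\tilde{a}_k),$$

where the components $\varphi_a(a_i)$, $\varphi_a(\tilde{a}_k)$ lie in the range of the additive measure $\varphi_a$ on the Boolean ring $\mathcal{Q}(a)$. This result suggests the following definition:

D 5.2. Two decompositions of an ensemble

$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde{w}_k, \quad \text{where } w_i, \tilde{w}_k \in \check{K},$$

are said to be coexistent if there exists a Boolean ring $\Sigma$ and an additive measure $\Sigma \xrightarrow{W} \check{K}$ for which $W(\varepsilon) = w$ and $w_i, \tilde{w}_k \in W\Sigma$.

Two decompositions of a preparation procedure $a$ result in coexistent decompositions of $\varphi(a)$. In §7 we shall return to the problem of the realization of coexistent decompositions.

D 5.3. A set $A \subset \check{K}$ is called a set of coexistent components of $w$ if there exists a Boolean ring $\Sigma$ and an additive measure $\Sigma \xrightarrow{W} \check{K}$ such that $W(\varepsilon) = w$ and $A \subset W\Sigma$.

By analogy with the case of coexistent effects we need only consider effective measures. Here it is reasonable to consider an idealization of $\Sigma \xrightarrow{W} \check{K}$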
obtained by mathematical completion. This possibility follows directly from the following theorem.

Th. 5.2. Let $W$ be an effective additive measure $\Sigma \xrightarrow{W} \check{K}$ on the Boolean ring $\Sigma$ which satisfies $W(\varepsilon) = w \in K$. Then $m_0(\sigma) = \|W(\sigma)\| = \mu(W(\sigma), 1)$ is an effective additive measure satisfying $m_0(\varepsilon) = 1$, and $d(\sigma_1, \sigma_2) = m_0(\sigma_1 + \sigma_2)$ is a metric in $\Sigma$ for which $W$ is uniformly continuous as a map into the Banach space $\mathcal{B}$.

PROOF. From $\sigma = \sigma_1 \vee \sigma_2$ and $\sigma_1 \wedge \sigma_2 = 0$ it follows that $W(\sigma) = W(\sigma_1) + W(\sigma_2)$ and

$$m_0(\sigma) = \mu(W(\sigma), 1) = \mu(W(\sigma_1), 1) + \mu(W(\sigma_2), 1) = m_0(\sigma_1) + m_0(\sigma_2).$$

Since $W(\varepsilon) \in K$ we obtain $m_0(\varepsilon) = \mu(W(\varepsilon), 1) = 1$. $m_0$ is effective, because if $m_0(\sigma) = 0$ it follows that $\mu(W(\sigma), 1) = 0$ and therefore $\|W(\sigma)\| = 0$, that is, $W(\sigma) = 0$; since $W$ is, by assumption, effective, we obtain $\sigma = 0$.

From

$$W(\sigma_1) + W(\sigma_2 + \sigma_1\sigma_2) = W(\sigma_1 \vee \sigma_2) = W(\sigma_2) + W(\sigma_1 + \sigma_1\sigma_2)$$

and

$$W(\sigma_1 + \sigma_2) = W(\sigma_1 + \sigma_1\sigma_2) + W(\sigma_2 + \sigma_1\sigma_2)$$

it follows that, for all $g \in L$:

$$|\mu(W(\sigma_1) - W(\sigma_2), g)| \leq \mu(W(\sigma_1 + \sigma_2), g) \leq \mu(W(\sigma_1 + \sigma_2), 1) = d(\sigma_1, \sigma_2).$$

On the other hand, we have

$$\|W(\sigma_1) - W(\sigma_2)\| = \sup_{y \in [-1, 1]} \mu(W(\sigma_1) - W(\sigma_2), y).$$

Since $[-1, 1] = 2L - 1$, for $y = 2g - 1$ we obtain

$$\mu(W(\sigma_1) - W(\sigma_2), y) = 2\mu(W(\sigma_1) - W(\sigma_2), g) - \mu(W(\sigma_1) - W(\sigma_2), 1)$$

and thus we find that

$$\|W(\sigma_1) - W(\sigma_2)\| \leq 3 \sup_{g \in L} |\mu(W(\sigma_1) - W(\sigma_2), g)|.$$

Therefore we obtain

$$\|W(\sigma_1) - W(\sigma_2)\| \leq 3 \, d(\sigma_1, \sigma_2).$$

From Th. 5.2 it follows that, as is the case in §1.4, it is possible to complete $\Sigma$ and to extend $W$ to the completion of $\Sigma$. Then $W$ becomes a $\sigma$-additive measure on the completion of $\Sigma$. If $\Sigma$ is a (lattice-theoretically) complete Boolean ring, and if $W$ is a $\sigma$-additive measure, then $\Sigma$ is complete with respect to the metric $d(\sigma_1, \sigma_2) = \mu(W(\sigma_1 + \sigma_2), 1)$.
We define the following analog of an observable:

D 5.4. A Boolean ring $\Sigma$ with effective measure $W: \Sigma \to \check{K}$ for which $W(\varepsilon) = w \in K$ and for which $\Sigma$ (in the metric determined by $W$) is complete and separable is called a preparator of $w$.

It follows that two decompositions of an ensemble

$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde{w}_k$$

are coexistent only if there exists a preparator $\Sigma \xrightarrow{W} \check{K}$ such that $w_i, \tilde{w}_k \in W\Sigma$.

The mathematical similarity between preparator and observable runs much deeper, and depends upon the following theorem:

Th. 5.3. For fixed $w \in K$ each $\tilde{w} \in \check{K}$ satisfying $\tilde{w} \leq w$ can be written in the form $\tilde{w} = w^{1/2} g w^{1/2}$, where $g \in L$. Let $g \in L$; then, for $w \in K$: $w^{1/2} g w^{1/2} \in \check{K}$. The correspondence $g \to w^{1/2} g w^{1/2}$ is an order-isomorphism of $[0, e]$ onto $[0, w] \subset \check{K}$, where $e$ is the decision effect satisfying $K_1(e) = C(w)$.

PROOF. From $\tilde{w} = w^{1/2} g w^{1/2}$ and from $0 \leq g \leq 1$ it follows that

$$\langle \varphi, \tilde{w}\varphi \rangle = \langle w^{1/2}\varphi, g w^{1/2}\varphi \rangle \leq \langle w^{1/2}\varphi, w^{1/2}\varphi \rangle = \langle \varphi, w\varphi \rangle,$$

that is, $\tilde{w} \leq w$. Since $\langle w^{1/2}\varphi, g w^{1/2}\varphi \rangle \geq 0$ we obtain $\tilde{w} \geq 0$.

Let $0 \leq \tilde{w} \leq w$; we shall now consider the support $\mathcal{T}$ of $w$. The domain of definition of the operator $w^{-1/2}$ is dense in $\mathcal{T}$ because, if $w = \sum_\nu w_\nu P_{\varphi_\nu}$ and $w_\nu \neq 0$, then the $\varphi_\nu$ form a complete orthonormal basis for $\mathcal{T}$, and all vectors of the form $\sum_{\nu=1}^{N} \varphi_\nu a_\nu$ lie in the domain of definition of $w^{-1/2}$. If we then write $A = \tilde{w}^{1/2} w^{-1/2}$, then $A$ is defined on a dense set in $\mathcal{T}$; in addition, since $\tilde{w} \leq w$ we obtain

$$\|A\varphi\|^2 = \langle A\varphi, A\varphi \rangle = \langle w^{-1/2}\varphi, \tilde{w} w^{-1/2}\varphi \rangle \leq \langle w^{-1/2}\varphi, w w^{-1/2}\varphi \rangle = \langle \varphi, \varphi \rangle,$$

that is, $\|A\| \leq 1$. Clearly $A$ may be extended to all of $\mathcal{T}$. Therefore, for all $\varphi \in \mathcal{T}$ we obtain

$$A w^{1/2}\varphi = \tilde{w}^{1/2} w^{-1/2} w^{1/2}\varphi = \tilde{w}^{1/2}\varphi$$

and thus we find that $\tilde{w}^{1/2} = A w^{1/2}$. Thus we obtain

$$\tilde{w} = \tilde{w}^{1/2} \tilde{w}^{1/2} = (\tilde{w}^{1/2})^{+} \tilde{w}^{1/2} = w^{1/2} A^{+} A w^{1/2}.$$

If we define $g$ by $A^{+}A = g$ we then obtain $\tilde{w} = w^{1/2} g w^{1/2}$. Since (see AIV, §3 and §4) $\|A^{+}\| = \|A\|$ and $\|A^{+}A\| \leq \|A^{+}\| \, \|A\|$ we therefore find that $\|g\| \leq 1$ and therefore $0 \leq g \leq 1$.

From the operator equation in $\mathcal{T}$
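$$w^{1/2} g_1 w^{1/2} = w^{1/2} g_2 w^{1/2}$$

it follows that $\langle w^{1/2}\varphi, (g_1 - g_2) w^{1/2}\varphi \rangle = 0$ for all $w^{1/2}\varphi$. Since $w^{1/2}\mathcal{T}$ is dense in $\mathcal{T}$ we obtain $\langle \psi, (g_1 - g_2)\psi \rangle = 0$ for all $\psi \in \mathcal{T}$, and thus we find that $g_1 - g_2 = 0$ is satisfied as an operator equation in $\mathcal{T}$.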
Th. 5.4. Let $\Sigma \xrightarrow{W} \check{K}$ be an additive measure with $W(\varepsilon) = w \in K$. Then an additive measure $\eta$ is defined in terms of the bijection $x$ and the following diagram:

$$x: [0, w] \to [0, e], \quad \tilde{w} \mapsto g \ \text{ where } \tilde{w} = w^{1/2} g w^{1/2}, \quad e \in G, \ K_1(e) = C(w),$$

$$\Sigma \xrightarrow{W} [0, w] \xrightarrow{x} [0, e], \qquad \eta = x \circ W.$$

The uniform structure $\mathcal{U}_g$ defined by $\eta$ is the same as the one defined by $W$.

PROOF. The fact that $\eta$ is an additive measure follows directly from Th. 5.3. $\mathcal{U}_g$ is determined by the metric $d_\eta(\sigma_1, \sigma_2) = \mu(w_0, \eta(\sigma_1 + \sigma_2))$ for a $w_0$ for which $C(w_0) = K$ (see Th. 1.4.1). For example, we may choose $w_0$ as follows: $w_0 = \tfrac{1}{2}w + \tfrac{1}{2}\tilde{w}$, where $w$ is defined in Th. 5.4, $\tilde{w} \in K_0(e)$ and $C(\tilde{w}) = K_0(e)$. Thus, since $0 \leq \eta(\sigma) \leq e$, we obtain

$$d_\eta(\sigma_1, \sigma_2) = \tfrac{1}{2}\mu(w, \eta(\sigma_1 + \sigma_2)).$$

On the other hand, the metric for $W$ is given by

$$d_W(\sigma_1, \sigma_2) = \mu(W(\sigma_1 + \sigma_2), 1) = \mu(w^{1/2}\eta(\sigma_1 + \sigma_2)w^{1/2}, 1) = \mu(w, \eta(\sigma_1 + \sigma_2)).$$

According to Th. 5.4, if we admit, as observables, the $\Sigma \xrightarrow{\eta} L$ for which $\eta(\varepsilon) = e \in G$ and $e \neq 1$, then we have established a 1:1 correspondence between the preparators $\Sigma \xrightarrow{W} \check{K}$ and the observables $\Sigma \xrightarrow{\eta} L$. This correspondence permits us to transform, by analogy, all theorems about observables into theorems about preparators. We leave the proof of this result as an exercise for the reader.

We may, without difficulty, also define the following concepts for preparators: a preparator for an ensemble $w$ is more extensive than another for the same ensemble; and: two preparators for the same ensemble are coexistent (see also §6). These concepts are completely analogous to those defined for observables.

We shall now explain the distinction between preparator and observable. The distinction depends on the physical meaning and, mathematically, upon the structure of the measure $W(\sigma) = w^{1/2}\eta(\sigma)w^{1/2}$, which is somewhat different from $\eta(\sigma)$.

The desire not to consider "unnecessary" convex combinations of the $\eta(\sigma)$ is analogous to the desire not to consider unnecessary mixtures of the $W(\sigma)$;
162 IV Coexistent Effects and Coexistent Decompositions here the case of decompositions is more interesting physically than that of mixtures. We shall therefore seek preparators for which Z L corresponds to an kernel observable. We find, however, that the decomposition of observables described in §2.4 has an alternative interpretation when applied to rj. Suppose, for example, that ф) = ХП1{а) + (1 - Я>,2(а) (5.4) holds for all g e Z. Then it follows that W(g) = mx(G) + (1 - X)W2(g). (5.5) Equation (5.5) represents a decomposition of each ensemble W(o). We note, however, that this decomposition fulfils on the basis of (5.5), an additional condition. Since ri(s) = e = Я^г) + (1 — X)ri2(s) we find that *7i(£) = 7/2OO = e апс* we therefore obtain Wx(e) = W2(s) = w, that is, the ensemble w underlying the preparator is not decomposed by (5.5). A decomposition which does not, in general, satisfy this condition may be obtained, for example, from a partition г = ox v g2,g1 a g2 = 0 of the form W(g) = W(gx a g) + W(g2 л g) = Щ(°) + (1 - X)W2{g\ where Я = ptWfai), IX Щ(<т) = Я~1W(g1 a g) and W2(g) = (1 - X)~1W(g2 a g). In these cases, we find that, in general Wx(s) = 1~1W(g1) is not equal to W(e) = w. If, in (5.5) we have Wx(g) = W2(g) = W(e) = w, then, according to Th. 5.4 it follows that ^71 (cr), vi2(g) are uniquely determined with rj^e) = ri2(s) = e and must also satisfy (5.4). Decompositions of the form (5.5) for which W^s) = W2(s) = W(s) = w are called decompositions of the preparator. One of the arguments presented in §2.4 for observables is not applicable because, in the set of preparation procedures there is no special subset of the type 010 in Despite this fact, we may still deduce from the decompositions of preparators in a way analogous to the case of observables the goal to realize experimentally as far as possible irreducible preparators. 
If $\Sigma \xrightarrow{\eta} L$ is a decision observable (with the exception that we do not require that $\eta(\varepsilon) = e = 1$) then the corresponding preparator is irreducible. The preparator for which $\Sigma \xrightarrow{\eta} L$ is a decision observable plays a theoretically distinguished role.

D 5.5. A preparator $\Sigma \xrightarrow{W} \check{K}$ is called a decision preparator if the corresponding observable $\Sigma \xrightarrow{\eta} L$ is a decision observable.

According to Th. 1.4.8, for a decision observable we may identify $\Sigma$ with a Boolean ring $\eta\Sigma$. A decision preparator is uniquely defined by a Boolean
5 Coexistent Decompositions of Ensembles 163 subring £ of G (possibly with an e Ф 1 as unit element; e is the support of w) together with the map W(e) = w1/2ew1/2 (5.6) of £ into К for eel. Since e is the support of w, we find that W(e) = w1/2ew1/2 = w1/2w1/2 = w (as we would expect). In the desire to eliminate the lack of symmetry between preparator and observable, we will often find the following statement: the ensemble w = (1,1,...) = 1 corresponds to the case of “complete ignorance,” and the fact that w is not an element of К is taken only for a matter of mathematical “inconvenience.” For w = 1 equation (5.6) may be formally transformed into W(e) = e and we may come to the false conclusion that a preparator is nothing other than a decomposition of w = 1 with respect to a (decision) observable. According to the formulation of quantum mechanics developed here we cannot dismiss w in (5.6); w is not a measure of knowledge or ignorance, but is only a mathematical symbol w = cp(a) corresponding to an apparatus (!) a by means of the map cp of J' in K. Such an apparatus a does not exist for the case in which cp(a) is approximately equal to 1. On the contrary, we may approxi¬ mately represent each w = £v as a finite sum v^P^, where wv = wv£*=1 м^Г1. In physical terms each ensemble which we may obtain from a preparation procedure has, for all practical purposes, a finite¬ dimensional support. This is a very important aspect of the structure of microsystems, and is a direct consequence of axiom AV 4s. The following important question is often discussed: For a given observ¬ able is it possible to find a preparator of w for which w can be decomposed in such a way that there is no dispersion with respect to the observable? Let be an observable, and let £ ^ К be a preparator. 
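Suppose that, to each $\sigma \in \Sigma$ there exists a $\tilde\sigma \in \tilde\Sigma$ for which

$$\mu(W(\tilde\sigma), 1 - F(\sigma)) = 0 \quad\text{and}\quad \mu(W(\tilde\varepsilon + \tilde\sigma), F(\sigma)) = 0, \tag{5.7}$$

that is, the mixture component $W(\tilde\sigma)$ triggers the response for the effect $F(\sigma)$ with certainty, and the mixture component $W(\tilde\varepsilon + \tilde\sigma)$ with certainty does not trigger the response for $F(\sigma)$.

D 5.6. A preparator $\tilde\Sigma \xrightarrow{W} K$ is said to be dispersion-free with respect to the observable $\Sigma \xrightarrow{F} L$ if to each $\sigma \in \Sigma$ there exists a $\tilde\sigma \in \tilde\Sigma$ for which (5.7) is satisfied.

What relationships must be satisfied between a preparator and an observable in order that the preparator be dispersion-free with respect to the observable?

Let $e_w$ denote the support of $w = W(\tilde\varepsilon)$. Then, since $W(\tilde\sigma) \le W(\tilde\varepsilon)$, we obtain $W(\tilde\sigma) = e_w W(\tilde\sigma) e_w$ for all $\tilde\sigma$ and therefore find that

$$\mu(W(\tilde\sigma), g) = \mu(e_w W(\tilde\sigma) e_w, g) = \mu(W(\tilde\sigma), e_w g e_w)$$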
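for all $W(\tilde\sigma)$ and $g \in L$. An additive measure $\Sigma \xrightarrow{\tilde F} L$ (which is not necessarily effective) is defined by $\tilde F(\sigma) = e_w F(\sigma) e_w$. Eq. (5.7) is satisfied for $\tilde F(\sigma)$ as well as for $F(\sigma)$:

$$\mu(W(\tilde\sigma), 1 - \tilde F(\sigma)) = \mu(W(\tilde\sigma), e_w - \tilde F(\sigma)) = 0 \quad\text{and}\quad \mu(W(\tilde\varepsilon + \tilde\sigma), \tilde F(\sigma)) = 0. \tag{5.8}$$

Let $\lambda = \mu(W(\tilde\sigma), 1)$; then from (5.8) it follows that $\lambda^{-1}W(\tilde\sigma) \in K_1(\tilde F(\sigma))$ and $(1 - \lambda)^{-1}W(\tilde\varepsilon + \tilde\sigma) \in K_0(\tilde F(\sigma))$. For $e(\sigma)$ and $e_0(\sigma) \in G$ with $K_1(\tilde F(\sigma)) = K_1(e(\sigma))$ and $K_0(\tilde F(\sigma)) = K_0(e_0(\sigma))$ it follows that

$$e(\sigma) \le \tilde F(\sigma) \le e_0(\sigma). \tag{5.9}$$

Since $w = W(\tilde\sigma) + W(\tilde\varepsilon + \tilde\sigma)$ it follows that $C(w) = C(\lambda^{-1}W(\tilde\sigma)) \vee C((1 - \lambda)^{-1}W(\tilde\varepsilon + \tilde\sigma))$, where $C(w) = K_1(e_w)$, from which it follows that

$$K_1(e_w) \subset K_1(e(\sigma)) \vee K_1(e_0(\sigma)^\perp) = K_1(e(\sigma) \vee e_0(\sigma)^\perp).$$

Thus it follows that $e_w \le e(\sigma) \vee e_0(\sigma)^\perp$. Since, according to (5.9), $e(\sigma) \le e_0(\sigma)$, we obtain

$$e(\sigma) \vee e_0(\sigma)^\perp = e(\sigma) + e_0(\sigma)^\perp = e(\sigma) + 1 - e_0(\sigma) = (e_0(\sigma) - e(\sigma))^\perp.$$

Therefore $e_w \perp e_0(\sigma) - e(\sigma)$. According to (5.9) we have $\tilde F(\sigma) = e(\sigma) + (\tilde F(\sigma) - e(\sigma))$, where $0 \le \tilde F(\sigma) - e(\sigma) \le e_0(\sigma) - e(\sigma)$. According to the definition of $\tilde F$ we obtain $e(\sigma) \le e_w$. Therefore we obtain

$$\tilde F(\sigma) - e(\sigma) = e_w(\tilde F(\sigma) - e(\sigma))e_w \le e_w(e_0(\sigma) - e(\sigma))e_w = 0$$

and hence we find that $\tilde F(\sigma) = e(\sigma)$. Therefore the measure $\tilde F(\sigma)$ is a projection-valued measure $\Sigma \xrightarrow{\tilde F} G$ with $\tilde F(\varepsilon) = e_w$.

From $e_w F(\sigma) e_w = e(\sigma) \in G$ it follows that, for a $\varphi \in e(\sigma)(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ such that $\|\varphi\| = 1$:

$$F(\sigma)\varphi = (1 - e_w)F(\sigma)\varphi + e_w F(\sigma)\varphi = (1 - e_w)F(\sigma)\varphi + e_w F(\sigma)e_w\varphi = (1 - e_w)F(\sigma)\varphi + e(\sigma)\varphi = (1 - e_w)F(\sigma)\varphi + \varphi.$$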
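Since $(1 - e_w)F(\sigma)\varphi \perp \varphi = e_w\varphi$ it follows that

$$\|F(\sigma)\varphi\|^2 = \|(1 - e_w)F(\sigma)\varphi\|^2 + 1.$$

Since $0 \le F(\sigma) \le 1$ we obtain $\|F(\sigma)\| \le 1$ and therefore $\|F(\sigma)\varphi\|^2 \le 1$. Therefore we find that $(1 - e_w)F(\sigma)\varphi = 0$ for all $\varphi \in e(\sigma)(\mathcal{H}_1, \mathcal{H}_2, \ldots)$, and therefore $F(\sigma)\varphi = \varphi$, that is,

$$F(\sigma)e(\sigma) = e(\sigma). \tag{5.10}$$

For $\varphi \in (e_w - e(\sigma))(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ we find that

$$\langle\varphi, F(\sigma)\varphi\rangle = \langle e_w\varphi, F(\sigma)e_w\varphi\rangle = \langle\varphi, e_w F(\sigma)e_w\varphi\rangle = \langle\varphi, e(\sigma)\varphi\rangle = 0$$

and (since $F(\sigma) \ge 0$) $F(\sigma)\varphi = 0$, that is,

$$F(\sigma)(e_w - e(\sigma)) = 0. \tag{5.11}$$

From (5.10) and (5.11) it follows that $F(\sigma)e_w = e(\sigma)$ and therefore

$$(1 - e_w)F(\sigma)e_w = 0, \quad e_w F(\sigma) = e(\sigma) \quad\text{and}\quad F(\sigma) = e(\sigma) + (1 - e_w)F(\sigma)(1 - e_w). \tag{5.12}$$

The measure $F(\sigma)$ may therefore be expressed as a sum of two terms, one equal to $e(\sigma)$ and the other orthogonal to the support of $w$. In order that a preparator of $w$ be dispersion-free with respect to the observable $\Sigma \xrightarrow{F} L$, $F(\sigma)$ must therefore be, relative to $w$, a measure of decision effects $\Sigma \xrightarrow{\tilde F} G$.

For every $\Sigma \xrightarrow{F} G$, does there exist a preparator of $w$ which is dispersion-free with respect to $\Sigma \xrightarrow{F} G$? For $F(\sigma) = e(\sigma)$ equation (5.8) is equivalent to

$$w^{1/2}\eta(\tilde\sigma)w^{1/2}(e_w - e(\sigma)) = 0 \quad\text{and}\quad w^{1/2}(1 - \eta(\tilde\sigma))w^{1/2}e(\sigma) = 0,$$

that is, equivalent to

$$w\,e(\sigma) = w^{1/2}\eta(\tilde\sigma)w^{1/2}e(\sigma) = w^{1/2}\eta(\tilde\sigma)w^{1/2}, \tag{5.13}$$

from which we obtain the adjoint equation: $e(\sigma)w = w^{1/2}\eta(\tilde\sigma)w^{1/2} = w\,e(\sigma)$. Therefore we find that $e(\sigma)$ commutes with $w$ and with $w^{1/2}$, so that, from (5.13), it follows that

$$w^{1/2}\eta(\tilde\sigma)w^{1/2} = w^{1/2}e(\sigma)w^{1/2}. \tag{5.14}$$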
Since $e(\sigma) \le e_w$, from Th. 5.3 it follows that $e(\sigma) = \eta(\tilde\sigma)$, that is, $\eta\tilde\Sigma \supset F\Sigma$. Instead of $\tilde\Sigma$ we may consider the Boolean subring $F\Sigma$ of $G$ and determine $F\Sigma \xrightarrow{W} K$ by

$$W(F(\sigma)) = w^{1/2}F(\sigma)w^{1/2}. \tag{5.15}$$

For a preparator of $w$, instead of $\tilde\Sigma \xrightarrow{W} K$ we may use $(F\Sigma) \xrightarrow{W} K$, where $W$ is defined by (5.15). A preparator can be dispersion-free with respect to $\Sigma \xrightarrow{F} L$ only if (5.15) results in dispersion-free ensembles, that is, if the $F(\sigma)$ commute with $w$ for all $\sigma$. Then we would obtain

$$\mu(w^{1/2}e(\sigma)w^{1/2}, 1 - e(\sigma)) = 0 \quad\text{and}\quad \mu(w^{1/2}(1 - e(\sigma))w^{1/2}, e(\sigma)) = 0.$$

Therefore, if and only if $w$ commutes with the decision observable determined by $\Sigma \xrightarrow{F} [0, e_w]$, there exists exactly one preparator of $w$ which is dispersion-free with respect to $\Sigma \xrightarrow{F} [0, e_w]$. On the other hand, to each decision observable $\Sigma \xrightarrow{F} [0, e_w]$ there exists a $w \in K$ (with support $e_w$) which commutes with the observable, provided that the decision observable has only a discrete spectrum. According to D 2.5.4 and Th. 1.4.8 this condition is equivalent to the condition that the Boolean subring $F\Sigma$ of $G$ is atomic. This result follows from the fact that $w$ has only discrete eigenvalues and that each eigenspace has finite dimension.

Therefore, there are no preparators which are dispersion-free with respect to an observable which has a continuous spectrum. This fact plays an important role in the investigation of the so-called “measurements of the first kind.” This is a typical aspect of “quantum structures,” that is, of the structure of microsystems, since, in classical theories, to any decision observable there exist dispersion-free preparators.
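The commutation criterion above can be illustrated numerically. The following is only a minimal sketch under simplifying assumptions that are not part of the text: a two-dimensional real Hilbert space, a diagonal ensemble $w$, and the helper names `matmul`, `trace`, and `dispersion`, which are purely illustrative. It evaluates the two quantities $\mu(w^{1/2}e\,w^{1/2}, 1 - e)$ and $\mu(w^{1/2}(1 - e)w^{1/2}, e)$ for a projection $e$ that commutes with $w$ and for one that does not.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def dispersion(w_diag, e):
    """Return (mu(W(e), 1-e), mu(W(1-e), e)) with W(x) = w^(1/2) x w^(1/2).

    w_diag holds the eigenvalues of a diagonal ensemble w (an assumption of
    this sketch); e is a projection written in the same basis.
    """
    n = len(w_diag)
    sqrt_w = [[math.sqrt(w_diag[i]) if i == j else 0.0 for j in range(n)]
              for i in range(n)]
    one_minus_e = [[(1.0 if i == j else 0.0) - e[i][j] for j in range(n)]
                   for i in range(n)]
    w_e = matmul(matmul(sqrt_w, e), sqrt_w)
    w_not_e = matmul(matmul(sqrt_w, one_minus_e), sqrt_w)
    return trace(matmul(w_e, one_minus_e)), trace(matmul(w_not_e, e))

w = [0.75, 0.25]
e_commuting = [[1.0, 0.0], [0.0, 0.0]]   # diagonal, so it commutes with w
e_rotated = [[0.5, 0.5], [0.5, 0.5]]     # projection onto (1, 1)/sqrt(2)

print(dispersion(w, e_commuting))  # both entries vanish
print(dispersion(w, e_rotated))    # both entries are strictly positive
```

For the commuting projection both quantities vanish, as required of a dispersion-free decomposition; for the rotated projection they do not, so the decomposition $W(e) = w^{1/2}e\,w^{1/2}$ retains dispersion with respect to $e$.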
Thus it is understandable that some physicists have not only sought to weaken the distinction between observables and preparators, but have also sought (by means of “tricks”) the “elimination” of the typical “quantum mechanical structures.”

For the case of microsystems the preparation and registration processes are no longer “parallel,” as is the case for macrosystems. Therefore the implications for the so-called measurements of the first kind appear to be highly idealized (see XVII, §5). However, on the basis of II we shall find that it is not necessary to require the existence of measurements of the first kind as a basis for quantum mechanics.

6 Complementary Decompositions of Ensembles

In complete analogy to the definition of the notion of coexistent observables we introduce the following definition:

D 6.1. A preparator $\Sigma \xrightarrow{W} K$ of the ensemble $w$ is said to be more extensive than another preparator $\Sigma_1 \xrightarrow{W_1} K$ for the same ensemble if there exists a
homomorphism $h$ for which the following diagram commutes:

$$\Sigma_1 \xrightarrow{\;h\;} \Sigma \xrightarrow{\;W\;} K, \qquad W \circ h = W_1.$$

D 6.2. Two preparators $\Sigma_1 \xrightarrow{W_1} K$, $\Sigma_2 \xrightarrow{W_2} K$ for the same ensemble $w$ are said to be coexistent if there exists a preparator $\Sigma \xrightarrow{W} K$ for the same ensemble which is more extensive than the preparators $\Sigma_1 \xrightarrow{W_1} K$ and $\Sigma_2 \xrightarrow{W_2} K$.

Therefore we find that, for two coexistent preparators, there exists a preparator $\Sigma \xrightarrow{W} K$ and two homomorphisms $h_1$, $h_2$ such that the following diagram commutes:

$$\Sigma_1 \xrightarrow{\;h_1\;} \Sigma \xleftarrow{\;h_2\;} \Sigma_2, \qquad W \circ h_1 = W_1, \quad W \circ h_2 = W_2 \quad (\text{as maps into } K). \tag{6.1}$$

We shall now provide a physical interpretation of this result: $\Sigma \xrightarrow{W} K$ represents (in idealized form) a preparation procedure $a$ for which $\Sigma_1$, $\Sigma_2$ may be considered to be Boolean subrings of $\mathcal{Q}(a)$, that is, to be parts of a single preparation procedure and its possible decompositions.

For the observables which (formally) correspond to the preparators there exists a diagram analogous to (6.1) as follows:

$$\Sigma_1 \xrightarrow{\;h_1\;} \Sigma \xleftarrow{\;h_2\;} \Sigma_2, \qquad \eta \circ h_1 = \eta_1, \quad \eta \circ h_2 = \eta_2 \quad (\text{as maps into } L), \tag{6.2}$$

where $\eta(\varepsilon) = \eta_1(\varepsilon_1) = \eta_2(\varepsilon_2)$ is the support of $w$. In the sense of diagram (6.2) the corresponding observables are, by D 3.1, coexistent observables.

The situation for two preparators is substantially different from that of observables with regard to the problem of “perfect” noncoexistence. Such “perfectly” noncoexistent preparators play an important role in those problems of quantum mechanics which are associated with epistemology (theory of cognition).

We shall now characterize (in idealized form) the following situation: For $w \in K$ suppose that we are given two decompositions of $w$:

$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde w_k, \tag{6.3}$$

where $w_i, \tilde w_k \in K$. Suppose that there exists a pair of preparation procedures $a$ and $\tilde a$ for which $\varphi(a) = \varphi(\tilde a) = w$, and that $a$, $\tilde a$ may be decomposed as follows: $a = \bigcup_{i=1}^{n} a_i$, $\tilde a = \bigcup_{k=1}^{m} \tilde a_k$, where $\varphi_a(a_i) = w_i$, $\varphi_{\tilde a}(\tilde a_k) = \tilde w_k$. If it follows that $a \cap \tilde a = \emptyset$ then the preparation procedures $a$ and $\tilde a$ have nothing
in common: it is not possible to prepare a single microsystem which could be considered to be prepared according to both $a$ and $\tilde a$. The preparation procedures $a$, $\tilde a$ are mutually exclusive, even though $a$ and $\tilde a$ result in the preparation of the same (!) ensemble $w = \varphi(a) = \varphi(\tilde a)$.

Suppose that $a \cap \tilde a \neq \emptyset$; then there exists a common Boolean subring $\mathcal{Q}(a \cap \tilde a)$ of $\mathcal{Q}(a)$ and $\mathcal{Q}(\tilde a)$. We shall now try to express, in a mathematically idealized form, the fact that two preparators do not have such a “common part.”

From the preparator $\Sigma \xrightarrow{W} K$ of an ensemble $w$ we may easily obtain new preparators as follows: Let $[0, \eta]$ be an interval from $\Sigma$. For $\lambda = \mu(W(\eta), 1)$ we find that $[0, \eta] \xrightarrow{\lambda^{-1}W} K$ is a preparator of the ensemble $w_0 = \lambda^{-1}W(\eta)$. We shall call the preparator obtained in this way the preparator canonically determined by $[0, \eta]$.

D 6.3. Let $\Sigma_1 \xrightarrow{W_1} K$ and $\Sigma_2 \xrightarrow{W_2} K$ be two preparators for the same ensemble $w$. These preparators are said to be disjoint if, for any pair of intervals $[0, \eta_1] \subset \Sigma_1$ and $[0, \eta_2] \subset \Sigma_2$, the following condition cannot be satisfied: Let $\lambda_1 = \mu(W_1(\eta_1), 1)$, $\lambda_2 = \mu(W_2(\eta_2), 1)$; then $\lambda_1^{-1}W_1(\eta_1) = \lambda_2^{-1}W_2(\eta_2)$ and the preparators canonically defined by $[0, \eta_1]$ and $[0, \eta_2]$ are coexistent.

Let $\Sigma_1$, $\Sigma_2$ denote the closures of $\mathcal{Q}(a)$ and $\mathcal{Q}(\tilde a)$, respectively. Suppose that $\varphi(a) = \varphi(\tilde a) = w$, $W_1 = \varphi_a$, $W_2 = \varphi_{\tilde a}$. If $\mathcal{Q}(a) \xrightarrow{\varphi_a} K$ and $\mathcal{Q}(\tilde a) \xrightarrow{\varphi_{\tilde a}} K$ (more precisely, their completions) are disjoint, then we find that $a \cap \tilde a = \emptyset$, because otherwise the intervals $[0, a \cap \tilde a] \subset \mathcal{Q}(a)$ and $[0, a \cap \tilde a] \subset \mathcal{Q}(\tilde a)$ would canonically determine the same preparator and would therefore be (trivially) coexistent.

D 6.4. Let $\Sigma_1 \xrightarrow{W_1} K$ and $\Sigma_2 \xrightarrow{W_2} K$ be a pair of preparators. We say that they are complementary if any preparator which is more extensive than $\Sigma_1 \xrightarrow{W_1} K$ is disjoint from any preparator which is more extensive than $\Sigma_2 \xrightarrow{W_2} K$.

D 6.5. Two decompositions of an ensemble

$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde w_k$$

are said to be complementary if each pair of preparators $\Sigma_1 \xrightarrow{W_1} K$, $\Sigma_2 \xrightarrow{W_2} K$ for which $w_i \in W_1\Sigma_1$, $\tilde w_k \in W_2\Sigma_2$ is complementary.
How may we determine whether two decompositions are complementary?

Th. 6.1. Two decompositions of an ensemble

$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde w_k$$
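are complementary if and only if, for each pair $w_i$, $\tilde w_k$, the following condition is satisfied:

$$w_0 \in K, \quad w_0 \le w_i, \quad w_0 \le \tilde w_k \;\Rightarrow\; w_0 = 0.$$

PROOF. We shall assume that the decompositions are not complementary. Then there exists a pair of preparators $\Sigma_1 \xrightarrow{W_1} K$, $\Sigma_2 \xrightarrow{W_2} K$ for which $w_i \in W_1\Sigma_1$, $\tilde w_k \in W_2\Sigma_2$, and a pair of coexistent intervals $[0, \eta_1] \subset \Sigma_1$, $[0, \eta_2] \subset \Sigma_2$. For $\sigma_i \in \Sigma_1$ satisfying $W_1(\sigma_i) = w_i$ and for $\tilde\sigma_k \in \Sigma_2$ satisfying $W_2(\tilde\sigma_k) = \tilde w_k$ we obtain the following diagram:

$$[0, \eta_1] \xrightarrow{\;h_1\;} \Sigma \xleftarrow{\;h_2\;} [0, \eta_2], \qquad W \circ h_1 = \lambda_1^{-1}W_1, \quad W \circ h_2 = \lambda_2^{-1}W_2.$$

Let $\tau_i = h_1(\sigma_i \wedge \eta_1)$ and $\rho_k = h_2(\tilde\sigma_k \wedge \eta_2)$. From this diagram it follows that

$$W h_1(\sigma_i \wedge \eta_1) = \lambda_1^{-1}W_1(\sigma_i \wedge \eta_1) = W(\tau_i), \qquad W h_2(\tilde\sigma_k \wedge \eta_2) = \lambda_2^{-1}W_2(\tilde\sigma_k \wedge \eta_2) = W(\rho_k).$$

From $\bigvee_i \sigma_i = \varepsilon_1$ it follows that $\bigvee_i (\sigma_i \wedge \eta_1) = \eta_1$ and we obtain $\bigvee_i \tau_i = \bigvee_i h_1(\sigma_i \wedge \eta_1) = h_1(\eta_1) = \varepsilon$. Similarly we obtain $\bigvee_k \rho_k = \varepsilon$ and find that $\bigvee_{i,k}(\tau_i \wedge \rho_k) = \varepsilon$. Thus we obtain

$$W(\varepsilon) = \lambda_1^{-1}W_1(\eta_1) = \lambda_2^{-1}W_2(\eta_2) = \sum_{i,k} W(\tau_i \wedge \rho_k).$$

Therefore we cannot have all $W(\tau_i \wedge \rho_k) = 0$. Suppose that $W(\tau_1 \wedge \rho_1) \neq 0$. Then we obtain

$$w_1 = W_1(\sigma_1) \ge W_1(\sigma_1 \wedge \eta_1) = \lambda_1 W(\tau_1) \ge \lambda_1 W(\tau_1 \wedge \rho_1)$$

and similarly $\tilde w_1 \ge \lambda_2 W(\tau_1 \wedge \rho_1)$. Therefore there exists a $w_0 \in K$, $w_0 \neq 0$, for which $w_0 \le w_1$, $w_0 \le \tilde w_1$.

In order to prove the converse, we assume that there exists a $w_0 \in K$ for which $w_0 \neq 0$, $w_0 \le w_1$, $w_0 \le \tilde w_1$. We then introduce the following Boolean rings: Let $\Sigma_1$ denote the set of all subsets of a set of $(n + 1)$ elements which we shall denote by $0, 1, \ldots, n$. The one-element subsets are the atoms of $\Sigma_1$. For these atoms $\alpha_0, \alpha_1, \ldots, \alpha_n$ we define $\Sigma_1 \xrightarrow{W_1} K$ as follows:

$$W_1(\alpha_0) = w_0, \quad W_1(\alpha_1) = w_1 - w_0, \quad W_1(\alpha_2) = w_2, \quad \ldots, \quad W_1(\alpha_n) = w_n.$$

In the same way we define $\Sigma_2 \xrightarrow{W_2} K$ with atoms $\beta_0, \beta_1, \ldots, \beta_m$:

$$W_2(\beta_0) = w_0, \quad W_2(\beta_1) = \tilde w_1 - w_0, \quad W_2(\beta_2) = \tilde w_2, \quad \ldots, \quad W_2(\beta_m) = \tilde w_m.$$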
The interval $[0, \beta_0]$ is then isomorphic to the interval $[0, \alpha_0]$. Let $\mathbf{2}$ denote the Boolean ring consisting only of two elements, the zero and unit elements. We then obtain the following diagram:

$$[0, \alpha_0] \longrightarrow \mathbf{2} \longleftarrow [0, \beta_0],$$

where $W(\varepsilon) = \lambda_1^{-1}w_0$ and $\lambda_1 = \lambda_2 = \mu(w_0, 1)$.

Let $e_1$, $e_2$ denote the supports of $w_1 \in K$ and $w_2 \in K$, respectively. If $e_1 \wedge e_2 = 0$ then there does not exist a $w_0 \in K$, $w_0 \neq 0$, such that $w_0 \le w_1$ and $w_0 \le w_2$, because, since $w_0 \le w_1$ and $w_0 \le w_2$, the relations $e_0 \le e_1$ and $e_0 \le e_2$ must hold for the support $e_0$ of $w_0$. The converse is not true in general: Even if $e_1 \wedge e_2 \neq 0$ it is possible that $w_0 \le w_1$, $w_0 \le w_2$, $w_0 \in K$ imply $w_0 = 0$.

We shall now give a simple example: For a Hilbert space $\mathcal{H}$ and a complete orthonormal basis $\varphi_\nu$ in $\mathcal{H}$ let

$$w_1 = 3\sum_{\nu=1}^{\infty} \frac{1}{4^\nu} P_{\varphi_\nu}, \qquad w_2 = P_\varphi, \quad\text{where}\quad \varphi = \sum_{\nu=1}^{\infty} \frac{1}{2^{\nu/2}}\,\varphi_\nu.$$

Then the supports of $w_1$ and $w_2$ are $1$ and $P_\varphi$, respectively, and we find that $1 \wedge P_\varphi = P_\varphi \neq 0$. A $w_0 \in K$ for which $w_0 \le w_1$ and $w_0 \le w_2$ must therefore have $P_\varphi$ as support, that is, must have the form $\lambda P_\varphi$ for $\lambda \ge 0$. For $\lambda P_\varphi \le w_1$ it follows that, for all $\psi \in \mathcal{H}$,

$$\lambda\langle\psi, P_\varphi\psi\rangle = \lambda|\langle\varphi, \psi\rangle|^2 \le \langle\psi, w_1\psi\rangle.$$

In particular, for all $\varphi_\mu$ we must have

$$\lambda|\langle\varphi, \varphi_\mu\rangle|^2 = \lambda\,\frac{1}{2^\mu} \le \langle\varphi_\mu, w_1\varphi_\mu\rangle = \frac{3}{4^\mu},$$

that is, $\lambda \le 3/2^\mu$ for all $\mu$, that is, $\lambda = 0$.

Each $w \in K$ may be expressed as a mixture of the extreme points of $K$, a result which follows from the spectral decomposition theorem for operators of trace class (see AIV, §9 and §11). Thus it follows that

$$w = \sum_\nu w_\nu P_{\varphi_\nu}, \qquad w_\nu > 0, \tag{6.4}$$

where the $\varphi_\nu$ ($\nu = 1, \ldots$) are pairwise orthonormal vectors from $\mathcal{H} = \bigcup_i \mathcal{H}_i$. We will now show that any $w \in K$ whose spectral representation (6.4) contains a sum of two $P_{\varphi_\nu}$ with $\varphi_\nu$ from the same $\mathcal{H}_i$ has complementary decompositions. It suffices to show this for one Hilbert space only, that is, for $\mathcal{H} = \mathcal{H}_1$. We shall show that for $\mathcal{H} = \mathcal{H}_1$ a $w \in K(\mathcal{H})$ has complementary decompositions if $w$ is not an extreme point of $K(\mathcal{H})$.
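The counterexample above, with $w_1 = 3\sum_\nu 4^{-\nu}P_{\varphi_\nu}$ and $w_2 = P_\varphi$, can be tabulated with exact rational arithmetic. The sketch below is only an illustration; the function name `lambda_bound` is hypothetical. It evaluates the bound that the basis vector $\varphi_\mu$ places on $\lambda$ in $\lambda P_\varphi \le w_1$, namely $\lambda \le \langle\varphi_\mu, w_1\varphi_\mu\rangle / |\langle\varphi, \varphi_\mu\rangle|^2$.

```python
from fractions import Fraction

def lambda_bound(mu):
    """Bound on lambda obtained by testing lambda * P_phi <= w1 on phi_mu."""
    overlap_sq = Fraction(1, 2 ** mu)   # |<phi, phi_mu>|^2 = 1 / 2^mu
    w1_expect = Fraction(3, 4 ** mu)    # <phi_mu, w1 phi_mu> = 3 / 4^mu
    return w1_expect / overlap_sq       # equals 3 / 2^mu

bounds = [lambda_bound(mu) for mu in range(1, 11)]
print(bounds[0], bounds[-1])  # 3/2 3/1024
```

The bounds decrease geometrically, so no $\lambda > 0$ satisfies all of them simultaneously, which is the content of the statement $\lambda = 0$.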
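According to our previous discussion, two decompositions

$$w = \sum_\nu w_\nu P_{\varphi_\nu} = \sum_\mu \tilde w_\mu P_{\psi_\mu} \tag{6.5}$$

are complementary if each $P_{\psi_\mu}$ is different from each $P_{\varphi_\nu}$. Then $P_{\varphi_\nu} \wedge P_{\psi_\mu} = 0$ for all pairs $\nu$, $\mu$. We need only show that each $w$ which is not an extreme point (that is, $w \neq P_\varphi$) has two decompositions (6.5).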
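As a numerical illustration of such a pair of decompositions, the following sketch checks the two-dimensional case with exact fractions. It assumes real coefficients $a$, $b$ (so that no complex conjugation is needed) and uses the weight $\mu = \lambda(1 - \lambda)/(\lambda b^2 + (1 - \lambda)a^2)$ together with the direction $\eta \propto (\lambda b, -(1 - \lambda)a)$ in the basis $\{\varphi, \psi\}$; the function names are illustrative only.

```python
from fractions import Fraction

def projector(v):
    """Projection onto the line spanned by the real vector v (any length)."""
    norm_sq = sum(x * x for x in v)
    return [[x * y / norm_sq for y in v] for x in v]

def second_decomposition(lam, a, b):
    """For w = lam*P_phi + (1-lam)*P_psi and chi = a*phi + b*psi with real
    a, b and a^2 + b^2 = 1, return (mu, chi, eta) such that
    w = mu*P_chi + (1-mu)*P_eta."""
    mu = lam * (1 - lam) / (lam * b * b + (1 - lam) * a * a)
    chi = (a, b)
    eta = (lam * b, -(1 - lam) * a)  # left unnormalized; projector() normalizes
    return mu, chi, eta

def mix(mu, chi, eta):
    """Matrix of mu*P_chi + (1-mu)*P_eta in the basis {phi, psi}."""
    p_chi, p_eta = projector(chi), projector(eta)
    return [[mu * p_chi[i][j] + (1 - mu) * p_eta[i][j] for j in range(2)]
            for i in range(2)]

lam, a, b = Fraction(3, 10), Fraction(3, 5), Fraction(4, 5)
mu, chi, eta = second_decomposition(lam, a, b)
w = mix(mu, chi, eta)
print(mu)  # 35/74
print(w)   # reproduces diag(3/10, 7/10)
```

Because $a \neq 0$ and $b \neq 0$ here, neither $\chi$ nor $\eta$ is parallel to $\varphi$ or $\psi$, so the two decompositions share no common projection.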
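We shall first prove this result for the case in which the support of $w$ is two-dimensional. Therefore, let $w = \lambda P_\varphi + (1 - \lambda)P_\psi$, where $\varphi \perp \psi$ and $0 < \lambda < 1$. Let $\chi$ be another vector in the plane spanned by $\{\varphi, \psi\}$. We claim there exists another decomposition of the form

$$w = \mu P_\chi + (1 - \mu)P_\eta$$

for some $0 < \mu < 1$ and $\eta \in \mathcal{H}$. For $\chi = \varphi a + \psi b$, $|a|^2 + |b|^2 = 1$, we need only set

$$\mu = \frac{\lambda(1 - \lambda)}{\lambda|b|^2 + (1 - \lambda)|a|^2}, \qquad \eta = \frac{\varphi\lambda\bar b - \psi(1 - \lambda)\bar a}{[\lambda^2|b|^2 + (1 - \lambda)^2|a|^2]^{1/2}}.$$

Let $a \neq 0$, $b \neq 0$; then the decompositions

$$w = \lambda P_\varphi + (1 - \lambda)P_\psi = \mu P_\chi + (1 - \mu)P_\eta$$

are complementary.

If the support of $w$ is three-dimensional, that is,

$$w = \lambda_1 P_{\varphi_1} + \lambda_2 P_{\varphi_2} + \lambda_3 P_{\varphi_3},$$

where $\langle\varphi_i, \varphi_k\rangle = 0$ for $i \neq k$, we may write

$$w = (\lambda_1 + \rho)\left(\frac{\lambda_1}{\lambda_1 + \rho}P_{\varphi_1} + \frac{\rho}{\lambda_1 + \rho}P_{\varphi_2}\right) + (\lambda_3 + \lambda_2 - \rho)\left(\frac{\lambda_2 - \rho}{\lambda_3 + \lambda_2 - \rho}P_{\varphi_2} + \frac{\lambda_3}{\lambda_3 + \lambda_2 - \rho}P_{\varphi_3}\right),$$

where $0 < \rho < \lambda_2$. We may then apply the result proven above to each of the ensembles

$$\frac{\lambda_1}{\lambda_1 + \rho}P_{\varphi_1} + \frac{\rho}{\lambda_1 + \rho}P_{\varphi_2}, \qquad \frac{\lambda_2 - \rho}{\lambda_3 + \lambda_2 - \rho}P_{\varphi_2} + \frac{\lambda_3}{\lambda_3 + \lambda_2 - \rho}P_{\varphi_3}.$$

We then obtain four vectors, $\chi_1$, $\eta_1$ in the plane spanned by $\{\varphi_1, \varphi_2\}$ and $\chi_2$, $\eta_2$ in the plane spanned by $\{\varphi_2, \varphi_3\}$, and a decomposition of the form

$$w = \mu_1 P_{\chi_1} + \mu_1' P_{\eta_1} + \mu_2 P_{\chi_2} + \mu_2' P_{\eta_2}$$

which is complementary to the decomposition

$$w = \lambda_1 P_{\varphi_1} + \lambda_2 P_{\varphi_2} + \lambda_3 P_{\varphi_3}.$$

Finally, if $w = \sum_\nu \lambda_\nu P_{\varphi_\nu}$, where $\langle\varphi_\nu, \varphi_\mu\rangle = 0$ for $\nu \neq \mu$, we may apply the above proof to the parts $\lambda_1 P_{\varphi_1} + \lambda_2 P_{\varphi_2}$, $\lambda_3 P_{\varphi_3} + \lambda_4 P_{\varphi_4}$, ...; if we obtain an odd number of terms in the sum, then we may apply the result obtained for the three-dimensional case to the final three terms. Thus we find that complementary decompositions are not unusual.

The structure of complementary decompositions has led to a continuing discussion about the relationship between epistemology (theory of cognition)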
and quantum theory. In particular, many are inclined to reject quantum theory. It is easy to see that the arguments used to reject quantum theory depend on an impermissible identification of the preparation procedure $a \in \mathcal{Q}'$ with the ensemble $\varphi(a) \in K$. Such an automatic identification appears to be natural in the usual formulation of quantum mechanics because the usual formulation takes into account only the set $K$ but not the sets $M$ or $\mathcal{Q}$, nor the canonical mapping $\varphi$ of $\mathcal{Q}'$ into $K$.

7 Realizations of Decompositions

If, for a preparator $\Sigma \xrightarrow{W} K$, there exists a preparation procedure $a \in \mathcal{Q}'$ and a homomorphism $h$ of $\Sigma$ into $\mathcal{Q}(a)$ such that the following diagram is commutative:

$$\Sigma \xrightarrow{\;h\;} \mathcal{Q}(a) \xrightarrow{\;\varphi_a\;} K, \qquad \varphi_a \circ h = W,$$

then we may identify $\Sigma$ with the Boolean subring $h\Sigma$ of $\mathcal{Q}(a)$. If this is the case we may call the preparation procedure $a$ (together with $\mathcal{Q}(a)$ and $\varphi_a$) a realization of the preparator $\Sigma \xrightarrow{W} K$. The requirement that to each preparator there exists a realization in this sense is too strong (see also §4). In complete analogy to §4 we now impose the following requirement:

APr. To each preparator $\Sigma \xrightarrow{W} K$, each finite Boolean subring $\tilde\Sigma$ of $\Sigma$, and each $\sigma(\mathcal{B}, \mathcal{D})$ neighborhood $U$ of 0 there exists a $\mathcal{Q}(a) \xrightarrow{\varphi_a} K$ and a homomorphism $\tilde\Sigma \xrightarrow{h} \mathcal{Q}(a)$ such that $\varphi_a h(\sigma) - W(\sigma) \in U$ for all $\sigma \in \tilde\Sigma$. (For the space $\mathcal{D}$ see III, §3.)

APr means that we may realize (in physical approximation) each preparator and, therefore, each decomposition of a $w \in K$. In other words, it is “physically possible” to (approximately) realize each preparator. We note that APr does not, however, provide us with any information about how we may, in an actual situation, build the apparatus. Such information cannot be obtained from the theory presented here because it does not contain a mathematical description of the structure of the apparatus. In XVIII we shall take a number of small steps in the direction of the investigation of the construction problem, in which we shall consider “transpreparation” processes.
The problems concerning the maps $\psi$ which were described in §4 are practically the same for the map $\varphi\colon \mathcal{Q}' \to K$. Even if the construction of a preparation apparatus is known to us, we still do not have any theoretical tools by which we may compute the elements $\varphi(a) \in K$ corresponding to the preparation procedures $a$. By analogy to the case of registration procedures (see §4) we may guess, on the basis of a known classical theory, together with
the use of a correspondence principle, the ensemble $w = \varphi(a)$ and add this result formally as an axiom. At first we have no other choice, even though this procedure may not be very satisfactory (see, for example, XI-XVI).

The development of experimental methods for preparation and registration has, up to now, taken place in a manner similar to that described above, using classical theories, statistical theories, and quantum mechanics. This first step allows us to “estimate” the $\varphi(a)$ and $\psi(b_0, b)$. Then, by varying the experiment, that is, by using different combinations of preparation and registration procedures, we seek to improve the values of $\varphi(a)$ and $\psi(b_0, b)$.

Therefore, in physics it is not the case that there are no adequate means of determining the functional relationships between the preparation and registration apparatuses and their corresponding $\varphi(a)$ and $\psi(b_0, b)$. The only failure is that there is no comprehensive and systematic theory of the macroscopic preparation and measurement apparatuses.

In XVII we shall show that there is at least one realistic theoretical route by which, using measurement collisions, measurement transformations, and transpreparations, and starting with poorly known values of $\varphi(a)$ before collision and $\psi(b_0, b)$ after collision, we may obtain theoretically computed and very precise values of $\varphi(\tilde a)$ and $\psi(\tilde b_0, \tilde b)$ for new preparation procedures $\tilde a$ or new effect processes $(\tilde b_0, \tilde b)$, respectively. It may be that the method described in XVII for the “improvement” is the only “practical” possibility for experimentation, and that the “desired” comprehensive theory (see [13], X) is more a requirement of epistemology (see XVIII).
8 Objective Properties and Pseudoproperties of Microsystems

In the literature of quantum mechanics the discussion about properties of microsystems, about propositions about microsystems, and about problems of logic in connection with such propositions has taken on vast proportions. In this book we cannot attempt to provide an overview of all these topics. Here we shall only discuss the problem of the properties of microsystems in terms of the formulation presented in III, §4.

8.1 Objective Properties of Microsystems and Superselection Rules

We shall now seek to describe the structure $\mathcal{E}_m$ of objective properties which was defined in III, D 4.1.6. From III (4.1.11) it follows that, for each $g \in L$ and for the mapping $T_p'$ (which is dual to $T_p$) of $L$ into itself (from V, §4.1 we find that $T_p$ is an operation; thus, for $T_p$ it follows that V, Th. 4.1.3 holds),

$$g = T_p'\,g + T_{M\setminus p}'\,g. \tag{8.1.1}$$
From III (4.1.13) it follows that

$$\chi(p) = T_p'\,1 = T_p'\,g + T_p'(1 - g). \tag{8.1.2}$$

In addition we find that

$$T_p'\,g + T_{M\setminus p}'\,g + T_p'(1 - g) = T_{M\setminus p}'\,g + T_p'\,1 \le T_{M\setminus p}'\,1 + T_p'\,1 = 1. \tag{8.1.3}$$

From (8.1.1), (8.1.2), (8.1.3), and Th. 1.2.4 it follows that $g$ and $\chi(p)$ are coexistent. Therefore $\chi(p)$ is coexistent with each $g \in L$, and is therefore coexistent with each $e \in G$. Thus, according to Th. 1.3.4, $\chi(p)$ commutes with each $e \in G$. Therefore $\chi(p)$ has the following form:

$$T_p'\,1 = \chi(p) = (\lambda_1 1, \lambda_2 1, \ldots). \tag{8.1.4}$$

Since $T_p'\,g + T_{M\setminus p}'\,g = g$, for each $g \in L$ we obtain $T_p'\,g \le g$ and find that

$$T_p'(0, 0, \ldots, g_i, \ldots) = (0, 0, \ldots, g_i', 0, \ldots),$$

where $g_i' \le g_i$. Since $T_p'$ is linear, from the previous result and (8.1.4) it follows that

$$T_p'(0, 0, \ldots, 1, \ldots) = (0, 0, \ldots, \lambda_i 1, 0, \ldots), \tag{8.1.5}$$

from which we find that

$$T_p'^{\,2}(0, 0, \ldots, 1, \ldots) = (0, 0, \ldots, \lambda_i^2 1, \ldots) \quad\text{and}\quad T_p'^{\,2}\,1 = (\lambda_1^2 1, \ldots, \lambda_i^2 1, \ldots). \tag{8.1.6}$$

From III (4.1.11) we obtain the following special case: $T_p^2 = T_{p \cap p} = T_p$; from this result, and from (8.1.6) and (8.1.4), it follows that either $\lambda_i = 0$ or $\lambda_i = 1$. Therefore $\chi(p)$ is an element of the center $Z$ (see Th. 1.3.8) of $G$. The map $\mathcal{E}_m \xrightarrow{\chi} L$ therefore maps into $Z$ as follows:

$$\mathcal{E}_m \xrightarrow{\;\chi\;} Z. \tag{8.1.7}$$

$Z$ is a complete Boolean ring. $\chi$ must then be an isomorphism of $\mathcal{E}_m$ onto a Boolean subring of $Z$, since the following condition is satisfied: if $p \neq \emptyset$ then we must have $\chi(p) \neq 0$, since if $\lambda_{\mathcal{Q}}(a, a \cap p) = \mu(\varphi(a), \chi(p)) = 0$ we would obtain $a \cap p = \emptyset$ for all $a \in \mathcal{Q}'$ and therefore $p \cap \bigcup_{a \in \mathcal{Q}'} a = \emptyset$, and, from APS 8.1, we would obtain $p = \emptyset$.

It is a certain hypothesis in the sense of [1], §10.1 that the map (8.1.7) is surjective. Since we do not wish to discuss the hypotheses which are described in [1], §10.1, we shall formulate, as an axiom, the condition that (8.1.7) is surjective. Then it follows that $\chi$ is an isomorphism between $\mathcal{E}_m$ and $Z$.

According to III (4.1.8), for $p \in \mathcal{E}_m$ we obtain

$$p = \bigcup_{\substack{a \in \mathcal{Q} \\ a \subset p}} a.$$
From $a \subset p$ it follows that $\lambda_{\mathcal{Q}}(a, a \cap p) = \lambda_{\mathcal{Q}}(a, a) = 1$; conversely, from $\lambda_{\mathcal{Q}}(a, a \cap p) = 1$ it follows that $\lambda_{\mathcal{Q}}(a, a \cap (M \setminus p)) = 0$, since $\lambda_{\mathcal{Q}}(a, a \cap p) + \lambda_{\mathcal{Q}}(a, a \cap (M \setminus p)) = 1$. Thus it follows that $a \cap (M \setminus p) = \emptyset$ and therefore $a \subset p$, where the latter is equivalent to the condition that $\mu(\varphi(a), \chi(p)) = 1$, that is, $\varphi(a) \in K_1(\chi(p))$. Therefore, for each $e \in Z$ an objective property is defined by

$$e \to p(e) = \bigcup_{\substack{a \in \mathcal{Q} \\ \varphi(a) \in K_1(e)}} a, \tag{8.1.8}$$

where $\chi(p(e)) = e$. Thus we find that (8.1.8) is the inverse mapping of (8.1.7).

Thus it is understandable that we may sometimes say (somewhat incorrectly) that $Z$ itself is a collection of objective properties of microsystems.

Since $\mu(w_1, e) = \mu(w_2, e)$ for all $e \in Z$ does not imply that $w_1 = w_2$, it follows that the microsystems (as described in terms of III, D 4.1.7) are not physical objects.

Of special importance are the atoms of the center $Z$ (see the end of §1.3), because we could always assume that each microsystem $x$ has one and only one property which is characterized by an atom of $Z$. Let $Z_A$ denote the set of atoms of $Z$. Then we obtain

$$M = \bigcup_{e \in Z_A} p(e), \tag{8.1.9}$$

where $p(e)$ is defined by (8.1.8). If $e_1, e_2 \in Z_A$ and $e_1 \neq e_2$, we then find that

$$p(e_1) \cap p(e_2) = \emptyset. \tag{8.1.10}$$

In physics it is customary to use “names” to denote objective properties $p(e)$, $e \in Z_A$, such as “$p(e_1)$ is the set of electrons,” “$p(e_2)$ is the set of hydrogen atoms,” or “$p(e_3)$ is the set of helium atoms,” etc. The introduction of these “names” requires a more extensive characterization of the individual elements of $Z_A$ than we have previously seen, that is, additional axioms are needed to describe how (that is, by means of which apparatus) we may produce or prepare the individual $e \in Z_A$. An experimental physicist can easily provide us with several apparatuses by which we may “produce” or detect, for example, electrons or hydrogen atoms.
Although we are unable, at this stage of the development of the theory, to assign characteristic names to the atoms of the center $Z$, we now find it desirable to introduce the concept of a system type.

D 8.1.1. The elements $e \in Z_A$ are called system types; for $e \in Z_A$ we shall call the set $p(e)$ the set of systems of type $e$.
Using (8.1.9) and (8.1.10), for each $a \in \mathcal{Q}'$ we obtain the following decomposition:

$$a = \bigcup_{e \in Z_A} (a \cap p(e)). \tag{8.1.11}$$

Thus we obtain

$$\varphi(a) = \sum_{e} \lambda(e)\,\varphi(a \cap p(e)), \tag{8.1.12}$$

where $\lambda(e) = \operatorname{tr}(\varphi(a)e)$. Equation (8.1.12) is nothing other than a decomposition of each $w \in K$ into components with respect to the different $\mathcal{H}_\nu$ as follows: Let $w = (W_1, W_2, \ldots)$ and $w_\nu = (0, 0, \ldots, W_\nu, \ldots)$. Then $w = \sum_\nu w_\nu$. This decomposition of $w$ is uniquely determined by the condition $w_\nu \operatorname{tr}(w_\nu)^{-1} \in K_1(e_\nu)$. This decomposition is coexistent with all other decompositions of $w$.

Thus it is understandable that in physics we are, in most cases, concerned only with the individual system types. This is evident from the fact that each registration procedure $b$ may be decomposed as follows:

$$b = \bigcup_{e \in Z_A} (b \cap p(e)), \tag{8.1.13}$$

from which it follows that

$$\psi(b_0, b) = \sum_{e \in Z_A} \psi(b_0, b \cap p(e)). \tag{8.1.14}$$

Equation (8.1.14) is nothing other than a decomposition of each effect $F = (F_1, F_2, \ldots)$ into its components $F_\nu \le e_\nu$. Here, for a given system type, only the probability $\operatorname{tr}(W_\nu F_\nu)$ is of interest.

The set $\mathcal{E}_m$ of objective properties, like the set $Z_A$ of system types, is clearly related to the concept of “superselection rules.” We may (in a formal mathematical sense) construct a Hilbert space $\mathcal{H} = \sum_i \oplus\, \mathcal{H}_i$ from a direct sum of the individual Hilbert spaces $\mathcal{H}_i$, where the latter, according to AIV, §15, determine the algebra $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$. Then $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ will be identified with a subalgebra of $\mathcal{B}'(\mathcal{H})$. This subalgebra is characterized by the condition that all $A$ in $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ commute with the projections $P_i$ onto the individual subspaces $\mathcal{H}_i$ of $\mathcal{H}$ (see AIV, §15). The $P_i$ are, however, the atoms of the center $Z$. Each $P_i$ is then a “superselection rule,” since the $P_i$ commute with all “actual” decision observables in $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and are therefore, in all circumstances, invariant quantities.
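The decomposition of an ensemble $w = (W_1, W_2, \ldots)$ into system types described above can be sketched in a few lines. The example is only illustrative: the ensemble is a hypothetical one with two system types, each block is a small matrix with exact rational entries, and the function names are not taken from the text. The weights play the role of $\lambda(e_\nu) = \operatorname{tr}(w\,e_\nu)$ and the normalized blocks the role of $w_\nu \operatorname{tr}(w_\nu)^{-1}$.

```python
from fractions import Fraction

def trace(block):
    return sum(block[i][i] for i in range(len(block)))

def decompose_by_type(blocks):
    """Split w = (W_1, W_2, ...) into (index, weight, normalized block).

    weight is tr(W_nu); blocks with zero weight are skipped because they
    cannot be normalized.
    """
    parts = []
    for nu, block in enumerate(blocks, start=1):
        weight = trace(block)
        if weight == 0:
            continue
        normalized = [[x / weight for x in row] for row in block]
        parts.append((nu, weight, normalized))
    return parts

# Hypothetical ensemble with two system types: a 2x2 block and a 1x1 block.
w = [
    [[Fraction(1, 2), Fraction(1, 8)], [Fraction(1, 8), Fraction(1, 4)]],
    [[Fraction(1, 4)]],
]
for nu, weight, component in decompose_by_type(w):
    print(nu, weight, trace(component))
```

The weights sum to one and each normalized component has unit trace, so each component is itself an ensemble supported on a single system type.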
Invariant “under all circumstances” is only an imprecise formulation of what we have previously called “objective properties.”

In closing we shall now ask whether there exists a set $\mathcal{E} = \mathcal{E}_p \cap \mathcal{E}_r$ (as defined at the end of III, §4.1) which is so large that the equivalence relation on $\mathcal{Q}'$ defined by

$$a_1 \approx a_2 :\; \{\lambda_{\mathcal{Q}}(a_1, a_1 \cap p) = \lambda_{\mathcal{Q}}(a_2, a_2 \cap p) \text{ for all } p \in \mathcal{E}\}$$

is finer than that defined by the $f \in \mathcal{F}$. From [1], §12.3 it follows that such a set does not exist, because we may apply the results of that section to $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ and obtain the following result.
There exists a complete Boolean ring $\Sigma$ and a pair of maps $\mathcal{Q}' \xrightarrow{\tilde\varphi} K(\Sigma)$ and $\mathcal{F} \xrightarrow{\tilde\psi} L(\Sigma)$, where $\tilde\varphi\mathcal{Q}'$ is dense in $K(\Sigma)$ and $\tilde\psi\mathcal{F}$ is dense in $L(\Sigma)$. A $\sigma$-continuous and surjective map $L(\Sigma) \to L$ is then defined by the maps $\mathcal{F} \xrightarrow{\tilde\psi} L(\Sigma)$ and $\mathcal{F} \xrightarrow{\psi} L$. According to §2 this would mean that all effects would be coexistent (in particular, $G$ would be a Boolean ring), in contradiction to our axioms for microsystems.

8.2 Pseudoproperties of Microsystems

In §8.1 we have found that microsystems are not micro-objects, that is, they do not have a sufficient set of objective properties. We shall now consider the analogous question concerning the physically realizable pseudoproperties of microsystems (see III, D 4.2.1).

Let $\mathcal{E}_{ps}$ denote the set of physically realizable pseudoproperties. For $p \in \mathcal{E}_{ps}$ we obtain the following result from III, §4.2: Let $K_0L_0(A_p) = K_1(e) = K_0(e^\perp)$ where $e \in G$, and let $K_0L_0(A_{p^*}) = K_1(e^*) = K_0(e^{*\perp})$ where $e^* \in G$. Then, according to III (4.2.8), there exists a $g \in L_0(A_{p^*})$ with $g \in L_1(A_p)$, for which

$$g \le e^{*\perp} = 1 - e^* \quad\text{and}\quad g \ge e. \tag{8.2.1}$$

From this it follows that

$$e \le e^{*\perp}, \quad\text{that is,}\quad e \perp e^* \quad\text{and}\quad e + e^* \le 1. \tag{8.2.2}$$

From III (4.2.9) it follows that

$$0 = L_0(A_p) \cap L_0(A_{p^*}) = L_0K_0(e^{*\perp}) \cap L_0K_0(e^\perp) = L_0K_0(e^{*\perp} \wedge e^\perp)$$

and therefore

$$e^{*\perp} \wedge e^\perp = 0, \quad\text{that is,}\quad e^* \vee e = 1. \tag{8.2.3}$$

From (8.2.2) and (8.2.3) we obtain

$$e + e^* = 1, \quad\text{that is,}\quad e^* = e^\perp. \tag{8.2.4}$$

From (8.2.1) it follows that there exists a particular $b_0 \in \mathcal{R}_0$ such that $g = e = \psi(b_0, b_0 \cap p_r)$.

From $A_p \subset K_1(e)$ it follows that, for all $a \subset p$,

$$\varphi(a) \in K_1(e). \tag{8.2.5}$$

From $\psi(b_0, b) \in L_0(A_{p^*})$ it follows that, for all $b \subset p$,

$$\psi(b_0, b) \le e^{*\perp} = e. \tag{8.2.6}$$
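Conversely, if $\varphi(a) \in K_1(e)$, then it follows that

$$0 = \mu(\varphi(a), e^\perp) \ge \mu(\varphi(a), \psi(b_0, b))$$

for all $b \subset p^*$. Therefore we obtain $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = 0$ and $a \cap b = \emptyset$ for all $b \subset p^*$, that is,

$$a \cap (p^*)_r = \emptyset. \tag{8.2.7}$$

Suppose that $a \cap (p^*)_p \neq \emptyset$; then there exists an $\tilde a \subset p^*$ such that $a \cap \tilde a \neq \emptyset$. For $a' = a \cap \tilde a$, since $a' \subset a$, we clearly obtain $\varphi(a') \in K_1(e)$. Since $a' \subset \tilde a \subset p^*$ we obtain $\varphi(a') \in A_{p^*} \subset K_0(e^{*\perp}) = K_0(e)$. Thus we are led to a contradiction, from which it follows that

$$a \cap (p^*)_p = \emptyset. \tag{8.2.8}$$

From (8.2.7) and (8.2.8) it follows that $a \cap p^* = \emptyset$, that is, $a \subset M \setminus p^*$ and therefore $a \subset p$. Therefore we may strengthen (8.2.5) as follows:

$$a \subset p \;\Leftrightarrow\; \varphi(a) \in K_1(e). \tag{8.2.9}$$

Suppose that $\psi(b_0, b) \le e$. Then, for $\varphi(\tilde a) \in A_{p^*} \subset K_0(e)$ we obtain

$$0 = \mu(\varphi(\tilde a), \psi(b_0, b)) = \lambda_{\mathcal{S}}(\tilde a \cap b_0, \tilde a \cap b),$$

that is, $\tilde a \cap b = \emptyset$ for all $\tilde a \subset p^*$, from which it follows that

$$b \cap (p^*)_p = \emptyset. \tag{8.2.10}$$

Suppose that $b \cap (p^*)_r \neq \emptyset$; then there would be a $\tilde b \subset p^*$ for which $b' = b \cap \tilde b \neq \emptyset$. Since $b' \subset b$ we obtain $\psi(b_0, b') \le \psi(b_0, b) \le e$. Since $b' \subset \tilde b$ there exists a $\tilde b_0$ for which $\psi(\tilde b_0, \tilde b) \in L_0(A_p) = L_0K_0(e^\perp)$, and we therefore obtain $\psi(\tilde b_0, b') \le \psi(\tilde b_0, \tilde b) \le e^\perp$. From $\psi(b_0, b') \le e$ and $\psi(b_0, b') \le e^\perp$ it follows that $\psi(b_0, b') = 0$, in contradiction to $b' \neq \emptyset$. Therefore we obtain

$$b \cap (p^*)_r = \emptyset. \tag{8.2.11}$$

From (8.2.10) and (8.2.11) it follows that $b \cap p^* = \emptyset$, that is, $b \subset M \setminus p^*$, and we obtain $b \subset p$. Therefore we may strengthen (8.2.6) as follows:

$$b \subset p \;\Leftrightarrow\; \psi(b_0, b) \le e. \tag{8.2.12}$$

From (8.2.12) it follows that

$$p_r = \bigcup_{b \in \mathcal{R}_e} b, \tag{8.2.13}$$

where $\mathcal{R}_e = \{b \mid b \in \mathcal{R}$ and there exists a $b_0 \in \mathcal{R}_0$ for which $\psi(b_0, b) \le e\}$.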
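From (8.2.9) we obtain

$$p_p = \bigcup_{a \in \mathcal{Q}_e} a, \quad\text{where}\quad \mathcal{Q}_e = \{a \mid a \in \mathcal{Q}' \text{ and } \varphi(a) \in K_1(e)\}. \tag{8.2.14}$$

Finally we obtain

$$p = \bigcup_{a \in \mathcal{Q}_e} a \;\cup \bigcup_{b \in \mathcal{R}_e} b. \tag{8.2.15}$$

To each physically realizable pseudoproperty $p$ there exists a corresponding $e \in G$ which is obtained from the equation $K_0L_0(A_p) = K_1(e)$ and defines a map $\mathcal{E}_{ps} \to G$. This map is injective, because, according to (8.2.15), $p$ is uniquely determined by its image $e \in G$.

We will now show that, for an arbitrary $e \in G$, the corresponding $p$ (which we shall denote by $p(e)$) given by (8.2.15) is an element of $\Pi$ (where the latter is defined by III, §4.2). By using the method of proof of (8.2.7)-(8.2.11) it may be shown that, if $a \in \mathcal{Q}_e$, then for all $\tilde a \in \mathcal{Q}_{e^\perp}$ and all $\tilde b \in \mathcal{R}_{e^\perp}$ it follows that $a \cap \tilde a = \emptyset$, $a \cap \tilde b = \emptyset$, that is, $a \cap p(e^\perp) = \emptyset$. Similarly, for $b \in \mathcal{R}_e$ it follows that $b \cap p(e^\perp) = \emptyset$. Therefore $p(e) \cap p(e^\perp) = \emptyset$. This is equivalent to $p(e^\perp) \subset p(e)^*$, where $p(e)^* = \pi(M \setminus p(e))$ and $\pi(c)$ is defined by III (4.2.2).

From $a \subset p(e)$ it follows that $a \cap p(e^\perp) = \emptyset$ and hence $a \cap b = \emptyset$ for all $b \in \mathcal{R}_{e^\perp}$, that is, $\mu(\varphi(a), \psi(b_0, b)) = 0$ for all $\psi(b_0, b) \le e^\perp$. If

$$\sup_{b \in \mathcal{R}_{e^\perp}} \psi(b_0, b) = e^\perp, \tag{8.2.16}$$

then it follows that $\mu(\varphi(a), e^\perp) = 0$ and therefore $a \in \mathcal{Q}_e$.

From $\tilde b \subset p(e)$ it follows that $\tilde b \cap p(e^\perp) = \emptyset$ and hence $\tilde b \cap \tilde a = \emptyset$ for all $\tilde a \in \mathcal{Q}_{e^\perp}$, that is, $\mu(\varphi(\tilde a), \psi(\tilde b_0, \tilde b)) = 0$ for all $\tilde a \in \mathcal{Q}_{e^\perp}$. Let $A_e = \{\varphi(a) \mid a \in \mathcal{Q}', \varphi(a) \in K_1(e)\}$; if

$$L_0(A_{e^\perp}) = L_0K_1(e^\perp) = L_0K_0(e), \tag{8.2.17}$$

that is, if the face generated by $A_{e^\perp}$ is equal to $K_1(e^\perp)$, then it follows that $\psi(\tilde b_0, \tilde b) \in L_0K_0(e)$ and hence $\psi(\tilde b_0, \tilde b) \le e$, that is, $\tilde b \in \mathcal{R}_e$. Thus we obtain

$$p(e) = \bigcup_{\substack{a \in \mathcal{Q} \\ a \subset p(e)}} a \;\cup \bigcup_{\substack{b \in \mathcal{R} \\ b \subset p(e)}} b,$$

that is, $p(e)$ has the form III (4.2.2).

At the same time we have also proven that if $a \subset M \setminus p(e^\perp)$ and $\tilde b \subset M \setminus p(e^\perp)$ it also follows that $a \subset p(e)$ and $\tilde b \subset p(e)$. Thus, from $p(e^\perp) \subset p(e)^*$ (see above) and from $a \subset M \setminus p(e)^*$, $\tilde b \subset M \setminus p(e)^*$ we also obtain $a \subset p(e)$ and $\tilde b \subset p(e)$, that is, $p(e) = p(e)^{**}$. Therefore $p(e)$ is an element of $\Pi$.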
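Furthermore it follows that $p(e)^* = p(e^\perp)$, since (8.2.16) and (8.2.17) are also satisfied if we replace $e$ by $e^\perp$. From $K_1(e_1) \cap K_1(e_2) = K_1(e_1 \wedge e_2)$ it follows that $\mathcal{Q}_{e_1} \cap \mathcal{Q}_{e_2} = \mathcal{Q}_{e_1 \wedge e_2}$. From $g \le e_1$ and $g \le e_2$ it follows that $g \le e_1 \wedge e_2$ and $\mathcal{R}_{e_1} \cap \mathcal{R}_{e_2} = \mathcal{R}_{e_1 \wedge e_2}$. Thus $p(e_1) \wedge p(e_2) = p(e_1 \wedge e_2)$.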
Let $e \in G$; then condition (8.2.16), that is,

$$\sup_{b \in \mathcal{R}_e} \psi(b_0, b) = e,$$

is satisfied if (in correspondence to the assumption from III, D 4.2.1) to $e$ there exists a $b_0 \in \mathcal{R}_0$ for which

$$\psi(b_0, b_0 \cap p(e)_r) = e. \tag{8.2.18}$$

The condition (8.2.17) is satisfied for an $e \in G$, that is, $L_0(A_e) = L_0 K_1(e)$, if there exists an $a \in \mathcal{Q}'$ for which

$$C(\varphi(a)) = K_1(e), \tag{8.2.19}$$

where $C(\varphi(a))$ is the face generated by $\varphi(a)$ (see also III, §3).

If the elements $e$ of an orthocomplemented sublattice $G_p$ of $G$ satisfy conditions (8.2.18) and (8.2.19), then $\mathcal{E}_{ps} = \{p(e) \mid e \in G_p\}$ is an orthocomplemented sublattice of $\Pi$ which is isomorphic to $G_p$. It is a certain hypothesis that such a sublattice $G_p$ of $G$ exists which is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $G$. Since, as in §8.1, we shall not discuss certain hypotheses, here we shall assert the existence of such a sublattice $G_p$ as an axiom. For such a $G_p$, $\mathcal{E}_{ps} = \{p(e) \mid e \in G_p\}$ is a sufficient system (see III, D 4.2.2) of physically realizable pseudoproperties.

From (8.1.18) and $\kappa(M \setminus p(e)) = p(e^\perp)$ (where $\kappa$ is defined in §8.1) it follows that we may always assume that $Z \subset G_p$. We therefore assume that $Z \subset G_p$. Thus we obtain

$$\mathcal{E}_m = \{p(e) \mid e \in Z\} \subset \mathcal{E}_{ps}.$$

We may characterize $\mathcal{E}_m$ as a subset of $\mathcal{E}_{ps}$ as follows:

$$\mathcal{E}_m = \{p \mid p \in \mathcal{E}_{ps} \text{ and } p^* = M \setminus p\}.$$

PROOF. For $e \in Z$, with $p(e)$ defined by (8.1.8), we find that $\kappa(M \setminus p(e)) = p(e^\perp)$ and therefore $p(e)^* = M \setminus p(e)$.

Conversely, suppose that $p(e) \in \mathcal{E}_{ps}$ and $p(e)^* = M \setminus p(e)$, but that $e \in Z$ is not satisfied. Since $G_p$ is dense in $G$ there exists an $\tilde e \in G_p$ which is not commensurable with $e$. From $(\tilde e \wedge e) \vee (\tilde e \wedge e^\perp) \le \tilde e$, and since $e$ and $\tilde e$ are not commensurable, it follows that $e_1 = \tilde e - [(\tilde e \wedge e) \vee (\tilde e \wedge e^\perp)] \ne 0$. Since $G_p$ is an orthocomplemented lattice we also obtain $e_1 \in G_p$. In addition, we find that $e_1 \wedge e = 0$ and $e_1 \wedge e^\perp = 0$. Since $(p(e_1) \wedge p(e))_r = p(e_1)_r \cap p(e)_r$ and $p(e_1) \wedge p(e) = p(e_1 \wedge e) = p(0) = \emptyset$ we therefore find that $p(e_1)_r \cap p(e)_r = \emptyset$. Similarly we obtain $p(e_1)_r \cap p(e^\perp)_r$
$= \emptyset$, $p(e_1)_p \cap p(e)_p = \emptyset$ and $p(e_1)_p \cap p(e^\perp)_p = \emptyset$. Therefore we finally obtain $p(e_1) \cap p(e) = \emptyset$ and $p(e_1) \cap p(e^\perp) = \emptyset$. Since $p(e^\perp) = p(e)^* = M \setminus p(e)$ it follows that $p(e_1) = \emptyset$, in contradiction to $e_1 \ne 0$. Therefore, for $e \notin Z$ we find that $p(e)^* \ne M \setminus p(e)$.

Since $G_p$ is dense in $G$, in addition to designating the set $\mathcal{E}_{ps} = \{p(e) \mid e \in G_p\}$ as a set of pseudoproperties of microsystems, we shall also refer to $G$ itself (somewhat imprecisely) as a set of pseudoproperties. The last designation is, however, often misunderstood. We have introduced the
set $\mathcal{E}_{ps}$ in order to be more precise. We may reduce the difficulty of interpretation if we translate “$x \in p$” for $p \in \mathcal{E}_{ps}$ into ordinary language by the expression “$x$ has the pseudoproperty $p$.”

8.3 Logic of Decision Effects?

Now that we have discussed the meaning of decision effects in various circumstances, we shall briefly discuss a number of expressions which frequently lead to misunderstandings in quantum mechanics.

Since the set $G$ is, in some respects, analogous to the set of properties of a classical system, it is common to refer to an element $e \in G$ as a property (and not more precisely as a pseudoproperty). Since we are not accustomed to describing individual microsystems mathematically (as we do in this book), we frequently express the above assertion in ordinary language as follows: “a particular microsystem has the property $e$.” In seeking to give such statements a verifiable meaning it was recognized that if the assertion “the microsystem has the property $e_1$ and has the property $e_2$” is replaced by the assertion “the microsystem has the property $e_1 \wedge e_2$,” then doubts arise about the validity of the usual two-valued logic. A similar case exists if the negation of the proposition “the microsystem has the property $e$” is replaced by the proposition “the microsystem has the property $e^\perp$.” In this way a nonstandard logic of propositions was derived in which the logical operations are (in a sense) parallel to the lattice operations in $G$. This parallelism may be expressed as follows:

$$\text{and} \leftrightarrow \wedge, \qquad \text{or} \leftrightarrow \vee, \qquad \text{not} \leftrightarrow \perp. \tag{8.3.1}$$
The lattice $G$ is often called a quantum logic, even in the case in which the relationships (8.3.1) are not strictly required or “believed.”

In the discussion of propositions concerning the properties of microsystems an alternative possibility proceeds from the idea that, in quantum mechanics, it is not possible to formulate so-called “objective” propositions of the form “the microsystem has the property $e$.” Instead, it is suggested that we may only formulate “subjective” propositions, such as “I know that the microsystem under consideration has the property $e$.” Here there exist two different types of negation: “I do not know that the microsystem has the property $e$” and “I know that the microsystem has the property $e^\perp$.” Here “I do not know ...” can be considered to be an imprecise form of the proposition “I do not know for certain whether the microsystem in question has the property $e$ or $e^\perp$; there is a probability $\alpha$ (possibly subjective) that the microsystem has the property $e$ (and $1 - \alpha$ for the property $e^\perp$).” In this way attempts have been made to develop a “probability logic.”

The notion of a “probability logic” will not be discussed here (see, for example, [16] and [9]). Instead, we shall continue the development of the
fundamental description of individual microsystems in mathematical terms, terms which are in some correspondence to the more intuitive ideas formulated above.

Instead of $G$ we shall consider the dense subset $G_p$ of $G$ which was introduced at the end of §8.2, and the corresponding set of physically realizable pseudoproperties $\mathcal{E}_{ps}$ together with the isomorphic map $e \to p(e)$.

Let $e \in G_p$; we shall now express the relationship

$$x \in p(e) \tag{8.3.2}$$

in ordinary language as follows: “The microsystem $x$ has the pseudoproperty $e$.” Here it is important to note that the ordinary language formulation should not be construed to mean anything other than the relation described by (8.3.2). In this formulation (8.3.2) is primarily a relation in the mathematical description of a physical theory, that is, it has a physical interpretation. We should not, however, make the mistake of using the ordinary language description of (8.3.2) as an alternative interpretation in addition to that already given in II. We shall now proceed as follows:

According to the methodology presented in [1], certain relationships in a mathematical theory $\mathcal{MT}$ (as a part of a physical theory $\mathcal{PT}$) may be considered to be a representation of real physical facts. Here it is important to emphasize that, in addition to the mathematical formulation of $\mathcal{MT}$, there does not exist another type of “proposition” formulated in $\mathcal{PT}$. The mathematical formulation (for example, (8.3.2)) is, on the basis of the physical interpretation of the fundamental sets (for example, $\mathcal{Q}$, $\mathcal{R}$, $\mathcal{R}_0$ and the real function in II), considered to be an assertion about reality. Thus the logic used in $\mathcal{PT}$ is only that of the mathematical theory. In this way we obtain, in a natural way, certain mathematical assertions of a real character (see [1], §10 or [2], III, §9). We shall now describe this situation using (8.3.2) without making use of the general formulation presented in [1], §10.
In the mathematical framework we cannot assign the “values” true or false to the relation (8.3.2). The meaning and importance of relations such as (8.3.2) in a mathematical theory is much more complicated and requires a more precise analysis. Here we shall impose all of the axioms previously introduced and also those introduced in subsequent chapters (for example, VII).

We shall begin by providing a logical analysis of relations of the form (8.3.2). Here, by the expression “logical analysis” we mean an analysis in the sense of a mathematical theory.

We may logically “combine” two relations of the form (8.3.2) by means of the logical conjunction “and” as follows:

$$x \in p(e_1) \text{ and } x \in p(e_2), \tag{8.3.3}$$

where (8.3.3) is equivalent to the relation

$$x \in p(e_1) \cap p(e_2). \tag{8.3.4}$$
We note that, in general,

$$p(e_1) \cap p(e_2) \ne p(e_1 \wedge e_2). \tag{8.3.5}$$

We note that $p(e) = p(e)_p \cup p(e)_r$ and

$$\begin{aligned}
p(e_1) \cap p(e_2) &= (p(e_1)_p \cup p(e_1)_r) \cap (p(e_2)_p \cup p(e_2)_r) \\
&= (p(e_1)_p \cap p(e_2)_p) \cup (p(e_1)_p \cap p(e_2)_r) \cup (p(e_1)_r \cap p(e_2)_p) \cup (p(e_1)_r \cap p(e_2)_r) \\
&= p(e_1 \wedge e_2)_p \cup p(e_1 \wedge e_2)_r \cup (p(e_1)_p \cap p(e_2)_r) \cup (p(e_1)_r \cap p(e_2)_p) \\
&= p(e_1 \wedge e_2) \cup (p(e_1)_p \cap p(e_2)_r) \cup (p(e_1)_r \cap p(e_2)_p).
\end{aligned}$$

For the special case in which $e_1 \wedge e_2 = 0$ we find that

$$p(e_1) \cap p(e_2) = (p(e_1)_p \cap p(e_2)_r) \cup (p(e_1)_r \cap p(e_2)_p)$$

need not be empty! For example, $p(e_1)_p \cap p(e_2)_r \ne \emptyset$ if there exists a preparation procedure $a$ for which $\varphi(a) \in K_1(e_1)$ and $\mu(\varphi(a), e_2) \ne 0$ because, for $\psi(b_0, b_0 \cap p(e_2)_r) = e_2$ (see (8.2.18)), we must then have $\lambda_{\mathcal S}(a \cap b_0, a \cap b_0 \cap p(e_2)_r) \ne 0$, that is, $a \cap p(e_2)_r \ne \emptyset$.

For example, let $e_1$, $e_2$ be the following decision effects: $e_1$: the momentum lies within a compact region $W$ of momentum space; $e_2$: the position lies within a compact region $V$ of position space (see VII, §4). Then the set in (8.3.4) is nonempty because it is possible to prepare microsystems from $p(e_1)_p$ (that is, with momentum in $W$) which may be registered according to $p(e_2)_r$ (with position in $V$). If we express $x \in p(e_1)$, $x \in p(e_2)$ as follows: “$x$ has momentum in $W$” and “$x$ has position in $V$,” respectively, then (8.3.4) is equivalent to the logical conjunction (8.3.3) which says that “$x$ has momentum in $W$ and position in $V$.” For sufficiently small domains $W$ and $V$ the latter statement may appear to contradict the Heisenberg uncertainty relation. We shall now find that this contradiction is only an apparent one.

The following objection is frequently made, particularly in the case of position and momentum: “after” the registration of the position (that is, for the elements of $p(e_2)_r$) the momentum has been changed. This is indeed the case. This fact is, however, irrelevant to the interpretation of quantum mechanics.
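The case $e_1 \wedge e_2 = 0$ with a nonempty intersection can be made concrete in a small numerical sketch. The following Python fragment is only an illustration under stated assumptions: it uses a two-dimensional Hilbert space as a stand-in model, with $\mu(w, e) = \operatorname{tr}(we)$, and the names `E1`, `E2`, `W`, `matmul` and `mu` are hypothetical and do not appear in the text. It checks that an ensemble lying in $K_1(e_1)$ can still give a nonzero probability for $e_2$, although the two projectors have meet zero.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mu(w, e):
    """Probability mu(w, e) = tr(w e) of the effect e in the ensemble w."""
    product = matmul(w, e)
    return sum(product[i][i] for i in range(len(product)))


half = Fraction(1, 2)
E1 = [[1, 0], [0, 0]]              # assumed decision effect e1: projector onto (1, 0)
E2 = [[half, half], [half, half]]  # assumed decision effect e2: projector onto (1, 1)/sqrt(2)
W = [[1, 0], [0, 0]]               # assumed ensemble phi(a), chosen to lie in K_1(e1)

# Both effects are projectors onto different one-dimensional subspaces,
# so their ranges share only the zero vector, i.e. the meet of e1 and e2 is 0.
assert matmul(E1, E1) == E1 and matmul(E2, E2) == E2 and E1 != E2

print(mu(W, E1))  # e1 is certain in this ensemble
print(mu(W, E2))  # e2 nevertheless responds with nonzero probability
```

In this toy model the preparation plays the role of $a \subset p(e_1)_p$ and the nonzero value of `mu(W, E2)` corresponds to $a \cap p(e_2)_r \ne \emptyset$.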
Quantum mechanics makes assertions concerning the interaction between the preparation apparatus and the registration apparatus which is brought about by microsystems. Therefore all such assertions are concerned with the microsystems “between” preparation and registration; here (8.3.3) represents an assertion which is both correct and important. (In XVII, §4 we shall
examine the “trans-preparation” process and obtain a number of conclusions concerning the “passage” of microsystems through the registration apparatus. We note, however, that these special processes, special relative to the more general process of preparation, are not needed for the interpretation of quantum mechanics.)

We shall now show by an example that the “strength” of the disturbance of a system which occurs during the registration process is not an important issue in the interpretation of quantum mechanics. Let us consider a “classical” system, for example, a bullet which is fired by a gun (as the preparation apparatus) and produces a hole in a target (the registration apparatus). No one will object to the use of the expressions “position of the bullet” and “momentum of the bullet” in the description of the bullet immediately before it is “registered” by the target. Here the “strength” of the influence of the target is unimportant; the bullet may, for example, become embedded in the target if the target is a metal plate. The real distinction between macro- and microsystems lies in the structure of the convex set $K$ which, for macrosystems, is completely different from that for microsystems (see, for example, the remarks in III, §3).

The following claim is often made: the “classical” mode of description is made possible whenever the disturbance caused by the measurement can be neglected. This claim is false, and avoids the actual problem, making an understanding of the problem more difficult. Indeed, it is correct to say that every registration disturbs, both in the case of a classical system and in that of a quantum mechanical system. Whether a system need be described classically or quantum mechanically has nothing to do with the “strength” of the disturbance in the registration process.
Without disturbance there would hardly be any systems, because without disturbances we cannot prepare and register, that is, cannot “extract” the system from its surroundings and make observations; without such interaction we cannot talk about systems at all. For classical systems the set $\mathcal{E}_m$ of objective properties is “sufficiently large” that we may interpret the preparation and registration procedures by means of the objective properties (this topic was outlined in III, §4), even though the interaction during registration may be very large! (See the remarks in III, §4 and the discussion in [1], §12.)

We note that (8.3.5) holds in general. This fact has led to much unnecessary speculation. It is not difficult to see that the expression “and” is used in different senses in $e_1 \wedge e_2$ and $p(e_1) \cap p(e_2)$, respectively. The mathematical formulation used here does not allow the usage of such imprecise language.

The same situation arises if we consider the negation of (8.3.2):

$$x \notin p(e). \tag{8.3.6}$$

(8.3.6) is equivalent to

$$x \in M \setminus p(e). \tag{8.3.7}$$
In general we find that

$$M \setminus p(e) \ne p(e^\perp). \tag{8.3.8}$$

In §8.2 we have seen that $M \setminus p(e) = p(e^\perp)$ if and only if $e \in Z$. Thus, from (8.3.7) we find that, in general, the two statements “$x$ does not have the pseudoproperty $e$” and “$x$ has the pseudoproperty $e^\perp$” are not equivalent. Thus, in “ordinary” language we may easily encounter the following difficulty: Suppose, for example, that $e$ is the decision effect that the position lies in the domain $V$. Then, from the statement “$x$ does not lie in $V$” we may easily (and incorrectly!) conclude that “the position of $x$ lies in $V'$,” where $V'$ is the complement of the set $V$; here $e^\perp$ is the decision effect for the statement “the position of $x$ lies in $V'$.”

In ordinary language it is evident that we may easily arrive at contradictions with logic. The mathematical language of relation (8.3.2) does not permit such confusion.

Let $\mathcal{E}_{pb}$ be the smallest Boolean ring of sets which is generated by $\mathcal{E}_{ps}$. $\mathcal{E}_{pb}$ contains (in the sense of the equivalences (8.3.3), (8.3.4) and (8.3.6), (8.3.7)) all possible “logical relations” of the pseudoproperties in $\mathcal{E}_{ps}$. The Boolean ring $\mathcal{E}_{pb}$ is a reflection of the “ordinary” logic. This fact demonstrates that the question about the properties of a microsystem is not a question of logic, because the elements of $\mathcal{E}_{pb}$ may be called properties of the microsystem, and $\mathcal{E}_{pb}$ contains “sufficiently many” properties since $\mathcal{E}_{ps}$ already contains sufficiently many pseudoproperties. The problem is therefore not associated with the construction of a Boolean ring of sets $\mathcal{E} \subset \mathcal{P}(M)$ but with the question posed in III, §1 about “objective properties.”

If we define the term “objective” in the sense of III, §4.1 then $\mathcal{E}_{pb}$ is clearly not a set of “objective” properties, because the set of objective properties is (according to §8.1 and §8.2) the subset $\mathcal{E}_m$ of $\mathcal{E}_{ps}$. Clearly $\mathcal{E}_m$ is not a sufficient set of properties.
Since $\mathcal{E}_m$ is a Boolean ring of sets, the “logical operations” do not force us to leave the set $\mathcal{E}_m$, and we may, without hesitation, speak about these objective properties in ordinary language as, for example, “$x$ is an electron.”

By means of the following example we shall show that the elements of $\mathcal{E}_{pb}$ are not “objective” properties. Let $p = p(e)$. Then the element

$$M \setminus p = (M \setminus p_p) \cap (M \setminus p_r)$$

belongs to $\mathcal{E}_{pb}$. Let us consider two preparation procedures $a_1$ and $a_2$ which belong to the same ensemble, that is, $\varphi(a_1) = \varphi(a_2)$. Suppose that $\mu(\varphi(a_1), e) \ne 0$ and $\ne 1$. According to §6 it is possible that $a_1$ is complementary to $a_2$ (see the EPR paradox in XVII, §4.4). Let $a_2$ have a decomposition for which $\tilde a_2 \subset a_2$ and $\varphi(\tilde a_2) \in K_1(e)$; $a_1$ may not have such a decomposition. Therefore it follows that $a_1 \cap p_p = \emptyset$, $a_2 \cap p_p \ne \emptyset$. Similarly, according to §3 there may exist a pair of registration methods $b_0^{(1)}$ and $b_0^{(2)}$ for which $b_0^{(1)} \cap p_r = \emptyset$ and $\psi(b_0^{(2)}, b_0^{(2)} \cap p_r) = e$; $b_0^{(1)}$ is a registration method for which the corresponding observable is not coexistent with any effect $g \le e$.
Thus we obtain $a_1 \cap b_0^{(1)} \subset M \setminus p$, that is, for all systems $x \in a_1 \cap b_0^{(1)}$ we obtain $x \notin p$. On the other hand, from $\mu(\varphi(a_1), e) = \mu(\varphi(a_1), \psi(b_0^{(2)}, b_0^{(2)} \cap p_r)) \ne 0$ we obtain $a_1 \cap b_0^{(2)} \cap p_r \ne \emptyset$, that is, in $a_1$ there exists a system $x$ for which $x \in p$. Therefore, by the “application of the registration method $b_0^{(1)}$” alone (that is, without any selection according to a registration procedure $b$) we have selected the “property” $M \setminus p$ from the systems of type $a_1$, although $a_1$ also contains elements of $p$. The “property” $M \setminus p$ is therefore not “objective” because it depends upon the application of a registration method.

Here we again make the remark that we do not use the designation “objective property” in a meaning opposite to that of “subjective,” but (as we have already expressed in an intuitive way in II, §1) in the sense of “independent of the methods of preparation and registration,” that is, objective in the sense of the properties ascribed to the systems. The opposite of “objective” properties is therefore “relative” properties, not subjective opinions and knowledge about properties.

There is an intuitive idea that, although the microsystems are emitted by the preparation apparatus, they exist independently after the emission (that is, no interaction exists after emission between the microsystems and the preparation apparatus), and that the microsystems later act upon the registration apparatus on the basis of their inherent structure, that is, their “objective properties.” The desire for this idea is so compelling that many of us may wish to adopt the “hidden” properties hypothesis in order to retain it.

In the following discussion we shall ignore properties which are “completely hidden” in the sense that they have nothing to do with the preparation and registration processes.
The above idea may yet have an additional meaning: It is perhaps not possible to construct a sufficiently good preparation procedure in order to produce systems with definite specified objective properties. Alternatively, the registration process may be deficient in this respect: the objective properties may only be partially registered. It is in this sense that we shall attempt to formulate mathematically the notion of “hidden objective properties.” This notion will be somewhat different from that described in III, §4.1.

In addition to the structure previously introduced, we shall consider an additional structure $\mathcal{E}_h$ for which $\mathcal{E}_h \subset \mathcal{P}(M)$, where $\mathcal{E}_h$ is a Boolean ring of sets. The elements of $\mathcal{E}_h$ will represent the hidden properties.

For $p = p(e)$, $e \in G_p$ there exists a registration method $b_0$ for which $\psi(b_0, b_0 \cap p_r) = e$ (see §8.2). Then we obtain $\lambda_{\mathcal S}(a \cap p_p \cap b_0, a \cap p_p \cap b_0 \cap p_r) = 1$, that is, $a \cap p_p \cap b_0 \subset p_r$. Since the relation $(b_0 \cap p_r) \cup (b_0 \cap p_r^*) = b_0$ is satisfied for this $b_0$, for each system registered according to $b_0$ either $b_0 \cap p_r$ or $b_0 \cap p_r^*$ will “respond.” We shall attempt to interpret this situation in the following way: $p = p_p \cup p_r$ possibly does not include all systems
which have the same objective property which is made evident by the response $b_0 \cap p_r$ during the registration process $b_0$. Suppose there exists an $\varepsilon \in \mathcal{E}_h$ for which $\varepsilon \supset p_r$ such that for systems in $\varepsilon$ the registration response is always obtained for $b_0 \cap p_r$ and therefore a response is never observed for $p_r^*$, that is, $\varepsilon \cap p_r^* = \emptyset$. Since the systems in $a \cap p_p$ are such that $b_0 \cap p_r$ responds with certainty, for all $a$ we should find that $a \cap p_p \subset \varepsilon$ and therefore $p_p \subset \varepsilon$. Since the systems in $a \cap p_p^*$ are such that $b_0 \cap p_r$ certainly does not respond, we should find that $\varepsilon \cap p_p^* = \emptyset$. We then find that $\varepsilon \supset p_r$ and $\varepsilon \supset p_p$ are equivalent to $\varepsilon \supset p_p \cup p_r = p$; $\varepsilon \cap p_p^* = \emptyset$ and $\varepsilon \cap p_r^* = \emptyset$ are equivalent to $\varepsilon \cap (p_p^* \cup p_r^*) = \varepsilon \cap p^* = \emptyset$. These speculations suggest that the following axiom is desirable:

AH 1. To each $p \in \mathcal{E}_{ps}$ there exists an $\varepsilon \in \mathcal{E}_h$ for which $\varepsilon \supset p$ and $\varepsilon \cap p^* = \emptyset$.

Here $\varepsilon \cap p^* = \emptyset$ is equivalent to $M \setminus \varepsilon \supset p^*$. Since $\mathcal{E}_{ps} = p(G_p)$ we obtain: To each $e \in G_p$ there exists an $\varepsilon \in \mathcal{E}_h$ for which $\varepsilon \supset p(e)$ and $M \setminus \varepsilon \supset p(e^\perp)$.

In practical terms axiom AH 1 is not very restrictive because it is satisfied by $\mathcal{E}_h = \mathcal{E}_{ps}$ itself for the case in which $\varepsilon = p(e)$. The following idea is more restrictive: Suppose that $e_1$, $e_2$ are two decision effects in $G_p$ for which $e_1 \perp e_2$. Then we may think of a registration method $b_0$ for which

$$\psi(b_0, b_0 \cap p_{1r}) = e_1, \qquad \psi(b_0, b_0 \cap p_{2r}) = e_2,$$

where $p_1 = p(e_1)$ and $p_2 = p(e_2)$. Then from $p_{1r} \cap p_{2r} = \emptyset$ and

$$b_0 = (b_0 \cap p_{1r}) \cup (b_0 \cap p_{2r}) \cup [b_0 \setminus (b_0 \cap (p_{1r} \cup p_{2r}))]$$

we obtain

$$\psi(b_0, b_0 \setminus (b_0 \cap (p_{1r} \cup p_{2r}))) = e_3 = 1 - (e_1 + e_2)$$

and therefore $b_0 \setminus (b_0 \cap (p_{1r} \cup p_{2r})) = b_0 \cap p_{3r}$, where $p_3 = p(e_3)$. This registration method $b_0$ permits us to separate the systems according to $p_{1r}$, $p_{2r}$, $p_{3r}$. For three objective properties $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$ for which exactly one of $b_0 \cap p_{1r}$, $b_0 \cap p_{2r}$, or $b_0 \cap p_{3r}$ will respond with certainty we should therefore have $\varepsilon_1 \cup \varepsilon_2 \cup \varepsilon_3 = M$ because, for each system, in every case one of the “responses” $b_0 \cap p_{1r}$, $b_0 \cap p_{2r}$, $b_0 \cap p_{3r}$ must occur.
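The statement that one of the three responses must occur in every case can be checked numerically in a toy model. The sketch below is an illustration only, under the assumption of a three-level system in which the ensemble and the three mutually orthogonal decision effects are all diagonal, so that they can be stored as tuples of diagonal entries; the names `E1`, `E2`, `E3`, `W` and `mu` are hypothetical. It builds $e_3 = 1 - (e_1 + e_2)$ and verifies that the three response probabilities add up to one.

```python
from fractions import Fraction


def mu(weights, effect):
    """mu(w, g) = tr(w g) for a diagonal ensemble w and a diagonal effect g."""
    return sum(w * g for w, g in zip(weights, effect))


E1 = (1, 0, 0)                                  # assumed decision effect e1
E2 = (0, 1, 0)                                  # assumed decision effect e2, orthogonal to e1
E3 = tuple(1 - a - b for a, b in zip(E1, E2))   # e3 = 1 - (e1 + e2)

# Assumed ensemble: diagonal entries are the weights of the three levels.
W = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))

probabilities = [mu(W, e) for e in (E1, E2, E3)]
print(probabilities, sum(probabilities))
```

The three probabilities correspond to the responses $b_0 \cap p_{1r}$, $b_0 \cap p_{2r}$ and $b_0 \cap p_{3r}$ of a single registration method; their sum being one reflects that some response occurs for every registered system.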
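In this way we are led to the following axiom:

AH 2. For $e_1, e_2, e_3 \in G_p$ with $e_1 + e_2 + e_3 = 1$ and $\varepsilon_1, \varepsilon_2, \varepsilon_3 \in \mathcal{E}_h$ where $\varepsilon_i \supset p(e_i)$ ($i = 1, 2, 3$) we require that $\varepsilon_1 \cup \varepsilon_2 \cup \varepsilon_3 = M$.

Let us define a map $\vartheta\colon G_p \to \mathcal{P}(M)$ as follows:

$$\vartheta(e) = \bigcap_{\substack{\varepsilon \in \mathcal{E}_h \\ \varepsilon \supset p(e)}} \varepsilon.$$

Then it easily follows that

$$e_1 \le e_2 \implies \vartheta(e_1) \subset \vartheta(e_2) \tag{8.3.9}$$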
and, from AH 1 it follows that

$$\vartheta(e^\perp) \subset M \setminus \vartheta(e). \tag{8.3.10}$$

If $e_1, e_2, e_3 \in G_p$ and $e_1 + e_2 + e_3 = 1$, then, from AH 2 it follows that

$$\vartheta(e_1) \cup \vartheta(e_2) \cup \vartheta(e_3) = M. \tag{8.3.11}$$

If we assume that for $e \in G_p$ the set $[0, e] \cap G_p$ is dense in $[0, e]$, we may then prove that the existence of a map $\vartheta$ satisfying (8.3.9), (8.3.10), and (8.3.11) leads to contradictions (for the case in which $G_p = G$ an elementary proof can be found in [2], XVIII).

We shall not present discussions here of any weaker hypothesis for “hidden properties.” Instead, we shall make an analysis of the structure $\mathcal{E}_{ps}$ of “physically real” pseudoproperties and their physical interpretation.

From (8.2.4) it follows that $p_p$ depends upon the preparation procedure because $a = (a \cap p_p) \cup (a \setminus (a \cap p_p))$ is a decomposition of $a$ which must be coexistent with all possible decompositions in $\mathcal{Q}'(a)$. Similarly, from (8.2.3) it follows that $p_r$ depends on the registration method since $b_0 = (b_0 \cap p_r) \cup (b_0 \setminus (b_0 \cap p_r))$ must be coexistent with all the other registrations in $\mathcal{R}(b_0)$. The discussions in previous sections about the structure of preparators and observables are therefore applicable to the sets $a \cap p_p$ and $b_0 \cap p_r$, respectively, as substructures. The elements of $\mathcal{E}_{ps}$ represent only a part of the structures of preparators and observables.

The analysis of the preparators and the observables has shown that if we consider only $\mathcal{E}_{ps}$ then we lose part of the general structure of the interaction transfer mechanism from the preparation to the registration systems, where this mechanism is independent of the special technical construction of the preparation and registration apparatuses as described by the corresponding irreducible kernel observables or preparators, respectively. We cannot speak of microsystems except in the context of preparation and registration, even if we neglect as many of the “accidental” properties of the preparation and registration apparatuses as possible.
We have also found that even the structure described by the pseudoproperties in $\mathcal{E}_{ps}$ actually refers back to the preparation and registration processes. Only the “objective” properties in $\mathcal{E}_m$ can be separated from the preparation and registration procedures. $\mathcal{E}_m$ is not sufficient.

One might therefore be tempted to try to demonstrate the subjectivity of every assertion about nature. Such attempts would greatly exceed the real procedures in physics and contradict our original intention to base the description of microsystems on preparation and registration procedures which can be described in an “objective” form. In II we have argued that the interpretation of quantum mechanics depends only on the mode of description of the preparation and registration processes, a description which is already given before any knowledge of microsystems and quantum mechanics. This is especially important if we wish to consider the “physical possibility” of assertions of the form (8.3.2), that is, the question whether it is possible to realize situations in which the assertion (8.3.2) is true.
The structure of assertions of the form (8.3.2) for an unspecified $x$ and an unspecified $e$ is meaningful only if we are interested in studying the logical operations, as we have done earlier. We are, however, also interested in assertions of the form (8.3.2) in those cases in which $x$ and $e$ are particular specified elements. In order to emphasize this definiteness we shall modify the notation of (8.3.2) by using the subscript 1 as follows:

$$x_1 \in p(e_1). \tag{8.3.12}$$

By requiring that $x_1$ and $e_1$ be definite elements we mean that $x_1$ and $e_1$ are already defined before we use the relation (8.3.12) in our mathematical framework. For $x_1$ this may be achieved by requiring that $x_1$ is already a label for an actual system in the context of an actual experiment which has been carried out and the result of which is written down in the mathematical framework before we may use (8.3.12). In [1] the mathematical scheme of the theory in which previous experimental results have been incorporated is denoted by $\mathcal{MTA}$. In brief we say that $x_1$ is a definite element if $x_1$ is already a label appearing in $\mathcal{MTA}$ before we add the relation (8.3.12) to $\mathcal{MTA}$. $e_1$ is a definite element if, for instance, it is defined as the decision effect to find the position of the system in a given (that is, in a technically determined) region $V$ in the laboratory system, a decision effect which will be defined in VII, §4.

It is possible to add the relation (8.3.12) to the mathematical scheme $\mathcal{MTA}$ and determine which of the following cases holds:

(1) (8.3.12) may be derived as a theorem in $\mathcal{MTA}$.
(2) The negation of (8.3.12) may be derived as a theorem in $\mathcal{MTA}$.
(3) Either (8.3.12) or its negation may be added to $\mathcal{MTA}$ without producing a contradiction (naturally both cannot).
In case (1) we say that “$x_1$ actually has the pseudoproperty $e_1$.”

In case (2) we say that “$x_1$ actually does not have the pseudoproperty $e_1$.” Earlier we have seen that case (2) is not equivalent to the statement that “$x_1$ actually has the pseudoproperty $e_1^\perp$,” except in the case in which $e_1 \in Z$; then $e_1$ would be an objective property.

In case (3) we say that “$x_1$ may possibly have the property $e_1$” or “... not have $e_1$.” Every mathematician knows that case (3) can occur in a mathematical theory such as $\mathcal{MTA}$. The existence of case (3) has nothing to do with a “new” mathematical logic. The existence of case (3) is possible using “normal” mathematics and “normal” logic. Only some physicists have difficulty with case (3) and are of the opinion that it may be necessary to introduce a new logic in physics in order to interpret case (3). The source of these difficulties lies in the fact that we always (often unconsciously) assume that the elements of $\mathcal{E}_{ps}$ are like the inherent properties of the system, so that for $x \in M$ and $p \in \mathcal{E}_{ps}$ only $x \in p$ or $x \notin p$ will be true. For the elements $p \in \mathcal{E}_m$ we may, in fact, make such a claim, but not for the elements $p \in \mathcal{E}_{ps}$.

Case (3) is essential! Case (3) is possible only because the elements $p \in \mathcal{E}_{ps}$ are not “objective” properties but, more generally, are assertions about the microsystems relative to the preparation
and registration processes. For this reason it is important to examine case (3) in more detail.

We shall assume that the system is prepared according to some preparation procedure $a_1$. In this way we may introduce the relation $x_1 \in a_1$ into the mathematical theory as an experimental result. On the basis of the construction of the preparation apparatus corresponding to $a_1$ (the corresponding information cannot be represented in terms of the theory described here, see XVII) and on the basis of additional experiments we may determine $\varphi(a_1)$. If $\varphi(a_1)$ is determined by experiment then the value of $\mu(\varphi(a_1), e_1)$ is determined by the theory.

(A) Suppose $\mu(\varphi(a_1), e_1) = 1$. In this case we obtain $a_1 \in \mathcal{Q}_{e_1}$ and obtain (8.3.12) as a theorem.

(B) Suppose that $\mu(\varphi(a_1), e_1) = 0$. In this case $\mu(\varphi(a_1), e_1^\perp) = 1$, that is, $x_1 \in p(e_1^\perp)$ and $x_1 \notin p(e_1)$ are theorems.

Therefore we find that (A) corresponds to case (1) and (B) corresponds to case (2).

(C) Suppose that $\mu(\varphi(a_1), e_1) \ne 1$ and $\ne 0$. To comment on this case the actual experimental situation is essential, since we are not dealing with an imaginary microsystem. $x_1$ is an actual microsystem. How was the preparation of the system carried out? Has the system $x_1$ already been registered? Has a registration method already been applied to $x_1$? And so on.

We shall first consider the case in which a registration method has not yet been applied. Then the only experimental facts are $x_1 \in a_1$ and (possibly), for some of the $a_\nu \subset a_1$, we have also obtained $x_1 \in a_\nu$. Since the relationship $x_1 \in a_\nu$ can be experimentally verified only for finitely many $a_\nu$ we find that

$$a = \bigcap_\nu a_\nu \tag{8.3.13}$$

is an element of $\mathcal{Q}'(a_1)$ and we obtain $x_1 \in a$. If $\mu(\varphi(a), e_1) = 1$ or if $\mu(\varphi(a), e_1) = 0$ we obtain cases (A) and (B), respectively.
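The distinction between (A), (B) and (C) depends only on the single number $\mu(\varphi(a), e_1)$, which the following hedged Python sketch makes explicit. The function name `classify` and the returned labels are hypothetical; the sketch assumes that the probability has already been determined, and it records only what follows from the preparation alone, before any registration method has been applied.

```python
from fractions import Fraction


def classify(mu_value):
    """Map a given value of mu(phi(a), e1) to the case (A), (B) or (C)."""
    if not 0 <= mu_value <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    if mu_value == 1:
        return "A"  # x1 in p(e1) is a theorem, i.e. case (1)
    if mu_value == 0:
        return "B"  # x1 not in p(e1) is a theorem, i.e. case (2)
    return "C"      # undecided by the preparation alone; needs further analysis


for value in (Fraction(1), Fraction(0), Fraction(1, 3)):
    print(value, classify(value))
```

Only in case (C) does the later choice of a registration method matter for whether $x_1 \in p(e_1)$ can still be added without contradiction.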
Here it is important to remark that in the case in which not all the $x_1 \in a_\nu$ are included in the “protocol” of the experiment (for example, it is possible that some of these relations have been overlooked), in reality there exists such an $a$ satisfying (8.3.13) even though we may fail to take it into account in $\mathcal{MTA}$. Thus it is possible, in the case $\mu(\varphi(a), e_1) = 1$, that “$x_1$ really has the pseudoproperty $p(e_1)$” although we may be unaware of this fact. Here we find that the $a \in \mathcal{Q}'(a_1)$ to which the system $x_1$ belongs is determined by the Boolean ring $\mathcal{Q}'(a_1)$ even in the case in which an experimental result $x_1 \in a_\nu$ for an $a_\nu \in \mathcal{Q}'(a_1)$ has been overlooked. The incompleteness associated with the introduction of the experimental results into a mathematical theory (that is, the incompleteness of the axioms set down in [1], §5 and [2], III, §4) may permit “possibilities” which nature does not allow. For example, it is possible that the relation $\mu(\varphi(a), e_1) = 1$ is satisfied for a particular experimental $a$ but that we have only observed that $x_1 \in a_1$ (where $a_1 \supset a$) where $\mu(\varphi(a_1), e_1) < 1$;
then the “possibility” remains that $x_1 \notin p(e_1)$ while, in reality, $x_1 \in p(e_1)$. This “possibility” remains only if we have overlooked the fact that $x_1$ actually satisfies the finer preparation procedure $a \subset a_1$.

We are not interested in the possibilities which are caused by the incompleteness of the axioms. Therefore we shall now assume that the experimental situation is described by the relationship $x_1 \in a$, where $a$ is defined by (8.3.13), and that $\mu(\varphi(a), e_1)$ has been calculated and found not to be equal to either 0 or 1. Then the relationship (8.3.12) or its negation can be introduced into the theory without producing a contradiction. In such a case, possibility (3) holds: “it is possible that $x_1$ has the pseudoproperty $p(e_1)$.” What are the possibilities engendered in this case?

Since we have assumed that $x_1 \in a$ is the best possible observation (that is, we have excluded the possibility of an incomplete “protocol”), and since $\mu(\varphi(a), e_1) \ne 1$, we must have $a \cap p_{1p} = \emptyset$ (where $p_1 = p(e_1)$); otherwise in $\mathcal{Q}'(a_1)$ there would be selection procedures which are finer than $a$ which we have overlooked. By assuming that $a \cap p_{1p} = \emptyset$ we will simplify the following discussions.

From $a \cap p_{1p} = \emptyset$ and $x_1 \in a$ it follows that $x_1 \notin p_{1p}$. Then $x_1 \in p_1 = p(e_1)$ is equivalent to $x_1 \in p_{1r}$. Since $\mu(\varphi(a), e_1) \ne 0$ we may add the relation $x_1 \in p_{1r}$ without contradiction. It would be false, however, to say that “$x_1$ has the pseudoproperty $p_1$ with probability $\mu(\varphi(a), e_1)$” because the probability that $x_1 \in p_1$ is realized in an experiment depends on the following:

(a) The registration method $b_0$ which we apply to $x_1$ can be chosen arbitrarily. Thus the “possibility” that $x_1 \in p_1$ may be obtained from $x_1 \in a$ depends upon the possibility of the arbitrary(!) choice of $b_0$. For the choice of $b_0$ there are no probabilities. Here $b_0$ is freely “available” (for this concept see [1], §11 and §12).
Suppose that a particular choice $b_0^{(1)}$ is made, that is, as an additional experimental fact $x_1 \in b_0^{(1)}$ is observed. By the introduction of the relation $x_1 \in b_0^{(1)}$ it becomes necessary to alter our assessment of the relation (8.3.12) as follows:

(a1) $b_0^{(1)}$ has been chosen such that $b_0^{(1)} \cap p_{1r} = \emptyset$. Such a choice has occurred if the observable corresponding to $\mathcal{R}(b_0^{(1)}) \xrightarrow{\psi_0} L$ is complementary to the observable $\{0, e_1, e_1^\perp, 1\}$ (see §3). Then, by the choice(!) of such a registration method $b_0^{(1)}$, from $x_1 \in b_0^{(1)}$ we obtain the following result as a theorem in the mathematical theory: $x_1 \notin p_{1r}$, and, since $x_1 \notin p_{1p}$, we also obtain $x_1 \notin p_1$. Therefore we obtain the case that “$x_1$ actually does not have the pseudoproperty $p_1$.” In this case (a1) it is easy to see that the elements $p \in S_{ps}$ cannot represent properties of “isolated” microsystems because they are unconditionally
connected with the possibilities inherent in the preparation and registration processes: the application of a registration method(!) can make it impossible that $x_1 \in p_1$.

(a2) Let $b_0^{(1)}$ be chosen such that $b_0^{(1)} \cap p_{1r} \neq \emptyset$. Here a registration has not yet taken place. Then $x_1 \in b_0^{(1)}$ does not prohibit the addition of the relation $x_1 \in p_{1r}$. In this case we have “$x_1$ may possibly have the pseudoproperty $p_1$.” The “possibilities” for $x_1 \in p_1$ cannot be further influenced because there exist reproducible frequencies $\mu(\varphi(a), \psi(b_0^{(1)}, b))$ for the various registrations $b \subset b_0^{(1)}$. Therefore, with a degree of justification, we may call $\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r}))$ the “probability for $x_1 \in p_1$.” Since $\psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})$ is, in general, smaller than $e_1$, it is possible that $\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})) < \mu(\varphi(a), e_1)$! Indeed, it is possible that $\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})) = 0$ even in the case in which $b_0^{(1)} \cap p_{1r} \neq \emptyset$. If $\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})) = 0$ then “$x_1$ does not actually have the pseudoproperty $p_1$.” If $\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})) \neq 0$ then “$x_1$ may possibly have the pseudoproperty $p_1$.”

(a2m) According to APE 3.4 we may choose $b_0^{(1)}$ such that $b_0^{(1)} = (b_0^{(1)} \cap p_{1r}) \cup (b_0^{(1)} \cap p_{1r}^*)$. Then we obtain $\psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r}) = e_1$. If $b_0^{(1)}$ is chosen in this way, then $x_1 \in p_1$ is possible with the maximal probability $\mu(\varphi(a), e_1)$.

In case (a2) and in the special case (a2m), provided that $\mu(\varphi(a), \psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r})) \neq 0$, there exists a registration $b_+ \subset b_0^{(1)} \cap p_{1r}$. If the experiment is carried out, then either $x_1 \in b_0^{(1)} \cap p_{1r}$ or $x_1 \notin b_0^{(1)} \cap p_{1r}$ will be observed upon registration by the registration apparatus. For the experimental result $x_1 \in b_0^{(1)} \cap p_{1r}$ we therefore obtain (8.3.12) as a theorem: “$x_1$ has the pseudoproperty $p_1$,” and the possible pseudoproperty $p_1$ of $x_1$ has been realized.
Otherwise, if we experimentally obtain the result $x_1 \in b_0^{(1)} \setminus (b_0^{(1)} \cap p_{1r})$, then we obtain the statement $x_1 \notin p_1$ as a theorem, that is, “$x_1$ does not actually have the pseudoproperty $p_1$.” Of course, this does not mean that the relation $x_1 \in p_1^* = p(e_1^\perp)$ must be satisfied. If $b_0^{(1)}$ had been chosen according to the case (a2m) then, from the experimental result $x_1 \in b_0^{(1)} \setminus (b_0^{(1)} \cap p_{1r})$
we would indeed obtain $x_1 \in p_1^* = p(e_1^\perp)$ as a theorem, that is, “$x_1$ actually has the pseudoproperty $p_1^*$.”

The analysis presented above shows the complicated structure of case (3), in which either $x_1 \in p_1$ or $x_1 \notin p_1$ can be introduced without contradiction. This analysis shows the essential point that the knowledge of the subject does not play a role, and that at most the incompleteness of the “protocol of an experiment” (as, for example, stored in a computer) may leave additional “possibilities” open, and that the latter are actually established by the experiment if they are not established by the protocol. This analysis shows that quantum mechanics permits the description of all experimental situations, even for individual microsystems, without the need for the introduction of a new logic, provided that we do not consider mathematical logic to be unusual because of the occurrence of case (3).

The fact that we do not need to introduce a new logic does not mean that it is not possible to introduce a new language together with a new logic. For example, it is possible to express relations such as (8.3.2) in a new language which expresses more of the ontological character of such expressions than does the formal mathematical language. For example, it is possible to formulate expressions like (8.3.2) as follows: “The microsystem $x$ has the pseudoproperty $e$” (see the discussion following (8.3.2)). This “new” linguistic formulation can be considered to be interpreted by means of (8.3.2). Using this and similar expressions as “elementary propositions” it is possible to obtain new propositions by means of logical operations; these new logical operations need not be the logical operations of the mathematical theory.
Instead, they may be introduced on the basis of a dialog (see [16]), in which the verification of a proposition corresponds to what we have described above by: the proposition is physically verified on the basis of an experiment (see [1], §10.4).

According to the previous discussion in §8.3 we are now in the position to correctly explain the physical meaning of many of the famous quantum mechanical facts. We will first consider the uncertainty relation, often called the Heisenberg uncertainty relation. It has played a great role in the conceptual development of quantum mechanics, but is often loaded with considerable historical ballast; we shall now present a review of this topic.

We shall begin by briefly proving the following mathematical theorem: Let $e_1, e_2 \in G$, and let $\alpha_1 = \operatorname{tr}(we_1)$, $\alpha_2 = \operatorname{tr}(we_2)$. Then there exist $e_1, e_2$ such that for each $w \in K$ at least one of the following two relations is false:

$$\operatorname{tr}(w(e_1 - \alpha_1 \mathbf{1})^2) = \alpha_1 - \alpha_1^2 < \tfrac{1}{16}, \qquad \operatorname{tr}(w(e_2 - \alpha_2 \mathbf{1})^2) = \alpha_2 - \alpha_2^2 < \tfrac{1}{16}. \qquad (8.3.14)$$

First we shall present the proof for the case of a single Hilbert space, that is, for $\mathcal{B} = \mathcal{B}(\mathcal{H})$. Let us consider a complete orthonormal basis which we divide into two sets $\varphi_\nu$ ($\nu = 1, 2, \ldots$) and $\psi_\mu$ ($\mu = 1, 2, \ldots$). Let $e_1 = \sum_\nu P_{\varphi_\nu}$ and $1 - e_1 = e_1^\perp = \sum_\mu P_{\psi_\mu}$.
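Let $\chi_\nu = \tfrac{1}{\sqrt{2}}(\varphi_\nu + \psi_\nu)$ and $\eta_\nu = \tfrac{1}{\sqrt{2}}(\varphi_\nu - \psi_\nu)$, and let $e_2 = \sum_\nu P_{\chi_\nu}$, so that $1 - e_2 = e_2^\perp = \sum_\nu P_{\eta_\nu}$.

Let us assume that the relation $\alpha_1 - \alpha_1^2 < \tfrac{1}{16}$ is satisfied for $w$. Then it follows that either $\alpha_1 < \tfrac18$ or $(1 - \alpha_1) < \tfrac18$, that is, either $\operatorname{tr}(we_1) < \tfrac18$ or $\operatorname{tr}(we_1^\perp) < \tfrac18$. We shall only consider the case $\operatorname{tr}(we_1) < \tfrac18$; the proof of the other is similar. From $\operatorname{tr}(we_1) < \tfrac18$ it follows that

$$\tfrac18 > \operatorname{tr}(we_1) = \sum_\nu \|\sqrt{w}\,\varphi_\nu\|^2 \ge \tfrac12 \sum_\nu \left(\|\sqrt{w}\,\chi_\nu\| - \|\sqrt{w}\,\eta_\nu\|\right)^2$$
$$= \tfrac12 \sum_\nu \|\sqrt{w}\,\chi_\nu\|^2 + \tfrac12 \sum_\nu \|\sqrt{w}\,\eta_\nu\|^2 - \sum_\nu \|\sqrt{w}\,\chi_\nu\|\,\|\sqrt{w}\,\eta_\nu\|$$
$$\ge \tfrac12 \operatorname{tr}(we_2) + \tfrac12 \operatorname{tr}(we_2^\perp) - \operatorname{tr}(we_2)^{1/2} \operatorname{tr}(w(1 - e_2))^{1/2} = \tfrac12 - \alpha_2^{1/2}(1 - \alpha_2)^{1/2}$$

and we obtain

$$\alpha_2 - \alpha_2^2 = \alpha_2(1 - \alpha_2) > \left(\tfrac12 - \tfrac18\right)^2 = \left(\tfrac38\right)^2 > \left(\tfrac14\right)^2 = \tfrac{1}{16}.$$

The proof may be extended to the case of more than one Hilbert space. Since $G_p$ is dense in $G$, the theorem also applies for two suitable $e_1, e_2 \in G_p$. Since at least one of the relations (8.3.14) must fail for every $w$, it represents an uncertainty relationship between the two decision effects $e_1$ and $e_2$. We have stated the case for a pair of decision effects in order to show that the uncertainty relations have nothing to do with the accuracy of measurements!

In order to discover the physical meaning of the relations (8.3.14) we shall now rewrite them in terms of preparation and registration procedures. For $p(e_1) = p_1$ and $p(e_2) = p_2$ let two registration methods $b_0^{(1)}$ and $b_0^{(2)}$ be chosen such that $\psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r}) = e_1$ and $\psi(b_0^{(2)}, b_0^{(2)} \cap p_{2r}) = e_2$ are satisfied. For each preparation procedure $a$, from

$$\lambda_{\mathcal{S}}(a \cap b_0^{(1)}, a \cap b_0^{(1)} \cap p_{1r}) < \tfrac18 \quad \text{or} \quad \lambda_{\mathcal{S}}(a \cap b_0^{(1)}, a \cap b_0^{(1)} \cap p_{1r}^*) < \tfrac18$$

it follows that

$$\lambda_{\mathcal{S}}(a \cap b_0^{(2)}, a \cap b_0^{(2)} \cap p_{2r}) > \tfrac18 \quad \text{and} \quad \lambda_{\mathcal{S}}(a \cap b_0^{(2)}, a \cap b_0^{(2)} \cap p_{2r}^*) > \tfrac18$$

must be satisfied. The relation $\lambda_{\mathcal{S}}(a \cap b_0^{(1)}, a \cap b_0^{(1)} \cap p_{1r}) < \tfrac18$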
states that, for the registration $b_0^{(1)} \cap p_{1r}$, at most $\tfrac18$ of the systems prepared according to the procedure $a$ will respond. Correspondingly, $\lambda_{\mathcal{S}}(a \cap b_0^{(1)}, a \cap b_0^{(1)} \cap p_{1r}^*) < \tfrac18$ states that, in the registration $b_0^{(1)} \cap p_{1r}$, more than $\tfrac78$ of all systems prepared according to $a$ will respond. A “high” value for the response or nonresponse of $b_0^{(1)} \cap p_{1r}$ for the systems obtained from the preparation procedure $a$ leads automatically to a “low” value both for the response and for the nonresponse of $b_0^{(2)} \cap p_{2r}$, that is, the frequency of response must lie between $\tfrac18$ and $\tfrac78$.

The uncertainty relations expressed by (8.3.14) make the following statement about possible preparation apparatuses: It is impossible to experimentally produce a preparation procedure for which the frequencies $\alpha_1$ and $\alpha_2$ satisfy both $\alpha_1 - \alpha_1^2 < \tfrac{1}{16}$ and $\alpha_2 - \alpha_2^2 < \tfrac{1}{16}$. This relation says nothing about the possibility of building registration apparatuses. On the contrary, we have assumed that we have made two experiments $a \cap b_0^{(1)}$ and $a \cap b_0^{(2)}$, where we permit $b_0^{(1)} \cap b_0^{(2)} = \emptyset$. From the uncertainty relations we cannot say anything about the possibility of obtaining joint measurements of $e_1$ and $e_2$.

In fact, if $e_1$ and $e_2$ satisfy (8.3.14) then they are not commensurable in the sense of D 1.3.1. Then, if $\psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r}) = e_1$ and $\psi(b_0^{(2)}, b_0^{(2)} \cap p_{2r}) = e_2$ and if $b_0^{(1)} \cap b_0^{(2)} = b_0 \neq \emptyset$, it follows that

$$\psi(b_0^{(1)}, b_0^{(1)} \cap p_{1r}) = \lambda_{\mathcal{R}_0}(b_0^{(1)}, b_0)\,\psi(b_0, b_0 \cap p_{1r}) = e_1$$

and

$$\psi(b_0^{(2)}, b_0^{(2)} \cap p_{2r}) = \lambda_{\mathcal{R}_0}(b_0^{(2)}, b_0)\,\psi(b_0, b_0 \cap p_{2r}) = e_2.$$

From $1 \ge \psi(b_0, b_0 \cap p_{1r})$ and $\lambda_{\mathcal{R}_0}(b_0^{(1)}, b_0)\,\psi(b_0, b_0 \cap p_{1r}) = e_1$ it follows that $\lambda_{\mathcal{R}_0}(b_0^{(1)}, b_0) = 1$, that is, $b_0 = b_0^{(1)} = b_0^{(2)}$, resulting in the fact that $e_1$ and $e_2$ are coexistent, in contradiction to the fact that $e_1$ and $e_2$ do not commute. Therefore we must have $b_0^{(1)} \cap b_0^{(2)} = \emptyset$, that is, the two registration methods must be mutually exclusive.
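The statement that no ensemble makes both dispersions small can be checked numerically in the simplest case. The following is only a sketch under stated assumptions: the Hilbert space is taken to be two-dimensional, $e_1$ is the projection onto $\varphi = (1, 0)$, $e_2$ is the projection onto $(\varphi + \psi)/\sqrt{2}$, the bound $1/16$ is the one assumed for (8.3.14), and only real ensembles $w$ are scanned (a complex off-diagonal part can only increase the dispersions). The function names are illustrative and do not come from the text.

```python
import math

def trace_product(a, b):
    """tr(a b) for 2x2 real matrices given as nested lists."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

E1 = [[1.0, 0.0], [0.0, 0.0]]   # projection onto phi = (1, 0)
E2 = [[0.5, 0.5], [0.5, 0.5]]   # projection onto (phi + psi)/sqrt(2)

def ensemble(r, t):
    """Real 2x2 density matrix w with Bloch vector r*(sin t, 0, cos t), 0 <= r <= 1."""
    nx, nz = r * math.sin(t), r * math.cos(t)
    return [[(1 + nz) / 2, nx / 2], [nx / 2, (1 - nz) / 2]]

def smallest_max_variance(steps=200):
    """Scan ensembles w and return the smallest value found of
    max(alpha1 - alpha1^2, alpha2 - alpha2^2), where alpha_i = tr(w e_i)."""
    best = 1.0
    for i in range(steps + 1):
        r = i / steps
        for k in range(steps):
            t = 2 * math.pi * k / steps
            w = ensemble(r, t)
            a1 = trace_product(w, E1)
            a2 = trace_product(w, E2)
            best = min(best, max(a1 - a1 * a1, a2 - a2 * a2))
    return best

if __name__ == "__main__":
    print(smallest_max_variance())
```

In this toy case the larger of the two dispersions never drops below the assumed bound, so at least one of the two relations fails for every scanned ensemble, independently of any registration apparatus.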
The above clarification of this concept is necessary because we are accustomed to drawing more or less correct conclusions intuitively from the uncertainty relations. These uncertainty relations are usually formulated for scale observables which, according to D 2.5.6, are always decision observables. Suppose that $A$ and $B$ are two such scale observables for which $A, B \in \mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$. $A$ and $B$ are therefore “bounded” operators (see AIV, §15). The “dispersion” of measurement values of $A$ in the ensemble $w$ is defined by

$$\operatorname{Str}(A) = \operatorname{tr}(wA'^2), \qquad (8.3.15)$$

where $A' = A - \mathbf{1}\operatorname{tr}(wA)$.
Here $\operatorname{tr}(wA)$ is the so-called expectation value of $A$ in the ensemble $w$, that is, it is approximately the experimental mean value $\bar a = (1/N) \sum_{\nu=1}^{N} a_\nu$ of the measurement results $a_\nu$ for a large number $N$ of repeated experiments. $\operatorname{Str}(A)$ is the mean square deviation, that is, it is approximately the experimental value $(1/N) \sum_{\nu=1}^{N} (a_\nu - \bar a)^2$. We may make the physical meaning clearer with the aid of a preparation procedure $a$ for which $\varphi(a) = w$ and a registration method $b_0$ for which $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$ represents a very good approximation for the scale decision observable $A$ (for the realization of an observable, see §4).

For $\Delta A = \sqrt{\operatorname{Str}(A)}$ and the corresponding equation (8.3.15) for an observable $B$ it follows that

$$(\Delta A)(\Delta B) = \sqrt{\operatorname{tr}(wA'^2)}\,\sqrt{\operatorname{tr}(wB'^2)}.$$

Let $D = A' + i\alpha B'$ where $\alpha$ is real. Then $D^+D$ is self-adjoint and $D^+D \ge 0$. Thus it follows that $\operatorname{tr}(wD^+D) \ge 0$ and, for all $\alpha$, we have

$$(\Delta A)^2 + \alpha^2(\Delta B)^2 + \alpha \operatorname{tr}(wC) \ge 0, \qquad (8.3.16)$$

where $C = i(A'B' - B'A') = i(AB - BA)$. In order that (8.3.16) be satisfied for all $\alpha$, it is necessary and sufficient that its minimum with respect to $\alpha$ be non-negative; we therefore obtain

$$(\Delta A)(\Delta B) \ge \tfrac12 |\operatorname{tr}(wC)|. \qquad (8.3.17)$$

(If $\Delta B = 0$ we exchange $A$ with $B$ in the derivation; if both $\Delta A = 0$ and $\Delta B = 0$ then from (8.3.16) it follows that $\operatorname{tr}(wC) = 0$, so that (8.3.17) is satisfied.)

If equality holds in (8.3.17), then it follows that there exists an $\alpha$ for which $\operatorname{tr}(wD^+D) = \operatorname{tr}(D\sqrt{w}\sqrt{w}D^+) = 0$. This is equivalent to the condition that $D\sqrt{w} = 0$ and also $Dw = 0$. Conversely, if there exists an $\alpha$ such that $Dw = 0$ then we obtain equality in (8.3.17). For $w = \sum_\nu w_\nu P_{\varphi_\nu}$ (according to AIV, §11), $Dw = 0$ is equivalent to the statement: $D\varphi_\nu = 0$ for all $\varphi_\nu$ (for which $w_\nu \neq 0$). This result follows from

$$0 = \operatorname{tr}(D^+Dw) = \sum_\nu w_\nu \operatorname{tr}(D^+DP_{\varphi_\nu}) = \sum_\nu w_\nu \|D\varphi_\nu\|^2.$$
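A small numerical instance may help to keep the quantities in (8.3.15) to (8.3.17) apart. The sketch below assumes a two-dimensional Hilbert space, takes the Pauli matrices $\sigma_x$ and $\sigma_y$ as the bounded scale observables $A$ and $B$, and parametrizes the ensemble $w$ by a Bloch vector; these choices and all function names are illustrative assumptions, not part of the text.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def dispersion(w, A):
    """Str(A) = tr(w A'^2) with A' = A - 1*tr(wA), cf. (8.3.15)."""
    n = len(A)
    mean = trace(matmul(w, A)).real
    Ap = [[A[i][j] - (mean if i == j else 0) for j in range(n)] for i in range(n)]
    return trace(matmul(w, matmul(Ap, Ap))).real

def uncertainty_sides(w, A, B):
    """Return ((Delta A)(Delta B), |tr(wC)|/2) with C = i(AB - BA), cf. (8.3.17)."""
    n = len(A)
    AB, BA = matmul(A, B), matmul(B, A)
    C = [[1j * (AB[i][j] - BA[i][j]) for j in range(n)] for i in range(n)]
    lhs = math.sqrt(dispersion(w, A) * dispersion(w, B))
    rhs = 0.5 * abs(trace(matmul(w, C)).real)
    return lhs, rhs

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

def ensemble(nx, ny, nz):
    """Density matrix w = (1 + n.sigma)/2 for a Bloch vector with |n| <= 1."""
    return [[(1 + nz) / 2, (nx - 1j * ny) / 2],
            [(nx + 1j * ny) / 2, (1 - nz) / 2]]

if __name__ == "__main__":
    print(uncertainty_sides(ensemble(0.3, 0.2, 0.6), SIGMA_X, SIGMA_Y))
```

For the mixed ensemble used here the inequality is strict; for the pure ensemble with Bloch vector $(0, 0, 1)$ both sides equal 1, which is an instance of the equality case $D\varphi_\nu = 0$.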
Thus we have seen that the “uncertainty relations” (8.3.16) and (8.3.17) are determined essentially by the noncommutativity of the operators $A$ and $B$ and, according to §3, by the fact that $A$ and $B$ are not commensurable. The fact that $A$ and $B$ are not commensurable does not appear directly in the derivation of the uncertainty relations, but only indirectly, through the mathematical structure of noncommutativity.

The best known case of the uncertainty relation (8.3.16) is the case in which $A$ represents the position observable $Q$, and $B$ represents the corresponding
momentum observable $P$; $P$ and $Q$ will be described in detail in VII, §4. For $P$ and $Q$ (see VII (4.22)) we obtain $PQ - QP = -i\mathbf{1}$, that is, $C = i(QP - PQ) = -\mathbf{1}$; therefore (8.3.16) takes the form

$$(\Delta P)(\Delta Q) \ge \tfrac12; \qquad (8.3.18)$$

this is the famous Heisenberg uncertainty relation which has played an important role in the evolution of quantum mechanics. (Since $P$ and $Q$ are not bounded observables, the above proof does not apply directly. Let $w = \sum_\nu w_\nu P_{\varphi_\nu}$. If for $w_\nu \neq 0$ all the $\varphi_\nu$ lie in the domain of definition of the operators $P^2$ and $Q^2$, then the above derivation will be valid. If, for example, one of the $\varphi_\nu$, say $\varphi_1$, does not lie in the domain of definition of $P^2$ (or $Q^2$), then $\operatorname{tr}(P^2 P_{\varphi_1})$ need not exist; we may then consider the operator $P' = P - p\mathbf{1}$ with arbitrary values of $p$ and obtain that $\operatorname{Str}(P)$ is infinite. Since $\operatorname{Str}(Q)$ cannot be equal to zero, equation (8.3.18) will still be satisfied.)

The following conclusions are often made in connection with the Heisenberg uncertainty relation: “The position and momentum of a particle cannot be simultaneously determined with arbitrary precision. The measurement uncertainties $\Delta P$ and $\Delta Q$ for a simultaneous measurement must satisfy (8.3.18).” Or, somewhat more concisely: “Position and momentum cannot be simultaneously measured to arbitrary accuracy.”

These assertions are half-truths, and are not valid conclusions of equation (8.3.18). First, the expression “time” does not appear in either (8.3.18) or (8.3.16). The meaning of the expression “simultaneous” in this context is unclear. If we conclude that position and momentum may be measured to arbitrary accuracy at different times, that conclusion will be false (see VII, §6 and XVII).

As we have already seen, (8.3.16) is only indirectly concerned with the fact that $A$ and $B$ are not commensurable. As we have seen in §1 to §4, the concept of commensurability is defined in an entirely different manner than is the uncertainty principle.
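The dispersions in (8.3.18) are properties of the ensemble alone, which can be made concrete by computing them for one pure ensemble. The sketch below assumes the Schrödinger representation with $\hbar = 1$ ($Q$ acting as multiplication by $x$, $P$ as $-i\,d/dx$) and a real Gaussian wave function $\psi(x) \propto \exp(-x^2/4\sigma^2)$, for which both expectation values vanish; the function name, the integration window and the step count are illustrative assumptions.

```python
import math

def gaussian_dispersions(sigma, half_width=12.0, steps=20000):
    """Return (Delta Q, Delta P) for psi(x) proportional to exp(-x^2/(4 sigma^2)).

    Assumes hbar = 1. Since psi is real and even, <Q> = <P> = 0, so that
    (Delta Q)^2 = <x^2> and (Delta P)^2 = integral of |psi'(x)|^2 (normalized).
    The integrals are approximated by a Riemann sum on [-half_width*sigma,
    +half_width*sigma].
    """
    h = 2 * half_width * sigma / steps
    norm = q2 = p2 = 0.0
    for k in range(steps + 1):
        x = -half_width * sigma + k * h
        psi = math.exp(-x * x / (4 * sigma * sigma))
        dpsi = -x / (2 * sigma * sigma) * psi   # analytic derivative of psi
        norm += psi * psi * h
        q2 += x * x * psi * psi * h
        p2 += dpsi * dpsi * h
    return math.sqrt(q2 / norm), math.sqrt(p2 / norm)

if __name__ == "__main__":
    dq, dp = gaussian_dispersions(0.7)
    print(dq, dp, dq * dp)
```

For every width $\sigma$ the product comes out as $\tfrac12$ up to discretization error, that is, the Gaussian ensemble attains equality in (8.3.18). Nothing in this computation refers to a registration apparatus or to a measurement error.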
Again, it is important to note that $\Delta A$ and $\Delta B$ have nothing to do with measurement imprecisions; on the contrary, in the derivation of (8.3.16) it was assumed that $A$ and $B$ were measured with “ideal precision.”

What does it mean to measure an observable “imprecisely”? Such a concept does not appear in this chapter. Have we overlooked part of the structure of the registration process? No, this is not the case. It was essential, in the introduction of the concept of a registration procedure $b \in \mathcal{R}$, that the registration be precise: either $x \in b$ or $x \notin b$. Therefore there are no “imprecise” registrations. What then does an experimental physicist mean by the expressions “measurement error,” “measurement imprecisions,”
etc.? He is making a comparison between a “real” observable $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$ and the desired observable $\Sigma \xrightarrow{F} L$; we have already discussed this problem in §4. Here the “measurement error” is understood to be the “difference” between $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$ and $\Sigma \xrightarrow{F} L$, which is characterized in AOb (see §4) by the differences $\psi_0 k(\sigma) - F(\sigma)$. If $\Sigma \xrightarrow{F} L$ is a scale decision observable, then the experimental physicist seeks to describe the difference between his real apparatus, as described by $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$, and a “real scale,” by “errors” between the real scale of the apparatus and the ideal scale of the scale observable he wished to measure. This “error” depends upon the ensemble used in the experiment. This subject was discussed in §4 in connection with axiom AOb. The discussion of errors is a typical experimental problem because it is not related to the theoretical postulates of AOb but is concerned with the construction of the real experimental apparatus.

Thus, when we say that we can only make imprecise joint measurements of both $P$ and $Q$, the assertion is made that, for a real apparatus $b_0$ having two scales $x$ and $y$, the corresponding partial observables of $\mathcal{R}(b_0) \xrightarrow{\psi_0} L$ may only approximately measure the ideal observables $P$ and $Q$ with errors, where the errors are “somewhat similar” to those described by (8.3.18). To analyze these “errors” more precisely is somewhat more difficult than may at first appear. At least we know that the well-defined quantities $\Delta P$ and $\Delta Q$ in (8.3.18) are not measurement errors.

Although $\Delta P$ and $\Delta Q$ are often falsely interpreted as measurement errors, it is probable that a relationship which is similar to (8.3.18) will be obtained if we define a suitable notion of a measurement error. This is certainly not surprising. However, the derivation of a relation which is similar to (8.3.18) for the “measurement errors” of “approximate $P$” and “approximate $Q$” is much more difficult and, in all probability, cannot in general be carried out.
For this reason we shall not proceed further in this direction (see [19]).

Certain related problems, such as the problem of a sequence of measurements, or the problems posed in III, §1 (that the registration must occur “after” the preparation, and that the registration must take place in a different region of space from the preparation), cannot be clarified using only the axioms presented here. This is due in part to the fact that we have not built a description of space and time into the theory described here. Such a clarification is extremely desirable, since the role of space and time in quantum mechanics is of great importance. It is possible only after we have investigated the transformation properties of preparation and registration procedures in V-VIII.
CHAPTER V

Transformations of Registration and Preparation Procedures. Transformations of Effects and Ensembles

In II-IV we have been primarily motivated by physical considerations. In this chapter we shall be concerned with questions which play an important role in all mathematical theories: the definition and examination of the role of morphisms. The concepts presented in II-IV were motivated by physical considerations; here again the mathematical structure will reflect the physical situation. In this book we shall not investigate the underlying general mathematical problem itself because we are interested in the physical significance of the morphisms. We have already encountered this problem in previous chapters and have already anticipated some of the applications of morphisms. In VII we shall describe another important application; in XVII we shall become familiar with additional important physical examples of morphisms.

1 Morphisms for Selection Procedures

At first it may appear to be desirable to consider mappings of the set $M$ into itself. In classical physics we are accustomed to studying transformations of state space (for example, phase space in classical mechanics). Here we note that we may not identify the state space of classical theories with the set of physical systems. For example, it is possible that many systems have the same state. In classical theories the notion of the transformation of systems is, in general, physically vague. In quantum mechanics, except for the maps considered in XVII, §4.1, we also find that there are no physically interesting
maps of $M$ into itself. For these reasons we shall not consider maps of the set $M$ into itself.

On the other hand, the mapping of selection procedures is of substantial interest. Let $\mathcal{S}_1$ and $\mathcal{S}_2$ be two systems of selection procedures on the sets $M_1$ and $M_2$.

D 1.1. Let $\mathcal{S}_1 \xrightarrow{h} \mathcal{S}_2$ be a map satisfying the following conditions:

$$h(a \cap b) = h(a) \cap h(b), \qquad h(a \setminus b) = h(a) \setminus h(b) \quad \text{for } a \supset b.$$

Such a map will be called an sp-morphism.

For $b \subset a$ (hence $a \cap b = b$), from $h(a \cap b) = h(a) \cap h(b)$ it follows that $h(a) \cap h(b) = h(b)$, that is, $h(a) \supset h(b)$. Thus $h$ preserves the order relation and, if the first requirement is satisfied, the second requirement of D 1.1 is meaningful because $b \subset a$ implies $h(b) \subset h(a)$.

If $a, b, a \cup b \in \mathcal{S}_1$ then from $h(a) \subset h(a \cup b)$ and $h(b) \subset h(a \cup b)$ we obtain $h(a) \cup h(b) \subset h(a \cup b)$; on the other hand, from $h((a \cup b) \setminus a) = h(a \cup b) \setminus h(a)$ and $(a \cup b) \setminus a \subset b$ it follows that $h(a \cup b) \setminus h(a) \subset h(b)$ and that $h(a) \cup [h(a \cup b) \setminus h(a)] = h(a \cup b) \subset h(a) \cup h(b)$. Therefore we obtain $h(a \cup b) = h(a) \cup h(b)$.

In accord with the usual terminology we shall say that a bijective mapping $h$ is an sp-isomorphism if both $h$ and $h^{-1}$ are sp-morphisms.

Th. 1.1. Let $N_1$ be a subset of $M_1$, let $N_2 = \ldots$ and let $\mathcal{S}_1$ be a system of selection procedures. Then the set $\mathcal{S}_2 = \{b \mid b = a \cap N_2 \text{ and } a \in \mathcal{S}_1\}$ is a system of selection procedures and the mapping $h(a) = a \cap N_2$ is an sp-morphism.

Proof. The proof is simple and left to the reader.

Th. 1.2. If $h$ is an sp-morphism and if $\mathcal{J} = \{a \mid a \in \mathcal{S}_1 \text{ and } h(a) = \emptyset\}$ then $\mathcal{J}$ satisfies the following properties:

(1) $a \in \mathcal{J}$, $b \in \mathcal{S}_1$ and $b \subset a \Rightarrow b \in \mathcal{J}$.
(2) $a_1, a_2 \in \mathcal{J}$, $a_1 \cup a_2 \in \mathcal{S}_1 \Rightarrow a_1 \cup a_2 \in \mathcal{J}$.

Proof. (1) Since $h$ is order preserving, from $h(b) \subset h(a) = \emptyset$ it follows that $h(b) = \emptyset$. (2) We obtain $h(a_1 \cup a_2) = h(a_1) \cup h(a_2) = \emptyset \cup \emptyset = \emptyset$.
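The two conditions of D 1.1 and the two properties stated in Th. 1.2 can be checked mechanically on a finite toy model. The sketch below assumes that the system of selection procedures is the full power set of a four-element set and that $h$ is the restriction map $h(a) = a \cap N_2$ of Th. 1.1 for one particular subset $N_2$; the set sizes, the choice of $N_2$, and all function names are illustrative assumptions.

```python
from itertools import combinations

def powerset(m):
    """All subsets of m as frozensets."""
    items = sorted(m)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

M1 = frozenset(range(4))
N2 = frozenset({0, 1})      # hypothetical subset used for the restriction map
S1 = powerset(M1)           # toy system of selection procedures

def h(a):
    """Restriction map h(a) = a ∩ N2, as in Th. 1.1."""
    return a & N2

def is_sp_morphism(f, system):
    """Check h(a∩b) = h(a)∩h(b) and, for b ⊂ a, h(a\\b) = h(a)\\h(b)."""
    for a in system:
        for b in system:
            if f(a & b) != f(a) & f(b):
                return False
            if b <= a and f(a - b) != f(a) - f(b):
                return False
    return True

def kernel(f, system):
    """The set J = {a : h(a) = ∅} of Th. 1.2."""
    return {a for a in system if not f(a)}

def is_ideal(J, system):
    """Check properties (1) and (2) of Th. 1.2 for J inside the given system."""
    members = set(system)
    for a in J:
        for b in system:
            if b <= a and b not in J:
                return False
        for a2 in J:
            if (a | a2) in members and (a | a2) not in J:
                return False
    return True

if __name__ == "__main__":
    print(is_sp_morphism(h, S1), is_ideal(kernel(h, S1), S1))
```

As a contrast, the map $a \mapsto a \cup \{0\}$ satisfies the first condition of D 1.1 but violates the second, so it is not an sp-morphism; the test harness below includes this check.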
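D 1.2. A subset $\mathcal{J} \subset \mathcal{S}$ is said to be an ideal in $\mathcal{S}$ provided that the following conditions are satisfied:

(1) $a \in \mathcal{J}$, $b \in \mathcal{S}$ and $b \subset a \Rightarrow b \in \mathcal{J}$.
(2) $a_1, a_2 \in \mathcal{J}$, $a_1 \cup a_2 \in \mathcal{S} \Rightarrow a_1 \cup a_2 \in \mathcal{J}$.

Th. 1.3. If $h$ is an sp-morphism then, by means of the ideal $\mathcal{J}$ described in Th. 1.2, it is possible to decompose the map $h$ as follows:

$$\mathcal{S}_1 \to \mathcal{S}_1/\mathcal{J} \xrightarrow{i} \mathcal{S}_2,$$

where $i$ is an injection.

Proof. From $h(a_1) = h(a_2)$ it follows that $h(a_1 \cap a_2) = h(a_1) \cap h(a_2) = h(a_1) = h(a_2)$ and we obtain $h(a_1 \setminus (a_1 \cap a_2)) = h(a_1) \setminus h(a_1 \cap a_2) = \emptyset$, $h(a_2 \setminus (a_1 \cap a_2)) = \emptyset$, that is, $a_1 \setminus (a_1 \cap a_2),\ a_2 \setminus (a_1 \cap a_2) \in \mathcal{J}$. An equivalence relation $a_1 \sim a_2$ is defined by $a_1 \setminus (a_1 \cap a_2),\ a_2 \setminus (a_1 \cap a_2) \in \mathcal{J}$. We obtain the following identity:

$$a_1 \setminus (a_1 \cap a_3) = \big[(a_1 \setminus (a_1 \cap a_2)) \setminus ((a_1 \setminus (a_1 \cap a_2)) \cap a_3)\big] \cup \big[a_1 \cap (a_2 \setminus (a_2 \cap a_3))\big].$$

From $a_1 \setminus (a_1 \cap a_2) \in \mathcal{J}$ and $a_2 \setminus (a_2 \cap a_3) \in \mathcal{J}$ it follows that $a_1 \setminus (a_1 \cap a_3) \in \mathcal{J}$. Similarly, from $a_3 \setminus (a_3 \cap a_2) \in \mathcal{J}$ and $a_2 \setminus (a_2 \cap a_1) \in \mathcal{J}$ it follows that $a_3 \setminus (a_3 \cap a_1) \in \mathcal{J}$.

2 Morphisms of Statistical Selection Procedures

For many applications the probability function obeys certain laws under sp-morphisms. We shall now formulate these laws.

If $a \in \mathcal{S}$, $h$ is an sp-morphism and $\mathcal{J}$ is the ideal defined in Th. 1.2, then the set of all $a \cap \tilde a$ for which $\tilde a \in \mathcal{J}$ is a subset of $\mathcal{J}$ which we shall denote by $\mathcal{J}(a)$. We therefore obtain $\mathcal{J}(a) = \mathcal{S}(a) \cap \mathcal{J}$. For $\mathcal{J}(a)$ we therefore obtain

$$a_1 \subset a_2,\ a_2 \in \mathcal{J}(a) \Rightarrow a_1 \in \mathcal{J}(a); \qquad a_1, a_2 \in \mathcal{J}(a) \Rightarrow a_1 \cup a_2 \in \mathcal{J}(a),$$

since if $a_1, a_2 \in \mathcal{J}(a)$ then it follows that $a_1 \cup a_2 \in \mathcal{S}$, since $a_1 \cup a_2 \subset a$.

D 2.1. An ideal $\mathcal{J}$ is said to be closed with respect to a statistical selection procedure $\mathcal{S}$ if $\sup_{\tilde a \in \mathcal{J}(a)} \lambda(a, \tilde a) = 1$ implies the relation $a \in \mathcal{J}$.

The condition $\sup_{\tilde a \in \mathcal{J}(a)} \lambda(a, \tilde a) = 1$ means that there exists an $\tilde a \in \mathcal{J}$ for which the probability for the selection $a \setminus \tilde a$ is, for all practical purposes, equal to zero.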
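D 2.2. An sp-morphism $h$ of a statistical selection procedure $\mathcal{S}_1$ into a statistical selection procedure $\mathcal{S}_2$ is said to be an ssp-morphism if the ideal $\mathcal{J}$ is closed and if, for $a_1 \supset a_2$, the following condition holds:

$$\lambda_2(h(a_1), h(a_2)) = \frac{\alpha_2}{\alpha_1}\,\lambda_1(a_1, a_2),$$

where $\alpha_1, \alpha_2$ are defined as follows:

$$\alpha_i = 1 - \sup_{\tilde a \in \mathcal{J}(a_i)} \lambda_1(a_i, \tilde a) = \inf_{\tilde a \in \mathcal{J}(a_i)} \lambda_1(a_i, a_i \setminus \tilde a).$$

Since the ideal $\mathcal{J}$ is closed, it follows that the condition $h(a) = \emptyset$ is equivalent to the condition

$$\alpha = 1 - \sup_{\tilde a \in \mathcal{J}(a)} \lambda_1(a, \tilde a) = 0.$$

For $h(a_1) \neq \emptyset$ we therefore have $\alpha_1 \neq 0$; therefore the condition given for $\lambda_1, \lambda_2$ is well defined.

If an ssp-morphism is an sp-isomorphism then it is also an ssp-isomorphism, since for each $a \neq \emptyset$ we obtain $\alpha = 1$ and therefore $\lambda_2(h(a_1), h(a_2)) = \lambda_1(a_1, a_2)$. Conversely, if $\alpha = 1$ for all $a \neq \emptyset$, then it follows that $\lambda_1(a, \tilde a) = 0$ for all $\tilde a \in \mathcal{J}(a)$. If $a \in \mathcal{J}$ and if $a \neq \emptyset$ then $a \in \mathcal{J}(a)$ and therefore $\lambda_1(a, a) = 1$, in contradiction to $\lambda_1(a, \tilde a) = 0$ for all $\tilde a \in \mathcal{J}(a)$. Therefore $\mathcal{J}$ contains only the null set, that is, the ssp-morphism is an ssp-isomorphism.

D 2.3. A subset $\mathcal{S}_1 \subset \mathcal{S}$ of a selection procedure $\mathcal{S}$ is called a separated part of $\mathcal{S}$ if $\mathcal{S}_1$ is a selection procedure and if, for each pair of elements $a_1 \in \mathcal{S}_1$, $a_2 \in \mathcal{S} \setminus \mathcal{S}_1$, the intersection $a_1 \cap a_2 = \emptyset$.

It is easy to see that if $\mathcal{S}_1$ is a separated part of $\mathcal{S}$ then $\mathcal{S}_2 = \mathcal{S} \setminus \mathcal{S}_1$ is also a separated part.

Th. 2.1. Let $h$ be an ssp-morphism of $\mathcal{S}_1$ into $\mathcal{S}_2$. If the relation $\lambda_2(h(a_1), h(a_2)) = \lambda_1(a_1, a_2)$ is satisfied for the case $a_1 \supset a_2$ and $h(a_1) \neq \emptyset$, then $\mathcal{J}$ is a separated part of $\mathcal{S}_1$ and $h$ is an ssp-isomorphism of $\mathcal{S}_1' = \mathcal{S}_1 \setminus \mathcal{J}$ onto a partial selection procedure $h\mathcal{S}_1' = \mathcal{S}_2' \subset \mathcal{S}_2$.

Proof. Let $a \notin \mathcal{J}$ and $\tilde a \in \mathcal{J}$. If $a \cap \tilde a \neq \emptyset$ then, since $h(a \cap \tilde a) = \emptyset$, we obtain $0 = \lambda_2(h(a), h(a \cap \tilde a)) = \lambda_1(a, a \cap \tilde a) \neq 0$, which is a contradiction. Therefore $\mathcal{J}$ is a separated part of $\mathcal{S}_1$. For $a \in \mathcal{S}_1'$ we have $h(a) \neq \emptyset$ for $a \neq \emptyset$, and $h$ is an sp-isomorphism of $\mathcal{S}_1'$ onto $\mathcal{S}_2' = h\mathcal{S}_1' = h\mathcal{S}_1$. As a result the probabilities are invariant; therefore $h$ is an ssp-isomorphism of $\mathcal{S}_1'$ onto $\mathcal{S}_2'$.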
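The role of the factors $\alpha_i$ in D 2.2 can be seen in a finite toy model. The sketch below assumes that the statistical selection procedure $\mathcal{S}_1$ is the power set of a four-element set with positive weights, that $\lambda_1(a, b)$ is the ratio of the weights of $b$ and $a$, that $h$ is the restriction map $h(a) = a \cap N_2$, and that $\lambda_2$ on the image is given by the same weights. It reads the condition of D 2.2 as $\lambda_2(h(a_1), h(a_2)) = (\alpha_2/\alpha_1)\,\lambda_1(a_1, a_2)$ for $a_1 \supset a_2$. The weights, the subset $N_2$ and all names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical positive weights on M1 = {0, 1, 2, 3}.
WEIGHT = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
N2 = frozenset({0, 1, 2})   # hypothetical subset defining h(a) = a ∩ N2

def subsets(a):
    items = sorted(a)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def weight(a):
    return sum((WEIGHT[x] for x in a), Fraction(0))

def lam(a, b):
    """Relative frequency lambda(a, b) of b within a, for b ⊂ a and a non-empty."""
    return weight(b) / weight(a)

def h(a):
    return a & N2

def alpha(a):
    """alpha(a) = 1 - sup over J(a) of lambda(a, ã); J(a) = subsets ã of a with h(ã) = ∅."""
    kernel_parts = [s for s in subsets(a) if not h(s)]   # always contains ∅
    return 1 - max(lam(a, s) for s in kernel_parts)

def satisfies_d22():
    """Check lambda_2(h(a1), h(a2)) = (alpha2/alpha1) * lambda_1(a1, a2) exactly."""
    for a1 in subsets(frozenset(WEIGHT)):
        if not h(a1):
            continue                      # condition only stated for h(a1) ≠ ∅
        for a2 in subsets(a1):
            if not a2:
                continue                  # alpha is not defined for the empty set
            left = lam(h(a1), h(a2))
            right = alpha(a2) / alpha(a1) * lam(a1, a2)
            if left != right:
                return False
    return True

if __name__ == "__main__":
    print(satisfies_d22(), alpha(frozenset({0, 3})))
```

In this model $\alpha(a)$ is simply the fraction of $a$ that survives the restriction to $N_2$, which makes the rescaling by $\alpha_2/\alpha_1$ plausible: the image frequencies are the original frequencies renormalized to the surviving parts.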
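3 Morphisms of Preparation and Registration Procedures

We now turn from the general case to the case of the preparation and registration procedures which are important for quantum mechanics. We shall now assume that we are given $M_1, \mathcal{Q}'_1, \mathcal{R}_{01}, \mathcal{R}_1$ and $M_2, \mathcal{Q}'_2, \mathcal{R}_{02}, \mathcal{R}_2$.

D 3.1. An ssp-morphism of $\mathcal{Q}'_1$ into $\mathcal{Q}'_2$, where $\mathcal{Q}'_1$ and $\mathcal{Q}'_2$ are statistical selection procedures, is called a preparation morphism (abbreviated p-morphism). By analogy with D 2.2 we define $\alpha(a) = \inf_{\tilde a \in \mathcal{J}(a)} \lambda_{\mathcal{Q}'_1}(a, a \setminus \tilde a)$, where $\mathcal{J}(a)$ is defined in §2.

The p-morphisms and the p-automorphisms will play a particularly important role.

D 3.2. A p-morphism $h$ will be said to be recording-invariant (r-invariant) if $\varphi_1(a_1) = \varphi_1(a_2)$ implies that $\varphi_2(h(a_1)) = \varphi_2(h(a_2))$ and $\alpha(a_1) = \alpha(a_2)$; here $\varphi_1$ and $\varphi_2$ are defined according to III, D 3.1.

Th. 3.1. For an r-invariant p-morphism $h$, a map $S$ of $\varphi_1\mathcal{Q}'_1$ into $\check{K}_2$ is defined by $S\varphi_1(a) = \alpha(a)\varphi_2(h(a))$; for each decomposition $a = \bigcup_{\nu=1}^{n} a_\nu$ with $a_\nu \cap a_\mu = \emptyset$ for $\nu \neq \mu$ it satisfies

$$S\varphi_1(a) = \sum_\nu \lambda_\nu\, S\varphi_1(a_\nu),$$

where $\lambda_\nu = \lambda_{\mathcal{Q}'_1}(a, a_\nu)$. If $h$ is a p-isomorphism, then $\alpha(a) = 1$ and $S$ maps $\varphi_1\mathcal{Q}'_1$ into $K_2$.

Proof. Clearly $S$ is well defined. From $a = \bigcup_{\nu=1}^{n} a_\nu$ it follows that $h(a) = \bigcup_{\nu=1}^{n} h(a_\nu)$. Since $a_\nu \cap a_\mu = \emptyset$ for $\nu \neq \mu$ we also obtain $h(a_\nu) \cap h(a_\mu) = \emptyset$. For such a decomposition it follows that

$$\varphi_2(h(a)) = \sum_\nu \lambda_{\mathcal{Q}'_2}(h(a), h(a_\nu))\,\varphi_2(h(a_\nu))$$

and

$$S\varphi_1(a) = \alpha(a)\varphi_2(h(a)) = \alpha(a) \sum_\nu \lambda_{\mathcal{Q}'_2}(h(a), h(a_\nu))\,\varphi_2(h(a_\nu)).$$

According to D 2.2 we have

$$\alpha(a)\,\lambda_{\mathcal{Q}'_2}(h(a), h(a_\nu)) = \alpha(a_\nu)\,\lambda_{\mathcal{Q}'_1}(a, a_\nu)$$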
and we obtain

$$S\varphi_1(a) = \sum_\nu \lambda_{\mathcal{Q}'_1}(a, a_\nu)\,\alpha(a_\nu)\,\varphi_2(h(a_\nu)) = \sum_\nu \lambda_{\mathcal{Q}'_1}(a, a_\nu)\, S\varphi_1(a_\nu).$$

From $\varphi_1(a) = \sum_\nu \lambda_{\mathcal{Q}'_1}(a, a_\nu)\,\varphi_1(a_\nu)$ (according to III, Th. 3.1 and the equivalence of III (3.2) and III (6.1)) we obtain the desired result.

The map $S$ defined in Th. 3.1 is (according to III, Th. 3.2) a rational affine map of the rational convex set $\varphi_1\mathcal{Q}'_1$ (which is dense in $K_1$) into $\check{K}_2$.

D 3.3. We shall call an sp-morphism $h$ of $\mathcal{R}_1$ into $\mathcal{R}_2$ for which the restriction to $\mathcal{R}_{01}$ is an ssp-isomorphism of $\mathcal{R}_{01}$ onto $\mathcal{R}_{02}$ a recording morphism (abbreviation: r-morphism).

From an r-morphism $h$ we may easily obtain a (canonical) map of effect processes; this map we shall also denote by $h$:

D 3.4. For $(b_0, b) \in \mathcal{F}_1$ we define the map $\mathcal{F}_1 \xrightarrow{h} \mathcal{F}_2$ by $h(b_0, b) = (h(b_0), h(b))$.

D 3.5. An r-morphism $h$ is said to be preparation-invariant (p-invariant) if $\psi_1(f_1) = \psi_1(f_2)$ implies that $\psi_2(h(f_1)) = \psi_2(h(f_2))$. $\psi_1$ and $\psi_2$ correspond to III, D 3.1.

Th. 3.2. For a p-invariant r-morphism $h$ a map $\psi_1\mathcal{F}_1 \xrightarrow{T} L_2$ is defined by $T\psi_1(f) = \psi_2(h(f))$ which satisfies $T\mathbf{1} = \mathbf{1}$. For the decomposition of the unit effect $\mathbf{1}$ (see III, D 4.5.5) $\mathbf{1}_{b_0} = \sum_i f_i$ we obtain $\mathbf{1} = \sum_i \psi_1(f_i)$ and $\mathbf{1} = \sum_i T\psi_1(f_i)$.

Proof. The proof is similar to the proof of Th. 3.1.

According to III, Th. 2.4, $T$ is a rational affine map of the rational convex set $\psi_1\mathcal{F}_1$ into $L_2$.

In many physical applications an r-morphism is not only p-invariant but, in addition, if $f_1, f_2$ are hardly distinguishable by testing with preparation procedures (even if $\psi(f_1) \neq \psi(f_2)$) then the same is true for the images $h(f_1)$ and $h(f_2)$. We therefore define:

D 3.6. An r-morphism $h$ is said to be preparation-continuous (p-continuous) if to each $\varepsilon > 0$ and $a \in \mathcal{Q}'_2$ there exist a $\delta > 0$ and a finite number of $a_i \in \mathcal{Q}'_1$ such that

$$|\mu_2(\varphi_2(a), \psi_2(h(f))) - \mu_2(\varphi_2(a), \psi_2(h(f')))| < \varepsilon$$
whenever

$$|\mu_1(\varphi_1(a_i), \psi_1(f)) - \mu_1(\varphi_1(a_i), \psi_1(f'))| < \delta$$

for all $a_i$. An r-isomorphism is said to be a p-continuous r-isomorphism if both $h$ and $h^{-1}$ are p-continuous.

It is easy to see that a p-continuous r-morphism is also p-invariant. An analogous definition can also be made for p-morphisms:

D 3.7. A p-morphism $h$ is said to be recording-continuous (r-continuous) if to each $\varepsilon > 0$ and $f \in \mathcal{F}_2$ there exist a $\delta > 0$ and a finite number of $f_i \in \mathcal{F}_1$ such that

$$|\mu_2(\alpha(a)\varphi_2(h(a)), \psi_2(f)) - \mu_2(\alpha(a')\varphi_2(h(a')), \psi_2(f))| < \varepsilon$$

whenever

$$|\mu_1(\varphi_1(a), \psi_1(f_i)) - \mu_1(\varphi_1(a'), \psi_1(f_i))| < \delta$$

for all $f_i$. A p-isomorphism is said to be an r-continuous p-isomorphism if both $h$ and $h^{-1}$ are r-continuous.

Again it is easy to see that an r-continuous p-morphism is also r-invariant.

D 3.8. A p-isomorphism $h\colon \mathcal{Q}'_1 \to \mathcal{Q}'_2$ is said to be dual to an r-isomorphism $h'\colon \mathcal{R}_2 \to \mathcal{R}_1$ if $(h(a), b_0) \in C_2$ is equivalent to $(a, h'(b_0)) \in C_1$ and if

$$\mu_2(h(a), (b_0, b)) = \mu_1(a, h'(b_0, b)). \qquad (3.1)$$

Here $C_1$ and $C_2$ are defined by analogy with $C$ in II (4.3.1).

Th. 3.3. If a p-isomorphism $h\colon \mathcal{Q}'_1 \to \mathcal{Q}'_2$ and an r-isomorphism $h'\colon \mathcal{R}_2 \to \mathcal{R}_1$ are dual, then $h^{-1}$ and $h'^{-1}$ are also dual.

Proof. From $(h^{-1}(a), b_0) \in C_1$ it follows, for $a' = h^{-1}(a)$ and $b_0' = h'^{-1}(b_0)$, that $(a', h'(b_0')) \in C_1$ and, according to D 3.8, $(h(a'), b_0') \in C_2$, that is, $(a, h'^{-1}(b_0)) \in C_2$. In the same way it follows from $(a, h'^{-1}(b_0)) \in C_2$ that $(h^{-1}(a), b_0) \in C_1$.

From (3.1) it follows that, for $a' = h^{-1}(a)$ and $b_0' = h'^{-1}(b_0)$, $b' = h'^{-1}(b)$,

$$\mu_1(h^{-1}(a), (b_0, b)) = \mu_1(a', h'(b_0', b')) = \mu_2(h(a'), (b_0', b')) = \mu_2(a, h'^{-1}(b_0, b)).$$

Th. 3.4 (H. Neumann). If a p-isomorphism $h$ and an r-isomorphism $h'$ are dual, then $h$ is an r-continuous p-isomorphism and $h'$ is a p-continuous r-isomorphism.

Proof. According to Th. 3.3 we need only show that $h$ is r-continuous and that $h'$ is p-continuous. Since $h$ is a p-isomorphism, we find that $\alpha(a) = 1$. According to D 3.7 it suffices to show that

$$\mu_2(\varphi_2(h(a)), \psi_2(b_0, b)) = \mu_1(\varphi_1(a), \psi_1(h'(b_0, b))). \qquad (3.2)$$
First we shall show that $h$ is r-invariant. Suppose that $a' \sim a$ and that $h(a') \nsim h(a)$. Then there exists a $(b_0, b)$ such that $(h(a'), b_0) \in C_2$, $(h(a), b_0) \in C_2$ and $\mu_2(h(a'), (b_0, b)) \neq \mu_2(h(a), (b_0, b))$. Thus it follows that $(a', h'(b_0)) \in C_1$ and $(a, h'(b_0)) \in C_1$ and $\mu_1(a', h'(b_0, b)) \neq \mu_1(a, h'(b_0, b))$, which contradicts $a' \sim a$.

According to APS 5.1.4 there exists an $a' \in \varphi_1(a)$ satisfying $(a', h'(b_0)) \in C_1$. Then, according to D 3.8,

$$\mu_1(\varphi_1(a), \psi_1(h'(b_0, b))) = \mu_1(a', h'(b_0, b)) = \mu_2(h(a'), (b_0, b)) = \mu_2(\varphi_2(h(a')), \psi_2(b_0, b)).$$

Since $h$ is r-invariant, from $a' \sim a$ it follows that $h(a') \sim h(a)$, that is, $\varphi_2(h(a')) = \varphi_2(h(a))$.

According to D 3.6 the relation (3.2) suffices to show that $h'$ is p-continuous (observe that $h'$ is a map $\mathcal{R}_2 \to \mathcal{R}_1$ and not $\mathcal{R}_1 \to \mathcal{R}_2$ as in D 3.6!).

4 Morphisms of Ensembles and Effects

Since r-invariant p-morphisms and p-invariant r-morphisms always occur in applications, it is understandable that our emphasis in the investigation of morphisms in quantum mechanics will be on morphisms of ensembles and effects.

4.1 Morphisms of Ensembles

D 4.1.1. An affine mapping $S$ of $K_1$ into $K_2$ is called a mixture morphism (mi-morphism).

D 4.1.2. An affine map $S$ of $\check{K}_1$ into $\check{K}_2$ is called an operation.

Th. 4.1.1. A rational affine and norm-continuous map $S$ of a (rational affine) set $\mathcal{K}_1$ which is dense in $K_1$ into $\check{K}_2$ may be uniquely extended to an operation $\check{K}_1 \xrightarrow{S} \check{K}_2$.

Proof. Since $S$ is norm-continuous, $S$ may be extended as an affine mapping onto $\overline{\mathrm{co}}\,\mathcal{K}_1 = K_1$. Since each element of $\check{K}_1$ may be written in the form $\lambda w$ where $0 \le \lambda \le 1$, $w \in K_1$, and $S$ is affine on $K_1$, $S$ may be extended onto all of $\check{K}_1$ by means of the equation $S(\lambda w) = \lambda S(w)$. Thus this extension of $S$ is an operation $\check{K}_1 \xrightarrow{S} \check{K}_2$.

Th. 4.1.2.
An operation $S$ of $\check{K}_1$ into $\check{K}_2$ may be uniquely extended as a linear mapping of $\mathcal{B}_1$ into $\mathcal{B}_2$ with norm $\|S\| \le 1$. Every mixture morphism $K_1 \xrightarrow{S} K_2$ has a unique extension as a linear map of $\mathcal{B}_1$ into $\mathcal{B}_2$ for which $\|S\| = 1$; in particular, every mixture morphism can be extended in this way to an operation.
Every positive norm-continuous linear map with $\|S\| \le 1$ (restricted to $\check{K}_1$) is an operation. Every positive linear map $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ is norm-continuous and $\|S\|^{-1}S$ is an operation.

Proof. Since $\mathcal{B}_1$ is spanned by $\check{K}_1$, $S$ can be extended to $\mathcal{B}_1$. For $w_1 \in \check{K}_1$ we have $Sw_1 \in \check{K}_2$ and hence $\|Sw_1\| \le 1$. For $x = \alpha w_1 - \beta w_2$ with $\|x\| = \alpha + \beta$ we may conclude that $\|Sx\| \le \|x\|$. It follows that $S$ is norm-continuous and $\|S\| \le 1$. In this way we find that every positive map satisfying $\|S\| \le 1$ is an operation, because $\check{K}$ is the intersection of the unit sphere with the positive cone. Since every positive linear map is norm-continuous, this result holds in general (see AIII, §6).

Thus we see that a bijective mixture morphism $K_1 \xrightarrow{S} K_2$ is a mixture isomorphism, that is, $S^{-1}$ is a mixture morphism. It also follows that a bijective operation $\check{K}_1 \xrightarrow{S} \check{K}_2$ is a mixture isomorphism, that is, it maps $K_1$ onto $K_2$.

Th. 4.1.3. To each operation $S$ there exists a dual map $S'$ of $\mathcal{B}'_2$ into $\mathcal{B}'_1$ for which $L_2 \xrightarrow{S'} L_1$. $S'$ is $\sigma(\mathcal{B}'_2, \mathcal{B}_2)$-$\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-continuous; $S$ is a mixture morphism if and only if $S'1 = 1$.

Proof. The fact that $S'$ exists and is $\sigma(\mathcal{B}'_2, \mathcal{B}_2)$-$\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-continuous follows from Th. 4.1.2, that is, from the fact that $S$ is norm-continuous (see AIII, §5). From $K_1 \xrightarrow{S} K_2$ it follows that $\mu_2(Sw, 1) = 1 = \mu_1(w, S'1)$ holds for all $w \in K_1$, and we therefore obtain $S'1 = 1$. If $S'1 = 1$ then, for all $w \in K_1$, it follows that $\mu_2(Sw, 1) = \mu_1(w, S'1) = \mu_1(w, 1) = 1$, and we therefore obtain $Sw \in K_2$.

Th. 4.1.4. For a mi-morphism $S$ the following statements are equivalent:
(i) $S$ is a mixture isomorphism.
(ii) $K_1 \xrightarrow{S} K_2$ is injective and $SK_1$ is dense in norm in $K_2$.
(iii) $L_2 \xrightarrow{S'} L_1$ is bijective.
(iv) $S'$ is an isomorphic map of the Banach spaces $\mathcal{B}'_2$ and $\mathcal{B}'_1$.
(v) $S$ is an isomorphic map of the Banach spaces $\mathcal{B}_1$ and $\mathcal{B}_2$.

Proof. (i) => (ii) is trivial.

(ii) => (iii). Since $SK_1$ is norm-dense in $K_2$ it follows that $(S\mathcal{B}_1)^{\perp} = 0$, which is equivalent to the condition that $S'$ is injective.
Since $SK_1$ is norm-dense in $K_2$ it follows that

$\|S'y\| = \sup_{w \in K_1} |\mu_1(w, S'y)| = \sup_{w \in K_1} |\mu_2(Sw, y)| = \sup_{w \in K_2} |\mu_2(w, y)| = \|y\|$,

that is, $S'$ is norm-preserving. Thus it follows that $S'\mathcal{B}'_2$ is a norm-closed subspace of $\mathcal{B}'_1$.

We shall now show that if $K_1 \xrightarrow{S} K_2$ is injective then $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ is also injective. Every $x \in \mathcal{B}_1$ can be written in the form $x = \alpha w_1 - \beta w_2$, where $\alpha, \beta \ge 0$ and $w_1, w_2 \in K_1$. Then from $0 = Sx = \alpha Sw_1 - \beta Sw_2$ and from $Sw_1, Sw_2 \in K_2$ it follows that $\alpha = \beta$ and that $Sw_1 = Sw_2$. Thus it follows that $w_1 = w_2$ and finally $x = 0$.

Since $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ is injective, it follows that $S'\mathcal{B}'_2$ is $\sigma$-dense in $\mathcal{B}'_1$. Since the unit sphere $[-1, 1]$ is $\sigma$-compact, the set $A = S'[-1, 1]$ is therefore compact and convex. Since $S'$ preserves the norm we obtain $A = S'\mathcal{B}'_2 \cap [-1, 1]$, and $S'\mathcal{B}'_2$ is therefore also $\sigma$-closed (see AIII, §4), so that $S'\mathcal{B}'_2 = \mathcal{B}'_1$ and $S'[-1, 1] = [-1, 1]$. Since, according to Th. 4.1.3, $S'1 = 1$ and $L = \tfrac{1}{2}(1 + [-1, 1])$, it follows that $S'L_2 = L_1$ and (iii) is satisfied.

(iii) => (iv). If $S'$ is a bijective map of $L_2$ onto $L_1$ then (since, according to Th. 4.1.3, $S'1 = 1$ and $[-1, 1] = 2L - 1$) it follows that $S'$ is also a bijective map of the unit spheres onto each other. Hence (iv) holds.

(iv) => (v). We shall now show that $(S')^{-1}$ is $\sigma$-continuous: $(S')^{-1}$, as the inverse of the $\sigma$-continuous bijective map of the $\sigma$-compact unit sphere, is $\sigma$-continuous on the unit sphere and is therefore $\sigma$-continuous everywhere (see AIII, §5). Thus it follows that $S^{-1}$ exists and satisfies $(S^{-1})' = (S')^{-1}$. Since $S'$ maps the unit sphere bijectively, we obtain $\|Sx\| = \|x\|$, which proves (v).

(v) => (i). The existence of $(S')^{-1}$ and the relation $(S')^{-1} = (S^{-1})'$ are clear. According to Th. 4.1.3 we have $S'1 = 1$, and we obtain $(S')^{-1}1 = 1$. Let $w \in K_2$. From $\|S^{-1}w\| = \|w\| = 1$ and $\mu_1(S^{-1}w, 1) = \mu_2(w, (S')^{-1}1) = \mu_2(w, 1) = 1$ we obtain $S^{-1}K_2 \subset K_1$. Together with $SK_1 \subset K_2$ this shows that $S$ is a mixture isomorphism.

D 4.1.3. An operation (or mixture morphism) is said to be $\mathcal{D}$-continuous when, as a map $\check{K}_1 \to \check{K}_2$ (or $K_1 \to K_2$), it is continuous with respect to the topologies $\sigma(K_1, \mathcal{D}_1)$, $\sigma(K_2, \mathcal{D}_2)$; the spaces $\mathcal{D}$ are defined in III, §4 and §5.

As a mapping $K_1 \to K_2$ a $\mathcal{D}$-continuous mixture morphism may be naturally extended as a $\mathcal{D}$-continuous operation.

Th. 4.1.5. Every $\mathcal{D}$-continuous rational affine map $\mathcal{K}_1 \xrightarrow{S} \mathcal{K}_2$ can be extended to a mapping $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ which is norm-continuous and, as a mapping $K_1 \xrightarrow{S} K_2$, is also $\mathcal{D}$-continuous.

Proof. By definition, the norm in $\mathcal{D}'$ is equal to $\|x\| = \sup\{\mu(x, y) \mid \|y\| \le 1,\ y \in \mathcal{D}\}$. Since $\mathcal{D} \cap L$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$ (see III, §5), and since the set $\{y \mid \|y\| \le 1,\ y \in \mathcal{D}\}$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in the unit sphere of $\mathcal{B}'$, for elements of $\mathcal{B} \subset \mathcal{D}'$ the norm in $\mathcal{D}'$ is identical to the norm in $\mathcal{B}$.
We shall now show that the unit sphere of 00' is equal to the convex set generated byX* и (-K*). The unit sphere in 01' is the set which is bipolar to the set К и (-K) in the dual pair (0', 00). Therefore the unit sphere of 00' is equal to the o{00\ ^)-closed convex set generated by Ka и Since Ka and (—Ka) are a{00\ ^-compact (as bounded closed sets in 00'\ see AIII, §4) the convex set generated by Ka и is already compact, and is therefore ct(^', ^)-closed, that is, equal to the unit sphere of0\ Let 1 g 00\ since the relation p(w, 1) = 1 for all w e Ka follows from the same relation for all w e К it follows that ||w|l = l_for all w e Ka. Since the closures К\ of in 00 ^ and К% of K2 in 002 are compact, S may be extended as a ^-continuous map K\ Д К2, where this extension becomes an affine map. Since 0b 1 is the linear span of Kl9 S may be extended to a map 0b1-^^'2.
4 Morphisms of Ensembles and Effects 209 For x g 0b x and x = aw1 — /?w2, where wl5 w2 e Ki and ||x|| = a + P it follows that ||Sx|| < ctWSwJ + P\\Sw2\\ < a + P because the norm || w'|| < 1 for elements w' e K2 • 5 is therefore norm-continuous as a map 0b 2 —► 0'2. Since Xjl is norm-complete and Жх is norm-dense in and K2 is norm- complete, and from Жх Д K2 we therefore obtain that is, S is an operation. According to Th. 4.1.1 and Th. 4.1.2 it follows that S is a norm- continuous map^ —► 0b 2. Since К* Д X2 is ^-continuous, the map Д K2 is ^-continuous. Th. 4.1.6. For a 00-continuous affine map Kx Д K2 the extension of S on 0b x is continuous in the topologies o(0b2, ®i) and ®i) anequivalently, maps t/гв subspace 002 of 0b'2 in 00x. Proof. First we shall show that each element may be written in the form x = <xwx — f$w2, where wl5 w2 e Ka and ||x|| = a + p. If x' = ||хЦ_1х then x' is in the unit sphere of 00', and, since со(К* и (—Ka)) is the unit sphere in 0', is therefore of the form x' = Xwx — (1 — X)w2, where 0 < Я < 1 and wx,w2eK<T. Thus it follows that x = awx — Pw2 with a = ||х||Я, P = ||x||(l — Я), that is, a + P = ||x||. Thus it follows that the set of all aw^ — /?w2, where wl5 w2e К is 0) dense in 0', that is, ^ is <r(^', 00) dense in 00'. We may define a <r(^i, ^-continuous affine function on Kx by means of the equation /(w) = /^(Sw, y) (for fixed у £000) because 5 is continuous in the topologies o{00\, 000) on Ki and <7(^2, 02) on K2. In this way we may obtain an extension of f(w) as a ст-continuous affine function on all of Ka. Since each xe9\ is of the form x = awx — Pw2, where ||x|| = a + P and wl5 w2 g К1, I may be extended on all of 00' as a norm-continuous linear form because \Ы < а|*Ю1 + p\l(w2)\ < (a + P) sup \l(w)\ = ||x|| sup |/(w)|. we Kf we Kf We shall now show that / is, in addition, a(9\, ^J-continuous as a linear form over Here I is a(9'x, ^J-continuous if I is a(9'x, ^J-continuous on the unit sphere of 00\ (see AIII, §4). 
Thus we need only show that (/(x)| < S for ||x|| < 1 and a suitable o(00i, ^-neighborhood of 0. We may write x in the form x = awx — Pw2 where a + P = ||x|| < 1 and wl5 w2 g K°x. Thus it follows that x=^a + fi)(wl - w2) + (a - fi)(jwl + jw2). Since /i1(w, 1) = 1 for all w e Kx, it follows that ^ = ilMK - w2) + Hi(x, !X2wi + 2wi)- Thus m ^ iu*PK - w2) + kw, i)i, where Я = supweKf |/(w)|.
210 V Transformations of Registration and Preparation Procedures We shall now choose the g(3\9 3X) neighborhood of 0 as follows: Since I is a(3'l9 ^J-continuous in K\, there are finitely many yt e 31 for which |l(w — w)| < <5/2 providing that \p(w — w, yt)| <1. For e = <5/22 we define yt = (l/e)y*. We choose the c(3'l9 3X) neighborhoods of 0 as follows: \p(x9 y,)| < * for all i and \p(x9 1)| < с/4, where a is the smallest of the numbers \\yt\\ and e. We will now prove that \l(x)\ < S if ||x|| < 1 and if x lies in the specified neighborhood ofO: (a) If ||x|| < e, then since 1/^ — w2)| < 22 it follows that |/(x)| < sX Л—X < 2sX = <5. 4 (b) If ||x|| > e then, since ||x|| < 1, it follows that \l(x)\ < - Wl)\ + 4<^ ^ Ш™1 - w2)| + el and from Кх, yd = - w2, у,) + ц(х, l^K + w2, yd it follows that 2 1 \Kwi - w2) у()| < -r—r\Kx, 3>i)l + Mx, 1)1 + w2. J>i)l Wl \\x\\ 2 л 11 11111 ^ -Ых, Л)1 + то-ЧуА < —z + = -» e 4 s s 2 2 s s that is, \p(w1 — w2, y*)| < 1. Therefore we obtain 1/^ — w2)| < d/2 and <5 |/(x)| < - + sX = <5. Thus we have shown that I is o(3\, ^J-continuous in all of 3\9 and that there exists a|e^i for which l(x) = /x2(Sx, у) = /X t(x, y) = Hi(x, S’у) from which it follows that S'у e 31 for у e 32, ^aX is, S'32 cz 31. This is equivalent to the condition that S is, as a map 3\ —► 3'l9 continuous in the topologies g(9i, 3X) and o(®'2, 32). Since 08 x cz 3\ and 0(12 a. 3'l9 the same statement holds for the map S: 08x —> 082. Th. 4.1.7. Let h be an r-continuous p-morphism. A rational affine and 3- continuous map Ж1K2 IS defined by the equation S(px(a) = cx,(a)(p2(h(a)). This map may be extended to an affine 3-continuous map K2. The continuation S is, as a map 08 x Д 082, both norm-continuous and 3- continuous, that is, Sf32 <= 3X. If h is a p-isomorphism, then Kx-^ K2 is bijective. Proof. From Th. 3.1 and from the conclusions following Th. 3.1 it follows that Jfi Д К2 is a rational affine map. 
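From the fact that $h$ is r-continuous and the fact that the topologies $\sigma(K, \mathcal{D})$ and $\sigma(K, \mathcal{L})$ are identical (with $\mathcal{L} = \psi(\mathcal{F})$, since $\mathcal{L}$ is norm-dense in $L \cap \mathcal{D}$), it follows that $\mathcal{K}_1 \xrightarrow{S} \mathcal{K}_2$ is $\mathcal{D}$-continuous. According to Th. 4.1.5 it follows that $K_1 \xrightarrow{S} K_2$ is affine and $\mathcal{D}$-continuous and that $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ is norm-continuous. According to Th. 4.1.6 it follows that $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$ is also $\mathcal{D}$-continuous and $S'\mathcal{D}_2 \subset \mathcal{D}_1$.

If $h$ is a p-isomorphism, then, according to Th. 3.1, $S$ maps $\mathcal{K}_1$ into $K_2$. Since $h$ is a p-isomorphism, we obtain $S\mathcal{K}_1 = \mathcal{K}_2$. For $h^{-1}$ there is an $\tilde{S}$ with $\tilde{S}\mathcal{K}_2 = \mathcal{K}_1$ and $S\tilde{S} = 1$, $\tilde{S}S = 1$ on $\mathcal{K}_2$ (resp. $\mathcal{K}_1$). Since $S$ and $\tilde{S}$ are norm-continuous, and $\mathcal{K}_1$ and $\mathcal{K}_2$ are norm-dense in $K_1$ and $K_2$, respectively, for the extensions we obtain $K_1 \xrightarrow{S} K_2$, $K_2 \xrightarrow{\tilde{S}} K_1$ and $S\tilde{S} = 1$, $\tilde{S}S = 1$.

4.2 Morphisms of Effects

D 4.2.1. Let $T$ be a mapping of $L_1$ into $L_2$; if the relation $T(g_1 + g_2) = Tg_1 + Tg_2$ is satisfied for all $g_1, g_2, g_1 + g_2 \in L_1$, then we shall call the map $T$ an effect morphism. $T$ is said to be $\mathcal{B}$-continuous if it is continuous as a map $L_1 \to L_2$ in the topologies $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$, $\sigma(\mathcal{B}'_2, \mathcal{B}_2)$.

Th. 4.2.1. A mapping $T$ of $L_1$ into $L_2$ which satisfies the relation $T(g_1 + g_2) = Tg_1 + Tg_2$ for $g_1, g_2, g_1 + g_2 \in L_1$ has a unique extension as a linear map to $\mathcal{B}'_1$ and satisfies $\|T\| \le 1$. $T$ is positive.

Proof. From $g_1, g_2 \in L_1$ and $g_2 \ge g_1$ it follows that $g_2 = g_1 + (g_2 - g_1)$ and that $Tg_2 = Tg_1 + T(g_2 - g_1)$, that is, $Tg_2 \ge Tg_1$ and $T(g_2 - g_1) = Tg_2 - Tg_1$.

For $g \in L_1$ and $ng \in L_1$ (for integer values of $n$) we obtain, by induction, $T(ng) = nTg$. For integer $m$ and $g_1 \in L_1$ we obtain $g = g_1/m \in L_1$ and $g_1 = mg$; therefore we find that $Tg_1 = mTg$, that is, $T(g_1/m) = (Tg_1)/m$. Thus, for $ng/m \in L_1$ we obtain $T(ng/m) = nT(g/m) = nT(g)/m$. Thus, for all rational numbers $\lambda$ with $g \in L_1$, $\lambda g \in L_1$ we obtain $T(\lambda g) = \lambda Tg$.

If $\lambda$ is irrational, then, to each $\varepsilon > 0$ we may choose two rational numbers $\lambda_1, \lambda_2$ such that $\lambda_1 < \lambda < \lambda_2$ and $\lambda_2 - \lambda_1 < \varepsilon$. Let $\lambda_2 g \in L_1$. Since $\lambda_1 g \le \lambda g \le \lambda_2 g$ we obtain $\lambda_1 Tg \le T(\lambda g) \le \lambda_2 Tg$. Since $\varepsilon$ was arbitrary it follows that $T(\lambda g) = \lambda Tg$. If $\lambda$ is irrational and $\lambda g \in L_1$, but $(\lambda + \varepsilon)g \notin L_1$ for each $\varepsilon > 0$, we then choose $g_1 = g - \delta g$, where $0 < \delta < 1$.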
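Then, for irrational numbers $\lambda$ we also obtain $T(\lambda g_1) = \lambda Tg_1$ and $T(\lambda \delta g) = \lambda T(\delta g)$, because, for sufficiently small $\varepsilon > 0$, we always obtain $(\lambda + \varepsilon)g_1 \in L_1$ and $(\lambda + \varepsilon)\delta g \in L_1$. Since $g = g_1 + \delta g$ we obtain $Tg = Tg_1 + T(\delta g)$ and, since $\lambda g = \lambda g_1 + \lambda \delta g$, we also obtain

$T(\lambda g) = T(\lambda g_1) + T(\lambda \delta g) = \lambda Tg_1 + \lambda T(\delta g) = \lambda Tg$.

Therefore $T$ is an affine mapping of $L_1$ into $L_2$. Since $\mathcal{B}'_1$ is the linear span of $L_1$ (for example, each $y \in \mathcal{B}'_1$ may be written in the form $y = \alpha g - \beta 1$, where $g \in L_1$), $T$ may be extended as a linear map from $\mathcal{B}'_1$ into $\mathcal{B}'_2$. Since the unit sphere of $\mathcal{B}'_1$ is equal to $2L_1 - 1$ (and similarly for $\mathcal{B}'_2$), and since $L_1 \xrightarrow{T} L_2$, we find that $T$ maps the unit sphere of $\mathcal{B}'_1$ into the unit sphere of $\mathcal{B}'_2$, that is, $\|T\| \le 1$.

Each $y \in \mathcal{B}'_{1+}$ is of the form $\lambda g$, where $\lambda \ge 0$, $g \in L_1$; therefore $T$ is a positive mapping.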
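The properties established in Th. 4.2.1 can be checked numerically in a small example. The sketch below is not taken from the text: it assumes a finite classical model in which effects are vectors in $[0, 1]^n$ with the sup norm, and it uses a hypothetical matrix `T` with non-negative entries and row sums at most 1 as the effect morphism. Exact rational arithmetic is used so that the equalities are not blurred by rounding.

```python
from fractions import Fraction as F

# Hypothetical effect morphism from a 3-point to a 2-point classical system,
# written as a matrix acting on effect vectors g in [0, 1]^3.
T = [
    [F(1, 2), F(1, 4), F(0)],
    [F(0), F(1, 3), F(1, 3)],
]

def apply(matrix, g):
    """Return the image T g of the vector g."""
    return [sum(m * x for m, x in zip(row, g)) for row in matrix]

def is_effect(g):
    """An effect satisfies 0 <= g <= 1 in every component."""
    return all(0 <= x <= 1 for x in g)

def sup_norm(y):
    """Norm of the dual space B' in the classical model."""
    return max(abs(x) for x in y)

g1 = [F(1, 5), F(1, 2), F(0)]
g2 = [F(3, 10), F(1, 4), F(1, 2)]
g_sum = [a + b for a, b in zip(g1, g2)]  # still an effect
one = [F(1)] * 3

# Additivity on L and homogeneity for a rational factor.
additive = apply(T, g_sum) == [a + b for a, b in zip(apply(T, g1), apply(T, g2))]
lam = F(2, 7)
homogeneous = apply(T, [lam * x for x in g1]) == [lam * x for x in apply(T, g1)]

# T maps effects to effects (in particular T1 <= 1).
maps_effects = all(is_effect(apply(T, g)) for g in (g1, g2, g_sum, one))

# The unit sphere of B' is 2L - 1; check ||T y|| <= ||y|| on y = 2 g2 - 1.
y = [2 * x - 1 for x in g2]
contractive = sup_norm(apply(T, y)) <= sup_norm(y)
```

In this model positivity is simply the non-negativity of the matrix entries, and the bound $\|T\| \le 1$ corresponds to the row sums being at most 1. The checks cover only the listed sample vectors and do not replace the proof.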
Th. 4.2.2. If $T$ is a $\mathcal{B}$-continuous effect morphism, then the extension of the map $T$ onto $\mathcal{B}'_1$ is also $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-$\sigma(\mathcal{B}'_2, \mathcal{B}_2)$-continuous.

Proof. Since the unit sphere of $\mathcal{B}'_1$ is equal to $2L_1 - 1$, the map $T$ is $\sigma$-continuous on the unit sphere. Then, according to AIII, §5, $T$ is $\sigma$-continuous in all of $\mathcal{B}'_1$.

Th. 4.2.3. If $T$ is a linear positive mapping of $\mathcal{B}'_1$ into $\mathcal{B}'_2$ then $T$ is norm-continuous. $T$ maps $L_1$ into $L_2$ if and only if $T1 \in L_2$.

Proof. The norm-continuity of $T$ is a result of general theorems (see AIII, §6). We may easily prove this result directly. For $y \in \mathcal{B}'_1$ we define $\beta = -\inf\{\mu(w, y) \mid w \in K_1\}$ provided that the infimum is less than zero; otherwise we set $\beta = 0$. We also define $\alpha = \sup\{\mu(w, y) \mid w \in K_1\}$. Then $\alpha \ge -\beta$ and

$\|y\| = \sup\{|\mu(w, y)| \mid w \in K_1\} = \max\{|\alpha|, \beta\}$.

If we set

$g = \frac{1}{\alpha + \beta}(y + \beta 1)$,

then $0 \le \mu(w, g) \le 1$ for all $w \in K_1$, that is, $g \in L_1$. Since $T$ is a positive map we obtain $0 \le Tg \le T1$ and

$Ty = T((\alpha + \beta)g - \beta 1) = (\alpha + \beta)Tg - \beta T1 \le (\alpha + \beta)T1 - \beta T1 = \alpha T1$

and $Ty \ge -\beta T1$. Thus it follows that $\|Ty\| \le \|T1\|\,\|y\|$, that is, $T$ is norm-continuous and (since 1 is an element of the unit sphere) we finally obtain $\|T\| = \|T1\|$.

If $L_1 \xrightarrow{T} L_2$ we obtain $T1 \in L_2$. Conversely, if $T1 \in L_2$ then, for $g_1 \in L_1$, that is, $0 \le g_1 \le 1$, we obtain (since $T$ is positive) $0 \le Tg_1 \le T1$. Since $T1 \in L_2$ we obtain $T1 \le 1$ and therefore $Tg_1 \in L_2$.

From $\|T\| = \|T1\|$ it follows that for each positive mapping $T$ the map $\|T\|^{-1}T$ maps the set $L_1$ into $L_2$.

To each $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-$\sigma(\mathcal{B}'_2, \mathcal{B}_2)$-continuous mapping $T$ of $\mathcal{B}'_1$ into $\mathcal{B}'_2$ there exists an adjoint mapping $T'$ of $\mathcal{B}_2$ into $\mathcal{B}_1$ (see AIII, §5) for which $\mu_2(x, Ty) = \mu_1(T'x, y)$. If $T$ maps $L_1$ into $L_2$ then, for $w \in K_2$ we obtain $0 \le \mu_2(w, Ty) \le 1$ and also $0 \le \mu_1(T'w, y) \le 1$
for all $y \in L_1$, that is, $T'w \in \check{K}_1$. $T'$ is therefore an operation.

The $\mathcal{B}$-continuous effect morphisms and the operations therefore correspond uniquely to each other as adjoint mappings, as we have already seen in §4.1. In particular, $T'$ is a mixture morphism if and only if $T1 = 1$.

In closing this section we shall now give criteria for the $\mathcal{B}$-continuity of an effect morphism.

Th. 4.2.4. An effect morphism $T$ is $\mathcal{B}$-continuous if each $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-convergent sequence $g_\nu$ in $L_1$ (therefore $g_\nu \to g \in L_1$) satisfies $Tg_\nu \to Tg$ in the $\sigma(\mathcal{B}'_2, \mathcal{B}_2)$ topology.

Proof. The $\sigma(\mathcal{B}', \mathcal{B})$-topology on $L$ is metrizable (see AIII, §4) and it is therefore only necessary to consider sequences.

Th. 4.2.5. An effect morphism $T$ is $\mathcal{B}$-continuous if and only if for each decreasing sequence of decision effects $e_\nu$ satisfying $\bigwedge_\nu e_\nu = 0$ the relation $Te_\nu \to 0$ holds in the $\sigma(\mathcal{B}'_2, \mathcal{B}_2)$-topology.

Proof. Since $e_\nu$ is decreasing it follows that $Te_\nu$ is decreasing in $L_2$ and therefore converges in the $\sigma(\mathcal{B}'_2, \mathcal{B}_2)$ topology: $Te_\nu \to g \in L_2$. Thus the hypothesis of the theorem is equivalent to the condition that $\bigwedge_\nu e_\nu = 0$ implies that $g = 0$.

If $e_\nu$ is an increasing sequence satisfying $\bigvee_\nu e_\nu = e$, it follows that $e - e_\nu$ is a decreasing sequence, and that $T(e - e_\nu) = Te - Te_\nu \to 0$, that is, $Te_\nu$ converges to $Te$.

If $e_n$ is a sequence of pairwise orthogonal decision effects satisfying $\sum_{n=1}^{\infty} e_n = e$, then $e_\nu = \sum_{n=1}^{\nu} e_n$ is an increasing sequence satisfying $\bigvee_\nu e_\nu = e$. Therefore, in the $\sigma(\mathcal{B}'_2, \mathcal{B}_2)$ topology, $Te_\nu = \sum_{n=1}^{\nu} Te_n \to Te$, that is, $T(\sum_{n=1}^{\infty} e_n) = \sum_{n=1}^{\infty} Te_n$. Thus, for each $w_2 \in K_2$ we obtain

$\mu_2\left(w_2, T\sum_{n=1}^{\infty} e_n\right) = \sum_{n=1}^{\infty} \mu_2(w_2, Te_n)$.

Thus $m(e) = \mu_2(w_2, Te)$ is a $\sigma$-additive measure on the lattice $G_1$. According to Gleason's theorem (see AIV, §12) there exists a uniquely determined $w_1 \in \check{K}_1$ for which

$m(e) = \mu_2(w_2, Te) = \mu_1(w_1, e)$.

A mapping $S$ of $K_2$ into $\check{K}_1$ is defined by $w_2 \to w_1$; it is clearly affine, so that $S$ is an operation (and a mixture morphism when $T1 = 1$).

We need only show that $S' = T$, that is, $\mu_2(w, Tg) = \mu_2(w, S'g)$ for all $w \in K_2$ and all $g \in L_1$.
We then have $\mu_1(Sw, e) = \mu_2(w, Te)$ and therefore $\mu_2(w, Te) = \mu_2(w, S'e)$ for all $w \in K_2$ and for all $e \in G_1$. Thus we obtain $\mu_2(w, Tx) = \mu_2(w, S'x)$ for all $x$ in the linear space spanned by $G_1$. Since $T$ is norm-continuous, the same is true for all $x$ in the norm-closed subspace of $\mathcal{B}'_1$ spanned by $G_1$. According to the spectral representation theorem (see AIV, §8) the norm-closed subspace of $\mathcal{B}'_1$ spanned by $G_1$ is $\mathcal{B}'_1$ itself.
The converse is easy to see. If $e_\nu$ is a decreasing sequence of decision effects, then $e_\nu$ converges towards an $e$ in the $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$ topology. If $\bigwedge_\nu e_\nu = 0$, from III, Th. 6.10 or IV, Th. 1.3.7 we find $e = 0$, that is, $e_\nu \to 0$. Therefore from the $\mathcal{B}$-continuity of $T$ it follows that $Te_\nu \to 0$.

Th. 4.2.6. A p-continuous r-morphism $h$ determines a $\mathcal{B}$-continuous effect morphism $T: \mathcal{B}'_1 \to \mathcal{B}'_2$ by means of the map $\psi_1(\mathcal{F}_1) \xrightarrow{T} L_2$, where the latter is given by Th. 3.2. $T'$ is a $\mathcal{D}$-continuous mixture morphism.

Proof. According to D 3.6 the map $\psi_1(\mathcal{F}_1) \xrightarrow{T} L_2$ is $\mathcal{B}$-continuous, since $\varphi(\mathcal{Q}')$ is dense (in norm) in $K$. Thus $T$ can be uniquely extended onto the $\sigma$-compact set $L_1$, in which $\psi_1(\mathcal{F}_1)$ is $\sigma$-dense, because $L_2$ is $\sigma$-compact. It follows that $L_1 \xrightarrow{T} L_2$. Since $\psi_1(\mathcal{F}_1) \xrightarrow{T} L_2$ is, according to the remark following Th. 3.2, a rational affine map, $L_1 \xrightarrow{T} L_2$ is an affine map. Therefore $T$ may be extended as a linear map $\mathcal{B}'_1 \xrightarrow{T} \mathcal{B}'_2$. Thus, according to D 4.2.1, $T$ is a $\mathcal{B}$-continuous effect morphism. According to Th. 3.2, $T1 = 1$. $T'$ is therefore a mixture morphism.

Since $T$ is norm-continuous, and since $T\psi_1(\mathcal{F}_1) \subset \psi_2(\mathcal{F}_2) \subset \mathcal{D}_2 \cap L_2$ and $\psi_1(\mathcal{F}_1)$ is norm-dense in $\mathcal{D}_1 \cap L_1$, we obtain $\mathcal{D}_1 \cap L_1 \xrightarrow{T} \mathcal{D}_2 \cap L_2$. Since $\mathcal{D} \cap L$ spans the space $\mathcal{D}$ we obtain $\mathcal{D}_1 \xrightarrow{T} \mathcal{D}_2$. Therefore $T'$ is $\mathcal{D}$-continuous.

Th. 4.2.7. Let $h$ be a p-isomorphism and let $h'$ be an r-isomorphism; let $h$ and $h'$ be dual to each other (D 3.8). Then the maps $S\varphi_1(a) = \varphi_2(h(a))$ and $T\psi_2(f) = \psi_1(h'(f))$ determine a $\mathcal{D}$-continuous linear map $\mathcal{B}_1 \xrightarrow{S} \mathcal{B}_2$, where $K_1 \xrightarrow{S} K_2$ is bijective, and a $\mathcal{B}$-continuous linear map $\mathcal{B}'_2 \xrightarrow{T} \mathcal{B}'_1$, where $L_2 \xrightarrow{T} L_1$ is bijective; here $T = S'$. In addition, $T\mathcal{D}_2 = \mathcal{D}_1$.

The proof follows directly from Th. 3.4, Th. 4.1.7, Th. 4.2.6, and Th. 4.1.4.

4.3 Coexistent Operations and Coexistent Effect Morphisms

The norm-continuous maps $S$ of $\mathcal{B}_1$ into $\mathcal{B}_2$ form a Banach algebra $\mathcal{A}(\mathcal{B}_1, \mathcal{B}_2)$ and therefore form a Banach space (see AIII, §6) with the norm

$\|S\| = \sup\{\|Sx\| \mid x \in \mathcal{B}_1,\ \|x\| \le 1\} = \sup\{\|Sw\| \mid w \in K_1\}$.
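The second equality in the norm formula says that it suffices to test $S$ on ensembles. The following sketch illustrates this under an assumption that is not part of the text: a finite classical model in which $\mathcal{B} = \mathbb{R}^n$ carries the $\ell^1$ norm and $K$ is the probability simplex. The matrix `S` and the sample vectors are hypothetical.

```python
from fractions import Fraction as F

def l1(x):
    """Base norm of the classical model: the l1 norm on R^n."""
    return sum(abs(v) for v in x)

def apply(S, x):
    """Return S x for a matrix S given as a list of rows."""
    return [sum(s * v for s, v in zip(row, x)) for row in S]

# Hypothetical bounded linear map R^3 -> R^2 (it need not be positive).
S = [
    [F(1, 2), F(-1, 4), F(0)],
    [F(1, 4), F(1, 2), F(1, 3)],
]

# Extreme points of K (the unit vectors). The function w -> ||S w|| is convex,
# so its supremum over the simplex K is attained at one of them.
vertices = [[F(int(i == j)) for i in range(3)] for j in range(3)]
norm_on_K = max(l1(apply(S, w)) for w in vertices)

# A few sample points of the unit ball, including vectors outside K.
samples = [
    [F(1, 2), F(-1, 2), F(0)],
    [F(1, 3), F(1, 3), F(-1, 3)],
    [F(0), F(-1), F(0)],
]
bound_holds = all(l1(apply(S, x)) <= norm_on_K * l1(x) for x in samples)
```

In this model `norm_on_K` is the largest column sum of absolute values, which is the usual operator norm for the $\ell^1$ norm. The sample check is only an illustration and proves nothing beyond the listed vectors.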
In $\mathcal{A}(\mathcal{B}_1, \mathcal{B}_2)$ a positive cone $\mathcal{A}_+(\mathcal{B}_1, \mathcal{B}_2)$ is defined by $S \ge 0$ or, equivalently, $\{Sx \ge 0$ for all $x \ge 0\}$.

$\mathcal{A}(\mathcal{B}_1, \mathcal{B}_2)$ is complete not only with respect to the norm topology but also with respect to the topology of simple convergence, that is, for a sequence $S_n$ for which $S_n w$ is norm-convergent in $\mathcal{B}_2$ for all $w \in K_1$ there exists an $S \in \mathcal{A}(\mathcal{B}_1, \mathcal{B}_2)$ satisfying $\|S_n w - Sw\| \to 0$ for all $w \in K_1$ and therefore $\|S_n x - Sx\| \to 0$ for all $x \in \mathcal{B}_1$.

Th. 4.3.1. A map $S \in \mathcal{A}(\mathcal{B}_1, \mathcal{B}_2)$ is an operation if and only if $S$ is positive and $\|S\| \le 1$.
Proof. From $\mu(Sw, 1) = \|Sw\|$ for $w \in K_1$ and $S$ positive, and from $\|S\| = \sup\{\|Sw\| \mid w \in K_1\}$, it follows that $\|Sw\| \le 1$, that is, $Sw \in \check{K}_2$.

D 4.3.1. We shall denote the set of operations (that is, the intersection of $\mathcal{A}_+(\mathcal{B}_1, \mathcal{B}_2)$ with the unit sphere) by $\Pi$.

D 4.3.2. An additive mapping $\Sigma \xrightarrow{\chi} \Pi$ of a Boolean ring $\Sigma$ for which $\chi(\varepsilon)$ is a mixture morphism (where $\varepsilon$ is the unit element of $\Sigma$) will be called an operation measure.

For an effective ensemble $w_0$ we may define a uniform structure in $\Sigma$ by means of the metric

$d(\sigma_1, \sigma_2) = \mu(\chi(\sigma_1 + \sigma_2)w_0, 1)$, (4.3.1)

which is identical to that defined in IV, Th. 5.2 provided that we set

$W(\sigma) = \chi(\sigma)w_0$, (4.3.2)

or is identical with that defined by IV, Th. 1.4.1 if we rewrite (4.3.1) in the form

$d(\sigma_1, \sigma_2) = \mu(w_0, \chi'(\sigma_1 + \sigma_2)1)$ (4.3.3)

and set $F(\sigma) = \chi'(\sigma)1$.

Th. 4.3.2. The mapping $\Sigma \xrightarrow{\chi} \Pi$ is uniformly continuous with respect to the metric in $\Sigma$ and the uniform structure of simple convergence in $\Pi$.

Proof. According to IV, Th. 5.2, for each fixed $w$ the map $\Sigma \to \check{K}_2$, $\sigma \mapsto \chi(\sigma)w$, is uniformly continuous with respect to the norm topology in $\check{K}_2$. This implies that the map $\Sigma \xrightarrow{\chi} \Pi$ is uniformly continuous with respect to the uniform structure of simple convergence in $\Pi$.

From Th. 4.3.2 and the fact that $\Pi$ is complete in the topology of simple convergence, it follows directly from IV, Th. 1.4.3 that $\Sigma$ can be completed and that $\chi$ can be extended to the completion.

Th. 4.3.3. $\Sigma \xrightarrow{\chi} \Pi$ together with a complete and separable $\Sigma$ determines, for each $w \in K_1$, a preparator of the ensemble $\chi(\varepsilon)w$ by means of the map $\Sigma \to \check{K}_2$, $\sigma \mapsto \chi(\sigma)w$.

Proof. The proof is a simple corollary of IV, D 5.4 and the preceding results.

The map $\Sigma \xrightarrow{\chi} \Pi$ is uniquely defined by the maps $\Sigma \to \check{K}_2$, $\sigma \mapsto \chi(\sigma)w$, for all $w \in K_1$. We therefore define:

D 4.3.3. We shall call an additive measure $\Sigma \xrightarrow{\chi} \Pi$ on a complete separable Boolean ring $\Sigma$ a trans-preparator.
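A small example may help to fix the notions of operation measure, the metric (4.3.1) and the effect $F(\sigma) = \chi'(\sigma)1$. The sketch below is an illustration under assumptions that are not in the text: a classical two-point system, the Boolean ring of subsets of a two-element outcome set (ring addition is the symmetric difference), and two hypothetical matrices `ATOMS[0]`, `ATOMS[1]` whose sum is column-stochastic.

```python
from fractions import Fraction as F
from itertools import chain, combinations

# Hypothetical two-outcome classical instrument on a two-point system.
# Each matrix acts on ensembles w by (A w)_i = sum_j A[i][j] * w[j].
ATOMS = {
    0: [[F(1, 2), F(0)], [F(1, 4), F(1, 3)]],
    1: [[F(1, 4), F(1, 3)], [F(0), F(1, 3)]],
}
OUTCOMES = frozenset(ATOMS)  # unit element of the Boolean ring of subsets

def chi(sigma):
    """Operation measure: sum of the atomic operations over sigma (additive by construction)."""
    total = [[F(0), F(0)], [F(0), F(0)]]
    for k in sigma:
        total = [[a + b for a, b in zip(r, s)] for r, s in zip(total, ATOMS[k])]
    return total

def apply(S, w):
    """Return S w for a matrix S given as a list of rows."""
    return [sum(s * x for s, x in zip(row, w)) for row in S]

def column_sums(S):
    """Components of the effect S'1, since mu(S w, 1) = sum_j (column sum j) * w[j]."""
    return [sum(col) for col in zip(*S)]

def is_operation(S):
    """Positive (entries >= 0) with norm <= 1 (column sums <= 1)."""
    return all(x >= 0 for row in S for x in row) and all(c <= 1 for c in column_sums(S))

def d(s1, s2, w0):
    """Metric (4.3.1); the ring sum of two subsets is their symmetric difference."""
    return sum(apply(chi(s1 ^ s2), w0))

subsets = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(OUTCOMES), r) for r in range(len(OUTCOMES) + 1))]
w0 = [F(1, 2), F(1, 2)]  # effective ensemble: every component is positive

all_operations = all(is_operation(chi(s)) for s in subsets)
unit_is_mixture_morphism = column_sums(chi(OUTCOMES)) == [F(1), F(1)]
observable = {s: column_sums(chi(s)) for s in subsets}  # F(sigma) = chi'(sigma) 1
```

Here `observable[OUTCOMES]` is the unit effect, which reflects the condition that $\chi(\varepsilon)$ is a mixture morphism, and `d` can equally be computed from `observable` and `w0`, which is the content of (4.3.3).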
Th. 4.3.4. Let $\Sigma$ be complete and separable, and let an additive measure $\chi$ be defined by $\Sigma \xrightarrow{\chi} \Pi$. Then, by assigning to each $\chi(\sigma)$ (as a map $\mathcal{B}_1 \to \mathcal{B}_2$) the adjoint map $\chi'(\sigma)$ (as a map $\mathcal{B}'_2 \to \mathcal{B}'_1$), we may define an additive map $\Sigma \xrightarrow{\chi'} P$ (where $P$ is the set of $\mathcal{B}$-continuous effect morphisms). For each $g \in L_2$ an additive measure is defined by $\Sigma \to L_1$, $\sigma \mapsto \chi'(\sigma)g$. For $g = 1$ the map $\Sigma \to L_1$, $\sigma \mapsto \chi'(\sigma)1$, is an observable.

Proof. The proof follows simply, noting that $\chi'(\varepsilon)1 = 1$ because $\chi(\varepsilon)w \in K_2$ for $w \in K_1$.

D 4.3.4. The map $\Sigma \xrightarrow{\chi'} P$ which is conjugate to a trans-preparator $\Sigma \xrightarrow{\chi} \Pi$ is called the adjoint effect transformer to $\Sigma \xrightarrow{\chi} \Pi$. The observable $\Sigma \to L_1$, $\sigma \mapsto \chi'(\sigma)1$, is said to be the observable associated with the trans-preparator.

D 4.3.5. A set $\mathcal{A}$ of operations is said to be coexistent if there exists a Boolean ring $\Sigma$ and an additive measure $\Sigma \xrightarrow{\chi} \Pi$ for which $\mathcal{A} \subset \chi\Sigma$.

We shall not carry out an analysis of trans-preparators similar to the analysis of observables and preparators presented in IV. In XVII, §4 we shall present a number of applications in order to become familiar with the concepts presented in §4.3.

5 Isomorphisms and Automorphisms of Ensembles and Effects

In quantum mechanics automorphisms of effects and ensembles play an important role. For this reason we must give a careful and precise account of the structure of these maps. In addition, we shall investigate whether bijective maps, for example, isomorphisms of decision effects, can be extended to the Banach spaces, that is, to effect morphisms.

D 5.1. If $T$ is a bijective map of $L_1$ onto $L_2$ which satisfies the condition

$g_1, g_2 \in L_1$: $g_1 \le g_2 \iff Tg_1 \le Tg_2$,

then we say that $T$ is an order isomorphism of $L_1$ onto $L_2$.

In D 5.1 we therefore consider the structure type: $L$ as an ordered set. Since 1 is the largest and 0 is the smallest element of $L_1$ (and correspondingly for $L_2$) it follows that $T1 = 1$ and $T0 = 0$.
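The role of bijectivity in D 5.1 can be seen in a toy example. The sketch below is not from the text: it assumes a classical two-point system, where $L = [0, 1]^2$ with the componentwise order, and it tests two hypothetical maps only on a finite grid of effects, so it is a spot check rather than a proof.

```python
from fractions import Fraction as F
from itertools import product

# Finite sample of the effect set L = [0, 1]^2 of a classical two-point system.
grid = [tuple(p) for p in product([F(0), F(1, 2), F(1)], repeat=2)]
ZERO, ONE = (F(0), F(0)), (F(1), F(1))

def leq(g, h):
    """Componentwise order on effects."""
    return all(a <= b for a, b in zip(g, h))

def swap(g):
    """Exchange the two components: a candidate order isomorphism."""
    return (g[1], g[0])

def halve(g):
    """Scale by 1/2: preserves the order in both directions but is not onto."""
    return tuple(x / 2 for x in g)

def preserves_order_both_ways(T):
    """Check g <= h  <=>  T g <= T h on all pairs of grid points."""
    return all(leq(g, h) == leq(T(g), T(h)) for g in grid for h in grid)

def is_bijection_of_grid(T):
    """Check that T permutes the grid points."""
    return sorted(T(g) for g in grid) == sorted(grid)
```

On the grid, `swap` passes both checks and fixes the largest and smallest elements, in line with the remark following D 5.1. `halve` also respects the order in both directions, but it is not surjective and sends 1 to $\tfrac{1}{2}\cdot 1$, which shows that the conclusion $T1 = 1$ rests on $T$ being onto.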
Since $1 - g$ corresponds to the case in which $g$ does not respond, the following more comprehensive structure type is suggested: $L$ is an ordered set with a dual automorphism $g \to g^* = 1 - g$. Thus we obtain the following restricted set of isomorphisms satisfying D 5.1:

D 5.2. An order isomorphism $T$ of $L_1$ onto $L_2$ is called a *-isomorphism if $T(1 - g) = 1 - Tg$ for all $g \in L_1$.

The same considerations can be repeated if we consider the subset $G$ of $L$ as an ordered set (as a lattice) and, in the second case, as an ordered set with the dual automorphism $e \to e^{\perp}$. The isomorphisms of $G$ as an ordered set are known as lattice isomorphisms.

D 5.3. A lattice isomorphism $T$ of $G_1$ onto $G_2$ is said to be a $\perp$-isomorphism of $G_1$ onto $G_2$ provided that $T(e^{\perp}) = (Te)^{\perp}$.

We may also consider the following structure type: $L$ together with the map $(g_1, g_2) \to g_1 + g_2$, where $g_1 + g_2 \le 1$. According to D 4.2.1 an isomorphism of $L$ with respect to this structure type is called an effect isomorphism. If an effect isomorphism $L_1 \xrightarrow{T} L_2$ is $\mathcal{B}$-continuous, then so is the inverse $L_2 \to L_1$, since $L_1$, $L_2$ are compact (see §4.2).

If $L_1 = L_2$ or if $G_1 = G_2$ we then use the expression automorphism instead of isomorphism.

Th. 5.1. Every effect isomorphism $T$ of $L_1$ onto $L_2$ is a *-isomorphism, and $T$ may be uniquely extended as a linear mapping of $\mathcal{B}'_1$ onto $\mathcal{B}'_2$; then $T$ will be a norm-preserving isomorphism of the Banach spaces $\mathcal{B}'_1$ and $\mathcal{B}'_2$. If $T$ is $\mathcal{B}$-continuous then $T'$ is a mixture isomorphism of $K_2$ onto $K_1$ and an isomorphism of the Banach spaces $\mathcal{B}_2$, $\mathcal{B}_1$, and $T^{-1}$ is also $\mathcal{B}$-continuous. If $T$ is $\mathcal{B}$-continuous then the restriction of $T$ to $G_1$ is a $\perp$-isomorphism of $G_1$ onto $G_2$.

Proof. From $g + (1 - g) = 1$ it follows that $Tg + T(1 - g) = T1$. From Th. 4.2.1 it follows that $T$ is positive and preserves the order. Therefore $T1 = 1$ and $T(1 - g) = 1 - Tg$, that is, $T$ is a *-isomorphism.

From Th. 4.2.1 it follows that the map $T$ may be extended to $\mathcal{B}'_1$ as a bijective mapping from $\mathcal{B}'_1$ to $\mathcal{B}'_2$; in addition, since $\|Ty\| \le \|y\|$ and $\|T^{-1}y\| \le \|y\|$, we find that $T$ preserves the norm, and it follows that $T$ is an isomorphism of the Banach spaces.

If $T$ is $\mathcal{B}$-continuous, then, according to §4.2, $T'$ is a mixture morphism of $K_2$ into $K_1$. According to Th. 4.1.4, $T'$ is a mixture isomorphism and $\mathcal{B}_2 \xrightarrow{T'} \mathcal{B}_1$ is an isomorphism of the Banach spaces. Since $(T^{-1})' = (T')^{-1}$ it follows that $T^{-1}$ is $\mathcal{B}$-continuous.

Let $e \in G_1$. Then, from $\mu(w, Te) = \mu(T'w, e)$ it follows that $w \in K_0(Te) \iff T'w \in K_0(e)$. Therefore there exists an $e' \in G_2$ such that $K_0(Te) = K_0(e')$ and $Te \le e'$. From this result it follows that $e \le T^{-1}e'$. From $w \in K_0(e') = K_0(Te)$ it follows that $\mu(w, e') = 0$ and that $\mu(w, TT^{-1}e') = \mu(T'w, T^{-1}e') = 0$, that is, $T^{-1}e' \in L_0K_0(e)$ and therefore $T^{-1}e' \le e$. Therefore $T^{-1}e' = e$, that is, $Te = e' \in G_2$.

The last property may be proved more easily if we use the fact that $G$ is the set of extreme points of $L$ (see III, Th. 6.6). From $T(\lambda g_1 + (1 - \lambda)g_2) = \lambda Tg_1 + (1 - \lambda)Tg_2$ it follows that, under a bijective linear mapping of $L_1$ onto $L_2$, the extreme points of $L_1$ are bijectively mapped onto the extreme points of $L_2$.

Since $T(1 - e) = 1 - Te$ it follows that $(Te)^{\perp} = T(e^{\perp})$.

We may obtain a stronger result than Th. 5.1 if we use the fact that for $g \in L$ the spectral representation

$g = \int \lambda \, de(\lambda)$ (5.1)

holds (see AIV, §8).
Th. 5.2. The restriction of a *-isomorphism $L_1 \xrightarrow{T} L_2$ to $G_1$ is a $\perp$-isomorphism $G_1 \to G_2$.

Proof. Let $e \in G_1$; suppose that $Te = g \in L_2$. Then we obtain $T(1 - e) = 1 - g$. Since $g' \le e$ and $g' \le 1 - e = e^{\perp}$ implies that $g' = 0$, it follows that, except for the null element, there exists no element in $L_2$ which is smaller than both $g$ and $1 - g$. From (5.1) it follows that this is the case if and only if the spectrum of $e(\lambda)$ contains only 0 and 1, that is, $g \in G_2$. Therefore $G_1$ is mapped isomorphically onto $G_2$ by $T$. Since $Te^{\perp} = T(1 - e) = 1 - Te = (Te)^{\perp}$, $T$ is a $\perp$-isomorphism of $G_1$ onto $G_2$.

We now have the following situation: An effect isomorphism of $L_1$ onto $L_2$ is also a *-isomorphism of $L_1$ onto $L_2$. Every *-isomorphism restricted to $G_1$ becomes a $\perp$-isomorphism of the lattices $G_1$, $G_2$.

We now turn to the converse problem: When is it possible to extend a $\perp$-isomorphism of $G_1$ onto $G_2$ to a *-isomorphism of $L_1 \to L_2$? When is a *-isomorphism of $L_1 \to L_2$ an effect isomorphism? And so on.

Th. 5.3. Let $T$ be a mapping of $H_1$ into $G_2$, where $H_1$ is a $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-dense subset of $G_1$. Suppose that to each $w \in K_2$ there exists an $x'_w \in \mathcal{B}_1$ such that $\mu_2(w, Te) = \mu_1(x'_w, e)$ for all $e \in H_1$. Then $T$ has a unique extension as a linear $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-$\sigma(\mathcal{B}'_2, \mathcal{B}_2)$-continuous map of $\mathcal{B}'_1$ into $\mathcal{B}'_2$. In addition $TL_1 \subset L_2$, that is, $T$ is a $\mathcal{B}$-continuous effect morphism.

Proof. By hypothesis $\mu_2(w, Te)$ is, as a function defined on $H_1$, the restriction of a linear $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-continuous functional over $\mathcal{B}'_1$. $x'$ is uniquely determined because $H_1$ is $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-dense in $G_1$ and $\mathcal{B}'_1$ is the $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-closure of the linear span of $G_1$. From $0 \le \mu_2(w, Te) \le 1$ it follows that $0 \le \mu_1(x', e) \le 1$ on $H_1$ and thus $0 \le \mu_1(x', g) \le 1$ on the $\sigma(\mathcal{B}'_1, \mathcal{B}_1)$-closed convex set $L_1$ which is generated by $G_1$ and therefore also by $H_1$. Thus it follows that $x' = \lambda w'$, where $0 \le \lambda \le 1$ and $w' \in K_1$, that is, $x' \in \check{K}_1$.
It is easy to see that an operation S is defined as a mapping of K2 into K1 by w —► x'9 and that this map may be extended to a map of 02 into The dual map S' corresponding to S is identical on H2 to the previously defined map T. Hence we find that S' is the extension of T on 0\9 and that this extension is unique because T is 01}-(t(0'2, ^-continuous and that the space spanned by H1 is a(0\, 01J- dense in ! Therefore S is equal to T, the map which is dual to T. Since T' is an operation, Tis itself an effect morphism. Th. 5.4. If, for the map T = S' which was defined in Th. 5.3, the set G2 lies in the о(Ж2, 0l2)-closure of TG1? then T is surjective as a mapping of L1 onto L2 and of onto Я'2 and, in addition, T is a mixture morphism K2 —► K1 and T is an injective map of 012 into 0x. PROOF. Since c= L1 and TG1 c TL1 and since TL1 is <r(^'2, ^2)-compact (because Tis ^-continuous and L1 is o(0t\, ^J-compact!) the c(0\, ^J-closure of TGjl lies in TL1. Then, according to our hypothesis, G2 c= TLV Since TL1 is convex and compact, and since L2 = со G2 we therefore obtain TL1 = L2 Since
5 Isomorphisms and Automorphisms of Ensembles and Effects 219 L2 spans all of 0b'2 we obtain T0b\ = 0b'2. Since T preserves the order and TLy = L2 we obtain 71 = 1. Thus it follows that p(T'w, 1) = p(w, Tl) = p(w, 1) = 1, that is, TK2 cz K±. T is injective since T0b\ = 0b’2. If, in addition, T is injective onto L1? then T is a J^-continuous effect isomorphism and T is a mixture isomorphism. Is it possible to determine whether the extension of T onto L1 is injective merely by looking at the map T only on G^. Th. 5.5. If we assume the hypothesis of Th. 5.3 and the assumptions TG1 a G2 then T is injective onto 0b[ if and only if for ее G and e Ф 0 it follows that Те Ф 0. Proof. Since у = ад — pl, for each у e 0b\ we obtain Since T preserves the order, Te(X) is, for increasing A, an increasing sequence of elements of G. Since Те Ф 0 for e ф 0 it follows that — Te(A2) Ф 0 for ^AJ — e(A2) Ф 0 and we therefore obtain Ту Ф 0 for у Ф 0. Th. 5.6. Let ev be a pairwise orthogonal set of elements of G^ Then, for a _L- isomorphism T (and for any lattice homomorphism satisfying T(eL) = (Те)1) of Gx into G2 the Tev are pairwise orthogonal and T(£v ev) = Tev. PROOF. From e11 e2—that is, e1 < e2—it follows that Te1 < T(ej) = (Te^1, that is, Те11 Te2. Since the Tey are pairwise orthogonal and since T is a lattice homomorphism it follows that T(£v ex) = T(\/v ev) = \/v (Tev) = Tey. Th. 5.7. Let The a lattice homomorphism of G1 into G2for which Те1 = (Те)1 (for example, a _L-isomorphism of onto G2). Then, by applying Gleason's theorem (see AIV, §12) to G1 it follows that T may be uniquely extended as a linear o(0S\, 0Sfj-a(0S2, 0b^-continuous map T: 0b\ —► 0b'2. The extension of Tisa 0b-continuous effect morphism. If TG1 is o(0b2, 0b2)-dense in G2 then T is a surjective map of L1 onto L2 and of 0b\ onto 0b2 and T is injective as a map of 0b2 into 0b±. If Те Ф 0 for e Ф 0,e e G1 then T is a 0b-continuous effect isomorphism. PROOF. For all weK2 jU2(w, Те) is a positive function over G1? 
which, according to Th. 5.6 satisfies /i2(w, T(£v ev)) = ju2(w> Tev) and /i2(w, Tl) = 1. Then, according to Gleason’s theorem there exists an x e 0b 1 for which p(w, Те) = p(x, e). The remainder of the theorem follows directly from Th. 5.3, Th. 5.4, and Th. 5.5. У = from which it follows that Ту = Th. 5.8. The restriction of an order isomorphism L2 onto G1 results in a lattice isomorphism of Gt onto G2.
220 V Transformations of Registration and Preparation Procedures Proof. Since both T and T 1 are order isomorphisms, we need only show that TGX e G2. Let A(G) denote the set of atoms of the lattice G. Let p e Gx be an atom of Gv We will show that Tp e A(G2). We note that all g e Lx which satisfy g < p are of the form Яр, where 0 < Я < 1. The set {g \ g < p} is therefore totally ordered, and so is the set {g' \ g' < Tp} since Tis an order isomorphism of Lx onto L2. This is the case only if Tp = ctp', where p' e A(G). Suppose that а ф 1, then T_1p' ^ P an(* T"V ф p and T-y = pp", where p" eA(Gx), which is impossible. Therefore a = 1, that is, Tp g A(G2). Therefore T generates a bijective mapping of A(G0) onto A(G2). If e g Gjl then for every p e A(GX) for which p < e we obtain Tp < Те. If g > eA for Я g Л we obtain Kx(g) => Кх(ел) and Kx(g) => |JAeA Кх(ел) = Кх(\/Л ел) from which it follows that g > Vae л • Therefore we obtain: Те > \/ Tp = e' g G2, p<,e p e A(Gi) a similar result holds for T-1: V T’“1e = e"eG1. qee' qeA(G2) Since Tp g A(G2) and Tp < ef holds for all p < e, it follows that all p g A(G) for which p < e are elements of the set of all T " У (where q < e',qe A(G2)\ that is, e" > V P = e- p<,e peA(G i) Since 7b > we obtain e > = e"; therefore we obtain e = e" and Те = Те" < e\ and we finally obtain Те = e'. Th. 5.9. If, in addition to the hypothesis of Th. 5.8 we also assume that the Tg, T( 1 — g) are coexistent for all g e L1? then the restriction of T on Gi —► G2 is a _L-isomorphism. PROOF. From {Tg, T( 1 - g)} is coexistent, for g = e e Gx it follows that {Те, T(eL)} are coexistent and are therefore commensurable. Since eA^^Owe also obtain (Те) л (Те1) = 0 and also T(eL) _L Те. Since e v e1 = 1 we obtain Те v Те1 = I and finally T(eL) = (Те)1. Th. 5.10. Let Gx, G2 be two atomic lattices, and let A(GX), A(G2) denote their sets of atoms, respectively. 
Let $T$ be a bijective mapping of $A(G_1)$ onto $A(G_2)$ for which both $T$ and $T^{-1}$ map orthogonal atoms into orthogonal atoms, that is, $T$ is an isomorphic map of $A(G_1)$ onto $A(G_2)$ with respect to the species of structure determined by $\perp$. Then $T$ may be uniquely extended to a $\perp$-isomorphism $G_1 \to G_2$.

Proof. For each $e \in G_1$ the set of all atoms for which $p \le e$ is uniquely determined and satisfies

$$e = \bigvee_{\substack{p \le e \\ p \in A(G_1)}} p.$$
For each order isomorphic mapping $T$ of $G_1$ onto $G_2$ the following equation holds:

$$Te = \bigvee_{\substack{p \le e \\ p \in A(G_1)}} Tp.$$

For this reason we define the extension of $T$ (first defined only on $A(G_1)$) as follows:

$$Te = \bigvee_{\substack{p \le e \\ p \in A(G_1)}} Tp.$$

Then it follows that $e_1 \le e_2$ implies that $Te_1 \le Te_2$. For the set of atoms $q \le e^\perp$ it follows that $Te^\perp = \bigvee_{q \le e^\perp} Tq$. Since $p \perp q$ it follows that $Tp \perp Tq$; we therefore obtain $Te^\perp \perp Te$. Since there exists a complete system of pairwise orthogonal $p_\nu$, $q_\mu$ satisfying $p_\nu \le e$ and $q_\mu \le e^\perp$ and

$$\Big(\bigvee_\nu p_\nu\Big) \vee \Big(\bigvee_\mu q_\mu\Big) = 1,$$

we find that

$$\Big(\bigvee_\nu Tp_\nu\Big) \vee \Big(\bigvee_\mu Tq_\mu\Big) = 1;$$

otherwise there would be an atom $r$ which would be orthogonal to all $Tp_\nu$, $Tq_\mu$, so that $T^{-1}r$ would be orthogonal to all $p_\nu$, $q_\mu$, in contradiction to the condition $(\bigvee_\nu p_\nu) \vee (\bigvee_\mu q_\mu) = 1$. From $(\bigvee_\nu Tp_\nu) \vee (\bigvee_\mu Tq_\mu) = 1$ and $\bigvee_\nu Tp_\nu \le Te$ and $\bigvee_\mu Tq_\mu \le Te^\perp$ it then follows that $Te^\perp = (Te)^\perp$, $\bigvee_\nu Tp_\nu = Te$ and $\bigvee_\mu Tq_\mu = Te^\perp$.

Suppose there exists an atom $r$ for which $r \le Te$. Then we also obtain $r \perp Te^\perp$, that is, $r$ is orthogonal to all $Tq$ for which $q \le e^\perp$. In this way $T^{-1}r$ is orthogonal to all $q \le e^\perp$ and we obtain $T^{-1}r \le e$. That is, each atom $r \le Te$ may be obtained as the image $Tp$ of an atom $p \le e$. Thus $Te_1 = Te_2$ implies that $e_1 = e_2$, and

$$e = \bigvee_{\substack{r \le Te \\ r \in A(G_2)}} T^{-1}r$$

is proven. Since the procedure presented above can also be applied to the extension of the map $T^{-1}$ of $A(G_2)$ to $A(G_1)$ and $e = \bigvee_{r \le Te,\; r \in A(G_2)} T^{-1}r$ holds, it follows that $T$ is a bijective map from $G_1$ onto $G_2$.

Th. 5.11. Let $T$ be an order automorphism of $L$ onto itself (for example, an effect automorphism). Therefore we obtain $TG = G$. Let $q_\nu$ denote the atoms of the center $Z$ of $G$ (see IV, end of §1.3). For this $T$ there exists a bijective mapping $p$ of the integers (a permutation) so that $T = \sum_\nu T_\nu$, such that $T_\mu q_\nu = \delta_{\mu\nu}\, q_{p(\nu)}$ is satisfied. The $T_\nu$ are order isomorphisms of $L_\nu$ onto $L_{p(\nu)}$, where the $L_\nu$ are equal to $[0, 1]$ in $\mathcal{B}'(\mathcal{H}_\nu)$; $\mathcal{B}'(\mathcal{H}_\nu)$ is therefore isomorphic to the subspace of all operators of the form $(0, 0, \ldots, A_\nu, \ldots)$
in $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$. Each order automorphism $T$ may, in this case, therefore be represented as a "direct sum" of order isomorphisms $L_\nu \to L_{p(\nu)}$ for the irreducible $L_\nu$, $L_{p(\nu)}$. If $T$ is a *-automorphism, then the $T_\nu$ are *-isomorphisms. If $T$ is an effect automorphism, then the $T_\nu$ are effect isomorphisms. If $T$ is $\mathcal{B}$-continuous, then the $T_\nu$ are also $\mathcal{B}$-continuous.
Proof. Let $G_\nu$ be the sublattice of all $e \in G$ for which $e \le q_\nu$. Each $e \in G$ may be uniquely written in the form $e = \sum_\nu e_\nu$ where $e_\nu \in G_\nu$. The following sums are to be understood in this way. From $e \le q_\nu$ it follows that $Te \le Tq_\nu = \sum_\mu (Tq_\nu)_\mu$. For $Te = \sum_\mu (Te)_\mu$ it follows that $(Te)_\mu \le (Tq_\nu)_\mu$ for all $e \in G_\nu$. Therefore $T$ generates an order isomorphic map of $G_\nu$ onto the lattice of all $e' = \sum_\mu e_\mu$ where $e_\mu \le (Tq_\nu)_\mu$. If $(Tq_\nu)_\mu \ne 0$ for more than one $\mu$, then this lattice would be reducible, in contradiction to the assumption that $G_\nu$ is irreducible. Therefore there exists, for each $\nu$, precisely one $\mu = p(\nu)$ for which $(Tq_\nu)_\mu \ne 0$. Since the same is also true for $T^{-1}$, we must have $Tq_\nu = q_{p(\nu)}$ in addition to $Tq_\nu \le q_{p(\nu)}$, and $p(\nu)$ must be a bijective map of the set of the $\nu$. Therefore we obtain $Tq_\nu = q_{p(\nu)}$.

Since $g \in L$ and $g \le q_\nu$ it follows that $Tg \le Tq_\nu = q_{p(\nu)}$; therefore $T$ generates an order isomorphic map $T_\nu$ of $L_\nu$ onto $L_{p(\nu)}$. Therefore it remains only to show that (for $g = \sum_\nu g_\nu$ and $g_\nu \le q_\nu$) $Tg = \sum_\nu Tg_\nu$, in order to set $T = \sum_\nu T_\nu$, where we obtain $T_\nu g_\mu = 0$ for $\nu \ne \mu$. Since $Tg = \sum_\mu (Tg)_\mu$ we must show that $(Tg)_{p(\nu)} = Tg_\nu$. Since $(Tg)_\mu \le q_\mu$ we find that $T^{-1}(Tg)_{p(\nu)} = g'_\nu \le q_\nu$ where $Tg'_\nu = (Tg)_{p(\nu)}$. It remains to show that $g'_\nu = g_\nu$. From $(Tg)_{p(\nu)} \le Tg$ it follows that $g'_\nu = T^{-1}(Tg)_{p(\nu)} \le g$. From $g_\nu \le g$ it follows that $Tg_\nu \le Tg$, and since $Tg_\nu \le q_{p(\nu)}$ we obtain $Tg_\nu \le (Tg)_{p(\nu)}$, that is, $g_\nu \le T^{-1}(Tg)_{p(\nu)} = g'_\nu$; therefore $g'_\nu = g_\nu$.

If $T$ is a *-automorphism, then $T(1 - g) = 1 - Tg$. If $g \le q_\nu$ and therefore $g' = q_\nu - g \le q_\nu$, then $1 - g = q_\nu - g + \sum_{\mu \ne \nu} q_\mu$. Thus it follows that

$$T(1 - g) = T(q_\nu - g) + \sum_{\mu \ne \nu} q_{p(\mu)} = 1 - Tg = \sum_\mu q_\mu - Tg = \sum_\mu q_{p(\mu)} - Tg = \sum_{\mu \ne \nu} q_{p(\mu)} + q_{p(\nu)} - Tg.$$

Since $g \le q_\nu$ we obtain $Tg \le q_{p(\nu)}$ and therefore obtain $T(q_\nu - g) = q_{p(\nu)} - Tg$, that is, $T_\nu$ is a *-isomorphism, since $q_\nu$ and $q_{p(\nu)}$ are the unit elements of $L_\nu$ and $L_{p(\nu)}$, respectively. From $g = g_1 + g_2$ it follows that $g_\nu = g_{1\nu} + g_{2\nu}$.
Thus it follows that, for an effect automorphism, $Tg = Tg_1 + Tg_2$ and therefore

$$\sum_\nu T_\nu g = \sum_\nu T_\nu g_1 + \sum_\nu T_\nu g_2,$$

that is,

$$\sum_\nu T_\nu g_\nu = \sum_\nu T_\nu g_{1\nu} + \sum_\nu T_\nu g_{2\nu},$$

and we obtain (since the partition is unique) $T_\nu g_\nu = T_\nu g_{1\nu} + T_\nu g_{2\nu}$, that is, $T_\nu$ is an effect isomorphism. The $\mathcal{B}$-continuity of $T_\nu$ follows from that of $T$, in which $T$ is applied to such $g$ having only a single component $g = g_\nu$ which is different from 0.

On the basis of Th. 5.11 we are therefore interested in *-isomorphisms or effect isomorphisms between two irreducible systems $L_1$ and $L_2$, that is, $L_1 \subset \mathcal{B}'(\mathcal{H}_1)$, $L_2 \subset \mathcal{B}'(\mathcal{H}_2)$.

Since $G$ is a Gleason lattice and the spectral representation theorem holds for $L$, each $\perp$-automorphism of $G$ may be continued to an automorphism $T$ of the Banach space $\mathcal{B}'$ which is $\mathcal{B}$-continuous, and the adjoint mapping $T'$ is an automorphism of $\mathcal{B}$. We obtain $T = \sum_\nu T_\nu$ and $T' = \sum_\nu T'_\nu$, where the $T_\nu$ are isomorphisms of $\mathcal{B}'_\nu$ onto $\mathcal{B}'_{p(\nu)}$ and the $T'_\nu$ are the adjoint isomorphisms of $\mathcal{B}_{p(\nu)}$ onto $\mathcal{B}_\nu$ ($\mathcal{B}_\nu = \mathcal{B}(\mathcal{H}_\nu)$).
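From the preceding theorems we obtain the following special case:

Th. 5.12. Each $\perp$-isomorphism $T$ of $G_1 \subset \mathcal{B}'(\mathcal{H}_1)$ onto $G_2 \subset \mathcal{B}'(\mathcal{H}_2)$ may be uniquely extended to a $\mathcal{B}$-continuous effect isomorphism $T$, where the latter is also an isomorphism of the Banach space $\mathcal{B}'_1 = \mathcal{B}'(\mathcal{H}_1)$ onto $\mathcal{B}'_2 = \mathcal{B}'(\mathcal{H}_2)$. The adjoint map $T'$ is an isomorphism of the Banach space $\mathcal{B}_2$ onto $\mathcal{B}_1$, where $T'$ is a mixture isomorphism $K_2 \to K_1$.

For $\varphi \in \mathcal{H}$ and $\|\varphi\| = 1$ let $P_\varphi$ be the projection operator $P_\varphi f = \varphi \langle \varphi, f \rangle$. Then $P_\varphi$ may be considered to be an element of $K \subset \mathcal{B}(\mathcal{H})$ as well as of $G \subset \mathcal{B}'(\mathcal{H})$. The following theorem is to be understood in this sense, where $T: \mathcal{B}'_1 \to \mathcal{B}'_2$, $T': \mathcal{B}_2 \to \mathcal{B}_1$, and $T'P_\varphi$ means $P_\varphi \in K_2 \subset \mathcal{B}_2 = \mathcal{B}(\mathcal{H}_2)$ and $TP_\psi$ means $P_\psi \in G_1 \subset \mathcal{B}'_1 = \mathcal{B}'(\mathcal{H}_1)$.

Th. 5.13. Let $T'$ be a mixture isomorphism. Then $T'P_\varphi = T^{-1}P_\varphi$. Since $TP_\psi$ is an atom, $TP_\psi = P_{\psi'}$, where $\psi'$ is determined by $T$ and $\psi$ up to a factor $e^{i\alpha}$ ($\alpha$ is real).

Proof. $\operatorname{tr}((T'P_\varphi)e) = \operatorname{tr}(P_\varphi\, Te)$. For $e = T^{-1}P_\varphi = P_\psi$ it therefore follows that $\operatorname{tr}((T'P_\varphi)P_\psi) = 1$. For $e = 1 - P_\psi$ we obtain $Te = 1 - TP_\psi = 1 - P_\varphi$ and therefore $\operatorname{tr}((T'P_\varphi)(1 - P_\psi)) = 0$. Thus we obtain $T'P_\varphi = P_\psi = T^{-1}P_\varphi$.

If $e_1 \perp e_2$ then $Te_1 \perp Te_2$ and $T^{-1}e_1 \perp T^{-1}e_2$. Therefore we obtain:

Th. 5.14. For $T'P_{\varphi_\nu} = P_{\psi_\nu}$ and pairwise orthogonal $\varphi_\nu$, the $\psi_\nu$ are pairwise orthogonal. With $TP_{\psi_\nu} = P_{\varphi_\nu}$ and pairwise orthogonal $\psi_\nu$, the $\varphi_\nu$ are also pairwise orthogonal.

Th. 5.15. From $w = \sum_\nu \lambda_\nu P_{\varphi_\nu}$ (each $w \in K$ may be written in this form with pairwise orthogonal $\varphi_\nu$) and $T'P_{\varphi_\nu} = T^{-1}P_{\varphi_\nu} = P_{\psi_\nu}$ it follows that

$$T'w = \sum_\nu \lambda_\nu P_{\psi_\nu}.$$

Proof. The proof follows directly from the fact that $T'$ is linear and norm-continuous as a mixture morphism.

Th. 5.16. Let $T'P_{\varphi_1} = P_{\psi_1}$, $T'P_{\varphi_2} = P_{\psi_2}$ (that is, $P_{\varphi_1} = TP_{\psi_1}$, $P_{\varphi_2} = TP_{\psi_2}$). Then $|\langle \varphi_1, \varphi_2 \rangle|^2 = |\langle \psi_1, \psi_2 \rangle|^2$.

Proof. $\operatorname{tr}(P_{\psi_1}P_{\psi_2}) = \operatorname{tr}((T'P_{\varphi_1})P_{\psi_2}) = \operatorname{tr}(P_{\varphi_1}(TP_{\psi_2})) = \operatorname{tr}(P_{\varphi_1}P_{\varphi_2})$.

Th. 5.17. Let $T$ be an isomorphism of $A(G_1)$ onto $A(G_2)$ (see Th. 5.10). Then $T$ may be uniquely extended as a $\perp$-isomorphism of $G_1 \to G_2$.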
For $P_{\varphi_1} = TP_{\psi_1}$ and $P_{\varphi_2} = TP_{\psi_2}$ we obtain $|\langle \varphi_1, \varphi_2 \rangle|^2 = |\langle \psi_1, \psi_2 \rangle|^2$.

The proof is a direct consequence of Th. 5.10, Th. 5.12, and Th. 5.16.
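The invariance of the transition probability stated in Th. 5.16 and Th. 5.17 can be checked numerically on a small example. The following is only a sketch under stated assumptions: the 2x2 matrix `U`, the vectors `psi1`, `psi2`, and all helper names are hypothetical choices made for illustration, and the maps used are a unitary map and complex conjugation followed by that unitary map (an anti-unitary map).

```python
def inner(u, v):
    """Inner product <u, v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(matrix, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

def conjugate(v):
    """Componentwise complex conjugation."""
    return [x.conjugate() for x in v]

def transition_probability(u, v):
    """|<u, v>|^2 for unit vectors u and v."""
    return abs(inner(u, v)) ** 2

# Hypothetical 2x2 unitary matrix, chosen only for illustration.
S = 2 ** -0.5
U = [[S, S], [-1j * S, 1j * S]]

def unitary_map(v):
    return apply(U, v)

def antiunitary_map(v):
    # Conjugation followed by U is anti-linear and norm-preserving.
    return apply(U, conjugate(v))

# Two hypothetical unit vectors.
psi1 = [0.6 + 0j, 0.8j]
psi2 = [S + 0j, S + 0j]

before = transition_probability(psi1, psi2)
after_unitary = transition_probability(unitary_map(psi1), unitary_map(psi2))
after_antiunitary = transition_probability(antiunitary_map(psi1), antiunitary_map(psi2))
```

In this sketch the three values agree up to rounding, which is the content of $|\langle \varphi_1, \varphi_2 \rangle|^2 = |\langle \psi_1, \psi_2 \rangle|^2$ for the two kinds of maps that appear in Th. 5.18.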
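Th. 5.18. To each isomorphism $T$ of $A(G_1)$ onto $A(G_2)$ there exists an isomorphism or anti-isomorphism $U$ of $\mathcal{H}_1$ onto $\mathcal{H}_2$ which satisfies

$$TP_\psi = UP_\psi U^{-1},$$

that is, $TP_\psi = P_\varphi$ and $\varphi = U\psi$, where $\psi \in \mathcal{H}_1$, $\varphi \in \mathcal{H}_2$.

Proof. Let $\psi_\nu$ be a complete orthonormal basis for $\mathcal{H}_1$. A complete orthonormal basis $\varphi_\nu$ for $\mathcal{H}_2$ is defined by $TP_{\psi_\nu} = P_{\varphi_\nu}$, where the $\varphi_\nu$ are uniquely determined except for factors of the form $e^{i\alpha_\nu}$. For $\psi_0 = \sum_{\nu=1}^{\infty} 2^{-\nu/2}\psi_\nu$ and $TP_{\psi_0} = P_{\varphi_0}$ and $\varphi_0 = \sum_\nu x_\nu \varphi_\nu$ it follows, since $|\langle \varphi_0, \varphi_\nu \rangle|^2 = |\langle \psi_0, \psi_\nu \rangle|^2$, that $|x_\nu|^2 = 2^{-\nu}$. The factors $e^{i\alpha_\nu}$ for the $\varphi_\nu$ are arbitrary, and may therefore be chosen such that $\varphi_0 = \sum_\nu 2^{-\nu/2}\varphi_\nu$.

We will now set $\psi = \sum_\nu a_\nu \psi_\nu$ and $\varphi = \sum_\nu b_\nu \varphi_\nu$, where $TP_\psi = P_\varphi$. We will now show that all $b_\nu = a_\nu$ or all $b_\nu = \bar{a}_\nu$. From $|\langle \varphi_\nu, \varphi \rangle|^2 = |\langle \psi_\nu, \psi \rangle|^2$ it follows that $|a_\nu| = |b_\nu|$. Since $T$ may be extended as a $\perp$-isomorphism on all of $G_1$ (by Th. 5.17), we shall consider the map $T$ of the following projection operators: $P_\psi$, $P = \sum'_\nu P_{\psi_\nu}$ (where the prime means that we perform the sum only on a certain subset $N'$ of the natural numbers), and $P^\perp$. We obtain

$$(P_\psi \vee P^\perp) \wedge P = P_\chi, \quad \text{where } \chi = P\psi\, \|P\psi\|^{-1}.$$

For $Q = \sum'_\nu P_{\varphi_\nu} = TP$ we obtain

$$(P_\varphi \vee Q^\perp) \wedge Q = TP_\chi = P_{\chi'}, \quad \text{where } \chi' = Q\varphi\, \|Q\varphi\|^{-1}.$$

That is, if $\psi$ is mapped onto $\varphi$, then every partial sum $\sum'_\nu a_\nu \psi_\nu$ is mapped onto $\sum'_\nu b_\nu \varphi_\nu$ up to a normalization factor, which is, since $|a_\nu| = |b_\nu|$, identical: $\|P\psi\| = \|Q\varphi\|$. From $|\langle \psi_0, P\psi \rangle|\, \|P\psi\|^{-1} = |\langle \varphi_0, Q\varphi \rangle|\, \|Q\varphi\|^{-1}$ and since $\|P\psi\| = \|Q\varphi\|$ it follows that

$$\Big| {\sum_\nu}' \frac{1}{2^{\nu/2}}\, a_\nu \Big| = \Big| {\sum_\nu}' \frac{1}{2^{\nu/2}}\, b_\nu \Big| \tag{5.2}$$

for every sum $\sum'$ over a subset $N'$ of the $\nu$. Since a factor in $\varphi$ and $\psi$ is arbitrary, and $|a_\nu| = |b_\nu|$, we may choose $a_1 = b_1$ real and nonzero (in the case in which $a_1 = b_1 = 0$ we choose a different $\nu \ne 1$). From (5.2) we obtain especially:

$$\Big| \frac{1}{2^{1/2}}\, a_1 + \frac{1}{2^{\nu/2}}\, a_\nu \Big| = \Big| \frac{1}{2^{1/2}}\, a_1 + \frac{1}{2^{\nu/2}}\, b_\nu \Big|$$

for all $\nu$. Thus it follows that either $a_\nu = b_\nu$ or $\bar{a}_\nu = b_\nu$.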
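For each two different $\nu$, $\mu$ it follows from (5.2) that

$$\Big| \frac{1}{2^{\nu/2}}\, a_\nu + \frac{1}{2^{\mu/2}}\, a_\mu \Big| = \Big| \frac{1}{2^{\nu/2}}\, b_\nu + \frac{1}{2^{\mu/2}}\, b_\mu \Big|.$$

If $a_\nu$ is not real and $b_\nu = a_\nu$, then it follows that $b_\mu = a_\mu$; if $a_\nu$ is not real and $b_\nu = \bar{a}_\nu$, then it follows that $b_\mu = \bar{a}_\mu$. Thus we can either have all $b_\nu = a_\nu$ or all $b_\nu = \bar{a}_\nu$.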
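The step from (5.2) to the alternative $b_\nu = a_\nu$ or $b_\nu = \bar{a}_\nu$ can be spelled out as a short computation. The following is a sketch; the abbreviations $c$, $a$, $b$ are introduced here only for illustration, with $c = 2^{-1/2}a_1$ real and nonzero, $a = 2^{-\nu/2}a_\nu$ and $b = 2^{-\nu/2}b_\nu$.

```latex
\begin{align*}
|c + a|^2 &= c^2 + 2c\,\operatorname{Re} a + |a|^2, \\
|c + b|^2 &= c^2 + 2c\,\operatorname{Re} b + |b|^2 .
\end{align*}
% With |a| = |b| and c real, c \neq 0, the equality |c + a| = |c + b| gives
\begin{align*}
\operatorname{Re} a = \operatorname{Re} b
\quad\Longrightarrow\quad
(\operatorname{Im} b)^2 = |b|^2 - (\operatorname{Re} b)^2 = (\operatorname{Im} a)^2
\quad\Longrightarrow\quad
b = a \ \text{ or } \ b = \bar{a}.
\end{align*}
```

For real $a$ both alternatives coincide, which is why the argument for pairs $\nu$, $\mu$ is only needed when $a_\nu$ is not real.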
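We must distinguish between two different cases:

(1) $\psi_1 + i\psi_2$ is mapped onto $\varphi_1 + i\varphi_2$.
(2) $\psi_1 + i\psi_2$ is mapped onto $\varphi_1 - i\varphi_2$.

Case (1). Since every partial sum is mapped into a partial sum, it follows that $\psi_1 + i\psi_2 + \psi_\nu \to \varphi_1 + i\varphi_2 + \varphi_\nu$ and therefore we obtain $\psi_\nu + i\psi_2 \to \varphi_\nu + i\varphi_2$. From this result we obtain $\psi_\nu + i\psi_2 + i\psi_\mu \to \varphi_\nu + i\varphi_2 + i\varphi_\mu$ and also $\psi_\nu + i\psi_\mu \to \varphi_\nu + i\varphi_\mu$ for arbitrary pairs $\nu$, $\mu$. From $\psi_\nu + i\psi_\mu \to \varphi_\nu + i\varphi_\mu$ it follows that (for real $a$, $b$)

$$\psi_\nu + i\psi_\mu + (a + ib)\psi_\rho \to \varphi_\nu + i\varphi_\mu + (a + ib)\varphi_\rho$$

and thus we obtain $\psi_\nu + (a + ib)\psi_\rho \to \varphi_\nu + (a + ib)\varphi_\rho$. Let $\sum_\nu c_\nu \psi_\nu$ be an arbitrary vector. We shall assume that $c_{\nu_1}$ is real and nonzero. Since $c_{\nu_1}\psi_{\nu_1} + c_\mu \psi_\mu \to c_{\nu_1}\varphi_{\nu_1} + c_\mu \varphi_\mu$ for all $\mu$, it follows that $\sum_\nu c_\nu \psi_\nu \to \sum_\nu c_\nu \varphi_\nu$.

Case (2). In the same way as in Case (1) it is easy to show that if $\psi_\nu + i\psi_\mu \to \varphi_\nu - i\varphi_\mu$ for all pairs, then $\psi_\nu + (a + ib)\psi_\mu \to \varphi_\nu + (a - ib)\varphi_\mu$ and therefore

$$\sum_\nu c_\nu \psi_\nu \to \sum_\nu \bar{c}_\nu \varphi_\nu.$$

An isomorphic map $U$ of $\mathcal{H}_1$ onto $\mathcal{H}_2$ is defined by $U(\sum_\nu c_\nu \psi_\nu) = \sum_\nu c_\nu \varphi_\nu$; an anti-isomorphic map $U$ is defined by $U(\sum_\nu c_\nu \psi_\nu) = \sum_\nu \bar{c}_\nu \varphi_\nu$ (see AIV, §13). In both cases $TP_\psi = P_\varphi$ where $\varphi = U\psi$. For an $f \in \mathcal{H}_2$ we obtain

$$(TP_\psi)f = P_\varphi f = \varphi \langle \varphi, f \rangle = U\psi \langle U\psi, f \rangle.$$

For $f = Ug$ we have $\langle U\psi, Ug \rangle = \langle \psi, g \rangle$ if $U$ is an isomorphism (or $\langle U\psi, Ug \rangle = \langle g, \psi \rangle$ in the case in which $U$ is an anti-isomorphism). Thus it follows that $(TP_\psi)f = U\psi \langle \psi, g \rangle$ (or $= (U\psi)\langle g, \psi \rangle$) and therefore

$$(TP_\psi)f = UP_\psi U^{-1}f \quad (\text{or } = (U\psi)\langle g, \psi \rangle = U[\psi \langle \psi, g \rangle] = UP_\psi U^{-1}f),$$

so that $TP_\psi = UP_\psi U^{-1}$ is proven for both cases. In the last bracket we used the fact that, for an anti-isomorphic map, $U(a\psi) = \bar{a}\,U\psi$.

From Th. 5.17 it follows that:

Th. 5.19. The mapping $T$ extended to all of $G_1$ (or the $\perp$-isomorphism $T$ of $G_1$ onto $G_2$) has the form $Te = UeU^{-1}$, where $U$ is either an isomorphism or anti-isomorphism of $\mathcal{H}_1$ onto $\mathcal{H}_2$.

Th. 5.20. Every isomorphism or anti-isomorphism $U$ of $\mathcal{H}_1$ onto $\mathcal{H}_2$ generates a $\perp$-isomorphism of $G_1$ onto $G_2$ by means of the equation $Te = UeU^{-1}$. $U$ is determined up to a phase factor $e^{i\alpha}$ by $T$.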
Proof. It is easy to see that $e \to U_1 e U_1^{-1}$ is a $\perp$-isomorphism of $G_1$ onto $G_2$. Let $U_1 e U_1^{-1} = U_2 e U_2^{-1}$. Then we obtain $U_2^{-1}U_1 e = e\,U_2^{-1}U_1$. The unitary map $U_2^{-1}U_1$ of $\mathcal{H}_1$ onto itself commutes with all $e \in G_1$, and we therefore obtain $U_2^{-1}U_1 = e^{i\alpha}1$, from which we obtain $U_1 = U_2 e^{i\alpha}1 = e^{\pm i\alpha}U_2$ (where $+$ or $-$ depends on whether $U_1$, $U_2$ are isomorphisms or anti-isomorphisms).
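The phase ambiguity in Th. 5.20 can be illustrated with 2x2 matrices. This is a minimal sketch, assuming a hypothetical unitary matrix `U1`, a hypothetical phase `alpha`, and a hypothetical projection `e`; it only checks that `U1` and `exp(i*alpha) * U1` generate the same map $e \to UeU^{-1}$.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose; equals the inverse for a unitary matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def conjugate_by(u, e):
    """Return u e u^{-1} for a unitary matrix u."""
    return matmul(matmul(u, e), dagger(u))

def max_abs_diff(a, b):
    """Largest entrywise deviation between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical unitary matrix and phase, chosen only for illustration.
S = 2 ** -0.5
U1 = [[S, S], [-1j * S, 1j * S]]
alpha = 0.7
phase = cmath.exp(1j * alpha)
U2 = [[phase * x for x in row] for row in U1]

# Projection onto the unit vector (0.6, 0.8i).
e = [[0.36, -0.48j], [0.48j, 0.64]]

same_map_deviation = max_abs_diff(conjugate_by(U1, e), conjugate_by(U2, e))
```

The deviation vanishes up to rounding because the phase cancels in $UeU^{-1}$; a factor that is not a multiple of the identity (for example a phase applied to only one row of `U1`) would in general change the image of `e`.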
Th. 5.21. The $\perp$-isomorphism $Te = UeU^{-1}$ of $G_1$ onto $G_2$ may be extended to a $\mathcal{B}$-continuous effect isomorphism of $L_1$ onto $L_2$ by means of the equation $Tg = UgU^{-1}$. The uniquely determined isomorphism of $\mathcal{B}'_1$ onto $\mathcal{B}'_2$ described in Th. 5.7 is given by $Ty = UyU^{-1}$. $T'x = U^{-1}xU$ is the isomorphism of $\mathcal{B}_2$ onto $\mathcal{B}_1$ and is a mixture isomorphism of $K_2$ onto $K_1$.

Proof. The fact that $Ty = UyU^{-1}$ is an effect isomorphism and is an isomorphism of $\mathcal{B}'_1$ onto $\mathcal{B}'_2$ is clear. $\operatorname{tr}((T'w)e) = \operatorname{tr}(w(Te)) = \operatorname{tr}(wUeU^{-1})$; if $U$ is unitary, it then follows that $\operatorname{tr}((T'w)e) = \operatorname{tr}(U^{-1}wUe)$ and, therefore, $T'w = U^{-1}wU$. Since $\operatorname{tr}(w(Te))$ is real, from AIV, §13 it follows that, for an anti-isomorphism, $\operatorname{tr}(wUeU^{-1}) = \operatorname{tr}(U^{-1}wUe)$ and we therefore obtain $T'w = U^{-1}wU$.

Th. 5.22. If the restriction of a *-automorphism $L \to L$, where $L \subset \mathcal{B}'(\mathcal{H})$, onto $G$ satisfies $Te = e$, then $Tg = g$ is satisfied, that is, $T$ is the identity map of $L$ onto itself.

Proof. Since $Tp = p$ for all $p \in A(G)$, it follows that, by analogy with the proof of Th. 5.8, $T(\lambda p) = \tau(\lambda, p)p$ for $0 \le \lambda \le 1$. We obtain $\tau(0, p) = 0$, $\tau(1, p) = 1$, and $\tau(\lambda, p)$ increases monotonically with $\lambda$. Let $p_\nu$ be finitely many pairwise orthogonal elements of $A(G)$; it follows that, for all $g = \sum_\nu \lambda_\nu p_\nu$ where $0 < \lambda_\nu \le 1$, $g^{-1}$ (where $g^{-1}$ is the reciprocal operator to $g$) exists in the subspace $(\sum_\nu p_\nu)\mathcal{H}$. We now consider all atoms $P_\varphi \le \sum_\nu p_\nu$, that is, all $\varphi$ for which $\varphi \in (\sum_\nu p_\nu)\mathcal{H}$. We seek all values of $\lambda$, $0 < \lambda \le 1$, for which $g \ge \lambda P_\varphi$. For the maximal $\lambda$ for which $g \ge \lambda P_\varphi$ there exists a $\chi$ (for which $P_\varphi \chi \ne 0$) with $g\chi = \lambda P_\varphi \chi$. With $\eta = g\chi$ (that is, $\chi = g^{-1}\eta$) we obtain $P_\varphi \eta = \eta$ and $\lambda^{-1}\eta = P_\varphi g^{-1}P_\varphi \eta$, and we therefore obtain $\lambda^{-1} = \langle \varphi, g^{-1}\varphi \rangle$. From this result it follows that for $g = \sum_{\nu \ne \mu} p_\nu + \beta p_\mu$ the following relationship holds:

$$\lambda^{-1} = (1 - \|p_\mu \varphi\|^2) + \beta^{-1}\|p_\mu \varphi\|^2,$$

and we therefore obtain

$$\|p_\mu \varphi\|^2 = \beta(1 - \lambda)\lambda^{-1}(1 - \beta)^{-1}.$$
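Since $g = \sum_{\nu \ne \mu} p_\nu + \beta p_\mu$ for all $g$ for which $\sum_{\nu \ne \mu} p_\nu \le g \le \sum_\nu p_\nu$, we therefore obtain

$$\sum_{\nu \ne \mu} p_\nu \le Tg \le \sum_\nu p_\nu,$$

that is,

$$Tg = \sum_{\nu \ne \mu} p_\nu + \sigma_\mu(\beta)p_\mu.$$

$\tau(\lambda, P_\varphi)$ must be the maximal value of a $\tau$ for which $Tg \ge \tau P_\varphi$. From $Tg \ge \tau(\lambda, P_\varphi)P_\varphi$, where $\tau(\lambda, P_\varphi)$ is the maximal $\tau$, it follows that $\tau(\lambda, P_\varphi)^{-1} = \langle \varphi, (Tg)^{-1}\varphi \rangle$ and we obtain

$$\tau(\lambda, P_\varphi)^{-1} = (1 - \|p_\mu \varphi\|^2) + \sigma_\mu(\beta)^{-1}\|p_\mu \varphi\|^2,$$

that is,

$$\|p_\mu \varphi\|^2 = \frac{\sigma_\mu(\beta)\,(1 - \tau(\lambda, P_\varphi))}{\tau(\lambda, P_\varphi)\,(1 - \sigma_\mu(\beta))}.$$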
In this formula we shall hold $p_\mu$ fixed, while we consider $\varphi$ to vary in $(\sum_\nu p_\nu)\mathcal{H}$; we may set the value [...]. If we insert the above value of $\|p_\mu \varphi\|^2$, it follows that

$$\frac{\sigma_\mu(\beta)\,(1 - \beta)}{\beta\,(1 - \sigma_\mu(\beta))} = \frac{\tau(\lambda, P_\varphi)\,(1 - \lambda)}{\lambda\,(1 - \tau(\lambda, P_\varphi))}, \tag{5.3}$$

while

$$\beta = \frac{\lambda \|p_\mu \varphi\|^2}{1 - \lambda(1 - \|p_\mu \varphi\|^2)}$$

in the above equation. The left-hand side depends therefore only on $\lambda$ and $\|p_\mu \varphi\|^2$; therefore the right side can depend only on $\lambda$ and $\|p_\mu \varphi\|^2$. If the Hilbert space $\mathcal{H}$ is more than two-dimensional, then the right-hand side is independent of all other components $\|p_\nu \varphi\|^2$ ($\nu \ne \mu$). Since the definition of $\tau(\lambda, P_\varphi)$ does not depend on the choice of $p_\mu$, it follows that the right side is independent of each $\|p_\kappa \varphi\|^2$ for every arbitrary atom $p_\kappa$ in $G$ and is therefore independent of $P_\varphi$ (for all $\varphi \in (\sum_\nu p_\nu)\mathcal{H}$). On the other hand we have

$$\lambda = \frac{\beta}{\beta(1 - \|p_\mu \varphi\|^2) + \|p_\mu \varphi\|^2}.$$

If in (5.3) we hold $\beta$ on the left-hand side fixed, then the expression on the left side is a constant, while $\lambda$ on the right-hand side (which is independent of $P_\varphi$) may vary, and, for fixed $\beta \ne 0$, may vary between $\beta$ and 1. The right side depends only on $\lambda$ and is therefore constant in the interval $1 > \lambda > \beta$. Since $\beta$ is arbitrary in $1 > \beta > 0$ it follows that the right side is constant for $1 > \lambda > 0$ (for $\lambda = 0$ we obtain $\tau(0, P_\varphi) = 0$). Since the left side of (5.3) is positive, there exists a constant $a > 0$ such that

$$\tau(\lambda, P_\varphi) = \frac{a\lambda}{1 + \lambda a - \lambda}$$

(this is also the case for $\lambda = 0$!). $\tau(\lambda, P_\varphi)$ is therefore independent of $P_\varphi$ for all $\varphi \in (\sum_\nu p_\nu)\mathcal{H}$: $\tau(\lambda, P_\varphi) = \tau(\lambda)$. Since each pair $\varphi_1$, $\varphi_2$ together with a $\varphi \in (\sum_\nu p_\nu)\mathcal{H}$ lies in a finite-dimensional subspace of $\mathcal{H}$, we therefore obtain $T(\lambda P_\varphi) = \tau(\lambda)P_\varphi$ for all $\varphi \in \mathcal{H}$.

If $e \in G$ and if $g = \alpha e$, where $0 \le \alpha \le 1$, then $g$ is uniquely determined by $g \ge \alpha P_\varphi$ for all $\varphi \in e\mathcal{H}$ and $g \not\ge (\alpha + \varepsilon)P_\varphi$ for $\varepsilon > 0$. Therefore $Tg \ge \tau(\alpha)P_\varphi$ for all $\varphi \in e\mathcal{H}$ and $Tg \not\ge \tau(\alpha + \varepsilon)P_\varphi$. Since $g \le e$ we obtain $Tg \le e$ and therefore obtain $T(\alpha e) = \tau(\alpha)e$.
Let $g = (1 - e) + \beta e$; then, for all $\varphi$, $g \ge \lambda P_\varphi$, where $\lambda^{-1} = \langle \varphi, g^{-1}\varphi \rangle = (1 - \|e\varphi\|^2) + \beta^{-1}\|e\varphi\|^2$, and we obtain $Tg \ge \tau(\lambda)P_\varphi$. Thus it follows that $Tg = (1 - e) + \tau(\beta)e$, because for $Tg = (1 - e) + \tau(\beta)e$ it follows that $\tau(\lambda)^{-1} = (1 - \|e\varphi\|^2) + \tau(\beta)^{-1}\|e\varphi\|^2$, which, for the above value of $\lambda$ and the form of the function $\tau$, leads to an identity.

Since $T$ is a *-automorphism of $L$, we must have $T(1 - \alpha e) = 1 - \tau(\alpha)e$. Since $1 - \alpha e = (1 - e) + (1 - \alpha)e$, it follows that $T(1 - \alpha e) = (1 - e) + \tau(1 - \alpha)e$, that is, $1 - \tau(\alpha) = \tau(1 - \alpha)$, from which we obtain $a = 1$ and $\tau(\lambda) = \lambda$.
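The last step, that $1 - \tau(\alpha) = \tau(1 - \alpha)$ forces $a = 1$ for the family $\tau(\lambda) = a\lambda / (1 + \lambda a - \lambda)$, can be checked numerically. The following is a small sketch; the function names, the sample points, and the tolerance are assumptions made for illustration, and the check is a plausibility test on sample values rather than a proof.

```python
def tau(lam, a):
    """The one-parameter family tau(lambda) = a*lambda / (1 + lambda*a - lambda)."""
    return a * lam / (1 + lam * a - lam)

def satisfies_star_condition(a, samples=(0.1, 0.25, 0.5, 0.9), tol=1e-12):
    """Check 1 - tau(alpha) == tau(1 - alpha) on a few sample values of alpha."""
    return all(abs(tau(1 - x, a) + tau(x, a) - 1) < tol for x in samples)

# Hypothetical candidate values of the constant a > 0.
candidates = [0.5, 1.0, 2.0]
results = {a: satisfies_star_condition(a) for a in candidates}
```

Algebraically, $1 - \tau(\alpha) = \tau(1 - \alpha)$ reduces to $\alpha(a - 1)(a + 1) = 0$, so for $a > 0$ only $a = 1$ survives, and then $\tau(\lambda) = \lambda$.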
Since each $g \in L$ is determined by the maximal $\lambda$ for which $g \ge \lambda P_\varphi$ for all $\varphi$, we obtain $Tg = g$.

Th. 5.23. Each *-isomorphism $T$ of $L_1$ onto $L_2$ (where $L_1 \subset \mathcal{B}'(\mathcal{H}_1)$, $L_2 \subset \mathcal{B}'(\mathcal{H}_2)$) has the form $Tg = UgU^{-1}$ with an isomorphic or anti-isomorphic map $U$ of $\mathcal{H}_1$ onto $\mathcal{H}_2$.

Proof. The restriction of $T$ onto $G_1$ is a $\perp$-isomorphism. On $G_1$ we therefore obtain $Te = UeU^{-1}$ with either an isomorphic or anti-isomorphic map $U$ of $\mathcal{H}_1$ onto $\mathcal{H}_2$. $\tilde{T}g = U^{-1}(Tg)U$ is a *-automorphism of $L_1$ onto itself with $\tilde{T}e = e$ for $e \in G_1$. According to Th. 5.22, $\tilde{T}$ is the identity map on all of $L_1$; therefore $Tg = UgU^{-1}$.

Here we have closed the circle: all *-isomorphisms of $L_1$ onto $L_2$ are determined in the above way. We have formulated these theorems such that we can begin the circle at any point, for example, with the isomorphic maps of $A(G_1)$ onto $A(G_2)$ and with Th. 5.17, or with all $\perp$-isomorphisms of $G_1$ onto $G_2$.

Th. 5.23 is not valid for reducible $L_1$, $L_2$, so that, in general, a *-isomorphism is not necessarily an effect isomorphism.

If $\mathcal{H}_1 = \mathcal{H}_2$ and $L_1 = L_2$, $G_1 = G_2$, we need only replace "iso" with "auto" in all the preceding theorems. An automorphism $U$ of a Hilbert space onto itself will also be called a unitary mapping; similarly an anti-automorphism $U$ will be called an anti-unitary map.

Before closing this chapter we shall consider the *-automorphisms $T$ of the whole reducible system $L \subset \mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$. Each $y \in \mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ can be considered to be an operator $y = \sum_\nu y_\nu$, where each $y_\nu$ operates in $\mathcal{H}_\nu$. Let $U_{p(\nu)\nu}$ denote the isomorphism (or anti-isomorphism) of $\mathcal{H}_\nu$ onto $\mathcal{H}_{p(\nu)}$. Then we obtain

$$Ty = \sum_\nu U_{p(\nu)\nu}\, y_\nu\, U_{p(\nu)\nu}^{-1}. \tag{5.4}$$

If we embed $\bigcup_\nu \mathcal{H}_\nu$ into the Hilbert space

$$\mathcal{H} = \sum_\nu \oplus\, \mathcal{H}_\nu,$$

then we may consider a $y \in \mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ to be an operator having the form

$$y\varphi = \sum_\nu y_\nu \varphi_\nu, \quad \text{where } \varphi = \sum_\nu \varphi_\nu \text{ and } \varphi_\nu \in \mathcal{H}_\nu.$$

If the $U_{p(\nu)\nu}$ are all isomorphisms, then a unitary operator is defined by

$$U\varphi = \sum_\nu U_{p(\nu)\nu}\,\varphi_\nu \tag{5.5}$$
and we obtain

$$Ty = UyU^{-1}. \tag{5.6}$$

In general (that is, if the $U_{p(\nu)\nu}$ are neither all isomorphisms nor all anti-isomorphisms) the mapping defined by (5.5) is neither unitary nor anti-unitary.

The map $T'$ which is adjoint to $T$ is given by

$$T'x = \sum_\nu U_{p(\nu)\nu}^{-1}\, x_{p(\nu)}\, U_{p(\nu)\nu} \tag{5.7}$$

or, according to (5.5), is given by

$$T'x = U^{-1}xU. \tag{5.8}$$

Here it is important to note that $U$ is not only undetermined up to a factor $e^{i\alpha}$ by (5.5) and (5.6), but that with

$$\tilde{U}\varphi = \sum_\nu e^{i\alpha_\nu}\, U_{p(\nu)\nu}\,\varphi_\nu \tag{5.9}$$

we also have

$$Ty = UyU^{-1} = \tilde{U}y\tilde{U}^{-1}. \tag{5.10}$$

With the proof of the preceding theorems we have, at the same time, also investigated the structure of mixture isomorphisms because, according to §4.1 and §4.2, the mixture isomorphisms $S$ and the $\mathcal{B}$-continuous effect isomorphisms $T$ correspond uniquely by means of the equations $S' = T$ and $T' = S$. Therefore we do not need to derive any additional theorems for mixture isomorphisms.

In deriving the structure of mixture isomorphisms we have, after all, not needed all the preceding theorems. These theorems have served to show "how few" assumptions about the maps $L_1 \to L_2$ or $G_1 \to G_2$ or $A(G_1) \to A(G_2)$ were already sufficient to determine the $\mathcal{B}$-continuous effect isomorphisms. If we had begun with the mixture isomorphism $S$, then we would only need the following theorems for $T = S'$:

(1) The restriction of $T$ onto $G_1$ is a $\perp$-isomorphism $G_1 \to G_2$ (see Th. 5.1).
(2) $T$ maps $A(G_1)$ bijectively onto $A(G_2)$ and, correspondingly, $T^{-1}$ maps $A(G_2)$ onto $A(G_1)$, where orthogonal atoms are mapped onto orthogonal atoms (this result follows directly from (1)).
(3) Th. 5.11.
(4) Th. 5.13, Th. 5.14, Th. 5.15, and Th. 5.16.
(5) Th. 5.18.
(6) $T'w = Sw = U^{-1}wU$ follows directly from Th. 5.21 and Th. 5.7.

Th. 5.22 and Th. 5.23 were not needed.

With $T' = S$ and for $x \in \mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$, from (5.7) we finally obtain

$$Sx = \sum_\nu U_{p(\nu)\nu}^{-1}\, x_{p(\nu)}\, U_{p(\nu)\nu}. \tag{5.11}$$
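A mixture isomorphism $S$ is $\mathcal{D}$-continuous if $T = S'$ transforms the space $\mathcal{D}$ into itself. If we assume that the center $Z$ of $G$ is a subset of $\mathcal{D}$, then the components $y_\nu$ of a $y \in \mathcal{D}$ form a Banach subspace $\mathcal{D}_\nu$ of $\mathcal{B}'_\nu$. Then $S$ is $\mathcal{D}$-continuous only if $U_{p(\nu)\nu}\, y_\nu\, U_{p(\nu)\nu}^{-1} \in \mathcal{D}_{p(\nu)}$ for $y_\nu \in \mathcal{D}_\nu$.

Th. 5.24. Every p-continuous r-isomorphism $h$ determines (by means of the map $\psi\mathcal{F}_1 \to L_2$ defined in Th. 3.2) a $\mathcal{B}$-continuous effect isomorphism $T$ which maps $\mathcal{D}_1$ isomorphically onto $\mathcal{D}_2$ (as Banach spaces). $T'$ is a $\mathcal{D}$-continuous mixture isomorphism.

Proof. According to Th. 4.2.6 we need only show that $L_1 \xrightarrow{T} L_2$ is bijective and that $\mathcal{D}_1 \xrightarrow{T} \mathcal{D}_2$ is bijective. Since $\psi\mathcal{F}_1$ is $\sigma$-dense in $L_1$ and $\psi\mathcal{F}_2$ is $\sigma$-dense in $L_2$, it follows from $\psi\mathcal{F}_1 \to \psi\mathcal{F}_2$ that $L_1 \xrightarrow{T} L_2$ is surjective. For the map $\tilde{T}$ corresponding to $h^{-1}$ we obtain $\tilde{T}T = 1$ and $T\tilde{T} = 1$ on $\psi\mathcal{F}_1$ and $\psi\mathcal{F}_2$, respectively. Since $L_2 \xrightarrow{\tilde{T}} L_1$ is surjective, it follows that $L_1 \xrightarrow{T} L_2$ is injective. Since $\mathcal{D}_2 \xrightarrow{\tilde{T}} \mathcal{D}_1$ and $\tilde{T} = T^{-1}$, we find that $\mathcal{D}_1 \xrightarrow{T} \mathcal{D}_2$ is bijective.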
CHAPTER VI

Representation of Groups by Means of Effect Automorphisms and Mixture Automorphisms

In the previous chapter we have seen that there is a one-to-one relationship between the $\mathcal{B}$-continuous effect automorphisms and mixture automorphisms which is defined by the adjoint maps. In this chapter we shall investigate the representation of groups by means of $\mathcal{B}$-continuous effect automorphisms. If we make the transition from the $\mathcal{B}$-continuous effect automorphisms to the mixture automorphisms, then we obtain the corresponding "adjoint" representation. Thereby, to each representation of a group by mixture automorphisms we obtain a corresponding "adjoint" representation in terms of $\mathcal{B}$-continuous effect automorphisms. Here we shall only consider representations by means of $\mathcal{B}$-continuous effect automorphisms, because the representation of a group by means of mixture automorphisms would only result in an unnecessary repetition of the results derived here.

1 Homomorphic Maps of a Group $\mathcal{G}$ in the Group $\mathcal{A}$ of $\mathcal{B}$-continuous Effect Automorphisms

From the fact that the product of two effect automorphisms is an effect automorphism and that both the identity and the inverse of a $\mathcal{B}$-continuous effect automorphism are again such, it easily follows that the set of $\mathcal{B}$-continuous effect automorphisms forms a group. By the expression "the representation of a group $\mathcal{G}$ by means of $\mathcal{B}$-continuous effect automorphisms" we mean a (group) homomorphism of $\mathcal{G}$ into the group $\mathcal{A}$ of $\mathcal{B}$-continuous effect automorphisms.
If we are given a map of a set $M$ into the set $\mathcal{A}$, then we may easily construct a mapping $M \times \mathcal{B}' \to \mathcal{B}'$ in the following way: for $a \in M$ and $a \to T$ ($T \in \mathcal{A}$) we set $(a, y) \to Ty$. For convenience we shall often use the abbreviation $ay$ for the image $Ty$ of $(a, y)$, provided that in the particular circumstance there exists a particular fixed map $a \to T$. In addition, for the image set $\{ay \mid a \in M\}$ we will use the abbreviation $My$; similarly, for $\{ay \mid a \in M, y \in \mathcal{B}'\}$ we will use the abbreviation $M\mathcal{B}'$, and $\{ay \mid a \in M, y \in L\} = ML$.

Therefore, for $a, b \in \mathcal{G}$, for a representation of a group $\mathcal{G}$ we obtain $a(by) = (ab)y$ for all $y \in L$ (and therefore for all $y \in \mathcal{B}'$). Thus it follows ($e$ the unit element of $\mathcal{G}$) that $a(ey - y) = 0$; since $a$ is an effect automorphism it follows that $ey = y$. From $a^{-1}a = e$ it then follows that $a^{-1}ay = y$, that is, $a^{-1}$ is the inverse effect automorphism relative to $a$.

1.1 Generation of a Representation of $\mathcal{G}$ in $\mathcal{A}$ by Means of a Representation of $\mathcal{G}$ by r-Automorphisms

In applications we frequently find that a representation of a group $\mathcal{G}$ is given in terms of r-automorphisms. We will therefore assume that to each element $a \in \mathcal{G}$ there corresponds an r-automorphism $\mathcal{F} \xrightarrow{a} \mathcal{F}$ which satisfies $a_1(a_2 f) = (a_1 a_2)f$ (for all $f \in \mathcal{F}$). Thus, for $a_2 = e$ we obtain $a(ef) = af$. Since $a$ is injective, it follows that $ef = f$, that is, $e$ is the identity map in $\mathcal{F}$. From $(a^{-1}a)f = ef = f$ it follows that $a^{-1}$ is the inverse r-automorphism which corresponds to $a$.

If, for all $a \in \mathcal{G}$, the maps $\mathcal{F} \xrightarrow{a} \mathcal{F}$ are p-continuous, then $a$ and also $a^{-1}$ are, according to V, D 3.6, p-continuous r-automorphisms.

If we are given a representation of $\mathcal{G}$ in terms of p-continuous r-automorphisms, then according to V, Th. 5.24, there exists a representation of $\mathcal{G}$ by means of $\mathcal{B}$-continuous effect automorphisms, that is, a representation of $\mathcal{G}$ in $\mathcal{A}$. In addition, according to V, Th. 5.24, for all $a$ in $\mathcal{G}$ we obtain $a\mathcal{D} = \mathcal{D}$, that is, $\mathcal{G}$ leaves $\mathcal{D}$ invariant.
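The homomorphism property $a(by) = (ab)y$ and $ey = y$ can be made concrete with a finite group acting on 2x2 matrices. The following is a minimal sketch under stated assumptions: the cyclic group of order 4, the diagonal unitaries `unitary(k)`, the matrix `y`, and all helper names are hypothetical choices made for illustration, and the action used is $y \to U_k y U_k^{-1}$.

```python
import cmath

N = 4  # order of the hypothetical cyclic group Z_4

def unitary(k):
    """Diagonal unitary U_k = diag(1, w**k) with w = exp(2*pi*i/N)."""
    w = cmath.exp(2j * cmath.pi * k / N)
    return [[1 + 0j, 0j], [0j, w]]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose; equals the inverse for a unitary matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def act(k, y):
    """Action of the group element k on the operator y: y -> U_k y U_k^{-1}."""
    u = unitary(k)
    return matmul(matmul(u, y), dagger(u))

def compose(j, k):
    """Group law of Z_N (addition modulo N); 0 is the unit element."""
    return (j + k) % N

def max_abs_diff(a, b):
    """Largest entrywise deviation between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# A hypothetical self-adjoint operator.
y = [[0.5 + 0j, 0.25 - 0.1j], [0.25 + 0.1j, 0.3 + 0j]]

worst = max(max_abs_diff(act(j, act(k, y)), act(compose(j, k), y))
            for j in range(N) for k in range(N))
```

Here `worst` measures the largest violation of $a(by) = (ab)y$ over all pairs of group elements; in this sketch it vanishes up to rounding, and `act(0, y)` reproduces `y`.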
If to each r-automorphism $a \in \mathcal{G}$ there exists a dual p-automorphism $a'$, then, according to V, Th. 3.4, all $a$ are p-continuous. For the representation of $\mathcal{G}$ in $\mathcal{A}$ obtained in this way, the mixture automorphisms which correspond to the p-automorphisms are precisely the adjoint maps corresponding to the representation elements of $\mathcal{G}$ in $\mathcal{A}$.

In the following we shall only consider topological groups $\mathcal{G}$ (since we may consider finite and countably infinite groups to be topological groups under the discrete topology, the following considerations are also valid for such groups).

The topology of a topological group is uniquely determined by the neighborhood filter of the unit element $e$ (see AV, §10.1). A topological group is given if the neighborhood filter of the unit element satisfies the conditions below; these conditions will be interpreted "physically." By this we mean that
the neighborhood filter of the unit element should (ideally) relate physical imprecision to the group elements which will be physically interpreted as being indistinguishable from the unit element (see [1], §5 and §9). Therefore it is "physically" reasonable to postulate the following conditions:

TG 1. To each neighborhood $U$ of the unit element there exists a neighborhood $V$ for which $VV \subset U$.

TG 2. To each neighborhood $U$ of the unit element there exists a neighborhood $V$ for which $V^{-1} \subset U$.

$VV$ is defined as the set of all $ab$ for which $a, b \in V$. TG 1 therefore says that the product of two elements $a$, $b$ is not distinguishable from the unit element (with the imprecision represented by the set $U$) if both elements are "near enough" to the unit element. Similarly TG 2 says that if $a$ is not "distinguishable" from the unit element, then $a^{-1}$ is also not "distinguishable" from the unit element.

TG 3. To each $a \in \mathcal{G}$ and each neighborhood $U$ of the unit element there exists another neighborhood $V$ for which $V \subset aUa^{-1}$, that is, $a^{-1}Va \subset U$.

TG 3 says that, if a group element $b$ cannot be distinguished from the unit element (with imprecision defined by $V$), then if we consider $a^{-1}ba$, that is, we apply $b$ "at the location $a$" and then transform back by $a$, we obtain an element which is not distinguishable from the unit element.

A neighborhood filter of the unit element (or a basis for such a neighborhood filter) which satisfies TG 1-TG 3 makes $\mathcal{G}$ into a topological group for which the neighborhood basis of an element $a \in \mathcal{G}$ is given by $aU$, where $U$ is a neighborhood of the unit element (AV, §10.1).

With the neighborhood filter of the unit element there are two (possibly coinciding) uniform structures defined on $\mathcal{G}$ which are compatible with the group operations. A right-handed (or left-handed) uniform structure is defined by the family of sets $\{(a, b) \mid ba^{-1} \in U \text{ (or } a^{-1}b \in U\text{, respectively)}\}$, where $U$ is a neighborhood of the unit element.
The topologies associated with the left- and right-handed uniform structures are identical with the original topology of the group. If a group $\mathcal{G}$ is complete with respect to the right-handed uniform structure, then it is also complete with respect to the left-handed uniform structure, and conversely (AV, §10.2).

We may complete $\mathcal{G}$ as a group if the map $a \to a^{-1}$ transforms a Cauchy filter of the right-handed uniform structure into a Cauchy filter of the right-handed uniform structure (AV, §10.2). We will assume that this is the case for all physically meaningful groups. Then, instead of $\mathcal{G}$ we may use its completion. For this reason we shall assume that all groups which have a "physical interpretation" are complete. (Finite and denumerable groups with discrete topologies are complete.) Every locally compact group is complete (AV, §10.2).
In addition to the so-called "finiteness of physics" condition (see [1], §9) we shall require that physically interpretable groups also satisfy the following condition: $\mathcal{G}$ is separable, and the neighborhood filter of the unit element has a denumerable basis. Then it follows that the uniform structure of $\mathcal{G}$ is metrizable (see AV, §10.2). Then $\mathcal{G}$ is a Baire space (see AII, §3), since $\mathcal{G}$ is complete.

Thus we assume that "physical" groups $\mathcal{G}$ will always be topologically metrizable, complete, and separable groups relative to the corresponding unique right- and left-handed uniform structures.

As we mentioned above, finite and denumerable groups with discrete topologies satisfy the above requirements.

We have provided a simple physical interpretation for the neighborhood filter of the unit element and derived two uniform structures from the compatibility condition; we have also found that we may consider the group to be complete with respect to this uniform structure. This does not, however, mean that one of these uniform structures describes the physical imprecision for the whole group, that is, the physical imprecision which permits us to compare two arbitrary group elements. Physically it is an important distinction whether we are only able to consider elements $b_2 \in \mathcal{G}$ which are neighbors of a fixed element $b_1$, or whether we have a procedure by which we may compare arbitrary pairs $(b_1, b_2)$. The fact that we have considered the set $aUa^{-1}$, where $U$ is an arbitrary neighborhood of the unit element, does not contradict the assumption that only elements in the vicinity of the unit element may be compared. From the above symmetry assumption we may at most conclude that we cannot better distinguish elements of $U$ if we perform a translation of the unit to the location $a$.
Our assumption about the uniform structure of physical imprecision for the group implies the opposite: we cannot compare a pair of arbitrary elements as well as we can compare two elements in the vicinity of the unit element.

As we have explained in [1], §9, for the uniform structure of physical imprecision it is desirable that the set under consideration (in this case $\mathcal{G}$) is precompact. For a "physical" group we shall now assume that, in addition, there exists an additional uniform structure $ph$ of physical imprecision on $\mathcal{G}$ which is weaker than the above, and for which $\mathcal{G}_{ph}$ (that is, $\mathcal{G}$ together with the uniform structure $ph$) is precompact and metrizable and generates the same topology, that is, the topology of $\mathcal{G}$ which is compatible with the group operations. In addition, for fixed $a$ the maps $b \to ab$ and $b \to ba$ should be $ph$-uniformly continuous. Since the topologies of $\mathcal{G}$ and $\mathcal{G}_{ph}$ are identical, $\mathcal{G}_{ph}$ is separable. In AV, §10.2 we have proven that such a structure $ph$ always exists for $\mathcal{G}$.

If $\mathcal{G}$ is compact, then $ph$ and both the left- and right-handed uniform structures of $\mathcal{G}$ are identical, because a compact space has only a single uniform structure. If $\mathcal{G}$ is not compact, then the uniform structure $ph$ is not uniquely
determined by $\mathcal{G}$ and by the above requirements. Then $ph$ requires an additional physical structure which is not determined by the neighborhood filter of the unit element.

The completion $\hat{\mathcal{G}}_{ph}$ of $\mathcal{G}_{ph}$ is compact and is called the physical compactification of $\mathcal{G}$. In general $\hat{\mathcal{G}}_{ph}$ is not a group. If $\mathcal{G}$ is compact, then $\mathcal{G}$ is its own compactification. Although $\hat{\mathcal{G}}_{ph}$ is not necessarily a group, the maps $b \to ab$ and $b \to ba$ may be, for fixed $a \in \mathcal{G}$, uniquely extended onto $\hat{\mathcal{G}}_{ph}$ as $ph$-uniformly continuous maps.

The above results are also valid for finite and denumerably infinite groups with the discrete topology; finite groups are already compact in this sense; denumerably infinite groups $\mathcal{G}$ must yet be made compact, where the topology of the compactification $\hat{\mathcal{G}}_{ph}$ on $\mathcal{G}$ is the discrete topology!

In cases in which misunderstandings are unlikely, we will often write $\hat{\mathcal{G}}$ instead of $\hat{\mathcal{G}}_{ph}$.

Th. 1.1.1. The set $\hat{\mathcal{G}}$ may be partitioned, under (left and right) multiplication with the elements of $\mathcal{G}$, into the invariant subsets $\mathcal{G}$ and $\hat{\mathcal{G}} \setminus \mathcal{G}$.

Proof. If $b \in \hat{\mathcal{G}} \setminus \mathcal{G}$ and $b_\nu \to b$ with $b_\nu \in \mathcal{G}$, then for $a \in \mathcal{G}$ it follows that $ab_\nu \to ab$. Suppose $ab = c \in \mathcal{G}$; then from $ab_\nu \to c$ it also follows that $a^{-1}(ab_\nu) \to a^{-1}c = b \in \mathcal{G}$, in contradiction to $b \in \hat{\mathcal{G}} \setminus \mathcal{G}$.

The physical meaning of the uniform structure $ph$ should also be evident in the representation of $\mathcal{G}$. Since, by assumption, $\mathcal{G}$ is complete and $\mathcal{F}$ is denumerable (see III, §3), $\mathcal{G}$ cannot, in general, be represented by p-continuous r-automorphisms. Since $\mathcal{G}$ is separable, there exists a countable dense subgroup $\tilde{\mathcal{G}}$ in $\mathcal{G}$. We shall now assume that there exists a representation of such a subgroup $\tilde{\mathcal{G}}$ by means of p-continuous r-automorphisms.
It is reasonable to impose the following additional condition: If $a$ is an element of $\tilde{\mathcal{G}}$ in the neighborhood of the unit element, then, for all $w \in \mathcal{K}$, it is for all practical purposes not possible to distinguish between the probabilities $\mu(w, \psi(f))$ and $\mu(w, \psi(af))$; that is, $f$ and $af$ yield the same probabilities for all preparation procedures if $a$ is sufficiently close to $e$. In the same way, for fixed $w \in \mathcal{K}$ and arbitrary $f$ the above probabilities cannot be distinguished. We shall formulate this in the following mathematical axiom:

AG 1. To each $f \in \mathcal{F}$ and $\varepsilon > 0$ there exists a neighborhood $U_\varepsilon$ of the unit element in $\tilde{\mathcal{G}}$ such that

$$|\mu(w, \psi(f)) - \mu(w, \psi(af))| < \varepsilon \quad \text{for all } w \in \mathcal{K} \text{ and } a \in U_\varepsilon .$$

To each $w \in \mathcal{K}$ and $\delta > 0$ there exists a neighborhood $U_\delta$ of the unit element in $\tilde{\mathcal{G}}$ such that

$$|\mu(w, \psi(f)) - \mu(w, \psi(af))| < \delta \quad \text{for all } f \in \mathcal{F} \text{ and } a \in U_\delta .$$
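The second half of AG 1 can be illustrated by a short estimate. The following is only a sketch under assumptions that are not made at this point of the text: the Hilbert-space realization with $\mu(w, g) = \operatorname{tr}(wg)$, a unitary operator $U_a$ for each group element $a$, the action $ag = U_a g U_a^{*}$ on effects, and the adjoint action $a'w = U_a^{*} w U_a$ on ensembles. The symbols $U_a$ and $\|\cdot\|_1$ (trace norm) are introduced here for illustration only.

```latex
% Illustrative assumptions: \mu(w,g) = tr(wg), a g = U_a g U_a^*, a'w = U_a^* w U_a,
% and 0 <= g <= 1, so that the operator norm satisfies \|g\| <= 1.
\begin{align*}
|\mu(w,g) - \mu(w,ag)|
  &= \bigl|\operatorname{tr}\bigl((w - a'w)\,g\bigr)\bigr| \\
  &\le \|w - a'w\|_{1}\,\|g\| \\
  &\le \|w - U_a^{*} w\, U_a\|_{1} .
\end{align*}
```

The bound on the right does not depend on $g$. Under these assumptions, one neighborhood $U_\delta$ of the unit element therefore serves all effects simultaneously as soon as $a \to a'w$ is continuous in the trace norm, which is the form of the second statement of AG 1. The first statement (fixed effect, all ensembles) would instead require norm continuity of $a \to U_a g U_a^{*}$, which does not hold for every bounded operator $g$.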
If AG 1 holds, then the following theorems hold:

Th. 1.1.2. For fixed $y \in \mathcal{D}$ the map $\tilde{\mathcal{G}} \to \mathcal{D}$ defined by $a \to ay$ is uniformly continuous in the norm topology of $\mathcal{D}$. For fixed $x \in \mathcal{B}$ the map $\tilde{\mathcal{G}} \to \mathcal{B}$ defined by $a \to a'x$ is uniformly continuous in the norm topology of $\mathcal{B}$.

Proof. Since $\mathcal{L} = \psi\mathcal{F}$ is norm-dense in $L \cap \mathcal{D}$, it is sufficient to prove the first assertion for $y \in \mathcal{L}$ as follows:

$$\|a_1\psi(f) - a_2\psi(f)\| = \sup_{w \in \mathcal{K}} |\mu(w, a_1[\psi(f) - a_1^{-1}a_2\psi(f)])| = \sup_{w \in \mathcal{K}} |\mu(a_1'w, \psi(f) - \psi(a_1^{-1}a_2 f))| = \sup_{w \in \mathcal{K}} |\mu(w, \psi(f) - \psi(a_1^{-1}a_2 f))| .$$

According to AG 1 we therefore obtain $\|a_1\psi(f) - a_2\psi(f)\| \le \varepsilon$ for $a_1^{-1}a_2 \in U_\varepsilon$. Since $\mathcal{K}$ is norm-dense in $K$ it suffices to prove the second assertion for $x \in \mathcal{K}$ ($\mathcal{L}$ is $\sigma$-dense in $L$):

$$\|a_1'w - a_2'w\| = \sup_{g \in L} \mu(a_1'w - a_2'w, 2g - 1) = 2 \sup_{g \in L} \mu(a_1'w - a_2'w, g) = 2 \sup_{f \in \mathcal{F}} \mu(w, \psi(a_1 f) - \psi(a_2 f)) = 2 \sup_{f \in \mathcal{F}} \mu(w, \psi(f) - \psi(a_2 a_1^{-1} f)) .$$

According to AG 1 we therefore obtain $\|a_1'w - a_2'w\| \le 2\delta$ for $a_2 a_1^{-1} \in U_\delta$.

Th. 1.1.3. By continuous extension of the map $\tilde{\mathcal{G}} \to \mathcal{D}$ described in Th. 1.1.2 we obtain a representation of $\mathcal{G}$ by means of $\mathcal{B}$-continuous effect automorphisms which map $\mathcal{D}$ into itself. For $x \in \mathcal{B}$, $y \in \mathcal{D}$ the maps $\mathcal{G} \to \mathbf{R}$ defined by $a \to \mu(x, ay)$ are uniformly continuous.

Proof. Since $\tilde{\mathcal{G}} \to \mathcal{D}$ is uniformly continuous and $\mathcal{D}$ is complete with respect to the norm, it follows that its continuous extension defines, for each $y \in \mathcal{D}$, a map $\mathcal{G} \to \mathcal{D}$. If we write this map as $a \to ay$, then it follows that $a$ is linear. Since $\mathcal{L}$ is norm-dense in $L \cap \mathcal{D}$, it follows that $a(L \cap \mathcal{D}) \subset L \cap \mathcal{D}$ and that $a$ is norm-continuous. The map $y \to ay$ is also $\mathcal{B}$-continuous. This result follows from Th. 1.1.2, since, for all $a \in \tilde{\mathcal{G}}$, it follows that $\mu(x, ay) = \mu(a'x, y)$ and that the mapping $a \to a'x$ is uniformly continuous with respect to the norm topology in $\mathcal{B}$ and therefore can be extended to a map $\mathcal{G} \to \mathcal{B}$ for which, to each $a \in \mathcal{G}$, a mapping $a'$ of $\mathcal{B}$ into $\mathcal{B}$ is determined. From $\mu(a'x, y) = \mu(x, ay)$ for $a \in \tilde{\mathcal{G}}$ it follows that, in the limit, $\mu(a'x, y) = \mu(x, ay)$ for all $a \in \mathcal{G}$, from which we have proven the $\mathcal{B}$-continuity of $a$.
Since $a$ (as a map of $\mathcal{D} \cap L$ into itself) is $\mathcal{B}$-continuous, it can be extended onto the compact set $L$. Therefore $a$ is a $\mathcal{B}$-continuous effect morphism. Since $a'\mathcal{K}$ is, for all $a \in \tilde{\mathcal{G}}$, equal to $\mathcal{K}$ (and therefore norm-dense in $K$) it follows that $a'\mathcal{K}$ is norm-dense in $K$, from which we conclude that, according to V, Th. 4.1.4, $a$ is an effect automorphism. Since the map $a \to \mu(x, ay)$ is uniformly continuous as a map $\tilde{\mathcal{G}} \to \mathbf{R}$, its extension to $\mathcal{G}$ is also uniformly continuous.

We must now make the uniform structure $ph$ of physical imprecision evident in the representations. We have seen that the mapping $a \to \mu(w, \psi(af))$ of $\tilde{\mathcal{G}}$ into $\mathbf{R}$ is uniformly continuous. If the elements of $\tilde{\mathcal{G}}$ are to be physically distinguishable according to $ph$, then it is necessary to impose the following strong requirement:

AG 2. For each $w \in \mathcal{K}$ and $f \in \mathcal{F}$ the mapping $\tilde{\mathcal{G}}_{ph} \to \mathbf{R}$ defined by $a \to \mu(w, \psi(af))$ is uniformly continuous.

From AG 2 we obtain, by continuous extension:

Th. 1.1.4. The mapping of $\mathcal{G}_{ph} \to \mathbf{R}$ defined by $a \to \mu(x, ay)$ for $x \in \mathcal{B}$, $y \in \mathcal{D}$ is uniformly continuous.

1.2 Some General Properties of a Representation of $\mathcal{G}$ in $\mathcal{A}$

In the previous section it has become evident that the representation of groups by means of $\mathcal{B}$-continuous effect morphisms plays an important role. If such a representation is generated by r-automorphisms then this “generation” will only play a role for the interpretation question (see, for example, VII). Since neither the sets $\mathcal{Q}$ and $\mathcal{F}$ nor the sets $\mathcal{K}$ and $\mathcal{L}$ are axiomatically fixed, except that for $\mathcal{K}$ each denumerable set in $K$ which is norm-dense in $K$ and for $\mathcal{L}$ each denumerable $\sigma$-dense set in $L$ may be used, we are led to concentrate our efforts on questions about the representation of a group on representations in $\mathcal{A}$. Unfortunately, the representation theory of groups by means of positive automorphisms of base-norm spaces has not yet been sufficiently developed for us to be able to present a comprehensive outline of even a part of a “general” representation theory.
This is, in part, due to the fact that the relationships between special structures of base-norm spaces (for example, the special structures formulated by axioms AV 1.1-AV 4 in III, §4) and the structure of the automorphism groups are not well known (see, for example, the attempts in this direction presented in [20]). For this reason in this book we shall often refer to the special structure of these automorphism groups as groups of $\mathcal{B}$-continuous effect automorphisms such as those which were considered for $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ in V. Now, in the remaining part of this section we shall consider a number of general and easily formulated properties (which will later be seen to be “physically” meaningful).

D 1.2.1. A $g \in L$ is called a $\mathcal{G}$-invariant effect if $\mathcal{G}g = g$. An $e \in G$ is said to be a $\mathcal{G}$-invariant decision effect if $\mathcal{G}e = e$.
It is easy to verify that $\lambda 1$ is, for all $0 \le \lambda \le 1$, a $\mathcal{G}$-invariant effect, and that 0 and 1 are $\mathcal{G}$-invariant decision effects. For $a \in \mathcal{G}$ it follows from $ag = g$ and the uniqueness of the spectral representation of $g$ that $ae(\lambda) = e(\lambda)$ holds for the spectral family $e(\lambda)$ of $g$. Therefore it suffices to consider only invariant decision effects.

D 1.2.2. A representation is said to be irreducible if 0 and 1 are the only invariant decision effects.

A representation is therefore irreducible only if the only invariant effects are given by $\lambda 1$ where $0 \le \lambda \le 1$. As we have already mentioned, to each $a$ we may consider the adjoint $a'$; in this way $\mathcal{G}$ is defined as a set of transformations in $K$ and $\mathcal{B}$.

D 1.2.3. Two representations of $\mathcal{G}$, on $\mathcal{B}_1'$ and on $\mathcal{B}_2'$, are said to be equivalent if there exists a $\mathcal{B}$-continuous effect isomorphism $T$ of $\mathcal{B}_1'$ onto $\mathcal{B}_2'$ for which $Ta = aT$ for all $a \in \mathcal{G}$.

By analogy with the considerations in §1.1 we shall assume that the $a \in \mathcal{G}$ transform the subspace $\mathcal{D}$ into itself, that is,

(D 1) $\mathcal{G}\mathcal{D} \subset \mathcal{D}$.

Since $\mathcal{D}$ has not otherwise been specified by means of the axioms (except that $\mathcal{D}$ is norm-separable and that $\mathcal{D} \cap L$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$; see III, §3) we shall now proceed in the opposite way and seek to obtain conditions for $\mathcal{D}$ with the aid of the group representations. Therefore (D 1) is one of the conditions which $\mathcal{D}$ must satisfy. According to III, §3 we will assume that $\mathcal{D}$ is norm-separable. Here we note that the following definitions and concepts apply to every norm-closed subspace $\mathcal{D}$ of $\mathcal{B}'$ for which $\mathcal{D} \cap L$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$ and which is $\mathcal{G}$-invariant. Since $\mathcal{G}$ transforms the space $\mathcal{D}$ into itself, $\mathcal{G}$ is defined, on all of $\mathcal{D}'$, as a set of mixture automorphisms of $K_\sigma$ ($K_\sigma$ is the $\sigma(\mathcal{D}', \mathcal{D})$-closure of $K$ in $\mathcal{D}'$). In analogy to the considerations presented in §1.1 we shall, in addition, assume that the whole group $\mathcal{G}$ is represented in $\mathcal{A}$.
From axioms AG 1 and AG 2 it follows that there are certain continuity properties of this representation; here we shall only use two of these (!):

According to Th. 1.1.3 we obtain:

(D 2) For $x \in \mathcal{B}$, $y \in \mathcal{D}$ the maps $\mathcal{G} \to \mathbf{R}$ defined by $a \to \mu(x, ay)$ are continuous.

According to Th. 1.1.4 we obtain:

(D 3) For $x \in \mathcal{B}$, $y \in \mathcal{D}$ the maps $\mathcal{G}_{ph} \to \mathbf{R}$ defined by $a \to \mu(x, ay)$ are uniformly continuous.
It is obvious that (D 3) $\Rightarrow$ (D 2). For a representation of $\mathcal{G}$ in $\mathcal{A}$ we shall not yet assume that the representation must be obtained by means of r-automorphisms in the way described in §1.1. We will only assume that $\mathcal{G}$ is complete and that either (D 1), (D 2), or (D 3) is satisfied. We will then find that, for a representation of $\mathcal{G}$ in $\mathcal{A}$, the properties which have been derived from AG 1 in §1.1 automatically hold.

Th. 1.2.1. If, according to (D 2), $\mu(x, ay)$ is continuous at $e \in \mathcal{G}$ ($e$ the unit element of $\mathcal{G}$) as a function on $\mathcal{G}$ for each pair $(x, y) \in \mathcal{B} \times \mathcal{D}$, then the mapping of $\mathcal{G} \to \mathcal{B}$ defined by $a \to a'x$ is norm-continuous.

Proof. Let $x \in \mathcal{B}$ be fixed. From $|\mu(a'x - x, y)| = |\mu(x, ay) - \mu(x, y)| < \varepsilon$ if $a$ is in a suitable neighborhood of the unit element, it follows that the map $a \to a'x$ is $\sigma(\mathcal{B}, \mathcal{D})$-continuous at $e$. Since $\mu(a'x - b'x, y) = \mu(b'^{-1}a'x - x, by)$ it follows that the map $a \to a'x$ is $\sigma(\mathcal{B}, \mathcal{D})$-continuous in all of $\mathcal{G}$. We now consider the set $\mathcal{G}x$ (the set of all $a'x$ with $a \in \mathcal{G}$) as a subset of $\mathcal{B}$ with the topologies induced by the norm and by $\sigma(\mathcal{B}, \mathcal{D})$. Since $\mathcal{B}$ is norm-separable there exists a denumerable subset $\{a_\nu\} \subset \mathcal{G}$ for which $\{a_\nu'x\}$ is norm-dense in $\mathcal{G}x$. We define the spheres $K_\nu^\varepsilon$ in $\mathcal{G}x$:

$$K_\nu^\varepsilon = \{x' \mid x' \in \mathcal{G}x,\ \|x' - a_\nu'x\| \le \varepsilon\}.$$

Let $M_\nu^\varepsilon$ denote the inverse image of $K_\nu^\varepsilon$ under $a \to a'x$, that is,

$$M_\nu^\varepsilon = \{a \mid a \in \mathcal{G},\ a'x \in K_\nu^\varepsilon\}.$$

Since the $a_\nu'x$ are norm-dense in $\mathcal{G}x$ we obtain $\bigcup_\nu K_\nu^\varepsilon = \mathcal{G}x$, from which it follows that $\bigcup_\nu M_\nu^\varepsilon = \mathcal{G}$. The set $\mathcal{G}x$ may be considered to be a subset of $\mathcal{D}'$, since the norm of $\mathcal{D}'$ on $\mathcal{B}$ agrees with that of $\mathcal{B}$ (see proof of V, Th. 4.1.6). $K_\nu^\varepsilon$ can be considered to be the intersection of spheres in $\mathcal{D}'$ with $\mathcal{G}x$. Since the unit sphere of $\mathcal{D}'$ is $\sigma(\mathcal{D}', \mathcal{D})$-compact and therefore closed, $K_\nu^\varepsilon$ is $\sigma(\mathcal{D}', \mathcal{D})$-closed in $\mathcal{G}x$, that is, $K_\nu^\varepsilon$ is $\sigma(\mathcal{B}, \mathcal{D})$-closed in $\mathcal{G}x$.
Since the map $a \to a'x$ is $\sigma(\mathcal{B}, \mathcal{D})$-continuous, the inverse image $M_\nu^\varepsilon$ is closed in $\mathcal{G}$. Since $\mathcal{G}$ is a Baire space (see AII, §3) it follows from $\bigcup_\nu M_\nu^\varepsilon = \mathcal{G}$ that there exists a set $M_{\nu_0}^\varepsilon$ which contains an open subset of $\mathcal{G}$. Therefore there exist an element $a \in M_{\nu_0}^\varepsilon$ and a neighborhood $U(a)$ of $a$ for which $U(a) \subset M_{\nu_0}^\varepsilon$, that is, for all $b \in U(a)$ we obtain $a'x \in K_{\nu_0}^\varepsilon$ and $b'x \in K_{\nu_0}^\varepsilon$ and therefore $\|b'x - a'x\| \le 2\varepsilon$. $U(a)a^{-1}$ is a neighborhood $V$ of the unit element. Since the norm is preserved by mixture automorphisms, for all $c \in V$ we obtain $\|c'x - x\| \le 2\varepsilon$, whereby we have proven the norm-continuity of $c \to c'x$ at the point $e \in \mathcal{G}$. Thus we easily obtain the norm-continuity in all of $\mathcal{G}$.

Th. 1.2.2. If $a \to a'x$ is norm-continuous for all $x \in \mathcal{B}$, then $a \to ay$ is $\sigma(\mathcal{B}', \mathcal{B})$-continuous for all $y \in \mathcal{B}'$.

The proof follows directly from

$$|\mu(x, ay - by)| = |\mu(a'x - b'x, y)| \le \|a'x - b'x\|\,\|y\|.$$

Th. 1.2.2 says that the map $a \to \mu(x, ay)$ is continuous for each pair $(x, y) \in \mathcal{B} \times \mathcal{B}'$ on all of $\mathcal{G}$.
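The step from continuity at the unit element to continuity on the whole group, used at the end of the proof of Th. 1.2.1 and again in Th. 1.2.2, can be written out as follows. This is a sketch that uses only two facts stated in the text: adjoints reverse products, so that $(ab^{-1})' = b'^{-1}a'$, and every $b'$ preserves the norm of $\mathcal{B}$.

```latex
% (ab^{-1})' = (b^{-1})' a' = b'^{-1} a', and b' is norm-preserving on \mathcal{B}.
\begin{align*}
\|a'x - b'x\| &= \bigl\|\,b'\bigl(b'^{-1}a'x - x\bigr)\bigr\|
               = \bigl\|(ab^{-1})'x - x\bigr\| , \\
|\mu(x, ay - by)| &= |\mu(a'x - b'x, y)|
               \le \bigl\|(ab^{-1})'x - x\bigr\|\,\|y\| .
\end{align*}
```

Both right-hand sides depend on $a$ and $b$ only through $ab^{-1}$. An estimate valid for all $c = ab^{-1}$ in a neighborhood $V$ of the unit element is therefore an estimate valid for all pairs $(a, b)$ with $ab^{-1} \in V$.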
The following theorem is a corollary of Th. 1.2.1 and Th. 1.2.2:

Th. 1.2.3. If $a \to \mu(x, ay)$ is continuous for all $(x, y) \in \mathcal{B} \times \mathcal{D}$, it follows that it is continuous for all $(x, y) \in \mathcal{B} \times \mathcal{B}'$.

In the same manner we may prove the following theorems:

Th. 1.2.4. If $a \to \mu(x, ay)$ is continuous for all $(x, y) \in \mathcal{B} \times \mathcal{D}$, it follows that $a \to ay$ is norm-continuous for all $y \in \mathcal{D}$.

Th. 1.2.5. If $a \to ay$ is norm-continuous for all $y \in \mathcal{D}$, it follows that $a \to a'x$ is $\sigma(\mathcal{D}', \mathcal{D})$-continuous for all $x \in \mathcal{D}'$.

Th. 1.2.6. If $a \to \mu(x, ay)$ is continuous for all $(x, y) \in \mathcal{B} \times \mathcal{D}$, then it is continuous for all $(x, y) \in \mathcal{D}' \times \mathcal{D}$.

In all these theorems only (D 1) and (D 2) are required. According to (D 3), $\mu(x, ay)$ is $ph$-uniformly continuous for all $(x, y) \in \mathcal{B} \times \mathcal{D}$. In general we should expect that $\mu(x, ay)$ is neither $ph$-continuous for all $(x, y) \in \mathcal{D}' \times \mathcal{D}$ nor for all $(x, y) \in \mathcal{B} \times \mathcal{B}'$.

Therefore there exists a complete symmetry with respect to the representation of groups between the spaces $\mathcal{B}$ and $\mathcal{D}$ and their corresponding extensions $\mathcal{B} \subset \mathcal{D}'$ and $\mathcal{D}$, or $\mathcal{B}$ and $\mathcal{B}' \supset \mathcal{D}$. Previously we have generally not made use of the dual pair $\mathcal{D}'$, $\mathcal{D}$ together with $\mathcal{B} \subset \mathcal{D}'$ as mathematical tools for the representation of physical problems of quantum mechanics. It is, however, possible that there exist new methods for the mathematical treatment of many problems using the methods of C*-algebras if $\mathcal{D}$ is a C*-algebra.

We will now examine the possible consequences of (D 3). In order to provide a comprehensive formulation of the following processes we now introduce the following spaces $\Delta_n$, $\Delta_{ph}$ and $\Delta$:

D 1.2.4. $\Delta_n$ is the set of all $y \in \mathcal{B}'$ for which the map $a \to ay$ of $\mathcal{G}$ into $\mathcal{B}'$ is norm-continuous. $\Delta_{ph}$ is the set of all $y \in \mathcal{B}'$ for which the maps $a \to \mu(x, ay)$ are $ph$-uniformly continuous for all $x \in \mathcal{B}$. $\Delta = \Delta_{ph} \cap \Delta_n$.

According to (D 3) and Th. 1.2.5 it follows that $\mathcal{D} \subset \Delta \subset \mathcal{B}'$.

Th. 1.2.7. $\Delta_n$, $\Delta_{ph}$ and $\Delta$ are norm-closed subspaces of $\mathcal{B}'$ which are invariant with respect to $\mathcal{G}$.

Proof. The fact that $\Delta_{ph}$ is a subspace of $\mathcal{B}'$ is clear.
Let $y \in \Delta_{ph}$ and let $c \in \mathcal{G}$; then $\mu(x, acy)$ is, as a function of $a$, $ph$-uniformly continuous since both maps $a \to ac \to \mu(x, acy)$ are $ph$-uniformly continuous; therefore $cy \in \Delta_{ph}$. If $y_\nu \to y$ is a norm-convergent sequence for which $y_\nu \in \Delta_{ph}$, then

$$|\mu(x, ay - by)| \le |\mu(x, a(y - y_\nu))| + |\mu(x, ay_\nu - by_\nu)| + |\mu(x, b(y_\nu - y))| \le 2\|x\|\,\|y - y_\nu\| + |\mu(x, ay_\nu - by_\nu)|,$$

from which it follows that $\mu(x, ay)$ is $ph$-uniformly continuous.
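The closure argument just given is a standard three-term estimate. A minimal sketch of how the two choices are made, in the notation of the text (the symbol $N$ for an entourage of the uniform structure $ph$ is introduced here only for illustration):

```latex
% Given \varepsilon > 0; N denotes an entourage of ph (notation assumed for this sketch).
\begin{align*}
&\text{choose } \nu \text{ such that } \|x\|\,\|y - y_\nu\| < \varepsilon/3 ; \\
&\text{choose } N \text{ such that } |\mu(x, ay_\nu - by_\nu)| < \varepsilon/3
   \text{ for all } (a,b) \in N ; \\
&\text{then } |\mu(x, ay - by)| \le 2\|x\|\,\|y - y_\nu\| + |\mu(x, ay_\nu - by_\nu)| < \varepsilon
   \text{ for all } (a,b) \in N .
\end{align*}
```

The first choice does not involve $a$ or $b$, because every $a \in \mathcal{G}$ acts on $\mathcal{B}'$ without increasing the norm. This is what makes the limit $y$ inherit the uniform estimate from the $y_\nu$.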
For $\Delta_n$ the same result follows more easily from the continuity of the maps $a \to ac \to (ac)y$ and from

$$\|ay - by\| \le \|a(y - y_\nu)\| + \|ay_\nu - by_\nu\| + \|b(y_\nu - y)\| \le 2\|y - y_\nu\| + \|ay_\nu - by_\nu\|.$$

By analogy to D 1.2.4 we define:

D 1.2.5. Let $\Sigma_n$ be the set of all $x \in \mathcal{D}'$ for which the mapping $a \to a'x$ of $\mathcal{G}$ into $\mathcal{D}'$ is norm-continuous. Let $\Sigma_{ph}$ denote the set of all $x \in \mathcal{D}'$ for which the maps $a \to \mu(x, ay) = \mu(a'x, y)$ are $ph$-continuous for all $y \in \mathcal{D}$. Let $\Sigma = \Sigma_n \cap \Sigma_{ph}$.

According to (D 3) and Th. 1.2.1, $\mathcal{B} \subset \Sigma \subset \mathcal{D}'$. By analogy with Th. 1.2.7 it follows that:

Th. 1.2.8. $\Sigma_n$, $\Sigma_{ph}$ and $\Sigma$ are norm-closed subspaces of $\mathcal{D}'$ which are invariant relative to $\mathcal{G}$.

Since $\mathcal{D}$ has not previously been restricted by means of axioms, we may now reverse the sequence of ideas presented above as follows: We begin with a topologically complete group $\mathcal{G}$ and a representation for which the map $\mathcal{G} \to \mathcal{G}x$, $a \to a'x$, is norm-continuous for all $x \in \mathcal{B}$. The “physical” uniform structure $ph$ on $\mathcal{G}$ and the space $\mathcal{D}$ have previously not been specified. We shall now define topologies on $\mathcal{G}$ by means of its representations as follows:

D 1.2.6. We shall call the initial topology on $\mathcal{G}$ for which the maps $\mathcal{G} \to \mathcal{B}$ defined by $a \to a'x$ are continuous for all $x \in \mathcal{B}$ (with respect to the norm topology in $\mathcal{B}$) the $(\mathcal{B})$-topology.

The original topology in $\mathcal{G}$ is therefore finer than the $(\mathcal{B})$-topology. If the representation of $\mathcal{G}$ is not true, then there exists an invariant subgroup $\mathcal{H}$ of $\mathcal{G}$ which, in the representation of $\mathcal{G}$, is mapped onto the identity. The representation of the factor group $\mathcal{G}/\mathcal{H}$ is then true. The fact that the elements of $\mathcal{H}$ behave like the identity with respect to quantum mechanics has the physical meaning that the transformations in $\mathcal{H}$ are possibly meaningful outside the domain of microsystems, but in the domain of microsystems are equivalent to the identity. Thus for microsystems only the group $\mathcal{G}/\mathcal{H}$ is physically meaningful.
From this point on we shall assume that the representation of $\mathcal{G}$ is true (this assertion we may not change, because in certain “subdomains” of the domain of microsystems nontrue representations may occur! See VII, end of §2). From the above arguments it seems reasonable, on physical grounds, to identify the topology of the group $\mathcal{G}$ with the $(\mathcal{B})$-topology. First, we must determine whether $\mathcal{G}$ together with the $(\mathcal{B})$-topology is always a topological group.
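Before turning to that question, it may help to write the $(\mathcal{B})$-topology of D 1.2.6 in terms of neighborhoods. The following is the standard description of an initial topology, sketched in the notation of the text; the symbol $U(x_1, \ldots, x_n; \varepsilon)$ is introduced here only for illustration.

```latex
% Basic neighborhoods of the unit element e (recall e'x = x):
\[
U(x_1,\ldots,x_n;\varepsilon)
  = \{\, a \in \mathcal{G} \;:\; \|a'x_i - x_i\| < \varepsilon,\ i = 1,\ldots,n \,\},
  \qquad x_i \in \mathcal{B},\ \varepsilon > 0 .
\]
% Basic neighborhoods of an arbitrary element b:
\[
\{\, a \in \mathcal{G} \;:\; \|a'x_i - b'x_i\| < \varepsilon,\ i = 1,\ldots,n \,\} .
\]
```

In this form the question of whether $\mathcal{G}$ is a topological group becomes a question about norm estimates for $\|a'b'x - x\|$, $\|a'^{-1}x - x\|$ and $\|a'^{-1}b'a'x - x\|$ in terms of $\|a'x - x\|$ and $\|b'x - x\|$.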
Th. 1.2.9. $\mathcal{G}$ together with the $(\mathcal{B})$-topology is a topological group (because the representation of $\mathcal{G}$ was assumed to be true, the $(\mathcal{B})$-topology is separating). $\mathcal{G}$ is separable and metrizable in the $(\mathcal{B})$-topology.

Proof. Since $a'$ preserves the norm, TG 1 follows from

$$\|a'b'x - x\| \le \|a'b'x - a'x\| + \|a'x - x\| = \|a'(b'x - x)\| + \|a'x - x\| = \|b'x - x\| + \|a'x - x\|.$$

TG 2 follows from

$$\|a'^{-1}x - x\| = \|a'(a'^{-1}x - x)\| = \|x - a'x\|.$$

TG 3 follows from the fact that $\|b'x_1 - x_1\| < \varepsilon$ is equivalent to $\|a'^{-1}b'a'x_2 - x_2\| < \varepsilon$ for $x_2 = a'^{-1}x_1$, since

$$\|a'^{-1}b'a'x_2 - x_2\| = \|b'a'x_2 - a'x_2\| = \|b'x_1 - x_1\|.$$

Since $\mathcal{G}$ was assumed to be separable in the original topology, $\mathcal{G}$ is separable in the $(\mathcal{B})$-topology. Every subset $A \subset \mathcal{B}$ for which the closed subspace spanned by $A$ is equal to $\mathcal{B}$ generates the same initial topology as does $\mathcal{B}$. This result follows easily for linear combinations of elements of $A$ and can be proven using the inequalities:

$$\|a_1'x - a_2'x\| \le \|a_1'x - a_1'\tilde{x}\| + \|a_1'\tilde{x} - a_2'\tilde{x}\| + \|a_2'\tilde{x} - a_2'x\| = 2\|x - \tilde{x}\| + \|a_1'\tilde{x} - a_2'\tilde{x}\|.$$

Since $\mathcal{B}$ is separable, there exists a denumerable set $A$ and therefore a denumerable neighborhood basis of the unit element in $\mathcal{G}$.

For a subset $\Lambda \subset \mathcal{B}'$ let the norm-closed subspace of $\mathcal{B}'$ spanned by $\Lambda$ be denoted by $\mathcal{D}_\Lambda$.

D 1.2.7. Let $\Lambda \subset \mathcal{B}'$ be a subset of $\mathcal{B}'$ which satisfies the condition $\mathcal{G}\Lambda \subset \mathcal{D}_\Lambda$. We shall call the initial uniform structure for which the maps $\mathcal{G} \to \mathcal{B}'$ defined by $a \to ay$ and $\mathcal{G} \to \mathcal{B}'$ defined by $a \to a^{-1}y$ are uniformly continuous for all $y \in \Lambda$ with respect to the $\sigma(\mathcal{B}', \mathcal{B})$-topology in $\mathcal{B}'$ the $\Lambda$-uniform structure on $\mathcal{G}$. We shall call the topology determined by the $\Lambda$-uniform structure the $\Lambda$-topology on $\mathcal{G}$.

It is easy to see that, for all $y \in \mathcal{D}_\Lambda$, the maps $a \to ay$ and $a \to a^{-1}y$ are uniformly continuous. Since $\|ay - a\tilde{y}\| = \|y - \tilde{y}\|$ it follows from $\mathcal{G}\Lambda \subset \mathcal{D}_\Lambda$ that $\mathcal{G}\mathcal{D}_\Lambda = \mathcal{D}_\Lambda$, that is, $\mathcal{D}_\Lambda$ is an invariant subspace of $\mathcal{B}'$. Thus it easily follows that $ay = y$ for all $y \in \mathcal{D}_\Lambda$ is equivalent to $ay = y$ for all $y \in \Lambda$.
A representation will be said to be $\Lambda$-true if from $ay = y$ for all $y \in \Lambda$ it follows that $a = e$.
Th. 1.2.10. The maps of $\mathcal{G}$ onto itself given by $a \to a^{-1}$, $a \to ab$ and $a \to ba$ (for fixed $b$) are uniformly continuous with respect to the $\Lambda$-uniform structure.

Proof. Since the composite maps $a \to a^{-1} \to a^{-1}y$ and $a \to a^{-1} \to (a^{-1})^{-1}y = ay$, that is, the maps $a \to a^{-1}y$ and $a \to ay$, are uniformly continuous, $a \to a^{-1}$ is uniformly continuous. If $a \to ab \to aby$ and $a \to ab \to (ab)^{-1}y = b^{-1}a^{-1}y$ are (for fixed $b$) uniformly continuous, it follows that $a \to ab$ is also uniformly continuous. The uniform continuity of $a \to aby$ is obtained with $\tilde{y} = by$ from $a \to aby = a\tilde{y}$, since $\tilde{y} \in \mathcal{D}_\Lambda$. The uniform continuity of $a \to b^{-1}a^{-1}y$ follows from

$$\mu(x, b^{-1}a^{-1}y) = \mu(b'^{-1}x, a^{-1}y) = \mu(\tilde{x}, a^{-1}y),$$

where $\tilde{x} = b'^{-1}x \in \mathcal{B}$, and from the uniform continuity of $a \to a^{-1}y$. In the same way we obtain the uniform continuity of $a \to ba$ from $a \to ba \to bay$ and $a \to ba \to a^{-1}b^{-1}y$, with $\tilde{y} = b^{-1}y$ and $\tilde{x} = b'x$.

Th. 1.2.11. The $(\mathcal{B})$-topology on $\mathcal{G}$ is finer than the $\Lambda$-topology. $\mathcal{G}$ is therefore also separable in the $\Lambda$-topology.

Proof. Let $\mathcal{G}$, together with the $(\mathcal{B})$-topology (or $\Lambda$-topology), be denoted by $\mathcal{G}_{\mathcal{B}}$ (or $\mathcal{G}_\Lambda$, respectively). We must now show that the identity map $\mathcal{G}_{\mathcal{B}} \to \mathcal{G}_\Lambda$ is continuous. $\mathcal{G}_{\mathcal{B}} \to \mathcal{G}_\Lambda$ is continuous provided that the composite maps $\mathcal{G}_{\mathcal{B}} \to \mathcal{G}_\Lambda \to \mathcal{B}'$ defined by $a \to ay$ and by $a \to a^{-1}y$ are continuous. This follows, however, from

$$|\mu(x, a_1y - a_2y)| = |\mu(a_1'x - a_2'x, y)| \le \|a_1'x - a_2'x\|\,\|y\|$$

and Th. 1.2.9.

Th. 1.2.12. The set $\Delta_n \cap L$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$ and the set $\Delta_n$ is therefore $\sigma(\mathcal{B}', \mathcal{B})$-dense in $\mathcal{B}'$.

Proof. For the $P_\varphi$ for which $\varphi \in \bigcup_\nu \mathcal{H}_\nu$ we find that, in the norm of $\mathcal{B}$, $\|a'P_\varphi - P_\varphi\| < \varepsilon$ whenever $a$ is in a suitable neighborhood $U$ (either in the original topology or in the $(\mathcal{B})$-topology) of the unit element. Since $a'P_\varphi = P_{\tilde{\varphi}} = a^{-1}P_\varphi$ (see V, Th. 5.13), denoting the norms in $\mathcal{B}$ and $\mathcal{B}'$ by $\|\cdots\|_{\mathcal{B}}$ and $\|\cdots\|_{\mathcal{B}'}$ respectively, from the relation

$$\|P_\varphi - P_{\tilde{\varphi}}\|_{\mathcal{B}'} \le 2\|P_\varphi - P_{\tilde{\varphi}}\|_{\mathcal{B}}$$

we obtain $\|a^{-1}P_\varphi - P_\varphi\|_{\mathcal{B}'} < 2\varepsilon$ for $a \in U$. Therefore for $y = P_\varphi \in L$ we find that the map $a \to ay$ is norm-continuous (since $a \to a^{-1}$ is continuous).
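Therefore the map $a \to ay$ is norm-continuous for all finite linear combinations of elements of the form $P_\varphi$. The set of all such finite linear combinations is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$. Since $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-separable there exist denumerably many $y_\nu \in \Delta_n \cap L$ which are $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$. Since $\mathcal{G}_{\mathcal{B}}$ is separable, there exist denumerably many $b_\mu$ (the unit element of $\mathcal{G}$ may be among the $b_\mu$) which are dense in the $(\mathcal{B})$-topology in $\mathcal{G}$. The set $\Lambda$ of all $b_\mu y_\nu$ is then denumerable, and we find that $\Lambda \subset L$ and $\Lambda$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$. Here we realize that the set of the $y_\nu$ does not need to be $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$ if only the set of the $b_\mu y_\nu$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$. Since $y_\nu \in \Delta_n$ we find that $\|by_\nu - b_\mu y_\nu\| < \varepsilon$ for a $b \in \mathcal{G}$ and for $b_\mu$ sufficiently close to $b$. Therefore $\mathcal{G}\Lambda$ is contained in the norm-closure of $\Lambda$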
and therefore we find that $\mathcal{G}\Lambda \subset \mathcal{D}_\Lambda$. $\Lambda$ therefore satisfies the conditions in D 1.2.7, is denumerable, and is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L$. Thus we find that $\mathcal{D}_\Lambda$ is norm-separable.

We may therefore use the space $\mathcal{D}_\Lambda$ as the space $\mathcal{D}$ of the theory if the “physical” uniform structure $ph$ is finer than the $\Lambda$-uniform structure. If the $\Lambda$-topology is equal to the original topology on the group $\mathcal{G}$, then we may identify the physical uniform structure with the $\Lambda$-uniform structure, that is, by the choice of the set $\Lambda$ we fix the choice of the physical uniform structure $ph$. In this way we recognize the close relationship between the designation of the space $\mathcal{D}$ and the designation of the uniform structure $ph$ of “physical imprecision” on $\mathcal{G}$.

In practice it is often easy to show that the $\Lambda$-topology is identical to the original topology on $\mathcal{G}$. In applications $\mathcal{G}$ is at most a locally compact group. If we can find a $\Lambda$-neighborhood of the identity which is compact in the original topology, then the original topology and the weaker $\Lambda$-topology coincide in this neighborhood, and therefore have the same neighborhood system of the identity. The proof that there is such a $\Lambda$-neighborhood is very simple for the case of Lie groups (see VII, §8). In the following we will always assume that there exists a set $\Lambda$ such that the $\Lambda$-topology coincides with the original topology. Then the $(\mathcal{B})$-topology also coincides with the original topology.

With this result we note that we have not yet solved the problem of the space $\mathcal{D}$. For example, we may set $\mathcal{D} = \mathcal{D}_\Lambda$, but we may yet find that it is possible to have different sets $\Lambda$ and also different spaces $\mathcal{D}_\Lambda$ which result in the same $\Lambda$-topology on $\mathcal{G}$. If, for example, $\mathcal{G}$ is compact, then, for all possible sets $\Lambda$, the $\Lambda$-topologies will coincide with the original topology on $\mathcal{G}$ and $\Delta$ is equal to $\Delta_n$.
This is not so surprising: If we needed to consider only compact groups in physics, then we would always be able to choose finite-dimensional Hilbert spaces $\mathcal{H}_\nu$ (see VII, §3 and AV, §10) and the problem of the selection of a space $\mathcal{D}$ would be nonexistent.

After the construction (with the physical uniform structure $ph$ being the $\Lambda$-uniform structure) we obtain $\Delta \supset \mathcal{D}_\Lambda$. The conditions under which $\Delta = \mathcal{D}_\Lambda$ have not yet been investigated. The introduction of the $\Lambda$-uniform structure in D 1.2.7 now appears to be unsymmetric with respect to $\mathcal{B}$ and $\mathcal{D} = \mathcal{D}_\Lambda$. This is, however, not the case, because the $\Lambda$-uniform structure is also the initial structure for which the following maps $\mathcal{G} \to \mathbf{R}$ of the form

$$a \to \mu(x, ay) = \mu(a'x, y), \qquad a \to \mu(x, a^{-1}y) = \mu(a'^{-1}x, y)$$

are uniformly continuous for all $x \in \mathcal{B}$ and $y \in \mathcal{D} = \mathcal{D}_\Lambda$, that is, for which all maps of the form $a \to a'x$ and $a \to a'^{-1}x$ are uniformly continuous with respect to the $\sigma(\mathcal{B}, \mathcal{D})$-topology in $\mathcal{B}$.
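Therefore the above considerations are symmetric between $\mathcal{B}$ and $\mathcal{D}$. In particular, the question arises whether and under what conditions it is possible that $\Sigma = \mathcal{B}$ (where $\Sigma$ is defined by D 1.2.5).

1.3 Topologies on the Group $\mathcal{A}$

Since the elements of $\mathcal{A}$ are maps, there are innumerable possibilities for the introduction of a topology on $\mathcal{A}$. We shall select three possibilities; these have already been encountered in §1.2.

To each $T \in \mathcal{A}$ there is a corresponding map $\mathcal{B} \times \mathcal{D} \to \mathbf{R}$ defined by $(x, y) \to \mu(x, Ty)$. A separating uniform structure on $\mathcal{A}$ is defined by means of the uniform structure of normal convergence on $\mathcal{B} \times \mathcal{D}$ (that is, the initial structure for all the maps $\mathcal{A} \to \mathbf{R}$ defined by $T \to \mu(x, Ty)$); we shall let $\mathcal{A}_{(\mathcal{B},\mathcal{D})}$ denote $\mathcal{A}$ together with this uniform structure. In the same way $\mu(x, Ty)$ determines a mapping $\mathcal{B} \times \mathcal{B}' \to \mathbf{R}$ which, by analogy, determines $\mathcal{A}_{(\mathcal{B},\mathcal{B}')}$.

A mapping $\mathcal{B} \to \mathcal{B}$ is defined by means of the adjoint map $T'$ as follows: $x \to T'x$. If we use the norm topology of $\mathcal{B}$ in the image set, then a separating uniform structure on $\mathcal{A}$ is defined by means of the normal convergence of this map. We shall let $\mathcal{A}_{(\mathcal{B})}$ denote $\mathcal{A}$ endowed with this uniform structure.

It is easy to see that $\mathcal{A}_{(\mathcal{B})}$ is finer than $\mathcal{A}_{(\mathcal{B},\mathcal{B}')}$ and that $\mathcal{A}_{(\mathcal{B},\mathcal{B}')}$ is finer than $\mathcal{A}_{(\mathcal{B},\mathcal{D})}$.

We may now express (D 2) and (D 3) as follows: the representation map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B},\mathcal{D})}$ is continuous, or, respectively, the map $\mathcal{G}_{ph} \to \mathcal{A}_{(\mathcal{B},\mathcal{D})}$ is uniformly continuous. From (D 3) it does not follow that the map $\mathcal{G}_{ph} \to \mathcal{A}_{(\mathcal{B},\mathcal{D})}$ can be extended onto all of $\hat{\mathcal{G}}_{ph}$, since $\mathcal{A}_{(\mathcal{B},\mathcal{D})}$ is, in general, not complete. From the uniform continuity of $\mathcal{G}_{ph} \to \mathcal{A}_{(\mathcal{B},\mathcal{D})}$ it follows in a trivial manner that the map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B},\mathcal{D})}$ is continuous.

Th. 1.2.1 states that from the continuity of $\mathcal{G} \to \mathcal{A}_{(\mathcal{B},\mathcal{D})}$ it follows that the map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ is continuous. Th. 1.2.3 is then only a trivial corollary, because the map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B},\mathcal{B}')}$ is then continuous.

From the proof of Th. 1.2.10 it follows that in the special case in which we choose $\mathcal{G} = \mathcal{A}_{(\mathcal{B})}$ we obtain the first part of the following theorem:

Th. 1.3.1.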
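$\mathcal{A}_{(\mathcal{B})}$ is a topological group and is metrizable. $\mathcal{A}_{(\mathcal{B})}$ is separable. $\mathcal{A}_{(\mathcal{B})}$ is complete.

Proof. In order to show that $\mathcal{A}_{(\mathcal{B})}$ is separable, choose a denumerable set $\{x_\nu\}$ which is norm-dense in $\mathcal{B}$ and, for each $\nu$, choose a denumerable set $T_{\nu\lambda} \in \mathcal{A}$ for which $\{T_{\nu\lambda}'x_\nu\}$ is norm-dense in $\mathcal{A}x_\nu$. Then $\{T_{\nu\lambda}\}$ is a dense set in $\mathcal{A}_{(\mathcal{B})}$, which follows directly from

$$\|(T' - T_{\nu\lambda}')x\| \le \|(T' - T_{\nu\lambda}')x_\nu\| + \|(T' - T_{\nu\lambda}')(x - x_\nu)\| \le \|(T' - T_{\nu\lambda}')x_\nu\| + 2\|x - x_\nu\|$$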
(choose $\nu$ such that $\|x - x_\nu\| < \varepsilon/4$, and then choose $\lambda$ such that $\|(T' - T_{\nu\lambda}')x_\nu\| < \varepsilon/2$).

The fact that $\mathcal{A}_{(\mathcal{B})}$ is complete follows from the general theorem that a sequence $T_\nu$ for which $T_\nu'x$ is a Cauchy sequence for each $x$ converges towards a $T \in \mathcal{A}$ (from $T_\nu' \to T'$, where $T'$ is a norm-continuous map of $\mathcal{B}$ into itself, it follows that if the $T_\nu'$ are positive, then $T'$ is positive, and from $\|T_\nu'\| = 1$ it also follows that $\|T'\| = 1$).

If, in addition to the requirements that $\mathcal{G}$ be separable and metrizable, we also require that it be locally compact, then with Th. 1.3.1 the following important mathematical theorem may be proven: If we endow $\mathcal{G}$ and $\mathcal{A}_{(\mathcal{B})}$ with the Borel structure generated by the open sets and if the map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ is measurable with respect to this structure, then the map $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ is also continuous (for proof see [10]). This theorem therefore shows that if $\mathcal{G}$ is locally compact then we may start with a much weaker requirement than that $\mathcal{G} \to \mathcal{A}_{(\mathcal{B})}$ is continuous. All groups which occur in quantum mechanics are locally compact.

In V we have seen that the group $\mathcal{A}$ is “physically too large.” Only the subgroup $\mathcal{A}_{\mathcal{D}}$ can be physically meaningful, where $\mathcal{A}_{\mathcal{D}}$ is the set of all $T$ in $\mathcal{A}$ for which $T\mathcal{D} \subset \mathcal{D}$. Therefore $\mathcal{A}_{\mathcal{D}}^{(\mathcal{B})}$ (that is, $\mathcal{A}_{\mathcal{D}}$ with the $(\mathcal{B})$-topology) is a separable metrizable topological group.

Th. 1.3.2. The maps $T \to Ty$ for $y \in \mathcal{D}$ of $\mathcal{A}_{\mathcal{D}}^{(\mathcal{B})}$ into $\mathcal{D}$ are norm-continuous.

Proof. This result follows directly from Th. 1.2.5, in which we replace $\mathcal{G}$ by $\mathcal{A}_{\mathcal{D}}^{(\mathcal{B})}$.

On $\mathcal{A}_{\mathcal{D}}$ we may introduce the topology of normal convergence of the maps $\mathcal{A}_{\mathcal{D}} \to \mathcal{D}$ (with the norm topology in $\mathcal{D}$); this topology will be called the $(\mathcal{D})$-topology and $\mathcal{A}_{\mathcal{D}}$ together with this topology will be denoted by $\mathcal{A}_{\mathcal{D}}^{(\mathcal{D})}$. From Th. 1.3.2 it follows that on $\mathcal{A}_{\mathcal{D}}$ the $(\mathcal{B})$-topology is finer than the $(\mathcal{D})$-topology. The results which we have obtained for the space $\mathcal{B}$ may also be obtained for the space $\mathcal{D}$, and we obtain:

Th. 1.3.3. The maps of $\mathcal{A}_{\mathcal{D}}^{(\mathcal{D})}$ into $\mathcal{B}$ given by $T \to T'x$ for $x \in \mathcal{B}$ are norm-continuous.

From Th. 1.3.1, Th. 1.3.2, and Th.
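1.3.3 it easily follows that:

Th. 1.3.4. On the topological group $\mathcal{A}_{\mathcal{D}}$ the $(\mathcal{D})$-topology is equal to the $(\mathcal{B})$-topology.

The representations of a group $\mathcal{G}$ which are of interest to us are therefore the continuous representation maps $\mathcal{G} \to \mathcal{A}_{\mathcal{D}}^{(\mathcal{B})}$ which are (according to (D 3)) uniformly continuous as maps $\mathcal{G}_{ph} \to \mathcal{A}_{(\mathcal{B},\mathcal{D})}$.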
1.4 The Representation of $\mathcal{G}$ in Phase Space $\Gamma$

The properties of group representations in phase space $\Gamma$ are seldom investigated in quantum mechanics. Since this topic is of importance in understanding the relationship between quantum mechanics and classical mechanics we shall present a brief description of the fundamentals of the phase space representations.

The $\sigma(\mathcal{D}', \mathcal{D})$-closure $K_\sigma$ of $K$ in $\mathcal{D}'$ is $\sigma(\mathcal{D}', \mathcal{D})$-compact. Therefore, the convex set $K_\sigma$ is, according to the Krein-Milman theorem, generated by the set $\partial_e K_\sigma$ of its extreme points.

D 1.4.1. We shall call the set $\partial_e K_\sigma$ together with the uniform structure generated by the $\sigma(\mathcal{D}', \mathcal{D})$-topology the phase space (which we denote by $\Gamma$).

$\Gamma$ is precompact as a subset of $K_\sigma$. Since $\mathcal{D}'$ is separable and metrizable in the $\sigma(\mathcal{D}', \mathcal{D})$-topology, $\Gamma$ is separable and metrizable, and its points describe in a physically meaningful way (see [1], §9) the “idealized” preparation possibilities for the systems in $M$.

Since the elements of $\mathcal{G}$ can also be considered to be mixture automorphisms of $K_\sigma$, $\Gamma$ is $\mathcal{G}$-invariant. According to Th. 1.2.6, for $a \in \mathcal{G}$, $\gamma \in \Gamma$ the map $a \to a'\gamma$ is continuous. In addition, we find that:

Th. 1.4.1. The mapping of $\mathcal{G} \times \Gamma \to \Gamma$ defined by $(a, \gamma) \to a'\gamma$ is continuous.

Proof. For fixed $\tilde{a} \in \mathcal{G}$ and fixed $\tilde{\gamma} \in \Gamma$ and for $y \in \mathcal{D}$ we obtain

$$|\mu(a'\gamma - \tilde{a}'\tilde{\gamma}, y)| \le |\mu(a'\gamma - \tilde{a}'\gamma, y)| + |\mu(\tilde{a}'\gamma - \tilde{a}'\tilde{\gamma}, y)| = |\mu(\gamma, ay - \tilde{a}y)| + |\mu(\gamma - \tilde{\gamma}, \tilde{a}y)| \le \|ay - \tilde{a}y\| + |\mu(\gamma - \tilde{\gamma}, \tilde{a}y)|,$$

from which it follows, since $a \to ay$ is norm-continuous, that the map $\mathcal{G} \times \Gamma \to \Gamma$ is continuous.

D 1.4.2. We shall call the representation of $\mathcal{G}$ on $\Gamma$ by means of point transformations the associated phase space representation of $\mathcal{G}$ corresponding to the original representation of $\mathcal{G}$.

For quantum mechanics the structure of the associated phase space representation is surprisingly unfamiliar.
In the case of “physical objects” (see III, §4.1 and [1], §12) we are able to determine the phase space $\Gamma$ by means of a particular choice of the group $\mathcal{G}$ (Galileo group or a direct product of Galileo groups) and the uniform structure $ph$ of physical imprecision, by requiring that $\mathcal{D} = \Delta$ (see [21]). Here for $\Gamma$ we obtain the usual $\Gamma$-space of classical mechanics. The axioms in III, §4 and the specification of the group $\mathcal{G}$ permit us, therefore, to deduce the “usual” phase space of classical mechanics in the case of the description of physical objects.
248 VI Representation of Groups An analogous description for quantum mechanics is, up to now, not commonly in use because the set Г = deKa is, at present, mathematically not as accessible. For quantum mechanics there is no pressing necessity to investigate the set deKa as in the case for classical mechanics, because deK is not only nonempty, but it also satisfies (in the norm-closure of В): со 8eK = K. Since 9 is the dual Banach space corresponding to 5£cr where the latter is the subspace of all compact operators of 9' it follows that in quantum mechanics К has the property со 8eK = К (see AIV, §11). 5£cr is norm- separable. It is not difficult to show that $£cr <z A„. In a pure formal way we may choose 9 = $£cr; then we would find that 1 ф 9; however the selection of 9 as the norm-closed subspace of 9 spanned by 1 and i?cr does not appear to be physically meaningful. The above considerations show why deK, that is, the set of all P^, is used as a substitute for the phase space Г. For this reason we must put up with the fact that for a decision scale observable A which has a continuous spectrum there cannot be an element of deK, that is, a P^ for which /г(Р^,(Л — al)2) = 0 (for a in the continuous spectrum). For “physical” decision observables A, that is, for Ae 9 there exists, for each a in the continuous spectrum an element w e deKa for which p(w, (A — al)2) = 0! 2 The ^-invariant Structure Corresponding to a Group Representation As we have seen in V, §5, every ^-continuous effect automorphism is uniquely determined by a _L-automorphism of G and every _L-automorphism of G determines a J^-continuous effect automorphism. Thus the repre¬ sentation of 3 by means of J^-continuous effect isomorphisms is uniquely determined by means of the representation of 3 determined by 1-automor- phisms of G. Of special importance are two subsets of G, first the set of ^-invariant decision effects (see D 1.2.1). Th. 2.1. 
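The set of $\mathcal{G}$-invariant decision effects forms a complete orthocomplemented sublattice of $G$ which is $\sigma(\mathcal{B}', \mathcal{B})$-closed in $G$.

Proof. Follows directly from the fact that each element $a \in \mathcal{G}$ defines a $\perp$-automorphism of $G$ which is $\sigma(\mathcal{B}', \mathcal{B})$-continuous by means of the map $e \to ae$ ($e \in G$).

D 2.1. Let $G(\mathcal{G})$ denote the set of $\mathcal{G}$-invariant decision effects, $L(\mathcal{G})$ denote the set of $\mathcal{G}$-invariant effects and $\mathcal{B}'(\mathcal{G})$ denote the set of $\mathcal{G}$-invariant elements of $\mathcal{B}'$.

Th. 2.2. $L(\mathcal{G}) = \overline{\mathrm{co}}\, G(\mathcal{G})$, where the closure is to be taken with respect to the $\sigma(\mathcal{B}', \mathcal{B})$-topology. $\mathcal{B}'(\mathcal{G})$ is the $\sigma(\mathcal{B}', \mathcal{B})$-closed subspace of $\mathcal{B}'$ which is spanned by $G(\mathcal{G})$.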
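Proof. If $A \subset \mathcal{B}'$ is a set of $\mathcal{G}$-invariant elements in $\mathcal{B}'$, then $\overline{\mathrm{co}}(A)$ and the $\sigma(\mathcal{B}', \mathcal{B})$-closed subspace spanned by $A$ are also sets of $\mathcal{G}$-invariant elements because the elements of $\mathcal{A}$ are $\mathcal{B}$-continuous effect automorphisms. If $g$ is a $\mathcal{G}$-invariant effect, then from the uniqueness of the spectral representation of $g$ it follows that $g \in \overline{\mathrm{co}}\, G(\mathcal{G})$. Thus for a $\mathcal{G}$-invariant $y \in \mathcal{B}'$ it follows that $y$ lies in the $\sigma(\mathcal{B}', \mathcal{B})$-closed subspace spanned by $G(\mathcal{G})$.

Let $\mathcal{B}'(\mathcal{G})^{\perp}$ denote the set of all $x \in \mathcal{B}$ for which $\mu(x, y) = 0$ for all $y \in \mathcal{B}'(\mathcal{G})$. Then $\mathcal{B}/\mathcal{B}'(\mathcal{G})^{\perp}$ is a Banach space (see [33]).

D 2.2. Let $\mathcal{B}(\mathcal{G})$ be an abbreviation for $\mathcal{B}/\mathcal{B}'(\mathcal{G})^{\perp}$.

Th. 2.3. $\mathcal{B}(\mathcal{G})$ is a base-norm space and $\mathcal{B}'(\mathcal{G})$ may be identified with the dual space of $\mathcal{B}(\mathcal{G})$ by means of the map $\langle \tilde{x}, y \rangle = \mu(x, y)$, where $x \in \tilde{x}$ and $y \in \mathcal{B}'(\mathcal{G})$ (here $\langle \tilde{x}, y \rangle$ is the canonical bilinear form for $\mathcal{B}(\mathcal{G})$ and the dual space of $\mathcal{B}(\mathcal{G})$). $\mathcal{B}'(\mathcal{G})$ is an order unit space.

Proof. Since $\mathcal{B}'(\mathcal{G})$ is $\sigma(\mathcal{B}', \mathcal{B})$-closed, $\mathcal{B}'(\mathcal{G})$ can be identified with the dual space of $\mathcal{B}(\mathcal{G})$ (see [33]). $\mathcal{B}(\mathcal{G})$ is a base-norm space if $\mathcal{B}'(\mathcal{G})$ is an order unit space (see AIII, §6 and [33]). The unit sphere of $\mathcal{B}'(\mathcal{G})$ is given by $\mathcal{B}'(\mathcal{G}) \cap (2L - 1)$. Since $1 \in \mathcal{B}'(\mathcal{G})$ and $L \cap \mathcal{B}'(\mathcal{G}) = L(\mathcal{G})$ we obtain $\mathcal{B}'(\mathcal{G}) \cap (2L - 1) = 2L(\mathcal{G}) - 1$, that is, it is equal to the order interval $[-1, 1]$ in $\mathcal{B}'(\mathcal{G})$.

D 2.3. Let $K(\mathcal{G})$ denote the basis of $\mathcal{B}(\mathcal{G})$.

Do $K(\mathcal{G})$, $L(\mathcal{G})$, SfifS) satisfy all the axioms and theorems which have been formulated in III, §3? We will not pursue this question for the general case. 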
In the case of the special quantum mechanical structure of $K$ and $L$, we shall return to the question of the structure of $K(\mathcal{G})$, $L(\mathcal{G})$ in VII, §2 and VIII, §1 for physically important groups.

3 Properties of Representations of $\mathcal{G}$ which are Dependent on the Special Structure of $\mathcal{A}$ in Quantum Mechanics

In this section we shall use the special structure of $\mathcal{A}$ which was described in V, and we present an outline of the properties of a representation of $\mathcal{G}$.

3.1 The Topological Structure of the Group $\mathcal{A}^{(\mathcal{B})}$

According to V (5.4) each element $T \in \mathcal{A}$ has the form

$$T y = \sum_{\nu} U_{p(\nu)\nu}\, y_\nu\, U_{p(\nu)\nu}^{-1}, \tag{3.1.1}$$

where the $U_{p(\nu)\nu}$ are isomorphic or anti-isomorphic maps of $\mathcal{H}_\nu$ upon $\mathcal{H}_{p(\nu)}$, and each operator $T$ of the form (3.1.1) is an element of $\mathcal{A}$. $T$ uniquely
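determines the permutation $p$ of the indices $\nu$ and the $U_{p(\nu)\nu}$ up to a phase factor $e^{i\alpha_\nu}$. Therefore $T$ uniquely determines the subset of those $\nu$ for which $U_{p(\nu)\nu}$ is an anti-isomorphic map. We shall denote this subset of the indices by $I$.

D 3.1.1. Let $\mathcal{A}^{(\mathcal{B})}(p, I)$ denote the subset of all $T \in \mathcal{A}$ which determine the same permutation $p$ and the same index subset $I$, endowed with the topology induced by $\mathcal{A}^{(\mathcal{B})}$.

Th. 3.1.1. The topology of $\mathcal{A}^{(\mathcal{B})}(p, I)$ is identical to the initial topology generated by the maps $d_{\varphi_1\varphi_2}: T \to |\langle \varphi_2, U_{p(\nu)\nu}\varphi_1 \rangle|$ (where $\varphi_1 \in \mathcal{H}_\nu$ and $\varphi_2 \in \mathcal{H}_{p(\nu)}$) for all $\varphi_1, \varphi_2$.

Proof. First we shall show that $d_{\varphi_1\varphi_2}$ is a continuous mapping of $\mathcal{A}^{(\mathcal{B})}(p, I)$ in $\mathbb{R}$. The map $T \to T'x$ of $\mathcal{A}^{(\mathcal{B})}$ in $\mathcal{B}$ is continuous for each $x \in \mathcal{B}$. The map $x \to \mu(x, y)$ of $\mathcal{B}$ in $\mathbb{R}$ is continuous for each $y \in \mathcal{B}'$. Therefore the map $T \to \mu(T'x, y) = \mu(x, Ty)$ of $\mathcal{A}^{(\mathcal{B})} \to \mathbb{R}$ is continuous for each $(x, y) \in \mathcal{B} \times \mathcal{B}'$; in particular, the map

$$T \to \mu(P_{\varphi_2}, T P_{\varphi_1}) = \mathrm{tr}(P_{\varphi_2} T P_{\varphi_1}) = \mathrm{tr}(P_{\varphi_2} P_{U_{p(\nu)\nu}\varphi_1}) = |\langle \varphi_2, U_{p(\nu)\nu}\varphi_1 \rangle|^2$$

is continuous, and therefore the map $d_{\varphi_1\varphi_2}$ is also continuous. The initial topology corresponding to the maps $d_{\varphi_1\varphi_2}$ is therefore coarser than the $(\mathcal{B})$-topology. We will show that it is also finer, that is, for a sequence $T^{(n)}$ the relation $T^{(n)\prime}x \to T'x$ is satisfied in the norm topology for every $x \in \mathcal{B}$ if $T^{(n)} \to T$ in the initial topology of the maps $d_{\varphi_1\varphi_2}$. Since each $x \in \mathcal{B}$ has the form $x = \sum_\nu \lambda_\nu P_{\varphi_\nu}$, where $\sum_\nu |\lambda_\nu| < \infty$, from

$$\|(T^{(n)\prime} - T')x\| \le \sum_{\nu=1}^{N} |\lambda_\nu|\, \|(T^{(n)\prime} - T')P_{\varphi_\nu}\| + 2 \sum_{\nu=N+1}^{\infty} |\lambda_\nu|$$

it follows that $T^{(n)\prime}x \to T'x$ for all $x \in \mathcal{B}$ if $T^{(n)\prime}P_\varphi \to T'P_\varphi$ for all $P_\varphi$. From V (5.7) for all $\varphi \in \mathcal{H}_{p(\nu)}$ it follows that

$$T'P_\varphi = P_\psi \quad \text{with} \quad \psi = U_{p(\nu)\nu}^{-1}\varphi. \tag{3.1.2}$$

Therefore, letting $P_{\psi_n} = T^{(n)\prime}P_\varphi$ we obtain (where $\|\ldots\|$ is the norm of $\mathcal{B}$)

$$\|(T^{(n)\prime} - T')P_\varphi\| = \|P_{\psi_n} - P_\psi\|.$$

In the spectral representation of $P_{\psi_n} - P_\psi$ there are only two (nondegenerate) eigenvalues $\alpha_1^{(n)}$ and $\alpha_2^{(n)}$; therefore, for the norm in $\mathcal{B}$ we obtain:

$$\|P_{\psi_n} - P_\psi\| = |\alpha_1^{(n)}| + |\alpha_2^{(n)}|.$$

$\|P_{\psi_n} - P_\psi\| \to 0$ is therefore equivalent to $|\alpha_1^{(n)}| + |\alpha_2^{(n)}| \to 0$ and therefore also to $(\alpha_1^{(n)})^2 + (\alpha_2^{(n)})^2 \to 0$, that is, $\mathrm{tr}((P_{\psi_n} - P_\psi)^2) \to 0$. 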
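From $\mathrm{tr}((P_{\psi_n} - P_\psi)^2) = 2 - 2|\langle \psi_n, \psi \rangle|^2$ it follows that $T^{(n)\prime}x \to T'x$ for all $x \in \mathcal{B}$ if

$$|\langle \psi_n, \psi \rangle| \to 1 \tag{3.1.3}$$

for all $\varphi \in \mathcal{H}$.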
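For $\varphi \in \mathcal{H}_{p(\nu)}$ we obtain

$$|\langle \psi_n, \psi \rangle| = |\langle U_{p(\nu)\nu}^{(n)-1}\varphi, U_{p(\nu)\nu}^{-1}\varphi \rangle| = |\langle U_{p(\nu)\nu}\psi, U_{p(\nu)\nu}^{(n)}\psi \rangle|,$$

where $\psi$ is given in (3.1.2). If the sequence $T^{(n)}$ is convergent in the initial topology corresponding to the maps $d_{\varphi_1\varphi_2}$ we therefore obtain:

$$|\langle \varphi_2, U_{p(\nu)\nu}^{(n)}\varphi_1 \rangle| \to |\langle \varphi_2, U_{p(\nu)\nu}\varphi_1 \rangle|,$$

where $\varphi_1 = \psi$ and $\varphi_2 = U_{p(\nu)\nu}\psi$, that is,

$$|\langle U_{p(\nu)\nu}\psi, U_{p(\nu)\nu}^{(n)}\psi \rangle| \to |\langle U_{p(\nu)\nu}\psi, U_{p(\nu)\nu}\psi \rangle| = 1,$$

whereby (3.1.3) is proven.

It is easy to show that the Borel structure corresponding to the initial topology is equal to the initial Borel structure, that is, according to Th. 3.2: The Borel structure of $\mathcal{A}^{(\mathcal{B})}(p, I)$ is the initial Borel structure associated with the maps $d_{\varphi_1\varphi_2}$.

On the basis of the theorem which was mentioned at the conclusion of Th. 1.3.1 we therefore obtain: If, for a locally compact group $\mathcal{G}$, the maps

$$a \to |\langle \varphi_1, U_{p(\nu)\nu}(a)\varphi_2 \rangle|$$

of $\mathcal{G}$ in $\mathbb{R}$ are measurable (where $U_{p(\nu)\nu}(a)$ is the map of $\mathcal{H}_\nu$ in $\mathcal{H}_{p(\nu)}$ described by (3.1.1) corresponding to $a$), then $\mathcal{G} \to \mathcal{A}^{(\mathcal{B})}$ is continuous.

In the following, for the most part, we shall only use the following fact, which is a simple corollary of Th. 3.2: $\mathcal{G} \to \mathcal{A}^{(\mathcal{B})}$ is continuous if and only if, for each $\varphi_1, \varphi_2 \in \mathcal{H}$, the maps $a \to |\langle \varphi_1, U_{p(\nu)\nu}(a)\varphi_2 \rangle|$ are continuous.

Th. 3.1.2. The $\mathcal{A}^{(\mathcal{B})}(p, I)$ are the connected components of $\mathcal{A}^{(\mathcal{B})}$. In particular, $\mathcal{A}^{(\mathcal{B})}(1, \emptyset)$ (where $1$ is the unit permutation) is the component which is connected to the unit element of $\mathcal{A}^{(\mathcal{B})}$.

Proof. The fact that the set $\mathcal{A}^{(\mathcal{B})}(1, \emptyset)$ is connected to the unit element follows directly from the spectral representation

$$U = \int_0^{2\pi} e^{i\omega}\, dE(\omega)$$

of a unitary operator and the fact that the function $\langle \varphi, U_t\varphi \rangle$ is continuous in $t$ (where $U_t = \int_0^{2\pi} e^{it\omega}\, dE(\omega)$) and that $\langle \varphi, U_0\varphi \rangle = 1$, $\langle \varphi, U_1\varphi \rangle = \langle \varphi, U\varphi \rangle$. Therefore, if $\mathcal{N}$ is the component which is connected to the unit element, then $\mathcal{A}^{(\mathcal{B})}(1, \emptyset) \subset \mathcal{N}$.

Since $\mathcal{N}$ is connected, it follows that, for each neighborhood $\mathcal{V}$ of the unit element of $\mathcal{A}^{(\mathcal{B})}$, all elements of the subgroup $\mathcal{N}$ may be represented as products of finitely many elements of $\mathcal{V} \cap \mathcal{N}$ (see AV, §10).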
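Let $\varphi_1, \varphi_2, \ldots, \varphi_n$ be normed vectors for which $\varphi_\nu \in \mathcal{H}_\nu$. For $\mathcal{V}$ we choose the neighborhood determined by

$$1 - |\langle \varphi_\nu, U_{p(\nu)\nu}\varphi_\nu \rangle| < \varepsilon \quad \text{for } \nu = 1, 2, \ldots, n.$$

From this result it follows that $p(\nu) = \nu$ for $\nu = 1, 2, \ldots, n$, that is, for all elements in $\mathcal{N}$ the relation $p(\nu) = \nu$ holds for $\nu = 1, 2, \ldots, n$. Since $n$ was arbitrary it follows that for all elements of $\mathcal{N}$ the permutation $p$ is equal to the unit permutation $1$.

We choose the neighborhood $\mathcal{V}$ which is determined by

$$1 - |\langle \psi_\nu, U_{\alpha\alpha}\psi_\nu \rangle| < \varepsilon \quad (\nu = 1, 2, 3, 4),$$

where $\psi_1, \psi_2$ are unit vectors in $\mathcal{H}_\alpha$ and where $\psi_3 = (1/\sqrt{2})(\psi_1 + \psi_2)$, $\psi_4 = (1/\sqrt{2})(\psi_1 + i\psi_2)$. It is easy to show that $U_{\alpha\alpha}$ is unitary. Thus it follows that $U_{\alpha\alpha}$ must be unitary for all products of elements in $\mathcal{V}$ and also for all elements of $\mathcal{N}$. Since $\alpha$ was arbitrary, it therefore follows that, for all elements of $\mathcal{N}$, the set of indices $I = \emptyset$. Therefore $\mathcal{N} = \mathcal{A}^{(\mathcal{B})}(1, \emptyset)$.

It is well known that $\mathcal{N} = \mathcal{A}^{(\mathcal{B})}(1, \emptyset)$ is an invariant subgroup of $\mathcal{A}^{(\mathcal{B})}$ (see AV, §10.1). It is easy to show that the cosets of $\mathcal{A}^{(\mathcal{B})}(1, \emptyset)$ are precisely the $\mathcal{A}^{(\mathcal{B})}(p, I)$. Thus the $\mathcal{A}^{(\mathcal{B})}(p, I)$ are the connected components of $\mathcal{A}^{(\mathcal{B})}$ (see AV, §10.1).

Under the following rule for multiplication:

$$(p_1, I_1)(p_2, I_2) = (p_1 p_2,\; p_2^{-1}I_1 + I_2)$$

(where $A + B$ is the symmetric difference $(A \setminus A \cap B) \cup (B \setminus A \cap B)$) the elements $(p, I)$ form a group $F$; it is easy to see that this group is isomorphic to the factor group $\mathcal{A}^{(\mathcal{B})}/\mathcal{A}^{(\mathcal{B})}(1, \emptyset)$.

3.2 The Topological Properties of a Representation of $\mathcal{G}$

Let $A$ be the component of $\mathcal{G}$ which is connected to the unit element of $\mathcal{G}$. $A$ is an invariant subgroup; therefore for the homomorphic map $\mathcal{G} \to \mathcal{A}^{(\mathcal{B})}$ we obtain $A \to \mathcal{A}^{(\mathcal{B})}(1, \emptyset)$ (see AII, §4). Since $\mathcal{G}$ is separable, the factor group $\mathcal{G}/A$ is at most denumerable and is a discrete group.

Let $\mathcal{J}$ denote the set of those elements in $\mathcal{G}$ which are mapped into $\mathcal{A}^{(\mathcal{B})}(1, \emptyset)$. Clearly $A \subset \mathcal{J}$. It is easy to show that $\mathcal{J}$ is a subgroup of $\mathcal{G}$. For $j \in \mathcal{J}$ and $a \in \mathcal{G}$ the product $aja^{-1}$ (considered as an element of $\mathcal{A}^{(\mathcal{B})}$) is a map for which $p = 1$ and $I = \emptyset$, that is, $a\mathcal{J}a^{-1} \subset \mathcal{J}$ for all $a \in \mathcal{G}$. 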
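$\mathcal{J}$ is therefore an invariant subgroup in $\mathcal{G}$; then, according to the isomorphism theorem (AV, §4) we obtain:

$$\mathcal{G}/\mathcal{J} \cong (\mathcal{G}/A)/(\mathcal{J}/A).$$

From the mapping $\mathcal{G} \to \mathcal{A}^{(\mathcal{B})}$ we obtain a homomorphic map $\mathcal{G}/\mathcal{J} \to \mathcal{A}^{(\mathcal{B})}/\mathcal{A}^{(\mathcal{B})}(1, \emptyset)$. An element of $\mathcal{G}/\mathcal{J}$ is the union of all those cosets in $\mathcal{G}/A$ which are mapped by means of the homomorphic map $\mathcal{G} \to \mathcal{A}^{(\mathcal{B})}$ into one and the same $\mathcal{A}^{(\mathcal{B})}(p, I)$. An element of $(\mathcal{G}/A)/(\mathcal{J}/A)$ is precisely the set of those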
cosets of $\mathcal{G}/A$ which are mapped by the homomorphic map $\mathcal{G} \to \mathcal{A}^{(\mathcal{B})}$ into one and the same $\mathcal{A}^{(\mathcal{B})}(p, I)$. Therefore, on the basis of the isomorphism theorem a homomorphism $(\mathcal{G}/A)/(\mathcal{J}/A) \to \mathcal{A}^{(\mathcal{B})}/\mathcal{A}^{(\mathcal{B})}(1, \emptyset)$ is defined by the map $\mathcal{G} \to \mathcal{A}^{(\mathcal{B})}$, and, consequently, we may identify $\mathcal{A}^{(\mathcal{B})}/\mathcal{A}^{(\mathcal{B})}(1, \emptyset)$ with the group $F$ which was given at the end of §3.1.

In nonrelativistic quantum mechanics we only consider those representations in which the group $\mathcal{G}$ is mapped onto portions of the form $\mathcal{A}^{(\mathcal{B})}(1, I)$, that is, for $p \neq 1$ there are no images of elements in portions of the form $\mathcal{A}^{(\mathcal{B})}(p, I)$. In nonrelativistic quantum mechanics it is also possible to use representations of groups in which there exist images of group elements in portions of the form $\mathcal{A}^{(\mathcal{B})}(p, I)$ where $p \neq 1$(!). Such representations are of little physical significance in nonrelativistic quantum mechanics. The situation is somewhat different in the case of relativistic quantum mechanics. Here, on physical grounds, it is meaningful, for example, to choose a transformation of a portion of $\mathcal{A}^{(\mathcal{B})}(p, \emptyset)$ where $p \neq 1$ as the homomorphic image of reflection (parity inversion).

In this book we shall only be concerned with nonrelativistic quantum mechanics for the following two reasons:

(1) By analogy with the case of the Galileo group (which is described in VII) we could consider the problem of the representation of the Poincaré group (since it has been solved, as is the case of the Galileo group); however, the interaction problem has been solved in nonrelativistic quantum mechanics by the consideration of combined systems (see VIII). The interaction problem has not yet been satisfactorily solved in relativistic quantum mechanics, that is, for “elementary particle theory.” We shall again discuss this problem in VII and VIII.

(2) This book should provide an insight into the structure of a “closed” physical theory. 
For this purpose we must abstain from the consideration of fragments of other theories (not only that of relativistic quantum mechanics) although these fragments would fit into the context of II-VI. See, for example, the discussion with respect to the case of nuclear physics in VII, §2.

Since we shall only make use of the partitions $\mathcal{A}^{(\mathcal{B})}(1, I)$ of $\mathcal{A}^{(\mathcal{B})}$ it is therefore tedious to use the whole spaces $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ for computational purposes. It suffices to consider for each $\mathcal{H}_\nu$ the spaces $\mathcal{B}(\mathcal{H}_\nu)$, $\mathcal{B}'(\mathcal{H}_\nu)$ separately, together with the group of all $\mathcal{B}$-continuous effect isomorphisms of $L(\mathcal{H}_\nu)$. In more general circumstances, or in circumstances in which it is clear which system type (as characterized by an atom of the center; see IV, §8) we are concerned with, we shall ignore the index $\nu$ for $\mathcal{H}_\nu$ and, instead, consider $\mathcal{B}(\mathcal{H})$, $\mathcal{B}'(\mathcal{H})$ together with the subsets $K(\mathcal{H})$, $L(\mathcal{H})$. Then the group $\mathcal{A}^{(\mathcal{B})}$ will consist of two connected parts: $\mathcal{A}^{(\mathcal{B})}(u)$, which is connected to $1$ and consists of effect isomorphisms described by unitary transformations, and $\mathcal{A}^{(\mathcal{B})}(a)$, which is the set of effect isomorphisms described by anti-unitary transformations in $\mathcal{H}$. The subgroup $\mathcal{J}$ of $\mathcal{G}$
therefore permits a representation in $\mathcal{A}^{(\mathcal{B})}(u)$. Only elements of the cosets $a\mathcal{J}$ for which $a\mathcal{J} \neq \mathcal{J}$ can have images in $\mathcal{A}^{(\mathcal{B})}(a)$.

Readers who are not well versed in physics will have some difficulty accepting such a “reduced” description of physical experiments because, to a given preparation procedure $a \in \mathcal{Q}'$ it is possible to find a $\varphi(a) \in K(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ which has nonzero components in more than one $K(\mathcal{H}_\nu)$, that is, $\varphi(a) = (W_1, W_2, \ldots)$, where more than one of the $W_\nu$ are nonzero. Such a preparation apparatus (described by $a$) clearly does not produce only microsystems of a single type. However, mixtures of different system types in the ratios $\mathrm{tr}(W_1) : \mathrm{tr}(W_2) : \mathrm{tr}(W_3) : \ldots$ are “physically trivial” and are therefore uninteresting. Similarly, for those effects $\psi(b_0, b) = (F_1, F_2, \ldots)$ having more than one nonzero $F_\nu$ we find that the probabilities are computed according to the mixture formula

$$\mu(\varphi(a), \psi(b_0, b)) = \sum_\nu \mathrm{tr}(W_\nu F_\nu).$$

Here only the individual terms $\mathrm{tr}(W_\nu F_\nu)$ are of interest.

Combinations of several system types are only of interest if there are “physically interesting” effect isomorphisms in $\mathcal{A}^{(\mathcal{B})}(p, I)$ for which $p \neq 1$. Such is the case only in relativistic quantum mechanics. For the present we shall only be interested in the simplified description of quantum mechanics in which only a single Hilbert space is used, considering different system types separately.

3.3 Unitary and Anti-unitary Representations Up to a Factor

Each element $a$ of the group $\mathcal{G}$ (considered as an element of $\mathcal{A}^{(\mathcal{B})}$ by means of the representation in $\mathcal{A}^{(\mathcal{B})}$, where $\mathcal{B} = \mathcal{B}(\mathcal{H})$) corresponds to a unitary or anti-unitary transformation of $\mathcal{H}$ into itself as follows:

$$a \to U(a). \tag{3.3.1}$$

$U(a)$ is determined by $a$ up to a factor (which depends upon $a$). We may require that the unit element $e$ of $\mathcal{G}$ corresponds to the unit operator $1$ in $\mathcal{H}$: $e \to 1$. 
(3.3.2)

From the representation properties of $\mathcal{G}$ in $\mathcal{A}^{(\mathcal{B})}$ it follows that $U(ab)$ must determine (according to (3.3.1)) the same transformation in $\mathcal{A}^{(\mathcal{B})}$ as does $U(a)U(b)$, that is, $U(ab)$ and $U(a)U(b)$ can differ only by a factor of magnitude 1:

$$U(ab) = \omega(a, b)\,U(a)U(b), \tag{3.3.3}$$

where $|\omega(a, b)| = 1$.
Conversely, if we have a representation (3.3.1) of a group $\mathcal{G}$ for which (3.3.2) and (3.3.3) hold, then to each $a \in \mathcal{G}$ there exists a unique element in $\mathcal{A}^{(\mathcal{B})}$ corresponding to $U(a)$ which we denote simply by $a$.

Let $\mathcal{U}(\mathcal{H})$ denote the group of unitary and anti-unitary transformations of $\mathcal{H}$. A map of $\mathcal{G}$ into $\mathcal{U}(\mathcal{H})$ of the form (3.3.1) for which (3.3.2) and (3.3.3) are valid will be called a unitary (or anti-unitary) representation of $\mathcal{G}$ up to a factor.

Each unitary (anti-unitary) representation of $\mathcal{G}$ up to a factor uniquely corresponds to a representation of $\mathcal{G}$ in $\mathcal{A}^{(\mathcal{B})}$. Each representation of $\mathcal{G}$ in $\mathcal{A}^{(\mathcal{B})}$ corresponds to a representation up to a factor, where two representations $U_1$, $U_2$ determine the same representation in $\mathcal{A}^{(\mathcal{B})}$ if

$$U_1(a) = e^{i\delta(a)}U_2(a), \tag{3.3.4}$$

where $\delta(a)$ is a real function.

For $a = e$ or $b = e$, from (3.3.3) and (3.3.2) it follows that

$$\omega(e, b) = \omega(a, e) = 1. \tag{3.3.5}$$

From (3.3.3) it follows that

$$U(a(bc)) = \omega(a, bc)\,U(a)U(bc) = \omega(a, bc)\,\omega(b, c)\,U(a)U(b)U(c)$$

and

$$U((ab)c) = \omega(ab, c)\,U(ab)U(c) = \omega(ab, c)\,\omega(a, b)\,U(a)U(b)U(c)$$

and we therefore obtain

$$\omega(a, bc)\,\omega(b, c) = \omega(ab, c)\,\omega(a, b). \tag{3.3.6}$$

The relations (3.3.5) and (3.3.6) are, in a sense, characteristic for the multipliers $\omega(a, b)$ because, to each “solution” of (3.3.5) and (3.3.6) there also exists a representation up to a factor (see [10]).

From (3.3.4) it follows that the multipliers $\omega_1$ and $\omega_2$ for two representations $U_1$, $U_2$ which correspond to the same representation in $\mathcal{A}^{(\mathcal{B})}$ satisfy the equation:

$$\omega_2(a, b) = \omega_1(a, b)\,e^{i[\delta(a) + \delta(b) - \delta(ab)]}. \tag{3.3.7}$$

Two multipliers $\omega_1$ and $\omega_2$ are said to be equivalent if there exists a real function $\delta$ on $\mathcal{G}$ such that (3.3.7) holds.

Here we note that nothing more about $\omega(a, b)$ has been assumed: nothing about continuity, measurability, etc. Therefore the problem remains to put forward a clever choice of a special multiplier from a class of equivalent multipliers.
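The relations (3.3.5), (3.3.6) and (3.3.7) can be checked numerically on a small example. The following is only a minimal sketch under stated assumptions: the group $Z_2 \times Z_2$, the choice $U(a) = X^{a_1}Z^{a_2}$ in terms of Pauli matrices, and the function `DELTA` are illustrative choices that do not appear in the text. The multiplier is computed from the convention (3.3.3), $U(ab) = \omega(a, b)U(a)U(b)$.

```python
import cmath
from itertools import product

# Illustrative (hypothetical) group: Z2 x Z2, elements are pairs of bits.
GROUP = list(product((0, 1), repeat=2))
E = (0, 0)

def mul(a, b):
    """Group law of Z2 x Z2: componentwise addition mod 2."""
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

I2 = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))

def matmul(p, q):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def U(a):
    """U(a) = X^a1 Z^a2; unitary, with U(E) = 1 as required by (3.3.2)."""
    m = I2
    if a[0]:
        m = matmul(m, X)
    if a[1]:
        m = matmul(m, Z)
    return m

def omega(a, b):
    """Multiplier defined by U(ab) = omega(a, b) U(a) U(b), cf. (3.3.3)."""
    lhs, rhs = U(mul(a, b)), matmul(U(a), U(b))
    i, j = next((i, j) for i in range(2) for j in range(2) if rhs[i][j] != 0)
    w = lhs[i][j] / rhs[i][j]
    # Sanity check: the two matrices differ only by the scalar factor w.
    assert all(lhs[r][c] == w * rhs[r][c] for r in range(2) for c in range(2))
    return w

def close(z1, z2, tol=1e-12):
    return abs(z1 - z2) < tol

def is_multiplier(w):
    """Check the normalization (3.3.5) and the relation (3.3.6)."""
    unit = all(close(w(E, b), 1) and close(w(b, E), 1) for b in GROUP)
    cocycle = all(
        close(w(a, mul(b, c)) * w(b, c), w(mul(a, b), c) * w(a, b))
        for a, b, c in product(GROUP, repeat=3)
    )
    return unit and cocycle

# Arbitrary real function delta with delta(E) = 0, so that (3.3.2) is kept.
DELTA = {(0, 0): 0.0, (1, 0): 0.3, (0, 1): 1.1, (1, 1): -0.7}

def omega_equiv(a, b):
    """Equivalent multiplier obtained from omega via (3.3.7)."""
    phase = DELTA[a] + DELTA[b] - DELTA[mul(a, b)]
    return omega(a, b) * cmath.exp(1j * phase)

if __name__ == "__main__":
    print(is_multiplier(omega), is_multiplier(omega_equiv))
```

In this example both `omega` and the rephased `omega_equiv` pass the two checks, which illustrates that (3.3.7) maps one solution of (3.3.5) and (3.3.6) to another; the requirement `DELTA[E] == 0` is what preserves the normalization.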
Thus the problem of finding the representation of a group $\mathcal{G}$ in $\mathcal{A}^{(\mathcal{B})}$ is equivalent to the problem of finding all unitary (or anti-unitary) representations up to a factor and the selection of a particularly “simple” multiplier from each equivalence class of multipliers.

In accord with Th. 3.1.1 we now propose the following continuity assumption for the representation map $U: \mathcal{G} \to \mathcal{U}(\mathcal{H})$:

$$a \to |\langle \varphi, U(a)\psi \rangle| \tag{3.3.8}$$

is continuous for all $\varphi, \psi \in \mathcal{H}$.

The problem posed above can be solved for a certain type of group which contains all “physically relevant” groups. A more precise description of the solution of this representation problem would require a separate book, and would result in the loss of continuity of the train of thought. Since monographs already exist on this topic (see, for example, [10]), in the appendix AV, §10 of this book we shall only present a brief summary for the case of a compact group in order that we may obtain a better understanding of the rotation groups.

In closing this section we shall now characterize the concepts introduced at the beginning of §1.2 in terms of the form of a representation up to a factor.

A $\mathcal{G}$-invariant effect (as defined in D 1.2.1) is therefore an element $g$ which commutes with all $U(a)$, $a \in \mathcal{G}$, that is, $U(a)g = gU(a)$.

Therefore, according to D 1.1.2 a representation is irreducible if the only operators which commute with all the $U(a)$, $a \in \mathcal{G}$, are multiples of the $1$-operator. Otherwise, to each operator $A$ which commutes with all $U(a)$ we would have $A^{+}U(a)^{+} = U(a)^{+}A^{+}$, that is, $A^{+}U(a^{-1}) = U(a^{-1})A^{+}$; then all $U(a)$ would commute with the operator $A^{+}$ and hence with the self-adjoint operator $A + A^{+}$. Then there would be a projection operator $E$ ($\neq 0$, $\neq 1$) (for example, from the spectral family of $A + A^{+}$) which commutes with all $U(a)$. 
If a representation is not irreducible, then there exists a projection operator $E$ ($\neq 0$, $\neq 1$) which commutes with all $U(a)$; this is equivalent to the condition that there exists an invariant subspace of $\mathcal{H}$ (different from $\mathcal{H}$ and $\{0\}$), namely, the projection space belonging to $E$.

With the help of V (5.3) we may, in principle, rewrite condition D 1.2.3 for equivalent representations. We will do this for the case of two representations in $\mathcal{B}'(\mathcal{H})$ and $\mathcal{B}'(\tilde{\mathcal{H}})$ because these are the only significant ones in nonrelativistic quantum mechanics.

Suppose that two representations by means of $\mathcal{B}$-continuous effect isomorphisms are given in terms of representations up to a factor $U(a)$ in $\mathcal{H}$ and $\tilde{U}(a)$ in $\tilde{\mathcal{H}}$. They are equivalent if there exists an isomorphism (or anti-isomorphism) $V$ of $\mathcal{H}$ onto $\tilde{\mathcal{H}}$ such that

$$U(a)\,y\,U(a)^{-1} = V^{-1}\tilde{U}(a)V\,y\,V^{-1}\tilde{U}(a)^{-1}V \tag{3.3.9}$$

holds for all $y \in \mathcal{B}'(\mathcal{H})$. From (3.3.9) we find that the above is equivalent to

$$U(a) = e^{i\delta(a)}V^{-1}\tilde{U}(a)V, \tag{3.3.10}$$
where $\delta(a)$ is a real function on $\mathcal{G}$. If $\mathcal{H} = \tilde{\mathcal{H}}$ then $V$ is a unitary or anti-unitary operator in $\mathcal{H}$.

By clever selection of the factors for a representation we may, according to (3.3.10), obtain $\delta(a) = 0$. Then (3.3.10) reduces to the “usual” form for the definition of equivalence of two unitary representations.

In particular, it follows from (3.3.10) for $V = 1$ that two representations which differ only by factors of the form $e^{i\delta(a)}$ (and which, according to (3.3.7), correspond to equivalent multipliers) are equivalent.
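The last remark can be illustrated by a small numerical sketch. It assumes, for illustration only, a two-dimensional Hilbert space, a unitary `U_EXAMPLE`, and a self-adjoint `Y_EXAMPLE`; none of these appear in the text, and only the unitary case is covered. The sketch checks that $U$ and $e^{i\delta}U$ induce the same map $y \to UyU^{-1}$.

```python
import cmath
import math

def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(p):
    """Adjoint (conjugate transpose); equals the inverse for a unitary matrix."""
    n = len(p)
    return [[complex(p[j][i]).conjugate() for j in range(n)] for i in range(n)]

def conjugate_by(u, y):
    """The map y -> u y u^{-1} for unitary u."""
    return matmul(matmul(u, y), dagger(u))

def with_phase(u, delta):
    """Multiply u by the phase factor e^{i delta}."""
    phase = cmath.exp(1j * delta)
    return [[phase * z for z in row] for row in u]

def close_mat(p, q, tol=1e-12):
    return all(abs(a - b) < tol for rp, rq in zip(p, q) for a, b in zip(rp, rq))

# Hypothetical example data: a rotation (unitary) and a self-adjoint y.
THETA = 0.4
U_EXAMPLE = [[math.cos(THETA), -math.sin(THETA)],
             [math.sin(THETA), math.cos(THETA)]]
Y_EXAMPLE = [[0.2, 0.1 - 0.3j],
             [0.1 + 0.3j, 0.7]]

if __name__ == "__main__":
    same = close_mat(conjugate_by(U_EXAMPLE, Y_EXAMPLE),
                     conjugate_by(with_phase(U_EXAMPLE, 1.234), Y_EXAMPLE))
    print(same)
```

The phase cancels because it appears once in $U$ and once, conjugated, in $U^{-1}$; this is the elementary reason why a representation in terms of effect automorphisms fixes $U(a)$ only up to a factor.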
CHAPTER VII

The Galileo Group

In the investigation of microsystems an important role is played by a particular component of the physical-technical structure of those experiments which are composed of a preparation procedure and a registration method. Every experimental physicist is acquainted with this component, which underlies the problem of fixing the spatial and time relationships between the preparation and registration apparatuses. Earlier we have briefly described this question in the definition of $\mathcal{C}$ in II (4.3.1) and in III, §1. We must now introduce a corresponding mathematical structure into our mathematical formulation which describes the spatial and time relationships between the preparation and registration apparatuses.

1 The Galileo Group as a Set of Transformations of Registration Procedures Relative to Preparation Procedures

Underlying each element $b_0 \in \mathcal{R}_0$ there exists a whole technology of the construction of the apparatus to which the registration method $b_0$ belongs. Here it is not possible to discuss the technology involved. In this respect a large series of pre-theories (see [1], [2], III and [13]) is required for quantum mechanics. It is, however, essential that quantum mechanics itself is not used in the physical description of $b_0$. Here it is possible to raise objections to this assertion; in XVII we will examine such problems in more detail. In [2], XVI it will be obvious what we mean by the above assertions. In [13] this problem is treated in considerable detail.
Although we do not need to go into all of the details of the construction of $b_0$, we must, however, present a brief explanation of the special technical character associated with the pre-theory of space-time measurement. The preparation and registration procedures always refer to a space-time reference system, an inertial system (or approximately inertial system) (see, for example, [2], II, VII, and IX). Every experimental physicist is aware of the importance of the spatial and temporal relationships of the registration apparatus relative to the preparation apparatus. Here it is often necessary to use the most modern measurement techniques.

It is important that the technical specification of such a space-time reference system has nothing to do with quantum mechanics. The fact that this situation is often not sufficiently understood is, in part, the cause for many conceptual errors in quantum mechanics. We shall now describe what we believe to be the correct meaning of space and time in quantum mechanics: Space and time coordinates refer only to the preparation and registration apparatuses and do not(!!) refer to the individual microsystems. In the formulation of quantum mechanics presented in this book we have not introduced the position of a microsystem as a basic concept because it is not clear how such a concept can be defined in terms of the pre-theories. The mapping principles do not permit us to use concepts which cannot be defined by the pre-theories (see [1] and [2], III). Instead of the “position of a microsystem” only the spatial relationship between the preparation and registration apparatuses is defined by means of the pre-theories, and, in this respect, is permitted by the mapping principles.

We shall not provide a precise mathematical picture of the placement of the registration apparatus relative to the laboratory spatial coordinate system. 
Here we shall be interested only in a particular aspect of this structure, which is described below. This description is a substitute for a more complicated description in terms of pre-theories (see [13]).

We shall assume that a registration method $b_0 \in \mathcal{R}_0$ is not only characterized by the “inner” structure of the registration apparatus but also by its position and motion relative to the space-time laboratory reference system. In this way we obtain a new structure in the domain of the registration methods and the corresponding registration procedures.

Two registration methods $b_{01}$ and $b_{02}$ can only differ in their position and motion in the laboratory reference system if the corresponding apparatuses do not differ in their internal structure. We shall now introduce a mathematical description of such a situation between two registration methods.

How can two such registration methods differ? If we use the Newtonian space-time structure, they may differ only by a Galileo transformation. If the space-time structure of special relativity is used, they will differ by one of the transformations of the Poincaré group. For a discussion of the Galileo group see [2], VI, §1.2; for the Poincaré group see [2], IX, §4.3. Here the physical meaning of a group transformation is that $b_{01}$ may be obtained from $b_{02}$ by means of a spatial translation (in the reference system under consideration)
or by a time translation, or that the apparatus for $b_{01}$ moves with constant velocity relative to that of $b_{02}$. A “time translation” for the time $\tau$ of $b_{01}$ relative to $b_{02}$ means that the apparatus corresponding to $b_{01}$ is placed into operation at a time interval $\tau$ later than the apparatus corresponding to $b_{02}$; for example, a voltage is turned on at a later time $\tau$. Later we shall find that the problem of time displacement is not trivial with respect to the combination problem of a preparation and a registration apparatus described in II, §4.3.

We will consider only the mathematical formulation of the structure described above for the case of the Galileo group $\mathcal{G}$. The formulation of the analogous case for the Poincaré group is similar and trivial.

$\mathcal{G}$ is a locally compact, separable topological group. The elements of $\mathcal{G}$ can be given by the transformations (see [2], VI (1.2.1)) as follows:

$$x'_\nu = \sum_{\mu=1}^{3} \alpha_{\nu\mu}x_\mu + \delta_\nu t + \eta_\nu \quad (\nu = 1, 2, 3), \qquad t' = t + \gamma, \tag{1.1}$$

where $A = (\alpha_{\nu\mu})$ is the matrix of a spatial rotation, that is, $A' = A^{-1}$. We shall consider only transformations (1.1) which can be continuously transformed into the unit element, that is, those for which the determinant $|A| = 1$. We may represent an element characterized by (1.1) as follows:

$$(A, \delta, \eta, \gamma). \tag{1.2}$$

We now make the following assertion: There exists at least one denumerable subgroup $\hat{\mathcal{G}} \subset \mathcal{G}$ (we can choose $\hat{\mathcal{G}}$ to be the set of all transformations (1.1) where $\alpha_{\mu\nu}$, $\delta_\nu$, $\eta_\nu$, $\gamma$ are rational numbers) such that, for all $g \in \hat{\mathcal{G}}$, there exists a map $\mathcal{R} \xrightarrow{g} \mathcal{R}$ (which we shall denote by $g$) for which $g(\mathcal{R}_0) \subset \mathcal{R}_0$.

Its physical meaning (as we mentioned earlier) is that, for $b_0 \in \mathcal{R}_0$, the method $gb_0$ is obtained from $b_0$ by means of the Galileo transformation $g$ and that for $b \in \mathcal{R}$, $b \subset b_0$ the procedure $gb$ is precisely that which is obtained from $b$ by means of the Galileo transformation $g$. The “meaning” described here is a mapping principle (in the sense of [1], §5 or [2], III, §4) for the relation mathematically described by $\mathcal{R} \xrightarrow{g} \mathcal{R}$. 
This short outline must suffice for the present. It is important, however, to note that the “physical meaning” of $g$ and $gb_0$, $gb$ is already determined by means of the pre-theories!

The reason why we consider only a denumerable subgroup $\hat{\mathcal{G}}$ is concerned with the “finiteness of physics” assumption (see [1], §9 or [2], III, §8) that $\mathcal{R}$ must be denumerable (see VI, §1.1).

We may express the fact that the inner structure of the registration apparatus remains unchanged by the application of the Galileo transformation by asserting that the mapping $\mathcal{R} \xrightarrow{g} \mathcal{R}$ is an r-automorphism. The fact that it is essential to consider only Galileo transformations is made clear by the fact that the acceleration of an apparatus can modify its inner structure. Therefore an apparatus $b_{01}$ which is accelerated relative to an apparatus $b_{02}$ cannot be characterized by means of an r-automorphism.
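The group structure of the transformations (1.1), written in the form (1.2), can be made concrete with a short sketch. The composition and inverse rules below are not quoted from the text; they are obtained by substituting one transformation (1.1) into another, and the function names and the two sample elements `G1`, `G2` are illustrative assumptions. Exact rational arithmetic is used, in keeping with the denumerable subgroup of transformations with rational parameters mentioned above.

```python
from fractions import Fraction as Fr

def mat_vec(A, v):
    return tuple(sum(A[i][k] * v[k] for k in range(3)) for i in range(3))

def mat_mat(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def transpose(A):
    return tuple(tuple(A[j][i] for j in range(3)) for i in range(3))

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    return tuple(c * a for a in v)

I3 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
IDENTITY = (I3, (0, 0, 0), (0, 0, 0), 0)

def act(g, x, t):
    """Apply (1.1): x' = A x + delta t + eta, t' = t + gamma."""
    A, delta, eta, gamma = g
    return add(add(mat_vec(A, x), scale(t, delta)), eta), t + gamma

def compose(g1, g2):
    """Element corresponding to 'first g2, then g1' (derived by substitution)."""
    A1, d1, e1, c1 = g1
    A2, d2, e2, c2 = g2
    A = mat_mat(A1, A2)
    delta = add(mat_vec(A1, d2), d1)
    eta = add(add(mat_vec(A1, e2), scale(c2, d1)), e1)
    return (A, delta, eta, c1 + c2)

def inverse(g):
    """Inverse element; uses A' = A^{-1} for a rotation matrix."""
    A, delta, eta, gamma = g
    At = transpose(A)
    At_delta = mat_vec(At, delta)
    return (At, scale(-1, At_delta),
            add(scale(gamma, At_delta), scale(-1, mat_vec(At, eta))), -gamma)

# Two sample elements with rational parameters (illustrative values only).
G1 = (((0, -1, 0), (1, 0, 0), (0, 0, 1)),
      (Fr(1, 2), 0, 0), (0, Fr(3), 0), Fr(2, 3))
G2 = (((1, 0, 0), (0, Fr(3, 5), Fr(-4, 5)), (0, Fr(4, 5), Fr(3, 5))),
      (0, Fr(1, 4), 1), (Fr(-1), 0, Fr(5, 2)), Fr(-1, 7))

if __name__ == "__main__":
    x, t = (Fr(1), Fr(2), Fr(3)), Fr(1, 2)
    print(act(compose(G1, G2), x, t) == act(G1, *act(G2, x, t)))
```

Since the rotation matrices of `G1` and `G2` have rational entries, every product formed from them again has rational parameters, so the sketch never leaves the rational subgroup.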
We now present a summary of the above considerations in axiomatic form:

For each $g \in \hat{\mathcal{G}}$ there exists an r-automorphism $\mathcal{R} \xrightarrow{g} \mathcal{R}$ and we obtain a representation of the group $\hat{\mathcal{G}}$ by means of r-automorphisms.

This axiom can be directly obtained as a theorem from the pre-theories; see, for example, [13]. (For the preparation and registration of macrosystems this assertion about the possibility of representation of the group $\hat{\mathcal{G}}$ by means of r-automorphisms is not correct for all time translations $\gamma$ because we have chosen the time point $t = 0$ to be the time before which the preparation is complete and after which the registration begins (see III, §1, [2], XV, and [13]). In [13] it is shown that the time translation of the registration apparatus makes sense only for $\gamma > 0$.)

Our description of the application of Galilean transformations for registration procedures can also be directly carried out for preparation procedures. For such a transformation we shall write $a \to ga$. The transformations of registration procedures and preparation procedures are not mutually independent. For a pair $a \in \mathcal{Q}'$, $b_0 \in \mathcal{R}_0$, from a Galileo transformation $g$ there arises an experiment $(a, gb_0)$ if $(a, gb_0) \in \mathcal{C}$. If, instead, we transform the preparation procedure by means of $g^{-1}$, then we obtain the pair $(g^{-1}a, b_0)$. Here $(a, gb_0)$ and $(g^{-1}a, b_0)$ differ only in the fact that the complete experiment $(a, gb_0)$ is transformed by $g$ relative to $(g^{-1}a, b_0)$, while the “relative” position of the preparation and registration apparatus is the same in both cases. For this reason the following assertions are “almost trivial”:

$$(a, gb_0) \in \mathcal{C} \iff (g^{-1}a, b_0) \in \mathcal{C}$$

and

$$\mu(a, g(b_0, b)) = \mu(g^{-1}a, (b_0, b)).$$

From V, Th. 3.4 it follows directly that $\mathcal{R} \xrightarrow{g} \mathcal{R}$ is a p-continuous r-automorphism and that $\mathcal{Q}' \xrightarrow{g} \mathcal{Q}'$ is an r-continuous p-automorphism. We may combine the above considerations by asserting that $\hat{\mathcal{G}}$ may be represented by means of p-continuous r-automorphisms by means of the maps $\mathcal{R} \xrightarrow{g} \mathcal{R}$. 
According to VI, §1.1 it follows that to each $g$ there corresponds a unique $\mathcal{B}$-continuous effect automorphism which, in turn, corresponds to a representation $\hat{\mathcal{G}} \to \mathcal{A}$ of $\hat{\mathcal{G}}$ into the group $\mathcal{A}$ of $\mathcal{B}$-continuous effect automorphisms (see VI, §1.2). In addition the elements of $\mathcal{A}$ which correspond to elements of $\hat{\mathcal{G}}$ leave the subspace $\mathcal{D}$ of $\mathcal{B}'$ invariant.

For the representation of $\hat{\mathcal{G}}$ by means of p-continuous r-automorphisms we shall require that AG 1 from VI, §1.1 holds. AG 1 is the mathematical expression for the condition that small errors in adjusting the registration apparatus in space and time cannot be detected by means of the probability
distributions. In principle this is nothing other than the assumption that small errors are of a statistical character.

In this way we may, therefore, by VI, §1.1 and §1.2 consider the representation of the complete Galileo group $\mathcal{G}$ in $\mathcal{A}$, where $\mathcal{G} \to \mathcal{A}$ is continuous according to VI, §1.3.

According to VI, §3.3 we may also consider separate representations of the Galileo group (up to a factor) for each of the Hilbert spaces $\mathcal{H}_\nu$ of different “system types” because the Galileo group (where we assume that the determinant of the rotation matrix is $+1$) is connected. For these representations we may impose the continuity condition VI (3.3.8).

2 Irreducible Representations of the Galileo Group and Their Physical Meaning

The irreducible representations of the Galileo group play a fundamental physical role. As we have found at the end of the previous section, we can consider each system type and its corresponding Hilbert space $\mathcal{H}_\nu$ separately.

D 2.1. A system type (IV, D 8.1.1) is said to be “elementary” if its corresponding representation of the Galileo group (that is, its representation up to a factor in $\mathcal{H}_\nu$) is irreducible; otherwise, it is said to be “composite.”

For $x \in p(e)$ (where $e$ is an elementary system type) we may sometimes speak (less precisely) simply of elementary systems $x$ of type $e$ (for $p(e)$ see IV, §8.1).

This definition is mathematically clear. However, its physical interpretation may be misunderstood. For this reason we shall make a number of explanatory remarks.

Every theory refers to a certain “fundamental domain” where it is usable (see, for example, [2], III, §2 and §4 or [1], §3 and §5). If we then have a more comprehensive theory, then the corresponding fundamental domain will probably be larger (see [2], III, §7 or [1], §8).

If at a given point in the development of a theory there is a certain fundamental domain, the theory will describe certain real factual content (for example, atomic nuclei) as elementary systems. 
In a more comprehensive theory these systems need not necessarily be elementary. For example, if, in the fundamental domain, only low-energy processes are admitted, then the atomic nuclei may be described as elementary systems (for example, in atomic and molecular physics). If we extend the fundamental domain by admitting such processes as nuclear reactions, then we must use more comprehensive theories. In such theories the atomic nuclei (except for the neutron and proton) must be described as composite systems. We must be careful to avoid the mistake of considering the concepts in a physical theory to be absolute, instead of understanding that they are part of the description of a certain fundamental domain. In physics we often restrict the fundamental
domain and consider simplified approximate theories for such restricted fundamental domains. In such a simplified and approximate theory a complete atom may be considered to be an “elementary” system. That this situation does not result in a contradiction can be seen from the fact that, in the treatment of composite systems, the “center of mass” of a composite system itself behaves as if it were an elementary system.

Since we may consider each system type separately (as we have already discussed in VI, §3), in the following we shall always consider only a single Hilbert space $\mathcal{H}$.

As we mentioned in VI, §3.3, we cannot present an exact derivation of the irreducible representations of the Galileo group here, because it would require an entire book to do so. The reader who is interested in this task is referred to [10]. The derivation presented there can be directly applied here because the continuity of the maps, VI (3.3.8), is the central assumption in [10] for the derivation of the possible inequivalent representations (if necessary, one could use the weaker requirement that the maps, VI (3.3.8), be only measurable; continuity would then follow for locally compact groups). Here again we note that the assumptions made in §1 are completely sufficient to apply all the theorems presented in [10].

Every irreducible representation of the Galileo group in terms of $\mathcal{B}$-continuous effect isomorphisms can be given by the corresponding unitary representation up to a factor. We shall now give a brief summary of the structure of such representations.

The Galileo group (1.1) contains the abelian group of spatial translations $\vec\eta$ and the “proper” Galileo transformations $\vec\delta$ as subgroups.
For a one-parameter group (for example, the translations in the 1-direction with the parameter $\eta_1$) the factors may be so chosen that we obtain a unitary representation as follows: Let $\alpha$ denote the parameter of the group element, that is, $a(\alpha_1)a(\alpha_2) = a(\alpha_1 + \alpha_2)$. From VI (3.3.5) and (3.3.6), where $\omega(a(\alpha_1), a(\alpha_2)) = e^{i\gamma(\alpha_1, \alpha_2)}$, it follows that

$$\gamma(0, \alpha) = \gamma(\alpha, 0) = 0, \tag{2.1}$$

$$\gamma(\alpha_1, \alpha_2 + \alpha_3) + \gamma(\alpha_2, \alpha_3) = \gamma(\alpha_1 + \alpha_2, \alpha_3) + \gamma(\alpha_1, \alpha_2). \tag{2.2}$$

According to VI (3.3.7) the question arises whether there exists a $\delta(\alpha) = \delta(a(\alpha))$ for which $\omega_2(a, b) = 1$, that is,

$$\gamma(\alpha_1, \alpha_2) + \delta(\alpha_1) + \delta(\alpha_2) - \delta(\alpha_1 + \alpha_2) = 0. \tag{2.3}$$

On the basis of condition (2.1) it is always possible to find such a $\delta(\alpha)$. This can be proven without additional assumptions (see [10]). It can be easily seen if we assume that $\gamma$ is twice differentiable, and we differentiate (2.2) first with respect to $\alpha_1$ and then with respect to $\alpha_3$.
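The relation between (2.1), (2.2), and (2.3) can be illustrated numerically in the easy direction: any exponent $\gamma$ that has the form required by (2.3) automatically satisfies (2.1) and (2.2). The following is a minimal sketch under stated assumptions; the function `delta` is a hypothetical phase chosen only for illustration (it is not taken from the text), and the check does not replace the proof in [10] that every $\gamma$ satisfying the conditions is of this form.

```python
import math


def delta(a):
    # Hypothetical smooth phase with delta(0) = 0 (an assumption for illustration).
    return 0.3 * a ** 2 + math.sin(a)


def gamma(a1, a2):
    # Exponent constructed so that (2.3) holds identically.
    return delta(a1 + a2) - delta(a1) - delta(a2)


def check_cocycle(a1, a2, a3, tol=1e-12):
    # Condition (2.1): gamma vanishes if one argument is zero.
    normalized = abs(gamma(0.0, a1)) < tol and abs(gamma(a1, 0.0)) < tol
    # Condition (2.2): consistency with the associativity of the group product.
    lhs = gamma(a1, a2 + a3) + gamma(a2, a3)
    rhs = gamma(a1 + a2, a3) + gamma(a1, a2)
    return normalized and abs(lhs - rhs) < tol
```

Calling `check_cocycle` with arbitrary real parameters returns `True`, which reflects that factors of the form (2.3) can always be removed by redefining the phases of the $U(\alpha)$.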
For a family of unitary operators $U(\alpha)$ with $U(0) = \mathbf{1}$ and $U(\alpha_1 + \alpha_2) = U(\alpha_1)U(\alpha_2)$ it follows that (see [35]) there exists a spectral family $E(k)$ for which

$$U(\alpha) = \int_{-\infty}^{\infty} e^{ik\alpha}\, dE(k). \tag{2.4}$$

From (2.4) it follows that there exists a (not necessarily bounded) self-adjoint operator

$$K = \int_{-\infty}^{\infty} k\, dE(k), \tag{2.5}$$

where

$$U(\alpha) = e^{iK\alpha}. \tag{2.6}$$

The above procedure can also be carried out for the three-parameter abelian group $(1, 0, \vec\eta, 0)$ if we only replace $\alpha$ by a three-dimensional vector $\vec\eta$. Then it follows that we may choose $U(1, 0, \vec\eta, 0)$ in such a way that we obtain a representation without factors, that is, the $U(1, 0, \vec\eta, 0)$ form an abelian group. From (2.6) we find that there exist self-adjoint operators $K_1, K_2, K_3$ (or $\vec K$) for which

$$U(1, 0, \vec\eta, 0) = e^{i\vec K \cdot \vec\eta}. \tag{2.7}$$

Since the $U(1, 0, \vec\eta, 0)$ form an abelian group, the $K_\nu$ must commute. $\vec K$ is not, however, uniquely determined by the choice of factors. It is easy to see that the $U(1, 0, \vec\eta, 0)$ are uniquely determined up to factors of the form $e^{i\vec k \cdot \vec\eta}$ where $\vec k$ is an arbitrary vector. Therefore $\vec K$ is determined only up to an additive term $\vec k\,\mathbf{1}$.

For the elements of the Galileo group we find that

$$(1, 0, A\vec\eta, 0) = (A, 0, 0, 0)(1, 0, \vec\eta, 0)(A^{-1}, 0, 0, 0). \tag{2.8}$$

For the representation we obtain

$$U(A, 0, 0, 0)\,U(1, 0, \vec\eta, 0)\,U(A, 0, 0, 0)^{-1} = \lambda(A, \vec\eta)\,U(1, 0, A\vec\eta, 0). \tag{2.9}$$

The multipliers $\lambda(A, \vec\eta)$ do not depend on the choice of factors for $U(A, 0, 0, 0)$. We may set $\lambda(A, \vec\eta) = 1$ by making a suitable choice of the factor $e^{i\vec k \cdot \vec\eta}$ as follows: From (2.9) it easily follows (where $\lambda(A, \vec\eta) = e^{ig(A, \vec\eta)}$) that

$$g(A, \vec\eta_1 + \vec\eta_2) = g(A, \vec\eta_1) + g(A, \vec\eta_2)$$

and

$$g(A_1 A_2, \vec\eta) = g(A_2, \vec\eta) + g(A_1, A_2\vec\eta).$$

From the first equation it follows that $g(A, \vec\eta) = \vec h(A) \cdot \vec\eta$ and from the second equation we obtain

$$\vec h(A_1 A_2) = \vec h(A_2) + A_2^{-1}\vec h(A_1).$$
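The functional equation for $\vec h(A)$ can be checked against the solution $\vec h(A) = \vec k - A^{-1}\vec k$ that the text states immediately afterwards. The sketch below does this numerically for two sample rotations; the helper names (`rot_z`, `rot_x`, `cocycle_defect`) and the sample vector are illustrative assumptions, not notation from the text.

```python
import math


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    # For a rotation matrix the transpose is the inverse.
    return [[a[j][i] for j in range(3)] for i in range(3)]


def apply(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def h(a, k):
    # Candidate solution h(A) = k - A^{-1} k.
    ainv_k = apply(transpose(a), k)
    return [k[i] - ainv_k[i] for i in range(3)]


def cocycle_defect(a1, a2, k):
    # Largest deviation from h(A1 A2) = h(A2) + A2^{-1} h(A1).
    lhs = h(matmul(a1, a2), k)
    shifted = apply(transpose(a2), h(a1, k))
    rhs = [h(a2, k)[i] + shifted[i] for i in range(3)]
    return max(abs(lhs[i] - rhs[i]) for i in range(3))
```

For any pair of rotations and any vector `k` the defect vanishes up to rounding, which is why a suitable phase $e^{i\vec k\cdot\vec\eta}$ can absorb the multipliers $\lambda(A, \vec\eta)$.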
This equation fixes $\vec h(A)$ up to an arbitrary vector $\vec h(A_0)$ for an $A_0$ for which $\vec h(A_0) \neq 0$. The solution up to an arbitrary vector is given by $\vec h(A) = \vec k - A^{-1}\vec k$. Thus, from (2.9) it follows that the factors $e^{i\vec k \cdot \vec\eta}$ of the $U(1, 0, \vec\eta, 0)$ can be chosen such that the $\lambda(A, \vec\eta)$ are equal to 1. From (2.9), using (2.7), we find that

$$U(A, 0, 0, 0)\,\vec K\,U(A, 0, 0, 0)^{-1} = A^{-1}\vec K. \tag{2.10}$$

By the choice of $\lambda(A, \vec\eta) = 1$ in (2.9) the operator vector $\vec K$ is uniquely determined.

The same procedure can be applied to the subgroup of the proper Galileo transformations $(1, \vec\delta, 0, 0)$. Again the factors can be chosen such that this subgroup has a unitary representation:

$$U(1, \vec\delta, 0, 0) = e^{i\sum_{\nu=1}^{3} X_\nu \delta_\nu} = e^{i\vec X \cdot \vec\delta} \tag{2.11}$$

with mutually commuting self-adjoint operators $X_\nu$ which, by analogy with (2.10), satisfy the equation

$$U(A, 0, 0, 0)\,\vec X\,U(A, 0, 0, 0)^{-1} = A^{-1}\vec X \tag{2.12}$$

and $\vec X$ is uniquely determined.

Although all elements $(1, \vec\delta, \vec\eta, 0)$ form an abelian subgroup of the Galileo group, it is not necessary that the $U(1, \vec\delta, 0, 0)$ commute with the $U(1, 0, \vec\eta, 0)$ because from $(1, \vec\delta, 0, 0)(1, 0, \vec\eta, 0) = (1, 0, \vec\eta, 0)(1, \vec\delta, 0, 0)$ it only follows that

$$e^{iX_\nu\delta_\nu}e^{iK_\mu\eta_\mu} = \lambda_{\nu\mu}(\delta_\nu, \eta_\mu)\,e^{iK_\mu\eta_\mu}e^{iX_\nu\delta_\nu}, \tag{2.13}$$

where $|\lambda_{\nu\mu}| = 1$. Since we have no more free choice of factors for $U(1, \vec\delta, 0, 0)$ and $U(1, 0, \vec\eta, 0)$ we must yet specify what coefficients $\lambda_{\nu\mu}$ can occur in equation (2.13). We will now simplify the answer to this question by assuming that the coefficients $\lambda_{\nu\mu}$ are differentiable, although no additional assumptions are necessary (see [10]).

If we multiply (2.13) on the left with $e^{-iK_\mu\eta_\mu}$, differentiate with respect to $\eta_\mu$ and then set $\eta_\mu = 0$ we obtain

$$-iK_\mu e^{iX_\nu\delta_\nu} + i e^{iX_\nu\delta_\nu}K_\mu = \left(\frac{\partial\lambda_{\nu\mu}}{\partial\eta_\mu}\right)_{\eta_\mu = 0} e^{iX_\nu\delta_\nu}. \tag{2.14}$$

If we then multiply on the right with $e^{-iX_\nu\delta_\nu}$, differentiate with respect to $\delta_\nu$ and finally set $\delta_\nu = 0$ we obtain

$$K_\mu X_\nu - X_\nu K_\mu = \left(\frac{\partial^2\lambda_{\nu\mu}}{\partial\delta_\nu\,\partial\eta_\mu}\right)_{\delta_\nu = 0,\ \eta_\mu = 0}. \tag{2.15}$$

From (2.10) and (2.12) we obtain
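$$\left(\frac{\partial^2\lambda_{\nu\mu}}{\partial\delta_\nu\,\partial\eta_\mu}\right)_{\delta_\nu = 0,\ \eta_\mu = 0} = i m\,\delta_{\nu\mu},$$

where $m$ must be real, because $X_\nu$ and $K_\mu$ are self-adjoint. Therefore (2.15) becomes

$$K_\mu X_\nu - X_\nu K_\mu = i m\,\delta_{\nu\mu}\mathbf{1}. \tag{2.16}$$

We may choose $m \geq 0$ because $m$ and $-m$ lead to an equivalent representation of $\mathcal{G}$ in $\mathcal{A}$, since $m$ transforms into $-m$ by means of an anti-unitary transformation $V$ (because $i$ transforms into $-i$):

$$(V K_\mu V^{-1})(V X_\nu V^{-1}) - (V X_\nu V^{-1})(V K_\mu V^{-1}) = i(-m)\,\delta_{\nu\mu}\mathbf{1}.$$

We have to distinguish between two cases: $m = 0$ and $m \neq 0$. For $m \neq 0$ the $K_\nu$ do not commute with the $X_\nu$. However, (2.16) is only an abbreviated notation for (2.13):

$$e^{i\vec X \cdot \vec\delta}e^{i\vec K \cdot \vec\eta} = e^{i m\,\vec\delta \cdot \vec\eta}\,e^{i\vec K \cdot \vec\eta}e^{i\vec X \cdot \vec\delta}. \tag{2.17}$$

The result of these considerations (we again mention) is valid without any assumption about $\lambda_{\nu\mu}$ (see [10]). Different values of $m$ lead to inequivalent representations, because the factors in $U(1, \vec\delta, 0, 0)$ and $U(1, 0, \vec\eta, 0)$ cannot be varied any more and the number $|m|$ remains unchanged under unitary or anti-unitary transformations.

The elements $(1, 0, 0, \gamma)$ form an abelian subgroup; therefore there exists a unitary representation

$$U(1, 0, 0, \gamma) = e^{iH\gamma}, \tag{2.20}$$

where $H$ is a self-adjoint operator, which is uniquely determined up to an additive term $\varepsilon\mathbf{1}$. From $(A, 0, 0, 0)(1, 0, 0, \gamma)(A^{-1}, 0, 0, 0) = (1, 0, 0, \gamma)$ it follows that

$$U(A, 0, 0, 0)\,e^{iH\gamma}\,U(A, 0, 0, 0)^{-1} = \alpha(A)\,e^{iH\gamma},$$

where $\alpha(A_1 A_2) = \alpha(A_1)\alpha(A_2)$. Since the only one-dimensional representation of the rotation group is the identity (see §3) it follows that $\alpha = 1$, that is,

$$U(A, 0, 0, 0)\,H\,U(A, 0, 0, 0)^{-1} = H. \tag{2.21}$$

Equation (2.21) describes the rotation invariance of $H$. From $(1, 0, \vec\eta, 0)(1, 0, 0, \gamma)(1, 0, \vec\eta, 0)^{-1} = (1, 0, 0, \gamma)$ it follows that

$$e^{iK_\nu\eta_\nu}e^{iH\gamma}e^{-iK_\nu\eta_\nu} = \beta_\nu(\eta_\nu, \gamma)\,e^{iH\gamma}.$$

From the preceding results and from (2.10) it follows that $\beta_\nu = 1$, that is, $H$ and the $K_\nu$ commute.

From $(1, 0, 0, \gamma)(1, \vec\delta, 0, 0)(1, 0, 0, \gamma)^{-1} = (1, \vec\delta, -\vec\delta\gamma, 0)$ it follows that

$$e^{iH\gamma}e^{i\vec X \cdot \vec\delta}e^{-iH\gamma} = \rho(\gamma, \vec\delta)\,e^{i\vec X \cdot \vec\delta}e^{-i\vec K \cdot \vec\delta\gamma}.$$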
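If we differentiate with respect to $\gamma$ and set $\gamma = 0$ we obtain (for $\rho(0, \vec\delta) = 1$)

$$iH e^{i\vec X \cdot \vec\delta} - i e^{i\vec X \cdot \vec\delta}H = \left(\frac{\partial\rho}{\partial\gamma}\right)_{\gamma = 0} e^{i\vec X \cdot \vec\delta} - i e^{i\vec X \cdot \vec\delta}\,\vec K \cdot \vec\delta.$$

Multiplying on the left with $e^{-i\vec X \cdot \vec\delta}$, differentiating with respect to $\vec\delta$ and setting $\vec\delta = 0$ we obtain

$$\vec X H - H\vec X = \left(\frac{\partial^2\rho}{\partial\gamma\,\partial\vec\delta}\right)_{\vec\delta = 0,\ \gamma = 0} - i\vec K. \tag{2.22}$$

According to (2.12) we must obtain

$$\left(\frac{\partial^2\rho}{\partial\gamma\,\partial\vec\delta}\right)_{\vec\delta = 0,\ \gamma = 0} = 0.$$

Then combining the results of (2.22) with (2.17) we obtain

$$H = \frac{1}{2m}\vec K^2 + H_i \tag{2.23}$$

whereby we find that the $X_\nu$ and $K_\mu$ commute with $H_i$.

From (2.17) it follows that the Hilbert space $\mathcal{H}$ can be represented in the following way, that is, $\mathcal{H}$ together with the operators $e^{i\vec X \cdot \vec\delta}$, $e^{i\vec K \cdot \vec\eta}$ is isomorphic to the following form:

$$\mathcal{H} = \mathcal{H}_b \times \mathcal{H}_i, \tag{2.24}$$

where $\mathcal{H}_b$ can be chosen as the space $\mathcal{L}^2(\mathbb{R}^3, dk_1\,dk_2\,dk_3)$. Then it is easy to obtain

$$e^{i\vec K \cdot \vec\eta}:\ e^{i\vec K \cdot \vec\eta} \times \mathbf{1}, \qquad e^{i\vec X \cdot \vec\delta}:\ e^{i\vec X \cdot \vec\delta} \times \mathbf{1}. \tag{2.25}$$

On the right-hand sides of the above equations $\vec K$ and $\vec X$ are operators only in $\mathcal{H}_b$, and for $\varphi(\vec k) \in \mathcal{L}^2(\mathbb{R}^3, dk_1\,dk_2\,dk_3)$ we obtain

$$e^{i\vec K \cdot \vec\eta}\varphi(\vec k) = e^{i\vec k \cdot \vec\eta}\varphi(\vec k), \tag{2.26}$$

$$e^{i\vec X \cdot \vec\delta}\varphi(\vec k) = \varphi(\vec k + m\vec\delta). \tag{2.27}$$

Thus it follows that any operator which commutes with all the operators $e^{i\vec X \cdot \vec\delta}$, $e^{i\vec K \cdot \vec\eta}$ must have the form $\mathbf{1} \times A$. In particular, we may write (2.23) in the form

$$H = \frac{1}{2m}\vec K^2 \times \mathbf{1} + \mathbf{1} \times H_i. \tag{2.28}$$

In this way we obtain

$$\vec K^2\varphi(\vec k) = \vec k^2\varphi(\vec k) \tag{2.29}$$

and

$$e^{i(1/2m)\vec K^2\gamma}\varphi(\vec k) = e^{i(1/2m)\vec k^2\gamma}\varphi(\vec k). \tag{2.30}$$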
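The explicit forms (2.26), (2.27), and (2.30) make it possible to test the commutation relation between boosts and translations, and the relation between boosts and time translations, directly on a sample function. The following is a minimal one-dimensional sketch: the mass value `M`, the test function `phi0`, and the explicit phase `rho` (which was worked out for this example and is not stated in the text) are assumptions made for illustration, and $H_i$ is set to zero.

```python
import cmath

M = 2.0  # hypothetical mass parameter, any nonzero value works


def exp_iK(eta):
    # (2.26): multiplication by exp(i k eta).
    return lambda phi: (lambda k: cmath.exp(1j * k * eta) * phi(k))


def exp_iX(delta):
    # (2.27): shift of the argument, k -> k + m delta.
    return lambda phi: (lambda k: phi(k + M * delta))


def exp_iH(gamma):
    # (2.30) with H_i = 0: multiplication by exp(i k^2 gamma / 2m).
    return lambda phi: (lambda k: cmath.exp(1j * k * k * gamma / (2 * M)) * phi(k))


def phi0(k):
    # Arbitrary smooth test function.
    return cmath.exp(-k * k / 2) * (1 + 0.5j * k)


def weyl_defect(delta, eta, k):
    # Compare exp(iX delta) exp(iK eta) with exp(i m delta eta) exp(iK eta) exp(iX delta).
    lhs = exp_iX(delta)(exp_iK(eta)(phi0))(k)
    rhs = cmath.exp(1j * M * delta * eta) * exp_iK(eta)(exp_iX(delta)(phi0))(k)
    return abs(lhs - rhs)


def boost_time_defect(gamma, delta, k):
    # Compare exp(iH gamma) exp(iX delta) exp(-iH gamma)
    # with rho * exp(iX delta) exp(-iK delta gamma).
    lhs = exp_iH(gamma)(exp_iX(delta)(exp_iH(-gamma)(phi0)))(k)
    rho = cmath.exp(1j * M * delta * delta * gamma / 2)  # assumed phase, equals 1 at gamma = 0
    rhs = rho * exp_iX(delta)(exp_iK(-delta * gamma)(phi0))(k)
    return abs(lhs - rhs)
```

Both defects vanish up to rounding for arbitrary arguments. The first check corresponds to the exponentiated form of (2.16); the second shows that the free-particle form of $H$ in (2.28) reproduces the group relation between time translations and proper Galileo transformations.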
The proof of these relationships is given in [10]; certain aspects of the proof can be found in IX. On the basis of (2.26) and (2.27) it follows that the space $\mathcal{H}_b$ is irreducible relative to the transformations $e^{i\vec K \cdot \vec\eta}$ and $e^{i\vec X \cdot \vec\delta}$.

We have not yet given an explicit description of the representation of the subgroup $\mathcal{D}$ of spatial rotations. In the previously given relations such as (2.10), (2.12), and (2.21) the choice of factors in $U(A, 0, 0, 0)$ was arbitrary. Since $\mathcal{D}$ is compact, each irreducible representation of $\mathcal{D}$ is (up to a factor) finite-dimensional (see [10] and AV, §10), and each representation (up to a factor) in $\mathcal{H}$ can be reduced to the form

$$\mathcal{H} = \sum_n \oplus\, \mathcal{H}_n$$

where the $\mathcal{H}_n$ are invariant with respect to the representation of $\mathcal{D}$ and are irreducible subspaces of $\mathcal{H}$. Since $\mathcal{D}$ is compact each representation is, up to a factor, equivalent to a normal unitary representation of the covering group $\tilde{\mathcal{D}}$ of $\mathcal{D}$ (see [10] and AV, §10.7).

For $A \in \mathcal{D}$ we may define an operator $V(A)$ in $\mathcal{H}_b$ by

$$V(A)\varphi(\vec k) = \varphi(A^{-1}\vec k). \tag{2.31}$$

It is easy to see that $V(A)$ is unitary, and that $V(A)$ defines a representation of $\mathcal{D}$. We then obtain

$$V(A)\,\vec K\,V(A)^{-1} = A^{-1}\vec K \tag{2.32}$$

and, from (2.27) we obtain

$$V(A)\,\vec X\,V(A)^{-1} = A^{-1}\vec X. \tag{2.33}$$

From (2.10) and (2.12) it follows that

$$R(A) = V(A)^{-1}U(A, 0, 0, 0) \tag{2.34}$$

commutes with all $e^{i\vec X \cdot \vec\delta}$ and $e^{i\vec K \cdot \vec\eta}$, that is, is of the form $R(A):\ \mathbf{1} \times R(A)$, and that we may write

$$U(A, 0, 0, 0) = V(A) \times R(A). \tag{2.35}$$

Since the $U(A, 0, 0, 0)$ form a representation of $\mathcal{D}$ up to a factor, this must also be true of $R(A)$. Then, from (2.21) and (2.28) it follows that (since $V(A)$ commutes with $\vec K^2$):

$$H_i R(A) = R(A) H_i, \tag{2.36}$$

that is, $H_i$ commutes with all $R(A)$. Therefore we may obtain an irreducible representation of the Galileo group (for $m \neq 0$) only if the representation of $\mathcal{D}$ by means of $R(A)$, that is, in $\mathcal{H}_i$, is irreducible. Thus it follows that $\mathcal{H}_i$ is finite dimensional and (from (2.36)) we find that we must have

$$H_i = \lambda\mathbf{1}, \tag{2.37}$$
that is, $H$ is, according to (2.28), uniquely determined up to an additive constant, which, by choosing a suitable factor, can be set equal to 0.

The irreducible representation space $\mathcal{H}_i$ is called the “spin space” of the elementary system; for elementary systems we shall use the notation $\mathfrak{r}$ instead of $\mathcal{H}_i$, that is, where we use the symbol $\mathfrak{r}$ we shall be considering the spin space of an elementary system.

In §3 we shall consider the representations of $\mathcal{D}$ in $\mathcal{H}_b$ by means of $V(A)$ and in $\mathfrak{r}$ by means of $R(A)$ separately, because these representations play a central role in the applications of quantum mechanics. We will see that each irreducible representation is uniquely characterized by a number $s = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$ and that this number will be used as an index for $\mathfrak{r}$.

Thus we find that the Galileo group as a group of transformations of registration procedures relative to preparation procedures leads, without any additional assumptions other than those introduced in §1 (!), to the following structure which is of central importance in quantum mechanics:

For the case $m \neq 0$, to each type of elementary systems there are two parameters $m$ and $s$ and the corresponding irreducible representations of the Galileo group. The necessarily infinite-dimensional Hilbert space $\mathcal{H}$ of such a type of elementary systems can be written in the form $\mathcal{H}_b \times \mathfrak{r}_s$ where $\mathcal{H}_b = \mathcal{L}^2(\mathbb{R}^3, dk_1\,dk_2\,dk_3)$ and the rules for the transformation of the Galileo group are given by (2.26), (2.27), (2.35), and (2.31) and by

$$e^{iH\gamma} = e^{i(1/2m)\vec K^2\gamma} \times \mathbf{1}. \tag{2.38}$$

In §4 we will see that (2.16) is equivalent to the Heisenberg uncertainty relations.

The typical quantum mechanical structure obtained from the representation of the Galileo group by means of $\mathcal{B}$-continuous effect automorphisms is not a consequence (!) of the introduction (§1) of the structure of the Galileo transformations as transformations of registration procedures. Everything introduced in §1 also holds for classical systems.
The distinction between classical systems and microsystems lies exclusively in axiom AV 4s in III, §3. If, on the contrary, we make the assumption that the systems under consideration are physical objects (see the remarks in III, §3 following AV 4s), then it follows for elementary systems that they are “mass points which move with constant velocity in a straight line between the preparation and registration apparatus” (for proof see [21]).

The fact that every elementary quantum mechanical system type uniquely determines two parameters $m$ and $s$ does not, of course, mean that every elementary system type is uniquely characterized by these two parameters. It is possible to give other “objective” properties in addition to $m$ and $s$ for elementary systems (for objective properties, see IV, §8.1). The fact that $m$ and $s$ are objective properties follows directly from the definition that they are parameters which correspond to atoms in the center $Z$ of $G$.

D 2.2. The parameter $m$ is called the mass of the elementary system type; the parameter $s$ is called the spin of the elementary system type.
We shall later compare (see VIII, §6 and XVII, §6.2) the concept of “mass” described above with the “usual” concept. The meaning of the parameter $s$ will be explained in §3 and §5.

The fact that in this formulation of quantum mechanics there is no “constant” $\hbar = h/2\pi$ (where $h$ is Planck’s constant) is not a defect. On the contrary, it merely expresses the fact that this formulation considers only the essential structure of quantum mechanics. This structure shows that the quantum mechanical laws are not invariant under transformations $m \to \lambda m$ where $\lambda > 0$. Because all previous classical theories exhibit this invariance, it seems advantageous to introduce a particular unit of mass in classical physics. For quantum mechanics it appears to be somewhat “artificial” to introduce a special unit of mass; if this were done, it would be necessary to introduce a conversion factor between the “natural” unit in quantum mechanics and the “artificial” unit in classical mechanics. The natural unit is $(\mathrm{cm})^{-1}$ if the velocity of light is taken to be 1, that is, if the time is also measured in cm. Then we would have

$$1\ (\mathrm{cm})^{-1} \mathrel{\hat{=}} \hbar\ (\mathrm{gram}), \tag{2.39}$$

where gram is the usual mass unit. The conversion factor in (2.39) can be found if we measure, for example, the mass of a hydrogen atom in $(\mathrm{cm})^{-1}$ and determine what the mass of a $(\mathrm{cm})^3$ of water is compared to that of a hydrogen atom (see XVII, §6.2).

We have not yet discussed the case $m = 0$ in (2.16). In [10] the remaining possible irreducible representations of the Galileo group are given. Here we briefly mention these representations and indicate the experimental evidence that such systems are not found in nature; this evidence will be formulated in terms of the following axiom: $m \neq 0$. (Light quanta cannot be described in terms of the Galileo group; here the representations of the Poincaré group must be used; see the remarks in §1 and at the end of §2.)

For $m = 0$ it follows from (2.16) that all $K_\nu$ and $X_\mu$ commute.
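To give a feeling for the size of the conversion factor in (2.39), the following sketch evaluates it in conventional CGS units. It assumes that the factor is $\hbar/c$ once the velocity of light is restored (the text sets it to 1), and it uses standard published values for $\hbar$, $c$, and an approximate hydrogen atom mass; these numbers and the function names are not taken from the text.

```python
HBAR_ERG_S = 1.054571817e-27   # reduced Planck constant in erg s (CODATA 2018)
C_CM_PER_S = 2.99792458e10     # velocity of light in cm/s (exact by definition)
M_HYDROGEN_G = 1.6735e-24      # approximate mass of a hydrogen atom in grams


def grams_per_inverse_cm():
    # Mass in grams that corresponds to 1 (cm)^-1, assuming the factor hbar / c.
    return HBAR_ERG_S / C_CM_PER_S


def mass_in_inverse_cm(mass_g):
    # Express a mass given in grams in the "natural" unit (cm)^-1.
    return mass_g / grams_per_inverse_cm()
```

Under these assumptions one inverse centimeter corresponds to roughly $3.5 \times 10^{-38}$ gram, and `mass_in_inverse_cm(M_HYDROGEN_G)` gives a value of the order of $5 \times 10^{13}\ (\mathrm{cm})^{-1}$ for the hydrogen atom.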
With respect to the above derivation not much is changed; it is only necessary to set $m = 0$. This matters only in the transition between (2.22) and (2.23), that is, in the determination of $H$ from equation (2.22):

$$\vec X H - H\vec X = -i\vec K. \tag{2.40}$$

Since $\vec X$ and $H$ commute with $\vec K$ we may treat $\vec K$ in (2.40) like a number. From (2.40), by analogy with (2.17) and (2.16), it follows that

$$e^{i\vec X \cdot \vec\delta}e^{iH\gamma}e^{-i\vec X \cdot \vec\delta} = e^{i(H + \vec K \cdot \vec\delta)\gamma}. \tag{2.41}$$

From (2.41) it follows that we may take (for an irreducible representation) the space of square-integrable functions $\varphi(k_0, \vec k)$ with fixed $|\vec k| = r$ and with integration measure $dk_0\, r\, d\omega$ (where $d\omega$ is an element of solid angle, that is, a surface element of the unit sphere) for $\mathcal{H}_b$ in (2.24). From (2.10) it follows that
all directions are needed for $\vec k$. The equations for the representation operators are given by

$$e^{i\vec K \cdot \vec\eta}\varphi(k_0, \vec k) = e^{i\vec k \cdot \vec\eta}\varphi(k_0, \vec k),$$

$$e^{iH\gamma}\varphi(k_0, \vec k) = e^{ik_0\gamma}\varphi(k_0, \vec k),$$

$$e^{i\vec X \cdot \vec\delta}\varphi(k_0, \vec k) = \varphi(k_0 + \vec k \cdot \vec\delta, \vec k).$$

For $r = 0$, $\vec X$ commutes with $H$. For such an irreducible representation $e^{iH\gamma}$ reduces to a multiple of the unit operator.

The following experimental evidence shows that there are no physical systems which correspond to the representation for which $m = 0$. For $r = 0$ it follows that all effects are invariant under time displacements (since $e^{iH\gamma}$ commutes with all $F \in L$). With the exception of a “vacuum” there is no physical system known (see below) for which such a “time invariance” is valid. For $r \neq 0$ we have $|\vec k| = r$. Suppose we produce an ensemble $W$ for which $\mathrm{tr}(W\,U(1, 0, \eta_\nu, 0)\,F\,U(1, 0, \eta_\nu, 0)^{-1})$ changes slowly with $\eta_1, \eta_2$ for all $F$. Then it follows that this expression will also vary weakly with $\eta_3$. This contradicts experience, because experience has shown that it is possible to make ensembles which depend weakly on displacements $\eta_1, \eta_2$ but strongly on displacements $\eta_3$ in the third direction (in all scattering problems we seek to produce ensembles which depend weakly on $\eta_1, \eta_2$ in order that a “beam” of systems can be directed in the third direction; see XVI, §6.3).

Often the following argument is also introduced: For $m = 0$ there exists no decision observable for position in the sense discussed in §4. This objection is more than questionable, because it is practically impossible to prove experimentally the existence of the decision observable for position (constructed in §4): the latter is an idealization which is only obtained approximately (with difficulty) in terms of real constructible registration methods $b_0$ (in the sense of IV, §4).
The nonexistence of a decision observable for position will therefore not immediately contradict all known experiments, because it is conceivable that there still exists a position observable (in the general sense, that is, a measure $\Sigma \to L$ with $\Sigma$ as the ring of the “position domain” in the sense of §4) which describes what is measured (see [22]); the light quantum is a good example. The assertion of the existence of a decision observable for position on arbitrary a priori grounds would absolutely contradict the concepts of physics as carried out here for the example of quantum mechanics and as described in general in [1]. Such a priori principles are not admissible in the development of an axiomatic basis for quantum mechanics. This, of course, does not mean that intuitive concepts such as the concept of “position” cannot be used in order to guess (or “discover”) a $\mathcal{PT}$.

There exists a trivial irreducible representation of the Galileo group: the identity in a one-dimensional Hilbert space. The only other additional elementary system types are those whose corresponding Hilbert spaces are one-dimensional. There are no objective properties (IV, §8.1) which are
experimentally known which permit us to distinguish such elementary system types whose corresponding Hilbert spaces are one-dimensional. We therefore impose the axiom that there exists only a single one-dimensional Hilbert space $\mathcal{H}_\nu$. For this one-dimensional Hilbert space we shall use the index 0: $\mathcal{H}_0$. The corresponding system type will be called the “vacuum.” The corresponding objective property (as a subset of $M$) will be denoted by $M_0$. According to IV, §8.1 we therefore obtain

$$M_0 = \Bigl[\bigcup_{\substack{a \in \mathcal{Q}' \\ \varphi(a) \in K_1(e_0)}} a\Bigr] \cup \Bigl[\bigcup_{\substack{b \in \mathcal{R} \\ \psi(b_0, b) \leq e_0}} b\Bigr]$$

where $e_0$ is the atom of the center $Z$ of $G$ which projects onto $\mathcal{H}_0$.

The language used by the physicist for the situation $x \in M_0$ is that “no” microsystem is “present,” and the set $M \setminus M_0$ is often called the set of “proper” microsystems. For the formulation presented in II it is conceptually important not to exclude the possibility that $x \in M_0$, because it is conceptually impossible to describe the set $M \setminus M_0$ before the introduction of the concept of a “system type.” Therefore the set $M$ in II is only an aid to mediate between the preparation and registration. Conceptually it would be clearer to construct quantum mechanics without the aid of a set $M$ of microsystems and instead to use only mathematical structures which describe the preparation and registration apparatuses and the connection between them (see [3], [2], XVI, and [13]).

The discussions in §1 and §2 may also be carried out for the case of the Poincaré group. The same is true for parts of the discussion in sections §4 to §7, but not for the considerations in VIII. The experiments of “elementary particle” physics lead us to suspect that there are actually no “elementary” systems in the sense of D 2.1 (with respect to the Poincaré group). The concept of an elementary particle (as introduced in §2) appears to have a meaningful application only in the realm of nonrelativistic physics of microsystems.
3 Irreducible Representations of the Rotation Group

Since the unitary representations (up to a factor) of the rotation group $\mathcal{D}$ are equivalent to the unitary representations of the covering group $\tilde{\mathcal{D}}$ (see [10] and AV, §10.7) we may restrict our consideration to the unitary representations of $\tilde{\mathcal{D}}$. Since $\tilde{\mathcal{D}}$ is compact, all irreducible representations are finite dimensional (see [10] and AV, §10.6). Let $\mathcal{H}$ be a finite-dimensional Hilbert space and $U(A)$ a unitary representation of $\tilde{\mathcal{D}}$ in $\mathcal{H}$. For a rotation $A$ of angle $\alpha$ about the 3-axis the equation $U(A) = U(\alpha)$ defines the representation of a one-parameter group satisfying $U(\alpha_1 + \alpha_2) = U(\alpha_1)U(\alpha_2)$, for which an infinitesimal rotation $J_3$ is defined by

$$U(\alpha) = e^{\alpha J_3}.$$
Since $\mathcal{H}$ is finite dimensional, $J_3$ is defined in all of $\mathcal{H}$. Since $U$ is unitary, $iJ_3$ is a self-adjoint operator. In the same way infinitesimal rotations $J_1, J_2$ are defined for the axes 1 and 2. We therefore set

$$L_\nu = iJ_\nu \qquad (\nu = 1, 2, 3). \tag{3.1}$$

Thus the $L_\nu$ are self-adjoint operators. From the representation property of the $U(A)$ it follows that (see AV, §10.5) the $J_\nu$ in $\mathcal{H}$ satisfy the same commutation relations as the corresponding infinitesimal rotations in $\mathcal{D}$ itself, that is,

$$J_\nu J_\mu - J_\mu J_\nu = J_\rho \quad \text{where } (\nu, \mu, \rho) = (1, 2, 3),\ (2, 3, 1),\ (3, 1, 2). \tag{3.2}$$

For the $L_\nu$ it follows that

$$L_\nu L_\mu - L_\mu L_\nu = iL_\rho \quad \text{where } (\nu, \mu, \rho) = (1, 2, 3),\ (2, 3, 1),\ (3, 1, 2). \tag{3.3}$$

Since $\mathcal{H}$ is finite dimensional, the $L_\nu$ have (as is the case for all self-adjoint operators in $\mathcal{H}$) a discrete spectrum of eigenvalues and a complete orthonormal basis of eigenvectors. For that reason it is easy to carry out the following computations.

We replace $L_1, L_2$ by the operators

$$N = L_1 + iL_2 \quad \text{and} \quad N^+ = L_1 - iL_2 \tag{3.4}$$

and we define

$$\vec L^2 = L_1^2 + L_2^2 + L_3^2. \tag{3.5}$$

It follows that

$$L_3 N - N L_3 = N, \qquad L_3 N^+ - N^+ L_3 = -N^+ \tag{3.6}$$

and

$$N N^+ = \vec L^2 + L_3 - L_3^2, \qquad N^+ N = \vec L^2 - L_3 - L_3^2. \tag{3.7}$$

If $v$ is an arbitrary eigenvector of $L_3$ in $\mathcal{H}$ which satisfies

$$L_3 v = \mu v \tag{3.8}$$

then, from (3.6) it follows that

$$L_3 N v = (N L_3 + N)v = (\mu + 1)Nv \tag{3.9}$$

and

$$L_3 N^+ v = (N^+ L_3 - N^+)v = (\mu - 1)N^+ v. \tag{3.10}$$

If $Nv \neq 0$ then $Nv$ is an eigenvector of $L_3$ with eigenvalue $(\mu + 1)$; if $N^+ v \neq 0$, then $N^+ v$ is an eigenvector of $L_3$ with eigenvalue $(\mu - 1)$. If we apply $N$ repeatedly we obtain increasing eigenvalues of $L_3$ providing that we do not
obtain the null vector. Since $\mathcal{H}$ is finite dimensional there exists an integer $n$ such that $N^n v \neq 0$, $L_3 N^n v = (\mu + n)N^n v$ and $N^{n+1}v = 0$. Let $N^n v$ be denoted by $u_j$ ($j = \mu + n$); we therefore obtain

$$L_3 u_j = j u_j \quad \text{and} \quad N u_j = 0. \tag{3.11}$$

From (3.11) it follows that $N^+ N u_j = 0$ and from (3.7) and (3.11) we obtain

$$\vec L^2 u_j = j(j + 1)u_j. \tag{3.12}$$

Since we may assume that $u_j$ is normalized, we may recursively define:

$$N^+ u_m = \tau_m u_{m-1}, \qquad m = j, j - 1, \ldots \tag{3.13}$$

where $\tau_m$ is so chosen that $\|u_m\| = 1$ for all $m$. Since we have required only that $\|u_m\| = 1$, we (arbitrarily) choose $\tau_m$ to be real and $> 0$. The sequence of the $u_m$ can be continued until $N^+ u_{m'} = 0$ for some value $m'$. From (3.10) we find for all $u_m$ defined according to (3.13) that

$$L_3 u_m = m u_m. \tag{3.14}$$

From the commutation relations (3.3), and intuitively from the fact that $\vec L^2$ is the square of the magnitude of a vector, it can easily be seen that $\vec L^2$ commutes with the rotations and, therefore, with all $J_\nu$. Therefore $\vec L^2$ commutes also with $N$ and $N^+$. From (3.12) and (3.13) it follows that

$$\vec L^2 u_m = j(j + 1)u_m \tag{3.15}$$

for all $m$. Since $\mathcal{H}$ is finite dimensional, for a particular value $m'$ (for which $u_{m'} \neq 0$) the relationship $N^+ u_{m'} = 0$ must hold. From (3.7) and (3.15) it follows that $m'(m' - 1) = j(j + 1)$. Since, according to the definition of the $u_m$, the relation $m' \leq j$ holds, we therefore obtain $m' = -j$. Conversely, from

$$\|N^+ u_{-j}\|^2 = \langle u_{-j}, N N^+ u_{-j}\rangle = \langle u_{-j}, (\vec L^2 + L_3 - L_3^2)u_{-j}\rangle = 0$$

it follows that $N^+ u_{-j} = 0$.

The sequence of the $u_m$ runs as follows: $m = j, j - 1, \ldots, -j$. Since the number $(2j + 1)$ must be an integer, we find that $j$ may only take on half-integer values: $0, \tfrac{1}{2}, 1, \tfrac{3}{2}, 2, \ldots$, etc. We obtain the normalization condition from (3.13) as follows:

$$\tau_m^2 = \|N^+ u_m\|^2 = \langle u_m, N N^+ u_m\rangle$$

and, from (3.7), (3.14), and (3.15) (since we have chosen $\tau_m$ to be positive real) we obtain:

$$\tau_m = \sqrt{j(j + 1) + m - m^2} = \sqrt{(j + m)(j - m + 1)}. \tag{3.16}$$
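The termination argument can be followed step by step with exact rational arithmetic. The sketch below walks down from $m = j$, evaluates $\tau_m^2$ from (3.7), (3.14), and (3.15), confirms the factorized form in (3.16), and stops when $\tau_m^2 = 0$. The function name `ladder` and the error raised for inadmissible $j$ are illustrative choices and not part of the text.

```python
from fractions import Fraction


def ladder(j):
    """Return the pairs (m, tau_m squared) met when stepping down from m = j."""
    j = Fraction(j)
    steps = []
    m = j
    while True:
        tau_sq = j * (j + 1) + m - m * m            # from (3.7), (3.14), (3.15)
        assert tau_sq == (j + m) * (j - m + 1)      # factorized form in (3.16)
        if tau_sq < 0:
            # A squared norm cannot be negative: 2j + 1 is not an integer.
            raise ValueError("no finite-dimensional representation for this j")
        steps.append((m, tau_sq))
        if tau_sq == 0:                             # N^+ u_m = 0, the sequence stops
            break
        m -= 1
    return steps
```

For $j = \tfrac{3}{2}$ the function returns four steps ending at $m = -\tfrac{3}{2}$, in agreement with $m' = -j$ and the dimension $2j + 1$; for a value such as $j = \tfrac{1}{3}$ the squared norm would become negative, which is the reason why only half-integer $j$ occur.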
Equation (3.13) then becomes

$$N^+ u_m = \sqrt{(j + m)(j - m + 1)}\;u_{m-1}. \tag{3.17}$$

Since $N$ is the adjoint operator to $N^+$ and since the $u_m$ are orthonormal it follows that

$$N u_m = \sqrt{(j - m)(j + m + 1)}\;u_{m+1}. \tag{3.18}$$

The subspace spanned by the $u_m$ is therefore invariant under $N$, $N^+$, $L_3$ and therefore under the $J_\nu$. Since the set of operators given by

$$U(A) = e^{\sum_\nu \alpha_\nu J_\nu} \tag{3.19}$$

contains the representative operators for all rotations $A$ (see AV, §10.4), the subspace spanned by the $u_m$ is invariant under all the operators $U(A)$. Since $\mathcal{H}$ is irreducible, the $u_m$ span all of $\mathcal{H}$.

If, conversely, we abstractly construct a space $\mathcal{H}_j$ by specifying the $(2j + 1)$ vectors $u_m$ ($m = -j, -j + 1, \ldots, j$) as a complete orthonormal basis in $\mathcal{H}_j$ and define the operators $L_3$, $N$, $N^+$ by the equations (3.14), (3.17), and (3.18) and the operators $L_1, L_2$ by (3.4), then for the $L_\nu$ it follows that the commutation relations (3.3) hold. Then, for the $J_\nu$ defined by (3.1) we find that the commutation rules (3.2) hold. Each $A \in \tilde{\mathcal{D}}$ can be described by a rotation axis and a rotation angle, that is, by three parameters $\alpha_1, \alpha_2, \alpha_3$ (see AV, §10.7). Since the $J_\nu$ satisfy the commutation relations (3.2) it follows that the $U(A)$ form a representation of the covering group. That this representation is irreducible follows from the construction, because to every invariant subspace there exists an eigenvector of $L_3$ which must coincide with one of the $u_m$, from which we find that $N$ and $N^+$ “generate” the entire space.

In the sequel we shall denote the above irreducible representation in $\mathcal{H}_j$ by $\mathcal{D}_j$ (in particular, for the simplification of the discussion in XI-XVI). $\mathcal{D}_j$ also characterizes a class of equivalent representations, independent of the vector space in which it is realized.

The representations $\mathcal{D}_j$ may be obtained in a purely algebraic fashion, without the use of the theorems of Lie groups. The latter approach is often of great practical value. For this reason we shall explicitly derive the representation $\mathcal{D}_{1/2}$ of $\tilde{\mathcal{D}}$ in $\mathcal{H}_{1/2}$. We shall denote the basis vectors $u_{+1/2}$, $u_{-1/2}$ of $\mathcal{H}_{1/2}$ by $u_+$ and $u_-$. In $\mathcal{H}_{1/2}$ we find that, according to (3.14), (3.17), and (3.18), the infinitesimal rotations satisfy the equations:

$$\begin{aligned}
J_1 u_+ &= -\tfrac{i}{2}u_-, & J_1 u_- &= -\tfrac{i}{2}u_+, \\
J_2 u_+ &= \tfrac{1}{2}u_-, & J_2 u_- &= -\tfrac{1}{2}u_+, \\
J_3 u_+ &= -\tfrac{i}{2}u_+, & J_3 u_- &= \tfrac{i}{2}u_-.
\end{aligned} \tag{3.20}$$
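The converse construction (define $L_3$, $N$, $N^+$ on an abstract basis and recover the commutation relations) can be carried out explicitly with small matrices. The sketch below builds the matrices from (3.14), (3.17), and (3.18) in the basis $u_j, u_{j-1}, \ldots, u_{-j}$, forms $L_1, L_2$ from (3.4), and measures how far the relations (3.3) are violated. All function names and the ordering of the basis are illustrative assumptions.

```python
import math


def spin_matrices(two_j):
    """Matrices of L3, N, N+ in the basis u_j, ..., u_{-j} (index 0 is m = j)."""
    dim = two_j + 1
    j = two_j / 2
    ms = [j - r for r in range(dim)]
    L3 = [[(ms[a] if a == b else 0.0) + 0j for b in range(dim)] for a in range(dim)]
    Nplus = [[0j] * dim for _ in range(dim)]   # N^+ lowers m by one, (3.17)
    N = [[0j] * dim for _ in range(dim)]       # N raises m by one, (3.18)
    for r, m in enumerate(ms):
        if r + 1 < dim:
            Nplus[r + 1][r] = math.sqrt((j + m) * (j - m + 1)) + 0j
        if r - 1 >= 0:
            N[r - 1][r] = math.sqrt((j - m) * (j + m + 1)) + 0j
    return L3, N, Nplus


def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lin(ca, a, cb, b):
    # Linear combination ca * a + cb * b of two square matrices.
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]


def max_abs(a):
    return max(abs(x) for row in a for x in row)


def commutation_defect(two_j):
    L3, N, Nplus = spin_matrices(two_j)
    L1 = lin(0.5, N, 0.5, Nplus)        # L1 = (N + N^+) / 2, from (3.4)
    L2 = lin(-0.5j, N, 0.5j, Nplus)     # L2 = (N - N^+) / (2i), from (3.4)
    Ls = [L1, L2, L3]
    worst = 0.0
    for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        comm = lin(1, mul(Ls[a], Ls[b]), -1, mul(Ls[b], Ls[a]))
        worst = max(worst, max_abs(lin(1, comm, -1j, Ls[c])))   # relation (3.3)
    return worst
```

The argument `two_j` is the integer $2j$, so `commutation_defect(1)` treats the case $j = \tfrac{1}{2}$ of (3.20). The defect is zero up to rounding for every admissible $j$, which is the content of the statement that the relations (3.3) hold in the abstractly constructed space.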
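Thus from (3.20) it follows that the following equations hold for $\sigma_\nu = 2iJ_\nu$:

$$\begin{aligned}
\sigma_1 u_+ &= u_-, & \sigma_1 u_- &= u_+, \\
\sigma_2 u_+ &= i u_-, & \sigma_2 u_- &= -i u_+, \\
\sigma_3 u_+ &= u_+, & \sigma_3 u_- &= -u_-.
\end{aligned} \tag{3.21}$$

The $\sigma_\nu$ are self-adjoint and satisfy the equations:

$$\sigma_\nu\sigma_\mu + \sigma_\mu\sigma_\nu = 0 \ \text{ if } \nu \neq \mu \quad \text{and} \quad \sigma_\nu^2 = \mathbf{1}. \tag{3.22}$$

If we again use the parameters $\alpha_1, \alpha_2, \alpha_3$ for the rotation $A$, where $\alpha = \sqrt{\alpha_1^2 + \alpha_2^2 + \alpha_3^2}$ represents the rotation angle and $w_\nu = \alpha_\nu/\alpha$ the components of the rotation axis, then from (3.19) it follows that

$$U(A) = e^{-i(\alpha/2)\sum_\nu w_\nu\sigma_\nu}. \tag{3.23}$$

From (3.22) it follows that

$$\Bigl(\sum_\nu w_\nu\sigma_\nu\Bigr)^2 = \mathbf{1} \tag{3.24}$$

and we therefore obtain

$$U(A) = e^{-i(\alpha/2)\sum_\nu w_\nu\sigma_\nu} = \mathbf{1}\cos\frac{\alpha}{2} - i\sum_\nu w_\nu\sigma_\nu \sin\frac{\alpha}{2}. \tag{3.25}$$

The element $A = [e, V]$ of the fundamental group of $\mathcal{D}$ corresponds to a continuous variation of the angle $\alpha$ from 0 to $2\pi$ (see AV, §10.7). From (3.25) it follows that

$$U([e, V]) = -\mathbf{1}.$$

The operator $U(A) = \mathbf{1}$ corresponds to $\cos(\alpha/2) = 1$, that is, $\alpha = 4\pi n$. All these values of $\alpha$ correspond to the unit element of the covering group. Therefore the representation in $\mathcal{H}_{1/2}$ is an isomorphic representation of $\tilde{\mathcal{D}}$.

We shall now use algebraic methods to obtain the above result. First we shall show that the $U(A)$ contain all unitary operators in $\mathcal{H}_{1/2}$ which have determinant 1. From (3.25) and (3.20) it immediately follows that the matrix of $U(A)$ is given by:

$$\begin{pmatrix}
\cos(\alpha/2) - i w_3\sin(\alpha/2) & -(i w_1 + w_2)\sin(\alpha/2) \\
(-i w_1 + w_2)\sin(\alpha/2) & \cos(\alpha/2) + i w_3\sin(\alpha/2)
\end{pmatrix}. \tag{3.26}$$

This is a unitary matrix with determinant 1. Conversely, let

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \tag{3.27}$$

be a unitary matrix having determinant 1. Then we must also have:

$$\begin{aligned}
|a_{11}|^2 + |a_{12}|^2 &= 1, \\
|a_{21}|^2 + |a_{22}|^2 &= 1, \\
a_{11}\bar a_{21} + a_{12}\bar a_{22} &= 0, \\
a_{11}a_{22} - a_{12}a_{21} &= 1.
\end{aligned} \tag{3.28}$$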
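The properties claimed for the matrix (3.26) are easy to confirm numerically. The following sketch builds the matrix for a given angle and unit axis and checks unitarity, the determinant, and the values at $\alpha = 2\pi$ and $\alpha = 4\pi$; the function names and the sample axis are illustrative assumptions.

```python
import math


def su2_matrix(alpha, w):
    """Matrix (3.26) for rotation angle alpha about the unit axis w = (w1, w2, w3)."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    w1, w2, w3 = w
    return [[c - 1j * w3 * s, -(1j * w1 + w2) * s],
            [(-1j * w1 + w2) * s, c + 1j * w3 * s]]


def det2(u):
    return u[0][0] * u[1][1] - u[0][1] * u[1][0]


def unitarity_defect(u):
    # Largest deviation of U U^+ from the unit matrix.
    worst = 0.0
    for i in range(2):
        for j in range(2):
            entry = sum(u[i][k] * u[j][k].conjugate() for k in range(2))
            worst = max(worst, abs(entry - (1 if i == j else 0)))
    return worst
```

With, for example, the unit axis `(1/3, 2/3, 2/3)`, the matrix is unitary with determinant 1 for every angle, `su2_matrix(2 * math.pi, axis)` is the negative unit matrix, and only at $\alpha = 4\pi$ does the unit matrix reappear, which is the two-valuedness discussed in the text.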
From the third equation of (3.28) it follows that $a_{21} = \lambda\bar a_{12}$, $a_{22} = -\lambda\bar a_{11}$ because, from the first equation, it is not possible that both $a_{11}$ and $a_{12}$ be zero. From the second equation of (3.28) it follows that $|\lambda|^2(|a_{12}|^2 + |a_{11}|^2) = |\lambda|^2 = 1$. From the fourth equation of (3.28) we finally obtain $-\lambda|a_{11}|^2 - \lambda|a_{12}|^2 = -\lambda = 1$. Therefore (3.27) has the form

$$\begin{pmatrix} a_{11} & a_{12} \\ -\bar a_{12} & \bar a_{11} \end{pmatrix} \quad \text{with } |a_{11}|^2 + |a_{12}|^2 = 1. \tag{3.29}$$

The matrix (3.29) is unitary with determinant 1. If, in (3.29), we then set $a_{11} = \beta_4 - i\beta_3$, $a_{12} = -(\beta_2 + i\beta_1)$ where the $\beta_\nu$ are real, we obtain

$$\begin{pmatrix} \beta_4 - i\beta_3 & -(\beta_2 + i\beta_1) \\ \beta_2 - i\beta_1 & \beta_4 + i\beta_3 \end{pmatrix} \quad \text{with } \beta_1^2 + \beta_2^2 + \beta_3^2 + \beta_4^2 = 1. \tag{3.30}$$

Otherwise the $\beta_\nu$ may be freely chosen. From the auxiliary condition in (3.30) we may introduce an angle $\alpha$ and set $\beta_4 = \cos(\alpha/2)$. We introduce the $w_\nu$ by means of the equations $\beta_\nu = w_\nu\sin(\alpha/2)$ for $\nu = 1, 2, 3$. The auxiliary condition in (3.30) then reduces to $w_1^2 + w_2^2 + w_3^2 = 1$. Therefore the matrix in (3.30) takes on the form (3.26).

The group of two-dimensional unitary operators with determinant 1 is often called $SU_2$. $SU_2$ is isomorphic to $\tilde{\mathcal{D}}$ by the correspondence (3.25). We shall now show that this is the case directly by algebraic methods:

For $\sigma_x = \sum_\nu x_\nu\sigma_\nu$ where the $x_\nu$ are real we find that $\sigma_x$ is self-adjoint and that $\sigma_x^2 = (\sum_\nu x_\nu^2)\mathbf{1}$. If $\rho$ is a self-adjoint operator in $\mathcal{H}_{1/2}$ and $\mathrm{tr}(\rho) = 0$ we find that $\rho = \sum_\nu a_\nu\sigma_\nu$ where the $a_\nu$ are real, as follows: Since the four operators $\sigma_\nu$, $\mathbf{1}$ form a complete linearly independent system of operators in $\mathcal{H}_{1/2}$, every operator $\rho$ has the form $\rho = \sum_\nu a_\nu\sigma_\nu + a_0\mathbf{1}$; from $\mathrm{tr}(\rho) = 0$ it follows that $a_0 = 0$. From $\rho = \rho^+$ it follows that $\bar a_\nu = a_\nu$. Therefore, for a unitary operator $U$ in $\mathcal{H}_{1/2}$ it follows that

$$U\sigma_\nu U^+ = \sum_\mu \alpha_{\mu\nu}\sigma_\mu$$

where the $\alpha_{\mu\nu}$ are real. Thus it follows that

$$U\sigma_x U^+ = \sigma_{x'} \quad \text{with } x'_\mu = \sum_\nu \alpha_{\mu\nu}x_\nu.$$
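The correspondence between an element of $SU_2$ and a three-dimensional rotation can be made concrete by extracting the coefficients $\alpha_{\mu\nu}$ from $U\sigma_\nu U^+$. The sketch below uses $\alpha_{\mu\nu} = \tfrac{1}{2}\,\mathrm{tr}(\sigma_\mu U\sigma_\nu U^+)$, which follows from $\mathrm{tr}(\sigma_\mu\sigma_\rho) = 2\delta_{\mu\rho}$; this trace formula and the helper names are assumptions introduced for the example, and the matrices `SIGMA` are the matrices of (3.21) in the basis $u_+, u_-$.

```python
import math

# Matrices of sigma_1, sigma_2, sigma_3 in the basis u_+, u_-.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def su2_matrix(alpha, w):
    # Matrix (3.26) for angle alpha and unit axis w.
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    w1, w2, w3 = w
    return [[c - 1j * w3 * s, -(1j * w1 + w2) * s],
            [(-1j * w1 + w2) * s, c + 1j * w3 * s]]


def mul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def rotation_from_su2(u):
    # rot[mu][nu] is the coefficient of sigma_mu in U sigma_nu U^+.
    ud = dagger(u)
    rot = [[0.0] * 3 for _ in range(3)]
    for nu in range(3):
        image = mul2(mul2(u, SIGMA[nu]), ud)
        for mu in range(3):
            prod = mul2(SIGMA[mu], image)
            rot[mu][nu] = (0.5 * (prod[0][0] + prod[1][1])).real
    return rot
```

For the axis `(0, 0, 1)` the result is the matrix of a rotation by the angle $\alpha$ about the 3-axis. Since `rotation_from_su2` is quadratic in `u`, the matrices $U$ and $-U$ yield the same rotation, which is the algebraic form of the two-to-one relation between $SU_2$ and the rotation group.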
From

$$\sum_\nu x_\nu'^2 = \sum_\nu x_\nu^2$$

it follows that $(\alpha_{\mu\nu})$ is the matrix of a three-dimensional rotation. It is easy to see that the correspondence $U \to (\alpha_{\mu\nu})$ is a representation of $SU_2$ by means of three-dimensional rotations. That this correspondence is surjective on $\mathcal{D}$ can be easily seen as follows: For $U$ defined according to equation (3.26), if we set $w_1 = w_2 = 0$ we obtain for the matrix $(\alpha_{\mu\nu})$ a rotation about the 3-axis. For $w_2 = w_3 = 0$ we obtain a rotation about the 1-axis. Any other rotation can be obtained by multiplication of rotations about the 1-axis and the 3-axis (see a discussion of Euler angles, for example, [2], VI, §3.1).

Since $SU_2$ is isomorphic to $\tilde{\mathcal{D}}$ we may therefore obtain all unitary representations of $\tilde{\mathcal{D}}$ as unitary representations of $SU_2$. The irreducible unitary representations of $SU_2$ may also be constructed simply using algebraic methods. With the help of $u_+$, $u_-$ we may easily define a $(v + 1)$-dimensional vector space $\mathcal{T}_{v/2}$ which is generated by the basis vectors $u_+^v, u_+^{v-1}u_-, u_+^{v-2}u_-^2, \ldots, u_-^v$, the vectors of which are all homogeneous polynomials of degree $v$ in the unknowns $u_+$, $u_-$. For $U \in SU_2$ there exists a representation of $SU_2$ by means of linear transformations in $\mathcal{T}_{v/2}$ generated by

$$U(u_+^a u_-^b) = (Uu_+)^a (Uu_-)^b.$$

It is a simple matter to define an inner product in $\mathcal{T}_{v/2}$ in such a way that the above representation is unitary. The definition of this inner product is suggested by the following considerations: If $a_+u_+ + a_-u_-$ is a vector in $\mathcal{H}_{1/2}$, then under a unitary transformation in $\mathcal{H}_{1/2}$ the expression $\|a_+u_+ + a_-u_-\|^2 = \bar a_+a_+ + \bar a_-a_-$ will be invariant; the same result therefore holds for the expression

$$(\bar a_+a_+ + \bar a_-a_-)^v = \sum_{r=0}^{v}\binom{v}{r}(\bar a_+a_+)^{v-r}(\bar a_-a_-)^r. \tag{3.31}$$

An arbitrary vector in $\mathcal{T}_{v/2}$ has the form

$$\sum_{r=0}^{v} c_r\,u_+^{v-r}u_-^r. \tag{3.32}$$

The coefficients $c_r$ are transformed in the representation of $SU_2$ in the same way as the coefficients of

$$(a_+u_+ + a_-u_-)^v = \sum_{r=0}^{v}\binom{v}{r}a_+^{v-r}a_-^r\,u_+^{v-r}u_-^r,$$

that is, in the same way as

$$\binom{v}{r}a_+^{v-r}a_-^r. \tag{3.33}$$
From (3.31) it follows that the expression

$$\frac{1}{v!}\sum_{r=0}^{v} r!\,(v-r)!\,\bar c_r c_r \tag{3.34}$$

is an invariant for all such transformations. We shall now introduce the following set of basis vectors in $\mathscr{T}_{v/2}$:

$$v_m = \frac{u_+^{v-r}u_-^r}{\sqrt{r!\,(v-r)!}} = \frac{u_+^{j+m}u_-^{j-m}}{\sqrt{(j+m)!\,(j-m)!}}, \tag{3.35}$$

where $j = v/2$, $m = j - r$ (that is, $m = -j, -j+1, \ldots, j$), and define an inner product $\langle v_m, v_{m'}\rangle = \delta_{mm'}$. Thus we find that the above representation of SU2 in $\mathscr{T}_j$ is unitary.

We will now show that the above representation is identical with the representation $\mathscr{D}_j$ which was derived earlier from infinitesimal transformations. For this purpose we shall now compute the infinitesimal transformations in $\mathscr{T}_j$. For the $J_\nu$ in $\mathscr{H}_{1/2}$ we find that $J_\nu = [dU_\nu(\alpha)/d\alpha]_{\alpha=0}$, where $U_\nu(\alpha)$ is given by (3.26) for $w_\mu = 0$ and $\mu \neq \nu$. From (3.20) it follows that for the $J_\nu$ in $\mathscr{T}_j$ we obtain

$$J_\nu v_m = \left[\frac{d}{d\alpha}\,\frac{(U_\nu(\alpha)u_+)^{j+m}(U_\nu(\alpha)u_-)^{j-m}}{\sqrt{(j+m)!\,(j-m)!}}\right]_{\alpha=0} = \frac{(j+m)\,u_+^{j+m-1}u_-^{j-m}\,J_\nu u_+ + (j-m)\,u_+^{j+m}u_-^{j-m-1}\,J_\nu u_-}{\sqrt{(j+m)!\,(j-m)!}}.$$

Since the $J_\nu$ in $\mathscr{H}_{1/2}$ are given in (3.20) we obtain

$$J_3 v_m = -im\,v_m,$$
$$(J_1 + iJ_2)v_m = -iNv_m = -i\sqrt{(j-m)(j+m+1)}\;v_{m+1}, \tag{3.36}$$
$$(J_1 - iJ_2)v_m = -iN^+v_m = -i\sqrt{(j+m)(j-m+1)}\;v_{m-1},$$

that is, the relations (3.14), (3.17), and (3.18) are satisfied. Since SU2 is isomorphic to the covering group of $\mathscr{D}_3$, the two representations are identical.

By means of the algebraic construction of the representation $\mathscr{D}_j$ in $\mathscr{T}_j$ we may easily determine the values of $j$ for which the representation of $\mathscr{D}_3$ is unique (single valued) and the values of $j$ which correspond to a “multiple valued” representation of $\mathscr{D}_3$. For this purpose we need only the representation of the element $-\mathbf{1}$ of SU2 in $\mathscr{T}_j$. Since $\mathscr{T}_j$ consists of the homogeneous polynomials of $(2j)$th degree we will therefore find that $-\mathbf{1}$ will be represented by $(-1)^{2j}\mathbf{1}$. The integral
values of $j$ therefore result in unique representations of $\mathscr{D}_3$; the half integer values lead to two valued representations of $\mathscr{D}_3$.

All reducible representations up to a factor of $\mathscr{D}_3$ in a Hilbert space $\mathscr{H}$ may (since $\mathscr{D}_3$ is compact) be decomposed into irreducible representations which decompose $\mathscr{H}$ into a direct sum

$$\mathscr{H} = \sum_\mu \oplus\, \mathscr{H}_\mu, \tag{3.37}$$

where each subspace $\mathscr{H}_\mu$ is invariant and irreducible with respect to the representation. However, not any arbitrary irreducible representation (up to a factor) can occur in the $\mathscr{H}_\mu$. Since the representation in $\mathscr{H}$ has to be a representation up to a factor, the fundamental group of $\mathscr{D}_3$ must be represented isomorphically in each of the $\mathscr{H}_\mu$ (see AV, §10.7). A representation up to a factor of $\mathscr{D}_3$ in $\mathscr{H}$ contains either only representations with half integer $j$ or only those with integer values of $j$.

It now remains to show that we have determined all of the irreducible representations of the covering group of $\mathscr{D}_3$, that is, of SU2, that is, all representations (up to a factor) of $\mathscr{D}_3$, in terms of the representations $\mathscr{D}_j$ in the $\mathscr{T}_j$. This we shall show on the basis of the completeness of the characters of the representations in the $\mathscr{T}_j$ as class functions in SU2 (see AV, §10.5). If $U \in$ SU2 then there exists a $V \in$ SU2 such that $W = VUV^{-1}$ has the form

$$Wu_+ = e^{i\alpha}u_+, \qquad Wu_- = e^{-i\alpha}u_-. \tag{3.38}$$

This follows from the fact that $U$ must have two orthogonal eigenvectors $v_1$, $v_2$ with eigenvalues $e^{i\alpha}$, $e^{i\beta}$. $V$ need only be chosen as a transformation which transforms $v_1$ into $u_+$ and $v_2$ into $u_-$. If the determinant of $V$ is not equal to 1, we may achieve this by multiplying $V$ by a suitable factor, since such a multiplication does not change $VUV^{-1}$. Since the determinant of $W$ and that of $U$ must be equal to 1 we must have $e^{i\beta} = e^{-i\alpha}$. It follows that two transformations from SU2 which have the same eigenvalues belong to the same class of conjugate elements.
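The statements above (SU2 covers the rotations twice, and the conjugacy class of an element is fixed by its eigenvalues) can be checked numerically. The following is only a sketch under stated assumptions: the Pauli matrices are taken in the standard physics convention, the element of SU2 is written in the form of (3.30) with $\beta_4 = \cos(\alpha/2)$ and $\beta_\nu = w_\nu\sin(\alpha/2)$ as read from the text, and the function names are hypothetical. Since the trace of $W$ in (3.38) is $2\cos\alpha$, the class parameter is recovered from the trace.

```python
import math

# Pauli matrices in the standard physics convention (an assumption of this sketch).
SIGMA = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def su2_element(angle, axis):
    """Element of SU2 in the form (3.30): beta_4 = cos(angle/2),
    beta_nu = w_nu sin(angle/2), with w the normalized axis."""
    norm = math.sqrt(sum(x * x for x in axis))
    w1, w2, w3 = (x / norm for x in axis)
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    b1, b2, b3, b4 = w1 * s, w2 * s, w3 * s, c
    return [[b4 - 1j * b3, -(b2 + 1j * b1)],
            [b2 - 1j * b1, b4 + 1j * b3]]

def rotation_from_su2(u):
    """Real 3x3 matrix alpha with U sigma_mu U^+ = sum_nu alpha[nu][mu] sigma_nu,
    obtained from alpha[nu][mu] = (1/2) tr(sigma_nu U sigma_mu U^+)."""
    ud = dagger(u)
    return [[0.5 * trace(matmul(matmul(SIGMA[nu], u),
                                matmul(SIGMA[mu], ud))).real
             for mu in range(3)] for nu in range(3)]

def class_angle(u):
    """Class parameter in [0, pi]: the eigenvalues are exp(+-i * class_angle)."""
    half_trace = max(-1.0, min(1.0, trace(u).real / 2))
    return math.acos(half_trace)

u = su2_element(0.7, (1.0, 2.0, 3.0))
v = su2_element(1.3, (0.0, 1.0, 1.0))
w = matmul(matmul(v, u), dagger(v))          # conjugate element V U V^-1
minus_u = [[-x for x in row] for row in u]   # U and -U give the same rotation
```

Here `rotation_from_su2(u)` and `rotation_from_su2(minus_u)` agree, which is the two-to-one property, and `class_angle(u)` equals `class_angle(w)`, which is the class-function property used for the characters.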
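We will run through the different classes when the parameter $\alpha$ runs through the values between 0 and $\pi$, because the pair $e^{i\alpha}$, $e^{-i\alpha}$ of eigenvalues will then run through all possible values. The character $\chi_j$ of the transformations $W$ in $\mathscr{T}_j$ is determined by the equation $W(u_+^{j+m}u_-^{j-m}) = (Wu_+)^{j+m}(Wu_-)^{j-m}$; it follows that

$$\chi_j(W) = e^{i2\alpha j} + e^{i2\alpha(j-1)} + \cdots + e^{-i2\alpha j}. \tag{3.39}$$

Therefore

$$\chi_0(W) = 1, \quad \tfrac{1}{2}\chi_{1/2}(W) = \cos\alpha, \quad \tfrac{1}{2}[\chi_1(W) - \chi_0(W)] = \cos 2\alpha, \quad \tfrac{1}{2}[\chi_{3/2}(W) - \chi_{1/2}(W)] = \cos 3\alpha, \quad \ldots$$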
The functions $\cos(n\alpha)$ for $n = 0, 1, 2, \ldots$ in the interval $0 \le \alpha \le \pi$ form a complete function system, so that there exist no additional irreducible representations of SU2.

As we have seen in §2, to an elementary system type there corresponds a spin space $\mathfrak{r}_s$ in which an irreducible representation (up to a factor) of $\mathscr{D}_3$ is given. $\mathfrak{r}_s$ must then be isomorphic to one of the $\mathscr{T}_j$. As an index $s$ we use the same index as in the case of the $\mathscr{T}_j$, that is, the spin $s$ (see D 2.2) can take on the values $0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$

The representation (2.35) of the rotation group in $\mathscr{H}$ is not irreducible even if $R(A)$ is the operator which represents an irreducible representation $\mathscr{D}_s$ in $\mathfrak{r}_s$. With this the problem arises of reducing the representation given by (2.35). For this purpose we will reduce the representation given by $V(A)$ in $\mathscr{H}_b$. In order to derive some frequently used formulas we shall now consider the isomorphic map (3.40) (see AIV, §13) from the space $\mathscr{L}^2(R^3, dk_1\,dk_2\,dk_3)$ to $\mathscr{L}^2(R^3, dx_1\,dx_2\,dx_3)$. For simplicity we shall also denote this space by $\mathscr{H}_b$, since $\mathscr{L}^2(R^3, dk_1\,dk_2\,dk_3)$ and $\mathscr{L}^2(R^3, dx_1\,dx_2\,dx_3)$ can be considered to be different representations (see XI, §2) of “the same” Hilbert space $\mathscr{H}_b$. It is easy to see that, according to (3.40),

$$V(A)\varphi(\mathbf{k}) = \varphi(A^{-1}\mathbf{k}) \;\to\; \psi(A^{-1}\mathbf{r}) = \tilde V(A)\psi(\mathbf{r}),$$

where $\tilde V(A)$ is the image of the operator $V(A)$ in $\mathscr{L}^2(R^3, dx_1\,dx_2\,dx_3)$. Instead of $\tilde V(A)$ we shall write $V(A)$. The representation of $\mathscr{D}_3$ in $\mathscr{H}_b$ which we must reduce is therefore given by

$$V(A)\psi(\mathbf{r}) = \psi(A^{-1}\mathbf{r}). \tag{3.41}$$

Since (3.41) is a unique representation of $\mathscr{D}_3$, in the reduced representation we may only have integer values of $j$. Consider a rotation $A$ about the 3-axis of angle $\alpha$, that is, $x_1' = x_1\cos\alpha - x_2\sin\alpha$, $x_2' = x_1\sin\alpha + x_2\cos\alpha$, $x_3' = x_3$; thus, for an infinitesimal rotation we obtain:

$$J_3\psi(\mathbf{r}) = \left(x_2\frac{\partial}{\partial x_1} - x_1\frac{\partial}{\partial x_2}\right)\psi(\mathbf{r}).$$

For the $L_\nu = iJ_\nu$ it follows that, in general (with $\lambda, \mu, \rho$ a cyclic permutation of 1, 2, 3):

$$L_\lambda = x_\mu\frac{1}{i}\frac{\partial}{\partial x_\rho} - x_\rho\frac{1}{i}\frac{\partial}{\partial x_\mu}. \tag{3.42}$$
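The space $\mathscr{H}_b = \mathscr{L}^2(R^3, dx_1\,dx_2\,dx_3)$ may be represented in the following form by means of polar coordinates (see AIV, §14):

$$\mathscr{H}_b = \mathscr{H}_r \times \mathscr{H}_{\theta,\varphi}, \tag{3.43a}$$

where

$$\mathscr{H}_r = \mathscr{L}^2(R_+, r^2\,dr), \qquad \mathscr{H}_{\theta,\varphi} = \mathscr{L}^2(\Omega, d\omega), \tag{3.43b}$$

where $\Omega$ is the surface of the unit sphere and $d\omega$ is the surface element (solid angle) $d\omega = \sin\theta\,d\theta\,d\varphi$. With respect to (3.43a) $V(A)$ takes on the form

$$V(A):\ \mathbf{1} \times V(A). \tag{3.44}$$

We shall now reduce the representation $V(A)$ in $\mathscr{H}_{\theta,\varphi}$. Similarly, the $L_\nu$ in (3.42) must take the form (3.44). By conversion to polar coordinates, for $L_3$ and $N^+ = L_1 - iL_2$ we obtain:

$$L_3 = \frac{1}{i}\frac{\partial}{\partial\varphi}, \tag{3.45}$$

$$N^+ = e^{-i\varphi}\left(-\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\right). \tag{3.46}$$

Let the components of a unit vector be given by $e_1 = \sin\theta\cos\varphi$, $e_2 = \sin\theta\sin\varphi$, $e_3 = \cos\theta$. Then, by the Weierstrass approximation theorem, the set of all $e_1^{\alpha_1}e_2^{\alpha_2}e_3^{\alpha_3}$ (where the $\alpha_\nu \ge 0$ are integers) spans the entire space $\mathscr{L}^2(\Omega, d\omega)$. The $e_1^{\alpha_1}e_2^{\alpha_2}e_3^{\alpha_3}$ with fixed sum $\alpha_1 + \alpha_2 + \alpha_3 = l$ span a finite-dimensional subspace $\mathscr{H}^l_{\theta,\varphi}$ of $\mathscr{H}_{\theta,\varphi}$. Since the $e_\nu$ transform linearly under rotations, $\mathscr{H}^l_{\theta,\varphi}$ is obviously an invariant subspace under $V(A)$.

We now seek the irreducible subspace in $\mathscr{H}^l_{\theta,\varphi}$ which contains the largest eigenvalue of $L_3$. According to (3.45) the eigenvectors of $L_3$ obviously have the form $e^{im\varphi}g(\theta)$. If, instead of $e_1, e_2, e_3$, we introduce the three functions $e = e_1 + ie_2 = e^{i\varphi}\sin\theta$, $\bar e = e_1 - ie_2 = e^{-i\varphi}\sin\theta$, $e_3 = \cos\theta$, then $\mathscr{H}^l_{\theta,\varphi}$ will be spanned by all $e^{\beta_1}\bar e^{\beta_2}e_3^{\beta_3}$ for which $\beta_1 + \beta_2 + \beta_3 = l$. The $e^{\beta_1}\bar e^{\beta_2}e_3^{\beta_3}$ are then precisely the eigenvectors of $L_3$ with eigenvalues $\beta_1 - \beta_2$. Therefore the largest eigenvalue of $L_3$ is obtained when $\beta_1 = l$ and $\beta_2 = \beta_3 = 0$. Its value is $l$. The eigenvalue $l$ of $L_3$ is nondegenerate in $\mathscr{H}^l_{\theta,\varphi}$. The corresponding eigenfunction is

$$u_l = c_l\,e^{il\varphi}(\sin\theta)^l. \tag{3.47}$$

Since $N = L_1 + iL_2$ cannot take us out of $\mathscr{H}^l_{\theta,\varphi}$ we must have $Nu_l = 0$ and the following equation must be satisfied:

$$\mathbf{L}^2u_l = l(l+1)\,u_l. \tag{3.48}$$

With the aid of $N^+$ we may, using (3.17), obtain the desired irreducible representation space, of the form $\mathscr{T}_l$, which is spanned by the $u_m$.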
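The properties claimed for the top vector $u_l \propto e^l = (e_1 + ie_2)^l$ can be verified with exact polynomial arithmetic. The sketch below works with homogeneous polynomials in Cartesian variables $x_1, x_2, x_3$ instead of functions on the sphere, stores a polynomial as a dictionary from exponent triples to coefficients, and uses $L_\lambda = \frac{1}{i}(x_\mu\partial_\rho - x_\rho\partial_\mu)$ for cyclic $(\lambda, \mu, \rho)$, which is how (3.42) is read here. The helper names are hypothetical and not taken from the text.

```python
def clean(p):
    """Drop numerically vanishing coefficients."""
    return {k: c for k, c in p.items() if abs(c) > 1e-12}

def add(p, q, factor=1):
    """Return p + factor * q."""
    r = dict(p)
    for k, c in q.items():
        r[k] = r.get(k, 0) + factor * c
    return clean(r)

def mul(p, q):
    """Product of two polynomials."""
    r = {}
    for (a1, a2, a3), c in p.items():
        for (b1, b2, b3), d in q.items():
            k = (a1 + b1, a2 + b2, a3 + b3)
            r[k] = r.get(k, 0) + c * d
    return clean(r)

def deriv(p, axis):
    """Partial derivative with respect to x_axis (axis = 0, 1, 2)."""
    r = {}
    for k, c in p.items():
        if k[axis] > 0:
            new = list(k)
            new[axis] -= 1
            r[tuple(new)] = r.get(tuple(new), 0) + k[axis] * c
    return r

def times_x(p, axis):
    """Multiplication by x_axis."""
    r = {}
    for k, c in p.items():
        new = list(k)
        new[axis] += 1
        r[tuple(new)] = c
    return r

def L(p, nu):
    """L_nu = (1/i)(x_mu d_rho - x_rho d_mu), (nu, mu, rho) cyclic, 0-based."""
    mu, rho = (nu + 1) % 3, (nu + 2) % 3
    diff = add(times_x(deriv(p, rho), mu), times_x(deriv(p, mu), rho), -1)
    return {k: -1j * c for k, c in diff.items()}

def L_squared(p):
    total = {}
    for nu in range(3):
        total = add(total, L(L(p, nu), nu))
    return total

def top_vector(l):
    """The polynomial (x_1 + i x_2)^l, proportional to u_l in (3.47)."""
    e = {(1, 0, 0): 1 + 0j, (0, 1, 0): 1j}
    p = {(0, 0, 0): 1 + 0j}
    for _ in range(l):
        p = mul(p, e)
    return p

l = 3
u_top = top_vector(l)
l3_residual = add(L(u_top, 2), u_top, -l)                  # L_3 u - l u
raising = add(L(u_top, 0), L(u_top, 1), 1j)                # (L_1 + i L_2) u
casimir_residual = add(L_squared(u_top), u_top, -l * (l + 1))
```

All three residuals come out as the empty polynomial, in agreement with $L_3u_l = lu_l$, $Nu_l = 0$ and (3.48).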
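Since $u_m$ lies in $\mathscr{H}^l_{\theta,\varphi}$ and $L_3u_m = mu_m$, $u_m$ must be a linear combination of vectors of the form $e^{\beta_1}\bar e^{\beta_2}e_3^{\beta_3}$ where $\beta_1 - \beta_2 = m$, $\beta_1 + \beta_2 + \beta_3 = l$:

$$u_m = c_m\,e^{im\varphi}(\sin\theta)^{-m}Q_m(\cos\theta), \tag{3.49}$$

where $Q_m$ is a polynomial in $\cos\theta$. With $N^+$ defined by (3.46) and from (3.17) we obtain

$$N^+u_m = c_m\,e^{i(m-1)\varphi}(\sin\theta)^{-m+1}Q_m'(\cos\theta) = \sqrt{(l+m)(l-m+1)}\;u_{m-1} = \sqrt{(l+m)(l-m+1)}\;c_{m-1}\,e^{i(m-1)\varphi}(\sin\theta)^{-m+1}Q_{m-1}(\cos\theta),$$

where $Q'(\xi)$ is the derivative of $Q(\xi)$ with respect to $\xi$. Therefore the $Q_m$ may be recursively defined by means of $Q_m' = Q_{m-1}$ and, for the normalization constant $c_m$, we obtain the recursion formula:

$$c_m = \sqrt{(l+m)(l-m+1)}\;c_{m-1}. \tag{3.50}$$

From (3.47) it follows that $Q_l(\xi) = (1 - \xi^2)^l$ and we therefore obtain:

$$Q_m(\xi) = \frac{d^{\,l-m}}{d\xi^{\,l-m}}(1 - \xi^2)^l. \tag{3.51}$$

From (3.50) it follows that $c_m$ takes on the value (with a yet to be determined normalization constant $a$):

$$c_m = a\sqrt{\frac{(l+m)!}{(l-m)!}}. \tag{3.52}$$

The factor $a$ may be determined by means of the normalization condition

$$\int |u_0|^2 \sin\theta\,d\theta\,d\varphi = 1. \tag{3.53}$$

The integral (3.53) may be recursively calculated (see, for example, [2], XI, §5.3). We obtain (up to a phase factor of modulus 1, which the normalization condition leaves open)

$$a = \frac{1}{2^l\,l!}\sqrt{\frac{2l+1}{4\pi}}. \tag{3.54}$$

The functions $u_m$ are generally known as “spherical harmonics”; the customary notation for them is $Y^l_m(\theta, \varphi)$. From (3.49), (3.52), and (3.54) we finally obtain

$$Y^l_m(\theta, \varphi) = \frac{1}{2^l\,l!}\sqrt{\frac{2l+1}{4\pi}\,\frac{(l+m)!}{(l-m)!}}\;e^{im\varphi}(\sin\theta)^{-m}Q^l_m(\cos\theta), \tag{3.55}$$

where

$$Q^l_m(\xi) = \frac{d^{\,l-m}}{d\xi^{\,l-m}}(1 - \xi^2)^l.$$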
The $Y^l_m$ span an irreducible subspace $\mathscr{T}_l$ of $\mathscr{H}^l_{\theta,\varphi}$. We will now show that

$$\mathscr{H}_{\theta,\varphi} = \sum_{l=0}^{\infty} \oplus\, \mathscr{T}_l, \tag{3.56}$$

that is, the $Y^l_m$ completely span $\mathscr{H}_{\theta,\varphi}$. For this purpose it is sufficient to show that

$$\mathscr{H}^l_{\theta,\varphi} = \mathscr{T}_l \oplus \mathscr{T}_{l-2} \oplus \mathscr{T}_{l-4} \oplus \cdots. \tag{3.57}$$

In fact, we may consider the elements of $\mathscr{T}_{l-2}, \mathscr{T}_{l-4}, \ldots$ to be homogeneous polynomials of degree $l$; for example, those of $\mathscr{T}_{l-2}$ are obtained from multiplication of polynomials of degree $l - 2$ by $1 = e_1^2 + e_2^2 + e_3^2$ (and similarly for $\mathscr{T}_{l-4}$). The right side of (3.57) is therefore a subspace of $\mathscr{H}^l_{\theta,\varphi}$. Since, for different $l$, the $\mathscr{T}_l$ must be orthogonal, the dimension of the right-hand side must be equal to

$$[2l + 1] + [2(l-2) + 1] + [2(l-4) + 1] + \cdots = [(l+1) + l] + [(l-1) + (l-2)] + [(l-3) + (l-4)] + \cdots.$$

The dimension $n(l)$ of $\mathscr{H}^l_{\theta,\varphi}$ is equal to the number of the $e_1^{\alpha_1}e_2^{\alpha_2}e_3^{\alpha_3}$ for which $\alpha_1 + \alpha_2 + \alpha_3 = l$. Thus it follows that $n(l+1) = n(l) + [l + 1] + 1$. Therefore $n(l) = [l+1] + l + [l-1] + \cdots$. Thus we have proven (3.57) and (3.56).

By (3.42) and (3.44) the reduction of the representation in $\mathscr{H}_b$ is very simple. Choose a complete orthonormal basis $\chi_\nu(r)$ in $\mathscr{H}_r$. The $\psi_{\nu lm}(\mathbf{r}) = \chi_\nu(r)\,Y^l_m(\theta, \varphi)$ span (for fixed $\nu$, $l$) an irreducible subspace $\mathscr{H}_b^{\nu l}$ of $\mathscr{H}_b$. From (3.56) we therefore obtain

$$\mathscr{H}_b = \sum_{\nu, l} \oplus\, \mathscr{H}_b^{\nu l}. \tag{3.58}$$

The reduction of the representation $U(A) = V(A) \times R(A)$ in $\mathscr{H}_b \times \mathfrak{r}_s$ ($\mathfrak{r}_s$ is an irreducible representation space) will be postponed until XI, §10. It is easy to see that this reduction is achieved if the representation $V(A) \times R(A)$ can be reduced in $\mathscr{T}_l \times \mathfrak{r}_s$. For a representation in a product space we write a $\times$ sign, for example, $\mathscr{D}_{j_1} \times \mathscr{D}_{j_2}$ for a product representation in the product space $\mathscr{T}_{j_1} \times \mathscr{T}_{j_2}$, where the representation operators have the form $V(A) \times R(A)$ and $\mathscr{T}_{j_1}$ is irreducible with respect to the $V(A)$ and $\mathscr{T}_{j_2}$ is irreducible with respect to the $R(A)$.

4 Position and Momentum Observables

In §2 and §3 we encountered infinitesimal transformations. Using the latter we defined self-adjoint operators (not necessarily bounded) such as $K$, $X$ and $H$ in §2 and $L$ in §3.
According to IV, D 2.5.6 to each such operator there corresponds a scale observable (which, according to D 2.5.6 is also a decision observable). These scale observables are often given names. The introduction of observables by means of infinitesimal transformations is common practice in quantum mechanics. In this way the problem of how these observables are to be measured is often ignored.
In an analogous procedure in classical mechanics the observables are functions in the $\Gamma$-space $(p_\nu, q_\nu)$ of the system. Since the pre-theories of classical mechanics should define how position and momentum, and finally how $p_\nu$, $q_\nu$, are to be measured, we should be satisfied if we are able to specify the observables as functions in $\Gamma$-space; then their measurement will be described in terms of the pre-theories of classical mechanics.

The situation in quantum mechanics is totally different. Here the pre-theories make possible the description of registration methods and registration procedures. However, the correspondence is itself determined by quantum mechanics! Here the problem described in IV, §4 appears in clear focus. Here we must concede that the specification of self-adjoint operators and their corresponding scale observables is primarily of a “pure theoretical nature.” According to the theory there should be “in principle” approximate measurement methods in the sense of IV, §4 for these observables. However, it appears that they cannot be obtained from previously defined theories!

The quantity $X_\nu$ (multiplied by $m^{-1}$) introduced in §2 is often called the “position observable at time $t = 0$.” However, from the theory we can say neither how this may be measured nor whether a particular measurement method $b_0$ measures these $X_\nu$ (even approximately). With respect to the latter point we may make an additional step in that direction in the following way:

In the above definition of $m^{-1}X_\nu$ as the “position at time $t = 0$” it is unclear not only what is meant by the expression “position” (unless we are willing to accept this expression as a mere name without any meaning) but also what the physical meaning of the expression “at time $t = 0$” should be. At present there are many different and varied conceptions and (apparent) interpretations about this problem in circulation.
Here we shall only mention that the expression “at time $t = 0$” cannot mean (as we shall find in more precise terms in XVII) that the “measurement takes place at time $t = 0$.” Such a “point in time” at which a measurement takes place does not exist. On this basis other priorities will take precedence over the definition of a decision observable for “position at time $t$.” Such a definition will necessarily refer in a precise manner to a laboratory-fixed space-time reference system as defined in §1 for the physical interpretation of the Galileo group.

Consider an apparatus $b_0$ for which the scale response refers to the spatial domain, that is, the scale of the measurement apparatus defines an isomorphism between $\mathscr{R}(b_0)$ and the “Boolean ring $\Sigma$ of a region in three-dimensional space.” Since we cannot expect that there is a real apparatus with a rigorous isomorphism between $\mathscr{R}(b_0)$ and $\Sigma$, we proceed instead by making an idealization, that is, with an observable $\Sigma \xrightarrow{F} L$ where $\Sigma$ is the “Boolean ring of the spatial region.”

How can we define $\Sigma$ mathematically? Let $\Lambda$ be the $\sigma$-algebra of the Lebesgue measurable sets of $R^3$ (in this case the three-dimensional space defined by the laboratory-fixed spatial reference system, that is, $R^3$ is the set of coordinate triples $(x_1, x_2, x_3)$). Let $J$ be the family of the sets of measure 0. $\Sigma = \Lambda/J$ is a complete Boolean ring, which we shall call the Boolean ring of the “spatial region.” Each element $\sigma \in \Sigma$
therefore represents a possible response $b$ of the idealized measurement apparatus $b_0$.

In the sense of IV, D 2.5.5, $\Sigma = \Sigma(x_1, x_2, x_3)$, where the $x_\nu$ are the measurable spatial coordinates in the laboratory reference system under consideration. These coordinates have nothing to do with quantum mechanics. They are defined by classical measurement techniques and procedures. $\Sigma(x_1, x_2, x_3) \xrightarrow{F} L$ is therefore, in the sense of IV, §2.5, an observable with the sufficient scales $x_1, x_2, x_3$.

We now seek to sharpen our assertion about the desired position observable. Let us set $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$, that is, let us consider a decision observable.

Since there is often much misunderstanding concerning the meaning of measurement, that is, of the registrations $b$ which correspond to the $\sigma \in \Sigma(x_1, x_2, x_3)$, we again stress the fact that the registrations $b$ on the apparatus under consideration do not have an immediate connection with the technical aspects of the measurement of the coordinates $x_1, x_2, x_3$! If, for example, a registration $b$ corresponds to $\sigma = \{(x_1, x_2, x_3) \mid x_\nu^0 - \varepsilon < x_\nu < x_\nu^0 + \varepsilon\}$, where $\varepsilon$ is small, this means only that the apparatus $b_0$ records (for example, with the aid of a computer) the “measurement values” $(x_1^0, x_2^0, x_3^0)$. It does not mean (!) that $x_1^0, x_2^0, x_3^0$ are the technically measured spatial coordinates of some macroscopic event or process associated with the registration $b$. The “responses” of the measurement apparatus have nothing to do with the technical aspects of measurement of spatial position. However, the usage of a measurement apparatus characterized by $b_0$ has much to do with the technical aspects of spatial measurement, as we have explained in §1 and which we shall now discuss.
The measurement of the “position of a microsystem at time $t$” (assuming that such an observable $\Sigma \xrightarrow{E} G$ exists which satisfies all the requirements which we will impose on it) by an apparatus is therefore different from, for example, the technical measurement of the position of a space ship at time $t$. The technical measurement of a space ship is explainable without the use of Newtonian mechanics. The construction of the desired position measurement apparatus $b_0$ for microsystems cannot be explained without quantum mechanics; the position of a microsystem is only indirectly measurable in quantum mechanics (in the sense of the discussions in [1], §10 or [2], III, §9).

We shall now seek to define the position observable in this indirect way. It must be possible to adjust the spatial position of the apparatus $b_0$ relative to the laboratory system in order that Galileo transformations have the meaning which was described in §1, for instance, that of a spatial translation $(\mathbf{1}, 0, \boldsymbol\eta, 0)$. We shall now investigate the requirements which shall be imposed on $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ in order to relate indirectly the “measurement values” $x_1, x_2, x_3$ of $b$ to the spatial coordinates. Intuitively we find that if we try to interpret the registration $b$ for the apparatus $b_0$ as the “determination” of the position in $\sigma$, then the apparatus $b_0'$ which is obtained
from $b_0$ by means of a translation $(\mathbf{1}, 0, \boldsymbol\eta, 0)$ and the corresponding response $b'$ must correspond to the determination that the position is in $\sigma' = \sigma + \boldsymbol\eta$ (where $\sigma + \boldsymbol\eta$ is, of course, the spatial domain obtained by displacing $\sigma$ by $\boldsymbol\eta$). According to §1 and §2 we obtain

$$\psi(b_0', b') = U(\mathbf{1}, 0, \boldsymbol\eta, 0)\,\psi(b_0, b)\,U(\mathbf{1}, 0, \boldsymbol\eta, 0)^{-1}. \tag{4.1}$$

This equation should also be satisfied if $\psi(b_0, b)$ is replaced by $E(\sigma)$ and $\psi(b_0', b')$ is replaced by $E(\sigma + \boldsymbol\eta)$, since $E(\sigma)$ is the idealization of $\psi(b_0, b)$. We now impose our first requirement upon $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$:

$$U(\mathbf{1}, 0, \boldsymbol\eta, 0)\,E(\sigma)\,U(\mathbf{1}, 0, \boldsymbol\eta, 0)^{-1} = E(\sigma + \boldsymbol\eta). \tag{4.2}$$

For rotations we may make a similar argument; we require that

$$U(A, 0, 0, 0)\,E(\sigma)\,U(A, 0, 0, 0)^{-1} = E(A\sigma), \tag{4.3}$$

where $A\sigma$ is the domain which is obtained by rotating $\sigma$ by $A$.

The requirement (4.2) is “in principle” experimentally verifiable in the form (4.1) as follows: If we have built an apparatus and set it relative to the laboratory system, then there is a corresponding $b_0$. Then we easily obtain $b_0'$ (macroscopically) as the spatial translation of the apparatus $b_0$. Here (4.1) can be approximately controlled by probability measurements for different preparations $a \in \mathscr{Q}'$. Thus it is clear that the requirements (4.2) and (4.3) refer to registrations and preparations, as we have described in II and in §1.

As we have found in IV, §2.5, the decision observable $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ is uniquely determined by the scale observables which correspond to the scales as follows:

$$Q_\nu = \int \lambda\,dE_\nu(\lambda). \tag{4.4}$$

Then, from IV, §2.5 we obtain

$$E_\nu(\lambda) = E(\sigma_\nu(\lambda)) \quad \text{where } \sigma_\nu(\lambda) = \{(x_1, x_2, x_3) \mid x_\nu < \lambda\}. \tag{4.5}$$

The operators $Q_\nu$ introduced in (4.4) are self-adjoint (not bounded) and are not defined in all of $\mathscr{H}$. Nevertheless these operators uniquely determine the corresponding spectral families $E_\nu(\lambda)$ (see AIV, §10) and we therefore also obtain (as proven in IV, §2.5) the complete observable $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$. Later we shall return to the discussion of the use of infinitely extended scales.
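The covariance requirement (4.2) can be made concrete in a toy model. The sketch below is only an illustration under strong simplifying assumptions that are not part of the text: space is replaced by a periodic one-dimensional lattice of `n` sites, $E(\sigma)$ is multiplication by the indicator function of a set of sites, and the translation acts as $(U(\eta)\psi)(x) = \psi(x - \eta)$. All function names are hypothetical.

```python
def translate(psi, eta):
    """(U(eta) psi)(x) = psi(x - eta) on a periodic lattice."""
    n = len(psi)
    return [psi[(x - eta) % n] for x in range(n)]

def project(psi, region):
    """E(region) psi: keep the amplitudes on the sites of `region`, zero elsewhere."""
    return [amp if x in region else 0j for x, amp in enumerate(psi)]

def shift_region(region, eta, n):
    """The displaced region sigma + eta (modulo the lattice size n)."""
    return {(x + eta) % n for x in region}

n, eta = 12, 5
# An arbitrary test vector with nonzero amplitude on every site.
psi = [complex(x + 1, (-1) ** x * x) for x in range(n)]
sigma = {1, 2, 3, 7}

# Left side of (4.2) applied to psi: U(eta) E(sigma) U(eta)^-1 psi.
lhs = translate(project(translate(psi, -eta), sigma), eta)
# Right side of (4.2) applied to psi: E(sigma + eta) psi.
rhs = project(psi, shift_region(sigma, eta, n))
```

In this model `lhs` and `rhs` coincide for every choice of `psi`, `sigma` and `eta`, because shifting back, cutting out $\sigma$ and shifting forward is the same as cutting out the displaced region.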
The requirements (4.2) and (4.3) for the observable $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ are not sufficient to uniquely determine this observable. In addition, we must also take into account the use of the laboratory time scale $t$. We shall now again consider the intuitive idea that $b_0$ represents a measurement apparatus for which the measurement result can be interpreted as the registration of “position at time $\bar t$.” What should this mean?

In order to answer this question we shall now consider the original apparatus $b_0$ together with its response $b$ and a second experiment $b_0''$ and response $b''$, where $b_0''$ and $b_0$ are identical except that $b_0''$ moves relative to $b_0$ with velocity $\boldsymbol\delta$ in such a manner that, with respect to the laboratory time scale, the
apparatus $b_0''$ takes the same spatial position as $b_0$ does at the time $\bar t$. It is obvious that the apparatuses for $b_0$ and $b_0''$ cannot be applied to the same experiment. Two experimental series, one with $b_0$, the other with $b_0''$, must be carried out in order to measure the frequencies of the responses. Then $b_0''$ may therefore be obtained by the application of the following Galileo transformation to $b_0$:

$$x_\nu'' = x_\nu + \delta_\nu(t - \bar t), \qquad t'' = t. \tag{4.6}$$

Equation (4.6) may be written as a product as follows:

$$(\mathbf{1}, 0, 0, \bar t)(\mathbf{1}, \boldsymbol\delta, 0, 0)(\mathbf{1}, 0, 0, -\bar t). \tag{4.7}$$

The form (4.7) permits us to use the operators described in §2.

If $b_0$ “registers the position at time $\bar t$” by the response $b$, then $b_0''$ should do the same by the response $b''$, because both apparatuses will be in coincidence at time $\bar t$. We will therefore require that $\psi(b_0, b) = \psi(b_0'', b'')$. We may transform $\psi(b_0, b)$ to its idealized version $E(\sigma)$ and therefore require that

$$U(\mathbf{1}, 0, 0, \bar t)\,U(\mathbf{1}, \boldsymbol\delta, 0, 0)\,U(\mathbf{1}, 0, 0, \bar t)^{-1}\,E(\sigma)\,U(\mathbf{1}, 0, 0, \bar t)\,U(\mathbf{1}, \boldsymbol\delta, 0, 0)^{-1}\,U(\mathbf{1}, 0, 0, \bar t)^{-1} = E(\sigma), \tag{4.8}$$

where we have used the form (4.7) of the Galileo transformation (4.6).

We shall now show that there exists exactly one decision observable $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ which satisfies the conditions (4.2), (4.3), and (4.8). This observable will be called “position at time $\bar t$” and we shall denote the corresponding scale observables (4.4) by $Q_\nu(\bar t)$. Then the real physical meaning of the time parameter $\bar t$ is characterized by (4.8), because (4.8) can only be satisfied by a single $\bar t$. The time $\bar t$ has nothing to do with the time of occurrence of the macroscopic response $b$; in addition it has nothing to do with the notion of the “time of measurement.” Such a “time of measurement” does not refer to an instant of time, but rather to a time interval in which the interaction between the microsystem and the apparatus takes place. Rather, the time $\bar t$ is that for which the moving apparatus $b_0''$ comes into coincidence with the stationary apparatus $b_0$.
Therefore, $\bar t$ may be determined macroscopically and has nothing to do with the temporal evolution of the interaction of the microsystem with the measurement apparatus; $\bar t$ is obtained from the spatial alignment of the two registration methods $b_0$ and $b_0''$. Since we are unable to measure the position of a microsystem in a technical sense, it is not meaningful to speak of a measurement “at time $\bar t$.”

Since the $x_1, x_2, x_3$ and $\bar t$ are actually parameters of the underlying reference system and are only adjustable parameters for the registration apparatus relative to the reference system, they are not observables in the quantum mechanical sense, as these are defined and studied in IV.

We shall now prove the above assertion that the $Q_\nu(\bar t)$ are uniquely defined (for a precise mathematical formulation and proof see [10] and [22]). For
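this purpose we shall first consider the special case in which $\bar t = 0$. Then from (4.8) we obtain

$$U(\mathbf{1}, \boldsymbol\delta, 0, 0)\,E(\sigma)\,U(\mathbf{1}, \boldsymbol\delta, 0, 0)^{-1} = E(\sigma), \tag{4.9}$$

that is, $U(\mathbf{1}, \boldsymbol\delta, 0, 0)$ (of (2.11)) commutes with $E(\sigma)$. From (4.2) it follows that, with (2.9),

$$e^{i\mathbf{K}\cdot\boldsymbol\eta}\,E(\sigma)\,e^{-i\mathbf{K}\cdot\boldsymbol\eta} = E(\sigma + \boldsymbol\eta). \tag{4.10}$$

Thus, for all $Q_\nu(0)$ from (4.4) it follows that

$$e^{i\mathbf{K}\cdot\boldsymbol\eta}\,Q_\nu(0)\,e^{-i\mathbf{K}\cdot\boldsymbol\eta} = Q_\nu(0) - \eta_\nu\mathbf{1}. \tag{4.11}$$

From (2.17) we obtain a similar equation

$$e^{i\mathbf{K}\cdot\boldsymbol\eta}\,X_\nu\,e^{-i\mathbf{K}\cdot\boldsymbol\eta} = X_\nu - m\eta_\nu\mathbf{1}. \tag{4.12}$$

Therefore $\mathbf{Y} = \mathbf{Q}(0) - m^{-1}\mathbf{X}$ is an operator which commutes with $U(\mathbf{1}, \boldsymbol\delta, 0, 0)$ and with $U(\mathbf{1}, 0, \boldsymbol\eta, 0)$. Therefore we obtain:

$$\mathbf{Y}:\ \mathbf{1} \times \mathbf{Y}, \quad \text{that is,} \quad \mathbf{Q}(0) = m^{-1}\mathbf{X} \times \mathbf{1} + \mathbf{1} \times \mathbf{Y}. \tag{4.13}$$

From (4.3) it follows that, using (2.35):

$$R(A)\,\mathbf{Y}\,R(A)^{-1} = A^{-1}\mathbf{Y}. \tag{4.14}$$

Since the $Q_\nu(0)$ mutually commute, the $Y_\nu$ must also commute. Since $\mathfrak{r}_s$ is finite dimensional, the $Y_\nu$ have a common system of eigenspaces which span all of $\mathfrak{r}_s$. From (4.14) it follows that the $R(A)Y_\nu R(A)^{-1}$ are linear combinations of the $Y_\nu$, and therefore the common eigenspaces of the $Y_\nu$ are also eigenspaces of the $R(A)Y_\nu R(A)^{-1}$, that is, the $R(A)$ leave the eigenspaces of the $Y_\nu$ invariant. Since the representation of the rotation group in $\mathfrak{r}_s$ is irreducible, there are no proper invariant subspaces. Therefore $Y_\nu = \lambda_\nu\mathbf{1}$. From (4.14) it follows that $\lambda_1 = \lambda_2 = \lambda_3 = 0$ and therefore, according to (4.13), we obtain:

$$\mathbf{Q}(0) = m^{-1}\mathbf{X}. \tag{4.15}$$

Conversely, if we define a decision observable $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ according to IV, §2.5 by means of (4.15), then the so-defined observable will satisfy conditions (4.2), (4.3), and (4.8).

If $\Sigma(x_1, x_2, x_3) \xrightarrow{E} G$ is the position observable for time $t = 0$ and we define (using (2.38))

$$\tilde E(\sigma) = U(\mathbf{1}, 0, 0, \bar t)\,E(\sigma)\,U(\mathbf{1}, 0, 0, \bar t)^{-1} = e^{iH\bar t}E(\sigma)e^{-iH\bar t},$$

then, from (4.9) it follows that

$$U(\mathbf{1}, \boldsymbol\delta, 0, 0)\,U(\mathbf{1}, 0, 0, \bar t)^{-1}\,\tilde E(\sigma)\,U(\mathbf{1}, 0, 0, \bar t)\,U(\mathbf{1}, \boldsymbol\delta, 0, 0)^{-1} = U(\mathbf{1}, 0, 0, \bar t)^{-1}\,\tilde E(\sigma)\,U(\mathbf{1}, 0, 0, \bar t),$$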
that is, (4.8) holds for $\tilde E(\sigma)$. It is easy to prove that $\tilde E(\sigma)$ also satisfies conditions (4.2) and (4.3). The scale observables

$$Q_\nu(\bar t) = e^{iH\bar t}\,Q_\nu(0)\,e^{-iH\bar t} \tag{4.16}$$

therefore exactly satisfy the conditions for the desired observables “position at time $\bar t$.” The $Q_\nu(\bar t)$ must have the form (4.16), since the observable defined by $e^{-iH\bar t}Q_\nu(\bar t)e^{iH\bar t}$ must satisfy the conditions (4.2), (4.3), and (4.9).

Thus, in this way we have uniquely defined the decision observable “position at time $\bar t$” for elementary systems. We have not yet constructed a registration method $b_0$ which will permit (in the sense of IV, §4) an approximate realization of the “position observable at time $\bar t$.” We have, however, described a type of experiment which may be used in order to determine whether a given registration method approximates the “position observable at time $\bar t$.”

A similar method can be used for the definition of a “momentum observable.” We begin with a decision observable $\Sigma(p_1, p_2, p_3) \xrightarrow{E} G$ and impose the following requirements: The observable is invariant under spatial translations:

$$U(\mathbf{1}, 0, \boldsymbol\eta, 0)\,E(\sigma)\,U(\mathbf{1}, 0, \boldsymbol\eta, 0)^{-1} = E(\sigma). \tag{4.17}$$

Under rotations we require that

$$U(A, 0, 0, 0)\,E(\sigma)\,U(A, 0, 0, 0)^{-1} = E(A\sigma). \tag{4.18}$$

Under proper Galileo transformations we require that

$$U(\mathbf{1}, \boldsymbol\delta, 0, 0)\,E(\sigma)\,U(\mathbf{1}, \boldsymbol\delta, 0, 0)^{-1} = E(\sigma - m\boldsymbol\delta). \tag{4.19}$$

Here (4.19) corresponds to the intuitive idea that the motion of the registration apparatus with velocity $\boldsymbol\delta$ results in a change of momentum of the microsystems relative to the moving registration apparatus of magnitude $(-m\boldsymbol\delta)$.

The observable $\Sigma(p_1, p_2, p_3) \xrightarrow{E} G$ can be characterized by three scale observables

$$P_\nu = \int \lambda\,dE_\nu(\lambda) \tag{4.20}$$

with respect to the scales $p_1, p_2, p_3$. From (4.17) it follows that the $P_\nu$ commute with the $K_\nu$. From (2.17) and (4.19) it follows that the $Z_\nu = P_\nu + K_\nu$ commute with the $X_\nu$ and the $K_\nu$, that is, we must obtain

$$P_\nu = -K_\nu \times \mathbf{1} + \mathbf{1} \times Z_\nu.$$
Similarly, as in the case of (4.13), from (4.18) it follows that we must have $Z_\nu = 0$. Therefore, a momentum observable is uniquely determined by the requirements (4.17), (4.18), and (4.19); it is characterized by the scale observables

$$P_\nu = -K_\nu \times \mathbf{1}. \tag{4.21}$$

It is easy to see that $e^{iH\gamma}$ commutes with the $P_\nu$, that is, a time translation of the idealized registration methods corresponding to the $P_\nu$ does not produce any change in the observables.

From (2.16), (4.15), and (4.21), we obtain the famous Heisenberg commutation relation for position and momentum

$$P_\nu Q_\mu - Q_\mu P_\nu = \frac{1}{i}\,\delta_{\nu\mu}\mathbf{1}. \tag{4.22}$$

Since Planck’s constant does not appear in the theory formulated here (as we have already discussed), it is clear that we have correctly formulated the fundamental structure of quantum mechanics. Equation (4.22) is an indirect consequence of the representations of the Galileo group and is therefore only a consequence of axiom AV 4s in III, §3. For a discussion of the corresponding Heisenberg uncertainty relations, see IV, §8.3.

These observables $P_\nu$, $Q_\mu$ provide our first example for the use of “unbounded” self-adjoint operators. In quantum mechanics the meaning of unbounded operators, their domain of definition, and the precise formulation of the relation (4.22) are often treated as a mystery. Here we have found that there is, in principle, nothing unusual underlying the introduction of the unbounded operators $P_\nu$, $Q_\mu$ described above. The conceptual structure of quantum mechanics has nothing to do with this occurrence of unbounded operators. It arises exclusively (and, for the most part, effortlessly) from a mathematical idealization which has no particular physical meaning (see, for example, [1], §9 and [2], III, §8). If, for example, we describe the laboratory reference system by Euclidean geometry, it is then practical to use the Euclidean rectangular coordinates $x_\nu$ as scales, although, in principle, it is not necessary; we could instead use other finite scales.
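One elementary reason why a relation of the form (4.22) forces unbounded operators is that no pair of finite matrices can satisfy it: the trace of a commutator vanishes, while the trace of a nonzero multiple of the unit matrix does not. The sketch below illustrates this with a standard construction that is not taken from the text: $Q$ and $P$ are built from a harmonic-oscillator lowering matrix truncated to `n` levels, in units with $\hbar = 1$, so that the exact relation would read $QP - PQ = i\mathbf{1}$ (equivalent to the standard form $PQ - QP = \frac{1}{i}\mathbf{1}$). The function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def truncated_q_p(n):
    """Q = (a + a^+)/sqrt(2), P = (a - a^+)/(i sqrt(2)) with the lowering
    matrix a cut off after n levels (an illustrative truncation)."""
    a = [[0j] * n for _ in range(n)]
    for k in range(1, n):
        a[k - 1][k] = complex(math.sqrt(k))   # a|k> = sqrt(k)|k-1>
    ad = dagger(a)
    root2 = math.sqrt(2)
    q = [[(a[i][j] + ad[i][j]) / root2 for j in range(n)] for i in range(n)]
    p = [[(a[i][j] - ad[i][j]) / (1j * root2) for j in range(n)]
         for i in range(n)]
    return q, p

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]

n = 6
q, p = truncated_q_p(n)
c = commutator(q, p)
diagonal = [c[i][i] for i in range(n)]
```

The diagonal of `c` equals `i` in the first `n - 1` places and `-(n - 1) i` in the last one, so the trace is zero: the truncation reproduces the commutation relation everywhere except at the cutoff, and the defect cannot be removed for any finite `n`.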
Since the lack of finiteness for the scale $x_\nu$ has no physical meaning, it follows that in the real world arbitrarily large $x_\nu$ have no physical content (see, for example, [1], §9 and [2], IX, X).

Therefore unbounded self-adjoint operators for scale observables occur if we introduce unbounded scales. The scales for such observables are only a practical tool, as we have described in IV, §2.5.

There is, however, a second case where unbounded self-adjoint operators (and therefore also their spectra) have, in a natural way, a physical meaning. This situation arises if the self-adjoint operators occur as infinitesimal operators in the representation of a group which has a physical interpretation. We have encountered such self-adjoint operators of the form $X_\nu$, $K_\nu$, $H$, $L$ in §2 and §3. Here the spectrum, that is, the scale values are of
crucial importance for the structure of the corresponding group representations.

Both of the above viewpoints (scales as only a practical tool for the ordering of a Boolean ring of an observable, as described in IV, §2.5, and scales as a characteristic of an infinitesimal transformation) are often confused with each other. In particular, this occurs when the infinitesimal transformations are, because of their representations in terms of self-adjoint operators, called observables and are given special names. Naturally it is permissible to relate some scale observables in this way with infinitesimal transformations; here the scales are, on the basis of their definition, no longer arbitrary. Often a sufficient distinction is not made between the case where, in an experiment, a registration method $b_0$ is used with a scale which corresponds to a theoretically defined observable, and the case in which the registrations which were carried out reflect the transformations and indirectly permit conclusions concerning the spectrum of the infinitesimal transformation. In the applications presented in this book we shall be careful to point out which are preparations, which are registrations, and which are transformations.

In closing this section we shall now make a few remarks concerning the parallel case in which the Galileo group is replaced by the Poincaré group. Obviously (4.8) cannot be applied to the relativistic case because it is not possible to bring two moving systems into “coincidence.” Here it can be shown that, for elementary systems with nonzero mass, there exist position decision observables which satisfy conditions analogous to (4.2) and (4.3). These position observables are, however, not uniquely determined. For light quanta (systems of zero mass) there is no such position decision observable. This is, however, not an argument against the theory.
If, in an experiment with light quanta, something similar to a position is measured, this measurement does not correspond to a decision observable [22]. Here we have an example which shows that it is somewhat risky to discuss only the concept of an observable which we have called a decision observable. Such a restricted concept of an observable (namely that of a decision observable) would not fit all essential experimental procedures.

5 Energy and Angular Momentum Observables

In this section we shall consider two observables for which the theory of measurement methods is less well known than is the case for the position and momentum observables described in §4.

First, for purely formal reasons, we may call the observable $H$ of infinitesimal time translations (2.20):

$$U(\mathbf{1}, 0, 0, \gamma) = e^{iH\gamma} \tag{5.1}$$

the energy observable. This does not mean that (as we have already mentioned in §4) we are able to construct a $b_0$ for which $\psi(b_0, b)$ corresponds to an approximate measurement of $H$ (in the sense of IV, §4).
If there exist elementary systems, then according to (2.38) $H$ is a function of $K$ and therefore also of $P$: $H$ is therefore (in the sense of IV, D 2.5.5 and IV, D 2.5.6) a scale partial observable for the momentum observable, that is, $H$ is automatically determined by the measurement of momentum. This is no longer the case for composite systems, as we shall find in VIII, §1.

For certain systems experimental physicists have suitable registration methods for the measurement of $H$. Often, however, $H$ can only be determined experimentally in an indirect way, by means of $U(1, 0, 0, \gamma)$. For this reason we must use the notion of an energy observable $H$ carefully in applications.

The operators $L_\nu$ are defined in terms of $U(A, 0, 0, 0)$ by (3.1) in the same manner as $H$ is defined in terms of $U(1, 0, 0, \gamma)$. The scale observables $L_\nu$ are called the components of angular momentum. Again, we are not told how these observables may be measured. In [2], XI, §7.2 and [2], XII, §2.2 we show how angular momentum can be measured by means of the Stern-Gerlach experiment. In many applications, however, the role of the components of angular momentum as infinitesimal rotations is more important; a typical example is given by atomic spectra, which are discussed in XI-XIV.

For elementary systems (2.35) holds, where the $R(A)$ are the operators of the irreducible representation $D_s$ in $\mathfrak{r}_s$. In this way $U(A, 0, 0, 0)$ defines not only the total angular momentum (denoted by $J$) but, in addition, $V(A)$ defines the orbital angular momentum (denoted by $L$, as an operator in $\mathcal{H}_b$) and $R(A)$ defines the spin angular momentum (denoted by $S$, as an operator in $\mathfrak{r}_s$). By differentiation of (2.35) we obtain

$$J = L \times 1 + 1 \times S. \tag{5.3}$$

$L$, as an operator in $\mathcal{L}^2(\mathbb{R}^3, dx_1\,dx_2\,dx_3)$, is given by (3.42). $S$, as an operator in $\mathfrak{r}_s$, is given by (3.14), (3.15), (3.17), and (3.18) with $s$ instead of $j$.
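As an illustration of the spin operator $S$ entering (5.3), the following minimal sketch builds spin matrices for a given $s$ and checks two of their defining properties numerically. It assumes the standard angular momentum matrices in the basis $m = s, s-1, \ldots, -s$ with $\hbar = 1$; whether this agrees in every phase convention with (3.14)-(3.18) is an assumption, since those formulas are not reproduced here. The function names `spin_matrices` and `check_spin` are chosen for this example only.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spin_matrices(two_s):
    """Return (S1, S2, S3) for spin s = two_s / 2 in the basis m = s, ..., -s.

    Assumption: standard convention S+ |m> = sqrt(s(s+1) - m(m+1)) |m+1>.
    """
    s = two_s / 2
    dim = two_s + 1
    m_values = [s - k for k in range(dim)]
    s3 = [[complex(m_values[i]) if i == j else 0j for j in range(dim)]
          for i in range(dim)]
    s_plus = [[0j] * dim for _ in range(dim)]
    for j in range(1, dim):
        m = m_values[j]
        # |m+1> sits one index above |m> in this ordering of the basis.
        s_plus[j - 1][j] = complex(math.sqrt(s * (s + 1) - m * (m + 1)))
    s_minus = [[s_plus[j][i].conjugate() for j in range(dim)]
               for i in range(dim)]
    s1 = [[(s_plus[i][j] + s_minus[i][j]) / 2 for j in range(dim)]
          for i in range(dim)]
    s2 = [[(s_plus[i][j] - s_minus[i][j]) / 2j for j in range(dim)]
          for i in range(dim)]
    return s1, s2, s3


def check_spin(two_s):
    """Return the largest deviations from S^2 = s(s+1) 1 and [S1, S2] = i S3."""
    s = two_s / 2
    dim = two_s + 1
    s1, s2, s3 = spin_matrices(two_s)
    squares = [matmul(x, x) for x in (s1, s2, s3)]
    casimir_error = max(
        abs(sum(sq[i][j] for sq in squares) - (s * (s + 1) if i == j else 0))
        for i in range(dim) for j in range(dim))
    ab, ba = matmul(s1, s2), matmul(s2, s1)
    commutator_error = max(
        abs(ab[i][j] - ba[i][j] - 1j * s3[i][j])
        for i in range(dim) for j in range(dim))
    return casimir_error, commutator_error


if __name__ == "__main__":
    for two_s in (1, 2, 3):
        print(two_s / 2, check_spin(two_s))
```

Both deviations should vanish up to rounding for every $s$. The sketch only confirms the algebraic content of a fixed spin value; it says nothing about how such an observable could be registered.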
Here we say that the spin has the fixed value $s$, by which we mean that, according to (3.15), the relation $S^2 = s(s + 1)\,1$ holds in $\mathfrak{r}_s$.

These mathematical formulas for $L$, $S$, $J$ provide no instructions for the construction of measurement apparatuses for these observables. In the theory of atomic spectra the infinitesimal transformations characterized by $L$, $S$, $J$ play a very important role (see XI-XIV).

6 Time Observable?

In the literature of quantum mechanics the discussion about the so-called “time observable” has reached vast and overwhelming proportions. Most of this discussion rests upon a misunderstanding of the concept of an observable. In the “usual” interpretation of quantum mechanics, the one most
frequently heard by the student, the observable concept is used as a fundamental concept (see, for example, [2], XI, §1.7, where we have pointed out the inadequacy of this interpretation). It is often necessary to go to great lengths in order to provide an intuitive justification of this concept of an observable, and a discussion of the difficulties associated with it is avoided in order not to frighten the student excessively. In this approach the observable concept is introduced as a “quantity measured by an observer” or as a “measurable quantity,” etc. Clearly, in the laboratory there exist clocks by which we may “measure time”; therefore, so the argument runs, time should also be an observable.

In order to counter all such erroneous interpretations of quantum mechanics we have laid the foundations of the interpretation of quantum mechanics in II and extended this interpretation by the structure introduced in VII, §1. From this it clearly follows that measurements with “meter sticks” and “clocks” do not constitute measurements of quantum mechanical observables. For this reason we have developed the concept of an observable as a derived concept in IV (for a discussion of derived concepts see [1], §10). Meter sticks and clocks are used only for the purpose of adjusting and calibrating preparation and registration apparatuses. Therefore in quantum mechanics the spatial coordinates $x_1, x_2, x_3$ and the time coordinate $t$ are only parameters given by the laboratory reference system!

As we have found in §4, the measurement of the position observable $Q_\nu(t)$ does not mean that at time $t$ (clock time) the coordinates $x_1, x_2, x_3$ of a microsystem are measured by means of meter sticks, because the microsystem as such is “not there” in the sense that it can be measured in this way. The microsystem may only be detected by producing a response in a registration apparatus.
The introduction of the position observable $Q_\nu(t)$ in §4 clearly shows that it is concerned only with the possibilities of (idealized) registrations. The claim that the coordinates $x_1, x_2, x_3$ are defined as the measured values of the $Q_\nu(t)$ is a misunderstanding of quantum mechanics. In fact it is just the opposite: the technical process by which the coordinates $x_1, x_2, x_3$ are defined in the laboratory system must be explained independently of quantum mechanics. After it is understood that an observable $\Sigma \xrightarrow{E} G$ (here $\Sigma$ is the Boolean ring for the region of space under consideration) is determined by certain requirements, it is reasonable to also choose the previously defined $x_1, x_2, x_3$ as the scales for this observable. Therefore the $x_1, x_2, x_3$ are definite scales for the Boolean ring $\Sigma$ of the region of space under consideration and are already determined by the pre-theories. After $x_1, x_2, x_3$ are defined, the quantum mechanics of the registration process comes into play as a map $\Sigma \xrightarrow{E} G$.

The next question, which is meaningful in the context of quantum mechanics, is whether there exists a decision observable $\Sigma(t) \xrightarrow{E} G$, for which $\Sigma(t)$ is the Boolean ring of the “time domain,” which satisfies the following reasonable condition (by analogy with (4.2)):

$$U(1, 0, 0, \gamma)\,E(\sigma)\,U(1, 0, 0, \gamma)^{-1} = E(\sigma + \gamma). \tag{6.1}$$
Using the spectral family $E(\lambda) = E(\sigma)$ for $\sigma = \{t \mid t < \lambda\}$, from (6.1) we obtain the relation

$$U(1, 0, 0, \gamma)\,E(\lambda)\,U(1, 0, 0, \gamma)^{-1} = E(\lambda + \gamma), \tag{6.2}$$

where from $U(1, 0, 0, \gamma) = e^{iH\gamma}$ and $e^{iT\alpha} = \int e^{i\lambda\alpha}\,dE(\lambda)$ we obtain, by the substitution $\lambda \to \lambda - \gamma$ under the integral, the relation

$$e^{iH\gamma}\,e^{iT\alpha}\,e^{-iH\gamma} = e^{-i\gamma\alpha}\,e^{iT\alpha},$$

which we can also write in the form

$$e^{-iT\alpha}\,e^{iH\gamma}\,e^{iT\alpha} = e^{-i\gamma\alpha}\,e^{iH\gamma}.$$

If we let $\tilde{E}(\omega)$ denote the spectral family of $H$ it follows that

$$e^{-iT\alpha}\,\tilde{E}(\omega)\,e^{iT\alpha} = \tilde{E}(\omega + \alpha).$$

If (for $\omega_2 > \omega_1$) $\tilde{E}(\omega_2) - \tilde{E}(\omega_1) \neq 0$, then we also find that $\tilde{E}(\omega_2 + \alpha) - \tilde{E}(\omega_1 + \alpha) \neq 0$. Since $\alpha$ may be chosen arbitrarily, it follows that the spectrum of $H$ extends from $-\infty$ to $+\infty$, in contradiction to (2.38). Therefore a decision observable $\Sigma(t) \xrightarrow{E} G$ which satisfies (6.2) does not exist.

For elementary systems this is purely a consequence of the axioms cited in III, §3 and the conditions imposed on Galileo transformations of the registration procedures in §1. It does, however, also hold for composite systems because the Hamiltonian operators $H$ of the time translations $U(1, 0, 0, \gamma) = e^{iH\gamma}$ are, in all cases, bounded from below (see VIII, §5).

Why, however, should an observable which satisfies (6.2) exist? Is it only because it is “desirable,” even though such an observable is not realizable? If we were to succeed in constructing an apparatus which measures a “time observable” satisfying (6.2), it would be necessary to go beyond quantum mechanics to a more comprehensive theory which permits an apparatus registering the “desired” observable. Clearly, a registration apparatus which contradicts the Heisenberg uncertainty relations, or contradicts the assertion of the nonexistence of an observable $\Sigma(t) \xrightarrow{E} G$ which satisfies (6.2), has not yet been constructed. Therefore we may consider the nonexistence of the observable $\Sigma(t) \xrightarrow{E} G$ satisfying (6.2) as a statement about the structure of the real world.

Clearly there exist apparatuses, for example a particle counter, which register the time upon the detection of a particle.
Is such a counter a realization of a type of time observable? This is indeed the case. Let $b_0$ denote such a particle counter (including its spatial orientation with respect to the preparation apparatus); then we
apparently can register whether a response signal has occurred in the time interval from $t_1$ to $t_2$. Let $b_n$ be the registration that the counter has not responded at all; then the various $b \subset b_0 \setminus b_n$ register the time domain in which the signal has occurred. If we consider the ideal case where the length of the signal can be ignored, then it is reasonable to proceed from $\mathcal{R}(b_0)$ to the following Boolean ring $\tilde{\Sigma}(t)$: in $\tilde{\Sigma}(t)$ there is a particular $\sigma_n \in \tilde{\Sigma}(t)$ which is an atom of $\tilde{\Sigma}(t)$; the set $\{\sigma \mid \sigma \le \varepsilon \dotplus \sigma_n\}$ ($\varepsilon$ is the unit element of $\tilde{\Sigma}(t)$) forms a Boolean ring (with $\varepsilon \dotplus \sigma_n$ as the unit element) which is isomorphic to the Boolean ring of the time domain which was denoted by $\Sigma(t)$. Therefore $\mathcal{R}(b_0)$ can be considered to be an approximation of $\tilde{\Sigma}(t)$, where the $\psi(b_0, b)$ represent an approximation to an observable $\tilde{\Sigma}(t) \xrightarrow{F} L$. Therefore $\tilde{\Sigma}(t) \xrightarrow{F} L$ is obviously a type of “time observable.”

Does this observable satisfy the following relationship, which is analogous to (6.1)?

$$U(1, 0, 0, \gamma)\,F(\sigma)\,U(1, 0, 0, \gamma)^{-1} = F(\sigma + \gamma). \tag{6.3}$$

For real counters the $\psi(b_0, b)$ cannot, of course, exactly satisfy (6.3) because the counter can only be turned on for a finite time, and is therefore usable only for certain registrations $b$ which are not exactly at the beginning or the end of the “on” cycle of the counter. For this reason we should find that (6.3) is satisfied by $\psi(b_0, b) = F(\sigma)$ if $\gamma$ is sufficiently small. Therefore it is conceivable to require (6.3) for the idealization $\tilde{\Sigma}(t) \xrightarrow{F} L$.

The following additional idealization is, according to the previous discussions, not allowed: $\tilde{\Sigma}(t) \xrightarrow{F} L$ cannot be a decision observable! Apparently this is precisely the point where errors are often made. Since we are often only familiar with observables which are decision observables, it is often thought that the signal of a counter is a “yes-no” response of a decision observable, that is, that the registration “the signal occurs in the interval $t_1$ to $t_2$” must correspond to a projection operator (in our notation, to a $\psi(b_0, b) \in G$). This is clearly an error arising from an inadequate interpretation of the mathematical framework of quantum mechanics.

In order to show that there exist observables $\tilde{\Sigma}(t) \xrightarrow{F} L$ which satisfy (6.3) we shall now give an example. We note that (6.3) is equivalent to the following equation (which is analogous to (6.2)),

$$U(1, 0, 0, \gamma)\,F(\lambda)\,U(1, 0, 0, \gamma)^{-1} = F(\lambda + \gamma), \tag{6.4}$$

so that we need only exhibit a general spectral family which satisfies (6.4). In $\mathcal{H}_b = \mathcal{L}^2(\mathbb{R}^3, dx_1\,dx_2\,dx_3)$ we set

$$F(\lambda) = \alpha \int_{-\infty}^{\lambda} e^{iHt}\,e^{-r^2/\rho^2}\,e^{-iHt}\,dt, \tag{6.5}$$

where $e^{-r^2/\rho^2}$ is the operator consisting of multiplication by $e^{-r^2/\rho^2}$. Since $e^{-r^2/\rho^2}$ is a positive operator, the same is true for $e^{iHt}\,e^{-r^2/\rho^2}\,e^{-iHt}$. Therefore the operator (6.5) is always a positive operator. If we show that

$$\int_{-\infty}^{\infty} e^{iHt}\,e^{-r^2/\rho^2}\,e^{-iHt}\,dt \tag{6.6}$$

is a bounded operator, then $\alpha$ in (6.5) can be chosen such that $F(\infty) \le 1$. With $F(\sigma_n) = 1 - F(\infty)$ we obtain an example of an observable $\tilde{\Sigma}(t) \xrightarrow{F} L$ which satisfies (6.3), because from (6.5) it follows that

$$e^{iH\gamma}\,F(\lambda)\,e^{-iH\gamma} = \alpha \int_{-\infty}^{\lambda} e^{iH(t+\gamma)}\,e^{-r^2/\rho^2}\,e^{-iH(t+\gamma)}\,dt = \alpha \int_{-\infty}^{\lambda+\gamma} e^{iH\tau}\,e^{-r^2/\rho^2}\,e^{-iH\tau}\,d\tau = F(\lambda + \gamma).$$

In order to show that (6.6) is a bounded operator we transform from $\mathcal{L}^2(\mathbb{R}^3, dx_1\,dx_2\,dx_3)$ to $\mathcal{L}^2(\mathbb{R}^3, dk_1\,dk_2\,dk_3)$ using the inverse formula to (3.40):

$$\varphi(\vec{k}) = \frac{1}{(2\pi)^{3/2}} \int e^{-i\vec{k}\cdot\vec{x}}\,\psi(\vec{x})\,dx_1\,dx_2\,dx_3.$$

In this space the operator (6.6) takes on the form

$$\varphi(\vec{k}) \to \varphi'(\vec{k}) = \alpha_1 \int dt \int e^{(i/2m)k^2 t}\,e^{-(\rho^2/4)|\vec{k}-\vec{k}'|^2}\,e^{-(i/2m)k'^2 t}\,\varphi(\vec{k}')\,dk'_1\,dk'_2\,dk'_3, \tag{6.7}$$

where the factor $\alpha_1$ is not of any further interest. The positive operator (6.6) is bounded if

$$\langle \varphi(\vec{k}), \varphi'(\vec{k}) \rangle = \int \overline{\varphi(\vec{k})}\,\varphi'(\vec{k})\,dk_1\,dk_2\,dk_3 \le C \int |\varphi(\vec{k})|^2\,dk_1\,dk_2\,dk_3$$

is satisfied for a particular value of $C$. It is sufficient to show that this condition is satisfied for continuous $\varphi(\vec{k})$ because the latter are dense in $\mathcal{H}_b$. For continuous $\varphi(\vec{k})$ we may carry out the integration over $t$ in (6.7) by means of the Fourier transform. If we introduce polar coordinates for $\vec{k}$ and $\vec{k}'$ we obtain

$$\langle \varphi(\vec{k}), \varphi'(\vec{k}) \rangle = \alpha_2 \int \overline{\varphi(k, \theta, \phi)}\,e^{-(k^2\rho^2/4)|\vec{e}-\vec{e}'|^2}\,\varphi(k, \theta', \phi')\,k^3\,dk\,d\omega'\,d\omega, \tag{6.8}$$

where $d\omega$, $d\omega'$ are area elements of the unit sphere (or elements of solid angle) and $\vec{e}$, $\vec{e}'$ are unit vectors in the directions $\theta, \phi$ or $\theta', \phi'$. If we introduce the expansion

$$\varphi(k, \theta, \phi) = \sum_{l,m} \chi_{l,m}(k)\,Y^l_m(\theta, \phi)$$
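into (6.8) we must compute the following integrals:

$$\int\!\!\int \overline{Y^l_m(\theta, \phi)}\,e^{-(k^2\rho^2/4)|\vec{e}-\vec{e}'|^2}\,Y^{l'}_{m'}(\theta', \phi')\,d\omega\,d\omega'. \tag{6.9}$$

Since the operator defined by the integral kernel $e^{-(k^2\rho^2/4)|\vec{e}-\vec{e}'|^2}$, as an operator on functions on the unit sphere, commutes with rotations, the $Y^l_m$ must be eigenfunctions of this operator, where for fixed $l$ all $Y^l_m$ must correspond to the same eigenvalue:

$$\int e^{-(k^2\rho^2/4)|\vec{e}-\vec{e}'|^2}\,Y^l_m(\theta', \phi')\,d\omega' = c_l(k)\,Y^l_m(\theta, \phi). \tag{6.10}$$

We only need to estimate $c_l(k)$. For this purpose we shall set $m = 0$ in (6.10). Then, according to (3.55), $Y^l_0$ is real. Let $\theta_m$ denote the location of the maximum of $Y^l_0(\theta)$ and let $Y^l_0(\theta_m) = d_l$. For $\vec{e} = \vec{e}_m$ in the direction $\theta_m$, from (6.10) it follows that

$$c_l(k)\,d_l = \int e^{-(k^2\rho^2/4)|\vec{e}_m-\vec{e}'|^2}\,Y^l_0(\theta')\,d\omega' \le d_l \int e^{-(k^2\rho^2/4)|\vec{e}_m-\vec{e}'|^2}\,d\omega'. \tag{6.11}$$

Since the last integral is independent of the direction of $\vec{e}_m$ we may choose $\vec{e}_m$ to be the direction of the polar axis, and by setting $\xi = \cos\theta'$ we obtain

$$\int e^{-(k^2\rho^2/4)|\vec{e}_m-\vec{e}'|^2}\,d\omega' = 2\pi \int_{-1}^{1} e^{-(k^2\rho^2/2)(1-\xi)}\,d\xi = \frac{4\pi}{k^2\rho^2}\left(1 - e^{-k^2\rho^2}\right).$$

Thus, with the above result and (6.11), we obtain for $c_l(k)$:

$$0 \le c_l(k) \le \frac{4\pi}{k^2\rho^2}\left(1 - e^{-k^2\rho^2}\right). \tag{6.12}$$

Combining (6.8) with (6.9)-(6.12) we obtain

$$\langle \varphi(\vec{k}), \varphi'(\vec{k}) \rangle \le \alpha_3 \sum_{l,m} \int_0^{\infty} |\chi_{l,m}(k)|^2\,\frac{1 - e^{-k^2\rho^2}}{k}\,k^2\,dk. \tag{6.13}$$

From

$$\frac{1 - e^{-k^2\rho^2}}{k\rho} \le 1$$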
we finally obtain

$$\langle \varphi(\vec{k}), \varphi'(\vec{k}) \rangle \le C \sum_{l,m} \int_0^{\infty} |\chi_{l,m}(k)|^2\,k^2\,dk = C \int |\varphi(\vec{k})|^2\,dk_1\,dk_2\,dk_3.$$

Thus we have proven that the operator (6.6) is bounded.

For a real counter we neither have $\psi(b_0, b) \in G$ nor does (6.3) hold for all $\sigma$ and $\gamma$. Therefore no experimental evidence exists against the assertion of the theory that there is no observable $\Sigma(t) \xrightarrow{E} G$ which satisfies (6.1). The resistance to this fact is analogous to the resistance to the entropy theorem: it contradicts cherished beliefs.

The existence of the observable “position at time $t$” introduced in §4 represents a type of position-time measurement. In contrast with the usual situation in classical mechanics, the observables “position at time $t_1$” and “position at time $t_2$” are, for $t_1 \neq t_2$, not commensurable (in the sense of IV, D 3.1 and D 3.2). This follows simply from the fact that $Q_\nu(t_1)$ and $Q_\nu(t_2)$ do not commute for $t_1 \neq t_2$ (see IV, Th. 3.2). From (4.16), (4.21), and (2.41) it follows that

$$Q_\nu(t_2) = Q_\nu(t_1) + \frac{t_2 - t_1}{m}\,P_\nu \tag{6.14}$$

and therefore that

$$Q_\nu(t_1)\,Q_\nu(t_2) - Q_\nu(t_2)\,Q_\nu(t_1) = \frac{t_2 - t_1}{m}\left[Q_\nu(t_1)\,P_\nu - P_\nu\,Q_\nu(t_1)\right] = i\,\frac{t_2 - t_1}{m}\,1. \tag{6.15}$$

Therefore, there is no possible way to jointly measure the positions of a microsystem at two different times! This does not, however, mean that a microsystem cannot exist having the two pseudoproperties “position in $\sigma_1$ at time $t_1$” and “position in $\sigma_2$ at time $t_2$” (IV, §8.3); for example, $x$ was prepared having a position in $\sigma_1$ at time $t_1$ and was registered as having a position in $\sigma_2$ at time $t_2$.

7 Spatial Reflections (Parity Transformations)

Up to now we have considered the Galileo group as the group which is continuously connected to the unit element. This group can be given a well-defined meaning in terms of transformations of the registration apparatus. In practical applications, however, discontinuous transformations, such as the space reflection $r\colon x'_\nu = -x_\nu$, play an important role. If we expand the
Galileo group $\mathcal{G}$ by admitting transformations $A$ for which the determinant $|A| = -1$, we obtain a new group $\mathcal{G}^{(r)}$ which can be decomposed into two disconnected components: $\mathcal{G}$, as an invariant subgroup of $\mathcal{G}^{(r)}$, and the coset $r\mathcal{G}$, where $r$ is the spatial reflection transformation. If we choose, we may replace the reflection $r$ by a transformation $r_1$ in which only one of the coordinates is reversed, that is, by

$$r_1\colon\quad x'_1 = -x_1,\quad x'_2 = x_2,\quad x'_3 = x_3. \tag{7.1}$$

We then obtain $r\mathcal{G} = r_1\mathcal{G}$. We may obtain a physical interpretation of the entire group $\mathcal{G}^{(r)}$ if we give meaning to one of the transformations $r$ or $r_1$.

Here we are confronted by a problem which has often been neglected. While it is clear what it means to rotate or translate an apparatus, it is not clear what it means to subject an apparatus to a reflection $r$. The transformation (7.1) does not establish what should be done with an entire apparatus. We can visualize (7.1) as reflecting the apparatus in a mirror located at the (2, 3) plane. Here we would see the mirror image of the apparatus. The mirror image is, however, not an actual apparatus. That the production of an apparatus which is the mirror image of the original apparatus is not trivial can be seen from the example of a person: it may well be impossible. Moreover, the mirror image only establishes the spatial organization of the components of the apparatus, by means of the transformation (7.1). How do such things as electric charges, electric and magnetic fields, etc., change in the apparatus? The transformation (7.1) therefore does not establish how we should determine the corresponding transformations of registration apparatuses.
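The coset identity $r\mathcal{G} = r_1\mathcal{G}$ stated after (7.1) rests on the fact that $r$ and $r_1$ differ only by a proper rotation. A minimal sketch of this check, restricted to the $3 \times 3$ matrices acting on $(x_1, x_2, x_3)$, is given below; the constant names are illustrative only, and the translations, velocity transformations, and time translations of the full group are not represented.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose3(m):
    """Transpose of a 3x3 matrix."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Full space reflection r: x'_v = -x_v.
R_FULL = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]

# Reflection r_1 of (7.1): only the first coordinate is reversed.
R_ONE = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Rotation by the angle pi about the 1-axis (a proper rotation).
ROT_PI_AXIS1 = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]

if __name__ == "__main__":
    print(det3(R_FULL), det3(R_ONE), det3(ROT_PI_AXIS1))
    print(matmul3(R_ONE, ROT_PI_AXIS1) == R_FULL)
```

Since the rotation by $\pi$ about the 1-axis is orthogonal with determinant $+1$, it belongs to the connected component, and $r = r_1 D$ with such a $D$ gives $r\mathcal{G} = r_1 D\,\mathcal{G} = r_1\mathcal{G}$. The question raised in the text, what the reflection means for a real apparatus, is of course untouched by this purely algebraic remark.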
The arbitrariness of the application of (7.1) to an apparatus is not sufficiently noted, and plays an important role in so-called “elementary particle physics.”

In nonrelativistic quantum mechanics the action of the reflection $r$ (it is mathematically simpler to deal with $r$ rather than with $r_1$) is defined as follows: for the apparatus $b_0$ a new apparatus is built in which the spatial organization of the components and the spatial placement are changed in the sense of the transformation $r$ without changing the charges present. In spite of the objections implicit in the example of a person described above, we assume that we may build such a “reflected” apparatus.

Axiomatically we require that for $r$ there exists a p-continuous r-automorphism such that, together with the interpretation of the elements of $\mathcal{G}$ in §1, we obtain a representation of $\mathcal{G}^{(r)}$ by means of p-continuous r-automorphisms. From the representation of $\mathcal{G}^{(r)}$ by means of p-continuous r-automorphisms there arises a representation by means of $\mathcal{B}$-continuous effect automorphisms. Since $\mathcal{G}$ is not connected with $r\mathcal{G}$, according to VI, §3.2 we cannot exclude the possibility that the elements of $r\mathcal{G}$ transform one system type into
another. Only experience will lead us to impose the requirement that the $\mathcal{B}$-continuous effect automorphism corresponding to $r$ transforms an $F$ of the form $(0, 0, \ldots, F_\nu, 0, \ldots)$ into an $F$ of the same form $(0, 0, \ldots, F_\nu, 0, \ldots)$. This is equivalent to the condition that the effect automorphism corresponding to $r$ leave the “objective properties” (IV, §8.1) invariant. Thus, according to VI, §3.2 and §3.3 we may again restrict ourselves to a Hilbert space $\mathcal{H}$ of a single elementary system type and consider the representation of $\mathcal{G}^{(r)}$ in $\mathcal{H}$ by means of unitary or anti-unitary operators up to a factor. According to IV, §3.2 we must be able to represent $\mathcal{G}$ by means of unitary transformations, that is, we may assume the previous results about the representation of $\mathcal{G}$. In principle the elements of $r\mathcal{G}$ may also be represented by means of anti-unitary operators; in particular, the same is true of $r$ itself.

Let $U(r)$ denote the operator representing $r$. From $(A, \vec{\delta}, \vec{\eta}, \gamma)\,r = r\,(A, -\vec{\delta}, -\vec{\eta}, \gamma)$ it follows that

$$U(A, \vec{\delta}, \vec{\eta}, \gamma)\,U(r)\,U(A, -\vec{\delta}, -\vec{\eta}, \gamma)^{-1} = \lambda(A, \vec{\delta}, \vec{\eta}, \gamma)\,U(r). \tag{7.2}$$

For $\varphi(\vec{k}) \in \mathcal{L}^2(\mathbb{R}^3, dk_1\,dk_2\,dk_3)$ we define a unitary operator in $\mathcal{H} = \mathcal{H}_b \times \mathfrak{r}_s$ by $R \times 1$, where $R\varphi(\vec{k}) = \varphi(-\vec{k})$. It follows then, by simple computation, that

$$U(A, \vec{\delta}, \vec{\eta}, \gamma)\,(R \times 1) = (R \times 1)\,U(A, -\vec{\delta}, -\vec{\eta}, \gamma).$$

If we multiply (7.2) on the right by $R \times 1$ we obtain

$$U(A, \vec{\delta}, \vec{\eta}, \gamma)\,U(r)(R \times 1)\,U(A, \vec{\delta}, \vec{\eta}, \gamma)^{-1} = \lambda(A, \vec{\delta}, \vec{\eta}, \gamma)\,U(r)(R \times 1). \tag{7.3}$$

Thus it follows that the $\lambda(A, \vec{\delta}, \vec{\eta}, \gamma)$ form a one-dimensional unitary representation of the Galileo group and therefore must be equal to 1. Thus (7.3) states that $U(r)(R \times 1)$ commutes with all $U(A, \vec{\delta}, \vec{\eta}, \gamma)$. It follows that, for an elementary system, $U(r)(R \times 1)$ cannot be anti-unitary; therefore $U(r)$ is not anti-unitary, and $U(r)(R \times 1)$ must be a multiple of the unit operator. Since a factor in $U(r)$ is arbitrary, we may choose this factor such that

$$U(r) = R \times 1. \tag{7.4}$$

Here we stress the fact that (7.4) means that in the spin space $U(r)$ behaves like the unit operator.

We have given a physical meaning to the reflection as a transformation of registration procedures. In the applications of quantum mechanics we will use additional unitary symmetry transformations. Not all of these can be physically interpreted. This is true, for example, for many of the permutations used in VIII, §4 and XII-XV. The application of such symmetry transformations is legitimate if it illuminates the mathematical structure of a problem.
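To make the operator $R$ in (7.4) concrete, the following minimal sketch acts with $R\varphi(k) = \varphi(-k)$ on a sampled wave function in one momentum dimension and checks the intertwining relation used to pass from (7.2) to (7.3) for spatial translations only. It is an illustration under stated assumptions: the grid is symmetric about $k = 0$, and a translation by $\delta$ is assumed to act in momentum space as multiplication by $e^{ik\delta}$ (the sign of this phase is a convention not fixed by the text reproduced here). The names `momentum_grid`, `reflect`, and `translate` are hypothetical helpers for this example.

```python
import cmath


def momentum_grid(n, dk):
    """Grid of 2n + 1 momenta, symmetric about k = 0."""
    return [(j - n) * dk for j in range(2 * n + 1)]


def reflect(phi):
    """(R phi)(k) = phi(-k); on a symmetric grid this reverses the samples."""
    return phi[::-1]


def translate(phi, grid, delta):
    """Assumed momentum-space form of a translation: multiply by exp(i k delta)."""
    return [cmath.exp(1j * k * delta) * value for k, value in zip(grid, phi)]


def norm_squared(phi, dk):
    """Riemann-sum approximation of the squared norm."""
    return sum(abs(value) ** 2 for value in phi) * dk


def max_difference(a, b):
    """Largest pointwise deviation between two sampled functions."""
    return max(abs(x - y) for x, y in zip(a, b))


DK = 0.1
GRID = momentum_grid(50, DK)
# A wave packet that is deliberately not symmetric under k -> -k.
PHI = [cmath.exp(-(k - 1.0) ** 2) * cmath.exp(0.7j * k) for k in GRID]

if __name__ == "__main__":
    delta = 0.37
    left = translate(reflect(PHI), GRID, delta)     # U(delta) R phi
    right = reflect(translate(PHI, GRID, -delta))   # R U(-delta) phi
    print(max_difference(left, right))
    print(norm_squared(PHI, DK), norm_squared(reflect(PHI), DK))
```

In this restricted setting $R$ is an involution, preserves the norm, and satisfies $U(\delta)R = R\,U(-\delta)$, which is the one-dimensional analogue of the relation preceding (7.3). Rotations, velocity transformations, and the spin space, in which $U(r)$ acts as the unit operator, are not modeled.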
In addition to the spatial reflection $r$ we shall often consider another type of reflection transformation, “motion reversal.” We will introduce this transformation and its physical meaning in X, §4.

8 The Problem of the Space $\mathcal{D}$ for Elementary Systems

In this section we shall consider the problem of the space $\mathcal{D}$, which was previously discussed in VI, only for the case of elementary systems. For composite systems we shall present a brief discussion in VIII, §7.

A single type of elementary system is described by a single Hilbert space $\mathcal{H}$ and an irreducible representation of the Galileo group characterized by mass $m$ and spin $s$. The following attempt to introduce the space $\mathcal{D}$ is particularly fascinating for physics:

For the Galileo group we shall define the uniform structure which characterizes physical imprecision and is denoted by $ph$ in VI, §1.1. It is physically reasonable to do this in the following manner: in both of the three-dimensional spaces of $\vec{\eta}$ and $\vec{\delta}$ we introduce a uniform structure in an analogous way. We shall now write the formulas only for $\vec{\eta}$. The equations $\vec{e} = \vec{\eta}/|\vec{\eta}|$ and $\rho = \arctan(|\vec{\eta}|)$ define a bijective map $\vec{\eta} \leftrightarrow \rho\vec{e}$ of the infinite three-dimensional space onto the finite three-dimensional space of points within a sphere of radius $\pi/2$. We define the uniform structure $ph$ in the space of $\vec{\eta}$ as the Euclidean uniform structure in this sphere. It is easy to verify that the uniform structure $ph$ in the space of $\vec{\eta}$ also generates the Euclidean topology. It is easy to compactify the space of $\vec{\eta}$ with respect to $ph$: we need only add the surface of the sphere of radius $\pi/2$ to the space of $\rho\vec{e}$. We proceed in the same way for $\vec{\delta}$ and for the time translation $\gamma$.
Thus we obtain a uniform structure $ph$ in $\mathcal{G}$. This expresses the fact that, for large $|\vec{\eta}|$, the physical distinguishability of a pair of group elements $(1, 0, \vec{\eta}_1, 0)$ and $(1, 0, \vec{\eta}_2, 0)$ is good only with respect to the direction $\vec{e}$; for increasing $|\vec{\eta}|$ physical distinguishability deteriorates. In this way we eliminate the idealized “infinity” from the transformations. A similar situation is found for the transformations $(1, \vec{\delta}, 0, 0)$ and for the time translations $(1, 0, 0, \gamma)$. The more rapid the motion of the registration apparatus relative to the laboratory reference system, the more certain is the direction of motion, but the absolute magnitude of the velocity becomes less certain. In a similar way the magnitude of the displacement becomes less certain for large time translations. Therefore it is physically reasonable that the real registration procedures are such that the probabilities under displacement of the apparatus for large $\vec{\eta}$ do not depend strongly upon $|\vec{\eta}|$; similarly for large $\vec{\delta}$ and for large $\gamma$.

We may therefore assert that $\mathcal{D}$ is identical to the space $\Delta$ of VI, D 1.2.4. To this end we need to prove that the space $\Delta$ is norm-separable, as we assumed. We shall not prove this here; see [19] for the proof.
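The loss of distinguishability for large $|\vec{\eta}|$ described above can be illustrated numerically. The sketch below implements the map $\vec{\eta} \leftrightarrow \rho\vec{e}$ with $\rho = \arctan(|\vec{\eta}|)$ and uses the Euclidean distance between image points as a stand-in for the imprecision structure $ph$. The function names `compactify` and `ph_distance` are chosen for this example, and $\vec{\eta}$ is treated as a dimensionless triple, that is, a unit of velocity is assumed to have been fixed beforehand.

```python
import math


def compactify(eta):
    """Map eta to rho * e with e = eta / |eta| and rho = arctan(|eta|)."""
    norm = math.sqrt(sum(component * component for component in eta))
    if norm == 0.0:
        return (0.0, 0.0, 0.0)
    rho = math.atan(norm)
    return tuple(rho * component / norm for component in eta)


def ph_distance(eta1, eta2):
    """Euclidean distance of the images inside the sphere of radius pi/2."""
    return math.dist(compactify(eta1), compactify(eta2))


if __name__ == "__main__":
    # Moderate velocities differing in magnitude are clearly separated.
    print(ph_distance((1, 0, 0), (2, 0, 0)))
    # Very large velocities in the same direction are almost indistinguishable.
    print(ph_distance((1000, 0, 0), (2000, 0, 0)))
    # Very large velocities in different directions remain well separated.
    print(ph_distance((1000, 0, 0), (0, 1000, 0)))
```

The three printed distances show the behavior claimed in the text: magnitudes become indistinguishable for large velocities while directions remain distinguishable. The same construction would apply, by the analogy stated above, to $\vec{\delta}$ and to the time translation $\gamma$.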
For elementary systems the space $\Delta$ has not yet been sufficiently analyzed. In addition, $\Delta'$ (the dual Banach space to $\Delta$, in which $\mathcal{B}$ can be embedded) and the closure $K_\sigma$ of $K$ in $\Delta'$ in the $\sigma(\Delta', \Delta)$-topology are not well known (see [19]). Perhaps in the structure of $\partial_e K_\sigma$ there is a path to a new mathematical method for the treatment of quantum mechanical problems. Since $\Delta$ is norm-separable there are good physical reasons for the assertion that $\mathcal{D} = \Delta$.

Since the structure of $\Delta$ is not known in detail, the reader should be aware of additional possibilities for the definition of $\mathcal{D}$. For this reason, we shall now proceed in the opposite direction (as shown in VI, §1.2): using a subset $\Lambda$ we choose $\mathcal{D} = \mathcal{D}_\Lambda$ and we choose the $\Lambda$-uniform structure (VI, D 1.27) in $\mathcal{G}$ as the $ph$-uniform structure. As a result we lose the physical intuition for $ph$, but we gain the possibility of choosing $\Lambda$ freely within certain limitations.

A first but somewhat radical choice for $\Lambda$ consists of the selection $\Lambda = \mathcal{D}_\Lambda$, where we construct $\mathcal{D}$ from the position, momentum, and angular momentum observables. Consider the set of continuous functions $\chi$ in $\mathbb{R}^3$ with compact support. Then we construct the norm-closed algebra generated by the $\chi(P)$, $\chi(Q)$ and all the $U(A, 0, 0, 0)$. This algebra is norm-separable by construction. For $\mathcal{D}_\Lambda$ we choose the subspace of all self-adjoint operators from this algebra. Since the representation of the Galileo group is irreducible, $\mathcal{D}_\Lambda$ separates the elements of $K$ and is therefore $\sigma(\mathcal{B}', \mathcal{B})$-dense in $\mathcal{B}'$.

It is easy to see that the $\Lambda$-uniform structure on $\mathcal{G}$ generates the original topology on $\mathcal{G}$. For this we need only construct a $\Lambda$-neighborhood of the unit element of $\mathcal{G}$ which is compact in the original topology, that is, a subset in $\mathcal{G}$ which is bounded in $\vec{\eta}$, $\vec{\delta}$, $\gamma$.
Thus, with $U$ as the representative of a transformation, we need only consider

$$\operatorname{tr}\left(U^{+} P_\varphi U \chi(P)\right), \qquad \operatorname{tr}\left(U^{+} P_\psi U \chi(Q)\right)$$

for a $\varphi$ for which $\langle \vec{p} \mid \varphi \rangle$ is concentrated in momentum space, a $\psi$ for which $\langle \vec{r} \mid \psi \rangle$ is concentrated in position space (for $\langle \vec{p} \mid \varphi \rangle$, $\langle \vec{r} \mid \psi \rangle$ see IX, §5 and §6), and a $\chi$ which is concentrated in $\mathbb{R}^3$.

By means of the algebraic construction we have obtained one possible selection of $\mathcal{D}$; however, we note that, because of its algebraic nature, it does not have a clear intuitive physical interpretation. An essential aspect of the description of quantum mechanics presented in this book is that algebraic operations such as products of operators in Hilbert space do not have a clearly evident physical interpretation. We note, however, that even for this choice of $\mathcal{D}$ the structures of $\mathcal{D}'$, $K_\sigma$, $\partial_e K_\sigma$ are not well understood (see [19]).

The following approach would be more satisfying. Consider the previously described set of continuous functions $\chi$ in $\mathbb{R}^3$ satisfying $0 \le \chi \le 1$ and having compact support. Then it is plausible that if we consider $Q(0)$ to be the
“position operator at time $t = 0$,” then we are able to register the effect $\chi(Q(0))$. Then, by time displacement, we are able to measure the effects

$$e^{iHt}\,\chi(Q(0))\,e^{-iHt} = \chi(Q(t)) = \chi\!\left(Q(0) + \frac{t}{m}\,P\right).$$

Let us choose $\Lambda$ to be the set of these effects. Clearly $\mathcal{D}_\Lambda \subset \mathcal{D}_K$. $\Lambda$ is norm-separable. The investigations in [23] indirectly show that $\Lambda$ is $\mathcal{B}$-dense in $L(\mathcal{H})$; we may therefore choose $\mathcal{D} = \mathcal{D}_\Lambda$ (see VI, §1.2). We may therefore choose the $\Lambda$-uniform structure as the physical uniform structure on $\mathcal{G}$. It remains to be shown whether it generates the same topology on $\mathcal{G}$ as the original one.

Here we have pointed out several problems of a fundamental nature, because we must describe the real measurement possibilities (that is, the actual ability to distinguish between different ensembles). We have not succeeded in obtaining a unique “solution of these physical questions” by means of the formulation of axioms concerning $\mathcal{D}$ (see also the discussion of this problem in [19]). It is also important to state the open questions in order to give a better estimate of the current state of the theory.

9 The Problem of Differentiability

Differentiable functions, differentiable manifolds, etc. appear in many mathematical formulations $\mathcal{MT}$ of physical theories. Is differentiability an essential component of physics (that is, does differentiability represent an aspect of the structure of reality), or is it an artifact and a convenient mathematical idealization? There has been much philosophical discussion about this question. If a physical theory is not based upon an axiomatic basis (see I, §1 or [1], §7.3) then there is little that can be said about this problem. Since we used an axiomatic basis for the development of quantum mechanics presented here, we are able to apply the methods described in [1], §10. However, the question whether differentiability is not merely a mathematical idealization still remains.
Without using the methods of [1], §10 we may explicitly determine how differentiability arises in the mathematical formulation $\mathcal{MT}$ of quantum mechanics. Obviously the structure introduced in II-VII, §1 does not contain any axioms about differentiability. Note that the Galileo group is initially introduced as a topological group, where the group structure and the topology (more precisely, the uniform structure $ph$) reflect certain aspects of reality. By selecting a particular parameterization of the Galileo group we obtain a differentiable manifold: $\mathcal{G}$ then becomes a Lie group. Pontrjagin [24] has shown that, for compact and finite-dimensional groups, there always exist parameters by which the group $\mathcal{G}$ can be made into a Lie group. This is
also the case for many locally compact groups, in particular, for the Galileo group [25]. For these groups the structure “differentiable manifold” may be derived from the group structure and the topology. Therefore it is not necessary to introduce additional axioms. This fact is important for the following two reasons: First, the structure “differentiable manifold” represents (clearly, in idealized form) certain aspects of reality. Second, this structure represents nothing which is not already present in the topology (more accurately, in the uniform structure $ph$ of physical imprecision) together with the group structure.

Nevertheless it is, of course, correct that the differentiability structure represents an idealization about reality, the basis of which is nothing other than the idealization of a topological group. Let us consider the case of a translation group. For a given translation $a$ it is always possible to find another translation $b$, the square of which generates the given translation, that is, $b^2 = a$. However, is this true for smaller and smaller translations? Consider, for example, translations of an apparatus in the laboratory, as we have done in §1. Obviously we cannot give an answer to this question at the present time. We can express our lack of knowledge in mathematical terms by the following idealization: there exist arbitrarily small translations; but in order to proceed away from idealizations it is necessary to introduce “uncertainty sets” in the neighborhood of the unit element of the group, that is, to make $\mathcal{G}$ into a topological group (see VI, §1). The idealization of group elements which are “arbitrarily close” to the unit element leads to the differentiability structure for the Galileo group.
The above assertion that the differentiable structure represents “something” of the structure of reality may be made more precise as follows: it expresses in idealized form the fact that the group structure describes what happens in the neighborhood of the unit element, idealized in the form of infinitesimal transformations, that is, in terms of the Lie algebra which corresponds to the structure of the group. The well-known mathematical result that the Lie algebra determines the group locally is only a mathematical expression of this fact.

Once we have recognized the physical meaning of the mathematical structure, we can hardly quarrel about whether it is physically correct to use Hilbert space for the description of quantum mechanics, or whether it is more correct to use subspaces in which all finite products of the operators $K$, $X$, $L$, $H$ are defined. The answer to such a dispute is very simple: it is not a physical problem but a question about the method of computation, that is, about which mathematical methods are best suited for the solution of physical problems. Methods can be judged only by their usefulness. Here the use of Hilbert space is already a mathematical mode of description which permits us to avoid the structure of the “original” set $K$ of ensembles, of the set $L$ of effects, and of the form $\mu(w, g)$ for the probability function. Thus it is permissible to introduce new methods which facilitate the solution of
practical problems such as those which will be described in IX. Such investigations have been undertaken extensively using the Gel'fand space triple as a tool. Here we shall refer readers to the literature [26] because we cannot describe all possible more or less practical methods, especially when they do not result in any new physical structure. Unfortunately it is not always easy to determine from the literature whether we are dealing only with practical methods or with new physical structures. A problem of physical meaning can be directly related to the differentiability problem, namely in the area of the problem of the space $\mathcal{D}$ described in §8. In $\mathcal{B}(\mathcal{H})$ as well as in $\mathcal{B}'(\mathcal{H})$ we may accentuate the “differentiable” elements by asserting that $W \in K$ is “differentiable” if $i(WK - KW)$, $i(WX - XW)$, $i(WL - LW)$, $i(WH - HW)$ are elements of $\mathcal{B}(\mathcal{H})$; similarly, $F \in L$ is “differentiable” if $i(FK - KF)$, $i(FX - XF)$, $i(FL - LF)$, $i(FH - HF)$ are elements of $\mathcal{B}'(\mathcal{H})$. While the subset of differentiable ensembles has only a practical meaning and does not have physical meaning, the set of differentiable effects can be given a physical meaning through the question of whether or not the space $\mathcal{D}$ contains this set, or whether the norm-closed subspace of $\mathcal{B}'(\mathcal{H})$ spanned by this set is the space $\mathcal{D}$.
CHAPTER VIII

Composite Systems

The real great achievement of quantum mechanics is not its successful treatment of elementary systems, the basis for which was presented in VII, but its successful description of composite systems. According to VI, D 2.1 the representation of the Galileo group in Hilbert space for composite systems is reducible. In addition, there exist decision effects $E \in G(\mathcal{H})$, different from 0 and 1, which are left invariant under transformations of the Galileo group.

1 Registrations and Effects of the Inner Structure

We shall first consider the structure of those effects which are left invariant under the Galileo group as a whole, or are left invariant under subgroups of the Galileo group. We have already become familiar with some of the general properties of the set of these effects in VI, §2. In the case of the Galileo group and its subgroups we can say something more about the structure of these invariant effects. For this purpose we shall now consider some of the results from VII, §2. It is easy to verify that in the derivations up to VII (2.23) no use has been made of the fact that the representation is irreducible. The decisive next step in VII, §2 was that the Hilbert space $\mathcal{H}$ can be represented in the form (2.24), where the operators $K$ and $X$ obey the operator rules (2.25). (We have not proven these results in VII, §2; an indirect elementary proof will be provided in the discussion in IX, §5. For a proof which uses the group representation or the algebra generated by $e^{iK}$ and $e^{iX}$ see, for example, [10] and [28].)
Thus, for composite systems we may use the form

$$\mathcal{H} = \mathcal{H}_b \times \mathcal{H}_i \qquad (1.1)$$

of Hilbert space from VII (2.24) and use the operator rules for the operators $P = -K$ and $Q = (1/M)X$ in $\mathcal{H}_b$ from VII, §2 together with the commutation relation VII (2.17), where we replace $m$ by $M$. For the definition of $Q$ we have already made use of the requirement that, for composite systems, the parameter $M$ which occurs in VII (2.17) is also nonzero, a result which is not contradicted by any experiment in the fundamental domain of quantum mechanics (that is, in atomic and molecular physics). The observable $P$ will be called the “total momentum” and the observable $Q$ will be called the “position of the center of mass” for the composite system. These observables $P$ and $Q$ are, however, not uniquely determined by the conditions set down in VII, §4 (of course, $P$ and $Q$ satisfy all those requirements). According to VII (2.28) the Hamiltonian operator $H$ is given by

$$H = \frac{1}{2M}P^2 \times 1 + 1 \times H_i. \qquad (1.2)$$

The term $(1/2M)P^2$ is called the “kinetic energy” observable of the system; $H_i$ is called the Hamiltonian operator of the “inner structure” or the observable of “rest energy,” that is, the energy of the microsystems if they were prepared in such a way that the kinetic energy is (approximately) zero. According to VII (2.35), for rotations we obtain

$$U(A, 0, 0, 0) = V(A) \times R(A), \qquad (1.3)$$

where, according to VII (2.31), for $\mathcal{H}_b = \mathcal{L}^2(\mathbb{R}^3, dk_1\,dk_2\,dk_3)$ we obtain

$$V(A)\varphi(\vec{k}) = \varphi(A^{-1}\vec{k}) \qquad (1.4)$$

or, for $\mathcal{H}_b = \mathcal{L}^2(\mathbb{R}^3, dx_1\,dx_2\,dx_3)$, from VII (3.41) we obtain

$$V(A)\psi(\vec{r}) = \psi(A^{-1}\vec{r}). \qquad (1.5)$$

The $R(A)$ generate a representation of the rotation group in $\mathcal{H}_i$. For composite systems it is no longer the case that these representations in $\mathcal{H}_i$ are irreducible. Thus we find that VII (2.37) no longer holds. Instead, we only have

$$R(A)H_i = H_i R(A), \qquad (1.6)$$

which describes the rotation invariance of $H_i$.
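The product form (1.1) together with the additive Hamiltonian (1.2) can be illustrated in finite dimensions. The following is a minimal sketch, assuming small hypothetical matrices as stand-ins for $(1/2M)P^2$ and $H_i$ (the true operators act in infinite-dimensional spaces, and the names `kron_sum`, `kinetic`, and `h_inner` are illustrative, not from the text). It checks that the spectrum of an operator of the form $A \times 1 + 1 \times B$ consists of all sums of an eigenvalue of $A$ and an eigenvalue of $B$.

```python
import numpy as np

def kron_sum(a, b):
    """Return a x 1 + 1 x b on the product space (Kronecker sum)."""
    return np.kron(a, np.eye(b.shape[0])) + np.kron(np.eye(a.shape[0]), b)

# Hypothetical finite-dimensional stand-ins (not taken from the text):
kinetic = np.diag([0.0, 0.5, 2.0])               # plays the role of (1/2M) P^2
h_inner = np.array([[1.0, 0.3], [0.3, -1.0]])    # plays the role of H_i

h_total = kron_sum(kinetic, h_inner)             # toy version of (1.2)

# Every eigenvalue of H is a kinetic eigenvalue plus an inner eigenvalue.
expected = np.sort(
    np.add.outer(np.linalg.eigvalsh(kinetic), np.linalg.eigvalsh(h_inner)).ravel()
)
actual = np.linalg.eigvalsh(h_total)             # returned in ascending order
spectrum_is_additive = bool(np.allclose(actual, expected))
```

In this toy picture the energy of the composite system splits into a kinetic part and a rest-energy part, which is the content of (1.2).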
Thus we have already identified the structures which can be deduced from the structure of the Galileo group (considered as a transformation group of registration procedures) which was introduced in VII, §1. We will now introduce new terminology, describe its usage, and, in addition, obtain some consequences from (1.6). An effect is invariant under translations $(1, 0, \vec{\eta}, 0)$ and proper Galileo transformations $(1, \vec{\delta}, 0, 0)$ if and only if it is of the form $1 \times F$. Such effects will
be called “inner structure effects.” $F$ need not commute with either the $R(A)$ or with $H_i$. Such an effect does not depend on the location or the velocity of the registration apparatus corresponding to $F$ but may depend on the orientation in space or on the time the apparatus is switched on. It is easy to see that the position and momentum observables will be uniquely determined if the requirements for elementary systems are supplemented by the following additional requirement: position and momentum observables are coexistent with all inner structure effects. In the sense of the terminology described above, elementary systems also have (according to VII, §2) an “inner structure,” one which is very simple: spin, where the latter is described by the irreducible representation of the rotation group in $\mathfrak{r}_s$; for $H_i$, VII (2.37) then holds, where $H_i$ is a constant having no particular physical significance. By analogy with the expression “inner structure effect” we shall call an observable $\Sigma \xrightarrow{F} L$ an “inner structure observable” if the elements of the range of the measure are of the form $1 \times F$. An “inner structure scale observable” (according to IV, D 2.4.5 a scale observable is always a decision observable) is uniquely defined by a self-adjoint operator of the form $1 \times S$. For the angular momentum operator $J$ (for the definition see VII, §5) which corresponds to the infinitesimal rotation $R(A)$, the quantity $1 \times J$ is an inner structure observable. Since the representation of the rotation group can be completely reduced in $\mathcal{H}_i$, we may write $\mathcal{H}_i$ as a direct sum as follows:

$$\mathcal{H}_i = \sum_j \oplus\, \mathcal{H}_i^j, \qquad (1.7)$$

where $\mathcal{H}_i^j$ is the eigenspace of the operator $J^2$ with eigenvalue $j(j + 1)$. Since the $R(A)$ form a representation up to a factor of the rotation group (see VII, §2), the number $j$ in the sum must be either an integer or a half integer. The operators $1 \times F$ are all effects which are invariant under the action of the subgroup $(1, \vec{\delta}, \vec{\eta}, 0)$.
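The statement that $\mathcal{H}_i^j$ is the eigenspace of $J^2$ with eigenvalue $j(j+1)$ can be checked on the standard finite-dimensional angular momentum matrices. The sketch below is an illustration under assumptions: it uses the usual textbook matrix elements of the raising operator in the basis $m = j, j-1, \ldots, -j$ (not a construction taken from VII, §3), and the function name `angular_momentum_matrices` is hypothetical.

```python
import numpy as np

def angular_momentum_matrices(spin):
    """Return (Jx, Jy, Jz) for the given spin in the basis m = spin, ..., -spin."""
    dim = int(round(2 * spin)) + 1
    m = spin - np.arange(dim)                  # m values along the diagonal
    jz = np.diag(m).astype(complex)
    # Raising operator: <m+1| J+ |m> = sqrt(spin(spin+1) - m(m+1))
    j_plus = np.zeros((dim, dim), dtype=complex)
    for col in range(1, dim):
        j_plus[col - 1, col] = np.sqrt(spin * (spin + 1) - m[col] * (m[col] + 1))
    j_minus = j_plus.conj().T
    jx = (j_plus + j_minus) / 2
    jy = (j_plus - j_minus) / (2 * 1j)
    return jx, jy, jz

def j_squared(spin):
    """Return Jx^2 + Jy^2 + Jz^2 for the given spin."""
    jx, jy, jz = angular_momentum_matrices(spin)
    return jx @ jx + jy @ jy + jz @ jz

# J^2 acts as spin(spin+1) times the identity for integer and half-integer spin.
checks = {s: bool(np.allclose(j_squared(s), s * (s + 1) * np.eye(int(2 * s) + 1)))
          for s in (0.5, 1.0, 1.5, 2.0)}
```

A direct sum of such blocks, one or more for each value of $j$ that occurs, gives a finite-dimensional picture of the decomposition (1.7).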
The effects which are left invariant under the action of the subgroup $(A, \vec{\delta}, \vec{\eta}, 0)$ are therefore of the form $1 \times F$, where $F$ leaves each of the subspaces $\mathcal{H}_i^j$ invariant. Let $F_j$ denote the part of $F$ which acts in $\mathcal{H}_i^j$. Since $\mathcal{H}_i^j$ can itself be completely reduced with respect to irreducible representations, we may introduce a complete orthonormal basis $u^{(\nu)}_m$, where $m = -j, -j + 1, \ldots, j$, where for fixed $\nu$ the $u^{(\nu)}_m$ span an irreducible subspace and transform as described in VII, §3. That means nothing other than that $\mathcal{H}_i^j$ can be written in the form

$$\mathcal{H}_i^j = \mathfrak{r}_j \times \mathfrak{h}_j, \qquad (1.8)$$

where the restrictions of the operators $R(A)$ to $\mathcal{H}_i^j$ have the form $R_j(A) \times 1$ with respect to (1.8) (see the general questions related to (1.7) and (1.8) concerning operators which commute with a completely reducible representation in AIV, §14). Since the $F_j$ commute with all the $R_j(A) \times 1$, they must have the form $1 \times F_j$ with respect to (1.8). Thus all the effects (and
therefore all scale observables) are known which are invariant with respect to the subgroup of all $(A, \vec{\delta}, \vec{\eta}, 0)$, that is, for which the corresponding registration apparatuses may be arbitrarily translated, given arbitrary velocities, and given arbitrary orientations in space. They are the set of all effects of the form $1 \times F$ (with respect to (1.1)) where $F$ (with respect to (1.7)) transforms each subspace $\mathcal{H}_i^j$ into itself and in $\mathcal{H}_i^j$ takes on the form $1 \times F_j$ (with respect to (1.8)). This is particularly true for the case of the rest energy observable $H_i$ (see (1.6)): To each angular momentum eigenvalue $j$ there is a part $H_i^j$ of the operator $H_i$ for rest energy, where $H_i^j$ operates in $\mathfrak{h}_j$. For an effect which is invariant under the entire Galileo group the corresponding $F_j$ must also commute with the $H_i^j$. Without violating any of the previously introduced structures in the theory we may therefore begin with a series of arbitrary Hilbert spaces $\mathfrak{h}_j$, define arbitrary self-adjoint operators $H_i^j$, and then construct the $\mathcal{H}_i^j$ according to (1.8), using the spaces $\mathfrak{r}_j$ constructed in VII, §3 for an irreducible representation $D_j$ of the rotation group. From the $\mathcal{H}_i^j$ (here we observe that only whole or half integer values of $j$ may occur) we may construct $\mathcal{H}_i$ according to (1.7) and finally construct $\mathcal{H}$ according to (1.1). In this manner we find that the previously introduced structures yield a theory which does not contradict experience, but does contain too much “arbitrariness” to be useful in clarifying the structure of atoms and molecules. In the sense of [1], §10.3 or of [2], III, §7–§9 the preceding theory is not g.G.-closed: not everything which is possible in the theory really occurs.
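The step from “$F_j$ commutes with all $R_j(A) \times 1$” to “$F_j$ has the form $1 \times F_j$” is an instance of Schur's lemma, and it can be checked numerically in a small case. The sketch below is only an illustration under assumptions: it takes the spin-1/2 generators (Pauli matrices divided by two) as the irreducible factor, a hypothetical three-dimensional second factor, and counts the dimension of the space of matrices commuting with all lifted generators. The helper name `commutant_dimension` is not from the text.

```python
import numpy as np

def commutant_dimension(generators):
    """Dimension of the space of matrices X with G X = X G for every G."""
    size = generators[0].shape[0]
    eye = np.eye(size)
    # Row-major vectorisation: vec(G X - X G) = (G (x) 1 - 1 (x) G^T) vec(X)
    blocks = [np.kron(g, eye) - np.kron(eye, g.T) for g in generators]
    system = np.vstack(blocks)
    return size * size - int(np.linalg.matrix_rank(system))

pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
spin_half = [0.5 * s for s in pauli]

n = 3  # hypothetical dimension of the second factor in (1.8)
lifted = [np.kron(g, np.eye(n)) for g in spin_half]   # generators of R_j(A) x 1

dim_irreducible = commutant_dimension(spin_half)  # only multiples of the identity
dim_lifted = commutant_dimension(lifted)          # all operators of the form 1 x F
```

If the claim holds, `dim_lifted` should equal $n^2$, the dimension of the space of all operators on the second factor, which is what the form $1 \times F_j$ predicts.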
This represents a challenge to strengthen the theory by means of additional structures and axioms (that is, in the sense of [1], §8 or [2], III, §7) in order to proceed to a standard extension (“Standarderweiterung” in [1], §8).

2 Composite Systems Consisting of Two Different Elementary Systems

In VII, §2 we have found that an elementary system can be characterized by the mass $m$, the spin $s$, and perhaps (in case there is more than one elementary system type having the same mass and same spin) other discrete quantities. In nonrelativistic quantum mechanics it is possible to characterize all elementary systems by the mass $m$, spin $s$, and electric charge; we shall discuss such a description below. In most cases the charge is uniquely determined by the mass and the spin. At present there is no satisfactory physical theory which accurately predicts the masses of elementary particles. In quantum mechanics the masses of the elementary systems are “determined by experiment.” What do we mean by “experimentally determine” in the context of a physical theory? It means that we take the (finitely many!) results of experiments and, after expressing them in mathematical form, “add” them (possibly as axioms) to
the mathematical framework of the theory. A more precise description of this process is given in [1], §5 or in [2], III, §4. From these experimental results we then seek to deduce the value of the mass (as lying within a certain interval of the real line, the interval describing the so-called experimental errors). Since an analogous situation exists in the case of classical mechanics, there is some discussion of this topic in [1], §10.5. It is not necessary to present a detailed discussion in order to establish the fact that a few parameters of a theory can be determined by proceeding “backwards” from experiment; such is the case for the mass $m$ and the spin $s$ for elementary systems. Certainly it is desirable to have a theory which theoretically determines these parameters, so that we may be able to determine whether the theory is or is not contradicted by experiment. If we had such a theory, we would have established a more comprehensive description of the structure of the real world. If such a theory is not at hand then we must be satisfied with experimentally determining which elementary systems have which values of mass and spin. Here we shall only consider a restricted but yet vastly large area of knowledge, that of atoms and molecules. Here it suffices to introduce, as elementary systems:

(1) The so-called electrons, with $m = 4.13 \times 10^{9}\ \mathrm{cm}^{-1}$, $s = \tfrac{1}{2}$, and charge $e$, where $e$ is the negative of the so-called elementary unit of charge; in the units used here $e$ is dimensionless and has magnitude $\approx \sqrt{1/137}$.

(2) The different atomic nuclei, each with a specific mass $M$ (which is more than 1800 times the electron mass) and positively charged, with charge given by $Ze$ where $Z$ is an integer, the so-called charge number of the nucleus. The spin of the different nuclei can be experimentally determined. There exist tables of nuclei labeled with the corresponding values of $M$, $Z$, and $s$.
Certainly there exists a theory of atomic nuclei which permits the values of $M$, $Z$, and $s$ to be correctly assigned and which also permits the description of the structure of nuclei. The theory of atomic nuclei permits the use of the formulation of quantum mechanics presented here, in which we consider protons and neutrons as elementary systems. However, the problem of constructing the Hamiltonian operator for the composite nucleus is much more difficult than is the case for atoms and molecules; the latter case may be handled by means of postulating axioms (see (5.8)). For this reason we shall not be concerned with the application of quantum mechanics to problems in nuclear physics. Since we seek to develop a more illuminating discussion of the fundamentals of a physical theory, we shall confine our interest to the fundamental domain of atomic and molecular physics, where we may treat the atomic nuclei as elementary systems. On the other hand it is important to note that physicists generally seek to extend a successful theory to new areas, that is, to break new ground. Such efforts are attempts to discover new structure laws by using known theories combined with the aid of intuitive guesswork. We shall not discuss such attempts in this
book; the interested reader is referred to the presentation in certain sections in [2]. For the desired application domain we shall assume that the elementary systems in the theory consist of electrons and nuclei, where the different types of nuclei are listed in tabular form. The electrons are completely (with respect to the above domain of application) characterized by their mass. The nuclei are uniquely characterized by their mass and charge. Electrons have spin $\tfrac{1}{2}$; the nuclear spin will not play a role in the applications presented in this book, since we shall consider those experiments in which the nuclear spin does not play a role, or its effect is very small. It is, however, not difficult to extend the theory developed here to take into account the effect of the spin of the nucleus (hyperfine structure of spectral lines). We shall also identify the experiments which prove that the electron spin is $\tfrac{1}{2}$ in the development of applications (XI, §9 and §11). We have presented the above overview of elementary systems in order to formulate additional structure axioms for composite systems. A reader who has studied physics for a few semesters will have encountered the well-known “fact” that all atoms consist of “atomic nuclei and electrons.” What, however, should such a statement mean? It is a problem of the theory to subject such statements to scrutiny and to formulate them more precisely; by this we mean that axioms should be introduced in the mathematical framework which provide meaning for the short-hand statement that, for example, an atom consists of a nucleus and a number of electrons. In order to clarify the fundamentals we shall begin by considering a composite system consisting of two elementary systems. Let $\mathcal{H}_1$ and $\mathcal{H}_2$ denote the Hilbert spaces for the elementary systems of different types (1) and (2). Let $\mathcal{H}_3$ denote the Hilbert space of the composite system.
Previously there existed no relationship between the different Hilbert spaces $\mathcal{H}_\nu$ in $(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ which were introduced in AQ (III, §5) or by the theorem in III, §3. We now wish to impose additional requirements which will establish connections between these $\mathcal{H}_\nu$. We shall therefore begin by examining the relationships between $\mathcal{H}_1$, $\mathcal{H}_2$ (as Hilbert spaces of two elementary systems), and $\mathcal{H}_3$ (as the Hilbert space of the composite system). Each effect $F = (F_1, F_2, F_3, F_4, \ldots)$ has three components $F_1, F_2, F_3$ which refer to these three system types; similarly each ensemble $W = (W_1, W_2, W_3, W_4, \ldots)$ has three corresponding components $W_1, W_2, W_3$. We may consider $W_1$ as an element of $\mathcal{B}(\mathcal{H}_1)$, $F_1$ as an element of $\mathcal{B}'(\mathcal{H}_1)$, and similarly for (2) and (3). From experience we have found that it is not difficult to carry out experiments with pairs of systems, one each of types (1) and (2). Every scattering experiment of a system (1) onto a system (2) is of this form, and such pairs, one each of (1) and (2), must be regarded as prepared as a new system (3). We will consider the experimental situation of the scattering process in some detail in XVI, but only after we have explained what we mean by a composite system (3) of the pair (1) and (2). For a mathematical basis it is certainly desirable to use only such structures which are related to the experimental
situation (as is the case for scattering experiments) in order to derive from the original structure other structures which describe the situation “(3) is composed of (1) and (2).” However, this route has not as yet led to success. We shall therefore proceed to extrapolate from a few experimentally motivated structures without being certain whether we will eventually encounter contradictions with experience. If we prepare pairs of systems (1) and (2) where, for example, the partner (1) of each pair is on the moon and the other, (2), is on the earth, then it would appear to be possible to register the effects of system (1) independently of the registration of system (2). For certain preparations it also seems to make sense that there are effects $F_3$ for the pairs which correspond to one of the effects caused by the partner (1). Physicists often prefer to paraphrase the expression “for a specified preparation” as follows: For systems “without interaction” certain effects $F$ of a “pair” are singled out as those which are caused by the “partner” (1). Before we proceed to extrapolate this “structure” onto systems with interaction we shall first seek to describe this intuitive structure mathematically. The obvious method is to introduce the following new structure $L_1 \xrightarrow{T_1} L_3$ with the following physical interpretation: $F_3 = T_1F_1$ is precisely the effect of the pairs (3) which is actuated by (1) and corresponds to the effect $F_1$ of the elementary system (1). We shall also introduce a corresponding map $L_2 \xrightarrow{T_2} L_3$. We shall now introduce the following axioms about $T_1$ and $T_2$ in order to describe mathematically the experimental situation “pairs without interaction” for which we have earlier provided an intuitive description.

(α) $T_1$ and $T_2$ are σ-continuous injective effect morphisms.

If the systems (1) and (2) are prepared widely separated then we may obviously measure “together” both the effects caused by (1) and (2).
We therefore require that:

(β) Each effect in $T_1L_1$ is coexistent with each effect in $T_2L_2$.

Furthermore, it is reasonable to require that, for decision effects $E_1$ for (1), we cannot expect that there exists an $F_3 \in L_{30}K_{30}(T_1E_1)$ which is more sensitive than $T_1E_1$ only because the partner (2) is far removed from (1). We therefore require that:

(γ) $T_1G_1 \subset G_3$, $T_2G_2 \subset G_3$.

In order to express the fact that system (3) does not contain anything in addition to systems (1) and (2) we require that:

(δ) With the exception of the $\lambda 1$ there exist no effects which are coexistent with all $F \in T_1L_1 \cup T_2L_2$.

From these requirements it follows that (for the proof see [29]) $\mathcal{H}_3$ can be written in the following form:

$$\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2 \qquad (2.1)$$
and that the maps $T_1$, $T_2$ have the form

$$T_1(F_1) = F_1 \times 1, \qquad T_2(F_2) = 1 \times F_2. \qquad (2.2)$$

Here we note (see the proof in [29]) that conditions (α)–(δ) cannot be satisfied if the number field for the Hilbert spaces is either $\mathbb{R}$ or $\mathbb{Q}$; it must be $\mathbb{C}$ (see III, §3 and §5). Without going into the proof cited above, we may base our structure on (2.1), and define $T_1$, $T_2$ by means of (2.2). Then we may easily prove that the relations (α)–(δ) hold. The structure characterized by means of (α)–(δ) or by (2.1) and (2.2) is therefore only a representation of the physical situation of a “pair without interaction.” In scattering theory (described in XVI) the partners “before” and “after” scattering have practically no interaction. Then there exists such a structure $T_1^{(i)}$, $T_2^{(i)}$ which describes the effects of the partner (1) or (2) before scattering and a similar structure $T_1^{(f)}$, $T_2^{(f)}$ which describes the effects of the partner (1) or (2) after the scattering. For scattering theory it is decisively important that $T_1^{(f)} \neq T_1^{(i)}$, $T_2^{(f)} \neq T_2^{(i)}$, a result which we shall examine more closely in XVI, §1 and §4.4. For the nonrelativistic theory of interaction, as it is applied in the case of quantum mechanics, it is essential that we extrapolate the structure $T_1$, $T_2$ to the case of interaction. As the scattering experiments have shown, we cannot specify any “fixed” structure $T_1$, $T_2$ because in the case of interaction each effect which is triggered only by partner (1) cannot be coexistent with each effect which is triggered only by partner (2), as is required in (β). This is the case because the measurement process on partner (1) will influence, as a result of the interaction, the effect triggered later by partner (2). These intuitive ideas suggest the following attempt at extrapolation.

AZ. To each time point $t$ of the laboratory time scale there exist maps $T_{1t}$, $T_{2t}$ which satisfy (α)–(δ).
For the Galileo transformation $U_\gamma = U(1, 0, 0, \gamma)$ in $\mathcal{H}_3$ and the corresponding Galileo transformations $U_\gamma^{(1)}$, $U_\gamma^{(2)}$ in $\mathcal{H}_1$ and $\mathcal{H}_2$ we require that

$$T_{i(t+\gamma)}\bigl(U_\gamma^{(i)} F_i U_\gamma^{(i)+}\bigr) = U_\gamma (T_{it}F_i) U_\gamma^{+}$$

for $i = 1, 2$. The intuitive idea here is that the effects $T_{1t}F_1$ are such that they will be triggered by partner (1) if (1) is placed into interaction with a measurement apparatus for a very short time at time $t$. Here we assume that there exist measurement apparatuses having interaction processes which act “momentarily” between the microsystem being measured and the apparatus, and that during the brief interval in which the interaction takes place there is no noticeable change caused by the interaction between the partners, that is, if $\Delta t$ is the duration of the interaction, then

$$T_{1t}\bigl(U_{\Delta t}^{(1)} F_1 U_{\Delta t}^{(1)+}\bigr) \approx U_{\Delta t} (T_{1t}F_1) U_{\Delta t}^{+}.$$
These considerations show that in the case of elementary particle physics we cannot expect to satisfy such a condition. The long-range electromagnetic interaction of atomic and molecular structure and the small value of the elementary charge $e \approx \sqrt{1/137}$ combine to make the use of AZ possible. The requirement imposed by (AZ),

$$T_{1(t+\gamma)}\bigl(U_\gamma^{(1)} F_1 U_\gamma^{(1)+}\bigr) = U_\gamma (T_{1t}F_1) U_\gamma^{+},$$

is self-evident on the basis of the physical interpretation of the Galileo group presented in VII. If $T_{1t}F_1$ is an effect for which a measurement interaction takes place at time $t$, then $U_\gamma (T_{1t}F_1) U_\gamma^{+}$ is the effect which is produced if the measurement apparatus is applied at a time $\gamma$ later, that is, when the measurement interaction takes place at time $t + \gamma$; this is again an effect triggered by partner (1), which corresponds to the effect $T_{1(t+\gamma)}\bigl(U_\gamma^{(1)} F_1 U_\gamma^{(1)+}\bigr)$, as if system (1) had been triggered at the displaced time. The requirement above is mathematically only a definition for $T_{1t}$ if $T_{10}$ (that is, $T_{1t}$ for $t = 0$) is known. According to (2.1) and (2.2) there exists a product representation $\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$ corresponding to the maps $T_{10}$, $T_{20}$; a similar situation also holds for $T_{1t}$, $T_{2t}$. Since the product representation of $\mathcal{H}_3$ obtained in this way changes with time, it is not well suited for practical problems. In X we shall become familiar with the reformulation of the time variability described by $U_\gamma$ in the form of the Schrödinger picture and the interaction picture, where the product representation of $\mathcal{H}_3$ can be considered not to change with time. Now that we have introduced the maps $T_{1t}$, $T_{2t}$ we must warn the reader about a possible error concerning their physical interpretation: If $b_0$ is a registration procedure by which the effect $F_1$ can be registered by systems of type (1), that is, for $b \subset b_0$, $\psi(b_0, b) = (F_1, F_2, F_3, \ldots)$, it no longer follows (even in the case $F_2 = 0$!) that there exists a time $t$ for which $F_3 = T_{1t}F_1$.
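The way in which the AZ requirement defines $T_{1t}$ from $T_{10}$ can be tried out on a toy model. The sketch below rests on several assumptions that are not in the text: two two-dimensional Hilbert spaces, hypothetical Hamiltonians `h1`, `h2` and a hypothetical coupling term, and the sign convention $U_t = e^{-iHt}$ for the time translation. Setting $t = 0$ and $\gamma = t$ in the requirement and using $T_{10}F = F \times 1$ gives $T_{1t}G = U_t\bigl((U_t^{(1)+} G\, U_t^{(1)}) \times 1\bigr)U_t^{+}$, which is what the function `lift` computes.

```python
import numpy as np

def evolution(hamiltonian, t):
    """Unitary exp(-i H t) from the eigen-decomposition of a Hermitian H
    (the sign convention is an assumption of this sketch)."""
    energies, vectors = np.linalg.eigh(hamiltonian)
    return (vectors * np.exp(-1j * energies * t)) @ vectors.conj().T

def lift(effect, h_sub, h_total, t, slot, dim_other):
    """Toy T_{1t} (slot=1) or T_{2t} (slot=2), built from T_{i0} and U_t."""
    u_sub = evolution(h_sub, t)
    u = evolution(h_total, t)
    pulled_back = u_sub.conj().T @ effect @ u_sub
    eye = np.eye(dim_other)
    at_zero = np.kron(pulled_back, eye) if slot == 1 else np.kron(eye, pulled_back)
    return u @ at_zero @ u.conj().T

# Hypothetical toy data (not from the text).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
h1 = np.diag([0.0, 1.0]).astype(complex)
h2 = np.diag([0.0, 0.7]).astype(complex)
f1 = np.diag([1.0, 0.0]).astype(complex)                  # a decision effect of (1)
f2 = np.array([[0.8, 0.2], [0.2, 0.3]], dtype=complex)    # an effect of (2)

def total_hamiltonian(coupling):
    free = np.kron(h1, np.eye(2)) + np.kron(np.eye(2), h2)
    return free + coupling * np.kron(sx, sx)

def commutator_norm(a, b):
    return float(np.linalg.norm(a @ b - b @ a))

t = 1.0
free_case = lift(f1, h1, total_hamiltonian(0.0), t, 1, 2)
coupled_1 = lift(f1, h1, total_hamiltonian(0.4), t, 1, 2)
coupled_2 = lift(f2, h2, total_hamiltonian(0.4), t, 2, 2)
```

In this model, without coupling the lifted effect is again $F_1 \times 1$, so the product representation does not change with time. With coupling, $T_{1t}F_1$ is no longer of the form $A \times 1$ relative to the representation at $t = 0$, yet $T_{1t}F_1$ and $T_{2t}F_2$ taken at the same time $t$ still commute, in line with (β).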
The registration procedure which registers the effect $F_1$ for the systems of type (1) need no longer register the effect $T_{1t}F_1$ for composite systems of type (3). If, however, $b_0$ is a registration method in the special sense described above, that is, one which for all practical purposes is in interaction only at time $t$, then for the case $F_2 = 0$ we would expect that $F_3 = T_{1t}F_1$. We may extend the requirement AZ as follows: There exist registration methods $b_0$ and registration procedures $b$ for which $\psi(b_0, b) = (F_1, 0, T_{1t}F_1, \ldots)$. On the basis of (2.1) and (2.2) “at time $t = 0$” we may, in a formal sense(!), transfer the rules for Galileo transformations for $\mathcal{H}_1$, $\mathcal{H}_2$ onto $\mathcal{H}_3$, and
even carry out different Galileo transformations in $\mathcal{H}_1$ and $\mathcal{H}_2$. That is, for the group $\mathcal{G} \times \mathcal{G}$ (see AV, §5) with elements $((A_1, \vec{\delta}_1, \vec{\eta}_1, \gamma_1), (A_2, \vec{\delta}_2, \vec{\eta}_2, \gamma_2))$ we define

$$U[(A_1, \vec{\delta}_1, \vec{\eta}_1, \gamma_1), (A_2, \vec{\delta}_2, \vec{\eta}_2, \gamma_2)] = U^{(1)}(A_1, \vec{\delta}_1, \vec{\eta}_1, \gamma_1) \times U^{(2)}(A_2, \vec{\delta}_2, \vec{\eta}_2, \gamma_2), \qquad (2.3)$$

where $F_1 \times 1$ transforms into

$$U[(A_1, \ldots), (A_2, \ldots)](F_1 \times 1)U[(A_1, \ldots), (A_2, \ldots)]^{+} = U^{(1)}(A_1, \ldots)F_1U^{(1)}(A_1, \ldots)^{+} \times 1. \qquad (2.4)$$

A similar result holds for $1 \times F_2$. Here $F_1 \times 1$ transforms according to the first Galileo transformation $(A_1, \ldots)$ and $1 \times F_2$ according to the other transformation $(A_2, \ldots)$. Is the formal expression (2.4) physically meaningful? If we consider a pair (1), (2) of systems which do not interact (for example, (1) and (2) may be widely separated), then it appears to be possible to subject the registration methods for (1) and (2) to separate(!) Galileo transformations (which are not very distant from the unit element). However, for the case of interaction we cannot simply combine two different registration methods for systems (1) and (2) into a new registration method for the pair (3). Indeed $T_{10}F_1 = F_1 \times 1$ and $T_{20}F_2 = 1 \times F_2$ are coexistent, that is, there exists a registration method $b_0$ and registration procedures $b_1 \subset b_0$, $b_2 \subset b_0$ such that $\psi(b_0, b_1) = (\cdot, \cdot, F_1 \times 1, \ldots)$, $\psi(b_0, b_2) = (\cdot, \cdot, 1 \times F_2, \ldots)$. However, we do not know how to construct the apparatus for $b_0$. At least we do not expect that parts of the apparatus for $b_0$ can be subjected to distinct Galileo transformations. Thus it appears to be more reasonable not to generally interpret the transformations in (2.3) as transformations of registration methods. On the contrary it appears only reasonable to investigate the relationship between the Galileo transformations of the entire registration method and their representations in the Hilbert space $\mathcal{H}_3$ for systems of type (3).
From the extrapolation of systems (1) and (2) without interaction we obtain the following answer which appears to be reasonable: In (2.3) we require that $(A_2, \vec{\delta}_2, \vec{\eta}_2, \gamma_2) = (A_1, \vec{\delta}_1, \vec{\eta}_1, \gamma_1)$, that is, the representation of the Galileo group in $\mathcal{H}_3$ with respect to the product representation of $\mathcal{H}_3$ “at time $t = 0$” ($\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$) is given as follows:

$$U(A, \vec{\delta}, \vec{\eta}, \gamma) = U^{(1)}(A, \vec{\delta}, \vec{\eta}, \gamma) \times U^{(2)}(A, \vec{\delta}, \vec{\eta}, \gamma). \qquad (2.5)$$

Equation (2.5) cannot hold in general, that is, it leads to contradictions with experience (for example, for scattering processes, see XVI) since for time translations it describes systems “without interaction.” The structure of
the representation of the Galileo group for composite systems which was described in general in §1 may not therefore be extended by means of the severe condition (2.5). The fact that it is possible to solve the interaction problem for the case of nonrelativistic quantum mechanics by means of a small change of (2.5) has led to overwhelming success in the applications of quantum mechanics. However, the fact that an analogous solution is not possible for relativistic quantum mechanics suggests that a similar closed theory for elementary particles does not exist. The physically functional but not very elegant solution of the interaction problem for nonrelativistic quantum mechanics is obtained as follows: With respect to the product representation of $\mathcal{H}_3$ at time $t = 0$, (2.5) is required only for $\gamma = 0$, that is, for the subgroup of the Galileo group of all $(A, \vec{\delta}, \vec{\eta}, 0)$ considered in §1 we require that:

$$U(A, \vec{\delta}, \vec{\eta}, 0) = U^{(1)}(A, \vec{\delta}, \vec{\eta}, 0) \times U^{(2)}(A, \vec{\delta}, \vec{\eta}, 0). \qquad (2.6)$$

Thus, for the infinitesimal transformations “at time $t = 0$” it follows that:

$$K = K_1 \times 1 + 1 \times K_2, \qquad X = X_1 \times 1 + 1 \times X_2, \qquad J = J_1 \times 1 + 1 \times J_2, \qquad (2.7)$$

where $J$, $J_1$, $J_2$ are the angular momentum operators. From the first two equations and VII (2.16) we obtain the commutation relations

$$K_\mu X_\nu - X_\nu K_\mu = iM\delta_{\nu\mu}1 = (K_{1\mu}X_{1\nu} - X_{1\nu}K_{1\mu}) \times 1 + 1 \times (K_{2\mu}X_{2\nu} - X_{2\nu}K_{2\mu}) = i m_1\delta_{\nu\mu}\,1 \times 1 + i m_2\delta_{\nu\mu}\,1 \times 1.$$

It follows that $M = m_1 + m_2$; the mass $M$ of the composite system (3) is equal to the sum of the masses of the two systems (1) and (2). For the position and momentum observables given by VII, §4, that is, for

$$P_1 = -K_1, \qquad P_2 = -K_2, \qquad Q_1 = \frac{1}{m_1}X_1, \qquad Q_2 = \frac{1}{m_2}X_2,$$

from (2.7) it follows that the total momentum defined in §1 is given by

$$P = P_1 \times 1 + 1 \times P_2 \qquad (2.8)$$

and the position of the center of mass $Q = X/M$ is given by

$$Q = \frac{m_1 Q_1 \times 1 + m_2\, 1 \times Q_2}{m_1 + m_2}. \qquad (2.9)$$

The term “center of mass” arises from the form of the right side of (2.9). If (2.5) were correct, then it would follow that

$$H = \frac{1}{2m_1}P_1^2 \times 1 + 1 \times \frac{1}{2m_2}P_2^2. \qquad (2.10)$$
In the case in which misunderstandings are unlikely we shall use the more familiar notation of the physicist and write (2.10) in the form:

$$H = \frac{1}{2m_1}P_1^2 + \frac{1}{2m_2}P_2^2. \qquad (2.11)$$

If $H$ in (1.2) has the form (2.11) we then say that it describes the situation “no interaction.” We write (1.2) in the form

$$H = \frac{1}{2m_1}P_1^2 + \frac{1}{2m_2}P_2^2 + H_j, \qquad (2.12)$$

where $H_j$ is called the “interaction operator.” In (2.12) we may replace $P_1$, $P_2$ by $P$ from (2.8) and by the following new operator:

$$P_r = \frac{m_2}{m_1 + m_2}P_1 - \frac{m_1}{m_1 + m_2}P_2. \qquad (2.13)$$

Here we call $P_r$ the relative momentum. Then (2.12) transforms into

$$H = \frac{1}{2M}P^2 + \frac{1}{2m}P_r^2 + H_j, \quad \text{where } m = \frac{m_1 m_2}{m_1 + m_2}; \qquad (2.14)$$

$m$ is called the reduced mass. It is easy to verify that $P_r$ commutes with $X$ and with all of the $U(1, \vec{\delta}, \vec{\eta}, 0)$. Therefore, relative to $\mathcal{H}_b \times \mathcal{H}_i$, that is, relative to (1.1), we obtain

$$P_r = 1 \times P_r. \qquad (2.15)$$

Since (1.2) holds, $H_j$ must have the form

$$H_j = 1 \times H_j \qquad (2.16)$$

relative to (1.1), from which it follows from (1.2) that

$$H_i = \frac{1}{2m}P_r^2 + H_j. \qquad (2.17)$$

(The notation of the physicist permits us to consider two different representations of the same Hilbert space as a product space without changing symbols or indices, namely $\mathcal{H}_3 = \mathcal{H}_b \times \mathcal{H}_i = \mathcal{H}_1 \times \mathcal{H}_2$, and to write the corresponding operators accordingly. Here we must always take care to note which representation of a product space is used in formulas like (2.7) and (2.16)!) In addition to the center of mass position, we introduce the relative position operator $Q_r = Q_1 - Q_2$; it is easy to show that $Q_r$ commutes with $Q$ and $P$, that is, it is of the form

$$Q_r = 1 \times Q_r \qquad (2.19)$$
relative to (1.1). On the other hand it follows that

$$P_{r\nu}Q_{r\mu} - Q_{r\mu}P_{r\nu} = \frac{1}{i}\delta_{\mu\nu}1. \qquad (2.20)$$

The “coordinate transformation” of $Q_1, P_1, Q_2, P_2$ to $Q, P, Q_r, P_r$ leads to the following new notation: For $\mathcal{H}_1 = \mathcal{H}_{1b} \times \mathfrak{r}_{1s}$, $\mathcal{H}_2 = \mathcal{H}_{2b} \times \mathfrak{r}_{2s}$ we obtain

$$\mathcal{H}_1 \times \mathcal{H}_2 = (\mathcal{H}_{1b} \times \mathcal{H}_{2b}) \times \mathfrak{r}_{1s} \times \mathfrak{r}_{2s} \qquad (2.21)$$

and

$$\mathcal{H}_{1b} \times \mathcal{H}_{2b} = \mathcal{H}_b \times \mathcal{H}_{rb}, \qquad (2.22)$$

where $Q$, $P$ are operators in $\mathcal{H}_b$ and $Q_r$, $P_r$ are operators in $\mathcal{H}_{rb}$. From (2.22), (2.21), and (1.1) it follows that

$$\mathcal{H}_i = \mathcal{H}_{rb} \times \mathfrak{r}_{1s} \times \mathfrak{r}_{2s}. \qquad (2.23)$$

If the subsystem (2) has no spin, or if we can neglect the spin of (2) (a more precise formulation of what is meant by being able to “neglect” spin is given in [2], XI, §7.5), then we can ignore the space $\mathfrak{r}_{2s}$ in (2.23) and we obtain

$$\mathcal{H}_i = \mathcal{H}_{rb} \times \mathfrak{r}_{1s}. \qquad (2.24)$$

Not only the form (2.24) but also the algebra of the operators $P_r$, $Q_r$ and the rotations (as we shall later prove) are completely equivalent with that of the elementary system (1), with the exception of the Hamiltonian operator $H_i$ described by (2.17) and the mass factor $m$ ($m$ is the reduced mass in (2.14)). It is now necessary to consider the above assertion about rotations. From $U(A) = U^{(1)}(A) \times U^{(2)}(A)$ relative to (2.1) it follows, with

$$U^{(1)}(A) = V^{(1)}(A) \times R^{(1)}(A), \qquad U^{(2)}(A) = V^{(2)}(A) \times R^{(2)}(A),$$

that

$$U(A) = \bigl(V^{(1)}(A) \times V^{(2)}(A)\bigr) \times R^{(1)}(A) \times R^{(2)}(A)$$

relative to (2.21). Here $V^{(1)}(A) \times V^{(2)}(A)$ is an operator in $\mathcal{H}_{1b} \times \mathcal{H}_{2b}$. It is easy to show that from the change in the product representation according to (2.22) we obtain

$$V^{(1)}(A) \times V^{(2)}(A) = V(A) \times V_r(A), \qquad (2.25)$$

where the left (right) side of (2.25) corresponds to the left (right) side of (2.22). The operators $V^{(1)}$, $V^{(2)}$, $V$, $V_r$ behave in the manner which we have generally described for an orbit space in VII, §3. According to (2.23), in $\mathcal{H}_i$ we have the representation

$$R(A) = V_r(A) \times R^{(1)}(A) \times R^{(2)}(A), \qquad (2.26)$$

where $R(A)$ is defined according to (1.3). For the angular momentum, from (2.7) together with VII (5.3) we obtain

$$J = J_1 + J_2 = L_1 + L_2 + S_1 + S_2 = L + L_r + S_1 + S_2, \qquad (2.27)$$
where $L$ is the orbital angular momentum in $\mathcal{H}_b$ and $L_r$ that in $\mathcal{H}_{rb}$. Since $V(A)$ and $V_r(A)$ behave in the same way as described in VII, §3 for an orbit space, from VII (3.42) it follows that

$L = Q \times P, \quad L_r = Q_r \times P_r.$ (2.28)

Therefore, if we may neglect the spin $S_2$ of the elementary system (2), we obtain a description in $\mathcal{H}_i$ which is equivalent to that of an elementary system of mass $m$, spin $S_1$, and Hamiltonian operator (2.17). Such a description is applicable to a system consisting of an electron as system (1) and an atomic nucleus as system (2) (see XI, §3).

The structure for the dynamics of composite systems is therefore determined when the operator $H_j$ is explicitly given. The form of the Hamiltonian operator will be discussed later in this book.

In closing we emphasize the fact that the observables $Q_1 \times 1$, $1 \times Q_2$, $P_1 \times 1$, $1 \times P_2$ for the representation $\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$ refer to the time $t = 0$. For example, the operator $Q^{(1)}(0) = Q_1 \times 1$ corresponds to the “position of the subsystem (1) at time $t = 0$”; the “position of the subsystem (1) at time $t$,” using the interpretation of the time translation operator $U_\gamma = U(1, 0, 0, \gamma)$, corresponds to the operator

$Q^{(1)}(t) = U_t\, Q^{(1)}(0)\, U_t^+ = U_t (Q_1 \times 1) U_t^+,$

which does not take the form $A \times 1$ with respect to the product representation $\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$ defined by $T_{10}$, $T_{20}$. It does, however, have this form with respect to the product representation corresponding to $T_{1t}$, $T_{2t}$. A similar situation is found for the momenta of the subsystems (1) and (2). The momenta of the subsystems (and their angular momenta) are no longer constant in time!
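The passage from (2.12) to (2.14) rests on a purely algebraic identity for the kinetic energy, and since the components of $P_1$ and $P_2$ commute with one another it can be checked with ordinary numbers. The following is a minimal sketch of such a check, not part of the original text; the function names and the sample values are illustrative assumptions, and exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction


def squared_length(vector):
    """Sum of the squared components of a vector."""
    return sum(component * component for component in vector)


def relative_momentum(p1, p2, m1, m2):
    """Relative momentum of (2.13): (m2*p1 - m1*p2) / (m1 + m2), componentwise."""
    total_mass = m1 + m2
    return [(m2 * a - m1 * b) / total_mass for a, b in zip(p1, p2)]


def kinetic_two_body(p1, p2, m1, m2):
    """Kinetic part of (2.12): p1^2/(2 m1) + p2^2/(2 m2)."""
    return squared_length(p1) / (2 * m1) + squared_length(p2) / (2 * m2)


def kinetic_center_of_mass_form(p1, p2, m1, m2):
    """Kinetic part of (2.14): P^2/(2M) + P_r^2/(2m), with m the reduced mass."""
    total_mass = m1 + m2
    reduced_mass = m1 * m2 / total_mass
    total_momentum = [a + b for a, b in zip(p1, p2)]
    p_rel = relative_momentum(p1, p2, m1, m2)
    return (squared_length(total_momentum) / (2 * total_mass)
            + squared_length(p_rel) / (2 * reduced_mass))


# Illustrative values only (mass ratio roughly that of electron and proton).
m1, m2 = Fraction(1), Fraction(1836)
p1 = [Fraction(3), Fraction(-2), Fraction(5)]
p2 = [Fraction(-1), Fraction(4), Fraction(7)]
same = kinetic_two_body(p1, p2, m1, m2) == kinetic_center_of_mass_form(p1, p2, m1, m2)
```

For equal masses the function `relative_momentum` reduces to half the difference of the two momenta, which is the form used for identical systems in (3.8).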
3 Composite Systems Consisting of Two Identical Elementary Systems

If the system types (1) and (2) are identical, then the intuitive reasoning which led to (2.1) and (2.2) is fundamentally incorrect, because it is impossible to distinguish the “effect of subsystem (1)” from that of “subsystem (2).” The following intuitive approach to this problem has been fruitful: Let us suppose that we have constructed a product space at time $t = 0$ from a Hilbert space $\mathcal{H}_1$ for a system of type (1) as follows:

$\mathcal{H}_1^2 = \mathcal{H}_1 \times \mathcal{H}_1$ (3.1)
(see §4 for more details). Here, of course, it does not make sense to apply (2.2). Since both systems are identical, all “effects” are invariant under exchange of the two systems and, consequently, (3.1) cannot be the “correct” Hilbert space for the composite system. How can we formulate this invariance in mathematical terms?

For this purpose we define an exchange operator as a linear operator in $\mathcal{H}_1^2$ as follows:

$P\,\varphi_\nu(1)\varphi_\mu(2) = \varphi_\mu(1)\varphi_\nu(2),$ (3.2)

where the $\varphi_\nu$ form a complete orthonormal basis in $\mathcal{H}_1$. It is easy to verify (see §4) that the operator $P$ defined in (3.2) is independent of the complete orthonormal basis used in (3.2). In addition, $P$ is obviously unitary and satisfies $P^2 = 1$. Therefore we may partition $\mathcal{H}_1^2$ into two subspaces $\{\mathcal{H}_1^2\}_+$ and $\{\mathcal{H}_1^2\}_-$, where $\{\mathcal{H}_1^2\}_+$ is the eigenspace of $P$ with eigenvalue $+1$ and $\{\mathcal{H}_1^2\}_-$ is the eigenspace of $P$ with eigenvalue $-1$. An operator $A$ in $\mathcal{H}_1^2$ is said to be symmetric if

$PA = AP.$ (3.3)

All effects of systems (3) should therefore be (intuitively speaking) symmetric operators, that is, $\{\mathcal{H}_1^2\}_+$ and $\{\mathcal{H}_1^2\}_-$ are invariant subspaces with respect to symmetric effects. Since the set of symmetric effects is a proper subset of the set $L(\mathcal{H}_1^2)$, $\mathcal{H}_1^2$ cannot be a Hilbert space for the system type (3), because $L(\mathcal{H}_3)$ is the set of effects for system (3). However, we may identify $\mathcal{H}_3$ with $\{\mathcal{H}_1^2\}_+$ or with $\{\mathcal{H}_1^2\}_-$, because it is clear that every self-adjoint operator which leaves $\{\mathcal{H}_1^2\}_+$ or $\{\mathcal{H}_1^2\}_-$ invariant is also symmetric. Both of these intuitively motivated possibilities have proven to be successful.

We shall now formulate the additional structure needed to describe systems of type (3) which are composed of pairs of identical systems of type (1).

Case (+):

$\mathcal{H}_3 = \{\mathcal{H}_1^2\}_+$ (3.4)

together with the map $T_0$ (that is, $T_t$ for $t = 0$) of $L_1 = L(\mathcal{H}_1)$ into $L_3 = L(\mathcal{H}_3)$ defined by

$T_0(F_1) = \tfrac{1}{2}(F_1 \times 1 + 1 \times F_1).$ (3.5)

Case (−):

$\mathcal{H}_3 = \{\mathcal{H}_1^2\}_-$ (3.6)

together with the map $T_0$ of $L_1 = L(\mathcal{H}_1)$ into $L_3 = L(\mathcal{H}_3)$ defined by

$T_0(F_1) = \tfrac{1}{2}(F_1 \times 1 + 1 \times F_1).$ (3.7)

Here (3.5) and (3.7) are the analogs of (2.2). $T_0(F_1)$ is the effect which is triggered by one of the subsystems of type (1) as if the other were not present.
In §2 we have already mentioned the fact that the physical interpretation of this requirement is problematical.

Which of the cases (+) or (−) are we to choose? That depends on the elementary system type (1) and, according to experience and certain results from quantum field theory, depends only on the spin of the system type (1). If the spin takes on an integer value, then case (+) is to be chosen; we call such systems “Bose systems.” If the spin takes on a half-integer value, then case (−) is to be chosen; such systems are called “Fermi systems.”

Many of the ideas in §2 are applicable to the cases (+) and (−). Clearly (2.3) is not applicable; nevertheless the formula (2.6) is applicable. Thus the formulas (2.7)-(2.12) remain applicable and we find that $H$ and the exchange operator $P$ must commute. In applying the material following (2.13) we must be cautious. If we define, in a formal manner using (2.13) (here $m_1 = m_2$!),

$P_r = \tfrac{1}{2}(P_1 - P_2),$ (3.8)

then $P_r$ cannot be an observable of system (3) because it is not symmetric. Of course, (2.14) is applicable because $P_r^2$ is symmetric.

We must now consider the decomposition

$\mathcal{H}_3 = \{\mathcal{H}_1^2\}_\pm = \mathcal{H}_b \times \mathcal{H}_i.$ (3.9)

$\mathcal{H}_b$ is defined as in §2, but $\mathcal{H}_i$ is not of the form (2.23). What is the structure of $\mathcal{H}_i$ in (3.9)? This we may discover most simply, by analogy with (3.8), if we assume the formal definition (2.18). Then, corresponding to (2.21) and (2.22), we obtain:

$\mathcal{H}_1^2 = \mathcal{H}_{1b}^2 \times \mathfrak{r}_{1s}^2 = \mathcal{H}_b \times \mathcal{H}_{rb} \times \mathfrak{r}_{1s}^2.$ (3.10)

How does the exchange operator $P$ behave? To answer this question we define $P_s$ in the obvious way as the exchange operator in $\mathfrak{r}_{1s}^2$, and the parity operator $R$ in $\mathcal{H}_{rb} = \mathcal{L}^2(\mathbb{R}^3, dx_1\,dx_2\,dx_3)$ (see VII, §7) as follows:

$R\psi(x_1, x_2, x_3) = \psi(-x_1, -x_2, -x_3).$ (3.11)

Then, according to the representation on the right side of (3.10), we obtain

$P = 1 \times R \times P_s.$ (3.12)

Therefore $\mathcal{H}_i$ in $\mathcal{H}_3 = \mathcal{H}_b \times \mathcal{H}_i$ is the subspace of $\mathcal{H}_{rb} \times \mathfrak{r}_{1s}^2$ which corresponds to the eigenvalue $+1$ of $R \times P_s$ for Bose systems and to the eigenvalue $-1$ for Fermi systems.
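The operators $H_i$ and $H_j$ must, of course, commute with $R \times P_s$. If $\mathcal{H}_{rb+}$ is the eigenspace of $R$ with eigenvalue $+1$ and $\mathcal{H}_{rb-}$ that with eigenvalue $-1$, then we obtain

$\mathcal{H}_i = \mathcal{H}_{rb+} \times \{\mathfrak{r}_{1s}^2\}_+ \oplus \mathcal{H}_{rb-} \times \{\mathfrak{r}_{1s}^2\}_-$ (3.13)

for Bose systems; for Fermi systems we obtain

$\mathcal{H}_i = \mathcal{H}_{rb-} \times \{\mathfrak{r}_{1s}^2\}_+ \oplus \mathcal{H}_{rb+} \times \{\mathfrak{r}_{1s}^2\}_-.$ (3.14)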
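The statements made about the exchange operator in this section can be tried out in a finite-dimensional model. The sketch below is an illustration only and rests on assumptions not made in the text: the one-particle space is taken to be $d$-dimensional, operators are plain nested lists, and all function names are hypothetical. It builds the matrix of $P$ from (3.2) in a product basis, confirms $P^2 = 1$, obtains the dimensions of the two eigenspaces from the traces of the projectors $\tfrac{1}{2}(1 \pm P)$, and checks that the operator $\tfrac{1}{2}(F_1 \times 1 + 1 \times F_1)$ of (3.5) and (3.7) is symmetric in the sense of (3.3).

```python
from fractions import Fraction


def identity(size):
    """Unit matrix of the given size."""
    return [[int(i == j) for j in range(size)] for i in range(size)]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def kron(a, b):
    """Product operator a x b; the basis vector (i, k) has index i*len(b) + k."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def exchange_matrix(d):
    """Matrix of P in (3.2): the basis vector (nu, mu) is sent to (mu, nu)."""
    size = d * d
    mat = [[0] * size for _ in range(size)]
    for nu in range(d):
        for mu in range(d):
            mat[mu * d + nu][nu * d + mu] = 1
    return mat


def subspace_dimensions(d):
    """Dimensions of the eigenspaces of P for eigenvalues +1 and -1.

    The trace of the projector (1 +/- P)/2 equals (d*d +/- tr P)/2.
    """
    p = exchange_matrix(d)
    trace_p = sum(p[i][i] for i in range(d * d))
    return (d * d + trace_p) // 2, (d * d - trace_p) // 2


def symmetrized_effect(f):
    """The operator (F x 1 + 1 x F)/2 of (3.5) and (3.7)."""
    d = len(f)
    left, right = kron(f, identity(d)), kron(identity(d), f)
    half = Fraction(1, 2)
    return [[half * (left[i][j] + right[i][j]) for j in range(d * d)]
            for i in range(d * d)]


p2 = exchange_matrix(2)
effect = symmetrized_effect([[1, 0], [0, 0]])  # F = projector onto the first basis vector
is_symmetric = matmul(p2, effect) == matmul(effect, p2)
```

For $d = 2$, the dimension of the spin space of a spin-$\tfrac{1}{2}$ system, `subspace_dimensions(2)` gives three symmetric and one antisymmetric vector, which are the familiar triplet and singlet spin states.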
4 Composite Systems Consisting of Electrons and Atomic Nuclei

Now that we have introduced the characteristic structure of the concept of composite systems for the case of two systems, we shall apply this structure without additional discussion to the case of larger numbers of subsystems. For applications of quantum mechanics to atoms and molecules we need only consider the case in which the elementary subsystems are electrons and atomic nuclei.

We shall present the construction of the structure of a “composite system” in the form of a prescription, since this form is the most transparent.

First Step: We consider the different elementary system types from which the system under consideration is composed. Let the system types be denoted by (1), (2), ..., (k). For each of these let $n_\nu$ denote the number of systems of type $(\nu)$ which appear in the composite system.

Second Step: For each system type $(\nu)$ we then construct, with the aid of the Hilbert space $\mathcal{H}_\nu$, the Hilbert space $\mathcal{H}_{n_\nu} = \{\mathcal{H}_\nu^{n_\nu}\}_+$ or $\mathcal{H}_{n_\nu} = \{\mathcal{H}_\nu^{n_\nu}\}_-$, depending on whether the spin of the system type $(\nu)$ is an integer or a half-integer.

Third Step: We then construct the Hilbert space of the composite system as follows:

$\mathcal{H} = \mathcal{H}_{n_1} \times \mathcal{H}_{n_2} \times \cdots \times \mathcal{H}_{n_k}.$ (4.1)

For the product representation (4.1) the corresponding maps (at time $t = 0$) of $L(\mathcal{H}_\nu)$ into $L(\mathcal{H})$ are given by

$T_\nu F_\nu = 1 \times 1 \times \cdots \times \bar{F} \times \cdots \times 1,$

where $\bar{F} \in L(\mathcal{H}_{n_\nu})$ is in the $\nu$th position. With respect to $\mathcal{H}_{n_\nu}$, the symmetric operator $\bar{F}$ in $\mathcal{H}_\nu^{n_\nu} = \mathcal{H}_\nu \times \mathcal{H}_\nu \times \cdots \times \mathcal{H}_\nu$ is defined by

$\bar{F} = \frac{1}{n_\nu}\,[F_\nu \times 1 \times \cdots \times 1 + 1 \times F_\nu \times \cdots \times 1 + \cdots + 1 \times 1 \times \cdots \times F_\nu].$

It is only necessary to make the second step mathematically more precise. In order not to have to keep track of multiple indices we shall now consider one elementary system type with Hilbert space $\mathcal{H}$ and the $n$-fold product space $\mathcal{H}^n$.
This will be correctly defined in the following; here we shall write the $n$-fold product space of $\mathcal{H}$ with itself as follows: We define $n$ Hilbert spaces $\mathcal{H}^{(1)}, \mathcal{H}^{(2)}, \ldots, \mathcal{H}^{(n)}$ and suppose that for each $\mathcal{H}^{(\alpha)}$ we are given an isomorphic map $V_\alpha$ of $\mathcal{H}$ onto $\mathcal{H}^{(\alpha)}$. Here we shall consider the pair of effects $F \in L(\mathcal{H})$ and $V_\alpha F V_\alpha^{-1} \in L(\mathcal{H}^{(\alpha)})$ to be identical (by definition), that is, on the basis of the mapping principles they are interpreted as being identical. Such isomorphisms are, for the most part, defined by selecting a complete orthonormal basis $\varphi_\nu$ in $\mathcal{H}$ and considering the images $V_\alpha \varphi_\nu$ in $\mathcal{H}^{(\alpha)}$. The $V_\alpha \varphi_\nu$ are usually denoted by $\varphi_\nu(\alpha)$. Such isomorphisms will be used in XII-XV.

$\mathcal{H}^n$ is then defined by $\mathcal{H}^{(1)} \times \mathcal{H}^{(2)} \times \cdots \times \mathcal{H}^{(n)}$ together with the specification of the isomorphisms $V_\alpha$. In such a product $\mathcal{H}^n$ we may define linear
permutation operators which also play an important role in practical applications (XII-XV). Using the $\varphi_\nu(\alpha)$ defined above,

$\Psi_{\nu_1 \nu_2 \cdots \nu_n} = \varphi_{\nu_1}(1)\,\varphi_{\nu_2}(2) \cdots \varphi_{\nu_n}(n)$ (4.2)

defines a complete orthonormal basis in $\mathcal{H}^n$. If $P$ is a permutation among a set of $n$ items (as defined in AV, §9), we define an operator in $\mathcal{H}^n$ by the equation

$P\,\Psi_{\nu_1 \nu_2 \cdots \nu_n} = \Psi_{\mu_1 \mu_2 \cdots \mu_n},$ (4.3)

where the $\mu_1, \mu_2, \ldots, \mu_n$ are obtained by the permutation of the “$n$ indices” $\nu_1, \nu_2, \ldots, \nu_n$ (see AV, §9). The operator $P$ is (according to (4.3)) a unitary operator, since it transforms a complete orthonormal basis into another (in this case it only changes the order of the basis vectors).

The definition (4.3) of the operator $P$ does not depend on the particular choice of the basis $\varphi_\nu$ but only on the product representation. The independence of the definition (4.3) from the choice of a basis follows directly from the fact that each vector of the form $\chi_1(1)\chi_2(2) \cdots \chi_n(n)$ is transformed by $P$ into a vector $\chi_{\alpha_1}(1)\chi_{\alpha_2}(2) \cdots \chi_{\alpha_n}(n)$, where $P$ permutes the $n$ symbols $1, 2, \ldots, n$ into $\alpha_1, \alpha_2, \ldots, \alpha_n$. This can be proved very simply as follows: For a vector

$X = \sum_{\nu_1 \cdots \nu_n} \Psi_{\nu_1 \cdots \nu_n}\, a_{\nu_1 \cdots \nu_n}$

we obtain

$PX = \sum_{\nu_1 \cdots \nu_n} \Psi_{\mu_1 \cdots \mu_n}\, a_{\nu_1 \cdots \nu_n}.$

If $\eta_\rho$ is another complete orthonormal basis in $\mathcal{H}$ for which

$\eta_\rho = \sum_\nu \varphi_\nu\, b_{\nu\rho},$

then we have

$\eta_{\rho_1}(1) \cdots \eta_{\rho_n}(n) = \sum_{\nu_1 \cdots \nu_n} \Psi_{\nu_1 \cdots \nu_n}\, b_{\nu_1\rho_1} b_{\nu_2\rho_2} \cdots b_{\nu_n\rho_n}$

and therefore obtain

$P\,\eta_{\rho_1}(1) \cdots \eta_{\rho_n}(n) = \sum_{\nu_1 \cdots \nu_n} \Psi_{\mu_1 \cdots \mu_n}\, b_{\nu_1\rho_1} \cdots b_{\nu_n\rho_n},$

where the $\mu_1, \ldots, \mu_n$ are obtained by applying $P$ to the $\nu_1, \ldots, \nu_n$. If we apply the permutation $P$ to the factors $b_{\nu_1\rho_1}, \ldots, b_{\nu_n\rho_n}$ we do not change the value of the product; they then appear in the sequence $b_{\mu_1\sigma_1}, \ldots, b_{\mu_n\sigma_n}$, where the $\sigma_1, \ldots, \sigma_n$ are obtained from the $\rho_1, \ldots, \rho_n$ by means of the permutation $P$, so that we obtain
$P\,\eta_{\rho_1}(1) \cdots \eta_{\rho_n}(n) = \sum_{\mu_1 \cdots \mu_n} \Psi_{\mu_1 \cdots \mu_n}\, b_{\mu_1\sigma_1} \cdots b_{\mu_n\sigma_n} = \eta_{\sigma_1}(1) \cdots \eta_{\sigma_n}(n).$

The operators $P$ form a unitary representation of the symmetric group $S_n$ (see AV, §9). In this case the general form of an operator $A$ which commutes with all the operators $P$ may easily be determined according to AIV, §14 in the following way:

To each element $e_\alpha^{(\nu)}$ of the group ring (AV, §7, §9, and §10.6) there corresponds an operator $E_\alpha^{(\nu)}$ in $\mathcal{H}^n$. The operators $E_\alpha^{(\nu)}$ and $E^{(\nu)} = \sum_\alpha E_\alpha^{(\nu)}$ are projection operators. Since $e = \sum_{\nu,\alpha} e_\alpha^{(\nu)}$ we obtain $\sum_{\nu,\alpha} E_\alpha^{(\nu)} = \sum_\nu E^{(\nu)} = 1$. By this partition of unity the Hilbert space $\mathcal{H}^n$ decomposes into subspaces $\mathfrak{h}^{(\nu)} = E^{(\nu)}(\mathcal{H}^n)$, and the $\mathfrak{h}^{(\nu)}$, in turn, decompose into subspaces $\mathfrak{h}_\alpha^{(\nu)} = E_\alpha^{(\nu)}(\mathcal{H}^n)$: $\mathcal{H}^n = \sum_\nu \oplus\, \mathfrak{h}^{(\nu)}$ and $\mathfrak{h}^{(\nu)} = \sum_\alpha \oplus\, \mathfrak{h}_\alpha^{(\nu)}$. The Hilbert spaces $\mathfrak{h}^{(\nu)}$ (for fixed $\nu$) are transformed into themselves by the operator $A$; we may represent $\mathfrak{h}^{(\nu)}$ as follows: $\mathfrak{h}^{(\nu)} = \mathfrak{d}^{(\nu)} \times \mathfrak{r}^{(\nu)}$, where the operators $P$ are represented in the finite-dimensional space $\mathfrak{r}^{(\nu)}$ by the $\nu$th irreducible representation. The operators $A$ leave $\mathfrak{h}^{(\nu)}$ invariant and have the form $(A^{(\nu)} \times 1)$ in $\mathfrak{h}^{(\nu)}$, where $1$ is the unit operator in $\mathfrak{r}^{(\nu)}$ and $A^{(\nu)}$ is an operator in $\mathfrak{d}^{(\nu)}$; the permutation operators have the form $(1 \times P^{(\nu)})$ in $\mathfrak{h}^{(\nu)}$, where $P^{(\nu)}$ denotes the operator of the $\nu$th irreducible representation in $\mathfrak{r}^{(\nu)}$ and $1$ is the unit operator in $\mathfrak{d}^{(\nu)}$. The operators $A^{(\nu)}$ in $\mathfrak{d}^{(\nu)}$ can be completely arbitrary. Thus the general form of all operators which commute with all the $P$ is given by (here $\sum \oplus$ expresses the fact that the operators $A$ and $P$ leave the spaces $\mathfrak{h}^{(\nu)}$ invariant):

$A = \sum_\nu \oplus\, (A^{(\nu)} \times 1), \quad P = \sum_\nu \oplus\, (1 \times P^{(\nu)}).$ (4.4)

In addition we find that unbounded operators which commute with all the $P$ also have the form $A = \sum_\nu \oplus\, (A^{(\nu)} \times 1)$, since the domain of definition of $A$ is transformed into itself by the permutation operators. Therefore we may construct the domain of definition of $A^{(\nu)}$ by applying $E_\alpha^{(\nu)}$ to the domain of definition of $A$.
The problem of the spectral decomposition of $A$ is also solved if we know the spectral decomposition of the finitely many $A^{(\nu)}$. Therefore to each irreducible representation $(\nu)$ of the permutation group there corresponds a spectrum of $A$ (provided that none of the $\mathfrak{h}^{(\nu)}$ is the zero space, a situation which can occur for a finite-dimensional Hilbert space $\mathcal{H}$; see the presentation for the spin space and the Hilbert space in XIV, §7).

We shall now consider the possibility that one of the spaces $\mathfrak{d}^{(\nu)}$ has been chosen as the Hilbert space $\mathcal{H}_n$ for $n$ identical systems. From experience we find that there are only two possible cases: either $P^{(\nu)}$ belongs to the symmetric representation (that is, the identity representation) or it belongs to the antisymmetric representation of the permutation group. Since these representations are one-dimensional we may identify $\mathfrak{d}^{(\nu)}$ with $\mathfrak{h}^{(\nu)}$, that is, $\mathfrak{h}^{(\nu)} = \{\mathcal{H}^n\}_+$ or $\mathfrak{h}^{(\nu)} = \{\mathcal{H}^n\}_-$.

We shall now impose the following postulate: For elementary systems with integer values of spin we choose $\mathcal{H}_n = \{\mathcal{H}^n\}_+$; for elementary systems with half-integer values of spin we choose $\mathcal{H}_n = \{\mathcal{H}^n\}_-$.
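For the antisymmetric representation the operator $E^{(\nu)}$ has the form (see AV, §9):

$\frac{1}{n!} \sum_P (-1)^P P,$ (4.5)

where $(-1)^P$ is $+1$ or $-1$ depending on whether $P$ is an even or an odd permutation. For the $\Psi_{\nu_1 \cdots \nu_n}$ defined according to (4.2) we find that the

$\Gamma_{\nu_1 \cdots \nu_n} = \frac{1}{n!} \sum_P (-1)^P P\,\Psi_{\nu_1 \cdots \nu_n}$ (4.6)

span the space $\{\mathcal{H}^n\}_-$. The $\Gamma_{\nu_1 \cdots \nu_n}$ are equal to zero if two of the indices are equal. The $\Gamma_{\nu_1 \cdots \nu_n}$ are (except for sign) identical to the $\Gamma_{\mu_1 \cdots \mu_n}$ if the $\mu_1, \ldots, \mu_n$ are obtained from the $\nu_1, \ldots, \nu_n$ by a permutation. For the inner product of two $\Gamma_{\nu_1 \cdots \nu_n}$ we obtain (here $R = P^{-1}Q$):

$\langle \Gamma_{\nu_1 \cdots \nu_n}, \Gamma_{\mu_1 \cdots \mu_n} \rangle = \frac{1}{(n!)^2} \sum_{P,Q} (-1)^P (-1)^Q \langle P\,\varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n),\, Q\,\varphi_{\mu_1}(1) \cdots \varphi_{\mu_n}(n) \rangle$

$= \frac{1}{(n!)^2} \sum_P \sum_R (-1)^R \langle \varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n),\, R\,\varphi_{\mu_1}(1) \cdots \varphi_{\mu_n}(n) \rangle$

$= \frac{1}{n!} \sum_R (-1)^R \langle \varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n),\, R\,\varphi_{\mu_1}(1) \cdots \varphi_{\mu_n}(n) \rangle.$

By the above remark it suffices to choose one index sequence $\nu_1, \ldots, \nu_n$ from all the index sequences obtained from it by permutation, together with the corresponding $\Gamma_{\nu_1 \cdots \nu_n}$. If $\mu_1, \mu_2, \ldots, \mu_n$ is a “different” index sequence, then there must be a $\nu_i$ which does not occur among $\mu_1, \mu_2, \ldots, \mu_n$; then $\Gamma_{\nu_1 \nu_2 \cdots \nu_n}$ is orthogonal to $\Gamma_{\mu_1 \mu_2 \cdots \mu_n}$. If both index sequences are identical (that is, $\nu_i = \mu_i$), then (since no two of the indices $\nu_i$ are equal and therefore only the summand with $R = 1$ yields a nonzero term) we obtain

$\| \Gamma_{\nu_1 \cdots \nu_n} \|^2 = \frac{1}{n!}.$ (4.7)

In this way the so-called Slater determinant

$\sqrt{n!}\;\Gamma_{\nu_1 \cdots \nu_n} = \frac{1}{\sqrt{n!}} \sum_P (-1)^P P\,\Psi_{\nu_1 \cdots \nu_n} = \frac{1}{\sqrt{n!}} \begin{vmatrix} \varphi_{\nu_1}(1) & \varphi_{\nu_1}(2) & \cdots & \varphi_{\nu_1}(n) \\ \varphi_{\nu_2}(1) & \varphi_{\nu_2}(2) & \cdots & \varphi_{\nu_2}(n) \\ \vdots & \vdots & & \vdots \\ \varphi_{\nu_n}(1) & \varphi_{\nu_n}(2) & \cdots & \varphi_{\nu_n}(n) \end{vmatrix}$ (4.8)

defines a complete orthonormal basis in $\{\mathcal{H}^n\}_-$, where for each such determinant only one index sequence is chosen from the index sequences obtained by permutation.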
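The properties of the vectors in (4.6) and (4.7) can be verified by direct enumeration for small $n$. The following sketch is an illustration under stated assumptions: a vector of $\mathcal{H}^n$ is stored as a dictionary from index sequences to real coefficients with respect to the product basis (4.2), and the function names are hypothetical. The convention by which a permutation acts on an index sequence does not matter here, because (4.6) sums over all permutations.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def parity(perm):
    """Sign (-1)^P of a permutation given as a tuple of 0..n-1."""
    sign = 1
    seen = [False] * len(perm)
    for start in range(len(perm)):
        cycle_length = 0
        position = start
        while not seen[position]:
            seen[position] = True
            position = perm[position]
            cycle_length += 1
        if cycle_length > 0 and cycle_length % 2 == 0:
            sign = -sign  # a cycle of even length is an odd permutation
    return sign


def antisymmetrize(indices):
    """Coefficients of the vector (4.6) for the index sequence `indices`."""
    n = len(indices)
    vector = {}
    for perm in permutations(range(n)):
        key = tuple(indices[perm[k]] for k in range(n))
        vector[key] = vector.get(key, 0) + Fraction(parity(perm), factorial(n))
    return {key: value for key, value in vector.items() if value != 0}


def inner(u, v):
    """Inner product of two vectors with real coefficients in the basis (4.2)."""
    return sum(value * v.get(key, 0) for key, value in u.items())


gamma = antisymmetrize((0, 1, 2))
norm_squared = inner(gamma, gamma)          # 1/3! according to (4.7)
vanishing = antisymmetrize((0, 0, 1))       # two equal indices
swapped = antisymmetrize((1, 0, 2))         # differs from gamma only in sign
```

Multiplying `gamma` by $\sqrt{n!}$ gives the normalized vector of (4.8); the dictionary then holds the $n!$ signed terms of the expanded determinant.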
For the identity representation the operator $E^{(\nu)}$ has the simple form (AV, §9):

$\frac{1}{n!} \sum_P P.$ (4.9)

We obtain a basis for $\{\mathcal{H}^n\}_+$ from the $\Psi_{\nu_1 \cdots \nu_n}$ as follows:

$\Omega_{\nu_1 \cdots \nu_n} = \frac{1}{n!} \sum_P P\,\Psi_{\nu_1 \cdots \nu_n}.$ (4.10)

The $\Omega_{\nu_1 \cdots \nu_n}$ are never equal to zero; however, we obtain the same element when we permute the index sequence. Therefore, for the $\Omega_{\nu_1 \cdots \nu_n}$ we must select only one index sequence from all the possible index sequences obtained by permutation, that is, according to (4.11) we do not pay attention to the order of the indices $\nu_1, \ldots, \nu_n$ and only consider index sequences to be different if different indices are present. It is easy to see that $\Omega_{\nu_1 \cdots \nu_n}$ which differ in their index sequences are orthogonal. In order to obtain normalized vectors we compute $\|\Omega_{\nu_1 \cdots \nu_n}\|^2$. In order to do this we must know how many indices are identical. Since we do not consider the order we shall assume

$\nu_1, \nu_2, \ldots, \nu_n = \mu_1, \mu_2, \ldots, \mu_{n_1};\ \rho_1, \rho_2, \ldots, \rho_{n_2};\ \ldots;\ \eta_1, \eta_2, \ldots, \eta_{n_\nu},$ (4.11)

where the $\mu_i$, the $\rho_i$, ..., the $\eta_i$ are in each case identical, and that $n_1 + n_2 + \cdots + n_\nu = n$. Therefore we obtain (here $R = P^{-1}Q$):

$\| \Omega_{\nu_1 \cdots \nu_n} \|^2 = \frac{1}{(n!)^2} \sum_{P,Q} \langle P\,\varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n),\, Q\,\varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n) \rangle$

$= \frac{1}{(n!)^2} \sum_P \sum_R \langle \varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n),\, R\,\varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n) \rangle$

$= \frac{1}{n!} \sum_R \langle \varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n),\, R\,\varphi_{\nu_1}(1) \cdots \varphi_{\nu_n}(n) \rangle.$

The inner products in the last sum vanish unless

$R(\mu_1, \mu_2, \ldots, \eta_{n_\nu}) = (\mu_1, \mu_2, \ldots, \eta_{n_\nu});$

if this is the case then they are equal to 1. In the sum there are as many terms equal to 1 as there are permutations which transform the sequence (4.11) into itself; there are exactly $n_1!\, n_2! \cdots n_\nu!$ such permutations. Therefore we obtain

$\| \Omega_{\nu_1 \cdots \nu_n} \|^2 = \frac{1}{n!}\, n_1!\, n_2! \cdots n_\nu!.$ (4.12)

Therefore, for a complete orthonormal basis for $\{\mathcal{H}^n\}_+$ we obtain:

$\Phi_{\nu_1 \cdots \nu_n} = \frac{1}{\sqrt{n!\, n_1!\, n_2! \cdots n_\nu!}} \sum_P P\,\varphi_{\mu_1}(1) \cdots \varphi_{\eta_{n_\nu}}(n).$ (4.13)
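The counting argument that leads to (4.12) can be reproduced by brute force for short index sequences. The sketch below is illustrative only; it assumes the same dictionary representation of vectors in the product basis (4.2) as one might use for the antisymmetric case, and the function names are hypothetical. It builds the symmetrized vector of (4.10), computes its squared norm, and compares the result with $n_1!\, n_2! \cdots n_\nu!/n!$.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial, prod


def symmetrize(indices):
    """Coefficients of the vector (4.10) for the index sequence `indices`."""
    n = len(indices)
    vector = Counter()
    for perm in permutations(range(n)):
        vector[tuple(indices[k] for k in perm)] += Fraction(1, factorial(n))
    return dict(vector)


def norm_squared(vector):
    """Squared norm of a vector with real coefficients in an orthonormal basis."""
    return sum(value * value for value in vector.values())


def predicted_norm_squared(indices):
    """Right-hand side of (4.12): n_1! n_2! ... / n!, with n_i the multiplicities."""
    multiplicities = Counter(indices).values()
    return Fraction(prod(factorial(m) for m in multiplicities),
                    factorial(len(indices)))


samples = [(0, 1, 2), (0, 0, 1), (0, 0, 0), (2, 2, 5, 5)]
agreement = all(norm_squared(symmetrize(s)) == predicted_norm_squared(s)
                for s in samples)
```

Dividing the symmetrized vector by the square root of `predicted_norm_squared` yields the normalized basis vector of (4.13).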
We will not carry out the separation of the center of mass as we have done in §2 and §3. We shall do this later in our discussion of applications. We shall also not consider the detailed analysis of the representation of the Galileo group since, by analogy with (2.6), the transformations are given as a product of $\sum_\alpha n_\alpha$ factors.

The time translation operator in $\mathcal{H}$ from (4.1) is given by

$U(1, 0, 0, \gamma) = e^{i\gamma H},$ (4.14)

where $H$ must commute with translations, rotations, and permutation operators.

In closing we formulate the following axiom: Every system type is either elementary (an electron or an atomic nucleus) or consists of a composite of a finite number of electrons and atomic nuclei. Here the number $n_1$ of electrons and the numbers $n_2, n_3, \ldots$ of atomic nuclei of identical type uniquely define the type of the composite system.

This axiom is, in part, a description of microsystems and, in part, reflects the boundary of the fundamental domain to which we shall apply the theory.

This axiom places no restriction on the numbers $n_\alpha$ and $\sum_\alpha n_\alpha$. There exist grave doubts, indeed strong evidence from experience, that the formal theory for system types can be g.G.-closed for systems with very large values of $n_\alpha$; that is, it must be extended by a more comprehensive theory (see, for example, [2], XV and [13]). This means that eventually we must call into question the predictions of the theory for large $n_\alpha$ concerning physical possibilities (see [1], §10). We may only obtain reference points for what is possible in nature for small values of $n_\alpha$. The larger the value of $n_\alpha$, the larger the possible discrepancy between the theoretical possibilities and the actual possibilities. We introduce these critical remarks in order to urge the reader to examine critically all the structure inherent in the theory.
5 The Hamiltonian Operator

The representation of the Galileo group is known for the case of composite systems (in the sense of a generalization of (2.6)) if the time translation operator (that is, the Hamiltonian) in (4.14) is known. What remains, therefore, is to give $H$ for every system type.

According to VII (5.2), for elementary systems we have

$H = \frac{1}{2m} P^2.$ (5.1)

What is $H$ for the case of composite systems?

The first answer to this question is disappointing. More precisely, there exists no $H$, that is, the description already presented in §4 can only represent
an approximation of experience, because it is not difficult to give examples of experiments which contradict (4.14). The physicist can rapidly determine where the deviations from (4.14) arise: As soon as the emission of light (that is, of electromagnetic radiation) plays a role, (4.14) can only be an approximation. For elementary systems (5.1) is in agreement with experience.

We may strive to obtain a better description than that given in §4 in two different ways. The first attempt seeks to add the electromagnetic radiation to the microsystems, that is, to the action carriers from the preparation to the registration systems. Then we must give up the picture of a composite system of the type described in §4, since, for example, the composite system “atom” can emit new systems, namely light quanta. This method, known under the designation “Quantum Electrodynamics,” has led to fantastic success. It is, however, not clear today whether it yields a mathematically correct description. For the latter reason we shall not consider this theory here; we shall only consider mathematically correct theories.

Therefore we shall now consider the second attempt, in which we must accept inadequate physical concepts and mathematically unpleasant formulations.

In the second approach (4.14) appears as a “first” approximation which yields good results as long as the radiation of light does not play a significant role. For this first approximation we will give a Hamiltonian operator without giving any additional theoretical reasons. The first ansatz for $H$ was obtained from the correspondence principle, which can be found in every quantum mechanics textbook (see, for example, [2], XI). The most general ansatz is obtained as an approximation from quantum electrodynamics; in (5.8) we shall give such an ansatz for $H$. In the domain of the theory presented here (5.8) will therefore be an axiom.
The emission of radiation will then be described in a second “approximation” step in the following way. We may consider the registration of the light emitted by an atom or a molecule as an effect triggered by the atom or molecule. This conception is correct in terms of the meaning of the formulation of quantum mechanics presented here and also of the meaning of quantum electrodynamics. In XVII we shall see how the effects for a system can be produced “with the help” of other systems. According to quantum electrodynamics we may indeed express an effect produced by a light quantum as an effect which is produced by the atom or molecule (by analogy with the considerations of XVII, §2). In XI, §1 we will outline a route by which we may guess an operator $F \in L$ which describes these effects. The form of this effect operator can be considered to be an axiom in the context of the theory presented here.

In the second approximation step the effect of the emission of a quantum of radiation is treated by introducing a correction to (4.14). According to (4.14) the frequency for an effect $F$ and for its time displacement by $\gamma$ is calculated as follows:

$\operatorname{tr}(WF)$ and $\operatorname{tr}(WF_\gamma) = \operatorname{tr}(W e^{iH\gamma} F e^{-iH\gamma}).$ (5.2)
The required correction of (5.2) may be simply determined by realizing that $W$ is also somewhat dependent on $\gamma$, that is, instead of the right-hand side of (5.2) we write

$\operatorname{tr}(W_\gamma F_\gamma) = \operatorname{tr}(W_\gamma e^{iH\gamma} F e^{-iH\gamma})$ (5.3)

and for $W_\gamma$ we introduce a differential equation of the form

$\frac{dW_\gamma}{d\gamma} = \sum_{n,m} \left[ A_{nm} W_\gamma A_{nm}^+ - \tfrac{1}{2} A_{nm}^+ A_{nm} W_\gamma - \tfrac{1}{2} W_\gamma A_{nm}^+ A_{nm} \right],$ (5.4)

where the $A_{nm}$ are operators in the Hilbert space of the composite system. From (5.4) it follows that

$\frac{d}{d\gamma} \operatorname{tr}(W_\gamma) = 0$ (5.5)

and

$\frac{d}{d\gamma} \langle \varphi, W_\gamma \varphi \rangle = \sum_{n,m} \left[ \langle A_{nm}^+ \varphi, W_\gamma A_{nm}^+ \varphi \rangle - \tfrac{1}{2} \langle \varphi, A_{nm}^+ A_{nm} W_\gamma \varphi \rangle - \tfrac{1}{2} \langle W_\gamma \varphi, A_{nm}^+ A_{nm} \varphi \rangle \right].$ (5.6)

Since for $\gamma = 0$ the expression $\langle \varphi, W_\gamma \varphi \rangle \geq 0$, it could only become negative if there exists a value $\gamma_0$ of $\gamma$ for which $\langle \varphi, W_{\gamma_0} \varphi \rangle = 0$ and $[(d/d\gamma)\langle \varphi, W_\gamma \varphi \rangle]_{\gamma_0} < 0$. From $\langle \varphi, W_{\gamma_0} \varphi \rangle = 0$ it follows that $W_{\gamma_0} \varphi = 0$, so that the last two terms in (5.6) vanish and therefore $[(d/d\gamma)\langle \varphi, W_\gamma \varphi \rangle]_{\gamma_0} \geq 0$. The form (5.4) therefore guarantees that a mixture morphism $S_\gamma: K \to K$ (which depends on $\gamma$) is defined by $W_0 \to W_\gamma$; it need not, however, be a mixture automorphism! We may therefore rewrite (5.3) as follows:

$\operatorname{tr}((S_\gamma W)\, e^{iH\gamma} F e^{-iH\gamma}) = \operatorname{tr}(W\,[S_\gamma'(e^{iH\gamma} F e^{-iH\gamma})]).$ (5.7)

For the case of radiation emission we shall first give an explicit equation for (5.4) in XI (1.18).

There is a great similarity between (5.3) and (5.4) and the interaction picture given in X, §3, except that the quantity $S_\gamma$ described above is not an automorphism.

Now that we have presented an overview of the second method, which we shall use in X-XVI of this book, we shall make a few remarks concerning approximations. The Hamiltonian operator given in (5.8) is so complex that it is essential that simplified approximations be used for practical applications of the theory. Physicists are so accustomed to this situation that they have no difficulty in accepting it. Often mathematicians will not accept an approximation until they have obtained formulas for the estimation of errors. Physicists are seldom concerned with such estimates because they can compare the approximate theory directly with experiment.
Since no physical theory is an exact picture of reality, every physicist must learn from experience how good a theory is. A conceptually detailed discussion of the methods of approximation can be found in [1], §8.
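The two consequences drawn from the differential equation (5.4), conservation of the trace as in (5.5) and preservation of positivity, can be observed numerically in the smallest nontrivial case. The sketch below is a hypothetical illustration and is not the explicit equation of XI (1.18): it assumes a two-level system with basis index 0 for the lower and 1 for the upper level, a single operator $A$ that maps the upper level to the lower one, and a simple Euler integration; all names and numerical values are assumptions.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def adjoint(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    size = len(a)
    return [[a[j][i].conjugate() for j in range(size)] for i in range(size)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def lindblad_rhs(w, operators):
    """Right-hand side of (5.4): sum of A W A+ - (A+ A W)/2 - (W A+ A)/2."""
    size = len(w)
    rhs = [[0.0] * size for _ in range(size)]
    for a in operators:
        a_dag = adjoint(a)
        gain = matmul(matmul(a, w), a_dag)
        a_dag_a = matmul(a_dag, a)
        loss_left = matmul(a_dag_a, w)
        loss_right = matmul(w, a_dag_a)
        for i in range(size):
            for j in range(size):
                rhs[i][j] += gain[i][j] - 0.5 * loss_left[i][j] - 0.5 * loss_right[i][j]
    return rhs


def euler_evolve(w, operators, step, count):
    """Crude Euler integration of (5.4); adequate only for small steps."""
    for _ in range(count):
        rhs = lindblad_rhs(w, operators)
        w = [[w[i][j] + step * rhs[i][j] for j in range(len(w))]
             for i in range(len(w))]
    return w


lowering = [[0.0, 1.0], [0.0, 0.0]]        # assumed A: upper level -> lower level
w_initial = [[0.25, 0.25], [0.25, 0.75]]   # assumed initial W with trace 1
derivative = lindblad_rhs(w_initial, [lowering])
w_later = euler_evolve(w_initial, [lowering], 0.01, 200)
```

In this example the trace of `derivative` vanishes exactly, as (5.5) requires, while the population of the upper level decays and that of the lower level grows. The map $W_0 \to W_\gamma$ is therefore not invertible on the set of ensembles, which illustrates why $S_\gamma$ need not be a mixture automorphism.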
The Hamiltonian operator which is well suited for systems consisting of $n_1$ electrons and $n_2$ atomic nuclei of the same type with charge $Z$ and mass $M$ is given by (here Latin indices are summation indices for the electrons, Greek indices are summation indices for the nuclei)

$H = H_0 + H_1 + \cdots + H_6,$

where the individual terms are given by

$H_0 = \frac{1}{2m} \sum_i \bigl(\mathbf{P}_i - e\mathbf{A}(\mathbf{r}_i, t)\bigr)^2 + \frac{1}{2M} \sum_\nu \bigl(\mathbf{P}_\nu - Z|e|\mathbf{A}(\mathbf{r}_\nu, t)\bigr)^2 + \sum_{i<k} \frac{e^2}{r_{ik}} + \sum_{\nu<\mu} \frac{Z^2 e^2}{r_{\nu\mu}} - \sum_{i,\nu} \frac{Z e^2}{r_{i\nu}} + \sum_i e\varphi(\mathbf{r}_i, t) + \sum_\nu Z|e|\varphi(\mathbf{r}_\nu, t);$

[$H_1, \ldots, H_6$: the explicit expressions are not recoverable from the source. The legible fragments show sums over electron pairs $i<k$ with factors $1/r_{ik}$, terms of the form $(\mathbf{r}_{ik} \times \mathbf{P}_k)\cdot\mathbf{S}_i$ and $(\mathbf{r}_{i\nu} \times \mathbf{P}_i)\cdot\mathbf{S}_i$, and a term containing $\delta(\mathbf{r}_{i\nu})$.] (5.8)

Here we have also taken into account “external fields,” which are described by means of a given vector potential $\mathbf{A}(\mathbf{r}, t)$ and a scalar potential $\varphi(\mathbf{r}, t)$. We will treat the problem of external fields systematically in §6; we have included them in (5.8) for completeness. In (5.8) we have denoted the position operators by $\mathbf{r}_i$ or $\mathbf{r}_\nu$, and $\mathbf{r}_{ik} = \mathbf{r}_i - \mathbf{r}_k$, $r_{ik} = |\mathbf{r}_{ik}|$, etc.; the spin operators for the $i$th electron are denoted by $\mathbf{S}_i$. $\mathbf{B} = \operatorname{curl} \mathbf{A}$ is the external magnetic field.

The form of (5.8) can hardly be obtained from the correspondence principle. It follows directly from quantum electrodynamics (which unfortunately does not have a correct mathematical representation).

In (5.8) we have only considered a single type of atomic nucleus. It is trivial to extend (5.8) to the case of several types of nuclei with different charges $Z$.

In addition, it is important to note that, mathematically, $H$ as defined by (5.8) is not well defined. Since $H$ is an unbounded operator it is necessary to
specify its domain of definition. We shall not do so here. We only note that it must be a subset of the well-defined domain of definition of the kinetic energy

$\frac{1}{2m} \sum_i P_i^2 + \frac{1}{2M} \sum_\nu P_\nu^2.$

Difficulties in the specification of the domain of definition of $H$ arise from singularities of the “position functions” in (5.8). In order to find a physically meaningful definition of $H$, consider the point charge of an electron or nucleus as the limit of a charged sphere of fixed charge as the radius of the sphere tends to 0. This picture is helpful in overcoming some of the difficulties associated with the possible multiple meanings of (5.8), especially with respect to the $\delta$-function $\delta(\mathbf{r}_{i\nu})$.

In this book we shall not deal with the mathematically complicated problem of the determination of the domain of definition of the operator $H$. In the problems presented in XII-XIV we shall occasionally make reference to this problem.

According to VII, §7 the parity operator $r$ can be represented in the form

$U(r) \times U(r) \times \cdots \times U(r),$ (5.9)

where $U(r)$ is given by VII (7.4). In this way we obtain a representation of the complete Galileo group, that is, of the Galileo group extended by the space reflection $r$, since the Hamiltonian operator (5.8) (excluding the case of external fields) commutes with the transformations (5.9). This commutativity plays an important role in the investigation of atomic spectra presented in XI-XIV.

6 Microsystems in External Fields

A frequently used and, for many experiments, very important approximation is that of external fields. We shall briefly discuss this method and its application.

In many experiments the composite system consists of a microsystem (electron, atom, molecule) and a much larger macroscopic system. If a microsystem enters the structure of a macrosystem, such as a particle in a counter, then we cannot use the approximation of the external field mentioned above.
If, however, the microsystem remains away from the atomic structure of the macrosystem, then the interaction can be described to good approximation with the help of the external field. We also require that the microsystem produce no changes in the macrosystem that would alter the external field which the macrosystem generates.

From this first, still somewhat provisional, limitation of the application domain of the method of the external field, we find that we leave the domain of application of quantum mechanics with its different discrete (!) system types, since we have incorporated macrosystems which can be described objectively in terms of external fields, as we have outlined in connection with axiom AV 4 in III, §3. Nevertheless it is possible to retain this approximation
in the “domain” of quantum mechanics (see [1], §12.3), precisely because in this approximation the macrosystems appear only in the form of given fields and are not influenced by the microsystems themselves.

The influence of microsystems on macrosystems is, after all, decisively important for the “measurement process,” that is, for a more precise physical analysis of the registration of microsystems. We have only superficially described these registration processes in terms of the structure $\mathcal{R}_0$, $\mathcal{R}$ introduced in II, §4.2 (a precise description is given in [13]).

As external fields we consider only electromagnetic fields which may be described by a vector potential $A(r, t)$ and a scalar potential $\varphi(r, t)$. Here $r$, $t$ are the technically specified laboratory space-time coordinates. Here $(A, \varphi)$ are considered to be fields which are specified in terms of the experimental arrangement. The electromagnetic field is obtained from $(A, \varphi)$ in the usual way:

$E = -\frac{\partial A}{\partial t} - \operatorname{grad} \varphi, \quad B = \operatorname{curl} A.$ (6.1)

In terms of quantum mechanics we now have a “composite” system consisting of a microsystem of type (1) and a field $(A, \varphi)$. We may characterize such a system by the “index” (1) $(A, \varphi)$. Since the type (1) system will be held fixed in the following considerations, it suffices to characterize a system type of composite systems by using $(A, \varphi)$ as an “index.”

The “discrete” index for system types is then replaced by the “continuous” index $(A, \varphi)$. Two different function pairs $(A(r, t), \varphi(r, t))$ will therefore be considered to be different indices. Corresponding to this, an effect procedure $(b_0, b)$ does not determine a single effect $F \in L(\mathcal{H})$, where $\mathcal{H}$ is the Hilbert space of the microsystems (as a subsystem of the composite system consisting of the microsystem and the field $(A, \varphi)$), but an operator which depends on $A$, $\varphi$: $\psi(b_0, b) = F(A, \varphi)$, where $(A, \varphi)$ occurs instead of the discrete index in the formula

$\psi(b_0, b) = (F_1, F_2, \ldots, F_i, \ldots).$

$F(A, \varphi)$ is, as an operator in $\mathcal{H}$, a function of the field $(A, \varphi)$.

The fact that an effect procedure $(b_0, b)$ generally determines an effect operator $F(A, \varphi) \in L(\mathcal{H})$ which depends upon a field $(A, \varphi)$ is usually neglected in quantum mechanics. Such neglect can lead to difficulties in the interpretation of the Galileo group.

If we wish to be more general we must introduce a $\sigma$-ring $\Sigma$ of measurable subsets in the space of the fields $(A, \varphi)$; to each individual preparation procedure there will then correspond, as an ensemble, an entire $\sigma$-additive measure $\Sigma \to K(\mathcal{H})$. For such a measure $W$ the probability will be computed by the formula

$\int \operatorname{tr}\bigl(dW(A, \varphi)\, F(A, \varphi)\bigr),$ (6.2)
where the integration over $(A, \varphi)$ corresponds to the summation over the different system types in the formula $\operatorname{tr}\bigl(\sum_i W_i F_i\bigr)$ for the probabilities.

Here, however, we shall return to the usual description, where we assume that the field $(A, \varphi)$ is, for all practical purposes, uniquely specified by the experiment, that is, was uniquely prepared, so that the dispersion of the field can be neglected. Then the measure $W$ is nonzero only in the neighborhood of a single point in the space of $(A, \varphi)$. Instead of (6.2),

$\operatorname{tr}(W F(A, \varphi))$ (6.3)

is the probability that the effect $F(A, \varphi)$ will occur for the “specified field” $(A, \varphi)$. For the same preparation procedure $a$, for which $\varphi(a) = W$, the probability for the effect procedure $(b_0, b)$, for which $\psi(b_0, b) = F(A, \varphi)$, will depend on the field $(A, \varphi)$. The condition that $\varphi(a) = W$ is independent of the field is correct only if we prepare the microsystems in a space region outside the field.

After these introductory remarks we will take (6.3), $W = \varphi(a)$ and $F(A, \varphi) = \psi(b_0, b)$ as the starting point for the description of the microsystems under consideration in an external field. In this way we obtain a description which can be carried out using the Hilbert space $\mathcal{H}$ of the microsystems.

The essential structure which must be added to (6.3) consists of the specification of the Galileo group as the transformation group of the registration methods relative to the preparation procedures, and therefore also relative to the external field $(A, \varphi)$, as was physically interpreted in VII, §1.
Here we observe that this representation cannot be the representation of the Galileo group in the Hilbert space for system (1) (that is, without external fields), because the effects $F(A, \varphi)$ span a different Banach space than $\mathcal{B}(\mathcal{H})$ does: the set of $F(A, \varphi)$ is a set of maps from the space of the fields $(A, \varphi)$ into $\mathcal{B}(\mathcal{H})$! Here we shall not proceed any further towards a more precise determination of this Banach space of maps; we would have to introduce a uniform structure into the space of $(A, \varphi)$ and require that the maps be uniformly continuous.

A representation of the Galileo group in the space of the functions $F(A, \varphi)$ is not uniquely determined, even less so than in the space $\mathcal{B}(\mathcal{H})$ of composite microsystems. For a certain range of applications the following method leads to the determination of the representation of the Galileo group, that is, leads to a useful theory. The method is copied from that presented in §2.

To each time $t$ there exists a map $T_t^{(1)}$ of $L(\mathcal{H})$ into the set of effects $F(A, \varphi)$ for which

$$T_t^{(1)} F = F_t, \qquad T_{t+\gamma}^{(1)} F = F_{t+\gamma} = V(1, 0, 0, \gamma) F_t, \tag{6.4}$$

where $V(D, \vec{\delta}, \vec{\eta}, \gamma)$ represents the Galileo transformation of the $F(A, \varphi)$. According to the second requirement in (6.4) it is sufficient to specify $T_t^{(1)}$ for $t = 0$. The physical interpretation is that $F_t = T_t^{(1)} F$ represents an effect for a registration apparatus which is only in interaction with the microsystems for
a short time at time $t$. The structure introduced by (6.4) is probably physically useful only when it is possible to make such relatively "quick" measurements, where we must later specify what we mean by "quick."

By analogy with §2 we require that for system (2), that is, the field $(A, \varphi)$, we are able to register the field independently of the microsystem at each time $t$. We express this as follows: The special effects

$$F_t(A, \varphi) = \mathbf{1} f_t(A, \varphi), \tag{6.5}$$

where $f_t(A, \varphi)$ are real functions of the field (and its first derivatives) at time $t$, are to be interpreted as effects which depend "only" on the field at time $t$. This does not mean that the signal which corresponds to the effect (6.5) occurs at time $t$, but that the registration apparatus is only influenced by the behavior of the field during a short time span around $t$.

Let us now consider (6.4) and (6.5) for the special case where $t = 0$:

$$F_0 = T_0^{(1)} F, \tag{6.6}$$

$$F_0(A, \varphi) = \mathbf{1} f_0(A, \varphi). \tag{6.7}$$

Here $F_0$ in (6.6) corresponds to the effect $F_1 \times \mathbf{1}$ from §2 and $\mathbf{1} f_0$ corresponds to the effect $\mathbf{1} \times F_2$ from §2. The effects $F_1 \times F_2$ from §2 here correspond to an effect of the form $F_0 f_0(A, \varphi)$.

It is reasonable to transform (2.6) into the following form. Let $U_1(D, \vec{\delta}, \vec{\eta}, 0)$ denote the representative of the Galileo transformation for the microsystem (1) without an external field. How may we introduce the Galileo transformations of the effects $f_0(A, \varphi)$, that is, of the "effects of the pure fields" (without microsystems)? This formulation must be based on the physical interpretation of these effects. If, for example, the registration apparatus characterized by $f_0(A, \varphi)$ is translated by $\vec{\eta}$, then it is apparently subjected to the field

$$A'(\vec{r}, t) = A(\vec{r} + \vec{\eta}, t), \qquad \varphi'(\vec{r}, t) = \varphi(\vec{r} + \vec{\eta}, t), \tag{6.8}$$

that is, it responds as if the field (6.8) were acting on the original apparatus. The translated registration apparatus then corresponds to an $f_0'$ for which

$$f_0'(A, \varphi) = f_0(A', \varphi'), \tag{6.9}$$

where $A'$, $\varphi'$ were defined in (6.8).
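We now generally define a Galileo transformation of the field by

$$k(D, \vec{\delta}, \vec{\eta}, \gamma)(A, \varphi) = (A', \varphi'), \tag{6.10}$$

where

$$k(1, 0, \vec{\eta}, \gamma)(A(\vec{r}, t), \varphi(\vec{r}, t)) = (A(\vec{r} - \vec{\eta}, t - \gamma), \varphi(\vec{r} - \vec{\eta}, t - \gamma)), \tag{6.11}$$

$$k(D, 0, 0, 0)(A(\vec{r}, t), \varphi(\vec{r}, t)) = (D A(D^{-1}\vec{r}, t), \varphi(D^{-1}\vec{r}, t)). \tag{6.12}$$

The $k(1, \vec{\delta}, 0, 0)$ are to be defined in a similar way as an approximation of the proper Lorentz transformations of $A$ and $\varphi$. We will not enter into a discussion of such approximations.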
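The translation part of (6.8), (6.9) and (6.11) can be illustrated by a small numerical sketch. Everything concrete in it is an assumption made for illustration only: one space dimension, a scalar potential `phi` given as a Python function, and a hypothetical "pure field" effect `f0` that responds to the potential at the origin at time zero.

```python
import math

def k_translate(eta, gamma):
    """Map k(1, 0, eta, gamma) acting on a scalar field phi(r, t), cf. (6.11)."""
    def act(field):
        return lambda r, t: field(r - eta, t - gamma)
    return act

def k_translate_inverse(eta, gamma):
    """Inverse map: shifting by (-eta, -gamma) undoes k(1, 0, eta, gamma)."""
    return k_translate(-eta, -gamma)

# Hypothetical scalar potential in one space dimension.
def phi(r, t):
    return math.exp(-(r - 1.0) ** 2) * math.cos(t)

# Hypothetical field effect: a real function of the field at r = 0, t = 0,
# scaled so that its values lie between 0 and 1.
def f0(field):
    return 0.5 * (1.0 + math.tanh(field(0.0, 0.0)))

eta = 1.0
phi_prime = k_translate_inverse(eta, 0.0)(phi)  # phi'(r, t) = phi(r + eta, t), cf. (6.8)
f0_translated = f0(phi_prime)                   # f0'(phi) = f0(phi'), cf. (6.9)
```

In this sketch the translated apparatus responds to the original field at the displaced position `eta`, which is the content of (6.9).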
We now define the transformation of $f_0(A, \varphi)$ which corresponds to the transformation $(D, \vec{\delta}, \vec{\eta}, 0)$ as a generalization of (6.9):

$$f_0'(A, \varphi) = f_0(k^{-1}(D, \vec{\delta}, \vec{\eta}, 0)(A, \varphi)). \tag{6.13}$$

We transform (2.6) for this case as follows:

$$V(D, \vec{\delta}, \vec{\eta}, 0) F_0 f_0(A, \varphi) = U_1(D, \vec{\delta}, \vec{\eta}, 0) F_0 U_1^+(D, \vec{\delta}, \vec{\eta}, 0)\, f_0(k^{-1}(D, \vec{\delta}, \vec{\eta}, 0)(A, \varphi)). \tag{6.14}$$

According to (6.14), $V(D, \vec{\delta}, \vec{\eta}, 0) F_0 f_0$ takes on the same form $F_0' f_0'$ as does $F_0 f_0$: the $T_0^{(1)} F = F_0$ and the $\mathbf{1} f_0$ are separately transformed by the Galileo transformations without time translation.

If we were to require that (6.14) also holds for the case of a time translation, that would mean that the external field exerts no influence on the microsystem. By analogy with the case described in §2 we define

$$V(1, 0, 0, \tau) F_0 f_0(A, \varphi) = U_\tau F_0 U_\tau^+\, f_0(k^{-1}(1, 0, 0, \tau)(A, \varphi)), \tag{6.15}$$

where the unitary family of operators $U_\tau$ ($U_0 = \mathbf{1}$) depends on the field $A$, $\varphi$. For $U_\tau$ we require that the following differential equation be satisfied:

$$\frac{dU_\tau}{d\tau} = i H_\tau U_\tau, \tag{6.16}$$

where $H_\tau$ is a time-dependent Hamiltonian operator.

We will often make use of the special effects $T_0^{(1)} F = F_0$ defined in (6.15), for which we write $V(1, 0, 0, t) F_0 = F_t$. From (6.15) we find that

$$F_t = U_t F_0 U_t^+. \tag{6.17}$$

From (6.16) and (6.17) we obtain the following differential equation for $F_t$:

$$\frac{dF_t}{dt} = i(H_t F_t - F_t H_t). \tag{6.18}$$

According to the above interpretation of $T_t$ and (6.4), $F_t$ is an effect which is triggered only by the microsystem "at time $t$." If we apply the registration method for the effect $F_t$ at a time $\gamma$ later, then we would register the effect $F_{t+\gamma}$.

We may now make the transition from the effects to a scale observable as follows: Let $A_0$ denote the self-adjoint operator which corresponds to the measurement of a scale observable for a microsystem "at time zero." Here "at time zero" means that the interaction for the registration method takes place, for all practical purposes, only at $t = 0$ (in the time scale of the laboratory).
If, instead, we make the measurement "at time $t$" (that is, we displace the time relative to 0), we therefore measure the observable $A_t$ which (according to (6.17)) satisfies the equation

$$A_t = U_t A_0 U_t^+ \tag{6.19}$$
and, according to (6.18), satisfies the differential equation

$$\frac{dA_t}{dt} = i(H_t A_t - A_t H_t). \tag{6.20}$$

Examples of such special observables are the position and momentum observables of microsystems: Whether the system of type (1) under consideration is itself an elementary or a composite system in the sense of §2-§4, in every case the position and momentum observables $Q_i(0)$, $P_i(0)$ for the individual components of system (1) are defined by the representation $U_1(1, \vec{\delta}, \vec{\eta}, 0)$. Here $Q_i(0)$ and $P_i(0)$ are these observables at time $t = 0$. By analogy with (6.19) we obtain

$$P_i(t) = U_t P_i(0) U_t^+, \qquad Q_i(t) = U_t Q_i(0) U_t^+. \tag{6.21}$$

From (6.21) it follows that the components $P_{i\nu}(t)$, $Q_{i\nu}(t)$ of $P_i(t)$, $Q_i(t)$ satisfy the same commutation relations as the $P_{i\nu}(0)$, $Q_{i\nu}(0)$:

$$\begin{aligned}
P_{i\nu}(t) Q_{k\mu}(t) - Q_{k\mu}(t) P_{i\nu}(t) &= \frac{1}{i}\,\delta_{ik}\delta_{\nu\mu}\mathbf{1}, \\
Q_{i\nu}(t) Q_{k\mu}(t) - Q_{k\mu}(t) Q_{i\nu}(t) &= 0, \\
P_{i\nu}(t) P_{k\mu}(t) - P_{k\mu}(t) P_{i\nu}(t) &= 0.
\end{aligned} \tag{6.22}$$

The Heisenberg commutation relations are therefore satisfied "for all time," but not, of course, for $P_{i\nu}(t)$, $Q_{k\mu}(t')$ where $t \neq t'$.

In (6.15) we may consider generalized effects $F_0(A, \varphi)$ which are not only of the form $F_0 f_0(A, \varphi)$. We obtain

$$V(1, 0, 0, \tau) F_0(A, \varphi) = U_\tau F_0(k^{-1}(1, 0, 0, \tau)(A, \varphi)) U_\tau^+. \tag{6.23}$$

Here $F_0(A, \varphi)$ depends only on the fields $A$, $\varphi$ (and their first derivatives) at time $t = 0$; $F_0(k^{-1}(1, 0, 0, \tau)(A, \varphi))$ depends only on the fields at time $t = \tau$. We shall now simplify the notation of (6.23) as follows:

$$F_t(A, \varphi) = V(1, 0, 0, t) F_0(A, \varphi). \tag{6.24}$$

$F_t(A, \varphi)$ is therefore an effect which is measured by a registration method which is only affected by the microsystem and the field at time $t$. The meaning of the left side of (6.24) is expressed by the right side of equation (6.23) as follows: $F_t$ is the time-dependent function $U_t F_0 U_t^+$ which depends on the functions $A$, $\varphi$ (and their first derivatives) at time $t$.

By differentiation, from (6.23) we obtain

$$\frac{d}{dt}\bigl(F_t(A, \varphi)\bigr) = i\,[H_t F_t(A, \varphi) - F_t(A, \varphi) H_t] + U_t \frac{\partial}{\partial t} F_0(k^{-1}(1, 0, 0, t)(A, \varphi))\, U_t^+. \tag{6.25}$$

The second term on the right-hand side of (6.25) is nothing other than the partial derivative of $F_t(A, \varphi)$ with respect to the $t$ which appears in the fields
$A$, $\varphi$. If we write this derivative in the form $(\partial/\partial t) F_t(A, \varphi)$, then (6.25) is transformed into

$$\frac{d}{dt}\bigl(F_t(A, \varphi)\bigr) = i(H_t F_t - F_t H_t) + \frac{\partial F_t}{\partial t}. \tag{6.26}$$

For a corresponding scale observable $B_t(A, \varphi)$ (where the latter is a time-dependent function of the fields and their first derivatives at time $t$) it follows that

$$\frac{dB_t}{dt} = i(H_t B_t - B_t H_t) + \frac{\partial B_t}{\partial t}. \tag{6.27}$$

The frequently used simplified notation (6.27) can often lead to misunderstanding if it is not precisely explained. (6.27) is often referred to as "the quantum mechanical equation of motion" because it appears to be the "most general" form of the Galileo transformation for a simple time translation. Certainly (6.27) is (in a formal sense) the most general form, since we may derive the others from it as special cases (for example, the case $A = 0$, $\varphi = 0$, which corresponds to the case of no external field). We note, however, that physically the applications of (6.27) are limited to the case of the external field approximation. For the general case of measurement (6.27) is, on the contrary, not sufficiently general.

In (6.26) we considered effects which "respond only at time $t$." In fact, realistic effects are somewhat more complicated: they are rather complicated functionals of the entire field $A$, $\varphi$. Since, however, they do not easily permit a formulation in terms of simple equations analogous to (6.26) and (6.27), we shall restrict our consideration to the effects and observables described by (6.26) and (6.27) and leave the problem of the measurement of such special effects to the experimental physicist.

We shall now define the Hamiltonian operator $H_t$ in (6.16) more precisely. In principle we have already given it in (5.8). Here we need only state which operators $P_i$, $Q_i = r_i$, $P_\nu$, $Q_\nu = r_\nu$ in (5.8) are to be used.
If the external fields do not depend on time $t$ (that is, we have constant external fields), we may choose the operators $P_i, \ldots, Q_\nu$ to be the operators defined at $t = 0$, that is, we choose $H_t = H_0$ to be constant with time. We may then choose the $P_i, \ldots, Q_\nu$ at any other time since, for $A_t = H_t$ in (6.20), we obtain $H_t = H(P(t), Q(t)) = H(P(0), Q(0))$. For $U_t$ it then follows that

$$U_t = e^{iHt}. \tag{6.28}$$

If the external fields $A$, $\varphi$ are time dependent, then in (5.8) we must choose in $H_t$ the $P_i, \ldots, Q_\nu$ and the spins $S_i$ as the $P_i(t), \ldots, Q_\nu(t)$, $S_i(t)$. In this way we obtain a representation of the Galileo group for systems in external fields. Here $H_t$ is itself an observable of the form $B_t$ described in (6.26). We therefore find that:

$$\frac{dH_t}{dt} = i(H_t H_t - H_t H_t) + \frac{\partial H_t}{\partial t} = \frac{\partial H_t}{\partial t}. \tag{6.29}$$
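For the case of a time-independent Hamiltonian, relations (6.19), (6.20), (6.22) and (6.28) can be checked numerically in a finite-dimensional toy model. The following is only a sketch under stated assumptions: the $2 \times 2$ matrices `H`, `A0` and `B0` are hypothetical stand-ins for the operators of the text, and the closed form used for $e^{iHt}$ relies on the chosen `H` satisfying $H^2 = \lambda^2 \mathbf{1}$.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(X):
    """Adjoint (conjugate transpose) of a 2x2 matrix."""
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def comm(X, Y):
    """Commutator XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(X, Y):
    """Largest entrywise deviation between two 2x2 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

# Hypothetical time-independent Hamiltonian; it satisfies H*H = LAM**2 * 1.
H = [[1.0, 0.5], [0.5, -1.0]]
LAM = math.sqrt(1.25)

def U(t):
    """U_t = exp(iHt) = cos(LAM t) 1 + i sin(LAM t)/LAM H, cf. (6.28)."""
    c, s = math.cos(LAM * t), math.sin(LAM * t) / LAM
    return [[c * (i == j) + 1j * s * H[i][j] for j in range(2)] for i in range(2)]

def heisenberg(X0, t):
    """X_t = U_t X_0 U_t^+, cf. (6.19)."""
    return matmul(matmul(U(t), X0), dagger(U(t)))

# Hypothetical self-adjoint observables "at time zero".
A0 = [[0.3, 1 - 2j], [1 + 2j, -0.7]]
B0 = [[1.0, 1j], [-1j, 2.0]]

t, h = 0.7, 1e-5
At = heisenberg(A0, t)
plus, minus = heisenberg(A0, t + h), heisenberg(A0, t - h)
# Central finite difference for dA_t/dt ...
dA_numeric = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
# ... compared with i(H A_t - A_t H), the right-hand side of (6.20).
dA_formula = [[1j * x for x in row] for row in comm(H, At)]
```

Within this toy model, `heisenberg(H, t)` reproduces `H` itself (the constant-field case $H_t = H_0$), and the commutator of `heisenberg(A0, t)` with `heisenberg(B0, t)` equals the time-evolved commutator of `A0` with `B0`, which is the mechanism behind the statement that equal-time commutation relations are preserved.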
Here we shall not proceed further into the problem of the domain of definition of $H_t$ and therefore the existence of solutions for the differential equation (6.16). In physics it is generally assumed that a family of unitary transformations $U_t$ exists and that the domain of definition of $H_t$ is the same as the region where $U_t$ is differentiable. These physical ideas do not, of course, lead to a solution of the mathematical problem for a more or less well defined $H_t$ given by (6.16) and (5.8). Here we refer the reader to [11] and [36].

In closing, we shall now state the form of an infinitesimal translation for the special effects introduced in (6.17). With the total momentum $P(t)$ of system (1) it follows that for an infinitesimal $\vec{\eta}$:

$$V(1, 0, \vec{\eta}, 0) F_t = F_t + i\vec{\eta} \cdot [P(t) F_t - F_t P(t)]. \tag{6.30}$$

We have given this formula in order to warn about possible errors in the use of $P(0)$ in (6.30) for all $F_t$. It is of decisive importance that $V(\ldots)$ does not give a representation of the Galileo group by means of transformations in $\mathcal{B}(\mathcal{H})$ but gives a representation by means of transformations in the Banach space of the mappings of the space of the fields into the space $\mathcal{B}(\mathcal{H})$!

The "external fields" provide a useful means by which the mass of charged systems can be measured, especially for the case of elementary systems. For example, for an electron the Hamiltonian operator (5.8) (for small values of the momentum and neglecting the spin term in H5) is given by

$$H = \frac{1}{2m}(P - eA)^2 + e\varphi. \tag{6.31}$$

In XVII, §6.2 we shall briefly outline how we may develop measurement methods by which the quantity $e/m$ (in cm) may be measured by the deflection produced by the field.
If we then succeed in measuring the elementary charge (by measuring $e/M$ for macroscopic bodies and then measuring $M$ with the aid of other forces in units of gm, as in the Millikan experiment and other modern experiments), we then obtain Planck's constant as a conversion factor between the units gm and cm$^{-1}$ (see VII (2.39)). Here we, of course, assume that $m$ is previously known in units of cm$^{-1}$ from atomic processes (from the structure of atomic spectra (according to XI-XIV) or more directly from the motion of electrons).

7 Criticism of the Description of Interaction in Quantum Mechanics and the Problem of the Space $\mathcal{D}$

Since the position and momentum observables for elementary systems could be introduced in the manner described in VII, §4, it is, in principle, possible to determine whether a registration method measures position or momentum for an elementary system on the basis of the interpretation of the preparation
and registration procedures described in II and the interpretation of the Galileo group described in VII, §1. For the position and momentum observables defined in §2 for the individual systems in a composite system the situation is problematical, since the subsystems undergo mutual interaction. The introduction of these observables was only a consequence of the newly introduced structure (2.1) for the Hilbert space of a composite system.

Since every Hilbert space is isomorphic to any other Hilbert space of the same dimension, we find that $K(\mathcal{H}_1)$ is isomorphic to $K(\mathcal{H}_2)$ and $L(\mathcal{H}_1)$ is isomorphic to $L(\mathcal{H}_2)$, including the bilinear form $\operatorname{tr}(wg)$. A mathematical structure of the form (2.1) can be introduced for any Hilbert space. Such a structure has no physical significance as long as it is not connected with an additional physical interpretation. In §2 we have attempted to give such an interpretation by singling out the set of all effects of the form $F_1 \times F_2$ for this purpose. However, this procedure is physically questionable.

Is, however, such a procedure absolutely necessary? If $L(\mathcal{H}_1)$, $L(\mathcal{H}_2)$, and $L(\mathcal{H}_1 \times \mathcal{H}_2)$ are isomorphic, how may we correctly make such a distinction? It makes possible a heuristic procedure for the formulation of the Hamiltonian operator according to (5.8); as we shall see in XI-XIV, the eigenvalue spectrum of this Hamiltonian operator is in excellent agreement with experience. Where else is the structure (2.1) useful? Is this structure really needed as part of the structure associated with the physics? Does the subset of effects $F_1 \times F_2$ have any real physical significance? Can we simply forget about (2.1) once we have determined the spectral family of $H$, that is, of $e^{iHt}$?

Certainly we have the fact that it is possible to register the subsystems once they are no longer in interaction. Scattering theory, described in XVI, is based on this experimental fact.
In this sense we can also say that in an asymptotic sense, for large times (or large distances), the introduction of (2.1) correctly describes the asymptotic structure. In particular, it is not clear whether it is physically meaningful to extend such an asymptotic structure to situations in which the subsystems are in close interaction. Indeed, it may be meaningless in those circumstances to speak of "subsystems" because the concept of a subsystem depends upon (2.1)! In elementary particle physics it is physically meaningless (except as a crude approximation) to speak of subsystems of a system with respect to high energy scattering experiments.

In the "nonrelativistic" quantum mechanics described here there are additional experimental facts which are explainable in terms of the structure (2.1). For example, the angular momentum structure is, according to §1, quite open; however, on the basis of experiments with external fields (see, for example, the Zeeman effect in XIV, §6) it is in agreement with the structure (2.1) (and its extension to the case of more than two subsystems). The intensities of spectral lines, calculated in XIV, §5, are in agreement with the structure given by (2.1). In addition, the structure of molecules (see XV) and therefore (indirectly) the theoretical description of the most important facts in
chemistry are a consequence of the structures which were introduced in §2-§6.

In spite of all these successes there remains a cluster of problems which have already been cited in §2. Is it necessary to impose the condition of "relatively short measurement interaction times" in order to obtain a basis for (2.1)? This does not appear to be the case, but cannot be proven. The considerations in XVII appear to mean that there is no particular time (even approximately) required for a measurement, provided that the Hamiltonian operator does not explicitly depend upon the time. The case in which we have a time-varying external field remains problematical, as we have found in the discussion in §6.

If we exclude the case of external fields, it appears that the structure (2.1) is only needed asymptotically and that the extrapolation of (2.1) is a useful tool for such physical problems as atomic spectra including light emission (XI-XV) and the most complicated scattering problems (not only those described in XVI).

Certainly it is possible that the structure (2.1) or, better, the structure (2.3) indirectly determines a physical structure, namely the space $\mathcal{D}$. By analogy with VII, §8 we may seek to obtain the space $\mathcal{D}$ with the aid of the direct product of two (or more) Galileo groups. Since the spaces $\mathcal{B}'(\mathcal{H}_1)$, $\mathcal{B}'(\mathcal{H}_2)$, and $\mathcal{B}'(\mathcal{H}_1 \times \mathcal{H}_2)$ are isomorphic, the additional structure of the subspaces $\mathcal{D}_1 \subset \mathcal{B}'(\mathcal{H}_1)$, $\mathcal{D}_2 \subset \mathcal{B}'(\mathcal{H}_2)$, and $\mathcal{D} \subset \mathcal{B}'(\mathcal{H}_1 \times \mathcal{H}_2)$ may describe an essential difference between, for example, $\mathcal{D}_1 \subset \mathcal{B}'(\mathcal{H}_1)$ and $\mathcal{D} \subset \mathcal{B}'(\mathcal{H}_1 \times \mathcal{H}_2)$. The convex sets $K(\mathcal{H}_1)$ and $K(\mathcal{H}_1 \times \mathcal{H}_2)$ are certainly isomorphic, that is, there exists a mixture isomorphism $K(\mathcal{H}_1) \to K(\mathcal{H}_1 \times \mathcal{H}_2)$; however, it is possible that an isomorphism does not exist which is $\sigma(K(\mathcal{H}_1), \mathcal{D}_1)$-$\sigma(K(\mathcal{H}_1 \times \mathcal{H}_2), \mathcal{D})$-continuous!
In the case of classical mechanics the analogous fact has been known for a long time: The set $K$ is the set of all measures which are absolutely continuous with respect to Lebesgue measure, that is, which can be described by a measurable density function $\rho(x)$ in the $\Gamma$-space. The set $L$ is the set of all measurable functions $f$ for which $0 \le f \le 1$. For the space $\mathcal{D}$ we may choose the space spanned by the 1-function and the set of all continuous $f$ with compact support. For a single mass point the space $\Gamma_1$ is six-dimensional. For two mass points the composite space $\Gamma$ is twelve-dimensional. If $K_1$, $L_1$ correspond to $\Gamma_1$, then there certainly are mixture isomorphisms $K_1 \to K$, but none which are $\sigma(K_1, \mathcal{D}_1)$-$\sigma(K, \mathcal{D})$-continuous, because the dual map must map the set $\mathcal{D}_1$ onto $\mathcal{D}$, which is impossible because of the different dimensions of $\Gamma_1$ and $\Gamma$: a $\sigma(K_1, \mathcal{D}_1)$-$\sigma(K, \mathcal{D})$-continuous isomorphism must lead to a topologically homeomorphic map of $\Gamma_1$ onto $\Gamma$ (the extremal points of $\overline{K}_1$ may be identified in the $\sigma(\mathcal{D}_1', \mathcal{D}_1)$-topology with the topological space $\Gamma_1$).

It is conceivable that we may be able to solve the problem of composite systems in quantum mechanics by proceeding in the opposite direction, where we supplement the assertions in §1 by introducing axioms about the space $\mathcal{D}$. Perhaps it is possible to carry out the introduction of the structure "composite system" in an improved manner with respect to the sense of an
"axiomatic basis" in this way rather than by the route we have used in §2-§6. Here, by the "sense of an axiomatic basis" we mean that new relations are introduced only when these relations can be physically interpreted on the basis of pre-theories (see [47]).

Until such a clarification is achieved, it must be recognized that the concept of a composite system is (insofar as this concept goes beyond the exposition of §1) unfortunately one of the least well-explained concepts of quantum theory.
APPENDIX I

Summary of Lattice Theory

1 Definition of a Lattice

A set $M$ is said to be a partially ordered set (poset) if there is a relation defined among pairs $(a, b) \in M \times M$ (which we denote by $\le$) which satisfies the following axioms:

(1) $a \le a$ for all $a \in M$,
(2) $a \le b$, $b \le c \Rightarrow a \le c$,
(3) $a \le b$, $b \le a \Rightarrow a = b$.

$M$ is said to be totally ordered if, for each pair $a, b \in M$, either $a \le b$ or $b \le a$.

If $N$ is a subset of an ordered set $M$, then an element $a \in M$ is said to be an upper bound of $N$ if $b \le a$ for all $b \in N$. We say that an upper bound $a \in M$ is the least upper bound for $N$ if $a \le c$ for every upper bound $c$ of $N$. Similarly, a lower bound for $N$ is an element $d \in M$ for which $d \le b$ for all $b \in N$; we say that a lower bound $c \in M$ is the greatest lower bound for $N$ if $d \le c$ for every lower bound $d$ of $N$. If the greatest lower bound or least upper bound of the set $N$ exists, it is uniquely determined by the set $N$.

A lattice is a partially ordered set $M$ for which every finite subset $N$ has a least upper bound and a greatest lower bound. By induction, it is easy to show that $M$ is a lattice if each pair of elements $a$, $b$ has a least upper bound (denoted by $a \vee b$) and a greatest lower bound (denoted by $a \wedge b$). Instead of $a \le b$ we frequently write $b \ge a$. The least upper bound of $N$ is denoted by $\bigvee_{a \in N} a$, the greatest lower bound is denoted by $\bigwedge_{a \in N} a$.
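The definitions above can be made concrete with a small example that is not taken from the text: the divisors of 12, ordered by divisibility. The sketch below checks the three poset axioms by brute force and computes least upper and greatest lower bounds directly from their definitions; the set `M` and the function names are chosen for illustration only.

```python
from math import gcd
from itertools import product

M = [1, 2, 3, 4, 6, 12]  # divisors of 12 (illustrative choice)

def leq(a, b):
    """Partial order: a <= b means that a divides b."""
    return b % a == 0

def lub(N):
    """Least upper bound of N in M, or None if it does not exist."""
    ub = [x for x in M if all(leq(b, x) for b in N)]
    least = [a for a in ub if all(leq(a, c) for c in ub)]
    return least[0] if least else None

def glb(N):
    """Greatest lower bound of N in M, or None if it does not exist."""
    lb = [x for x in M if all(leq(x, b) for b in N)]
    greatest = [c for c in lb if all(leq(d, c) for d in lb)]
    return greatest[0] if greatest else None

# Axioms (1)-(3) of a partially ordered set.
reflexive = all(leq(a, a) for a in M)
transitive = all(leq(a, c) for a, b, c in product(M, repeat=3) if leq(a, b) and leq(b, c))
antisymmetric = all(a == b for a, b in product(M, repeat=2) if leq(a, b) and leq(b, a))

# Lattice property: every pair has a least upper and a greatest lower bound.
is_lattice = all(lub([a, b]) is not None and glb([a, b]) is not None
                 for a, b in product(M, repeat=2))

# 4 and 6 are incomparable, so the order is not total.
totally_ordered = all(leq(a, b) or leq(b, a) for a, b in product(M, repeat=2))
```

In this example the least upper bound of a pair is its least common multiple and the greatest lower bound is its greatest common divisor, for instance `lub([4, 6])` is 12 and `glb([4, 6])` is 2.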
According to the definitions it is an easy matter to show that:

$$\begin{aligned}
& a = a \wedge a = a \vee a, \qquad a \vee b = b \vee a, \qquad a \wedge b = b \wedge a, \\
& (a \wedge b) \wedge c = a \wedge (b \wedge c), \qquad (a \vee b) \vee c = a \vee (b \vee c), \\
& a \vee (a \wedge b) = a, \qquad a \wedge (a \vee b) = a, \\
& a \le b \Leftrightarrow a \vee b = b, \qquad a \le b \Leftrightarrow a \wedge b = a.
\end{aligned}$$

Th. 1.1. Let $M$ be a set, and let two binary operations $\vee$, $\wedge$ be defined on $M$ which satisfy the following conditions:

(1) $a \wedge b = b \wedge a$, $a \vee b = b \vee a$;
(2) $a \vee (b \vee c) = (a \vee b) \vee c$, $a \wedge (b \wedge c) = (a \wedge b) \wedge c$;
(3) $a \vee (a \wedge b) = a$, $a \wedge (a \vee b) = a$;

then the relation $\le$ defined by $a \le b$ whenever $a \wedge b = a$ is a partial order and $M$ is a lattice.

Proof. From the first formula in (3) it follows that $a = a \vee (a \wedge (a \vee b))$, and from the second formula in (3) it follows that $a = a \vee a$; similarly it follows that $a \wedge a = a$. Therefore we obtain $a \le a$. From $a \le b$ and $b \le a$ it directly follows that $a = b$. If $a \le b$ and $b \le c$, that is, if $a \wedge b = a$ and $b \wedge c = b$, then from (2): $a \wedge c = (a \wedge b) \wedge c = a \wedge (b \wedge c) = a \wedge b = a$, that is, $a \le c$. Therefore $\le$ is a partial order.

If $a \le b$, then from $a \wedge b = a$ and from (3) it follows that $a \vee b = (a \wedge b) \vee b = b$. If $a \vee b = b$, then from (3) it follows that $a \wedge b = a \wedge (a \vee b) = a$, that is, $a \le b$. Therefore $a \le b$ is equivalent to $a \vee b = b$.

From (3) it follows directly that $a \wedge b \le a$ and $a \wedge b \le b$. If $c \le a$ and $c \le b$, then it follows that $c \wedge (a \wedge b) = (c \wedge a) \wedge b = c \wedge b = c$, that is, $c \le a \wedge b$. Therefore $a \wedge b$ is the greatest lower bound of $a$, $b$. In a similar way it follows that $a \vee b$ is the least upper bound of $a$, $b$.

A lattice $M$ is said to be complete if each subset $N$ of $M$ has a least upper bound and a greatest lower bound.

Th. 1.2. If for every subset $N$ of a partially ordered set $M$ there exists a greatest lower bound (least upper bound), then there exists a least upper bound (greatest lower bound), that is, $M$ is a complete lattice.

Proof. Suppose for each subset $N$ there exists a least upper bound. Let $a$ denote the least upper bound of the set $R = \{b \mid b \le c \text{ for all } c \in N\}$.
Here we find that $a$ is the greatest lower bound of the set $N$ if we show that $a \in R$: From $c \in N$ it follows that $c \ge b$ for all $b \in R$, and we therefore obtain $c \ge a$. In a similar way we may prove that every subset has a least upper bound.

The hypothesis in Th. 1.2 requires that the empty set has a least upper bound, that is, there exists a minimal element in $M$; corresponding to the existence of the greatest lower bound there exists a maximal element in $M$. We denote the minimal and maximal elements by 0 and 1 (or $e$) respectively.
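The construction used in the proof of Th. 1.2 can be followed step by step in a finite example. The sketch below again uses the divisors of 12 under divisibility, an illustrative choice that does not appear in the text: the greatest lower bound of a subset $N$ is computed as the least upper bound of the set $R$ of all lower bounds of $N$.

```python
M = [1, 2, 3, 4, 6, 12]  # divisors of 12 (illustrative choice)

def leq(a, b):
    """Partial order: a <= b means that a divides b."""
    return b % a == 0

def lub(N):
    """Least upper bound of N in M, or None if it does not exist."""
    ub = [x for x in M if all(leq(b, x) for b in N)]
    least = [a for a in ub if all(leq(a, c) for c in ub)]
    return least[0] if least else None

def glb_via_lub(N):
    """Greatest lower bound of N, obtained as in the proof of Th. 1.2:
    form R = {b | b <= c for all c in N} and take its least upper bound."""
    R = [b for b in M if all(leq(b, c) for c in N)]
    return lub(R)

# The empty set: every element is an upper bound, so lub([]) is the minimal element;
# every element is a lower bound, so R = M and the result is the maximal element.
zero = lub([])
one = glb_via_lub([])
```

Note that in this ordering the minimal element (the "0" of the lattice) is the number 1 and the maximal element (the "1" of the lattice) is the number 12.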
D 1.1. If, for a given element $a \in M$, there exists an element $a' \in M$ for which $a \wedge a' = 0$ and $a \vee a' = 1$, then $a'$ is called a complement of $a$. A lattice $M$ is said to be complemented if each element $a \in M$ has a complement. Here we note that $a$ is also a complement of $a'$.

D 1.2. A bijective map $a \to a^\perp$ of $M$ into itself is said to be an orthocomplementation if the following conditions are satisfied:

(1) $a \le b \Rightarrow b^\perp \le a^\perp$,
(2) $(a^\perp)^\perp = a$,
(3) $a \wedge a^\perp = 0$.

An orthocomplemented lattice is a lattice with unit and null element for which, in addition, there exists an orthocomplementation structure. Often the orthocomplement of $a$ will be denoted by $a^*$ instead of by $a^\perp$.

Remark. For a given lattice there can be more than one orthocomplementation!

For a subset $N$ we define the following subset: $N^\perp = \{a \mid a = b^\perp,\ b \in N\}$.

Th. 1.3. In an orthocomplemented lattice $a \vee a^\perp = 1$, that is, $a^\perp$ is a complement of $a$. In addition $(a \vee b)^\perp = a^\perp \wedge b^\perp$ and $(a \wedge b)^\perp = a^\perp \vee b^\perp$. If the greatest lower bound $\bigwedge_{a \in N} a$ exists for the subset $N$, then $\bigvee_{b \in N^\perp} b$ exists, and $\bigvee_{b \in N^\perp} b = \bigl(\bigwedge_{a \in N} a\bigr)^\perp$. A similar result holds for the least upper bound.

Proof. From D 1.2, (1) and (2) it follows that $b^\perp \le a^\perp \Rightarrow a \le b$. From this result it directly follows that the map $a \to a^\perp$ maps least upper (greatest lower) bounds into greatest lower (least upper) bounds, and we have proven the second part of the theorem. In particular, $0^\perp = 1$. Thus from D 1.2, (3) it follows that $a \vee a^\perp = (a^\perp \wedge a)^\perp = 0^\perp = 1$.

D 1.3. We say that $a$ is orthogonal to $b$ (written $a \perp b$) if $a \le b^\perp$.

From D 1.2, (1) and (2) it directly follows that $a \le b^\perp \Rightarrow b \le a^\perp$, that is, the relation $a \perp b$ is symmetric.

D 1.4. A lattice $M$ is said to be distributive if, for all $a, b, c \in M$, the following conditions are satisfied:

(1) $(a \vee b) \wedge c = (a \wedge c) \vee (b \wedge c)$,
(2) $(a \wedge b) \vee c = (a \vee c) \wedge (b \vee c)$.

In the following we shall abbreviate conditions (1) and (2) in D 1.4 by $D(a, b, c)$ and $D^*(a, b, c)$ respectively.
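A small finite example may help to separate the notions introduced in D 1.2 to D 1.4. The six-element lattice used below (a least element, a greatest element, and four pairwise incomparable elements in between) is an illustrative assumption and is not discussed in the text; the element names are arbitrary labels. The sketch checks the conditions of D 1.2 by brute force, confirms the statements of Th. 1.3, exhibits a second orthocomplementation on the same lattice (as in the Remark), and shows that $D(a, b, c)$ fails for one triple.

```python
from itertools import product

# Illustrative lattice: 0 < a, a_perp, b, b_perp < 1, middle elements incomparable.
M = ["0", "a", "a_perp", "b", "b_perp", "1"]

def leq(x, y):
    return x == y or x == "0" or y == "1"

def join(x, y):
    """Least upper bound, computed from the definition."""
    ub = [z for z in M if leq(x, z) and leq(y, z)]
    return next(z for z in ub if all(leq(z, w) for w in ub))

def meet(x, y):
    """Greatest lower bound, computed from the definition."""
    lb = [z for z in M if leq(z, x) and leq(z, y)]
    return next(z for z in lb if all(leq(w, z) for w in lb))

def is_orthocomplementation(p):
    """Check conditions (1)-(3) of D 1.2 for the map x -> p[x]."""
    involution = all(p[p[x]] == x for x in M)
    order_reversing = all(leq(p[y], p[x]) for x, y in product(M, repeat=2) if leq(x, y))
    disjoint = all(meet(x, p[x]) == "0" for x in M)
    return involution and order_reversing and disjoint

perp = {"0": "1", "1": "0", "a": "a_perp", "a_perp": "a", "b": "b_perp", "b_perp": "b"}
# A different map on the same lattice that also satisfies D 1.2.
perp_alt = {"0": "1", "1": "0", "a": "b", "b": "a", "a_perp": "b_perp", "b_perp": "a_perp"}

# Statements of Th. 1.3.
complement_law = all(join(x, perp[x]) == "1" for x in M)
de_morgan = all(perp[join(x, y)] == meet(perp[x], perp[y])
                and perp[meet(x, y)] == join(perp[x], perp[y])
                for x, y in product(M, repeat=2))

def D(x, y, z):
    """Condition (1) of D 1.4 for one triple."""
    return meet(join(x, y), z) == join(meet(x, z), meet(y, z))

distributive = all(D(x, y, z) for x, y, z in product(M, repeat=3))
```

Here `D("a", "b", "a_perp")` fails because the left side is `a_perp` while the right side is `0`, so this orthocomplemented lattice is not distributive.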
Th. 1.4. A lattice $M$ is distributive if one of the relations $D(a, b, c)$ or $D^*(a, b, c)$ is satisfied for all $a, b, c \in M$.

Proof. If $D(a, b, c)$ holds for all $a, b, c \in M$, then it follows that $(a \vee c) \wedge (b \vee c) = (a \wedge (b \vee c)) \vee (c \wedge (b \vee c)) = (a \wedge (b \vee c)) \vee c = (a \wedge b) \vee (a \wedge c) \vee c = (a \wedge b) \vee c$, that is, $D^*(a, b, c)$. In the same way we may show that if $D^*(a, b, c)$ holds for all $a, b, c \in M$, then $D(a, b, c)$ holds.

2 Orthomodularity

D 2.1. A pair of elements $a$, $b$ is said to be a modular pair if for all $c \le b$ the relation $D(c, a, b)$ is satisfied, that is, if, for all $c \le b$, the relation $(c \vee a) \wedge b = c \vee (a \wedge b)$ is satisfied. We shall denote a modular pair $a$, $b$ by $M(a, b)$. A lattice $M$ is said to be modular if $M(a, b)$ is satisfied for all $a$, $b$.

We will now assume that $M$ is an orthocomplemented lattice.

D 2.2. A pair $a$, $b$ is said to be compatible (abbreviated $C(a, b)$) if $a = (a \wedge b) \vee (a \wedge b^\perp)$ is satisfied (see also IV, Th. 1.3.4(iv)).

It follows that $a \perp b \Rightarrow C(a, b)$; $C(a, b) \Rightarrow C(a, b^\perp)$; $a \le b \Rightarrow C(a, b)$, since $a \wedge b = a$ and $a \wedge b^\perp = a \wedge b \wedge b^\perp = 0$.

Th. 2.1. The following statements are equivalent:

(i) For all pairs $a$, $b$ satisfying $a \perp b$, $M(a, b)$ holds.
(ii) For all $a$ we obtain $M(a, a^\perp)$.
(iii) For all pairs $a$, $b$ satisfying $a \le b$, $C(b, a)$ holds, that is, $b = a \vee (b \wedge a^\perp)$.
(iv) For all pairs $a$, $b$ satisfying $a \le b$: $a = b \wedge (a \vee b^\perp)$.
(v) For all pairs $a$, $b$: $C(a, b) \Rightarrow C(b, a)$.
(vi) For each triple $a$, $b$, $c$ satisfying $a \perp b$, $a \perp c$, the following condition holds: $a \vee b = a \vee c \Rightarrow b = c$.
(vii) For each triple $a$, $b$, $c$ satisfying $a \perp b$, $a \perp c$, the following condition holds: $a \vee b = a \vee c$ and $b \le c \Rightarrow b = c$.
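(viii) For each $a$ the following condition holds: $b \perp a$ and $a \vee b = 1 \Rightarrow b = a^\perp$, that is, for each $a$ there exists only one orthogonal complement.

Proof. (i) $\Rightarrow$ (ii) is clear, since $a \perp a^\perp$.

(ii) $\Rightarrow$ (iii). From $a \le b$ it follows that $b^\perp \le a^\perp$, and from $M(a, a^\perp)$ it follows that $(b^\perp \vee a) \wedge a^\perp = b^\perp \vee (a \wedge a^\perp) = b^\perp$; therefore, by applying $\perp$ we finally obtain $b = a \vee (b \wedge a^\perp)$.

(iii) $\Rightarrow$ (iv). If $a \le b$ then $b^\perp \le a^\perp$ and, according to (iii), $a^\perp = b^\perp \vee (a^\perp \wedge b)$, from which it follows that $a = b \wedge (a \vee b^\perp)$.

(iv) $\Rightarrow$ (iii) can be proven in the same way.

(iii) $\Rightarrow$ (v). From $C(a, b)$, that is, from $a = (a \wedge b) \vee (a \wedge b^\perp)$, it follows that $a^\perp = (a^\perp \vee b^\perp) \wedge (a^\perp \vee b)$ and $b \wedge a^\perp = b \wedge (a^\perp \vee b^\perp)$. Hence it follows that $(b \wedge a) \vee (b \wedge a^\perp) = (b \wedge a) \vee (b \wedge (a^\perp \vee b^\perp)) = (b \wedge a) \vee (b \wedge (a \wedge b)^\perp)$; since $b \wedge a \le b$, from (iii) we finally obtain $(b \wedge a) \vee (b \wedge (b \wedge a)^\perp) = b$.

(v) $\Rightarrow$ (vi). Assuming (v) holds, we obtain $C(a, b) \Rightarrow C(a, b^\perp) \Rightarrow C(b^\perp, a) \Rightarrow C(b^\perp, a^\perp) \Rightarrow C(a^\perp, b^\perp)$. From $a \vee b = a \vee c$, $b \perp a$ and $c \perp a$ it follows that $a^\perp \wedge b^\perp = a^\perp \wedge c^\perp$. Since $a \perp b \Rightarrow C(a, b) \Rightarrow C(b^\perp, a^\perp)$, it follows that $b^\perp = (b^\perp \wedge a) \vee (b^\perp \wedge a^\perp) = a \vee (b^\perp \wedge a^\perp) = a \vee (c^\perp \wedge a^\perp)$. Similarly it follows that $c^\perp = a \vee (c^\perp \wedge a^\perp)$ and therefore $b^\perp = c^\perp$, that is, $b = c$.

(vi) $\Rightarrow$ (vii) is trivial.

(vii) $\Rightarrow$ (viii). From $a \vee b = 1 = a \vee a^\perp$ and $b \perp a$, that is, $b \le a^\perp$, we obtain the special case in which $b = a^\perp$.

(viii) $\Rightarrow$ (iii). We assume (by (iii)) $a \le b$. We define $d = a \vee (b \wedge a^\perp)$. Thus we obtain $d \le b$ and therefore $d \wedge b^\perp = (d \wedge b) \wedge b^\perp = 0$. On the other hand, $b^\perp \vee d = b^\perp \vee a \vee (b \wedge a^\perp) = (b \wedge a^\perp)^\perp \vee (b \wedge a^\perp) = 1$. From (viii) it follows that $d = (b^\perp)^\perp = b$, that is, $b = a \vee (b \wedge a^\perp)$.

(iii) $\Rightarrow$ (i). According to (i) we assume that $a \perp b$ and $c \le b$. Since $a \perp b$, $M(a, b)$ is equivalent to $(c \vee a) \wedge b = c$. Since $c \le b$ and $c \le c \vee a$ we find that $(c \vee a) \wedge b \ge c$. According to (iii), for $g = ((c \vee a) \wedge b) \wedge c^\perp$ we obtain the relation $(c \vee a) \wedge b = c \vee g$.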
Since $(c \vee a) \wedge b \le b$ we obtain $g \le b \wedge c^\perp$, that is, $b^\perp \vee c \le g^\perp$. Since $(c \vee a) \wedge b \le c \vee a$ it follows that $g \le c \vee a$ and, since $a \perp b$, that is, $a \le b^\perp$, we obtain $g \le c \vee b^\perp \le g^\perp$, from which we conclude $g = 0$, that is, $c = (c \vee a) \wedge b$.

D 2.3. An orthocomplemented lattice is said to be orthomodular if it satisfies one of the conditions (i)-(viii) of Th. 2.1.

D 2.4. Let $M$ be an orthocomplemented lattice. A real function $M \xrightarrow{m} [0, 1]$ is said to be a normed orthomeasure on $M$ if

(1) $m(1) = 1$,
(2) $a \perp b \Rightarrow m(a \vee b) = m(a) + m(b)$.

Th. 2.2. If there exists a set $K$ of orthomeasures on $M$ such that $m(a) = m(b)$ for all $m \in K$ implies the relation $a = b$ (that is, $K$ separates $M$), then $M$ is orthomodular.

Proof. According to Th. 2.1(viii) let $a \vee b = a \vee a^\perp = 1$ with $b \perp a$. Thus, for all $m \in K$, it follows that $m(a) + m(b) = m(a) + m(a^\perp)$, that is, $m(b) = m(a^\perp)$, and we obtain $b = a^\perp$.
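Condition (iii) of Th. 2.1 lends itself to a brute-force check on finite orthocomplemented lattices. The two six-element lattices used in the following sketch are illustrative assumptions that do not appear in the text: one with four pairwise incomparable middle elements, and a "hexagon" consisting of the two chains $0 < a < b < 1$ and $0 < b^\perp < a^\perp < 1$. The function names are hypothetical.

```python
ELEMENTS = ["0", "a", "b", "b_perp", "a_perp", "1"]
PERP = {"0": "1", "1": "0", "a": "a_perp", "a_perp": "a", "b": "b_perp", "b_perp": "b"}

def leq_flat(x, y):
    """Order with four pairwise incomparable middle elements."""
    return x == y or x == "0" or y == "1"

HEX_CHAINS = {("a", "b"), ("b_perp", "a_perp")}

def leq_hex(x, y):
    """Hexagon order: 0 < a < b < 1 and 0 < b_perp < a_perp < 1."""
    return leq_flat(x, y) or (x, y) in HEX_CHAINS

def lattice_ops(leq):
    """Return (join, meet) computed from the order relation by brute force."""
    def join(x, y):
        ub = [z for z in ELEMENTS if leq(x, z) and leq(y, z)]
        return next(z for z in ub if all(leq(z, w) for w in ub))
    def meet(x, y):
        lb = [z for z in ELEMENTS if leq(z, x) and leq(z, y)]
        return next(z for z in lb if all(leq(w, z) for w in lb))
    return join, meet

def is_orthomodular(leq):
    """Th. 2.1(iii): a <= b implies b = a v (b ^ a_perp)."""
    join, meet = lattice_ops(leq)
    return all(join(a, meet(b, PERP[a])) == b
               for a in ELEMENTS for b in ELEMENTS if leq(a, b))

def orthogonal_complements(leq, a):
    """All x with x orthogonal to a and a v x = 1, cf. Th. 2.1(viii)."""
    join, _ = lattice_ops(leq)
    return [x for x in ELEMENTS if leq(x, PERP[a]) and join(a, x) == "1"]

flat_is_orthomodular = is_orthomodular(leq_flat)
hexagon_is_orthomodular = is_orthomodular(leq_hex)
```

In the hexagon, $a \le b$ but $a \vee (b \wedge a^\perp) = a \vee 0 = a \neq b$, so condition (iii) fails; correspondingly the element $a$ there has two orthogonal complements, $a^\perp$ and $b^\perp$, in violation of condition (viii).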
Th. 2.2 is directly applicable to the lattice $G$ of decision effects (see III, D 6.2, III, D 6.3 and III, D 6.6) if we set $m(e) = \operatorname{tr}(we)$. $G$ is therefore orthomodular.

3 Boolean Rings

D 3.1. A complemented distributive lattice is called a Boolean lattice or a Boolean ring (for the designation "Ring" also see D 3.2).

Th. 3.1. In a Boolean ring $M$ each element $a$ has exactly one complement $a'$. The mapping $a \to a'$ is an orthocomplementation.

Proof. Suppose, therefore, that $a \wedge x = a \wedge y = 0$ and $a \vee x = a \vee y = 1$. It then follows that $x = x \wedge 1 = x \wedge (a \vee y) = (x \wedge a) \vee (x \wedge y) = x \wedge y$. Similarly we obtain $y = x \wedge y$ and therefore $x = y$. Since $a$ is also a complement of $a'$, the mapping $a \to a'$ is bijective and $(a')' = a$. According to D 1.2 we need only prove $a \le b \Rightarrow b' \le a'$. From $a \le b$ it follows that $a = a \wedge b$ and therefore $a \wedge b' = a \wedge b \wedge b' = 0$. Thus we obtain $b' = b' \wedge 1 = b' \wedge (a \vee a') = (b' \wedge a) \vee (b' \wedge a') = b' \wedge a'$, that is, $b' \le a'$.

The fact that a Boolean ring is orthomodular is trivial, because it satisfies the distributive law. In a Boolean ring we may therefore use $a^\perp$ to denote the complement of $a$. Instead of $a^\perp$ we may often use $a^*$.

Th. 3.2. An orthocomplemented orthomodular lattice is a Boolean ring if and only if every pair of elements is compatible (see D 2.2).

Proof. The fact that $C(a, b)$ holds for all pairs $a$, $b$ in a Boolean ring follows from $a = a \wedge 1 = a \wedge (b \vee b^\perp)$ and the distributive law.

In order to prove the converse, we shall now show that $C(a, b) \Leftrightarrow a \wedge b = a \wedge (b \vee a^\perp)$. $C(a, b)$ means that $a = (a \wedge b) \vee (a \wedge b^\perp)$, from which it follows that $a^\perp = (a \wedge b)^\perp \wedge (a^\perp \vee b)$. From this it follows that $a \wedge (a^\perp \vee b) \wedge (a \wedge b)^\perp = 0$. Since $a \wedge b \le a \wedge (b \vee a^\perp)$, from Th. 2.1(iii) it follows that $a \wedge (b \vee a^\perp) = (a \wedge b) \vee (a \wedge (b \vee a^\perp) \wedge (a \wedge b)^\perp) = a \wedge b$.
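Conversely, from $a \wedge b = a \wedge (b \vee a^\perp)$ we obtain the following relationships: $(a \wedge b) \vee (a \wedge b^\perp) = (a \wedge (b \vee a^\perp)) \vee (a \wedge b^\perp) = (a \wedge b^\perp) \vee (a \wedge (a \wedge b^\perp)^\perp) = a$, where the last equality is obtained from $a \wedge b^\perp \le a$ and Th. 2.1(iii).

It remains to show that the distributive law holds if every pair of elements is compatible. Since $a \wedge c \le a \vee b$, $a \wedge c \le c$, $b \wedge c \le a \vee b$, $b \wedge c \le c$ we find that $(a \vee b) \wedge c \ge (a \wedge c) \vee (b \wedge c)$. According to Th. 2.1(iii) we therefore obtain
$$(a \vee b) \wedge c = (a \wedge c) \vee (b \wedge c) \vee [(a \vee b) \wedge c \wedge ((a \wedge c) \vee (b \wedge c))^\perp].$$
We need only show that the expression $z$ in the square brackets is equal to 0. First, $z = (a \vee b) \wedge c \wedge ((a \wedge c) \vee (b \wedge c))^\perp = (a \vee b) \wedge c \wedge (a^\perp \vee c^\perp) \wedge (b^\perp \vee c^\perp)$. From $C(c, a^\perp)$ we find $c \wedge a^\perp = c \wedge (a^\perp \vee c^\perp)$ and we therefore obtain $z = (a \vee b) \wedge c \wedge a^\perp \wedge (b^\perp \vee c^\perp)$. From $C(c, b^\perp)$ we obtain $z = (a \vee b) \wedge a^\perp \wedge c \wedge b^\perp = c \wedge (a \vee b) \wedge (a \vee b)^\perp = 0$.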
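To see that compatibility is a genuine restriction, the following sketch examines the smallest orthomodular lattice that is not a Boolean ring. This example is an assumption added here for illustration (the six-element lattice with two incomparable pairs $a, a^\perp$ and $b, b^\perp$); it does not appear in the text. The orthomodular law is checked in the form of Th. 2.1(iii) used in the proof above.

```python
# Six elements: 0 <= a, a', b, b' <= 1, with the four middle elements incomparable.
ELEMS = ["0", "a", "a'", "b", "b'", "1"]
PERP = {"0": "1", "1": "0", "a": "a'", "a'": "a", "b": "b'", "b'": "b"}

def leq(x, y):
    return x == y or x == "0" or y == "1"

def meet(x, y):
    if leq(x, y):
        return x
    if leq(y, x):
        return y
    return "0"  # two incomparable middle elements only share the lower bound 0

def join(x, y):
    if leq(x, y):
        return y
    if leq(y, x):
        return x
    return "1"  # two incomparable middle elements only share the upper bound 1

def compatible(x, y):
    """C(x, y): x equals (x meet y) join (x meet y-perp)."""
    return x == join(meet(x, y), meet(x, PERP[y]))

def orthomodular():
    """x <= y implies y == x join (y meet x-perp)."""
    return all(y == join(x, meet(y, PERP[x]))
               for x in ELEMS for y in ELEMS if leq(x, y))

def distributive():
    return all(meet(join(x, y), z) == join(meet(x, z), meet(y, z))
               for x in ELEMS for y in ELEMS for z in ELEMS)

print(orthomodular(), compatible("a", "b"), distributive())
```

Here the lattice is orthomodular, but $a$ and $b$ are not compatible and the distributive law fails, in agreement with Th. 3.2.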
Th. 3.3. An orthocomplemented orthomodular lattice is a Boolean ring if and only if each element $a$ has only one complement.

PROOF. The first part of this theorem follows directly from Th. 3.1. According to Th. 3.2 we need only show that each pair of elements is compatible if each element has only one complement. According to Th. 2.1(iii) we obtain $(a \wedge b) \vee (a \wedge (a \wedge b)^\perp) = a$. With $c = a \wedge (a \wedge b)^\perp$ we obtain $c \wedge b = 0$ and, from $(a \wedge b)^\perp \ge b^\perp$, we obtain $c \wedge b^\perp = a \wedge (b^\perp \wedge (a \wedge b)^\perp) = a \wedge b^\perp$; $a$ and $b$ are therefore compatible if $c \wedge b^\perp = c$, that is, if $c$ and $b$ are compatible. For $d = b \vee c$ and $e = c \vee d^\perp$ we obtain $b \vee e = b \vee c \vee d^\perp = d \vee d^\perp = 1$, and, according to Th. 2.1(iii), it follows that $e = d^\perp \vee (e \wedge d)$. Since $c \perp d^\perp$, from Th. 2.1(vi) it follows that $c = e \wedge d$. From $b \le d$ it follows that $b \wedge e = b \wedge d \wedge e = b \wedge c = 0$. Therefore $e$ is a complement of $b$. Since $b$ has only one complement, we find that $e = b^\perp$ and therefore $c \le b^\perp$, that is, $c \wedge b^\perp = c$.

Th. 3.4. If the least upper bound $\bigvee_{a \in N} a$ (greatest lower bound $\bigwedge_{a \in N} a$) exists for a subset $N$ of a Boolean ring $M$, then for each $b \in M$ the least upper bound $\bigvee_{a \in N} (b \wedge a)$ (greatest lower bound $\bigwedge_{a \in N} (b \vee a)$) exists, and satisfies the distributive law:
$$b \wedge \Big(\bigvee_{a \in N} a\Big) = \bigvee_{a \in N} (b \wedge a); \qquad b \vee \Big(\bigwedge_{a \in N} a\Big) = \bigwedge_{a \in N} (b \vee a).$$

PROOF. We must show that $x = b \wedge (\bigvee_{a \in N} a)$ is the least upper bound of the set of all $b \wedge a$ for which $a \in N$. Since $b \wedge a \le b$ and $b \wedge a \le \bigvee_{d \in N} d$ for all $a \in N$ it follows that $x \ge b \wedge a$ for all $a \in N$. Let $g \ge b \wedge a$ for all $a \in N$. We need to show that $g \ge x$. Since $g \ge b \wedge a$ implies $g \wedge b \ge b \wedge a$, it suffices to show that, for $u \ge b \wedge a$ with $u \le b$, it follows that $u \ge x$. Suppose that $v \ge b^\perp \wedge a$ for all $a \in N$ and $v \le b^\perp$. Thus it follows that $u \vee v \ge (b \wedge a) \vee (b^\perp \wedge a) = a$ for all $a \in N$, that is, $u \vee v \ge \bigvee_{a \in N} a$. From $b \wedge u = u$ and $b \wedge v = 0$ it follows that
$$u = b \wedge u = (b \wedge u) \vee (b \wedge v) = b \wedge (u \vee v) \ge b \wedge \bigvee_{a \in N} a = x.$$
From Th.
1.3 and from
$$b \vee \bigwedge_{a \in N} a = \Big(b^\perp \wedge \bigvee_{a \in N} a^\perp\Big)^\perp = \Big[\bigvee_{a \in N} (b^\perp \wedge a^\perp)\Big]^\perp = \bigwedge_{a \in N} (b \vee a)$$
we obtain the second part of the theorem.

The following relations are frequently defined in a Boolean ring:

D 3.2. $a \cdot b = a \wedge b$, $a + b = (a \wedge b^\perp) \vee (b \wedge a^\perp)$.

It is easy to show that a Boolean ring together with the operations $\cdot$ and $+$ is a (commutative) ring (algebra) for which $a \cdot a = a$ and $a + a = 0$.
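The ring properties stated after D 3.2 can be checked mechanically in a concrete case. The sketch below assumes, purely for illustration, that the Boolean ring is the power set of a four-element set, so that $\wedge$, $\vee$ and $^\perp$ become intersection, union and set complement.

```python
from itertools import combinations

X = frozenset(range(4))
M = [frozenset(c) for r in range(len(X) + 1)
     for c in combinations(sorted(X), r)]
ZERO = frozenset()

def mul(a, b):
    """a . b = a meet b."""
    return a & b

def add(a, b):
    """a + b = (a meet b-perp) join (b meet a-perp)."""
    return (a & (X - b)) | (b & (X - a))

idempotent = all(mul(a, a) == a for a in M)
self_inverse = all(add(a, a) == ZERO for a in M)
associative = all(add(add(a, b), c) == add(a, add(b, c))
                  for a in M for b in M for c in M)
distributive = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
                   for a in M for b in M for c in M)

print(idempotent, self_inverse, associative, distributive)
```

In this model $a + b$ is the symmetric difference of the two sets, which is why $a + a = 0$ holds.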
We may express the operation $\vee$ by means of $\cdot$ and $+$ as follows: $a \vee b = a + b + a \cdot b$. Conversely, let $M$ be a commutative ring (with unit element) with the operations $\cdot$ and $+$ for which $a \cdot a = a$ and $a + a = 0$; if we define $a \wedge b = a \cdot b$ and $a \vee b = a + b + a \cdot b$, then from Th. 1.1(1), (2), (3) it follows that $M$ is a lattice. It is easy to show that this lattice is distributive, and that $1 + a$ is the complement of $a$. A distributive complemented lattice can therefore be well characterized as a commutative ring (with unit element) for which $a \cdot a = a$ and $a + a = 0$.

The following more general concept of a Boolean ring is often used:

D 3.3. A lattice is said to be relatively complemented if to each pair $a \ge b$ there exists a $c$ such that $b \wedge c = 0$, $b \vee c = a$. $c$ is called the relative complement of $b$ with respect to $a$.

D 3.4. A distributive relatively complemented lattice is called a generalized Boolean ring.

In a generalized Boolean ring the relative complement is uniquely determined. In a generalized Boolean ring we may obtain a commutative algebra by defining $a \cdot b = a \wedge b$ and $a + b = \tilde{a} \vee \tilde{b}$, where $\tilde{a}$ is the relative complement of $a \wedge b$ in $a$ and $\tilde{b}$ is the relative complement of $a \wedge b$ in $b$. A generalized Boolean ring is a Boolean ring if and only if it has a unit element.

From two Boolean rings $M_1$, $M_2$ (and from a finite number of Boolean rings) it is possible to construct new Boolean rings in two different ways. In the first way we are given a pair of arbitrary partially ordered sets $M_1$, $M_2$; on the product set $M = M_1 \times M_2$ we introduce the following partial order: $(a_1, b_1) \le (a_2, b_2)$ means $a_1 \le a_2$, $b_1 \le b_2$. It is easy to see that $M$ is a (complete) lattice whenever $M_1$ and $M_2$ are (complete) lattices. In particular, we obtain
$$(a_1, b_1) \wedge (a_2, b_2) = (a_1 \wedge a_2, b_1 \wedge b_2), \qquad (a_1, b_1) \vee (a_2, b_2) = (a_1 \vee a_2, b_1 \vee b_2).$$
If $M_1$ and $M_2$ have 1 and 0 elements, then so does $M$: $1 = (1_1, 1_2)$, $0 = (0_1, 0_2)$. If $M_1$ and $M_2$ are complemented, then so is $M$: $(a_1, a_2)' = (a_1', a_2')$.
If $M_1$ and $M_2$ are distributive, then so is $M$. Therefore, if $M_1$ and $M_2$ are Boolean rings, then so is $M$.
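The first (product) construction can be sketched as follows. The two factors are assumed here to be power sets of small sets (an illustrative choice, not taken from the text); the order, meet, join and complement are all defined componentwise.

```python
from itertools import combinations, product

def powerset(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

X1, X2 = frozenset({1, 2}), frozenset({"p"})
M = list(product(powerset(X1), powerset(X2)))   # product set M1 x M2
ONE, ZERO = (X1, X2), (frozenset(), frozenset())

def leq(u, v):
    return u[0] <= v[0] and u[1] <= v[1]

def meet(u, v):
    return (u[0] & v[0], u[1] & v[1])

def join(u, v):
    return (u[0] | v[0], u[1] | v[1])

def comp(u):
    """Componentwise complement."""
    return (X1 - u[0], X2 - u[1])

complemented = all(join(u, comp(u)) == ONE and meet(u, comp(u)) == ZERO
                   for u in M)
distributive = all(meet(u, join(v, w)) == join(meet(u, v), meet(u, w))
                   for u in M for v in M for w in M)

print(len(M), complemented, distributive)
```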
The second way to construct a Boolean ring $M$ from $M_1$ and $M_2$ may be carried out most simply with the aid of the algebraic operations $+$ and $\cdot$. This construction is analogous to the technical "logical" switching circuits: from $M_1$ and $M_2$ we construct the "free" algebra (ring) which consists of all possible formal sums and products of elements in $M_1$ and $M_2$ by means of finitely many operations $+$ and $\cdot$. In this case $M$ is not, in general, equal to the product set. The elements of $M$ can be represented by all formal finite sums of the form
$$\sum_i a_i^{(1)} \cdot a_i^{(2)},$$
where $a_i^{(1)} \in M_1$, $a_i^{(2)} \in M_2$ and $a_i^{(1)} \cdot a_j^{(1)} = 0$, $a_i^{(2)} \cdot a_j^{(2)} = 0$ for $i \ne j$. Then, by means of the operations $+$ and $\cdot$ we may obtain new sums as follows: for $\big(\sum_i a_i^{(1)} \cdot a_i^{(2)}\big) + \big(\sum_j b_j^{(1)} \cdot b_j^{(2)}\big)$ we do not immediately obtain $a_i^{(1)} \cdot b_j^{(1)} = 0$, etc. We can, however, attain this result stepwise provided that, instead of $a_i^{(1)}$, $b_j^{(1)}$, we use the new elements $a_i^{(1)} \cdot b_j^{(1)}$, $a_i^{(1)} + a_i^{(1)} \cdot b_j^{(1)}$, $b_j^{(1)} + a_i^{(1)} \cdot b_j^{(1)}$ of $M_1$.

D 3.5. A subset $I$ of a lattice $M$ is said to be an ideal if the following conditions are satisfied:
(1) $a \in I$ and $b \le a \Rightarrow b \in I$,
(2) $a_1, a_2 \in I \Rightarrow a_1 \vee a_2 \in I$.

Th. 3.5. If $M$ is a generalized Boolean ring, then $I$ is also algebraically an ideal of $M$, that is, the following conditions are satisfied: $a \in I$ and $b \in M \Rightarrow a \cdot b \in I$; $a_1, a_2 \in I \Rightarrow a_1 + a_2 \in I$. Conversely, an algebraic ideal $I$ is also an ideal in the sense of D 3.5.

PROOF. Let $I$ be an ideal according to D 3.5. From $a \in I$ and $a \cdot b = a \wedge b \le a$ it follows that $a \cdot b \in I$. From $a_1, a_2 \in I$ and $a_1 + a_2 \le a_1 \vee a_2$ it follows that $a_1 + a_2 \in I$. Let $I$ be an algebraic ideal. From $a \in I$ and $b \le a$ it follows that $b = b \wedge a = b \cdot a \in I$; from $a_1, a_2 \in I$ it follows that $a_1 \vee a_2 = (a_1 + a_2) + a_1 \cdot a_2 \in I$.

From the fact that $I$ is also an algebraic ideal, it follows that $M/I$ is a Boolean ring. An equivalence relation $b_1 \sim b_2$ is defined by $b_1 = b_2 + a$ where $a \in I$. Since $b_1 = b_2 + a \Leftrightarrow b_1 + b_2 = a$ we may define $b_1 \sim b_2$ by $b_1 + b_2 \in I$. The set of classes $M/I$ can easily be seen to be a Boolean ring as follows: $c_1 \cdot c_2$ is equal to the class of $b_1 \cdot b_2$ for $b_1 \in c_1$, $b_2 \in c_2$.
$c_1 + c_2$ is equal to the class of $b_1 + b_2$ for $b_1 \in c_1$, $b_2 \in c_2$.
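The quotient construction $M/I$ can be illustrated on a finite example. In the sketch below, $M$ is assumed to be the power set of $\{0,1,2,3\}$ and $I$ the family of all subsets of a fixed set $S$; both choices are hypothetical and serve only to make the definitions concrete.

```python
from itertools import combinations

X = frozenset(range(4))
S = frozenset({2, 3})                      # hypothetical choice: I = all subsets of S
M = [frozenset(c) for r in range(len(X) + 1)
     for c in combinations(sorted(X), r)]
I = [a for a in M if a <= S]

def is_lattice_ideal(ideal):
    """Conditions (1) and (2) of D 3.5."""
    down = all(b in ideal for a in ideal for b in M if b <= a)
    joins = all((a1 | a2) in ideal for a1 in ideal for a2 in ideal)
    return down and joins

def equivalent(b1, b2):
    """b1 ~ b2 iff b1 + b2 lies in I (+ is the symmetric difference)."""
    return (b1 ^ b2) in I

# Equivalence class of every element of M.
CLASS = {b: frozenset(c for c in M if equivalent(b, c)) for b in M}

def well_defined(op):
    """The class of op(b1, b2) depends only on the classes of b1 and b2."""
    return all(CLASS[op(b1, b2)] == CLASS[op(c1, c2)]
               for b1 in M for c1 in CLASS[b1]
               for b2 in M for c2 in CLASS[b2])

print(is_lattice_ideal(I), len(set(CLASS.values())),
      well_defined(lambda a, b: a & b), well_defined(lambda a, b: a ^ b))
```

In this example the quotient has four classes, one for each subset of $X \setminus S$.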
4 Set Lattices

If $X$ is a set and $M \subset \mathcal{P}(X)$ ($\mathcal{P}(X)$ is the power set of $X$) then $M$ is a partially ordered set with respect to the set theoretical relation $\subset$ of inclusion. If $M$ is a lattice, we then speak of a set lattice. Here it is important to note that set theoretical union $\cup$ and intersection $\cap$ will not necessarily correspond to the lattice least upper bound $\vee$ and greatest lower bound $\wedge$, respectively, that is, for $a, b \in M$ it is not necessarily true that $a \wedge b = a \cap b$ and $a \vee b = a \cup b$. We do find, however, that $a \wedge b \subset a \cap b$ and $a \vee b \supset a \cup b$.

Th. 4.1. If $M$ has a maximal element, and if, for every $N \subset M$, $\bigcap_{a \in N} a \in M$, then $M$ is a complete set lattice and
$$a \wedge b = a \cap b, \qquad a \vee b = \bigcap_{c \in M,\ a \cup b \subset c} c.$$
Similarly, if $M$ contains a minimal element and if, for every $N \subset M$, $\bigcup_{a \in N} a \in M$, then
$$a \vee b = a \cup b \quad \text{and} \quad a \wedge b = \bigcup_{c \in M,\ c \subset a \cap b} c.$$

The proof of this theorem is a simple consequence of Th. 1.2.

If $M$ is complemented, then in general the complement $a'$ of $a \in M$ is not necessarily equal to the set complement $e \setminus a$ of $a$ in the maximal element $e$ of $M$. If $M$ contains the empty set, then it follows that $a' \subset e \setminus a$. Let $M$ be a Boolean ring of sets, let $c \in M$ and let $c \subset e \setminus a$ where $e$ is the maximal element of $M$. Then $c \wedge a = 0$. Using $a^\perp$ we find that $(c \vee a^\perp) \wedge a = (c \wedge a) \vee (a^\perp \wedge a) = 0$ and $(c \vee a^\perp) \vee a = c \vee e = e$, that is, $c \vee a^\perp$ is also a complement; by the uniqueness of the complement we obtain $c \vee a^\perp = a^\perp$, that is, $c \subset a^\perp$. If the empty set is an element of $M$ then we must have $a^\perp \cap a = \emptyset$ and therefore $a^\perp \subset e \setminus a$. Then the set $\{c \mid c \in M \text{ and } c \subset e \setminus a\}$ has a maximal element, namely $a^\perp$.
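The distinction between $\vee$ and $\cup$ in a set lattice can be seen in a very small case. The family $M$ below is a hypothetical example chosen so that it is closed under intersections but not under unions; the join is computed by the formula of Th. 4.1.

```python
from functools import reduce

E = frozenset({1, 2, 3})                     # maximal element of M
M = [frozenset(), frozenset({1}), frozenset({2}), E]

def meet(a, b):
    """Under the hypothesis of Th. 4.1 the meet is the intersection."""
    return a & b

def join(a, b):
    """Intersection of all members of M that contain the union of a and b."""
    uppers = [c for c in M if (a | b) <= c]
    return reduce(lambda u, v: u & v, uppers, E)

a, b = frozenset({1}), frozenset({2})
print(sorted(a | b), sorted(join(a, b)))
```

The union $\{1, 2\}$ is not an element of $M$, so the least upper bound of $\{1\}$ and $\{2\}$ in $M$ is the strictly larger set $\{1, 2, 3\}$.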
APPENDIX II

Remarks about Topological and Uniform Structures

Here we shall provide a brief summary of some of the concepts and principal results which are used in various places in this book. For the proof of these theorems, see [32].

1 Topological Spaces

A topological space consists of a set $X$ together with a structure $\mathcal{O} \subset \mathcal{P}(X)$ which satisfies the following properties:
(1) $\mathcal{O}$ contains the intersection of any finite collection of elements of $\mathcal{O}$.
(2) $\mathcal{O}$ contains arbitrary unions of elements of $\mathcal{O}$.
$\mathcal{O}$ is called the set of open sets. We often say that $\mathcal{O}$ defines a topology on $X$. The complements of sets in $\mathcal{O}$ are called closed sets. The "interior" $A^0$ of a set $A$ is defined as the union of all open subsets of $A$. $A^0$ is open. The closure $\bar{A}$ of $A \subset X$ is defined as the intersection of all closed sets which contain $A$. $B$ is said to be dense in $A$ if $B \subset A$ and $A \subset \bar{B}$. $X$ is said to be separable if there exists a countable subset in $X$ which is dense in $X$.

A filter in $X$ is a subset $\mathcal{F} \subset \mathcal{P}(X)$ for which $\emptyset \notin \mathcal{F}$; $A \in \mathcal{F}$ and $A \subset B \Rightarrow B \in \mathcal{F}$; $A \in \mathcal{F}$ and $B \in \mathcal{F} \Rightarrow A \cap B \in \mathcal{F}$.

We may also, in an equivalent manner, define a topology by means of a neighborhood structure, as follows: To each $x \in X$ there corresponds a filter $\mathcal{F}_x$, the so-called neighborhood filter of $x$. For the $\mathcal{F}_x$ we require that $x \in U$ for all $U \in \mathcal{F}_x$, and that to each $U \in \mathcal{F}_x$ there exists a $V \in \mathcal{F}_x$ such that for each $y \in V$ the relation $U \in \mathcal{F}_y$ is satisfied.
We may show the equivalence in the following way:
(1) Let $X$ and $\mathcal{O}$ be given. We define $\mathcal{F}_x$ as the set of all $A$ for which there exists a $B \in \mathcal{O}$ for which $x \in B \subset A$. The $\mathcal{F}_x$ then satisfy the requirements for a neighborhood structure.
(2) Let $X$ be given together with a neighborhood structure. We define $\mathcal{O}$ as the set of all $A$ for which $y \in A \Rightarrow A \in \mathcal{F}_y$; the set $A$ is therefore a neighborhood of each of its points.
It is a simple matter to show that if we begin with $X$ and $\mathcal{O}$, define the neighborhood structure according to procedure (1), and then define the open sets from this neighborhood structure according to procedure (2), we recover the $\mathcal{O}$ we began with. Similarly, if we begin with a neighborhood structure in $X$ and define the set $\mathcal{O}$ of open sets according to procedure (2), and if we then define a neighborhood structure using $\mathcal{O}$ according to procedure (1), we obtain the same structure we began with.

Let $X$ and $Y$ be topological spaces. A mapping $f: X \to Y$ is said to be continuous if, for every open set $A$ of $Y$, the set $f^{-1}(A) = \{x \mid f(x) \in A\}$ is open. This is equivalent to the condition that for every closed set $A$, $f^{-1}(A)$ is closed.

A filter $\mathcal{G}$ is said to be finer than a filter $\mathcal{F}$ if $\mathcal{G} \supset \mathcal{F}$. We say that a filter $\mathcal{G}$ in $X$ converges to an $x \in X$ if $\mathcal{G}$ is finer than the neighborhood filter $\mathcal{F}_x$. If $x_n$ is a sequence, then $\mathcal{G} = \{A \mid A \text{ contains all } x_n \text{ except for a finite number}\}$ is a filter. We say that the sequence $x_n$ converges towards $x$ and write $x_n \to x$ if the corresponding filter $\mathcal{G}$ converges to $x$. It follows that $x_n \to x$ if and only if each $U \in \mathcal{F}_x$ contains all $x_n$ (with the exception of a finite number). For a filter $\mathcal{G}$ in $X$, $x \in X$ is called an accumulation point of $\mathcal{G}$ if $x \in \bar{A}$ for all $A \in \mathcal{G}$. $x$ is called an accumulation point of the sequence $x_n$ if it is an accumulation point of the corresponding filter; this is the case if and only if, to each $U \in \mathcal{F}_x$, there exists an infinity of elements $x_{n_i}$ of the sequence for which $x_{n_i} \in U$.
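The equivalence of procedures (1) and (2) can be checked directly on a finite set. The three-point space and its open sets below are an assumed toy example; the function names are hypothetical.

```python
from itertools import combinations

X = frozenset({1, 2, 3})
SUBSETS = [frozenset(c) for r in range(len(X) + 1)
           for c in combinations(sorted(X), r)]
OPEN = {frozenset(), frozenset({1}), frozenset({1, 2}), X}

def is_topology(opens):
    """On a finite set it suffices to test pairwise intersections and unions."""
    pairwise = all((a & b) in opens and (a | b) in opens
                   for a in opens for b in opens)
    return X in opens and frozenset() in opens and pairwise

def neighborhood_filter(x, opens):
    """Procedure (1): all A that contain an open set B with x in B."""
    return {a for a in SUBSETS if any(x in b and b <= a for b in opens)}

def opens_from_filters(filters):
    """Procedure (2): A is open iff A is a neighborhood of each of its points."""
    return {a for a in SUBSETS if all(a in filters[y] for y in a)}

FILTERS = {x: neighborhood_filter(x, OPEN) for x in X}
RECOVERED = opens_from_filters(FILTERS)

print(is_topology(OPEN), RECOVERED == OPEN)
```

Starting from the open sets, passing to the neighborhood filters and back returns the original family, as stated above.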
We say that a topology $\mathcal{T}_2$ on $X$ is finer than a topology $\mathcal{T}_1$ on $X$ if $\mathcal{O}_2 \supset \mathcal{O}_1$. If $\mathcal{T}_\lambda$ is a family of topologies on $X$ then $\mathcal{O} = \bigcap_\lambda \mathcal{O}_\lambda$ defines the finest topology $\mathcal{T}_u$ which is coarser than each of the $\mathcal{T}_\lambda$. The fact that the set $\mathcal{O} = \bigcap_\lambda \mathcal{O}_\lambda$ satisfies the axioms for open sets has, as a consequence of AI, §4, the result that the topologies on $X$ form a complete lattice, since $\mathcal{P}(X)$ satisfies all the axioms for a set $\mathcal{O}$. Therefore, to each family $\mathcal{T}_\lambda$ there exists a coarsest topology $\mathcal{T}_o$ which is finer than all the $\mathcal{T}_\lambda$. $\mathcal{T}_u$ is called the greatest lower bound of all the $\mathcal{T}_\lambda$ and $\mathcal{T}_o$ is the least upper bound of the $\mathcal{T}_\lambda$.

Suppose we are given a set $X$ together with a family $X_\lambda$ of topological spaces and a set of maps $f_\lambda: X \to X_\lambda$. The coarsest topology in $X$ for which all the maps $f_\lambda$ are continuous is called the initial topology generated by the maps $f_\lambda: X \to X_\lambda$. If $Y$ is a topological space, then the mapping $g: Y \to X$ ($X$ with the initial topology) is continuous if and only if all composite maps $f_\lambda \circ g: Y \to X \to X_\lambda$ of $Y$ into $X_\lambda$ are continuous.

If $A$ is a subset of the space $X$ with the topology $\mathcal{T}$ we then denote the initial topology on $A$ defined by the canonical injection of $A$ in $X$ as the induced topology on $A$ or (more simply) the topology $\mathcal{T}$ on $A$. The open sets of this topology are precisely the intersections of open sets $B$ of $X$ with $A$.
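For a single map the initial topology consists exactly of the preimages of the open sets, and the induced topology is the special case of the canonical injection. The following sketch shows both on assumed finite examples (the spaces, the map `IMAGE` and the function name are hypothetical).

```python
X = frozenset({1, 2, 3})
OPEN_X = {frozenset(), frozenset({1}), frozenset({1, 2}), X}

def initial_topology(domain, f, opens_codomain):
    """Preimages of the open sets of the codomain under a single map f."""
    return {frozenset(x for x in domain if f(x) in b) for b in opens_codomain}

# Induced topology on the subset A: initial topology of the injection A -> X.
A = frozenset({2, 3})
INDUCED = initial_topology(A, lambda x: x, OPEN_X)

# Initial topology on Y for a map that is not injective.
Y = frozenset({"u", "v", "w"})
IMAGE = {"u": 1, "v": 1, "w": 3}
PULLED_BACK = initial_topology(Y, IMAGE.get, OPEN_X)

print(sorted(map(sorted, INDUCED)), sorted(map(sorted, PULLED_BACK)))
```

The induced topology coincides with the family of intersections $B \cap A$, in agreement with the last sentence above. For several maps one would additionally have to close the preimages under finite intersections and unions, which this sketch does not do.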
Let $X_\lambda$ be a family of topological spaces, and let $X$ be the Cartesian product of the $X_\lambda$. We define the product topology on $X$ to be the initial topology defined by the set of projections $X \to X_\lambda$.

A topological space $X$ is said to be a Hausdorff space if to each pair of different points $x, y$ there exist neighborhoods $U_x$ and $U_y$ of $x$ and $y$ such that $U_x \cap U_y = \emptyset$.

2 Uniform Spaces

A uniform structure on a set $X$ is a subset $\mathcal{U}$ of $\mathcal{P}(X \times X)$ which satisfies the following axioms:
(1) $\mathcal{U}$ is a filter.
(2) $W \in \mathcal{U} \Rightarrow \Delta \subset W$.
(3) $W \in \mathcal{U} \Rightarrow W^{-1} \in \mathcal{U}$.
(4) $W \in \mathcal{U} \Rightarrow$ there exists a $V \in \mathcal{U}$ such that $V \cdot V \subset W$.
Here we define $W^{-1} = \{(y, x) \mid (x, y) \in W\}$, $V \cdot W = \{(x, z) \mid \text{there exists a } y \in X \text{ for which } (x, y) \in V, (y, z) \in W\}$ and $\Delta = \{(x, x) \mid x \in X\}$.

The elements of $\mathcal{U}$ are called vicinities. A subset $Z$ of $\mathcal{P}(X \times X)$ is called a fundamental system of vicinities or the basis of a uniform structure if the filter generated by $Z$ satisfies axioms (2)-(4).

A topological structure is defined by a uniform structure in the following way: it is easy to show that, for each $x$, the sets $W(x) = \{y \mid (x, y) \in W\}$ with $W \in \mathcal{U}$ form a filter $\mathcal{F}_x$ for which $x \in U$ for all $U \in \mathcal{F}_x$, and that to each $U \in \mathcal{F}_x$ there exists a $V \in \mathcal{F}_x$ such that $U \in \mathcal{F}_y$ for each $y \in V$. This topology is called the topology generated by $\mathcal{U}$. A topology is said to be uniformizable if there exists a uniform structure which generates the topology. A uniform structure $\mathcal{U}$ is said to be separated if $\bigcap_{W \in \mathcal{U}} W = \Delta$. For the topology generated by $\mathcal{U}$ to be separated it is necessary and sufficient that $\mathcal{U}$ is separated.

A mapping $f: X \to Y$ between two uniform spaces $X$ and $Y$ is said to be uniformly continuous if, to each vicinity $V$ of $Y$, there exists a vicinity $W$ of $X$ such that $f(W) \subset V$, where $f(W)$ is defined by $f(W) = \{(f(x), f(y)) \mid (x, y) \in W\}$.

Let $\mathcal{U}_1$ and $\mathcal{U}_2$, $\mathcal{U}_1 \subset \mathcal{U}_2$, be two uniform structures; we then say that $\mathcal{U}_1$ is coarser than $\mathcal{U}_2$ and $\mathcal{U}_2$ is finer than $\mathcal{U}_1$.
If $\mathcal{U}_\lambda$ is a collection of uniform structures on $X$ and $\mathcal{U} = \bigcap_\lambda \mathcal{U}_\lambda$, it is easy to show that $\mathcal{U}$ is also a uniform structure. If $X$ is a set, $X_\lambda$ are uniform spaces, and $f_\lambda$ are maps $X \to X_\lambda$, then there is a coarsest uniform structure for which all mappings $f_\lambda$ are uniformly continuous. This is the initial uniform structure for the maps $f_\lambda: X \to X_\lambda$. A mapping $g$ of a uniform space $Y$ into $X$ with the initial uniform structure is uniformly continuous if and only if all the composite maps $f_\lambda \circ g: Y \to X \to X_\lambda$ are uniformly continuous. The topology corresponding to the initial uniform
structure on $X$ is precisely the initial topology for which all maps $f_\lambda: X \to X_\lambda$ are continuous.

The product uniform structure on the product set $X = \prod_\lambda X_\lambda$ is the initial uniform structure in which all projections $X \to X_\lambda$ are uniformly continuous. The induced uniform structure on a subset $A$ of a uniform space $X$ is the initial structure for which the canonical injection $A \to X$ is uniformly continuous.

A filter $\mathcal{F}$ in a uniform space is called a Cauchy filter if for each vicinity $W$ there exists an element $F \in \mathcal{F}$ for which $F \times F \subset W$. $X$ is said to be complete if every Cauchy filter converges to a point in $X$. For each uniform space $X$ it is possible to construct a complete separating(!) uniform space $\hat{X}$. If $X$ is itself separating, then we may identify $X$ with a dense subset of $\hat{X}$; $\hat{X}$ is then uniquely determined (up to an isomorphism). $\hat{X}$ is called the completion of $X$. A subset $A$ of a separating complete uniform space $X$ is a complete space if and only if $A$ is closed in $X$. If $X$, $Y$ are separating uniform spaces and $Y$ is complete, then a uniformly continuous map $f: X \to Y$ has a unique uniformly continuous extension $\hat{f}: \hat{X} \to Y$.

A metric on the set $X$ is a real valued function $d: X \times X \to \mathbf{R}$ which satisfies the following conditions: $d(x, y) \ge 0$; $d(x, y) = 0 \Leftrightarrow x = y$; $d(x, y) = d(y, x)$; and the so-called triangle inequality $d(x, z) \le d(x, y) + d(y, z)$. A metric space is a set $X$ together with a metric $d$. It is easy to show that a metric defines a basis for a uniform structure by the sets $W_\varepsilon = \{(x, y) \mid d(x, y) < \varepsilon\}$. This uniform structure is called the uniform structure generated by the metric. A uniform space $X$ is said to be metrizable if there is a metric which generates its uniform structure. $X$ is metrizable if and only if $X$ is separating and there exists a denumerable basis for the uniform structure. A topological space is said to be metrizable if there exists a metric for which the topology is that generated by the uniform structure generated by the metric.
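The claim that the sets $W_\varepsilon$ form a basis of a uniform structure rests on the triangle inequality, which gives $W_{\varepsilon/2} \cdot W_{\varepsilon/2} \subset W_\varepsilon$. The sketch below checks this, together with axioms (2) and (3), on an assumed finite metric space (seven integer points with $d(x, y) = |x - y|$).

```python
from fractions import Fraction

X = range(7)

def d(x, y):
    return abs(x - y)

def vicinity(eps):
    """W_eps = {(x, y) | d(x, y) < eps}."""
    return {(x, y) for x in X for y in X if d(x, y) < eps}

def inverse(W):
    return {(y, x) for (x, y) in W}

def compose(V, W):
    """V . W = {(x, z) | there is a y with (x, y) in V and (y, z) in W}."""
    return {(x, z) for (x, y) in V for (y2, z) in W if y == y2}

DIAGONAL = {(x, x) for x in X}
EPS = 3
W = vicinity(EPS)
W_HALF = vicinity(Fraction(EPS, 2))

print(DIAGONAL <= W, inverse(W) == W, compose(W_HALF, W_HALF) <= W)
```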
A sequence $\{x_n\}$ is called a Cauchy sequence if the corresponding filter $\mathcal{G}$ is a Cauchy filter, that is, if to each vicinity $W$ there exists an integer $N$ such that $(x_n, x_m) \in W$ for $n, m > N$. A sequence in a metric space is therefore a Cauchy sequence if and only if, to each $\varepsilon > 0$, there exists an $N$ such that $d(x_n, x_m) < \varepsilon$ for $n, m > N$. A metric space $X$ is complete if and only if every Cauchy sequence has a limit element in $X$.

In a separating topological space the following three conditions are equivalent:
(1) Every covering of $X$ by open sets contains a finite subcovering.
(2) The intersection of a set of closed sets is nonempty if and only if every finite subset of these closed sets has nonempty intersection.
(3) Every filter has an accumulation point in $X$.
If $X$ satisfies one of these properties, then $X$ is said to be a compact space. Every compact space is uniformizable, and the corresponding uniform structure is uniquely determined by the topology, and is the uniform
structure of the neighborhood filter of $\Delta$ in the topological space $X \times X$. In this way $X$ becomes a complete separating uniform space.

A separating uniform space is said to be precompact if its completion is compact. $X$ is precompact if and only if to each vicinity $W$ there exists a finite number of points $x_\nu \in X$ such that $\bigcup_\nu W(x_\nu) = X$, where $W(x_\nu) = \{x \mid (x, x_\nu) \in W\}$.

Every closed subset $A$ of a compact space $X$ is compact. A subset $A$ of a topological space $X$ ($X$ is not necessarily compact) is said to be relatively compact if its closure is compact. The product $X = \prod_\lambda X_\lambda$ of compact spaces is compact (Tychonov's theorem). If $f: X \to Y$ is a continuous map of a compact space $X$ into a separating topological space, then $f(X)$ is a compact subset of $Y$. For the special case in which $f$ is bijective, $X$ and $Y$ are homeomorphic. Thus, with the aid of the identity mapping $X \to X$ we find that if $X$ is compact with respect to the topology $\mathcal{T}_1$ and $\mathcal{T}_2$ is a coarser, separating topology on $X$, then $\mathcal{T}_2$ is identical to $\mathcal{T}_1$.

A subset $A$ of a precompact space is precompact; the product $X = \prod_\lambda X_\lambda$ of precompact spaces $X_\lambda$ is precompact. $X$ with the initial uniform structure generated by the maps $f_\lambda: X \to X_\lambda$, where the $X_\lambda$ are compact, is precompact. If the family of the $f_\lambda$ is denumerable, then $X$ is also metrizable. A compact and metrizable space is separable. A separating topological space $X$ is said to be locally compact if each point $x$ has a compact neighborhood.

3 Baire Spaces

A subset $A$ of a topological space $X$ is said to be nowhere dense in $X$ if the interior of the closure $\bar{A}$ of $A$ is empty. $A$ is said to be meager in $X$ if $A$ is the union of a denumerable set of nowhere dense subsets. A topological space is said to be a Baire space if no nonempty open set is meager. If a Baire space $X$ is the union of denumerably many closed sets $A_\nu$, then at least one of the $A_\nu$ must contain a nonempty open set; otherwise, all the $A_\nu$ would be nowhere dense and $X$ itself would be meager, although $X$ is open.
Every locally compact and every complete metrizable space is a Baire space.

4 Connectedness

A topological space $X$ is said to be connected if it is not the union of two open nonempty disjoint sets. This is equivalent to the condition that $X$ is not the union of two closed nonempty disjoint sets. If $X = X_1 \cup X_2$, where $X_1$, $X_2$ are open and $X_1 \cap X_2 = \emptyset$, then $X_1$, $X_2$ are also closed. A subset $A \subset X$ is said to be connected if $A$ together with the topology induced by $X$ is a connected space.
If $A$ is a connected subset of $X$ and $f$ is a continuous mapping $X \to Y$, then $f(A)$ is a connected subset of $Y$. If $A_\lambda$ is a family of connected subsets of $X$ and if $\bigcap_\lambda A_\lambda \ne \emptyset$, then $\bigcup_\lambda A_\lambda$ is a connected subset. If $x$ is an element of $X$, then the union of all connected $A$ which contain $x$ is a connected set, and is the largest connected set containing $x$. This set is called the connected component of $x$, and is a closed set. Two connected components (of $x$ and $y$) are either identical or disjoint, that is, the connected components partition $X$ into equivalence classes.

A topological space is said to be locally connected if each point has a fundamental system of connected neighborhoods, that is, if to each neighborhood $U$ of a point $x$ there exists a connected neighborhood $V$ of $x$ such that $V \subset U$.

A space is said to be linearly connected if, for each pair of points $x, y$, there exists a continuous path from $x$ to $y$, that is, a continuous map $f: [0,1] \to X$ such that $f(0) = x$ and $f(1) = y$.
APPENDIX III

Banach Spaces

We cannot develop the theory of Banach spaces here. Instead, we shall only briefly present a summary of important results without proof in order that readers who are not familiar with these results will be able to find them in other books, for example, in [33].

1 Linear Vector Spaces

A linear vector space $X$ over the field $K$ is an additive abelian group (that is, to each pair $x_1, x_2 \in X$ there corresponds a sum $x_1 + x_2 \in X$; the following axioms are also satisfied: $x_1 + (x_2 + x_3) = (x_1 + x_2) + x_3$; $x_1 + x_2 = x_2 + x_1$; there exists an element 0 such that $0 + x = x$ for all $x \in X$; to each $x \in X$ there exists a $y \in X$ for which $x + y = 0$; we write $y = -x$ and $x_1 + (-x_2) = x_1 - x_2$). In addition the elements of the field $K$ define maps of $X$ into $X$ which satisfy the following axioms (here let $\alpha, \beta \in K$ and let $e$ denote the unit element of $K$):
$$ex = x, \quad \alpha(\beta x) = (\alpha\beta)x, \quad \alpha(x_1 + x_2) = \alpha x_1 + \alpha x_2, \quad (\alpha + \beta)x = \alpha x + \beta x.$$
We shall only consider the two following cases: $K = \mathbf{R}$ (where $\mathbf{R}$ is the field of real numbers) and $K = \mathbf{C}$ (where $\mathbf{C}$ is the field of complex numbers).
We shall assume that the reader is familiar with the simple computation rules which follow directly from these axioms.

2 Normed Vector Spaces and Banach Spaces

In a linear vector space (over the field $K = \mathbf{R}$ or $\mathbf{C}$) the norm is a real function $\|x\|$ over $X$ which satisfies the following properties:
(1) $\|x\| \ge 0$ and $\|x\| = 0$ only for $x = 0$.
(2) $\|x + y\| \le \|x\| + \|y\|$.
(3) $\|\alpha x\| = |\alpha| \, \|x\|$ where $|\alpha|$ is the absolute magnitude of $\alpha$.
From (3) it follows that $\|-x\| = \|x\|$. From (2) it follows that $\|x - y\| \ge \big| \|x\| - \|y\| \big|$.

If a norm is defined on $X$, $X$ is called a normed space. $d(x, y) = \|x - y\|$ defines a metric (see AII, §2) and therefore a uniform structure and a topology. The addition $X \times X \to X$ is uniformly continuous; the scalar multiplication $K \times X \to X$ is continuous (and uniformly continuous on bounded sets).

The notion of a Cauchy sequence has already been defined in AII, §2. A normed space $X$ is complete if for every Cauchy sequence $x_n$ there exists a limit point $x \in X$, that is, $x_n \to x$. A complete normed space is called a Banach space. Every normed space may be completed to form a Banach space. The set of all $x \in X$ such that $\|x\| \le 1$ is called the unit sphere $X_{[1]}$ of $X$ and is closed in the norm topology.

3 The Dual Space for a Banach Space

A mapping $l: X \to K$ for which $l(x_1 + x_2) = l(x_1) + l(x_2)$ and $l(\alpha x) = \alpha l(x)$ (where $K$ is the field of scalars for the vector space $X$) is called a linear form or a linear functional. A linear form is continuous over a normed vector space if it is continuous for $x = 0$. This is equivalent to the condition that $l$ is bounded, that is, there exists a real number $c$ such that
$$|l(x)| \le c\|x\|.$$
A bounded linear form is clearly uniformly continuous, and therefore has a continuous extension from the normed vector space onto its completion. Let $X'$ denote the set of continuous linear forms; we find that $X' = \hat{X}'$. $X'$ is the dual space for $X$. If we introduce the norm in $X'$ as follows:
$$\|l\| = \sup_{\|x\| \le 1} |l(x)| = \sup_{\|x\| \le 1} l(x)$$
4 Weak Topologies 361 (the last equality is only valid if К = R), then X' is a Banach space because it is easy to verify the fact that a Cauchy sequence ln is a convergent sequence in x and that /(x) = lim,,.^ ln(x) is an element of X' with ||/„ — /|| —» 0. The fact that X' separates X, that is, from 1(хг) = l(x2) for all I e X' it follows that xt = x2 follows directly from the Hahn-Banach theorem. Since |/(x)| < ||/|| ||x|| it follows that 11*11 > SUP7TF- leX' ||/|| If II x || < 1 then, according to the Hahn-Banach theorem there exists an I satisfying |/(x)| > \l(y)\ for \\y\\ < 1 and we therefore find that |/(x)| > ||/||. Thus we obtain 11*11 > 1 => suP^]r > !• leX' ||/|| Thus for x = х'/Цх'Ц (1 + г), for arbitrary s > 0 it follows that (1 + £) sup -jiyjp > llx'll leX' ||/|| and we therefore obtain l(x) и-3й- A bilinear form (the canonical bilinear form of the dual pair X, X') is defined onlxl' by <x, /> = /(x). For xeX, ye X\x, у> is, for fixed x, a norm- continuous linear form over X'. X", the set of all norm-continuous linear forms over X' is, in general, larger than X. 4 Weak Topologies Let X be a Banach space, X' the dual Banach space for X and let <x, y) be the canonical bilinear form for the pair X, X'. a(X\ X) is the initial topology in X' for the maps X' -x,:—> R (or C) which are continuous for all xeX. Similarly a(X, X') is the initial topology in X for which the maps X R (or C) are continuous for all у e X'. These <x(...) topologies are often called the weak topologies corresponding to the dual pair X, X' because they are weaker than the norm topologies. The set of all continuous linear forms over X in the <r(X, Z')-topology is given by the у eX'. Similarly, the set of all continuous linear forms over X' in the g(X\ X)-topology is given by the x e X. The sets of continuous linear forms over X in both the norm topology and the a(X, Z')-topology are therefore identical. 
Uniform structures are defined by means of the weak topology; for example, the uniform structure defined by $\sigma(X, X')$ with vicinities $\{(x, y) \mid x - y \in U\}$, where $U$ is a neighborhood of 0. Thus every weakly continuous linear form is also uniformly continuous.
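The norm formula $\|x\| = \sup_{l \in X'} |l(x)|/\|l\|$ obtained in §3 can be made concrete in a finite-dimensional case. The sketch below assumes $X = \mathbf{R}^3$ with the norm $\|x\| = \sum_i |x_i|$ (an illustrative choice, not from the text); a linear form is represented by its coefficient vector, and its norm is evaluated on the unit vectors, where the supremum over the unit sphere is attained for this particular norm.

```python
def norm1(x):
    """The assumed norm on X: sum of absolute values of the components."""
    return sum(abs(t) for t in x)

def pairing(x, l):
    """Canonical bilinear form <x, l> = l(x) for l given by coefficients."""
    return sum(a * b for a, b in zip(x, l))

def dual_norm(l):
    """sup |l(x)| over norm1(x) <= 1, evaluated on the unit vectors."""
    n = len(l)
    unit_vectors = [[int(i == j) for j in range(n)] for i in range(n)]
    return max(abs(pairing(e, l)) for e in unit_vectors)

def sign(t):
    return (t > 0) - (t < 0)

x = [3, -4, 0]
best = [sign(t) for t in x]          # a linear form attaining the supremum
others = [[1, 1, 1], [2, 0, -5], [0, 1, 0]]

print(norm1(x), abs(pairing(x, best)), dual_norm(best))
```

For every linear form tried, $|l(x)| \le \|l\| \, \|x\|$, and the form built from the signs of the components of $x$ gives equality.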
Let $X$ be a linear vector space over the field of real numbers. A subset $K$ of $X$ is called a convex set if, for $x_1, x_2 \in K$, $\lambda x_1 + (1 - \lambda) x_2 \in K$ for $0 \le \lambda \le 1$. For $A \subset X$ let $\operatorname{co} A$ denote the smallest convex set in $X$ which contains $A$. The unit sphere of a Banach space is convex.

Let $X$ be a Banach space; let $\overline{\operatorname{co}}\, A$ and $\overline{\operatorname{co}}^\sigma A$ denote the smallest convex set closed in the norm topology and in the $\sigma(X, X')$-topology, respectively, which contains $A$. In $X$ we find that $\overline{\operatorname{co}}^\sigma A = \overline{\operatorname{co}}\, A$. This is not the case in $X'$! However, the unit sphere $X'_{[1]}$ of $X'$ is not only $\sigma(X', X)$-closed but also $\sigma(X', X)$-compact. This result follows from the fact that the maps $y \mapsto \langle x, y \rangle$ with $x \in X_{[1]}$ carry $X'_{[1]}$ into $[-1, 1] \subset \mathbf{R}$, that is, the images under these maps are relatively compact (see AII, §2), and that $X'_{[1]}$ is the polar set to $X_{[1]}$ (that is, the set of all $y \in X'$ for which $\langle x, y \rangle \le 1$ for all $x \in X_{[1]}$), which is $\sigma(X', X)$-closed.

The convex set generated by one (or finitely many) compact sets is compact, and therefore is closed! An extreme point of a convex set $A$ is an $x \in A$ for which the relation $x = \lambda x_1 + (1 - \lambda) x_2$, $x_1, x_2 \in A$, $0 < \lambda < 1$ is satisfied only for $x_1 = x_2 = x$. We shall denote the set of extreme points of $A$ by $\partial_e A$. If $A$ is compact, then $A = \overline{\operatorname{co}}(\partial_e A)$ according to the Krein-Milman theorem. Therefore we obtain $X'_{[1]} = \overline{\operatorname{co}}^\sigma(\partial_e X'_{[1]})$.

A linear form over $X'$ is $\sigma(X', X)$-continuous if and only if it is $\sigma(X', X)$-continuous over the unit sphere. This corresponds to the fact that a linear subspace of $X'$ is $\sigma(X', X)$-closed when its intersection with the unit sphere is $\sigma(X', X)$-closed.

The topologies $\sigma(X', X)$ and $\sigma(X', X_{[1]})$ are identical on $X'$, where $\sigma(X', X_{[1]})$ is the initial topology for which the maps $y \mapsto \langle x, y \rangle$ of $X'$ into $\mathbf{R}$ are continuous for all $x \in X_{[1]}$. If $A$ is a subset of $X$ for which $\overline{\operatorname{co}}(A \cup -A) = X_{[1]}$, then the topologies $\sigma(X', X)$ and $\sigma(X', A)$ are also identical on $X'_{[1]}$.
This result follows directly from the fact that both the $\sigma(X', X_{[1]})$- and $\sigma(X', A)$-topologies are weaker than the $\sigma(X', X)$-topology and are also separating, and must therefore coincide on a $\sigma(X', X)$-compact set. If $X$ is norm-separable we can choose $A$ to be denumerable. We define a norm in $X'$ as follows:
$$\|y\|_A = \sum_{x \in A} \lambda_x |\langle x, y \rangle|,$$
where the $\lambda_x > 0$ and $\sum_{x \in A} \lambda_x < \infty$. Again it is easy to verify that the topology determined by the norm $\|y\|_A$ coincides with $\sigma(X', X)$ on $X'_{[1]}$. If $X$ is norm-separable then the $\sigma(X', X)$-topology on $X'_{[1]}$ is compact and metrizable, and therefore, according to AII, §2, $X'_{[1]}$ is separable in the $\sigma(X', X)$-topology. Thus $X'$ is also separable in the $\sigma(X', X)$-topology.

5 Linear Maps of Banach Spaces

If $X_1$ and $X_2$ are linear vector spaces, then a map $T: X_1 \to X_2$ is said to be linear if $T(x + y) = T(x) + T(y)$ and $T(\alpha x) = \alpha T(x)$. If $X_1$ and $X_2$ are Banach spaces, then $T$ is continuous with respect to the norm topology only
if there exists a real number $C$ such that $\|Tx\| \le C\|x\|$ for all $x$. Then $T$ is also said to be bounded. If $T$ is bounded, then for each $y \in X_2'$ the relation $\langle Tx, y \rangle = \langle x, y' \rangle$ gives a norm-continuous linear form over $X_1$, that is, it defines a $y' \in X_1'$. It is easy to verify that $y \to y'$ defines a linear and bounded map $T': X_2' \to X_1'$ which we call the dual map to $T$ or the adjoint map. $T'$ is continuous with respect to the $\sigma(X_2', X_2)$- and $\sigma(X_1', X_1)$-topologies. The following important relation is satisfied: $\langle Tx, y \rangle = \langle x, T'y \rangle$. A linear map $S: X_2' \to X_1'$ is $\sigma$-continuous if and only if there exists a bounded linear map $T: X_1 \to X_2$ for which $S = T'$. $S$ is $\sigma$-continuous in $X_2'$ if and only if it is $\sigma$-continuous on the unit sphere of $X_2'$.

6 Ordered Vector Spaces

A linear vector space $X$ over the field of real numbers $\mathbf{R}$ is said to be ordered if it is a partially ordered set in the sense of AI, §1 and if the following conditions are satisfied: $x_1 \ge x_2 \Rightarrow x_1 + x \ge x_2 + x$ for all $x \in X$; $x \ge 0$, $\alpha \ge 0 \Rightarrow \alpha x \ge 0$.

A convex cone $C$ is defined as a convex set which has the property that if $x \in C$ then so does $\lambda x$ for all $\lambda \ge 0$. The cone $C$ is said to be proper if $C \cap (-C) = \{0\}$. It is easy to see that the specification of an order structure in $X$ is equivalent to the specification of a proper cone $C$ as follows: $x \ge 0$ if and only if $x \in C$. This cone $C$ is called the positive cone; the positive cone of $X$ is often denoted by $X_+$.

A convex subset $K$ of $C$ is said to be a basis for the cone $C$ if to each $x \in C$, $x \ne 0$, there exists exactly one number $\lambda(x)$ such that $\lambda(x) x \in K$. The set $\hat{K}$ defined by $\hat{K} = \bigcup_{0 \le \lambda \le 1} \lambda K$ is equal to the set $\{x \mid x \in C \text{ and } x \le w \text{ for some } w \in K\}$ and is called the truncated cone generated by the base $K$.

An ordered Banach space $X$ is said to be base-normed if there exists a basis $K$ for the cone $X_+$ for which $K$ is norm-closed and $X_{[1]} = \operatorname{co}(K \cup (-K))$. It is a simple matter to show that $X_+$ will also be norm-closed. In addition, it can be shown (see [33]) that $X$ is generating, that is, $X = X_+ - X_+$.
An affine functional on the convex set $K$ is a map $f\colon K \to \mathbf{R}$ for which

$$w_1, w_2 \in K,\ 0 \le \lambda \le 1 \;\Rightarrow\; f(\lambda w_1 + (1 - \lambda) w_2) = \lambda f(w_1) + (1 - \lambda) f(w_2).$$

It is easy to show that each affine functional on a basis $K$ of a base-norm Banach space may be uniquely extended to all of $X$ as a linear functional because $X = X_+ - X_+$ and $X_+ = \bigcup_{\lambda \ge 0} \lambda K$. There is a 1:1 correspondence between the linear functionals over $X$ and the affine functionals over $K$.
From $X_{[1]} = \mathrm{co}(K \cup -K)$ it directly follows that $\|w\| = 1$ for all $w \in K$ and that $\|x\|^{-1}x \in K$ for all $x \in X_+$, $x \ne 0$. Since $X_+$ is generating, each $x \in X$ may be expressed in the form $x = \alpha w_1 - \beta w_2$, where $w_1, w_2 \in K$ and $\alpha, \beta \ge 0$. Thus it follows that $\|x\| \le \alpha + \beta$. To each $\varepsilon > 0$ we may choose $w_1, w_2, \alpha, \beta$ such that $\|x\| \ge \alpha + \beta - \varepsilon$ (see [33]). We say that $X$ has the minimal decomposition property if $w_1, w_2, \alpha, \beta$ may be chosen so that $\|x\| = \alpha + \beta$. All the examples of base-norm spaces in this book satisfy the minimal decomposition property.

If $x_n \in X_+$ is a bounded increasing (or decreasing) sequence, then there exists an $x \in X_+$ for which $x_n \to x$. This follows directly from the fact that for $n > m$ we have $x_n - x_m \in X_+$, and therefore there exists a $w \in K$ such that $x_n = x_m + \lambda w$, where $\lambda \ge 0$. With $x_m = \|x_m\| w_m$ ($w_m \in K$) it follows that

$$x_n = (\|x_m\| + \lambda)\left(\frac{\|x_m\|}{\|x_m\| + \lambda}\, w_m + \frac{\lambda}{\|x_m\| + \lambda}\, w\right),$$

and we therefore obtain $\|x_n\| = \|x_m\| + \lambda = \|x_m\| + \|x_n - x_m\|$.

For the norm in the dual Banach space $X'$ we obtain

$$\|y\| = \sup\{|\langle x, y\rangle| \mid x \in X_{[1]} = \mathrm{co}(K \cup -K)\} = \sup\{|\langle x, y\rangle| \mid x \in K\}.$$

$X_+$ determines a polar cone in $X'$ as follows:

$$X'_+ = \{y \mid \langle x, y\rangle \ge 0 \text{ for all } x \in X_+\} = \{y \mid \langle x, y\rangle \ge 0 \text{ for all } x \in K\}.$$

$X'_+$ is not only $\sigma(X', X)$-closed, but is also $\sigma(X', X)$-complete (see [33]) because all positive linear functionals over $X$ are norm-continuous. $X'_+$ determines an order for $X'$ because from $y \in X'_+ \cap (-X'_+)$ and $X = X_+ - X_+$ it easily follows that $y = 0$.

If $l(x)$ is a linear functional which is bounded on $K$, that is, if $|l(x)| \le c$ for all $x \in K$, then from $x = \alpha w_1 - \beta w_2$ (where $w_1, w_2 \in K$) and $\|x\| \ge \alpha + \beta - \varepsilon$ (see above) it follows that

$$|l(x)| \le \alpha |l(w_1)| + \beta |l(w_2)| \le (\alpha + \beta)c \le c\|x\| + \varepsilon c$$

for arbitrary $\varepsilon > 0$ and therefore $|l(x)| \le c\|x\|$. Every linear functional which is bounded on $K$ (and hence each bounded affine functional on $K$) is therefore an element of $X'$. The condition $l(w) = 1$ for all $w \in K$ defines an element of $X'$ which we shall denote by $\mathbf{1}$.
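The minimal decomposition property can be illustrated in a simple finite-dimensional case. The following is only a sketch under assumptions that are not taken from the text: the space is $\mathbf{R}^n$ with the norm $\|x\| = \sum_i |x_i|$, the positive cone consists of the vectors with nonnegative components, and the base $K$ is the set of probability vectors. The helper names `minimal_decomposition` and `l1_norm` are hypothetical.

```python
def l1_norm(x):
    """Assumed base norm on R^n: the sum of absolute values."""
    return sum(abs(t) for t in x)


def minimal_decomposition(x):
    """Return (alpha, w1, beta, w2) with x = alpha*w1 - beta*w2, w1 and w2 in K.

    K is assumed to be the set of probability vectors (nonnegative entries
    summing to 1). The positive and negative parts of x give the split.
    """
    pos = [max(t, 0.0) for t in x]
    neg = [max(-t, 0.0) for t in x]
    alpha, beta = sum(pos), sum(neg)
    n = len(x)
    uniform = [1.0 / n] * n  # any element of K will do when the weight is 0
    w1 = [t / alpha for t in pos] if alpha > 0 else uniform
    w2 = [t / beta for t in neg] if beta > 0 else uniform
    return alpha, w1, beta, w2


x = [3.0, -1.0, 0.0, -2.0]
alpha, w1, beta, w2 = minimal_decomposition(x)
reconstructed = [alpha * a - beta * b for a, b in zip(w1, w2)]
```

In this example the two weights add up to the norm of `x`, which is the equality $\|x\| = \alpha + \beta$ required by the minimal decomposition property.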
The unit sphere of $X'$ is therefore equal to

$$X'_{[1]} = \{y \mid |\langle w, y\rangle| \le 1 \text{ for all } w \in K\} = \{y \mid -1 \le \langle w, y\rangle \le 1 \text{ for all } w \in K\} = \{y \mid -\mathbf{1} \le y \le \mathbf{1}\}.$$

We shall denote the set of all $y$ for which $y_1 \le y \le y_2$ by $[y_1, y_2]$ and call it the order interval generated by $y_1$ and $y_2$. Therefore we obtain $X'_{[1]} = [-\mathbf{1}, \mathbf{1}]$. Because of this property we shall call $X'$ an order unit space.
From $X'_{[1]} = [-\mathbf{1}, \mathbf{1}]$ it easily follows that $X'$ is generating, that is, $X' = X'_+ - X'_+$.

If $y_n \in X'_+$ is a decreasing sequence, then there exists a $y \in X'_+$ to which the sequence converges in the $\sigma(X', X)$-topology, a result which follows directly from the fact that every set of the form $[0, \alpha\mathbf{1}]$ is compact in the $\sigma(X', X)$-topology.

In the same manner in which every positive linear functional over $X$ is norm-continuous, every positive linear map $T$ (that is, $Tx \ge 0$ for $x \ge 0$) of a base-norm space $X_1$ into a base-norm space $X_2$ is norm-continuous (see [33]). Therefore, according to §5, the adjoint map $T'\colon X_2' \to X_1'$ exists. $T'$ is therefore also positive.

The set of norm-continuous maps $X_1 \to X_2$ together with the norm $\|T\| = \sup\{\|Tx\| \mid \|x\| \le 1\}$ forms a Banach space $Y$ because a Cauchy sequence is also uniformly convergent. $Y$ becomes an ordered vector space by means of the cone $Y_+ = \{T \mid T \text{ is positive}\}$.
APPENDIX IV

Operators in Hilbert Space

Since the mathematics of Hilbert space is an essential tool in quantum mechanics, we shall briefly outline the proofs of important theorems. In particular, we shall provide a few examples of the application of a number of general theorems in AIII.

1 The Hilbert Space Structure Type

A Hilbert space is:

(I) A linear vector space $\mathcal{H}$ over a field $K$ (as defined in AIII, §1). Here we shall only consider the case in which $K = \mathbf{C}$, the field of complex numbers.

(II) There is a map, the so-called inner product, defined on $\mathcal{H} \times \mathcal{H} \to \mathbf{C}$ which is denoted by $\langle x, y\rangle$ and satisfies the following axioms: for $\alpha \in \mathbf{C}$

$$\langle x, \alpha y\rangle = \alpha\langle x, y\rangle, \qquad \langle x, y_1 + y_2\rangle = \langle x, y_1\rangle + \langle x, y_2\rangle,$$

$$\langle x, y\rangle = \overline{\langle y, x\rangle}; \qquad \langle x, x\rangle \ge 0, \quad = 0 \text{ only if } x = 0.$$

From $\langle x, y\rangle = \overline{\langle y, x\rangle}$ it follows that $\langle x, x\rangle$ is real. From the axioms it easily follows that

$$\langle \alpha x, y\rangle = \bar{\alpha}\langle x, y\rangle, \qquad \langle x_1 + x_2, y\rangle = \langle x_1, y\rangle + \langle x_2, y\rangle, \qquad \langle x, 0\rangle = \langle 0, x\rangle = 0.$$
Two vectors $x, y \in \mathcal{H}$ are said to be orthogonal if $\langle x, y\rangle = 0$. If $x \ne 0$ and if we define a vector $z$ by

$$y = \frac{\langle x, y\rangle}{\langle x, x\rangle}\, x + z,$$

then $z$ satisfies $\langle z, x\rangle = 0$ and we obtain

$$\langle y, y\rangle = \frac{|\langle x, y\rangle|^2}{\langle x, x\rangle} + \langle z, z\rangle,$$

from which we obtain the Schwarz inequality

$$\langle x, x\rangle\langle y, y\rangle \ge |\langle x, y\rangle|^2,$$

which is also valid for the case $x = 0$. Here the equality is satisfied if and only if $z = 0$, that is, $y = \lambda x$.

If we define $\|x\| = \langle x, x\rangle^{1/2}$, then the Schwarz inequality reads $|\langle x, y\rangle| \le \|x\|\,\|y\|$; from

$$\|x - y\|^2 = \|x\|^2 + \|y\|^2 - \langle x, y\rangle - \langle y, x\rangle$$

we obtain the triangle inequality $\|x + y\| \le \|x\| + \|y\|$ and $\|x - y\| \ge \big|\,\|x\| - \|y\|\,\big|$. Therefore $\|\ldots\|$ satisfies conditions (1)-(3) for a norm from AIII, §2. Therefore $\mathcal{H}$ is a normed space with norm $\|\ldots\|$.

The convergence $x_n \to x$ of a sequence is defined in the sense of the norm, that is, $\|x_n - x\| \to 0$. With the help of the Schwarz inequality it easily follows that the inner product $\mathcal{H} \times \mathcal{H} \to \mathbf{C}$ is a continuous map, that is, from $x_n \to x$ and $y_m \to y$ we obtain the relation $\langle x_n, y_m\rangle \to \langle x, y\rangle$.

A pre-Hilbert space is defined as a set $\mathcal{H}$ which satisfies all the above axioms except $\langle x, x\rangle = 0 \Rightarrow x = 0$. From $\langle x, x\rangle = 0$ it follows directly that $\langle \alpha x, \alpha x\rangle = 0$. With $\langle x, y\rangle = |\langle x, y\rangle| e^{i\delta}$ it follows that

$$0 \le \|x - e^{-i\delta}y\|^2 = \|x\|^2 + \|y\|^2 - 2|\langle x, y\rangle| \qquad (1.1)$$

and from $\langle x, x\rangle = 0$ and $\langle y, y\rangle = 0$ we obtain $\langle x, y\rangle = 0$ and therefore $\|x + y\|^2 = 0$. Therefore the set $\mathcal{T}_0 = \{x \mid \langle x, x\rangle = \|x\|^2 = 0\}$ is a subspace of $\mathcal{H}$. From (1.1) it follows, for $\|y\| = 0$ and with $\lambda x$ ($\lambda > 0$) instead of $x$, that $0 \le \lambda^2\|x\|^2 - 2\lambda|\langle x, y\rangle|$ for all $\lambda > 0$, and therefore $\langle x, y\rangle = 0$ for $x \in \mathcal{H}$ and $y \in \mathcal{T}_0$. Therefore it easily follows that the relation $x_1 \sim x_2$ if $\|x_1 - x_2\| = 0$ defines an equivalence relation, and the value of $\langle x, y\rangle$ depends only on the equivalence classes to which $x$ and $y$ belong. In this way the set of equivalence classes $\mathcal{H}/\mathcal{T}_0$ is a linear vector space over $\mathbf{C}$ which satisfies (II).

We now present the third axiom for a Hilbert space:

(III) $\mathcal{H}$ is a Banach space, that is, it is complete with respect to the norm.
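The decomposition $y = (\langle x, y\rangle / \langle x, x\rangle)\,x + z$ used in the derivation of the Schwarz inequality can be checked numerically. The sketch below assumes the finite-dimensional space $\mathbf{C}^3$ with the usual inner product (conjugate-linear in the first argument, as in axiom (II)); the vectors and the helper name `inner` are illustrative choices, not taken from the text.

```python
def inner(x, y):
    """<x, y>: conjugate-linear in x, linear in y, as in axiom (II)."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


x = [1 + 2j, 0.5j, -1.0]
y = [2 - 1j, 1.0, 3j]

# Component of y along x, and the remainder z orthogonal to x.
coeff = inner(x, y) / inner(x, x)
z = [b - coeff * a for a, b in zip(x, y)]

lhs = (inner(x, x) * inner(y, y)).real   # <x,x><y,y>
rhs = abs(inner(x, y)) ** 2              # |<x,y>|^2
pythagoras = rhs / inner(x, x).real + inner(z, z).real
```

Here `lhs - rhs` equals $\langle x, x\rangle\langle z, z\rangle$, so the inequality is strict exactly when $z \ne 0$.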
Every Cauchy sequence in $\mathcal{H}$ therefore has a limit element. Each noncomplete space $\mathcal{H}$ which satisfies (I) and (II) can be completed (see AII, §2).
We shall only consider Hilbert spaces which satisfy the axiom:

(IV) $\mathcal{H}$ is separable. (For the notion of separability, see AII, §1.)

A subset $G$ of $\mathcal{H}$ is called a linear basis (or Hamel basis) if the span of $G$ is dense in $\mathcal{H}$. From (IV) it follows that there exist denumerable linear bases. If there exists a countable or a finite linear basis $G$, then all finite sums $\sum_\nu \alpha_\nu x_\nu$, where $x_\nu \in G$ and the $\alpha_\nu$ are rational complex numbers, form a denumerable subset which is dense in $\mathcal{H}$, that is, (IV) is satisfied. The smallest cardinality of a linear basis of $\mathcal{H}$ is called the linear dimension of $\mathcal{H}$ which, according to (IV), can only be either finite or denumerable.

We shall now give two important examples of Hilbert spaces (to prove that these examples satisfy axioms (I)-(IV) see [34]).

Let $\mathcal{H}$ be the set of all complex number sequences $x = (\alpha_1, \alpha_2, \ldots)$ for which $\sum_\nu |\alpha_\nu|^2 < \infty$. For $x = (\alpha_1, \alpha_2, \ldots)$, $y = (\beta_1, \beta_2, \ldots)$ we set $x + y = (\alpha_1 + \beta_1, \alpha_2 + \beta_2, \ldots)$, $\alpha x = (\alpha\alpha_1, \alpha\alpha_2, \ldots)$ and $\langle x, y\rangle = \sum_\nu \bar{\alpha}_\nu \beta_\nu$, where the convergence of $\sum_\nu \bar{\alpha}_\nu \beta_\nu$ is easily proven with the aid of the Schwarz inequality

$$\Big|\sum_{\nu=1}^{N} \bar{\alpha}_\nu \beta_\nu\Big|^2 \le \sum_{\nu=1}^{N} |\alpha_\nu|^2 \sum_{\nu=1}^{N} |\beta_\nu|^2.$$

For the second example we consider a $\sigma$-ring $\mathcal{A}$ of subsets of a set $M$, that is, $\mathcal{A}$ is a Boolean ring with respect to the intersection, the union and complements, and the union of countably many elements of $\mathcal{A}$ is again an element of $\mathcal{A}$ (for such an example consider the $\sigma$-ring in IV, §2.5). Suppose $M \in \mathcal{A}$. On $\mathcal{A}$ let a $\sigma$-additive real measure $\mu$ be defined for which $\mu(\eta) \ge 0$, and where we permit $\mu(\eta) = \infty$ for some $\eta$. Let $\mu(\emptyset) = 0$. For a sequence $\eta_i \in \mathcal{A}$ satisfying $\eta_i \cap \eta_j = \emptyset$ for $i \ne j$ we therefore obtain

$$\mu\Big(\bigcup_i \eta_i\Big) = \sum_i \mu(\eta_i).$$

Thus it follows that $\mu(\eta) \ge \mu(\sigma)$ for $\eta \supset \sigma$ since $\eta = \sigma \cup (\eta \cap \sigma^*)$, where $\sigma^*$ is the complement of $\sigma$. If $\mu(M) = +\infty$ then there may exist a sequence $\eta_i$ such that $\eta_{i+1} \supset \eta_i$, $\mu(\eta_i)$ is finite and $M = \bigcup_i \eta_i$. In addition this sequence may be chosen such that every $\mathcal{A}_{\eta_i} = \{\sigma \mid \sigma \in \mathcal{A},\ \sigma \subset \eta_i\}$ is separable with respect to the metric (this metric is described in IV, §1.4)

$$d(\sigma_1, \sigma_2) = \mu(\sigma_1 \cap \sigma_2^*) + \mu(\sigma_2 \cap \sigma_1^*).$$

A complex function $f\colon M \to \mathbf{C}$ is said to be quadratically integrable if it is measurable and

$$\int_M |f(x)|^2 \, d\mu(x) < \infty.$$
Two functions $f_1, f_2$ are said to be equivalent if the set $\{x \mid f_1(x) \ne f_2(x)\}$ is of $\mu$-measure zero, or, equivalently,

$$\int_M |f_1(x) - f_2(x)|^2 \, d\mu(x) = 0.$$

We define the Hilbert space $\mathcal{H}$ as the set of all classes of equivalent quadratically integrable functions. Since $f(x) = f_1(x) + f_2(x)$, $\alpha f(x)$, and

$$\int_M \overline{f_1(x)}\, f_2(x) \, d\mu(x) \qquad (1.2)$$

depend only on the classes, the operations $f_1(x) + f_2(x)$, $\alpha f(x)$ make $\mathcal{H}$ into a complex vector space, and an inner product $\langle f_1, f_2\rangle$ is defined by (1.2). It can be proven (see [34]) that, under the above assumptions about $(M, \mathcal{A}, \mu)$, $\mathcal{H}$ is a separable Hilbert space. We shall denote this Hilbert space by $\mathcal{L}^2(M, d\mu)$.

An example for $(M, \mathcal{A}, \mu)$ is obtained by choosing $M$ to be the set $\mathbf{R}$ of real numbers, $\mathcal{A}$ the set of Lebesgue measurable sets and $\mu$ as Lebesgue measure. For the case of Lebesgue measure it is customary to replace $d\mu(x)$ in the above integral by the simpler notation $dx$.

2 Orthogonal Systems and Closed Subspaces

A sequence of vectors $x_\nu$ for which

$$\langle x_\nu, x_\mu\rangle = \delta_{\nu\mu} = \begin{cases} 1 & \text{for } \nu = \mu, \\ 0 & \text{for } \nu \ne \mu \end{cases}$$

is called a normed orthogonal (orthonormal) system. The elements of an orthonormal system are linearly independent because if $\sum_{\nu=1}^{n} \lambda_\nu x_\nu = 0$, then by taking an inner product with $x_\mu$ we obtain $\lambda_\mu = \langle x_\mu, \sum_{\nu=1}^{n} \lambda_\nu x_\nu\rangle = 0$.

For each $x \in \mathcal{H}$ a vector $p$ is defined by $x = \sum_{\nu=1}^{N} x_\nu\langle x_\nu, x\rangle + p$ which satisfies $\langle x_\nu, p\rangle = 0$ for all $\nu \le N$. Thus it follows that

$$\|x\|^2 = \sum_{\nu=1}^{N} |\langle x_\nu, x\rangle|^2 + \|p\|^2$$

and we obtain Bessel's inequality $\sum_{\nu=1}^{N} |\langle x_\nu, x\rangle|^2 \le \|x\|^2$. We therefore obtain $\sum_{\nu=1}^{\infty} |\langle x_\nu, x\rangle|^2 < \infty$, from which we conclude that $\langle x_\nu, x\rangle \to 0$.
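Bessel's inequality and the identity $\|x\|^2 = \sum_\nu |\langle x_\nu, x\rangle|^2 + \|p\|^2$ can be verified on a small example. This is a sketch assuming the space $\mathbf{C}^4$ and a two-element orthonormal system chosen purely for illustration; none of the concrete vectors come from the text.

```python
import math


def inner(x, y):
    """<x, y>: conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


s = 1 / math.sqrt(2)
# An orthonormal system in C^4 that is not a basis (only two vectors).
system = [[1, 0, 0, 0], [0, s, 1j * s, 0]]
x = [2 - 1j, 1.0, 3j, -4.0]

# Expansion coefficients <x_v, x> and the remainder p = x - sum_v x_v <x_v, x>.
coeffs = [inner(xv, x) for xv in system]
approx = [sum(c * xv[k] for c, xv in zip(coeffs, system)) for k in range(len(x))]
p = [a - b for a, b in zip(x, approx)]

norm_x_sq = inner(x, x).real
bessel_sum = sum(abs(c) ** 2 for c in coeffs)
norm_p_sq = inner(p, p).real
```

The remainder `p` is orthogonal to every element of the system, so `bessel_sum` can never exceed `norm_x_sq`; the gap is exactly `norm_p_sq`.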
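Since $\big\|\sum_{\nu=n}^{m} x_\nu\langle x_\nu, x\rangle\big\|^2 = \sum_{\nu=n}^{m} |\langle x_\nu, x\rangle|^2$, the sum $\sum_{\nu=1}^{\infty} x_\nu\langle x_\nu, x\rangle$ converges in norm, and we therefore obtain

$$\sum_{\nu=1}^{\infty} |\langle x_\nu, x\rangle|^2 \le \|x\|^2.$$

We shall now show that the expression $\big\|x - \sum_{\nu=1}^{N} \alpha_\nu x_\nu\big\|$ takes on its minimum value when $\alpha_\nu = \langle x_\nu, x\rangle$ because, for $p$ defined above and $q = x - \sum_{\nu=1}^{N} \alpha_\nu x_\nu$, it follows that $q = \sum_{\nu=1}^{N} (\langle x_\nu, x\rangle - \alpha_\nu)x_\nu + p$ and we therefore obtain

$$\|q\|^2 = \sum_{\nu=1}^{N} |\langle x_\nu, x\rangle - \alpha_\nu|^2 + \|p\|^2.$$

An orthonormal system is said to be an orthonormal basis if it is a basis for $\mathcal{H}$. For an orthonormal basis we therefore obtain

$$x = \sum_\nu x_\nu\langle x_\nu, x\rangle$$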
since the $\sum_\nu \alpha_\nu x_\nu$ are dense in $\mathcal{H}$ and, according to a previous result, $\|p\| \le \|q\|$.

The cardinality of an orthonormal basis is equal to the dimension of $\mathcal{H}$. First, it follows immediately from the definition of the dimension that the cardinality cannot be less than the dimension of $\mathcal{H}$. Next we show that the cardinality of an orthonormal basis cannot be greater than denumerable. According to (IV) there exists a denumerable set $\{y_\nu\}$ which is dense in $\mathcal{H}$. Therefore, to each $x$ of an orthonormal basis there exists a $y_{\nu(x)}$ such that $\|y_{\nu(x)} - x\| < \tfrac{1}{2}\sqrt{2}$. To each pair of different $x_1, x_2$ of an orthonormal basis, $y_{\nu(x_1)}$ and $y_{\nu(x_2)}$ must be different because if $y_{\nu(x_1)} = y_{\nu(x_2)}$ then it follows that

$$\|x_1 - x_2\| \le \|x_1 - y_{\nu(x_1)}\| + \|x_2 - y_{\nu(x_2)}\| < \sqrt{2}$$

in contradiction to $\|x_1 - x_2\|^2 = \|x_1\|^2 + \|x_2\|^2 = 2$. Therefore every orthonormal basis is at most countable. Thus, in the case in which the dimension of $\mathcal{H}$ is denumerable, the theorem is proven.

Let the dimension of $\mathcal{H}$ equal $n$ (finite); then a basis consists of finitely many $y_1, \ldots, y_n$. Thus it follows that each $x \in \mathcal{H}$ can be written in the form $x = \sum_{\nu=1}^{n} \alpha_\nu y_\nu$, and thus there cannot be more than $n$ linearly independent vectors in $\mathcal{H}$. Thus an orthonormal basis can have at most $n$ elements.

We will now show that if $M$ is an arbitrary denumerable subset of $\mathcal{H}$ then it is possible to construct an orthonormal set of vectors which has the same linear span as does $M$. For $y_\nu \in M$ set $x_1 = y_1/\|y_1\|$, $x_2 = p_2/\|p_2\|$ where $p_2 = y_2 - \langle x_1, y_2\rangle x_1$, providing that $p_2 \ne 0$ (if $p_2 = 0$ then $y_2$ and $x_1$ are linearly dependent, and we can simply eliminate $y_2$ from $M$ and renumber the elements $y_n$). We will now assume that $M$ is a linearly independent set. Recursively, we may set

$$x_n = p_n/\|p_n\| \quad \text{where} \quad p_n = y_n - \sum_{\nu=1}^{n-1} \langle x_\nu, y_n\rangle x_\nu;$$

this procedure is known as the Schmidt orthogonalization procedure. It is easy to verify that the set of $x_\nu$ forms an orthonormal system which has the same linear span as does $M$.

$\mathcal{T}$ is called a closed subspace if $\mathcal{T}$ is a subspace and is closed in norm.
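The Schmidt orthogonalization procedure described above can be sketched for vectors in $\mathbf{C}^n$ as follows. This is a minimal illustration, not the book's own formulation: the function name `schmidt`, the tolerance `tol` used to decide when $p_n = 0$, and the sample vectors are all assumptions. Linearly dependent vectors are dropped, as the text suggests.

```python
import math


def inner(x, y):
    """<x, y>: conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x).real)


def schmidt(vectors, tol=1e-12):
    """Return an orthonormal list with the same linear span as `vectors`."""
    basis = []
    for y in vectors:
        # p_n = y_n - sum_v <x_v, y_n> x_v
        p = list(y)
        for x in basis:
            c = inner(x, y)
            p = [pi - c * xi for pi, xi in zip(p, x)]
        n = norm(p)
        if n > tol:  # p = 0: y depends linearly on earlier vectors, skip it
            basis.append([pi / n for pi in p])
    return basis


# The third vector is the sum of the first two and is therefore eliminated.
M = [[1, 1, 0], [1, 0, 1], [2, 1, 1], [0, 1j, 1]]
basis = schmidt(M)
```

For numerically delicate inputs one would usually subtract the projections from the running remainder instead of from the original vector; the version above follows the formula for $p_n$ literally.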
It follows that $\mathcal{T}$ is complete. If $\mathcal{T}$ is a subspace, then the closure of $\mathcal{T}$ in $\mathcal{H}$ is a closed subspace. It follows directly that the intersection of arbitrarily many closed subspaces of $\mathcal{H}$ is a closed subspace. According to AI, Th. 4.1 the closed subspaces of $\mathcal{H}$ form a complete lattice where $\mathcal{T}_1 \wedge \mathcal{T}_2 = \mathcal{T}_1 \cap \mathcal{T}_2$ and $\mathcal{T}_1 \vee \mathcal{T}_2$ is equal to the intersection of all closed subspaces $\mathcal{T}$ for which $\mathcal{T} \supset \mathcal{T}_1$ and $\mathcal{T} \supset \mathcal{T}_2$.

If $M$ is a subset of $\mathcal{H}$ let $(M)$ denote the subspace generated by $M$ and $[M]$ be the closed subspace generated by $M$. Therefore we find that $[M]$ is the intersection of all closed subspaces $\mathcal{T}$ for which $\mathcal{T} \supset M$, and $[M]$ is therefore equal to the closure of $(M)$.

If $p$ is orthogonal to all elements of $M$, it directly follows that it is orthogonal to all elements of $(M)$, and from the continuity of the inner
product, is also orthogonal to all elements in the closure of $(M)$, that is, of $[M]$. In this way it follows that all elements $p$ which are orthogonal to $M$ form a closed subspace which we denote by $M^\perp$. Therefore we find that $M^\perp = (M)^\perp = [M]^\perp$. In addition it follows that $[M] \subset (M^\perp)^\perp$. Later we shall find that $[M] = (M^\perp)^\perp$.

First we shall show that if $\mathcal{T}$ is a closed subspace, then each $x \in \mathcal{H}$ may be uniquely represented in the form $x = q + p$ where $q \in \mathcal{T}$ and $p \in \mathcal{T}^\perp$. Since the uniqueness is trivial, we need only demonstrate the representation. For $x \in \mathcal{T}$ we obtain $q = x$ and $p = 0$. We now consider the case in which $x \notin \mathcal{T}$. Then $\inf_{y \in \mathcal{T}} \|x - y\| = \rho \ne 0$. Therefore there exists a sequence $y_\nu \in \mathcal{T}$ for which $\|x - y_\nu\| \to \rho$. We have

$$\|y_\nu - y_\mu\|^2 = 2\|y_\nu - x\|^2 + 2\|y_\mu - x\|^2 - \|y_\nu + y_\mu - 2x\|^2.$$

Since $\tfrac{1}{2}(y_\nu + y_\mu) \in \mathcal{T}$ we obtain $\|\tfrac{1}{2}(y_\nu + y_\mu) - x\| \ge \rho$, from which we conclude that

$$\|y_\nu - y_\mu\|^2 \le 2\|y_\nu - x\|^2 + 2\|y_\mu - x\|^2 - 4\rho^2.$$

From this it follows that the $y_\nu$ form a Cauchy sequence and that $y_\nu \to q \in \mathcal{T}$, from which it follows that $\|x - y_\nu\| \to \|x - q\|$ and finally $\|x - q\| = \rho$.

For $p = x - q$ it is only necessary to show that $p \in \mathcal{T}^\perp$, that is, $\langle h, p\rangle = 0$ for all $h \in \mathcal{T}$. For $p = h\langle h, p\rangle/\|h\|^2 + r$ we find that $\|r\| \le \|p\| = \rho$. For $p = x - q$ it follows that $r = x - (q + h\langle h, p\rangle/\|h\|^2)$. Since $q + h\langle h, p\rangle/\|h\|^2 \in \mathcal{T}$ we must have $\|r\| \ge \rho$. Therefore $\|r\| = \rho = \|p\|$, and since $\|p\|^2 = \|r\|^2 + |\langle h, p\rangle|^2/\|h\|^2$, we finally obtain $\langle h, p\rangle = 0$.

If $x \in (M^\perp)^\perp = ([M]^\perp)^\perp$, then from $x = p + q$, $q \in [M]$ and $p \in [M]^\perp$ it follows that $\langle x, p\rangle = 0$, so that $\langle x, p\rangle = \langle q + p, p\rangle = \|p\|^2 = 0$, from which it follows that $x \in [M]$, and we have shown that $[M] = (M^\perp)^\perp$.

Let $G$ be a basis in $\mathcal{H}$ and let $\mathcal{T}$ be a closed subspace in $\mathcal{H}$; then each element $x$ in $G$ can be written in the form $x = p + q$ where $q \in \mathcal{T}$ and $p \in \mathcal{T}^\perp$. Let the set of $q$ obtained in this way be denoted by $G_{\mathcal{T}}$; similarly let the set of $p$ be denoted by $G_{\mathcal{T}^\perp}$.
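It easily follows that $G_{\mathcal{T}}$ is a basis for $\mathcal{T}$ and $G_{\mathcal{T}^\perp}$ is a basis for $\mathcal{T}^\perp$, and that $G_{\mathcal{T}} \cup G_{\mathcal{T}^\perp}$ is a basis for $\mathcal{H}$. With the help of the Schmidt orthogonalization procedure it is easy to show that it is possible to select an orthonormal basis the elements $x_\nu$ of which are either elements of $\mathcal{T}$ or of $\mathcal{T}^\perp$.

We say that $\mathcal{T}_1$ and $\mathcal{T}_2$ are orthogonal (written $\mathcal{T}_1 \perp \mathcal{T}_2$) if $\mathcal{T}_1 \subset \mathcal{T}_2^\perp$ and therefore $\mathcal{T}_2 \subset \mathcal{T}_1^\perp$ holds. If $\mathcal{T}_1 \perp \mathcal{T}_2$ then the set of all $x + y$, where $x \in \mathcal{T}_1$ and $y \in \mathcal{T}_2$, is a closed subspace. Here we need only prove that the subspace is closed. From

$$\|x_n + y_n - (x_m + y_m)\|^2 = \|x_n - x_m\|^2 + \|y_n - y_m\|^2$$

it follows directly that the $x_n$ and the $y_n$ form Cauchy sequences if the $x_n + y_n$ form a Cauchy sequence.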
For $\mathcal{T}_1 \perp \mathcal{T}_2$ we therefore obtain $\mathcal{T}_1 \vee \mathcal{T}_2 = \mathcal{T}_1 + \mathcal{T}_2$. If $\mathcal{T}_1 \perp \mathcal{T}_2$ we then write $\mathcal{T}_1 \oplus \mathcal{T}_2$ instead of $\mathcal{T}_1 + \mathcal{T}_2$. From AI, D 1.2 it follows that the operation $\perp$ in the lattice of closed subspaces of $\mathcal{H}$ is an orthocomplementation.

3 The Banach Space of Bounded Operators

A linear map $A\colon \mathcal{H}_1 \to \mathcal{H}_2$ is called a linear operator, or more simply an operator, and satisfies $A(\alpha x) = \alpha Ax$ and $A(x_1 + x_2) = Ax_1 + Ax_2$. $A$ is continuous if and only if there exists a number $C$ for which $\|Ax\| \le C\|x\|$. A continuous operator is therefore also called a bounded operator (see also AIII, §5).

The values $\langle x, Ax\rangle$ for all $x \in \mathcal{H}$ uniquely determine an operator $A$. This fact is a direct consequence of the following simple identity:

$$4\langle x, Ay\rangle = \langle x + y, A(x + y)\rangle - \langle x - y, A(x - y)\rangle - i\langle x + iy, A(x + iy)\rangle + i\langle x - iy, A(x - iy)\rangle.$$

$A$ is uniquely determined by the matrix $a_{\nu\mu} = \langle x_\nu, Ax_\mu\rangle$ with respect to a complete orthonormal basis $\{x_\nu\}$ since $Ax_\mu = \sum_\nu x_\nu\langle x_\nu, Ax_\mu\rangle$.

An operator $A = A_1 + A_2$ is defined by $Ax = A_1x + A_2x$ and is bounded if both $A_1$ and $A_2$ are bounded. We find that the operator $A = A_1A_2$, defined by $Ax = A_1(A_2x)$, is bounded if $A_1$ and $A_2$ are bounded. For $\alpha \in \mathbf{C}$, $\alpha A$ is defined by $(\alpha A)x = \alpha(Ax)$. Let $\mathcal{L}(\mathcal{H})$ denote the set of bounded operators of $\mathcal{H}$. $\mathcal{L}(\mathcal{H})$ is therefore a vector space over $\mathbf{C}$ and is also an algebra. The unit element is the operator $\mathbf{1}x = x$, the null element is the operator $0x = 0$. It is easy to see that $\|A\| = \sup_{\|x\| \le 1} \|Ax\|$ defines a norm in $\mathcal{L}(\mathcal{H})$. The fact that $\mathcal{L}(\mathcal{H})$ is a Banach space follows directly from general theorems; however, we shall show that this is the case below.

In addition to the norm topology in $\mathcal{L}(\mathcal{H})$ we may also introduce the pointwise topology as follows: We say that a sequence $A_n \in \mathcal{L}(\mathcal{H})$ converges pointwise if, for each $x \in \mathcal{H}$, the $A_nx$ form a Cauchy sequence. From $\|A_nx - A_mx\| \le \|A_n - A_m\|\,\|x\|$ it directly follows that every Cauchy sequence in the norm topology also converges pointwise.
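The polarization identity quoted in this section, which shows that the values $\langle x, Ax\rangle$ determine $A$, can be checked numerically. The sketch below assumes $\mathcal{H} = \mathbf{C}^2$ and an arbitrary (not self-adjoint) matrix; the matrix, the vectors and the helper names `apply`, `quad` and `combine` are illustrative assumptions only.

```python
def inner(x, y):
    """<x, y>: conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def apply(A, x):
    """Matrix-vector product Ax for a matrix given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def quad(A, u):
    """The 'diagonal' value <u, Au>."""
    return inner(u, apply(A, u))


def combine(u, v, s):
    """Return u + s*v."""
    return [a + s * b for a, b in zip(u, v)]


A = [[1 + 1j, 2], [-1j, 3 - 2j]]
x = [1, 2j]
y = [1 - 1j, 0.5]

lhs = 4 * inner(x, apply(A, y))
rhs = (quad(A, combine(x, y, 1)) - quad(A, combine(x, y, -1))
       - 1j * quad(A, combine(x, y, 1j)) + 1j * quad(A, combine(x, y, -1j)))
```

Because only values of the form $\langle u, Au\rangle$ enter `rhs`, two operators with the same diagonal values must have the same matrix elements $\langle x, Ay\rangle$.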
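If $A_n$ is a pointwise convergent sequence, then a linear operator $A$ is defined by $A_nx \to y$ where $y = Ax$. $A$ is then also bounded, that is, $A \in \mathcal{L}(\mathcal{H})$.

PROOF. We shall first show that the $A_nx$ are uniformly bounded, that is, there exists a $D$ such that $\|A_nx\| \le D\|x\|$ for all $n$. For this purpose it is sufficient to show that there exists a $y$ and a sphere $K_\delta(y) = \{x \mid \|x - y\| \le \delta\}$, $\delta > 0$, such that $\|A_nx'\| \le a$ for all $x' \in K_\delta(y)$ and all $n$, because for arbitrary $x \ne 0$ we then have

$$\|A_nx\| = \frac{\|x\|}{\delta}\left\|A_n\Big(y + \delta\frac{x}{\|x\|}\Big) - A_ny\right\| \le \frac{2a}{\delta}\|x\|.$$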
If such a sphere $K_\delta(y)$ does not exist, then we may stepwise construct the following sequence: first, find a $y_1$ and an $n_1$ such that $\|A_{n_1}y_1\| > 2$. From the continuity of $A_{n_1}$ we can find a sphere $K_{\rho_1}(y_1)$ for which $\rho_1 \le 1$ and $\|A_{n_1}x\| > 1$ for $x \in K_{\rho_1}(y_1)$. Then we may find in the interior of $K_{\rho_1}(y_1)$ a $y_2$ and $n_2$ which satisfy $\|A_{n_2}y_2\| > 3$ together with a sphere $K_{\rho_2}(y_2) \subset K_{\rho_1}(y_1)$ where $\rho_2 \le \tfrac{1}{2}$ and $\|A_{n_2}x\| > 2$ for $x \in K_{\rho_2}(y_2)$. In this way we obtain a sequence of spheres $K_{\rho_\nu}(y_\nu) \subset K_{\rho_{\nu-1}}(y_{\nu-1})$ for which $\rho_\nu \le 1/\nu$ and $\|A_{n_\nu}x\| > \nu$ for $x \in K_{\rho_\nu}(y_\nu)$. Thus for $\mu > \nu$ we obtain $\|y_\mu - y_\nu\| < 2\rho_\nu$, and therefore there exists a $y$ for which $y_\nu \to y$ and $y \in K_{\rho_\nu}(y_\nu)$ for all $\nu$. Here $\|A_{n_\nu}y\| > \nu$, in contradiction to the fact that $A_ny$ is convergent.

From $\|A_nx\| \le D\|x\|$ it follows that, in the limit, $\|Ax\| \le D\|x\|$.

In the same way we may prove that, for a sequence of bounded linear forms $l_n\colon \mathcal{H} \to \mathbf{C}$ which converge pointwise, $|l_n(x)| \le D\|x\|$ for all $n$, and hence that a bounded linear form $l$ is defined by $l_n(x) \to l(x)$.

For a pointwise convergent sequence $A_nx \to Ax$ we write $A_n \to A$; for norm convergence we write $A_n \xrightarrow{n} A$. If $A_n$ is a norm-Cauchy sequence, then, for all $x$, $A_nx$ is a Cauchy sequence. Therefore, there exists an $A$ for which $A_n \to A$. We now show that $A_n \xrightarrow{n} A$ as follows: Let $A'_n = A - A_n$; $A'_n$ is also a norm-Cauchy sequence. If we choose $N$ such that, for $n, m > N$, the relationship $\|A'_n - A'_m\| < \varepsilon$ holds, then for $\|x\| \le 1$ we obtain:

$$\|A'_nx\| \le \|(A'_n - A'_m)x\| + \|A'_mx\| \le \|A'_n - A'_m\| + \|A'_mx\| < \varepsilon + \|A'_mx\|.$$

For fixed $x$ we obtain $\|A'_mx\| \to 0$, and therefore it follows that $\|A'_nx\| \le \varepsilon$ for $n > N$ and arbitrary $x$ with $\|x\| \le 1$. Therefore $\|A'_n\| = \sup_{\|x\| \le 1}\|A'_nx\| \le \varepsilon$, that is, $\|A'_n\| = \|A - A_n\| \to 0$.

We have shown that $\mathcal{L}(\mathcal{H})$ is a Banach space.

It is easy to show that $\|AB\| \le \|A\|\,\|B\|$. If this relation holds in an algebra which is also a normed space, then it is called a normed algebra; if it is also complete, it is called a Banach algebra.

From the norm convergence $A_n \xrightarrow{n} A$, $B_m \xrightarrow{n} B$ it easily follows that $A_nB_m \xrightarrow{n} AB$.
From the pointwise convergence $A_n \to A$, $B_m \to B$ it follows that $A_nB_m \to AB$, which follows directly from

$$\|A_nB_mx - ABx\| \le \|A_n(B_m - B)x\| + \|(A_n - A)Bx\| \le D\|(B_m - B)x\| + \|(A_n - A)Bx\| \to 0.$$

4 Bounded Linear Forms

As in the case of a Banach space (see AIII, §3) we may also investigate bounded linear forms for $\mathcal{H}$. We shall now show that if $l\colon \mathcal{H} \to \mathbf{C}$ is a bounded linear form, then there exists a $y \in \mathcal{H}$ for which $l(x) = \langle y, x\rangle$.

It is easy to see that the set $\mathcal{T}_0 = \{h \mid l(h) = 0\}$ is a closed subspace of $\mathcal{H}$. If $\mathcal{T}_0 = \mathcal{H}$ it follows that $l(x) = 0 = \langle 0, x\rangle$. If $\mathcal{T}_0 \ne \mathcal{H}$, then there exists, according to §2, a $y \ne 0$ which is orthogonal to $\mathcal{T}_0$; therefore $l(y) \ne 0$. For $p = x - (l(x)/l(y))y$ it follows that $l(p) = 0$, that is, $p \in \mathcal{T}_0$ and therefore
$\langle p, y\rangle = 0$. From $x = (l(x)/l(y))y + p$ it follows, by taking the inner product with $h = (\overline{l(y)}/\|y\|^2)y$, that $\langle h, x\rangle = l(x)$.

From the Schwarz inequality it easily follows that $\langle h, x\rangle$ is a bounded linear form for all $h \in \mathcal{H}$. Since $\langle h_1, x\rangle = \langle h_2, x\rangle$ for all $x$ implies that $h_1 = h_2$, it follows that the map $l \to h$ with $l(x) = \langle h, x\rangle$ is a bijective map $\mathcal{H}' \to \mathcal{H}$, where $\mathcal{H}'$ is, in the sense of AIII, §3, the dual Banach space to $\mathcal{H}$. For this correspondence we obtain $l_1 + l_2 \to h_1 + h_2$ and $\alpha l \to \bar{\alpha}h$. From $\sup_{\|x\| \le 1} |\langle h, x\rangle| = \|h\|$ it follows that the norm defined on $\mathcal{H}'$ corresponds, according to this bijective mapping, to the norm in $\mathcal{H}$.

A convergent sequence in $\mathcal{H}'$ (in the sense of the topology $\sigma(\mathcal{H}', \mathcal{H})$ in AIII, §4) corresponds to a sequence $y_n$ in $\mathcal{H}$ for which $\langle y_n, x\rangle$ converges pointwise. Such a sequence is called a weakly convergent sequence. According to §3, to each pointwise convergent sequence $l_n$ of linear forms there exists a bounded linear form $l$ which satisfies $l_n(x) \to l(x)$ for all $x$ as a limit, and there exists a $C$ such that $|l_n(x)| \le C\|x\|$. Therefore, for each weakly convergent sequence $y_n$ in $\mathcal{H}$ there exists a $C$ such that $\|y_n\| \le C$ and a limit element $y$ towards which the $y_n$ converge (we denote the weak convergence of $y_n$ by $y_n \rightharpoonup y$). $\mathcal{H}$ is therefore sequence-complete.

In §2 we saw that the relation $\langle x_\nu, x\rangle \to 0$ is satisfied for the elements $x_\nu$ of an orthonormal basis. If a sequence $x_\nu$ satisfies the relations $\langle x_\nu, x_\mu\rangle = 0$ for $\mu \ne \nu$ and there exists a $C$ such that $\|x_\nu\| \le C$ for all $\nu$, then it follows that $x_\nu \rightharpoonup 0$.

From the general theorems in AIII, §4 it follows that every bounded set in $\mathcal{H}$ is weakly relatively compact. This result can also be easily shown directly: If $M$ is a bounded set (that is, $\|y\| \le C$ for all $y \in M$), then the set of $\langle y, x\rangle$ for fixed $x$ is bounded for $y \in M$ since $|\langle y, x\rangle| \le \|y\|\,\|x\| \le C\|x\|$. Let $\mathcal{D}$ be a denumerable dense subset of $\mathcal{H}$ and let $x_\nu$ ($\nu = 1, 2, \ldots$)
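denote the elements of $\mathcal{D}$. In the usual way we may choose a sequence $y^{(1)}_\mu \in M$ for which $\langle y^{(1)}_\mu, x_1\rangle$ converges; from the sequence of the $y^{(1)}_\mu$ we may choose a subsequence $y^{(2)}_\mu$ for which $\langle y^{(2)}_\mu, x_2\rangle$ converges, etc. For the diagonal sequence $y^{(\mu)}_\mu$, $\langle y^{(\mu)}_\mu, x_\nu\rangle$ converges for every fixed $x_\nu \in \mathcal{D}$. We will show that this situation holds for all $x \in \mathcal{H}$. This follows from

$$|\langle y^{(\mu)}_\mu - y^{(\rho)}_\rho, x\rangle| \le |\langle y^{(\mu)}_\mu - y^{(\rho)}_\rho, x_\nu\rangle| + |\langle y^{(\mu)}_\mu - y^{(\rho)}_\rho, x - x_\nu\rangle| \le |\langle y^{(\mu)}_\mu - y^{(\rho)}_\rho, x_\nu\rangle| + 2C\|x - x_\nu\|,$$

if we first choose $x_\nu$ such that $2C\|x - x_\nu\| < \varepsilon$ and then, for fixed $x_\nu$, choose $N$ such that for $\mu, \rho > N$ we obtain $|\langle y^{(\mu)}_\mu - y^{(\rho)}_\rho, x_\nu\rangle| < \varepsilon$.

If $A$ is a bounded operator, then $\langle y, Ax\rangle$ is, for fixed $y$, a bounded linear form over $x$; therefore there exists a $y' \in \mathcal{H}$ for which $\langle y, Ax\rangle = \langle y', x\rangle$. Since $y'$ is uniquely determined, $y \to y'$ defines a map $\mathcal{H} \to \mathcal{H}$; it is easy to verify that the map is linear. We denote the operator defined by this map by $A^+$. Thus $A^+$ is determined by $\langle y, Ax\rangle = \langle A^+y, x\rangle$. Since

$$\sup_{\|y\| \le 1} \|A^+y\| = \sup_{\|y\| \le 1}\ \sup_{\|x\| \le 1} |\langle A^+y, x\rangle|$$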
it easily follows that $A^+$ is bounded, and that $\|A^+\| = \|A\|$. It is also easy to show that $(\alpha A)^+ = \bar{\alpha}A^+$, $(A + B)^+ = A^+ + B^+$, $(AB)^+ = B^+A^+$ and $(A^+)^+ = A$. We call $A^+$ the adjoint operator corresponding to $A$. An operator $A$ is said to be self-adjoint (or Hermitian) if $A^+ = A$.

An operator $A$ is said to be compact (or completely continuous) if $A$ maps the unit sphere (and therefore every bounded set) onto a relatively compact set (with respect to the norm). We shall now show that the above condition is equivalent to the condition that for every weakly convergent sequence $y_\nu \rightharpoonup 0$ the relation $Ay_\nu \to 0$ is satisfied, that is, $\|Ay_\nu\| \to 0$.

If $A$ is compact, then for each denumerable set $\{y_\nu\}$ for which $\|y_\nu\| \le 1$ the set $Ay_\nu$ has an accumulation point in the norm topology. Since, for a weakly convergent sequence $y_\nu \rightharpoonup 0$, $\langle y_\nu, x\rangle$ is uniformly bounded, there exists a number $C$ such that $\|y_\nu\| \le C$. If we consider the sequence $y_\nu C^{-1}$ instead of $y_\nu$, we may then assume that the $y_\nu$ are elements of the unit sphere, and that the $Ay_\nu$ must therefore have an accumulation point $y$ in the norm topology. Therefore there exists a subsequence $y_{\nu_i}$ such that $Ay_{\nu_i} \to y$. Thus, for arbitrary $x \in \mathcal{H}$ we obtain $\langle Ay_{\nu_i}, x\rangle \to \langle y, x\rangle$, and since $y_{\nu_i} \rightharpoonup 0$ we also obtain $\langle Ay_{\nu_i}, x\rangle = \langle y_{\nu_i}, A^+x\rangle \to 0$; therefore $y = 0$. Thus the sequence $Ay_\nu$ must converge: $Ay_\nu \to 0$.

Conversely, we assume that for $y_\nu \rightharpoonup 0$ it follows that $Ay_\nu \to 0$. Let $z_\nu$ be an arbitrary denumerable subset of the unit sphere. $A$ will be compact if the $Az_\nu$ have an accumulation point in the norm topology. Since the unit sphere is weakly compact, there exists a subsequence $z_{\nu_i}$ which is weakly convergent: $z_{\nu_i} \rightharpoonup z$, where $\|z\| \le 1$; for $y_{\nu_i} = \tfrac{1}{2}(z_{\nu_i} - z)$ we obtain $\|y_{\nu_i}\| \le 1$ and $y_{\nu_i} \rightharpoonup 0$, from which we obtain $Ay_{\nu_i} \to 0$, that is, $Az_{\nu_i} \to Az$. The $Az_\nu$ therefore have an accumulation point (in norm).

Let $\mathcal{L}_c(\mathcal{H})$ denote the set of all compact operators in $\mathcal{L}(\mathcal{H})$. It is easy to verify that $\mathcal{L}_c(\mathcal{H})$ is a linear subspace of $\mathcal{L}(\mathcal{H})$.
$\mathcal{L}_c(\mathcal{H})$ is, moreover, closed with respect to the norm topology and is itself a Banach space.

PROOF. Let $A_n \in \mathcal{L}_c(\mathcal{H})$ and let $A_n$ converge to $A$ in norm. Let $y_\nu \rightharpoonup 0$; then, for each fixed $n$, $\|A_ny_\nu\| \to 0$. From

$$\|Ay_\nu\| \le \|(A - A_n)y_\nu\| + \|A_ny_\nu\| \le \|A - A_n\|\,\|y_\nu\| + \|A_ny_\nu\|$$

it follows that, since $\|y_\nu\| \le C$ for a suitable value of $C$,

$$\|Ay_\nu\| \le \|A - A_n\|\,C + \|A_ny_\nu\|,$$

from which we conclude that $\|Ay_\nu\| \to 0$.

5 The Banach Space $\mathcal{L}_r(\mathcal{H})$

The set of self-adjoint bounded operators is evidently a linear vector space over the field of real numbers $\mathbf{R}$. We may also consider $\mathcal{L}(\mathcal{H})$ to be a vector space over $\mathbf{R}$, and the set of self-adjoint operators forms a subspace of $\mathcal{L}(\mathcal{H})$ which we shall denote by $\mathcal{L}_r(\mathcal{H})$. If $\mathcal{L}_r(\mathcal{H})$ is norm-closed in $\mathcal{L}(\mathcal{H})$ then
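$\mathcal{L}_r(\mathcal{H})$ will be a Banach space. This will be the case if the limit (in norm) of a sequence of self-adjoint operators is a self-adjoint operator. Since $\|A^+\| = \|A\|$, it follows from the norm convergence $A_n \xrightarrow{n} A$ that $A_n^+ \xrightarrow{n} A^+$ (for pointwise convergence, $A_n^+ \to A^+$ does not necessarily follow from $A_n \to A$!). Thus $A_n^+ = A_n$ implies $A^+ = A$. Therefore $\mathcal{L}_r(\mathcal{H})$ is a Banach space.

$\mathcal{L}_r(\mathcal{H})$ is, however, also closed with respect to pointwise convergence, because from $A_n \to A$ and from $\langle x, A_ny\rangle = \langle A_nx, y\rangle$ it follows that $\langle x, Ay\rangle = \langle Ax, y\rangle$.

Since $\mathcal{L}_c(\mathcal{H})$ is a Banach subspace of $\mathcal{L}(\mathcal{H})$, $\mathcal{L}_{cr}(\mathcal{H}) = \mathcal{L}_c(\mathcal{H}) \cap \mathcal{L}_r(\mathcal{H})$ is a Banach space.

For $A \in \mathcal{L}_r(\mathcal{H})$ we find that $\overline{\langle x, Ax\rangle} = \langle Ax, x\rangle = \langle x, Ax\rangle$, and hence that $\langle x, Ax\rangle$ is real. We may introduce a partial order in $\mathcal{L}_r(\mathcal{H})$ as follows: $A \ge B$ if $\langle x, Ax\rangle \ge \langle x, Bx\rangle$ for all $x \in \mathcal{H}$. Here we note that $\mathcal{L}_r(\mathcal{H})$ is, in the sense of AIII, §6, an ordered vector space with positive cone $\mathcal{L}_{r+}(\mathcal{H}) = \{A \mid A \in \mathcal{L}_r(\mathcal{H}) \text{ and } A \ge 0\}$. It is easy to show that $\mathcal{L}_{r+}(\mathcal{H})$ is closed in the norm topology.

We will now show that $\mathcal{L}_r(\mathcal{H})$ is an order unit space (see AIII, §6), that is, the unit sphere of $\mathcal{L}_r(\mathcal{H})$ is the order interval $[-\mathbf{1}, \mathbf{1}] = \{A \mid -\mathbf{1} \le A \le \mathbf{1}\}$.

From $\|Ax\| \le C\|x\|$ it directly follows that $|\langle x, Ax\rangle| \le \|x\|\,\|Ax\| \le C\|x\|^2 = C\langle x, x\rangle$, that is, $-C\mathbf{1} \le A \le C\mathbf{1}$. Conversely, if $\mu_1\mathbf{1} \le A \le \mu_2\mathbf{1}$, then for $Ax \ne 0$ (the case $Ax = 0$ is trivial), setting $y = \|Ax\|^{-1/2}\|x\|^{1/2}Ax = \alpha Ax$, we obtain

$$\|Ax\|^2 = \Big\langle A\tfrac{1}{\alpha}x, y\Big\rangle = \tfrac{1}{4}\Big[\Big\langle A\big(\tfrac{1}{\alpha}x + y\big), \tfrac{1}{\alpha}x + y\Big\rangle - \Big\langle A\big(\tfrac{1}{\alpha}x - y\big), \tfrac{1}{\alpha}x - y\Big\rangle\Big] \le \tfrac{\mu}{4}\Big[\big\|\tfrac{1}{\alpha}x + y\big\|^2 + \big\|\tfrac{1}{\alpha}x - y\big\|^2\Big] = \mu\|x\|\,\|Ax\|,$$

where $\mu = \max\{|\mu_1|, |\mu_2|\}$. Therefore we obtain $\|Ax\| \le \mu\|x\|$. For $C = 1$, $\mu_1 = -1$, $\mu_2 = 1$ we obtain our assertion about the unit sphere of $\mathcal{L}_r(\mathcal{H})$.

If $A$ and $B$ are commuting self-adjoint operators, then $AB = BA$ is a self-adjoint operator. In the special case in which $B \ge 0$ we have $A^2B = ABA = BA^2 \ge 0$ since $\langle x, ABAx\rangle = \langle (Ax), B(Ax)\rangle$.

We will now show that from $A \ge 0$, $B \ge 0$ and $AB = BA$ it follows that $AB \ge 0$. Since $A$ is bounded, there exists a number $c$ such that $\|Ax\| \le c\|x\|$. For $A_1 = c^{-1}A$ we have previously found that $0 \le A_1 \le \mathbf{1}$.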
Recursively we define $A_{n+1} = A_n - A_n^2$; we shall now show that $0 \le A_n \le \mathbf{1}$ by induction:

$$A_{n+1} = A_n^2(\mathbf{1} - A_n) + A_n(\mathbf{1} - A_n)^2 \ge 0$$

and

$$\mathbf{1} - A_{n+1} = (\mathbf{1} - A_n) + A_n^2 \ge 0$$
providing $0 \le A_n \le \mathbf{1}$. From $A_1 = \sum_{\nu=1}^{n} A_\nu^2 + A_{n+1}$ it follows that, since $A_{n+1} \ge 0$, $\sum_{\nu=1}^{n} \|A_\nu x\|^2 \le \langle x, A_1x\rangle$, and we find that $\|A_\nu x\| \to 0$, that is, $A_\nu \to 0$. Thus we find that $A_1 = \sum_{\nu=1}^{\infty} A_\nu^2$. Each $A_\nu$ is self-adjoint and commutes with $B$, as can easily be proven by induction. Thus $AB = cA_1B = c\sum_{\nu=1}^{\infty} A_\nu^2B \ge 0$.

From the preceding results we obtain the following important convergence properties of monotone sequences of commuting operators $A_n \in \mathcal{L}_r(\mathcal{H})$.

Suppose that $A_n$ is monotonically decreasing and suppose that $A_n \ge 0$ for all $n$. According to the above theorem $(A_m - A_n)A_m \ge 0$ and $A_n(A_m - A_n) \ge 0$ for $m < n$, that is,

$$\langle x, A_m^2x\rangle \ge \langle x, A_mA_nx\rangle \ge \langle x, A_n^2x\rangle.$$

The sequence of the $\langle x, A_mA_nx\rangle$ must therefore converge to the same value as the monotonically decreasing sequence $\langle x, A_n^2x\rangle = \|A_nx\|^2$, so that

$$\|(A_m - A_n)x\|^2 = \langle x, A_m^2x\rangle + \langle x, A_n^2x\rangle - 2\langle x, A_mA_nx\rangle \to 0.$$

The $A_nx$ therefore form a Cauchy sequence, and there exists an $A$ such that $A_n \to A$. From $\langle x, A_nx\rangle \ge 0$ it follows that $\langle x, Ax\rangle \ge 0$, that is, $A \ge 0$.

If $B_n$ is a monotonically decreasing (or increasing) sequence of commuting self-adjoint operators and there exists an operator $B'$ which commutes with all $B_n$ and satisfies $B_n \ge B'$ (or $B_n \le B'$), then by considering the sequence $A_n = B_n - B'$ (or $A_n = B' - B_n$) we obtain the result that there exists a $B$ such that $B_n \to B$ and $B \ge B'$ (or $B \le B'$).

6 Projection Operators

Let $\mathcal{T}$ be a closed subspace of $\mathcal{H}$. Since each $x \in \mathcal{H}$ has a unique partition $x = q + p$, where $q \in \mathcal{T}$, $p \in \mathcal{T}^\perp$ (see §2), the relation $q = Px$ defines a linear operator $P$, which we call a projection operator on $\mathcal{T}$. Since $\|x\|^2 = \|q\|^2 + \|p\|^2$ and since $x = q$ for $x \in \mathcal{T}$, we find that $\|P\| = 1$ (providing that $P$ is not equal to $0$, that is, $\mathcal{T} \ne \{0\}$). From $q \in \mathcal{T}$ it follows that $P^2x = Pq = Px$, that is, $P^2 = P$. For a corresponding partition of $y$, $y = r + s$, $r \in \mathcal{T}$, $s \in \mathcal{T}^\perp$, it follows that

$$\langle y, Px\rangle = \langle r + s, q\rangle = \langle r, q\rangle = \langle r, q + p\rangle = \langle Py, x\rangle,$$

that is, $P \in \mathcal{L}_r(\mathcal{H})$.
Conversely, if $P \in \mathcal{L}_r(\mathcal{H})$ and $P^2 = P$, then there exists a closed subspace $\mathcal{T}$ upon which $P$ projects. Thus the set $\mathcal{T} = P\mathcal{H}$ is a closed subspace. The fact that $P$ projects upon $\mathcal{T}$ follows directly from the identity $x = Px + (\mathbf{1} - P)x$; it is easy to show that $(\mathbf{1} - P)x \in \mathcal{T}^\perp$. Thus we find that $\mathbf{1} - P$ is a projection operator upon $\mathcal{T}^\perp$. We therefore write $\mathbf{1} - P = P^\perp$.

Thus we find that $P \to P\mathcal{H}$ is a bijection of the set of projection operators onto the set of all closed subspaces of $\mathcal{H}$.

For the special case in which $\mathcal{T}$ is the one-dimensional subspace spanned by $y$ (with $\|y\| = 1$) we shall often denote the projection operator on $\mathcal{T}$ by $P_y$. We obtain $P_yx = y\langle y, x\rangle$.
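The defining properties of the one-dimensional projection $P_yx = y\langle y, x\rangle$ can be illustrated numerically. The sketch below assumes $\mathcal{H} = \mathbf{C}^3$; the vectors and the helper name `project` are illustrative assumptions, not part of the text.

```python
import math


def inner(x, y):
    """<x, y>: conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def project(y, x):
    """P_y x = y <y, x> for a unit vector y."""
    c = inner(y, x)
    return [c * yi for yi in y]


# Normalize an arbitrary vector to obtain ||y|| = 1.
y_raw = [1.0, 1j, -2.0]
length = math.sqrt(inner(y_raw, y_raw).real)
y = [t / length for t in y_raw]

x = [0.5, 2 - 1j, 3j]
z = [1j, 1.0, 1 + 1j]

Px = project(y, x)
PPx = project(y, Px)                       # should equal Px  (P^2 = P)
remainder = [a - b for a, b in zip(x, Px)]  # (1 - P)x, orthogonal to y
```

The three checks that go with this example are idempotence ($P^2 = P$), self-adjointness ($\langle z, Px\rangle = \langle Pz, x\rangle$), and orthogonality of $(\mathbf{1} - P)x$ to the range of $P$.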
If, for a pair of closed subspaces, $\mathcal{T}_1 \supset \mathcal{T}_2$, then from $x = q_1 + p_1 = q_2 + p_2$, where $q_1 \in \mathcal{T}_1$, $p_1 \in \mathcal{T}_1^\perp$, $q_2 \in \mathcal{T}_2$ and $p_2 \in \mathcal{T}_2^\perp$, it follows that $p_1 \in \mathcal{T}_2^\perp$. For the partition $q_1 = r_2 + s_2$, where $r_2 \in \mathcal{T}_2$ and $s_2 \in \mathcal{T}_2^\perp$, it follows that $x = r_2 + (s_2 + p_1)$, where $r_2 \in \mathcal{T}_2$ and $s_2 + p_1 \in \mathcal{T}_2^\perp$. Therefore $q_2 = r_2$ and $p_2 = s_2 + p_1$. If $P_1$ is the projector onto $\mathcal{T}_1$ and $P_2$ is the projector onto $\mathcal{T}_2$ we then obtain

$$P_1x = P_2x + (\mathbf{1} - P_2)P_1x, \qquad P_2x = P_2P_1x.$$

Thus it follows that $\|P_1x\|^2 \ge \|P_2x\|^2$ and $P_2 = P_2P_1$. Since $P_1^2 = P_1$ we find that $\|P_1x\|^2 \ge \|P_2x\|^2$ is equivalent to $P_1 \ge P_2$. Since $P_2$ is self-adjoint, from $P_2 = P_2P_1$ it follows that $P_2 = P_2^+ = P_1P_2$, that is, $P_1$, $P_2$ commute.

If, conversely, the projection operators $P_1$, $P_2$ satisfy $P_1 \ge P_2$ then it follows that

$$\|(P_2 - P_1P_2)x\|^2 = \langle P_2x - P_1P_2x, P_2x - P_1P_2x\rangle = \langle P_2x, P_2x\rangle + \langle P_1P_2x, P_1P_2x\rangle - \langle P_1P_2x, P_2x\rangle - \langle P_2x, P_1P_2x\rangle = \|P_2x\|^2 - \|P_1P_2x\|^2,$$

and since $\|P_1y\|^2 \ge \|P_2y\|^2$ we obtain (with $y = P_2x$)

$$\|(P_2 - P_1P_2)x\|^2 = \|P_2P_2x\|^2 - \|P_1P_2x\|^2 \le 0,$$

that is, $P_2 - P_1P_2 = 0$. Thus, as above, it follows that $P_2 = P_2P_1 = P_1P_2$. For $x \in P_2\mathcal{H}$, from $P_2 = P_1P_2$ it follows that $x = P_1x$ and therefore $x \in P_1\mathcal{H}$, that is, $P_1\mathcal{H} \supset P_2\mathcal{H}$.

The above bijection between the projection operators and the corresponding projection spaces is therefore an order isomorphism.

From $P_2 = P_1P_2$ it follows that $(P_1 - P_2)^2 = P_1 - P_2$ and therefore $P_1 - P_2 \ge 0$, that is, $P_1 \ge P_2$. If $(P_1 - P_2)^2 = P_1 - P_2$ and therefore $P_1 \ge P_2$, then it also follows that $P_2 = P_1P_2 = P_2P_1$. The following conditions are therefore equivalent:

$$P_2 = P_1P_2, \qquad P_2 = P_2P_1, \qquad (P_1 - P_2)^2 = P_1 - P_2, \qquad P_1 \ge P_2, \qquad P_1\mathcal{H} \supset P_2\mathcal{H}.$$

For two projection operators $P$ and $Q$ we denote the projector onto $(P\mathcal{H}) \cap (Q\mathcal{H})$ by $P \wedge Q$ and the projector onto $(P\mathcal{H}) \vee (Q\mathcal{H})$ by $P \vee Q$. Because of the above order isomorphism, the set of projection operators is, according to §2, an orthocomplemented lattice.

We will now show that $PQ = QP$ is equivalent to $P \wedge Q = PQ$. From $PQ = QP$ it follows that $(PQ)^2 = PQ$ and $(QP)^2 = QP$.
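Thus it follows that $PQ\mathcal{H} \subset P\mathcal{H}$ and $PQ\mathcal{H} = QP\mathcal{H} \subset Q\mathcal{H}$, and we obtain $PQ\mathcal{H} \subset P\mathcal{H} \wedge Q\mathcal{H}$. If $x \in P\mathcal{H} \wedge Q\mathcal{H}$ then it follows that $x = Px = Qx$ and that $x = PQx \in PQ\mathcal{H}$. Conversely, if $PQ = P \wedge Q$, then $PQ$ is self-adjoint, that is, $(PQ)^{+} = QP = PQ$. In general $PQ$ is not self-adjoint and is not a projection. For the special case in which $PQ = 0$ it follows that $PQ = QP$ and $Q = (1 - P)Q$ and therefore $Q\mathcal{H} \subset (P\mathcal{H})^\perp$, that is, $Q\mathcal{H}$ is orthogonal to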
$P\mathcal{H}$. It follows that $P + Q$ is the projection operator onto $P\mathcal{H} \oplus Q\mathcal{H} = P\mathcal{H} \vee Q\mathcal{H}$. If $P + Q \le 1$, then it follows that $P \le 1 - Q$ and that $P = P(1 - Q)$, that is, $PQ = 0$, $P\mathcal{H} \perp Q\mathcal{H}$, and $P + Q$ is a projection onto $P\mathcal{H} \oplus Q\mathcal{H}$.

If $P_n$ is a decreasing (or increasing) sequence of projection operators, then from §5 it follows that $P_n$ converges, $P_n \to P$, since $P_n \ge 0$ ($P_n \le 1$). From $P_n^2 = P_n$ it follows that $P^2 = P$, that is, $P$ is also a projection operator. It is easy to show that $P\mathcal{H} = \bigcap_n (P_n\mathcal{H})$ (or $P\mathcal{H} = \bigvee_n (P_n\mathcal{H})$). If $P_n$ is a sequence of projection operators, then $\sum_{n=1}^{N} P_n$ is an increasing sequence. If $\sum_{n=1}^{N} P_n \le 1$ for all $N$, then $\sum_{n=1}^{\infty} P_n$ exists and is $\le 1$. It is easy to show that the condition $\sum_{n=1}^{\infty} P_n \le 1$ is equivalent to the condition that the $P_n$ are pairwise orthogonal, that is, $P_n P_m = 0$ for $n \ne m$. Thus we find that $\sum_n P_n$ is a projection operator on $\bigvee_n (P_n\mathcal{H}) = \sum_n \oplus P_n\mathcal{H}$.

If $P$ is a projection operator and $A \in \mathcal{B}'(\mathcal{H})$, then $AP = PA$ is equivalent to $A = PAP + (1 - P)A(1 - P)$, where the latter can be proven easily with the aid of the identity $A = PAP + (1 - P)AP + PA(1 - P) + (1 - P)A(1 - P)$. If $A = PA$ (or $A = AP$), then since $A \in \mathcal{B}'(\mathcal{H})$ it follows that $A = (PA)^{+} = AP$ (or $A = PA$). $A$ therefore commutes with $P$ and we obtain $A = PA = P^2A = PAP$. If $PA = 0$, then it follows that $PA = AP = 0$ and we obtain $(1 - P)A = A(1 - P) = (1 - P)A(1 - P) = A$. If $0 \le A \le P$, then for $x \in (1 - P)\mathcal{H}$ it easily follows that $\langle x, Ax\rangle = 0$. Since $A \ge 0$, according to §5 we may write $A$ in the form $A = c\sum_\nu A_\nu^2$. From $\langle x, Ax\rangle = 0$ it easily follows that $A_\nu x = 0$ and $Ax = 0$, that is, $A(1 - P) = 0$. We therefore obtain $A = AP = PA = PAP$.

7 Isometric and Unitary Operators

A linear operator $V$ is said to be isometric if $\|Vx\| = \|x\|$ for all $x \in \mathcal{H}$. Thus $V$ is bounded and $\|V\| = 1$. From $\langle Vx, Vx\rangle = \langle x, V^{+}Vx\rangle = \langle x, x\rangle$ it follows that $V^{+}V = 1$ (see §3), and conversely $V^{+}V = 1$ implies that $V$ is an isometry.
For the operator $P = VV^{+}$ it follows that $P^{+} = P$ and $P^2 = P$, that is, $VV^{+}$ is a projection operator. If the Hilbert space $\mathcal{H}$ is finite dimensional, then we must have $V^{+} = V^{-1}$ and therefore $P = 1$. For infinite-dimensional $\mathcal{H}$ it is possible that $P \ne 1$. $P = 1$ is equivalent to the statement that $V^{+}$ is also an isometry. An operator $U$ for which both $U$ and $U^{+}$ are isometric is said to be unitary. Therefore $U$ is unitary if and only if $UU^{+} = U^{+}U = 1$. If $V$ is isometric, then $V\mathcal{H}$ is a closed subspace of $\mathcal{H}$ because, by the continuity of $V$ and $V^{+}$, it follows from $Vx_n \to y$ that $V^{+}Vx_n \to V^{+}y$, that is, $x_n \to V^{+}y = y'$ and that $Vx_n \to Vy' \in V\mathcal{H}$. From $P = VV^{+}$ it follows easily
that $P\mathcal{H} \subset V\mathcal{H}$. From $\mathcal{H} \supset V\mathcal{H}$ it follows that $P\mathcal{H} \supset PV\mathcal{H} = VV^{+}V\mathcal{H} = V\mathcal{H}$. Therefore we obtain $P\mathcal{H} = V\mathcal{H}$. $P$ is therefore a projection onto the image of $V$. An isometric operator is therefore unitary if and only if $V\mathcal{H} = \mathcal{H}$. If the isometric operator $V$ has a right inverse, that is, there exists a $V'$ such that $VV' = 1$, then it follows that $V^{+}VV' = V^{+}$ and $V' = V^{+}$, that is, $VV^{+} = 1$. The statement that an isometric operator $V$ is unitary is equivalent to the statement that $V$ has a right inverse.

8 Spectral Representation of Self-adjoint and Unitary Operators

We shall now demonstrate the following theorem: Let $A \in \mathcal{B}'_{+}(\mathcal{H})$; then there exists a unique $B \in \mathcal{B}'_{+}(\mathcal{H})$ such that $B^2 = A$; all operators which commute with $A$ also commute with $B$. ($B$ is called the positive root of $A$, $B = A^{1/2}$.)

Proof. From $B_1 \in \mathcal{B}'_{+}(\mathcal{H})$ and $B_1^2 = A_1 = \|A\|^{-1}A$ it follows that for $B = \|A\|^{1/2}B_1$, $B^2 = \|A\|B_1^2 = A$ and $B \in \mathcal{B}'_{+}(\mathcal{H})$; from $B^2 = A$ and $B \in \mathcal{B}'_{+}(\mathcal{H})$ and for $B_1 = \|A\|^{-1/2}B$ it follows that $B_1^2 = A_1$ and $B_1 \in \mathcal{B}'_{+}(\mathcal{H})$. Since $\|A\|^{-1}A \le 1$ it suffices to prove the theorem for $A \le 1$. For this purpose we define a sequence $B_n$ as follows: $B_0 = 0$, $B_{n+1} = B_n + \tfrac{1}{2}(A - B_n^2)$. Thus it follows that all operators which commute with $A$ also commute with all $B_n$. $B_n$ is an increasing sequence which satisfies $0 \le B_n \le 1$, because from $1 - B_{n+1} = \tfrac{1}{2}(1 - B_n)^2 + \tfrac{1}{2}(1 - A)$ it follows that $1 - B_{n+1} \ge 0$; by the induction hypothesis it follows from $B_{n+1} - B_n = \tfrac{1}{2}(B_n - B_{n-1})[(1 - B_{n-1}) + (1 - B_n)]$ that the relation $B_{n+1} - B_n \ge 0$ holds. The sequence $B_n$ converges; therefore we obtain $B_n \to B$ where $0 \le B \le 1$. From $B_{n+1} = B_n + \tfrac{1}{2}(A - B_n^2)$ it follows that, in the limit, $B^2 = A$. Since the $B_n$ commute with all operators which commute with $A$, the same holds for $B$. In order to prove uniqueness, we assume that $C \ge 0$ and that $C^2 = A$. From $AC = C^2C = CC^2 = CA$, $C$ commutes with $A$, and therefore also commutes with $B$. Earlier we have shown that there exist positive roots $B^{1/2}$ and $C^{1/2}$.
For $x \in \mathcal{H}$ and $y = (B - C)x$ it follows that
$$\|B^{1/2}y\|^2 + \|C^{1/2}y\|^2 = \langle y, By\rangle + \langle y, Cy\rangle = \langle y, (B + C)y\rangle = \langle y, (B + C)(B - C)x\rangle = \langle y, (B^2 - C^2)x\rangle = 0.$$
Therefore we obtain $B^{1/2}y = C^{1/2}y = 0$, from which we conclude that $By = 0$ and $Cy = 0$. Thus we obtain $\|(B - C)x\|^2 = \langle x, (B - C)^2x\rangle = \langle x, (B - C)y\rangle = 0$, that is, $(B - C)x = 0$. Since $x$ was arbitrary, we have proven $B = C$.

Now let $B(\mu) \ge 0$ and let $B(\mu)^2 = (A - \mu 1)^2$. $B(\mu)$ is therefore uniquely determined and commutes with all operators which commute with $A$. Let $A(\mu) = A - \mu 1$, $A(\mu)_{+} = \tfrac{1}{2}(B(\mu) + A(\mu))$ and $A(\mu)_{-} = \tfrac{1}{2}(B(\mu) - A(\mu))$. Therefore $A(\mu) = A(\mu)_{+} - A(\mu)_{-}$ and $B(\mu) = A(\mu)_{+} + A(\mu)_{-}$. Since $B(\mu)$ and $A(\mu)$ commute, we obtain
$$A(\mu)_{+}A(\mu)_{-} = A(\mu)_{-}A(\mu)_{+} = \tfrac{1}{4}\bigl(B(\mu)^2 - A(\mu)^2\bigr) = 0.$$
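The iteration $B_{n+1} = B_n + \tfrac{1}{2}(A - B_n^2)$ from the existence proof, and the splitting into positive and negative parts, can be tried on a small matrix. This is only a sketch under stated assumptions: real symmetric $2 \times 2$ matrices stored as nested lists, a fixed number of iteration steps, and the case $\mu = 0$; the names `positive_root`, `lincomb` and `matmul` are illustrative.

```python
def matmul(X, Y):
    """Matrix product of square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(a, X, b, Y):
    """Return a*X + b*Y."""
    return [[a * x + b * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def positive_root(A, steps=200):
    """Iterate B_{n+1} = B_n + (A - B_n^2)/2 from B_0 = 0; assumes 0 <= A <= 1."""
    n = len(A)
    B = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        B = lincomb(1.0, B, 0.5, lincomb(1.0, A, -1.0, matmul(B, B)))
    return B

def max_abs(X):
    return max(abs(v) for row in X for v in row)

S = [[0.5, 0.5], [0.5, -0.5]]          # self-adjoint, with S^2 <= 1
B = positive_root(matmul(S, S))        # plays the role of B(0), the positive root of S^2
S_plus = lincomb(0.5, B, 0.5, S)       # A(0)_+ = (B + S)/2
S_minus = lincomb(0.5, B, -0.5, S)     # A(0)_- = (B - S)/2

root_error = max_abs(lincomb(1.0, matmul(B, B), -1.0, matmul(S, S)))
overlap = max_abs(matmul(S_plus, S_minus))
```

Both `root_error` and `overlap` come out at rounding level, in line with $B^2 = A$ and $A(\mu)_{+}A(\mu)_{-} = 0$. Convergence of the iteration slows down when the matrix has eigenvalues close to zero, so the fixed step count is a convenience of this sketch, not a general recipe.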
Let $\mathcal{T}_\mu$ denote the set $\{x \mid x \in \mathcal{H} \text{ and } A(\mu)_{+}x = 0\}$. From the continuity of $A(\mu)_{+}$ it follows that $\mathcal{T}_\mu$ is a closed subspace of $\mathcal{H}$. Let $E(\mu)$ denote the projector onto $\mathcal{T}_\mu$. We therefore obtain $A(\mu)_{+}E(\mu) = 0$. Since $A(\mu)_{+}A(\mu)_{-} = 0$ we obtain $A(\mu)_{-}\mathcal{H} \subset \mathcal{T}_\mu$, that is, $E(\mu)A(\mu)_{-} = A(\mu)_{-}$. If $C \in \mathcal{B}'(\mathcal{H})$ commutes with $A$, then it also commutes with $A(\mu)_{+}$ and $A(\mu)_{-}$. For $y \in \mathcal{T}_\mu$ it then follows that $A(\mu)_{+}Cy = CA(\mu)_{+}y = 0$, that is, $Cy \in \mathcal{T}_\mu$. Thus it follows that $CE(\mu) = E(\mu)CE(\mu)$. Computing the adjoint operator we obtain $E(\mu)C = CE(\mu)$. Let $C = A(\mu)_{-}$; from $E(\mu)A(\mu)_{-} = A(\mu)_{-}$ it follows that $A(\mu)_{-}E(\mu) = A(\mu)_{-}$. Similarly, from $A(\mu)_{+}E(\mu) = 0$ we obtain $E(\mu)A(\mu)_{+} = 0$. Thus, from $0 \le E(\mu) \le 1$ and $B(\mu) \ge 0$ it follows that
$$\begin{aligned} 0 &\le E(\mu)B(\mu) = B(\mu)E(\mu) = E(\mu)[A(\mu)_{+} + A(\mu)_{-}] = A(\mu)_{-},\\ 0 &\le [1 - E(\mu)]B(\mu) = A(\mu)_{+},\\ E(\mu)A(\mu) &= -A(\mu)_{-}, \qquad (1 - E(\mu))A(\mu) = A(\mu)_{+}. \end{aligned} \tag{8.1}$$

For $\lambda \le \mu$ we obtain $A(\lambda) - A(\mu) \ge 0$; thus, from $A(\lambda)_{-} \ge 0$ we obtain $A(\lambda)_{+} - A(\mu)_{+} + A(\mu)_{-} \ge 0$. By multiplication with $A(\mu)_{+}$ we obtain $A(\mu)_{+}[A(\lambda)_{+} - A(\mu)_{+} + A(\mu)_{-}] \ge 0$, that is, since $A(\mu)_{+}A(\mu)_{-} = 0$, we obtain $A(\mu)_{+}A(\lambda)_{+} \ge (A(\mu)_{+})^2$, and we find that $\langle x, A(\mu)_{+}A(\lambda)_{+}x\rangle \ge \|A(\mu)_{+}x\|^2$ for all $x$. From $A(\lambda)_{+}x = 0$ it therefore follows that $A(\mu)_{+}x = 0$, that is, $\mathcal{T}_\lambda \subset \mathcal{T}_\mu$, which is equivalent to $E(\lambda) \le E(\mu)$ and to $E(\mu)E(\lambda) = E(\lambda)$.

The fact that, for $A$, there exist two constants $\alpha$, $\beta$ for which $\alpha 1 \le A \le \beta 1$ implies that $A(\lambda) \ge 0$ for $\lambda \le \alpha$ and therefore $B(\lambda) = A(\lambda)$ for $\lambda \le \alpha$, because the positive root of $A(\lambda)^2$ is unique; therefore we obtain $A(\lambda)_{+} = A(\lambda)$. Since $\langle x, A(\lambda)x\rangle = \langle x, A(\lambda)_{+}x\rangle \ge (\alpha - \lambda)\|x\|^2$ it follows that for $\lambda < \alpha$, $A(\lambda)_{+}x = 0$ only for $x = 0$, and we therefore obtain $E(\lambda) = 0$ for $\lambda < \alpha$. For $\lambda \ge \beta$ it follows that $-A(\lambda) \ge 0$ and therefore $B(\lambda) = -A(\lambda)$, that is, $A(\lambda)_{+} = 0$, from which we conclude that $E(\lambda) = 1$ for $\lambda \ge \beta$.
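Since for $\lambda \le \mu$, $E(\lambda) \le E(\mu)$, $E(\mu) - E(\lambda)$ is a projection operator onto the space $\mathcal{T}_\mu \cap \mathcal{T}_\lambda^\perp$. For the interval $J = (\lambda, \mu]$ let $E(J)$ denote $E(\mu) - E(\lambda)$; from the relation (8.1) we obtain:
$$\begin{aligned} 0 &\le A(\lambda)_{+}E(J) = A(\lambda)(1 - E(\lambda))E(J) = A(\lambda)E(J) = (A - \lambda 1)E(J),\\ 0 &\le A(\mu)_{-}E(J) = -A(\mu)E(\mu)E(J) = -A(\mu)E(J) = (\mu 1 - A)E(J). \end{aligned}$$
Thus it follows that
$$\lambda E(J) \le AE(J) \le \mu E(J). \tag{8.2}$$
We shall denote the limit of $E(\mu)$ as $\mu \to \lambda$ from above by $E(\lambda+)$. Let $Q(\lambda) = E(\lambda+) - E(\lambda)$. We obtain $AQ(\lambda) = \lambda Q(\lambda)$, that is, $A(\lambda)Q(\lambda) = 0$.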
In addition it follows that $A(\lambda)_{+}Q(\lambda) = (1 - E(\lambda))A(\lambda)Q(\lambda) = 0$ and we therefore find that $Q(\lambda)x \in \mathcal{T}_\lambda$ for all $x$, that is, $E(\lambda)Q(\lambda) = Q(\lambda)$. From $E(\lambda)E(J) = 0$ it follows that, in the limit $\mu \to \lambda$, $E(\lambda)Q(\lambda) = 0$; thus we find that $Q(\lambda) = 0$ and therefore we obtain $E(\lambda+) = E(\lambda)$, that is, $E(\lambda)$ is continuous from above. Thus, for $P(\mu) = E(\mu) - E(\mu-)$, for $\lambda \to \mu$, from (8.2) we obtain
$$AP(\mu) = \mu P(\mu). \tag{8.3}$$
If we partition the real interval $[\alpha - \delta, \beta]$ (for $\alpha$, $\beta$ used above, and arbitrarily small $\delta > 0$) into subintervals $J_k = (\lambda_k, \lambda_{k+1}]$, where $E(J_k) = E(\lambda_{k+1}) - E(\lambda_k)$, from (8.2) we obtain
$$\sum_k \lambda_k E(J_k) \le \sum_k AE(J_k) \le \sum_k \lambda_{k+1}E(J_k).$$
If the maximal length of the intervals $J_k$ is equal to $\varepsilon$, then from $\sum_k E(J_k) = E(\beta) - E(\alpha - \delta) = 1$ it follows that
$$0 \le \sum_k (\lambda_{k+1} - \lambda_k)E(J_k) \le \varepsilon 1$$
and therefore
$$0 \le \sum_k \lambda_{k+1}E(J_k) - A \le \varepsilon 1, \qquad 0 \le A - \sum_k \lambda_k E(J_k) \le \varepsilon 1,$$
from which it follows that $\|A - \sum_k \lambda_{k+1}E(J_k)\| \le \varepsilon$ and $\|A - \sum_k \lambda_k E(J_k)\| \le \varepsilon$. Thus $\sum_k \lambda_k E(J_k)$ and $\sum_k \lambda_{k+1}E(J_k)$ converge, as $\varepsilon \to 0$, in norm to $A$. The limit of the sum is written as the integral
$$A = \int_{\alpha-\delta}^{\beta} \lambda\, dE(\lambda). \tag{8.4}$$
(8.4) is called the spectral representation of $A$. From (8.4) for a polynomial $p$ it follows that
$$p(A) = \int_{\alpha-\delta}^{\beta} p(\lambda)\, dE(\lambda).$$
In addition, for a function $f(\lambda)$ which is measurable relative to the projection valued measure defined by $E(\lambda)$ (see IV, §2.5) we may define the function $f(A)$ as follows:
$$f(A) = \int_{\alpha-\delta}^{\beta} f(\lambda)\, dE(\lambda).$$
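In finite dimensions the spectral family is a step function and the sums $\sum_k \lambda_k E(J_k)$, $\sum_k \lambda_{k+1}E(J_k)$ can be evaluated directly. The sketch below assumes the symmetric matrix $A$ with rows $(2, 1)$ and $(1, 2)$, whose eigenvalues $1$ and $3$ and eigenprojectors are written out by hand, and an equally spaced partition of $(0, 4]$; the function names are illustrative.

```python
A = [[2.0, 1.0], [1.0, 2.0]]
P1 = [[0.5, -0.5], [-0.5, 0.5]]   # eigenprojector for eigenvalue 1
P3 = [[0.5, 0.5], [0.5, 0.5]]     # eigenprojector for eigenvalue 3
ZERO = [[0.0, 0.0], [0.0, 0.0]]

def lincomb(a, X, b, Y):
    """Return a*X + b*Y for matrices stored as nested lists."""
    return [[a * x + b * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def E(lam):
    """Spectral family: sum of the eigenprojectors with eigenvalue <= lam."""
    out = ZERO
    if lam >= 1.0:
        out = lincomb(1.0, out, 1.0, P1)
    if lam >= 3.0:
        out = lincomb(1.0, out, 1.0, P3)
    return out

def stieltjes_sums(a, b, n):
    """Lower and upper sums over n equal subintervals J_k = (l_k, l_{k+1}] of (a, b]."""
    lower, upper = ZERO, ZERO
    for k in range(n):
        lo = a + (b - a) * k / n
        hi = a + (b - a) * (k + 1) / n
        EJ = lincomb(1.0, E(hi), -1.0, E(lo))   # E(J_k) = E(l_{k+1}) - E(l_k)
        lower = lincomb(1.0, lower, lo, EJ)
        upper = lincomb(1.0, upper, hi, EJ)
    return lower, upper, (b - a) / n

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

lower, upper, eps = stieltjes_sums(0.0, 4.0, 400)
err_lower = max_abs_diff(A, lower)
err_upper = max_abs_diff(upper, A)
```

Both errors stay within the mesh width `eps`, as the norm estimates preceding (8.4) require. The comparison `lam >= 1.0` makes `E` continuous from above, matching the convention $E(\lambda+) = E(\lambda)$.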
In particular, for
$$\eta(\lambda) = \begin{cases} 1 & \text{for } \lambda_1 < \lambda \le \lambda_2,\\ 0 & \text{otherwise,} \end{cases}$$
it follows that $\eta(A) = E(\lambda_2) - E(\lambda_1)$. From (8.4) it easily follows that the solution of the eigenvalue problem $Ax = \mu x$ (or $(A - \mu 1)x = 0$) is equivalent to
$$\|(A - \mu 1)x\|^2 = \int (\lambda - \mu)^2\, d\|E(\lambda)x\|^2 = 0$$
and we therefore obtain $x \in P(\mu)\mathcal{H}$ ($P(\mu)$ is defined in (8.3)). The eigenvalues are therefore the values at which $E(\mu)$ is discontinuous.

The uniqueness of the spectral family follows from the representation (8.4). Let $E'(\lambda)$ be a second spectral family of projection operators which is increasing and continuous from above and satisfies (8.4). Then, since the positive root $B(\mu)$ is unique, it follows that
$$\|A(\mu)_{+}x\|^2 = \int_{\lambda > \mu} (\lambda - \mu)^2\, d\|E'(\lambda)x\|^2 .$$
The space $\mathcal{T}_\mu$ is therefore the set of all $x$ for which
$$\int_{\lambda > \mu} (\lambda - \mu)^2\, d\|E'(\lambda)x\|^2 = 0, \tag{8.5}$$
from which it follows that $E'(\mu + \varepsilon)x = x$ for each $\varepsilon > 0$. Since $E'(\lambda)$ is continuous from above, we therefore obtain $E'(\mu)x = x$ for all $x \in \mathcal{T}_\mu$, that is, for all $x = E(\mu)y$, where $y \in \mathcal{H}$. Therefore $E'(\mu)E(\mu) = E(\mu)$, that is, $E'(\mu) \ge E(\mu)$. Conversely, if $E'(\mu)x = x$, then (8.5) is satisfied, that is, $A(\mu)_{+}x = 0$ and we obtain $x \in \mathcal{T}_\mu$. Therefore $E'(\mu)\mathcal{H} \subset \mathcal{T}_\mu$, that is, $E'(\mu) \le E(\mu)$. Thus we have proven that $E'(\mu) = E(\mu)$.

If $U$ is a unitary operator, then $A = \tfrac{1}{2}(U + U^{+})$ and $B = (1/2i)(U - U^{+})$ are two commuting self-adjoint operators which satisfy $\|A\| \le 1$, $\|B\| \le 1$. From $UU^{+} = 1$ it follows that $A^2 + B^2 = 1$, that is, $B^2 = 1 - A^2$. For $A$ there exists a spectral representation
$$A = \int_{-1}^{1} \mu\, dE(\mu).$$
Thus we obtain
$$B^2 = \int_{-1}^{1} (1 - \mu^2)\, dE(\mu).$$
For a partition of $B$ into positive and negative parts $B = B_{+} - B_{-}$ we obtain $B^2 = B_{+}^2 + B_{-}^2$. Let $F$ denote the projection operator onto the subspace of all $x$ for which $B_{-}x = 0$. Then $B_{+} = BF$ and $B_{-} = -B(1 - F)$ and we obtain $B_{+}^2 = B^2F$ and $B_{-}^2 = B^2(1 - F)$, from which we obtain
$$B_{+} = \int_{-1}^{1} \sqrt{1 - \mu^2}\, dE(\mu)\, F, \qquad B_{-} = \int_{-1}^{1} \sqrt{1 - \mu^2}\, dE(\mu)\, (1 - F).$$
For $\mu \pm i\sqrt{1 - \mu^2} = e^{i\varphi}$ and
$$G(\varphi) = \begin{cases} (1 - E(\mu))F & \text{for } 0 \le \varphi \le \pi,\\ E(\mu)(1 - F) + F & \text{for } \pi < \varphi \le 2\pi, \end{cases}$$
we obtain
$$U = \int_0^{2\pi} e^{i\varphi}\, dG(\varphi).$$
For $D(\varphi) = G(\varphi+)$ we find that $D(\varphi)$ is continuous from above, and that
$$U = \int_0^{2\pi} e^{i\varphi}\, dD(\varphi). \tag{8.6}$$

9 The Spectrum of Compact Self-adjoint Operators

If $A$ is self-adjoint and compact (see §4) then $A$ can only have a discrete spectrum, that is, the spectral family of $A$ cannot be continuously increasing. In addition, the eigenspaces corresponding to nonzero eigenvalues, that is, all $P(\mu)\mathcal{H}$ which satisfy (8.3) for nonzero values of $\mu$, can only have finite dimension, and the sequence of eigenvalues must converge towards 0.

Proof. If $E(\lambda)$ were continuously increasing, or if there existed an infinite dimensional eigenspace corresponding to a nonzero eigenvalue, then there would exist an interval $\lambda$ to $\lambda + \varepsilon$ for which $[E(\lambda + \varepsilon) - E(\lambda)]\mathcal{H}$ was infinite dimensional and $\lambda + \varepsilon < 0$ or $\lambda > 0$. Therefore there would be an infinite sequence $x_\nu \in [E(\lambda + \varepsilon) - E(\lambda)]\mathcal{H}$ for which $\langle x_\nu, x_\mu\rangle = \delta_{\nu\mu}$, so that $x_\nu$ converges weakly to 0 (see §4). Since $A$ is compact, we must have $Ax_\nu \to 0$. From (8.4), it would then follow that
$$\|Ax_\nu\|^2 = \int \mu^2\, d\|E(\mu)x_\nu\|^2 \ge \min\{\lambda^2, (\lambda + \varepsilon)^2\} \ne 0$$
in contradiction to $\|Ax_\nu\| \to 0$. Therefore we obtain, for the eigenvalues $\mu$ of $A$,
$$A = \sum_\mu \mu P(\mu), \tag{9.1}$$
where 0 is the only accumulation point for the eigenvalues $\mu$ and $P(\mu)$ is finite dimensional for $\mu \ne 0$. Therefore we may choose a complete orthonormal basis $x_\nu$, for which (9.1) is transformed into
$$A = \sum_\nu \mu_\nu P_{x_\nu}, \quad \text{where } \mu_\nu \to 0. \tag{9.2}$$
We shall now show that if (9.2) holds, then $A$ is compact. According to §4 it is sufficient to consider a sequence $y_n$ converging weakly to 0 with $\|y_n\| \le 1$. From (9.2) it follows that (here we set $M_N = \max\{|\mu_\nu| \mid \nu > N\}$):
$$\|Ay_n\|^2 = \sum_\nu \mu_\nu^2\, |\langle x_\nu, y_n\rangle|^2 \le \sum_{\nu=1}^{N} \mu_\nu^2\, |\langle x_\nu, y_n\rangle|^2 + M_N^2 .$$
Now we choose $N$ such that $M_N < \varepsilon$ and then let $n \to \infty$, and we obtain $\langle x_\nu, y_n\rangle \to 0$. Therefore we obtain $\|Ay_n\| \to 0$. From (9.2) it follows that the set of compact operators in $\mathcal{B}'(\mathcal{H})$ is the norm-closed subspace of $\mathcal{B}'(\mathcal{H})$ generated by the set of all $P_x$. A projection operator $P$ is compact if and only if $P\mathcal{H}$ is finite dimensional.

10 Spectral Representation of Unbounded Self-adjoint Operators

In general, an unbounded linear operator $A$ in $\mathcal{H}$ is not defined in all of $\mathcal{H}$. Let $\mathcal{D}_A$ denote the domain of definition of $A$, that is, $A$ is defined as the map $\mathcal{D}_A \to \mathcal{H}$. From linearity, we may assume that $\mathcal{D}_A$ is a subspace (but not necessarily a closed subspace). Let $\mathcal{W}_A = A\mathcal{D}_A$ denote the range of $A$. $\mathcal{W}_A$ is also a subspace of $\mathcal{H}$.

If $E(\lambda)$ is a spectral family of projection operators, that is, $E(\lambda_1) \ge E(\lambda_2)$ for $\lambda_1 \ge \lambda_2$, $E(\lambda+) = E(\lambda)$ and $E(\lambda) \to 0$ for $\lambda \to -\infty$, $E(\lambda) \to 1$ for $\lambda \to \infty$, then
$$A_{\alpha\beta} = \int_\alpha^\beta \lambda\, dE(\lambda)$$
defines an operator $A_{\alpha\beta}$ for which $\|A_{\alpha\beta}\| \le \max(|\alpha|, |\beta|)$. For an $x \in \mathcal{H}$ and for $\alpha \to -\infty$ and $\beta \to \infty$, $A_{\alpha\beta}x$ is convergent if there exists a $c$ for which $\|A_{\alpha\beta}x\| \le c$ for all $\alpha$, $\beta$. In particular, for all $x$ for which
$$\int_{-\infty}^{\infty} \lambda^2\, d\|E(\lambda)x\|^2 < \infty \tag{10.1}$$
there exists an operator $A$ defined by
$$Ax = \int_{-\infty}^{\infty} \lambda\, dE(\lambda)x. \tag{10.2}$$
The operator $A$ therefore has a natural definition domain consisting of the set of vectors which satisfy (10.1). If there does not exist a $\lambda$ for which either $E(\lambda) = 1$ or $E(\lambda) = 0$, then it is easy to show that $\mathcal{D}_A \ne \mathcal{H}$. $\mathcal{D}_A$ is, however, dense in $\mathcal{H}$.

For a map $\mathcal{D}_A \to \mathcal{H}$ we may consider the graph as a subset of the topological product space $\mathcal{H} \times \mathcal{H}$, that is, the set of all pairs $(x, Ax)$ for which $x \in \mathcal{D}_A$. Here it is particularly advantageous to consider the topological product of $\mathcal{H}$ with itself to be a Hilbert space $\mathcal{H} \oplus \mathcal{H} = \mathcal{H}^2$, where $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$, $\alpha(x, y) = (\alpha x, \alpha y)$ and $\langle (x_1, y_1), (x_2, y_2)\rangle = \langle x_1, x_2\rangle + \langle y_1, y_2\rangle$. The graph $\mathcal{G}_A$ of $A$ is the set of all vectors of the form $(x, Ax) \in \mathcal{H}^2$. $\mathcal{G}_A$ is a subspace of $\mathcal{H}^2$ if and only if $A$ is linear.
Conversely, each subspace $\mathcal{G}$ of $\mathcal{H}^2$ which contains no element of the form $(0, y)$ with $y \ne 0$ defines a linear operator $A$ for which $\mathcal{D}_A$ is the set of first components of the elements of $\mathcal{G}$.
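The operator $A$ is said to be closed if $\mathcal{G}_A$ is a closed subspace. This is equivalent to the condition that for $x_n \in \mathcal{D}_A$, $x_n \to x$ and $Ax_n \to y$ it follows that $(x, y) \in \mathcal{G}_A$, that is, $x \in \mathcal{D}_A$ and $Ax = y$.

For an operator $A$ we define $A^{+}$ as follows: $\mathcal{D}_{A^{+}}$ is the set of all $x$ for which there exists an $x'$ for which $\langle x, Ay\rangle = \langle x', y\rangle$ for all $y \in \mathcal{D}_A$. $A^{+}x = x'$ is therefore defined only if from $\langle z, y\rangle = 0$ for all $y \in \mathcal{D}_A$ it follows that $z = 0$, that is, if $\mathcal{D}_A$ is dense in $\mathcal{H}$. If we define a unitary operator $U$ in $\mathcal{H}^2$ by $U(x, y) = (y, -x)$, we find that $U^2 = -1$. The graph of $A^{+}$ is precisely $\mathcal{G}_{A^{+}} = (U\mathcal{G}_A)^\perp$. $A^{+}$ is therefore defined only if $(U\mathcal{G}_A)^\perp$ is a graph, that is, if it contains no element $(0, y)$ for which $y \ne 0$, that is, if $\mathcal{D}_A$ is dense in $\mathcal{H}$. Since $\mathcal{G}_{A^{+}} = (U\mathcal{G}_A)^\perp$ is a closed subspace, $A^{+}$ is closed.

We shall now show that if $A$ is itself closed, then $\mathcal{D}_{A^{+}}$ is also dense in $\mathcal{H}$. If $\mathcal{D}_{A^{+}}$ is not dense in $\mathcal{H}$ then there exists a $y \in \mathcal{H}$, $y \ne 0$, for which $y \perp \mathcal{D}_{A^{+}}$. We therefore obtain $(0, y) \perp U\mathcal{G}_{A^{+}}$, that is,
$$(0, y) \in (U\mathcal{G}_{A^{+}})^\perp = [U(U\mathcal{G}_A)^\perp]^\perp = [(U^2\mathcal{G}_A)^\perp]^\perp = U^2\mathcal{G}_A = \mathcal{G}_A$$
in contradiction to the assumption that $\mathcal{G}_A$ is a graph. If $\mathcal{D}_{A^{+}}$ is dense in $\mathcal{H}$ (but $A$ is not necessarily closed!) then $A^{++}$ exists, and it follows that
$$\mathcal{G}_{A^{++}} = (U\mathcal{G}_{A^{+}})^\perp = [U(U\mathcal{G}_A)^\perp]^\perp = (\mathcal{G}_A)^{\perp\perp} = [\mathcal{G}_A],$$
that is, $\mathcal{G}_{A^{++}}$ is the closed subspace generated by $\mathcal{G}_A$.

We say that $B$ is an extension of $A$ (written $B \supset A$) if $\mathcal{G}_B \supset \mathcal{G}_A$. $A^{++}$ is therefore, if it exists (that is, if $\mathcal{D}_{A^{+}}$ is dense in $\mathcal{H}$), an extension of $A$: $A^{++} \supset A$. $A^{++}$ is closed. If $B \supset A$, and if $B$ is closed, then $\mathcal{G}_B \supset \mathcal{G}_A$ and we therefore obtain $\mathcal{G}_B \supset [\mathcal{G}_A]$, that is, $B \supset A^{++}$.

We define $A + B$ by $\mathcal{D}_{A+B} = \mathcal{D}_A \cap \mathcal{D}_B$ and $(A + B)x = Ax + Bx$. We define $AB$ by means of $\mathcal{D}_{AB}$ as the set of all $x \in \mathcal{D}_B$ for which $Bx \in \mathcal{D}_A$, and $(AB)x = A(Bx)$. If $Ax = 0$ only for $x = 0$, then we can define $A^{-1}$ by $\mathcal{D}_{A^{-1}} = \mathcal{W}_A$ and $A^{-1}(Ax) = x$. It is easy to show that $0A \subset 0$, $A(BC) = (AB)C$, $(A + B)C = AC + BC$, $A(B + C) \supset AB + AC$, $(AB)^{-1} = B^{-1}A^{-1}$ if $A^{-1}$, $B^{-1}$ exist; $(A + B)^{+} \supset A^{+} + B^{+}$; $(AB)^{+} \supset B^{+}A^{+}$.

$A$ is said to be self-adjoint if $A^{+} = A$.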
Clearly $\mathcal{D}_A$ must be dense in $\mathcal{H}$. $A$ is said to be symmetric if $\mathcal{D}_A$ is dense in $\mathcal{H}$ and $\langle x, Ay\rangle = \langle Ax, y\rangle$ for all $x, y \in \mathcal{D}_A$. If $A$ is symmetric, then $A^{+} \supset A$. If $A$ is self-adjoint, then it is also symmetric. From $A^{+} \supset A$ it follows that for a symmetric $A$, $\mathcal{D}_{A^{+}}$ is dense in $\mathcal{H}$ and that $A^{++}$ exists and satisfies $A^{++} \subset A^{+} = (A^{+})^{++} = (A^{++})^{+}$; therefore $A^{++}$ is also symmetric. A symmetric $A$ is said to be maximal if $A$ has no proper symmetric extension. A maximal $A$ must therefore satisfy $A^{++} = A$. A self-adjoint operator is maximal since $B \supset A$ implies $B^{+} \subset A^{+} = A$ and $B \subset B^{+} \subset A$, that is, $B = A$. If $A^{++} = A^{+}$ then $A^{++}$, the closure of $A$, is self-adjoint. Then we say that $A$ is essentially self-adjoint.

It is easy to show that the operator $A$ defined by (10.2) is self-adjoint. If we formulate quantum mechanics in the manner presented in this book, then unbounded operators occur in the representation of Lie groups, namely, as infinitesimal transformations (see VII and VIII). As such, they are defined on the basis of
representation theory (see VII, VIII and [10]) in the form (10.2). However, the introduction of unbounded scale observables (for example, the position and momentum observables introduced in VII) occurs on account of the fact that first a decision observable $\Sigma \to G$ was defined and then a spectral family is defined. Then the operator $A$ is introduced according to (10.2), so to speak, as a “condensation” of the spectral family. For the formulation of quantum mechanics presented here it is sufficient to consider only (10.2).

For applications it is important to note that the so-called Hamiltonian operator $H$, that is, the operator for the infinitesimal time translation, is only determined by the Galileo group for elementary systems (see VII, §2). For composite systems, on the other hand, it is necessary to discover the Hamiltonian $H$. In this way we discover an operator which is not always well defined, as, for example, we find in VIII (5.8), that is, $\mathcal{D}_H$ is not fully known, and even less is known about the spectral family $E(\lambda)$, in terms of which a time displacement is represented by the unitary operator
$$U(t) = \int e^{i\lambda t}\, dE(\lambda) = e^{iHt}.$$
Often we must seek to find an extension for an operator $H$ which was only given as a symmetric operator. The “correct” extension can only be obtained on the basis of “physical” considerations. The extension must be a self-adjoint operator (or at least essentially self-adjoint); only in this way is it possible to uniquely define the spectral family and the operator $e^{iHt}$ as a unitary operator. The problem of the definition of $H$ is especially important for the term scheme of atoms (see XI-XV) and for scattering theory (see XVI).

In the following we shall consider the circumstances under which a symmetric operator can be extended to a self-adjoint operator. According to the previous considerations it therefore follows that the maximal symmetric operators which are not self-adjoint are “physically useless.”

We now proceed from a symmetric operator $A$ and seek to define $(A - i1)(A + i1)^{-1}$. This form is motivated by the Cayley transform $w = (z - i)/(z + i)$, which maps the real $z$-axis onto the unit circle in the $w$ plane. For $x \in \mathcal{D}_A$ we obtain
$$\|(A + i1)x\|^2 = \|Ax\|^2 + \|x\|^2 \ge \|x\|^2. \tag{10.3}$$
Thus we find that $(A + i1)x = 0$ only for $x = 0$, that is, $(A + i1)^{-1}$ exists, where $\mathcal{D}_{(A+i1)^{-1}} = (A + i1)\mathcal{D}_A$. Therefore the operator
$$U = (A - i1)(A + i1)^{-1} \tag{10.4}$$
is well defined with $\mathcal{D}_U = (A + i1)\mathcal{D}_A$. It directly follows that $\mathcal{W}_U = (A - i1)\mathcal{D}_A$. Each $y \in \mathcal{D}_U$, therefore, is of the form $y = (A + i1)x$ where $x \in \mathcal{D}_A$ and, for $y' = Uy$, it follows that $y' = (A - i1)x$. Thus, from (10.3) it follows that $\|Uy\| = \|y\|$, that is, $U$ is an isometric operator on $\mathcal{D}_U$ (if $\mathcal{D}_U = \mathcal{H}$ then $U$ is, in the sense of §7, an isometric operator).
From $Uy = (A - i1)x$ and $y = (A + i1)x$ it follows that $x = (1 - U)y/2i$ and that $Ax = (1 + U)y/2$. $(1 - U)^{-1}$ exists, because from $(1 - U)y = 0$ it follows that $x = 0$ and that $y = (A + i1)0 = 0$. Thus we find that
$$A = i(1 + U)(1 - U)^{-1} \tag{10.5}$$
since $\mathcal{D}_A = (1 - U)\mathcal{D}_U$.

Conversely, suppose that $U$ is an isometric operator in a subspace $\mathcal{D}_U$. Then it follows that $\langle x, y\rangle = \langle Ux, Uy\rangle$ for all $x, y \in \mathcal{D}_U$. Since $U$ is isometric, $U^{-1}$ is defined in $\mathcal{W}_U$. From $(1 - U)y = 0$ it follows that for $z \in (1 - U^{-1})\mathcal{W}_U = (1 - U^{-1})U\mathcal{D}_U = (U - 1)\mathcal{D}_U = (1 - U)\mathcal{D}_U$, that is, for $z = (1 - U^{-1})x$ and $x \in \mathcal{W}_U$, we obtain
$$\langle z, y\rangle = \langle x, y\rangle - \langle U^{-1}x, y\rangle = \langle x, y\rangle - \langle x, Uy\rangle = \langle x, (1 - U)y\rangle = 0.$$
If $(1 - U)\mathcal{D}_U$ is dense in $\mathcal{H}$, then from $\langle z, y\rangle = 0$ for all $z$ in this dense set it follows that $y = 0$. If $U$ is an isometric operator in $\mathcal{D}_U$ and $(1 - U)\mathcal{D}_U$ is dense in $\mathcal{H}$, then from (10.5) an operator $A$ is defined for which $\mathcal{D}_A = (1 - U)\mathcal{D}_U$. Therefore $\mathcal{D}_A$ is dense in $\mathcal{H}$ and $A^{+}$ is defined. For $x, y \in \mathcal{D}_A$, that is, $x = (1 - U)v$, $y = (1 - U)w$, where $w, v \in \mathcal{D}_U$, it follows that
$$\langle x, Ay\rangle = \langle (1 - U)v, i(1 + U)w\rangle = i\langle v, w\rangle - i\langle Uv, Uw\rangle + i\langle v, Uw\rangle - i\langle Uv, w\rangle = i\langle v, Uw\rangle - i\langle Uv, w\rangle$$
and
$$\langle Ax, y\rangle = \langle i(1 + U)v, (1 - U)w\rangle = -i\langle v, w\rangle + i\langle Uv, Uw\rangle + i\langle v, Uw\rangle - i\langle Uv, w\rangle = i\langle v, Uw\rangle - i\langle Uv, w\rangle,$$
that is, $\langle x, Ay\rangle = \langle Ax, y\rangle$; $A$ is therefore symmetric. For an operator $A$ defined as above by $U$, it follows that (10.4) holds. Symmetric operators $A$ and isometric operators $U$ for which $(1 - U)\mathcal{D}_U$ is dense in $\mathcal{H}$ are uniquely related by (10.4) and (10.5), with $\mathcal{D}_U = (A + i1)\mathcal{D}_A$ and $\mathcal{D}_A = (1 - U)\mathcal{D}_U$. Each proper symmetric extension of $A$ leads to a proper extension of $U$ and vice versa. In this way the problem of symmetric extensions of $A$ is directly related to that of isometric extensions of $U$.

From (10.3) it follows that for $z_n = (A + i1)x_n$ the sequence $z_n$ is convergent only if the sequences $Ax_n$ and $x_n$ are convergent. If $A$ is closed, then $\mathcal{D}_U$ is closed (and since $U$ is continuous, $U$ is also closed). Therefore $\mathcal{D}_U$ and $\mathcal{W}_U$ are closed subspaces of $\mathcal{H}$.
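The scalar Cayley transform $w = (z - i)/(z + i)$ behind (10.4), and its inverse $z = i(1 + w)/(1 - w)$ behind (10.5), can be checked with ordinary complex arithmetic. This is a sketch for numbers only, not for operators; the function names and the sample points are assumptions made for illustration.

```python
import cmath
import math

def cayley(z):
    """w = (z - i)/(z + i); defined for z != -i."""
    return (z - 1j) / (z + 1j)

def inverse_cayley(w):
    """z = i(1 + w)/(1 - w); defined for w != 1."""
    return 1j * (1 + w) / (1 - w)

samples = [-50.0, -1.0, 0.0, 0.3, 2.0, 1000.0]
moduli = [abs(cayley(complex(z))) for z in samples]               # real z lands on |w| = 1
round_trip = [inverse_cayley(cayley(complex(z))) for z in samples]

# A point w = exp(i*phi) on the unit circle is mapped back to the real number -cot(phi/2).
phi = 2.0
z_from_phase = inverse_cayley(cmath.exp(1j * phi))
minus_cot = -math.cos(phi / 2) / math.sin(phi / 2)
```

Real inputs give `moduli` equal to 1 up to rounding, the round trip returns the original values, and `z_from_phase` agrees with `minus_cot`. The point $w = 1$ is never reached from a finite real $z$, which is the scalar counterpart of the existence of $(1 - U)^{-1}$.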
Conversely, if $\mathcal{D}_U$ is closed (and therefore $\mathcal{W}_U$ is also closed) then it follows that $A$ is closed. Since $A^{++}$ is the smallest closed extension of $A$, $A^{++}$ corresponds to that extension of $U$ for which the domain of definition is the closure of $\mathcal{D}_U$; this extension of $U$ is uniquely determined. For this reason we need only consider closed symmetric operators $A$. $\mathcal{D}_U$ and $\mathcal{W}_U$ are therefore closed subspaces; $\mathcal{D}_U^\perp$ and $\mathcal{W}_U^\perp$ are called the defect spaces of $A$. For $x \in \mathcal{D}_U^\perp$ and $y \in \mathcal{D}_A$ it follows that $\langle x, (A + i1)y\rangle = 0$, that is, $\langle x, Ay\rangle = \langle ix, y\rangle$. In this way $\mathcal{D}_U^\perp \subset \mathcal{D}_{A^{+}}$ and $A^{+}x = ix$ for $x \in \mathcal{D}_U^\perp$. In the same way, for $z \in \mathcal{W}_U^\perp$ we obtain $A^{+}z = -iz$. If, conversely, $A^{+}x = ix$, then, for all $y \in \mathcal{D}_A$ it follows that $\langle x, (A + i1)y\rangle = 0$,
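that is, $x \in \mathcal{D}_U^\perp$. Since $A = A^{+}$ for a self-adjoint operator, $\langle x, Ax\rangle$ is real, so that we conclude that $\mathcal{D}_U^\perp = \{0\}$, $\mathcal{W}_U^\perp = \{0\}$, that is, $\mathcal{D}_U = \mathcal{W}_U = \mathcal{H}$. According to §7 this is equivalent to the condition that $U$ is unitary.

We now assume the converse, that is, $U$ is unitary, that is, $\mathcal{D}_U = \mathcal{W}_U = \mathcal{H}$; then $A$ is self-adjoint. In order to show this we prove the following for a closed and symmetric $A$: $\mathcal{D}_{A^{+}} = \mathcal{D}_A + \mathcal{D}_U^\perp + \mathcal{W}_U^\perp$. For $x \in \mathcal{D}_{A^{+}}$ we partition $(A^{+} + i1)x$ into components in $\mathcal{D}_U$ and in $\mathcal{D}_U^\perp$, that is, $(A^{+} + i1)x = (A + i1)z + y$ where $z \in \mathcal{D}_A$ and $y \in \mathcal{D}_U^\perp$. Since $Az = A^{+}z$ and $A^{+}y = iy$ it follows that $(A^{+} + i1)x = (A^{+} + i1)z + (A^{+} + i1)y'$, where $y' = (1/2i)y$. Therefore $A^{+}(x - z - y') = -i(x - z - y')$, that is, $x - z - y' \in \mathcal{W}_U^\perp$. Therefore $x = z + y' + r$, where $z \in \mathcal{D}_A$, $y' \in \mathcal{D}_U^\perp$ and $r \in \mathcal{W}_U^\perp$. Thus we have proven $\mathcal{D}_{A^{+}} = \mathcal{D}_A + \mathcal{D}_U^\perp + \mathcal{W}_U^\perp$.

If $\mathcal{D}_U^\perp + \mathcal{W}_U^\perp \ne \{0\}$, then $\mathcal{D}_A$ is a proper subset of $\mathcal{D}_{A^{+}}$ because for $y \in \mathcal{D}_U^\perp$, $r \in \mathcal{W}_U^\perp$, from the assumption that $y + r \in \mathcal{D}_A$ it follows that $(A + i1)(y + r) = (A^{+} + i1)(y + r) = 2iy$ and $(A - i1)(y + r) = (A^{+} - i1)(y + r) = -2ir$. Since $\mathcal{D}_U = (A + i1)\mathcal{D}_A$ we obtain $y \in \mathcal{D}_U$ and, since $y \in \mathcal{D}_U^\perp$, $y = 0$. From $(A - i1)\mathcal{D}_A = \mathcal{W}_U$ it also follows that $r = 0$. Therefore the partition $x = z + y + r$, where $x \in \mathcal{D}_{A^{+}}$, uniquely defines $z \in \mathcal{D}_A$, $y \in \mathcal{D}_U^\perp$, $r \in \mathcal{W}_U^\perp$, and we find that $\mathcal{D}_A + \mathcal{D}_U^\perp + \mathcal{W}_U^\perp$ is a direct sum of subspaces. If $U$ is unitary, then $A^{+} = A$.

Since there is a 1:1 correspondence between the extensions of the isometric operators $\mathcal{D}_U \to \mathcal{W}_U$ and the extensions of $A$, we need only investigate the possibilities of finding isometric extensions of $U$. From the isometry it easily follows that an extension $V$ of $U$ maps $\mathcal{D}_V \cap \mathcal{D}_U^\perp$ isometrically onto $\mathcal{W}_V \cap \mathcal{W}_U^\perp$, that is, for $\mathcal{T} = \mathcal{D}_V \cap \mathcal{D}_U^\perp$ and $\mathcal{S} = \mathcal{W}_V \cap \mathcal{W}_U^\perp$ the mapping $\mathcal{T} \to \mathcal{S}$ is an isomorphism (in the sense of §13). Therefore $\mathcal{T}$ and $\mathcal{S}$ have the same dimension.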
Conversely, if $\mathcal{T}$ and $\mathcal{S}$ are subspaces having the same dimension for which $\mathcal{T} \perp \mathcal{D}_U$, $\mathcal{S} \perp \mathcal{W}_U$, then we obtain an isometric extension $V$ of $U$ with the aid of one of infinitely many isomorphic maps $\mathcal{T} \to \mathcal{S}$ by $V(x + y) = Ux + Vy$ for $x \in \mathcal{D}_U$, $y \in \mathcal{T}$, which satisfies $\mathcal{D}_V = \mathcal{D}_U \oplus \mathcal{T}$, $\mathcal{W}_V = \mathcal{W}_U \oplus \mathcal{S}$.

Therefore, $A$ has a self-adjoint extension if and only if $\mathcal{D}_U^\perp$ and $\mathcal{W}_U^\perp$ have the same dimension. If, in this case, $A$ is not self-adjoint, then there exist infinitely many self-adjoint extensions of $A$.

Therefore, if we wish to discover a Hamiltonian operator $H$ (for example, in the form VIII (5.8)), it is not sufficient to show that $H$ is symmetric in a certain domain of definition $\mathcal{D}_H$. Indeed, if the defect spaces $\mathcal{D}_U^\perp$, $\mathcal{W}_U^\perp$ do not have the same dimension, the operator cannot be used as a Hamiltonian operator. If, however, the dimensions of $\mathcal{D}_U^\perp$ and $\mathcal{W}_U^\perp$ are the same, then we are still not finished as long as $\mathcal{D}_U \ne \mathcal{H}$. If $\mathcal{D}_U^\perp = \{0\} = \mathcal{W}_U^\perp$, then $H^{++}$ is a self-adjoint operator, that is, $H$ is essentially self-adjoint. Such an $H$ is as good as a self-adjoint $H$ because there is a uniquely defined closure of $H$ which is self-adjoint. The search procedure for $H$ can only be carried out using “physical considerations” because, in the case that $H$ is not essentially self-adjoint, there are infinitely many possible extensions. If $H$ is essentially self-adjoint, then we can use the operator $H^{++}$ instead of the operator $H$.
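The condition that $A$ is self-adjoint is equivalent to the condition that $U$ is unitary. If $U$ is unitary then, according to §8, there exists a spectral family $D(\varphi)$ which satisfies (8.6). From (10.5) it follows that
$$A = \int_0^{2\pi} i\,\frac{1 + e^{i\varphi}}{1 - e^{i\varphi}}\, dD(\varphi) = \int_0^{2\pi} \Bigl(-\cot\frac{\varphi}{2}\Bigr)\, dD(\varphi)$$
because from $\mathcal{D}_A = (1 - U)\mathcal{H}$ for $x = (1 - U)y$ (with arbitrary $y$) we obtain
$$D(\varphi)x = \int_0^{\varphi} (1 - e^{i\varphi'})\, dD(\varphi')y$$
and we obtain
$$\|D(\varphi)x\|^2 = \int_0^{\varphi} |1 - e^{i\varphi'}|^2\, d\|D(\varphi')y\|^2 = \int_0^{\varphi} 4\sin^2\frac{\varphi'}{2}\, d\|D(\varphi')y\|^2 ,$$
from which it follows that
$$\int_0^{2\pi} \cot^2\frac{\varphi}{2}\, d\|D(\varphi)x\|^2 = \int_0^{2\pi} 4\cot^2\frac{\varphi}{2}\,\sin^2\frac{\varphi}{2}\, d\|D(\varphi)y\|^2 = \int_0^{2\pi} 4\cos^2\frac{\varphi}{2}\, d\|D(\varphi)y\|^2 < \infty,$$
that is, $\int_0^{2\pi} (-\cot(\varphi/2))\, dD(\varphi)x$ exists. For $\lambda = -\cot(\varphi/2)$ and $E(\lambda) = D(-2\,\mathrm{arccot}\,\lambda)$ it follows that
$$A = \int_{-\infty}^{\infty} \lambda\, dE(\lambda), \tag{10.6}$$
where the integral is defined as an operator in $\mathcal{D}_A = (1 - U)\mathcal{H}$. Since $i1 + \int \lambda\, dE(\lambda)$ maps the subspace $(1 - U)\mathcal{H}$ surjectively onto $\mathcal{H}$, (10.6) cannot converge for vectors other than those of $\mathcal{D}_A = (1 - U)\mathcal{H}$. The uniqueness of the spectral family follows in a similar fashion as that described in §8.

11 The Trace as a Bilinear Form

Let $A \in \mathcal{B}'_{+}(\mathcal{H})$ and let $x_\nu$ be a complete orthonormal basis. The sum
$$\sum_\nu \langle x_\nu, Ax_\nu\rangle \tag{11.1}$$
defines a real number $\ge 0$ or $+\infty$. For another complete orthonormal basis $y_\mu$ we obtain
$$\langle y_\mu, Ay_\mu\rangle = \|A^{1/2}y_\mu\|^2 = \sum_\nu |\langle x_\nu, A^{1/2}y_\mu\rangle|^2 = \sum_\nu |\langle y_\mu, A^{1/2}x_\nu\rangle|^2 ,$$
where $A^{1/2}$ is the positive square root of $A$. Thus it follows that
$$\sum_{\mu=1}^{N} \langle y_\mu, Ay_\mu\rangle = \sum_{\mu=1}^{N}\sum_{\nu=1}^{\infty} |\langle y_\mu, A^{1/2}x_\nu\rangle|^2 \le \sum_{\nu=1}^{\infty}\sum_{\mu=1}^{\infty} |\langle y_\mu, A^{1/2}x_\nu\rangle|^2 = \sum_{\nu=1}^{\infty} \|A^{1/2}x_\nu\|^2 = \sum_{\nu=1}^{\infty} \langle x_\nu, Ax_\nu\rangle,$$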
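and, if we exchange the $x_\nu$ and $y_\mu$, we obtain
$$\sum_{\mu=1}^{\infty} \langle y_\mu, Ay_\mu\rangle = \sum_{\nu=1}^{\infty} \langle x_\nu, Ax_\nu\rangle.$$
We find that (11.1) is independent of the choice of $x_\nu$; we call this invariant the trace of $A$, and denote it by $\mathrm{tr}(A)$; for $A \in \mathcal{B}'_{+}(\mathcal{H})$, $\mathrm{tr}(A)$ is a nonnegative real number or $+\infty$.

For an arbitrary $A \in \mathcal{B}'(\mathcal{H})$ we define
$$\|A\|_s = \mathrm{tr}\bigl(\sqrt{A^2}\bigr) = \mathrm{tr}(A_{+}) + \mathrm{tr}(A_{-}), \tag{11.2}$$
where $\sqrt{A^2}$ is the positive root and $A_{+}$, $A_{-}$ are the positive and negative parts of $A$. We shall denote by $\mathcal{B}(\mathcal{H})$ the subset of all $A \in \mathcal{B}'(\mathcal{H})$ for which $\|A\|_s < +\infty$. The operators in $\mathcal{B}(\mathcal{H})$ are called “operators of trace class.” For $A \in \mathcal{B}(\mathcal{H})$ both $\mathrm{tr}(A_{+})$ and $\mathrm{tr}(A_{-})$ are finite. Therefore $\mathrm{tr}(A)$ exists because $\mathrm{tr}(A) = \sum_\nu \langle x_\nu, Ax_\nu\rangle = \mathrm{tr}(A_{+}) - \mathrm{tr}(A_{-})$.

We will now prove that (11.2) is a norm in $\mathcal{B}(\mathcal{H})$ and that $\mathcal{B}(\mathcal{H})$ is a base norm space (see AIII, §6) and a Banach space. We will now show that
$$\|A\|_s = \sup_{E_1, E_2} \{\mathrm{tr}(E_1AE_1) - \mathrm{tr}(E_2AE_2)\}, \tag{11.3}$$
where the supremum is taken over all projection operators $E_1$, $E_2$ with finite dimension of $E_1\mathcal{H}$, $E_2\mathcal{H}$, so that $\|A\|_s$ is a norm.

It easily follows that for $A \in \mathcal{B}'(\mathcal{H})$ and finite-dimensional $E\mathcal{H}$ the expressions $\mathrm{tr}(EA_{+}E)$, $\mathrm{tr}(EA_{-}E)$ and $\mathrm{tr}(EAE)$ exist and are finite. In particular, for $E = P_x$ we obtain $\mathrm{tr}(P_xAP_x) = \mathrm{tr}(P_xA) = \mathrm{tr}(AP_x) = \langle x, Ax\rangle$. For $A \in \mathcal{B}'_{+}(\mathcal{H})$ we obtain $EAE \in \mathcal{B}'_{+}(\mathcal{H})$. For finite or infinite dimensional $E\mathcal{H}$ it easily follows that $\mathrm{tr}(EAE) \le \mathrm{tr}(A)$.

Let $E(\lambda)$ denote the spectral family of $A \in \mathcal{B}'(\mathcal{H})$; then $A_{-} = -E(0)AE(0)$ and $A_{+} = (1 - E(0))A(1 - E(0))$. From (11.2) it follows that
$$\|A\|_s = \mathrm{tr}\bigl((1 - E(0))A(1 - E(0))\bigr) - \mathrm{tr}\bigl(E(0)AE(0)\bigr).$$
Let $x_\nu$ be a complete orthonormal basis for $(1 - E(0))\mathcal{H} = (E(0)\mathcal{H})^\perp$ and let $y_\mu$ be a complete orthonormal basis for $E(0)\mathcal{H}$; it therefore follows that
$$\|A\|_s = \sum_\nu \langle x_\nu, Ax_\nu\rangle - \sum_\mu \langle y_\mu, Ay_\mu\rangle.$$
Let $E_{1N}$, $E_{2M}$ be the projections onto the spaces spanned by $x_1, \ldots, x_N$ and $y_1, \ldots, y_M$, respectively. We therefore find that
$$\|A\|_s = \lim_{\substack{N \to \infty\\ M \to \infty}} \bigl[\mathrm{tr}(E_{1N}AE_{1N}) - \mathrm{tr}(E_{2M}AE_{2M})\bigr].$$
Therefore, using the supremum from (11.3), we obtain
$$\sup_{E_1, E_2} \{\mathrm{tr}(E_1AE_1) - \mathrm{tr}(E_2AE_2)\} \ge \|A\|_s .$$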
Since $0 \le \mathrm{tr}(EA_{+}E) \le \mathrm{tr}(A_{+})$ and $0 \le \mathrm{tr}(EA_{-}E) \le \mathrm{tr}(A_{-})$ it follows that
$$\mathrm{tr}(EAE) = \mathrm{tr}(EA_{+}E) - \mathrm{tr}(EA_{-}E) \le \mathrm{tr}(A_{+}), \qquad -\mathrm{tr}(EAE) = -\mathrm{tr}(EA_{+}E) + \mathrm{tr}(EA_{-}E) \le \mathrm{tr}(A_{-}),$$
and we obtain $\mathrm{tr}(E_1AE_1) - \mathrm{tr}(E_2AE_2) \le \mathrm{tr}(A_{+}) + \mathrm{tr}(A_{-})$, from which it follows that $\sup_{E_1, E_2}\{\mathrm{tr}(E_1AE_1) - \mathrm{tr}(E_2AE_2)\} \le \|A\|_s$. Therefore we have proven (11.3). $\mathcal{B}(\mathcal{H})$ is therefore a normed space with norm $\|\cdot\|_s$.

$\mathcal{B}(\mathcal{H})$ is also a Banach space. From $\|A\| = \sup_{\|x\| \le 1} |\langle x, Ax\rangle|$ it is easy to show that $\|A\|_s \ge \|A\|$. Thus it follows that a Cauchy sequence $A_\nu \in \mathcal{B}(\mathcal{H})$ with respect to the norm $\|\cdot\|_s$ is also a Cauchy sequence with respect to the norm $\|\cdot\|$. Therefore there exists an $A \in \mathcal{B}'(\mathcal{H})$ such that $A_\nu \to A$. We will now show that $\|A - A_\nu\|_s \to 0$ (from this result and $\|A\|_s \le \|A - A_\nu\|_s + \|A_\nu\|_s$ it follows that $\|A\|_s < \infty$). We now use (11.3) and consider the case in which $E\mathcal{H}$ has dimension $N$:
$$|\mathrm{tr}(E(A - A_\nu)E)| \le |\mathrm{tr}(E(A - A_\rho)E)| + |\mathrm{tr}(E(A_\rho - A_\nu)E)| \le \|A - A_\rho\|\,N + \|A_\rho - A_\nu\|_s .$$
For a given $\varepsilon > 0$ choose $M$ such that $\|A_\rho - A_\nu\|_s < \varepsilon$ for $\rho, \nu > M$. Therefore it follows that
$$|\mathrm{tr}(E(A - A_\nu)E)| \le \varepsilon + N\|A - A_\rho\| \tag{11.4}$$
for all $\rho, \nu > M$ and all $E$ (dimension of $E\mathcal{H} \le N$), where $M$ does not depend on $N$! Thus, for two projections $E_1$, $E_2$ and for fixed $\nu > M$ we obtain
$$\mathrm{tr}(E_1(A - A_\nu)E_1) - \mathrm{tr}(E_2(A - A_\nu)E_2) \le 2\varepsilon + (N_1 + N_2)\|A - A_\rho\|,$$
where $N_1$, $N_2$ are the dimensions of $E_1\mathcal{H}$ and $E_2\mathcal{H}$. For $\rho \to \infty$ it follows that
$$\mathrm{tr}(E_1(A - A_\nu)E_1) - \mathrm{tr}(E_2(A - A_\nu)E_2) \le 2\varepsilon$$
and we obtain $\|A - A_\nu\|_s \le 2\varepsilon$ for all $\nu > M$. Therefore $\|A - A_\nu\|_s$ is finite, and $\|A - A_\nu\|_s \to 0$. Therefore $\mathcal{B}(\mathcal{H})$ is a Banach space.

For $A \in \mathcal{B}(\mathcal{H})$ it is necessary and sufficient that $A$ has the form (9.2) (that is, it has only a discrete spectrum) and $\sum_\nu |\mu_\nu| < \infty$ (that is, $A$ must also be compact).

Proof. From (9.2) it follows that $\|A\|_s = \sum_\nu |\mu_\nu|$.
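For a finite matrix both statements can be verified by hand: the sum (11.1) does not depend on the orthonormal basis, and $\|A\|_s$ is the sum of the absolute values of the eigenvalues. The sketch below assumes a real symmetric $2 \times 2$ matrix and orthonormal bases obtained by rotation; the closed-form eigenvalue formula is specific to this case, and the function names are illustrative.

```python
import math

def quad_form(A, x):
    """<x, A x> for a real symmetric 2x2 matrix A and a real vector x."""
    Ax = [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]
    return x[0] * Ax[0] + x[1] * Ax[1]

def trace_in_basis(A, theta):
    """Sum of <x_v, A x_v> over the orthonormal basis rotated by the angle theta."""
    x1 = [math.cos(theta), math.sin(theta)]
    x2 = [-math.sin(theta), math.cos(theta)]
    return quad_form(A, x1) + quad_form(A, x2)

def eigenvalues(A):
    """Eigenvalues of a real symmetric 2x2 matrix in closed form."""
    mean = (A[0][0] + A[1][1]) / 2
    radius = math.sqrt(((A[0][0] - A[1][1]) / 2) ** 2 + A[0][1] ** 2)
    return mean + radius, mean - radius

def trace_norm(A):
    """||A||_s = tr(A_+) + tr(A_-), i.e. the sum of |eigenvalue|."""
    return sum(abs(m) for m in eigenvalues(A))

A = [[1.0, 2.0], [2.0, -2.0]]         # eigenvalues 2 and -3
t0 = trace_in_basis(A, 0.0)
t1 = trace_in_basis(A, 0.7)
norm_s = trace_norm(A)
```

Here `t0` and `t1` agree up to rounding, while `norm_s` is larger than `abs(t0)` because the positive and negative parts of $A$ both contribute to $\|A\|_s$.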
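If $A \in \mathcal{B}(\mathcal{H})$ and if, as in §9, for the spectral representation there exists an interval $\lambda$, $\lambda + \varepsilon$ for which $E(\lambda + \varepsilon) - E(\lambda)$ is infinite dimensional, then for a complete orthonormal basis $x_\nu$ of $[E(\lambda + \varepsilon) - E(\lambda)]\mathcal{H}$ it follows that
$$\|A\|_s \ge \sum_\nu |\langle x_\nu, Ax_\nu\rangle| = \infty.$$
Therefore $A$ must have the form (9.2).

$\mathcal{B}(\mathcal{H})$ is an ordered vector space with positive cone $\mathcal{B}_{+}(\mathcal{H})$ of all $A \ge 0$. The set
$$K = \{W \mid W \ge 0,\ \mathrm{tr}(W) = 1\} \tag{11.5}$$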
is a base for this cone (see AIII, §6) because from $A \ge 0$, $A \ne 0$, it follows that $A\|A\|_s^{-1} = A(\mathrm{tr}(A))^{-1} \in K$. For an $A \in \mathcal{B}(\mathcal{H})$ it follows that $A = \|A_+\|_s W_1 - \|A_-\|_s W_2$ where $W_1 = A_+\|A_+\|_s^{-1}$ and $W_2 = A_-\|A_-\|_s^{-1}$. $A$ therefore has the minimal decomposition property (see AIII, §6). Thus it follows directly that the unit sphere of $\mathcal{B}(\mathcal{H})$ is equal to $\mathrm{co}(K \cup -K)$ and that $\mathcal{B}(\mathcal{H})$ is a base norm space.

We shall now show that $\mathcal{B}(\mathcal{H})$ is norm-separable. It suffices to show that $K$ is norm-separable. Since $W = \sum_\nu w_\nu P_{x_\nu}$ with $w_\nu \ge 0$, $\sum_\nu w_\nu = 1$ for $W \in K$, it suffices to show that the set of all $P_x$ in $K$ is norm-separable. Since $P_{x_1} - P_{x_2}$ is nonzero only in the two-dimensional space generated by $x_1$ and $x_2$, it easily follows that $\|P_{x_1} - P_{x_2}\|_s \le 2\|P_{x_1} - P_{x_2}\|$, where $\|P_{x_1} - P_{x_2}\|$ is the operator norm in $\mathcal{L}(\mathcal{H})$. Since $\|P_{x_1} - P_{x_2}\| \le 2\|x_1 - x_2\|$ it therefore follows that $\|P_{x_1} - P_{x_2}\|_s \le 4\|x_1 - x_2\|$. Therefore, the set of $P_x$ is norm-separable as a subset of $\mathcal{B}(\mathcal{H})$ if the set of $x$ for which $\|x\| = 1$ is separable as a subset of $\mathcal{H}$. This is indeed the case, because $\mathcal{H}$ is separable.

A central and most important result is that, for $A \in \mathcal{B}(\mathcal{H})$, $B \in \mathcal{L}_r(\mathcal{H})$, the expression $\mathrm{tr}(AB)$ is a continuous bilinear form on $\mathcal{B}(\mathcal{H}) \times \mathcal{L}_r(\mathcal{H})$ and that each continuous linear form over $\mathcal{B}(\mathcal{H})$ is of the form $\mathrm{tr}(AB)$ with suitably chosen $B \in \mathcal{L}_r(\mathcal{H})$. Therefore $\mathcal{L}_r(\mathcal{H})$ may be identified with the dual Banach space $\mathcal{B}'(\mathcal{H})$ of $\mathcal{B}(\mathcal{H})$, and we find that $|\mathrm{tr}(AB)| \le \|A\|_s\|B\|$. We now prove this result in several steps. First, we show that
$$\sum_\nu \langle x_\nu, ABx_\nu\rangle = \sum_\nu \langle x_\nu, BAx_\nu\rangle = \mathrm{tr}(AB) = \mathrm{tr}(BA)$$
is convergent and is independent of the choice $x_\nu$ of a complete orthonormal basis. It suffices to show that this result is satisfied for $A_+$ and $A_-$ where $A = A_+ - A_-$, that is, for an $A \ge 0$. If $|\mathrm{tr}(A_+B)| \le \|A_+\|_s\|B\|$ and $|\mathrm{tr}(A_-B)| \le \|A_-\|_s\|B\|$, then it directly follows that $|\mathrm{tr}(AB)| \le \|A\|_s\|B\|$. We shall now show that
$$\sum_\nu \langle x_\nu, A^{1/2}BA^{1/2}x_\nu\rangle = \mathrm{tr}(A^{1/2}BA^{1/2}) \qquad (11.6)$$
is convergent.
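This result follows directly from
$$\sum_{\nu=1}^{\infty} |\langle x_\nu, A^{1/2}BA^{1/2}x_\nu\rangle| = \sum_{\nu=1}^{\infty} |\langle A^{1/2}x_\nu, BA^{1/2}x_\nu\rangle| \le \|B\| \sum_{\nu=1}^{\infty} \|A^{1/2}x_\nu\|^2 = \|B\|\,\mathrm{tr}(A) = \|B\|\,\|A\|_s.$$
Substituting $B_+$ for $B$, we find that $\mathrm{tr}(A^{1/2}B_+A^{1/2}) < \infty$, that is, $A^{1/2}B_+A^{1/2} \in \mathcal{B}_+(\mathcal{H})$. Similarly we find that $A^{1/2}B_-A^{1/2} \in \mathcal{B}_+(\mathcal{H})$ and we obtain $A^{1/2}BA^{1/2} \in \mathcal{B}(\mathcal{H})$, and we have proven (11.6). We will now show that
$$\sum_\nu \langle x_\nu, A^{1/2}BA^{1/2}x_\nu\rangle = \sum_\nu \langle x_\nu, BAx_\nu\rangle. \qquad (11.7)$$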
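If this is the case, the right-hand side is independent of the orthonormal basis because the left-hand side is, according to (11.6). That is, the existence of $\mathrm{tr}(BA)$ follows from (11.7). Since $A^{1/2}BA^{1/2}$ is self-adjoint, the left-hand side of (11.7) is real, and therefore so is the right-hand side. Since $\langle x_\nu, BAx_\nu\rangle = \overline{\langle x_\nu, ABx_\nu\rangle}$, the existence of $\mathrm{tr}(BA)$ guarantees the existence of $\mathrm{tr}(AB)$, and $\mathrm{tr}(AB) = \mathrm{tr}(BA)$. From the previous results we conclude that $|\mathrm{tr}(BA)| \le \|A\|_s\|B\|$.

In order to prove (11.7) we need only show that, for $N \to \infty$,
$$\sum_{\mu=1}^{N}\Big[\sum_\nu \langle A^{1/2}x_\nu, x_\mu\rangle\langle x_\mu, BA^{1/2}x_\nu\rangle\Big] \to \sum_{\mu=1}^{\infty}\Big[\sum_\nu \langle A^{1/2}x_\nu, x_\mu\rangle\langle x_\mu, BA^{1/2}x_\nu\rangle\Big] = \sum_\nu \langle A^{1/2}x_\nu, BA^{1/2}x_\nu\rangle = \mathrm{tr}(A^{1/2}BA^{1/2}). \qquad (11.8)$$
For then, since
$$\sum_{\mu=1}^{N}\sum_\nu \langle x_\mu, BA^{1/2}x_\nu\rangle\langle x_\nu, A^{1/2}x_\mu\rangle = \sum_{\mu=1}^{N} \langle x_\mu, BA^{1/2}A^{1/2}x_\mu\rangle \to \sum_{\mu=1}^{\infty} \langle x_\mu, BAx_\mu\rangle,$$
we have proven (11.7).

In order to prove (11.8) we introduce the projection $E_N$ onto the subspace spanned by $x_1, \ldots, x_N$. Clearly $E_N \to 1$. Thus (11.8) becomes
$$\sum_\nu \langle A^{1/2}x_\nu, E_NBA^{1/2}x_\nu\rangle \to \sum_\nu \langle A^{1/2}x_\nu, BA^{1/2}x_\nu\rangle.$$
(11.8) will be proven if we can show that
$$\sum_\nu \langle A^{1/2}x_\nu, (1 - E_N)BA^{1/2}x_\nu\rangle \to 0. \qquad (11.9)$$
From
$$|\langle A^{1/2}x_\nu, (1 - E_N)BA^{1/2}x_\nu\rangle| = |\langle (1 - E_N)A^{1/2}x_\nu, (1 - E_N)BA^{1/2}x_\nu\rangle| \le \|(1 - E_N)A^{1/2}x_\nu\|\,\|B\|\,\|A^{1/2}x_\nu\|$$
we obtain
$$\Big|\sum_\nu \langle A^{1/2}x_\nu, (1 - E_N)BA^{1/2}x_\nu\rangle\Big| \le \|B\| \sum_\nu \|(1 - E_N)A^{1/2}x_\nu\|\,\|A^{1/2}x_\nu\| \le \|B\| \sum_{\nu=1}^{M} \|(1 - E_N)A^{1/2}x_\nu\|\,\|A^{1/2}x_\nu\| + \|B\| \sum_{\nu=M+1}^{\infty} \|A^{1/2}x_\nu\|^2.$$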
Since $\sum_\nu \|A^{1/2}x_\nu\|^2 = \mathrm{tr}(A)$ exists, it is possible to find an $M$ such that $\|B\| \sum_{\nu=M+1}^{\infty} \|A^{1/2}x_\nu\|^2 < \varepsilon$. Fix $M$ and choose $N$ so large that $\|B\| \sum_{\nu=1}^{M} \|(1 - E_N)A^{1/2}x_\nu\|\,\|A^{1/2}x_\nu\| < \varepsilon$, from which (11.9) is proven.

Now we must show the converse, that each continuous linear form $l$ on $\mathcal{B}(\mathcal{H})$ is of the form $l(A) = \mathrm{tr}(AB)$ for a suitably chosen $B \in \mathcal{L}_r(\mathcal{H})$. First we show that we may easily extend $l(A)$ onto the set of "all" operators of trace class. We say that $A \in \mathcal{L}(\mathcal{H})$ is an operator of trace class if $A_1 = \tfrac{1}{2}(A + A^+)$ and $A_2 = (1/2i)(A - A^+)$ are elements of $\mathcal{B}(\mathcal{H})$. We extend $l$ as follows: $\tilde{l}(A) = l(A_1) + i\,l(A_2)$. It is a simple matter to show that $\tilde{l}$ is a linear form. From $|l(A)| \le c\|A\|_s$ for $A \in \mathcal{B}(\mathcal{H})$ it follows that, in general, $|\tilde{l}(A)| \le |l(A_1)| + |l(A_2)| \le c(\|A_1\|_s + \|A_2\|_s)$.

We now define an operator, which we denote by $x\langle y|$, as follows: $z \to x\langle y, z\rangle$. It follows that $(x\langle y|)^+ = y\langle x|$. Thus we obtain
$$\tilde{l}(x\langle y|) = l\big(\tfrac{1}{2}(x\langle y| + y\langle x|)\big) + i\,l\big(\tfrac{1}{2i}(x\langle y| - y\langle x|)\big).$$
The operator $\tfrac{1}{2}(x\langle y| + y\langle x|)$ acts only on the two-dimensional subspace spanned by $x$ and $y$, that is, on the orthogonal complement of this subspace it is the null operator. Thus it follows that $\|\tfrac{1}{2}(x\langle y| + y\langle x|)\|_s \le 2\|\tfrac{1}{2}(x\langle y| + y\langle x|)\|$. We may easily estimate the operator norm in $\mathcal{L}(\mathcal{H})$ by $\|\tfrac{1}{2}(x\langle y| + y\langle x|)\| \le \|x\|\,\|y\|$. In this way it follows that $\|\tfrac{1}{2}(x\langle y| + y\langle x|)\|_s \le 2\|x\|\,\|y\|$. Similarly it follows that $\|(1/2i)(x\langle y| - y\langle x|)\|_s \le 2\|x\|\,\|y\|$. Thus we obtain $|\tilde{l}(x\langle y|)| \le 4c\|x\|\,\|y\|$.

$\tilde{l}(x\langle y|)$ is, therefore, for fixed $y$ a bounded linear form in $x$ on $\mathcal{H}$. According to §4 there exists a $z \in \mathcal{H}$ for which $\tilde{l}(x\langle y|) = \langle z, x\rangle$. From $|\tilde{l}(x\langle y|)| \le 4c\|x\|\,\|y\|$ it follows that $|\langle z, x\rangle| \le 4c\|x\|\,\|y\|$ and we obtain $\langle z, z\rangle \le 4c\|z\|\,\|y\|$, that is, $\|z\| \le 4c\|y\|$. A linear operator $B$ is defined by $y \to z$; from $\|z\| \le 4c\|y\|$ it follows that $\|B\| \le 4c$, that is, $B \in \mathcal{L}(\mathcal{H})$. Therefore we obtain $\tilde{l}(x\langle y|) = \langle By, x\rangle = \langle y, B^+x\rangle$. Since $x\langle x| = P_x$ (for $\|x\| = 1$) is self-adjoint, $\tilde{l}(x\langle x|) = l(x\langle x|) = \langle Bx, x\rangle$ is real, that is, $\langle Bx, x\rangle = \langle x, Bx\rangle$.
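With the aid of the identity
$$\langle By, x\rangle = \tfrac{1}{4}\langle B(x + y), x + y\rangle - \tfrac{1}{4}\langle B(x - y), x - y\rangle - \tfrac{1}{4i}\langle B(x + iy), x + iy\rangle + \tfrac{1}{4i}\langle B(x - iy), x - iy\rangle$$
it easily follows that $\langle By, x\rangle = \langle y, Bx\rangle$ and $B = B^+$, that is, $B \in \mathcal{L}_r(\mathcal{H})$.

Since $l$ is continuous over $\mathcal{B}(\mathcal{H})$ and each $A \in \mathcal{B}(\mathcal{H})$ is of the form $A = \sum_\nu \mu_\nu P_{x_\nu}$, where the sum converges in the trace norm, from $l(P_{x_\nu}) = \langle x_\nu, Bx_\nu\rangle$ it follows that
$$l(A) = \sum_\nu \mu_\nu\langle x_\nu, Bx_\nu\rangle = \sum_\nu \langle Ax_\nu, Bx_\nu\rangle = \mathrm{tr}(AB).$$
In $\mathcal{B}'(\mathcal{H})$ the norm is defined by $\|l\| = \sup_{\|A\|_s \le 1} |l(A)|$. For $l(A) = \mathrm{tr}(AB)$ it is simple to show that $\|l\| = \|B\|$. Thus we have proven that $\mathcal{B}'(\mathcal{H}) = \mathcal{L}_r(\mathcal{H})$.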
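The duality just proven can be checked numerically in finite dimensions. The following is only a sketch under stated assumptions: Hermitian matrices stand in for elements of $\mathcal{B}(\mathcal{H})$ and $\mathcal{L}_r(\mathcal{H})$, and the helper names (`random_hermitian`, `trace_norm`, `operator_norm`, `duality_check`) are illustrative, not taken from the text. It verifies $|\mathrm{tr}(AB)| \le \|A\|_s\|B\|$ and that the bound is attained on a projection $P_x$ with $\|P_x\|_s = 1$, which is the content of $\|l\| = \|B\|$.

```python
import numpy as np

def random_hermitian(n, rng):
    """Random self-adjoint n x n matrix (finite-dimensional stand-in)."""
    m = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (m + m.conj().T) / 2

def trace_norm(a):
    """||A||_s = sum of |eigenvalues| = tr(A_+) + tr(A_-) for self-adjoint A."""
    return float(np.sum(np.abs(np.linalg.eigvalsh(a))))

def operator_norm(b):
    """||B|| = largest |eigenvalue| for self-adjoint B."""
    return float(np.max(np.abs(np.linalg.eigvalsh(b))))

def duality_check(n=6, seed=0):
    rng = np.random.default_rng(seed)
    a, b = random_hermitian(n, rng), random_hermitian(n, rng)
    pairing = np.trace(a @ b)              # tr(AB), real for self-adjoint A, B
    vals, vecs = np.linalg.eigh(b)
    k = int(np.argmax(np.abs(vals)))       # eigenvalue of B of largest modulus
    x = vecs[:, k]
    p_x = np.outer(x, x.conj())            # projection P_x, trace norm 1
    attained = abs(np.trace(p_x @ b))      # |tr(P_x B)| = |<x, Bx>|
    return pairing, trace_norm(a), operator_norm(b), attained
```

Calling `duality_check()` returns the pairing together with both norms, so the inequality and the extremal case can be inspected directly for any matrix size or seed.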
Since $\mathcal{B}(\mathcal{H})$ is norm separable, $\mathcal{B}'(\mathcal{H})$ is separable in the $\sigma(\mathcal{B}'(\mathcal{H}), \mathcal{B}(\mathcal{H}))$-topology (see AIII, §4).

In closing our investigation of $\mathcal{B}'(\mathcal{H})$ we shall now show that every affine positive functional $l: K \to \mathbf{R}_+$ is of the form $l(W) = \mathrm{tr}(WB)$ for some $B \in \mathcal{B}'_+(\mathcal{H}) = \mathcal{L}_{r+}(\mathcal{H})$. We say that $l$ is an affine functional over $K$ if $W = \lambda W_1 + (1 - \lambda)W_2$, $0 \le \lambda \le 1$, implies that $l(W) = \lambda l(W_1) + (1 - \lambda)l(W_2)$. It is easy to verify that $l$ has a unique linear extension to all of $\mathcal{B}(\mathcal{H})$ because $\mathcal{B}(\mathcal{H})$ is spanned by $K$. If $l$ is positive on $K$, then $l$ is positive on $\mathcal{B}_+(\mathcal{H})$. A linear functional $l$ is said to be positive if $l: \mathcal{B}_+(\mathcal{H}) \to \mathbf{R}_+$.

For $\mathcal{B}(\mathcal{H})$ we may easily prove the theorem mentioned in AIII, §6 that a positive linear functional is continuous. Let $W_1 = A_+\|A_+\|_s^{-1} \in K$, $W_2 = A_-\|A_-\|_s^{-1} \in K$; we obtain $A = \|A_+\|_s W_1 - \|A_-\|_s W_2$. From this result and from $\|W\|_s = 1$ for $W \in K$ it is easy to show that $l$ is continuous if (and only if) there exists a number $a$ for which $|l(W)| \le a$ for all $W \in K$. For a positive $l$ we need therefore only show that $l(W) \le a$ for some $a$. Suppose such an $a$ does not exist. Then there exists a sequence $W_n \in K$ for which $l(W_n) > 2^n$. Clearly $W = \sum_{n=1}^{\infty} 2^{-n}W_n$ is an element of $K$. For $V_N = \sum_{n=1}^{N} 2^{-n}W_n$ it follows that $l(V_N) > N$. Since $W - V_N = \sum_{n=N+1}^{\infty} 2^{-n}W_n \ge 0$ it follows that $l(W) \ge l(V_N) > N$, which, as $N \to \infty$, contradicts the fact that $l(W)$ is defined. Therefore every positive affine functional $l$ on $K$ is of the form $l(W) = \mathrm{tr}(WB)$ where $B \in \mathcal{L}_r(\mathcal{H})$. Since $l$ is positive, it immediately follows that $B$ is positive.

Let us consider the Banach subspace $\mathcal{L}_{cr}(\mathcal{H})$ (see §5) of $\mathcal{B}'(\mathcal{H}) = \mathcal{L}_r(\mathcal{H})$; then, for each $A \in \mathcal{B}(\mathcal{H})$, $\mathrm{tr}(AB)$ defines a bounded linear form over $\mathcal{L}_{cr}(\mathcal{H})$. Since $A_1 = A_2$ follows simply from $\mathrm{tr}(A_1B) = \mathrm{tr}(A_2B)$ for all $B \in \mathcal{L}_{cr}(\mathcal{H})$ (all $P_x$ are elements of $\mathcal{L}_{cr}(\mathcal{H})$), $\mathcal{B}(\mathcal{H})$ is therefore a subspace of the Banach space which is dual to $\mathcal{L}_{cr}(\mathcal{H})$. We will now show that $\mathcal{B}(\mathcal{H})$ is equal to the Banach space which is dual to $\mathcal{L}_{cr}(\mathcal{H})$.
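Let $l$ be a bounded linear form over $\mathcal{L}_{cr}(\mathcal{H})$; since $\mathcal{B}(\mathcal{H}) \subset \mathcal{L}_{cr}(\mathcal{H})$ (as a set of operators), $l$ is a linear form over $\mathcal{B}(\mathcal{H})$. Since $|l(B)| \le c\|B\|$ and $\|B\|_s \ge \|B\|$, $l$ is also bounded as a linear form over $\mathcal{B}(\mathcal{H})$, that is, there exists an $A \in \mathcal{L}_r(\mathcal{H})$ such that $l(B) = \mathrm{tr}(AB)$ for all $B \in \mathcal{B}(\mathcal{H}) \subset \mathcal{L}_{cr}(\mathcal{H})$; in particular, $l(P_x) = \langle x, Ax\rangle$ for all $P_x \in \mathcal{L}_{cr}(\mathcal{H})$. Since each $B \in \mathcal{L}_{cr}(\mathcal{H})$ is of the form $B = \sum_\nu \mu_\nu P_{x_\nu}$ where $|\mu_\nu| \to 0$, we find that $\|B - \sum_{\nu=1}^{N} \mu_\nu P_{x_\nu}\| \to 0$ as $N \to \infty$ and we therefore obtain $l(B) = \sum_{\nu=1}^{\infty} \mu_\nu l(P_{x_\nu}) = \sum_{\nu=1}^{\infty} \mu_\nu\langle x_\nu, Ax_\nu\rangle$. Since $\|B\| = \max_\nu\{|\mu_\nu|\}$ we therefore obtain $|\sum_{\nu=1}^{\infty} \mu_\nu\langle x_\nu, Ax_\nu\rangle| \le c\|B\| = c \max_\nu\{|\mu_\nu|\}$ for any complete orthonormal basis and for all sequences $\mu_\nu$ for which $\mu_\nu \to 0$. By appropriate choice of the $\mu_\nu \ge 0$ and $x_\nu$, namely $x_\nu$ a complete orthonormal basis for the space $(1 - E(0))\mathcal{H}$, where $E(\lambda)$ is the spectral family of $A$, and $\mu_\nu \ge \mu_{\nu+1}$, we obtain
$$\sum_\nu \mu_\nu\langle x_\nu, A_+x_\nu\rangle \le c\,\mu_1.$$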
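By choosing $\mu_1 = \mu_2 = \cdots = \mu_N = 1$ and $\mu_\nu = 0$ for $\nu > N$ we obtain $\sum_{\nu=1}^{N} \langle x_\nu, A_+x_\nu\rangle \le c$ for all $N$ and we therefore obtain $\mathrm{tr}(A_+) < \infty$. Similarly it follows that $\mathrm{tr}(A_-) < \infty$ and we find that $A \in \mathcal{B}(\mathcal{H})$. It is easy to show that the norm in the Banach space which is dual to $\mathcal{L}_{cr}(\mathcal{H})$ is identical to the norm $\|\cdot\|_s$ for $\mathcal{B}(\mathcal{H})$.

From the general theorems in AIII, §4 it follows that the unit sphere $\mathrm{co}(K \cup -K)$ is compact in the $\sigma(\mathcal{B}(\mathcal{H}), \mathcal{L}_{cr}(\mathcal{H}))$-topology. Since $\mathcal{B}_+(\mathcal{H})$ is also $\sigma(\mathcal{B}(\mathcal{H}), \mathcal{L}_{cr}(\mathcal{H}))$-closed, the intersection of $\mathcal{B}_+(\mathcal{H})$ with $\mathrm{co}(K \cup -K)$, that is, the set $\hat{K} = \bigcup_{0 \le \lambda \le 1} \lambda K$ (see also AIII, §6), is compact in the $\sigma(\mathcal{B}(\mathcal{H}), \mathcal{L}_{cr}(\mathcal{H}))$-topology.

12 Gleason’s Theorem

If $E_n$ is a collection of pairwise orthogonal projection operators then, according to §6, there exists a projection operator $E$ for which
$$E = \sum_n E_n. \qquad (12.1)$$
For $W \in K$, according to §11 we find that
$$W = \sum_\nu w_\nu P_{x_\nu},$$
where $w_\nu \ge 0$ and $\sum_\nu w_\nu = 1$. For $E^{(N)} = \sum_{n=1}^{N} E_n$ it follows that
$$0 \le \mathrm{tr}(WE) - \sum_{n=1}^{N} \mathrm{tr}(WE_n) = \mathrm{tr}\Big(W\Big(E - \sum_{n=1}^{N} E_n\Big)\Big) = \sum_{\nu=1}^{M} w_\nu\langle x_\nu, (E - E^{(N)})x_\nu\rangle + \sum_{\nu=M+1}^{\infty} w_\nu\langle x_\nu, (E - E^{(N)})x_\nu\rangle \le \sum_{\nu=1}^{M} w_\nu\|(E - E^{(N)})x_\nu\|^2 + \sum_{\nu=M+1}^{\infty} w_\nu.$$
By choosing $M$ sufficiently large and, for fixed $M$, letting $N$ become infinite, it follows that
$$\mathrm{tr}(WE) = \sum_{n=1}^{\infty} \mathrm{tr}(WE_n). \qquad (12.2)$$
Let $G$ denote the set of projection operators. For $W \in K$ there exists a positive real function $\mu: G \to \mathbf{R}_+$ given by $\mu(E) = \mathrm{tr}(WE)$ for which $\mu(1) = 1$ and, for pairwise orthogonal $E_n$, $\mu(\sum_n E_n) = \sum_n \mu(E_n)$. Gleason’s theorem says that the converse holds: from $\mu: G \to \mathbf{R}_+$, $\mu(1) = 1$ and $\mu(\sum_n E_n) = \sum_n \mu(E_n)$ it follows that $\mu(E) = \mathrm{tr}(WE)$ for some $W \in K$.

From $\mu(\sum_n E_n) = \sum_n \mu(E_n)$ it follows for an orthonormal system $\{x_\nu\}$ that $\mu(\sum_\nu P_{x_\nu}) = \sum_\nu \mu(P_{x_\nu})$. Conversely, if the last equation holds for all orthonormal systems of vectors, then the first is satisfied for any set of pairwise orthogonal $E_n$. Since $\mu(1) = 1$, it follows that, for any complete orthonormal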
basis, $\sum_\nu \mu(P_{x_\nu}) = 1$. For a subspace $\mathcal{T} = E\mathcal{H}$ and for an orthonormal basis in $\mathcal{T}$ it follows that $\sum_\nu \mu(P_{x_\nu}) = \mu(E) \le 1$.

Let $A(G)$ denote the set of all $P_x$ ($A(G)$ is the set of all atoms of the lattice $G$; see V, §5). Suppose we are given a real function $\mu: A(G) \to [0, 1]$ where $\sum_\nu \mu(P_{x_\nu}) = 1$ for all complete orthonormal bases $\{x_\nu\}$. Then $\mu$ can be extended in one and only one way to all of $G$ such that $\mu(\sum_n E_n) = \sum_n \mu(E_n)$ holds. We must therefore prove that $\sum_\nu \mu(P_{x_\nu})$ yields the same value for each complete orthonormal basis in $E\mathcal{H}$; in this way we may define $\mu(E)$. This, however, follows from the fact that if $\{x_\nu\}$ and $\{y_\nu\}$ are two complete orthonormal bases of $E\mathcal{H}$ and if $\{z_\rho\}$ is a complete orthonormal basis of $E^\perp\mathcal{H}$, then $\{x_\nu, z_\rho\}$ and $\{y_\nu, z_\rho\}$ form complete orthonormal bases of $\mathcal{H}$, for which
$$\sum_\nu \mu(P_{x_\nu}) + \sum_\rho \mu(P_{z_\rho}) = \sum_\nu \mu(P_{y_\nu}) + \sum_\rho \mu(P_{z_\rho}) = 1.$$
It is easy to verify that such an extension of $\mu$ onto $G$ satisfies the condition $\mu(\sum_n E_n) = \sum_n \mu(E_n)$.

It therefore suffices to consider maps $\mu: A(G) \to [0, 1]$ for which $\sum_\nu \mu(P_{x_\nu}) = 1$ for each complete orthonormal basis $\{x_\nu\}$ in $\mathcal{H}$.

Let $\mathcal{S}$ denote the surface of the unit sphere in $\mathcal{H}$, that is, $\mathcal{S} = \{x \mid x \in \mathcal{H} \text{ and } \|x\| = 1\}$. A function $m: \mathcal{S} \to \mathbf{R}$ is called a frame function of weight $c$ if, for every complete orthonormal basis $\{x_\nu\}$, the relation $\sum_\nu m(x_\nu) = c$ holds. Since it is possible to replace one of the $x_\nu$, for example $x_{\nu_0}$, by $e^{i\alpha}x_{\nu_0}$, it follows that $m(e^{i\alpha}x) = m(x)$; in particular $m(-x) = m(x)$. For a frame function $m$ we have not assumed that $m(x)$ is positive! If $m(x) \ge 0$ and $c = 1$, then the function $\mu: A(G) \to [0, 1]$ given by $\mu(P_x) = m(x)$ satisfies $\sum_\nu \mu(P_{x_\nu}) = 1$ for all complete orthonormal bases $\{x_\nu\}$ of $\mathcal{H}$ and defines a function $\mu: G \to [0, 1]$ for which $\mu(\sum_n E_n) = \sum_n \mu(E_n)$.

The restriction of a positive frame function of weight 1 for $\mathcal{H}$ to the surface $\mathcal{S}_{\mathcal{T}}$ of the unit sphere of a subspace $\mathcal{T}$ is a frame function for $\mathcal{T}$ of weight $c = \sum_\nu m(y_\nu)$, where $\{y_\nu\}$ is a complete orthonormal basis for $\mathcal{T}$. A frame function $m$ is said to be regular if $m(x) = \langle x, Ax\rangle$ for some $A \in \mathcal{B}(\mathcal{H})$.
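The definitions of frame function and regularity can be illustrated in $\mathbf{C}^3$. This is a minimal sketch, assuming finite dimension; all names (`random_orthonormal_basis`, `frame_sum`, `regular`, `quartic`) are hypothetical helpers introduced here. It shows that $m(x) = \langle x, Ax\rangle$ sums to $\mathrm{tr}(A)$ over every orthonormal basis, whereas the function $m(x) = |x_1|^4$ gives different sums on two bases and is therefore not a frame function.

```python
import numpy as np

def random_orthonormal_basis(n, rng):
    """Columns of a random unitary matrix (QR of a complex Gaussian matrix)."""
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(z)
    return q

def frame_sum(m, basis):
    """Sum of m over the columns of `basis` (an orthonormal basis)."""
    return sum(m(basis[:, k]) for k in range(basis.shape[1]))

def regular(a):
    """Regular frame function x -> <x, Ax>; np.vdot conjugates its first argument."""
    return lambda x: float(np.real(np.vdot(x, a @ x)))

def quartic(x):
    """Candidate m(x) = |x_1|^4, used as a counterexample."""
    return float(abs(x[0]) ** 4)

rng = np.random.default_rng(1)
h = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
a = (h + h.conj().T) / 2                       # self-adjoint A
sums_regular = [frame_sum(regular(a), random_orthonormal_basis(3, rng))
                for _ in range(4)]             # each equals tr(A) up to rounding

s = 1 / np.sqrt(2)
rotated = np.array([[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]])
sums_quartic = [frame_sum(quartic, np.eye(3)), frame_sum(quartic, rotated)]
# sums_quartic is [1.0, 0.5]: the sum depends on the basis.
```

The counterexample uses two explicit real bases so that the two sums can be computed by hand: the standard basis gives 1, and rotating the first two basis vectors by 45 degrees gives $2 \cdot (1/\sqrt{2})^4 = 1/2$.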
The weight of $\langle x, Ax\rangle$ is equal to $\mathrm{tr}(A)$. For a regular positive frame function of weight 1 it follows that there exists a unique measure $\mu: G \to [0, 1]$ for which $\mu(E) = \mathrm{tr}(WE)$ where $W \in K$.

It suffices to prove that every positive frame function is regular. We shall carry out the proof in several steps using Gleason’s approach. First we shall show that:

Every continuous (but not necessarily positive) frame function in Euclidean space $\mathbf{R}^3$ is regular. ($\mathbf{R}^3$ is the real Hilbert space of three dimensions.)

Proof. Let $\mathcal{S}$ denote the surface of the unit sphere of $\mathbf{R}^3$; let $C(\mathcal{S})$ denote the set of all continuous functions $\mathcal{S} \to \mathbf{R}$. Since $\mathcal{S}$ is compact, $\|f\| = \sup_{x \in \mathcal{S}} |f(x)| < \infty$. With the norm $\|f\|$, $C(\mathcal{S})$ is a Banach space. Let $C_+(\mathcal{S})$ denote the set of all positive functions (with $C_+(\mathcal{S})$, $C(\mathcal{S})$ becomes an ordered vector space; see AIII, §6). Let $F$ be the set of continuous frame functions; clearly $F \subset C(\mathcal{S})$, and $F$ is a subspace of $C(\mathcal{S})$. It is easy to verify that $F$ is closed, that is, $F$ is a Banach space.

Let $\mathcal{D}$ denote the group of rotations in $\mathbf{R}^3$. For $\Delta \in \mathcal{D}$ we obtain a representation of $\mathcal{D}$ in $C(\mathcal{S})$ by $U_\Delta f(x) = f(\Delta^{-1}x)$. Clearly $F$ is an invariant subspace of $C(\mathcal{S})$ under this representation, that is, $F$ is itself a representation
space for $\mathcal{D}$. According to VII, §3 we may represent $F$ in terms of the subspaces $\mathcal{T}_l$ (where $\mathcal{T}_l$ is defined in VII, §3), that is, $F = \sum_l \oplus\,(F \cap \mathcal{T}_l)$. For which values of $l$ is $F \cap \mathcal{T}_l \ne \{0\}$? For $f \in \mathcal{T}_1$ we have $f(-x) = -f(x)$. Since for $m \in F$ we must have $m(-x) = m(x)$ we find that $F \cap \mathcal{T}_1 = \{0\}$. According to VII, §3, $\mathcal{T}_0 \oplus \mathcal{T}_2$ is the set of all homogeneous polynomials of second degree, and $F \cap (\mathcal{T}_0 \oplus \mathcal{T}_2)$ is therefore the set of all real homogeneous polynomials of second degree. These have the form $\langle x, Ax\rangle$ with (real) symmetric $A$.

Now we need only show that $\mathcal{T}_l \cap F = \{0\}$ for $l \ge 3$. If $\mathcal{T}_l \cap F \ne \{0\}$, then $\mathcal{T}_l \cap F$ would be equal to the set of all real functions in $\mathcal{T}_l$. According to VII (3.55) all functions of the form
$$\cos m\varphi\,(\sin\theta)^{-m}Q^l_m(\cos\theta), \qquad \sin m\varphi\,(\sin\theta)^{-m}Q^l_m(\cos\theta),$$
where $m = -l, \ldots, l$, must belong to $F$. The restriction of these functions onto the subspace determined by $\theta = \pi/2$ will also be frame functions. For $\theta = \pi/2$, that is, for $\sin\theta = 1$ and $\cos\theta = 0$, all functions of the form $\cos m\varphi$ and $\sin m\varphi$ for $m = -l, \ldots, l$ must be frame functions. Therefore, for $\cos m\varphi$ we must have $\cos m\varphi + \cos m(\varphi + \pi/2) = \text{const}$. This is only possible for $m = 0$ or for $1 + \cos(m\pi/2) = 0$, which cannot be the case for all $m = -l, \ldots, l$ for $l \ge 3$.

We shall now show that every positive frame function on the surface $\mathcal{S}$ of the unit sphere in $\mathbf{R}^3$ is continuous.

The proof can be carried out in a simple way. Let us consider the polar coordinates introduced above. Let $N$ denote the closed subset $0 \le \theta \le \pi/2$ of $\mathcal{S}$. A frame function is, since $m(x) = m(-x)$, uniquely determined by its values on $N$. The set of points $\theta = \text{constant}$ is called a circle (parallel) of latitude. Through each point $x \in N$ (with the exception of the pole $\theta = 0$) there exists a uniquely determined great circle which, at $x$, has the same tangent as the latitude circle passing through $x$. Let $H(x)$ denote the set of all points on this great circle. Clearly $x \in H(x)$.
Next we shall show that for $z \in N$ ($z$ not the pole) the set
$$X(z) = \{x \mid x \in N, \text{ there exists a } y \text{ such that } y \in H(x) \text{ and } z \in H(y)\}$$
has a nonempty interior.

Proof. It suffices to choose the point $z$ with rectangular coordinates $(\sin\theta, 0, \cos\theta)$ for which $0 < \theta \le \pi/2$. The set $L$ of all $y$ for which $z \in H(y)$ is precisely the set of all $(\xi, \eta, \zeta)$ for which
$$\psi \overset{\text{def}}{=} (\xi^2 + \eta^2)\cos\theta - \xi\zeta\sin\theta = 0.$$
If $x$ is a point for which the quadratic form $\psi$ is negative, then $L \cap H(x) \ne \emptyset$, because $\psi \ge 0$ for $\zeta = 0$. Thus $X(z)$ contains the open set of all points for which $\psi < 0$. This set is nonempty because, for $\xi = \sin\theta'$, $\eta = 0$, $\zeta = \cos\theta'$ where $0 < \theta' < \theta$, we have $\psi < 0$.

Let $\mathrm{osc}(f, X) = \sup\{f(x) \mid x \in X\} - \inf\{f(x) \mid x \in X\}$. Let $f$ be a frame function on the unit sphere in $\mathbf{R}^3$. Suppose that there is a point $x$ for which there
exists a neighborhood $U$ for which $\mathrm{osc}(f, U) = \alpha$. Then each point $y$ of the great circle which has $x$ as a pole (that is, each direction $y$ orthogonal to the $x$ direction) has a neighborhood $V$ for which $\mathrm{osc}(f, V) \le 2\alpha$.

Proof. We introduce polar coordinates for which $x$ is the "north pole." There exists an $\varepsilon > 0$ such that all points for which $\theta < \varepsilon$ belong to $U$. Let $x_0$ be a point for which $\theta = \pi/2$ and $\varphi = \varphi_0$. Let $y_0$ be the point for which $\theta = (\pi + \varepsilon)/2$ and $\varphi = \varphi_0$. The point $y$ with $\theta = \varepsilon/2$, $\varphi = \varphi_0$ is orthogonal to $y_0$; similarly $x$ is orthogonal to $x_0$. The points $x, y, x_0, y_0$ lie on a great circle; $x$ and $y$ lie in $U$. If, instead of $x_0$, we choose another point $z$ from a sufficiently small neighborhood $V$ of $x_0$, then the following two points $z', y'$ lie in $U$: $z'$ and $y'$ are the points on the great circle passing through $z$ and $y_0$ which are orthogonal to $z$ and $y_0$, respectively. For $z_1, z_2 \in V$ let $z'_1, y'_1, z'_2, y'_2$ denote the points chosen in the above manner. For the frame function $f$ it therefore follows that
$$f(y_0) + f(y'_i) = f(z_i) + f(z'_i) \qquad (i = 1, 2).$$
By subtraction, we obtain
$$|f(z_1) - f(z_2)| = |f(y'_1) - f(y'_2) + f(z'_2) - f(z'_1)| \le 2\alpha$$
since the $y'_i, z'_i$ lie in $U$. Therefore we find that $\mathrm{osc}(f, V) \le 2\alpha$.

We now prove the following lemma:

If, for a frame function $f$ on the unit sphere $\mathcal{S}$ in $\mathbf{R}^3$, there exists an open set $U$ for which $\mathrm{osc}(f, U) = \alpha$ then, to each point $y \in \mathcal{S}$, there exists a neighborhood $W$ for which $\mathrm{osc}(f, W) \le 4\alpha$.

Proof. For $x \in U$ and $y \in \mathcal{S}$ there exists a point $z$ which is orthogonal to $x$ and $y$. $z$ therefore lies on the great circle with $x$ as the pole, and there exists a neighborhood $V$ of $z$ such that $\mathrm{osc}(f, V) \le 2\alpha$. $y$ lies on the great circle with $z$ as the pole, so there exists a neighborhood $W$ of $y$ for which $\mathrm{osc}(f, W) \le 2\,\mathrm{osc}(f, V) \le 4\alpha$.

We are now in a position to prove that every positive frame function on the unit sphere $\mathcal{S}$ in $\mathbf{R}^3$ is continuous.

Let $g \ge 0$ be a frame function. Then the function $f$ defined by $f(x) = g(x) - \inf_y g(y)$ is also a frame function.
Clearly $f(x) \ge 0$ and $\inf_x f(x) = 0$. Therefore, to each $\varepsilon > 0$ there exists a point $y$ for which $f(y) < \varepsilon$. Let us choose a polar coordinate system with $y$ as the north pole. To each $x$ there exists a point $x'$ which has the same coordinate $\theta$ as does $x$ but for which $\varphi$ differs by $\pi/2$. Clearly $h(x) = f(x) + f(x')$ is a frame function for which $h \ge 0$. If $f$ is of weight $c$, then $h$ is of weight $2c$. For the special case in which $x$ has the coordinates $\theta = \pi/2$ and $\varphi = \varphi_0$, the vectors $y, x, x'$ are an orthonormal basis for $\mathbf{R}^3$, and we find that $h(y) + h(x) + h(x') = 2c$. Since $f(-x) = f(x)$ we find that $h(x') = f(x) + f(x') = h(x)$, and therefore $h(x) = c - h(y)/2 = c - f(y)$. Therefore $h$ is constant on the great circle with $y$ as the pole.

For an arbitrary point $x \in N \setminus \{y\}$, $H(x)$ is well defined. $H(x)$ intersects the great circle with pole $y$ in the point $z$ which is orthogonal to $x$ (if $x$ is on the
great circle with pole $y$, then such a $z$ also exists). Therefore we find that $2c \ge h(x) + h(z) = h(x) + c - f(y)$ and we obtain $h(x) \le c + f(y) < c + \varepsilon$ for all $x \in N \setminus \{y\}$.

Let $u \in H(x) \cap N$ and $u' \in H(x) \cap N$ where $u'$ is orthogonal to $u$. It follows that $h(x) + c - f(y) = h(x) + h(z) = h(u) + h(u') \le h(u) + c + \varepsilon$ and thus we obtain $h(x) \le h(u) + 2\varepsilon$ for all $x \in N \setminus \{y\}$ and $u \in H(x)$.

Let $\beta = \inf\{h(x) \mid x \in N \setminus \{y\}\}$; let $z \in N \setminus \{y\}$ be such that $h(z) < \beta + \varepsilon$. For $x \in X(z)$ (where $X(z)$ was defined in the first lemma) there exists a $v$ with $v \in H(x)$ and $z \in H(v)$. According to the above inequality we obtain $h(x) \le h(v) + 2\varepsilon$ and $h(v) \le h(z) + 2\varepsilon$, and therefore $\beta \le h(x) \le h(z) + 4\varepsilon < \beta + 5\varepsilon$. The set $X(z)$ has a nonempty interior, that is, there exists an open set $U$ (as a subset of $X(z)$) for which $\beta \le h(x) < \beta + 5\varepsilon$ for all $x \in U$; thus we obtain $\mathrm{osc}(h, U) \le 5\varepsilon$. According to the previous lemma there exists a neighborhood $V$ of the point $y$ for which $\mathrm{osc}(h, V) \le 20\varepsilon$. Since $h(y) = 2f(y) < 2\varepsilon$ we obtain
$$\sup\{h(x) \mid x \in V\} = \mathrm{osc}(h, V) + \inf\{h(x) \mid x \in V\} < 20\varepsilon + 2\varepsilon.$$
Since $0 \le f \le h$ we therefore obtain $\mathrm{osc}(f, V) \le 22\varepsilon$. Thus, according to the previous lemma, each point $x \in \mathcal{S}$ has a neighborhood $W$ for which $\mathrm{osc}(f, W) \le 88\varepsilon$. Since $\varepsilon$ can be chosen arbitrarily small, $f$ is therefore continuous.

We have proven the following:

Every positive frame function on the unit sphere $\mathcal{S}$ in $\mathbf{R}^3$ is regular.

We shall now extend this result to complex Hilbert spaces and to higher dimensions. A subset $\mathcal{R}$ of a Hilbert space $\mathcal{H}$ is called a real subspace if for each $x, y \in \mathcal{R}$ we have $x + y \in \mathcal{R}$ and for each $\alpha \in \mathbf{R}$ we have $\alpha x \in \mathcal{R}$. Note that $\mathcal{R}$ need not be a subspace of $\mathcal{H}$ if $\mathcal{H}$ is a vector space over $\mathbf{C}$. $\mathcal{R}$ is said to be completely real if the inner product on $\mathcal{R} \times \mathcal{R}$ only takes on real values. If $X \subset \mathcal{H}$ is a set for which the inner product on $X \times X$ takes on only real values, then the real linear span of $X$ is a completely real subspace. The real linear span of an orthonormal basis is therefore a completely real subspace.
If the complex subspace generated by a completely real subspace $\mathcal{R}$ is dense in $\mathcal{H}$, then a complete orthonormal basis of $\mathcal{R}$ is also a complete orthonormal basis of $\mathcal{H}$. Then the restriction of every frame function of $\mathcal{H}$ onto $\mathcal{R}$ is a frame function for $\mathcal{R}$.

We now prove that a frame function $f \ge 0$ on a complex two-dimensional Hilbert space which is regular on every completely real subspace is regular in the entire Hilbert space.

Proof. Let $c$ denote the weight of $f$ and let $d = \sup\{f(x) \mid \|x\| = 1\}$, where $0 \le d \le c$ since $f \ge 0$. We will now show that $f$ takes on the value $d$. There exists a sequence $x_n$ for which $\|x_n\| = 1$ and $f(x_n) \to d$. Since the surface of the unit sphere is compact, we may assume that the sequence $x_n$ converges: $x_n \to y$. For $\lambda_n = \langle x_n, y\rangle/|\langle x_n, y\rangle|$ we find that $|\lambda_n| = 1$ and $\lambda_n \to 1$. Thus it follows that
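$\lambda_n x_n \to y$ and $f(\lambda_n x_n) = f(x_n)$. Since $\langle y, \lambda_n x_n\rangle$ is real, for each $n$ both vectors $y, \lambda_n x_n$ lie in a completely real subspace $\mathcal{R}_n$. If $x, y \in \mathcal{R}_n$ satisfy $\|x\| = \|y\| = 1$, then, with a symmetric $A$ for which $f = \langle \cdot, A\,\cdot\rangle$ on $\mathcal{R}_n$ (so that $\|A\| \le c$), we obtain
$$|f(x) - f(y)| = |\langle x - y, A(x + y)\rangle| \le 2c\|x - y\|.$$
Applied to $x = \lambda_n x_n$ this gives $f(y) = \lim_n f(\lambda_n x_n) = d$. We define
$$F(x) = \begin{cases} \|x\|^2 f(x/\|x\|) & \text{for } x \ne 0, \\ 0 & \text{for } x = 0. \end{cases}$$
Note that $F(y) = f(y) = d$. If $z$ is orthogonal to $y$ and if $\|z\| = 1$ then it follows that $F(z) = f(z) = c - f(y) = c - d$. $F(x)$ is a real quadratic form on the completely real subspace of $\mathcal{H}$ generated by $y$ and $z$ which, on unit vectors, takes on its maximum $d$ at $y$. Therefore, for real $\alpha, \beta$ we find that
$$F(\alpha y + \beta z) = \alpha^2F(y) + \beta^2F(z) = \alpha^2d + \beta^2(c - d).$$
For complex nonzero $\lambda, \mu$ it follows that
$$F(\lambda y + \mu z) = F\Big(\frac{|\lambda|}{\lambda}(\lambda y + \mu z)\Big) = F(|\lambda|y + |\mu|z'),$$
where $z' = (\mu|\lambda|/|\mu|\lambda)z$. Clearly $z'$ is orthogonal to $y$ and $\|z'\| = 1$. Therefore we obtain
$$F(\lambda y + \mu z) = |\lambda|^2d + |\mu|^2(c - d).$$
It is easy to show that the above formula is correct for $\lambda = 0$ or $\mu = 0$. Therefore $F(x) = \langle x, Tx\rangle$ where $T$ is the diagonal matrix with respect to the complete orthonormal basis $\{y, z\}$. Therefore $f$ is regular.

The extension of the preceding result to arbitrary dimension follows from the following theorem:

If $f$ is a frame function in $\mathcal{H}$ for which $f \ge 0$ and $f$ is regular on each two-dimensional subspace, then $f$ is regular on the entire $\mathcal{H}$.

Proof. Again we define
$$F(x) = \begin{cases} \|x\|^2 f(x/\|x\|) & \text{for } x \ne 0, \\ 0 & \text{for } x = 0. \end{cases}$$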
On each two-dimensional subspace $\mathcal{T}$, $F(x) = \langle x, A_{\mathcal{T}}x\rangle$ where $A_{\mathcal{T}}$ is a self-adjoint operator which is defined in $\mathcal{T}$. We define a function $G: \mathcal{H} \times \mathcal{H} \to \mathbf{C}$ as follows: $G(x, y) = \langle x, A_{\mathcal{T}}y\rangle$, where $\mathcal{T}$ is the subspace generated by $x$ and $y$. Now we must show that this definition is meaningful. If $x$ and $y$ are linearly independent, then $\mathcal{T}$ is uniquely determined by $x$ and $y$ and therefore $G(x, y)$ is uniquely defined. If $y = \lambda x$ then for $\mathcal{T}$ we may select an arbitrary two-dimensional subspace which contains $x$. It then follows that $G(x, \lambda x) = \lambda\langle x, A_{\mathcal{T}}x\rangle = \lambda F(x)$, which is independent of the choice of $\mathcal{T}$. Since $A_{\mathcal{T}}$ is self-adjoint in $\mathcal{T}$ we find that $G$ satisfies the following relationships for all $x, y \in \mathcal{H}$ and $\alpha \in \mathbf{C}$:
$$G(x, \alpha y) = \alpha G(x, y), \qquad G(y, x) = \overline{G(x, y)},$$
$$4\,\mathrm{Re}\,G(x, y) = F(x + y) - F(x - y), \qquad 2F(x) + 2F(y) = F(x + y) + F(x - y).$$
Thus it follows that
$$G(x, y) + G(x, z) = \mathrm{Re}\,G(x, y) + \mathrm{Re}\,G(x, z) + i\,\mathrm{Im}\,G(x, y) + i\,\mathrm{Im}\,G(x, z).$$
Since $\mathrm{Im}\,G(x, y) = \mathrm{Re}(-iG(x, y)) = \mathrm{Re}\,G(x, -iy)$ it follows that
$$G(x, y) + G(x, z) = \mathrm{Re}\,G(x, y) + \mathrm{Re}\,G(x, z) + i\,\mathrm{Re}\,G(x, -iy) + i\,\mathrm{Re}\,G(x, -iz).$$
From the above relationships it follows that
$$8\,\mathrm{Re}\,G(x, y) + 8\,\mathrm{Re}\,G(x, z) = 2F(x + y) - 2F(x - y) + 2F(x + z) - 2F(x - z) = F(2x + y + z) - F(2x - y - z) = 4\,\mathrm{Re}\,G(2x, y + z) = 8\,\mathrm{Re}\,G(x, y + z).$$
Therefore we obtain
$$G(x, y) + G(x, z) = \mathrm{Re}\,G(x, y + z) + i\,\mathrm{Re}\,G(x, -i(y + z)) = G(x, y + z).$$
In addition it follows that
$$|G(x, y)| \le |\mathrm{Re}\,G(x, y)| + |\mathrm{Im}\,G(x, y)| = |\mathrm{Re}\,G(x, y)| + |\mathrm{Re}\,G(x, -iy)| \le \tfrac{1}{4}F(x + y) + \tfrac{1}{4}F(x - y) + \tfrac{1}{4}F(x + iy) + \tfrac{1}{4}F(x - iy).$$
From the definition of $F$, and from $0 \le f \le c$, where $c$ is the weight of $f$, it follows that
$$|G(x, y)| \le \tfrac{1}{4}c\big[\|x + y\|^2 + \|x - y\|^2 + \|x + iy\|^2 + \|x - iy\|^2\big] \le c(\|x\|^2 + \|y\|^2).$$
For unit vectors $x, y$ we therefore obtain $|G(x, y)| \le 2c$. $G$ is therefore a bounded sesquilinear form; thus, according to §4, there exists a bounded operator $A$ such that $G(x, y) = \langle x, Ay\rangle$. Since $G(x, y) = \overline{G(y, x)}$ we
conclude that $A = A^+$, that is, $A$ is self-adjoint; from $G(x, x) \ge 0$ we conclude that $A \ge 0$.

We now prove the following theorem:

Every frame function $f$ on a Hilbert space $\mathcal{H}$ for which $f \ge 0$ is regular when the dimension of $\mathcal{H}$ is $\ge 3$.

Proof. $f$ is a frame function on each completely real subspace. Each completely real two-dimensional subspace is also a subspace of a three-dimensional completely real subspace. In each completely real three-dimensional subspace (that is, in $\mathbf{R}^3$) $f$ is regular, and is therefore regular on every completely real two-dimensional subspace. Thus, from the last two theorems, it follows that $f$ is regular on each two-dimensional subspace, and therefore in all of $\mathcal{H}$.

13 Isomorphisms and Anti-isomorphisms

Let $\mathcal{H}_1, \mathcal{H}_2$ be two Hilbert spaces, where we permit $\mathcal{H}_1 = \mathcal{H}_2$. A map $A: \mathcal{H}_1 \to \mathcal{H}_2$ is said to be linear if $A(x + y) = Ax + Ay$ and $A(\alpha x) = \alpha A(x)$. $A$ is said to be antilinear if we substitute $A(\alpha x) = \bar{\alpha}A(x)$ for the latter equation. $A$ is continuous if and only if it is bounded, that is, if there exists a number $c$ for which $\|Ax\| \le c\|x\|$.

$U: \mathcal{H}_1 \to \mathcal{H}_2$ is said to be an isomorphism if $U$ is linear and bijective and $\|Ux\| = \|x\|$. If $\mathcal{H}_1 = \mathcal{H}_2$ then, according to §7, $U$ is an isomorphism if it is unitary. If for $V: \mathcal{H}_1 \to \mathcal{H}_2$ only $\|Vx\| = \|x\|$ holds then, from the results of §7, it follows that $V$ is an isomorphism onto a closed subspace of $\mathcal{H}_2$.

$U: \mathcal{H}_1 \to \mathcal{H}_2$ is said to be an anti-isomorphism if $U$ is antilinear and bijective, and $\|Ux\| = \|x\|$. For the special case in which $\mathcal{H}_1 = \mathcal{H}_2$ we say that $U$ is anti-unitary. For a complete orthonormal basis $\{x_\nu\}$ of $\mathcal{H}$ a special anti-unitary operator is defined by $U_sx_\nu = x_\nu$; it follows that $U_s(\sum_\nu \alpha_\nu x_\nu) = \sum_\nu \bar{\alpha}_\nu x_\nu$. If $U: \mathcal{H}_1 \to \mathcal{H}_2$ is an anti-isomorphism and $U_1$ is an anti-unitary operator in $\mathcal{H}_1$, then $\tilde{U} = UU_1$ is an isomorphism $\mathcal{H}_1 \to \mathcal{H}_2$. Both $U_1$ and $U_1^{-1}$ are anti-unitary. Therefore every anti-isomorphism $U$ can be represented as a product $\tilde{U}U_1^{-1}$ of an isomorphism and an arbitrary anti-unitary map.
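The factorization of an anti-unitary map into a unitary map and the conjugation $U_s$ can be made concrete in $\mathbf{C}^n$. The sketch below assumes the standard basis as $\{x_\nu\}$, so that $U_s$ is complex conjugation of the components; the class `AntiUnitary` and the helper `random_unitary` are hypothetical names introduced only for illustration.

```python
import numpy as np

def random_unitary(n, rng):
    """Random unitary matrix via QR of a complex Gaussian matrix."""
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(z)
    return q

class AntiUnitary:
    """The map x -> W conj(x): a unitary W applied after the conjugation U_s."""

    def __init__(self, w):
        self.w = w

    def __call__(self, x):
        return self.w @ np.conj(x)

    def inverse(self):
        # y = W conj(x)  =>  x = conj(W^+ y) = W^T conj(y), again anti-unitary.
        return AntiUnitary(self.w.T)

def compose_matrix(u2, u1):
    """Matrix of the linear map x -> u2(u1(x)) = W2 conj(W1) x."""
    return u2.w @ np.conj(u1.w)
```

Applying `AntiUnitary` objects to vectors shows the three properties used in the text: antilinearity, $\langle Ux, Uy\rangle = \overline{\langle x, y\rangle}$, and the fact that the product of two antilinear norm-preserving maps is a unitary (linear) map, here with matrix $W_2\overline{W_1}$.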
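If $U_2$ is an anti-unitary map in $\mathcal{H}_2$ it follows that $\tilde{U} = U_2U$ is an isomorphism and that $U$ may be represented in the form $U_2^{-1}\tilde{U}$. As expected, the product of two antilinear maps $\mathcal{H}_1 \to \mathcal{H}_2 \to \mathcal{H}_3$ is a linear map $\mathcal{H}_1 \to \mathcal{H}_3$.

If $U$ is a unitary map, then for $A \in \mathcal{B}(\mathcal{H})$ and $B \in \mathcal{B}'(\mathcal{H})$ it follows that
$$\mathrm{tr}(AU^+BU) = \sum_\nu \langle x_\nu, AU^+BUx_\nu\rangle = \sum_\nu \langle Ux_\nu, UAU^+BUx_\nu\rangle = \sum_\nu \langle y_\nu, UAU^+By_\nu\rangle,$$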
where $x_\nu$ is a complete orthonormal basis and $y_\nu = Ux_\nu$. Since $U$ is unitary, the set $y_\nu$ is a complete orthonormal basis, and we find that $\mathrm{tr}(AU^+BU) = \mathrm{tr}(UAU^+B)$.

An analogous proof is applicable to the case in which $U$ is anti-unitary. From $\|Ux\|^2 = \langle Ux, Ux\rangle = \|x\|^2 = \langle x, x\rangle$ it follows, if we replace $x$ by $x + y$ and $x + iy$, that $\langle x, y\rangle = \langle Uy, Ux\rangle = \overline{\langle Ux, Uy\rangle}$, from which we obtain
$$\mathrm{tr}(AU^{-1}BU) = \sum_\nu \langle x_\nu, AU^{-1}BUx_\nu\rangle = \sum_\nu \overline{\langle Ux_\nu, UAU^{-1}BUx_\nu\rangle} = \sum_\nu \overline{\langle y_\nu, UAU^{-1}By_\nu\rangle}.$$
Since $A \in \mathcal{B}(\mathcal{H})$, for $A' = UAU^{-1}$ it follows that
$$\langle x, A'y\rangle = \langle x, UAU^{-1}y\rangle = \langle AU^{-1}y, U^{-1}x\rangle = \langle U^{-1}y, AU^{-1}x\rangle = \langle UAU^{-1}x, y\rangle = \langle A'x, y\rangle$$
and we find that $A' \in \mathcal{B}(\mathcal{H})$. Therefore $\mathrm{tr}(A'B)$ is real, and we find that $\mathrm{tr}(AU^{-1}BU) = \mathrm{tr}(UAU^{-1}B)$ holds for anti-unitary $U$.

14 Products of Hilbert Spaces

Let $\mathcal{H}_1$ and $\mathcal{H}_2$ be two Hilbert spaces. We construct the so-called product space $\mathcal{H} = \mathcal{H}_1 \times \mathcal{H}_2$ (where $\mathcal{H}$ is not the Cartesian product of $\mathcal{H}_1$ and $\mathcal{H}_2$, a fact that is not reflected in the usual notation) in the following way:

Let $x \in \mathcal{H}_1$, $y \in \mathcal{H}_2$; $f(x, y)$ is said to be an anti-bilinear form if $f(x_1 + x_2, y) = f(x_1, y) + f(x_2, y)$, $f(x, y_1 + y_2) = f(x, y_1) + f(x, y_2)$, $f(\alpha x, y) = \bar{\alpha}f(x, y)$ and $f(x, \alpha y) = \bar{\alpha}f(x, y)$. An example of an anti-bilinear form is given by $f(x, y) = \langle x, z_1\rangle\langle y, z_2\rangle$ where $z_1 \in \mathcal{H}_1$, $z_2 \in \mathcal{H}_2$. This anti-bilinear form is often denoted by $z_1z_2$. An anti-bilinear form is continuous if and only if there exists a $c$ for which $|f(x, y)| \le c\|x\|\,\|y\|$. $f = z_1z_2$ is continuous with $c = \|z_1\|\,\|z_2\|$. The set of all continuous anti-bilinear forms is a linear vector space over $\mathbf{C}$.

From $|f(x, y)| \le c\|x\|\,\|y\|$ and §4 it follows that $f(x, y) = \langle x, z\rangle$ for some $z \in \mathcal{H}_1$. An antilinear map $A: \mathcal{H}_2 \to \mathcal{H}_1$ which satisfies $\|Ay\| \le c\|y\|$ is defined by $y \to z$. Therefore $f(x, y) = \langle x, Ay\rangle$. For $f = z_1z_2$ we obtain $Ay = z_1\langle y, z_2\rangle$. Since $A$ is bounded, there exists, according to §6, a bounded antilinear operator $A': \mathcal{H}_1 \to \mathcal{H}_2$ for which $\langle x, Ay\rangle = \langle y, A'x\rangle$. If $A$ is a
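bounded antilinear operator then $\langle x, Ay\rangle$ is a bounded anti-bilinear form. Therefore there is a bijective map between bounded anti-bilinear forms and bounded antilinear operators $\mathcal{H}_2 \to \mathcal{H}_1$.

From $\langle x, Ay\rangle = \langle y, A'x\rangle$ it follows, for $y = A'z$, that the operator $AA'$ defined by $\langle x, AA'z\rangle$ is bounded and linear in $\mathcal{H}_1$; the same holds for $A'A$ in $\mathcal{H}_2$. From $\langle x, AA'z\rangle = \langle A'z, A'x\rangle = \langle AA'x, z\rangle$ it follows that $AA'$ is self-adjoint. For $z = x$ it follows that $\langle x, AA'x\rangle = \langle A'x, A'x\rangle = \|A'x\|^2$ and $\langle y, A'Ay\rangle = \langle Ay, Ay\rangle = \|Ay\|^2$. Therefore $AA' \in \mathcal{L}_{r+}(\mathcal{H}_1)$ and $A'A \in \mathcal{L}_{r+}(\mathcal{H}_2)$. For a complete orthonormal basis $\{x_\nu\}$ of $\mathcal{H}_1$ we obtain $\mathrm{tr}(AA') = \sum_\nu \langle x_\nu, AA'x_\nu\rangle = \sum_\nu \|A'x_\nu\|^2$. For a complete orthonormal basis $\{y_\nu\}$ of $\mathcal{H}_2$ we may define an antilinear operator $V$ (anti-isometry; see §13) by $x_\nu = Vy_\nu$. Thus $A'V \in \mathcal{L}(\mathcal{H}_2)$ and $\mathrm{tr}(AA') = \sum_\nu \|A'Vy_\nu\|^2 = \mathrm{tr}((A'V)^+A'V)$. It is easy to show that $(A'V)^+ = V'A$. Therefore
$$\mathrm{tr}(AA') = \mathrm{tr}((A'V)^+(A'V)) = \mathrm{tr}((A'V)(A'V)^+) = \mathrm{tr}((V'A)^+V'A) = \sum_\nu \|V'Ay_\nu\|^2.$$
Since $\|Vy\| = \|y\|$ (and hence $\|V'x\| = \|x\|$) we finally obtain $\mathrm{tr}(AA') = \sum_\nu \|Ay_\nu\|^2 = \mathrm{tr}(A'A)$.

We shall now introduce an inner product on the set of all those anti-bilinear forms $f$ for which $\mathrm{tr}(AA') = \mathrm{tr}(A'A) < \infty$ as follows: for $f_1 \leftrightarrow A_1$, $f_2 \leftrightarrow A_2$ we define $\langle f_1, f_2\rangle = \mathrm{tr}(A'_2A_1)$; in this way the above set becomes a Hilbert space, the so-called product space $\mathcal{H} = \mathcal{H}_1 \times \mathcal{H}_2$.

For the special case in which $f = z_1z_2$ we obtain $Ay = z_1\langle y, z_2\rangle$, $A'x = z_2\langle x, z_1\rangle$ and therefore $A'Ay = z_2\|z_1\|^2\langle z_2, y\rangle$ and $\mathrm{tr}(A'A) = \|z_1\|^2\|z_2\|^2$. For $f_1 = z_1^{(1)}z_2^{(1)}$ and $f_2 = z_1^{(2)}z_2^{(2)}$ it follows that
$$A'_2A_1y = z_2^{(2)}\langle z_1^{(1)}, z_1^{(2)}\rangle\langle z_2^{(1)}, y\rangle,$$
from which we find that
$$\langle f_1, f_2\rangle = \langle z_1^{(1)}, z_1^{(2)}\rangle\langle z_2^{(1)}, z_2^{(2)}\rangle.$$
Let $\{x_\nu\}$ and $\{y_\mu\}$ be complete orthonormal bases for $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively. From $f(x, y) = \langle x, Ay\rangle$ it follows that $f = \sum_{\nu\mu} a_{\nu\mu}x_\nu y_\mu$; similarly, for $g(x, y) = \langle x, By\rangle$ it follows that $g = \sum_{\nu\mu} b_{\nu\mu}x_\nu y_\mu$. We therefore obtain $\langle g, f\rangle = \sum_{\nu\mu} \bar{b}_{\nu\mu}a_{\nu\mu}$. Therefore the set of $x_\nu y_\mu$ is a complete orthonormal basis in $\mathcal{H} = \mathcal{H}_1 \times \mathcal{H}_2$.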
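In finite dimensions the construction above reduces to matrix arithmetic. The following sketch assumes $\mathcal{H}_1 = \mathbf{C}^{n_1}$, $\mathcal{H}_2 = \mathbf{C}^{n_2}$ with standard bases and stores an anti-bilinear form $f = \sum a_{\nu\mu}x_\nu y_\mu$ as its coefficient matrix $a$; the function names are illustrative only. It checks that $z_1z_2$ evaluates to $\langle x, z_1\rangle\langle y, z_2\rangle$ and that the inner product $\sum \bar{b}_{\nu\mu}a_{\nu\mu}$ factorizes on product forms.

```python
import numpy as np

def product_form(z1, z2):
    """Coefficient matrix of the anti-bilinear form z1 z2: a[v, m] = z1[v] * z2[m]."""
    return np.outer(z1, z2)

def evaluate(a, x, y):
    """f(x, y) = sum_{v,m} a[v, m] <x, e_v> <y, e_m>, antilinear in x and in y."""
    return np.conj(x) @ a @ np.conj(y)

def form_inner(g, f):
    """<g, f> = sum_{v,m} conj(b[v, m]) a[v, m] for coefficient matrices b, a."""
    return np.vdot(g, f)
```

Here `np.vdot` conjugates its first argument, which matches the convention of the text that the inner product is antilinear in the first and linear in the second argument.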
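We will now show that for a given $f \in \mathcal{H}_1 \times \mathcal{H}_2$ we can choose a complete orthonormal basis $x_\nu y_\mu$ such that
$$f = \sum_\nu \lambda_\nu x_\nu y_\nu \quad \text{where } \lambda_\nu \ge 0. \qquad (14.1)$$
Since $A'A \in \mathcal{B}(\mathcal{H}_2)$, $A'A$ can be expressed in the form $A'A = \sum_\nu \mu_\nu P_{y_\nu}$. For $Ay_\nu = z_\nu$ it follows that $AA'z_\nu = AA'Ay_\nu = \mu_\nu Ay_\nu = \mu_\nu z_\nu$. If $z_\nu \ne 0$, then $x_\nu = z_\nu/\|z_\nu\|$ is a normed eigenvector of $AA'$ with eigenvalue $\mu_\nu$. Clearly $\|z_\nu\|^2 = \langle Ay_\nu, Ay_\nu\rangle = \langle y_\nu, A'Ay_\nu\rangle = \mu_\nu$. Since $A'A$ is positive,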
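$\mu_\nu \ge 0$. From $f(x, y) = \langle x, Ay\rangle$ it follows that for $y = \sum_\nu y_\nu\langle y_\nu, y\rangle$, we can write

$$f(x, y) = \sum_\nu \langle x, z_\nu\rangle\langle y, y_\nu\rangle = \sum_\nu \sqrt{\mu_\nu}\,\langle x, x_\nu\rangle\langle y, y_\nu\rangle,$$

from which we conclude that $f$ has the form (14.1).

A bounded linear operator $B$ in $\mathcal{H} = \mathcal{H}_1 \times \mathcal{H}_2$ is completely defined by the special values $Bz_1z_2$ where $z_1 \in \mathcal{H}_1$, $z_2 \in \mathcal{H}_2$. For each $A_1 \in \mathcal{L}(\mathcal{H}_1)$ and $A_2 \in \mathcal{L}(\mathcal{H}_2)$ an operator $A_1 \times A_2 \in \mathcal{L}(\mathcal{H}_1 \times \mathcal{H}_2)$ is defined by $(A_1 \times A_2)z_1z_2 = (A_1z_1)(A_2z_2)$ and satisfies $\|A_1 \times A_2\| = \|A_1\|\,\|A_2\|$. It is a simple matter to show that $(A_1 \times A_2)(B_1 \times B_2) = (A_1B_1) \times (A_2B_2)$. As a special case we obtain $(A_1 \times 1)(1 \times B_2) = A_1 \times B_2 = (1 \times B_2)(A_1 \times 1)$. Thus we find that $A \times 1$ and $1 \times B$ commute. Under certain circumstances there is a partial converse.

Let $M \subset \mathcal{L}(\mathcal{H})$ and suppose that $M$ has the property that if $A \in M$ then $A^+ \in M$. If $\mathcal{T}$ is a subspace which is invariant under $M$ (that is, $M\mathcal{T} \subset \mathcal{T}$), then so is $\mathcal{T}^\perp$, since if $u \in \mathcal{T}^\perp$, for $A \in M$ and $x \in \mathcal{T}$ it follows that $\langle x, Au\rangle = \langle A^+x, u\rangle = 0$ since $A^+x \in \mathcal{T}$.

Let $M$ be a set and suppose we are given two maps $j_1: M \to \mathcal{L}(\mathcal{H}_1)$ and $j_2: M \to \mathcal{L}(\mathcal{H}_2)$. A bounded linear map $B: \mathcal{H}_1 \to \mathcal{H}_2$ is called a homomorphism with respect to the operator domain $M$ if $B(j_1x) = (j_2x)B$ for all $x \in M$. Often the functions $j_1$ and $j_2$ are ignored, and we consider the elements of $M$ as operators in $\mathcal{H}_1$ and $\mathcal{H}_2$, where we observe that two different elements of $M$ can correspond to the same operator in $\mathcal{H}_1$ and $\mathcal{H}_2$. In this sense the elements of a subset $M \subset \mathcal{L}(\mathcal{H})$ can be considered to be operators in an invariant subspace $\mathcal{T}$ of $\mathcal{H}$. In order to emphasize the fact that the same operator domain $M$ is under consideration for both $\mathcal{H}_1$ and $\mathcal{H}_2$ we shall often write $(M)\mathcal{H}_1$ and $(M)\mathcal{H}_2$. Two Hilbert spaces $(M)\mathcal{T}_1$ and $(M)\mathcal{T}_2$ (for example, two invariant subspaces of $\mathcal{H}$ for which $M \subset \mathcal{L}(\mathcal{H})$) are said to be isomorphic if there exists an isomorphism $U: \mathcal{T}_1 \to \mathcal{T}_2$ such that $Ax_1 = U^{-1}AUx_1$ for all $A \in M$, that is, if $UA = AU$ holds (as maps of $\mathcal{T}_1$ into $\mathcal{T}_2$) for all $A \in M$.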
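The product rule for operators of the form $A_1 \times A_2$ stated above can be illustrated with matrices. The sketch below assumes finite-dimensional factor spaces and represents $A_1 \times A_2$ by the Kronecker product in the product basis; the helper names `kron`, `matmul` and `identity` are illustrative and not taken from the text.

```python
def kron(a, b):
    """Matrix of a x b in the product basis; index (nu, mu) -> nu * len(b) + mu."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Integer-valued entries keep the arithmetic exact, so == comparisons are safe.
A1 = [[1, 2j], [0, 3]]
B1 = [[0, 1], [1, 0]]
A2 = [[2, 1, 0], [0, 1j, 1], [1, 0, 1]]
B2 = [[1, 0, 2], [0, 1, 0], [3, 0, 1j]]

# (A1 x A2)(B1 x B2) = (A1 B1) x (A2 B2)
mixed_product_holds = (
    matmul(kron(A1, A2), kron(B1, B2)) == kron(matmul(A1, B1), matmul(A2, B2))
)

# A1 x 1 and 1 x B2 commute, and both products equal A1 x B2.
left = matmul(kron(A1, identity(3)), kron(identity(2), B2))
right = matmul(kron(identity(2), B2), kron(A1, identity(3)))
factors_commute = (left == right == kron(A1, B2))
```

Both `mixed_product_holds` and `factors_commute` evaluate to `True` for these matrices.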
A space $(M)\mathcal{T}$ is said to be irreducible if it contains no invariant subspaces (other than $\{0\}$ and $\mathcal{T}$). If $\mathcal{T}_1$ and $\mathcal{T}_2$ are isomorphic and if $\mathcal{T}_1$ is irreducible, then $\mathcal{T}_2$ is also irreducible. If $\mathcal{T}_1$, $\mathcal{T}_2$ are irreducible, and $U_1: \mathcal{T}_1 \to \mathcal{T}_2$, $U_2: \mathcal{T}_1 \to \mathcal{T}_2$ are two isomorphic maps, then for $U_2^{-1}U_1$ from $U_1A = AU_1$ and $U_2A = AU_2$ it follows that $U_2^{-1}U_1A = U_2^{-1}AU_1 = AU_2^{-1}U_1$, that is, the unitary operator $V = U_2^{-1}U_1$ commutes in $\mathcal{T}_1$ with all operators of $M$ (as operators in $\mathcal{T}_1$). Therefore the spectral family of $V$ commutes with all operators in $M$. If $V \ne e^{i\alpha}1$ then there exists an element $E(\varphi)$ of the spectral family for which $0 \ne E(\varphi) \ne 1$. From $AE(\varphi) = E(\varphi)A$ it follows that $E(\varphi)\mathcal{T}_1$ is an invariant subspace of $\mathcal{T}_1$, in contradiction to the irreducibility of $\mathcal{T}_1$. Therefore $U_2^{-1}U_1 = e^{i\alpha}1$, that is, $U_1 = e^{i\alpha}U_2$. Two isomorphic maps can therefore differ only by a "phase factor."

We will now show that if $\mathcal{T}$ is an irreducible space and if $B \in \mathcal{L}(\mathcal{T})$ commutes with all operators in $M$ (here we then say that $B$ commutes with $M$) then $B = \tau 1$. From $A \in M$, $BA = AB$ it follows that $A^+B^+ = B^+A^+$,
and hence, since $A^+ \in M$ whenever $A \in M$, $B^+$ commutes with $M$. Therefore $B + B^+$ and $i(B - B^+)$ commute with $M$. If a self-adjoint operator $D$ commutes with $M$, then $M$ commutes with its spectral family. If a projection operator $P$ commutes with $M$, then $P\mathcal{T}$ is an invariant subspace. Since $\mathcal{T}$ is irreducible, the elements of the spectral family can only be equal to 0 or 1; therefore $D = \lambda 1$. Therefore $B + B^+ = \tau_1 1$ and $i(B - B^+) = \tau_2 1$ and therefore $B = \tau 1$.

If the irreducible spaces $(M)\mathcal{T}_1$ and $(M)\mathcal{T}_2$ are not isomorphic, then it follows that for a bounded linear map $B: \mathcal{T}_1 \to \mathcal{T}_2$ which satisfies $AB = BA$ for all $A \in M$, that is, for a homomorphism $B$, $B = 0$. Conversely, $\mathcal{T}_1$ and $\mathcal{T}_2$ are isomorphic if there exists a homomorphism $B \ne 0$.

PROOF. For fixed $x$, $l(y) = \langle Bx, By\rangle$ defines a bounded linear form in $\mathcal{T}_1$, from which, according to §4, there exists a $z \in \mathcal{T}_1$ for which $l(y) = \langle z, y\rangle$. An operator $D \in \mathcal{L}(\mathcal{T}_1)$ is thereby defined which satisfies $l(y) = \langle Dx, y\rangle$. From $\langle Bx, By\rangle = \overline{\langle By, Bx\rangle}$ it follows that $\langle Dx, y\rangle = \overline{\langle Dy, x\rangle} = \langle x, Dy\rangle$, that is, $D^+ = D$, and $\langle Bx, Bx\rangle \ge 0$ and therefore $D \ge 0$. Therefore $\langle Bx, By\rangle = \langle x, Dy\rangle$. From $AB = BA$ for all $A \in M$ it follows that for $A^+ \in M$

$$\langle x, ADy\rangle = \langle A^+x, Dy\rangle = \langle BA^+x, By\rangle = \langle A^+Bx, By\rangle = \langle Bx, ABy\rangle = \langle Bx, BAy\rangle = \langle x, DAy\rangle$$

and therefore we obtain $AD = DA$. Therefore $D = \lambda 1$ and from $D = D^+ \ge 0$ we obtain $\lambda \ge 0$. If $\lambda \ne 0$ then for $U = \lambda^{-1/2}B$ it follows that $\langle Ux, Uy\rangle = \langle x, y\rangle$. $U$ is therefore an isometric map of $\mathcal{T}_1$ into $\mathcal{T}_2$. Thus $U$ is an isomorphic map of $\mathcal{T}_1$ onto $U\mathcal{T}_1 \subset \mathcal{T}_2$. $U\mathcal{T}_1$ is, on account of the commutativity of $U$ with $M$, an invariant subspace. Since $\mathcal{T}_2$ is irreducible, we must have $U\mathcal{T}_1 = \mathcal{T}_2$ and therefore $(M)\mathcal{T}_1$ and $(M)\mathcal{T}_2$ are isomorphic, in contradiction to the assumption that they were not isomorphic. Therefore $\lambda = 0$, that is, $\|Bx\|^2 = 0$ and therefore $B = 0$.

From the same proof it also follows that if $(M)\mathcal{T}_1$ is irreducible, but $(M)\mathcal{T}_2$ is not necessarily irreducible, then $B = \lambda^{1/2}U$ where $U$ is an isomorphic map of $\mathcal{T}_1$ onto the invariant subspace $U\mathcal{T}_1$ of $\mathcal{T}_2$. Since $\mathcal{T}_1$ is irreducible, $U\mathcal{T}_1$ is irreducible.
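The statement that an operator commuting with an irreducible operator domain is a multiple of the unit operator can be tried out on a small example. This is a minimal sketch under the assumption that $\mathcal{T} = \mathbb{C}^2$ and that $M$ consists of the two self-adjoint matrices `X` and `Z` below (so $M$ is closed under taking adjoints); these choices are illustrative and are not taken from the text.

```python
def matmul(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def commutes_with_all(b, m):
    """True if b commutes with every operator in the list m."""
    return all(matmul(a, b) == matmul(b, a) for a in m)

X = [[0, 1], [1, 0]]    # exchanges the two basis vectors
Z = [[1, 0], [0, -1]]   # its only invariant lines are the two coordinate axes

irreducible_M = [X, Z]  # no common invariant subspace other than {0} and C^2
reducible_M = [Z]       # each coordinate axis is an invariant subspace

scalar = [[5, 0], [0, 5]]      # a multiple of the unit operator
diagonal = [[1, 0], [0, 2]]    # self-adjoint, but not a multiple of 1

scalar_ok = commutes_with_all(scalar, irreducible_M)
diagonal_fails = not commutes_with_all(diagonal, irreducible_M)
diagonal_ok_when_reducible = commutes_with_all(diagonal, reducible_M)
```

The non-scalar matrix `diagonal` commutes with `reducible_M` but not with `irreducible_M`, which shows that the irreducibility hypothesis is needed.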
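$(M)\mathcal{H}$ is said to be completely reducible, if, in every invariant subspace of $\mathcal{H}$ there exists an irreducible (invariant) subspace. The set $\{R \mid R$ is a set of pairwise orthogonal irreducible invariant subspaces$\}$ satisfies the conditions of Zorn's lemma (with respect to the partial order of inclusion). Thus it follows that there exists a maximal element $R_m$ and that the elements $\mathcal{T}_\nu$ of $R_m$ satisfy the relation $\mathcal{H} = \sum_\nu \oplus\, \mathcal{T}_\nu$. Let $P_\nu$ be the projectors on $\mathcal{T}_\nu$. Then $\sum_\nu P_\nu = 1$ and, for each $A \in M$, $A = \sum_\nu P_\nu AP_\nu$.

We shall now investigate the structure of a $B \in \mathcal{L}(\mathcal{H})$ which commutes with $M$. It is easy to show that $AB = BA$ is equivalent to $P_\nu AP_\nu P_\nu BP_\mu = P_\nu BP_\mu P_\mu AP_\mu$ for all pairs $\nu, \mu$. If $\mathcal{T}_\nu$, $\mathcal{T}_\mu$ are not isomorphic spaces, then, as above, it follows that $P_\nu BP_\mu = 0$. If $\mathcal{T}_\nu$ and $\mathcal{T}_\mu$ are isomorphic, then there exists an isomorphic map $U_{\nu\mu}: \mathcal{T}_\mu \to \mathcal{T}_\nu$ from which it therefore follows that $P_\nu AP_\nu U_{\nu\mu} = U_{\nu\mu}P_\mu AP_\mu$. $F = U_{\nu\mu}^{-1}P_\nu BP_\mu$ therefore commutes with $P_\mu AP_\mu$ and therefore, as an operator in $\mathcal{T}_\mu$, is a multiple of the unit operator, that is, $F = \tau_{\nu\mu}1$; we find that $P_\nu BP_\mu = \tau_{\nu\mu}U_{\nu\mu}$. Thus it follows that

$$B = \sum_{\nu,\mu} P_\nu BP_\mu = {\sum_{\nu,\mu}}' \tau_{\nu\mu}P_\nu U_{\nu\mu}P_\mu \tag{14.2}$$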
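where the sum $\sum'$ is taken only over those pairs $\nu, \mu$ for which $\mathcal{T}_\nu$ is isomorphic to $\mathcal{T}_\mu$.

The set of $\mathcal{T}_\nu$ can be divided into classes, whereby two belong to the same class when they are isomorphic. Let $\mathcal{S}_\alpha = \sum_\nu^{(\alpha)} \oplus\, \mathcal{T}_\nu$ where the sum is taken over the $\mathcal{T}_\nu$ belonging to the same class labeled with the index $\alpha$. We find that $\mathcal{H} = \sum_\alpha \oplus\, \mathcal{S}_\alpha$.

We will now show that the $\mathcal{S}_\alpha$ are uniquely determined. Suppose that, in addition, $\mathcal{H} = \sum_\rho \oplus\, \mathcal{T}'_\rho$ and $\mathcal{S}'_\alpha = \sum_\rho^{(\alpha)} \oplus\, \mathcal{T}'_\rho$, where the index $\alpha$ means that $\mathcal{S}'_\alpha$ contains precisely those $\mathcal{T}'_\rho$ to which the $\mathcal{T}_\nu$ from $\mathcal{S}_\alpha$ are isomorphic. The projection $P'_\rho$ onto a $\mathcal{T}'_\rho$ in $\mathcal{S}'_\alpha$ commutes with $M$. $P'_\rho P_\nu$ is therefore a mapping $\mathcal{T}_\nu \to \mathcal{T}'_\rho$ which commutes with $M$, and is, therefore, equal to zero if $\mathcal{T}_\nu$ and $\mathcal{T}'_\rho$ are not isomorphic. Therefore, it follows that for $P'_\rho$ instead of $B$ in (14.2)

$$P'_\rho = {\sum_{\nu\mu}}^{(\alpha)} \tau_{\nu\mu}P_\nu U_{\nu\mu}P_\mu,$$

where we only sum over those pairs $\nu, \mu$ for which $\mathcal{T}_\nu, \mathcal{T}_\mu \subset \mathcal{S}_\alpha$. Thus it follows that $\mathcal{T}'_\rho \subset \mathcal{S}_\alpha$, that is, $\mathcal{S}'_\alpha \subset \mathcal{S}_\alpha$. In the same way we also obtain $\mathcal{S}_\alpha \subset \mathcal{S}'_\alpha$, that is, $\mathcal{S}_\alpha = \mathcal{S}'_\alpha$.

We shall now consider one of the $\mathcal{S}_\alpha = \sum_\nu^{(\alpha)} \oplus\, \mathcal{T}_\nu$; let $\mathcal{T}_{\nu_0}$ be one of the $\mathcal{T}_\nu$. Let $U_\nu: \mathcal{T}_{\nu_0} \to \mathcal{T}_\nu$ be isomorphisms. Then $U_{\nu\mu} = U_\nu U_\mu^{-1}$ is an isomorphism $\mathcal{T}_\mu \to \mathcal{T}_\nu$. Let $Q_\alpha$ be the projection onto $\mathcal{S}_\alpha$; from (14.2) we obtain

$$Q_\alpha BQ_\alpha = {\sum_{\nu\mu}}^{(\alpha)} \tau_{\nu\mu}P_\nu U_{\nu\mu}P_\mu. \tag{14.3}$$

Let $x_{\rho\nu_0}$ be a complete orthonormal basis in $\mathcal{T}_{\nu_0}$; then the $x_{\rho\mu} = U_\mu x_{\rho\nu_0}$ (for fixed $\mu$) form a complete orthonormal basis for $\mathcal{T}_\mu$. From (14.3) it follows that

$$Q_\alpha BQ_\alpha x_{\rho\mu} = {\sum_\nu}^{(\alpha)} \tau_{\nu\mu}x_{\rho\nu}. \tag{14.4}$$

For $A \in M$, from $Q_\alpha AQ_\alpha = \sum_\nu^{(\alpha)} P_\nu AP_\nu$ and from $P_\nu AP_\nu U_\nu = U_\nu P_{\nu_0}AP_{\nu_0}$ (see above, setting $U_{\nu\nu_0} = U_\nu$) it follows that $Q_\alpha AQ_\alpha = \sum_\nu^{(\alpha)} U_\nu P_{\nu_0}AP_{\nu_0}U_\nu^{-1}$. Thus, for $Ax_{\rho\nu_0} = \sum_\sigma x_{\sigma\nu_0}a_{\sigma\rho}$ it follows that

$$Ax_{\rho\mu} = \sum_\sigma x_{\sigma\mu}a_{\sigma\rho}. \tag{14.5}$$

We now introduce two Hilbert spaces: $\mathcal{H}_1^{(\alpha)}$, which has the dimension of the $\mathcal{T}_\nu \subset \mathcal{S}_\alpha$, and $\mathcal{H}_2^{(\alpha)}$, which has the dimension equal to the number of the $\mathcal{T}_\nu \subset \mathcal{S}_\alpha$. Let $z_\rho$, $w_\mu$ be complete orthonormal bases for $\mathcal{H}_1^{(\alpha)}$ and $\mathcal{H}_2^{(\alpha)}$, respectively; to each operator $A \in M$ we assign an operator $A^{(\alpha)}$ as follows:

$$A^{(\alpha)}z_\rho = \sum_\sigma z_\sigma a_{\sigma\rho}. \tag{14.6}$$

Then $x_{\rho\mu} \to z_\rho w_\mu$ defines an isomorphism $\mathcal{S}_\alpha \to \mathcal{H}_1^{(\alpha)} \times \mathcal{H}_2^{(\alpha)}$ if to the operator $Q_\alpha AQ_\alpha$, that is, to the operator $A$ in $\mathcal{S}_\alpha$, we assign the operator $A^{(\alpha)} \times 1$ where $A^{(\alpha)}$ is defined by (14.6). Here we often use the simpler notation $\mathcal{S}_\alpha = \mathcal{H}_1^{(\alpha)} \times \mathcal{H}_2^{(\alpha)}$ and for $A \in M$: $Q_\alpha AQ_\alpha = A^{(\alpha)} \times 1$. From (14.4) it follows that $Q_\alpha BQ_\alpha = 1 \times B^{(\alpha)}$ where $B^{(\alpha)}$ is an operator in $\mathcal{H}_2^{(\alpha)}$.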
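We may therefore write $\mathcal{H}$ in the form

$$\mathcal{H} = \sum_\alpha \oplus\, \mathcal{H}_1^{(\alpha)} \times \mathcal{H}_2^{(\alpha)}, \tag{14.7}$$

where the $A \in M$ are given by

$$A = \sum_\alpha Q_\alpha(A^{(\alpha)} \times 1)Q_\alpha. \tag{14.8}$$

All operators which commute with $M$ therefore have the form

$$B = \sum_\alpha Q_\alpha(1 \times B^{(\alpha)})Q_\alpha, \tag{14.9}$$

where the $B^{(\alpha)} \in \mathcal{L}(\mathcal{H}_2^{(\alpha)})$ are arbitrary. The $\mathcal{H}_1^{(\alpha)}$ are irreducible with respect to the operators $A^{(\alpha)}$, and $\mathcal{H}_1^{(\alpha)}$, $\mathcal{H}_1^{(\beta)}$ are, for $\alpha \ne \beta$, not isomorphic with respect to $A^{(\alpha)}$ and $A^{(\beta)}$.

If $B$ is a homomorphism of $(M)\mathcal{H}$ into $(M)\mathcal{T}$ and if $(M)\mathcal{H}$ is completely reducible, then, as we have shown above, a self-adjoint operator $D$ in $\mathcal{H}$ which commutes with all operators in $M$ is defined by $\langle x, Dx\rangle = \langle Bx, Bx\rangle$. According to (14.9)

$$D = \sum_\alpha Q_\alpha(1 \times D^{(\alpha)})Q_\alpha.$$

From $\langle Bx, Bx\rangle = \langle x, Dx\rangle$ it also follows that $\langle By, Bx\rangle = \langle y, Dx\rangle$. For $x_1 \in \mathcal{H}_1^{(\alpha)}$, $x_2 \in \mathcal{H}_2^{(\alpha)}$, $y_1 \in \mathcal{H}_1^{(\beta)}$, $y_2 \in \mathcal{H}_2^{(\beta)}$ it therefore follows that

$$\langle By_1y_2, Bx_1x_2\rangle = \delta_{\alpha\beta}\langle y_1, x_1\rangle\langle y_2, D^{(\alpha)}x_2\rangle. \tag{14.10}$$

The subspaces $Q_\alpha\mathcal{H}$ will therefore, for different $\alpha$, be mapped onto orthogonal subspaces (or $\{0\}$). Therefore (here $\overline{A}$ is the closure of $A$) we obtain

$$\overline{B\mathcal{H}} = {\sum_\alpha}' \oplus\, \overline{BQ_\alpha\mathcal{H}}, \tag{14.11}$$

where the prime indicates that the sum is to take place only over the $\overline{BQ_\alpha\mathcal{H}} \ne \{0\}$. It is therefore sufficient to examine the structure of the individual $\overline{BQ_\alpha\mathcal{H}}$. Let $\mathcal{H}_{2s}^{(\alpha)}$ denote the subspace of $\mathcal{H}_2^{(\alpha)}$ which is orthogonal to the eigenspace of $D^{(\alpha)}$ for the eigenvalue 0. From (14.10), for $\beta = \alpha$, it follows that $Bx_1x_2 = U^{(\alpha)}(x_1D^{(\alpha)1/2}x_2)$ defines an isomorphism $U^{(\alpha)}$ of $\mathcal{H}_1^{(\alpha)} \times \mathcal{H}_{2s}^{(\alpha)}$ onto the subspace $\overline{BQ_\alpha\mathcal{H}}$. $U^{(\alpha)}$ defines a product representation of $\overline{BQ_\alpha\mathcal{H}} = \mathcal{T}_1^{(\alpha)} \times \mathcal{T}_2^{(\alpha)}$. $U^{(\alpha)}$ therefore defines a pair of isomorphisms $U_1^{(\alpha)}: \mathcal{H}_1^{(\alpha)} \to \mathcal{T}_1^{(\alpha)}$ and $U_2^{(\alpha)}: \mathcal{H}_{2s}^{(\alpha)} \to \mathcal{T}_2^{(\alpha)}$ by $U^{(\alpha)} = U_1^{(\alpha)} \times U_2^{(\alpha)}$. On the subspace

$$\mathcal{H}_s = {\sum_\alpha}' \oplus\, \mathcal{H}_1^{(\alpha)} \times \mathcal{H}_{2s}^{(\alpha)}$$

the relation

$$B = {\sum_\alpha}' \tilde{Q}_\alpha\bigl(U_1^{(\alpha)} \times U_2^{(\alpha)}D^{(\alpha)1/2}\bigr)Q_\alpha$$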
is satisfied; $B$ is the null operator on the subspace in $\mathcal{H}$ which is orthogonal to $\mathcal{H}_s$. An operator $A$ in $M$ has the following form

$$A = {\sum_\alpha}' \tilde{Q}_\alpha(\tilde{A}^{(\alpha)} \times 1)\tilde{Q}_\alpha$$

with respect to the decomposition of the invariant subspace $\overline{B\mathcal{H}}$ described by (14.11):

$$\overline{B\mathcal{H}} = {\sum_\alpha}' \oplus\, \mathcal{T}_1^{(\alpha)} \times \mathcal{T}_2^{(\alpha)},$$

where $\tilde{Q}_\alpha$ is the projection operator on $\mathcal{T}_1^{(\alpha)} \times \mathcal{T}_2^{(\alpha)}$. In addition, for $A^{(\alpha)}$ in (14.8) we obtain

$$U_1^{(\alpha)}A^{(\alpha)} = \tilde{A}^{(\alpha)}U_1^{(\alpha)}.$$

Therefore the irreducible spaces $\mathcal{H}_1^{(\alpha)}$ and $\mathcal{T}_1^{(\alpha)}$ are isomorphic. An isomorphism of $(M)\mathcal{H}_s$ onto $(M)\overline{B\mathcal{H}}$ is defined by the $U^{(\alpha)} = U_1^{(\alpha)} \times U_2^{(\alpha)}$.

For finite-dimensional $\mathcal{H}$ one verifies that an operator system $M$ which contains both $A$ and $A^+$ is always completely reducible (in an invariant, reducible subspace there must be invariant subspaces of smaller dimension). In general this is not the case for infinite-dimensional $\mathcal{H}$, so that, in applications, it is necessary to obtain a special proof of the complete reducibility of $M$.

15 The Spaces $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$

The results of the previous sections can easily be extended to sequences of Hilbert spaces. Here we shall only give a brief outline of the proofs.

Suppose that we are given a sequence $\mathcal{H}_1, \mathcal{H}_2, \ldots$ of Hilbert spaces. Let us define $\mathcal{H}$ as a disjoint union of the sets $\mathcal{H}_n$, that is,

$$\mathcal{H} = \bigcup_n \mathcal{H}_n. \tag{15.1}$$

$\mathcal{H}_n$ is therefore a subset of $\mathcal{H}$. We find that, in the sense of subsets of $\mathcal{H}$, $\mathcal{H}_n \cap \mathcal{H}_m = \emptyset$ for $n \ne m$. $\mathcal{H}$ is clearly not a linear vector space! An inner product between two elements of $\mathcal{H}$ is therefore only defined if the elements belong to the same subset $\mathcal{H}_n$. Otherwise, two elements which belong to different $\mathcal{H}_n$ are said to be orthogonal. An orthonormal set in $\mathcal{H}$ is a set $\{x_\nu\}$ of pairwise orthogonal elements of $\mathcal{H}$ for which $\|x_\nu\| = 1$. Such a set is said to be complete if its intersection with $\mathcal{H}_n$ is, for each $n$, a complete orthonormal basis for $\mathcal{H}_n$.
Each $x \in \mathcal{H}$ may be expressed in terms of a complete orthonormal basis as follows: $x$ belongs to one of the $\mathcal{H}_n$ and may therefore be expressed in terms of the portion $\{x_\nu^{(n)}\}$ of the complete orthonormal basis belonging to $\mathcal{H}_n$ in the form described in §2. A closed subspace of $\mathcal{H}$ is a subset $\mathcal{T}$ for which all the $\mathcal{T}_n = \mathcal{T} \cap \mathcal{H}_n$ are closed subspaces of $\mathcal{H}_n$. $\mathcal{T}$ is therefore the disjoint union of the $\mathcal{T}_n$, that is,
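$\mathcal{T} = \bigcup_n \mathcal{T}_n$. Let $\mathcal{T}_n^\perp$ denote the orthogonal subspace of $\mathcal{T}_n$ in $\mathcal{H}_n$. We now define $\mathcal{T}^\perp = \bigcup_n \mathcal{T}_n^\perp$. Let $\mathcal{T}^{(1)} = \bigcup_n \mathcal{T}_n^{(1)}$ and $\mathcal{T}^{(2)} = \bigcup_n \mathcal{T}_n^{(2)}$. The greatest lower bound (infimum) $\mathcal{T}^{(1)} \wedge \mathcal{T}^{(2)}$ of $\mathcal{T}^{(1)}$ and $\mathcal{T}^{(2)}$ is given by $\mathcal{T}^{(1)} \wedge \mathcal{T}^{(2)} = \mathcal{T}^{(1)} \cap \mathcal{T}^{(2)} = \bigcup_n (\mathcal{T}_n^{(1)} \cap \mathcal{T}_n^{(2)})$; the least upper bound (supremum) $\mathcal{T}^{(1)} \vee \mathcal{T}^{(2)}$ of $\mathcal{T}^{(1)}$ and $\mathcal{T}^{(2)}$ is given by $\mathcal{T}^{(1)} \vee \mathcal{T}^{(2)} = \bigcup_n (\mathcal{T}_n^{(1)} \vee \mathcal{T}_n^{(2)})$. It is easy to verify that the set of closed subspaces of $\mathcal{H}$ is a complete orthocomplemented lattice.

An operator $A$ in $\mathcal{H}$ is defined as a sequence $A = (A_1, A_2, \ldots)$ of operators $A_n$ in $\mathcal{H}_n$ which satisfies the condition: For $x \in \mathcal{H}_n$, $Ax = A_nx$. $A$ is said to be bounded if there exists a $c$ for which $\|Ax\| \le c\|x\|$. It follows that

$$\|A\| = \sup_{\|x\| \le 1} \|Ax\| = \sup_n \|A_n\|.$$

We denote the set of bounded operators in $\mathcal{H}$ by $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$. It is easy to show that $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is a Banach algebra, provided that the product of $A$ and $B$ is defined by $AB = (A_1B_1, A_2B_2, \ldots)$; the adjoint $A^+$ of $A$ is defined by $A^+ = (A_1^+, A_2^+, \ldots)$. $A$ is said to be self-adjoint if $A^+ = A$. The set $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ of all self-adjoint operators of $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is an ordered Banach space where $A \ge 0$ if $A_n \ge 0$ for all $n$. The unit sphere of $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is the order interval $[-1, 1]$ where $1$ is the unit operator $1x = x$ for all $x$. $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is therefore an order unit space. For $A \ge 0$ we define the positive square root:

$$A^{1/2} = (A_1^{1/2}, A_2^{1/2}, \ldots),$$

where the $A_n^{1/2}$ are positive roots of $A_n$.

If $\mathcal{T} = \bigcup_n \mathcal{T}_n$ is a closed subspace of $\mathcal{H}$ then to each $x \in \mathcal{H}$ there is a corresponding $q \in \mathcal{T}$ given by $q = P_nx$ for $x \in \mathcal{H}_n$. Thus $q = Px$ where $P = (P_1, P_2, \ldots)$ and the $P_n$ are projection operators onto $\mathcal{T}_n$. It follows that $P^+ = P$ and $P^2 = P$. Conversely, if for $P \in \mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ $P^2 = P$, then a closed subspace of $\mathcal{H}$ is defined by $\mathcal{T} = P\mathcal{H}$ and $P = (P_1, P_2, \ldots)$, $P_n^2 = P_n$ and $\mathcal{T} = \bigcup_n \mathcal{T}_n$ where $\mathcal{T}_n = P_n\mathcal{H}_n$.

For $x \in \mathcal{H}$, $\|x\| = 1$ the projection operator on the subspace generated by $x$, $P_x$, is given as follows (for $x \in \mathcal{H}_n$):

$$P_x = (0, \ldots, 0, P_x^{(n)}, 0, \ldots),$$

where $P_x^{(n)}$, in the $n$th position, is the projection operator in $\mathcal{H}_n$ onto the subspace generated by $x$. The relationships derived in §6 may be directly applied to the set of projection operators in $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$.

An operator $V \in \mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is said to be isometric if for $V = (V_1, V_2, \ldots)$ all $V_n$ are isometric. $U$ is said to be unitary if $U$ and $U^+$ are isometric. All the results from §7 are directly applicable to the isometric and unitary operators in $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$.

From §8 it follows that if $A \in \mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ where $A = (A_1, A_2, \ldots)$ and if $E_n(\lambda)$ is the spectral family of $A_n$ and if $E(\lambda) = (E_1(\lambda), E_2(\lambda), \ldots)$ then

$$A = \int \lambda\, dE(\lambda),$$

where the integral is norm-convergent. A similar result holds for unitary operators.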
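An operator $A = (A_1, A_2, \ldots)$ is said to be compact if all the $A_n$ are compact. For a compact operator it follows that (9.2) holds.

Let $x_\nu$ be a complete orthonormal basis for $\mathcal{H}$; let $A \in \mathcal{A}_{r+}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$. (11.1) is independent of the choice of the complete orthonormal basis; we denote it by $\operatorname{tr}(A)$. For $A = (A_1, A_2, \ldots)$

$$\operatorname{tr}(A) = \sum_n \operatorname{tr}(A_n). \tag{15.2}$$

Let $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ denote the set of operators from $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ for which

$$\|A\|_s = \operatorname{tr}(\sqrt{A^2}) = \operatorname{tr}(A_+) + \operatorname{tr}(A_-) = \sum_n \operatorname{tr}(\sqrt{A_n^2}) = \sum_n \operatorname{tr}(A_{n+}) + \sum_n \operatorname{tr}(A_{n-})$$

is finite. Thus we obtain

$$\|A\|_s = \sum_n \|A_n\|_s,$$

where $\|A_n\|_s$ is the norm in $\mathcal{B}(\mathcal{H}_n)$. $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is norm-separable because all the $\mathcal{B}(\mathcal{H}_n)$ are norm-separable. It is easy to show that $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is an ordered Banach space and is, in addition, a base norm space where the basis $K$ is the set of all $W = (W_1, W_2, \ldots) \ge 0$ for which

$$\operatorname{tr}(W) = \sum_n \operatorname{tr}(W_n) = 1.$$

For $B \in \mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$, $l(A) = \operatorname{tr}(AB)$ is a bounded linear form over $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ for which $|\operatorname{tr}(AB)| \le \|A\|_s\|B\|$, where the latter follows directly from $\operatorname{tr}(AB) = \sum_n \operatorname{tr}(A_nB_n)$. Conversely, if $l$ is a bounded linear form over $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ then, from

$$A = (A_1, A_2, \ldots) = (A_1, 0, \ldots) + (0, A_2, 0, \ldots) + \cdots,$$

where this sum converges in the norm, it follows that

$$l(A) = \sum_n l_n(A_n),$$

where $l_n(A_n)$ is a bounded linear form over $\mathcal{B}(\mathcal{H}_n)$. From $|l(A)| \le c\|A\|_s$ we obtain $|l_n(A_n)| \le c\|A_n\|_s$. According to §11 we obtain $l_n(A_n) = \operatorname{tr}(A_nB_n)$ where $B_n \in \mathcal{L}_r(\mathcal{H}_n)$ and $\|B_n\| \le c$. Therefore we obtain

$$l(A) = \sum_n l_n(A_n) = \sum_n \operatorname{tr}(A_nB_n) = \operatorname{tr}(AB),$$

where $B = (B_1, B_2, \ldots)$ and $\|B\| \le c$. The dual Banach space $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ which corresponds to $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ may therefore be identified with $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$.

By an extension of the results obtained in §11 we find that $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is separable in the $\sigma(\mathcal{B}'(\ldots), \mathcal{B}(\ldots))$-topology. In addition, the results from §11 about affine functionals over $K$ remain valid for $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$.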
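The trace and trace-norm formulas for sequences of operators can be illustrated with a small computation. This is a minimal sketch under strong simplifying assumptions: there are finitely many finite-dimensional $\mathcal{H}_n$, and each $A_n$ and $B_n$ is diagonal in the same basis, so that an operator is stored simply as a list of real eigenvalues per component. The function names are illustrative.

```python
# An operator A = (A_1, A_2, ...) is stored as a list of eigenvalue lists,
# one list per component space H_n (all components assumed diagonal).

def trace(a):
    """tr(A) = sum over n of tr(A_n), as in (15.2)."""
    return sum(sum(a_n) for a_n in a)

def positive_part(a):
    return [[max(lam, 0.0) for lam in a_n] for a_n in a]

def negative_part(a):
    return [[max(-lam, 0.0) for lam in a_n] for a_n in a]

def trace_norm(a):
    """||A||_s = sum over n of ||A_n||_s, with ||A_n||_s = sum of |eigenvalues|."""
    return sum(sum(abs(lam) for lam in a_n) for a_n in a)

def operator_norm(b):
    """||B|| = sup over n of ||B_n||, with ||B_n|| = largest |eigenvalue|."""
    return max(max(abs(lam) for lam in b_n) for b_n in b)

def product(a, b):
    """AB = (A_1 B_1, A_2 B_2, ...) for simultaneously diagonal components."""
    return [[x * y for x, y in zip(a_n, b_n)] for a_n, b_n in zip(a, b)]

A = [[0.5, -0.25], [0.125], [-1.0, 0.0, 0.25]]
B = [[2.0, 1.0], [-3.0], [0.5, 4.0, -1.0]]

norm_s = trace_norm(A)
split = trace(positive_part(A)) + trace(negative_part(A))
pairing = trace(product(A, B))
bound = trace_norm(A) * operator_norm(B)
```

For these values `norm_s` and `split` coincide, matching $\|A\|_s = \operatorname{tr}(A_+) + \operatorname{tr}(A_-)$, and `abs(pairing)` does not exceed `bound`, matching $|\operatorname{tr}(AB)| \le \|A\|_s\|B\|$.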
It is easy to verify that Gleason's theorem is applicable to the set $G(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ of projection operators from $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$.

Since almost all theorems for a single Hilbert space are applicable to $\mathcal{H} = \bigcup_n \mathcal{H}_n$, in the text we have referred readers to those sections in AIV where the theorems have been proven for the case of a single Hilbert space. We shall assume that the reader is able to extend these theorems to $\mathcal{H} = \bigcup_n \mathcal{H}_n$ using the contents of §15.

Here it is important to note that it is often useful to imbed $\mathcal{H} = \bigcup_n \mathcal{H}_n$ into a Hilbert space

$$\mathcal{H}_s = \sum_n \oplus\, \mathcal{H}_n,$$

that is, the elements of $\mathcal{H}_s$ are defined as sequences $x = (x_1, x_2, \ldots)$ where $x_n \in \mathcal{H}_n$ and $\sum_n \|x_n\|^2 < \infty$. The inner product in $\mathcal{H}_s$ is defined by

$$\langle x, y\rangle = \sum_n \langle x_n, y_n\rangle.$$

An element $x \in \mathcal{H} = \bigcup_n \mathcal{H}_n$ is an element of one of the $\mathcal{H}_n$ and is identified with $(0, 0, \ldots, x, \ldots)$ where $x$ is in the $n$th position. We may identify $A = (A_1, A_2, \ldots) \in \mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ with the following operator in $\mathcal{L}(\mathcal{H}_s)$:

$$A(x_1, x_2, \ldots) = \sum_n A_nx_n.$$

In this way $\mathcal{A}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$, $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ become subsets of $\mathcal{L}(\mathcal{H}_s)$ and $\mathcal{L}_r(\mathcal{H}_s)$ respectively, and, as it is easy to show, $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ becomes a subset of $\mathcal{B}(\mathcal{H}_s)$. If $P_n$ is the projection operator onto the subspace $\mathcal{H}_n$ of $\mathcal{H}_s$ then $\mathcal{A}_r(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is precisely the set of those operators in $\mathcal{L}_r(\mathcal{H}_s)$ which commute with all $P_n$. In the same manner, $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is the set of all operators in $\mathcal{B}(\mathcal{H}_s)$ which commute with all $P_n$. The basis $K(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ of the base norm space $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ is the subset of all those operators from $K(\mathcal{H}_s)$ (the basis for the base norm space $\mathcal{B}(\mathcal{H}_s)$) which commute with all the $P_n$.

Of course, the imbedding described above has no physical meaning for the physically interpretable structure introduced in III, §3 and §5. This imbedding can, however, lead to important computational techniques.
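The imbedding into the direct sum can be made concrete with block-diagonal matrices. The following is a minimal sketch, assuming three finite-dimensional component spaces of dimensions 2, 1 and 2; the helper names `block_diag`, `projection` and `matmul` are illustrative and not taken from the text.

```python
def matmul(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def block_diag(blocks):
    """Matrix in H_s of the sequence A = (A_1, A_2, ...): the A_n on the diagonal."""
    total = sum(len(b) for b in blocks)
    out = [[0] * total for _ in range(total)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out

def projection(dims, n):
    """Projection of H_s onto the n-th summand (n is a 0-based index)."""
    total, start = sum(dims), sum(dims[:n])
    return [[1 if i == j and start <= i < start + dims[n] else 0
             for j in range(total)]
            for i in range(total)]

dims = [2, 1, 2]
components = [[[1, 2], [3, 4]], [[5]], [[0, 1], [1, 0]]]
A = block_diag(components)
projections = [projection(dims, n) for n in range(len(dims))]

# A sequence of operators, imbedded in H_s, commutes with every P_n.
block_commutes = all(matmul(A, p) == matmul(p, A) for p in projections)

# An operator with an entry linking H_1 to H_2 does not commute with P_1.
C = block_diag(components)
C[0][2] = 1
linking_commutes = all(matmul(C, p) == matmul(p, C) for p in projections)

# The trace in H_s is the sum of the component traces, as in (15.2).
trace_total = sum(A[i][i] for i in range(len(A)))
trace_by_component = sum(sum(b[i][i] for i in range(len(b))) for b in components)
```

Here `block_commutes` is `True` while `linking_commutes` is `False`, in line with the characterization of the imbedded operators as those commuting with all $P_n$.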
References

[1] G. Ludwig. Die Grundstrukturen einer physikalischen Theorie. Berlin-Heidelberg-New York: Springer-Verlag, 1978.
[2] G. Ludwig. Einführung in die Grundlagen der Theoretischen Physik, Vols. I-IV. Braunschweig: Vieweg, 1974-1979.
[3] G. Ludwig. Quantum theory as a theory of interactions between macroscopic systems which can be described objectively. Erkenntnis, 16, 359-387 (1981).
[4] N. Bourbaki. Theory of Sets. Paris: Hermann; Reading, Mass.: Addison-Wesley, 1968.
[5] G. Ludwig. Makroskopische Systeme und Quantenmechanik. Notes in Math. Phys., Vol. 5; Marburg, 1972.
[6] G. Ludwig. Meß- und Präparierprozesse. Notes in Math. Phys., Vol. 6; Marburg, 1972.
[7] G. Ludwig. Measuring and preparing processes. In Foundation of Quantum Mechanics and Ordered Linear Spaces. Springer Lecture Notes in Physics, Vol. 29, 1974.
[8] H. Neumann. A mathematical model for a set of microsystems. Int. J. Theor. Phys., 17, 3, 1978.
[9] M. Drieschner. Voraussage-Wahrscheinlichkeit-Objekt. Springer Lecture Notes in Physics, Vol. 99, 1979.
[10] V. S. Varadarajan. Geometry of Quantum Theory, Vol. II. New York: Van Nostrand Reinhold, 1970.
[11] O. M. Nikodym. Sur l'existence d'une mesure parfaitement additive et non séparable. Mem. Acad. Roy. Belgique, XVII, 1939.
[12] P. Jordan. Anschauliche Quantentheorie. Berlin-Heidelberg-New York: Springer-Verlag, 1936.
[13] G. Ludwig. An Axiomatic Basis for Quantum Mechanics. New York-Heidelberg-Berlin: Springer-Verlag, 1983.
[14] M. Jammer. The Philosophy of Quantum Mechanics. New York: Wiley, 1977.
     E. Scheibe. The Logical Analysis of Quantum Mechanics. New York: Pergamon Press, 1973.
[15] L. Kanthack. In preparation.
[16] P. Mittelstaedt. Quantum Logic. Dordrecht: Reidel, 1978.
[17] G. Ludwig. Deutung des Begriffs "physikalische Theorie" und axiomatische Grundlegung der Hilbertraumstruktur der Quantenmechanik durch Hauptsätze des Messens. Springer Lecture Notes in Physics, Vol. 4, 1970.
[18] H. Neumann. Idealizations of Preparation and Registration Procedures. In preparation.
[19] O. Melsheimer and R. Werner. In preparation.
[20] H. J. Schmidt. Coordinatization of certain convex sets in axiomatic quantum theory. Doctoral thesis; Marburg Univ., 1975.
[21] H. Neumann. Classical systems and observables in quantum mechanics. Comm. Math. Phys., 23, 100 (1971).
     H. Neumann. Classical Systems in Quantum Mechanics and Their Representations in Topological Spaces. Notes in Math. Phys., Vol. 10; Marburg, 1972.
     H. Neumann. On the Representation of Classical Systems. Springer Lecture Notes in Physics, Vol. 29, 1974, pp. 316-321.
[22] K. Kraus. Position observables of the photon. In The Uncertainty Principle and Foundations of Quantum Mechanics, W. C. Price and S. S. Chissick (Eds.). New York: Wiley, 1977.
     H. Neumann. Transformation properties of observables. Helv. Phys. Acta, 45, 811 (1972).
[23] W. Band and J. L. Park. Quantum state determination: Quorum for a particle in one dimension. Amer. J. Phys., 47, 188 (1979).
[24] L. S. Pontrjagin. Topological Groups. London: Gordon & Breach, 1966.
[25] H. Yamabe. A generalization of a theorem of Gleason. Ann. Math. (2), 58, 351 (1953).
[26] A. Bohm. The Rigged Hilbert Space and Quantum Mechanics. Springer Lecture Notes in Physics, Vol. 78, 1978.
[27] G. Ludwig. The connection between the objective description of macrosystems and quantum mechanics of "many particles". In Essays in Honor of Wolfgang Yourgrau, Alwyn van der Merwe (Ed.). New York: Plenum Press, 1982.
[28] G. W. Mackey. Induced Representations and Quantum Mechanics. Reading, Mass.: Benjamin, 1968.
[29] A. Hartkämper and H. Neumann. Private communication.
[30] S. Sakai. C*-algebras and W*-algebras. New York-Heidelberg-Berlin: Springer-Verlag, 1971.
     F. W. Shultz. Pure states as a dual object for C*-algebras. Commun. Math. Phys., 82, 497 (1982).
     J. Dixmier. C*-algebras. Amsterdam: North-Holland, 1977.
     W. Arveson. An Introduction to C*-algebras. New York-Heidelberg-Berlin: Springer-Verlag, 1976.
[31] D. Kappos. Probability Algebras and Stochastic Spaces. New York: Academic Press, 1969.
[32] N. Bourbaki. General Topology. Paris: Hermann; Reading, Mass.: Addison-Wesley, 1966.
[33] H. H. Schaefer. Topological Vector Spaces. New York-Heidelberg-Berlin: Springer-Verlag, 1971.
     G. Jameson. Ordered Linear Spaces. Springer Lecture Notes in Math., Vol. 141, 1970.
     K. Ng. Partially Ordered Topological Vector Spaces. Oxford: Clarendon Press, 1973.
     R. Cristescu. Ordered Vector Spaces and Linear Operators. Tunbridge Wells, Kent: Abacus Press, 1976.
[34] P. R. Halmos. Introduction to Hilbert Space. New York: Chelsea, 1957.
     M. Reed and B. Simon. Methods of Modern Mathematical Physics, Vol. I, §II. New York: Academic Press, 1972.
     J. Weidmann. Linear Operators in Hilbert Space. New York-Heidelberg-Berlin: Springer-Verlag, 1980.
[35] M. H. Stone. Linear Transformations in Hilbert Space. Providence, RI: American Mathematical Society, 1932.
     M. Reed and B. Simon. Methods of Modern Mathematical Physics, Vol. I, §VIII. New York: Academic Press, 1972.
[36] W. O. Amrein. Nonrelativistic Quantum Dynamics. Dordrecht: Reidel, 1981.
[37] G. Ludwig. Die Grundlagen der Quantenmechanik. Berlin-Heidelberg-New York: Springer-Verlag, 1954.
[38] G. Birkhoff and J. von Neumann. The logic of quantum mechanics. Ann. Math., 37, 823 (1936).
[39] G. W. Mackey. Mathematical Foundation of Quantum Mechanics. Reading, Mass.: Benjamin, 1963.
[40] J. M. Jauch. Foundation of Quantum Mechanics. Reading, Mass.: Addison-Wesley, 1973.
[41] C. Piron. Foundation of Quantum Physics. New York: Benjamin, 1976.
[42] S. P. Gudder. Stochastic Methods in Quantum Physics. Amsterdam: North-Holland, 1979.
[43] D. J. Foulis and C. H. Randall. Operational statistics, I: Basic concepts. J. Math. Phys., 13, 1667-1675 (1972).
     C. H. Randall and D. J. Foulis. Operational statistics, II: Manuals of operations and their logics. J. Math. Phys., 14, 1472-1480 (1973).
     D. J. Foulis and C. H. Randall. Empirical logic and tensor products; and Operational statistics and tensor products. In: Interpretations and Foundations of Quantum Theory, H. Neumann (Ed.). Mannheim: B. I. Wissenschaftsverlag, 1981; and references cited therein.
[44] N. Zierler. Axioms for nonrelativistic quantum mechanics. Pacific J. Math., 11, 1151 (1961).
[45] G. Ludwig. The measuring process and an axiomatic foundation of quantum mechanics (Appendix of this article). In: Rendiconti della Scuola Internazionale di Fisica "Enrico Fermi", IL Corso. New York: Academic Press, 1971.
     G. Ludwig and H. Neumann. Connections between different approaches to the foundations of quantum mechanics. In Interpretations and Foundations of Quantum Theory, H. Neumann (Ed.). Mannheim: B. I. Wissenschaftsverlag, 1981.
[46] H. Gerstberger, H. Neumann and R. Werner. Makroskopische Kausalität und relativistische Quantenmechanik. In: Grundlagenprobleme der modernen Physik, J. Nitsch, J. Pfarr and E. W. Stachow (Eds.). Mannheim: B. I. Wissenschaftsverlag, 1981.
     H. Neumann and R. Werner. Causality between preparation and registration processes in relativistic quantum theory. To appear.
[47] G. Ludwig. Axiomatische Basis einer physikalischen Theorie und theoretische Begriffe. Z. Allg. Wiss., XII/1, 55 (1981).
[48] G. Ludwig. Der Meßprozeß. Z. Phys., 135, 483-511 (1953).
     G. Ludwig. Zur Deutung der Beobachtung in der Q. M. Phys. Bl., 11, 489-494 (1955).
     G. Ludwig. Zum Ergodensatz und zum Begriff der makroskopischen Observablen. Z. Naturforsch., 12a, 662-663 (1957).
     G. Ludwig. Zum Ergodensatz und zum Begriff der makroskopischen Observablen, I. Z. Phys., 150, 346-375 (1958).
     G. Ludwig. Zum Ergodensatz und zum Begriff der makroskopischen Observablen, II. Z. Phys., 152, 98-115 (1958).
     G. Ludwig. Axiomatic quantum statistics of macroscopic systems (Ergodic theory). Rendiconti della Scuola Internazionale di Fisica "Enrico Fermi", XIV Corso. New York: Academic Press, 1960, pp. 57-132.
     G. Ludwig. Gelöste und ungelöste Probleme des Meßprozesses in der Quantenmechanik. In: Werner Heisenberg und die Physik unserer Zeit. Braunschweig, 1961, pp. 150-181.
     G. Ludwig. Zur Begründung der Thermodynamik auf Grund der Quantenmechanik. Z. Phys., 171, 476-486 (1963).
     G. Ludwig. Zur Begründung der Thermodynamik auf Grund der Quantenmechanik II, Masterequation. Z. Phys., 173, 232-240 (1963).
     G. Ludwig. Versuch einer axiomatischen Grundlegung der Quantenmechanik und allgemeinerer physikalischer Theorie. Z. Phys., 181, 233-260 (1964).
     G. Ludwig. An axiomatic foundation of quantum mechanics on a nonsubjective basis. In: Quantum Theory and Reality. New York-Heidelberg-Berlin: Springer-Verlag, 1967, pp. 98-104.
     G. Ludwig. Attempt of an axiomatic foundation of quantum mechanics and more general theories, II. Commun. Math. Phys., 4, 331-348 (1967).
     G. Ludwig. Hauptsätze über das Messen als Grundlage der Hilbert-Raum-Struktur der Quantenmechanik. Z. Naturforsch., 22a, 1303-1323 (1967).
     G. Ludwig. Ein weiterer Hauptsatz über das Messen als Grundlage der Hilbert-Raum-Struktur der Quantenmechanik. Z. Naturforsch., 22a, 1324-1327 (1967).
     G. Ludwig. Attempt of an axiomatic foundation of quantum mechanics and more general theories, III. Commun. Math. Phys., 9, 1-12 (1968).
     G. Dahn. Attempt of an axiomatic foundation of quantum mechanics and more general theories, IV. Commun. Math. Phys., 9, 192-211 (1968).
     P. Stolz. Attempt of an axiomatic foundation of quantum mechanics and more general theories, V. Commun. Math. Phys., 11, 303-313 (1969).
     G. Ludwig. [17].
     P. Stolz. Attempt of an axiomatic foundation of quantum mechanics and more general theories, VI. Commun. Math. Phys., 23, 117-126 (1971).
     G. Ludwig. The measuring process and an axiomatic foundation of quantum mechanics. In: Foundations of Quantum Mechanics. Rendiconti della Scuola Internazionale di Fisica "Enrico Fermi", IL Corso. New York: Academic Press, 1971, pp. 287-317.
     G. Ludwig. A Physical Interpretation of an Axiom within an Axiomatic Approach to Quantum Mechanics and a New Formulation of this Axiom as a General Covering Condition. Notes in Math. Phys., Vol. 1; Marburg, 1971.
     G. Ludwig. Transformationen von Gesamtheiten und Effekten. Notes in Math. Phys., Vol. 4; Marburg, 1971. 61 pages.
     G. Ludwig. [5].
     G. Ludwig. [6].
     G. Ludwig. An improved formulation of some theorems and axioms in the axiomatic foundation of the Hilbert space structure of quantum mechanics. Commun. Math. Phys., 26, 78-86 (1972).
     G. Ludwig. Why a new approach to found quantum theory? In: The Physicist's Conception of Nature. Dordrecht-Boston: Reidel, 1973, pp. 702-708.
     G. Ludwig. [7].
     G. Ludwig. Measurement as a Process of Interaction Between Macroscopic Systems. Notes in Math. Phys., Vol. 14; Marburg, 1974.
     G. Ludwig. A theoretical description of single microsystems. In: The Uncertainty Principle and Foundations of Quantum Mechanics, W. C. Price and S. S. Chissick (Eds.). New York: Wiley, 1977.
     G. Ludwig. An axiomatic basis of quantum mechanics. In: Interpretations and Foundations of Quantum Theory, H. Neumann (Ed.). Mannheim: B. I. Wissenschaftsverlag, 1981.
List of Frequently Used Symbols J set of preparation procedures 22 probability function for & 22 0to set of registration methods 23 A<a0 probability function for 23 ^ set of registration procedures 23 У* set of selection procedures generated by 26 probability function for Sf 26 & set of effect processes 31 Ж> К sets of ensembles 42, 56 Se, L sets of effects 43, 57 cp canonical map 2! -► Ж cz К 43,57 ф canonical map ^ JS? с= L 43, 57 fj. probability function Ж x 5£ -► [0, 1] and К x L -> [0, 1] 48 Str(y4) dispersion of measurement value for observable A 195 со A smallest convex set containing the set A 362 со A smallest convex set containing the set A (closed in the norm topology) 56, 140, 362 coa A smallest convex set containing the set A (closed in the <r(X, Z^-topology) 362 421
List of Axioms AS 1.1 16 AS 1.2 16 AS 1.3 21 AS 2.1 19 AS 2.2 19 AS 2.3 19 AS 2.4.1 20 AS 2.4.2 20 APS 1 22 APS 2 22 APS 3 22 APS 4.1 22 APS 4.2 22 APS 5.1.1 25 APS 5.1.2 44 APS 5.1.3 44 APS 5.1.4 45 APS 5.2 25 APS 6 26 APS 7 26 APS 8.1 28 APS 8.2 28 АР 1 50 AR 1 53 AV 1.1 58 AV 1.2s 58 AV 2f 59 AVid 58 AV 3 59 AV 4s 59 AE 1 14 AE 2 14 AE 3 61 AE 4.1 61 AE 4.2 61 APE 1 69 APE 2 69 AQ 73 AOb 154 APr 172 AH 1 187 AH 2 187 422
Index accumulation point 354 actual physical pseudoproperties 72 additive measure 86, 88 additive real measure 20 adjoint operator 375 affine functional 363 angular momentum 293 anti-bilinear form 405 associated kernel, observable 128 axiomatic basis 2 Baire space 357 Banach space 359 base (for the cone) 363 base-normed space 363 ^-continuous 211 Bessel’s inequality 369 bilinear form 390 Boolean lattice 348 Boolean ring 348 Borel structure 251 bounded 360 linear form 373 operator 372 carrier of interaction 13, 27 Cauchy filter 356 Cauchy sequence 356, 371-373 center (of a lattice) 75 center of mass 308 closed subspace 57, 370 closure 353 coarser than 354, 355 coexistent 86, 153, 167 coexistent components 158 coexistent effect processes 86 coexistant effect morphisms 214 coexistent effects 84, 87 coexistant morphisms 214 coexistant observables 84 coexistant registrations 84 commensurable 90, 153 compact operator 375 complement 345 complementary 154, 168, 345 complete 356 complete lattice 344 completion 356 completely continuous 375 composite system type 262 connected 357 connected component 358 continuity 354 continuous spectrum 140 convergence 354 convex cone 363 423
424 Index convex equivalent 124 convex range 124 convex set 362 covering group 272 ^-continuous 205, 208 decision effect 58, 75, 79, 290 decision observable 105 decision preparator 162 decomposition 48, 52, 75, 131 dense 353 detection response 57 direct mixture 49 discrete spectrum 140 dispersion-free 163 distributive lattice 345 domain of definition 385 domain of reality 2 dual 205 dual space 360 effect 41, 43 effect morphism 211 effect process 84 effect transformer 216 effective 87 elementary system type 262 energy observable 292 ensemble 41, 42 equivalent observables 129 equivalent group representations 238 essentially self-adjoint 386 extreme kernel 124 face 57,75 filter 353 finer than 354 finiteness of physics 56 fundamental domain 3 Galileo group 258, 260 generalized Boolean ring 350 generating 363 Gleason’s theorem 397 Hamiltonian operator 331 for inner structure 308 Hausdorffspace 355 Heisenberg commutation relation 291 Heisenberg uncertainty relation 193, 197, 294 Hermitian operator 375 hidden properties 186 Hilbert space 366 ideal 201 induced topology 354 initial topology 354 initial uniform structure 355 inner product 366 inner structure effects 309 observable 309 interaction carrier 9, 27 interior 353 irreducible 135 isometry 379, 389 joint measurement 156 joint preparations 156 kernal observable 127 kinetic energy observable 308 Krein-Milman theorem 57, 124-126, 362 lattice 343, 345, 346, 350 law of quantization 59 Lie group 304 linear form 360 linear functional 360 linear operator 372 linear vector space 111, 359 linearly connected 358 locally compact 357 may be combined 24 maximal 386 meager 357 measure 86 measurement scale 139 metric 356 metrizable 356 mi-morphism 122, 206 Minkowski functional 111 mixture 75 mixture morphism 122, 206 modular 346
modular pair 346
momentum observable 290, 317
more extensive 129, 166
morphisms for selection procedures 199
morphisms of effects 211
morphisms of ensembles 206

neighborhood
  filter 353
  structure 353
norm 360
norm closure 57
normed space 360
normed orthogonal system 369
nowhere dense 357

objective property 13, 60, 67, 68, 173
observable 103
operation 206
  measure 215
open set 353
operators of trace class 391
orbital angular momentum 293
order isomorphism 216
order unit space 364
ordered linear vector space 363
orthocomplementation 345
orthocomplemented lattice 345
orthogonal 367
orthomeasure 347
orthonormal basis 369
orthonormal system 369

parity transformations 299
partial observable 141
partially ordered set 343
⊥-isomorphism 217
phase space 247
physical approximation 18
physical object 55, 68
physical system 27
physical theory (PT) 2
Poincaré group 253, 259, 260, 292
pointwise convergence 372
poset 343
position observable 289, 290, 317
  of center of mass 317
p-automorphism 203
p-continuous 204
p-invariant 204
p-isomorphism 205
p-morphism 203
precompact 57, 357
preparation-continuous 204
preparation-invariant 204
preparation-morphism 203
preparation procedure 6, 22
preparator 160
probability 19
product uniform structure 356
products of Hilbert spaces 405
projection operator 377
property 14
pseudoproperty 69, 177

quantum logic 181

random variable 139
r-continuous 204
r-invariant 203, 204
r-isomorphism 205
recording-continuous 205
recording-invariant 203, 204
registration apparatus 8
registration methods 23
registration procedures 23
relative momentum operator 318
relative position operator 318
relatively compact 357
relatively complemented lattice 350
rest energy observable 308
rotation group 272

scale observable 150
Schmidt orthogonalization procedure 370
Schwarz inequality 367
selection procedures 16
self-adjoint 386
separable 353
separated part 202
set lattice 352
signed additive measure 108
Slater determinant 326
spatial reflection 299
spectral representation 382
spectral theorem 119
spectrum 140
sp-morphism 200
ssp-morphism 202
spin angular momentum 293
state 42
statistical selection procedures 18, 19
Stern-Gerlach experiment 293
Stone’s theorem 62
symmetric operator 386
system type 175, 262

theory of species of structure 3
topology 354, 355
topological space 353
total momentum 308, 317
trace 391
trace class operators 391
transpreparation processes 172
trans-preparator 215
Tychonov’s theorem 357

uncertainty relations 193
uniform continuity 355
uniform space 355
uniform structure 355
uniformizable space 355
unitary operator 379
unitary representation 255
unitary representation (up to a factor) 255

vicinities 355

weak convergence 374
weak topologies 361
This is an advanced textbook on quantum mechanics. Its special feature is a derivation of the basic structures from macroscopically describable preparation and registration procedures. For more than three decades the author has made brilliant contributions to the foundations of quantum physics; this is the first textbook that presents his theory. Mathematically thorough and rigorous, this volume documents important progress toward the axiomatic formulation of quantum theory. The first volume concentrates on the fundamental structures, while the second treats “conventional quantum mechanics,” which gains new depth in the author’s thorough interpretation of its basic principles.

ISBN 0-387-11683-4
ISBN 3-540-11683-4