                    Proceedings of Symposia in
Pure Mathematics
Volume 24
Analytic
Number Theory
Symposium on
Analytic Number Theory
March 27-30, 1972
St. Louis, Missouri
Harold G. Diamond
Editor
American Mathematical Society


American Mathematical Society
Providence, Rhode Island
PROCEEDINGS OF THE SYMPOSIUM IN PURE MATHEMATICS OF THE AMERICAN MATHEMATICAL SOCIETY
HELD AT THE ST. LOUIS UNIVERSITY, ST. LOUIS, MISSOURI
MARCH 27-30, 1972

Prepared by the American Mathematical Society under National Science Foundation Grant GP-32302

2000 Mathematics Subject Classification. Primary 11-02.

Library of Congress Cataloging-in-Publication Data
Symposium in Pure Mathematics, St. Louis University 1972.
Analytic number theory.
(Proceedings of symposia in pure mathematics, v. 24)
Includes bibliographies.
1. Numbers, Theory of - Congresses. I. Diamond, Harold G., 1940- ed. II. American Mathematical Society. III. Title. IV. Series.
QA241.S88 1972   512'.73   72-10198
ISBN 0-8218-1424-9

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy a chapter for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given.

Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for such permission should be addressed to the Assistant to the Publisher, American Mathematical Society, P. O. Box 6248, Providence, Rhode Island 02940-6248. Requests can also be made by e-mail to reprint-permission@ams.org.

Copyright © 1973 by the American Mathematical Society. All rights reserved.
Printed in the United States of America.
The American Mathematical Society retains all rights except those granted to the United States Government.
The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
Visit the AMS home page at URL: http://www.ams.org/
CONTENTS

Foreword

Effective methods in Diophantine problems. II
    By A. Baker
Character transformation formulae similar to those for the Dedekind eta-function
    By Bruce C. Berndt
On large sieve type estimates for the Dirichlet series operator
    By M. Forti and C. Viola (Presented by Enrico Bombieri)
On Iwasawa's analogue of the Jacobian for totally real number fields
    By John Coates
The distribution of values of Euler's phi function
    By Harold G. Diamond
On connections between the Turán-Kubilius inequality and the large sieve: Some applications
    By P. D. T. A. Elliott
On the number of solutions of $m=\sum_{i=1}^{k}x_i^k$
    By P. Erdős and E. Szemerédi
The large sieve and probabilistic Galois theory
    By P. X. Gallagher
Some remarks on arithmetic density questions
    By Larry Joel Goldstein
Relations between the values at integral arguments of Dirichlet series that satisfy functional equations
    By E. Grosswald
On the incompatibility of two conjectures concerning primes
    By Douglas Hensley and Ian Richards
On the intervals between consecutive terms of sequences
    By Christopher Hooley
The difference between consecutive primes
    By Martin Huxley
On the Mertens conjecture and related general $\Omega$-theorems
    By W. B. Jurkat
The distribution of the values of real quadratic forms at integer points
    By D. J. Lewis
The classification of transcendental numbers
    By K. Mahler
The pair correlation of zeros of the zeta function
    By H. L. Montgomery
Metric theorems on the distribution of sequences
    By H. G. Niederreiter
Bounds for sequences of consecutive power residues. I
    By Karl K. Norton
Rational points on certain elliptic modular curves
    By A. P. Ogg
Arithmetic functions and Brownian motion
    By Walter Philipp
Brun's method and the fundamental lemma
    By H.-E. Richert and H. Halberstam
Estimation of the area of the smallest triangle obtained by selecting three out of n points in a disc of unit area
    By K. F. Roth
Euler products associated with Beurling's generalized prime number systems
    By C. Ryavec
Systematic examination of Littlewood's bounds on $L(1,\chi)$
    By Daniel Shanks
On the Riemann hypothesis in hyperelliptic function fields
    By H. M. Stark
Class numbers of totally imaginary fields
    By Judith S. Sunley
Exponential sums and the Riemann conjecture
    By Paul Turán
A new estimate for the exceptional set in Goldbach's problem
    By Robert C. Vaughan
On Euclidean rings of algebraic integers
    By Peter J. Weinberger

Author Index
Subject Index
FOREWORD

A symposium on Analytic Number Theory and Related Parts of Analysis was held at St. Louis University, St. Louis, Missouri, on March 27-30, 1972, in conjunction with the six hundred ninety-third meeting of the American Mathematical Society. The Organizing Committee for the symposium consisted of Harold G. Diamond (chairman), Patrick X. Gallagher, Hugh L. Montgomery, Wolfgang M. Schmidt, and Harold M. Stark.

Twenty-nine number theorists were invited to lecture on their recent research, which covers a broad spectrum of contemporary work in number theory. The program was arranged in seven half-day sessions, chaired by Paul T. Bateman, Paul Erdős, Lowell Schoenfeld, and the organizers.

This volume contains accounts of all the lectures presented at the symposium. Paul Erdős, who could attend only the first hours of the symposium, also contributed an article to the volume. The articles are arranged alphabetically (according to the name of the speaker in the case of joint work).

The conference participants are indebted to a number of individuals and organizations for their good planning and administration. In particular, mention should be made of the work of AMS Associate Secretary Paul T. Bateman, Mrs. Lillian Casey of the AMS, and Lawrence W. Conlon of St. Louis University. Financial support for the symposium was provided by a grant from the National Science Foundation.

It is hoped that the lively ideas presented at the symposium will be further disseminated by this volume and will spawn new number theoretic research in the years to come.

Harold G. Diamond
[Photographs of participants: A. Baker, P. T. Bateman, Bruce C. Berndt, Enrico Bombieri, John Coates, Harold G. Diamond, P. D. T. A. Elliott, P. Erdős, P. X. Gallagher, Larry Joel Goldstein, E. Grosswald, Douglas Hensley, Christopher Hooley, Martin Huxley, W. B. Jurkat, D. J. Lewis, K. Mahler, H. L. Montgomery, H. G. Niederreiter, Karl K. Norton, A. P. Ogg, Walter Philipp, H.-E. Richert, K. F. Roth, C. Ryavec, L. Schoenfeld, Daniel Shanks, H. M. Stark, Judith S. Sunley, Paul Turán, Robert C. Vaughan, Peter J. Weinberger]
EFFECTIVE METHODS IN DIOPHANTINE PROBLEMS. II

A. BAKER

AMS 1970 subject classifications. Primary 10-02, 10F35; Secondary 10B45, 10F25, 10H10, 12A25, 12A50.
© 1973, American Mathematical Society

1. Introduction. Three years ago, at a conference held in Stony Brook, I surveyed the theories which had then recently been developed for the effective resolution of a diverse collection of Diophantine problems [1] (see also [2]). Since that time, several of the topics have been considerably expanded and I should like to use the opportunity provided by the present Symposium to bring the account up to date.

2. Lower bounds for linear forms. One of the most active fields of research has been concerned with improved bounds for linear forms in the logarithms of algebraic numbers. In particular, much study has been made of the special situation, of considerable importance in applications, when one of the algebraic numbers has a large height relative to the remainder. The primary result obtained in this connexion reads as follows.

Theorem 1. Let $\alpha_1,\ldots,\alpha_n,\beta_1,\ldots,\beta_n$ be nonzero algebraic numbers with degrees at most $d$, let $\alpha_1,\ldots,\alpha_{n-1}$ have heights at most $A'$ and let $\alpha_n$ and $\beta_1,\ldots,\beta_n$ have heights at most $A$ and $B$ respectively. If $\varepsilon>0$, $\delta>0$ and

$$0<|\beta_1\log\alpha_1+\cdots+\beta_n\log\alpha_n|<e^{-\delta H}$$

for some $H>\exp((\log B)^{1/2})$, then $H<C(\log A)^{1+\varepsilon}$, where $C=C(n,d,\varepsilon,\delta,A')$ is effectively computable.

Special cases of the theorem were proved by Stark [21] and myself [3] in connexion with certain class number problems (see §4) and the full result was obtained by a combination of our methods [9]. Previous work, as described in [1], had led to a similar theorem but with $1+\varepsilon$ replaced by a number greater than $n-1$. The condition $H>\exp((\log B)^{1/2})$ can be relaxed to $H>(\log B)^{cn^2/\varepsilon}$ for a sufficiently large absolute constant $c$, provided that $\varepsilon<1$, and, furthermore, one can replace $(\log A)^{\varepsilon}$ by some power of $\log\log A$, though this power is usually large.

Very recently, by further developments of the arguments, it has been shown that, in the case when $\beta_1,\ldots,\beta_{n-1}$ are rational integers and $\beta_n=-1$, conditions frequently satisfied in applications, the exponent $1+\varepsilon$ can be replaced by 1, which is best possible.

Theorem 2 [5], [6]. Let $\alpha_1,\ldots,\alpha_n$ be nonzero algebraic numbers with degrees at most $d$ and let the heights of $\alpha_1,\ldots,\alpha_{n-1}$ and $\alpha_n$ be at most $A'$ and $A$ $(\ge 2)$ respectively. If, for some $\varepsilon>0$, there exist rational integers $b_1,\ldots,b_{n-1}$ with absolute values at most $B$ such that

$$0<|b_1\log\alpha_1+\cdots+b_{n-1}\log\alpha_{n-1}-\log\alpha_n|<e^{-\varepsilon B},$$

then $B<C\log A$ for some effectively computable number $C$ depending only on $n$, $d$, $A'$ and $\varepsilon$.

Theorem 2 is, in fact, an immediate consequence of another theorem [6] to the effect that there exists $C=C(n,d,A')$ such that, for any $\delta$ with $0<\delta<\tfrac12$, the inequalities

$$0<|b_1\log\alpha_1+\cdots+b_n\log\alpha_n|<(\delta/B')^{C\log A}e^{-\delta B}$$

have no solution in rational integers $b_1,\ldots,b_{n-1}$ and $b_n$ $(\ne 0)$ with absolute values at most $B$ and $B'$ respectively. Clearly, on taking $\delta=1/B$ and assuming $B'\le B$, the number on the right becomes at least $C^{-\log A\log B}$ for some effectively computable $C$, and this bound is best possible with respect to $A$ when $B$ is fixed and with respect to $B$ when $A$ is fixed. Corollaries relating to the theory of Diophantine equations will be discussed in §3.

The proofs of the above theorems depend upon several new developments in the earlier works.
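The substitution $\delta=1/B$ in the bound $(\delta/B')^{C\log A}e^{-\delta B}$ can be followed numerically. The sketch below works with natural logarithms to avoid underflow. The values of `C`, `A`, `B` and `B_prime` are arbitrary illustrative assumptions, not constants taken from [6], and the helper names are hypothetical.

```python
import math


def log_rhs(delta, B, B_prime, C, A):
    """Natural log of (delta / B')**(C * log A) * exp(-delta * B)."""
    return C * math.log(A) * math.log(delta / B_prime) - delta * B


# Hypothetical sample values; only the shape of the bound matters here.
C, A, B, B_prime = 10.0, 1e6, 1000, 1000

# Take delta = 1/B, which satisfies 0 < delta < 1/2 for B >= 3.
value = log_rhs(1.0 / B, B, B_prime, C, A)

# With B' <= B, log(delta / B') >= -2 log B, so the right-hand side is at
# least exp(-(2C + 1) * log A * log B) whenever log A * log B >= 1,
# i.e. of the form C1**(-log A * log B) with C1 = exp(2C + 1).
lower = -(2 * C + 1) * math.log(A) * math.log(B)

print(value, lower)
```

The comparison `value >= lower` is the elementary estimate behind the phrase "at least $C^{-\log A\log B}$"; the constant $e^{2C+1}$ used here is one convenient choice under the stated assumptions.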
In particular, the underlying auxiliary functions are now considerably more involved; the argument leading to Theorem 2, for instance, utilizes a function of the form

[Displayed formula illegible in the source: a multiple sum over $\lambda_{-1},\ldots,\lambda_n$, each index starting from 0, with coefficients $p(\lambda_{-1},\ldots,\lambda_n)$ and a product over $r=1,\ldots,n-1$.]

where the $p(\lambda_{-1},\ldots,\lambda_n)$ are, as usual, rational integers, $\gamma_r=\lambda_r+b_r\lambda_n$ and

$$\Delta(z;k)=\frac{1}{k!}(z+1)\cdots(z+k),\qquad \Delta(z;k,l,m)=\frac{1}{m!}\frac{d^m}{dz^m}(\Delta(z;k))^l.$$

Further, the inductive nature of the expositions is substantially modified and the ultimate contradiction is obtained now by an appeal to certain algebraic lemmas relating to Kummer theory, quite different from the techniques employed previously. The reader is referred to the original memoirs for details.

3. Diophantine equations. In my earlier survey, I discussed the fundamental theorem of Thue on $f(x,y)=m$, where $f$ denotes an irreducible binary form with integer coefficients and degree $n\ge 3$. More especially, I described how the theorem could be made effective, and indeed how one could establish an upper bound

$$\max(|x|,|y|)<C\exp\{(\log m)^{\kappa}\},$$

applicable for all integer solutions $x$, $y$, where $\kappa>n$ and $C$ is computable in terms of $\kappa$ and the coefficients of $f$. In view of Theorem 2, one can now strengthen the number on the right to $Cm^{c}$, where $c$ can be computed like $C$, and this gives at once

Theorem 3. For any algebraic number $\alpha$ with degree $n\ge 3$ there exist positive effectively computable numbers $c$, $\kappa$ depending only on $\alpha$, with $\kappa<n$, such that $|\alpha-p/q|>cq^{-\kappa}$ for all rationals $p/q$ $(q>0)$.

Feldman [13] first obtained this result from a special case of Theorem 2, involving certain restrictions on $\alpha_n$, and his arguments rested on rather different adaptations in the basic theory of linear forms in logarithms; yet another approach, employing $p$-adic analysis, was described by Sprindžuk [19], [20] at about the same time.

Several other new theorems on rational approximations to algebraic numbers follow from the general result cited after the enunciation of Theorem 2. Thus, for instance, it shows that

$$|\alpha-pp'/qq'|>Q^{-\kappa\log\log Q'},$$

where $p'$, $q'$ are comprised solely of powers of fixed sets of primes and $Q$, $Q'$ are the maxima of the absolute values of $p$, $q$ and $p'$, $q'$ respectively; this furnishes a further improvement on Ridout's generalization of Roth's theorem (cf. the recent survey of Schmidt [18]). Furthermore one sees that

$$|\alpha^{1/m}-p/q|>cq^{-\kappa\log m}$$

for any algebraic number $\alpha$, where $c$, $\kappa$ are positive numbers effectively computable in terms of $\alpha$, and this is sharper than the Thue-Siegel inequality when the integer $m$ is large.

The latter theorem recalls to mind the very first effective results in this context, derived by means of special properties of Gauss's hypergeometric function (see [1]). When applicable, this method gives surprisingly strong estimates for the solutions of Diophantine equations, and it has not been dormant. In particular, Feldman [12] and Osgood [15], [16] have widely applied ideas of this nature to study effectively certain equations of norm form in several variables.

4. Class numbers. I described at Stony Brook the transcendental method for determining all the imaginary quadratic fields with class number 1, and I remarked also that the same techniques could be used to treat the analogous problem for class number 2 when the discriminants of the fields are even. Since then, a complete resolution of the class number 2 problem has been obtained, and I should like to indicate the main new idea very briefly. A fuller account is provided by the text of the lecture I delivered a year or so ago in Washington [4].

If $Q((-d)^{1/2})$ has class number 2 and odd discriminant $-d<-15$, then $d=pq$, where $p$, $q$ are primes congruent to 1 and 3 (mod 4) respectively. Denoting by $\chi'(n)$ one of the generic characters associated with forms of discriminant $-d$ and writing

$$\chi(n)=\left(\frac{k}{n}\right),\qquad \chi_{pq}(n)=\left(\frac{-pq}{n}\right),\qquad \chi_{p}(n)=\left(\frac{p}{n}\right),\qquad \chi_{q}(n)=\left(\frac{-q}{n}\right),$$

where $k$ is an integer $\equiv 1$ (mod 4) and $(k,pq)=1$, we obtain

$$L(1,\chi)L(1,\chi\chi_{pq})+L(1,\chi\chi_{p})L(1,\chi\chi_{q})=\sum\chi(f)/f,$$

where $f=f(x,y)$ denotes the principal form with discriminant $-d$, and the sum is over all integers $x$, $y$ not both 0.
Now if $k$ is not a prime power, for instance if $k=21$, then the sum on the right approximates to a rational multiple of $\pi^2$, and on substituting for the $L$-functions on the left from Dirichlet's formulae we obtain an inequality of the type considered in Theorem 1; this leads at once to the desired effective bound for $d$.

By somewhat similar techniques, Schinzel and I [8] have recently shown that every genus of primitive binary quadratic forms with discriminant $D$ represents a positive integer $\le c(\varepsilon)|D|^{3/8+\varepsilon}$ for any $\varepsilon>0$, where $c(\varepsilon)$ depends only on $\varepsilon$. Our proof involves Siegel's theorem on $L$-functions and so does not enable $c(\varepsilon)$ to be effectively computed when $\varepsilon<\tfrac18$; on the other hand, an effective estimate would, as we show, yield a complete determination of all the "numeri idonei" of Euler, and, of course, this would include the class number 1 and 2 results to which I have just referred.

5. Elliptic functions. The main result on elliptic functions cited in [1] has been extended recently by Coates [11].

Theorem 4. Any nonvanishing linear combination of $\omega_1$, $\omega_2$, $\eta_1$, $\eta_2$ and $2\pi i$ with algebraic coefficients is transcendental.

Here $\omega_1$, $\omega_2$ denote a pair of fundamental periods of a Weierstrass $\wp$-function with algebraic invariants $g_2$, $g_3$ and $\eta_1=2\zeta(\tfrac12\omega_1)$, $\eta_2=2\zeta(\tfrac12\omega_2)$, where $\zeta(z)$ denotes the associated Weierstrass $\zeta$-function. The new feature in Theorem 4 is the inclusion of $2\pi i$, this extension having been gained, however, at the cost of some restriction in the hypotheses. The result is of particular interest in view of the Legendre relation

$$\omega_1\eta_2-\omega_2\eta_1=2\pi i,$$

showing that the five numbers in question are algebraically dependent. Furthermore, one sees that the theorem includes the transcendence of such numbers as $\pi+\omega$ and $\pi+\eta$ for any period $\omega$ of $\wp(z)$ and quasi-period $\eta$ of $\zeta(z)$.

Some quantitative estimates in connexion with Theorem 4 have recently been derived by a student of mine, D. W. Masser; in particular, he has proved [14]:

Theorem 5. For any positive integer $n$ and any $\varepsilon>0$, we have

$$|\wp(n)|<Cn^{(\log\log n)^{7+\varepsilon}},$$

where $C$ depends only on $g_2$, $g_3$ and $\varepsilon$.

Moreover he has shown that a similar estimate obtains for $\wp(n+\pi)$ and indeed for $\wp(\alpha)$, where $\alpha$ is any nonzero algebraic number. Theorem 5 compares well with the lower bound $|\wp(n)|>Cn$ valid for some $C>0$ and infinitely many $n$, and it improves upon the result mentioned in [1], where an unspecified power of $\log n$ occurred in place of $\log\log n$. It seems likely that this general area of study will be considerably developed in the next few years (cf. [10]).

6. Further results and problems.
In a lecture at the same conference in Stony Brook to which I referred at the beginning, Chowla raised the problem whether there exists a rational-valued function $f(n)$, periodic with prime period $p$, such that

$$\sum_{n=1}^{\infty}f(n)/n=0.$$

He proved some twenty years ago that this could not hold for odd functions $f$ if $\tfrac12(p-1)$ is prime, a condition subsequently removed by Siegel, and recently he showed that the same is true for even functions $f$ if $f(0)=0$. In a forthcoming paper [7] by Birch, Wirsing and myself, it is shown that there is in fact no function $f$ with these properties. The arguments involve an appeal to the basic result on the linear independence of the logarithms of algebraic numbers, but otherwise the proof runs on classical lines. Our work enables us to treat more generally functions $f$ that take algebraic values and are periodic with any modulus $q$, and we prove thereby

Theorem 6. If $(q,\varphi(q))=1$ and $\chi$ runs through all nonprincipal characters mod $q$ then the $L(1,\chi)$ are linearly independent over the rationals.

Theorem 6 plainly generalizes Dirichlet's famous result on the nonvanishing of $L(1,\chi)$; it does not, however, give a new proof of this result, for the latter is, in fact, utilized in the demonstration. It would be of much interest to know whether the theorem is valid when $(q,\varphi(q))>1$.

Finally, I should like to discuss some possible future avenues of investigation. First, one would like to have a theorem of the nature of Theorem 1 in which $A$ denotes the height of all the $\alpha$'s and not just $\alpha_n$; some work in this direction has been carried out by Ramachandra [17] and his pupil T. N. Shorey, and they have applied their results to certain questions in prime number theory. But, at the moment, the theorems are rather special and one would hope for considerable improvements here.

Secondly, it is almost certain that Theorems 1 and 2 have natural $p$-adic analogues, and these would enable many of the Diophantine results obtained earlier to be strengthened. In particular, they would give an inequality of the form $\|(3/2)^n\|>2^{-\delta n}$, valid for all $n>n_0$, where $n_0$ is effectively computable, $\delta$ is an absolute constant with $0<\delta<1$ and $\|x\|$ denotes the distance of $x$ from the nearest integer.
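The quantity $\|(3/2)^n\|$ can be computed exactly with integer arithmetic, which makes inequalities of the form $\|(3/2)^n\|>c^{\,n}$ easy to explore for small $n$. The following is a minimal sketch; the comparison base $3/4$ is the threshold connected with Waring's problem, and the range $1\le n\le 200$ and the function names are illustrative assumptions. Such a finite check says nothing about the effective constant $n_0$ discussed in the text.

```python
from fractions import Fraction


def dist_three_halves(n):
    """Exact value of ||(3/2)**n||, the distance to the nearest integer."""
    num, den = 3 ** n, 2 ** n
    r = num % den                      # fractional part of (3/2)^n is r / den
    return Fraction(min(r, den - r), den)


def holds(n, base=Fraction(3, 4)):
    """True when ||(3/2)^n|| > base**n, compared exactly as rationals."""
    return dist_three_halves(n) > base ** n


# Exponents in a small illustrative range where the inequality fails.
failures = [n for n in range(1, 201) if not holds(n)]
print(failures)
```

Exact rationals are used instead of floating point because $(3/2)^n$ quickly exceeds the precision of a double, so the fractional part would otherwise be lost.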
If, moreover, the value of $\delta$ were such that $2^{-\delta}>\tfrac34$ then this would settle an outstanding question in connexion with Waring's problem. But, of course, it may be difficult to obtain such a precise value of $\delta$ from the present analysis.

Thirdly, one would like to obtain a value of $\kappa$ in Theorem 3 depending only on $n$ and indeed of the same order of magnitude as the Siegel exponent; this would naturally lead to an effective determination of all the integer points on a curve of arbitrary genus, that is, to a complete solution to the first problem mentioned at the end of [1]. Since the magnitude of $\kappa$ depends on the value of $C$ in Theorem 2, this again reflects on the basic theory of linear forms in logarithms.

And lastly, one would like an extension of Theorem 2 in which $b_1,\ldots,b_{n-1}$ denote arbitrary algebraic numbers and not merely rational integers; this too seems difficult to obtain with our present techniques.

References

1. A. Baker, Effective methods in Diophantine problems, Proc. Sympos. Pure Math., vol. 20, Amer. Math. Soc., Providence, R.I., 1971, pp. 195-205.
2. A. Baker, Effective methods in the theory of numbers, Proc. Internat. Congress Math. (Nice, 1970), vol. 1, Gauthier-Villars, Paris, 1971, pp. 19-26.
3. A. Baker, Imaginary quadratic fields with class number 2, Ann. of Math. (2) 94 (1971), 139-152.
4. A. Baker, On the class number of imaginary quadratic fields, Bull. Amer. Math. Soc. 77 (1971), 678-684.
5. A. Baker, A sharpening of the bounds for linear forms in logarithms, Acta Arith. 21 (1972), 117-129.
6. A. Baker, A sharpening of the bounds for linear forms in logarithms. II, Acta Arith. (to appear).
7. A. Baker, B. J. Birch and E. A. Wirsing, On a problem of Chowla, J. Number Theory (to appear).
8. A. Baker and A. Schinzel, On the least integers represented by the genera of binary quadratic forms, Acta Arith. 18 (1971), 137-144.
9. A. Baker and H. M. Stark, On a fundamental inequality in number theory, Ann. of Math. (2) 94 (1971), 190-199.
10. J. Coates, An application of the division theory of elliptic functions to Diophantine approximation, Invent. Math. 11 (1970), 167-182.
11. J. Coates, The transcendence of linear forms in $\omega_1$, $\omega_2$, $\eta_1$, $\eta_2$, $2\pi i$, Amer. J. Math. 93 (1971), 385-397.
12. N. I. Fel'dman, Effective bounds for the number of solutions of certain Diophantine equations, Mat. Zametki 8 (1970), 361-371. (Russian) MR 42 #7590.
13. N. I. Fel'dman, An effective sharpening of the exponent in Liouville's theorem, Izv. Akad. Nauk SSSR Ser. Mat. 35 (1971), 973-990 = Math. USSR Izv. 5 (1971), 985-1002.
14. D. W. Masser, On the periods of the exponential and elliptic functions, Proc. Cambridge Philos. Soc. (to appear).
15. C. F. Osgood, The simultaneous diophantine approximation of certain kth roots, Proc. Cambridge Philos. Soc. 67 (1970), 75-86. MR 40 #2612.
16. C. F. Osgood, On the simultaneous Diophantine approximation of values of certain algebraic functions, Acta Arith. 19 (1971), 343-386.
17. K. Ramachandra, A note on numbers with a large prime factor. III, Acta Arith. 19 (1971), 49-62.
18. W. M. Schmidt, Approximation to algebraic numbers, Enseignement Math. 17 (1971), 187-253.
19. V. G. Sprindžuk, A new application of p-adic analysis to representations of numbers by binary forms, Izv. Akad. Nauk SSSR Ser. Mat. 34 (1970), 1038-1063 = Math. USSR Izv. 4 (1970), 1043-1069. MR 42 #5910.
20. V. G. Sprindžuk, On rational approximations to algebraic numbers, Izv. Akad. Nauk SSSR Ser. Mat. 35 (1971), 991-1007 = Math. USSR Izv. 5 (1971), 1003-1019.
21. H. M. Stark, A transcendence theorem for class number problems, Ann. of Math. (2) 94 (1971), 153-173.

Trinity College
Cambridge, England
CHARACTER TRANSFORMATION FORMULAE SIMILAR TO THOSE FOR THE DEDEKIND ETA-FUNCTION

BRUCE C. BERNDT

AMS 1970 subject classifications. Primary 10D05; Secondary 10A40.
© 1973, American Mathematical Society

1. Introduction. The classical Dedekind eta-function $\eta(z)$ is defined for $\operatorname{Im}z>0$ by

$$\eta(z)=e^{\pi iz/12}\prod_{n=1}^{\infty}(1-e^{2\pi inz}).$$

If $V(z)=Vz=(az+b)/(cz+d)$ is any modular substitution with $c>0$, $\eta(z)$ satisfies the well-known transformation formula

$$\log\eta(Vz)=\log\eta(z)+\tfrac12\log(cz+d)-\pi i/4+\pi i(a+d)/12c-\pi i\,s(d,c),\tag{1.1}$$

where

$$s(d,c)=\sum_{j \bmod c}((j/c))((jd/c))$$

is the well-known Dedekind sum with

$((x))=x-[x]-\tfrac12$, if $x$ is not an integer,
$((x))=0$, if $x$ is an integer.

(For a proof of (1.1), see, for example, [8] or [2, pp. 167-173].) The most famous and useful property of Dedekind sums is the reciprocity formula: if $c,d>0$ and $(c,d)=1$, then

$$s(c,d)+s(d,c)=-\frac14+\frac1{12}\left(\frac{c}{d}+\frac{d}{c}+\frac{1}{cd}\right).\tag{1.2}$$

For several proofs of (1.2) as well as several references to the literature, see [6].

There are several other functions possessing transformation formulae similar to (1.1). Appearing in these transformation formulae are various generalizations of Dedekind sums involving Bernoulli polynomials, and these generalized Dedekind sums satisfy reciprocity laws similar to (1.2). We have recently shown [4] that all of these transformation formulae can be deduced from one general theorem.

The objective of this paper is to derive transformation formulae for a large class of functions involving primitive characters. Even for the simplest cases, the results appear to be new. Appearing in our transformation formulae are still further generalizations of Dedekind sums. These sums involve characters and generalized Bernoulli functions. We shall show that the sums possess a reciprocity formula as well.

In the sequel, $\chi$ always denotes a primitive character of modulus $k$. The upper half-plane $\{z:\operatorname{Im}z>0\}$ will be denoted by $\mathcal{H}$. As usual, $\sigma=\operatorname{Re}(s)$. We always set $V(z)=Vz=(az+b)/(cz+d)$, where $a$, $b$, $c$ and $d$ are rational integers with $c>0$ and $ad-bc=1$. We shall let $e(z)=e^{2\pi iz}$. We shall use the customary notations, $[x]$ for the greatest integer $\le x$ and $\{x\}$ for the fractional part of $x$. The characteristic function of the integers will be denoted by $\lambda(x)$. Unless otherwise stated, we choose that branch of $\log w$ with $-\pi\le\arg w<\pi$.

2. Preliminary results. The Gauss sum $G(z,\chi)$ is defined by

$$G(z,\chi)=\sum_{h=0}^{k-1}\chi(h)e(hz/k).$$

We put $G(1,\chi)=G(\chi)$. If $n$ is an integer, then [2, p. 312]

$$G(n,\chi)=\bar\chi(n)G(\chi).\tag{2.1}$$

We shall need some facts about Bernoulli functions and their character generalizations. The Bernoulli polynomials $B_n(x)$ are generated by [1, p. 804]

$$\frac{ue^{xu}}{e^{u}-1}=\sum_{n=0}^{\infty}B_n(x)\frac{u^n}{n!}\qquad(|u|<2\pi).\tag{2.2}$$

The Bernoulli functions $\mathscr{B}_n(x)$ are defined by

$$\mathscr{B}_n(x+m)=B_n(x),\tag{2.3}$$

where $0\le x<1$ and $m$ is an arbitrary integer, except in the case $n=1$ and $x=0$, where we define $\mathscr{B}_1(0+m)=0$. The generalized Bernoulli functions $\mathscr{B}_n(x,\chi)$, $n\ge 1$, $-\infty<x<\infty$, are defined as follows [3, Definition 1]: if $\chi$ is even,

$$\mathscr{B}_{2n-1}(x,\chi)=\frac{2(-1)^{n}G(\bar\chi)(2n-1)!}{k(2\pi/k)^{2n-1}}\sum_{j=1}^{\infty}\frac{\chi(j)\sin(2\pi jx/k)}{j^{2n-1}}$$

and

$$\mathscr{B}_{2n}(x,\chi)=\frac{2(-1)^{n-1}G(\bar\chi)(2n)!}{k(2\pi/k)^{2n}}\sum_{j=1}^{\infty}\frac{\chi(j)\cos(2\pi jx/k)}{j^{2n}};$$

if $\chi$ is odd,

$$\mathscr{B}_{2n-1}(x,\chi)=\frac{2(-1)^{n-1}iG(\bar\chi)(2n-1)!}{k(2\pi/k)^{2n-1}}\sum_{j=1}^{\infty}\frac{\chi(j)\cos(2\pi jx/k)}{j^{2n-1}}$$

and

$$\mathscr{B}_{2n}(x,\chi)=\frac{2(-1)^{n-1}iG(\bar\chi)(2n)!}{k(2\pi/k)^{2n}}\sum_{j=1}^{\infty}\frac{\chi(j)\sin(2\pi jx/k)}{j^{2n}}.$$

For $n\ge 1$ and $-\infty<x<\infty$, we have the important property [3, Theorem 3.1],

$$\mathscr{B}_n(x,\chi)=k^{n-1}\sum_{h=1}^{k-1}\bar\chi(h)\mathscr{B}_n\!\left(\frac{x+h}{k}\right).\tag{2.4}$$

In [3, Example 3], we derived the following character analogue of the Lipschitz summation formula: for $a$ real, $\operatorname{Re}z>0$, and $\sigma>1$,

$$\sum_{n=-\infty}^{\infty}\chi(n)e(na/k)(z+in)^{-s}=\frac{\chi(-1)G(\chi)(2\pi/k)^{s}}{\Gamma(s)}\sum_{n+a>0}\bar\chi(n)e^{-2\pi z(n+a)/k}(n+a)^{s-1}.$$

Setting $a=0$, replacing $z$ by $-iz$, and replacing $n$ by $-n$ in the series on the left, we find that for $z\in\mathcal{H}$ and $\sigma>1$,

$$\sum_{n=-\infty}^{\infty}\chi(n)(n+z)^{-s}=\frac{G(\chi)(-2\pi i/k)^{s}}{\Gamma(s)}\sum_{n=1}^{\infty}\bar\chi(n)e(nz/k)n^{s-1}.\tag{2.5}$$
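Property (2.1) of the Gauss sum, which drives the passage to (2.5), is easy to confirm numerically for a small modulus. The sketch below uses the primitive character mod 5 determined by $\chi(2)=i$; this particular character, the test range for $n$, and all function names are illustrative assumptions rather than anything fixed by the text.

```python
import cmath

K = 5  # modulus k (illustrative choice; 5 is prime, so chi below is primitive)

# chi is determined by chi(2) = i, since 2 is a primitive root mod 5:
# 2^0 = 1, 2^1 = 2, 2^2 = 4, 2^3 = 3 (mod 5).
CHI = {0: 0, 1: 1, 2: 1j, 4: -1, 3: -1j}


def chi(n):
    """Value of the character at the integer n."""
    return CHI[n % K]


def e(x):
    """e(x) = exp(2 pi i x), as in the text."""
    return cmath.exp(2j * cmath.pi * x)


def gauss_sum(z):
    """G(z, chi) = sum over h = 0..k-1 of chi(h) e(hz/k)."""
    return sum(chi(h) * e(h * z / K) for h in range(K))


G = gauss_sum(1)

# Largest deviation from G(n, chi) = conj(chi(n)) G(chi) over a range of n.
max_err = max(
    abs(gauss_sum(n) - complex(chi(n)).conjugate() * G) for n in range(-10, 11)
)
print(abs(G), max_err)
```

For a primitive character the modulus of $G(\chi)$ is $\sqrt{k}$, which gives a second sanity check on the table of character values.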
We now discuss the functions whose transformation formulae we shall derive. Let $r_1$ and $r_2$ be arbitrary real numbers. For $\sigma>2$ and $z\in\mathcal{H}$, define

$$G(z,s;\chi;r_1,r_2)=\sideset{}{'}\sum_{m,n=-\infty}^{\infty}\frac{\chi(m)\bar\chi(n)}{((m+r_1)z+n+r_2)^{s}},$$

where the dash $'$ means that the possible pair $m=-r_1$, $n=-r_2$ is omitted from the summation. Extend the definition of $\chi$ to the set of all real numbers by defining $\chi(r)=0$ if $r$ is not an integer. Then write

$$G(z,s;\chi;r_1,r_2)=\chi(-r_1)\sideset{}{'}\sum_{n=-\infty}^{\infty}\bar\chi(n)(n+r_2)^{-s}+\Biggl(\sum_{m<-r_1}\sum_{n=-\infty}^{\infty}+\sum_{m>-r_1}\sum_{n=-\infty}^{\infty}\Biggr)\frac{\chi(m)\bar\chi(n)}{((m+r_1)z+n+r_2)^{s}}=S_1+S_2+S_3,\tag{2.6}$$

say. Firstly,

$$S_1=\chi(-r_1)\Biggl(\sum_{n>-r_2}\bar\chi(n)(n+r_2)^{-s}+\chi(-1)e(s/2)\sum_{n>r_2}\bar\chi(n)(n-r_2)^{-s}\Biggr)=\chi(-r_1)\bigl(L(s,\bar\chi,r_2)+\chi(-1)e(s/2)L(s,\bar\chi,-r_2)\bigr),\tag{2.7}$$

where for $\sigma>1$ and $a$ real,

$$L(s,\chi,a)=\sum_{n>-a}\chi(n)(n+a)^{-s}.$$

$L(s,\chi,a)$ may be written in terms of the Hurwitz zeta-function $\zeta(s,a)$ as follows. Let $n=mk+j+[-a]+1$, $0\le m<\infty$, $0\le j\le k-1$. Then, for $\sigma>1$,

$$L(s,\chi,a)=\sum_{m=0}^{\infty}\sum_{j=0}^{k-1}\chi(j+[-a]+1)(mk+j+\{a\}+\lambda(a))^{-s}=k^{-s}\sum_{j=0}^{k-1}\chi(j+[-a]+1)\,\zeta\!\left(s,\frac{j+\{a\}+\lambda(a)}{k}\right),\tag{2.8}$$

where we have used the fact that $[a]+[-a]=\lambda(a)-1$. Observe that (2.8) provides an analytic continuation of $L(s,\chi,a)$ for the entire complex plane.

Secondly, replacing $n$ by $-n$, we get for $\sigma>2$,

$$S_2=\chi(-1)e(s/2)\sum_{m<-r_1}\chi(m)\sum_{n=-\infty}^{\infty}\bar\chi(n)((-m-r_1)z+n-r_2)^{-s}.$$

Replacing $m$ by $-m$ and applying (2.5), we find that for $\sigma>2$ and $z\in\mathcal{H}$,

$$S_2=\frac{G(\bar\chi)(-2\pi i/k)^{s}}{\Gamma(s)}\,e(s/2)A(z,s;\chi;-r_1,-r_2),\tag{2.9}$$

where

$$A(z,s;\chi;r_1,r_2)=\sum_{m>-r_1}\chi(m)\sum_{n=1}^{\infty}\chi(n)e(n((m+r_1)z+r_2)/k)n^{s-1}.\tag{2.10}$$

Note that $A(z,s;\chi;r_1,r_2)$ can be analytically continued to an entire function of $s$. Similarly,

$$S_3=\frac{G(\bar\chi)(-2\pi i/k)^{s}}{\Gamma(s)}A(z,s;\chi;r_1,r_2).\tag{2.11}$$

Putting (2.7), (2.9) and (2.11) into (2.6), we conclude that for $\sigma>2$ and $z\in\mathcal{H}$,

$$G(z,s;\chi;r_1,r_2)=\frac{G(\bar\chi)(-2\pi i/k)^{s}}{\Gamma(s)}\{A(z,s;\chi;r_1,r_2)+e(s/2)A(z,s;\chi;-r_1,-r_2)\}+\chi(-r_1)\{L(s,\bar\chi,r_2)+\chi(-1)e(s/2)L(s,\bar\chi,-r_2)\}.\tag{2.12}$$

Since $A(z,s;\chi;r_1,r_2)$ and $L(s,\chi,a)$ have analytic continuations to the entire complex $s$-plane, (2.12) yields an analytic continuation for $G(z,s;\chi;r_1,r_2)$ to the entire complex $s$-plane.

Our objective is to prove transformation formulae for the functions $A(z,s;\chi;r_1,r_2)$. It will be simpler, however, to state our results in terms of the functions $G(z,s;\chi;r_1,r_2)$. Also, our proofs will use the functions $G(z,s;\chi;r_1,r_2)$. The method of proof is based on ideas of J. Lewittes [7] and us in [4].

Observe that if we set $s=r_1=r_2=0$ in (2.10), we have the natural character generalization of $\log\eta(z)$. The functions $A(z,s;\chi;0,0)$, where $s\le 0$ is even, have arisen in Grosswald's work [5] on the values of $L$-functions.

We shall need the following simple lemma [7, Lemma 1] in the next section.

Lemma 1. Let $A$, $B$, $C$ and $D$ be real with $A$ and $B$ not both zero and $C>0$. Then, for $z\in\mathcal{H}$,

$$\arg((Az+B)/(Cz+D))=\arg(Az+B)-\arg(Cz+D)+2\pi k,$$

where $k$ is independent of $z\in\mathcal{H}$, and

$k=1$, if $A\le 0$ and $AD-BC>0$,
$k=0$, otherwise.

3. Main Theorem.

Theorem 2. Let $Q=\{z=x+iy: x>-d/c,\ y>0\}$. Define $R_1=ar_1+cr_2$ and $R_2=br_1+dr_2$, where $r_1$ and $r_2$ are arbitrary real numbers. Let $\varrho=\varrho(R_1,R_2,c,d)=\{R_2\}c-\{R_1\}d$. Suppose first that $a\equiv d\equiv 0$ (mod $k$). Then for $z\in Q$ and all $s$,

$$(cz+d)^{-s}\Gamma(s)G(Vz,s;\chi;r_1,r_2)=\bar\chi(b)\chi(c)\Biggl\{\Gamma(s)G(z,s;\bar\chi;R_1,R_2)-2i\Gamma(s)\sin(\pi s)\bar\chi(R_1)L(s,\chi,-R_2)$$
$$\qquad+e(-s/2)\sum_{j=1}^{c}\sum_{\mu=0}^{k-1}\sum_{\nu=0}^{k-1}\bar\chi(c\mu+j+[R_1])\,\chi([R_2+d(j-\{R_1\})/c]-\nu)\,f(z,s;r_1,r_2)\Biggr\},\tag{3.1}$$

where

$$f(z,s;r_1,r_2)=f(z,s;r_1,r_2;j,\mu,\nu)=\int_{C}u^{s-1}\,\frac{\exp(-((c\mu+j-\{R_1\})/ck)(cz+d)ku)}{\exp(-(cz+d)ku)-1}\cdot\frac{\exp(((\nu+\{(jd+\varrho)/c\})/k)ku)}{\exp(ku)-1}\,du.$$

Here, we choose the branch of $u^{s}$ with $0<\arg u<2\pi$. Also, $C$ is a loop beginning at $+\infty$, proceeding in the upper half-plane, encircling the origin in the positive direction so that $u=0$ is the only zero of $(\exp(-(cz+d)ku)-1)(\exp(ku)-1)$ lying "inside" the loop, and then returning to $+\infty$ in the lower half-plane.

Secondly, if $b\equiv c\equiv 0$ (mod $k$), we have for $z\in Q$ and all $s$,

$$(cz+d)^{-s}\Gamma(s)G(Vz,s;\chi;r_1,r_2)=\bar\chi(a)\chi(d)\Biggl\{\Gamma(s)G(z,s;\chi;R_1,R_2)-2i\Gamma(s)\sin(\pi s)\chi(R_1)L(s,\bar\chi,-R_2)$$
$$\qquad+e(-s/2)\sum_{j=1}^{c}\sum_{\mu=0}^{k-1}\sum_{\nu=0}^{k-1}\chi(j+[R_1])\,\bar\chi([R_2+d(j-\{R_1\})/c]+d\mu-\nu)\,f(z,s;r_1,r_2)\Biggr\}.\tag{3.2}$$
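Lemma 1 fixes the bookkeeping of branches that the proof of Theorem 2 relies on, and it can be spot-checked numerically. The sketch below samples random real coefficients and points of the upper half-plane, using the branch $-\pi\le\arg w<\pi$ adopted in the text. The sampling ranges, the seed, and the helper names are illustrative assumptions; the degenerate case $AD-BC=0$ is skipped only because the quotient is then real and rounding makes its computed argument unstable.

```python
import cmath
import math
import random


def arg(w):
    """Argument on the branch -pi <= arg w < pi (cmath.phase uses (-pi, pi])."""
    t = cmath.phase(w)
    return -math.pi if t == math.pi else t


def lemma_k(A, B, C, D):
    """The integer k of Lemma 1."""
    return 1 if (A <= 0 and A * D - B * C > 0) else 0


def worst_discrepancy(trials=2000, seed=0):
    """Largest |lhs - rhs| of the lemma over randomly sampled inputs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        A, B, D = (rng.randint(-5, 5) for _ in range(3))
        C = rng.randint(1, 5)
        if (A == 0 and B == 0) or A * D - B * C == 0:
            continue  # excluded by hypothesis, or numerically degenerate
        x, y = rng.uniform(-3, 3), rng.uniform(0.1, 3)
        num = complex(A * x + B, A * y)   # Az + B
        den = complex(C * x + D, C * y)   # Cz + D
        lhs = arg(num / den)
        rhs = arg(num) - arg(den) + 2 * math.pi * lemma_k(A, B, C, D)
        worst = max(worst, abs(lhs - rhs))
    return worst


print(worst_discrepancy())
```

A discrepancy near $2\pi$ in such a check would indicate a wrong value of $k$, so the test distinguishes the two cases of the lemma sharply.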
CHARACTER TRANSFORMATION FORMULAE 15 Proof. For zeJf and <j>2, G(Vz,s;x;r^r2)= £ x(™)x{*)) ~A f ' m,n=-ao I CZ + d ) where M = ma + nc and N = mb + nd. As the pair m, n ranges over all pairs of integers, except for possibly the pair — rl9 — r2, M and AT range over all pairs of integers except for the possibility — Ru — R2, since ad — bc=\. Thus, G(Vz,s;X;rur2)= £' X{Md-Nc) X(Na-Mb) \[- »-— 2-\ M,N=-ao I CZ+d J = x(b)x{c) f' x(n)x(m)j(m + jR')'7+jR2j ' (^^0 (mod/c)) m,n= - ao (. CZ -\-U J = x(a)x(d) £' x(m)ywi(m + ^l)l7 + R2i ' (^^0 (mod/c)). For the remainder of the proof, we assume that a = d = 0 (mod/c). The proof for b=c = 0 (mod/c) is completely analogous. Using Lemma 1, we find that, for ze Jf and <7>2, (cz + d)~sG(Vz, s; x\ rl9 r2) = *(6)Z(c)(e(-s) I +1' )■ ,f» /((m + R1)2 + n + R2)s /l 0\ d(m + /?i)>c(n + /?2) otherwise = x(b) x(c) {G{z9 s; Z; J^, K2) + (*(-s)- 1) #(z, s; Z; J^, K2)}, where #(z, s; x;/?i, #2) = ((m + R1)z + n + R2)s Replacing m by —m and n by — n and then separating the terms with m = R1, we obtain , x ,(2,S;X;/?1,/?2) = ^/2) I I „ *(")x(m) (3.4) np, „>i?2+a(m-R,)/c ((wi-K,) z + -n -K2)s = e(s/2){x(R1)L(s,x,-R2) + h(z,s;X;R1,R2)},
16 BRUCE C. BERNDT where h(z,s;X;Ri,R2)= Z Z X(m)x(n) >Rr n>R2 + d(m-Rl)/c ((™ — R\) Z + n — R2)S Now, in the double sum above, Re((m — Rx) z + n — R2)>0 if x> —d/c. Using Euler's integral representation of r(s), we find that for zeQ and o>2, r{s)h{z,s;x;Rl9R2) = Z Z zMx(n) m>Ri n>R2+d(m-R1)/c us 1 Qxp( — (m — R1) zu — (n — R2) u) du. Put m = m,c+7 + [R1] + l,0^m,<oo,0^7^c-l, and n = ri + [R2+d(m-Rl)/c] + 1. The above double sum becomes I I E xim'c+j+LRil + Vxin'HRz+dim'c+j-iRj + iycl + l) j = 0 m' = 0 n' = 0 us 1 exp( —(ra'c +j—{Ri} + 1) zw -(n' + C^ + dKc+y-^iJ + lVcl + l-Uj)) d„. Replace ;+l by 7, use the fact that d = 0(mod/c), put ra' = m/c + ^, 0^ra<oo, Og^/c-1, and put n' = n/c + v, 0^n<oo, O^v^/c-1. We then get for zeQ and o- > 2, r(s)A(z,s;Z;R1,R2) Z V Z z(^+J + [*i])z(v + [*2+da-{*i})/c] + l) (3.5) j=l /i = 0 v = 0 • us-1exp(-(c/i+7-{R1})2u-(v + [R2+d(/-{^i})/c] + ^ + l-^2)«) 00 00 • £ £ Qxp( — mkczu — dmku — nku)du m-0n=0
CHARACTER TRANSFORMATION FORMULAE 17 = Z Z Z Z(cAt+J + [Hi])z(v + [/?2+^0'-{*i})/c] + l) j = 1 /* = 0 v = 0 00 ■J t exp(-((c/x+;-{^1})/c/c) (cz + d) Jcu) 1—exp(—(cz + d) /cu) exp((- v -1 +jd/c - {Rt} d/c + R2 - jR2 + <*(/ - {R, })/c]) «) ^ 1 —exp( —/cu) = -11' Vz(cAt +J + [Ki])x([K2 + d(/-{Ki})/<|-v) j= 1 /j = Ov = 0 s-1 exp(-((qi+M*i})M) (cz + d) ku) exp(((v + {(jd + Q)/c})/k)ku) ^ exp ( — (cz + d) /cw) —1 exp(/cw) — 1 0 = -Z z1 *z x(w+[*.])x([^+^(/--{^i})/^]-v)/(2',';ri;r2). Here, in the next to the last step, we have multiplied the numerator and denominator by exp (/cw) and then replaced k — 1 — v by v. In the last step, we have used a classical method of Riemann to convert the integral over (0, 00) to a loop integral [11, pp. 18, 19]. If we now combine (3.3)—(3.5) together, we immediately arrive at (3.1). The result holds for all s by analytic continuation. Our main results can be greatly simplified if s is an integer. For then/ (z, s;rl7 r2) can be easily evaluated by the residue theorem with the aid of (2.2). Thus, if s = — N, where AT is a nonnegative integer, a simple calculation yields Si "''Xl E 4^-(*'Vf,+iw+g)/c}y-<"+'')r'. m+«=N + 2 \ ck J \ k ) mini Upon the evaluation of f(z, -N; rx, r2), (3.1) and (3.2) will then be valid for all ze Jtf by analytic continuation. 4. The character analogue of \ogrj(z). Let r1 =r2 = 0. From (2.10) and (2.12), we find that 00 r(s) G(z, s; x; 0, 0) = G(x) (-2ni/kY(l +e(s/2)) Z X(mn) e(mnz/k) ns~l m,n= 1 00 = G(x) (-2ni/kf(\ +e(s/2)) Z zW °s-1 (r) e(rz/k).
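The passage from the double sum over $m, n$ to the single sum over $r$ in the last display only uses the complete multiplicativity of $\chi$: collecting the terms with $mn = r$ gives $\chi(r)\,\sigma_{s-1}(r)\,e(rz/k)$, where $\sigma_{s-1}(r) = \sum_{n \mid r} n^{s-1}$. The following is a minimal numerical sketch of that regrouping, not part of the paper. It assumes, purely for illustration, the real primitive character modulo 5 (the Legendre symbol) and a truncation of both sums to $mn \le R$; the names `chi`, `double_sum` and `divisor_sum` are hypothetical.

```python
import cmath

K = 5  # illustrative modulus (an assumption; the text's k is arbitrary)


def chi(n: int) -> int:
    """Legendre symbol mod 5: a real primitive character, completely multiplicative."""
    r = n % K
    if r == 0:
        return 0
    return 1 if r in (1, 4) else -1


def e(w: complex) -> complex:
    """e(w) = exp(2*pi*i*w), the notation used in the text."""
    return cmath.exp(2j * cmath.pi * w)


def sigma(r: int, a: complex) -> complex:
    """sigma_a(r): sum of n**a over the positive divisors n of r."""
    return sum(n ** a for n in range(1, r + 1) if r % n == 0)


def double_sum(z: complex, s: complex, R: int) -> complex:
    """Sum of chi(mn) e(mnz/k) n^(s-1) over all pairs m, n >= 1 with mn <= R."""
    total = 0
    for m in range(1, R + 1):
        for n in range(1, R // m + 1):
            total += chi(m * n) * e(m * n * z / K) * n ** (s - 1)
    return total


def divisor_sum(z: complex, s: complex, R: int) -> complex:
    """Sum of chi(r) sigma_{s-1}(r) e(rz/k) over 1 <= r <= R."""
    return sum(chi(r) * sigma(r, s - 1) * e(r * z / K) for r in range(1, R + 1))


if __name__ == "__main__":
    z = complex(0.3, 0.7)  # any point of the upper half-plane
    print(abs(double_sum(z, 0, 60) - divisor_sum(z, 0, 60)))
```

Because both truncations run over the same set of products $mn \le R$, the two functions agree up to rounding for every $R$; at $s = 0$ the inner weight is $\sigma_{-1}(r)$, the case used in the rest of Section 4.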
18 BRUCE C. BERNDT In particular, if s = 0, 00 limr(s)G(z,s;X;0,0) = 2G(x) £ Z(r) <x_, (r) e(rz/*) = 2G(x) A(z, X), s-0 r=l say. Thus, for s = r1=r2=0, (3.1) yields in the case a=d=0 (mod/c), (4.1) +1 t V 'l xfo*+/) z(D^/c]-v)/(2, 0; 0, 0)1, j= 1 p = 0 v = 0 J and (3.2) yields in the case b = c = 0 (mod/c), G(jf) >l(Kz, X) = l(a) X(d) |G(jf) A(z, Z) (4'2) +i I Y V X(j) xiWc}+dn-v)f(z, 0; 0, 0)|>. j = 1 /* = 0 v = 0 By (3.6), We must now evaluate the triple sums in (4.1) and (4.2). First examine the sum in (4.1). By summing on v first, we see that the contribution of (cz + d) B2((cii+j)/ck) is zero. By summing on fi first, we see that the contribution of B2((v + {jd/c})/k)/(cz + d) is zero, since cn+j runs through a complete residue system (mod/c) as \i does, for (c, k)= 1. Next observe that the triple sum is unchanged if we replace Bx ((v + {jd/c})/k) by 0tx ((v + {jd/c})/k), since d = 0 (mod /c). By (2.4), |x(D*]-v)*,(^)=Yx(-v)«,(^*)=,(-.)^0*.i). Thus, so far, we have shown that the last expression in curly brackets on the right side of (4.1) is
CHARACTER TRANSFORMATION FORMULAE 19 (4.3) nix(-l) t @,(jd/c,l) £ xicfi+fiB, Next, observe that the sum above is unchanged if B1 ((cfi+j)/ck) is replaced by ^i((c^+;)M)- Put cfi+j = n, where l^n^ck. Since d = 0(modk) and ^(x, x) has period fc, (4.3) becomes (4.4) nix(-l) I *!<#, X) X(n) %i(n/ck). n= 1 Definition. Let (c, d) = 1 with c>0. The Dedekind character sum s(d, c; %) is defined by n mod ck Using the above definition and recalling that (4.4) is the value of the last expression on the right side of (4.1), we find that (4.1) becomes G(x) A(Vz, X) = x(b) X(c) {G(x) A(z, x) + ™x(- 1) s{d, c; x)}. Secondly, we examine the triple sum on the right side of (4.2). By summing on v first, we observe that the contribution of the second expression in/(z, 0; 0, 0) is zero. Next, we may replace B1((v-\-{jd/c})/k) by &i{{v + {jd/c})/k). Using (2.4), we find that =x(-l)#i(dit+jd/c,x)- Thus, the last expression in curly brackets on the right side of (4.2) is *i*(-1) I X(J) V *i (CJ~) <*x (d^+jd/c, x)=uiX(-1) s(d, c; x), where we have replaced B^cii+fi/ck) by &i((cfi+j)/ck), set n = cfi+j, l^n^ck, and used the fact that c = 0 (mod k). Hence, using the above calculation, we find that (4.2) becomes G(x) A{Vz, X) = i(a) X(d) {G(x) A(z, X) + niX(-1) s(d, c; Z)}. In summary, we have shown m
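Theorem 3. Let $z \in \mathcal{H}$. If $a \equiv d \equiv 0 \pmod{k}$, then
$$G(\bar\chi)\,A(Vz, \chi) = \bar\chi(b)\,\chi(c)\,\{G(\chi)\,A(z, \bar\chi) + \pi i\,\chi(-1)\,s(d, c; \bar\chi)\}; \tag{4.5}$$
if $b \equiv c \equiv 0 \pmod{k}$, then
$$G(\bar\chi)\,A(Vz, \chi) = \bar\chi(a)\,\chi(d)\,\{G(\bar\chi)\,A(z, \chi) + \pi i\,\chi(-1)\,s(d, c; \chi)\}. \tag{4.6}$$

We will next prove a reciprocity formula for $s(d, c; \chi)$ analogous to the reciprocity formula for $s(d, c)$.

Theorem 4. Let $c, d > 0$, $(c, d) = 1$, and either $c$ or $d \equiv 0 \pmod{k}$. Then
$$s(c, d; \chi) + s(d, c; \bar\chi) = B_1(\chi)\,B_1(\bar\chi). \tag{4.7}$$

Proof. By symmetry, we may without loss of generality assume that $d \equiv 0 \pmod{k}$. Let $V^*(z) = V^*z = (bz - a)/(dz - c)$ and $T(z) = Tz = -1/z$, so that $V^*z = V(Tz)$. If we replace $z$ by $Tz$ in (4.5), we obtain
$$G(\bar\chi)\,A(V^*z, \chi) = \bar\chi(b)\,\chi(c)\,\{G(\chi)\,A(Tz, \bar\chi) + \pi i\,\chi(-1)\,s(d, c; \bar\chi)\}. \tag{4.8}$$
Next, apply (4.6) with $V$ replaced by $V^*$. We get
$$G(\bar\chi)\,A(V^*z, \chi) = \bar\chi(b)\,\chi(-c)\,\{G(\bar\chi)\,A(z, \chi) + \pi i\,\chi(-1)\,s(-c, d; \chi)\}. \tag{4.9}$$
Lastly, apply (4.5) with $V$ replaced by $T$ and $\chi$ replaced by $\bar\chi$ to obtain
$$G(\chi)\,A(Tz, \bar\chi) = \chi(-1)\,\{G(\bar\chi)\,A(z, \chi) + \pi i\,\chi(-1)\,s(0, 1; \chi)\}. \tag{4.10}$$
Multiplying both sides of (4.10) by $\bar\chi(b)\,\chi(c)$ and combining the result with (4.8) and (4.9), we arrive at
$$\pi i\,\bar\chi(b)\,\chi(c)\,\{\chi(-1)\,s(d, c; \bar\chi) - s(-c, d; \chi) + s(0, 1; \chi)\} = 0. \tag{4.11}$$
Since $(b, k) = (c, k) = 1$, $\bar\chi(b)\,\chi(c) \ne 0$. From the definition of $\mathscr{B}_1(x, \chi)$, we see that
$$\mathscr{B}_1(-x, \chi) = -\chi(-1)\,\mathscr{B}_1(x, \chi). \tag{4.12}$$
It follows that $s(-c, d; \chi) = -\chi(-1)\,s(c, d; \chi)$. Lastly,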
CHARACTER TRANSFORMATION FORMULAE 21 (4.13) s(0, 1;Z)= I x(«)Bi(x)*i(«/*) = *!(X)*i00, n= 1 upon the use of (2.4). Thus, (4.11) reduces to Z(-l)s(d,c;x) + z(-l)s(c,d;^) + 51WB1(x) = 0. This is equivalent to (4.7) since — x(- 1) Bx (x) Bx (x) = Bl(x) #i (£), for if x is even, Since rj(z) possesses an infinite product representation, it is natural to ask whether A (z, x) can be represented as the logarithm of an infinite product. Theorem 5. Let zeJ4f. Then there exists an integer r, independent of z, such that fc-l oo G(x) A(z, *)= -log f[ n (l-e({j+mz)/k)Y<»*™+2mr. j=0 m=l Proof. By (2.1), we have, for zejf, 00 G(x) A(z,x)= Z XM G(w, x) e(mnz/k)/n m, n— 1 k— 1 oo oo = £ *(/) Z xM Z e(n(j + mz)/k)/n j=0 m = 1 n = 1 k-1 oo = ~Z X(J) Z zNlog(l-«((/ +wz)/fc)) j = 0 m=l = -log n fi (l-e((/+m2)A))'0',x<m) + 27:I>(z)) j=0 m= 1 where r(z) is an integer. Since A(z, x) and the logarithm of the product above are both analytic on Jf, r(z) is analytic on Jf and hence a constant r on Jf. 5. A second example. Let s = 09 but suppose that r1 and r2 are arbitrary. Firstly, suppose that a = d = 0(mod/c). Proceeding as in the preceding section, we find that the triple sum in curly brackets on the right side of (3.1) is, with the help of (3.6), 2*i Z V I1z(^+j + [ll1])Z([ll2 + d(/-{*i})/c]-v) (5 J) j=i p = o v=o _ /cn+j-jR^ fv + {(jd + Q)/cf
22 BRUCE C. BERNDT We next want to replace ^((c/i+_/-{/?!})/<:&) by 3Bi{(cfi-\-j-{Ri\)lck). This is valid except when Rx is an integer, \i=k — 1, andy'=c, for then we obtain Bx{\)=\ and ^,(1) = 0, respectively. Replacing Bl({cii-¥j-{Rl})lck) by ^i i(c^+j— {^i})/cA:) in the triple sum above, we are led to the "extra" expression t- ^ -, i'v + IR?'. k = *ii(Ri) *I z([*2]-v) «! (^M)-^^,) Z(K2) = «iz(-*i) l] Z(v) ^ (^)-i^(^i) x(*2) = ^(-/?1) £,(K2, x)-i7iix(«i) x(K2), upon the use of (2.4). Thus, (5.1) becomes 2w£ I I X(cn +JHRil)z([^2+d(/-{*i})/c]-v) j = 1 /j = Ov = 0 (5.2) .# ftW-{*i}\Bi fv + {(/<* + rf/c}N ck J \ k + nix(-R1)al{R2,fl-±Kix{R1)x(R2). Next, we wish to replace #i((v+ {(/</+g)/c})/fc) by #i((v + {(/rf+e)/c})/fc) in (5.2). This is valid except in the case when (jd + g)/c is an integer and v = 0, i.e., except when q is an integer and v = 0. If q is an integer, let/ be the unique integer such that 1^/^c and j'd + g = 0 (mod c). Using the definitions of #,/^ and R2 and using the definition {x} = x — [x], we obtain after a short calculation, O'^ + d/c = (/<*-*■! + [«i] d-[R2-] c)/c. From the above, we see that g is an integer if and only if rx is as well. Since also ad — bc=l9 the above then becomes (/d + c)/c = ((/ + [cr2])i)/c-[dr2]. Since (fd + g)/c is an integer and (c, d)= 1, (5.3) c|(/ + [cr2]). Since d = 0 (mod fc), we conclude that (fd-\-g)/c = - [dr2] (mod /c), and so,
CHARACTER TRANSFORMATION FORMULAE 23 (5.4) d(f - {Rx })/c + R2 = (fd + q)Ic + [K2] = ftr x (mod /c). Replacing Bi((v + {0'd + e)/c})/fc) by ^i((v + {(jd + £)/c})//c), we obtain an "extra" expression = -TTiAto) z^rO V z(cai+/ + [ct2]) «x (^+^ ^'^ by (5.4) and the fact that a = 0 (mod/c). Using the definition of R^ and (5.3), we can write the above as *"' 7 Z + M^ /^ + (/'+[>2])/c-r2 _ XIM + ■TrateJxtferJxtc) X x(a*+" ~ /~il £ = -7rU(e)z(6ri)z(c)*1(-r2,z), by (2.4). Thus, (5.2) becomes 2** t kZkI^(^+;+[^i])x([^2+^-{^i})/c]-v) j = 1 // = 0 v = 0 (5.5) .^i^+^-^.A^ ^v + {(A*+ffyc} c* / \ it + nix(-R1) »x (R2, X)-biix(Ri) X(*2) -JtUfe) Z(fcr,) *(c) ^(-i* %)■ We next simplify the triple sum of (5.5). Using (2.4) and then letting cn+j=n, 1 ^ « g c/c, we find that the triple sum of (5.5) becomes w-.ij;^wtM«,(^).,(^iw) (5.6) =2niX(-l) ntn + ^p.^Jj,^ ^ '+*2,xj- Definition. Let (c, d) = 1 with c>0. Suppose that x and j> are arbitrary real numbers. The generalized Dedekind character sum s(d, c; x\ x, y) is defined by (5.7) s(d,c;X;x,y)= £ x{n)&Ad{n + y)/c + x,X)a1{{n + y)/ck). n mod ck Observe that s(d, c; %; 0, 0) = s(d, c; %). Also, s(d, c; %; x, y) is the natural
24 BRUCE C. BERNDT character generalization of the generalized Dedekind sum s(d, c; x, y) (e.g. see [6] or [4, §4]). Using (5.7) in (5.6), we see that we may now write (5.5) as (5 8) 2nix(~ l) S(d' C; *; *2' -Ki) + 7^(-Ki) #i(«2, X) -±™z(*i) x(R2)-na(Q) x(brx) i(c) #i (-r2, *). We next calculate L(0, #, a). From (2.8) and the formula [12, p. 267], C(0,a)=-B1(a) (0<a£l), we find that, for real a, L(0, z, a)= -"j: z(/-|>] + *(«)) *i ((/ + {*}+*(*))/*) j = o k-1 l5'9) =-I Z(7'-[a] + A(a))*1(a+{a}+A(a))/fc)-z(-o)B1(l) = -^i(a,3f)-iz(-fl), by (2.4). Using (5.8) and (5.9) in (3.1), we find that, for zejff and a=d=0 (modk), lim r(s) {(cz+d)-s G(Vz, s; X; r„ r2)-jf(fc) x(c) G(z, s; jj; R» R2)} =X(b) X(c) {InixiR,) aA-R2, x) + *ii{Ri) X(R2) + 2mx(-l)s(d,c;x;R2, -*i) + >«z(-*i)*i(*2,x) (5.10) -fax{Ri) X(Rz)-iM{Q) X{br,) jf(c) *,(-r2, *)} = *(&) *(c) ni{X(Ri) @A-R2, x)+til(Ri) X(R2) -Hc)x(brl)x(c)a1(-r2,x) + 2x(-l)s(d,c;x;R2,-Ri)}, where we used (4.12). Secondly, suppose that b = c = 0 (mod k). Proceeding as in the previous section, we find that the triple sum in curly brackets on the right side of (3.2) becomes, with the help of (3.6), c k — 1 k— 1 2*i £ X ZxVHRiDx(LR2+d(J-{Ri})/c]+dii-v) j=l n=0v=0 (5.H) B(cji+j-{R^\BJv + {(jd+Q)lc}
CHARACTER TRANSFORMATION FORMULAE 25 As in the previous case, the replacing of B1((cfi-\-j—{R1})/ck) by @i((cix+j — {R^lck) results in the "extra" expression ni)C(Ri) V Z([K2]-v) £i(-i~) = niX(-Ri)@AR2,X)-inix(Ri)x(R2). Thus, (5.11) may be written as 2ni t X I xU + [*i])!tp2 + ^-{Xi})/c] + *-v) j=l /* = 0 v = 0 <512) ^ (cH+)-{Rx}\BJv + {{Jd + Q)lc ck J L\ k + niX(-R1) 0,(K2, x)-i*ix(Ri) x(R2)- We next replace Bi{{v + {(jd + Q)/c})/k) by £1((v + {(jtf + e)/c})/fc).Let / be defined as in the case a=d=0 (mod/c). We obtain, as before, an "extra" expression (5.13) -mm x(fH*J> % f (^M-^)*1 f^" <*->yc). From our calculations in the previous case of a = d = 0 (mod/c), we see that (fd + g)/c + [K2] = (/ + [cr2]) d/c + ferx = (/ + [cr2]) d/c (mod fc). From (5.3), / = — [cr2] (mod /c), and so Z + Msan (mod/c). Using the above two congruences and (5.3), we see that we may write (5.13) as ■u w * -^ V -/"Z + C^] , \ „ ^+(/-{R1})/c\ -7td(e)x(a'-1)x(^) Z Xl +A*J*i( 1 I (5-14) = -siAfo) Z(ar,) *(<*) "£ Z(A*) *i (^) = -«iA(c) x(ari) l(d) #i(-r2, *), by (2.4). Using (5.14), we find that (5.12) becomes
26 BRUCE C. BERNDT 2*'"i I IxVHRiDxdRi+dU-iRiW+dii-v) •^(Cfl+^{R,})^(v+{(^+g)/c}) + jriZ(-K,) «,(*2, x)-2™'x(*i) *(*2) -«iA(<?)z(flr,)jf(d)«i(-r2,z) = 2*iz(-l) s(d, c; Z; K2, -Ki) + k»x(-*i) #i(K2, x) where we have used the same argument as before in obtaining (5.6). Substituting the above into the right side of (3.2) and using (5.9) and (4.12), we conclude that Urn r(s) {(cz + dy* G(Vz, s; X; ru r2)-X(a) %{d) G(z, s; Z; Ru R2)} (5.15) =x(a) X(d) ni{x(Ri) »i(-R2, x)+fc(*i) zW -X(q) xfa) x(d) ®x (-r2, Z) + 2Z(-1) s(d, c; X; R2, -*i)}. We summarize our results (5.10) and (5.15) in Theorem 6. LetzeJf. Ifa=d=0(modk), Jim r(s) {(cz + d)-° G(Vz, s; X; ru r2)-jt(b) x(c) G(z, s; *; Ru R2)} =x(b)x(c)ni{x(R1)^1(-R2,x)+mRi)x(R2) -HQ) X(brx) x(c) £i (-r2, *) + 2X(- 1) s(d, c; x; R2, -R,)}; ifb = c=0(modk), l™ *» {(" + rf)-s G(Vz, s; X; ru r2)-x(a) X(d) G(z, s; X; R,, R2)} = X(a)x(d)ni{x(R1)@i(-R2,x) + 2-X(Ri)x(R2) -Afe) zK) *(<Q <*i(-»2. Z)+2z(-1) s(d, c; Z; *2, -Hi)}- To express the results of Theorem 6 in terms of the functions A(z,s;x;ru r2), some additional computation is required. The computations are similar to those in [4, §4]. We now derive a reciprocity law for s(d, c; x', x, y) similar to the reciprocity lawfors(d, c;x, y)[4, §5]. Theorem 7. Let c, d>0, (c, d} = 1, and either c or d=0 (modfc). Let x andy
CHARACTER TRANSFORMATION FORMULAE 27 be arbitrary real numbers. Then, (5.16) s(c, d\ x; x, y) + s(d, c; x\ y, *) = -iz(x) x(y) + #i (*, z) *i U z)« Proof. Without loss of generality, assume that d=0 (mod/c). Let V* and 7" be as in the proof of Theorem 4. Replacing z by 7z in (5.10), we get lim r(s) {(dz-c)~s zsG(V*z, s; X\ rl9 r2)-X(b) *(c) G{Tz, s; r, «i, *2)} s-+0 (5.17) =jf(&)z(c)»ri{z(*i)*i(-*2.Z)+iz(*i)z(i?2) -A(e)z(6r1)J£(c)«1(-r2,Z) + 2Z(-l)s(d,c;z;ll2.-*i)}- Apply (5.15) to the transformation F* and note that R^ is replaced by R2 and R2by -/I,. Thus, lim r(s) {(dz-c)-' G(V*z, s;X; rl3 r2)~x(b) Z(-c) G(z, s; X; R2, -Rj} (5.18) =z(6)z(-c)»ri{z(*2)«i(^i.z)+k^2)z(-i?i) -A(c)z(fc'-i)z(-c)«1(-r2,z) + 2z(-l)s(-c,d;z; -*i, -*2)}- Lastly, apply (5.10) with F replaced by T, rt and r2 replaced by Rt and J?2, respectively, and z replaced by /. Observe that Ri and R2 are replaced by R2 and —Rit respectively. Hence, lim {z~*G(Tz, s; z; K„ R2)-x(-1) G(z, «; Z5 «2, -*i)} s->0 (5.19) =z("l)'rf{z(i?2)*i(*i.Z)+iz(*2)z(-*i) -z(-«i)^i(-H2,z)+2z(-i)s(0,i;z;-Ki,-JR2)}- Multiply (5.19) by z(fr) z(c) and combine the resulting equation with (5.17) and (5.18) to obtain after considerable cancellation (5 20) 2x(_ l) s(d'c; *; R2>-RJ-2s(-c> d> x; -Ru-Rz) + 2s(0, 1; z; -*„ -H2)+k(*2) z(*i)=0. From (5.7) and (2.4), s(0,l;z;-*!,-*,)=£ z(«)^i(-«i,z)^i(^r^) =^1(-R1,z)^,(-i?2,z)- By replacing n by — n in the definition of s( — c, d\ %; — R1? —R2) and using the
28 BRUCE C. BERNDT oddness of 88x (x), we easily deduce that s{-c, d; x; -Ri, -R2)= ~X{-1) s(c, d; x\ -Rl9 R2). Hence, (5.20) can be written as (5 21) 5(d'c; *; R2>-RJ+S(C> d'i xi-Ri, R2) = -iz(-*i)z(*2)-z(-i)*i(-*i,z)*i(-*2,z). If we apply (4.12) and let x = -Rx and y = R2, (5.21) reduces to (5.16). Observe that when x = y = 0, (5.16) reduces to (4.7). 6. An alternative method for proving the main theorem. A very short, elegant proof, using contour integration, of the transformation formula for rj(Vz) when Vz=Tz= — \/z has been given by C. L. Siegel [10]. The method was extended by Rademacher [8] who proved the transformation formula for any modular substitution. (See also [2, pp. 155-158, 167-173].) Another extension of Siegel's idea has been made by Schoeneberg [9] who proved a transformation formula under T for generalized Dedekind eta-functions. Using an idea which Schoeneberg attributes to E. Hecke, Schoeneberg then derived the general result using essentially only the result for T. These ideas of Siegel and Hecke can be extended to yield another proof of Theorem 2, when 5 is a nonpositive integer. We shall illustrate this extension by giving the proof for the transformation Tz only, in the special case s = r1 =r2 = 0 and a = d = 0 (mod/c). From (4.5) and (4.13), we then shall show that (6.1) G(x) A(Tz, X) = X(-1) G(X) A(z, D + mB^) B^x). Letting m = rk+j, 0^r<oo, O^j^k— 1, in the definition (2.10) of A(z, x) and then using (2.1), we see that (6.1) may be written as /*o\ rt-\ V x{n)G{-n/z,x) * j(n) G(nz, x) , ... (62) G(X) L n{l-e{-n,z))=*{-l) G{x) £ ,(1^))^^ *l W" We shall show (6.2) for z = iy, y>0, for then the general result will hold by analytic continuation. Let n be a fixed positive integer and put N = n + j. Define G(wN,x)G(-wN/z,X) *n(w, z, x) = - G(x) G(x) w(l-e(wN)) (l-e(-wN/z))
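The step "letting $m = rk + j$" that turns (6.1) into (6.2) is a geometric-series regrouping: for $\operatorname{Im} w > 0$ one has $\sum_{m \ge 1} \chi(m)\,e(mw/k) = G(w, \chi)/(1 - e(w))$, because $\chi(rk + j) = \chi(j)$ and $e((rk + j)w/k) = e(rw)\,e(jw/k)$. The sketch below checks this numerically. It is only an illustration under stated assumptions: the character is taken to be the Legendre symbol modulo 5, $G(w, \chi)$ is read as $\sum_{j=0}^{k-1} \chi(j)\,e(jw/k)$ for complex $w$, and the function names are hypothetical.

```python
import cmath

K = 5  # illustrative modulus (an assumption; the text's k is arbitrary)


def chi(n: int) -> int:
    """Legendre symbol mod 5, used here as a sample primitive character."""
    r = n % K
    if r == 0:
        return 0
    return 1 if r in (1, 4) else -1


def e(w: complex) -> complex:
    """e(w) = exp(2*pi*i*w)."""
    return cmath.exp(2j * cmath.pi * w)


def gauss_sum(w: complex) -> complex:
    """G(w, chi) = sum over j = 0..k-1 of chi(j) e(jw/k), with w complex."""
    return sum(chi(j) * e(j * w / K) for j in range(K))


def character_series(w: complex, blocks: int = 60) -> complex:
    """Partial sum of sum_{m >= 1} chi(m) e(mw/k), taken over m < k * blocks."""
    return sum(chi(m) * e(m * w / K) for m in range(1, K * blocks))


def closed_form(w: complex) -> complex:
    """Right-hand side G(w, chi) / (1 - e(w)) of the regrouped series."""
    return gauss_sum(w) / (1 - e(w))


if __name__ == "__main__":
    w = complex(0.2, 0.5)  # any point with positive imaginary part
    print(abs(character_series(w) - closed_form(w)))
```

With $w = nz$ (or $w = -n/z$) this is the inner sum over $m$ in the series for $A$, which is why the terms of (6.2) carry the denominators $1 - e(nz)$ and $1 - e(-n/z)$.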
Let $C$ be the rhombus with vertices $w = \pm 1$ and $w = \pm z$. In the interior of $C$, $F_N(w, z, \chi)$ has simple poles at $w = \pm j/N$ and $w = \pm jz/N$, $1 \le j \le n$, and a simple pole at $w = 0$ when $\chi$ is odd. The residue of $F_N(w, z, \chi)$ at $w = j/N$ is
$$\frac{G(j, \bar\chi)\,G(-j/z, \chi)}{G(\chi)\,G(\bar\chi)\,(j/N)\,(-2\pi i N)\,(1 - e(-j/z))} = -\frac{\chi(j)\,G(-j/z, \chi)}{2\pi i\,j\,G(\chi)\,(1 - e(-j/z))},$$
by (2.1). By replacing $h$ by $k - h$ in the definition of the Gauss sum, a short calculation shows that the residue at $w = -j/N$ is equal to the residue at $w = j/N$. The residue at $w = jz/N$ is
$$\frac{\bar\chi(-j)\,G(jz, \bar\chi)}{2\pi i\,j\,G(\bar\chi)\,(1 - e(jz))}.$$
As above, the residues at $w = \pm jz/N$ are equal for each $j$. The residue at $w = 0$ is
$$\frac{B_1(\chi)\,B_1(\bar\chi)}{G(\chi)\,G(\bar\chi)}.$$
The easiest way to calculate the residue at $w = 0$ is to use the generating function for generalized Bernoulli numbers [4, Example 6]:
$$\sum_{j=0}^{k-1} \frac{\bar\chi(j)\,t\,e^{jt}}{e^{kt} - 1} = \sum_{n=0}^{\infty} B_n(\chi)\,\frac{t^n}{n!} \qquad (|t| < 2\pi/k).$$
Thus, by the residue theorem,
$$\frac{1}{2\pi i}\int_C F_N(w, z, \chi)\,dw = -2\sum_{j=1}^{n} \frac{\chi(j)\,G(-j/z, \chi)}{2\pi i\,j\,G(\chi)\,(1 - e(-j/z))} + 2\sum_{j=1}^{n} \frac{\bar\chi(-j)\,G(jz, \bar\chi)}{2\pi i\,j\,G(\bar\chi)\,(1 - e(jz))} + \frac{B_1(\chi)\,B_1(\bar\chi)}{G(\chi)\,G(\bar\chi)}. \tag{6.3}$$
If we can show that the left side of (6.3) tends to 0 as $N$ tends to $\infty$, we are done, for then letting $N$ tend to $\infty$ in (6.3) gives (6.2). Let $w = u + iy(1 - u)$, $0 \le u \le 1$, denote the line segment from $w = iy$ to $w = 1$. Then for $w$ on this line segment,
$$|F_N(w, z, \chi)| \le \frac{\sum_{h=1}^{k-1} \exp(-2\pi h y (1 - u) N/k)\,\sum_{h=1}^{k-1} \exp(-2\pi h u N/(yk))}{|G(\chi)\,G(\bar\chi)\,w\,(1 - e(wN))\,(1 - e(-wN/z))|},$$
which tends to 0 as $N$ tends to $\infty$, uniformly for $0 \le u \le 1$. The same conclusions hold for the other three sides of the rhombus by similar arguments. Thus, the integral on the left side of (6.3) tends to 0 as $N$ tends to $\infty$. By our remarks above, this completes the proof of (6.2).

References

1. M. Abramowitz and I. A. Stegun (Editors), Handbook of mathematical functions, with formulas, graphs and mathematical tables, 3rd ed., Nat. Bur. Standards Appl. Math. Series, 55, Superintendent of Documents, U.S. Government Printing Office, Washington, D.C., 1965. MR 31 #1400.
2. Raymond Ayoub, An introduction to the analytic theory of numbers, Math. Surveys, no. 10, Amer. Math. Soc., Providence, R.I., 1963. MR 28 #3954.
3. Bruce C. Berndt, Character analogues of the Poisson and Euler-Maclaurin summation formulas with applications, J. Number Theory (to appear).
4. Bruce C. Berndt, Generalized Dedekind eta-functions and generalized Dedekind sums, Trans. Amer. Math. Soc. (to appear).
5. Emil Grosswald, Remarks concerning the values of the Riemann zeta function at integral, odd arguments, J. Number Theory 4 (1972), 225-235.
6. Emil Grosswald and Hans Rademacher, Dedekind sums, Carus Monograph, no. 16, Math. Assoc. Amer., Washington, D.C., 1972.
7. Joseph Lewittes, Analytic continuation of Eisenstein series, Trans. Amer. Math. Soc. 171 (1972), 469-490.
8. Hans Rademacher, On the transformation of $\log \eta(\tau)$, J. Indian Math. Soc. 19 (1955), 25-30. MR 17, 15.
9. B. Schoeneberg, Zur Theorie der verallgemeinerten Dedekindschen Modulfunktionen, Nachr. Akad. Wiss. Göttingen Math.-Phys. Kl. II, 1969, 1-10.
10. C. L. Siegel, A simple proof of $\eta(-1/\tau) = \eta(\tau)\sqrt{\tau/i}$, Mathematika 1 (1954), 4. MR 16, 16.
11. E. C. Titchmarsh, The theory of the Riemann zeta-function, Clarendon Press, Oxford, 1951. MR 13, 741.
12. E. T. Whittaker and G. N. Watson, A course of modern analysis. An introduction to the general theory of infinite processes and of analytic functions; with an account of the principal transcendental functions, 4th ed., Cambridge Univ. Press, New York, 1962. MR 31 #2375.

University of Illinois at Urbana-Champaign
ON LARGE SIEVE TYPE ESTIMATES FOR THE DIRICHLET SERIES OPERATOR1 M. FORTI AND C. VIOLA Presented by Enrico Bombieri 0. Introduction. The large sieve inequalities (0.1) I I* E anX{n)\ Sc(N,Q) X \af, where * denotes summation over all primitive Dirichlet characters, were first considered by Linnik [14] then developed by Renyi [18] and Roth [19] and substantially improved by Bombieri [1]. Refinements in the estimates of c(N, Q) were given by Davenport and Halberstam [6], Bombieri and Davenport [4] and [5], and Gallagher [9]. Recently Montgomery [15] has stated the following result of Davenport: (0.2) I <*•»-"' ^[T + 0(N logjV)](J+logJV) X |fl.l2 (for real tr, <5 = minr*s|£r — ts|, T = max tr — min ts) and using a method introduced by Halasz [11] has successfully combined (0.1) and (0.2) to obtain <1 ^ Q X mod q r = 1 £ anl{n)n *»" -JL) 2 ( loe2AT\ N ^(e2T + N)(l+-4—)log4N X K\2n'2a where sx%r = GXtr + itXt„ G = mmxrGxr, T = maxxrtxr — mmXtrtx r+ 1, 3 = minx,r*skx,r-'x,sl- A MS 1970 subject classifications. Primary 10H30. 1 Research supported by Consiglio Nazionale delle Ricerche. 31 (£) 1973, American Mathematical Society
Gallagher [10] proved a continuous analogue of Montgomery's result, namely
$$\sum_{q \le Q}\ \sideset{}{^*}\sum_{\chi \bmod q}\ \int_{-T}^{T} \Big|\sum_{n} a_n \chi(n)\,n^{it}\Big|^2 dt \ll \sum_{n} (Q^2 T + n)\,|a_n|^2.$$
Bombieri [3] has remarked that the large sieve inequality
$$\sum_{r=1}^{R} \Big|\sum_{n=M+1}^{M+N} a_n \exp(2\pi i n x_r)\Big|^2 \le \Big(N + \frac{2}{\delta}\Big) \sum_{n=M+1}^{M+N} |a_n|^2,$$
where $\delta = \min_{r \ne s} \|x_r - x_s\|$ and $\|x\| = \min_n |x - n|$, can easily be deduced from Bessel's inequality in the Hilbert space $l^2$ of complex number sequences $(a_n)_{-\infty}^{\infty}$ such that $\sum_{-\infty}^{\infty} |a_n|^2 < \infty$; Elliott [7] has pointed out that the derivation of large sieve inequalities is equivalent to the determination of the spectral radii of certain hermitian operators.

The recent notes of Montgomery [17] give a survey of the methods and results in this field; also mentioned is an interpretation recently devised by Bombieri (unpublished) which consists in the reduction of the large sieve inequalities to the estimates of the norms of linear operators between Banach spaces. This approach unifies most of the techniques used so far and suggests "nonlinear" analogues of the large sieve. The purpose of the present paper is to develop this method in order to obtain some improvements and generalizations of the results quoted above.

We denote by $\omega(n) = \chi(n)\,n^{it}$ a generalized Dirichlet character and define $\|\omega\| = q(|t| + 1)$ if $q$ is the modulus of $\chi$; $\mathcal{Q}$ is a finite set of generalized characters of cardinality $|\mathcal{Q}|$. We also define $D = D(\mathcal{Q}) = \sup_{\omega \ne \omega'} \|\bar\omega \omega'\|$ and consider a finite set $\mathcal{N}$ of positive integers between $N/2$ and $N$.

Definition 1.1. One says that $\mathcal{Q}$ is $\delta$-well spaced if for $\omega = \chi(n)\,n^{it}$, $\omega' = \chi'(n)\,n^{it'}$, $\omega, \omega' \in \mathcal{Q}$, $\omega \ne \omega'$, we have either $|t - t'| \ge \delta$ or $\bar\chi'\chi$ nonprincipal.

The large sieve inequality
$$\sum_{\omega \in \mathcal{Q}} \Big|\sum_{n \in \mathcal{N}} a_n \omega(n)\Big|^2 \le c(\mathcal{N}, \mathcal{Q}) \sum_{n \in \mathcal{N}} |a_n|^2 \tag{0.3}$$
may be considered as an upper bound for the norm of a linear operator. If $L^p(\mathcal{N})$ and $L^q(\mathcal{Q})$ are the Banach spaces of complex-valued functions over $\mathcal{N}$, $\mathcal{Q}$ respectively with the usual norms, we define the "Dirichlet series" operator $\mathcal{D} = \mathcal{D}(\mathcal{N}, \mathcal{Q}) : L^p(\mathcal{N}) \to L^q(\mathcal{Q})$ by
$$\mathcal{D}(\{a_n\}_{n \in \mathcal{N}}) = \Big\{\sum_{n \in \mathcal{N}} a_n \omega(n)\Big\}_{\omega \in \mathcal{Q}}.$$
If $\|\mathcal{D}\|_{p,q}$ is the norm of the above operator, we have the inequality
$$\Big(\sum_{\omega \in \mathcal{Q}} \Big|\sum_{n \in \mathcal{N}} a_n \omega(n)\Big|^q\Big)^{1/q} \le \|\mathcal{D}\|_{p,q} \Big(\sum_{n \in \mathcal{N}} |a_n|^p\Big)^{1/p}, \tag{0.4}$$
which is clearly of large sieve type.

The success of the methods so far used depends on the following well-known equality
$$\|\mathcal{D}\|_{p,q} = \|\mathcal{D}^*\|_{q',p'},$$
where $\mathcal{D}^*$ is the adjoint operator of $\mathcal{D}$ and $1/p + 1/p' = 1/q + 1/q' = 1$. In the classical case $p = p' = q = q' = 2$, the introduction of the Halász coefficients (see Montgomery [15]) is avoided by estimating $\|\mathcal{D}^*\|_{2,2}$ instead of $\|\mathcal{D}\|_{2,2}$. This enables us to obtain our results directly and to avoid some splitting of cases occurring in Montgomery's paper.

Our main results run as follows:

Theorem 3.2. If $\mathcal{Q}$ is $(2 \log D)^2$-well spaced, then
$$\|\mathcal{D}\|_{2,2}^2 \ll N + D \log^2 D.$$

Theorem 3.3. If $\mathcal{Q}$ is $(\log N)$-well spaced, then for $0 \le \delta \le 1$,
$$\|\mathcal{D}\|_{2,2}^2 \ll N + N^{\delta} D^{\mu(\delta) + \varepsilon}\,|\mathcal{Q}|,$$
where $\mu(\delta)$ is the Lindelöf $\mu$-function.

Theorem 3.4. Let $N(\alpha, T; \chi)$ denote the number of zeros of $L(s, \chi)$ in the rectangle $\alpha \le \sigma \le 1$, $|t| \le T$, and put $N(\alpha, T; \mathcal{Q}) = \max_{\chi'} \sum_{\chi} N(\alpha, T; \chi\bar\chi')$, where $\chi$, $\chi'$ are such that $\omega(n) = \chi(n)\,n^{it}$, $\omega'(n) = \chi'(n)\,n^{it'}$ belong to $\mathcal{Q}$ for some $t$, $t'$. Let $\mathcal{Q}$ be $(\log^2 D)$-well spaced. Then, for $\delta \le \tfrac12$, $T = \max_{\omega, \omega'} |t - t'| + 1$,
$$\|\mathcal{D}\|_{2,2}^2 \ll N + N^{\delta} D^{1/2 - \delta + \varepsilon}\,|\mathcal{Q}| + N^{\delta} D^{1/2 - \delta/2 + \varepsilon}\,N(1 - \delta, T; \mathcal{Q})^{1 - \delta/2}.$$

From these results, using the Riesz-Thorin theorem, we deduce some estimates for the norm $\|\mathcal{D}\|_{p,q}$ with $(p, q) \ne (2, 2)$. We also obtain the following

Theorem 4.1. Let $\mathcal{Q}$ be $(\log^2 D)$-well spaced, and put $D_1 = D \log^2 D$. Then, for any even integer $p \ge 2$,
$$\|\mathcal{D}\|_{p,p} \ll (N^{1/2} + D_1^{1/p})\,N^{1/2 - 1/p}\,(\log N)^{c(p)}.$$

Montgomery [17, Theorem 12.6] has pointed out that the validity of Theorem 4.1 for any real $2 \le p \le 4$ would imply the density hypothesis $N(\sigma, T) \ll T^{2(1 - \sigma) + \varepsilon}$. Unfortunately the Riesz-Thorin interpolation theorem gives the following weaker result.
34 M. FORTI AND C. VIOLA Theorem 4.2. Let Q be (log2D)-well spaced. Then, for any p^2, where fc = [p/2]. Finally, we can generalize Theorem 3.3 in the following form. Theorem 4.3. Let Q be (logN)-well spaced. Then, for any p, q^.2 and for 0^<5^1, ii®iiPi,«jvi/2-i/^i/2+^2z>^^2+iior/«). The authors are deeply indebted to Professor Bombieri for help and constant encouragement in the preparation of this paper. 1. An interpretation of the large sieve by means of linear operators. Throughout the paper Jf, Q, \Q\, co, \\co\\, D will be as defined in the introduction. Given the finite set Q of generalized characters co(n) = x(n) ri* we denote by X(Q) the set of Dirichlet characters x such that x{n) nlt belongs to Q for some t. Let n{Jr),I3{Q) be the Banach spaces of complex number sequences fl = Kk^a = W(a6f} indexed by Jf, Q respectively with the usual norms We define the "Dirichlet series" operator @ = &(JT9 Q):LP(JT)-+I3(Q) by (oeQ Denoting by \\@\\p „ the norm of 2f, we get (1.2) Z «»">(") q\ l/« / \ 1/P The adjoint operator 3>*:D'(Q)^L"'(JT), where Xjp+ljp'=\lq + \jq'=\, is (1-3) ®*({<U„.n) = {l <*>(»)} • We recall the following well-known results in the theory of Banach spaces. Theorem 1.1. \\9\\Ptq=\\®*\\q;P' where \/p+l/p'= \/q+\/qf=\.
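To make the operator of this section concrete, here is a small numerical sketch of the "Dirichlet series" operator and of the map (1.3), together with the bilinear pairing identity $\sum_{\omega} b_\omega\,(\mathcal{D}a)_\omega = \sum_{n} a_n\,(\mathcal{D}^* b)_n$ that makes (1.3) the Banach-space adjoint. It is an illustration only: the three generalized characters, the index set, and the names `dirichlet_operator`, `adjoint_operator` and `pairing` are assumptions chosen for the example, not data from the paper.

```python
import cmath
import math


def chi5(n: int) -> int:
    """Legendre symbol mod 5, used as a sample Dirichlet character."""
    r = n % 5
    if r == 0:
        return 0
    return 1 if r in (1, 4) else -1


def trivial(n: int) -> int:
    """The character of modulus 1 (identically 1)."""
    return 1


def generalized_character(chi, t: float):
    """Return omega(n) = chi(n) * n^{it} as a function of the positive integer n."""
    return lambda n: chi(n) * cmath.exp(1j * t * math.log(n))


def dirichlet_operator(a, ns, omegas):
    """D: (a_n)_{n in ns} -> (sum_n a_n omega(n))_{omega in omegas}."""
    return [sum(a_n * w(n) for a_n, n in zip(a, ns)) for w in omegas]


def adjoint_operator(b, ns, omegas):
    """D*: (b_omega)_{omega in omegas} -> (sum_omega b_omega omega(n))_{n in ns}."""
    return [sum(b_w * w(n) for b_w, w in zip(b, omegas)) for n in ns]


def pairing(x, y) -> complex:
    """Bilinear (not sesquilinear) pairing sum_i x_i y_i."""
    return sum(xi * yi for xi, yi in zip(x, y))


# Illustrative data: integers between N/2 and N for N = 12, and three characters.
NS = list(range(6, 13))
OMEGAS = [
    generalized_character(chi5, 0.0),
    generalized_character(chi5, 1.5),
    generalized_character(trivial, 2.0),
]

if __name__ == "__main__":
    a = [1, -2, 0.5 + 1j, 3, 0, -1j, 2]
    b = [1 - 1j, 0.25, 2j]
    lhs = pairing(dirichlet_operator(a, NS, OMEGAS), b)
    rhs = pairing(a, adjoint_operator(b, NS, OMEGAS))
    print(abs(lhs - rhs))
```

The identity is just an interchange of two finite sums; it is the reason a bound for the norm of the adjoint can be used in place of a bound for the operator itself.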
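Theorem 1.2 (Riesz-Thorin). $\log \|\mathcal{D}\|_{p,q}$ is a convex function of $(1/p, 1/q)$ in $1 \le p, q \le +\infty$.

Using Theorem 1.1 we may evaluate $\|\mathcal{D}^*\|$ instead of $\|\mathcal{D}\|$ ($\|\cdot\|$ being a shortened notation for $\|\cdot\|_{2,2}$). Let $f(n)$ be a real-valued function such that
$$f(n) \ge 1 \ \text{for } n \in \mathcal{N}, \qquad f(n) \ge 0 \ \text{always}, \qquad \text{and} \quad F = \sum_{n=1}^{\infty} f(n) < +\infty. \tag{1.4}$$
We get
$$\sum_{n \in \mathcal{N}} \Big|\sum_{\omega \in \mathcal{Q}} a_\omega \omega(n)\Big|^2 \le \sum_{n=1}^{\infty} f(n) \Big|\sum_{\omega \in \mathcal{Q}} a_\omega \omega(n)\Big|^2 = \sum_{\mathcal{Q} \times \mathcal{Q}} \Big(\sum_{n=1}^{\infty} f(n)\,\bar\omega\omega'(n)\Big) \bar a_\omega a_{\omega'}. \tag{1.5}$$
Since any hermitian form $\sum_{i,j} a_{ij} x_i \bar x_j$ satisfies
$$\Big|\sum_{i,j} a_{ij} x_i \bar x_j\Big| \le \frac12 \sum_{i,j} |a_{ij}|\,(|x_i|^2 + |x_j|^2) = \sum_{i} \Big(\sum_{j} |a_{ij}|\Big) |x_i|^2 \le \max_{i} \sum_{j} |a_{ij}| \sum_{i} |x_i|^2,$$
we obtain
$$\sum_{n \in \mathcal{N}} \Big|\sum_{\omega \in \mathcal{Q}} a_\omega \omega(n)\Big|^2 \le \Big(\max_{\omega \in \mathcal{Q}} \sum_{\omega' \in \mathcal{Q}} \Big|\sum_{n=1}^{\infty} f(n)\,\bar\omega\omega'(n)\Big|\Big) \sum_{\omega \in \mathcal{Q}} |a_\omega|^2; \tag{1.6}$$
hence
$$\|\mathcal{D}^*\|^2 = \|\mathcal{D}\|^2 \le \max_{\omega \in \mathcal{Q}} \sum_{\omega' \in \mathcal{Q}} |K(\bar\omega\omega')| \le F + K\,|\mathcal{Q}|, \tag{1.7}$$
where $K(\omega) = \sum_{n=1}^{\infty} f(n)\,\omega(n)$ and $K = \sup_{\omega \ne \omega'} |K(\bar\omega\omega')|$.

The inequality (1.5) also enables us to obtain a bound for $\|\mathcal{D}\|_{2,1}$. Let $A_\omega = \sum_{\omega'} |K(\bar\omega\omega')|$; then
$$\sum_{n \in \mathcal{N}} \Big|\sum_{\omega \in \mathcal{Q}} a_\omega \omega(n)\Big|^2 \le \sum_{\mathcal{Q} \times \mathcal{Q}} K(\bar\omega\omega')\,\bar a_\omega a_{\omega'} \le \frac12 \sum_{\mathcal{Q} \times \mathcal{Q}} |K(\bar\omega\omega')|\,(|a_\omega|^2 + |a_{\omega'}|^2) = \sum_{\omega \in \mathcal{Q}} A_\omega |a_\omega|^2,$$
whence
$$\sum_{n \in \mathcal{N}} \Big|\sum_{\omega \in \mathcal{Q}} a_\omega A_\omega^{-1/2}\,\omega(n)\Big|^2 \le \sum_{\omega \in \mathcal{Q}} |a_\omega|^2. \tag{1.8}$$
The operator $\mathcal{D}_1^*$ on $L^2(\mathcal{Q})$ defined by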
36 M. FORTI AND C. VIOLA @U{«lo}loea) = <Y,A«m«Mn) nejT is the adjoint of the operator @1:L2(jV)-+L2(Q) given by »i(KM={e^1/2^w| IS J Q- By (1.7), ||®*|| = H0J ^ 1, whence I I2 (i.9) E^1 Z<W")| ^£kl7 Therefore, by Cauchy's inequality, jr (i.io) (zlE^JY^fx^fz^1 Iz^wl^fs^Zkl2, which implies that (1.11) wmli£ X i«(^')i Remark. The usual choices for Q are the following: (i) Consider for any Dirichlet character #modg a finite set of real numbers tx,r, l^r<^Rx; then Q = {(D \ co(n) = x(n) nitxr} and D = qT with r=max|/x ,r-// J + l. (ii) For any q^Q and any primitive character x modq let rx r be as in (i); then Q = {co | co(n) = x{n) nil^r} and D^Q2T. 2. A localization principle. The following lemma generalizes a method due to Bombieri [2]. Lemma 2.1. Let g(s) = £*= 1 bnn~s, s=G + it, be such that for suitable o0, gx, t0,D,A,l^\ ((70^<7i), (i) /Ae series is absolutely convergent for g^g± ; (ii) #(s) w holomorphic for g^g0, |/ — /0|^log' D; (iii) \g(s)\<DAforG^G0,\t-t0\SloglD. Define gx,k{s) = U?=i M"s exp(-(n/*)*) (fc>0). Tteii/or 6Z-k/2,<T^c0-5, \t-t0\^\ogl D, omax(0, Gl~G0-k/2),
SIEVE ESTIMATES FOR THE DIRICHLET SERIES OPERATOR 37 Qx. k (s) = Ms) + 0 <( (log1 D) X6 max \g (a+6 + it (2.1) wft/i |f-tol^(l/2) log' D + o\kDAXc exp(-(l/4/c) log'D)max |^(5)|[, = 0 i/5>0. Proof. We note that, by Mellin's transform, c + ioo if /w \ ATW 0x,*(s) = — ^(w + s)rf-+lj — dw, c large. c-iao In the above integral we change the path of integration in the union.of the following segments (w = u + iv): Llf2 = {w\u = c, M^ilog'D}, L3f4 = {iv|5gii^c,M=iloglZ)}, L5 = {vv|W = c5, M^log'/)}. If 5 < 0, the residue of the integrand at w = 0 is #(s). Hence 0* '>(s)=iiijg(w+s)r{-k+l)^dw+^ if5<0' Clearly +&(s) if 5=0, + 0 if 5>0. (l/4)log'D ^(w + 5)r(-+l —dw\ \k J w 5 v\ Xd+iv g(S + G + i(t + v))r[ l+-+f-|-—-dv\ k kj 5 + iv -(l/4)log'D |,(w+s)rg+i): « ATd log/ D max |^r(^-hcr-+- ir)|; |t-f|g(l/4) log'D *("+*) ^+l)vdwi « fcXc e -('/4*> log,D max |# (s)| (/' = 1, 2); «/cD^ATc^-(1/4k)log,D, by(iii) (/=3,4). This completes the proof of the lemma. #(w + s)r -+1 —dw 1 A: / w
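For readers who want the first step of the proof spelled out, the Mellin-transform representation of $g_{X,k}(s)$ can be derived as follows. This is a sketch under the assumption that $c > 0$ is taken large enough for the Dirichlet series of $g$ to converge absolutely on the line $\operatorname{Re} w = c$ (so that sum and integral may be interchanged); the auxiliary variables $y$ and $u$ are introduced here and are not part of the original notation.

```latex
\begin{align*}
e^{-y} &= \frac{1}{2\pi i}\int_{c'-i\infty}^{c'+i\infty} \Gamma(u)\, y^{-u}\, du
   && (y > 0,\ c' > 0), \\
e^{-(n/X)^k} &= \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}
   \Gamma\!\left(\frac{w}{k}\right) \left(\frac{n}{X}\right)^{-w} \frac{dw}{k}
   && (y = (n/X)^k,\ u = w/k,\ c = k c'), \\
 &= \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}
   \Gamma\!\left(\frac{w}{k} + 1\right) X^{w}\, n^{-w}\, \frac{dw}{w}
   && \Bigl(\tfrac{1}{k}\Gamma\bigl(\tfrac{w}{k}\bigr) = \tfrac{1}{w}\Gamma\bigl(\tfrac{w}{k}+1\bigr)\Bigr), \\
g_{X,k}(s) &= \sum_{n=1}^{\infty} b_n n^{-s} e^{-(n/X)^k}
   = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}
   g(w+s)\, \Gamma\!\left(\frac{w}{k} + 1\right) X^{w}\, \frac{dw}{w}.
\end{align*}
```

The integrand has a simple pole at $w = 0$ with residue $g(s)$, coming from the factor $1/w$; this is the term that is picked up, in whole, in half, or not at all, according to whether the shifted contour passes to the left of, through, or to the right of $w = 0$.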
38 M. FORTI AND C. VIOLA Remark. If g(s) is meromorphic in the strip o = o0, \t — /0|^log/ D, we have to add to the right-hand side of (2.1) the residues of the integrand at the poles of g. We now define (2.2) f(n) = C[e-{nlN)k-e-i2n/N)k], where C = C(k) is such that (1.4) is fulfilled, and apply the above lemma with g(s) = L(s, %), as well as the previous remark when x is a principal character, in order to obtain (2.3) K(co)<Nd log3D max \L(5 + h, *)| + £(x) Ne~m |t-t|g(l/2) log3D with z(x) = 4)(q)/q for principal x mod q, = 0 otherwise, provided N<^DB for some B. From the estimate \L(s, x)\<{qTf{a) + E (x modq, Res = tr, \lms\^T) it follows that, if ||co|| <D, 0 = S = 1, (2.4) K{co)<NdD»id)+e + £(x) Ne~Mk uniformly in co. In particular we put 3 = 0 and obtain (2.5) K(a))<$D1/2 + e+£(x) Ne~M. From \L(s, x)\<(q\s\/2*)ll2~° logfo|s|) we get, if <5<0, /\s\nV,2~d (2.6) K(co)<Ndr-j-j (logD)4 + £(%) Ne^26. This estimate is significant for N large compared with £>; if 5= —(A/2) logD and N = AD logD, we have (2.7) K{co)<Dll2~A log5D + £(x) AT^IM 1o*d. 3. Estimates of the norm ||£^||. The estimate (2.3) for K(co) is effective for either a nonprincipal generalized character co, or a not too small |/|. Remark 3.1. Suppose we are given a bound \\@\\ <^H for any <5-well spaced
SIEVE ESTIMATES FOR THE DIRICHLET SERIES OPERATOR 39 set Q. Then we obtain the bound \\@\\ <H(5/5* + l) for any (5*-well spaced Q, by means of a partition of Q in <5-well spaced subsets whose number is <^5/5*+l. Since Q is finite, it is well spaced provided that \t —1'\ >0 whenever %'x is principal. Under this assumption the following results give effective bounds. With the notations of §1 (D = sup(a*<o' || coco' ||) we have the following Theorem 3.1. Let Q be (2 log D)2-well spaced and assume N^2D log D. Then (3.1) \\@\\2<N. Proof. From (2.2) and (2.7) we get F<^N and K{(b(D')<$Dll2-A+e + z(xy;) Ate-"-'IM ,ogD; hence, K<^D1~A+ND~A provided Q is (A logZ>)2-well spaced. From (1.7) and \Q\ ^ D2, putting A = 2, we obtain \\@\\2 <N+(D-1 + nd~2) D2 <$n . q.e.d. The following remark enables us to derive a bound for ||^|| which does not depend on the size of N. If m is a positive integer such that a>(m)^0 for any cogQ, then \Y^* flwco(n)| = |L,e^ an(o(™n% whence \\9(jV, Q)\\Ptq=\\9{mjr, Q)\\Pt9. Theorem 3.2. IfQ is (2 log D)2-well spaced, then (3.2) ||®||2^Ar + Dlog2D. Proof. By Theorem 3.1 the proof is needed only if N<2D\ogD. Let us consider a finite sequence of primes px <p2 <--<ps such that Np1>2D logD and Pi...ps>D- For every cogQ there is at least a pj such that co^^O, and we may apply Theorem 3.1 to every set pj^V. Multiplying by co(Pj) and summing overj we obtain z Q 2 s ^ZZ X an(o(pjn)\ x <^\\®(P}jr,Q)\\2Y,\atf «Z (PjN)?:K\2^psNZ\a„\2, whence \\@}\\2<^spsN. Choose p1<(4D\ogD)/N and sKlogD/logp^ Since the number of primes between px and 2px is asymptotic to pi/logpl9 there is a prime ps such that ps<2p1
40 M. FORTI AND C. VIOLA <8D \ogD/N. Therefore \\@\\2<spsN^(\ogD/\ogPl) ((82) logD)/N) N<D log2D. Combining this estimate with (3.1), we obtain (3.2). Q.E.D. We can improve the estimate (3.2) if \Q\ is small compared with D. By (2.4) and (2.5) we immediately obtain Theorem 3.3. Let Q be (logN)-wellspaced. Then, forO^S^l, (3.3.1) \\^\\2<N + NdD^3) + e\Q\. In particular for S = 0 we have (3.3.2) \\@\\2<N + Dll2 + e\Q\. Remark 3.2. If N<D log A it follows from fi(S)^\/2-S/2 (O^S^ 1) that (3.3.2) improves (3.2) provided \Q\<£D1/2, while (3.3.1) for 5=^ gives the bound \\@\\2<N + N1/2Dm + e\Q\ which is stronger than the previous one for |G|<^D3/4 -N~1/2. On the other hand, if we assume the Lindelof hypothesis ^(^) = 0, (3.3.1) gives \\@\\2<N + N1/2De\Q\. We can improve the bounds (3.3.1) and (3.3.2) by applying the following lemma (see for example Bombieri [2]). Lemma 3.1. AssumeL(s, x)¥:Oforxmodq,s = a-\-it,a>a^, \t — t0\<log2T Then \L{s,/)H{qT)^a)^ for |/-/0|<ilog27\ where Pl(g, a) = 0 ifa^OL, = (a-ff)/2 iyi-a£(7£a, = \ — c ifa^l—oc. Define, for any p}± 1, fi{p) (a) to be the least exponent for which
SIEVE ESTIMATES FOR THE DIRICHLET SERIES OPERATOR 41 T max j- £ -T Also denote by N(<x, T\ x) the number of zeros of L(s, x) in the rectangle a^c^ 1, \t\£T, and put AT (a, T;G)=max £ NfaT;**'). Theorem 3.4. Le/ Q be (log2D)-well spaced. Then, for cc^\—S, S^j, T = maxco>co,|r-r,| + l? (3.4.1) H^H2 <^ AT-h AT^^^-^^^^I^I -|_ A^^jr>i/2-^+i/jP+/xc^> ci -^>+£ [iv(oc, T; O)] 1~1/". In particular for a = 1 — S, p = 2/5, we have (3.4.2) ||^||2^N + ^D1/2-^ + £|^| + Ar5D1/2-^2+£[N(l-5, T;^)]1"^2. /« ftof/i (3.4.1) and (3.4.2) owe way replace the factor [7V(a, r;&)]1_1/p by min{|0|,[N(a,T;0)]1-1^}. Proof. Let co*e£ be such that, for any co'eQ, Y*<»en \K((ocof)\^ Y,a>eQ \K{<oa>*)\, and let Q*=Qcb*. Then (1.7) implies \\9\\2 <7V+£^|AT(co)| which, combined with (2.3) and with the functional equation for the L-functions, gives (3.5) ||0||2«JV + JV'D1/2-'X ™ax |L(l-5 + iT,z)|. fl*-|t-f|^(l/2)log2D Let tw verify |L(l-<5 + iTw,Z)| = max |L(1-5 + it,Z)|, |T-r|£(l/2)log2D and let 7m = {T|mlog2D<T^(m+l)log2/)}. We follow here an idea of Turan [20], [12]. Define &a(x) to be the set of strips Im such that L(g + ir, x)#0 for o-^a, re Jm, and let ^a(x) be the set of the remaining strips. Let Qb (resp. Qc) be the set of coeQ* such that tw belongs to a strip 7me^a(x) (resp. #a(/)). Then, for a ^ 1 - S, \L{° + it>XX')\ p ) i/p dt> <^D«(p)<">+£.
42 M. FORTI AND C. VIOLA (3.6) n* Qh &c «D<'-1+')/2+I|0| + I|L(l-« + «„„ Z)|. Since |#a(x)| £N(<x, T; x), Holder's inequality yields (3.7) £|L(1-* + «„, Z)|£ tic LXeX(Q*) *T:*]-''\$ \L(l-S + ixwX)\' i ip Applying the Sobolev inequality \f(x)\£{l/(b-a)) \ba \f{x)\ dx + J» |/'(x)| dx we get X|L(l-^ + JTra,X)|p T <1 I {|L(l-5 + it,z)l"+P|L(l-6 + ir,x)lp"1|£(l-^+^x)l}A. From this, using Cauchy's integral formula for L(s, x) in terms of L(s, x), we obtain, for a suitable 5' between S — 1/logD and 5+ 1/logD, (3-8) |_S |L(l-5 + iTra,x)|" i/p «logD T+l * 1 X(Q*) J \L(l-5' + it,x)\p dt i/p ^/}l/p + Cl")(l-«)+ei Now (3.4.1) follows from (3.6), (3.7) and (3.8). We recall the following result of Gabriel [8] (see also Bombieri [1, Lemma 10]). (3.9) it* (<r)£((0-<x)/(0-«)) //" («) + ((*-«)/(/»-«)) /<> (/?), where a£o£p,l/p=(l/h){p- a)/(fi - a)+(1/fc) (a - «)/(/? - a). By (3.9) with /i = 4, k = ao, a=j, /? = 1 and the estimate (3.10) /i<4,(l/2)=0, we obtain for any p ^ 4 (3.11) ^<p)(<7) = 0 ifff^l-2/p. Now (3.4.2) follows from (3.11) putting a = l-8,p = 2/5 in (3.4.1).
SIEVE ESTIMATES FOR THE DIRICHLET SERIES OPERATOR 43 Remark 3.3. (3.10) is obtained by considering fourth moments of L- functions (see for instance Montgomery [17, Theorem 10.1]); further improvements on the exponent of the modulus q can be derived from Huxley's results [13] about eighth moments. If fr (a) is the least exponent for which N(ol,T; Q)<$D<ll{a) + £, then (3.4.2) has the following form. (3.12) ||^||2^7V + j/V<5jD1/2-^ + £|^|+^/)1/2-^/2 + (1-^/2)mi(1-^) + ^ Remark 3.4. We can extend the previous estimates to sets N of integers which are not necessarily included in any interval [N/2, AT]. If N = sup^T, let J\rh = jrn[NI2\ N/2h~1'] (h=l, 2,...). Then £aa><5(W) h JVh q I |_ /i J a i«j2, i.e. \\2>{^,Q)\\2^Yj. \\&{^rk,0)\\2. By Theorem 3.1, \\®{Jfh, Q^^N/l"^ for N^2hD logD, while by Theorem 3.2, \\®(Jrh, Q)\\2<D log2D for N<2hD logD. Hence (3.13) ||^(^,0)||2<^Ar + Dlog3D holds for any set N of positive integers ^ AT. The same argument allows us to establish Theorems 3.3 and 3.4 for any set Jf of positive integers ^ N. Remark 3.5. In order to estimate expressions of the type £*= x |J^ an nitr\2, corresponding to a set Q of generalized characters co(n) = nu, we apply the following idea of Huxley (unpublished). Let H(T) = supQ maxw £©, \K((o'cb)\, where sup^ is taken over all sets Q such that D(Q)^T. It follows from (1.7) that ;!*>'' ^tfOOIkl2 provided supr>s|tr-tJ^T. Let Im = [mV, (m + 1) V~\ and &m = {coe& | co(n) = nir, re/m}; then
44 M. FORTI AND C. VIOLA Now we may apply the estimates so far considered to H{V). For instance, it follows from (3.4.2) that E anco(n) <Z{N + NdV1/2-d+£\Qm\ + NdV1/2-d/2+e[N(\-S, K)]1"^2} £ \af, where N (1 — S, V) is the number of zeros of £ (s) for a ^ 1 — 5, 0 ^ f ^ K. Summing over m we get, for 1 <^ V <^ T, 2»H (3.14) ^JArl + ^K1/2-^^ + ^-K1/2^/2+£[iV(l-5, K)]1-^2!^!^2, provided infr s \tr - ts\ ^ log2 AT. 4. Estimates of the norm ||^||p,, with (p, #)#(2, 2). By Holder's inequality it follows that (4.1) H^IU^JV1*-1"!!®!!*, forr>p. We shall now estimate the norms H^ll^k, (k a positive integer) by means of II^IIp,,. Let jVk = {m | m = n1...nk, «;€>"}. Then (4.2) z E<v°(") *« =E E^-^^K-^) I.** ^||^(^*,Q)|||.J Z ejVk E *«.,•••** p\q!p Applying twice Holder's inequality to the right-hand side of (4.2) we obtain E «■,-«* ^ E [dk(m)Y{1-1M[ E KI^KI" me/k 1/fc (4.3) £< X [</kM](p~1/k)/(1~1/k) me/k Z Z KI*p...kJ*' mejVk n\...rih = m <^\_Nk (log Ny^^y-1'kYi\af. jV Therefore, from (4.2) and (4.3),
SIEVE ESTIMATES FOR THE DIRICHLET SERIES OPERATOR 45 kq\l/kq / \l/fcp J < \\® {^\ £3)|| \% N^~ W (log AT)^ \Y \an\kp) Hence, by (4.1) and (4.4) we get (4.5) \\&{jr Q)\\rM<N^-^ (\ogNfk>* \\®{Jf\ 0)||PV{ for r^kp. Combining (4.5) with the results of §3, we can deduce estimates for ||^||p><r We shall adopt for the sake of simplicity the estimate \Q)(Jf, (2)|| <^N1/2 + D\12 (see Theorem 3.2), where D1=D log2 D. Therefore, if p = q = 2k we obtain Theorem 4.1. Let Q be (log D)2-well spaced. Then, for any positive integer k, (4.6) \\@\\2k,2k<(N1,2 + D\/2k)Nll2-1/2k+e. We may now interpolate between 2k — 2 and 2/c by means of Theorem 1.2 in order to obtain Theorem 4.2. Let Q be (\ogD)2-well spaced. Then, for any p^2 and for * = [p/2], (4.7) ll^llp,^<^(^(k+1)/2 + Z>i/2)1-2fc/^(Arfc/24-Z>i/2)(2fc-h2)/^-1 N1/2~1/p+e. Remark 4.1. The estimate (4.7) takes the following more explicit forms (we put k = 0/2] as before and $ =p/2 - [>/2]): (4.8.1) \m\P,P<^1~1/p+E forD^JV*; (4.8.2) \\9\\P9P<D\fpNlf2-lfp+t for D.PN^1; ||® || <^£)(1fc+1)/P-1/2j/Vfc/2-fc(fc+l)/P+l-l/P + £ (4.8.3) P'P forNk<Dx^Nk + l. = £)d -S)/pjy&!2+S(l -&)fp + 1/2-1/p+e (4.8.1) and (4.8.2) are particular cases of the very strong statement \\@\\p,P<{N1/2 + D\/p)N1/2-1/p+e, which has recently been conjectured by Montgomery [17, Conjecture 9.2] in a special case. Unfortunately (4.8.3) yields the weaker result (4-4) (E E anV(n)\
46 M. FORTI AND C. VIOLA (4.8.4) \\9\\PtP<(Nlf2 + lf4p + D\'p+lf2p2)Nlf2-lfp+t forallp^2. Another estimate for ||^||P><J can be obtained by means of the adjoint operator (Theorem 1.1). By Holder's inequality (1 ^p'^2) we get (4.9) Z «»<»(") P'y/P' ( ^N(2-P')/2P' £ Z aX^) Q 2N1/2 If f(ri) is a function satisfying (1.4), then Zaco^(")| Q 2\l/2 /oo Z aX") 2\ 1/2 v2^ 1/2 (4.10) ^f1/2fzi«J2Y/2 + K1/2ElaJ. Since for l^q'^2 we have 1/9' and 1/2 / \l/«' ZW2) ^ I !«„!•' we deduce from (4.9) and (4.10) that ip'\ i/p' Z aco^(") ^NW-l/2(Fl/2 + Xl/2|0|l-W) ^|aj, L/«' Now define p, g such that 1/p + 1/p' = l/q + l/q' = 1. The above inequality reduces to (4.11) \\^{^Q)\\Pyq<^Nll2-llp(Nll2+Kll2\Q\llq) for any p, q ^ 2, provided F <^N. Let (2.2) /(«)=c[^-(w/N)k-^-(2^>k].
SIEVE ESTIMATES FOR THE DIRICHLET SERIES OPERATOR 47 Combining (4.11) with (2.4) and (2.5), we generalize Theorem 3.3 as follows. Theorem 4.3. Let Q be (logN)-well spaced. Then, for any p, q^2, 0^3^ 1, (4712.1) \\^\\Ptq^N1/2-1/p(N1/2 + Ns/2D^S)l2+e\Q\llq). In particular for 3 = 0, (4.12.2) \\®\\P,q<N1/2-llP(N1/2 + D1/4+e\Q\1/q). Remark 4.2. In view of Remark 3.4, the results of the present section hold for every set Jf of positive integers not exceeding N. 5. Estimates for sums of type ^ |]T^ a„ co(n) n~s{(0)\p. We can easily generalize the previous theorems to the case of an arbitrary complex exponent for n. Our result is as follows. Theorem 5.1. Let Jf be an arbitrary set of positive integers not exceeding N. Let Q be a set of generalized characters. For any coeQ let s(co>) = g((d) + h(co). Let Q' = {cd' I co,(n) = co(n) n~lT{to),a>EQ} and define a0 = 'mfQ a (co). Thenforanyp, q^2, (5-1) E .ft " < \\3(JT, 0')||p,,(log logAO1^ |a„|*n-"°V/P Xa„co(n)n"sH Moreover, ifp^q the following stronger inequality holds: jr (5-2) (I -ft 1,q L. / log N\y/p ^ii^(^,^iip,,|p^ip""pffY+iog^J} • Proof. We may suppose cro = 0, by substituting an with ann~ao; we also assume without loss of generality that t(co) = 0 for any cog Q (i.e. Q! = Q). Then, by partial summation (a = a(co)), (5.3) IS AT' J ^M+f {£ ana>(n)\ ^~a~l d{ <2UN-qa Z an<t>{n)\ ^aMnn^r'-'di We now apply Holder's inequality; if W+!/<?= 1, then
48 M. FORTI AND C. VIOLA {£ a^inftcC'-'dZ (5.4) TV 41 N E a»w(«)| dt E anW(") {log 5. z iog<r |far,''-1(iog^'-i^j*/" Summing over to we obtain from (5.3) and (5.4), Xa„co(H)H-s<"> 1^* A1/* ^1 X>„w(") "J? E a«<°(n)\ i log^J (5.5) \QlP r/S \q/P iz ll/Q .^" £log£ Hence we immediately have X«„co(n)n" s(a>) jV " < \\nP,q(\+iog logjv)1" h \an\p\IP, which is equivalent to (5.1). Now define |y||, = (fN \g(Z)\q d log logf)1'*. It is well known that, for any p^q, q~\\\y\\\p' Hence, putting g{£) = (£? \an\p)1/p, we obtain by (5.5), X>„co(n) *"*<*» q\l/q ^2 II^Hp.«{(l k|P)1/P + (| £ kl* d log log<^)1/P <4 1101 n d log log £ 1/p (5.2) immediately follows.
Corollary. If $\mathcal{N}$ is comprised between $N/2$ and $N$, we have (5.6) (I js

References

1. E. Bombieri, On the large sieve, Mathematika 12 (1965), 201-225. MR 33 #5590.
2. E. Bombieri, Density theorems for the zeta function, Proc. Sympos. Pure Math., vol. 20, Amer. Math. Soc., Providence, R.I., 1971, pp. 352-358.
3. E. Bombieri, A note on the large sieve, Acta Arith. 18 (1971), 401-404.
4. E. Bombieri and H. Davenport, On the large sieve method, Number Theory and Analysis (Papers in Honor of Edmund Landau), Plenum, New York, 1969, pp. 9-22. MR 41 #5327.
5. E. Bombieri and H. Davenport, Some inequalities involving trigonometrical polynomials, Ann. Scuola Norm. Sup. Pisa (3) 23 (1969), 223-241. MR 40 #2636.
6. H. Davenport and H. Halberstam, The values of a trigonometrical polynomial at well spaced points, Mathematika 13 (1966), 91-96; Corrigendum and Addendum, ibid. 14 (1967), 229-232. MR 33 #5592; MR 36 #2569.
7. P. D. T. A. Elliott, On inequalities of large sieve type, Acta Arith. 18 (1971), 405-422.
8. R. M. Gabriel, Some results concerning the integrals of moduli of regular functions along certain curves, J. London Math. Soc. 2 (1927), 112-117.
9. P. X. Gallagher, The large sieve, Mathematika 14 (1967), 14-20. MR 35 #5411.
10. P. X. Gallagher, A large sieve density estimate near $\sigma = 1$, Invent. Math. 11 (1970), 329-339. MR 43 #4775.
11. G. Halász, Über die Mittelwerte multiplikativer zahlentheoretischer Funktionen, Acta Math. Acad. Sci. Hungar. 19 (1968), 365-403. MR 37 #6254.
12. G. Halász and P. Turán, On the distribution of roots of Riemann zeta and allied functions. I, J. Number Theory 1 (1969), 121-137. MR 38 #4422.
13. M. N. Huxley, The large sieve inequality for algebraic number fields. II. Means of moments of Hecke zeta-functions, Proc. London Math. Soc. (3) 21 (1970), 108-128. MR 42 #5944.
14. Ju. V. Linnik, The large sieve, C. R. (Dokl.) Acad. Sci. URSS 30 (1941), 292-294. MR 2, 349.
15. H. L. Montgomery, Mean and large values of Dirichlet polynomials, Invent. Math. 8 (1969), 334-345. MR 42 #3029.
16. H. L. Montgomery, Zeros of L-functions, Invent. Math. 8 (1969), 346-354. MR 40 #2620.
17. H. L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227, Springer-Verlag, Berlin and New York, 1971.
18. A. Rényi, On the large sieve of Ju. V. Linnik, Compositio Math. 8 (1950), 68-75. MR 11, 581.
19. K. F. Roth, On the large sieves of Linnik and Rényi, Mathematika 12 (1965), 1-9. MR 33 #5589.
20. P. Turán, On the so-called density-hypothesis in the theory of the zeta-function of Riemann, Acta Arith. 4 (1958), 31-56. MR 20 #2304.

Istituto Matematico, Università di Pisa
Pisa, Italy
ON IWASAWA'S ANALOGUE OF THE JACOBIAN FOR TOTALLY REAL NUMBER FIELDS

JOHN COATES

1. Introduction. The present paper is a summary, without proofs, of some joint work with S. Lichtenbaum. Detailed proofs, as well as some material not discussed here, will be appearing in [4]; see also [3], [10] for earlier work in the same direction.

We begin by indicating two basic problems in algebraic number theory which motivated [4]. The first is the problem of finding analogues of Dirichlet's class number formula, in the following sense. Let $F$ be a number field, $r_1$ its number of real embeddings, and $r_2$ its number of pairs of complex conjugate embeddings. For each integer $n \ge 0$, let $d_n$ be either $r_1 + r_2 - 1$, $r_2$, or $r_1 + r_2$, according as $n$ is 0, odd, or even and positive. Let $\zeta(F, s)$ be the complex zeta function of $F$. By the functional equation for $\zeta(F, s)$, we have $\zeta(F, s) \sim c_n(s + n)^{d_n}$ as $s \to -n$, where $c_n$ is some constant. Dirichlet's class number formula asserts that $c_0 = -hR/w$, where $h$ is the class number, $w$ the number of roots of unity, and $R$ the regulator of $F$. Do there exist similar formulae for the $c_n$ when $n > 0$? A crude analogy suggests that one should look for a formula for $c_n$ of the form $h_nR_n/w_n$, where $h_n$ is the order of some generalized ideal class group, $w_n$ is the order of some group of roots of unity, and $R_n$ is some $d_n \times d_n$ determinant generalizing the regulator. The simplest case is when the regulator term $R_n$ does not occur, that is, when $F$ is totally real and $n$ is odd, and indeed Siegel [15], [16] has shown in this case that $\zeta(F, -n)$ is a rational number. However, Siegel's proof gives no interpretation of this rational number in the form $h_n/w_n$ suggested above. Recently, Birch and Tate [18] in the case $n = 1$, and independently Lichtenbaum [10] for all odd positive $n$, made a precise conjecture of this kind. While apparently quite different, the two conjectures are in fact the same for $n = 1$.
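The orders $d_n$ and the rationality of $\zeta(F, -n)$ can be illustrated in the simplest case. The sketch below assumes $F = Q$ (so $r_1 = 1$, $r_2 = 0$) and uses the classical formula $\zeta(Q, -n) = -B_{n+1}/(n+1)$ with Bernoulli numbers $B_m$; that formula and the function names are brought in for illustration only and are not taken from [4].

```python
from fractions import Fraction
from math import comb


def order_of_zero(n, r1, r2):
    """d_n as defined in the text: the order of zeta(F, s) at s = -n."""
    if n == 0:
        return r1 + r2 - 1
    return r2 if n % 2 == 1 else r1 + r2


def bernoulli_numbers(m):
    """Return [B_0, ..., B_m] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for k in range(1, m + 1):
        # Standard recurrence: sum_{j=0}^{k} C(k+1, j) * B_j = 0.
        B.append(-sum(comb(k + 1, j) * B[j] for j in range(k)) / (k + 1))
    return B


def zeta_Q_at_negative(n):
    """zeta(Q, -n) = -B_{n+1} / (n + 1) for an integer n >= 1."""
    return -bernoulli_numbers(n + 1)[n + 1] / (n + 1)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        # For odd n the order d_n is 0 and the value is a nonzero rational;
        # for even n > 0 the order is 1 and the value vanishes.
        print(n, order_of_zero(n, 1, 0), zeta_Q_at_negative(n))
```

For odd $n$ the value is a nonzero rational number and $d_n = 0$, while for even positive $n$ the value vanishes, in agreement with $d_n = r_1 + r_2 = 1$.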
Part of the object of [4] has been to study these conjectures, and, in particular, to develop techniques for proving them for a class of totally real abelian extensions of the rational field $Q$.

The second problem arises from the well-known analogy between number fields and curves over finite fields. Let $C$ be a complete, nonsingular curve of genus $g \ge 1$ defined over a finite field $k$, and let $\mathcal{J}$ be the Jacobian variety of $C$. For each prime number $l$ distinct from the characteristic of $k$, let $\mathcal{J}_l$ be the $l$-primary subgroup of the group of points of $\mathcal{J}$ defined over the algebraic closure $\bar{k}$ of $k$. Then, as an abelian group, $\mathcal{J}_l$ is isomorphic to $(Q_l/Z_l)^{2g}$, where $Q_l$ and $Z_l$ denote the field of $l$-adic numbers and the ring of $l$-adic integers, respectively. The Frobenius automorphism of $\bar{k}/k$ induces an endomorphism of $\mathcal{J}_l$, and a basic theorem of Weil [19] asserts that the characteristic polynomial of this endomorphism is the quotient of the zeta function of the curve $C$ and the zeta function of a curve of genus 0. Recently, Iwasawa [6], [7] conjectured that a certain $\Gamma$-module in his theory of $Z_l$-extensions should provide a good analogue of $\mathcal{J}_l$ for number fields. Further, for a very special class of abelian extensions of $Q$, he established a beautiful analogue of Weil's theorem by relating the characteristic polynomial of this $\Gamma$-module to the $l$-adic zeta function of the number field in the sense of Kubota-Leopoldt [9]. Much of the work of [4] can be viewed as providing evidence that this result of Iwasawa is valid, without restriction, for all totally real number fields. In fact, it turns out that the conjecture of Lichtenbaum mentioned before is equivalent to the assertion that the characteristic polynomial of Iwasawa's analogue of $\mathcal{J}_l$ is always an $l$-adic function of the type constructed by Kubota-Leopoldt. Moreover, this connexion does not seem to be a superficial one, and it is used in [4] to obtain results about both problems.

AMS 1970 subject classifications. Primary 12A70. © 1973, American Mathematical Society.

2. Iwasawa's analogue. The following notation will be used throughout.
Let $l$ be an odd prime number, and let $Q_l$, $Z_l$ be the field of $l$-adic numbers and the ring of $l$-adic integers, respectively. For each integer $m \ge 1$, $\mu_m$ will denote the group of $m$th roots of unity, and we put $W = \bigcup_{n=1}^{\infty}\mu_{l^n}$, $\mathcal{T} = \varprojlim \mu_{l^n}$. If $K$ is a field, $\bar{K}$ will denote the algebraic closure of $K$. If $E$ is a Galois extension of $K$, we write $G(E/K)$ for the Galois group of $E$ over $K$. For each integer $n \ge 0$, let $\mathcal{T}^{(n)}$ denote the tensor product of $\mathcal{T}$ with itself $n$ times over $Z_l$. If $B$ is any discrete $l$-primary abelian group on which $G(\bar{K}/K)$ operates continuously, we define $B(n)$ to be the $G(\bar{K}/K)$-module $B \otimes_{Z_l} \mathcal{T}^{(n)}$; here it is understood that $G(\bar{K}/K)$ acts on the tensor product by the diagonal action. Finally, $\Lambda$ will denote the ring of formal power series in an indeterminate $T$ with coefficients in $Z_l$.

Throughout, $F$ will denote a totally real number field of finite degree over $Q$, and we put $F_0 = F(\mu_l)$, $F_\infty = F(W)$. For each $n \ge 0$, let $F_n$ denote the unique subextension of $F_\infty/F_0$ of degree $l^n$ over
$F_0$; each $F_n$ is a totally imaginary quadratic extension of a totally real subfield, which we denote by $F_n^+$. Put $\Gamma = G(F_\infty/F_0)$.

For reasons that will be clear later, it is more natural to consider the analogue of the group $\mathcal{J}_l$ discussed earlier for $F_0^+$ rather than $F$ itself. To this end, let $A_n$ ($n \ge 0$) be the $l$-primary subgroup of the ideal class group of $F_n$, and let $A = \varinjlim A_n$, the inductive limit being taken relative to the homomorphisms induced by the inclusion of the divisor group of $F_n$ in the divisor group of $F_m$ when $n \le m$. Let $J$ denote complex conjugation. Once we choose an embedding of $F_\infty$ into the complex field $C$, there is a natural action of $J$ on $A$; it is easily seen that this action does not depend on the particular embedding chosen. We then have the decomposition $A = A^+ \oplus A^-$, where $A^+ = A^{1+J}$, $A^- = A^{1-J}$. Iwasawa has proposed that $A^-$ should provide a good analogue of $\mathcal{J}_l$ for $F_0^+$.

As a first step towards explaining the evidence for the analogy, we recall the following basic result of Iwasawa [8]. Let $(A^-)^\wedge = \operatorname{Hom}(A^-, Q_l/Z_l)$ be the Pontrjagin dual of the discrete group $A^-$. We define an action of $\Gamma$ on $(A^-)^\wedge$ by specifying that $(\sigma\varphi)(a) = \varphi(\sigma a)$ for $\sigma \in \Gamma$, $\varphi \in (A^-)^\wedge$ and $a \in A^-$. Fix a topological generator $\gamma_0$ of $\Gamma$. Then, as is well known, the $\Gamma$-structure on $(A^-)^\wedge$ gives rise to a unique $\Lambda$-module structure on $(A^-)^\wedge$ such that $\gamma_0\varphi = (1 + T)\varphi$ for all $\varphi \in (A^-)^\wedge$. Then Iwasawa proved, by arguments based on class field theory and the structure theory of noetherian $\Lambda$-modules, that there exist nonzero elements $f_1(T), \ldots, f_r(T)$ of $\Lambda$, $r$ being some nonnegative integer, such that there is an exact sequence
$$0 \to (A^-)^\wedge \to \bigoplus_{i=1}^{r}\Lambda/(f_i(T)) \to D \to 0, \tag{1}$$
where $D$ is some $\Lambda$-module of finite cardinality. Moreover, assuming that the choice of $\gamma_0$ is fixed, he showed that the power series
$$\zeta_l(F_0^+, T) = \prod_{i=1}^{r} f_i(T) \tag{2}$$
is uniquely determined by $A^-$ up to a unit in $\Lambda$. This power series plays a basic role in our work.
As indicated by our choice of notation, we believe that $\zeta_l(F_0^+, T)$ deserves to be called the $l$-adic zeta function of $F_0^+$.

3. Lichtenbaum's conjecture. Following [10], we first state the conjecture in terms of étale cohomology. For the definition and basic facts about étale cohomology, see [1]. Let $\mathcal{O}$ be the ring of integers of $F$, and $X$ the spectrum of the ring $\mathcal{O}[1/l]$. Let $j\colon \operatorname{Spec}(F) \to X$ be the natural inclusion. For each $n \ge 0$, we can view the $G(\bar{F}/F)$-module $W(n)$ as a sheaf for the étale topology for $\operatorname{Spec}(F)$, and we may take its direct image $j_*W(n)$ on $X$. Let $\zeta(F, s)$ be the complex zeta function of $F$. Finally, let $|\ |_l$ be the valuation of $Z_l$, normalized so that $|l|_l = l^{-1}$,
and let $|M|$ denote the cardinality of any finite set $M$.

Conjecture 1 (Lichtenbaum). Let $n$ be an odd positive integer. Then
(i) the $H^i(X, j_*W(n))$ are finite for all $i \ge 0$ and trivial for all $i \ge 2$;
(ii) $|H^1(X, j_*W(n))| / |H^0(X, j_*W(n))| = |\zeta(F, -n)|_l^{-1}$.

Lichtenbaum also conjectured that the same result is valid for $l = 2$. Note first that his conjecture would imply the following estimate for the denominator of the rational number $\zeta(F, -n)$. If $E$ is a field and $m$ a positive integer, let $w_m(E)$ denote the largest integer $k$ such that $G(E(\mu_k)/E)$ is annihilated by $m$. In particular, $w_1(E)$ denotes the number of roots of unity in $E$. Then it is easily seen that $|H^0(X, j_*W(n))| = |w_{n+1}(F)|_l^{-1}$, and so the conjecture predicts that $w_{n+1}(F)\,\zeta(F, -n)$ should be a rational integer for all odd positive integers $n$. When $n = 1$, this last assertion has been proven by Serre [14]. When $n > 1$, it is still unknown, although it has been verified in some special cases.

Henceforth we assume again that $l$ is odd. The next result, which is proven in [10], relates Conjecture 1 to the Iwasawa module $A^-$.

Theorem 1. For each odd positive integer $n$, $H^1(X, j_*W(n))$ is canonically isomorphic to $(A^-(n))^G$, where $G = G(F_\infty/F)$. Furthermore, the $H^i(X, j_*W(n))$ are trivial for all $i \ge 2$ if and only if $(A^-(n))^G$ is finite.

Using this theorem and the exact sequence (1), it is easy to obtain the following formulation of Lichtenbaum's conjecture for $F_0^+$. Let $\mathcal{O}_0^+$ be the ring of integers of $F_0^+$, $X_0^+$ the spectrum of $\mathcal{O}_0^+[1/l]$, and $j\colon \operatorname{Spec}(F_0^+) \to X_0^+$ the natural inclusion. Let $q_0$ denote the largest power of $l$ such that $\mu_{q_0} \subset F_0$, and let $\kappa\colon \Gamma \to 1 + q_0Z_l$ be the isomorphism defined by $\gamma(\zeta) = \zeta^{\kappa(\gamma)}$ for all $\zeta \in W$ and $\gamma \in \Gamma$.

Proposition 1. Let $n$ be an odd positive integer. Then
(i) the $H^i(X_0^+, j_*W(n))$ are finite for all $i \ge 0$ and trivial for all $i \ge 2$ if and only if $\zeta_l(F_0^+, \kappa(\gamma_0)^{-n} - 1) \ne 0$;
(ii) $|H^1(X_0^+, j_*W(n))| = |\zeta_l(F_0^+, \kappa(\gamma_0)^{-n} - 1)|_l^{-1}$.
Hence Lichtenbaum's conjecture is true for $F_0^+$ and the prime $l$ if and only if
$$|\zeta_l(F_0^+, \kappa(\gamma_0)^{-n} - 1)|_l = |w_{n+1}(F_0^+)\,\zeta(F_0^+, -n)|_l$$
for all odd positive integers $n$. In particular, this shows that Lichtenbaum's conjecture is essentially equivalent to the assertion that $\zeta_l(F_0^+, T)$ is an $l$-adic function of the type constructed by Kubota and Leopoldt [9] when $F_0^+$ is an abelian extension of $Q$. So far, this remarkable fact has only been proven for a rather special class of abelian extensions of $Q$; the precise result is given in §4. Even the much weaker assertion that
$\zeta_l(F_0^+, \kappa(\gamma_0)^{-n} - 1) \ne 0$ for all odd positive integers $n$ is unknown in general. However, it has been proven for $n = 1$ by rather deep arguments involving the $K_2$ of $F_0^+$ (see §6). Of course, it is trivially true that $\zeta_l(F_0^+, \kappa(\gamma_0)^{-n} - 1) \ne 0$ for all but a finite number of integers $n$.

4. The analytic theory. The results in this section are based on the fundamental ideas introduced by Iwasawa [6], [7]. These, in turn, have their origin in a classical theorem of Stickelberger [17].

Let $\chi$ be a primitive Dirichlet character satisfying $\chi(-1) = -1$. We view the values of $\chi$ as lying in the algebraic closure of $Q_l$, and let $\mathcal{O}_\chi$ be the ring generated over $Z_l$ by the values of $\chi$. Let $\Lambda_\chi$ be the ring of formal power series in $T$ with coefficients in $\mathcal{O}_\chi$. In [7], Iwasawa has associated with $\chi$ an element $g(T; \chi)$ of the quotient field of $\Lambda_\chi$. Define $f(T; \chi)$ to be either $g(T; \chi)$ or $(T - l)\,g(T; \chi)$ according as $\chi \ne \omega$ or $\chi = \omega$; here $\omega$ is the Dirichlet character modulo $l$ satisfying $\omega(a) \equiv a \bmod lZ_l$ for all integers $a$. We shall only consider those $\chi$ which have order prime to $l$; and in this case, $f(T; \chi)$ is an element of $\Lambda_\chi$.

Suppose now that $F$, in addition to being totally real, is an abelian extension of $Q$ of degree prime to $l$. Let $\Phi$ be the character of any imaginary representation of $G(F_0/Q)$ which is irreducible over $Q_l$, let $e_\Phi$ be the associated orthogonal idempotent in the group ring $Z_l[G(F_0/Q)]$, and let $\phi$ be the primitive Dirichlet character associated with an absolutely irreducible component of $\Phi$. We have the direct sum decomposition $A^- = \bigoplus_\Phi e_\Phi A^-$, where $\Phi$ runs over all distinct characters of imaginary representations of $G(F_0/Q)$ irreducible over $Q_l$. Let $(e_\Phi A^-)^\wedge$ be the Pontrjagin dual of $e_\Phi A^-$, it being endowed with a $\Gamma$-module structure in the same way as $(A^-)^\wedge$.

Conjecture 2. Let $\Phi$ be the character of an imaginary representation of $G(F_0/Q)$, irreducible over $Q_l$.
Then, for a suitable choice of the topological generator of $\Gamma$, there is an exact sequence of $\Lambda$-modules
$$0 \to (e_\Phi A^-)^\wedge \to \Lambda_\phi/(f(T; \phi)) \to D_\Phi \to 0,$$
where $D_\Phi$ is a finite $\Lambda$-module.

The following result is then not difficult to establish (cf. [10]).

Proposition 2. If Conjecture 2 is valid for $F$ and $l$, then Conjecture 1 is valid for $F$ and $l$.

Before stating the actual result we can prove in the direction of Conjecture 1, we recall the definition of a wild prime of a number field. Let $E$ be a finite extension of $Q$, and $\mathfrak{p}$ a nonarchimedean prime of $E$ lying above a rational prime $p$. Then
we say that $\mathfrak{p}$ is wild if $\mu_p$ is contained in the completion of $E$ at $\mathfrak{p}$. Note that a prime of $F_0^+$ lying above $l$ is wild if and only if it splits in $F_0$.

Theorem 2. Let $F$ be a totally real abelian extension of $Q$. Assume that (i) $l$ does not divide the degree of $F$ over $Q$, (ii) no prime of $F_0^+$ lying above $l$ is wild, and (iii) there exists $a_0 \in A_0^-$ such that $A_0^- = Z_l[G(F_0/Q)]\,a_0$. Then Conjecture 2 is valid for $F$ and $l$.

When $F = Q$, this result is due to Iwasawa [7]. The general result is proven in [4]. We shall see in the next section that condition (ii) of Theorem 2 is a natural one in the theory, being equivalent to the nonvanishing of $\zeta_l(F_0^+, T)$ at $T = 0$. Unfortunately, condition (iii) is very restrictive, and difficult to verify for any particular field. However, we give a number of examples of fields to which it applies in §7. It should also be noted that Leopoldt [11] has proven that (iii) is valid if the class number of $F_0^+$ is prime to $l$.

5. The vanishing of $\zeta_l(F_0^+, T)$ at $T = 0$. It is shown in [4] that there is a close connexion between the vanishing of $\zeta_l(F_0^+, T)$ at $T = 0$ and the existence of wild primes of $F_0^+$ lying above $l$.

Theorem 3. $\zeta_l(F_0^+, T)$ vanishes at $T = 0$ if and only if at least one prime of $F_0^+$ lying above $l$ is wild. Furthermore, if $\zeta_l(F_0^+, T)$ vanishes at $T = 0$, the order of the zero at $T = 0$ is greater than or equal to the number of wild primes of $F_0^+$ lying above $l$.

Presumably, the exact order of the zero at $T = 0$ is the number of wild primes of $F_0^+$ lying above $l$, but we have been unable to prove this in general so far. In connexion with Theorem 3, it may be of interest to note the following analogous fact for the complex zeta function. Let $S_0$ be the set of primes of $F_0$ lying above $l$, and put $\zeta_{S_0}(F_0, s) = \zeta(F_0, s)\prod_{\mathfrak{p} \in S_0}(1 - (N\mathfrak{p})^{-s})$, where $N\mathfrak{p}$ denotes the norm of $\mathfrak{p} \in S_0$.
Similarly, let $S_0^+$ be the set of primes of $F_0^+$ lying above $l$, and put $\zeta_{S_0^+}(F_0^+, s) = \zeta(F_0^+, s)\prod_{\mathfrak{p} \in S_0^+}(1 - (N\mathfrak{p})^{-s})$. Then the complex function
$$\zeta_{S_0}(F_0, s)/\zeta_{S_0^+}(F_0^+, s)$$
vanishes at $s = 0$ if and only if at least one prime of $F_0^+$ lying above $l$ is wild. Furthermore, if it does vanish at $s = 0$, the order of the zero is the number of wild primes of $F_0^+$ lying above $l$. Note also that, in the special case in which $F$ is an abelian extension of $Q$ of degree prime to $l$, Conjecture 2 is in accord with Theorem 3 since $f(T; \phi)$ vanishes at $T = 0$ if and only if $\phi(l) = 1$.

We also mention the following corollary of Theorem 3.
Corollary. Let the integer $e_n$ be defined by $|A_n^-| = l^{e_n}$. Then, for all sufficiently large $n$, we have $e_n = \lambda^- n + \mu^- l^n + \nu^-$, where $\lambda^-$, $\mu^-$, $\nu^-$ are integers not depending on $n$, and where $\lambda^-$ is greater than or equal to the number of wild primes of $F_0^+$ lying above $l$.

As a simple example of the corollary, assume that $l \equiv 3 \bmod 4$, and take $F$ to be any real quadratic field with discriminant of the form $lf$, where $f$ is a quadratic nonresidue modulo $l$. Then it is easily seen that the unique prime of $F_0^+$ lying above $l$ is wild, and so, for this choice of $F$ and $l$, we have $\lambda^- \ge 1$.

The proof of Theorem 3 given in [4] is based on the étale topology. The key step in its proof is the following result, which is established in [4]. If $\mathfrak{p}$ is any prime of $F$, let $F_\mathfrak{p}$ be the completion of $F$ at $\mathfrak{p}$. For each $n \ge 0$, we define $w_n^{(l)}(F_\mathfrak{p})$ to be the maximal number of $l$-power roots of unity in any extension of $F_\mathfrak{p}$ of degree $n$.

Theorem 4 (Lichtenbaum). Let $n$ be an odd positive integer. Let $G = G(F_\infty/F)$, and assume that $(A^-(n))^G$ is finite. Then the order of $(A^-(n))^G$ is divisible by $\prod_{\mathfrak{p} \mid l} w_n^{(l)}(F_\mathfrak{p})$, where the product is taken over all primes $\mathfrak{p}$ of $F$ lying above $l$.

When $n = 1$, this result was pointed out several years ago by Tate in a different, but equivalent, context (cf. §6). Note that since $n$ is odd, the integer $\prod_{\mathfrak{p} \mid l} w_n^{(l)}(F_\mathfrak{p})$ is greater than 1 only if at least one prime of $F_0^+$ lying above $l$ is wild. Also, recalling the isomorphism $H^1(X, j_*W(n)) \cong (A^-(n))^G$, we see that Theorem 4 and Conjecture 1 suggest the following divisibility assertion for $w_{n+1}(F)\,\zeta(F, -n)$.

Conjecture 3. Let $n$ be an odd positive integer. Then $w_{n+1}(F)\,\zeta(F, -n)$ is an $l$-integer which is divisible by $\prod_{\mathfrak{p} \mid l} w_n^{(l)}(F_\mathfrak{p})$, where the product is taken over all primes $\mathfrak{p}$ of $F$ lying above $l$.

Thus the following result, which is established in [4], can be viewed as giving indirect evidence for Conjectures 1 and 2.

Theorem 5. Assume that $F$ is a totally real abelian extension of $Q$.
Then Conjecture 3 is valid for $F$ and all $l$.

A particular example of Theorem 5 is the following. Assume that $l \equiv 3 \bmod 4$, and that $F$ is a real quadratic field with discriminant of the form $lf$, where $f$ is a quadratic nonresidue modulo $l$. Then, for each $r \ge 0$,
$$w_{l^r(l-1)/2+1}(F)\,\zeta(F, -l^r(l-1)/2)$$
is an $l$-integer which is divisible by $l^{r+1}$.
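The integrality half of this statement can be checked by machine in the simplest abelian case. The following sketch assumes $F = Q$; it computes $w_m(Q)$ for even $m$ from the definition in §3 (the largest $k$ such that $(Z/kZ)^\times \cong G(Q(\mu_k)/Q)$ is killed by $m$) using a standard closed formula, and takes $\zeta(Q, -n) = -B_{n+1}/(n+1)$. The closed formula and all function names are assumptions made for illustration and are not taken from [4].

```python
from fractions import Fraction
from math import comb, gcd


def zeta_Q_at_negative(n):
    """zeta(Q, -n) = -B_{n+1}/(n+1), via the Bernoulli recurrence."""
    B = [Fraction(1)]
    for k in range(1, n + 2):
        B.append(-sum(comb(k + 1, j) * B[j] for j in range(k)) / (k + 1))
    return -B[n + 1] / (n + 1)


def is_prime(p):
    return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))


def w_Q(m):
    """w_m(Q) for even m: product of p^(1 + v_p(m)) over primes p with
    (p - 1) | m, with one extra factor of 2 at p = 2."""
    if m <= 0 or m % 2:
        raise ValueError("this sketch only treats even m > 0")
    k = 1
    for p in range(2, m + 2):
        if is_prime(p) and m % (p - 1) == 0:
            e, t = 1, m
            while t % p == 0:
                t //= p
                e += 1
            k *= p ** (e + 1 if p == 2 else e)
    return k


def killed_by(m, k):
    """True if a^m = 1 for every unit a of Z/kZ."""
    return all(pow(a, m, k) == 1 for a in range(1, k) if gcd(a, k) == 1)


if __name__ == "__main__":
    for n in (1, 3, 5, 7, 9, 11):
        print(n, w_Q(n + 1), w_Q(n + 1) * zeta_Q_at_negative(n))
```

The values $w_2(Q) = 2^3 \cdot 3$ and $w_4(Q) = 2^4 \cdot 3 \cdot 5$ produced this way agree with the values of $w_2(F)$ and $w_4(F)$ quoted for the fields in §7.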
6. Connexion with K-theory. There is a remarkable and useful connexion between the questions discussed in the preceding sections and the K-theory of number fields. As this connexion has only been proven for $K_2$, we limit most of our discussion to this case.

We first recall one of the several equivalent definitions of the $K_2$ of a field (cf. [3], [10]). Let $E$ be a field. Then $K_2E$ is the abelian group generated by the symbols $\{a, b\}$, where $a$ and $b$ run over all nonzero elements of $E$, subject to the relations
$$\{ac, b\} = \{a, b\}\{c, b\}, \quad \{a, bc\} = \{a, b\}\{a, c\}, \quad \{a, 1 - a\} = 1.$$
Suppose now that $v$ is any discrete valuation of $E$, and let $E_v^\times$ denote the multiplicative group of the residue field of $v$. The tame symbol at $v$ is the homomorphism $\lambda_v\colon K_2E \to E_v^\times$ defined by mapping $\{a, b\}$ to the residue class of $(-1)^{v(a)v(b)}a^{v(b)}/b^{v(a)}$. Assume now that $E$ is a finite extension of $Q$. We define the tame kernel of $E$, which we denote by $R_2E$, to be the intersection of the kernels of the $\lambda_v$ for $v$ ranging over all nonarchimedean primes of $E$. By a deep theorem of Garland [5], $R_2E$ is a finite group. The following result (cf. [3], [10]), whose proof relies heavily on the work of Tate [18], relates $R_2E$ to Iwasawa's theory.

Theorem 6. Let $F$ be a totally real number field of finite degree over $Q$, and let $l$ be an odd prime number. Then the $l$-primary subgroup of $R_2F$ is canonically isomorphic to $(A^-(1))^{G(F_\infty/F)}$.

Corollary 1. For each totally real number field $F$, we have $\zeta_l(F_0^+, \kappa(\gamma_0)^{-1} - 1) \ne 0$.

Corollary 2. For each totally real number field $F$, $(A^-(1))^{G(F_\infty/F)}$ is zero except for a finite number of primes $l$.

These results provide further evidence for the validity of Conjecture 1. In fact, Birch and Tate [18] had conjectured, on the basis of some computations on the order of the tame kernel, that $|R_2F| = w_2(F)\,\zeta(F, -1)$. Theorem 6 shows that, except for the 2-primary subgroup of $R_2F$, their conjecture is just the special case of Lichtenbaum's conjecture given by $n = 1$.
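The tame symbol defined above is easy to evaluate when $E = Q$ and $v$ is the $p$-adic valuation, where the residue field is $Z/pZ$. The sketch below is only an illustration under that assumption; the function names are hypothetical. It also checks on a few inputs that the symbols $\{a, 1 - a\}$ map to 1, as they must for the map to be defined on $K_2Q$.

```python
from fractions import Fraction


def valuation(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("valuation of 0 is undefined here")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def tame_symbol(a, b, p):
    """Residue class mod p of (-1)^(v(a)v(b)) * a^v(b) / b^v(a)."""
    a, b = Fraction(a), Fraction(b)
    va, vb = valuation(a, p), valuation(b, p)
    sign = -1 if (va * vb) % 2 else 1
    u = sign * a ** vb / b ** va          # a p-adic unit, as a Fraction
    # Reduce the unit modulo p using the modular inverse of its denominator.
    return u.numerator * pow(u.denominator, -1, p) % p


if __name__ == "__main__":
    print(tame_symbol(3, 2, 3), tame_symbol(2, 3, 3))
    print(tame_symbol(3, 1 - 3, 3))       # a Steinberg symbol {a, 1 - a}
```

Elements of the tame kernel $R_2Q$ are exactly those on which every such map, for every prime $p$, takes the value 1.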
Furthermore, by Proposition 2, the Birch-Tate conjecture is true for the $l$-primary subgroup of $R_2F$ when $F$ and $l$ satisfy the conditions of Theorem 2.

Finally, we remark that there may well be a connexion between the higher K-groups and Iwasawa's theory. For, if $\mathcal{O}$ denotes the ring of integers of $F$, Bass [2] has proven that the inclusion of $\mathcal{O}$ in $F$ induces a surjective homomorphism $K_2\mathcal{O} \to R_2F$. Presumably this homomorphism is an isomorphism, but this is not
known yet.¹ In the light of this remark and Theorem 6, it seems natural to conjecture that, for any totally real number field $F$ and any odd positive integer $n$, the $l$-primary subgroup of $K_{2n}\mathcal{O}$ is canonically isomorphic to $(A^-(n))^{G(F_\infty/F)}$.

7. Numerical examples. We now give some numerical examples to illustrate the general theorems and conjectures discussed before.

Example 1. Assume that $l$ is an odd prime number $\le 4001$, and take $F$ to be any totally real subfield of $\mathbf{Q}(\mu_l)$. Then the hypotheses of Theorem 2 are well known to be valid for $F$ and $l$. In particular, Conjecture 1 is valid for $F$ and $l$.

Example 2. Let $F$ be a real quadratic field whose discriminant $d$ is either prime to 3 or of the form $3(3m+1)$, where $m$ is a positive integer. Assume that the 3-primary subgroup of the ideal class group of $\mathbf{Q}((-3d)^{1/2})$ is a cyclic abelian group. Then the hypotheses of Theorem 2 are valid for $F$ and the prime $l = 3$. In particular, Conjecture 1 is valid for $F$ and $l = 3$.

Example 3. Let $F = \mathbf{Q}((11)^{1/2})$, and take $l = 7$. Condition (ii) of Theorem 2 is valid because 7 splits in $\mathbf{Q}((11)^{1/2})$ and is totally ramified in $\mathbf{Q}(\mu_7)$. Furthermore, if $C$ denotes the 7-primary subgroup of the ideal class group of $\mathbf{Q}(\mu_{308})$, tables of class numbers [13] show that $|C^-| = 7$. Hence, since the degree of $\mathbf{Q}(\mu_{308})$ over $F_0$ is prime to 7, we have $|A_0^-| = 1$ or 7. Thus Theorem 2 is valid for $F$ and $l = 7$. Now $w_2(F) = 2^3\cdot 3$, $\zeta(F, -1) = \pm 7/(2\cdot 3)$, and so, by Proposition 2, $(A^-(1))^{G(F_\infty/F)}$ has order 7, whence, by Theorem 6, the 7-primary subgroup of $R_2F$ has order 7. Several years ago, before Theorem 6 was proven, Birch and Atkin found convincing numerical evidence for the validity of this last assertion by direct computations. But they could not prove it at the time because the definition of $R_2F$ involves infinitely many relations.

Example 4. Let $F = \mathbf{Q}((19)^{1/2})$, and take $l = 19$. Since $\mathbf{Q}((-19)^{1/2})$ is a subfield of $\mathbf{Q}(\mu_{19})$, we have $F(\mu_{19}) = \mathbf{Q}(\mu_4, \mu_{19})$.
Now 19 stays prime in $\mathbf{Q}(\mu_4)$ and is totally ramified in $\mathbf{Q}(\mu_{19})$. Hence condition (ii) of Theorem 2 is valid. Furthermore, tables of class numbers [13] show that $|A_0^-| = 19$. Thus Theorem 2 is valid for $F$ and $l = 19$. Now $w_2(F) = 2^3\cdot 3$, $\zeta(F, -1) = \pm 19/(2\cdot 3)$, and so, in particular, $(A^-(1))^{G(F_\infty/F)}$ has order 19 by Proposition 2. Also it follows from Theorem 6 that the 19-primary subgroup of $R_2F$ has order 19.

The remaining four examples are of nonabelian totally real cubic fields $F$. The values of $\zeta(F, -n)$ given have been computed, by hand, by Mr. A. Candiotti and myself, using a remarkably simple formula for $\zeta(F, -n)$ discovered recently by Siegel [16]. For each of the four fields given, it is easily seen that $w_2(F) = 2^3\cdot 3$, $w_4(F) = 2^4\cdot 3\cdot 5$. Note that, in each example, $w_2(F)\,\zeta(F, -1)$ and $w_4(F)\,\zeta(F, -3)$ are integers, in accordance with Conjecture 1.

¹ See Note Added in Proof.
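The values of $\zeta(F,-1)$ quoted in Examples 3 and 4 can be cross-checked by machine. The sketch below is not the computation carried out in the text. It assumes the classical closed formula for a real quadratic field of discriminant $D$, namely $\zeta(F,-1) = \frac{1}{60}\sum_{b^2<D,\ b\equiv D\ (\mathrm{mod}\ 2)}\sigma_1\big((D-b^2)/4\big)$, takes $w_2(F) = 24$ from the examples, and uses illustrative function names of its own.

```python
from fractions import Fraction


def sigma1(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def zeta_minus_one(D):
    """zeta(F, -1) for the real quadratic field of discriminant D.

    Assumed formula: (1/60) * sum of sigma1((D - b^2)/4) over integers b
    with b^2 < D and b^2 congruent to D modulo 4.
    """
    total = 0
    b = 0
    while b * b < D:
        if (D - b * b) % 4 == 0:
            term = sigma1((D - b * b) // 4)
            total += term if b == 0 else 2 * term  # b and -b give the same term
        b += 1
    return Fraction(total, 60)


def l_part(n, l):
    """Largest power of the prime l dividing the positive integer n."""
    power = 1
    while n % l == 0:
        n //= l
        power *= l
    return power


W2 = 24  # w_2(F) = 2^3 * 3, as stated in Examples 3 and 4

# Q(sqrt 11) has discriminant 44, Q(sqrt 19) has discriminant 76.
for D, l in [(44, 7), (76, 19)]:
    z = zeta_minus_one(D)
    predicted_order = W2 * z  # order of R_2 F predicted by Birch-Tate
    print(D, z, predicted_order, l_part(int(predicted_order), l))
```

Under these assumptions the script reproduces $7/(2\cdot 3)$ and $19/(2\cdot 3)$, and the $l$-parts of $w_2(F)\,\zeta(F,-1)$ are 7 and 19, in agreement with the orders stated in the two examples.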
Example 5. Take $F = \mathbf{Q}(\theta)$, where $\theta$ is a root of $x^3 - 4x + 2$. The discriminant of $F$ is $2^2\cdot 37$. We have $\zeta(F, -1) = \pm 1/3$, $\zeta(F, -3) = \pm 577/(2\cdot 3\cdot 5)$.

Example 6. Take $F = \mathbf{Q}(\theta)$, where $\theta$ is a root of $x^3 - 4x + 1$. The discriminant of $F$ is 229. We have $\zeta(F, -1) = \pm 2/3$, $\zeta(F, -3) = \pm 1333/(2\cdot 3\cdot 5)$.

Example 7. Take $F = \mathbf{Q}(\theta)$, where $\theta$ is a root of $x^3 - 5x + 3$. The discriminant of $F$ is 257. We have $\zeta(F, -1) = \pm 2/3$, $\zeta(F, -3) = \pm 1891/(3\cdot 5)$.

Example 8. Take $F = \mathbf{Q}(\theta)$, where $\theta$ is a root of $x^3 - 6x + 2$. The discriminant of $F$ is $2^2\cdot 3^3\cdot 7$. We have $\zeta(F, -1) = \pm 13/3$, $\zeta(F, -3) = \pm (7^2\cdot 3589)/(2\cdot 3\cdot 5)$. This example is particularly interesting in connexion with Conjecture 3. For, if $\mathfrak{p}$ denotes the prime of $F$ lying above 7 which is ramified in the extension $F/\mathbf{Q}$, it is easily seen that $\mu_7$ is contained in an extension of $F_{\mathfrak{p}}$ of degree 3. Thus Conjecture 3 predicts that $w_4(F)\,\zeta(F, -3)$ should be divisible by 7, which is indeed the case.

This research was supported in part by NSF grant GP-9152 and the U.S. Army Office of Research (Durham).

Added in Proof. D. Quillen has recently proven that $K_2\mathcal{O} \cong R_2F$ for all finite extensions $F$ of $\mathbf{Q}$. Also, R. Greenberg has shown that the order of the zero of Ci(Fq , T) is exactly the number of wild primes of F£ above $l$ when $F$ is an abelian extension of $\mathbf{Q}$ and $l$ any odd prime number.

References

1. M. Artin, Grothendieck topologies, Mimeographed notes, Harvard University, Cambridge, Mass., 1962.
2. H. Bass, $K_2$ des corps globaux, Séminaire Bourbaki, Exposé 394, 1971.
3. J. Coates, On $K_2$ and some classical conjectures in algebraic number theory, Ann. of Math. (2) 95 (1972), 99-116.
4. J. Coates and S. Lichtenbaum, On l-adic zeta functions, Ann. of Math. (to appear).
5. H. Garland, Finiteness theorem for $K_2$ of a number field, Ann. of Math. (2) 94 (1971), 534-548.
6. K. Iwasawa, Analogies between number fields and function fields, Proc. Annual Sci. Conf. Some Recent Advances in the Basic Sciences, vol. 2 (Belfer Grad.
School Sci., Yeshiva Univ., New York, 1965/66), Belfer Graduate School of Science, Yeshiva Univ., New York, 1969, pp. 203-208. MR 41 #172.
7. K. Iwasawa, On p-adic L-functions, Ann. of Math. (2) 89 (1969), 198-205. MR 42 #4522.
8. K. Iwasawa, On $\mathbf{Z}_l$-extensions of algebraic number fields, Ann. of Math. (2) 96 (1972), 338-360.
9. T. Kubota and H. W. Leopoldt, Eine p-adische Theorie der Zetawerte. I. Einführung der p-adischen Dirichletschen L-Funktionen, J. Reine Angew. Math. 214/215 (1964), 328-339. MR 29 #1199.
10. S. Lichtenbaum, On the values of zeta and L-functions. I, Ann. of Math. (to appear).
11. H. W. Leopoldt, Zur Arithmetik in abelschen Zahlkörpern, J. Reine Angew. Math. 209 (1962), 54-71. MR 25 #3034.
12. J. Milnor, Introduction to algebraic K-theory, Ann. of Math. Studies, no. 72, Princeton Univ. Press, Princeton, N.J., 1971.
13. Guntram Schrutka v. Rechtenstamm, Tabelle der (Relativ)-Klassenzahlen der Kreiskörper, deren φ-Funktion des Wurzelexponenten (Grad) nicht grösser als 256 ist, Abh. Deutsch. Akad. Wiss. Berlin Kl. Math. Phys. Tech. 1964, no. 2, 64 pp. MR 29 #4918.
14. J.-P. Serre, Cohomologie des groupes discrets, Ann. of Math. Studies, no. 70, Princeton Univ. Press, Princeton, N.J., 1971.
15. C. Siegel, "Über die analytische Theorie der quadratischen Formen. III," in Gesammelte Abhandlungen. Band I, Springer-Verlag, Berlin and New York, 1966, pp. 469-548. MR 33 #5441.
16. C. Siegel, Berechnung von Zetafunktionen an ganzzahligen Stellen, Nachr. Akad. Wiss. Göttingen Math.-Phys. Kl. II 1969, 87-102. MR 40 #5570.
17. L. Stickelberger, Über eine Verallgemeinerung der Kreistheilung, Math. Ann. 37 (1890), 321-367.
18. J. Tate, Symbols in arithmetic, Proc. Internat. Congress Math. (Nice, 1970), vol. 1, Gauthier-Villars, Paris, 1971, pp. 201-212.
19. A. Weil, Variétés abéliennes et courbes algébriques, Hermann, Paris, 1948.

Stanford University
THE DISTRIBUTION OF VALUES OF EULER'S PHI FUNCTION

HAROLD G. DIAMOND¹

AMS 1970 subject classifications. Primary 10H25; Secondary 10K20.
¹ Research supported in part by NSF grant 21335.
© 1973, American Mathematical Society

A number of facts are known about the distribution of values of Euler's phi function. A classical theorem of Schoenberg [10] asserts that $\varphi(n)/n$ has a continuous distribution function. That is, there exists a continuous monotone function $f$ with $f(0) = 0$, $f(1) = 1$ such that, as $x \to \infty$,

(1) $\dfrac{1}{x}\,\#\{n \in [1, x] : \varphi(n)/n \le \alpha\} \to f(\alpha), \qquad 0 \le \alpha \le 1.$

In geometric terms, the left side of (1) is the proportion of integers $n \le x$ for which the point $(n, \varphi(n))$ lies below the line $t = \alpha s$ in the $s$-$t$ plane. Another estimate of the distribution of values of $\varphi(n)$ ([4], [2], [1]) is, for $y \to \infty$,

(2) $\#\{n : \varphi(n) \le y\} \sim \zeta(2)\zeta(3)\,y/\zeta(6).$

($\zeta$ is the Riemann zeta function.) The left side of (2) is the number of points $(n, \varphi(n))$ lying in the semi-infinite horizontal strip $\{(s, t) : 0 < s < \infty,\ 0 < t \le y\}$.

Here we shall investigate a similar problem for rectangles. Let

$$\Phi(x, y) = \#\{n \le x : \varphi(n) \le y\},$$

i.e. the number of points $(n, \varphi(n))$ lying in the rectangle $(0, x] \times (0, y]$. Clearly, $\Phi(x, y) = 0$ for $y < 1$ and $\Phi(x, y) = [x]$ for $y \ge x$. Let $g$ be defined on $\mathbf{R}$ by setting $g(\alpha) = 0$ if $\alpha \le 0$; $g(\alpha) = 1$ if $\alpha \ge 1$; and for $0 < \alpha < 1$ set

$$g(\alpha) = \frac{1}{2\pi i}\int_{\mathscr{C}} \prod_p \big\{1 - p^{-1} + p^{-1}(1 - p^{-1})^{-z}\big\}\,\frac{\alpha^z\,dz}{z(1-z)},$$

where $\mathscr{C}$ is any line from $a - i\infty$ to $a + i\infty$ with $a \in (0, 1)$. We shall prove the
Theorem. Let $c$ be a fixed number in $(0, \tfrac18)$. If $1 \le y < x$,

$$\Phi(x, y) = x\,g(y/x) + O\big(y \exp(-\{c \log ey\, \log\log e^2 y\}^{1/2})\big).$$

The constant implied by the $O$ depends only on the value of $c$.

[Figure: graph of the points $(n, \varphi(n))$; horizontal axis from 0 to 600, vertical axis from 0 to 500. Graph courtesy of H.-E. Richert and H. Siebert.]

Our device for counting the integers $n$ satisfying both the inequalities $n < x$ and $\varphi(n) < y$ is set out in the following formula, which is valid for any $a, b > 0$:

$$(2\pi i)^{-2}\int_{z = b - i\infty}^{b + i\infty}\int_{s = a - i\infty}^{a + i\infty}\Big(\frac{x}{n}\Big)^{s}\Big(\frac{y}{\varphi(n)}\Big)^{z}\,\frac{ds\,dz}{s\,z} = \begin{cases} 1, & n < x \text{ and } \varphi(n) < y,\\ 0, & n > x \text{ or } \varphi(n) > y.\end{cases}$$
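For small arguments the counting function of the theorem can simply be tabulated, which is a convenient sanity check on its definition and boundary values. The following is a minimal sketch by direct enumeration, not the analytic method of the paper; the names `totients` and `big_phi` are illustrative.

```python
def totients(limit):
    """Return a list phi with phi[n] = Euler's phi(n) for 0 <= n <= limit."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # still untouched, so p is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p  # multiply phi[m] by (1 - 1/p)
    return phi


def big_phi(x, y):
    """Number of integers n with 1 <= n <= x and phi(n) <= y."""
    x = int(x)
    phi = totients(x)
    return sum(1 for n in range(1, x + 1) if phi[n] <= y)


print(big_phi(100, 0.5))   # no points below the line t = 1
print(big_phi(100, 100))   # every n <= x counts once y >= x
print(big_phi(100, 10))    # points in the rectangle (0, 100] x (0, 10]
```

The first two calls illustrate the boundary cases of the counting function (zero for $y < 1$, and $[x]$ for $y \ge x$). The third returns 20: the integers with $\varphi(n) \le 10$ are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 18, 20, 22, 24, 30, so that already at $y = 10$ the count is close to $\zeta(2)\zeta(3)\,y/\zeta(6) \approx 1.94\,y$.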
DISTRIBUTION OF VALUES OF EULER'S PHI FUNCTION 65 We exploit this idea by using a generating function of two complex variables and applying Perron's inversion formula twice. Generating function. Let s=c + it and z = £ + *>/ be complex variables. Let Sa = {(j, z):o- + £>a}, and on Sx define F(s,z)=£ n-scp(ny\ The series converges on Sx because n^(p(n) = nY[(l-p-1)^n f] (1 -p-^Pn/log logw. p\n p^c log n (See [6, pp. 267-268].) It is easy to see by uniform convergence that F is an analytic function of s and z on Sx. Fhas a product representation valid on Sx. Since n*-+n~s(p(n)~z is multiplicative, we have F(s,z)=Y\{i+p-s<p(p)-z+p-2wp2rz+p-3s<p(p3rz+-} p =n{i+p_s(p-i)"2(i+p"s"z+p"2s_22+-)} p =n u-p"s-z+p-s(p-i)-zK(s+z) p = defile Z)C(S + Z), where £ is Riemann's zeta function, the behavior of which is well known. We now set out some facts about J~[(s, z) for use below. To avoid extra estimates, we limit ourselves to sets of the form Sa = df {(o- + it, { + if): <r>0, £>0, <7 + £>a}. Lemma 1. The product defining J~[(^, z) converges and defines an analytic function of s andz on S£. Moreover U(s, z)«expf3l0gW\ if°^=l "l0g l0g '"l/l0g i,?l 11V' FVloglog|ij|/ analog log |>/|^ 10, and |r;|<expexp 10. The estimates are valid independently oft.
66 HAROLD G. DIAMOND Proof. We may assume that £^2 and <t^2, for otherwise the conclusions of the lemma are quite obvious. For ^Owe have the inequalities |(p-lp-p-'|£2(p-l)-< ^IzKp-iP"1. For log log \rj\ ^ 10, we write inMis n • n- 11 = n {l+2(p-l)-«-}gexpj £ 2(p-l)-«-j ^expJ2 + 2 J u-s-°dn(u)\^exp<A + 2 I M-1+«—j, 3/2 3/2 where £ = def log log\rj\/\og\rj\. If we set w=£logw the last integral becomes 2 (1/2) log log M log log |i/| + I + (MM) ewdw e log (3/2) i (1/2) log iog|ir| The first two integrals are estimated directly; the third after an integration by parts. We find that I'/I f ^ du 3 logM ^ f 3 1ogM] U"1+£i g- . , . if log log |i,|£ 10, and fl gexp^ + gl . J logw 2 log log|^| ,*,„, (. log log WJ 3/2 n = n (i+i2i(p-ir4—') p>l>;l p>l>;l gexp £ |2| (p-I)"*"-» (W-i) ^exp ^exp-M|z| |z| 2-c + W J logwj
DISTRIBUTION OF VALUES OF EULER'S PHI FUNCTION 67 These inequalities establish the O-estimate if log log \rj\ ^ 10. In the other case we have simply in(s,z)i<exPJ2:i2i(p-i)-«-'-iJ«i. The product converges uniformly on each set S* n{z: \z\ ^ M}, for any e > 0, M < oo, because log n {i-p_,"*+p"*(p-ir"} a<p<b < I P~a\(p-irz-P~z\ a<p<b <\z\ £ (p-l)-«-"-»-0 a<p<b uniformly as a, b-+co and (s, z)eS* n{z: |z|^M}. Thus the product defines an analytic function of s and z on 5£+. □ For fixed z the function s\-+F(s, z) has a pole with residue f](l—z, z)atj=l— z. We use this fact in our first application of Perron's formula. We assume in what follows that the variable x is sufficiently large that all estimates contingent on the size of x are sensible and valid. Lemma 2. Let c' be any fixed number in (0, \). Then xl~z X <p(n)~z = Yl(l-z>z)-< +0(xexp{-(c'logxloglogx)1/2}) n£x \—Z for x^x0(c') and z in the rectangle {z = b+irj:0<b£$, \rj| ^exp((y log* log logx)1/2)}. The constant implied by the O depends only on c'. Proof. All estimates that follow are uniform in z for z in the rectangle. We begin by observing that on any half plane {z:Re z^/?}, in(i-*.z)i= n i-p-'+p-1 P-\ ^n{i-P->+P->(i+^y}=o(i). Next, it suffices to prove the lemma in the case x = [x] +i This is so since
68 HAROLD G. DIAMOND m-^r-™+»i-om. 1-2 Now assume x = [x]+^ Let T = exp((logx log logx)1/2) and let a=\ + log log x/log x. Applying Perron's formula we obtain a + iT (3) 27TI 5 n<j, a-iT oo The error term<^x°T~l £ n" r|iogx/n| log- <: Z »-+(£Y Z r^V Z »■ ^x/e \X/ x/e<n<ex l^ — *| M^ex ^xT"1 log2*. We estimate the left side of (3) by deforming the contour. Let a! = 1—6 — (log logx)/(2 log*). The function s\-+F(s, z) has the aforementioned pole within the rectangle with vertices a'±iT, a±iT. Thus we have a + il -L f ds F(s,z)x° ■■l\{l-z,z)x1-'/{l-z) (4) a-iT a'-iT a' + iT a + iT +Uhhl}r<*** ds s ' a-iT a'-iT a' + iT We now estimate the last three integrals. I £- sup {IC(2 + 5-»T)n(«-ir,2)|}. The lower bound /loglogxY'2 « log log |ry| log |9/| is valid and thus we can apply Lemma 1 to obtain I~[(£-iT, z)«exp(3 log|if|/log log|!f|)«exp(18 logx/log logx)1/2.
DISTRIBUTION OF VALUES OF EULER'S PHI FUNCTION 69 Since |Im(z + £ — zT)|^l, we do not approach the pole of zeta. Consequently we may apply the familiar estimate (cf. [7, Theorem 9]) £(z + Z-iT)<Tl-a'-b/(l-a'-b) ( logx \1/2 <d -—f exp(2"1/2 log log*Hlog2*. \loglogx/ Thus a'-iT <4xT~l (logx)3 exp(18 logx/log logx)1/2. a-iT The same estimate applies for j^Vrr- For the remaining integral we write a' + iT T J «x*'sup {\az + "' + it)Y\(a' + it,z)\} -^—. a'-iT ~T Now a'=\ — b — o{\)^\ and hence the last integral is <^ log r<^ logx. We can apply Lemma 1 to obtain the estimate Y[(a' + it, z)<exp{(18 logx/log logx)1/2}. Since Re(z+<z' + zf) is less than 1, we can apply the zeta function estimate [7, Theorem 9] t(z + a' + it)<T1-a'-b/(l-a,-b)<\og2x. Thus we have a' + iT ^x*'(logx)3 exp{(18 logx/log logx)1/2} a'-iT «x(logx)3 exP{-^T72+(loglogJ Oogx log logx)1/2 <^xexp{-(c' logx loglogx)1/2}. The estimates of (3) and (4) establish the lemma. □ We now apply Perron's formula a second time. It is convenient to use an integrated form of the inversion formula here.
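One standard integrated form of Perron's formula is $\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} y^{z} z^{-2}\,dz = \log y$ for $y \ge 1$ and $= 0$ for $0 < y \le 1$, where $c > 0$. That this is the form intended here is an assumption, although it agrees with the factor $z^2$ in the denominator of (5). The sketch below checks the identity numerically by truncating the line of integration at height $T$; the function name and the parameters `c`, `T`, `steps` are illustrative choices.

```python
import cmath
import math


def integrated_perron(y, c=1.0, T=1000.0, steps=200_000):
    """Trapezoidal approximation to (1/(2*pi*i)) * integral of y**z / z**2 dz
    taken over the segment Re z = c, |Im z| <= T."""
    h = 2 * T / steps
    log_y = math.log(y)
    total = 0.0
    for k in range(steps + 1):
        t = -T + k * h
        z = complex(c, t)
        value = cmath.exp(z * log_y) / (z * z)
        weight = 0.5 if k in (0, steps) else 1.0
        # Only the real part is summed: the imaginary parts at t and -t cancel.
        total += weight * value.real
    # dz = i dt, so the factor 1/(2*pi*i) becomes 1/(2*pi).
    return total * h / (2 * math.pi)
```

Because the kernel decays like $|z|^{-2}$, the truncation error is at most $y^{c}/(\pi T)$, so the integral converges absolutely. This is the practical advantage of the integrated form over the unintegrated kernel $y^{z}/z$.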
70 HAROLD G. DIAMOND Lemma 3. Let a and 2c" be any fixed numbers in (0,1) and let %> be a line from a — ioo to a + icc. Let x^x0(c"). Then for any ae(0,1) ax a2 dz (5) J u 2ni J z2(l—z) 1 <€ + 0{x exp(-(c" log* log logx)1/2)}. The constant implied by the O depends only on c". Proof. Since ]~J(1 — z, z) is bounded and the integrand of ]"# is analytic in the strip 0 < Re z < 1, it is clearly sufficient to prove the theorem for the special choice ^ = {2 = (logx)"1+^:-oo<<J<oo}. Also, let T = exp{(^ log* log logx)1/2} and let ^t={z = (logx)-1+^:-T^^T}. Perron's formula and an integration by parts give ax ax = log—du<P{x, u)= #(x, uju"1 du. Since Lemma 2 applies only for z with |Imz|^r, we express the left side of (6) as We estimate \<€-<€x by noting that \^n^x<p(n)~z\^x and \(ocxf\^e. It follows that \$<g-<gr\^ex/(m;). By Lemma 2, f x f t-t , v a2 dz + o(xexp{-(c'logxloglogx)1/2} p*Li. h The last integral is O(logx) since Rez = (logx)_1. The integral in the main term
DISTRIBUTION OF VALUES OF EULER'S PHI FUNCTION 71 can be extended to an integral along the curve ^ with an error of 0(xt~2). Combining the above estimates with (6) we obtain (5), with c" any positive number less than c' of Lemma 2. □ Differencing. We now difference (5), evaluating it once as it stands and once with a(l +S) in place of a. The d will be taken as a positive function of x. Where convenient, we shall write y for ax. 1 C , Ju x f—^ x a2 f(l+<5)2-l] j /7x - *{x9u)— = —\n(l-z9z)— -<- -1 >dz V) d J v u 2tc*J11v z(l-z)[ Sz J y «• H-Ofx^"1 exp(-(c" logx log logx)1/2)}. We estimate the left side of (7), noting that # is monotone increasing in each variable. (8) *(x, y) ^U l *(x, u) ***<*+&, y+^ ^±5. The following estimate is easily seen to hold for 0 g Rez ^ 1: ((l+^-l)/&-l«min{l,a + a|z|}. Using this estimate, we express the first term on the right side of (7) as xg(oc) + 0(xd log logx + x<5 log^"1), where g(oc) is the function defined before the statement of the theorem. The log log* arises from the integral over the region where z is small. We take <5 = <5(x) = exp(-£(c" log* log logx)1/2), and let 0<c1<c'V4. With this choice and the above estimates, (7) becomes y + dy 1 C A (9) - <J>(x,u) — = xg(<x) + 0(xQxp{-(c1 log* log logx)1/2}). o J u y It is easily seen that g(<x) is real by dividing through (9) by x and letting x->oo. If we combine (8) and (9), taking x' = x(l+(5), y'=y(l+S) and on=y/x=y'/x\ we obtain the estimate
72 HAROLD G. DIAMOND (10) #(x, y)=xg(y/x) + 0{x exp(-(cx log* log logx)1/2)}, where cx is any number in (0, £). We shall now convert the error estimate into one in terms of y. Such an estimate is most interesting, of course, when y is much smaller than x. This case is close to the Erdos-Dressler-Bateman problem in the sense that if y is large but y/x is suitably small, then (C(2) C(3)/C(6)) y~*(oof y) = #(jc, y)~xg{y/x). We make this observation precise in the following lemma, which estimates g(ot) accurately for small positive values of a. Lemma 4. There exists an absolute positive constant k such that for 0<a< 1, i-(«|-^« + o{=xp(-expi Proof. g(oc) was defined by an integral just before the statement of the theorem. Shift the contour in this integral to the vertical line {£ + it: — oo < f < oo, £ = exp(fca)-1}, where A: is a positive constant to be specified below. The residue of the integrand at z = 1 is For Rez = £, we have the estimate ina-z.z^nf/rTno+o^-2)}. The last product is bounded, and 11 f-^rY^exp X -^-gexp^aoglogf + log^/e))} p<t\P-1/ P<$P~l for some absolute positive constant k. The factor e is included to make the final form of the lemma simpler. Thus, on the line Rez = £ we have the estimate f](l -z, z) az<exp{£(log logf + log(/c/e) + loga)} <exp{ —exp(fax)"1}.
DISTRIBUTION OF VALUES OF EULER'S PHI FUNCTION 73 The lemma now follows from the residue theorem and an application of the last estimate to j(4) ]~] (1 - z, z) a2 dzjz (1 - z). □ Proof of the Theorem. We have estimated #(x, y) in (10). It remains to convert the error estimate into terms of y. We treat three cases according to the size of x and y. If e2<y<x^y logy, then x exp(-(c! log* log logx)1/2)^y logy exp(-(c! logy log logy)1/2) <y exp(-(c logy log logy)1/2) for 0<c<C! <^ (c arbitrary). In this case conversion of the error term was trivial. Now suppose x>y\ogy>y0\ogy0 (for a suitable y0). Then $(x, y) = <P(y logy, y) since there exists no n>y logy for which (p(n) ^y. (Recall q>(n)p rt/log logw.) Using (10) once and the last lemma two times we obtain #(y logy, y) = ylogy0(l/logy) + O{y logy exp(-(cx logy log logy)1/2)} = (C(2) C(3)/C(6))y + 0{y logy exp(-exp{^1logy})} + 0{y logy exp(-(ci logy log logy)1/2)} = xg(y/x) + 0{ylogyQxp(-(cl logy log log y)1/2)} = xg(y/x) + 0{y exp(-(c logy log logy)1/2)}. Finally, suppose that 1 ^y^max(y0, e2) and x is arbitrary. In this case <P(x, y) is bounded and the formula for <&(x, y) can be made valid by suitably choosing the constant in the error term. This completes the proof of the theorem. □ Connections with Schoenberg's problem. The present method can be applied to the problem of estimating *l*(x> «) = def #{we[l, x] :<p(n)£<m}. We start with the generating function £ n-'{q>(n)/n}-* = F(s-z9z) and proceed to the formula f du x C„ xz dz (11) J 0(x, u) - = — I n(l -z, z)—r + 0{x exp(-(c'logx log logx)1'2)}, 0 <€
74 HAROLD G. DIAMOND where ]~J, #, and c' are as before. This time, however, the differencing argument does not work so easily. At fault is the function/occurring in (1), which is singular [3]. It was shown by Tjan [11] that /(a^)-/(a)<^(loglog/i-1)-1. With this estimate we can deduce the following theorem of Fainleib [5]: \j/ (x, a) = xf (a) + O (x/log log x). The last error term cannot be improved very much on account of the rather heavy concentration of points (n9 cp(n)) near certain rays through the origin. A formula connecting the functions/and g can be derived by comparing the integral defining g with equation (11). We have a a f „ , x du 1 f__ a2 dz C du 0 <€ 0 Now/is continuous, as we mentioned before; g is also continuous, by consideration of its integral. We can differentiate the last equation, establishing the differentiability of g on (0, 1] and obtaining for 0<a^ 1 the formula (12) /(a) = 0(a)-a0'(a). We can also establish (12) without knowledge of integral formulas for/and g. We can compare the number of points (n, <p (n)) lying in a rectangle with the number lying in a containing and a contained trapezoid. This method suggests the general problem of applying knowledge of/or g to estimate the number of points (n, q>(n)) lying in more general regions. Let0<a<j?^l. We have i/, ((j3/a)x, a) -i/, (x, a) ^ <J> ((/?/a)x, fix) - <J> (x, fix) ^ i/, ((/?/a) x, /?) - ^ (*, /?) • If we replace each <P and if/ by its asymptotic estimate, divide by x, and let x->oo, we get (f}ix-i)f(oL)mi«)g(«)-0mw*-i)f(P) or, forOga^l, f(a)£g(a)-*{(g(P)-g(«)W-«)}£f(P)-
DISTRIBUTION OF VALUES OF EULER'S PHI FUNCTION 75 Now/and g are continuous on [0, 1]. It follows that g is differentiable on (0, 1] and equation (12) is valid on (0, 1]. Lemma 4 implies that g has a derivative from the right at the origin and that its value is C(2) C(3)/f(6). Since/(1) = #(1) = 1, it follows from (12) that 0'(1) = O. Equation (12) also implies g' is continuous on (0, 1]. Finally, we shall use (12) and the fact that/is nondecreasing and singular to deduce that g' is nonincreasing and singular on (0, 1). If we difference (12) we find that /(« + £)-/(«) Jg'(« + fi)-.g'(g)| g(g + fi)-g(g) + cc< > = g (ct) + g (oc)-g a + £ . £ I e J £ The right side of the last equation goes to zero with £, proving both assertions about g'. It appears that the method we have described can be applied to other problems of estimating the number of points (n, f(n)) lying in a rectangle (0, x] x(0, y] for suitable/. Examples arQf(n) = <p(ri)a ovf(n) = Ga(ri) = J^dln da. I am indebted to Professor Harald Niederreiter for bringing articles [5] and [11] to my attention. References 1. Paul T. Bateman, The distribution of values of the Euler function, Acta Arith. 21 (1972), 329-345. 2. Robert E. Dressier, A density which counts multiplicity, Pacific J. Math. 34 (1970), 371-378. MR 42 #5940. 3. Paul Erdos, On the smoothness of the asymptotic distribution of additive arithmetical functions, Amer. J. Math. 61 (1939), 722-725. 4. , Some remarks on Euler's </> function and some related problems, Bull. Amer. Math. Soc. 51 (1945), 540-544. MR 7, 49. 5. A. S. Fainleib, Distribution of values of Euler's function, Mat. Zametki 1 (1967), 645-652 = Math. Notes 1 (1967), 428-432. MR 35 #6636. 6. G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, 4th ed., Oxford Univ. Press, London, 1960. MR 20 #828. 7. A. E. Ingham, The distribution of prime numbers, Cambridge Univ. Press, Cambridge, 1932. 8. M. 
Kac, Statistical independence in probability, analysis and number theory, Carus Math. Monographs, no. 12, Math. Assoc. Amer.; distributed by Wiley, New York, 1959. MR 22 #996.
9. J. Kubilius, Probabilistic methods in the theory of numbers, Gos. Izdat. Polit. Naucn. Lit. Litovsk. SSR, Vilna, 1962; English transl., Transl. Math. Monographs, vol. 11, Amer. Math. Soc., Providence, R.I., 1964. MR 26 #3691; MR 28 #3956.
10. I. J. Schoenberg, Über die asymptotische Verteilung reeller Zahlen mod 1, Math. Z. 28 (1928), 171-199.
11. M. M. Tjan, On the question of the distribution of values of the Euler function φ(n), Litovsk. Mat. Sb. 6 (1966), 105-119. (Russian) MR 34 #5780.

University of Illinois at Urbana-Champaign
ON CONNECTIONS BETWEEN THE TURÁN-KUBILIUS INEQUALITY AND THE LARGE SIEVE: SOME APPLICATIONS

P. D. T. A. ELLIOTT

AMS 1970 subject classifications. Primary 10K20, 10H30.
© 1973, American Mathematical Society

Let $f(n)$ be an additive function. Thus for coprime integers $a$ and $b$ the relation $f(ab) = f(a) + f(b)$ is satisfied. The Turán-Kubilius inequality states that there is a positive absolute constant $c_1$ so that

$$\sum_{m=1}^{n}\Big|f(m) - \sum_{p^{\nu} \le n} f(p^{\nu})p^{-\nu}\Big|^2 \le c_1 n \sum_{p^{\nu} \le n} |f(p^{\nu})|^2 p^{-\nu}.$$

This inequality is valid even if $f(n)$ assumes complex values. It can be viewed as a form of the law of large numbers for additive functions. It was proved first by Turán [3] for real-valued functions, and extended to complex-valued functions by Kubilius (see for example [2]). The proof needs little more than a careful use of the Cauchy-Schwarz inequality. We shall show by means of this inequality that there is sometimes a correspondence: operator $\to$ sufficiency, dual of operator $\to$ necessity.

We write $p^{\nu} \,\|\, n$ if $p^{\nu}$ divides $n$ and $p^{\nu+1}$ does not. Define

$$\varepsilon(p^{\nu}, m) = \begin{cases} p^{\nu/2}(1 - p^{-\nu}) & \text{if } p^{\nu} \,\|\, m,\\ -p^{-\nu/2} & \text{otherwise.}\end{cases}$$

Replacing each $f(p^{\nu})$ by $f(p^{\nu})p^{\nu/2}$ in the Turán-Kubilius inequality we can rewrite it in the form

$$\sum_{m=1}^{n}\Big|\sum_{p^{\nu} \le n} f(p^{\nu})\,\varepsilon(p^{\nu}, m)\Big|^2 \le c_1 n \sum_{p^{\nu} \le n} |f(p^{\nu})|^2.$$

Dualizing we arrive (after a little untangling) at
Lemma 1. For any complex numbers $a_m$ $(m = 1, \ldots, n)$

$$\sum_{p^{\nu} \le n} p^{\nu}\Big|\sum_{\substack{m=1\\ p^{\nu} \| m}}^{n} a_m - p^{-\nu}\sum_{m=1}^{n} a_m\Big|^2 \le c_1 n \sum_{m=1}^{n} |a_m|^2.$$

This is clearly an inequality of large sieve type, but with the usual uniformity $p \le n^{1/2}$ extended to $p^{\nu} \le n$. We give two applications of this lemma.

First application. We consider the Erdős-Wintner [1] theorem. This states that the frequencies

$$n^{-1}\sum_{\substack{m=1\\ f(m) \le z}}^{n} 1 \qquad (n = 1, 2, \ldots)$$

possess a limiting distribution if and only if the three series

$$\sum_{|f(p)| > 1} \frac{1}{p}, \qquad \sum_{|f(p)| \le 1} \frac{f(p)}{p}, \qquad \sum_{|f(p)| \le 1} \frac{f^2(p)}{p}$$

converge. To prove that these conditions are sufficient one can construct suitably 'small' probability spaces on which to study the function

$$f_r(m) = \sum_{p^{\nu} \| m;\ p^{\nu} \le r} f(p^{\nu}) \qquad (m = 1, \ldots, n)$$

with $r = (\log n)^{1/2}$. It is straightforward to prove that this function possesses a limiting distribution. To complete the proof of sufficiency one shows by means of the Turán-Kubilius inequality that the differences $f(m) - f_r(m)$ are on the whole small.

As to necessity one usually appeals to a Tauberian theorem concerning Dirichlet series. We now sketch an alternative method. In fact we shall prove that if $\varphi(t)$ is the characteristic function of a limiting distribution for the above frequencies, then there is an absolute constant $c_2$ (independent of $\varphi(t)$) so that the inequality

$$|\varphi(t)|^2 \sum_{p} p^{-1}\,|e^{itf(p)} - 1|^2 \le c_2$$

is valid uniformly for all real numbers $t$. In fact set $a_m = \exp(itf(m))$ $(m = 1, \ldots, n)$ in Lemma 1. Then from the additive property of $f(n)$,
THE TURAN-KUBILIUS INEQUALITY AND THE LARGE SIEVE 79 n E am = exp{itf(p)) £ ar. m=l;p\\m r^p'ln;P%r Choose a (large) prime P. Then uniformly for all p not exceeding P we have n-1 t flw = exp(ft^p))^(t) + 0(l) + flp-1 (|0|£1). m = 1; p || m Applying Lemma 1, dividing by n2 and letting n->oo we see that i I (p-1 Iexp(ft/(p))-l|2 \cf>{t)\2-p-2)^cx. Letting P-+00 we deduce the asserted inequality with c2 = 2c1 + X/>~2. It is almost immediate that the series £|/<p)I>i 1/p and £|/(P)|gi/2(p)/p converge. For each n^ 1 set *- 1 &. p*»; l/(p)l£i P Then, as in the argument for sufficiency, one can (using the convergence of the above two series) prove that a limiting distribution exists for the function/(m) — A(n) (m= 1,..., n). Since by hypothesis this is also true for f(m) we must have that lim A(n) (n->oo) exists, and the proof of the theorem is complete. Our second example is a little more subtle. For each set E and real number x^l set vx(n; neE) = x~lYl * n^x where ' indicates that summation is restricted to integers n which belong to the set E. Thforem. Let fi(x) be a function ofx increasing (in the wide sense) to infinity. Then the following two propositions are equivalent: A. There are constants oc(x),for each x^ 1, with the two properties (i) ifO<w<\,then Pix)'1 sup,**^, |a(x)-a(y)|->0 (x->oo); (ii)for each s>0, vx(n\ \f(n)-ai(x) |>8j?(x))->0 (x->oo). B. For each real number u>0, X 1-0 and P(x)-2 £ ^-0, P^x;\f(p)\>up(x)P p^x;\f(p)\^up(x) P both as n->co.
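The simplest instance of this law of large numbers is the additive function $\omega(n)$, the number of distinct prime divisors of $n$, for which $f(p^{\nu}) = 1$. With $\beta(x) = \log\log x$ both sums in Proposition B tend to zero: the first is eventually empty, and the second is $\beta(x)^{-2}\sum_{p \le x} 1/p$. The sketch below measures, for this one function, the ratio of the two sides of the Turán-Kubilius inequality; the function names are illustrative and the computation is only a numerical illustration, not part of the argument.

```python
def omega_and_primes(n):
    """Return (omega, primes): omega[m] is the number of distinct primes
    dividing m for 0 <= m <= n, and primes lists the primes up to n."""
    omega = [0] * (n + 1)
    primes = []
    for p in range(2, n + 1):
        if omega[p] == 0:  # no smaller prime divides p, so p is prime
            primes.append(p)
            for m in range(p, n + 1, p):
                omega[m] += 1
    return omega, primes


def turan_kubilius_ratio(n):
    """Return (A, ratio), where A is the sum of 1/q over prime powers q <= n
    and ratio = sum_{m <= n} (omega(m) - A)^2 / (n * A)."""
    omega, primes = omega_and_primes(n)
    A = 0.0
    for p in primes:
        q = p
        while q <= n:  # f(p^v) = 1 for every prime power p^v
            A += 1.0 / q
            q *= p
    second_moment = sum((omega[m] - A) ** 2 for m in range(1, n + 1))
    return A, second_moment / (n * A)


A, ratio = turan_kubilius_ratio(10_000)
print(A, ratio)
```

The ratio is an empirical lower bound for the constant $c_1$ when $f = \omega$. For $n = 10^4$ it is certainly below 2, and a bounded ratio of this kind is what forces $\omega(m)$ to stay near $\log\log n$ for most $m \le n$.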
80 P. D. T. A. ELLIOTT Remark. This theorem is a general form of the law of large numbers for additive functions insofar as they mimic the sum of independent random variables. We make no assumption concerning f(n) whatsoever beyond the fact that it is adcjitive. We shall not give the whole proof but sketch the key lemma. Moreover, we shall give this lemma for strongly additive functions, namely those which satisfy f(pv) = f(p), for v= 1, 2,.... This is purely for convenience of exposition. Lemma 2. Assume Proposition A (ii). Then for each e>0 the estimate £" —0 (x-oo) P^xP is satisfied, where summation is restricted to those primes pfor which the inequality \f{p)-Hx)-0L(p-'x))\>eP(x) is valid. Granted this lemma, choose a real number w in the interval 0< w< 1. Then by condition A(i), a(x) — cc(p~1x) = o(f}(x)) holds uniformly for 2^p^xw, as x->oo. This fact together with Lemma 2 ensures that for each real u>0, I -£o{l) + I -:g-logw + 0(l) P^x; \f(p)\>uP(x) P x™<p^xP as x->oo. Since w can be taken arbitrarily close to 1 from below we see that I --0 (x-oo). P^x;\f(p)\>uP(x) P This is already a large part of Proposition B. We now sketch a proof of Lemma 2. We can find functions e(x) and S(x) so that e(x) decreases (in the wide sense) to zero, S(x)->0 as x-> oo and so that v»(*; \f(n)-oi(x)\>8(x)p(x))^S(x) (x^l). By replacing S (x) by sup y ^ x S (y) if necessary we can also assume that S (x) decreases to zero as x->oo. Consider the integers nt (i = 1,..., k) which lie in the interval 1 ^ n{ ^ x, for which
THE TURAN-KUBILIUS INEQUALITY AND THE LARGE SIEVE 81 the inequality \f(ni)-(x(x)\>£(x) /?(x) is valid. Let pj run through those primes p in the interval 1 ^p^x for which I i-p-1 I i nt^x; p\\nt «i^x -l/2Y/Al/2n-l >(S(x)-1/2xk)1/2p Then, from Lemma 2, Next, we note that if j> = x(logx)_1, then £ -^((logx)-1'2)-^ (x-oo). y<pgxP We shall now show that every prime p in the range 2^/?^x(logx)_1 which is not a pj has the property |/(p)-(a(x)-a(p-1x))|^£^(x) provided only that x is sufficiently large. Let p be such a prime. Consider the integers m} in the interval l^ra^x for which \f(mJ) — tt(x)\?^£(x) /?(x), and for which p || rrij. The number of such rrij is at least x x Z I" I 1 ^---2-1—-(*(*)-1/2**)!'-p /C P /2„"1 ^-(l-l/p-Oogx)-1-^)-^)1/4). P Moreover, the number of integers r not exceeding xp"1 for which the inequality |/(r)-a(xp-1)|^£(xp-1)^(xp-1) is satisfied is at least (x/p) (1— S(xp~1)). It is now clear that at least one of the integers mjp~l{p || rrij) coincides with one of these integers r. For otherwise there are at least (x/p) (2- 1/p-Pogx)-1 -2(<5(logx))1'4) integers in the interval [1, xp"1], and for all sufficiently large values of x this is impossible. Thus we have m3 = pr, where the condition p || rrij ensures that (p, r)= 1.
82 P. D. T. A. ELLIOTT From the additive property of / (n), ^s(x)p(x) + s(xp-1)^xp-1) ^2s(logx)P(x). The proof of Lemma 2 is now complete. References 1. P. Erdds and A. Wintner, Additive arithmetical functions and statistical independence, Amer. J. Math. 61(1939), 713-721. 2. J. Kubilius, Probabilistic methods in the theory of numbers, Gos. Izdat. Polit. Naucn. Lit. Litovsk. SSR, Vilna, 1962; English transl., Transl. Math. Monographs, vol. 11, Amer. Math. Soc, Providence, R.I., 1964, Chap. 3, pp. 31-35. MR 26 #3691; MR 28 #3956. 3. P. Turan, On a theorem of Hardy and Ramanujan, J. London Math. Soc. 9 (1934), 274-276; ibid. 11(1936), 125-133. University of Colorado
ON THE NUMBER OF SOLUTIONS OF $m = \sum_{i=1}^{k} x_i^k$

P. ERDŐS AND E. SZEMERÉDI

AMS 1970 subject classifications. Primary 10B15, 10J99.
© 1973, American Mathematical Society

Denote by $r_{k,l}(n)$ the number of solutions of

$$n = \sum_{i=1}^{l} x_i^k$$

in positive integers $x_i$. The well-known hypothesis K of Hardy and Littlewood states that for every $\varepsilon > 0$,

(1) $r_{k,k}(n) = O(n^{\varepsilon}).$

(1) is well known for $k = 2$; in fact for $n > n_0(\varepsilon)$,

(2) $r_{2,2}(n) < n^{(1+\varepsilon)\log 2/\log\log n},$

and (2) does not hold for every $n$ if $\log 2$ is replaced by a smaller constant. Nearly 40 years ago Mahler [7] disproved the hypothesis for $k = 3$. He showed in fact that for infinitely many $n$ ($c_1, c_2, \ldots$ denote positive absolute constants),

(3) $r_{3,3}(n) > c_1 n^{1/12}.$

It is possible that, for all $n$,

(4) $r_{3,3}(n) < c_2 n^{1/12},$

but nothing is known about this. It is probable that the K-hypothesis fails for every $k > 3$ too, but probably
84 P. ERDOS AND E. SZEMEREDI (5) E(r*»)2<x1+£ n=l for every e if x>x0(e). (5) would be just as useful for Waring's problem as the K-hypothesis. Chowla [1] proved that for /c^5, rkk{n)^0{\\ and Chowla and Erdos [3] proved that, for every /c^2 and infinitely many n, rk, k (n) > exp (ck log n/log log n). Mordell proved that r32(ri)^0(l), and Mahler [8] proved that, for infinitely many n, r3 2(n)>(\ogn)l/4. As far as we know there is no nontrivial upper bound for r32(n) and almost nothing is known about rkl{n) for /</c, /c>3. Another very difficult problem is to estimate Akl(x\ the number of integers m^x for which i m= X xf is solvable. A classical result of Landau states that A2t2(x) = (C + o(l))x/(logx)1'2. Mahler and Erdos [4] proved that, for every /c>2, Ak 2(x)xxkx2/k (ak>0), and Hooley proved Ak 2(x) = (ck + o(l)) x2/k. It seems certain that, for every /</c, AK i > akf txllk and AK k(x) > xl ~E for every e > 0, if x > x0 (s). Unfortunately we have no contribution towards settling these classical problems; for important partial results see the papers of Davenport [2]- P. Erdos [5] proved the following result. Let r1<---<rk<w, fc>«1"Cl/loglogn, C!<^log2. Then for n>n0(cl), there is an m so that the number of solutions of m = rf — rf is greater than exp(c2 logfl/log log«). He also proved that for infinitely many n the number of solutions of n=p2 + q2, /?, q primes, is greater than exp(c3 logw/log logw). P. Erdos states without giving the proof that for every k there is an nk so that the number of solutions of nk=p3 + q3 + r3 is greater than k. The analogous result seems to be unknown for more than three summands.
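For small $n$ the representation function $r_{k,l}(n)$ discussed above can be computed by brute force. The sketch below counts ordered $l$-tuples of positive integers, which is an assumption about the counting convention, and the function name is illustrative.

```python
def count_representations(n, k, l):
    """Number of ordered l-tuples of positive integers (x_1, ..., x_l)
    with x_1**k + ... + x_l**k == n."""
    if l == 0:
        return 1 if n == 0 else 0
    count = 0
    x = 1
    while x ** k <= n:
        # Fix the first summand and count completions of the remainder.
        count += count_representations(n - x ** k, k, l - 1)
        x += 1
    return count


print(count_representations(50, 2, 2))    # 1+49, 49+1, 25+25
print(count_representations(1729, 3, 2))  # 1+1728, 1728+1, 729+1000, 1000+729
```

With this convention $r_{2,2}(50) = 3$ and $r_{3,2}(1729) = 4$. The recursion takes on the order of $n^{(l-1)/k}$ steps, so it is only suitable for experiments with small $n$.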
In the present note we prove the following:

Theorem. Let $t$ be a positive integer, $c_1$ a positive number, and $l$ and $n$ positive integers satisfying $l>c_1 n$. Let $a_1<\cdots<a_l$ be positive integers smaller than $n$ but otherwise arbitrary. If $n>n_0(c_1,t)$ there exists an integer $m$ such that the equation

$$m=\sum_{j=1}^{k} a_{i_j}^{k}$$

has more than $t$ solutions.

Before we prove our theorem we wish to state a few well-known and very difficult problems in additive number theory. Denote by $f_k(n)$ the size of the largest set of integers $1\le a_1<\cdots<a_l\le n$ for which all the sums

$$\sum_{i=1}^{l}\varepsilon_i a_i,\qquad \varepsilon_i=0\ \text{or}\ 1,\qquad \sum_{i=1}^{l}\varepsilon_i=k,$$

are distinct. Erdős and Turán conjectured

(6)  $f_2(n)=n^{1/2}+O(1).$

This problem seems very deep. Erdős, Turán and Lindström [6] proved $f_2(n)\le n^{1/2}+n^{1/4}+1$, and recently Szemerédi proved $f_2(n)<n^{1/2}+o(n^{1/4})$; the proof is very complicated. The results of Singer [9] immediately imply $f_2(n)\ge(1+o(1))\,n^{1/2}$. P. Erdős often offered 300 dollars for a proof or disproof of the conjecture (6). Chowla and Ryser conjectured that

(7)  $f_k(n)=(1+o(1))\,n^{1/k}.$

They proved $f_k(n)\ge(1+o(1))\,n^{1/k}$. P. Erdős offers 100 dollars for a proof or disproof of (7). The methods used for $f_2(n)$ seem to break down completely.

Finally denote by $F(n)$ the size of the largest set of integers $1\le a_1<\cdots<a_t<n$ for which all the sums $\sum_{i=1}^{t}\varepsilon_i a_i$, $\varepsilon_i=0$ or $1$, are distinct. P. Erdős and L. Moser proved

$$F(n)<\frac{\log n}{\log 2}+\frac{\log\log n}{2\log 2}+c,$$

and Conway and Guy showed that for $t>22$, $F(2^t)\ge t+2$. P. Erdős asked 40 years
86 P. ERDOS AND E. SZEMEREDI ago: Is it true that (*) F(/i) = log/i/log2 + 0(l)? Erdos offers 300 dollars for a proof or disproof of (8). We now prove our theorem. The proof is rather complicated and to motivate it we first try to explain its plan which follows [3]. Let s be sufficiently large but fixed, A will denote the sequence 1 ^ax < ••• <ax ^n,l>cxn. A(u,d, n) denotes the number of integers of the sequence A satisfying <zt = tt(modd). Suppose that we have found a square-free integer Tr, r>r0(k, s, t, c), all of whose prime factors pl9..., pr are sufficiently large so that! for every j, 1 ^j^r, (8) A(0,Tr/Pj;n)>lPj/2Tr; and the number of residue classes u (mod Trp)~ *), w = 0 (mod Tr/pj) (the number of these residue classes is p*) which do not satisfy (9) A(u,Trp)-lMi)>l/sTrp)-1 is less than p)l%k for j= 1,..., r. Then we can prove our theorem by the method of [3]. To see this denote by F(Tr) the number of solutions of the congruence (in distinct a's) (10) X a^0(modTrk) (1^/), i=l and let Fj(Tr) denote the number of those solutions of (10) for which at = 0(modTr/Pj), ar#0(mod/?;), i=l, 2,..., k. Clearly (11) I Fj(Tr)^F(Tr). Next we estimate F3 (Tr) from below. The first k — 2 summands of (10) we choose arbitrarily subject only to (12) a, = 0 (mod 7^), ^^(modp,). The number of choices of at satisfying (12) is, by (8), greater than (13) lPj/2Tr-n/Tr>lPj/4Tr
THE NUMBER OF SOLUTIONS OF m = Yj=\ Xk 87 by l> cxn, if the prime factors of Tr are greater than, say, 10/cx. From (13) we obtain that the number of choices of fc — 2 distinct <z's satisfying (12) is greater than JOTJ r (10kf\Try We have to choose ak-1 and afc so that besides satisfying (12) they should satisfy (15) a^1+a*=-2;2a?(mod$. A well-known result in elementary number theory states that if p>p0, then the number of solutions of the congruence xk + yk = a(modpk), x, y#0(modp) is greater than pk/2. Now observe that the number of solutions of the congruence (15) in residues where at least one of them does not satisfy (9) is less than/?*/4. To see this observe that there are at most pk/8k residues not satisfying (9), and once one such residue has been chosen there are at most k choices for the other residue in (15). Thus the number of solutions of (15) in residues satisfying (9) is greater thanpk/4k. Hence by (9) the number of solutions in ak-l and ak of (15) is greater than do - '-Yd- ' sTrp)~lJ 4 As2Tr2p)-2' From (14) and (16) we have (17) Fj(Tr)>lkT^ks-2(100kyk. Thus from (17) and (11) and l>cln we have, for r>r0(k, s, cx\ (18) F(Tr)>r(lT-1(l00k)-1)ks-2>r1'2(nk/Tk). Now the integers £{L l ak are all less than knk. Thus there are at most knkT^k of them which are multiples of 7^ and hence by (18) for at least one of these integers, say mj Trk, the number of solutions of m = mlTk=Y ak
88 P. ERDOS AND E. SZEMEREDI is greater than r1/2/k> t for r> t2k2, and this completes the proof of our theorem. Now we 'only' have to prove the existence of an integer Tr satisfying (8) and (9) and this will be the chief difficulty of our proof. We need three lemmas. Lemma 1. Let £>0, c>0, and r be a positive integer. Then there is an n0 = n0 (e, c, r) so that for every n> n0 if' 1 <ax <... <at<n, l> en is any sequence of integers, then there is a square-free integer tr<t0(s, e, r) so that V(tr) = r (V(m) denotes the number of distinct prime factors of m) and for every divisor d of tr, (19) (1 -e) l/d<A(0, d; n)<(l+e) l\d. The proof of the lemma follows fairly easily from Turan's method and we will leave some of the details to the reader. First of all it immediately follows from Turan's method that (20) Zp<Cu C^C&c) where in £ 1/p the summation is extended over all the primes p which do not satisfy p„ <lz^<„0,p;„)<(i±^. p p Henceforth we only consider primes p which satisfy (21). Let px be the smallest such prime. Put tl=pl;tl clearly satisfies (19). Suppose we have already constructed an integer ts = px • • -ps, px < • • • <ps so that for every divisor d! of ts we have (22) (l-sjr) l/d'<A(0, d'; n)<(l +ejr) l/d>. It again follows by Turan's method (taking note of (22)) that (23) E'l/P<Cs+i where in J] 1/p the summation is extended over the primes p for which for some divisor d' of ts, (24) (1 -eiM+l)fr)llpd'<A{09pd'; n)<(l +fi(I+1)/r)//prf' does not hold. Let ps+1 be the smallest prime greater than ps which satisfies (24) for every divisor d! of ts. Put ts+l = tsps + l. Clearly tr satisfies (19) and by our construction rr<r0(£, c, r) which proves our lemma.
THE NUMBER OF SOLUTIONS OF m = Yj= i *? 89 Lemma 2. Let £>0, oO, ml < ••• <mr be any sequence of integers which are pairwise relatively prime. Let L> L0(c9 e)9 N>N0(mr9 L, s) and bl<--<bl<N, l>cN be any sequence of integers. An mh 1 ^/^r, is said to be bad if there are more than e,m{ residue classes u (mod raj so that for each of them (25) 5(W,mi,N)<//2Lmi. Then there are fewer than L bad mis. The lemma would follow easily from the large sieve but we give a very simple direct proof. A residue class u (modmf) is bad if it satisfies (25). If a b3 is congruent to a bad residue class modmf for any i= 1,..., r, we throw it away. Assume that our lemma is not true and that there are L or more bad m/s. Consider any L of them, say mil,...,mi/. We throw away, by (25), at most //2fe's; thus by l>cN there are at least 1/2 /'s, bl<"-<bM9 so that every fc,-(mod raj, 1 ^s^L is not a bad residue class (i.e. B(bj9 mis, N) does not satisfy (25)). But since mis is bad, 1 <5<L, there are at least am. bad residues modm; , or the fe's are in at most (1 -s)L Y[s= i mts residue classes mod Y[s= i mis- Thus for L>L0(c, e), cN/4<l/2<M<(l+o(\))(\-£)L N<cN/4, an evident contradiction, which proves Lemma 2. Let now tr be an integer which satisfies (19) and let r be sufficiently large. Let d | tr. A prime p \ tr/d is said to be bad with respect to d if the following holds: Let bx < - - - < br < n/d be the integers ajd, r>(l—e) l/d>(l — e) cn/d, by Lemma 1. Now p is bad (with respect to d) if there are more than epk residues modp* so that (25) holds for each of them (mf = pfc, N = n/d). By Lemma 2 there are fewer than L bad primes p | tr/d. Lemma 3. There is a d\ /r, V(d)>\ogr/2 log2 so that no p | d\dx is bad with respect to d1 where dl is any divisor of d. If we prove Lemma 3 our proof is finished since we can simply put d=Tr and (8) and (9) are satisfied. Thus we only have to prove Lemma 3. 
Lemma 3 follows from an argument used by Spencer and Erdos (their paper will be soon published in Matematikai Lapok) but in view of the fact that the paper is in Hungarian it seems appropriate to give the simple proof in full detail. The argument is of course purely combinatorial. Let \<p\ = r,<p1cz(p. By assumption there are fewer than L bad elements xeq> with respect to q>x (x^cp^. A subset q>l of cp is called bad if there is an element x of <px so that x is bad with respect to (?! — x. Clearly there are at most L(ULx) bad subsets q>i^(p with \q>1\ = u.
90 P. ERDOS AND E. SZEMEREDI We want to prove that there is a subset Aczqy, \A\>\ogr/2 log2 which contains no bad subsets, and this will complete the proof of our lemma. Clearly there are at most JtAl-uJ \u-lj {1-1)1 /-element subsets of cp which contain a bad subset. Now if r> r0(L), /^logr/2 log 2, then 0-1)! W thus there is an /-element subset A, /^logr/2 log 2, which contains no bad subset, which proves our lemma and theorem. Lemma 1 could have been strengthened in the following way: Instead of (19) we could have proved that for every w, (19') (l-£)l/d<A(u,d;n)<{l+£)l/d uniformly for every residue class u. The proof would be essentially the same as that of (19). Several other possibilities of generalisations we plan to discuss in another paper. References 1. S. Chowla, Indian Phys.-Math. J. 6 (1935), 65-68. 2. H. Davenport, Sums of three positive cubes, J. London Math. Soc. 25 (1950), 339-343. MR 12, 393. 3. P. Erdds, On the representation of an integer as the sum ofk k-th powers, J. London Math. Soc. 11(1936), 133-136. 4. P. Erdds and K. Mahler, On the number of integers which can be represented by a binary form, J. London Math. Soc. 14(1939), 134-139. 5. P. Erdos, On the sum and difference of squares of primes. I, II, J. London Math. Soc. 12 (1937), 133-136, 168-171. 6. B. Lindstrdm, An inequality for B2-sequences, J. Combinatorial Theory 6 (1969), 211-212. 7. K. Mahler, Note on the hypothesis k of Hardy and Littlewood, J. London Math. Soc. 11 (1936), 136-138. *8. , On the lattice points on curves of genus 1, Proc. London Math. Soc. 39 (1935), 431-466. 9. J. Singer, A theorem in finite projective geometry and some applications to number theory, Trans. Amer. Math. Soc. 43 (1938), 377-385. Mathematical Institute of the Hungarian Academy of Science Budapest, Hungary * P. Erdos remembers that Mahler in a later paper improved the exponent i to 2 but is unable to trace the reference.
THE LARGE SIEVE AND PROBABILISTIC GALOIS THEORY

P. X. GALLAGHER

AMS 1969 subject classifications. Primary 1064, 1050, 1240; Secondary 1065, 2020. © 1973, American Mathematical Society

In this paper we give versions of the large sieve in several variables suggested by recent papers of Hlawka [11], Elliott [10], and Montgomery [16]. The sieve is applied to the set of all monic polynomials in one variable of given degree with integer coefficients. In particular, we sharpen the result of Dörge [8] and van der Waerden [25] that almost all such polynomials are irreducible and have Galois group equal to the symmetric group.

For the number $E_n(N)$ of polynomials

$$F(X)=X^n+a_1X^{n-1}+\cdots+a_n$$

with integer coefficients and height

$$H(F)=\max(|a_1|,\ldots,|a_n|)\le N$$

for which the Galois group is less than the symmetric group, van der Waerden [26] gave the estimate

$$E_n(N)\ll N^{\,n-c/\log\log N},$$

with $c=1/6(n-2)$, by an argument based on reduction modulo $p$ for many primes $p$. Knobloch [14], [15] has improved this to

$$E_n(N)\ll N^{\,n-c},$$

with $c=1/18n(n!)^3$, using a quantitative version of the Hilbert irreducibility theorem. Using the sieve in van der Waerden's argument we get

(1)  $E_n(N)\ll N^{\,n-1/2}\log N,$
which begins to approach the estimate ([9], [21], [26])

$$R_n(N)\ll N\log N\quad(n=2),\qquad R_n(N)\ll N^{\,n-1}\quad(n>2),$$

for the number $R_n(N)$ of reducible monic polynomials of height $\le N$. A refinement of the argument leads to

(2)  $E_n(N)\ll N^{\,n-1/2}\log^{1-\gamma}N,$

with $\gamma=\gamma_n>0$.

1. The large sieve in $Z^n$. For each prime $p$, let $\Omega(p)$ be a subset of the group $Z^n/pZ^n$ of $n$-dimensional lattice vectors modulo $p$. Denoting by $\omega(p)$ the number of elements of $\Omega(p)$, we have $0\le\omega(p)\le p^n$. For each lattice vector $a\in Z^n$, denote by $P(a,x)$ the number of primes $p\le x$ for which $a \bmod p$ belongs to $\Omega(p)$, and put

$$P(x)=\sum_{p\le x}\omega(p)/p^n,$$

the "expectation" of $P(a,x)$.

Lemma A. For $N\ge x^2$, we have

(3)  $\sum_{|a|\le N}\bigl(P(a,x)-P(x)\bigr)^2\ll N^nP(x).$

Here $|a|$ is the maximum of the absolute values of the components of $a$. The implied constant, here and in what follows, may depend on $n$.

With the same notation, denote by $E(N)$ the number of $a$ with $|a|\le N$ for which $a \bmod p\notin\Omega(p)$, for each prime $p$. Put

(4)  $L(x)=\sum_{q\le x}\mu^2(q)\prod_{p\mid q}\dfrac{\omega(p)}{p^n-\omega(p)}.$

Lemma B. For $N\ge x^2$, we have $E(N)\ll N^n/L(x)$.

Lemma A is an $n$-dimensional variant of an inequality of Turán [23], [24], analogous to Tchebychev's inequality for the standard deviation of a sum of independent random variables. Lemma B is the $n$-dimensional version of the Selberg-Montgomery sieve upper bound.

The proofs depend on the $n$-dimensional analogue of Bombieri's large sieve inequality for exponential sums [4], [5]. Let

$$S(\alpha)=\sum_{|a|\le N}c(a)\,e(a\cdot\alpha)\qquad(\alpha\in R^n/Z^n)$$
THE LARGE SIEVE AND PROBABILISTIC GALOIS THEORY 93 with the usual dot product, and arbitrary complex coefficients c(a). Then (5) £ \S(«)\2<(N» + x2») £ \c(a)\2. Inequalities of this sort have been applied by Huxley [12], Wilson [29], Saman- darov [17], and Schaal [18] in generalising the large sieve to algebraic number fields, and by Hlawka [11], whose application is a variant of Lemma B. For a proofof(5),see[ll]or[12]. Weaker forms of Lemmas A and B may be derived without using (5). For square-free q, let Nq be the number of a with \a\^N for which a modpeQ(p) for each p | q. A simple lattice point argument gives (6) N, = (2N)»^+o(^-^)), for,**. Using (6) in Turan's argument, we can prove (3) provided x^(N logAT)1/3. In the application mentioned in the introduction, this would lead to a bound jyn-1/3 iQgc ^y Similarly, using (6) in Selberg's upper bound sieve method [19], we would get a bound Nn~1/5 logc N. The following proof of Lemma A was suggested by arguments of Elliott [10] and Warlimont [28]. Let <pp(a) be the characteristic function of the set of a satisfying a modpeQ(p). Then (pp is periodic modpZn, so with <Pp(a)= I cp(cc)e(a-cc), ord a | p cP(*)=P " E <PP(a)e(-aot). aeZn/pZn In particular, (7) c,(0)-2a and £ *,<#-=« P orda|p P It follows that P(a,x)=£ Pp(flHP(x) + K(a,x), pZx
94 P. X. GALLAGHER with Hence R(a,x) = X £ cp(a)e(a-(x). pf|jc orda = p Z W«,*))2=Z £ c,(a) I K(a,x)e(aa) |a|^N pf|x orda = p a^N ^(z Z kP(a)|2)1/2(ziS(a)|2) \p^x orda = p / \ A / where A is the set of a in Rn/Zn of prime order ^x, and S(a)= £ K(a,x)e(aa). Using (7) and (5), we conclude that 1/2 X (R(a,x))2<[P(x)(Nn + x2n) X (*M): 1/2 from which Lemma A follows. For the proof of Lemma B, it suffices, as in [16], to show that if c(a) = 0 unless a mod p$Q(p) for all p, then (8) i* = q p\qP -<0[P) for all square-free q. In fact, putting c(a) = l or 0 according as amodp$Q(p) for all p or not, we have E(N) = S(0) = Yd\a\^N\c(a)\2> so the upper bound for E(N) follows from (5) and (8). To prove (8), we may proceed as in [16] or as follows: For each prime p, (9) £ |S(a)|2 = p" X |S(/i,p)|2-|S(0)|2, ord a = p he Zn/pZn where S(/i, p) = Y^amodPehc{a)' The Schwarz inequality gives (10) |S(0)|2 lS(h,p)\ ^p"-co(p))Z\S(h,p)\2, since, by the hypothesis, S(/i, p) = 0 for heQ(p). Combining (9) and (10), we get
THE LARGE SIEVE AND PROBABILISTIC GALOIS THEORY 95 (8) for q = p. More generally, replacing c(a) by c(a) e(a-/?), we get Z |S(«+0l2*-^Lls(fll2. orda = p P —(0(P) If q is a square-free integer but not a prime, then q = pr, with p a prime and r a square-free integer prime to p. Hence Z |S(y)l2= Z Z |S(a + /?)|2^^7- Z lW, ord > = q ord a = p ord 0 = r p — CD (Pj Crd 0 = r and we get (8) from this by induction on the number of prime factors of q. 2. Sifting polynomials. Let F(X) be a monic polynomial of degree n with integer coefficients. If F(X) modp splits into distinct monic irreducible factors, with rl linear factors, r2 quadratic factors, etc., then we say that r = (rl9 r2,...) is the splitting type of F(X) modp. Thus F(X) has a splitting type for each prime which does not divide D(F\ the discriminant of .F(A'). For each splitting type r, we have (ii) i/•»■/=«. / Given r satisfying (11), (Jenote by nF r(x) the number of primes p^x for which F(X) modp has type r. As we will show later, for "almost all" F(X), we have 7iFr(x)~5(r) 7r(x), x-kx), where n(x) is the number of primes p^x and (12) S(r) = (Virl \2r2r2 I-)"1. The following result shows that S(r) is the normal density of primes for which monic polynomials of degree n have splitting type r. Theorem A. Ifr satisfies (11), then for N^x2, we have (13) X (*F,r(x)-Hr)n(x))2<Nn7i(x). H(F)^N Proof. We identify a monic polynomial of degree n with the lattice vector a = (a!,..., an) formed by its coefficients, so that H(F)=\a\. Similarly, polynomials modp are identified with lattice vectors modp. Let Qr(p) be the set of monic nth degree polynomials modp of type r. By the unique factorisation theorem, the
96 P. X. GALLAGHER number cor(p) of such polynomials is given by f \ rf where np(f) is the number of/th degree monic irreducible polynomials modp. By a theorem of Dedekind [7], [13] we have J d\f Thus np(f) is a polynomial in p with leading term pf/f, so cor(p) is a polynomial in p with leading term riffi-*'- It follows that Pr(*)= I (^ = S(r)n(x) + 0(\oglogx). p^x P Identifying nFr(x) with Pr(a, x) and using Lemma A, the result follows. Let nF(x)= Z' (number of roots of F(X) = 0 modp), where the dash indicates that p \ D(F). Corollary 1. For N^x2, we have (14) £ (nF(*)-n(x))2<Nn7i(x). H(F)^N Proof. We have M*) = Z ',i^,rW = (Z >*i<5(r)j ^W + Z M*F,r(*)-<$(r) tt(x)). The first sum on the right is 1, since it is the coefficient of xn~l in (d/dxx) exp(x1+x2/2 + x3/3+-)x1=x = explog(l/(l-x))=l+x + x2+--.
THE LARGE SIEVE AND PROBABILISTIC GALOIS THEORY 97 Using the Schwarz inequality, we get M*)-k(x))2«£ (*P.r(x)-d(r) n(x))2. r Summing on F and using (13) for each r, we get (14). Denote by Er(N) the number of monic nth degree polynomials F(X) of height ^N for which F(X) modp has splitting type r for no prime p. Corollary 2. For each r, we have Er(N)<^Nn~112 logN. Proof. For each such F(X), we have nFr(x) = 0, so the corresponding term in (13) contributes (n(x))2. Hence Er(N)<$Nn/n(x) for N^x2. Choosing x = N1/2, the result follows, using Tchebychev's estimate for n(x). Using Lemma B, we can improve the logarithm in Corollary 2. For any nonempty set R of splitting types, denote by ER(N) the number of monic nth degree polynomials F(X) of height ^N for which F(X) modp has splitting type in R for no prime p. Put S(R) = J^reR S(r). Theorem B. For each 5<5(R), we have (15) ER(N)<Nn-ll2(\ogNy-m-5). Proof. Let QR(p) be the union of the Qr(p) for reR. Then coR(p)~5(R)jf as p-tco, so coR(p)^5pn for all p^Ms. To find an upper bound for ER(N), it suffices by Lemma B to find a lower bound for yc,M(x)=Z c*« (c = <5/(l-<5)) where the dash indicates that q runs over square-free integers with no prime factor < M, and v(q) is the number of prime factors of q. From a more general formula of A. Selberg [20, Theorem 2] we get ^»w^,,n('-^)\n(.^)(.-i)'-log.-.„ for each c. Putting x = N1/2, the result follows. 3. van der Waerden's theorem. Combining the results of the previous paragraph with a method of Bauer [3], we can estimate En(N). We use the fact that if ^(A") mod/? has splitting type r, for some prime/?, then the Galois group G of the splitting field of F(X), regarded as permutation group on the roots of ^(A"),
contains a permutation of cycle type $r$, with $r_1$ cycles of length 1, $r_2$ cycles of length 2, ... [27, §61]. If $G$ is a proper subgroup of the symmetric group $S_n$, then the conjugates of $G$ do not cover $S_n$ [6, §26], so there is a conjugacy class of $S_n$, consisting of all permutations with a given cycle type $r$, which does not intersect $G$. Thus there is a forbidden splitting type for $F(X)$. It follows that

$$E_n(N)\le\sum_r E_r(N).$$

Using Corollary 2 or Theorem B (with $R=\{r\}$), we get the inequalities (1) or (2) of the introduction, for example with $\gamma_n=\min\delta(r)=(n!)^{-1}$.

To get a larger value for $\gamma_n$, we use the following lemma, stated by Bauer. I owe the proof to D. Knutson.

Lemma. Let $G$ be a subgroup of $S_n$. If $G$ is transitive, contains a transposition, and contains a $p$-cycle for some prime $p>n/2$, then $G=S_n$.

Proof. Consider the graph with vertices $1, 2, \ldots, n$ and edges corresponding to the transpositions $(ij)$ in $G$. Since $S_n$ is generated by transpositions, it suffices to show that the graph is complete, i.e. has an edge between each two vertices. If $G$ contains $(ij)$ and $(jk)$, then $G$ contains $(ik)=(ij)(jk)(ij)$. Hence the connected components of the graph are themselves complete graphs, so it suffices to show that the graph is connected. In its action on the vertices, $G$ operates as a group of automorphisms of the graph, since $(\sigma i,\sigma j)=\sigma(ij)\sigma^{-1}\in G$ for $\sigma\in G$ and $(ij)\in G$. Hence $G$ acts transitively on the components. Therefore the components are all isomorphic, of a size $d$ dividing $n$. If the $p$-cycle moves a component, then there are at least $p$ components, so $pd\le n$, forcing $d=1$. This is impossible, since the graph has at least one edge. Hence the $p$-cycle fixes each component, so $d\ge p$, forcing $d=n$. Hence the graph is connected.

Theorem C. We have $E_n(N)\ll N^{\,n-1/2}\log^{1-\gamma}N$, with $\gamma=\gamma_n>0$, and $\gamma_n\sim(2\pi n)^{-1/2}$.

Proof. Let $T$ be the set of elements of $S_n$ among whose cycles there is just one transposition and no other cycles of even length. Let $P$ be the set of elements of order divisible by some prime $p>n/2$.
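The densities attached to cycle types, and to the sets $T$ and $P$ just defined, can be checked by exact enumeration for small $n$. The sketch below assumes the standard formula $\delta(r)=\bigl(\prod_i i^{r_i}\,r_i!\bigr)^{-1}$ for the proportion of permutations in $S_n$ of cycle type $r$; all function names are hypothetical, and the closed form in `density_T_closed_form` is the binomial expression for half the relevant coefficient of $(1+x)(1-x^2)^{-1/2}$.

```python
from fractions import Fraction
from math import comb, factorial


def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing lists of positive parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for part in range(min(n, max_part), 0, -1):
        for rest in partitions(n - part, part):
            yield [part] + rest


def density(parts):
    """delta(r) = 1 / prod_i (i^{r_i} * r_i!) for the cycle type with these parts."""
    denom = 1
    for i in set(parts):
        r_i = parts.count(i)
        denom *= i ** r_i * factorial(r_i)
    return Fraction(1, denom)


def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))


def in_T(parts):
    """Exactly one 2-cycle and no other cycle of even length."""
    return parts.count(2) == 1 and all(i % 2 == 1 for i in parts if i != 2)


def in_P(parts, n):
    """Some cycle length is a prime p > n/2, so the order is divisible by p."""
    return any(is_prime(i) and 2 * i > n for i in parts)


def summarize(n):
    """Exact total, minimum, and the densities of T and P over all cycle types of n."""
    dens = {tuple(t): density(t) for t in partitions(n)}
    return {
        "total": sum(dens.values()),
        "min": min(dens.values()),
        "T": sum(v for t, v in dens.items() if in_T(list(t))),
        "P": sum(v for t, v in dens.items() if in_P(list(t), n)),
    }


def density_T_closed_form(n):
    """Half the coefficient of x^(n-2) in (1+x)(1-x^2)^(-1/2), for n >= 2."""
    k = (n - 2) // 2  # 2k = n-2 for even n, n-3 for odd n
    return Fraction(comb(2 * k, k), 2 * 4 ** k)


if __name__ == "__main__":
    for n in (7, 8, 12):
        print(n, summarize(n), density_T_closed_form(n))
```

The enumeration is exponential in $n$ and is meant only as a sanity check of the densities for small degrees, not as part of the argument.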
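If $G$ is a proper subgroup of $S_n$, then either $G$ is intransitive or one of $G\cap T$, $G\cap P$ is empty, since otherwise $G$ satisfies the hypotheses of the lemma: a suitable odd power of an element of $T$ is a transposition, and a suitable power of an element of $P$ is a $p$-cycle with $p>n/2$. It follows that

(16)  $E_n(N)\le R_n(N)+E_T(N)+E_P(N).$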
The density of $P$ is the sum of the $\delta(r)$ with $r_p=1$ for some prime $p>n/2$. It follows easily that

(17)  $\delta(P)=\sum_{n/2<p\le n}\dfrac{1}{p}\sim\dfrac{\log 2}{\log n},$

using the prime number theorem. The density of $T$ is the sum of the $\delta(r)$ with $r_2=1$ and $r_4=r_6=\cdots=0$. This is half the coefficient of $x^{n-2}$ in

$$\exp\Bigl(x+\frac{x^3}{3}+\frac{x^5}{5}+\cdots\Bigr)=\Bigl(\frac{1+x}{1-x}\Bigr)^{1/2}=(1+x)(1-x^2)^{-1/2}=(1+x)\sum_{k\ge 0}\frac{(2k)!}{(2^k k!)^2}\,x^{2k},$$

or half the coefficient of $x^{2k}$ in the last sum, with $2k=n-2$ or $n-3$ according as $n$ is even or odd. Using Stirling's approximation, we get

(18)  $\delta(T)\sim(2\pi n)^{-1/2}.$

Using Theorem B, the estimate for $R_n(N)$ stated in the introduction, and (16), (17), and (18), the result follows.

We remark that for irreducible $F(X)$ the asymptotic formula

$$\pi_F(x)\sim\pi(x),\qquad x\to\infty,$$

follows from the prime ideal theorem for the field $Q(\alpha)$ with $F(\alpha)=0$. The hypothesis that $\zeta_{Q(\alpha)}/\zeta_Q$ is entire and has no zeros to the right of the critical line would give

$$\pi_F(x)=\pi(x)+O(x^{1/2+\varepsilon}),$$

with a constant depending on $H(F)$. Similarly, if $F(X)$ has Galois group $S_n$, the formula

$$\pi_{F,r}(x)\sim\delta(r)\,\pi(x)$$

follows [1, §2] from a strong form of the Tchebotarev density theorem [22], [2]
100 P. X. GALLAGHER and the sharper form nFJx) = d(r)n(x) + 0(xl/2 + e) would follow from the hypothesis that the nonprincipal Artin L-functions for the splitting field of F(X) are entire and have no zeros to the right of the critical line. By comparison, Corollary 1 and Theorem A give \nF(x)-n(x)\^xl/2+e and \nF,r{x)-5{r) n(x)\^xll2+° for all but <$„ x2n{l ~e) polynomials F(X) with H{F)^x2. References 1. E. Artin, fiber die Zetafunktionen gewisser algebraischer Zahlkorper, Math. Ann. 89 (1923), 147-156. 2. , Vber eine neue Art von L-Reihen, Abh. Math. Sem. Univ. Hamburg. 3 (1923), 89-108. 3. M. Bauer, Ganzahlige Gleichungen ohne Affekt, Math. Ann. 64 (1907), 325-327. 4. E. Bombieri, On the large sieve, Mathematika 12 (1965), 202-225. MR 33 #5590. 5. , A note on the large sieve, Acta Arith. 18 (1971), 401-404. 6. W. Burnside, Theory of groups of finite order, Cambridge Univ. Press, Cambridge, 1911. 7. R. Dedekind, Abriss einer Theorie der hohern Congruenzen in Bezug auf einen reellen primzahl Modulus, J. Reine Angew. Math. 54 (1857), 1-26. 8. K. Dorge, Vber die Seltenheit der reduziblen Polynome und der Normalgleichungen, Math. Ann. 95(1925), 247-256. 9. , Abschatzung der Anzahl der reduziblen Polynome, Math. Ann. 160 (1965), 59-63. MR 31 #5865. 10. P. D. T. A. Elliott, The Turan-Kubilius inequality, and a limitation theorem for the large sieve, Amer. J. Math. 92 (1970), 293-300. MR 41 #8360. 11. E. Hlawka, Bemerkungen zum grossen Sieb von Linnik, Osterreich Akad. Wiss. Math.-Natur. Kl. S.-B. II 178 (1970), 13-18. MR 42 #224. 12. M. N. Huxley, The large sieve inequality for algebraic number fields, Mathematika 15 (1968), 178-187. MR 38 #5737. 13. H. Kornblum, Vber die Primfunktionen in einer arithmetischen Progression, Math. Z. 5 (1919), 100-111. 14. H.-W. Knobloch, Zum Hilbertschen Irreduzibilitatssatz, Abh. Math. Sem. Univ. Hamburg. 19 (1955), 176-190. MR 16, 798. 15. , Die Seltenheit der reduziblen Polynome, Jber. Deutsch. Math. Verein. 59 (1956), Abt. 1, 12-19. 
MR 18, 185. 16. H. L. Montgomery, A note on the large sieve, J. London Math. Soc. 43 (1968), 93-98. MR 37 #184. 17. A. G. Samandarov, The large sieve in algebraic number fields, Mat. Zametki 2 (1967), 673-680. (Russian) MR 36 #6379. 18. W. Schaal, On the large sieve method in algebraic number fields, J. Number Theory 2 (1970), 249-270. MR 42 #7626.
THE LARGE SIEVE AND PROBABILISTIC GALOIS THEORY 101 19. A. Selberg, The general sieve-method and its place in prime number theory, Proc. Internat. Congress Math. (Cambridge, Mass., 1950), vol. 1, Amer. Math. Soc, Providence, R.I., 1952, pp. 286- 292. MR 13, 438. 20. , Note on a paper by L. G. Sathe, J. Indian Math. Soc. 18 (1954), 83-87. MR 16, 676. 21. W. Specht, Zur Zahlentheorie der Polynome, S.-B. Math.-Nat. Kl. Bayer. Akad. Wiss. 1951, 139-146. MR 14, 251. 22. N. Tchebotarev, Die Bestimmung der Dichtigkeit einer Menge von Primzahlen, welche zu einer gegebenen Substitutionsklasse gehoren, Math. Ann. 95 (1925), 191-228. 23. P. Turan, On a theorem of Hardy and Ramanujan, J. London Math. Soc. 9 (1934), 274-276. 24. , Uber einige Verallgemeinerungen eines Satzes von Hardy und Ramanujan, J. London Math. Soc. 11(1936), 125-133. 25. B. L. van der Waerden, Die Seltenheit der Gleichungen mit Affekt, Math. Ann. 109 (1934), 13-16. 26. , Die Seltenheit der reduziblen Gleichungen und der Gleichungen mit Affekt, Monatsh. Math. 43(1936), 133-147. 27. , Moderne Algebra. Vol. 1, Springer, Berlin, 1935; English transl., Ungar, New York, 1949. MR 10, 587. 28. R. Warlimont, On Artin's conjecture (to appear). 29. R. J. Wilson, The large sieve in algebraic number fields, Mathematika 16 (1969), 189-204. MR 41 #8374. Columbia University
SOME REMARKS ON ARITHMETIC DENSITY QUESTIONS

LARRY JOEL GOLDSTEIN¹

AMS 1970 subject classifications. Primary 12A70; Secondary 12A35.
¹ Research supported by National Science Foundation grant GP-20538.
© 1973, American Mathematical Society

In [2], we proposed the following generalization of the Artin primitive root conjecture [1, p. viii]: Let $S$ be a set of rational primes; for each $q\in S$, let $L_q$ be a finite, normal algebraic number field. Further, let $S^*$ be the set consisting of 1 and all square-free numbers divisible only by primes in $S$. For $k\in S^*$, define $L_k$ by $L_k=Q$ ($k=1$) and $L_k=$ the composite of the fields $L_q$, $q\mid k$ ($k>1$), and let $\deg(L_k/Q)=n(k)$. Let $P$ denote the set of all rational primes and let

$$\mathcal{A}=\mathcal{A}(\{L_q\},S)=\{p\in P \mid p \text{ does not split completely in } L_q \text{ for all } q\in S\}.$$

Then our conjecture was as follows.

Conjecture. Let $\mu$ denote the Möbius function and assume that $\sum_{k\in S^*}\mu(k)/n(k)$ converges absolutely. Then $\mathcal{A}$ has a natural density $d(\mathcal{A})$ and

$$d(\mathcal{A})=\sum_{k\in S^*}\mu(k)/n(k).$$

If $S=P$ and $L_q=Q(\zeta_q,a^{1/q})$, where $a\in Z$ is given ($a\ne 0,\pm 1$) and $\zeta_q$ is a primitive $q$th root of unity, then the conjecture is equivalent to the Artin primitive root conjecture. The conjecture is known to be true in the following cases:

(1) $S$ finite;
(2) $S$ arbitrary and $L_q\supseteq Q(\zeta_{q^2})$ for all but finitely many $q$;
(3) $S$ arbitrary, $L_q\supseteq Q(\zeta_q,a^{1/q})$ for all but finitely many $q$, provided the Riemann hypothesis for the Dedekind zeta functions is true.
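To illustrate the conjectured density numerically, one can evaluate $\sum_{k\in S^*}\mu(k)/n(k)$ in the Artin case. The sketch below rests on an assumption that is not stated in the text: that the degrees are multiplicative, with $n(q)=q(q-1)$ for each prime $q$, so that $n(k)=k\varphi(k)$ for square-free $k$ (this ignores the exceptional values of $a$ for which the fields $L_q$ fail to be independent). Under that assumption the Möbius sum over a finite set of primes equals the product $\prod_q\bigl(1-1/(q(q-1))\bigr)$, and truncating to the primes up to $10^5$ gives a value near $0.37396$, which agrees with the classical value of Artin's constant. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (for limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Clear every multiple of p starting from p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def mobius_sum(primes):
    """Exact sum of mu(k)/n(k) over square-free k built from `primes`.

    Assumes n(k) = prod_{q | k} q*(q-1); this is an illustrative hypothesis.
    """
    total = Fraction(0)
    for size in range(len(primes) + 1):
        for subset in combinations(primes, size):
            n_k = 1
            for q in subset:
                n_k *= q * (q - 1)
            total += Fraction((-1) ** size, n_k)  # mu(k) = (-1)^(number of primes)
    return total


def euler_product(primes):
    """Floating-point value of prod_q (1 - 1/(q(q-1))) over the given primes."""
    value = 1.0
    for q in primes:
        value *= 1.0 - 1.0 / (q * (q - 1))
    return value


if __name__ == "__main__":
    print(mobius_sum([2, 3, 5, 7]))            # exact Moebius sum for S = {2, 3, 5, 7}
    print(euler_product(primes_up_to(100000)))  # truncated product over primes <= 10^5
```

The exact Möbius sum is feasible only for a handful of primes, since it runs over all subsets; the product form is what makes the numerical evaluation practical.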
104 LARRY JOEL GOLDSTEIN The conjecture is supported by the same heuristic that supports the Artin primitive root conjecture. Moreover, in [2], we have given the following interpretation of the conjecture in the setting of equidistributed sequences: Without loss of generality, assume that Lq^Q for all qeS. Let L denote the composite of all Lq. Then L is a Galois extension of Q having profinite Galois group G. Let Hk be the open subgroup of G corresponding to Lk under the Galois correspondence, and let C = G-U Hq. qeS Then C is a closed subset of G invariant under conjugation. Moreover, if p is the Haar measure on G which gives the whole group the measure 1, then fd/*= £ f*{k)/n{k). J keS* C In [2], we constructed a generalized Artin symbol ((L/Q)/p) defined for each finite Q-prime p which had the following properties: (1) ((L/Q)/p) is a closed subset of G; (2) ((L/Q)/p) is invariant under G-conjugation; (3) if/? does not ramify in Lk, then the restrictions of the elements in ((L/Q)/p) to Lk coincide with the usual Artin symbol. From the properties (l)-{3), it is clear that pestfo((L/Q)/p)^C. Therefore, the conjecture is equivalent to the following Tchebotarev-type result. Conjecture'. Let stf = {peP | ((L/Q)/p)^C}. Then srf has a natural density d{stf) and d{^)= dp. c It is relatively easy to connect Conjecture' with the theory of equidistribution. Let G denote the space of all conjugacy classes of G, given the quotient topology, and let Ji be the projection of p on G. Further, let rj: G-+G denote the canonical projection. Then p is a Borel measure on G and dp=\dp. 1(C) C Then it is easy [2, p. 441] to see that the sequence n(((L/Q)/p)) is equidistributed with respect to the measure p. Set
SOME REMARKS ON ARITHMETIC DENSITY QUESTIONS 105 Then Conjecture' may be reformulated as follows. Conjecture". Let srf = {peP \ oper\(C)}. Then srf has a natural density d(stf) and d{s/)= dji. In view of the fact that the sequence {ap} is equidistributed with respect to /}, Conjecture" might seem reasonable. However Weinberger [5] and Serre (private communication) have recently shown that this is not the case by constructing counterexamples. The counterexamples can be constructed using elementary class field-theory in such a way that the extensions Lq can be taken to be abelian over Q. In this paper, we will show that by imposing some further hypotheses on the family {Lj, the most essential of which concerns the growth of the discriminant of Lq as q-+co, it is still reasonable to expect the above conjectures to hold. Namely, we will prove the following result. Theorem. Let the fields {Lq}qeS be as above, and assume LJQ is abelian for q^R. Moreover, assume that the family {Lq}qeS satisfies the following three con- ditions: (i) \og\dk\/n(k) = 0(ka)for some a^O, as k-+oo, where dk = the absolute discriminant of Lk/Q; (ii) ifpeP splits completely in Lq, then p^q for all sufficiently large q\ (»0 I,** l/n(q) = o(l/(log2A))(A^<x>). Then the conjecture holds if the Riemann hypothesis for the Dedekind zeta function is true. Remarks. (1) Condition (iii) already implies the absolute convergence of T,kes*^)/n(k). (2) If Lq^Q(Cq) for all sufficiently large q, then condition (ii) is satisfied since if p splits completely in Q (Q, then p = 1 (mod q). (3) Condition (i) is suggested by the Riemann hypothesis for the Dedekind zeta function (see below). Moreover it is satisfied in situations of interest. For example, if Lq = Q(Cq, all% then the arguments of Hooley [3, p. 229] show that lim n{k)/k2 = l9 q-*ao \dq\SAqBq2 (A>0,B>0), which can be easily used to verify conditions (i) and (iii).
106 LARRY JOEL GOLDSTEIN (4) Condition (iii) is of a technical nature and may not really be necessary, but is required in our proof. Proof of the Theorem. The idea of the proof is' the same as that used to prove Theorem 1 of [2]. Let us introduce the following notations: For x>0, y>0, set AT(x, y)= the number of primes p^x which split completely in some Lq, q^y, qeS', P(x, A:) = the number of primes p^x which split completely in Lk/Q; M(x, yl,y2) = thQ number of primes p^x which split completely in some Lv y\<q<y\\ 7rk(x) = the number of prime ideals of Lk of norm ^x; n(x, j/) = the number of primes p^x such that pesrf. We must prove that (1) n{x, s#) = d(s?) x/logx + o(x/logx) (x->oo), where (2) '(-)- I ^f ■ kls* n(k) Condition (ii) implies that for all sufficiently large x, we have (3) n{x9s/) = N{x9x). Moreover, for £>l ^£2 = x> we see as *n [2, Equation (3.1)] that (4) tc(x, ^) = AT(x, ^) + 0(M(x, ^, £2)) + 0(M(x, £2, x)) (x-oo), where the O-term constants do not depend on x, £l9 £2. We will choose ^=^(4 £2 = £2(x) so that (5) AT(x, il) = d{s/) x/logx + fl(x/logx) (*->oo), (6) M(x, £l^2) = o(x/\ogx) (*->oo), (7) M(x,£>2,x) = o{x/\ogx) (x->oo). It is clear that (5)-(7) together with (4) suffice to prove (1). In order to prove (5), let us first note that a simple combinatorial argument implies that (8) N(^ii) = I*Kk)P(x,k), k where ££ is over all those keS* whose prime factors are all ^^. Let 7Tfc(x) = the number of Lk-primes p such that Np^x and p is of absolute
SOME REMARKS ON ARITHMETIC DENSITY QUESTIONS 107 degree 1 and unramified in LJQ and let nk(x) = nk(x) — nk(x). Note that n'k(x) =n(k)P(x, k). Moreover, since the primes counted by nk(x) either ramify in LJQ or are of degree > 1, we see that *rk(x)£2\ag\dk\ + n(h) X 1 p2^x ^ ^2\og\dk\ + n(k)x112 = 0{n{k)ka + n{k)x112 by condition (i), where the O-term constant does not depend on Lk or x. Let 0(x) = Xp^* l°8P be the classical Tchebycheff function, where p runs over the primes. Then it is well-known that 6(2x)^2x for x^ 1. Thus, all k occurring in the sum Yjk are at most exp(2<^1). Let us choose £l = £l(x)so that (10) £i(x)-+oo asx-+oo, (11) sup ka<.x, fc^exp(2<Si) (12) exp(2^)^x1/2/log3x. For such £1? for /cgexp(2^1), we have P(x,k) = n'k(x)/n(k) = nk(x)/n(k)-nZ(x)/n(k) = nk(x)/n(k) + 0(ka + x1'2) (by (9)) = 7rk(x)/n(/c) + 0(x1/2) (by (12)). Therefore, by (8), k (13) =E*^^W + 0(^x1'2) Let us define {f/k(x) by ^fc (*) = Z *°S ^P' P a Prime ideal of Lk. Npm£x Then by assuming the Riemann hypothesis for the Dedekind zeta function of Lk,
108 LARRY JOEL GOLDSTEIN Lang [4, Equation (11RH)] has shown that (14) i//k(x) = x + 0(xl/2 log* log(|dk| xn(k})), where the O-term constant is absolute. By the usual trick for calculating nk(x) from ^fc(x), we derive from (14) that (15) nk(x)^U(x) + 0(x1'2 log(|dJ x»<*>)), where Li(x)=jj dy/logy. From equations (13) and (15), we derive that (16) N(x, «i) = [l* £|j] Li(x) + 0 (x1'2 p l°gW'«»x)yo(j^). Since all those k^£x belonging to S* are included in Yjk and since ££ l/n(k) converges, we see that since ^ (x)-*oo as x-*oo, we have M*)_ m+0( j. J_\_ s s<*)+0( z _L (17) * «W *6S-;*§«, "CO \*6S-;*>«, «(fc)/ keS»n(fc) VfceS»Tt>«l "CO Moreover, by condition (ii) we see that otx1'2 P log(|dj1/nwx)) = 0 (x1/2 X* log(fc"x)) (18) =o(x1/2logxpij (by (11)) = 0(exp(2£,)je1/2logx) = o(x/logx) (by (12)). By equations (16), (17), (18) we finally see that = d (s/) x/log x + o (x/log x), which proves (5). Let us now prove (6) for £2 = x1/2/\og2x. It is easy to see that (19) A#(xfflf£2)£ X P(x,«).
SOME REMARKS ON ARITHMETIC DENSITY QUESTIONS 109 Moreover, the definition of P(x, q) and some elementary reasoning implies that P(x,q)^nq(x)/n(q). Therefore, on the Riemann hypothesis, (15) implies that (20) P{x, q)^U(x)/n(q) + 0(x1/2 logO^I1^ *)). However, since ^q l/n(q) converges and since £>l (x)-*oo as x-+oo, we see that (2i) I 4rLiM=°(r^\ Moreover, by condition (i), ofx1'2 X logfl^^x^ofx1'2 S log(qax)) (22) = o(x1/2logx £ 0 = 0(x1/2logx(x1/2/log3x)) = o(xj\o%x). Estimates (19}-{22) suffice to prove (6). Finally, let us prove (7). Here is where we will make use of the hypothesis that LJQ is abelian for all sufficiently large q. Without loss of generality, let us assume that LJQ is abelian for all q^£2. In analogy to (19), we have (23) M(x,£2,x)^ X p(x>4)' Lztfq denote the conductor of LJQ. Then it is well known that there exist rational integers al{q),..., at(q) such that a rational prime p splits completely in LJQ if and only if p = at(q) (modfq) for some i (1 ^i^t). Thus, / is such that the density of primes which split completely in LJQ is just f/<£(jQ. However, by elementary density considerations, this implies that t = (j)(jq)/n(q). However, we then have (24) p^q)zt.i*m.xzi
110 LARRY JOEL GOLDSTEIN Combining (23) with (24), we have ^2<%xn(q) \log2xJ by condition (iii) and the choice of £2- This proves (7), and with it the Theorem. □ Bibliography 1. E. Artin, The collected papers of Emil Artin, Edited by S. Lang and J. T. Tate, Addison-Wesley, Reading, Mass., 1965. MR 31 #1159. 2. L. Goldstein, Analogues of Artin's conjecture, Trans. Amer. Math. Soc. 149 (1970), 431-442. MR 43 #4792. 3. C. Hooley, On Artin's conjecture, J. Reine Angew. Math. 225 (1967), 209-220. MR 34 #7445. 4. S. Lang, On the zeta functions of number fields, Invent. Math. 12 (1971), 337-345. 5. P. Weinberger, Counterexample to an analogue of Artin's conjecture, Proc. Amer. Math. Soc. 35 (1972), 49-52. University of Maryland, College Park
RELATIONS BETWEEN THE VALUES AT INTEGRAL ARGUMENTS OF DIRICHLET SERIES THAT SATISFY FUNCTIONAL EQUATIONS1 E. GROSSWALD 1. Introduction. Let m stand for a natural integer, £(j) (s = o + it, o, t real) for Riemann's zeta function and Bk for the A:th Bernoulli number, respectively. Then it is classical that C(-2m) = 0, £(l-2m)= -B2m/2m, and (2n)~2m C(2m) = (— 1 )m ~1 B2J2 (2m)!, where the second members are all rational values, but practically nothing is known about £(2ra — 1). In [3] it was shown that (27r)-«2m-,)C(2m_1)= £ akB2kB2m„2k-Rm, k = 0 where the ak are rational and Rm = (2n)~{2m~1)G(i\ with G(t)= f cke^\ (1) ck=2(i+7ifc(i-(-ir)/('"-i))Z^1 d\k -2m The "remainder term" Rm is quite small and, for m->oo, it approaches zero very rapidly, but it seems difficult to determine its arithmetic nature, i.e. whether it is rational, algebraic, or transcendental. For this reason, the following result [4] is rather remarkable. Let x = x{fy be a nonprincipal, primitive residue class character modulo the AMS 1970 subject classifications. Primary 10H05, 10H10, 12A70, 12A95, 44A15, 44A20; Secondary 13D15, 10A40, 30A20. 1 This paper has been written with partial support from the National Science Foundation through grant GP-23170. © 1973, American Mathematical Society 111
112 E. GROSSWALD natural integer /> 1, and denote the conjugate character by %. In analogy to (1), we define the coefficients ck(X) = 2X(k)(l + nk(l-(-ir)/(m-l)f) £ j^*/1"2" d\k and the function G(t, x) = ^k= i <*(*) e2nikz/f. For / = 1, ck{x) and G(t, *) reduce to previously defined ck and G(t), respectively. Now the correspondingly defined Rm(x) satisfies Rm(x)=(2n)l-2mG(i,X)=r2m+3'2 'l1 bkBkxB2m-\ k = 0 where the coefficients bk are rational and the £*, Leopoldt's generalized Bernoulli numbers, are algebraic, so that also Rm(x) is algebraic. In fact, if/> 1 and x(k) is a real, primitive character (mod/), then Rm(x)f~1/2 is actually rational. If it were possible to drop the conditions "nonprincipal, primitive" concerning the character, then it would follow that n~aC(a) would be rational for all positive integral arguments <z, which is very unlikely. The purpose of the present paper is to extend these considerations to a rather wide class of functions, representable in a halfplane by Dirichlet series. The results will apply, in particular, to the Dedekind zeta functions £K(s) of totally real fields K. The corresponding conclusions seem to indicate that it may be profitable to study the arithmetical nature of n~(2m~l) C(2m — 1) and, more generally, of 7r-(2m-1)nCx(2m-l), where CK(s) is the Dedekind zeta function of an algebraic number field K, of degree n over the field Q of rationals. This should be contrasted with some consequences of a conjecture of Lichtenbaiun [9], which point towards 7r2_2mC(2m— 1), and, more generally, towards n(2~2m)nC)K(2m — 1) as more likely to be of arithmetical significance. So, e.g., according to Lichtenbaum, 7r~2£(3) = ce{(p(u)\ with integral c, and where a is a generator of K5Z, e(--) is the canonical homorphism K5(C)-+R and (p the map KZ-+KC induced by the inclusion ZaC. On the other hand, by [3],.tT3 f(3) = 7/180-2 £«>=1 (k3(e2nk-1))"1. Following §2 with the notations, the main results are stated in §3. 
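The identity for $\zeta(3)$ quoted from [3] at the end of the introduction lends itself to a quick numerical check. The sketch below assumes the classical normalization $\zeta(3) = 7\pi^3/180 - 2\sum_{k \ge 1} k^{-3}(e^{2\pi k}-1)^{-1}$, that is, with the factor $\pi^{-3}$ applied to both terms on the right-hand side. The function name and the hard-coded reference value of $\zeta(3)$ are illustrative assumptions and are not taken from the paper.

```python
import math

# Reference value of zeta(3) (Apery's constant) to double precision.
# Hard-coded here as an assumption for comparison purposes.
APERY = 1.2020569031595942


def zeta3_from_lambert(terms=10):
    """Evaluate 7*pi^3/180 - 2 * sum_{k=1}^{terms} 1 / (k^3 (e^{2 pi k} - 1)).

    The series converges geometrically (ratio about e^{-2 pi}),
    so a handful of terms already reaches double precision.
    """
    series = sum(
        1.0 / (k ** 3 * math.expm1(2.0 * math.pi * k))
        for k in range(1, terms + 1)
    )
    return 7.0 * math.pi ** 3 / 180.0 - 2.0 * series


if __name__ == "__main__":
    value = zeta3_from_lambert()
    print("series value :", value)
    print("reference    :", APERY)
    # The rational main term 7/180 and the small "remainder" R_m of the text:
    print("pi^-3 zeta(3):", value / math.pi ** 3, " vs 7/180 =", 7.0 / 180.0)
```

The last printed line illustrates the remark that the remainder term is quite small: the rational main term $7/180$ already agrees with $\pi^{-3}\zeta(3)$ to about three decimal places.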
§4 contains an exposition of the general method and a main lemma. The proofs of the theorems and corollaries are collected in §5.

2. Notations. Throughout, $K$ stands for an algebraic number field of degree $n$ over the field $\mathbf{Q}$ of rational numbers, $n = 2r_2 + r_1$ ($r_1$ = number of real, $2r_2$ = number of complex conjugates), with discriminant $d$, class number $h$, regulator $R$, ring of integers $O_K$, and containing $w$ roots of unity. We also set $A = 2^{-r_2}\pi^{-n/2}|d|^{1/2}$ and $\varrho = 2^{r_1+r_2}\pi^{r_2}Rhw^{-1}|d|^{-1/2}$.
DIRICHLET SERIES THAT SATISFY FUNCTIONAL EQUATIONS 113 Na denotes the norm of the ideal aczK and the Dedekind zeta function of K is defined for a> 1 by CK(s)= £acK Na~s. We denote by g{k\ n\ m) Meyer functions (see [1, pp. 206-222]) depending on the indicated parameters. They generalize the exponential function to which they reduce for K=Q. S(t) stands for the function defined by (4). The symbol |(ff)-- will be used for lim^^ £-!•?•" and au(n) for £,,„</". 3. Main results. Theorem 1. For every positive, integral m, the three quantities \d\112 7c-2mnCx(2m), 7t-{2m-1)n-rKK(2m-l) and /T 17r"',(2m-1/2)S'(l) are dependent over the rational integers. Corollary 1.1. In case K is totally real, n-i2m-l)%K(2m-l) = c1+c2S'(l)R-ln-ni2m-l/2) with cx and c2 explicitly known, rational and depending only on m and the field K. Corollary 1.2. In case K is totally real, the quantities n'^2m-^n £K(2m-1) and S'{\) R-ln-n<2m-l/2) nave the same arithmetical nature. Theorem 2. Letyk = yk(m) = YjNa=kI,b\a (M))-(2m"1); then S'(l)=£ yk{g{1)(k;n;m) + g<2\k;n;m)} where g{i)(k; n; m) (i= 1, 2) are Meyer functions. Theorem 3. For a totally real field K, one has, with rational, explicitly known constants c3, c4, c5, the equality where F1(t)=~ W(s)¥(s + 4m-l)(2s + 4m-2)-l(<i;/i)-s ds, 2ni J {*) withW(s) = As{r(s/2)}nCK(s).
114 E. GROSSWALD For a function F2 (t), closely related to Fx (t), one has some information about the arithmetical nature of F2 (*'). Let f be an integral ideal of K and let x = x(a) be a nonprincipal, primitive ray-class character modulo f. The functions L(s, x) = Yjo^k x(a) Na~s are entire and satisfy (see [6]) a functional equation of the customary type, that involves an algebraic number W(x\ of absolute value one. It also is known (see [7]) that L(2ra, y) = n2mnd~Xil<x, with a algebraic and belonging to the cyclotomic field generated by the values of/(a) and the roots of unity of order TVf. We now set V(s,z) = Aa{r(s/2)YL(s9z) and define F2(t) as before, but by using Y(s, x) instead of Y(s). F2(z) depends, of course, also on a = 2m— 1 and on %(a), but, for simplicity, this will not be indicated in the notation. Theorem 4. For a = Am — 1 and x (a) a nonprincipal, primitive ray-class character modulo f, the function F2(x) satisfies F2(0 = i^(z)(d-Nf-7r-n4~r2)1/2{r(m)}2ri{r(2m)}2r2L2(2m,x). Corollary 4.1. With the same notations and under the same conditions as in Theorem 4, // K is totally real, then F2{i)^W{x) (d-Nln-y2 {/»}2« L2(2m, I). In particular, F2(i) = n{4m~ll2)np with p algebraic. Corollary 4.2. If the conclusions of Theorem 4 and Corollary 4.1 hold also for x(o) a principal character, then n-n^4m- *) ^K{Am— 1) is algebraic. 4. The general method. Let zj(j)=fl?=i r(avj+j?v), with av>0, pveC, let <p(s) = Yj?=i ake~XkS be a Dirichlet series, convergent for <7^<7o>0, and assume that for some real (not necessarily positive) r, <p(s) A(s) = (p(r — s) A(r—s). Following a tradition that goes back to Riemann, we consider also a rational function P(s) that satisfies the identity P(s) = (- If P(r-s) with (5 = 0 or 3=1. Then also (2) <p(s)A(s) P(s) = (-lf <p(r-s)A(r-s) P(v-s)
DIRICHLET SERIES THAT SATISFY FUNCTIONAL EQUATIONS 115 holds. Just as in the case of Riemann's factor s(s— 1), the purpose of P(s) is to kill poles, create poles or modify their order. If we denote the left-hand side of (2) by $(s), then (2) reads (2') *(s) = (-l)**(r-s). Most of what follows holds also in the more general setting of two distinct Dirichlet series, satisfying a functional equation, as studied by Chandrasekharan, Nara- simhan, Berndt and others (see [2] and papers quoted there), but in all intended applications the two series coincide; hence the greater generality is not needed here. Moreover, all functions that will occur will be meromorphic for o>ox, where ax =r — o0 — g, with some g>0, sufficiently small, but fixed. For Im t > 0 one can define the pair of reciprocal Mellin formulae 00 (3) F(T) = ^ | *(*) M"s*, #(*)= -i |f(t) (t/O5"1 dx, <<r2) ° where o2 = o0 + g. In the first formula of (3) we make the change of variable s<-+r — s, recall that al =r — oQ — s, use (2'), and finally shift the line of integration back to o2 by taking into account the residues at the poles in the strip ox =^o = o2. Very mild assumptions on <p(s) (e.g. \q>((i + it)\<\t\c for g1=^o<>g2, \t\^t0 is more than needed) and A (s) are sufficient to justify these operations (see, e.g. [3]). In this way, if 3 = 0 we obtain successively: >(s)(T/i)s-rds (a) (<n) -if* (<n) =-^ f 9(s)(Tlif-"ds - £ Res{4>(s)(T/irr} ={T/i)~r{L \ 0(s){z^1) 5ds~ <?< Res^(s)(T/os) = (tATF(-1/t)-(tATS(t/0 or
116 E. GROSSWALD F(-1/t)-(t//)'F(t) = S(t//), with (4) *(")= I Res{0(s)us}. If we now set t = it, t > 0, and observe that lim {F(it)-rrF(i/t)}/(t-l) = 2iF'(i) + rF(i), r-l we obtain 2iF(i) + rF(i)=-S'(l). If (2) and (2') hold with 5=1, then, proceeding as before, one obtains F{x)=~hi I *{s) {xlir~r ds (<r2) whence F(-1/t) + (t/iTF(t) = S(t/0. For t = i this yields the simple relation 2F(i) = S(l). Clearly, S(l)=£,lS<,s„ Res4>(s) and S'(l) = £0lSff§ff2 Res{s#(s)}. We formulate these results as a Main Lemma. With F{x) defined by (3), F(-l/r)-(-lf(r/iyF(T) = S(z/i); in particular, 2iF'(i) + rF(i)=- X Res{s<Z>(s)} if3=0, 2F(i)= X Res{*(s)} if<5 = l.
DIRICHLET SERIES THAT SATISFY FUNCTIONAL EQUATIONS 117 By proper choice of the function <P(s), the residues can be computed and depend on the values of q>(s) at integral arguments. Hence, at least in principle, the Main Lemma permits us to obtain relations between several values of <p(s) for s an integer. These relations can be used in a variety of ways. Particular cases of this Main Lemma have already occurred in the literature. The case r = 0, A(s) = r(s), Psl, <p{s) = {2n)-' f(s) C(s+l) = E?Li <>-,(k) e~x«s, with ?,k = \og(2nk), has been used by A. Weil [10] in order to give a proof of the transformation formula for Dedekind's ^-function. Next, with the same Ak, with A(s) = r(s/2)r{(s + a)/2)9 P=2l-m(s+\)(s+3)-(s + a-2) and with r = 1 — a, a = 2m — 1 an odd, positive integer and cp(s) = ££L l o-a (k) e~ AkS, this Main Lemma was used in [3] to gain information about C(2m— 1). On the other hand, in [4], the Main Lemma was used with r=l— a, the same A(s) and P(s) as above and (p(s) = (2n/f)~s L(s9 x)L{s + a, x)=S°=i M*) e~^s (lik = \og(2nk/f); ak(x) = ak(x; a) certain coefficients depending on /c, the non- principal, primitive character #(mod/) and the odd integer a; L(s, x) = X*°= i x(fy k "s), in order to obtain information about the arithmetic character of F(i) and 2iF'(i) H-rF(i), with the result that (for x primitive, nonprincipal only!) F(i)9 or 2iF'(i) — (a — 1) F(i), respectively, are of the form naC, with C an explicitly known algebraic number. 5. Proofs. We shall consider now some applications of the Main Lemma that lead directly to the proofs of the theorems and corollaries of §3. 5.1. Proofs of Theorem 1 and its corollaries. Let K be an algebraic extension of Q of degree n = r1 +2r2, and let £K(s) be the Dedekind zeta function of K. We recall that (see [5]) T{s) = As{r(s)}r2{r{s/2)}ri £K{s) satisfies the functional equation (5) y(s)=!P(l-s). For odd, rational integer a = 2m— 1 ^3, we now set (6) #(s)=!P(s)!P(s + <i). 
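Then, by (5) and (6), $\Phi(s) = \Psi(1-s)\,\Psi(1-s-a) = \Psi(1-s-a)\,\Psi((1-s-a)+a) = \Phi(1-s-a) = \Phi(r-s)$, with $r = 1-a$. Also,
(7) $\Phi(s) = A^{2s}\Delta(s)\sum_{k=1}^{\infty} c_k k^{-s} = \Delta(s)\sum_{k=1}^{\infty} c_k e^{-\lambda_k s}$,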
118 E. GROSSWALD with A(s) = {r(s)r(s+a)}'2{r(s/2)r((s+a)/2)}r>, Xk = \og{kA~2) and ck = Y,Na=k Zb|«(^)"- Clearly, the Main Lemma is applicable. It remains to compute S(z/i) and (<72) We recall that CK(s) has (see, e.g. [5]) a simple pole at s= 1, with residue g, a zero of order rx + r2 — 1 at j = 0 and zeros of orders r2 and r1-\-r2 at the negative odd and negative even integers, respectively. Perusal of this information leads to the simple (and, because of past experience, somewhat unexpected) result that <P(s) is holomorphic in the strip ol%o^o2, except for the four poles at 5= —a, 1 —a, 0, and 1. The corresponding residues are computed routinely, with the result that -S'(l) = 2r>hRw~1 {(a+ 1) \d\1/2n-nl22-r2r{a+ l)r2r{(a+ l)/2)ri £K(a+ 1) -(fl-ijr^ri^Wfl)}, or, equivalently, with a = 2m— 1, W =-C2(m, K) R-ln-n(2m-ll2)S'{\), where C1(m,X) = (l-m-1)(m-l/2)-r2{41-m(2^:?)}ri((2^:?) = binomial coefficient) and C2(m,K) = w/i-12ri+r2-1m-1{4(m-l)!}-ri{(2m-l)!}-r2 are rational numbers. This proves Theorem 1. More can be said if we restrict our attention to totally real fields K. In that case r x = n, r2 = 0, w = 2, Cam,K) = (l-m-1){41--(2-:2)r,C2(m,X)={(2(m-l)!r/im}-1, and it is known (see e.g. [7]) that d1,2n~2mnCK(2m) = C3(m, K) is also rational. Equation (8) becomes, with appropriate rational constants cl9 c2 (depending on m and K only) (8') n-{2m-1)ntK(2m-\) = c1+c2S'(\)R-1n-n{2m-ll2). This proves Corollary 1.1. Corollary 1.2 is a simple reformulation of Corollary 1.1.
DIRICHLET SERIES THAT SATISFY FUNCTIONAL EQUATIONS 119 5.2. Proof of Theorem 2. The determination of the arithmetical nature of Cx(2m-1) has been reduced to that of {Rnn{2m-ll2)}~1 S"(l). For r=l-a, a = 2m—\, one has, from (3), (4) and the Main Lemma, that iff(l) = (m-l) F(Q-iF(i), F(i) = ~ | *(s) ds, ^'(0 = ^ J **M *■ (<72) <ff2) If we now use (3) and (7), we obtain, with xk = k~2A49 irci J k=i k = i Zni J (^2) (<T2) (<T2) <<T2/2) with rational coefficients gk. In the case of a totally real field K, these integrals, and the similar ones for iF'(i\ can be obtained explicitly with the help of Meyer's G-functions (see [1, pp. 206-222]). Specifically, ^ | {r{s/2)r{{s^a)/2)Yx^ds=2'~ | {r(s)r(s + a/2)}»x^s (<r2/2) = 2G2#(xk|l, 1,..., 1, l-a/2, l-fl/2,..., l-a/2) = ^(/c; n;fl), say, and F(i) = Yj?=i 9k9{^\ n\ #)• The expression, iF'(f) can be treated similarly and this finishes the proof of Theorem 2. For K = Q, these series reduce to the Lambert type series ^=1 k~a{e2nk- l)"1 and X?=i e2nkkl~a{e2nk-\)-2, respectively encountered in [3]. 5.3. Proof of Theorem 3. Let us consider the same situation as before with a+ 1 =4m, <P(s) defined by (6) and set &1(s) = <P(s)(2s + a-l)-1. Then ■01{s)=-01{l-a-s) and <Px(s) has five simple poles at s = -a, —a+1, -(a- l)/2, 0, and 1. We set again F1(t) = (1/27iz) j(<r) ^(j) (t//)"s ds. The computation of the residues is routine and (after division by 2) the Main Lemma with S = 1 yields
120 E. GROSSWALD Fx{x) = 2'i{hRlw){a-\)-1 r(a)r>r(a/2r CK(a) (9) -\d\V2 2ri-ra7r-"/2(fcU/w)(fl+l)-1 r(a+l)r2r((a + l)/2)ri C*(a+1) + |d|l/2 2-r2-2 ^-n/2 J^ + l)/2)^ T{{d+ \)/4)2r> £*((<*+l)/2)2. If K is totally real, then Cx(^+l) = Cx(4m) = 7r4^-1/2c4m, Cx((^ + l)/2) = Cx(2m) = 7r2^-1/2c2m, with rational constants c4m and c2m, and (9) becomes where B = 4'n(2m- 1) p1"4"^: ?)}"", k3 = 2"c4m/2m, k4 = {(2m-1)!}, and A:5 = 4{(2ra — 1)!}~", are all rational. Finally, set c^Bkj and this finishes the proof of Theorem 3. 5.4. Proof of Theorem 4 and its corollaries. While the arithmetical nature of F^i) is unknown, for a closely related function F2(x) one knows that F2(/) = 7rn(4m_1/2) a, with a algebraic. With the same notations as before (see especially § 4) let f be an integral ideal of K and let % = %(a) be a nonprincipal, primitive character of the group of ray- classes, i.e. of the congruence classes (mod f) of ideals in Ok prime to f. The L-function L(s, y) = Yja^K x(a) Na~s satisfies (see [6]) the functional equation where ¥(s; x) = Al{r(s/2)}ri {r(s)}r2L(s, x) with A2 = (d- JVf •7r_',4-r2)1/2, and W(x) = C(x)(N^)~1/2, with C(x) a certain generalized Gaussian sum, |W(x)| = l, W(x) algebraic. L(s,x) is an entire function, different from zero for c^l, and vanishes to the order r2 at the odd negative integers and to the order rl-\-r2 at s = 0 and at the even negative integers. We set &2(s) = A2-aY(s)Y(s + a)(2s + a-l)-1 = Als{r(s/2)r((s + a)/2)}^{r(s)r(s + a)rL(s,x)L(s + a,x)(2s + a-l)-' and observe that <P2(l— a — s)= —<P2(s), so that the Main Lemma applies with 5= 1. We set F2(T) = (\/2ni) J((T2) <P2(s) (t/i)~s ds. It is clear that *F(s, x) is an entire function; hence, <P2(s) has only a single pole at s = (\—a)/2. The corresponding residue is Qi=Wix) A2r((a+l)/4)^r((a+l)/2)^L((a+l)/2, If.
DIRICHLET SERIES THAT SATISFY FUNCTIONAL EQUATIONS 121 Proceeding as before, we obtain 1 f ds Fl[T)=~hi \*2{s){*liYsds=-±-i J^l-s-aMT/ir*-1 (<r2) ( $2 (s) (t/i)s+a-1 ds + (x/if-1,/2 q2 , <ffl) --f 2ni J (<T2) or f2W + (T/0'-1F2(-l/T) = (T/ir-1)/2C2. We set t = i and a + 1 = Am and obtain 2F2{i) = Q2 = 2-W{X)(dN\ s-4-')1'2 •{r((a+1)/4)}2- {r((a+1)/2)}2- {L((a+1)/2, *)}2, F2{i)=W(%) (d- Nln-"4-'>y'2{r(m)}2" {r(2m)}2'*{L(2m, x)}2. This finishes the proof of Theorem 4. In the case of a totally real field, the last equation becomes (10) F2(i) = iW(X)(d-Nln-«)l<2{r(m)}2"{L(2m,x)}2. However, it is known (see [8]) that n~2mnL(2m, x) is algebraic, so that (10) becomes F2(i) = n-n,2'n4mn'p, with /? algebraic, and this finishes the proof of Corollary 4.1. Finally, Corollary 4.2 follows immediately from Theorem 3 and Corollary 4.1. Bibliography 1. A. Erdelyi et al., Higher transcendental functions. Vol. 1. The hyper geometric function, Legendre functions, McGraw-Hill, New York, 1953. MR 15, 419. 2. B. C. Berndt, Identities involving the coefficients of a class of Dirichlet series. Ill, Trans. Amer. Math. Soc. 146(1969), 323-348. MR 40 #5551. 3. E. Grosswald, Die Werte der Riemannschen Zetafunktion an ungeraden Argumentstellen, Nachr. Akad. Wiss. Gottingen Math. Phys. Kl. II 1970, 9-13. MR 42 #7606.
4. E. Grosswald, Remarks concerning the values of the Riemann zeta function at integral odd arguments, J. Number Theory 4 (1972), 225-235.
5. E. Hecke, Über die Zetafunktion beliebiger algebraischer Zahlkörper, Nachr. Ges. Wiss. Göttingen Math.-Phys. Kl. 1917, 77-89; Math. Werke, pp. 159-171.
6. E. Hecke, Über die L-Funktionen und den Dirichletschen Primzahlsatz für einen beliebigen Zahlkörper, Nachr. Ges. Wiss. Göttingen Math.-Phys. Kl. 1917, 299-318; Math. Werke, pp. 178-197.
7. H. Klingen, Über die Werte der Dedekindschen Zetafunktion, Math. Ann. 145 (1961/62), 265-272. MR 24 #A3138.
8. H. Klingen, Über den arithmetischen Charakter der Fourierkoeffizienten von Modulformen, Math. Ann. 147 (1962), 176-188. MR 25 #2041.
9. S. Lichtenbaum, Private communication.
10. A. Weil, Sur une formule classique, J. Math. Soc. Japan 20 (1968), 400-402. MR 37 #155.

Temple University
ON THE INCOMPATIBILITY OF TWO CONJECTURES CONCERNING PRIMES

DOUGLAS HENSLEY AND IAN RICHARDS

0. Introduction. This is an account of work published in more detail elsewhere (cf. [5]). We show that the conjecture

(A) $\pi(x+y) \le \pi(x) + \pi(y)$ (where $x$, $y$ are integers $\ge 2$)

is incompatible with

(B) the "prime $k$-tuples conjecture" (definition below).

That is, at least one of these conjectures must be false. (We lean towards the opinion that the prime $k$-tuples conjecture is true, and (A) false.)

The inequality (A) can be written $\pi(x+y) - \pi(y) \le \pi(x)$. Thus (A) states that no interval of length $x$ contains more primes than the first $x$ integers.

The prime $k$-tuples conjecture is the natural extension of well-known hypotheses concerning "twin primes" $X$, $X+2$, triples $X$, $X+2$, $X+6$, etc. It refers to a $k$-tuple of functions $X+b_1, \ldots, X+b_k$. The conjecture states:

(B) There are infinitely many integers $n > 0$ for which all of the values $n+b_1, \ldots, n+b_k$ are prime if and only if
(*) for each prime $p$, there is some congruence class (mod $p$) which contains none of the constants $b_i$.

A sequence of integers $b_1 < b_2 < \cdots < b_k$ which satisfies (*) is called admissible. (To form an admissible sequence on an interval of length $x$, eliminate one congruence class (mod 2), then one class (mod 3), one class (mod 5), etc. until the next prime exceeds the number of points which remain.)

Definition. We denote by $\rho^*(x)$ the maximum value of $k$ for which there is an admissible $k$-tuple $b_1 < b_2 < \cdots < b_k$ on an interval $y < b_i \le y+x$ of length $x$.

The prime $k$-tuples conjecture implies that infinitely often as $y \to \infty$ there are intervals $y < n \le y+x$ (of fixed length $x$) which contain $\rho^*(x)$ primes.

AMS 1970 subject classifications. Primary 10H15, 10H25, 10H30.
© 1973, American Mathematical Society
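Condition (*) and the parenthetical recipe for building an admissible sequence can be sketched in code. The following is a minimal illustration, not the authors' program: the function names are hypothetical, and the rule "remove the residue class holding the fewest remaining points" is an assumed greedy choice, since the text does not say which class to eliminate. The result is always admissible, so its length is a lower bound for $\rho^*(x)$, but it need not be the maximum.

```python
import math
from collections import Counter


def is_prime(n):
    """Trial-division primality test; adequate for small illustrations."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def next_prime(p):
    """Smallest prime strictly greater than p."""
    q = p + 1
    while not is_prime(q):
        q += 1
    return q


def is_admissible(bs):
    """Check condition (*): every prime p misses some class b_i mod p.

    Only primes p <= k need testing, because k integers cannot
    occupy all p residue classes when p > k.
    """
    bs = list(bs)
    p = 2
    while p <= len(bs):
        if len({b % p for b in bs}) == p:
            return False
        p = next_prime(p)
    return True


def greedy_admissible(y, x):
    """Sieve the interval y < b <= y + x, one class per prime.

    For p = 2, 3, 5, ... remove the class mod p containing the fewest
    remaining points (assumed greedy rule), stopping once the next
    prime exceeds the number of points that remain.
    """
    pts = list(range(y + 1, y + x + 1))
    p = 2
    while p <= len(pts):
        counts = Counter(b % p for b in pts)
        r = min(range(p), key=lambda c: counts.get(c, 0))
        pts = [b for b in pts if b % p != r]
        p = next_prime(p)
    return pts


if __name__ == "__main__":
    print(is_admissible([0, 2]))      # twin-prime pattern
    print(is_admissible([0, 2, 4]))   # covers all classes mod 3
    print(greedy_admissible(0, 20))
```

Every prime processed by the loop keeps an empty class, and every later prime exceeds the number of surviving points, which is why the output satisfies (*).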
Thus if we set $\rho(x) = \limsup_{y\to\infty}[\pi(y+x) - \pi(y)]$, then $\rho(x) \le \rho^*(x)$, and the prime $k$-tuples conjecture implies $\rho(x) = \rho^*(x)$.

It has been suggested that $\rho^*(x) \le \pi(x)$ for $x \ge 2$ (cf. [2], [4], and [9]). However this is false (and the contrary result for $\rho^*(x)$ does not depend on any conjectures). We will show that:

(C) $\lim_{x\to\infty} [\rho^*(x) - \pi(x)] = +\infty$.

If we assume the prime $k$-tuples conjecture (B), then we obtain:

(C′) For all sufficiently large $x$, there exist infinitely many $y$ such that $\pi(y+x) - \pi(y) > \pi(x)$.

(Of course these values of $y$ may be much larger than $x$; furthermore they are likely to be very sparse.)

Hardy and Littlewood (cf. [4]) introduced the function $\rho^*(x)$, and they observed that $\rho^*(x) \ge \pi(2x) - \pi(x) \sim x/\log x$. Our result shows that $\rho^*(x) > \pi(x)$ for large $x$. The most that is known concerning an upper bound for $\rho^*(x)$ follows via the sieve method of Brun and Selberg: $\rho^*(x) \le \mathrm{Const}\cdot\pi(x)$. (Montgomery has proved $\mathrm{Const} \le 2$; cf. [7].) It seems likely that the constant can be taken arbitrarily close to 1 as $x \to \infty$, which would give $\rho^*(x) \sim \pi(x)$.

We note the trivial inequalities $\rho^*(x+y) \le \rho^*(x) + \rho^*(y)$, and $\rho^*(x) \ge \pi(y+x) - \pi(y)$ for all $y \ge x$.

Our study of the function $\rho^*(x)$ began with a computer search to find values $x \ge 2$ for which $\rho^*(x) > \pi(x)$. It was this search which led us eventually to a theoretical solution. (Our data also showed that $\rho^*(10^5) > \pi(10^5)$, and it is known that $\rho^*(x) \le \pi(x)$ for $x \le 146$; cf. [5] and [8].) We wish to thank William Franta and Richard Franta of the Computer Sciences Department and the Computer Center for writing a machine-language program which enabled us to carry out our calculations.

1. The main result. We adhere to the notations of §0.

Theorem. $\lim_{x\to\infty} [\rho^*(x) - \pi(x)] = +\infty$; the difference is $\ge (\log 2 - \varepsilon)\,[x/(\log x)^2]$.

Proof. The theorem follows from two lemmas. Of these, the first is easy, while the second requires several more lemmas before it is established. The idea is to take an interval of integer points $-x/2 < n \le x/2$ located symmetrically about the origin.
Then we will construct a set of points $\{b_i\}$ by eliminating points as follows. First fix an integer $N \ge 3$. Eliminate all multiples (positive and negative) of all primes $p \le x/(N\log x)$ (the "hard" sieve of Eratosthenes, where the prime itself is not saved). Call what remains the residual set. (This set consists of the primes
between $x/(N\log x)$ and $x/2$, and their negatives, plus the points $\pm 1$.) Then

Lemma 1. The number of points in the residual set exceeds $\pi(x)$ by an amount asymptotic to $[\log 2 - 2/N]\,[x/(\log x)^2]$.

Lemma 2. The residual set is an admissible set for $\rho^*(x)$ (cf. §0) as soon as $x$ is large enough.

Remark. It would be trivial that the residual set is admissible if we stopped at primes $p > 2\pi(x/2) \sim x/\log x$ (since then there would be more congruence classes (mod $p$) than points in the residual set). However we have an average of about $N$ points per congruence class ($N$ is fixed). We need to show that as the number of trials increases (i.e. as $x \to \infty$), then at least one empty class appears.

Proof of Lemma 1. The number of points remaining is (with an error of $\pm 2$ or less) $2\pi(x/2) - 2\pi(x/(N\log x))$. The lemma now follows from the well-known fact that $2\pi(x/2) - \pi(x) \sim \log 2\,\bigl(x/(\log x)^2\bigr)$ (cf. [6, Chapter 3]).

Proof of Lemma 2. As stated above, several auxiliary lemmas will be necessary. We begin with a lemma (to roughly the opposite effect as our theorem) about how few primes are necessary to completely eliminate a sequence of $t$ consecutive integers. (Later on, $t$ will be $(N+1)\log x$, the number of multiples of any prime $q > x/(N\log x)$ between $-x/2$ and $x/2$.)

Lemma 3. Let $T = T(t)$ be the minimum number of primes in the natural ordering $p_1 = 2$, $p_2 = 3, \ldots, p_T$ such that there are congruence classes $n \equiv a_i \pmod{p_i}$, one congruence class for each $p_i$, whose union contains the entire interval $1 \le j \le t$. Then $T(t) = o[\pi(t)]$.

Remark. The lemma is contained in a result of Erdős (cf. [1]) that the maximum gap between primes $p_{n+1} - p_n$ is asymptotically larger than $\log p_n$. However his result uses a difficult argument to obtain quantitative results which go farther than we need.

Proof. We start with Mertens' theorem (cf. [6]):

$\prod_{p<x}\left(1 - \frac{1}{p}\right) \sim \frac{e^{-\gamma}}{\log x}$.

Let $C_M = \prod_{M<p<e^M}(1 - 1/p) \sim \log M/M$. We will show that, given a large fixed number $M$, we can make $T(t) < \pi(t/M) + C_M\pi(t)$ if $t$ is large enough.
Since $M$ is arbitrary, this implies $T(t) = o[\pi(t)]$.

First apply the "hard" sieve of Eratosthenes, taking out all multiples of the primes in the two ranges $1 < p \le M$ and $e^M \le p \le t/M$ (saving the middle range for
later use). What remains of the original interval $1 \le j \le t$ is: (a) primes $> t/M$, and (b) integers all of whose prime factors come from the fixed middle interval $M < p < e^M$. If $t$ is large enough, the set (b) becomes negligible in comparison with (a).

Now use the primes in the middle interval $M < p < e^M$ in an optimal way. This reduces the residual set of $\le \pi(t)$ elements by a multiplicative factor $\le C_M$ (= the product of the corresponding $1 - 1/p$).

Finally remove the remaining points (at most $C_M\pi(t)$ in number) one at a time, using another $C_M\pi(t)$ primes. In all, $\pi(t/M) + C_M\pi(t)$ primes have been used. This proves Lemma 3.

Lemma 4. Consider $t \to \infty$ in Lemma 3. Then $\prod_{i\le T} p_i = e^{o(t)}$.

Proof. $T = T(t) = o[\pi(t)]$ by Lemma 3, whence $p_T = o(t)$. The prime number theorem for $\vartheta(x)$ (cf. [6]) gives $\sum_{i\le T}\log p_i \sim p_T$, so $\prod_{i\le T} p_i = e^{o(t)}$.

Proof of Lemma 2, completed. Take any prime $q > x/(N\log x)$. We have to show that the residual set (formed by sieving out the primes up to $x/(N\log x)$) leaves at least one congruence class (mod $q$) empty. Let $t$ denote the number of multiples of $q$ in the interval $-x/2 < n \le x/2$; then $t \le (N+1)\log x$. Choose $T = T(t)$ as in Lemma 3: each prime $p_i \le p_T$ eliminates a set $\{j_i\}$ of points from the interval $1 \le j \le t$, so that the union of the $\{j_i\}$ is the whole interval. Furthermore, the different elements of $\{j_i\}$ are all congruent (mod $p_i$)!

Now for $-(x/2+q) \le y < -x/2$, no points other than the $t$ points $y+q$, $y+2q, \ldots, y+tq$ fall between $-x/2$ and $x/2$. Since by Lemma 4, $\prod_{i\le T} p_i < x/(N\log x) < q$, there exists some $y$, $-(x/2+q) \le y < -x/2$, such that $y + j_i q \equiv 0 \pmod{p_i}$, $1 \le i \le T$ (Chinese remainder theorem). (If this holds for one $j_i$ in $\{j_i\}$, then it holds for all!)
The union over $i$ and $\{j_i\}$ of the points $j_i q + y$ is the entire congruence class $jq + y$. Thus $y$ gives a congruence class (mod $q$) which, in the interval $-x/2 < n \le x/2$, hits no member of the residual set (since it hits only multiples of the relatively small primes $p_i \le p_T < t = O(\log x)$, whereas primes up to $x/(N\log x)$ have been sieved out). Q.E.D.

References

1. P. Erdős, On the difference of consecutive primes, Quart. J. Math. Oxford 6 (1935), 124-128.
2. P. Erdős, Some unsolved problems, Michigan Math. J. 4 (1957), 291-300. MR 20 #5157.
3. P. Erdős and J. L. Selfridge, Complete prime subsets of consecutive integers (to appear).
4. G. H. Hardy and J. E. Littlewood, Some problems of "partitio numerorum". III. On the expression of a number as a sum of primes, Acta Math. 44 (1923), 1-70.
5. D. Hensley and I. Richards, Primes in intervals (to appear).
6. A. E. Ingham, The distribution of prime numbers, Cambridge Univ. Press, 1932; reprint, Hafner, New York.
7. H. L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227, Springer-Verlag, New York, 1971.
8. A. Schinzel, Remarks on the paper "Sur certaines hypothèses concernant les nombres premiers", Acta Arith. 7 (1961/62), 1-8. MR 24 #A70.
9. A. Schinzel and W. Sierpiński, Sur certaines hypothèses concernant les nombres premiers, Acta Arith. 4 (1958), 185-208; erratum, 5 (1959), 259. MR 21 #4936.

University of Minnesota
ON THE INTERVALS BETWEEN CONSECUTIVE TERMS OF SEQUENCES

CHRISTOPHER HOOLEY

AMS 1970 subject classifications. Primary 10L99.
© 1973, American Mathematical Society

1. Introduction. Let $s_1, s_2, \ldots, s_r, \ldots$ be a sequence of integers that is defined in some natural way; let $f(n)$ be the counting function of the sequence, i.e. $f(n) = 1$ if $n$ belongs to the sequence, and $f(n) = 0$ otherwise; and let

$F(x) = \sum_{s_r \le x} 1 = \sum_{n \le x} f(n)$

be the number of members of the sequence not exceeding $x$. Then in many well known cases the overall asymptotic behaviour of the sequence has been determined in the sense that it has been possible to prove an asymptotic formula of the type

(1) $F(x) \sim x/g(x)$ as $x \to \infty$,

where, characteristically, $g(x)$ is either a positive constant or a slowly increasing function tending to $\infty$ as $x \to \infty$. Results, however, of this form leave unresolved a corpus of difficult questions related to the theory of the finer structure of such sequences. In particular there are many problems connected with the distribution of the intervals $s_{r+1} - s_r$, and it is with this aspect of the theory that we shall be concerned in this article.

We shall consider the following three questions with an emphasis on the latter two:
(i) An upper bound for $s_{r+1} - s_r$ in terms of $s_r$;
(ii) an upper bound for the sum

(2) $\sum_{s_{r+1} \le x} (s_{r+1} - s_r)^{\gamma}$,
where $\gamma$ is a positive constant. In particular the question as to whether the upper bound
$$\sum_{s_{r+1}\le x}(s_{r+1}-s_r)^{\gamma}=O\bigl(x\,g^{\gamma-1}(x)\bigr) \tag{3}$$
holds for some or all $\gamma>1$ in respect of a given sequence satisfying (1) (it holds trivially for $\gamma=0$ and 1, and hence for $0\le\gamma\le 1$ by the Cauchy-Schwarz inequality).
(iii) The distribution of the intervals $s_{r+1}-s_r$. In particular the question of whether there is a distribution function for $(s_{r+1}-s_r)/g(s_r)$ in respect of a given sequence that satisfies (1) for the case $g(x)\to\infty$ as $x\to\infty$; that is to say, if $M_c(x)$ for a given positive constant $c$ be the number of intervals $s_{r+1}-s_r$ of length not exceeding $c\,g(s_r)$ for which $s_{r+1}\le x$, then does
$$\lim_{x\to\infty}\frac{M_c(x)}{x/g(x)}$$
exist? There is of course a corresponding question for the case $g(x)=\text{const}$ provided its statement is appropriately rephrased.

The above topics naturally by no means exhaust the list of interesting questions about the intervals $s_{r+1}-s_r$. They have, however, been chosen not only for their intrinsic importance but also because they are the ones most closely connected with the author's own researches on the subject.

In order that as much ground as possible should be encompassed, our survey will be of an expository nature, proofs as such not being given for the results we quote. Where, as in most cases, the results have already appeared in the literature, full details of the methods used can be found through the references given. As for the new results announced for the first time in this article, it is intended that a full account of them should be published shortly.

After beginning by making some observations about the relationships between the three problems, we shall outline some general principles and ideas that are relevant to the second and third ones. During the latter part of the article we shall then summarize what has been discovered about some of the more familiar sequences through the application of these ideas and others.
Four sequences are chosen as examples, these being, in order of consideration, the numbers not exceeding $n$ that are relatively prime to $n$, the primes, the numbers expressible as a sum of two squares, and the square-free numbers.

2. General considerations. Any upper bound in (i) naturally gives rise to an upper bound in (ii), since
$$\max_{s_{r+1}\le x}(s_{r+1}-s_r)=O(h(x)) \tag{4}$$
implies that
$$\sum_{s_{r+1}\le x}(s_{r+1}-s_r)^{\gamma}=O\Bigl\{h^{\gamma-1}(x)\sum_{s_{r+1}\le x}(s_{r+1}-s_r)\Bigr\}=O\{x\,h^{\gamma-1}(x)\}.$$
But this consideration is not of much avail in practice because the upper bounds found for (i) usually fall far short of what is probably true. There is furthermore the point that best possible results for (ii) are usually not theoretically attainable in this manner because (4) in many cases is false with $h(x)=g(x)$ (as, for instance, in the four examples given later). Conversely a result of type (3) for (ii) implies that
$$s_{r+1}-s_r=O\bigl(x^{1/\gamma}g^{1-1/\gamma}(x)\bigr)$$
for $s_{r+1}\le x$, which for large $\gamma$ would supply a good answer to (i). Unfortunately, however, most of the methods used to consider (ii) are circumscribed by the fact that they become increasingly dependent at some point or other on good bounds for (i) as $\gamma$ is taken larger.

There are also some connections between (ii) and (iii). For example, if (3) holds for $\gamma=\alpha$ and if there is a distribution function for $(s_{r+1}-s_r)/g(s_r)$, then (3) may be strengthened for $\gamma<\alpha$ so that it becomes an asymptotic equality of the form
$$\sum_{s_{r+1}\le x}(s_{r+1}-s_r)^{\gamma}\sim A(\gamma)\,x\,g^{\gamma-1}(x)\quad\text{as } x\to\infty.$$
A less direct association is seen by letting $N_x(l)$ be the number of intervals $s_{r+1}-s_r$ of length $l$ for which $s_{r+1}\le x$, and then writing the sum (2) in the form
$$\sum_{l}N_x(l)\,l^{\gamma}. \tag{5}$$
This could be estimated by partial summation provided satisfactory bounds could be provided for sums of the form $\sum_{l>X}N_x(l)$, which are clearly closely related to $M_c(x)$ for $c=X/g(x)$. Although the methods used to establish the existence of a distribution function for (iii) do not in fact readily extend to deal with the case where $c$ is unbounded, it is nevertheless frequently convenient to express (2) by (5) and then to perform a partial summation through the use of the cognate sum
$$\sum_{l>X}l\,N_x(l), \tag{6}$$
about which information can be obtained in a way to be described below.

This completes our remarks about the mutual relationships between (i), (ii), and (iii).
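The relations between the sum (2), its rearrangement (5) and the tail sum (6) can be checked numerically on a small example. The following Python sketch is an illustration only: it takes the square-free numbers up to an arbitrarily chosen bound as the sequence, and the helper names (`squarefree_upto`, `gap_moment`, `gap_histogram`, `tail_sum`) are hypothetical, not taken from the text.

```python
from collections import Counter


def squarefree_upto(x):
    """Return the sorted list of square-free integers in [1, x]."""
    is_sf = [True] * (x + 1)
    d = 2
    while d * d <= x:
        for m in range(d * d, x + 1, d * d):
            is_sf[m] = False
        d += 1
    return [n for n in range(1, x + 1) if is_sf[n]]


def gap_moment(seq, gamma):
    """The sum (2): (s_{r+1} - s_r)**gamma over consecutive terms of seq."""
    return sum((b - a) ** gamma for a, b in zip(seq, seq[1:]))


def gap_histogram(seq):
    """N_x(l): the number of consecutive intervals of each length l."""
    return Counter(b - a for a, b in zip(seq, seq[1:]))


def tail_sum(hist, X):
    """The sum (6): l * N_x(l) over all lengths l > X."""
    return sum(l * c for l, c in hist.items() if l > X)


x = 10_000                      # illustrative bound, chosen arbitrarily
s = squarefree_upto(x)
hist = gap_histogram(s)
for gamma in (1, 2, 3):
    direct = gap_moment(s, gamma)
    via_hist = sum(c * l ** gamma for l, c in hist.items())   # the form (5)
    print(gamma, direct, via_hist, tail_sum(hist, gamma))
```

Taking $h(x)$ in (4) to be the largest interval actually observed, the same data also satisfy the elementary bound `gap_moment(s, g) <= max(hist) ** (g - 1) * gap_moment(s, 1)`.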
We discuss next the role played in the theory by sums of the form
$$R(x;l)=\sum_{n+l\le x}f(n)f(n+l),$$
which obviously have some relevance to the problems at hand. Indeed, since $R(x;l)$ is equal to the number of pairs $s_u$, $s_v$ (not necessarily consecutive) for which $s_v-s_u=l$ and $s_v\le x$, $R(x;l)$ provides an upper bound for $N_x(l)$ that is likely to be satisfactory for small values of $l$. It was in fact through this idea that Erdős found the first proof of the theorem that
$$\liminf_{r\to\infty}\frac{p_{r+1}-p_r}{\log p_r}<1.$$
Nevertheless the bound supplied by $R(x;l)$ will not be of much direct use for the larger values of $l$ that are of importance to us here, because for such values almost all the intervals $s_v-s_u$ of length $l$ will contain many terms of the sequence. We must instead look to the ideas of the next paragraph in order to see how useful these sums turn out to be.

The sums appear again when we introduce a method of correlating the frequency of occurrence of large intervals $s_{r+1}-s_r$ with the frequency of occurrence of the zero values taken by sums of the form
$$S(n,h)=F(n+h)-F(n)=\sum_{n<m\le n+h}f(m). \tag{7}$$
To estimate the frequency of abnormally small values of $S(n,h)$ we consider in the first place the 'variance'
$$\sum_{n\le x}\{S(n,h)-h/g(x)\}^2, \tag{8}$$
which in most cases will be near enough
$$\sum_{n\le x}S^2(n,h)-h^2x/g^2(x), \tag{9}$$
since the expected value of $S(n,h)$ will be about $h/g(x)$ on account of (1). As for the sum in the first term of (9), it can be expressed in terms of the sums $R(x;l)$ for $l<h$ by writing the square of the sum in (7) as a double sum and then changing the orders of summation after it is substituted in (9). The expectation that the variance (8) should be small can therefore be tested provided suitable asymptotic formulae are available for the sums $R(x;l)$.
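As a concrete illustration of the quantities just introduced, the sketch below computes $R(x;l)$, the window counts $S(n,h)$ and the variance (8) for the square-free numbers. It also checks two elementary facts: $N_x(l)\le R(x;l)$, and that an interval of length $l>h$ between consecutive terms accounts for exactly $l-h$ values of $n$ with $S(n,h)=0$. The bound `x`, the window length `h`, the function names, the use of the empirical density `len(s) / x` in place of $1/g(x)$, and the convention of restricting $n$ to $s_1\le n<s_{\text{last}}$ are all assumptions made for this example.

```python
from collections import Counter


def squarefree_indicator(x):
    """f(n) for 0 <= n <= x, with f(n) = 1 exactly when n >= 1 is square-free."""
    f = [1] * (x + 1)
    f[0] = 0
    d = 2
    while d * d <= x:
        for m in range(d * d, x + 1, d * d):
            f[m] = 0
        d += 1
    return f


def pair_count(f, l):
    """R(x; l): number of n with n + l <= x and f(n) = f(n + l) = 1."""
    x = len(f) - 1
    return sum(f[n] * f[n + l] for n in range(1, x - l + 1))


def window_counts(f, h):
    """S(n, h) for s_1 <= n < s_last; the window (n, n+h] is cut off at x."""
    x = len(f) - 1
    F = [0] * (x + 1)
    for n in range(1, x + 1):
        F[n] = F[n - 1] + f[n]
    members = [n for n in range(1, x + 1) if f[n]]
    first, last = members[0], members[-1]
    return [F[min(n + h, x)] - F[n] for n in range(first, last)], members


x, h = 5000, 3                   # illustrative values only
f = squarefree_indicator(x)
S, s = window_counts(f, h)
N = Counter(b - a for a, b in zip(s, s[1:]))        # N_x(l)
density = len(s) / x                                # stands in for 1/g(x)
variance = sum((v - h * density) ** 2 for v in S)   # the sum (8)
zeros = sum(1 for v in S if v == 0)
from_gaps = sum((l - h) * c for l, c in N.items() if l > h)
print(variance, zeros, from_gaps)
```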
On the other hand, corresponding to a given interval $s_{r+1}-s_r$ of length $l>h$, there are about $l-h$ values of $n$ for which $S(n,h)=0$. Consequently (8) provides an upper bound for the sum
$$\sum_{l>h}(l-h)\,N_x(l),$$
which is to all intents and purposes the same as the sum (6) whose application to problem (ii) has already been indicated. Where asymptotic formulae with sufficiently accurate remainder terms are known for $R(x;l)$, we may expect in this way to obtain a proof of (ii) in typical cases for $\gamma<2$. To deal with the cases for which $\gamma$ is larger it will usually be essential to consider higher moments of the form
$$\sum_{n\le x}\{S(n,h)-h/g(x)\}^{2k},$$
for the estimation of which asymptotic formulae for sums of the form
$$R(x;l_1,\ldots,l_u)=\sum_{n\le x}f(n)f(n+l_1)\cdots f(n+l_u) \tag{10}$$
are needed. Additional difficulties, however, normally occur even in the cases where the latter requirement is met, and in consequence not much success has so far attended attempts to use the method for the case $\gamma>2$.

Nevertheless the sums $R(x;l_1,\ldots,l_u)$ are of paramount importance from the standpoint of problem (iii), since they can be used to calculate the number of intervals $s_{r+1}-s_r$ of length not exceeding $X$ for which $s_{r+1}\le x$ ($X$ being of an order of magnitude appropriate to the sequence in question, i.e. $X\asymp g(x)$). That this is so can be seen by considering the fact that the number of intervals in question is precisely the number of pairs $s_u$, $s_v$ such that $s_v-s_u\le X$, $s_v\le x$, and such that no members of the sequence lie between $s_u$ and $s_v$. Thus, appealing to a classical exclusion principle, we have that the number required is
$$\sum_{t}(-1)^t\,Q_t(x,X), \tag{11}$$
where $Q_t(x,X)$ is the number of intervals of length not exceeding $X$ that contain $t$ members of the sequence strictly within them, each interval being counted once for every choice of $t$ such members. In turn the sum $Q_t(x,X)$ itself can be easily expressed in terms of the sums $R(x;l_1,\ldots,l_{t+1})$, where $0<l_1<\cdots<l_{t+1}\le X$.
The above method in some cases leads successfully to the solution of (iii). In practice, however, the formula (11) is not satisfactory to use because it will contain too many terms, and we use instead an expression of the type
$$\sum_{t<M}(-1)^t\,Q_t(x,X)+O\{Q_M(x,X)\},$$
bearing a close analogy to the formulae that appear in Brun's sieve method.
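The exclusion principle behind (11), and its truncated form with error term $O\{Q_M\}$, can be verified directly on a short sequence. In the sketch below $Q_t$ is read as the number of $(t+2)$-tuples $s_u<\cdots<s_v$ of members with $s_v-s_u\le X$, so that a pair with $k$ members strictly between its end points is counted $\binom{k}{t}$ times; this reading, the choice of the primes up to 2000 as test sequence, and all function names are assumptions made for the illustration.

```python
from math import comb


def primes_upto(x):
    """Sieve of Eratosthenes: sorted list of the primes not exceeding x."""
    flags = [True] * (x + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if flags[p]:
            for m in range(p * p, x + 1, p):
                flags[m] = False
    return [n for n in range(2, x + 1) if flags[n]]


def q_counts(seq, X, t_max):
    """Q_t for 0 <= t <= t_max: pairs s_u < s_v with s_v - s_u <= X,
    each weighted by C(k, t), k = number of members strictly between."""
    Q = [0] * (t_max + 1)
    j = 0
    for v in range(len(seq)):
        while seq[v] - seq[j] > X:
            j += 1
        for u in range(j, v):
            k = v - u - 1
            for t in range(min(k, t_max) + 1):
                Q[t] += comb(k, t)
    return Q


X = 12                                   # illustrative interval length
ps = primes_upto(2000)
Q = q_counts(ps, X, t_max=X)             # k < X, so no term is omitted
exact = sum(1 for a, b in zip(ps, ps[1:]) if b - a <= X)
alternating = sum((-1) ** t * q for t, q in enumerate(Q))
print(exact, alternating)

# Truncating after M terms: the error is at most Q_M in absolute value.
for M in range(1, 5):
    partial = sum((-1) ** t * Q[t] for t in range(M))
    print(M, partial - exact, Q[M])
```

The truncation bound used here is the elementary identity $\sum_{t<M}(-1)^t\binom{k}{t}=(-1)^{M-1}\binom{k-1}{M-1}$ for $k\ge1$, together with $\binom{k-1}{M-1}\le\binom{k}{M}$.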
The field of application of the above ideas is naturally restricted by the availability of suitable auxiliary formulae and by other difficulties that may occur in practice. However, as we shall see in the next section, these methods with appropriate modifications have enabled us to make some progress with problems (ii) and (iii).

3. Special sequences.

1. The numbers prime to $n$. Here the problems need to be formulated in a slightly different manner because for each $n$ the number of members of the sequence is finite. We therefore denote, for given $n$, the numbers not exceeding $n$ that are relatively prime to $n$ by $a_1,\ldots,a_{\phi(n)}$, and then regard $n$ as tending to infinity. Apart from its intrinsic interest, the consideration of this sequence may be regarded as a useful preliminary to the corresponding exercise for the primes (which are virtually obtained by taking $n=\prod_{p\le x^{1/2}}p$ and then taking the $a_i$ that do not exceed $x$).

Problem (i). Brun's or Selberg's sieve method leads easily to
$$a_{i+1}-a_i=O(\log^{C}n)$$
for some positive constant $C$.

Problem (ii). In 1940 Erdős [3] conjectured that
$$\sum_{i=1}^{\phi(n)-1}(a_{i+1}-a_i)^2=O(n^2/\phi(n)). \tag{12}$$
The author [8] then proved in 1963 that in fact, for $\gamma<2$,
$$\sum_{i=1}^{\phi(n)-1}(a_{i+1}-a_i)^{\gamma}=O\{n\,(n/\phi(n))^{\gamma-1}\}, \tag{13}$$
and that
$$\sum_{i=1}^{\phi(n)-1}(a_{i+1}-a_i)^{2}=O\{n\,(\log\log n)^2\}. \tag{14}$$
It actually was in this context that the author developed the appropriate method described in the previous section. Here of course the work is facilitated by the fact that there is an exact formula for the analogue of $R(x;l)$. No progress has yet been made with the case $\gamma>2$. Even the case $\gamma=2$ is still not resolved, although several people including Vaughan, Norton, and the author (unpublished) have improved (14) in such a way that it follows that (12) holds for almost all $n$.

Problem (iii). Erdős [4] had conjectured that, for $n=\prod_{p\le M}p$ and $M$ large, the ratio
$$\frac{a_{i+1}-a_i}{n/\phi(n)}$$
had a distribution function that was (essentially) independent of $n$. This was subsequently proved by the author [9] under merely the weaker condition that $n\to\infty$ through a sequence for which $n/\phi(n)\to\infty$. Stated more precisely the result asserts that, for any given positive constant $c$, the number of intervals $a_{i+1}-a_i$ of length not exceeding $c\,n/\phi(n)$ is equal to
$$\phi(n)\{1-e^{-c}+o(1)\}$$
as $n\to\infty$ through a sequence for which $n/\phi(n)\to\infty$. Thus $(a_{i+1}-a_i)\,\phi(n)/n$ is distributed approximately as a gamma variable with parameter 1 when $n/\phi(n)$ is large. The proof depends on the method described at the end of the previous section, there being again exact formulae for the analogue of (10). The proof, however, is fairly complicated because there are one or two incidental difficulties to overcome.

Combining this result with that of (ii) we obtain, for $0\le\gamma<2$,
$$\sum_{i=1}^{\phi(n)-1}(a_{i+1}-a_i)^{\gamma}=\{1+o(1)\}\,\Gamma(\gamma+1)\,n\,(n/\phi(n))^{\gamma-1}$$
as $n/\phi(n)\to\infty$.

The methods used here can be extended further in order that other detailed properties of the intervals may be studied. For example, it can be proved [10] that
$$\sum_{i<\phi(n);\ i\equiv 0 \bmod 2}(a_{i+1}-a_i)=\bigl(\tfrac12+o(1)\bigr)\,n$$
as $n\to\infty$ through a sequence for which $n/\phi(n)\to\infty$.

2. The prime numbers. The question as to whether for the sequence of primes there are asymptotic formulae for sums of the form (10) is of course one of the most important unsolved problems in the theory of numbers. The following conjecture, however, was made by Hardy and Littlewood in their famous paper "Some problems of partitio numerorum. III" [7].

Conjecture. Let $l_1, l_2,\ldots,l_{r-1}$ be $r-1$ distinct nonzero integers and let $f(n)$ be defined so that $f(n)=1$ if $n$ is a prime and $f(n)=0$ otherwise. Then, as $x\to\infty$,
$$\sum_{n\le x}f(n)f(n+l_1)\cdots f(n+l_{r-1})\sim \operatorname{li}_r x\,\prod_{p}\Bigl(\frac{p}{p-1}\Bigr)^{r-1}\frac{p-v}{p-1},$$
where
$$\operatorname{li}_r x=\int_2^x\frac{du}{\log^r u}$$
and $v=v(p;l_1,\ldots,l_{r-1})$ is the number of distinct residues of $0, l_1,\ldots,l_{r-1}$ to the modulus $p$.

Slightly stronger variants of the conjecture can be taken in which the asymptotic formula has a remainder term of a specified order and in which the $l_i$ can lie in some range depending on $x$. For the purposes of (ii) and (iii) it will be enough to have in mind a fairly weak form of this type (say with a remainder term $O\{x^{1-\delta}\}$).

Problem (i). Dr. Huxley has now proved that
$$p_{n+1}-p_n=O\bigl(p_n^{7/12+\varepsilon}\bigr)$$
by a method that he has described to us in his interesting address to this colloquium (references to the earlier literature on the subject may be found in his article in these Proceedings). This does not fall far short of the result
$$p_{n+1}-p_n=O\bigl(p_n^{1/2}\log p_n\bigr), \tag{15}$$
which was shown by Cramér [2] to be true on the Riemann hypothesis.

Problem (ii). Erdős [3] has conjectured that
$$\sum_{p_{n+1}\le x}(p_{n+1}-p_n)^2=O(x\log x),$$
although the prospect of proving it seems rather remote since its truth would imply
$$p_{n+1}-p_n=O\bigl(x^{1/2}\log^{1/2}x\bigr),$$
which is stronger than (15). Selberg [15], however, improving earlier results by Cramér, has shown on the Riemann hypothesis that
$$\sum_{p_{n+1}\le x}\frac{(p_{n+1}-p_n)^2}{p_n}=O(\log^3x);$$
a result which is probably only imperfect to the extent of a superfluous factor $\log x$ in the right-hand side. If, however, the Hardy-Littlewood conjecture in appropriate form for $r=2$ is assumed together with the Riemann hypothesis, then it is possible to combine Selberg's method with that described in §2 in order to prove that
$$\sum_{p_{n+1}\le x}(p_{n+1}-p_n)^{\gamma}=O\bigl(x\log^{\gamma-1}x\bigr)\quad\text{for }\gamma<2.$$
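For orientation, the sums appearing in Problem (ii) for the primes are easy to tabulate for small $x$. The following sketch computes the second moment $\sum (p_{n+1}-p_n)^2$, compares it with $x\log x$, and forms Selberg's weighted sum $\sum (p_{n+1}-p_n)^2/p_n$. The bound $x=10^5$ and the function names are arbitrary illustrative choices, and a computation of this size of course says nothing about the asymptotic questions themselves.

```python
from math import log


def primes_upto(x):
    """Sieve of Eratosthenes: sorted list of the primes not exceeding x."""
    flags = [True] * (x + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if flags[p]:
            for m in range(p * p, x + 1, p):
                flags[m] = False
    return [n for n in range(2, x + 1) if flags[n]]


def gap_moment(ps, gamma):
    """Sum of (p_{n+1} - p_n)**gamma over consecutive primes in ps."""
    return sum((q - p) ** gamma for p, q in zip(ps, ps[1:]))


x = 100_000                               # illustrative bound
ps = primes_upto(x)
m1 = gap_moment(ps, 1)                    # telescopes to p_last - 2
m2 = gap_moment(ps, 2)                    # the sum in Erdős's conjecture
ratio = m2 / (x * log(x))                 # conjecturally bounded
selberg = sum((q - p) ** 2 / p for p, q in zip(ps, ps[1:]))
print(m1, m2, ratio, selberg, log(x) ** 3)
```

By the Cauchy-Schwarz inequality `m2` is at least `m1 ** 2 / (len(ps) - 1)`, which is why `ratio` cannot be much smaller than 1 here.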
Problem (iii). The existence of a distribution function for
$$\frac{p_{n+1}-p_n}{\log p_n}$$
can be shown provided the Hardy-Littlewood conjectures in appropriate form are assumed to hold for any $r$. As in the case of the numbers prime to $n$, the ratio would be distributed as a gamma variable with parameter 1.

3. Sums of two squares.

Problem (i). Bambah and Chowla [1] have proved that
$$s_{n+1}-s_n=O(s_n^{1/4}).$$
Their method is elementary, but more sophisticated analytic methods seem to achieve less.

Problem (ii). Let $r(n)$ be the number of representations of $n$ as a sum of two squares, and let $r_1(n)$ be defined to be 1 or 0 according as to whether $n$ is expressible as a sum of two squares or not. Then there are difficulties in applying our ideas to the problem in this case because no asymptotic formula is known for the sum
$$\sum_{n\le x}r_1(n)\,r_1(n+l);$$
indeed the proof of such a formula might well lie nearly as deep as that for the corresponding Hardy-Littlewood formula for prime-pairs. On the other hand Estermann has proved an asymptotic formula with remainder term $O(x^{5/6+\varepsilon})$ for the associated sum
$$\sum_{n\le x}r(n)\,r(n+l).$$
This, however, cannot be immediately applied to our problem because the use of $r(n)$ means that only a sparse subset of the $s_r$ is effectively counted, the average intervals between consecutive terms of this subset having length greater in magnitude than $(\log x)^{1/2}$. That this is so can be seen, in effect, through recalling the behaviour of $\omega^*(n)$, the number of distinct prime factors of $n$ that are congruent to 1, mod 4. Thus the normal order of $\omega^*(s_m)$ is $\tfrac12\log\log s_m$, whereas the major contribution to $\sum_{s_m\le x}r(s_m)$ is due to those terms for which $\omega^*(s_m)$ is about $\log\log s_m$.

The above difficulty was overcome by the author [11] through the introduction of a new function $\varrho(n)$ with the property that $\varrho(n)=0$ unless $n$ is expressible as the sum of two squares.
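The contrast between the weight $r(n)$ and the indicator $r_1(n)$ described above can be seen in a small computation. The sketch below tabulates $r(n)$ by brute force (counting ordered pairs of integers of either sign, the usual convention, which is assumed here), derives $r_1(n)$ from it, and compares $\sum_{n\le x} r(n)$, which is close to $\pi x$, with the much smaller number of $n\le x$ that are sums of two squares. The bound and the names are illustrative assumptions.

```python
from math import isqrt, pi


def representation_counts(x):
    """r(n) for 0 <= n <= x: ordered integer pairs (a, b) with a*a + b*b = n."""
    r = [0] * (x + 1)
    m = isqrt(x)
    for a in range(-m, m + 1):
        for b in range(-m, m + 1):
            n = a * a + b * b
            if n <= x:
                r[n] += 1
    return r


x = 10_000                                  # illustrative bound
r = representation_counts(x)
r1 = [1 if v else 0 for v in r]             # indicator of sums of two squares
total = sum(r[1:])                          # sum of r(n), 1 <= n <= x
count = sum(r1[1:])                         # number of s_m with 1 <= s_m <= x
print(total, pi * x, count, total / count)
```

The average of $r(s_m)$ over the sequence, `total / count`, already exceeds the minimum possible value 4 noticeably at this range, reflecting the fact that the sum of $r(n)$ is dominated by the terms with many representations.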
This function is virtually an approximation to $r_1(n)$ in that it is obtained by affecting $r(n)$ with a factor $t(n)$ that is intended to mimic as closely as possible the function $2^{-\omega^*(n)}$. If $d$ denotes, generally, square-free numbers (including 1) composed entirely of prime factors $p$ such that $p\equiv1 \bmod 4$, and if $v=x^{\delta}$ for a suitable small value of $\delta$, then $t(n)$ is virtually defined¹ by

[display lost in extraction]

in contrast to the identity

[display lost in extraction]

¹ In practice it is convenient, though not essential, to modify this definition slightly.

By means of a somewhat elaborate analysis, asymptotic formulae (with remainder terms not specified here) are obtained for the sums
$$\sum_{n\le x}\varrho(n),\qquad \sum_{n\le x}\varrho^2(n),\qquad \sum_{n\le x}\varrho(n)\,\varrho(n+l)$$
[right-hand sides lost in extraction], which are the counterparts of (either already proved or conjectured) formulae with $r_1(n)$ replacing $\varrho(n)$. These provide the basis for the proof of
$$\sum_{s_{r+1}\le x}(s_{r+1}-s_r)^{\gamma}=O\bigl(x\log^{(\gamma-1)/2}x\bigr)\quad\text{for }\gamma<5/3.$$

Problem (iii). No unconditional results relating to this are known. Indeed, it has not even yet been possible to obtain an asymptotic formula for the sum
$$\sum_{n\le x}r(n)\,r(n+l_1)\,r(n+l_2),$$
although the author has been able to show that there are infinitely many $n$ for which $n$, $n+l_1$, $n+l_2$ are all expressible as the sums of two squares.

4. The square-free numbers.

Problem (i). The result
$$s_{r+1}-s_r=O(x^{2/9+\varepsilon})$$
is due to Richert [14]. A method for slightly reducing the exponent was subsequently given by Rankin [13].

Problem (ii). Erdős [5] showed that, as $x\to\infty$,
$$\sum_{s_{r+1}\le x}(s_{r+1}-s_r)^{\gamma}\sim A(\gamma)\,x$$
for $\gamma\le2$. The author recently extended this by proving that the formula holds for $\gamma\le3$. The method used is different from that previously described, since the basic idea is to show that, if $f_h(n)$ is defined by
$$f_h(n)=\sum_{p^2\mid n;\ p>h\log h}1,$$
then the sum

[display lost in extraction]

is small. Having in effect shown that it suffices to consider numbers that are not divisible by squares of small primes, it is then comparatively easy to complete the proof.

Problem (iii). This was solved by Mirsky [12]. The existence of the distribution function was used in the work described above.

4. Concluding remarks. We have had perforce to omit work of importance from this survey. In particular, we have not mentioned the very interesting results of Erdős regarding sequences with terms that are not divisible by any of the terms of another sequence.

Obviously there is much that still remains to be done in this field. For example, methods that could effectively be applied to sequences such as that of the numbers expressible as
$$X^3+Y^3+Z^3;\qquad X, Y, Z\ge0$$
remain a desideratum.

References
1. R. P. Bambah and S. Chowla, On numbers which can be expressed as a sum of two squares, Proc. Nat. Inst. Sci. India 13 (1947), 101-103. MR 9, 273.
2. H. Cramér, Some theorems concerning prime numbers, Ark. Mat. Astronom. Fys. 15 (1920), no. 5, 1-32.
3. P. Erdős, The difference of consecutive primes, Duke Math. J. 6 (1940), 438-441. MR 1, 292.
4. P. Erdős, Some unsolved problems, Magyar Tud. Akad. Kutató Int. Közl. 6 (1961), 221-254. MR 31 #2106.
5. P. Erdős, Some problems and results in elementary number theory, Publ. Math. Debrecen 2 (1951), 103-109. MR 13, 627.
6. T. Estermann, An asymptotic formula in the theory of numbers, Proc. London Math. Soc. (2) 34 (1932), 280-292.
7. G. H. Hardy and J. E. Littlewood, Some problems of "partitio numerorum". III. On the expression of a number as a sum of primes, Acta Math. 44 (1923), 1-70.
8. C. Hooley, On the difference of consecutive numbers prime to n. I, Acta Arith. 8 (1962/63), 343-347. MR 27 #5741.
9. C. Hooley, On the difference between consecutive numbers prime to n. II, Publ. Math. Debrecen 12 (1965), 39-49. MR 32 #4099.
10. C. Hooley, On the difference between consecutive numbers prime to n. III, Math. Z. 90 (1965), 355-364. MR 32 #1182.
11. C. Hooley, On the intervals between numbers that are sums of two squares, Acta Math. 127 (1971), 279-297.
12. L. Mirsky, Arithmetical pattern problems relating to divisibility by rth powers, Proc. London Math. Soc. (2) 50 (1949), 497-508. MR 10, 431.
13. R. A. Rankin, Van der Corput's method and the theory of exponent pairs, Quart. J. Math. Oxford Ser. (2) 6 (1955), 147-153. MR 17, 240.
14. H.-E. Richert, On the difference between consecutive squarefree numbers, J. London Math. Soc. 29 (1954), 16-20. MR 15, 289.
15. A. Selberg, On the normal density of primes in small intervals and the difference between consecutive primes, Arch. Math. Naturvid. 47 (1943), no. 6, 87-105. MR 7, 48.

University College Cardiff, Wales
THE DIFFERENCE BETWEEN CONSECUTIVE PRIMES

MARTIN HUXLEY

AMS 1970 subject classifications. Primary 10H05, 10H15; Secondary 30A16. © 1973, American Mathematical Society.

The gap for consecutive primes
Has decreased in historical times:
I can prove it myself
Down to seven upon twelve,
Said one who rejoices in rhymes.

This article is a sketch of the ideas used in my paper on the difference between consecutive primes. The result is
$$p_{n+1}-p_n<p_n^{\delta} \tag{1}$$
when $\delta>7/12$ and $n$ is sufficiently large, improving Montgomery's condition $\delta>3/5$.

Ingham reduced the problem to the study of zeros of $\zeta(s)$. A contour integral gives
$$\sum_{p^a\le x}\log p=x-\sum_{|\gamma|<T}\frac{x^{\rho}}{\rho}+O\Bigl(\frac{x\log^2x}{T}\Bigr). \tag{2}$$
Subtracting this equation at $x$ from the same at $x+h$, one has a sum over prime powers - essentially over primes - that lie between $x$ and $x+h$. The sum over zeros has to be estimated rather crudely. Let $N(\alpha,t)$ be the number of zeros $\rho=\beta+i\gamma$ in the rectangle $\alpha\le\beta\le1$, $|\gamma|\le t$. Any zero-density theorem of the form
$$N(\alpha,t)\ll t^{\lambda(1-\alpha)} \tag{3}$$
for $\alpha<1$, together with a strong result on the nonvanishing of $\zeta(s)$ near the line $\sigma=1$, implies
$$\sum_{x<p^a\le x+h}\log p=h+o(h) \tag{4}$$
for $\log h/\log x>1-1/\lambda$. With $x=p_n$, $\delta=1-1/\lambda+\varepsilon$, the sum on the left of (4) is nonzero, and thus not empty, and we deduce (1).

If the Riemann hypothesis were true, then (3) holds with any $\lambda>2$, and (1) with any $\delta>\tfrac12$. The present talk sets about to get $\lambda>12/5$.

The program above is Ingham's, who proved (1) for $\delta>5/8$. Evidently recent improvements have been in counting zeros. There are two basic properties of $\zeta(s)$: its functional equation and its product representation. The analytic properties are very useful, but counterexamples show that they alone are not sufficient for results of the form (3). The multiplicativity is used as follows. Let
$$M(s)=\sum_{m\le X}\mu(m)/m^{s},$$
a partial sum for the reciprocal of $\zeta(s)$. Then
$$\zeta(s)M(s)=1+\sum_{m>X}m^{-s}\sum_{d\mid m;\ d\le X}\mu(d).$$
This product of series does not converge in the region $\sigma<1$ in which we are interested, but we can insert convergence factors:
$$\zeta(s)M(s)=1+\sum_{X<m\le Y}\frac{a(m)}{m^{s}}+\text{error term}+\text{contour integral}. \tag{5}$$
Here $Y$ is the point at which the convergence factor really takes effect.

When $s$ is a zero $\rho$ of $\zeta(s)$ (or a zero of $M(s)$), the left-hand side is zero. Essentially what happens is that $\sum a(m)/m^{s}$ is approximately $-1$. It is also possible that the contour integral is near $-1$. With the appropriate choice of $X$ and $Y$, in our problem the contour integral is negligible for $\alpha\ge5/6$, and for $3/4\le\alpha\le5/6$ it can be $-1$, but not so often as the sum can be. Something is being lost here; intuitively we expect to get a better result by adjusting the value of $Y$ until the contour integral is $-1$ as often as the sum is.

A technical point enters here. The sum from $X$ to $Y$ is split into sections $\tfrac12N<m\le N$, and if the original sum was greater than $\tfrac12$ (in modulus), one of the parts exceeds $1/(4\log Y)$.
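The way the multiplicativity enters can be made concrete: the coefficient of $m^{-s}$ in the formal product $\zeta(s)M(s)$ is $\sum_{d\mid m,\ d\le X}\mu(d)$, which equals 1 at $m=1$ and vanishes for $1<m\le X$, so that only terms with $m>X$ survive. The following sketch checks this for small parameters; the values of `X` and `Y` and the function names are arbitrary choices for illustration, and no convergence factor is modelled.

```python
def mobius_upto(n):
    """Return mu[0..n], the Möbius function computed by a simple sieve."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(2 * p, n + 1, p):
                is_prime[m] = False
            for m in range(p, n + 1, p):
                mu[m] = -mu[m]            # one more prime factor
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0                 # divisible by a square
    return mu


def product_coefficients(X, Y):
    """c(m) = sum of mu(d) over d | m with d <= X, for 1 <= m <= Y."""
    mu = mobius_upto(X)
    c = [0] * (Y + 1)
    for d in range(1, X + 1):
        if mu[d]:
            for m in range(d, Y + 1, d):
                c[m] += mu[d]
    return c


X, Y = 30, 200                            # illustrative cut-offs
c = product_coefficients(X, Y)
print(c[1], c[2:X + 1])
print([(m, c[m]) for m in range(X + 1, X + 11)])
```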
The first method an analytic number theorist tries is to integrate the square of the modulus of the sum over $t$; if the range for $t$ is long enough one gets an expression involving the sums of the squares of the coefficients. This approach works here, and leads to the result (to cut a long story short)
$$N(\alpha,t)\ll t^{3(1-\alpha)/(2-\alpha)}\log^{10}t$$
(the logarithm power can be reduced, but that is of little importance to us), which recovers one of Ingham's theorems. Figure 1 shows the graph of $\lambda=3/(2-\alpha)$.

[Figure 1. Graph of $\lambda=3/(2-\alpha)$ against $\alpha$ for $1/2\le\alpha\le1$; the ordinates 2, 12/5 and 3 and the abscissae 1/2, 3/4 and 1 are marked.]

The method of Halász and Montgomery for counting the number of times a finite Dirichlet series can be large rests on the inequality
$$\Bigl(\sum_{r=1}^{R}\Bigl|\sum_{n=1}^{N}a(n)u_r(n)\Bigr|\Bigr)^{2}\le\sum_{n=1}^{N}|a(n)|^{2}\,\sum_{r=1}^{R}\sum_{q=1}^{R}\Bigl|\sum_{n=1}^{\infty}b(n)u_r(n)\overline{u_q(n)}\Bigr|, \tag{6}$$
where $a(n)$, $u_r(n)$ are complex numbers and $b(n)$ are positive real numbers, subject to $b(n)\ge1$ for those $n$ for which $a(n)$ is nonzero. This inequality is related to the large sieve and to Bessel's inequality; it is proved by a judicious use of Cauchy's inequality.

It is convenient to divide the $a(m)$ in (5) by $m^{\sigma}$ to obtain the $a(m)$ for (6), and to take $u_r(n)=n^{\sigma-\rho_r}$, where $\rho_1,\ldots,\rho_R$ are zeros. The inner sum in (6) becomes
$$\sum_{n}b(n)\,n^{2\sigma-\beta_r-\beta_q-i\gamma_r+i\gamma_q}. \tag{7}$$
The point of this manoeuvre is to replace the coefficients $a(m)$, which involve sums of Möbius functions, by the known coefficients $b(n)$. We can make (7) to be an average of the zeta function near the point $\sigma+i\gamma_q-i\gamma_r$. The simplest choice is $\sigma=0$. The order of magnitude of $\zeta(it)$ is known, but for $\sigma>0$ the estimate for $\zeta(\sigma+it)$ is probably not best possible, and there is much scope for what another
speaker has aptly described as "very difficult arguments of a function-theoretic nature". Professor Bombieri's work with Forti and Viola is one example.

It is convenient at this stage to invent a notation $f\lll g$ to mean $f=O(g\log^{a}T)$ for some fixed $a$ as $T\to\infty$. In our application of (6) we assume that
$$\Bigl|\sum_{(1/2)N<n\le N}a(n)u_r(n)\Bigr|\ggg1 \tag{8}$$
for $r=1,\ldots,R$, and that $\rho_1,\ldots,\rho_R$ are not too close together. With a suitable choice of the $b(n)$, (6) becomes
$$R^{2}\lll R\,(N+RT^{1/2})\,N^{1-2\alpha}. \tag{9}$$
The term with $R$ and not $R^{2}$ on the right arises from summands with $q=r$. Essentially $N(\alpha,T)\lll R$, and (9) gives
$$R\lll N^{2-2\alpha} \tag{10}$$
provided that
$$T^{1/2}\lll N^{2\alpha-1}. \tag{11}$$
The above is a summary of Montgomery's work. I know three devices that enable (11) to be relaxed a little.

(i) Raising the sum to a power $k$ replaces $N$ by $N^{k}$ but does not change $T$. This relaxes (11) but weakens (10). However Jutila has shown how to use this effectively; we follow his idea below.

(ii) Subdividing the sum over zeros into intervals $|\gamma_q-\gamma_r|\le T_0$, where $T_0$ satisfies (11). This gives
$$R\lll(1+T/T_0)\,N^{2-2\alpha}\lll N^{2-2\alpha}+N^{4-6\alpha}\,T. \tag{12}$$

(iii) Iterating the Halász lemma (6). This is the subject of my current research. To obtain a useful result one chooses $b(n)$ so that the sum (7) is related to the approximate functional equation for $\zeta(s)$, and can be transformed into a sum of length $O(T/N)$.

The calculations to obtain (3) for $\alpha\ge\tfrac34$. We take
$$Y=\min\bigl(T^{1/2},\ T^{3/(6\alpha-2)}\bigr).$$
The sum in (5) has been divided into ranges $\tfrac12N<m\le N$. We square the longest ranges, cube the next longest, and so on, before applying (12). When
$$N^{(k+1)(2-2\alpha)}=N^{k(4-6\alpha)}\,T,$$
then we change from the $k$th to the $(k+1)$st power. This gives
$$R\lll T^{3(1-\alpha)/(3\alpha-1)}$$
when we add up the various inequalities given by (12). The graph of $3/(3\alpha-1)$ [Figure 2] cuts that of $3/(2-\alpha)$ at $(3/4, 12/5)$, and so (3) is established for $\lambda>12/5$, and (1) for $\delta>7/12$.

[Figure 2. Graphs of $\lambda=3/(3\alpha-1)$ and $\lambda=3/(2-\alpha)$ against $\alpha$; the abscissae 1/2, 2/3 and 3/4 and the ordinate 12/5 of the intersection are marked.]

There is a sense in which $\alpha=3/4$ is the best possible intersection of the two curves. At $\alpha=3/4$, (11) is essentially $T=N$, and the bounds for $R$ derived from mean value arguments and from Halász's lemma agree. For $T<N$ the mean value method is more powerful.

References
M. N. Huxley, On the difference between consecutive primes, Invent. Math. 15 (1972), 164-170.
H. L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227, Springer-Verlag, Berlin and New York, 1971.

University College Cardiff, Wales
ON THE MERTENS CONJECTURE AND RELATED GENERAL $\Omega$-THEOREMS¹

W. B. JURKAT

AMS 1970 subject classifications. Primary 10K30; Secondary 10K35.
¹ This research was partially supported by the National Science Foundation.
© 1973, American Mathematical Society.

In this paper we consider the functions
$$M(x)=\sum_{n\le x}\mu(n)\quad(\mu(n)\ \text{Möbius function}),\qquad \psi(x)=\sum_{n\le x}\Lambda(n)=\sum_{p^k\le x}\log p,$$
and
$$D(x)=\sum_{n\le x}d(n)\quad(d(n)\ \text{number of divisors}),$$
where $x>0$. First we indicate our present knowledge concerning the Mertens conjecture in terms of
$$M^{+}=\limsup_{x\to\infty}M(x)/x^{1/2},\qquad M^{-}=\liminf_{x\to\infty}M(x)/x^{1/2}.$$
Then we discuss $\Omega$-theorems for
$$R(x)=x-\psi(x),\qquad \Delta(x)=D(x)-\{x\log x+(2\gamma-1)x\}$$
or better for the reduced remainders
$$R(x)x^{-1/2},\qquad \Delta(x)x^{-1/4},\qquad (M(x)x^{-1/2}).$$
Previous results by Littlewood [10], Hardy [3], Ingham [6], [7], Haselgrove [4], etc. made use of some kind of almost periodicity. This concept will be clarified and leads to "almost periodic functions in a distributional sense", in short, APD-functions, denoted generically by $f$. The well-known explicit formulas for the various reduced remainders $r(x)$ take (under suitable assumptions) the form
$$r(x)=f(t(x))+o(1)\quad\text{as } x\to\infty, \tag{1}$$
where $t(x)=\log x$ or $t(x)=x^{1/2}$, respectively. There is a general theorem which permits us to transfer the behavior of $f(t)$ at a finite point like $t=0$ to very large values of $t$.

Our improvements of the known results consist of $\Omega$-theorems for the averages
$$f(t,\delta)=\frac1\delta\int_t^{t+\delta}f(\tau)\,d\tau\quad(\delta>0).$$
There are several reasons to believe that the $\Omega$-theorems are sharper than the corresponding $O$-theorems. Indeed, I consider it likely that
$$f(t)=O(\log t)^{K} \tag{2}$$
should hold true often. At the St. Louis conference, H. L. Montgomery remarked that he believed that all remainders like $R(x)$, $\Delta(x)$, etc. should be $O(h(x)^{1/2}x^{\varepsilon})$, where $h(x)$ denotes the distance from $x$ to the nearest zero (or change of sign) of the remainder. This puts the order of $h(x)$ roughly at $1/t'(x)$ and, hence, the expected order for the remainder at $(t'(x))^{-1/2}$. The reduced remainder should then equal the actual remainder divided by its expected order. If the corresponding explicit formula confirms this calculation, I would expect as final estimate for the remainder
$$O(t'(x))^{-1/2}(\log t(x))^{K},$$
which specifies the $\varepsilon$ in Montgomery's conjecture.

Since the conference I have been able to obtain several results concerning averages like
$$\frac1\delta\int_t^{t+\delta}|f(\tau)|^{2}\,d\tau.$$
This will answer questions raised by E. Bombieri at the conference and will throw some light on the possible values of $K$ in (2).

1. Special results. For every $x>0$ we have (without any assumption)
$$M^{-}\le\bigl(M(x)+2M^{*}(1/x)\bigr)x^{-1/2}\le M^{+}, \tag{3}$$
where
$$M^{*}(x)=1+\sum_{n=1}^{\infty}\frac{(-1)^{n}(2\pi x)^{2n}}{2n\,(2n)!\,\zeta(2n+1)}.$$
This was announced in [8] and will be proved in §2. Letting $x\to1\pm0$ we obtain, in particular,
$$M^{-}\le2M^{*}(1)\le1+2M^{*}(1)\le M^{+} \tag{4}$$
and an easy computation gives $2M^{*}(1)=-.505\ldots$.

Let $0<\delta(x)=o(1)$ and
$$g_{\psi}(x,\delta)=\frac{1}{\delta(x)}\int_x^{x+x\delta(x)}R(\xi)\,\xi^{-1/2}\,\frac{d\xi}{\xi}.$$
This is an approximate average of the reduced remainder at $x$ of length $x\delta(x)$. Under the Riemann hypothesis (RH) we take $\delta(x)=1/\log\log x$ and obtain
$$\limsup_{x\to\infty}\,g_{\psi}/\log\log\log x\ge\tfrac12,\qquad \liminf_{x\to\infty}\,g_{\psi}/\log\log\log x\le-\tfrac12. \tag{5}$$
In particular, we have (assuming RH)
$$g_{\psi}(x,1/\log\log x)=\Omega_{\pm}(\log\log\log x), \tag{6}$$
which should be compared (supposing RH) with
$$g_{\psi}(x,1/\log\log x)=O(\log\log\log x)^{2}. \tag{7}$$
Obviously, (5) implies the corresponding result without averages which is due to Littlewood [10]. It is remarkable, however, that even an average of length $x/\log\log x$ can become so large, and (7) shows that not much more could be said. By contrast, the RH implies (according to Koch [9])
$$R(x)x^{-1/2}=O(\log x)^{2},$$
which leaves a much larger gap (at present). Probably, the RH can be removed for the $\Omega$-results; but this is not too interesting since the zeros of $\zeta$ off the critical line produce much larger oscillations of $R(x)x^{-1/2}$ anyway.
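The "easy computation" of $2M^{*}(1)$ can be reproduced in a few lines, on the assumption that the series for $M^{*}$ reads $M^{*}(x)=1+\sum_{n\ge1}(-1)^{n}(2\pi x)^{2n}/\bigl(2n\,(2n)!\,\zeta(2n+1)\bigr)$; the printed display is damaged in the source, and this form is the one consistent with the explicit formula cited later from Titchmarsh. The helper names `zeta` and `m_star`, the truncation parameters, and the use of an Euler-Maclaurin tail for $\zeta$ at odd integers are choices made for this sketch.

```python
from math import pi


def zeta(s, N=50):
    """zeta(s) for real s >= 3: partial sum up to N - 1 plus the first
    Euler-Maclaurin correction terms for the tail."""
    head = sum(n ** -s for n in range(1, N))
    tail = N ** (1 - s) / (s - 1) + 0.5 * N ** -s + s * N ** (-s - 1) / 12
    return head + tail


def m_star(x, terms=40):
    """Partial sum of the assumed series for M*(x)."""
    total = 1.0
    power_over_factorial = 1.0            # running (2 pi x)^(2n) / (2n)!
    for n in range(1, terms + 1):
        power_over_factorial *= (2 * pi * x) ** 2 / ((2 * n - 1) * (2 * n))
        total += (-1) ** n * power_over_factorial / (2 * n * zeta(2 * n + 1))
    return total


lower = 2 * m_star(1.0)                   # the text quotes 2M*(1) = -.505...
upper = 1 + lower
print(lower, upper)
```

By (4) the two printed numbers bound $M^{-}$ from above and $M^{+}$ from below, respectively.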
Next, let $0<\delta(x)=o(1)$ and
$$g_{D}(x,\delta)=\frac{1}{\delta(x)}\int_x^{x+2x^{1/2}\delta(x)}\Delta(\xi)\,\xi^{-1/4}\,\frac{d\xi}{2\xi^{1/2}}.$$
This is an approximate average of the reduced remainder at $x$ of length $2x^{1/2}\delta(x)$. Without any assumption and taking $\delta(x)=(\log x)^{-1/2}$, we obtain
$$g_{D}\bigl(x,1/(\log x)^{1/2}\bigr)=\Omega_{+}(\log x)^{1/4}\log\log x, \tag{8}$$
$$g_{D}\bigl(x,1/(\log x)^{1/2}\bigr)=O(\log x)^{1/4}\log\log x. \tag{9}$$
Obviously, (8) implies the corresponding result without averages which is due to Hardy [3]. It is remarkable, again, that even an average of length $2(x/\log x)^{1/2}$ can become so large, and (9) shows that nothing more can be said. The best known estimates for $\Delta(x)x^{-1/4}$ still leave a considerable gap (at present).

There are similar $\Omega$- and $O$-results for practically every $\delta(x)$. In (6) and (8) we selected $\delta(x)$ such that first, the right side is particularly large and second, $\delta(x)$ is particularly large also. The weights which are used in the averages equal $t'(\xi)$ and can, probably, be replaced by other weights as well.

2. Proof of (3). Let us discuss the lower estimate first. We may assume $M^{-}>-\infty$ and, hence,
$$M(x)\ge-Kx^{1/2}\quad(\text{all } x>0) \tag{10}$$
for some constant $K>0$. In a standard manner
$$\frac{1}{s\zeta(s)}+\frac{K}{s-1/2}=\int_1^{\infty}\frac{M(x)+Kx^{1/2}}{x^{s+1}}\,dx\quad(\operatorname{Re}s>1)$$
extends according to Landau to $\operatorname{Re}s>\tfrac12$. Thus the RH follows and we have, in particular,
$$\Bigl|\frac{1}{s\zeta(s)}+\frac{K}{s-1/2}\Bigr|\le\frac{1}{\sigma\zeta(\sigma)}+\frac{K}{\sigma-1/2}\quad(s=\sigma+it,\ \sigma>\tfrac12).$$
Letting $\sigma\to\tfrac12+0$ and $t=\gamma$ (nontrivial zero $\rho=\tfrac12+i\gamma$) it follows that $\rho$ is simple and
$$|1/\rho\zeta'(\rho)|\le K\quad(\text{all }\rho). \tag{11}$$
Similarly, if $M^+ < +\infty$, we derive again the RH, the simplicity of the zeros, and (11). Thus, these assumptions can be made without loss of generality. Then, according to Titchmarsh [15, p. 318],
\[ \big(M(x) + 2M^*(1/x)\big)\,x^{-1/2} = \sum_\rho \frac{x^{i\gamma}}{\rho\,\zeta'(\rho)} \tag{12} \]
for $0 < x \ne$ integer, where the sum converges boundedly (locally) after grouping the terms into suitable blocks. (This holds true also for $0 < x < 1$.) Let us define $f(t)$ for real $t$ by
\[ f(\log x) = \big(M(x) + 2M^*(1/x)\big)\,x^{-1/2} \quad (\text{all } x > 0), \]
and observe that $f(t)$ is real-valued, locally integrable and, in fact, locally of bounded variation. Further, notice that we have an expansion of the form
\[ f(t) \sim \sum_{n=1}^{\infty} \operatorname{Re}\{a_n e^{i\lambda_n t}\}, \tag{13} \]
where $0 < \lambda_n \nearrow \infty$ and the $a_n$ are complex constants such that
\[ \sum_n |a_n| / \lambda_n^k < \infty \quad (k \text{ some integer} \ge 0). \tag{14} \]
It is important to note that the series in (13), as derived from (12), converges in the distributional sense, i.e., if we multiply (13) by smooth functions with compact support, we may integrate term by term.

If $k$ in (14) were zero, $f(t)$ would be almost periodic in the sense of H. Bohr. If $k > 0$, at least a $k$-fold integral of $f$ is almost periodic, i.e., $f$ is the $k$th derivative of an almost periodic function. Functions of that kind we call "almost periodic in a distributional sense", in short APD-functions, and we reserve the right to add further conditions and still use the same terminology when it is convenient.

Given an APD-function $f(t)$ we define
\[ F^+ = \limsup^*_{t \to +\infty} f(t), \qquad F^- = \liminf^*_{t \to +\infty} f(t), \]
where the $*$ indicates that $t$ should be restricted to Lebesgue points of $f(t)$. In this generality we prove the following result.

Proposition. For any L-point $t$ of an APD-function
\[ F^- \le f(t) \le F^+. \tag{15} \]

Proof. If $k = 0$ then $f(t)$ is continuous, almost periodic, and hence every value of $f(t)$ will also be a limit value for $t \to +\infty$. If $k > 0$ we consider
\[ F_\delta(t) = \frac{1}{\delta^k} \int_0^\delta \!\cdots\! \int_0^\delta f(t + \tau_1 + \cdots + \tau_k)\,d\tau_1 \cdots d\tau_k = \sum_{n=1}^{\infty} \operatorname{Re}\left\{ a_n e^{i\lambda_n t} \left( \frac{e^{i\lambda_n \delta} - 1}{i\lambda_n \delta} \right)^{k} \right\} \quad (\delta > 0). \]
Obviously, $F_\delta(t)$ is continuous, almost periodic, and
\[ \limsup_{t \to +\infty} F_\delta(t) \le F^+, \qquad \liminf_{t \to +\infty} F_\delta(t) \ge F^-. \]
Hence, $F^- \le F_\delta(t) \le F^+$ (all $t$). Letting $\delta \to +0$ we obtain (15). □

In our case we have $k = 2$ and
\[ M(x)/x^{1/2} = f(\log x) + o(1) \quad (x \to \infty), \tag{16} \]
so that $M^\pm = F^\pm$. Thus (3) becomes a special case of (15).

3. Discussion of $M^\pm$. The Mertens conjecture $|M(x)\,x^{-1/2}| \le 1$ $(x > 1)$ is still open at present. Sterneck [14] conjectured even $-\tfrac12 x^{1/2} \le M(x) \le \tfrac12 x^{1/2}$ (except near $x = 200$) and supported this by calculations for $2 \le x \le 5 \cdot 10^5$ and by tests up to $x = 5 \cdot 10^6$. Our inequality (4) shows that the lower estimate will be false again and again. Neubauer [11] calculated $M(x)$ up to $x = 10^8$, still confirming Sterneck's conjecture. Making tests up to $x = 10^{10}$ he finally found values of $x$ near $(7.7)(10^9)$ where $M(x)\,x^{-1/2} > \tfrac12$ but none with $M(x)\,x^{-1/2} < -\tfrac12$. It would be interesting to know where the lower inequality actually becomes wrong.

In (3) the calculation of any value of $f(\log x)$ has a consequence for $M^\pm$. The calculations mentioned above show that $-\tfrac12 \le f(\log x) \le \tfrac12$ holds for $1 \le x \le 10^8$ with one of the biggest values at $x = 1 + 0$. But one could consider $x < 1$ as well. For such calculations it is useful to note that
\[ M^*(x) = \int_0^1 \cos(2\pi x \tau)\, m\!\left(\frac{1}{\tau}\right) \frac{d\tau}{\tau} \quad (x > 0) \tag{17} \]
where
\[ m(x) = \sum_{n \le x} \frac{\mu(n)}{n} \quad (x > 0). \]
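The tabulations of $M(x)$ mentioned in this section can be imitated on a very small scale. The sketch below is an illustration only, not the procedure of Sterneck or Neubauer: it sieves the Möbius function up to a limit, accumulates $M(x)$, and reports the extreme values of $M(x)\,x^{-1/2}$. The function names `mobius_sieve`, `mertens_table` and `extremes` are hypothetical.

```python
import math


def mobius_sieve(limit):
    """Return a list mu with mu[n] = Moebius function of n for 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            # Each prime factor flips the sign once.
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            # A squared prime factor forces mu = 0.
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0
    return mu


def mertens_table(limit):
    """Return a list M with M[x] = sum of mu(n) for n <= x."""
    mu = mobius_sieve(limit)
    table = [0] * (limit + 1)
    for n in range(1, limit + 1):
        table[n] = table[n - 1] + mu[n]
    return table


def extremes(limit, start=2):
    """Smallest and largest value of M(x) / sqrt(x) for start <= x <= limit."""
    table = mertens_table(limit)
    ratios = [table[x] / math.sqrt(x) for x in range(start, limit + 1)]
    return min(ratios), max(ratios)


if __name__ == "__main__":
    print(extremes(10_000))
```

On such a short range the ratios stay well inside $[-1, 1]$, which is consistent with, but of course says nothing about, the asymptotic statements on $M^\pm$.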
Using (17) and associated approximation formulas, $f(\log x)$ was calculated for me at the Hahn-Meitner Institut in Berlin for the range $10^{-3} \le x < 1$. It turned out that $x = 1 - 0$ gave the most interesting value. Thus, for all the $x$ values considered, inequality (4) contains nearly the best information. On the other hand, the more complicated structure of $M^*(x)$ suggests that further computations of $f(\log x)$ for smaller values of $x$ could prove to be more successful.

Another approach would be to follow Ingham [7] and Haselgrove [4] in computing the Fejér mean of (13), i.e., in our case
\[ C^*_T(t) = \sum_{|\gamma| < T} \left( 1 - \frac{|\gamma|}{T} \right) \frac{e^{i\gamma t}}{\rho\,\zeta'(\rho)} \quad (T > 0,\ t \text{ real}). \]
The inequality corresponding to (3) is
\[ M^- \le C^*_T(t) \le M^+ \quad (T > 0,\ t \text{ real}). \tag{18} \]
This is slightly weaker than (3) since $C^*_T(t)$ is an average of $f(t)$ (involving Fejér's kernel), but has the advantage that now much larger values of $t$ can be considered. Recent computations by Spira [13] for $T = 10^3$ gave values $+.535$ and $-.602$ near $t = 98$ resp. $t = 854$. These seem to give the best estimates of $M^\pm$ at present. Here, also, one should make calculations for negative $t$'s.

There is one consequence of (18) which is worth mentioning. Let $M = \max(M^+, -M^-)$. Using Parseval's equation for
\[ \lim_{U \to \infty} \frac{1}{U} \int_0^U |C^*_T(u)|^2\,du, \]
and letting $T \to +\infty$, we obtain
\[ \sum_\rho \left| \frac{1}{\rho\,\zeta'(\rho)} \right|^2 \le M^2. \tag{19} \]
This is one quantitative form of Titchmarsh [15, p. 323] and could, perhaps, be used in connection with disproving the Mertens conjecture. It can be viewed as an improvement of (11) under a two-sided assumption and implies, in particular, $k = 1$ or $\sum_n |a_n| / \lambda_n < \infty$.

There have been other attempts to disprove the Mertens conjecture based upon
the possible linear independence of the $\gamma$'s. We mention, e.g., the following papers: Ingham [7], Bateman et al. [1], Diamond [2].

4. Other APD-functions. Let us define $f(t)$ for real $t = \log x$ by
\[ f(t)\,x^{1/2} = x - \psi(x) - \log 2\pi - \tfrac12 \log(1 - 1/x^2) \quad \text{for } x > 1, \]
\[ f(t)\,x^{1/2} = \sum_{n \le 1/x} \frac{\Lambda(n)}{n} - \log\frac{1}{x} + \gamma - \tfrac12 \log\frac{1+x}{1-x} \quad \text{for } 0 < x < 1. \]
Observe that $f(t)$ is real-valued, locally integrable and, in fact, locally of bounded variation except for $t = 0$, where
\[ f(t) = \tfrac12 \log(1/t) + O(1) \ \text{ as } t \to +0, \qquad f(t) = -\tfrac12 \log(1/|t|) + O(1) \ \text{ as } t \to -0. \tag{20} \]
It is well known (under the RH), cf. Ingham [5, p. 77], that
\[ f(t) = \sum_\rho \frac{e^{i\gamma t}}{\rho} \quad (x^{\pm 1} \ne \text{integer}), \tag{21} \]
where the convergence is actually dominated by a locally integrable function. Hence the convergence is distributional, and $f(t)$ is an APD-function with $k = 1$. By definition
\[ R(x)/x^{1/2} = f(\log x) + o(1) \quad (x \to \infty) \tag{22} \]
and $F^- = -\infty$, $F^+ = +\infty$ in view of (15), (20). This is one qualitative form of Littlewood's $\Omega$-result.

A comparison between the problems for $M(x)$ and $\psi(x)$ is natural: In the previous case $f(t)$ had a jump discontinuity at $t = 0$, and (4) was a natural consequence of that observation. Now $f(t)$ makes an infinite jump at $t = 0$, which results in $R(x)/x^{1/2}$ being unbounded on both sides. In such cases it is desirable to give a quantitative formulation of our Proposition.

Let us discuss $D(x)$ also. According to Hardy [3] we may define $f(t)$ for real $t$ by
\[ f(t) = \sum_{n=1}^{\infty} \operatorname{Re}\left\{ \frac{1}{\pi\sqrt2} \exp\!\left(-\frac{\pi i}{4}\right) \frac{d(n)}{n^{3/4}} \exp(4\pi i\, n^{1/2} t) \right\} \quad (t^2 \ne \text{integer}), \tag{23} \]
where the convergence is actually dominated by a locally integrable function. Hence the convergence is distributional, and $f(t)$ is an APD-function with $k = 1$. By Voronoi's explicit formula for $\Delta(x)$, cf. Hardy [3], we see that
\[ \Delta(x)\,x^{-1/4} = f(x^{1/2}) + O(1), \quad x \to \infty. \tag{24} \]
Notice that the almost periodic behavior (24) follows without any assumptions and that distributional convergence in (23) can be obtained more easily than ordinary convergence by discussing explicit formulas in integrated form. (It is one of the advantages of distributional convergence that we can use formulas like (23) without ever proving actual convergence or other unnecessary properties.) It is elementary to prove that
\[ \frac{1}{\delta} \int_0^\delta f(t)\,dt = (C + o(1)) \left( \frac{1}{\delta} \right)^{1/2} \log\frac{1}{\delta}, \quad C > 0,\ \delta \to +0, \tag{25} \]
which implies $F^+ = +\infty$ in connection with (15). This is one qualitative form of Hardy's $\Omega$-result.

Being general for a moment, it turns out that the functional equation for the $\zeta$-function is responsible (in the end) for quite a number of explicit formulas which lead to corresponding APD-functions. We shall discuss further examples on a later occasion.

Let us observe that the examples of this section have the following additional properties (besides $k = 1$):
\[ \lambda_n = (c + o(1))\, n^{\alpha_1} (\log n)^{\beta_1} \ \text{ as } n \to \infty, \ \text{ where } c > 0,\ \alpha_1 > 0,\ \beta_1 \text{ real}; \tag{26} \]
\[ \sum_{n \ge N} |a_n| / \lambda_n = O(N^{\alpha_2}) (\log N)^{\beta_2}, \quad N \ge 2, \ \text{ where } \alpha_2 < 0,\ \beta_2 \text{ real}; \tag{27} \]
\[ \alpha_1 + \alpha_2 \ge 0, \qquad \beta_1 + \beta_2 \ge 0; \tag{28} \]
\[ \frac{1}{\delta} \int_0^\delta f(t)\,dt = (C + o(1)) \left( \frac{1}{\delta} \right)^{\alpha_3} \left( \log\frac{1}{\delta} \right)^{\beta_3}, \quad \delta \to +0, \tag{29} \]
where $C > 0$, $\alpha_3 = 1 + \alpha_2/\alpha_1$, $\beta_3 = \beta_2 - \beta_1 \alpha_2/\alpha_1$.

In the case of $\psi(x)$,
\[ \alpha_1 = 1,\ \beta_1 = -1,\ \alpha_2 = -1,\ \beta_2 = 2,\ \alpha_3 = 0,\ \beta_3 = 1,\ C = \tfrac12; \]
and in the case of $D(x)$,
\[ \alpha_1 = \tfrac12,\ \beta_1 = 0,\ \alpha_2 = -\tfrac14,\ \beta_2 = 1,\ \alpha_3 = \tfrac12,\ \beta_3 = 1. \]
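For the divisor problem, the reduced remainder $\Delta(x)\,x^{-1/4}$ appearing in (24) is easy to tabulate. The sketch below assumes the usual definitions, which are not restated in this section: $D(x) = \sum_{n \le x} d(n)$ and $\Delta(x) = D(x) - x\log x - (2\gamma - 1)x$ with $\gamma$ Euler's constant. The helper names are hypothetical, and the hyperbola method used for $D(x)$ is a standard device rather than anything taken from the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def divisor_summatory(x):
    """D(x) = sum of d(n) for n <= x, computed by the hyperbola method."""
    n = int(math.floor(x))
    r = math.isqrt(n)
    return 2 * sum(n // k for k in range(1, r + 1)) - r * r


def delta(x):
    """Assumed definition: Delta(x) = D(x) - x log x - (2*gamma - 1) x."""
    return divisor_summatory(x) - x * math.log(x) - (2 * EULER_GAMMA - 1) * x


def reduced_remainder(x):
    """Delta(x) * x^(-1/4), the quantity on the left of (24)."""
    return delta(x) / x ** 0.25


if __name__ == "__main__":
    for x in (10.5, 100.5, 1000.5, 10000.5, 100000.5):
        print(x, reduced_remainder(x))
```

Half-integer arguments are used in the loop only to stay away from the jumps of $D(x)$ at the integers.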
For convenience in language we shall incorporate these conditions into our general concept of APD-functions.

5. General results. The procedure of Littlewood and Hardy can be interpreted as a discussion of their functions $f(t)$ through the associated Dirichlet series
\[ \sum_{n=1}^{\infty} a_n e^{-\lambda_n s} \quad (s = \sigma + it,\ \sigma > 0) \]
and its boundary behavior, in particular, at $s = 0$. Skewes [12] and Ingham [6] noticed already that the proofs simplify if one uses directly $f(t)$, i.e. the explicit formula, and properties like (20). Bohr's almost periodicity for $\sigma > 0$ produces APD-functions along the boundary $\sigma = 0$ (at least if one widens the concept). It is natural and possible to extend the $\Omega$-results to such APD-functions in considerable generality.

A careful analysis of our simplified proofs shows that even in the previously known cases the conclusion can be strengthened to give $\Omega$-theorems for the averages
\[ f(t, \delta) = \frac{1}{\delta} \int_t^{t+\delta} f(\tau)\,d\tau. \]
These results become better if $\delta = \delta(t)$ is not too small. This suggests that one should discuss the order of $f(t, \delta(t))$ for various choices of $\delta(t)$. In the following I shall describe some of the results I have obtained so far for our APD-functions (as characterized in the previous section).

We have uniformly in $\delta > 0$ and $t$ (real)
\[ \begin{aligned} f(t,\delta) &= O(1/\delta) && \text{for } \delta \ge \tfrac12, \\ f(t,\delta) &= O(1/\delta)^{\alpha_3} (\log(1/\delta))^{\beta_3} && \text{for } 0 < \delta \le \tfrac12 \ (\text{if } \alpha_3 > 0), \\ f(t,\delta) &= O(1/\delta)^{\alpha_3} (\log(1/\delta))^{\beta_3 + 1} && \text{for } 0 < \delta \le \tfrac12 \ (\text{if } \alpha_3 = 0). \end{aligned} \tag{30} \]
Here we may take $\delta = \delta(t) > 0$, but otherwise arbitrary. If $\delta(t) = \Omega_+(1)$ the estimate cannot be improved, but it is not too interesting with regard to $f(t)$. For the remaining functions $0 < \delta(t) = o(1)$, there is a critical lower limit
\[ \begin{aligned} \delta_*(t) &= (\log t)^{-\alpha_1} (\log\log t)^{-\beta_1}, && t \ge t_0 \ (\text{if } \alpha_3 > 0), \\ \delta_*(t) &= (\log t)^{-\alpha_1} (\log\log t)^{-\beta_1} (\log\log\log t)^{\alpha_1}, && t \ge t_0 \ (\text{if } \alpha_3 = 0). \end{aligned} \]
Under the assumptions indicated the following result holds.

Theorem. If $\delta(t)/\delta_*(t) \to \infty$ $(t \to +\infty)$ then, independently of $\delta(t)$,
\[ \limsup_{t \to +\infty} \frac{f(t, \delta(t))}{(1/\delta(t))^{\alpha_3} (\log(1/\delta(t)))^{\beta_3}} \ge C. \tag{31} \]
If $\delta(t)$ is of the order of $\delta_*(t)$ or smaller we can still prove (under a slight regularity condition for $\delta(t)$)
\[ f(t, \delta(t)) = \Omega_+\big( (1/\delta_*(t))^{\alpha_3} (\log(1/\delta_*(t)))^{\beta_3} \big), \quad t \to +\infty. \tag{32} \]
Here the constant in $\Omega_+$ is not specified unless $\alpha_3 = 0$, and the right side can be evaluated by
\[ (1/\delta_*(t))^{\alpha_3} (\log(1/\delta_*(t)))^{\beta_3} = (\alpha_1^{\beta_3} + o(1)) (\log t)^{\alpha_1 + \alpha_2} (\log\log t)^{\beta_1 + \beta_2} \quad (t \to +\infty). \]
In particular, if $t$ is restricted to L-points of $f(t)$,
\[ f(t) = \Omega_+\big( (\log t)^{\alpha_1 + \alpha_2} (\log\log t)^{\beta_1 + \beta_2} \big), \quad t \to +\infty. \tag{33} \]

We remark that (31) gives the exact order if $\alpha_3 > 0$, and leaves only a factor $\log(1/\delta(t))$ open if $\alpha_3 = 0$. Since $\alpha_3$ and $\beta_3$ may be zero, (31) can be viewed as a quantitative formulation of our Proposition. If $\delta(t)$ is roughly of order $\delta_*(t)$, then (32) will give an answer which is relatively close to the truth. For $\delta(t)$ much smaller than $\delta_*(t)$, the order problem is still wide open. In that case (30) should be improved while (32) may still be not too far off. If $\delta(t)$ is very small, the order problem for $f(t, \delta(t))$ is the same as for $f(t)$.

The detailed proofs will be published elsewhere.

References

1. P. T. Bateman, J. W. Brown, R. S. Hall, K. E. Kloss and R. M. Stemmler, Linear relations connecting the imaginary parts of the zeros of the zeta function, Proc. Atlas Sympos., Computers in Number Theory.
2. H. G. Diamond, Two oscillation theorems, The theory of arithmetic functions, Lecture Notes in Math., vol. 251, Springer-Verlag, Berlin and New York, 1972, pp. 113-118.
3. G. H. Hardy, On Dirichlet's divisor problem, Proc. London Math. Soc. 15 (1916), 1-25.
4. C. B. Haselgrove, A disproof of a conjecture of Pólya, Mathematika 5 (1958), 141-145. MR 21 #3391.
5. A. E. Ingham, The distribution of prime numbers, Cambridge Tracts in Math. and Math. Phys., no. 30, Cambridge Univ. Press, Cambridge, 1932.
6. A. E. Ingham, A note on the distribution of primes, Acta Arith. 1 (1936), 201-211.
7. A. E. Ingham, On two conjectures in the theory of numbers, Amer. J. Math. 64 (1942), 313-319. MR 3, 271.
8. W. B. Jurkat, Eine Bemerkung zur Vermutung von Mertens, Nachr. Österreich. Math. Ges., Sondernr. V. Österr. Math.-Kongress 1960, (Wien 1961), p. 11.
9. H. von Koch, Sur la distribution des nombres premiers, Acta Math. 24 (1901), 159-182.
10. J. E. Littlewood, Sur la distribution des nombres premiers, Comptes Rendus 158 (1914), 1869-1872.
11. G. Neubauer, Eine empirische Untersuchung zur Mertensschen Funktion, Numer. Math. 5 (1963), 1-13. MR 27 #5721.
12. S. Skewes, On the difference $\pi(x) - \operatorname{li} x$. I, J. London Math. Soc. 8 (1933), 277-283.
13. R. Spira, Zeros of sections of the zeta function. II, Math. Comp. 22 (1968), 163-173. MR 37 #4036.
14. R. D. v. Sterneck, Empirische Untersuchung über den Verlauf der zahlentheoretischen Funktion $\sigma(n) = \sum_{x=1}^{n} \mu(x)$ im Intervalle 150000 bis 500000, S.-B. Akad. Wiss. Wien Math. Nat. Cl. IIa 110 (1901), 1053-1102.
15. E. C. Titchmarsh, The theory of the Riemann zeta-function, Clarendon Press, Oxford, 1951. MR 13, 741.

Syracuse University
THE DISTRIBUTION OF THE VALUES OF REAL QUADRATIC FORMS AT INTEGER POINTS

D. J. LEWIS¹

1. Let $Q = Q(x)$ be a nondegenerate quadratic form with real coefficients in $n$ variables; $x = (x_1, \ldots, x_n)$. Let $\mathcal{X} = \{Q(x) \mid x \in \mathbf{Z}^n\}$. We are interested in the distribution of the set $\mathcal{X}$ on the real line.

Clearly, if $Q$ is proportional to a form $F$ with rational integer coefficients, i.e. all ratios of coefficients of $Q$ are rational, then $\mathcal{X}$ is a discrete set and the problem reduces to a discussion of what integers are represented by the form $F$. This problem has been thoroughly studied and we need not discuss it further here, except to note that if $Q$ is an indefinite real form in $n \ge 5$ variables with all ratios of coefficients rational, then $\mathcal{X}$ is a full $\mathbf{Z}$-module of rank 1 and each value in $\mathcal{X}$ is represented infinitely often.

Unless we state otherwise, henceforth we shall assume that $Q$ is a real form not proportional to a form with integer coefficients and hence at least one ratio of the coefficients of $Q$ is irrational. For such forms with $n \ge 5$ one might hope to show that $\mathcal{X}$ is in some sense dense on the line if $Q$ is indefinite, or on a ray if $Q$ is definite.

Indeed, when $Q$ is an indefinite form in $n \ge 21$ variables the set $\mathcal{X}$ is dense on the line. This result is the culmination of theorems in a sequence of papers, written by B. J. Birch, H. Davenport, and D. Ridout ([6], [7], [8], [11], [19]) during the 1950's (see §4 for discussion of proofs). Actually they showed that for such $Q$ the set contains 0 as a nonisolated accumulation point; i.e., there are nonzero values in $\mathcal{X}$ with arbitrarily small absolute value. With this information one can use a result of A. Oppenheim [14] to demonstrate that $\mathcal{X}$ is dense on the line (see §5).

AMS 1970 subject classifications. Primary 10B45, 10C05.
¹ This paper was written while the author was partially supported by a grant from the National Science Foundation.
© 1973, American Mathematical Society
As indicated earlier, one would hope to show that $\mathcal{X}$ is dense when $Q$ contains fewer than 21 variables. However, with the techniques presently available we seem unable to reduce the 21 appreciably without imposing some further condition, such as $Q$ being diagonal or additive; i.e., that $Q = \lambda_1 x_1^2 + \cdots + \lambda_n x_n^2$. In 1945, H. Davenport and H. Heilbronn [9] proved:

If $Q$ is an indefinite diagonal form in $n \ge 5$ variables with real coefficients having at least one irrational ratio, then $\mathcal{X}$ contains nonzero numbers with arbitrarily small absolute value.

This result had been conjectured by A. Oppenheim [13] in 1929, and in 1934 it was shown to hold when $n \ge 9$ by S. Chowla [2] using results of V. Jarník and A. Walfisz on the number of integer points in a large ellipsoid. The Davenport-Heilbronn proof makes use of a modified Hardy-Littlewood circle method (see §3 below).

The basic reason why, for general indefinite forms, 21 variables are presently needed to ensure that $\mathcal{X}$ is dense on the line lies in the method of proof. The proof uses several modifications of the Hardy-Littlewood method, a method that is seldom efficient. Also, the proof uses a modification of a diagonalization process introduced by R. Brauer [1] which is also quite wasteful. In this case, under certain additional hypotheses, one demonstrates that an indefinite form can be transformed by an integral matrix to the sum of an indefinite diagonal form in a much smaller number of variables (about $\tfrac14$ the number in the original form) and a form whose coefficients are extremely small. One then shows the diagonal form (in 5 variables if $n \ge 21$) has a small value for an integral point with a modest sized norm (a quantified form of the Davenport-Heilbronn theorem). It then follows that the second form also has small value for this point and hence the result.
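The Davenport-Heilbronn statement quoted above can be made concrete by a brute-force search. This is purely an illustration and has nothing to do with the circle method: the form $Q = \sqrt2\,x_1^2 + x_2^2 - x_3^2 - x_4^2 - x_5^2$, the search box and the function names are all assumptions chosen for the example.

```python
import itertools
import math


def form_value(lams, point):
    """Value of the diagonal form sum(lam_i * x_i^2) at an integer point."""
    return sum(lam * x * x for lam, x in zip(lams, point))


def smallest_nonzero_value(lams, bound, zero_tol=1e-9):
    """Search 0 <= x_i <= bound for the nonzero value of smallest absolute size.

    Non-negative coordinates suffice because only squares enter the form.
    Values below zero_tol in absolute size are treated as exact zeros.
    """
    best_abs, best_point = None, None
    for point in itertools.product(range(bound + 1), repeat=len(lams)):
        value = abs(form_value(lams, point))
        if value < zero_tol:
            continue
        if best_abs is None or value < best_abs:
            best_abs, best_point = value, point
    return best_abs, best_point


if __name__ == "__main__":
    # Hypothetical example: one irrational ratio, signature (2, 3).
    lams = (math.sqrt(2.0), 1.0, -1.0, -1.0, -1.0)
    for bound in (2, 4, 7):
        print(bound, smallest_nonzero_value(lams, bound))
```

Enlarging the box lets the search exploit better rational approximations to $\sqrt2\,x_1^2$, which is the elementary mechanism behind small nonzero values for this particular form.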
Finally it should be noted that even in the case of indefinite diagonal forms we do not know for sure that some smaller number of variables, say 3, would not suffice to ensure the denseness of $\mathcal{X}$.

2. Now let us examine the situation for positive definite quadratic forms $Q$. Here the investigation is still in a primitive stage. For positive definite forms we have
\[ \|x\|^2 \ll Q(x) \ll \|x\|^2, \]
where $\|x\| = \max |x_j|$. Here we use $U \ll V$ to mean $U < \text{constant} \cdot V$, where the constant is independent of $x$. Hence we cannot expect $\mathcal{X}$ to be dense on the line, or any part of the line. However, we might hope to show that for each $\varepsilon > 0$ there is an $X(\varepsilon)$ such that all intervals with left-hand end point to the right of $X(\varepsilon)$ and of
length $\varepsilon$ contain a point of $\mathcal{X}$. Such is the case for diagonal positive definite real forms, with an irrational ratio, in $n \ge 5$ variables, as we now show.

Suppose $\lambda_1, \ldots, \lambda_n$ $(n \ge 5)$ are positive real numbers with at least one ratio irrational. Let $N(X)$ denote the number of integer solutions of the inequality
\[ \lambda_1 x_1^2 + \cdots + \lambda_n x_n^2 \le X. \]
Jarník and Walfisz [12] proved
\[ N(X) = C X^{n/2} + o(X^{n/2 - 1}), \]
where $C$ is the volume of the ellipsoid $\sum \lambda_j x_j^2 \le 1$, whence $C = B_n (\lambda_1 \cdots \lambda_n)^{-1/2}$, where $B_n$ is a constant depending only on $n$. Thus for any fixed $\varepsilon > 0$, we have
\[ N(X + \varepsilon) - N(X) \sim C \cdot (n/2) \cdot \varepsilon X^{n/2 - 1} \]
as $X \to \infty$. Since this exceeds 1 when $X$ is sufficiently large, the result follows.

In view of the success in handling the general indefinite form, it is reasonable to seek to show that the preceding result holds for general positive definite forms, when the number of variables is sufficiently large. So far no such result has been proven. However, Davenport and Lewis, in a paper to appear [10], have attained a partial result. They proved:

There exists an absolute integer $M$ with the following property: If $Q$ is a positive definite quadratic form with real coefficients in $n \ge M$ variables, then for each integral point $x^*$ of sufficiently large norm there is an integral point $x \ne 0$ such that
\[ |Q(x + x^*) - Q(x^*)| < 1. \]

Clearly, in this result one can replace 1 by any positive $\varepsilon$ on requiring that $\|x^*\| > X(Q, \varepsilon)$. The theorem does not hypothesize that some ratio of the coefficients of $Q$ is irrational. Thus if $Q$ is proportional to a form with integral coefficients, we have $Q(x + x^*) = Q(x^*)$ when $x^*$ is sufficiently large.
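The gap statement for diagonal positive definite forms earlier in this section can be watched at work on a toy example. The sketch below is an illustration under assumptions, not part of the argument: it takes the hypothetical form $x_1^2 + x_2^2 + x_3^2 + x_4^2 + \sqrt2\,x_5^2$, lists its values up to a bound, and measures the largest gap between consecutive values in two windows.

```python
import itertools
import math


def values_up_to(lams, big_x):
    """Sorted distinct values of sum(lam_i * x_i^2) that are <= big_x.

    Only non-negative coordinates are enumerated, since the form is even
    in each variable.
    """
    ranges = [range(int(math.sqrt(big_x / lam)) + 1) for lam in lams]
    found = set()
    for point in itertools.product(*ranges):
        value = sum(lam * x * x for lam, x in zip(lams, point))
        if value <= big_x:
            found.add(value)
    return sorted(found)


def max_gap(values, lo, hi):
    """Largest difference between consecutive values lying in [lo, hi]."""
    window = [v for v in values if lo <= v <= hi]
    return max(b - a for a, b in zip(window, window[1:]))


if __name__ == "__main__":
    lams = (1.0, 1.0, 1.0, 1.0, math.sqrt(2.0))
    vals = values_up_to(lams, 30.0)
    print(max_gap(vals, 3.0, 10.0), max_gap(vals, 23.0, 30.0))
```

For this particular form the largest gap in the later window is smaller than in the earlier one, in line with the expectation that intervals of any fixed length eventually all contain a value.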
This result of Davenport and Lewis is imperfect in terms of the anticipated result in several ways. (a) One could have $Q(x + x^*) = Q(x^*)$ even when $Q$ is not proportional to a form with integral coefficients, and indeed this will happen if $Q$ should integrally represent a form in 4 or more variables which is proportional to a form with integral coefficients. (b) Even if (a) does not occur, the conclusion does not rule out the possibility that the elements of $\mathcal{X}$ occur in clumps, leaving appreciable sized intervals, or gaps, without any points of $\mathcal{X}$. Finally we note that no attempt was made to determine $M$, since the value which the proof would produce
is surely excessively large, even when certain crude estimates used are replaced by sharper ones which could be attained with more work. The proof (see §6) again uses the Hardy-Littlewood circle method and in the final analysis depends on representation properties of positive definite diagonal forms with integral coefficients.

3. The Davenport-Heilbronn proof for indefinite additive real forms is the prototype for all the work on this subject. The proof is quite direct, and in the final analysis relies on consideration of the continued fraction development of one of the irrational ratios.

Let
\[ Q = \lambda_1 x_1^2 + \cdots + \lambda_5 x_5^2, \]
where $\lambda_2/\lambda_1$ is irrational, $\lambda_1 > 0$, $\lambda_5 < 0$. Clearly, these assumptions on the $\lambda_i$ can be made without loss of generality. Also, clearly it is sufficient to demonstrate the existence of an integral point $y$ such that
\[ |Q(y)| < 1, \tag{1} \]
since the integral solubility of $|Q| < \varepsilon$ follows from (1) on replacing the $\lambda_i$ by $\varepsilon^{-1} \lambda_i$.

Let $e(x) = e^{2\pi i x}$. As is easily verified,
\[ \int_{-\infty}^{\infty} e(\alpha t) \left( \frac{\sin \pi\alpha}{\pi\alpha} \right)^2 d\alpha = \max\{0,\ 1 - |t|\}. \tag{2} \]
Let $\mathcal{P}$ consist of the squares of the denominators of the partial fraction convergents to $\lambda_2/\lambda_1$, and let $P$ be a large integer in $\mathcal{P}$. Define
\[ S(\alpha) = \sum_{x=1}^{P} e(\alpha x^2), \qquad I(\alpha) = \int_0^P e(\alpha x^2)\,dx. \]
It follows from (2) that
\[ \int_{-\infty}^{\infty} S(\lambda_1 \alpha) \cdots S(\lambda_5 \alpha) \left( \frac{\sin \pi\alpha}{\pi\alpha} \right)^2 d\alpha = \sum \{1 - |Q(y)|\}, \tag{3} \]
where the sum is over those integral points $y$ such that $1 \le y_1, \ldots, y_5 \le P$ and
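$|Q(y)| < 1$. To prove that (1) has an integral solution, it suffices to show that the integral in (3) has a positive value.

We also have from (2) that
\[ I = \int_{-\infty}^{\infty} I(\lambda_1 \alpha) \cdots I(\lambda_5 \alpha) \left( \frac{\sin \pi\alpha}{\pi\alpha} \right)^2 d\alpha = \int \!\cdots\! \int_{|Q(x)| < 1} \{1 - |Q(x)|\}\,dx, \tag{4} \]
and a straightforward integration shows
\[ I \gg P^3. \tag{5} \]
[Here we use $U \ll V$ to mean that $U < (\text{constant})\,V$, where the constant is independent of $P$.]

Let $A = (4P \max |\lambda_j|)^{-1}$; then
\[ S(\lambda_j \alpha) = I(\lambda_j \alpha) + O(1) \quad \text{on } |\alpha| < A, \tag{6} \]
and
\[ |I(\lambda_j \alpha)| \ll \min(P, |\alpha|^{-1/2}) \ll |\alpha|^{-1/2} \quad \text{on } |\alpha| < A. \tag{7} \]
From (4), (5), (6), and (7), one easily deduces
\[ \int_{|\alpha| < A} S(\lambda_1 \alpha) \cdots S(\lambda_5 \alpha) \left( \frac{\sin \pi\alpha}{\pi\alpha} \right)^2 d\alpha \gg P^3. \tag{8} \]
As is well known, for any $\varepsilon > 0$,
\[ \int_0^1 |S(\alpha)|^4\,d\alpha \ll P^{2 + \varepsilon}. \tag{9} \]
Hence, by use of Hölder's inequality, the trivial estimate for one of the $|S(\lambda_j \alpha)|$ and the fact that $x^{-1} \sin x \ll x^{-1}$ for large $x$, one deduces: For any fixed $\delta > 0$,
\[ \int_{|\alpha| > P^\delta} |S(\lambda_1 \alpha) \cdots S(\lambda_5 \alpha)| \left( \frac{\sin \pi\alpha}{\pi\alpha} \right)^2 d\alpha \ll P^{3 - \delta + \varepsilon}. \tag{10} \]
Now, if we can show
\[ \int_{A < |\alpha| < P^\delta} |S(\lambda_1 \alpha) \cdots S(\lambda_5 \alpha)|\,d\alpha = o(P^3), \tag{11} \]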
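then the conclusion follows, since (8), (10), and (11) would show that the integral in (3) is positive. It follows from Hölder's inequality and (9) that (11) holds, provided for $\delta$ sufficiently small (say $\delta < 1/23$) we have
\[ \min\{|S(\lambda_1 \alpha)|,\ |S(\lambda_2 \alpha)|\} \ll P^{1 - \kappa} \quad \text{on } A < |\alpha| < P^\delta, \tag{12} \]
with a saving $\kappa > 0$ depending on $\delta$ (the exact exponent is illegible in the source).

Recall that $P$ was a large integer in $\mathcal{P}$, and hence there exist integers $a$, $q$ such that $(a, q) = 1$, $|\lambda_1/\lambda_2 - a/q| < 1/q^2$ and $P = q^2$. Let $\alpha$ be an element of the interval $A < |\alpha| < P^\delta = q^{2\delta}$; then there exist rational approximations $a_1/q_1$, $a_2/q_2$ to $\lambda_1 \alpha$ and $\lambda_2 \alpha$, respectively, such that for $i = 1, 2$,
\[ (a_i, q_i) = 1, \qquad 1 \le q_i \le q^3, \qquad |\lambda_i \alpha - a_i/q_i| < q_i^{-1} q^{-3}. \]
We first suppose $q_1, q_2 \le q^{10\delta}$. Since $|\alpha| < q^{2\delta}$ we have
\[ |a_i| < q^{-3} + |q_i \alpha \lambda_i| \ll q^{12\delta}, \]
whence
\[ 1 \le |a_1 q_2| \ll q^{22\delta}, \qquad |a_2 q_1| \ll q^{22\delta} \]
and
\[ \left| \frac{\lambda_1}{\lambda_2} - \frac{a_1 q_2}{a_2 q_1} \right| \ll q^{-3}. \]
But then $a/q - a_1 q_2 / (a_2 q_1) \ne 0$ and
\[ q^{-1 - 22\delta} \ll (q\,a_2 q_1)^{-1} \le \left| \frac{a}{q} - \frac{a_1 q_2}{a_2 q_1} \right| \le \left| \frac{\lambda_1}{\lambda_2} - \frac{a}{q} \right| + \left| \frac{\lambda_1}{\lambda_2} - \frac{a_1 q_2}{a_2 q_1} \right| \ll q^{-2} + q^{-3} \ll q^{-2}, \]
which is impossible when $q$ (hence $P$) is sufficiently large. Thus we must have for each $\alpha$ in $A < |\alpha| < P^\delta$ that one of $q_1$ and $q_2$, say $q_1$, exceeds $q^{10\delta}$. But then Weyl's inequality for exponential sums shows that (12) holds. This completes the Davenport-Heilbronn argument.

4. The proof that the set $\mathcal{X}$ contains nonzero numbers with arbitrarily small absolute values when $Q$ is a general indefinite quadratic form in $n \ge 21$ variables is exceedingly complicated. It would be highly desirable to have a simpler, more straightforward proof. The known proof divides into two parts. Suppose $Q$ is represented as a sum of squares of real linear forms with positive and negative signs, say,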
\[ Q = L_1^2 + \cdots + L_r^2 - L_{r+1}^2 - \cdots - L_n^2. \]
Part I (Davenport [7], [8]) consists in showing the result holds when $\min(r, n - r) \ge 16$; while part II (Birch and Davenport [6], Ridout [19]) consists in showing the result holds when $\min(r, n - r) \le 5$ and $n \ge 21$. In both instances one assumes that
\[ |Q(x)| \ge 1 \quad \text{for } 0 \ne x \in \mathbf{Z}^n \tag{13} \]
and derives a contradiction. These results only imply the conclusion when $n$ exceeds a larger bound (the number is illegible in the source). However, by a clever variational technique, Davenport and Ridout [11] showed that, from part I, one could deduce the conclusion provided $\min(r, n - r) \ge 6$ and $n \ge 21$.

In part I, the initial approach follows the procedure used by Davenport and Heilbronn. The kernel $(\sin \pi\alpha / \pi\alpha)^2$ leads to complications and so is replaced by
\[ K(\alpha) = \frac{4}{3} \cdot \frac{\sin(4\pi\alpha/3)}{4\pi\alpha/3} \left( \frac{\sin(2\pi\alpha/3k)}{2\pi\alpha/3k} \right)^{k}. \]
Then $K(\alpha)$ is a real even function such that
\[ |K(\alpha)| \le C(k) \min\{1,\ |\alpha|^{-k-1}\} \]
and the function
\[ \psi(y) = \int_{-\infty}^{\infty} e(y\alpha) K(\alpha)\,d\alpha \]
has $0 \le \psi(y) \le 1$ for real $y$, and $\psi(y) = 0$ for $|y| \ge 1$, while $\psi(y) = 1$ for $|y| \le 1/3$.

Let $c$ be a real vector with no zero coordinates such that $Q(c) = 0$ and $\prod L_j(c) \ne 0$. The choice of $c$ is to ensure the singular integral is positive. Let $P$ be an arbitrarily large integer and let $P_j = |c_j| P$. Define
\[ S^*(\alpha) = \sum \cdots \sum e(\alpha Q(x)), \]
the sum extending over integer points $x$ in a box determined by the $P_j$ (the summation ranges are illegible in the source). It follows from (13) that
\[ \operatorname{Re} \int_0^\infty S^*(\alpha) K(\alpha)\,d\alpha = 0. \]
Following standard procedures for the Hardy-Littlewood method (without the need of the hypothesis on $n$ or $r$, but assuming the $k$ in the definition of $K(\alpha)$ is sufficiently large) one shows that
\[ \operatorname{Re} \int_0^{P^{-1/2}} S^*(\alpha) K(\alpha)\,d\alpha \gg P^{n-2} \quad \text{and} \quad \operatorname{Re} \int_{P^{1/4}}^{\infty} S^*(\alpha) K(\alpha)\,d\alpha \ll P^{3n/4}. \]
One completes the argument by showing (now assuming the hypothesis on $r$ and $n$) that
\[ \operatorname{Re} \int_{P^{-1/2}}^{P^{1/4}} S^*(\alpha) K(\alpha)\,d\alpha = o(P^{n-2}). \tag{14} \]
The proof of (14) is ingenious and depends heavily on knowing that an indefinite quadratic form in 5 variables with integer coefficients has an integral zero $x$ such that
\[ \|x\| = \max |x_i| \le C \cdot \max\{|\text{coefficients}|\}, \]
where $C$ is an absolute constant. Such estimates on the norm of a zero of such forms have been given by J. W. S. Cassels [3] and by Birch and Davenport [4].

The approach in part II is different. In this instance one shows that if $P$ is a large integer, then $Q$ integrally represents an indefinite real form in 5 variables
\[ H(y) = \sum_{j=1}^{5} \lambda_j y_j^2 + \sum_{i < j} \mu_{ij}\, y_i y_j, \tag{15} \]
where
\[ |\lambda_j| \asymp P^{2j - 2}, \tag{16} \]
\[ |\mu_{ij}| \ll P^{i + j - 2 - n}, \tag{17} \]
and
\[ |H(y)| \ge 1 \quad \text{for } 0 \ne y \in \mathbf{Z}^5. \tag{18} \]
The relations (16) and (17) imply that $\lambda_1 y_1^2 + \cdots + \lambda_5 y_5^2$ is indefinite.

The conclusion now follows on applying a quantified version of the theorem of Davenport and Heilbronn due to Birch and Davenport [5], namely:
If $\lambda_1, \ldots, \lambda_5$ are real numbers, not all of the same sign and such that not all ratios are rational, then for each $\delta > 0$ there exist integers $y_1, \ldots, y_5$, not all 0, such that
\[ |\lambda_1 y_1^2 + \cdots + \lambda_5 y_5^2| < 1 \tag{19} \]
and, for each $i$,
\[ |\lambda_i y_i^2| < c(\delta)\, |\lambda_1 \cdots \lambda_5|^{1 + \delta}. \tag{20} \]

For suppose $y$ is a solution of (19) satisfying (20) where the $\lambda_j$ come from (15); then (16) implies that
\[ |y_i| \ll P^{11 + 10\delta - i}, \]
whence
\[ |\mu_{ij}\, y_i y_j| \ll P^{20 + 20\delta - n} \quad \text{for } i \ne j. \tag{21} \]
But if $\delta < 1/20$ and $n \ge 21$, relations (19) and (21) contradict (18).

The proof of the theorem of Birch and Davenport just cited again makes use of a Hardy-Littlewood circle method along the lines of the proof of part I. To prove this theorem, one lets $P$ be a very large positive number such that the inequality (19) has no integral solutions satisfying
\[ 0 < |\lambda_1| y_1^2 + \cdots + |\lambda_5| y_5^2 \le 500 P^2. \tag{22} \]
If one then defines the exponential sums
\[ S_j(\alpha) = \sum_{P < |\lambda_j|^{1/2} y_j < 10P} e(\alpha \lambda_j y_j^2), \]
it follows from (22) that
\[ \operatorname{Re} \int_0^\infty S_1(\alpha) \cdots S_5(\alpha) K(\alpha)\,d\alpha = 0. \tag{23} \]
By standard arguments from the Hardy-Littlewood method, one shows that it follows from (23) that for $\delta > 0$ there exists a function $C(\delta) > 0$ such that if
\[ P > C(\delta)\, \Pi^{1/2 + \delta}, \tag{24} \]
then
\[ \int_A^{P^\delta} |S_1(\alpha) \cdots S_5(\alpha)|\,d\alpha \gg P^3\, \Pi^{-1/2}, \tag{25} \]
where $A^{-1} = 40 P \min |\lambda_j|^{1/2}$ and $\Pi = |\lambda_1 \cdots \lambda_5|$. By a very delicate and intricate argument too involved to summarize here, Birch and Davenport [5] show that (24) and (25) are incompatible. An essential part of their argument again uses the estimate on the magnitude of a solution for an indefinite additive form with integer coefficients.

The techniques discussed here are typical of the methods used in handling diophantine inequalities in many variables. Almost invariably in the proof one needs quite precise information regarding solutions of some associated diophantine equation with integer coefficients. This phenomenon is well illustrated in the work of D. Ridout and Jane Pitman ([15]-[19]). The absence of such information is frequently the major stumbling block to solving a diophantine inequality.

5. We complete our discussion of the distribution of the set $\mathcal{X}$ when $Q$ is indefinite by showing:

If $Q$ is an indefinite real quadratic form in $n \ge 3$ variables and $\mathcal{X}$ contains nonzero elements with arbitrarily small absolute values, then $\mathcal{X}$ is dense on the line.

This result follows quite easily from the following theorem of Oppenheim [14]:

If $Q$ is a nondegenerate indefinite quadratic form in $n$ variables, then to each positive value $a$ in $\mathcal{X}$ there corresponds a negative value $-b$ in $\mathcal{X}$ such that
\[ b^{2n - 2} \le A(n)\, a^{n - 2}\, |\det(Q)|, \tag{26} \]
where the constant $A(n)$ depends only on $n$ and $\det(Q)$ is the determinant of the matrix associated with the quadratic form $Q$.

It follows from this theorem that if $n \ge 3$ and $\mathcal{X}$ contains arbitrarily small nonzero values, it contains such values with both positive and negative signs. Now let $z$ be a real number and let $0 < \varepsilon < 1$. Let $\delta = \varepsilon^2 / (9 \max\{1, |z|\})$. By hypothesis there exists an integral point $y \ne 0$ such that $0 < (\operatorname{sign} z)\, Q(y) < \delta$. Let $d = [(z/Q(y))^{1/2}]$ so that
\[ (z/Q(y))^{1/2} - 1 < d \le (z/Q(y))^{1/2}. \]
Then
\[ 0 \ge Q(dy) - z = d^2 Q(y) - z \ge -\delta - 2(|z|\delta)^{1/2} \ge -\delta - \tfrac23 \varepsilon > -\varepsilon. \]
Similarly
\[ 0 \le Q((d + 1)y) - z < \varepsilon. \]

Oppenheim's proof of his theorem is by induction on $n$. If $F$ is any quadratic form, let $p(F)$ denote the greatest lower bound of the positive values assumed by $F$ at integer points. If $F$ is a positive definite quadratic form in $n$ variables, then
\[ p(F)^n \le B_n\, |\det(F)|, \tag{27} \]
where $B_n$ is a constant depending only on $n$. It is well known that $B_2 = \tfrac43$, and we prove (27) by induction on $n$. Let $c = F(y) < p(F)(1 + 1/n)$, where $y$ is integral and necessarily the coordinates of $y$ have no common factor. Then there exists a unimodular transformation $T$ so that
\[ F(Tx) = G(x) = c\,(x_1 + \alpha_2 x_2 + \cdots)^2 + H(x_2, \ldots, x_n) \]
where $H$ is positive definite and $\det F = \det G = c \det H$. By induction there is an integral point $z = (z_2, \ldots, z_n)$ such that
\[ H(z)^{n-1} \le B_{n-1}\, |\det H|\, (1 + 1/n) \]
and
\[ p(G(x_1, tz))^2 \le \tfrac43\, c\, H(z), \]
whence
\[ p(F)^{2n-2} \le p(G(x_1, tz))^{2n-2} \le B_n\, p(F)^{n-2} \det F, \]
which implies (27).

When $F$ is an indefinite form, the inequality (27) is implied by (26). For let $p(F) \le c = F(y) < p(F) + \varepsilon$. It follows that
\[ p(-F)^{2n-2} \le A_n\, p(F)^{n-2}\, |\det F|. \]
Apply this last inequality to the indefinite form $-F$ to get
\[ p(F)^{2n-2} \le A_n\, p(-F)^{n-2}\, |\det F|, \]
and hence
\[ p(F)^{(2n-2)^2} \le A_n^{3n-4}\, p(F)^{(n-2)^2}\, |\det F|^{3n-4}, \]
or
\[ p(F)^{n(3n-4)} \le A_n^{3n-4}\, |\det F|^{3n-4}, \]
giving us (27) with $B_n = A_n$.

The relation (26) is easily seen to be true for $n = 2$. We now prove (26) by induction on $n$. Let $0 < a = Q(y)$, where the coordinates of $y$ have no common factor. Then there exists a unimodular transformation $T$ such that
\[ Q(Tx) = R(x) = a\,(x_1 + \cdots)^2 + J(x_2, \ldots, x_n) \]
and $\det Q = a \det J$. Since $Q$ is indefinite, $-J$ must be either indefinite or positive definite in $n - 1$ variables. As we just saw, in either case for each $\varepsilon > 0$ there exists an integer point $z = (z_2, \ldots, z_n)$ such that
\[ 0 < -J(z) = d \quad \text{and} \quad 0 < d^{n-1} \le B_{n-1}\, |\det J|\, (1 + \varepsilon). \]
But $R(x_1, tz)$ is an indefinite form of determinant $-ad$ and so represents a number $-b$ such that $b^2 < \tfrac43\, ad$, whence
\[ b^{2n-2} \le \left(\tfrac43\right)^{n-1} B_{n-1}\, a^{n-2}\, |\det Q|. \]
This proves (26).

6. In this section we outline the proof of the theorem of Davenport and Lewis [10] concerning positive definite forms. Let $Q = xAx'$, where $A = (\lambda_{ij})$ is a positive definite matrix. Suppose that, for some $x^*$ with $\|x^*\|$ large, the inequality
\[ |Q(x + x^*) - Q(x^*)| < 1 \tag{28} \]
has no integral solution $x \ne 0$. Then
\[ \Big| \sum \lambda_{ij} x_i x_j + 2 \sum \Lambda_i x_i \Big| \ge 1 \tag{29} \]
for all integral points $x \ne 0$, where $\Lambda_i = \sum_k \lambda_{ik} x^*_k$.
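We may suppose $|\Lambda_1| = \max |\Lambda_i|$, and replacing $x^*$ by $-x^*$ if necessary, we may suppose $\Lambda_1 < 0$. Set $P = -2\Lambda_1$; then $\|x^*\| \ll P \ll \|x^*\|$, and hence $P$ is large and positive, but not necessarily an integer. Let
\[ H(x) = \sum \lambda_{ij} x_i x_j - P x_1 + 2\Lambda_2 x_2 + \cdots + 2\Lambda_n x_n, \]
and set
\[ L(x) = -P x_1 + 2\Lambda_2 x_2 + \cdots + 2\Lambda_n x_n. \]
Note that $|2\Lambda_i| \le P$.

We know by the box principle that given $m$ linear forms $L_1, \ldots, L_m$ with real coefficients in $n \ge 4m^2 \ge 16$ variables, there exist integral points $u$ with $\|u\| \ll P^{5m/n}$ and such that $|L_j(u)| \ll P^{5m/n - 4}$. Using this fact one proves, using induction, the existence of integral points $a^{(\nu)}$, $\nu = 1, \ldots, 6$, with $a^{(1)} = (1, 0, \ldots, 0)$ and, for $\nu \ge 2$, $a^{(\nu)}$ an integral point such that
\[ \|a^{(\nu)}\| \ll P^{5\nu/n}, \]
while $|L(a^{(\nu)})|$ and $|a^{(\mu)} A\, a^{(\nu)t}|$ for $\mu < \nu$ are bounded by negative powers of $P$ (the exponents are illegible in the source). On letting $\mathsf{A}$ be the integral rank 6 matrix having the $a^{(\nu)}$ as column vectors we find
\[ H(\mathsf{A}y) = \sum_{j=1}^{6} \mu_j y_j^2 + \sum_{i < j} \varepsilon_{ij}\, y_i y_j - P y_1 + \sum_{j=2}^{6} \eta_j y_j, \tag{30} \]
where
\[ \mu_1 = \lambda_{11}, \qquad 1 \ll \mu_j \ll P^{10 j / n}, \tag{31} \]
\[ |\varepsilon_{ij}| \ll P^{-3} \quad \text{and} \quad |\eta_j| \ll P^{-2}. \tag{32} \]
It follows from (30) and (29) that
\[ |H(\mathsf{A}y)| \ge 1 \quad \text{for integral } y \ne 0. \tag{33} \]
But then
\[ \mu_1 \mu_2 \cdots \mu_6 \ll P^{210/n} \tag{34} \]
and
\[ |\mu_1 y_1^2 + \cdots + \mu_6 y_6^2 - P y_1| \ge \tfrac12 \tag{35} \]
for integral $y$ with $y_1 \ne 0$. For if (35) does not hold, we should have $y_1 > 0$ and $\mu_1 y_1^2 < P y_1 + \tfrac12$, whence $y_1 \ll P$ and $\mu_2 y_2^2 + \cdots + \mu_6 y_6^2 < P y_1 + \tfrac12 \ll P^2$, whence $|y_j| \ll P$ for all $j$. But then $|\varepsilon_{ij}\, y_i y_j| \ll P^{-1}$ and $|\eta_k y_k| \ll P^{-1}$, and we would have a contradiction to (33) when $P$ is sufficiently large.

Next, using a Hardy-Littlewood argument modeled along that used by Birch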
172 D. J. LEWIS and Davenport [9] in their proof of a quantified version of the theorem of Davenport and Heilbronn, we show for each small positive d (5 < 1/10), if n > S00S, then (34) and (35) imply the existence of a real a such that (36) P~9S<x<Pi, (37) «Hj = aJlqJ + Pj, ^#0, (a}tq})=\, (38) qj<P9\ and (39) \Pj\<P-2 + 8s. These results are deduced by showing that if B = fi\,2/SP min|^|1/2, then p6 llSitaJ-SeWlda^P4^!-^)-1'2, B where and then deducing from this the bounds given for q} and /?,-. We complete the proof of the theorem by showing that the existence of an a satisfying (36)-(39) is incompatible with (35). In doing so we need to make use of the following fact: Given positive integers bl9-~9 bs there exists a positive integer B^C(bl- -b5)3f2, where C is an absolute constant, such that for all positive integers m, the equation (40) fc1x? + -.- + 65xi = J5m fs soluble in integers. It is easy to show, for an appropriate B, equation (40) is soluble in each ring of p-adic integers. Then the above result can be deduced quite easily from a theorem of G. L. Watson [20] concerning the size of integers not integrally represented by the form b^2 H \-b5x\, but locally integrally represented for all local rings. The proof of Watson's theorem is quite long and difficult and again a Hardy-Littlewood method is used.
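The solubility statement (40) can be explored by hand for small coefficients. The following is a minimal brute-force sketch, not the argument of the paper: the helper names `represents` and `smallest_B`, the cutoffs `m_max` and `B_max`, and the sample coefficient vectors are assumptions made for illustration. A finite search of this kind only checks (40) for m up to the cutoff and proves nothing about all m.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def represents(b, n):
    """True if n = b[0]*x_1^2 + ... + b[k-1]*x_k^2 has a solution in integers.

    b is a tuple of positive integers, n a nonnegative integer.
    """
    if not b:
        return n == 0
    x = 0
    while b[0] * x * x <= n:
        if represents(b[1:], n - b[0] * x * x):
            return True
        x += 1
    return False


def smallest_B(b, m_max, B_max=50):
    """Smallest B <= B_max such that B*m is represented for m = 1..m_max.

    Hypothetical helper: returns None if no such B is found below B_max.
    """
    for B in range(1, B_max + 1):
        if all(represents(b, B * m) for m in range(1, m_max + 1)):
            return B
    return None


if __name__ == "__main__":
    print(smallest_B((1, 1, 1, 1, 1), 40))
    print(smallest_B((2, 2, 2, 2, 2), 40))
```

For the form with all coefficients equal to 1 every positive integer is a sum of five squares, so the search stops at B = 1; for all coefficients equal to 2 only even numbers are represented, so B = 1 fails already at m = 1.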
DISTRIBUTION OF THE VALUES OF REAL QUADRATIC FORMS 173 We now demonstrate the incompatibility of (36)-(39) and (35). On multiplying (35) by a we get for all integral y with yx #0. Next set yj = qjzj and use (37) to get (41) |fli«iZ? + -+fl«««z2-aP«1z1+/?1«?z? + ...+/?6«2z2|>p-M for all integral z with z^O. As we have observed in the preceding paragraph, there is an integer (42) B<Z(a2-a6q2-q6ri2<(«sv2-H6q2-q6)3,2<P150S such that for each positive integer ra, we can find integers z2, ...,z6 such that a2q2zl + • • • + a6q6zl = mB. If (43) 0<Bm<P2~21s and (44) 0<z1<P1"195, then II Aflfa2 <ZP-l0d and \piq\z\\<p-l0d. Hence, if we can find positive integers m and zx such that (43), (44) and (45) K^zf + flm-aP^ZiKP-10* all hold, we will have contradicted (41) and hence proven the theorem. Put Zi =jBm, then, by (42), the inequality (46) |a1^1jBM2 + m-aP^f1M|<P~1 60S implies (45). Choose integers m, v such that 0<w<P16O<5, \aPqxu- v\ <P"160d, and put m = v- axqxBu2. Then if S < 1/400, the integers m and zx=Bu satisfy (43), (44), and (45). This completes the proof.
References

1. R. Brauer, A note on systems of homogeneous algebraic equations, Bull. Amer. Math. Soc. 51 (1945), 749-755. MR 7, 108.
2. S. Chowla, A theorem on irrational indefinite quadratic forms, J. London Math. Soc. 9 (1934), 162-163.
3. J. W. S. Cassels, Bounds for the least solutions of homogeneous quadratic equations, Proc. Cambridge Philos. Soc. 51 (1955), 262-264; Addendum, ibid. 52 (1956), 602. MR 16, 1002; MR 18, 380.
4. B. J. Birch and H. Davenport, Quadratic equations in several variables, Proc. Cambridge Philos. Soc. 54 (1958), 135-138. MR 20 #3824.
5. B. J. Birch and H. Davenport, On a theorem of Davenport and Heilbronn, Acta Math. 100 (1958), 259-279. MR 20 #5166.
6. B. J. Birch and H. Davenport, Indefinite quadratic forms in many variables, Mathematika 5 (1958), 8-12. MR 20 #3104.
7. H. Davenport, Indefinite quadratic forms in many variables, Mathematika 3 (1956), 81-101. MR 19, 19.
8. H. Davenport, Indefinite quadratic forms in many variables. II, Proc. London Math. Soc. (3) 8 (1958), 109-126. MR 19, 1161.
9. H. Davenport and H. Heilbronn, On indefinite quadratic forms in five variables, J. London Math. Soc. 21 (1946), 185-193. MR 8, 565.
10. H. Davenport and D. J. Lewis, Gaps between values of positive definite quadratic forms, Acta Arith. 22 (1972), 87-105.
11. H. Davenport and D. Ridout, Indefinite quadratic forms, Proc. London Math. Soc. (3) 9 (1959), 544-555. MR 22 #28.
12. V. Jarník and A. Walfisz, Über Gitterpunkte in mehrdimensionalen Ellipsoiden, Math. Z. 32 (1930), 152-160.
13. A. Oppenheim, The minima of indefinite quaternary quadratic forms of signature 0, Proc. Nat. Acad. Sci. U.S.A. 15 (1929), 724-727.
14. A. Oppenheim, Values of quadratic forms. I, Quart. J. Math. Oxford Ser. (2) 4 (1953), 54-59. MR 14, 955.
15. Jane Pitman, Cubic inequalities, J. London Math. Soc. 43 (1968), 119-126.
16. Jane Pitman, Bounds for solutions to diagonal inequalities, Acta Arith. 18 (1971), 179-190.
17. Jane Pitman, Bounds for solutions of diagonal equations, Acta Arith. 19 (1971), 223-247.
18. Jane Pitman and D. Ridout, Diagonal cubic equations and inequalities, Proc. Roy. Soc. London, Ser. A 297 (1967), 476-502. MR 35 #6620.
19. D. Ridout, Indefinite quadratic forms, Mathematika 5 (1958), 122-124. MR 21 #2642.
20. G. L. Watson, Quadratic Diophantine equations, Philos. Trans. Roy. Soc. London, Ser. A 253 (1960/61), 227-254. MR 24 #A78.

University of Michigan
THE CLASSIFICATION OF TRANSCENDENTAL NUMBERS

K. MAHLER

AMS 1970 subject classifications. Primary 10F35; Secondary 10A40.
© 1973, American Mathematical Society

1. All numbers $\zeta$ considered in this article are real or complex. For polynomials

$p(z) = p_0 + p_1 z + \cdots + p_m z^m$, where $p_m \neq 0$,

the following notation will be used.

$d(p) = m$, $\quad H(p) = \max_{h=0,1,\dots,m} |p_h|$, $\quad$ and $\quad L(p) = \sum_{h=0}^{m} |p_h|$

denote the exact degree, the height, and the length of $p(z)$, respectively. We further put

$\Lambda(p) = 2^{d(p)} L(p)$ $\quad$ and $\quad M(p) = \prod_{h=0}^{m} (2 + |p_h|)$.

If $V$ denotes the set of all polynomials $p(z) \not\equiv 0$ with rational integral coefficients and $v$ is any positive integer, it is obvious that either of the inequalities

$\Lambda(p) \le v$ $\quad$ or $\quad M(p) \le v$

is satisfied by at most finitely many elements of $V$.

Consider now the set $C$ of all real or complex numbers $\zeta$. Our aim is to subdivide $C$ into subsets or classes which are disjoint and have the following invariance property.

Any two numbers in distinct classes are algebraically independent over the rational number field $Q$.
Here the subdivision of $C$ is to depend solely on the approximation properties of $\zeta$, and the number of distinct classes should by preference be large.

2. A first such classification with the invariance property, but into only four classes, was found by me about 40 years ago. A detailed account of this classification, and of the almost equivalent one by J. F. Koksma, can be found in the book on transcendental numbers by Th. Schneider (1957).

This classification is obtained as follows. Put successively

$w_m(v \mid \zeta) = \inf |p(\zeta)|$,

where the lower bound extends over all polynomials $p(z)$ satisfying

$p(z) \in V$, $\quad d(p) \le m$, $\quad H(p) \le v$, $\quad$ and $\quad p(\zeta) \neq 0$;

$w_m(\zeta) = \limsup_{v \to \infty} \dfrac{\log\{1/w_m(v \mid \zeta)\}}{\log v}$, $\qquad w = w(\zeta) = \limsup_{m \to \infty} \dfrac{w_m(\zeta)}{m}$.

Let further the symbol $\mu = \mu(\zeta)$ denote $\infty$ if $w_m(\zeta)$ is finite for all suffixes $m$, and otherwise let it be equal to the smallest suffix $m$ for which $w_m(\zeta) = \infty$. Thus at least one of the two numbers $w$ and $\mu$ is always equal to $\infty$. Therefore the complex numbers split into the following four disjoint classes:

Class A: $\zeta$ satisfies $w = 0$ and $\mu = \infty$.
Class S: $\zeta$ satisfies $0 < w < \infty$ and $\mu = \infty$.
Class T: $\zeta$ satisfies $w = \infty$ and $\mu = \infty$.
Class U: $\zeta$ satisfies $w = \infty$ and $\mu < \infty$.

It can now be proved that: (i) the class A consists exactly of all algebraic numbers, hence the transcendental numbers are distributed amongst the classes S, T, and U; and (ii) the invariance property holds, i.e. numbers in different classes are algebraically independent over $Q$.

One can also show that almost all numbers are S-numbers, a result greatly strengthened by V. Sprindzuk (1967). There are noncountably many U-numbers, e.g. all Liouville numbers; these are simply characterised by $\mu = 1$. Until recently it was not known whether there exist any T-numbers, but this existence has now been established by W. Schmidt (1971), although as yet no actual T-number seems to be known.

By way of example, $e$ is an S-number, while $\pi$ is either an S-number or a T-number.

3.
I come now to a new classification (Mahler, 1971) which leads to a subdivision of $C$ into infinitely many disjoint classes with the invariance property. In this classification, we need to consider polynomials in $V$ of independently variable degree and height (or rather length).

This classification depends on the following partial ordering of monotone nondecreasing functions. If $a(v) > 0$ and $b(v) > 0$ are any two nondecreasing functions of $v \ge 1$ for which there exist three positive numbers $c$, $v_0$, and $\gamma$ such that

$a(v^c) \ge \gamma\, b(v)$ for $v \ge v_0$,

then we write

$a(v) \gg b(v)$ $\quad$ or $\quad b(v) \ll a(v)$.

If simultaneously

$a(v) \gg b(v)$ $\quad$ and $\quad a(v) \ll b(v)$,

then we write

$a(v) \gg\ll b(v)$.

This sign $\gg\ll$ evidently defines an equivalence relation.

With each element $\zeta$ of $C$ we associate now an order function

$O(v \mid \zeta) = \sup \log\{1/|p(\zeta)|\}$,

where the upper bound is extended over all polynomials $p(z)$ in $V$ for which

$\Lambda(p) \le v$, $\quad p(\zeta) \neq 0$.

Since they behave slightly differently, it is convenient to exclude from the consideration all those $\zeta$ which are either rational integers, or are integers in any imaginary quadratic field. With this restriction, the following results hold.

$O(v \mid \zeta) \gg\ll \log v$ if $\zeta$ is algebraic;
$O(v \mid \zeta) \gg (\log v)^2$ if $\zeta$ is transcendental;
$O(v \mid \zeta) \gg\ll O(v \mid \zeta')$ if $\zeta$, $\zeta'$ are algebraically dependent over $Q$.

Thus, if numbers $\zeta$, $\zeta'$ with equivalent order functions are put into one and the same class, then the invariance property holds.
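Since only finitely many polynomials satisfy $\Lambda(p) \le v$, the order function can in principle be evaluated by exhaustive search for very small $v$. The sketch below does this in floating point; the function names (`big_lambda`, `polynomials_up_to`, `order_function`) are illustrative assumptions, polynomials are stored as coefficient tuples $(p_0, \dots, p_m)$, and the search is only a numerical illustration of the definition, reliable for small $v$, not a method taken from the paper.

```python
import math
from itertools import product


def degree(p):
    """Exact degree d(p); p lists the coefficients p_0, ..., p_m with p_m != 0."""
    return len(p) - 1


def height(p):
    """H(p): largest absolute value of a coefficient."""
    return max(abs(c) for c in p)


def length(p):
    """L(p): sum of the absolute values of the coefficients."""
    return sum(abs(c) for c in p)


def big_lambda(p):
    """Lambda(p) = 2^{d(p)} * L(p)."""
    return 2 ** degree(p) * length(p)


def polynomials_up_to(v):
    """Yield every integer polynomial p with Lambda(p) <= v (a finite set)."""
    m = 0
    while 2 ** m <= v:              # L(p) >= 1 forces 2^m <= v
        bound = v // 2 ** m         # Lambda(p) <= v  <=>  L(p) <= bound
        for coeffs in product(range(-bound, bound + 1), repeat=m + 1):
            if coeffs[-1] != 0 and length(coeffs) <= bound:
                yield coeffs
        m += 1


def order_function(v, zeta):
    """Brute-force sup of log(1/|p(zeta)|) over Lambda(p) <= v with p(zeta) != 0."""
    best = -math.inf
    for p in polynomials_up_to(v):
        value = abs(sum(c * zeta ** k for k, c in enumerate(p)))
        if value > 0:
            best = max(best, math.log(1 / value))
    return best


if __name__ == "__main__":
    for v in (1, 8, 16, 32):
        print(v, order_function(v, math.e))
```

The values are nondecreasing in $v$ by construction, because enlarging $v$ only enlarges the set of admissible polynomials.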
The actual determination of the order function of a number is, of course, a very difficult problem. I mention, by way of example, the following relations.

O(v | e) < (log v)3 (log log v)\, O(v | π) < (log vf (log log vf,

which are implicit in work by N. I. Fel'dman (1951 and 1963). It is interesting to see that in the second formula the upper estimate comes close to the lower estimate $(\log v)^2$.

In my paper on the order function I raised a number of questions. One of these questions has in the meantime been solved by Swierczkowski in an unpublished note; he proved that there are noncountably many inequivalent order functions and hence also as many classes in this classification.

It is not known which monotonic functions are equivalent to order functions, and which can be the order function of almost all real or almost all complex numbers. It is also unknown whether the order functions can be strictly ordered.

4. I conclude this article by suggesting a still different kind of classification; however, I do not know whether it has the invariance property, or rather how the classification has to be defined so that this property holds.

The important recent work by W. Schmidt (1970) and A. Baker (1965) suggests that instead of $O(v \mid \zeta)$ one should associate with $\zeta$ the function

$R(v \mid \zeta) = \sup \log\{1/|p(\zeta)|\}$,

where the upper bound is now extended over all polynomials $p(z)$ in $V$ for which

$M(p) \le v$, $\quad p(\zeta) \neq 0$.

It seems highly probable that also for these functions $R$ an equivalence relation can be found which preserves the invariance property. I dare to conjecture that the ideas of Schmidt could be used to settle this question.

So far we have only discussed classifications based on the values of a single variable polynomial $p(z)$ at the given point $z = \zeta$. A more powerful kind of classification would consider simultaneous approximations by sets of polynomials.
I have little doubt that the modern general transfer theorems in the geometry of numbers of convex bodies are the right tool for attacking such problems.

References

1. A. Baker, On some Diophantine inequalities involving the exponential function, Canad. J. Math. 17 (1965), 616-626. MR 31 #2204.
2. N. I. Fel'dman, Approximation of certain transcendental numbers. I. Approximation of logarithms of algebraic numbers, Izv. Akad. Nauk SSSR Ser. Mat. 15 (1951), 53-74; English transl., Amer. Math. Soc. Transl. (2) 59 (1966), 224-245. MR 12, 595; MR 13, 117.
3. N. I. Fel'dman, On a measure of transcendence of the number e, Uspehi Mat. Nauk 18 (1963), no. 3 (111), 207-213. (Russian) MR 27 #4798.
4. K. Mahler, On the order function of a transcendental number, Acta Arith. 18 (1971), 63-76.
5. W. M. Schmidt, Simultaneous approximation to algebraic numbers by rationals, Acta Math. 125 (1970), 189-201. MR 42 #3028.
6. W. M. Schmidt, Mahler's T-numbers, Proc. Sympos. Pure Math., vol. 20, Amer. Math. Soc., Providence, R.I., 1971, pp. 275-286.
7. T. Schneider, Einführung in die transzendenten Zahlen, Springer-Verlag, Berlin, 1957. MR 19, 252.
8. V. G. Sprindžuk, Mahler's problem in metric number theory, "Nauka i Tehnika", Minsk, 1967; English transl., Transl. Math. Monographs, vol. 25, Amer. Math. Soc., Providence, R.I., 1969. MR 39 #6832; #6833.

Institute of Advanced Studies, Australian National University
Canberra, ACT 2600, Australia
THE PAIR CORRELATION OF ZEROS OF THE ZETA FUNCTION

H. L. MONTGOMERY

AMS 1970 subject classifications. Primary 10H05.

1. Statement of results. We assume the Riemann Hypothesis (RH) throughout this paper; $\rho = \tfrac12 + i\gamma$ denotes a nontrivial zero of the Riemann zeta function. Our object is to investigate the distribution of the differences $\gamma - \gamma'$ between the zeros. It would thus be desirable to know the Fourier transform of the distribution function of the numbers $\gamma - \gamma'$; with this in mind we take

(1) $F(\alpha) = F(\alpha, T) = \left(\dfrac{T}{2\pi}\log T\right)^{-1} \sum_{0<\gamma\le T;\ 0<\gamma'\le T} T^{i\alpha(\gamma-\gamma')}\, w(\gamma-\gamma')$,

where $\alpha$ and $T \ge 2$ are real. Here $w(u)$ is a suitable weighting function, $w(u) = 4/(4+u^2)$, so $w(0) = 1$. Our results concerning $F(\alpha)$ are stated in the following

Theorem. (Assume RH.) For real $\alpha$, $T \ge 2$, let $F(\alpha)$ be defined by (1). Then $F(\alpha)$ is real, and $F(\alpha) = F(-\alpha)$. If $T > T_0(\varepsilon)$ then $F(\alpha) \ge -\varepsilon$ for all $\alpha$. For fixed $\alpha$ satisfying $0 \le \alpha < 1$ we have

(2) $F(\alpha) = (1+o(1))\, T^{-2\alpha}\log T + \alpha + o(1)$

as $T$ tends to infinity; this holds uniformly for $0 \le \alpha \le 1-\varepsilon$.

The first term on the right-hand side of the above behaves in the limit as a Dirac $\delta$-function; it reflects the fact that if $\alpha = 0$ then all the terms in (1) are positive. With more effort we could show that (2) holds uniformly throughout $0 \le \alpha \le 1$.

To investigate sums involving $\gamma - \gamma'$ we have only to convolve $F(\alpha)$ with an
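appropriate kernel $\hat r(\alpha)$; from (1) alone it is immediate that

(3) $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} r\!\left((\gamma-\gamma')\dfrac{\log T}{2\pi}\right) w(\gamma-\gamma') = \left(\dfrac{T}{2\pi}\log T\right) \int_{-\infty}^{+\infty} F(\alpha)\,\hat r(\alpha)\,d\alpha$.

Here $\hat r$ is the Fourier transform of $r$,

(4) $\hat r(\alpha) = \int_{-\infty}^{+\infty} r(u)\, e(-\alpha u)\,du \qquad (e(\theta) = e^{2\pi i\theta})$.

Our theorem gives us little information about $F(\alpha)$ for $\alpha \ge 1$, so for the most part we restrict our attention to kernels $\hat r$ which vanish outside $[-1+\delta,\ 1-\delta]$. Particular choices of $\hat r(\alpha)$ give us

Corollary 1. (Assume RH.) If $0 < \alpha < 1$ is fixed then

(5) $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \left(\dfrac{\sin \alpha(\gamma-\gamma')\log T}{\alpha(\gamma-\gamma')\log T}\right) w(\gamma-\gamma') \sim \left(\dfrac{1}{2\alpha} + \dfrac{\alpha}{2}\right)\dfrac{T}{2\pi}\log T$,

and

(6) $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \left(\dfrac{\sin (\alpha/2)(\gamma-\gamma')\log T}{(\alpha/2)(\gamma-\gamma')\log T}\right)^{2} w(\gamma-\gamma') \sim \left(\dfrac{1}{\alpha} + \dfrac{\alpha}{3}\right)\dfrac{T}{2\pi}\log T$.

In the latter assertion one can delete the factor $w(\gamma-\gamma')$ if one wishes. We use (6) to derive

Corollary 2. (Assume RH.) As $T$ tends to infinity

(7) $\sum_{0<\gamma\le T;\ \rho\ \mathrm{simple}} 1 \ \ge\ \left(\tfrac23 + o(1)\right)\dfrac{T}{2\pi}\log T$.

The number of zeros of $\zeta(s)$ with $0 < \gamma \le T$ is $\sim (T/2\pi)\log T$, so the above asserts that at least $\tfrac23$ of the zeros are simple. It is known (see [6]) that the first 3,500,000 zeros are simple and lie on the critical line $\sigma = \tfrac12$. Although one expects that all the zeros of $\zeta(s)$ are simple, the only other result in this direction is due to A. Selberg [7]. His result holds unconditionally; it states that a positive density of the zeros of $\zeta(s)$ are of odd order and lie on the critical line.

Let $0 < \gamma_1 \le \gamma_2 \le \cdots$ denote the imaginary parts of the zeros of $\zeta(s)$ in the upper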
half-plane. The average of $\gamma_{n+1} - \gamma_n$ is $2\pi/\log\gamma_n$; our Theorem enables us to show that $\gamma_{n+1} - \gamma_n$ is not always near its average.

Corollary 3. (Assume RH.) We can compute a constant $\lambda$ so that

(8) $\liminf_{n} (\gamma_{n+1} - \gamma_n)(\log\gamma_n/2\pi) \le \lambda < 1$.

A complicated argument would permit one to show that in fact $\gamma_{n+1} - \gamma_n \le 2\pi\lambda/\log\gamma_n$ for a positive density of $n$. This, with the fact that the average value is $2\pi/\log\gamma_n$, enables one to assert that

(9) $\limsup_{n} (\gamma_{n+1} - \gamma_n)(\log\gamma_n/2\pi) \ge \lambda' > 1$.

We note that if $\zeta(s)$ has infinitely many multiple zeros then we may take $\lambda = 0$ in (8). Our proof allows us to take $\lambda = 0.68$. It would be of interest to have $\lambda < \tfrac14$, as P. J. Weinberger and I have established the following: Let $d > 0$ be square-free, and put $K = Q((-d)^{1/2})$. Let $h(-d)$ be the class number of $K$, and let $\zeta_K(s) = \zeta(s)L(s, \chi)$ be the Dedekind zeta function of $K$. For each positive $A$, $\varepsilon$ there is an effectively computable constant $d_0 = d_0(A, \varepsilon)$ such that if $h(-d) \le A$, $d > d_0$, then all zeros of $\zeta_K(s)$ which are in the rectangle $0 < \sigma < 1$, $0 \le t \le d^{1/2-\varepsilon}$ lie on the line $\sigma = \tfrac12$; if $\tfrac12 + i\gamma_n$, $\tfrac12 + i\gamma_{n+1}$ are consecutive zeros of $\zeta_K(s)$ in this range then

(10) $(1-\varepsilon)\dfrac{2\pi}{\log d(\gamma_n+2)^2} \le \gamma_{n+1} - \gamma_n \le (1+\varepsilon)\dfrac{2\pi}{\log d(\gamma_n+2)^2}$.

One may inquire about the behaviour of $F(\alpha)$ for $\alpha \ge 1$. Our first observation is that (2) cannot hold uniformly for $0 \le \alpha \le C$ if $C$ is large. For if it did then (6) would hold for $0 < \alpha \le C$. Write (6) as $G(\alpha) \sim H(\alpha)$. On one hand $|\sin 2x| \le 2|\sin x|$, so $G(2\alpha) \le G(\alpha)$ for all $\alpha$. On the other hand $H(2\alpha) > \tfrac43 H(\alpha)$ for $\alpha \ge 2$. This suggests that $F(\alpha)$ makes some change in its behaviour for $\alpha \ge 1$. Further considerations of the above sort lead one to believe that certain averages of $F(\alpha)$ over large $\alpha$ are close to 1. At the end of §3 we describe two heuristic arguments which suggest that

(11) $F(\alpha) = 1 + o(1)$

for $\alpha \ge 1$, uniformly in bounded intervals. This, with the Theorem, completely determines $F$, so an appropriate use of (3) leads immediately to a

Conjecture. For fixed $\alpha < \beta$,
(12) $\sum_{\substack{0<\gamma\le T;\ 0<\gamma'\le T \\ 2\pi\alpha/\log T \,\le\, \gamma-\gamma' \,\le\, 2\pi\beta/\log T}} 1 \ \sim\ \left( \int_{\alpha}^{\beta} \left(1 - \left(\dfrac{\sin \pi u}{\pi u}\right)^{2}\right) du + \delta(\alpha, \beta) \right) \dfrac{T}{2\pi}\log T$

as $T$ tends to infinity. Here $\delta(\alpha, \beta) = 1$ if $0 \in [\alpha, \beta]$, $\delta(\alpha, \beta) = 0$ otherwise.

The Dirac $\delta$-function occurs naturally in the above, for if $0 \in [\alpha, \beta]$ then the sum includes terms $\gamma = \gamma'$. The assertions (11) and (12) are essentially equivalent. From either it immediately follows that almost all zeros are simple.

From (11) it is easy to see how Corollary 1 ought to be extended: If (11) is true then for $\alpha \ge 1$,

(13) $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \left(\dfrac{\sin \alpha(\gamma-\gamma')\log T}{\alpha(\gamma-\gamma')\log T}\right) w(\gamma-\gamma') \sim \dfrac{T}{2\pi}\log T$,

and

(14) $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \left(\dfrac{\sin (\alpha/2)(\gamma-\gamma')\log T}{(\alpha/2)(\gamma-\gamma')\log T}\right)^{2} w(\gamma-\gamma') \sim \left(1 + \dfrac{1}{3\alpha^2}\right)\dfrac{T}{2\pi}\log T$.

In a certain standard terminology the Conjecture may be formulated as the assertion that $1 - ((\sin \pi u)/\pi u)^2$ is the pair correlation function of the zeros of the zeta function. F. J. Dyson has drawn my attention to the fact that the eigenvalues of a random complex Hermitian or unitary matrix of large order have precisely the same pair correlation function (see [3, equations (6.13), (9.61)]). This means that the Conjecture fits well with the view that there is a linear operator (not yet discovered) whose eigenvalues characterize the zeros of the zeta function. The eigenvalues of a random real symmetric matrix of large order have a different pair correlation, and the eigenvalues of a random symplectic matrix of large order have yet another pair correlation. In fact the "form factors" $F_r(\alpha)$, $F_s(\alpha)$ of these latter pair correlations are nonlinear for $0 < \alpha < 1$, so our Theorem enables us to distinguish the behaviour of the zeros of $\zeta(s)$ from the eigenvalues of such matrices. Hence, if there is a linear operator whose eigenvalues characterize the zeros of the zeta function, we might expect that it is complex Hermitian or unitary.

One might extend the present work to investigate the k-tuple correlation of the zeros of the zeta function.
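The passage from (11) and the Theorem to the kernel in (12) rests on one Fourier pair: if the limiting form factor is $\min(|\alpha|, 1)$ apart from the $\delta$-term at $\alpha = 0$, then $1 - F$ is the triangle $\max(1-|\alpha|, 0)$, whose transform in the convention (4) is $((\sin \pi u)/\pi u)^2$. The following is a minimal numerical check of that pair, assuming this reading of the limit; the names `simpson`, `triangle_transform`, `sinc_squared` and `pair_correlation_density` are illustrative and not from the paper.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def triangle_transform(u):
    """Integral over a of max(1 - |a|, 0) * e(-a*u).

    The integrand is even in a, so only the cosine part survives.
    """
    return simpson(
        lambda a: (1 - abs(a)) * math.cos(2 * math.pi * a * u), -1.0, 1.0
    )


def sinc_squared(u):
    """((sin pi u) / (pi u))^2, with the value 1 at u = 0."""
    if u == 0:
        return 1.0
    return (math.sin(math.pi * u) / (math.pi * u)) ** 2


def pair_correlation_density(u):
    """Conjectured density 1 - ((sin pi u)/(pi u))^2 for differences u != 0."""
    return 1.0 - sinc_squared(u)


if __name__ == "__main__":
    for u in (0.0, 0.3, 1.0, 1.7):
        print(u, triangle_transform(u), sinc_squared(u))
    # Conjectured share of pairs with normalized difference in (0, 1]:
    print(simpson(pair_correlation_density, 0.0, 1.0))
```

The density vanishes at $u = 0$, which is the numerical face of the repulsion between nearby zeros that the Conjecture predicts.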
If the analogy with random complex Hermitian matrices appears to continue, then one might conjecture that the k-tuple correlation function $F(u_1, u_2, \dots, u_k)$ is given by

(15) $F(u_1, u_2, \dots, u_k) = \det A$,
where $A = [a_{ij}]$ is the $k \times k$ matrix with entries $a_{ii} = 1$, $a_{ij} = (\sin \pi(u_i - u_j))/\pi(u_i - u_j)$ for $i \neq j$. Here the normalization is the same as in the Conjecture, which is the case $k = 2$ of the above.

If one continues to draw on the analogy with random complex Hermitian matrices then one may formulate a conjecture concerning the distribution of the numbers $\gamma_{n+1} - \gamma_n$. The precise conjecture involves a complicated (but calculable) spheroidal function. Thus, or otherwise, one may conjecture that

(16) $\liminf_{n} (\gamma_{n+1} - \gamma_n)\log\gamma_n = 0$,

and

(17) $\limsup_{n} (\gamma_{n+1} - \gamma_n)\log\gamma_n = +\infty$;

so Corollary 3 is probably far from the truth.

It would be interesting to see how numerical evidence compares with the above conjectures. The first several thousand zeros have been computed, so it would not be difficult to assemble relevant statistics. However, data on the failures of "Gram's law" indicate that the asymptotic behaviour is approached very slowly. Thus the numerical evidence may not be particularly illuminating.

2. An explicit formula. In proving our Theorem we require the following formula, which relates zeros of $\zeta(s)$ to prime numbers.

Lemma. If $1 < \sigma < 2$ and $x \ge 1$ then

(18) $(2\sigma-1)\sum_{\gamma} \dfrac{x^{i\gamma}}{(\sigma-\frac12)^2 + (t-\gamma)^2} = -x^{-1/2}\left( \sum_{n\le x}\Lambda(n)\left(\dfrac{x}{n}\right)^{1-\sigma+it} + \sum_{n>x}\Lambda(n)\left(\dfrac{x}{n}\right)^{\sigma+it} \right) + x^{1/2-\sigma+it}\left(\log\tau + O_\sigma(1)\right) + O_\sigma(x^{1/2}\tau^{-1})$,

where $\tau = |t| + 2$. The implicit constants depend only on $\sigma$.

Proof. It is well known (see [2, p. 353]) that if $x > 1$, $x \neq p^n$, then

$\sum_{n\le x}\Lambda(n)n^{-s} = -\dfrac{\zeta'}{\zeta}(s) + \dfrac{x^{1-s}}{1-s} - \sum_{\rho}\dfrac{x^{\rho-s}}{\rho-s} + \sum_{n=1}^{\infty}\dfrac{x^{-2n-s}}{2n+s}$,

provided $s \neq 1$, $s \neq \rho$, $s \neq -2n$. This does not depend on RH, but if we assume RH then the above may be expressed as
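(19) $\sum_{\gamma}\dfrac{x^{i\gamma}}{\sigma-\frac12+it-i\gamma} = x^{\sigma-1/2+it}\left( \dfrac{\zeta'}{\zeta}(s) + \sum_{n\le x}\Lambda(n)n^{-\sigma-it} - \dfrac{x^{1-\sigma-it}}{1-\sigma-it} - \sum_{n=1}^{\infty}\dfrac{x^{-2n-\sigma-it}}{2n+\sigma+it} \right)$.

If we replace $s$ by $1-\sigma+it$ in the above then we have

(20) $\sum_{\gamma}\dfrac{x^{i\gamma}}{\frac12-\sigma+it-i\gamma} = x^{1/2-\sigma+it}\left( \dfrac{\zeta'}{\zeta}(1-\sigma+it) + \sum_{n\le x}\Lambda(n)n^{-1+\sigma-it} - \dfrac{x^{\sigma-it}}{\sigma-it} - \sum_{n=1}^{\infty}\dfrac{x^{-2n-1+\sigma-it}}{2n+1-\sigma+it} \right)$.

We subtract respective sides of (20) from (19), and use the relation

(21) $\dfrac{\zeta'}{\zeta}(s) = -\sum_{n=1}^{\infty}\Lambda(n)n^{-s}$,

which holds for $\sigma > 1$. We find that

(22) $(2\sigma-1)\sum_{\gamma}\dfrac{x^{i\gamma}}{(\sigma-\frac12)^2+(t-\gamma)^2} = -x^{-1/2}\left( \sum_{n\le x}\Lambda(n)\left(\dfrac{x}{n}\right)^{1-\sigma+it} + \sum_{n>x}\Lambda(n)\left(\dfrac{x}{n}\right)^{\sigma+it} \right) - \dfrac{\zeta'}{\zeta}(1-\sigma+it)\,x^{1/2-\sigma+it} + \dfrac{x^{1/2}(2\sigma-1)}{(\sigma-1+it)(\sigma-it)} - x^{-1/2}\sum_{n=1}^{\infty}\dfrac{(2\sigma-1)x^{-2n}}{(\sigma-1-it-2n)(\sigma+it+2n)}$.

Both sides of the above are continuous for all $x \ge 1$, so we no longer exclude the values $x = 1$, $x = p^n$. If $1 < \sigma < 2$, then from the logarithmic derivative of the functional equation of the zeta function (see [1, pp. n5, 82-83]) we have

$\dfrac{\zeta'}{\zeta}(1-\sigma+it) = -\dfrac{\zeta'}{\zeta}(\sigma-it) - \log\tau + O_\sigma(1)$;

from (21) we see that this is $= -\log\tau + O_\sigma(1)$. Hence the right-hand side of (22) is

$-x^{-1/2}\left( \sum_{n\le x}\Lambda(n)\left(\dfrac{x}{n}\right)^{1-\sigma+it} + \sum_{n>x}\Lambda(n)\left(\dfrac{x}{n}\right)^{\sigma+it} \right) + x^{1/2-\sigma+it}\left(\log\tau + O_\sigma(1)\right) + O_\sigma(x^{1/2}\tau^{-2}) + O_\sigma(x^{-2}\tau^{-1})$,

which gives the result.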
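The left-hand side of (22) comes from combining the two denominators in (19) and (20) term by term. As a sanity check on that algebra, the sketch below compares the difference of the two simple fractions with the kernel $(2\sigma-1)/((\sigma-\frac12)^2+(t-\gamma)^2)$ at a few sample points; the function names and the sample values are illustrative assumptions.

```python
def lorentz_difference(sigma, t, gamma):
    """1/(sigma - 1/2 + i(t - gamma)) - 1/(1/2 - sigma + i(t - gamma))."""
    a = sigma - 0.5
    y = t - gamma
    return 1 / complex(a, y) - 1 / complex(-a, y)


def lorentz_kernel(sigma, t, gamma):
    """(2*sigma - 1) / ((sigma - 1/2)^2 + (t - gamma)^2)."""
    return (2 * sigma - 1) / ((sigma - 0.5) ** 2 + (t - gamma) ** 2)


if __name__ == "__main__":
    samples = [(1.5, 0.0, 14.134725), (1.2, 30.0, 21.022040), (1.9, 7.5, 7.5)]
    for sigma, t, gamma in samples:
        diff = lorentz_difference(sigma, t, gamma)
        print(sigma, t, gamma, diff, lorentz_kernel(sigma, t, gamma))
```

The difference is real, and for $\sigma = \tfrac32$ the kernel reduces to $2/(1+(t-\gamma)^2)$.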
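3. Proof of the Theorem. The first assertion of the Theorem follows from the observation that we may interchange $\gamma$ and $\gamma'$ in (1). To prove the remaining assertions, take $\sigma = \tfrac32$ in the Lemma, and write (18) briefly as $L(x, t) = R(x, t)$. We evaluate the integrals $\int_0^T |L(x, t)|^2\,dt$, $\int_0^T |R(x, t)|^2\,dt$.

We treat the left-hand side first. We have

(23) $\int_0^T |L(x, t)|^2\,dt = 4\int_0^T \sum_{\gamma,\gamma'} \dfrac{x^{i(\gamma-\gamma')}}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)}\,dt$.

We wish to exclude those numbers $\gamma \notin [0, T]$. It suffices to show that

(24) $\sum_{\gamma,\gamma';\ \gamma\notin[0,T]} \int_0^T \dfrac{dt}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)} \ll \log^3 T$,

for then (23) is

(25) $= 4\sum_{0<\gamma\le T;\ 0<\gamma'\le T} x^{i(\gamma-\gamma')}\int_0^T \dfrac{dt}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)} + O(\log^3 T)$.

To prove (24) we use the fact (Theorem 9.2 of [8]) that if $T \ge 2$ then there are $\ll \log T$ zeros for which $T \le \gamma \le T+1$. From this it is immediate that if $0 \le t \le T$ then

$\sum_{\gamma\notin[0,T]} \dfrac{1}{1+(t-\gamma)^2} \ll \left(\dfrac{1}{t+1} + \dfrac{1}{T-t+1}\right)\log T$,

and

$\sum_{\gamma} \dfrac{1}{1+(t-\gamma)^2} \ll \log T$.

On the left-hand side of (24) we take the sums inside and use the above estimates. The integration is then trivial, and we obtain (24). Arguing similarly we may also show that

$\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \int_T^{\infty} \dfrac{dt}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)} \ll \log^3 T$.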
The estimation of $\sum_{0<\gamma\le T;\ 0<\gamma'\le T}\int_{-\infty}^{0}\cdots$ is the same, so we see that (25) is

$= 4\sum_{0<\gamma\le T;\ 0<\gamma'\le T} x^{i(\gamma-\gamma')}\int_{-\infty}^{+\infty} \dfrac{dt}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)} + O(\log^3 T)$.

From the calculus of residues we deduce that the definite integral is $= (\pi/2)\,w(\gamma-\gamma')$, so the above is

$= 2\pi\sum_{0<\gamma\le T;\ 0<\gamma'\le T} x^{i(\gamma-\gamma')}\,w(\gamma-\gamma') + O(\log^3 T)$.

If we put $x = T^{\alpha}$ then we have

(26) $\int_0^T |L(T^{\alpha}, t)|^2\,dt = F(\alpha)\,T\log T + O(\log^3 T)$.

Here the left-hand side is clearly nonnegative, so we have the second assertion of the Theorem.

To complete the proof of the Theorem we prove (2); to this end we evaluate $\int_0^T |R(x, t)|^2\,dt$. In the first place

(27) $\int_0^T |x^{-1+it}\log\tau|^2\,dt = \dfrac{T}{x^2}\left(\log^2 T + O(\log T)\right)$

for all $x \ge 1$, $T \ge 2$. To compute the mean square of the Dirichlet series on the right-hand side of (18) we use the following quantitative form (see [5]) of Parseval's identity for Dirichlet series:

(28) $\int_0^T \left|\sum_{n} a_n n^{-it}\right|^2 dt = \sum_{n} |a_n|^2\,(T + O(n))$.

We could instead use the weaker relation

$\int_0^T \left|\sum_{n\le N} a_n n^{-it}\right|^2 dt = (T + O(N))\sum_{n\le N} |a_n|^2$;
this is Theorem 1.6 of [4]. However, the latter is restricted to Dirichlet polynomials, so we simplify our treatment by arguing from (28). We have

$\dfrac{1}{x}\int_0^T \left| \sum_{n\le x}\Lambda(n)\left(\dfrac{x}{n}\right)^{-1/2+it} + \sum_{n>x}\Lambda(n)\left(\dfrac{x}{n}\right)^{3/2+it} \right|^2 dt = \dfrac{1}{x}\sum_{n\le x}\Lambda(n)^2\left(\dfrac{n}{x}\right)(T + O(n)) + \dfrac{1}{x}\sum_{n>x}\Lambda(n)^2\left(\dfrac{x}{n}\right)^{3}(T + O(n))$.

By the prime number theorem with error term this is

(29) $= T(\log x + O(1)) + O(x\log x)$.

As for the error terms in (18), we see that

(30) $\int_0^T |x^{-1+it}|^2\,dt \ll \dfrac{T}{x^2}$,

and

(31) $\int_0^T |x^{1/2}\tau^{-1}|^2\,dt \ll x$.

We now combine our estimates (27), (29), (30), (31); we employ the following consequence of the Cauchy-Schwarz inequality: If $M_k = \int_0^T |A_k(t)|^2\,dt$ and $M_1 \ge M_2 \ge M_3 \ge M_4$, then

$\int_0^T \left|\sum_{k=1}^{4} A_k(t)\right|^2 dt = M_1 + O\!\left((M_1 M_2)^{1/2}\right)$.

We consider three cases.

Case 1. $1 \le x \le (\log T)^{3/4}$. Then our $M_1$ term is given by (27). Our other terms are uniformly $o(M_1)$, so our expression is $= (1+o(1))\,(T/x^2)\log^2 T$.

Case 2. $(\log T)^{3/4} < x \le (\log T)^{3/2}$. In this case all our estimates are uniformly $o(T\log T)$.

Case 3. $(\log T)^{3/2} < x \le T/\log T$. Then our $M_1$ term is given by (29). All our
other terms are uniformly $o(M_1)$, so our expression is $= (1+o(1))\,T\log x$.

If we put $x = T^{\alpha}$ then we may express our result by saying that

$\int_0^T |R(T^{\alpha}, t)|^2\,dt = \left((1+o(1))\,T^{-2\alpha}\log T + \alpha + o(1)\right) T\log T$,

uniformly for $0 \le \alpha \le 1-\varepsilon$. This and (26) give (2), so the proof is complete.

If $\alpha > 1$ in the above then $x > T$, so the second error term in (29) is no longer smaller than the main term. The error term (31) also gives problems; a little consideration reveals that what we require is to know the size of

(32) $\int_0^T \left| \dfrac{1}{x}\sum_{n\le x}\Lambda(n)n^{1/2-it} + x\sum_{n>x}\Lambda(n)n^{-3/2-it} - \dfrac{2x^{1/2-it}}{(\frac12+it)(\frac32-it)} \right|^2 dt$.

If we multiply out the integrand and integrate terms individually, we find that there are too many nondiagonal terms to be ignored. We may, however, collect terms so that the above is expressed in terms of sums of the sort $\sum_{n\le y}\Lambda(n)\Lambda(n+h)$. There are various indications that this sum is approximately $c(h)\,y$, where $c(h)$ is a certain arithmetic constant. If we replace these sums by their conjectured approximations $c(h)\,y$, then our new expression is $\sim T\log T$. Moreover, there is a reasonable hypothesis as to the size of the differences

(33) $\sum_{n\le y}\Lambda(n)\Lambda(n+h) - c(h)\,y$

which if true would allow us to carry out our program for $1 \le \alpha < 2$. If the differences (33) are not only reasonably small but also behave independently for different $h$ then (32) is $\sim T\log T$ for all $\alpha \ge 1$.

Another indication of the behaviour of the expression (32) can be obtained by considering its "q-analogue." The expression

(34) $\sum_{q\le Q}\dfrac{1}{\varphi(q)}\sum_{\chi\neq\chi_0} \left| \dfrac{1}{x}\sum_{n\le x}\Lambda(n)\chi(n)n^{1/2} + x\sum_{n>x}\Lambda(n)\chi(n)n^{-3/2} \right|^2$

may be shown to be $\sim Q\log x$ for $Q \ge x$, in analogy with (29). If $x(\log x)^{-A} \le Q \le x$ then we may use an established technique [4, Chapter 17] to show that (34) is $\sim Q\log Q$. If GRH is true then this latter asymptotic relationship holds for $x^{3/4+\varepsilon} < Q \le x$. This corresponds to $1 \le \alpha < \tfrac43$. One does not expect a change in the
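behaviour for larger $\alpha$, but a more delicate error-term analysis is needed if the result is to be extended.

4. The corollaries. To prove Corollary 1 we use our Theorem in conjunction with (3). To obtain (5) we take $r(u) = (\sin 2\pi\alpha u)/2\pi\alpha u$. The Theorem makes it a simple task to compute

$\int_{-\infty}^{+\infty} F(\beta)\,\hat r(\beta)\,d\beta = \dfrac{1}{2\alpha}\int_{-\alpha}^{\alpha} F(\beta)\,d\beta$.

To obtain (6) we take $r(u) = ((\sin \pi\alpha u)/\pi\alpha u)^2$. Again from the Theorem it is easy to compute

$\int_{-\infty}^{+\infty} F(\beta)\,\hat r(\beta)\,d\beta = \dfrac{1}{\alpha^2}\int_{-\alpha}^{\alpha} (\alpha - |\beta|)\,F(\beta)\,d\beta$.

We now prove Corollary 2. Let $m_\rho$ be the multiplicity of the zero $\rho$. In a sum over $0 < \gamma \le T$, our convention concerning multiple zeros is that zeros are counted according to their multiplicities. This is accomplished by allowing $\gamma$ to take on the same value $m_\rho$ times. In particular,

$\sum_{0<\gamma\le T} m_\rho = \sum_{\substack{0<\gamma\le T;\ 0<\gamma'\le T \\ \gamma=\gamma'}} 1$,

for on both sides a given zero $\rho$ is counted with weight $m_\rho^2$. But

$\sum_{\substack{0<\gamma\le T;\ 0<\gamma'\le T \\ \gamma=\gamma'}} 1 \ \le \sum_{0<\gamma\le T;\ 0<\gamma'\le T} \left(\dfrac{\sin (\alpha/2)(\gamma-\gamma')\log T}{(\alpha/2)(\gamma-\gamma')\log T}\right)^{2} w(\gamma-\gamma')$,

and if we take $\alpha = 1-\delta$ then from (6) the above is $\le (\tfrac43 + \varepsilon)(T/2\pi)\log T$. Hence we have demonstrated that

$\sum_{0<\gamma\le T} m_\rho \le \left(\tfrac43 + o(1)\right)(T/2\pi)\log T$.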
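Now

$\sum_{0<\gamma\le T;\ \rho\ \mathrm{simple}} 1 \ \ge \sum_{0<\gamma\le T} (2 - m_\rho) \ \ge\ \left(2 - \tfrac43 + o(1)\right)\dfrac{T}{2\pi}\log T$,

so we have Corollary 2. The kernel $r(u)$ which we have used does not appear to be optimal for our purpose, so presumably one can improve slightly on the constant $\tfrac23$.

We now turn to the first assertion of Corollary 3. We take $r(u) = \max(1 - (|u|/\lambda),\ 0)$ in (3), and choose $\lambda$ later. Now $\hat r(\alpha)$ is nonnegative, and $\int_{-\infty}^{+\infty} \hat r(\alpha)\,d\alpha < \infty$, so our Theorem permits us to calculate a lower bound for the right-hand side of (3). We see that

$\left(\dfrac{T}{2\pi}\log T\right)\int_{-\infty}^{+\infty} F(\alpha)\,\hat r(\alpha)\,d\alpha \ \ge\ (1+o(1))\left( \lambda + 2\lambda\int_0^1 \alpha\left(\dfrac{\sin \pi\lambda\alpha}{\pi\lambda\alpha}\right)^{2} d\alpha \right)\dfrac{T}{2\pi}\log T$.

We may assume that all but finitely many zeros are simple, so the terms $\gamma = \gamma'$ in (3) contribute an amount $\sim (T/2\pi)\log T$. Hence

$\sum_{\substack{0<\gamma\le T;\ 0<\gamma'\le T \\ 0<\gamma-\gamma'<2\pi\lambda/\log T}} 1 \ \ge\ \left(\tfrac12 + o(1)\right) C(\lambda)\,\dfrac{T}{2\pi}\log T$,

where

$C(\lambda) = \lambda + (1/\pi^2\lambda)\,\mathrm{Cin}(2\pi\lambda) - 1$.

Here $\mathrm{Cin}(x)$ is the "cosine integral,"

$\mathrm{Cin}\,x = \int_0^x \dfrac{1-\cos u}{u}\,du$.

Note that the integrand is nonnegative, so that $\mathrm{Cin}(x) > 0$ for $x > 0$. To obtain (8) we show that $C(\lambda) > 0$ for some $\lambda < 1$. This is easy, because $C(1) = (1/\pi^2)\cdot\mathrm{Cin}(2\pi) > 0$, and $C(\lambda)$ is continuous. In fact a little calculation reveals that $C(0.68) > 0$. We have not determined the optimal kernel $r(u)$, so one should be able to improve on the constant 0.68.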
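The "little calculation" behind $C(0.68) > 0$ is easy to repeat. The sketch below evaluates $\mathrm{Cin}$ by a composite Simpson rule and then $C(\lambda)$ from the formula above; the function names `cin` and `C`, the number of Simpson panels, and the sample values of $\lambda$ are assumptions for illustration, and no claim is made that this is how the constant was originally obtained.

```python
import math


def cin(x, n=2000):
    """Cin(x) = integral of (1 - cos u)/u over [0, x], composite Simpson, n even."""
    def f(u):
        # The integrand tends to 0 as u -> 0, so the value at 0 is taken as 0.
        return 0.0 if u == 0 else (1 - math.cos(u)) / u

    h = x / n
    total = f(0.0) + f(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3


def C(lam):
    """C(lambda) = lambda + Cin(2*pi*lambda) / (pi^2 * lambda) - 1."""
    return lam + cin(2 * math.pi * lam) / (math.pi ** 2 * lam) - 1


if __name__ == "__main__":
    for lam in (0.60, 0.68, 1.00):
        print(lam, C(lam))
```

With these inputs the sign of $C$ changes between $\lambda = 0.60$ and $\lambda = 0.68$, in agreement with the choice of constant in the text.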
References

1. H. Davenport, Multiplicative number theory, Lectures in Advanced Math., no. 1, Markham, Chicago, Ill., 1967. MR 36 #117.
2. Edmund Landau, Handbuch der Lehre von der Verteilung der Primzahlen, Teubner, Berlin, 1909.
3. M. L. Mehta, Random matrices and the statistical theory of energy levels, Academic Press, New York, 1967. MR 36 #3554.
4. Hugh L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227, Springer-Verlag, Berlin and New York, 1971.
5. Hugh L. Montgomery and R. C. Vaughan, Hilbert's inequality (to appear).
6. J. B. Rosser, J. M. Yohe and L. Schoenfeld, Rigorous computation and the zeros of the Riemann zeta-function (with discussion), Information Processing 68 (Proc. IFIP Congress, Edinburgh, 1968), vol. 1, Math., Software, North-Holland, Amsterdam, 1969, pp. 70-76. MR 41 #2892.
7. Atle Selberg, On the zeros of Riemann's zeta-function, Skr. Norske Vid. Akad. Oslo I (1942), no. 10, 59 pp. MR 6, 58.
8. E. C. Titchmarsh, The theory of the Riemann zeta-function, Clarendon Press, Oxford, 1951. MR 13, 741.

Trinity College
Cambridge, England
METRIC THEOREMS ON THE DISTRIBUTION OF SEQUENCES

H. NIEDERREITER

AMS 1970 subject classifications. Primary 10-02, 10F40, 10K05; Secondary 05A05, 60F10, 62-02, 62G15.

1. Introduction. The theory of uniform distribution of sequences, historically an outgrowth of diophantine approximations, is more and more becoming the meeting ground of areas as diverse as number theory, harmonic analysis, probability, and functional analysis. The very definition of uniform distribution is in essence of a probabilistic nature, and so this aspect of the theory is rather evident. The interest of number theorists in the subject centers mainly around the study of various special sequences stemming from, or having an impact on, classical areas of number theory. Well-known is the interplay between uniform distribution modulo one and, to name only a few, diophantine approximations, normal numbers, exponential sums, and the probabilistic theory of additive and multiplicative functions. But there are also relations to the distribution of primes (see [40]) and the theory of transcendental numbers (see [33, § 4]). For surveys of the general theory of uniform distribution, we refer to Koksma [28], Cigler and Helmberg [9], and Kuipers and Niederreiter [30].

In this paper, we shall adopt the probabilistic viewpoint. The general problem we are facing may be very crudely stated as follows. Given a class of sequences in the unit interval or in some multidimensional unit cube, determine whether it obeys some of the common laws of probability theory. For instance, can a sequence chosen at random from the given class be expected to be uniformly distributed? If so, can one prove quantitative refinements of the result? What is the average order of magnitude of the discrepancy for sequences from the given class? Does the law of the iterated logarithm hold? In this generality, one cannot hope for decisive answers to these questions. Therefore, one usually considers only fairly explicit classes of sequences of either probabilistic or number-theoretic significance. Most of the results described here will concern the class of all sequences contained in the domain under consideration. However, in the last section, we provide some information about important classes of special sequences which have received much attention in the literature. For an account of the interplay between uniform distribution and probability on a general level, see the paper of Kemperman [25].

The present paper is to a large part expository, since the detailed proofs of the new results will be published elsewhere (see [38], [39]). It is divided into six sections. A brief summary follows. In § 2, we introduce the so-called discrepancy of a sequence which provides a quantitative measure for the deviation of the distribution of the sequence from a given distribution. We mention in fact various notions of discrepancy which have been considered in the literature, but we shall in the sequel concentrate on the so-called extreme discrepancy. The purpose of the next section is to exhibit the close relation between discrepancy and the theory of empirical distribution functions, a relation which has not been noticed so far. The concept of extreme discrepancy turns out to be a rather special case of a notion very familiar to statisticians, namely Kolmogorov's two-sided test. In § 4 we discuss "global results", i.e., results pertaining to the class of all sequences contained in the domain under consideration. In the one-dimensional case, we have the general limit theorem of Kolmogorov for empirical distribution functions. An asymptotic expansion of this limit theorem can be used to obtain a law of the iterated logarithm. In the multidimensional case, no analogue to Kolmogorov's limit theorem is known, but a law of the iterated logarithm can still be obtained. Some interesting combinatorial aspects arise in § 5.
Here we are concerned with finding the measure of the set of sequences in the unit interval whose $N$th discrepancy is bounded by some prescribed real number. Notions from transversal theory will turn out to be useful. In the last section, we give a survey of the most interesting results on classes of special sequences, notably lacunary sequences.

Due to the large number of papers written on the subjects covered in this paper, especially on the topics belonging to probability and statistics, the bibliography constitutes only a selection of the literature, but I hope it is a representative and useful one.

I would like to take this opportunity to thank Professor Walter Philipp for many enlightening conversations on the subject of probabilistic number theory.

2. Discrepancy. Let $E = [0, 1)$ be the unit interval, and let $\omega = (x_n)$, $n = 1, 2, \ldots$, be a sequence of elements from $E$. We are interested in the distribution of the sequence $\omega$ in $E$. For a positive integer $N$ and a subset $M$ of $E$, let $A(M;N)$ be the number of
elements $x_n$, $1 \le n \le N$, with $x_n \in M$. Furthermore, let $f$ be a continuous distribution function on $[0, 1]$, i.e., a continuous nondecreasing function $f$ on $[0, 1]$ with $f(0) = 0$ and $f(1) = 1$.

Definition 2.1. For $N \ge 1$ and $0 \le \alpha \le 1$, we define the local discrepancy $R_N(\alpha; f)$ of the sequence $\omega$ with respect to the continuous distribution function $f$ by
\[ R_N(\alpha; f) = A([0, \alpha); N)/N - f(\alpha). \]

Definition 2.2. The sequence $\omega$ is called uniformly distributed in $E$ with respect to the distribution function $f$ if $\lim_{N\to\infty} R_N(\alpha; f) = 0$ for $0 \le \alpha \le 1$. If $f_u$ is the uniform distribution on $[0, 1]$, i.e., $f_u(x) = x$ for $0 \le x \le 1$, then a sequence $\omega$ uniformly distributed in $E$ with respect to $f_u$ is simply called uniformly distributed in $E$.

We arrive at a notion of discrepancy by taking some function norm of the local discrepancy $R_N(\alpha; f)$, considered as a function of $\alpha$ on $[0, 1]$. Using the most common norms, namely the supremum norm and the $L^p$ norm, we arrive at the following definitions, respectively.

Definition 2.3. For $N \ge 1$, the (extreme) discrepancy $D_N(\omega; f)$ of the sequence $\omega$ with respect to $f$ is defined by
\[ D_N(\omega; f) = \sup_{0 \le \alpha \le 1} |R_N(\alpha; f)|. \]
If $f = f_u$, then we write $D_N(\omega)$ instead of $D_N(\omega; f_u)$.

Definition 2.4. For $1 \le p < \infty$ and $N \ge 1$, the $L^p$ discrepancy $D_N^{(p)}(\omega; f)$ of the sequence $\omega$ with respect to $f$ is defined by
\[ D_N^{(p)}(\omega; f) = \left( \int_0^1 |R_N(\alpha; f)|^p \, d\alpha \right)^{1/p}. \]
If $f = f_u$, then we write $D_N^{(p)}(\omega)$ instead of $D_N^{(p)}(\omega; f_u)$.

We have obvious inequalities relating these discrepancies, namely
\[ D_N^{(p)}(\omega; f) \le D_N^{(q)}(\omega; f) \quad \text{whenever } 1 \le p \le q < \infty, \tag{1} \]
and
\[ D_N^{(p)}(\omega; f) \le D_N(\omega; f) \quad \text{for } 1 \le p < \infty. \tag{2} \]
The importance of these discrepancies stems from the following simple facts, the first of which is well-known.
Lemma 2.1. The sequence $\omega$ is uniformly distributed in $E$ with respect to $f$ if and only if $\lim_{N\to\infty} D_N(\omega; f) = 0$.

Proof. The sufficiency of the condition is obvious. Conversely, suppose that $\omega$ is uniformly distributed with respect to $f$. For $N \ge 1$, let $g_N$ be the function defined by $g_N(\alpha) = A([0, \alpha); N)/N$. Then $g_1, g_2, \ldots, g_N, \ldots$ is a sequence of nondecreasing functions converging pointwise to the continuous function $f$. By the theorem of Pólya-Cantelli [19, pp. 319-321], the convergence must be uniform, and the result is established.

Lemma 2.2. Let $1 \le p < \infty$ be given. Then the sequence $\omega$ is uniformly distributed in $E$ with respect to $f$ if and only if $\lim_{N\to\infty} D_N^{(p)}(\omega; f) = 0$.

Proof. If $\omega$ is uniformly distributed with respect to $f$, then Lemma 2.1 and (2) imply $\lim_{N\to\infty} D_N^{(p)}(\omega; f) = 0$. To prove the converse, it suffices to show that $\lim_{N\to\infty} D_N^{(1)}(\omega; f) = 0$ implies the uniform distribution of $\omega$ with respect to $f$. Using inequality (1), we get then the desired result for all values of $p$ under consideration. So suppose that $\lim_{N\to\infty} \int_0^1 |R_N(\alpha; f)| \, d\alpha = 0$. We observe that $\lim_{N\to\infty} R_N(\alpha; f) = 0$ is trivial for $\alpha = 0$ and $1$. Now choose $\varepsilon > 0$ and $\beta \in (0, 1)$. Suppose that $R_N(\beta; f) \ge \varepsilon$ for some $N \ge 1$. Since $f$ is continuous, there exists $\delta$ with $0 < \delta \le 1 - \beta$ such that $f(\gamma) \le f(\beta) + \varepsilon/2$ for $\beta \le \gamma \le \beta + \delta$. Then
\[ R_N(\gamma; f) \ge A([0, \beta); N)/N - f(\beta) - \varepsilon/2 = R_N(\beta; f) - \varepsilon/2 \ge \varepsilon/2 \quad \text{for } \beta \le \gamma \le \beta + \delta. \]
Consequently, we have $\int_0^1 |R_N(\alpha; f)| \, d\alpha \ge \varepsilon\delta/2$. By hypothesis, this inequality can only hold for finitely many $N$. It follows that $R_N(\beta; f) < \varepsilon$ for sufficiently large $N$. A similar argument shows that $R_N(\beta; f) > -\varepsilon$ for sufficiently large $N$, and the proof is complete.

For later purposes, the following alternative formula for the extreme discrepancy will be useful, which differs only slightly from Definition 2.3.

Lemma 2.3. The extreme discrepancy $D_N(\omega; f)$ is also given by
\[ D_N(\omega; f) = \sup_{0 \le \alpha \le 1} |A([0, \alpha]; N)/N - f(\alpha)|. \]

Proof. If $\alpha$ is not one of the first $N$ elements of the sequence $\omega$, then $A([0, \alpha); N) = A([0, \alpha]; N)$. To deal with the finitely many remaining possibilities
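for $\alpha$, we choose $\varepsilon > 0$, and observe that there exists $\delta$ with $0 < \delta \le 1 - \max_{1 \le n \le N} x_n$ such that $A([0, x_n]; N) = A([0, x_n + \delta); N)$ and $f(x_n + \delta) \le f(x_n) + \varepsilon$ for $1 \le n \le N$. Then
\[ \frac{A([0, x_n); N)}{N} - f(x_n) \le \frac{A([0, x_n]; N)}{N} - f(x_n) \le \left| \frac{A([0, x_n + \delta); N)}{N} - f(x_n + \delta) \right| + \varepsilon, \]
and so $|A([0, x_n]; N)/N - f(x_n)| \le D_N(\omega; f) + \varepsilon$. Together with the earlier remark, it follows that
\[ \sup_{0 \le \alpha \le 1} |A([0, \alpha]; N)/N - f(\alpha)| \le D_N(\omega; f). \tag{3} \]
In the other direction, we note that
\[ |A([0, \beta); N)/N - f(\beta)| \le \sup_{0 \le \alpha \le 1} |A([0, \alpha]; N)/N - f(\alpha)| \tag{4} \]
for $\beta = 0$ and for all $\beta$ which are not among the first $N$ elements of $\omega$. It remains to consider the $x_n$, $1 \le n \le N$, with $x_n > 0$. If such $x_n$ exist at all, we proceed as follows. For given $\varepsilon > 0$, there exists $\eta$ with $0 < \eta \le \min_{1 \le n \le N;\, x_n > 0} x_n$ such that $A([0, x_n); N) = A([0, x_n - \eta]; N)$ and $f(x_n - \eta) \ge f(x_n) - \varepsilon$ for all $x_n$, $1 \le n \le N$, with $x_n > 0$. Then
\[ -\left| \frac{A([0, x_n - \eta]; N)}{N} - f(x_n - \eta) \right| - \varepsilon \le \frac{A([0, x_n); N)}{N} - f(x_n) \le \frac{A([0, x_n]; N)}{N} - f(x_n), \]
and so
\[ |A([0, x_n); N)/N - f(x_n)| \le \sup_{0 \le \alpha \le 1} |A([0, \alpha]; N)/N - f(\alpha)| + \varepsilon \]
for all $x_n$ from above. Together with (4), this proves inequality (3) in the reverse sense.

We remark that statements analogous to Lemma 2.3 will also hold for the discrepancies $D_N^{(p)}(\omega; f)$, $1 \le p < \infty$, since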
$|A([0, \alpha); N)/N - f(\alpha)|$ and $|A([0, \alpha]; N)/N - f(\alpha)|$ differ only for finitely many values of $\alpha$, and so the integral defining $D_N^{(p)}(\omega; f)$ will not be affected by this change.

The dependence of $D_N(\omega; f)$ on the first $N$ elements of $\omega$ can be made very explicit. We prove the following representation for $D_N(\omega; f)$ which generalizes [38, Theorem 1].

Lemma 2.4. Let $x_1 \le x_2 \le \cdots \le x_N$ be the first $N$ terms of $\omega$, arranged in ascending order. Then
\[ D_N(\omega; f) = \max_{1 \le n \le N} \max\left( |f(x_n) - n/N|, \; |f(x_n) - (n-1)/N| \right) = \frac{1}{2N} + \max_{1 \le n \le N} \left| f(x_n) - \frac{2n-1}{2N} \right|. \]

Proof. For notational convenience, set $x_0 = 0$ and $x_{N+1} = 1$. The distinct values of the numbers $x_n$, $0 \le n \le N+1$, define a subdivision of $[0, 1]$. Therefore,
\[ D_N(\omega; f) = \max_{0 \le n \le N;\, x_n < x_{n+1}} \; \sup_{x_n < \alpha \le x_{n+1}} \left| \frac{A([0, \alpha); N)}{N} - f(\alpha) \right| = \max_{0 \le n \le N;\, x_n < x_{n+1}} \; \sup_{x_n < \alpha \le x_{n+1}} \left| \frac{n}{N} - f(\alpha) \right|. \]
Whenever $x_n < x_{n+1}$, the function $g_n(\alpha) = |n/N - f(\alpha)|$ attains its maximum in $[x_n, x_{n+1}]$ at one of the endpoints of the interval, since $f$ is monotone. Using also the continuity of $f$, we conclude that
\[ D_N(\omega; f) = \max_{0 \le n \le N;\, x_n < x_{n+1}} \max\left( |n/N - f(x_n)|, \; |n/N - f(x_{n+1})| \right). \]
From here on, one proceeds as in the proof of [38, Theorem 1].

Corollary 2.1. Let $f$ be a continuous distribution function on $[0, 1]$ mapping $E$ into $E$. For a sequence $\omega = (x_n)$ in $E$, let $f(\omega)$ denote the sequence $f(\omega) = (f(x_n))$. Then $D_N(f(\omega)) = D_N(\omega; f)$.

For $k \ge 2$, let $E^k = \{(x_1, \ldots, x_k) \in R^k : 0 \le x_j < 1 \text{ for } 1 \le j \le k\}$ be the $k$-dimensional unit cube. The theory of discrepancy may as well be developed for sequences $\omega$ in $E^k$. Let $f$ be a continuous distribution function on $\bar{E}^k = \{(x_1, \ldots, x_k) \in R^k : 0 \le x_j \le 1 \text{ for } 1 \le j \le k\}$, i.e., a continuous distribution function $f$ on $R^k$ which
satisfies $f(1, \ldots, 1) = 1$ and $f(x_1, \ldots, x_k) = 0$ whenever at least one of the $x_j$ is zero. With the obvious definition of the counting function $A(M; N)$, one may then define the local discrepancy of $\omega$ with respect to $f$ by
\[ R_N(\alpha_1, \ldots, \alpha_k; f) = A([0, \alpha_1) \times \cdots \times [0, \alpha_k); N)/N - f(\alpha_1, \ldots, \alpha_k) \]
for $(\alpha_1, \ldots, \alpha_k) \in \bar{E}^k$. The extreme discrepancy $D_N(\omega; f)$ is then
\[ D_N(\omega; f) = \sup_{(\alpha_1, \ldots, \alpha_k) \in \bar{E}^k} |R_N(\alpha_1, \ldots, \alpha_k; f)|, \]
and, for $1 \le p < \infty$, we set
\[ D_N^{(p)}(\omega; f) = \left( \int_0^1 \cdots \int_0^1 |R_N(\alpha_1, \ldots, \alpha_k; f)|^p \, d\alpha_1 \cdots d\alpha_k \right)^{1/p}. \]
An important special case results when $f = f_u$, the uniform distribution given by $f_u(x_1, \ldots, x_k) = x_1 \cdots x_k$ for $(x_1, \ldots, x_k) \in \bar{E}^k$. Again we shall write $D_N(\omega)$ instead of $D_N(\omega; f_u)$.

It is easily seen that Lemmas 2.1, 2.2, and 2.3 can be extended to the multidimensional case. However, there is no obvious analogue to Lemma 2.4 in the multidimensional case, and this is due to the lack of certain order properties in $E^k$. For results which can be shown in the direction of Lemma 2.4, we refer to [38]. We shall see in §§ 4 and 5 that the absence of a result such as Lemma 2.4 causes extreme difficulties in the metric theory of discrepancy in $E^k$.

A comprehensive treatment of discrepancy and its role in uniform distribution can be found in the following books and articles: Koksma [28, Kapitel IX], Cigler and Helmberg [9], Hlawka [22], Niederreiter [36], and Kuipers and Niederreiter [30, Chapter 2].

3. Empirical distribution functions. We adopt the following general viewpoint. We are given a probability space $(X, U, \mu)$, i.e., a nonvoid set $X$, a $\sigma$-algebra $U$ of subsets of $X$, and a probability measure $\mu$ defined on $U$. Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be independent random variables on $(X, U, \mu)$ with a (not necessarily continuous) common distribution function $F(t)$. The following definition is well-known in mathematical statistics.

Definition 3.1. For $x \in X$ and a positive integer $N$, the empirical distribution function $F_N(t, x)$ is defined for all real $t$ by $1/N$ times the number of $\xi_n(x)$, $1 \le n \le N$, with $\xi_n(x) \le t$.
An important test, known as Kolmogorov's two-sided test or Kolmogorov's statistic, is based on empirical distribution functions.
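Definition 3.1 can be sketched in a few lines of Python. The function name `empirical_cdf` and the representation of the observed values $\xi_1(x), \ldots, \xi_N(x)$ as a plain list are assumptions made for this illustration; they do not come from the literature cited here.

```python
from bisect import bisect_right


def empirical_cdf(sample, t):
    """Return F_N(t): 1/N times the number of sample values that are <= t.

    `sample` is a hypothetical stand-in for the observed values
    xi_1(x), ..., xi_N(x) of Definition 3.1.
    """
    ordered = sorted(sample)
    # bisect_right counts the entries <= t in a sorted list.
    return bisect_right(ordered, t) / len(ordered)
```

The resulting function of $t$ is a nondecreasing step function with a jump at each observed value; repeated values produce a correspondingly larger jump.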
Definition 3.2. For $x \in X$ and a positive integer $N$, define
\[ G_N(x) = \sup_{t \in R} |F_N(t, x) - F(t)|. \tag{5} \]

Let us show that $G_N(x)$ is a generalization of the extreme discrepancy $D_N(\omega; f)$. First of all, we note that any infinite sequence $\omega$ in $E$ may be identified with a point in the infinite-dimensional unit cube $E^\infty$, i.e., in the cartesian product of countably many copies of $E$. For the given continuous distribution function $f$ on $[0, 1]$, let $\nu$ be the Borel measure in $E$ determined by $\nu([0, t)) = f(t)$ for $0 \le t \le 1$. Then the product probability space $(E^\infty, \mathfrak{B}^\infty, \nu_\infty) = \bigotimes_{i=1}^{\infty} (E_i, \mathfrak{B}_i, \nu_i)$, where $E_i = E$, $\mathfrak{B}_i = \mathfrak{B}$ (the $\sigma$-algebra of Borel sets in $E$), and $\nu_i = \nu$ for all $i = 1, 2, \ldots$, is taken as the probability space $(X, U, \mu)$. Let us extend the distribution function $f$ on $[0, 1]$ to a distribution function $F$ on $R$ by putting $F(t) = 0$ for $t < 0$, $F(t) = f(t)$ for $0 \le t \le 1$, and $F(t) = 1$ for $t > 1$. For $n \ge 1$, let $\xi_n$ be the $n$th coordinate projection in $E^\infty$, i.e., $\xi_n(\omega) = \xi_n(x_1, \ldots, x_n, \ldots) = x_n$ for all $\omega = (x_1, \ldots, x_n, \ldots) \in E^\infty$. Then $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ are independent random variables on $(E^\infty, \mathfrak{B}^\infty, \nu_\infty)$ with $F(t)$ as their common distribution function.

We observe that, for $\omega \in E^\infty$ and $N \ge 1$, the empirical distribution function $F_N(t, \omega)$ satisfies $F_N(t, \omega) = 0$ for $t < 0$, $F_N(t, \omega) = A([0, t]; N)/N$ for $0 \le t \le 1$, and $F_N(t, \omega) = 1$ for $t > 1$. Therefore, by using Definition 3.2 and Lemma 2.3, it follows that $G_N(\omega) = D_N(\omega; f)$ for all $\omega \in E^\infty$ and all $N \ge 1$. Thus the extreme discrepancy $D_N(\omega; f)$ is just a special case of Kolmogorov's two-sided test. Detailed bibliographies on Kolmogorov's test were compiled by Darling [10] for the work prior to 1957 and by Barton and Mallows [3] for the work between 1957 and 1965.

In later sections, we shall be interested in the probability $\mu(\{x \in X : G_N(x) \le \alpha\})$ for given $N \ge 1$ and $\alpha \ge 0$.
It is important to note that this probability does not depend on the distribution function $F(t)$ as long as $F(t)$ is continuous, or, to use the technical term, the above probability is distribution-free in the class of continuous distribution functions. This was first observed by Wald and Wolfowitz [51]. For the proof, we start from a sequence $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ of independent random variables on $(X, U, \mu)$ with common continuous distribution function $F(t)$. Then $F \circ \xi_1, F \circ \xi_2, \ldots, F \circ \xi_n, \ldots$ is again a sequence of independent random variables on $(X, U, \mu)$, having as their common distribution function the uniform distribution $F_u(t)$, given by $F_u(t) = 0$ for $t < 0$, $F_u(t) = t$ for $0 \le t \le 1$, and $F_u(t) = 1$ for $t > 1$. It is also easily seen that $G_N(x) = G_N^{*}(x)$ for all $x \in X$, where $G_N^{*}(x)$ is the expression in (5) corresponding to the random variables $F \circ \xi_n$, $n = 1, 2, \ldots$. This should be compared with Corollary 2.1. We note that, when dealing with the probabilities $\mu(\{x \in X : G_N(x) \le \alpha\})$ for continuous distribution functions, we may therefore assume without loss of generality that the given random variables have the uniform distribution $F_u(t)$ as their common distribution function. If the continuity of $F(t)$ is dropped, the above argument is no longer valid. In fact, the probability $\mu(\{x \in X : G_N(x) \le \alpha\})$ is not distribution-free in the class of all distribution functions (see [46]).

Restricting our attention to the uniform distribution $F_u(t)$, we note that the $L^2$ discrepancy $D_N^{(2)}(\omega)$ may be viewed as a special case of the Cramér-Smirnov test of mathematical statistics [10]. With the same notation as in Definition 3.2, we set
\[ G_N^{(2)}(x) = \left( \int_{-\infty}^{\infty} (F_N(t, x) - F_u(t))^2 \, dt \right)^{1/2}. \tag{6} \]
For the probability space $(E^\infty, \mathfrak{B}^\infty, \lambda_\infty)$, where $\lambda_\infty$ is the product measure on $E^\infty$ induced by the Lebesgue measure $\lambda$ in $E$, and for the coordinate projections as the random variables, we have then $G_N^{(2)}(\omega) = D_N^{(2)}(\omega)$ for all $\omega \in E^\infty$ and all $N \ge 1$.

It is obvious how the above notions have to be extended to the multidimensional case. One considers now a sequence $\eta_1, \eta_2, \ldots, \eta_n, \ldots$ of independent random variables on $(X, U, \mu)$ with values in $R^k$, and having a common distribution function $F(t_1, \ldots, t_k)$. The empirical distribution function $F_N(t_1, \ldots, t_k; x)$ is now defined for all $(t_1, \ldots, t_k) \in R^k$ by $1/N$ times the number of $\eta_n(x) = (\eta_{n1}(x), \ldots, \eta_{nk}(x))$, $1 \le n \le N$, for which $\eta_{ni}(x) \le t_i$ for $1 \le i \le k$. Then one sets
\[ G_N(x) = \sup_{(t_1, \ldots, t_k) \in R^k} |F_N(t_1, \ldots, t_k; x) - F(t_1, \ldots, t_k)|. \]
The definition of $G_N^{(2)}(x)$ may be extended in a completely analogous fashion.

4. Global results. It follows from the methods of H. Weyl in his fundamental paper [52] on uniform distribution of sequences that $\lambda_\infty$-almost all $\omega \in E^\infty$ are uniformly distributed in $E$. This can be deduced readily from the strong law of large numbers. The same fact is also an immediate consequence of the following central result from the theory of empirical distribution functions.

Lemma 4.1 (Glivenko-Cantelli theorem). Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be independent random variables on $(X, U, \mu)$ with common distribution function $F(t)$. Then $\lim_{N\to\infty} G_N(x) = 0$ $\mu$-a.e.

For the proof, see [19, p. 279] and [44, pp. 335-336].
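As an illustration of Lemma 4.1 in the setting of § 2, the following Python sketch evaluates $D_N(\omega; f)$ for the first $N$ terms of a pseudo-random sequence by means of the closed form in Lemma 2.4, and cross-checks it against the first expression in that lemma. The function names, the seed, and the use of Python's pseudo-random generator as a stand-in for independent uniformly distributed random variables are assumptions made for this illustration only.

```python
import random


def extreme_discrepancy(points, f=lambda x: x):
    """D_N(omega; f) via Lemma 2.4: 1/(2N) + max_n |f(x_n) - (2n-1)/(2N)|.

    `points` holds the first N terms of the sequence (in any order);
    `f` defaults to the uniform distribution f_u(x) = x.
    """
    xs = sorted(points)
    n_pts = len(xs)
    return 1 / (2 * n_pts) + max(
        abs(f(x) - (2 * n - 1) / (2 * n_pts)) for n, x in enumerate(xs, start=1)
    )


def endpoint_discrepancy(points, f=lambda x: x):
    """The same quantity from the first expression in Lemma 2.4 (cross-check)."""
    xs = sorted(points)
    n_pts = len(xs)
    return max(
        max(abs(f(x) - n / n_pts), abs(f(x) - (n - 1) / n_pts))
        for n, x in enumerate(xs, start=1)
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
    for n_pts in (10, 100, 1000, 10000):
        sample = [rng.random() for _ in range(n_pts)]
        print(n_pts, extreme_discrepancy(sample))
```

For the points $(2n-1)/(2N)$, $1 \le n \le N$, the closed form gives the smallest possible value $1/(2N)$; for pseudo-random input one would expect the printed values to shrink as $N$ grows, in line with Lemma 4.1, although a single run proves nothing.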
We are now interested mainly in quantitative refinements of the Glivenko-Cantelli theorem. We shall state these results for $G_N(x)$, and point out once again that they include theorems for all the discrepancies $D_N(\omega; f)$.

By Kolmogorov's limit theorem given below, the quantity $N^{1/2} G_N(x)$ has a
limiting distribution in case $F(t)$ is continuous. The remarks in the previous section imply that this limiting distribution is independent of the nature of the function $F(t)$.

Theorem 4.1 (Kolmogorov's limit theorem). Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be independent random variables on $(X, U, \mu)$ with a common continuous distribution function. Then
\[ \lim_{N\to\infty} \mu(\{x \in X : N^{1/2} G_N(x) \le \alpha\}) = 1 - 2 \sum_{j=1}^{\infty} (-1)^{j+1} e^{-2 j^2 \alpha^2} \quad \text{for } \alpha > 0. \tag{7} \]

Various proofs of this limit theorem are available in the literature. For the original proof, see [29] and also § 5. Another proof was given by Feller [18]. Doob [12] made the remarkable observation that the limiting distribution in (7) also occurs in limit laws for certain Gaussian processes, and that Kolmogorov's limit theorem may in fact be deduced from a heuristic invariance principle for Gaussian processes. Donsker [11] supplied the proof of this invariance principle. See also Billingsley [4, § 13]. The distribution function in (7) was tabulated by Owen [41, § 15] and Rényi [44, Tabelle 8].

Chung [8] proves Kolmogorov's limit theorem with a remainder term. As a consequence, he shows the following zero-one law. Let $(\lambda_N)$, $N = 1, 2, \ldots$, be a sequence of real numbers tending monotonically to infinity. Then the inequality $N^{1/2} G_N(x) > \lambda_N$ will be satisfied for infinitely many $N$ with probability zero or one according as the series
\[ \sum_{N=1}^{\infty} \frac{\lambda_N^2}{N} \exp(-2\lambda_N^2) \]
converges or diverges. Specifically, choosing $\lambda_N = (\tfrac{1}{2} \log\log N)^{1/2}$ for sufficiently large $N$, it follows that
\[ \limsup_{N\to\infty} \frac{(2N)^{1/2} G_N(x)}{(\log\log N)^{1/2}} \ge 1 \quad \mu\text{-a.e.} \tag{8} \]
Another consequence of Chung's refinement of Theorem 4.1 is the law of the iterated logarithm for $G_N(x)$, which strengthens (8).

Theorem 4.2 (Law of the iterated logarithm). Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be independent random variables on $(X, U, \mu)$ with a common continuous distribution function. Then
\[ \limsup_{N\to\infty} \frac{(2N)^{1/2} G_N(x)}{(\log\log N)^{1/2}} = 1 \quad \mu\text{-a.e.} \]
For another approach to Theorem 4.2, we refer to Cassels [7].

An analogue to Theorem 4.1 can even be shown for the case where the common distribution function $F(t)$ of the random variables $\xi_n$, $n = 1, 2, \ldots$, is not continuous (see Schmid [46]). However, the resulting limiting distribution is not distribution-free, as it will depend on the jumps of the function $F(t)$.

A theory similar to that for $G_N(x)$ can be developed for the Cramér-Smirnov test $G_N^{(2)}(x)$ as defined in (6). The counterpart to Theorem 4.1 for $G_N^{(2)}(x)$ reads as follows.

Theorem 4.3. Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be independent random variables on $(X, U, \mu)$ with the uniform distribution $F_u(t)$ as their common distribution function. Then

(9) $\lim_{N\to\infty} \mu(\{x \in X : N^{1/2} G_N^{(2)}(x) \le \alpha\})$ = [right-hand side illegible in the source],

where $K_{1/4}(t)$ is the modified Bessel function of the third kind of order $1/4$ (see [16]).

The limiting distribution for $N^{1/2} G_N^{(2)}(x)$ was first obtained by Smirnov [48]. In the above form, the distribution function was given by Anderson and Darling [1], who also tabulated the function (see also [41, § 16]). In the proof, one again seeks to arrive at the desired limit law by invoking an invariance principle for a suitable Gaussian process. This Gaussian process is defined in terms of the eigenvalues and the normalized eigenfunctions of the integral equation
\[ f(t) = \lambda \int_0^1 (\min(s, t) - st) f(s) \, ds. \tag{10} \]
Using the Fredholm determinant of the kernel in (10), one finds the limiting characteristic function of $(N^{1/2} G_N^{(2)}(x))^2$. By a standard inversion technique using Laplace transforms, the limiting distribution function in (9) is obtained.

As far as I know, a law of the iterated logarithm is not yet established for $G_N^{(2)}(x)$.

Let us now discuss the multidimensional Kolmogorov test $G_N(x)$. For dimension $k > 1$, the natural problems one might want to pose are much harder to settle.
It is an unpleasant new feature that for $k > 1$ Kolmogorov's test fails to be distribution-free, even in the class of continuous distribution functions (see the example constructed by Simpson [47]). Although one knows that $N^{1/2} G_N(x)$ has again a limiting distribution for continuous $F(t_1, \ldots, t_k)$, which will of course
depend on $F$, the nature of the distribution function could not be determined for any continuous $F$. Most of the known results hold uniformly for all continuous distribution functions $F(t_1, \ldots, t_k)$. The closest approximation to a multidimensional analogue of Theorem 4.1 is the following result of Kiefer [26].

Theorem 4.4. Let $\eta_1, \eta_2, \ldots, \eta_n, \ldots$ be independent random variables on $(X, U, \mu)$ with values in $R^k$, and having a common continuous distribution function $F(t_1, \ldots, t_k)$. Then, for every $\varepsilon > 0$, there exists a positive constant $c = c(\varepsilon, k)$, independent of $F$, such that
\[ \mu(\{x \in X : N^{1/2} G_N(x) \le \alpha\}) \ge 1 - c \exp(-(2 - \varepsilon)\alpha^2) \]
holds for all $N \ge 1$ and all $\alpha \ge 0$.

A weaker result was obtained earlier by Kiefer and Wolfowitz [27]. For $k = 1$, an optimal result of this type was shown by Dvoretzky, Kiefer, and Wolfowitz [14].

However, the multidimensional metric theory of $G_N(x)$ is not all that unsatisfactory. The most decisive result is the unrestricted extension of Theorem 4.2. This was achieved by Kiefer [26].

Theorem 4.5 (Multidimensional law of the iterated logarithm). Let $\eta_1, \eta_2, \ldots, \eta_n, \ldots$ be independent vector-valued random variables on $(X, U, \mu)$ with a common continuous distribution function. Then
\[ \limsup_{N\to\infty} \frac{(2N)^{1/2} G_N(x)}{(\log\log N)^{1/2}} = 1 \quad \mu\text{-a.e.} \]

For two other approaches to Theorem 4.5 containing some refinements, see Philipp [42], [43, Chapter 4] and Zaremba [53].

One of the major difficulties in the multidimensional case seems to be that, although the desired limit laws are again intimately connected with certain (multidimensional) Gaussian processes, we cannot use these correspondences to the same extent as in the one-dimensional case, since the theory of these more general stochastic processes is not as fully developed as one might wish.

Not much is known about the multidimensional Cramér-Smirnov test $G_N^{(2)}(x)$. Again, the test fails to be distribution-free for $k > 1$.
But, at least in theory, the problem of the limiting distribution of $N^{1/2} G_N^{(2)}(x)$ can be reduced to the problem of finding the limit law for a certain multidimensional stochastic process, which, in turn, boils down essentially to finding the eigenvalues of certain multivariate integral equations (see Rosenblatt [45]).
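The limiting distribution function in (7) of Theorem 4.1 is easy to evaluate numerically, since the terms of the series decay like $e^{-2j^2\alpha^2}$. The following Python sketch sums the series directly. The function name, the stopping tolerance, and the cap on the number of terms are choices made for this illustration and are not taken from the tables cited above.

```python
import math


def kolmogorov_limit_cdf(alpha, tol=1e-16, max_terms=10_000):
    """Right-hand side of (7): 1 - 2 * sum_{j>=1} (-1)^(j+1) * exp(-2 j^2 alpha^2).

    (7) is stated for alpha > 0; returning 0 for alpha <= 0 is a convention
    of this sketch (N^(1/2) G_N is strictly positive).
    """
    if alpha <= 0:
        return 0.0
    total = 0.0
    for j in range(1, max_terms + 1):
        term = math.exp(-2.0 * j * j * alpha * alpha)
        # Odd j enters with sign +, even j with sign -.
        total += term if j % 2 == 1 else -term
        if term < tol:
            break
    return 1.0 - 2.0 * total
```

For instance, at $\alpha = 1.3581$ the first term of the series is $e^{-2\alpha^2} \approx 0.025$ and the remaining terms are below $10^{-6}$, so the value of (7) is very nearly $0.95$.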
5. Combinatorial aspects. By Kolmogorov's limit theorem (see Theorem 4.1), we know the limit of $\mu(\{x \in X : N^{1/2} G_N(x) \le \alpha\})$ as $N \to \infty$ for continuous $F(t)$. In this section, we are interested in exact formulas for probabilities of this type with fixed $N$. We introduce the following notation.

Definition 5.1. Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be independent random variables on $(X, U, \mu)$ with a common continuous distribution function. For a positive integer $N$ and $\alpha \ge 0$, we set
\[ P_N(\alpha) = \mu(\{x \in X : G_N(x) \le \alpha\}). \tag{11} \]

From the remarks in § 3, it follows that, whenever it is convenient, we may take for the common distribution function the uniform distribution $F_u(t)$, and that we may also assume that the random variables $\xi_n$, $n = 1, 2, \ldots$, only attain values in $[0, 1]$. The computation of the probabilities $P_N(\alpha)$ is essentially a combinatorial problem.

The first formulas for the $P_N(\alpha)$, with $\alpha$ of the form $\alpha = m/N$, $m = 0, 1, \ldots, N$, were derived recursively by Kolmogorov in his proof of Theorem 4.1. We set $P_N(m/N) = (N!/N^N) e^N R_{0,N}(m)$, where $R_{i,k}(m)$ is defined for all integers $i$, for all nonnegative integers $k$, and for $m = 1, \ldots, N$. Then, for $R_{i,k}(m)$, we have the conditions

$R_{0,0}(m) = 1$ for $1 \le m \le N$;
$R_{i,0}(m) = 0$ for $1 \le m \le N$, $i \ne 0$;
$R_{i,k}(m) = 0$ for $|i| \ge m$ and all $k$;

and the recursion formula
\[ R_{i,k+1}(m) = e^{-1} \sum_{s=0}^{2m-1} \frac{1}{s!} R_{i+1-s,k}(m) \quad \text{for } |i| \le m - 1 \text{ and all } k. \]

Numerical tables for $P_N(m/N)$ for small values of $N$ were compiled by Massey [31], Birnbaum [5], and Owen [41, § 15]. For a slightly more general problem, recursions of the above type were given by Durbin [13].

Let us now consider $P_N(\alpha)$ for arbitrary $\alpha$. We note, first of all, that $P_N(\alpha) = 0$ for $0 \le \alpha < 1/2N$, by Lemma 2.4. Therefore, we may suppose $\alpha \ge 1/2N$ in the sequel.

A simple but general formula for $P_N(\alpha)$ may be given using a notion from transversal theory. This area of combinatorial theory supplies some surprising connections with problems concerning distribution of sequences.
See for instance the papers of Meijer and Niederreiter [32], Niederreiter [37], and Tijdeman [49]
on sequences in finite and countable sets. For the present purposes, we need the following definition from transversal theory (Mirsky [34], [35]).

Definition 5.2. Let $M_1, \ldots, M_k$ be subsets of an arbitrary set $M$. A $k$-tuple $(m_1, \ldots, m_k)$ of elements of $M$ is called a system of representatives for $M_1, \ldots, M_k$ if there exists a permutation $\sigma$ of $k$ letters such that $m_{\sigma(i)} \in M_i$ for $1 \le i \le k$.

An important step is the combinatorial characterization of the elements $x \in X$ for which $G_N(x) \le \alpha$.

Lemma 5.1. Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be independent random variables on $(X, U, \mu)$ attaining values in $[0, 1]$ and having the uniform distribution as their common distribution function. For $N \ge 1$ and $\alpha \ge 1/2N$, define the intervals $J_i^{(\alpha)} = [i/N - \alpha, (i-1)/N + \alpha]$, $1 \le i \le N$. Then $G_N(x) \le \alpha$ if and only if $(\xi_1(x), \ldots, \xi_N(x))$ is a system of representatives for $J_1^{(\alpha)}, \ldots, J_N^{(\alpha)}$.

The proof depends on a special case of Lemma 2.4 and a few combinatorial arguments, and is given in [39].

To state the formula for $P_N(\alpha)$, we need some more notation. For fixed $N \ge 1$ and $\alpha \ge 1/2N$, we set $a_i = \max(i/N - \alpha, 0)$ and $b_i = \min((i-1)/N + \alpha, 1)$ for $1 \le i \le N$. Let $H_i$ be the interval $H_i = [a_i, b_i]$, $1 \le i \le N$. Now arrange the numbers $a_1, \ldots, a_N, b_1, \ldots, b_N$ according to their magnitudes, not allowing repetitions: $c_0 < c_1 < \cdots < c_s$. For $1 \le j \le s$, let $I_j$ be the interval $I_j = [c_{j-1}, c_j]$. Then the intervals $H_i$ can be written in the form $H_i = \bigcup_{j=h_i}^{k_i} I_j$ for $1 \le i \le N$. The integers $h_i$ and $k_i$ satisfy $1 = h_1 \le h_2 \le \cdots \le h_N \le s$ and $1 \le k_1 \le k_2 \le \cdots \le k_N = s$, and also $h_j \le k_j$ for $1 \le j \le N$.

Theorem 5.1. With the above notation, we have

$P_N(\alpha)$ = [sum over $(j_1, \ldots, j_N)$; summand illegible in the source]

for $N \ge 1$ and $\alpha \ge 1/2N$, where the summation is extended over all systems of representatives $(j_1, \ldots, j_N)$ for the intervals $[h_1, k_1], \ldots, [h_N, k_N]$ of integers.

The proof can be found in [39]. One makes use of Lemma 5.1, some combinatorial arguments, and the independence of the random variables $\xi_1, \ldots, \xi_N$.

There are also various analytic representations for $P_N(\alpha)$. One of them was first mentioned by Birnbaum [5].
It is in fact a rather simple consequence of Lemma 2.4 (see [39]). We use again the notation introduced above. For an excellent exposition of the methods used in obtaining results of this type, see [50, § 16].

Lemma 5.2. For $N \ge 1$ and $\alpha \ge 1/2N$, we have
\[ P_N(\alpha) = N! \int_{a_1}^{b_1} \cdots \int_{a_N}^{b_N} \chi(t_1, \ldots, t_N) \, dt_N \cdots dt_1, \tag{12} \]
where $\chi(t_1, \ldots, t_N) = 1$ if $t_1 \le t_2 \le \cdots \le t_N$, and $\chi(t_1, \ldots, t_N) = 0$ otherwise.

A recurrence formula for the integral appearing in (12) was given by Epanečnikov [15], who also computed explicit values for $N \le 4$.

Various formulas for $P_N(\alpha)$ may be deduced from Lemma 5.2, depending on what analytic expression one chooses for the function $\chi(t_1, \ldots, t_N)$. In one method, one works with Dirichlet's integral (see [39]). The following formula, also given in [39], appears to be more tractable.

Theorem 5.2. For $N \ge 1$ and $\alpha \ge 1/2N$, we have

$P_N(\alpha)$ = [formula illegible in the source; the surviving fragments involve exponentials of multiples of $a_1, \ldots, a_N$ and $b_1, \ldots, b_N$].

Another exact expression for $P_N(\alpha)$ was obtained by Kemperman [23], [24, § 5]. The author uses the method of generating functions. The resulting formula is fairly complicated. The approach of Kemperman was recently extended by Durbin [13] to cover more general cases. Kemperman's formula reads as follows.

Theorem 5.3. For $N \ge 1$ and $\alpha \ge 1/2N$, we have

$P_N(\alpha)$ = [formula illegible in the source],

where $B_v(t)$ is given by [formula illegible in the source] with $H = [t]$, while $\sum^{*}$ stands for the sum over all $H$-tuples $(i_1, \ldots, i_H)$ of nonnegative integers with $i_1 + 2i_2 + \cdots + H i_H = v$.

The problem of evaluating $P_N(\alpha)$ in the multidimensional case is far from being solved. Even the basic step, namely the combinatorial characterization similar to Lemma 5.1 of the points $x \in X$ with $G_N(x) \le \alpha$, has not yet been carried out. Solutions of these problems will undoubtedly lead to a proof for the multidimensional version of Kolmogorov's limit theorem (compare with the remarks on the multidimensional case in § 4).
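The characterization in Lemma 5.1 lends itself to a direct numerical check in the one-dimensional case. The Python sketch below tests whether a finite sample is a system of representatives for the intervals $J_i^{(\alpha)}$ and uses this test in a crude Monte Carlo estimate of $P_N(\alpha)$. It relies on one elementary observation that is an assumption of this sketch rather than a statement taken from [39]: since both endpoints of $J_i^{(\alpha)}$ increase with $i$, a suitable permutation exists if and only if the sample arranged in ascending order works. The function names, the seed, and the number of trials are likewise illustrative choices.

```python
import random


def discrepancy(points):
    """Extreme discrepancy D_N for the uniform distribution, via Lemma 2.4."""
    xs = sorted(points)
    n_pts = len(xs)
    return 1 / (2 * n_pts) + max(
        abs(x - (2 * i - 1) / (2 * n_pts)) for i, x in enumerate(xs, start=1)
    )


def is_system_of_representatives(points, alpha):
    """Condition of Lemma 5.1 for J_i = [i/N - alpha, (i-1)/N + alpha].

    Only the ascending arrangement is tested; see the lead-in for why
    this is assumed to suffice.
    """
    xs = sorted(points)
    n_pts = len(xs)
    return all(
        i / n_pts - alpha <= x <= (i - 1) / n_pts + alpha
        for i, x in enumerate(xs, start=1)
    )


def estimate_p(n_pts, alpha, trials=20_000, seed=0):
    """Monte Carlo estimate of P_N(alpha) from pseudo-random uniform samples."""
    rng = random.Random(seed)
    hits = sum(
        is_system_of_representatives([rng.random() for _ in range(n_pts)], alpha)
        for _ in range(trials)
    )
    return hits / trials
```

For $N = 1$ the estimate can be compared with an exact value: Lemma 2.4 gives $G_1 = 1/2 + |\xi_1 - 1/2|$, so that $P_1(\alpha) = 2\alpha - 1$ for $1/2 \le \alpha \le 1$.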
6. Special sequences. The classical example of a uniformly distributed sequence in $E$ is the sequence $\omega = (\{n\alpha\})$, $n = 1, 2, \ldots$, with irrational $\alpha$, where $\{t\} = t - [t]$ denotes the fractional part of the real number $t$. Thus, trivially, the sequences $(\{n\alpha\})$ are uniformly distributed in $E$ for almost all $\alpha$ (in the sense of the Lebesgue measure in $R$). To obtain a quantitative refinement, one may use the well-known discrepancy estimates for $(\{n\alpha\})$ in terms of the continued fraction expansion of $\alpha$, together with the classical metric results of Khinchine on continued fractions. This leads to the following theorem. For a given nondecreasing positive function $g$ with $\sum_{n=1}^{\infty} (g(n))^{-1} < \infty$, the discrepancy $D_N(\omega)$ of the sequence $\omega = (\{n\alpha\})$ satisfies $N D_N(\omega) = O(g(\log\log N) \log N)$ for almost all $\alpha$. Hence, on the average, the sequences $(\{n\alpha\})$ have an extremely even distribution.

Most of the probabilistic investigations of special classes of sequences have concentrated on sequences of the type $(\{\lambda_n \alpha\})$, where $(\lambda_n)$ is either a given increasing sequence of positive integers, or a given lacunary sequence of real numbers. One is interested in the behavior of the sequence for almost all $\alpha$. For a survey of the results prior to 1961, see Cigler and Helmberg [9, § 9]. In the last decade, a major achievement in this area was a law of the iterated logarithm for the sequences $(\{2^n \alpha\})$, which was established by S. Gál and L. Gál [20]. The result was improved decisively by Philipp, and the strengthened version may be found in [43, § 4]. For another recent result, see the paper of Baker [2]. We also refer to the excellent survey article of Gaposhkin [21] on lacunary sequences. More general classes of sequences have been studied, for instance in Erdős and Koksma [17] and Cassels [6]. A comprehensive bibliography of the work on special sequences will be available in [30, § 2.3].

References

1. T. W. Anderson and D. A.
Darling, Asymptotic theory of certain "goodness of fit" criteria based on stochastic processes, Ann. Math. Statist. 23 (1952), 193-212.
2. R. C. Baker, Discrepancy modulo one and capacity of sets, Quart. J. Math. Oxford Ser. (2) 22 (1971), 597-603.
3. D. E. Barton and C. L. Mallows, Some aspects of the random sequence, Ann. Math. Statist. 36 (1965), 236-260. MR 31 #2745.
4. P. Billingsley, Convergence of probability measures, Wiley, New York, 1968. MR 38 #1718.
5. Z. W. Birnbaum, Numerical tabulation of the distribution of Kolmogorov's statistic for finite sample size, J. Amer. Statist. Assoc. 47 (1952), 425-441. MR 14, 389.
6. J. W. S. Cassels, Some metrical theorems in Diophantine approximation. I, Proc. Cambridge Philos. Soc. 46 (1950), 209-218. MR 12, 162.
7. ______, An extension of the law of the iterated logarithm, Proc. Cambridge Philos. Soc. 47 (1951), 55-64. MR 12, 723.
8. K. L. Chung, An estimate concerning the Kolmogoroff limit distribution, Trans. Amer. Math. Soc. 67 (1949), 36-50. MR 11, 606.
9. J. Cigler and G. Helmberg, Neuere Entwicklungen der Theorie der Gleichverteilung, Jber. Deutsch. Math.-Verein. 64 (1961), Abt. 1, 1-50. MR 23 #A2409.
10. D. A. Darling, The Kolmogorov-Smirnov, Cramér-von Mises tests, Ann. Math. Statist. 28 (1957), 823-838. MR 20 #390.
11. M. D. Donsker, Justification and extension of Doob's heuristic approach to the Kolmogoroff-Smirnov theorems, Ann. Math. Statist. 23 (1952), 277-281. MR 13, 853.
12. J. L. Doob, Heuristic approach to the Kolmogoroff-Smirnov theorems, Ann. Math. Statist. 20 (1949), 393-403. MR 11, 43.
13. J. Durbin, The probability that the sample distribution function lies between two parallel straight lines, Ann. Math. Statist. 39 (1968), 398-411. MR 37 #1001.
14. A. Dvoretzky, J. Kiefer and J. Wolfowitz, Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator, Ann. Math. Statist. 27 (1956), 642-669. MR 18, 772.
15. V. A. Epanečnikov, The significance level and the power of the two-sided Kolmogorov criterion in the case of small sample sizes, Teor. Verojatnost. i Primenen. 13 (1968), 725-730 = Theor. Probability Appl. 13 (1968), 686-690. MR 39 #3664.
16. A. Erdélyi, et al., Higher transcendental functions. Vol. II, McGraw-Hill, New York, 1955. MR 15, 419.
17. P. Erdős and J. F. Koksma, On the uniform distribution modulo 1 of sequences $(f(n, \theta))$, Nederl. Akad. Wetensch. Proc. Ser. A 52 = Indag. Math. 11 (1949), 299-302. MR 11, 331.
18. W. Feller, On the Kolmogorov-Smirnov limit theorems for empirical distributions, Ann. Math. Statist. 19 (1948), 177-189. MR 9, 599; 10, 855.
19. M. Fréchet, Généralités sur les probabilités. Éléments aléatoires, 2ième éd., Traité du calcul des probabilités et de ses applications, tome I, fasc. 3, premier livre, Gauthier-Villars, Paris, 1950. MR 12, 423.
20. S. Gal and L. Gal, The discrepancy of the sequence {(2";c)}, Nederl. Akad. Wetensch. Proc. Ser. A 67 = Indag. Math. 26 (1964), 129-143. MR 29 #392. 21. V. F. Gaposkin, Lacunary series and independent functions, Uspehi Mat. Nauk 21 (1966), no. 6 (132), 3-82 = Russian Math. Surveys 21 (1966), no. 6, 1-82. MR 34 #6374. 22. E. Hlawka, Discrepancy and uniform distribution of sequences, Compositio Math. 16 (1964), 83-91. MR 30 #4745. 23. J. H. B. Kemperman, Some exact formulae for the Kolmogorov-Smirnov distributions, Nederl. Akad. Wetensch. Proc. Ser. A 60 = Indag. Math. 19 (1957), 535-540. MR 20 #2779. 24. , The passage problem for a stationary Markov chain, Statistical Research Monographs, vol. I, Univ. of Chicago Press, Chicago, 111., 1961. MR 22 #9992. 25. , Probability methods in the theory of distributions modulo one, Compositio Math. 16 (1964), 106-137. MR 30 #3494. 26. J. Kiefer, On large deviations of the empiric D.F. of vector chance variables and a law of the iterated logarithm, Pacific J. Math. 11 (1961), 649-660. MR 24 #A1732. 27. J. Kiefer and J. Wolfowitz, On the deviations of the empiric distribution function of vector chance variables, Trans. Amer. Math. Soc. 87 (1958), 173-186. MR 20 #5519. 28. J. F. Koksma, Diophantische Approximationen, Ergebnisse der Mathematik und ihrer Grenz- gebiete IV, Heft 4, Springer, Berlin, 1936. 29. A. Kolmogorov, Sulla determinazione empirica di una legge di distribuzione, Giorn. 1st. Ital. Attuari 4 (1933), 83-91. 30. L. Kuipers and H. Niederreiter, Uniform distribution of sequences, Interscience Tracts, Wiley, New York (in print).
212 H. NIEDERREITER 31. F. J. Massey, A note on the estimation of a distribution function by confidence limits, Ann. Math. Statist. 21 (1950), 116-119. MR 11, 446. 32. H. G. Meijer and H. Niederreiter, On a distribution problem infinite sets, Compositio Math. 25(1972), 153-160. 33C Y. Meyer, Nombres de Pisot, nombres de Salem, et analyse harmonique, Lecture Notes in Math., vol. 117, Springer-Verlag, Berlin and New York, 1970. 34. L. Mirsky, Systems of representatives with repetition, Proc. Cambridge Philos. Soc. 63 (1967), 1135-1140. MR 36 #58. 35. , Transversal theory, Academic Press, New York, 1971. 36. H. Niederreiter, Methods for estimating discrepancy, Proc. Sympos. on Applications of Number Theory to Numerical Analysis (Montreal, 1971), Edited by S. K. Zaremba, Academic Press, New York, 1972, pp. 203-236. 37. , A distribution problem in finite sets, Proc. Sympos. on Applications of Number Theory to Numerical Analysis (Montreal, 1971), Edited by S. K. Zaremba, Academic Press, New York, 1972, pp. 237-248. 38. , Discrepancy and convex programming, Ann. Mat. Pura Appl. 93 (1972), 89-97. 39. , Zur quantitativen Theorie der Gleichverteilung, Monatsh. Math, (to appear). 40. 1 The distribution of Farey points, Math. Ann. (to appear). 41. D. B. Owen, Handbook of statistical tables, Addison-Wesley, Reading, Mass., 1962. MR 28 #4608. 42. W. Philipp, Das Gesetz vom iterierten Logarithmus mit Anwendungen auf die Zahlentheorie, Math. Ann. 180 (1969), 75-94. MR 39 # 1423. 43. , Mixing sequences of random variables and probabilistic number theory, Mem. Amer. Math. Soc. No. 114(1971). 44. A. Renyi, Wahrscheinlichkeitsrechnung. Mit einem Anhang uber Informationstheorie, Hoch- schulbucher fiir Mathematik, Band 54, VEB Deutscher Verlag der Wissenschaften, Berlin, 1962. MR 26 #5597. 45. M. Rosenblatt, Limit theorems associated with variants of the von Mises statistic, Ann. Math. Statist. 23(1952), 617-623. MR 14, 665. 46. P. 
Schmid, On the Kolmogorov and Smirnov limit theorems for discontinuous distribution functions, Ann. Math. Statist. 29(1958), 1011-1027. MR 21 #392. 47. P. B. Simpson, Note on the estimation of a bivariate distribution function, Ann. Math. Statist. 22 (1951), 476^78. MR 13, 142. 48. N. V. Smirnov, On the distribution of the co2 criterion of von Mises, Math. Sb. 2 (1937), 937-993. (Russian) 49. R. Tijdeman, On a distribution problem infinite and countable sets, J. Combinatorial Theory (to appear). 50. B. L. van der Waerden, Mathematische Statistik, 2nd ed., Springer-Verlag, Berlin, 1957; English transl., Die Grundlehren der math. Wissenschaften, Band 156, Springer-Verlag, New York, 1969. MR 32 #8421; MR 40 #5051. 51. A. Wald and J. Wolfowitz, Confidence limits for continuous distribution functions, Ann. Math. Statist. 10(1939), 105-118. 52. H. Weyl, Uber die Gleichverteilung von Zahlen mod. Eins, Math. Ann. 77 (1916), 313-352. 53. S. K. Zaremba, Sur la discrepance des suites aleatoires, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 20 (1971), 236-248. UNIVERSITY OF ILLINOIS AT URBANA-ChAMPAIGN Southern Illinois University
BOUNDS FOR SEQUENCES OF CONSECUTIVE POWER RESIDUES. I
KARL K. NORTON¹

AMS 1970 subject classifications. Primary 10-02, 10H35; Secondary 10G05, 10H30.
Key words and phrases. Power residues, character sums.
¹ This research was supported in part by the grant AF-AFOSR-69-1712.
© 1973, American Mathematical Society

Let $n$ and $k$ be positive integers with $k \ge 2$, let $C(n)$ denote the multiplicative group of residue classes (mod $n$) which are relatively prime to $n$, and let $C_k(n)$ denote the subgroup of $k$th powers. Write $\nu_k(n) = [C(n) : C_k(n)]$, and when $\nu_k(n) > 1$, let $g_1(n,k)$ denote the smallest positive integer which is in $C(n)$ but not in $C_k(n)$ (i.e., $g_1(n,k)$ is the least positive $k$th power nonresidue mod $n$ which is relatively prime to $n$). This paper is an expository account of some recent research on the problem of finding bounds for $g_1(n,k)$ and, more generally, for the maximum number of consecutive members of $C(n)$ lying in any given coset of $C_k(n)$. Our principal new results are stated in Theorems 1, 4, 6, 7, and 8. Detailed proofs of these theorems will appear elsewhere.

We shall always use the symbol $p$ to represent a prime. The best known upper bound for $g_1(p,2)$ is due to Burgess [5]:
(1) $g_1(p,2) = O_\varepsilon(p^{1/(4e^{1/2})+\varepsilon})$ for $p > 2$, $\varepsilon > 0$,
where the notation indicates that the implied constant depends at most on $\varepsilon$ ($O$ without subscripts will imply an absolute constant). There is a standard method for generalizing this result to the $k$th power case. The method requires the introduction of Dickman's function, a positive continuous nonincreasing function defined recursively by
(2) $\rho(\alpha) = 1$ $(0 \le \alpha \le 1)$,
(3) $\rho(\alpha) = \rho(N) - \int_N^{\alpha} v^{-1}\rho(v-1)\,dv$ $(N < \alpha \le N+1;\ N = 1, 2, \ldots)$.
The generalization of (1) was given by Wang Yuan [21]:
(4) $g_1(p,k) = O_{k,\varepsilon}(p^{1/(4\alpha_k)+\varepsilon})$ for $p \equiv 1 \pmod{k}$, $\varepsilon > 0$,
where $\alpha_k$ is the unique root of the equation $\rho(\alpha) = k^{-1}$ (note that $\alpha_2 = e^{1/2}$). Buchstab [4] showed that
(5) $\alpha_k > (\log k)(\log\log k + 2)^{-1} > 6$ for $k > e^{33}$.
Assuming the extended Riemann hypothesis, Ankeny [1] obtained the very strong result
(6) $g_1(p,2) = O(\log^2 p)$ for odd $p$,
and he remarked (a little vaguely) that a similar result holds for $g_1(p,k)$ when $k > 2$ and $p \equiv 1 \pmod{k}$. Montgomery [17, Chapter 13] has further generalized (6); we shall discuss his result after stating our Theorem 6.

In the other direction, we have the inequality
(7) $g_1(p,k) > d_k \log p$ for infinitely many $p \equiv 1 \pmod{k}$,
where $d_k$ is positive. When $k = 2$, this is an easy consequence of Linnik's deep theorem on the smallest prime in an arithmetic progression, as was remarked by S. Chowla (unpublished) and various other authors. (7) was established for all prime $k$ by Elliott [11], who pointed out to the author that the result actually holds for any $k \ge 2$. Elliott [12, pp. 840-841] also showed that for each $\varepsilon > 0$, there is a positive constant $c(\varepsilon)$ such that $g_1(p,2) > c(\varepsilon)\log p$ for at least $x^{1-\varepsilon}$ primes $p \le x$. In the same paper, he proved that there are positive absolute constants $d$, $a$ such that $g_1(p,2) \le d\log p$ for all but $O(x\exp(-a(\log x/\log\log x)))$ primes not exceeding $x$, while for each $\delta > 0$, an inequality of the form $g_1(p,2) \le (\log p)^{1+\delta}$ holds for all but $O_\delta(x^{1-\mu(\delta)})$ primes not exceeding $x$, where $\mu(\delta) > 0$. These results led him to conjecture that $g_1(p,2) = O_\varepsilon((\log p)^{1+\varepsilon})$ for every $\varepsilon > 0$. Assuming the extended Riemann hypothesis, Montgomery [17, pp. 122, 128] showed that there is a constant $c > 0$ such that $g_1(p,2) > c(\log p)(\log\log p)$ for infinitely many $p$.
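The recursion (2)-(3) for Dickman's function and the root $\alpha_k$ of $\rho(\alpha) = k^{-1}$ can be explored numerically. The following is only a rough sketch, not part of the paper: it tabulates $\rho$ on a uniform grid with the trapezoidal rule, and the function names `dickman_table` and `alpha_root`, the step size, and the cutoff `max_alpha` are illustrative assumptions.

```python
import math

def dickman_table(max_alpha, steps_per_unit=1000):
    """Tabulate rho on [0, max_alpha] with step h = 1/steps_per_unit.

    Uses (2) rho = 1 on [0, 1] and the differential form of (3),
    rho'(v) = -rho(v - 1) / v, integrated by the trapezoidal rule.
    """
    h = 1.0 / steps_per_unit
    n = int(round(max_alpha * steps_per_unit))
    rho = [1.0] * (n + 1)
    for i in range(steps_per_unit + 1, n + 1):
        v_prev, v_cur = (i - 1) * h, i * h
        f_prev = rho[i - 1 - steps_per_unit] / v_prev  # rho(v_prev - 1) / v_prev
        f_cur = rho[i - steps_per_unit] / v_cur        # rho(v_cur - 1) / v_cur
        rho[i] = rho[i - 1] - 0.5 * h * (f_prev + f_cur)
    return rho, h

def alpha_root(k, max_alpha=10.0, steps_per_unit=1000):
    """Approximate alpha_k, the root of rho(alpha) = 1/k, for an integer k >= 2."""
    if k < 2:
        raise ValueError("k must be at least 2")
    rho, h = dickman_table(max_alpha, steps_per_unit)
    target = 1.0 / k
    for i in range(1, len(rho)):
        if rho[i] <= target:
            # Linear interpolation between the two grid points around the root.
            t = (rho[i - 1] - target) / (rho[i - 1] - rho[i])
            return (i - 1 + t) * h
    raise ValueError("root lies beyond max_alpha; increase it")

if __name__ == "__main__":
    for k in (2, 3, 4, 10):
        a = alpha_root(k)
        print(k, round(a, 5), "exponent 1/(4*alpha_k) =", round(1 / (4 * a), 5))
```

For $k = 2$ the computed root can be compared with the closed form $\alpha_2 = e^{1/2}$ noted after (4), since $\rho(\alpha) = 1 - \log\alpha$ on $[1, 2]$.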
The above results fall well short of conveying the full truth about the size of $g_1(p,k)$. In fact,
(8) $\lim_{x \to +\infty} \{\pi(x;k,1)\}^{-1} \sum_{p \le x,\ p \equiv 1 \pmod{k}} g_1(p,k) = A_k$
for $k = 2, 3, \ldots$, where $A_k$ depends only on $k$ and $\pi(x;k,1)$ is the number of $p \le x$ with $p \equiv 1 \pmod{k}$. This very striking result was proved by Erdős [14] in the case $k = 2$ and was extended by Elliott [10], [13] to the general case.

It is interesting to ask whether these and other results on the distribution of power residues can be generalized to the case in which the prime $p$ is replaced by an arbitrary modulus $n$. In [18], [19], and [20], we showed that certain generalizations could be made, but these were not completely satisfactory, since the estimates obtained were weaker unless a certain assumption was made about the prime factorizations of $n$ and $k$. This assumption, which we refer to as Condition A, is a little troublesome to state precisely, but it certainly holds if $n$ is cubefree, or if $k$ is odd and squarefree. Our generalization of (1) and (4) in [20, p. 70] reads as follows: If $w$ is an integer greater than 1, and if $\nu_k(n) \ge w$, then for each $\varepsilon > 0$,
(9) $g_1(n,k) = O_{w,\varepsilon}(n^{3/(8\alpha_w)+\varepsilon})$,
while if Condition A holds, we have the stronger estimate
(10) $g_1(n,k) = O_{w,\varepsilon}(n^{1/(4\alpha_w)+\varepsilon})$.
The proof of these results required a rather delicate elementary estimate concerning the distribution of numbers with small prime factors, as well as some previous work on the distribution of power residues, which in turn depended on Burgess's estimates for character sums ([5], [6], [7], [8]).

Our first new result is more satisfactory than (9) and (10).

Theorem 1. If $w$ is an integer and $\nu_k(n) \ge w \ge 2$, then for each $\varepsilon > 0$,
(11) $g_1(n,k) = O_{w,\varepsilon}(n^{1/(4\alpha_w)+\varepsilon})$.
In particular,
(12) $g_1(n,k) = O_\varepsilon(n^{1/(4\alpha_2)+\varepsilon})$
whenever $g_1(n,k)$ exists.

The proof of Theorem 1 depends on (9) and (10) but is otherwise rather simple.
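For small moduli the quantities $\nu_k(n)$ and $g_1(n,k)$ can be computed directly from their definitions, and the mean value in (8) can be sampled. The following brute-force sketch is only an illustration under that assumption (it is not how the asymptotic results are proved); the function names are hypothetical.

```python
from math import gcd

def kth_power_residues(n, k):
    """The subgroup C_k(n), as a set of least non-negative residues."""
    return {pow(a, k, n) for a in range(1, n + 1) if gcd(a, n) == 1}

def nu(n, k):
    """The index nu_k(n) = [C(n) : C_k(n)]."""
    units = sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)
    return units // len(kth_power_residues(n, k))

def g1(n, k):
    """Least positive kth power nonresidue mod n coprime to n, or None if nu_k(n) = 1."""
    residues = kth_power_residues(n, k)
    for a in range(1, n):
        if gcd(a, n) == 1 and a not in residues:
            return a
    return None

def primes_up_to(x):
    """Primes not exceeding x, by trial division (adequate for small x)."""
    return [p for p in range(2, x + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def mean_g1(x, k):
    """Empirical average of g1(p, k) over primes p <= x with p = 1 (mod k)."""
    values = [g1(p, k) for p in primes_up_to(x) if p % k == 1]
    return sum(values) / len(values)

if __name__ == "__main__":
    print("g1(71, 2) =", g1(71, 2))
    print("nu_2(15) =", nu(15, 2), " g1(15, 2) =", g1(15, 2))
    print("mean of g1(p, 2) for p <= 2000:", mean_g1(2000, 2))
```

The running time is roughly proportional to $n$ per modulus, so this is only suitable for experimenting with the definitions on small inputs.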
We now consider the more general problem of finding bounds for the length of any sequence of consecutive power residues. Here we have the following elegant result for the case of prime modulus.

Theorem 2 (Burgess [9]). Let $\chi$ be a nonprincipal character mod $p$, and suppose $N$, $H$ are integers with $\chi(N+1) = \chi(N+2) = \cdots = \chi(N+H)$. Then $H = O(p^{1/4}\log p)$.

If $(k, p-1) > 1$, then there exists a nonprincipal $\chi$ mod $p$ such that $\chi^k$ is principal, and Theorem 2 gives an upper bound for the maximum number of consecutive integers lying in any fixed coset of $C_k(p)$. This upper bound appears to be the best known and difficult to improve by Burgess's method.

Since the implied constant in Theorem 2 is absolute, it seems worthwhile to calculate an admissible value for it. One could, of course, simply go through Burgess's proof and make all the inequalities explicit, but it does not appear possible to get a reasonably small value of the constant in this way. By a laborious refinement of his method, we have been able to get the following result:

Theorem 3. In Theorem 2, $H < 4.1\,p^{1/4}\log p$ for all $p$. If $p > e^{15} \approx 3.27 \times 10^6$, then $H < 2.5\,p^{1/4}\log p$.

We note that Alfred Brauer [2], [3, pp. 23-26] proved (essentially) that $H < (2p)^{1/2} + 2$ for all $p$, which still seems to be the best result known for $p < e^{15}$.

The method leading to Theorem 3 also gives various specific estimates for $g_1(p,k)$. Our best result in this direction is that
(13) $g_1(p,k) \le 1.1\,p^{1/4}(\log p + 4)$ if $(k, p-1) > 1$.
This improves an inequality given in [20, p. 87]. For other specific estimates of $g_1(p,k)$, see [20, pp. 75-96] and [3, pp. 26-31] (both papers contain references to other work).

In obtaining Theorem 2, Burgess used his inequality
(14) $S_w(p,h,\chi) = \sum_{m=1}^{p} \left|\sum_{l=0}^{h-1} \chi(m+l)\right|^{2w} < (4w)^{w+1} p h^w + 2w\,p^{1/2} h^{2w}$,
where $h$, $w$ are any positive integers and $\chi$ is any nonprincipal character mod $p$ (see [6, Lemma 2]).
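Theorems 2 and 3 can be illustrated for the quadratic character. The sketch below is an illustration only, with hypothetical function names: it finds the longest run of consecutive integers on which the Legendre symbol mod $p$ is constant and compares it with the bound $4.1\,p^{1/4}\log p$ of Theorem 3. It assumes $p$ is an odd prime and uses the fact that a run of length at least 2 cannot contain a multiple of $p$, so it is enough to scan $1, \ldots, p-1$.

```python
import math

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def longest_constant_run(p):
    """Largest H with chi(N+1) = ... = chi(N+H) for the quadratic character mod p."""
    best = run = 0
    previous = None
    for a in range(1, p):
        value = legendre(a, p)
        run = run + 1 if value == previous else 1
        previous = value
        best = max(best, run)
    return best

def theorem3_bound(p):
    """The explicit bound 4.1 * p^(1/4) * log p stated in Theorem 3."""
    return 4.1 * p ** 0.25 * math.log(p)

if __name__ == "__main__":
    for p in (7, 13, 101, 1009, 10007):
        print(p, longest_constant_run(p), round(theorem3_bound(p), 2))
```

For primes of this size the observed runs are far below the bound, which is consistent with the remark that Brauer's elementary estimate is still competitive for $p < e^{15}$.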
In order to get Theorem 3, we used (among other things) the following rather trifling refinement of (14):
(15) $S_w(p,h,\chi) < 2^{1/2}(2w/e)^w p h^w c_n(w,h) + (2w-2)\,p^{1/2} h^{2w} < (4w)^{w+1} p h^w + 2w\,p^{1/2} h^{2w}$,
where $h$, $w$ are any positive integers, $\chi$ is a nonprincipal character of order $n$ (in the character group mod $p$), and
(16) $c_n(w,h) = 1$ if $n$ is even or $w \le 5$,
(17) $c_n(w,h) = \{1 + (w/2h)^{1/2}\}^{2w-2}$ if $n$ is odd and $w > 5$.

In order to generalize Theorem 2 to the case of an arbitrary modulus, we need some appropriate definitions. By an integer interval, we mean any set of the form
$I = \{N+1, N+2, \ldots, N+H\}$,
where $N$, $H$ are integers with $H \ge 0$. The length of $I$ is $|I| = H$. If $\nu_k(n) > 1$, let
$M_k(n) = \max\{|I| : \exists g\ \forall x\ [x \in I,\ (x,n) = 1 \Rightarrow x \in gC_k(n)]\}$.
In words, $M_k(n)$ is the length of the largest integer interval $I$ such that all members of $I$ which are relatively prime to $n$ lie in the same coset of $C_k(n)$. For convenience, we define $M_k(n) = n$ if $\nu_k(n) = 1$. Note that
(18) $g_1(n,k) \le M_k(n) < n$ if $\nu_k(n) > 1$.
In [19, Theorem 3.15], we showed that, for each $\varepsilon > 0$,
(19) $M_k(n) = O_\varepsilon(n^{3/8+\varepsilon})$ if $\nu_k(n) > 1$,
while
(20) $M_k(n) = O_\varepsilon(n^{1/4+\varepsilon})$ if $\nu_k(n) > 1$ and Condition A holds.
(Recall that Condition A was discussed just before equation (9).) We can now state an analogue of Theorem 1 which improves (19) and (20).

Theorem 4. If $\nu_k(n) > 1$, then $M_k(n) = O_\varepsilon(n^{1/4+\varepsilon})$ for each $\varepsilon > 0$.

Note that the implied constant here does not depend on $k$.

The estimate of Theorem 4 can be improved dramatically for almost all values of $n$. To get the improvements, we need

Theorem 5. If there exists a prime $p$ dividing $n$ such that $(k, p-1) > 1$, then
$M_k(n) < \{n/\varphi(n)\}\,2^{\omega(n)}\,p^{1/2}\log p$,
where $\varphi$ is Euler's function and $\omega(n)$ is the number of distinct prime factors of $n$.

Using Theorem 5, the sieve of Eratosthenes, and the well-known theorem of Hardy and Ramanujan on the normal order of $\omega(n)$, we can get the following result:

Theorem 6. Let $x$, $\varepsilon$ be real with $x \ge 3$, $\varepsilon > 0$. For each integer $k \ge 2$, define
(21) $c(k) = 1 - \prod_{p \mid k} \left(1 - \frac{1}{p-1}\right)$.
Then $M_k(n) < (\log n)^{\log 2 + \varepsilon}$ for all but $O_{k,\varepsilon}(x(\log\log x)^{-c(k)})$ values of $n \le x$.

A similar estimate for $g_1(n,k)$ follows from (18). It is interesting to compare this with Montgomery's recent generalization of Ankeny's result (6). In [17, Chapter 13], Montgomery defined what he called least character nonresidues and gave various upper and lower bounds for them. His Theorem 13.1 yields (as a special case) the following extension of (6): on the extended Riemann hypothesis,
(22) $g_1(n,k) = O(\log^2 n)$ whenever $\nu_k(n) > 1$.
Our Theorem 6 has the advantage of giving a better bound in a more general problem, and without any unproved hypotheses. However, the set of exceptional $n$ in Theorem 6 may be quite large. If we are willing to accept weaker bounds for $M_k(n)$, we can reduce appreciably the size of the exceptional set.

Theorem 7. Let $x \ge 3$, $\varepsilon > 0$. For each $k \ge 2$, $M_k(n) < \exp((\log n)^{\varepsilon})$ for all but $O_{k,\varepsilon}(x(\log x)^{-\varepsilon c(k)})$ values of $n \le x$, where $c(k)$ is defined by (21). Furthermore, $M_k(n) < n^{\varepsilon}$ for all but $O_{k,\varepsilon}(x(\log x)^{-c(k)})$ values of $n \le x$.

The proof of Theorem 7 is not as simple as that of Theorem 6. It depends on Selberg's upper bound sieve method and a well-known Tauberian theorem of Hardy and Littlewood [15, Theorem D].

The second part of Theorem 7 should be compared with a result of Linnik [16], which states that for each $\varepsilon > 0$, $g_1(p,2) < p^{\varepsilon}$ for all but $O_\varepsilon(\log\log x)$ primes $p \le x$ (Linnik actually stated his theorem in a slightly different way). As we remarked after equation (7), Elliott obtained much better upper bounds for $g_1(p,2)$, but with much larger exceptional sets.
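The definition of $M_k(n)$ and the inequality (18) can be checked by exhaustive search for small $n$. The sketch below is a direct, unoptimized transcription of the definition, with hypothetical function names; it labels each reduced residue by the smallest element of its coset of $C_k(n)$ and scans every starting point in one period.

```python
from math import gcd

def coset_labels(n, k):
    """Map each reduced residue a mod n to the least element of the coset a*C_k(n)."""
    units = [a for a in range(1, n + 1) if gcd(a, n) == 1]
    powers = {pow(a, k, n) for a in units}
    return {a % n: min(a * c % n for c in powers) for a in units}

def g1(n, k):
    """Least positive integer coprime to n outside C_k(n), or None if nu_k(n) = 1."""
    label = coset_labels(n, k)
    for a in range(1, n):
        if a in label and label[a] != label[1 % n]:
            return a
    return None

def M(n, k):
    """M_k(n): longest integer interval whose members coprime to n share one coset."""
    label = coset_labels(n, k)
    if len(set(label.values())) == 1:
        return n                      # the convention M_k(n) = n when nu_k(n) = 1
    best = 0
    for start in range(n):            # residues repeat with period n
        seen = None
        length = 0
        while length < n:
            r = (start + length) % n
            if r in label:            # only members coprime to n are constrained
                if seen is None:
                    seen = label[r]
                elif label[r] != seen:
                    break
            length += 1
        best = max(best, length)
    return best

if __name__ == "__main__":
    for n, k in ((7, 2), (15, 2), (7, 3), (45, 2)):
        print(n, k, "g1 =", g1(n, k), "M =", M(n, k))
```

Integers not coprime to $n$ are allowed inside the interval without restriction, which is why $M_k(n)$ can exceed the longest run of coprime residues in a single coset.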
In some cases, we can be a little more specific about the size of the exceptional set. For example,

Theorem 8. If $x \ge 3$, $0 < \varepsilon \le \tfrac{1}{2}$, and $k$ is even, then $M_k(n) < n^{\varepsilon}$ for all but at most
$2x/(\varepsilon\log x) + O_\varepsilon(x\log\log x/\log^2 x)$
values of $n \le x$.
Finally, we mention that a simple application of the prime number theorem shows that for each $k \ge 2$, there are infinitely many $n$ with
(23) $M_k(n) \ge g_1(n,k) \ge (\log n)\{1 + O(\exp(-b(\log\log n)^{1/2}))\}$,
where $b$ is a positive absolute constant. The idea behind (23) is almost trivial, but we have not been able to obtain a better lower bound. However, Theorem 6 and (22) suggest that (23) cannot be improved very much.

References

1. N. C. Ankeny, The least quadratic non-residue, Ann. of Math. (2) 55 (1952), 65-72. MR 13, 538.
2. A. Brauer, Über die Verteilung der Potenzreste, Math. Z. 35 (1932), 39-50.
3. A. Brauer, Combinatorial methods in the distribution of kth power residues (with discussion), Proc. Conf. Combinatorial Math. and Appl. (Univ. North Carolina, Chapel Hill, N.C., 1967), Univ. North Carolina Press, Chapel Hill, N.C., 1969, pp. 14-37. MR 40 #1330.
4. A. A. Buhstab, On those numbers in an arithmetic progression all prime factors of which are small in order of magnitude, Dokl. Akad. Nauk SSSR 67 (1949), 5-8. (Russian) MR 11, 84; 871.
5. D. A. Burgess, The distribution of quadratic residues and non-residues, Mathematika 4 (1957), 106-112. MR 20 #28.
6. D. A. Burgess, On character sums and primitive roots, Proc. London Math. Soc. (3) 12 (1962), 179-192. MR 24 #A2569.
7. D. A. Burgess, On character sums and L-series, Proc. London Math. Soc. (3) 12 (1962), 193-206. MR 24 #A2570.
8. D. A. Burgess, On character sums and L-series. II, Proc. London Math. Soc. (3) 13 (1963), 524-536. MR 26 #6133.
9. D. A. Burgess, A note on the distribution of residues and non-residues, J. London Math. Soc. 38 (1963), 253-256. MR 26 #6135.
10. P. D. T. A. Elliott, A problem of Erdős concerning power residue sums, Acta Arith. 13 (1967/68), 131-149. MR 36 #3741.
11. P. D. T. A. Elliott, Some notes on k-th power residues, Acta Arith. 14 (1967/68), 153-162. MR 37 #4000.
12. P. D. T. A. Elliott, The distribution of primitive roots, Canad. J. Math. 21 (1969), 822-841. MR 40 #104.
13. P. D. T. A. Elliott, The distribution of power residues and certain related results, Acta Arith. 17 (1970), 141-159. MR 43 #4773.
14. P. Erdős, Remarks on number theory. I, Mat. Lapok 12 (1961), 10-17. (Hungarian) MR 26 #2410.
15. G. H. Hardy and J. E. Littlewood, Some theorems concerning Dirichlet's series, Messenger Math. 43 (1913/14), 134-147.
16. Ju. V. Linnik, A remark on the least quadratic non-residue, C. R. (Dokl.) Acad. Sci. URSS 36 (1942), 119-120. MR 4, 189.
17. H. L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227, Springer-Verlag, Berlin and New York, 1971.
18. K. K. Norton, Upper bounds for kth power coset representatives modulo n, Acta Arith. 15 (1968/69), 161-179. MR 39 #1419.
19. K. K. Norton, On the distribution of kth power residues and non-residues modulo n, J. Number Theory 1 (1969), 398-418. MR 40 #4223.
20. K. K. Norton, Numbers with small prime factors, and the least kth power non-residue, Mem. Amer. Math. Soc. No. 106 (1971), 106 pp.
21. Wang Yuan, Estimation and application of character sums, Shuxue Jinzhan 7 (1964), 78-83. (Chinese) MR 37 #5162.

Institute for Advanced Study
University of Colorado
RATIONAL POINTS ON CERTAIN ELLIPTIC MODULAR CURVES
A. P. OGG¹

AMS 1970 subject classifications. Primary 10D15, 14G25.
¹ Partially supported by NSF grant GP-20532. Much of this work, although written up only in early 1972, was done while the author held an appointment in the Miller Institute for Basic Research in Science during the academic year 1970-1971.
© 1973, American Mathematical Society

Let $\Gamma = SL(2, \mathbf{Z})/\pm 1$ be the full modular group, let $Y$ be the quotient of the upper half plane by $\Gamma$, and let $X$ be the compactification of $Y$ obtained by adjoining the cusp. For any integer $N \ge 3$, we have subgroups $\Gamma(N)$, $\Gamma_1(N)$, $\Gamma_0(N)$ of $\Gamma$ defined by matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ congruent modulo $N$ to $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} * & * \\ 0 & * \end{pmatrix}$, respectively; we let $Y(N)$, $Y_1(N)$, $Y_0(N)$ denote the corresponding quotients of the upper half plane, and $X(N)$, $X_1(N)$, $X_0(N)$ the compactifications by adding cusps. The $X$'s are compact Riemann surfaces of genus $p(N)$, $p_1(N)$, $p_0(N)$, easily computed from the Riemann-Hurwitz formula.

As moduli, $Y(N)$ parametrizes (isomorphism classes of) elliptic curves $A$ together with a definite isomorphism of $A_N$ onto $C_N \times C_N$, where $A_N$ is the group of points of order $N$ on $A$, and $C_N$ is the cyclic group of order $N$; $Y_1(N)$ parametrizes elliptic curves together with a point of order $N$; $Y_0(N)$ parametrizes elliptic curves together with a cyclic subgroup of order $N$. Furthermore, the $X$'s and $Y$'s are not merely Riemann surfaces, i.e. algebraic curves over $\mathbf{C}$, but curves over $\mathbf{Q}$, and so for example an elliptic curve $A$, defined over $\mathbf{Q}$, together with a point $P$ of order $N$ on $A$, rational over $\mathbf{Q}$, corresponds to a point of $Y_1(N)_{\mathbf{Q}}$, the set of $\mathbf{Q}$-rational points on the curve $Y_1(N)$. Recently Demjanenko [5] has proved the conjecture that $Y_1(N)_{\mathbf{Q}}$ is empty for $N$ sufficiently large (even over any number field, not just $\mathbf{Q}$), i.e. no elliptic curve over $\mathbf{Q}$ can have a rational point of order $N$. I once made the guess [11], on admittedly thin evidence, that $Y_1(N)_{\mathbf{Q}}$ is empty if $p_1(N) > 0$, i.e. if $N = 11$ or $N \ge 13$.

The rational points of $Y_1(N)$ or $Y_0(N)$, or, as we shall usually think of them, the rational points of $X_1(N)$ or $X_0(N)$ which are not cusps, are then inherently
interesting, and on the present occasion I want to describe some methods for finding them, subject to the following considerations. One can derive plane equations for these curves, as in Fricke [6], and sometimes solve directly for the rational points, the classic example being Billing-Mahler [1]. But for larger values of $N$ these equations will be highly singular, as well as laborious to derive, and this makes counting solutions difficult. We will therefore use only the modular properties of our curves, and consider using a specific equation as against the rules of the game. On the other hand, while it is difficult to say in modular terms what it means for a point of $Y_0(N)$ or $Y_1(N)$ to be rational, for the cusps of $X_0(N)$ or $X_1(N)$, which are precisely the points we are not interested in, it is very easy to decide which cusps are rational. In practice, the rational cusps generate rather large finite groups of rational divisor classes of degree 0, i.e. subgroups of $J_0(N)_{\mathbf{Q}}$ or $J_1(N)_{\mathbf{Q}}$, where $J_0(N)$, $J_1(N)$ are the Jacobians of $X_0(N)$, $X_1(N)$, and from this one can go on to determine $X_0(N)_{\mathbf{Q}}$ or $X_1(N)_{\mathbf{Q}}$ for some values of $N$.

It is a pleasure to acknowledge a number of helpful conversations with Bill Casselman, Barry Mazur, Peter Swinnerton-Dyer, and John Tate.

1. Rationality of cusps. The cusps of $X(N)$ can be regarded as pairs $\pm\binom{x}{y}$, where $x$, $y$ are in $\mathbf{Z}/N\mathbf{Z}$, and relatively prime, and $\binom{x}{y}$, $\binom{-x}{-y}$ are identified; $G(N) = \Gamma/\Gamma(N) = SL(2, \mathbf{Z}/N\mathbf{Z})/\pm 1$ operates naturally on the left, and so a cusp of $X_0(N)$ or $X_1(N)$ can be regarded as an orbit of $\Gamma_0(N)/\Gamma(N) = G_0(N)$ or $\Gamma_1(N)/\Gamma(N) = G_1(N)$. In the canonical model of Shimura, as communicated to me by Casselman, $X(N)$ is defined over $\mathbf{Q}$; the cusps of $X(N)$ are rational in $\mathbf{Q}(\zeta_N) = \mathbf{Q}(e^{2\pi i/N})$, and the Galois group $(\mathbf{Z}/N\mathbf{Z})^{\times}$ operates as $\begin{pmatrix} c & 0 \\ 0 & 1 \end{pmatrix}$, i.e. if $P = \pm\binom{x}{y}$, and $c \in (\mathbf{Z}/N\mathbf{Z})^{\times}$ (giving the automorphism $\zeta_N \mapsto \zeta_N^c$ of $\mathbf{Q}(\zeta_N)/\mathbf{Q}$), then $\sigma(P) = \pm\binom{cx}{y}$. For the group $\Gamma_1(N)$, $G_1(N) = \left\{\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} : b \in \mathbf{Z}/N\mathbf{Z}\right\}$, and a cusp is an orbit $\left\{\pm\binom{x+by}{y}\right\}$.
Note that we can choose a representative $\binom{x}{y}$ with $x$ reduced modulo $d = (y, N)$, and $(x, y, N) = 1$ means $(x, d) = 1$, so for each $d \mid N$ there are $\varphi(N/d)$ $y$'s and $\varphi(d)$ $x$'s corresponding, i.e. $\tfrac{1}{2}\varphi(d)\varphi(N/d)$ cusps of $X_1(N)$ (say for $N \ge 5$). As to rationality, fix $d \mid N$, $d < N/2$, and fix $y$ with $(y, N) = d$, $0 \le y < N/2$. The $\varphi(d)$ cusps $\binom{x}{y}$ are certainly conjugate, and in particular $\binom{x}{y}$ is rational only if $\varphi(d) = 1$, i.e. $d = 1$ or 2. For $d = N$ (resp. $d = N/2$ in the case of even $N$) we can take $y = 0$ (resp. $y = N/2$); the $\tfrac{1}{2}\varphi(d)$ cusps corresponding are all conjugate, and so are rational only if $\varphi(d) = 2$, i.e. $d = 3, 4, 6$. Except for $N = 1$-$4$, 6, 8, 12, then, the only rational cusps are those with $d = 1$ or 2, and in any case the cusps with the same $\pm y$ are conjugate. Finally we note that there are $N/d$ cusps of $X(N)$ over a cusp $\binom{x}{y}$ of $X_1(N)$, so its ramification degree in the covering $X(N) \to X_1(N)$ is $e = d$. Thus:

Proposition 1. Suppose $p_1(N) > 0$, i.e. $N = 11$ or $N \ge 13$. Then for each $d \mid N$, we have $\tfrac{1}{2}\varphi(d)\varphi(N/d)$ cusps $\binom{x}{y}$ of $X_1(N)$ with $d = (y, N)$, each having ramification degree $e = d$ in $X(N) \to X_1(N)$. The cusps $\binom{x}{y}$ with a fixed value of $\pm y$ are conjugate,
and in particular are rational only if $\varphi(d) = 1$, i.e. $d = 1$ or 2, i.e. the only rational cusps are the $\tfrac{1}{2}\varphi(N)$ cusps $\binom{0}{y}$ with $d = 1$, and for even $N$, the $\tfrac{1}{2}\varphi(N/2)$ cusps $\binom{x}{y}$ with $d = 2$.

For $X_0(N)$, let us consider the Galois covering $X_1(N) \to X_0(N)$, with group
$G = \Gamma_0(N)/\Gamma_1(N) = \left\{\begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} : a \in (\mathbf{Z}/N\mathbf{Z})^{\times}/\pm 1\right\}$.
Given a cusp $\binom{x}{y}$ of $X_1(N)$, it is fixed by $\begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} \in G$ if and only if $ax \equiv \pm x \pmod{d}$ and $ay \equiv \pm y \pmod{N}$, i.e. $a \equiv \pm 1$ modulo $d$ and $N/d$, i.e. modulo $t = (N/d, d)$. This shows that the ramification degree is $e = t$. It also shows that for a cusp of $X_0(N)$, we can assume $y = d = (y, N)$, and reduce $x$ modulo $t$, so we have $\varphi(t)$ conjugate cusps $\binom{x}{d}$ corresponding to $d$, rational only if $\varphi(t) = 1$, i.e. $t = 1$ or 2.

Proposition 2. For each $d \mid N$, we have $\varphi(t)$ conjugate cusps $\binom{x}{d}$ of $X_0(N)$, each with ramification degree $e = t$ in $X_1(N) \to X_0(N)$, and these are all the cusps of $X_0(N)$. In particular, all cusps are rational if $N$ or $N/2$ is a square-free integer.

2. Groups generated by cusps on $X_1(N)$. Consider, for $k \ge 2$, the "Eisenstein series"
$E_k(\alpha,\beta) = E_k(\tau;\alpha,\beta;N) = \sum_{m \equiv \alpha\,(N)} \sum_{n \equiv \beta\,(N)} (m\tau+n)^{-k} = \delta(\alpha/N) \sum_{n \equiv \beta\,(N)} n^{-k} + \frac{(-2\pi i)^k}{N^k (k-1)!} \sum_{m \equiv \alpha\,(N);\ mn > 0} n^{k-1}\,\mathrm{sgn}\,n \cdot e(n(m\tau+\beta)/N)$,
where $e(x) = e^{2\pi i x}$, and $\delta(x) = 1$ or 0 as $x \in \mathbf{Z}$ or $x \notin \mathbf{Z}$. For $k \ge 3$, $E_k(\alpha,\beta)$ is a modular form of dimension $-k$ for $\Gamma(N)$, and for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$,
$E_k(\alpha,\beta) \mid \gamma = E_k((\alpha,\beta)\gamma) = E_k(\alpha a + \beta c,\ \alpha b + \beta d)$.
(Here and in the following we have defined the operator $\mid$, relative to a given integer $k$, by $f \mid \begin{pmatrix} a & b \\ c & d \end{pmatrix}(\tau) = (ad-bc)^{k/2}(c\tau+d)^{-k} f((a\tau+b)/(c\tau+d))$, for any real matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ of positive determinant.) For $k = 2$, a difference $E_2(\alpha,\beta) - E_2(\alpha',\beta')$ will be a form of dimension $-2$ (cf. Hecke [7, Number 24]). The order of zero of $E_k(\alpha,\beta)$ at a cusp $\binom{x}{y}$ of $X(N)$ can be read off from the above Fourier expansion and transformation rule, and we find

Proposition 3. Let $\{x\} = \{x\}_N$ be defined by $0 \le \{x\} \le N/2$, $\{x\} \equiv \pm x \pmod{N}$. Then $E_k(\alpha,\beta)$
has a zero of order $\ge \{\alpha x + \beta y\}$ at the cusp $\binom{x}{y}$ of $X(N)$ (with equality usually holding, for example if $(\alpha, \beta, N) = 1$).

For $k = 3$, one checks that for $(\alpha, \beta, N) = 1$, $E_k(\alpha,\beta)$ has a total of $(\Gamma : \Gamma(N))/4$ zeroes at cusps and hence no further zeroes. (In general a form of dimension $-k$ for a group of index $\mu$ in $\Gamma$ has $k\mu/12$ zeroes in any fundamental domain.) This gives various relations in the group of divisor classes of degree 0 generated by the cusps. However, we prefer to use $k = 2$, since our divisors will then be only $\tfrac{2}{3}$ as big.
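The cusp counts of Propositions 1 and 2 can be checked by enumerating the pairs $\pm\binom{x}{y}$ directly. The following sketch is an illustration with hypothetical function names: it assumes the description of Section 1 (cusps of $X_1(N)$ as orbits $\{\pm\binom{x+by}{y}\}$, with the Galois action $x \mapsto cx$) and compares a brute-force count with the formula $\tfrac{1}{2}\sum_{d \mid N}\varphi(d)\varphi(N/d)$ for $N \ge 5$; for $X_0(N)$ only the formula $\sum_{d \mid N}\varphi((d, N/d))$ is evaluated.

```python
from math import gcd

def phi(n):
    """Euler's function, by direct count."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def canon(x, y, N):
    """Canonical representative of the orbit {±(x + b*y, y)} in (Z/NZ)^2."""
    return min(((s * (x + b * y)) % N, (s * y) % N)
               for s in (1, -1) for b in range(N))

def cusps_X1(N):
    """Cusps of X_1(N) as orbits of pairs (x, y) with gcd(x, y, N) = 1."""
    return {canon(x, y, N)
            for x in range(N) for y in range(N)
            if gcd(gcd(x, y), N) == 1}

def rational_cusps_X1(N):
    """Cusps fixed by every Galois element, acting by (x, y) -> (c*x, y)."""
    units = [c for c in range(1, N) if gcd(c, N) == 1]
    return {P for P in cusps_X1(N)
            if all(canon(c * P[0], P[1], N) == P for c in units)}

def count_X1_formula(N):
    """(1/2) * sum over d | N of phi(d) * phi(N/d), as in Proposition 1 (N >= 5)."""
    return sum(phi(d) * phi(N // d) for d in range(1, N + 1) if N % d == 0) // 2

def count_X0_formula(N):
    """Sum over d | N of phi(t), t = (d, N/d), as in Proposition 2."""
    return sum(phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)

if __name__ == "__main__":
    for N in (11, 13, 16, 18):
        print(N, len(cusps_X1(N)), count_X1_formula(N),
              len(rational_cusps_X1(N)), count_X0_formula(N))
```

For $N = 13$, 16, 18 the enumeration gives six rational cusps in each case, in agreement with $\tfrac{1}{2}\varphi(N)$ plus, for even $N$, $\tfrac{1}{2}\varphi(N/2)$.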
We illustrate the method in detail for $X_1(13)$; this is the first interesting case, since $X_1(N)$ has genus 0 for $N \le 10$ or $N = 12$, while $X_1(11)$, the first elliptic curve in nature, is well-known [1]; $p_1(13) = 2$. Let then $P_j = \binom{0}{j}$, $j = 1, 2, \ldots, 6$, denote the 6 rational cusps of $X_1(13)$; here $d = 1$ in the notation of Proposition 1, so $X(13) \to X_1(13)$ is unramified over the $P_j$. Let $\varphi_j = E_2(0, j; 13)$. Then $\varphi_{ij} = \varphi_i - \varphi_j$ is a form of dimension $-2$ for $\Gamma_1(13)$, and as such has $(\Gamma : \Gamma_1(13))/6 = 14$ zeroes in a fundamental domain. Letting $(a_1, \ldots, a_6)$ denote the divisor $a_1P_1 + \cdots + a_6P_6$, we have, by Proposition 3,
$(\varphi_1) \ge (1, 2, 3, 4, 5, 6)$, $(\varphi_2) \ge (2, 4, 6, 5, 3, 1)$, $(\varphi_3) \ge (3, 6, 4, 1, 2, 5)$,
$(\varphi_4) \ge (4, 5, 1, 3, 6, 2)$, $(\varphi_5) \ge (5, 3, 2, 6, 1, 4)$, $(\varphi_6) \ge (6, 1, 5, 2, 4, 3)$.
Then $(\varphi_{12}) \ge \inf((\varphi_1), (\varphi_2)) = (1, 2, 3, 4, 3, 1)$, and since this last divisor has degree 14, equality holds. (In particular, $\varphi_{12}$ has no zeroes at finite points, only at cusps.) Proceeding we find
$(\varphi_{12}) = (1, 2, 3, 4, 3, 1)$, $(\varphi_{13}) = (1, 2, 3, 1, 2, 5)$, $(\varphi_{14}) = (1, 2, 1, 3, 5, 2)$,
$(\varphi_{15}) = (1, 2, 2, 4, 1, 4)$, $(\varphi_{16}) = (1, 1, 3, 2, 4, 3)$, $(\varphi_{23}) = (2, 4, 4, 1, 2, 1)$,
and so
$(\varphi_{12}/\varphi_{13}) = (0, 0, 0, 3, 1, -4)$,
$(\varphi_{12}/\varphi_{14}) = (0, 0, 2, 1, -2, -1)$,
$(\varphi_{12}/\varphi_{15}) = (0, 0, 1, 0, 2, -3)$,
$(\varphi_{12}/\varphi_{16}) = (0, 1, 0, 2, -1, -2)$,
$(\varphi_{12}/\varphi_{23}) = (-1, -2, -1, 3, 1, 0)$
are divisors linearly equivalent to 0 on $X_1(13)$.

Let us embed $X_1(13)$ in $J_1(13)$ by sending $P$ to the class of $P - P_6$; let $u_i$ be the image of $P_i$, i.e. the class of $P_i - P_6$, and let $U$ be the subgroup of $J_1(13)_{\mathbf{Q}}$ generated by $u_1, \ldots, u_5$. The above relations give, in order, that $u_5 = -3u_4$, then that $2u_3 + 7u_4 = 0$ and $u_3 - 6u_4 = 0$, whence $19u_4 = 0$ and $u_3 = 6u_4$; the last two relations then give $u_2 = -5u_4$, and $u_1 = 4u_4$. Thus $U = C_{19}$ is cyclic of order 19.

Note that $u_1 + u_5 = u_4 = u_2 + u_3$, i.e. $P_1 + P_5 \sim P_4 + P_6 \sim P_2 + P_3$. This corresponds to the covering
$X_1(13) \to X_2(13) \to X_0(13)$,
where the middle curve has genus 0; since $5^2 \equiv -1 \pmod{13}$, we see that $\{P_1, P_5\}$, $\{P_2, P_{10} = P_3\}$, and $\{P_4, P_{20} = P_6\}$ are cusps of $X_2(13)$ and so are linearly equivalent (giving another proof of that fact).

Recall that any curve of genus 2 admits a unique double covering of a line; the fibers are the positive canonical divisors. For our curve $X_1(13)$ it follows for example that there could not be a function whose divisor of poles is $P_2 + P_5$, and so forth. From this we can check that $X_1(13)$ meets our group of order 19 in only the 6 rational cusps we started with, as follows. If not, we would have $P \in X_1(13)$, with $u$ = class of $P - P_6 = \nu u_4$ for some $\nu \in \mathbf{Z}/19\mathbf{Z}$, and $u \ne u_i$, i.e. $\nu \ne 0, 1, 4, -5, 6, -3$. If $u = -u_i$, then $P + P_i \sim 2P_6$, contrary to the remarks above, so $\nu \ne -1, -4, 5, -6, 3$. Similarly $u = 2u_i$ is not possible, i.e. $\nu \ne 2, 8, 9, -7, -6$; hence $\nu = -2, 7, -8$, or $-9$, i.e. $u = u_1 - u_3$, $u_3 + u_4$, $u_2 + u_5$, or $u_2 - u_1$, whence $P_1 + P_6$, $P_3 + P_4$, $P_2 + P_5$, or $P_2 + P_6$ is canonical, which is not possible.

Next we claim that our group of order 19 is the full torsion subgroup $J_1(13)_{\mathbf{Q}}^{\mathrm{tors}}$ of the group of rational points on $J_1(13)$. For this we use the fact that we have good reduction modulo $p$ for $p \nmid 13$ (or in general for $p \nmid N$), and that reduction modulo $p$ is injective except for $p$-torsion on the torsion subgroup. We also use the fact that the reduced curve has the same meaning as a moduli space in characteristic $p$, i.e. parametrizes elliptic curves plus a point of order 13. Let $\nu(q)$ resp. $h(q)$ be the number of points on the reduced curve resp. its Jacobian rational in the field of $q$ elements. Since the genus is 2 we have
$h(q) = -q + (\nu(q^2) + \nu(q)^2)/2$,
from the usual formulas connected with the zeta-function of the reduced curve. For $q = 2$ or 4 we still have the 6 rational "cusps" but no other points, since an elliptic curve over a field of 2 or 4 elements has no room for a point of order 13.
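The bookkeeping in the example $N = 13$ is mechanical and can be replayed in a few lines. The sketch below is an illustration with hypothetical names: it takes Proposition 3 and the degree count 14 as given, recomputes the divisors and the five relations, checks that they are satisfied by $u_i = c_i u_4$ with $(c_1, \ldots, c_6) = (4, -5, 6, 1, -3, 0)$ modulo 19, lists the values of $\nu$ that survive the three exclusions, and evaluates the genus 2 formula for $h(q)$ at $q = 2$.

```python
N = 13

def bracket(x, n=N):
    """{x}_n of Proposition 3: the representative of ±x mod n lying in [0, n/2]."""
    r = x % n
    return min(r, n - r)

# Lower bounds for the divisors (phi_j) at the cusps P_1, ..., P_6.
order = {j: tuple(bracket(i * j) for i in range(1, 7)) for j in range(1, 7)}

def div_diff(i, j):
    """Divisor of phi_i - phi_j, taken to be the infimum (its degree must be 14)."""
    d = tuple(min(a, b) for a, b in zip(order[i], order[j]))
    assert sum(d) == 14
    return d

base = div_diff(1, 2)
relations = [tuple(a - b for a, b in zip(base, div_diff(i, j)))
             for (i, j) in [(1, 3), (1, 4), (1, 5), (1, 6), (2, 3)]]

# Claimed solution: u_i = coeff[i-1] * u_4 in Z/19Z, with u_6 = 0.
coeff = (4, -5, 6, 1, -3, 0)
for rel in relations:
    assert sum(rel) == 0                                   # degree 0
    assert sum(a * c for a, c in zip(rel, coeff)) % 19 == 0

# Values of nu excluded because u = u_i, u = -u_i, or u = 2*u_i.
cusp = {c % 19 for c in coeff}
negated = {(-c) % 19 for c in coeff}
doubled = {(2 * c) % 19 for c in coeff}
remaining = sorted(set(range(19)) - cusp - negated - doubled)

def jacobian_points(q, v1, v2):
    """h(q) = -q + (v(q^2) + v(q)^2)/2 for a curve of genus 2."""
    return -q + (v2 + v1 * v1) // 2

if __name__ == "__main__":
    print("relations:", relations)
    print("remaining nu mod 19:", remaining)   # the classes of 7, -9, -8, -2
    print("h(2) =", jacobian_points(2, 6, 6))
```

The four surviving residues are exactly $\nu = -2, 7, -8, -9$ modulo 19, which the text then rules out using the canonical divisors.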
(By the "Riemann hypothesis", an elliptic curve over the field of $q$ elements has at most $(1 + q^{1/2})^2$ points in that field.) Hence $\nu(2) = \nu(4) = 6$,
$h(2) = -2 + (6 + 6^2)/2 = 19$.
Similarly, $\nu(3) = 6$. For $q = 9$, there is a unique elliptic curve $A$ over $\mathbf{F}_9$ with 13 points; this gives apparently 12 pairs $(A, P)$, $P$ of order 13, but there is a cyclic group of order 6 operating, so we have only 2 corresponding points; $\nu(9) = 6 + 2 = 8$,
$h(3) = -3 + (6^2 + 8)/2 = 19$.
Hence $J_1(13)_{\mathbf{Q}}^{\mathrm{tors}} = C_{19}$.

There are two other cases with $p_1(N) = 2$, namely $N = 16, 18$, and one can proceed as above. The analysis is somewhat more complicated because $N$ is composite, but not essentially any more difficult. (The ground rules mentioned in the introduction arose at this point, since in the first proof of the above facts for $N = 13$, I in fact made use of an equation for the curve, but was unable to do the same for $N = 18$. If you use an equation for $X_1(N)$, even if you get past the difficulties due to singularities, which may be considerable, and have found some rational points presumably corresponding to cusps, it will still require luck or
ingenuity to find functions on the curve which have zeroes and poles only at those points. The above procedure using Eisenstein series, by contrast, is quite mechanical.) The result is

Theorem. Let $p_1(N) = 2$, i.e. $N = 13, 16, 18$. Then the 6 rational cusps of $X_1(N)$ generate $J_1(N)_{\mathbf{Q}}^{\mathrm{tors}}$ (isomorphic to $C_{19}$, $C_2 \times C_{10}$, $C_{21}$ respectively), which meets $X_1(N)$ in only the 6 rational cusps.

One then has a proof of the nonexistence of an elliptic curve over $\mathbf{Q}$ with a rational point of order $N$, for these $N$, provided one can show that $J_1(N)_{\mathbf{Q}}$ is finite. For $N = 13$ this has been accomplished by Barry Mazur [9], using his new descent techniques, and his proof meets our requirement of only using the modular properties of the curve. For $N = 16$ or 18, it seems likely that Mazur's techniques will again show that $J_1(N)_{\mathbf{Q}}$ is finite, which would prove the nonexistence of an elliptic curve with a rational point of order 18. ($N = 16$ is already a known result; cf. Cassels [3, p. 264].)

Finally, let us note that the finiteness of $J_1(13)_{\mathbf{Q}}$ is predicted by the conjecture of Birch and Swinnerton-Dyer. Let $S = S(\Gamma_1(N), k)$ be the space of cusp forms of dimension $-k$ for $\Gamma_1(N)$. Then $\Gamma_0(N)/\Gamma_1(N) = (\mathbf{Z}/N\mathbf{Z})^{\times}/\pm 1$ operates on $S$, so we have
$S = \bigoplus_{\varepsilon} S_{\varepsilon}$,
where $\varepsilon$ ranges over the characters of $(\mathbf{Z}/N\mathbf{Z})^{\times}/\pm 1$ (i.e. characters $\varepsilon$ modulo $N$, with $\varepsilon(-1) = 1$), and $S_{\varepsilon}$ is the set of $f$ with $f \mid \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \varepsilon(d) \cdot f$ for $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N)$. Note that the operator $H_N = \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$ carries $S_{\varepsilon}$ onto $S_{\bar{\varepsilon}}$. The Hecke operators $T(n)$, for $(n, N) = 1$, operate on each $S_{\varepsilon}$, and so $S_{\varepsilon}$ has a basis of elements $f$ of the form $f(\tau) = \sum_{n=1}^{\infty} a_n q^n$, with $q = e^{2\pi i \tau}$, where $a_1 = 1$, and $f \mid T(n) = a_n \cdot f$ for $(n, N) = 1$. Since $H_N T(n) = \overline{\varepsilon(n)}\,T(n) H_N$, and $\bar{a}_n = \overline{\varepsilon(n)}\,a_n$ by Petersson's theory (cf. [10, V-3 and IV-27]), we can write $\lambda \cdot f \mid H_N = \sum b_n q^n$, where $\lambda$ is some constant, and $b_n = \bar{a}_n$ for $(n, N) = 1$. This suggests that we consider Hecke's conjugation operator $K$, defined by $f \mid K(\tau) = \overline{f(-\bar{\tau})} = \sum \bar{a}_n q^n$; $K$ is a conjugate-linear map of $S_{\varepsilon}$ onto $S_{\bar{\varepsilon}}$.
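Thus $f \mid K$ and $\lambda \cdot f \mid H_N$ have the same Fourier coefficients $\bar a_n$, for $(n, N) = 1$, and hence coincide, in the case where $\varepsilon$ has conductor $N$, by a theorem of Hecke [7, p. 836]. Note $|\lambda| = 1$.

For $p_1(N) = 2$, i.e. $N = 13, 16, 18$, and $k = 2$, $S$ has dimension 2, and $S = S_\varepsilon \oplus S_{\bar\varepsilon}$ for a certain nontrivial character $\varepsilon$; let $f_\varepsilon = \sum a_n q^n$, $f_{\bar\varepsilon} = \lambda f \mid H_N = \sum \bar a_n q^n$ generate $S_\varepsilon$ and $S_{\bar\varepsilon}$ as above. (Of course the argument above using the operator $K$ is not needed here, since the spaces $S_\varepsilon$ and $S_{\bar\varepsilon}$ have dimension 1.) One checks that $\varepsilon$ has maximal order $\tfrac12\varphi(N) = 6, 4, 3$, using the Riemann-Hurwitz formula. (If $\Gamma_1(N) \subset \ker(\varepsilon) \subset \Gamma_0(N)$, with proper inclusions, then $\ker(\varepsilon)$ corresponds to a curve $X_\varepsilon$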
of genus 0, but with a nontrivial holomorphic differential $f_\varepsilon(\tau)\,d\tau$, which is impossible.)

By a congruence formula of Shimura, the one-dimensional part of the zeta-function of $X_1(N)$ or $J_1(N)$ is $L(s) = L_\varepsilon(s) L_{\bar\varepsilon}(s)$, where $L_\varepsilon(s) = \sum a_n n^{-s}$, $L_{\bar\varepsilon}(s) = \sum \bar a_n n^{-s}$. The conjecture of Birch and Swinnerton-Dyer states that $J_1(N)_{\mathbb{Q}}$ is finite if and only if $0 \ne L(1) = L_\varepsilon(1) L_{\bar\varepsilon}(1) = |L_\varepsilon(1)|^2$; since
$$(2\pi N^{-1/2})^{-s}\, \Gamma(s)\, L_\varepsilon(s) = \int_0^\infty f(itN^{-1/2})\, t^s\, dt/t$$
(with $f = f_\varepsilon$), this is so if and only if $0 \ne \int_0^\infty f(itN^{-1/2})\, dt = I(f)$. Let $g = f - f \mid H_N$, $h = f + f \mid H_N$. Then $I(h) = 0$, and $I(g) = 2\int_1^\infty g(itN^{-1/2})\, dt$, so $L(1) \ne 0$ if and only if $\int_1^\infty g(itN^{-1/2})\, dt \ne 0$. Furthermore, $g$ is almost real on the imaginary axis, since
$$g = f - f \mid H_N = f + e^{-2i\theta} f \mid K = e^{-i\theta}\left(e^{i\theta} f + e^{-i\theta} f \mid K\right),$$
so $e^{i\theta} g$ is real on the imaginary axis, and to show that $L(1) \ne 0$ it suffices to show that $g(itN^{-1/2}) \ne 0$ for $1 < t < \infty$.

For $(\Gamma_0(N) : \Gamma_1(N)) = \tfrac12\varphi(N)$ even, i.e. $N = 13$ or 16, this is shown as follows. Let $\Gamma_1(N) \subset \Gamma_2(N) \subset \Gamma_0(N)$, where $\Gamma_2(N)$ is defined by $\varepsilon(d) = \pm 1$; let $\Gamma^*(N) = \Gamma_2(N) \cup \Gamma_2(N) H_N$. Then $(\Gamma^*(N) : \Gamma_1(N)) = 4$, and $\Gamma^*(N)$ permutes the zeroes of $g = f - f \mid H_N$. Since $g(\tau)\,d\tau$ has only $2p_1(N) - 2 = 2$ zeroes on $X_1(N)$, a zero of $g(itN^{-1/2})$ with $1 < t < \infty$ would have to be a fixed point of some element of $\Gamma^*(N)$, which is not the case. Thus the conjecture of Birch and Swinnerton-Dyer predicts the finiteness of $J_1(N)_{\mathbb{Q}}$ for $N = 13, 16$, and Mazur's result for $N = 13$ then gives another verification of their conjecture.

3. Groups generated by cusps on $X_0(N)$. Let $\Delta(\tau) = q \prod_{n=1}^{\infty} (1 - q^n)^{24}$ be the cusp form of dimension $-12$ for $\Gamma$, where $q = e^{2\pi i \tau}$; $\Delta$ has a zero of order 1 at the cusp, and no other zeroes or poles. Regarding $\Delta$ as a form for $\Gamma_0(N)$, it has a zero at the cusp $\binom{x}{d}$ of order $N/dt$ (the ramification degree in $X_0(N) \to X$), i.e. $\Delta$ has divisor
$$(\Delta) = \sum (N/dt) \cdot \binom{x}{d}.$$
For any $d \mid N$, let $\Delta_d(\tau) = \Delta(d\tau)$; $\Delta_d$ is also a form for $\Gamma_0(N)$, whose divisor is easily determined. For $d = N$, we find
$$(\Delta_N) = \sum (d/t) \cdot \binom{x}{d}.$$
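(With $k = 12$, we have $\Delta_N = \Delta \mid \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} = \Delta \mid \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix} = \Delta \mid H_N$, where $H_N = \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$ maps $X_0(N) \to X_0(N)$ and carries $\binom{x}{d}$ on $\frac{1}{d} H_N \binom{x}{d} = \binom{-1}{(N/d)x} = \binom{y}{y'}$, with $(y', N) = N/d$.) Then
$$(\Delta/\Delta_N) = \sum (N/dt - d/t) \cdot \binom{x}{d}$$
is a divisor linearly equivalent to 0 on $X_0(N)$; similarly, for any $d, d' \mid N$, $\Delta_d/\Delta_{d'}$ is a function on $X_0(N)$ whose divisor is easily determined. For $N = p$ a prime, we find
$$(\Delta/\Delta_p) = (p - 1)(P_0 - P_\infty),$$
where $P_0 = \binom{0}{1}$, $P_\infty = \binom{1}{0}$, so the class of $P_0 - P_\infty$ is a point on $J_0(p)_{\mathbb{Q}}$ of order dividing $(p - 1)$; its order may be smaller, as follows.

Let $\eta(\tau) = \Delta(\tau)^{1/24} = e^{\pi i \tau/12} \prod_{n=1}^{\infty} (1 - q^n)$ be Dedekind's function, and $\eta_d(\tau) = \eta(d\tau)$. Then, by Hecke [7, Number 42], if $p$ is a prime $> 3$, then $\omega_p = \eta^p/\eta_p$ is a modular form for $\Gamma_0(p)$ with multiplier $\varepsilon(d) = (d/p)$ (quadratic residue symbol), i.e.
$$\omega_p \mid \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \varepsilon(d)\, \omega_p \quad \text{for } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(p).$$
(Actually $\omega_p$ is a product of Eisenstein series, so the method here is similar to that of the preceding section.) For $p = 3$, the same is true for $\omega_3^3 = \eta^9/\eta_3^3$, and for $p = 2$, $\eta^{16}/\eta_2^8$ is a form for $\Gamma_0(2)$.

For simplicity, let us take the case of a prime $p > 3$, and let $v$ be the least integer $> 0$ with $n = v(p - 1)/12 \in \mathbb{Z}$. Then
$$f_p = (\eta/\eta_p)^{2v} = (\eta^p/\eta_p)^{2v}\, \eta^{-2v(p-1)} = \omega_p^{2v}\, \Delta^{-n}$$
is a function on $X_0(p)$, of divisor $(f_p) = n \cdot (P_0 - P_\infty)$. Actually, $n$ is the exact period of the divisor class of $P_0 - P_\infty$, i.e. $f_p^{1/r}$ is not a function on $X_0(p)$ for $r > 1$, which can be seen from Dedekind's transformation formulas for the $\eta$-function [4]. (We can see immediately that $f_p^{1/2}$ is not a function on $X_0(p)$, for that could happen only for $n$ even and $v$ odd, and then we would have a multiplier of $\varepsilon(d)$.) Thus

Theorem. For a prime $p > 3$, the divisor class of $P_0 - P_\infty$ is a point on $J_0(p)_{\mathbb{Q}}$ of period exactly $n = v(p - 1)/12$, where $v$ is the least integer $> 0$ with $n \in \mathbb{Z}$.

For $N$ composite, one has to consider $\eta_d$ for various $d \mid N$, and the situation is more complicated.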
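As a small numerical illustration of the theorem just stated, the following sketch evaluates $v$ and $n = v(p-1)/12$ for a few primes $p > 3$. The function name `cusp_class_period` is ours and does not come from the paper; the code only evaluates the formula, it does not verify the theorem, and it does not check that its argument is prime.

```python
from math import gcd


def cusp_class_period(p: int) -> tuple[int, int]:
    """Return (v, n), where v is the least positive integer such that
    n = v*(p-1)/12 is an integer (the quantity in the theorem above).

    Assumption: the caller passes a prime p > 3; primality is not checked.
    """
    if p <= 3:
        raise ValueError("the theorem is stated for primes p > 3")
    # v*(p-1) is divisible by 12 exactly when v is a multiple of 12/gcd(p-1, 12).
    v = 12 // gcd(p - 1, 12)
    n = v * (p - 1) // 12
    return v, n


if __name__ == "__main__":
    for p in (5, 7, 11, 13, 17, 19, 37):
        v, n = cusp_class_period(p)
        print(f"p = {p:2d}: v = {v:2d}, n = {n}")
```

For $p = 17$ and $p = 19$ this gives $n = 4$ and $n = 3$, and for $p = 5, 7, 13$ (where $X_0(p)$ has genus 0) it gives $n = 1$, as it must.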
I have not stated a general result (perhaps it is not worthwhile to do so); at any rate it seems clear that for any fixed $N$ one can determine the exact (finite) subgroup of $J_0(N)_{\mathbb{Q}}$ generated by the rational cusps. The technique of reducing modulo $p$, for small $p \nmid N$, can be applied here as in the previous section,
and in many cases one can check that the group generated is $J_0(N)_{\mathbb{Q}\,\mathrm{ts}}$.

For example, $X_0(27)$ has two rational cusps, namely $P_0 = \binom{0}{1}$ and $P_\infty = \binom{1}{0}$. The function $\Delta_{27}^{3}\, \Delta_3\, \Delta_9^{-1}\, \Delta^{-3}$ has divisor $72(P_\infty - P_0)$, and by the results of Hecke mentioned above we can extract the 8th root, getting $\eta_{27}^{9}\, \eta_3^{3}\, \eta_9^{-3}\, \eta^{-9}$, a function on $X_0(27)$ with divisor $9(P_\infty - P_0)$. But $X_0(27)$ has genus 1 and good reduction at $p = 2$, and so actually the divisor class of $P_0 - P_\infty$ has period 3. Reducing modulo 7, we find that $X_0(27)$ has 9 points in the field of 7 elements, so $X_0(27)_{\mathbb{Q}\,\mathrm{ts}}$ has order 3. (In $F_7$, $X_0(27)$ has 6 rational "cusps", and 3 rational points given by the curve with endomorphism ring the order of conductor 3 in the field of cube roots of 1, with (say) Frobenius endomorphism $\pi = (1 + (-27)^{1/2})/2$. $(2\pi - 1)$, $(\pi + 4)$, and $(\pi - 5)$ are 3 isogenies, cyclic of degree 27.)

For the twelve values $N = 11, \ldots, 49$ with $p_0(N) = 1$, one finds in this manner, in all cases, that the rational cusps generate $X_0(N)_{\mathbb{Q}\,\mathrm{ts}}$; the groups are listed in the table below. Actually, $X_0(N)_{\mathbb{Q}}$ is finite for these $N$, as is predicted by the conjecture of Birch and Swinnerton-Dyer [12], and has been verified by various people, including Ligozat [8].

Table ($p_0(N) = 1$)

 N    RATIONAL CUSPS (d =)        GROUP THEY GENERATE    NUMBER OF RATIONAL NONCUSPS
 11   1, 11                       C_5                    3
 14   1, 2, 7, 14                 C_6                    2
 15   1, 3, 5, 15                 C_2 x C_4              4
 17   1, 17                       C_4                    2
 19   1, 19                       C_3                    1
 20   1, 2, 4, 5, 10, 20          C_6                    0
 21   1, 3, 7, 21                 C_2 x C_4              4
 24   1, 2, 4, 8, 3, 6, 12, 24    C_2 x C_4              0
 27   1, 27                       C_3                    1
 32   1, 2, 16, 32                C_4                    0
 36   1, 2, 4, 9, 18, 36          C_6                    0
 49   1, 49                       C_2                    0

Thus for $N = 20, 24, 32, 36, 49$, there can be no elliptic curve over $\mathbb{Q}$ with a cyclic rational group of order $N$, let alone a rational point of order $N$. For $N = 19, 27$, we have just one rational noncusp, i.e. there is a unique elliptic curve $A$ over $\mathbb{Q}$ together with a cyclic rational isogeny $\lambda$ of order $N$. By the uniqueness, the isogenous curve $A' = \lambda(A)$ must be the same as $A$, i.e. $\lambda$ is a complex multiplier of $A$. But then the $j$-invariant of $A$ is integral, i.e.
$j \in \mathbb{Z}$, and so $A$ has good reduction at $p = 2$ and hence cannot have a rational point of order $N$. Thus we get a painless proof of the nonexistence of an elliptic curve with a rational point of order $N = 19$ or 27.
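The point counts used in this section (the bound $(1 + q^{1/2})^2$ and the 9 points of $X_0(27)$ over the field of 7 elements) can be checked by brute force if one is willing to use an equation for the curve, which the paper's ground rules deliberately avoid. The sketch below assumes the Weierstrass model $y^2 + y = x^3 - 7$ for $X_0(27)$; this model is not given in the text and is an assumption of the example, as are the function names.

```python
from math import isqrt


def hasse_upper_bound(q: int) -> int:
    """Largest integer not exceeding (1 + sqrt(q))^2 = q + 1 + 2*sqrt(q)."""
    # floor(2*sqrt(q)) equals isqrt(4*q), which avoids floating point.
    return q + 1 + isqrt(4 * q)


def count_points_x0_27(p: int) -> int:
    """Count points over F_p on y^2 + y = x^3 - 7 (ASSUMED model of X_0(27)),
    including the single point at infinity. Intended for small primes p != 3."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - (x * x * x - 7)) % p == 0
    )
    return affine + 1


if __name__ == "__main__":
    for q in (2, 3, 4, 9):
        print(f"q = {q}: at most {hasse_upper_bound(q)} points")
    for p in (2, 7):
        print(f"p = {p}: {count_points_x0_27(p)} points on the assumed model")
```

Under this assumed model the count modulo 7 is 9 and the count modulo 2 is 3, in agreement with the statements that the reduction modulo 7 has 9 points and that the class of $P_0 - P_\infty$ has period 3.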
Finally, as Shimura has pointed out, Kummer theory gives another way to find divisor classes of finite period on $J_0(N)$, this time rational over a cyclotomic field. For simplicity, we consider only the case of a prime $p > 3$, although, as usual, it does not really matter. Consider the Galois covering $X_1(p) \to X_0(p)$, with group $\Gamma_0(p)/\Gamma_1(p) = (\mathbb{Z}/p\mathbb{Z})^\times/\pm 1$, cyclic of order $(p - 1)/2$; this covering is ramified only over the elliptic fixed points of $X_0(p)$. Now $X_0(p)$ has elliptic fixed points of order 2 resp. 3 if and only if $p \equiv 1$ modulo 4 resp. 3. Thus if we define (as before) $v$ to be the least integer $> 0$ with $n = v(p - 1)/12 \in \mathbb{Z}$, then our covering factors as
$$X_1(p) \to X_2(p) \to X_0(p),$$
where $X_2(p) \to X_0(p)$ is unramified, and cyclic of degree $n$. Let $k_n = \mathbb{Q}(\zeta_n)$, $\zeta_n = e^{2\pi i/n}$. By Kummer theory, the function field $k_n(X_2(p))$ is generated over $k_n(X_0(p))$ by adjoining $h = f^{1/n}$ for some $f \in k_n(X_0(p))$. $f$ has divisor of the form $(f) = nD$ (since $X_2(p) \to X_0(p)$ is unramified), and the divisor class $d$ of $D$ then has period exactly $n$ on $J_0(p)_{k_n}$. Now $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(p)$ operates rationally, i.e. $\alpha\sigma = \sigma\alpha$ for $\sigma \in \mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$. Hence $(h \circ \alpha)^\sigma = h^\sigma \circ \alpha$. Also $h \circ \alpha = \varepsilon(d) \cdot h$, where $\varepsilon$ is some character modulo $p$ of period $n$, and for $\sigma \in \mathrm{Gal}(k_n/\mathbb{Q}) = (\mathbb{Z}/n\mathbb{Z})^\times$, we have $\sigma(\zeta_n) = \zeta_n^s$, say, where $s \in (\mathbb{Z}/n\mathbb{Z})^\times$. Hence
$$(h^\sigma h^{-s}) \circ \alpha = (\varepsilon(d) \cdot h)^\sigma (\varepsilon(d) \cdot h)^{-s} = h^\sigma h^{-s},$$
i.e. $g = h^\sigma h^{-s} \in k_n(X_0(p))$. In other words $\sigma(D) \sim s \cdot D$ on $X_0(p)$, i.e. $\sigma(d) = s \cdot d$, which we describe by saying that $\mathrm{Gal}(k_n/\mathbb{Q})$ operates naturally on $d$.

Thus consideration of rational cusps gives $a \in J_0(p)_{\mathbb{Q}}$ of period $n = v(p - 1)/12$ ($a$ = class of $P_0 - P_\infty$), while Kummer theory gives $d \in J_0(p)_{k_n}$ of period $n$ with the Galois group operating naturally. It is clear that $a$ and $d$ generate $C_n \times C_n$ on $J_0(p)_{k_n}$ if $n$ is odd; for even $n$ they generate $C_n \times C_{n/2}$.

References

1. G. Billing and K. Mahler, On exceptional points on cubic curves, J. London Math. Soc. 15 (1940), 32-43. MR 1, 266.
2. W. Casselman, The arithmetic of the cusps of the classical modular curves (to appear).
3. J. W. S. Cassels, Diophantine equations with special reference to elliptic curves, J. London Math. Soc. 41 (1966), 193-291. MR 33 #7299.
4. R. Dedekind, "Erläuterungen zu den Fragmenten. XXVIII," in Collected works of B. Riemann, Dover, New York, 1953, pp. 466-478.
5. V. A. Dem'janenko, Torsion of elliptic curves, Izv. Akad. Nauk SSSR Ser. Mat. 35 (1971), 280-307 = Math. USSR Izv. 5 (1971), 289-318.
6. R. Fricke, Die elliptischen Funktionen und ihre Anwendungen, Teubner, Leipzig, 1922.
7. E. Hecke, Mathematische Werke, Vandenhoeck & Ruprecht, Göttingen, 1959. MR 21 #3303.
8. G. Ligozat, Fonction L des courbes modulaires, Séminaire Delange-Pisot-Poitou, 1969/70, no. 9.
9. B. Mazur, Letter to A. Ogg, September, 1971.
10. A. Ogg, Modular forms and Dirichlet series, Benjamin, New York, 1969. MR 41 #1648.
11. A. Ogg, Rational points of finite order on elliptic curves, Invent. Math. 12 (1971), 105-111.
12. H. P. F. Swinnerton-Dyer, The conjectures of Birch and Swinnerton-Dyer, and of Tate, Proc. Conf. Local Fields (Driebergen, 1966), Springer, Berlin, 1967, pp. 132-157. MR 37 #6287.

University of California, Berkeley
ARITHMETIC FUNCTIONS AND BROWNIAN MOTION

WALTER PHILIPP

AMS 1970 subject classifications. Primary 10K20, 10H25; Secondary 60J65.
© 1973, American Mathematical Society

1. Introduction. Additive functions $f$ are mappings from the positive integers into the reals satisfying $f(m_1 m_2) = f(m_1) + f(m_2)$ whenever the greatest common divisor $\gcd(m_1, m_2) = 1$. For the limit theorems we shall be concerned with in this paper there is no loss in generality in assuming that the additive function $f$ is strongly additive, i.e. $f$ satisfies the additional requirement $f(p^\alpha) = f(p)$ for all positive integers $\alpha$ and all primes $p$ (see [8, pp. 36-37]).

Kubilius proved [8, Theorem 4.2] a central limit theorem for additive functions $f(m)$ satisfying a kind of Lindeberg condition. The special case where $\sup |f(p)| < \infty$ is the celebrated Erdős-Kac theorem [7]. Kubilius also gave a more general version of the Erdős-Kac theorem [8, Theorem 7.3]. The connection between Kubilius' result and weak convergence of probability measures is suggested by a reference to a paper of Prohorov [13] on page 124 of his book [8].

Billingsley [3] showed that Kubilius' Theorem 7.3 [8] implies that certain random functions in $C[0, 1]$, similar to $h_N(t, m)$ defined below, converge in distribution to standard Brownian motion. He also gave a very simple proof of Theorem 1 below. In this paper we shall prove Billingsley's theorem assuming only a Lindeberg condition.

Generalizing a result of Davenport and Erdős [6] on the distribution of normalized sums of Legendre symbols, Kubilius and Linnik [9] (see also [10, Chapter 10]) proved that the finite dimensional distributions of certain normalized sums of Jacobi symbols as well as of certain character sums converge to those of standard Brownian motion. Combining their results and their methods of proving them with Serfling's [14] maximal inequality we shall show that these normalized sums converge in distribution to standard Brownian motion. This can be used to obtain the standard corollaries on the distribution of the maxima and minima of the normalized partial sums, as well as of the number of changes of sign of these partial sums, etc.

2. Invariance principles for additive functions.

2.1. Statement of results. Let $\langle f_N \rangle$ be a sequence of strongly additive functions. For $n \ge 1$ write
$$B^2(N, n) = \sum_{p \le n} \frac{f_N^2(p)}{p}, \qquad B(N, N) = B(N).$$
For $0 \le t \le 1$ define functions $h_N(t, m)$ by
$$h_N(t, m) = \frac{1}{B(N)} \sum f_N(p) \left( \delta_p(m) - \frac{1}{p} \right)$$
where $\delta_p(m) = 1$ or 0 according to whether or not $p \mid m$ and the sum is extended over all primes $p \le N$ satisfying $B^2(N, p) \le t B^2(N)$. Hence if $m$ is an integer chosen at random from $[1, N]$ then $h_N(t, m)$ is a random function in $D[0, 1]$¹ constant on the intervals $[B^2(N, q)/B^2(N),\ B^2(N, q')/B^2(N)]$ where $q < q' \le N$ are consecutive primes. The value of the constant is $B^{-1}(N) \sum f_N(p)(\delta_p(m) - 1/p)$ where the sum ranges over all primes $p \le q$.

There are other possible choices for $h_N(t, m)$. For instance we could join the points of the graph at the end points of the interval linearly. One then obtains a polygonal curve in $C[0, 1]$. The theorems below continue to hold for this alternate choice of $h_N(t, m)$.

Theorem 1 (Billingsley [5], [4]). Let $\langle f_N \rangle$ be a sequence of strongly additive functions with $B(N) \to \infty$. If
$$(1) \qquad \sup_{p, N} |f_N(p)| < \infty,$$
then
$$h_N(t, m) \xrightarrow{\ \mathcal{D}\ } W,$$
i.e. the random functions $h_N(t, m)$ converge in distribution to standard Brownian motion.

Remark. Though very similar, the present Theorem 1 and Theorem 7.3 in [8] do not contain each other. Also their proofs are completely different.

¹ $D[0, 1]$ is the set of real-valued functions on $[0, 1]$ which are right-continuous and have left-hand limits. $C[0, 1]$ is the set of continuous functions on $[0, 1]$.
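To make the definition of $h_N(t, m)$ concrete, here is a minimal sketch that tabulates the step function for one integer $m$. It assumes the reading $B^2(N, n) = \sum_{p \le n} f_N^2(p)/p$ of the definition above and defaults to the illustrative choice $f_N(p) = 1$ for all $p$ (the Erdős-Kac case); the names `primes_up_to` and `h_path` are ours, not the paper's.

```python
from math import sqrt


def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: all primes <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def h_path(m: int, N: int, f=lambda p: 1.0) -> list[tuple[float, float]]:
    """Breakpoints (t_q, h_N(t_q, m)) of the step function h_N(., m).

    For each prime q <= N, t_q = B^2(N, q) / B^2(N) and the value is
    B(N)^{-1} * sum_{p <= q} f(p) * (delta_p(m) - 1/p).
    Assumptions: N >= 2 and f is not identically zero on the primes <= N.
    """
    ps = primes_up_to(N)
    b2_total = sum(f(p) ** 2 / p for p in ps)
    b = sqrt(b2_total)
    path, b2, s = [], 0.0, 0.0
    for q in ps:
        b2 += f(q) ** 2 / q
        delta = 1.0 if m % q == 0 else 0.0
        s += f(q) * (delta - 1.0 / q)
        path.append((b2 / b2_total, s / b))
    return path


if __name__ == "__main__":
    for t, value in h_path(m=30, N=30):
        print(f"t = {t:.4f}  h = {value:+.4f}")
```

Drawing $m$ uniformly from $1, \ldots, N$ and plotting such paths for large $N$ gives an informal picture of the convergence asserted in Theorem 1; the sketch itself proves nothing.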
In this paper we relax (1) to a kind of Lindeberg condition.

Theorem 2. Let $\langle f_N \rangle$ be a sequence of strongly additive functions with $B(N) \to \infty$. Suppose that, for any $\varepsilon > 0$,
$$(2) \qquad B^{-2}(N) \sum \frac{f_N^2(p)}{p} \to 0 \qquad (N \to \infty),$$
where the sum is extended over all primes $p$ not exceeding $N$ and satisfying $|f_N(p)| \ge \varepsilon B(N)$. Then
$$(3) \qquad h_N(t, m) \xrightarrow{\ \mathcal{D}\ } W$$
where $W$ is standard Brownian motion. Moreover, if $B(N)/B(N, N^{1/2}) \to 1$ then (2) is also necessary.

The assumption that the functions $f_N$ be additive was kept for historical reasons. Of course, Theorems 1 and 2 and their corollaries continue to hold for any sequence of number theoretic functions subject to the remaining conditions. Simply redefine the values of such a function on the nonprime powers to make the modified function additive.

For the probabilistic background as well as for the proof of the standard corollary to this kind of theorem see Billingsley [2].

Corollary. Under the hypotheses of Theorem 2 we have, for $x \ge 0$,
$$\frac{1}{N} \operatorname{card} \left\{ m \le N : \max_{k \le N} \frac{f_N^{(k)}(m) - A(N, k)}{B(N)} < x \right\} \to \frac{2}{(2\pi)^{1/2}} \int_0^x e^{-u^2/2}\, du$$
and
$$\frac{1}{N} \operatorname{card} \left\{ m \le N : \max_{k \le N} \frac{|f_N^{(k)}(m) - A(N, k)|}{B(N)} < x \right\} \to \frac{4}{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1} \exp\left( -\frac{\pi^2 (2k + 1)^2}{8x^2} \right).$$
Here
236 WALTER PHILIPP ,4(JV,fc)=£—, A(N,N) = A(N), /*» = £ fN(p). P^k P P^k;p\m Remark. In the case fN=f for all N^\ this corollary is due to Babu [1]. Theorem 2 and its corollary can be generalized in several directions. One of them is as follows. Let R(m) be a polynomial with integer coefficients and assuming only positive values. Let $R(p) be the number of residue classes satisfying the congruence R(m) = 0(modp). Again let </N> be a sequence of strongly additive functions and put b2(n, «)= £ —• Up), b(n, n)=b(n). p^n P For 0^ t ^ 1 define random functions hN(t, m, R) by hN{t, m, R) = -^'1MP) (<U*M)- »r{p)/p)> where the sum is extended over all primes p^N with B2(N, p)^tB2(N). Then Theorem 2 remains valid if we replace (2) by b-2(JV)£^-MpM> (at-oo). P The proof of this result is essentially the same as the proof of Theorem 2 given below. Instead of Lemmas 1 and 2 one has to apply the corresponding facts due to Uzdavinis [15]. Similarly one can generalize Theorem 2 to sums of shifted additive functions, a theory developed by Kubilius (see [8, Chapters 5-7]). The modified version of Lemma 1 follows from the last formula in [8, p. 97]; Lemma 2 is applied as it stands; and instead of Theorem 3 one has to use a slightly generalized version of [8, Theorem 5.2] (see the proof of Theorem 3). As already indicated above we shall prove only Theorem 2 in complete detail. 2.2. Lemmas on additive functions. We need a minor extension of a result of Kubilius mentioned above [8, Theorem 4.2]. Theorem 3. Under the hypotheses of Theorem 2, X e-u2'2du. 1 \ fNlm)-A(N) I 1 (4) - card \m<N:JNy ' * <*HtttT7I w N { ~ B(N) J (2n 1/2 — 00
ARITHMETIC FUNCTIONS AND BROWNIAN MOTION 237 Moreover, ifB(N)/B(N, N1'2) -» 1 then (2) is also necessary for (4) to hold. For the proof see Kubilius [8] or the author [12]. Simply attach the index N to each /. The proofs then work without further changes. The reason for this phenomenon is as follows. By the additivity offN we have /wM=£/*(pm")) P where <xp(m) is defined by the canonical representation of m = f|ppap(m). Hence limit theorems for fN(m) can be interpreted as limit theorems for sums of the random variablesfN(p*pim)) (p = 2, 3, 5,... ^N) on the probability spaces (QN, %N, PN). Here 0N = (1, 2,..., N), %N = y(QN) is the power set and PN is defined by PN(A)={1/N) •card (.4), AczQN. Then Theorem 3 is a central limit theorem for the row sums of the triangular array <fN{p*p{m)), p^N, Af=l, 2,...>, where within a row the fN{pap{m)) satisfy certain dependence relations. Since [8, Theorem 4.2] is proved by means of a certral limit theorem for triangular arrays it is quite clear that its proof will yield Theorem 3, too. From now on we shall drop the index NinfN and/^fc). Lemma 1 [12]. Let r = r(N) be any integer valued function with logr/logiV->0 and let P and Q be two disjoint sets of primes of which no primes exceed r. Let MjiV) be the algebra generated by the random variables {f{pap{m)),peP} and similarly for Q. Then PN(AB)-PN(A)PN(B)<Qxp{-c(\ogN/\ogry'2} for all sets AeM(P and BeM{q\ Here the constant c>0 and the constant implied by <^ are absolute. Remark. This result is slightly more general than [12, Lemma 5.2.1], but the proof is the same. Lemma 2 [12, Lemma 5.1.4]. For any integers r, w^2, card Im^N: fj pfl,'(m)>ii|<|Nexp j-c( — where c>0 and the constant implied by <^ are absolute. The following maximal inequality will be used to establish tightness. First we give some notation. We introduce the random variables )"
238 WALTER PHILIPP *„, = £->(A0 {f if *m^-EN{f {f *»)}}, p^r(N) prime on the probability spaces (QN, S$N, PN) as described above. Here r = r(N) satisfies logr/logiV->0 and EN denotes the expectation of the random variable involved. Obviously EN(xNp) = 0. Write SNk = Y,P*k *np- Lemma 3. For any e>0 and any sufficiently large N The proof is adapted from the standard proof of the Levy inequalities. We put Sno = 0 > Snu = max {SNj - \i (SNj - SNr)), Ak — \^Nk- 1 < £> $Nk~~A*\^Nk~~ ^Nr) = £} » &k — { $Nr — $Nk ~ V fiNr ~ $Nk) = 0} , where \i is a median with ft{SNr — SNk) = —ft(SNk — SNr). As in [11, p. 248], we have {SJr^fi}= U Ak9 {SNr*e}=> U AkBk. k^r k^r The sets Ak are disjoint and PN{Bk)^j. If we can prove that (5) PN JU i4AU(i + o(l)) £ PN(^) + o(l), Ik^r ) k^r the remainder of the proof in [11, p. 248] will work without change. In fact (5) will imply PN{SNrZe}£PN(v AkBk)^iZ PN(A) + o(\)=iP(S*Nr^e) + o(l) \k^r ) k^r and the rest is the same. For the proof of (5) we repeat part of the argument leading to [12, Lemma 5.2.1]. Each Ak and Bk can be written as Ak= U JO {aPlN = a|.}j, (ai,...,ait)eYk (i=l J
ARITHMETIC FUNCTIONS AND BROWNIAN MOTION 239 Bk= U { O {cep» = /?,.}}, <0lc+l,...,0s)€Zk U = k+1 J where yfc and Zk are certain /c-tuples and (s —/c)-tuples of integers respectively. We drop the index N in PN and write k £ft = £k(a1,...,ak)=n {aPi(m) = aI-}, Fk = Fk(pk+u...,ps)= n K»=/5/}- We define P'(£fc) = P(£fc), if/?=flpf^iV1/4, =0, ifK>JV1/4, and P"(£,) = P(£)i)-F(£k). Similarly for F(Fk), P"(Fk), F(EkFk), P"(EkFk). We apply Lemma 1, sum over all (au..., <xk)e Yk and (pk+1,..., fls)eZk (for the details see [12, p. 82]) and obtain P(AkBk)-P(Ak)P(Bk) (l+o(l)) (6) £(l+o(l)) X (P"(£*)P(Ft) + P(£»)P"(F»)) (lgfcgr). Yk,Zk Summing over 1 g/egr we get on the left-hand side (7) p(\j AkBk)-(\+o(\)) £ P(Ak) P(Bk). Lemma 2 with u = N1/4 yields pj]~I pa*(m)>N1/4j = 0(l). Hence summing over lg/cgr and observing that any two sets of type E are disjoint, since the Ak are, we obtain for the first terms on the right-hand side of (6) the bound EI/"(£*)=/» (u U*Ek) = o(\) k^r Yk \k^r Yk J
240 WALTER PHILIPP where the * indicates that the union is extended over those (al5..., OLk)eYk with pV'~Pkk>N1/4' The sum of the second terms on the right-hand side of (6) is less than p(v Ak\supP"(Fk) = o(l). Hence (7) yields p(u AkBk) = £ + o(\)) £ P(Ak) + o(l) \k^r J k^r as P(Bk)^\. This proves (5) and thus the lemma. Lemma 4. If B(N)/B(N, r)-+l when N-+cc and logA71ogr->oo, then, as TV-oo, (8) and sup£N(xJp)-*0 (9) PN< max I x»p~Wm 2/W(Mm)-;)i >Smin(r,n) 0(I\)p^„ \ p/\ >£>-0 for any e > 0. Proof. (8) is proved in [12, p. 88]. For the proof of (9) write yNp=xNp-(l/B(N))f(p)(5p(m)-l/p), l^p^r, = (l/B(N))f(p)(3p(m)-l/p), r<p£N. By Chebyshev's inequality it is enough to show that (10) EN <[ max I yNp\ But the left-hand side of (10) does not exceed £,Jmax £ \yNP\>£ Z EN\yNp\- [n^N p^n J p^N
ARITHMETIC FUNCTIONS AND BROWNIAN MOTION 241 By an easy calculation using [12, Lemma 5.2.4] we obtain I EN\yNp\<-^-l\f(p)\-N-'=o(l) and similarly I ^I^pN^TIa I ——=o(l). r<p^N By1*) r<p£N P The last three inequalities prove (10) and hence (9). 2.3. Proof of Theorem 2. Since (4) follows from (3) (see e.g. [2, p. 4]), Theorem 3 implies that (2) is necessary for (3) to hold provided that B(N)/B(N, Nll2)-+ 1. To prove the sufficiency we observe that the Lindeberg condition (2) implies the existence of an integer-valued function r = r(N) such that B(N)/B(N,r)-+l with logiV/logr-KX) (see [8, p. 61]). Moreover, Theorem 3 and Lemma 4 yield (id '•[s><"taM*"*1* — oo iff, for any e>0, (12) £ f x2dFNp^0. Vkr J \x\*z HereFjvp denotes the distribution function of xNp. In fact this can be regarded as an intermediate step in the proof of Theorem 3 (see e.g. [12, Lemma 5.3.1]). Define random functions XN(t9 m) by XN(t,m) = Y,xNp where the sum is exterided over all primes p^r with EN(£ xNp)2^tENQ]pgrxNp)2' Then XN(t9 m) e D[0, 1]. We shall show that the sequence (XN(t, m)> is tight and that its finite dimensional distributions converge weakly towards those of standard Brownian motion. Once these two facts are established it will follow that XN(t, m) converges in distribution towards standard Brownian motion (see [2, Theorem 8.1]). This together with Lemma 4 and [2, Theorem 4.1] will imply Theorem 2. 2.3.1. To establish tightness we have to show that given e, rj >0 there exists a positive S such that
242 WALTER PHILIPP (13) PN{w(XN,S)^e}^t, for all sufficiently large N. Here w is the modulus of continuity. By [2, p. 56, Corollary] we have (14) PN{w(XN,8)^s}^ £ PN\ sup W^-X^^e for any s>0, 5>0 and 0 = /0</1 <-</w = l with /,- — /f_ t ^5 (2^/g/i-l). Let s > 0 and 77 > 0 be given. Choose (15) 0<S<&2 so small that 10(5-1exp(-£2/(8(5))<y7. Choose tf = i<5, 0gi^[<5-1] + l = w. Because of the structure of XN we have (16) pj \XN(s)-XN(ti-i)\^i^PN\ max XN. ti -l^p^t, _i +j >£ where zt is the largest integer t with £jv(£^TXjvp)2^r For the estimate of the right-hand side of (16) we apply Lemma 3. We redefine xNp to be zero for all primes p outside the interval [t,_ x, tJ. Then for suificiently large N the right-hand side of (16) does not exceed (17) 4PN KNp\ .It,- i^p^t, Since the xNp are almost independent we have, for any a and b, E[ I xNp\ = 2 E(x2Np) + o(l). using (8). (A proof can be easily patterned after the argument on [12, p. 89].) Hence by the choice of t{ and zt we obtain (18) ONi=daEH[ E xNp) ~5 (JV-»oo). *i- i^p^'i
ARITHMETIC FUNCTIONS AND BROWNIAN MOTION 243 We apply now (11) to the sequence <xNp(7Ni\ x^^p^x^ and observe that the Lindeberg condition (12) in the present case becomes *n/2 I x2dFNp<S-l£ x2dFNp-+0 \x\^effNl using (18) twice. Hence by (11), M ^e<5 (19) PNUml xN. r,- t^P^r, oo ^ot>< \e-ul2 du<OL~l exp(~^a2) and thus by (18), (19) and (15) we conclude that (17) is less than (20) 4PN t,- l^P^Tj KNp ^-J> + o(l)<5exp (4) + o(l)<rjd for N sufficiently large. (13) follows now from (16), (20), (14) and the fact that n^S"1. 2.3.2. We now will prove the convergence of the finite dimensional distributions to those of Brownian motion. We shall consider only the two dimensional distributions, as the higher ones can be treated in the same manner. We consider first a single time interval [f, u], O^g t < u ^ 1, and obtain by the argument leading to (19) that (21) XN(u)-XN(t)$Wu-Wt. We wish to show that for t<u, (22) (XN(t), XN(u))$(W„ Wu). But this is equivalent to showing (see [2, Corollary 1 to Theorem 5.1]) (XN(t), XN(u)-XN(t))Z(W„ Wu-Wt) or (23) (l xNp, I xNp)^(W„Wu-Wt). \peQt peQu-Qt / Here Qt is the set of primes figuring in the definition of XN(t, m). The two sets of primes Qt and Qu-Qt are disjoint and thus, by Lemma 1,
244 WALTER PHILIPP \peQt peQu-Qt / \peQt J \peQu~Qt / By (21) the two terms in the product tend towards normal distribution with means 0 and variances t and u — t respectively. Since Brownian motion has independent increments (23), and thus (22), follows. In view of the remarks at the beginning of §3, Theorem 2 is proven. 4. Proof of the Corollary. The first relation is proven in the same way as [2, (10.11) and (10.17)]. For the second relation, one uses an argument similar to that leading to [2, (10.11)], but, instead of using [2, (10.16)], applies [2, (11.13)]. 3. Invariance principles for sums of Jacobi symbols and certain character sums. At first we shall consider sums of Jacobi symbols. For P odd square free we define Hence if we pick m at random from the integers 1, 2,..., P we see that SP(m, t; h) is a random function on [0,1 ]. This is just an intuitive way of saying that we consider them as being defined on the probability space (QP, gP, /iP) where QP is the segment of the first P positive integers, gP is the powerset of QP and fiP(A) = (\/P) card (A) for all subsets A of QP. We denote the probability measure by \iP for obvious reasons. With this convention we have the following extension of a theorem of Kubilius and Linnik [9]. Theorem 4. Let P run through any infinite increasing sequence of odd square free numbers such that for every fixed c^.0 (24) IJ (.-£)-. as P-> oo. (The product is extended over all prime divisors p of P.) Let h~h(P)-+ oo so slowly that log A/log P-+0. Then where W is standard Brownian motion.
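The displayed definition of $S_P(m, t; h)$ did not survive in this copy. As a hedged illustration only, the sketch below assumes the normalization $S_P(m, t; h) = h^{-1/2} \sum_{n \le ht} \left( \frac{m + n}{P} \right)$, with $\left( \frac{\cdot}{P} \right)$ the Jacobi symbol, and tabulates the path at the points $t = k/h$. The names `jacobi` and `jacobi_path` are ours, and the normalization should be checked against [9] before relying on it.

```python
from math import sqrt


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n, via quadratic reciprocity."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            # (2/n) = -1 exactly when n is 3 or 5 modulo 8
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        # reciprocity: the sign flips when both are 3 modulo 4
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def jacobi_path(m: int, P: int, h: int) -> list[float]:
    """Values of the ASSUMED S_P(m, k/h; h) for k = 0, 1, ..., h."""
    path = [0.0]
    total = 0
    for n in range(1, h + 1):
        total += jacobi(m + n, P)
        path.append(total / sqrt(h))
    return path


if __name__ == "__main__":
    P, h = 3 * 5 * 7 * 11 * 13, 50  # an odd square-free modulus, chosen arbitrarily
    print(jacobi_path(m=1, P=P, h=h)[-5:])
```

Choosing $m$ uniformly from $1, \ldots, P$ and plotting `jacobi_path(m, P, h)` for several $m$ gives an informal picture of the random functions that Theorem 4 is about.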
ARITHMETIC FUNCTIONS AND BROWNIAN MOTION 245 In a similar fashion we extend a theorem of Kubilius and Linnik to certain character sums. Theorem 5. Let P run through an infinite increasing sequence of odd square free integers, occurring as basic moduli of primitive characters of order g>2 satisfying (24). Let h be as in Theorem 4. Then the real and the imaginary parts of (2/h)1'2 X Xp(m + n) n^ht converge in distribution towards standard Brownian motion. Here Xp(m) is any primitive character modulo P of order g. The convergence of the finite dimensional distributions has already been established by Kubilius and Linnik (see [10, Chapter 10]). The tightness can be proved by combining their proof with the following maximal inequality, due to Serfling. Lemma 5 (Serfling [14]). Let {XH9n=l92,...} be a sequence of random variables. Suppose that, for v>2, all a^a0 and all n^l, ££]=«+i Xj\v^gvf2(n) where g(n) is nondecreasing, 2g(n)^g(2ri) and g(n+l)/g(n)-+l as n->oo. Then there exists a finite constant M such that, for all a^.a0 and n^.\, a + k max £ Xj k^n j=a+\ ^Mgv/2(n). We prove tightness for the character sums only; the proof for the sums of Jacobi symbols is similar. We shall use the notation of Linnik [10, Chapter 10]. Now,for0^s<t^l, EP Z Xp ("* + *) hsSn^ht =4 I (h/2)^TP(m,s,t;h,x) * m=l 4 h2 For the estimate of n22(P) we apply [10, X.2.7] and obtain /i22(P)<^/i-2(^(r-s)+l)2(l+o(l)) + p-1/4/i-2(^~s)+l)2^(r~s)2, uniformly in 0 ^ s < t^ 1 because if t — s < \/h the sum defining \i12 (P) is empty and because o(l) depends only on the rate of convergence in (24). Hence the left-hand side of (25) is bounded by a constant times (ht — hs)2. We apply Serfling's Lemma 5 with g(u) = u and obtain
246 WALTER PHILIPP EP < max hs<n^hr <{ht-hsf uniformly in 0^s<t^l and hence, by Chebyshev's inequality for any A>0, 1 card <m^P: max hs < n ^ hr ^x{ht-hSyl2}<r4 uniformly in Og*s<t^ 1. This implies tightness (see [2, Theorem 8.4]). References 1. G. Jogesh Babu, On the distribution of additive arithmetical functions of integral polynomials, Technical Report Math.-Stat. # 19/1971, Indian Statistical Institute, 1971. 2. Patrick Billingsley, Convergence of probability measures, Wiley, New York, 1968. MR 38 #1718. 3. , Written communication. 4. , Unpublished manuscript. 5. , Additive functions and Brownian motion, Notices Amer. Math. Soc. 17 (1970), 1050. Abstract #681-A9. 6. H. Davenport and P. Erdos, The distribution of quadratic and higher residues, Publ. Math. Debrecen 2 (1952), 252-265. MR 14, 1063. 7. P. Erdos and M. Kac, The Gaussian law of errors in the theory of additive number theoretic functions, Amer. J. Math. 62 (1940), 738-742. MR 2, 42. 8. J. Kubilius, Probabilistic methods in the theory of numbers, Gos. Izdat. Polit. Naucn. Lit. Litovsk. SSR, Vilna, 1962; English transl., Transl. Math. Monographs, vol. 11, Amer. Math. Soc, Providence, R.I., 1964. MR 26 #3691; MR 28 #3956. 9. J. Kubilius and Ju. V. Linnik, An arithmetic analogue of Brownian motion, Izv. Vyss. Ucebn. Zaved. Matematika 1959, no. 6 (13), 88-95. (Russian) MR 25 #57. 10. Ju. V. Linnik, Ergodic properties of algebraic fields, Izdat. Leningrad. Univ., Leningrad, 1967; English transl., Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 45, Springer- Verlag, New York, 1968. MR 35 #5408; MR 39 #165. 11. Michel Loeve, Probability theory, 3rd ed., Van Nostrand, Princeton, N.J., 1963. MR 34 #3596. 12. Walter Philipp, Mixing sequences of random variables and probabilistic number theory, Mem. Amer. Math. Soc. No. 114 (1971). 13. Ju. V. Prohorov, Convergence of random processes and limit theorems in probability theory, Teor. Verojatnost. i Primenen. 1 (1956), 177-238 =Theor. Probability Appl. 
1 (1956), 106-134. MR 18, 943.
14. R. J. Serfling, Moment inequalities for the maximum cumulative sum, Ann. Math. Statist. 41 (1970), 1227-1234. MR 42 #3835.
15. R. V. Uzdavinis, On the joint distribution of values of additive arithmetic functions of integral polynomials, Trudy Akad. Nauk Litov. SSR Ser. B. 1960, no. 1 (21), 5-29. (Russian) MR 26 #100.

University of Illinois at Urbana-Champaign
BRUN'S METHOD AND THE FUNDAMENTAL LEMMA1 H.-E. RICHERT AND H. HALBERSTAM This paper contains a simple account of Brun's combinatorial sieve method. A less general version of it appeared recently in our note [1] (note, however, that although all the main results are correctly stated, [1] is full of misprints and minor errors). Let si be a finite sequence of integers, and let 9 be a set of primes. In order to sift si by those primes of 9 which are less than z, any "small" sieve method estimates the sifting function S{si; 0>, z) : = \{a: aesi, {a, P(z))= 1}|, where z ^ 2, p{?)-= n p. p<z;pe& and |{...}| denotes the cardinality of the set {...}. There are no significant estimates for all sequences si and all set of primes &\ therefore, one has to introduce some more basic restrictions on si and 0>. Here, we postulate the following conditions, which in most cases are quite natural: the existence of a real number X> 1 and a multiplicative function co such that (Qi) (oW/p^l-l/At for pe^A^U _, (o(p) z (Qii*)) I -^logp^;clog-+,42, 2£w£z,k>0,A2*1, wgp<z;pe& P W AMS 1970 subject classifications. Primary 10H30, 10H20. 1 This is an abstract of a paper to appear in Acta Arithmetica. (C) 1973, American Mathematical Society 247
and such that the "remainders"

$$R_d := \sum_{a\in\mathcal{A};\ a \equiv 0 \bmod d} 1 \;-\; \frac{\omega(d)}{d}\,X$$

satisfy

$$(\mathrm{R})\qquad |R_d| \le K\,\omega(d) \quad\text{for } d \mid P(z),\ K \ge 1.$$

Also, for convenience, we put $\omega(p) = 0$ if $p \notin \mathfrak{P}$, and then

$$V(z) := \prod_{p<z}\left(1 - \frac{\omega(p)}{p}\right).$$

On probabilistic grounds one expects that

$$S(\mathcal{A}; \mathfrak{P}, z) \sim X\,V(z),$$

at least in a certain region of the $X$-$z$-plane. By a "Fundamental Lemma" one understands such a result for $S$, where

$$u := \log X / \log z \quad (\ge 1)$$

is large. The problem is first to find a region which extends as far as possible and, secondly, simultaneously to find a remainder term as sharp as possible for

$$\left|\frac{S(\mathcal{A}; \mathfrak{P}, z)}{X\,V(z)} - 1\right|.$$

We obtain the following

Theorem 1. Suppose that $(\Omega_1)$, $(\Omega_2(\kappa))$ and $(\mathrm{R})$ hold. Then

$$S(\mathcal{A}; \mathfrak{P}, z) = X\,V(z)\,\{1 + O(\exp(-u(\log u - \log\log 3u - \log\kappa - 2))) + O(K\exp(-(\log X)^{1/2}))\}, \tag{1}$$

where the $O$-constants may depend on $\kappa$, $A_1$ and $A_2$ only.

The result, where the interest is attached to the first error term, is superior to all results obtained so far, and it rests completely on Brun's method.

The name "Fundamental Lemma" seems to stem from Barban. Such a result was extensively used by Kubilius in his book [3] (see also [5]), where he gave numerous applications, mainly to additive functions.

For $u$ bounded (and $K \ll \exp((\log X)^{1/2})$), the theorem tells us the well-known fact that

$$S \ll X\,V(z),$$

a result which is often useful in situations where the sieve is used in an auxiliary capacity, and in most cases this is simply quoted as "by Brun's sieve". By Selberg's sieve or the "Large Sieve"
BRUN'S METHOD AND THE FUNDAMENTAL LEMMA 249 we obtain only a weaker result: the first error term in (1) is replaced by O(exp(-4«logtt))(cf.[2]). The Fundamental Lemma also has applications to the study of quasi-primes (cCe.g. Lavrik [4]). We shall call a number q a quasi-prime (relative to a large number x and an arbitrary function u = u(x), tending to infinity arbitrarily slowly) if it has no prime factor less than x1/u. In this connection we deduce from our Theorem 1 the following general result. Theorem 2. Let f\,...,fg be distinct irreducible polynomials with integer coefficients, and let q(p) denote the number of solutions of the congruence fi (n)'' 'fg (n) = 0 mod/?; assume that q (p) <pfor all p. Then the number ofn, 1 ^ n ^ x, such that eachf(n) is a quasi-prime (i= 1,..., g) is equal to y p\ p-i A pJ log** x<l+OF(exp(-w(logw-loglog3w-log0-2))) + 0F where the 0F-constants may depend at most on the coefficients and degrees off, ...,fg. In a later communication we shall deduce from a new version of Brim's sieve, in which (R) is replaced by a weaker condition, an asymptotic formula for the number of primes p^x such that each of f^p) is a quasi-prime. Bibliography 1. H. Halberstam and H.-E. Richert, A new look at Brim's sieve, Bull. Soc. Math. France 25 (1971), 97-106. 2. , Sieve methods, Markham, Chicago, 111. (to appear). 3. J. P. Kubilius, Probabilistic methods in the theory of numbers, Gos. Izdat. Polit. Naucn. Litovsk. SSR, Vilna, 1962; English transl., Transl. Math. Monographs, vol. 11, Amer. Math Soc, Providence, R.I., 1964. MR 26 #3691; MR 28 #3956. 4. A. F. Lavrik, The theory of quasiprime numbers, Dokl. Akad. Nauk SSSR 152 (1963), 544^547 = Soviet Math. Dokl. 4 (1963), 1355-1359. MR 27 #4805. 5. W. Philipp, Mixing sequences of random variables and probabilistic number theory, Mem. Amer. Math. Soc. No. 114(1971). University of Ulm Ulm, Federal Republic of Germany log* University of Nottingham Nottingham, England
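The heuristic $S \sim XV(z)$ behind the Fundamental Lemma can be illustrated numerically in the simplest possible setting. The sketch below is only an illustration under stated assumptions: it takes the sequence to be the integers $1, \ldots, N$, the sifting set to be all primes, $X = N$ and $\omega(p) = 1$, and it assumes the usual product $V(z) = \prod_{p<z}(1 - \omega(p)/p)$. The function names are hypothetical and not taken from the paper.

```python
from math import gcd, prod


def primes_below(z):
    """Return all primes p < z by trial division (adequate for small z)."""
    return [p for p in range(2, z)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def sifting_function(seq, z):
    """S(A; P, z): count the a in seq that are coprime to P(z),
    where P(z) is the product of all primes p < z (assumed sifting set)."""
    p_of_z = prod(primes_below(z))
    return sum(1 for a in seq if gcd(a, p_of_z) == 1)


def expected_main_term(x, z):
    """X * V(z) under the assumption omega(p) = 1,
    i.e. X * prod_{p < z} (1 - 1/p)."""
    v = 1.0
    for p in primes_below(z):
        v *= 1.0 - 1.0 / p
    return x * v


if __name__ == "__main__":
    n, z = 100, 10
    print("S      =", sifting_function(range(1, n + 1), z))
    print("X V(z) =", expected_main_term(n, z))
```

For $N = 100$ and $z = 10$ the sifted set consists of 1 and the primes between 11 and 97, so the count and the main term can be compared directly.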
ESTIMATION OF THE AREA OF THE SMALLEST TRIANGLE OBTAINED BY SELECTING THREE OUT OF n POINTS IN A DISC OF UNIT AREA

K. F. ROTH

AMS 1970 subject classifications. Primary 52A40, 10K35.
¹ I am indebted to Professor Schmidt for having sent me a preprint of his paper.
© 1973, American Mathematical Society

1. Introduction. We endeavour to give a concise and unified account of our recent work [3], [4].

Let

$$P_1, P_2, \ldots, P_n \tag{1.1}$$

(where $n \ge 3$) be a distribution of $n$ points in a (closed) disc of unit area, such that the minimum of the areas of the triangles $P_iP_jP_k$ (taken over $1 \le i < j < k \le n$) assumes its maximum possible value $\Delta = \Delta(n)$. Heilbronn conjectured that $\Delta(n) \ll n^{-2}$ and Paul Erdős (see [1, Appendix]) showed that this result, if true, would be best possible. The first improvement on the trivial $\Delta(n) \ll n^{-1}$ was due to K. F. Roth, who in 1950 proved that

$$\Delta(n) \ll n^{-1}(\log\log n)^{-1/2}. \tag{1.2}$$

There was no further improvement until about 20 years later, when Wolfgang M. Schmidt¹ [2], using a different method, proved

$$\Delta(n) \ll n^{-1}(\log n)^{-1/2}. \tag{1.3}$$

(Actually Schmidt obtained a result containing an explicit constant.) In [3] Roth proved $\Delta(n) \ll n^{-\mu}$, where $\mu = 2 - (.8)^{1/2} = 1.105\ldots$, and in [4] refined his method to yield

$$\Delta(n) \ll n^{-\mu' + \varepsilon}, \tag{1.4}$$
where

$$\mu' = \tfrac{1}{8}\left(17 - (65)^{1/2}\right) = 1.117\ldots\,. \tag{1.5}$$

Schmidt's method (for proving (1.3)) involved the use of 'weighted strips'. Although our method is also based on the use of weighted strips, both the manner and purpose of their application are different. In Schmidt's method they feature in an averaging argument to find many strips of a certain kind with a nonempty intersection, the 'weights' serving to give 'preference' to wider strips; and in our method they are used to construct systems of quasi-orthogonal functions.

The key procedure in our method (cf. Lemma 4 in §4) is to show that if 'nearly all' the members of an appropriate system of thin strips are 'deficient' of points (1.1), then nearly all the members of a derived system of wider strips are similarly deficient (unless $n\Delta(n)$ is small, as is to be proved). Such a result ensues on applying a modified form of Bessel's inequality to a system of quasi-orthogonal functions (constructed from the two systems of strips, suitably weighted) with respect to a function obtained by replacing the points (1.1) by discs.

We require a generalization of Bessel's inequality applicable to systems of quasi-orthogonal functions. Such generalizations, which have proved very fruitful in connection with the large sieve, are discussed in [5, Chapter 1]. The generalized Bessel's inequality we shall use here is due to A. Selberg (see [5, p. 7]); the proof (see [5, p. 8]) of Selberg's elegant result being at least as simple as that of weaker inequalities of the same general nature (some of which would suffice for our purpose).

Selberg's inequality. Let $f, \psi^{(1)}, \psi^{(2)}, \ldots, \psi^{(R)}$ be elements of an inner product space over the complex numbers. Then

$$\sum_{r=1}^{R} \frac{|(f, \psi^{(r)})|^2}{\sum_{s=1}^{R} |(\psi^{(r)}, \psi^{(s)})|} \le \|f\|^2. \tag{1.6}$$
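Selberg's inequality is easy to test numerically in a finite-dimensional space. The following Python sketch is an illustration only: it assumes the standard inner product on $\mathbb{C}^m$ (linear in the first argument), assumes every $\psi^{(r)}$ is non-zero so that the denominators are positive, and uses hypothetical helper names. It evaluates the left-hand side of the inequality so that it can be compared with $\|f\|^2$.

```python
import random


def inner(x, y):
    """Standard inner product on C^m, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def selberg_lhs(f, psis):
    """Left-hand side of Selberg's inequality:
    sum over r of |(f, psi_r)|^2 / sum over s of |(psi_r, psi_s)|.
    Assumes each psi_r is non-zero."""
    total = 0.0
    for psi_r in psis:
        denom = sum(abs(inner(psi_r, psi_s)) for psi_s in psis)
        total += abs(inner(f, psi_r)) ** 2 / denom
    return total


def random_vector(rng, dim):
    """A random complex vector with entries in the unit square (illustrative data)."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    f = random_vector(rng, 6)
    psis = [random_vector(rng, 6) for _ in range(10)]
    print("lhs     =", selberg_lhs(f, psis))
    print("||f||^2 =", inner(f, f).real)
```

When the $\psi^{(r)}$ are orthonormal the denominators equal 1 and the statement reduces to the classical Bessel inequality, with equality if $f$ lies in their span.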
It seems highly probable that systems of quasi-orthogonal functions constructed from weighted strips (but not necessarily used in conjunction with Bessel's inequality) will find other applications to problems concerning distributions of points in Euclidean space.

2. Notation.

Vectors $X$. We use $X = (x, y)$ to denote a point in the Euclidean plane, and write $\int g(X)\,dX$ for $\iint g(x, y)\,dx\,dy$ taken over the entire plane.

Discs $D^*$, $D$. Let

$$D^* = \{X;\ x^2 + y^2 \le (\pi^{-1/2})^2\} \tag{2.1}$$
be the disc containing the set (1.1), and let

$$D = \{X;\ x^2 + y^2 \le 1\}. \tag{2.2}$$

Pairs $\tau$ and $d(\tau)$, $\theta(\tau)$, $T_\tau(w)$. We use $\tau$ to denote a pair $P_i, P_j$ $(i < j)$ of points selected from (1.1) and $d(\tau)$ to denote the distance $P_iP_j$ between the two points constituting $\tau$. If

$$y\cos\theta - x\sin\theta = a \qquad (0 \le \theta < \pi) \tag{2.3}$$

is the line joining $P_i, P_j$, we use $\theta(\tau)$ to denote the inclination $\theta$ of the line (2.3); and, for any $w > 0$, we use $T_\tau(w)$ to denote the strip

$$a - \tfrac12 w \le y\cos\theta - x\sin\theta \le a + \tfrac12 w \tag{2.4}$$

of width $w$ about the line (2.3).

Characteristic functions and counting functions: in particular $D(X)$, $T_\tau(w; X)$, $N_\tau(w)$, $M_\tau(w)$. If $\mathcal{S}$ is any subset of the plane, we use $\mathcal{S}(X)$ to denote the characteristic function of $\mathcal{S}$ (in other words, $\mathcal{S}(X)$ is 1 or 0 according as $X$ does or does not lie in $\mathcal{S}$); in particular, we use $D(X)$, $T_\tau(w; X)$ to denote the characteristic functions of $D$, $T_\tau(w)$ respectively. We use $N(\mathcal{S})$ to denote the number of points (1.1) in $\mathcal{S}$, and introduce the abbreviation $N_\tau(w) = N(T_\tau(w))$; thus

$$N_\tau(w) = \sum_{i=1}^{n} T_\tau(w; P_i). \tag{2.5}$$

We also write

$$M_\tau(w) = w^{-1} N_\tau(w). \tag{2.6}$$

Orthogonal functions $\phi_\tau$. For any $w' > w'' > 0$, we write

$$\phi_\tau(w', w''; X) = (1/w')\,T_\tau(w'; X) - (1/w'')\,T_\tau(w''; X). \tag{2.7}$$

These functions have the following obvious property.

Orthogonality Property. If $\phi^*$, $\phi^{**}$ are any two functions of type (2.7) (corresponding to pairs $\tau^*$, $\tau^{**}$), then $\int \phi^*\phi^{**}\,dX$ is zero whenever it is finite.

Constants. We use $c$ as a generic symbol for a sufficiently large positive absolute constant. The constants implicit in the $\ll$ notation depend at most on $\gamma$ and $\varepsilon$.
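The Orthogonality Property can be made explicit with one elementary fact, used here as the assumed reasoning step: two infinite strips of widths $w_1$, $w_2$ whose directions differ by a non-zero angle $\alpha$ meet in a parallelogram of area $w_1 w_2/|\sin\alpha|$, whatever their offsets. Each of the four cross terms in $\int \phi^*\phi^{**}\,dX$ therefore has absolute value $1/|\sin\alpha|$, and the signs $+,-,-,+$ cancel. The short Python sketch below checks this; the widths, the angle and the function names are hypothetical.

```python
from math import sin


def strip_overlap_area(w1, w2, angle):
    """Area of the parallelogram in which two infinite strips of widths
    w1 and w2 cross, their directions differing by a non-zero angle."""
    return w1 * w2 / abs(sin(angle))


def phi_inner_product(widths_a, widths_b, angle):
    """Integral over the plane of phi* phi**, where each phi has the form
    (1/w') T(w') - (1/w'') T(w'') for its own pair of widths (w', w'')."""
    terms_a = [(1.0 / widths_a[0], widths_a[0]), (-1.0 / widths_a[1], widths_a[1])]
    terms_b = [(1.0 / widths_b[0], widths_b[0]), (-1.0 / widths_b[1], widths_b[1])]
    return sum(ca * cb * strip_overlap_area(wa, wb, angle)
               for ca, wa in terms_a
               for cb, wb in terms_b)


if __name__ == "__main__":
    # Hypothetical widths w' > w'' for two non-parallel pairs.
    print(phi_inner_product((0.20, 0.05), (0.30, 0.01), angle=0.3))
```

For parallel strips the overlap is either empty or of infinite area, which is the case excluded by the words "whenever it is finite".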
3. Structure of the proof of (1.4). We say that the positive number $\gamma$ is admissible if

$$\Delta(t) \ll t^{-\gamma} \quad\text{for all integers } t \ge 3. \tag{3.1}$$

We suppose throughout that the integer $n$ featuring in (1.1) is large, and reserve the symbol $\Delta$ for the value of $\Delta(t)$ when $t = n$. We remark that 1 is obviously admissible in view of the trivial inequality

$$\Delta < (n-2)^{-1} < 2n^{-1}. \tag{3.2}$$

Since $\mu' = \tfrac18(17 - (65)^{1/2})$ is the smaller root of the polynomial $Q(\gamma) = 4\gamma^2 - 17\gamma + 14$, and (for $\gamma \ne 13/4$)

$$(13 - 4\gamma)^{-1} Q(\gamma) = \{(14 - 4\gamma)/(13 - 4\gamma)\} - \gamma,$$

the estimate (1.4) is an immediate consequence of the following result. (As usual, $\varepsilon > 0$ is arbitrarily small.)

Theorem. If $\gamma$ is admissible and $1 \le \gamma \le 1.2$, then $\{(14 - 4\gamma)/(13 - 4\gamma)\} - \varepsilon$ is also admissible.

We remark that (in view of the values of the roots of $Q(\gamma)$) we are entitled to assume

$$\Delta > n^{-1.2} \tag{3.3}$$

when proving the theorem.

We now deduce the theorem from the following two lemmas, which are to be established in §4 and §5; but first we introduce some further notation.

Definition 1. We denote by $B(u; \delta, w)$ the number of pairs $\tau$ for which (cf. (2.6))

$$d(\tau) \le u, \qquad M_\tau(w) \ge \delta n. \tag{3.4}$$

Throughout the text, $l$ denotes the (odd) integer defined by

$$l = 2[\exp\{(\log n)^{2/3}\}] + 1. \tag{3.5}$$

Lemma A. Let $\gamma$ be admissible, $u > \Delta^{1/2}$, $\delta > u\Delta^{-1}n^{-1}l$, and suppose that $w$ satisfies

$$\Delta u^{-1} < w < c^{-1}. \tag{3.6}$$
THE SMALLEST TRIANGLE IN A DISC OF UNIT AREA 255 Then B = B(u;S,w) satisfies (3.7) B<5~2u2 A-3{u-^Al-ii/y)w^2^-l)n-l+E. Lemma B. Suppose that A>l8n~2 and (3.8) A-'l2n-'<v<l-\ Then we have (3.9) E= £ -L>nh, t;(3.10),(3.11)«VT; where the summation is over all pairs x satisfying (3.10) Anv<d{z)<l3v9 (3.11) Mx{n'lv'l/d(r))>l'3n. Deduction of the theorem. We choose v to satisfy (3.12) v = Al'ilM{n'lv'2Y2^'1. In view of (3.3) the v thus defined satisfies (3.8), so that (for appropriate r', r") Lemma B implies (3.13) I £•» i>«2". r = r' t; (3.14), (3.1 5) « lT/ where the conditions of summation for the inner sum £(r) are (3.14) /-r-1<rf(T)^/"r, (3.15) Mt(iT1iT1/r+1)>r4ii; we choose the maximal r' and minimal r" consistent with the requirement that the union of the intervals (3.14) should cover the interval (3.10). We note that (3.16) Y!r)^T^lr+iB{rr;r\n-lv-llr+l). d(z) On estimating the right-hand side of (3.16) by Lemma A (the condition (3.3) ensures that the appropriate premises are satisfied), we see that the inner sum on
256 K. F. ROTH the left-hand side of (3.13) is ^A^^ + A^^in'1^1^'^2'^ n'1 + t Thus (3.13) (in conjunction with (3.12)) yields A3n3~E<^v, and hence the desired inequality A^rfn'il4'4y)fil3'4yK 4. Proof of Lemma A. Lemma 1. Let w>0, 0^a<7c, w^jAu'1. Then, for suitable c, (4.1) D(X) £ Tt{w;X)^cwuA-1 t;(4.2) for every X in the plane; here the summation is over all pairs x satisfying (4.2) d{r)Su, a^0(T)«x + (l/lO) Au'1. Prcx)F. Suppose (4.1) is false. Then there exists a point X0 in D which lies in more than cwuA "1 of the strips Tz(w) with x satisfying (4.2); it is easy to deduce from this (since c is large) that there exist two pairs il5 x2, each satisfying d(x)^u, such that (i) \e(xl)^e(x2)\<(\i\o)Au'\ (ii) the strips Ttl({\/10)Au~l), Tt2((l/10) Au~l) have a common point Xl in D. This is a contradiction, since (i) and (ii) (in conjunction with d(x1)^u) imply that the two points of tx together with (either) one of the points of x2 form a triangle of area less than A. Definition 2. We write, for w'> w" > 0, (4.3) <2>T(w', w"; X) = D{X) 4>T(w', w"; *). Here </>t is the function defined by (2.7), so that (cf. below (2.7)) the functions $ have the following: Orthogonality Property. For given w' > w" > 0, let <PZ*, &+* be two functions of type (4.3). 77z£az these two functions are orthogonal over the plane unless the intersection of Tz*(w'), Tt**(w') and the boundary C ofD is nonempty. In the following three lemmas we use the abbreviation (4.4) <Pt{X) = <Pt{w',w";X). Lemma 2. Suppose that u>0, 0ga<n, \Au~1 ^ w" < w' <c~l. Then, for any pair x0,
THE SMALLEST TRIANGLE IN A DISC OF UNIT AREA 257 (4.5) where (4.6) I t;(4.2) <PjX)<Pz(X)dX <^uA~lZ, Z = min(l, w/|sin(a-0(To))r1). Proof. Let rx, T2 be the two arcs in which the strip Txo(w') intersects the boundary C of D; only those terms corresponding to x in one of the sets Sv (v = 1, 2), consisting of the pairs t for which TX{W) intersects Tv and (4.2) holds, can contribute to the sum on the left-hand side of (4.5). It is easily seen that (for each v= 1, 2) all the strips TZ(W) with re<fv intersect D in regions contained in a single strip Sv of type av — cW < y cos a — x sin a < av + cW. Since |<f>t(X)|g(l/w') Tz(w'; X) + (l/w") T>"; X), it follows from Lemma 1 that X \<Pr(X)\<uA~lSv(X), where SV(X) is the characteristic function of the strip Sv. In view of the trivial estimate j \<PZo{X)\ 5vW dX<^Z, the desired inequality (4.5) follows at once. Lemma 3. Let u>0,%Au~l^w"<w'<c~l. Then, for any pair t0, (4.7) r:d(r)£u h {X)<Pt{X)dX <^u2A 2W \og(uA l). Proof. Write Ik = %z;(48)\$ <Pt0{X)<Pt{X) dX\ for /c=l,..., AT, where #= \_\0nuA ~ l] +1 and the condition of summation is (4.8) </(T)gw, n{k-\) K~l^8{%)<nkK-1. On using Lemma 2 to estimate /*, and then summing over /c, we obtain (4.7). Lemma 4. Suppose y is admissible and the premises of Lemma 3 are satisfied. Tlien (4.9) I UMtfrv')-2Mt(2w")}2«y, t: d(r)^u; (4.10)
258 K. F. ROTH where Y = n(w")~2 {u2A~2W logH-1))max{l,((w#')2J-1)1/y} and the second condition of summation is (4.10) MT(V)>4MT(2w"). Proof. For each i = l, 2,..., n, let g^X) be the function which is 1 or 0 according as X does or does not lie in the disc centre Pt and radius \W. We write /(*)= i 9l(X). i=l Clearly (since jw' > 2w" if (4.10) holds for some t) 1 w J X) f{X)dX*-n$W)2Nx$w')9 1 Tr(w"; X) f(X) dX£—„ 7c(K)2Aft(2W), w so that the left-hand side of (4.9) is majorized by (4.11) (w")-4 X \[*x(X)f(X)dX t;d(T)^u;(4.10) (J Since /(A") counts the number of points Pt in the disc centre X and radius |w", we obtain from the defining property of the distribution (1.1), /(Xj^maxll,^)2^-1)1^}. Combining this with jf(X)dX =nn(\W')2, we have (4.12) J f2{X)dX 4n(Wf max{l, ((w")2^-1)1"}. In view of Selberg's inequality (1.6), the desired estimate for (4.11) now follows at once from (4.7) and (4.12). Completion of the proof of Lemma A. We consider a fixed value u satisfying u>A1'2, and use the abbreviation B(5, w) = B(u; <5, w). Write J =
THE SMALLEST TRIANGLE IN A DISC OF UNIT AREA 259 [(logn)1/2] and, fory = 0, 1,..., /, Wj = w(Au'x w~iy/J. For d(x)^u, the strip Tx(wj)=Tx(Au~l) contains only the two points P, constituting t, so that Nz(Au~l) = 2 and hence JB(5',w,) = 0 for^>2MJ-1n"1. In particular, B (2 " 3 J(5, w7) = 0 since <5 > uA "* n "* /, so that (4.13) B(6, w) = Jt1 {B(2-3^f w,)-5(2-3^+1^, wJ+1)}. Now the summand in (4.13) certainly does not exceed the number Rj of pairs t (satisfying d(r)^u) for which M>;)^2-3^n, Aft(wj+1)<2-3U+1>fe. On applying Lemma 4 (with w' = 2w/, w"=jw/+1) to estimate Rj, the desired estimate (3.7) follows from (4.13). 5. Proof of Lemma B. The number a satisfying 0^a<7r is to remain fixed almost to the end of this section. We use S to denote a strip of type (5.1) a — ^w<y cosa — x sina^a+^w. We use tc: S to express the condition that the two constituent points P, of t lie in 5, and we recall that N(S) denotes the number of points P, in S. Lemma 5. Let S be the strip (5.1) and suppose that 0<2w^m<c_1. Then (5.2) £ l>iN(S)-3u-1, tcS;(5.3) where the second condition of summation is (5.3) \Aw-x^d{<z)<u. Proof. We subdivide S into rectangles by means of a system of lines perpendicular to S and distance \u apart. More than N(S)—\2u~l of the points P{ in S fall into 'good9 rectangles, each containing at least 4 points P,. The pairs t in S for which d(z)<^Aw~l are disjoint, and we destroy these by rejecting from S one of the two constituent points from each such pair. After this operation a typical 'good' rectangle will now contain t^2 points P„ and such rectangles contain
260 K. F. ROTH between them at least jN(S) — 6u~i points P(. On applying the inequality jt(t—l) ^jt to the number of pairs t in the typical rectangle of this kind, we obtain (5.2). Construction of sets srfr. We construct inductively sets jrf0, s/l9 stf2>-~> where (foneach r) the set srfr is the union of certain strips (having width 2/"r) of type (5.4) (2p-l)rr<ycosa-xsina^(2p+l)rr, where p denotes an integer. We use rS to denote any strip of type (5.4) and rS* to denote a constituent strip of srfT. The inductive procedure is as follows. Take jrf0 to consist of the single strip — 1 < y cos a — x sin a ^ 1. If, for r^ 1, ^r_i has already been constructed, the set srfT is obtained from it by first splitting each strip r~ lS* into / strips rS, and then selecting from the resulting strips rS precisely those satisfying N{rS)^{2l)~rn for inclusion in s/r. Clearly each strip r5* is representable as an intersection of type r (5.5) rs*=n j5*, j=o where (5.6) JV(>S*)£(2/)-'n (/ = 0,l,...,r). It is easily verified by induction that the number N(stfr) of points Pt in srfr satisfies (5.7) N«) = ^>N(S)^2-rn, s where ^(r*) extends over the constituent strips S = rS* of srfr. Completion of Proof of Lemma B. Let v satisfy (3.8). We choose the integer r0 to satisfy (5.8) rr°-1<(/2m?)-1grr°. We apply Lemma 5 with u = l3v, w = 2/"ro (permissible in view of (3.2), (3.8)) to obtain
THE SMALLEST TRIANGLE IN A DISC OF UNIT AREA 261 (5.9) X l>$N{S)-3rh-1 tcS;(3.10) for every constituent strip S=roS* of jrfro. We note that every pair t lying in such a strip automatically satisfies (5.10) \s'm(<x-0(T))\<2{lnvd{T)yl, since the right-hand side of (5.10) exceeds w/d(i), where w is the width of the strip. Hence, summing (5.9) over the constituent strips of stfro, and using (5.7) in conjunction with i(2"ron)-/ro(3/-3t;-1)>2/-1n (there are at most lro strips roS*), we obtain (5.11) £<"*> I l>2r1n. S tcS;(3.10),(5.10) We consider a particular pair t counted on the left-hand side of (5.11). We choose rx to satisfy (5.12) rr*-l<(l2nvd(t))-l^rr\ and note that 0< rx <r0 (in view of (3.8), (3.10)). By considering the termor! of the representation (5.5) of the strip roS* containing t, we see that t lies in a strip Su of inclination a and width 2/"ri < 2{lnvd(x))~ \ for which (5.13) NiSj^iliy^n. It is easily verified that (5.10) ensures that the strip Tz(n~lv~ 1/d(z)) completely covers the region DnSl in which St interesects the disc D. Hence (5.13) implies NT(n-lv-l/d(z))^{2l)-rin>r3(n-lv-l/d(T))n. Thus (5.11) implies I \>irln, t;(3.10),(3.11),(5.10) and on integrating this with respect to a over the range 0^a<7r, we obtain (3.9).
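Returning to the Theorem of §3: repeated application of the map $\gamma \mapsto (14 - 4\gamma)/(13 - 4\gamma)$, starting from the trivially admissible $\gamma = 1$, produces an increasing sequence of admissible exponents (up to the arbitrarily small $\varepsilon$) that converges to the smaller root of $4\gamma^2 - 17\gamma + 14$, which is how the exponent $\mu'$ of (1.4) arises. The Python sketch below illustrates this numerically; it drops $\varepsilon$, and the function names and iteration count are arbitrary choices made for the illustration.

```python
from math import sqrt

# The smaller root of Q(gamma) = 4 gamma^2 - 17 gamma + 14.
MU_PRIME = (17 - sqrt(65)) / 8


def improve(gamma):
    """One application of the Theorem of section 3, with epsilon dropped."""
    return (14 - 4 * gamma) / (13 - 4 * gamma)


def iterate_exponents(start=1.0, steps=8):
    """Successive exponents obtained by iterating the map from `start`."""
    values = [start]
    for _ in range(steps):
        values.append(improve(values[-1]))
    return values


if __name__ == "__main__":
    for k, g in enumerate(iterate_exponents()):
        print(k, g)
    print("mu' =", MU_PRIME)
```

Every iterate stays in the range $1 \le \gamma \le 1.2$ required by the Theorem, and the convergence is fast because the derivative of the map at the fixed point is about 0.055.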
References

1. K. F. Roth, On a problem of Heilbronn, J. London Math. Soc. 26 (1951), 198-204. MR 13, 16.
2. Wolfgang M. Schmidt, On a problem of Heilbronn, J. London Math. Soc. (2) 4 (1971/72), 545-550.
3. K. F. Roth, On a problem of Heilbronn. II, Proc. London Math. Soc. (3) 25 (1972), 193-212.
4. K. F. Roth, On a problem of Heilbronn. III, Proc. London Math. Soc. 25 (1972), 543-549.
5. Hugh L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227, Springer-Verlag, Berlin and New York, 1971.

Imperial College of Science and Technology
London SW7, England
EULER PRODUCTS ASSOCIATED WITH BEURLING'S GENERALIZED PRIME NUMBER SYSTEMS

C. RYAVEC

AMS 1970 subject classifications. Primary 10H40, 30A14, 10H05.
© 1973, American Mathematical Society

One of the most fruitful methods employed in the study of the distribution of the primes has been the consideration of those functions, such as the Riemann zeta function, which embody the fundamental theorem of arithmetic. This method (sometimes referred to as the "analytic method") has also been successfully applied to the study of generalized prime number systems, an account of which may be found in [1].

Briefly, a generalized prime number system is a sequence

$$P = \{1 < p_1 \le p_2 \le \cdots\}$$

of real numbers $p_k \to \infty$. The multiplicative semigroup generated by $P$ is called the generalized integers of the system $P$, which we denote by

$$N = \{1 = n_1 < n_2 \le n_3 \le \cdots\}.$$

Note that two integers $n_i$ and $n_j$ of $N$, of possibly equal value, are, nevertheless, to be distinguished if they arise as distinct products of the primes $P$.

The research in generalized number systems has involved considerable use of the zeta functions $\zeta_P(s)$, defined by either the product

$$\prod_{k=1}^{\infty} (1 - p_k^{-s})^{-1}$$

or the series

$$\sum_{k=1}^{\infty} n_k^{-s},$$

wherever the infinite product converges. (It is known that the product and series converge on the same half-plane, possibly empty.)
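The convention that equal values arising from distinct products are counted separately can be made concrete with a small enumeration. The Python sketch below is an illustration only: it lists, with multiplicity, all generalized integers not exceeding a bound for a finite list of generalized primes. The function name and the sample prime lists are hypothetical.

```python
def generalized_integers(primes, bound):
    """Return the sorted list (with multiplicity) of all products of the
    given generalized primes (each > 1) that do not exceed `bound`.
    Each multiset of primes is generated exactly once, so equal values
    coming from distinct products are listed separately."""
    results = []

    def extend(start, value):
        results.append(value)
        # Only use primes with index >= start, so that every product is
        # built with non-decreasing indices and is never duplicated.
        for i in range(start, len(primes)):
            nxt = value * primes[i]
            if nxt <= bound:
                extend(i, nxt)

    extend(0, 1)
    return sorted(results)


if __name__ == "__main__":
    # Ordinary primes 2, 3, 5: the 5-smooth integers up to 10.
    print(generalized_integers([2, 3, 5], 10))
    # Two distinct generalized primes of equal value 2.
    print(generalized_integers([2, 2], 4))
```

With two generalized primes both equal to 2, the value 4 appears three times, once for each of the products $p_1^2$, $p_1p_2$, $p_2^2$.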
We shall be concerned in this summary (all of this work will appear in [3]) with the relationship between $P$ and the natural boundary of $\zeta_P(s)$. We shall use the following setting to describe this relationship. Let $U$ denote the set of real sequences

$$U = \{(u_2, u_3, \ldots) : u_p > p^{-1}\}$$

indexed on the rational primes. Define a "perturbed zeta function" (perturbed by the element $u \in U$) $\zeta(s, u)$ by

$$\zeta(s, u) = \prod_p \left(1 - (u_p p)^{-s}\right)^{-1}$$

whenever the infinite product converges (it is known that if the product converges at some point, then it converges on a half-plane). Further, when $\zeta(s, u)$ does converge at a point, let $\partial(u) = \partial\zeta(s, u)$ denote the natural boundary of $\zeta(s, u)$.

It would be desirable to know completely the influence on $\partial(u)$ of variations of $u$. Although not much in this direction is known, certain special cases can be completely settled (Theorem 1). Also, a number of classes of perturbations (Theorem 3) can be dealt with reasonably well.

In Theorem 1 we characterize the Riemann zeta function essentially in terms of the functional equation and the assumption $\partial(u) = \{1, \infty\}$, where the singularity at $s = 1$ is a simple pole with residue 1.

Theorem 1. Suppose that $\zeta(s, u)$ converges for $\mathrm{Re}(s) > 1$ and that $\zeta(s, u)$ can be continued to $\mathbf{C} - \{1\}$ as

$$\zeta(s, u) = E(s)/(s - 1), \qquad E(1) = 1,$$

where $E(s)$ is an entire function of finite order. Then, if $\zeta(s, u)$ satisfies the functional equation

$$\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s, u) = \pi^{-(1-s)/2}\,\Gamma((1-s)/2)\,\zeta(1-s, u),$$

we have $\zeta(s, u) = \zeta(s)$, the Riemann zeta function.

Thus, we observe that when $\zeta(s, u)$ satisfies the hypotheses of Theorem 1, the factors $\{u_p\}$ simply permute the primes among themselves; e.g., $u_2 = 3/2$, $u_3 = 2/3$, $u_5 = u_7 = \cdots = 1$.

Theorem 1 follows directly from the following result concerning generalized Dirichlet series.

Theorem 2. Let $f(s) = \sum_{k=1}^{\infty} a_k \lambda_k^{-s}$, $a_k \ge 0$, be a general Dirichlet series which converges at some point of $\mathbf{C}$ and which can be continued to $\mathbf{C} - \{1\}$ as $f(s) = E(s)/(s - 1)$, where $E(s)$ is an entire function of finite order, and where $E(1) = 1$ and $E(0) = \tfrac12$.
Let $g(s) = \sum_{j=1}^{\infty} b_j \nu_j^{-s}$ be a general Dirichlet series which converges
absolutely at $s = 2$. Finally, assume that $f(s)$ and $g(s)$ satisfy

$$\pi^{-s/2}\,\Gamma(s/2)\,f(s) = \pi^{-(1-s)/2}\,\Gamma((1-s)/2)\,g(1-s).$$

Then $f(s) = g(s) = \zeta(s)$.

There are a number of papers which have dealt with problems similar to Theorem 2. The closest such result can be found as Corollary 4 on p. 295 of [2]. This result cannot be directly compared to Theorem 2, however, since it does not require the hypothesis $a_k \ge 0$. It does require lacunarity conditions not present in Theorem 2, though.¹

¹ We mention that for any general Dirichlet series $f(s) = \sum a_k \lambda_k^{-s}$, the condition $1 \le \lambda_1 \le \lambda_2 \le \cdots$, $\lambda_k \to \infty$ is always in force.

Comparing Theorem 1 and Theorem 2, it is seen that the assumption of Riemann's functional equation is essential to the conclusion of the latter theorem, but it is not at all clear that this is the case in Theorem 1. It may be that the existence of the Euler product makes both the assumption of the functional equation and the growth condition on $E(s)$ unnecessary; i.e.

Conjecture. Let $\zeta(s, u)$ converge for $\mathrm{Re}(s) > 1$ and satisfy

$$\zeta(s, u) = E(s)/(s - 1), \qquad E(1) = 1,$$

where $E(s)$ is entire. Then $\zeta(s, u) = \zeta(s)$.

We now describe a class of perturbations of $\zeta(s)$ and attempt to determine the extent to which $\zeta(s, u)$ can be analytically continued across its abscissa of convergence. Thus let $q$ denote a large prime. For each residue class $h \pmod q$ choose a real number $r_h$; and for each $p \equiv h \pmod q$, put $u_p = r_h$, subject only to the restriction $u_p p > 1$ for all primes $p$ (the obvious restrictions then hold on $r_h$, $1 \le h \le q$). With such a choice of $u \in U$, we have

Theorem 3. Define a set $E_q(u)$ by

$$E_q(u) = \bigcup_{\rho(\chi),\,n} \{\, s : s = \ldots\ ;\ 0 < x \le 1 \,\} \cup (0, 1],$$

where the first union is over all positive integers $n$ and over all zeros $\rho(\chi) = \beta(\chi) + i\gamma(\chi)$, $\beta(\chi) > 0$, of all $L(s, \chi) \pmod q$. Then with $u$ described above, $\zeta(s, u)$ is analytic on the domain
$$D_q(u) = \{s : \mathrm{Re}(s) > 0\} - E_q(u).$$

Theorem 3 implies that if one perturbs the rational primes in finitely many residue classes, then the resulting Euler product, $\zeta(s, u)$, can be continued to the half-plane $\mathrm{Re}(s) > 0$, with the exception of countably many lines running from the zeros of all $L$-functions to the imaginary axis.

There does not exist at this time any method which can be used to deal with the general perturbation problem. The determination of $\partial(u)$ for functions like

$$\prod_{p > 2} \left(1 - (p - 1)^{-s}\right)^{-1}$$

(here, $u_p = 1 - p^{-1}$, $p > 2$) will probably be very difficult. An example of H. Diamond shows that there exists a $u \in U$ for which $\partial(u) = \{s : \mathrm{Re}(s) = 1\}$ and such that $u_p \to 1$ as $p \to \infty$. This example gives some support to the notion that only special functions $\zeta(s, u)$ can be continued across their abscissae of convergence; for almost all $\zeta(s, u)$ the line of convergence coincides with the natural boundary.

References

1. P. T. Bateman and H. G. Diamond, Asymptotic distribution of Beurling's generalized prime numbers, Studies in Number Theory, Math. Assoc. Amer.; distributed by Prentice-Hall, Englewood Cliffs, N. J., 1969, pp. 152-210. MR 39 #4105.
2. K. Chandrasekharan and S. Mandelbrojt, On Riemann's functional equation, Ann. of Math. (2) 66 (1957), 285-296. MR 19, 635.
3. C. Ryavec, The analytic continuation of Euler products with applications to asymptotic formulae, Illinois J. Math. (to appear).

University of Colorado
SYSTEMATIC EXAMINATION OF LITTLEWOOD'S BOUNDS ON $L(1, \chi)$

DANIEL SHANKS

AMS 1970 subject classifications. Primary 12A25, 12A70.
© 1973, American Mathematical Society

1. Introduction. This investigation was largely conducted in close collaboration with D. H. and Emma Lehmer. My joint paper with them [1] overlaps somewhat with the present paper, but each paper also treats topics not in the other, and to minimize duplication the papers refer to each other for those aspects of the problem.

We confine ourselves to the real characters $\chi_d = (d/n)$ and examine the functions

$$L(s, \chi_d) = \sum_{n=1}^{\infty} \left(\frac{d}{n}\right) n^{-s} = \prod_{q} \left(1 - \left(\frac{d}{q}\right) q^{-s}\right)^{-1} \tag{1}$$

for $s = 1$. If $L(s, \chi_d)$ satisfies the Riemann hypothesis, and $d \ne m^2$, then Littlewood [2] deduces the bounds

$$\left[\{1 + o(1)\}\,(12e^{\gamma}/\pi^2)\,\ln\ln|d|\right]^{-1} < L(1, \chi_d) < \{1 + o(1)\}\,2e^{\gamma}\ln\ln|d|. \tag{2}$$

He gives nothing about the $o(1)$ here, neither its sign nor the manner in which it approaches zero as a function of $d$.

We wish to study the possibility of approaching these bounds or, perhaps, surpassing them, and to obtain a measure for this we temporarily ignore the $o(1)$ and define the upper and lower Littlewood indices by

$$L(1, \chi_d)/2e^{\gamma}\ln\ln|d| = \mathrm{ULI}, \qquad L(1, \chi_d)\,(12/\pi^2)\,e^{\gamma}\ln\ln|d| = \mathrm{LLI}. \tag{3}$$

We will examine, systematically, the possibility of finding $d$ with
$$\mathrm{ULI} \ge 1 \quad\text{or}\quad \mathrm{LLI} \le 1. \tag{4}$$

Littlewood himself [2], followed by Chowla [3], got halfway there by constructing arbitrarily large $|d|$ having

$$\mathrm{ULI} \ge \tfrac12(1 - \varepsilon) \quad\text{or}\quad \mathrm{LLI} \le 2(1 + \varepsilon) \tag{5}$$

for any positive $\varepsilon$. Relative to these constructions (called LC in the following) the question now is whether we can attain the extra factor of 2. If LC obtains a certain large (or small) $L(1, \chi_D)$ for a discriminant $D$, then we would have to obtain a comparable $L(1, \chi_d)$ with

$$\ln\ln|d| = \tfrac12 \ln\ln|D| \quad\text{or}\quad |d| = \exp\left((\ln|D|)^{1/2}\right). \tag{6}$$

Thus, if their $D = 10^{450}$, our $d$ must be the much smaller $d \approx 10^{14}$.

The first step of LC in obtaining a large (or small) $L(1, \chi_D)$ is to select $D$ such that

$$(D/q) = +1 \quad (\text{or } (D/q) = -1) \tag{7}$$

for all primes $q \le$ some $p$. That maximizes (or minimizes) the first $\pi(p)$ factors in the Euler product in (1) for $s = 1$. There are such $D$ by the Chinese Remainder Theorem satisfying

$$D < 4\prod_{q=2}^{p} q = U_p. \tag{8}$$

The bound on the right, $U_p$, and some further construction then yields (5).

But $U_p$ is surely grossly too large since there are, in fact, distinct solutions $D$ of (7), all being less than $U_p$. If one could identify the smallest of these $D$ by some algebraic or analytic technique, one could seek to improve (5) with these smallest $D$. Since no such technique is known, we will compute the smallest $d$ numerically and begin our study with four introductory examples of (3) so computed.

2. Four examples and their computation. In (10a-d) below, we list four $d$, each being the smallest discriminant having a prescribed quadratic character. The characters are designated as follows: $aRp$ ($aNp$) means a positive $d \ne m^2$ of
the form $8k + a$ which is a quadratic residue (nonresidue) of all odd primes $q \le p$. Similarly, $-aRp$ ($-aNp$) is such a negative $d = -(8k + a)$. For each $d$ in (10a-d) we give the class number $h(d)$ of $Q(d^{1/2})$ and, for $d > 0$, the regulator $\ln\varepsilon$. Then $L(1, \chi)$ equals

$$2h(d)\ln\varepsilon/d^{1/2} \quad\text{or}\quad \pi h(d)/(-d)^{1/2}$$

for $d > 0$ or $d < 0$, and the indices are computed by (3).

(10a) $d = 1R139 = 2871842842801$ (prime), $h(d) = 1$, $\ln\varepsilon = 7023729.36$, $L(1, \chi) = 8.28929$, ULI $= 0.6933$.

(10b) $d = 5N139 = 49107823133$ (prime), $h(d) = 1$, $\ln\varepsilon = 18804.68$, $L(1, \chi) = 0.16972$, LLI $= 1.1773$.

(10c) $d = -7R157 = -47375970146951$ (composite), $h(d) = 19213042$, $L(1, \chi) = 8.76934$, ULI $= 0.7136$.

(10d) $d = -3N181 = -30059924764123$ (prime), $h(d) = 296475$, $L(1, \chi) = 0.16988$, LLI $= 1.2637$.

These four (first solution) $d$ are clearly much stronger than the LC constructions $D$ that yield (5). The example (10b) is especially strong; it nearly attains (4). The first $-3N181$ is not quite that strong, but if it had a class number of, say, 230000 instead of its listed $h(d)$, it could well be a violation of the RH, subject to investigation of its factor $\{1 + o(1)\}$.

A brief word about computation. These four $d$, and most of those that follow, were obtained with Lehmer's delay line sieve DLS-157 [4]. This is a specialized computer that determines solutions $N$ of the system of congruences:

$$N \equiv a_q \pmod q \qquad (q = 2, 3, 5, \ldots, 157).$$

If it had not been available, the computation of, say, the first $-3N181$ above on a commercial computer would be incredibly time-consuming and expensive; in a word, impractical. Again, the classical algorithms for computing $h(d)$ and $\varepsilon$ are far too slow for the huge regulator in (10a) and $h(d)$ in (10c), and it was necessary to devise new algorithms for computing $h(d)$ [5] and $\ln\varepsilon$ [6] that are far more efficient. Suffice it to say that without Lehmer's DLS-157 and without these two new algorithms much of the data that follows would have been almost impossible to obtain.
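The arithmetic behind (10a-d) is easy to reproduce. The Python sketch below is an illustration only, with hypothetical function names: it evaluates $L(1, \chi)$ from the class number (and the regulator when $d > 0$), forms the indices of (3), and, as a toy stand-in for the sieve, searches by brute force for the first $-aNp$ solution when $p$ is small. The brute-force search uses Euler's criterion and is in no way Lehmer's DLS-157 method; it is feasible only for small $p$.

```python
from math import exp, log, pi, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler's constant, gamma


def legendre(a, q):
    """Legendre symbol (a|q) for an odd prime q, via Euler's criterion."""
    r = pow(a % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r  # r is 0, 1 or q - 1


def odd_primes_up_to(p):
    """All odd primes q <= p, by trial division."""
    return [q for q in range(3, p + 1, 2)
            if all(q % t for t in range(3, int(q ** 0.5) + 1, 2))]


def first_negative_nonresidue(a, p, k_max=10 ** 6):
    """First d = -(8k + a) that is a nonresidue of every odd prime q <= p,
    i.e. the first solution of the character '-aNp' (brute force)."""
    qs = odd_primes_up_to(p)
    for k in range(k_max):
        d = -(8 * k + a)
        if all(legendre(d, q) == -1 for q in qs):
            return d
    return None


def l_one(d, h, log_eps=None):
    """L(1, chi_d): 2 h ln(eps) / sqrt(d) for d > 0, pi h / sqrt(-d) for d < 0."""
    if d > 0:
        return 2 * h * log_eps / sqrt(d)
    return pi * h / sqrt(-d)


def uli(l_value, d):
    """Upper Littlewood index of (3)."""
    return l_value / (2 * exp(EULER_GAMMA) * log(log(abs(d))))


def lli(l_value, d):
    """Lower Littlewood index of (3)."""
    return l_value * (12 / pi ** 2) * exp(EULER_GAMMA) * log(log(abs(d)))


if __name__ == "__main__":
    d = 2871842842801  # example (10a)
    value = l_one(d, 1, 7023729.36)
    print("L(1, chi) =", value, " ULI =", uli(value, d))
    print("first -3N17:", first_negative_nonresidue(3, 17))
```

For $p$ between 17 and 37 the brute-force search for $-3Np$ stops at $d = -163$, in agreement with the first solutions tabulated in the paper.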
3. Even discriminants. Presently, we will study the variations in the ULI and LLI for all such first solutions of $1Rp$, $5Np$, etc., as $p$ is systematically increased: $p = 3, 5, 7, 11, \ldots$. But these four characters all have odd $d$ and it is desirable to gather more data by examining even $d$ also. For any $N \ne -k^2$ we write

$$L_N(s) = \sum_{m=1}^{\infty} \left(\frac{-4N}{m}\right) m^{-s} \tag{11}$$

for the even $d = -4N$. All even terms $m = 2r$ in (11) vanish. Correspondingly, the leading (and strongest) factor in the Euler product in (1) is now lost since $(d \mid q) = 0$ for $q = 2$. Using Littlewood's analysis for $d = -4N$, everything goes as before except at the very end when these leading factors of 2 or $\tfrac23$ drop off. One therefore has, instead of Littlewood's (2), the stronger result:

$$\left[\{1 + o(1)\}\,(8e^{\gamma}/\pi^2)\,\ln\ln|4N|\right]^{-1} < L_N(1) < \{1 + o(1)\}\,e^{\gamma}\ln\ln|4N|. \tag{12}$$

For even $d$ we therefore modify (3) and define the indices by

$$L_N(1)/e^{\gamma}\ln\ln|4N| = \mathrm{ULI}, \qquad L_N(1)\,(8/\pi^2)\,e^{\gamma}\ln\ln|4N| = \mathrm{LLI}. \tag{13}$$

The bounds (12) are valid for every $N \ne -k^2$, not merely for fundamental discriminants.

Consider $-3R167 = -29772062022491 = -N$. One has $(-N \mid q) = +1$ for $q = 3$ to 167 and $(-N \mid q) = -1$ for $q = 2$. With a discriminant $-4N$, for this $N$, we can "neutralize" the "wrong" character with respect to $q = 2$, and (12) then holds for its $L_N(1)$. In (14a-d) we list four examples analogous to (10a-d). Each has a wrong character for $q = 2$ that is neutralized with a factor of 4. Their indices are now computed by (13) and are seen to be comparable to those in (10a-d). In effect, we simply ignore $q = 2$ by this device and study only the sequence of $(d \mid q)$ for $q = 3, 5, \ldots$.

(14a) $d = 4(-3R167) = -4 \cdot 29772062022491$, $L_N(1) = 4.54327$, ULI $= 0.7333$.
(14b) $d = 4(-7N_{167}) = -4 \cdot 17382121592383$, $L_N(1) = 0.27109$, LLI = 1.3548.
(14c) $d = 4(5R_{163}) = 4 \cdot 4745628949021$, $L_N(1) = 4.30219$, ULI = 0.7063.
(14d) $d = 4(1N_{167}) = 4 \cdot 11571384229697$, $L_N(1) = 0.26008$, LLI = 1.2950.

4. Systematic examination of the LLI. In Table 1 we list the indices LLI for the smallest $d$ having the character $-3N_p$, $5N_p$, $4(-7N_p)$ and $4(1N_p)$ for $p = 3, 5, 7, \ldots$. (The LLI of the examples above are found in Table 1 in the appropriate rows and columns.) The discriminants $d$ themselves, their $h(d)$ and $L(1, \chi_d)$, are not given in Table 1 but can be found in the tables in [1] and [7]. This is what we observe in Table 1:

(a) All LLI listed are far stronger for these smallest $d$ than for the LC construction in (5).
(b) If we set aside the smaller $d$, those for $p < 50$, we see a certain uniformity here; the LLI are essentially equal, on the average, for all four characters, and appear to remain stable, on the average (or change only very slowly), as $p$ increases.
(c) For these $50 < p \le 181$, the average LLI is about $1\tfrac{1}{3}$ and the fluctuations take us up to 1.528 for the weak $4(1N_{83})$ and down to 1.177 for the very strong example (10b).
(d) The $d = -3N_p$ for $p = 17$ thru 37 is the famous $-163$ and its startling LLI = 0.8675 would imply that $\sum (-163 \mid n)\, n^{-s}$ violates the Riemann hypothesis were it not for its factor $\{1 + o(1)\}$. For the present, we will assume that this factor saves the day (since 163 is quite small) but we must return to this $\{1 + o(1)\}$ problem later. Similarly, the LLI shown for the even smaller $d = -28 = 4(-7N_5)$ and $d = 68 = 4(1N_{11})$ are (temporarily) discounted.
(e) With this dubious $d = -163$ excepted, we see no indications here for violations of the RH. We are making a real effort here to obtain cases of LLI < 1 but they do not appear (for large $d$); the strongest examples such as $5N_{139}$ press towards the bound, but do not cross it.

5. Systematic examination of the ULI.
In Table 2 we list the ULI for the characters $1R_p$ ($\ne m^2$), $-7R_p$, $4(5R_p)$, and $4(-3R_p)$. The ULI behave quite differently from the LLI.

(a) For $p < 13$, the ULI can even be weaker than (5) but they increase rapidly with $p$ and become distinctly stronger.
(b) Quite unlike point (b) of § 4, the growth of the ULI is very obvious as are the differences among the four characters, especially the outer two.
Table 1. LLI for first discriminant of the character.

  p     -3N_p     5N_p    4(-7N_p)  4(1N_p)
  3    1.6855   0.4436   1.0317   1.0560
  5    1.3744   1.6125   1.0317   1.0560
  7    1.3744   1.3880   1.8407   1.0560
 11    1.1937   1.3880   1.6717   1.0560
 13    1.1937   1.2467   1.4888   1.7017
 17    0.8675   1.1377   1.4565   1.5780
 19    0.8675   1.2470   1.1671   1.5108
 23    0.8675   1.2470   1.6268   1.4011
 29    0.8675   1.2908   1.4350   1.4011
 31    0.8675   1.3876   1.3874   1.1893
 37    0.8675   1.3876   1.4031   1.1893
 41    1.3002   1.3876   1.4031   1.4815
 43    1.3002   1.3249   1.3838   1.5256
 47    1.2315   1.3593   1.3838   1.3750
 53    1.2617   1.3593   1.2898   1.4138
 59    1.2617   1.3593   1.2898   1.4194
 61    1.3058   1.1855   1.2898   1.3409
 67    1.3944   1.3144   1.2607   1.3409
 71    1.3269   1.4284   1.2607   1.3042
 73    1.3423   1.4220   1.2607   1.3042
 79    1.3423   1.4220   1.3514   1.2411
 83    1.2869   1.3633   1.2979   1.5283
 89    1.2832   1.3633   1.2979   1.4297
 97    1.2832   1.3633   1.2979   1.4297
101    1.2832   1.2210   1.4066   1.3877
103    1.2832   1.2210   1.3432   1.3877
107    1.2974   1.2809   1.3454   1.3877
109    1.3182   1.2809   1.3303   1.3877
113    1.3182   1.2809   1.3303   1.3877
127    1.2422   1.2243   1.3303   1.4173
131    1.3604   1.2176   1.3248   1.3541
137    1.3604   1.1773   1.3248   1.3541
139    1.3114   1.1773   1.3130   1.3279
149    1.3422   1.2393   1.3555   1.3010
151    1.3422   1.2393   1.3555   1.3010
157    1.3422   1.2393   1.3555   1.3343
163    1.3422   1.2393   1.3555   1.3343
167    1.3223   1.3433   1.3548   1.2950
173    1.2789   1.2846
179    1.2637
181    1.2637

Average LLI for p > 50
       1.3096   1.2899   1.3218   1.3629
[Table 2. ULI for the characters listed in § 5. The table was printed sideways and its entries are not recoverable in this copy.]
[Figure 1. ULI (vertical axis, about 0.4 to 0.7) plotted against $p$ (horizontal axis, $p = 5$ to 167) for the first $1R_p$ and the first $4(-3R_p)$.]

In Figure 1 we show this difference graphically. The ULI for $1R_p$ (the so-called "pseudosquares") start very low, increase rapidly and smoothly with $p$, and only become ragged as $p$ exceeds 100 and ULI approaches 0.7. Those for $4(-3R_p)$
start much higher, increase slowly and exhibit much greater fluctuations. The two intermediate characters, not shown in Figure 1, behave intermediately; they start at an intermediate level, increase at an intermediate rate, and have an intermediate amount of raggedness.

A qualitative explanation of this behavior is based upon the relation of these characters to the perfect squares, the principal characters. All squares not divisible by any prime $\le p$ are $1R_p$. For $1R_p$, the $S_p$ solutions (9) will therefore include not only the pseudosquares, $1R_p$ ($\ne m^2$), but also many perfect squares. Thus, the first pseudosquare will appear very late, especially for smaller $p$. Thus $1R_3 = 73 > U_3 = 24$, $1R_5 = 241 > U_5 = 120$, $1R_7 = 1009 > U_7 = 840$; and while $1R_{11} = 2641 < U_{11}$, it is larger than the first 11 solutions: $1^2, 13^2, 17^2, \ldots, 47^2$. For $1R_p$, $\ln\ln d$ is therefore correspondingly large and ULI is correspondingly small. As $p$ increases, this competition with the perfect squares slowly decreases.

The sets of $S_p$ solutions for $-7R_p$ and for $5R_p$ are obtained from that for $1R_p$ by, respectively, the sets $\{1R_p - U_p\}$ and $\{1R_p \pm \tfrac{1}{2}U_p\}$ and so are not distributed uniformly in $U_p$ but are both biased towards the second half of $U_p$ as a reflection of the many small squares in $1R_p$. Their first solutions are therefore also delayed ([7, p. 435], [1]) but this effect diminishes with increasing $p$ more rapidly than the corresponding effect for $1R_p$. Finally, $-3R_p$ differs from a square in two ways, being both negative and wrong for $q = 2$. Its delay is therefore relatively small and is relatively quickly dissipated with increasing $p$. These differences are also reflected in the fact that while $1R_{157}$ and $-3R_{173}$ are nearly the same size, the second is a valid solution for four extra values of $q$: 157, 163, 167, 173.

For large $p$, and therefore large $d$, these strong effects of the perfect squares will dissipate as the squares become less dense.
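The small pseudosquares quoted above are easy to reproduce by brute force. The sketch below is only illustrative (the function names are hypothetical, and a plain search like this is nothing like the delay line sieve used for the large cases): it scans $n \equiv 1 \pmod 8$, skips perfect squares, and keeps the first $n$ that is a quadratic residue of every odd prime $q \le p$, tested with Euler's criterion.

```python
from math import isqrt


def is_residue(n, q):
    """True if n is a nonzero quadratic residue modulo the odd prime q."""
    return pow(n, (q - 1) // 2, q) == 1


def first_pseudosquare(odd_primes):
    """Smallest non-square n = 8k + 1 that is a residue of every listed prime."""
    n = 1
    while True:
        n += 8
        if isqrt(n) ** 2 == n:
            continue  # perfect squares are excluded by definition
        if all(is_residue(n, q) for q in odd_primes):
            return n


for primes in ([3], [3, 5], [3, 5, 7], [3, 5, 7, 11]):
    print(primes[-1], first_pseudosquare(primes))
```

Removing the perfect-square test shows the competition described in the text: the search then stops at a square such as $25$ or $13^2$ long before it reaches the pseudosquare.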
Thus, we can anticipate that the differences noted, caused by differing relations to the principal characters, will largely disappear. For $p$, say $\approx 300$-$400$, one would expect a common average ULI of about 3/4 and sizable fluctuations around this average. In a word, we can expect that the ULI will then be a mirror-image of the LLI and that the different behaviors noted in § 4(b) and § 5(b) will vanish.

6. Conclusions from this first experiment. Setting aside the two complications, the $\{1 + o(1)\}$ factor and the strong effect of the squares just discussed, the indices for the first solution $d$ behave fairly uniformly; they are consistently stronger than those of LC (5) but show no sign of ever violating the indicated bounds.

For very large $p$ and $d$, far beyond our data, it is likely that the observed average LLI $\approx 4/3$ and anticipated ULI $\approx 3/4$ will very slowly deteriorate and sink back towards the LC values. The LC bound on $D$ is actually greater than the $U_p$ of (8); it is [2, p. 369]

(15) $|D| < p^4 U_p$.

On the average, our first solution should be the much smaller:
(16) $|d| \approx U_p/S_p \approx 2^{\pi(p)}\, 2e^{\gamma}\ln p$.

But the ratio

(17) $\ln\ln|d| \,/\, \ln\ln|D|$

for (15) and (16) nonetheless very slowly increases to 1. It is likely that the fluctuations in the indices around these deteriorating averages will simultaneously slowly increase and that $d$ with strong indices will therefore continue to appear.

7. Lochamps and hichamps. The first solutions of (7) do not necessarily have the strongest indices. They do have minimal values of $\ln\ln|d|$ but their $L(1, \chi)$ need not be the most extreme since the character $(d \mid q)$ has only been forced thru $q = p$ and floats freely for subsequent $q$. Since we seek to approach or pass the bounds (2) and (12), we will therefore seek (to a limited extent) to locate the strongest possible examples.

Suppose $N > 0$, $d = -4N$ in (11). If

(18) $L_N(1) < L_n(1)$ (all $0 < n < N$),

we say $L_N(1)$ is a lochamp. If

(19) $L_N(1) > L_n(1)$ (all $0 < n < N$),

we say $L_N(1)$ is a hichamp. Similarly, there will be a sequence of lochamps and hichamps for positive discriminants $d = 4M$, $M > 0$. We include odd discriminants $-N$ in the tables by the use of their multiples $d = -4N$, and $L_N(1)$ instead of $L(1, \chi)$, in order to obtain a uniform sequence. It is clear that no indices can be stronger than those for these champions, and if any indices approach or pass the bounds we would find them here.

Table 3 shows the sequence of negative discriminant lochamps thru $N \le 50000$. Each $L_N(1)$ there thru $L_{47338}(1)$ satisfies (18). But for $N > 50000$ it was not possible to examine every $N$ and below the heavy line in Table 3 the $L_N(1)$ shown are merely tentative, that is, they are smaller than any $L_n(1)$, $0 < n < N$, that has come to my attention. For the positive discriminant lochamps in Table 4 the heavy line represents $M = 2000$. The entries in these tables come from several sources including calculations of the Lehmers, of myself, and from an unpublished table of $L_N(1)$, $-2000 < N < 50000$, due to Mohan Lal.
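Definition (18) can be exercised on a small range with the classical, slow method: count the reduced primitive forms of discriminant $-4N$ to get $h(-4N)$, then use Dirichlet's formula $L_N(1) = 2\pi h/(w\sqrt{4N})$ with $w = 4$ for $N = 1$ and $w = 2$ otherwise. This is a sketch under those assumptions, not the efficient algorithms of [5] and [6]; the function names are hypothetical, and the comparison is done exactly on $(L_N(1)/\pi)^2 = h^2/(w^2 N)$ so that ties such as $L_{28}(1) = L_7(1)$ are not misjudged by floating point.

```python
from fractions import Fraction
from math import gcd


def class_number(disc):
    """Number of reduced primitive forms (a, b, c) with b^2 - 4ac = disc < 0."""
    assert disc < 0 and disc % 4 in (0, 1)
    h = 0
    a = 1
    while 3 * a * a <= -disc:          # reduced forms have a <= sqrt(|disc|/3)
        for b in range(-a + 1, a + 1):  # -a < b <= a
            if (b * b - disc) % (4 * a):
                continue
            c = (b * b - disc) // (4 * a)
            if c < a or (c == a and b < 0):
                continue                # not reduced
            if gcd(gcd(a, b), c) == 1:  # primitive forms only
                h += 1
        a += 1
    return h


def lochamps(limit):
    """All N <= limit whose L_N(1) is strictly below every earlier L_n(1)."""
    records, best = [], None
    for n in range(1, limit + 1):
        w = 4 if n == 1 else 2          # number of units for discriminant -4n
        h = class_number(-4 * n)
        key = Fraction(h * h, w * w * n)  # equals (L_n(1) / pi)^2 exactly
        if best is None or key < best:
            records.append(n)
            best = key
    return records


print(lochamps(200))
```

The value $N = 1$ qualifies vacuously under (18); the later records in this range agree with the first rows of Table 3.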
Prior to the $N = 163$ in Table 3 we see the well-known, very strong $N = 58$, and following 163 no smaller $L_N(1)$ appears until $N = 4687$. Only at $N = 30493$ does an appreciably smaller $L_N(1)$ develop. The case $N = 991027$, with $h(-N) = 63$, was
Table 3. Lochamps, -4N = Discriminant.

            N    L_N(1)     LLI
            7   0.59371   1.0317
           37   0.51647   1.1996
           58   0.41251   1.0094
          163   0.36910   0.8675
         4687   0.36711   1.2117
        30178   0.36169   1.2844
        30493   0.34182   1.2142
        47338   0.33210   1.1974
 -------------------------------
       222643   0.32957   1.1946
       546067   0.32523   1.2119
       991027   0.29822   1.1302
    393292183   0.29449   1.2979
    481022602   0.28577   1.2634
   1970364883   0.28398   1.2560
   2426489587   0.27982   1.2415
   3416131987   0.27227   1.2142
   8864190043   0.26983   1.2198
  71837718283   0.26731   1.2422
  85702502803   0.26172   1.2188
 569078186623   0.25346   1.2252

Table 4. Lochamps, 4M = Discriminant.

             M   L_{-M}(1)    LLI
             2   0.62323   0.6587
            17   0.50804   1.0560
           167   0.45014   1.2168
           227   0.40578   1.1239
           362   0.38245   1.0959
           398   0.33494   0.9660
 --------------------------------
        679733   0.33492   1.2550
       2004917   0.30698   1.1855
      41941577   0.29228   1.2411
      77891897   0.28949   1.2426
     261153673   0.28533   1.2210
    9447241877   0.27058   1.2243
   19553206613   0.26644   1.2176
   49107823133   0.25457   1.1773
 4813372912697   0.25094   1.2392
[Tables 5 and 6 (hichamps). These pages were printed sideways and their entries are not recoverable in this copy.]
Hichamps, 4M = Discriminant (continued).

              M   L_{-M}(1)    ULI
         257371    3.0156   0.6443
         294694    3.0736   0.6543
         584791    3.1077   0.6497
         969406    3.1128   0.6427
        1138999    3.1509   0.6480
        1234531    3.1841   0.6536
        3462229    3.2644   0.6546
        6810301    3.3194   0.6562
       10073779    3.3597   0.6589
       10393111    3.4098   0.6683
       39136549    3.4616   0.6616
       43030381    3.4762   0.6633
      100041439    3.5179   0.6615
      249623581    3.6001   0.6668
     1169755141    3.6343   0.6576
     1272463669    3.7146   0.6713
     2055693949    3.7496   0.6730
     5959962661    3.8389   0.6792
     7209891781    3.9018   0.6885
    30116328181    3.9041   0.6767
    78073081381    4.1608   0.7131
  4745628949021    4.3022   0.7063
 11256755665549    4.3598   0.7099

discovered by the Lehmers and is exceptionally strong. In Table 4 we find another case of LLI < 1 at $d = 4 \cdot 398$. (Everyone knows of $Q((-163)^{1/2})$ but almost no one knew that $Q(398^{1/2})$ was nearly as strong.)

In Tables 3 and 4 the tentative lochamps having $N > 50000$ and $M > 2000$ both have an average value of LLI of about 1.22. In a word, we are trying harder than in Table 1 and so are getting indices closer to their presumed bound. The corresponding hichamps in Tables 5 and 6 that are not already in Table 2 are also somewhat stronger but are clearly also markedly affected by the presence of the squares, as discussed above. Some of the tentative hichamps in Table 6 were extracted from Beach's and Williams' table [8] of $(M)^{1/2}$ having exceptionally long continued fractions.

The results of this second experiment confirm those of the first; by trying harder we press a little closer to the bounds but do not pass them except for $d = -163$ and $d = 4 \cdot 398$. We now return to the postponed problem of $\{1 + o(1)\}$ and give it a partial treatment.

8. Partial analysis of $\{1 + o(1)\}$ and conclusions. Clearly, the next order of business would be to determine if the $o(1)$ on the left sides of (2) and (12) are positive and sufficiently large for $d = -163$ and $d = 4 \cdot 398$ so that the bounds shown are valid. Otherwise, their $L$ functions violate the Riemann hypothesis. Unfortunately,
many complicated terms enter into these $o(1)$ and no such unequivocal determination is now available. Nonetheless, it is desirable to show that the two leading and simplest approximations that were made are of the correct sign and magnitude so that they alone could account for these apparent violations.

Littlewood's (2), prior to the two approximations alluded to, could be written as

(20) $[\{1 + o(1)\}\,B(x)]^{-1} < L(1, \chi_d) < \{1 + o(1)\}\,A(x)$,

where

(21) $B(x) = \exp \sum_{p^m \le x} (-1)^{m+1}/(m p^m)$, $\quad A(x) = \exp \sum_{p^m \le x} 1/(m p^m)$,

and

(22) $x = (\ln|d|)^{2(1+4\varepsilon)}$, $\quad \varepsilon > 0$.

An integrand in the analysis [2, p. 365] includes the factor

(23) $|L'(\tfrac{1}{2} + \varepsilon + i\eta)/L(\tfrac{1}{2} + \varepsilon + i\eta)|$,

and the $o(1)$ in (20) depend upon our choice of $\varepsilon$. Let us define $a(x)$ and $b(x)$ by writing

[equation (24) is missing from this copy]

As $x \to \infty$, $a(x)/x^{1/2}\ln x$ and $b(x)/x^{1/2}\ln x \to 0$ and the first approximation is their replacement by 0. The second approximation sets the $\varepsilon$ of (22) equal to 0 and so the left side of (20) becomes

(25) $[\{1 + o(1)\}\,12\pi^{-2}e^{\gamma}\ln\ln|d|]^{-1}$.

Now, in all of our examples above we had $|d| < 4 \cdot 10^{14}$, and setting $\varepsilon = 0$ in (22) we obtain $x < 1200$. This is sufficiently small that one can easily compute $b(x)$ and $a(x)$ exactly. We find that throughout this range $b(x)$ is positive and fairly stable, remaining mostly between 1 and 2. (We also find that $a(x)$ changes sign frequently and is usually much smaller, but do not need that now.) Therefore, $B(x) > 6\pi^{-2}e^{\gamma}\ln x$. This is in the correct direction to absolve $d = -163$ and $4 \cdot 398$, and the difference involved is sufficient to account for the latter's apparent misdemeanor: LLI = 0.966.
But for $d = -163$, if we set $\varepsilon = 0$, we get $x = (\ln 163)^2 = 25.9463$, $B(x) = 3.7601$, $6\pi^{-2}e^{\gamma}\ln x = 3.4853$, and even the smaller $B(x)^{-1}$ exceeds $L(1, \chi) = \pi/163^{1/2}$. However, one cannot allow $\varepsilon$ to approach 0 too closely for the small $|d| = 163$ without losing control over the other approximations leading to the $o(1)$ in (20). It happens that even a quite small $\varepsilon$ in (22) will suffice to obtain an $x$ with $B(x)^{-1} < \pi/163^{1/2}$. This is because an increasing $x$ will soon encounter the odd powers of primes $p^m = 27, 29, 31$ and thereby yield a $B(x) = 4.0695$, whereas, at the earlier square $p^2 = 25$, $B(x)$ had actually decreased from 3.8360.

That is as far as we will go here. While that leaves it open whether $-163$ does or does not violate the lower bound, there is enough here, in the correct direction, that we now have no real reason to believe that it does.

We have sought, in two different ways, to exceed the bounds (2) and (12), but with an improbable exception at $d = -163$ we find that we cannot. Our approach has not been at all hit-and-miss but, instead, very systematic. The resulting ULI and LLI are quite uniform and clearly relate to these bounds. All of our strongest cases, such as $d = -991027$ and $d$ = first $5N_{139}$, press against the bounds. Our tentative lochamps had LLI = 1.22. The simplest interpretation of all this persistent behavior is that the extended Riemann hypothesis is true. Of course, that is no proof, not even for a single $d$. Any heuristic conclusion is somewhat subjective and I should add that I, personally, regard this as fairly strong evidence. Heuristic reasoning, unlike deductive reasoning, is influenced by collateral evidence. There was considerable evidence for the ERH, of several sorts, prior to this work and that can only strengthen our assessment of the present data.

Suppose we did find a clear violation. We would then know that there were non-Riemannian zeros for that $d$ and we could even give a lower bound for their real parts.
If, in place of (23), we were forced out to $|L'(\theta + \varepsilon + i\eta)/L(\theta + \varepsilon + i\eta)|$ because of zeros at $\theta + it$, then (22) would be replaced by

(26) $x = (\ln|d|)^{(1+4\varepsilon)/(1-\theta)}$,

and the famous factor of 2 in the bounds would be replaced by the larger factor $1/(1-\theta)$. Littlewood does not give (26) but he writes [2, p. 371], "Hypothesis X, without modification, is essential in proving Theorem 1" [that is, in proving (2)]. I presume that the need for an enlarged (26) is what he had in mind.
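The values of $B(x)$ quoted in § 8 for $d = -163$ follow directly from definition (21) and can be recomputed as below. This is a minimal sketch with hypothetical function names; it simply enumerates the prime powers $p^m \le x$ with a small sieve.

```python
from math import exp, log


def prime_powers_upto(x):
    """Yield (p, m) for every prime power p**m <= x."""
    limit = int(x)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for multiple in range(p * p, limit + 1, p):
            is_prime[multiple] = False
        power, m = p, 1
        while power <= x:
            yield p, m
            power *= p
            m += 1


def big_b(x):
    """B(x) of (21): exp of the alternating sum over prime powers."""
    return exp(sum((-1) ** (m + 1) / (m * p**m) for p, m in prime_powers_upto(x)))


def big_a(x):
    """A(x) of (21): exp of the sum of 1/(m p^m) over prime powers."""
    return exp(sum(1 / (m * p**m) for p, m in prime_powers_upto(x)))


x = log(163) ** 2
print(round(x, 4), round(big_b(x), 4))   # epsilon = 0 in (22)
print(round(big_b(24.5), 4))             # just before the square 25
print(round(big_b(31), 4))               # after 27, 29 and 31
```

The drop at $x = 25$ and the recovery after 27, 29, 31 are exactly the effect described in the text: even powers enter $B(x)$ with a negative sign, odd powers with a positive one.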
References

1. D. H. Lehmer, Emma Lehmer and Daniel Shanks, Integer sequences. II, Math. Comp. (to appear).
2. J. E. Littlewood, On the class-number of the corpus $P(\sqrt{-k})$, Proc. London Math. Soc. 28 (1928), 358-372.
3. S. Chowla, On the class-number of the corpus $P(\sqrt{-k})$, Proc. Nat. Inst. Sci. India 13 (1947), 197-200. MR 10, 285.
4. D. H. Lehmer, An announcement concerning the Delay Line Sieve DLS-127, Math. Comp. 20 (1966), 645-646. This was subsequently modified to the DLS-157.
5. Daniel Shanks, Class number, a theory of factorization, and genera, Proc. Sympos. Pure Math., vol. 20, Amer. Math. Soc., Providence, R.I., 1971, pp. 415-440.
6. Daniel Shanks, The infrastructure of a real quadratic field and its applications, Proceedings of Boulder Symposium, August 1972, University of Colorado, 1972.
7. D. H. Lehmer, Emma Lehmer and Daniel Shanks, Integer sequences having prescribed quadratic character, Math. Comp. 24 (1970), 433-451. MR 42 #5889.
8. B. D. Beach and H. C. Williams, Some computer results on periodic continued fractions, Proc. Second Louisiana Conference on Combinatorics, Graph Theory and Computing, Baton Rouge, 1971, pp. 133-146.

Naval Ship Research and Development Center
Bethesda, Maryland
ON THE RIEMANN HYPOTHESIS IN HYPERELLIPTIC FUNCTION FIELDS

H. M. STARK

1. Introduction. Let $k_q$ be the finite field of $q$ elements where $q$ is a power of the characteristic of the field which we denote by $p$. Let $f(x) = a_0x^n + \cdots + a_n$, $a_0 \ne 0$, be a polynomial with coefficients in $k_q$ which has no double roots. Let $N(q)$ be the number of points on the hyperelliptic curve

(1) $y^2 = f(x)$,

including the points at infinity, defined over $k_q$. The Riemann hypothesis for the curve (1), first conjectured by Artin [1] when $q = p$, is equivalent to the estimate,

(2) $|N(q^j) - q^j - 1| \le 2g\,(q^j)^{1/2}$, $\quad j = 1, 2, \ldots$,

where $g = (n-1)/2$ if $n$ is odd, $= (n-2)/2$ if $n$ is even, is the genus of the curve (1). With $q$ arbitrary, it suffices to treat the case $j = 1$ of (2). For $n = 1, 2$ we have $g = 0$ and the result of (2) is exact and trivial. For $n = 3, 4$ we have $g = 1$ and (2) was first proved by Hasse [2]. The result (2) for $n \ge 5$ ($g > 1$) was proved by Weil [5] as a special case of the Riemann hypothesis for curves in general.

AMS 1970 subject classifications. Primary 10F35, 14G10; Secondary 10B15.
© 1973, American Mathematical Society
From the point of view of number theory, one wants to count the solutions to (1) with $x$ and $y$ in $k_q$. Let $N = N_q$ be the number of pairs $(x, y)$ in $k_q$ satisfying (1). Then $N(q) - N_q$ is the number of points at infinity defined over $k_q$ and this is 1 if $n$ is odd, 2 if $n$ is even and $a_0$ is a square in $k_q$, and 0 if $n$ is even and $a_0$ is not a square in $k_q$. Thus (2) splits into three cases:

(3) $|N - q| \le 2g\,q^{1/2}$ ($n$ odd),
(4) $|N - q + 1| \le 2g\,q^{1/2}$ ($n$ even and $a_0$ a square in $k_q$),
(5) $|N - q - 1| \le 2g\,q^{1/2}$ ($n$ even and $a_0$ not a square in $k_q$).

In the case that $n = 3$, an elementary proof of (3) was first given in 1956 by Manin [3], but his proof is modeled upon Hasse's and cannot be carried over to $n \ge 5$ (there is also an error in the proof but it is repairable). For $n \ge 3$, $n$ odd, Stepanov [4] introduced an entirely new method three years ago which for the first time approached (3) by methods from number theory. He showed that if $q = p > 9n^2$ and $n \ge 3$ is odd, then

(6) $|N - p| < n(3n)^{1/2}p^{1/2}$.

His method is in spirit closely allied with methods in diophantine approximation and transcendence in that he creates an auxiliary polynomial having high order zeros at each of several points of interest. The major problem in the method then turns out to be the necessity of showing that the auxiliary polynomial is not identically zero. Stepanov's proof that his auxiliary polynomial is not identically zero is not valid for the optimum choice of variables (nor for $n$ even) and surprisingly we find, for the optimum choice of variables, that if the auxiliary polynomial is not identically zero, then for any $n \ge 3$ and $q = p$ we would get

(7) $|N - p| < c\,g\,p^{1/2}$,

where $c$ is independent of $n$ and $p$ (but greater than 2). It is the purpose of this paper to modify Stepanov's method enough to prove (3) when $q = p$ and $n$ is odd and at least prove (7) when $n > 2$ is even with $c < 3$. Our main result is

Theorem 1. Suppose that $f(x)$ has exactly $n_0$ zeros in $k_p$.
If $p \ge 5$, $p^{1/2} \ge 2g - 1$ and $m$ is an integer such that $1 \le m \le p^{1/2}$, then

(8) $|N - p| \le (n-1)(m + g/2) + \{(g/2)(p - n_0) - (g^2/4)(n-1)\}(m + g/2)^{-1}$.
For $n = 1$, the right side of (8) is zero which agrees with (3). For $n$ odd, $n \ge 3$, we have $n - 1 = 2g$ and with a real variable $y = m + g/2$, the right side of (8) is less than ($n_0 > 0$) or equal to ($n_0 = 0$),

$h(y) = 2g\{y + ((p - g^2)/4)\,y^{-1}\}$.

For $y = \tfrac{1}{2}p^{1/2} \pm g/2$ we have $h(y) = 2g\,p^{1/2}$, and hence if $m$ is chosen (as it may be since $p \ge 5$) so that $m \ge 1$ and

$\tfrac{1}{2}p^{1/2} - g/2 \le m + g/2 \le \tfrac{1}{2}p^{1/2} + g/2$,

we get (3). The restriction, $p^{1/2} \ge 2g - 1$, is harmless since (3) is trivial if $p^{1/2} < 2g - 1$. In fact, we see that the Riemann hypothesis is not best possible in that there are $n$ and $p$ for which the right-hand side of (8) is always less than $[2g\,p^{1/2}]$ for all $f$ (and when $2g\,p^{1/2} < p$). For example, suppose $n = 5$ ($g = 2$) and $p = 4y^2 + 1$ where $y \ge 2$. Then $[2g\,p^{1/2}] = 4gy$, while if we set $m = y - 1$ in (8) (so that $m + \tfrac{1}{2}g = y$), we get the right-hand side of (8) is

$2g\,(y + (4y^2 - n_0 - 3)/4y) < 4gy$.

As a further example, consider the case when $n = 5$ ($g = 2$), $p = 13$. Here $[2g\,p^{1/2}] = 14$ and so the Riemann hypothesis is unable to even say whether $y^2 = f(x)$ has any solutions in this case. But Theorem 1 with $m = 1$ gives

$|N - 13| \le \dfrac{25 - n_0}{2} \le \dfrac{25}{2}$

and thus we have at least $N \ge 1$. If $N = 1$ we must have $n_0 = 1$ and this gives exactly $|N - 13| \le 12$. This is best possible as the example

$y^2 = x(x^4 + x^2 + 3)$ $\quad (p = 13)$

has $N = 1$.

When $n$ is even, $n = 2g + 2$ and Theorem 1 gives

$|N - p| \le (2g + 1)\,p^{1/2}$

with no difficulty. The estimate of Theorem 1 may be improved for even $n$ but we do it here only for $m = 1$.
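The example $y^2 = x(x^4 + x^2 + 3)$ over $k_{13}$ is small enough to check by exhaustive counting, and the right side of (8) can be evaluated exactly alongside it. The sketch below uses hypothetical function names and plain brute force; it illustrates the statement of Theorem 1 and has nothing to do with the method of proof.

```python
from fractions import Fraction


def count_affine_points(coeffs, p):
    """Return (N, n0) for y^2 = f(x) over k_p; coeffs are a_0..a_n, highest degree first."""
    roots_of = [0] * p               # roots_of[v] = number of y with y^2 = v
    for y in range(p):
        roots_of[y * y % p] += 1
    total = zeros = 0
    for x in range(p):
        value = 0
        for c in coeffs:             # Horner evaluation of f(x) mod p
            value = (value * x + c) % p
        total += roots_of[value]
        zeros += value == 0
    return total, zeros


def theorem1_bound(n, p, m, n0):
    """Right side of (8), as an exact rational number."""
    g = (n - 1) // 2 if n % 2 else (n - 2) // 2
    y = Fraction(2 * m + g, 2)       # y = m + g/2
    return (n - 1) * y + (Fraction(g, 2) * (p - n0) - Fraction(g * g, 4) * (n - 1)) / y


# f(x) = x^5 + x^3 + 3x, p = 13, m = 1
N, n0 = count_affine_points([1, 0, 1, 0, 3, 0], 13)
print(N, n0, theorem1_bound(5, 13, 1, n0))
```

Here $|N - 13|$ equals the bound, which is the sense in which the text calls the estimate best possible.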
Theorem 2. If $n$ is even, $p \ge 5$, $p^{1/2} \ge 2g - 1$, then

[the displayed inequality of Theorem 2 is missing from this copy]

This result is of interest when $p^{1/2} \gg 2g$. For example, when $p = 13$, $n = 6$ ($g = 2$) we get

$|N - 13 + (a_0/13)| \le \tfrac{1}{2}(26 - n_0) \le 13$

with equality only if $n_0 = 0$. When $(a_0/13) = -1$, this gives $N \ge 1$ and $N = 1$ is impossible since it implies $n_0 = 1$ which improves the estimate. Thus $N \ge 2$. If $(a_0/13) = 1$, we get no information on $N$ but in this case there are two points at $\infty$. Thus in the language of algebraic geometry, any complete nonsingular curve of genus 2 defined over $k_{13}$ has at least two points over $k_{13}$; the Riemann hypothesis says nothing.

2. An illustration of the method. Let $n_1$, $n_0$, $n_{-1}$ be the number of elements $a$ in $k_p$ such that $f(a)$ is a nonzero square, 0, or not a square, respectively. Then $N = 2n_1 + n_0$ and $n_1 + n_0 + n_{-1} = p$. Thus

$N - p = 2n_1 + n_0 - p = -(2n_{-1} + n_0 - p)$,

and Theorem 1 is equivalent to showing both

$2n_{\pm 1} + n_0 - p \le (n-1)(m + g/2) + \{(g/2)(p - n_0) - (g^2/4)(n-1)\}(m + g/2)^{-1}$.

In fact, since multiplying $f(x)$ by some $c$ in $k_p$ such that the Legendre symbol $(c/p) = -1$ interchanges the meaning of $n_1$ and $n_{-1}$, it suffices to prove

Theorem 1'. If $p \ge 5$, $p^{1/2} \ge 2g - 1$, $1 \le m \le p^{1/2}$, then

(9) $2n_{-1} + n_0 - p \le (n-1)(m + g/2) + \{(g/2)(p - n_0) - (g^2/4)(n-1)\}(m + g/2)^{-1}$.

We illustrate the method of proof in the case of $n = 2$ ($g = 0$) where the optimal choice of $m$ is $m = 1$. Let $R(x) \in k_p[x]$ be defined by

(10) $R(x) = (1 + f(x)^{(p-1)/2})\,f(x) + \tfrac{1}{2}f'(x)(x^p - x)$.

Let

(11) $f_{-1}(x) = \prod_{a \in k_p;\ (f(a)/p) = -1} (x - a)$, $\quad f_0(x) = \prod_{a \in k_p;\ f(a) = 0} (x - a)$.
Then clearly $f_{-1} \mid R$ since every zero of $f_{-1}$ is a zero of $R$. But further (since $k_p$ has characteristic $p$),

$R'(x) = (1 + f(x)^{(p-1)/2})\,\tfrac{1}{2}f'(x) + \tfrac{1}{2}f''(x)(x^p - x)$,

and thus every zero of $f_{-1}$ is also a zero of $R'$ and hence $f_{-1}^2 \mid R$. It is also apparent that $f_0 \mid R$ and since $f_0$ and $f_{-1}$ have no common roots, $f_0 f_{-1}^2 \mid R$. Assuming that $R$ is not identically zero, we may thus compare degrees and get

(12) $n_0 + 2n_{-1} \le p + 1$,

which is Theorem 1' with $m = 1$ ($n = 2$, $g = 0$). In fact, we may get the full Riemann hypothesis in this case since the coefficients of $x^{p+1}$ and $x^p$ in $R(x)$ are

$a_0(a_0^{(p-1)/2} + 1)$ and $\tfrac{1}{2}a_1(a_0^{(p-1)/2} + 1)$

respectively. This enables us to replace (12) by

(13) $n_0 + 2n_{-1} \le p + (a_0/p)$.

To get the corresponding estimate for $n_1$, we must remember that when we multiply $f(x)$ by a nonsquare, $(a_0/p)$ changes sign and thus the analogue of (13) is

(14) $n_0 + 2n_1 \le p - (a_0/p)$.

If we add (13) and (14), we see that both are equalities and in particular

$N = p - (a_0/p)$,

which is precisely (4), (5) and Theorem 2 when $n = 2$.

Even in this simple case, we still have not shown that $R$ is not identically zero. In this case, the coefficient of $x^{p-1}$ in $R$ is (even when $p = 3$)

$-\tfrac{1}{8}\,a_0^{(p-3)/2}(a_1^2 - 4a_0a_2)$,

which is not zero since $f$ has no double roots. Note that if $f = a_0(x - a)^2$, $(a_0/p) = -1$, then not only are the first three coefficients of $R$ zero but in fact $R$ is identically zero.

The result for $n = 2$ is valid for any finite field of odd characteristic since we required only one derivative. But for $n \ge 3$, the optimal number of derivatives is
on the order of $p^{1/2}$, and thus extra difficulties arise when we deal with a general finite field.

3. The auxiliary polynomial. In the illustration in the last section, we could write $R(x)$ explicitly. This is no longer possible in the general case, but rather we will define the coefficients of $R$ by differential recursion relations. To avoid numerous subscripts, it is imperative to use vectors and matrices. Also, we will write $f$, $R$, etc. in place of $f(x)$, $R(x)$, etc. Throughout this section, $m$ and $d$ will denote integers such that $m > 0$, $d \ge 0$ and $2m + d < p$.

Let the $m \times m$ matrix $Q$ be defined by

(15) $Q = (f'/2f)\,I + Q_1$,

where $I$ is the $m \times m$ identity matrix and

(16) $Q_1$ is the $m \times m$ matrix with the entries $1, 2, \ldots, m-1$ on the superdiagonal (the entry in row $j-1$, column $j$ is $j-1$)

with zero elsewhere. Put $F_i = (F_{i1}, \ldots, F_{im})$; we define the $F_i$ recursively from

(17) $F_{i+1} = DF_i + F_iQ$ $\quad (i \ge 0)$

where $D$ is the operator, $D = d/dx$, and

(18) $F_0 = (f^{(p-1)/2} + 1, 0, \ldots, 0)$.

Due to the triangular nature of $Q$, any $F_{ij}$ is independent of the value of $m \ge j$ used to define it. This will be useful later. In terms of the $F_{ij}$, (17) is

(19) $F_{i+1,j} = DF_{ij} + (f'/2f)\,F_{ij} + (j-1)\,F_{i,j-1}$, $\quad 1 \le j \le m$.

From this we see by induction that $F_{ij} = 0$ if $i + 1 < j \le m$ and

(20) $F_{i,i+1} = i!\,F_{01}$, $\quad 0 \le i \le m - 1$.

Further, we have
$DF_{01} + (f'/2f)\,F_{01} = f'/2f$,

and thus by induction we see that, for $i \ge 1$,

(21) $F_{ij} = P_{ij}/f^{\,i-j+1}$, $\quad 1 \le j \le \min(i, m)$,

where $P_{ij}$ is a polynomial of degree ($\deg 0 = -\infty$),

(22) $\deg P_{ij} \le (n-1)(i-j+1)$.

We define the $(m+d+1) \times m$ matrix $F$ by

(23) $F = \begin{pmatrix} F_0 \\ \vdots \\ F_{m+d} \end{pmatrix}$;

we see from (17) that

$DF + FQ = \begin{pmatrix} F_1 \\ \vdots \\ F_{m+d+1} \end{pmatrix}$.

Let

(24) $u = (u_0, \ldots, u_{m+d})$, $\quad u_j = (1/j!)(x^p - x)^j$,

where $m + d < p$. Note that

$Du = -(0, u_0, \ldots, u_{m+d-1})$

and thus,

(25) $D(uF) + uFQ = u_{m+d}F_{m+d+1}$.

Let

(26) $r_i = {}^t(r_{i1}, \ldots, r_{im})$

be a column vector of functions defined recursively by

(27) $r_{i+1} = Dr_i - Qr_i$, $\quad i \ge 0$,
and the $r_{0j}$ are to be chosen. In particular if the $r_{0j}$ are polynomials, then the $r_{ij}$ are rational functions whose denominators have only zeros which are zeros of $f$.

Set, for $i \ge 0$,

$R_i = uFr_i$.

We see that

(28) $DR_i = [D(uF) + uFQ]\,r_i + uF(Dr_i - Qr_i) = u_{m+d}F_{m+d+1}r_i + R_{i+1}$.

It follows from (20) that

(29) $R_i = (1 + f^{(p-1)/2}) \sum_{j=1}^{m} r_{ij}(x^p - x)^{j-1} + \sum_{k=1}^{m+d} \sum_{j=1}^{\min(m,k)} \frac{1}{k!}\,F_{kj}\,r_{ij}\,(x^p - x)^k$,

and since $u_{m+d} = [(m+d)!]^{-1}(x^p - x)^{m+d}$, we see that if the $r_{0j}$ are polynomials and $a \in k_p$ is such that $(f(a)/p) = -1$, then

$D^iR_0(a) = 0$, $\quad 0 \le i \le m + d$.

In other words, if $f_{-1}$ is given by (11) and in addition $R_0$ is a polynomial, then

$f_{-1}^{\,m+d+1} \mid R_0$.

We have $m$ free polynomials, $r_{0j}$, and hope to put $m - 1$ conditions on them. To do this we set $m = 2m_1$ ($m_1 > 0$), $d + 1 = d_1$. Our $m - 1$ conditions are

(30) $r_{0j} = 0$, $\quad m_1 + 1 \le j \le m$,

and

$F_ir_0 = 0$, $\quad m_1 + d_1 + 1 \le i \le m + d_1 - 1$.

Thus

$R_0 = (u_0, \ldots, u_{m_1+d_1}) \begin{pmatrix} F_0 \\ \vdots \\ F_{m_1+d_1} \end{pmatrix} {}^t(r_{01}, \ldots, r_{0m_1}, 0, \ldots, 0)$.

Since the only $F_{ij}$ with nonzero coefficients are those with $j \le m_1$ and since (30) implies that $r_{ij} = 0$, $m_1 + 1 \le j \le m$ for all $i$, we see that we may define $F_i$, $r_i$ and $R_i$ with $m$ and $d$ replaced by $m_1$ and $d_1$. For this reason, we drop the subscript on $m_1$ and $d_1$. We have (28) and (29) still valid; and if $R_0$ and the $r_{0j}$ are polynomials, $1 \le j \le m$, and the $m - 1$ equations

(31) $F_ir_0 = 0$, $\quad m + d + 1 \le i \le 2m + d - 1$,

are fulfilled, then
(32) $f_{-1}^{\,2m+d} \mid R_0$.

4. A choice of the $r_{0j}$. We will find the polynomials $r_{0j}$ in the form

(33) $r_{0j} = f^{\,m+d-j+1}s_j$

where the $s_j$ are polynomials. The advantage of this form is that from (29),

(34) $R_0 = (1 + f^{(p-1)/2}) \sum_{j=1}^{m} f^{\,m+d-j+1}s_j(x^p - x)^{j-1} + \sum_{k=1}^{m+d} \frac{1}{k!} \sum_{j=1}^{\min(k,m)} P_{kj}\,f^{\,m+d-k}s_j(x^p - x)^k$,

so that $R_0$ is automatically a polynomial. The system (31) becomes

$\sum_{j=1}^{m} \dfrac{P_{ij}}{f^{\,i-j+1}}\,f^{\,m+d-j+1}s_j = 0$, $\quad m + d + 1 \le i \le 2m + d - 1$,

which reduces to

(35) $\sum_{j=1}^{m} P_{ij}s_j = 0$, $\quad m + d + 1 \le i \le 2m + d - 1$.

Lemma 1. There are polynomials $s_j$, not all zero, of degrees

$\deg s_j \le \mu_j = (n-1)(j-1) + (n-1)(m-1)(m+d)$

which satisfy the system (35). (If $m = 1$, we take $s_1 = 1$.)

Proof. If $s_j$ is a polynomial of degree $\le \mu_j$ then from (22),

$\deg P_{ij}s_j \le (n-1)\,i + (n-1)(m-1)(m+d)$.

Thus if we attempt to solve (35) by letting $s_j = \sum_{l=0}^{\mu_j} b_{jl}x^l$ and then setting the coefficient of every power of $x$ equal to zero, we have

$U = \sum_{j=1}^{m} (\mu_j + 1) = (n-1)\tfrac{1}{2}(m-1)m + m(n-1)(m-1)(m+d) + m$

unknowns and
294 H. M. STARK 2m + d- 1 ES E [(fi-l)i+(n-l)(»i-l)(»i+<0+l] i=m+d+l = (n-l)$(m-l)(3m + 2d) + (m-l)(n-l){m--l){m + d) + {m-l)=U-l equations, and hence there is a solution for the bjt with not all 6,7 = 0. We take one such solution $,•(/= 1,..., m) with the Sj having no common factor among them all and fix this solution for the rest of the paper. For the rest of this section, the r0j are given by (33). Lemma 2. ///> = 5, pl/2^2g-l, \^m^pl/2/2 and m + d^pl/2/2 + g, then the polynomial R0(x) is not identically zero. We postpone the proof of Lemma 2 to the next section. Lemma 3. Ifd=gthen dcgR0(x)^(n-l)(m + g/2)2 + p(m+g/2) + l(g/2)p-(g2/4)(n-l)^^ Proof. For fixed j and \^k^m + d, we have deg(Pkjfm+d-kSj{xp-x)k)^{n-\){k-jl l) + w(m + </-/c) + /^ + /cp ^fc(p-l) + w(w + i) + |ij-(»i-l)0,-l) £{m + d){p-l) + n{m + d)^n-l){m-l){m + d) £(m + d)p + {n-l)(m){m + d) while deg[(l+/('-1)/2)/,"+d--'+1sJ(x|,-xy-1] £${p-l)n + n(m + d-j+l) + iij + p{j-l) ^i(p-l)« + «(w + d) + (p-l)(m-l) + (»i-l)(m-l)(w + d) ^{m + d)p + (n-l)m{m + d) + (p-l)(n/2-d-l). Hence for d — g ^ w/2 — 1 we see that degR0S{m + g)p-i'{n-l)m(m+g) = (n-\) (m + g/2)2 + p(m + g/2) + [(g/2) p-(n-l) (g2/^. 5. Proof of Theorem 1'. We constructed R0 (x) in the last section such that flTd | ^o- But we see from (34) that fSl+d\R0 also. Thus
THE RIEMANN HYPOTHESIS IN HYPERELLIPTIC FIELDS 295 (36) (2m + d) w_1 +(m + d) n0<>degR0. We estimated the degree of R0 when d = g in Lemma 3. If we use this estimate in (36) with d = #, we get Theorem 1'. It remains to prove Lemma 2. For the time being, we will assume that r0 is a vector of functions in kp(x, fl,2\ not all zero, satisfying the system of equations (31). The choice of r0 given in Lemma 1 is useful as is another choice to be given soon. We note that, for /^0, y'^0, (37) DiFtrj^Ft+trj + Ffj+t. It is also useful to simplify the recursion relations for the Ft and r, by making a change of variable. Let Qi = t{QiW"9Qim)=f1,2ri' The recursion relations (17) and (27) simplify to (38) 0i + i=Jtyi + <fcei, and (39) Qi + i = DQi-QlQi. We will reduce Lemma 2 to the following key result which is reminiscent of the theory of ^-functions in transcendence. Lemma 4. For p^>5, p1,2^2g-l% m + d£±p1,2 + g, m^jpl/2, we have detM=£0, where M is the mxm matrix (FtJ), m + d+\^i^2m + d, l^j^m. Proof. We have <£01 =fll2 + (fl/2y. We then see from (38) that Fm+d+i,i=f-l/2<i>m+d+ui=ri/2om+d+Hf112) (40) = I bkf~\ k=\ where the bk are polynomials in x and bm+i+l=H-t)-Q->"-Wr+i+i 2 ) n l * / j=l;jodd
296 H. M. STARK If Fm+d +1,1=0 then / | bm+d + 1. But / and /' have no common factors and frm+d + i^0 since 2g-l<pi/2 and 2m + 2rf-l^p1/2 + 20-l<:2p1/2<p. Thus the upper left-hand entry of M is not identically zero. This settles the case of m=l. We now assume that detM = 0, m> 1, and derive a contradiction. We see that for some A, 1 ^h<m, we must then have detMh#0, detMh+1=0 where Mfc = (F0), m + d+l^i^m + d + /c, 1=7=^. In fact by lowering m and raising d we may assume that this happens with h = m—l (as we noted before, the triangular shape of Q guarantees that the matrix Mk is independent of the value of m^k used to find it). Thus we shall assume that (41) detMm_1#0, detM=0. Therefore M has rank m— 1 over kp{x) with a single row relation which involves the last row, 2m+d-l (42) F2m+d= £ h,F, i=m+d+l with rational functions ht. By (31), F2m+dr0 = 0 and hence Firo = 0, m + d+l^i^2m-d. By (37), Firi=0, m + d+l£i£2m + d-l, and by (42) again, this also holds for i = 2m + d. This gives us two relations among the columns of M and since the column rank of M is also m— 1, these relations must be proportional: ri=hr0
THE RIEMANN HYPOTHESIS IN HYPERELLIPTIC FIELDS 297 where h is a function in kp(x, fl,2\ Thus (43) hr0 = Dr0-Qr0, and, in particular, (44) hr0m = DrQm-(f'/2f)r0m. Note that rOm^0 since otherwise the fact that Mm_x is nonsingular would imply all the other rOj = 0 also. In particular the sm of Lemma 1 is not zero. Let (45) ro^r-S+WsjS-1, 1^/rSm. This gives an r0 which is just (fd+ i/2sm)~1 times the r0 used in the last section and in particular r0m = /1/2. For this choice of r0, we see from (44) that ft = 0 and hence from (43) that ri=Dro — Qro = 0. We let q0 be given by (46) Qoj = f-^r0j=r^SjS-\ l^m, so that the qoj are rational functions, Q0m = 1 and £x =0; (47) DQo = QlQo. This gives (48) l)w+J+M/1/2Foieo0 = ^+^H^o) = ^+d + ieo = ^+d+^o = 0. In characteristic 0, this would be possible only for q01=0 and then repeated application of the differential equation (47) would yield QOm=0 which is a contradiction. But in characteristic p, things are not so easy. Suppose that a is a root of / (not necessarily in kp) such that (x — a)m \q01 in the sense that Q0i{x) = (x — a)m^(x), where the denominator of g(x) is not divisible by (x — a). By (47) we see that {m-l)lQom{x) = Dm-lQoi{x). Thus £0m in reduced form contains a factor of (x - a) in its numerator which contradicts the fact that Q0m = 1. Therefore (49) (x-arJCoi, and in particular £01 #0.
298 H. M. STARK Let a be the exact power of x — a contained in q01 in the sense that q0i(x) = (x — a)a q(x), where neither the numerator nor the denominator of g is divisible by (x — a). We know that Q^0,a<m. Then where h(x) = (f(x)/(x-ix))112 F01 (x) e(x)#0. Hence m+d+1 0 = D(m+d + l)(fl/2FoiQoi)= £ f"+f + 1)DI[(x-a)B+1/2]Dm+d+1-|ft / = o m+d+l = I (m+1+l)(a+±)(a-±y-ia+i-l)(x-x)a-l+l/2Dm+d+l-lh^ 1 = 0 When we multiply this though by (//(x-a))1/2 (x-a)m+d~a+1/2 we get a sum of the form (50) O^f1 gi(x-ocr+d + '-1, 1 = 0 where each gt is a rational function without factors of (x —a) in the denominator and 2m+d+ lgm+d+1 =(2a+ 1) (2a- \)-(2a-2m-2d + 1) (//(x-a)) F0lC. But (x-a) does not divide the numerator of (/(x)/(x-a)) F0l(x) q(x) and thus the only way that (50) can hold is that gm+d +1 =0, which implies either (51) 2a + l^p, or (52) 2a-2m-2d+lS-p. But (51) cannot hold since it implies a^(p-l)>lPl/2>™ which contradicts (49). Therefore — a^(p+\) — m — d. Since this result is true for all roots of/, it follows that/(p+1)/2~m_d divides the denominator of g01 when reduced. It follows from (46) that/(p~1)/2~d | sm and therefore (53) n((p-l)/2-d)^nm = (n-l)(m-l)(m + d+\).
THE RIEMANN HYPOTHESIS IN HYPERELLIPTIC FIELDS 299 This inequality is false as we now proceed to show. Consider the inequality (54) (g+l)(4t2-l)>(2g+l)(t-l)(t + g+l) + 2(g+l)g. Clearly this is true for f-*oo and both sides are equal at t = 0. Hence if f0>0 and (54) is true for t = t0 then it is true for t>t0. But (54) is true for t = g and indeed, when 0^2, (54) is true for t=.g — \. In particular, since pi/2^2g— 1, p^5, (54) is true for t = ^p1/2. Now with t=|p1/2, the inequality (55) u(L(4t2-l))>(u-l)(t-l)(t+g+l) + ug is satisfied u = 2g + 2 by (54) and clearly satisfied for w = 0 also (t> 1). Since (55) is linear in u, it must also be satisfied for u = 2g+ 1 and thus (55) is satisfied for u = n. Now consider the inequality (56) n{Up-l))>{n-l){m-l)(t + g+l) + n(t + g)-nm which is satisfied for m = t = p1/2/2 by (55) and for m=l since p^5, p1/2^2# — 1 implies {p-\)/2>{2P^-\)/2UPV2+^-2)l2 = t+g-\. Since (56) is linear in m, it is satisfied for 1 ^m^t. But now n((p- l)/2)>(n-1) (m- 1) (m + d+1) +w* holds for m + d=t + g, lgm^r and clearly continues to hold when either m or d are lowered. This contradicts (53) and proves Lemma 4. From this point on, r0 is given by (33). Lemma 5. The vectors r0, rl5..., rm_x Are linearly independent over kp(x). Proof. Our m — 1 equations for r0 are Fm+,+ 1 + 1-r0 = 0 (0^igm-2). By (37) these equations are equivalent to the equations (57) Fm + d+lrt = 09 0^m-2,
300 H. M. STARK Suppose there is a dependence relation among r0,...,rm„l. Then since r0 is not identically zero, for some J, l^J^m- 1, we have j-i rJ = I hfj j = o for some rational functions h} (x). If we apply (D — Q)m ~1" J to this we get a relation of the form rm_ i =X7=o2 0/0 with rational functions ^(x) (and indeed, although it is not needed we could even have ^.(x) = 0 for J^j^-m — 2). But this means (57) holds for i — m— 1 also and thus by (37), Fm+d+l + iro = 0, 0^fgm-l. But by Lemma 4, this means r0 is identically zero which is contrary to its definition in Lemma 1. This proves Lemma 5. We can now complete the proof of Lemma 2. By (28) and (37) we have for 0^f^m-2, DRi(x) = Ri+1(x) and hence, for 0^il^m — 1, If Ro=0 then R(=0 (0^i^m — \) and therefore by Lemma 5, uF=0. Hence by (25), Fm+d+1 =0 which violates Lemma 4 and completes the proof of Lemma 2. 6. Proof of Theorem 2. The case of even n is more complicated. We carry it through in the case that m= 1. By h(x) = 0(xa) we mean that h(x) has no terms xb with b > a present. Let A=(n- 1) (m + g/2)2+p(m + g/2) + [(g/2) p-(g2/4) (n- 1)] ; this is the bound that we had in Lemma 3 on deg R0 when d = g. For the rest of this section n is even. Lemma 6. Ifd=g,then ^oWHi+/(p-1)/Vom(xp-xr-i+^ m x I irm+9,/o,(*p-x)m+9+0(^-p+1). J'=l
THE RIEMANN HYPOTHESIS IN HYPERELLIPTIC FIELDS 301 Proof. This is clear from the proof of Lemma 3. Lemma 7. Ifm=l, d=g andn is even then degR0^A—(m + g/2) [1 — (a0/p)]. Proof. Since n is even, we may expand (ao i f)i/2 in a formal series of the form K"7)1/2= t bjxi + x*^ and thus from Lemma 6 with m= 1, T 1 xnp/2 1 *o(x)=r4(TT^ But from (40), F =D1+y[(fl0-1/)1/2] = (g + l)i + Q(x^2) 1+*1 (flo"1/)172 (flo"1/)172 ' and since for 2g—l^p1'2, 5£p we havegf+2<p —1, we see that R0(x)=r0lx^i+^ (ao1/)'1'2 (ari)l2 + l)+0(x^o-2). The lemma follows. Lemma 8. Ifnis even, then Proof. With m= 1, d=#, we have, by Lemma 7, "-i(2+0) + "o(l + 0)^degKo ^A-(l+g/2)2 + (a0/p)(l+g/2) + (g/2 + g2/4) = 2g(l+g/2)2 + [P + ("o/p)'] (1+0/2) + [to/2)(p+l)-2»^/4]. The lemma follows after dividing by 1 +g/2. Theorem 2 follows from Lemma 8 in exactly the same way that Theorem 1 follows from Theorem 1'. In the case of even n and m> 1, we can improve the estimate on deg.Ro even if (a0/p)= 1 and then in addition more terms cancel out when (ao/P)= — L We will not go into this matter further here.
References

1. E. Artin, "Quadratische Körper im Gebiete der höheren Kongruenzen. I, II," in The collected papers of Emil Artin, edited by S. Lang and J. T. Tate, Addison-Wesley, Reading, Mass., 1965, pp. 1-94. MR 31 #1159.
2. H. Hasse, Zur Theorie der abstrakten elliptischen Funktionenkörper. I-III, J. Reine Angew. Math. 175 (1936), 55-62, 69-88, 193-208.
3. Ju. I. Manin, On cubic congruences to a prime modulus, Izv. Akad. Nauk SSSR Ser. Mat. 20 (1956), 673-678; English transl., Amer. Math. Soc. Transl. (2) 13 (1960), 1-7. MR 18, 380; MR 22 #3711.
4. S. A. Stepanov, On the number of points of a hyperelliptic curve over a finite prime field, Izv. Akad. Nauk SSSR Ser. Mat. 33 (1969), 1171-1181 = Math. USSR Izv. 3 (1969), 1103-1114. MR 40 #5620.
5. A. Weil, Sur les courbes algébriques et les variétés qui s'en déduisent, Actualités Sci. Indust., no. 1041, Hermann, Paris, 1948. MR 10, 262.

MASSACHUSETTS INSTITUTE OF TECHNOLOGY
CLASS NUMBERS OF TOTALLY IMAGINARY FIELDS

JUDITH S. SUNLEY

AMS 1969 subject classifications. Primary 1065; Secondary 1068. © 1973, American Mathematical Society

In recent years the study of class numbers of imaginary quadratic number fields has played an increasingly important role in number theory. These studies have so far resulted in the determination of all imaginary quadratic fields with class number one or two. The techniques used in these determinations are quite different from the classical methods used in obtaining earlier results such as the Brauer-Siegel theorem and the results of Heilbronn and Linfoot [2] and Tatuzawa [5]. At the present time, little progress has been made in determining imaginary quadratic fields of higher class number.

This paper investigates a similar problem at a slightly different level. Let $K$ be any totally real algebraic number field of degree $n \ge 2$. The problem posed is to determine all totally imaginary quadratic extensions of $K$ having a given class number. Although the classical techniques seem insufficient to answer the class number problem in the case of imaginary quadratic fields, given the results in imaginary quadratic fields the classical techniques do give an answer in many cases to the same problem in the situation described above. The major result is

Theorem. Let $K$ be a fixed totally real algebraic number field and let $L$ be a totally imaginary quadratic extension of $K$ having conductor $\mathfrak{f}_L$ and class number $h$. Then there exists an effectively computable constant $c = c(K, h)$ such that $N\mathfrak{f}_L \le c$ with the possible exception of one field $L$.

Using this result with results of Goldstein [1] it is possible to give an effective classification of all $L$ having class number one where $K$ is normal over $Q$. It is also possible to classify those $L$ having class number two when $K$ has certain specified characteristics.

The proof of the major theorem draws on the proof by Tatuzawa [5] of a similar statement for imaginary quadratic fields. It is based upon developing a lower bound for $L(1, \chi)$, where $\chi$ is the real ideal character induced by the field $L$. The result is then obtained by relating the Dedekind zeta functions of the two fields and using the known algebraic relationships between the two fields.

The determination of a lower bound for $L(1, \chi)$ is made by forcing a certain complex analytic function to have too many zeros close to $s = 1$. If more than one $\chi$ of $K$ gave a value $L(1, \chi)$ which was too small, this analytic function would have impossible properties around $s = 1$. The one exception provides the complications for imaginary quadratic fields, and it is only by making use of strong algebraic properties that one is able to overcome this problem for some specialized cases where the base field is a totally real field other than $Q$.

There are, of course, many drawbacks to the result. First of all, the constant obtained in the result is quite large, on the order of $10^{21}$ even in the simplest cases where the base field is a real quadratic field of small class number. This means computation would be extremely difficult. Also, the one exception still creates difficulties in the complete determination of all $L$'s of class number $h$. Further investigation of the possible exception is definitely in order. The conjecture would be that the exception, if it exists, must be related to an imaginary quadratic field of class number $h$.

These results have previously been announced in [3] and complete proofs will appear shortly in [4].

Bibliography

1. Larry Goldstein, Relative imaginary quadratic fields of class number 1 or 2, Trans. Amer. Math. Soc. 165 (1972), 353-364.
2. H. Heilbronn and E. H. Linfoot, On the imaginary quadratic corpora of class number one, Quart. J. Math. Oxford Ser. 5 (1934), 293-301.
3. J. Sunley, On the class numbers of totally imaginary quadratic extensions of totally real fields, Bull. Amer. Math. Soc. 78 (1972), 74-76.
4. J. Sunley, Class numbers of totally imaginary quadratic extensions of totally real fields, Trans. Amer. Math. Soc. (to appear).
5. T. Tatuzawa, On a theorem of Siegel, Japan J. Math. 21 (1951), 163-178. MR 14, 452.

American University
EXPONENTIAL SUMS AND THE RIEMANN CONJECTURE

PAUL TURAN

AMS 1970 subject classifications. Primary 10G05, 10H05; Secondary 10-02. © 1973, American Mathematical Society

1. We are dealing in this paper exclusively with the Riemann zeta function, though the results can be extended to numerous other important cases mutatis mutandis. The starting point of this paper was a remark of Landau in his Handbuch from 1909. Denoting by $\rho$'s the nontrivial zeros of $\zeta(s)$, ordered according to absolute increasing ordinates, he wrote (l.c.) the following lines.

"Die Tatsache, dass $\sum x^{\rho}/\rho$ gerade in der Nähe der Primzahlen und höheren Primzahlpotenzen und sonst in der Nähe keiner Stelle $> 1$ ungleichmässig konvergiert, deutet auf einen arithmetischen Zusammenhang zwischen komplexen Wurzeln $\rho$ der Zetafunktion und den Primzahlen hin. Ich habe keine Ahnung, worin derselbe besteht." [The fact that $\sum x^{\rho}/\rho$ converges nonuniformly precisely in the neighbourhood of the primes and higher prime powers, and in the neighbourhood of no other point $> 1$, points to an arithmetical connection between the complex roots $\rho$ of the zeta function and the primes. I have no idea what this connection consists of.]

We would more modestly ask whether the zeta-roots of some ranges and the primes of some ranges influence each other particularly strongly, and if this is indeed the case then how? As far as I know the first result in this direction was contained in my paper in Izv. Akad. Nauk SSSR in 1947. This runs as follows.¹

¹ In what follows $c$ stands for unspecified positive numerical, explicitly calculable constants; if they depend on some parameters this will be explicitly stated. $\exp x$ stands throughout for $e^x$, $p$ for primes, $\pi(x)$ for the number of primes $\le x$.

Theorem 1. Suppose the existence of constants $\alpha \ge 2$, $0 < \beta \le 1$ and a $c(\alpha, \beta)$ so that for a $\tau > c(\alpha, \beta)$ the inequality
$$\Bigl|\sum_{N_1 \le p \le N_2} \exp(-i\tau \log p)\Bigr| \le \frac{N \log^{10} N}{\tau^{\beta}} \tag{1.1}$$
holds for all integers $N_1$, $N_2$ with
$$\tau^{\alpha} \le N \le N_1 < N_2 \le 2N \le \exp(\tau^{\beta/10}). \tag{1.2}$$
Then $\zeta(s) \ne 0$ on the segment ($s = \sigma + it$)
$$\sigma > 1 - e^{-10}\frac{\beta^3}{\alpha^2}, \qquad t = \tau. \tag{1.3}$$

This theorem is "local" in the sense that for a fixed, sufficiently large $\tau$ it gives information about the distribution of zeros on the segment $0 < \sigma < 1$, $t = \tau$, from a property of the exponential sums $\sum \exp(-i\tau \log p)$ which involves a set of primes depending on $\tau$ only. It is useful to give to this theorem a somewhat weaker "semiglobal form" where the conclusion refers to the distribution of zeros in a "small" parallelogram. More exactly we state

Theorem 1'. Suppose the existence of constants $\alpha \ge 2$, $0 < \beta \le 1$, $0 < E \le 9/10$ and $c(\alpha, \beta)$ so that for a $T > c(\alpha, \beta)$ the inequality
$$\Bigl|\sum_{N_1 \le p \le N_2} \exp(-i\tau \log p)\Bigr| \le \frac{N \log^{10} N}{\tau^{\beta}} \tag{1.4}$$
holds for
$$T^{\alpha} \le N \le N_1 < N_2 \le 2N \le \exp(T^{\beta/10}), \qquad T - T^{E} \le \tau \le T + T^{E}. \tag{1.5}$$
Then $\zeta(s) \ne 0$ on the parallelogram
$$\sigma > 1 - e^{-10}\frac{\beta^3}{\alpha^2}, \qquad T - T^{E} \le t \le T + T^{E}. \tag{1.6}$$

These theorems might be compared with the following two theorems of a converse character.

Theorem 2. Suppose the existence of constants $\vartheta$ and $E$ with $\tfrac12 \le \vartheta < 1$, $0 < E \le 9/10$ such that for a $\tau$ with $\tau > 100$, say, $\zeta(s)$ is nonvanishing in the parallelogram
$$\sigma > \vartheta, \qquad \tau - \tau^{E} \le t \le \tau + \tau^{E}.$$
Then the inequality
EXPONENTIAL SUMS AND THE RIEMANN CONJECTURE 307 £ exp(-rilogp) <c- N\og10N holds if TE,{1-*)^N<>Nl<N2^2N. Theorem T. Suppose the existence of constants $ and E with y^#<l and 0 < ES 9/10 such that for a T> 200, say, C(s)^0 in the parallelogram a>99T- TE |T^r+ TE. Then the inequality £ exp(-iilogp) Ni^p^N2 <C N\og10N holds if T-^Te^t^T^\Te,TEI(1-^^N^N1<N2^2N. Further we mention the "global" form of these theorems which refer to half- planes. Theorem 1". Suppose the existence of constants a ^ 2, 0 < /? ^ 1 and c (a, /?) so that for all t>c(ql, p) the inequality (1.7) holds for (1.8) X exp(-nrlogp) ,N\og10N Ta^N^Ni<N2^Qxp(Tp/l0). Then the half plane a>\—e °/?3/a contains at most finitely many zeros of Us)- Theorem 2". Suppose the existence of a constant i^# < 1 so that C{s)¥:0for g>9. Then for t>c(S) the inequality £ exp(-/ilogp) Nt^p^N2 <C N\og10N holds if Tl/(l~*)SN^Nl<N2^2N. We emphasize a curious consequence of the "global" Theorem 1". Suppose the truth of (1.7)-(1.8). Then Theorem 1" implies the existence of a constant 0< 1
such that $\zeta(s)$ has no zeros in $\sigma > \Theta$. But this gives in turn the inequality
$$\pi(x) - \int_2^x \frac{d\nu}{\log \nu} \ll x^{\Theta} \log x \tag{1.9}$$
(and in the case $\Theta > \tfrac12$ even the log factor can be dropped). Hence (1.7)-(1.8) imply (1.9). That (1.9) implies (1.7)-(1.8) with some positive constants $\alpha \ge 2$ and $0 < \beta \le 1$ is trivial. The interest of these remarks lies in the fact that all references to zeta roots are eliminated now; the error term in the prime number formula is made dependent directly on an estimation of the finite exponential sums in (1.7). The above deduction however uses zeta-roots; it would be very interesting to find a proof for it directly, without using the $\zeta$-function. This problem was stated already in my Izvestiya paper in 1947, without receiving the slightest attention so far.

2. These results, though they bring out clearly the crucial role of the exponential sums in (1.7), raise several questions. We mention only two.

Problem 1. In Theorems 1'' and 2'' the occurring half planes are not the same. Can they be made identical? If yes, what is the exact form?

Scrutinizing the mutual influence of primes and zeta roots, it is natural to ask whether in the local Theorem 1 it is enough to require (1.1) for a much narrower range than the one in (1.2) and still have the conclusion (1.3). Thus we came to the following general

Problem 2. To improve the localisations.

3. As for Problem 1, let us define the "abscissa $\sigma_w$ of quasizerofreeness of $\zeta(s)$" as follows. For arbitrarily small $\varepsilon > 0$ the halfplane $\sigma > \sigma_w + \varepsilon$ contains at most finitely many zeros of $\zeta(s)$, whereas the halfplane $\sigma > \sigma_w - \varepsilon$ contains infinitely many. In the German edition of my book Eine neue Methode der Analysis und ihre Anwendungen, I settled this question at least supposing the Lindelöf hypothesis
$$\varlimsup_{t \to +\infty} \frac{\log |\zeta(\tfrac12 + it)|}{\log t} = 0. \tag{3.1}$$
It turned out that in this case
$$\sigma_w = 1 - \limsup \frac{\beta}{\alpha}, \tag{3.2}$$
where $\alpha$, $\beta$ run over all pairs of constants with
$$\alpha \ge 2\beta, \qquad 0 < \beta \le 1 \tag{3.3}$$
for which (1.7)-(1.8) hold. I returned to this question in 1957 when I deduced the formula (3.2)-(3.3) from the "weak Lindelöf hypothesis", which amounts to the fact that the Lindelöf $\mu(\sigma)$-function of $\zeta(s)$ (which is continuous and convex) is for $\tfrac12 < \sigma \le 1$ everywhere differentiable (actually one needs even less). The proof was given with all details in the Sammelband zu Ehren des 250. Geburtstages Leonhard Eulers (Akademie-Verlag, Berlin, 1959); this result, as well as another one from 1958 in Acta Arithmetica in which I deduced the density hypothesis from it, remained rather unnoticed. An unconditional proof of (3.2)-(3.3) would be highly desirable.

For the sake of orientation I remark that for the sum
$$S = \sum_{N_1 \le n \le N_2} \exp(-i\tau \log n), \qquad \tau \ge 2,$$
the elementary formula
$$\bigl|(\nu+1)^{1+i\tau} - \nu^{1+i\tau} - (1+i\tau)\nu^{i\tau}\bigr| \le \tau^2/\nu$$
gives at once
$$|S| \ll N/\tau + \tau \log N \ll (N \log N)/\tau$$
if $N \ge \tau^2$. The really significant sum of this type is
$$S_1 = \sum_{n \le N} (-1)^{n+1} \exp(-i\tau \log n),$$
since the inequality
$$|S_1| \ll_{\varepsilon} N^{1/2+\varepsilon}(2 + |\tau|)^{\varepsilon}$$
is necessary and sufficient for the truth of the Lindelöf conjecture in (3.1).
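The elementary formula quoted above can be checked numerically. The following is a minimal sketch, not part of the paper: the function names are hypothetical, and the explicit bound $(N_1+N_2+1)/\tau + \tau\sum 1/n$ is an assumption obtained by summing the Taylor estimate over $N_1 \le \nu \le N_2$, telescoping, and dividing by $|1+i\tau| \ge \tau$.

```python
import math


def taylor_gap(nu: int, tau: float) -> float:
    """Return |(nu+1)^(1+i tau) - nu^(1+i tau) - (1+i tau) nu^(i tau)|."""
    s = complex(1.0, tau)
    return abs((nu + 1) ** s - nu ** s - s * nu ** (s - 1))


def zeta_type_sum(n1: int, n2: int, tau: float) -> complex:
    """Return S = sum over n1 <= n <= n2 of exp(-i tau log n)."""
    total = 0j
    for n in range(n1, n2 + 1):
        angle = tau * math.log(n)
        total += complex(math.cos(angle), -math.sin(angle))
    return total


def elementary_bound(n1: int, n2: int, tau: float) -> float:
    """Explicit bound (n1 + n2 + 1)/tau + tau * sum 1/n (assumed derivation,
    see the lead-in): telescoping term plus the summed Taylor remainders."""
    harmonic = sum(1.0 / n for n in range(n1, n2 + 1))
    return (n1 + n2 + 1) / tau + tau * harmonic


if __name__ == "__main__":
    tau = 20.0
    n1, n2 = 400, 800  # here N = 400 >= tau^2
    print(abs(zeta_type_sum(n1, n2, tau)), elementary_bound(n1, n2, tau))
```

The first term of the bound has size about $N/\tau$ and the second about $\tau \log 2$ on a dyadic range, which is the shape $N/\tau + \tau\log N$ stated in the text.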
310 PAUL TURAN 4. Next we turn to the second problem. In a lecture published in Number Theory (Colloq., Janos Bolyai Math. Soc, Debrecen, 1968), North-Holland, Amsterdam, 1970, I reduced the interval (1.3) in Theorem 1 to (t2, t8). While drafting an English edition of my book in May and June of 1971 in Ann Arbor I found that a still stronger localisation is possible. We assert the Theorem 3. Let D>0 be fixed. Suppose that for a Sq = S0(D)andaO<p^S0 there is a t0 = t0(/?, D) such that for a t^t0 the inequality £ exp(-/Tlogp)| Ni^p^N2 <C AMog10iV holds for TD(l-dfil/6)^N^Nl<N2^2N^TDil+dfil/6K Then for the same t, C(s)#0 on the segment g>\— j?2, t = x. This theorem is in our terminology a "local" theorem with an unexpectedly strong localisation. In order to draw another surprising conclusion we formulate first, as a trivial consequence of Theorem 3, the "semiglobal" Theorem 3'. LetD>0and0<E^9/\0 be fixed. Suppose that for a 5$ = £$ (D) and a 0<fl^d$ there is a t§ = t§(/J, D) such that for a T^t$ and for all z's with T-Te^t<LT+ TE the inequality £ exp(-fTlogp)| Ni^p^N2 <C AMog10iV holds for TDii-pl/6)^N^Nl<N2^2N^TDil+pl/6). Then C{s)^0for the parallelogram <7>l-j32, T-TE^t^T+TE. This formulation helps to explain the puzzling role of the constant D in Theorem 3. Let us namely apply Theorems 3' and 2' successively. Then we get the following corollary which we formulate owing to its independent interest as
EXPONENTIAL SUMS AND THE RIEMANN CONJECTURE 311 Theorem 4. If for a D>0 and0<E^9/\0for S$(D)and a0< p^S$ there is a t* = t*(Z), p) so that for a 7^t5>200 and all t's with (4.1) and T-Te<t<T+Te (4.2) the inequality TDii-pl/6)^N^Ni<N2^2N^TDii+fii/6) (4.3) £ exp(-rrlogp)| Ni£p^N2 ,Nlog10N holds, then for all (4.4) (4.5) we have also the inequality TE/fi2^M^Ml<M2S2M, T-$Te^t^T + $Te, (4.6) £ exp(-iTlogp) Mi^p^M2 I <C Mlog5M The content of this theorem can be described in the following way: If the inequality (4.3) is satisfied in the i-range (4.1) and prime-range (4.2) then the inequality (4.6) of the same type holds for the somewhat smaller i-range (4.5) and for the prime-range (4.4) which is unbounded from above (remarking, of course, that for M > exp(7£/5) (4.6) is trivial). In the text of Theorem 4 only primes occur, and no reference to C(^); it would be highly interesting to find a proof of it without recourse to C(s). 5. Hence the interest of Theorem 4 lies in the fact that it throws some light on the behaviour of the exponential sums \^n1£P£n2 exp( —it logp)| though at present there are no known ways for their nontrivial direct investigation (apart from the fact that it could be shown that each sum is "small" indeed, apart from a "small" set, thus opening a new approach to the so-called density hypothesis as early as 1949). Now there are several other finite exponential sums whose absolute values can be estimated nontrivially by the trigonometrical sieve method of I. M. Vino-
312 PAUL TURAN gradoff, which is independent of the theory of £(s). Such sums are e.g. (5.1) Sy{T,Nl9N2)= £ exp(-iTlogH i^y^2,y#l, and (5.2) G(a)= £ exp(ipa); Ni^p^N2 we shall confine ourselves to the former one. The sieve method gives for this the inequality c N\og20N (53) |S,(t, Nl9 N2)\<^r.^ tga for (5.4) {210S)t10SNSN1<N2^2N, which is highly nontrivial. Hence it was plausible to investigate the connection of these sums with the quasi-Riemann conjecture. I proved in this direction in the early 1950's the following two theorems (cf. my book Eine neue Methode..., pp. 143-147). Theorem 5. For a given ^y^2, y#l, suppose the existence of constants a ij>2, 0<?7^2 am/ c(a, rj, y) so that for all T>c(a, rj, y) the inequality (5.5) |Sy(t, JVi, JV2)|^(JV log10iV)/T1/2^ (5.6) Ta^N^Ni<N2^2N. Then the halfplane a>l —e~i0 n3/(<x+.l)2 contains at most finitely many zeros of Us). Theorem 6. Suppose the existence of a constant %^9<l so that C(s)^0for g > S. Then for all ^ y :g 2 and t>c(S) the inequality (5.7) |Sy(t, Nl9 N2)\^Cl{a) (N log4 JV)/t
holds if
$$\tau^{2/(1-\vartheta)} \le N \le N_1 < N_2 \le 2N. \tag{5.8}$$

Again Theorem 5 is more difficult than Theorem 6. These theorems, compared with Theorems 1'' and 2'', reveal a curious situation. In (1.7) an arbitrarily small positive constant exponent $\beta$ of $\tau$ implies the existence of a nontrivial zerofree halfplane, and (1.7) is trivial for $\beta = 0$; in order to achieve the same aim in (5.5), the exponent of $\tau$ must be greater than $\tfrac12$ however close $\gamma$ is to 1 from either side, and (5.5) is (for fixed $\gamma \ne 1$) nontrivial even for $\eta = 0$ (but proved unconditionally). It is not known whether this curious discontinuity of the "critical" exponent $\beta$ with respect to $\gamma$ at $\gamma = 1$ is due only to the weakness of our "transition formulae", nor whether an improvement of the trigonometrical sieve can lead to the factor $\tau^{1/2+\eta}$ with a positive $\eta$ instead of $\tau^{1/2}$ in (5.3).

6. Theorems 5 and 6 are in our terminology "global" theorems; it is reasonable to ask for their "local" or "semiglobal" forms. One analogue of Theorem 5, given for the sake of simplicity only for $\gamma = \tfrac12$, is

Theorem 7. Let $D_1 > 0$ be fixed. Suppose that for a $\delta_1 = \delta_1(D_1)$ and an $0 < \eta \le \delta_1$ there is a $\tau_1 = \tau_1(\eta, D_1)$ such that for a $\tau \ge \tau_1$ the inequality
$$\Bigl|\sum_{N_1 \le p \le N_2} \exp\bigl(-ir(\log p)^{1/2}\bigr)\Bigr| \le \frac{N \log^{10} N}{\tau^{1/2+\eta}} \tag{6.1}$$
holds for
$$\tau^{D_1(1-\eta^{1/6})} \le N \le N_1 < N_2 \le 2N \le \tau^{D_1(1+\eta^{1/6})} \tag{6.2}$$
and
$$\bigl(D_1(1-2\eta^{1/6})\bigr)^{1/2}\, \tau(\log \tau)^{1/2} \le r \le \bigl(D_1(1+2\eta^{1/6})\bigr)^{1/2}\, \tau(\log \tau)^{1/2}. \tag{6.3}$$
Then for the same $\tau$, $\zeta(s) \ne 0$ on the segment $\sigma > 1 - \eta^2$, $t = \tau$.

This theorem is a "local" one in the sense that it implies the nonexistence of $\zeta$-zeros on a segment. It is a bit weaker than Theorem 3 in the sense that the inequality (6.1) is required for the $r$-interval in (6.3), whereas in Theorem 3 it is required only for $r = \tau$. This fact no doubt lends additional interest to Theorem 3.

It would be trivial to formulate the "semiglobal" form of Theorem 7; we shall omit it. It would be more interesting to find the semiglobal form of Theorem 6,
even with $\tau^{1/2+\eta}$ instead of $\tau$ in (5.7), and thus to find the analogue of Theorem 4 for the sum $S_{1/2}(\tau, N_1, N_2)$; for this I have so far no proof.

However there is no doubt that the analogue of Theorem 7 can be proved for the exponential sum $G(\alpha)$ in (5.2), which is so important in the additive theory of primes; the analogues of Theorems 5 and 6 for $G(\alpha)$ were already proved in my book Eine neue Methode der Analysis und ihre Anwendungen.

The proofs of the new Theorems 3 and 7 are too long to be included in this volume; at least one of them will be inserted in the forthcoming English edition of the above-mentioned book in the Interscience Tracts series.

Mathematical Institute of the Hungarian Academy of Sciences
Budapest, Hungary
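For orientation, the prime exponential sums that appear throughout the theorems above can be tabulated directly for small ranges. The following is a minimal sketch under stated assumptions, not a method from the paper: the function names are hypothetical, and the only comparison made is with the trivial bound, namely the number of primes in the range.

```python
import cmath
import math


def primes_in_range(n1: int, n2: int) -> list:
    """Return the primes p with n1 <= p <= n2 (sieve of Eratosthenes)."""
    if n2 < 2:
        return []
    sieve = bytearray([1]) * (n2 + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n2) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n2 + 1, i)))
    return [p for p in range(max(n1, 2), n2 + 1) if sieve[p]]


def prime_exponential_sum(n1: int, n2: int, tau: float) -> complex:
    """Return the sum over primes n1 <= p <= n2 of exp(-i tau log p)."""
    return sum(cmath.exp(-1j * tau * math.log(p)) for p in primes_in_range(n1, n2))


if __name__ == "__main__":
    n1, n2 = 1000, 2000
    trivial = len(primes_in_range(n1, n2))
    for tau in (0.0, 10.0, 100.0):
        print(tau, abs(prime_exponential_sum(n1, n2, tau)), trivial)
```

At $\tau = 0$ the sum equals the number of primes in the range; how far the absolute value falls below this count as $\tau$ grows is exactly what the inequalities of the theorems quantify.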
A NEW ESTIMATE FOR THE EXCEPTIONAL SET IN GOLDBACH'S PROBLEM

ROBERT C. VAUGHAN

AMS 1970 subject classifications. Primary 10J15, 10J10; Secondary 10B35, 10L05, 10L15. © 1973, American Mathematical Society

Let $E(X) = \{2m : 2m \le X,\ 2m \ne p_1 + p_2\}$. Goldbach conjectured that $E(\infty) = \{2\}$. Tchudakoff, van der Corput and Estermann all noticed independently, after Vinogradov's three primes theorem appeared in 1937, that
$$|E(X)| \ll_A X \log^{-A} X.$$
If
$$R_s(n) = |\{(p_1, \dots, p_s) : p_1 + \dots + p_s = n\}|,$$
$$J_s(n) = \sum_{n_1 + \dots + n_s = n} (\log n_1 \cdots \log n_s)^{-1}$$
and
$$S_s(n) = \prod_{p \nmid n}\Bigl(1 - \Bigl(\frac{-1}{p-1}\Bigr)^{s}\Bigr) \prod_{p \mid n}\Bigl(1 - \Bigl(\frac{-1}{p-1}\Bigr)^{s-1}\Bigr),$$
they all showed that
$$\sum_{n \le X} |R_2(n) - J_2(n) S_2(n)|^2 \ll_A X^3 \log^{-A} X.$$
I can show by essentially the same method that
$$\sum_{n \le X} |R_2(n) - J_2(n) S_2(n) + D(n, X)|^2 \ll X^3 \exp(-c_1 \log^{1/2} X), \tag{1}$$
where $D$ is a term corresponding to a possible exceptional 'Siegel' zero of $L$-functions. The proof of (1) is much too complicated to give here. From this estimate I can show that
$$|E(X)| \ll X \exp(-c_2 \log^{1/2} X). \tag{2}$$
The main difficulty is that if $q$ is the 'exceptional modulus' and $q$ is large, then $D(n, X)$ can closely imitate the behaviour of $J_2(n) S_2(n)$. However, one knows that $q \le \exp(c_3 \log^{1/2} X)$, and I can show that for all but $X \exp(-c_4 \log^{1/2} X)$ values of $n \le X$ we have
$$J_2(n) S_2(n) - D(n, X) > J_2(n) S_2(n)\, q^{-\delta}$$
for a suitable small positive number $\delta$. (2) then follows quite easily from (1). For further details I refer interested persons to my forthcoming paper in Acta Arithmetica.

I thought it might be of interest to discuss here some joint work with Hugh Montgomery, which is connected with the above. Following I. M. Vinogradov it is known that, for all positive $A$,
$$R_3(n) - J_3(n) S_3(n) \ll_A n^2 \log^{-A} n.$$
Also, on the generalised Riemann hypothesis the error is known to be $\ll_{\varepsilon} n^{7/4+\varepsilon}$. One might conjecture that the error term is $\ll (\text{main term})^{1/2+\varepsilon}$, but this is false. More generally we can show that
$$R_s(n) - J_s(n) S_s(n) = \Omega(n^{s-3/2} \log^{-s} n)$$
and if
$$r_s(n) = \sum_{n_1 + \dots + n_s = n} \Lambda(n_1) \cdots \Lambda(n_s),$$
then
$$r_s(n) - (-1)^n \binom{-s}{n} S_s(n) = \Omega_{\pm}(n^{s-3/2}).$$
These results are not very deep, and depend only on classical methods in prime
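number theory. However, we can also show that
$$R_2(n) - J_2(n) S_2(n) = \Omega(n^{1/2} \log^{-1} n)$$
and
$$r_2(n) - n S_2(n) = \Omega(n^{1/2} \log n). \tag{3}$$
For the rest of my discourse I shall give the proof of (3). We first of all require a lemma.

Lemma. Let
$$S(n, m) = \sum_{q=1}^{m} \frac{\mu(q)^2}{\varphi(q)^2} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} e(-an/q).$$
Then
$$S_2(n) - S(n, m) \ll m^{-1} (\log\log m)^2\, d(n) \qquad (n > 0). \tag{4}$$

Proof. It is well known that $S_2(n) = S(n, \infty)$ and
$$\sum_{\substack{a=1 \\ (a,q)=1}}^{q} e(-an/q) = \sum_{r \mid q,\ r \mid n} \mu\Bigl(\frac{q}{r}\Bigr) r,$$
so that
$$S_2(n) - S(n, m) \ll \sum_{r \mid n} r \sum_{\substack{q = m+1 \\ r \mid q}}^{\infty} \frac{(\log\log q)^2}{q^2} \ll m^{-1} (\log\log m)^2\, d(n),$$
which proves (4).

Now suppose that $0 < \rho < 1$. Then, by Parseval's theorem and Schwarz's inequality,
$$\sum_{n=0}^{\infty} \rho^{2n} |r_2(n) - (n+1) S(n, m)|^2 = \int_0^1 \Bigl|\sum_{n=0}^{\infty} \rho^n e(\alpha n)\bigl(r_2(n) - (n+1) S(n, m)\bigr)\Bigr|^2 d\alpha \ge T^2,$$
where
$$T = \int_0^1 \Bigl|\sum_{n=0}^{\infty} \rho^n e(\alpha n)\bigl(r_2(n) - (n+1) S(n, m)\bigr)\Bigr|\, d\alpha.$$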
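It is easily seen that $T \ge T_1 - T_2$, where
$$T_1 = \int_0^1 \Bigl|\sum_{n=0}^{\infty} \Lambda(n) \rho^n e(n\alpha)\Bigr|^2 d\alpha = \sum_{n=0}^{\infty} \rho^{2n} \Lambda(n)^2$$
and
$$T_2 = \int_0^1 \Bigl|\sum_{n=0}^{\infty} \sum_{q=1}^{m} \frac{\mu(q)^2}{\varphi(q)^2} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} (n+1) \rho^n e\bigl((\alpha - a/q) n\bigr)\Bigr|\, d\alpha \le \sum_{q=1}^{m} \frac{\mu(q)^2}{\varphi(q)^2} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \int_0^1 \Bigl|\sum_{n=0}^{\infty} (n+1) \rho^n e\bigl((\alpha - a/q) n\bigr)\Bigr|\, d\alpha = \sum_{q=1}^{m} \frac{\mu(q)^2}{\varphi(q)} \sum_{n=0}^{\infty} \rho^{2n}.$$
Let $\rho^2 = 1 - 1/X$ and $m = [X^c]$ with $c$ a constant such that $0 < c < 1$; $c$ will be at our disposal later on. Then we have
$$T_1 = X(\log X + O(1))$$
and
$$T_2 \le X(\log m + O(1)) = X(c \log X + O(1)).$$
Hence
$$T \ge X((1 - c) \log X + O(1)) \gg X \log X,$$
so that
$$\sum_{n=0}^{\infty} \rho^{2n} |r_2(n) - (n+1) S(n, m)|^2 \gg X^2 \log^2 X.$$
We now simplify our expression. We have, by the Lemma, for $n > 0$,
$$|r_2(n) - (n+1) S(n, m)|^2 \ll |r_2(n) - n S_2(n)|^2 + S_2(n)^2 + n^2 d(n)^2 (\log\log m)^4 m^{-2}.$$
Clearly
$$\sum_{n} \rho^{2n} S_2(n)^2 \ll \sum_{n} \rho^{2n} n \ll X^2,$$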
and
$$\sum_{n} \rho^{2n} n^2 d(n)^2 (\log\log m)^4 m^{-2} \ll (\log\log m)^4 m^{-2} X^3 \log^3 X,$$
which is $\ll X^2$ provided we choose $c$ so that $1/2 < c < 1$, say $c = 3/4$. Finally
$$|r_2(0) - S(0, m)| = \sum_{q \le m} \frac{\mu(q)^2}{\varphi(q)} \ll \log m \ll X.$$
Hence
$$\sum_{n} \rho^{2n} |r_2(n) - n S_2(n)|^2 \gg X^2 \log^2 X = \Bigl(\frac{1}{1-\rho^2} \log \frac{1}{1-\rho^2}\Bigr)^2$$
and (3) follows easily.

References

T. Estermann, On Goldbach's problem: Proof that almost all even positive integers are sums of two primes, Proc. London Math. Soc. (2) 44 (1938), 307-314.
H. L. Montgomery and R. C. Vaughan, Error terms in additive prime number theory, Oxford Quart. J. (to appear).
N. G. Tchudakoff, On the density of the set of even numbers which are not representable as a sum of two odd primes, Izv. Akad. Nauk SSSR Ser. Mat. 2 (1938), 25-40.
J. G. van der Corput, Sur l'hypothèse de Goldbach pour presque tous les nombres pairs, Acta Arith. 2 (1937), 266-290.
R. C. Vaughan, On Goldbach's problem, Acta Arith. 22 (1972), 21-48.
I. M. Vinogradov, Representation of an odd number as a sum of three primes, C. R. (Dokl.) Acad. Sci. URSS 15 (1937), 169-172.
I. M. Vinogradov, Some theorems concerning the theory of primes, Mat. Sb. 2 (44) (1937), 179-195.

The University, Sheffield, Great Britain
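The exceptional set $E(X)$ defined at the start of this note, and the counting function $R_2(n)$, can be tabulated directly for small $X$. The following is a minimal brute-force sketch, assuming the definitions $E(X) = \{2m \le X : 2m \ne p_1 + p_2\}$ and $R_2(n)$ as the number of ordered pairs of primes with sum $n$; the function names are hypothetical and the method is plain enumeration, not the analytic method of the paper.

```python
import math


def prime_flags(limit: int) -> bytearray:
    """Return flags with flags[k] == 1 exactly when k is prime, 0 <= k <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0] = 0
    if limit >= 1:
        flags[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return flags


def r2_count(n: int, flags: bytearray) -> int:
    """Number of ordered pairs of primes (p1, p2) with p1 + p2 = n."""
    return sum(1 for p in range(2, n - 1) if flags[p] and flags[n - p])


def exceptional_set(x: int) -> list:
    """Even numbers 2m <= x that are not a sum of two primes."""
    flags = prime_flags(max(x, 2))
    return [n for n in range(2, x + 1, 2) if r2_count(n, flags) == 0]


if __name__ == "__main__":
    print(exceptional_set(2000))
```

Within the small range checked here the only exceptional even number is 2, in agreement with Goldbach's conjecture $E(\infty) = \{2\}$; the estimate (2) concerns how sparse $E(X)$ must be for large $X$ without assuming the conjecture.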
ON EUCLIDEAN RINGS OF ALGEBRAIC INTEGERS

PETER J. WEINBERGER

1. Let $R$ be a commutative ring with unit. Then $R$ is said to be Euclidean if there is a function $E\colon R \to \{0, 1, 2, \ldots\}$ such that $E(0) = 0$ and, for all $a, b \in R$ with $a \ne 0$, there exist $q, c$ with $b = qa + c$ and $E(c) < E(a)$.

The following test may be used to determine if a ring $R$ is Euclidean, and then to construct the function $E$. Let $E_0 = \{0\}$, and define $E_j$, $j \ge 1$, by
$$E_j - E_{j-1} = \{a \in R : \text{each residue class of } R/(a) \text{ contains an element of } E_{j-1}\}.$$
Note that $E_1 - E_0$ is the set of units of $R$. If
$$R = \bigcup_j E_j, \tag{1.1}$$
then $R$ is Euclidean, with $E(a) = \min\{j : a \in E_j\}$. Motzkin [5] proves that the condition (1.1) is also necessary if $R$ is to be Euclidean. The reader is further referred to the recent and readable paper of Samuel [6].

This paper is entirely concerned with those rings which are the rings of integers of an algebraic number field, and which are also principal ideal domains (abbreviated p.i.d.). This last condition is no restriction, since every Euclidean ring is a p.i.d. Of these rings it is known that those in $\mathbf{Q}((-19)^{1/2})$, $\mathbf{Q}((-43)^{1/2})$, $\mathbf{Q}((-67)^{1/2})$, $\mathbf{Q}((-163)^{1/2})$ are not Euclidean [6]. The other rings of the type considered in this paper, which are known to be Euclidean, all have the absolute value of the norm as their Euclidean function $E$. ($E(a) = |N(a)|$. The unadorned symbol $N$ will always denote the absolute norm.)

The purpose of this paper is to prove the following theorem.

Theorem 1. Let $K$ be an algebraic number field whose ring of integers both is a p.i.d. and has infinitely many units. Then a generalized Riemann hypothesis (abbreviated GRH) implies that the ring of integers of $K$ is Euclidean.

AMS 1970 subject classifications. Primary 12A05, 13F99; Secondary 12A40.
© 1973, American Mathematical Society
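The construction of the sets $E_j$ in §1 can be tried out on the simplest example, the ring of rational integers. The following Python sketch is an illustration only and is not part of the paper: the function name `motzkin_levels` and the truncation parameter `limit` are assumptions introduced here, and only integers of absolute value at most `limit` are considered.

```python
def motzkin_levels(limit):
    """Sketch of the E_j construction for the ring Z, truncated to |a| <= limit.

    Returns a dict mapping each integer a with |a| <= limit to
    E(a) = min{j : a in E_j}, with E(0) = 0.
    """
    level = {0: 0}      # E_0 = {0}
    current = {0}       # the set E_{j-1} (truncated to |a| <= limit)
    candidates = [a for a in range(-limit, limit + 1) if a != 0]
    j = 0
    while len(level) < 2 * limit + 1:
        j += 1
        # a joins E_j when every residue class mod a meets E_{j-1}
        new = [a for a in candidates
               if a not in level
               and len({e % abs(a) for e in current}) == abs(a)]
        if not new:
            break       # the construction stalls; (1.1) would fail
        for a in new:
            level[a] = j
        current.update(new)
    return level
```

For the integers this sketch reproduces the expected pattern: the units $\pm 1$ land in $E_1$, then $\pm 2, \pm 3$ in $E_2$, then $\pm 4, \ldots, \pm 7$ in $E_3$, so that $E(a)$ is the number of binary digits of $|a|$.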
A sufficient set of Riemann hypotheses is given in §4. The five remaining complex quadratic fields with class number one are known to be Euclidean [1, p. 213].

In §2 the proof of Theorem 1 is reduced to the proof of Theorem 4. In §3, following Hooley [3], the proof of Theorem 4 is brought to where analytic techniques are applicable. §4 applies them, and the rest of the paper contains the rest of the proof.

2. Henceforth assume that $K$ satisfies the hypothesis of Theorem 1 and that $R$ is its ring of integers. We make use of the $E_j$ notation from §1. The first step is to show that Theorem 1 can be deduced from the following theorem.

Theorem 2. GRH implies that every irreducible of $R$ is in $E_3$.

An irreducible, $b$, of $R$ is an element such that the principal ideal $(b)$ is prime. If $\mathfrak{m}$ is a divisor of $K$, and $\mathfrak{a}$, $\mathfrak{b}$ are two nonzero ideals, recall that $\mathfrak{a} \equiv \mathfrak{b} \pmod{\mathfrak{m}}$ is defined to mean that $\mathfrak{a}\mathfrak{b}^{-1}$ is a principal ideal $(c)$ such that $v_{\mathfrak{p}}(c - 1) \ge v_{\mathfrak{p}}(\mathfrak{m})$ for all primes $\mathfrak{p}$ of $K$.

Lemma 2.1. Let $m$ be any nonzero element of $R$. Then every prime residue class of $R/(m)$ contains infinitely many irreducibles of degree one. In particular, if the conclusion of Theorem 2 holds, each such residue class contains an irreducible which is in $E_3$. (A prime residue class is one which is a unit in the ring $R/(m)$.)

Proof. Let $\mathfrak{m} = (m)$ be the principal divisor of $K$ corresponding to $m$. Choose any $b$ prime to $m$. By an extension of Dirichlet's theorem on primes in arithmetic progression [2, Volume I, p. 32], there are infinitely many prime ideals $\mathfrak{q} = (q) \equiv (b) \pmod{\mathfrak{m}}$, where $\mathfrak{q}$ is of degree one and prime to $\mathfrak{m}$. Then, by definition, $q/b - 1 \in (m)$, so $m \mid (q - b)$. Q.E.D.

If $b$, not zero, is the product of $n_2 + n_3$ irreducibles, $n_j$ from $E_j$, $j = 2, 3$, define the height of $b$ by $\mathrm{ht}(b) = 2n_2 + 3n_3$.

Theorem 3. GRH implies that $b \in E_{\mathrm{ht}(b)}$. Hence Theorem 2 implies Theorem 1.

Proof.
Since the theorem is clearly true if $\mathrm{ht}(b) \le 2$, it suffices to show that every residue class mod $b$ contains an element whose height is less than $\mathrm{ht}(b)$. For the purpose of this proof, say that $b$ has type $(n_2, n_3)$. Aside from the class containing zero, all residue classes mod $b$ are of the form
$$B_1(n_2 - t_2, n_3 - t_3) \cdot \text{prime residue class mod } B_2(t_2, t_3),$$
where $B_i = B_i(j, k)$ is a divisor of $b$ of type $(j, k)$. The case with $(t_2, t_3) = (0, 0)$ gives the zero class. If $(t_2, t_3) = (1, 0)$, then any prime residue class mod $B_2$ contains an element of $E_1$, which is a number of type $(0, 0)$, so the residue class mod $b$ contains a number of height $2(n_2 - 1) + 3n_3 < \mathrm{ht}(b)$. If $(t_2, t_3) = (0, 1)$, any prime residue class mod $B_2$ contains an element of $E_2$, which must be either of type $(0, 0)$ or $(1, 0)$. Hence the residue class mod $b$ contains a number of height $2(n_2 + \delta_2) + 3(n_3 - 1) < \mathrm{ht}(b)$, where $\delta_2 = 0$ or $1$. For any other choice of $(t_2, t_3)$, Lemma 2.1 gives an element of type $(\delta_2, \delta_3)$ in each prime residue class mod $B_2$, where $\delta_2, \delta_3 = 0, 1$ and not both $\delta_2, \delta_3$ are $1$. Therefore the residue class mod $b$ contains an element of type $(n_2 - t_2 + \delta_2, n_3 - t_3 + \delta_3)$ and this element has height $< \mathrm{ht}(b)$. Q.E.D.

For an arbitrary irreducible $p$ of $R$ let $\mathfrak{p} = (p)$. Let $I$ be the multiplicative group of ideals of $K$ prime to $\mathfrak{p}$, and let $H = \{\mathfrak{a} \in I : \mathfrak{a} \equiv (1) \pmod{\mathfrak{p}}\}$. Then $I/H$ is a finite group. Note that $[I:H] = 1$ if and only if $p \in E_2$. For if $(a, p) = 1$, then $p \mid (a - \text{unit})$ if and only if $(a) \equiv (1) \pmod{\mathfrak{p}}$.

We say that a unit, $\varepsilon$, of $R$ is a fundamental unit if $\varepsilon \ne \varepsilon_1^n$ for all units $\varepsilon_1$ and all integers $n > 1$. We say that $\varepsilon$ is a primitive root of the prime ideal $\mathfrak{p}$ if $\varepsilon$ generates the (cyclic) group of units in the finite field $R/\mathfrak{p}$. Fix a fundamental unit $\varepsilon$.

Theorem 4. Assume GRH. Let $\mathfrak{p}$ be a prime ideal of $K$. Let $H$ be as above. Then every ideal class of $I/H$ contains infinitely many prime ideals for which $\varepsilon$ is a primitive root.

Proof that Theorem 4 implies Theorem 2. Let $p$ be an irreducible of degree one, and let $\mathfrak{p} = (p)$. Let $(a, p) = 1$. Then Theorem 4 says that there is a prime ideal $\mathfrak{q} = (q) \equiv (a) \pmod{\mathfrak{p}}$ with $\mathfrak{q}$ prime to $\mathfrak{p}$. Then $q/a - 1 \in \mathfrak{p}$ so $p \mid (q - a)$.
Further, $q \in E_2$, since every residue class of $R/(q) = R/\mathfrak{q}$, except the one containing zero, contains a power of $\varepsilon$, which is in $E_1$. Q.E.D.

3. We have an algebraic number field $K$ with a fixed fundamental unit $\varepsilon$. Let $H$ be any ideal group in $K$ with a conductor; i.e., an ideal group of class field theory [2, Volume I, p. 61]. Let $h$ be a member of $I/H$. We are interested in the number of prime ideals in $h$ for which $\varepsilon$ is a primitive root, since for certain $h$ Theorem 4 requires that this number be infinite.

Henceforth $q$ shall denote a rational prime, $k$ a square-free integer, and $\mathbf{Q}$ the field of rational numbers. Small letters $\mathfrak{a}, \mathfrak{b}, \mathfrak{p}, \mathfrak{q}$ shall denote ideals of $K$, with $\mathfrak{p}, \mathfrak{q}$ being reserved for primes, while the respective capital letters $\mathfrak{A}, \mathfrak{B}, \mathfrak{P}, \mathfrak{Q}$ will be used to denote ideals in certain extensions of $K$.
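The notion of a unit being a primitive root of a prime ideal can be made concrete in a small case. The sketch below is an illustration under assumptions that are not in the paper: it takes $K = \mathbf{Q}(\sqrt{2})$ with the fundamental unit $\varepsilon = 1 + \sqrt{2}$, and it uses the fact that a degree-one prime ideal above an odd rational prime $p$ corresponds to a root $s$ of $s^2 \equiv 2 \pmod{p}$, with $R/\mathfrak{p}$ identified with the integers mod $p$ by sending $\sqrt{2}$ to $s$. The function names are hypothetical.

```python
def multiplicative_order(g, p):
    """Order of g in the multiplicative group mod the prime p.

    Assumes g is not divisible by p (otherwise the loop would not end).
    """
    order, x = 1, g % p
    while x != 1:
        x = x * g % p
        order += 1
    return order


def unit_is_primitive_root(p):
    """For each degree-one prime above p in Z[sqrt(2)], given by a root s of
    s^2 = 2 (mod p), report whether the image 1 + s of the unit 1 + sqrt(2)
    generates the unit group of the residue field.

    1 + s is never 0 mod p, since (1 + s)(1 - s) = 1 - 2 = -1 (mod p).
    """
    return {s: multiplicative_order(1 + s, p) == p - 1
            for s in range(p) if (s * s - 2) % p == 0}
```

For instance, above $p = 7$ the two roots are $s = 3$ and $s = 4$; the unit maps to $4$ (of order $3$) in the first case and to $5$ (of order $6$) in the second, so it is a primitive root for only one of the two prime ideals. An empty result, as for $p = 3$, means that no degree-one prime lies above $p$.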
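If $\varepsilon$ is not a primitive root of $\mathfrak{p}$, then $\varepsilon$ must be a $q$th power residue mod $\mathfrak{p}$ for some $q \mid (N\mathfrak{p} - 1)$. Let $R(\mathfrak{p}, k) = 1$ if $k \mid (N\mathfrak{p} - 1)$ and $\varepsilon^{(N\mathfrak{p} - 1)/k} \equiv 1 \pmod{\mathfrak{p}}$, and $0$ otherwise. Define
$$N(z, \eta) = \operatorname{card}\{\mathfrak{p} \in h : N\mathfrak{p} \le z,\ R(\mathfrak{p}, q) = 0 \text{ for all } q \le \eta\},$$
$$P(z, k) = \operatorname{card}\{\mathfrak{p} \in h : N\mathfrak{p} \le z,\ R(\mathfrak{p}, q) = 1 \text{ for all } q \mid k\},$$
$$N(z) = \operatorname{card}\{\mathfrak{p} \in h : N\mathfrak{p} \le z,\ \varepsilon \text{ is a primitive root of } \mathfrak{p}\},$$
$$M(z, \eta_1, \eta_2) = \operatorname{card}\{\mathfrak{p} \in h : N\mathfrak{p} \le z,\ R(\mathfrak{p}, q) = 1 \text{ for some } q \in (\eta_1, \eta_2]\},$$
$$\xi_1 = (\log z)/6, \qquad \xi_2 = z^{1/2} (\log z)^{-2}, \qquad \xi_3 = z^{1/2} \log z.$$
These definitions, and the method for estimating $N(z)$, are due to Hooley [3]. Trivially, $N(z) = N(z, z - 1)$, and, as in [3],
$$N(z) = N(z, \xi_1) + O(M(z, \xi_1, z - 1)) \tag{3.1}$$
and
$$M(z, \xi_1, z - 1) \le M(z, \xi_1, \xi_2) + M(z, \xi_2, \xi_3) + M(z, \xi_3, z - 1). \tag{3.2}$$

Lemma 3.1. $M(z, \xi_3, z - 1) \ll z/\log^2 z$. (Throughout, the implied constants depend only on $K$, $\varepsilon$, $h$.)

Proof. If $\mathfrak{p}$ is counted in $M(z, \xi_3, z - 1)$, then $q \mid (N\mathfrak{p} - 1)$ and $\varepsilon^{(N\mathfrak{p} - 1)/q} \equiv 1 \pmod{\mathfrak{p}}$, for some $q$, $z^{1/2} \log z < q \le z - 1$. Hence all such $\mathfrak{p}$ divide
$$\prod_{m < z^{1/2}/\log z} |N(\varepsilon^m - 1)|.$$
Each divisor of this has norm at least 2, so that
$$2^{M(z, \xi_3, z - 1)} \le \prod_{m < z^{1/2}/\log z} |N(\varepsilon^m - 1)|.$$
All conjugates of $\varepsilon$ have absolute value $< A_1$, so
$$|N(\varepsilon^m - 1)| < (A_1^m + 1)^{[K:\mathbf{Q}]} < A_2^{m[K:\mathbf{Q}]},$$
where $A_1, A_2 \ll 1$. Hence
$$M(z, \xi_3, z - 1) \le [K:\mathbf{Q}] \frac{\log A_2}{\log 2} \sum_{m \le z^{1/2}/\log z} m.$$
Q.E.D.

Lemma 3.2. $M(z, \xi_2, \xi_3) \ll z \log\log z/\log^2 z$.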
EUCLIDEAN RINGS OF ALGEBRAIC INTEGERS 325 Proof. First, M(z, £2, f3)^Le2<,^3 p(2' ?)• For £2<<7^C3, P(z, q) ^card {piA/p^z, iVp = l (modg)} = Z Jcard{p.y^z,^=l(modq)}, since Np = pi for some p and 7. The term with 7= 1 is q logz glogz by the Brun-Titchmarsh theorem. Similarly, the term with 7 = 2 is <^zl,2/q, since p2== 1 (mod#) only if p= 1, -1 (mod#). If7 = 3, then /?<#, so there are at most 7 solutions o(pj=l (modq). Hence P(z9q)<z/qlogz + zll2/q + [K:Q]2<z/qlogz9 and Mz, £2, £3)«= L "1 « X V~lT^—\—2 * v-E.D. ^^tflogz fc<7^3 4 log2z log2z Since (3.3) N(r,«i)=lA*(0^.0. where le{k:q | k=>q^<Ji}, we now turn our attention to estimating P(z, /c). Let L = Lk = K(Ck, ^1/k), where £k is a primitive kth root of one. Lfc/X is normal. #(p, fc)=l if and only if p splits completely in Lk. For the rest of this section fix k. Let f be the product of k and the conductor of H. Let IL be the multiplicative group of ideals of L prime to f, and let HL = {2le/L:iV£2le//}. Then HL is an ideal group defined modf, for if 21=(1) (modf), then 21=(a) and a= 1 (modf). But then, since f is an ideal of K, every conjugate, a', of a over K satisfies a'== 1 (modf) so iV£(a)= 1 (modf), so N%HeH. Hence /L/#L is a finite group. Let H be the group of characters on IL/HL. Each character in H extends to a function on integral ideals of L in the usual way: if (<P, f)= 1, then x{V) = x{^HL), otherwise x($)=0; and x is extended multiplicatively. Let 23 be a fixed integral ideal of L, prime to f. Let
326 PETER J. WEINBERGER Then (3.4) £ *(93) tc(z, x)=[/L: J/L] card{$eL:JV^z, $eSHL}. xeH 4. In this section k and ^gH are fixed. Let £&*)= I -^r. Re(s)>l; let ^0, with conductor f0, be the primitive character equivalent to %; let n — [L\ Q~], A =disc(L/0; let rx be the number of real conjugates of L (0 unless /c^2), r2 = (n —rx)/2; let r3, with 0^r3gr1, be the number of real archimedean valuations on L whose sign affects x\ finally, let Now L(s, x) may be analytically continued to the whole s-plane as an entire function unless x is principal, when it has a simple pole at s= 1 [4, Satz LXIII]. In any case (s — 1) L(s, x) is an entire function of order 1. L{s, x)*0 for Re(s)^ 1. For z^ 1, <A(z, x) = E(x) z-{r0 logz + ao)-(y-r3) log(l ^"^(^"E^ where E(x) = 1 if x is principal and zero otherwise, L , and where g = p + iy runs over all zeros of L(s, x) with 0 < <r < 1 [4, Satz LXXXVIII]. e r->oo|y|<r converges for all z > 0 [4, Satz LXXXIX], and uniformly for z > 1 in any interval not containing an integer [4, Satz XC]. The use of the functional equation for L(s, x) to estimate ij/(z, x) is complicated by the fact that x may not be primitive. If x is not primitive, then MS„)-L(S„.)n(.-||)-MS.,.)ni(.-|).
EUCLIDEAN RINGS OF ALGEBRAIC INTEGERS 327 for roots of one, gm, and powers prime Pm. Let ^(f0) = 2-^-"'2(M|JVf0)1'2, *(«. xo)=^(fo)s r((s+1)/2^ r(s/2)"-3 /»'; l(s, Xo). Then, by defining #(s, *)=<P(s, Xo). we have [4, Satz LXIII] |<P(s,x)| = |*(l-s,x)|. Let rf=card{em:em = l}. Then [4, Satz LXV] r0=max{d-£(x)+r1 + r2-r3,0}«l + log(Arf) so that (4.1) f(z9 x) = E(X) z-a0-£ -+0(log(Nf) logz). As always, the implied constant does not depend on k or %, but only on K, e, h. We are now in position to estimate the infinite sum over zeros of L(s, x) by using the Phragmen-Lindelof principle and Jensen's theorem. The former can be applied in the vertical strip —3/2^(7^11/2. For <r^2, and so for cr= 11/2, \L(s, X)\^U2fScl cx <L On <r= -3/2, \L(s,x)\ = \f(s,x)\Ml-s,x)\£fif(s,xh where /(s, x) is the factor from the functional equation ^ [S' X) ((cos (7ts/2))r'+r2 -r3 (sin (7ts/2))r*+'3 T (s)") On a— —3/2, by Stirling's formula, f{s,xH(f2{AN\0f{\t\ + 2f\ c2<\. Hence, on a = — 3/2 and on a — 11/2, (4.2) |(s-l)L(s,%H(|r| + 2)^(JiVf)2, where c3 is a positive integer independent of k, x- Applying Phragmen-Lindelof
328 PETER J. WEINBERGER while to the function (s— 1) L(s, x)/(s + 2)C3" shows that (4.2) holds throughout the strip -3/2^(7^11/2. On (7 = 2, (4.3) |(s-l)L(s,*)| = C(2r. Let v(y) be the number of zeros of £(s, x)~(s~ 1) L(s, y) m the circle \s-{2 + iT)\£y. Then by (4.2), (4.3), 7/2 In {-7^= ^ j log|{(2 + iT+(7/2) *", *)l «0-log|£(2 + iT, Z)| 0 0 ^c4log(JJVf(|r| + 2)"), 7/2 7/2 [vJAdy^[viyLdy^v(3) log(7/6). 0 3 Then, since v(3)^> JV(T+1, Z)-iV(T, *), where N(T,x) is the number of zeros of L(s, x)in0<(7<l, -T<t<T, (4.4) JV(T + l,x)-N(r,x)<log(JNf(T + 2)"). Now (4.1), (4.4), and the argument in [3] immediately give *(*, *) = £(*) H(z) + 0(z1/2 log(JJVfz")), if all the zeros of L(s, x) with 0<(7<1 lie on the line o = \. This hypothesis, for each k and each ^eH, is the GRH sufficient for Theorem 1. Now |J| = |disc(L/0^^(disc(Xfc-£))^^(/cfc-1)^fcfcIX:e], JVf = N§(cond(fl)-ik)^c5k". Hence log(JJVfz")<^rc log/c + w logz, so (4.5) tt(z, Z) = £(Z) li(z) + 0([L:K] z1'2 log(/cz)). 5. In this section k is fixed and f, 23 are as defined near the end of § 3. Let C = C(k) be the ideal group in K, defined modf, generated by norms of ideals
EUCLIDEAN RINGS OF ALGEBRAIC INTEGERS 329 from L. For later purposes, it is necessary to observe that C is also the group generated by norms of ideals from E = Ek, the maximal Abelian subextension ofL/tf [2,VolumeH,p. 167]. Lemma 5.1. N% induces an isomorphism ofILjHL onto C/CnH. Proof. All ideal groups are given modf so the definition of C implies that the map is onto. If NftleCnH, then the definition of HL shows that 2Iei/L, so the map is injective. Q.E.D. As an immediate consequence, NkV = N£® (mod H) iff <$ = 95 (mod HL). Therefore, the right-hand side of (3.4) equals (5.1) [C:Cnif]card{^c:L:N^^z,iV^95 = JV^(modi/)}. If possible, choose 95 such that Nfy&eh. If this is possible, define F(k)= 1, otherwise define F(/c)=0. If p is counted in P(z, k), then p = Y\fj^ yj9 with distinct %, and A#«P, = p. Hence (5.1) equals (5.2) lC:CnIQ.[[L:K]P{z9k)F{k) + 0{Ai+A2)]9 where Ax is the number of prime ideals of L which are ramified over K, and i42 = card{^c=L:NV^z,cleg(^/jK)^2}. It is clear that the only prime ideals of L ramified over K are among those dividing k, so A^lL.Q] \ogk = n\ogk. By the prime number theorem, A24[L:Q~] zl/2/\ogz<nz1/2. Combining these last two estimates, (5.2), (4.5), and (3.4) gives (53) p<z-*»-Ed^U+0(2"Mo8(tz))
330 PETER J. WEINBERGER 6. We now can finish the estimation of (3.2) by estimating M(z, £2, £3). Note that [Lfc:X] = /c[X(Cfc):X] and that, for sufficiently large q, [K{Q:K] = q--\. Hence (61) +^+z1'2 li(£2)log£2z«z/log2z. si To complete the proof of Theorem 4, it suffices to know N(z, £x), by (3.1). Substituting from (5.3) in (3.3) gives As in [3], all values of/are ^z1/3, so the error term in (6.2) is <^z5/6 logz<^z/log2z. Lemma 6.1. F(k) is multiplicative. Proof. It is clear that F(k)—0 if hr\C(k) is empty, for then h contains no norms from Lk. Conversely, if beC(k)nh, then there exists an 9lc:Lfc such that b = Af£2l(modf), so that N^eh, so F(k) = 1. Hence F{k)= 1 if and only if hnC(k) is not empty. Now C(k) is class group to Ek, the maximal Abelian subextension of Lk/K. But Ek is the compositum of all Eq with q | /c, so C(k)=f|C(«) and C(/c)n/z = n (C(«)nft). Q.E.D. We now turn to the factor [C(/):C(/)n#]. Let i/={ac:A::a = (l) (modp)}, where p is a prime (ideal or divisor) of K. There is a unique rational prime p divisible by p. In the rest of the estimation of N(z) it is convenient to make use of the fact that jR is a p.i.d. Lemma 6.2. IfpXK then F{k)=\ and[C(k): C{k)nH] = [/: H\ Ifp \ k, then [C(/c):C(/c)nH] = [/:H]/[/://C(p)]. Proof. The conductor of H divides p, and so is either p or one. If it is the latter, then I/H is the absolute class group, so I=H since R is a p.i.d. and the lemma follows. Otherwise, the conductor of H must be p. Let K' be class field to H, so K'jK is ramified exactly at p. Now
EUCLIDEAN RINGS OF ALGEBRAIC INTEGERS 331 [C{k):C(k)nH] = [HC(k):H], and HC(k) is class group to K'nEk. Ek can only be ramified at those primes dividing k, so if pjfk, then HC(k) = I, so C(k)/C(k)r\H is isomorphic to I/H, which gives the first conclusion of the lemma. Finally, if p \ k, then k = pm with (m, p)= 1. Then Epmc:Ep(Cm, e1/m) so Epm/Ep is unramified at any prime dividing p. On the other hand K/K is totally ramified at p [2, Volume I, p. 31], so K'r\Epm must be a totally ramified, over p, extension of K'r\Ep. Hence K'nEp = K'nEpm. But HC{k) is class group to K'nEpm, so HC(k) = HC(p). Q.E.D. Lemma 6.1 and Lemma 6.2, (6.2) and (3.2) give U [/:H]V P[K(C,):K] yW.Ul ^(O^ V log2z Now for large enough z,q>£i implies that [K(tq):K]=q— 1, so I logfl- 1 V z V-^+°(i/93)«i/^1. q[.K(Q:K]J qntq(q-l) Therefore (6.3) N{zy_m L F{p)U:C{p)H-\ \I:K\\ p[K(C,):K] qVP\ q[K(Q:KV \ log2z The infinite product is not zero, so Theorem 4 will be proved if the second factor can be shown to be positive. The factor in question can only be zero if F(p) = 1 and [/:^C(/?)]=/7[/i:(Cp):A:]. This last implies that K'nEp = Lp, so L = K(si,p)czK'. But then \Lp: A:] =/?, while \K': K] = [/: H~\ which divides the number of prime residue classes of R/p, which ispJ— 1 for somey. This contradiction completes the proof of Theorem 4, and therefore Theorem 1 is proved. Bibliography 1. G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, Oxford Univ. Press, 1965, p. 213. 2. H. Hasse, Bericht iiber neuere Untersuchungen und Probleme aus der Theorie der algebraischen Zahlkorper. Teile I, II, Zweite Auflage, Physica-Verlag, Wurzburg-Vienna, 1965. MR 33 #4045a,b.
3. C. Hooley, On Artin's conjecture, J. Reine Angew. Math. 225 (1967), 209-220. MR 34 #7445.
4. E. Landau, Über Ideale und Primideale in Idealklassen, Math. Z. 2 (1918), 52-154.
5. T. S. Motzkin, The Euclidean algorithm, Bull. Amer. Math. Soc. 55 (1949), 1142-1146. MR 11, 311.
6. P. Samuel, About Euclidean rings, J. Algebra 19 (1971), 282-301.

University of Michigan
AUTHOR INDEX Italic numbers refer to pages on which a complete reference to a work by the author is given. Roman numbers refer to pages on which a reference is made to a work of the author. For example, under Oppenheim would be the page on which a statement like the following occurs: "This result had been conjectured by Oppenheim [13] in 1929,..." Boldface numbers indicate the first page of the articles in this volume. Abramowitz, M, 30 Anderson, T. W., 205,210 Ankeny,N.C.,214,219 Artin, E., 100,110,285,302 Artin, M., 60 Ayoub, Raymond, 30 Babu, G. Jogesh, 236,246 Baker, A., 1,1, 2, 4, 6, 6, 7, 178, 178 Baker, R. C, 210,210 Bambah, R. P., 137,139 Barton, D. E., 202,210 Bass, H., 58, 60 Bateman, PaulT., 75,154, 157, 266 Bauer, M., 97,100 Beach, B. D., 280,283 Berndt, Bruce C, 9,10,13, 30, 115, 121 Billing, G., 222,230 Billingsley, Patrick, 204, 210, 233, 234,235,246 Birch, B. J., 6, 7,159,165,166, 168, 171,274 Birnbaum, Z. W., 207, 208,210 Bombieri, E., 31, 32, 36, 40, 42, 49, 100 Brauer, A., 216,219 Brauer, R., 160,174 Brown, J. W., 157 Buhstab, A. A., 214,219 Burgess, D. A., 213, 215, 216,219 Burnside, W., 100 Casselman, W., 230 Cassels, J. W. S., 166,174, 210, 210, 230 Chandrasekharan, K., 266 Chowla, S., 84, 90, 137, 139, 160, 274,268,283 Chung, K. L., 204,211 Cigler, J., 195, 201, 210,211 Coates, John, 5, 7, 51,51, 60 van der Corput, J. G., 319 Cramer, H., 136,139 Darling, D. A., 202, 205,210,211 Davenport, H., 31, 49, 84, 90, 159, 160, 161, 165, 166, 168, 170, 172, 174,193,233,246 Dedekind, R., 96,100,228,230 Dem'janenko, V. A., 221,230 Diamond, Harold G., 63, 154, 157, 266 Donsker, M. D., 204,211 Doob, J. L., 204,211 Dorge, K., 91,100 333
334 AUTHOR INDEX Dressier, Robert E., 75 Durbin, J., 207,209,211 Dvoretzky, A., 206,211 Elliott, P. D. T. A., 32, 49, 77, 91, 93,100, 214, 215,219 Epanecnikov, V. A., 209,211 Erdelyi, A., 121,211 Erdos, Paul, 75, 78, 82, 83, 84, 90, 125, 126, 134, 136, 138, 139, 140, 210,211,215,219,233,246 Estermann, T., 140,319 Fainleib, A. S., 74, 75 Fel'dman, N. I., 3,4, 7,178,178 Feller, W., 204,211 Forti, M., 31 Frechet, M., 211 Fricke, R., 222,231 Gabriel, R. M., 42,49 Gal, L., 210,211 Gal, S., 210,211 Gallagher, P. X., 31, 32,49,91 GaposTrin, V. F., 210,211 Garland, H., 60 Goldstein, Larry Joel, 103,103,104, 110,303,304 Grosswald, Emil, 13,30, 111, 121 Halasz, G., 31,49 Halberstam, H., 31,49,247,247,249 Hall, R. S., 157 Hardy, G. H., 75,124,126,135,140, 147, 150, 154, 157, 218, 219, 331 Haselgrove, C. B., 147,153,157 Hasse, H., 285,302,331 Hecke, E., 122, 223, 228,231 Heilbronn, H., 160,174,303,304 Helmberg, G., 195,201, 210,211 Hensley, Douglas, 123,123,126 Hlawka, E., 91,93,100, 201,211 Hooley, Christopher, 105, 110, 129, 134, 135, 137, 140, 322, 324, 332 Huxley, Martin N., 43, 49, 93, 100, 141,145 Ingham, A. E., 75, 127, 147, 153, 154,156,157 Iwasawa, K., 52, 53, 55,56, 60 Jarnik.V., 161,174 Jurkat, W. B., 147,157 Kac, M., 75,233,246 Kemperman, J. H. B., 196, 209, 211 Kiefer, J., 206,211 Klingen, H., 122 Kloss, K. E., 157 Knobloch, H.-W., 91,100 von Koch, H., 149,157 Koksma, J. F., 195, 201, 210, 211 Kolmogorov, A., 211 Kornblum, H., 100 Kubilius, J., 75, 77, 82, 233, 236, 237,244, 245,246, 248, 249 Kubota, T., 52,54, 60 Kuipers,L., 195,201,222 Landau, Edmund, 193,332 Lang, S., 108,110 Lavrik, A. F., 249,249 Lehmer, D. H., 267, 269,283 Lehmer, Emma, 267,283 Leopoldt, H. W., 52,54, 56,60 Lewis, D. J., 159,161,170,174 Lewittes, Joseph, 13,30 Lichtenbaum, S., 51, 60,122 Ligozat.G., 229,231 Lindstrom, B., 85, 90 Linfoot, E. H., 303,304 Linnik, Ju. V., 31, 49, 218,219, 233, 244, 245,246 Littlewood, J. 
E., 124,126,135,140, 147, 149, 157, 218, 219, 267, 268, 283 Loeve, Michel, 246 Mahler, K., 83,84,90,175,176,179, 222,230 Mallows, C.L., 202,210 Mandelbrojt, S., 266 Manin, Ju. I., 286,302
AUTHOR INDEX 335 Masser, D. W., 5, 7 Massey, F. J., 207,211 Mazur, B., 226, 231 Mehta, M. L., 193 Meijer, H. G., 207,212 Meyer, Y., 212 Minor, J., 60 Mirsky, L., 139,140, 208,212 Montgomery, Hugh L., 31, 32, 33, 43, 45, 49, 91, 100, 124, 127, 145, 181, 193, 214, 218, 219, 262, 319 Motzkin, T. S., 321,332 Neubauer, G., 152,158 Niederreiter, H., 195, 195, 201, 207, 211,212 Norton, Karl K., 213, 215, 219, 220 Ogg, A. P., 221, 221, 231 Oppenheim, A., 159, 160, 168, 174 Osgood, C. F., 4, 7 Owen, D. B., 204, 207,212 Philipp, Walter, 206, 210, 212, 233, 237,246, 249 Pitman, Jane, 168,174 Prohorov, Ju. V., 233, 246 Rademacher, Hans, 28,30 Ramachandra, K. F., 6, 7 Rankin, R. A., 138,140 Renyi, A., 31,49, 204,212 Richards, Ian, 123,123,126 Richert, H.-E., 138, 140, 247, 247, 249 Ridout, D., 159,165,168,174 Rosenblatt, M., 206,212 Rosser, J. B., 193 Roth, K. F., 31, 49, 251, 251, 262 Ryavec, C, 263, 264, 266 Samandarov, A. G., 93,100 Samuel, P., 321, 332 Schaal, W., 93,100 Schinzel, A., 4, 7,127 Schmid, P., 205,212 Schmidt, Wolfgang M., 4, 7, 176, 178,179,251,262 Schneider, T., 176,179 Schoenberg, I. J., 63, 75 Schoeneberg, B., 28,30 Schoenfeld, L., 193 Schrutka, V., 60 Selberg, Atle, 93, 97, 101, 136, 140, 193 Selfridge, J. L., 126 Serfling, R. J., 233, 245,246 Serre, J.-P., 54, 61 Shanks, Daniel, 267, 267,283 Siegel, C. L., 28,30,51,59,61 Sierpinski, W., 127 Simpson, P. B., 205,212 Singer, J., 85,90 Skewes, S., 156,158 Smirnov.N.V., 205, 212 Specht, W., 101 Spira, R., 153,158 Sprindzuk, V. G., 3, 7,176,179 Stark, H. M., 1,2, 7,285 Stegun, I. A., 30 Stemmler, R. M., 157 Stepanov, S. A., 286,302 Sterneck, R. D. v., 152,158 Stickelberger, L., 55, 61 Sunley, Judith S., 303,304 Swinnerton-Dyer, H. P. F., 229,231 Szemeredi, E., 83 Tate, J., 51,58, 61 Tatuzawa, T., 303, 304,304 Tchebotarev, N., 99,101 Tchudakoff, N. G., 319 Tijdeman, R., 207,212 Titchmarsh, E. C, 30, 151, 153, 158,193 Tjan, M. 
M., 74, 75 Turan, P., 41, 49, 77, 82, 92, 101, 305, 305, 308, 309, 310, 312, 314 Uzdavinis, R. V., 236,246
Vaughan, Robert C., 193, 315, 319
Vinogradov, I. M., 319
Viola, C., 31
van der Waerden, B. L., 91, 101, 212
Wald, A., 202, 212
Walfisz, A., 161, 174
Warlimont, R., 93, 101
Watson, G. L., 172, 174
Watson, G. N., 30
Weil, A., 52, 61, 117, 122, 285, 302
Weinberger, Peter J., 105, 110, 321
Weyl, H., 203, 212
Whittaker, E. T., 30
Williams, H. C., 280, 283
Wilson, R. J., 93, 101
Wintner, A., 78, 82
Wirsing, E. A., 6, 7
Wolfowitz, J., 202, 206, 211, 212
Wright, E. M., 75, 331
Yohe, J. M., 193
Yuan, Wang, 214, 220
Zaremba, S. K., 206, 212
SUBJECT INDEX Abscissa aw of quasizerofreeness of f 00,308 additive form, 160 additive function, 77, 233 adjoint operator, 34 additive theory of primes, 314 admissible sequence, 123 algebraic numbers, linear forms in logarithms of, 1 rational approximations to, 3 algebraic number field, totally real, 303 almost periodic distributions, 151, 154,155 Artin conjecture, 103 Bernoulli function, 10 generalized, 10,11 Bernoulli number, 111 generalized, 29,112 Bernoulli polynomial, 10 Bessel's inequality, generalized, 143, 252 Birch and Swinnerton-Dyer, conjecture of, 226 Bombieri's large sieve inequality, 92 Brownian motion, 233, 234, 235 Bran's sieve, 134, 247, 248 character sum, 215, 216, 244 class, 175 Class A, 176 class number, 183, 269, 303 class number 2 problem, 4 Class S, 176 Class T, 176 Class U, 176 consecutive power residues, 213 consecutive primes, 141 converge in distribution, 233, 234 Cramer-Smirnov test, 203, 205, 206 cycle type, 98 Dedekind character sum, 19 generalized, 23 Dedekind eta-function, 9,117, 228 generalized, 28 Dedekind sum, 9,10 generalized, 10, 24 Dedekind zeta function, 112, 113, 117 5-well spaced, 32 density hypothesis, 311 diagonal form, 160 Dickman's function, 213 Diophantine equations, 3 Dirichlet character, generalized, 32 Dirichlet series, 112,114, 267 "Dirichlet series" operator, 34 Dirichlet's class number formula, 51 337
338 SUBJECT INDEX discrepancy, 195 extreme, 196, 197, 198, 201, 202 local, 197, 201 If, 197 distribution-free, 202, 203, 205, 206 distribution function, 130 empirical, 196, 201, 203 distribution of points in Euclidean space, 252 "Eisenstein series", 223 elliptic curve, 221 elliptic functions, 5 empirical distribution function, 196,201, 203 equidistributed sequence, 104 error term, 316 etale cohomology, 53 Euclidean ring, 321 Euler's phi function, 63 even discriminant, 270 exponential sum, 306 extended Riemann hypothesis, 214, 218, 282 extreme discrepancy, 196,197, 198, 201, 202 fourth moments of L- functions, 43 Fundamental Lemma, 248 Gaussian process, 204, 205, 206 Gaussian sum, 120 generalized Bernoulli function, 10, 11 generalized Bernoulli number, 29, 112 generalized Bessel's inequality, 252 generalized Dedekind character sum, 23 generalized Dedekind eta-function, 28 generalized Dedekind sum, 10, 24 generalized Dirichlet character, 32 generalized prime number system, 263 Glivenko-Cantelli theorem, 203 Goldbach's problem, 315 graph of (n,<p(n)),64 Haar measure, 104 Halasz, method of, 143 Hardy-Littlewood conjecture, 136 Hardy -Littlewood method, 160 Hardy-Littlewood Tauberian theorem, 218 Hecke operator, 226 hichamp, 276 Hurwitz zeta-function, 12 hyperelliptic fields, Riemann hypothesis in, 285 indefinite diagonal form, 160 integer interval, 217 intervals, distribution of, 129 invariance property, 175 iterated logarithm, law of, 195 196, 204, 205,206, 210 Jacobi symbol, 244 Jacobian, 222 ^-hypothesis, 83, 84 Kolmogorov's limit theorem, 19€ 204, 207, 209 Kolmogorov's two-sided test, 196 201, 202 K-theory, 58 lacunary sequence, 196, 210 /-adic zeta function, 52 Lambert series, 119 large numbers, law of, 80 large sieve, 248 large sieve inequalities, 31 law of iterated logarithm, 195, IS 204, 205,206,210 law of large numbers, 80 Legendre symbol, 233 Lehmer's delay-line sieve DLi 157, 269 Lichtenbaum, conjecture of, 112
SUBJECT INDEX 339 Lindeberg condition, 233, 235 Lindelof hypothesis, 40, 308 weak, 309 Lindelof ^-function, 33 linear forms in the logarithms of algebraic numbers, 1 Lipschitz summation formula, 11 Littlewood indices, 267 Littlewood's bounds on L(l, x), 267 local discrepancy, 197, 201 localization principle, 36 lochamp, 276 logarithms of algebraic numbers, linear forms in, 1 lower Littlewood index, 267 If discrepancy, 197 Mertens conjecture, 148,152 Meyer function, 113 Meyer's G- function, 119 modular group, 221 mutual influence of primes and zeta roots, 308 non-Riemannian zero, 282 nontrivial zero of f(s), 305 numbers prime to n, 130, 213, 217 numeri idonei, 5 Q-theorems, 149,150,156,157 order function, 177 pair correlation function, 184 perturbed zeta function, 264 positive definite quadratic form, 160 power residues, 213 prime ^-tuples conjecture, 123 prime number system, generalized, 263 prime number theory, 316 prime-pair, 137 primes, 130, 305 additive theory of, 314 pseudosquare, 275 quadratic character, 268 quadratic extension, totally imaginary, 303 quadratic form, 159 positive definite, 160 quasi-orthogonal functions, 252 constructed from weighted strips, system of, 252 quasi-prime, 249 quasi-Riemann conjecture, 312 rational approximations to algebraic numbers, 3 rational cusp, 222 rational group, 229 rational point, 221 ray-class character, 114 reciprocity formula, 9,10, 20 reciprocity law, 10, 26 regulator, 269 Riemann hypothesis, 103,107, 136,181, 267, 321 extended, 214, 218, 282 in hyperelliptic fields, 285 Riemann zeta function, 63, 111, 181, 305 Riesz-Thorin theorem, 33 Schoenberg, theorem of, 63 Selberg's sieve, 134, 136, 218, 248 sequence, 129 equidistributed, 104 lacunary, 196, 210 uniformly distributed, 195, 197 Siegel zero, 315 sieve, Bombieri's, 92 Bran's, 134, 247, 248 large, 248 Lehmer's delay-line, DLS-157, 269 Selberg's, 134,136, 218, 248 trigonometrical of I.M.Vinogradoff,311 singular function, 75 splitting type, 95 square-free number, 130
340 SUBJECT INDEX system of quasi-orthogonal functions constructed from weighted strips, 252 system of representatives, 208 tame kernel, 58 Tchebychev's inequality, 92 Thue, theorem of, 3 totally imaginary quadratic extension, 303 totally real algebraic number field, 303 transcendental numbers, 175 transversal theory, 196, 207, 208 trigonometrical sieve method of I. M. Vinogradoff, 311 two squares, sum of, 130 uniformly distributed sequence, 195,197 upper Littlewood index, 267 Waring's problem, 6, 84 weak convergence of probability measures, 233 weak Lindelof hypothesis, 309 weighted strips, 252 system of quasi-orthogonal functions constructed from, 252 wild, 56 zero, non-Riemannian, 282 ofL(s,x),33, 282 of f(s), nontrivial, 181, 305 Siegel, 315 zero-density theorem, 141 zeta function, Dedekind, 112,113,117 Hurwitz, 12 perturbed, 264 Riemann, 63, 111, 181, 305 {>($), 263
ISBN 0-8218-1424-9 PSPUM/24 AMS on the Web www.ams.org