Author: Diamond H.G.
Tags: mathematics, number theory, higher mathematics, mathematical analysis
ISBN: 0-8218-1424-9
Year: 1972
Text
Proceedings of Symposia in
Pure Mathematics
Volume 24
Analytic
Number Theory
Symposium on
Analytic Number Theory
March 27-30, 1972
St. Louis, Missouri
Harold G. Diamond
Editor
American Mathematical Society
Providence, Rhode Island
PROCEEDINGS OF THE SYMPOSIUM IN PURE MATHEMATICS
OF THE AMERICAN MATHEMATICAL SOCIETY
HELD AT THE ST. LOUIS UNIVERSITY
ST. LOUIS, MISSOURI
MARCH 27-30, 1972
Prepared by the American Mathematical Society under
National Science Foundation Grant GP-32302
2000 Mathematics Subject Classification. Primary 11-02.
Library of Congress Cataloging-in-Publication Data
Symposium in Pure Mathematics, St. Louis University
1972.
Analytic number theory.
(Proceedings of symposia in pure mathematics, v. 24)
Includes bibliographies.
1. Numbers, Theory of— Congresses. I. Diamond, Harold G., 1940- ed.
II. American Mathematical Society. III. Title. IV. Series.
QA241.S88 1972 512'.73
ISBN 0-8218-1424-9 72-10198
Copying and reprinting. Individual readers of this publication, and nonprofit
libraries acting for them, are permitted to make fair use of the material, such as to
copy a chapter for use in teaching or research. Permission is granted to quote brief
passages from this publication in reviews, provided the customary acknowledgment of
the source is given.
Republication, systematic copying, or multiple reproduction of any material in this
publication is permitted only under license from the American Mathematical Society.
Requests for such permission should be addressed to the Assistant to the Publisher,
American Mathematical Society, P. O. Box 6248, Providence, Rhode Island 02940-6248.
Requests can also be made by e-mail to reprint-permission@ams.org.
Copyright © 1973 by the American Mathematical Society. All rights reserved.
Printed in the United States of America.
The American Mathematical Society retains all rights
except those granted to the United States Government.
The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability.
Visit the AMS home page at URL: http://www.ams.org/
10 9 8 7 6 5 4 3 04 03 02 01 00
CONTENTS
Foreword
Effective methods in Diophantine problems. II
By A. Baker
Character transformation formulae similar to those for the Dedekind eta-function
By Bruce C. Berndt
On large sieve type estimates for the Dirichlet series operator
By M. Forti and C. Viola
(Presented by Enrico Bombieri)
On Iwasawa's analogue of the Jacobian for totally real number fields
By John Coates
The distribution of values of Euler's phi function
By Harold G. Diamond
On connections between the Turan-Kubilius inequality and the large sieve: Some applications
By P. D. T. A. Elliott
On the number of solutions of $m = \sum_{i=1}^{k} x_i^k$
By P. Erdos and E. Szemeredi
The large sieve and probabilistic Galois theory
By P. X. Gallagher
Some remarks on arithmetic density questions
By Larry Joel Goldstein
Relations between the values at integral arguments of Dirichlet series that satisfy functional equations
By E. Grosswald
On the incompatibility of two conjectures concerning primes
By Douglas Hensley and Ian Richards
On the intervals between consecutive terms of sequences
By Christopher Hooley
The difference between consecutive primes
By Martin Huxley
On the Mertens conjecture and related general $\Omega$-theorems
By W. B. Jurkat
The distribution of the values of real quadratic forms at integer points
By D. J. Lewis
The classification of transcendental numbers
By K. Mahler
The pair correlation of zeros of the zeta function
By H. L. Montgomery
Metric theorems on the distribution of sequences
By H. G. Niederreiter
Bounds for sequences of consecutive power residues. I
By Karl K. Norton
Rational points on certain elliptic modular curves
By A. P. Ogg
Arithmetic functions and Brownian motion
By Walter Philipp
Brun's method and the fundamental lemma
By H.-E. Richert and H. Halberstam
Estimation of the area of the smallest triangle obtained by selecting three out of n points in a disc of unit area
By K. F. Roth
Euler products associated with Beurling's generalized prime number systems
By C. Ryavec
Systematic examination of Littlewood's bounds on $L(1, \chi)$
By Daniel Shanks
On the Riemann hypothesis in hyperelliptic function fields
By H. M. Stark
Class numbers of totally imaginary fields
By Judith S. Sunley
Exponential sums and the Riemann conjecture
By Paul Turan
A new estimate for the exceptional set in Goldbach's problem
By Robert C. Vaughan
On Euclidean rings of algebraic integers
By Peter J. Weinberger
Author Index
Subject Index
FOREWORD
A symposium on Analytic Number Theory and Related Parts of Analysis was
held at St. Louis University, St. Louis, Missouri, on March 27-30, 1972, in
conjunction with the six hundred ninety-third meeting of the American Mathematical
Society. The Organizing Committee for the symposium consisted of Harold G.
Diamond (chairman), Patrick X. Gallagher, Hugh L. Montgomery, Wolfgang M.
Schmidt, and Harold M. Stark.
Twenty-nine number theorists were invited to lecture on their recent research,
which covers a broad spectrum of contemporary work in number theory. The
program was arranged in seven half day sessions, chaired by Paul T. Bateman,
Paul Erdos, Lowell Schoenfeld, and the organizers.
This volume contains accounts of all the lectures presented at the symposium.
Paul Erdos, who could attend only the first hours of the symposium, also
contributed an article to the volume. The articles are arranged alphabetically
(according to the name of the speaker in the case of joint work).
The conference participants are indebted to a number of individuals and
organizations for their good planning and administration. In particular, mention
should be made of the work of AMS Associate Secretary Paul T. Bateman, Mrs.
Lillian Casey of the AMS, and Lawrence W. Conlon of St. Louis University.
Financial support for the symposium was provided by a grant from the National
Science Foundation.
It is hoped that the lively ideas presented at the symposium will be further
disseminated by this volume and will spawn new number theoretic research in the
years to come.
Harold G. Diamond
[Photographs of symposium participants: A. Baker, P. T. Bateman, Bruce C. Berndt, Enrico Bombieri, John Coates, Harold G. Diamond, P. D. T. A. Elliott, P. Erdos, P. X. Gallagher, Larry Joel Goldstein, E. Grosswald, Douglas Hensley, Christopher Hooley, Martin Huxley, W. B. Jurkat, D. J. Lewis, K. Mahler, H. L. Montgomery, H. G. Niederreiter, Karl K. Norton, A. P. Ogg, Walter Philipp, H.-E. Richert, K. F. Roth, C. Ryavec, L. Schoenfeld, Daniel Shanks, H. M. Stark, Judith S. Sunley, Paul Turan, Robert C. Vaughan, Peter J. Weinberger]
EFFECTIVE METHODS IN
DIOPHANTINE PROBLEMS. II
A. BAKER
1. Introduction. Three years ago, at a conference held in Stony Brook, I
surveyed the theories which had then recently been developed for the effective
resolution of a diverse collection of Diophantine problems [1] (see also [2]).
Since that time, several of the topics have been considerably expanded and I
should like to use the opportunity provided by the present Symposium to bring the
account up to date.
2. Lower bounds for linear forms. One of the most active fields of research
has been concerned with improved bounds for linear forms in the logarithms of
algebraic numbers. In particular, much study has been made of the special
situation, of considerable importance in applications, when one of the algebraic
numbers has a large height relative to the remainder. The primary result obtained
in this connexion reads as follows.
Theorem 1. Let $\alpha_1,\ldots,\alpha_n$, $\beta_1,\ldots,\beta_n$ be nonzero algebraic numbers with degrees
at most $d$, let $\alpha_1,\ldots,\alpha_{n-1}$ have heights at most $A'$ and let $\alpha_n$ and $\beta_1,\ldots,\beta_n$ have
heights at most $A$ and $B$ respectively. If $\varepsilon>0$, $\delta>0$ and
$$0<|\beta_1\log\alpha_1+\cdots+\beta_n\log\alpha_n|<e^{-\delta H}$$
for some $H>\exp((\log B)^{1/2})$, then $H<C(\log A)^{1+\varepsilon}$, where $C=C(n,d,\varepsilon,\delta,A')$ is
effectively computable.
Special cases of the theorem were proved by Stark [21] and myself [3] in
connexion with certain class number problems (see §4) and the full result was
AMS 1970 subject classifications. Primary 10-02, 10F35; Secondary 10B45, 10F25, 10H10, 12A25,
12A50.
obtained by a combination of our methods [9]. Previous work, as described in
[1], had led to a similar theorem but with $1+\varepsilon$ replaced by a number greater than
$n-1$. The condition $H>\exp((\log B)^{1/2})$ can be relaxed to $H>(\log B)^{cn^2/\varepsilon}$ for a
sufficiently large absolute constant $c$, provided that $\varepsilon<1$, and, furthermore, one
can replace $(\log A)^{\varepsilon}$ by some power of $\log\log A$, though this power is usually
large. Very recently, by further developments of the arguments, it has been shown
that, in the case when $\beta_1,\ldots,\beta_{n-1}$ are rational integers and $\beta_n=-1$, conditions
frequently satisfied in applications, the exponent $1+\varepsilon$ can be replaced by 1, which
is best possible.
Theorem 2 [5], [6]. Let $\alpha_1,\ldots,\alpha_n$ be nonzero algebraic numbers with
degrees at most $d$ and let the heights of $\alpha_1,\ldots,\alpha_{n-1}$ and $\alpha_n$ be at most $A'$ and $A$ $(\ge 2)$
respectively. If, for some $\varepsilon>0$, there exist rational integers $b_1,\ldots,b_{n-1}$ with absolute
values at most $B$ such that
$$0<|b_1\log\alpha_1+\cdots+b_{n-1}\log\alpha_{n-1}-\log\alpha_n|<e^{-\varepsilon B},$$
then $B<C\log A$ for some effectively computable number $C$ depending only on $n$, $d$,
$A'$ and $\varepsilon$.
Theorem 2 is, in fact, an immediate consequence of another theorem [6] to the
effect that there exists $C=C(n,d,A')$ such that, for any $\delta$ with $0<\delta<\tfrac12$, the
inequalities
$$0<|b_1\log\alpha_1+\cdots+b_n\log\alpha_n|<(\delta/B')^{C\log A}e^{-\delta B}$$
have no solution in rational integers $b_1,\ldots,b_{n-1}$ and $b_n$ $(\ne 0)$ with absolute values
at most $B$ and $B'$ respectively. Clearly, on taking $\delta=1/B$ and assuming $B'\le B$, the
number on the right becomes at least $C^{-\log A\log B}$ for some effectively computable
$C$, and this bound is best possible with respect to $A$ when $B$ is fixed and with respect
to $B$ when $A$ is fixed. Corollaries relating to the theory of Diophantine equations
will be discussed in §3.
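The substitution $\delta = 1/B$ can be written out as follows. This is only a sketch of the arithmetic behind the word "Clearly", assuming $A \ge 2$, $B \ge 2$ and $B' \le B$; the name $C_1$ is introduced here for illustration and does not appear in the original.

```latex
\[
\left(\frac{\delta}{B'}\right)^{C\log A} e^{-\delta B}
  \;=\; (BB')^{-C\log A}\, e^{-1}
  \;\ge\; B^{-2C\log A}\, e^{-1}
  \;=\; \exp\bigl(-2C\log A\,\log B - 1\bigr)
  \;\ge\; C_1^{-\log A\,\log B},
\]
\[
\text{where one may take, for instance, } C_1 = e^{2C+3},
\text{ since } \log A\,\log B \ge (\log 2)^2 > \tfrac13 .
\]
```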
The proofs of the above theorems depend upon several new developments in
the earlier works. In particular, the underlying auxiliary functions are now
considerably more involved; the argument leading to Theorem 2, for instance, utilizes
a function of the form
[display formula illegible in source]
where the $p(\lambda_{-1},\ldots,\lambda_n)$ are, as usual, rational integers, $\gamma_r=\lambda_r+b_r\lambda_n$ and
$$\Delta(z;k)=\frac{1}{k!}(z+1)\cdots(z+k),\qquad \Delta(z;k,l,m)=\frac{1}{m!}\frac{d^m}{dz^m}(\Delta(z;k))^l.$$
Further, the inductive nature of the expositions is substantially modified and the
ultimate contradiction is obtained now by an appeal to certain algebraic lemmas
relating to Kummer theory, quite different from the techniques employed
previously. The reader is referred to the original memoirs for details.
3. Diophantine equations. In my earlier survey, I discussed the fundamental
theorem of Thue on $f(x,y)=m$, where $f$ denotes an irreducible binary form with
integer coefficients and degree $n\ge 3$. More especially, I described how the theorem
could be made effective, and indeed how one could establish an upper bound
$$\max(|x|,|y|)<C\exp\{(\log m)^{\kappa}\},$$
applicable for all integer solutions $x$, $y$, where $\kappa>n$ and $C$ is computable in terms of
$\kappa$ and the coefficients of $f$. In view of Theorem 2, one can now strengthen the
number on the right to $Cm^{c}$, where $c$ can be computed like $C$, and this gives at once
Theorem 3. For any algebraic number $\alpha$ with degree $n\ge 3$ there exist positive
effectively computable numbers $c$, $\kappa$ depending only on $\alpha$, with $\kappa<n$, such that
$$|\alpha-p/q|>cq^{-\kappa}$$
for all rationals $p/q$ $(q>0)$.
Feldman [13] first obtained this result from a special case of Theorem 2,
involving certain restrictions on $\alpha_n$, and his arguments rested on rather different
adaptations in the basic theory of linear forms in logarithms: yet another approach,
employing p-adic analysis, was described by Sprindzuk [19], [20] at about the
same time.
Several other new theorems on rational approximations to algebraic numbers
follow from the general result cited after the enunciation of Theorem 2. Thus, for
instance, it shows that
$$|\alpha-pp'/qq'|>Q^{-\kappa\log\log Q'},$$
where $p'$, $q'$ are comprised solely of powers of fixed sets of primes and $Q$, $Q'$ are the
maxima of the absolute values of $p$, $q$ and $p'$, $q'$ respectively: this furnishes a further
improvement on Ridout's generalization of Roth's theorem (cf. the recent survey of
Schmidt [18]). Furthermore one sees that
$$|\alpha^{1/m}-p/q|>cq^{-\kappa\log m}$$
for any algebraic number $\alpha$, where $c$, $\kappa$ are positive numbers effectively computable
in terms of $\alpha$, and this is sharper than the Thue-Siegel inequality when the integer $m$
is large. The latter theorem recalls to mind the very first effective results in this
context, derived by means of special properties of Gauss's hypergeometric function
(see [1]). When applicable, this method gives surprisingly strong estimates for the
solutions of Diophantine equations, and it has not been dormant. In particular,
Feldman [12] and Osgood [15], [16] have widely applied ideas of this nature to
study effectively certain equations of norm form in several variables.
4. Class numbers. I described at Stony Brook the transcendental method for
determining all the imaginary quadratic fields with class number 1, and I remarked
also that the same techniques could be used to treat the analogous problem for
class number 2 when the discriminants of the fields are even. Since then, a complete
resolution of the class number 2 problem has been obtained, and I should like to
indicate the main new idea very briefly. A fuller account is provided by the text of
the lecture I delivered a year or so ago in Washington [4].
If $Q((-d)^{1/2})$ has class number 2 and odd discriminant $-d<-15$, then $d=pq$,
where $p$, $q$ are primes congruent to 1 and 3 (mod 4) respectively. Denoting by $\chi'(n)$
one of the generic characters associated with forms of discriminant $-d$ and writing
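$$\chi(n)=\left(\frac{k}{n}\right),\quad \chi_{pq}(n)=\left(\frac{-pq}{n}\right),\quad \chi_p(n)=\left(\frac{p}{n}\right),\quad \chi_q(n)=\left(\frac{-q}{n}\right),$$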
where $k$ is an integer $\equiv 1$ (mod 4) and $(k,pq)=1$, we obtain
$$L(1,\chi)L(1,\chi\chi_{pq})+L(1,\chi\chi_p)L(1,\chi\chi_q)=\sum\chi(f)/f,$$
where $f=f(x,y)$ denotes the principal form with discriminant $-d$, and the sum
is over all integers $x$, $y$ not both 0. Now if $k$ is not a prime power, for instance if
$k=21$, then the sum on the right approximates to a rational multiple of $\pi^2$, and
on substituting for the L-functions on the left from Dirichlet's formulae we obtain
an inequality of the type considered in Theorem 1; this leads at once to the desired
effective bound for d.
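The structural claim above (odd discriminant $-d<-15$ with class number 2 forces $d=pq$ with $p\equiv 1$, $q\equiv 3$ (mod 4)) can be checked numerically on small cases. The following is a minimal sketch, not the transcendental method of the text: it counts reduced primitive binary quadratic forms, and the function names `class_number`, `is_squarefree`, `prime_factors` and `odd_class_number_two` are illustrative choices introduced here.

```python
from math import gcd


def class_number(D):
    """Number of reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    count = 0
    a = 1
    while 3 * a * a <= -D:               # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):   # reduction condition -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue                 # need a <= c, and b >= 0 when a == c
            if gcd(gcd(a, b), c) == 1:   # primitive forms only
                count += 1
        a += 1
    return count


def is_squarefree(n):
    """True if no square of a prime divides n."""
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return False
        p += 1
    return True


def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def odd_class_number_two(limit):
    """Squarefree d = 3 (mod 4), 15 < d < limit, with h(-d) = 2."""
    return [d for d in range(19, limit, 4)
            if is_squarefree(d) and class_number(-d) == 2]


if __name__ == "__main__":
    for d in odd_class_number_two(500):
        print(d, prime_factors(d))
```

Each value of `d` printed should factor as a product of two primes, one in each residue class modulo 4; the search range 500 is an arbitrary choice for the illustration.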
By somewhat similar techniques, Schinzel and I [8] have recently shown that
every genus of primitive binary quadratic forms with discriminant D represents a
positive integer $\le c(\varepsilon)|D|^{3/8+\varepsilon}$ for any $\varepsilon>0$, where $c(\varepsilon)$ depends only on $\varepsilon$. Our
proof involves Siegel's theorem on L-functions and so does not enable $c(\varepsilon)$ to be
effectively computed when $\varepsilon<\tfrac18$; on the other hand, an effective estimate would,
as we show, yield a complete determination of all the "numeri idonei" of Euler,
and, of course, this would include the class number 1 and 2 results to which I
have just referred.
5. Elliptic functions. The main result on elliptic functions cited in [1] has
been extended recently by Coates [11].
Theorem 4. Any nonvanishing linear combination of $\omega_1$, $\omega_2$, $\eta_1$, $\eta_2$ and $2\pi i$
with algebraic coefficients is transcendental.
Here $\omega_1$, $\omega_2$ denote a pair of fundamental periods of a Weierstrass $\wp$-function
with algebraic invariants $g_2$, $g_3$ and $\eta_1=2\zeta(\tfrac12\omega_1)$, $\eta_2=2\zeta(\tfrac12\omega_2)$, where $\zeta(z)$ denotes
the associated Weierstrass $\zeta$-function. The new feature in Theorem 4 is the
inclusion of $2\pi i$, this extension having been gained, however, at the cost of some
restriction in the hypotheses. The result is of particular interest in view of the
Legendre relation $\omega_1\eta_2-\omega_2\eta_1=2\pi i$, showing that the five numbers in question
are algebraically dependent. Furthermore, one sees that the theorem includes the
transcendence of such numbers as $\pi+\omega$ and $\pi+\eta$ for any period $\omega$ of $\wp(z)$ and
quasi-period $\eta$ of $\zeta(z)$.
Some quantitative estimates in connexion with Theorem 4 have recently been
derived by a student of mine, D. W. Masser; in particular, he has proved [14]:
Theorem 5. For any positive integer $n$ and any $\varepsilon>0$, we have
$$|\wp(n)|<Cn^{(\log\log n)^{7+\varepsilon}},$$
where $C$ depends only on $g_2$, $g_3$ and $\varepsilon$.
Moreover he has shown that a similar estimate obtains for p(n + ri) and indeed
for p (a), where a is any nonzero algebraic number. Theorem 5 compares well with
the lower bound $|\wp(n)|>Cn$ valid for some $C>0$ and infinitely many $n$, and it
improves upon the result mentioned in [1], where an unspecified power of $\log n$
occurred in place of $\log\log n$. It seems likely that this general area of study will be
considerably developed in the next few years (cf. [10]).
6. Further results and problems. In a lecture at the same conference in
Stony Brook to which I referred at the beginning, Chowla raised the problem
whether there exists a rational-valued function $f(n)$, periodic with prime period $p$,
such that $\sum f(n)/n=0$. He proved some twenty years ago that this could not hold
for odd functions $f$ if $\tfrac12(p-1)$ is prime, a condition subsequently removed by
Siegel, and recently he showed that the same is true for even functions $f$ if $f(0)=0$.
In a forthcoming paper [7] by Birch, Wirsing and myself, it is shown that there is
in fact no function $f$ with these properties. The arguments involve an appeal to
the basic result on the linear independence of the logarithms of algebraic numbers,
but otherwise the proof runs on classical lines. Our work enables us to treat more
generally functions $f$ that take algebraic values and are periodic with any modulus
$q$, and we prove thereby
Theorem 6. If $(q,\varphi(q))=1$ and $\chi$ runs through all nonprincipal characters
mod $q$ then the $L(1,\chi)$ are linearly independent over the rationals.
Theorem 6 plainly generalizes Dirichlet's famous result on the nonvanishing
of $L(1,\chi)$; it does not, however, give a new proof of this result, for the latter is, in
fact, utilized in the demonstration. It would be of much interest to know whether
the theorem is valid when $(q,\varphi(q))>1$.
Finally, I should like to discuss some possible future avenues of investigation.
First, one would like to have a theorem of the nature of Theorem 1 in which A
denotes the height of all the $\alpha$'s and not just $\alpha_n$; some work in this direction has
been carried out by Ramachandra [17] and his pupil T. N. Shorey, and they have
applied their results to certain questions in prime number theory. But, at the
moment, the theorems are rather special and one would hope for considerable
improvements here. Secondly, it is almost certain that Theorems 1 and 2 have
natural $p$-adic analogues, and these would enable many of the Diophantine results
obtained earlier to be strengthened. In particular, they would give an inequality
of the form $\|(3/2)^n\|>2^{-\delta n}$, valid for all $n>n_0$, where $n_0$ is effectively computable,
$\delta$ is an absolute constant with $0<\delta<1$ and $\|x\|$ denotes the distance of $x$ from
the nearest integer. If, moreover, the value of $\delta$ were such that $2^{-\delta}>\tfrac34$ then this
would settle an outstanding question in connexion with Waring's problem. But,
of course, it may be difficult to obtain such a precise value of $\delta$ from the present
analysis. Thirdly, one would like to obtain a value of $\kappa$ in Theorem 3 depending
only on $n$ and indeed of the same order of magnitude as the Siegel exponent;
this would naturally lead to an effective determination of all the integer points on
a curve of arbitrary genus, that is, to a complete solution to the first problem
mentioned at the end of [1]. Since the magnitude of $\kappa$ depends on the value of
$C$ in Theorem 2, this again reflects on the basic theory of linear forms in logarithms.
And lastly, one would like an extension of Theorem 2 in which $b_1,\ldots,b_{n-1}$ denote
arbitrary algebraic numbers and not merely rational integers; this too seems
difficult to obtain with our present techniques.
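The inequality $\|(3/2)^n\|>(3/4)^n$ mentioned in connexion with Waring's problem is easy to test for small $n$ with exact rational arithmetic. The sketch below is purely illustrative and proves nothing about large $n$; the function names are introduced here and are not from the text.

```python
from fractions import Fraction


def dist_to_nearest_integer(x):
    """||x||: distance from the rational number x to the nearest integer."""
    frac = x - (x.numerator // x.denominator)   # fractional part in [0, 1)
    return min(frac, 1 - frac)


def waring_inequality_holds(n):
    """Check ||(3/2)^n|| > (3/4)^n exactly, using rational arithmetic."""
    return dist_to_nearest_integer(Fraction(3, 2) ** n) > Fraction(3, 4) ** n


if __name__ == "__main__":
    for n in range(1, 31):
        print(n, waring_inequality_holds(n))
```

For instance, the check succeeds at $n=5$ (where $\|(3/2)^5\|=13/32$) and fails at $n=7$ (where $\|(3/2)^7\|=11/128$), which illustrates why the statement is only asserted for $n>n_0$.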
References
1. A. Baker, Effective methods in Diophantine problems, Proc. Sympos. Pure Math., vol. 20,
Amer. Math. Soc., Providence, R.I., 1971, pp. 195-205.
2. A. Baker, Effective methods in the theory of numbers, Proc. Internat. Congress Math. (Nice,
1970), vol. 1, Gauthier-Villars, Paris, 1971, pp. 19-26.
3. A. Baker, Imaginary quadratic fields with class number 2, Ann. of Math. (2) 94 (1971), 139-152.
4. A. Baker, On the class number of imaginary quadratic fields, Bull. Amer. Math. Soc. 77 (1971),
678-684.
5. A. Baker, A sharpening of the bounds for linear forms in logarithms, Acta Arith. 21 (1972),
117-129.
6. A. Baker, A sharpening of the bounds for linear forms in logarithms. II, Acta Arith. (to appear).
7. A. Baker, B. J. Birch and E. A. Wirsing, On a problem of Chowla, J. Number Theory (to appear).
8. A. Baker and A. Schinzel, On the least integers represented by the genera of binary quadratic
forms, Acta Arith. 18 (1971), 137-144.
9. A. Baker and H. M. Stark, On a fundamental inequality in number theory, Ann. of Math. (2) 94
(1971), 190-199.
10. J. Coates, An application of the division theory of elliptic functions to Diophantine
approximation, Invent. Math. 11 (1970), 167-182.
11. J. Coates, The transcendence of linear forms in $\omega_1$, $\omega_2$, $\eta_1$, $\eta_2$, $2\pi i$, Amer. J. Math. 93 (1971),
385-397.
12. N. I. Fel'dman, Effective bounds for the number of solutions of certain Diophantine equations,
Mat. Zametki 8 (1970), 361-371. (Russian) MR 42 #7590.
13. N. I. Fel'dman, An effective sharpening of the exponent in Liouville's theorem, Izv. Akad. Nauk SSSR
Ser. Mat. 35 (1971), 973-990 = Math. USSR Izv. 5 (1971), 985-1002.
14. D. W. Masser, On the periods of the exponential and elliptic functions, Proc. Cambridge Philos.
Soc. (to appear).
15. C. F. Osgood, The simultaneous diophantine approximation of certain kth roots, Proc. Cambridge
Philos. Soc. 67 (1970), 75-86. MR 40 #2612.
16. C. F. Osgood, On the simultaneous Diophantine approximation of values of certain algebraic
functions, Acta Arith. 19 (1971), 343-386.
17. K. Ramachandra, A note on numbers with a large prime factor. III, Acta Arith. 19 (1971),
49-62.
18. W. M. Schmidt, Approximation to algebraic numbers, Enseignement Math. 17 (1971), 187-253.
19. V. G. Sprindzuk, A new application of p-adic analysis to representations of numbers by binary
forms, Izv. Akad. Nauk SSSR Ser. Mat. 34 (1970), 1038-1063 = Math. USSR Izv. 4 (1970), 1043-1069.
MR 42 #5910.
20. V. G. Sprindzuk, On rational approximations to algebraic numbers, Izv. Akad. Nauk SSSR Ser. Mat. 35
(1971), 991-1007 = Math. USSR Izv. 5 (1971), 1003-1019.
21. H. M. Stark, A transcendence theorem for class number problems, Ann. of Math. (2) 94 (1971),
153-173.
Trinity College
Cambridge, England
CHARACTER TRANSFORMATION FORMULAE
SIMILAR TO THOSE FOR THE
DEDEKIND ETA-FUNCTION
BRUCE C. BERNDT
1. Introduction. The classical Dedekind eta-function $\eta(z)$ is defined for
$\operatorname{Im} z>0$ by
$$\eta(z)=e^{\pi iz/12}\prod_{n=1}^{\infty}(1-e^{2\pi inz}).$$
If $V(z)=Vz=(az+b)/(cz+d)$ is any modular substitution with $c>0$, $\eta(z)$ satisfies
the well-known transformation formula
$$\log\eta(Vz)=\log\eta(z)+\tfrac12\log(cz+d)-\pi i/4+\pi i(a+d)/12c-\pi i\,s(d,c),\tag{1.1}$$
where
$$s(d,c)=\sum_{j \bmod c}((j/c))((jd/c))$$
is the well-known Dedekind sum with
$$((x))=\begin{cases}x-[x]-\tfrac12, & \text{if $x$ is not an integer},\\ 0, & \text{if $x$ is an integer}.\end{cases}$$
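Formula (1.1) can be sanity-checked numerically in the simplest case $Vz=-1/z$ (that is, $a=0$, $b=-1$, $c=1$, $d=0$, for which $s(0,1)=0$). The sketch below assumes that $\log\eta(z)$ means $\pi iz/12+\sum_{n\ge 1}\log(1-e^{2\pi inz})$ with principal logarithms; the function names and the truncation parameter `terms` are illustrative choices.

```python
import cmath


def log_eta(z, terms=200):
    """log eta(z) = pi*i*z/12 + sum_{n>=1} log(1 - q^n), with q = e^{2 pi i z}."""
    q = cmath.exp(2j * cmath.pi * z)
    total = 1j * cmath.pi * z / 12
    qn = 1
    for _ in range(terms):
        qn *= q                        # qn = q^n, tends to 0 since |q| < 1
        total += cmath.log(1 - qn)     # principal branch; 1 - q^n is near 1
    return total


def residual_inversion(z):
    """Left side minus right side of (1.1) for V z = -1/z."""
    rhs = log_eta(z) + 0.5 * cmath.log(z) - 1j * cmath.pi / 4
    return log_eta(-1 / z) - rhs


if __name__ == "__main__":
    print(abs(residual_inversion(0.3 + 1.1j)))   # should be at rounding level
```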
(For a proof of (1.1), see, for example, [8] or [2, pp. 167-173].) The most famous
and useful property of Dedekind sums is the reciprocity formula: if c, d>0 and
$(c,d)=1$, then
AMS 1970 subject classifications. Primary 10D05; Secondary 10A40.
$$s(c,d)+s(d,c)=-\frac14+\frac{1}{12}\left(\frac{c}{d}+\frac{d}{c}+\frac{1}{cd}\right).\tag{1.2}$$
For several proofs of (1.2) as well as several references to the literature, see [6].
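The Dedekind sum and its reciprocity law are easy to verify with exact arithmetic. The following is a minimal sketch that evaluates $s(d,c)$ directly from the definition and tests the classical reciprocity formula $s(c,d)+s(d,c)=-\tfrac14+\tfrac1{12}\bigl(\tfrac cd+\tfrac dc+\tfrac1{cd}\bigr)$ for coprime positive $c$, $d$; the function names are illustrative.

```python
from fractions import Fraction
from math import floor, gcd


def sawtooth(x):
    """((x)) = x - [x] - 1/2 for non-integral x, and 0 for integral x."""
    if x.denominator == 1:
        return Fraction(0)
    return x - floor(x) - Fraction(1, 2)


def dedekind_sum(d, c):
    """s(d, c) = sum over j mod c of ((j/c)) ((jd/c))."""
    return sum(sawtooth(Fraction(j, c)) * sawtooth(Fraction(j * d, c))
               for j in range(c))


def reciprocity_holds(c, d):
    """Check the reciprocity formula for coprime positive integers c, d."""
    lhs = dedekind_sum(c, d) + dedekind_sum(d, c)
    rhs = Fraction(-1, 4) + (Fraction(c, d) + Fraction(d, c)
                             + Fraction(1, c * d)) / 12
    return lhs == rhs


if __name__ == "__main__":
    print(dedekind_sum(1, 3))        # exact rational value
    print(all(reciprocity_holds(c, d)
              for c in range(1, 15) for d in range(1, 15) if gcd(c, d) == 1))
```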
There are several other functions possessing transformation formulae similar
to (1.1). Appearing in these transformation formulae are various generalizations of
Dedekind sums involving Bernoulli polynomials, and these generalized Dedekind
sums satisfy reciprocity laws similar to (1.2). We have recently shown [4] that all
of these transformation formulae can be deduced from one general theorem.
The objective of this paper is to derive transformation formulae for a large
class of functions involving primitive characters. Even for the simplest cases, the
results appear to be new. Appearing in our transformation formulae are still
further generalizations of Dedekind sums. These sums involve characters and
generalized Bernoulli functions. We shall show that the sums possess a reciprocity
formula as well.
In the sequel, $\chi$ always denotes a primitive character of modulus $k$. The upper
half-plane $\{z:\operatorname{Im}z>0\}$ will be denoted by $\mathcal{H}$. As usual, $\sigma=\operatorname{Re}(s)$. We always set
$V(z)=Vz=(az+b)/(cz+d)$, where $a$, $b$, $c$ and $d$ are rational integers with $c>0$ and
$ad-bc=1$. We shall let $e(z)=e^{2\pi iz}$. We shall use the customary notations, $[x]$ for
the greatest integer $\le x$ and $\{x\}$ for the fractional part of $x$. The characteristic
function of the integers will be denoted by $\lambda(x)$. Unless otherwise stated, we choose
that branch of $\log w$ with $-\pi\le\arg w<\pi$.
2. Preliminary results. The Gauss sum $G(z,\chi)$ is defined by
$$G(z,\chi)=\sum_{h=0}^{k-1}\chi(h)e(hz/k).$$
We put $G(1,\chi)=G(\chi)$. If $n$ is an integer, then [2, p. 312]
$$G(n,\chi)=\bar\chi(n)G(\chi).\tag{2.1}$$
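The separability property of Gauss sums for primitive characters, $G(n,\chi)=\bar\chi(n)G(\chi)$, can be illustrated numerically. The sketch below uses one particular primitive character modulo 5 (the one with $\chi(2)=i$), chosen here only as an example; the names `K`, `chi`, `e` and `gauss_sum` are illustrative.

```python
import cmath

K = 5
# A primitive character mod 5, determined by chi(2) = i (an example choice).
_CHI_VALUES = {0: 0, 1: 1, 2: 1j, 3: -1j, 4: -1}


def chi(n):
    """Value of the example character at the integer n."""
    return complex(_CHI_VALUES[n % K])


def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)


def gauss_sum(n=1):
    """G(n, chi) = sum_{h=0}^{k-1} chi(h) e(hn/k)."""
    return sum(chi(h) * e(h * n / K) for h in range(K))


if __name__ == "__main__":
    G = gauss_sum()
    for n in range(-3, 8):
        # The two printed columns should agree up to rounding error.
        print(n, gauss_sum(n), chi(n).conjugate() * G)
```

The identity also gives $|G(\chi)|^2=k$ for primitive $\chi$, which the same functions can confirm for this example.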
We shall need some facts about Bernoulli functions and their character
generalizations. The Bernoulli polynomials $B_n(x)$ are generated by [1, p. 804]
$$\frac{ue^{xu}}{e^u-1}=\sum_{n=0}^{\infty}B_n(x)\frac{u^n}{n!}\qquad(|u|<2\pi).\tag{2.2}$$
The Bernoulli functions $\mathcal{B}_n(x)$ are defined by
$$\mathcal{B}_n(x+m)=B_n(x),\tag{2.3}$$
where $0\le x<1$ and $m$ is an arbitrary integer, except in the case $n=1$ and $x=0$,
where we define $\mathcal{B}_1(0+m)=0$.
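As a concrete illustration of the Bernoulli polynomials and their periodic extensions, here is a minimal sketch in exact arithmetic. It assumes the standard recurrence $\sum_{k=0}^{n}\binom{n+1}{k}B_k=0$ for the Bernoulli numbers (so that $B_1=-\tfrac12$, matching the generating function $ue^{xu}/(e^u-1)$); the function names are illustrative.

```python
from fractions import Fraction
from math import comb, floor


def bernoulli_numbers(n_max):
    """B_0, ..., B_{n_max} from sum_{k=0}^{n} C(n+1, k) B_k = 0 (so B_1 = -1/2)."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B


def bernoulli_poly(n, x):
    """B_n(x) = sum_k C(n, k) B_k x^(n-k)."""
    B = bernoulli_numbers(n)
    x = Fraction(x)
    return sum(comb(n, k) * B[k] * x ** (n - k) for k in range(n + 1))


def bernoulli_function(n, x):
    """Periodic Bernoulli function: B_n({x}), with value 0 when n = 1 and x is an integer."""
    x = Fraction(x)
    frac = x - floor(x)
    if n == 1 and frac == 0:
        return Fraction(0)
    return bernoulli_poly(n, frac)


if __name__ == "__main__":
    print(bernoulli_poly(2, Fraction(1, 2)))       # B_2(1/2)
    print(bernoulli_function(1, Fraction(13, 4)))  # {13/4} - 1/2
```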
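The generalized Bernoulli functions $\mathcal{B}_n(x,\chi)$, $n\ge 1$, $-\infty<x<\infty$, are defined
as follows [3, Definition 1]: if $\chi$ is even,
$$\mathcal{B}_{2n-1}(x,\chi)=\frac{2(-1)^{n}G(\bar\chi)(2n-1)!}{k(2\pi/k)^{2n-1}}\sum_{j=1}^{\infty}\frac{\chi(j)\sin(2\pi jx/k)}{j^{2n-1}}$$
and
$$\mathcal{B}_{2n}(x,\chi)=\frac{2(-1)^{n-1}G(\bar\chi)(2n)!}{k(2\pi/k)^{2n}}\sum_{j=1}^{\infty}\frac{\chi(j)\cos(2\pi jx/k)}{j^{2n}};$$
if $\chi$ is odd,
$$\mathcal{B}_{2n-1}(x,\chi)=\frac{2(-1)^{n-1}iG(\bar\chi)(2n-1)!}{k(2\pi/k)^{2n-1}}\sum_{j=1}^{\infty}\frac{\chi(j)\cos(2\pi jx/k)}{j^{2n-1}}$$
and
$$\mathcal{B}_{2n}(x,\chi)=\frac{2(-1)^{n-1}iG(\bar\chi)(2n)!}{k(2\pi/k)^{2n}}\sum_{j=1}^{\infty}\frac{\chi(j)\sin(2\pi jx/k)}{j^{2n}}.$$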
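For $n\ge 1$ and $-\infty<x<\infty$, we have the important property [3, Theorem 3.1],
$$\mathcal{B}_n(x,\chi)=k^{n-1}\sum_{h=1}^{k-1}\bar\chi(h)\,\mathcal{B}_n\!\left(\frac{x+h}{k}\right).\tag{2.4}$$
In [3, Example 3], we derived the following character analogue of the Lipschitz
summation formula: for $\alpha$ real, $\operatorname{Re}z>0$, and $\sigma>1$,
$$\sum_{n=-\infty}^{\infty}\chi(n)e(n\alpha/k)(z+in)^{-s}=\frac{\chi(-1)G(\chi)(2\pi/k)^{s}}{\Gamma(s)}\sum_{n+\alpha>0}\bar\chi(n)e^{-2\pi z(n+\alpha)/k}(n+\alpha)^{s-1}.$$
Setting $\alpha=0$, replacing $z$ by $-iz$, and replacing $n$ by $-n$ in the series on the left, we
find that for $z\in\mathcal{H}$ and $\sigma>1$,
$$\sum_{n=-\infty}^{\infty}\chi(n)(n+z)^{-s}=\frac{G(\chi)(-2\pi i/k)^{s}}{\Gamma(s)}\sum_{n=1}^{\infty}\bar\chi(n)e(nz/k)n^{s-1}.\tag{2.5}$$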
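The character form of the Lipschitz formula, $\sum_{n}\chi(n)(n+z)^{-s}=\frac{G(\chi)(-2\pi i/k)^{s}}{\Gamma(s)}\sum_{n\ge 1}\bar\chi(n)e(nz/k)n^{s-1}$ for $\operatorname{Im}z>0$, can be tested numerically. The sketch below is a hedged illustration for the integer value $s=3$ and one example primitive character modulo 5 (with $\chi(2)=i$); the truncation lengths `N` and `terms` and all function names are assumptions made for the example.

```python
import cmath
from math import gamma

K = 5
# Example primitive character mod 5 with chi(2) = i (an illustrative choice).
_CHI_VALUES = {0: 0, 1: 1, 2: 1j, 3: -1j, 4: -1}


def chi(n):
    return complex(_CHI_VALUES[n % K])


def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)


def gauss_sum():
    """G(chi) = sum_{h mod k} chi(h) e(h/k)."""
    return sum(chi(h) * e(h / K) for h in range(K))


def lipschitz_lhs(z, s, N=20000):
    """Symmetric truncation of sum_n chi(n) (n + z)^(-s)."""
    return sum(chi(n) * (n + z) ** (-s) for n in range(-N, N + 1) if n % K)


def lipschitz_rhs(z, s, terms=200):
    """G(chi) (-2 pi i / k)^s / Gamma(s) * sum_{n>=1} conj(chi(n)) e(nz/k) n^(s-1)."""
    series = sum(chi(n).conjugate() * e(n * z / K) * n ** (s - 1)
                 for n in range(1, terms + 1))
    return gauss_sum() * (-2j * cmath.pi / K) ** s / gamma(s) * series


if __name__ == "__main__":
    z, s = 0.2 + 0.7j, 3
    print(lipschitz_lhs(z, s))
    print(lipschitz_rhs(z, s))   # should agree with the line above to about 1e-8
```

The left-hand series converges only polynomially, so agreement is limited by the truncation at `N`; the right-hand series converges exponentially.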
We now discuss the functions whose transformation formulae we shall derive.
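Let $r_1$ and $r_2$ be arbitrary real numbers. For $\sigma>2$ and $z\in\mathcal{H}$, define
$$G(z,s;\chi;r_1,r_2)=\sum_{m,n=-\infty}^{\infty}{}'\ \frac{\chi(m)\bar\chi(n)}{((m+r_1)z+n+r_2)^{s}},$$
where the dash $'$ means that the possible pair $m=-r_1$, $n=-r_2$ is omitted from the
summation. Extend the definition of $\chi$ to the set of all real numbers by defining
$\chi(r)=0$ if $r$ is not an integer. Then write
$$G(z,s;\chi;r_1,r_2)=\chi(-r_1)\sum_{n=-\infty}^{\infty}{}'\ \bar\chi(n)(n+r_2)^{-s}+\left(\sum_{m<-r_1}\sum_{n=-\infty}^{\infty}+\sum_{m>-r_1}\sum_{n=-\infty}^{\infty}\right)\frac{\chi(m)\bar\chi(n)}{((m+r_1)z+n+r_2)^{s}}\tag{2.6}$$
$$=S_1+S_2+S_3,$$
say.
Firstly,
$$S_1=\chi(-r_1)\left(\sum_{n>-r_2}\bar\chi(n)(n+r_2)^{-s}+\chi(-1)e(s/2)\sum_{n>r_2}\bar\chi(n)(n-r_2)^{-s}\right)$$
$$=\chi(-r_1)\bigl(L(s,\bar\chi,r_2)+\chi(-1)e(s/2)L(s,\bar\chi,-r_2)\bigr),\tag{2.7}$$
where for $\sigma>1$ and $a$ real,
$$L(s,\chi,a)=\sum_{n>-a}\chi(n)(n+a)^{-s}.$$
$L(s,\chi,a)$ may be written in terms of the Hurwitz zeta-function $\zeta(s,a)$ as follows.
Let $n=mk+j+[-a]+1$, $0\le m<\infty$, $0\le j\le k-1$. Then, for $\sigma>1$,
$$L(s,\chi,a)=\sum_{j=0}^{k-1}\chi(j+[-a]+1)\sum_{m=0}^{\infty}(mk+j+[-a]+1+a)^{-s}=k^{-s}\sum_{j=0}^{k-1}\chi(j+[-a]+1)\,\zeta\!\left(s,\frac{j+\{a\}+\lambda(a)}{k}\right),\tag{2.8}$$
where we have used the fact that $[a]+[-a]=\lambda(a)-1$. Observe that (2.8) provides
an analytic continuation of $L(s,\chi,a)$ for the entire complex plane.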
Secondly, replacing $n$ by $-n$, we get for $\sigma>2$,
$$S_2=\chi(-1)e(s/2)\sum_{m<-r_1}\chi(m)\sum_{n=-\infty}^{\infty}\bar\chi(n)((-m-r_1)z+n-r_2)^{-s}.$$
Replacing $m$ by $-m$ and applying (2.5), we find that for $\sigma>2$ and $z\in\mathcal{H}$,
$$S_2=\frac{G(\bar\chi)(-2\pi i/k)^{s}e(s/2)}{\Gamma(s)}A(z,s;\chi;-r_1,-r_2),\tag{2.9}$$
where
$$A(z,s;\chi;r_1,r_2)=\sum_{m>-r_1}\chi(m)\sum_{n=1}^{\infty}\chi(n)e(n((m+r_1)z+r_2)/k)n^{s-1}.\tag{2.10}$$
Note that $A(z,s;\chi;r_1,r_2)$ can be analytically continued to an entire function of $s$.
Similarly,
$$S_3=\frac{G(\bar\chi)(-2\pi i/k)^{s}}{\Gamma(s)}A(z,s;\chi;r_1,r_2).\tag{2.11}$$
Putting (2.7), (2.9) and (2.11) into (2.6), we conclude that for $\sigma>2$ and $z\in\mathcal{H}$,
$$G(z,s;\chi;r_1,r_2)=\frac{G(\bar\chi)(-2\pi i/k)^{s}}{\Gamma(s)}\{A(z,s;\chi;r_1,r_2)+e(s/2)A(z,s;\chi;-r_1,-r_2)\}+\chi(-r_1)\{L(s,\bar\chi,r_2)+\chi(-1)e(s/2)L(s,\bar\chi,-r_2)\}.\tag{2.12}$$
Since $A(z,s;\chi;r_1,r_2)$ and $L(s,\chi,a)$ have analytic continuations to the entire
complex $s$-plane, (2.12) yields an analytic continuation for $G(z,s;\chi;r_1,r_2)$ to the
entire complex $s$-plane.
Our objective is to prove transformation formulae for the functions
$A(z,s;\chi;r_1,r_2)$. It will be simpler, however, to state our results in terms of the
functions $G(z,s;\chi;r_1,r_2)$. Also, our proofs will use the functions $G(z,s;\chi;r_1,r_2)$.
The method of proof is based on ideas of J. Lewittes [7] and on our work in [4].
Observe that if we set $s=r_1=r_2=0$ in (2.10), we have the natural character
generalization of $\log\eta(z)$. The functions $A(z,s;\chi;0,0)$, where $s\le 0$ is even, have
arisen in Grosswald's work [5] on the values of L-functions.
We shall need the following simple lemma [7, Lemma 1] in the next section.
Lemma 1. Let $A$, $B$, $C$ and $D$ be real with $A$ and $B$ not both zero and $C>0$.
Then, for $z\in\mathcal{H}$,
$$\arg((Az+B)/(Cz+D))=\arg(Az+B)-\arg(Cz+D)+2\pi k,$$
where $k$ is independent of $z\in\mathcal{H}$, and
$$k=\begin{cases}1, & \text{if $A\le 0$ and $AD-BC>0$},\\ 0, & \text{otherwise}.\end{cases}$$
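The lemma on arguments can be probed numerically. The sketch below assumes the branch $-\pi\le\arg w<\pi$ and the rule "$k=1$ exactly when $A\le 0$ and $AD-BC>0$, otherwise $k=0$"; it skips the degenerate case $AD-BC=0$, where the quotient is real and floating-point rounding makes the sign of its argument unstable. The function names are illustrative.

```python
import cmath
import math


def arg(w):
    """Argument of w on the branch -pi <= arg w < pi."""
    t = cmath.phase(w)                 # cmath.phase returns a value in [-pi, pi]
    return -math.pi if t >= math.pi else t


def observed_k(A, B, C, D, z):
    """The integer k with arg((Az+B)/(Cz+D)) = arg(Az+B) - arg(Cz+D) + 2 pi k."""
    lhs = arg((A * z + B) / (C * z + D))
    rhs = arg(A * z + B) - arg(C * z + D)
    return round((lhs - rhs) / (2 * math.pi))


def predicted_k(A, B, C, D):
    """Value of k stated in the lemma."""
    return 1 if (A <= 0 and A * D - B * C > 0) else 0


if __name__ == "__main__":
    z = 0.3 + 1.2j
    for (A, B, C, D) in [(-1, 2, 1, 3), (-1, -5, 1, 1), (0, -2, 1, 0), (2, 1, 1, -4)]:
        print((A, B, C, D), observed_k(A, B, C, D, z), predicted_k(A, B, C, D))
```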
3. Main Theorem.
Theorem 2. Let $Q=\{z=x+iy: x>-d/c,\ y>0\}$. Define $R_1=ar_1+cr_2$ and
$R_2=br_1+dr_2$, where $r_1$ and $r_2$ are arbitrary real numbers. Let $\rho=\rho(R_1,R_2,c,d)
=\{R_2\}c-\{R_1\}d$. Suppose first that $a\equiv d\equiv 0$ (mod $k$). Then for $z\in Q$ and all $s$,
$$(cz+d)^{-s}\Gamma(s)G(Vz,s;\chi;r_1,r_2)=\bar\chi(b)\chi(c)\Bigl\{\Gamma(s)G(z,s;\bar\chi;R_1,R_2)-2i\Gamma(s)\sin(\pi s)\,\bar\chi(R_1)L(s,\chi,-R_2)$$
$$\qquad+e(-s/2)\sum_{j=1}^{c}\sum_{\mu=0}^{k-1}\sum_{\nu=0}^{k-1}\bar\chi(c\mu+j+[R_1])\,\chi([R_2+d(j-\{R_1\})/c]-\nu)\,f(z,s;r_1,r_2)\Bigr\},\tag{3.1}$$
where
$$f(z,s;r_1,r_2)=f(z,s;r_1,r_2;j,\mu,\nu)=\int_C u^{s-1}\,\frac{\exp(-((c\mu+j-\{R_1\})/ck)(cz+d)ku)}{\exp(-(cz+d)ku)-1}\cdot\frac{\exp(((\nu+\{(jd+\rho)/c\})/k)ku)}{\exp(ku)-1}\,du.$$
Here, we choose the branch of $u^s$ with $0<\arg u<2\pi$. Also, $C$ is a loop beginning at
$+\infty$, proceeding in the upper half-plane, encircling the origin in the positive direction
so that $u=0$ is the only zero of $(\exp(-(cz+d)ku)-1)(\exp(ku)-1)$ lying "inside"
the loop, and then returning to $+\infty$ in the lower half-plane.
Secondly, if $b\equiv c\equiv 0$ (mod $k$), we have for $z\in Q$ and all $s$,
$$(cz+d)^{-s}\Gamma(s)G(Vz,s;\chi;r_1,r_2)=\bar\chi(a)\chi(d)\Bigl\{\Gamma(s)G(z,s;\chi;R_1,R_2)-2i\Gamma(s)\sin(\pi s)\,\chi(R_1)L(s,\bar\chi,-R_2)$$
$$\qquad+e(-s/2)\sum_{j=1}^{c}\sum_{\mu=0}^{k-1}\sum_{\nu=0}^{k-1}\chi(j+[R_1])\,\bar\chi([R_2+d(j-\{R_1\})/c]+d\mu-\nu)\,f(z,s;r_1,r_2)\Bigr\}.\tag{3.2}$$
CHARACTER TRANSFORMATION FORMULAE
15
Proof. For zeJf and <j>2,
G(Vz,s;x;r^r2)= £ x(™)x{*)) ~A f '
m,n=-ao I CZ + d )
where M = ma + nc and N = mb + nd. As the pair m, n ranges over all pairs of
integers, except for possibly the pair — rl9 — r2, M and AT range over all pairs of
integers except for the possibility — Ru — R2, since ad — bc=\. Thus,
G(Vz,s;X;rur2)= £' X{Md-Nc) X(Na-Mb) \[- »-— 2-\
M,N=-ao I CZ+d J
= x(b)x{c) f' x(n)x(m)j(m + jR')'7+jR2j ' (^^0 (mod/c))
m,n= - ao (. CZ -\-U J
= x(a)x(d) £' x(m)ywi(m + ^l)l7 + R2i ' (^^0 (mod/c)).
For the remainder of the proof, we assume that $a \equiv d \equiv 0 \pmod{k}$. The proof for $b \equiv c \equiv 0 \pmod{k}$ is completely analogous.
Using Lemma 1, we find that, for $z \in \mathcal{H}$ and $\sigma > 2$,
(cz + d)~sG(Vz, s; x\ rl9 r2)
= *(6)Z(c)(e(-s) I +1' )■
,f» /((m + R1)2 + n + R2)s
/l 0\ d(m + /?i)>c(n + /?2) otherwise
= x(b) x(c) {G{z9 s; Z; J^, K2) + (*(-s)- 1) #(z, s; Z; J^, K2)},
where
#(z, s; x;/?i, #2) =
((m + R1)z + n + R2)s
Replacing m by —m and n by — n and then separating the terms with m = R1,
we obtain
, x ,(2,S;X;/?1,/?2) = ^/2) I I „ *(")x(m)
(3.4) np, „>i?2+a(m-R,)/c ((wi-K,) z + -n -K2)s
= e(s/2){x(R1)L(s,x,-R2) + h(z,s;X;R1,R2)},
where
h(z,s;X;Ri,R2)= Z Z
X(m)x(n)
>Rr n>R2 + d(m-Rl)/c ((™ — R\) Z + n — R2)S
Now, in the double sum above, $\operatorname{Re}((m - R_1)z + n - R_2) > 0$ if $x > -d/c$. Using Euler's integral representation of $\Gamma(s)$, we find that for $z \in Q$ and $\sigma > 2$,
r{s)h{z,s;x;Rl9R2)
= Z Z zMx(n)
m>Ri n>R2+d(m-R1)/c
us 1 Qxp( — (m — R1) zu — (n — R2) u) du.
Put $m = m'c + j + [R_1] + 1$, $0 \le m' < \infty$, $0 \le j \le c - 1$, and $n = n' + [R_2 + d(m - R_1)/c] + 1$. The above double sum becomes
I I E xim'c+j+LRil + Vxin'HRz+dim'c+j-iRj + iycl + l)
j = 0 m' = 0 n' = 0
us 1 exp( —(ra'c +j—{Ri} + 1) zw
-(n' + C^ + dKc+y-^iJ + lVcl + l-Uj)) d„.
Replace $j + 1$ by $j$, use the fact that $d \equiv 0 \pmod{k}$, put $m' = mk + \mu$, $0 \le m < \infty$, $0 \le \mu \le k - 1$, and put $n' = nk + \nu$, $0 \le n < \infty$, $0 \le \nu \le k - 1$. We then get for $z \in Q$ and $\sigma > 2$,
r(s)A(z,s;Z;R1,R2)
Z V Z z(^+J + [*i])z(v + [*2+da-{*i})/c] + l)
(3.5)
j=l /i = 0 v = 0
• us-1exp(-(c/i+7-{R1})2u-(v + [R2+d(/-{^i})/c] + ^ + l-^2)«)
00 00
• £ £ Qxp( — mkczu — dmku — nku)du
m-0n=0
= Z Z Z Z(cAt+J + [Hi])z(v + [/?2+^0'-{*i})/c] + l)
j = 1 /* = 0 v = 0
00
■J
t exp(-((c/x+;-{^1})/c/c) (cz + d) Jcu)
1—exp(—(cz + d) /cu)
exp((- v -1 +jd/c - {Rt} d/c + R2 - jR2 + <*(/ - {R, })/c]) «) ^
1 —exp( —/cu)
= -11' Vz(cAt +J + [Ki])x([K2 + d(/-{Ki})/<|-v)
j= 1 /j = Ov = 0
s-1 exp(-((qi+M*i})M) (cz + d) ku) exp(((v + {(jd + Q)/c})/k)ku) ^
exp ( — (cz + d) /cw) —1 exp(/cw) — 1
0
= -Z z1 *z x(w+[*.])x([^+^(/--{^i})/^]-v)/(2',';ri;r2).
Here, in the next to the last step, we have multiplied the numerator and denominator by $\exp(ku)$ and then replaced $k - 1 - \nu$ by $\nu$. In the last step, we have used a classical method of Riemann to convert the integral over $(0, \infty)$ to a loop integral [11, pp. 18, 19]. If we now combine (3.3)-(3.5) together, we immediately arrive at (3.1). The result holds for all $s$ by analytic continuation.
Our main results can be greatly simplified if $s$ is an integer. For then $f(z, s; r_1, r_2)$ can be easily evaluated by the residue theorem with the aid of (2.2). Thus, if $s = -N$, where $N$ is a nonnegative integer, a simple calculation yields
Si "''Xl E 4^-(*'Vf,+iw+g)/c}y-<"+'')r'.
m+«=N + 2 \ ck J \ k ) mini
Upon the evaluation of $f(z, -N; r_1, r_2)$, (3.1) and (3.2) will then be valid for all $z \in \mathcal{H}$ by analytic continuation.
4. The character analogue of $\log \eta(z)$. Let $r_1 = r_2 = 0$. From (2.10) and (2.12), we find that
00
r(s) G(z, s; x; 0, 0) = G(x) (-2ni/kY(l +e(s/2)) Z X(mn) e(mnz/k) ns~l
m,n= 1
00
= G(x) (-2ni/kf(\ +e(s/2)) Z zW °s-1 (r) e(rz/k).
In particular, if s = 0,
00
limr(s)G(z,s;X;0,0) = 2G(x) £ Z(r) <x_, (r) e(rz/*) = 2G(x) A(z, X),
s-0 r=l
say. Thus, for $s = r_1 = r_2 = 0$, (3.1) yields in the case $a \equiv d \equiv 0 \pmod{k}$,
(4.1)
+1 t V 'l xfo*+/) z(D^/c]-v)/(2, 0; 0, 0)1,
j= 1 p = 0 v = 0 J
and (3.2) yields in the case $b \equiv c \equiv 0 \pmod{k}$,
G(jf) >l(Kz, X) = l(a) X(d) |G(jf) A(z, Z)
(4'2) +i I Y V X(j) xiWc}+dn-v)f(z, 0; 0, 0)|>.
j = 1 /* = 0 v = 0
By (3.6),
We must now evaluate the triple sums in (4.1) and (4.2).
First examine the sum in (4.1). By summing on $\nu$ first, we see that the contribution of $(cz + d)B_2((c\mu + j)/ck)$ is zero. By summing on $\mu$ first, we see that the contribution of $B_2((\nu + \{jd/c\})/k)/(cz + d)$ is zero, since $c\mu + j$ runs through a complete residue system (mod $k$) as $\mu$ does, for $(c, k) = 1$. Next observe that the triple sum is unchanged if we replace $B_1((\nu + \{jd/c\})/k)$ by $\mathscr{B}_1((\nu + \{jd/c\})/k)$, since $d \equiv 0 \pmod{k}$. By (2.4),
|x(D*]-v)*,(^)=Yx(-v)«,(^*)=,(-.)^0*.i).
Thus, so far, we have shown that the last expression in curly brackets on the right
side of (4.1) is
(4.3) nix(-l) t @,(jd/c,l) £ xicfi+fiB,
Next, observe that the sum above is unchanged if $B_1((c\mu + j)/ck)$ is replaced by $\mathscr{B}_1((c\mu + j)/ck)$. Put $c\mu + j = n$, where $1 \le n \le ck$. Since $d \equiv 0 \pmod{k}$ and $\mathscr{B}_1(x, \chi)$ has period $k$, (4.3) becomes
(4.4) nix(-l) I *!<#, X) X(n) %i(n/ck).
n= 1
Definition. Let (c, d) = 1 with c>0. The Dedekind character sum s(d, c; %)
is defined by
n mod ck
Using the above definition and recalling that (4.4) is the value of the last
expression on the right side of (4.1), we find that (4.1) becomes
G(x) A(Vz, X) = x(b) X(c) {G(x) A(z, x) + ™x(- 1) s{d, c; x)}.
Secondly, we examine the triple sum on the right side of (4.2). By summing on $\nu$ first, we observe that the contribution of the second expression in $f(z, 0; 0, 0)$ is zero. Next, we may replace $B_1((\nu + \{jd/c\})/k)$ by $\mathscr{B}_1((\nu + \{jd/c\})/k)$. Using (2.4), we find that
=x(-l)#i(dit+jd/c,x)-
Thus, the last expression in curly brackets on the right side of (4.2) is
*i*(-1) I X(J) V *i (CJ~) <*x (d^+jd/c, x)=uiX(-1) s(d, c; x),
where we have replaced $B_1((c\mu + j)/ck)$ by $\mathscr{B}_1((c\mu + j)/ck)$, set $n = c\mu + j$, $1 \le n \le ck$, and used the fact that $c \equiv 0 \pmod{k}$. Hence, using the above calculation, we find that (4.2) becomes
G(x) A{Vz, X) = i(a) X(d) {G(x) A(z, X) + niX(-1) s(d, c; Z)}.
In summary, we have shown
Theorem 3. Let $z \in \mathcal{H}$. If $a \equiv d \equiv 0 \pmod{k}$,
(4.5) G(x) A(Vz, X) = x(b) x(c) {<?(*) A(z, x) + niX(- 1) s(d, c\ x)};
if $b \equiv c \equiv 0 \pmod{k}$,
(4.6) G(x) A(Vz, x) = x{a) x(d) {Gft) A(z9 x) + *iZ(-1) s(d, c; Z)}.
We will next prove a reciprocity formula for $s(d, c; \chi)$ analogous to the reciprocity formula for $s(d, c)$.
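For orientation, the classical reciprocity formula for the ordinary Dedekind sum $s(d, c)$ can be verified exactly in rational arithmetic. The sketch below is an illustration only: it uses the standard definition $s(d, c) = \sum_{n \bmod c} ((n/c))((dn/c))$ and the standard classical law $s(c, d) + s(d, c) = -\tfrac14 + \tfrac{1}{12}\big(\tfrac{c}{d} + \tfrac{d}{c} + \tfrac{1}{cd}\big)$ for coprime positive $c$, $d$, neither of which is quoted from this paper, and the function names are hypothetical. It does not implement the character sum $s(d, c; \chi)$.

```python
from fractions import Fraction
from math import floor, gcd

def sawtooth(x):
    """((x)) = x - floor(x) - 1/2 for non-integer x, and 0 for integer x."""
    if x.denominator == 1:
        return Fraction(0)
    return x - floor(x) - Fraction(1, 2)

def dedekind_sum(d, c):
    """Classical Dedekind sum s(d, c) = sum over n mod c of ((n/c)) ((dn/c))."""
    return sum(sawtooth(Fraction(n, c)) * sawtooth(Fraction(d * n, c))
               for n in range(1, c))

def reciprocity_gap(c, d):
    """s(c, d) + s(d, c) minus the classical right-hand side (zero when gcd(c, d) = 1)."""
    rhs = Fraction(-1, 4) + Fraction(c * c + d * d + 1, 12 * c * d)
    return dedekind_sum(c, d) + dedekind_sum(d, c) - rhs

def check_reciprocity(limit=25):
    """Check the classical law for all coprime pairs 1 <= c, d <= limit."""
    return all(reciprocity_gap(c, d) == 0
               for c in range(1, limit + 1)
               for d in range(1, limit + 1)
               if gcd(c, d) == 1)
```

Because the computation is done with `Fraction`, the comparison is exact and involves no floating-point tolerance.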
Theorem 4. Let $c, d > 0$, $(c, d) = 1$, and either $c$ or $d \equiv 0 \pmod{k}$. Then,
(4.7) s(c,d,x) + s(d,c;x) = B1(x)B1(x).
Proof. By symmetry, we may without loss of generality assume that $d \equiv 0 \pmod{k}$. Let $V^*(z) = V^*z = (bz - a)/(dz - c)$ and $T(z) = Tz = -1/z$. If we replace $z$ by $Tz$ in (4.5), we obtain
(4.8) G(x) A(V*z, X) = x(b) x(c) {G(X) A(Tz, *) + *iZ(- 1) s(d, c; *)}.
Next, apply (4.6) with $V$ replaced by $V^*$. We get
(4.9) G(x) A(V*z, X) = x(b) *(-c) {G(*) A(z, *) + 7rix(- 1) *(-*, d; *)}.
Lastly, apply (4.5) with $V$ replaced by $T$ and $\chi$ replaced by $\bar\chi$ to obtain
(4.10) G(X)A(Tzrx) = x(-l){G(?)A(z,x) + nix(-l)s(0,l',x)}-
Multiplying both sides of (4.10) by x(b) x(c) and combining the result with (4.8)
and (4.9), we arrive at
(4.11) 7rix(ft)z(c){z(-l)s(d,c;i)-s(-c,d;z) + s(0,l;z)}=0.
Since $(b, k) = (c, k) = 1$, the character factor in front of the braces in (4.11) is nonzero. From the definition of $\mathscr{B}_1(x, \chi)$, we see that
(4.12) #i(-*,x)=-x(-l)*i(*>x).
It follows that
s(-c9d;x)=-x(-l)s{c>d>x)-
Lastly,
(4.13) s(0, 1;Z)= I x(«)Bi(x)*i(«/*) = *!(X)*i00,
n= 1
upon the use of (2.4). Thus, (4.11) reduces to
Z(-l)s(d,c;x) + z(-l)s(c,d;^) + 51WB1(x) = 0.
This is equivalent to (4.7) since — x(- 1) Bx (x) Bx (x) = Bl(x) #i (£), for if x is even,
Since $\eta(z)$ possesses an infinite product representation, it is natural to ask whether $A(z, \chi)$ can be represented as the logarithm of an infinite product.
Theorem 5. Let $z \in \mathcal{H}$. Then there exists an integer $r$, independent of $z$, such that
$$G(\bar\chi)\,A(z, \chi) = -\log \prod_{j=0}^{k-1} \prod_{m=1}^{\infty} \big(1 - e((j + mz)/k)\big)^{\bar\chi(j)\chi(m)} + 2\pi i r.$$
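Proof. By (2.1), we have, for $z \in \mathcal{H}$,
$$G(\bar\chi)\,A(z, \chi) = \sum_{m,n=1}^{\infty} \chi(m)\, G(n, \bar\chi)\, e(mnz/k)/n$$
$$= \sum_{j=0}^{k-1} \bar\chi(j) \sum_{m=1}^{\infty} \chi(m) \sum_{n=1}^{\infty} e(n(j + mz)/k)/n$$
$$= -\sum_{j=0}^{k-1} \bar\chi(j) \sum_{m=1}^{\infty} \chi(m) \log\big(1 - e((j + mz)/k)\big)$$
$$= -\log \prod_{j=0}^{k-1} \prod_{m=1}^{\infty} \big(1 - e((j + mz)/k)\big)^{\bar\chi(j)\chi(m)} + 2\pi i\, r(z),$$
where $r(z)$ is an integer. Since $A(z, \chi)$ and the logarithm of the product above are both analytic on $\mathcal{H}$, $r(z)$ is analytic on $\mathcal{H}$ and hence a constant $r$ on $\mathcal{H}$.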
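The key step in the proof above is the expansion $\sum_{n \ge 1} w^n/n = -\log(1 - w)$ for $|w| < 1$, applied with $w = e((j + mz)/k)$, where $e(x) = \exp(2\pi i x)$. A minimal numerical sketch of that single step follows; the sample values of $z$, $k$, $j$, $m$ and the truncation length are arbitrary choices made for illustration.

```python
import cmath

def e(x):
    """e(x) = exp(2*pi*i*x), allowing complex x."""
    return cmath.exp(2j * cmath.pi * x)

def log_series(w, terms=200):
    """Partial sum of sum_{n>=1} w^n / n, which converges to -log(1 - w) for |w| < 1."""
    return sum(w ** n / n for n in range(1, terms + 1))

def series_gap(z, k, j, m, terms=200):
    """Distance between the truncated series and -log(1 - w) at w = e((j + m z)/k)."""
    w = e((j + m * z) / k)
    return abs(log_series(w, terms) + cmath.log(1 - w))

# For z in the upper half-plane, |w| = exp(-2*pi*m*Im(z)/k) < 1, so the series converges.
sample_gap = series_gap(complex(0.3, 0.5), k=5, j=2, m=1)
```

The principal branch of the logarithm is the right one here because $\operatorname{Re}(1 - w) > 0$ whenever $|w| < 1$.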
5. A second example. Let $s = 0$, but suppose that $r_1$ and $r_2$ are arbitrary.
Firstly, suppose that $a \equiv d \equiv 0 \pmod{k}$. Proceeding as in the preceding section, we find that the triple sum in curly brackets on the right side of (3.1) is, with the help of (3.6),
2*i Z V I1z(^+j + [ll1])Z([ll2 + d(/-{*i})/c]-v)
(5 J) j=i p = o v=o
_ /cn+j-jR^ fv + {(jd + Q)/cf
We next want to replace $B_1((c\mu + j - \{R_1\})/ck)$ by $\mathscr{B}_1((c\mu + j - \{R_1\})/ck)$. This is valid except when $R_1$ is an integer, $\mu = k - 1$, and $j = c$, for then we obtain $B_1(1) = \tfrac12$ and $\mathscr{B}_1(1) = 0$, respectively. Replacing $B_1((c\mu + j - \{R_1\})/ck)$ by $\mathscr{B}_1((c\mu + j - \{R_1\})/ck)$ in the triple sum above, we are led to the "extra" expression
t- ^ -, i'v + IR?'.
k
= *ii(Ri) *I z([*2]-v) «! (^M)-^^,) Z(K2)
= «iz(-*i) l] Z(v) ^ (^)-i^(^i) x(*2)
= ^(-/?1) £,(K2, x)-i7iix(«i) x(K2),
upon the use of (2.4). Thus, (5.1) becomes
2w£ I I X(cn +JHRil)z([^2+d(/-{*i})/c]-v)
j = 1 /j = Ov = 0
(5.2) .# ftW-{*i}\Bi fv + {(/<* + rf/c}N
ck J \ k
+ nix(-R1)al{R2,fl-±Kix{R1)x(R2).
Next, we wish to replace $B_1((\nu + \{(jd + \varrho)/c\})/k)$ by $\mathscr{B}_1((\nu + \{(jd + \varrho)/c\})/k)$ in (5.2). This is valid except in the case when $(jd + \varrho)/c$ is an integer and $\nu = 0$, i.e., except when $\varrho$ is an integer and $\nu = 0$. If $\varrho$ is an integer, let $j'$ be the unique integer such that $1 \le j' \le c$ and $j'd + \varrho \equiv 0 \pmod{c}$. Using the definitions of $\varrho$, $R_1$ and $R_2$ and using the definition $\{x\} = x - [x]$, we obtain after a short calculation,
$$(j'd + \varrho)/c = (j'd - r_1 + [R_1]d - [R_2]c)/c.$$
From the above, we see that $\varrho$ is an integer if and only if $r_1$ is as well. Since also $ad - bc = 1$, the above then becomes
$$(j'd + \varrho)/c = ((j' + [cr_2])d)/c - [dr_2].$$
Since $(j'd + \varrho)/c$ is an integer and $(c, d) = 1$,
$$(5.3)\qquad c \mid (j' + [cr_2]).$$
Since $d \equiv 0 \pmod{k}$, we conclude that $(j'd + \varrho)/c \equiv -[dr_2] \pmod{k}$, and so,
$$(5.4)\qquad d(j' - \{R_1\})/c + R_2 = (j'd + \varrho)/c + [R_2] \equiv br_1 \pmod{k}.$$
Replacing $B_1((\nu + \{(jd + \varrho)/c\})/k)$ by $\mathscr{B}_1((\nu + \{(jd + \varrho)/c\})/k)$, we obtain an "extra" expression
= -TTiAto) z^rO V z(cai+/ + [ct2]) «x (^+^ ^'^
by (5.4) and the fact that $a \equiv 0 \pmod{k}$. Using the definition of $R_1$ and (5.3), we can write the above as
*"' 7 Z + M^ /^ + (/'+[>2])/c-r2
_ XIM +
■TrateJxtferJxtc) X x(a*+" ~ /~il £
= -7rU(e)z(6ri)z(c)*1(-r2,z),
by (2.4). Thus, (5.2) becomes
2** t kZkI^(^+;+[^i])x([^2+^-{^i})/c]-v)
j = 1 // = 0 v = 0
(5.5) .^i^+^-^.A^ ^v + {(A*+ffyc}
c* / \ it
+ nix(-R1) »x (R2, X)-biix(Ri) X(*2) -JtUfe) Z(fcr,) *(c) ^(-i* %)■
We next simplify the triple sum of (5.5). Using (2.4) and then letting $c\mu + j = n$, $1 \le n \le ck$, we find that the triple sum of (5.5) becomes
w-.ij;^wtM«,(^).,(^iw)
(5.6) =2niX(-l) ntn + ^p.^Jj,^ ^ '+*2,xj-
Definition. Let $(c, d) = 1$ with $c > 0$. Suppose that $x$ and $y$ are arbitrary real numbers. The generalized Dedekind character sum $s(d, c; \chi; x, y)$ is defined by
$$(5.7)\qquad s(d, c; \chi; x, y) = \sum_{n \bmod ck} \chi(n)\, \mathscr{B}_1\big(d(n + y)/c + x, \chi\big)\, \mathscr{B}_1\big((n + y)/ck\big).$$
Observe that $s(d, c; \chi; 0, 0) = s(d, c; \chi)$. Also, $s(d, c; \chi; x, y)$ is the natural character generalization of the generalized Dedekind sum $s(d, c; x, y)$ (e.g. see [6] or [4, §4]).
Using (5.7) in (5.6), we see that we may now write (5.5) as
(5 8) 2nix(~ l) S(d' C; *; *2' -Ki) + 7^(-Ki) #i(«2, X)
-±™z(*i) x(R2)-na(Q) x(brx) i(c) #i (-r2, *).
We next calculate $L(0, \chi, a)$. From (2.8) and the formula [12, p. 267],
$$\zeta(0, a) = -B_1(a) \qquad (0 < a \le 1),$$
we find that, for real $a$,
L(0, z, a)= -"j: z(/-|>] + *(«)) *i ((/ + {*}+*(*))/*)
j = o
k-1
l5'9) =-I Z(7'-[a] + A(a))*1(a+{a}+A(a))/fc)-z(-o)B1(l)
= -^i(a,3f)-iz(-fl),
by (2.4).
Using (5.8) and (5.9) in (3.1), we find that, for $z \in \mathcal{H}$ and $a \equiv d \equiv 0 \pmod{k}$,
lim r(s) {(cz+d)-s G(Vz, s; X; r„ r2)-jf(fc) x(c) G(z, s; jj; R» R2)}
=X(b) X(c) {InixiR,) aA-R2, x) + *ii{Ri) X(R2)
+ 2mx(-l)s(d,c;x;R2, -*i) + >«z(-*i)*i(*2,x)
(5.10) -fax{Ri) X(Rz)-iM{Q) X{br,) jf(c) *,(-r2, *)}
= *(&) *(c) ni{X(Ri) @A-R2, x)+til(Ri) X(R2)
-Hc)x(brl)x(c)a1(-r2,x) + 2x(-l)s(d,c;x;R2,-Ri)},
where we used (4.12).
Secondly, suppose that $b \equiv c \equiv 0 \pmod{k}$. Proceeding as in the previous section, we find that the triple sum in curly brackets on the right side of (3.2) becomes, with the help of (3.6),
c k — 1 k— 1
2*i £ X ZxVHRiDx(LR2+d(J-{Ri})/c]+dii-v)
j=l n=0v=0
(5.H) B(cji+j-{R^\BJv + {(jd+Q)lc}
As in the previous case, the replacing of $B_1((c\mu + j - \{R_1\})/ck)$ by $\mathscr{B}_1((c\mu + j - \{R_1\})/ck)$ results in the "extra" expression
ni)C(Ri) V Z([K2]-v) £i(-i~) = niX(-Ri)@AR2,X)-inix(Ri)x(R2).
Thus, (5.11) may be written as
2ni t X I xU + [*i])!tp2 + ^-{Xi})/c] + *-v)
j=l /* = 0 v = 0
<512) ^ (cH+)-{Rx}\BJv + {{Jd + Q)lc
ck J L\ k
+ niX(-R1) 0,(K2, x)-i*ix(Ri) x(R2)-
We next replace $B_1((\nu + \{(jd + \varrho)/c\})/k)$ by $\mathscr{B}_1((\nu + \{(jd + \varrho)/c\})/k)$. Let $j'$ be defined as in the case $a \equiv d \equiv 0 \pmod{k}$. We obtain, as before, an "extra" expression
(5.13) -mm x(fH*J> % f (^M-^)*1 f^" <*->yc).
From our calculations in the previous case of $a \equiv d \equiv 0 \pmod{k}$, we see that
$$(j'd + \varrho)/c + [R_2] = (j' + [cr_2])d/c + br_1 \equiv (j' + [cr_2])d/c \pmod{k}.$$
From (5.3), $j' \equiv -[cr_2] \pmod{k}$, and so
$$j' + [R_1] \equiv ar_1 \pmod{k}.$$
Using the above two congruences and (5.3), we see that we may write (5.13) as
■u w * -^ V -/"Z + C^] , \ „ ^+(/-{R1})/c\
-7td(e)x(a'-1)x(^) Z Xl +A*J*i( 1 I
(5-14) = -siAfo) Z(ar,) *(<*) "£ Z(A*) *i (^)
= -«iA(c) x(ari) l(d) #i(-r2, *),
by (2.4). Using (5.14), we find that (5.12) becomes
2*'"i I IxVHRiDxdRi+dU-iRiW+dii-v)
•^(Cfl+^{R,})^(v+{(^+g)/c})
+ jriZ(-K,) «,(*2, x)-2™'x(*i) *(*2)
-«iA(<?)z(flr,)jf(d)«i(-r2,z)
= 2*iz(-l) s(d, c; Z; K2, -Ki) + k»x(-*i) #i(K2, x)
where we have used the same argument as before in obtaining (5.6). Substituting
the above into the right side of (3.2) and using (5.9) and (4.12), we conclude that
Urn r(s) {(cz + dy* G(Vz, s; X; ru r2)-X(a) %{d) G(z, s; Z; Ru R2)}
(5.15) =x(a) X(d) ni{x(Ri) »i(-R2, x)+fc(*i) zW
-X(q) xfa) x(d) ®x (-r2, Z) + 2Z(-1) s(d, c; X; R2, -*i)}.
We summarize our results (5.10) and (5.15) in
Theorem 6. Let $z \in \mathcal{H}$. If $a \equiv d \equiv 0 \pmod{k}$,
Jim r(s) {(cz + d)-° G(Vz, s; X; ru r2)-jt(b) x(c) G(z, s; *; Ru R2)}
=x(b)x(c)ni{x(R1)^1(-R2,x)+mRi)x(R2)
-HQ) X(brx) x(c) £i (-r2, *) + 2X(- 1) s(d, c; x; R2, -R,)};
if $b \equiv c \equiv 0 \pmod{k}$,
l™ *» {(" + rf)-s G(Vz, s; X; ru r2)-x(a) X(d) G(z, s; X; R,, R2)}
= X(a)x(d)ni{x(R1)@i(-R2,x) + 2-X(Ri)x(R2)
-Afe) zK) *(<Q <*i(-»2. Z)+2z(-1) s(d, c; Z; *2, -Hi)}-
To express the results of Theorem 6 in terms of the functions $A(z, s; \chi; r_1, r_2)$, some additional computation is required. The computations are similar to those in [4, §4].
We now derive a reciprocity law for $s(d, c; \chi; x, y)$ similar to the reciprocity law for $s(d, c; x, y)$ [4, §5].
Theorem 7. Let $c, d > 0$, $(c, d) = 1$, and either $c$ or $d \equiv 0 \pmod{k}$. Let $x$ and $y$ be arbitrary real numbers. Then,
(5.16) s(c, d\ x; x, y) + s(d, c; x\ y, *) = -iz(x) x(y) + #i (*, z) *i U z)«
Proof. Without loss of generality, assume that $d \equiv 0 \pmod{k}$. Let $V^*$ and $T$ be as in the proof of Theorem 4. Replacing $z$ by $Tz$ in (5.10), we get
lim r(s) {(dz-c)~s zsG(V*z, s; X\ rl9 r2)-X(b) *(c) G{Tz, s; r, «i, *2)}
s-+0
(5.17) =jf(&)z(c)»ri{z(*i)*i(-*2.Z)+iz(*i)z(i?2)
-A(e)z(6r1)J£(c)«1(-r2,Z) + 2Z(-l)s(d,c;z;ll2.-*i)}-
Apply (5.15) to the transformation $V^*$ and note that $R_1$ is replaced by $R_2$ and $R_2$ by $-R_1$. Thus,
lim r(s) {(dz-c)-' G(V*z, s;X; rl3 r2)~x(b) Z(-c) G(z, s; X; R2, -Rj}
(5.18) =z(6)z(-c)»ri{z(*2)«i(^i.z)+k^2)z(-i?i)
-A(c)z(fc'-i)z(-c)«1(-r2,z) + 2z(-l)s(-c,d;z; -*i, -*2)}-
Lastly, apply (5.10) with $V$ replaced by $T$, $r_1$ and $r_2$ replaced by $R_1$ and $R_2$, respectively, and $\chi$ replaced by $\bar\chi$. Observe that $R_1$ and $R_2$ are replaced by $R_2$ and $-R_1$, respectively. Hence,
lim {z~*G(Tz, s; z; K„ R2)-x(-1) G(z, «; Z5 «2, -*i)}
s->0
(5.19) =z("l)'rf{z(i?2)*i(*i.Z)+iz(*2)z(-*i)
-z(-«i)^i(-H2,z)+2z(-i)s(0,i;z;-Ki,-JR2)}-
Multiply (5.19) by z(fr) z(c) and combine the resulting equation with (5.17) and
(5.18) to obtain after considerable cancellation
(5 20) 2x(_ l) s(d'c; *; R2>-RJ-2s(-c> d> x; -Ru-Rz)
+ 2s(0, 1; z; -*„ -H2)+k(*2) z(*i)=0.
From (5.7) and (2.4),
s(0,l;z;-*!,-*,)=£ z(«)^i(-«i,z)^i(^r^)
=^1(-R1,z)^,(-i?2,z)-
By replacing $n$ by $-n$ in the definition of $s(-c, d; \chi; -R_1, -R_2)$ and using the oddness of $\mathscr{B}_1(x)$, we easily deduce that
s{-c, d; x; -Ri, -R2)= ~X{-1) s(c, d; x\ -Rl9 R2).
Hence, (5.20) can be written as
(5 21) 5(d'c; *; R2>-RJ+S(C> d'i xi-Ri, R2)
= -iz(-*i)z(*2)-z(-i)*i(-*i,z)*i(-*2,z).
If we apply (4.12) and let $x = -R_1$ and $y = R_2$, (5.21) reduces to (5.16).
Observe that when x = y = 0, (5.16) reduces to (4.7).
6. An alternative method for proving the main theorem. A very short, elegant proof, using contour integration, of the transformation formula for $\eta(Vz)$ when $Vz = Tz = -1/z$ has been given by C. L. Siegel [10]. The method was extended by Rademacher [8] who proved the transformation formula for any modular substitution. (See also [2, pp. 155-158, 167-173].) Another extension of Siegel's idea has been made by Schoeneberg [9] who proved a transformation formula under $T$ for generalized Dedekind eta-functions. Using an idea which Schoeneberg attributes to E. Hecke, Schoeneberg then derived the general result using essentially only the result for $T$.
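The classical identity that Siegel's argument establishes, $\eta(-1/z) = \sqrt{z/i}\,\eta(z)$ with the principal square root, is easy to test numerically from the product $\eta(z) = e(z/24)\prod_{n \ge 1}(1 - e(nz))$. The sketch below is only such a numerical check under stated assumptions (truncated product, sample points with imaginary part not too small, hypothetical function names); it does not reproduce the contour-integration proof.

```python
import cmath

def eta(z, terms=200):
    """Dedekind eta-function via the truncated product e(z/24) * prod (1 - e(n z))."""
    q = cmath.exp(2j * cmath.pi * z)
    product = 1
    for n in range(1, terms + 1):
        product *= 1 - q ** n
    return cmath.exp(2j * cmath.pi * z / 24) * product

def siegel_gap(z, terms=200):
    """|eta(-1/z) - sqrt(z/i) * eta(z)|, which should vanish for Im(z) > 0."""
    return abs(eta(-1 / z, terms) - cmath.sqrt(z / 1j) * eta(z, terms))

# z/i has positive real part when Im(z) > 0, so the principal square root is the right branch.
sample_gap = siegel_gap(complex(0.3, 0.8))
```

The truncation at 200 factors is ample for the sample points used here, since $|e(z)| = \exp(-2\pi \operatorname{Im} z)$ is small; points very close to the real axis would need more factors.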
These ideas of Siegel and Hecke can be extended to yield another proof of Theorem 2, when $s$ is a nonpositive integer. We shall illustrate this extension by giving the proof for the transformation $Tz$ only, in the special case $s = r_1 = r_2 = 0$ and $a \equiv d \equiv 0 \pmod{k}$. From (4.5) and (4.13), we then shall show that
(6.1) G(x) A(Tz, X) = X(-1) G(X) A(z, D + mB^) B^x).
Letting $m = rk + j$, $0 \le r < \infty$, $0 \le j \le k - 1$, in the definition (2.10) of $A(z, \chi)$ and then using (2.1), we see that (6.1) may be written as
/*o\ rt-\ V x{n)G{-n/z,x) * j(n) G(nz, x) , ...
(62) G(X) L n{l-e{-n,z))=*{-l) G{x) £ ,(1^))^^ *l W"
We shall show (6.2) for $z = iy$, $y > 0$, for then the general result will hold by analytic continuation.
Let $n$ be a fixed positive integer and put $N = n + \tfrac12$. Define
G(wN,x)G(-wN/z,X)
*n(w, z, x) = -
G(x) G(x) w(l-e(wN)) (l-e(-wN/z))
Let $C$ be the rhombus with vertices $w = \pm 1$ and $w = \pm z$. On the interior of $C$, $F_N(w, z, \chi)$ has simple poles at $w = \pm j/N$ and $w = \pm jz/N$, $1 \le j \le n$, and a simple pole at $w = 0$ when $\chi$ is odd. The residue of $F_N(w, z, \chi)$ at $w = j/N$ is
G(j,x)G(-j/z,X) xV)G(-j/z,x)
G(X) G(x) (j/N)(-2niN) (1 -e(-j/z)) 2nijG(X) (1 -e(-j/z))'
by (2.1). By replacing $h$ by $k - h$ in the definition of the Gauss sum, a short calculation shows that the residue at $w = -j/N$ is equal to the residue at $w = j/N$. The residue at $w = jz/N$ is
x{-j)G{jz,x)l2nijG{x){\-e{jz)).
As above, the residues at $w = \pm jz/N$ are equal for each $j$. The residue at $w = 0$ is
B1(X)B1(X)/G(X)G(X),
The easiest way to calculate the residue at w = 0 is to use the generating function
for generalized Bernoulli numbers [4, Example 6]:
!^p).|4|i' (|2|<2„A).
Thus, by the residue theorem,
1
FN(w,z,x)dw=-2 £ -t
MjG(X)(l-e(-j/z))
(6.3) c
If we can show that the left side of (6.3) tends to 0 as $N$ tends to $\infty$, we would be done. For then letting $N$ tend to $\infty$ in (6.3), we get (6.2).
Let $w = u + iy(1 - u)$, $0 \le u \le 1$, denote the line segment from $w = iy$ to $w = 1$. Then for $w$ on this line segment,
z B~=\ ^M-2nhy(l-u)N/k)^khZ\ exp(-2nhuN/yk)
Mvv,z,W|_ |G(z)G(jf)w(l-g(wiV))(l-^(-wJV/2))|
which tends to 0 as $N$ tends to $\infty$, uniformly for $0 \le u \le 1$. The same conclusions hold for the other three sides of the rhombus by similar arguments. Thus, the integral on the left side of (6.3) tends to 0 as $N$ tends to $\infty$. By our remarks above, this completes the proof of (6.2).
References
1. M. Abramowitz and I. A. Stegun (Editors), Handbook of mathematical functions, with formulas,
graphs and mathematical tables, 3rd ed., Nat. Bur. Standards Appl. Math. Series, 55, Superintendent
of Documents, U.S. Government Printing Office, Washington, D.C., 1965. MR 31 # 1400.
2. Raymond Ayoub, An introduction to the analytic theory of numbers, Math. Surveys, no. 10,
Amer. Math. Soc, Providence, R.I., 1963. MR 28 #3954.
3. Bruce C. Berndt, Character analogues of the Poisson and Euler-Maclaurin summation formulas
with applications, J. Number Theory (to appear).
4. Bruce C. Berndt, Generalized Dedekind eta-functions and generalized Dedekind sums, Trans. Amer. Math. Soc. (to appear).
5. Emil Grosswald, Remarks concerning the values of the Riemann zeta function at integral, odd
arguments, J. Number Theory 4 (1972), 225-235.
6. Emil Grosswald and Hans Rademacher, Dedekind sums, Carus Monograph, no. 16, Math.
Assoc. Amer., Washington, DC, 1972.
7. Joseph Lewittes, Analytic continuation of Eisenstein series, Trans. Amer. Math. Soc. 171 (1972), 469-490.
8. Hans Rademacher, On the transformation of $\log \eta(\tau)$, J. Indian Math. Soc. 19 (1955), 25-30. MR 17, 15.
9. B. Schoeneberg, Zur Theorie der verallgemeinerten Dedekindschen Modulfunktionen, Nachr. Akad. Wiss. Göttingen Math.-Phys. Kl. II, 1969, 1-10.
10. C. L. Siegel, A simple proof of $\eta(-1/\tau) = \eta(\tau)\sqrt{\tau/i}$, Mathematika 1 (1954), 4. MR 16, 16.
11. E. C. Titchmarsh, The theory of the Riemann zeta-function, Clarendon Press, Oxford, 1951.
MR 13, 741.
12. E. T. Whittaker and G. N. Watson, A course of modern analysis. An introduction to the general theory of infinite processes and of analytic functions; with an account of the principal transcendental functions, 4th ed., Cambridge Univ. Press, New York, 1962. MR 31 #2375.
University of Illinois at Urbana-Champaign
ON LARGE SIEVE TYPE ESTIMATES FOR
THE DIRICHLET SERIES OPERATOR1
M. FORTI AND C. VIOLA
Presented by Enrico Bombieri
0. Introduction. The large sieve inequalities
$$(0.1)\qquad \sum_{q \le Q}\ \sum_{\chi \bmod q}^{*} \Big|\sum_{n} a_n \chi(n)\Big|^2 \le c(N, Q) \sum_{n} |a_n|^2,$$
where * denotes summation over all primitive Dirichlet characters, were first
considered by Linnik [14] then developed by Renyi [18] and Roth [19] and
substantially improved by Bombieri [1]. Refinements in the estimates of c(N, Q)
were given by Davenport and Halberstam [6], Bombieri and Davenport [4] and
[5], and Gallagher [9]. Recently Montgomery [15] has stated the following result
of Davenport:
(0.2)
I <*•»-"'
^[T + 0(N logjV)](J+logJV) X |fl.l2
(for real $t_r$, $\delta = \min_{r \ne s} |t_r - t_s|$, $T = \max t_r - \min t_s$) and using a method introduced by Halász [11] has successfully combined (0.1) and (0.2) to obtain
<1 ^ Q X mod q r = 1
£ anl{n)n *»"
-JL)
2 ( loe2AT\ N
^(e2T + N)(l+-4—)log4N X K\2n'2a
where $s_{\chi,r} = \sigma_{\chi,r} + it_{\chi,r}$, $\sigma = \min_{\chi,r} \sigma_{\chi,r}$, $T = \max_{\chi,r} t_{\chi,r} - \min_{\chi,r} t_{\chi,r} + 1$, $\delta = \min_{\chi,\, r \ne s} |t_{\chi,r} - t_{\chi,s}|$.
AMS 1970 subject classifications. Primary 10H30.
1 Research supported by Consiglio Nazionale delle Ricerche.
© 1973, American Mathematical Society
Gallagher [10] proved a continuous analogue of Montgomery's result, namely
$$\sum_{q \le Q}\ \sum_{\chi \bmod q}^{*} \int_{-T}^{T} \Big|\sum_{n} a_n \chi(n) n^{it}\Big|^2 dt \ll \sum_{n} (Q^2 T + n)\, |a_n|^2.$$
Bombieri [3] has remarked that the large sieve inequality
$$\sum_{r=1}^{R} \Big|\sum_{n=M+1}^{M+N} a_n \exp(2\pi i n x_r)\Big|^2 \le \Big(N + \frac{2}{\delta}\Big) \sum_{n=M+1}^{M+N} |a_n|^2,$$
where $\delta = \min_{r \ne s} \|x_r - x_s\|$ and $\|x\| = \min_n |x - n|$, can easily be deduced from Bessel's inequality in the Hilbert space $l^2$ of complex number sequences $(a_n)_{-\infty}^{\infty}$ such that $\sum_{-\infty}^{\infty} |a_n|^2 < \infty$; Elliott [7] has pointed out that the derivation of large sieve inequalities is equivalent to the determination of the spectral radii of certain hermitian operators.
The recent notes of Montgomery [17] give a survey of the methods and results
in this field; also mentioned is an interpretation recently devised by Bombieri
(unpublished) which consists in the reduction of the large sieve inequalities to the
estimates of the norms of linear operators between Banach spaces. This approach
unifies most of the techniques used so far and suggests "nonlinear" analogues of
the large sieve. The purpose of the present paper is to develop this method in order
to obtain some improvements and generalizations on the results quoted above.
We denote by $\omega(n) = \chi(n) n^{it}$ a generalized Dirichlet character and define $\|\omega\| = q(|t| + 1)$ if $q$ is the modulus of $\chi$; $Q$ is a finite set of generalized characters of cardinality $|Q|$. We also define $D = D(Q) = \sup_{\omega \ne \omega'} \|\bar\omega \omega'\|$ and consider a finite set $\mathcal{N}$ of positive integers between $N/2$ and $N$.
Definition 1.1. One says that $Q$ is $\delta$-well spaced if for $\omega = \chi(n) n^{it}$, $\omega' = \chi'(n) n^{it'}$, $\omega, \omega' \in Q$, $\omega \ne \omega'$, we have either $|t - t'| \ge \delta$ or $\bar\chi' \chi$ nonprincipal.
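The large sieve inequality
$$(0.3)\qquad \sum_{\omega \in Q} \Big|\sum_{n \in \mathcal{N}} a_n \omega(n)\Big|^2 \le c(\mathcal{N}, Q) \sum_{n \in \mathcal{N}} |a_n|^2$$
may be considered as an upper bound for the norm of a linear operator. If $L^p(\mathcal{N})$ and $L^q(Q)$ are the Banach spaces of complex-valued functions over $\mathcal{N}$, $Q$ respectively with the usual norms, we define the "Dirichlet series" operator $\mathcal{D} = \mathcal{D}(\mathcal{N}, Q) : L^p(\mathcal{N}) \to L^q(Q)$ by
$$\mathcal{D}(\{a_n\}_{n \in \mathcal{N}}) = \Big\{\sum_{n \in \mathcal{N}} a_n \omega(n)\Big\}_{\omega \in Q}.$$
If $\|\mathcal{D}\|_{p,q}$ is the norm of the above operator, we have the inequality
$$(0.4)\qquad \Big(\sum_{\omega \in Q} \Big|\sum_{n \in \mathcal{N}} a_n \omega(n)\Big|^q\Big)^{1/q} \le \|\mathcal{D}\|_{p,q} \Big(\sum_{n \in \mathcal{N}} |a_n|^p\Big)^{1/p}$$
which is clearly of large sieve type.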
The success of the methods so far used depends on the following well-known equality
$$\|\mathcal{D}\|_{p,q} = \|\mathcal{D}^*\|_{q',p'}$$
where $\mathcal{D}^*$ is the adjoint operator of $\mathcal{D}$ and $1/p + 1/p' = 1/q + 1/q' = 1$. In the classical case $p = p' = q = q' = 2$, the introduction of the Halász coefficients (see Montgomery [15]) is avoided by estimating $\|\mathcal{D}^*\|_{2,2}$ instead of $\|\mathcal{D}\|_{2,2}$. This enables us to obtain our results directly and to avoid some splitting of cases occurring in Montgomery's paper. Our main results run as follows:
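Theorem 3.2. If $Q$ is $(2 \log D)^2$-well spaced, then $\|\mathcal{D}\|_{2,2}^2 \ll N + D \log^2 D$.
Theorem 3.3. If $Q$ is $(\log N)$-well spaced, then for $0 \le \delta \le 1$, $\|\mathcal{D}\|_{2,2}^2 \ll N + N^{\delta} D^{\mu(\delta) + \varepsilon} |Q|$, where $\mu(\delta)$ is the Lindelöf $\mu$-function.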
Theorem 3.4. Let N(ol, T; y) denote the number of zeros of L(s, x) in the
rectangle ol^g^X, \t\^T, and put N(cc, T; Q) = maxx, J^x N(cc, T; xx) where x, x'
are such that co(n) = x{n) nlt, cD'(n) = x(n) nit belong to Q for some t, t'. Let Q be
(log2D)-wellspaced. Then, for d^, T=max(00i, \t-t'\ + l,
\\@\\l2<N + NdDll2-d+£\Q\+NdD1/2-d/2+eN(l-d,T',Qy-d/2.
From these results, using the Riesz-Thorin theorem, we deduce some estimates
for the norm ||^||Pf, with (p, q)¥z(2, 2). We also obtain the following
Theorem 4.1. Let $Q$ be $(\log^2 D)$-well spaced, and put $D_1 = D \log^2 D$. Then, for any even integer $p \ge 2$, $\|\mathcal{D}\|_{p,p} \ll (N^{1/2} + D_1^{1/p})\, N^{1/2 - 1/p} (\log N)^{c(p)}$.
Montgomery [17, Theorem 12.6] has pointed out that the validity of Theorem 4.1 for any real $2 \le p \le 4$ would imply the density hypothesis $N(\sigma, T) \ll T^{2(1 - \sigma) + \varepsilon}$. Unfortunately the Riesz-Thorin interpolation theorem gives the following weaker result.
Theorem 4.2. Let Q be (log2D)-well spaced. Then, for any p^2,
where fc = [p/2].
Finally, we can generalize Theorem 3.3 in the following form.
Theorem 4.3. Let Q be (logN)-well spaced. Then, for any p, q^.2 and for
0^<5^1,
ii®iiPi,«jvi/2-i/^i/2+^2z>^^2+iior/«).
The authors are deeply indebted to Professor Bombieri for help and constant
encouragement in the preparation of this paper.
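1. An interpretation of the large sieve by means of linear operators. Throughout the paper $\mathcal{N}$, $Q$, $|Q|$, $\omega$, $\|\omega\|$, $D$ will be as defined in the introduction. Given the finite set $Q$ of generalized characters $\omega(n) = \chi(n) n^{it}$ we denote by $X(Q)$ the set of Dirichlet characters $\chi$ such that $\chi(n) n^{it}$ belongs to $Q$ for some $t$.
Let $L^p(\mathcal{N})$, $L^q(Q)$ be the Banach spaces of complex number sequences $a = \{a_n\}_{n \in \mathcal{N}}$, $\alpha = \{\alpha_\omega\}_{\omega \in Q}$ indexed by $\mathcal{N}$, $Q$ respectively with the usual norms
$$\|a\|_p = \Big(\sum_{n \in \mathcal{N}} |a_n|^p\Big)^{1/p}, \qquad \|\alpha\|_q = \Big(\sum_{\omega \in Q} |\alpha_\omega|^q\Big)^{1/q}.$$
We define the "Dirichlet series" operator $\mathcal{D} = \mathcal{D}(\mathcal{N}, Q) : L^p(\mathcal{N}) \to L^q(Q)$ by
$$(1.1)\qquad \mathcal{D}(\{a_n\}_{n \in \mathcal{N}}) = \Big\{\sum_{n \in \mathcal{N}} a_n \omega(n)\Big\}_{\omega \in Q}.$$
Denoting by $\|\mathcal{D}\|_{p,q}$ the norm of $\mathcal{D}$, we get
$$(1.2)\qquad \Big(\sum_{\omega \in Q} \Big|\sum_{n \in \mathcal{N}} a_n \omega(n)\Big|^q\Big)^{1/q} \le \|\mathcal{D}\|_{p,q} \Big(\sum_{n \in \mathcal{N}} |a_n|^p\Big)^{1/p}.$$
The adjoint operator $\mathcal{D}^* : L^{q'}(Q) \to L^{p'}(\mathcal{N})$, where $1/p + 1/p' = 1/q + 1/q' = 1$, is
$$(1.3)\qquad \mathcal{D}^*(\{\alpha_\omega\}_{\omega \in Q}) = \Big\{\sum_{\omega \in Q} \alpha_\omega \omega(n)\Big\}_{n \in \mathcal{N}}.$$
We recall the following well-known results in the theory of Banach spaces.
Theorem 1.1. $\|\mathcal{D}\|_{p,q} = \|\mathcal{D}^*\|_{q',p'}$ where $1/p + 1/p' = 1/q + 1/q' = 1$.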
Theorem 1.2 (Riesz-Thorin). $\log \|\mathcal{D}\|_{p,q}$ is a convex function of $(1/p, 1/q)$ in $1 \le p, q \le +\infty$.
Using Theorem 1.1 we may evaluate $\|\mathcal{D}^*\|$ instead of $\|\mathcal{D}\|$ ($\|\cdot\|$ being a shortened notation for $\|\cdot\|_{2,2}$).
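In the case $p = q = 2$ the equality of Theorem 1.1 says that a matrix and its adjoint have the same largest singular value, which can be observed numerically. The following is a rough sketch under simplifying assumptions: only characters of the form $\omega(n) = n^{it}$ (trivial $\chi$) are used, the norm is approximated by power iteration, and all function names and sample parameters are hypothetical.

```python
import cmath
import math

def dirichlet_matrix(ns, ts):
    """Matrix of the operator D: one row per omega(n) = n^{it}, one column per n."""
    return [[cmath.exp(1j * t * math.log(n)) for n in ns] for t in ts]

def adjoint(M):
    """Conjugate transpose of M."""
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def norm2(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def operator_norm(M, iters=2000):
    """Approximate the l^2 -> l^2 norm of M by power iteration on M* M."""
    Ms = adjoint(M)
    v = [1.0 + 0j] * len(M[0])
    for _ in range(iters):
        w = matvec(Ms, matvec(M, v))
        s = norm2(w)
        v = [x / s for x in w]
    return norm2(matvec(M, v))

M = dirichlet_matrix(range(10, 21), [0.0, 1.3, 2.9, 4.1])
norm_D = operator_norm(M)
norm_D_star = operator_norm(adjoint(M))
```

The two values agree to rounding error. Working with the adjoint is convenient because its Gram matrix is indexed by $Q \times Q$ and has the entries $\sum_n \bar\omega\omega'(n)$, the unweighted analogue of the sums $K(\bar\omega\omega')$ used below.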
Let $f(n)$ be a real-valued function such that
$$(1.4)\qquad f(n) \ge 1 \ \text{for } n \in \mathcal{N}, \qquad f(n) \ge 0 \ \text{always}, \qquad \text{and} \quad F = \sum_{n=1}^{\infty} f(n) < +\infty.$$
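We get
$$(1.5)\qquad \sum_{n \in \mathcal{N}} \Big|\sum_{\omega \in Q} \alpha_\omega \omega(n)\Big|^2 \le \sum_{n=1}^{\infty} f(n) \Big|\sum_{\omega \in Q} \alpha_\omega \omega(n)\Big|^2 = \sum_{Q \times Q} \Big(\sum_{n=1}^{\infty} f(n)\, \bar\omega \omega'(n)\Big) \bar\alpha_\omega \alpha_{\omega'}.$$
Since any hermitian form $\sum_{i,j} a_{ij} x_i \bar x_j$ satisfies
$$\Big|\sum_{i,j} a_{ij} x_i \bar x_j\Big| \le \sum_{i,j} |a_{ij}| \cdot \tfrac12 \big(|x_i|^2 + |x_j|^2\big) \le \Big(\max_i \sum_j |a_{ij}|\Big) \sum_i |x_i|^2,$$
we obtain
$$(1.6)\qquad \sum_{n \in \mathcal{N}} \Big|\sum_{\omega \in Q} \alpha_\omega \omega(n)\Big|^2 \le \Big(\max_{\omega \in Q} \sum_{\omega' \in Q} \Big|\sum_{n=1}^{\infty} f(n)\, \bar\omega \omega'(n)\Big|\Big) \sum_{\omega \in Q} |\alpha_\omega|^2;$$
hence
$$(1.7)\qquad \|\mathcal{D}^*\|^2 = \|\mathcal{D}\|^2 \le \max_{\omega} \sum_{\omega'} |K(\bar\omega \omega')| \le F + K|Q|,$$
where $K(\omega) = \sum_{n=1}^{\infty} f(n) \omega(n)$, $K = \sup_{\omega \ne \omega'} |K(\bar\omega \omega')|$.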
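The elementary bound for hermitian forms used in the step from (1.5) to (1.6), namely $|\sum_{i,j} a_{ij} x_i \bar x_j| \le (\max_i \sum_j |a_{ij}|) \sum_i |x_i|^2$, can be tested on random data. This is a small sketch with hypothetical function names; the random matrices are generic hermitian matrices, not the specific matrices $K(\bar\omega\omega')$ of the paper.

```python
import random

def hermitian_form(a, x):
    """Value of sum_{i,j} a[i][j] * x[i] * conj(x[j])."""
    n = len(x)
    return sum(a[i][j] * x[i] * x[j].conjugate()
               for i in range(n) for j in range(n))

def row_sum_bound(a, x):
    """(max_i sum_j |a[i][j]|) * sum_i |x[i]|^2."""
    return max(sum(abs(v) for v in row) for row in a) * sum(abs(v) ** 2 for v in x)

def random_hermitian(n, rng):
    """Random n x n hermitian matrix with entries of modulus at most sqrt(2)."""
    a = [[0j] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = complex(rng.uniform(-1, 1), 0.0)
        for j in range(i + 1, n):
            v = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            a[i][j] = v
            a[j][i] = v.conjugate()
    return a

def check_bound(trials=200, n=5, seed=1):
    """Verify |form| <= row-sum bound on random hermitian matrices and vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = random_hermitian(n, rng)
        x = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
        if abs(hermitian_form(a, x)) > row_sum_bound(a, x) + 1e-12:
            return False
    return True
```

The bound follows from $|x_i \bar x_j| \le \tfrac12(|x_i|^2 + |x_j|^2)$ together with $|a_{ij}| = |a_{ji}|$, which is why hermitian symmetry is built into the random test matrices.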
The inequality (1.5) also enables us to obtain a bound for ||^||2,1. Let Aa
= L»-|K (a>co')|; then
z
X a>W
^ X *(<^0««A»^i X l^(^Ol(|aJ2 + |awf) = X^|aJ2,
#x# #x# #
whence
(1.8)
X^1/2a>(n)
*H
The operator 3)\ on L2 (£) defined by
@U{«lo}loea) = <Y,A«m«Mn)
nejT
is the adjoint of the operator @1:L2(jV)-+L2(Q) given by
»i(KM={e^1/2^w|
IS J Q-
By (1.7), ||®*|| = H0J ^ 1, whence
I I2
(i.9) E^1 Z<W")| ^£kl7
Therefore, by Cauchy's inequality,
jr
(i.io) (zlE^JY^fx^fz^1 Iz^wl^fs^Zkl2,
which implies that
(1.11)
wmli£ X i«(^')i
Remark. The usual choices for $Q$ are the following:
(i) Consider for any Dirichlet character $\chi \bmod q$ a finite set of real numbers $t_{\chi,r}$, $1 \le r \le R_\chi$; then $Q = \{\omega \mid \omega(n) = \chi(n) n^{it_{\chi,r}}\}$ and $D = qT$ with $T = \max |t_{\chi,r} - t_{\chi',r'}| + 1$.
(ii) For any $q \le Q$ and any primitive character $\chi \bmod q$ let $t_{\chi,r}$ be as in (i); then $Q = \{\omega \mid \omega(n) = \chi(n) n^{it_{\chi,r}}\}$ and $D \le Q^2 T$.
2. A localization principle. The following lemma generalizes a method due to Bombieri [2].
Lemma 2.1. Let $g(s) = \sum_{n=1}^{\infty} b_n n^{-s}$, $s = \sigma + it$, be such that for suitable $\sigma_0$, $\sigma_1$, $t_0$, $D$, $A$, $l \ge 1$ ($\sigma_0 \le \sigma_1$),
(i) the series is absolutely convergent for $\sigma \ge \sigma_1$;
(ii) $g(s)$ is holomorphic for $\sigma \ge \sigma_0$, $|t - t_0| \le \log^l D$;
(iii) $|g(s)| < D^A$ for $\sigma \ge \sigma_0$, $|t - t_0| \le \log^l D$.
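Define $g_{X,k}(s) = \sum_{n=1}^{\infty} b_n n^{-s} \exp(-(n/X)^k)$ ($k > 0$). Then for $\delta \ge -k/2$, $\sigma \ge \sigma_0 - \delta$, $|t - t_0| \le \tfrac12 \log^l D$, $c > \max(0, \sigma_1 - \sigma_0 - k/2)$,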
Qx. k (s) = Ms) + 0 <( (log1 D) X6 max \g (a+6 + it
(2.1)
wft/i
|f-tol^(l/2) log' D
+ o\kDAXc exp(-(l/4/c) log'D)max |^(5)|[,
= 0 i/5>0.
Proof. We note that, by Mellin's transform,
c + ioo
if /w \ ATW
0x,*(s) = — ^(w + s)rf-+lj — dw, c large.
c-iao
In the above integral we change the path of integration to the union of the following segments ($w = u + iv$):
Llf2 = {w\u = c, M^ilog'D},
L3f4 = {iv|5gii^c,M=iloglZ)},
L5 = {vv|W = c5, M^log'/)}.
If 5 < 0, the residue of the integrand at w = 0 is #(s). Hence
0*
'>(s)=iiijg(w+s)r{-k+l)^dw+^ if5<0'
Clearly
+&(s) if 5=0,
+ 0 if 5>0.
(l/4)log'D
^(w + 5)r(-+l —dw\
\k J w
5 v\ Xd+iv
g(S + G + i(t + v))r[ l+-+f-|-—-dv\
k kj 5 + iv
-(l/4)log'D
|,(w+s)rg+i):
« ATd log/ D max |^r(^-hcr-+- ir)|;
|t-f|g(l/4) log'D
*("+*) ^+l)vdwi
« fcXc e -('/4*> log,D max |# (s)| (/' = 1, 2);
«/cD^ATc^-(1/4k)log,D, by(iii) (/=3,4).
This completes the proof of the lemma.
#(w + s)r -+1 —dw
1 A: / w
Remark. If $g(s)$ is meromorphic in the strip $\sigma \ge \sigma_0$, $|t - t_0| \le \log^l D$, we have to add to the right-hand side of (2.1) the residues of the integrand at the poles of $g$.
We now define
$$(2.2)\qquad f(n) = C\big[e^{-(n/N)^k} - e^{-(2n/N)^k}\big],$$
where C = C(k) is such that (1.4) is fulfilled, and apply the above lemma with
g(s) = L(s, %), as well as the previous remark when x is a principal character, in
order to obtain
(2.3) K(co)<Nd log3D max \L(5 + h, *)| + £(x) Ne~m
|t-t|g(l/2) log3D
with
$$\varepsilon(\chi) = \varphi(q)/q \ \text{for principal } \chi \bmod q, \qquad \varepsilon(\chi) = 0 \ \text{otherwise},$$
provided $N \ll D^B$ for some $B$.
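To see how a constant $C = C(k)$ can be chosen so that the weight in (2.2) satisfies (1.4), one can compute it directly. The sketch below reads (2.2) as $f(n) = C[e^{-(n/N)^k} - e^{-(2n/N)^k}]$ and takes $\mathcal{N}$ to lie in $[N/2, N]$, as in the introduction; the function names, the choice of the smallest admissible $C$, and the truncation of the sum defining $F$ are assumptions made for illustration.

```python
import math

def profile(t, k):
    """exp(-t^k) - exp(-(2t)^k), i.e. f(n)/C with t = n/N; positive for t > 0."""
    return math.exp(-t ** k) - math.exp(-(2 * t) ** k)

def smallest_C(k):
    """Smallest C with C * profile(t, k) >= 1 on [1/2, 1].

    As a function of u = t^k the profile exp(-u) - exp(-2^k u) rises to a single
    maximum and then decreases, so its minimum on an interval is at an endpoint.
    """
    return 1.0 / min(profile(0.5, k), profile(1.0, k))

def weight_sum(N, k, C, cutoff=20):
    """Truncated value of F = sum_{n>=1} f(n); terms beyond cutoff*N are negligible."""
    return sum(C * profile(n / N, k) for n in range(1, cutoff * N))

N, k = 1000, 1
C = smallest_C(k)
F = weight_sum(N, k, C)
```

For $k = 1$ this gives $C \approx 4.3$ and $F \approx 2.15\,N$, in line with the estimate $F \ll N$ used in the proof of Theorem 3.1.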
From the estimate \L(s, x)\<{qTf{a) + E (x modq, Res = tr, \lms\^T) it follows
that, if ||co|| <D, 0 = S = 1,
(2.4) K{co)<NdD»id)+e + £(x) Ne~Mk
uniformly in co. In particular we put 3 = 0 and obtain
(2.5) K(a))<$D1/2 + e+£(x) Ne~M.
From \L(s, x)\<(q\s\/2*)ll2~° logfo|s|) we get, if <5<0,
/\s\nV,2~d
(2.6) K(co)<Ndr-j-j (logD)4 + £(%) Ne^26.
This estimate is significant for N large compared with £>; if 5= —(A/2) logD and
N = AD logD, we have
(2.7) K{co)<Dll2~A log5D + £(x) AT^IM 1o*d.
3. Estimates of the norm $\|\mathcal{D}\|$. The estimate (2.3) for $K(\omega)$ is effective for either a nonprincipal generalized character $\omega$, or a not too small $|t|$.
Remark 3.1. Suppose we are given a bound $\|\mathcal{D}\| \ll H$ for any $\delta$-well spaced set $Q$. Then we obtain the bound $\|\mathcal{D}\| \ll H(\delta/\delta^* + 1)$ for any $\delta^*$-well spaced $Q$, by means of a partition of $Q$ in $\delta$-well spaced subsets whose number is $\ll \delta/\delta^* + 1$.
Since $Q$ is finite, it is well spaced provided that $|t - t'| > 0$ whenever $\bar\chi' \chi$ is principal. Under this assumption the following results give effective bounds.
With the notations of §1 ($D = \sup_{\omega \ne \omega'} \|\bar\omega \omega'\|$) we have the following
Theorem 3.1. Let $Q$ be $(2 \log D)^2$-well spaced and assume $N \ge 2D \log D$. Then
$$(3.1)\qquad \|\mathcal{D}\|^2 \ll N.$$
Proof. From (2.2) and (2.7) we get F<^N and
K{(b(D')<$Dll2-A+e + z(xy;) Ate-"-'IM ,ogD;
hence, $K \ll D^{1-A} + N D^{-A}$ provided $\Omega$ is $(A\log D)^2$-well spaced. From (1.7) and
$|\Omega| \le D^2$, putting $A = 2$, we obtain
$\|\mathcal{D}\|^2 \ll N + (D^{-1} + N D^{-2})\,D^2 \ll N$. Q.E.D.
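The following remark enables us to derive a bound for $\|\mathcal{D}\|$ which does not
depend on the size of $N$. If $m$ is a positive integer such that $\omega(m) \ne 0$ for any $\omega \in \Omega$,
then $|\sum_{\mathcal{N}} a_n\omega(n)| = |\sum_{\mathcal{N}} a_n\omega(mn)|$, whence $\|\mathcal{D}(\mathcal{N}, \Omega)\|_{p,q} = \|\mathcal{D}(m\mathcal{N}, \Omega)\|_{p,q}$.
Theorem 3.2. If $\Omega$ is $(2\log D)^2$-well spaced, then
(3.2) $\|\mathcal{D}\|^2 \ll N + D\log^2 D$.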
Proof. By Theorem 3.1 the proof is needed only if $N < 2D\log D$. Let us
consider a finite sequence of primes $p_1 < p_2 < \cdots < p_s$ such that $N p_1 > 2D\log D$ and
$p_1 \cdots p_s > D$. For every $\omega \in \Omega$ there is at least one $p_j$ such that $\omega(p_j) \ne 0$, and we may
apply Theorem 3.1 to every set $p_j\mathcal{N}$. Multiplying by $\omega(p_j)$ and summing over $j$ we
obtain
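$$\sum_{\Omega}\Big|\sum_{\mathcal{N}} a_n\omega(n)\Big|^2 \le \sum_{j=1}^{s}\sum_{\Omega}\Big|\sum_{\mathcal{N}} a_n\omega(p_j n)\Big|^2 \le \sum_{j=1}^{s}\|\mathcal{D}(p_j\mathcal{N},\Omega)\|^2\sum_{\mathcal{N}}|a_n|^2 \ll \sum_{j=1}^{s}(p_j N)\sum_{\mathcal{N}}|a_n|^2 \le s\,p_s N\sum_{\mathcal{N}}|a_n|^2,$$
whence $\|\mathcal{D}\|^2 \ll s\,p_s N$.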
Choose $p_1 < (4D\log D)/N$ and $s \ll \log D/\log p_1$. Since the number of primes
between $p_1$ and $2p_1$ is asymptotic to $p_1/\log p_1$, there is a prime $p_s$ such that $p_s < 2p_1 < 8D\log D/N$. Therefore
$\|\mathcal{D}\|^2 \ll s\,p_s N \ll (\log D/\log p_1)\,((8D\log D)/N)\,N \ll D\log^2 D$.
Combining this estimate with (3.1), we obtain (3.2). Q.E.D.
We can improve the estimate (3.2) if $|\Omega|$ is small compared with $D$. By (2.4) and
(2.5) we immediately obtain
Theorem 3.3. Let $\Omega$ be $(\log N)$-well spaced. Then, for $0 \le \delta \le 1$,
(3.3.1) $\|\mathcal{D}\|^2 \ll N + N^{\delta} D^{\mu(\delta)+\varepsilon}|\Omega|$.
In particular for $\delta = 0$ we have
(3.3.2) $\|\mathcal{D}\|^2 \ll N + D^{1/2+\varepsilon}|\Omega|$.
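Remark 3.2. If $N < D\log D$ it follows from $\mu(\delta) \le 1/2 - \delta/2$ ($0 \le \delta \le 1$) that
(3.3.2) improves (3.2) provided $|\Omega| \ll D^{1/2}$, while (3.3.1) for $\delta = \tfrac12$ gives the bound
$\|\mathcal{D}\|^2 \ll N + N^{1/2} D^{\mu(1/2)+\varepsilon}|\Omega|$, which is stronger than the previous one for $|\Omega| \ll D^{3/4} N^{-1/2}$.
On the other hand, if we assume the Lindelöf hypothesis $\mu(\tfrac12) = 0$, (3.3.1)
gives
$\|\mathcal{D}\|^2 \ll N + N^{1/2} D^{\varepsilon}|\Omega|$.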
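We can improve the bounds (3.3.1) and (3.3.2) by applying the following lemma
(see for example Bombieri [2]).
Lemma 3.1. Assume $L(s, \chi) \ne 0$ for $\chi \bmod q$, $s = \sigma + it$, $\sigma > \alpha$, $|t - t_0| < \log^2 T$.
Then
$|L(s, \chi)| \ll (qT)^{\mu_1(\sigma,\alpha)+\varepsilon}$ for $|t - t_0| < \tfrac12\log^2 T$,
where
$\mu_1(\sigma, \alpha) = 0$ if $\sigma \ge \alpha$,
$\mu_1(\sigma, \alpha) = (\alpha - \sigma)/2$ if $1 - \alpha \le \sigma \le \alpha$,
$\mu_1(\sigma, \alpha) = \tfrac12 - \sigma$ if $\sigma \le 1 - \alpha$.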
Define, for any p}± 1, fi{p) (a) to be the least exponent for which
T
max j- £
-T
Also denote by $N(\alpha, T; \chi)$ the number of zeros of $L(s, \chi)$ in the rectangle $\alpha \le \sigma \le 1$,
$|t| \le T$, and put
AT (a, T;G)=max £ NfaT;**').
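Theorem 3.4. Let $\Omega$ be $(\log^2 D)$-well spaced. Then, for $\alpha \ge 1 - \delta$, $\delta \le \tfrac12$,
$T = \max_{\omega,\omega'} |\tau - \tau'| + 1$,
(3.4.1) $\|\mathcal{D}\|^2 \ll N + N^{\delta} D^{1/2-\delta+(\alpha-1+\delta)/2+\varepsilon}|\Omega| + N^{\delta} D^{1/2-\delta+1/p+\mu^{(p)}(1-\delta)+\varepsilon}\,[N(\alpha, T; \Omega)]^{1-1/p}$.
In particular for $\alpha = 1 - \delta$, $p = 2/\delta$, we have
(3.4.2) $\|\mathcal{D}\|^2 \ll N + N^{\delta} D^{1/2-\delta+\varepsilon}|\Omega| + N^{\delta} D^{1/2-\delta/2+\varepsilon}\,[N(1-\delta, T; \Omega)]^{1-\delta/2}$.
In both (3.4.1) and (3.4.2) one may replace the factor $[N(\alpha, T; \Omega)]^{1-1/p}$ by
$\min\{|\Omega|, [N(\alpha, T; \Omega)]^{1-1/p}\}$.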
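Proof. Let $\omega^* \in \Omega$ be such that, for any $\omega' \in \Omega$, $\sum_{\omega\in\Omega} |K(\omega\bar\omega')| \le
\sum_{\omega\in\Omega} |K(\omega\bar\omega^*)|$, and let $\Omega^* = \Omega\bar\omega^*$. Then (1.7) implies $\|\mathcal{D}\|^2 \ll N + \sum_{\Omega^*} |K(\omega)|$ which,
combined with (2.3) and with the functional equation for the $L$-functions, gives
(3.5) $\|\mathcal{D}\|^2 \ll N + N^{\delta} D^{1/2-\delta} \sum_{\Omega^*} \max_{|t-\tau| \le (1/2)\log^2 D} |L(1-\delta+it, \chi)|$.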
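Let $\tau_\omega$ verify
$|L(1-\delta+i\tau_\omega, \chi)| = \max_{|t-\tau| \le (1/2)\log^2 D} |L(1-\delta+it, \chi)|$,
and let
$I_m = \{t \mid m\log^2 D < t \le (m+1)\log^2 D\}$.
We follow here an idea of Turán [20], [12].
Define $\mathcal{B}_\alpha(\chi)$ to be the set of strips $I_m$ such that $L(\sigma+it, \chi) \ne 0$ for $\sigma \ge \alpha$, $t \in I_m$,
and let $\mathcal{C}_\alpha(\chi)$ be the set of the remaining strips. Let $\Omega_b$ (resp. $\Omega_c$) be the set of $\omega \in \Omega^*$
such that $\tau_\omega$ belongs to a strip $I_m \in \mathcal{B}_\alpha(\chi)$ (resp. $\mathcal{C}_\alpha(\chi)$). Then, for $\alpha \ge 1 - \delta$,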
\L{° + it>XX')\
p ) i/p
dt> <^D«(p)<">+£.
(3.6)
n* Qh &c
«D<'-1+')/2+I|0| + I|L(l-« + «„„ Z)|.
Since |#a(x)| £N(<x, T; x), Holder's inequality yields
(3.7) £|L(1-* + «„, Z)|£
tic
LXeX(Q*)
*T:*]-''\$
\L(l-S + ixwX)\'
i ip
Applying the Sobolev inequality $|f(x)| \le \frac{1}{b-a}\int_a^b |f(x)|\,dx + \int_a^b |f'(x)|\,dx$ we
get
X|L(l-^ + JTra,X)|p
T
<1 I {|L(l-5 + it,z)l"+P|L(l-6 + ir,x)lp"1|£(l-^+^x)l}A.
From this, using Cauchy's integral formula for $L'(s, \chi)$ in terms of $L(s, \chi)$, we
obtain, for a suitable $\delta'$ between $\delta - 1/\log D$ and $\delta + 1/\log D$,
(3-8) |_S
|L(l-5 + iTra,x)|"
i/p
«logD
T+l
* 1
X(Q*) J
\L(l-5' + it,x)\p dt
i/p
^/}l/p + Cl")(l-«)+ei
Now (3.4.1) follows from (3.6), (3.7) and (3.8).
We recall the following result of Gabriel [8] (see also Bombieri [1, Lemma 10]).
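(3.9) $\mu^{(p)}(\sigma) \le \frac{\beta-\sigma}{\beta-\alpha}\,\mu^{(h)}(\alpha) + \frac{\sigma-\alpha}{\beta-\alpha}\,\mu^{(k)}(\beta)$,
where $\alpha \le \sigma \le \beta$, $1/p = (1/h)(\beta-\sigma)/(\beta-\alpha) + (1/k)(\sigma-\alpha)/(\beta-\alpha)$.
By (3.9) with $h = 4$, $k = \infty$, $\alpha = \tfrac12$, $\beta = 1$ and the estimate
(3.10) $\mu^{(4)}(1/2) = 0$,
we obtain for any $p \ge 4$
(3.11) $\mu^{(p)}(\sigma) = 0$ if $\sigma \ge 1 - 2/p$.
Now (3.4.2) follows from (3.11) putting $\alpha = 1 - \delta$, $p = 2/\delta$ in (3.4.1).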
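The bookkeeping behind the passage from (3.9) to (3.11) can be checked with exact rational arithmetic. The sketch below only evaluates the relation for $1/p$ in (3.9) with the parameters $h = 4$, $k = \infty$, $\alpha = 1/2$, $\beta = 1$; the function name and the convention `k=None` for $k = \infty$ are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def interpolated_inverse_p(h, k, alpha, beta, sigma):
    """Return 1/p = (1/h)(beta-sigma)/(beta-alpha) + (1/k)(sigma-alpha)/(beta-alpha).

    k=None stands for k = infinity, i.e. 1/k = 0.
    """
    inv_h = Fraction(1, h)
    inv_k = Fraction(0) if k is None else Fraction(1, k)
    width = beta - alpha
    return inv_h * (beta - sigma) / width + inv_k * (sigma - alpha) / width

alpha, beta = Fraction(1, 2), Fraction(1)
results = {}
for sigma in (Fraction(1, 2), Fraction(3, 4), Fraction(9, 10)):
    inv_p = interpolated_inverse_p(4, None, alpha, beta, sigma)
    results[sigma] = inv_p
    print(sigma, 1 / inv_p, 1 - 2 * inv_p)
```

With these parameters $1/p = (1-\sigma)/2$, so the interpolation point is exactly $\sigma = 1 - 2/p$, which is the threshold appearing in (3.11).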
Remark 3.3. (3.10) is obtained by considering fourth moments of $L$-functions
(see for instance Montgomery [17, Theorem 10.1]); further improvements
on the exponent of the modulus $q$ can be derived from Huxley's results [13]
about eighth moments.
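If $\mu_1(\alpha)$ is the least exponent for which $N(\alpha, T; \Omega) \ll D^{\mu_1(\alpha)+\varepsilon}$, then (3.4.2) has
the following form.
(3.12) $\|\mathcal{D}\|^2 \ll N + N^{\delta} D^{1/2-\delta+\varepsilon}|\Omega| + N^{\delta} D^{1/2-\delta/2+(1-\delta/2)\mu_1(1-\delta)+\varepsilon}$.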
Remark 3.4. We can extend the previous estimates to sets $\mathcal{N}$ of integers
which are not necessarily included in any interval $[N/2, N]$. If $N = \sup \mathcal{N}$, let
$\mathcal{N}_h = \mathcal{N} \cap [N/2^h, N/2^{h-1}]$ ($h = 1, 2, \ldots$). Then
£aa><5(W)
h JVh
q I |_ /i J a
i«j2,
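i.e. $\|\mathcal{D}(\mathcal{N}, \Omega)\|^2 \le \sum_h \|\mathcal{D}(\mathcal{N}_h, \Omega)\|^2$. By Theorem 3.1, $\|\mathcal{D}(\mathcal{N}_h, \Omega)\|^2 \ll N/2^{h-1}$ for
$N \ge 2^h D\log D$, while by Theorem 3.2, $\|\mathcal{D}(\mathcal{N}_h, \Omega)\|^2 \ll D\log^2 D$ for $N < 2^h D\log D$.
Hence
(3.13) $\|\mathcal{D}(\mathcal{N}, \Omega)\|^2 \ll N + D\log^3 D$
holds for any set $\mathcal{N}$ of positive integers $\le N$.
The same argument allows us to establish Theorems 3.3 and 3.4 for any set $\mathcal{N}$
of positive integers $\le N$.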
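The dyadic decomposition of Remark 3.4 is easy to make concrete. The sketch below assumes one particular way of resolving the overlap of the closed intervals $[N/2^h, N/2^{h-1}]$ at their endpoints: each integer is placed in the block with the smallest admissible $h$. The function name is an illustrative choice.

```python
def dyadic_blocks(integers):
    """Split a finite set of positive integers into blocks N_h inside [N/2^h, N/2^(h-1)].

    N is the largest element; an integer lying on a shared endpoint goes to the
    block with the smaller index h (an assumption made here to avoid duplicates).
    """
    N = max(integers)
    blocks = {}
    for n in sorted(integers):
        h = 1
        while n * 2 ** h < N:   # find the smallest h with n >= N / 2^h
            h += 1
        blocks.setdefault(h, []).append(n)
    return blocks

blocks = dyadic_blocks([1, 2, 3, 5, 8, 13, 16])
print(blocks)
```

Every block lies in an interval of the form $[M/2, M]$ with $M = N/2^{h-1}$, so the theorems proved for such intervals apply block by block.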
Remark 3.5. In order to estimate expressions of the type $\sum_{r=1}^{R} |\sum_{\mathcal{N}} a_n n^{it_r}|^2$,
corresponding to a set $\Omega$ of generalized characters $\omega(n) = n^{it}$, we apply the following
idea of Huxley (unpublished).
Let H(T) = supQ maxw £©, \K((o'cb)\, where sup^ is taken over all sets Q such
that D(Q)^T. It follows from (1.7) that
;!*>''
^tfOOIkl2
provided supr>s|tr-tJ^T.
Let Im = [mV, (m + 1) V~\ and &m = {coe& | co(n) = nir, re/m}; then
Now we may apply the estimates so far considered to H{V). For instance, it follows
from (3.4.2) that
E anco(n)
<Z{N + NdV1/2-d+£\Qm\ + NdV1/2-d/2+e[N(\-S, K)]1"^2} £ \af,
where $N(1-\delta, V)$ is the number of zeros of $\zeta(s)$ for $\sigma \ge 1-\delta$, $0 \le t \le V$. Summing
over $m$ we get, for $1 \ll V \ll T$,
2»H
(3.14) ^JArl + ^K1/2-^^ + ^-K1/2^/2+£[iV(l-5, K)]1-^2!^!^2,
provided infr s \tr - ts\ ^ log2 AT.
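4. Estimates of the norm $\|\mathcal{D}\|_{p,q}$ with $(p, q) \ne (2, 2)$. By Hölder's inequality
it follows that
(4.1) $\|\mathcal{D}\|_{r,q} \le N^{1/p-1/r}\,\|\mathcal{D}\|_{p,q}$ for $r \ge p$.
We shall now estimate the norms $\|\mathcal{D}\|_{kp,kq}$ ($k$ a positive integer) by means of
$\|\mathcal{D}\|_{p,q}$. Let $\mathcal{N}^k = \{m \mid m = n_1 \cdots n_k,\ n_i \in \mathcal{N}\}$. Then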
(4.2)
z
E<v°(")
*«
=E
E^-^^K-^)
I.**
^||^(^*,Q)|||.J Z
ejVk
E *«.,•••**
p\q!p
Applying twice Holder's inequality to the right-hand side of (4.2) we obtain
E «■,-«*
^ E [dk(m)Y{1-1M[ E KI^KI"
me/k
1/fc
(4.3) £< X [</kM](p~1/k)/(1~1/k)
me/k
Z Z KI*p...kJ*'
mejVk n\...rih = m
<^\_Nk (log Ny^^y-1'kYi\af.
jV
Therefore, from (4.2) and (4.3),
kq\l/kq / \l/fcp
J < \\® {^\ £3)|| \% N^~ W (log AT)^ \Y \an\kp)
Hence, by (4.1) and (4.4) we get
(4.5) \\&{jr Q)\\rM<N^-^ (\ogNfk>* \\®{Jf\ 0)||PV{
for r^kp.
Combining (4.5) with the results of §3, we can deduce estimates for $\|\mathcal{D}\|_{p,q}$. We
shall adopt for the sake of simplicity the estimate $\|\mathcal{D}(\mathcal{N}, \Omega)\| \ll N^{1/2} + D_1^{1/2}$ (see
Theorem 3.2), where $D_1 = D\log^2 D$. Therefore, if $p = q = 2k$ we obtain
Theorem 4.1. Let $\Omega$ be $(\log D)^2$-well spaced. Then, for any positive integer $k$,
(4.6) $\|\mathcal{D}\|_{2k,2k} \ll (N^{1/2} + D_1^{1/2k})\,N^{1/2-1/2k+\varepsilon}$.
We may now interpolate between $2k - 2$ and $2k$ by means of Theorem 1.2 in
order to obtain
Theorem 4.2. Let $\Omega$ be $(\log D)^2$-well spaced. Then, for any $p \ge 2$ and for
$k = [p/2]$,
(4.7) $\|\mathcal{D}\|_{p,p} \ll (N^{(k+1)/2} + D_1^{1/2})^{1-2k/p}\,(N^{k/2} + D_1^{1/2})^{(2k+2)/p-1}\,N^{1/2-1/p+\varepsilon}$.
Remark 4.1. The estimate (4.7) takes the following more explicit forms
(we put $k = [p/2]$ as before and $\vartheta = p/2 - [p/2]$):
(4.8.1) $\|\mathcal{D}\|_{p,p} \ll N^{1-1/p+\varepsilon}$ for $D_1 \le N^k$;
(4.8.2) $\|\mathcal{D}\|_{p,p} \ll D_1^{1/p}\,N^{1/2-1/p+\varepsilon}$ for $D_1 \ge N^{k+1}$;
(4.8.3) $\|\mathcal{D}\|_{p,p} \ll D_1^{(k+1)/p-1/2}\,N^{k/2-k(k+1)/p+1-1/p+\varepsilon} = D_1^{(1-\vartheta)/p}\,N^{\vartheta/2+\vartheta(1-\vartheta)/p+1/2-1/p+\varepsilon}$ for $N^k < D_1 \le N^{k+1}$.
(4.8.1) and (4.8.2) are particular cases of the very strong statement
$\|\mathcal{D}\|_{p,p} \ll (N^{1/2} + D_1^{1/p})\,N^{1/2-1/p+\varepsilon}$,
which has recently been conjectured by Montgomery [17, Conjecture 9.2] in a
special case. Unfortunately (4.8.3) yields the weaker result
(4-4) (E
E anV(n)\
(4.8.4) $\|\mathcal{D}\|_{p,p} \ll (N^{1/2+1/4p} + D_1^{1/p+1/2p^2})\,N^{1/2-1/p+\varepsilon}$ for all $p \ge 2$.
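The three regimes (4.8.1) to (4.8.3) can be checked against (4.7) by comparing exponents. The sketch below is a consistency check only, under the following assumptions: $D_1 = N^x$ for a rational $x \ge 0$, the $\varepsilon$ is dropped, and a sum $A + B$ is replaced by $\max(A, B)$, which changes the bound by at most a constant factor. The function names are illustrative.

```python
from fractions import Fraction

def exponent_47(p, x):
    """Exponent of N in the right-hand side of (4.7) when D_1 = N**x (epsilon dropped)."""
    p, x = Fraction(p), Fraction(x)
    k = p // 2                                    # k = [p/2], an int
    first = (1 - 2 * k / p) * max(Fraction(k + 1, 2), x / 2)
    second = ((2 * k + 2) / p - 1) * max(Fraction(k, 2), x / 2)
    return first + second + Fraction(1, 2) - 1 / p

def exponent_48(p, x):
    """Exponent of N given by (4.8.1), (4.8.2) or (4.8.3), depending on x."""
    p, x = Fraction(p), Fraction(x)
    k = p // 2
    theta = p / 2 - k
    if x <= k:
        return 1 - 1 / p
    if x >= k + 1:
        return x / p + Fraction(1, 2) - 1 / p
    return x * (1 - theta) / p + theta / 2 + theta * (1 - theta) / p + Fraction(1, 2) - 1 / p

ps = [2, 3, Fraction(5, 2), 4, 5]
xs = [Fraction(j, 2) for j in range(0, 9)]
agree = all(exponent_47(p, x) == exponent_48(p, x) for p in ps for x in xs)
print(agree)
```

For instance, with $p = 3$ and $D_1 = N^{3/2}$ both functions return the exponent $3/4$.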
Another estimate for $\|\mathcal{D}\|_{p,q}$ can be obtained by means of the adjoint operator
(Theorem 1.1). By Hölder's inequality ($1 \le p' \le 2$) we get
(4.9)
Z «»<»(")
P'y/P' (
^N(2-P')/2P' £
Z aX^)
Q
2N1/2
If f(ri) is a function satisfying (1.4), then
Zaco^(")|
Q
2\l/2 /oo
Z aX")
2\ 1/2
v2^ 1/2
(4.10)
^f1/2fzi«J2Y/2 + K1/2ElaJ.
Since for l^q'^2 we have
1/9'
and
1/2 / \l/«'
ZW2) ^ I !«„!•'
we deduce from (4.9) and (4.10) that
ip'\ i/p'
Z aco^(")
^NW-l/2(Fl/2 + Xl/2|0|l-W) ^|aj,
L/«'
Now define $p$, $q$ such that $1/p + 1/p' = 1/q + 1/q' = 1$. The above inequality reduces to
(4.11) $\|\mathcal{D}(\mathcal{N}, \Omega)\|_{p,q} \ll N^{1/2-1/p}\,(N^{1/2} + K^{1/2}|\Omega|^{1/q})$
for any $p, q \ge 2$, provided $F \ll N$.
Let
(2.2) $f(n) = C\,[e^{-(n/N)^k} - e^{-(2n/N)^k}]$.
Combining (4.11) with (2.4) and (2.5), we generalize Theorem 3.3 as follows.
Theorem 4.3. Let $\Omega$ be $(\log N)$-well spaced. Then, for any $p, q \ge 2$, $0 \le \delta \le 1$,
(4.12.1) $\|\mathcal{D}\|_{p,q} \ll N^{1/2-1/p}\,(N^{1/2} + N^{\delta/2} D^{\mu(\delta)/2+\varepsilon}|\Omega|^{1/q})$.
In particular for $\delta = 0$,
(4.12.2) $\|\mathcal{D}\|_{p,q} \ll N^{1/2-1/p}\,(N^{1/2} + D^{1/4+\varepsilon}|\Omega|^{1/q})$.
Remark 4.2. In view of Remark 3.4, the results of the present section hold
for every set $\mathcal{N}$ of positive integers not exceeding $N$.
5. Estimates for sums of type $\sum_{\Omega} |\sum_{\mathcal{N}} a_n\omega(n)\,n^{-s(\omega)}|^p$. We can easily
generalize the previous theorems to the case of an arbitrary complex exponent for $n$. Our
result is as follows.
Theorem 5.1. Let $\mathcal{N}$ be an arbitrary set of positive integers not exceeding $N$.
Let $\Omega$ be a set of generalized characters. For any $\omega \in \Omega$ let $s(\omega) = \sigma(\omega) + i\tau(\omega)$. Let
$\Omega' = \{\omega' \mid \omega'(n) = \omega(n)\,n^{-i\tau(\omega)},\ \omega \in \Omega\}$ and define $\sigma_0 = \inf_{\Omega} \sigma(\omega)$. Then for any $p, q \ge 2$,
(5-1) E
.ft
" < \\3(JT, 0')||p,,(log logAO1^ |a„|*n-"°V/P
Xa„co(n)n"sH
Moreover, ifp^q the following stronger inequality holds:
jr
(5-2) (I
-ft
1,q L. / log N\y/p
^ii^(^,^iip,,|p^ip""pffY+iog^J} •
Proof. We may suppose $\sigma_0 = 0$, by substituting $a_n$ with $a_n n^{-\sigma_0}$; we also
assume without loss of generality that $\tau(\omega) = 0$ for any $\omega \in \Omega$ (i.e. $\Omega' = \Omega$). Then, by
partial summation ($\sigma = \sigma(\omega)$),
(5.3)
IS
AT' J ^M+f {£ ana>(n)\ ^~a~l d{
<2UN-qa
Z an<t>{n)\
^aMnn^r'-'di
We now apply Holder's inequality; if W+!/<?= 1, then
{£ a^inftcC'-'dZ
(5.4)
TV
41
N
E a»w(«)|
dt
E anW(")
{log 5.
z iog<r
|far,''-1(iog^'-i^j*/"
Summing over to we obtain from (5.3) and (5.4),
Xa„co(H)H-s<">
1^*
A1/*
^1
X>„w(")
"J?
E a«<°(n)\
i log^J
(5.5)
\QlP r/S \q/P iz ll/Q
.^"
£log£
Hence we immediately have
X«„co(n)n"
s(a>)
jV
" < \\nP,q(\+iog logjv)1" h \an\p\IP,
which is equivalent to (5.1).
Now define |y||, = (fN \g(Z)\q d log logf)1'*. It is well known that, for any p^q,
q~\\\y\\\p'
Hence, putting g{£) = (£? \an\p)1/p, we obtain by (5.5),
X>„co(n) *"*<*»
q\l/q
^2 II^Hp.«{(l k|P)1/P + (| £ kl* d log log<^)1/P
<4 1101
n
d log log £
1/p
(5.2) immediately follows.
Corollary. If $\mathcal{N}$ is comprised between $N/2$ and $N$, we have
(5-6) (I
js
References
1. E. Bombieri, On the large sieve, Mathematika 12 (1965), 201-225. MR 33 #5590.
2. E. Bombieri, Density theorems for the zeta function, Proc. Sympos. Pure Math., vol. 20, Amer. Math.
Soc., Providence, R.I., 1971, pp. 352-358.
3. E. Bombieri, A note on the large sieve, Acta Arith. 18 (1971), 401-404.
4. E. Bombieri and H. Davenport, On the large sieve method, Number Theory and Analysis
(Papers in Honor of Edmund Landau), Plenum, New York, 1969, pp. 9-22. MR 41 #5327.
5. E. Bombieri and H. Davenport, Some inequalities involving trigonometrical polynomials, Ann. Scuola Norm. Sup. Pisa
(3) 23 (1969), 223-241. MR 40 #2636.
6. H. Davenport and H. Halberstam, The values of a trigonometrical polynomial at well spaced
points, Mathematika 13 (1966), 91-96; Corrigendum and Addendum, ibid. 14 (1967), 229-232.
MR 33 #5592; MR 36 #2569.
7. P. D. T. A. Elliott, On inequalities of large sieve type, Acta Arith. 18 (1971), 405-422.
8. R. M. Gabriel, Some results concerning the integrals of moduli of regular functions along certain
curves, J. London Math. Soc. 2 (1927), 112-117.
9. P. X. Gallagher, The large sieve, Mathematika 14 (1967), 14-20. MR 35 #5411.
10. P. X. Gallagher, A large sieve density estimate near $\sigma = 1$, Invent. Math. 11 (1970), 329-339. MR 43
#4775.
11. G. Halász, Über die Mittelwerte multiplikativer zahlentheoretischer Funktionen, Acta Math.
Acad. Sci. Hungar. 19 (1968), 365-403. MR 37 #6254.
12. G. Halász and P. Turán, On the distribution of roots of Riemann zeta and allied functions. I,
J. Number Theory 1 (1969), 121-137. MR 38 #4422.
13. M. N. Huxley, The large sieve inequality for algebraic number fields. II. Means of moments
of Hecke zeta-functions, Proc. London Math. Soc. (3) 21 (1970), 108-128. MR 42 #5944.
14. Ju. V. Linnik, The large sieve, C. R. (Dokl.) Acad. Sci. URSS 30 (1941), 292-294. MR 2, 349.
15. H. L. Montgomery, Mean and large values of Dirichlet polynomials, Invent. Math. 8 (1969), 334-345.
MR 42 #3029.
16. H. L. Montgomery, Zeros of L-functions, Invent. Math. 8 (1969), 346-354. MR 40 #2620.
17. H. L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227, Springer-Verlag,
Berlin and New York, 1971.
18. A. Rényi, On the large sieve of Ju. V. Linnik, Compositio Math. 8 (1950), 68-75. MR 11, 581.
19. K. F. Roth, On the large sieves of Linnik and Rényi, Mathematika 12 (1965), 1-9. MR 33 #5589.
20. P. Turán, On the so-called density-hypothesis in the theory of the zeta-function of Riemann,
Acta Arith. 4 (1958), 31-56. MR 20 #2304.
Istituto Matematico, Universita di Pisa
Pisa, Italy
ON IWASAWA'S ANALOGUE OF THE JACOBIAN
FOR TOTALLY REAL NUMBER FIELDS
JOHN COATES
1. Introduction. The present paper is a summary, without proofs, of some
joint work with S. Lichtenbaum. Detailed proofs, as well as some material not
discussed here, will be appearing in [4]; see also [3], [10] for earlier work in the
same direction. We begin by indicating two basic problems in algebraic number
theory which motivated [4].
The first is the problem of finding analogues of Dirichlet's class number formula,
in the following sense. Let $F$ be a number field, $r_1$ its number of real embeddings,
and $r_2$ its number of pairs of complex conjugate embeddings. For each integer
$n \ge 0$, let $d_n$ be either $r_1 + r_2 - 1$, $r_2$, or $r_1 + r_2$, according as $n$ is 0, odd, or even and
positive. Let $\zeta(F, s)$ be the complex zeta function of $F$. By the functional equation
for $\zeta(F, s)$, we have $\zeta(F, s) \sim c_n (s + n)^{d_n}$ as $s \to -n$, where $c_n$ is some constant.
Dirichlet's class number formula asserts that $c_0 = hR/w$, where $h$ is the class number,
$w$ the number of roots of unity, and $R$ the regulator of $F$. Do there exist similar
formulae for the $c_n$ when $n > 0$? A crude analogy suggests that one should look
for a formula for $c_n$ of the form $h_n R_n/w_n$, where $h_n$ is the order of some generalized
ideal class group, $w_n$ is the order of some group of roots of unity, and $R_n$ is some
$d_n \times d_n$ determinant generalizing the regulator. The simplest case is when the
regulator term $R_n$ does not occur, that is, when $F$ is totally real and $n$ is odd, and indeed
Siegel [15], [16] has shown in this case that $\zeta(F, -n)$ is a rational number.
However, Siegel's proof gives no interpretation of this rational number in the form
$h_n/w_n$ suggested above. Recently, Birch and Tate [18] in the case $n = 1$, and
independently Lichtenbaum [10] for all odd positive $n$, made a precise conjecture
of this kind. While apparently quite different, the two conjectures are in fact the
same for $n = 1$. Part of the object of [4] has been to study these conjectures, and,
in particular, to develop techniques for proving them for a class of totally real
abelian extensions of the rational field $Q$.
AMS 1970 subject classifications. Primary 12A70.
© 1973, American Mathematical Society
The second problem arises from the well-known analogy between number
fields and curves over finite fields. Let $C$ be a complete, nonsingular curve of genus
$g \ge 1$ defined over a finite field $k$, and let $\mathcal{J}$ be the Jacobian variety of $C$. For each
prime number $l$ distinct from the characteristic of $k$, let $\mathcal{J}_l$ be the $l$-primary
subgroup of the group of points of $\mathcal{J}$ defined over the algebraic closure $\bar k$ of $k$. Then,
as an abelian group, $\mathcal{J}_l$ is isomorphic to $(Q_l/Z_l)^{2g}$, where $Q_l$ and $Z_l$ denote the
field of $l$-adic numbers and the ring of $l$-adic integers, respectively. The Frobenius
automorphism of $\bar k/k$ induces an endomorphism of $\mathcal{J}_l$, and a basic theorem of
Weil [19] asserts that the characteristic polynomial of this endomorphism is the
quotient of the zeta function of the curve $C$ and the zeta function of a curve of
genus 0. Recently, Iwasawa [6], [7] conjectured that a certain $\Gamma$-module in his
theory of $Z_l$-extensions should provide a good analogue of $\mathcal{J}_l$ for number fields.
Further, for a very special class of abelian extensions of $Q$, he established a beautiful
analogue of Weil's theorem by relating the characteristic polynomial of this
$\Gamma$-module to the $l$-adic zeta function of the number field in the sense of Kubota-Leopoldt
[9]. Much of the work of [4] can be viewed as providing evidence that
this result of Iwasawa is valid, without restriction, for all totally real number
fields. In fact, it turns out that the conjecture of Lichtenbaum mentioned before
is equivalent to the assertion that the characteristic polynomial of Iwasawa's
analogue of $\mathcal{J}_l$ is always an $l$-adic function of the type constructed by Kubota-Leopoldt.
Moreover, this connexion does not seem to be a superficial one, and it
is used in [4] to obtain results about both problems.
2. Iwasawa's analogue. The following notation will be used throughout. Let
$l$ be an odd prime number, and let $Q_l$, $Z_l$ be the field of $l$-adic numbers and the
ring of $l$-adic integers, respectively. For each integer $m \ge 1$, $\mu_m$ will denote the
group of $m$th roots of unity, and we put $W = \bigcup_{n=1}^{\infty} \mu_{l^n}$, $\mathcal{T} = \mathrm{proj\,lim}\ \mu_{l^n}$. If $K$ is a
field, $\bar K$ will denote the algebraic closure of $K$. If $E$ is a Galois extension of $K$, we
write $G(E/K)$ for the Galois group of $E$ over $K$. For each integer $n \ge 0$, let $\mathcal{T}^{(n)}$
denote the tensor product of $\mathcal{T}$ with itself $n$ times over $Z_l$. If $B$ is any discrete
$l$-primary abelian group on which $G(\bar K/K)$ operates continuously, we define $B(n)$
to be the $G(\bar K/K)$-module $B \otimes_{Z_l} \mathcal{T}^{(n)}$; here it is understood that $G(\bar K/K)$ acts on
the tensor product by the diagonal action. Finally, $\Lambda$ will denote the ring of formal
power series in an indeterminate $T$ with coefficients in $Z_l$.
Throughout, $F$ will denote a totally real number field of finite degree over $Q$,
and we put
$F_0 = F(\mu_l)$, $F_\infty = F(W)$.
For each $n \ge 0$, let $F_n$ denote the unique subextension of $F_\infty/F_0$ of degree $l^n$ over
$F_0$; each $F_n$ is a totally imaginary quadratic extension of a totally real subfield,
which we denote by $F_n^+$. Put $\Gamma = G(F_\infty/F_0)$. For reasons that will be clear later, it is
more natural to consider the analogue of the group $\mathcal{J}_l$ discussed earlier for $F_0^+$
rather than $F$ itself. To this end, let $A_n$ ($n \ge 0$) be the $l$-primary subgroup of the ideal
class group of $F_n$, and let $A = \mathrm{ind\,lim}\ A_n$, the inductive limit being taken relative
to the homomorphisms induced by the inclusion of the divisor group of $F_n$ in
the divisor group of $F_m$ when $n \le m$. Let $J$ denote complex conjugation. Once we
choose an embedding of $F_\infty$ into the complex field $C$, there is a natural action of
$J$ on $A$; it is easily seen that this action does not depend on the particular embedding
chosen. We then have the decomposition $A = A^+ \oplus A^-$, where $A^+ = A^{1+J}$, $A^- = A^{1-J}$.
Iwasawa has proposed that $A^-$ should provide a good analogue of $\mathcal{J}_l$
for $F_0^+$. As a first step towards explaining the evidence for the analogy, we recall
the following basic result of Iwasawa [8]. Let $(A^-)^\wedge = \mathrm{Hom}(A^-, Q_l/Z_l)$ be the
Pontrjagin dual of the discrete group $A^-$. We define an action of $\Gamma$ on $(A^-)^\wedge$
by specifying that $(\sigma\varphi)(a) = \varphi(\sigma a)$ for $\sigma \in \Gamma$, $\varphi \in (A^-)^\wedge$ and $a \in A^-$. Fix a topological
generator $\gamma_0$ of $\Gamma$. Then, as is well known, the $\Gamma$-structure on $(A^-)^\wedge$ gives rise to a
unique $\Lambda$-module structure on $(A^-)^\wedge$ such that $\gamma_0\varphi = (1 + T)\varphi$ for all $\varphi \in (A^-)^\wedge$.
Then Iwasawa proved, by arguments based on class field theory and the structure
theory of noetherian $\Lambda$-modules, that there exist nonzero elements $f_1(T), \ldots, f_r(T)$
of $\Lambda$, $r$ being some nonnegative integer, such that there is an exact sequence
(1) $0 \to (A^-)^\wedge \to \bigoplus_{i=1}^{r} \Lambda/(f_i(T)) \to D \to 0$,
where $D$ is some $\Lambda$-module of finite cardinality. Moreover, assuming that the choice
of $\gamma_0$ is fixed, he showed that the power series
(2) $\zeta_l(F_0^+, T) = \prod_{i=1}^{r} f_i(T)$
is uniquely determined by $A^-$ up to a unit in $\Lambda$. This power series plays a basic
role in our work. As indicated by our choice of notation, we believe that $\zeta_l(F_0^+, T)$
deserves to be called the $l$-adic zeta function of $F_0^+$.
3. Lichtenbaum's conjecture. Following [10], we first state the conjecture in
terms of étale cohomology. For the definition and basic facts about étale cohomology,
see [1]. Let $\mathcal{O}$ be the ring of integers of $F$, and $X$ the spectrum of the
ring $\mathcal{O}[1/l]$. Let $j: \mathrm{Spec}(F) \to X$ be the natural inclusion. For each $n \ge 0$, we
can view the $G(\bar F/F)$-module $W(n)$ as a sheaf for the étale topology for $\mathrm{Spec}(F)$,
and we may take its direct image $j_* W(n)$ on $X$. Let $\zeta(F, s)$ be the complex zeta
function of $F$. Finally, let $|\ |_l$ be the valuation of $Z_l$, normalized so that $|l|_l = l^{-1}$,
and let $|M|$ denote the cardinality of any finite set $M$.
Conjecture 1 (Lichtenbaum). Let $n$ be an odd positive integer. Then
(i) the $H^i(X, j_* W(n))$ are finite for all $i \ge 0$ and trivial for all $i \ge 2$;
(ii) $|H^1(X, j_* W(n))| / |H^0(X, j_* W(n))| = |\zeta(F, -n)|_l^{-1}$.
Lichtenbaum also conjectured that the same result is valid for $l = 2$. Note first
that his conjecture would imply the following estimate for the denominator of the
rational number $\zeta(F, -n)$. If $E$ is a field and $m$ a positive integer, let $w_m(E)$ denote
the largest integer $k$ such that $G(E(\mu_k)/E)$ is annihilated by $m$. In particular, $w_1(E)$
denotes the number of roots of unity in $E$. Then it is easily seen that $|H^0(X, j_* W(n))|
= |w_{n+1}(F)|_l^{-1}$, and so the conjecture predicts that $w_{n+1}(F)\,\zeta(F, -n)$ should be a
rational integer for all odd positive integers $n$. When $n = 1$, this last assertion has
been proven by Serre [14]. When $n > 1$, it is still unknown, although it has been
verified in some special cases.
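For the rational field the predicted integrality is classical and can be checked directly. The following sketch takes $F = Q$ only, where $G(Q(\mu_k)/Q) = (Z/kZ)^\times$ and $\zeta(Q, -n) = -B_{n+1}/(n+1)$; it illustrates the definition of $w_m$ and says nothing about general totally real fields. The helper names are illustrative.

```python
from fractions import Fraction
from math import comb

def is_prime(p):
    return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))

def unit_group_exponent(p, e):
    """Exponent of the group (Z/p^e Z)^* for a prime p and e >= 1."""
    if p == 2:
        return 1 if e == 1 else 2 if e == 2 else 2 ** (e - 2)
    return p ** (e - 1) * (p - 1)

def w(m):
    """w_m(Q): the largest k such that (Z/kZ)^* is annihilated by m."""
    k = 1
    for p in range(2, m + 2):          # an odd prime p can occur only if (p - 1) divides m
        if not is_prime(p):
            continue
        e = 0
        while m % unit_group_exponent(p, e + 1) == 0:
            e += 1
        k *= p ** e
    return k

def bernoulli(n):
    """Bernoulli number B_n (with B_1 = -1/2), from the standard recurrence."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B[n]

def zeta_at_negative(n):
    """Riemann zeta value zeta(-n) = -B_{n+1}/(n+1) for an integer n >= 1."""
    return -bernoulli(n + 1) / (n + 1)

for n in (1, 3, 5, 7):
    print(n, w(n + 1), zeta_at_negative(n), w(n + 1) * zeta_at_negative(n))
```

For example $w_2(Q) = 24$ and $\zeta(-1) = -1/12$, so the product is $-2$; the products for $n = 3, 5, 7$ are again integers.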
Henceforth we assume again that $l$ is odd. The next result, which is proven in
[10], relates Conjecture 1 to the Iwasawa module $A^-$.
Theorem 1. For each odd positive integer $n$, $H^1(X, j_* W(n))$ is canonically
isomorphic to $(A^-(n))^G$, where $G = G(F_\infty/F)$. Furthermore, the $H^i(X, j_* W(n))$ are
trivial for all $i \ge 2$ if and only if $(A^-(n))^G$ is finite.
Using this theorem and the exact sequence (1), it is easy to obtain the following
formulation of Lichtenbaum's conjecture for $F_0^+$. Let $\mathcal{O}_0^+$ be the ring of integers of
$F_0^+$, $X_0^+$ the spectrum of $\mathcal{O}_0^+[1/l]$ and $j: \mathrm{Spec}(F_0^+) \to X_0^+$ the natural inclusion. Let
$q_0$ denote the largest power of $l$ such that $\mu_{q_0} \subset F_0$, and let $\kappa: \Gamma \to 1 + q_0 Z_l$ be the
isomorphism defined by $\gamma(\zeta) = \zeta^{\kappa(\gamma)}$ for all $\zeta \in W$ and $\gamma \in \Gamma$.
Proposition 1. Let $n$ be an odd positive integer. Then
(i) the $H^i(X_0^+, j_* W(n))$ are finite for all $i \ge 0$ and trivial for all $i \ge 2$ if and only
if $\zeta_l(F_0^+, \kappa(\gamma_0)^{-n} - 1) \ne 0$;
(ii) $|H^1(X_0^+, j_* W(n))| = |\zeta_l(F_0^+, \kappa(\gamma_0)^{-n} - 1)|_l^{-1}$.
Hence Lichtenbaum's conjecture is true for $F_0^+$ and the prime $l$ if and only if
In particular, this shows that Lichtenbaum's conjecture is essentially equivalent
to the assertion that $\zeta_l(F_0^+, T)$ is an $l$-adic function of the type constructed by
Kubota and Leopoldt [9] when $F_0^+$ is an abelian extension of $Q$. So far, this
remarkable fact has only been proven for a rather special class of abelian extensions
of $Q$; the precise result is given in §4. Even the much weaker assertion that
$\zeta_l(F_0^+, \kappa(\gamma_0)^{-n} - 1) \ne 0$ for all odd positive integers $n$ is unknown in general.
However, it has been proven for $n = 1$ by rather deep arguments involving the $K_2$ of $F_0^+$
(see §6). Of course, it is trivially true that $\zeta_l(F_0^+, \kappa(\gamma_0)^{-n} - 1) \ne 0$ for all but a finite
number of integers $n$.
4. The analytic theory. The results in this section are based on the
fundamental ideas introduced by Iwasawa [6], [7]. These, in turn, have their
origin in a classical theorem of Stickelberger [17]. Let $\chi$ be a primitive Dirichlet
character satisfying $\chi(-1) = -1$. We view the values of $\chi$ as lying in the algebraic
closure of $Q_l$, and let $\mathcal{O}_\chi$ be the ring generated over $Z_l$ by the values of $\chi$. Let $\Lambda_\chi$
be the ring of formal power series in $T$ with coefficients in $\mathcal{O}_\chi$. In [7], Iwasawa has
associated with $\chi$ an element $g(T; \chi)$ of the quotient field of $\Lambda_\chi$. Define $f(T; \chi)$
to be either $g(T; \chi)$ or $(T - 1)\,g(T; \chi)$ according as $\chi \ne \omega$ or $\chi = \omega$; here $\omega$ is the
Dirichlet character modulo $l$ satisfying $\omega(a) \equiv a \bmod lZ_l$ for all integers $a$. We
shall only consider those $\chi$ which have order prime to $l$; and in this case, $f(T; \chi)$
is an element of $\Lambda_\chi$.
Suppose now that $F$, in addition to being totally real, is an abelian extension of
$Q$ of degree prime to $l$. Let $\Phi$ be the character of any imaginary representation of
$G(F_0/Q)$ which is irreducible over $Q_l$, let $e_\Phi$ be the associated orthogonal idempotent
in the group ring $Z_l[G(F_0/Q)]$, and let $\phi$ be the primitive Dirichlet character
associated with an absolutely irreducible component of $\Phi$. We have the direct
sum decomposition $A^- = \bigoplus_\Phi e_\Phi A^-$, where $\Phi$ runs over all distinct characters of
imaginary representations of $G(F_0/Q)$ irreducible over $Q_l$. Let $(e_\Phi A^-)^\wedge$ be the
Pontrjagin dual of $e_\Phi A^-$, it being endowed with a $\Gamma$-module structure in the same
way as $(A^-)^\wedge$.
Conjecture 2. Let $\Phi$ be the character of an imaginary representation of
$G(F_0/Q)$, irreducible over $Q_l$. Then, for a suitable choice of the topological
generator of $\Gamma$, there is an exact sequence of $\Lambda$-modules
$0 \to (e_\Phi A^-)^\wedge \to \Lambda_\phi/(f(T; \phi)) \to D_\Phi \to 0$,
where $D_\Phi$ is a finite $\Lambda$-module.
The following result is then not difficult to establish (cf. [10]).
Proposition 2. If Conjecture 2 is valid for $F$ and $l$, then Conjecture 1 is valid
for $F$ and $l$.
Before stating the actual result we can prove in the direction of Conjecture 1,
we recall the definition of a wild prime of a number field. Let $E$ be a finite extension
of $Q$, and $\mathfrak{p}$ a nonarchimedean prime of $E$ lying above a rational prime $p$. Then
we say that $\mathfrak{p}$ is wild if $\mu_p$ is contained in the completion of $E$ at $\mathfrak{p}$. Note that a
prime of $F_0^+$ lying above $l$ is wild if and only if it splits in $F_0$.
Theorem 2. Let $F$ be a totally real abelian extension of $Q$. Assume that (i) $l$
does not divide the degree of $F$ over $Q$, (ii) no prime of $F_0^+$ lying above $l$ is wild, and
(iii) there exists $a_0 \in A_0^-$ such that $A_0^- = Z_l[G(F_0/Q)]\,a_0$. Then Conjecture 2 is valid
for $F$ and $l$.
When $F = Q$, this result is due to Iwasawa [7]. The general result is proven
in [4]. We shall see in the next section that condition (ii) of Theorem 2 is a natural
one in the theory, being equivalent to the nonvanishing of $\zeta_l(F_0^+, T)$ at $T = 0$.
Unfortunately, condition (iii) is very restrictive, and difficult to verify for any particular
field. However, we give a number of examples of fields to which it applies in §7.
It should also be noted that Leopoldt [11] has proven that (iii) is valid if the class
number of $F_0^+$ is prime to $l$.
5. The vanishing of $\zeta_l(F_0^+, T)$ at $T = 0$. It is shown in [4] that there is a close
connexion between the vanishing of $\zeta_l(F_0^+, T)$ at $T = 0$ and the existence of wild
primes of $F_0^+$ lying above $l$.
Theorem 3. $\zeta_l(F_0^+, T)$ vanishes at $T = 0$ if and only if at least one prime of $F_0^+$
lying above $l$ is wild. Furthermore, if $\zeta_l(F_0^+, T)$ vanishes at $T = 0$, the order of the zero
at $T = 0$ is greater than or equal to the number of wild primes of $F_0^+$ lying above $l$.
Presumably, the exact order of the zero at $T = 0$ is the number of wild primes
of $F_0^+$ lying above $l$, but we have been unable to prove this in general so far. In
connexion with Theorem 3, it may be of interest to note the following analogous
fact for the complex zeta function. Let $S_0$ be the set of primes of $F_0$ lying above $l$,
and put $\zeta_{S_0}(F_0, s) = \zeta(F_0, s) \prod_{\mathfrak{p} \in S_0} (1 - (N\mathfrak{p})^{-s})$, where $N\mathfrak{p}$ denotes the norm of
$\mathfrak{p} \in S_0$. Similarly, let $S_0^+$ be the set of primes of $F_0^+$ lying above $l$, and put $\zeta_{S_0^+}(F_0^+, s)
= \zeta(F_0^+, s) \prod_{\mathfrak{p} \in S_0^+} (1 - (N\mathfrak{p})^{-s})$. Then the complex function
$\zeta_{S_0}(F_0, s)/\zeta_{S_0^+}(F_0^+, s)$
vanishes at $s = 0$ if and only if at least one prime of $F_0^+$ lying above $l$ is wild.
Furthermore, if it does vanish at $s = 0$, the order of the zero is the number of wild primes of
$F_0^+$ lying above $l$. Note also that, in the special case in which $F$ is an abelian
extension of $Q$ of degree prime to $l$, Conjecture 2 is in accord with Theorem 3 since
$f(T; \phi)$ vanishes at $T = 0$ if and only if $\phi(l) = 1$. We also mention the following
corollary of Theorem 3.
Corollary. Let the integer $e_n$ be defined by $|A_n^-| = l^{e_n}$. Then, for all sufficiently
large $n$, we have $e_n = \lambda^- n + \mu^- l^n + \nu^-$, where $\lambda^-$, $\mu^-$, $\nu^-$ are integers not depending
on $n$, and where $\lambda^-$ is greater than or equal to the number of wild primes of $F_0^+$
lying above $l$.
As a simple example of the corollary, assume that $l \equiv 3 \bmod 4$, and take $F$ to
be any real quadratic field with discriminant of the form $lf$, where $f$ is a quadratic
nonresidue modulo $l$. Then it is easily seen that the unique prime of $F_0^+$ lying
above $l$ is wild, and so, for this choice of $F$ and $l$, we have $\lambda^- \ge 1$.
The proof of Theorem 3 given in [4] is based on the étale topology. The
key step in its proof is the following result, which is established in [4]. If $\mathfrak{p}$ is any
prime of $F$, let $F_\mathfrak{p}$ be the completion of $F$ at $\mathfrak{p}$. For each $n \ge 0$, we define $w_n^{(l)}(F_\mathfrak{p})$
to be the maximal number of $l$-power roots of unity in any extension of $F_\mathfrak{p}$ of
degree $n$.
Theorem 4 (Lichtenbaum). Let $n$ be an odd positive integer. Let $G = G(F_\infty/F)$,
and assume that $(A^-(n))^G$ is finite. Then the order of $(A^-(n))^G$ is divisible by
$\prod_{\mathfrak{p} \mid l} w_n^{(l)}(F_\mathfrak{p})$, where the product is taken over all primes $\mathfrak{p}$ of $F$ lying above $l$.
When $n = 1$, this result was pointed out several years ago by Tate in a different,
but equivalent, context (cf. §6). Note that since $n$ is odd, the integer $\prod_{\mathfrak{p} \mid l} w_n^{(l)}(F_\mathfrak{p})$
is greater than 1 only if at least one prime of $F_0^+$ lying above $l$ is wild. Also, recalling
the isomorphism $H^1(X, j_* W(n)) \cong (A^-(n))^G$, we see that Theorem 4 and Conjecture
1 suggest the following divisibility assertion for $w_{n+1}(F)\,\zeta(F, -n)$.
Conjecture 3. Let $n$ be an odd positive integer. Then $w_{n+1}(F)\,\zeta(F, -n)$ is an
$l$-integer which is divisible by $\prod_{\mathfrak{p} \mid l} w_n^{(l)}(F_\mathfrak{p})$, where the product is taken over all primes
$\mathfrak{p}$ of $F$ lying above $l$.
Thus the following result, which is established in [4], can be viewed as giving
indirect evidence for Conjectures 1 and 2.
Theorem 5. Assume that $F$ is a totally real abelian extension of $Q$. Then
Conjecture 3 is valid for $F$ and all $l$.
A particular example of Theorem 5 is the following. Assume that $l \equiv 3 \bmod 4$,
and that $F$ is a real quadratic field with discriminant of the form $lf$, where $f$ is a
quadratic nonresidue modulo $l$. Then, for each $r \ge 0$,
$w_{l^r(l-1)/2+1}(F)\,\zeta(F, -l^r(l-1)/2)$
is an $l$-integer which is divisible by $l^{r+1}$.
6. Connexion with $K$-theory. There is a remarkable and useful connexion
between the questions discussed in the preceding sections and the $K$-theory of
number fields. As this connexion has only been proven for $K_2$, we limit most of
our discussion to this case.
We first recall one of the several equivalent definitions of the $K_2$ of a field
(cf. [3], [10]). Let $E$ be a field. Then $K_2E$ is the abelian group generated by the
symbols $\{a, b\}$, where $a$ and $b$ run over all nonzero elements of $E$, subject to the
relations
$\{ac, b\} = \{a, b\}\{c, b\}$, $\{a, bc\} = \{a, b\}\{a, c\}$, $\{a, 1 - a\} = 1$.
Suppose now that $v$ is any discrete valuation of $E$, and let $E_v^\times$ denote the
multiplicative group of the residue field of $v$. The tame symbol at $v$ is the homomorphism
$\lambda_v: K_2E \to E_v^\times$ defined by mapping $\{a, b\}$ to the residue class of
$(-1)^{v(a)v(b)}\,a^{v(b)}/b^{v(a)}$. Assume now that $E$ is a finite extension of $Q$. We define the tame kernel
of $E$, which we denote by $R_2E$, to be the intersection of the kernels of the $\lambda_v$ for $v$
ranging over all nonarchimedean primes of $E$. By a deep theorem of Garland [5],
$R_2E$ is a finite group. The following result (cf. [3], [10]), whose proof relies
heavily on the work of Tate [18], relates $R_2E$ to Iwasawa's theory.
Theorem 6. Let $F$ be a totally real number field of finite degree over $\mathbf{Q}$, and
let $l$ be an odd prime number. Then the $l$-primary subgroup of $R_2F$ is canonically
isomorphic to $(A^-(1))^{G(F_\infty/F)}$.
Corollary 1. For each totally real number field F, we have
UF^K(yo)-"-\)^0.
Corollary 2. For each totally real number field $F$, $(A^-(1))^{G(F_\infty/F)}$ is zero
except for a finite number of primes $l$.
These results provide further evidence for the validity of Conjecture 1. In fact,
Birch and Tate [18] had conjectured, on the basis of some computations on the
order of the tame kernel, that $|R_2F| = w_2(F)\,\zeta(F, -1)$. Theorem 6 shows that,
except for the 2-primary subgroup of $R_2F$, their conjecture is just the special case
of Lichtenbaum's conjecture given by $n = 1$. Furthermore, by Proposition 2, the
Birch-Tate conjecture is true for the $l$-primary subgroup of $R_2F$ when $F$ and $l$
satisfy the conditions of Theorem 2.
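As a concrete illustration of the tame symbol used above to define $R_2E$, the following Python sketch evaluates it for $E = \mathbf{Q}$ and $v$ the $p$-adic valuation, so that the residue field is $\mathbf{Z}/p$. The function names `valuation` and `tame_symbol` are hypothetical and chosen only for this illustration; the sketch assumes Python 3.8 or later for the modular inverse `pow(d, -1, p)`.

```python
from fractions import Fraction

def valuation(a: Fraction, p: int) -> int:
    """p-adic valuation of a nonzero rational number a."""
    if a == 0:
        raise ValueError("the valuation of 0 is not defined here")
    v, num, den = 0, a.numerator, a.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def tame_symbol(a: Fraction, b: Fraction, p: int) -> int:
    """Residue class mod p of (-1)^(v(a)v(b)) * a^v(b) / b^v(a)."""
    va, vb = valuation(a, p), valuation(b, p)
    unit = Fraction(-1) ** (va * vb) * a ** vb / b ** va  # a p-adic unit
    return unit.numerator * pow(unit.denominator, -1, p) % p

# The symbol {3, 3} maps to -1 modulo 3, i.e. to the class of 2.
print(tame_symbol(Fraction(3), Fraction(3), 3))
# The relation {a, 1 - a} = 1 forces the tame symbol to be trivial.
print(tame_symbol(Fraction(1, 5), Fraction(4, 5), 5))
```

The second call checks, in one case, that the tame symbol respects the defining relation $\{a, 1-a\} = 1$, which is what makes it well defined on $K_2$.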
Finally, we remark that there may well be a connexion between the higher
K-groups and Iwasawa's theory. For, if $\mathcal{O}$ denotes the ring of integers of $F$, Bass [2]
has proven that the inclusion of $\mathcal{O}$ in $F$ induces a surjective homomorphism
$K_2\mathcal{O} \to R_2F$. Presumably this homomorphism is an isomorphism, but this is not
known yet.¹ In the light of this remark and Theorem 6, it seems natural to
conjecture that, for any totally real number field $F$ and any odd positive integer $n$, the
$l$-primary subgroup of $K_{2n}\mathcal{O}$ is canonically isomorphic to $(A^-(n))^{G(F_\infty/F)}$.
7. Numerical examples. We now give some numerical examples to illustrate
the general theorems and conjectures discussed before.
Example 1. Assume that $l$ is an odd prime number $\le 4001$, and take $F$ to be
any totally real subfield of $\mathbf{Q}(\mu_l)$. Then the hypotheses of Theorem 2 are well
known to be valid for $F$ and $l$. In particular, Conjecture 1 is valid for $F$ and $l$.
Example 2. Let $F$ be a real quadratic field whose discriminant $d$ is either
prime to 3 or of the form $3(3m+1)$, where $m$ is a positive integer. Assume that the
3-primary subgroup of the ideal class group of $\mathbf{Q}((-3d)^{1/2})$ is a cyclic abelian group.
Then the hypotheses of Theorem 2 are valid for $F$ and the prime $l = 3$. In particular,
Conjecture 1 is valid for $F$ and $l = 3$.
Example 3. Let $F = \mathbf{Q}(11^{1/2})$, and take $l = 7$. Condition (ii) of Theorem 2
is valid because 7 splits in $\mathbf{Q}(11^{1/2})$ and is totally ramified in $\mathbf{Q}(\mu_7)$.
Furthermore, if $C$ denotes the 7-primary subgroup of the ideal class group of $\mathbf{Q}(\mu_{308})$,
tables of class numbers [13] show that $|C^-| = 7$. Hence, since the degree of $\mathbf{Q}(\mu_{308})$
over $F_0$ is prime to 7, we have $|A_0^-| = 1$ or 7. Thus Theorem 2 is valid for $F$ and
$l = 7$. Now $w_2(F) = 2^3 \cdot 3$, $\zeta(F, -1) = \pm 7/(2 \cdot 3)$, and so, by Proposition 2,
$(A^-(1))^{G(F_\infty/F)}$ has order 7, whence, by Theorem 6, the 7-primary subgroup of $R_2F$
has order 7. Several years ago, before Theorem 6 was proven, Birch and Atkin
found convincing numerical evidence for the validity of this last assertion by direct
computations. But they could not prove it at the time because the definition of
$R_2F$ involves infinitely many relations.
Example 4. Let $F = \mathbf{Q}(19^{1/2})$, and take $l = 19$. Since $\mathbf{Q}((-19)^{1/2})$ is a
subfield of $\mathbf{Q}(\mu_{19})$, we have $F(\mu_{19}) = \mathbf{Q}(\mu_4, \mu_{19})$. Now 19 stays prime in $\mathbf{Q}(\mu_4)$ and
is totally ramified in $\mathbf{Q}(\mu_{19})$. Hence condition (ii) of Theorem 2 is valid.
Furthermore, tables of class numbers [13] show that $|A_0^-| = 19$. Thus Theorem 2 is valid
for $F$ and $l = 19$. Now $w_2(F) = 2^3 \cdot 3$, $\zeta(F, -1) = \pm 19/(2 \cdot 3)$, and so, in particular,
$(A^-(1))^{G(F_\infty/F)}$ has order 19 by Proposition 2. Also it follows from Theorem 6 that
the 19-primary subgroup of $R_2F$ has order 19.
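The zeta values quoted in Examples 3 and 4 can be rechecked by machine. The sketch below does not use the method of the text; it assumes the classical closed form for a real quadratic field of fundamental discriminant $D$, namely $\zeta(F, -1) = \frac{1}{60}\sum \sigma_1((D - b^2)/4)$, the sum running over integers $b$ with $b^2 < D$ and $b^2 \equiv D \pmod 4$. The function names are hypothetical, and the formula gives the value with its positive sign, whereas the text records it only up to sign.

```python
from fractions import Fraction
from math import isqrt

def sigma1(n: int) -> int:
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def zeta_minus_one(D: int) -> Fraction:
    """zeta(F, -1) for the real quadratic field of fundamental discriminant D.

    Assumes D > 0 is a fundamental discriminant (not checked here).
    """
    total = 0
    for b in range(-isqrt(D), isqrt(D) + 1):
        if b * b < D and (D - b * b) % 4 == 0:
            total += sigma1((D - b * b) // 4)
    return Fraction(total, 60)

w2 = 2**3 * 3  # w_2(F) as stated in Examples 3 and 4
for D in (44, 76):  # discriminants of Q(sqrt 11) and Q(sqrt 19)
    value = zeta_minus_one(D)
    print(D, value, w2 * value)
```

For $D = 44$ and $D = 76$ this returns $7/6$ and $19/6$, so that $w_2(F)\,\zeta(F,-1)$ equals 28 and 76 respectively, divisible by 7 and by 19 as the examples require.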
The remaining four examples are of nonabelian totally real cubic fields $F$. The
values of $\zeta(F, -n)$ given have been computed, by hand, by Mr. A. Candiotti and
myself, using a remarkably simple formula for $\zeta(F, -n)$ discovered recently by
Siegel [16]. For each of the four fields given, it is easily seen that $w_2(F) = 2^3 \cdot 3$,
$w_4(F) = 2^4 \cdot 3 \cdot 5$. Note that, in each example, $w_2(F)\,\zeta(F, -1)$ and $w_4(F)\,\zeta(F, -3)$
are integers, in accordance with Conjecture 1.
1 See Note Added in Proof.
Example 5. Take $F = \mathbf{Q}(\theta)$, where $\theta$ is a root of $x^3 - 4x + 2$. The discriminant
of $F$ is $2^2 \cdot 37$. We have $\zeta(F, -1) = \pm 1/3$, $\zeta(F, -3) = \pm 577/(2 \cdot 3 \cdot 5)$.
Example 6. Take $F = \mathbf{Q}(\theta)$, where $\theta$ is a root of $x^3 - 4x + 1$. The discriminant
of $F$ is 229. We have $\zeta(F, -1) = \pm 2/3$, $\zeta(F, -3) = \pm 1333/(2 \cdot 3 \cdot 5)$.
Example 7. Take $F = \mathbf{Q}(\theta)$, where $\theta$ is a root of $x^3 - 5x + 3$. The discriminant
of $F$ is 257. We have $\zeta(F, -1) = \pm 2/3$, $\zeta(F, -3) = \pm 1891/(3 \cdot 5)$.
Example 8. Take $F = \mathbf{Q}(\theta)$, where $\theta$ is a root of $x^3 - 6x + 2$. The discriminant
of $F$ is $2^2 \cdot 3^3 \cdot 7$. We have $\zeta(F, -1) = \pm 13/3$, $\zeta(F, -3) = \pm (7^2 \cdot 3589)/(2 \cdot 3 \cdot 5)$. This
example is particularly interesting in connexion with Conjecture 3. For, if $\mathfrak{p}$
denotes the prime of $F$ lying above 7 which is ramified in the extension $F/\mathbf{Q}$, it is
easily seen that $\mu_7$ is contained in an extension of $F_\mathfrak{p}$ of degree 3. Thus Conjecture 3
predicts that $w_4(F)\,\zeta(F, -3)$ should be divisible by 7, which is indeed the case.
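The arithmetic in Examples 5 to 8 is easy to recheck. The following sketch is only a verification aid: it computes the polynomial discriminant $-4p^3 - 27q^2$ of $x^3 + px + q$ (which in these four cases agrees with the field discriminant quoted in the text), and tests the integrality of $w_2(F)\,\zeta(F,-1)$ and $w_4(F)\,\zeta(F,-3)$ using the zeta values as printed, taken with positive sign. The dictionary `EXAMPLES` and the function name are hypothetical.

```python
from fractions import Fraction

def cubic_discriminant(p: int, q: int) -> int:
    """Discriminant of the polynomial x^3 + p*x + q."""
    return -4 * p**3 - 27 * q**2

W2, W4 = 2**3 * 3, 2**4 * 3 * 5  # w_2(F) and w_4(F) as stated in the text

# example number: ((p, q), |zeta(F,-1)|, |zeta(F,-3)|), values copied from the text
EXAMPLES = {
    5: ((-4, 2), Fraction(1, 3), Fraction(577, 2 * 3 * 5)),
    6: ((-4, 1), Fraction(2, 3), Fraction(1333, 2 * 3 * 5)),
    7: ((-5, 3), Fraction(2, 3), Fraction(1891, 3 * 5)),
    8: ((-6, 2), Fraction(13, 3), Fraction(7**2 * 3589, 2 * 3 * 5)),
}

for number, ((p, q), z1, z3) in EXAMPLES.items():
    a, b = W2 * z1, W4 * z3
    # Both products should have denominator 1, in accordance with Conjecture 1.
    print(number, cubic_discriminant(p, q), a, b, a.denominator == b.denominator == 1)
```

In Example 8 the product $w_4(F)\,\zeta(F,-3)$ comes out as $2^3 \cdot 7^2 \cdot 3589$, which makes the divisibility by 7 predicted by Conjecture 3 visible at a glance.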
This research was supported in part by NSF grant GP-9152 and the U.S. Army
Office of Research (Durham).
Added in Proof. D. Quillen has recently proven that $K_2\mathcal{O} \cong R_2F$ for all
finite extensions F of Q. Also, R. Greenberg has shown that the order of the zero
of Ci(Fq , T) is exactly the number of wild primes of F£ above / when F is an abel-
ian extension of Q and / any odd prime number.
References
1. M. Artin, Grothendieck topologies, Mimeographed notes, Harvard University, Cambridge,
Mass., 1962.
2. H. Bass, K2 des corps globaux, Seminaire Bourbaki, Expose 394, 1971.
3. J. Coates, On K2 and some classical conjectures in algebraic number theory, Ann. of Math. (2)
95 (1972), 99-116.
4. J. Coates and S. Lichtenbaum, On l-adic zeta functions, Ann. of Math. (to appear).
5. H. Garland, Finiteness theorem for K2 of a number field, Ann. of Math. (2) 94 (1971), 534-548.
6. K. Iwasawa, Analogies between number fields and function fields, Proc. Annual Sci. Conf.
Some Recent Advances in the Basic Sciences, vol. 2 (Belfer Grad. School Sci., Yeshiva Univ., New
York, 1965/66), Belfer Graduate School of Science, Yeshiva Univ., New York, 1969, pp. 203-208.
MR 41 #172.
7. K. Iwasawa, On p-adic L-functions, Ann. of Math. (2) 89 (1969), 198-205. MR 42 #4522.
8. K. Iwasawa, On $\mathbf{Z}_l$-extensions of algebraic number fields, Ann. of Math. (2) 96 (1972), 338-360.
9. T. Kubota and H. W. Leopoldt, Eine p-adische Theorie der Zetawerte. I. Einführung der
p-adischen Dirichletschen L-Funktionen, J. Reine Angew. Math. 214/215 (1964), 328-339. MR 29
#1199.
10. S. Lichtenbaum, On the values of zeta and L-functions. I, Ann. of Math. (to appear).
11. H. W. Leopoldt, Zur Arithmetik in abelschen Zahlkörpern, J. Reine Angew. Math. 209 (1962),
54-71. MR 25 #3034.
12. J. Milnor, Introduction to algebraic K-theory, Ann. of Math. Studies, no. 72, Princeton Univ.
Press, Princeton, N.J., 1971.
13. Guntram Schrutka v. Rechtenstamm, Tabelle der (Relativ)-Klassenzahlen der Kreiskörper,
deren φ-Funktion des Wurzelexponenten (Grad) nicht grösser als 256 ist, Abh. Deutsch. Akad. Wiss.
Berlin Kl. Math. Phys. Tech. 1964, no. 2, 64 pp. MR 29 #4918.
14. J.-P. Serre, Cohomologie des groupes discrets, Ann. of Math. Studies, no. 70, Princeton Univ.
Press, Princeton, N.J., 1971.
15. C. Siegel, "Über die analytische Theorie der quadratischen Formen. III," in Gesammelte
Abhandlungen. Band I, Springer-Verlag, Berlin and New York, 1966, pp. 469-548. MR 33 #5441.
16. C. Siegel, Berechnung von Zetafunktionen an ganzzahligen Stellen, Nachr. Akad. Wiss. Göttingen
Math.-Phys. Kl. II 1969, 87-102. MR 40 #5570.
17. L. Stickelberger, Über eine Verallgemeinerung der Kreistheilung, Math. Ann. 37 (1890), 321-367.
18. J. Tate, Symbols in arithmetic, Proc. Internat. Congress Math. (Nice, 1970), vol. 1, Gauthier-
Villars, Paris, 1971, pp. 201-212.
19. A. Weil, Varietes abeliennes et courbes algebriques, Hermann, Paris, 1948.
Stanford University
THE DISTRIBUTION OF VALUES OF
EULER'S PHI FUNCTION
HAROLD G. DIAMOND¹
A number of facts are known about the distribution of values of Euler's phi
function. A classical theorem of Schoenberg [10] asserts that $\varphi(n)/n$ has a
continuous distribution function. That is, there exists a continuous monotone function
$f$ with $f(0) = 0$, $f(1) = 1$ such that, as $x \to \infty$,
(1) $(1/x)\,\#\{n \in [1, x] : \varphi(n)/n \le \alpha\} \to f(\alpha), \qquad 0 \le \alpha \le 1.$
In geometric terms, the left side of (1) is the proportion of integers $n \le x$ for which
the point $(n, \varphi(n))$ lies below the line $t = \alpha s$ in the $s$-$t$ plane.
Another estimate of the distribution of values of $\varphi(n)$ ([4], [2], [1]) is, for
$y \to \infty$,
(2) $\#\{n : \varphi(n) \le y\} \sim \zeta(2)\zeta(3)\,y/\zeta(6).$
($\zeta$ is the Riemann zeta function.) The left side of (2) is the number of points $(n, \varphi(n))$
lying in the semi-infinite horizontal strip $\{(s, t) : 0 < s < \infty,\ 0 < t \le y\}$.
Here we shall investigate a similar problem for rectangles. Let
$$\Phi(x, y) = \#\{n \le x : \varphi(n) \le y\},$$
i.e. the number of points $(n, \varphi(n))$ lying in the rectangle $(0, x] \times (0, y]$. Clearly,
$\Phi(x, y) = 0$ for $y < 1$ and $\Phi(x, y) = [x]$ for $y \ge x$.
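The counting function for rectangles, and the strip count in (2), can be tabulated directly for small arguments. The sketch below is a brute-force illustration only; the function names are hypothetical, and the strip count relies on the elementary bound $\varphi(n) \ge \sqrt{n/2}$ (an assumption brought in here, not taken from the text) to know that every $n$ with $\varphi(n) \le y$ satisfies $n \le 2y^2$.

```python
def totients(limit: int) -> list:
    """Return a list phi with phi[n] = Euler's phi(n) for 0 <= n <= limit."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p has not been touched by a smaller prime, so p is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
    return phi

def rectangle_count(x: int, y: int) -> int:
    """Number of n <= x with phi(n) <= y, i.e. points in (0, x] x (0, y]."""
    phi = totients(x)
    return sum(1 for n in range(1, x + 1) if phi[n] <= y)

def strip_count(y: int) -> int:
    """Number of all n with phi(n) <= y; uses n <= 2*y*y, valid since phi(n) >= sqrt(n/2)."""
    return rectangle_count(2 * y * y, y)

print(rectangle_count(10, 4))       # n = 1, 2, 3, 4, 5, 6, 8, 10
print(strip_count(10))              # 20 integers have phi(n) <= 10
print(strip_count(100) / 100)       # compare with zeta(2)zeta(3)/zeta(6) = 1.9435...
```

Already for $y = 10$ the ratio of the strip count to $y$ is 2.0, close to the constant $\zeta(2)\zeta(3)/\zeta(6)$ appearing in (2).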
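Let $g$ be defined on $\mathbf{R}$ by setting $g(\alpha) = 0$ if $\alpha \le 0$; $g(\alpha) = 1$ if $\alpha \ge 1$; and for
$0 < \alpha < 1$ set
$$g(\alpha) = \frac{1}{2\pi i}\int_{\mathscr{C}} \prod_p \{1 - p^{-1} + p^{-1}(1 - p^{-1})^{-z}\}\,\frac{\alpha^z\,dz}{z(1-z)},$$
where $\mathscr{C}$ is any line from $a - i\infty$ to $a + i\infty$ with $a \in (0, 1)$. We shall prove the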
AMS 1970 subject classifications. Primary 10H25; Secondary 10K20.
¹ Research supported in part by NSF grant 21335.
© 1973, American Mathematical Society
[Figure: graph of the points $(n, \varphi(n))$, with $n$ from 0 to 600 on the horizontal axis and $\varphi(n)$ from 0 to 500 on the vertical axis. Graph courtesy of H.-E. Richert and H. Siebert.]
Theorem. Let $c$ be a fixed number in $(0, \tfrac18)$. If $1 \le y < x$,
$$\Phi(x, y) = x\,g(y/x) + O\big(y \exp(-\{c \log ey \,\log\log e^2 y\}^{1/2})\big).$$
The constant implied by the $O$ depends only on the value of $c$.
Our device for counting the integers $n$ satisfying both the inequalities $n < x$
and $\varphi(n) < y$ is set out in the following formula, which is valid for any $a, b > 0$:
$$(2\pi i)^{-2}\int_{z=b-i\infty}^{b+i\infty}\int_{s=a-i\infty}^{a+i\infty}\Big(\frac{x}{n}\Big)^{s}\Big(\frac{y}{\varphi(n)}\Big)^{z}\,\frac{ds\,dz}{s\,z} = \begin{cases} 1, & n < x \text{ and } \varphi(n) < y,\\ 0, & n > x \text{ or } \varphi(n) > y.\end{cases}$$
We exploit this idea by using a generating function of two complex variables and
applying Perron's inversion formula twice.
Generating function. Let $s = \sigma + it$ and $z = \xi + i\eta$ be complex variables. Let
$S_a = \{(s, z) : \sigma + \xi > a\}$, and on $S_1$ define
$$F(s, z) = \sum_{n=1}^{\infty} n^{-s}\varphi(n)^{-z}.$$
The series converges on $S_1$ because
$$n \ge \varphi(n) = n\prod_{p \mid n}(1 - p^{-1}) \ge n\prod_{p \le c\log n}(1 - p^{-1}) \gg n/\log\log n.$$
(See [6, pp. 267-268].) It is easy to see by uniform convergence that $F$ is an analytic
function of $s$ and $z$ on $S_1$.
$F$ has a product representation valid on $S_1$. Since $n \mapsto n^{-s}\varphi(n)^{-z}$ is
multiplicative, we have
$$\begin{aligned}
F(s, z) &= \prod_p \{1 + p^{-s}\varphi(p)^{-z} + p^{-2s}\varphi(p^2)^{-z} + p^{-3s}\varphi(p^3)^{-z} + \cdots\}\\
&= \prod_p \{1 + p^{-s}(p-1)^{-z}(1 + p^{-s-z} + p^{-2s-2z} + \cdots)\}\\
&= \prod_p \{1 - p^{-s-z} + p^{-s}(p-1)^{-z}\}\,\zeta(s+z)\\
&=_{\mathrm{def}} \Pi(s, z)\,\zeta(s+z),
\end{aligned}$$
where $\zeta$ is Riemann's zeta function, the behavior of which is well known. We now
set out some facts about $\Pi(s, z)$ for use below. To avoid extra estimates, we limit
ourselves to sets of the form
$$S_a^{+} =_{\mathrm{def}} \{(\sigma + it, \xi + i\eta) : \sigma > 0,\ \xi > 0,\ \sigma + \xi > a\}.$$
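The product representation can be sanity-checked numerically at a real point of $S_1$. The sketch below compares a truncated Dirichlet series for $F(2, 1)$ with the truncated Euler product whose local factor is $1 + p^{-s}(p-1)^{-z}/(1 - p^{-s-z})$, and also evaluates $\Pi(0, 1) = \prod_p (1 + 1/(p(p-1)))$, the boundary value that equals the constant $\zeta(2)\zeta(3)/\zeta(6)$ of (2). The truncation point 20000 and all function names are arbitrary choices made for this illustration.

```python
import math

def totients(limit):
    """phi[n] for 0 <= n <= limit, by a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
    return phi

def primes_upto(limit):
    phi = totients(limit)
    return [p for p in range(2, limit + 1) if phi[p] == p - 1]

def series_partial(s, z, N):
    """Partial sum of F(s, z) = sum n^-s phi(n)^-z over n <= N."""
    phi = totients(N)
    return sum(n ** (-s) * phi[n] ** (-z) for n in range(1, N + 1))

def euler_partial(s, z, P):
    """Product over primes p <= P of the full local factor of F(s, z)."""
    out = 1.0
    for p in primes_upto(P):
        out *= 1 + p ** (-s) * (p - 1) ** (-z) / (1 - p ** (-(s + z)))
    return out

def pi_partial(s, z, P):
    """Product over primes p <= P of 1 - p^(-s-z) + p^(-s) (p-1)^(-z)."""
    out = 1.0
    for p in primes_upto(P):
        out *= 1 - p ** (-(s + z)) + p ** (-s) * (p - 1) ** (-z)
    return out

ZETA3 = 1.2020569031595942  # Apery's constant, quoted numerically
constant = (math.pi**2 / 6) * ZETA3 / (math.pi**6 / 945)
print(series_partial(2, 1, 20000), euler_partial(2, 1, 20000))
print(pi_partial(0, 1, 20000), constant)
```

Both pairs of numbers agree to several decimal places, which is all that the truncations allow; the check is a plausibility test, not a proof of the identity.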
Lemma 1. The product defining J~[(^, z) converges and defines an analytic
function of s andz on S£. Moreover
U(s, z)«expf3l0gW\ if°^=l "l0g l0g '"l/l0g i,?l
11V' FVloglog|ij|/ analog log |>/|^ 10,
and |r;|<expexp 10.
The estimates are valid independently oft.
Proof. We may assume that £^2 and <t^2, for otherwise the conclusions of
the lemma are quite obvious. For ^Owe have the inequalities
|(p-lp-p-'|£2(p-l)-<
^IzKp-iP"1.
For log log \rj\ ^ 10, we write
inMis n • n-
11 = n {l+2(p-l)-«-}gexpj £ 2(p-l)-«-j
^expJ2 + 2 J u-s-°dn(u)\^exp<A + 2 I M-1+«—j,
3/2 3/2
where £ = def log log\rj\/\og\rj\. If we set w=£logw the last integral becomes
2 (1/2) log log M log log |i/|
+ I +
(MM)
ewdw
e log (3/2)
i (1/2) log iog|ir|
The first two integrals are estimated directly; the third after an integration by
parts. We find that
I'/I
f ^ du 3 logM ^ f 3 1ogM]
U"1+£i g- . , . if log log |i,|£ 10, and fl gexp^ + gl .
J logw 2 log log|^| ,*,„, (. log log WJ
3/2
n = n (i+i2i(p-ir4—')
p>l>;l p>l>;l
gexp £ |2| (p-I)"*"-»
(W-i)
^exp
^exp-M|z|
|z|
2-c
+ W
J logwj
These inequalities establish the O-estimate if log log \rj\ ^ 10. In the other case
we have simply
in(s,z)i<exPJ2:i2i(p-i)-«-'-iJ«i.
The product converges uniformly on each set S* n{z: \z\ ^ M}, for any e > 0, M < oo,
because
log n {i-p_,"*+p"*(p-ir"}
a<p<b
< I P~a\(p-irz-P~z\
a<p<b
<\z\ £ (p-l)-«-"-»-0
a<p<b
uniformly as a, b-+co and (s, z)eS* n{z: |z|^M}. Thus the product defines an
analytic function of s and z on 5£+. □
For fixed $z$ the function $s \mapsto F(s, z)$ has a pole with residue $\Pi(1-z, z)$ at $s = 1 - z$.
We use this fact in our first application of Perron's formula. We assume in what
follows that the variable $x$ is sufficiently large that all estimates contingent on the
size of $x$ are sensible and valid.
Lemma 2. Let $c'$ be any fixed number in $(0, \tfrac12)$. Then
$$\sum_{n \le x}\varphi(n)^{-z} = \Pi(1-z, z)\,\frac{x^{1-z}}{1-z} + O\big(x\exp\{-(c'\log x\,\log\log x)^{1/2}\}\big)$$
{z = b+irj:0<b£$, \rj| ^exp((y log* log logx)1/2)}.
The constant implied by the O depends only on c'.
Proof. All estimates that follow are uniform in z for z in the rectangle. We
begin by observing that on any half plane {z:Re z^/?},
in(i-*.z)i=
n i-p-'+p-1
P-\
^n{i-P->+P->(i+^y}=o(i).
Next, it suffices to prove the lemma in the case $x = [x] + \tfrac12$. This is so since
m-^r-™+»i-om.
1-2
Now assume $x = [x] + \tfrac12$. Let $T = \exp((\log x\,\log\log x)^{1/2})$ and let $a = 1
+ \log\log x/\log x$. Applying Perron's formula we obtain
a + iT
(3)
27TI
5 n<j,
a-iT
oo
The error term<^x°T~l £ n"
r|iogx/n|
log-
<:
Z »-+(£Y Z r^V Z »■
^x/e \X/ x/e<n<ex l^ — *| M^ex
^xT"1 log2*.
We estimate the left side of (3) by deforming the contour. Let a! = 1—6
— (log logx)/(2 log*). The function s\-+F(s, z) has the aforementioned pole within
the rectangle with vertices a'±iT, a±iT. Thus we have
a + il
-L f
ds
F(s,z)x° ■■l\{l-z,z)x1-'/{l-z)
(4)
a-iT
a'-iT a' + iT a + iT
+Uhhl}r<***
ds
s '
a-iT a'-iT a' + iT
We now estimate the last three integrals.
I
£- sup {IC(2 + 5-»T)n(«-ir,2)|}.
The lower bound
/loglogxY'2 «
log log |ry|
log |9/|
is valid and thus we can apply Lemma 1 to obtain
I~[(£-iT, z)«exp(3 log|if|/log log|!f|)«exp(18 logx/log logx)1/2.
Since |Im(z + £ — zT)|^l, we do not approach the pole of zeta. Consequently we
may apply the familiar estimate (cf. [7, Theorem 9])
£(z + Z-iT)<Tl-a'-b/(l-a'-b)
( logx \1/2
<d -—f exp(2"1/2 log log*Hlog2*.
\loglogx/
Thus
a'-iT
<4xT~l (logx)3 exp(18 logx/log logx)1/2.
a-iT
The same estimate applies for j^Vrr-
For the remaining integral we write
a' + iT T
J «x*'sup {\az + "' + it)Y\(a' + it,z)\} -^—.
a'-iT ~T
Now a'=\ — b — o{\)^\ and hence the last integral is <^ log r<^ logx. We can
apply Lemma 1 to obtain the estimate
Y[(a' + it, z)<exp{(18 logx/log logx)1/2}.
Since Re(z+<z' + zf) is less than 1, we can apply the zeta function estimate [7,
Theorem 9]
t(z + a' + it)<T1-a'-b/(l-a,-b)<\og2x.
Thus we have
a' + iT
^x*'(logx)3 exp{(18 logx/log logx)1/2}
a'-iT
«x(logx)3 exP{-^T72+(loglogJ Oogx log logx)1/2
<^xexp{-(c' logx loglogx)1/2}.
The estimates of (3) and (4) establish the lemma. □
We now apply Perron's formula a second time. It is convenient to use an
integrated form of the inversion formula here.
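Lemma 3. Let $a$ and $2c''$ be any fixed numbers in $(0, 1)$ and let $\mathscr{C}$ be a line from
$a - i\infty$ to $a + i\infty$. Let $x \ge x_0(c'')$. Then for any $\alpha \in (0, 1)$
(5) $$\int_1^{\alpha x}\Phi(x, u)\,\frac{du}{u} = \frac{x}{2\pi i}\int_{\mathscr{C}}\Pi(1-z, z)\,\frac{\alpha^z\,dz}{z^2(1-z)} + O\{x\exp(-(c''\log x\,\log\log x)^{1/2})\}.$$
The constant implied by the $O$ depends only on $c''$.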
Proof. Since ]~J(1 — z, z) is bounded and the integrand of ]"# is analytic in the
strip 0 < Re z < 1, it is clearly sufficient to prove the theorem for the special choice
^ = {2 = (logx)"1+^:-oo<<J<oo}.
Also, let T = exp{(^ log* log logx)1/2} and let
^t={z = (logx)-1+^:-T^^T}.
Perron's formula and an integration by parts give
ax ax
= log—du<P{x, u)= #(x, uju"1 du.
Since Lemma 2 applies only for z with |Imz|^r, we express the left side of (6) as
We estimate \<€-<€x by noting that \^n^x<p(n)~z\^x and \(ocxf\^e. It follows
that \$<g-<gr\^ex/(m;). By Lemma 2,
f x f t-t , v a2 dz
+ o(xexp{-(c'logxloglogx)1/2} p*Li. h
The last integral is O(logx) since Rez = (logx)_1. The integral in the main term
can be extended to an integral along the curve ^ with an error of 0(xt~2).
Combining the above estimates with (6) we obtain (5), with c" any positive number
less than c' of Lemma 2. □
Differencing. We now difference (5), evaluating it once as it stands and once
with $\alpha(1+\delta)$ in place of $\alpha$. The $\delta$ will be taken as a positive function of $x$. Where
convenient, we shall write $y$ for $\alpha x$.
(7) $$\frac{1}{\delta}\int_y^{y+\delta y}\Phi(x, u)\,\frac{du}{u} = \frac{x}{2\pi i}\int_{\mathscr{C}}\Pi(1-z, z)\,\frac{\alpha^z}{z(1-z)}\Big\{\frac{(1+\delta)^z - 1}{\delta z}\Big\}\,dz + O\{x\delta^{-1}\exp(-(c''\log x\,\log\log x)^{1/2})\}.$$
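We estimate the left side of (7), noting that $\Phi$ is monotone increasing in each
variable.
(8) $$\Phi(x, y)\,\frac{\log(1+\delta)}{\delta} \le \frac{1}{\delta}\int_y^{y+\delta y}\Phi(x, u)\,\frac{du}{u} \le \Phi(x + \delta x, y + \delta y)\,\frac{\log(1+\delta)}{\delta}.$$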
The following estimate is easily seen to hold for $0 \le \operatorname{Re} z \le 1$:
$$\frac{(1+\delta)^z - 1}{\delta z} - 1 \ll \min\{1, \delta + \delta|z|\}.$$
Using this estimate, we express the first term on the right side of (7) as
$$x\,g(\alpha) + O(x\delta\log\log x + x\delta\log\delta^{-1}),$$
where $g(\alpha)$ is the function defined before the statement of the theorem. The
$\log\log x$ arises from the integral over the region where $z$ is small.
We take $\delta = \delta(x) = \exp(-\tfrac12(c''\log x\,\log\log x)^{1/2})$, and let $0 < c_1 < c''/4$. With
this choice and the above estimates, (7) becomes
(9) $$\frac{1}{\delta}\int_y^{y+\delta y}\Phi(x, u)\,\frac{du}{u} = x\,g(\alpha) + O(x\exp\{-(c_1\log x\,\log\log x)^{1/2}\}).$$
It is easily seen that $g(\alpha)$ is real by dividing through (9) by $x$ and letting $x \to \infty$.
If we combine (8) and (9), taking $x' = x(1+\delta)$, $y' = y(1+\delta)$ and $\alpha = y/x = y'/x'$,
we obtain the estimate
(10) $$\Phi(x, y) = x\,g(y/x) + O\{x\exp(-(c_1\log x\,\log\log x)^{1/2})\},$$
where $c_1$ is any number in $(0, \tfrac18)$.
We shall now convert the error estimate into one in terms of $y$. Such an estimate
is most interesting, of course, when $y$ is much smaller than $x$. This case is close
to the Erdős-Dressler-Bateman problem in the sense that if $y$ is large but $y/x$ is
suitably small, then
$$(\zeta(2)\zeta(3)/\zeta(6))\,y \sim \Phi(\infty, y) = \Phi(x, y) \sim x\,g(y/x).$$
We make this observation precise in the following lemma, which estimates $g(\alpha)$
accurately for small positive values of $\alpha$.
Lemma 4. There exists an absolute positive constant $k$ such that for $0 < \alpha < 1$,
$$g(\alpha) = \frac{\zeta(2)\zeta(3)}{\zeta(6)}\,\alpha + O\{\exp(-\exp((k\alpha)^{-1}))\}.$$
Proof. g(oc) was defined by an integral just before the statement of the
theorem. Shift the contour in this integral to the vertical line {£ + it: — oo < f < oo,
£ = exp(fca)-1}, where A: is a positive constant to be specified below. The residue
of the integrand at z = 1 is
For Rez = £, we have the estimate
ina-z.z^nf/rTno+o^-2)}.
The last product is bounded, and
11 f-^rY^exp X -^-gexp^aoglogf + log^/e))}
p<t\P-1/ P<$P~l
for some absolute positive constant k. The factor e is included to make the final
form of the lemma simpler. Thus, on the line Rez = £ we have the estimate
f](l -z, z) az<exp{£(log logf + log(/c/e) + loga)}
<exp{ —exp(fax)"1}.
The lemma now follows from the residue theorem and an application of the
last estimate to $\int_{(\xi)}\Pi(1-z, z)\,\alpha^z\,dz/(z(1-z))$, taken along the line $\operatorname{Re} z = \xi$. □
Proof of the Theorem. We have estimated $\Phi(x, y)$ in (10). It remains to
convert the error estimate into terms of $y$. We treat three cases according to the size
of $x$ and $y$.
If $e^2 < y < x \le y\log y$, then
$$x\exp(-(c_1\log x\,\log\log x)^{1/2}) \le y\log y\,\exp(-(c_1\log y\,\log\log y)^{1/2}) \ll y\exp(-(c\log y\,\log\log y)^{1/2})$$
for $0 < c < c_1 < \tfrac18$ ($c$ arbitrary). In this case conversion of the error term was trivial.
Now suppose $x > y\log y > y_0\log y_0$ (for a suitable $y_0$). Then $\Phi(x, y) =
\Phi(y\log y, y)$ since there exists no $n > y\log y$ for which $\varphi(n) \le y$. (Recall $\varphi(n) \gg
n/\log\log n$.) Using (10) once and the last lemma two times we obtain
$$\begin{aligned}
\Phi(y\log y, y) &= y\log y\;g(1/\log y) + O\{y\log y\,\exp(-(c_1\log y\,\log\log y)^{1/2})\}\\
&= (\zeta(2)\zeta(3)/\zeta(6))\,y + O\{y\log y\,\exp(-\exp\{k^{-1}\log y\})\}\\
&\qquad + O\{y\log y\,\exp(-(c_1\log y\,\log\log y)^{1/2})\}\\
&= x\,g(y/x) + O\{y\log y\,\exp(-(c_1\log y\,\log\log y)^{1/2})\}\\
&= x\,g(y/x) + O\{y\exp(-(c\log y\,\log\log y)^{1/2})\}.
\end{aligned}$$
Finally, suppose that $1 \le y \le \max(y_0, e^2)$ and $x$ is arbitrary. In this case $\Phi(x, y)$
is bounded and the formula for $\Phi(x, y)$ can be made valid by suitably choosing
the constant in the error term.
This completes the proof of the theorem. □
Connections with Schoenberg's problem. The present method can be applied to
the problem of estimating
$$\psi(x, \alpha) =_{\mathrm{def}} \#\{n \in [1, x] : \varphi(n) \le \alpha n\}.$$
We start with the generating function
$$\sum_{n=1}^{\infty} n^{-s}\{\varphi(n)/n\}^{-z} = F(s - z, z)$$
and proceed to the formula
(11) $$\int_0^{\alpha}\psi(x, u)\,\frac{du}{u} = \frac{x}{2\pi i}\int_{\mathscr{C}}\Pi(1-z, z)\,\frac{\alpha^z\,dz}{z^2} + O\{x\exp(-(c'\log x\,\log\log x)^{1/2})\},$$
where $\Pi$, $\mathscr{C}$, and $c'$ are as before. This time, however, the differencing argument
does not work so easily. At fault is the function $f$ occurring in (1), which is singular
[3]. It was shown by Tjan [11] that
/(a^)-/(a)<^(loglog/i-1)-1.
With this estimate we can deduce the following theorem of Fainleib [5]:
$$\psi(x, \alpha) = x f(\alpha) + O(x/\log\log x).$$
The last error term cannot be improved very much on account of the rather heavy
concentration of points $(n, \varphi(n))$ near certain rays through the origin.
A formula connecting the functions $f$ and $g$ can be derived by comparing the
integral defining $g$ with equation (11). We have
$$\int_0^{\alpha} f(u)\,\frac{du}{u} = \frac{1}{2\pi i}\int_{\mathscr{C}}\Pi(1-z, z)\,\frac{\alpha^z\,dz}{z^2} = \int_0^{\alpha} g(u)\,\frac{du}{u} - g(\alpha).$$
Now $f$ is continuous, as we mentioned before; $g$ is also continuous, by
consideration of its integral. We can differentiate the last equation, establishing the
differentiability of $g$ on $(0, 1]$ and obtaining for $0 < \alpha \le 1$ the formula
(12) $$f(\alpha) = g(\alpha) - \alpha g'(\alpha).$$
We can also establish (12) without knowledge of integral formulas for $f$ and $g$.
We can compare the number of points $(n, \varphi(n))$ lying in a rectangle with the number
lying in a containing and a contained trapezoid. This method suggests the general
problem of applying knowledge of $f$ or $g$ to estimate the number of points $(n, \varphi(n))$
lying in more general regions.
Let $0 < \alpha < \beta \le 1$. We have
$$\psi((\beta/\alpha)x, \alpha) - \psi(x, \alpha) \le \Phi((\beta/\alpha)x, \beta x) - \Phi(x, \beta x) \le \psi((\beta/\alpha)x, \beta) - \psi(x, \beta).$$
If we replace each $\Phi$ and $\psi$ by its asymptotic estimate, divide by $x$, and let $x \to \infty$,
we get
$$(\beta/\alpha - 1) f(\alpha) \le (\beta/\alpha)\,g(\alpha) - g(\beta) \le (\beta/\alpha - 1) f(\beta)$$
or, for $0 < \alpha < \beta \le 1$,
$$f(\alpha) \le g(\alpha) - \alpha\{(g(\beta) - g(\alpha))/(\beta - \alpha)\} \le f(\beta).$$
Now $f$ and $g$ are continuous on $[0, 1]$. It follows that $g$ is differentiable on $(0, 1]$
and equation (12) is valid on $(0, 1]$.
Lemma 4 implies that $g$ has a derivative from the right at the origin and that
its value is $\zeta(2)\zeta(3)/\zeta(6)$. Since $f(1) = g(1) = 1$, it follows from (12) that $g'(1) = 0$.
Equation (12) also implies $g'$ is continuous on $(0, 1]$. Finally, we shall use (12)
and the fact that $f$ is nondecreasing and singular to deduce that $g'$ is nonincreasing
and singular on $(0, 1)$. If we difference (12) we find that
$$\frac{f(\alpha + \varepsilon) - f(\alpha)}{\varepsilon} + \alpha\Big\{\frac{g'(\alpha + \varepsilon) - g'(\alpha)}{\varepsilon}\Big\} = \frac{g(\alpha + \varepsilon) - g(\alpha)}{\varepsilon} - g'(\alpha) + g'(\alpha) - g'(\alpha + \varepsilon).$$
The right side of the last equation goes to zero with $\varepsilon$, proving both assertions
about $g'$.
It appears that the method we have described can be applied to other problems
of estimating the number of points $(n, f(n))$ lying in a rectangle $(0, x] \times (0, y]$
for suitable $f$. Examples are $f(n) = \varphi(n)^a$ or $f(n) = \sigma_a(n) = \sum_{d \mid n} d^a$.
I am indebted to Professor Harald Niederreiter for bringing articles [5] and
[11] to my attention.
References
1. Paul T. Bateman, The distribution of values of the Euler function, Acta Arith. 21 (1972), 329-345.
2. Robert E. Dressler, A density which counts multiplicity, Pacific J. Math. 34 (1970), 371-378.
MR 42 #5940.
3. Paul Erdős, On the smoothness of the asymptotic distribution of additive arithmetical functions,
Amer. J. Math. 61 (1939), 722-725.
4. Paul Erdős, Some remarks on Euler's φ function and some related problems, Bull. Amer. Math. Soc.
51 (1945), 540-544. MR 7, 49.
5. A. S. Fainleib, Distribution of values of Euler's function, Mat. Zametki 1 (1967), 645-652 = Math.
Notes 1 (1967), 428-432. MR 35 #6636.
6. G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, 4th ed., Oxford Univ.
Press, London, 1960. MR 20 #828.
7. A. E. Ingham, The distribution of prime numbers, Cambridge Univ. Press, Cambridge, 1932.
8. M. Kac, Statistical independence in probability, analysis and number theory, Carus Math.
Monographs, no. 12, Math. Assoc. Amer.; distributed by Wiley, New York, 1959. MR 22 #996.
9. J. Kubilius, Probabilistic methods in the theory of numbers, Gos. Izdat. Polit. Naucn. Lit. Litovsk.
SSR, Vilna, 1962; English transl., Transl. Math. Monographs, vol. 11, Amer. Math. Soc., Providence,
R.I., 1964. MR 26 #3691; MR 28 #3956.
10. I. J. Schoenberg, Über die asymptotische Verteilung reeller Zahlen mod 1, Math. Z. 28 (1928),
171-199.
11. M. M. Tjan, On the question of the distribution of values of the Euler function <p(n), Litovsk.
Mat. Sb. 6 (1966), 105-119. (Russian) MR 34 #5780.
University of Illinois at Urbana-Champaign
ON CONNECTIONS BETWEEN
THE TURAN-KUBILIUS INEQUALITY
AND THE LARGE SIEVE: SOME APPLICATIONS
P. D. T. A. ELLIOTT
Let f(n) be an additive function. Thus for coprime integers a and b the relation
f(ab) = f(a) + f(b) is satisfied. The Turan-Kubilius inequality states that there is
a positive absolute constant $c_1$ so that
$$\sum_{m=1}^{n}\Big|f(m) - \sum_{p^\nu \le n} f(p^\nu)\,p^{-\nu}\Big|^2 \le c_1 n\sum_{p^\nu \le n}|f(p^\nu)|^2\,p^{-\nu}.$$
This inequality is valid even if f(n) assumes complex values. It can be viewed as a
form of the law of large numbers for additive functions. It was proved first by
Turán [3] for real-valued functions, and extended to complex-valued functions by
Kubilius (see for example [2]). The proof needs little more than a careful use of
the Cauchy-Schwarz inequality.
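To get a feeling for the size of the two sides of the Turán-Kubilius inequality, one can evaluate them for a specific additive function. The sketch below takes $f = \omega$, the number of distinct prime factors, so that $f(p^\nu) = 1$ for every prime power, and centres $f(m)$ at $\sum_{p^\nu \le n} p^{-\nu}$. This choice of $f$, the centring term as written, and the function name `tk_ratio_omega` are assumptions made for the illustration; the computation shows one instance of the inequality and says nothing about the best constant.

```python
def tk_ratio_omega(n: int) -> float:
    """Ratio (left side) / (n * sum over prime powers q <= n of 1/q) for f = omega."""
    omega = [0] * (n + 1)
    is_prime = [True] * (n + 1)
    centre = 0.0  # sum of f(q)/q over prime powers q <= n, with f(q) = 1
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(p, n + 1, p):
                omega[m] += 1          # p is a distinct prime factor of m
                if m > p:
                    is_prime[m] = False
            q = p
            while q <= n:              # all powers of p not exceeding n
                centre += 1.0 / q
                q *= p
    left = sum((omega[m] - centre) ** 2 for m in range(1, n + 1))
    right = n * centre                 # |f(q)|^2 = 1 for every prime power q
    return left / right

for n in (100, 1000, 2000):
    print(n, round(tk_ratio_omega(n), 3))
```

The printed ratios stay bounded as $n$ grows, which is what the inequality asserts for this particular $f$.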
We shall show by means of this inequality that there is sometimes a
correspondence: operator → sufficiency, dual of operator → necessity.
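We write $p^\nu \,\|\, n$ if $p^\nu$ divides $n$ and $p^{\nu+1}$ does not. Define
$$\varepsilon(p^\nu, n) = \begin{cases} p^{\nu/2}(1 - p^{-\nu}) & \text{if } p^\nu \,\|\, n,\\ -p^{-\nu/2} & \text{otherwise.}\end{cases}$$
Replacing each $f(p^\nu)$ by $f(p^\nu)\,p^{\nu/2}$ in the Turán-Kubilius inequality we can
rewrite it in the form
$$\sum_{m=1}^{n}\Big|\sum_{p^\nu \le n} f(p^\nu)\,\varepsilon(p^\nu, m)\Big|^2 \le c_1 n\sum_{p^\nu \le n}|f(p^\nu)|^2.$$
Dualizing we arrive (after a little untangling) at
Lemma 1. For any complex numbers $a_m$ $(m = 1, \ldots, n)$
$$\sum_{p^\nu \le n} p^\nu\Big|\sum_{m=1;\ p^\nu \| m}^{n} a_m - p^{-\nu}\sum_{m=1}^{n} a_m\Big|^2 \le c_1 n\sum_{m=1}^{n}|a_m|^2.$$
AMS 1970 subject classifications. Primary 10K20, 10H30.
© 1973, American Mathematical Society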
This is clearly an inequality of large sieve type, but with the usual uniformity
$p \le n^{1/2}$ extended to $p^\nu \le n$.
We give two applications of this lemma.
First application. We consider the Erdős-Wintner [1] theorem. This states
that the frequencies
$$n^{-1}\sum_{m=1;\ f(m) \le z}^{n} 1 \qquad (n = 1, 2, \ldots)$$
possess a limiting distribution if and only if the three series
$$\sum_{|f(p)| > 1}\frac{1}{p}, \qquad \sum_{|f(p)| \le 1}\frac{f(p)}{p}, \qquad \sum_{|f(p)| \le 1}\frac{f^2(p)}{p}$$
converge. To prove that these conditions are sufficient one can construct suitably
'small' probability spaces on which to study the function
$$f_r(m) = \sum_{p^\nu \| m;\ p^\nu \le r} f(p^\nu) \qquad (m = 1, \ldots, n)$$
with $r = (\log n)^{1/2}$. It is straightforward to prove that this function possesses a
limiting distribution. To complete the proof of sufficiency one shows by means
of the Turán-Kubilius inequality that the differences $f(m) - f_r(m)$ are on the
whole small.
As to necessity one usually appeals to a Tauberian theorem concerning
Dirichlet series. We now sketch an alternative method. In fact we shall prove that
if $\varphi(t)$ is the characteristic function of a limiting distribution for the above
frequencies, then there is an absolute constant $c_2$ (independent of $\varphi(t)$) so that the
inequality
$$|\varphi(t)|^2\sum_p p^{-1}\,|e^{itf(p)} - 1|^2 \le c_2$$
is valid uniformly for all real numbers $t$. In fact set
$$a_m = \exp(itf(m)) \qquad (m = 1, \ldots, n)$$
in Lemma 1. Then from the additive property of $f(n)$,
n
E am = exp{itf(p)) £ ar.
m=l;p\\m r^p'ln;P%r
Choose a (large) prime P. Then uniformly for all p not exceeding P we have
n-1 t flw = exp(ft^p))^(t) + 0(l) + flp-1 (|0|£1).
m = 1; p || m
Applying Lemma 1, dividing by n2 and letting n->oo we see that
i I (p-1 Iexp(ft/(p))-l|2 \cf>{t)\2-p-2)^cx.
Letting P-+00 we deduce the asserted inequality with c2 = 2c1 + X/>~2.
It is almost immediate that the series $\sum_{|f(p)| > 1} 1/p$ and $\sum_{|f(p)| \le 1} f^2(p)/p$
converge. For each $n \ge 1$ set
$$A(n) = \sum_{p \le n;\ |f(p)| \le 1}\frac{f(p)}{p}.$$
Then, as in the argument for sufficiency, one can (using the convergence of the
above two series) prove that a limiting distribution exists for the function $f(m) -
A(n)$ $(m = 1, \ldots, n)$. Since by hypothesis this is also true for $f(m)$ we must have that
$\lim A(n)$ $(n \to \infty)$ exists, and the proof of the theorem is complete.
Our second example is a little more subtle. For each set $E$ and real number
$x \ge 1$ set
$$\nu_x(n;\ n \in E) = x^{-1}\sum_{n \le x}{}^{'}\ 1,$$
where $'$ indicates that summation is restricted to integers $n$ which belong to the
set $E$.
Theorem. Let $\beta(x)$ be a function of $x$ increasing (in the wide sense) to infinity.
Then the following two propositions are equivalent:
A. There are constants $\alpha(x)$, for each $x \ge 1$, with the two properties
(i) if $0 < w < 1$, then $\beta(x)^{-1}\sup_{x^w \le y \le x}|\alpha(x) - \alpha(y)| \to 0$ $(x \to \infty)$;
(ii) for each $\varepsilon > 0$, $\nu_x(n;\ |f(n) - \alpha(x)| > \varepsilon\beta(x)) \to 0$ $(x \to \infty)$.
X 1-0 and P(x)-2 £ ^-0,
P^x;\f(p)\>up(x)P p^x;\f(p)\^up(x) P
both as n->co.
80
P. D. T. A. ELLIOTT
Remark. This theorem is a general form of the law of large numbers for
additive functions insofar as they mimic the sum of independent random variables.
We make no assumption concerning f(n) whatsoever beyond the fact that it is
adcjitive.
We shall not give the whole proof but sketch the key lemma. Moreover, we shall
give this lemma for strongly additive functions, namely those which satisfy
f(pv) = f(p), for v= 1, 2,.... This is purely for convenience of exposition.
Lemma 2. Assume Proposition A (ii). Then for each e>0 the estimate
£" —0 (x-oo)
P^xP
is satisfied, where summation is restricted to those primes pfor which the inequality
\f{p)-Hx)-0L(p-'x))\>eP(x)
is valid.
Granted this lemma, choose a real number $w$ in the interval $0 < w < 1$. Then by
condition A(i), $\alpha(x) - \alpha(p^{-1}x) = o(\beta(x))$ holds uniformly for $2 \le p \le x^w$, as $x \to \infty$.
This fact together with Lemma 2 ensures that for each real $u > 0$,
$$\sum_{p \le x;\ |f(p)| > u\beta(x)}\frac{1}{p} \le o(1) + \sum_{x^w < p \le x}\frac{1}{p} \le -\log w + o(1)$$
as $x \to \infty$. Since $w$ can be taken arbitrarily close to 1 from below we see that
$$\sum_{p \le x;\ |f(p)| > u\beta(x)}\frac{1}{p} \to 0 \qquad (x \to \infty).$$
This is already a large part of Proposition B.
We now sketch a proof of Lemma 2. We can find functions $\varepsilon(x)$ and $\delta(x)$ so
that $\varepsilon(x)$ decreases (in the wide sense) to zero, $\delta(x) \to 0$ as $x \to \infty$ and so that
$$\nu_x(n;\ |f(n) - \alpha(x)| > \varepsilon(x)\beta(x)) \le \delta(x) \qquad (x \ge 1).$$
By replacing $\delta(x)$ by $\sup_{y \ge x}\delta(y)$ if necessary we can also assume that $\delta(x)$ decreases
to zero as $x \to \infty$.
Consider the integers nt (i = 1,..., k) which lie in the interval 1 ^ n{ ^ x, for which
the inequality \f(ni)-(x(x)\>£(x) /?(x) is valid. Let pj run through those primes p in
the interval 1 ^p^x for which
I i-p-1 I i
nt^x; p\\nt «i^x
-l/2Y/Al/2n-l
>(S(x)-1/2xk)1/2p
Then, from Lemma 2,
Next, we note that if j> = x(logx)_1, then
£ -^((logx)-1'2)-^ (x-oo).
y<pgxP
We shall now show that every prime p in the range 2^/?^x(logx)_1 which is
not a pj has the property
|/(p)-(a(x)-a(p-1x))|^£^(x)
provided only that x is sufficiently large.
Let p be such a prime. Consider the integers m} in the interval l^ra^x for
which \f(mJ) — tt(x)\?^£(x) /?(x), and for which p || rrij. The number of such rrij is at
least
x x
Z I" I 1 ^---2-1—-(*(*)-1/2**)!'-p
/C
P
/2„"1
^-(l-l/p-Oogx)-1-^)-^)1/4).
P
Moreover, the number of integers r not exceeding xp"1 for which the inequality
|/(r)-a(xp-1)|^£(xp-1)^(xp-1)
is satisfied is at least (x/p) (1— S(xp~1)). It is now clear that at least one of the
integers mjp~l{p || rrij) coincides with one of these integers r. For otherwise there
are at least
(x/p) (2- 1/p-Pogx)-1 -2(<5(logx))1'4)
integers in the interval [1, xp"1], and for all sufficiently large values of x this is
impossible. Thus we have m3 = pr, where the condition p || rrij ensures that (p, r)= 1.
From the additive property of / (n),
^s(x)p(x) + s(xp-1)^xp-1)
^2s(logx)P(x).
The proof of Lemma 2 is now complete.
References
1. P. Erdős and A. Wintner, Additive arithmetical functions and statistical independence, Amer. J.
Math. 61 (1939), 713-721.
2. J. Kubilius, Probabilistic methods in the theory of numbers, Gos. Izdat. Polit. Naucn. Lit. Litovsk.
SSR, Vilna, 1962; English transl., Transl. Math. Monographs, vol. 11, Amer. Math. Soc.,
Providence, R.I., 1964, Chap. 3, pp. 31-35. MR 26 #3691; MR 28 #3956.
3. P. Turán, On a theorem of Hardy and Ramanujan, J. London Math. Soc. 9 (1934), 274-276;
ibid. 11 (1936), 125-133.
University of Colorado
ON THE NUMBER OF SOLUTIONS OF $m=\sum_{i=1}^{k}x_i^k$
P. ERDŐS AND E. SZEMERÉDI
Denote by $r_{k,l}(n)$ the number of solutions of
$$n=\sum_{i=1}^{l}x_i^k$$
in positive integers $x_i$. The well-known hypothesis $K$ of Hardy and Littlewood states that for every $\varepsilon>0$,
(1) $r_{k,k}(n)=O(n^{\varepsilon})$.
(1) is well known for $k=2$; in fact, for $n>n_0(\varepsilon)$,
(2) $r_{2,2}(n)<n^{(1+\varepsilon)\log 2/\log\log n}$,
and (2) does not hold for every $n$ if $\log 2$ is replaced by a smaller constant. Nearly 40 years ago Mahler [7] disproved the hypothesis for $k=3$. He showed in fact that for infinitely many $n$ ($c_1,c_2,\dots$ denote positive absolute constants),
(3) $r_{3,3}(n)>c_1n^{1/12}$.
It is possible that, for all $n$,
(4) $r_{3,3}(n)<c_2n^{1/12}$,
but nothing is known about this. It is probable that the $K$-hypothesis fails for every $k>3$ too, but probably
AMS 1970 subject classifications. Primary 10B15, 10J99.
© 1973, American Mathematical Society
(5) $\sum_{n=1}^{x}\bigl(r_{k,k}(n)\bigr)^2<x^{1+\varepsilon}$
for every $\varepsilon$ if $x>x_0(\varepsilon)$. (5) would be just as useful for Waring's problem as the $K$-hypothesis.
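The counting function $r_{k,l}(n)$ can be explored numerically for small $n$. The following is a minimal sketch, not part of the paper: the function name `count_representations` is an assumption made here, and solutions are counted as ordered $l$-tuples of positive integers, which is one reading of "number of solutions".

```python
def count_representations(n, k, l):
    """Count ordered l-tuples (x_1, ..., x_l) of positive integers
    with x_1**k + ... + x_l**k == n (illustrative helper, not from the paper)."""
    # All k-th powers that can appear in a representation of n.
    powers = []
    x = 1
    while x ** k <= n:
        powers.append(x ** k)
        x += 1
    # ways[m] = number of ordered tuples built so far whose powers sum to m.
    ways = [0] * (n + 1)
    ways[0] = 1
    for _ in range(l):
        new = [0] * (n + 1)
        for m, w in enumerate(ways):
            if w:
                for p in powers:
                    if m + p > n:
                        break
                    new[m + p] += w
        ways = new
    return ways[n]
```

For instance, `count_representations(1729, 3, 2)` counts the ordered pairs coming from $1729=1^3+12^3=9^3+10^3$.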
Chowla [1] proved that for $k\ge 5$, $r_{k,k}(n)\ne O(1)$, and Chowla and Erdős [3] proved that, for every $k\ge 2$ and infinitely many $n$,
$$r_{k,k}(n)>\exp(c_k\log n/\log\log n).$$
Mordell proved that $r_{3,2}(n)\ne O(1)$, and Mahler [8] proved that, for infinitely many $n$, $r_{3,2}(n)>(\log n)^{1/4}$. As far as we know there is no nontrivial upper bound for $r_{3,2}(n)$, and almost nothing is known about $r_{k,l}(n)$ for $l<k$, $k>3$.
Another very difficult problem is to estimate $A_{k,l}(x)$, the number of integers $m\le x$ for which
$$m=\sum_{i=1}^{l}x_i^k$$
is solvable. A classical result of Landau states that
$$A_{2,2}(x)=(c+o(1))\,x/(\log x)^{1/2}.$$
Mahler and Erdős [4] proved that, for every $k>2$, $A_{k,2}(x)>\alpha_kx^{2/k}$ $(\alpha_k>0)$, and Hooley proved $A_{k,2}(x)=(c_k+o(1))\,x^{2/k}$. It seems certain that, for every $l<k$,
$$A_{k,l}(x)>\alpha_{k,l}\,x^{l/k}\quad\text{and}\quad A_{k,k}(x)>x^{1-\varepsilon}$$
for every $\varepsilon>0$, if $x>x_0(\varepsilon)$. Unfortunately we have no contribution towards settling these classical problems; for important partial results see the papers of Davenport [2].
P. Erdős [5] proved the following result.
Let $r_1<\dots<r_k<n$, $k>n^{1-c_1/\log\log n}$, $c_1<\tfrac12\log 2$. Then for $n>n_0(c_1)$, there is an $m$ so that the number of solutions of $m=r_i^2-r_j^2$ is greater than
$$\exp(c_2\log n/\log\log n).$$
He also proved that for infinitely many $n$ the number of solutions of $n=p^2+q^2$, $p$, $q$ primes, is greater than $\exp(c_3\log n/\log\log n)$. P. Erdős states without giving the proof that for every $k$ there is an $n_k$ so that the number of solutions of $n_k=p^3+q^3+r^3$ is greater than $k$. The analogous result seems to be unknown for more than three summands.
In the present note we prove the following:
Theorem. Let $t$ be a positive integer, $c_1$ a positive number, and $l$ and $n$ positive integers satisfying $l>c_1n$. Let $a_1<\dots<a_l$ be positive integers smaller than $n$ but otherwise arbitrary. If $n>n_0(c_1,t)$ there exists an integer $m$ such that the equation
$$m=\sum_{j=1}^{k}a_{i_j}^k$$
has more than $t$ solutions.
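The statement can be illustrated by brute force on a small dense set. The sketch below is only an illustration under assumptions made here: the helper name `most_represented` is hypothetical, and a "solution" is taken to be a choice of $k$ distinct elements of the sequence, disregarding order.

```python
from collections import Counter
from itertools import combinations

def most_represented(a, k):
    """Return (m, count), where m is an integer with the largest number of
    representations as a sum of k-th powers of k distinct elements of a.
    Exhaustive search; only practical for short sequences."""
    counts = Counter(sum(x ** k for x in combo) for combo in combinations(a, k))
    return counts.most_common(1)[0]
```

With `a = range(1, 19)` and `k = 2` the search finds $325=1^2+18^2=6^2+17^2=10^2+15^2$, the first integer with three such representations.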
Before we prove our theorem we wish to state a few well-known and very
difficult problems in additive number theory.
Denote by $f_k(n)$ the maximal size of a set of integers $1\le a_1<\dots<a_l\le n$ for which all the sums
$$\sum_{i=1}^{l}\varepsilon_ia_i,\qquad\varepsilon_i=0\text{ or }1,\qquad\sum_{i=1}^{l}\varepsilon_i=k$$
are distinct. Erdős and Turán conjectured
(6) $f_2(n)=n^{1/2}+O(1)$.
This problem seems very deep. Erdős, Turán and Lindström [6] proved
$$f_2(n)\le n^{1/2}+n^{1/4}+1$$
and recently Szemerédi proved $f_2(n)<n^{1/2}+o(n^{1/4})$; the proof is very complicated. The results of Singer [9] immediately imply $f_2(n)\ge(1+o(1))\,n^{1/2}$. P. Erdős often offered 300 dollars for a proof or disproof of the conjecture (6).
Chowla and Ryser conjectured that
(7) $f_k(n)=(1+o(1))\,n^{1/k}$.
They proved $f_k(n)\ge(1+o(1))\,n^{1/k}$. P. Erdős offers 100 dollars for a proof or disproof of (7). The methods used for $f_2(n)$ seem to break down completely.
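For very small $n$ the quantity $f_k(n)$ can be computed by exhaustive search. The following sketch assumes the definition above (sums of $k$ distinct elements must all differ); the names `has_distinct_k_sums` and `f_bruteforce` are introduced here for illustration, and the search is exponential in $n$.

```python
from itertools import combinations

def has_distinct_k_sums(a, k):
    """True if all sums of k distinct elements of a are pairwise different."""
    sums = [sum(c) for c in combinations(a, k)]
    return len(sums) == len(set(sums))

def f_bruteforce(n, k):
    """Largest size of a subset of {1, ..., n} whose k-element sums are all
    distinct, found by trying subsets from largest to smallest."""
    for size in range(n, 0, -1):
        for subset in combinations(range(1, n + 1), size):
            if has_distinct_k_sums(subset, k):
                return size
    return 0
```

For example, $\{1,2,3,5\}$ has six distinct pair sums, while $\{1,2,3,4\}$ fails because $1+4=2+3$.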
Finally denote by $F(n)$ the maximal size of a set of integers $1\le a_1<\dots<a_t\le n$ for which all the sums $\sum_{i=1}^{t}\varepsilon_ia_i$, $\varepsilon_i=0$ or $1$, are distinct. P. Erdős and L. Moser proved
$$F(n)<\frac{\log n}{\log 2}+\frac{\log\log n}{2\log 2}+c,$$
and Conway and Guy showed that for $t>22$, $F(2^t)\ge t+2$. P. Erdős asked 40 years ago: Is it true that
(*) $F(n)=\log n/\log 2+O(1)$?
Erdős offers 300 dollars for a proof or disproof of (*).
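Whether a given set has all subset sums distinct is easy to test. A small sketch follows; the function name is an assumption made here, and the example sets are standard small examples, not taken from the paper.

```python
def has_distinct_subset_sums(a):
    """True if the 2**len(a) subset sums of a are pairwise different."""
    sums = {0}  # subset sums of the elements processed so far
    for x in a:
        shifted = {s + x for s in sums}
        # A collision between old sums and sums using x means two
        # different subsets have the same sum.
        if sums & shifted:
            return False
        sums |= shifted
    return True
```

The set $\{6,9,11,12,13\}$ passes the test, so five integers below $2^4$ can already have distinct subset sums, one more than the powers of two $1,2,4,8$ provide.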
We now prove our theorem. The proof is rather complicated and to motivate
it we first try to explain its plan which follows [3].
Let $s$ be sufficiently large but fixed. $A$ will denote the sequence $1\le a_1<\dots<a_l\le n$, $l>c_1n$. $A(u,d;n)$ denotes the number of integers of the sequence $A$ satisfying $a_i\equiv u\pmod d$.
Suppose that we have found a square-free integer $T_r$, $r>r_0(k,s,t,c)$, all of whose prime factors $p_1,\dots,p_r$ are sufficiently large, so that for every $j$, $1\le j\le r$,
(8) $A(0,T_r/p_j;n)>lp_j/2T_r$;
and the number of residue classes $u\pmod{T_rp_j^{k-1}}$, $u\equiv 0\pmod{T_r/p_j}$ (the number of these residue classes is $p_j^k$), which do not satisfy
(9) $A(u,T_rp_j^{k-1};n)>l/sT_rp_j^{k-1}$
is less than $p_j^k/8k$ for $j=1,\dots,r$. Then we can prove our theorem by the method of [3].
To see this denote by $F(T_r)$ the number of solutions of the congruence (in distinct $a$'s)
(10) $\sum_{i=1}^{k}a_{l_i}^k\equiv 0\pmod{T_r^k}$ $(1\le l_i\le l)$,
and let $F_j(T_r)$ denote the number of those solutions of (10) for which
$$a_{l_i}\equiv 0\pmod{T_r/p_j},\qquad a_{l_i}\not\equiv 0\pmod{p_j},\qquad i=1,2,\dots,k.$$
Clearly
(11) $\sum_{j=1}^{r}F_j(T_r)\le F(T_r)$.
Next we estimate $F_j(T_r)$ from below. The first $k-2$ summands of (10) we choose arbitrarily subject only to
(12) $a_i\equiv 0\pmod{T_r/p_j}$, $a_i\not\equiv 0\pmod{p_j}$.
The number of choices of $a_i$ satisfying (12) is, by (8), greater than
(13) $lp_j/2T_r-n/T_r>lp_j/4T_r$
by $l>c_1n$, if the prime factors of $T_r$ are greater than, say, $10/c_1$. From (13) we obtain that the number of choices of $k-2$ distinct $a$'s satisfying (12) is greater than
JOTJ r (10kf\Try
We have to choose $a_{k-1}$ and $a_k$ so that besides satisfying (12) they should satisfy
(15) $a_{k-1}^k+a_k^k\equiv-\sum_{i=1}^{k-2}a_i^k\pmod{p_j^k}$.
A well-known result in elementary number theory states that if $p>p_0$, then the number of solutions of the congruence
$$x^k+y^k\equiv a\pmod{p^k},\qquad x,y\not\equiv 0\pmod p$$
is greater than $p^k/2$.
Now observe that the number of solutions of the congruence (15) in residues where at least one of them does not satisfy (9) is less than $p_j^k/4$. To see this observe that there are at most $p_j^k/8k$ residues not satisfying (9), and once one such residue has been chosen there are at most $k$ choices for the other residue in (15). Thus the number of solutions of (15) in residues satisfying (9) is greater than $p_j^k/4k$. Hence by (9) the number of solutions in $a_{k-1}$ and $a_k$ of (15) is greater than
(16) $\left(\dfrac{l}{sT_rp_j^{k-1}}\right)^{2}\dfrac{p_j^k}{4k}=\dfrac{l^2}{4ks^2T_r^2p_j^{k-2}}$.
From (14) and (16) we have
(17) $F_j(T_r)>l^kT_r^{-k}s^{-2}(100k)^{-k}$.
Thus from (17) and (11) and $l>c_1n$ we have, for $r>r_0(k,s,c_1)$,
(18) $F(T_r)>r\,(lT_r^{-1}(100k)^{-1})^ks^{-2}>r^{1/2}(n^k/T_r^k)$.
Now the integers $\sum_{i=1}^{k}a_{l_i}^k$ are all less than $kn^k$. Thus there are at most $kn^kT_r^{-k}$ of them which are multiples of $T_r^k$, and hence by (18) for at least one of these integers, say $m_1T_r^k$, the number of solutions of
$$m=m_1T_r^k=\sum_{i=1}^{k}a_{l_i}^k$$
is greater than $r^{1/2}/k>t$ for $r>t^2k^2$, and this completes the proof of our theorem.
Now we 'only' have to prove the existence of an integer Tr satisfying (8) and (9)
and this will be the chief difficulty of our proof. We need three lemmas.
Lemma 1. Let $\varepsilon>0$, $c>0$, and $r$ be a positive integer. Then there is an $n_0=n_0(\varepsilon,c,r)$ so that for every $n>n_0$, if $1\le a_1<\dots<a_l\le n$, $l>cn$, is any sequence of integers, then there is a square-free integer $t_r<t_0(\varepsilon,c,r)$ so that $V(t_r)=r$ ($V(m)$ denotes the number of distinct prime factors of $m$) and for every divisor $d$ of $t_r$,
(19) $(1-\varepsilon)\,l/d<A(0,d;n)<(1+\varepsilon)\,l/d$.
The proof of the lemma follows fairly easily from Turán's method and we will leave some of the details to the reader. First of all it immediately follows from Turán's method that
(20) $\sum{}'\,1/p<C_1$, $C_1=C_1(\varepsilon,c)$,
where in $\sum'1/p$ the summation is extended over all the primes $p$ which do not satisfy
(21) $(1-\varepsilon/r)\,l/p<A(0,p;n)<(1+\varepsilon/r)\,l/p$.
Henceforth we only consider primes $p$ which satisfy (21). Let $p_1$ be the smallest such prime. Put $t_1=p_1$; $t_1$ clearly satisfies (19). Suppose we have already constructed an integer $t_s=p_1\cdots p_s$, $p_1<\dots<p_s$, so that for every divisor $d'$ of $t_s$ we have
(22) $(1-s\varepsilon/r)\,l/d'<A(0,d';n)<(1+s\varepsilon/r)\,l/d'$.
It again follows by Turán's method (taking note of (22)) that
(23) $\sum{}'\,1/p<C_{s+1}$,
where in $\sum'1/p$ the summation is extended over the primes $p$ for which, for some divisor $d'$ of $t_s$,
(24) $(1-\varepsilon(s+1)/r)\,l/pd'<A(0,pd';n)<(1+\varepsilon(s+1)/r)\,l/pd'$
does not hold. Let $p_{s+1}$ be the smallest prime greater than $p_s$ which satisfies (24) for every divisor $d'$ of $t_s$. Put $t_{s+1}=t_sp_{s+1}$. Clearly $t_r$ satisfies (19), and by our construction $t_r<t_0(\varepsilon,c,r)$, which proves our lemma.
Lemma 2. Let $\varepsilon>0$, $c>0$, and let $m_1<\dots<m_r$ be any sequence of integers which are pairwise relatively prime. Let $L>L_0(c,\varepsilon)$, $N>N_0(m_r,L,\varepsilon)$, and let $b_1<\dots<b_l<N$, $l>cN$, be any sequence of integers. An $m_i$, $1\le i\le r$, is said to be bad if there are more than $\varepsilon m_i$ residue classes $u\pmod{m_i}$ so that for each of them
(25) $B(u,m_i,N)<l/2Lm_i$.
Then there are fewer than $L$ bad $m_i$'s.
The lemma would follow easily from the large sieve but we give a very simple direct proof. A residue class $u\pmod{m_i}$ is bad if it satisfies (25). If a $b_j$ is congruent to a bad residue class mod $m_i$ for any $i=1,\dots,r$, we throw it away. Assume that our lemma is not true and that there are $L$ or more bad $m_i$'s. Consider any $L$ of them, say $m_{i_1},\dots,m_{i_L}$. We throw away, by (25), at most $l/2$ $b$'s; thus by $l>cN$ there are at least $l/2$ $b$'s, $b_{j_1}<\dots<b_{j_M}$, so that every $b_j\pmod{m_{i_s}}$, $1\le s\le L$, is not a bad residue class (i.e. $B(b_j,m_{i_s},N)$ does not satisfy (25)). But since $m_{i_s}$ is bad, $1\le s\le L$, there are at least $\varepsilon m_{i_s}$ bad residues mod $m_{i_s}$, or the $b$'s are in at most $(1-\varepsilon)^L\prod_{s=1}^{L}m_{i_s}$ residue classes mod $\prod_{s=1}^{L}m_{i_s}$. Thus for $L>L_0(c,\varepsilon)$,
$$cN/4<l/2<M<(1+o(1))(1-\varepsilon)^LN<cN/4,$$
an evident contradiction, which proves Lemma 2.
Let now tr be an integer which satisfies (19) and let r be sufficiently large. Let
d | tr. A prime p \ tr/d is said to be bad with respect to d if the following holds:
Let bx < - - - < br < n/d be the integers ajd, r>(l—e) l/d>(l — e) cn/d, by Lemma 1.
Now p is bad (with respect to d) if there are more than epk residues modp* so that
(25) holds for each of them (mf = pfc, N = n/d). By Lemma 2 there are fewer than L
bad primes p | tr/d.
Lemma 3. There is a $d\mid t_r$, $V(d)>\log r/2\log 2$, so that no $p\mid d/d_1$ is bad with respect to $d_1$, where $d_1$ is any divisor of $d$.
If we prove Lemma 3 our proof is finished since we can simply put $d=T_r$ and (8) and (9) are satisfied. Thus we only have to prove Lemma 3.
Lemma 3 follows from an argument used by Spencer and Erdos (their paper
will be soon published in Matematikai Lapok) but in view of the fact that the paper
is in Hungarian it seems appropriate to give the simple proof in full detail. The
argument is of course purely combinatorial. Let $|\varphi|=r$, $\varphi_1\subset\varphi$. By assumption there are fewer than $L$ bad elements $x\in\varphi$ with respect to $\varphi_1$ $(x\notin\varphi_1)$. A subset $\varphi_1$ of $\varphi$ is called bad if there is an element $x$ of $\varphi_1$ so that $x$ is bad with respect to $\varphi_1-x$. Clearly there are at most $L\binom{r}{u-1}$ bad subsets $\varphi_1\subset\varphi$ with $|\varphi_1|=u$.
We want to prove that there is a subset $A\subset\varphi$, $|A|>\log r/2\log 2$, which contains no bad subsets, and this will complete the proof of our lemma.
Clearly there are at most
JtAl-uJ \u-lj {1-1)1
/-element subsets of cp which contain a bad subset. Now if r> r0(L), /^logr/2 log 2,
then
0-1)! W
thus there is an /-element subset A, /^logr/2 log 2, which contains no bad subset,
which proves our lemma and theorem.
Lemma 1 could have been strengthened in the following way: Instead of (19) we could have proved that for every $u$,
(19') $(1-\varepsilon)\,l/d<A(u,d;n)<(1+\varepsilon)\,l/d$
uniformly for every residue class $u$.
The proof would be essentially the same as that of (19). We plan to discuss several other possible generalisations in another paper.
References
1. S. Chowla, Indian Phys.-Math. J. 6 (1935), 65-68.
2. H. Davenport, Sums of three positive cubes, J. London Math. Soc. 25 (1950), 339-343. MR 12, 393.
3. P. Erdős, On the representation of an integer as the sum of k k-th powers, J. London Math. Soc. 11 (1936), 133-136.
4. P. Erdős and K. Mahler, On the number of integers which can be represented by a binary form, J. London Math. Soc. 14 (1939), 134-139.
5. P. Erdős, On the sum and difference of squares of primes. I, II, J. London Math. Soc. 12 (1937), 133-136, 168-171.
6. B. Lindström, An inequality for B2-sequences, J. Combinatorial Theory 6 (1969), 211-212.
7. K. Mahler, Note on the hypothesis K of Hardy and Littlewood, J. London Math. Soc. 11 (1936), 136-138.
*8. K. Mahler, On the lattice points on curves of genus 1, Proc. London Math. Soc. 39 (1935), 431-466.
9. J. Singer, A theorem in finite projective geometry and some applications to number theory, Trans. Amer. Math. Soc. 43 (1938), 377-385.
Mathematical Institute of the Hungarian Academy of Science
Budapest, Hungary
* P. Erdos remembers that Mahler in a later paper improved the exponent i to 2 but is unable to
trace the reference.
THE LARGE SIEVE AND
PROBABILISTIC GALOIS THEORY
P. X. GALLAGHER
In this paper we give versions of the large sieve in several variables suggested
by recent papers of Hlawka [11], Elliott [10], and Montgomery [16]. The sieve
is applied to the set of all monic polynomials in one variable of given degree with
integer coefficients. In particular, we sharpen the result of Dörge [8] and van der Waerden [25] that almost all such polynomials are irreducible and have Galois group equal to the symmetric group.
For the number $E_n(N)$ of polynomials
$$F(X)=X^n+a_1X^{n-1}+\cdots+a_n$$
with integer coefficients and height $H(F)=\max(|a_1|,\dots,|a_n|)\le N$ for which the Galois group is less than the symmetric group, van der Waerden [26] gave the estimate
$$E_n(N)\ll N^{n-c/\log\log N},\quad\text{with } c=1/6(n-2),$$
by an argument based on reduction modulo $p$ for many primes $p$. Knobloch [14], [15] has improved this to
$$E_n(N)\ll N^{n-c},\quad\text{with } c=1/18n(n!)^3,$$
using a quantitative version of the Hilbert irreducibility theorem. Using the sieve in van der Waerden's argument we get
(1) $E_n(N)\ll N^{n-1/2}\log N$,
AMS 1969 subject classifications. Primary 1064, 1050, 1240; Secondary 1065, 2020.
© 1973, American Mathematical Society
which begins to approach the estimate ([9], [21], [26])
$$R_n(N)\ll N\log N\quad(n=2),\qquad R_n(N)\ll N^{n-1}\quad(n>2),$$
for the number $R_n(N)$ of reducible monic polynomials of height $\le N$. A refinement of the argument leads to
(2) $E_n(N)\ll N^{n-1/2}\log^{1-\gamma}N$, with $\gamma=\gamma_n>0$.
1. The large sieve in $\mathbf Z^n$. For each prime $p$, let $\Omega(p)$ be a subset of the group $\mathbf Z^n/p\mathbf Z^n$ of $n$-dimensional lattice vectors modulo $p$. Denoting by $\omega(p)$ the number of elements of $\Omega(p)$, we have $0\le\omega(p)\le p^n$. For each lattice vector $a\in\mathbf Z^n$, denote by $P(a,x)$ the number of primes $p\le x$ for which $a$ mod $p$ belongs to $\Omega(p)$, and put
$$P(x)=\sum_{p\le x}\omega(p)/p^n,$$
the "expectation" of $P(a,x)$.
Lemma A. For $N\ge x^2$, we have
(3) $\sum_{|a|\le N}(P(a,x)-P(x))^2\ll N^nP(x)$.
Here \a\ is the maximum of the absolute values of the components of a. The implied
constant, here and in what follows, may depend on n.
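The inequality (3) can be probed numerically in the simplest case. The sketch below is an illustration under assumptions chosen here, not taken from the paper: dimension $n=1$ and $\Omega(p)=\{0\}$ for every prime, so that $P(a,x)$ counts the primes $p\le x$ dividing $a$ and $P(x)=\sum_{p\le x}1/p$. The function names are hypothetical.

```python
def primes_up_to(x):
    """Sieve of Eratosthenes: list of primes <= x (x >= 1)."""
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(x ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, x + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def lemma_a_ratio(N, x):
    """Left side of (3) divided by N * P(x), for n = 1 and Omega(p) = {0}."""
    primes = primes_up_to(x)
    expectation = sum(1.0 / p for p in primes)  # P(x) with omega(p) = 1
    total = 0.0
    for a in range(-N, N + 1):
        count = sum(1 for p in primes if a % p == 0)  # P(a, x)
        total += (count - expectation) ** 2
    return total / (N * expectation)
```

Calling `lemma_a_ratio(2000, 40)` (so that $N\ge x^2$) returns a ratio of modest size, consistent with a bounded implied constant in this special case; this is a sanity check, not a proof.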
With the same notation, denote by $E(N)$ the number of $a$ with $|a|\le N$ for which $a$ mod $p\notin\Omega(p)$, for each prime $p$. Put
(4) $L(x)=\sum_{q\le x}\mu^2(q)\prod_{p\mid q}\dfrac{\omega(p)}{p^n-\omega(p)}$.
Lemma B. For $N\ge x^2$, we have $E(N)\ll N^n/L(x)$.
Lemma A is an n-dimensional variant of an inequality of Turan [23], [24],
analogous to Tchebychev's inequality for the standard deviation of a sum of
independent random variables. Lemma B is the n-dimensional version of the
Selberg-Montgomery sieve upper bound.
The proofs depend on the $n$-dimensional analogue of Bombieri's large sieve inequality for exponential sums [4], [5]. Let
$$S(\alpha)=\sum_{|a|\le N}c(a)\,e(a\cdot\alpha)\qquad(\alpha\in\mathbf R^n/\mathbf Z^n)$$
with the usual dot product, and arbitrary complex coefficients $c(a)$. Then
(5) $\sum_{\operatorname{ord}\alpha\le x}|S(\alpha)|^2\ll(N^n+x^{2n})\sum_{|a|\le N}|c(a)|^2$.
Inequalities of this sort have been applied by Huxley [12], Wilson [29], Samandarov [17], and Schaal [18] in generalising the large sieve to algebraic number fields, and by Hlawka [11], whose application is a variant of Lemma B. For a proof of (5), see [11] or [12].
Weaker forms of Lemmas A and B may be derived without using (5). For
square-free q, let Nq be the number of a with \a\^N for which a modpeQ(p)
for each p | q. A simple lattice point argument gives
(6)
N, = (2N)»^+o(^-^)), for,**.
Using (6) in Turán's argument, we can prove (3) provided $x\le(N\log N)^{1/3}$. In the application mentioned in the introduction, this would lead to a bound $N^{n-1/3}\log^cN$. Similarly, using (6) in Selberg's upper bound sieve method [19], we would get a bound $N^{n-1/5}\log^cN$.
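The following proof of Lemma A was suggested by arguments of Elliott [10] and Warlimont [28]. Let $\varphi_p(a)$ be the characteristic function of the set of $a$ satisfying $a$ mod $p\in\Omega(p)$. Then $\varphi_p$ is periodic mod $p\mathbf Z^n$, so
$$\varphi_p(a)=\sum_{\operatorname{ord}\alpha\mid p}c_p(\alpha)\,e(a\cdot\alpha),$$
with
$$c_p(\alpha)=p^{-n}\sum_{a\in\mathbf Z^n/p\mathbf Z^n}\varphi_p(a)\,e(-a\cdot\alpha).$$
In particular,
(7) $c_p(0)=\dfrac{\omega(p)}{p^n}$ and $\sum_{\operatorname{ord}\alpha\mid p}|c_p(\alpha)|^2=\dfrac{\omega(p)}{p^n}$.
It follows that
$$P(a,x)=\sum_{p\le x}\varphi_p(a)=P(x)+R(a,x),$$
with
$$R(a,x)=\sum_{p\le x}\ \sum_{\operatorname{ord}\alpha=p}c_p(\alpha)\,e(a\cdot\alpha).$$
Hence
$$\sum_{|a|\le N}(R(a,x))^2=\sum_{p\le x}\ \sum_{\operatorname{ord}\alpha=p}c_p(\alpha)\sum_{|a|\le N}R(a,x)\,e(a\cdot\alpha)\le\Bigl(\sum_{p\le x}\ \sum_{\operatorname{ord}\alpha=p}|c_p(\alpha)|^2\Bigr)^{1/2}\Bigl(\sum_{\alpha\in A}|S(\alpha)|^2\Bigr)^{1/2},$$
where $A$ is the set of $\alpha$ in $\mathbf R^n/\mathbf Z^n$ of prime order $\le x$, and
$$S(\alpha)=\sum_{|a|\le N}R(a,x)\,e(a\cdot\alpha).$$
Using (7) and (5), we conclude that
$$\sum_{|a|\le N}(R(a,x))^2\ll\Bigl[P(x)\,(N^n+x^{2n})\sum_{|a|\le N}(R(a,x))^2\Bigr]^{1/2},$$
from which Lemma A follows.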
For the proof of Lemma B, it suffices, as in [16], to show that if $c(a)=0$ unless $a$ mod $p\notin\Omega(p)$ for all $p$, then
(8) $\sum_{\operatorname{ord}\alpha=q}|S(\alpha)|^2\ge|S(0)|^2\prod_{p\mid q}\dfrac{\omega(p)}{p^n-\omega(p)}$
for all square-free $q$. In fact, putting $c(a)=1$ or $0$ according as $a$ mod $p\notin\Omega(p)$ for all $p$ or not, we have $E(N)=S(0)=\sum_{|a|\le N}|c(a)|^2$, so the upper bound for $E(N)$ follows from (5) and (8).
To prove (8), we may proceed as in [16] or as follows: For each prime $p$,
(9) $\sum_{\operatorname{ord}\alpha=p}|S(\alpha)|^2=p^n\sum_{h\in\mathbf Z^n/p\mathbf Z^n}|S(h,p)|^2-|S(0)|^2$,
where $S(h,p)=\sum_{a\bmod p\,=\,h}c(a)$. The Schwarz inequality gives
(10) $|S(0)|^2=\Bigl|\sum_hS(h,p)\Bigr|^2\le(p^n-\omega(p))\sum_h|S(h,p)|^2$,
since, by the hypothesis, $S(h,p)=0$ for $h\in\Omega(p)$. Combining (9) and (10), we get (8) for $q=p$. More generally, replacing $c(a)$ by $c(a)\,e(a\cdot\beta)$, we get
$$\sum_{\operatorname{ord}\alpha=p}|S(\alpha+\beta)|^2\ge\frac{\omega(p)}{p^n-\omega(p)}\,|S(\beta)|^2.$$
If $q$ is a square-free integer but not a prime, then $q=pr$, with $p$ a prime and $r$ a square-free integer prime to $p$. Hence
$$\sum_{\operatorname{ord}\gamma=q}|S(\gamma)|^2=\sum_{\operatorname{ord}\alpha=p}\ \sum_{\operatorname{ord}\beta=r}|S(\alpha+\beta)|^2\ge\frac{\omega(p)}{p^n-\omega(p)}\sum_{\operatorname{ord}\beta=r}|S(\beta)|^2,$$
and we get (8) from this by induction on the number of prime factors of $q$.
2. Sifting polynomials. Let $F(X)$ be a monic polynomial of degree $n$ with integer coefficients. If $F(X)$ mod $p$ splits into distinct monic irreducible factors, with $r_1$ linear factors, $r_2$ quadratic factors, etc., then we say that $r=(r_1,r_2,\dots)$ is the splitting type of $F(X)$ mod $p$. Thus $F(X)$ has a splitting type for each prime which does not divide $D(F)$, the discriminant of $F(X)$. For each splitting type $r$, we have
(11) $\sum_ffr_f=n$.
Given $r$ satisfying (11), denote by $\pi_{F,r}(x)$ the number of primes $p\le x$ for which $F(X)$ mod $p$ has type $r$. As we will show later, for "almost all" $F(X)$, we have $\pi_{F,r}(x)\sim\delta(r)\,\pi(x)$, $x\to\infty$, where $\pi(x)$ is the number of primes $p\le x$ and
(12) $\delta(r)=(1^{r_1}r_1!\,2^{r_2}r_2!\cdots)^{-1}$.
The following result shows that $\delta(r)$ is the normal density of primes for which monic polynomials of degree $n$ have splitting type $r$.
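The densities in (12) are the familiar proportions of permutations in $S_n$ with a given cycle type, so they should sum to 1 over all types satisfying (11). A short sketch that enumerates the types and checks this with exact arithmetic follows; the representation of $r$ as a dictionary `{f: r_f}` and the function names are choices made here for illustration.

```python
from fractions import Fraction
from math import factorial

def cycle_types(n, max_part=None):
    """Yield all splitting types of degree n as dicts {f: r_f}
    with sum of f * r_f equal to n (largest f first)."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield {}
        return
    for f in range(min(n, max_part), 0, -1):
        for r_f in range(1, n // f + 1):
            for rest in cycle_types(n - f * r_f, f - 1):
                yield {f: r_f, **rest}

def delta(r):
    """delta(r) = 1 / prod_f (f**r_f * r_f!), as in formula (12)."""
    d = Fraction(1)
    for f, r_f in r.items():
        d /= f ** r_f * factorial(r_f)
    return d
```

For $n=3$ the three types have densities $1/3$ (irreducible), $1/2$ (linear times quadratic) and $1/6$ (three linear factors).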
Theorem A. If $r$ satisfies (11), then for $N\ge x^2$, we have
(13) $\sum_{H(F)\le N}(\pi_{F,r}(x)-\delta(r)\,\pi(x))^2\ll N^n\pi(x)$.
Proof. We identify a monic polynomial of degree $n$ with the lattice vector $a=(a_1,\dots,a_n)$ formed by its coefficients, so that $H(F)=|a|$. Similarly, polynomials mod $p$ are identified with lattice vectors mod $p$. Let $\Omega_r(p)$ be the set of monic $n$th degree polynomials mod $p$ of type $r$. By the unique factorisation theorem, the number $\omega_r(p)$ of such polynomials is given by
$$\omega_r(p)=\prod_f\binom{\pi_p(f)}{r_f},$$
where $\pi_p(f)$ is the number of $f$th degree monic irreducible polynomials mod $p$. By a theorem of Dedekind [7], [13] we have
$$\pi_p(f)=\frac1f\sum_{d\mid f}\mu(d)\,p^{f/d}.$$
Thus $\pi_p(f)$ is a polynomial in $p$ with leading term $p^f/f$, so $\omega_r(p)$ is a polynomial in $p$ with leading term
$$\prod_f\frac{p^{fr_f}}{f^{r_f}r_f!}=\delta(r)\,p^n.$$
It follows that
$$P_r(x)=\sum_{p\le x}\frac{\omega_r(p)}{p^n}=\delta(r)\,\pi(x)+O(\log\log x).$$
Identifying $\pi_{F,r}(x)$ with $P_r(a,x)$ and using Lemma A, the result follows.
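The classical count of monic irreducible polynomials of degree $f$ over the field with $p$ elements, $\pi_p(f)=\frac1f\sum_{d\mid f}\mu(d)\,p^{f/d}$, is easy to evaluate. The sketch below is illustrative; the helper names `mobius` and `irreducible_count` are assumptions made here.

```python
def mobius(m):
    """Moebius function mu(m) for m >= 1, by trial division."""
    result = 1
    d = 2
    while d * d <= m:
        if m % d == 0:
            m //= d
            if m % d == 0:
                return 0  # m has a square factor
            result = -result
        d += 1
    if m > 1:
        result = -result
    return result

def irreducible_count(p, f):
    """Number of monic irreducible polynomials of degree f over F_p."""
    total = sum(mobius(d) * p ** (f // d) for d in range(1, f + 1) if f % d == 0)
    return total // f
```

A useful consistency check is the identity $\sum_{d\mid n}d\,\pi_p(d)=p^n$, which expresses that every element of the field with $p^n$ elements is a root of exactly one monic irreducible polynomial of degree dividing $n$.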
Let
$$\pi_F(x)=\sum_{p\le x}{}'\,(\text{number of roots of }F(X)\equiv 0\bmod p),$$
where the dash indicates that $p\nmid D(F)$.
Corollary 1. For $N\ge x^2$, we have
(14) $\sum_{H(F)\le N}(\pi_F(x)-\pi(x))^2\ll N^n\pi(x)$.
Proof. We have
$$\pi_F(x)=\sum_rr_1\pi_{F,r}(x)=\Bigl(\sum_rr_1\delta(r)\Bigr)\pi(x)+\sum_rr_1\bigl(\pi_{F,r}(x)-\delta(r)\,\pi(x)\bigr).$$
The first sum on the right is 1, since it is the coefficient of $x^{n-1}$ in
$$\frac{\partial}{\partial x_1}\exp\Bigl(x_1+\frac{x^2}{2}+\frac{x^3}{3}+\cdots\Bigr)\Big|_{x_1=x}=\exp\log\frac{1}{1-x}=1+x+x^2+\cdots.$$
Using the Schwarz inequality, we get
$$(\pi_F(x)-\pi(x))^2\ll\sum_r(\pi_{F,r}(x)-\delta(r)\,\pi(x))^2.$$
Summing on $F$ and using (13) for each $r$, we get (14).
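The heuristic behind Corollary 1, that a polynomial has on average one root modulo $p$, can be checked exactly for small $p$ and $n$. The sketch below averages over all monic polynomials of degree $n$ modulo $p$; unlike the paper it does not exclude polynomials with repeated factors, and the function name is an assumption made here.

```python
from fractions import Fraction
from itertools import product

def average_root_count(p, n):
    """Average number of distinct roots in Z/pZ over all p**n monic
    polynomials X^n + a_1 X^(n-1) + ... + a_n with coefficients mod p."""
    total_roots = 0
    for coeffs in product(range(p), repeat=n):  # (a_1, ..., a_n)
        for x in range(p):
            value = 1  # leading coefficient; Horner evaluation at x
            for a in coeffs:
                value = (value * x + a) % p
            if value == 0:
                total_roots += 1
    return Fraction(total_roots, p ** n)
```

The average is exactly 1, because for each fixed $x$ the condition $F(x)\equiv 0$ determines $a_n$ from the other coefficients.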
Denote by $E_r(N)$ the number of monic $n$th degree polynomials $F(X)$ of height $\le N$ for which $F(X)$ mod $p$ has splitting type $r$ for no prime $p$.
Corollary 2. For each $r$, we have $E_r(N)\ll N^{n-1/2}\log N$.
Proof. For each such $F(X)$, we have $\pi_{F,r}(x)=0$, so the corresponding term in (13) contributes $(\delta(r)\,\pi(x))^2\gg(\pi(x))^2$. Hence $E_r(N)\ll N^n/\pi(x)$ for $N\ge x^2$. Choosing $x=N^{1/2}$, the result follows, using Tchebychev's estimate for $\pi(x)$.
Using Lemma B, we can improve the logarithm in Corollary 2. For any nonempty set $R$ of splitting types, denote by $E_R(N)$ the number of monic $n$th degree polynomials $F(X)$ of height $\le N$ for which $F(X)$ mod $p$ has splitting type in $R$ for no prime $p$. Put $\delta(R)=\sum_{r\in R}\delta(r)$.
Theorem B. For each $\delta<\delta(R)$, we have
(15) $E_R(N)\ll N^{n-1/2}(\log N)^{1-\delta/(1-\delta)}$.
Proof. Let $\Omega_R(p)$ be the union of the $\Omega_r(p)$ for $r\in R$. Then $\omega_R(p)\sim\delta(R)\,p^n$ as $p\to\infty$, so $\omega_R(p)\ge\delta p^n$ for all $p\ge M_\delta$. To find an upper bound for $E_R(N)$, it suffices by Lemma B to find a lower bound for
$$L_{c,M}(x)=\sum_{q\le x}{}'\,c^{\nu(q)}\qquad(c=\delta/(1-\delta)),$$
where the dash indicates that $q$ runs over square-free integers with no prime factor $<M$, and $\nu(q)$ is the number of prime factors of $q$. From a more general formula of A. Selberg [20, Theorem 2] we get
^»w^,,n('-^)\n(.^)(.-i)'-log.-.„
for each c. Putting x = N1/2, the result follows.
3. van der Waerden's theorem. Combining the results of the previous paragraph with a method of Bauer [3], we can estimate $E_n(N)$. We use the fact that if $F(X)$ mod $p$ has splitting type $r$, for some prime $p$, then the Galois group $G$ of the splitting field of $F(X)$, regarded as permutation group on the roots of $F(X)$, contains a permutation of cycle type $r$, with $r_1$ cycles of length 1, $r_2$ cycles of length 2, ... [27, §61].
If $G$ is a proper subgroup of the symmetric group $S_n$, then the conjugates of $G$ do not cover $S_n$ [6, §26], so there is a conjugacy class of $S_n$, consisting of all permutations with a given cycle type $r$, which does not intersect $G$. Thus there is a forbidden splitting type for $F(X)$. It follows that
$$E_n(N)\le\sum_rE_r(N).$$
Using Corollary 2 or Theorem B (with $R=\{r\}$), we get the inequalities (1) or (2) of the introduction, for example with $\gamma_n=\min\delta(r)=(n!)^{-1}$.
To get a larger value for $\gamma_n$, we use the following lemma, stated by Bauer. I owe the proof to D. Knutson.
Lemma. Let $G$ be a subgroup of $S_n$. If $G$ is transitive, contains a transposition, and contains a $p$-cycle for some prime $p>n/2$, then $G=S_n$.
Proof. Consider the graph with vertices $1,2,\dots,n$ and edges corresponding to the transpositions $(ij)$ in $G$. Since $S_n$ is generated by transpositions, it suffices to show that the graph is complete, i.e. has an edge between each two vertices. If $G$ contains $(ij)$ and $(jk)$, then $G$ contains $(ik)=(ij)(jk)(ij)$. Hence the connected components of the graph are themselves complete graphs, so it suffices to show that the graph is connected.
In its action on the vertices, $G$ operates as a group of automorphisms of the graph, since $(\sigma i,\sigma j)=\sigma(ij)\sigma^{-1}\in G$ for $\sigma\in G$ and $(ij)\in G$. Hence $G$ acts transitively on the components. Therefore the components are all isomorphic, of a size $d$ dividing $n$.
If the $p$-cycle moves a component, then there are at least $p$ components, so $pd\le n$, forcing $d=1$. This is impossible, since the graph has at least one edge. Hence the $p$-cycle fixes each component, so $d\ge p>n/2$, forcing $d=n$ because $d$ divides $n$. Hence the graph is connected.
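For small $n$ the lemma can be confirmed by generating the group directly. The sketch below is an illustration with conventions chosen here: permutations of $\{0,\dots,n-1\}$ are stored as tuples, and the subgroup is found by repeatedly multiplying by the generators.

```python
def compose(s, t):
    """Permutation 'apply t, then s' on {0, ..., n-1}, as a tuple."""
    return tuple(s[t[i]] for i in range(len(t)))

def generated_group(generators):
    """Set of all permutations in the subgroup generated by the given tuples."""
    n = len(generators[0])
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group
```

In $S_6$, the transposition $(0\,1)$ together with the 5-cycle $(1\,2\,3\,4\,5)$ generates a transitive group, and the lemma (with $p=5>3$) predicts all $720$ permutations. By contrast, $(0\,1)$ and $(2\,3\,4)$ in $S_5$ generate an intransitive group of order 6.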
Theorem C. We have $E_n(N)\ll N^{n-1/2}\log^{1-\gamma}N$, with $\gamma=\gamma_n>0$, and $\gamma_n\sim(2\pi n)^{-1/2}$.
Proof. Let $T$ be the set of elements of $S_n$ among whose cycles there is just one transposition and no other cycles of even length. Let $P$ be the set of elements of order divisible by some prime $p>n/2$. If $G$ is a proper subgroup of $S_n$, then either $G$ is intransitive or either $G\cap T$ or $G\cap P$ is empty, since otherwise $G$ satisfies the hypotheses of the lemma. It follows that
(16) $E_n(N)\le R_n(N)+E_T(N)+E_P(N)$.
The density of $P$ is the sum of the $\delta(r)$ with $r_p=1$ for some prime $p>n/2$. It follows easily that
(17) $\delta(P)=\sum_{n/2<p\le n}\dfrac1p\sim\dfrac{\log 2}{\log n}$,
using the prime number theorem.
The density of $T$ is the sum of the $\delta(r)$ with $r_2=1$ and $r_4=r_6=\cdots=0$. This is half the coefficient of $x^{n-2}$ in
$$\exp\Bigl(x+\frac{x^3}{3}+\frac{x^5}{5}+\cdots\Bigr)=\exp\Bigl(\frac12\log\frac{1+x}{1-x}\Bigr)=(1+x)(1-x^2)^{-1/2}=(1+x)\sum_{k\ge 0}\frac{(2k)!}{(2^kk!)^2}\,x^{2k},$$
or half the coefficient of $x^{2k}$ in the last sum, with $2k=n-2$ or $n-3$ according as $n$ is even or odd. Using Stirling's approximation, we get
(18) $\delta(T)\sim(2\pi n)^{-1/2}$.
Using Theorem B, the estimate for $R_n(N)$ stated in the introduction, and (16), (17), and (18), the result follows.
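The generating-function computation of $\delta(T)$ can be cross-checked against a direct enumeration of $S_n$ for small $n$. The sketch below is illustrative; the function names are chosen here, and the closed form used is half the coefficient $\binom{2k}{k}/4^k$ with $k=\lfloor(n-2)/2\rfloor$, as read off from the series above.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial, pi, sqrt

def cycle_lengths(perm):
    """Cycle lengths of a permutation of {0, ..., n-1} given as a tuple."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start not in seen:
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = perm[i]
                length += 1
            lengths.append(length)
    return lengths

def density_T_by_enumeration(n):
    """Proportion of S_n with exactly one 2-cycle and no other even cycle."""
    hits = 0
    for perm in permutations(range(n)):
        evens = [c for c in cycle_lengths(perm) if c % 2 == 0]
        if evens == [2]:
            hits += 1
    return Fraction(hits, factorial(n))

def density_T_by_formula(n):
    """Half the coefficient of x^(2k) in (1 - x^2)^(-1/2), k = (n - 2) // 2."""
    k = (n - 2) // 2
    return Fraction(comb(2 * k, k), 2 * 4 ** k)
```

For example, both functions give $3/16$ for $n=6$, and `float(density_T_by_formula(200)) * sqrt(2 * pi * 200)` is close to 1, in line with (18).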
We remark that for irreducible $F(X)$ the asymptotic formula
$$\pi_F(x)\sim\pi(x),\qquad x\to\infty,$$
follows from the prime ideal theorem for the field $\mathbf Q(\alpha)$ with $F(\alpha)=0$. The hypothesis that $\zeta_{\mathbf Q(\alpha)}/\zeta_{\mathbf Q}$ is entire and has no zeros to the right of the critical line would give
$$\pi_F(x)=\pi(x)+O(x^{1/2+\varepsilon}),$$
with a constant depending on $H(F)$.
Similarly, if $F(X)$ has Galois group $S_n$, the formula
$$\pi_{F,r}(x)\sim\delta(r)\,\pi(x)$$
follows [1, §2] from a strong form of the Tchebotarev density theorem [22], [2] and the sharper form
$$\pi_{F,r}(x)=\delta(r)\,\pi(x)+O(x^{1/2+\varepsilon})$$
would follow from the hypothesis that the nonprincipal Artin $L$-functions for the splitting field of $F(X)$ are entire and have no zeros to the right of the critical line.
By comparison, Corollary 1 and Theorem A give
$$|\pi_F(x)-\pi(x)|\le x^{1/2+\varepsilon}\quad\text{and}\quad|\pi_{F,r}(x)-\delta(r)\,\pi(x)|\le x^{1/2+\varepsilon}$$
for all but $\ll_n x^{2n(1-\varepsilon)}$ polynomials $F(X)$ with $H(F)\le x^2$.
References
1. E. Artin, Über die Zetafunktionen gewisser algebraischer Zahlkörper, Math. Ann. 89 (1923), 147-156.
2. E. Artin, Über eine neue Art von L-Reihen, Abh. Math. Sem. Univ. Hamburg 3 (1923), 89-108.
3. M. Bauer, Ganzzahlige Gleichungen ohne Affekt, Math. Ann. 64 (1907), 325-327.
4. E. Bombieri, On the large sieve, Mathematika 12 (1965), 202-225. MR 33 #5590.
5. E. Bombieri, A note on the large sieve, Acta Arith. 18 (1971), 401-404.
6. W. Burnside, Theory of groups of finite order, Cambridge Univ. Press, Cambridge, 1911.
7. R. Dedekind, Abriss einer Theorie der höhern Congruenzen in Bezug auf einen reellen Primzahl-Modulus, J. Reine Angew. Math. 54 (1857), 1-26.
8. K. Dörge, Über die Seltenheit der reduziblen Polynome und der Normalgleichungen, Math. Ann. 95 (1925), 247-256.
9. K. Dörge, Abschätzung der Anzahl der reduziblen Polynome, Math. Ann. 160 (1965), 59-63. MR 31 #5865.
10. P. D. T. A. Elliott, The Turán-Kubilius inequality, and a limitation theorem for the large sieve, Amer. J. Math. 92 (1970), 293-300. MR 41 #8360.
11. E. Hlawka, Bemerkungen zum grossen Sieb von Linnik, Österreich. Akad. Wiss. Math.-Natur. Kl. S.-B. II 178 (1970), 13-18. MR 42 #224.
12. M. N. Huxley, The large sieve inequality for algebraic number fields, Mathematika 15 (1968), 178-187. MR 38 #5737.
13. H. Kornblum, Über die Primfunktionen in einer arithmetischen Progression, Math. Z. 5 (1919), 100-111.
14. H.-W. Knobloch, Zum Hilbertschen Irreduzibilitätssatz, Abh. Math. Sem. Univ. Hamburg 19 (1955), 176-190. MR 16, 798.
15. H.-W. Knobloch, Die Seltenheit der reduziblen Polynome, Jber. Deutsch. Math. Verein. 59 (1956), Abt. 1, 12-19. MR 18, 185.
16. H. L. Montgomery, A note on the large sieve, J. London Math. Soc. 43 (1968), 93-98. MR 37 #184.
17. A. G. Samandarov, The large sieve in algebraic number fields, Mat. Zametki 2 (1967), 673-680. (Russian) MR 36 #6379.
18. W. Schaal, On the large sieve method in algebraic number fields, J. Number Theory 2 (1970), 249-270. MR 42 #7626.
19. A. Selberg, The general sieve-method and its place in prime number theory, Proc. Internat. Congress Math. (Cambridge, Mass., 1950), vol. 1, Amer. Math. Soc., Providence, R.I., 1952, pp. 286-292. MR 13, 438.
20. A. Selberg, Note on a paper by L. G. Sathe, J. Indian Math. Soc. 18 (1954), 83-87. MR 16, 676.
21. W. Specht, Zur Zahlentheorie der Polynome, S.-B. Math.-Nat. Kl. Bayer. Akad. Wiss. 1951, 139-146. MR 14, 251.
22. N. Tchebotarev, Die Bestimmung der Dichtigkeit einer Menge von Primzahlen, welche zu einer gegebenen Substitutionsklasse gehören, Math. Ann. 95 (1925), 191-228.
23. P. Turán, On a theorem of Hardy and Ramanujan, J. London Math. Soc. 9 (1934), 274-276.
24. P. Turán, Über einige Verallgemeinerungen eines Satzes von Hardy und Ramanujan, J. London Math. Soc. 11 (1936), 125-133.
25. B. L. van der Waerden, Die Seltenheit der Gleichungen mit Affekt, Math. Ann. 109 (1934), 13-16.
26. B. L. van der Waerden, Die Seltenheit der reduziblen Gleichungen und der Gleichungen mit Affekt, Monatsh. Math. 43 (1936), 133-147.
27. B. L. van der Waerden, Moderne Algebra. Vol. 1, Springer, Berlin, 1935; English transl., Ungar, New York, 1949. MR 10, 587.
28. R. Warlimont, On Artin's conjecture (to appear).
29. R. J. Wilson, The large sieve in algebraic number fields, Mathematika 16 (1969), 189-204. MR 41 #8374.
Columbia University
SOME REMARKS ON
ARITHMETIC DENSITY QUESTIONS
LARRY JOEL GOLDSTEIN1
In [2], we proposed the following generalization of the Artin primitive root conjecture [1, p. viii]: Let $S$ be a set of rational primes; for each $q\in S$, let $L_q$ be a finite, normal algebraic number field. Further, let $S^*$ be the set consisting of 1 and all square-free numbers divisible only by primes in $S$. For $k\in S^*$, define $L_k$ by
$$L_k=\mathbf Q\quad(k=1),\qquad L_k=\prod_{q\mid k}L_q\quad(k>1),$$
and let $\deg(L_k/\mathbf Q)=n(k)$. Let $P$ denote the set of all rational primes and let
$$\mathcal A=\mathcal A(\{L_q\},S)=\{p\in P\mid p\text{ does not split completely in }L_q\text{ for all }q\in S\}.$$
Then our conjecture was as follows.
Conjecture. Let $\mu$ denote the Möbius function and assume that $\sum_{k\in S^*}\mu(k)/n(k)$ converges absolutely. Then $\mathcal A$ has a natural density $d(\mathcal A)$ and
$$d(\mathcal A)=\sum_{k\in S^*}\mu(k)/n(k).$$
If $S=P$ and $L_q=\mathbf Q(\zeta_q,a^{1/q})$, where $a\in\mathbf Z$ is given ($a\ne 0,1$) and $\zeta_q$ is a primitive $q$th root of unity, then the conjecture is equivalent to the Artin primitive root conjecture. The conjecture is known to be true in the following cases: (1) $S$ finite; (2) $S$ arbitrary and $L_q\supseteq\mathbf Q(\zeta_{q^2})$ for all but finitely many $q$; (3) $S$ arbitrary, $L_q\supseteq\mathbf Q(\zeta_q,a^{1/q})$ for all but finitely many $q$, provided the Riemann hypothesis for the Dedekind zeta functions is true.
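In the Artin case the conjectural density can be evaluated numerically. The sketch below rests on an assumption not stated in the text: that the degrees are $n(k)=k\varphi(k)$ for all square-free $k$ (the generic situation for $L_q=\mathbf Q(\zeta_q,a^{1/q})$). Under that assumption the sum $\sum\mu(k)/n(k)$ factors as the Euler product $\prod_q\bigl(1-1/q(q-1)\bigr)$, commonly called Artin's constant. The function name is hypothetical.

```python
def artin_density(prime_bound):
    """Partial Euler product prod_{q <= prime_bound} (1 - 1/(q*(q-1))),
    assuming (hypothetically) n(k) = k * phi(k) for square-free k."""
    sieve = [True] * (prime_bound + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(prime_bound ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, prime_bound + 1, i):
                sieve[j] = False
    density = 1.0
    for q in range(2, prime_bound + 1):
        if sieve[q]:
            density *= 1.0 - 1.0 / (q * (q - 1))
    return density
```

Taking `prime_bound = 10**4` already gives a value near $0.37396$; the neglected tail of the product changes it only in the fifth decimal place.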
AMS 1970 subject classifications. Primary 12A70; Secondary 12A35.
1 Research supported by National Science Foundation grant GP-20538.
© 1973, American Mathematical Society
The conjecture is supported by the same heuristic that supports the Artin primitive root conjecture. Moreover, in [2], we have given the following interpretation of the conjecture in the setting of equidistributed sequences: Without loss of generality, assume that $L_q\ne\mathbf Q$ for all $q\in S$. Let $L$ denote the composite of all $L_q$. Then $L$ is a Galois extension of $\mathbf Q$ having profinite Galois group $G$. Let $H_k$ be the open subgroup of $G$ corresponding to $L_k$ under the Galois correspondence, and let
$$C=G-\bigcup_{q\in S}H_q.$$
Then $C$ is a closed subset of $G$ invariant under conjugation. Moreover, if $\mu$ is the Haar measure on $G$ which gives the whole group the measure 1, then
$$\int_Cd\mu=\sum_{k\in S^*}\mu(k)/n(k).$$
In [2], we constructed a generalized Artin symbol $((L/\mathbf Q)/p)$ defined for each finite $\mathbf Q$-prime $p$ which had the following properties:
(1) $((L/\mathbf Q)/p)$ is a closed subset of $G$;
(2) $((L/\mathbf Q)/p)$ is invariant under $G$-conjugation;
(3) if $p$ does not ramify in $L_k$, then the restrictions of the elements in $((L/\mathbf Q)/p)$ to $L_k$ coincide with the usual Artin symbol.
From the properties (1)-(3), it is clear that $p\in\mathcal A\iff((L/\mathbf Q)/p)\subseteq C$. Therefore, the conjecture is equivalent to the following Tchebotarev-type result.
Conjecture'. Let $\mathcal A=\{p\in P\mid((L/\mathbf Q)/p)\subseteq C\}$. Then $\mathcal A$ has a natural density $d(\mathcal A)$ and
$$d(\mathcal A)=\int_Cd\mu.$$
It is relatively easy to connect Conjecture' with the theory of equidistribution. Let $\bar G$ denote the space of all conjugacy classes of $G$, given the quotient topology, and let $\bar\mu$ be the projection of $\mu$ on $\bar G$. Further, let $\eta: G\to\bar G$ denote the canonical projection. Then $\bar\mu$ is a Borel measure on $\bar G$ and
$$\int_{\eta(C)} d\bar\mu = \int_C d\mu.$$
Then it is easy [2, p. 441] to see that the sequence $\eta(((L/Q)/p))$ is equidistributed with respect to the measure $\bar\mu$. Set
$$\sigma_p = \eta(((L/Q)/p)).$$
Then Conjecture' may be reformulated as follows.
Conjecture''. Let $\mathcal{A} = \{p\in P \mid \sigma_p\in\eta(C)\}$. Then $\mathcal{A}$ has a natural density $d(\mathcal{A})$ and
$$d(\mathcal{A}) = \int_{\eta(C)} d\bar\mu.$$
In view of the fact that the sequence $\{\sigma_p\}$ is equidistributed with respect to $\bar\mu$, Conjecture'' might seem reasonable. However, Weinberger [5] and Serre (private communication) have recently shown that this is not the case by constructing counterexamples. The counterexamples can be constructed using elementary class field theory in such a way that the extensions $L_q$ can be taken to be abelian over $Q$. In this paper, we will show that by imposing some further hypotheses on the family $\{L_q\}$, the most essential of which concerns the growth of the discriminant of $L_q$ as $q\to\infty$, it is still reasonable to expect the above conjectures to hold. Namely, we will prove the following result.
Theorem. Let the fields $\{L_q\}_{q\in S}$ be as above, and assume $L_q/Q$ is abelian for $q\ge R$. Moreover, assume that the family $\{L_q\}_{q\in S}$ satisfies the following three conditions:
(i) $\log|d_k|/n(k) = O(k^a)$ for some $a\ge 0$, as $k\to\infty$, where $d_k$ = the absolute discriminant of $L_k/Q$;
(ii) if $p\in P$ splits completely in $L_q$, then $p\ge q$ for all sufficiently large $q$;
(iii) $\sum_{q>\lambda} 1/n(q) = o(1/\log^2\lambda)$ $(\lambda\to\infty)$.
Then the conjecture holds if the Riemann hypothesis for the Dedekind zeta function is true.
Remarks. (1) Condition (iii) already implies the absolute convergence of $\sum_{k\in S^*}\mu(k)/n(k)$.
(2) If $L_q\supseteq Q(\zeta_q)$ for all sufficiently large $q$, then condition (ii) is satisfied since if $p$ splits completely in $Q(\zeta_q)$, then $p\equiv 1 \pmod q$.
(3) Condition (i) is suggested by the Riemann hypothesis for the Dedekind zeta function (see below). Moreover it is satisfied in situations of interest. For example, if $L_q = Q(\zeta_q, a^{1/q})$, then the arguments of Hooley [3, p. 229] show that
$$\lim_{q\to\infty} n(q)/q^2 = 1, \qquad |d_q|\le A q^{Bq^2} \quad (A>0,\ B>0),$$
which can be easily used to verify conditions (i) and (iii).
(4) Condition (iii) is of a technical nature and may not really be necessary, but is required in our proof.
Proof of the Theorem. The idea of the proof is the same as that used to prove Theorem 1 of [2]. Let us introduce the following notations: For $x>0$, $y>0$, set $N(x,y)$ = the number of primes $p\le x$ which do not split completely in any $L_q$, $q\le y$, $q\in S$; $P(x,k)$ = the number of primes $p\le x$ which split completely in $L_k/Q$; $M(x,y_1,y_2)$ = the number of primes $p\le x$ which split completely in some $L_q$, $y_1<q<y_2$; $\pi_k(x)$ = the number of prime ideals of $L_k$ of norm $\le x$; $\pi(x,\mathcal{A})$ = the number of primes $p\le x$ such that $p\in\mathcal{A}$. We must prove that
(1) $\pi(x,\mathcal{A}) = d(\mathcal{A})\,x/\log x + o(x/\log x)$ $(x\to\infty)$,
where
(2) $d(\mathcal{A}) = \sum_{k\in S^*} \mu(k)/n(k)$.
Condition (ii) implies that for all sufficiently large $x$, we have
(3) $\pi(x,\mathcal{A}) = N(x,x)$.
Moreover, for $\xi_1\le\xi_2\le x$, we see as in [2, Equation (3.1)] that
(4) $\pi(x,\mathcal{A}) = N(x,\xi_1) + O(M(x,\xi_1,\xi_2)) + O(M(x,\xi_2,x))$ $(x\to\infty)$,
where the O-term constants do not depend on $x$, $\xi_1$, $\xi_2$. We will choose $\xi_1=\xi_1(x)$, $\xi_2=\xi_2(x)$ so that
(5) $N(x,\xi_1) = d(\mathcal{A})\,x/\log x + o(x/\log x)$ $(x\to\infty)$,
(6) $M(x,\xi_1,\xi_2) = o(x/\log x)$ $(x\to\infty)$,
(7) $M(x,\xi_2,x) = o(x/\log x)$ $(x\to\infty)$.
It is clear that (5)-(7) together with (4) suffice to prove (1).
In order to prove (5), let us first note that a simple combinatorial argument implies that
(8) $N(x,\xi_1) = \sum_k^{*} \mu(k)P(x,k)$,
where $\sum_k^{*}$ is over all those $k\in S^*$ whose prime factors are all $\le\xi_1$.
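The combinatorial argument behind (8) is inclusion-exclusion over squarefree $k$. The following is a minimal sketch in a toy setting, assuming the cyclotomic model $L_q = Q(\zeta_q)$, where a prime $p$ splits completely in $L_k$ exactly when $p\equiv 1 \pmod k$. The function names are illustrative only and do not come from the paper.

```python
from itertools import combinations
from math import prod


def primes_up_to(n):
    """Return all primes <= n by a simple sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def count_direct(x, qs):
    """Primes p <= x with p not congruent to 1 modulo any q in qs."""
    return sum(1 for p in primes_up_to(x) if all(p % q != 1 for q in qs))


def count_by_mobius(x, qs):
    """Same count via sum over squarefree k of mu(k) * P(x, k).

    qs must be distinct primes; k runs over products of subsets of qs,
    and P(x, k) counts primes p <= x with p = 1 (mod k).
    """
    primes = primes_up_to(x)
    total = 0
    for r in range(len(qs) + 1):
        for subset in combinations(qs, r):
            k = prod(subset)  # k = 1 for the empty subset
            p_xk = sum(1 for p in primes if (p - 1) % k == 0)
            total += (-1) ** r * p_xk  # mu(k) = (-1)^r for squarefree k
    return total
```

For example, `count_direct(5000, [3, 5, 7])` and `count_by_mobius(5000, [3, 5, 7])` return the same integer, which is the toy analogue of identity (8).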
Let $\pi'_k(x)$ = the number of $L_k$-primes $\mathfrak{p}$ such that $N\mathfrak{p}\le x$ and $\mathfrak{p}$ is of absolute degree 1 and unramified in $L_k/Q$, and let $\pi''_k(x) = \pi_k(x) - \pi'_k(x)$. Note that $\pi'_k(x) = n(k)P(x,k)$. Moreover, since the primes counted by $\pi''_k(x)$ either ramify in $L_k/Q$ or are of degree $>1$, we see that
(9) $\pi''_k(x) \le 2\log|d_k| + n(k)\sum_{p^2\le x} 1 \le 2\log|d_k| + n(k)x^{1/2} = O(n(k)k^a + n(k)x^{1/2})$
by condition (i), where the O-term constant does not depend on $L_k$ or $x$. Let $\theta(x) = \sum_{p\le x}\log p$ be the classical Tchebycheff function, where $p$ runs over the primes. Then it is well-known that $\theta(x)\le 2x$ for $x\ge 1$. Thus, all $k$ occurring in the sum $\sum_k^{*}$ are at most $\exp(2\xi_1)$. Let us choose $\xi_1=\xi_1(x)$ so that
(10) $\xi_1(x)\to\infty$ as $x\to\infty$,
(11) $\sup_{k\le\exp(2\xi_1)} k^a\le x$,
(12) $\exp(2\xi_1)\le x^{1/2}/\log^3 x$.
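For such $\xi_1$, for $k\le\exp(2\xi_1)$, we have
$$\begin{aligned}
P(x,k) &= \pi'_k(x)/n(k)\\
&= \pi_k(x)/n(k) - \pi''_k(x)/n(k)\\
&= \pi_k(x)/n(k) + O(k^a + x^{1/2}) \quad \text{(by (9))}\\
&= \pi_k(x)/n(k) + O(x^{1/2}) \quad \text{(by (12))}.
\end{aligned}$$
Therefore, by (8),
(13) $N(x,\xi_1) = \sum_k^{*}\mu(k)P(x,k) = \sum_k^{*}\frac{\mu(k)\pi_k(x)}{n(k)} + O\Big(\sum_k^{*} x^{1/2}\Big)$.
Let us define $\psi_k(x)$ by
$$\psi_k(x) = \sum_{N\mathfrak{p}^m\le x}\log N\mathfrak{p}, \qquad \mathfrak{p} \text{ a prime ideal of } L_k.$$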
Then by assuming the Riemann hypothesis for the Dedekind zeta function of $L_k$, Lang [4, Equation (11RH)] has shown that
(14) $\psi_k(x) = x + O(x^{1/2}\log x\,\log(|d_k|x^{n(k)}))$,
where the O-term constant is absolute. By the usual trick for calculating $\pi_k(x)$ from $\psi_k(x)$, we derive from (14) that
(15) $\pi_k(x) = \mathrm{Li}(x) + O(x^{1/2}\log(|d_k|x^{n(k)}))$,
where $\mathrm{Li}(x) = \int_2^x dy/\log y$. From equations (13) and (15), we derive that
(16) $N(x,\xi_1) = \Big[\sum_k^{*}\frac{\mu(k)}{n(k)}\Big]\mathrm{Li}(x) + O\Big(x^{1/2}\sum_k^{*}\log(|d_k|^{1/n(k)}x)\Big) + o\Big(\frac{x}{\log x}\Big)$.
Since all those $k\le\xi_1$ belonging to $S^*$ are included in $\sum_k^{*}$ and since $\sum_{k\in S^*} 1/n(k)$ converges, we see that since $\xi_1(x)\to\infty$ as $x\to\infty$, we have
(17) $\sum_k^{*}\frac{\mu(k)}{n(k)} = \sum_{k\in S^*;\,k\le\xi_1}\frac{\mu(k)}{n(k)} + O\Big(\sum_{k\in S^*;\,k>\xi_1}\frac{1}{n(k)}\Big) = \sum_{k\in S^*}\frac{\mu(k)}{n(k)} + O\Big(\sum_{k\in S^*;\,k>\xi_1}\frac{1}{n(k)}\Big)$.
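Moreover, by condition (i) we see that
$$\begin{aligned}
O\Big(x^{1/2}\sum_k^{*}\log(|d_k|^{1/n(k)}x)\Big) &= O\Big(x^{1/2}\sum_k^{*}\log(k^a x)\Big)\\
&= O\Big(x^{1/2}\log x\sum_k^{*} 1\Big) \quad \text{(by (11))}\\
&= O(\exp(2\xi_1)\,x^{1/2}\log x)\\
&= o(x/\log x) \quad \text{(by (12))}.
\end{aligned}$$
This chain of estimates is (18). By equations (16), (17), (18) we finally see that
$$N(x,\xi_1) = d(\mathcal{A})\,x/\log x + o(x/\log x),$$
which proves (5).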
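Let us now prove (6) for $\xi_2 = x^{1/2}/\log^2 x$. It is easy to see that
(19) $M(x,\xi_1,\xi_2)\le\sum_{\xi_1<q<\xi_2} P(x,q)$.
Moreover, the definition of $P(x,q)$ and some elementary reasoning imply that
$$P(x,q)\le\pi_q(x)/n(q).$$
Therefore, on the Riemann hypothesis, (15) implies that
(20) $P(x,q)\le\mathrm{Li}(x)/n(q) + O(x^{1/2}\log(|d_q|^{1/n(q)}x))$.
However, since $\sum_q 1/n(q)$ converges and since $\xi_1(x)\to\infty$ as $x\to\infty$, we see that
(21) $\sum_{\xi_1<q<\xi_2}\frac{1}{n(q)}\mathrm{Li}(x) = o\Big(\frac{x}{\log x}\Big)$.
Moreover, by condition (i),
$$\begin{aligned}
O\Big(x^{1/2}\sum_{\xi_1<q<\xi_2}\log(|d_q|^{1/n(q)}x)\Big) &= O\Big(x^{1/2}\sum_{\xi_1<q<\xi_2}\log(q^a x)\Big)\\
&= O\Big(x^{1/2}\log x\sum_{\xi_1<q<\xi_2} 1\Big)\\
&= O(x^{1/2}\log x\,(x^{1/2}/\log^3 x))\\
&= o(x/\log x).
\end{aligned}$$
This chain of estimates is (22). Estimates (19)-(22) suffice to prove (6).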
Finally, let us prove (7). Here is where we will make use of the hypothesis that $L_q/Q$ is abelian for all sufficiently large $q$. Without loss of generality, let us assume that $L_q/Q$ is abelian for all $q\ge\xi_2$. In analogy to (19), we have
(23) $M(x,\xi_2,x)\le\sum_{\xi_2<q<x} P(x,q)$.
Let $f_q$ denote the conductor of $L_q/Q$. Then it is well known that there exist rational integers $a_1(q),\ldots,a_t(q)$ such that a rational prime $p$ splits completely in $L_q/Q$ if and only if $p\equiv a_i(q) \pmod{f_q}$ for some $i$ $(1\le i\le t)$. Thus, $t$ is such that the density of primes which split completely in $L_q/Q$ is just $t/\varphi(f_q)$. However, by elementary density considerations, this implies that $t = \varphi(f_q)/n(q)$. However, we then have
(24) p^q)zt.i*m.xzi
Combining (23) with (24), we have
^2<%xn(q) \log2xJ
by condition (iii) and the choice of $\xi_2$. This proves (7), and with it the Theorem. □
Bibliography
1. E. Artin, The collected papers of Emil Artin, Edited by S. Lang and J. T. Tate, Addison-Wesley,
Reading, Mass., 1965. MR 31 #1159.
2. L. Goldstein, Analogues of Artin's conjecture, Trans. Amer. Math. Soc. 149 (1970), 431-442.
MR 43 #4792.
3. C. Hooley, On Artin's conjecture, J. Reine Angew. Math. 225 (1967), 209-220. MR 34 #7445.
4. S. Lang, On the zeta functions of number fields, Invent. Math. 12 (1971), 337-345.
5. P. Weinberger, Counterexample to an analogue of Artin's conjecture, Proc. Amer. Math. Soc. 35
(1972), 49-52.
University of Maryland, College Park
RELATIONS BETWEEN THE VALUES
AT INTEGRAL ARGUMENTS
OF DIRICHLET SERIES THAT
SATISFY FUNCTIONAL EQUATIONS1
E. GROSSWALD
1. Introduction. Let $m$ stand for a natural integer, $\zeta(s)$ ($s=\sigma+it$, $\sigma$, $t$ real) for Riemann's zeta function and $B_k$ for the $k$th Bernoulli number, respectively. Then it is classical that $\zeta(-2m)=0$, $\zeta(1-2m) = -B_{2m}/2m$, and $(2\pi)^{-2m}\zeta(2m) = (-1)^{m-1}B_{2m}/2(2m)!$, where the second members are all rational values, but practically nothing is known about $\zeta(2m-1)$. In [3] it was shown that
$$(2\pi)^{-(2m-1)}\zeta(2m-1) = \sum_{k=0}^{m} a_kB_{2k}B_{2m-2k} - R_m,$$
where the $a_k$ are rational and $R_m = (2\pi)^{-(2m-1)}G(i)$, with
$$G(\tau) = \sum_{k=1}^{\infty} c_ke^{2\pi ik\tau},$$
(1) $c_k = 2\bigl(1+\pi k(1-(-1)^m)/(m-1)\bigr)\sum_{d\mid k} d^{1-2m}$.
The "remainder term" $R_m$ is quite small and, for $m\to\infty$, it approaches zero very rapidly, but it seems difficult to determine its arithmetic nature, i.e. whether it is rational, algebraic, or transcendental. For this reason, the following result [4] is rather remarkable.
Let $\chi = \chi(k)$ be a nonprincipal, primitive residue class character modulo the
AMS 1970 subject classifications. Primary 10H05, 10H10, 12A70, 12A95, 44A15, 44A20;
Secondary 13D15, 10A40, 30A20.
1 This paper has been written with partial support from the National Science Foundation
through grant GP-23170.
© 1973, American Mathematical Society
natural integer $f>1$, and denote the conjugate character by $\bar\chi$. In analogy to (1), we define the coefficients
ck(X) = 2X(k)(l + nk(l-(-ir)/(m-l)f) £ j^*/1"2"
d\k
and the function $G(\tau,\chi) = \sum_{k=1}^{\infty} c_k(\chi)e^{2\pi ik\tau/f}$. For $f=1$, $c_k(\chi)$ and $G(\tau,\chi)$ reduce to the previously defined $c_k$ and $G(\tau)$, respectively.
Now the correspondingly defined Rm(x) satisfies
Rm(x)=(2n)l-2mG(i,X)=r2m+3'2 'l1 bkBkxB2m-\
k = 0
where the coefficients $b_k$ are rational and the $B_\chi^k$, Leopoldt's generalized Bernoulli numbers, are algebraic, so that also $R_m(\chi)$ is algebraic. In fact, if $f>1$ and $\chi(k)$ is a real, primitive character (mod $f$), then $R_m(\chi)f^{-1/2}$ is actually rational. If it were possible to drop the conditions "nonprincipal, primitive" concerning the character, then it would follow that $\pi^{-a}\zeta(a)$ would be rational for all positive integral arguments $a$, which is very unlikely.
The purpose of the present paper is to extend these considerations to a rather wide class of functions, representable in a halfplane by Dirichlet series. The results will apply, in particular, to the Dedekind zeta functions $\zeta_K(s)$ of totally real fields $K$. The corresponding conclusions seem to indicate that it may be profitable to study the arithmetical nature of $\pi^{-(2m-1)}\zeta(2m-1)$ and, more generally, of
$$\pi^{-(2m-1)n}\zeta_K(2m-1),$$
where $\zeta_K(s)$ is the Dedekind zeta function of an algebraic number field $K$, of degree $n$ over the field $Q$ of rationals. This should be contrasted with some consequences of a conjecture of Lichtenbaum [9], which point towards $\pi^{2-2m}\zeta(2m-1)$, and, more generally, towards $\pi^{(2-2m)n}\zeta_K(2m-1)$ as more likely to be of arithmetical significance. So, e.g., according to Lichtenbaum, $\pi^{-2}\zeta(3) = c\,\varepsilon(\varphi(u))$, with integral $c$, and where $u$ is a generator of $K_5Z$, $\varepsilon(\cdot)$ is the canonical homomorphism $K_5(C)\to R$ and $\varphi$ the map $K_5Z\to K_5C$ induced by the inclusion $Z\subset C$. On the other hand, by [3], $\pi^{-3}\zeta(3) = 7/180 - 2\pi^{-3}\sum_{k=1}^{\infty}(k^3(e^{2\pi k}-1))^{-1}$.
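The closing identity can be checked numerically in the equivalent classical normalization $\zeta(3) = 7\pi^3/180 - 2\sum_{k\ge1} 1/(k^3(e^{2\pi k}-1))$. The sketch below is only a floating-point sanity check, not part of the paper; the constant `ZETA3` is a hard-coded double-precision value of $\zeta(3)$ and the function name is illustrative.

```python
import math

# Double-precision value of zeta(3) (Apery's constant), hard-coded as an assumption.
ZETA3 = 1.2020569031595942


def zeta3_from_lambert_series(terms=20):
    """Evaluate 7*pi^3/180 - 2 * sum_{k>=1} 1 / (k^3 * (e^{2 pi k} - 1)).

    The series converges extremely fast, so a handful of terms suffices.
    math.expm1(y) computes e^y - 1 accurately.
    """
    tail = sum(1.0 / (k ** 3 * math.expm1(2.0 * math.pi * k))
               for k in range(1, terms + 1))
    return 7.0 * math.pi ** 3 / 180.0 - 2.0 * tail


difference = abs(zeta3_from_lambert_series() - ZETA3)
```

Already the first term of the series accounts for almost all of the correction, which reflects how small the "remainder term" $R_m$ is.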
Following §2 with the notations, the main results are stated in §3. §4 contains
an exposition of the general method and a main lemma. The proofs of the theorems
and corollaries are collected in § 5.
2. Notations. Throughout, $K$ stands for an algebraic number field of degree $n$ over the field $Q$ of rational numbers, $n = 2r_2+r_1$ ($r_1$ = number of real, $2r_2$ = number of complex conjugates), with discriminant $d$, class number $h$, regulator $R$, ring of integers $O_K$ and containing $w$ roots of unity. We also set
$$A = 2^{-r_2}\pi^{-n/2}|d|^{1/2} \quad\text{and}\quad \varrho = 2^{r_1+r_2}\pi^{r_2}Rhw^{-1}|d|^{-1/2}.$$
$N\mathfrak{a}$ denotes the norm of the ideal $\mathfrak{a}\subset K$ and the Dedekind zeta function of $K$ is defined for $\sigma>1$ by $\zeta_K(s) = \sum_{\mathfrak{a}\subset K} N\mathfrak{a}^{-s}$. We denote by $g(k; n; m)$ Meijer functions (see [1, pp. 206-222]) depending on the indicated parameters. They generalize the exponential function to which they reduce for $K=Q$. $S(\tau)$ stands for the function defined by (4). The symbol $\int_{(\sigma)}\cdots$ will be used for $\lim_{T\to\infty}\int_{\sigma-iT}^{\sigma+iT}\cdots$ and $\sigma_u(n)$ for $\sum_{d\mid n} d^u$.
3. Main results.
Theorem 1. For every positive, integral $m$, the three quantities
$$|d|^{1/2}\pi^{-2mn}\zeta_K(2m), \qquad \pi^{-(2m-1)n-r_2}\zeta_K(2m-1) \qquad\text{and}\qquad R^{-1}\pi^{-n(2m-1/2)}S'(1)$$
are dependent over the rational integers.
Corollary 1.1. In case $K$ is totally real,
$$\pi^{-(2m-1)n}\zeta_K(2m-1) = c_1 + c_2S'(1)R^{-1}\pi^{-n(2m-1/2)}$$
with $c_1$ and $c_2$ explicitly known, rational and depending only on $m$ and the field $K$.
Corollary 1.2. In case $K$ is totally real, the quantities $\pi^{-(2m-1)n}\zeta_K(2m-1)$ and $S'(1)R^{-1}\pi^{-n(2m-1/2)}$ have the same arithmetical nature.
Theorem 2. Let $\gamma_k = \gamma_k(m) = \sum_{N\mathfrak{a}=k}\sum_{\mathfrak{b}\mid\mathfrak{a}}(N\mathfrak{b})^{-(2m-1)}$; then
$$S'(1) = \sum_{k=1}^{\infty}\gamma_k\{g^{(1)}(k;n;m) + g^{(2)}(k;n;m)\}$$
where $g^{(i)}(k;n;m)$ $(i=1,2)$ are Meijer functions.
constants c3, c4, c5, the equality
where
$$F_1(\tau) = \frac{1}{2\pi i}\int_{(\sigma)}\Psi(s)\Psi(s+4m-1)(2s+4m-2)^{-1}(\tau/i)^{-s}\,ds,$$
with $\Psi(s) = A^s\{\Gamma(s/2)\}^n\zeta_K(s)$.
For a function $F_2(\tau)$, closely related to $F_1(\tau)$, one has some information about the arithmetical nature of $F_2(i)$.
Let $\mathfrak{f}$ be an integral ideal of $K$ and let $\chi = \chi(\mathfrak{a})$ be a nonprincipal, primitive ray-class character modulo $\mathfrak{f}$. The functions $L(s,\chi) = \sum_{\mathfrak{a}\subset K}\chi(\mathfrak{a})N\mathfrak{a}^{-s}$ are entire and satisfy (see [6]) a functional equation of the customary type, that involves an algebraic number $W(\chi)$, of absolute value one. It also is known (see [7]) that $L(2m,\chi) = \pi^{2mn}d^{-1/2}\alpha$, with $\alpha$ algebraic and belonging to the cyclotomic field generated by the values of $\chi(\mathfrak{a})$ and the roots of unity of order $N\mathfrak{f}$. We now set
$$\Psi(s,\chi) = A^s\{\Gamma(s/2)\}^nL(s,\chi)$$
and define $F_2(\tau)$ as before, but by using $\Psi(s,\chi)$ instead of $\Psi(s)$. $F_2(\tau)$ depends, of course, also on $a = 2m-1$ and on $\chi(\mathfrak{a})$, but, for simplicity, this will not be indicated in the notation.
Theorem 4. For $a = 4m-1$ and $\chi(\mathfrak{a})$ a nonprincipal, primitive ray-class character modulo $\mathfrak{f}$, the function $F_2(\tau)$ satisfies
$$F_2(i) = \tfrac12W(\chi)(d\cdot N\mathfrak{f}\cdot\pi^{-n}4^{-r_2})^{1/2}\{\Gamma(m)\}^{2r_1}\{\Gamma(2m)\}^{2r_2}L^2(2m,\chi).$$
Corollary 4.1. With the same notations and under the same conditions as in Theorem 4, if $K$ is totally real, then
$$F_2(i) = \tfrac12W(\chi)(d\cdot N\mathfrak{f}\cdot\pi^{-n})^{1/2}\{\Gamma(m)\}^{2n}L^2(2m,\chi).$$
In particular, $F_2(i) = \pi^{(4m-1/2)n}\beta$ with $\beta$ algebraic.
Corollary 4.2. If the conclusions of Theorem 4 and Corollary 4.1 hold also for $\chi(\mathfrak{a})$ a principal character, then $\pi^{-n(4m-1)}\zeta_K(4m-1)$ is algebraic.
4. The general method. Let $\Delta(s) = \prod_{\nu=1}^{N}\Gamma(\alpha_\nu s+\beta_\nu)$, with $\alpha_\nu>0$, $\beta_\nu\in C$, let $\varphi(s) = \sum_{k=1}^{\infty} a_ke^{-\lambda_ks}$ be a Dirichlet series, convergent for $\sigma\ge\sigma_0>0$, and assume that for some real (not necessarily positive) $r$,
$$\varphi(s)\Delta(s) = \varphi(r-s)\Delta(r-s).$$
Following a tradition that goes back to Riemann, we consider also a rational function $P(s)$ that satisfies the identity $P(s) = (-1)^\delta P(r-s)$ with $\delta=0$ or $\delta=1$. Then also
(2) $\varphi(s)\Delta(s)P(s) = (-1)^\delta\varphi(r-s)\Delta(r-s)P(r-s)$
holds. Just as in the case of Riemann's factor $s(s-1)$, the purpose of $P(s)$ is to kill poles, create poles or modify their order. If we denote the left-hand side of (2) by $\Phi(s)$, then (2) reads
(2') $\Phi(s) = (-1)^\delta\Phi(r-s)$.
Most of what follows holds also in the more general setting of two distinct Dirichlet series, satisfying a functional equation, as studied by Chandrasekharan, Narasimhan, Berndt and others (see [2] and papers quoted there), but in all intended applications the two series coincide; hence the greater generality is not needed here. Moreover, all functions that will occur will be meromorphic for $\sigma>\sigma_1$, where $\sigma_1 = r-\sigma_0-\varepsilon$, with some $\varepsilon>0$, sufficiently small, but fixed.
For $\mathrm{Im}\,\tau>0$ one can define the pair of reciprocal Mellin formulae
(3) $F(\tau) = \frac{1}{2\pi i}\int_{(\sigma_2)}\Phi(s)(\tau/i)^{-s}\,ds, \qquad \Phi(s) = -i\int_0^{i\infty}F(\tau)(\tau/i)^{s-1}\,d\tau,$
where $\sigma_2 = \sigma_0+\varepsilon$. In the first formula of (3) we make the change of variable $s\leftrightarrow r-s$, recall that $\sigma_1 = r-\sigma_0-\varepsilon$, use (2'), and finally shift the line of integration back to $\sigma_2$ by taking into account the residues at the poles in the strip $\sigma_1\le\sigma\le\sigma_2$. Very mild assumptions on $\varphi(s)$ (e.g. $|\varphi(\sigma+it)|<|t|^c$ for $\sigma_1\le\sigma\le\sigma_2$, $|t|\ge t_0$ is more than needed) and $\Delta(s)$ are sufficient to justify these operations (see, e.g. [3]).
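In this way, if $\delta=0$ we obtain successively:
$$\begin{aligned}
F(\tau) &= \frac{1}{2\pi i}\int_{(\sigma_1)}\Phi(r-s)(\tau/i)^{s-r}\,ds = \frac{1}{2\pi i}\int_{(\sigma_1)}\Phi(s)(\tau/i)^{s-r}\,ds\\
&= \frac{1}{2\pi i}\int_{(\sigma_2)}\Phi(s)(\tau/i)^{s-r}\,ds - \sum_{\sigma_1\le\sigma\le\sigma_2}\mathrm{Res}\{\Phi(s)(\tau/i)^{s-r}\}\\
&= (\tau/i)^{-r}\Big\{\frac{1}{2\pi i}\int_{(\sigma_2)}\Phi(s)(\tau/i)^{s}\,ds - \sum_{\sigma_1\le\sigma\le\sigma_2}\mathrm{Res}(\Phi(s)(\tau/i)^{s})\Big\}\\
&= (\tau/i)^{-r}F(-1/\tau) - (\tau/i)^{-r}S(\tau/i)
\end{aligned}$$
or
$$F(-1/\tau) - (\tau/i)^rF(\tau) = S(\tau/i),$$
with
(4) $S(u) = \sum_{\sigma_1\le\sigma\le\sigma_2}\mathrm{Res}\{\Phi(s)u^s\}$.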
If we now set $\tau = it$, $t>0$, and observe that
$$\lim_{t\to1}\{F(it) - t^{-r}F(i/t)\}/(t-1) = 2iF'(i) + rF(i),$$
we obtain
$$2iF'(i) + rF(i) = -S'(1).$$
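The differentiation step behind this relation can be spelled out as follows. This is only a sketch: it starts from the relation $F(-1/\tau)-(\tau/i)^rF(\tau)=S(\tau/i)$ obtained above, specialized to $\tau = it$, and assumes that $F$ is differentiable at $\tau = i$.

```latex
\begin{align*}
F(i/t) - t^{r}F(it) &= S(t) \qquad (t>0), \qquad\text{so in particular } S(1)=0,\\
\frac{d}{dt}\Bigl[F(i/t) - t^{r}F(it)\Bigr]_{t=1}
  &= -iF'(i) - rF(i) - iF'(i)
   = -\bigl(2iF'(i) + rF(i)\bigr),\\
\text{hence}\qquad 2iF'(i) + rF(i) &= -S'(1).
\end{align*}
```

The first term comes from $\frac{d}{dt}F(i/t) = -\,i\,t^{-2}F'(i/t)$, the other two from the product rule applied to $t^rF(it)$.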
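If (2) and (2') hold with $\delta=1$, then, proceeding as before, one obtains
$$F(\tau) = -\frac{1}{2\pi i}\int_{(\sigma_2)}\Phi(s)(\tau/i)^{s-r}\,ds + \sum_{\sigma_1\le\sigma\le\sigma_2}\mathrm{Res}\{\Phi(s)(\tau/i)^{s-r}\},$$
whence
$$F(-1/\tau) + (\tau/i)^rF(\tau) = S(\tau/i).$$
For $\tau=i$ this yields the simple relation
$$2F(i) = S(1).$$
Clearly, $S(1) = \sum_{\sigma_1\le\sigma\le\sigma_2}\mathrm{Res}\,\Phi(s)$ and $S'(1) = \sum_{\sigma_1\le\sigma\le\sigma_2}\mathrm{Res}\{s\Phi(s)\}$. We formulate these results as a
Main Lemma. With $F(\tau)$ defined by (3),
$$F(-1/\tau) - (-1)^\delta(\tau/i)^rF(\tau) = S(\tau/i);$$
in particular,
$$2iF'(i) + rF(i) = -\sum_{\sigma_1\le\sigma\le\sigma_2}\mathrm{Res}\{s\Phi(s)\} \quad\text{if } \delta=0,$$
$$2F(i) = \sum_{\sigma_1\le\sigma\le\sigma_2}\mathrm{Res}\{\Phi(s)\} \quad\text{if } \delta=1.$$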
By proper choice of the function $\Phi(s)$, the residues can be computed and depend on the values of $\varphi(s)$ at integral arguments. Hence, at least in principle, the Main Lemma permits us to obtain relations between several values of $\varphi(s)$ for $s$ an integer. These relations can be used in a variety of ways.
Particular cases of this Main Lemma have already occurred in the literature. The case $r=0$, $\Delta(s)=\Gamma(s)$, $P\equiv1$, $\varphi(s) = (2\pi)^{-s}\zeta(s)\zeta(s+1) = \sum_{k=1}^{\infty}\sigma_{-1}(k)e^{-\lambda_ks}$, with $\lambda_k = \log(2\pi k)$, has been used by A. Weil [10] in order to give a proof of the transformation formula for Dedekind's $\eta$-function.
Next, with the same $\lambda_k$, with
$$\Delta(s) = \Gamma(s/2)\Gamma((s+a)/2), \qquad P = 2^{1-m}(s+1)(s+3)\cdots(s+a-2)$$
and with $r=1-a$, $a=2m-1$ an odd, positive integer and $\varphi(s) = \sum_{k=1}^{\infty}\sigma_{-a}(k)e^{-\lambda_ks}$, this Main Lemma was used in [3] to gain information about $\zeta(2m-1)$. On the other hand, in [4], the Main Lemma was used with $r=1-a$, the same $\Delta(s)$ and $P(s)$ as above and $\varphi(s) = (2\pi/f)^{-s}L(s,\chi)L(s+a,\chi) = \sum_{k=1}^{\infty}a_k(\chi)e^{-\mu_ks}$ ($\mu_k = \log(2\pi k/f)$; $a_k(\chi) = a_k(\chi;a)$ certain coefficients depending on $k$, the nonprincipal, primitive character $\chi$ (mod $f$) and the odd integer $a$; $L(s,\chi) = \sum_{k=1}^{\infty}\chi(k)k^{-s}$), in order to obtain information about the arithmetic character of $F(i)$ and $2iF'(i)+rF(i)$, with the result that (for $\chi$ primitive, nonprincipal only!) $F(i)$, or $2iF'(i)-(a-1)F(i)$, respectively, are of the form $\pi^aC$, with $C$ an explicitly known algebraic number.
5. Proofs. We shall consider now some applications of the Main Lemma
that lead directly to the proofs of the theorems and corollaries of §3.
5.1. Proofs of Theorem 1 and its corollaries. Let $K$ be an algebraic extension of $Q$ of degree $n = r_1+2r_2$, and let $\zeta_K(s)$ be the Dedekind zeta function of $K$. We recall that (see [5]) $\Psi(s) = A^s\{\Gamma(s)\}^{r_2}\{\Gamma(s/2)\}^{r_1}\zeta_K(s)$ satisfies the functional equation
(5) $\Psi(s) = \Psi(1-s)$.
For odd, rational integer $a = 2m-1\ge3$, we now set
(6) $\Phi(s) = \Psi(s)\Psi(s+a)$.
Then, by (5) and (6),
$$\Phi(s) = \Psi(1-s)\Psi(1-s-a) = \Psi(1-s-a)\Psi((1-s-a)+a) = \Phi(1-s-a) = \Phi(r-s),$$
with $r = 1-a$. Also,
(7) $\Phi(s) = A^{2s}\Delta(s)\sum_{k=1}^{\infty}c_kk^{-s} = \Delta(s)\sum_{k=1}^{\infty}c_ke^{-\lambda_ks}$,
with $\Delta(s) = \{\Gamma(s)\Gamma(s+a)\}^{r_2}\{\Gamma(s/2)\Gamma((s+a)/2)\}^{r_1}$, $\lambda_k = \log(kA^{-2})$ and $c_k = \sum_{N\mathfrak{a}=k}\sum_{\mathfrak{b}\mid\mathfrak{a}}(N\mathfrak{b})^{-a}$. Clearly, the Main Lemma is applicable. It remains to compute $S(\tau/i)$ and
$$F(\tau) = \frac{1}{2\pi i}\int_{(\sigma_2)}\Phi(s)(\tau/i)^{-s}\,ds.$$
We recall that $\zeta_K(s)$ has (see, e.g. [5]) a simple pole at $s=1$, with residue $\varrho$, a zero of order $r_1+r_2-1$ at $s=0$ and zeros of orders $r_2$ and $r_1+r_2$ at the negative odd and negative even integers, respectively. Perusal of this information leads to the simple (and, because of past experience, somewhat unexpected) result that $\Phi(s)$ is holomorphic in the strip $\sigma_1\le\sigma\le\sigma_2$, except for the four poles at $s = -a$, $1-a$, $0$, and $1$. The corresponding residues are computed routinely, with the result that
$$-S'(1) = 2^{r_1}hRw^{-1}\{(a+1)|d|^{1/2}\pi^{-n/2}2^{-r_2}\Gamma(a+1)^{r_2}\Gamma((a+1)/2)^{r_1}\zeta_K(a+1) - (a-1)\Gamma(a)^{r_2}\Gamma(a/2)^{r_1}\zeta_K(a)\},$$
or, equivalently, with $a = 2m-1$,
(8) $|d|^{1/2}\pi^{-2mn}\zeta_K(2m) - C_1(m,K)\pi^{-(2m-1)n-r_2}\zeta_K(2m-1) = -C_2(m,K)R^{-1}\pi^{-n(2m-1/2)}S'(1)$,
where $C_1(m,K) = (1-m^{-1})(m-1/2)^{-r_2}\{4^{1-m}\binom{2m-2}{m-1}\}^{r_1}$ ($\binom{2m-2}{m-1}$ = binomial coefficient) and
$$C_2(m,K) = wh^{-1}2^{r_1+r_2-1}m^{-1}\{4(m-1)!\}^{-r_1}\{(2m-1)!\}^{-r_2}$$
are rational numbers. This proves Theorem 1.
More can be said if we restrict our attention to totally real fields $K$. In that case $r_1=n$, $r_2=0$, $w=2$,
$$C_1(m,K) = (1-m^{-1})\{4^{1-m}\tbinom{2m-2}{m-1}\}^{n}, \qquad C_2(m,K) = \{(2(m-1)!)^nhm\}^{-1},$$
and it is known (see e.g. [7]) that $d^{1/2}\pi^{-2mn}\zeta_K(2m) = C_3(m,K)$ is also rational. Equation (8) becomes, with appropriate rational constants $c_1$, $c_2$ (depending on $m$ and $K$ only)
(8') $\pi^{-(2m-1)n}\zeta_K(2m-1) = c_1 + c_2S'(1)R^{-1}\pi^{-n(2m-1/2)}$.
This proves Corollary 1.1. Corollary 1.2 is a simple reformulation of Corollary 1.1.
5.2. Proof of Theorem 2. The determination of the arithmetical nature of $\zeta_K(2m-1)$ has been reduced to that of $\{R\pi^{n(2m-1/2)}\}^{-1}S'(1)$. For $r=1-a$, $a=2m-1$, one has, from (3), (4) and the Main Lemma, that
$$\tfrac12S'(1) = (m-1)F(i) - iF'(i), \qquad F(i) = \frac{1}{2\pi i}\int_{(\sigma_2)}\Phi(s)\,ds, \qquad F'(i) = \frac{1}{2\pi}\int_{(\sigma_2)}s\Phi(s)\,ds.$$
If we now use (3) and (7), we obtain, with $x_k = k^{-2}A^4$,
irci J k=i k = i Zni J
(^2) (<T2)
(<T2) <<T2/2)
with rational coefficients $\gamma_k$.
In the case of a totally real field $K$, these integrals, and the similar ones for $iF'(i)$, can be obtained explicitly with the help of Meijer's G-functions (see [1, pp. 206-222]). Specifically,
^ | {r{s/2)r{{s^a)/2)Yx^ds=2'~ | {r(s)r(s + a/2)}»x^s
(<r2/2)
= 2G2#(xk|l, 1,..., 1, l-a/2, l-fl/2,..., l-a/2) = ^(/c; n;fl),
say, and $F(i) = \sum_{k=1}^{\infty}\gamma_kg(k;n;a)$. The expression $iF'(i)$ can be treated similarly and this finishes the proof of Theorem 2. For $K=Q$, these series reduce to the Lambert type series $\sum_{k=1}^{\infty}k^{-a}(e^{2\pi k}-1)^{-1}$ and $\sum_{k=1}^{\infty}e^{2\pi k}k^{1-a}(e^{2\pi k}-1)^{-2}$, respectively, encountered in [3].
5.3. Proof of Theorem 3. Let us consider the same situation as before with $a+1=4m$, $\Phi(s)$ defined by (6) and set
$$\Phi_1(s) = \Phi(s)(2s+a-1)^{-1}.$$
Then
$$\Phi_1(s) = -\Phi_1(1-a-s)$$
and $\Phi_1(s)$ has five simple poles at $s = -a$, $-a+1$, $-(a-1)/2$, $0$, and $1$. We set again $F_1(\tau) = (1/2\pi i)\int_{(\sigma)}\Phi_1(s)(\tau/i)^{-s}\,ds$. The computation of the residues is routine and (after division by 2) the Main Lemma with $\delta=1$ yields
(9) $$\begin{aligned}
F_1(i) &= 2^{r_1}(hR/w)(a-1)^{-1}\Gamma(a)^{r_2}\Gamma(a/2)^{r_1}\zeta_K(a)\\
&\quad - |d|^{1/2}2^{r_1-r_2}\pi^{-n/2}(hR/w)(a+1)^{-1}\Gamma(a+1)^{r_2}\Gamma((a+1)/2)^{r_1}\zeta_K(a+1)\\
&\quad + |d|^{1/2}2^{-r_2-2}\pi^{-n/2}\Gamma((a+1)/2)^{2r_2}\Gamma((a+1)/4)^{2r_1}\zeta_K((a+1)/2)^2.
\end{aligned}$$
If $K$ is totally real, then
$$\zeta_K(a+1) = \zeta_K(4m) = \pi^{4mn}d^{-1/2}c_{4m}, \qquad \zeta_K((a+1)/2) = \zeta_K(2m) = \pi^{2mn}d^{-1/2}c_{2m},$$
with rational constants $c_{4m}$ and $c_{2m}$, and (9) becomes
where B = 4'n(2m- 1) p1"4"^: ?)}"", k3 = 2"c4m/2m, k4 = {(2m-1)!}, and
A:5 = 4{(2ra — 1)!}~", are all rational. Finally, set c^Bkj and this finishes the
proof of Theorem 3.
5.4. Proof of Theorem 4 and its corollaries. While the arithmetical nature of $F_1(i)$ is unknown, for a closely related function $F_2(\tau)$ one knows that $F_2(i) = \pi^{n(4m-1/2)}\alpha$, with $\alpha$ algebraic.
With the same notations as before (see especially §4) let $\mathfrak{f}$ be an integral ideal of $K$ and let $\chi = \chi(\mathfrak{a})$ be a nonprincipal, primitive character of the group of ray-classes, i.e. of the congruence classes (mod $\mathfrak{f}$) of ideals in $O_K$ prime to $\mathfrak{f}$. The L-function $L(s,\chi) = \sum_{\mathfrak{a}\subset K}\chi(\mathfrak{a})N\mathfrak{a}^{-s}$ satisfies (see [6]) the functional
equation
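where $\Psi(s;\chi) = A_2^s\{\Gamma(s/2)\}^{r_1}\{\Gamma(s)\}^{r_2}L(s,\chi)$ with $A_2 = (d\cdot N\mathfrak{f}\cdot\pi^{-n}4^{-r_2})^{1/2}$, and $W(\chi) = C(\chi)(N\mathfrak{f})^{-1/2}$, with $C(\chi)$ a certain generalized Gaussian sum, $|W(\chi)|=1$, $W(\chi)$ algebraic. $L(s,\chi)$ is an entire function, different from zero for $\sigma\ge1$, and vanishes to the order $r_2$ at the odd negative integers and to the order $r_1+r_2$ at $s=0$ and at the even negative integers. We set
$$\begin{aligned}
\Phi_2(s) &= A_2^{-a}\Psi(s)\Psi(s+a)(2s+a-1)^{-1}\\
&= A_2^{2s}\{\Gamma(s/2)\Gamma((s+a)/2)\}^{r_1}\{\Gamma(s)\Gamma(s+a)\}^{r_2}L(s,\chi)L(s+a,\chi)(2s+a-1)^{-1}
\end{aligned}$$
and observe that $\Phi_2(1-a-s) = -\Phi_2(s)$, so that the Main Lemma applies with $\delta=1$. We set $F_2(\tau) = (1/2\pi i)\int_{(\sigma_2)}\Phi_2(s)(\tau/i)^{-s}\,ds$. It is clear that $\Psi(s,\chi)$ is an entire function; hence, $\Phi_2(s)$ has only a single pole at $s = (1-a)/2$. The corresponding residue is
$$\varrho_2 = W(\chi)A_2\,\Gamma((a+1)/4)^{2r_1}\Gamma((a+1)/2)^{2r_2}L((a+1)/2,\chi)^2.$$
Proceeding as before, we obtain
$$\begin{aligned}
F_2(\tau) &= \frac{1}{2\pi i}\int_{(\sigma_2)}\Phi_2(s)(\tau/i)^{-s}\,ds = \frac{1}{2\pi i}\int_{(\sigma_1)}\Phi_2(1-s-a)(\tau/i)^{s+a-1}\,ds\\
&= -\frac{1}{2\pi i}\int_{(\sigma_1)}\Phi_2(s)(\tau/i)^{s+a-1}\,ds\\
&= -\frac{1}{2\pi i}\int_{(\sigma_2)}\Phi_2(s)(\tau/i)^{s+a-1}\,ds + (\tau/i)^{(a-1)/2}\varrho_2,
\end{aligned}$$
or
$$F_2(\tau) + (\tau/i)^{a-1}F_2(-1/\tau) = (\tau/i)^{(a-1)/2}\varrho_2.$$
We set $\tau=i$ and $a+1=4m$ and obtain
$$2F_2(i) = \varrho_2 = W(\chi)(d\cdot N\mathfrak{f}\cdot\pi^{-n}4^{-r_2})^{1/2}\{\Gamma((a+1)/4)\}^{2r_1}\{\Gamma((a+1)/2)\}^{2r_2}\{L((a+1)/2,\chi)\}^2,$$
$$F_2(i) = \tfrac12W(\chi)(d\cdot N\mathfrak{f}\cdot\pi^{-n}4^{-r_2})^{1/2}\{\Gamma(m)\}^{2r_1}\{\Gamma(2m)\}^{2r_2}\{L(2m,\chi)\}^2.$$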
This finishes the proof of Theorem 4.
In the case of a totally real field, the last equation becomes
(10) $F_2(i) = \tfrac12W(\chi)(d\cdot N\mathfrak{f}\cdot\pi^{-n})^{1/2}\{\Gamma(m)\}^{2n}\{L(2m,\chi)\}^2$.
However, it is known (see [8]) that $\pi^{-2mn}L(2m,\chi)$ is algebraic, so that (10) becomes
$$F_2(i) = \pi^{-n/2}\cdot\pi^{4mn}\cdot\beta,$$
with $\beta$ algebraic, and this finishes the proof of Corollary 4.1.
Finally, Corollary 4.2 follows immediately from Theorem 3 and Corollary 4.1.
Bibliography
1. A. Erdélyi et al., Higher transcendental functions. Vol. 1. The hypergeometric function, Legendre functions, McGraw-Hill, New York, 1953. MR 15, 419.
2. B. C. Berndt, Identities involving the coefficients of a class of Dirichlet series. III, Trans. Amer. Math. Soc. 146 (1969), 323-348. MR 40 #5551.
3. E. Grosswald, Die Werte der Riemannschen Zetafunktion an ungeraden Argumentstellen, Nachr. Akad. Wiss. Göttingen Math. Phys. Kl. II 1970, 9-13. MR 42 #7606.
4. E. Grosswald, Remarks concerning the values of the Riemann zeta function at integral odd arguments, J. Number Theory 4 (1972), 225-235.
5. E. Hecke, Über die Zetafunktion beliebiger algebraischer Zahlkörper, Nachr. Ges. Wiss. Göttingen Math.-Phys. Kl. 1917, 77-89; Math. Werke, pp. 159-171.
6. E. Hecke, Über die L-Funktionen und den Dirichletschen Primzahlsatz für einen beliebigen Zahlkörper, Nachr. Ges. Wiss. Göttingen Math.-Phys. Kl. 1917, 299-318; Math. Werke, pp. 178-197.
7. H. Klingen, Über die Werte der Dedekindschen Zetafunktion, Math. Ann. 145 (1961/62), 265-272. MR 24 #A3138.
8. H. Klingen, Über den arithmetischen Charakter der Fourierkoeffizienten von Modulformen, Math. Ann. 147 (1962), 176-188. MR 25 #2041.
9. S. Lichtenbaum, Private communication.
10. A. Weil, Sur une formule classique, J. Math. Soc. Japan 20 (1968), 400-402. MR 37 #155.
Temple University
ON THE INCOMPATIBILITY OF
TWO CONJECTURES CONCERNING PRIMES
DOUGLAS HENSLEY AND IAN RICHARDS
0. Introduction. This is an account of work published in more detail elsewhere (cf. [5]). We show that the conjecture
(A) $\pi(x+y)\le\pi(x)+\pi(y)$ (where $x$, $y$ are integers $\ge2$)
is incompatible with (B) the "prime $k$-tuples conjecture" (definition below). That is, at least one of these conjectures must be false. (We lean towards the opinion that the prime $k$-tuples conjecture is true, and (A) false.)
The inequality (A) can be written $\pi(x+y)-\pi(y)\le\pi(x)$. Thus (A) states that no interval of length $x$ contains more primes than the first $x$ integers.
The prime $k$-tuples conjecture is the natural extension of well-known hypotheses concerning "twin-primes" $X$, $X+2$, triples $X$, $X+2$, $X+6$, etc. It refers to a $k$-tuple of functions $X+b_1,\ldots,X+b_k$. The conjecture states:
(B) There are infinitely many integers $n>0$ for which all of the values $n+b_1,\ldots,n+b_k$ are prime if and only if
(*) for each prime $p$, there is some congruence class (mod $p$) which contains none of the constants $b_i$.
A sequence of integers $b_1<b_2<\cdots<b_k$ which satisfies (*) is called admissible.
(To form an admissible sequence on an interval of length $x$, eliminate one congruence class (mod 2), then one class (mod 3), one class (mod 5), etc. until the next prime exceeds the number of points which remain.)
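Condition (*) is a finite check: a prime $p$ larger than $k$ can never have all of its $p$ residue classes occupied by only $k$ constants, so only the primes $p\le k$ need to be tested. The following minimal sketch implements that test; the function names are illustrative and not taken from the paper.

```python
def primes_up_to(n):
    """Return all primes <= n by trial division (fine for small n)."""
    found = []
    for candidate in range(2, n + 1):
        if all(candidate % p != 0 for p in found if p * p <= candidate):
            found.append(candidate)
    return found


def is_admissible(bs):
    """Check condition (*): for every prime p some class mod p avoids all b_i.

    Only primes p <= k (k = number of distinct constants) can fail,
    because k constants occupy at most k residue classes.
    """
    bs = sorted(set(bs))
    for p in primes_up_to(len(bs)):
        occupied = {b % p for b in bs}
        if len(occupied) == p:  # every class mod p is hit
            return False
    return True
```

For instance, `is_admissible([0, 2, 6])` is true, while `is_admissible([0, 2, 4])` is false because 0, 2, 4 occupy all three classes modulo 3.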
Definition. We denote by $\rho^*(x)$ the maximum value of $k$ for which there is an admissible $k$-tuple $b_1<b_2<\cdots<b_k$ on an interval $y<b_i\le y+x$ of length $x$.
The prime $k$-tuples conjecture implies that infinitely often as $y\to\infty$ there are intervals $y<n\le y+x$ (of fixed length $x$) which contain $\rho^*(x)$ primes.
AMS 1970 subject classifications. Primary 10H15, 10H25, 10H30.
© 1973, American Mathematical Society
Thus if we set $\rho(x) = \limsup_{y\to\infty}[\pi(y+x)-\pi(y)]$, then $\rho(x)\le\rho^*(x)$, and the prime $k$-tuples conjecture implies $\rho(x) = \rho^*(x)$. It has been suggested that $\rho^*(x)\le\pi(x)$ for $x\ge2$ (cf. [2], [4], and [9]). However this is false (and the contrary result for $\rho^*(x)$ does not depend on any conjectures). We will show that:
(C) $\lim_{x\to\infty}\rho^*(x)-\pi(x) = +\infty$.
If we assume the prime $k$-tuples conjecture (B), then we obtain:
(C') For all sufficiently large $x$, there exist infinitely many $y$ such that $\pi(y+x)-\pi(y)>\pi(x)$.
(Of course these values of $y$ may be much larger than $x$; furthermore they are likely to be very sparse.)
Hardy and Littlewood (cf. [4]) introduced the function $\rho^*(x)$, and they observed that $\rho^*(x)\ge\pi(2x)-\pi(x)\sim x/\log x$. Our result shows that $\rho^*(x)>\pi(x)$ for large $x$. The most that is known concerning an upper bound for $\rho^*(x)$ follows via the sieve method of Brun and Selberg: $\rho^*(x)\le\mathrm{Const}\cdot\pi(x)$. (Montgomery has proved $\mathrm{Const}\le2$; cf. [7].) It seems likely that $\mathrm{Const}\le1+\varepsilon$ as $x\to\infty$, which would give $\rho^*(x)\sim\pi(x)$.
We note the trivial inequalities $\rho^*(x+y)\le\rho^*(x)+\rho^*(y)$, and $\rho^*(x)\ge\pi(y+x)-\pi(y)$ for all $y\ge x$.
Our study of the function $\rho^*(x)$ began with a computer search to find values $x\ge2$ for which $\rho^*(x)>\pi(x)$. It was this search which led us eventually to a theoretical solution. (Our data also showed that $\rho^*(10^5)>\pi(10^5)$, and it is known that $\rho^*(x)\le\pi(x)$ for $x\le146$; cf. [5] and [8].)
We wish to thank William Franta and Richard Franta of the Computer Sciences Department and the Computer Center for writing a machine-language program which enabled us to carry out our calculations.
1. The main result. We adhere to the notations of §0.
Theorem. $\lim_{x\to\infty}\rho^*(x)-\pi(x) = +\infty$; the difference is $\ge(\log2-\varepsilon)[x/(\log x)^2]$.
Proof. The theorem follows from two lemmas. Of these, the first is easy, while the second requires several more lemmas before it is established. The idea is to take an interval of integer points $-x/2<n\le x/2$ located symmetrically about the origin. Then we will construct a set of points $\{b_i\}$ by eliminating points as follows.
First fix an integer $N\ge3$. Eliminate all multiples (positive and negative) of all primes $p\le x/N\log x$ (the "hard" sieve of Eratosthenes, where the prime itself is not saved). Call what remains the residual set. (This set consists of the primes between $x/N\log x$ and $x/2$, and their negatives, plus the points $\pm1$.) Then
Lemma 1. The number of points in the residual set exceeds $\pi(x)$ by an amount asymptotic to $[\log2-2/N][x/(\log x)^2]$.
Lemma 2. The residual set is an admissible set for $\rho^*(x)$ (cf. §0) as soon as $x$ is large enough.
Remark. It would be trivial that the residual set is admissible if we stopped at primes $p>2\pi(x/2)\sim x/\log x$ (since then there would be more congruence classes (mod $p$) than points in the residual set). However we have an average of about $N$ points per congruence class ($N$ is fixed). We need to show that as the number of trials increases (i.e. as $x\to\infty$), then at least one empty class appears.
Proof of Lemma 1. The number of points remaining is (with an error of $\pm2$ or less) $2\pi(x/2)-2\pi(x/N\log x)$. The lemma now follows from the well-known fact that $2\pi(x/2)-\pi(x)\sim\log2\,(x/(\log x)^2)$ (cf. [6, Chapter 3]).
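The construction of the residual set is easy to reproduce for small $x$. The sketch below assumes an integer $x$ and uses illustrative function names; it only exhibits the structure of the set (the points $\pm1$ together with $\pm p$ for primes $p$ beyond the sieving bound) and makes no claim about the asymptotic count, which needs much larger $x$.

```python
import math


def primes_up_to(n):
    """Return all primes <= n by a simple sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def residual_set(x, N):
    """Integers n with -x/2 < n <= x/2 not divisible by any prime p <= x/(N log x).

    The sieving primes themselves are removed too ("hard" sieve), and so is 0.
    """
    bound = x / (N * math.log(x))
    small_primes = primes_up_to(int(bound))
    points = range((-x) // 2 + 1, x // 2 + 1)
    return [n for n in points if all(n % p != 0 for p in small_primes)]
```

With `x = 1000` and `N = 3` the sieving bound is about 48.2, and the result is exactly $\pm1$ together with $\pm p$ for the primes $53\le p\le499$.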
Proof of Lemma 2. As stated above, several auxiliary lemmas will be necessary. We begin with a lemma (to roughly the opposite effect as our theorem) about how few primes are necessary to completely eliminate a sequence of $t$ consecutive integers. (Later on, $t$ will be $(N+1)\log x$, the number of multiples of any prime $q>x/N\log x$ between $-x/2$ and $x/2$.)
Lemma 3. Let $T=T(t)$ be the minimum number of primes in the natural ordering $p_1=2$, $p_2=3,\ldots,p_T$ such that there are congruence classes $n\equiv a_i \pmod{p_i}$, one congruence class for each $p_i$, whose union contains the entire interval $1\le j\le t$. Then $T(t) = o[\pi(t)]$.
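To get a feel for the quantity in Lemma 3, one can cover $1\le j\le t$ greedily: take the primes in their natural order and, for each, remove the residue class that contains the most points still uncovered. This is a hedged sketch with illustrative names; the greedy choice is not the optimal strategy of the lemma, so the value returned is only an upper bound for $T(t)$.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small primes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def greedy_cover_size(t):
    """Number of primes 2, 3, 5, ... used by a greedy cover of {1, ..., t}.

    Each prime contributes one residue class, chosen to cover as many of
    the still-uncovered points as possible. Returns an upper bound for T(t).
    """
    uncovered = set(range(1, t + 1))
    used, p = 0, 1
    while uncovered:
        p += 1
        while not is_prime(p):
            p += 1
        used += 1
        counts = {}
        for j in uncovered:
            counts[j % p] = counts.get(j % p, 0) + 1
        best = max(counts, key=counts.get)
        uncovered = {j for j in uncovered if j % p != best}
    return used
```

For tiny intervals the greedy value agrees with the true minimum: a single class modulo 2 cannot contain both 1 and 2, so two primes are needed for $t=2$.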
Remark. The lemma is contained in a result of Erdős (cf. [1]) that the maximum gap between primes $p_{n+1}-p_n$ is asymptotically larger than $\log p_n$. However his result uses a difficult argument to obtain quantitative results which go farther than we need.
Proof. We start with Mertens' theorem (cf. [6]):
$$\prod_{p<x}\Bigl(1-\frac1p\Bigr)\sim\frac{e^{-\gamma}}{\log x}.$$
Let $C_M = \prod_{M<p<e^M}(1-1/p)\sim\log M/M$.
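Both asymptotic statements can be sampled numerically. The sketch below is a floating-point illustration only, assuming integer arguments and using illustrative function names: it compares $\log x\cdot\prod_{p<x}(1-1/p)$ with $e^{-\gamma}$ and $C_M$ with $\log M/M$.

```python
import math

# Euler-Mascheroni constant, hard-coded to double precision.
EULER_GAMMA = 0.5772156649015329


def primes_up_to(n):
    """Return all primes <= n by a simple sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def mertens_product(x):
    """Product of (1 - 1/p) over primes p < x, for an integer x."""
    return math.prod(1.0 - 1.0 / p for p in primes_up_to(x - 1))


def c_m(M):
    """Product of (1 - 1/p) over primes p with M < p < e^M."""
    upper = math.exp(M)
    return math.prod(1.0 - 1.0 / p
                     for p in primes_up_to(int(upper)) if M < p < upper)
```

At `x = 10**5` the first comparison already agrees to better than one percent; the ratio of `c_m(10)` to `math.log(10) / 10` is still a few percent above 1, as one expects from an asymptotic relation at such a small `M`.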
We will show that, given a large fixed number M, we can make T(t)<n(t/M)
+ CMn(t) if t is large enough. Since M is arbitrary, this implies T(t) = o[n(t)"].
First apply the "hard" sieve of Eratosthenes, taking out all multiples of the
primes in the two ranges \<p^M and eM^p^t/M (saving the middle range for
DOUGLAS HENSLEY AND IAN RICHARDS
later use). What remains of the original interval $1 \le j \le t$ is:
(a) primes $> t/M$, and
(b) integers all of whose prime factors come from the fixed middle interval
$M < p < e^M$.
If t is large enough, the set (b) becomes negligible in comparison with (a).
Now use the primes in the middle interval $M < p < e^M$ in an optimal way. This
reduces the residual set of $\le \pi(t)$ elements by a multiplicative factor $\le C_M$ (= the
product of the corresponding $1 - 1/p$).
Finally remove the remaining points (at most $C_M\pi(t)$ in number) one at a time,
using another $C_M\pi(t)$ primes. In all, $\pi(t/M) + C_M\pi(t)$ primes have been used. This
proves Lemma 3.
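As a rough numerical illustration of Lemma 3 (not part of the proof), the following sketch covers the interval $1 \le j \le t$ greedily: each prime, taken in its natural order, is given the residue class that contains the most points still uncovered. The greedy rule and the function names are assumptions made for this illustration only; the number of primes it uses is merely an upper bound for $T(t)$, and it does not reproduce the three-stage sieve of the proof.

```python
def primes_upto(n):
    """Return the list of primes <= n (for n >= 2) by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]


def greedy_cover(t):
    """Cover 1..t by one residue class for each of the first few primes.

    Returns a list of pairs (p, a): the class n = a (mod p) chosen for p.
    Each prime removes at least one point, so at most t primes are needed.
    """
    uncovered = set(range(1, t + 1))
    classes = []
    for p in primes_upto(max(2, 20 * t)):
        if not uncovered:
            break
        counts = {}
        for j in uncovered:
            counts[j % p] = counts.get(j % p, 0) + 1
        a = max(counts, key=counts.get)  # most popular class modulo p
        classes.append((p, a))
        uncovered = {j for j in uncovered if j % p != a}
    return classes


if __name__ == "__main__":
    for t in (10, 100, 1000):
        used = len(greedy_cover(t))
        print(t, used, len(primes_upto(t)))  # t, primes used, pi(t)
```

Comparing the second and third printed columns gives a feel for how the number of primes used compares with $\pi(t)$ at small t; the lemma itself is an asymptotic statement and is not verified by such a table.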
Lemma 4. Consider $t \to \infty$ in Lemma 3. Then $\prod_{i \le T} p_i = e^{o(t)}$.
Proof. $T = T(t) = o[\pi(t)]$ by Lemma 3, whence $p_T = o(t)$. The prime number
theorem for $\vartheta(x)$ (cf. [6]) gives $\sum_{i \le T} \log p_i \sim p_T$, so $\prod_{i \le T} p_i = e^{o(t)}$.
Proof of Lemma 2, completed. Take any prime $q > x/(N\log x)$. We have to
show that the residual set (formed by sieving out the primes up to $x/(N\log x)$)
leaves at least one congruence class (mod q) empty. Let t denote the number of
multiples of q in the interval $-x/2 < n \le x/2$; then $t \le (N+1)\log x$.
Choose $T = T(t)$ as in Lemma 3: each prime $p_i \le p_T$ eliminates a set $\{j_i\}$ of
points from the interval $1 \le j \le t$, so that the union of the $\{j_i\}$ is the whole interval.
Furthermore, the different elements of $\{j_i\}$ are all congruent (mod $p_i$)!
Now for $-(x/2 + q) \le y < -x/2$, no points other than the t points $y + q$, $y + 2q$,
..., $y + tq$ fall between $-x/2$ and $x/2$. Since by Lemma 4, $\prod_{i \le T} p_i < x/(N\log x) < q$,
there exists some y, $-(x/2 + q) \le y < -x/2$, such that $y + j_i q \equiv 0 \pmod{p_i}$, $1 \le i \le T$
(Chinese remainder theorem). (If this holds for one $j_i$ in $\{j_i\}$, then it holds for all!)
The union over i and $\{j_i\}$ of the points $j_i q + y$ is the entire congruence class $jq + y$.
Thus y gives a congruence class (mod q) which, in the interval $-x/2 < n \le x/2$,
hits no member of the residual set (since it hits only multiples of the relatively small
primes $p_i$, $p_i < t = O(\log x)$, whereas primes up to $x/(N\log x)$ have been sieved out).
Q.E.D.
References
1. P. Erdős, On the difference of consecutive primes, Quart. J. Math. Oxford 6 (1935), 124-128.
2. P. Erdős, Some unsolved problems, Michigan Math. J. 4 (1957), 291-300. MR 20 #5157.
3. P. Erdős and J. L. Selfridge, Complete prime subsets of consecutive integers (to appear).
4. G. H. Hardy and J. E. Littlewood, Some problems of "partitio numerorum". III. On the
expression of a number as a sum of primes, Acta Math. 44 (1923), 1-70.
5. D. Hensley and I. Richards, Primes in intervals (to appear).
6. A. E. Ingham, The distribution of prime numbers, Cambridge Univ. Press, 1932; reprint,
Hafner, New York.
7. H. L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227,
Springer-Verlag, New York, 1971.
8. A. Schinzel, Remarks on the paper "Sur certaines hypothèses concernant les nombres premiers",
Acta Arith. 7 (1961/62), 1-8. MR 24 #A70.
9. A. Schinzel and W. Sierpiński, Sur certaines hypothèses concernant les nombres premiers, Acta
Arith. 4 (1958), 185-208; erratum, 5 (1959), 259. MR 21 #4936.
University of Minnesota
ON THE INTERVALS BETWEEN CONSECUTIVE
TERMS OF SEQUENCES
CHRISTOPHER HOOLEY
1. Introduction. Let $s_1, s_2, \ldots, s_r, \ldots$ be a sequence of integers that is defined
in some natural way; let $f(n)$ be the counting function of the sequence, i.e. $f(n) = 1$
if n belongs to the sequence, and $f(n) = 0$ otherwise; and let
$F(x) = \sum_{s_r \le x} 1 = \sum_{n \le x} f(n)$
be the number of members of the sequence not exceeding x. Then in many well
known cases the overall asymptotic behaviour of the sequence has been
determined in the sense that it has been possible to prove an asymptotic formula of
the type
(1) F(x)~x/g(x)
as $x \to \infty$, where, characteristically, $g(x)$ is either a positive constant or a slowly
increasing function tending to $\infty$ as $x \to \infty$. Results of this form, however, leave
unresolved a corpus of difficult questions related to the theory of the finer structure
of such sequences. In particular there are many problems connected with the
distribution of the intervals $s_{r+1} - s_r$, and it is with this aspect of the theory that
we shall be concerned in this article.
We shall consider the following three questions with an emphasis on the
latter two:
(i) An upper bound for $s_{r+1} - s_r$ in terms of $s_r$;
(ii) an upper bound for the sum
(2) $\sum_{s_{r+1} \le x} (s_{r+1} - s_r)^\gamma$,
AMS 1970 subject classifications. Primary 10L99.
© 1973, American Mathematical Society
where $\gamma$ is a positive constant. In particular the question as to whether the upper
bound
(3) $\sum_{s_{r+1} \le x} (s_{r+1} - s_r)^\gamma = O(x g^{\gamma-1}(x))$
holds for some or all $\gamma > 1$ in respect of a given sequence satisfying (1) (it holds
trivially for $\gamma = 0$ and 1, and hence for $0 \le \gamma \le 1$ by the Cauchy-Schwarz inequality),
(iii) The distribution of the intervals $s_{r+1} - s_r$. In particular the question of
whether there is a distribution function for $(s_{r+1} - s_r)/g(s_r)$ in respect of a given
sequence that satisfies (1) for the case $g(x) \to \infty$ as $x \to \infty$; that is to say, if $M_c(x)$
for a given positive constant c be the number of intervals $s_{r+1} - s_r$ of length not
exceeding $c\,g(s_r)$ for which $s_{r+1} \le x$, then does
$\lim_{x \to \infty} \frac{M_c(x)}{x/g(x)}$
exist? There is of course a corresponding question for the case $g(x)$ = const.,
provided its statement is appropriately rephrased.
The above topics naturally by no means exhaust the list of interesting questions
about the intervals $s_{r+1} - s_r$. They have, however, been chosen not only for their
intrinsic importance but also because they are the ones most closely connected
with the author's own researches on the subject. In order that as much ground
as possible should be encompassed our survey will be of an expository nature,
proofs as such not being given for the results we quote. Where, as in most cases,
the results have already appeared in the literature, full details of the methods
used can be found through the references given. As for the new results announced
for the first time in this article, it is intended that a full account of them should
be published shortly.
After beginning by making some observations about the relationships between
the three problems, we shall outline some general principles and ideas that are
relevant to the second and third ones. During the latter part of the article we shall
then summarize what has been discovered about some of the more familiar
sequences through the application of these ideas and others. Four sequences
are chosen as examples, these being in order of consideration, the numbers not
exceeding n that are relatively prime to n, the primes, the numbers expressible
as a sum of two squares, and the square-free numbers.
2. General considerations. Any upper bound in (i) naturally gives rise to an
upper bound in (ii), since
(4) $\operatorname{bd}_{s_{r+1} \le x} (s_{r+1} - s_r) = O(h(x))$
implies that
$\sum_{s_{r+1} \le x} (s_{r+1} - s_r)^\gamma = O\{h^{\gamma-1}(x) \sum_{s_{r+1} \le x} (s_{r+1} - s_r)\} = O\{x h^{\gamma-1}(x)\}$.
But this consideration is not of much avail in practice because the upper bounds
found for (i) usually fall far short of what is probably true. There is furthermore
the point that best possible results for (ii) are usually not theoretically attainable
in this manner because (4) in many cases is false with $h(x) = g(x)$ (as, for instance,
in the four examples given later). Conversely a result of type (2) for (ii) implies that
$s_{r+1} - s_r = O(x^{1/\gamma} g^{1-1/\gamma}(x))$
for $s_{r+1} \le x$, which for large $\gamma$ would supply a good answer to (i). Unfortunately,
however, most of the methods used to consider (ii) are circumscribed by the fact
that they become increasingly dependent at some point or other on good bounds
for (i) as y is taken larger.
There are also some connections between (ii) and (iii). For example, if (2)
holds for $\gamma = \alpha$ and if there is a distribution function for $(s_{r+1} - s_r)/g(s_r)$, then (2)
may be strengthened for $\gamma < \alpha$ so that it becomes an asymptotic equality of the
form
$\sum_{s_{r+1} \le x} (s_{r+1} - s_r)^\gamma \sim A(\gamma)\, x g^{\gamma-1}(x)$
as $x \to \infty$. A less direct association is seen by letting $N_x(l)$ be the number of intervals
$s_{r+1} - s_r$ of length l for which $s_{r+1} \le x$, and then writing the sum (2) in the form
(5) $\sum_{l} N_x(l)\, l^\gamma$.
This could be estimated by partial summation provided satisfactory bounds
could be provided for sums of the form $\sum_{l > \lambda} N_x(l)$, which are clearly closely
related to $M_c(x)$ for $c = \lambda/g(x)$. Although the methods used to establish the
existence of a distribution function for (iii) do not in fact readily extend to deal with
the case where c is unbounded, it is nevertheless frequently convenient to express
(2) by (5) and then to perform a partial summation through the use of the cognate
sum
(6) $\sum_{l > \lambda} l\, N_x(l)$,
about which information can be obtained in a way to be described below. This
completes our remarks about the mutual relationships between (i), (ii), and (iii).
We discuss next the role played in the theory by sums of the form
$R(x; l) = \sum_{n+l \le x} f(n) f(n+l)$,
which obviously have some relevance to the problems at hand. Indeed, since
$R(x; l)$ is equal to the number of pairs $s_\mu$, $s_\nu$ (not necessarily consecutive) for
which $s_\nu - s_\mu = l$ and $s_\nu \le x$, $R(x; l)$ provides an upper bound for $N_x(l)$ that is likely
to be satisfactory for small values of l. It was in fact through this idea that Erdős
found the first proof of the theorem that
$\liminf_{r \to \infty} \frac{p_{r+1} - p_r}{\log p_r} < 1$.
Nevertheless the bound supplied by $R(x; l)$ will not be of much direct use for the
larger values of l that are of importance to us here, because for such values almost
all the intervals $s_\nu - s_\mu$ of length l will contain many terms of the sequence. We must
instead look to the ideas of the next paragraph in order to see how useful these
sums turn out to be.
The sums appear again when we introduce a method of correlating the
frequency of occurrence of large intervals $s_{r+1} - s_r$ with the frequency of occurrence
of the zero values taken by sums of the form
(7) $S(n, h) = F(n+h) - F(n) = \sum_{n < m \le n+h} f(m)$.
To estimate the frequency of abnormally small values of $S(n, h)$ we consider in the
first place the 'variance'
(8) $\sum_{n \le x} \{S(n, h) - h/g(x)\}^2$,
which in most cases will be near enough
(9) $\sum_{n \le x} S^2(n, h) - h^2 x/g^2(x)$,
since the expected value of $S(n, h)$ will be about $h/g(x)$ on account of (1). As for the
sum in the first term of (9), it can be expressed in terms of the sums $R(x; l)$ for
$l < h$ by writing the square of the sum in (7) as a double sum and then changing the
orders of summation after it is substituted in (9). The expectation that the variance
(8) should be small can therefore be tested provided suitable asymptotic formulae
are available for the sums $R(x; l)$.
On the other hand, corresponding to a given interval $s_{r+1} - s_r$ of length $l > h$,
there are about $l - h$ values of n for which $S(n, h) = 0$. Consequently (8) provides an
upper bound for the sum $\sum_{l > h} (l - h) N_x(l)$, which is to all intents and purposes
the same as the sum (6) whose application to problem (ii) has already been indicated.
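The counting step just described can be checked directly on a small example. The sketch below takes the primes as the sequence (an arbitrary choice for illustration) and compares the number of n with $S(n, h) = 0$ against $\sum_{l > h} (l - h)$ taken over the gaps; the function names are hypothetical. With n restricted to the range from the first term up to, but not including, the last term, the two counts agree exactly, because a gap from $s_r$ to $s_{r+1}$ of length l contributes the values $n = s_r, \ldots, s_{r+1} - h - 1$.

```python
def primes_upto(n):
    """Return the list of primes <= n (for n >= 2) by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]


def zero_count(seq, h):
    """Number of n with seq[0] <= n < seq[-1] and no term of seq in (n, n + h]."""
    members = set(seq)
    return sum(
        1
        for n in range(seq[0], seq[-1])
        if not any((n + k) in members for k in range(1, h + 1))
    )


def long_gap_sum(seq, h):
    """Sum of (l - h) over the gaps l = s_{r+1} - s_r that exceed h."""
    return sum(max(0, b - a - h) for a, b in zip(seq, seq[1:]))


if __name__ == "__main__":
    seq = primes_upto(2000)
    for h in (1, 4, 10, 20):
        print(h, zero_count(seq, h), long_gap_sum(seq, h))
```

The "about" in the text reflects only the choice of end points for the range of n; the sketch fixes that choice so that the identity is exact.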
Where asymptotic formulae with sufficiently accurate remainder terms are
known for $R(x; l)$, we may expect in this way to obtain a proof of (ii) in typical
cases for $\gamma < 2$. To deal with the cases for which $\gamma$ is larger it will usually be essential
to consider higher moments of the form
$\sum_{n \le x} \{S(n, h) - h/g(x)\}^{2\nu}$,
for the estimation of which asymptotic formulae for sums of the form
(10) $R(x; l_1, \ldots, l_\mu) = \sum_{n \le x} f(n) f(n + l_1) \cdots f(n + l_\mu)$
are needed. Additional difficulties, however, normally occur even in the cases where
the latter requirement is met, and in consequence not much success has so far
attended attempts to use the method for the case $\gamma > 2$.
Nevertheless the sums $R(x; l_1, \ldots, l_\mu)$ are of paramount importance from the
standpoint of problem (iii), since they can be used to calculate the number of
intervals $s_{r+1} - s_r$ of length not exceeding $\lambda$ for which $s_{r+1} \le x$ ($\lambda$ being of an order of
magnitude appropriate to the sequence in question, i.e. $\lambda \asymp g(x)$). That this is so
can be seen by considering the fact that the number of intervals in question is
precisely the number of pairs $s_\mu$, $s_\nu$ such that $s_\nu - s_\mu \le \lambda$, $s_\nu \le x$, and such that no
members of the sequence lie between $s_\mu$ and $s_\nu$. Thus, appealing to a classical
exclusion principle, we have that the number required is
(11) $\sum_{t} (-1)^t Q_t(x, \lambda)$,
where $Q_t(x, \lambda)$ is the number of intervals of length not exceeding $\lambda$ which contain
exactly t members of the sequence strictly within them. In turn the sum $Q_t(x, \lambda)$
itself can be easily expressed in terms of the sums $R(x; l_1, \ldots, l_{t+1})$, where $0 < l_1 < \cdots
< l_{t+1} \le \lambda$.
The above method in some cases leads successfully to the solution of (iii). In
practice, however, the formula (11) is not satisfactory to use because it will contain
too many terms, and we use instead an expression of the type
$\sum_{t < M} (-1)^t Q_t(x, \lambda) + O\{Q_M(x, \lambda)\}$
bearing a close analogy to the formulae that appear in Brun's sieve method.
The field of application of the above ideas is naturally restricted by the
availability of suitable auxiliary formulae and by other difficulties that may occur in
practice. However, as we shall see in the next section, these methods with
appropriate modifications have enabled us to make some progress with problems (ii)
and (iii).
3. Special sequences.
1. The numbers prime to n. Here the problems need to be formulated in a
slightly different manner because for each n the number of members of the sequence
is finite. We therefore denote, for given n, the numbers not exceeding n that are
relatively prime to n by $a_1, \ldots, a_{\phi(n)}$, and then regard n as tending to infinity. Apart
from its intrinsic interest, the consideration of this sequence may be regarded as a
useful preliminary to the corresponding exercise for the primes (which are virtually
obtained by taking $n = \prod_{p \le x^{1/2}} p$ and then taking the $a_i$ that do not exceed x).
Problem (i). Brun's or Selberg's sieve method leads easily to
$a_{i+1} - a_i = O(\log^C n)$
for some positive constant C.
Problem (ii). In 1940 Erdős [3] conjectured that
(12) $\sum_{i=1}^{\phi(n)-1} (a_{i+1} - a_i)^2 = O(n^2/\phi(n))$.
The author [8] then proved in 1963 that in fact, for $\gamma < 2$,
(13) $\sum_{i=1}^{\phi(n)-1} (a_{i+1} - a_i)^\gamma = O\{n (n/\phi(n))^{\gamma-1}\}$,
and that
(14) $\sum_{i=1}^{\phi(n)-1} (a_{i+1} - a_i)^2 = O\{n (\log\log n)^2\}$.
It actually was in this context that the author developed the appropriate method
described in the previous section. Here of course the work is facilitated by the fact
that there is an exact formula for the analogue of R(x; /).
No progress has yet been made with the case $\gamma > 2$. Even the case $\gamma = 2$ is still
not resolved, although several people including Vaughan, Norton, and the author
(unpublished) have improved (14) in such a way that it follows that (12) holds for
almost all n.
Problem (iii). Erdős [4] had conjectured that, for $n = \prod_{p \le M} p$ and M large,
the ratio
$(a_{i+1} - a_i)/(n/\phi(n))$
had a distribution function that was (essentially) independent of n. This was
subsequently proved by the author [9] under merely the weaker condition that
$n \to \infty$ through a sequence for which $n/\phi(n) \to \infty$. Stated more precisely the result
asserts that, for any given positive constant c, the number of intervals $a_{i+1} - a_i$
of length not exceeding $cn/\phi(n)$ is equal to
$\phi(n)\{1 - e^{-c} + o(1)\}$
as $n \to \infty$ through a sequence for which $n/\phi(n) \to \infty$. Thus $(a_{i+1} - a_i)\phi(n)/n$ is
distributed approximately as a gamma variable with parameter 1 when $n/\phi(n)$
is large.
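For orientation only, the following sketch tabulates the empirical proportion of intervals $a_{i+1} - a_i$ of length not exceeding $cn/\phi(n)$ for a single primorial n, next to the limiting value $1 - e^{-c}$. The choice n = 30030 and the function names are assumptions made for illustration; for this n the ratio $n/\phi(n)$ is only about 5.2, so close agreement should not be expected, since the theorem concerns $n/\phi(n) \to \infty$.

```python
from math import exp, gcd


def reduced_residues(n):
    """The numbers 1 <= a <= n that are relatively prime to n, in increasing order."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]


def gap_proportion(n, c):
    """Return (empirical, limiting) proportion of gaps of length <= c * n / phi(n)."""
    residues = reduced_residues(n)
    phi = len(residues)
    gaps = [b - a for a, b in zip(residues, residues[1:])]
    threshold = c * n / phi
    empirical = sum(1 for g in gaps if g <= threshold) / phi
    return empirical, 1.0 - exp(-c)


if __name__ == "__main__":
    n = 2 * 3 * 5 * 7 * 11 * 13  # 30030, a small primorial
    for c in (0.5, 1.0, 2.0, 3.0):
        empirical, limiting = gap_proportion(n, c)
        print(c, round(empirical, 4), round(limiting, 4))
```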
The proof depends on the method described at the end of the previous section,
there being again exact formulae for the analogue of (10). The proof, however, is
fairly complicated because there are one or two incidental difficulties to overcome.
Combining this result with that of (ii) we obtain, for $0 \le \gamma < 2$,
$\sum_{i=1}^{\phi(n)-1} (a_{i+1} - a_i)^\gamma = \{1 + o(1)\}\,\Gamma(\gamma + 1)\, n (n/\phi(n))^{\gamma-1}$
as $n/\phi(n) \to \infty$.
The methods used here can be extended further in order that other detailed
properties of the intervals may be studied. For example, it can be proved [10] that
$\sum_{i < \phi(n);\ i \equiv 0 \bmod 2} (a_{i+1} - a_i) = (\tfrac{1}{2} + o(1))\, n$
as $n \to \infty$ through a sequence for which $n/\phi(n) \to \infty$.
2. The prime numbers. The question as to whether for the sequence of primes
there are asymptotic formulae for sums of the form (10) is of course one of the
most important unsolved problems in the theory of numbers. The following
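conjecture, however, was made by Hardy and Littlewood in their famous paper
"Some problems of partitio numerorum. III" [7].
Conjecture. Let $l_1, l_2, \ldots, l_{r-1}$ be $r - 1$ distinct nonzero integers and let $f(n)$
be defined so that $f(n) = 1$ if n is a prime and $f(n) = 0$ otherwise. Then, as $x \to \infty$,
$\sum_{n \le x} f(n) f(n + l_1) \cdots f(n + l_{r-1}) \sim \operatorname{li}_r x \prod_p \left(\frac{p}{p-1}\right)^{r-1} \frac{p - \nu}{p - 1}$,
where
$\operatorname{li}_r x = \int_2^x \frac{du}{\log^r u}$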
and $\nu = \nu(p; l_1, \ldots, l_{r-1})$ is the number of distinct residues of $0, l_1, \ldots, l_{r-1}$ to the
modulus p.
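The simplest instance of the conjecture is $r = 2$, $l_1 = 2$ (prime pairs p, p + 2), where $\nu = 1$ for p = 2 and $\nu = 2$ for odd p. The sketch below, whose function names, truncation point for the product and quadrature rule are all assumptions made for illustration, compares the actual count with the conjectured main term for a few small x.

```python
from math import log


def primes_upto(n):
    """Return the list of primes <= n (for n >= 2) by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]


def twin_count(x):
    """Number of n <= x with n and n + 2 both prime."""
    primes = set(primes_upto(x + 2))
    return sum(1 for p in primes if p <= x and (p + 2) in primes)


def twin_constant(prime_bound=10_000):
    """Product over p <= prime_bound of (p/(p-1)) * (p - nu)/(p - 1) for l_1 = 2."""
    value = 2.0  # factor for p = 2, where nu = 1
    for p in primes_upto(prime_bound):
        if p > 2:
            value *= p * (p - 2) / (p - 1) ** 2  # nu = 2 for odd p
    return value


def li_r(x, r=2, steps=200_000):
    """Midpoint-rule approximation to the integral of du / log^r u over [2, x]."""
    width = (x - 2.0) / steps
    return width * sum(1.0 / log(2.0 + (k + 0.5) * width) ** r for k in range(steps))


if __name__ == "__main__":
    constant = twin_constant()
    for x in (10**3, 10**4, 10**5):
        print(x, twin_count(x), round(constant * li_r(x), 1))
```

Agreement at such small x says nothing about the truth of the conjecture; the table only makes the shape of the main term concrete.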
Slightly stronger variants of the conjecture can be taken in which the asymptotic
formula has a remainder term of a specified order and in which the $l_i$ can lie in
some range depending on x. For the purposes of (ii) and (iii) it will be enough to
have in mind a fairly weak form of this type (say with a remainder term $O(x^{1-\delta})$).
Problem (i). Dr. Huxley has now proved that
$p_{n+1} - p_n = O(p_n^{7/12+\varepsilon})$
by a method that he has described to us in his interesting address to this colloquium
(references to the earlier literature on the subject may be found in his article in
these Proceedings). This does not fall far short of the result
(15) $p_{n+1} - p_n = O(p_n^{1/2} \log p_n)$,
which was shown by Cramér [2] to be true on the Riemann hypothesis.
Problem (ii). Erdős [3] has conjectured that
$\sum_{p_{n+1} \le x} (p_{n+1} - p_n)^2 = O(x \log x)$,
although the prospect of proving it seems rather remote since its truth would
imply
$p_{n+1} - p_n = O(p_n^{1/2} \log^{1/2} p_n)$,
which is stronger than (15). Selberg [15], however, improving earlier results by
Cramér, has shown on the Riemann hypothesis that
$\sum_{p_{n+1} \le x} \frac{(p_{n+1} - p_n)^2}{p_n} = O(\log^3 x)$;
a result which is probably only imperfect to the extent of a superfluous factor
$\log x$ in the right-hand side.
If, however, the Hardy-Littlewood conjecture in appropriate form for $r = 2$
is assumed together with the Riemann hypothesis, then it is possible to combine
Selberg's method with that described in §2 in order to prove that
$\sum_{p_{n+1} \le x} (p_{n+1} - p_n)^\gamma = O(x \log^{\gamma-1} x)$ for $\gamma < 2$.
Problem (iii). The existence of a distribution function for
$(p_{n+1} - p_n)/\log p_n$
can be shown provided the Hardy-Littlewood conjectures in appropriate form are
assumed to hold for any r. As in the case of the numbers prime to n, the ratio
would be distributed as a gamma variable with parameter 1.
3. Sums of two squares.
Problem (i). Bambah and Chowla [1] have proved that
$s_{n+1} - s_n = O(s_n^{1/4})$.
Their method is elementary, but more sophisticated analytic methods seem to
achieve less.
Problem (ii). Let $r(n)$ be the number of representations of n as a sum of two
squares, and let $r_1(n)$ be defined to be 1 or 0 according as to whether n is expressible
as a sum of two squares or not. Then there are difficulties in applying our ideas to
the problem in this case because no asymptotic formula is known for the sum
$\sum_{n \le x} r_1(n) r_1(n + l)$;
indeed the proof of such a formula might well lie nearly as deep as that for the
corresponding Hardy-Littlewood formula for prime-pairs. On the other hand
Estermann has proved an asymptotic formula with remainder term $O(x^{5/6+\varepsilon})$ for
the associated sum
$\sum_{n \le x} r(n) r(n + l)$.
This, however, cannot be immediately applied to our problem because the use of
$r(n)$ means that only a sparse subset of the $s_r$ is effectively counted, the average
intervals between consecutive terms of this subset having length greater in
magnitude than $(\log x)^{1/2}$. That this is so can be seen, in effect, through recalling the
behaviour of $\omega^*(n)$, the number of distinct prime factors of n that are congruent
to 1, mod 4. Thus the normal order of $\omega^*(s_m)$ is $\tfrac{1}{2}\log\log s_m$, whereas the major
contribution to $\sum_{s_m \le x} r(s_m)$ is due to those terms for which $\omega^*(s_m)$ is about
$\log\log s_m$.
The above difficulty was overcome by the author [11] through the introduction
of a new function $\rho(n)$ with the property that $\rho(n) = 0$ unless n is expressible as the
sum of two squares. This function is virtually an approximation to $r_1(n)$ in that it is
obtained by affecting $r(n)$ with a factor $t(n)$ that is intended to mimic as closely as
possible the function $2^{-\omega^*(n)}$. If d denotes, generally, square-free numbers
(including 1) composed entirely of prime factors p such that $p \equiv 1 \pmod 4$, and if
$v = x^\delta$ for a suitable small value of $\delta$, then $t(n)$ is virtually defined¹ by
in contrast to the identity
By means of a somewhat elaborate analysis, asymptotic formulae (with
remainder terms not specified here) of the following form are obtained
Ie(n)~o^' Iffa("hd^' E<(»M»+0~7^.
which are the counterparts of (either already proved or conjectured) formulae
with $r_1(n)$ replacing $\rho(n)$. These provide the basis for the proof of
$\sum_{s_{r+1} \le x} (s_{r+1} - s_r)^\gamma = O(x \log^{(\gamma-1)/2} x)$
for $\gamma < 5/3$.
Problem (iii). No unconditional results relating to this are known. Indeed,
it has not even yet been possible to obtain an asymptotic formula for the sum
$\sum_{n \le x} r(n) r(n + l_1) r(n + l_2)$,
although the author has been able to show that there are infinitely many n for
which $n$, $n + l_1$, $n + l_2$ are all expressible as the sums of two squares.
4. The square-free numbers.
Problem (i). The result
$s_{r+1} - s_r = O(x^{2/9+\varepsilon})$
is due to Richert [14]. A method for slightly reducing the exponent was
subsequently given by Rankin [13].
Problem (ii). Erdős [5] showed that, as $x \to \infty$,
1 In practice it is convenient, though not essential, to modify this definition slightly.
$\sum_{s_{r+1} \le x} (s_{r+1} - s_r)^\gamma \sim A(\gamma)\, x$
for $\gamma \le 2$.
The author recently extended this by proving that the formula holds for $\gamma \le 3$.
The method used is different from that previously described, since the basic idea
is to show that, if $f_h(n)$ is defined by
$f_h(n) = \sum_{p^2 \mid n;\ p > h \log h} 1$,
then the sum
is small. Having in effect shown that it suffices to consider numbers that are not
divisible by squares of small primes, it is then comparatively easy to complete
the proof.
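The asymptotic formula of Erdős quoted above can be explored numerically. The sketch below, with hypothetical function names, lists the square-free numbers up to x and prints $x^{-1}\sum (s_{r+1} - s_r)^\gamma$ for a few exponents; under the formula this ratio should settle down to a constant $A(\gamma)$ as x grows, although nothing is proved by such a table.

```python
def squarefree_upto(x):
    """Return the square-free numbers 1 <= n <= x in increasing order."""
    flag = bytearray([1]) * (x + 1)
    flag[0] = 0
    d = 2
    while d * d <= x:
        # strike out every multiple of d^2
        flag[d * d :: d * d] = bytearray(len(range(d * d, x + 1, d * d)))
        d += 1
    return [n for n in range(1, x + 1) if flag[n]]


def gap_moment(seq, gamma):
    """Sum of (s_{r+1} - s_r)^gamma over consecutive terms of seq."""
    return sum((b - a) ** gamma for a, b in zip(seq, seq[1:]))


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        seq = squarefree_upto(x)
        print(x, [round(gap_moment(seq, g) / x, 4) for g in (1, 2, 3)])
```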
Problem (iii). This was solved by Mirsky [12]. The existence of the
distribution function was used in the work described above.
4. Concluding remarks. We have had perforce to omit work of importance
from this survey. In particular, we have not mentioned the very interesting results
of Erdos regarding sequences with terms that are not divisible by any of the terms
of another sequence.
Obviously there is much that still remains to be done in this field. For example,
methods that could effectively be applied to sequences such as that of the numbers
expressible as
$X^3 + Y^3 + Z^3$; $X, Y, Z \ge 0$
remain a desideratum.
References
1. R. P. Bambah and S. Chowla, On numbers which can be expressed as a sum of two squares, Proc.
Nat. Inst. Sci. India 13 (1947), 101-103. MR 9, 273.
2. H. Cramér, Some theorems concerning prime numbers, Ark. Mat. Astronom. Fys. 15 (1920),
no. 5, 1-32.
3. P. Erdős, The difference of consecutive primes, Duke Math. J. 6 (1940), 438-441. MR 1, 292.
4. P. Erdős, Some unsolved problems, Magyar Tud. Akad. Kutató Int. Közl. 6 (1961), 221-254.
MR 31 #2106.
5. P. Erdős, Some problems and results in elementary number theory, Publ. Math. Debrecen 2 (1951),
103-109. MR 13, 627.
6. T. Estermann, An asymptotic formula in the theory of numbers, Proc. London Math. Soc. (2)
34 (1932), 280-292.
7. G. H. Hardy and J. E. Littlewood, Some problems of partitio numerorum. III, Acta Math.
8. C. Hooley, On the difference of consecutive numbers prime to n. I, Acta Arith. 8 (1962/63), 343-
347. MR 27 #5741.
9. C. Hooley, On the difference between consecutive numbers prime to n. II, Publ. Math. Debrecen 12
(1965), 39-49. MR 32 #4099.
10. C. Hooley, On the difference between consecutive numbers prime to n. III, Math. Z. 90 (1965),
355-364. MR 32 #1182.
11. C. Hooley, On the intervals between numbers that are sums of two squares, Acta Math. 127 (1971),
279-297.
12. L. Mirsky, Arithmetical pattern problems relating to divisibility by rth powers, Proc. London
Math. Soc. (2) 50 (1949), 497-508. MR 10, 431.
13. R. A. Rankin, Van der Corput's method and the theory of exponent pairs, Quart. J. Math. Oxford
Ser. (2) 6 (1955), 147-153. MR 17, 240.
14. H.-E. Richert, On the difference between consecutive squarefree numbers, J. London Math. Soc.
29 (1954), 16-20. MR 15, 289.
15. A. Selberg, On the normal density of primes in small intervals and the difference between
consecutive primes, Arch. Math. Naturvid. 47 (1943), no. 6, 87-105. MR 7, 48.
University College
Cardiff, Wales
THE DIFFERENCE BETWEEN
CONSECUTIVE PRIMES
MARTIN HUXLEY
The gap for consecutive primes
Has decreased in historical times:
I can prove it myself
Down to seven upon twelve,
Said one who rejoices in rhymes.
This article is a sketch of the ideas used in my paper on the difference between
consecutive primes. The result is
(1) $p_{n+1} - p_n < p_n^{\delta}$
when $\delta > 7/12$ and n is sufficiently large, improving Montgomery's condition
$\delta > 3/5$.
Ingham reduced the problem to the study of zeros of $\zeta(s)$. A contour integral
gives
(2) $\sum_{p^a \le x} \log p = x - \sum_{|\gamma| < T} \frac{x^\rho}{\rho} + O\left(\frac{x \log^2 x}{T}\right)$.
Subtracting this equation at x from the same at $x + h$, one has a sum over prime
powers (essentially over primes) that lie between x and $x + h$. The sum over zeros
has to be estimated rather crudely. Let $N(\alpha, t)$ be the number of zeros $\rho = \beta + i\gamma$
in the rectangle $\alpha \le \beta \le 1$, $|\gamma| \le t$. Any zero-density theorem of the form
(3) $N(\alpha, t) \ll t^{\lambda(1-\alpha)}$
AMS 1970 subject classifications. Primary 10H05, 10H15; Secondary 30A16.
© 1973, American Mathematical Society
for $\alpha < 1$, together with a strong result on the nonvanishing of $\zeta(s)$ near the line
$\sigma = 1$, implies
(4) $\sum_{x < p^a \le x+h} \log p = h + o(h)$
for $\log h/\log x > 1 - 1/\lambda$.
With $x = p_n$, $\delta = 1 - 1/\lambda + \varepsilon$, the sum on the left of (4) is nonzero, and thus not
empty, and we deduce (1). If the Riemann hypothesis were true, then (3) would hold
with any $\lambda > 2$, and (1) with any $\delta > \tfrac{1}{2}$. The present talk sets about to get $\lambda > 12/5$.
The program above is Ingham's, who proved (1) for $\delta > 5/8$. Evidently recent
improvements have been in counting zeros. There are two basic properties of
$\zeta(s)$: its functional equation and its product representation. The analytic properties
are very useful, but counterexamples show that they alone are not sufficient for
results of the form (3). The multiplicativity is used as follows. Let
$M(s) = \sum_{m \le X} \mu(m)/m^s$,
a partial sum for the reciprocal of $\zeta(s)$. Then
$\zeta(s) M(s) = 1 + \sum_{m > X} m^{-s} \sum_{d \mid m;\ d \le X} \mu(d)$.
This product of series does not converge in the region $\sigma < 1$ in which we are
interested, but we can insert convergence factors:
(5) $\zeta(s) M(s) = 1 + \sum_{X < m \le Y} \frac{a(m)}{m^s}$ + error term + contour integral.
Here Y is the point at which the convergence factor really takes effect. When s is a
zero $\rho$ of $\zeta(s)$ (or a zero of $M(s)$), the left-hand side is zero. Essentially what happens
is that $\sum a(m)/m^s$ is approximately $-1$. It is also possible that the contour integral
is near $-1$. With the appropriate choice of X and Y, in our problem the contour
integral is negligible for $\alpha \ge 5/6$, and for $3/4 \le \alpha \le 5/6$ it can be $-1$, but not so often
as the sum can be. Something is being lost here; intuitively we expect to get a
better result by adjusting the value of Y until the contour integral is $-1$ as often
as the sum is.
A technical point enters here. The sum from X to Y is split into sections
$\tfrac{1}{2}N < m \le N$, and if the original sum was greater than $\tfrac{1}{2}$ (in modulus), one of the
parts exceeds $1/(4\log Y)$.
The first method an analytic number theorist tries is to integrate the square
of the modulus of the sum over t; if the range for t is long enough one gets an
expression involving the sums of the squares of the coefficients. This approach
works here, and leads to the result (to cut a long story short)
$N(\alpha, t) \ll t^{3(1-\alpha)/(2-\alpha)} \log^{10} t$,
(the logarithm power can be reduced, but that is of little importance to us), which
recovers one of Ingham's theorems. Figure 1 shows the graph of $\lambda = 3/(2 - \alpha)$.
[Figure 1: graph of $\lambda$ against $\alpha$, with $\lambda$ marked at 2, 12/5, 3 and $\alpha$ marked at 1/2, 3/4, 1.]
The method of Halász and Montgomery for counting the number of times a
finite Dirichlet series can be large rests on the inequality
(6) $\left( \sum_{r=1}^{R} \left| \sum_{n=1}^{N} a(n) u_r(n) \right| \right)^2 \le \sum_{n=1}^{N} |a(n)|^2 \sum_{r=1}^{R} \sum_{q=1}^{R} \left| \sum_{n=1}^{N} b(n) u_r(n) \overline{u_q(n)} \right|$,
where $a(n)$, $u_r(n)$ are complex numbers and $b(n)$ are positive real numbers, subject
to $b(n) \ge 1$ for those n for which $a(n)$ is nonzero. This inequality is related to the
large sieve and to Bessel's inequality; it is proved by a judicious use of Cauchy's
inequality. It is convenient to divide the $a(m)$ in (5) by $m^\sigma$ to obtain the $a(m)$ for (6),
and to take $u_r(n) = n^{\sigma - \rho_r}$, where $\rho_1, \ldots, \rho_R$ are zeros. The inner sum in (6) becomes
(7) $\sum_{n} b(n)\, n^{2\sigma - \beta_r - \beta_q - i\gamma_r + i\gamma_q}$.
The point of this manoeuvre is to replace the coefficients $a(m)$, which involve
sums of Möbius functions, by the known coefficients $b(n)$. We can make (7) to be an
average of the zeta function near the point $\sigma + i\gamma_q - i\gamma_r$. The simplest choice is
$\sigma = 0$. The order of magnitude of $\zeta(it)$ is known, but for $\sigma > 0$ the estimate for
$\zeta(\sigma + it)$ is probably not best possible, and there is much scope for what another
speaker has aptly described as "very difficult arguments of a function-theoretic
nature". Professor Bombieri's work with Forti and Viola is one example.
It is convenient at this stage to invent a notation
$f \lll g$
to mean $f = O(g \log^a T)$ for some fixed a as $T \to \infty$.
In our application of (6) we assume that
(8) $\left| \sum_{(1/2)N < n \le N} a(n) u_r(n) \right| \ggg 1$
for $r = 1, \ldots, R$, and that $\rho_1, \ldots, \rho_R$ are not too close together. With a suitable choice
of the $b(n)$, (6) becomes
(9) $R^2 \lll R(N + RT^{1/2})\, N^{1-2\alpha}$.
The term with R and not $R^2$ on the right arises from summands with $q = r$.
Essentially $N(\alpha, T) \lll R$, and (9) gives
(10) $R \lll N^{2-2\alpha}$
provided that
(11) $T^{1/2} \lll N^{2\alpha-1}$.
The above is a summary of Montgomery's work. I know three devices that
enable (11) to be relaxed a little.
(i) Raising the sum to a power k replaces N by Nk but does not change T. This
relaxes (11) but weakens (10). However Jutila has shown how to use this effectively;
we follow his idea below.
(ii) Subdividing the sum over zeros into intervals $|\gamma_q - \gamma_r| \le T_0$, where $T_0$
satisfies (11). This gives
(12) $R \lll (1 + T/T_0) N^{2-2\alpha} \lll N^{2-2\alpha} + N^{4-6\alpha} T$.
(iii) Iterating the Halász lemma (6). This is the subject of my current research.
To obtain a useful result one chooses $b(n)$ so that the sum (7) is related to the
approximate functional equation for $\zeta(s)$, and can be transformed into a sum of
length $O(T/N)$.
The calculations to obtain (3) for $\alpha \ge 3/4$. We take
$Y = \min(T^{1/2}, T^{3/(6\alpha-2)})$.
The sum in (5) has been divided into ranges $\tfrac{1}{2}N < m \le N$. We square the longest
ranges, cube the next longest, and so on, before applying (12). When $N^{(k+1)(2-2\alpha)} =
N^{k(4-6\alpha)} T$, then we change from the kth to the (k + 1)st power. This gives
$R \lll T^{3(1-\alpha)/(3\alpha-1)}$
when we add up the various inequalities given by (12). The graph of $3/(3\alpha - 1)$
[Figure 2] cuts that of $3/(2 - \alpha)$ at (3/4, 12/5), and so (3) is established for $\lambda > 12/5$,
and (1) for $\delta > 7/12$.
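The arithmetic behind the final exponents is short enough to check exactly. The sketch below, using exact rational arithmetic and hypothetical function names, evaluates the two exponent curves at their crossing point $\alpha = 3/4$ (which solves $2 - \alpha = 3\alpha - 1$) and converts the resulting $\lambda$ into $\delta = 1 - 1/\lambda$.

```python
from fractions import Fraction


def mean_value_exponent(alpha):
    """The exponent 3/(2 - alpha) coming from the mean value method."""
    return Fraction(3) / (2 - alpha)


def halasz_exponent(alpha):
    """The exponent 3/(3*alpha - 1) coming from Halasz's lemma with (12)."""
    return Fraction(3) / (3 * alpha - 1)


def gap_exponent(lam):
    """The limiting exponent delta = 1 - 1/lambda in (1)."""
    return 1 - 1 / Fraction(lam)


if __name__ == "__main__":
    alpha = Fraction(3, 4)  # solves 2 - alpha = 3*alpha - 1
    lam = mean_value_exponent(alpha)
    print(lam, halasz_exponent(alpha), gap_exponent(lam))
```

The same function gives $\delta = 1/2$ for $\lambda = 2$, the value that would follow from the Riemann hypothesis.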
[Figure 2: graph of $\lambda$ against $\alpha$, with $\lambda$ marked at 3/2, 2, 12/5 and $\alpha$ marked at 1/2, 2/3, 3/4.]
There is a sense in which $\alpha = 3/4$ is the best possible intersection of the two
curves. At $\alpha = 3/4$, (11) is essentially $T = N$, and the bounds for R derived from
mean value arguments and from Halász's lemma agree. For $T < N$ the mean
value method is more powerful.
References
M. N. Huxley, On the difference between consecutive primes, Invent. Math. 15 (1972), 164-170.
H. L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227,
Springer-Verlag, Berlin and New York, 1971.
University College
Cardiff, Wales
ON THE MERTENS CONJECTURE AND
RELATED GENERAL Ω-THEOREMS¹
W. B. JURKAT
In this paper we consider the functions
$M(x) = \sum_{n \le x} \mu(n)$ ($\mu(n)$ Möbius function),
$\psi(x) = \sum_{n \le x} \Lambda(n) = \sum_{p^m \le x} \log p$, and
$D(x) = \sum_{n \le x} d(n)$ ($d(n)$ number of divisors),
where $x > 0$. First we indicate our present knowledge concerning the Mertens
conjecture in terms of
$M^+ = \limsup_{x \to \infty} M(x)/x^{1/2}$, $M^- = \liminf_{x \to \infty} M(x)/x^{1/2}$.
Then we discuss Ω-theorems for
$R(x) = x - \psi(x)$, $\Delta(x) = D(x) - \{x \log x + (2\gamma - 1)x\}$
or better for the reduced remainders
$R(x)x^{-1/2}$, $\Delta(x)x^{-1/4}$, $(M(x)x^{-1/2})$.
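To make the reduced remainder $M(x)x^{-1/2}$ concrete, the following sketch computes $\mu(n)$ by a linear sieve, forms $M(x)$, and reports the largest and smallest values of $M(x)/x^{1/2}$ over a finite range. The function names and the range are assumptions made for illustration; such finite extremes give no information about $M^+$ or $M^-$ themselves.

```python
def mobius_upto(n):
    """Return a list mu with mu[k] the Moebius function of k for 1 <= k <= n."""
    mu = [1] * (n + 1)
    mu[0] = 0
    composite = [False] * (n + 1)
    primes = []
    for i in range(2, n + 1):
        if not composite[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > n:
                break
            composite[i * p] = True
            if i % p == 0:
                mu[i * p] = 0  # p^2 divides i * p
                break
            mu[i * p] = -mu[i]
    return mu


def mertens_upto(n):
    """Return a list M with M[x] = sum of mu(k) for k <= x, 0 <= x <= n."""
    mu = mobius_upto(n)
    total = 0
    values = [0] * (n + 1)
    for k in range(1, n + 1):
        total += mu[k]
        values[k] = total
    return values


if __name__ == "__main__":
    limit = 10**5
    M = mertens_upto(limit)
    ratios = [M[x] / x ** 0.5 for x in range(1, limit + 1)]
    print(max(ratios), min(ratios))
```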
Previous results by Littlewood [10], Hardy [3], Ingham [6], [7], Haselgrove
[4], etc. made use of some kind of almost periodicity. This concept will be clarified
and leads to "almost periodic functions in a distributional sense", in short, APD-
AMS 1970 subject classifications. Primary 10K30; Secondary 10K35.
1 This research was partially supported by the National Science Foundation.
© 1973. American Mathematical Society
functions, denoted generically by $f$. The well-known explicit formulas for the
various reduced remainders $r(x)$ take (under suitable assumptions) the form
\[ r(x) = f(t(x)) + o(1) \quad \text{as } x \to \infty, \tag{1} \]
where $t(x) = \log x$ or $t(x) = x^{1/2}$, respectively. There is a general theorem which
permits us to transfer the behavior of $f(t)$ at a finite point like $t = 0$ to very large
values of $t$. Our improvements of the known results consist of $\Omega$-theorems for the
averages
\[ f(t, \delta) = \frac{1}{\delta} \int_{t}^{t+\delta} f(\tau)\, d\tau \quad (\delta > 0). \]
There are several reasons to believe that the $\Omega$-theorems are sharper than the
corresponding $O$-theorems. Indeed, I consider it likely that
\[ f(t) = O\big((\log t)^{K}\big) \tag{2} \]
should hold true often. At the St. Louis conference, H. L. Montgomery remarked
that he believed that all remainders like $R(x)$, $\Delta(x)$, etc. should be $O(h(x)^{1/2} x^{\varepsilon})$,
where $h(x)$ denotes the distance from $x$ to the nearest zero (or change of sign) of
the remainder. This puts the order of $h(x)$ roughly at $1/t'(x)$ and, hence, the
expected order for the remainder at $(t'(x))^{-1/2}$. The reduced remainder should then
equal the actual remainder divided by its expected order. If the corresponding
explicit formula confirms this calculation, I would expect as final estimate for
the remainder
\[ O\big((t'(x))^{-1/2} (\log t(x))^{K}\big), \]
which specifies the $\varepsilon$ in Montgomery's conjecture.
Since the conference I have been able to obtain several results concerning
averages like
\[ \int_{t}^{t+\delta} |f(\tau)|^{2}\, d\tau. \]
This will answer questions raised by E. Bombieri at the conference and will throw
some light on the possible values of K in (2).
1. Special results. For every $x > 0$ we have (without any assumption)
\[ M^{-} \le \big(M(x) + 2M^{*}(1/x)\big)\, x^{-1/2} \le M^{+}, \tag{3} \]
where
\[ M^{*}(x) = 1 + \sum_{n=1}^{\infty} \frac{(-1)^{n} (2\pi x)^{2n}}{2n\,(2n)!\,\zeta(2n+1)}. \]
This was announced in [8] and will be proved in §2. Letting $x \to 1 \pm 0$ we obtain,
in particular,
\[ M^{-} \le 2M^{*}(1) < 1 + 2M^{*}(1) \le M^{+} \tag{4} \]
and an easy computation gives $2M^{*}(1) = -.505\ldots$.
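The "easy computation" can be reproduced from the series for $M^{*}$ in a few lines. This is a minimal sketch, not the author's procedure: the helper `zeta` (a partial sum with an Euler-Maclaurin tail) and the truncation at 40 terms are assumptions made here for illustration.

```python
import math

def zeta(s, n_terms=60):
    """Riemann zeta for real s > 1: partial sum plus an Euler-Maclaurin tail."""
    head = sum(k ** -s for k in range(1, n_terms))
    n = float(n_terms)
    tail = n ** (1 - s) / (s - 1) + 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail

def m_star(x, n_max=40):
    """Partial sum of 1 + sum_{n>=1} (-1)^n (2 pi x)^(2n) / (2n (2n)! zeta(2n+1))."""
    total = 1.0
    for n in range(1, n_max + 1):
        term = (2 * math.pi * x) ** (2 * n) / (
            2 * n * math.factorial(2 * n) * zeta(2 * n + 1)
        )
        total += -term if n % 2 else term
    return total

two_m_star_1 = 2 * m_star(1.0)
print(round(two_m_star_1, 4), round(1 + two_m_star_1, 4))
```

The individual terms reach size about 15 before the factorial takes over, so the final value comes from substantial cancellation; double precision is still ample at $x = 1$.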
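Let $0 < \delta(x) = o(1)$ and
\[ g_{\psi}(x, \delta(x)) = \frac{1}{\delta(x)} \int_{x}^{x + x\delta(x)} R(\xi)\, \xi^{-1/2}\, t'(\xi)\, d\xi, \qquad t(\xi) = \log \xi. \]
This is an approximate average of the reduced remainder at $x$ of length $x\delta(x)$.
Under the Riemann hypothesis (RH) we take $\delta(x) = 1/\log\log x$ and obtain
\[ \limsup_{x \to \infty} \frac{g_{\psi}}{\log\log\log x} \ge \tfrac12, \qquad \liminf_{x \to \infty} \frac{g_{\psi}}{\log\log\log x} \le -\tfrac12. \tag{5} \]
In particular, we have (assuming RH)
\[ g_{\psi}(x, 1/\log\log x) = \Omega_{\pm}(\log\log\log x), \tag{6} \]
which should be compared (supposing RH) with
\[ g_{\psi}(x, 1/\log\log x) = O\big((\log\log\log x)^{2}\big). \tag{7} \]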
Obviously, (5) implies the corresponding result without averages which is due to
Littlewood [10]. It is remarkable, however, that even an average of length
x/log logx can become so large, and (7) shows that not much more could be said.
By contrast, the RH implies (according to Koch [9])
\[ R(x)\,x^{-1/2} = O\big((\log x)^{2}\big), \]
which leaves a much larger gap (at present). Probably, the RH can be removed for
the $\Omega$-results; but this is not too interesting since the zeros of $\zeta$ off the critical line
produce much larger oscillations of $R(x)\,x^{-1/2}$ anyway.
Next, let $0 < \delta(x) = o(1)$ and
\[ g_{D}(x, \delta(x)) = \frac{1}{\delta(x)} \int_{x}^{x + 2x^{1/2}\delta(x)} \Delta(\xi)\, \xi^{-1/4}\, t'(\xi)\, d\xi, \qquad t(\xi) = \xi^{1/2}. \]
This is an approximate average of the reduced remainder at $x$ of length $2x^{1/2}\delta(x)$.
Without any assumption and taking $\delta(x) = (\log x)^{-1/2}$, we obtain
\[ g_{D}\big(x, (\log x)^{-1/2}\big) = \Omega_{+}\big((\log x)^{1/4} \log\log x\big), \tag{8} \]
\[ g_{D}\big(x, (\log x)^{-1/2}\big) = O\big((\log x)^{1/4} \log\log x\big). \tag{9} \]
Obviously, (8) implies the corresponding result without averages which is due to
Hardy [3]. It is remarkable, again, that even an average of length $2(x/\log x)^{1/2}$
can become so large, and (9) shows that nothing more can be said. The best known
estimates for $\Delta(x)\,x^{-1/4}$ still leave a considerable gap (at present).
There are similar $\Omega$- and $O$-results for practically every $\delta(x)$. In (6) and (8) we
selected $\delta(x)$ such that first, the right side is particularly large and second, $\delta(x)$
is particularly large also. The weights which are used in the averages equal $t'(\xi)$
and can, probably, be replaced by other weights as well.
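2. Proof of (3). Let us discuss the lower estimate first. We may assume
$M^{-} > -\infty$ and, hence,
\[ M(x) \ge -K x^{1/2} \quad (\text{all } x > 0) \tag{10} \]
for some constant $K > 0$. In a standard manner
\[ \frac{1}{s\,\zeta(s)} - \frac{K}{1/2 - s} = \int_{1}^{\infty} \frac{M(x) + K x^{1/2}}{x^{s+1}}\, dx \quad (\operatorname{Re} s > 1) \]
extends according to Landau to $\operatorname{Re} s > \tfrac12$. Thus the RH follows and we have, in
particular,
\[ \left| \frac{1}{s\,\zeta(s)} - \frac{K}{1/2 - s} \right| \le \frac{1}{\sigma\,\zeta(\sigma)} - \frac{K}{1/2 - \sigma} \quad (s = \sigma + it,\ \sigma > \tfrac12). \]
Letting $\sigma \to \tfrac12 + 0$ and $t = \gamma$ (nontrivial zero $\rho = \tfrac12 + i\gamma$) it follows that $\rho$ is simple and
\[ \left| \frac{1}{\rho\,\zeta'(\rho)} \right| \le K \quad (\text{all } \rho). \tag{11} \]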
Similarly, if $M^{+} < +\infty$, we derive again the RH, the simplicity of the zeros, and
(11). Thus, these assumptions can be made without loss of generality. Then,
according to Titchmarsh [15, p. 318],
\[ \big(M(x) + 2M^{*}(1/x)\big)\, x^{-1/2} = \sum_{\rho} \frac{x^{i\gamma}}{\rho\,\zeta'(\rho)} \tag{12} \]
for $0 < x \ne$ integer, where the sum converges boundedly (locally) after grouping
the terms into suitable blocks. (This holds true also for $0 < x < 1$.)
Let us define $f(t)$ for real $t$ by
\[ f(\log x) = \big(M(x) + 2M^{*}(1/x)\big)\, x^{-1/2} \quad (\text{all } x > 0), \]
and observe that $f(t)$ is real-valued, locally integrable and, in fact, locally of
bounded variation. Further, notice that we have an expansion of the form
\[ f(t) \sim \sum_{n=1}^{\infty} \operatorname{Re}\{ a_n e^{i\lambda_n t} \}, \tag{13} \]
where $0 < \lambda_n \nearrow \infty$ and $a_n$ are complex constants such that
\[ \sum_{n=1}^{\infty} |a_n| / \lambda_n^{k} < \infty \quad (k \text{ some integer} \ge 0). \tag{14} \]
It is important to note that the series in (13), as derived from (12), converges in the
distributional sense, i.e., if we multiply (13) by smooth functions with compact
support, we may integrate term by term. If $k$ in (14) were zero, $f(t)$ would be
almost periodic in the sense of H. Bohr. If $k > 0$ at least a $k$-fold integral of $f$ is
almost periodic, i.e., $f$ is the $k$th derivative of an almost periodic function.
Functions of that kind we call "almost periodic in a distributional sense", in short
APD-functions, and we reserve the right to add further conditions and still use
the same terminology when it is convenient.
Given an APD-function $f(t)$ we define $F^{+} = \limsup^{*}_{t \to +\infty} f(t)$, $F^{-} = \liminf^{*}_{t \to +\infty} f(t)$,
where the $*$ indicates that $t$ should be restricted to Lebesgue
points of $f(t)$. In this generality we prove the following result.
Proposition. For any L-point $t$ of an APD-function
\[ F^{-} \le f(t) \le F^{+}. \tag{15} \]
Proof. If $k = 0$ then $f(t)$ is continuous, almost periodic, and hence every value
of $f(t)$ will also be a limit value for $t \to +\infty$. If $k > 0$ we consider
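\[ F_{\delta}(t) = \frac{1}{\delta^{k}} \int_{0}^{\delta} \cdots \int_{0}^{\delta} f(t + \tau_1 + \cdots + \tau_k)\, d\tau_1 \cdots d\tau_k \quad (\delta > 0) \]
\[ \phantom{F_{\delta}(t)} = \sum_{n=1}^{\infty} \operatorname{Re}\left\{ a_n e^{i\lambda_n t} \left( \frac{e^{i\lambda_n \delta} - 1}{i\lambda_n \delta} \right)^{k} \right\}. \]
Obviously, $F_{\delta}(t)$ is continuous, almost periodic, and
\[ \limsup_{t \to +\infty} F_{\delta}(t) \le F^{+}, \qquad \liminf_{t \to +\infty} F_{\delta}(t) \ge F^{-}. \]
Hence,
\[ F^{-} \le F_{\delta}(t) \le F^{+} \quad (\text{all } t). \]
Letting $\delta \to +0$ we obtain (15). □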
In our case we have $k = 2$ and
\[ M(x)/x^{1/2} = f(\log x) + o(1) \quad (x \to \infty), \tag{16} \]
so that $M^{\pm} = F^{\pm}$. Thus (3) becomes a special case of (15).
3. Discussion of $M^{\pm}$. The Mertens conjecture $|M(x)\,x^{-1/2}| \le 1$ ($x > 1$) is still
open at present. Sterneck [14] conjectured even $-\tfrac12 x^{1/2} \le M(x) \le \tfrac12 x^{1/2}$ (except
near $x = 200$) and supported this by calculations for $2 \le x \le 5 \cdot 10^{5}$ and by tests up
to $x = 5 \cdot 10^{6}$. Our inequality (4) shows that the lower estimate will be false again
and again. Neubauer [11] calculated $M(x)$ up to $x = 10^{8}$, still confirming Sterneck's
conjecture. Making tests up to $x = 10^{10}$ he finally found values of $x$ near
$7.7 \cdot 10^{9}$ where $M(x)\,x^{-1/2} > \tfrac12$ but none with $M(x)\,x^{-1/2} < -\tfrac12$. It would be
interesting to know where the lower inequality actually becomes wrong.
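For small ranges the kind of calculation described here is easy to repeat. The following sketch tabulates $M(x)$ with a linear sieve for the Möbius function and lists the integers where $|M(x)| \ge \tfrac12 x^{1/2}$; the limit $10^{5}$ and all names are choices made for this illustration, far below the ranges quoted above.

```python
def mertens_table(limit):
    """Return a list M with M[x] = sum of mu(n) for n <= x, using a linear sieve."""
    mu = [0] * (limit + 1)
    mu[1] = 1
    primes = []
    is_composite = [False] * (limit + 1)
    for n in range(2, limit + 1):
        if not is_composite[n]:
            primes.append(n)
            mu[n] = -1
        for p in primes:
            if n * p > limit:
                break
            is_composite[n * p] = True
            if n % p == 0:
                mu[n * p] = 0      # p^2 divides n*p
                break
            mu[n * p] = -mu[n]     # one more distinct prime factor
    table = [0] * (limit + 1)
    for n in range(1, limit + 1):
        table[n] = table[n - 1] + mu[n]
    return table

LIMIT = 100_000
M = mertens_table(LIMIT)
# |M(x)| >= sqrt(x)/2 is equivalent to 4*M(x)^2 >= x, an exact integer test.
exceptions = [x for x in range(1, LIMIT + 1) if 4 * M[x] * M[x] >= x]
print(max(exceptions), M[200])
```

The exact integer comparison avoids any floating-point doubt at the boundary; the exceptions it finds in this range all lie at or below $x = 200$, in line with the remark "except near $x = 200$".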
In (3) the calculation of any value of $f(\log x)$ has a consequence for $M^{\pm}$. The
calculations mentioned above show that $-\tfrac12 \le f(\log x) \le \tfrac12$ holds for $1 \le x \le 10^{8}$
with one of the biggest values at $x = 1 + 0$. But one could consider $x < 1$ as well.
For such calculations it is useful to note that
\[ M^{*}(x) = \int_{0}^{1} \cos(2\pi x \tau)\, m\!\left(\frac{1}{\tau}\right) \frac{d\tau}{\tau} \quad (x > 0), \tag{17} \]
where
\[ m(x) = \sum_{n \le x} \mu(n)/n \quad (x > 0). \]
Using (17) and associated approximation formulas, $f(\log x)$ was calculated for me
at the Hahn-Meitner Institut in Berlin for the range $10^{-3} \le x < 1$. It turned out
that x= 1 — 0 gave the most interesting value. Thus, for all the x values considered,
inequality (4) contains nearly the best information. On the other hand, the more
complicated structure of $M^{*}(x)$ suggests that further computations of $f(\log x)$
for smaller values of x could prove to be more successful.
Another approach would be to follow Ingham [7] and Haselgrove [4] in
computing the Fejér mean of (13), i.e., in our case
\[ C_T^{*}(t) = \sum_{|\gamma| < T} \left(1 - \frac{|\gamma|}{T}\right) \frac{e^{i\gamma t}}{\rho\,\zeta'(\rho)} \quad (T > 0,\ t \text{ real}). \]
The inequality corresponding to (3) is
\[ M^{-} \le C_T^{*}(t) \le M^{+} \quad (T > 0,\ t \text{ real}). \tag{18} \]
This is slightly weaker than (3) since $C_T^{*}(t)$ is an average of $f(t)$ (involving Fejér's
kernel), but has the advantage that now much larger values of $t$ can be considered.
Recent computations by Spira [13] for $T = 10^{3}$ gave values $+.535$ and $-.602$
near $t = 98$ resp. $t = 854$. These seem to give the best estimates of $M^{\pm}$ at present.
Here, also, one should make calculations for negative $t$'s.
There is one consequence of (18) which is worth mentioning. Let $M = \max(M^{+}, -M^{-})$.
Using Parseval's equation for
\[ \lim_{U \to \infty} \frac{1}{U} \int_{0}^{U} |C_T^{*}(u)|^{2}\, du, \]
and letting $T \to +\infty$, we obtain
\[ \sum_{\rho} \left| \frac{1}{\rho\,\zeta'(\rho)} \right|^{2} \le M^{2}. \tag{19} \]
This is one quantitative form of Titchmarsh [15, p. 323] and could, perhaps, be
used in connection with disproving the Mertens conjecture. It can be viewed as an
improvement of (11) under a two-sided assumption and implies, in particular,
$k = 1$ or
\[ \sum_{n=1}^{\infty} |a_n| / \lambda_n < \infty. \]
There have been other attempts to disprove the Mertens conjecture based upon
the possible linear independence of the $\gamma$'s. We mention, e.g., the following papers:
Ingham [7], Bateman et al. [1], Diamond [2].
4. Other APD-functions. Let us define $f(t)$ for real $t = \log x$ by
\[ f(t)\,x^{1/2} = x - \psi(x) - \log 2\pi - \tfrac12 \log(1 - 1/x^{2}) \quad \text{for } x > 1, \]
\[ f(t)\,x^{1/2} = \sum_{n \le 1/x} \frac{\Lambda(n)}{n} - \log\frac{1}{x} + \gamma - \tfrac12 \log\frac{1+x}{1-x} \quad \text{for } 0 < x < 1. \]
Observe that $f(t)$ is real-valued, locally integrable and, in fact, locally of bounded
variation except for $t = 0$, where
\[ f(t) = \tfrac12 \log(1/t) + O(1) \ \text{ as } t \to +0, \qquad f(t) = -\tfrac12 \log(1/|t|) + O(1) \ \text{ as } t \to -0. \tag{20} \]
It is well known (under the RH), cf. Ingham [5, p. 77], that
\[ f(t) = \sum_{\rho} e^{i\gamma t}/\rho \quad (x^{\pm 1} \ne \text{integer}), \tag{21} \]
where the convergence is actually dominated by a locally integrable function.
Hence the convergence is distributional, and $f(t)$ is an APD-function with $k = 1$.
By definition
\[ R(x)/x^{1/2} = f(\log x) + o(1) \quad (x \to \infty) \tag{22} \]
and $F^{-} = -\infty$, $F^{+} = +\infty$ in view of (15), (20). This is one qualitative form of
Littlewood's $\Omega$-result. A comparison between the problems for $M(x)$ and $\psi(x)$ is
natural: In the previous case $f(t)$ had a jump discontinuity at $t = 0$, and (4) was a
natural consequence of that observation. Now $f(t)$ makes an infinite jump at
$t = 0$, which results in $R(x)/x^{1/2}$ being unbounded on both sides. In such cases it is
desirable to give a quantitative formulation of our Proposition.
Let us discuss $D(x)$ also. According to Hardy [3] we may define $f(t)$ for real $t$
by
\[ f(t) = \sum_{n=1}^{\infty} \operatorname{Re}\left\{ \frac{e^{-\pi i/4}}{\pi\sqrt{2}}\, \frac{d(n)}{n^{3/4}} \exp\big(4\pi i\, n^{1/2} t\big) \right\} \quad (t^{2} \ne \text{integer}), \tag{23} \]
where the convergence is actually dominated by a locally integrable function.
Hence the convergence is distributional, and $f(t)$ is an APD-function with $k = 1$.
By Voronoi's explicit formula for $\Delta(x)$, cf. Hardy [3], we see that
\[ \Delta(x)\,x^{-1/4} = f(x^{1/2}) + o(1), \quad x \to \infty. \tag{24} \]
Notice that the almost periodic behavior (24) follows without any assumptions
and that distributional convergence in (23) can be obtained more easily than
ordinary convergence by discussing explicit formulas in integrated form. (It is one of
the advantages of distributional convergence that we can use formulas like (23)
without ever proving actual convergence or other unnecessary properties.) It is
elementary to prove that
\[ \frac{1}{\delta} \int_{0}^{\delta} f(\tau)\, d\tau = (C + o(1))\, (1/\delta)^{1/2} \log(1/\delta), \quad C > 0,\ \delta \to +0, \tag{25} \]
which implies $F^{+} = +\infty$ in connection with (15). This is one qualitative form of
Hardy's $\Omega$-result.
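Being general for a moment, it turns out that the functional equation for the
$\zeta$-function is responsible (in the end) for quite a number of explicit formulas which
lead to corresponding APD-functions. We shall discuss further examples on a later
occasion. Let us observe that the examples of this section have the following
additional properties (besides $k = 1$):
\[ \lambda_n = (c + o(1))\, n^{\alpha_1} (\log n)^{\beta_1} \quad \text{as } n \to \infty, \tag{26} \]
where $c > 0$, $\alpha_1 > 0$, $\beta_1$ real;
\[ \sum_{n \ge N} |a_n| / \lambda_n = O(N^{\alpha_2}) (\log N)^{\beta_2}, \quad N \ge 2, \tag{27} \]
where $\alpha_2 < 0$, $\beta_2$ real;
\[ \alpha_1 + \alpha_2 \ge 0, \qquad \beta_1 + \beta_2 \ge 0; \tag{28} \]
\[ \frac{1}{\delta} \int_{0}^{\delta} f(\tau)\, d\tau = (C + o(1)) \left(\frac{1}{\delta}\right)^{\alpha_3} \left(\log\frac{1}{\delta}\right)^{\beta_3}, \quad \delta \to +0, \tag{29} \]
where $C > 0$, $\alpha_3 = 1 + \alpha_2/\alpha_1$, $\beta_3 = \beta_2 - \beta_1\alpha_2/\alpha_1$. In the case of $\psi(x)$,
\[ \alpha_1 = 1, \quad \beta_1 = -1, \quad \alpha_2 = -1, \quad \beta_2 = 2, \quad \alpha_3 = 0, \quad \beta_3 = 1, \quad C = \tfrac12; \]
and in the case of $D(x)$,
\[ \alpha_1 = \tfrac12, \quad \beta_1 = 0, \quad \alpha_2 = -\tfrac14, \quad \beta_2 = 1, \quad \alpha_3 = \tfrac12, \quad \beta_3 = 1. \]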
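The two parameter lists can be checked against the relations $\alpha_3 = 1 + \alpha_2/\alpha_1$ and $\beta_3 = \beta_2 - \beta_1\alpha_2/\alpha_1$ with exact rational arithmetic. This is a small consistency sketch; the function name is hypothetical, and the input values are the ones read from the two lists above.

```python
from fractions import Fraction as Fr

def derived_exponents(alpha1, beta1, alpha2, beta2):
    """Return (alpha3, beta3) with alpha3 = 1 + alpha2/alpha1, beta3 = beta2 - beta1*alpha2/alpha1."""
    ratio = alpha2 / alpha1
    return 1 + ratio, beta2 - beta1 * ratio

cases = {
    "psi": (Fr(1), Fr(-1), Fr(-1), Fr(2)),
    "D": (Fr(1, 2), Fr(0), Fr(-1, 4), Fr(1)),
}
results = {}
for name, (a1, b1, a2, b2) in cases.items():
    # Condition (28): alpha1 + alpha2 >= 0 and beta1 + beta2 >= 0.
    assert a1 + a2 >= 0 and b1 + b2 >= 0
    results[name] = derived_exponents(a1, b1, a2, b2)
    print(name, *results[name])
```

Both cases satisfy (28), and the derived pairs agree with the values of $\alpha_3$, $\beta_3$ listed for $\psi(x)$ and $D(x)$.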
For convenience in language we shall incorporate these conditions into our general
concept of APD-functions.
5. General results. The procedure of Littlewood and Hardy can be interpreted
as a discussion of their functions $f(t)$ through the associated Dirichlet series
\[ \sum_{n=1}^{\infty} a_n e^{-\lambda_n s} \quad (s = \sigma + it,\ \sigma > 0) \]
and its boundary behavior, in particular, at $s = 0$. Skewes [12] and Ingham [6]
noticed already that the proofs simplify if one uses directly $f(t)$, i.e. the explicit
formula, and properties like (20). Bohr's almost periodicity for $\sigma > 0$ produces
APD-functions along the boundary $\sigma = 0$ (at least if one widens the concept). It
is natural and possible to extend the $\Omega$-results to such APD-functions in
considerable generality. A careful analysis of our simplified proofs shows that even in the
previously known cases the conclusion can be strengthened to give $\Omega$-theorems
for the averages $f(t, \delta) = (1/\delta) \int_{t}^{t+\delta} f(\tau)\, d\tau$. These results become better if $\delta = \delta(t)$
is not too small. This suggests that one should discuss the order of $f(t, \delta(t))$ for
various choices of $\delta(t)$.
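In the following I shall describe some of the results I have obtained so far for
our APD-functions (as characterized in the previous section). We have uniformly
in $\delta > 0$ and $t$ (real)
\[ f(t, \delta) = O(1/\delta) \quad \text{for } \delta \ge \tfrac12, \]
\[ f(t, \delta) = O\big((1/\delta)^{\alpha_3} (\log(1/\delta))^{\beta_3}\big) \quad \text{for } 0 < \delta \le \tfrac12 \ (\text{if } \alpha_3 > 0), \tag{30} \]
\[ f(t, \delta) = O\big((1/\delta)^{\alpha_3} (\log(1/\delta))^{\beta_3 + 1}\big) \quad \text{for } 0 < \delta \le \tfrac12 \ (\text{if } \alpha_3 = 0). \]
Here we may take $\delta = \delta(t) > 0$, but otherwise arbitrary. If $\delta(t) = \Omega_{+}(1)$ the estimate
cannot be improved, but it is not too interesting with regard to $f(t)$. For the
remaining functions $0 < \delta(t) = o(1)$, there is a critical lower limit
\[ \delta_{*}(t) = (\log t)^{-\alpha_1} (\log\log t)^{-\beta_1}, \quad t \ge t_0 \ (\text{if } \alpha_3 > 0), \]
\[ \delta_{*}(t) = (\log t)^{-\alpha_1} (\log\log t)^{-\beta_1} (\log\log\log t)^{\alpha_1}, \quad t \ge t_0 \ (\text{if } \alpha_3 = 0). \]
Under the assumptions indicated the following result holds.
Theorem. If $\delta(t)/\delta_{*}(t) \to \infty$ ($t \to +\infty$) then, independently of $\delta(t)$,
\[ \limsup_{t \to +\infty} \frac{f(t, \delta(t))}{(1/\delta(t))^{\alpha_3} (\log(1/\delta(t)))^{\beta_3}} \ge C. \tag{31} \]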
If $\delta(t)$ is of the order of $\delta_{*}(t)$ or smaller we can still prove (under a slight
regularity condition for $\delta(t)$)
\[ f(t, \delta(t)) = \Omega_{+}\big( (1/\delta_{*}(t))^{\alpha_3} (\log(1/\delta_{*}(t)))^{\beta_3} \big), \quad t \to +\infty. \tag{32} \]
Here the constant in $\Omega_{+}$ is not specified unless $\alpha_3 = 0$, and the right side can be
evaluated by
\[ (1/\delta_{*}(t))^{\alpha_3} (\log(1/\delta_{*}(t)))^{\beta_3} = (\alpha_1^{\beta_3} + o(1)) (\log t)^{\alpha_1 + \alpha_2} (\log\log t)^{\beta_1 + \beta_2}, \quad t \to +\infty. \]
In particular, if $t$ is restricted to L-points of $f(t)$,
\[ f(t) = \Omega_{+}\big( (\log t)^{\alpha_1 + \alpha_2} (\log\log t)^{\beta_1 + \beta_2} \big), \quad t \to +\infty. \tag{33} \]
We remark that (31) gives the exact order if $\alpha_3 > 0$, and leaves only a factor
$\log(1/\delta(t))$ open if $\alpha_3 = 0$. Since $\alpha_3$ and $\beta_3$ may be zero, (31) can be viewed as a
quantitative formulation of our Proposition. If $\delta(t)$ is roughly of order $\delta_{*}(t)$, then
(32) will give an answer which is relatively close to the truth. For $\delta(t)$ much smaller
than $\delta_{*}(t)$, the order problem is still wide open. In that case (30) should be improved
while (32) may still be not too far off. If $\delta(t)$ is very small, the order problem for
$f(t, \delta(t))$ is the same as for $f(t)$.
The detailed proofs will be published elsewhere.
References
1. P. T. Bateman, J. W. Brown, R. S. Hall, K. E. Kloss and R. M. Stemmler, Linear relations
connecting the imaginary parts of the zeros of the zeta function, Proc. Atlas Sympos., Computers in Number
Theory.
2. H. G. Diamond, Two oscillation theorems, The theory of arithmetic functions, Lecture Notes in
Math., vol. 251, Springer-Verlag, Berlin and New York, 1972, pp. 113-118.
3. G. H. Hardy, On Dirichlet's divisor problem, Proc. London Math. Soc. 15 (1916), 1-25.
4. C. B. Haselgrove, A disproof of a conjecture of Pólya, Mathematika 5 (1958), 141-145. MR 21
#3391.
5. A. E. Ingham, The distribution of prime numbers, Cambridge Tracts in Math. and Math. Phys.,
no. 30, Cambridge Univ. Press, Cambridge, 1932.
6. A. E. Ingham, A note on the distribution of primes, Acta Arith. 1 (1936), 201-211.
7. A. E. Ingham, On two conjectures in the theory of numbers, Amer. J. Math. 64 (1942), 313-319. MR 3,
271.
8. W. B. Jurkat, Eine Bemerkung zur Vermutung von Mertens, Nachr. Österreich. Math. Ges.,
Sondernr. V. Österr. Math.-Kongress 1960, (Wien 1961), p. 11.
9. H. von Koch, Sur la distribution des nombres premiers, Acta Math. 24 (1901), 159-182.
10. J. E. Littlewood, Sur la distribution des nombres premiers, Comptes Rendus 158 (1914),
1869-1872.
11. G. Neubauer, Eine empirische Untersuchung zur Mertensschen Funktion, Numer. Math. 5
(1963), 1-13. MR 27 #5721.
12. S. Skewes, On the difference $\pi(x) - \operatorname{li} x$. I, J. London Math. Soc. 8 (1933), 277-283.
13. R. Spira, Zeros of sections of the zeta function. II, Math. Comp. 22 (1968), 163-173. MR 37
#4036.
14. R. D. v. Sterneck, Empirische Untersuchung über den Verlauf der zahlentheoretischen Funktion
$\sigma(n) = \sum_{x=1}^{n} \mu(x)$ im Intervalle 150000 bis 500000, S.-B. Akad. Wiss. Wien Math. Nat. Kl. IIa 110
(1901), 1053-1102.
15. E. C. Titchmarsh, The theory of the Riemann zeta-function, Clarendon Press, Oxford, 1951.
MR 13, 741.
Syracuse University
THE DISTRIBUTION OF THE VALUES OF
REAL QUADRATIC FORMS
AT INTEGER POINTS
D. J. LEWIS¹
1. Let $Q = Q(x)$ be a nondegenerate quadratic form with real coefficients in $n$
variables; $x = (x_1, \ldots, x_n)$. Let $\mathscr{X} = \{Q(x) \mid x \in \mathbf{Z}^{n}\}$. We are interested in the
distribution of the set $\mathscr{X}$ on the real line.
Clearly, if $Q$ is proportional to a form $F$ with rational integer coefficients, i.e. all
ratios of coefficients of $Q$ are rational, then $\mathscr{X}$ is a discrete set and the problem
reduces to a discussion of what integers are represented by the form $F$. This
problem has been thoroughly studied and we need not discuss it further here, except
to note that if $Q$ is an indefinite real form in $n \ge 5$ variables with all ratios of
coefficients rational, then $\mathscr{X}$ is a full $\mathbf{Z}$-module of rank 1 and each value in $\mathscr{X}$ is
represented infinitely often.
Unless we state otherwise, henceforth we shall assume that $Q$ is a real form not
proportional to a form with integer coefficients and hence at least one ratio of the
coefficients of $Q$ is irrational. For such forms with $n \ge 5$ one might hope to show
that $\mathscr{X}$ is in some sense dense on the line if $Q$ is indefinite, or on a ray if $Q$ is definite.
Indeed, when $Q$ is an indefinite form in $n \ge 21$ variables the set $\mathscr{X}$ is dense on the
line. This result is the culmination of theorems in a sequence of papers, written
by B. J. Birch, H. Davenport, and D. Ridout ([6], [7], [8], [11], [19]) during the
1950's (see §4 for discussion of proofs). Actually they showed for such $Q$, the set $\mathscr{X}$
contained 0 as a nonisolated accumulation point; i.e., there are nonzero values in $\mathscr{X}$
with arbitrarily small absolute value. With this information one can use a result of
A. Oppenheim [14] to demonstrate that $\mathscr{X}$ is dense on the line (see §5).
AMS 1970 subject classifications. Primary 10B45, 10C05.
1 This paper was written while the author was partially supported by a grant from the National
Science Foundation.
© 1973, American Mathematical Society
As indicated earlier, one would hope to show that $\mathscr{X}$ is dense when $Q$ contains
fewer than 21 variables. However, with the techniques presently available we seem
unable to reduce the 21 appreciably without imposing some further condition,
such as $Q$ being diagonal or additive; i.e., that $Q = \lambda_1 x_1^{2} + \cdots + \lambda_n x_n^{2}$. In 1945, H.
Davenport and H. Heilbronn [9] proved
If $Q$ is an indefinite diagonal form in $n \ge 5$ variables with real coefficients having at
least one irrational ratio, then $\mathscr{X}$ contains nonzero numbers with arbitrarily small
absolute value.
This result had been conjectured by A. Oppenheim [13] in 1929, and in 1934
it was shown to hold when $n \ge 9$ by S. Chowla [2] using results of V. Jarnik and
A. Walfisz on the number of integer points in a large ellipsoid. The Davenport-
Heilbronn proof makes use of a modified Hardy-Littlewood circle method (see
§3 below).
The basic reason why, for general indefinite forms, 21 variables are presently
needed to ensure that $\mathscr{X}$ is dense on the line lies in the method of proof. The proof
uses several modifications of the Hardy-Littlewood method —a method that is
seldom efficient. Also, the proof uses a modification of a diagonalization process
introduced by R. Brauer [1] which is also quite wasteful. In this case, under certain
additional hypotheses, one demonstrates that an indefinite form can be
transformed by an integral matrix to the sum of an indefinite diagonal form in a much
smaller number of variables (about $\tfrac14$ the number in the original form) and a form
whose coefficients are extremely small. One then shows the diagonal form (in 5
variables if $n \ge 21$) has a small value for an integral point with a modest sized norm
(a quantified form of the Davenport-Heilbronn theorem). It then follows that the
second form also has small value for this point and hence the result.
Finally it should be noted that even in the case of indefinite diagonal forms we
do not know for sure that some smaller number of variables, say 3, would not
suffice to ensure the denseness of $\mathscr{X}$.
2. Now let us examine the situation for positive definite quadratic forms Q.
Here the investigation is still in a primitive stage.
For positive definite forms we have
\[ \|x\|^{2} \ll Q(x) \ll \|x\|^{2}, \]
where $\|x\| = \max |x_j|$. Here we use $U \ll V$ to mean $U <$ constant $\cdot\, V$, where the
constant is independent of $x$. Hence we cannot expect $\mathscr{X}$ to be dense on the line,
or any part of the line. However, we might hope to show that for each $\varepsilon > 0$ there is
an $X(\varepsilon)$ such that all intervals with left-hand end point to the right of $X(\varepsilon)$ and of
length $\varepsilon$ contain a point of $\mathscr{X}$. Such is the case for diagonal positive definite real
forms, with an irrational ratio, in $n \ge 5$ variables, as we now show.
Suppose $\lambda_1, \ldots, \lambda_n$ ($n \ge 5$) are positive real numbers with at least one ratio
irrational. Let $N(X)$ denote the number of integer solutions of the inequality
\[ \lambda_1 x_1^{2} + \cdots + \lambda_n x_n^{2} \le X. \]
Jarnik and Walfisz [12] proved
\[ N(X) = C X^{n/2} + o(X^{n/2 - 1}), \]
where $C$ = volume of the ellipsoid $\sum \lambda_i x_i^{2} \le 1$, whence $C = B_n (\lambda_1 \cdots \lambda_n)^{-1/2}$, where $B_n$
is a constant depending only on $n$. Thus for any fixed $\varepsilon > 0$, we have
\[ N(X + \varepsilon) - N(X) \sim C \cdot (n/2) \cdot \varepsilon X^{n/2 - 1} \]
as $X \to \infty$. Since this exceeds 1 when $X$ is sufficiently large, the result follows.
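The leading term $C X^{n/2}$ is easy to compare with a direct count. The sketch below is illustrative only: the coefficients $1, \sqrt2, \sqrt3, \sqrt5, \sqrt7$ and the value $X = 100$ are hypothetical choices, and $B_n$ is taken to be the volume $\pi^{n/2}/\Gamma(n/2+1)$ of the unit ball, which is what the ellipsoid-volume description of $C$ amounts to.

```python
import math

def count_points(lams, X):
    """Count integer vectors x with lams[0]*x_0^2 + ... + lams[-1]*x_last^2 <= X."""
    *outer, last = lams

    def rec(i, budget):
        if i == len(outer):
            # Last coordinate: all integers with |x| <= sqrt(budget / last).
            return 2 * math.floor(math.sqrt(budget / last)) + 1
        bound = math.floor(math.sqrt(budget / outer[i]))
        total = 0
        for x in range(-bound, bound + 1):
            # max(..., 0.0) guards against a tiny negative value from rounding.
            total += rec(i + 1, max(budget - outer[i] * x * x, 0.0))
        return total

    return rec(0, float(X))

def leading_constant(lams):
    """C = B_n * (lam_1 ... lam_n)^(-1/2), with B_n the volume of the unit n-ball."""
    n = len(lams)
    unit_ball = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    return unit_ball / math.sqrt(math.prod(lams))

# Hypothetical example coefficients: positive, with irrational ratios.
lams = (1.0, math.sqrt(2), math.sqrt(3), math.sqrt(5), math.sqrt(7))
X = 100
count = count_points(lams, X)
main_term = leading_constant(lams) * X ** (len(lams) / 2)
print(count, round(main_term), round(count / main_term, 4))
```

A count at one value of $X$ says nothing about the size of the error term, of course; it only illustrates the main term on which the gap estimate rests.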
In view of the success in handling the general indefinite form, it is reasonable to
seek to show that the preceding result holds for general positive definite forms,
when the number of variables is sufficiently large. So far no such result has been
proven. However, Davenport and Lewis, in a paper to appear [10], have attained
a partial result. They proved
There exists an absolute integer $M$ with the following property: If $Q$ is a positive
definite quadratic form with real coefficients in $n \ge M$ variables, then for each
integral point $x^{*}$ of sufficiently large norm there is an integral point $x \ne 0$ such that
\[ |Q(x + x^{*}) - Q(x^{*})| < 1. \]
Clearly, in this result one can replace 1 by any positive $\varepsilon$ on requiring that
$\|x^{*}\| > X(Q, \varepsilon)$. The theorem does not hypothesize that some ratio of the
coefficients of $Q$ is irrational. Thus if $Q$ is proportional to a form with integral
coefficients, we have $Q(x + x^{*}) = Q(x^{*})$ when $x^{*}$ is sufficiently large.
This result of Davenport and Lewis is imperfect in terms of the anticipated
result in several ways. (a) One could have $Q(x + x^{*}) = Q(x^{*})$ even when $Q$ is not
proportional to a form with integral coefficients, and indeed this will happen if $Q$
should integrally represent a form in 4 or more variables which is proportional to
a form with integral coefficients. (b) Even if (a) does not occur, the conclusion does
not rule out the possibility that the elements of $\mathscr{X}$ occur in clumps leaving
appreciable sized intervals, or gaps, without any points of $\mathscr{X}$. Finally we note that no
attempt was made to determine $M$, since the value which the proof would produce
is surely excessively large, even when certain crude estimates used are replaced by
sharper ones which could be attained with more work.
The proof (see § 6) again uses the Hardy-Littlewood circle method and in the
final analysis depends on representation properties of positive definite diagonal
forms with integral coefficients.
3. The Davenport-Heilbronn proof for indefinite additive real forms is the
prototype for all the work on this subject. The proof is quite direct, and in the final
analysis relies on consideration of the continued fraction development of one of
the irrational ratios.
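Let
\[ Q = \lambda_1 x_1^{2} + \cdots + \lambda_5 x_5^{2}, \]
where $\lambda_2/\lambda_1$ is irrational, $\lambda_1 > 0$, $\lambda_5 < 0$. Clearly, these assumptions on the $\lambda_i$ can be
made without loss of generality. Also, clearly it is sufficient to demonstrate the
existence of an integral point $y$ such that
\[ |Q(y)| < 1, \tag{1} \]
since the integral solubility of $|Q| < \varepsilon$ follows from (1) on replacing the $\lambda_i$ by $\varepsilon^{-1}\lambda_i$.
Let $e(x) = e^{2\pi i x}$. As is easily verified
\[ \int_{-\infty}^{\infty} e(\alpha t) \left( \frac{\sin \pi\alpha}{\pi\alpha} \right)^{2} d\alpha = \max\{0,\ 1 - |t|\}. \tag{2} \]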
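Identity (2) can be checked numerically. The sketch below uses a plain midpoint rule on a truncated range; the cutoff and step size are arbitrary choices made here, so the result is only accurate to roughly the size of the discarded tail (about $10^{-3}$).

```python
import math

def fejer_transform(t, cutoff=100.0, step=0.002):
    """Midpoint-rule value of the integral of e(a*t) * (sin(pi a)/(pi a))^2 over |a| < cutoff."""
    total = 0.0
    n = int(cutoff / step)
    for j in range(n):
        a = (j + 0.5) * step  # midpoints never hit a = 0
        kernel = (math.sin(math.pi * a) / (math.pi * a)) ** 2
        # The kernel is even, so the imaginary part cancels and the real part doubles.
        total += 2.0 * kernel * math.cos(2.0 * math.pi * a * t)
    return total * step

for t in (0.0, 0.5, 1.5):
    print(t, round(fejer_transform(t), 3), max(0.0, 1.0 - abs(t)))
```

The triangular shape on the right of (2) is what later turns the integral of a product of exponential sums into a weighted count of points with $|Q(y)| < 1$.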
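Let $\mathscr{P}$ consist of the squares of the denominators of the continued fraction convergents
to $\lambda_1/\lambda_2$, and let $P$ be a large integer in $\mathscr{P}$. Define
\[ S(\alpha) = \sum_{x=1}^{P} e(\alpha x^{2}), \qquad I(\alpha) = \int_{0}^{P} e(\alpha x^{2})\, dx. \]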
It follows from (2) that
\[ \int_{-\infty}^{\infty} S(\lambda_1\alpha) \cdots S(\lambda_5\alpha) \left( \frac{\sin \pi\alpha}{\pi\alpha} \right)^{2} d\alpha = \sum \{1 - |Q(y)|\}, \tag{3} \]
where the sum is over those integral points $y$ such that $1 \le y_1, \ldots, y_5 \le P$ and
$|Q(y)| < 1$. To prove that (1) has an integral solution, it suffices to show that the
integral in (3) has a positive value.
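We also have from (2) that
\[ I = \int_{-\infty}^{\infty} I(\lambda_1\alpha) \cdots I(\lambda_5\alpha) \left( \frac{\sin \pi\alpha}{\pi\alpha} \right)^{2} d\alpha = \int \cdots \int_{|Q(x)| < 1} \{1 - |Q(x)|\}\, dx, \tag{4} \]
and a straightforward integration shows
\[ I \gg P^{3}. \tag{5} \]
[Here we use $U \ll V$ to mean that $U <$ (constant) $V$, where the constant is
independent of $P$.]
Let $A = (4P \max |\lambda_j|)^{-1}$; then
\[ S(\lambda_j\alpha) = I(\lambda_j\alpha) + O(1) \quad \text{on } |\alpha| < A, \tag{6} \]
and
\[ |I(\lambda_j\alpha)| \ll \min(P,\ |\alpha|^{-1/2}) \ll |\alpha|^{-1/2} \quad \text{on } |\alpha| < A. \tag{7} \]
From (4), (5), (6), and (7), one easily deduces
\[ \int_{|\alpha| < A} S(\lambda_1\alpha) \cdots S(\lambda_5\alpha) \left( \frac{\sin \pi\alpha}{\pi\alpha} \right)^{2} d\alpha \gg P^{3}. \tag{8} \]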
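As is well known, for any $\varepsilon > 0$,
\[ \int_{0}^{1} |S(\alpha)|^{4}\, d\alpha \ll P^{2+\varepsilon}. \tag{9} \]
Hence, by use of Hölder's inequality, the trivial estimate for one of the $|S(\lambda_j\alpha)|$
and the fact that $x^{-1} \sin x \ll x^{-1}$ for large $x$, one deduces: For any fixed $\delta > 0$,
\[ \int_{|\alpha| > P^{\delta}} |S(\lambda_1\alpha) \cdots S(\lambda_5\alpha)| \left( \frac{\sin \pi\alpha}{\pi\alpha} \right)^{2} d\alpha \ll P^{3 - \delta + \varepsilon}. \tag{10} \]
Now, if we can show
\[ \int_{A < |\alpha| < P^{\delta}} |S(\lambda_1\alpha) \cdots S(\lambda_5\alpha)|\, d\alpha = o(P^{3}), \tag{11} \]
then the conclusion follows, since (8), (10), and (11) would show that the integral
in (3) is positive. It follows from Hölder's inequality and (9) that (11) holds,
provided for $\delta$ sufficiently small (say $\delta < 1/23$) we have
\[ \min\{|S(\lambda_1\alpha)|,\ |S(\lambda_2\alpha)|\} \ll P^{1 - 2\delta} \quad \text{on } A < |\alpha| < P^{\delta}. \tag{12} \]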
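Recall that $P$ was a large integer in $\mathscr{P}$, and hence there exist integers $a$, $q$
such that $(a, q) = 1$,
\[ |\lambda_1/\lambda_2 - a/q| < 1/q^{2} \quad \text{and} \quad P = q^{2}. \]
Let $\alpha$ be an element of the interval $A < |\alpha| < P^{\delta} = q^{2\delta}$; then there exist rational
approximations $a_1/q_1$, $a_2/q_2$ to $\lambda_1\alpha$ and $\lambda_2\alpha$, respectively, such that for $i = 1, 2$,
\[ (a_i, q_i) = 1, \qquad 1 \le q_i \le q^{3}, \qquad |\lambda_i\alpha - a_i/q_i| < q_i^{-1} q^{-3}. \]
We first suppose $q_1, q_2 \le q^{10\delta}$. Since $|\alpha| < q^{2\delta}$ we have $|a_i| < q^{-3} + |q_i \alpha \lambda_i| \ll q^{12\delta}$,
whence
\[ 1 \le |a_1 q_2|, \qquad |a_2 q_1| \ll q^{22\delta} \qquad \text{and} \qquad \left| \frac{\lambda_1}{\lambda_2} - \frac{a_1 q_2}{a_2 q_1} \right| \ll q^{-3}. \]
But then $a/q - a_1 q_2/(a_2 q_1) \ne 0$ and
\[ q^{-1 - 22\delta} \ll (q\,|a_2 q_1|)^{-1} \le \left| \frac{a}{q} - \frac{a_1 q_2}{a_2 q_1} \right| \le \left| \frac{\lambda_1}{\lambda_2} - \frac{a}{q} \right| + \left| \frac{\lambda_1}{\lambda_2} - \frac{a_1 q_2}{a_2 q_1} \right| \ll q^{-2} + q^{-3} \ll q^{-2}, \]
which is impossible when $q$ (hence $P$) is sufficiently large. Thus we must have for
each $\alpha$ in $A < |\alpha| < P^{\delta}$ that one of $q_1$ and $q_2$, say $q_1$, exceeds $q^{10\delta}$. But then Weyl's
inequality for exponential sums shows that (12) holds. This completes the
Davenport-Heilbronn argument.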
4. The proof that
the set $\mathscr{X}$ contains nonzero numbers with arbitrarily small absolute values
when $Q$ is a general indefinite quadratic form in $n \ge 21$ variables
is exceedingly complicated. It would be highly desirable to have a simpler more
straightforward proof.
The known proof divides into two parts. Suppose $Q$ is represented as a sum of
squares of real linear forms with positive and negative signs, say,
\[ Q = L_1^{2} + \cdots + L_r^{2} - L_{r+1}^{2} - \cdots - L_n^{2}. \]
Part I (Davenport [7], [8]) consists in showing the result holds when $\min(r, n - r) \ge 16$;
while part II (Birch and Davenport [6], Ridout [19]) consists in showing the
result holds when $\min(r, n - r) \le 5$ and $n \ge 21$. In both instances one assumes that
\[ |Q(x)| \ge 1 \quad \text{for } 0 \ne x \in \mathbf{Z}^{n} \tag{13} \]
and derives a contradiction. These results only imply the conclusion when $n \ge 31$.
However, by a clever variational technique, Davenport and Ridout [11] showed
that, from part I, one could deduce the conclusion provided $\min(r, n - r) \ge 6$ and
$n \ge 21$.
In part I, the initial approach follows the procedure used by Davenport and
Heilbronn. The kernel $(\sin \pi\alpha/\pi\alpha)^{2}$ leads to complications and so is replaced by
\[ K(\alpha) = \frac{4}{3}\, \frac{\sin(4\pi\alpha/3)}{4\pi\alpha/3} \left( \frac{\sin(2\pi\alpha/(3k))}{2\pi\alpha/(3k)} \right)^{k}. \]
Then $K(\alpha)$ is a real even function such that
\[ |K(\alpha)| \le C(k) \min\{1,\ |\alpha|^{-k-1}\} \]
and the function
\[ \psi(y) = \int_{-\infty}^{\infty} e(y\alpha)\, K(\alpha)\, d\alpha \]
has $0 \le \psi(y) \le 1$ for real $y$, and $\psi(y) = 0$ for $|y| \ge 1$, while $\psi(y) = 1$ for $|y| \le 1/3$. Let
$c$ be a real vector with no zero coordinates such that $Q(c) = 0$ and $\prod L_j(c) \ne 0$.
The choice of $c$ is to ensure the singular integral is positive. Let $P$ be an arbitrarily
large integer and let $P_j = |c_j| P$. Define
\[ S^{*}(\alpha) = \sum \cdots \sum e(\alpha Q(x)). \]
It follows from (13) that
\[ \operatorname{Re} \int_{0}^{\infty} S^{*}(\alpha)\, K(\alpha)\, d\alpha = 0. \]
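The stated properties of $\psi$ (equal to 1 on $|y| \le 1/3$, vanishing for $|y| \ge 1$) can be checked numerically for a small $k$. The sketch below takes $k = 2$ and a truncated midpoint rule; both are choices made here for illustration, and the accuracy is limited by the discarded tail, which is of order $10^{-4}$ for the cutoff used.

```python
import math

def sinc(u):
    """sin(u)/u, with the removable singularity at 0 filled in."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def K(alpha, k):
    """K(alpha) = (4/3) * sinc(4 pi alpha / 3) * sinc(2 pi alpha / (3k))^k."""
    return (4.0 / 3.0) * sinc(4.0 * math.pi * alpha / 3.0) * sinc(
        2.0 * math.pi * alpha / (3.0 * k)
    ) ** k

def psi(y, k=2, cutoff=60.0, step=0.002):
    """Midpoint-rule approximation to the integral of e(y*alpha) K(alpha) over |alpha| < cutoff."""
    n = int(cutoff / step)
    total = 0.0
    for j in range(n):
        a = (j + 0.5) * step
        # K is even, so only the cosine part survives and it doubles.
        total += 2.0 * K(a, k) * math.cos(2.0 * math.pi * y * a)
    return total * step

for y in (0.2, 2.0 / 3.0, 1.2):
    print(round(y, 3), round(psi(y), 3))
```

The reason behind the shape: the first factor of $K$ transforms to the indicator of $|y| < 2/3$, and each remaining factor transforms to a probability density supported on $|y| \le 1/(3k)$, so $\psi$ is that indicator smoothed over a window of half-width $1/3$.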
Following standard procedures for the Hardy-Littlewood method (without the
need of the hypothesis on $n$ or $r$, but assuming the $k$ in the definition of $K(\alpha)$ is
sufficiently large) one shows that
\[ \operatorname{Re} \int_{0}^{P^{-1/2}} S^{*}(\alpha)\, K(\alpha)\, d\alpha \gg P^{n-2} \quad \text{and} \quad \operatorname{Re} \int_{P^{1/4}}^{\infty} S^{*}(\alpha)\, K(\alpha)\, d\alpha \ll P^{3n/4}. \]
One completes the argument by showing (now assuming the hypothesis on $r$ and
$n$) that
\[ \operatorname{Re} \int_{P^{-1/2}}^{P^{1/4}} S^{*}(\alpha)\, K(\alpha)\, d\alpha = o(P^{n-2}). \tag{14} \]
The proof of (14) is ingenious and depends heavily on knowing that an indefinite
quadratic form in 5 variables with integer coefficients has an integral zero $x$
such that
\[ \|x\| = \max |x_i| \le C \cdot \max\{|\text{coefficients}|\}, \]
where $C$ is an absolute constant. Such estimates on the norm of a zero of such
forms have been given by J. W. S. Cassels [3] and by Birch and Davenport [4].
The approach in part II is different. In this instance one shows that if $P$ is a
large integer, then $Q$ integrally represents an indefinite real form in 5 variables.
where
\[ |\lambda_j| < P^{2j-2}, \tag{16} \]
\[ |\lambda_{ij}| < P^{i+j-2-n}, \tag{17} \]
and
\[ |H(y)| \ge 1 \quad \text{for } 0 \ne y \in \mathbf{Z}^{5}. \tag{18} \]
The relations (16) and (17) imply that $\lambda_1 y_1^{2} + \cdots + \lambda_5 y_5^{2}$ is indefinite. The conclusion
now follows on applying a quantified version of the theorem of Davenport and
Heilbronn due to Birch and Davenport [5], namely:
If $\lambda_1, \ldots, \lambda_5$ are real numbers, not all of the same sign and such that not all ratios
are rational, then for each $\delta > 0$ there exist integers $y_1, \ldots, y_5$, not all 0, such that
\[ |\lambda_1 y_1^{2} + \cdots + \lambda_5 y_5^{2}| < 1 \tag{19} \]
and, for each $i$,
\[ |\lambda_i y_i^{2}| < c(\delta)\, |\lambda_1 \cdots \lambda_5|^{1+\delta}. \tag{20} \]
For suppose $y$ is a solution of (19) satisfying (20) where the $\lambda_j$ come from (15);
then (16) implies that
\[ |\lambda_i y_i^{2}| \ll P^{20 + 20\delta}, \]
whence
\[ |\lambda_{ij}\, y_i y_j| < P^{20 + 20\delta - n} \tag{21} \]
for $i \ne j$. But if $\delta < 1/20$ and $n \ge 21$, relations (19) and (21) contradict (18).
The proof of the theorem of Birch and Davenport just cited again makes use
of a Hardy-Littlewood circle method along the lines of the proof of part I. To prove
this theorem, one lets $P$ be a very large positive number such that the inequality
(19) has no integral solutions satisfying
\[ 0 < |\lambda_1| y_1^{2} + \cdots + |\lambda_5| y_5^{2} \le 500 P^{2}. \tag{22} \]
If one then defines the exponential sums
\[ S_j(\alpha) = \sum_{P < |\lambda_j|^{1/2} y_j < 10P} e(\alpha \lambda_j y_j^{2}), \]
it follows from (22) that
\[ \operatorname{Re} \int_{0}^{\infty} S_1(\alpha) \cdots S_5(\alpha)\, K(\alpha)\, d\alpha = 0. \tag{23} \]
By standard arguments from the Hardy-Littlewood method, one shows that it
follows from (23) that for $\delta > 0$ there exists a function $C(\delta) > 0$ such that if
\[ P > C(\delta)\, \Pi^{1/2 + \delta}, \tag{24} \]
then
\[ \int_{A}^{P^{\delta}} |S_1(\alpha) \cdots S_5(\alpha)|\, d\alpha \gg P^{3}\, \Pi^{-1/2}, \tag{25} \]
where $A^{-1} = 40 P \min |\lambda_j|$ and $\Pi = |\lambda_1 \cdots \lambda_5|$.
By a very delicate and intricate argument too involved to summarize here,
Birch and Davenport [5] show that (24) and (25) are incompatible. An essential
part of their argument again uses the estimate on the magnitude of a solution for an
indefinite additive form with integer coefficients.
The techniques discussed here are typical of the methods used in handling
diophantine inequalities in many variables. Almost invariably in the proof one
needs quite precise information regarding solutions of some associated diophantine
equation with integer coefficients. This phenomenon is well illustrated in the work
of D. Ridout and Jane Pitman ([15]-[19]). The absence of such information is
frequently the major stumbling block to solving a diophantine inequality.
5. We complete our discussion of the distribution of the set $\mathscr{X}$ when $Q$ is
indefinite by showing
If $Q$ is an indefinite real quadratic form in $n \ge 3$ variables and $\mathscr{X}$ contains nonzero
elements with arbitrarily small absolute values then $\mathscr{X}$ is dense on the line.
This result follows quite easily from the following theorem of Oppenheim [14]:
If $Q$ is a nondegenerate indefinite quadratic form in $n$ variables, then to each
positive value $a$ in $\mathscr{X}$ there corresponds a negative value $-b$ in $\mathscr{X}$ such that
(26) $b^{2n-2} \le A(n)\, a^{n-2}\, |\det(Q)|$,
where the constant $A(n)$ depends only on $n$ and $\det(Q)$ is the determinant of the matrix
associated with the quadratic form $Q$.
It follows from this theorem that if $n \ge 3$ and $\mathscr{X}$ contains arbitrarily small
nonzero values it contains such values with both positive and negative signs. Now
let $z$ be a real number and let $0 < \varepsilon < 1$. Let
$\delta = \varepsilon^2/(9 \max\{1, |z|\})$.
By hypothesis there exists an integral point $y \ne 0$ such that $0 < (\operatorname{sign} z)\, Q(y) < \delta$.
Let $d = [(z/Q(y))^{1/2}]$ so that
$(z/Q(y))^{1/2} - 1 < d \le (z/Q(y))^{1/2}$.
Then
$0 \ge Q(dy) - z = d^2 Q(y) - z \ge -\delta - 2(|z|\delta)^{1/2} \ge -\delta - \tfrac{2}{3}\varepsilon > -\varepsilon$.
Similarly
$0 \le Q((d+1)y) - z < \varepsilon$.
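The scaling step in this density argument can be tried numerically. The following is only a sketch for a positive target $z$: the value `q` stands for a small positive value $Q(y)$ and is a hypothetical input, not a value of any particular form.

```python
import math

def approximate_by_multiples(q, z, eps):
    """Return (d, lower, upper) with lower = d*d*q and upper = (d+1)**2 * q.

    q plays the role of a small positive value Q(y); replacing y by d*y
    multiplies the value by d**2, so lower and upper are again values of Q.
    """
    delta = eps ** 2 / (9 * max(1.0, abs(z)))
    if not (0 < q < delta):
        raise ValueError("need 0 < q < delta")
    d = math.floor(math.sqrt(z / q))
    return d, d * d * q, (d + 1) ** 2 * q

# Hypothetical data: a value q = 1e-7 and the target z = pi.
d, lower, upper = approximate_by_multiples(q=1e-7, z=math.pi, eps=0.01)
```

Both `lower` and `upper` lie within `eps` of the target, on opposite sides of it, as the two displayed inequalities predict.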
Oppenheim's proof of his theorem is by induction on n. If F is any quadratic
form, let p(F) denote the greatest lower bound of the positive values assumed by F
at integer points. If F is a positive definite quadratic form in n variables, then
(27) $p(F)^n \le B_n\, |\det(F)|$,
where $B_n$ is a constant depending only on $n$. It is well known that $B_2 = \tfrac{4}{3}$, and
we prove (27) by induction on n. Let c = F(y)<p(F) (1 +l/«), where y is integral
and necessarily the coordinates of y have no common factor. Then there exists a
unimodular transformation T so that
$F(Tx) = G(x) = c(x_1 + \alpha x_2 + \cdots)^2 + H(x_2, \ldots, x_n)$
where $H$ is positive definite and $\det F = \det G = c \det H$. By induction there is an
integral point z = (z2,..., zn) such that
H(z)n-l^Bn_l |detH|(l + l/w) and p(G{x9 fz))2g£cJJ(z),
whence
$p(F)^{2n-2} \le p(G(x_1, tz))^{2n-2} \le B_n\, p(F)^{n-2} \det F$,
which implies (27).
When $F$ is an indefinite form, the inequality (27) is implied by (26). For let
$p(F) \le c = F(y) < p(F) + \varepsilon$. It follows that
$p(-F)^{2n-2} \le A_n\, p(F)^{n-2}\, |\det F|$.
Apply this last inequality to the indefinite form $-F$ to get
$p(F)^{2n-2} \le A_n\, p(-F)^{n-2}\, |\det F|$,
and hence
$p(F)^{(2n-2)^2} \le A_n^{3n-4}\, p(F)^{(n-2)^2}\, |\det F|^{3n-4}$,
or
$p(F)^{n(3n-4)} \le A_n^{3n-4}\, |\det F|^{3n-4}$,
giving us (27) with $B_n = A_n$.
The relation (26) is easily seen to be true for n = 2. We now prove (26) by
induction on $n$. Let $0 < a = Q(y)$, where the coordinates of $y$ have no common factor.
Then there exists a unimodular transformation $T$ such that
$Q(Tx) = R(x) = a(x_1 + \cdots)^2 + J(x_2, \ldots, x_n)$
and $\det Q = a \det J$. Since $Q$ is indefinite, $-J$ must be either indefinite or positive
definite in $n-1$ variables. As we just saw, in either case for each $\varepsilon > 0$ there exists
an integer point $z = (z_2, \ldots, z_n)$ such that
$0 < -J(z) = d$ and $0 < d^{n-1} \le B_{n-1}\, |\det J|\, (1+\varepsilon)$.
But $R(x_1, tz)$ is an indefinite form of determinant $-ad$ and so represents a number
$-b$ such that $b^2 < \tfrac{4}{3} ad$, whence
$b^{2n-2} \le (\tfrac{4}{3})^{n-1} B_{n-1}\, a^{n-2}\, |\det Q|$.
This proves (26).
6. In this section we outline the proof of the theorem of Davenport and Lewis
[10] concerning positive definite forms.
Let $Q = xAx'$, where $A = (\lambda_{ij})$ is a positive definite matrix. Suppose for all $x^*$
with $\|x^*\|$ large, the inequality
(28) $|Q(x + x^*) - Q(x^*)| < 1$
has no integral solution $x \ne 0$. Then
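(29) $\left| \sum_{i,j} \lambda_{ij} x_i x_j + 2 \sum_i \Lambda_i x_i \right| \ge 1$
for all integral points $x \ne 0$, where
$\Lambda_i = \sum_k \lambda_{ik} x_k^*$.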
We may suppose $|\Lambda_1| = \max_j |\Lambda_j|$, and replacing $x^*$ by $-x^*$ if necessary, we may
suppose $\Lambda_1 < 0$. Set $P = -2\Lambda_1$; then $\|x^*\| \ll P \ll \|x^*\|$, and hence $P$ is large and
positive, but not necessarily an integer. Let
H(x) = Y,AijXixj-Pxl-2A2x2...2Anxn9
and set
L(x) = Pxl+2A2x2-\ \-2Anxn.
Note that $|2\Lambda_j| \ll P$.
We know by the box principle that given m linear forms Ll9..., Lm, with real
coefficients in n^.4m2^ 16 variables, there exist integral points u with ||w|| <^P5m/n
and such that \Lj(u)\<$Pm/n~4'. Using this fact one proves, using induction, the
existence of integral points a(v), v= 1,..., 6, witha(1) = (l, 0,..., 0)and forv^2, a(v)
is an integral point such that ||a(v)|| «P5v/n, |L(a(v))| <P~5,\ and |aMa(v)t| <P~112
for n<v. On letting A be the integral rank 6 matrix having the a(v) as column
vectors we find
(30) H{Ay)=Yd HjVJ + T, eyMj-P^i -I e^,
1 2
where
(31) /i,=A11, \<nr<Pxo«\
(32) |£rjNP-3 and y^P"2.
It follows from (30) and (29) that
(33) $|H(Ay)| \ge 1$ for integral $y \ne 0$.
But then
(34) PX/V-VO"'210
and
(35) \i*iy2i + -+wl-Pyi\*±
for integral y with >>! ^0. For if (34) does not hold, we should have yx >0, and
^iJi <^Vi+i9 whence j^P and ^2^2"l f-^iJi <Py\<P2> whence y^<P for
all/ But then tajAVjl ^P~ *, and |ak>yk| ^P"* and we would have a contradiction
to (33) when P is sufficiently large.
Next, using a Hardy-Littlewood argument modeled along that used by Birch
and Davenport [5] in their proof of a quantified version of the theorem of
Davenport and Heilbronn, we show for each small positive d (5 < 1/10), if n > S00S, then
(34) and (35) imply the existence of a real $\alpha$ such that
(36) $P^{-9\delta} < \alpha < P^{\delta}$,
(37) $\alpha\mu_j = a_j/q_j + \beta_j$, $a_j \ne 0$, $(a_j, q_j) = 1$,
(38) $q_j < P^{9\delta}$,
and
(39) $|\beta_j| < P^{-2+8\delta}$.
These results are deduced by showing that if B = fi\,2/SP min|^|1/2, then
p6
llSitaJ-SeWlda^P4^!-^)-1'2,
B
where
and then deducing from this the bounds given for q} and /?,-.
We complete the proof of the theorem by showing that the existence of an $\alpha$
satisfying (36)-(39) is incompatible with (35). In doing so we need to make use of
the following fact:
Given positive integers $b_1, \ldots, b_5$ there exists a positive integer $B \le C(b_1 \cdots b_5)^{3/2}$,
where $C$ is an absolute constant, such that for all positive integers $m$, the equation
(40) $b_1 x_1^2 + \cdots + b_5 x_5^2 = Bm$
is soluble in integers.
It is easy to show, for an appropriate B, equation (40) is soluble in each ring
of p-adic integers. Then the above result can be deduced quite easily from a theorem
of G. L. Watson [20] concerning the size of integers not integrally represented by
the form $b_1 x_1^2 + \cdots + b_5 x_5^2$, but locally integrally represented for all local rings. The
proof of Watson's theorem is quite long and difficult and again a Hardy-Littlewood
method is used.
We now demonstrate the incompatibility of (36)-(39) and (35). On multiplying
(35) by a we get
for all integral y with yx #0. Next set yj = qjzj and use (37) to get
(41) |fli«iZ? + -+fl«««z2-aP«1z1+/?1«?z? + ...+/?6«2z2|>p-M
for all integral z with z^O. As we have observed in the preceding paragraph,
there is an integer
(42) B<Z(a2-a6q2-q6ri2<(«sv2-H6q2-q6)3,2<P150S
such that for each positive integer $m$, we can find integers $z_2, \ldots, z_6$ such that
$a_2 q_2 z_2^2 + \cdots + a_6 q_6 z_6^2 = mB$.
If
(43) $0 < Bm < P^{2-21\delta}$
and
(44) $0 < z_1 < P^{1-19\delta}$,
then
II Aflfa2
<ZP-l0d and \piq\z\\<p-l0d.
Hence, if we can find positive integers $m$ and $z_1$ such that (43), (44) and
(45) $|a_1 q_1 z_1^2 + Bm - \alpha P q_1 z_1| < P^{-10\delta}$
all hold, we will have contradicted (41) and hence proven the theorem.
Put $z_1 = Bu$; then, by (42), the inequality
(46) $|a_1 q_1 B u^2 + m - \alpha P q_1 u| < P^{-160\delta}$
implies (45). Choose integers $u$, $v$ such that $0 < u < P^{160\delta}$, $|\alpha P q_1 u - v| < P^{-160\delta}$, and
put $m = v - a_1 q_1 B u^2$. Then if $\delta < 1/400$, the integers $m$ and $z_1 = Bu$ satisfy (43), (44),
and (45). This completes the proof.
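The choice of $u$ and $v$ in the last step is an instance of Dirichlet's box principle. A minimal sketch of that step alone follows; the rational `theta` is a hypothetical stand-in for $\alpha P q_1$, and the bound `N` for $P^{160\delta}$.

```python
from fractions import Fraction

def dirichlet_approximation(theta, N):
    """Return integers (u, v) with 1 <= u <= N and |theta*u - v| < 1/N.

    A plain search over u; the box principle guarantees that it succeeds.
    theta should be a Fraction so that the comparison is exact.
    """
    for u in range(1, N + 1):
        v = round(theta * u)
        if abs(theta * u - v) < Fraction(1, N):
            return u, v
    raise AssertionError("unreachable by the box principle")

theta = Fraction(314159265, 100000000)  # hypothetical stand-in for alpha * P * q_1
u, v = dirichlet_approximation(theta, 100)
```

Exact rational arithmetic is used so that the strict inequality is tested without rounding error.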
References
1. R. Brauer, A note on systems of homogeneous algebraic equations, Bull. Amer. Math. Soc. 51
(1945), 749-755. MR 7, 108.
2. S. Chowla, A theorem on irrational indefinite quadratic forms, J. London Math. Soc. 9 (1934),
162-163.
3. J. W. S. Cassels, Bounds for the least solutions of homogeneous quadratic equations, Proc.
Cambridge Philos. Soc. 51 (1955), 262-264; Addendum, ibid. 52 (1956), 602. MR 16, 1002; MR 18, 380.
4. B. J. Birch and H. Davenport, Quadratic equations in several variables, Proc. Cambridge Philos.
Soc. 54 (1958), 135-138. MR 20 #3824.
5. B. J. Birch and H. Davenport, On a theorem of Davenport and Heilbronn, Acta Math. 100 (1958), 259-279. MR 20
#5166.
6. B. J. Birch and H. Davenport, Indefinite quadratic forms in many variables, Mathematika 5 (1958), 8-12. MR 20 #3104.
7. H. Davenport, Indefinite quadratic forms in many variables, Mathematika 3 (1956), 81-101.
MR 19, 19.
8. H. Davenport, Indefinite quadratic forms in many variables. II, Proc. London Math. Soc. (3) 8 (1958),
109-126. MR 19, 1161.
9. H. Davenport and H. Heilbronn, On indefinite quadratic forms in five variables, J. London Math.
Soc. 21 (1946), 185-193. MR 8, 565.
10. H. Davenport and D. J. Lewis, Gaps between values of positive definite quadratic forms, Acta
Arith. 22 (1972), 87-105.
11. H. Davenport and D. Ridout, Indefinite quadratic forms, Proc. London Math. Soc. (3) 9
(1959), 544-555. MR 22 #28.
12. V. Jarnik and A. Walfisz, Über Gitterpunkte in mehrdimensionalen Ellipsoiden, Math. Z. 32
(1930), 152-160.
13. A. Oppenheim, The minima of indefinite quaternary quadratic forms of signature 0, Proc. Nat.
Acad. Sci. U.S.A. 15 (1929), 724-727.
14. A. Oppenheim, Values of quadratic forms. I, Quart. J. Math. Oxford Ser. (2) 4 (1953), 54-59. MR 14,
955.
15. Jane Pitman, Cubic inequalities, J. London Math. Soc. 43 (1968), 119-126.
16. Jane Pitman, Bounds for solutions to diagonal inequalities, Acta Arith. 18 (1971), 179-190.
17. Jane Pitman, Bounds for solutions of diagonal equations, Acta Arith. 19 (1971), 223-247.
18. Jane Pitman and D. Ridout, Diagonal cubic equations and inequalities, Proc. Roy. Soc. London,
Ser. A 297 (1967), 476-502. MR 35 #6620.
19. D. Ridout, Indefinite quadratic forms, Mathematika 5 (1958), 122-124. MR 21 #2642.
20. G. L. Watson, Quadratic Diophantine equations, Philos. Trans. Roy. Soc. London, Ser. A 253
(1960/61), 227-254. MR 24 # A78.
University of Michigan
THE CLASSIFICATION OF
TRANSCENDENTAL NUMBERS
K. MAHLER
1. All numbers $\zeta$ considered in this article are real or complex. For
polynomials
$p(z) = p_0 + p_1 z + \cdots + p_m z^m$, where $p_m \ne 0$,
the following notation will be used.
$d(p) = m$, $H(p) = \max_{h=0,1,\ldots,m} |p_h|$, and $L(p) = \sum_{h=0}^{m} |p_h|$
denote the exact degree, the height, and the length of $p(z)$, respectively. We further
put
$\Lambda(p) = 2^{d(p)} L(p)$ and $M(p) = \prod_{h=0}^{m} (2 + |p_h|)$.
If $V$ denotes the set of all polynomials $p(z) \not\equiv 0$ with rational integral coefficients
and $v$ is any positive integer, it is obvious that either of the inequalities $\Lambda(p) \le v$ or
$M(p) \le v$ is satisfied by at most finitely many elements of $V$.
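These measures, and the finiteness remark, are easy to experiment with. The sketch below assumes the readings $\Lambda(p) = 2^{d(p)} L(p)$ and $M(p) = \prod (2 + |p_h|)$ of the definitions above; the function names are illustrative only, and a polynomial is stored as its coefficient list `[p_0, ..., p_m]`.

```python
from itertools import product

def degree(p):
    return len(p) - 1

def height(p):
    return max(abs(c) for c in p)

def length(p):
    return sum(abs(c) for c in p)

def big_lambda(p):
    return 2 ** degree(p) * length(p)

def big_m(p):
    result = 1
    for c in p:
        result *= 2 + abs(c)
    return result

def polys_with_lambda_at_most(v):
    """List every nonzero integer polynomial p with Lambda(p) <= v."""
    found = []
    m = 0
    while 2 ** m <= v:
        bound = v // 2 ** m  # Lambda(p) <= v forces L(p) <= v / 2**m
        for coeffs in product(range(-bound, bound + 1), repeat=m + 1):
            if coeffs[-1] != 0 and length(coeffs) <= bound:
                found.append(list(coeffs))
        m += 1
    return found
```

The loop stops once $2^m > v$, which is the reason only finitely many polynomials satisfy $\Lambda(p) \le v$: both the degree and the length are bounded.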
Consider now the set $C$ of all real or complex numbers $\zeta$. Our aim is to
subdivide C into subsets or classes which are disjoint and have the following invariance
property.
Any two numbers in distinct classes are algebraically
independent over the rational number field Q.
AMS 1970 subject classifications. Primary 10F35; Secondary 10A40.
© 1973, American Mathematical Society
Here the subdivision of $C$ is to depend solely on the approximation properties of $\zeta$,
and the number of distinct classes should by preference be large.
2. A first such classification with the invariance property, but into only four
classes, was found by me about 40 years ago. A detailed account of this
classification, and of the almost equivalent one by J. F. Koksma, can be found in the book
on transcendental numbers by Th. Schneider (1957).
This classification is obtained as follows. Put successively
$w_m(v \mid \zeta) = \inf |p(\zeta)|$,
where the lower bound extends over all polynomials $p(z)$ satisfying
$p(z) \in V$, $d(p) \le m$, $H(p) \le v$, and $p(\zeta) \ne 0$;
$w_m(\zeta) = \limsup_{v \to \infty} \dfrac{\log\{1/w_m(v \mid \zeta)\}}{\log v}$, $w = w(\zeta) = \limsup_{m \to \infty} \dfrac{w_m(\zeta)}{m}$.
Let further the symbol $\mu = \mu(\zeta)$ denote $\infty$ if $w_m(\zeta)$ is finite for all suffixes $m$, and
otherwise let it be equal to the smallest suffix $m$ for which $w_m(\zeta) = \infty$. Thus at least
one of the two numbers $w$ and $\mu$ is always equal to $\infty$.
Therefore the complex numbers split into the following four disjoint classes:
Class A: $\zeta$ satisfies $w = 0$ and $\mu = \infty$.
Class S: $\zeta$ satisfies $0 < w < \infty$ and $\mu = \infty$.
Class T: $\zeta$ satisfies $w = \infty$ and $\mu = \infty$.
Class U: $\zeta$ satisfies $w = \infty$ and $\mu < \infty$.
It can now be proved that: (i) the class A consists exactly of all algebraic numbers,
hence the transcendental numbers are distributed amongst the classes S, T, and U;
and (ii) the invariance property holds, i.e. numbers in different classes are
algebraically independent over Q.
One can also show that almost all numbers are S-numbers, a result greatly
strengthened by V. Sprindzuk (1967). There are noncountably many U-numbers,
e.g. all Liouville numbers; these are simply characterised by $\mu = 1$. Until recently it
was not known whether there exist any T-numbers, but this existence has now
been established by W. Schmidt (1971), although as yet no actual T-number seems
to be known.
By way of example, $e$ is an S-number, while $\pi$ is either an S-number or a T-number.
3. I come now to a new classification (Mahler, 1971) which leads to a subdivision
of $C$ into infinitely many disjoint classes with the invariance property. In
this classification, we need to consider polynomials in $V$ of independently variable
degree and height (or rather length).
This classification depends on the following partial ordering of monotone
nondecreasing functions.
If $a(v) > 0$ and $b(v) > 0$ are any two nondecreasing functions of $v \ge 1$ for which
there exist three positive numbers $c$, $v_0$, and $\gamma$ such that
$a(v^c) \ge \gamma\, b(v)$ for $v \ge v_0$,
then we write
$a(v) \gg b(v)$ or $b(v) \ll a(v)$.
If simultaneously
$a(v) \gg b(v)$ and $a(v) \ll b(v)$,
then we write
$a(v) \gg\ll b(v)$.
This sign $\gg\ll$ evidently defines an equivalence relation.
With each element $\zeta$ of $C$ we associate now an order function
$O(v \mid \zeta) = \sup \log\{1/|p(\zeta)|\}$
where the upper bound is extended over all polynomials $p(z)$ in $V$ for which
$\Lambda(p) \le v$, $p(\zeta) \ne 0$.
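For very small $v$ the supremum can be found by exhaustive search. The sketch below assumes $\Lambda(p) = 2^{d(p)} L(p)$ as above and evaluates $p(\zeta)$ exactly, so it is restricted to rational $\zeta$ given as a `Fraction`; this restriction is a convenience of the illustration, not part of the definition.

```python
import math
from fractions import Fraction
from itertools import product

def order_function(v, zeta):
    """Brute-force sup of log(1/|p(zeta)|) over integer p with Lambda(p) <= v, p(zeta) != 0."""
    best = -math.inf
    m = 0
    while 2 ** m <= v:
        bound = v // 2 ** m  # Lambda(p) <= v forces L(p) <= v / 2**m
        for coeffs in product(range(-bound, bound + 1), repeat=m + 1):
            if coeffs[-1] == 0 or sum(abs(c) for c in coeffs) > bound:
                continue
            value = sum(c * zeta ** k for k, c in enumerate(coeffs))
            if value != 0:  # exact test, since value is a Fraction
                best = max(best, -math.log(abs(value)))
        m += 1
    return best
```

For $\zeta = 1/2$ and $v = 4$ the search is won by $p(z) = z^2$, which gives $\log 4$. The search space grows quickly with $v$, so this is suitable only for toy values.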
Since they behave slightly differently, it is convenient to exclude from the
consideration all those $\zeta$ which are either rational integers, or are integers in any
imaginary quadratic field. With this restriction, the following results hold.
$O(v \mid \zeta) \gg\ll \log v$ if $\zeta$ is algebraic;
$O(v \mid \zeta) \gg (\log v)^2$ if $\zeta$ is transcendental;
$O(v \mid \zeta) \gg\ll O(v \mid \zeta')$ if $\zeta$, $\zeta'$ are algebraically dependent over $Q$.
Thus, if numbers $\zeta$, $\zeta'$ with equivalent order functions are put into one and the same
class, then the invariance property holds.
The actual determination of the order function of a number is, of course, a very
difficult problem. I mention, by way of example, the following relations.
0 (v | e) < (log v)3 (log log v)\ 0(v\n)< (log vf (log log vf,
which are implicit in work by N. I. Fel'dman (1951 and 1963). It is interesting to
see that in the second formula the upper estimate comes close to the lower estimate
$(\log v)^2$.
In my paper on the order function I raised a number of questions. One of these
questions has in the meantime been solved by Swierczkowski in an unpublished
note; he proved that there are noncountably many inequivalent order functions
and hence also as many classes in this classification.
It is not known which monotonic functions are equivalent to order functions,
and which can be the order function of almost all real or almost all complex
numbers. It is also unknown whether the order functions can be strictly ordered.
4. I conclude this article by suggesting a still different kind of classification;
however, I do not know whether it has the invariance property, or rather how the
classification has to be defined so that this property holds.
The important recent work by W. Schmidt (1970) and A. Baker (1965) suggests
that instead of $O(v \mid \zeta)$ one should associate with $\zeta$ the function
$R(v \mid \zeta) = \sup \log\{1/|p(\zeta)|\}$,
where the upper bound is now extended over all polynomials $p(z)$ in $V$ for which
$M(p) \le v$, $p(\zeta) \ne 0$. It seems highly probable that also for these functions $R$ an
equivalence relation can be found which preserves the invariance property. I dare
to conjecture that the ideas of Schmidt could be used to settle this question.
So far we have only discussed classifications based on the values of a single
variable polynomial $p(z)$ at the given point $z = \zeta$. A more powerful kind of
classification would consider simultaneous approximations by sets of polynomials. I have
little doubt that the modern general transfer theorems in the geometry of numbers
of convex bodies are the right tool for attacking such problems.
References
1. A. Baker, On some Diophantine inequalities involving the exponential function, Canad. J. Math.
17 (1965), 616-626. MR 31 #2204.
2. N. I. Fel'dman, Approximation of certain transcendental numbers. I. Approximation of logarithms
of algebraic numbers, Izv. Akad. Nauk SSSR Ser. Mat. 15 (1951), 53-74; English transl., Amer. Math.
Soc. Transl. (2) 59 (1966), 224-245. MR 12, 595; MR 13, 117.
3. N. I. Fel'dman, On a measure of transcendence of the number e, Uspehi Mat. Nauk 18 (1963), no. 3
(111), 207-213. (Russian) MR 27 #4798.
4. K. Mahler, On the order function of a transcendental number, Acta Arith. 18 (1971), 63-76.
5. W. M. Schmidt, Simultaneous approximation to algebraic numbers by rationals, Acta Math.
125 (1970), 189-201. MR 42 #3028.
6. W. M. Schmidt, Mahler's T-numbers, Proc. Sympos. Pure Math., vol. 20, Amer. Math. Soc.,
Providence, R.I., 1971, pp. 275-286.
7. T. Schneider, Einführung in die transzendenten Zahlen, Springer-Verlag, Berlin, 1957. MR 19,
252.
8. V. G. Sprindzuk, Mahler's problem in metric number theory, "Nauka i Tehnika", Minsk,
1967; English transl., Transl. Math. Monographs, vol. 25, Amer. Math. Soc., Providence, R.I., 1969.
MR 39 #6832; #6833.
Institute of Advanced Studies, Australian National University
Canberra, ACT 2600, Australia
THE PAIR CORRELATION OF ZEROS
OF THE ZETA FUNCTION
H. L. MONTGOMERY
1. Statement of results. We assume the Riemann Hypothesis (RH) throughout
this paper; $\rho = \tfrac12 + i\gamma$ denotes a nontrivial zero of the Riemann zeta function.
Our object is to investigate the distribution of the differences $\gamma - \gamma'$ between the
zeros. It would thus be desirable to know the Fourier transform of the distribution
function of the numbers $\gamma - \gamma'$; with this in mind we take
(1) $F(\alpha) = F(\alpha, T) = \left(\dfrac{T}{2\pi}\log T\right)^{-1} \sum_{0<\gamma\le T;\ 0<\gamma'\le T} T^{i\alpha(\gamma-\gamma')}\, w(\gamma-\gamma')$,
where $\alpha$ and $T \ge 2$ are real. Here $w(u)$ is a suitable weighting function,
$w(u) = 4/(4 + u^2)$, so $w(0) = 1$. Our results concerning $F(\alpha)$ are stated in the
following
Theorem. (Assume RH.) For real $\alpha$, $T \ge 2$, let $F(\alpha)$ be defined by (1). Then
$F(\alpha)$ is real, and $F(\alpha) = F(-\alpha)$. If $T > T_0(\varepsilon)$ then $F(\alpha) \ge -\varepsilon$ for all $\alpha$. For fixed $\alpha$
satisfying $0 \le \alpha < 1$ we have
(2) $F(\alpha) = (1+o(1))\, T^{-2\alpha}\log T + \alpha + o(1)$
as $T$ tends to infinity; this holds uniformly for $0 \le \alpha \le 1 - \varepsilon$.
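The definition (1) can be evaluated directly from a list of ordinates. The sketch below hard-codes the first ten ordinates (rounded to six decimals) purely to exercise the formula; with so few zeros it illustrates the symmetry $F(\alpha) = F(-\alpha)$ but says nothing about the asymptotic statement (2).

```python
import math

# First ten ordinates of zeros of the zeta function, rounded to six decimals.
GAMMAS = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
          37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def w(u):
    return 4.0 / (4.0 + u * u)

def form_factor(alpha, T, gammas=GAMMAS):
    """Evaluate F(alpha, T) from (1) over the supplied ordinates in (0, T]."""
    zeros = [g for g in gammas if 0 < g <= T]
    log_t = math.log(T)
    total = 0.0
    for g1 in zeros:
        for g2 in zeros:
            # T^{i alpha (g1 - g2)} = exp(i alpha (g1 - g2) log T); the
            # imaginary parts cancel between the pairs (g1, g2) and (g2, g1).
            total += math.cos(alpha * (g1 - g2) * log_t) * w(g1 - g2)
    return total / (T / (2 * math.pi) * log_t)
```

The double loop is quadratic in the number of zeros, which is acceptable for a small experiment of this kind.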
The first term on the right-hand side of the above behaves in the limit as a
Dirac $\delta$-function; it reflects the fact that if $\alpha = 0$ then all the terms in (1) are positive.
With more effort we could show that (2) holds uniformly throughout $0 \le \alpha \le 1$.
To investigate sums involving $\gamma - \gamma'$ we have only to convolve $F(\alpha)$ with an
AMS 1970 subject classifications. Primary 10H05.
© 1973, American Mathematical Society
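appropriate kernel $\hat r(\alpha)$; from (1) alone it is immediate that
(3) $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} r\!\left((\gamma-\gamma')\dfrac{\log T}{2\pi}\right) w(\gamma-\gamma') = \left(\dfrac{T}{2\pi}\log T\right) \int_{-\infty}^{+\infty} F(\alpha)\, \hat r(\alpha)\, d\alpha$.
Here $\hat r$ is the Fourier transform of $r$,
(4) $\hat r(\alpha) = \int_{-\infty}^{+\infty} r(u)\, e(-\alpha u)\, du$ ($e(\theta) = e^{2\pi i\theta}$).
Our theorem gives us little information about $F(\alpha)$ for $\alpha \ge 1$, so for the most part we
restrict our attention to kernels $\hat r$ which vanish outside $[-1+\delta, 1-\delta]$. Particular
choices of $\hat r(\alpha)$ give us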
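Corollary 1. (Assume RH.) If $0 < \alpha < 1$ is fixed then
(5) $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \dfrac{\sin(\alpha(\gamma-\gamma')\log T)}{\alpha(\gamma-\gamma')\log T}\, w(\gamma-\gamma') \sim \left(\dfrac{1}{2\alpha} + \dfrac{\alpha}{2}\right) \dfrac{T}{2\pi}\log T$
and
(6) $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \left(\dfrac{\sin((\alpha/2)(\gamma-\gamma')\log T)}{(\alpha/2)(\gamma-\gamma')\log T}\right)^2 w(\gamma-\gamma') \sim \left(\dfrac{1}{\alpha} + \dfrac{\alpha}{3}\right) \dfrac{T}{2\pi}\log T$.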
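The two constants can be checked against (3) by replacing $F$ with its limiting shape from (2), read informally as a unit mass at $0$ plus the density $|\beta|$ on $[-1, 1]$. The sketch below makes that replacement explicitly; the kernel transforms are the standard ones for the two choices of $r$ and are an assumption of this illustration.

```python
import math

def r_hat_sinc(beta, alpha):
    """Fourier transform of r(u) = sin(2 pi alpha u) / (2 pi alpha u)."""
    return 1.0 / (2 * alpha) if abs(beta) <= alpha else 0.0

def r_hat_fejer(beta, alpha):
    """Fourier transform of r(u) = (sin(pi alpha u) / (pi alpha u))**2."""
    return max(alpha - abs(beta), 0.0) / alpha ** 2

def limiting_constant(r_hat, alpha, steps=20000):
    """Unit mass at 0 plus the midpoint-rule integral of |beta| * r_hat over [-1, 1]."""
    h = 2.0 / steps
    integral = 0.0
    for k in range(steps):
        beta = -1.0 + (k + 0.5) * h
        integral += abs(beta) * r_hat(beta, alpha) * h
    return r_hat(0.0, alpha) + integral
```

For $0 < \alpha < 1$ the two results agree numerically with $1/(2\alpha) + \alpha/2$ and $1/\alpha + \alpha/3$.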
In the latter assertion one can delete the factor $w(\gamma - \gamma')$ if one wishes. We use
(6) to derive
Corollary 2. (Assume RH.) As T tends to infinity
(7) $\sum_{0<\gamma\le T;\ \rho\ \text{simple}} 1 \ge \left(\tfrac23 + o(1)\right) \dfrac{T}{2\pi}\log T$.
The number of zeros of $\zeta(s)$ with $0 < \gamma \le T$ is $\sim (T/2\pi)\log T$, so the above
asserts that at least $\tfrac23$ of the zeros are simple. It is known (see [6]) that the first
3,500,000 zeros are simple and lie on the critical line $\sigma = \tfrac12$. Although one expects
that all the zeros of $\zeta(s)$ are simple, the only other result in this direction is due to A.
Selberg [7]. His result holds unconditionally; it states that a positive density of
the zeros of $\zeta(s)$ are of odd order and lie on the critical line.
Let $0 < \gamma_1 \le \gamma_2 \le \cdots$ denote the imaginary parts of the zeros of $\zeta(s)$ in the upper
half-plane. The average of $\gamma_{n+1} - \gamma_n$ is $2\pi/\log\gamma_n$; our Theorem enables us to show
that $\gamma_{n+1} - \gamma_n$ is not always near its average.
Corollary 3. (Assume RH.) We can compute a constant $\lambda$ so that
(8) $\liminf_n\, (\gamma_{n+1} - \gamma_n)(\log\gamma_n/2\pi) \le \lambda < 1$.
A complicated argument would permit one to show that in fact $\gamma_{n+1} - \gamma_n \le
2\pi\lambda/\log\gamma_n$ for a positive density of $n$. This, with the fact that the average value is
$2\pi/\log\gamma_n$, enables one to assert that
(9) $\limsup_n\, (\gamma_{n+1} - \gamma_n)(\log\gamma_n/2\pi) \ge \lambda' > 1$.
We note that if $\zeta(s)$ has infinitely many multiple zeros then we may take $\lambda = 0$
in (8). Our proof allows us to take $\lambda = 0.68$. It would be of interest to have $\lambda < \tfrac12$, as
P. J. Weinberger and I have established the following: Let $d > 0$ be square-free,
and put $K = Q((-d)^{1/2})$. Let $h(-d)$ be the class number of $K$, and let $\zeta_K(s) = \zeta(s)
L(s, \chi)$ be the Dedekind zeta function of $K$. For each positive $A$, $\varepsilon$ there is an
effectively computable constant $d_0 = d_0(A, \varepsilon)$ such that if $h(-d) \le A$, $d > d_0$, then all
zeros of $\zeta_K(s)$ which are in the rectangle $0 < \sigma < 1$, $0 \le t \le d^{1/2-\varepsilon}$ lie on the line
$\sigma = \tfrac12$; if $\tfrac12 + i\gamma_n$, $\tfrac12 + i\gamma_{n+1}$ are consecutive zeros of $\zeta_K(s)$ in this range then
(10) $(1-\varepsilon)\dfrac{2\pi}{\log d(\gamma_n+2)^2} \le \gamma_{n+1} - \gamma_n \le (1+\varepsilon)\dfrac{2\pi}{\log d(\gamma_n+2)^2}$.
One may inquire about the behaviour of $F(\alpha)$ for $\alpha \ge 1$. Our first observation
is that (2) cannot hold uniformly for $0 \le \alpha \le C$ if $C$ is large. For if it did then (6)
would hold for $0 < \alpha \le C$. Write (6) as $G(\alpha) \sim H(\alpha)$. On one hand $|\sin 2x| \le 2|\sin x|$,
so $G(2\alpha) \le G(\alpha)$ for all $\alpha$. On the other hand $H(2\alpha) > \tfrac43 H(\alpha)$ for $\alpha \ge 2$. This suggests
that $F(\alpha)$ makes some change in its behaviour for $\alpha \ge 1$. Further considerations
of the above sort lead one to believe that certain averages of $F(\alpha)$ over large $\alpha$ are
close to 1. At the end of §3 we describe two heuristic arguments which suggest that
(11) $F(\alpha) = 1 + o(1)$
for $\alpha \ge 1$, uniformly in bounded intervals. This, with the Theorem, completely
determines $F$, so an appropriate use of (3) leads immediately to a
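Conjecture. For fixed $\alpha < \beta$,
(12) $\sum_{\substack{0<\gamma\le T;\ 0<\gamma'\le T \\ 2\pi\alpha/\log T \le \gamma-\gamma' \le 2\pi\beta/\log T}} 1 \sim \left(\int_\alpha^\beta \left(1 - \left(\dfrac{\sin\pi u}{\pi u}\right)^2\right) du + \delta(\alpha, \beta)\right) \dfrac{T}{2\pi}\log T$
as $T$ tends to infinity. Here $\delta(\alpha, \beta) = 1$ if $0 \in [\alpha, \beta]$, $\delta(\alpha, \beta) = 0$ otherwise.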
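The factor multiplying $(T/2\pi)\log T$ in (12) is simple to tabulate. The following sketch uses Simpson's rule; the function names are illustrative only.

```python
import math

def pair_correlation(u):
    """The conjectured density 1 - (sin(pi u) / (pi u))**2, with value 0 at u = 0."""
    if u == 0:
        return 0.0
    s = math.sin(math.pi * u) / (math.pi * u)
    return 1.0 - s * s

def conjectured_density(alpha, beta, steps=2000):
    """Simpson's rule for the integral in (12), plus the delta term."""
    h = (beta - alpha) / steps
    total = pair_correlation(alpha) + pair_correlation(beta)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pair_correlation(alpha + k * h)
    delta = 1.0 if alpha <= 0.0 <= beta else 0.0
    return total * h / 3 + delta
```

The integrand vanishes to second order at $u = 0$, which is the repulsion between nearby zeros that the Conjecture predicts.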
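The Dirac $\delta$-function occurs naturally in the above, for if $0 \in [\alpha, \beta]$ then the
sum includes terms $\gamma = \gamma'$.
The assertions (11) and (12) are essentially equivalent. From either it immediately
follows that almost all zeros are simple. From (11) it is easy to see how Corollary 1
ought to be extended: If (11) is true then for $\alpha \ge 1$,
(13) $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \dfrac{\sin(\alpha(\gamma-\gamma')\log T)}{\alpha(\gamma-\gamma')\log T}\, w(\gamma-\gamma') \sim \dfrac{T}{2\pi}\log T$
and
(14) $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \left(\dfrac{\sin((\alpha/2)(\gamma-\gamma')\log T)}{(\alpha/2)(\gamma-\gamma')\log T}\right)^2 w(\gamma-\gamma') \sim \left(1 + \dfrac{1}{3\alpha^2}\right) \dfrac{T}{2\pi}\log T$.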
In a certain standard terminology the Conjecture may be formulated as the
assertion that $1 - ((\sin \pi u)/\pi u)^2$ is the pair correlation function of the zeros of the
zeta function. F. J. Dyson has drawn my attention to the fact that the eigenvalues
of a random complex Hermitian or unitary matrix of large order have precisely
the same pair correlation function (see [3, equations (6.13), (9.61)]). This means
that the Conjecture fits well with the view that there is a linear operator (not yet
discovered) whose eigenvalues characterize the zeros of the zeta function. The
eigenvalues of a random real symmetric matrix of large order have a different pair
correlation, and the eigenvalues of a random symplectic matrix of large order
have yet another pair correlation. In fact the "form factors" $F_r(\alpha)$, $F_s(\alpha)$ of these
latter pair correlations are nonlinear for $0 < \alpha < 1$, so our Theorem enables us to
distinguish the behaviour of the zeros of $\zeta(s)$ from the eigenvalues of such matrices.
Hence, if there is a linear operator whose eigenvalues characterize the zeros of
the zeta function, we might expect that it is complex Hermitian or unitary.
One might extend the present work to investigate the $k$-tuple correlation of
the zeros of the zeta function. If the analogy with random complex Hermitian
matrices appears to continue, then one might conjecture that the $k$-tuple
correlation function $F(u_1, u_2, \ldots, u_k)$ is given by
(15) $F(u_1, u_2, \ldots, u_k) = \det A$,
where $A = [a_{ij}]$ is the $k \times k$ matrix with entries $a_{ii} = 1$, $a_{ij} = (\sin \pi(u_i - u_j))/\pi(u_i - u_j)$
for $i \ne j$. Here the normalization is the same as in the Conjecture, which is the case
$k = 2$ of the above.
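The determinant in (15) can be evaluated for small $k$ without any library support. The sketch below uses a naive Laplace expansion, chosen only for brevity; for $k = 2$ it reduces to $1 - ((\sin \pi u)/\pi u)^2$ with $u = u_1 - u_2$.

```python
import math

def sine_kernel(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def determinant(a):
    """Laplace expansion along the first row; adequate for the small k used here."""
    if len(a) == 1:
        return a[0][0]
    total = 0.0
    for j, entry in enumerate(a[0]):
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * entry * determinant(minor)
    return total

def k_tuple_correlation(points):
    """Conjectured k-tuple correlation (15) at the points u_1, ..., u_k."""
    a = [[sine_kernel(ui - uj) for uj in points] for ui in points]
    return determinant(a)
```

When two of the points coincide the matrix has two equal rows, so the conjectured correlation vanishes there.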
If one continues to draw on the analogy with random complex Hermitian
matrices then one may formulate a conjecture concerning the distribution of the
numbers $\gamma_{n+1} - \gamma_n$. The precise conjecture involves a complicated (but calculable)
spheroidal function. Thus, or otherwise, one may conjecture that
(16) $\liminf_n\, (\gamma_{n+1} - \gamma_n)\log\gamma_n = 0$,
and
(17) $\limsup_n\, (\gamma_{n+1} - \gamma_n)\log\gamma_n = +\infty$;
so Corollary 3 is probably far from the truth.
It would be interesting to see how numerical evidence compares with the above
conjectures. The first several thousand zeros have been computed, so it would not
be difficult to assemble relevant statistics. However, data on the failures of "Gram's
law" indicate that the asymptotic behaviour is approached very slowly. Thus the
numerical evidence may not be particularly illuminating.
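2. An explicit formula. In proving our Theorem we require the following
formula, which relates zeros of $\zeta(s)$ to prime numbers.
Lemma. If $1 < \sigma < 2$ and $x \ge 1$ then
(18) $(2\sigma-1) \sum_\gamma \dfrac{x^{i\gamma}}{(\sigma-\frac12)^2 + (t-\gamma)^2} = -x^{-1/2}\left(\sum_{n\le x} \Lambda(n)\left(\dfrac{x}{n}\right)^{1-\sigma+it} + \sum_{n>x} \Lambda(n)\left(\dfrac{x}{n}\right)^{\sigma+it}\right)$
$\qquad + x^{1/2-\sigma+it}(\log\tau + O_\sigma(1)) + O_\sigma(x^{1/2}\tau^{-1})$,
where $\tau = |t| + 2$. The implicit constants depend only on $\sigma$.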
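Proof. It is well known (see [2, p. 353]) that if $x > 1$, $x \ne p^n$, then
$\sum_{n\le x} \Lambda(n) n^{-s} = -\dfrac{\zeta'}{\zeta}(s) + \dfrac{x^{1-s}}{1-s} - \sum_\rho \dfrac{x^{\rho-s}}{\rho-s} + \sum_{n=1}^{\infty} \dfrac{x^{-2n-s}}{2n+s}$
provided $s \ne 1$, $s \ne \rho$, $s \ne -2n$. This does not depend on RH, but if we assume RH
then the above may be expressed as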
(19) I r = x*"1/2 -(s) + T A(n)n-°-* Y
If we replace s by 1 — a + it in the above then we have
I, **" . =xl'2-°(Ul-a + it)+ £ ^(ii)!!-1-"
(20) v<r-it oo Y-2n-l+<r-it\
_* y * V
cr —ir Bfi 2n+l-<7-f it)
We subtract respective sides of (20) from (19), and use the relation
(21) $\dfrac{\zeta'}{\zeta}(s) = -\sum_{n=1}^{\infty} \Lambda(n) n^{-s}$,
which holds for $\sigma > 1$. We find that
P—DE, ,»,*!'„ ,2=-*-1/2fl ^(n)(-)1_""+S W-Y+")
7 (ff-i) Ht-ir \n% W „>x \nj J
(22) JL(\-a+it)xw-'+»+. xl/2(2ff_1)
C ((7-1 +it) (a-it)
_x-l/2
(2<r-l)x
■2n
y -
„=i (ff-l-ir-2n)(<r+it + 2n)
Both sides of the above are continuous for all $x \ge 1$, so we no longer exclude the
values $x = 1$, $x = p^n$. If $1 < \sigma < 2$, then from the logarithmic derivative of the
functional equation of the zeta function (see [1, pp. n5, 82-83]) we have
$\dfrac{\zeta'}{\zeta}(1-\sigma+it) = -\dfrac{\zeta'}{\zeta}(\sigma-it) - \log\tau + O_\sigma(1)$;
from (21) we see that this is $= -\log\tau + O_\sigma(1)$. Hence the right-hand side of (22) is
+ Xll2-° + it(logT + 0,(\))+0,(xll2T-2) + 0(r(x-2T-1),
which gives the result.
3. Proof of the Theorem. The first assertion of the Theorem follows from
the observation that we may interchange $\gamma$ and $\gamma'$ in (1). To prove the remaining
assertions, take $\sigma = \tfrac32$ in the Lemma, and write (18) briefly as $L(x, t) = R(x, t)$. We
evaluate the integrals $\int_0^T |L(x, t)|^2\, dt$, $\int_0^T |R(x, t)|^2\, dt$.
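We treat the left-hand side first. We have
(23) $\int_0^T |L(x, t)|^2\, dt = 4 \int_0^T \left| \sum_\gamma \dfrac{x^{i\gamma}}{1+(t-\gamma)^2} \right|^2 dt$.
We wish to exclude those numbers $\gamma \notin [0, T]$. It suffices to show that
(24) $\sum_{\gamma \notin [0,T]} \sum_{\gamma'} \int_0^T \dfrac{dt}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)} \ll \log^3 T$,
for then (23) is
(25) $= 4 \sum_{0<\gamma\le T;\ 0<\gamma'\le T} x^{i(\gamma-\gamma')} \int_0^T \dfrac{dt}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)} + O(\log^3 T)$.
To prove (24) we use the fact (Theorem 9.2 of [8]) that if $T \ge 2$ then there are
$\ll \log T$ zeros for which $T \le \gamma \le T + 1$. From this it is immediate that if $0 \le t \le T$ then
$\sum_{\gamma \notin [0,T]} \dfrac{1}{1+(t-\gamma)^2} \ll \left(\dfrac{1}{t+1} + \dfrac{1}{T-t+1}\right) \log T$,
and
$\sum_{\gamma} \dfrac{1}{1+(t-\gamma)^2} \ll \log T$.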
On the left-hand side of (24) we take the sums inside and use the above estimates.
The integration is then trivial, and we obtain (24).
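Arguing similarly we may also show that
$\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \int_T^{\infty} \dfrac{dt}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)} \ll \log^3 T$.
The estimation of $\sum_{0<\gamma\le T;\ 0<\gamma'\le T} \int_{-\infty}^{0} \cdots$ is the same, so we see that (25) is
$= 4 \sum_{0<\gamma\le T;\ 0<\gamma'\le T} x^{i(\gamma-\gamma')} \int_{-\infty}^{+\infty} \dfrac{dt}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)} + O(\log^3 T)$.
From the calculus of residues we deduce that the definite integral is $= (\pi/2)\, w(\gamma - \gamma')$,
so the above is
$= 2\pi \sum_{0<\gamma\le T;\ 0<\gamma'\le T} x^{i(\gamma-\gamma')}\, w(\gamma-\gamma') + O(\log^3 T)$.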
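The residue evaluation can be confirmed numerically. The sketch below integrates over a wide finite window with Simpson's rule (the window width and step count are arbitrary choices of this illustration) and compares the result with $(\pi/2)\, w(\gamma - \gamma')$ where $w(u) = 4/(4+u^2)$.

```python
import math

def lorentz_product_integral(a, b, half_width=400.0, steps=80000):
    """Simpson's rule for the integral of 1/((1+(t-a)^2)(1+(t-b)^2)) over a wide window."""
    lo, hi = min(a, b) - half_width, max(a, b) + half_width
    h = (hi - lo) / steps

    def f(t):
        return 1.0 / ((1.0 + (t - a) ** 2) * (1.0 + (t - b) ** 2))

    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

def w(u):
    return 4.0 / (4.0 + u * u)
```

The integrand decays like $t^{-4}$, so truncating the range at a few hundred units changes the value only negligibly.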
If we put $x = T^\alpha$ then we have
(26) $\int_0^T |L(T^\alpha, t)|^2\, dt = F(\alpha)\, T \log T + O(\log^3 T)$.
Here the left-hand side is clearly nonnegative, so we have the second assertion of
the Theorem.
To complete the proof of the Theorem we prove (2); to this end we evaluate
$\int_0^T |R(x, t)|^2\, dt$. In the first place
(27) $\int_0^T |x^{-1+it} \log\tau|^2\, dt = \dfrac{T}{x^2} (\log^2 T + O(\log T))$
for all $x \ge 1$, $T \ge 2$. To compute the mean square of the Dirichlet series on the
right-hand side of (18) we use the following quantitative form (see [5]) of Parseval's
identity for Dirichlet series:
(28) $\int_0^T \left| \sum_n a_n n^{-it} \right|^2 dt = \sum_n |a_n|^2 (T + O(n))$.
T
dt=(T + 0(N)) £ kl2;
THE PAIR CORRELATION OF ZEROS OF THE ZETA FUNCTION 189
this is Theorem 1.6 of [4]. However, the latter is restricted to Dirichlet polynomials,
so we simplify our treatment by arguing from (28). We have
T
1 f
-1/2 + it
+ lMn)[~)
'\ 3/2 + i(|
dt
= ; I A(nr(^)'\T + 0(n))+l £ A{nf (f\{T + 0{ri)).
x n^x W x n>x \n/
By the prime number theorem with error term this is
(29) =T(logx + 0(l)) + 0(xlogx).
As for the error terms in (18), we see that
(30)
1
J v2
and
(31)
xt~2 dt
<^x.
We now combine our estimates (27), (29), (30), (31); we employ the following
consequence of the Cauchy-Schwarz inequality: If $M_k = \int_0^T |A_k(t)|^2\, dt$ and $M_1 \ge M_2
\ge M_3 \ge M_4$, then
$\int_0^T \left| \sum_{k=1}^{4} A_k(t) \right|^2 dt = M_1 + O((M_1 M_2)^{1/2})$.
We consider three cases.
Case 1. $1 \le x \le (\log T)^{3/4}$. Then our $M_1$ term is given by (27). Our other terms
are uniformly $o(M_1)$, so our expression is $= (1 + o(1))\, (T/x^2) \log^2 T$.
Case 2. $(\log T)^{3/4} < x \le (\log T)^{3/2}$. In this case all our estimates are uniformly
$o(T \log T)$.
Case 3. $(\log T)^{3/2} < x \le T/\log T$. Then our $M_1$ term is given by (29). All our
other terms are uniformly $o(M_1)$, so our expression is $= (1 + o(1))\, T \log x$.
If we put $x = T^\alpha$ then we may express our result by saying that
$\int_0^T |R(T^\alpha, t)|^2\, dt = ((1 + o(1))\, T^{-2\alpha} \log T + \alpha + o(1))\, T \log T$,
uniformly for $0 \le \alpha \le 1 - \varepsilon$. This and (26) give (2), so the proof is complete.
If $\alpha > 1$ in the above then $x > T$, so the second error term in (29) is no longer
smaller than the main term. The error term (31) also gives problems; a little
consideration reveals that what we require is to know the size of
(32)
I
I
£ A{n)nll2-i' + x £ Ai^n-3'2-"--
2x
1/2 -it
X nix
(i+fc)(*-fc);
dt.
If we multiply out the integrand and integrate terms individually, we find that
there are too many nondiagonal terms to be ignored. We may, however, collect
terms so that the above is expressed in terms of sums of the sort $\sum_{n \le y} \Lambda(n) \Lambda(n + h)$.
There are various indications that this sum is approximately c(h) y, where c(h)
is a certain arithmetic constant. If we replace these sums by their conjectured
approximations c(h) y, then our new expression is ~ Tlog T. Moreover, there is a
reasonable hypothesis as to the size of the differences
(33) $\sum_{n \le y} \Lambda(n) \Lambda(n + h) - c(h)\, y$
which if true would allow us to carry out our program for $1 \le \alpha < 2$. If the differences
(33) are not only reasonably small but also behave independently for different $h$
then (32) is $\sim T \log T$ for all $\alpha \ge 1$.
Another indication of the behaviour of the expression (32) can be obtained by
considering its "$q$-analogue." The expression
(34)
i*Q<P(4)x*xo
- I >l(*)x(*)*1/2+I A(n)X(n)n-3/2
X n£x n>x
may be shown to be $\sim Q \log x$ for $Q \ge x$, in analogy with (29). If $x (\log x)^{-A} \le Q \le x$
then we may use an established technique [4, Chapter 17] to show that (34) is
$\sim Q \log Q$. If GRH is true then this latter asymptotic relationship holds for
$x^{3/4+\varepsilon} < Q \le x$. This corresponds to $1 \le \alpha < \tfrac43$. One does not expect a change in the
behaviour for larger $\alpha$, but a more delicate error-term analysis is needed if the result
is to be extended.
4. The corollaries. To prove Corollary 1 we use our Theorem in conjunction
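with (3). To obtain (5) we take $r(u) = (\sin 2\pi\alpha u)/2\pi\alpha u$. The Theorem makes it a
simple task to compute
$\int_{-\infty}^{+\infty} F(\beta)\, \hat r(\beta)\, d\beta = \dfrac{1}{2\alpha} \int_{-\alpha}^{\alpha} F(\beta)\, d\beta$.
To obtain (6) we take $r(u) = ((\sin \pi\alpha u)/\pi\alpha u)^2$. Again from the Theorem it is easy
to compute
$\int_{-\infty}^{+\infty} F(\beta)\, \hat r(\beta)\, d\beta = \dfrac{1}{\alpha^2} \int_{-\alpha}^{\alpha} (\alpha - |\beta|)\, F(\beta)\, d\beta$.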
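We now prove Corollary 2. Let $m_\rho$ be the multiplicity of the zero $\rho$. In a sum
over $0 < \gamma \le T$, our convention concerning multiple zeros is that zeros are counted
according to their multiplicities. This is accomplished by allowing $\gamma$ to take on
the same value $m_\rho$ times. In particular,
$$\sum_{0 < \gamma \le T} m_\rho = \sum_{\substack{0 < \gamma \le T \\ 0 < \gamma' \le T \\ \gamma = \gamma'}} 1,$$
for on both sides a given zero $\rho$ is counted with weight $m_\rho^2$. But
$$\sum_{\substack{0 < \gamma \le T \\ 0 < \gamma' \le T \\ \gamma = \gamma'}} 1 \le \sum_{\substack{0 < \gamma \le T \\ 0 < \gamma' \le T}} \left( \frac{\sin (\alpha/2)(\gamma - \gamma') \log T}{(\alpha/2)(\gamma - \gamma') \log T} \right)^2 w(\gamma - \gamma'),$$
and if we take $\alpha = 1 - \delta$ then from (6) the above is
$$\le \left( \tfrac{4}{3} + \varepsilon \right) (T/2\pi) \log T.$$
Hence we have demonstrated that
$$\sum_{0 < \gamma \le T} m_\rho \le \left( \tfrac{4}{3} + o(1) \right) (T/2\pi) \log T.$$
Now
$$\sum_{\substack{0 < \gamma \le T \\ \rho \text{ simple}}} 1 \ge \sum_{0 < \gamma \le T} (2 - m_\rho) \ge \left( 2 - \tfrac{4}{3} + o(1) \right) \frac{T}{2\pi} \log T,$$
so we have Corollary 2. The kernel $r(u)$ which we have used does not appear to
be optimal for our purpose, so presumably one can improve slightly on the
constant $\frac{2}{3}$.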
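We now turn to the first assertion of Corollary 3. We take $r(u) =
\max(1 - (|u|/\lambda), 0)$ in (3), and choose $\lambda$ later. Now $\hat{r}(\alpha)$ is nonnegative, and
$\int_{-\infty}^{+\infty} \hat{r}(\alpha)\, d\alpha < \infty$, so our Theorem permits us to calculate a lower bound for the
right-hand side of (3). We see that
$$\frac{T}{2\pi} \log T \int_{-\infty}^{+\infty} F(\alpha) \hat{r}(\alpha)\, d\alpha \ge (1 + o(1)) \left( \lambda + 2\lambda \int_0^1 \alpha \left( \frac{\sin \pi\lambda\alpha}{\pi\lambda\alpha} \right)^2 d\alpha \right) \frac{T}{2\pi} \log T.$$
We may assume that all but finitely many zeros are simple, so the terms $\gamma = \gamma'$
in (3) contribute an amount $\sim (T/2\pi) \log T$. Hence
$$\sum_{\substack{0 < \gamma \le T \\ 0 < \gamma' \le T \\ 0 < \gamma - \gamma' < 2\pi\lambda/\log T}} 1 \ge \left( \tfrac{1}{2} + o(1) \right) C(\lambda) \frac{T}{2\pi} \log T$$
where
$$C(\lambda) = \lambda + (1/\pi^2 \lambda) \operatorname{Cin}(2\pi\lambda) - 1.$$
Here $\operatorname{Cin}(x)$ is the "cosine integral,"
$$\operatorname{Cin} x = \int_0^x \frac{1 - \cos u}{u}\, du.$$
Note that the integrand is nonnegative, so that $\operatorname{Cin}(x) > 0$ for $x > 0$. To obtain
(8) we show that $C(\lambda) > 0$ for some $\lambda < 1$. This is easy, because $C(1) = (1/\pi^2)
\operatorname{Cin}(2\pi) > 0$, and $C(\lambda)$ is continuous. In fact a little calculation reveals that $C(0.68)
> 0$. We have not determined the optimal kernel $r(u)$, so one should be able to
improve on the constant 0.68.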
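The "little calculation" for $C(0.68)$ can be checked numerically. The following is a minimal sketch, not the author's computation: it evaluates $\operatorname{Cin}$ by composite Simpson quadrature (the helper names `cin` and `C` and the step count are assumptions made for illustration) and prints $C(\lambda)$ at a few values of $\lambda$.

```python
import math


def cin(x, steps=20000):
    """Cin(x) = integral from 0 to x of (1 - cos u)/u du, by composite Simpson's rule.

    `steps` must be even; the integrand is extended by its limit 0 at u = 0.
    """
    if x == 0:
        return 0.0

    def g(u):
        return 0.0 if u == 0.0 else (1.0 - math.cos(u)) / u

    h = x / steps
    total = g(0.0) + g(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0


def C(lam):
    """C(lambda) = lambda + Cin(2*pi*lambda) / (pi^2 * lambda) - 1."""
    return lam + cin(2.0 * math.pi * lam) / (math.pi ** 2 * lam) - 1.0


if __name__ == "__main__":
    for lam in (1.0, 0.68, 0.60):
        print(lam, C(lam))
```

Under this sketch $C(1)$ and $C(0.68)$ come out positive while $C(0.60)$ is negative, which is consistent with the statement that the sign change occurs a little below $0.68$.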
References
1. H. Davenport, Multiplicative number theory, Lectures in Advanced Math., no. 1, Markham,
Chicago, Ill., 1967. MR 36 #117.
2. Edmund Landau, Handbuch der Lehre von der Verteilung der Primzahlen, Teubner, Berlin, 1909.
3. M. L. Mehta, Random matrices and the statistical theory of energy levels, Academic Press,
New York, 1967. MR 36 #3554.
4. Hugh L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227,
Springer-Verlag, Berlin and New York, 1971.
5. Hugh L. Montgomery and R. C. Vaughan, Hilbert's inequality (to appear).
6. J. B. Rosser, J. M. Yohe and L. Schoenfeld, Rigorous computation and the zeros of the
Riemann zeta-function (with discussion), Information Processing 68 (Proc. IFIP Congress, Edinburgh,
1968), vol. 1, Math., Software, North-Holland, Amsterdam, 1969, pp. 70-76. MR 41 #2892.
7. Atle Selberg, On the zeros of Riemann's zeta-function, Skr. Norske Vid. Akad. Oslo I (1942),
no. 10, 59 pp. MR 6, 58.
8. E. C. Titchmarsh, The theory of the Riemann zeta-function, Clarendon Press, Oxford, 1951.
MR 13, 741.
Trinity College
Cambridge, England
METRIC THEOREMS ON
THE DISTRIBUTION OF SEQUENCES
H. NIEDERREITER
1. Introduction. The theory of uniform distribution of sequences,
historically an outgrowth of diophantine approximations, is more and more becoming
the meeting ground of areas as diverse as number theory, harmonic analysis,
probability, and functional analysis. The very definition of uniform distribution
is in essence of a probabilistic nature, and so this aspect of the theory is rather
evident. The interest of number theorists in the subject centers mainly around the
study of various special sequences stemming from, or having an impact on,
classical areas of number theory. Well-known is the interplay between uniform
distribution modulo one and, to name only a few, diophantine approximations,
normal numbers, exponential sums, and the probabilistic theory of additive and
multiplicative functions. But there are also relations to the distribution of primes
(see [40]) and the theory of transcendental numbers (see [33, § 4]). For surveys
of the general theory of uniform distribution, we refer to Koksma [28], Cigler
and Helmberg [9], and Kuipers and Niederreiter [30].
In this paper, we shall adopt the probabilistic viewpoint. The general problem
we are facing may be very crudely stated as follows. Given a class of sequences in
the unit interval or in some multidimensional unit cube, determine whether it
obeys some of the common laws of probability theory. For instance, can a sequence
chosen at random from the given class be expected to be uniformly distributed?
If so, can one prove quantitative refinements of the result? What is the average
order of magnitude of the discrepancy for sequences from the given class? Does the
law of the iterated logarithm hold? In this generality, one cannot hope for decisive
answers to these questions. Therefore, one usually considers only fairly explicit
classes of sequences of either probabilistic or number-theoretic significance. Most
of the results described here will concern the class of all sequences contained in the
domain under consideration. However, in the last section, we provide some
information about important classes of special sequences which have received much
attention in the literature. For an account of the interplay between uniform
distribution and probability on a general level, see the paper of Kemperman [25].
AMS 1970 subject classifications. Primary 10-02, 10F40, 10K05; Secondary 05A05, 60F10, 62-02,
62G15.
© 1973, American Mathematical Society
The present paper is to a large part expository, since the detailed proofs of the
new results will be published elsewhere (see [38], [39]). It is divided into six
sections. A brief summary follows.
In § 2, we introduce the so-called discrepancy of a sequence which provides
a quantitative measure for the deviation of the distribution of the sequence from
a given distribution. We mention in fact various notions of discrepancy which
have been considered in the literature, but we shall in the sequel concentrate on the
so-called extreme discrepancy.
The purpose of the next section is to exhibit the close relation between
discrepancy and the theory of empirical distribution functions, a relation which has
not been noticed so far. The concept of extreme discrepancy turns out to be a rather
special case of a notion very familiar to statisticians, namely Kolmogorov's two-
sided test.
In § 4 we discuss "global results", i.e., results pertaining to the class of all
sequences contained in the domain under consideration. In the one-dimensional
case, we have the general limit theorem of Kolmogorov for empirical distribution
functions. An asymptotic expansion of this limit theorem can be used to obtain
a law of the iterated logarithm. In the multidimensional case, no analogue to
Kolmogorov's limit theorem is known, but a law of the iterated logarithm can
still be obtained.
Some interesting combinatorial aspects arise in § 5. Here we are concerned
with finding the measure of the set of sequences in the unit interval whose $N$th
discrepancy is bounded by some prescribed real number. Notions from transversal
theory will turn out to be useful.
In the last section, we give a survey of the most interesting results on classes
of special sequences, notably lacunary sequences. Due to the large number of
papers written on the subjects covered in this paper, especially on the topics
belonging to probability and statistics, the bibliography constitutes only a
selection of the literature, but I hope it is a representative and useful one.
I would like to take this opportunity to thank Professor Walter Philipp for
many enlightening conversations on the subject of probabilistic number theory.
2. Discrepancy. Let $E = [0, 1)$ be the unit interval, and let $\omega = (x_n)$, $n = 1, 2, \ldots$,
be a sequence of elements from $E$. We are interested in the distribution of the
sequence $\omega$ in $E$.
For a positive integer $N$ and a subset $M$ of $E$, let $A(M; N)$ be the number of
elements $x_n$, $1 \le n \le N$, with $x_n \in M$. Furthermore, let $f$ be a continuous distribution
function on $[0, 1]$, i.e., a continuous nondecreasing function $f$ on $[0, 1]$ with
$f(0) = 0$ and $f(1) = 1$.
Definition 2.1. For $N \ge 1$ and $0 \le \alpha \le 1$, we define the local discrepancy
$R_N(\alpha; f)$ of the sequence $\omega$ with respect to the continuous distribution function $f$ by
$$R_N(\alpha; f) = A([0, \alpha); N)/N - f(\alpha).$$
Definition 2.2. The sequence $\omega$ is called uniformly distributed in $E$ with
respect to the distribution function $f$ if $\lim_{N \to \infty} R_N(\alpha; f) = 0$ for $0 \le \alpha \le 1$. If $f_u$ is
the uniform distribution on $[0, 1]$, i.e., $f_u(x) = x$ for $0 \le x \le 1$, then a sequence $\omega$
uniformly distributed in $E$ with respect to $f_u$ is simply called uniformly distributed
in $E$.
We arrive at a notion of discrepancy by taking some function norm of the local
discrepancy $R_N(\alpha; f)$, considered as a function of $\alpha$ on $[0, 1]$. Using the most
common norms, namely the supremum norm and the $L^p$ norm, we arrive at the
following definitions, respectively.
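Definition 2.3. For $N \ge 1$, the (extreme) discrepancy $D_N(\omega; f)$ of the
sequence $\omega$ with respect to $f$ is defined by
$$D_N(\omega; f) = \sup_{0 \le \alpha \le 1} |R_N(\alpha; f)|.$$
If $f = f_u$, then we write $D_N(\omega)$ instead of $D_N(\omega; f_u)$.
Definition 2.4. For $1 \le p < \infty$ and $N \ge 1$, the $L^p$ discrepancy $D_N^{(p)}(\omega; f)$ of the
sequence $\omega$ with respect to $f$ is defined by
$$D_N^{(p)}(\omega; f) = \left( \int_0^1 |R_N(\alpha; f)|^p\, d\alpha \right)^{1/p}.$$
If $f = f_u$, then we write $D_N^{(p)}(\omega)$ instead of $D_N^{(p)}(\omega; f_u)$.
We have obvious inequalities relating these discrepancies, namely
(1) $D_N^{(p)}(\omega; f) \le D_N^{(q)}(\omega; f)$ whenever $1 \le p \le q < \infty$,
and
(2) $D_N^{(p)}(\omega; f) \le D_N(\omega; f)$ for $1 \le p < \infty$.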
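The definitions and the two inequalities can be illustrated for the uniform distribution $f_u$. The sketch below is an illustration under stated assumptions, not part of the paper: it uses the fact that between consecutive sorted points the local discrepancy is $n/N - \alpha$, so the integral of $|R_N|^p$ can be written in closed form on each piece. The function names are hypothetical.

```python
import math
import random


def _abs_power_integral(c, lo, hi, p):
    """Integral of |c - a|^p over a in [lo, hi], via an explicit antiderivative."""
    def antiderivative(a):
        d = a - c
        return math.copysign(abs(d) ** (p + 1), d) / (p + 1)
    return antiderivative(hi) - antiderivative(lo)


def lp_discrepancy(points, p):
    """L^p discrepancy of the finite point set with respect to f_u(x) = x."""
    xs = [0.0] + sorted(points) + [1.0]
    n_pts = len(points)
    total = 0.0
    for n in range(n_pts + 1):
        # On (xs[n], xs[n+1]] exactly n points are < alpha, so R_N = n/N - alpha.
        total += _abs_power_integral(n / n_pts, xs[n], xs[n + 1], p)
    return total ** (1.0 / p)


def extreme_discrepancy(points):
    """Supremum of |R_N(alpha; f_u)| over 0 <= alpha <= 1."""
    xs = [0.0] + sorted(points) + [1.0]
    n_pts = len(points)
    best = 0.0
    for n in range(n_pts + 1):
        lo, hi, c = xs[n], xs[n + 1], n / n_pts
        if lo < hi:
            # |c - alpha| is monotone pieces of a V shape; its sup is at an endpoint.
            best = max(best, abs(c - lo), abs(c - hi))
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [rng.random() for _ in range(20)]
    print(lp_discrepancy(pts, 1), lp_discrepancy(pts, 2), extreme_discrepancy(pts))
```

For any finite point set the printed values should be nondecreasing from left to right, in line with (1) and (2).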
The importance of these discrepancies stems from the following simple facts, the
first of which is well-known.
Lemma 2.1. The sequence $\omega$ is uniformly distributed in $E$ with respect to $f$ if
and only if $\lim_{N \to \infty} D_N(\omega; f) = 0$.
Proof. The sufficiency of the condition is obvious. Conversely, suppose that
$\omega$ is uniformly distributed with respect to $f$. For $N \ge 1$, let $g_N$ be the function defined
by $g_N(\alpha) = A([0, \alpha); N)/N$. Then $g_1, g_2, \ldots, g_N, \ldots$ is a sequence of nondecreasing
functions converging pointwise to the continuous function $f$. By the theorem of
Polya-Cantelli [19, pp. 319-321], the convergence must be uniform, and the result
is established.
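Lemma 2.2. Let $1 \le p < \infty$ be given. Then the sequence $\omega$ is uniformly distributed
in $E$ with respect to $f$ if and only if $\lim_{N \to \infty} D_N^{(p)}(\omega; f) = 0$.
Proof. If $\omega$ is uniformly distributed with respect to $f$, then Lemma 2.1 and
(2) imply $\lim_{N \to \infty} D_N^{(p)}(\omega; f) = 0$. To prove the converse, it suffices to show that
$\lim_{N \to \infty} D_N^{(1)}(\omega; f) = 0$ implies the uniform distribution of $\omega$ with respect to $f$.
Using inequality (1), we get then the desired result for all values of $p$ under
consideration.
So suppose that $\lim_{N \to \infty} \int_0^1 |R_N(\alpha; f)|\, d\alpha = 0$. We observe that $\lim_{N \to \infty} R_N(\alpha; f)
= 0$ is trivial for $\alpha = 0$ and $1$. Now choose $\varepsilon > 0$ and $\beta \in (0, 1)$. Suppose that
$R_N(\beta; f) \ge \varepsilon$ for some $N \ge 1$. Since $f$ is continuous, there exists $\delta$ with $0 < \delta \le 1 - \beta$
such that $f(\gamma) \le f(\beta) + \varepsilon/2$ for $\beta \le \gamma \le \beta + \delta$. Then, for $\beta \le \gamma \le \beta + \delta$,
$$R_N(\gamma; f) \ge A([0, \beta); N)/N - f(\beta) - \varepsilon/2 = R_N(\beta; f) - \varepsilon/2 \ge \varepsilon/2.$$
Consequently, we have $\int_0^1 |R_N(\alpha; f)|\, d\alpha \ge \varepsilon\delta/2$. By hypothesis, this inequality
can only hold for finitely many $N$. It follows that $R_N(\beta; f) < \varepsilon$ for sufficiently large
$N$. A similar argument shows that $R_N(\beta; f) > -\varepsilon$ for sufficiently large $N$, and
the proof is complete.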
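For later purposes, the following alternative formula for the extreme
discrepancy will be useful, which differs only slightly from Definition 2.3.
Lemma 2.3. The extreme discrepancy $D_N(\omega; f)$ is also given by
$$D_N(\omega; f) = \sup_{0 \le \alpha \le 1} |A([0, \alpha]; N)/N - f(\alpha)|.$$
Proof. If $\alpha$ is not one of the first $N$ elements of the sequence $\omega$, then
$A([0, \alpha); N) = A([0, \alpha]; N)$. To deal with the finitely many remaining possibilities
for $\alpha$, we choose $\varepsilon > 0$, and observe that there exists $\delta$ with $0 < \delta \le 1 - \max_{1 \le n \le N} x_n$
such that $A([0, x_n]; N) = A([0, x_n + \delta); N)$ and $f(x_n + \delta) \le f(x_n) + \varepsilon$ for $1 \le n \le N$.
Then
$$\left| \frac{A([0, x_n]; N)}{N} - f(x_n) \right| = \left| \frac{A([0, x_n + \delta); N)}{N} - f(x_n) \right| \le \left| \frac{A([0, x_n + \delta); N)}{N} - f(x_n + \delta) \right| + \varepsilon,$$
and so
$$|A([0, x_n]; N)/N - f(x_n)| \le D_N(\omega; f) + \varepsilon.$$
Together with the earlier remark, it follows that
(3) $\sup_{0 \le \alpha \le 1} |A([0, \alpha]; N)/N - f(\alpha)| \le D_N(\omega; f)$.
In the other direction, we note that
(4) $|A([0, \beta); N)/N - f(\beta)| \le \sup_{0 \le \alpha \le 1} |A([0, \alpha]; N)/N - f(\alpha)|$
for $\beta = 0$ and for all $\beta$ which are not among the first $N$ elements of $\omega$. It remains to
consider the $x_n$, $1 \le n \le N$, with $x_n > 0$. If such $x_n$ exist at all, we will proceed as
follows. For given $\varepsilon > 0$, there exists $\eta$ with $0 < \eta \le \min_{1 \le n \le N;\, x_n > 0} x_n$ such that
$A([0, x_n); N) = A([0, x_n - \eta]; N)$ and $f(x_n - \eta) \ge f(x_n) - \varepsilon$ for all $x_n$, $1 \le n \le N$, with
$x_n > 0$. Then
$$\left| \frac{A([0, x_n); N)}{N} - f(x_n) \right| \le \left| \frac{A([0, x_n - \eta]; N)}{N} - f(x_n - \eta) \right| + f(x_n) - f(x_n - \eta) \le \left| \frac{A([0, x_n - \eta]; N)}{N} - f(x_n - \eta) \right| + \varepsilon,$$
and so
$$|A([0, x_n); N)/N - f(x_n)| \le \sup_{0 \le \alpha \le 1} |A([0, \alpha]; N)/N - f(\alpha)| + \varepsilon$$
for all $x_n$ from above. Together with (4), this proves inequality (3) in the reverse
sense.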
We remark that statements analogous to Lemma 2.3 will also hold for the
discrepancies $D_N^{(p)}(\omega; f)$, $1 \le p < \infty$, since
$$|A([0, \alpha); N)/N - f(\alpha)| \quad \text{and} \quad |A([0, \alpha]; N)/N - f(\alpha)|$$
differ only for finitely many values of $\alpha$, and so the integral defining $D_N^{(p)}(\omega; f)$
will not be affected by this change.
The dependence of $D_N(\omega; f)$ on the first $N$ elements of $\omega$ can be made very
explicit. We prove the following representation for $D_N(\omega; f)$ which generalizes
[38, Theorem 1].
Lemma 2.4. Let $x_1, \ldots, x_N$ be the first $N$ terms of $\omega$, arranged in ascending
order. Then
$$D_N(\omega; f) = \max_{1 \le n \le N} \max\left( |f(x_n) - n/N|,\ |f(x_n) - (n - 1)/N| \right)
= 1/2N + \max_{1 \le n \le N} |f(x_n) - (2n - 1)/2N|.$$
Proof. For notational convenience, set $x_0 = 0$ and $x_{N+1} = 1$. The distinct
values of the numbers $x_n$, $0 \le n \le N + 1$, define a subdivision of $[0, 1]$. Therefore,
$$D_N(\omega; f) = \max_{0 \le n \le N;\, x_n < x_{n+1}}\ \sup_{x_n < \alpha \le x_{n+1}} \left| \frac{A([0, \alpha); N)}{N} - f(\alpha) \right|
= \max_{0 \le n \le N;\, x_n < x_{n+1}}\ \sup_{x_n < \alpha \le x_{n+1}} \left| \frac{n}{N} - f(\alpha) \right|.$$
Whenever $x_n < x_{n+1}$, the function $g_n(\alpha) = |n/N - f(\alpha)|$ attains its maximum in
$[x_n, x_{n+1}]$ at one of the endpoints of the interval, since $f$ is monotone. Using also
the continuity of $f$, we conclude that
$$D_N(\omega; f) = \max_{0 \le n \le N;\, x_n < x_{n+1}} \max\left( |n/N - f(x_n)|,\ |n/N - f(x_{n+1})| \right).$$
From here on, one proceeds as in the proof of [38, Theorem 1].
Corollary 2.1. Let $f$ be a continuous distribution function on $[0, 1]$ mapping
$E$ into $E$. For a sequence $\omega = (x_n)$ in $E$, let $f(\omega)$ denote the sequence $f(\omega) = (f(x_n))$.
Then $D_N(f(\omega)) = D_N(\omega; f)$.
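Lemma 2.4 reduces the supremum in Definition 2.3 to a finite maximum, which makes the discrepancy directly computable. The following minimal sketch (function names are assumptions for illustration) evaluates both expressions of Lemma 2.4 and can be used to check Corollary 2.1 on a sample, here with the hypothetical choice $f(x) = x^2$.

```python
import random


def discrepancy_max_form(points, f=lambda x: x):
    """D_N(omega; f) as max over n of max(|f(x_n) - n/N|, |f(x_n) - (n-1)/N|)."""
    xs = sorted(points)
    N = len(xs)
    return max(
        max(abs(f(x) - n / N), abs(f(x) - (n - 1) / N))
        for n, x in enumerate(xs, start=1)
    )


def discrepancy_midpoint_form(points, f=lambda x: x):
    """D_N(omega; f) as 1/(2N) + max over n of |f(x_n) - (2n-1)/(2N)|."""
    xs = sorted(points)
    N = len(xs)
    return 1 / (2 * N) + max(
        abs(f(x) - (2 * n - 1) / (2 * N)) for n, x in enumerate(xs, start=1)
    )


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [rng.random() for _ in range(50)]
    square = lambda x: x * x  # a continuous distribution function on [0, 1]
    print(discrepancy_max_form(pts), discrepancy_midpoint_form(pts))
    # Corollary 2.1: discrepancy of the transformed points equals D_N(omega; f).
    print(discrepancy_max_form([square(x) for x in pts]),
          discrepancy_max_form(pts, square))
```

Each printed pair should agree up to rounding.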
For $k \ge 2$, let $E^k = \{(x_1, \ldots, x_k) \in \mathbb{R}^k : 0 \le x_j < 1 \text{ for } 1 \le j \le k\}$ be the $k$-dimensional
unit cube. The theory of discrepancy may as well be developed for sequences
$\omega$ in $E^k$. Let $f$ be a continuous distribution function on $\bar{E}^k = \{(x_1, \ldots, x_k) \in \mathbb{R}^k :
0 \le x_j \le 1 \text{ for } 1 \le j \le k\}$, i.e., a continuous distribution function $f$ on $\mathbb{R}^k$ which
satisfies $f(1, \ldots, 1) = 1$ and $f(x_1, \ldots, x_k) = 0$ whenever at least one of the $x_j$ is zero.
With the obvious definition of the counting function $A(M; N)$, one may then define
the local discrepancy of $\omega$ with respect to $f$ by
$$R_N(\alpha_1, \ldots, \alpha_k; f) = A([0, \alpha_1) \times \cdots \times [0, \alpha_k); N)/N - f(\alpha_1, \ldots, \alpha_k)$$
for $(\alpha_1, \ldots, \alpha_k) \in \bar{E}^k$. The extreme discrepancy $D_N(\omega; f)$ is then
$$D_N(\omega; f) = \sup_{(\alpha_1, \ldots, \alpha_k) \in \bar{E}^k} |R_N(\alpha_1, \ldots, \alpha_k; f)|,$$
and, for $1 \le p < \infty$, we set
$$D_N^{(p)}(\omega; f) = \left( \int_0^1 \cdots \int_0^1 |R_N(\alpha_1, \ldots, \alpha_k; f)|^p\, d\alpha_1 \ldots d\alpha_k \right)^{1/p}.$$
An important special case results when $f = f_u$, the uniform distribution given by
$f_u(x_1, \ldots, x_k) = x_1 \cdots x_k$ for $(x_1, \ldots, x_k) \in \bar{E}^k$. Again we shall write $D_N(\omega)$ instead of
$D_N(\omega; f_u)$.
It is easily seen that Lemmas 2.1, 2.2, and 2.3 can be extended to the
multidimensional case. However, there is no obvious analogue to Lemma 2.4 in the
multidimensional case, and this is due to the lack of certain order properties in Ek.
For results which can be shown in the direction of Lemma 2.4, we refer to [38].
We shall see in §§ 4 and 5 that the absence of a result such as Lemma 2.4 causes
extreme difficulties in the metric theory of discrepancy in Ek.
A comprehensive treatment of discrepancy and its role in uniform distribution
can be found in the following books and articles: Koksma [28, Kapitel IX],
Cigler and Helmberg [9], Hlawka [22], Niederreiter [36], and Kuipers and
Niederreiter [30, Chapter 2].
3. Empirical distribution functions. We adopt the following general
viewpoint. We are given a probability space $(X, U, \mu)$, i.e., a nonvoid set $X$, a $\sigma$-algebra
$U$ of subsets of $X$, and a probability measure $\mu$ defined on $U$. Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be
independent random variables on $(X, U, \mu)$ with a (not necessarily continuous)
common distribution function $F(t)$. The following definition is well-known in
mathematical statistics.
Definition 3.1. For $x \in X$ and a positive integer $N$, the empirical
distribution function $F_N(t, x)$ is defined for all real $t$ by $1/N$ times the number of $\xi_n(x)$,
$1 \le n \le N$, with $\xi_n(x) \le t$.
An important test, known as Kolmogorov's two-sided test or Kolmogorov's
statistic, is based on empirical distribution functions.
Definition 3.2. For $x \in X$ and a positive integer $N$, define
(5) $G_N(x) = \sup_{t \in \mathbb{R}} |F_N(t, x) - F(t)|$.
Let us show that $G_N(x)$ is a generalization of the extreme discrepancy $D_N(\omega; f)$.
First of all, we note that any infinite sequence $\omega$ in $E$ may be identified with a point
in the infinite-dimensional unit cube $E^\infty$, i.e., in the cartesian product of countably
many copies of $E$. For the given continuous distribution function $f$ on $[0, 1]$, let $\nu$
be the Borel measure in $E$ determined by $\nu([0, t)) = f(t)$ for $0 \le t \le 1$. Then the
product probability space $(E^\infty, \mathfrak{B}^\infty, \nu_\infty) = \bigotimes_{i=1}^{\infty} (E_i, \mathfrak{B}_i, \nu_i)$, where $E_i = E$, $\mathfrak{B}_i = \mathfrak{B}$
(the $\sigma$-algebra of Borel sets in $E$), and $\nu_i = \nu$ for all $i = 1, 2, \ldots$, is taken as the
probability space $(X, U, \mu)$. Let us extend the distribution function $f$ on $[0, 1]$ to a
distribution function $F$ on $\mathbb{R}$ by putting $F(t) = 0$ for $t < 0$, $F(t) = f(t)$ for $0 \le t \le 1$,
and $F(t) = 1$ for $t > 1$. For $n \ge 1$, let $\xi_n$ be the $n$th coordinate projection in $E^\infty$, i.e.,
$\xi_n(\omega) = \xi_n(x_1, \ldots, x_n, \ldots) = x_n$ for all $\omega = (x_1, \ldots, x_n, \ldots) \in E^\infty$. Then $\xi_1, \xi_2, \ldots, \xi_n, \ldots$
are independent random variables on $(E^\infty, \mathfrak{B}^\infty, \nu_\infty)$ with $F(t)$ as their common
distribution function. We observe that, for $\omega \in E^\infty$ and $N \ge 1$, the empirical
distribution function $F_N(t, \omega)$ satisfies $F_N(t, \omega) = 0$ for $t < 0$, $F_N(t, \omega) = A([0, t]; N)/N$
for $0 \le t \le 1$, and $F_N(t, \omega) = 1$ for $t > 1$. Therefore, by using Definition 3.2 and
Lemma 2.3, it follows that $G_N(\omega) = D_N(\omega; f)$ for all $\omega \in E^\infty$ and all $N \ge 1$. Thus the
extreme discrepancy $D_N(\omega; f)$ is just a special case of Kolmogorov's two-sided
test.
Detailed bibliographies on Kolmogorov's test were compiled by Darling [10]
for the work prior to 1957 and by Barton and Mallows [3] for the work between
1957 and 1965.
In later sections, we shall be interested in the probability $\mu(\{x \in X : G_N(x) \le \alpha\})$
for given $N \ge 1$ and $\alpha \ge 0$. It is important to note that this probability does not
depend on the distribution function $F(t)$ as long as $F(t)$ is continuous, or, to use
the technical term, the above probability is distribution-free in the class of
continuous distribution functions. This was first observed by Wald and Wolfowitz
[51]. For the proof, we start from a sequence $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ of independent
random variables on $(X, U, \mu)$ with common continuous distribution function
$F(t)$. Then $F \circ \xi_1, F \circ \xi_2, \ldots, F \circ \xi_n, \ldots$ is again a sequence of independent random
variables on $(X, U, \mu)$, having as their common distribution function the uniform
distribution $F_u(t)$, given by $F_u(t) = 0$ for $t < 0$, $F_u(t) = t$ for $0 \le t \le 1$, and $F_u(t) = 1$
for $t > 1$. It is also easily seen that $G_N(x) = \widetilde{G}_N(x)$ for all $x \in X$, where $\widetilde{G}_N(x)$ is the
expression in (5) corresponding to the random variables $F \circ \xi_n$, $n = 1, 2, \ldots$. This
should be compared with Corollary 2.1. We note that, when dealing with the
probabilities $\mu(\{x \in X : G_N(x) \le \alpha\})$ for continuous distribution functions, we may
therefore assume without loss of generality that the given random variables have
the uniform distribution $F_u(t)$ as their common distribution function. If the continuity
of $F(t)$ is dropped, the above argument is no longer valid. In fact, the
probability $\mu(\{x \in X : G_N(x) \le \alpha\})$ is not distribution-free in the class of all distribution
functions (see [46]).
Restricting our attention to the uniform distribution $F_u(t)$, we note that the
$L^2$ discrepancy $D_N^{(2)}(\omega)$ may be viewed as a special case of the Cramer-Smirnov
test of mathematical statistics [10]. With the same notation as in Definition 3.2, we
set
(6) $G_N^{(2)}(x) = \left( \int_{-\infty}^{\infty} (F_N(t, x) - F_u(t))^2\, dt \right)^{1/2}$.
For the probability space $(E^\infty, \mathfrak{B}^\infty, \lambda_\infty)$, where $\lambda_\infty$ is the product measure on $E^\infty$
induced by the Lebesgue measure $\lambda$ in $E$, and for the coordinate projections as the
random variables, we have then $G_N^{(2)}(\omega) = D_N^{(2)}(\omega)$ for all $\omega \in E^\infty$ and all $N \ge 1$.
It is obvious how the above notions have to be extended to the
multidimensional case. One considers now a sequence $\eta_1, \eta_2, \ldots, \eta_n, \ldots$ of independent random
variables on $(X, U, \mu)$ with values in $\mathbb{R}^k$, and having a common distribution
function $F(t_1, \ldots, t_k)$. The empirical distribution function $F_N(t_1, \ldots, t_k; x)$ is now defined
for all $(t_1, \ldots, t_k) \in \mathbb{R}^k$ by $1/N$ times the number of $\eta_n(x) = (\eta_{n1}(x), \ldots, \eta_{nk}(x))$, $1 \le n \le N$,
for which $\eta_{ni}(x) \le t_i$ for $1 \le i \le k$. Then one sets
$$G_N(x) = \sup_{(t_1, \ldots, t_k) \in \mathbb{R}^k} |F_N(t_1, \ldots, t_k; x) - F(t_1, \ldots, t_k)|.$$
The definition of $G_N^{(2)}(x)$ may be extended in a completely analogous fashion.
4. Global results. It follows from the methods of H. Weyl in his fundamental
paper [52] on uniform distribution of sequences that $\lambda_\infty$-almost all $\omega \in E^\infty$ are
uniformly distributed in $E$. This can be deduced readily from the strong law of
large numbers. The same fact is also an immediate consequence of the following
central result from the theory of empirical distribution functions.
Lemma 4.1 (Glivenko-Cantelli theorem). Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be
independent random variables on $(X, U, \mu)$ with common distribution function $F(t)$. Then
$\lim_{N \to \infty} G_N(x) = 0$ $\mu$-a.e.
For the proof, see [19, p. 279] and [44, pp. 335-336]. We are now interested
mainly in quantitative refinements of the Glivenko-Cantelli theorem. We shall
state these results for $G_N(x)$, and point out once again that they include theorems
for all the discrepancies $D_N(\omega; f)$.
By Kolmogorov's limit theorem given below, the quantity $N^{1/2} G_N(x)$ has a
limiting distribution in case $F(t)$ is continuous. The remarks in the previous section
imply that this limiting distribution is independent of the nature of the function
$F(t)$.
Theorem 4.1 (Kolmogorov's limit theorem). Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be
independent random variables on $(X, U, \mu)$ with a common continuous distribution
function. Then
(7) $\lim_{N \to \infty} \mu(\{x \in X : N^{1/2} G_N(x) \le \alpha\}) = 1 - 2 \sum_{j=1}^{\infty} (-1)^{j+1} e^{-2j^2\alpha^2}$ for $\alpha > 0$.
Various proofs of this limit theorem are available in the literature. For the
original proof, see [29] and also §5. Another proof was given by Feller [18].
Doob [12] made the remarkable observation that the limiting distribution in (7)
also occurs in limit laws for certain Gaussian processes, and that Kolmogorov's
limit theorem may in fact be deduced from a heuristic invariance principle for
Gaussian processes. Donsker [11] supplied the proof of this invariance principle.
See also Billingsley [4, §13]. The distribution function in (7) was tabulated by
Owen [41, §15] and Renyi [44, Tabelle 8].
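The limiting distribution in (7) is easy to evaluate numerically, since the series converges very quickly for moderate $\alpha$. The following is a small sketch under the assumption that truncating the alternating series once its terms fall below a tolerance is adequate; the function name and tolerance are hypothetical.

```python
import math


def kolmogorov_limit_cdf(alpha, tol=1e-16):
    """Evaluate 1 - 2 * sum_{j>=1} (-1)^(j+1) * exp(-2 j^2 alpha^2) for alpha > 0."""
    if alpha <= 0:
        return 0.0
    total = 0.0
    j = 1
    while True:
        term = math.exp(-2.0 * j * j * alpha * alpha)
        total += term if j % 2 == 1 else -term
        if term < tol:
            break
        j += 1
    return 1.0 - 2.0 * total


if __name__ == "__main__":
    for a in (0.5, 1.0, 1.358, 2.0):
        print(a, kolmogorov_limit_cdf(a))
```

For example, the value at $\alpha = 1.358$ is close to $0.95$, which matches the familiar asymptotic 5% critical value of Kolmogorov's test.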
Chung [8] proves Kolmogorov's limit theorem with a remainder term. As a
consequence, he shows the following zero-one law. Let $(\lambda_N)$, $N = 1, 2, \ldots$, be a
sequence of real numbers tending monotonically to infinity. Then the inequality
$N^{1/2} G_N(x) > \lambda_N$ will be satisfied for infinitely many $N$ with probability zero or one
according as the series
$$\sum_{N=1}^{\infty} \frac{\lambda_N^2}{N} \exp(-2\lambda_N^2)$$
converges or diverges. Specifically, choosing $\lambda_N = (\frac{1}{2} \log\log N)^{1/2}$ for sufficiently
large $N$, it follows that
(8) $\limsup_{N \to \infty} \frac{(2N)^{1/2} G_N(x)}{(\log\log N)^{1/2}} \ge 1$ $\mu$-a.e.
Another consequence of Chung's refinement of Theorem 4.1 is the law of the
iterated logarithm for $G_N(x)$ which strengthens (8).
Theorem 4.2 (Law of the iterated logarithm). Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be
independent random variables on $(X, U, \mu)$ with a common continuous distribution
function. Then
$$\limsup_{N \to \infty} \frac{(2N)^{1/2} G_N(x)}{(\log\log N)^{1/2}} = 1 \quad \mu\text{-a.e.}$$
For another approach to Theorem 4.2, we refer to Cassels [7]. An analogue
to Theorem 4.1 can even be shown for the case where the common distribution
function $F(t)$ of the random variables $\xi_n$, $n = 1, 2, \ldots$, is not continuous (see Schmid
[46]). However, the resulting limiting distribution is not distribution-free, as it
will depend on the jumps of the function $F(t)$.
A theory similar to that for $G_N(x)$ can be developed for the Cramer-Smirnov
test $G_N^{(2)}(x)$ as defined in (6). The counterpart to Theorem 4.1 for $G_N^{(2)}(x)$ reads
as follows.
Theorem 4.3. Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be independent random variables on
$(X, U, \mu)$ with the uniform distribution $F_u(t)$ as their common distribution function.
Then
(9) $\lim_{N \to \infty} \mu(\{x \in X : N^{1/2} G_N^{(2)}(x) \le \alpha\})$
where $K_{1/4}(t)$ is the modified Bessel function of the third kind of order $\frac{1}{4}$ (see [16]).
The limiting distribution for $N^{1/2} G_N^{(2)}(x)$ was first obtained by Smirnov [48].
In the above form, the distribution function was given by Anderson and Darling
[1] who also tabulated the function (see also [41, § 16]). In the proof, one again
seeks to arrive at the desired limit law by invoking an invariance principle for a
suitable Gaussian process. This Gaussian process is defined in terms of the
eigenvalues and the normalized eigenfunctions of the integral equation
(10) $f(t) = \lambda \int_0^1 (\min(s, t) - st) f(s)\, ds$.
Using the Fredholm determinant of the kernel in (10), one finds the limiting
characteristic function of $(N^{1/2} G_N^{(2)}(x))^2$. By a standard inversion technique using
Laplace transforms, the limiting distribution function in (9) is obtained. As far
as I know, a law of the iterated logarithm is not yet established for $G_N^{(2)}(x)$.
Let us now discuss the multidimensional Kolmogorov test $G_N(x)$. For
dimension $k > 1$, the natural problems one might want to pose are much harder to settle.
It is an unpleasant new feature that for $k > 1$ Kolmogorov's test fails to be
distribution-free, even in the class of continuous distribution functions (see the
example constructed by Simpson [47]). Although one knows that $N^{1/2} G_N(x)$
has again a limiting distribution for continuous $F(t_1, \ldots, t_k)$, which will of course
depend on $F$, the nature of the distribution function could not be determined for
any continuous $F$.
Most of the known results hold uniformly for all continuous distribution
functions F(t1,..., tk). The closest approximation to a multidimensional analogue
of Theorem 4.1 is the following result of Kiefer [26].
Theorem 4.4. Let $\eta_1, \eta_2, \ldots, \eta_n, \ldots$ be independent random variables on
$(X, U, \mu)$ with values in $\mathbb{R}^k$, and having a common continuous distribution function
$F(t_1, \ldots, t_k)$. Then, for every $\varepsilon > 0$, there exists a positive constant $c = c(\varepsilon, k)$,
independent of $F$, such that
$$\mu(\{x \in X : N^{1/2} G_N(x) \le \alpha\}) \ge 1 - c \exp(-(2 - \varepsilon)\alpha^2)$$
holds for all $N \ge 1$ and all $\alpha \ge 0$.
A weaker result was obtained earlier by Kiefer and Wolfowitz [27]. For $k = 1$,
an optimal result of this type was shown by Dvoretzky, Kiefer, and Wolfowitz [14].
However, the multidimensional metric theory of GN(x) is not all that
unsatisfactory. The most decisive result is the unrestricted extension of Theorem 4.2.
This was achieved by Kiefer [26].
Theorem 4.5 (Multidimensional law of the iterated logarithm). Let
$\eta_1, \eta_2, \ldots, \eta_n, \ldots$ be independent vector-valued random variables on $(X, U, \mu)$ with a
common continuous distribution function. Then
$$\limsup_{N \to \infty} \frac{(2N)^{1/2} G_N(x)}{(\log\log N)^{1/2}} = 1 \quad \mu\text{-a.e.}$$
For two other approaches to Theorem 4.5 containing some refinements, see
Philipp [42], [43, Chapter 4] and Zaremba [53].
One of the major difficulties in the multidimensional case seems to be that,
although the desired limit laws are again intimately connected with certain
(multidimensional) Gaussian processes, we cannot use these correspondences to
the same extent as in the one-dimensional case, since the theory of these more
general stochastic processes is not as fully developed as one might wish.
Not much is known about the multidimensional Cramer-Smirnov test $G_N^{(2)}(x)$.
Again, the test fails to be distribution-free for $k > 1$. But, at least in theory, the
problem of the limiting distribution of $N^{1/2} G_N^{(2)}(x)$ can be reduced to the problem
of finding the limit law for a certain multidimensional stochastic process, which,
in turn, boils down essentially to finding the eigenvalues of certain multivariate
integral equations (see Rosenblatt [45]).
5. Combinatorial aspects. By Kolmogorov's limit theorem (see Theorem 4.1),
we know the limiting distribution of $\mu(\{x \in X : N^{1/2} G_N(x) \le \alpha\})$ as $N \to \infty$ for
continuous $F(t)$. In this section, we are interested in exact formulas for probabilities
of this type with fixed $N$. We introduce the following notation.
Definition 5.1. Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be independent random variables on
$(X, U, \mu)$ with a common continuous distribution function. For a positive integer
$N$ and $\alpha \ge 0$, we set
(11) $P_N(\alpha) = \mu(\{x \in X : G_N(x) \le \alpha\})$.
From the remarks in §3, it follows that, whenever it is convenient, we may take
for the common distribution function the uniform distribution $F_u(t)$, and that we
may also assume that the random variables $\xi_n$, $n = 1, 2, \ldots$, only attain values
in $[0, 1]$.
The computation of the probabilities $P_N(\alpha)$ is essentially a combinatorial
problem. The first formulas for the $P_N(\alpha)$, with $\alpha$ of the form $\alpha = m/N$, $m = 0, 1, \ldots, N$,
were derived recursively by Kolmogorov in his proof of Theorem 4.1. We set
$$P_N(m/N) = (N!/N^N) e^N R_{0,N}(m),$$
where $R_{i,k}(m)$ is defined for all integers $i$, for all nonnegative integers $k$, and for
$m = 1, \ldots, N$. Then, for $R_{i,k}(m)$, we have the conditions
$$R_{0,0}(m) = 1 \quad \text{for } 1 \le m \le N;$$
$$R_{i,0}(m) = 0 \quad \text{for } 1 \le m \le N,\ i \ne 0;$$
$$R_{i,k}(m) = 0 \quad \text{for } |i| \ge m \text{ and all } k;$$
and the recursion formula
$$R_{i,k+1}(m) = e^{-1} \sum_{s=0}^{2m-1} \frac{1}{s!} R_{i+1-s,k}(m) \quad \text{for } |i| \le m - 1 \text{ and all } k.$$
Numerical tables for $P_N(m/N)$ for small values of $N$ were compiled by Massey
[31], Birnbaum [5], and Owen [41, § 15]. For a slightly more general problem,
recursions of the above type were given by Durbin [13].
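The recursion lends itself to exact computation. The sketch below is one possible implementation, not Kolmogorov's own procedure: it works with the rescaled quantities $S_{i,k} = e^k R_{i,k}(m)$ (an assumption made here so that all powers of $e$ cancel and rational arithmetic suffices), which satisfy the same recursion without the factor $e^{-1}$, so that $P_N(m/N) = (N!/N^N) S_{0,N}$. The function name is hypothetical.

```python
from fractions import Fraction
from math import factorial


def kolmogorov_exact(N, m):
    """Exact value of P_N(m/N) for integers N >= 1 and 0 <= m <= N."""
    if m <= 0:
        return Fraction(0)
    indices = range(-(m - 1), m)  # S vanishes for |i| >= m
    inv_fact = [Fraction(1, factorial(s)) for s in range(2 * m)]
    # Initial conditions: S_{0,0} = 1 and S_{i,0} = 0 for i != 0.
    S = {i: Fraction(1 if i == 0 else 0) for i in indices}
    for _ in range(N):
        S = {
            i: sum(
                (inv_fact[s] * S.get(i + 1 - s, Fraction(0)) for s in range(2 * m)),
                Fraction(0),
            )
            for i in indices
        }
    return Fraction(factorial(N), N ** N) * S[0]


if __name__ == "__main__":
    for N, m in ((2, 1), (3, 1), (3, 2), (3, 3)):
        print(N, m, kolmogorov_exact(N, m))
```

Small cases can be confirmed by hand: for $N = 2$, $m = 1$ the event $G_2 \le 1/2$ requires one point in each half of the unit interval, which has probability $1/2$.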
Let us now consider $P_N(\alpha)$ for arbitrary $\alpha$. We note, first of all, that $P_N(\alpha) = 0$
for $0 \le \alpha < 1/2N$, by Lemma 2.4. Therefore, we may suppose $\alpha \ge 1/2N$ in the sequel.
A simple but general formula for $P_N(\alpha)$ may be given using a notion from
transversal theory. This area of combinatorial theory supplies some surprising
connections with problems concerning distribution of sequences. See for instance
the papers of Meijer and Niederreiter [32], Niederreiter [37], and Tijdeman [49]
on sequences in finite and countable sets. For the present purposes, we need the
following definition from transversal theory (Mirsky [34], [35]).
Definition 5.2. Let $M_1, \ldots, M_k$ be subsets of an arbitrary set $M$. A $k$-tuple
$(m_1, \ldots, m_k)$ of elements of $M$ is called a system of representatives for $M_1, \ldots, M_k$
if there exists a permutation $\sigma$ of $k$ letters such that $m_{\sigma(i)} \in M_i$ for $1 \le i \le k$.
An important step is the combinatorial characterization of the elements $x \in X$
for which $G_N(x) \le \alpha$.
Lemma 5.1. Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be independent random variables on $(X, U, \mu)$
attaining values in $[0, 1]$ and having the uniform distribution as their common
distribution function. For $N \ge 1$ and $\alpha \ge 1/2N$, define the intervals $J_i^{(\alpha)} = [i/N - \alpha,
(i - 1)/N + \alpha]$, $1 \le i \le N$. Then $G_N(x) \le \alpha$ if and only if $(\xi_1(x), \ldots, \xi_N(x))$ is a system
of representatives for $J_1^{(\alpha)}, \ldots, J_N^{(\alpha)}$.
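Lemma 5.1 can be checked experimentally for small $N$. The following sketch is illustrative only: it computes $G_N$ for uniform samples through the formula of Lemma 2.4 (using the identification of $G_N$ with the extreme discrepancy from §3) and tests the system-of-representatives condition of Definition 5.2 by brute force over all permutations. The function names are hypothetical, and the brute-force search is practical only for very small $N$.

```python
import itertools
import random


def extreme_discrepancy(values):
    """D_N for the uniform distribution, via the max formula of Lemma 2.4."""
    xs = sorted(values)
    N = len(xs)
    return max(
        max(abs(x - n / N), abs(x - (n - 1) / N))
        for n, x in enumerate(xs, start=1)
    )


def is_system_of_representatives(values, intervals):
    """True if some permutation sigma places values[sigma(i)] in intervals[i] for all i."""
    k = len(values)
    return any(
        all(lo <= values[perm[i]] <= hi for i, (lo, hi) in enumerate(intervals))
        for perm in itertools.permutations(range(k))
    )


def lemma_5_1_agrees(values, alpha):
    """Compare G_N <= alpha with the representatives condition for J_i = [i/N - a, (i-1)/N + a]."""
    N = len(values)
    intervals = [(i / N - alpha, (i - 1) / N + alpha) for i in range(1, N + 1)]
    return (extreme_discrepancy(values) <= alpha) == is_system_of_representatives(
        values, intervals
    )


if __name__ == "__main__":
    rng = random.Random(3)
    sample = [rng.random() for _ in range(4)]
    print(sample, lemma_5_1_agrees(sample, 0.3))
```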
The proof depends on a special case of Lemma 2.4 and a few combinatorial
arguments, and is given in [39]. To state the formula for $P_N(\alpha)$, we need some
more notation. For fixed $N \ge 1$ and $\alpha \ge 1/2N$, we set $a_i = \max(i/N - \alpha, 0)$ and
$b_i = \min((i - 1)/N + \alpha, 1)$ for $1 \le i \le N$. Let $H_i$ be the interval $H_i = [a_i, b_i]$, $1 \le i \le N$.
Now arrange the numbers $a_1, \ldots, a_N, b_1, \ldots, b_N$ according to their magnitudes, not
allowing repetitions: $c_0 < c_1 < \cdots < c_s$. For $1 \le j \le s$, let $I_j$ be the interval $I_j = [c_{j-1}, c_j]$.
Then the intervals $H_i$ can be written in the form $H_i = \bigcup_{j=h_i}^{k_i} I_j$ for $1 \le i \le N$. The
integers $h_i$ and $k_i$ satisfy $1 = h_1 \le h_2 \le \cdots \le h_N \le s$ and $1 \le k_1 \le k_2 \le \cdots \le k_N = s$, and
also $h_i \le k_i$ for $1 \le i \le N$.
Theorem 5.1. With the above notation, we have
(J1.....JN)
for $N \ge 1$ and $\alpha \ge 1/2N$, where the summation is extended over all systems of
representatives $(j_1, \ldots, j_N)$ for the intervals $[h_1, k_1], \ldots, [h_N, k_N]$ of integers.
The proof can be found in [39]. One makes use of Lemma 5.1, some
combinatorial arguments, and the independence of the random variables £l5..., £N.
There are also various analytic representations for PN{oc). One of them was
first mentioned by Birnbaum [5]. It is in fact a rather simple consequence of
Lemma 2.4 (see [39]). We use again the notation introduced above. For an
excellent exposition of the methods used in obtaining results of this type, see
[50, § 16].
Lemma 5.2. For $N \ge 1$ and $\alpha \ge 1/2N$, we have
(12) $P_N(\alpha) = N! \int_{a_1}^{b_1} \cdots \int_{a_N}^{b_N} \chi(t_1, \ldots, t_N)\, dt_N \cdots dt_1,$
where $\chi(t_1, \ldots, t_N) = 1$ if $t_1 \le t_2 \le \cdots \le t_N$, and $\chi(t_1, \ldots, t_N) = 0$ otherwise.
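As a sanity check on (12), take $N = 2$ and $\alpha = 1/2$. Then $[a_1, b_1] = [0, 1/2]$ and $[a_2, b_2] = [1/2, 1]$, the ordering constraint is automatic, and the integral gives $P_2(1/2) = 2! \cdot \tfrac12 \cdot \tfrac12 = \tfrac12$. The following Monte Carlo sketch estimates the same probability directly from the ordered-sample criterion. It assumes that $P_N(\alpha)$ is the probability that $G_N \le \alpha$; the function name and the default number of trials are illustrative.

```python
import random

def estimate_p(n, alpha, trials=20000, seed=0):
    """Monte Carlo estimate of P_N(alpha) via the ordered-sample criterion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = sorted(rng.random() for _ in range(n))
        # The i-th smallest point must lie in [i/N - alpha, (i-1)/N + alpha].
        if all(i / n - alpha <= x <= (i - 1) / n + alpha
               for i, x in enumerate(xs, start=1)):
            hits += 1
    return hits / trials

print(estimate_p(2, 0.5))   # should be close to the exact value 1/2
```

Such a simulation only gives a statistical check. The exact formulas discussed in this section are what one would use for tabulation.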
A recurrence formula for the integral appearing in (12) was given by Epanecnikov
[15] who also computed explicit values for $N \le 4$.
Various formulas for $P_N(\alpha)$ may be deduced from Lemma 5.2, depending
on what analytic expression one chooses for the function $\chi(t_1, \ldots, t_N)$. In one
method, one works with Dirichlet's integral (see [39]). The following formula,
also given in [39], appears to be more tractable.
Theorem 5.2. For $N \ge 1$ and $\alpha \ge 1/2N$, we have
exp(kbi) expikbs)
exp(fcai) exp(fcajv)
Another exact expression for $P_N(\alpha)$ was obtained by Kemperman [23], [24, §5].
The author uses the method of generating functions. The resulting formula is
fairly complicated. The approach of Kemperman was recently extended by Durbin
[13] to cover more general cases. Kemperman's formula reads as follows.
Theorem 5.3. For $N \ge 1$ and $\alpha \ge 1/2N$, we have
U=o i\J\
where Bv(t) is given by
with $H = [t]$, while $\sum^*$ stands for the sum over all $H$-tuples $(i_1, \ldots, i_H)$ of nonnegative
integers with $i_1 + 2i_2 + \cdots + H i_H = \nu$.
The problem of evaluating $P_N(\alpha)$ in the multidimensional case is far from
being solved. Even the basic step, namely the combinatorial characterization
similar to Lemma 5.1 of the points $x \in X$ with $G_N(x) \le \alpha$, has not yet been carried
out. Solutions of these problems will undoubtedly lead to a proof for the
multidimensional version of Kolmogorov's limit theorem (compare with the remarks
on the multidimensional case in §4).
6. Special sequences. The classical example of a uniformly distributed
sequence in E is the sequence $\omega = (\{n\alpha\})$, $n = 1, 2, \ldots$, with irrational $\alpha$, where
$\{t\} = t - [t]$ denotes the fractional part of the real number $t$. Thus, trivially, the
sequences $(\{n\alpha\})$ are uniformly distributed in E for almost all $\alpha$ (in the sense of the
Lebesgue measure in R). To obtain a quantitative refinement, one may use the
well-known discrepancy estimates for $(\{n\alpha\})$ in terms of the continued fraction
expansion of $\alpha$, together with the classical metric results of Khinchine on continued
fractions. This leads to the following theorem. For a given nondecreasing positive
function $g$ with $\sum_{n=1}^{\infty} (g(n))^{-1} < \infty$, the discrepancy $D_N(\omega)$ of the sequence
$\omega = (\{n\alpha\})$ satisfies
$N D_N(\omega) = O(g(\log\log N) \log N)$
for almost all $\alpha$. Hence, on the average, the sequences $(\{n\alpha\})$ have an extremely
even distribution.
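The slow growth of $N D_N(\omega)$ can be observed numerically. The sketch below computes the star discrepancy $\sup_t |\#\{x_i < t\}/N - t|$ of the first $N$ terms of $(\{n\alpha\})$. This is an assumption made for simplicity: the discrepancy $D_N$ of the text may be the extreme discrepancy, which differs from the star discrepancy by at most a factor of 2. The choice $\alpha = (\sqrt5 - 1)/2$ is only an illustration of a number with bounded partial quotients, and floating-point arithmetic limits the accuracy for very large $N$.

```python
from math import sqrt

def star_discrepancy(points):
    """Star discrepancy of a finite set of points in [0, 1)."""
    xs = sorted(points)
    n = len(xs)
    return max(max(i / n - x, x - (i - 1) / n) for i, x in enumerate(xs, start=1))

def kronecker_points(alpha, n):
    """First n terms of the sequence ({m * alpha}), m = 1, ..., n."""
    return [(m * alpha) % 1.0 for m in range(1, n + 1)]

golden = (sqrt(5) - 1) / 2
for n in (100, 1000, 10000):
    # N * D_N should stay small, growing roughly like log N.
    print(n, n * star_discrepancy(kronecker_points(golden, n)))
```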
Most of the probabilistic investigations of special classes of sequences have
concentrated on sequences of the type $(\{\lambda_n \alpha\})$, where $(\lambda_n)$ is either a given increasing
sequence of positive integers, or a given lacunary sequence of real numbers. One
is interested in the behavior of the sequence for almost all $\alpha$. For a survey of the
results prior to 1961, see Cigler and Helmberg [9, §9]. In the last decade, a major
achievement in this area was a law of the iterated logarithm for the sequences
$(\{2^n \alpha\})$, which was established by S. Gal and L. Gal [20]. The result was improved
decisively by Philipp, and the strengthened version may be found in [43, §4].
For another recent result, see the paper of Baker [2]. We also refer to the excellent
survey article of Gaposhkin [21] on lacunary sequences.
More general classes of sequences have been studied, for instance in Erdős and
Koksma [17] and Cassels [6]. A comprehensive bibliography of the work on
special sequences will be available in [30, §2.3].
References
1. T. W. Anderson and D. A. Darling, Asymptotic theory of certain "goodness of fit" criteria based
on stochastic processes, Ann. Math. Statist. 23(1952), 193-212.
2. R. C. Baker, Discrepancy modulo one and capacity of sets, Quart. J. Math. Oxford Ser. (2) 22
(1971), 597-603.
3. D. E. Barton and C. L. Mallows, Some aspects of the random sequence, Ann. Math. Statist. 36
(1965), 236-260. MR 31 #2745.
4. P. Billingsley, Convergence of probability measures, Wiley, New York, 1968. MR 38 #1718.
5. Z. W. Birnbaum, Numerical tabulation of the distribution of Kolmogorov's statistic for finite
sample size, J. Amer. Statist. Assoc. 47 (1952), 425-441. MR 14, 389.
6. J. W. S. Cassels, Some metrical theorems in Diophantine approximation. I, Proc. Cambridge
Philos. Soc. 46 (1950), 209-218. MR 12, 162.
7. _____, An extension of the law of the iterated logarithm, Proc. Cambridge Philos. Soc. 47
(1951), 55-64. MR 12, 723.
8. K. L. Chung, An estimate concerning the Kolmogoroff limit distribution, Trans. Amer. Math.
Soc. 67 (1949), 36-50. MR 11, 606.
9. J. Cigler and G. Helmberg, Neuere Entwicklungen der Theorie der Gleichverteilung, Jber. Deutsch.
Math.-Verein. 64 (1961), Abt. 1, 1-50. MR 23 #A2409.
10. D. A. Darling, The Kolmogorov-Smirnov, Cramer-von Mises tests, Ann. Math. Statist. 28
(1957), 823-838. MR 20 #390.
11. M. D. Donsker, Justification and extension of Doob's heuristic approach to the Kolmogoroff-
Smirnov theorems, Ann. Math. Statist. 23 (1952), 277-281. MR 13, 853.
12. J. L. Doob, Heuristic approach to the Kolmogoroff-Smirnov theorems, Ann. Math. Statist.
20 (1949), 393-403. MR 11, 43.
13. J. Durbin, The probability that the sample distribution function lies between two parallel
straight lines, Ann. Math. Statist. 39 (1968), 398-411. MR 37 #1001.
14. A. Dvoretzky, J. Kiefer and J. Wolfowitz, Asymptotic minimax character of the sample
distribution function and of the classical multinomial estimator, Ann. Math. Statist. 27 (1956), 642-669.
MR 18, 772.
15. V. A. Epanecnikov, The significance level and the power of the two-sided Kolmogorov criterion
in the case of small sample sizes, Teor. Verojatnost. i Primenen. 13 (1968), 725-730=Theor. Probability
Appl. 13 (1968), 686-690. MR 39 #3664.
16. A. Erdelyi, et al., Higher transcendental functions. Vol. II, McGraw-Hill, New York, 1955.
MR 15, 419.
17. P. Erdős and J. F. Koksma, On the uniform distribution modulo 1 of sequences (f(n, θ)), Nederl.
Akad. Wetensch. Proc. Ser. A 52 = Indag. Math. 11 (1949), 299-302. MR 11, 331.
18. W. Feller, On the Kolmogorov-Smirnov limit theorems for empirical distributions, Ann. Math.
Statist. 19 (1948), 177-189. MR 9, 599; 10, 855.
19. M. Frechet, Generalites sur les probabilites. Elements aleatoires, 2ieme ed., Traite du calcul
des probabilites et de ses applications, tome I, fasc. 3, premier livre, Gauthier-Villars, Paris, 1950.
MR 12, 423.
20. S. Gal and L. Gal, The discrepancy of the sequence {(2^n x)}, Nederl. Akad. Wetensch. Proc.
Ser. A 67 = Indag. Math. 26 (1964), 129-143. MR 29 #392.
21. V. F. Gaposkin, Lacunary series and independent functions, Uspehi Mat. Nauk 21 (1966),
no. 6 (132), 3-82 = Russian Math. Surveys 21 (1966), no. 6, 1-82. MR 34 #6374.
22. E. Hlawka, Discrepancy and uniform distribution of sequences, Compositio Math. 16 (1964),
83-91. MR 30 #4745.
23. J. H. B. Kemperman, Some exact formulae for the Kolmogorov-Smirnov distributions, Nederl.
Akad. Wetensch. Proc. Ser. A 60 = Indag. Math. 19 (1957), 535-540. MR 20 #2779.
24. _____, The passage problem for a stationary Markov chain, Statistical Research Monographs,
vol. I, Univ. of Chicago Press, Chicago, Ill., 1961. MR 22 #9992.
25. _____, Probability methods in the theory of distributions modulo one, Compositio Math. 16
(1964), 106-137. MR 30 #3494.
26. J. Kiefer, On large deviations of the empiric D.F. of vector chance variables and a law of the
iterated logarithm, Pacific J. Math. 11 (1961), 649-660. MR 24 #A1732.
27. J. Kiefer and J. Wolfowitz, On the deviations of the empiric distribution function of vector
chance variables, Trans. Amer. Math. Soc. 87 (1958), 173-186. MR 20 #5519.
28. J. F. Koksma, Diophantische Approximationen, Ergebnisse der Mathematik und ihrer Grenz-
gebiete IV, Heft 4, Springer, Berlin, 1936.
29. A. Kolmogorov, Sulla determinazione empirica di una legge di distribuzione, Giorn. 1st. Ital.
Attuari 4 (1933), 83-91.
30. L. Kuipers and H. Niederreiter, Uniform distribution of sequences, Interscience Tracts, Wiley,
New York (in print).
31. F. J. Massey, A note on the estimation of a distribution function by confidence limits, Ann. Math.
Statist. 21 (1950), 116-119. MR 11, 446.
32. H. G. Meijer and H. Niederreiter, On a distribution problem in finite sets, Compositio Math.
25 (1972), 153-160.
33. Y. Meyer, Nombres de Pisot, nombres de Salem, et analyse harmonique, Lecture Notes in Math.,
vol. 117, Springer-Verlag, Berlin and New York, 1970.
34. L. Mirsky, Systems of representatives with repetition, Proc. Cambridge Philos. Soc. 63 (1967),
1135-1140. MR 36 #58.
35. _____, Transversal theory, Academic Press, New York, 1971.
36. H. Niederreiter, Methods for estimating discrepancy, Proc. Sympos. on Applications of
Number Theory to Numerical Analysis (Montreal, 1971), Edited by S. K. Zaremba, Academic Press,
New York, 1972, pp. 203-236.
37. _____, A distribution problem in finite sets, Proc. Sympos. on Applications of Number
Theory to Numerical Analysis (Montreal, 1971), Edited by S. K. Zaremba, Academic Press, New
York, 1972, pp. 237-248.
38. _____, Discrepancy and convex programming, Ann. Mat. Pura Appl. 93 (1972), 89-97.
39. _____, Zur quantitativen Theorie der Gleichverteilung, Monatsh. Math. (to appear).
40. _____, The distribution of Farey points, Math. Ann. (to appear).
41. D. B. Owen, Handbook of statistical tables, Addison-Wesley, Reading, Mass., 1962.
MR 28 #4608.
42. W. Philipp, Das Gesetz vom iterierten Logarithmus mit Anwendungen auf die Zahlentheorie,
Math. Ann. 180 (1969), 75-94. MR 39 # 1423.
43. _____, Mixing sequences of random variables and probabilistic number theory, Mem. Amer.
Math. Soc. No. 114(1971).
44. A. Renyi, Wahrscheinlichkeitsrechnung. Mit einem Anhang uber Informationstheorie, Hoch-
schulbucher fiir Mathematik, Band 54, VEB Deutscher Verlag der Wissenschaften, Berlin, 1962.
MR 26 #5597.
45. M. Rosenblatt, Limit theorems associated with variants of the von Mises statistic, Ann. Math.
Statist. 23(1952), 617-623. MR 14, 665.
46. P. Schmid, On the Kolmogorov and Smirnov limit theorems for discontinuous distribution
functions, Ann. Math. Statist. 29(1958), 1011-1027. MR 21 #392.
47. P. B. Simpson, Note on the estimation of a bivariate distribution function, Ann. Math. Statist.
22 (1951), 476-478. MR 13, 142.
48. N. V. Smirnov, On the distribution of the ω² criterion of von Mises, Math. Sb. 2 (1937), 937-993.
(Russian)
49. R. Tijdeman, On a distribution problem in finite and countable sets, J. Combinatorial Theory
(to appear).
50. B. L. van der Waerden, Mathematische Statistik, 2nd ed., Springer-Verlag, Berlin, 1957;
English transl., Die Grundlehren der math. Wissenschaften, Band 156, Springer-Verlag, New York,
1969. MR 32 #8421; MR 40 #5051.
51. A. Wald and J. Wolfowitz, Confidence limits for continuous distribution functions, Ann. Math.
Statist. 10(1939), 105-118.
52. H. Weyl, Uber die Gleichverteilung von Zahlen mod. Eins, Math. Ann. 77 (1916), 313-352.
53. S. K. Zaremba, Sur la discrepance des suites aleatoires, Z. Wahrscheinlichkeitstheorie und
Verw. Gebiete 20 (1971), 236-248.
University of Illinois at Urbana-Champaign
Southern Illinois University
BOUNDS FOR SEQUENCES OF CONSECUTIVE
POWER RESIDUES. I
KARL K. NORTON1
Let $n$ and $k$ be positive integers with $k \ge 2$, let $C(n)$ denote the multiplicative
group of residue classes (mod $n$) which are relatively prime to $n$, and let $C_k(n)$
denote the subgroup of $k$th powers. Write $v_k(n) = [C(n) : C_k(n)]$, and when $v_k(n) > 1$,
let $g_1(n, k)$ denote the smallest positive integer which is in $C(n)$ but not in $C_k(n)$
(i.e., $g_1(n, k)$ is the least positive $k$th power nonresidue mod $n$ which is relatively
prime to $n$). This paper is an expository account of some recent research on the
problem of finding bounds for $g_1(n, k)$ and, more generally, for the maximum
number of consecutive members of $C(n)$ lying in any given coset of $C_k(n)$. Our
principal new results are stated in Theorems 1, 4, 6, 7, and 8. Detailed proofs of
these theorems will appear elsewhere.
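For small moduli these definitions can be evaluated by brute force, which is a convenient way to get a feel for $v_k(n)$ and $g_1(n, k)$. The sketch below is only an illustration; the function names are not from the paper, and no attempt is made at efficiency.

```python
from math import gcd

def kth_power_residues(n, k):
    """The subgroup C_k(n) of kth powers among the units mod n."""
    return {pow(a, k, n) for a in range(1, n + 1) if gcd(a, n) == 1}

def index_of_kth_powers(n, k):
    """v_k(n) = [C(n) : C_k(n)]."""
    units = sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)
    return units // len(kth_power_residues(n, k))

def least_nonresidue(n, k):
    """g_1(n, k), or None when v_k(n) = 1 and no nonresidue exists."""
    powers = kth_power_residues(n, k)
    for a in range(1, n):
        if gcd(a, n) == 1 and a not in powers:
            return a
    return None

print(index_of_kth_powers(13, 3), least_nonresidue(13, 3))
```

For example, the cubes mod 13 are 1, 5, 8, 12, so $v_3(13) = 3$ and $g_1(13, 3) = 2$.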
We shall always use the symbol $p$ to represent a prime. The best known upper
bound for $g_1(p, 2)$ is due to Burgess [5]:
(1) $g_1(p, 2) = O_\varepsilon(p^{1/(4e^{1/2}) + \varepsilon})$ for $p > 2$, $\varepsilon > 0$,
where the notation indicates that the implied constant depends at most on $\varepsilon$
($O$ without subscripts will imply an absolute constant). There is a standard
method for generalizing this result to the $k$th power case. The method requires
the introduction of Dickman's function, a positive continuous nonincreasing
function defined recursively by
(2) $\rho(\alpha) = 1$ $(0 \le \alpha \le 1)$,
AMS 1970 subject classifications. Primary 10-02, 10H35; Secondary 10G05, 10H30.
Key words and phrases. Power residues, character sums.
1 This research was supported in part by the grant AF-AFOSR-69-1712.
© 1973, American Mathematical Society
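(3) $\rho(\alpha) = \rho(N) - \int_N^{\alpha} v^{-1} \rho(v - 1)\, dv$ $(N < \alpha \le N + 1;\ N = 1, 2, \ldots)$.
The generalization of (1) was given by Wang Yuan [21]:
(4) $g_1(p, k) = O_{k,\varepsilon}(p^{1/(4\alpha_k) + \varepsilon})$ for $p \equiv 1 \pmod{k}$, $\varepsilon > 0$,
where $\alpha_k$ is the unique root of the equation $\rho(\alpha) = k^{-1}$ (note that $\alpha_2 = e^{1/2}$). Buchstab
[4] showed that
(5) $\alpha_k > (\log k)(\log\log k + 2)^{-1} > 6$ for $k > e^{33}$.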
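The recursion (2), (3) is easy to evaluate numerically, which gives approximate values of the roots $\alpha_k$ and hence of the exponents $1/(4\alpha_k)$. The sketch below tabulates Dickman's function on a uniform grid with the trapezoidal rule and locates $\alpha_k$ by linear interpolation. The grid size, the cutoff `upper`, and the function names are arbitrary choices made for illustration, not part of the paper.

```python
from math import exp

def dickman_table(upper, steps_per_unit=1000):
    """Tabulate rho on [0, upper] (upper an integer) by the trapezoidal rule applied to (3)."""
    m = steps_per_unit
    step = 1.0 / m
    rho = [1.0] * (m + 1)                          # rho = 1 on [0, 1], by (2)
    for i in range(m + 1, upper * m + 1):
        left = rho[i - 1 - m] / ((i - 1) * step)   # rho(v - 1)/v at v = (i - 1) * step
        right = rho[i - m] / (i * step)            # rho(v - 1)/v at v = i * step
        rho.append(rho[i - 1] - 0.5 * step * (left + right))
    return rho

def alpha_k(k, upper=10, steps_per_unit=1000):
    """Approximate root of rho(alpha) = 1/k for k >= 2."""
    if k < 2:
        raise ValueError("k must be at least 2")
    rho = dickman_table(upper, steps_per_unit)
    target = 1.0 / k
    for i in range(1, len(rho)):
        if rho[i] <= target:
            # Linear interpolation between the two neighbouring grid points.
            t = (rho[i - 1] - target) / (rho[i - 1] - rho[i])
            return (i - 1 + t) / steps_per_unit
    raise ValueError("root lies beyond `upper`; increase it")

print(alpha_k(2), exp(0.5))     # the two values should nearly agree
print(1 / (4 * alpha_k(3)))     # approximate exponent in (4) for k = 3
```

On $[1, 2]$ the recursion gives $\rho(\alpha) = 1 - \log\alpha$ in closed form, which is where the value $\alpha_2 = e^{1/2}$ comes from.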
Assuming the extended Riemann hypothesis, Ankeny [1] obtained the very
strong result
(6) $g_1(p, 2) = O(\log^2 p)$
for odd $p$, and he remarked (a little vaguely) that a similar result holds for $g_1(p, k)$
when $k > 2$ and $p \equiv 1 \pmod{k}$. Montgomery [17, Chapter 13] has further
generalized (6); we shall discuss his result after stating our Theorem 6.
In the other direction, we have the inequality
(7) $g_1(p, k) > d_k \log p$ for infinitely many $p \equiv 1 \pmod{k}$,
where $d_k$ is positive. When $k = 2$, this is an easy consequence of Linnik's deep
theorem on the smallest prime in an arithmetic progression, as was remarked by
S. Chowla (unpublished) and various other authors. (7) was established for all
prime $k$ by Elliott [11], who pointed out to the author that the result actually
holds for any $k \ge 2$. Elliott [12, pp. 840-841] also showed that for each $\varepsilon > 0$,
there is a positive constant $c(\varepsilon)$ such that $g_1(p, 2) > c(\varepsilon) \log p$ for at least $x^{1-\varepsilon}$
primes $p \le x$. In the same paper, he proved that there are positive absolute
constants $d$, $a$ such that $g_1(p, 2) \le d \log p$ for all but
$O(x \exp(-a (\log x / \log\log x)))$
primes not exceeding $x$, while for each $\delta > 0$, an inequality of the form
$g_1(p, 2) \le (\log p)^{1+\delta}$ holds for all but $O_\delta(x^{1-\mu(\delta)})$ primes not exceeding $x$, where $\mu(\delta) > 0$.
These results led him to conjecture that $g_1(p, 2) = O_\varepsilon((\log p)^{1+\varepsilon})$ for every $\varepsilon > 0$.
Assuming the extended Riemann hypothesis, Montgomery [17, pp. 122, 128]
showed that there is a constant $c > 0$ such that $g_1(p, 2) > c (\log p)(\log\log p)$ for
infinitely many $p$.
The above results fall well short of conveying the full truth about the size of
$g_1(p, k)$. In fact,
(8) $\lim_{x \to +\infty} \{\pi(x; k, 1)\}^{-1} \sum_{p \le x,\ p \equiv 1 \pmod{k}} g_1(p, k) = A_k$
for $k = 2, 3, \ldots$, where $A_k$ depends only on $k$ and $\pi(x; k, 1)$ is the number of $p \le x$
with $p \equiv 1 \pmod{k}$. This very striking result was proved by Erdős [14] in the case
$k = 2$ and was extended by Elliott [10], [13] to the general case.
It is interesting to ask whether these and other results on the distribution of
power residues can be generalized to the case in which the prime $p$ is replaced by
an arbitrary modulus $n$. In [18], [19], and [20], we showed that certain
generalizations could be made, but these were not completely satisfactory, since the
estimates obtained were weaker unless a certain assumption was made about the
prime factorizations of $n$ and $k$. This assumption, which we refer to as Condition A,
is a little troublesome to state precisely, but it certainly holds if $n$ is cubefree, or if
$k$ is odd and squarefree. Our generalization of (1) and (4) in [20, p. 70] reads as
follows: If $w$ is an integer greater than 1, and if $v_k(n) \ge w$, then for each $\varepsilon > 0$,
(9) $g_1(n, k) = O_{w,\varepsilon}(n^{3/(8\alpha_w) + \varepsilon})$,
while if Condition A holds, we have the stronger estimate
(10) $g_1(n, k) = O_{w,\varepsilon}(n^{1/(4\alpha_w) + \varepsilon})$.
The proof of these results required a rather delicate elementary estimate concerning
the distribution of numbers with small prime factors, as well as some previous work
on the distribution of power residues, which in turn depended on Burgess's
estimates for character sums ([5], [6], [7], [8]). Our first new result is more
satisfactory than (9) and (10).
Theorem 1. If $w$ is an integer and $v_k(n) \ge w \ge 2$, then for each $\varepsilon > 0$,
(11) $g_1(n, k) = O_{w,\varepsilon}(n^{1/(4\alpha_w) + \varepsilon})$.
In particular,
(12) $g_1(n, k) = O_\varepsilon(n^{1/(4e^{1/2}) + \varepsilon})$
whenever $g_1(n, k)$ exists.
The proof of Theorem 1 depends on (9) and (10) but is otherwise rather simple.
We now consider the more general problem of finding bounds for the length
of any sequence of consecutive power residues. Here we have the following elegant
result for the case of prime modulus.
Theorem 2 (Burgess [9]). Let $\chi$ be a nonprincipal character mod $p$, and
suppose $N$, $H$ are integers with $\chi(N+1) = \chi(N+2) = \cdots = \chi(N+H)$. Then
$H = O(p^{1/4} \log p)$.
If $(k, p - 1) > 1$, then there exists a nonprincipal $\chi$ mod $p$ such that $\chi^k$ is principal,
and Theorem 2 gives an upper bound for the maximum number of consecutive
integers lying in any fixed coset of $C_k(p)$. This upper bound appears to be the best
known and difficult to improve by Burgess's method. Since the implied constant
in Theorem 2 is absolute, it seems worthwhile to calculate an admissible value for it.
One could, of course, simply go through Burgess's proof and make all the
inequalities explicit, but it does not appear possible to get a reasonably small value of the
constant in this way. By a laborious refinement of his method, we have been able
to get the following result:
Theorem 3. In Theorem 2, $H < 4.1\, p^{1/4} \log p$ for all $p$. If $p > e^{15} \approx 3.27 \times 10^6$,
then $H < 2.5\, p^{1/4} \log p$.
We note that Alfred Brauer [2], [3, pp. 23-26] proved (essentially) that
$H < (2p)^{1/2} + 2$ for all $p$, which still seems to be the best result known for $p < e^{15}$.
The method leading to Theorem 3 also gives various specific estimates for $g_1(p, k)$.
Our best result in this direction is that
(13) $g_1(p, k) \le 1.1\, p^{1/4} (\log p + 4)$ if $(k, p - 1) > 1$.
This improves an inequality given in [20, p. 87]. For other specific estimates of
$g_1(p, k)$, see [20, pp. 75-96] and [3, pp. 26-31] (both papers contain references
to other work).
In obtaining Theorem 2, Burgess used his inequality
(14) $S_w(p, h, \chi) = \sum_{m=1}^{p} \left| \sum_{l=0}^{h-1} \chi(m + l) \right|^{2w} < (4w)^{w+1} p h^w + 2w\, p^{1/2} h^{2w},$
where $h$, $w$ are any positive integers and $\chi$ is any nonprincipal character mod $p$
(see [6, Lemma 2]). In order to get Theorem 3, we used (among other things)
the following rather trifling refinement of (14):
(15) $S_w(p, h, \chi) < 2^{1/2} (2w/e)^w p h^w c_n(w, h) + (2w - 2)\, p^{1/2} h^{2w},$
where $h$, $w$ are any positive integers, $\chi$ is a nonprincipal character of order $n$
(in the character group mod $p$), and
(16) $c_n(w, h) = 1$ if $n$ is even or $w \le 5$,
(17) $c_n(w, h) = \{1 + (w/2h)^{1/2}\}^{2w-2}$ if $n$ is odd and $w > 5$.
In order to generalize Theorem 2 to the case of an arbitrary modulus, we need
some appropriate definitions. By an integer interval, we mean any set of the form
$I = \{N + 1, N + 2, \ldots, N + H\}$,
where $N$, $H$ are integers with $H \ge 0$. The length of $I$ is $|I| = H$. If $v_k(n) > 1$, let
$M_k(n) = \max\{|I| : \exists g\, \forall x\, [x \in I,\ (x, n) = 1 \Rightarrow x \in g C_k(n)]\}$.
In words, $M_k(n)$ is the length of the largest integer interval $I$ such that all members
of $I$ which are relatively prime to $n$ lie in the same coset of $C_k(n)$. For convenience,
we define $M_k(n) = n$ if $v_k(n) = 1$. Note that
(18) $g_1(n, k) \le M_k(n) < n$ if $v_k(n) > 1$.
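The definition of $M_k(n)$ can be made concrete with a brute-force computation for small $n$. The sketch below labels every unit mod $n$ by the smallest element of its coset of $C_k(n)$, then extends an interval from each starting residue until two coprime members fall in different cosets. Members not coprime to $n$ are skipped, as in the definition. The function name is illustrative and the running time is only suitable for small moduli.

```python
from math import gcd

def max_coset_run(n, k):
    """M_k(n) by brute force; returns n when v_k(n) = 1, as in the text."""
    units = [a for a in range(1, n + 1) if gcd(a, n) == 1]
    powers = {pow(a, k, n) for a in units}
    if len(powers) == len(units):              # v_k(n) = 1
        return n
    # Label each unit by the smallest element of its coset g * C_k(n).
    coset = {a: min(a * r % n for r in powers) for a in units}
    best = 0
    for start in range(n):                     # the pattern of cosets is periodic mod n
        label, length = None, 0
        while length < n:
            x = (start + length) % n
            if x in coset:
                if label is None:
                    label = coset[x]
                elif coset[x] != label:
                    break
            length += 1
        best = max(best, length)
    return best

print(max_coset_run(13, 2))
```

For $n = 7$, $k = 2$ the interval $\{0, 1, 2\}$ has length 3, and this equals $g_1(7, 2) = 3$, so the first inequality in (18) is attained there.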
In [19, Theorem 3.15], we showed that, for each $\varepsilon > 0$,
(19) $M_k(n) = O_\varepsilon(n^{3/8 + \varepsilon})$ if $v_k(n) > 1$,
while
(20) $M_k(n) = O_\varepsilon(n^{1/4 + \varepsilon})$ if $v_k(n) > 1$ and Condition A holds.
(Recall that Condition A was discussed just before equation (9).) We can now state
an analogue of Theorem 1 which improves (19) and (20).
Theorem 4. If $v_k(n) > 1$, then $M_k(n) = O_\varepsilon(n^{1/4 + \varepsilon})$ for each $\varepsilon > 0$.
Note that the implied constant here does not depend on $k$.
The estimate of Theorem 4 can be improved dramatically for almost all values
of $n$. To get the improvements, we need
Theorem 5. If there exists a prime $p$ dividing $n$ such that $(k, p - 1) > 1$, then
$M_k(n) < (n/\varphi(n))\, 2^{\omega(n)} p^{1/2} \log p,$
where $\varphi$ is Euler's function and $\omega(n)$ is the number of distinct prime factors of $n$.
Using Theorem 5, the sieve of Eratosthenes, and the well-known theorem of
Hardy and Ramanujan on the normal order of $\omega(n)$, we can get the following result:
Theorem 6. Let $x$, $\varepsilon$ be real with $x \ge 3$, $\varepsilon > 0$. For each integer $k \ge 2$, define
(21) $c(k) = 1 - \prod_{p \mid k} \left(1 - \frac{1}{p - 1}\right).$
Then $M_k(n) < (\log n)^{\log 2 + \varepsilon}$ for all but $O_{k,\varepsilon}(x (\log\log x)^{-c(k)})$ values of $n \le x$.
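The constant $c(k)$ is simple to tabulate. The sketch below assumes that (21) reads $c(k) = 1 - \prod_{p \mid k} (1 - 1/(p-1))$, the product being over the distinct primes dividing $k$; the helper names are illustrative. Note that the factor for $p = 2$ vanishes, so $c(k) = 1$ whenever $k$ is even.

```python
from fractions import Fraction

def prime_divisors(k):
    """Distinct prime divisors of k, by trial division."""
    primes, p = [], 2
    while p * p <= k:
        if k % p == 0:
            primes.append(p)
            while k % p == 0:
                k //= p
        p += 1
    if k > 1:
        primes.append(k)
    return primes

def c(k):
    """c(k) = 1 - prod_{p | k} (1 - 1/(p - 1)), under the reading of (21) assumed above."""
    product = Fraction(1)
    for p in prime_divisors(k):
        product *= 1 - Fraction(1, p - 1)
    return 1 - product

for k in (2, 3, 5, 15):
    print(k, c(k))
```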
A similar estimate for $g_1(n, k)$ follows from (18). It is interesting to compare
this with Montgomery's recent generalization of Ankeny's result (6). In [17,
Chapter 13], Montgomery defined what he called least character nonresidues
and gave various upper and lower bounds for them. His Theorem 13.1 yields (as
a special case) the following extension of (6): on the extended Riemann hypothesis,
(22) $g_1(n, k) = O(\log^2 n)$ whenever $v_k(n) > 1$.
Our Theorem 6 has the advantage of giving a better bound in a more general
problem, and without any unproved hypotheses. However, the set of exceptional
$n$ in Theorem 6 may be quite large. If we are willing to accept weaker bounds for
$M_k(n)$, we can reduce appreciably the size of the exceptional set.
Theorem 7. Let $x \ge 3$, $\varepsilon > 0$. For each $k \ge 2$, $M_k(n) < \exp((\log n)^\varepsilon)$ for all but
$O_{k,\varepsilon}(x (\log x)^{-\varepsilon c(k)})$ values of $n \le x$, where $c(k)$ is defined by (21). Furthermore,
$M_k(n) < n^\varepsilon$ for all but $O_{k,\varepsilon}(x (\log x)^{-c(k)})$ values of $n \le x$.
The proof of Theorem 7 is not as simple as that of Theorem 6. It depends on
Selberg's upper bound sieve method and a well-known Tauberian theorem of
Hardy and Littlewood [15, Theorem D]. The second part of Theorem 7 should
be compared with a result of Linnik [16], which states that for each $\varepsilon > 0$, $g_1(p, 2) < p^\varepsilon$
for all but $O_\varepsilon(\log\log x)$ primes $p \le x$ (Linnik actually stated his theorem in a
slightly different way). As we remarked after equation (7), Elliott obtained much
better upper bounds for $g_1(p, 2)$, but with much larger exceptional sets.
In some cases, we can be a little more specific about the size of the exceptional
set. For example,
Theorem 8. Ifx^3,0<e^j, andk is even, then Mk{n)<ne for all but at most
2x/e logx + 0E(x log logx/log2x) values ofn^x.
Finally, we mention that a simple application of the prime number theorem
shows that for each $k \ge 2$, there are infinitely many $n$ with
(23) $M_k(n) \ge g_1(n, k) \ge (\log n)\{1 + O(\exp(-b (\log\log n)^{1/2}))\},$
where $b$ is a positive absolute constant. The idea behind (23) is almost trivial, but
we have not been able to obtain a better lower bound. However, Theorem 6 and
(22) suggest that (23) cannot be improved very much.
References
1. N. C. Ankeny, The least quadratic non-residue, Ann. of Math. (2) 55 (1952), 65-72. MR 13, 538.
2. A. Brauer, Über die Verteilung der Potenzreste, Math. Z. 35 (1932), 39-50.
3. _____, Combinatorial methods in the distribution of kth power residues (with discussion), Proc.
Conf. Combinatorial Math, and Appl. (Univ. North Carolina, Chapel Hill, N.C., 1967), Univ. North
Carolina Press, Chapel Hill, N.C., 1969, pp. 14-37. MR 40 #1330.
4. A. A. Buhstab, On those numbers in an arithmetic progression all prime factors of which are
small in order of magnitude, Dokl. Akad. Nauk SSSR 67 (1949), 5-8. (Russian) MR 11, 84; 871.
5. D. A. Burgess, The distribution of quadratic residues and non-residues, Mathematika 4 (1957),
106-112. MR 20 #28.
6. , On character sums and primitive roots, Proc. London Math. Soc. (3) 12(1962), 179-192.
MR 24 #A2569.
7. _____, On character sums and L-series, Proc. London Math. Soc. (3) 12 (1962), 193-206. MR 24
#A2570.
8. , On character sums and L-series. II, Proc. London Math. Soc. (3) 13 (1963), 524-536.
MR 26 #6133.
9. , A note on the distribution of residues and non-residues, J. London Math. Soc. 38 (1963),
253-256. MR 26 #6135.
10. P. D. T. A. Elliott, A problem of Erdős concerning power residue sums, Acta Arith. 13 (1967/68),
131-149. MR 36 #3741.
11. , Some notes on k-th power residues, Acta Arith. 14 (1967/68), 153-162. MR 37 #4000.
12. , The distribution of primitive roots, Canad. J. Math. 21 (1969), 822-841. MR 40 # 104.
13. , The distribution of power residues and certain related results, Acta Arith. 17 (1970),
141-159. MR 43 #4773.
14. P. Erdos, Remarks on number theory. I, Mat. Lapok 12 (1961), 10-17. (Hungarian) MR 26
#2410.
15. G. H. Hardy and J. E. Littlewood, Some theorems concerning Dirichlet's series, Messenger
Math. 43(1913/14), 134-147.
16. Ju. V. Linnik, A remark on the least quadratic non-residue, C. R. (Dokl.) Acad. Sci. URSS 36
(1942), 119-120. MR 4, 189.
17. H. L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol 227,
Springer-Verlag, Berlin and New York, 1971.
18. K. K. Norton, Upper bounds for kth power coset representatives modulo n, Acta Arith. 15 (1968/
69), 161-179. MR 39 #1419.
19. , On the distribution of kth power residues and non-residues modulo n, J. Number Theory
1 (1969), 398-418. MR 40 #4223.
20. , Numbers with small prime factors, and the least kth power non-residue, Mem. Amer.
Math. Soc. No. 106 (1971), 106 pp.
21. Wang Yuan, Estimation and application of character sums, Shuxue Jinzhan 7 (1964), 78-83.
(Chinese) MR 37 #5162.
Institute for Advanced Study
University of Colorado
RATIONAL POINTS ON CERTAIN
ELLIPTIC MODULAR CURVES
A. P. OGG1
Let $\Gamma = SL(2, Z)/\pm 1$ be the full modular group, let $Y$ be the quotient of the
upper half plane by $\Gamma$, and let $X$ be the compactification of $Y$ obtained by adjoining
the cusp. For any integer $N \ge 3$, we have subgroups $\Gamma(N)$, $\Gamma_1(N)$, $\Gamma_0(N)$ of $\Gamma$
defined by matrices $\left(\begin{smallmatrix} a & b \\ c & d \end{smallmatrix}\right)$ congruent modulo $N$ to
$\left(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right)$, $\left(\begin{smallmatrix} 1 & * \\ 0 & 1 \end{smallmatrix}\right)$, $\left(\begin{smallmatrix} * & * \\ 0 & * \end{smallmatrix}\right)$, respectively;
we let $Y(N)$, $Y_1(N)$, $Y_0(N)$ denote the corresponding quotients of the upper half
plane, and $X(N)$, $X_1(N)$, $X_0(N)$ the compactifications by adding cusps. The $X$'s
are compact Riemann surfaces of genus $p(N)$, $p_1(N)$, $p_0(N)$, easily computed from
the Riemann-Hurwitz formula. As moduli, $Y(N)$ parametrizes (isomorphism
classes of) elliptic curves $A$ together with a definite isomorphism of $A_N$ onto
$C_N \times C_N$, where $A_N$ is the group of points of order $N$ on $A$, and $C_N$ is the cyclic
group of order $N$; $Y_1(N)$ parametrizes elliptic curves together with a point of
order $N$; $Y_0(N)$ parametrizes elliptic curves together with a cyclic subgroup of
order $N$. Furthermore, the $X$'s and $Y$'s are not merely Riemann surfaces, i.e.
algebraic curves over $C$, but curves over $Q$, and so for example an elliptic curve $A$,
defined over $Q$, together with a point $P$ of order $N$ on $A$, rational over $Q$,
corresponds to a point of $Y_1(N)_Q$, the set of $Q$-rational points on the curve $Y_1(N)$.
Recently Demjanenko [5] has proved the conjecture that $Y_1(N)_Q$ is empty for $N$
sufficiently large (even over any number field, not just $Q$), i.e. no elliptic curve over
$Q$ can have a rational point of order $N$. I once made the guess [11], on admittedly
thin evidence, that $Y_1(N)_Q$ is empty if $p_1(N) > 0$, i.e. if $N = 11$ or $N \ge 13$.
The rational points of $Y_1(N)$ or $Y_0(N)$, or, as we shall usually think of them,
the rational points of $X_1(N)$ or $X_0(N)$ which are not cusps, are then inherently
AMS 1970 subject classifications. Primary 10D15, 14G25.
1 Partially supported by NSF grant GP-20532. Much of this work, although written up only in
early 1972, was done while the author held an appointment in the Miller Institute for Basic
Research in Science during the academic year 1970-1971.
© 1973, American Mathematical Society
interesting, and on the present occasion I want to describe some methods for
finding them, subject to the following considerations. One can derive plane
equations for these curves, as in Fricke [6], and sometimes solve directly for the
rational points, the classic example being Billing-Mahler [1]. But for larger values
of $N$ these equations will be highly singular, as well as laborious to derive, and this
makes counting solutions difficult. We will therefore use only the modular
properties of our curves, and consider using a specific equation as against the rules of
the game. On the other hand, while it is difficult to say in modular terms what it
means for a point of $Y_0(N)$ or $Y_1(N)$ to be rational, for the cusps of $X_0(N)$ or
$X_1(N)$, which are precisely the points we are not interested in, it is very easy to
decide which cusps are rational. In practice, the rational cusps generate rather
large finite groups of rational divisor classes of degree 0, i.e. subgroups of $J_0(N)_Q$
or $J_1(N)_Q$, where $J_0(N)$, $J_1(N)$ are the Jacobians of $X_0(N)$, $X_1(N)$, and from this
one can go on to determine $X_0(N)_Q$ or $X_1(N)_Q$ for some values of $N$.
It is a pleasure to acknowledge a number of helpful conversations with Bill
Casselman, Barry Mazur, Peter Swinnerton-Dyer, and John Tate.
1. Rationality of cusps. The cusps of $X(N)$ can be regarded as pairs $\pm\binom{x}{y}$, where
$x$, $y$ are in $Z/NZ$, and relatively prime, and $\binom{x}{y}$, $\binom{-x}{-y}$ are identified; $G(N) = \Gamma/\Gamma(N) =
SL(2, Z/NZ)/\pm 1$ operates naturally on the left, and so a cusp of $X_0(N)$ or $X_1(N)$
can be regarded as an orbit of $\Gamma_0(N)/\Gamma(N) = G_0(N)$ or $\Gamma_1(N)/\Gamma(N) = G_1(N)$. In the
canonical model of Shimura, as communicated to me by Casselman, $X(N)$ is
defined over $Q$; the cusps of $X(N)$ are rational in $Q(\zeta_N) = Q(e^{2\pi i/N})$, and the Galois
group $(Z/NZ)^\times$ operates as $\left(\begin{smallmatrix} c & 0 \\ 0 & 1 \end{smallmatrix}\right)$, i.e. if $P = \pm\binom{x}{y}$, and $c \in (Z/NZ)^\times$ (giving the
automorphism $\sigma: \zeta_N \to \zeta_N^c$ of $Q(\zeta_N)/Q$), then $\sigma(P) = \pm\binom{cx}{y}$.
For the group $\Gamma_1(N)$, $G_1(N) = \{\left(\begin{smallmatrix} 1 & b \\ 0 & 1 \end{smallmatrix}\right) : b \in Z/NZ\}$, and a cusp is an orbit
$\{\pm\binom{x + by}{y}\}$. Note that we can choose a representative $\binom{x}{y}$ with $x$ reduced modulo
$d = (y, N)$, and $(x, y, N) = 1$ means $(x, d) = 1$, so for each $d \mid N$ there are $\varphi(N/d)$ $y$'s
and $\varphi(d)$ $x$'s corresponding, i.e. $\frac12 \varphi(d) \varphi(N/d)$ cusps of $X_1(N)$ (say for $N \ge 5$). As
to rationality, fix $d \mid N$, $d < N/2$, and fix $y$ with $(y, N) = d$, $0 \le y < N/2$. The $\varphi(d)$
cusps $\binom{x}{y}$ are certainly conjugate, and in particular $\binom{x}{y}$ is rational only if $\varphi(d) = 1$,
i.e. $d = 1$ or 2. For $d = N$ (resp. $d = N/2$ in the case of even $N$) we can take $y = 0$
(resp. $y = N/2$); the $\frac12 \varphi(d)$ cusps corresponding are all conjugate, and so are rational
only if $\varphi(d) = 2$, i.e. $d = 3, 4, 6$. Except for $N = 1$-$4, 6, 8, 12$, then, the only rational
cusps are those with $d = 1$ or 2, and in any case the cusps with the same $\pm y$ are
conjugate. Finally we note that there are $N/d$ cusps of $X(N)$ over a cusp $\binom{x}{y}$ of
$X_1(N)$, so its ramification degree in the covering $X(N) \to X_1(N)$ is $e = d$. Thus:
Proposition 1. Suppose $p_1(N) > 0$, i.e. $N = 11$ or $N \ge 13$. Then for each $d \mid N$,
we have $\frac12 \varphi(d) \varphi(N/d)$ cusps $\binom{x}{y}$ of $X_1(N)$ with $d = (y, N)$, each having ramification
degree $e = d$ in $X(N) \to X_1(N)$. The cusps $\binom{x}{y}$ with a fixed value of $\pm y$ are conjugate,
and in particular are rational only if $\varphi(d) = 1$, i.e. $d = 1$ or 2, i.e. the only rational
cusps are the $\frac12 \varphi(N)$ cusps $\binom{0}{y}$ with $d = 1$, and for even $N$, the $\frac12 \varphi(N/2)$ cusps $\binom{1}{y}$ with
$d = 2$.
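For $X_0(N)$, let us consider the Galois covering $X_1(N) \to X_0(N)$, with group
$G = \Gamma_0(N)/\Gamma_1(N) = \{\left(\begin{smallmatrix} a & 0 \\ 0 & a^{-1} \end{smallmatrix}\right) : a \in (Z/NZ)^\times/\pm 1\}$. Given a cusp $\binom{x}{y}$ of $X_1(N)$, it is fixed
by $\left(\begin{smallmatrix} a & 0 \\ 0 & a^{-1} \end{smallmatrix}\right) \in G$ if and only if $ax \equiv \pm x \pmod{d}$ and $ay \equiv \pm y \pmod{N}$, i.e. $a \equiv \pm 1$
modulo $d$ and $N/d$, i.e. modulo $t = (N/d, d)$. This shows that the ramification degree
is $e = t$. It also shows that for a cusp of $X_0(N)$, we can assume $y = d = (y, N)$, and
reduce $x$ modulo $t$, so we have $\varphi(t)$ conjugate cusps $\binom{x}{d}$ corresponding to $d$, rational
only if $\varphi(t) = 1$, i.e. $t = 1$ or 2.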
Proposition 2. For each $d \mid N$, we have $\varphi(t)$ conjugate cusps $\binom{x}{d}$ of $X_0(N)$,
each with ramification degree $e = t$ in $X_1(N) \to X_0(N)$, and these are all the cusps of
$X_0(N)$. In particular, all cusps are rational if $N$ or $N/2$ is a square-free integer.
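The cusp counts in Propositions 1 and 2 can be tabulated directly. The sketch below groups the cusps by the divisor $d$, using $\frac12 \varphi(d)\varphi(N/d)$ cusps of $X_1(N)$ (for $N \ge 5$) and $\varphi(t)$ cusps of $X_0(N)$ with $t = (d, N/d)$. The function names are illustrative, and Euler's function is computed naively since only small $N$ are of interest here.

```python
from math import gcd

def phi(n):
    """Euler's function, computed naively."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def cusps_x1(n):
    """Cusps of X_1(N) for N >= 5, grouped by d = (y, N): {d: count}."""
    return {d: phi(d) * phi(n // d) // 2 for d in range(1, n + 1) if n % d == 0}

def cusps_x0(n):
    """Cusps of X_0(N), grouped by d: {d: phi(t)} with t = (d, N/d)."""
    return {d: phi(gcd(d, n // d)) for d in range(1, n + 1) if n % d == 0}

print(cusps_x1(13), sum(cusps_x1(13).values()))
print(cusps_x0(12), sum(cusps_x0(12).values()))
```

For $N = 13$ this gives 6 cusps with $d = 1$ and 6 with $d = 13$; by Proposition 1 only the six with $d = 1$ are rational.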
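2. Groups generated by cusps on $X_1(N)$. Consider, for $k \ge 2$, the "Eisenstein
series"
$$E_k(\alpha, \beta) = E_k(\tau; \alpha, \beta; N) = \sum_{m \equiv \alpha\,(N)}\ \sum_{n \equiv \beta\,(N)} (m\tau + n)^{-k}$$
$$= \delta(\alpha/N) \sum_{n \equiv \beta\,(N)} n^{-k} + \frac{(-2\pi i)^k}{N^k (k-1)!} \sum_{m \equiv \alpha\,(N);\ mn > 0} n^{k-1}\,\operatorname{sgn} n\; e(n(m\tau + \beta)/N),$$
where $e(x) = e^{2\pi i x}$, and $\delta(x) = 1$ or $0$ as $x \in \mathbb{Z}$ or $x \notin \mathbb{Z}$. For $k \ge 3$, $E_k(\alpha, \beta)$ is a modular
form of dimension $-k$ for $\Gamma(N)$, and for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$, $E_k(\alpha, \beta) \mid \gamma = E_k((\alpha, \beta)\gamma) =
E_k(\alpha a + \beta c, \alpha b + \beta d)$. (Here and in the following we have defined the operator $\mid$,
relative to a given integer $k$, by $f \mid \begin{pmatrix} a & b \\ c & d \end{pmatrix}(\tau) = (ad - bc)^{k/2}(c\tau + d)^{-k} f((a\tau + b)/(c\tau + d))$,
for any real matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ of positive determinant.) For $k = 2$, a difference $E_2(\alpha, \beta) -
E_2(\alpha', \beta')$ will be a form of dimension $-2$ (cf. Hecke [7, Number 24]). The order of
zero of $E_k(\alpha, \beta)$ at a cusp $\binom{x}{y}$ of $X(N)$ can be read off from the above Fourier
expansion and transformation rule, and we find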
Proposition 3. Let $\{x\} = \{x\}_N$ be defined by $0 \le \{x\} \le N/2$, $\{x\} \equiv \pm x \pmod N$.
Then $E_k(\alpha, \beta)$ has a zero of order $\ge \{\alpha x + \beta y\}$ at the cusp $\binom{x}{y}$ of $X(N)$ (with equality
usually holding, for example if $(\alpha, \beta, N) = 1$).
For $k = 3$, one checks that for $(\alpha, \beta, N) = 1$, $E_k(\alpha, \beta)$ has a total of $(\Gamma : \Gamma(N))/4$
zeroes at cusps and hence no further zeroes. (In general a form of dimension $-k$
for a group of index $\mu$ in $\Gamma$ has $k\mu/12$ zeroes in any fundamental domain.) This
gives various relations in the group of divisor classes of degree 0 generated by the
cusps. However, we prefer to use $k = 2$, since our divisors will then be only $\tfrac23$ as big.
We illustrate the method in detail for $X_1(13)$; this is the first interesting case, since
$X_1(N)$ has genus 0 for $N \le 10$ or $N = 12$, while $X_1(11)$, the first elliptic curve in
nature, is well-known [1]; $p_1(13) = 2$.
Let then $P_j = \binom{0}{j}$, $j = 1, 2, \dots, 6$, denote the 6 rational cusps of $X_1(13)$; here $d = 1$
in the notation of Proposition 1, so $X(13) \to X_1(13)$ is unramified over the $P_j$.
Let $\varphi_j = E_2(0, j; 13)$. Then $\varphi_{ij} = \varphi_i - \varphi_j$ is a form of dimension $-2$ for $\Gamma_1(13)$, and
as such has $(\Gamma : \Gamma_1(13))/6 = 14$ zeroes in a fundamental domain. Letting $(a_1, \dots, a_6)$
denote the divisor $a_1 P_1 + \cdots + a_6 P_6$, we have, by Proposition 3,
$(\varphi_1) \ge (1, 2, 3, 4, 5, 6)$,   $(\varphi_2) \ge (2, 4, 6, 5, 3, 1)$,
$(\varphi_3) \ge (3, 6, 4, 1, 2, 5)$,   $(\varphi_4) \ge (4, 5, 1, 3, 6, 2)$,
$(\varphi_5) \ge (5, 3, 2, 6, 1, 4)$,   $(\varphi_6) \ge (6, 1, 5, 2, 4, 3)$.
Then $(\varphi_{12}) \ge \inf((\varphi_1), (\varphi_2)) = (1, 2, 3, 4, 3, 1)$, and since this last divisor has degree 14,
equality holds. (In particular, $\varphi_{12}$ has no zeroes at finite points, only at cusps.)
Proceeding we find
$(\varphi_{12}) = (1, 2, 3, 4, 3, 1)$,   $(\varphi_{13}) = (1, 2, 3, 1, 2, 5)$,
$(\varphi_{14}) = (1, 2, 1, 3, 5, 2)$,   $(\varphi_{15}) = (1, 2, 2, 4, 1, 4)$,
$(\varphi_{16}) = (1, 1, 3, 2, 4, 3)$,   $(\varphi_{23}) = (2, 4, 4, 1, 2, 1)$,
and so
$(\varphi_{12}/\varphi_{13}) = (0, 0, 0, 3, 1, -4)$,
$(\varphi_{12}/\varphi_{14}) = (0, 0, 2, 1, -2, -1)$,
$(\varphi_{12}/\varphi_{15}) = (0, 0, 1, 0, 2, -3)$,
$(\varphi_{12}/\varphi_{16}) = (0, 1, 0, 2, -1, -2)$,
$(\varphi_{12}/\varphi_{23}) = (-1, -2, -1, 3, 1, 0)$
are divisors linearly equivalent to 0 on $X_1(13)$. Let us embed $X_1(13)$ in $J_1(13)$ by
sending $P$ to the class of $P - P_6$; let $u_i$ be the image of $P_i$, i.e. the class of $P_i - P_6$,
and let $U$ be the subgroup of $J_1(13)_{\mathbb{Q}}$ generated by $u_1, \dots, u_5$. The above relations
give, in order, that $u_5 = -3u_4$, then that $2u_3 + 7u_4 = 0$ and $u_3 - 6u_4 = 0$, whence
$19u_4 = 0$ and $u_3 = 6u_4$; the last two relations then give $u_2 = -5u_4$, and $u_1 = 4u_4$.
Thus $U = C_{19}$ is cyclic of order 19.
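The bookkeeping for $X_1(13)$ is mechanical enough to replay in code. The sketch below is an illustration, not the author's computation; all function names are made up for it. It recomputes the lower bounds of Proposition 3, the divisors $(\varphi_{ij})$ as componentwise infima, and the five relations. It then checks that the relation matrix on $u_1, \dots, u_5$ has determinant $\pm 19$, which bounds the order of $U$ by 19, and that $u_i = c_i u_4$ with $(c_1, \dots, c_5) = (4, -5, 6, 1, -3)$ satisfies every relation modulo 19.

```python
from fractions import Fraction

N = 13

def bracket(x, n=N):
    """{x}: the representative of +-x mod n lying in [0, n/2]."""
    r = x % n
    return min(r, n - r)

def eisenstein_bound(j):
    """Lower bound from Proposition 3 for (phi_j) at the cusps P_1, ..., P_6."""
    return tuple(bracket(j * y) for y in range(1, 7))

def div_of_difference(i, j):
    """Componentwise inf of the two bounds; the text argues this equals (phi_ij)."""
    return tuple(min(a, b) for a, b in zip(eisenstein_bound(i), eisenstein_bound(j)))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    n, d = len(m), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if m[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    return d

phi12 = div_of_difference(1, 2)
pairs = [(1, 3), (1, 4), (1, 5), (1, 6), (2, 3)]
relations = [sub(phi12, div_of_difference(i, j)) for i, j in pairs]

# u_6 = 0, so only the first five coordinates of each relation matter for U.
order_bound = abs(det([r[:5] for r in relations]))

# Coefficients c_i with u_i = c_i * u_4, as stated in the text.
coeffs = (4, -5, 6, 1, -3)
consistent = all(sum(c * x for c, x in zip(coeffs, r[:5])) % 19 == 0 for r in relations)

if __name__ == "__main__":
    print(phi12, sum(phi12))
    for r in relations:
        print(r)
    print(order_bound, consistent)
```

The determinant only shows that $U$ is a quotient of a group of order 19; that $U$ is not trivial comes from the geometry of the curve, as in the text.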
Note that $u_1 + u_5 = u_4 = u_2 + u_3$, i.e.
$$P_1 + P_5 \sim P_4 + P_6 \sim P_2 + P_3.$$
This corresponds to the covering
$$X_1(13) \to X_2(13) \to X_0(13),$$
where the middle curve has genus 0; since $5^2 \equiv -1 \pmod{13}$, we see that $\{P_1, P_5\}$,
$\{P_2, P_{10} = P_3\}$, and $\{P_4, P_{20} = P_6\}$ are cusps of $X_2(13)$ and so are linearly
equivalent (giving another proof of that fact). Recall that any curve of genus 2 admits
a unique double covering of a line; the fibers are the positive canonical divisors.
For our curve $X_1(13)$ it follows for example that there could not be a function
whose divisor of poles is $P_2 + P_5$, and so forth. From this we can check that $X_1(13)$
meets our group of order 19 in only the 6 rational cusps we started with, as follows.
If not, we would have $P \in X_1(13)$, with $u$ = class of $P - P_6 = \nu u_4$ for some $\nu \in \mathbb{Z}/19\mathbb{Z}$,
and $u \ne u_i$, i.e. $\nu \ne 0, 1, 4, -5, 6, -3$. If $u = -u_i$, then $P + P_i \sim 2P_6$, contrary to the
remarks above, so $\nu \ne -1, -4, 5, -6, 3$. Similarly $u = 2u_i$ is not possible, i.e. $\nu \ne 2,
8, 9, -7, -6$; hence $\nu = -2, 7, -8$, or $-9$, i.e. $u = u_1 - u_3$, $u_3 + u_4$, $u_2 + u_5$, or
$u_2 - u_1$, whence $P_1 + P_6$, $P_3 + P_4$, $P_2 + P_5$, or $P_2 + P_6$ is canonical, which is not
possible.
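The case analysis on $\nu$ can be checked by enumeration. This is a small sketch with illustrative names; it takes the coefficients $u_i = c_i u_4$ (with $u_6 = 0$) from the text, removes the excluded residues, and lists the pairs $\{i, j\}$ with $u_i + u_j = u_4$, i.e. the pairs of cusps linearly equivalent to $P_4 + P_6$.

```python
P = 19
# u_i = c[i] * u_4 in Z/19Z, with u_6 = 0 (coefficients as stated in the text)
c = {1: 4, 2: -5, 3: 6, 4: 1, 5: -3, 6: 0}

cusp_values = {v % P for v in c.values()}       # u = u_i
negatives = {(-v) % P for v in c.values()}      # u = -u_i
doubles = {(2 * v) % P for v in c.values()}     # u = 2 u_i
remaining = sorted(set(range(P)) - cusp_values - negatives - doubles)

# Pairs {i, j} with u_i + u_j = u_4 (= u_4 + u_6).
canonical_pairs = sorted(
    (i, j) for i in c for j in c if i < j and (c[i] + c[j] - c[4]) % P == 0
)

if __name__ == "__main__":
    print(remaining)         # residues mod 19 still to be ruled out
    print(canonical_pairs)
```

The four surviving residues are $7$, $10 \equiv -9$, $11 \equiv -8$ and $17 \equiv -2$, which are the values the text then rules out using the canonical divisors.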
Next we claim that our group of order 19 is the full torsion subgroup $J_1(13)_{\mathbb{Q},\mathrm{tors}}$
of the group of rational points on $J_1(13)$. For this we use the fact that we have good
reduction modulo $p$ for $p \nmid 13$ (or in general for $p \nmid N$), and that reduction modulo
$p$ is injective except for $p$-torsion on the torsion subgroup. We also use the fact that
the reduced curve has the same meaning as a moduli space in characteristic $p$,
i.e. parametrizes elliptic curves plus a point of order 13. Let $v(q)$ resp. $h(q)$ be the
number of points on the reduced curve resp. its Jacobian rational in the field of $q$
elements. Since the genus is 2 we have
$$h(q) = -q + (v(q^2) + v(q)^2)/2,$$
from the usual formulas connected with the zeta-function of the reduced curve.
For $q = 2$ or $4$ we still have the 6 rational "cusps" but no other points, since an
elliptic curve over a field of 2 or 4 elements has no room for a point of order 13.
(By the "Riemann hypothesis", an elliptic curve over the field of $q$ elements has at
most $(1 + q^{1/2})^2$ points in that field.) Hence $v(2) = v(4) = 6$, $h(2) = -2 + (6 + 6^2)/2
= 19$. Similarly, $v(3) = 6$. For $q = 9$, there is a unique elliptic curve $A$ over $\mathbb{F}_9$ with 13
points; this gives apparently 12 pairs $(A, P)$, $P$ of order 13, but there is a cyclic
group of order 6 operating, so we have only 2 corresponding points; $v(9) = 6 + 2 = 8$,
$h(3) = -3 + (6^2 + 8)/2 = 19$. Hence $J_1(13)_{\mathbb{Q},\mathrm{tors}} = C_{19}$.
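The two point counts can be reproduced directly from the genus 2 formula and the bound on the number of points of an elliptic curve. The sketch below uses illustrative function names and takes the values $v(q)$ from the text as inputs rather than computing them.

```python
import math

def jacobian_points_genus2(q, v_q, v_q2):
    """h(q) = -q + (v(q^2) + v(q)^2) / 2 for a curve of genus 2 over F_q."""
    total = v_q2 + v_q * v_q
    if total % 2:
        raise ValueError("v(q^2) + v(q)^2 must be even")
    return -q + total // 2

def hasse_upper_bound(q):
    """Largest integer not exceeding (1 + sqrt(q))^2 = q + 1 + 2*sqrt(q)."""
    return q + 1 + math.isqrt(4 * q)

if __name__ == "__main__":
    # No room for a point of order 13 over F_2, F_3, F_4; room over F_9.
    print([(q, hasse_upper_bound(q)) for q in (2, 3, 4, 9)])
    print(jacobian_points_genus2(2, 6, 6))   # v(2) = v(4) = 6
    print(jacobian_points_genus2(3, 6, 8))   # v(3) = 6, v(9) = 8
```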
There are two other cases with $p_1(N) = 2$, namely $N = 16, 18$, and one can
proceed as above. The analysis is somewhat more complicated because $N$ is
composite, but not essentially any more difficult. (The ground rules mentioned in
the introduction arose at this point, since in the first proof of the above facts for
$N = 13$, I in fact made use of an equation for the curve, but was unable to do the
same for $N = 18$. If you use an equation for $X_1(N)$, even if you get past the
difficulties due to singularities, which may be considerable, and have found some
rational points presumably corresponding to cusps, it will still require luck or
ingenuity to find functions on the curve which have zeroes and poles only at those
points. The above procedure using Eisenstein series, by contrast, is quite
mechanical.) The result is
Theorem. Let $p_1(N) = 2$, i.e. $N = 13, 16, 18$. Then the 6 rational cusps of
$X_1(N)$ generate $J_1(N)_{\mathbb{Q},\mathrm{tors}}$ (isomorphic to $C_{19}$, $C_2 \times C_{10}$, $C_{21}$ respectively), which
meets $X_1(N)$ in only the 6 rational cusps.
One then has a proof of the nonexistence of an elliptic curve over Q with a
rational point of order N, for these N, provided one can show that Jx (N)Q is finite.
For N= 13 this has been accomplished by Barry Mazur [9], using his new descent
techniques, and his proof meets our requirement of only using the modular
properties of the curve. For N= 16 or 18, it seems likely that Mazur's techniques will
again show that Jx (N)Q is finite, which would prove the nonexistence of an elliptic
curve with a rational point of order 18. ($N = 16$ is already a known result; cf.
Cassels [3, p. 264].)
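Finally, let us note that the finiteness of $J_1(13)_{\mathbb{Q}}$ is predicted by the conjecture
of Birch and Swinnerton-Dyer. Let $S = S(\Gamma_1(N), k)$ be the space of cusp forms of
dimension $-k$ for $\Gamma_1(N)$. Then $\Gamma_0(N)/\Gamma_1(N) = (\mathbb{Z}/N\mathbb{Z})^\times/\pm 1$ operates on $S$, so
we have
$$S = \bigoplus_{\varepsilon} S_\varepsilon,$$
where $\varepsilon$ ranges over the characters of $(\mathbb{Z}/N\mathbb{Z})^\times/\pm 1$ (i.e. characters $\varepsilon$ modulo $N$,
with $\varepsilon(-1) = 1$), and $S_\varepsilon$ is the set of $f$ with $f \mid \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \varepsilon(d) \cdot f$ for $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N)$. Note
that the operator $H_N = \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$ carries $S_\varepsilon$ onto $S_{\bar\varepsilon}$. The Hecke operators $T(n)$, for
$(n, N) = 1$, operate on each $S_\varepsilon$, and so $S_\varepsilon$ has a basis of elements $f$ of the form
$f(\tau) = \sum_{n=1}^{\infty} a_n q^n$, with $q = e^{2\pi i \tau}$, where $a_1 = 1$, and $f \mid T(n) = a_n \cdot f$ for $(n, N) = 1$.
Since $H_N T(n) = \overline{\varepsilon(n)}\, T(n) H_N$, and $\bar a_n = \overline{\varepsilon(n)}\, a_n$ by Petersson's theory (cf.
[10, V-3 and IV-27]), we can write $\lambda \cdot f \mid H_N = \sum b_n q^n$, where $\lambda$ is some constant,
and $b_n = \bar a_n$ for $(n, N) = 1$. This suggests that we consider Hecke's conjugation
operator $K$, defined by $f \mid K(\tau) = \overline{f(-\bar\tau)} = \sum \bar a_n q^n$; $K$ is a conjugate-linear map
of $S_\varepsilon$ onto $S_{\bar\varepsilon}$. Thus $f \mid K$ and $\lambda f \mid H_N$ have the same Fourier coefficients $\bar a_n$, for
$(n, N) = 1$, and hence coincide, in the case where $\varepsilon$ has conductor $N$, by a theorem
of Hecke [7, p. 836]. Note $|\lambda| = 1$.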
For $p_1(N) = 2$, i.e. $N = 13, 16, 18$, and $k = 2$, $S$ has dimension 2, and $S = S_\varepsilon \oplus S_{\bar\varepsilon}$
for a certain nontrivial character $\varepsilon$; let $f_\varepsilon = \sum a_n q^n$, $f_{\bar\varepsilon} = \lambda f_\varepsilon \mid H_N = \sum \bar a_n q^n$ generate
$S_\varepsilon$ and $S_{\bar\varepsilon}$ as above. (Of course the argument above using the operator $K$ is not
needed here, since the spaces $S_\varepsilon$ and $S_{\bar\varepsilon}$ have dimension 1.) One checks that $\varepsilon$ has
maximal order $\tfrac12\varphi(N) = 6, 4, 3$, using the Riemann-Hurwitz formula. (If $\Gamma_1(N)
\subset \ker(\varepsilon) \subset \Gamma_0(N)$, with proper inclusions, then $\ker(\varepsilon)$ corresponds to a curve $X_\varepsilon$
of genus 0, but with a nontrivial holomorphic differential $f_\varepsilon(\tau)\,d\tau$, which is
impossible.) By a congruence formula of Shimura, the one-dimensional part of the
zeta-function of $X_1(N)$ or $J_1(N)$ is
$$L(s) = L_\varepsilon(s) L_{\bar\varepsilon}(s),$$
where $L_\varepsilon(s) = \sum a_n n^{-s}$, $L_{\bar\varepsilon}(s) = \sum \bar a_n n^{-s}$. The conjecture of Birch and Swinnerton-Dyer
states that $J_1(N)_{\mathbb{Q}}$ is finite if and only if $0 \ne L(1) = L_\varepsilon(1) L_{\bar\varepsilon}(1) = |L_\varepsilon(1)|^2$; since
$$(2\pi N^{-1/2})^{-s}\,\Gamma(s)\,L_\varepsilon(s) = \int_0^\infty f(itN^{-1/2})\,t^s\,dt/t$$
(with $f = f_\varepsilon$), this is so if and only if $0 \ne \int_0^\infty f(itN^{-1/2})\,dt = J(f)$. Let $g = f - f \mid H_N$,
$h = f + f \mid H_N$. Then $J(h) = 0$, and $J(g) = 2\int_1^\infty g(itN^{-1/2})\,dt$, so $L(1) \ne 0$ if and only
if $\int_1^\infty g(itN^{-1/2})\,dt \ne 0$. Furthermore, $g$ is almost real on the imaginary axis, since
$g = f - f \mid H_N = f + e^{-2i\theta} f \mid K = e^{-i\theta}(e^{i\theta} f + e^{-i\theta} f \mid K)$, so $e^{i\theta} g$ is real on the imaginary
axis, and to show that $L(1) \ne 0$ it suffices to show that $g(itN^{-1/2}) \ne 0$ for $1 < t < \infty$.
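The claims $J(h) = 0$ and $J(g) = 2\int_1^\infty g(itN^{-1/2})\,dt$ are stated without computation. The following worked derivation is a sketch of one way to see them; it assumes $k = 2$, the operator $\mid$ as defined in Section 2, and $H_N = \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$, and the intermediate identities are not displayed in the original.

```latex
% Action of H_N on the imaginary axis (k = 2, det H_N = N, c\tau + d = N\tau):
\[
  (f \mid H_N)(\tau) = N\,(N\tau)^{-2}\, f\!\left(-\tfrac{1}{N\tau}\right),
  \qquad
  (f \mid H_N)\!\left(\tfrac{it}{\sqrt N}\right) = -\,t^{-2}\, f\!\left(\tfrac{i}{t\sqrt N}\right).
\]
% H_N acts as an involution in weight 2, so g | H_N = -g and h | H_N = h, hence
\[
  g\!\left(\tfrac{i}{t\sqrt N}\right) = t^{2}\, g\!\left(\tfrac{it}{\sqrt N}\right),
  \qquad
  h\!\left(\tfrac{i}{t\sqrt N}\right) = -\,t^{2}\, h\!\left(\tfrac{it}{\sqrt N}\right).
\]
% Substituting t = 1/u in the integral over (0, 1):
\[
  \int_0^1 g\!\left(\tfrac{it}{\sqrt N}\right) dt
    = \int_1^\infty g\!\left(\tfrac{i}{u\sqrt N}\right) \frac{du}{u^{2}}
    = \int_1^\infty g\!\left(\tfrac{iu}{\sqrt N}\right) du,
  \qquad
  \int_0^1 h\!\left(\tfrac{it}{\sqrt N}\right) dt
    = -\int_1^\infty h\!\left(\tfrac{iu}{\sqrt N}\right) du .
\]
% Therefore
\[
  J(g) = 2\int_1^\infty g\!\left(\tfrac{it}{\sqrt N}\right) dt,
  \qquad
  J(h) = 0,
  \qquad
  J(f) = \tfrac12\bigl(J(g) + J(h)\bigr) = \int_1^\infty g\!\left(\tfrac{it}{\sqrt N}\right) dt .
\]
```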
For $(\Gamma_0(N) : \Gamma_1(N)) = \tfrac12\varphi(N)$ even, i.e. $N = 13$ or $16$, this is shown as follows. Let
$\Gamma_1(N) \subset \Gamma_2(N) \subset \Gamma_0(N)$, where $\Gamma_2(N)$ is defined by $\varepsilon(d) = \pm 1$; let $\Gamma_2^*(N) = \Gamma_2(N)
\cup \Gamma_2(N) H_N$. Then $(\Gamma_2^*(N) : \Gamma_1(N)) = 4$, and $\Gamma_2^*(N)$ permutes the zeroes of $g =
f - f \mid H_N$. Since $g(\tau)\,d\tau$ has only $2p_1(N) - 2 = 2$ zeroes on $X_1(N)$, a zero of
$g(itN^{-1/2})$ with $1 < t < \infty$ would have to be a fixed point of some element of
$\Gamma_2^*(N)$, which is not the case.
Thus the conjecture of Birch and Swinnerton-Dyer predicts the finiteness of
$J_1(N)_{\mathbb{Q}}$ for $N = 13, 16$, and Mazur's result for $N = 13$ then gives another
verification of their conjecture.
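3. Groups generated by cusps on $X_0(N)$. Let $\Delta(\tau) = q \prod_{n=1}^{\infty} (1 - q^n)^{24}$ be the
cusp form of dimension $-12$ for $\Gamma$, where $q = e^{2\pi i \tau}$; $\Delta$ has a zero of order 1 at the
cusp, and no other zeroes or poles. Regarding $\Delta$ as a form for $\Gamma_0(N)$, it has a zero
at the cusp $\binom{x}{d}$ of order $N/dt$ (the ramification degree in $X_0(N) \to X$), i.e. $\Delta$ has divisor
$$(\Delta) = \sum (N/dt) \cdot \tbinom{x}{d}.$$
For any $d \mid N$, let $\Delta_d(\tau) = \Delta(d\tau)$; $\Delta_d$ is also a form for $\Gamma_0(N)$, whose divisor is easily
determined. For $d = N$, we find
$$(\Delta_N) = \sum (d/t) \cdot \tbinom{x}{d}.$$
(With $k = 12$, we have $\Delta_N = \Delta \mid \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix} = \Delta \mid \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix} = \Delta \mid H_N$, where $H_N = \begin{pmatrix} 0 & -1 \\ N & 0 \end{pmatrix}$
maps $X_0(N) \to X_0(N)$ and carries $\binom{x}{y}$ on $(1/d) H_N \binom{x}{y} = \binom{-y/d}{Nx/d} = \binom{x'}{y'}$, with $(y', N) = N/d$.)
Then
$$(\Delta/\Delta_N) = \sum (N/dt - d/t) \cdot \tbinom{x}{d}$$
is a divisor linearly equivalent to 0 on $X_0(N)$; similarly, for any $d, d' \mid N$, $\Delta_d/\Delta_{d'}$
is a function on $X_0(N)$ whose divisor is easily determined. For $N = p$ a prime, we
find $(\Delta/\Delta_p) = (p - 1)(P_0 - P_\infty)$, where $P_0 = \binom{0}{1}$, $P_\infty = \binom{1}{0}$, so the class of $P_0 - P_\infty$
is a point on $J_0(p)_{\mathbb{Q}}$ of order dividing $(p - 1)$; its order may be smaller, as follows.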
Let $\eta(\tau) = \Delta(\tau)^{1/24} = e^{\pi i \tau/12} \prod_{n=1}^{\infty} (1 - q^n)$ be Dedekind's function, and $\eta_d(\tau)
= \eta(d\tau)$. Then, by Hecke [7, Number 42], if $p$ is a prime $> 3$, then $\omega_p = \eta^p/\eta_p$ is a
modular form for $\Gamma_0(p)$ with multiplier $\varepsilon(d) = (d/p)$ (quadratic residue symbol), i.e.
$$\omega_p \mid \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \varepsilon(d) \cdot \omega_p \quad\text{for } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(p).$$
(Actually $\omega_p$ is a product of Eisenstein series, so the method here is similar to that
of the preceding section.) For $p = 3$, the same is true for $\omega_3^3 = \eta^9/\eta_3^3$, and for $p = 2$,
$\eta^{16}/\eta_2^8$ is a form for $\Gamma_0(2)$.
For simplicity, let us take the case of a prime $p > 3$, and let $\nu$ be the least integer
$> 0$ with $n = \nu(p - 1)/12 \in \mathbb{Z}$. Then $f_p = (\eta/\eta_p)^{2\nu} = (\eta^p/\eta_p)^{2\nu}\, \eta^{-2\nu(p-1)} = \omega_p^{2\nu} \Delta^{-n}$ is a
function on $X_0(p)$, of divisor
$$(f_p) = n \cdot (P_0 - P_\infty).$$
Actually, $n$ is the exact period of the divisor class of $P_0 - P_\infty$, i.e. $f_p^{1/r}$ is not a
function on $X_0(p)$ for $r > 1$, which can be seen from Dedekind's transformation
formulas for the $\eta$-function [4]. (We can see immediately that $f_p^{1/2}$ is not a function
on $X_0(p)$, for that could happen only for $n$ even and $\nu$ odd, and then we would
have a multiplier of $\varepsilon(d)$.) Thus
Theorem. For a prime $p > 3$, the divisor class of $P_0 - P_\infty$ is a point on $J_0(p)_{\mathbb{Q}}$
of period exactly $n = \nu(p - 1)/12$, where $\nu$ is the least integer $> 0$ with $n \in \mathbb{Z}$.
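Since $\nu$ is the least positive integer making $\nu(p-1)/12$ integral, $n$ is simply the numerator of $(p-1)/12$ in lowest terms and $\nu$ is its denominator. A short sketch, with illustrative function names:

```python
from fractions import Fraction

def cuspidal_period(p):
    """n = nu * (p - 1) / 12 with nu minimal: the numerator of (p - 1)/12."""
    return Fraction(p - 1, 12).numerator

def nu(p):
    """Least positive integer nu with nu * (p - 1) / 12 an integer."""
    return Fraction(p - 1, 12).denominator

if __name__ == "__main__":
    for p in (5, 7, 11, 13, 17, 19, 37):
        print(p, nu(p), cuspidal_period(p))
```

For $p = 11, 17, 19$ this gives $n = 5, 4, 3$, in agreement with the groups listed for those levels in the table of genus 1 cases.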
For $N$ composite, one has to consider $\eta_d$ for various $d \mid N$, and the situation is
more complicated. I have not stated a general result (perhaps it is not worthwhile
to do so); at any rate it seems clear that for any fixed $N$ one can determine the exact
(finite) subgroup of $J_0(N)_{\mathbb{Q}}$ generated by the rational cusps. The technique of
reducing modulo $p$, for small $p \nmid N$, can be applied here as in the previous section,
and in many cases one can check that the group generated is $J_0(N)_{\mathbb{Q},\mathrm{tors}}$.
For example, $X_0(27)$ has two rational cusps, namely $P_0 = \binom{0}{1}$ and $P_\infty = \binom{1}{0}$.
The function $\Delta_{27}^{3} \Delta_3 \Delta_9^{-1} \Delta^{-3}$ has divisor $72(P_\infty - P_0)$, and by the results of Hecke
mentioned above we can extract the 8th root, getting $(\eta_{27}^{9} \eta_9^{-3})(\eta_3^{3} \eta^{-9})$, a
function on $X_0(27)$ with divisor $9(P_\infty - P_0)$. But $X_0(27)$ has genus 1 and good
reduction at $p = 2$, and so actually the divisor class of $P_0 - P_\infty$ has period 3.
Reducing modulo 7, we find that $X_0(27)$ has 9 points in the field of 7 elements, so
$X_0(27)_{\mathbb{Q},\mathrm{tors}}$ has order 3. (In $\mathbb{F}_7$, $X_0(27)$ has 6 rational "cusps", and 3 rational points
given by the curve with endomorphism ring the order of conductor 3 in the field
of cube roots of 1, with (say) Frobenius endomorphism $\pi = (1 + (-27)^{1/2})/2$.
$(2\pi - 1)$, $(\pi + 4)$, and $(\pi - 5)$ are 3 isogenies, cyclic of degree 27.)
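The degrees of the three isogenies can be checked with the norm form of $\mathbb{Z}[\pi]$. The sketch below is illustrative only: it writes an endomorphism as $a + b\pi$ with $\pi^2 - \pi + 7 = 0$, so that the norm is $a^2 + ab + 7b^2$, and it uses the standard criterion (an assumption not spelled out in the text) that such an endomorphism has cyclic kernel exactly when it is not divisible by an integer $> 1$, i.e. when $\gcd(a, b) = 1$.

```python
from math import gcd

def norm(a, b):
    """Norm of a + b*pi, where pi = (1 + sqrt(-27))/2 has trace 1 and norm 7."""
    return a * a + a * b + 7 * b * b

def is_primitive(a, b):
    """True when a + b*pi is not divisible by an integer > 1 in Z[pi]."""
    return gcd(a, b) == 1

# (a, b) coordinates of 2*pi - 1, pi + 4 and pi - 5
isogenies = {"2pi - 1": (-1, 2), "pi + 4": (4, 1), "pi - 5": (-5, 1)}

if __name__ == "__main__":
    print("N(pi) =", norm(0, 1))
    for name, (a, b) in isogenies.items():
        print(name, norm(a, b), is_primitive(a, b))
```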
For the twelve values $N = 11, \dots, 49$ with $p_0(N) = 1$, one finds in this manner,
in all cases, that the rational cusps generate $X_0(N)_{\mathbb{Q},\mathrm{tors}}$; the groups are listed in the
table below. Actually, $X_0(N)_{\mathbb{Q}}$ is finite for these $N$, as is predicted by the conjecture
of Birch and Swinnerton-Dyer [12], and has been verified by various people,
including Ligozat [8].
Table ($p_0(N) = 1$)
N    RATIONAL CUSPS (d)         GROUP THEY GENERATE    NUMBER OF RATIONAL NONCUSPS
11   1, 11                      C_5                    3
14   1, 2, 7, 14                C_6                    2
15   1, 3, 5, 15                C_2 x C_4              4
17   1, 17                      C_4                    2
19   1, 19                      C_3                    1
20   1, 2, 4, 5, 10, 20         C_6                    0
21   1, 3, 7, 21                C_2 x C_4              4
24   1, 2, 4, 8, 3, 6, 12, 24   C_2 x C_4              0
27   1, 27                      C_3                    1
32   1, 2, 16, 32               C_4                    0
36   1, 2, 4, 9, 18, 36         C_6                    0
49   1, 49                      C_2                    0
Thus for $N = 20, 24, 32, 36, 49$, there can be no elliptic curve over $\mathbb{Q}$ with a
cyclic rational group of order $N$, let alone a rational point of order $N$.
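The cusp column of the table follows from Proposition 2, and the noncusp counts follow from it once one uses that $X_0(N)_{\mathbb{Q}}$ is finite and generated by the rational cusps, so that the number of rational noncusps is the group order minus the number of rational cusps. The sketch below is illustrative; the group orders are entered by hand as read from the table, not computed.

```python
from math import gcd

# N -> order of the group generated by the rational cusps (read off the table)
group_order = {11: 5, 14: 6, 15: 8, 17: 4, 19: 3, 20: 6,
               21: 8, 24: 8, 27: 3, 32: 4, 36: 6, 49: 2}

def rational_cusp_levels(N):
    """Divisors d of N whose cusp on X_0(N) is rational: t = (d, N/d) is 1 or 2."""
    return [d for d in range(1, N + 1) if N % d == 0 and gcd(d, N // d) in (1, 2)]

# Rational noncusps = (number of rational points) - (number of rational cusps).
noncusps = {N: order - len(rational_cusp_levels(N)) for N, order in group_order.items()}

if __name__ == "__main__":
    for N in sorted(group_order):
        print(N, rational_cusp_levels(N), group_order[N], noncusps[N])
```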
For $N = 19, 27$, we have just one rational noncusp, i.e. there is a unique elliptic
curve $A$ over $\mathbb{Q}$ together with a cyclic rational isogeny $\lambda$ of order $N$. By the
uniqueness, the isogenous curve $A' = \lambda(A)$ must be the same as $A$, i.e. $\lambda$ is a complex
multiplier of $A$. But then the $j$-invariant of $A$ is integral, i.e. $j \in \mathbb{Z}$, and so $A$ has
good reduction at $p = 2$ and hence cannot have a rational point of order $N$. Thus
we get a painless proof of the nonexistence of an elliptic curve with a rational
point of order $N = 19$ or $27$.
Finally, as Shimura has pointed out, Kummer theory gives another way to
find divisor classes of finite period on $J_0(N)$, this time rational over a cyclotomic
field. For simplicity, we consider only the case of a prime $p > 3$, although, as usual,
it does not really matter.
Consider the Galois covering $X_1(p) \to X_0(p)$, with group $\Gamma_0(p)/\Gamma_1(p) =
(\mathbb{Z}/p\mathbb{Z})^\times/\pm 1$, cyclic of order $(p - 1)/2$; this covering is ramified only over the elliptic
fixed points of $X_0(p)$. Now $X_0(p)$ has elliptic fixed points of order 2 resp. 3 if and
only if $p \equiv 1$ modulo 4 resp. 3. Thus if we define (as before) $\nu$ to be the least integer
$> 0$ with $n = \nu(p - 1)/12 \in \mathbb{Z}$, then our covering factors as
$$X_1(p) \to X_2(p) \to X_0(p),$$
where $X_2(p) \to X_0(p)$ is unramified, and cyclic of degree $n$.
Let $k_n = \mathbb{Q}(\zeta_n)$, $\zeta_n = e^{2\pi i/n}$. By Kummer theory, the function field $k_n(X_2(p))$ is
generated over $k_n(X_0(p))$ by adjoining $h = f^{1/n}$ for some $f \in k_n(X_0(p))$. $f$ has divisor
of the form $(f) = nD$ (since $X_2(p) \to X_0(p)$ is unramified), and the divisor class $d$ of
$D$ then has period exactly $n$ on $J_0(p)_{k_n}$.
Now $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(p)$ operates rationally, i.e. $\alpha\sigma = \sigma\alpha$ for $\sigma \in \mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$. Hence
$(h \circ \alpha)^\sigma = h^\sigma \circ \alpha$. Also $h \circ \alpha = \varepsilon(d) \cdot h$, where $\varepsilon$ is some character modulo $p$ of period $n$,
and for $\sigma \in \mathrm{Gal}(k_n/\mathbb{Q}) = (\mathbb{Z}/n\mathbb{Z})^\times$, we have $\sigma(\zeta_n) = \zeta_n^s$, say, where $s \in (\mathbb{Z}/n\mathbb{Z})^\times$. Hence
$$(h^\sigma h^{-s}) \circ \alpha = (\varepsilon(d)^s \cdot h^\sigma)(\varepsilon(d) \cdot h)^{-s} = h^\sigma h^{-s},$$
i.e. $g = h^\sigma h^{-s} \in k_n(X_0(p))$. In other words $\sigma(D) \sim s \cdot D$ on $X_0(p)$, i.e. $\sigma(d) = s \cdot d$, which
we describe by saying that $\mathrm{Gal}(k_n/\mathbb{Q})$ operates naturally on $d$.
Thus consideration of rational cusps gives $a \in J_0(p)_{\mathbb{Q}}$ of period $n = \nu(p - 1)/12$
($a$ = class of $P_0 - P_\infty$), while Kummer theory gives $d \in J_0(p)_{k_n}$ of period $n$ with the
Galois group operating naturally. It is clear that $a$ and $d$ generate $C_n \times C_n$ on
$J_0(p)_{k_n}$ if $n$ is odd; for even $n$ they generate $C_n \times C_{n/2}$.
References
1. G. Billing and K. Mahler, On exceptional points on cubic curves, J. London Math. Soc. 15
(1940), 32-43. MR 1, 266.
2. W. Casselman, The arithmetic of the cusps of the classical modular curves (to appear).
3. J. W. S. Cassels, Diophantine equations with special reference to elliptic curves, J. London Math.
Soc. 41 (1966), 193-291. MR 33 #7299.
4. R. Dedekind, "Erläuterungen zu den Fragmenten. XXVIII," in Collected works of B. Riemann,
Dover, New York, 1953, pp. 466-478.
5. V. A. Dem'janenko, Torsion of elliptic curves, Izv. Akad. Nauk SSSR Ser. Mat. 35 (1971),
280-307 = Math. USSR Izv. 5 (1971), 289-318.
6. R. Fricke, Die elliptischen Funktionen und ihre Anwendungen, Teubner, Leipzig, 1922.
7. E. Hecke, Mathematische Werke, Vandenhoeck & Ruprecht, Göttingen, 1959. MR 21 #3303.
8. G. Ligozat, Fonction L des courbes modulaires, Séminaire Delange-Pisot-Poitou, 1969/70, no. 9.
9. B. Mazur, Letter to A. Ogg, September, 1971.
10. A. Ogg, Modular forms and Dirichlet series, Benjamin, New York, 1969. MR 41 #1648.
11. A. Ogg, Rational points of finite order on elliptic curves, Invent. Math. 12 (1971), 105-111.
12. H. P. F. Swinnerton-Dyer, The conjectures of Birch and Swinnerton-Dyer, and of Tate, Proc.
Conf. Local Fields (Driebergen, 1966), Springer, Berlin, 1967, pp. 132-157. MR 37 #6287.
University of California, Berkeley
ARITHMETIC FUNCTIONS
AND BROWNIAN MOTION
WALTER PHILIPP
1. Introduction. Additive functions $f$ are mappings from the positive integers
into the reals satisfying $f(m_1 m_2) = f(m_1) + f(m_2)$ whenever the greatest common
divisor $\gcd(m_1, m_2) = 1$. For the limit theorems we shall be concerned with in this
paper there is no loss in generality in assuming that the additive function $f$ is
strongly additive, i.e. $f$ satisfies the additional requirement $f(p^\alpha) = f(p)$ for all
positive integers $\alpha$ and all primes $p$ (see [8, pp. 36-37]).
Kubilius proved [8, Theorem 4.2] a central limit theorem for additive functions
$f(m)$ satisfying a kind of Lindeberg condition. The special case where $\sup |f(p)| < \infty$
is the celebrated Erdős-Kac theorem [7]. Kubilius also gave a more general
version of the Erdős-Kac theorem [8, Theorem 7.3]. The connection between
Kubilius' result and weak convergence of probability measures is suggested by a
reference to a paper of Prohorov [13] on page 124 of his book [8]. Billingsley [3]
showed that Kubilius' Theorem 7.3 [8] implies that certain random functions in
C[0, 1], similar to hN(t, m) defined below, converge in distribution to standard
Brownian motion. He also gave a very simple proof of Theorem 1 below. In this
paper we shall prove Billingsley's theorem assuming only a Lindeberg condition.
Generalizing a result of Davenport and Erdős [6] on the distribution of
normalized sums of Legendre symbols, Kubilius and Linnik [9] (see also [10, Chapter
10]) proved that the finite dimensional distributions of certain normalized sums
of Jacobi symbols as well as of certain character sums converge to those of
standard Brownian motion. Combining their results and their methods of proving
them with Serfling's [14] maximal inequality we shall show that these normalized
sums converge in distribution to standard Brownian motion. This can be used to
obtain the standard corollaries on the distribution of the maxima and minima of
the normalized partial sums, as well as of the number of changes of sign of these
partial sums, etc.
AMS 1970 subject classifications. Primary 10K20, 10H25; Secondary 60J65.
© 1973, American Mathematical Society
2. Invariance principles for additive functions.
2.1. Statement of results. Let $\langle f_N \rangle$ be a sequence of strongly additive functions.
For $n \ge 1$ write
$$B^2(N, n) = \sum_{p \le n} \frac{f_N^2(p)}{p}, \qquad B(N, N) = B(N).$$
For $0 \le t \le 1$ define functions $h_N(t, m)$ by
$$h_N(t, m) = \frac{1}{B(N)} \sum f_N(p)\left(\delta_p(m) - \frac{1}{p}\right)$$
where $\delta_p(m) = 1$ or $0$ according to whether or not $p \mid m$ and the sum is extended
over all primes $p \le N$ satisfying $B^2(N, p) \le tB^2(N)$. Hence if $m$ is an integer chosen
at random from $[1, N]$ then $h_N(t, m)$ is a random function in $D[0, 1]$¹ constant
on the intervals $[B^2(N, q)/B^2(N), B^2(N, q')/B^2(N))$ where $q < q' \le N$ are
consecutive primes. The value of the constant is $B^{-1}(N) \sum f_N(p)(\delta_p(m) - 1/p)$ where
the sum ranges over all primes $p \le q$. There are other possible choices for $h_N(t, m)$.
For instance we could join the points of the graph at the end points of the interval
linearly. One then obtains a polygonal curve in $C[0, 1]$. The theorems below
continue to hold for this alternate choice of $h_N(t, m)$.
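To make the definition concrete, the step function $h_N(t, m)$ can be evaluated directly. The sketch below is an illustration under stated assumptions: the default choice $f_N(p) = 1$ for all $p$ (so that the underlying additive function counts distinct prime factors) and the function names are not from the paper.

```python
from math import sqrt

def primes_upto(n):
    """Primes p <= n by the sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def h(N, t, m, f=lambda p: 1.0):
    """h_N(t, m): normalized sum of f(p) * (delta_p(m) - 1/p) over the primes
    p <= N with B^2(N, p) <= t * B^2(N).  Requires N >= 2."""
    ps = primes_upto(N)
    B2 = sum(f(p) ** 2 / p for p in ps)           # B^2(N)
    total, partial = 0.0, 0.0
    for p in ps:
        partial += f(p) ** 2 / p                  # B^2(N, p)
        if partial > t * B2:
            break                                 # B^2(N, p) is increasing in p
        delta = 1.0 if m % p == 0 else 0.0
        total += f(p) * (delta - 1.0 / p)
    return total / sqrt(B2)

if __name__ == "__main__":
    N = 10 ** 4
    # One sample path m -> h_N(., m), evaluated on a coarse grid of times.
    print([round(h(N, k / 10, 9240), 3) for k in range(11)])
```

Sampling $m$ uniformly from $[1, N]$ and plotting $t \mapsto h_N(t, m)$ gives the random step functions whose limiting behaviour is the subject of the theorems that follow.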
Theorem 1 (Billingsley [5], [4]). Let $\langle f_N \rangle$ be a sequence of strongly
additive functions with $B(N) \to \infty$. If
$$\text{(1)} \qquad \sup_{p, n} |f_n(p)| < \infty,$$
then
$$h_N(t, m) \xrightarrow{\mathcal{D}} W,$$
i.e. the random functions $h_N(t, m)$ converge in distribution to standard Brownian
motion.
Remark. Though very similar, the present Theorem 1 and Theorem 7.3 in
[8] do not contain each other. Also their proofs are completely different.
¹ $D[0, 1]$ is the set of real-valued functions on $[0, 1]$ which are right-continuous and have left-hand
limits. $C[0, 1]$ is the set of continuous functions on $[0, 1]$.
In this paper we relax (1) to a kind of Lindeberg condition.
Theorem 2. Let $\langle f_N \rangle$ be a sequence of strongly additive functions with
$B(N) \to \infty$. Suppose that, for any $\varepsilon > 0$,
$$\text{(2)} \qquad B^{-2}(N) \sum \frac{f_N^2(p)}{p} \to 0 \qquad (N \to \infty),$$
where the sum is extended over all primes $p$ not exceeding $N$ and satisfying $|f_N(p)|
\ge \varepsilon B(N)$. Then
$$\text{(3)} \qquad h_N(t, m) \xrightarrow{\mathcal{D}} W$$
where $W$ is standard Brownian motion. Moreover, if $B(N)/B(N, N^{1/2}) \to 1$ then (2)
is also necessary.
The assumption that the functions fN be additive was kept for historical reasons.
Of course, Theorems 1 and 2 and their corollaries continue to hold for any sequence
of number theoretic functions subject to the remaining conditions. Simply
redefine the values of such a function on the nonprime powers to make the modified
function additive.
For the probabilistic background as well as for the proof of the standard
corollary to this kind of theorem see Billingsley [2].
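Corollary. Under the hypotheses of Theorem 2 we have, for $x \ge 0$,
$$\frac{1}{N} \operatorname{card}\left\{ m \le N : \max_{k \le N} \frac{f_N^{(k)}(m) - A(N, k)}{B(N)} < x \right\} \to \frac{2}{(2\pi)^{1/2}} \int_0^x e^{-u^2/2}\,du$$
and
$$\frac{1}{N} \operatorname{card}\left\{ m \le N : \max_{k \le N} \frac{|f_N^{(k)}(m) - A(N, k)|}{B(N)} < x \right\} \to \frac{4}{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1} \exp\left( -\frac{\pi^2 (2k + 1)^2}{8x^2} \right).$$
Here
$$A(N, k) = \sum_{p \le k} \frac{f_N(p)}{p}, \qquad A(N, N) = A(N), \qquad f_N^{(k)}(m) = \sum_{p \le k;\ p \mid m} f_N(p).$$
Remark. In the case $f_N = f$ for all $N \ge 1$ this corollary is due to Babu [1].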
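The two limit laws in the Corollary are the classical distributions of $\max_t W_t$ and $\max_t |W_t|$ on $[0, 1]$, and both are easy to evaluate numerically. The sketch below is illustrative: the function names and the truncation at 200 terms are choices made here, and the first law is rewritten through the error function using $(2/\pi)^{1/2} \int_0^x e^{-u^2/2}\,du = \operatorname{erf}(x/\sqrt{2})$.

```python
import math

def max_law(x):
    """(2/pi)^(1/2) * integral_0^x exp(-u^2/2) du, via the error function."""
    return math.erf(x / math.sqrt(2.0)) if x > 0 else 0.0

def abs_max_law(x, terms=200):
    """(4/pi) * sum_k (-1)^k / (2k+1) * exp(-pi^2 (2k+1)^2 / (8 x^2)),
    truncated after `terms` terms."""
    if x <= 0:
        return 0.0
    s = 0.0
    for k in range(terms):
        odd = 2 * k + 1
        s += (-1) ** k / odd * math.exp(-(math.pi ** 2) * odd * odd / (8.0 * x * x))
    return 4.0 / math.pi * s

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.0):
        print(x, round(max_law(x), 6), round(abs_max_law(x), 6))
```

Since $\max |W_t| \ge \max W_t$, the second value never exceeds the first, which gives a quick sanity check on the output.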
Theorem 2 and its corollary can be generalized in several directions. One of
them is as follows. Let $R(m)$ be a polynomial with integer coefficients and assuming
only positive values. Let $\rho_R(p)$ be the number of residue classes satisfying the
congruence $R(m) \equiv 0 \pmod p$. Again let $\langle f_N \rangle$ be a sequence of strongly additive
functions and put
$$b^2(N, n) = \sum_{p \le n} \frac{f_N^2(p)}{p}\,\rho_R(p), \qquad b(N, N) = b(N).$$
For $0 \le t \le 1$ define random functions $h_N(t, m, R)$ by
$$h_N(t, m, R) = \frac{1}{b(N)} \sum f_N(p)\left(\delta_p(R(m)) - \rho_R(p)/p\right),$$
where the sum is extended over all primes $p \le N$ with $b^2(N, p) \le t\,b^2(N)$. Then
Theorem 2 remains valid if we replace (2) by
$$b^{-2}(N) \sum \frac{f_N^2(p)}{p}\,\rho_R(p) \to 0 \qquad (N \to \infty).$$
The proof of this result is essentially the same as the proof of Theorem 2 given
below. Instead of Lemmas 1 and 2 one has to apply the corresponding facts due to
Uzdavinis [15].
Similarly one can generalize Theorem 2 to sums of shifted additive functions,
a theory developed by Kubilius (see [8, Chapters 5-7]). The modified version of
Lemma 1 follows from the last formula in [8, p. 97]; Lemma 2 is applied as it
stands; and instead of Theorem 3 one has to use a slightly generalized version of
[8, Theorem 5.2] (see the proof of Theorem 3).
As already indicated above we shall prove only Theorem 2 in complete detail.
2.2. Lemmas on additive functions. We need a minor extension of a result of
Kubilius mentioned above [8, Theorem 4.2].
Theorem 3. Under the hypotheses of Theorem 2,
$$\text{(4)} \qquad \frac{1}{N} \operatorname{card}\left\{ m \le N : \frac{f_N(m) - A(N)}{B(N)} < x \right\} \to \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{x} e^{-u^2/2}\,du.$$
Moreover, if $B(N)/B(N, N^{1/2}) \to 1$ then (2) is also necessary for (4) to hold.
For the proof see Kubilius [8] or the author [12]. Simply attach the index
$N$ to each $f$. The proofs then work without further changes. The reason for this
phenomenon is as follows. By the additivity of $f_N$ we have
$$f_N(m) = \sum_{p} f_N(p^{\alpha_p(m)})$$
where $\alpha_p(m)$ is defined by the canonical representation $m = \prod_p p^{\alpha_p(m)}$. Hence
limit theorems for $f_N(m)$ can be interpreted as limit theorems for sums of the random
variables $f_N(p^{\alpha_p(m)})$ ($p = 2, 3, 5, \dots \le N$) on the probability spaces $(\Omega_N, \mathfrak{F}_N, P_N)$. Here
$\Omega_N = \{1, 2, \dots, N\}$, $\mathfrak{F}_N = \mathfrak{P}(\Omega_N)$ is the power set and $P_N$ is defined by $P_N(A) = (1/N)
\cdot \operatorname{card}(A)$, $A \subset \Omega_N$. Then Theorem 3 is a central limit theorem for the row sums of
the triangular array $\langle f_N(p^{\alpha_p(m)}),\ p \le N,\ N = 1, 2, \dots \rangle$, where within a row the
$f_N(p^{\alpha_p(m)})$ satisfy certain dependence relations. Since [8, Theorem 4.2] is proved
by means of a central limit theorem for triangular arrays it is quite clear that its
proof will yield Theorem 3, too.
From now on we shall drop the index $N$ in $f_N$ and $f_N^{(k)}$.
Lemma 1 [12]. Let $r = r(N)$ be any integer valued function with
$\log r/\log N \to 0$ and let $P$ and $Q$ be two disjoint sets of primes of which no primes
exceed $r$. Let $\mathfrak{M}^{(P)}$ be the algebra generated by the random variables $\{f(p^{\alpha_p(m)}),\ p \in P\}$
and similarly for $Q$. Then
$$P_N(AB) - P_N(A) P_N(B) \ll \exp\{-c(\log N/\log r)^{1/2}\}$$
for all sets $A \in \mathfrak{M}^{(P)}$ and $B \in \mathfrak{M}^{(Q)}$. Here the constant $c > 0$ and the constant implied
by $\ll$ are absolute.
Remark. This result is slightly more general than [12, Lemma 5.2.1], but
the proof is the same.
Lemma 2 [12, Lemma 5.1.4]. For any integers $r, u \ge 2$,
$$\operatorname{card}\left\{ m \le N : \prod_{p \le r} p^{\alpha_p(m)} > u \right\} \ll N \exp\left\{ -c\left( \frac{\log u}{\log r} \right) \right\}$$
where $c > 0$ and the constant implied by $\ll$ are absolute.
The following maximal inequality will be used to establish tightness. First
we give some notation. We introduce the random variables
$$x_{Np} = B^{-1}(N)\,\{ f(p^{\alpha_p(m)}) - E_N( f(p^{\alpha_p(m)}) ) \}, \qquad p \le r(N) \text{ prime},$$
on the probability spaces $(\Omega_N, \mathfrak{F}_N, P_N)$ as described above. Here $r = r(N)$ satisfies
$\log r/\log N \to 0$ and $E_N$ denotes the expectation of the random variable involved.
Obviously $E_N(x_{Np}) = 0$. Write $S_{Nk} = \sum_{p \le k} x_{Np}$.
Lemma 3. For any e>0 and any sufficiently large N
The proof is adapted from the standard proof of the Levy inequalities. We put
Sno = 0 > Snu = max {SNj - \i (SNj - SNr)),
Ak — \^Nk- 1 < £> $Nk~~A*\^Nk~~ ^Nr) = £} »
&k — { $Nr — $Nk ~ V fiNr ~ $Nk) = 0} ,
where \i is a median with ft{SNr — SNk) = —ft(SNk — SNr). As in [11, p. 248], we have
{SJr^fi}= U Ak9 {SNr*e}=> U AkBk.
k^r k^r
The sets Ak are disjoint and PN{Bk)^j. If we can prove that
(5) PN JU i4AU(i + o(l)) £ PN(^) + o(l),
Ik^r ) k^r
the remainder of the proof in [11, p. 248] will work without change. In fact (5)
will imply
PN{SNrZe}£PN(v AkBk)^iZ PN(A) + o(\)=iP(S*Nr^e) + o(l)
\k^r ) k^r
and the rest is the same.
For the proof of (5) we repeat part of the argument leading to [12, Lemma 5.2.1].
Each Ak and Bk can be written as
Ak= U JO {aPlN = a|.}j,
(ai,...,ait)eYk (i=l J
Bk= U { O {cep» = /?,.}},
<0lc+l,...,0s)€Zk U = k+1 J
where yfc and Zk are certain /c-tuples and (s —/c)-tuples of integers respectively. We
drop the index N in PN and write
k
£ft = £k(a1,...,ak)=n {aPi(m) = aI-},
Fk = Fk(pk+u...,ps)= n K»=/5/}-
We define
P'(£fc) = P(£fc), if/?=flpf^iV1/4,
=0, ifK>JV1/4,
and P"(£,) = P(£)i)-F(£k). Similarly for F(Fk), P"(Fk), F(EkFk), P"(EkFk). We
apply Lemma 1, sum over all (au..., <xk)e Yk and (pk+1,..., fls)eZk (for the details
see [12, p. 82]) and obtain
P(AkBk)-P(Ak)P(Bk) (l+o(l))
(6)
£(l+o(l)) X (P"(£*)P(Ft) + P(£»)P"(F»)) (lgfcgr).
Yk,Zk
Summing over 1 g/egr we get on the left-hand side
(7) p(\j AkBk)-(\+o(\)) £ P(Ak) P(Bk).
Lemma 2 with u = N1/4 yields
pj]~I pa*(m)>N1/4j = 0(l).
Hence summing over lg/cgr and observing that any two sets of type E are
disjoint, since the Ak are, we obtain for the first terms on the right-hand side of (6)
the bound
EI/"(£*)=/» (u U*Ek) = o(\)
k^r Yk \k^r Yk J
where the * indicates that the union is extended over those (al5..., OLk)eYk with
pV'~Pkk>N1/4' The sum of the second terms on the right-hand side of (6) is
less than
p(v Ak\supP"(Fk) = o(l).
Hence (7) yields
p(u AkBk) = £ + o(\)) £ P(Ak) + o(l)
\k^r J k^r
as P(Bk)^\. This proves (5) and thus the lemma.
Lemma 4. If B(N)/B(N, r)-+l when N-+cc and logA71ogr->oo, then, as
TV-oo,
(8)
and
sup£N(xJp)-*0
(9) PN< max
I x»p~Wm 2/W(Mm)-;)i
>Smin(r,n) 0(I\)p^„ \ p/\
>£>-0
for any e > 0.
Proof. (8) is proved in [12, p. 88]. For the proof of (9) write
yNp=xNp-(l/B(N))f(p)(5p(m)-l/p), l^p^r,
= (l/B(N))f(p)(3p(m)-l/p), r<p£N.
By Chebyshev's inequality it is enough to show that
(10)
EN <[ max
I yNp\
But the left-hand side of (10) does not exceed
£,Jmax £ \yNP\>£ Z EN\yNp\-
[n^N p^n J p^N
By an easy calculation using [12, Lemma 5.2.4] we obtain
I EN\yNp\<-^-l\f(p)\-N-'=o(l)
and similarly
I ^I^pN^TIa I ——=o(l).
r<p^N By1*) r<p£N P
The last three inequalities prove (10) and hence (9).
2.3. Proof of Theorem 2. Since (4) follows from (3) (see e.g. [2, p. 4]), Theorem 3
implies that (2) is necessary for (3) to hold provided that $B(N)/B(N, N^{1/2}) \to 1$.
To prove the sufficiency we observe that the Lindeberg condition (2) implies
the existence of an integer-valued function $r = r(N)$ such that $B(N)/B(N, r) \to 1$
with $\log N/\log r \to \infty$ (see [8, p. 61]). Moreover, Theorem 3 and Lemma 4 yield
(id '•[s><"taM*"*1*
— oo
iff, for any e>0,
(12) £ f x2dFNp^0.
Vkr J
\x\*z
Here $F_{Np}$ denotes the distribution function of $x_{Np}$. In fact this can be regarded as
an intermediate step in the proof of Theorem 3 (see e.g. [12, Lemma 5.3.1]).
Define random functions $X_N(t, m)$ by
$$X_N(t, m) = \sum x_{Np}$$
where the sum is extended over all primes $p \le r$ with $E_N\bigl(\sum_{q \le p} x_{Nq}\bigr)^2 \le t\,E_N\bigl(\sum_{p \le r} x_{Np}\bigr)^2$.
Then $X_N(t, m) \in D[0, 1]$. We shall show that the sequence $\langle X_N(t, m) \rangle$ is tight and
that its finite dimensional distributions converge weakly towards those of standard
Brownian motion. Once these two facts are established it will follow that $X_N(t, m)$
converges in distribution towards standard Brownian motion (see [2, Theorem
8.1]). This together with Lemma 4 and [2, Theorem 4.1] will imply Theorem 2.
2.3.1. To establish tightness we have to show that given $\varepsilon, \eta > 0$ there exists a
positive $\delta$ such that
$$\text{(13)} \qquad P_N\{ w(X_N, \delta) \ge \varepsilon \} \le \eta$$
for all sufficiently large $N$. Here $w$ is the modulus of continuity. By [2, p. 56,
Corollary] we have
(14) PN{w(XN,8)^s}^ £ PN\ sup W^-X^^e
for any s>0, 5>0 and 0 = /0</1 <-</w = l with /,- — /f_ t ^5 (2^/g/i-l). Let
s > 0 and 77 > 0 be given. Choose
(15) 0<S<&2
so small that
10(5-1exp(-£2/(8(5))<y7.
Choose tf = i<5, 0gi^[<5-1] + l = w. Because of the structure of XN we have
(16)
pj \XN(s)-XN(ti-i)\^i^PN\ max
XN.
ti -l^p^t, _i +j
>£
where zt is the largest integer t with £jv(£^TXjvp)2^r For the estimate of the
right-hand side of (16) we apply Lemma 3. We redefine xNp to be zero for all primes
p outside the interval [t,_ x, tJ. Then for suificiently large N the right-hand side of
(16) does not exceed
(17)
4PN
KNp\
.It,- i^p^t,
Since the xNp are almost independent we have, for any a and b,
E[ I xNp\ = 2 E(x2Np) + o(l).
using (8). (A proof can be easily patterned after the argument on [12, p. 89].)
Hence by the choice of t{ and zt we obtain
(18)
ONi=daEH[ E xNp) ~5 (JV-»oo).
*i- i^p^'i
We apply now (11) to the sequence <xNp(7Ni\ x^^p^x^ and observe that the
Lindeberg condition (12) in the present case becomes
*n/2 I x2dFNp<S-l£
x2dFNp-+0
\x\^effNl
using (18) twice. Hence by (11),
M ^e<5
(19)
PNUml
xN.
r,- t^P^r,
oo
^ot>< \e-ul2 du<OL~l exp(~^a2)
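and thus by (18), (19) and (15) we conclude that (17) is less than
(20) $4P_N\bigl\{\bigl|\sum_{\tau_{i-1} \le p \le \tau_i} x_{Np}\bigr| \ge \tfrac12\varepsilon\bigr\} + o(1) < 5\exp\bigl(-\varepsilon^2/(8\delta)\bigr) + o(1) < \eta\delta$
for $N$ sufficiently large. (13) follows now from (16), (20), (14) and the fact that $n \ll \delta^{-1}$.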
2.3.2. We shall now prove the convergence of the finite dimensional
distributions to those of Brownian motion. We shall consider only the two dimensional
distributions, as the higher ones can be treated in the same manner. We consider
first a single time interval $[t, u]$, $0 \le t < u \le 1$, and obtain by the argument leading to
(19) that
(21) $X_N(u) - X_N(t) \xrightarrow{\mathcal{D}} W_u - W_t$.
We wish to show that for $t < u$,
(22) $(X_N(t), X_N(u)) \xrightarrow{\mathcal{D}} (W_t, W_u)$.
But this is equivalent to showing (see [2, Corollary 1 to Theorem 5.1])
$(X_N(t), X_N(u) - X_N(t)) \xrightarrow{\mathcal{D}} (W_t, W_u - W_t)$
or
(23) $\bigl(\sum_{p \in Q_t} x_{Np},\ \sum_{p \in Q_u - Q_t} x_{Np}\bigr) \xrightarrow{\mathcal{D}} (W_t, W_u - W_t)$.
Here $Q_t$ is the set of primes figuring in the definition of $X_N(t, m)$. The two sets of
primes $Q_t$ and $Q_u - Q_t$ are disjoint and thus, by Lemma 1,
\peQt peQu-Qt / \peQt J \peQu~Qt /
By (21) the two terms in the product tend towards normal distribution with means
0 and variances t and u — t respectively. Since Brownian motion has independent
increments (23), and thus (22), follows. In view of the remarks at the beginning of
§3, Theorem 2 is proven.
4. Proof of the Corollary. The first relation is proven in the same way as
[2, (10.11) and (10.17)]. For the second relation, one uses an argument similar to
that leading to [2, (10.11)], but, instead of using [2, (10.16)], applies [2, (11.13)].
3. Invariance principles for sums of Jacobi symbols and certain character sums.
At first we shall consider sums of Jacobi symbols. For P odd square free we define
Hence if we pick m at random from the integers 1, 2,..., P we see that SP(m, t; h)
is a random function on [0,1 ]. This is just an intuitive way of saying that we consider
them as being defined on the probability space $(\Omega_P, \mathfrak{F}_P, \mu_P)$ where $\Omega_P$ is the segment
of the first $P$ positive integers, $\mathfrak{F}_P$ is the power set of $\Omega_P$ and $\mu_P(A) = (1/P)\,\mathrm{card}(A)$
for all subsets $A$ of $\Omega_P$. We denote the probability measure by $\mu_P$ for obvious
reasons. With this convention we have the following extension of a theorem of Kubilius
and Linnik [9].
Theorem 4. Let P run through any infinite increasing sequence of odd square
free numbers such that for every fixed $c \ge 0$
(24) IJ (.-£)-.
as $P \to \infty$. (The product is extended over all prime divisors $p$ of $P$.) Let $h = h(P) \to \infty$
so slowly that $\log h/\log P \to 0$. Then
where W is standard Brownian motion.
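The displayed definition of $S_P(m, t; h)$ did not survive in this copy. Purely as an illustration, the following sketch assumes the normalization $h^{-1/2}\sum_{1 \le n \le ht}\bigl(\frac{m+n}{P}\bigr)$, with $m$ drawn uniformly from $1, \ldots, P$ as described above. The function names are hypothetical and the normalization is an assumption, not a quotation of the paper.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for a positive odd integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2 using (2/n) = -1 exactly when n = 3, 5 mod 8.
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: flip the sign when both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def jacobi_partial_sum(P, m, t, h):
    """Assumed form of S_P(m, t; h): h**-0.5 * sum_{1 <= n <= h*t} ((m+n)/P)."""
    upper = int(h * t)
    total = sum(jacobi(m + n, P) for n in range(1, upper + 1))
    return total / h ** 0.5


def sample_path(P, m, h, grid=10):
    """Values of the assumed random function at t = 0, 1/grid, ..., 1."""
    return [jacobi_partial_sum(P, m, k / grid, h) for k in range(grid + 1)]
```

For a fixed modulus $P$, each choice of $m$ gives one path on $[0, 1]$; averaging a functional of `sample_path(P, m, h)` over $m = 1, \ldots, P$ corresponds to integrating against $\mu_P$.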
In a similar fashion we extend a theorem of Kubilius and Linnik to certain
character sums.
Theorem 5. Let P run through an infinite increasing sequence of odd square
free integers, occurring as basic moduli of primitive characters of order $g > 2$
satisfying (24). Let $h$ be as in Theorem 4. Then the real and the imaginary parts of
$(2/h)^{1/2} \sum_{n \le ht} \chi_P(m + n)$
converge in distribution towards standard Brownian motion. Here $\chi_P(m)$ is any
primitive character modulo $P$ of order $g$.
The convergence of the finite dimensional distributions has already been
established by Kubilius and Linnik (see [10, Chapter 10]). The tightness can be
proved by combining their proof with the following maximal inequality, due to
Serfling.
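Lemma 5 (Serfling [14]). Let $\{X_n, n = 1, 2, \ldots\}$ be a sequence of random
variables. Suppose that, for $\nu > 2$, all $a \ge a_0$ and all $n \ge 1$, $E\bigl|\sum_{j=a+1}^{a+n} X_j\bigr|^{\nu} \le g^{\nu/2}(n)$,
where $g(n)$ is nondecreasing, $2g(n) \le g(2n)$ and $g(n+1)/g(n) \to 1$ as $n \to \infty$. Then there
exists a finite constant $M$ such that, for all $a \ge a_0$ and $n \ge 1$,
$E\bigl(\max_{1 \le k \le n} \bigl|\sum_{j=a+1}^{a+k} X_j\bigr|^{\nu}\bigr) \le M g^{\nu/2}(n)$.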
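To make the shape of Lemma 5 concrete, here is a small exact computation for a toy case that is not taken from the paper: independent fair signs $X_j = \pm 1$ and $\nu = 4$. Then $E|S_n|^4 = 3n^2 - 2n \le g^2(n)$ with $g(n) = \sqrt{3}\,n$, and the sketch compares this with the fourth moment of the running maximum by enumerating all sign patterns. The function name is hypothetical.

```python
from itertools import product


def exact_moments(n, nu=4):
    """Return (E|S_n|^nu, E max_{k<=n} |S_k|^nu) for a fair +-1 walk.

    Both expectations are computed exactly by enumerating all 2**n sign
    sequences, so n should stay small (say n <= 15).
    """
    total_end = 0
    total_max = 0
    for signs in product((-1, 1), repeat=n):
        partial = 0
        running_max = 0
        for x in signs:
            partial += x
            running_max = max(running_max, abs(partial))
        total_end += abs(partial) ** nu
        total_max += running_max ** nu
    count = 2 ** n
    return total_end / count, total_max / count


if __name__ == "__main__":
    for n in (2, 4, 8, 12):
        end, mx = exact_moments(n)
        # The ratio mx / n**2 stays bounded, as the lemma predicts for g(n) ~ n.
        print(n, end, mx, mx / n ** 2)
```

In this toy case the maximal moment exceeds the end-point moment only by a bounded factor, which is the content of the lemma with $g^{\nu/2}(n)$ proportional to $n^2$.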
We prove tightness for the character sums only; the proof for the sums of
Jacobi symbols is similar. We shall use the notation of Linnik [10, Chapter 10].
Now, for $0 \le s < t \le 1$,
EP
Z Xp ("* + *)
hsSn^ht
=4 I (h/2)^TP(m,s,t;h,x)
* m=l
4 h2
For the estimate of $\mu_{22}(P)$ we apply [10, X.2.7] and obtain
$\mu_{22}(P) \ll h^{-2}(h(t-s)+1)^2(1+o(1)) + P^{-1/4}h^{-2}(h(t-s)+1)^2 \ll (t-s)^2$,
uniformly in $0 \le s < t \le 1$ because if $t - s < 1/h$ the sum defining $\mu_{22}(P)$ is empty and
because $o(1)$ depends only on the rate of convergence in (24). Hence the left-hand
side of (25) is bounded by a constant times $(ht - hs)^2$. We apply Serfling's Lemma 5
with $g(u) = u$ and obtain
$E_P\bigl(\max_{hs < n \le ht} \bigl|\sum_{hs < k \le n} \chi_P(m + k)\bigr|^4\bigr) \ll (ht - hs)^2$
uniformly in $0 \le s < t \le 1$ and hence, by Chebyshev's inequality for any $\lambda > 0$,
$\frac{1}{P}\,\mathrm{card}\bigl\{m \le P : \max_{hs < n \le ht} \bigl|\sum_{hs < k \le n} \chi_P(m + k)\bigr| \ge \lambda(ht - hs)^{1/2}\bigr\} \ll \lambda^{-4}$
uniformly in $0 \le s < t \le 1$. This implies tightness (see [2, Theorem 8.4]).
References
1. G. Jogesh Babu, On the distribution of additive arithmetical functions of integral polynomials,
Technical Report Math.-Stat. # 19/1971, Indian Statistical Institute, 1971.
2. Patrick Billingsley, Convergence of probability measures, Wiley, New York, 1968. MR 38 #1718.
3. , Written communication.
4. , Unpublished manuscript.
5. , Additive functions and Brownian motion, Notices Amer. Math. Soc. 17 (1970), 1050.
Abstract #681-A9.
6. H. Davenport and P. Erdos, The distribution of quadratic and higher residues, Publ. Math.
Debrecen 2 (1952), 252-265. MR 14, 1063.
7. P. Erdos and M. Kac, The Gaussian law of errors in the theory of additive number theoretic
functions, Amer. J. Math. 62 (1940), 738-742. MR 2, 42.
8. J. Kubilius, Probabilistic methods in the theory of numbers, Gos. Izdat. Polit. Naucn. Lit. Litovsk.
SSR, Vilna, 1962; English transl., Transl. Math. Monographs, vol. 11, Amer. Math. Soc, Providence,
R.I., 1964. MR 26 #3691; MR 28 #3956.
9. J. Kubilius and Ju. V. Linnik, An arithmetic analogue of Brownian motion, Izv. Vyss. Ucebn.
Zaved. Matematika 1959, no. 6 (13), 88-95. (Russian) MR 25 #57.
10. Ju. V. Linnik, Ergodic properties of algebraic fields, Izdat. Leningrad. Univ., Leningrad,
1967; English transl., Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 45, Springer-
Verlag, New York, 1968. MR 35 #5408; MR 39 #165.
11. Michel Loeve, Probability theory, 3rd ed., Van Nostrand, Princeton, N.J., 1963. MR 34 #3596.
12. Walter Philipp, Mixing sequences of random variables and probabilistic number theory, Mem.
Amer. Math. Soc. No. 114 (1971).
13. Ju. V. Prohorov, Convergence of random processes and limit theorems in probability theory,
Teor. Verojatnost. i Primenen. 1 (1956), 177-238 =Theor. Probability Appl. 1 (1956), 106-134. MR 18,
943.
14. R. J. Serfling, Moment inequalities for the maximum cumulative sum, Ann. Math. Statist. 41
(1970), 1227-1234. MR 42 #3835.
15. R. V. Uzdavinis, On the joint distribution of values of additive arithmetic functions of integral
polynomials, Trudy Akad. Nauk Litov. SSR Ser. B. 1960, no. 1 (21), 5-29. (Russian) MR 26 #100.
University of Illinois at Urbana-Champaign
BRUN'S METHOD AND
THE FUNDAMENTAL LEMMA1
H.-E. RICHERT AND H. HALBERSTAM
This paper contains a simple account of Brun's combinatorial sieve method. A
less general version of it appeared recently in our note [1] (note, however, that
although all the main results are correctly stated, [1] is full of misprints and minor
errors).
Let $\mathcal{A}$ be a finite sequence of integers, and let $\mathcal{P}$ be a set of primes. In order to
sift $\mathcal{A}$ by those primes of $\mathcal{P}$ which are less than $z$, any "small" sieve method
estimates the sifting function
$S(\mathcal{A}; \mathcal{P}, z) := |\{a : a \in \mathcal{A},\ (a, P(z)) = 1\}|$,
where $z \ge 2$,
$P(z) := \prod_{p < z;\, p \in \mathcal{P}} p$,
and $|\{\ldots\}|$ denotes the cardinality of the set $\{\ldots\}$. There are no significant estimates
for all sequences $\mathcal{A}$ and all sets of primes $\mathcal{P}$; therefore, one has to introduce some
more basic restrictions on $\mathcal{A}$ and $\mathcal{P}$. Here, we postulate the following conditions,
which in most cases are quite natural: the existence of a real number $X > 1$ and a
multiplicative function $\omega$ such that
$(\Omega_1)$ $\omega(p)/p \le 1 - 1/A_1$ for $p \in \mathcal{P}$, $A_1 \ge 1$,
$(\Omega_2(\kappa))$ $\sum_{w \le p < z;\, p \in \mathcal{P}} \frac{\omega(p)}{p}\log p \le \kappa\log\frac{z}{w} + A_2$, $2 \le w \le z$, $\kappa > 0$, $A_2 \ge 1$,
AMS 1970 subject classifications. Primary 10H30, 10H20.
1 This is an abstract of a paper to appear in Acta Arithmetica.
(C) 1973, American Mathematical Society
and such that the "remainders"
$R_d := \sum_{a \in \mathcal{A};\, a \equiv 0 \bmod d} 1 - \frac{\omega(d)}{d}X$
satisfy
(R) $|R_d| \le K\omega(d)$ for $d \mid P(z)$, $K \ge 1$.
Also, for convenience, we put
$\omega(p) = 0$ if $p \notin \mathcal{P}$,
and then
$V(z) := \prod_{p < z}\bigl(1 - \frac{\omega(p)}{p}\bigr)$.
On probabilistic grounds one expects that
$S(\mathcal{A}; \mathcal{P}, z) \sim XV(z)$,
at least in a certain region of the $X$-$z$-plane. By a "Fundamental Lemma" one
understands such a result for $S$, where $u := \log X/\log z$ $(\ge 1)$ is large. The problem
is first to find a region which extends as far as possible and, secondly,
simultaneously to find a remainder term as sharp as possible for
$\bigl|\frac{S(\mathcal{A}; \mathcal{P}, z)}{XV(z)} - 1\bigr|$.
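The comparison between the sifting function and its expected size can be tried numerically in the simplest setting. The sketch below assumes $\mathcal{A} = \{1, \ldots, N\}$, $\mathcal{P}$ the set of all primes, $\omega(p) = 1$ and $X = N$, so that $|R_d| \le 1$ and (R) holds with $K = 1$. These choices and the function names are illustrative assumptions, not part of the paper.

```python
import math


def primes_below(z):
    """All primes p with p < z, by the sieve of Eratosthenes."""
    flags = bytearray([1]) * max(z, 2)
    flags[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(z) + 1):
        if flags[i]:
            flags[i * i::i] = bytearray(len(range(i * i, len(flags), i)))
    return [p for p in range(2, z) if flags[p]]


def sifting_function(N, z):
    """S(A; P, z) for A = {1, ..., N}: count n <= N with no prime factor < z."""
    alive = bytearray([1]) * (N + 1)
    alive[0] = 0
    for p in primes_below(z):
        alive[p::p] = bytearray(len(range(p, N + 1, p)))
    return sum(alive)


def expected_main_term(N, z):
    """X * V(z) with X = N and omega(p) = 1, i.e. N * prod_{p<z} (1 - 1/p)."""
    V = 1.0
    for p in primes_below(z):
        V *= 1.0 - 1.0 / p
    return N * V


if __name__ == "__main__":
    N = 10 ** 6
    for z in (10, 30, 100):
        u = math.log(N) / math.log(z)
        ratio = sifting_function(N, z) / expected_main_term(N, z)
        print(f"z={z:4d}  u={u:5.2f}  S/(X V(z))={ratio:.5f}")
```

In this special case the ratio is close to 1 when $u$ is large, which is the behaviour a Fundamental Lemma quantifies.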
We obtain the following
Theorem 1. Suppose that $(\Omega_1)$, $(\Omega_2(\kappa))$ and (R) hold. Then
(1) $S(\mathcal{A}; \mathcal{P}, z) = XV(z)\{1 + O(\exp(-u(\log u - \log\log 3u - \log\kappa - 2))) + O(K\exp(-(\log X)^{1/2}))\}$,
where the O-constants may depend on $\kappa$, $A_1$ and $A_2$ only.
The result, where the interest is attached to the first error term, is superior to
all results obtained so far, and it rests completely on Brun's method. The name
"Fundamental Lemma" seems to stem from Barban. Such a result was extensively
used by Kubilius in his book [3] (see also [5]), where he gave numerous
applications, mainly to additive functions. For $u$ bounded (and $K \ll \exp((\log X)^{1/2})$), the
theorem tells us the well-known fact that $S \ll XV(z)$, a result which is often useful
in situations where the sieve is used in an auxiliary capacity, and in most cases
this is simply quoted as "by Brun's sieve". By Selberg's sieve or the "Large Sieve"
we obtain only a weaker result: the first error term in (1) is replaced by
O(exp(-4«logtt))(cf.[2]).
The Fundamental Lemma also has applications to the study of quasi-primes
(cf. e.g. Lavrik [4]). We shall call a number $q$ a quasi-prime (relative to a large
number $x$ and an arbitrary function $u = u(x)$, tending to infinity arbitrarily slowly)
if it has no prime factor less than $x^{1/u}$. In this connection we deduce from our
Theorem 1 the following general result.
Theorem 2. Let $f_1, \ldots, f_g$ be distinct irreducible polynomials with integer
coefficients, and let $\rho(p)$ denote the number of solutions of the congruence
$f_1(n) \cdots f_g(n) \equiv 0 \bmod p$; assume that $\rho(p) < p$ for all $p$. Then the number of $n$, $1 \le n \le x$,
such that each $f_i(n)$ is a quasi-prime $(i = 1, \ldots, g)$ is equal to
y p\ p-i A pJ log**
x<l+OF(exp(-w(logw-loglog3w-log0-2))) + 0F
where the $O_F$-constants may depend at most on the coefficients and degrees of $f_1, \ldots, f_g$.
In a later communication we shall deduce from a new version of Brun's sieve,
in which (R) is replaced by a weaker condition, an asymptotic formula for the
number of primes $p \le x$ such that each of $f_i(p)$ is a quasi-prime.
Bibliography
1. H. Halberstam and H.-E. Richert, A new look at Brun's sieve, Bull. Soc. Math. France 25 (1971),
97-106.
2. , Sieve methods, Markham, Chicago, Ill. (to appear).
3. J. P. Kubilius, Probabilistic methods in the theory of numbers, Gos. Izdat. Polit. Naucn. Litovsk.
SSR, Vilna, 1962; English transl., Transl. Math. Monographs, vol. 11, Amer. Math Soc, Providence,
R.I., 1964. MR 26 #3691; MR 28 #3956.
4. A. F. Lavrik, The theory of quasiprime numbers, Dokl. Akad. Nauk SSSR 152 (1963), 544-547 =
Soviet Math. Dokl. 4 (1963), 1355-1359. MR 27 #4805.
5. W. Philipp, Mixing sequences of random variables and probabilistic number theory, Mem. Amer.
Math. Soc. No. 114(1971).
University of Ulm
Ulm, Federal Republic of Germany
University of Nottingham
Nottingham, England
ESTIMATION OF THE AREA OF THE
SMALLEST TRIANGLE OBTAINED
BY SELECTING THREE OUT OF n
POINTS IN A DISC OF UNIT AREA
K. F. ROTH
1. Introduction. We endeavour to give a concise and unified account of our
recent work [3], [4].
Let
(1.1) $P_1, P_2, \ldots, P_n$
(where $n \ge 3$) be a distribution of $n$ points in a (closed) disc of unit area, such that
the minimum of the areas of the triangles $P_iP_jP_k$ (taken over $1 \le i < j < k \le n$)
assumes its maximum possible value $A = A(n)$.
Heilbronn conjectured that $A(n) \ll n^{-2}$ and Paul Erdős (see [1, Appendix])
showed that this result, if true, would be best possible.
The first improvement on the trivial $A(n) \ll n^{-1}$ was due to K. F. Roth, who
in 1950 proved that
(1.2) $A(n) \ll n^{-1}(\log\log n)^{-1/2}$.
There was no further improvement until about 20 years later, when Wolfgang
M. Schmidt$^1$ [2], using a different method, proved
(1.3) $A(n) \ll n^{-1}(\log n)^{-1/2}$.
(Actually Schmidt obtained a result containing an explicit constant.) In [3] Roth
proved $A(n) \ll n^{-\mu}$, where $\mu = 2 - (.8)^{1/2} = 1.105\ldots$, and in [4] refined his method
to yield
(1.4) $A(n) \ll n^{-\mu'+\varepsilon}$,
AMS 1970 subject classifications. Primary 52A40, 10K35.
1 I am indebted to Professor Schmidt for having sent me a preprint of his paper.
© 1973, American Mathematical Society
where
(1.5) $\mu' = \tfrac18(17 - (65)^{1/2}) = 1.117\ldots$ .
Schmidt's method (for proving (1.3)) involved the use of 'weighted strips'.
Although our method is also based on the use of weighted strips, both the manner
and purpose of their application are different. In Schmidt's method they feature in an
averaging argument to find many strips of a certain kind with a nonempty
intersection, the 'weights' serving to give 'preference' to wider strips; and in our method
they are used to construct systems of quasi-orthogonal functions.
The key procedure in our method (cf. Lemma 4 in §4) is to show that if 'nearly
all' the members of an appropriate system of thin strips are 'deficient' of points (1.1),
then nearly all the members of a derived system of wider strips are similarly
deficient (unless nA (n) is small, as is to be proved). Such a result ensues on applying
a modified form of Bessel's inequality to a system of quasi-orthogonal functions
(constructed from the two systems of strips, suitably weighted) with respect to a
function obtained by replacing the points (1.1) by discs.
We require a generalization of Bessel's inequality applicable to systems of
quasi-orthogonal functions. Such generalizations, which have proved very fruitful
in connection with the large sieve, are discussed in [5, Chapter 1]. The generalized
Bessel's inequality we shall use here is due to A. Selberg (see [5, p. 7]); the proof
(see [5, p. 8]) of Selberg's elegant result being at least as simple as that of weaker
inequalities of the same general nature (some of which would suffice for our
purpose).
Selberg's inequality. Let $f, \psi^{(1)}, \psi^{(2)}, \ldots, \psi^{(R)}$ be elements of an inner product
space over the complex numbers. Then
(1.6) $\sum_{r=1}^{R} |(f, \psi^{(r)})|^2 \Bigl/ \sum_{s=1}^{R} |(\psi^{(r)}, \psi^{(s)})| \le \|f\|^2$.
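As a quick sanity check of the form of (1.6), the following sketch evaluates both sides for vectors in $\mathbf{C}^d$ with the standard inner product. The specific vectors and the function names are illustrative assumptions. When the $\psi^{(r)}$ are orthonormal, the left-hand side reduces to the classical Bessel sum.

```python
def inner(f, g):
    """Standard inner product on C^d: sum of f_i * conj(g_i)."""
    return sum(a * b.conjugate() for a, b in zip(f, g))


def selberg_lhs(f, psis):
    """Left-hand side of Selberg's inequality for nonzero vectors psis."""
    total = 0.0
    for psi_r in psis:
        weight = sum(abs(inner(psi_r, psi_s)) for psi_s in psis)
        total += abs(inner(f, psi_r)) ** 2 / weight
    return total


def norm_squared(f):
    return inner(f, f).real


if __name__ == "__main__":
    f = (1, 2, 3)
    psis = [(1, 0, 0), (1, 1, 0), (0, 1j, 1)]  # not orthogonal
    print(selberg_lhs(f, psis), "<=", norm_squared(f))
```

The denominators measure how far the system is from orthogonality: the more the $\psi^{(r)}$ overlap, the more each term is discounted.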
It seems highly probable that systems of quasi-orthogonal functions
constructed from weighted strips (but not necessarily used in conjunction with Bessel's
inequality) will find other applications to problems concerning distributions of
points in Euclidean space.
2. Notation. Vectors $X$. We use $X = (x, y)$ to denote a point in the
Euclidean plane, and write $\int g(X)\,dX$ for $\iint g(x, y)\,dx\,dy$ taken over the entire
plane.
Discs $D^*$, $D$. Let
(2.1) $D^* = \{X;\ x^2 + y^2 \le (\pi^{-1/2})^2\}$
be the disc containing the set (1.1), and let
(2.2) $D = \{X;\ x^2 + y^2 \le 1\}$.
Pairs $\tau$ and $d(\tau)$, $\theta(\tau)$, $T_\tau(w)$. We use $\tau$ to denote a pair $P_i, P_j$ $(i < j)$ of points
selected from (1.1) and $d(\tau)$ to denote the distance $P_iP_j$ between the two points
constituting $\tau$. If
(2.3) $y\cos\theta - x\sin\theta = a$ $(0 \le \theta < \pi)$
is the line joining $P_i, P_j$, we use $\theta(\tau)$ to denote the inclination $\theta$ of the line (2.3);
and, for any $w > 0$, we use $T_\tau(w)$ to denote the strip
(2.4) $a - \tfrac12 w \le y\cos\theta - x\sin\theta \le a + \tfrac12 w$
of width $w$ about the line (2.3).
Characteristic functions and counting functions: in particular $D(X)$, $T_\tau(w; X)$,
$N_\tau(w)$, $M_\tau(w)$. If $\mathcal{S}$ is any subset of the plane, we use $\mathcal{S}(X)$ to denote the
characteristic function of $\mathcal{S}$ (in other words, $\mathcal{S}(X)$ is 1 or 0 according as $X$ does or
does not lie in $\mathcal{S}$); in particular, we use $D(X)$, $T_\tau(w; X)$ to denote the characteristic
functions of $D$, $T_\tau(w)$ respectively.
We use $N(\mathcal{S})$ to denote the number of points (1.1) in $\mathcal{S}$, and introduce the
abbreviation $N_\tau(w) = N(T_\tau(w))$; thus
(2.5) $N_\tau(w) = \sum_{i=1}^{n} T_\tau(w; P_i)$.
We also write
(2.6) $M_\tau(w) = w^{-1}N_\tau(w)$.
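The strip counts $N_\tau(w)$ and densities $M_\tau(w)$ are easy to compute for a concrete point set. The sketch below is only an illustration: a pair $\tau$ is represented by two indices into a list of points, and the function names are hypothetical. It uses the fact that a point lies in $T_\tau(w)$ exactly when its distance from the line through the pair is at most $\tfrac12 w$.

```python
import math


def strip_count(points, i, j, w):
    """N_tau(w): number of points within distance w/2 of the line P_i P_j."""
    (x1, y1), (x2, y2) = points[i], points[j]
    d = math.hypot(x2 - x1, y2 - y1)  # d(tau), assumed nonzero
    count = 0
    for (x, y) in points:
        # Distance from (x, y) to the line equals twice the triangle area over d.
        dist = abs((x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)) / d
        if dist <= w / 2:
            count += 1
    return count


def strip_density(points, i, j, w):
    """M_tau(w) = N_tau(w) / w."""
    return strip_count(points, i, j, w) / w


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.1), (0.5, 0.5)]
    print(strip_count(pts, 0, 1, 0.3), strip_density(pts, 0, 1, 0.3))
```

A third point in $T_\tau(w)$ forms with the pair a triangle of area at most $\tfrac14 d(\tau)w$, which is why narrow strips about short pairs can contain no further points of (1.1).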
Orthogonal functions $\phi_\tau$. For any $w' > w'' > 0$, we write
(2.7) $\phi_\tau(w', w''; X) = (1/w')\,T_\tau(w'; X) - (1/w'')\,T_\tau(w''; X)$.
These functions have the following obvious property.
Orthogonality Property. If $\phi^*$, $\phi^{**}$ are any two functions of type (2.7)
(corresponding to pairs $\tau^*$, $\tau^{**}$), then $\int \phi^*\phi^{**}\,dX$ is zero whenever it is finite.
Constants. We use $c$ as a generic symbol for a sufficiently large positive absolute
constant. The constants implicit in the $\ll$ notation depend at most on $y$ and $\varepsilon$.
3. Structure of the proof of (1.4). We say that the positive number $y$ is
admissible if
(3.1) $A(t) \ll t^{-y}$
for all integers $t \ge 3$. We suppose throughout that the integer $n$ featuring in (1.1)
is large, and reserve the symbol $A$ for the value of $A(t)$ when $t = n$. We remark that 1
is obviously admissible in view of the trivial inequality
(3.2) $A < (n-2)^{-1} < 2n^{-1}$.
Since $\mu' = \tfrac18(17 - (65)^{1/2})$ is the smaller root of the polynomial $Q(y) = 4y^2 - 17y + 14$,
and (for $y \ne 13/4$)
$(13 - 4y)^{-1}Q(y) = \{(14 - 4y)/(13 - 4y)\} - y$,
the estimate (1.4) is an immediate consequence of the following result. (As usual,
$\varepsilon > 0$ is arbitrarily small.)
Theorem. If $y$ is admissible and $1 \le y \le 1.2$, then $\{(14 - 4y)/(13 - 4y)\} - \varepsilon$ is
also admissible.
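The way the theorem yields (1.4) can be seen by iterating the map $y \mapsto (14 - 4y)/(13 - 4y)$ from the trivially admissible value $y = 1$ and ignoring the arbitrarily small $\varepsilon$. The following sketch (function names are illustrative) shows the iterates increasing to the fixed point $\mu'$, the smaller root of $Q$.

```python
import math


def improve(y):
    """One application of the theorem, with the epsilon loss ignored."""
    return (14 - 4 * y) / (13 - 4 * y)


def iterate(y0=1.0, steps=20):
    """Successive admissible exponents starting from y0."""
    ys = [y0]
    for _ in range(steps):
        ys.append(improve(ys[-1]))
    return ys


def Q(y):
    return 4 * y * y - 17 * y + 14


mu_prime = (17 - math.sqrt(65)) / 8

if __name__ == "__main__":
    for k, y in enumerate(iterate(steps=6)):
        print(k, y)
    print("mu' =", mu_prime, " Q(mu') =", Q(mu_prime))
```

Since $(14 - 4y)/(13 - 4y) - y = Q(y)/(13 - 4y)$ is positive for $1 \le y < \mu'$, each step improves the exponent, and every iterate stays inside the range $1 \le y \le 1.2$ required by the theorem.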
We remark that (in view of the values of the roots of Q(y)) we are entitled
to assume
(3.3) $A > n^{-1.2}$
when proving the theorem. We now deduce the theorem from the following two
lemmas, which are to be established in §4 and §5; but first we introduce some
further notation.
Definition 1. We denote by $B(u; \delta, w)$ the number of pairs $\tau$ for which
(cf. (2.6))
(3.4) $d(\tau) \le u$, $M_\tau(w) \ge \delta n$.
Throughout the text, $l$ denotes the (odd) integer defined by
(3.5) $l = 2[\exp\{(\log n)^{2/3}\}] + 1$.
Lemma A. Let $y$ be admissible, $u > A^{1/2}$, $\delta > uA^{-1}n^{-1}l$, and suppose that $w$
satisfies
(3.6) $Au^{-1} < w < c^{-1}$.
Then $B = B(u; \delta, w)$ satisfies
(3.7) $B \ll \delta^{-2}u^2A^{-3}\{u + A^{1-(1/y)}w^{(2/y)-1}\}\,n^{-1+\varepsilon}$.
Lemma B. Suppose that A>l8n~2 and
(3.8) A-'l2n-'<v<l-\
Then we have
(3.9) $\sum_{\tau;\,(3.10),(3.11)} \frac{1}{d(\tau)} \gg n^2v$,
where the summation is over all pairs $\tau$ satisfying
(3.10) $Anv < d(\tau) < l^3v$,
(3.11) $M_\tau(n^{-1}v^{-1}/d(\tau)) > l^{-3}n$.
Deduction of the theorem. We choose v to satisfy
(3.12) v = Al'ilM{n'lv'2Y2^'1.
In view of (3.3) the v thus defined satisfies (3.8), so that (for appropriate r', r")
Lemma B implies
(3.13) I £•» i>«2".
r = r' t; (3.14), (3.1 5) « lT/
where the conditions of summation for the inner sum $\sum^{(r)}$ are
(3.14) $l^{-r-1} < d(\tau) \le l^{-r}$,
(3.15) $M_\tau(n^{-1}v^{-1}l^{r+1}) > l^{-4}n$;
we choose the maximal $r'$ and minimal $r''$ consistent with the requirement that the
union of the intervals (3.14) should cover the interval (3.10). We note that
(3.16) $\sum^{(r)} \frac{1}{d(\tau)} \le l^{r+1}B(l^{-r}; l^{-4}, n^{-1}v^{-1}l^{r+1})$.
On estimating the right-hand side of (3.16) by Lemma A (the condition (3.3)
ensures that the appropriate premises are satisfied), we see that the inner sum on
the left-hand side of (3.13) is ^A^^ + A^^in'1^1^'^2'^ n'1 + t
Thus (3.13) (in conjunction with (3.12)) yields A3n3~E<^v, and hence the desired
inequality
A^rfn'il4'4y)fil3'4yK
4. Proof of Lemma A.
Lemma 1. Let $u > 0$, $0 \le \alpha < \pi$, $w \ge \tfrac12 Au^{-1}$. Then, for suitable $c$,
(4.1) $D(X)\sum_{\tau;\,(4.2)} T_\tau(w; X) \le cwuA^{-1}$
for every $X$ in the plane; here the summation is over all pairs $\tau$ satisfying
(4.2) $d(\tau) \le u$, $\alpha \le \theta(\tau) < \alpha + (1/10)Au^{-1}$.
Proof. Suppose (4.1) is false. Then there exists a point $X_0$ in $D$ which lies
in more than $cwuA^{-1}$ of the strips $T_\tau(w)$ with $\tau$ satisfying (4.2); it is easy to deduce
from this (since $c$ is large) that there exist two pairs $\tau_1$, $\tau_2$, each satisfying $d(\tau) \le u$,
such that
(i) $|\theta(\tau_1) - \theta(\tau_2)| < (1/10)Au^{-1}$,
(ii) the strips $T_{\tau_1}((1/10)Au^{-1})$, $T_{\tau_2}((1/10)Au^{-1})$ have a common point $X_1$ in $D$.
This is a contradiction, since (i) and (ii) (in conjunction with $d(\tau_1) \le u$) imply
that the two points of $\tau_1$ together with (either) one of the points of $\tau_2$ form a triangle
of area less than $A$.
Definition 2. We write, for $w' > w'' > 0$,
(4.3) $\Phi_\tau(w', w''; X) = D(X)\,\phi_\tau(w', w''; X)$.
Here $\phi_\tau$ is the function defined by (2.7), so that (cf. below (2.7)) the functions $\Phi$
have the following:
Orthogonality Property. For given $w' > w'' > 0$, let $\Phi_{\tau^*}$, $\Phi_{\tau^{**}}$ be two functions
of type (4.3). Then these two functions are orthogonal over the plane unless the
intersection of $T_{\tau^*}(w')$, $T_{\tau^{**}}(w')$ and the boundary $C$ of $D$ is nonempty.
In the following three lemmas we use the abbreviation
(4.4) $\Phi_\tau(X) = \Phi_\tau(w', w''; X)$.
Lemma 2. Suppose that $u > 0$, $0 \le \alpha < \pi$, $\tfrac12 Au^{-1} \le w'' < w' < c^{-1}$. Then, for any
pair $\tau_0$,
(4.5) $\sum_{\tau;\,(4.2)} \bigl|\int \Phi_{\tau_0}(X)\Phi_\tau(X)\,dX\bigr| \ll uA^{-1}Z$,
where
(4.6) $Z = \min(1, w'|\sin(\alpha - \theta(\tau_0))|^{-1})$.
Proof. Let $\Gamma_1$, $\Gamma_2$ be the two arcs in which the strip $T_{\tau_0}(w')$ intersects the
boundary $C$ of $D$; only those terms corresponding to $\tau$ in one of the sets $\mathcal{S}_\nu$ $(\nu = 1, 2)$,
consisting of the pairs $\tau$ for which $T_\tau(w')$ intersects $\Gamma_\nu$ and (4.2) holds, can
contribute to the sum on the left-hand side of (4.5).
It is easily seen that (for each $\nu = 1, 2$) all the strips $T_\tau(w')$ with $\tau \in \mathcal{S}_\nu$ intersect $D$
in regions contained in a single strip $S_\nu$ of type $a_\nu - cw' < y\cos\alpha - x\sin\alpha < a_\nu + cw'$.
Since
$|\phi_\tau(X)| \le (1/w')\,T_\tau(w'; X) + (1/w'')\,T_\tau(w''; X)$,
it follows from Lemma 1 that
$\sum_{\tau \in \mathcal{S}_\nu} |\Phi_\tau(X)| \ll uA^{-1}S_\nu(X)$,
where $S_\nu(X)$ is the characteristic function of the strip $S_\nu$. In view of the trivial
estimate $\int |\Phi_{\tau_0}(X)|\,S_\nu(X)\,dX \ll Z$, the desired inequality (4.5) follows at once.
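Lemma 3. Let $u > 0$, $\tfrac12 Au^{-1} \le w'' < w' < c^{-1}$. Then, for any pair $\tau_0$,
(4.7) $\sum_{\tau:\,d(\tau) \le u} \bigl|\int \Phi_{\tau_0}(X)\Phi_\tau(X)\,dX\bigr| \ll u^2A^{-2}w'\log(uA^{-1})$.
Proof. Write $I_k = \sum_{\tau;\,(4.8)} \bigl|\int \Phi_{\tau_0}(X)\Phi_\tau(X)\,dX\bigr|$ for $k = 1, \ldots, K$, where $K =
[10\pi uA^{-1}] + 1$ and the condition of summation is
(4.8) $d(\tau) \le u$, $\pi(k-1)K^{-1} \le \theta(\tau) < \pi kK^{-1}$.
On using Lemma 2 to estimate $I_k$, and then summing over $k$, we obtain (4.7).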
Lemma 4. Suppose $y$ is admissible and the premises of Lemma 3 are
satisfied. Then
(4.9) $\sum_{\tau:\,d(\tau) \le u;\,(4.10)} \{\tfrac12 M_\tau(\tfrac12 w') - 2M_\tau(2w'')\}^2 \ll Y$,
where
$Y = n(w'')^{-2}\{u^2A^{-2}w'\log(uA^{-1})\}\max\{1, ((w'')^2A^{-1})^{1/y}\}$
and the second condition of summation is
(4.10) $M_\tau(\tfrac12 w') > 4M_\tau(2w'')$.
Proof. For each $i = 1, 2, \ldots, n$, let $g_i(X)$ be the function which is 1 or 0
according as $X$ does or does not lie in the disc centre $P_i$ and radius $\tfrac12 w''$. We write
$f(X) = \sum_{i=1}^{n} g_i(X)$.
Clearly (since $\tfrac12 w' > 2w''$ if (4.10) holds for some $\tau$)
$\frac{1}{w'}\int T_\tau(w'; X)f(X)\,dX \ge \frac{1}{w'}\,\pi(\tfrac12 w'')^2N_\tau(\tfrac12 w')$,
$\frac{1}{w''}\int T_\tau(w''; X)f(X)\,dX \le \frac{1}{w''}\,\pi(\tfrac12 w'')^2N_\tau(2w'')$,
so that the left-hand side of (4.9) is majorized by
(4.11) $(w'')^{-4}\sum_{\tau;\,d(\tau) \le u;\,(4.10)} \bigl\{\int \Phi_\tau(X)f(X)\,dX\bigr\}^2$.
Since $f(X)$ counts the number of points $P_i$ in the disc centre $X$ and radius $\tfrac12 w''$,
we obtain from the defining property of the distribution (1.1),
$f(X) \ll \max\{1, ((w'')^2A^{-1})^{1/y}\}$.
Combining this with $\int f(X)\,dX = n\pi(\tfrac12 w'')^2$, we have
(4.12) $\int f^2(X)\,dX \ll n(w'')^2\max\{1, ((w'')^2A^{-1})^{1/y}\}$.
In view of Selberg's inequality (1.6), the desired estimate for (4.11) now follows
at once from (4.7) and (4.12).
Completion of the proof of Lemma A. We consider a fixed value $u$
satisfying $u > A^{1/2}$, and use the abbreviation $B(\delta, w) = B(u; \delta, w)$. Write $J =
[(\log n)^{1/2}]$ and, for $j = 0, 1, \ldots, J$, $w_j = w(Au^{-1}w^{-1})^{j/J}$.
For $d(\tau) \le u$, the strip $T_\tau(w_J) = T_\tau(Au^{-1})$ contains only the two points $P_i$
constituting $\tau$, so that $N_\tau(Au^{-1}) = 2$ and hence
$B(\delta', w_J) = 0$ for $\delta' > 2uA^{-1}n^{-1}$.
In particular, $B(2^{-3J}\delta, w_J) = 0$ since $\delta > uA^{-1}n^{-1}l$, so that
(4.13) $B(\delta, w) = \sum_{j=0}^{J-1}\{B(2^{-3j}\delta, w_j) - B(2^{-3(j+1)}\delta, w_{j+1})\}$.
Now the summand in (4.13) certainly does not exceed the number $R_j$ of pairs
$\tau$ (satisfying $d(\tau) \le u$) for which
$M_\tau(w_j) \ge 2^{-3j}\delta n$, $M_\tau(w_{j+1}) < 2^{-3(j+1)}\delta n$.
On applying Lemma 4 (with $w' = 2w_j$, $w'' = \tfrac12 w_{j+1}$) to estimate $R_j$, the desired
estimate (3.7) follows from (4.13).
5. Proof of Lemma B. The number $\alpha$ satisfying $0 \le \alpha < \pi$ is to remain fixed
almost to the end of this section. We use $S$ to denote a strip of type
(5.1) $a - \tfrac12 w < y\cos\alpha - x\sin\alpha \le a + \tfrac12 w$.
We use $\tau \subset S$ to express the condition that the two constituent points $P_i$ of $\tau$ lie in $S$,
and we recall that $N(S)$ denotes the number of points $P_i$ in $S$.
Lemma 5. Let $S$ be the strip (5.1) and suppose that $0 < 2w \le u < c^{-1}$. Then
(5.2) $\sum_{\tau \subset S;\,(5.3)} 1 > \tfrac14 N(S) - 3u^{-1}$,
where the second condition of summation is
(5.3) $\tfrac12 Aw^{-1} \le d(\tau) < u$.
Proof. We subdivide $S$ into rectangles by means of a system of lines
perpendicular to $S$ and distance $\tfrac12 u$ apart. More than $N(S) - 12u^{-1}$ of the points $P_i$
in $S$ fall into 'good' rectangles, each containing at least 4 points $P_i$. The pairs $\tau$ in $S$
for which $d(\tau) < \tfrac12 Aw^{-1}$ are disjoint, and we destroy these by rejecting from $S$ one
of the two constituent points from each such pair. After this operation a typical
'good' rectangle will now contain $t \ge 2$ points $P_i$, and such rectangles contain
between them at least $\tfrac12 N(S) - 6u^{-1}$ points $P_i$. On applying the inequality $\tfrac12 t(t-1)
\ge \tfrac12 t$ to the number of pairs $\tau$ in the typical rectangle of this kind, we obtain (5.2).
Construction of sets $\mathcal{A}_r$. We construct inductively sets $\mathcal{A}_0, \mathcal{A}_1, \mathcal{A}_2, \ldots$, where
(for each $r$) the set $\mathcal{A}_r$ is the union of certain strips (having width $2l^{-r}$) of type
(5.4) $(2p-1)l^{-r} < y\cos\alpha - x\sin\alpha \le (2p+1)l^{-r}$,
where $p$ denotes an integer. We use $^rS$ to denote any strip of type (5.4) and $^rS^*$ to
denote a constituent strip of $\mathcal{A}_r$.
The inductive procedure is as follows. Take $\mathcal{A}_0$ to consist of the single strip
$-1 < y\cos\alpha - x\sin\alpha \le 1$.
If, for $r \ge 1$, $\mathcal{A}_{r-1}$ has already been constructed, the set $\mathcal{A}_r$ is obtained from it by
first splitting each strip $^{r-1}S^*$ into $l$ strips $^rS$, and then selecting from the resulting
strips $^rS$ precisely those satisfying
$N({}^rS) \ge (2l)^{-r}n$
for inclusion in $\mathcal{A}_r$. Clearly each strip $^rS^*$ is representable as an intersection of type
(5.5) $^rS^* = \bigcap_{j=0}^{r} {}^jS^*$,
where
(5.6) $N({}^jS^*) \ge (2l)^{-j}n$ $(j = 0, 1, \ldots, r)$.
It is easily verified by induction that the number $N(\mathcal{A}_r)$ of points $P_i$ in $\mathcal{A}_r$
satisfies
(5.7) $N(\mathcal{A}_r) = \sum^{(r*)}_{S} N(S) \ge 2^{-r}n$,
where $\sum^{(r*)}$ extends over the constituent strips $S = {}^rS^*$ of $\mathcal{A}_r$.
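The inductive construction can be simulated directly. The sketch below is a rough illustration under stated assumptions: the points are sampled at random in $D^*$ (a real extremal configuration is not available), $l$ is a small odd integer rather than the value (3.5), and the function names are hypothetical. Each point is assigned its strip index $p$ at the finest level, ancestors are recovered by integer arithmetic (this relies on $l$ being odd), and (5.7) is then checked at every level.

```python
import math
import random
from collections import Counter


def sample_disc(n, radius, rng):
    """n points uniformly distributed in the disc of the given radius."""
    pts = []
    while len(pts) < n:
        x, y = rng.uniform(-radius, radius), rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            pts.append((x, y))
    return pts


def nested_strips(points, alpha, l, r_max):
    """Return [N(A_0), ..., N(A_{r_max})] for the construction (5.4)-(5.7)."""
    n = len(points)
    c, s = math.cos(alpha), math.sin(alpha)
    half = (l - 1) // 2
    chains = []
    for x, y in points:
        # Finest level: (2p-1) l^-r < y cos(alpha) - x sin(alpha) <= (2p+1) l^-r.
        p = math.ceil((y * c - x * s) * l ** r_max / 2 - 0.5)
        chain = [p]
        for _ in range(r_max):
            chain.append((chain[-1] + half) // l)  # index of the parent strip
        chains.append(chain[::-1])  # chain[r] is the strip index at level r
    kept = {0}  # A_0 is the single strip with p = 0
    sizes = [sum(1 for ch in chains if ch[0] == 0)]
    for r in range(1, r_max + 1):
        counts = Counter(ch[r] for ch in chains if ch[r - 1] in kept)
        kept = {p for p, k in counts.items() if k >= n / (2 * l) ** r}
        sizes.append(sum(counts[p] for p in kept))
    return sizes


if __name__ == "__main__":
    rng = random.Random(0)
    pts = sample_disc(2000, math.pi ** -0.5, rng)
    print(nested_strips(pts, alpha=0.7, l=3, r_max=4))
```

The bound (5.7) holds for any point set: at level $r$ at most $l^r$ strips are rejected, each containing fewer than $(2l)^{-r}n$ points, so fewer than $2^{-r}n$ points are lost at that step.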
Completion of Proof of Lemma B. Let $v$ satisfy (3.8). We choose the integer
$r_0$ to satisfy
(5.8) $l^{-r_0-1} < (l^2nv)^{-1} \le l^{-r_0}$.
We apply Lemma 5 with $u = l^3v$, $w = 2l^{-r_0}$ (permissible in view of (3.2), (3.8)) to
obtain
(5.9) $\sum_{\tau \subset S;\,(3.10)} 1 > \tfrac14 N(S) - 3l^{-3}v^{-1}$
for every constituent strip $S = {}^{r_0}S^*$ of $\mathcal{A}_{r_0}$. We note that every pair $\tau$ lying in such a
strip automatically satisfies
(5.10) $|\sin(\alpha - \theta(\tau))| < 2(lnvd(\tau))^{-1}$,
since the right-hand side of (5.10) exceeds $w/d(\tau)$, where $w$ is the width of the strip.
Hence, summing (5.9) over the constituent strips of $\mathcal{A}_{r_0}$, and using (5.7) in
conjunction with
$\tfrac14(2^{-r_0}n) - l^{r_0}(3l^{-3}v^{-1}) > 2l^{-1}n$
(there are at most $l^{r_0}$ strips $^{r_0}S^*$), we obtain
(5.11) $\sum^{(r_0*)}_{S}\ \sum_{\tau \subset S;\,(3.10),(5.10)} 1 > 2l^{-1}n$.
We consider a particular pair $\tau$ counted on the left-hand side of (5.11). We
choose $r_1$ to satisfy
(5.12) $l^{-r_1-1} < (l^2nvd(\tau))^{-1} \le l^{-r_1}$,
and note that $0 < r_1 < r_0$ (in view of (3.8), (3.10)). By considering the term $j = r_1$ of the
representation (5.5) of the strip $^{r_0}S^*$ containing $\tau$, we see that $\tau$ lies in a strip $S_1$, of
inclination $\alpha$ and width $2l^{-r_1} < 2(lnvd(\tau))^{-1}$, for which
(5.13) $N(S_1) \ge (2l)^{-r_1}n$.
It is easily verified that (5.10) ensures that the strip $T_\tau(n^{-1}v^{-1}/d(\tau))$ completely
covers the region $D \cap S_1$ in which $S_1$ intersects the disc $D$. Hence (5.13) implies
$N_\tau(n^{-1}v^{-1}/d(\tau)) \ge (2l)^{-r_1}n > l^{-3}(n^{-1}v^{-1}/d(\tau))n$.
Thus (5.11) implies
$\sum_{\tau;\,(3.10),(3.11),(5.10)} 1 > 2l^{-1}n$,
and on integrating this with respect to $\alpha$ over the range $0 \le \alpha < \pi$, we obtain (3.9).
References
1. K. F. Roth, On a problem of Heilbronn, J. London Math. Soc. 26(1951), 198-204, MR 13,16.
2. Wolfgang M. Schmidt, On a problem of Heilbronn, J. London Math. Soc. (2)4 (1971/72), 545-550.
3. K. F. Roth, On a problem of Heilbronn. II, Proc. London Math. Soc. (3) 25 (1972), 193-212.
4. , On a problem of Heilbronn. III, Proc. London Math. Soc. 25 (1972), 543-549.
5. Hugh L. Montgomery, Topics in multiplicative number theory, Lecture Notes in Math., vol. 227,
Springer-Verlag, Berlin and New York, 1971.
Imperial College of Science and Technology
London SW7, England
EULER PRODUCTS ASSOCIATED
WITH BEURLING'S
GENERALIZED PRIME NUMBER SYSTEMS
C. RYAVEC
One of the most fruitful methods employed in the study of the distribution of
the primes has been the consideration of those functions, such as the Riemann
zeta function, which embody the fundamental theorem of arithmetic. This method
(sometimes referred to as the "analytic method") has also been successfully applied
to the study of generalized prime number systems, an account of which may be
found in [1].
Briefly, a generalized prime number system is a sequence
$P = \{1 < p_1 \le p_2 \le \cdots\}$
of real numbers $p_k \to \infty$. The multiplicative semigroup generated by $P$ is called the
generalized integers of the system $P$, which we denote by
$N = \{1 = n_1 < n_2 \le n_3 \le \cdots\}$.
Note that two integers $n_i$ and $n_j$ of $N$, of possibly equal value, are, nevertheless, to
be distinguished if they arise as distinct products of the primes $P$.
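The convention that equal values arising from distinct products are counted separately is easy to see in a small computation. The sketch below (the function name is hypothetical) lists, with multiplicity, all generalized integers up to a bound for a finite list of generalized primes. Exact integer or `Fraction` inputs avoid rounding issues.

```python
def generalized_integers(primes, bound):
    """Sorted list, with multiplicity, of all products of the given
    generalized primes (each > 1) that do not exceed bound.

    Distinct multisets of primes give distinct entries even when the
    resulting values coincide.
    """
    if any(p <= 1 for p in primes):
        raise ValueError("generalized primes must exceed 1")
    out = []

    def extend(start, value):
        out.append(value)
        # Only use primes with index >= start, so each multiset occurs once.
        for i in range(start, len(primes)):
            nxt = value * primes[i]
            if nxt <= bound:
                extend(i, nxt)

    extend(0, 1)
    return sorted(out)


if __name__ == "__main__":
    # With generalized primes 2 and 4, the value 4 occurs twice: as 2*2 and as 4.
    print(generalized_integers([2, 4], 16))
    # The ordinary primes up to 7 reproduce 1, ..., 10 without repetition.
    print(generalized_integers([2, 3, 5, 7], 10))
```

For the system $\{2, 4\}$ the value 16 appears three times (as $2^4$, $2^2 \cdot 4$ and $4^2$), which is exactly the multiplicity with which it enters the series for the associated zeta function.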
The research in generalized number systems has involved considerable use of
the zeta functions $\zeta_P(s)$, defined by either the product or the series
$\zeta_P(s) = \prod_{k=1}^{\infty}(1 - p_k^{-s})^{-1} = \sum_{k=1}^{\infty} n_k^{-s}$
wherever the infinite product converges. (It is known that the product and series
converge on the same half-plane, possibly empty.)
AMS 1970 subject classifications. Primary 10H40, 30A14, 10H05.
© 1973, American Mathematical Society
We shall be concerned in this summary (all of this work will appear in [3])
with the relationship between $P$ and the natural boundary of $\zeta_P(s)$. We shall use
the following setting to describe this relationship.
Let $U$ denote the set of real sequences $U = \{(u_2, u_3, \ldots) : u_p > p^{-1}\}$ indexed on
the rational primes. Define a "perturbed zeta function" (perturbed by the element
$u \in U$) $\zeta(s, u)$ by
$\zeta(s, u) = \prod_p (1 - (u_pp)^{-s})^{-1}$
whenever the infinite product converges (it is known that if the product converges
at some point, then it converges on a half-plane). Further, when $\zeta(s, u)$ does
converge at a point, let $\partial(u) = \partial\zeta(s, u)$ denote the natural boundary of $\zeta(s, u)$.
It would be desirable to know completely the influence on $\partial(u)$ of variations of $u$.
Although not much in this direction is known, certain special cases can be
completely settled (Theorem 1). Also, a number of classes of perturbations (Theorem 3)
can be dealt with reasonably well.
In Theorem 1 we characterize the Riemann zeta function essentially in terms
of the functional equation and the assumption $\partial(u) = \{1, \infty\}$, where the singularity
at $s = 1$ is a simple pole with residue 1.
Theorem 1. Suppose that $\zeta(s, u)$ converges for $\mathrm{Re}(s) > 1$ and that $\zeta(s, u)$ can
be continued to $\mathbf{C} - \{1\}$ as
$\zeta(s, u) = E(s)/(s - 1)$, $E(1) = 1$,
where $E(s)$ is an entire function of finite order. Then, if $\zeta(s, u)$ satisfies the functional
equation
$\pi^{-s/2}\Gamma(s/2)\,\zeta(s, u) = \pi^{-(1-s)/2}\Gamma((1-s)/2)\,\zeta(1-s, u)$,
we have $\zeta(s, u) = \zeta(s)$, the Riemann zeta function.
Thus, we observe that when $\zeta(s, u)$ satisfies the hypotheses of Theorem 1, the
factors $\{u_p\}$ simply permute the primes among themselves; e.g., $u_2 = 3/2$, $u_3 = 2/3$,
$u_5 = u_7 = \cdots = 1$. Theorem 1 follows directly from the following result concerning
generalized Dirichlet series.
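Before turning to that result, the permutation example can be checked by hand or by machine. The sketch below uses illustrative, hypothetical function names. It forms the perturbed primes $u_pp$ for $u_2 = 3/2$, $u_3 = 2/3$ and $u_p = 1$ otherwise, confirms that they are the ordinary primes in a different order, and compares a truncated Euler product at $s = 2$ with $\zeta(2) = \pi^2/6$.

```python
import math
from fractions import Fraction


def small_primes(limit):
    """All primes p <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            flags[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [p for p in range(2, limit + 1) if flags[p]]


def perturbed_primes(primes, u):
    """The numbers u_p * p; u maps a prime to its factor, defaulting to 1."""
    return [Fraction(u.get(p, 1)) * p for p in primes]


def partial_euler_product(values, s):
    """Truncated product of (1 - v**-s)**-1 over the given values v > 1."""
    prod = 1.0
    for v in values:
        prod /= 1.0 - float(v) ** (-s)
    return prod


if __name__ == "__main__":
    primes = small_primes(10 ** 4)
    u = {2: Fraction(3, 2), 3: Fraction(2, 3)}
    shifted = perturbed_primes(primes, u)
    print(shifted[:5])                       # 3, 2, 5, 7, 11
    print(partial_euler_product(shifted, 2), math.pi ** 2 / 6)
```

Each $u_p$ here satisfies $u_p > p^{-1}$, so $u$ belongs to $U$, and because the factors of the product are merely reordered, $\zeta(s, u)$ coincides with $\zeta(s)$.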
Theorem 2. Let $f(s) = \sum_{k=1}^{\infty} a_k\lambda_k^{-s}$, $a_k \ge 0$, be a general Dirichlet series which
converges at some point of $\mathbf{C}$ and which can be continued to $\mathbf{C} - \{1\}$ as
$f(s) = E(s)/(s - 1)$, where $E(s)$ is an entire function of finite order, and where $E(1) = 1$
and $E(0) = \tfrac12$. Let $g(s) = \sum_{j=1}^{\infty} b_j\nu_j^{-s}$ be a general Dirichlet series which converges
absolutely at $s = 2$. Finally, assume that $f(s)$ and $g(s)$ satisfy
$\pi^{-s/2}\Gamma(s/2)f(s) = \pi^{-(1-s)/2}\Gamma((1-s)/2)g(1-s)$.
Then $f(s) = g(s) = \zeta(s)$.
There are a number of papers which have dealt with problems similar to
Theorem 2. The closest such result can be found as Corollary 4 on p. 295 of [2]. This
result cannot be directly compared to Theorem 2, however, since it does not require
the hypothesis $a_k \ge 0$. It does require lacunarity conditions not present in Theorem
2, though.$^1$
Comparing Theorem 1 and Theorem 2, it is seen that the assumption of Riemann's
functional equation is essential to the conclusion of the latter theorem,
but it is not at all clear that this is the case in Theorem 1. It may be that the existence
of the Euler product makes both the assumption of the functional equation and
the growth condition on E(s) unnecessary; i.e.
Conjecture. Let $\zeta(s, u)$ converge for $\mathrm{Re}(s) > 1$ and satisfy
$\zeta(s, u) = E(s)/(s - 1)$, $E(1) = 1$,
where $E(s)$ is entire. Then $\zeta(s, u) = \zeta(s)$.
We now describe a class of perturbations of $\zeta(s)$ and attempt to determine the
extent to which $\zeta(s, u)$ can be analytically continued across its abscissa of
convergence. Thus let $q$ denote a large prime. For each residue class $h \pmod q$ choose
a real number $r_h$; and for each $p \equiv h \pmod q$, put $u_p = r_h$, subject only to the
restriction $u_pp > 1$ for all primes $p$ (the obvious restrictions then hold on $r_h$, $1 \le h \le q$).
With such a choice of $u \in U$, we have
Theorem 3. Define a set $E_q(u)$ by
$$E_q(u) = \bigcup_{\rho(\chi),\,n} \{\, s : s = [\ldots]/n;\ 0 < x \le 1 \,\} \cup (0,1],$$
where the first union is over all positive integers $n$ and over all zeros $\rho(\chi) = \beta(\chi)
+ i\gamma(\chi)$, $\beta(\chi) > 0$, of all $L(s,\chi) \pmod q$. Then with $u$ described above, $\zeta(s,u)$ is analytic
on the domain
$$D_q(u) = \{s : \mathrm{Re}(s) > 0\} - E_q(u).$$
¹ We mention that for any general Dirichlet series $f(s) = \sum a_k \lambda_k^{-s}$, the condition
$1 \le \lambda_1 \le \lambda_2 \le \cdots$, $\lambda_k \to \infty$, is always in force.
Theorem 3 implies that if one perturbs the rational primes in finitely many
residue classes, then the resulting Euler product, $\zeta(s,u)$, can be continued to the
half-plane $\mathrm{Re}(s) > 0$, with the exception of countably many lines running from the zeros
of all L-functions to the imaginary axis.
There does not exist at this time any method which can be used to deal with the
general perturbation problem. The determination of $d(u)$ for functions like
$\prod_{p>2}(1 - (p-1)^{-s})^{-1}$ (here, $u_p = 1 - p^{-1}$, $p > 2$) will probably be very difficult.
An example of H. Diamond shows that there exists a $u \in U$ for which
$d(u) = \{s : \mathrm{Re}(s) = 1\}$ and such that $u_p \to 1$ as $p \to \infty$. This example gives some
support to the notion that only special functions $\zeta(s,u)$ can be continued across
their abscissae of convergence; for almost all $\zeta(s,u)$ the line of convergence
coincides with the natural boundary.
References
1. P. T. Bateman and H. G. Diamond, Asymptotic distribution of Beurling's generalized prime
numbers, Studies in Number Theory, Math. Assoc. Amer.; distributed by Prentice-Hall, Englewood
Cliffs, N. J., 1969, pp. 152-210. MR 39 #4105.
2. K. Chandrasekharan and S. Mandelbrojt, On Riemann's functional equation, Ann. of Math.
(2) 66 (1957), 285-296. MR 19, 635.
3. C. Ryavec, The analytic continuation of Euler products with applications to asymptotic formulae,
Illinois J. Math. (to appear).
University of Colorado
SYSTEMATIC EXAMINATION OF
LITTLEWOOD'S BOUNDS ON $L(1,\chi)$
DANIEL SHANKS
1. Introduction. This investigation was largely conducted in close collaboration
with D. H. and Emma Lehmer. My joint paper with them [1] overlaps somewhat
with the present paper, but each paper also treats topics not in the other, and to
minimize duplication the papers refer to each other for those aspects of the
problem.
We confine ourselves to the real characters $\chi_d = (d/n)$ and examine the functions
(1) $L(s,\chi_d) = \sum_{n=1}^{\infty} (d/n)\,n^{-s} = \prod_{q} \{1 - (d/q)\,q^{-s}\}^{-1}$
for $s = 1$. If $L(s,\chi_d)$ satisfies the Riemann hypothesis, and $d \ne m^2$, then Littlewood
[2] deduces the bounds
(2) $[\{1+o(1)\}\,(12e^{\gamma}/\pi^2)\ln\ln|d|]^{-1} < L(1,\chi_d) < \{1+o(1)\}\,2e^{\gamma}\ln\ln|d|.$
He gives nothing about the $o(1)$ here, neither its sign nor the manner in which it
approaches zero as a function of $d$.
We wish to study the possibility of approaching these bounds or, perhaps,
surpassing them, and to obtain a measure for this we temporarily ignore the $o(1)$
and define the upper and lower Littlewood indices by
(3) $L(1,\chi_d)/(2e^{\gamma}\ln\ln|d|) = \mathrm{ULI}, \qquad L(1,\chi_d)\,(12/\pi^2)\,e^{\gamma}\ln\ln|d| = \mathrm{LLI}.$
We will examine, systematically, the possibility of finding $d$ with
(4) $\mathrm{ULI} \ge 1$ or $\mathrm{LLI} \le 1$.
AMS 1970 subject classifications. Primary 12A25, 12A70.
© 1973, American Mathematical Society
Littlewood himself [2], followed by Chowla [3], got halfway there by
constructing arbitrarily large $|d|$ having
(5) $\mathrm{ULI} \ge \tfrac12(1-\varepsilon)$ or $\mathrm{LLI} \le 2(1+\varepsilon)$
for any positive $\varepsilon$. Relative to these constructions (called LC in the following)
the question now is whether we can attain the extra factor of 2. If LC obtains a
certain large (or small) $L(1,\chi_d)$ for a discriminant $D$, then we would have to obtain
a comparable $L(1,\chi_d)$ with
(6) $\ln\ln|d| = \tfrac12\ln\ln|D|$ or $|d| = \exp((\ln|D|)^{1/2})$.
Thus, if their $D = 10^{450}$, our $d$ must be the much smaller $d \approx 10^{14}$.
The first step of LC in obtaining a large (or small) $L(1,\chi_d)$ is to select $D$ such that
(7) $(D/q) = +1$ (or $(D/q) = -1$)
for all primes $q \le$ some $p$. That maximizes (or minimizes) the first $\pi(p)$ factors in
the Euler product in (1) for $s = 1$. There are such $D$ by the Chinese Remainder
Theorem satisfying
(8) $D < 4\prod_{q=2}^{p} q = U_p.$
The bound on the right, $U_p$, and some further construction then yields (5). But $U_p$
is surely grossly too large since there are, in fact,
(9) $S_p = \prod_{q=3}^{p} (q-1)/2$
distinct solutions $D$ of (7), all being less than $U_p$.
If one could identify the smallest of these $D$ by some algebraic or analytic
technique, one could seek to improve (5) with these smallest $D$. Since no such
technique is known, we will compute the smallest $d$ numerically and begin our
study with four introductory examples of (3) so computed.
2. Four examples and their computation. In (10a-d) below, we list four $d$,
each being the smallest discriminant having a prescribed quadratic character.
The characters are designated as follows: aRp (aNp) means a positive $d \ne m^2$ of
the form $8k + a$ which is a quadratic residue (nonresidue) of all odd primes $q \le p$.
Similarly, -aRp (-aNp) is such a negative $d = -(8k + a)$. For each $d$ in (10a-d)
we give the class number $h(d)$ of $Q(d^{1/2})$ and, for $d > 0$, the regulator $\ln\varepsilon$. Then
$L(1,\chi)$ equals
$$2h(d)\ln\varepsilon/d^{1/2} \quad\text{or}\quad \pi h(d)/(-d)^{1/2}$$
for $d > 0$ or $d < 0$, and the indices are computed by (3).
(10a) d = 1R139 = 2871842842801 (prime), h(d) = 1,
      ln ε = 7023729.36, L(1, χ) = 8.28929, ULI = 0.6933.
(10b) d = 5N139 = 49107823133 (prime), h(d) = 1,
      ln ε = 18804.68, L(1, χ) = 0.16972, LLI = 1.1773.
(10c) d = -7R157 = -47375970146951 (composite), h(d) = 19213042,
      L(1, χ) = 8.76934, ULI = 0.7136.
(10d) d = -3N181 = -30059924764123 (prime), h(d) = 296475,
      L(1, χ) = 0.16988, LLI = 1.2637.
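The arithmetic behind (10a-d) can be rechecked with a minimal Python sketch. It takes the printed $h(d)$ and $\ln\varepsilon$ as given, applies the two class-number formulas above, and then evaluates the indices of (3). The function names and the hard-coded value of Euler's constant are assumptions introduced here for illustration only; the formula for $d < 0$ is used as quoted, which presumes $d < -4$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant (the gamma in e^gamma)


def l_one(d, h, log_eps=0.0):
    """L(1, chi_d) from the class-number formulas quoted in Section 2."""
    if d > 0:
        return 2.0 * h * log_eps / math.sqrt(d)
    return math.pi * h / math.sqrt(-d)  # as quoted; assumes d < -4


def uli(d, l1):
    """Upper Littlewood index of (3)."""
    return l1 / (2.0 * math.exp(EULER_GAMMA) * math.log(math.log(abs(d))))


def lli(d, l1):
    """Lower Littlewood index of (3)."""
    return l1 * (12.0 / math.pi ** 2) * math.exp(EULER_GAMMA) * math.log(math.log(abs(d)))


# (d, h(d), ln eps) as printed in (10a)-(10d); ln eps is unused for d < 0.
examples = {
    "10a": (2871842842801, 1, 7023729.36),
    "10b": (49107823133, 1, 18804.68),
    "10c": (-47375970146951, 19213042, 0.0),
    "10d": (-30059924764123, 296475, 0.0),
}

for label, (d, h, log_eps) in examples.items():
    l1 = l_one(d, h, log_eps)
    print(label, round(l1, 5), round(uli(d, l1), 4), round(lli(d, l1), 4))
```

For each example only one of the two printed indices is the relevant one: the ULI for the residue characters (10a), (10c) and the LLI for the nonresidue characters (10b), (10d).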
These four (first solution) $d$ are clearly much stronger than the LC
constructions $D$ that yield (5). The example (10b) is especially strong; it nearly attains (4).
The first -3N181 is not quite that strong, but if it had a class number of, say, 230000
instead of its listed $h(d)$, it could well be a violation of the RH, subject to
investigation of its factor $\{1+o(1)\}$.
A brief word about computation. These four $d$, and most of those that follow,
were obtained with Lehmer's delay line sieve DLS-157 [4]. This is a specialized
computer that determines solutions $N$ of the system of congruences:
$$N \equiv a_q \pmod q \qquad (q = 2, 3, 5, \ldots, 157).$$
If it had not been available, the computation of, say, the first -3N181 above on a
commercial computer would be incredibly time-consuming and expensive; in a
word, impractical. Again, the classical algorithms for computing $h(d)$ and $\varepsilon$ are far
too slow for the huge regulator in (10a) and $h(d)$ in (10c), and it was necessary to
devise new algorithms for computing $h(d)$ [5] and $\ln\varepsilon$ [6] that are far more
efficient. Suffice it to say that without Lehmer's DLS-157 and without these two
new algorithms much of the data that follows would have been almost impossible
to obtain.
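For small $p$ the sieve problem can be imitated in software. The sketch below is a brute-force stand-in and not a model of the DLS-157: it steps through $d = \pm(8k + a)$, skips perfect squares, and tests each odd prime $q \le p$ with Euler's criterion. The names `first_solution`, `legendre`, `odd_primes_up_to` and the search limit are hypothetical, chosen here for illustration.

```python
from math import isqrt


def odd_primes_up_to(p):
    """Odd primes q <= p by trial division (adequate for small p)."""
    return [q for q in range(3, p + 1, 2)
            if all(q % r for r in range(3, isqrt(q) + 1, 2))]


def legendre(d, q):
    """Legendre symbol (d/q) for an odd prime q, via Euler's criterion."""
    r = pow(d % q, (q - 1) // 2, q)
    return -1 if r == q - 1 else r


def first_solution(a, p, residue=True, limit=10 ** 6):
    """Smallest |d| with d = +-(8k + |a|), d not a square, and (d/q) = +1
    (residue=True) or -1 (residue=False) for every odd prime q <= p."""
    want = 1 if residue else -1
    sign = 1 if a > 0 else -1
    primes = odd_primes_up_to(p)
    for n in range(abs(a), limit, 8):
        d = sign * n
        if d > 0 and isqrt(d) ** 2 == d:
            continue  # perfect squares are excluded (d != m^2)
        if all(legendre(d, q) == want for q in primes):
            return d
    return None  # nothing found below the search limit


print([first_solution(1, p) for p in (3, 5, 7, 11)])
print(first_solution(-3, 17, residue=False))
```

The first line prints the pseudosquares 1R3, 1R5, 1R7, 1R11, and the second prints the first -3N17. A plain loop of this kind becomes hopeless long before $p = 139$, which is the point of the remark about commercial computers above.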
3. Even discriminants. Presently, we will study the variations in the ULI and
LLI for all such first solutions of 1Rp, 5Np, etc., as $p$ is systematically increased:
$p = 3, 5, 7, 11, \ldots$. But these four characters all have odd $d$ and it is desirable to
gather more data by examining even $d$ also.
For any $N \ne -k^2$ we write
(11) $L_N(s) = \sum_{m=1}^{\infty} (-4N \mid m)\, m^{-s}$
for the even $d = -4N$. All even terms $m = 2r$ in (11) vanish. Correspondingly, the
leading (and strongest) factor in the Euler product in (1) is now lost since $(d \mid q) = 0$
for $q = 2$. Using Littlewood's analysis for $d = -4N$, everything goes as before
except at the very end when these leading factors of 2 or 2/3 drop off. One therefore
has, instead of Littlewood's (2), the stronger result:
(12) $[\{1+o(1)\}\,(8e^{\gamma}/\pi^2)\ln\ln|4N|]^{-1} < L_N(1) < \{1+o(1)\}\,e^{\gamma}\ln\ln|4N|.$
For even $d$ we therefore modify (3) and define the indices by
(13) $L_N(1)/(e^{\gamma}\ln\ln|4N|) = \mathrm{ULI}, \qquad L_N(1)\,(8/\pi^2)\,e^{\gamma}\ln\ln|4N| = \mathrm{LLI}.$
The bounds (12) are valid for every $N \ne -k^2$, not merely for fundamental
discriminants. Consider
$$-3R167 = -29772062022491 = -N.$$
One has
$$(-N \mid q) = +1 \text{ for } q = 3 \text{ to } 167 \quad\text{and}\quad (-N \mid q) = -1 \text{ for } q = 2.$$
With a discriminant $-4N$, for this $N$, we can "neutralize" the "wrong" character
with respect to $q = 2$, and (12) then holds for its $L_N(1)$.
In (14a-d) we list four examples analogous to (10a-d). Each has a wrong
character for $q = 2$ that is neutralized with a factor of 4. Their indices are now
computed by (13) and are seen to be comparable to those in (10a-d). In effect, we
simply ignore $q = 2$ by this device and study only the sequence of $(d \mid q)$ for
$q = 3, 5, \ldots$.
(14a) d = 4(-3R167) = -4·29772062022491,
      L_N(1) = 4.54327, ULI = 0.7333.
(14b) d = 4(-7N167) = -4·17382121592383,
      L_N(1) = 0.27109, LLI = 1.3548.
(14c) d = 4(5R163) = 4·4745628949021,
      L_N(1) = 4.30219, ULI = 0.7063.
(14d) d = 4(1N167) = 4·11571384229697,
      L_N(1) = 0.26008, LLI = 1.2950.
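The even-discriminant indices can be recomputed in the same way. The sketch below evaluates (13) for two of the examples and also illustrates the "neutralizing" step under one stated assumption: that passing from the odd discriminant $-N$ to $-4N$ simply removes the Euler factor at $q = 2$, so that $L_N(1) = L(1,\chi_{-N})\,(1 - (-N \mid 2)/2)$. That identity is standard for imprimitive characters but is not written out in the text; the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def uli_even(n, l_n):
    """ULI of (13) for the even discriminant d = -4N (or 4M, with n = M)."""
    return l_n / (math.exp(EULER_GAMMA) * math.log(math.log(abs(4 * n))))


def lli_even(n, l_n):
    """LLI of (13) for the even discriminant d = -4N (or 4M, with n = M)."""
    return l_n * (8.0 / math.pi ** 2) * math.exp(EULER_GAMMA) * math.log(math.log(abs(4 * n)))


def neutralize(l_odd, chi_at_2):
    """Assumed relation: drop the Euler factor at q = 2 from L(1, chi)."""
    return l_odd * (1.0 - chi_at_2 / 2.0)


print(round(uli_even(29772062022491, 4.54327), 4))   # example (14a)
print(round(lli_even(17382121592383, 0.27109), 4))   # example (14b)

# d = -163: h = 1 and (-163 | 2) = -1, so L(1, chi) = pi / sqrt(163).
print(round(neutralize(math.pi / math.sqrt(163), -1), 5))
```

The last value can be compared with the entry for $N = 163$ in Table 3.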
4. Systematic examination of the LLI. In Table 1 we list the indices LLI for
the smallest $d$ having the character -3Np, 5Np, 4(-7Np) and 4(1Np) for $p = 3, 5,
7, \ldots$. (The LLI of the examples above are found in Table 1 in the appropriate
rows and columns.) The discriminants $d$ themselves, their $h(d)$ and $L(1,\chi_d)$, are
not given in Table 1 but can be found in the tables in [1] and [7]. This is what we
observe in Table 1:
(a) All LLI listed are far stronger for these smallest $d$ than for the LC
construction in (5).
(b) If we set aside the smaller $d$, those for $p < 50$, we see a certain uniformity
here; the LLI are essentially equal, on the average, for all four characters, and
appear to remain stable, on the average (or change only very slowly), as $p$ increases.
(c) For these $50 < p \le 181$, the average LLI is about 1 1/3 and the fluctuations
take us up to 1.528 for the weak 4(1N83) and down to 1.177 for the very strong
example (10b).
(d) The $d$ = -3Np for $p = 17$ thru 37 is the famous $-163$ and its startling
LLI = 0.8675 would imply that $\sum (-163 \mid n)\,n^{-s}$ violates the Riemann hypothesis
were it not for its factor $\{1+o(1)\}$. For the present, we will assume that this factor
saves the day (since 163 is quite small) but we must return to this $\{1+o(1)\}$ problem
later. Similarly, the LLI shown for the even smaller $d = -28$ = 4(-7N5) and
$d = 68$ = 4(1N11) are (temporarily) discounted.
(e) With this dubious $d = -163$ excepted, we see no indications here for
violations of the RH. We are making a real effort here to obtain cases of LLI < 1
but they do not appear (for large $d$); the strongest examples such as 5N139 press
towards the bound, but do not cross it.
5. Systematic examination of the ULI. In Table 2 we list the ULI for the
characters 1Rp ($\ne m^2$), -7Rp, 4(5Rp), and 4(-3Rp). The ULI behave quite differently
from the LLI.
(a) For $p < 13$, the ULI can even be weaker than (5) but they increase rapidly
with $p$ and become distinctly stronger.
(b) Quite unlike point (b) of § 4, the growth of the ULI is very obvious, as are
the differences among the four characters, especially the outer two.
Table 1. LLI for first discriminant of the character.

   p     -3Np     5Np    4(-7Np)  4(1Np)
   3    1.6855   0.4436   1.0317   1.0560
   5    1.3744   1.6125   1.0317   1.0560
   7    1.3744   1.3880   1.8407   1.0560
  11    1.1937   1.3880   1.6717   1.0560
  13    1.1937   1.2467   1.4888   1.7017
  17    0.8675   1.1377   1.4565   1.5780
  19    0.8675   1.2470   1.1671   1.5108
  23    0.8675   1.2470   1.6268   1.4011
  29    0.8675   1.2908   1.4350   1.4011
  31    0.8675   1.3876   1.3874   1.1893
  37    0.8675   1.3876   1.4031   1.1893
  41    1.3002   1.3876   1.4031   1.4815
  43    1.3002   1.3249   1.3838   1.5256
  47    1.2315   1.3593   1.3838   1.3750
  53    1.2617   1.3593   1.2898   1.4138
  59    1.2617   1.3593   1.2898   1.4194
  61    1.3058   1.1855   1.2898   1.3409
  67    1.3944   1.3144   1.2607   1.3409
  71    1.3269   1.4284   1.2607   1.3042
  73    1.3423   1.4220   1.2607   1.3042
  79    1.3423   1.4220   1.3514   1.2411
  83    1.2869   1.3633   1.2979   1.5283
  89    1.2832   1.3633   1.2979   1.4297
  97    1.2832   1.3633   1.2979   1.4297
 101    1.2832   1.2210   1.4066   1.3877
 103    1.2832   1.2210   1.3432   1.3877
 107    1.2974   1.2809   1.3454   1.3877
 109    1.3182   1.2809   1.3303   1.3877
 113    1.3182   1.2809   1.3303   1.3877
 127    1.2422   1.2243   1.3303   1.4173
 131    1.3604   1.2176   1.3248   1.3541
 137    1.3604   1.1773   1.3248   1.3541
 139    1.3114   1.1773   1.3130   1.3279
 149    1.3422   1.2393   1.3555   1.3010
 151    1.3422   1.2393   1.3555   1.3010
 157    1.3422   1.2393   1.3555   1.3343
 163    1.3422   1.2393   1.3555   1.3343
 167    1.3223   1.3433   1.3548   1.2950
 173    1.2789   1.2846
 179    1.2637
 181    1.2637

Average LLI
for p > 50  1.3096   1.2899   1.3218   1.3629
[Table 2. ULI for the first discriminant of the characters 1Rp, -7Rp, 4(5Rp) and 4(-3Rp). The entries are illegible in the source.]
[Figure 1. ULI plotted against p (p = 5 to 167) for the first 1Rp and the first 4(-3Rp).]
In Figure 1 we show this difference graphically. The ULI for 1Rp (the so-called
"pseudosquares") start very low, increase rapidly and smoothly with $p$, and only
become ragged as $p$ exceeds 100 and ULI approaches 0.7. Those for 4(-3Rp)
start much higher, increase slowly and exhibit much greater fluctuations. The two
intermediate characters, not shown in Figure 1, behave intermediately; they start
at an intermediate level, increase at an intermediate rate, and have an intermediate
amount of raggedness.
A qualitative explanation of this behavior is based upon the relation of these
characters to the perfect squares, that is, the principal characters. All squares not divisible
by any prime $\le p$ are 1Rp. For 1Rp, the $S_p$ solutions (9) will therefore include not
only the pseudosquares, 1Rp ($\ne m^2$), but also many perfect squares. Thus, the first
pseudosquare will appear very late, especially for smaller $p$. Thus 1R3 = 73 > $U_3$
= 24, 1R5 = 241 > $U_5$ = 120, 1R7 = 1009 > $U_7$ = 840; and while 1R11 = 2641 < $U_{11}$,
it is larger than the first 11 solutions: $1^2, 13^2, 17^2, \ldots, 47^2$. For 1Rp, $\ln\ln d$ is
therefore correspondingly large and ULI is correspondingly small. As $p$ increases, this
competition with the perfect squares slowly decreases.
The sets of $S_p$ solutions for -7Rp and for 5Rp are obtained from that for 1Rp
by, respectively, the sets $\{1Rp - U_p\}$ and $\{1Rp \pm \tfrac12 U_p\}$ and so are not distributed
uniformly in $U_p$ but are both biased towards the second half of $U_p$ as a reflection
of the many small squares in 1Rp. Their first solutions are therefore also delayed
([7, p. 435], [1]) but this effect diminishes with increasing $p$ more rapidly than the
corresponding effect for 1Rp. Finally, -3Rp differs from a square in two ways,
being both negative and wrong for $q = 2$. Its delay is therefore relatively small and
is relatively quickly dissipated with increasing $p$. These differences are also reflected
in the fact that while 1R151 and -3R173 are nearly the same size, the second is a
valid solution for four extra values of $q$: 157, 163, 167, 173.
For large $p$, and therefore large $d$, these strong effects of the perfect squares will
dissipate as the squares become less dense. Thus, we can anticipate that the
differences noted, caused by differing relations to the principal characters, will largely
disappear. For $p$, say $\approx 300$-$400$, one would expect a common average ULI of
about 3/4 and sizable fluctuations around this average. In a word, we can expect that
the ULI will then be a mirror-image of the LLI and that the different behaviors
noted in §4(b) and §5(b) will vanish.
6. Conclusions from this first experiment. Setting aside the two complications,
the $\{1+o(1)\}$ factor and the strong effect of the squares just discussed, the indices
for the first solution $d$ behave fairly uniformly; they are consistently stronger than
those of LC (5) but show no sign of ever violating the indicated bounds. For very
large $p$ and $d$, far beyond our data, it is likely that the observed average LLI $\approx 4/3$
and anticipated ULI $\approx 3/4$ will very slowly deteriorate and sink back towards the LC
values. The LC bound on $D$ is actually greater than the $U_p$ of (8); it is [2, p. 369]
(15) $|D| < p^4 U_p.$
On the average, our first solution should be the much smaller:
(16) $|d| \approx U_p/S_p \approx 2^{\pi(p)}\, 2e^{\gamma}\ln p.$
But the ratio
(17) $\ln\ln|d| / \ln\ln|D|$
for (15) and (16) nonetheless very slowly increases to 1. It is likely that the
fluctuations in the indices around these deteriorating averages will simultaneously slowly
increase and that $d$ with strong indices will therefore continue to appear.
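How slowly the ratio (17) moves can be seen from a short Python sketch that evaluates it with $|D| = p^4 U_p$ from (15) and $|d| = U_p/S_p$ from (16). Two assumptions are made here and should be read as such: $U_p$ is taken as 4 times the product of all primes up to $p$, as in (8), and $S_p$ is taken to be the product of $(q-1)/2$ over the odd primes $q \le p$, which is the count of admissible residue classes and is consistent with the asymptotic form in (16). The function names are hypothetical.

```python
import math
from math import isqrt


def primes_up_to(p):
    """All primes <= p by trial division (adequate for small p)."""
    return [q for q in range(2, p + 1)
            if all(q % r for r in range(2, isqrt(q) + 1))]


def u_p(p):
    """U_p = 4 * (product of all primes q <= p), as in (8)."""
    return 4 * math.prod(primes_up_to(p))


def s_p(p):
    """Assumed S_p: product of (q - 1)/2 over odd primes q <= p."""
    return math.prod((q - 1) // 2 for q in primes_up_to(p) if q > 2)


def ratio_17(p):
    """ln ln|d| / ln ln|D| with |d| = U_p/S_p (16) and |D| = p^4 U_p (15)."""
    d = u_p(p) / s_p(p)
    big_d = p ** 4 * u_p(p)
    return math.log(math.log(d)) / math.log(math.log(big_d))


for p in (3, 7, 31, 181):
    print(p, u_p(p), round(ratio_17(p), 4))
```

Over the whole range of the tables the ratio stays well below 1, which is in keeping with the remark that the deterioration towards the LC values should be very slow.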
7. Lochamps and hichamps. The first solutions of (7) do not necessarily have
the strongest indices. They do have minimal values of $\ln\ln|d|$ but their $L(1,\chi)$
need not be the most extreme since the character $(d \mid q)$ has only been forced
thru $q = p$ and floats freely for subsequent $q$. Since we seek to approach or pass the
bounds (2) and (12), we will therefore seek (to a limited extent) to locate the
strongest possible examples.
Suppose $N > 0$, $d = -4N$ in (11). If
(18) $L_N(1) < L_n(1)$ (all $0 < n < N$),
we say $L_N(1)$ is a lochamp. If
(19) $L_N(1) > L_n(1)$ (all $0 < n < N$),
we say $L_N(1)$ is a hichamp. Similarly, there will be a sequence of lochamps and
hichamps for positive discriminants $d = 4M$, $M > 0$. We include odd discriminants
$-N$ in the tables by the use of their multiples $d = -4N$, and $L_N(1)$ instead of
$L(1,\chi)$, in order to obtain a uniform sequence. It is clear that no indices can be
stronger than those for these champions, and if any indices approach or pass the
bounds we would find them here.
Table 3 shows the sequence of negative discriminant lochamps thru $N \le 50000$.
Each $L_N(1)$ there thru $L_{47338}(1)$ satisfies (18). But for $N > 50000$ it was not possible
to examine every $N$ and below the heavy line in Table 3 the $L_N(1)$ shown are
merely tentative, that is, they are smaller than any $L_n(1)$, $0 < n < N$, that has come
to my attention. For the positive discriminant lochamps in Table 4 the heavy line
represents $M = 2000$. The entries in these tables come from several sources
including calculations of the Lehmers, of myself, and from an unpublished table of
$L_N(1)$, $-2000 < N < 50000$, due to Mohan Lal.
Prior to the $N = 163$ in Table 3 we see the well-known, very strong $N = 58$, and
following 163 no smaller $L_N(1)$ appears until $N = 4687$. Only at $N = 30493$ does an
appreciably smaller $L_N(1)$ develop. The case $N = 991027$, with $h(-N) = 63$, was
discovered by the Lehmers and is exceptionally strong. In Table 4 we find another
case of LLI < 1 at $d = 4\cdot 398$. (Everyone knows of $Q((-163)^{1/2})$ but almost no one
knew that $Q(398^{1/2})$ was nearly as strong.)

Table 3. Lochamps, -4N = Discriminant.

            N    L_N(1)     LLI
            7   0.59371   1.0317
           37   0.51647   1.1996
           58   0.41251   1.0094
          163   0.36910   0.8675
         4687   0.36711   1.2117
        30178   0.36169   1.2844
        30493   0.34182   1.2142
        47338   0.33210   1.1974
 ================================
       222643   0.32957   1.1946
       546067   0.32523   1.2119
       991027   0.29822   1.1302
    393292183   0.29449   1.2979
    481022602   0.28577   1.2634
   1970364883   0.28398   1.2560
   2426489587   0.27982   1.2415
   3416131987   0.27227   1.2142
   8864190043   0.26983   1.2198
  71837718283   0.26731   1.2422
  85702502803   0.26172   1.2188
 569078186623   0.25346   1.2252

Table 4. Lochamps, 4M = Discriminant.

             M   L_{-M}(1)    LLI
             2    0.62323   0.6587
            17    0.50804   1.0560
           167    0.45014   1.2168
           227    0.40578   1.1239
           362    0.38245   1.0959
           398    0.33494   0.9660
 ==================================
        679733    0.33492   1.2550
       2004917    0.30698   1.1855
      41941577    0.29228   1.2411
      77891897    0.28949   1.2426
     261153673    0.28533   1.2210
    9447241877    0.27058   1.2243
   19553206613    0.26644   1.2176
   49107823133    0.25457   1.1773
 4813372912697    0.25094   1.2392

[Table 5 and the opening rows of Table 6 (hichamps) are illegible in the source. The legible concluding rows for positive discriminants follow.]

Table 6 (concluding rows). Hichamps, 4M = Discriminant.

              M   L_{-M}(1)    ULI
         257371    3.0156    0.6443
         294694    3.0736    0.6543
         584791    3.1077    0.6497
         969406    3.1128    0.6427
        1138999    3.1509    0.6480
        1234531    3.1841    0.6536
        3462229    3.2644    0.6546
        6810301    3.3194    0.6562
       10073779    3.3597    0.6589
       10393111    3.4098    0.6683
       39136549    3.4616    0.6616
       43030381    3.4762    0.6633
      100041439    3.5179    0.6615
      249623581    3.6001    0.6668
     1169755141    3.6343    0.6576
     1272463669    3.7146    0.6713
     2055693949    3.7496    0.6730
     5959962661    3.8389    0.6792
     7209891781    3.9018    0.6885
    30116328181    3.9041    0.6767
    78073081381    4.1608    0.7131
  4745628949021    4.3022    0.7063
 11256755665549    4.3598    0.7099
In Tables 3 and 4 the tentative lochamps having N > 50000 and M > 2000 both
have an average value of LLI of about 1.22. In a word, we are trying harder than in
Table 1 and so are getting indices closer to their presumed bound.
The corresponding hichamps in Tables 5 and 6 that are not already in Table 2
are also somewhat stronger but are clearly also markedly affected by the presence
of the squares, as discussed above. Some of the tentative hichamps in Table 6 were
extracted from Beach's and Williams' table [8] of $M^{1/2}$ having exceptionally
long continued fractions.
The results of this second experiment confirm those of the first; by trying
harder we press a little closer to the bounds but do not pass them except for
$d = -163$ and $d = 4\cdot 398$. We now return to the postponed problem of $\{1+o(1)\}$
and give it a partial treatment.
8. Partial analysis of $\{1+o(1)\}$ and conclusions. Clearly, the next order of
business would be to determine if the $o(1)$ on the left sides of (2) and (12) are positive
and sufficiently large for $d = -163$ and $d = 4\cdot 398$ so that the bounds shown are
valid. Otherwise, their L functions violate the Riemann hypothesis. Unfortunately,
many complicated terms enter into these $o(1)$ and no such unequivocal
determination is now available. Nonetheless, it is desirable to show that the two leading
and simplest approximations that were made are of the correct sign and magnitude
so that they alone could account for these apparent violations.
Littlewood's (2), prior to the two approximations alluded to, could be written as
(20) $[\{1+o(1)\}\,B(x)]^{-1} < L(1,\chi_d) < \{1+o(1)\}\,A(x),$
where
(21) $B(x) = \exp\sum_{p^m \le x} (-1)^{m+1}/(mp^m), \qquad A(x) = \exp\sum_{p^m \le x} 1/(mp^m),$
and
(22) $x = (\ln|d|)^{2(1+4\varepsilon)}, \qquad \varepsilon > 0.$
An integrand in the analysis [2, p. 365] includes the factor
(23) $|L'(\tfrac12 + \varepsilon + i\eta)/L(\tfrac12 + \varepsilon + i\eta)|,$
and the $o(1)$ in (20) depend upon our choice of $\varepsilon$.
Let us define $a(x)$ and $b(x)$ by writing
(24) [display missing in the source]
As $x \to \infty$, $a(x)/x^{1/2}\ln x$ and $b(x)/x^{1/2}\ln x \to 0$ and the first approximation is their
replacement by 0. The second approximation sets the $\varepsilon$ of (22) equal to 0 and so
the left side of (20) becomes
(25) $[\{1+o(1)\}\,12e^{\gamma}\pi^{-2}\ln\ln|d|]^{-1}.$
Now, in all of our examples above we had $|d| < 4\cdot 10^{14}$, and setting $\varepsilon = 0$ in (22)
we obtain $x < 1200$. This is sufficiently small that one can easily compute $b(x)$ and
$a(x)$ exactly. We find that throughout this range $b(x)$ is positive and fairly stable,
remaining mostly between 1 and 2. (We also find that $a(x)$ changes sign frequently
and is usually much smaller, but do not need that now.) Therefore,
$$B(x) > 6\pi^{-2}e^{\gamma}\ln x.$$
This is in the correct direction to absolve $d = -163$ and $4\cdot 398$, and the difference
involved is sufficient to account for the latter's apparent misdemeanor: LLI = 0.966.
But for $d = -163$, if we set $\varepsilon = 0$, we get
$$x = (\ln 163)^2 = 25.9463, \qquad B(x) = 3.7601, \qquad 6\pi^{-2}e^{\gamma}\ln x = 3.4853,$$
and even the smaller $B(x)^{-1}$ exceeds $L(1,\chi) = \pi/163^{1/2}$. However, one cannot
allow $\varepsilon$ to approach 0 too closely for the small $|d| = 163$ without losing control
over the other approximations leading to the $o(1)$ in (20). It happens that even a
quite small $\varepsilon$ in (22) will suffice to obtain an $x$ with $B(x)^{-1} < \pi/163^{1/2}$. This is
because an increasing $x$ will soon encounter the odd powers of primes $p^m = 27, 29, 31$
and thereby yield a $B(x) = 4.0695$, whereas, at the earlier square $p^2 = 25$, $B(x)$ had
actually decreased from 3.8360.
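The three values of $B(x)$ quoted for $d = -163$ can be reproduced with a few lines of Python. The sketch below reads (21) as a sum over all prime powers $p^m \le x$ with terms $(-1)^{m+1}/(mp^m)$ for $B$ and $1/(mp^m)$ for $A$; that reading, the function names and the trial-division prime generator are assumptions made here for illustration.

```python
import math
from math import isqrt


def primes_up_to(n):
    """All primes <= n by trial division (x < 1200 in this application)."""
    return [q for q in range(2, n + 1)
            if all(q % r for r in range(2, isqrt(q) + 1))]


def log_sum(x, alternating):
    """Sum over prime powers p^m <= x of (+-1)/(m p^m), as read from (21)."""
    total = 0.0
    for p in primes_up_to(int(x)):
        m, pm = 1, p
        while pm <= x:
            sign = (-1) ** (m + 1) if alternating else 1
            total += sign / (m * pm)
            m, pm = m + 1, pm * p
    return total


def B(x):
    return math.exp(log_sum(x, alternating=True))


def A(x):
    return math.exp(log_sum(x, alternating=False))


x0 = math.log(163) ** 2          # epsilon = 0 in (22)
print(round(x0, 4), round(B(x0), 4))
print(round(B(24.99), 4))        # just before the square 25 enters
print(round(B(31), 4))           # after 27, 29 and 31 have entered
```

The square 25 enters with a negative term and pulls $B(x)$ down, while the cube 27 and the primes 29 and 31 enter with positive terms, which is the mechanism described in the paragraph above.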
That is as far as we will go here. While that leaves it open whether —163 does
or does not violate the lower bound, there is enough here, in the correct direction,
that we now have no real reason to believe that it does.
We have sought, in two different ways, to exceed the bounds (2) and (12), but
with an improbable exception at d= —163 we find that we cannot. Our approach
has not been at all hit-and-miss but, instead, very systematic. The resulting ULI
and LLI are quite uniform and clearly relate to these bounds. All of our strongest
cases, such as $d = -991027$ and $d$ = first 5N139, press against the bounds. Our
tentative lochamps had LLI $\approx 1.22$. The simplest interpretation of all this
persistent behavior is that the extended Riemann hypothesis is true. Of course, that is
no proof, not even for a single $d$.
Any heuristic conclusion is somewhat subjective and I should add that I,
personally, regard this as fairly strong evidence. Heuristic reasoning, unlike deductive
reasoning, is influenced by collateral evidence. There was considerable evidence
for the ERH, of several sorts, prior to this work and that can only strengthen our
assessment of the present data.
Suppose we did find a clear violation. We would then know that there were
non-Riemannian zeros for that $d$ and we could even give a lower bound for their
real parts. If, in place of (23), we were forced out to
$$|L'(\theta + \varepsilon + i\eta)/L(\theta + \varepsilon + i\eta)|$$
because of zeros at $\theta + it$, then (22) would be replaced by
(26) $x = (\ln|d|)^{(1+4\varepsilon)/(1-\theta)},$
and the famous factor of 2 in the bounds would be replaced by the larger factor
$1/(1-\theta)$.
Littlewood does not give (26) but he writes [2, p. 371], "Hypothesis X, without
modification, is essential in proving Theorem 1" [that is, in proving (2)]. I
presume that the need for an enlarged (26) is what he had in mind.
References
1. D. H. Lehmer, Emma Lehmer and Daniel Shanks, Integer sequences. II, Math. Comp. (to
appear).
2. J. E. Littlewood, On the class-number of the corpus $P(\sqrt{-k})$, Proc. London Math. Soc. 28
(1928), 358-372.
3. S. Chowla, On the class-number of the corpus $P(\sqrt{-k})$, Proc. Nat. Inst. Sci. India 13 (1947),
197-200. MR 10, 285.
4. D. H. Lehmer, An announcement concerning the Delay Line Sieve DLS-127, Math. Comp. 20
(1966), 645-646. This was subsequently modified to the DLS-157.
5. Daniel Shanks, Class number, a theory of factorization, and genera, Proc. Sympos. Pure Math.,
vol. 20, Amer. Math. Soc., Providence, R.I., 1971, pp. 415-440.
6. Daniel Shanks, The infrastructure of a real quadratic field and its applications, Proceedings of Boulder
Symposium, August 1972, University of Colorado, 1972.
7. D. H. Lehmer, Emma Lehmer and Daniel Shanks, Integer sequences having prescribed quadratic
character, Math. Comp. 24 (1970), 433-451. MR 42 #5889.
8. B. D. Beach and H. C. Williams, Some computer results on periodic continued fractions, Proc.
Second Louisiana Conference on Combinatorics, Graph Theory and Computing, Baton Rouge, 1971,
pp. 133-146.
Naval Ship Research and Development Center
Bethesda, Maryland
ON THE RIEMANN HYPOTHESIS IN
HYPERELLIPTIC FUNCTION FIELDS
H. M. STARK
1. Introduction. Let $k_q$ be the finite field of $q$ elements, where $q$ is a power of
the characteristic of the field, which we denote by $p$. Let
$$f(x) = a_0x^n + \cdots + a_n, \qquad a_0 \ne 0,$$
be a polynomial with coefficients in $k_q$ which has no double roots. Let $N(q)$ be
the number of points on the hyperelliptic curve
(1) $y^2 = f(x),$
including the points at infinity, defined over $k_q$. The Riemann hypothesis for the
curve (1), first conjectured by Artin [1] when $q = p$, is equivalent to the estimate
(2) $|N(q^j) - q^j - 1| \le 2g(q^j)^{1/2}, \qquad j = 1, 2, \ldots,$
where
$$g = (n-1)/2 \text{ if } n \text{ is odd}, \qquad g = (n-2)/2 \text{ if } n \text{ is even},$$
is the genus of the curve (1). With $q$ arbitrary, it suffices to treat the case $j = 1$ of
(2). For $n = 1, 2$ we have $g = 0$ and the result of (2) is exact and trivial. For $n = 3, 4$
we have $g = 1$ and (2) was first proved by Hasse [2]. The result (2) for $n \ge 5$ ($g > 1$)
was proved by Weil [5] as a special case of the Riemann hypothesis for curves
in general.
AMS 1970 subject classifications. Primary 10F35, 14G10; Secondary 10B15.
© 1973, American Mathematical Society
From the point of view of number theory, one wants to count the solutions
to (1) with $x$ and $y$ in $k_q$. Let $N = N_q$ be the number of pairs $(x, y)$ in $k_q$ satisfying
(1). Then $N(q) - N_q$ is the number of points at infinity defined over $k_q$ and this is
1 if $n$ is odd, 2 if $n$ is even and $a_0$ is a square in $k_q$, and 0 if $n$ is even and $a_0$ is not
a square in $k_q$. Thus (2) splits into three cases:
(3) $|N - q| \le 2gq^{1/2}$ ($n$ odd),
(4) $|N - q + 1| \le 2gq^{1/2}$ ($n$ even and $a_0$ a square in $k_q$),
(5) $|N - q - 1| \le 2gq^{1/2}$ ($n$ even and $a_0$ not a square in $k_q$).
In the case that $n = 3$, an elementary proof of (3) was first given in 1956 by
Manin [3], but his proof is modeled upon Hasse's and cannot be carried over to
$n \ge 5$ (there is also an error in the proof but it is repairable). For $n \ge 3$, $n$ odd,
Stepanov [4] introduced an entirely new method three years ago which for the
first time approached (3) by methods from number theory. He showed that if
$q = p > 9n^2$ and $n \ge 3$ is odd, then
(6) $|N - p| < n(3n)^{1/2}p^{1/2}.$
His method is in spirit closely allied with methods in diophantine approximation
and transcendence in that he creates an auxiliary polynomial having high order
zeros at each of several points of interest. The major problem in the method then
turns out to be the necessity of showing that the auxiliary polynomial is not
identically zero.
Stepanov's proof that his auxiliary polynomial is not identically zero is not
valid for the optimum choice of variables (nor for $n$ even). Surprisingly, we
find that for the optimum choice of variables, if the auxiliary polynomial
is not identically zero, then for any $n \ge 3$ and $q = p$ we would get
(7) $|N - p| < cgp^{1/2},$
where $c$ is independent of $n$ and $p$ (but greater than 2). It is the purpose of this
paper to modify Stepanov's method enough to prove (3) when $q = p$ and $n$ is
odd and at least prove (7) when $n > 2$ is even with $c < 3$. Our main result is
Theorem 1. Suppose that $f(x)$ has exactly $n_0$ zeros in $k_p$. If $p \ge 5$, $p^{1/2} \ge 2g - 1$
and $m$ is an integer such that $1 \le m \le p^{1/2}$, then
(8) $|N - p| \le (n-1)(m + g/2) + \{(g/2)(p - n_0) - (g^2/4)(n-1)\}\,(m + g/2)^{-1}.$
For $n = 1$, the right side of (8) is zero, which agrees with (3). For n odd, $n \ge 3$,
we have $n - 1 = 2g$ and with a real variable $y = m + g/2$, the right side of (8) is
less than ($n_0 > 0$) or equal to ($n_0 = 0$),
$h(y) = 2g\{y + ((p - g^2)/4)\,y^{-1}\}$.
For $y = \tfrac12 p^{1/2} \pm g/2$ we have $h(y) = 2g\,p^{1/2}$, and hence if m is chosen (as it may be
since $p \ge 5$) so that $m \ge 1$ and
$\tfrac12 p^{1/2} - g \le m \le \tfrac12 p^{1/2}$,
we get (3). The restriction, $p^{1/2} \ge 2g - 1$, is harmless since (3) is trivial if $p^{1/2} < 2g - 1$.
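The choice of m can be illustrated numerically. The following sketch evaluates the right side of (8) in floating point for n odd and $n_0 = 0$, taking $m = \lfloor \tfrac12 p^{1/2} \rfloor$ (one admissible choice, assumed here for illustration), and compares it with $2g\,p^{1/2}$. The helper names and the range of primes are hypothetical.

```python
import math


def rhs_of_8(n, g, p, n0, m):
    """Right side of (8): (n-1)(m+g/2) + {(g/2)(p-n0) - (g^2/4)(n-1)} / (m+g/2)."""
    y = m + g / 2
    return (n - 1) * y + ((g / 2) * (p - n0) - (g * g / 4) * (n - 1)) / y


def is_prime(k):
    return k >= 2 and all(k % q for q in range(2, math.isqrt(k) + 1))


def check_odd_case(p, g):
    """For n = 2g + 1 and n0 = 0, compare (8) at m = floor(sqrt(p)/2) with 2 g sqrt(p)."""
    m = math.isqrt(p) // 2             # largest integer m with m <= sqrt(p)/2
    bound = rhs_of_8(2 * g + 1, g, p, 0, m)
    # Small tolerance because the comparison is done in floating point.
    return m >= 1 and bound <= 2 * g * math.sqrt(p) + 1e-9


checks = [check_odd_case(p, g)
          for p in range(5, 400) if is_prime(p)
          for g in range(1, 12) if (2 * g - 1) ** 2 <= p]
```

For instance `rhs_of_8(5, 2, 13, 0, 1)` evaluates the case $n = 5$, $p = 13$, $m = 1$ treated below.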
In fact, we see that the Riemann hypothesis is not best possible in that there
are n and p for which the right-hand side of (8) is always less than $[2g\,p^{1/2}]$ for all f
(and when $2g\,p^{1/2} < p$). For example, suppose $n = 5$ ($g = 2$) and $p = 4y^2 + 1$ where
$y \ge 2$. Then
$[2g\,p^{1/2}] \ge 4gy$,
while if we set $m = y - 1$ in (8) (so that $m + \tfrac12 g = y$) we get the right-hand side of
(8) is
$2g(y + (4y^2 - n_0 - 3)/4y) < 4gy$.
As a further example, consider the case when $n = 5$ ($g = 2$), $p = 13$. Here $[2g\,p^{1/2}] = 14$
and so the Riemann hypothesis is unable to even say whether $y^2 = f(x)$
has any solutions in this case. But Theorem 1 with $m = 1$ gives
$|N - 13| \le \dfrac{25 - n_0}{2} \le \dfrac{25}{2}$
and thus we have at least $N \ge 1$. If $N = 1$ we must have $n_0 = 1$ and this gives exactly
$|N - 13| \le 12$. This is best possible as the example
$y^2 = x(x^4 + x^2 + 3)$ ($p = 13$)
has $N = 1$.
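The example is small enough to confirm directly. The sketch below counts the solutions of $y^2 = x(x^4 + x^2 + 3)$ over $k_{13}$ and evaluates the right side of (8) with $m = 1$ in exact arithmetic; the variable names are illustrative.

```python
from fractions import Fraction

p = 13


def f(x):
    """f(x) = x (x^4 + x^2 + 3) reduced mod 13, the example from the text."""
    return x * (x ** 4 + x ** 2 + 3) % p


# N = number of pairs (x, y) with y^2 = f(x); n0 = number of zeros of f in k_p.
N = sum(1 for x in range(p) for y in range(p) if y * y % p == f(x))
n0 = sum(1 for x in range(p) if f(x) == 0)

# Right side of (8) for n = 5, g = 2, m = 1.
n, g, m = 5, 2, 1
y = Fraction(2 * m + g, 2)
bound = (n - 1) * y + (Fraction(g, 2) * (p - n0) - Fraction(g * g, 4) * (n - 1)) / y
```

Here the bound is attained: `abs(N - p)` and `bound` agree.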
When n is even, $n = 2g + 2$ and Theorem 1 gives
$|N - p| \le (2g + 1)\,p^{1/2}$
with no difficulty. The estimate of Theorem 1 may be improved for even n but
we do it here only for $m = 1$.
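Theorem 2. If n is even, $p \ge 5$, $p^{1/2} \ge 2g - 1$, then
$|N - p + (a_0/p)| \le 2g(1 + g/2) + \{(g/2)(p + 1 - n_0) - g^3/2\}(1 + g/2)^{-1}$.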
This result is of interest when $p^{1/2} \approx 2g$. For example, when $p = 13$, $n = 6$ ($g = 2$)
we get
$|N - 13 + (a_0/13)| \le (26 - n_0)/2 \le 13$
with equality only if $n_0 = 0$. When $(a_0/13) = -1$, this gives $N \ge 1$ and $N = 1$ is
impossible since it implies $n_0 = 1$ which improves the estimate. Thus $N \ge 2$. If
$(a_0/13) = 1$, we get no information on N but in this case there are two points
at $\infty$. Thus in the language of algebraic geometry, any complete nonsingular
curve of genus 2 defined over $k_{13}$ has at least two points over $k_{13}$; the Riemann
hypothesis says nothing.
2. An illustration of the method. Let $n_1$, $n_0$, $n_{-1}$ be the number of elements
a in $k_p$ such that f(a) is a nonzero square, 0, or not a square, respectively. Then
$N = 2n_1 + n_0$ and $n_1 + n_0 + n_{-1} = p$. Thus
$N - p = 2n_1 + n_0 - p = -(2n_{-1} + n_0 - p)$,
and Theorem 1 is equivalent to showing both
$2n_{\pm 1} + n_0 - p \le (n-1)(m + g/2) + \{(g/2)(p - n_0) - (g^2/4)(n-1)\}(m + g/2)^{-1}$.
In fact, since multiplying f(x) by some c in $k_p$ such that the Legendre symbol
$(c/p) = -1$ interchanges the meaning of $n_1$ and $n_{-1}$, it suffices to prove
Theorem 1'. If $p \ge 5$, $p^{1/2} \ge 2g - 1$, $1 \le m \le \tfrac12 p^{1/2}$, then
(9) $2n_{-1} + n_0 - p \le (n-1)(m + g/2) + \{(g/2)(p - n_0) - (g^2/4)(n-1)\}(m + g/2)^{-1}$.
We illustrate the method of proof in the case of $n = 2$ ($g = 0$) where the optimal
choice of m is $m = 1$. Let $R(x) \in k_p[x]$ be defined by
(10) $R(x) = (1 + f(x)^{(p-1)/2})\,f(x) + \tfrac12 f'(x)(x^p - x)$.
Let
(11) $f_{-1}(x) = \prod_{a \in k_p;\ (f(a)/p) = -1}(x - a)$, $f_0(x) = \prod_{a \in k_p;\ f(a) = 0}(x - a)$.
Then clearly $f_{-1} \mid R$ since every zero of $f_{-1}$ is a zero of R. But further (since $k_p$
has characteristic p),
$R'(x) = (1 + f(x)^{(p-1)/2})\,\tfrac12 f'(x) + \tfrac12 f''(x)(x^p - x)$,
and thus every zero of $f_{-1}$ is also a zero of $R'$ and hence $f_{-1}^2 \mid R$. It is also apparent
that $f_0 \mid R$ and since $f_0$ and $f_{-1}$ have no common roots, $f_0 f_{-1}^2 \mid R$. Assuming that
R is not identically zero, we may thus compare degrees and get
(12) $n_0 + 2n_{-1} \le p + 1$,
which is Theorem 1' with $m = 1$ ($n = 2$, $g = 0$).
In fact, we may get the full Riemann hypothesis in this case since the
coefficients of $x^{p+1}$ and $x^p$ in R(x) are
$a_0(a_0^{(p-1)/2} + 1)$ and $\tfrac12 a_1(a_0^{(p-1)/2} + 1)$
respectively. This enables us to replace (12) by
(13) $n_0 + 2n_{-1} \le p + (a_0/p)$.
To get the corresponding estimate for $n_1$, we must remember that when we
multiply f(x) by a nonsquare, $(a_0/p)$ changes sign and thus the analogue of (13) is
(14) $n_0 + 2n_1 \le p - (a_0/p)$.
If we add (13) and (14), we see that both are equalities and in particular
$N = p - (a_0/p)$,
which is precisely (4), (5) and Theorem 2 when $n = 2$.
Even in this simple case, we still have not shown that R is not identically zero.
In this case, the coefficient of $x^{p-1}$ in R is (even when $p = 3$)
$-\tfrac18 a_0^{(p-3)/2}(a_1^2 - 4a_0a_2)$
which is not zero since f has no double roots. Note that if $f = a_0(x - a)^2$, $(a_0/p) = -1$,
then not only are the first three coefficients of R zero but in fact R is identically
zero.
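The case $n = 2$ can be reproduced with elementary polynomial arithmetic over $k_p$. The sketch below is an illustration under assumptions made here: polynomials are coefficient lists (lowest degree first), the two quadratics over $k_{13}$ are arbitrary examples with nonzero discriminant, and `pow(2, -1, p)` needs Python 3.8 or later. It builds R from (10), differentiates it formally, and checks the divisibility, degree and coefficient claims.

```python
def poly_add(a, b, p):
    size = max(len(a), len(b))
    a = a + [0] * (size - len(a))
    b = b + [0] * (size - len(b))
    return [(u + v) % p for u, v in zip(a, b)]


def poly_mul(a, b, p):
    out = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            out[i + j] = (out[i + j] + u * v) % p
    return out


def poly_pow(a, e, p):
    result = [1]
    for _ in range(e):
        result = poly_mul(result, a, p)
    return result


def derivative(a, p):
    return [(i * c) % p for i, c in enumerate(a)][1:] or [0]


def evaluate(a, x, p):
    value = 0
    for c in reversed(a):
        value = (value * x + c) % p
    return value


def degree(a):
    return max((i for i, c in enumerate(a) if c), default=-1)


def legendre(a, p):
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def build_R(f, p):
    """R = (1 + f^((p-1)/2)) f + (1/2) f' (x^p - x), as in (10)."""
    half = pow(2, -1, p)
    first = poly_mul(poly_add([1], poly_pow(f, (p - 1) // 2, p), p), f, p)
    xp_minus_x = [0, p - 1] + [0] * (p - 2) + [1]
    second = poly_mul([half * c % p for c in derivative(f, p)], xp_minus_x, p)
    return poly_add(first, second, p)


def check_quadratic(f, p):
    """f = [a2, a1, a0] represents a0 x^2 + a1 x + a2."""
    a2, a1, a0 = f
    R = build_R(f, p)
    dR = derivative(R, p)
    nonres = [a for a in range(p) if legendre(evaluate(f, a, p), p) == -1]
    roots = [a for a in range(p) if evaluate(f, a, p) == 0]
    n_plus = p - len(nonres) - len(roots)
    coeff = (-pow(8, -1, p) * pow(a0, (p - 3) // 2, p) * (a1 * a1 - 4 * a0 * a2)) % p
    return {
        "double_zeros": all(evaluate(R, a, p) == 0 and evaluate(dR, a, p) == 0
                            for a in nonres),          # f_{-1}^2 | R
        "simple_zeros": all(evaluate(R, a, p) == 0 for a in roots),  # f_0 | R
        "degree": degree(R),                            # p + (a0/p)
        "N": 2 * n_plus + len(roots),                   # p - (a0/p)
        "coeff_ok": R[p - 1] == coeff,                  # coefficient of x^(p-1)
    }


square_lead = check_quadratic([2, 1, 1], 13)      # x^2 + x + 2, (a0/13) = 1
nonsquare_lead = check_quadratic([1, 1, 2], 13)   # 2x^2 + x + 1, (a0/13) = -1
```

In both runs the degree of R equals $p + (a_0/p)$ and the point count equals $p - (a_0/p)$, matching (13), (14) and the remark on the coefficient of $x^{p-1}$.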
The result for $n = 2$ is valid for any finite field of odd characteristic since we
required only one derivative. But for $n \ge 3$, the optimal number of derivatives is
on the order of $p^{1/2}$, and thus extra difficulties arise when we deal with a general
finite field.
3. The auxiliary polynomial. In the illustration in the last section, we could
write R(x) explicitly. This is no longer possible in the general case, but rather
we will define the coefficients of R by differential recursion relations. To avoid
numerous subscripts, it is imperative to use vectors and matrices. Also, we will
write f, R, etc. in place of f(x), R(x), etc. Throughout this section, m and d will
denote integers such that $m > 0$, $d \ge 0$ and $2m + d < p$.
Let the $m \times m$ matrix Q be defined by
(15) $Q = (f'/2f)\,I + Q_1$,
where I is the $m \times m$ identity matrix and
(16) $Q_1 = \begin{pmatrix} 0 & 1 & & & \\ & 0 & 2 & & \\ & & 0 & \ddots & \\ & & & \ddots & m-1 \\ & & & & 0 \end{pmatrix}$
with zero elsewhere. Put $F_i = (F_{i1}, \ldots, F_{im})$; we define the $F_i$ recursively from
(17) $F_{i+1} = DF_i + F_iQ$ ($i \ge 0$)
where D is the operator, $D = d/dx$, and
(18) $F_0 = (f^{(p-1)/2} + 1, 0, \ldots, 0)$.
Due to the triangular nature of Q, any $F_{ij}$ is independent of the value of $m \ge j$
used to define it. This will be useful later. In terms of the $F_{ij}$, (17) is
(19) $F_{i+1,j} = DF_{ij} + (f'/2f)F_{ij} + (j-1)F_{i,j-1}$, $1 \le j \le m$.
From this we see by induction that $F_{ij} = 0$ if $i + 1 < j \le m$ and
(20) $F_{i,i+1} = i!\,F_{01}$, $0 \le i \le m - 1$.
Further, we have
$DF_{01} + (f'/2f)F_{01} = f'/2f$,
and thus by induction we see that, for $i \ge 1$,
(21) $F_{ij} = P_{ij}/f^{\,i-j+1}$, $1 \le j \le \min(i, m)$,
where $P_{ij}$ is a polynomial of degree ($\deg 0 = -\infty$),
(22) $\deg P_{ij} \le (n-1)(i - j + 1)$.
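We define the $(m + d + 1) \times m$ matrix F by
(23) $F = \begin{pmatrix} F_0 \\ \vdots \\ F_{m+d} \end{pmatrix}$;
we see from (17) that
$DF + FQ = \begin{pmatrix} F_1 \\ \vdots \\ F_{m+d+1} \end{pmatrix}$.
Let
(24) $u = (u_0, \ldots, u_{m+d})$, $u_j = (1/j!)(x^p - x)^j$
where $m + d < p$. Note that
$Du = -(0, u_0, \ldots, u_{m+d-1})$
and thus,
(25) $D(uF) + uFQ = u_{m+d}F_{m+d+1}$.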
Let
(26) $r_i = {}^t(r_{i1}, \ldots, r_{im})$
be a column vector of functions defined recursively by
(27) $r_{i+1} = Dr_i - Qr_i$, $i \ge 0$,
and the $r_{0j}$ are to be chosen. In particular if the $r_{0j}$ are polynomials, then the
$r_{ij}$ are rational functions whose denominators have only zeros which are zeros
of f. Set, for $i \ge 0$, $R_i = uFr_i$. We see that
(28) $DR_i = [D(uF) + uFQ]\,r_i + uF(Dr_i - Qr_i) = u_{m+d}F_{m+d+1}r_i + R_{i+1}$.
It follows from (20) that
(29) $R_i = (1 + f^{(p-1)/2})\sum_{j=1}^{m} r_{ij}(x^p - x)^{j-1} + \sum_{k=1}^{m+d}\ \sum_{j=1}^{\min(m,k)} \frac{1}{k!}F_{kj}\,r_{ij}(x^p - x)^k$,
and since $u_{m+d} = [(m+d)!]^{-1}(x^p - x)^{m+d}$, we see that if the $r_{0j}$ are polynomials
and $a \in k_p$ is such that $(f(a)/p) = -1$, then
$D^lR_0(a) = 0$, $0 \le l \le m + d$.
In other words, if $f_{-1}$ is given by (11) and in addition $R_0$ is a polynomial, then
$f_{-1}^{m+d+1} \mid R_0$.
We have m free polynomials, $r_{0j}$, and hope to put $m - 1$ conditions on them.
To do this we set $m = 2m_1$ ($m_1 > 0$), $d + 1 = d_1$. Our $m - 1$ conditions are
(30) $r_{0j} = 0$, $m_1 + 1 \le j \le m$,
and
$F_ir_0 = 0$, $m_1 + d_1 + 1 \le i \le m + d_1 - 1$.
Thus
$R_0 = (u_0, \ldots, u_{m_1+d_1}) \begin{pmatrix} F_0 \\ \vdots \\ F_{m_1+d_1} \end{pmatrix} \begin{pmatrix} r_{01} \\ \vdots \\ r_{0m_1} \\ 0 \\ \vdots \\ 0 \end{pmatrix}$.
Since the only $F_{ij}$ with nonzero coefficients are those with $j \le m_1$ and since (30)
implies that $r_{ij} = 0$, $m_1 + 1 \le j \le m$ for all i, we see that we may define $F_i$, $r_i$ and $R_i$
with m and d replaced by $m_1$ and $d_1$. For this reason, we drop the subscript on
$m_1$ and $d_1$. We have (28) and (29) still valid; and if $R_0$ and the $r_{0j}$ are polynomials,
$1 \le j \le m$, and the $m - 1$ equations
(31) $F_ir_0 = 0$, $m + d + 1 \le i \le 2m + d - 1$,
are fulfilled, then
(32) $f_{-1}^{2m+d} \mid R_0$.
4. A choice of the $r_{0j}$. We will find the polynomials $r_{0j}$ in the form
(33) $r_{0j} = f^{m+d-j+1}s_j$
where the $s_j$ are polynomials. The advantage of this form is that from (29),
(34) $R_0 = (1 + f^{(p-1)/2})\sum_{j=1}^{m} f^{m+d-j+1}s_j(x^p - x)^{j-1} + \sum_{k=1}^{m+d}\frac{1}{k!}\sum_{j=1}^{\min(k,m)} P_{kj}f^{m+d-k}s_j(x^p - x)^k$,
so that $R_0$ is automatically a polynomial.
The system (31) becomes
$\sum_{j=1}^{m}\frac{P_{ij}}{f^{\,i-j+1}}\,f^{m+d-j+1}s_j = 0$, $m + d + 1 \le i \le 2m + d - 1$,
which reduces to
(35) $\sum_{j=1}^{m} P_{ij}s_j = 0$, $m + d + 1 \le i \le 2m + d - 1$.
Lemma 1. There are polynomials $s_j$, not all zero, of degrees
$\deg s_j \le \mu_j = (n-1)(j-1) + (n-1)(m-1)(m+d)$
which satisfy the system (35). (If $m = 1$, we take $s_1 = 1$.)
Proof. If $s_j$ is a polynomial of degree $\le \mu_j$ then from (22),
$\deg P_{ij}s_j \le (n-1)i + (n-1)(m-1)(m+d)$.
Thus if we attempt to solve (35) by letting $s_j = \sum_{l=0}^{\mu_j} b_{jl}x^l$ and then setting the
coefficient of every power of x equal to zero, we have
$U = \sum_{j=1}^{m}(\mu_j + 1) = (n-1)\tfrac12(m-1)m + m(n-1)(m-1)(m+d) + m$
unknowns and
$E \le \sum_{i=m+d+1}^{2m+d-1}[(n-1)i + (n-1)(m-1)(m+d) + 1]$
$= (n-1)\tfrac12(m-1)(3m+2d) + (m-1)(n-1)(m-1)(m+d) + (m-1) = U - 1$
equations, and hence there is a solution for the $b_{jl}$ with not all $b_{jl} = 0$.
We take one such solution $s_j$ ($j = 1, \ldots, m$) with the $s_j$ having no common factor
among them all and fix this solution for the rest of the paper. For the rest of this
section, the $r_{0j}$ are given by (33).
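The counting argument in the proof of Lemma 1 rests on the identity that the bound on the number of equations is exactly one less than the number of unknowns. A minimal sketch that recomputes both counts directly from the degree bounds (function and variable names are illustrative):

```python
def unknowns_and_equation_bound(n, m, d):
    """Return (U, bound on E) from the proof of Lemma 1.

    U counts the coefficients b_{jl} of s_1, ..., s_m with deg s_j <= mu_j.
    The bound on E counts the coefficients of sum_j P_{ij} s_j that must
    vanish, for i = m+d+1, ..., 2m+d-1.
    """
    mu = [(n - 1) * (j - 1) + (n - 1) * (m - 1) * (m + d) for j in range(1, m + 1)]
    unknowns = sum(mu_j + 1 for mu_j in mu)
    equation_bound = sum((n - 1) * i + (n - 1) * (m - 1) * (m + d) + 1
                         for i in range(m + d + 1, 2 * m + d))
    return unknowns, equation_bound


gaps = {(n, m, d): u - e
        for n in range(1, 9) for m in range(1, 7) for d in range(0, 7)
        for u, e in [unknowns_and_equation_bound(n, m, d)]}
```

Every value in `gaps` is 1, so a homogeneous linear system with more unknowns than equations is obtained for every choice of n, m and d.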
Lemma 2. If $p \ge 5$, $p^{1/2} \ge 2g - 1$, $1 \le m \le p^{1/2}/2$ and $m + d \le p^{1/2}/2 + g$, then
the polynomial $R_0(x)$ is not identically zero.
We postpone the proof of Lemma 2 to the next section.
Lemma 3. If $d = g$ then
$\deg R_0(x) \le (n-1)(m + g/2)^2 + p(m + g/2) + [(g/2)p - (g^2/4)(n-1)]$.
Proof. For fixed j and $1 \le k \le m + d$, we have
$\deg(P_{kj}f^{m+d-k}s_j(x^p - x)^k) \le (n-1)(k - j + 1) + n(m + d - k) + \mu_j + kp$
$\le k(p-1) + n(m+d) + \mu_j - (n-1)(j-1)$
$\le (m+d)(p-1) + n(m+d) + (n-1)(m-1)(m+d)$
$\le (m+d)p + (n-1)m(m+d)$
while
$\deg[(1 + f^{(p-1)/2})f^{m+d-j+1}s_j(x^p - x)^{j-1}]$
$\le \tfrac12(p-1)n + n(m + d - j + 1) + \mu_j + p(j-1)$
$\le \tfrac12(p-1)n + n(m+d) + (p-1)(m-1) + (n-1)(m-1)(m+d)$
$\le (m+d)p + (n-1)m(m+d) + (p-1)(n/2 - d - 1)$.
Hence for $d = g \ge n/2 - 1$ we see that
$\deg R_0 \le (m+g)p + (n-1)m(m+g)$
$= (n-1)(m + g/2)^2 + p(m + g/2) + [(g/2)p - (n-1)(g^2/4)]$.
5. Proof of Theorem 1'. We constructed $R_0(x)$ in the last section such that
$f_{-1}^{2m+d} \mid R_0$. But we see from (34) that $f_0^{m+d} \mid R_0$ also. Thus
(36) $(2m + d)\,n_{-1} + (m + d)\,n_0 \le \deg R_0$.
We estimated the degree of $R_0$ when $d = g$ in Lemma 3. If we use this estimate in
(36) with $d = g$, we get Theorem 1'.
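The passage from (36) and Lemma 3 to (9) is a division by $m + g/2$. The sketch below checks, in exact rational arithmetic and for arbitrary sample values chosen here, that the slack in (36) equals $(m + g/2)$ times the slack in (9), and that the two forms of the bound in Lemma 3 agree. The function names are illustrative.

```python
from fractions import Fraction


def lemma3_bound(n, g, p, m):
    """A = (n-1)(m+g/2)^2 + p(m+g/2) + [(g/2)p - (g^2/4)(n-1)]."""
    y = Fraction(2 * m + g, 2)
    return (n - 1) * y * y + p * y + Fraction(g, 2) * p - Fraction(g * g, 4) * (n - 1)


def rhs_of_9(n, g, p, n0, m):
    """Right side of (9)."""
    y = Fraction(2 * m + g, 2)
    return (n - 1) * y + (Fraction(g, 2) * (p - n0) - Fraction(g * g, 4) * (n - 1)) / y


def slack_identity_holds(n, g, p, m, n0, n_minus):
    """A - (2m+g) n_{-1} - (m+g) n0 == (m+g/2) * (rhs(9) - (2 n_{-1} + n0 - p))."""
    y = Fraction(2 * m + g, 2)
    left = lemma3_bound(n, g, p, m) - (2 * m + g) * n_minus - (m + g) * n0
    right = y * (rhs_of_9(n, g, p, n0, m) - (2 * n_minus + n0 - p))
    return left == right


samples = [(n, (n - 1) // 2, p, m, n0, n_minus)
           for n in (3, 4, 5, 6, 9) for p in (13, 29, 101)
           for m in (1, 2, 3) for n0 in (0, 1, 3) for n_minus in (0, 5, 11)]
identity_ok = all(slack_identity_holds(*s) for s in samples)
closed_form_ok = all(lemma3_bound(n, g, p, m) == (m + g) * p + (n - 1) * m * (m + g)
                     for n, g, p, m, _, _ in samples)
```

Since $m + g/2 > 0$, the identity shows that (36) with $\deg R_0 \le A$ holds exactly when (9) does.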
It remains to prove Lemma 2. For the time being, we will assume that $r_0$ is a
vector of functions in $k_p(x, f^{1/2})$, not all zero, satisfying the system of equations
(31). The choice of $r_0$ given in Lemma 1 is useful as is another choice to be given
soon. We note that, for $i \ge 0$, $j \ge 0$,
(37) $D(F_ir_j) = F_{i+1}r_j + F_ir_{j+1}$.
It is also useful to simplify the recursion relations for the $F_i$ and $r_i$ by making a
change of variable. Let
$\Phi_i = (\Phi_{i1}, \ldots, \Phi_{im}) = f^{1/2}F_i$, $\varrho_i = {}^t(\varrho_{i1}, \ldots, \varrho_{im}) = f^{-1/2}r_i$.
The recursion relations (17) and (27) simplify to
(38) $\Phi_{i+1} = D\Phi_i + \Phi_iQ_1$,
and
(39) $\varrho_{i+1} = D\varrho_i - Q_1\varrho_i$.
We will reduce Lemma 2 to the following key result which is reminiscent of
the theory of E-functions in transcendence.
Lemma 4. For $p \ge 5$, $p^{1/2} \ge 2g - 1$, $m + d \le \tfrac12 p^{1/2} + g$, $m \le \tfrac12 p^{1/2}$, we have
$\det M \ne 0$, where M is the $m \times m$ matrix $(F_{ij})$, $m + d + 1 \le i \le 2m + d$, $1 \le j \le m$.
Proof. We have $\Phi_{01} = f^{1/2} + (f^{1/2})^p$. We then see from (38) that
(40) $F_{m+d+1,1} = f^{-1/2}\Phi_{m+d+1,1} = f^{-1/2}D^{m+d+1}(f^{1/2}) = \sum_{k=1}^{m+d+1} b_kf^{-k}$,
where the $b_k$ are polynomials in x and
$b_{m+d+1} = (-1)^{m+d}\,2^{-(m+d+1)}\,(f')^{m+d+1}\prod_{j=1;\ j\ \mathrm{odd}}^{2m+2d-1} j$.
If $F_{m+d+1,1} = 0$ then $f \mid b_{m+d+1}$. But f and f' have no common factors and
$b_{m+d+1} \ne 0$ since $2g - 1 \le p^{1/2}$ and
$2m + 2d - 1 \le p^{1/2} + 2g - 1 \le 2p^{1/2} < p$.
Thus the upper left-hand entry of M is not identically zero. This settles the case of
$m = 1$.
We now assume that $\det M = 0$, $m > 1$, and derive a contradiction. We see that
for some h, $1 \le h < m$, we must then have
$\det M_h \ne 0$, $\det M_{h+1} = 0$
where
$M_k = (F_{ij})$, $m + d + 1 \le i \le m + d + k$, $1 \le j \le k$.
In fact by lowering m and raising d we may assume that this happens with $h = m - 1$
(as we noted before, the triangular shape of Q guarantees that the matrix $M_k$ is
independent of the value of $m \ge k$ used to find it). Thus we shall assume that
(41) $\det M_{m-1} \ne 0$, $\det M = 0$.
Therefore M has rank $m - 1$ over $k_p(x)$ with a single row relation which involves
the last row,
(42) $F_{2m+d} = \sum_{i=m+d+1}^{2m+d-1} h_iF_i$
with rational functions $h_i$. By (31) and (42), $F_{2m+d}r_0 = 0$ and hence
$F_ir_0 = 0$, $m + d + 1 \le i \le 2m + d$.
By (37),
$F_ir_1 = 0$, $m + d + 1 \le i \le 2m + d - 1$,
and by (42) again, this also holds for $i = 2m + d$.
This gives us two relations among the columns of M and since the column
rank of M is also $m - 1$, these relations must be proportional:
$r_1 = hr_0$
where h is a function in $k_p(x, f^{1/2})$. Thus
(43) $hr_0 = Dr_0 - Qr_0$,
and, in particular,
(44) $hr_{0m} = Dr_{0m} - (f'/2f)\,r_{0m}$.
Note that $r_{0m} \ne 0$ since otherwise the fact that $M_{m-1}$ is nonsingular would imply
all the other $r_{0j} = 0$ also. In particular the $s_m$ of Lemma 1 is not zero. Let
(45) $r_{0j} = f^{-j+m+1/2}s_js_m^{-1}$, $1 \le j \le m$.
This gives an $r_0$ which is just $(f^{d+1/2}s_m)^{-1}$ times the $r_0$ used in the last section and
in particular $r_{0m} = f^{1/2}$. For this choice of $r_0$, we see from (44) that $h = 0$ and hence
from (43) that $r_1 = Dr_0 - Qr_0 = 0$. We let $\varrho_0$ be given by
(46) $\varrho_{0j} = f^{-1/2}r_{0j} = f^{m-j}s_js_m^{-1}$, $1 \le j \le m$,
so that the $\varrho_{0j}$ are rational functions, $\varrho_{0m} = 1$ and $\varrho_1 = 0$;
(47) $D\varrho_0 = Q_1\varrho_0$.
This gives
(48) $D^{m+d+1}(f^{1/2}F_{01}\varrho_{01}) = D^{m+d+1}(\Phi_0\varrho_0) = \Phi_{m+d+1}\varrho_0 = F_{m+d+1}r_0 = 0$.
In characteristic 0, this would be possible only for $\varrho_{01} = 0$ and then repeated
application of the differential equation (47) would yield $\varrho_{0m} = 0$ which is a
contradiction. But in characteristic p, things are not so easy.
Suppose that $\alpha$ is a root of f (not necessarily in $k_p$) such that $(x - \alpha)^m \mid \varrho_{01}$ in
the sense that $\varrho_{01}(x) = (x - \alpha)^mg(x)$, where the denominator of g(x) is not divisible
by $(x - \alpha)$. By (47) we see that
$(m-1)!\,\varrho_{0m}(x) = D^{m-1}\varrho_{01}(x)$.
Thus $\varrho_{0m}$ in reduced form contains a factor of $(x - \alpha)$ in its numerator which
contradicts the fact that $\varrho_{0m} = 1$. Therefore
(49) $(x - \alpha)^m \nmid \varrho_{01}$,
and in particular $\varrho_{01} \ne 0$.
Let a be the exact power of $x - \alpha$ contained in $\varrho_{01}$ in the sense that $\varrho_{01}(x)
= (x - \alpha)^ag(x)$, where neither the numerator nor the denominator of g is divisible
by $(x - \alpha)$. We know that $\varrho_{01} \ne 0$, $a < m$. Then
$f^{1/2}F_{01}\varrho_{01} = (x - \alpha)^{a+1/2}h(x)$,
where $h(x) = (f(x)/(x - \alpha))^{1/2}F_{01}(x)\,g(x) \ne 0$. Hence
$0 = D^{m+d+1}(f^{1/2}F_{01}\varrho_{01}) = \sum_{l=0}^{m+d+1}\binom{m+d+1}{l}D^l[(x - \alpha)^{a+1/2}]\,D^{m+d+1-l}h$
$= \sum_{l=0}^{m+d+1}\binom{m+d+1}{l}(a + \tfrac12)(a - \tfrac12)\cdots(a + \tfrac32 - l)(x - \alpha)^{a-l+1/2}D^{m+d+1-l}h$.
When we multiply this through by $(f/(x - \alpha))^{1/2}(x - \alpha)^{m+d-a+1/2}$ we get a sum of
the form
(50) $0 = \sum_{l=0}^{m+d+1} g_l\,(x - \alpha)^{m+d+1-l}$,
where each $g_l$ is a rational function without factors of $(x - \alpha)$ in the denominator
and
$2^{m+d+1}g_{m+d+1} = (2a + 1)(2a - 1)\cdots(2a - 2m - 2d + 1)\,(f/(x - \alpha))\,F_{01}\,g$.
But $(x - \alpha)$ does not divide the numerator of $(f(x)/(x - \alpha))F_{01}(x)g(x)$ and thus
the only way that (50) can hold is that $g_{m+d+1} = 0$, which implies either
(51) $2a + 1 \ge p$,
or
(52) $2a - 2m - 2d + 1 \le -p$.
But (51) cannot hold since it implies
$a \ge \tfrac12(p - 1) > \tfrac12 p^{1/2} \ge m$
which contradicts (49). Therefore $-a \ge \tfrac12(p + 1) - m - d$. Since this result is true
for all roots of f, it follows that $f^{(p+1)/2-m-d}$ divides the denominator of $\varrho_{01}$
when reduced. It follows from (46) that $f^{(p-1)/2-d} \mid s_m$ and therefore
(53) $n((p-1)/2 - d) \le \mu_m = (n-1)(m-1)(m + d + 1)$.
This inequality is false as we now proceed to show.
Consider the inequality
(54) $(g+1)(4t^2 - 1) > (2g+1)(t-1)(t+g+1) + 2(g+1)g$.
Clearly this is true for $t \to \infty$ and both sides are equal at $t = 0$. Hence if $t_0 > 0$ and
(54) is true for $t = t_0$ then it is true for $t > t_0$. But (54) is true for $t = g$ and indeed,
when $g \ge 2$, (54) is true for $t = g - \tfrac12$. In particular, since $p^{1/2} \ge 2g - 1$, $p \ge 5$, (54) is
true for $t = \tfrac12 p^{1/2}$. Now with $t = \tfrac12 p^{1/2}$, the inequality
(55) $u(\tfrac12(4t^2 - 1)) > (u-1)(t-1)(t+g+1) + ug$
is satisfied for $u = 2g + 2$ by (54) and clearly satisfied for $u = 0$ also ($t > 1$). Since (55) is
linear in u, it must also be satisfied for $u = 2g + 1$ and thus (55) is satisfied for $u = n$.
Now consider the inequality
(56) $n(\tfrac12(p-1)) > (n-1)(m-1)(t+g+1) + n(t+g) - nm$
which is satisfied for $m = t = p^{1/2}/2$ by (55) and for $m = 1$ since $p \ge 5$, $p^{1/2} \ge 2g - 1$
implies
$(p-1)/2 > (2p^{1/2} - 1)/2 \ge (p^{1/2} + 2g - 2)/2 = t + g - 1$.
Since (56) is linear in m, it is satisfied for $1 \le m \le t$. But now
$n((p-1)/2) > (n-1)(m-1)(m+d+1) + nd$
holds for $m + d = t + g$, $1 \le m \le t$ and clearly continues to hold when either m or d
are lowered. This contradicts (53) and proves Lemma 4.
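The conclusion that (53) fails throughout the range allowed by Lemma 4 can be spot-checked by exhaustive search over small primes. The sketch below is only an illustration: the bound 500 on p and the helper names are assumptions, and the conditions $m \le \tfrac12 p^{1/2}$ and $m + d \le \tfrac12 p^{1/2} + g$ are rewritten in integer form to avoid floating point.

```python
import math


def primes_up_to(limit):
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            for k in range(i * i, limit + 1, i):
                sieve[k] = False
    return [i for i, flag in enumerate(sieve) if flag]


def inequality_53_holds(n, p, m, d):
    """(53): n((p-1)/2 - d) <= (n-1)(m-1)(m+d+1), for odd p."""
    return n * ((p - 1) // 2 - d) <= (n - 1) * (m - 1) * (m + d + 1)


def admissible(p, g):
    """Yield (n, m, d) with p >= 5, sqrt(p) >= 2g - 1, 1 <= m <= sqrt(p)/2,
    m + d <= sqrt(p)/2 + g, and n = 2g + 1 or n = 2g + 2."""
    if p < 5 or (2 * g - 1) ** 2 > p:
        return
    for m in range(1, math.isqrt(p) // 2 + 1):      # 4 m^2 <= p
        d = 0
        # m + d - g <= sqrt(p)/2, written without square roots
        while m + d - g <= 0 or 4 * (m + d - g) ** 2 <= p:
            for n in (2 * g + 1, 2 * g + 2):
                yield n, m, d
            d += 1


violations = [(p, g, n, m, d)
              for p in primes_up_to(500)
              for g in range(1, 13)
              for n, m, d in admissible(p, g)
              if inequality_53_holds(n, p, m, d)]
```

An empty `violations` list is what the argument above predicts: no admissible triple satisfies (53).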
From this point on, r0 is given by (33).
Lemma 5. The vectors $r_0, r_1, \ldots, r_{m-1}$ are linearly independent over $k_p(x)$.
Proof. Our $m - 1$ equations for $r_0$ are
$F_{m+d+1+i}\,r_0 = 0$ ($0 \le i \le m - 2$).
By (37) these equations are equivalent to the equations
(57) $F_{m+d+1}r_i = 0$, $0 \le i \le m - 2$.
Suppose there is a dependence relation among $r_0, \ldots, r_{m-1}$. Then since $r_0$ is not
identically zero, for some J, $1 \le J \le m - 1$, we have
$r_J = \sum_{j=0}^{J-1} h_jr_j$
for some rational functions $h_j(x)$. If we apply $(D - Q)^{m-1-J}$ to this we get a relation
of the form $r_{m-1} = \sum_{j=0}^{m-2} g_jr_j$ with rational functions $g_j(x)$ (and indeed, although
it is not needed we could even have $g_j(x) = 0$ for $J \le j \le m - 2$). But this means (57)
holds for $i = m - 1$ also and thus by (37),
$F_{m+d+1+i}\,r_0 = 0$, $0 \le i \le m - 1$.
But by Lemma 4, this means $r_0$ is identically zero which is contrary to its definition
in Lemma 1. This proves Lemma 5.
We can now complete the proof of Lemma 2. By (28) and (37) we have for
$0 \le i \le m - 2$,
$DR_i(x) = R_{i+1}(x)$
and hence, for $0 \le i \le m - 1$,
$R_i(x) = D^iR_0(x)$.
If $R_0 = 0$ then $R_i = 0$ ($0 \le i \le m - 1$) and therefore by Lemma 5, $uF = 0$. Hence by
(25), $F_{m+d+1} = 0$ which violates Lemma 4 and completes the proof of Lemma 2.
6. Proof of Theorem 2. The case of even n is more complicated. We carry it
through in the case that $m = 1$. By $h(x) = O(x^a)$ we mean that h(x) has no terms $x^b$
with $b > a$ present. Let
$A = (n-1)(m + g/2)^2 + p(m + g/2) + [(g/2)p - (g^2/4)(n-1)]$;
this is the bound that we had in Lemma 3 on $\deg R_0$ when $d = g$. For the rest of this
section n is even.
Lemma 6. If $d = g$, then
$R_0(x) = (1 + f^{(p-1)/2})\,r_{0m}(x^p - x)^{m-1} + \frac{1}{(m+g)!}\sum_{j=1}^{m} F_{m+g,j}\,r_{0j}(x^p - x)^{m+g} + O(x^{A-p+1})$.
Proof. This is clear from the proof of Lemma 3.
Lemma 7. If $m = 1$, $d = g$ and n is even then $\deg R_0 \le A - (m + g/2)[1 - (a_0/p)]$.
Proof. Since n is even, we may expand $(a_0^{-1}f)^{1/2}$ in a formal series of the form
$(a_0^{-1}f)^{1/2} = \sum_{j=-\infty}^{n/2-1} b_jx^j + x^{n/2}$,
and thus from Lemma 6 with $m = 1$,
$R_0(x) = r_{01}\Bigl[1 + f^{(p-1)/2} + \frac{1}{(1+g)!}\,x^{np/2}F_{1+g,1}\Bigr] + O(x^{A-p+1})$.
But from (40),
$F_{1+g,1} = \dfrac{D^{1+g}[(a_0^{-1}f)^{1/2}]}{(a_0^{-1}f)^{1/2}} = \dfrac{(g+1)! + O(x^{-g-2})}{(a_0^{-1}f)^{1/2}}$,
and since for $2g - 1 \le p^{1/2}$, $5 \le p$ we have $g + 2 < p - 1$, we see that
$R_0(x) = r_{01}\,x^{(1+g)p}\,(a_0^{-1}f)^{-1/2}\,(a_0^{(p-1)/2} + 1) + O(x^{A-g-2})$.
The lemma follows.
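Lemma 8. If n is even, then
$2n_{-1} + n_0 - p - (a_0/p) \le 2g(1 + g/2) + \{(g/2)(p + 1 - n_0) - g^3/2\}(1 + g/2)^{-1}$.
Proof. With $m = 1$, $d = g$, we have, by (36) and Lemma 7,
$n_{-1}(2 + g) + n_0(1 + g) \le \deg R_0$
$\le A - (1 + g/2)^2 + (a_0/p)(1 + g/2) + (g/2 + g^2/4)$
$= 2g(1 + g/2)^2 + [p + (a_0/p)](1 + g/2) + [(g/2)(p + 1) - 2g\,g^2/4]$.
The lemma follows after dividing by $1 + g/2$.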
Theorem 2 follows from Lemma 8 in exactly the same way that Theorem 1
follows from Theorem 1'. In the case of even n and $m > 1$, we can improve the
estimate on $\deg R_0$ even if $(a_0/p) = 1$ and then in addition more terms cancel out when
$(a_0/p) = -1$. We will not go into this matter further here.
References
1. E. Artin, "Quadratische Körper im Gebiete der höheren Kongruenzen. I, II," in The collected
papers of Emil Artin, edited by S. Lang and J. T. Tate, Addison-Wesley, Reading, Mass., 1965,
pp. 1-94. MR 31 #1159.
2. H. Hasse, Zur Theorie der abstrakten elliptischen Funktionenkörper. I-III, J. Reine Angew. Math.
175 (1936), 55-62, 69-88, 193-208.
3. Ju. I. Manin, On cubic congruences to a prime modulus, Izv. Akad. Nauk SSSR Ser. Mat. 20
(1956), 673-678; English transl., Amer. Math. Soc. Transl. (2) 13 (1960), 1-7. MR 18, 380; MR 22
#3711.
4. S. A. Stepanov, On the number of points of a hyperelliptic curve over a finite prime field, Izv. Akad.
Nauk SSSR Ser. Mat. 33 (1969), 1171-1181 = Math. USSR Izv. 3 (1969), 1103-1114. MR 40 #5620.
5. A. Weil, Sur les courbes algébriques et les variétés qui s'en déduisent, Actualités Sci. Indust., no.
1041, Hermann, Paris, 1948. MR 10, 262.
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
CLASS NUMBERS OF
TOTALLY IMAGINARY FIELDS
JUDITH S. SUNLEY
In recent years the study of class numbers of imaginary quadratic number
fields has played an increasingly important role in number theory. These studies
have so far resulted in the determination of all imaginary quadratic fields with class
number one or two. The techniques used in these determinations are quite different
from the classical methods used in obtaining earlier results such as the Brauer-
Siegel theorem and the results of Heilbronn and Linfoot [2] and Tatuzawa [5].
At the present time, little progress has been made in determining imaginary
quadratic fields of higher class number.
This paper investigates a similar problem at a slightly different level. Let K be
any totally real algebraic number field of degree $n \ge 2$. The problem posed is to
determine all totally imaginary quadratic extensions of K having a given class
number. Although the classical techniques seem insufficient to answer the class
number problem in the case of imaginary quadratic fields, given the results in
imaginary quadratic fields the classical techniques do give an answer in many
cases to the same problem in the situation described above.
The major result is
Theorem. Let K be a fixed totally real algebraic number field and let L be a
totally imaginary quadratic extension of K having conductor $\mathfrak{f}_L$ and class number h.
Then there exists an effectively computable constant $c = c(K, h)$ such that $N\mathfrak{f}_L \le c$
with the possible exception of one field L.
Using this result with results of Goldstein [1] it is possible to give an effective
classification of all L having class number one where K is normal over Q. It is also
possible to classify those L having class number two when K has certain specified
characteristics.
AMS 1969 subject classifications. Primary 1065; Secondary 1068.
© 1973, American Mathematical Society
The proof of the major theorem draws on the proof by Tatuzawa [5] of a
similar statement for imaginary quadratic fields. It is based upon developing a
lower bound for $L(1, \chi)$ where $\chi$ is the real ideal character induced by the field L.
The result is then obtained by relating the Dedekind zeta functions of the two fields
and using the known algebraic relationships between the two fields.
The determination of a lower bound for $L(1, \chi)$ is made by forcing a certain
complex analytic function to have too many zeros close to $s = 1$. If more than one
$\chi$ of K gave a value $L(1, \chi)$ which was too small, this analytic function would have
impossible properties around $s = 1$. The one exception provides the complications
for imaginary quadratic fields, and it is only by making use of strong algebraic
properties that one is able to overcome this problem for some specialized cases
where the base field is a totally real field other than Q.
There are, of course, many drawbacks to the result. First of all, the constant
obtained in the result is quite large, on the order of $10^{21}$ even in the simplest cases
where the base field is a real quadratic field of small class number. This means
computation would be extremely difficult. Also, the one exception still creates
difficulties in the complete determination of all L's of class number h. Further
investigation of the possible exception is definitely in order. The conjecture would
be that the exception, if it exists, must be related to an imaginary quadratic field
of class number h.
These results have previously been announced in [3] and complete proofs
will appear shortly in [4].
Bibliography
1. Larry Goldstein, Relative imaginary quadratic fields of class number 1 or 2, Trans. Amer. Math.
Soc. 165 (1972), 353-364.
2. H. Heilbronn and E. H. Linfoot, On the imaginary quadratic corpora of class number one, Quart.
J. Math. Oxford Ser. 5 (1934), 293-301.
3. J. Sunley, On the class numbers of totally imaginary quadratic extensions of totally real fields,
Bull. Amer. Math. Soc. 78 (1972), 74-76.
4. J. Sunley, Class numbers of totally imaginary quadratic extensions of totally real fields, Trans.
Amer. Math. Soc. (to appear).
5. T. Tatuzawa, On a theorem of Siegel, Japan J. Math. 21 (1951), 163-178. MR 14, 452.
American University
EXPONENTIAL SUMS
AND THE RIEMANN CONJECTURE
PAUL TURAN
1. We are dealing in this paper exclusively with the Riemann zeta function
though the results can be extended to numerous other important cases mutatis
mutandis.
The starting point of this paper was a remark of Landau in his Handbuch from
1909. Denoting by $\varrho$'s the nontrivial zeros of $\zeta(s)$ ordered according to absolute
increasing ordinates he wrote l.c. the following lines. "Die Tatsache, dass $\sum x^{\varrho}/\varrho$
gerade in der Nähe der Primzahlen und höheren Primzahlpotenzen und sonst in
der Nähe keiner Stelle > 1 ungleichmässig konvergiert, deutet auf einen
arithmetischen Zusammenhang zwischen komplexen Wurzeln $\varrho$ der Zetafunktion und
den Primzahlen hin. Ich habe keine Ahnung, worin derselbe besteht." (In English:
"The fact that $\sum x^{\varrho}/\varrho$ converges nonuniformly precisely in the neighbourhood of
the primes and higher prime powers, and in the neighbourhood of no other point > 1,
points to an arithmetical connection between the complex roots $\varrho$ of the zeta
function and the primes. I have no idea what it consists of.") We would
more modestly ask whether the zeta-roots of some ranges and the primes of some
ranges influence each other particularly strongly, and if this is indeed the case
then how? As far as I know the first result in this direction was contained in my
paper in Izv. Akad. Nauk SSSR in 1947. This runs as follows.1
Theorem 1. Suppose the existence of constants $\alpha \ge 2$, $0 < \beta \le 1$ and a $c(\alpha, \beta)$ so
that for a $\tau > c(\alpha, \beta)$ the inequality
(1.1) $\Bigl|\sum_{N_1 \le p \le N_2}\exp(-i\tau\log p)\Bigr| \le \dfrac{N\log^{10}N}{\tau^{\beta}}$
holds for all integers $N_1$, $N_2$ with
(1.2) $\tau^{\alpha} \le N \le N_1 < N_2 \le 2N \le \exp(\tau^{\beta/10})$.
Then $\zeta(s) \ne 0$ on the segment ($s = \sigma + it$)
(1.3) $\sigma > 1 - e^{-10}\beta^3/\alpha^2$, $t = \tau$.
AMS 1970 subject classifications. Primary 10G05, 10H05; Secondary 10-02.
1 In what follows c stands for unspecified positive numerical, explicitly calculable constants;
if they depend on some parameters this will be explicitly stated, $\exp x$ stands throughout for $e^x$,
p for primes, $\pi(x)$ for the number of primes $\le x$.
© 1973, American Mathematical Society
This theorem is "local" in the sense that for a fixed, sufficiently large $\tau$ it gives
information about the distribution of zeros on the segment $0 < \sigma < 1$, $t = \tau$, from a
property of the exponential sums
$\sum_{N_1 \le p \le N_2}\exp(-i\tau\log p)$
which involves a set of primes depending on $\tau$ only.
It is useful to give to this theorem a somewhat weaker "semiglobal form"
where the conclusion refers to the distribution of zeros in a "small" parallelogram.
More exactly we state
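Theorem 1'. Suppose the existence of constants $\alpha \ge 2$, $0 < \beta \le 1$, $0 < E \le 9/10$
and $c(\alpha, \beta)$ so that for a $T > c(\alpha, \beta)$ the inequality
(1.4) $\Bigl|\sum_{N_1 \le p \le N_2}\exp(-i\tau\log p)\Bigr| \le \dfrac{N\log^{10}N}{\tau^{\beta}}$
holds for
(1.5) $T^{\alpha} \le N \le N_1 < N_2 \le 2N \le \exp(T^{\beta/10})$, $T - T^E \le \tau \le T + T^E$.
Then $\zeta(s) \ne 0$ on the parallelogram
(1.6) $\sigma > 1 - e^{-10}\beta^3/\alpha^2$, $T - T^E \le t \le T + T^E$.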
These theorems might be compared with the following two theorems of a
converse character.
Theorem 2. Suppose the existence of constants $\vartheta$ and E with $\tfrac12 \le \vartheta < 1$,
$0 < E \le 9/10$ such that for a $\tau$ with $\tau > 100$, say, $\zeta(s)$ is nonvanishing in the parallelogram
$\sigma > \vartheta$, $\tau - \tau^E \le t \le \tau + \tau^E$. Then the inequality
£ exp(-rilogp)
<c-
N\og10N
holds if $\tau^{E/(1-\vartheta)} \le N \le N_1 < N_2 \le 2N$.
Theorem 2'. Suppose the existence of constants $\vartheta$ and E with $\tfrac12 \le \vartheta < 1$ and
$0 < E \le 9/10$ such that for a $T > 200$, say, $\zeta(s) \ne 0$ in the parallelogram $\sigma > \vartheta$,
$T - T^E \le t \le T + T^E$. Then the inequality
£ exp(-iilogp)
Ni^p^N2
<C
N\og10N
holds if $T - \tfrac12 T^E \le \tau \le T + \tfrac12 T^E$, $T^{E/(1-\vartheta)} \le N \le N_1 < N_2 \le 2N$.
Further we mention the "global" form of these theorems which refer to half-
planes.
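Theorem 1''. Suppose the existence of constants $\alpha \ge 2$, $0 < \beta \le 1$ and $c(\alpha, \beta)$ so
that for all $\tau > c(\alpha, \beta)$ the inequality
(1.7) $\Bigl|\sum_{N_1 \le p \le N_2}\exp(-i\tau\log p)\Bigr| \le \dfrac{N\log^{10}N}{\tau^{\beta}}$
holds for
(1.8) $\tau^{\alpha} \le N \le N_1 < N_2 \le 2N \le \exp(\tau^{\beta/10})$.
Then the half plane $\sigma > 1 - e^{-10}\beta^3/\alpha^2$ contains at most finitely many zeros of
$\zeta(s)$.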
Theorem 2''. Suppose the existence of a constant $\tfrac12 \le \vartheta < 1$ so that $\zeta(s) \ne 0$ for
$\sigma > \vartheta$. Then for $\tau > c(\vartheta)$ the inequality
£ exp(-/ilogp)
Nt^p^N2
<C
N\og10N
holds if $\tau^{1/(1-\vartheta)} \le N \le N_1 < N_2 \le 2N$.
We emphasize a curious consequence of the "global" Theorem 1''. Suppose
the truth of (1.7)-(1.8). Then Theorem 1'' implies the existence of a constant $\Theta < 1$
such that $\zeta(s)$ has no zeros in $\sigma > \Theta$. But this gives in turn the inequality
(1.9) $\Bigl|\pi(x) - \int_2^x \dfrac{dv}{\log v}\Bigr| < c\,x^{\Theta}\log x$
(and in the case $\Theta > \tfrac12$ even the log factor can be dropped). Hence (1.7)-(1.8) imply
(1.9). That (1.9) implies (1.7)-(1.8) with some positive constants $\alpha \ge 2$ and $0 < \beta \le 1$ is
trivial. The interest of these remarks lies in the fact that all references to zeta roots
are eliminated now; the error term in the prime number formula is made dependent
directly on an estimation of the finite exponential sums in (1.7). The above
deduction however uses zeta-roots; it would be very interesting to find a proof for it
directly, without using the $\zeta$-function. This problem was stated already in my
Izvestiya paper in 1947, without receiving the slightest attention so far.
2. These results - though they bring out clearly the crucial role of the
exponential sums in (1.7) - raise several questions. We mention only two.
Problem 1. In Theorems 1" and 2" the occurring half planes are not the same.
Can they be made identical? If yes, what is the exact form?
Scrutinizing the mutual influence of primes and zeta roots it is natural to ask
whether or not in the local Theorem 1 it is not enough to require (1.1) for a much
narrower range than the one in (1.2) and still having the conclusion (1.3). Thus we
came to the following general
Problem 2. To improve the localisations.
3. As for Problem 1, let us define the "abscissa $\sigma_w$ of quasizerofreeness of
$\zeta(s)$" as follows. For arbitrarily small $\varepsilon > 0$ the halfplane
$\sigma > \sigma_w + \varepsilon$
contains at most finitely many zeros of $\zeta(s)$, whereas the halfplane
$\sigma > \sigma_w - \varepsilon$
contains infinitely many. In the German edition of my book Eine neue Methode der
Analysis und ihre Anwendungen, I settled this question at least supposing the
Lindelöf hypothesis
(3.1) $\limsup_{t \to +\infty}\dfrac{\log|\zeta(\tfrac12 + it)|}{\log t} = 0$.
It turned out that in this case
(3.2) $\sigma_w = 1 - \limsup \beta/\alpha$
where $\alpha$, $\beta$ run over all pairs of constants with
(3.3) $\alpha \ge 2\beta$, $0 < \beta \le 1$
for which (1.7)-(1.8) hold. I returned to this question in 1957 when I deduced the
formula (3.2)-(3.3) from the "weak Lindelöf hypothesis" which amounts to the
fact that the Lindelöf $\mu(\sigma)$-function of $\zeta(s)$ (which is continuous and convex) is for
$\tfrac12 < \sigma \le 1$ everywhere differentiable (actually one needs even less). The proof was
given with all details in the Sammelband zu Ehren des 250. Geburtstages Leonhard
Eulers (Akademie-Verlag, Berlin, 1959); this result as well as another one from 1958
in Acta Arithmetica, in which I deduced the density hypothesis from it, remained
rather unnoticed. An unconditional proof of (3.2)-(3.3) would be highly desirable.
For the sake of orientation I remark that for the sum
$S = \sum_{N_1 \le n \le N_2}\exp(-i\tau\log n)$, $\tau \ge 2$
the elementary formula
$|(\nu + 1)^{1+i\tau} - \nu^{1+i\tau} - (1 + i\tau)\nu^{i\tau}| \le \tau^2/\nu$
gives at once
$|S| \ll N/\tau + \tau\log N \ll (N\log N)/\tau$
if $N \ge \tau^2$. The really significant sum of this type is
$S_1 = \sum_{n \le N}(-1)^{n+1}\exp(-i\tau\log n)$
since the inequality
$|S_1| \ll N^{1/2+\varepsilon}(2 + |\tau|)^{\varepsilon}$
is necessary and sufficient for the truth of the Lindelöf conjecture in (3.1).
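The elementary estimate for the sum S over consecutive integers can be made concrete. The sketch below checks the remainder inequality numerically and compares $|S|$ with an explicit bound obtained by telescoping, namely $(N_1 + N_2 + 1)/\tau + \tau\sum 1/\nu$. The explicit constants are a derivation made here for illustration, not a formula quoted from the text, and the sample values of $\tau$, $N_1$, $N_2$ are arbitrary.

```python
import cmath
import math


def remainder(nu, tau):
    """|(nu+1)^(1+i tau) - nu^(1+i tau) - (1+i tau) nu^(i tau)|."""
    s = complex(1.0, tau)
    return abs((nu + 1) ** s - nu ** s - s * nu ** (s - 1))


def exponential_sum(tau, n1, n2):
    """S = sum over n1 <= n <= n2 of exp(-i tau log n)."""
    return sum(cmath.exp(-1j * tau * math.log(n)) for n in range(n1, n2 + 1))


def telescoping_bound(tau, n1, n2):
    """Summing the elementary formula over nu and dividing by |1 + i tau| >= tau
    gives |S| <= (n1 + n2 + 1)/tau + tau * sum of 1/nu."""
    return (n1 + n2 + 1) / tau + tau * sum(1.0 / n for n in range(n1, n2 + 1))


tau = 50.0
n1, n2 = 2500, 5000                      # N = tau^2 <= N1 < N2 <= 2N
remainder_ok = all(remainder(nu, tau) <= tau * tau / nu
                   for nu in range(1, 6000, 7))
size_of_sum = abs(exponential_sum(tau, n1, n2))
bound = telescoping_bound(tau, n1, n2)
```

With these values the bound is far below the trivial estimate $N_2 - N_1 + 1$, which is the saving of order $\tau$ described above.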
4. Next we turn to the second problem. In a lecture published in Number
Theory (Colloq., Janos Bolyai Math. Soc., Debrecen, 1968), North-Holland,
Amsterdam, 1970, I reduced the interval (1.3) in Theorem 1 to $(\tau^2, \tau^8)$. While
drafting an English edition of my book in May and June of 1971 in Ann Arbor I
found that a still stronger localisation is possible. We assert the
Theorem 3. Let D>0 be fixed. Suppose that for a Sq = S0(D)andaO<p^S0
there is a t0 = t0(/?, D) such that for a t^t0 the inequality
£ exp(-/Tlogp)|
Ni^p^N2
<C
AMog10iV
holds for
TD(l-dfil/6)^N^Nl<N2^2N^TDil+dfil/6K
Then for the same $\tau$, $\zeta(s) \ne 0$ on the segment $\sigma > 1 - \beta^2$, $t = \tau$.
This theorem is in our terminology a "local" theorem with an unexpectedly
strong localisation. In order to draw another surprising conclusion we formulate
first, as a trivial consequence of Theorem 3, the "semiglobal"
Theorem 3'. LetD>0and0<E^9/\0 be fixed. Suppose that for a 5$ = £$ (D)
and a 0<fl^d$ there is a t§ = t§(/J, D) such that for a T^t$ and for all z's with
T-Te^t<LT+ TE the inequality
£ exp(-fTlogp)|
Ni^p^N2
<C
AMog10iV
holds for
TDii-pl/6)^N^Nl<N2^2N^TDil+pl/6).
Then C{s)^0for the parallelogram
<7>l-j32, T-TE^t^T+TE.
This formulation helps to explain the puzzling role of the constant D in
Theorem 3. Let us namely apply Theorems 3' and 2' successively. Then we get the
following corollary which we formulate owing to its independent interest as
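Theorem 4. If for a $D>0$ and $0<E\le 9/10$, for $\delta_0^{*}(D)$ and a $0<\beta\le\delta_0^{*}$, there is a
$\tau_0^{*}=\tau_0^{*}(D,\beta)$ so that for a $T\ge\tau_0^{*}>200$ and all $\tau$'s with
(4.1)    $T-T^{E}\le\tau\le T+T^{E}$
and
(4.2)    $\tau^{D(1-\beta^{1/6})}\le N\le N_1<N_2\le 2N\le\tau^{D(1+\beta^{1/6})}$
the inequality
(4.3)    $\Bigl|\sum_{N_1\le p\le N_2}\exp(-i\tau\log p)\Bigr|<c\,\dfrac{N\log^{10}N}{\tau^{\beta}}$
holds, then for all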
(4.4)
(4.5)
we have also the inequality
TE/fi2^M^Ml<M2S2M,
T-$Te^t^T + $Te,
(4.6)
£ exp(-iTlogp)
Mi^p^M2 I
<C
Mlog5M
The content of this theorem can be described in the following way: If the
inequality (4.3) is satisfied in the $\tau$-range (4.1) and prime-range (4.2) then the
inequality (4.6) of the same type holds for the somewhat smaller $\tau$-range (4.5) and
for the prime-range (4.4) which is unbounded from above (remarking, of course,
that for $M>\exp(T^{E/5})$ (4.6) is trivial). In the text of Theorem 4 only primes occur,
and no reference to $\zeta(s)$; it would be highly interesting to find a proof of it without
recourse to $\zeta(s)$.
5. Hence the interest of Theorem 4 lies in the fact that it throws some light on
the behaviour of the exponential sums $\bigl|\sum_{N_1\le p\le N_2}\exp(-i\tau\log p)\bigr|$ though at
present there are no known ways for their nontrivial direct investigation (apart from
the fact that it could be shown that each sum is "small" indeed, apart from a "small"
set, thus opening a new approach to the so-called density hypothesis as early as
1949). Now there are several other finite exponential sums whose absolute values
can be estimated nontrivially by the trigonometrical sieve method of I. M. Vinogradoff,
which is independent of the theory of $\zeta(s)$. Such sums are e.g.
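(5.1)    $S_{\gamma}(\tau,N_1,N_2)=\sum_{N_1\le p\le N_2}\exp(-i\tau\log^{\gamma}p),\qquad \tfrac12\le\gamma\le 2,\ \gamma\ne 1,$
and
(5.2)    $G(\alpha)=\sum_{N_1\le p\le N_2}\exp(ip\alpha);$
we shall confine ourselves to the former one. The sieve method gives for this the
inequality
(5.3)    $|S_{\gamma}(\tau,N_1,N_2)|<c\,\dfrac{N\log^{20}N}{\tau^{1/2}}$
for
(5.4)    $(2^{10}\le)\ \tau^{10}\le N\le N_1<N_2\le 2N,$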
which is highly nontrivial. Hence it was plausible to investigate the connection of
these sums with the quasi-Riemann conjecture. I proved in this direction in the
early 1950's the following two theorems (cf. my book Eine neue Methode...,
pp. 143-147).
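Theorem 5. For a given $\tfrac12\le\gamma\le 2$, $\gamma\ne 1$, suppose the existence of constants
$\alpha\ge 2$, $0<\eta\le 2$ and $c(\alpha,\eta,\gamma)$ so that for all $\tau>c(\alpha,\eta,\gamma)$ the inequality
(5.5)    $|S_{\gamma}(\tau,N_1,N_2)|\le(N\log^{10}N)/\tau^{1/2+\eta}$
holds for
(5.6)    $\tau^{\alpha}\le N\le N_1<N_2\le 2N.$
Then the halfplane $\sigma>1-e^{-10}\eta^{3}/(\alpha+1)^{2}$ contains at most finitely many zeros of
$\zeta(s)$.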
Theorem 6. Suppose the existence of a constant $\tfrac12\le\vartheta<1$ so that $\zeta(s)\ne 0$ for
$\sigma>\vartheta$. Then for all $\tfrac12\le\gamma\le 2$ and $\tau>c(\vartheta)$ the inequality
(5.7)    $|S_{\gamma}(\tau,N_1,N_2)|\le c_1(\vartheta)\,(N\log^{4}N)/\tau$
holds if
(5.8)    $\tau^{2/(1-\vartheta)}\le N\le N_1<N_2\le 2N.$
Again Theorem 5 is more difficult than Theorem 6. These theorems - compared
with Theorems 1" and 2" - reveal a curious situation. In (1.7) an arbitrarily small
positive constant exponent $\beta$ of $\tau$ implies the existence of a nontrivial zerofree
halfplane, and (1.7) is trivial for $\beta=0$; in order to achieve the same aim in (5.5), the
exponent of $\tau$ must be greater than $\tfrac12$ however close $\gamma$ is to 1 from either side, and
(5.5) is (for fixed $\gamma\ne 1$) nontrivial even for $\eta=0$ (but proved unconditionally). It is
not known whether this curious discontinuity of the "critical" exponent $\beta$ with
respect to $\gamma$ at $\gamma=1$ is due only to the weakness of our "transition formulae",
nor whether an improvement of the trigonometrical sieve can lead to the factor
$\tau^{1/2+\eta}$ with a positive $\eta$ instead of $\tau^{1/2}$ in (5.3).
6. Theorems 5 and 6 are in our terminology "global" theorems; it is reasonable
to ask for their "local" or "semiglobal" forms. One analogue of Theorem 5 - given
for the sake of simplicity only for $\gamma=\tfrac12$ - is
Theorem 7. Let $D_1>0$ be fixed. Suppose that for a $\delta_1=\delta_1(D_1)$ and an
$0<\eta\le\delta_1$ there is a $\tau_1=\tau_1(\eta,D_1)$ such that for a $\tau\ge\tau_1$ the inequality
(6.1)    $\Bigl|\sum_{N_1\le p\le N_2}\exp\bigl(-ir(\log p)^{1/2}\bigr)\Bigr|<c\,\dfrac{N\log^{10}N}{\tau^{1/2+\eta}}$
holds for
(6.2)    $\tau^{D_1(1-\eta^{1/6})}\le N\le N_1<N_2\le 2N\le\tau^{D_1(1+\eta^{1/6})}$
and
(6.3)    $\bigl(D_1(1-2\eta^{1/6})\bigr)^{1/2}\tau(\log\tau)^{1/2}\le r\le\bigl(D_1(1+2\eta^{1/6})\bigr)^{1/2}\tau(\log\tau)^{1/2}.$
Then for the same $\tau$, $\zeta(s)\ne 0$ on the segment $\sigma>1-\eta^{2}$, $t=\tau$.
This theorem is a "local" one in the sense that it implies the nonexistence of
$\zeta$-zeros on a segment. It is a bit weaker than Theorem 3 in the sense that the
inequality (6.1) is required for the $r$-interval in (6.3) whereas in Theorem 3 it is
required only for $r=\tau$. This fact no doubt lends additional interest to Theorem 3.
It would be trivial to formulate the "semiglobal" form of Theorem 7; we shall
omit it. It would be more interesting to find the semiglobal form of Theorem 6,
even with $\tau^{1/2+\eta}$ instead of $\tau$ in (5.7) and thus to find the analogue of Theorem 4
for the sum $S_{1/2}(\tau,N_1,N_2)$; for this I have so far no proof. However there is no
doubt that the analogue of Theorem 7 can be proved for the exponential sum
$G(\alpha)$ in (5.2), which is so important in the additive theory of primes; the analogues
of Theorems 5 and 6 for $G(\alpha)$ were already proved in my book Eine neue Methode
der Analysis und ihre Anwendungen.
The proofs of the new Theorems 3 and 7 are too long to be included in this
volume; at least one of them will be inserted in the forthcoming English edition
of the above-mentioned book in the Interscience Tracts series.
Mathematical Institute of the Hungarian Academy of Sciences
Budapest, Hungary
A NEW ESTIMATE FOR THE EXCEPTIONAL
SET IN GOLDBACH'S PROBLEM
ROBERT C. VAUGHAN
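Let $E(X)=\{2m: 2m\le X,\ 2m\ne p_1+p_2\}$. Goldbach conjectured that $E(\infty)=\{2\}$.
Tchudakoff, van der Corput and Estermann all noticed independently, after
Vinogradov's three primes theorem appeared in 1937, that
$$|E(X)|\ll_{A}X\log^{-A}X.$$
If
$$R_s(n)=|\{(p_1,\dots,p_s): p_1+\dots+p_s=n\}|,$$
$$J_s(n)=\sum_{n_1+\dots+n_s=n}(\log n_1\cdots\log n_s)^{-1}$$
and
$$\mathfrak S_s(n)=\prod_{p\nmid n}\Bigl(1-\Bigl(\frac{-1}{p-1}\Bigr)^{s}\Bigr)\prod_{p\mid n}\Bigl(1-\Bigl(\frac{-1}{p-1}\Bigr)^{s-1}\Bigr),$$
they all showed that
$$\sum_{n\le X}|R_2(n)-J_2(n)\mathfrak S_2(n)|^{2}\ll_{A}X^{3}\log^{-A}X.$$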
I can show by essentially the same method that
(1)    $\sum_{n\le X}|R_2(n)-J_2(n)\mathfrak S_2(n)+D(n,X)|^{2}\ll X^{3}\exp(-c_1\log^{1/2}X),$
where $D$ is a term corresponding to a possible exceptional 'Siegel' zero of $L$-functions.
The proof of (1) is much too complicated to give here. From this estimate
I can show that
(2)    $|E(X)|\ll X\exp(-c_2\log^{1/2}X).$
The main difficulty is that if $q$ is the 'exceptional modulus' and $q$ is large, then
$D(n,X)$ can closely imitate the behaviour of $J_2(n)\mathfrak S_2(n)$. However, one knows
that $q\le\exp(c_3\log^{1/2}X)$, and I can show that for all but $X\exp(-c_4\log^{1/2}X)$
values of $n\le X$ we have
$$J_2(n)\mathfrak S_2(n)-D(n,X)>J_2(n)\mathfrak S_2(n)\,q^{-\delta}$$
for a suitable small positive number $\delta$. (2) then follows quite easily from (1). For
further details I refer interested persons to my forthcoming paper in Acta Arithmetica.
AMS 1970 subject classifications. Primary 10315, 10J10; Secondary 10B35, 10L05, 10L15.
© 1973, American Mathematical Society
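For small $X$ the exceptional set $E(X)$ can simply be enumerated. The sketch below is an illustration only: the helper names `primes_up_to` and `exceptional_set` are assumptions introduced here, and the brute-force search has nothing to do with the analytic method of the paper. It lists the even $2m\le X$ that are not a sum of two primes.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # Strike out p*p, p*p + p, ... (all composite multiples of p).
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if is_prime[n]]


def exceptional_set(x):
    """Return the even numbers 2m <= x that are not of the form p1 + p2."""
    primes = primes_up_to(x)
    prime_set = set(primes)
    exceptions = []
    for n in range(2, x + 1, 2):
        representable = False
        for p in primes:
            if p > n // 2:
                break
            if (n - p) in prime_set:
                representable = True
                break
        if not representable:
            exceptions.append(n)
    return exceptions


if __name__ == "__main__":
    print(exceptional_set(1000))
```

In this small range the only exception found is 2, in agreement with Goldbach's conjecture $E(\infty)=\{2\}$.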
I thought it might be of interest to discuss here some joint work with Hugh
Montgomery, which is connected with the above. Following I. M. Vinogradov
it is known that, for all positive $A$,
$$R_3(n)-J_3(n)\mathfrak S_3(n)\ll_{A}n^{2}\log^{-A}n.$$
Also, on the generalised Riemann hypothesis the error is known to be <^£ n1/4+£.
One might conjecture that the error term is
$$\ll(\text{main term})^{1/2+\varepsilon},$$
but this is false. More generally we can show that
$$R_s(n)-J_s(n)\mathfrak S_s(n)=\Omega(n^{s-3/2}\log^{-s}n)$$
and if
$$r_s(n)=\sum_{n_1+\dots+n_s=n}\Lambda(n_1)\cdots\Lambda(n_s),$$
then
$$r_s(n)-(-1)^{n}\binom{-s}{n}\mathfrak S_s(n)=\Omega_{\pm}(n^{s-3/2}).$$
These results are not very deep, and depend only on classical methods in prime
number theory. However, we can also show that
$$R_2(n)-J_2(n)\mathfrak S_2(n)=\Omega(n^{1/2}\log^{-1}n)$$
and
(3)    $r_2(n)-n\mathfrak S_2(n)=\Omega(n^{1/2}\log n).$
For the rest of my discourse I shall give the proof of (3). We first of all require
a lemma.
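Lemma. Let
$$\mathfrak S(n,m)=\sum_{q=1}^{m}\frac{\mu(q)^{2}}{\varphi(q)^{2}}\sum_{\substack{a=1\\(a,q)=1}}^{q}e(-an/q).$$
Then
(4)    $\mathfrak S_2(n)-\mathfrak S(n,m)\ll m^{-1}(\log\log m)^{2}d(n)\qquad(n>0).$
Proof. It is well known that $\mathfrak S_2(n)=\mathfrak S(n,\infty)$ and
$$\sum_{\substack{a=1\\(a,q)=1}}^{q}e(-an/q)=\sum_{r\mid q;\ r\mid n}\mu\Bigl(\frac{q}{r}\Bigr)r,$$
so that
$$\mathfrak S_2(n)-\mathfrak S(n,m)\ll\sum_{r\mid n}r\sum_{\substack{q=m+1\\ r\mid q}}^{\infty}\frac{(\log\log q)^{2}}{q^{2}}\ll m^{-1}(\log\log m)^{2}d(n),$$
which proves (4).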
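The Lemma can be illustrated numerically for $s=2$. The Python sketch below is a hedged illustration, not the author's computation: the function names are assumptions introduced here, the inner exponential sum is evaluated through the divisor formula quoted in the proof, and the Euler product for the singular series is truncated at a finite prime bound.

```python
from math import gcd


def mobius_phi(limit):
    """Return lists mu[0..limit] and phi[0..limit] computed by a sieve."""
    mu = [1] * (limit + 1)
    phi = list(range(limit + 1))
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(p, limit + 1, p):
                if multiple > p:
                    is_prime[multiple] = False
                mu[multiple] = -mu[multiple]
                phi[multiple] -= phi[multiple] // p
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0
    return mu, phi


def ramanujan_sum(q, n, mu):
    """Sum of e(-a*n/q) over a coprime to q, via sum_{r | (q, n)} mu(q/r) * r."""
    g = gcd(q, n)
    return sum(mu[q // r] * r for r in range(1, g + 1) if g % r == 0)


def partial_singular_series(n, m):
    """The truncated series S(n, m) of the Lemma."""
    mu, phi = mobius_phi(m)
    total = 0.0
    for q in range(1, m + 1):
        if mu[q] != 0:  # mu(q)^2 = 1 exactly for squarefree q
            total += ramanujan_sum(q, n, mu) / phi[q] ** 2
    return total


def singular_series_product(n, prime_bound):
    """Euler product for the two-prime singular series, primes up to prime_bound."""
    _, phi = mobius_phi(prime_bound)
    product = 1.0
    for p in range(2, prime_bound + 1):
        if phi[p] == p - 1:  # p is prime
            if n % p == 0:
                product *= 1.0 + 1.0 / (p - 1)
            else:
                product *= 1.0 - 1.0 / (p - 1) ** 2
    return product


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, partial_singular_series(6, m), singular_series_product(6, 10000))
```

For odd $n$ the product vanishes because of the factor at $p=2$, while for even $n$ the partial sums approach the product as $m$ grows, as (4) predicts.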
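Now suppose that $0<\varrho<1$. Then, by Parseval's theorem and Schwarz's
inequality,
$$\sum_{n=0}^{\infty}\varrho^{2n}|r_2(n)-(n+1)\mathfrak S(n,m)|^{2}=\int_0^1\Bigl|\sum_{n=0}^{\infty}\varrho^{n}e(\alpha n)\bigl(r_2(n)-(n+1)\mathfrak S(n,m)\bigr)\Bigr|^{2}d\alpha\ge T^{2},$$
where
$$T=\int_0^1\Bigl|\sum_{n=0}^{\infty}\varrho^{n}e(\alpha n)\bigl(r_2(n)-(n+1)\mathfrak S(n,m)\bigr)\Bigr|\,d\alpha.$$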
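It is easily seen that $T\ge T_1-T_2$, where
$$T_1=\int_0^1\Bigl|\sum_{n=1}^{\infty}\Lambda(n)\varrho^{n}e(n\alpha)\Bigr|^{2}d\alpha=\sum_{n=1}^{\infty}\varrho^{2n}\Lambda(n)^{2}$$
and
$$T_2=\int_0^1\Bigl|\sum_{n=0}^{\infty}\sum_{q=1}^{m}\frac{\mu(q)^{2}}{\varphi(q)^{2}}\sum_{\substack{a=1\\(a,q)=1}}^{q}(n+1)\varrho^{n}e\bigl(n(\alpha-a/q)\bigr)\Bigr|\,d\alpha$$
$$\le\sum_{q=1}^{m}\frac{\mu(q)^{2}}{\varphi(q)^{2}}\sum_{\substack{a=1\\(a,q)=1}}^{q}\int_0^1\Bigl|\sum_{n=0}^{\infty}(n+1)\varrho^{n}e\bigl(n(\alpha-a/q)\bigr)\Bigr|\,d\alpha=\sum_{q=1}^{m}\frac{\mu(q)^{2}}{\varphi(q)}\sum_{n=0}^{\infty}\varrho^{2n}.$$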
Let $\varrho^{2}=1-1/X$ and $m=[X^{c}]$ with $c$ a constant such that $0<c<1$. $c$ will be
at our disposal later on. Then we have
$$T_1=X(\log X+O(1))$$
and
$$T_2\le X(\log m+O(1))=X(c\log X+O(1)).$$
Hence $T\ge X((1-c)\log X+O(1))\gg X\log X$, so that
$$\sum_{n=0}^{\infty}\varrho^{2n}|r_2(n)-(n+1)\mathfrak S(n,m)|^{2}\gg X^{2}\log^{2}X.$$
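We now simplify our expression. We have, by the Lemma, for $n>0$,
$$|r_2(n)-(n+1)\mathfrak S(n,m)|^{2}\ll|r_2(n)-n\mathfrak S_2(n)|^{2}+\mathfrak S_2(n)^{2}+n^{2}d(n)^{2}(\log\log m)^{4}m^{-2}.$$
Clearly
$$\sum_{n=1}^{\infty}\varrho^{2n}\mathfrak S_2(n)^{2}\ll\sum_{n=1}^{\infty}\varrho^{2n}n\ll X^{2},$$
and
$$\sum_{n=1}^{\infty}\varrho^{2n}n^{2}d(n)^{2}(\log\log m)^{4}m^{-2}\ll(\log\log m)^{4}m^{-2}X^{3}\log^{3}X,$$
which is $\ll X^{2}$ provided we choose $c$ so that $1/2<c<1$, say $c=3/4$.
Finally
$$|r_2(0)-\mathfrak S(0,m)|=\sum_{q=1}^{m}\frac{\mu(q)^{2}}{\varphi(q)}\ll\log m\ll X.$$
Hence
$$\sum_{n=1}^{\infty}\varrho^{2n}|r_2(n)-n\mathfrak S_2(n)|^{2}\gg X^{2}\log^{2}X=\Bigl(\frac{1}{1-\varrho^{2}}\Bigr)^{2}\log^{2}\frac{1}{1-\varrho^{2}}$$
and (3) follows easily.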
References
T. Estermann, On Goldbach's problem: Proof that almost all even positive integers are sums of
two primes, Proc. London Math. Soc. (2) 44 (1938), 307-314.
H. L. Montgomery and R. C. Vaughan, Error terms in additive prime number theory, Oxford
Quart. J. (to appear).
N. G. Tchudakoff, On the density of the set of even numbers which are not representable as a sum of
two odd primes, Izv. Akad. Nauk SSSR Ser. Mat. 2 (1938), 25-40.
J. G. van der Corput, Sur l'hypothèse de Goldbach pour presque tous les nombres pairs, Acta Arith.
2 (1937), 266-290.
R. C. Vaughan, On Goldbach's problem, Acta Arith. 22 (1972), 21-48.
I. M. Vinogradov, Representation of an odd number as a sum of three primes, C. R. (Dokl.) Acad. Sci.
URSS 15 (1937), 169-172.
I. M. Vinogradov, Some theorems concerning the theory of primes, Mat. Sb. 2 (44) (1937), 179-195.
The University
Sheffield, Great Britain
ON EUCLIDEAN RINGS OF
ALGEBRAIC INTEGERS
PETER J. WEINBERGER
1. Let $R$ be a commutative ring with unit. Then $R$ is said to be Euclidean if
there is a function $E\colon R\to\{0,1,2,\dots\}$ such that $E(0)=0$ and $\forall a,b\in R$, $a\ne 0$, $\exists q,c$
with $b=qa+c$, $E(c)<E(a)$. The following test may be used to determine if a ring $R$
is Euclidean, and then to construct the function $E$. Let $E_0=\{0\}$, and define $E_j$, $j\ge 1$,
by $E_j-E_{j-1}=\{a\in R:$ each residue class of $R/(a)$ contains an element of $E_{j-1}\}$.
Note that $E_1-E_0$ is the set of units of $R$. If
(1.1)    $R=\bigcup_{j}E_j,$
then $R$ is Euclidean, with $E(a)=\min\{j: a\in E_j\}$. Motzkin [5] proves that the
condition (1.1) is also necessary if R is to be Euclidean. The reader is further referred to
the recent and readable paper of Samuel [6].
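The construction of the sets $E_j$ can be carried out mechanically in the simplest case $R=\mathbf Z$. The Python sketch below is offered only as an illustration under stated assumptions: $\mathbf Z$ is chosen because its residue classes are easy to enumerate (it is not one of the rings with infinitely many units treated in this paper), the search is restricted to a finite window $|a|\le$ `bound`, and the name `motzkin_levels` is introduced here.

```python
def motzkin_levels(bound, max_level=64):
    """For each integer a with |a| <= bound, return the least j with a in E_j.

    E_0 = {0}; a nonzero a enters at level j when every residue class
    modulo a contains an element of E_{j-1}.
    """
    level = {0: 0}
    current = {0}  # the elements of E_{j-1} found so far
    for j in range(1, max_level + 1):
        new = set()
        for a in range(-bound, bound + 1):
            if a in level:
                continue
            modulus = abs(a)
            residues = {e % modulus for e in current}
            if len(residues) == modulus:  # every class mod a is represented
                new.add(a)
        if not new:
            break
        for a in new:
            level[a] = j
        current |= new
    return level


if __name__ == "__main__":
    levels = motzkin_levels(20)
    for a in range(1, 21):
        print(a, levels[a])
```

In this window the level of a nonzero $a$ comes out as the number of binary digits of $|a|$: the units $\pm1$ appear at level 1, $\pm2,\pm3$ at level 2, and $\pm4,\dots,\pm7$ at level 3.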
This paper is entirely concerned with those rings which are the rings of integers
of an algebraic number field, and which are also principal ideal domains
(abbreviated p.i.d.). This last condition is no restriction, since every Euclidean ring
is a p.i.d. Of these rings it is known that those in $Q((-19)^{1/2})$, $Q((-43)^{1/2})$,
$Q((-67)^{1/2})$, $Q((-163)^{1/2})$ are not Euclidean [6]. The other rings of the type
considered in this paper, which are known to be Euclidean, all have the absolute
value of the norm as their Euclidean function $E$. ($E(a)=|N(a)|$. The unadorned
symbol N will always denote the absolute norm.) The purpose of this paper is to
prove the following theorem.
Theorem 1. Let K be an algebraic number field whose ring of integers both is
a p.i.d. and has infinitely many units. Then a generalized Riemann hypothesis
(abbreviated GRH) implies that the ring of integers of K is Euclidean.
AMS 1970 subject classifications. Primary 12A05, 13F99; Secondary 12A40.
© 1973, American Mathematical Society
A sufficient set of Riemann hypotheses is given in §4.
The five remaining complex quadratic fields with class number one are known
to be Euclidean [1, p. 213].
In §2 the proof of Theorem 1 is reduced to the proof of Theorem 4. In §3,
following Hooley [3], the proof of Theorem 4 is brought to where analytic
techniques are applicable. §4 applies them, and the rest of the paper contains the rest of
the proof.
2. Henceforth assume that K satisfies the hypothesis of Theorem 1 and that R
is its ring of integers. We make use of the Ej notation from § 1.
The first step is to show that Theorem 1 can be deduced from the following
theorem.
Theorem 2. GRH implies that every irreducible of R is in E3.
An irreducible, b, of R is an element such that the principal ideal (b) is prime.
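If $\mathfrak m$ is a divisor of $K$, and $\mathfrak a$, $\mathfrak b$ are two nonzero ideals, recall that $\mathfrak a\equiv\mathfrak b\pmod{\mathfrak m}$ is
defined to mean that $\mathfrak a\mathfrak b^{-1}$ is a principal ideal $(c)$ such that $v_{\mathfrak p}(c-1)\ge v_{\mathfrak p}(\mathfrak m)$ for
all primes $\mathfrak p$ of $K$.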
Lemma 2.1. Let $m$ be any nonzero element of $R$. Then every prime residue class
of $R/(m)$ contains infinitely many irreducibles of degree one. In particular, if the
conclusion of Theorem 2 holds, each such residue class contains an irreducible
which is in $E_3$.
(A prime residue class is one which is a unit in the ring $R/(m)$.)
Proof. Let $\mathfrak m=(m)$ be the principal divisor of $K$ corresponding to $m$. Choose
any $b$ prime to $m$. By an extension of Dirichlet's theorem on primes in
arithmetic progression [2, Volume I, p. 32], there are infinitely many prime ideals
$\mathfrak q=(q)\equiv(b)\pmod{\mathfrak m}$, where $\mathfrak q$ is of degree one and prime to $\mathfrak m$. Then, by definition,
$q/b-1\in(m)$, so $m\mid(q-b)$. Q.E.D.
If $b$, not zero, is the product of $n_2+n_3$ irreducibles, $n_j$ from $E_j$, $j=2,3$, define
the height of $b$ by
$$\operatorname{ht}(b)=2n_2+3n_3.$$
Theorem 3. GRH implies that $b\in E_{\operatorname{ht}(b)}$. Hence Theorem 2 implies Theorem 1.
Proof. Since the theorem is clearly true if $\operatorname{ht}(b)\le 2$, it suffices to show that
every residue class mod $b$ contains an element whose height is less than $\operatorname{ht}(b)$. For
the purpose of this proof, say that $b$ has type $(n_2,n_3)$. Aside from the class containing
zero, all residue classes mod $b$ are of the form
$$B_1(n_2-t_2,\,n_3-t_3)\cdot\text{prime residue class mod }B_2(t_2,t_3),$$
where $B_i=B_i(j,k)$ is a divisor of $b$ of type $(j,k)$. The case with $(t_2,t_3)=(0,0)$ gives
the zero class. If $(t_2,t_3)=(1,0)$, then any prime residue class mod $B_2$ contains an
element of $E_1$, which is a number of type $(0,0)$, so the residue class mod $b$ contains
a number of height $2(n_2-1)+3n_3<\operatorname{ht}(b)$. If $(t_2,t_3)=(0,1)$, any prime residue class
mod $B_2$ contains an element of $E_2$, which must be either of type $(0,0)$ or $(1,0)$.
Hence the residue class mod $b$ contains a number of height $2(n_2+\delta_2)+3(n_3-1)<\operatorname{ht}(b)$,
where $\delta_2=0$ or 1. For any other choice of $(t_2,t_3)$, Lemma 2.1 gives an
element of type $(\delta_2,\delta_3)$ in each prime residue class mod $B_2$, where $\delta_2,\delta_3=0,1$ and
not both $\delta_2$, $\delta_3$ are 1. Therefore the residue class mod $b$ contains an element of
type $(n_2-t_2+\delta_2,\,n_3-t_3+\delta_3)$ and this element has height $<\operatorname{ht}(b)$. Q.E.D.
For an arbitrary irreducible $p$ of $R$ let $\mathfrak p=(p)$. Let $I$ be the multiplicative group
of ideals of $K$ prime to $\mathfrak p$, and let $H=\{\mathfrak a\in I:\mathfrak a\equiv(1)\pmod{\mathfrak p}\}$. Then $I/H$ is a finite
group. Note that $[I:H]=1$ if and only if $p\in E_2$. For if $(a,p)=1$, then $p\mid(a-\text{unit})$
if and only if $(a)\equiv(1)\pmod{\mathfrak p}$.
We say that a unit, $\varepsilon$, of $R$ is a fundamental unit if $\varepsilon\ne\varepsilon_1^{n}$ for all units $\varepsilon_1$ and all
integers $n>1$. We say that $\varepsilon$ is a primitive root of the prime ideal $\mathfrak p$ if $\varepsilon$ generates the
(cyclic) group of units in the finite field $R/\mathfrak p$. Fix a fundamental unit $\varepsilon$.
Theorem 4. Assume GRH. Let $\mathfrak p$ be a prime ideal of $K$. Let $H$ be as above.
Then every ideal class of $I/H$ contains infinitely many prime ideals for which $\varepsilon$ is
a primitive root.
Proof that Theorem 4 implies Theorem 2. Let $p$ be an irreducible of degree
one, and let $\mathfrak p=(p)$. Let $(a,p)=1$. Then Theorem 4 says that there is a prime ideal
$\mathfrak q=(q)\equiv(a)\pmod{\mathfrak p}$ with $\mathfrak q$ prime to $\mathfrak p$. Then $q/a-1\in\mathfrak p$ so $p\mid(q-a)$. Further,
$q\in E_2$, since every residue class of $R/(q)=R/\mathfrak q$, except the one containing zero,
contains a power of $\varepsilon$, which is in $E_1$. Q.E.D.
3. We have an algebraic number field $K$ with a fixed fundamental unit $\varepsilon$. Let
$H$ be any ideal group in $K$ with a conductor; i.e., an ideal group of class field
theory [2, Volume I, p. 61]. Let $h$ be a member of $I/H$. We are interested in the
number of prime ideals in $h$ for which $\varepsilon$ is a primitive root, since for certain $h$
Theorem 4 requires that this number be infinite. Henceforth $q$ shall denote a
rational prime, $k$ a square-free integer, and $Q$ the field of rational numbers. Small
letters $\mathfrak a$, $\mathfrak b$, $\mathfrak p$, $\mathfrak q$ shall denote ideals of $K$, with $\mathfrak p$, $\mathfrak q$ being reserved for primes, while
the respective capital letters $\mathfrak A$, $\mathfrak B$, $\mathfrak P$, $\mathfrak Q$ will be used to denote ideals in certain
extensions of $K$.
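If $\varepsilon$ is not a primitive root of $\mathfrak p$, then $\varepsilon$ must be a $q$th power residue mod $\mathfrak p$ for
some $q\mid(N\mathfrak p-1)$. Let $R(\mathfrak p,k)=1$ if $k\mid(N\mathfrak p-1)$ and $\varepsilon^{(N\mathfrak p-1)/k}\equiv 1\pmod{\mathfrak p}$, and 0
otherwise.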
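The criterion just stated has a familiar analogue over the rational integers, which is easy to test by machine. The Python sketch below is an illustration under assumptions made here: the field $K$ is replaced by $Q$, the unit $\varepsilon$ by an integer $a$, and the prime ideal by a rational prime $p$; the function names are hypothetical. It compares the power-residue test with a direct computation of the multiplicative order.

```python
def prime_factors(n):
    """Return the set of prime divisors of n >= 1 by trial division."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def power_residue_indicator(p, k, a):
    """Rational analogue of R(p, k): 1 if k | p-1 and a^((p-1)/k) = 1 mod p."""
    if (p - 1) % k != 0:
        return 0
    return 1 if pow(a, (p - 1) // k, p) == 1 else 0


def is_primitive_root_by_criterion(a, p):
    """a is a primitive root mod p iff the indicator vanishes for every prime q | p-1."""
    if a % p == 0:
        return False
    return all(power_residue_indicator(p, q, a) == 0 for q in prime_factors(p - 1))


def multiplicative_order(a, p):
    """Order of a modulo the prime p (a must not be divisible by p)."""
    order, value = 1, a % p
    while value != 1:
        value = value * a % p
        order += 1
    return order


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 17, 19, 23):
        print(p, is_primitive_root_by_criterion(2, p), multiplicative_order(2, p))
```

Only the primes $q$ dividing $p-1$ need to be tested, which is why the counting functions in this section are indexed by rational primes $q$ and square-free $k$.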
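Define
$N(z,\eta)=\operatorname{card}\{\mathfrak p\in h: N\mathfrak p\le z,\ R(\mathfrak p,q)=0\text{ for all }q\le\eta\},$
$P(z,k)=\operatorname{card}\{\mathfrak p\in h: N\mathfrak p\le z,\ R(\mathfrak p,q)=1\text{ for all }q\mid k\},$
$N(z)=\operatorname{card}\{\mathfrak p\in h: N\mathfrak p\le z,\ \varepsilon\text{ is a primitive root of }\mathfrak p\},$
$M(z,\eta_1,\eta_2)=\operatorname{card}\{\mathfrak p\in h: N\mathfrak p\le z,\ R(\mathfrak p,q)=1\text{ for some }q\in(\eta_1,\eta_2]\},$
$\xi_1=(\log z)/6,\quad \xi_2=z^{1/2}(\log z)^{-2},\quad \xi_3=z^{1/2}\log z.$
These definitions, and the method for estimating $N(z)$, are due to Hooley [3].
Trivially, $N(z)=N(z,z-1)$, and, as in [3],
(3.1)    $N(z)=N(z,\xi_1)+O(M(z,\xi_1,z-1))$
and
(3.2)    $M(z,\xi_1,z-1)\le M(z,\xi_1,\xi_2)+M(z,\xi_2,\xi_3)+M(z,\xi_3,z-1).$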
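Lemma 3.1. $M(z,\xi_3,z-1)\ll z/\log^{2}z$.
(Throughout, the implied constants depend only on $K$, $\varepsilon$, $h$.)
Proof. If $\mathfrak p$ is counted in $M(z,\xi_3,z-1)$, then $q\mid(N\mathfrak p-1)$ and $\varepsilon^{(N\mathfrak p-1)/q}\equiv 1\pmod{\mathfrak p}$,
for some $q$, $z^{1/2}\log z<q\le z-1$. Hence all such $\mathfrak p$ divide
$$\prod_{m<z^{1/2}/\log z}|N(\varepsilon^{m}-1)|.$$
Each divisor of this has norm at least 2, so that
$$2^{[K:Q]^{-1}M(z,\xi_3,z-1)}\le\prod_{m<z^{1/2}/\log z}|N(\varepsilon^{m}-1)|.$$
All conjugates of $\varepsilon$ have absolute value $<A_1$, so
$$|N(\varepsilon^{m}-1)|<(A_1^{m}+1)^{[K:Q]}<A_2^{m[K:Q]},$$
where $A_1,A_2\ll 1$. Hence
$$[K:Q]^{-1}M(z,\xi_3,z-1)\ll\sum_{m\le z^{1/2}/\log z}m\ll\frac{z}{\log^{2}z}.\qquad\text{Q.E.D.}$$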
Lemma 3.2. $M(z,\xi_2,\xi_3)\ll z\log\log z/\log^{2}z$.
Proof. First, $M(z,\xi_2,\xi_3)\le\sum_{\xi_2<q\le\xi_3}P(z,q)$. For $\xi_2<q\le\xi_3$,
$$P(z,q)\le\operatorname{card}\{\mathfrak p: N\mathfrak p\le z,\ N\mathfrak p\equiv 1\pmod q\}\ll\sum_{j}\operatorname{card}\{p: p^{j}\le z,\ p^{j}\equiv 1\pmod q\},$$
since $N\mathfrak p=p^{j}$ for some $p$ and $j$. The term with $j=1$ is
$$\ll\frac{z}{q\log(z/q)}\ll\frac{z}{q\log z}$$
by the Brun-Titchmarsh theorem. Similarly, the term with $j=2$ is $\ll z^{1/2}/q$, since
$p^{2}\equiv 1\pmod q$ only if $p\equiv 1,-1\pmod q$. If $j\ge 3$, then $p<q$, so there are at most $j$
solutions of $p^{j}\equiv 1\pmod q$. Hence
$$P(z,q)\ll z/(q\log z)+z^{1/2}/q+[K:Q]^{2}\ll z/(q\log z),$$
and
$$M(z,\xi_2,\xi_3)\ll\sum_{\xi_2<q\le\xi_3}\frac{z}{q\log z}\ll\frac{z\log\log z}{\log^{2}z}.\qquad\text{Q.E.D.}$$
Since
(3.3)    $N(z,\xi_1)=\sum_{l}\mu(l)P(z,l),$
where $l\in\{k: q\mid k\Rightarrow q\le\xi_1\}$, we now turn our attention to estimating $P(z,k)$.
Let $L=L_k=K(\zeta_k,\varepsilon^{1/k})$, where $\zeta_k$ is a primitive $k$th root of one. $L_k/K$ is normal.
$R(\mathfrak p,k)=1$ if and only if $\mathfrak p$ splits completely in $L_k$. For the rest of this section
fix $k$. Let $\mathfrak f$ be the product of $k$ and the conductor of $H$. Let $I_L$ be the multiplicative
group of ideals of $L$ prime to $\mathfrak f$, and let $H_L=\{\mathfrak A\in I_L: N_K^L\mathfrak A\in H\}$. Then $H_L$ is an ideal
group defined mod $\mathfrak f$, for if $\mathfrak A\equiv(1)\pmod{\mathfrak f}$, then $\mathfrak A=(\alpha)$ and $\alpha\equiv 1\pmod{\mathfrak f}$. But then,
since $\mathfrak f$ is an ideal of $K$, every conjugate, $\alpha'$, of $\alpha$ over $K$ satisfies $\alpha'\equiv 1\pmod{\mathfrak f}$ so
$N_K^L(\alpha)\equiv 1\pmod{\mathfrak f}$, so $N_K^L\mathfrak A\in H$. Hence $I_L/H_L$ is a finite group. Let $\hat H$ be the group
of characters on $I_L/H_L$. Each character in $\hat H$ extends to a function on integral
ideals of $L$ in the usual way: if $(\mathfrak P,\mathfrak f)=1$, then $\chi(\mathfrak P)=\chi(\mathfrak P H_L)$, otherwise $\chi(\mathfrak P)=0$;
and $\chi$ is extended multiplicatively.
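Let $\mathfrak B$ be a fixed integral ideal of $L$, prime to $\mathfrak f$. Let
Then
(3.4)    $\sum_{\chi\in\hat H}\bar\chi(\mathfrak B)\,\pi(z,\chi)=[I_L:H_L]\operatorname{card}\{\mathfrak P\subset L: N\mathfrak P\le z,\ \mathfrak P\in\mathfrak B H_L\}.$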
4. In this section k and ^gH are fixed. Let
£&*)= I -^r. Re(s)>l;
let ^0, with conductor f0, be the primitive character equivalent to %; let n — [L\ Q~],
A =disc(L/0; let rx be the number of real conjugates of L (0 unless /c^2), r2 =
(n —rx)/2; let r3, with 0^r3gr1, be the number of real archimedean valuations
on L whose sign affects x\ finally, let
Now L(s, x) may be analytically continued to the whole s-plane as an entire
function unless x is principal, when it has a simple pole at s= 1 [4, Satz LXIII].
In any case (s — 1) L(s, x) is an entire function of order 1.
L{s, x)*0 for Re(s)^ 1. For z^ 1,
<A(z, x) = E(x) z-{r0 logz + ao)-(y-r3) log(l ^"^(^"E^
where E(x) = 1 if x is principal and zero otherwise,
L ,
and where g = p + iy runs over all zeros of L(s, x) with 0 < <r < 1 [4, Satz LXXXVIII].
e r->oo|y|<r
converges for all z > 0 [4, Satz LXXXIX], and uniformly for z > 1 in any interval
not containing an integer [4, Satz XC].
The use of the functional equation for L(s, x) to estimate ij/(z, x) is complicated
by the fact that x may not be primitive. If x is not primitive, then
MS„)-L(S„.)n(.-||)-MS.,.)ni(.-|).
for roots of one, gm, and powers prime Pm. Let
^(f0) = 2-^-"'2(M|JVf0)1'2,
*(«. xo)=^(fo)s r((s+1)/2^ r(s/2)"-3 /»'; l(s, Xo).
Then, by defining #(s, *)=<P(s, Xo). we have [4, Satz LXIII]
|<P(s,x)| = |*(l-s,x)|.
Let rf=card{em:em = l}. Then [4, Satz LXV]
r0=max{d-£(x)+r1 + r2-r3,0}«l + log(Arf)
so that
(4.1) f(z9 x) = E(X) z-a0-£ -+0(log(Nf) logz).
As always, the implied constant does not depend on k or %, but only on K, e, h.
We are now in position to estimate the infinite sum over zeros of L(s, x) by
using the Phragmen-Lindelof principle and Jensen's theorem. The former can
be applied in the vertical strip —3/2^(7^11/2. For <r^2, and so for cr= 11/2,
\L(s, X)\^U2fScl cx <L On <r= -3/2,
\L(s,x)\ = \f(s,x)\Ml-s,x)\£fif(s,xh
where /(s, x) is the factor from the functional equation
^ [S' X) ((cos (7ts/2))r'+r2 -r3 (sin (7ts/2))r*+'3 T (s)")
On a— —3/2, by Stirling's formula,
f{s,xH(f2{AN\0f{\t\ + 2f\ c2<\.
Hence, on a = — 3/2 and on a — 11/2,
(4.2) |(s-l)L(s,%H(|r| + 2)^(JiVf)2,
where c3 is a positive integer independent of k, x- Applying Phragmen-Lindelof
while
to the function (s— 1) L(s, x)/(s + 2)C3" shows that (4.2) holds throughout the strip
-3/2^(7^11/2. On (7 = 2,
(4.3) |(s-l)L(s,*)| = C(2r.
Let v(y) be the number of zeros of £(s, x)~(s~ 1) L(s, y) m the circle
\s-{2 + iT)\£y. Then by (4.2), (4.3),
7/2 In
{-7^= ^ j log|{(2 + iT+(7/2) *", *)l «0-log|£(2 + iT, Z)|
0 0
^c4log(JJVf(|r| + 2)"),
7/2 7/2
[vJAdy^[viyLdy^v(3) log(7/6).
0 3
Then, since v(3)^> JV(T+1, Z)-iV(T, *), where N(T,x) is the number of zeros
of L(s, x)in0<(7<l, -T<t<T,
(4.4) JV(T + l,x)-N(r,x)<log(JNf(T + 2)").
Now (4.1), (4.4), and the argument in [3] immediately give
*(*, *) = £(*) H(z) + 0(z1/2 log(JJVfz")),
if all the zeros of L(s, x) with 0<(7<1 lie on the line o = \. This hypothesis, for
each k and each ^eH, is the GRH sufficient for Theorem 1.
Now
|J| = |disc(L/0^^(disc(Xfc-£))^^(/cfc-1)^fcfcIX:e],
JVf = N§(cond(fl)-ik)^c5k".
Hence log(JJVfz")<^rc log/c + w logz, so
(4.5) tt(z, Z) = £(Z) li(z) + 0([L:K] z1'2 log(/cz)).
5. In this section $k$ is fixed and $\mathfrak f$, $\mathfrak B$ are as defined near the end of §3. Let
$C=C(k)$ be the ideal group in $K$, defined mod $\mathfrak f$, generated by norms of ideals
from $L$. For later purposes, it is necessary to observe that $C$ is also the group
generated by norms of ideals from $E=E_k$, the maximal Abelian subextension
of $L/K$ [2, Volume II, p. 167].
Lemma 5.1. $N_K^L$ induces an isomorphism of $I_L/H_L$ onto $C/C\cap H$.
Proof. All ideal groups are given mod $\mathfrak f$ so the definition of $C$ implies that the
map is onto. If $N_K^L\mathfrak A\in C\cap H$, then the definition of $H_L$ shows that $\mathfrak A\in H_L$, so the
map is injective. Q.E.D.
As an immediate consequence,
$$N_K^L\mathfrak P\equiv N_K^L\mathfrak B\pmod H\quad\text{iff}\quad\mathfrak P\equiv\mathfrak B\pmod{H_L}.$$
Therefore, the right-hand side of (3.4) equals
(5.1)    $[C:C\cap H]\operatorname{card}\{\mathfrak P\subset L: N\mathfrak P\le z,\ N_K^L\mathfrak P\equiv N_K^L\mathfrak B\pmod H\}.$
If possible, choose $\mathfrak B$ such that $N_K^L\mathfrak B\in h$. If this is possible, define $F(k)=1$,
otherwise define $F(k)=0$.
If $\mathfrak p$ is counted in $P(z,k)$, then $\mathfrak p=\prod_{j}\mathfrak P_j$, with distinct $\mathfrak P_j$, and $N_K^L\mathfrak P_j=\mathfrak p$.
Hence (5.1) equals
(5.2)    $[C:C\cap H]\cdot\bigl[[L:K]P(z,k)F(k)+O(A_1+A_2)\bigr],$
where $A_1$ is the number of prime ideals of $L$ which are ramified over $K$, and
$$A_2=\operatorname{card}\{\mathfrak P\subset L: N\mathfrak P\le z,\ \deg(\mathfrak P/K)\ge 2\}.$$
It is clear that the only prime ideals of $L$ ramified over $K$ are among those dividing
$k$, so
$$A_1\ll[L:Q]\log k=n\log k.$$
By the prime number theorem,
$$A_2\ll[L:Q]\,z^{1/2}/\log z<nz^{1/2}.$$
Combining these last two estimates, (5.2), (4.5), and (3.4) gives
(53) p<z-*»-Ed^U+0(2"Mo8(tz))
6. We now can finish the estimation of (3.2) by estimating M(z, £2, £3). Note
that [Lfc:X] = /c[X(Cfc):X] and that, for sufficiently large q, [K{Q:K] = q--\.
Hence
(61) +^+z1'2 li(£2)log£2z«z/log2z.
si
To complete the proof of Theorem 4, it suffices to know N(z, £x), by (3.1).
Substituting from (5.3) in (3.3) gives
As in [3], all values of/are ^z1/3, so the error term in (6.2) is <^z5/6 logz<^z/log2z.
Lemma 6.1. F(k) is multiplicative.
Proof. It is clear that F(k)—0 if hr\C(k) is empty, for then h contains no
norms from Lk. Conversely, if beC(k)nh, then there exists an 9lc:Lfc such that
b = Af£2l(modf), so that N^eh, so F(k) = 1. Hence F{k)= 1 if and only if hnC(k)
is not empty. Now C(k) is class group to Ek, the maximal Abelian subextension
of Lk/K. But Ek is the compositum of all Eq with q | /c, so
C(k)=f|C(«) and C(/c)n/z = n (C(«)nft). Q.E.D.
We now turn to the factor [C(/):C(/)n#]. Let i/={ac:A::a = (l) (modp)},
where p is a prime (ideal or divisor) of K. There is a unique rational prime p
divisible by p. In the rest of the estimation of N(z) it is convenient to make use
of the fact that jR is a p.i.d.
Lemma 6.2. IfpXK then F{k)=\ and[C(k): C{k)nH] = [/: H\ Ifp \ k, then
[C(/c):C(/c)nH] = [/:H]/[/://C(p)].
Proof. The conductor of H divides p, and so is either p or one. If it is the latter,
then I/H is the absolute class group, so I=H since R is a p.i.d. and the lemma
follows. Otherwise, the conductor of H must be p. Let K' be class field to H, so
K'jK is ramified exactly at p. Now
[C{k):C(k)nH] = [HC(k):H],
and HC(k) is class group to K'nEk. Ek can only be ramified at those primes
dividing k, so if pjfk, then HC(k) = I, so C(k)/C(k)r\H is isomorphic to I/H,
which gives the first conclusion of the lemma. Finally, if p \ k, then k = pm with
(m, p)= 1. Then Epmc:Ep(Cm, e1/m) so Epm/Ep is unramified at any prime dividing p.
On the other hand K/K is totally ramified at p [2, Volume I, p. 31], so K'r\Epm
must be a totally ramified, over p, extension of K'r\Ep. Hence K'nEp = K'nEpm.
But HC{k) is class group to K'nEpm, so HC(k) = HC(p). Q.E.D.
Lemma 6.1 and Lemma 6.2, (6.2) and (3.2) give
U [/:H]V P[K(C,):K] yW.Ul ^(O^ V log2z
Now for large enough z,q>£i implies that [K(tq):K]=q— 1, so
I logfl-
1 V z V-^+°(i/93)«i/^1.
q[.K(Q:K]J qntq(q-l)
Therefore
(6.3)
N{zy_m L F{p)U:C{p)H-\
\I:K\\ p[K(C,):K]
qVP\ q[K(Q:KV \ log2z
The infinite product is not zero, so Theorem 4 will be proved if the second factor
can be shown to be positive. The factor in question can only be zero if F(p) = 1 and
[/:^C(/?)]=/7[/i:(Cp):A:]. This last implies that K'nEp = Lp, so L = K(si,p)czK'.
But then \Lp: A:] =/?, while \K': K] = [/: H~\ which divides the number of prime
residue classes of R/p, which ispJ— 1 for somey. This contradiction completes the
proof of Theorem 4, and therefore Theorem 1 is proved.
Bibliography
1. G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, Oxford Univ. Press,
1965, p. 213.
2. H. Hasse, Bericht über neuere Untersuchungen und Probleme aus der Theorie der algebraischen
Zahlkörper. Teile I, II, Zweite Auflage, Physica-Verlag, Würzburg-Vienna, 1965. MR 33 #4045a,b.
3. C. Hooley, On Artin's conjecture, J. Reine Angew. Math. 225 (1967), 209-220. MR 34 #7445.
4. E. Landau, Über Ideale und Primideale in Idealklassen, Math. Z. 2 (1918), 52-154.
5. T. S. Motzkin, The Euclidean algorithm, Bull. Amer. Math. Soc. 55 (1949), 1142-1146. MR 11,
311.
6. P. Samuel, About Euclidean rings, J. Algebra 19 (1971), 282-301.
University of Michigan
AUTHOR INDEX
Italic numbers refer to pages on which a complete reference to a work by the author is given.
Roman numbers refer to pages on which a reference is made to a work of the author.
For example, under Oppenheim would be the page on which a statement like the following
occurs: "This result had been conjectured by Oppenheim [13] in 1929,..."
Boldface numbers indicate the first page of the articles in this volume.
Abramowitz, M, 30
Anderson, T. W., 205,210
Ankeny,N.C.,214,219
Artin, E., 100,110,285,302
Artin, M., 60
Ayoub, Raymond, 30
Babu, G. Jogesh, 236,246
Baker, A., 1,1, 2, 4, 6, 6, 7, 178, 178
Baker, R. C, 210,210
Bambah, R. P., 137,139
Barton, D. E., 202,210
Bass, H., 58, 60
Bateman, PaulT., 75,154, 157, 266
Bauer, M., 97,100
Beach, B. D., 280,283
Berndt, Bruce C, 9,10,13, 30, 115,
121
Billing, G., 222,230
Billingsley, Patrick, 204, 210, 233,
234,235,246
Birch, B. J., 6, 7,159,165,166, 168,
171,274
Birnbaum, Z. W., 207, 208,210
Bombieri, E., 31, 32, 36, 40, 42, 49,
100
Brauer, A., 216,219
Brauer, R., 160,174
Brown, J. W., 157
Buhstab, A. A., 214,219
Burgess, D. A., 213, 215, 216,219
Burnside, W., 100
Casselman, W., 230
Cassels, J. W. S., 166,174, 210, 210,
230
Chandrasekharan, K., 266
Chowla, S., 84, 90, 137, 139, 160,
274,268,283
Chung, K. L., 204,211
Cigler, J., 195, 201, 210,211
Coates, John, 5, 7, 51,51, 60
van der Corput, J. G., 319
Cramer, H., 136,139
Darling, D. A., 202, 205,210,211
Davenport, H., 31, 49, 84, 90, 159,
160, 161, 165, 166, 168, 170, 172,
174,193,233,246
Dedekind, R., 96,100,228,230
Dem'janenko, V. A., 221,230
Diamond, Harold G., 63, 154, 157,
266
Donsker, M. D., 204,211
Doob, J. L., 204,211
Dorge, K., 91,100
Dressier, Robert E., 75
Durbin, J., 207,209,211
Dvoretzky, A., 206,211
Elliott, P. D. T. A., 32, 49, 77, 91,
93,100, 214, 215,219
Epanecnikov, V. A., 209,211
Erdelyi, A., 121,211
Erdos, Paul, 75, 78, 82, 83, 84, 90,
125, 126, 134, 136, 138, 139, 140,
210,211,215,219,233,246
Estermann, T., 140,319
Fainleib, A. S., 74, 75
Fel'dman, N. I., 3,4, 7,178,178
Feller, W., 204,211
Forti, M., 31
Frechet, M., 211
Fricke, R., 222,231
Gabriel, R. M., 42,49
Gal, L., 210,211
Gal, S., 210,211
Gallagher, P. X., 31, 32,49,91
GaposTrin, V. F., 210,211
Garland, H., 60
Goldstein, Larry Joel, 103,103,104,
110,303,304
Grosswald, Emil, 13,30, 111, 121
Halasz, G., 31,49
Halberstam, H., 31,49,247,247,249
Hall, R. S., 157
Hardy, G. H., 75,124,126,135,140,
147, 150, 154, 157, 218, 219, 331
Haselgrove, C. B., 147,153,157
Hasse, H., 285,302,331
Hecke, E., 122, 223, 228,231
Heilbronn, H., 160,174,303,304
Helmberg, G., 195,201, 210,211
Hensley, Douglas, 123,123,126
Hlawka, E., 91,93,100, 201,211
Hooley, Christopher, 105, 110, 129,
134, 135, 137, 140, 322, 324, 332
Huxley, Martin N., 43, 49, 93, 100,
141,145
Ingham, A. E., 75, 127, 147, 153,
154,156,157
Iwasawa, K., 52, 53, 55,56, 60
Jarnik.V., 161,174
Jurkat, W. B., 147,157
Kac, M., 75,233,246
Kemperman, J. H. B., 196, 209, 211
Kiefer, J., 206,211
Klingen, H., 122
Kloss, K. E., 157
Knobloch, H.-W., 91,100
von Koch, H., 149,157
Koksma, J. F., 195, 201, 210, 211
Kolmogorov, A., 211
Kornblum, H., 100
Kubilius, J., 75, 77, 82, 233, 236,
237,244, 245,246, 248, 249
Kubota, T., 52,54, 60
Kuipers,L., 195,201,222
Landau, Edmund, 193,332
Lang, S., 108,110
Lavrik, A. F., 249,249
Lehmer, D. H., 267, 269,283
Lehmer, Emma, 267,283
Leopoldt, H. W., 52,54, 56,60
Lewis, D. J., 159,161,170,174
Lewittes, Joseph, 13,30
Lichtenbaum, S., 51, 60,122
Ligozat.G., 229,231
Lindstrom, B., 85, 90
Linfoot, E. H., 303,304
Linnik, Ju. V., 31, 49, 218,219, 233,
244, 245,246
Littlewood, J. E., 124,126,135,140,
147, 149, 157, 218, 219, 267, 268,
283
Loeve, Michel, 246
Mahler, K., 83,84,90,175,176,179,
222,230
Mallows, C.L., 202,210
Mandelbrojt, S., 266
Manin, Ju. I., 286,302
Masser, D. W., 5, 7
Massey, F. J., 207,211
Mazur, B., 226, 231
Mehta, M. L., 193
Meijer, H. G., 207,212
Meyer, Y., 212
Milnor, J., 60
Mirsky, L., 139,140, 208,212
Montgomery, Hugh L., 31, 32, 33,
43, 45, 49, 91, 100, 124, 127, 145,
181, 193, 214, 218, 219, 262, 319
Motzkin, T. S., 321,332
Neubauer, G., 152,158
Niederreiter, H., 195, 195, 201, 207,
211,212
Norton, Karl K., 213, 215, 219, 220
Ogg, A. P., 221, 221, 231
Oppenheim, A., 159, 160, 168, 174
Osgood, C. F., 4, 7
Owen, D. B., 204, 207,212
Philipp, Walter, 206, 210, 212, 233,
237,246, 249
Pitman, Jane, 168,174
Prohorov, Ju. V., 233, 246
Rademacher, Hans, 28,30
Ramachandra, K. F., 6, 7
Rankin, R. A., 138,140
Renyi, A., 31,49, 204,212
Richards, Ian, 123,123,126
Richert, H.-E., 138, 140, 247, 247,
249
Ridout, D., 159,165,168,174
Rosenblatt, M., 206,212
Rosser, J. B., 193
Roth, K. F., 31, 49, 251, 251, 262
Ryavec, C., 263, 264, 266
Samandarov, A. G., 93,100
Samuel, P., 321, 332
Schaal, W., 93,100
Schinzel, A., 4, 7,127
Schmid, P., 205,212
Schmidt, Wolfgang M., 4, 7, 176,
178,179,251,262
Schneider, T., 176,179
Schoenberg, I. J., 63, 75
Schoeneberg, B., 28,30
Schoenfeld, L., 193
Schrutka, V., 60
Selberg, Atle, 93, 97, 101, 136, 140,
193
Selfridge, J. L., 126
Serfling, R. J., 233, 245,246
Serre, J.-P., 54, 61
Shanks, Daniel, 267, 267,283
Siegel, C. L., 28,30,51,59,61
Sierpinski, W., 127
Simpson, P. B., 205,212
Singer, J., 85,90
Skewes, S., 156,158
Smirnov, N. V., 205, 212
Specht, W., 101
Spira, R., 153,158
Sprindzuk, V. G., 3, 7,176,179
Stark, H. M., 1,2, 7,285
Stegun, I. A., 30
Stemmler, R. M., 157
Stepanov, S. A., 286,302
Sterneck, R. D. v., 152,158
Stickelberger, L., 55, 61
Sunley, Judith S., 303,304
Swinnerton-Dyer, H. P. F., 229,231
Szemeredi, E., 83
Tate, J., 51,58, 61
Tatuzawa, T., 303, 304,304
Tchebotarev, N., 99,101
Tchudakoff, N. G., 319
Tijdeman, R., 207,212
Titchmarsh, E. C., 30, 151, 153,
158, 193
Tjan, M. M., 74, 75
Turan, P., 41, 49, 77, 82, 92, 101,
305, 305, 308, 309, 310, 312, 314
Uzdavinis, R. V., 236,246
Vaughan, Robert C., 193, 315, 319
Vinogradov, I. M., 319
Viola, C., 31
van der Waerden, B. L., 91, 101, 212
Wald, A., 202,212
Walfisz, A., 161, 174
Warlimont, R., 93,101
Watson, G. L., 172, 174
Watson, G. N., 30
Weil, A., 52, 61, 117, 122, 285, 302
Weinberger, Peter J., 105, 110, 321
Weyl, H., 203,212
Whittaker, E. T., 30
Williams, H. C., 280, 283
Wilson, R. J., 93,101
Wintner, A., 78,82
Wirsing, E. A., 6, 7
Wolfowitz, J., 202, 206, 211, 212
Wright, E. M., 75,331
Yohe, J. M., 193
Yuan, Wang, 214,220
Zaremba, S. K., 206,212
SUBJECT INDEX
Abscissa aw of quasizerofreeness of
ζ(s), 308
additive form, 160
additive function, 77, 233
adjoint operator, 34
additive theory of primes, 314
admissible sequence, 123
algebraic numbers,
linear forms in logarithms of, 1
rational approximations to, 3
algebraic number field, totally
real, 303
almost periodic distributions, 151,
154,155
Artin conjecture, 103
Bernoulli function, 10
generalized, 10,11
Bernoulli number, 111
generalized, 29,112
Bernoulli polynomial, 10
Bessel's inequality, generalized,
143, 252
Birch and Swinnerton-Dyer,
conjecture of, 226
Bombieri's large sieve inequality,
92
Brownian motion, 233, 234, 235
Brun's sieve, 134, 247, 248
character sum, 215, 216, 244
class, 175
Class A, 176
class number, 183, 269, 303
class number 2 problem, 4
Class S, 176
Class T, 176
Class U, 176
consecutive power residues, 213
consecutive primes, 141
converge in distribution, 233, 234
Cramer-Smirnov test, 203, 205, 206
cycle type, 98
Dedekind character sum, 19
generalized, 23
Dedekind eta-function, 9,117, 228
generalized, 28
Dedekind sum, 9,10
generalized, 10, 24
Dedekind zeta function, 112, 113,
117
δ-well spaced, 32
density hypothesis, 311
diagonal form, 160
Dickman's function, 213
Diophantine equations, 3
Dirichlet character, generalized, 32
Dirichlet series, 112,114, 267
"Dirichlet series" operator, 34
Dirichlet's class number formula,
51
discrepancy, 195
extreme, 196, 197, 198, 201, 202
local, 197, 201
L^p, 197
distribution-free, 202, 203, 205, 206
distribution function, 130
empirical, 196, 201, 203
distribution of points in Euclidean
space, 252
"Eisenstein series", 223
elliptic curve, 221
elliptic functions, 5
empirical distribution function,
196,201, 203
equidistributed sequence, 104
error term, 316
etale cohomology, 53
Euclidean ring, 321
Euler's phi function, 63
even discriminant, 270
exponential sum, 306
extended Riemann hypothesis,
214, 218, 282
extreme discrepancy, 196,197,
198, 201, 202
fourth moments of L-functions, 43
Fundamental Lemma, 248
Gaussian process, 204, 205, 206
Gaussian sum, 120
generalized Bernoulli function, 10,
11
generalized Bernoulli number, 29,
112
generalized Bessel's inequality, 252
generalized Dedekind character
sum, 23
generalized Dedekind
eta-function, 28
generalized Dedekind sum, 10, 24
generalized Dirichlet character, 32
generalized prime number system,
263
Glivenko-Cantelli theorem, 203
Goldbach's problem, 315
graph of (n, φ(n)), 64
Haar measure, 104
Halasz, method of, 143
Hardy-Littlewood conjecture, 136
Hardy-Littlewood method, 160
Hardy-Littlewood Tauberian
theorem, 218
Hecke operator, 226
hichamp, 276
Hurwitz zeta-function, 12
hyperelliptic fields, Riemann
hypothesis in, 285
indefinite diagonal form, 160
integer interval, 217
intervals, distribution of, 129
invariance property, 175
iterated logarithm, law of, 195,
196, 204, 205, 206, 210
Jacobi symbol, 244
Jacobian, 222
K-hypothesis, 83, 84
Kolmogorov's limit theorem, 196,
204, 207, 209
Kolmogorov's two-sided test, 196,
201, 202
K-theory, 58
lacunary sequence, 196, 210
l-adic zeta function, 52
Lambert series, 119
large numbers, law of, 80
large sieve, 248
large sieve inequalities, 31
law of iterated logarithm, 195, 196,
204, 205, 206, 210
law of large numbers, 80
Legendre symbol, 233
Lehmer's delay-line sieve DLS-157,
269
Lichtenbaum, conjecture of, 112
Lindeberg condition, 233, 235
Lindelof hypothesis, 40, 308
weak, 309
Lindelof μ-function, 33
linear forms in the logarithms of
algebraic numbers, 1
Lipschitz summation formula, 11
Littlewood indices, 267
Littlewood's bounds on L(1, χ),
267
local discrepancy, 197, 201
localization principle, 36
lochamp, 276
logarithms of algebraic numbers,
linear forms in, 1
lower Littlewood index, 267
L^p discrepancy, 197
Mertens conjecture, 148,152
Meyer function, 113
Meyer's G-function, 119
modular group, 221
mutual influence of primes and
zeta roots, 308
non-Riemannian zero, 282
nontrivial zero of ζ(s), 305
numbers prime to n, 130, 213, 217
numeri idonei, 5
Ω-theorems, 149, 150, 156, 157
order function, 177
pair correlation function, 184
perturbed zeta function, 264
positive definite quadratic form,
160
power residues, 213
prime k-tuples conjecture, 123
prime number system, generalized,
263
prime number theory, 316
prime-pair, 137
primes, 130, 305
additive theory of, 314
pseudosquare, 275
quadratic character, 268
quadratic extension, totally
imaginary, 303
quadratic form, 159
positive definite, 160
quasi-orthogonal functions, 252
constructed from weighted
strips, system of, 252
quasi-prime, 249
quasi-Riemann conjecture, 312
rational approximations to
algebraic numbers, 3
rational cusp, 222
rational group, 229
rational point, 221
ray-class character, 114
reciprocity formula, 9,10, 20
reciprocity law, 10, 26
regulator, 269
Riemann hypothesis, 103,107,
136,181, 267, 321
extended, 214, 218, 282
in hyperelliptic fields, 285
Riemann zeta function, 63, 111,
181, 305
Riesz-Thorin theorem, 33
Schoenberg, theorem of, 63
Selberg's sieve, 134, 136, 218, 248
sequence, 129
equidistributed, 104
lacunary, 196, 210
uniformly distributed, 195, 197
Siegel zero, 315
sieve,
Bombieri's, 92
Brun's, 134, 247, 248
large, 248
Lehmer's delay-line, DLS-157,
269
Selberg's, 134,136, 218, 248
trigonometrical of
I. M. Vinogradoff, 311
singular function, 75
splitting type, 95
square-free number, 130
system of quasi-orthogonal
functions constructed from weighted
strips, 252
system of representatives, 208
tame kernel, 58
Tchebychev's inequality, 92
Thue, theorem of, 3
totally imaginary quadratic
extension, 303
totally real algebraic number field,
303
transcendental numbers, 175
transversal theory, 196, 207, 208
trigonometrical sieve method of I.
M. Vinogradoff, 311
two squares, sum of, 130
uniformly distributed sequence,
195,197
upper Littlewood index, 267
Waring's problem, 6, 84
weak convergence of probability
measures, 233
weak Lindelof hypothesis, 309
weighted strips, 252
system of quasi-orthogonal
functions constructed from, 252
wild, 56
zero,
non-Riemannian, 282
of L(s, χ), 33, 282
of ζ(s), nontrivial, 181, 305
Siegel, 315
zero-density theorem, 141
zeta function,
Dedekind, 112,113,117
Hurwitz, 12
perturbed, 264
Riemann, 63, 111, 181, 305
{>($), 263
ISBN 0-8218-1424-9
PSPUM/24
AMS on the Web
www.ams.org