Текст
                    /. P. NAT ANSON
CONSTRUCTIVE
FUNCTION THEORY
Volume I
UNIFORM APPROXIMATION
Translated by
ALEXIS N. OBOLENSKY
T
FREDERICK UNGAR PUBLISHING CO.
NEW YORK


Copyright © 1964 by Frederick Ungar Publishing Co., Inc. Printed in the United States of America Library of Congress Catalog Card No. 64-15689
CONTENTS Introduction vii Chapter I. Weierstrass* Approximation Theorems 1. Weierstrass' First Theorem 3 2. Weierstrass' Second Approximation Theorem 9 3. The Reciprocal Coherence between the Two Weierstrass Theorems 16 Chapter II. Algebraic Polynomial of the Best Approximation 1. Fundamental Concepts 20 2. Tchebysheff's Theorems 27 3. Examples. The Tchebysheff Polynomials 34 4. Other Properties of Tchebysheff Polynomials 42 Chapter HI. Trigonometric Polynomials of the Best Approximation 1. The Roots of a Trigonometric Polynomial 57 2. The Image Point Method 59 3. The Trigonometric Polynomial of the Best Approximation 64 4. P. L. Tchebysheff's Theorems 65 5. Examples 72 Chapter IV. Interrelation between Structural Properties of Functions and the Degree of Their Approximation by Trigonometric Polynomials 1. Formulation of Problem. Modulus of Continuity; A Lipschitz Condition 75 2. Lemmata 79 3. D. Jackson's Theorems 84 Chapter V. Characterization of the Structural Properties of a Function by the Behavior of Its Best Approximations by Trigonometric Polynomials 1. S. N. Bernstein's Inequality 90 2. Some Elements of Theory of Series 93 3. S. N. Bernstein's Theorems 97 4. A. Zygmund's Theorems 106
vi CONTENTS 5. The Existence of Functions with Preassigned Best Approximation 109 6. Density of Class H% in Class LipM a 118 Chapter VI. Interrelation between Structural Properties of Functions and Their Approximation by Algebraic Polynomials 1. Lemmata 120 2. How the Structural Properties of a Function Affect Its Approximation 125 3. Converse Theorems 129 4. Bernstein's Second Inequality 133 5. Existence of a Function with Preassigned Approximations 136 6. The Markoff Inequality 137 Chapter VII. Approximation by Means of Fourier Series 1. The Fourier Series 141 2. Estimate of the Deviation of Partial Sums of a Fourier Series 149 3. Example of a Continuous Function Not Expandable in a Fourier Series 153 Chapter VIII. The Sums of Fej^r and de la Vallee-Poussin 1. Fej^r Sums 157 2. Some Estimates of Fejer Sums 161 3. De la Vallee-Poussin's Sums 167 Chapter IX. The Best Approximation to Analytic Functions 1. The Concept of Analytic Functions 172 2. S. N. Bernstein's Theorems on the Best Approximation to Periodic Analytic Functions 177 3. The Best Approximation to Functions Analytic on a Segment 183 Chapter X. Properties of Some Analytic Expressions as Means of Approximation 1. Expansion by Tchebysheff Polynomials 195 2. Some Properties of Bernstein's Polynomials 196 3. Some Properties of de la Vallee-Poussin^ Integral 206 4. Bernstein-Rogosinsky's Sums 217 5. Convergence Factors 220 Bibliography 227 Index 231 Index of Symbols 232
INTRODUCTION The constructive theory of functions is a branch of mathematical analysis dealing with questions that arise in the approximate representation of arbitrary functions by the simplest analytical expedients possible. In this book we shall refrain from considering classes of functions of too extensive a character, and shall restrict ourselves to the investigation of the following two important classes: I. Real functions that are defined and continuous over a specified segment [a, b]. We shall denote the set of all these functions by C([a, b]). II. Real functions that are defined and continuous over the entire real axis (— oo, + oo), and at the same time have the period 2w, so that the equa- ti0n J(x + 2n) =f(x) is true for every value of x. We shall denote the set of all these functions byC2r. For the approximate representation of functions of both these classes we shall employ two particularly simple types of function: For class C([a, b]), the usual algebraic polynomials P(x) = c0 + cxx + c2x2 -f ... -f cnxn with real coefficients; for class C2T, however, the trigonometric polynomials, i.e. functions of the form T(x) —A + (a± cos x + bx sin x) + • • • + (an cos nx + bn sin nx) with real coefficients A, a^ &*• We still have to explain what we mean by the approximate representation of a function f(x) by a polynomial P(x) or T(x). This may be done in a number of ways. We shall denote a polynomial P(x) as an approximate function for a function f(x) € C([a, &]), if for all values of x e [a, b] the inequality \P(x)-f(x)\ <e holds true, where the constant € > 0 is characteristic of the degree of approximation attained. Similarly in this connection we denominate a trigonometric polynomial T(x) as an approximation for a function f(x) e C2T, if for all real values of x, the inequality \T(x)-f(x)\ <e holds true. vu
viii INTRODUCTION On account of the 2w periodicity of T(x) it is, furthermore, sufficient if this last inequality be satisfied in a segment of length 2w, for instance, in the segment [0, 2w], and even simply in the half-segment open to the right, since it is then actually satisfied over the whole of the axis. If we make the principle of evaluation, thus defined, basic for the closeness of approximation in our theory, then we may denominate it as the Theory of Uniform Approximation. Clearly from this standpoint, the value max \P(x)—f(x)\ may serve as the "measure" of the accuracy attained in the case/(x) e C[a, b]> and the value , ^ max \T{x) — f(x)\ — oo<:r<-|-oo in the case/(x) e C2r. This is, so to speak, the "distance" between f(x) and P(x), or between f{x) and T(x), respectively. [Parts II and III, described below, are to be published in English at a later date.] Part II of the book is devoted to the Theory of Mean Value Approximation. We shall there deal with the approximate representation of functions f(x) of an essentially more general type, for which we shall, however, again make use of the standard algebraic polynomials P(x) and trigonometric polynomials T{x) as approximate functions; we shall, nevertheless, change the criterion for the attained accuracy of the approximation. We shall, in fact, use the integral j[P{x)-f(x)fdx as the "measure of distance" of the two functions f(x) and P(x), and similarly the integral f[T(z)-f(z)]*dz as the distance of a trigonometric polynomial T(x) from a given function f(x) in the segment [—7r, +t]. The approach thus modified will lead us to an essentially different theory with new formulations of problems and fresh results. Finally in Part III, we shall investigate the problems arising out of interpolation. As a criterion of the approximation of a polynomial P(x) to & function f{x) e C([a, 6]), neither the smallness of the value max \P(z) — f(z)\ a£x£b nor of / ][P(z)-f(z)]*dz
INTRODUCTION ix will be of service, but the actual coincidence of P(x) with/(a;) at various previously selected points ("interpolation nodes") Xlf £2, • • • j xn of the segment [a, b]. The same problem arises in the approximation of a function fix) e C2ir by a trigonometric polynomial T(x)> in which the interpolation nodes must lie in one and the same segment of the length 2w. As we shall see, all these points of view are most closely interconnected, so that the theories pertaining to them overlap to a very high degree. In fact, it is this interlocking of manifold ideas, methods, and facts that—quite apart from its highly practical significance—is chiefly responsible for the constructive theory of functions being one of the most beautiful branches of mathematics.
UNIFORM APPROXIMATION
CHAPTER I WEIERSTRASS' APPROXIMATION THEOREMS § 1. Weierstrass* First Theorem The following fundamental question is intrinsic in the theory of continuous approximation from the outset: "Can every arbitrary, continuous function in general be approximately represented by a polynomial with arbitrarily postulated accuracy?" Weierstrass [1]l found it possible to give an affirmative answer to this in 1885. We formulate his result as: Weierstrass' First Approximation Theorem. For an arbitrarily assumed f(x) e C([a, &]), where € > 0, a polynomial P(x) exists of such type that for all values of x e [a, 6], the inequality P(x) — f(x) < € is satisfied. From the large number of proofs now available for this theorem, I here present one that depends on another equally important theorem in analysis, i.e., one of S. N. Bernstein's theorems [1]. Lemma 1. The following identities hold good: ZCknxk(l-x)n-k = l, (1) 2(k — nx?Cknxk(\— x)n'k =nx(l — x). (2) Proof. Identity (1) is trivial; it follows straight from the standard binomial formula, putting a = x and b = 1 — x in the expansion (a + b)n=£cknakb-k. (3) The proof of the second identity is not quite so simple. Putting a = 2, and 6=1, (3) above gives us the identity ZCknzk = (z + l)n (4) Taking the first differential of (4) and multiplying it by 2, we get 2hCknzk=nz(z + l)n-\ (5) *=o 1 Figures in square brackets refer to the bibliography at the end of this volume. 3
4 I. WEIERSTRASS* APPROXIMATION THEOREMS and differentiating (5) and again multiplying the result by z, we get jt k2Ckzk = nz(nz + 1) (z + If2. (6) X Now put z = in the three identities (4), (5), and (6), and multiply 1 — x each of them by (1 — x)n. This will give three new identities j;C*z*(l-zr-* = l, (7) 2kCknxh{l-x)«-1' = nx, (8) %k*Chnxk{\ — z)n~k = nx(nx + 1 — *). (9) Multiplying (7), (8), and (9), respectively, by n2x2, —2nx, and 1, and adding the results, we obtain the required identity (2). Corollary. For all values of x Z(k-nx)*Ckxk(l - x)n~k^^. (10) *-o 4 For, since 4z2 — 4x + 1 = (2a? — l)2 ^ 0 it follows that x(l — x) ^ —. 4 Lemma 2. Ld x e [0, 1] and 5 > 0 arbitrarily. If we also denote by An(x) J/ie sd of k-values from the range 0, 1, 2, 3, . . . , n, for which k x n ^d, (11) then Z Cknxk(l - *)"""* ^ -L (12) Proof. For k e A„(:r), it follows from (11) that (t-n«)« n2<S* -
1. WEIERSTRASS* FIRST THEOREM 5 and hence 2; Oj^l-*)-*^-^ 2 (k-nx)*Cknxk(l-x)n-* If we now extend the summation on the right-hand side to all values in the range k = 0, 1, 2, 3, . . . , n, then the right-hand sum does not decrease, since for x e [0, 1] none of the newly added summands is negative. Hence the inequality (!) leads immediately to the desired inequality (12). The gist of this lemma is—in brief—that with extremely large values of n in the sum £ Ckxk(l -*)"-* (13) *-o only those summands are significant whose index k satisfies the condition while the remaining summands contribute but slightly to the total. <s , Definition. If f(x) is a given function in the segment [0, 1], then every polynomial of the form *-o \toj may be denominated a Bernstein polynomial of the function fix). For large values of n, and if f(x) is continuous, such a polynomial will differ only slightly from f(x). For—as we have already seen—those sum- k mands for which - is remote from x play hardly any part in the sum (13), n and this holds also for the polynomial Bn(x), since all the factors/ ( — 1 are of course bounded. In the polynomial Bn(x), accordingly, only those sum- k mands are substantially significant for which - lies in close proximity to x. n But for these summands the factor /1 -) differs only slightly from f(x), because of the continuity of f(x). This, however, means that the whole polynomial Bn{x) varies only slightly if we substitute f(x) for /(-) in its summands. In other words: the approximate equation holds good. jb.(x)«i;/(*)o;^(i-*)
6 I. WEIERSTRASS* APPROXIMATION THEOREMS From this and equation (1) it immediately follows that This heuristic consideration is formulated in exact terms in the following theorem: S. N. Bernstein's Theorem. If fix) is continuous in the segment [0, 1], then in relation to x UmBn(x)=f(x) (14) n-»oo holds uniformly in this segment. Proof. Let M be the greatest value of \f(x)\ in [0, 1]. If, furthermore, € > 0 is arbitrarily assumed, then in consequence of the uniform continuity of f{x) in the segment [0, 1], a number 8 > 0 can be found such that for the inequality \x"-x'\ <<5 !/(*")-/(*')!< it is always true. Now let x be a value arbitrarily chosen from the segment [0, 1]. From equation (1) so that f(x)=2n*)CnX*(l-z)n-\ *-0 B.(x) - /(*) =J£ {/ (fy - /(*)) Cknxk(l - *)*"*. (15) Now let us split the series of values k = 0, 1, 2, . . . , n into two classes rn(:r) and An(z) by determining: k € rn(x), wenn k c An(x), wenn k x n k x n <<5, ^<5. The sum (15) correspondingly splits also into two sums 2r and 2A. In the first e $-«■» <
1. weierstrass' first theorem whence \Zr\ ^ ± £ (fnX\\ _ xf~\ and as v £ OUk(l - *)""* < Z C>x\l - *)-* = 1 , it follows that In the second sum, if then, from (12), irIS {-. (16) '(i)-/W 2M, k *,, ,n-h ^ M \Za\^2M 21 oU{i-x)n~k^ *€J«(*> 2nd2 This inequality combined with (16) gives 2 ' 2nd2 For sufficiently high values n > N9 42<£ (17) no2 and consequently I *.(*)-/(*) I <*• Hence the theorem is proved; for the choice of Nt is determined only by inequality (17) and is in no way dependent on the value of x selected. We are now in a position to prove the Weierstrass theorem previously mentioned. In fact the Weierstrass theorem follows directly from the Bernstein theorem, if segment [a, b] coincides with segment [0, l].2 Now let segment [a, b] be different from segment [0, 1]. Then let us introduce the function <p(y) = /[« + y(b — a)] 2 We observe, however, that the Bernstein theorem is more productive than the Weierstrass in this case, as it provides a sequence of well-defined polynomials, while the Weierstrass theorem only establishes the existence of such a sequence of approximations, without stating anything in regard to its construction.
8 I. WEIERSTRASS* APPROXIMATION THEOREMS which is defined and continuous in the segment [0, 1]. From what has just been proved, a polynomial Q(y) =Zcky* must exist that satisfies the condition n I <e (18) f[a+y(b-a)]-Zcky* for all values of y e [0, 1]. x — a But for every value of x e [a, b], the fraction lies in the segment b — a [0, 1]; it can therefore be substituted for y in (18). This gives \f(*)-2*tl^)k\<e, I *.o \o — a) I which proves that the polynomial *-o \o — a) approximately represents the function f(x) with the required degree of accuracy. We can express the Weierstrass theorem in still another form: A. Every continuous function f(x) in the segment [a, b] is the limiting function of a uniformly convergent sequence of polynomials in this segment. In point of fact, let us take the null-sequence en = - , then for each such n value en a polynomial Pn{x) can be found that satisfies the condition \Pn(x)-f(*)\ <~ (a^x^b) Hence, clearly Pn(x)^f(x) for n —> oo .8 Finally, we give the Weierstrass theorem still a third form: B. Every continuous function in a segment can be expanded into a uniformly converging series of polynomials in that segment. Suppose we have found a series of polynomials uniformly converging on f(x), then let Q1(X) = P,(X), Qn(x) = Pn(x) - Pn-lW (n > I). 3 Occasionally, we denote uniform convergence by the symbol =s .
2. WEIERSTRASS* SECOND APPROXIMATION THEOREM 9 Then the partial sums of the series coincide with the polynomials Pn(x)> so that this series itself uniformly converges onf(x). § 2. Weierstrass' Second Approximation Theorem WEIERSTRASS, second theorem states the possibility of approximately representing continuous periodic functions by trigonometric polynomials to any required degree of accuracy. Weierstrass' second theorem. // f{x) e C2r and € > 0 be arbitrarily assumed, then there exists a trigonometric polynomial T(x) such that for all real values of x the inequality \T(x) — f(x)\ < € is satisfied. This theorem—like the first—also admits two other formulations (of type A, and B, respectively). A particularly simple proof was furnished for this by De La Vall^e-Poussin in 1908 [1]; we follow it here. Lemma 1. // <t>(x) e C2T, then for all values a the equation a+2n 2n J <p{x) dx=J(p(x) dx a 0 holds true. In fact, we have a+2n 0 In a+2n J (p(x) dx = J cp(x) dx + (p(x) dx+ (p{x) dx. a a 0 2n If in the last integral on the right we put x = z + 2w and consider the equation <f>(z + 2w) = 4>(z), then for this last integral we get the value — J<p{z)dzi from which the lemma follows. Lemma 2. The identity4 »/2 (2n—1)11k (2n)I! -2 holds true. /«*'*-«.? <i9> 4 The symbol n\\ denotes the product of all natural numbers not exceeding n and even or uneven according as n is even or uneven, e.g., 8!! = 2 • 4 • 6 • 8.
10 I. WEIERSTRASS* APPROXIMATION THEOREMS Proof. If we denote the integral (19) by £/2n, then integration by parts gives 7J/2 U2n= f cos2n-1td(sint) n\l = [sin t cos2*-1 t)o12 + (2n — 1) j cos2*-2 t sin2 tdt, whence it follows that w/2 U2n = (2» — 1) / (cos2*-2 tf — cos2*0 A = (2n — 1) (J72n-2 — U2n) , o so that finally we have Urn-—^-U2n-2. If in place of n we substitute here the successive values n — 1, w — 2, ...,1 and multiply the corresponding sides, of the equations so obtained, we arrive at equation (19). Definition. If f{x) e Cin, then we denominate an integral of the form — n as a De La Vallee-Poussin singular integral. De La Vallee-Poussin Theorem. For all real values limVn(x)=f(x) is uniformly true. Proof. If in the De La Vallee-Poussin integral we put t = x + u, then according to Lemma 1, we need not alter the limits of integration when making this substitution, and we obtain The further substitution u = 2t gives n/2 F»(a:) = (2^TyT!^_//2/(a: + 2<)cos2n<^-
2. WEIERSTRASS* SECOND APPROXIMATION THEOREM 11 Let us split this integral into two parts, namely for the segments I — - , 0 and 0, - . Substituting — t for t in the former, we get (2»)H 1 n,r2 The behavior of the integral is plainly characterized by this form. The factor cos2*11 becomes exceedingly small for large values of n, decreasing all the more, the further t lies from zero. Hence only those elements whose lvalues are very close to zero are significant in the integral. But for all these values of t, the factor f(x + 2t) + f(x - 2t) differs only very slightly from 2f(x) and can therefore be replaced by 2f(x) without appreciable error. Thus we arrive at the approximation equation which, in virtue of (19), transforms into y*{x)**f(x). As the accuracy of the approximation equation increases with the growth of index n, it leads in the final analysis to the statement of the De LaValu£e- Poussin theorem. Obviously, the heuristic considerations just employed do not take the place of rigid proof; they are, however, very instructive in that they reveal the mechanisms which in fact govern a whole host of similar analytical devices. Indeed, we have already recognized the same mechanism as vital in the Bernstein polynomials. Let us now return to the proof of our theorem. For every assumed value of e > 0, we can find a value 8 > 0, so that for \x" — x'\ <2S the inequality |/(z")-/(*') I <{■ is always true. The possibility of such a choice of 8 follows from the uniform continuity of/Or). On this point, a few details may still need to be investigated, but we will leave this until the conclusion of the proof in order not to interrupt the train of thought. Now let x be any given real value. From (19) then n ' (2w—1)!! n I M '
12 I. WEIERSTRASS, APPROXIMATION THEOREMS from which it follows that Vn(x) -/(*) = {2n-\)\\ lil {t(X + 2t) +/(^~ 2t) ~ 2/(*)}cos*»*d«. If we split this integral into two portions extended over the segments [0, 8] and \d, - , we observe that in the former the inequality \f(x + 2t)+f(x-2t)-2f(x)\ ^ \f(x + 2t) - /(*) | + | f(x - 20 - /(a?) | < e, in the latter the inequality \f(x + 2t) + f(x — 2t) — 2f(x) | ^ 4 if is valid, wherein M = max |/(ar)| # From this it immediately follows that ,n \ | | l ( «* n/2 "I |F,(«) - /(a?) | < ,2n_1)U ~ e fcos2"t dt + 4 if j cos2»* <m, in which »/2 f cos2" tdt < f cos2n *<ft (2n—l)!lgg (2»)I! 2 We notice, further, that the function cos t in the segment 0, | decreases monotonically; if we denote cos2 5 by g, then w/2 /" cos2ntdt <~gn. All this taken together leads to the evaluation |Fa(x)-/(,)|<|+^^T]2^. As, however, (2w)l! __2 i_ 2tj —2 (2»— 1)!! ~~3 ' 5 '" 2n— 1 2n <2n,
2. WEIERSTRASS* SECOND APPROXIMATION THEOREM 13 we have \Vn(x)-f(x)\<j + 4Mnq». For 0 < q < 1, it is evident that lim nqn = 0, n-*-oo so that for n > Nt, also 4Mnqn < — and thus |F,(*)-/(*)|<c , which proves the theorem, since Nt is independent of x. We still have to demonstrate the uniform continuity of the function f(x). Now Cantor's well-known theorem in analysis on the uniform continuity of continuous functions relates to functions which are given and continuous in one interval and can not be transferred to functions that are defined and continuous along the whole real axis. For instance, it can easily be shown that the function y = x2, though continuous throughout, is by no means uniformly continuous on the axis. Nevertheless the function f(x) appearing in the De La Vall^e-Poussin theorem is periodic, and this circumstance guarantees uniform continuity. If e > 0 is an arbitrarily assumed value, then the uniform continuity of f(x) in the segment [0, 47r] gives a value 5 > 0 such that from I*"—a/| <<5, 0^x"^4:7z, O^x'^ln it always follows that Without loss of generality, we may select 8 < 2w. Now let x and y be two given points, for which \x-y\ <d and, furthermore, let x < y. Then represent x by the form x = 2nw + u, where 0 ^ u ^ 2w, and in addition put v = y — 2nw. Then v > u ^ 0 and, moreover, v — u = y — x < 6 < 2w, whence v < 47r. The two points u and v accordingly lie in the segment [0, 47r], so that
14 I. WEIERSTRASS, APPROXIMATION THEOREMS As, finally, on account of the periodicity, f(x) = f(u) and/(y) = f(v), then also l/(*)-/(y)l<«. Hence the proof of the De La Vall^e-Poussin theorem is complete. To obtain Weierstrass' second theorem from this one, it is clearly sufficient for Vn(x) to be a trigonometric polynomial. For this we require Lemma 3. The product of two trigonometric polynomials is itself a trigonometric polynomial whose ordinal6 is equal to the sum of the ordinals of its two factors. Proof. If we multiply the polynomials n Tn(x) = A + ]£ (ak cos kx + bk sin kx) and m Um(x) = C + 2j (ck cos kx + dk sin kx), we obtain a sum whose summands pertain to the following three types: cos kx cos ix, sin kx sin ix, cos &# sin ix. (20) By means of the formulas cos a cos /? = -r- [cos (a — /?) + cos (a + /?) ], sin a sin /? = — [cos (a — /?) — cos (a + /?)], sin a cos /? == — [sin (a + jS) + sin (a — /?)], (21) we next satisfy ourselves that each of the products in (20) represents a trigonometric polynomial, hence also each linear combination of these products is such a polynomial. We now have therefore only to calculate the ordinal of the product. From formula 21 it is evident that this order can not exceed n + m, and we immediately establish also that it is not smaller than that: the product of the two highest terms of Tn(x) and Um(x) is 6 8 If \an\ + \bn\ > 0, we understand by the term "ordinal" of the polynomial Tn(x) = n A + ]T (a*, cos kx + bk sin kx) the number n. • &=i 6 From the formulas (21) themselves, it follows, further, that the terms cos (n + m)x and sin (n + w)x of the product are obtained only through multiplication of both the highest terms of Tn(x) and Um(x).
2. WEIERSTRASS* SECOND APPROXIMATION THEOREM 15 (an cos nx + bn sin nx) (cm cos mx + dm sin mx) = yf(a»c« ~~ b»dm> cos (w + m)* + (a»rf« + 6»c"») sin (** + m) *] + A, in which X is composed of terms of lower order. As the coefficients an, bn cmj dm are real, so too are the coefficients of cos (n + m)x and sin (n + m)x; in order to show, furthermore, that both of the coefficients will not vanish simultaneously, we form the sum of their squares and obtain (ancm - bndmf + (andm + bncm)* = (a2n + b2n) (c2m + d2m) > 0. Remark. It should be noted that this proof holds only for real polynomials. If (contrary to our previous convention) we were to admit complex coefficients, then the order of the product might come out lower than the sum of the ordinals of the factors. For instance, (cos x + i sin x) (cos x — i sin x) =■ 1. Now let us prove additionally the simple Lemma 4. Every even trigonometric polynomial T(x), satisfying therefore the functional equation T( — x) = T(x) can be put in the form n T(x) = A + £ ak cos kx containing no sine functions. To prove this, let us add the two equations n T(x) — A + £ (akcos kx + 6* sin kx), n T (— x) = A + £ (ak cos kx — bk sin fca;) and divide the result by 2. It is now no longer difficult to prove that Vn(x) is a trigonometric polynomial. First of all, u 1 -f cos u is a polynomial of the first order. Consequently cos2n - is a polynomial of the nth order and, as it is an even function, it can be represented in the form w ** cos2n—- = L + £ lk cos kx. 2 A:—1
16 I. WEIERSTRASS' APPROXIMATION THEOREMS Hence it follows that (2n)!! Vn(x) n ht If{t) [L +Mlk cos k{t~~x)]dt > whence (271—1)! (2ri)\\ 1 ? f n 1 Vn(%) =7^ rTfi^r- / /W \L + 2J h (coslet coskx + sinkt sinkx)\ dt. which means in fact n Vn(x) = A + ]£ (ak cos & a; + bk sin & a;), if we put 2 7T J ak = (2»—1)!! 2: (2n)!! Z* whereby Weierstrass* second theorem is completely proved. § 3. The Reciprocal Coherence Between the Two Weierstrass Theorems We now show that Weierstrass' first theorem follows from his second. In the segment [—w, w] let a continuous function/(x) be given. Then introduce the function From this we immediately see that g(ir) = >g( —w), and we can therefore extend the domain of definition of g(x) by means of the functional equation g(x + 2t) = g(x) over the whole of the real axis; the function g{x) thus extended then appertains to the class C2r. Weierstrass' second theorem therefore gives, for every value e > 0, a trigonometric polynomial n T(x) = A + £ (ak coskx + bk$inkx), which satisfies the inequality \g(z)-T(z)\<j
3. RECIPROCAL COHERENCE BETWEEN WEIERSTRASS* THEOREMS 17 for all real values of x. Having determined this polynomial, we then put 2(\°k\ + \bk\)=M. From elementary analysis it is well known that the functions cos z and sin z can be expanded into the power series z2 z4 . z3 , z6 COsz = l--+--..., sin* = z--+-- , which converge uniformly in every finite segment. In particular, their convergence is uniform also in the segment [ — tit, +nir|, where n may stand for the order of the polynomial T(x). Hence there exists a sufficiently large number m such that for all values of z pertaining to the segment [ — nv, +nw] £ S \cosz — Cm(z)\< — 9 \ainz — Sm(z)\<-^ , in which Cm(z) and Sm(z) are the partial sums of the series expansion given above. For all values of x in the segment [—w, w] and all values of k in the sequence 1, 2, . . . , n, therefore, " \coskx — Cm(kx)\<—9 \sinkx — 8u(kx)\< — 9 whence n n g \ £ [aL cos kx + bk sinkx] — £ [akCm(kx) + bkSm(kx)]\ <~ . If we now put Q(x) =A+% [akCm(kx) + bkSm(kx)], then Q(x) is an ordinary algebraic polynomial, which satisfies the inequality \T(z)-Q(x)\<j and hence the inequality \g(z)-Q{x)\ <e in the whole segment [ — t, t]. But the polynomial P(»)-Q(*)-/(-*,i-/W»
18 I. WEIERSTRASS, APPROXIMATION THEOREMS then satisfies the inequality |/(*)-P(*)| <e in the entire segment [—7r, t]. That is to say: we thus have Weierstrass' first theorem for a continuous function in the particular segment [—w, t]. In complete analogy with'the transition effected in §1 from the segment [0, 1] to a given segment [a, b], we can here also assign our result to any chosen segment [a, b]. It is somewhat more difficult to prove the converse, that Weierstrass' second theorem follows from the first. For this purpose we require the Lemma. // a function f(x) is defined and continuous in the segment [0, t], then for every value e > 0 there exists an even trigonometric polynomial T(x) that satisfies the inequality \f(x) — T(x)\ <e (0^x^7i)% For the function /(arc cos y) is defined and continuous in the segment [ — 1, +1]. From Weierstrass' first theorem there exists therefore a poly- n nomial ^2 ckVk that for all values of y of segment [—1, +1] satisfies the inequality /(arc cos y) — Z cky*\ k=0 <e or—what is the same thing—the inequality n I /(*) — ^c*cos* z\<e, *=o I for all values of x in the segment [0, w]; and for completion of the proof we n need only add the remark that ^ c* cos* x 1S an even trigonometric poly- nomial. Hence we can derive Weierstrass' second theorem from the first. Let f(x) be an arbitrary function of the class C2ir. According to the lemma, we get for the two even functions f(x)+f(-x), [f(x)-f(-x)]sinx two even trigonometric polynomials Ti(x) and T2(x) such that in the whole segment 0 ^ x ^ w, \f(x) + f(-z)-Tx(x)\ <\
3. RECIPROCAL COHERENCE BETWEEN WEIERSTRASS* THEOREMS 19 and \[f(x)-f(-x)]smx-Tt(x)\ <j. Obviously these last two inequalities are also satisfied for — w ^ x < 0, since the left-hand members remain unaltered if we substitute — x for x, and by reason of the periodicity of all the functions occurring in them, they therefore hold good for all real values of x. Let us now write the two inequalities as equations f(x) +f(-x) = Tx{x) +a1{x)i [f(x) — /(— x)] sin x = T2(x) + a2(x), in which \ai(x)\ < - , multiply the first of these by sin2x, the second by sin x, and then add the two together. This gives (on dividing by 2) f(x) sin* x = T3 (x) + p(x) (\ 0(x) \< j), in which Tz(x) is a new trigonometric polynomial. In the last equation/(j) is a wholly arbitrary function from C2ir; it holds therefore also for the function /( x 1, and so f(x-jjsm*x= TA(x) + y(x) (lr(*)l<4) If in this we substitute x by x H— and, in addition, put we get f(x) cos* x = T6(x) + d(x) (\6(x) | < ~) in which Th(x) is another trigonometric polynomial. Then f(x) = Tz(x) + Th{x) + p(x) + d(x), so that the polynomial Tz(x) + Th(x) differs from f(x) by less than e.7 7 De La Valleb-Poussin [2].
CHAPTER II ALGEBRAIC POLYNOMIAL OF THE BEST APPROXIMATION § 1. Fundamental Concepts Although Weierstrass' first theorem states that any function of class C([a, b]) can be approximated to any assigned degree of accuracy by a polynomial, the degree of that polynomial may become very high. Thus there arises of itself the question as to the accuracy of approximation that can be reached if from the outset we limit the degree of the polynomials. This approach leads to new concepts. We denote by Hn the set of all polynomials whose degree does not exceed the number n, i.e., polynomials of the form P(x) = c0 + clX + c2x2 -\ h cnxn. Here the coefficients Co, Ci, . . . , cn are arbitrary real numbers. In particular, the principal coefficient cn may vanish, whence ffoc^cffjc... (22) ensues. Now let f(x) be a function of class C([a, b]) and let P{x) be an arbitrary polynomial. Then we set A(P) = max \P(x) — f(x)\ and define A(P) as the deviation of the polynomial P{x) from the function f(x). If now we let the polynomial P(x) cover the entire set Hn, then the corresponding values of A(P), which are never negative, form a set of numbers bounded from below, their exact lower bound being Eu = Eu(f)= inf {A(P)} Penn This quantity En is said to be the least deviation from/(x) of the polynomials belonging to Hn, or also the best approximation to f(x) by polynomials from Hn. For the time being, however, these two formulations are not justified inasmuch as it has not been established whether a polynomial P*(x), for which A(P*)=En (23) 20
1. FUNDAMENTAL CONCEPTS 21 holds true can at all be found in Hn; En may therefore not be regarded as a deviation value as yet. Later, however, we shall prove the existence of a polynomial P*(x) which satisfies (23), and justify the terminology introduced by us. Obviously From the relations (22) we can see that with increasing n the number set A(P) increases as well, i.e., its lower bound does not increase; hence Eq ^ E± ^ E.2 ^ • • • . From this and from the first Weierstrass theorem it follows that UmEn = 0. In fact, for any preassigned e > 0 we can find a polynomial P{x) which fulfills the condition A(P)<e. If its degree is n0, we first have E^ < e ard, hence, for n > n0, a fortiori En <e. Definition. If P(x) = % c*** fc=0 is any polynomial, we set Jf(P)=max|P(x)|, L(P)=Z\ck\ and define the number M(P) as the norm and the number L(P) as the quasi- norm of the polynomial P(x). As a matter of fact, the norm of a polynomial depends not only upon the polynomial alone but also on the original segment [a, b]; the quasi-norm, however, is free of this shortcoming. Theorem 1. For a given specified segment [a, b] and a specific integer n ^ 0 there exist two positive constants A and B such that for each polynomial P(x) e Hn the equations M(P) ^AL(P), (24) L(P) ^ BM(P) (25) are satisfied.
22 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION Proof. The existence of the constant A is trivial since any of a finite number of continuous functions 1, x, x2, . .. , xn is bounded on the segment [a, b]. Hence, if A is a number greater than the maximum absolute value of each such function over [a, b], then for every value of x e [a, b] \P(x)\^2\ck\\x*\tiAL(P) *-o holds true. The existence of constant B is somewhat more difficult to prove. To this end we choose from segment [a, b] a set of n + 1 points x <y < ..• <t , which we retain and leave unaltered in the following. Then c0 + c±x + c2x2 + ••• + cnxn = P(x), co + ciV + C2V2 + h cmy* = P(y), Co + ^t +c2t* + —+c,t» =P(t). (26) (27) If we know the points (26) and the values of polynomials at these points, we can calculate the coefficients c0, ch . . . , cn of that polynomial from the system of linear equations (27) whose determinant Z) = 1 x x2 i y y2 l t t2 tn is a Vandermonde determinant, hence different from zero. According to Cramer's formula we therefore have Ck If we expand this determinant by the elements of the (fc + 1 )-sl column, we obtain the expression 1 77 1 X. i y. l t . . x*-1 P(x)xk+1 . • s/*-1 P(y)yk+l- .. i*-1 P(t) ik+l . .. xn ..yn . in {k = 0, 1, . .,?i) ck = V»P(x) + lfP{y) + ... + llk)P(t), whose coefficients l(xk\ l(yk\ . .. , l\k) are fully defined by the points (26) and the number k, but are totally independent of the choice of the polynomial.
1. FUNDAMENTAL CONCEPTS 23 From this we derive \ckl£{\lY\ + \l?\ + - + \lF\}M{P). By adding these inequalities valid for the values of k = 0, 1, . . . , n, we merely have to set B equal to S{\^)\ + \if\ + - + \f)\} in order to obtain (25). Corollary 1. Let S = {P(x)\ be any family of polynomials from Hn. To have all the polynomials belonging to S bounded over one segment [a, b] by the same number it is necessary and sufficient that the set of their quasi-norms be bounded. In fact, if all the polynomials from S are bounded by the same number K, then M(P) ^ K holds true for P e S. But then L(P) ^BK. Conversely, if L(P) ^K for PeS, then M(P) ^ AK. Corollary 2. // {Pm(x)} is a polynomial sequence belonging to Hn and P(x) any polynomial from Hn, then for the sequence Pm(x) over a segment [a, b] to be uniformly converging to P(x), the condition lim L{Pm— P) =0. is necessary and sufficient^ since uniform convergence of Pm(x) to P(x) signifies that lim M{Pm— P) =0. m-K30 Remark. According to these corollaries a subset of Hn is uniformly bounded over any segment if it is uniformly bounded over one segment. Similarly, a sequence from Hn converges uniformly to a polynomial P(x) e Hn over any segment if this is the case over one segment since the quasi-norm of a polynomial is independent of the choice of the segment. The reader should bear in mind, however, that all this holds true only as long as we confine ourselves to polynomials the degree of which does not exceed a fixed number n. For
24 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION example, the polynomial sequence 1, X, X2, X'\ ... converges over the segment [0, §] uniformly to 0 but loses this property over segment [0, 1]. Over the latter segment, however, it is still bounded, whereas over segment [0, 2] this is no longer so. Definitions. 1. A system N(xh x2, . . . , xn) of n real numbers xh x2, . .. , xn arranged in a specific order is said to be a point in an n-dimensional space. 2. A set E = \N(xi, x2, . . . , xn)} of points in an n-dimensional space is said to be bounded if there exists a constant C such that all the points N(xh x2, . .. , xn) of E satisfy the inequality ZM <C. 3. We say that the sequence of points {Nk(x«\ *«, .... *»,} in an n-dimensional space converges to the point N(xh x*, . . . , xn) if lim ]£ | x\ — Xi; j = 0. Jfc-OO 1=1 We now assign to each polynomial from Hn P(x) =c0 + c±x + c2x2-\ Vcnxn that point in an (n + l)-dimensional space whose coordinates are the coefficients of the polynomial, i.e., the point N(co, ch . . . , cn). We call it the image point of the polynomial P(x). Thus we can reformulate the corollaries from theorem 1 as follows. 1. The polynomials of a subset of Hn are bounded on each segment if and only if the set of their image points is bounded. 2. A polynomial sequence \Pm(x)} from Hn converges uniformly to a polynomial P(x) eHn if and only if the sequence of its image points Nm converges toward the image point N of the polynomial P(x). These theorems enable us to apply to the polynomials the geometric properties valid for the image points. We do this right away. Theorem 2. (Multidimensional Bolzano-Weierstrass axiom of choice.) From any bounded sequence of points of an n-dimensional space we can select a convergent subsequence. Proof. For simplicity we investigate only a two-dimensional space. Let {Mn(xny yn)} with n = 1, 2, 3, . . . , be a sequence of points where i*.l + |y.l <o.
1. FUNDAMENTAL CONCEPTS 25 Hence it ensues specifically that also the sequence of coordinates x Xi, X2, #3j . • •, is bounded; by a known theorem from the fundamentals of analysis it therefore contains a convergent subsequence xnXi %n2> Xj^, • • • > *lm xnk = x• (28) Now we examine the sequence of coordinates y Vnv 2/n,, Vn3, ..., (29) that is, of those points Mnk whose coordinates x belong to sequence (28). By applying the axiom of choice to sequence (29) we obtain a convergent subsequence Vnh, Vnk2, Vnh • • •; Hm y^ = y, for which, moreover, lim a?Wit. = x since {xnki} is a subsequence of (28). Hence the sequence of points {Mnki(*nki, ynki)} converges toward the point M(x, y). If we regard these points as image points of polynomials, then this theorem results in the following Corollary. // the polynomials of a sequence {Pm(x)} eHn are uniformly bounded on any one segment: \Pm(x)\£K9 (30) then from this sequence we can select a subsequence which converges uniformly toward a polynomial P(x) from Hn. This is true since according to (30) the sequence {Nm\ of image points of our polynomials is bounded. Hence we can select from it a subsequence {Nmi\ converging to the point N(c0, ch . . . , cn). But then the polynomial sequence {Pmi(x)} associated with this subsequence converges uniformly to the polynomial P(x) = c0 + CjxH h cnxn. Whence there ensues the interesting though somewhat remote Remark. If a polynomial sequence }Pm(x)} from Hn converges uniformly toward a function f(x), then this function f(x) is a polynomial from Hn. This is true since sequence {Pm(x)} is bounded, hence we can select from it a subsequence {P^WJ which converges uniformly toward a polynomial P(x) e Hn. But since function f{x) is the limit of the complete sequence {Pm(x)}, it is also the limit of the subsequence {P^x)}: hence f{x) = P(x).
26 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION We can now prove the theorem stated earlier relating to the existence of a polynomial in Hn having the least deviation from a preset continuous function. Theorem 3. (E. Borel) If f(x) e C([a, b])y then there exists in Hn a polynomial P(x) for which A(P)=EU. (31) Proof. From the definition of the exact lower bound it follows directly that for every natural number m we can find in Hn a polynomial Pm(x) for which En^A(Pm)<En + — . (32) m Now we demonstrate that all the polynomials of the sequence {Pm(x)} are uniformly bounded on the segment [a, 6]. In fact, for x e [a, b] \Pm(z)\£ \Pm(x) - f(x)\ + \f(x)\<En + -L + |/(3)| and hence \Pu(x)\ <En + l + max \f(x)\ = C, whereby we have proved that the polynomials of our sequence are bounded. We can therefore select from it a subsequence {Pm(x)\ which converges uniformly toward a polynomial P(x) e Hn, and we can easily show that this limit polynomial is the one sought for. According to (32) KmA(Pmi)=Eu. Thus, if in the inequality \PmW-K*)\?*MPm) we pass to the limit, we obtain \P(x)-f(x)\^En. This inequality holds for all values of x e [a, b]; hence A(P) is not greater than En. Since, on the other hand, for no polynomial P(x) from Hn the deviation A(P) is smaller than En, (31) holds true. The polynomial P{x) is said to be the polynomial of best approximation to the function f(x) or the polynomial of least deviation from it. The first to investigate these polynomials was the great Russian mathematician P. L. Tchebysheff (1821-1894), who with full right may be regarded as the creator of the constructive theory of functions.1 It is true, however, that 1 We remind the reader that P. L. Tchebysheff's first investigations into the theory of approximations [1, 2] go back to the year 1853, which is to say that they were published more than 30 years prior to the Weierstrass theorems.
2. tchebysheff's theorems 27 Tchebysheff regarded the existence of these polynomials as a matter of course. In 1905, E. Borel [1] found the required completion of P. L. Tchebysheff's investigations, given here. § 2. Tchebysheff's Theorems The subject of this section are some properties of polynomials of best approximation. Let f(x) e C([a, b])\ we permanently select an integer n ^ 0 and denote by P(x) one2 of the polynomials of the best approximation to f(x) from the set Hn. Hence m*x\P(x)-f(x)\=Eu. If En equals zero, then this means that the function f(x) as such is a polynomial from Hn. We ignore this trivial case and take En > 0. Now the function \P(x) — f(x)\ is continuous on the segment [a, b\. It therefore acquires its highest value in this segment so that there exists at least one point xq e [a, b] at which 1^(^-/(^)1=^. Any such point is said to be an (e)-point of the polynomial P(x). An (6)-point xq is said to be a (+)-point if P(x0)-f(x0)=En, and a (-)-pointii P{xQ)-f(x0) = -En Theorem 1. There exist both (+)-points and (-)-points. Proof. Let us, for instance, assume that there exist no ( — )-points for the polynomial P{x). Then for every x e [a, b] P(x)—f(x) >-En holds true. Thus also the minimum value of the continuous function P(x) — f(x) is greater than — En. If we denote it by — En + 2ft, h being greater than 0, then for all xe [a, b] —En + 2h^ P(x) — f{x) ^ En. 2 In fact, in every Hn there is only one polynomial of best approximation; since however, this has not been proved as yet (it will be done in this section) we have for the time being to speak of "one" of these polynomials.
28 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION From this we derive —Eu + h£[P(x)—h]—f{x)£Eu — h, hence This means, however, that the polynomial P(x) - h deviates from f(x) by less than En, which contradicts the definition of En. The theorem just proved is quite evident from a geometric viewpoint. In fact, if we visualize the two curves y=f(x)+Eu, y=f(x)-En , (33) then for a ^ x ^ b the graph of the polynomial P(x) lies in the strip between the curves (33). The theorem proved states that this graph touches both the upper and the lower curve (33) at least once. This is quite obvious since if it did not touch, e.g., the lower curve y = f(x) — En (cancellation of (-)-points), we could move it slightly downward and obtain a graph running in a strip which encloses the curve y = f(x) more tightly. The proof given above is nothing but a precise explanation of such an approach. Tchebysheff proved that the number of the points of tangency of the graph of y = P(x) with the boundary curves (33) is considerably greater. Accordingly, this leads to Theorem 2. (P. L. Tchebysheff [2]). On the segment [a, b] there exists a sequence of (n + 2) points Xj < X2 < ••• < Xn+2, which are alternately (+)-points and (-)-points. Such a system of points is called a Tchebysheff alternant. Proof. We divide [a, b] by the points u0 = a < ux < u2 < • • • < ut = b into sufficiently small segments [uk, Uk+i] such that in each of them the deviation of the continuous function P(x) — f(x) is smaller than \En. If a segment [uk, Uk+i] contains at least one (6)-point, we say that it is an (e)-segment. In such an (6)-segment the difference P(x) — f(x) is obviously nowhere zero and it therefore keeps its sign. We may thus divide the the set of (e)-segments into two classes; we denote by (+)-segments those on which the difference P(x) — f{x) is positive, and as ( —)-segments those on which it is negative.
2. tchebysheff's theorems 29 Thereupon we number all the (e)-segments successively from left to right dly d2, ds, ..., dN (34) and assume that di is a (+)-segment. We decompose sequence (34) into groups according to the following system: d,,d2, ...,{?*! [(+)-Segment], j dkl+1,dkl+2, •••,^ [(—)-Segment], ( 4fcm_1+i, <tkm-1+2, ...,dkm [(— l^-^-Segment]. J Each group contains at least one segment and, moreover, each segment of the first group contains at least one (+)-point, each segment of the second group at least one ( — )-point, and so on. Thus, to prove this theorem we only need to show that m ^ n + 2 (36) (the preceding theorem warrants only m ^ 2). Suppose that m<n + 2. (37) In view of the fact that in the segments dkl and dkl+1 the difference p(x) — /(*) has different signs, the right-hand end point of dkl cannot coincide with the left-hand end point of dkl+1. We can therefore find a point Z\ lying right of dkx and left of dkl+1. Thus we write symbolically dk> < *i < dkl+1. In a like fashion we can find points z2, Zz, . . . , zm-\ for which dka <z2 <dka+lf dkm-i < zm-i < dkm-1+\ . Now we set Q(X) = fo — X) (Z2 — X) *•• (^m-l — X). According to our assumption (37) m — 1 ^ n so that the polynomial p(x) belongs to Hn. Except at points 2» the polynomial has no zeroes; in particular, it has no zeroes on any of the segments dk, hence it retains on each of them a fixed sign. On each segment of the first group of (35) the polynomial p(x) is positive because then all the factors zt — x are positive; on the segments of the second group of (35) it is negative because one factor, viz., zi — x, is negative. By carrying on this procedure we can ascertain that on all (e)-segments (34) the sign of the polynomial p(x) coincides with that of the difference P(x) — f{x).
30 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION If [ui} Ui+i] is a segment of the original decomposition, which is no ^-segment, then the quantity max \P(x)—f(x)\ (38) is certainly smaller than En; thus, if we denote by E* the largest number in (38), we have E* <En. Now we set R = max \q(x)\ and choose a sufficiently small positive number X for which 3 aR <En — E*, XR<^En, (39) If we also set Q(z) = P(z)-Xq(x), then we can show that the polynomial Q(x) (obviously belonging to Hn) deviates from f(x) by less than En. This being impossible, we have proved this theorem by contradiction. Thus we have brought back the proof to showing that A(Q)<En. (40) Suppose that [ui} ui+i] is a segment of the original decomposition and no (e)-segment, and that x e [ui} ui+\]. Then \Q(x)--m\^\P(x)-f(x)\ + X\Q(x)\^E* + XR<En. If, however, x is a point on an (6)-segment dk then P(x) —f(x) und /.q(x) have identical signs. In this case l^(*)-/(*)l>Ale(*)l, 8 As a matter of fact we can easily see that En — E* < \En, i.e., the second inequality in (39) follows from the first one. If, in fact, up is the right-hand end point of segment dki, then P(up) — f(up) > \En (because dki contains one (H-)-point and the deviation of P(s) —fM on dki is smaller than iEn). On tlie other hand, uv is the left-hand end point of segment [uP, w^i], which lies right of dkx and is no (6)-segment, so thatl P(up) - f(up)\ ^ E*. Hence E* > \En.
2. tchebysheff's theorems 31 because \P{x)-f{x)\>jEu, X\q(x)\ <jEn holds true. It follows that \Q(x) - f(x)\ = \P(x) - f(x) - Xq(x)\ == \P(x) - f(x)\ - 1\q(x)\, and therefore \Q(x)-f{x)\£Eu-X\e(x)\<Eu, since on the (e)-segments p{x) 4= 0. Thus \Q(x)-f(x)\<En, for every value of x e [a, b]. This proves (40) and the theorem. Let us point to the fact that the construction of the polynomial Q(x) is totally independent of (37) and that the inequality (40) always holds true for it. But for m ^ n + 2 there is no contradiction since then the polynomial Q(x) no longer belongs to Hn. Irrespective of the relative difficulties encountered, this proof is based on considerations as simple as those which were made in connection with Theorem 1. P. L. Tchebysheff wants to show that in the case where an (n + 2)- termed alternant does not exist the deviation of the polynomial P(x) from f(x) can be reduced by subtracting from P(x) an appropriately chosen polynomial p(x). Since to do this the absolute value of P(x) — f(x) must be reduced at all (e)-points, the sign of p(x) must at these points coincide with that of the difference cited. Should this difference change its sign less than (n + 2) times, then this requirement may be fulfilled by a polynomial p(x) of a degree not exceeding n. The danger that in this case the polynomial P(x) — p(x) deviate from the function f(x) by En or more can be easily met by multiplication with a sufficiently small factor X. It follows therefore that a polynomial P(x) for which there exists no alternant is no polynomial of the best approximation. We recommend readers to reconsider the entire proof in the light of these heuristic considerations. From the theorem just proved we derive directly the uniqueness of the polynomial of the best approximation: Theorem 3. In Hn there exists only one polynomial of the least deviation. Proof. Suppose that there exist in Hn two polynomials of least deviation P(x) and Q(x). Then for every x in [a, b] -En^P(x)-f(x)^En, -En^Q(x)-f(x)^E1t.
32 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION By adding these two inequalities and dividing by 2 we have This shows that the half sum P(x)+Q(x) B(x) is also a polynomial with the least deviation from f(x). Hence there exists for R(x) a Tchebysheff alternant xl < X2 < •• < X»+2- (41) Now let Xk be one of the (+)-points of R(x). Hence P(xk)-f(xk) , Q(xk)-f(xk) 2 + 2 = "• Now Q(xk) - f(xk) ^ En, hence P(xk)-f(xk) £„ 2 2 ~ or P(xk)-f(xk)^En. (42) But since the difference P(x) — f(x) is no greater than En, only the equality sign is true in (42). In other words, xk is a (+)-point also for P(x); and according to an analogous consideration it is also a (+)-point for Q(x). Thus we obtain P{xk) - f(xk) =En= Q(xk) - f(xk), that is, P(xk) = Q(xk), and P(xk) and Q(xk) also coincide at the ( — )-points of alternant (41). The two polynomials P(x) and Q(x), whose degree does not exceed n, do therefore coincide in the (n + 2)-values (41) of x, which is only possible if they are identical. In addition it can be shown without great difficulty that the existence of a Tchebysheff alternant is a characteristic property of polynomials of the best approximation. Theorem 4 (P. L. Tchebysheff). Let f(x) e C([a, b]) and Q(x) be a polynomial from Hn. We set A -max \Q(z)—f(z)\.
2. tchebysheff's theorems 33 // in the segment [a, b] there exist points xx<x2< • • • < xn+2, (43) for which |«(*i)-/(*<) I =~-A (i = 1, 2, ...,n + 2) (44) and z/ £/i6 szan o/ £/i6 difference Q(xi) — /(x») changes with each passage from one point X{ to the subsequent one z»+i, <Aen 4 = 2£n, and Q(x) is the polynomial of the best approximation to f{x). Proof. Since A = A(Q), A ^ En. We now prove that A = En. Were it not so, then A>En. (45) Let P(x) be the polynomial of the best approximation tof(x). Then Q(xi) - P(*,) = {©(**) - f(xi)} - {P(z<) - /(*«)} , but since |P(z,)-/(z,)|^n<^, it follows together with (44) that the sign of the difference Q(z») — P(z») coincides with that of the difference Q(x%) — f(xi); thus it changes with each passage from one point x* to the subsequent one xi+i. The difference Q(x) — P(x) has therefore a root in every interval (xi, x2)y (x2, x3), . . . , (zn+i, Zn+2), that is, a total of at least n + 1 roots. However, since it is a polynomial of nth degree at most, it is therefore identical with zero. But because of A(Q) = A>En=A(P) this is impossible. This contradiction convinces us of the fact that A = En. In this case, however A(Q)=En and Q(x) is the polynomial of the least deviation. Another theorem which gives for En an estimate downwards also belongs to the same range of ideas. Theorem 5. Suppose we succeeded in finding for the function f(x) e C ([a, b]) a polynomial Q(x) e Hn and n + 2 points *! < X2 < • • • < Xn+2 ,
34 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION such that the difference Q(x{) — f{Xi) changes its sign with each passage from Xi to £»+i- // A denotes the smallest of the numbers \Q(xi)-f(xi)\ (* = 1,2, •••,7i + 2), then A S_En. In fact, were A > En) then a literal repetition of the consideration just made would result in the same contradiction. § 3. Examples. The Tchebysheff Polynomials The theorems of Borel and Tchebysheff prove the existence of a unique polynomial of the least deviation for each continuous function; they suggest no procedure, however, by which to actually find this polynomial. As a matter of fact, the latter problem involves such formidable difficulties that a general solution has not been found to this day. Here we are going to deal only with the two simplest cases, n = 0 and n = 1. For n = 0 the solution is quite simple. In fact, if m and M are the minimum and maximum value of the function f{x) continuous over the segment [a, 6], then __m + M 2 is among all the constants the one with the least deviation from f(x). M — m Geometrically this is obvious since then A(P) = and with each displacement upward or downward on the straight line y = P this deviation obviously grows. The formal proof also encounters no difficulties. Let x\ and x2 be two points for which and also f(Xl) = M, f(x2) ■„ , M — m _ ,, x M — m P-f(xl)= — , P-f(x2)=—^— since A(P) = , the two points x\ and x2 represent a Tchebysheff z alternant whose existence does characterize the polynomial of the best approximation.
3. EXAMPLES. THE TCHEBYSHEFF POLYNOMIALS 35 For n = 1 the problem is quite simple in the case where f(x) can be differentiated twice and/"(x) does not alter its sign, i.e., f"(x)>0. (46) In fact, let P(x) = Ax + B be the polynomial of the best approximation. Of the three points X-t *\ Xn "^» Xo of the Tchebysheff alternant the median one xi is certainly interior to [a, b], i.e., the difference reaches its extreme value 4 there. We have therefore /'(z2)-P'(z2)=0, whence A=f(x2) follows. On the basis of (46), f'(x) is, strictly speaking, an increasing function, hence it can assume the value A only once. Thus there exist within [a, b] no other extreme points of the difference f(x) — P{x)1 so that the other two points xi and ar3 of the alternant coincide with the end points of segment la, b]: xi — a> xz — b. For simplicity we denote the point x2 by c. By the considerations just made we have f(a) - P(a) = f(b) - P(b) = -{/(c) - P(c)} or, more circumstantially: f{a) — Aa — B = f(b) — Ab — B = Ac• + B — /(c). The first of these equations yields AJJ^ZM. ,47, o — a With it we can easily find /(q) + f(c) _ f(b)-f(a) a + c 2 b—a 2 ' 4 According to condition (46) we are dealing here with a minimum value of the difference f(x) - P(x)t i.e., 32 is a (+)-point of P(x).
36 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION These two equations solve the problem since the point c is defined by the equation The solution thus obtained affords a very simple geometrical interpretation. In fact, Eq. (47) states that the straight line y = P{x) is parallel to the chord MN (Fig. 1) which connects the points M[a,/(a)] and N[b,f(b)]. If we write the equation of this straight line: f(a)+f(c) ,/ a + c\ y s = a\x 2-J, we see that it passes through the center D of the chord MQ which connects the points M and Q[c, f(c)]. Fig. 1 Thus we come to the following procedure by which to obtain the linear polynomial of the best approximation to fix): 1. We draw the chord MN, then we determine 2. that point Q on the arc MN at which the tangent runs parallel to the chord MN; 3. finally, we join Q with M and N and draw the median parallel KL in the triangle MQN. The straight line KL is then the graph of the polynomial sought. In the example given we investigate the linear polynomial with the least deviation from the function y = y/x on the segment [0, 1]. Here the slope of the chord MN is equal to one. Equation (48) takes the form 1
3. EXAMPLES. THE TCHEBYSHEFF POLYNOMIALS 37 whence it follows that c = J, and the points Q and D are Q(l, J) and D(J, J). The equation of the straight line KL therefore becomes __!- __J_ y 4: ~ X 8 ? and the polynomial sought is Now we pose the following problem: Find in Hn-i that polynomial P(x) which on the segment [ —1, +1] deviates the least from the function f(x) = xn. If the polynomial sought is P(x) = azn~l + bxn~2 + ••• + r and we set5 B(x) = xn — (as*-1 + &*n-2 + \-r), (49) then the problem is reduced to seeking those coefficients a, b, . . . , r for which the quantity if = max |-R(x)| becomes minimal. Since any polynomial whose principal coefficient equals one can be written in the form (49), and the quantity M is but the deviation of this polynomial from zero, we see that the problem above is identical with the following one: Find among all the polynomials of degree n and the principal coefficient one} that which on the segment [ — 1, +1] has* the smallest deviation from zero. To solve all of these important problems some lemmas are required. Lemma. The identity cos nO = 2*-1 cosn 0 + £ A^cos* 6 (n = l,2,3,...) holds true} X(o\ X(f, . . . , X^ii being certain constants. Proof. For n = 1 the lemma is trivial. Now we assume that it holds true up to and including a specific value of n. According to 6 The symbol R{x) (or P(x)} etc.) introduced by W. L. Gorchakov indicates that the principal coefficient of R(x) is the number one.
38 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION cos I Q 0 a-p a+p cos a + cos p = 2 cos —-r1- cos —^ we have cos (n + 1)0 + cos (n— 1)0 = 2 cos 0 cosnd, whence i (W + l) 0 = 2 cos 0 { 2"-1 cos" 0 + *2 ti? cos* 0 } — 2>* cos e I Jfc-0 J *-0 follows, so that cos (» +1) e = 2n cos^1 e + jt n cos* e, Jfc-0 whereby we have proved the lemma. If, specifically, for — 1 ^ x ^ 1 we set 6 = arc cos x. we can draw 6 the following Corollary. For -1 ^ x ^ 1 the identity n-l cos (n arc cos x) = 2'1-1 zn + J£ * V ** (n^l). (50) holds true. Definition. A polynomial of the form Tn{z) = cos (n arc cos x) (51) is said to be a Tchebysheff polynomial. Equation (51) defines this polynomial only for — 1 ^ x ^ 1, whereas Tn(x), as any polynomial, is defined for all real (and even complex) values of x: the right-hand side of identity (50) gives a representation also for the values of x lying outside of [ — 1, +1]. In the first place we write T0(x) = l. Calculation of a value of Tn{x) for x e [ — 1, +1] is best performed in two stages: 1. We determine in [0, w] the angle 0 for which cos 6 — x . 6 We remind the reader that the symbol arc cos x univocally denotes that angle $ whose cosine is x and which satisfies the inequality 0 ^ 6 ^ ir.
3. EXAMPLES. THE TCHEBYSHEFF POLYNOMIALS 39 2. Then we calculate Tn(x) = cosnd. The Tchebysheff polynomials give us the solution for the above problems on the strength of the following Theorem. Among all the polynomials of degree n with the principal coefficient one, the polynomial T»(x) = ^ty Tn (x) = 2^1cos (n arc cos x) on the segment [ —1, +1] has the least deviation from zero. Proof. We have already shown that in order to find the required polynomials B(x) =xn— (axn~l + bxn~2 + \-r) we must seek the polynomial P(x) =axtl~1 + bxn"2 + hr, which represents the best approximation to the function f{x) = xn on the segment [ —1, +1]. In order that the polynomial P(x) fulfill this requirement, it must have an (n + l)-termed 7 Tchebysheff alternant, i.e., a sequence of points x\ < x2 <'' < xn+i (— 1 ^ xk ^ 1) with a property such that at each of its points the polynomial R(x) attains its maximum absolute value, but changes sign only when passing from Xk tO Xjfc+i. We verify the existence of such an alternant for the polynomial fn(x). In fact, if we set 60 — 0, 0j = — ,..., 0* = —, ..., 0n=-r, n n then cos n6k = (— 1)*; 7 We draw the attention of the reader upon the fact that here the polynomial with the least deviation from xn is to be sought not in Hn but in Hn_x so that the alternant contains only n + 1 points.
40 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION for xk = cos 6k (k = 0, 1, ..., n) (52) we have therefore Since, on the other hand, (—1)* max \>I\(x)\=-±- (53) points (52) form the alternant required. Explanation. P. L. Tchebysheff found his polynomials in about the following fashion: let f(x) be the polynomial with the principal coefficient one sought for, which has a minimum deviation from zero on the segment [ —1, +1]. Let its deviation, i.e., max \f(x)\, in the segment [ —1, +1] be M. As is already known, the polynomial T(x) must assume the values ±M at a total of n + 1 points of the segment [ —1, +1]. At all such points interior to [ — 1, +1], T'(x) must be equal to 0. But T'{x) is a polynomial of (n — l)-st degree, it has therefore n — 1 roots. Thus n — 1 of the points with maximum deviation are interior to the segment [ — 1, +1] whereas the remaining two coincide with the end points of this segment. The two polynomials M*—f*(x) and (l — x*)f'2{z) have therefore identical roots, and for both polynomials the roots interior to the segment [ — 1, +1] are double roots. Hence both polynomials differ from one another by one constant factor at most. Comparison of their maximum coefficients yields M*-f*(x)= (1~X2)//2(X) , (54) n2 a relation already interesting in itself. From (54) it follows that ]/iJf2— T*(x) = ±— |/l — x2T'{x). The derivative f'(x) alters its sign as it passes through every point of the alternant. If it is positive in the interval (a, /3), then T'(z) _ n V~M2—T2(x) Vl — x2 in this interval. By integrating this relation we find arc cos ——- = n arc cos x + C: M
3. EXAMPLES. THE TCHEBYSHEFF POLYNOMIALS 41 hence T(x) = M cos [n arc cos x + C] or T(x) = M [cos C cos (n arc cos x) — sin C sin (n arc cos #)]. Since T(x) is a polynomial, sin C must be equal to 0, hence cos C must be equal to ±1. But the highest coefficient of cos (n arc cos x) is equal to 2n~\ so that cos C = +1 and Af = 2~n+1 whence T(x) = h^Zi cos (w arc cos x) follows. For the time being this equation is true only in the interval (a, fi); but since both sides are polynomials, it is valid everywhere. Now we revert to the theorem proved. Together with Eq. (53) it yields the Corollary. For each polynomial Pn(x) of degree n with the principal coefficient onef the inequality max \Pn(x)\ ^^=\ holds true. But if the principal coefficient of the polynomial Pn(x) of n-th degree is not one but Aq, then max |Pn(x)|>-^L. With this remark in mind we investigate a polynomial of n-th degree on the segment [a, 6]. If we set a + b , b — a then we can take Pn(x) as the polynomial of argument y with the principal coefficient lb — a\n c°{—) ■ Thus max\P.(z)\ = max \Pn(z(y)) | ^ c0^~^
42 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION § 4. Other Properties of Tchebysheff Polynomials The Tchebysheff polynomials Tn(x) = cos (n arc cos x), which, as we have seen, have the least deviation from zero, also have other remarkable properties which we discuss below. In the first place, we derive from the formula cosnO = 2 cos 6 cos (n— 1) 0— cos(ti —2) 6 (n = 2, 3, ...) with the aid of substitution 0 = arc cos x the recurrence formula Tn(x) = 2xTn_x(x) - Tn_2(x) (n = 2, 3, ...), (55) which connects three successive Tchebysheff polynomials. Since T0(x)=l; T1(x)==x) we find from Eq. (55) successively: T2(x) = 2x2— 1, T3{x) = 4x3— 3x, T4(x) = 8x4 — 8x2 + 1, T5{x) = 16x5— 20x3 + 5x, T6{x) = 32 x6— 48s4 + 18x2 —1. This series can of course be continued ad libitum. Another way of obtaining the polynomials Tn(x) is with the aid of the so-called "generating function." We derive it from the formula Zl«cosnd=- ' °!* (-1<«<1), (56) n=0 1 — 2t COS 0 + t1 whose left-hand side is obviously an absolutely convergent series. As for the proof of this formula it should be noted that its left-hand side is the real part of the sum of the geometric progression oo l A _ 1 — tf6i Since 1 _ 1 _ 1 — t cos 0 + it sin 6 I — te* = (1 — tcosd — it sin 6) ~~ (1 —*cos0)* + t* sin2 0 ' the real part of this fraction is also the right-hand side of formula (56).
4. OTHER PROPERTIES OF TCHEBYSHEFF POLYNOMIALS 43 For 0 = arc cos x, Eq. (56) goes over into Y2^ = ZSTn{x) (-1 <«<!,, (57) so that the sequence of the Tchebysheff polynomials appears as the sequence of coefficients of the powers of t in the expansion of the function 1 — 2tx + t2 This function is said to be the generator of Tchebysheff^ polynomials. Expansion of T(t, x) by powers of t is obtained by formally dividing 1 — tx by 1 — 2tx + t2; this leads to a new method of calculating the polynomials Tn(x): l — tx 1 — 2tx + t2\ l — 2tx + t2 1 -f tx + t2(2x2 — 1) + t*(4x* — 3x) +... * tx — t2 tx — 2t2x2 + t3x t2(2 x2— l) — t*x t2(2x2— l) — 2t3{2xi — x) +t*(2x2 — 1) P{ix* — Sx)— t*(2x2— 1) The coefficients of the subsegment 1 +tx + t2(2x2— 1) + t*(4x* — 3x) + ••• do indeed coincide with T0(x), Ti(x)> T2(x), Ti(x)f . . . However, we ean also easily give the explicit formula for the polynomial Tn(x). To do this we first have to prove a theorem which appears to be interesting in itself: Theorem 1. The polynomial y = Tn(x) satisfies the differential equation (1 — x") y" — xy + n2y = 0. (58) In fact, if then y = cos (n arc cos x), v = —== sin (n arc cos x) ^ }/l-x2
44 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION and |/l — x2 y' = n sin (n arc cos x). Differentiation of this identity yields #> 2 yl — x2y" ,- = , cos (n arc cos x), yi — x2 yi — x2 whence (58) ensues.8 We obtain (58) in an even simpler fashion by differentiating the relation jp-y = (1-f»", * n2 which satisfies the Tchebysheff polynomial (cf. (54)). But let us revert to the problem of an explicit formula for Tn(x). We write and substitute this relation into (58). This yields the identity (1 — x2) J; (n — k) (n — k — 1) akxn~k-2 — xj>J(n — k) akxn~k~x +n2 2 <tkX>n~k = 0> which can also be written in the form J; (n — k) (n — k — 1) akxn~k~2 + J} [n2 - (n — k)2] akxn~k = 0 . (59) Here, in the first sum, the summands corresponding to the values k = n and k = n — 1 obviously vanish. We can therefore write this sum 2 {n — k + 2) (n — k + 1) ak_2 a*-* . In the second sum, the summand corresponding to the value k = 0 vanishes, so that identity (59) takes the form [n2 — (n—l)2]a1xn-1 + 2 {(n - k + 2) (n - k + l)ak_2 + [n2 - (n - k)2]ak}xn~k = 0, *-2 8 Our consideration shows in the first place that Tn(x) is a solution of Eq. (58) for -1^x^+1. But since y — Tn(x) is a polynomial, then also (1 — x2)y" — xy' + nhj is a polynomial whose equation to zero on the segment — 1 ^ x £ 1 entails its vanishing for all values of x.
4. OTHER PROPERTIES OF TCHEBYSHEFF POLYNOMIALS 45 whence it follows that a\ = 0 and (n — k + 2)(n — k + l) a* = k(2n-k) "*~2 ' Both these relations show that all coefficients with an odd subscript k are zero. If we replace k by 2fc we have __ (n — 2k + 2)(n — 2k + 1) a* ~ ik(n-k) *2*-2 ' But a0 = 2n_1, hence _ »(n-l) _n(»-l)(n-2)(n-3) a2--l.(w_l)2 '«« 2!(»-l)(»-2) and, generally, a2i_( 1} t!(n-l) •••(»-*) as we can easily verify by induction. If we note further that »(n —1) •••(«—2A + 1) A! (rc — 1) •• 7i (n- 7i — A; we finally obtain T„(x) = •(n — k) -k)(n — k—\) • k\ Tn 1 = yf_iii JL ••(n- rt » -2i + l) n ^* n — k On-2Jfc-l xn-2k H *~0 71 — A* 71 denoting the largest integer which does not exceed -. We can give, however, another explicit expression for the Tchebysheff polynomials. According to the Moivre formula we have whence follows. cos 710 + * sin 710 = (cos 0 + i sin 6)n cos n6 — i sin nO = (cos 6 — i sin0)n, cos nO = — [(cos 6 + i sin 6)n + (cos 0 — i sin 0)n]
46 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION If we substitute 6 = arc cos x (for |g| ^ 1) into the above, we get 1 Tn{x) = -[(* + »Vl - x*)n + (a?- iYl=!?)*]. This equation was first set up only for \x\ ^ 1; however, we can easily recognize its validity for all real and complex values of x since its right-hand member is a polynomial. We satisfy ourselves about this by writing it in the form I z c* *»-*;* o/r^)*(i + (-1)*) noting that only the summands with an even k do not vanish. For the real values of |z| > 1 it is expedient to somewhat transform the expression last obtained for Tn(x). In fact, if k is even, then t*(Vl — x*)* = (Yx2— 1)*. Thus Tn(x) =i 21 Cl*-*(Y#=l)*(l + (- 1)*), whence it follows that 1 Tn(x) =J[(X + )/x2-l)» + (X - J/*2 - 1)»] , an expression no longer containing any imaginary values for |x| > 1. It gives an important estimate for Tn(x): Theorem 2. // x is real and \x\ > 1, then \Tn(x)\^(\x\+yx*-i)». Indeed, neither the absolute value of x + y/x2 — 1 nor that of x — y/x2 — 1 is actually greater than \x\ + y/x2 — 1. Now we investigate the roots of the polynomials Tn(x). By reason of the equation Tn(x) = cos nd, wherein 6 = arc cos x, the roots of Tn(x) are obviously the values of x which correspond to the values of 6 satisfying the equation cos nd = 0 .
4. OTHER PROPERTIES OF TCHEBYSHEFF POLYNOMIALS 47 These values of 0 are obviously (be it recalled that 0 ^ 0 ^ t) a _ ^ a 3™ a (21c —l)n a (2n—l)n °i — 7TZ > 02 = 9— » • • • 3 Vk = 5- , • • •, 0„ = - . aW Z 7i Z 7i 2 71 The numbers J*> _ ™o *T J*) 3jt (W) (2n—l)rc ,_,..., a.<) = cos____ are therefore roots of the polynomial Tn(x),- and since their amount is equal to the degree of the polynomial, they represent all the roots of Tn(x); moreover, each of them is a simple root. Thus we have Theorem 3. The roots of the polynomial Tn(x) are represented by the formulas (n) {2k—I) n Xh -cos ^T~ (* = l,2,...,n) All of them are real and simple; all of them lie in the interval ( —1, +1). From these formulas there follows the "interspersion" of the roots of two neighboring Tchebysheff polynomials: Theorem 4. Between two neighboring roots x^ and x^+i of the polynomial Tn(x) there is always one and only one root of the preceding polynomial Tn-i{x). This is true because the fc-th root of the polynomial Tn-i(x) is «n-l) (2k— 1).T xk = cos - We prove' 2n %k > %k > ^k-rl (60) or 2k— 1 2k— 1 2k + l 2n~ < 2n—2 < 2n ' which amounts to the same. The former inequality is trivial, the latter equivalent to the inequality 2k <2n— 1, 9 Since cos 0 for 0 ^ 0 f£ t is a decreasing function, the roots xy, x 2 , • • •, * n are arranged in a decreasing order.
48 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION which is correct since k < n. Thus we have proven the inequalities (60). But since between the n roots x^f there are n — 1 intervals each containing a root of the polynomial Tn-\(x)y then each of these intervals contains precisely one root of Tn-i(x) since the number of these roots is also n — 1. Corollary. Two neighboring polynomials Tn(x) and Tn-i(x) have no root in common. Let us note that this corollary may also follow quite easily from the recurrence formula (35). In fact, had Tn(x) and Tn-\(x) a root x0 in common, then by (55) also , Tn_2(x0)=0i i.e., xo would also be a root of Tn_2(x). By repeating this inference, x0 would also be a root of Tn-a(x), Tn-4(x)} etc., until finally, it would be a root of the polynomial To(x) = 1 which is clearly impossible. We now investigate the distribution of the roots of Tn(x) for very great values of n. For greater clearness we give this problem a mechanical interpretation by imagining a mass of material M distributed in equal parts onto the roots x{7l\ a^, . . . , z(n} of the polynomial Tn(x) so that at each root M ... b there is the mass — . Then we determine which portion of the mass Mn n a is on a given segment [a, b] C [—1, +1]. Obviously, ^r M a n rn being the number of roots lying on [a, b]. In other words, we must determine the number of those numbers k from the series 1, 2, . . ., n which satisfy the inequality . (2*—1)* a < cos — < b . ~" 2n ~ This inequality is equivalent to where a = arc cos a, /? = arc cos b finally, the latter inequality can also be written in the form i(1+a!)s»s"(I+!£).
4. OTHER PROPERTIES OF TCHEBYSHEFF POLYNOMIALS 49 Our task is now to determine the number of all those integers k which satisfy the inequalities P ^ * ^ Q , when P < Q are two preassigned numbers. If i<P^>i+l<i + 2<-'-<i + m<:Q<i + m+li then this number is m. From the simple relation m— 1 ^Q — P<m+1 it follows that Q—P—l<m^Q-P+ 1 and, thence, We are interested in the case where so that It follows that or T„ =71 - + //„ (—1 </*n^l), ir.=^(a-/S)+^5 (-K^l) a 7T 71 J/n = — (arc cos a — arc cos o) H . a n n By this equation, with n increasing indefinitely, our mass distribution converges to a limit where the mass b M M = — (arc cos a — arc cos b) lies on the segment [a, 6],
50 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION As is customary in mechanics, we shall define here this limit distribution of the mass by its density The mass portion on the segment [x, x + Ax] is x+Ax jy[ M = — [arc cos x — arc cos (x + A x)], x ft and, hence, mean density on this segment I x+Ax J\f arc cos x — arc cos (X +Ax) Ax ^x°° ~~ n Ax Therefore, density at the point x M _. arc cos a: — arc cos (x + A x) p(x) = — hm i . n ax-+q A x In this equation the limit on the right differs from the derivative of the function arc cos x only by the sign; hence M 1 p(x) = , n }/l — x2 which is also the density of the limit distribution for the roots of the Tcheby- sheff polynomials over the segment [-1, +1]. Of course, the only factor of any significance in this expression is — , ; if we choose a unit of VI — x2 mass such that the mass becomes equal to tt, then we find for p(x) an even simpler expression p{z) = . . (61) Formula (61) enables us to find an approximate expression for the number of roots of the polynomial Tn(x) over a small segment [x, x + Ax]. For, if Ax is small, then yi — x2 x-f-Ax With very large n we can write Mn on the left-hand side in place of x+bx b M^ ; because of Mn = - rn we thus find the expression n nA x Tt y i for the number of roots sought.
4. OTHER PROPERTIES OF TCHEBYSHEFF POLYNOMIALS 51 Hence it follows that with large n the density of the roots of the polynomials Tn(x) gradually increases toward the end points of the segment i-i.+i]. But function (61) is connected with the Tchebysheff polynomials also in a completely different fashion. For a more convenient formulation we take the following Definition. In a segment [a, b] two functions f(x) and g(x) are said to be mutually orthogonal with a density p(x) when b fp(x)f(x)g(x)dx = 0. a Theorem 5. Every two Tchebysheff polynomials Tn(x) and Tm(x) are mutually orthogonal on the segment [ — 1, +1] with density p(x) = —=. To prove this we only need calculating the integral 7---y -yr^ dz- -1 To this end we substitute x = cos 0. Because of Tn(cos 0) = cos n (we choose 0 ^ 0 g w) n Inm = / cos n0 COS mOdO. 6 It follows that 1 n Inm = _ J [cos(n — m)6 + cos (n + m)0]dO o _ 1 ["sin (n — m) 0 sin (n + m) dl" 2 i n — m n + m J0 and, hence, /»» = o. In the second part of this book we shall come back on the Tchebysheff polynomials in connection with the general theory of orthogonal polynomials. Here we cite only two extremal properties in order to complete our considerations. Theorem 6 (P. L. Tchebysheff |3|). Let P(x) be a polynomial of a degree not exceeding n, and M be its maximal absolute value on the segment [ — 1, +1].
52 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION If xo is a real number and |ar0| > 1, then \P(x0)\^M\Tn(x0)\. Proof. Assuming that the theorem is incorrect, we have 10 We also introduce the polynomial S{x)=P^rTn(x)-P(x). iir At the points yi = cos — (i = 0, 1, 2, . . . , n) n Tn(y{)= cos i 7t = (-l)\ From this equation and from (62) it would follow that in the difference the minuend is absolutely greater than the subtrahend (since by assumption the latter is absolutely smaller than M because of —1 ^ yx ^ 1). Hence the difference changes its sign each time it passes from y{ to y*+i, whence it follows that there exists a root of R(x) in each of the intervals (2/0*2/1), (yi»y2)»---» (2/«-i>2/«). Moreover, we also have R(x0) = 0. Thus a polynomial R(x) of n-th degree has n + 1 roots. But this is possible only if R{x) is identical with zero, i.e., Substituting x = 1 into the above and bearing in mind that TJl) = 1 (because of Tn(l) = cos (n arc cos 1) = cos 0), we have P(l) P(*o) Tn(x0) which contradicts assumption (62). Together with Theorem 2, Theorem 6 has an important Corollary. With the designations in Theorem 6 |P(x0)| ^^(Ixol+^g-ir (63) is an estimate which will be used by us in Chapter IX. 10 We note that Tn(x0) 4= 0 since all the roots of Tn(x) lie on the interval (-1, +1).
4. OTHER PROPERTIES OF TCHEBYSHEFF POLYNOMIALS 53 Of all the polynomials with identical principal coefficients the Tchebysheff polynomial deviates the least from zero. This property, however, is shared by the principal coefficient with all the remaining coefficients, as pointed out by V. A. Markoff in a theorem. To prove the latter we need the following Lemma. A function of the form f(x) = Axxxi + A2xh H h im+ia^+», where Xi < X2 < • • • Xm+i are arbitrary real numbersf has at most m positive roots.11 For m = 1 this statement is correct. Let it be correct for a value m, but no longer so for m + 1. Then there exists a function f(x) = -4,a* + A2x** H h Am+1x^^ + Am+2^m^, which has more than m + 1 positive roots, all of them being simultaneously roots of the function U*L = A+ A ^-x, + ... + Jm+ia*»+i-'i + Am+2x*™-**. x*i By Rolle's theorem the derivative of the latter function has more than m positive roots; but this contradicts our assumption since the derivative has the form B1x»* + B2x** H h Bm+1xf^ . This proves the lemma. Theorem 7 (V. A. Markoff). If n and p ^ n are simultaneously even or oddy then of all polynomials of a degree not exceeding n in which the coefficients xp is equal to unity the polynomial "-(To" (64> sip has the smallest deviation from zero on the segment [ —1, +1], Therein Tn(x) is the Tchebysheff polynomial and A^ is the coefficient of xp in that polynomial. However, if p is odd with an even n (or vice versa), then the polynomial Jip has the property cited. 11 Assuming, of course, that none of the coefficients A> vanishes.
54 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION The following four cases may occur 12: (1) p even, n even; (2) p odd, n odd; (3) p even; n odd; (4) p odd, n even. Suppose that in case (1) there exists a polynomial P(x) e Hv in which the coefficient of xp is unity and which deviates from zero less than polynomial (64) so that on the entire segment [ — 1, +1] we have \P(*)\ <—TV' ' 1 V ' ' | A <n) i The polynomial P('-x) has then the same properties since p is even; hence also \P(x)-rP(-x) 1 < An), This having been verified, we investigate the difference Tn(x) P(x) + P{~x) R{x) A(n) which is a polynomial of a degree not higher than n, containing no term with xp and, for the rest, only terms with even powers of x. Thus B(x)=Zckx™ (cr°) Now we set Q(y)=Ickyk. Since Q(y) consists of - summands at most (for, besides cp other coefficients 2 2 Ck may be equal to zero), the number of the positive roots of Q(y) is, at most, equal to 1. 2 If, however, we set cos* — n 12 The proof is carried out according to S. N. Bernstein.
4. OTHER PROPERTIES OF TCHEBYSHEFF POLYNOMIALS 55 then we get Considering the equation Tn|cos —J = cosijr = (—1)* it follows that Q(y<) coincides with ( —1)» in its sign. Hence, in each of the - intervals 2 (yn_, y±X ly±_x> y^_\ •••> (v*> &)> (&> yo) there lies a root of Q(y). There are therefore at least - positive roots of Q(y), which contradicts the statement above. Thus we have proved the theorem for the case (1). The investigation of case (3) proceeds in a similar fashion, we must merely bear in mind that —— e Hn-i> so that rc-1 „, Tn_x(x) P{x) + P(-x) * „ Ap - t-o fr*) whereupon the proof proceeds in the same way as above. In the cases (2) and (4) we must operate with the difference P(x) — P( — x) rather than with P(x) + P(x). Thus we have, e.g., in case (4) R(x) _r„-i(*) P(x)-P(-x) _CC **+1 whereupon we introduce the polynomial hi Q(y)=Ickyk Jb=0 All further details are left to the reader. We could even show that polynomial (64) or (65) is the only one belonging to Hn, having the coefficient 1 for xp, and the least deviation from zero on
56 II. ALGEBRAIC POLYNOMIAL OF BEST APPROXIMATION the segment [ —1, +1]. But we omit this proof and draw instead some conclusions from Markoff's theorem. Corollary 1. If P(x) is a polynomial of degree n in which the coefficient of xp(p ^ n) is unity, then 1 p\ \ 2 J' max |P(x)|> , or (*£->-l). /n — p—l\ 1 pi \ 2 ) max |P(x)|^„ . --, {^i-^y depending on whether or not the two numbers n and p are simultaneously even (or odd). This corollary follows directly from the explicit expression for the coefficients A(? and An~\ If in the polynomial P(x) the coefficient of xp is not 1 but cPf then the right-hand members of the last two inequalities must be multiplied by \cp\. Thus we come to Corollary 2. If P(x) is a polynomial of n-th degree, then its coefficients cp satisfy the estimate ...(^-) |cp|^2P-1-^- —L. max \P(x)\ p pi In — p\ , ^sz! or In + p - 3\ , n-\\_J_y \cr\^V-i——i- J- max \P(z)\, p! In — p — 1\ , -ig^i depending on whether the difference n — p is even or odd.
CHAPTER III TRIGONOMETRIC POLYNOMIALS OF THE BEST APPROXIMATION § 1. The Roots of a Trigonometric Polynomial In this chapter we shall deal with the approximation of a continuous function with period 2t by trigonometric polynomials of n-th order n T(x) = A + £ (akcoskx + bksinkx) with real coefficients A, ak, bk. To do this we need an estimate of the number of real roots of such a polynomial. We note right away that in the general case such roots need not exist at all,1 as this is shown by the example cos x — 5. If, however, the polynomial T(x) has a root x0, then also each number x0 + 2mn (m = ±1,, ±2, ...) is a root of the same order as x0, (the reader be reminded that x0 is a root of r-th order of the polynomial T(x) when T(x0) equals T'(x0) = . . . = T^Hxo) = 0 and Tir)(x0) =# 0). With this in mind we will in the following regard such points x and y for which x — y = 2m7t (m = ±1, ±2, ...) holds, as not being of different order and call them equivalent, x ~ y. Theorem. A trigonometric polynomial T(x) of n-th order not identical with zero has no more than 2n pairwise nonequivalent roots even when each multiple root is counted several times in accordance with its ordinal number. Proof. We use the well-known Euler formulas eai _|_ e-ai eai _ e~ai cos a = , sin a = — ? to write the polynomial n T(x) = A -f 2J {akcoskx + bksinkx) 1 An algebraic polynomial may not have real roots, but it is bound to have complex roots. Conversely, a trigonometric polynomial may have neither real nor complex roots as, e.g., the polynomial cos x + i sin x. 57
58 III. TRIGONOMETRIC POLYNOMIALS OF BEST APPROXIMATION in the form n / ckxi i p—kxi pkx% fi—kxi\ whence it follows that and, finally, T(x) = £ ckekxi =ernzi £ cke^+n^xi k= —n k= —n 2n T{x) = e~nxi £ dkeM (66) *=0 Now we determine the algebraic polynomial P(z) =d0 + dxz + d2z* + ... + d2nz2n. By assigning to each real x the complex number z = exi we can then write (66) in the form P(z) =enxi T(z). (67) It is important to note that to two nonequivalent values x' and x" there correspond also two different values z' = ex'1 and z" = e*"*, hence z' =f= z". Differentiation of (67) with respect to x yields P'{z)z'x = enxi [niT(x) + T'{x)], and since zx = iexi it follows that P'(z) = t^-^xi[nT{x) — iT'(x)]. Repeated differentiation gives P"(z) =eln~V*i[aT(x) +BT'{x) + yT"(z)], where a, /3„ y are coefficients the exact representation of which may be omitted here. In a similar fashion (this is easily verified by induction) we obtain p(*)(z) = efr-)*[]^T(x) + XiT'iz) + ••• + ImTM(z)]. Now let xo be an ra-th root of T(x). Then T(x0) = T'(x0) = ... - f-,,W=0,
2. THE IMAGE POINT METHOD 59 hence for z<> = e** P(zo) = P'(*o) = • ■ • = ^>(m"1) («o) = 0. Accordingly, 20 is at least an ra-th root of P(z)> that is, of an order I ^ m. Thence it follows that if there exist p pairwise nonequivalent real roots of T(x): Xi of multiplicity mXi x2 of multiplicity ra2, xp of multiplicity rap, where m1 + m2+ h mn = N , then P(s) has also p different roots Z\ = eXl% of multiplicity k ^ mi, z2 = eXtt of multiplicity Z2 ^ ra2; zp = 6Zpl of multiplicity lp ^ rap. But since P(z) is of 2n degree,2 then whence a fortiori it follows that N ^2n . Corollary. 7/ /wo trigonometric polynomials of n-th order coincide at 2n + 1 pairwise nonequivalent points, then they are identical. § 2. The Image Point Method The method of assigning to each algebraic polynomial from Hn an image point in an (n + l)-dimensional space proved to be quite useful in the preceding chapter. Now we pursue the same thought also with regard to trigonometric polynomials. This may be done considering the fact that the determinant I 1 cos x sin x cos 2 x • • • sin n x I 1 cos y sin y cos 2 y • • • sin n y \ I lcos/ sin/ cos 2/ •••sin 71/ 1 2 It should be borne in mind that P(z) is not identical with zero because in this case T(x) = e_n**P(2) would also be identical with zero.
60 III. TRIGONOMETRIC POLYNOMIALS OF BEST APPROXIMATION for mutually nonequivalent values x, y, . . . , t is different from zero (as can be easily gathered from the theorems in the preceding sections). The corresponding considerations would therefore differ in nothing from the algebraic ones. We deem it more instructive, however, to apply the image point method in another fashion here. Lemma 1. Any two functions of the sequence 1, cosx, sinx, cos2x, sin 2 a;, cos 3x, ... (68) are mutually orthogonal on the segment [—7r, 7r], i.e., each integral over one of their pa/ired products re 7t J cosnxdx, I sinnxdx, —n —7i 71 71 J cos n x cos m x d x, / sin n x sin m x d x (n^m), — 71 —71 71 I cosnxsinmxdx (n^m) — 71 is equal to zero. The proof consists simply in calculating these integrals, which we leave to the reader. Let us also note that since all the functions (68) have 27r-periodic- ity, we may take in place of segment [-tt, t] any other segment of the same length, e.g., [0, 2tt]. Lemma 2. For n = 1, 2, 3, . . . 71 71 I cos2 nxdx = I sin2 n x d x = n. — 71 —7t The simple verification of these equations is also omitted here. Lemma 3. If T(x) = A + £ (ak coskx + &* sin&x), *=i then f T*(x) dx =7t\2A* + 2 («** + lf)| (69) Equation (69) is known as the Parseval formula. To prove it we bear in mind that T2(x) =A2 + £ (al cos* kx + blsin2kx) + 2X (70) *=i
2. THE IMAGE POINT METHOD 61 where £' is a sum of products whose factors are always two pairwise unlike terms of T(x). By integrating (70) the integral of £', by reason of the first lemma, is zero (because of the orthogonality of functions (68)). Thus (69) follows from lemma 2. Lemma 4. // a1? a2, ..... am; blf b2i ..., bm are two real number sequences, then the inequality (j? «a)2 ^ (1 «l) (i? &*) • (7i) This is said to be "Cauchy's inequality." Proof. We set a =2*1 B = 2a*h, c = 2*l- *=1 Jfc=l *=»1 If A = 0, then also ax = a2= ... = am = 0, and inequality (71) is trivial. Suppose A > 0. Then we set and B * = -j (72) r = 2' (a*x + 6jk)2. Obviously r ^ 0. On the other hand W 2 ° r = 2 (ak & + 2akbkx + bi), k = l whence r = Ax2 + 2Bx + C and, with the aid of (72) it follows that r-AB2 2BB , C_AC-B* Thus AC — B1 ^ 0; but this is the inequality (71) required.
62 III. TRIGONOMETRIC POLYNOMIALS OF BEST APPROXIMATION If we apply Cauchy's inequality to the number Kl, l«2!> ..-> \am\; 1, 1, ..., 1 , we obtain the useful inequality m ,— -i / m „ 2\«k\^Y^\ 24- (73) *=i r ;t=i Thus we have taken all the preliminary steps for a theorem which serves as the basis for the image point method. By the norm or quasinorm of a trigonometric polynomial n T(x) = A + JT (akcoskx + bksinkx) we understand the two numbers M(T)=m&x\T(x)\, L(T) = \A\+Z(\ak\ + \h\). The set of all trigonometric polynomials not exceeding n-th order we denote by Hi. Theorem 1. For T(x) e HTn the following inequalities hold: M{T)^L{T), (74) L(T) ^Y±n + 2M(T). (75) Only (75) has to be proved since (74) is obviously correct. By reason of (73) we have L(T) ^ y*2n + 1 ]M2 + 2 (a\ + b\). On the other hand, Parseval's theorem (69) yields i n i n A*+j;(al + bl)<- fTi(z)dx^~ fMt(T)dt = 2M*(T). * — \ 71 J 71 J Thus L(T) ^]/2n + 1}/2M*{T) whence (75) follows. As in the analogous Theorem 1 of the preceding chapter, we can derive from this theorem two corollaries.
2. THE IMAGE POINT METHOD 63 Corollary 1. A system S = {T(x)} of polynomials from Hi is uniformly bounded if and only if the set of quasinorms of these polynomials is bounded. Corollary 2. A sequence {Tm(x)} of polynomials from Hi converges to a polynomial T(x) eHlif and only if lim L{Tm— T) = 0. m-+oo Now we make the convention to denote the point N(A,al9b,...,an,bn) of a (2n + l)-dimensional space as the image point of the polynomial n T(x) = A + £ (a*cos&£ + bksinkx). In maintaining the geometric formulation introduced on p. 24 we can restate the two corollaries in the following fashion: Corollary 1'. A subset S of Hi is uniformly bounded if and only if the set of its image points is bounded. Corollary 2'. A sequence {Tm(x)} of polynomials from Hi converges to a polynomial T(x) e Hi if and only if the sequence of its image points converges to the image point of T(x). Together with the multidimensional Bolzano-Weierstrass axiom of choice (p. 24) these theorems yield Theorem 2. From each uniformly bounded sequence {Tm(x)\ of polynomials from H7n we can single out a subsequence which uniformly converges to a polynomial also belonging to Hi. By virtue of inequality (75) we can make another important statement: if a trigonometric polynomial vanishes identically, then according to (75) all its coefficients are zero. This results in Theorem 3. 2/ two polynomials 3 n T(x) = A + £ (akcoskx + bksinkx), *=i n U(x) = O + ]£ (ckcoskx + dksinkx) mutually coincide for all real values x, then they have the same coefficients: A =C, ak = cki bk = dk. 3 The fact that we write T(x) and U(x) as polynomials of equal order is no loss of generality since we can add to each polynomial higher-order terms with the coefficients 0.
64 III. TRIGONOMETRIC POLYNOMIALS OF BEST APPROXIMATION Be it also mentioned that already the coincidence of T(x) and U(x) at 2n + 1 pairwise nonequivalent points entails their coincidence at all points. Earlier we had brought evidence that an even polynomial T(x) can be represented in the form A + ^a* cos kx. We see now that any other representation is impossible. § 3. The Trigonometric Polynomial of the Best Approximation Let F(x) e C2t and T(x) e Hi. Then we set A(T) = max\T(x) — f(x)\ and say that A(T) is the deviation of the polynomial T(x) from the function f(x). If we let the polynomial T(x) run through the entire set HI, then we obtain a whole set {A(T)} of non-negative deviations. Let the exact lower bound of this set be En = En(f)=ml{/i(T)}> we call it the least deviation from or the best approximation to }{x) by polynomials belonging to Hi. To underscore the fact that we are dealing here with trigonometric and not with algebraic approximations we shall occasionally write E Tn rather than En. To justify the terminology just introduced we must, as in the algebraic case, show that the limit En is actually attained by a polynomial from Hn: Theorem (E. Borel). For each value n there exists in Hi a polynomial T(x) for which A(T)=En. (76) The proof is the same as in the algebraic case, since for each natural number m there can be found a Tm(x) in Hi for which En^A(Tm)<En + — . m All polynomials Tm(x) are bounded by one and the same number, since | Tm(x) | fS \f(x) | + I Tm(x) - f{x) \<M + En + 1, where M = max \f(x)\. From the sequence \Tm(x)\ we can therefore single out a subsequence \Tmi(x)} which converges uniformly toward a polynomial T(x) e Hi. Now, if in the inequality \Tm(z)-f(x)\ <^n+^-
4. P. L. TCHEBYSHEFF's THEOREMS 65 we pass to the limit i —> oo f then we obtain for all real values of x \T(z)-f{z)\ ^En, i.e., A(T) ^ En. But since the inequality A(T) ^ En contradicts the definition of En, we have thus proved (76). A polynomial T(x) for which (76) holds true is said to be a polynomial of the least deviation or one of the best approximation. In full analogy with the algebraic case we can then set up the following relations „ ^ E0 ^ Ex ^ E2 ^ ..., lim.#n = 0, the last of which is tantamount to WEIERSTRASS, second theorem. § 4. P. L. Tchebysheff's Theorems The trigonometric polynomials of the best approximation, like the algebraic ones, are characterized by the presence of a certain number of points at which they reach the value of their deviation, specifically with alternating signs. Let/Or) e C2jr and T(x) be a polynomial of the least deviation in ff'J. The points at which \T(x)-f(x)\ = En shall be called (e)-points; an (6)-point is said to be a (+)-point if T(x)-f(x)=En, and a (—)-point if T(x)-f(x) = -Fn. (We assume En > 0 since otherwise fix) would be a polynomial belonging to #J, hence everything would be trivial.) Theorem 1. There exist (+)-points as well as (-)-points. If, e.g., there existed no ( —)-points, then min{T(x) - f(x)} = -En + 2h (A > 0), therefore for all values of x -En + 2h^T(x)-f(x)^En, —(En — h)^ T(x) —h — f(x) <En — h
66 III. TRIGONOMETRIC POLYNOMIALS OF BEST APPROXIMATION and thereby A(T — h) ^En — h <En contradicting the definition of En. Theorem 2 (P. L. Tchebysheff). There exist 2n + 2(e)-points xx < x2 < • • • < x2n+2 (0 ^ xk < 2n), which are alternatively (+)- and (-)-points. We call this system again a Tchebysheff alternant.4 The proof of this theorem is fundamentally the same as in the algebraic case although somewhat more difficult in its technical details. We decompose, as before, the segment [0, 2w] by the points uo = 0 < ux < u2 < • • • <us = 2n in such small segments [uk, uk+i] that in each such segment the variation of the difference T(x) — f(x) becomes smaller than \En. Segments [uk, uk+i] containing at least one (6)-point are called (e)-segments. Obviously in an (e)-segment the difference T(x) — f(x) is absolutely not smaller than \Eni hence there it neither becomes zero nor does it change its sign. Thereupon we can divide the set of all (e)-segments into two classes: (+)-segments, in which the difference T(x) — f(x) is positive, and ( —)- segments, in which it is negative. Then we number all (6)-segments in their order on the segment [0, 2w] from left to right: dl9 d2i d3, ..., av and assume that di = [a, p] is a (+)-segment. Now we also add the segment d»+i = [a + 2n, p + 2n] formed by displacing the segment d\ by 2w to the right. This new segment is no longer contained in segment [0, 2w], but due to the periodicity of the difference T(x) — f(x)y dN+i is also a (+)-segment. The series of (e)-segments thus expanded ^u ^2> ^3' •••» ^ni d#+i 4 In the algebraic case the alternant consists of (n -f 2) and in the trigonometric case of (2ai + 2) points. This is due to the fact that the number of points of the alternant always exceeds that of the coefficients by unity.
4. P. L. TCHEBYSHEFF'S THEOREMS 67 is divided into groups according to the system d\y d2, . . ., dkl — (+)-Segments, djbH-i, dj^+2, . . . , dkl — (—)-Segments, d*m_i+i, dfcm.i-f2, . . . , dkm - (+)-Segments. The last, m-th group, does consist of (+)-segments since it contains dkm = dj\r+i. And now we prove that m^ 2ti + 3. (77) Were (77) not fulfilled, then we would not only have m ^ 2n + 2 but even m ^ 2 n + 1; (78) since m is an odd number being the number of signs in a sequence +, —, +,•••, + which starts and ends with one and the same sign. Let us now assume that inequality (78) is correct. Since the boundary point on the right of the segment dk lies left of the left boundary point of segment dk+h there exists a point Z\ between both segments. This we write symbolically d^ < zi < dkl+1. In a similar fashion we find the points 22, zh . . . , 2m_i for which dt% < z2 < d>k2+i, rfjrm-i < zm-i < dkm_1+i holds true, in which case obviously £<Zl> «m-1 <a + 271. (79) Now we set . . . X — ZX . X — Z2 . X — Zm_Y q{x) = sin—2~~sin—g-...sin Here the number of factors is m — 1, i.e., it is an even number not exceeding 2n. By coupling them pairwise and considering the identity . x — a. x — b 1 [ b — a sin —-— sin 1 [ b — a I a + b\ — cos — cos x — 2 L 2 V 2 / 2 2 2 p(x) reveals itself as a trigonometric polynomial5 not exceeding n-th order; x — a 5 Be it noted that each individual factor sin is no trigonometric polynomial since such a polynomial is defined as being the sum of sine and cosine functions of integral multiples of x.
68 III. TRIGNOMETRIC POLYNOMIALS OF BEST APPROXIMATION hence q(x) € Hi. The points zh z2, . . . , 2m_i are roots of p(x); by (79) they lie on segment [0, a + 2tt], and all the more so on segment [a, 0 + 2tt] which contains all the segments dk. Now we show that p{x) has no other roots than the values of z{ on the segment [a, $ + 2w]. In fact, the factor sin * has as its roots only the terms of the arithmetic progression ...,zi—47zizi—27iizi,zi + 2rc, z< + 4,t, ... . (80) By reason of (79) z{— 2n <a, ft + 27i < z{ + 2n, so that of all points (80) only point zt falls on the segment [a, fi + 2w]. Thus p(x) has no zero in any of the segments dk. We notice further that the factor . x — z{ sin—-— changes its sign from — to + when x, increasing monotonically, passes through the values zi} x = 2» being the only spot where this factor changes its sign on segment [a, $ + 2t\. It follows from the above that the sign of the polynomial p(x) coincides on all segments dk with the sign of the difference T(x) — f(x). On the segments of the first group di, d2, . . . , dkl where, as we have seen, the difference T(x) — f(x) is positive, all the factors of the polynomial p(x) are negative; but since their number is even, the polynomial itself is therefore positive. On the segments of the second group dk]+h dkl+2, . . . , dkt the first factor sin x — z' . . • is positive, the remaining ones are negative, etc. Now we consider any one segment of the initial decomposition [ui} ui+i] which is no (e)-segment. For it max \T(x) — f(x)\ <En. Hence, if we form the expression max \T(z)—f(z)\ xz[u{,u{+1]
4. P. L. TCHEBYSHEFF's THEOREMS 69 over all the segments [w», ui+i] which are no (e)-segments, then even the largest value contained therein is smaller than En. If we denote the exact maximum value by E*, then E* <En. Finally, suppose that X is a small positive number such that6 X<En-E\ l<jEn. (81) If we set U(z) = T(x) — Aq(x), then we are able to show for the polynomial U(z) that, in spite of its belonging to H I, the evaluation A(U)<EU (82) holds true thus contradicting the definition of En. This proves (77). If x e [iti, Ui+i], [ui> ui+i] being no (e)-segment, we obtain \U(z)-f(z)\<\T(z)-f(z)\+X\Q(z)\ ^E* + Mq(*)\ ^E* + X<En (where we have used \p(x)\ ^ 1). If, however, x lies on an (e)-segment dk, then the numbers T(x) — f(x) and \p(x) have the same sign. Moreover \T(z)-f(z)\^jEu>X\g(z)\. Hence, in this case I U(z) - f(z) | = \{T(z) - /(*)} - Iq(z) | = | T(z) - f(z) - X\q{x) | ^ En - X\q{z) I , and since on the segments dk the polynomial p{x) does not vanish, we obtain \U(z)-f(x)\ <En. The inequality \U(x)-f(z)\ <En would therefore hold for all values of x, so that we would come to the impossible evaluation (82). Thus (77) has been proved. 6 Here too, as above (see note on p. 30) the second inequality (81) follows from the first.
70 III. TRIGONOMETRIC POLYNOMIALS OF BEST APPROXIMATION At this point we can complete the proof of the theorem without difficulty. For each segment dkl, dk2) . . . , dktn+t we choose an (e)-point xi, x2) . .. , x2n+2. These points are alternatingly (+)- and ( —)-points; now there remains to prove that the last point x2n+2 lies left of 2w. This can be done quite simply. Since 2n + 2 < m, the segment dk2n+2 lies left of dkm = dN+i. dk2n+2 is therefore contained in [0, 2w] and, hence, x2n+2 ^ 2w. Were x2n+2 = 2w, then the point 2w (and with it the point 0) would be ( —)-points. But then the point 0 should appear as an (e)-point on the (e)-segment d\ farthest to the left, and would be therefore a (+)-point. Since this is impossible we obtain *2n+2 < 2jZ, which fully proves the theorem. As in the preceding chapter, the uniqueness of the polynomial of the best approximation follows from this theorem. Theorem 3. In H% there exists only one polynomial of the best approximation to a given function f(x) e C2ir. Proof. Suppose there exist two such polynomials T(x) and U(x). From the inequalities - En ^ T(x) - f{x) ^ Eni -En^ U(x) - /(*) ^ En there follows so that also the half sum of both polynomials aw = T{x) + U{x) is a polynomial of the best approximation to f(x). Thus, for R{x) there exists one Tchebysheff alternant x, < x2 < ••• < x2n+2 (0^xk<27i). (83) Literally the same considerations made by us in detail in the algebraic case yield also here the coincidence of the polynomials T(x) and U(x) at the points (83). Since the order of both polynomials does not exceed n while the number of points (83) is 2n + 2, both polynomials are identical.
4. P. L. TCHEBYSHEFF's THEOREMS 71 Theorem 4 (P. L. Tchebysheff). Let f(x) e C2ir and U(x) e ETn. If there exist 2n + 2 points Xi < X2 < • • • < X2/i+2 (0 ^ Xk < 2tz) , at which the difference U(z)-t(x) assumes its absolute maximum value A(f/), the sign of this difference changing with each passage from Xi to xi+i, then U(x) is the polynomial of the least deviation from f(x). Proof. Suppose U(x) is not the polynomial of the least deviation, i.e., A(U)>En. Then we denote the polynomial of the least deviation by T{x). Since U(xt) - T(Xi) = {U{Xi) - f(Xi)) - {T(x{) - f(Xi)} and | T(Xi) - /(*,) | ^ En < A (U) = | Ufa) - fto) | the difference U(x%) — T(xt) coincides with the difference U(Xi) — f{x%) in its sign, and therefore also changes it with each passage from x{ to xi+i. It follows that each interval (xh x2), (x2, x3), . . . , (z2n-i, x2n+2) contains a root of the difference U(x) — T(x). This difference has therefore 2n + 1 non- equivalent roots but, on the other hand, it belongs to Hn; thus it is identical with zero. Because of A(U)>A(T)=En this is impossible. This contradiction proves the theorem. Following a like procedure we obtain Theorem 5. Let f(x) e C27r and the polynomial U(x) e Hi have the following property: there exist 2n + 2 points Xi <X2 <•- < X2n+2 (0 ^ Xk < 2jl) such that the difference U{xt) - f(x%) changes its sign with each passage from Xitoxi+i. Then En^mm\U(xt)-f(xt)\.
72 III. TRIGONOMETRIC POLYNOMIALS OF BEST APPROXIMATION Let us point out that (due to the periodicity of the difference T(x) — f(x)) we may, in the preceding theorems, choose any half-segment [a, a + 2w) in place of the half-segment [0, 2w). Finally, for further purposes, we also prove Theorem 6. // f(x) e C2* is an even function, then its polynomial of the least deviation is also even, i.e., it takes the form n T(x) = A + ]? ak cos kx. In fact, by definition the polynomial of the least deviation satisfies all of the real values of x in the inequality \f(x)-T{x)\^En. If we replace here x by — x and consider f(-x) = f(z), we find \f(x)-T(-x)\£Eu, so that T{ — x) as well as T(x) are polynomials of the least deviation. From the uniqueness of this polynomial of least deviation there follows immediately T(-x) = T{x), which proves the theorem. § 5. Examples We investigate some simple examples in which the polynomial of the least deviation can be easily constructed. I. Let m > n. Then of all the polynomials from H^ the one to vanish identically deviates the least from the function f(x) = A cos rax + Bsinmx . In fact, the function f(x) can be represented 7 in the form f(x) = B cos m (x — x0). (84) 7 To this end we denote by R and co the polar coordinates of the point (A, B). Obviously A = R cos co, B — R sin co and f(x) = R cos (mx — co). This leads to (84) in which R = VA* -f B*.
5. EXAMPLES 73 At each point .n .2ti (2n + \)7t xo-, xo ~r — > x0 ~t —— , • • •, x0 ~\ m m m the function f(x) does assume its highest absolute value, changing its sign at the passage from one point to the other; these points form therefore an alternant for the "difference" 0 — f(x). Moreover, since all these points lie on the half-segment [x0, x0 + 2t), all the assumptions of theorem 4 in the preceding Section are fulfilled, the best approximation to f(x) being R = VA« + B\ This same consideration yields II. Among all the polynomials belonging to Hn the polynomial n+i f{x) = A + £ (ak coskx + bksinkx) *=i deviates the least from the function n T(x) — A + ]>} (ak coskx + bk sinkx), Jk=l and As the next example we investigate WEIERSTRAss, nondifferentiable function f(x) = 2qkcos(2m+ l)k x (0<q<\). (85) It appears 8 that its polynomial of the best approximation from H^ is the segment of the sequence T(x) =2qk cos (2m + Dkx, (86) *=i s being determined by the condition (2m + l)8 ^ n< (2m + 1)8+1. To prove this we examine the difference f(x) — T(x) = J* 0* cos (2m + l)kx. Jfc = 5+1 8 A discovery by S. N. Bernstein [2J.
74 III. TRIGONOMETRIC POLYNOMIALS OF BEST APPROXIMATION If we set then *« = (2m +V* <* = 0fl,...f2<2m + l)*H-l. /(*,) - T(x€) = (- 1)< J ** - (- 1)« f—. Taken absolutely, this is clearly the largest value of the difference f(x) — T(x); moreover, this difference changes its sign each time there is a passage from Xi to Xi+i. Finally, all the points Xi lie on [0, 2t) and their number is 2(2m + 1)*+1, i.e., it is not smaller than 2(n + 1). By theorem 4 of the preceding Section, T(x) is therefore the polynomial of the least deviation from f(x). In addition, we note that En=A(T)=l . l — q
CHAPTER IV INTERRELATION BETWEEN STRUCTURAL PROPERTIES OF FUNCTIONS AND THE DEGREE OF THEIR APPROXIMATION BY TRIGONOMETRIC POLYNOMIALS § 1. Formulation of Problem. Modulus of Continuity; A Lipschitz Condition Let fix) eC2rr and En be its best approximation by trigonometric polynomials belonging to Hi. Then, according to WEIERSTRAss, second theorem limEn = 0. n-Kx> It is to be expected that the "simpler" the function/(x) to be approximated, the "more exactly" can it be represented by trigonometric polynomials. In other words, for simpler functions En should converge to zero more rapidly than for functions of a more complex nature. We shall therefore deal in the present chapter with the following question: how does an "improvement" of the structural properties of the function to be approximated affect the acceleration of the decrease of En? The results given in the following are chiefly due to Jackson [1, 2]. An appropriate parameter for the structural properties of a function appears to be a value known as the "modulus of continuity" of this function. Definition. Given a function f(x) on the intercept <a, b> (this may mean the segment [a, b] and the interval (a, b) which, specifically, may also be (— oo, + 0°), as well as one of the half-segments, [a, b) or (a, b]). We now take any positive number 5, examine all number couples x, y on <a, b> which satisfy the condition and find that the absolute values of \f(x) — f(y)\ have an exact upper bound (which may also be infinite): o)(d)= sup {|/(*)-/(y)i}. I*-V!£<5 This value o>(5) is said to be the modulus of continuity of the function f(x). In brief, this modulus indicates by how much two function values may differ at most if their arguments differ by 8 at most. Let us write down some simple properties of the modulus of continuity 75
76 IV. FUNCTIONS AND APPROXIMATION BY TRIGONOMETRIC POLYNOMIALS I. The function o>(5) increases monotonically. For, if 82 > Si > 0, then the set of all the number couples (x> y) which satisfy the condition \x — y\ < <52 is more comprehending than the set of number couples for which \x — y\ < 5i. If we bear in mind that the upper bound of a set of numbers does in no case decrease when the set increases, then we obtain CO^J^CO^). II. For a function f(x) to be uniformly continuous on the intercept <a,b> it is necessary and sufficient that lim co(<5) =0. Proof. This relation is identical with the definition of uniform continuity. III. If n is a natural number, then co(nd) ^ noj(S). For, if I x — y I fg nS. then we decompose the segment [x, y] by the points z{ = x+ — (y— x) (*' = 0, 1, ..., n) n into n equal parts. Obviously f(y)-f(x)=nZ{f(zi+i)-t(*i)\. on the other hand |z*+i — Z{\ ^ 5, whence \f(zi+l)-f(zi)\^co(d) and \f(y)-f(x)\^nco(d). Thus III is proved. IV. For each positive value of X we have co{U) ^ (A+ l)co(6). Let n be the largest integer not exceeding X, i.e., n ^ X < n + 1. Then co{U) ^ co [{n + 1)<5] ^ (» + 1) co{6) ^ (A + 1) w(<5).
1. MODULUS OF CONTINUITY; A LIPSCHITZ CONDITION 77 Definition. A function/(x) defined on an intercept <a,b> and satisfying the condition \f{y)-f(x)\^M\y-z\* («>0), for all pairs of values x, y of this intercept is said to satisfy a Lipschitz condition with the exponent a and the coefficient M. This we write f(x) € LipMa. In cases where the coefficient M is negligible we use the simpler form f(x) € Lip a. Lip.v a is therefore the class of all those functions satisfying the Lipschitz condition with a given exponent a. and a given coefficient M, whereas Lip a. is the class of all those functions which satisfy the Lipschitz condition with a prescribed exponent a while the coefficient M may assume any value. Obviously, a function satisfying a Lipschitz condition is uniformly continuous. V. Iff(x) e Lip a and a > 1, then fix) is a constant, since for such a function f(y)-f(x) y — x M\y — rrl«-l If here we let i/ converge to x, we obtain /'(x) = 0, whence the constancy of f(x) follows. Henceforth we shall therefore always assume a ^ 1. VI. If the derivative f\x) with \f'(x)\ ^ M exists everywhere on the interval (a, b), then /(a)€LipM 1. This follows from the mean value theorem f(y) - /(*) = /' (*) (v - *) (* <z <»>• VII. // £/ie basic intercept <a,b> is finite, and if a < f3, then Lip a D Lip 0. In brief: Functions satisfying a Lipschitz condition with large exponents are "more benign'' than those satisfying that condition with smaller exponents. Specifically, functions of the class Lip 1 appear to be the "most benign" ones. To prove VII we investigate a function f{x) e LiPjB/9.
78 IV. FUNCTIONS AND APPROXIMATION BY TRIGONOMETRIC POLYNOMIALS Now let z, y be a pair of numbers for which \x — y\ < 1, then \f(y)-f{x)\£B\y-x\'£B\y-x\*. But if \x — y\ ^ 1, then \f(y) - f(x)\ ^ {\f(x)\ + |/(2/) |} |x - y|« ^ 2 JC |a? - y|«, where * K = sup {\f(x)\}; thus for all pairs of numbers x, y \f(x)-f(y)\£A\x-y\; if A is the larger of the two numbers B and 2K. A connection between the Lipschitz condition and the modulus of continuity is contained in the following theorem: VIII. The two statements f{x) e LipM a and co{d) ^ M6a are fully equivalent. If o)(8) ^ M8a, then for each pair of values x, y |/(2/) - f(x) \^a>(\y-x\)^M\y-x\° is true. Conversely, if f(x) e LipMa, then for \y — x\ ^ 8 \f(V) - f(x)\ ^M\y-x\«^ Mda is valid. Since this inequality holds for all the pairs of numbers for which \x — y\ S 8, it gives co{8) ^ M8a. (87) Let us note in addition that f(x) even then satisfies a Lipschitz condition of order a (although possibly with changed coefficients) when inequality (87) is fulfilled only for sufficiently small values of 8 and f(x) is bounded. In fact, if (87) holds for all values of 8 ^ 50, then we have for 8 > 80 where ft is the variation over the entire intercept <a, b>. Hence/(x) e Lip^ a, ( a with A = max < Af, — 1 Since the f(x) is uniformly continuous on a finite intercept it is also bounded thereon. In the case of an infinitely large intercept, property VII no longer holds true For example, the function f(x) = x belongs along the entire axis to Lip 1, but not to Lip J. Moreover, it follows from the text that for a < 0 each bounded function of Lip 0 also belongs to Lip a. ■
2. LEMMATA 79 In the following we shall need another class of functions which we denote by W. It consists of functions for which co(S)^Ad(l + |ln<5|) (88) holds, A being independent of 8. If (88) is fulfilled only for 8 ^ 50, and function/(x) is bounded, as this is always the case on finite intercepts, then for 8 > 8o co(d)^Q<¥-d(l + |ln<5|) holds, so that the function f(x) does nevertheless belong to class W. This class is so to speak an intermediate class between Lip 1 and the totality of all the classes Lip a with a < 1, since: IX. On a finite interval Lip a zd W zd Lip 1 (0 < a < 1). If f(x) e Lip 1, then a>(5) ^ M8 ^ M8(l + |ln 5|), whereby we have proved that W D Lip 1. But if a>(5) ^ A8(l + |ln 5|), then condition 0 < a < 1 yields the limit lim ^-^ = 0; hence for 8 < 80 «->o 8a co{6) <6a. This, however, as we already know, is equivalent to f(x) e Lip a. § 2. Lemmata Lemma 1. The identity nt\2 Sm2"\ ]= n + 2 [(n — 1) cos t + {n — 2) cos 2t -\ h cos (n — 1) t] (89) holds true. Proof. In the first place . nnt 1 — cosnt sin Y = —T— (1 —cos£) + (cos£ — cos2£) + ••• + [cos(w— \)t— cos nt] = , then, if we apply the identity , 0 . a -f 6 . 6 — a cos a — cos 0=2 sin —-— sin ——
80 IV. FUNCTIONS AND APPROXIMATION BY TRIGONOMETRIC POLYNOMIALS we get But . 2nt f . t , . 3t , , . (2n — l)t] . t sin2 — = sin — + sin — + • • • + sin - sin-. . t . t sin- = sin-, .3* . t I 3t . V sin — = sin - 2 L-+^sinT-sin-j, . (2n—l)t . t sin = sin— + ^sln__sln_j+... / . (2n — l)t . (2w —3)A (Sln—2 Sm 2 J" (90 If we apply to each parenthesis the formula .. ^.« — b a + b sin a — sin b = 2 sin —-— cos we obtain the identities: t . t 8inT=8iny, 3; t sin — = [1 + 2 cos t] sm — , 5t t sin -$■ = [1 + 2 cos £ + 2 cos 2£] sin — , (2n—l)t t sin- ^—-= [1 +2cosZ + \- 2 cos (n — l)t] sin— . By substituting these expressions into (90) we obtain (89). Corollary. The identity 2n—2 L + S hcoskt (91) *=i \sin T/ AoZds true. In fact, the left-hand side is the square of a trigonometric polynomial of in — l)-th order and, therefore, a polynomial of (2n — 2)-th order; since, moreover, it is an even function, the sine terms are lacking. It has therefore the form (91).
2. LEMMATA Lemma 2. The equation /sinftA4 7in{2n2 + 1) J \smtj 6 o holds true. Proof. By virtue of (89) nu\ J sin'"" u Lsin- An-l "12 n + 2 £ {n — &) cos &w dw. 2 According to Parseval's formula (69) this results in nu\ ' sin isin — j du = 7z\2n2 + 4 2* (w ~ &)2 L But n-l (71—1)71(271—1) 21 (n - Jb)« = l2 + 22 + ••• + (n- l)2 = V '—£ -. jfc = l hence nu\* 2nn(2n2 + 1) ^ ' sin / B» J \Sln 2 With the remark that nu\ —n. the proof is complete. Lemma 3. The inequalities |sinn£| ^ 7i| sint| (— oo < t < + oo), . _ 2 sin t ^ — £ («'4 AoZd Jrwe.
82 IV. FUNCTIONS AND APPROXIMATION BY TRIGONOMETRIC POLYNOMIALS The first inequality can be proved by induction. The second one results from the fact that the function over the segment 0 - decreases t L ' 2j because its derivative on the interval I 0, -) is negative. Lemma 4. The inequality n/2 J \ sin t J 4 (95) holds true. Proof. We divide the integral in two parts extending over the segments 0, — and —, - . We apply the evaluation (93) to the former, and use L 2nj L2n 2 J the evaluation (94) and the fact that |sin nt\ ^ 1 with regard to the latter. This gives us the inequality */? ni2n nj2 However, / /sin nt\* T f , tz* t dt It{^r)dt<ntjtdt + r6.h' > 6 nj2n t/2n nj2 +oo Jtdt = S#' J^<J^=^> 0 n\2n whence (95) follows. Theorem. Suppose function f{x) e C2r has the modulus of continuity o>(5). // we set Un(x) 27tn(2n2 tit) sinn t — x sin t — x dt, (96) then the following statements hold: a) The function Un(x) can be represented in the form 2n-2 Un(x) = A + •? (ak cos kx + bk sin kx), Jt-i Un(x) is therefore a trigonometric polynomial of (2n — 2)-th order. n j f(x)dx = 0, (97) (98)
2. LEMMATA 83 then the absolute term A vanishes in (97). y) For all values of x \Un(x)-f(x)\^6co(^j (99) holds true. Proof. According to (91) t — x\* r sin n - 2 1 2n-2 = L + £ ^(cos &£ cos kx + sin &£sin fcx), 2 hence . t — x J k=1 sin ' 3 / r 2n~2 i ^n(^) = ^ /c% 9 , lx / /(0 £ + X ^ (cos kt cos lex + sin fc£ sin kx)\dti Znn{Znl + X) J [ &=i J —nt which proves statement a) and 0). To prove (98) we use in the integral (96) the substitution t = u + x. Because of the periodicity of the integrand we may retain the old limits of integration. Thus we obtain . nu\ sin- u'{x)-2^+T)lf(x+u) -4 r sin — / We divide this integral into two parts extending over the segment [—w, 0] and [0,7r] and substitute — u for u in the former. This yields . nu\ Un(z) = c>_„/^ , ^l[f{x + U)+f{x-U)][ J Ml*. 2rc7i(27i2+ 1)J uv ' ' /v \0,-„^ "" 2 0 \ sin — Finally, we set u = 2t and find as the final expression for Un(x) ji/2 Q /* /sin w,A4 ^ = ^(2^ + 1)/ "<« + 2<> + '<« - 2^W) *' 0 On the other hand, according to (92) */2 ~7tn{2n2 + 1) / \ sin* / 6
84 IV. FUNCTIONS AND APPROXIMATION BY TRIGONOMETRIC POLYNOMIALS If we multiply this equation by f(x) and subtract it from the preceding one, we obtain Un(x)-f(x) */2 7tn(2n2 + 1) Obviously ° j [f(x + 2t) + f(x-2t)-2f(x)] /sin nt V \ sin t ) ( dt. (100) \f(x + 2t) + f{x - 2t) - 2f(x) | < 2co(2t) and according to property IV of o>(5) Thus co(2t) =(o(2nt— J ^ (2nt + l)co(-). 6eo - \Un(x)-f(x)\ ^ 'nn(2n-+ 1)5 With the aid of (92) we derive from it \U„(x) - f(x)\ ^co /l\l , 12 V/sinnA* which, together with (95) yields the estimate Because of 3ir < 10, (99) follows and the proof is complete. § 3. D. Jackson's Theorems At this point we can already use Jackson's estimates for En mentioned in§l. Theorem 1. For each function f(x) e C2r the estimate En ^ 12co (—\ holds true. Proof. By the theorem of the preceding Section (101) \Un(x)-f(x)\^6co &
3. d. jackson's theorems 85 Since Un(x) e H2l-2 we have a fortiori ^2n-2 ^ 6cof—j. Suppose that n = 2m is an even natural number, then En = E2m ^ E2m_2 ^ 6co (1-J = 6co (^) < 12co C^\. But if n = 2m — 1 is an odd number then En = E2m-\ ^ ^2m-2 ^ ^CO I—J Estimate (101) thus holds for each natural number n. If we bear in mind that for each function from C2x lim co n--oo $-■ then we see that Jackson's theorem which we have just proved contains WEIERSTRAss, second theorem. We have now to deal with a condition that will play an important role later on. We consider, besides En, another quantity en which we understand to be the best approximation to f(x) by such polynomials from H Tn which have no absolute term, i.e., by polynomials of the form n T(x) = £ (dkcoskx + fyfrsin kx). (102) In other words: we have en=ini{A(T)}, T(x) being a polynomial of the form (102). We denote by hi the set of all the polynomials of form (102). It is obvious that e0 ^ eL ^ e2 ^ ..., En<en. In the theorem of the preceding section we have proved that under the condition (98) relation Un(x) e h2n-2 holds. This means, however, that for such a function e2»_2 ^ 6ft>( — ®-
86 IV. FUNCTIONS AND APPROXIMATION BY TRIGONOMETRIC POLYNOMIALS By using this remark and denoting by C*2r the class of all the functions fix) e C2t which fulfill the condition (98), we obtain by repeating verbatim the proof of Jackson's theorem, the following Appendix of Theorem 1. If fix) e (f2x, then Uco G> This is the Appendix which we had in mind. We revert now to theorem 1 and draw from it another two corollaries. Corollary 1. If fix) e Lipjf a (0 < a ^ 1), then E»^-^T' (103) From this we obtain Corollary 2. If f(x) e C2x ^nd if f(x) possesses a bounded derivative f'{x) with \f'(x)\ ^ Mi, then En^ -1. (104) n v ' Under these conditions fix) e Lip^ 1. We point out, in addition, that in (103) and (104) we can replace En by en if f(x) satisfies condition (98). Lemma. If the function f(x) possesses a bounded derivative fix) and if e'n is the best approximation to f(x) by polynomials from hi, then n if, in particular, f(x) e (f2r, then .12*: Proof. According to the definition of en there exists a polynomial with no absolute term n U{x) = J£ (akcos hx + bksinhx), which satisfies the inequality |/'(a?)- U(x)\£e'n (105) for all values of x.
3. d. jackson's theorems 87 Now we investigate the integral of U(x)\ v "aksinkx— bkcoskx V(x) obviously belongs also to h„. If we set f(x)-V(x)=(p(x), then inequality (105) can be written in the form \<p'(z)\^€'n . If we apply to <p(x) the corollary 2 from theorem 1, we find £„(?>) 2S-, (106) n En(<p) being the best approximation to <p(x) by polynomials from Hi. We also consider n n n n J y(x)dx = f(x)dx — V(x)dx = j f(x)dx. —n. —n —n —n Thus, if f(x) e C\v) then also <p(x) e C*2ir and we can replace EJ?) by en(<p) in (106), en(<p) being the best approximation to (p(x) by polynomials from hi. From (106) we derive the existence of a trigonometric polynomial W( x) e Hi for which \(p(x)— W(x)\ ^^ (107) holds true. If, in this case, fix) e C*2ir) then polynomial W(x) can also be taken from hi. If we write (107) in the form 12/ \f(x)-{Vix) + Wix)}\^-^, we see that the polynomial V(x) + W(x) deviates from/(x) by no more than —- . Thus, a fortiori, n n - n
88 IV. FUNCTIONS AND APPROXIMATION BY TRIGONOMETRIC POLYNOMIALS Finally, if f(x) e C2jr, then W(x) and, therefore, the entire sum V(x) + W(x) belong to hi so that e < 12e'» n Thus the lemma is completely proven. Theorem 2. Let function f(x) e C2ir possess p continuous derivatives f'(x), Thus if a>p(8) is the modulus of continuity of the p-th derivative /(p)(x), then UP+ o En ^ —^ (108) np Proof. According to the preceding lemma n Since n \f(x)dx = f{pt) — t(—n) =0 , —7t f'(x) t C*2„ hence, according to the same lemma, 124' n en being the best approximation of the second derivative by polynomials from hi. Since the other derivatives also belong to C*2ir1 we further obtain n 19€(7>) e(p-l) < »_ , 71 ft the designations introduced being self-explanatory. Whence it results that 12/>e(p) nP Finally, according to the Appendix of Theorem 1 we have e<*> ^ l2co n $■
3. D. JACKSON'S THEOREMS 89 (108) following from the last two inequalities. Corollary 1. // f(p)(x) e LipMa (0 < a ^ 1) is added to the assumptions of the theorem, then \2P+1M En< npta Corollary 2. If there exists fiP+l)(x) and if it is bounded: \fiP+1)(x)\ ^ Mp+h then E„^ 12P+1Jfp+1 ^+i Corollary 3. // the function f(x) e C2ir possesses finite derivatives of all orders, then for each exponent p lim (n?Em)=0 (109) n-+x> holds true. In fact, all derivatives of f(x) are continuous, hence each individual one is bounded. According to Corollary 2, for a fixed p n holds therefore true, whence (109) follows. Following the same trend, S. N. Bernstein has proved that for analytic functions En^Aqn (0<£<1) We shall deal with this in Chapter IX.
CHAPTER V CHARACTERIZATION OF THE STRUCTURAL PROPERTIES OF A FUNCTION BY THE BEHAVIOR OF ITS BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS § 1. S. N. Bernstein's Inequality In the preceding Chapter we proceeded from the structural properties of a function (the modulus of continuity), the existence of a number of derivatives, etc.) and drew conclusions as to the rate of decrease of the least deviation En. A number of important results solving the converse problem, namely, draw conclusions as to the differential structure of a function from the rate of decrease of the least deviation En, are due to S. N. Bernstein [3]. These investigations yield in the last analysis a meaningful classification of continuous functions according to the behavior of their best approximations. In the present chapter we deal with some of S. N. Bernstein's results confining ourselves, as before, to investigating continuous functions with period 2w. Fundamental for all such investigations is the remarkable inequality which is also due to S. N. Bernstein [3]: Theorem. Let n T(x) = A + 2J (ak cos fcx + bk sin kx) be a trigonometric polynomial of n-th order. Then for T'{x) the estimate \T'(x)\ <Jnmax \T(x)\ (110) holds true. Proof. Suppose that max \T'{x)\ =nL, and that L > max \T(x)\. Because of its continuity the function \T'{x)\ assumes its maximum value nL at a point z; hence T'{z) = ±nL. Let, e.g., T'(z)=nL. (Ill) 90
1. s. n. bernstein's inequality Since nL is the maximum value of T'(x), then T"{z) = 0. Now we examine the trigonometric polynomial S(x) = L sin (nx — nz) — T(x) which, as well as its derivatives R(x) = Ln cos {nx — nz) — T'(x) 7 is a polynomial of n-th order. We introduce the points , n n 2n U° = Z + ¥n ' ^ = M° + ~n ' U* = U° + ~n~ ' "' W2n Since \T{x)\ < L, it is obvious that %)=!-%) >0, flf(M,) = ~L— T(Ul) <0, s(*8ii) = l- T(%n)>o. Thus each of the 2n intervals (u0t ui), {uh w2), . . . , (^2n-i, u2n) contains a root 2/0, 2/1, ... , 2/2n-i of the function S(x) and we obtain S(y<) = o (u<i <yi<ui+l-, * = 0, 1, ...,2n — 1). Moreover 2/2n-l < Vo + 271, which follows from 2/2n-i < ^2» = ^o + 2tt < y0 + 2n . Now let #2» = 2/0+ 2^> then 91 (112) = u0 + 2n . %2,) = >%0) = 0.
92 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS According to Rolle's theorem each of the 2n intervals (2/0, 2/0 (2/1, 2/2), • . . , (y2n-h V2n) contains a root of derivative R(x) of the polynomial S(x). Let these roots be x0, xh . . . , x2n-i, whereby we obtain R(x-) = 0 (Hi <*i< yi+i; * = 0,1,..., 2n — 1). As we can easily see *2»-l <*0+ 2^, so that the roots x» are pairwise nonequivalent. According to (111) also R{z) = nL— T'(z) =0 (113) holds, so that z is also a root of the polynomial R(x). But since this polynomial cannot have more than 2n nonequivalent roots, z must therefore be equivalent to one of the roots x0, xi, . . . , X2n-i] suppose, for instance that Z ~ X{. Since R'(x) = —n2L sin (nx — nz) — T"(x) y we find by virtue of (112) R'(z) = 0. It follows from this equation and from (113) that z (and therefore also xt) is a double root of the polynomial R{x). Hence R(x) has the (2n — 1) roots Xq, Xl9 ..., £t-_i, Xi+i, ..., X2n-\ in addition to the double root x», i.e., altogether no less than 2w + 1 non- equivalent roots. This is only possible if R(x) = 0. But then S(x) = const, which is obviously not the case since S(uo) > 0 and S(ux) < 0. This contradiction proves the theorem. Corollary. Under the assumptions of the theorem \T&Hx)\ ^n?ma,x\T{x)\. We also note that the coefficient n in estimate (110) cannot be reduced, since for T(x) = sin nx we have max \T'(x)\ = n max \T(x)\.
2. SOME ELEMENTS OF THEORY OF SERIES 93 § 2. Some Elements of Theory of Series We require here some theorems of the theory of series which are not always treated in general lectures on analysis. Lemma 1 (N. H. Abel). Given are n numbers ah a2, . . . , an. If we set then If then t-1 \*t\^A (k = 1,2, ....»). Qi> g2> ••• >qn> o, then Zak<Lk i*=i Mi- Proof. For k > 1, ak is equal to s* — s*_i, hence n n n n Z ak9k = si 9i + Z (sk — **-i) Vk = Zskqk — £ ^_!<fr. *=--! A;=2 A;=l *=2 Thus it follows that n n—1 Z ak<lk = ZSk (ft — ft+l) + 5nft , *=1 Tc = \ so that, finally, Z ak<lk *=i ife — ft+i) + ?»J = -4?i Definition. We say that this series <h + a2 + «8 H satisfies Abel's condition if there exists a constant (which we then call the Abelian constant) for which Zak \ *=i <A (7i = l,2,3,...). Obviously any converging series satisfies Abel's condition; the reverse, of course, does not hold; this is shown by the series 1 — 1 + 1 — 1 + ....
94 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS Theorem 1 (N. H. Abel). Suppose the series J2 ak satisfies Abel's condi- tion with the constant A. If a number series ^i > ^2 > ••• > qn> •••, limgn = 0 00 is given, then series ^2 akqk converges and its sum is absolutely not greater A;=l than Aq\. Proof. We set For m > n we have However, $m 1 * k=n+l s n = 2 akqk. k=l — Sn = 2 akqk. = i n 2 <*k — 2 ak k=\ *=1 (114) 2At so that the sum (114) can be applied to Lemma 1; this yields \Sm-Sn\^2Aqn+1. If n is sufficiently large, then this value is arbitrarily small, whence it follows that series £#*#* is convergent. We obtain the second assertion by performing the passage to the limit n —* oo in the inequality 2 ak<lk k=\ <Mi. Remark. If under the conditions of this theorem the terms of the series Y,Q>k are functions of an argument x given on a set E = {x), and if the Abelian constant can be chosen independently of x, then the series ^a^* converges uniformly toward E. 00 This due to the fact that ^ ak satisfies Abel's condition with the k = n+l constant 2A. Hence, according to the theorem which we have just proved 2 akVk k=n+\ < 2A 9W1; where the right-hand side becomes arbitrarily small for n > N> N being independent of x.
2. SOME ELEMENTS OF THEORY OF SERIES Lemma 2. // x *= 2for, then each of the two series cos x + cos 2 x + cos 3 x + * • • and sin x + sin 2 x + sin 3 x + satisfies Abel's condition with the constant 1 95 (115) (116) (117) Proof. The partial sums n n Cn = ]£ cos kx, Sn = y sin lex of the two series represent the real and the imaginary part of the sum exi e(n+l)xi E, hence, it follows that and, thus, a fortiori \E.\£ ^ 2 1 sin- exi\ \K 1 1 . x sin- ^ 1 , |on 1 =s sin- 5 X We note, in addition, that the series oo oo ]£ cos kx, £sinkx (x^2k7t), k=n k=n obtained from (115) and (116) by cancelling the first (n — 1) terms, also satisfy Abel's condition with the same constant (117). This lemma 2 together with Abel's theorem gives Theorem 2. // then the series 9i > 92 > 9z > limgn = 0, £ 9k cos kx, 29k sin kx k=l k = \
96 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS converge for x ^ 2kw. Convergence is uniform on each segment [a, b] containing none of the points 2kir. x\ In fact, the minimum value of the continuous function a segment is different from zero, whence sin on such sin ^ ft > 0 (a < x ^ b) so that we can choose - as the Abelian constant independently of x. It is interesting, moreover, that the sine series £<?* sin kx converges in a trivial fashion also for x = 2kw, so that it converges along the entire axis. However, its uniform convergence along the entire axis cannot be warranted. As an example we take the series converging everywhere v sin kx *~ yic which does not uniformly converge along the axis. This results from the following consideration, namely, if the series were converging uniformly along the entire axis, then its partial sums would be uniformly bounded by a number M: " sin k x ! *ti yic y M. It would follow from this and from Parseval's formula (69) that which contradicts the nonconvergence of the harmonic series. Lemma 3. For every value of x and n the inequality " sin k x ^2]/tt (118) holds true. Proof. Let 0 < x < w, and m be the integer defined by the conditions m ?g -— < m + 1 x (119)
3. s. n. bernstein's theorems 97 Then * sin kz sin&x + " sin&x Jfc-ro+1 (120) (On the right-hand side the first summand vanishes for m = 0 and the second one for m ^ n). With the aid of the elementary inequality |sina| ^ \a\ we obtain ™ ! sin k x < v — i x ^ ]/jr. (121) Abel's lemma together with the remark to lemma 2 yield 1 1 | ^ sinfcx According to (94) and (119) sin Thus we obtain . x x sin — > — 2 71 * sinfcx k=m+i k m + 1 > m "j/jT 1 TT X = V£, hence, in connection with (121) and (120), the required estimate for x e (0, t) sin fcxl follows. But since the function is an even function, from the validity of estimate (118) for 0 < x < t follows also its validity for — t < x < 0, and since for x = 0 and x = ±7r it is trivial, it is also valid for — t ^ x ^ 7T. Thus, because of period 2w of the sum ^ the values of x. sin kx k , it is valid for all § 3. S. N. Bernstein's Theorems Theorem 1. Let f(x) e C2ir and En be its best approximation by polynomials from Hn. If for each natural number n En< — (0<a^l) (122) holds, then, in the case ihat a < 1, the assertion is written f(x) e Lip a,
98 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS whereas in the case that a = 1 it is written f(x) e W. Proof. For each natural number n there exists a trigonometric polynomial Tn(x) e Hi for which \Tn{x)-f{x)\^a holds. We set Since it is obvious that U0(x) = T1(x)i Un(x) = T2n(x) - T2n-X(x) (n = 1, 2, 3, ...) y n=0 ZUn(x) = T2»(x)=tf(x)} f(x)=2Un(x). Now we choose a number 5 such that o«5<;-I as well as two points x, y which satisfy the condition \x — y\ ^ 8 f(x) -f(y) = 2 \Un(x) - Un(y)]. By the conditions 2m-1^-r<2m (123) o an integer m is defined. Thus we obtain m— 1 oo oo \f{x) - f(y)\ < 2 I Un(x) - Un(y)\ + Z I Un(x)\ + £ | Un(y)\. n=0 n=m n=m We estimate the polynomial Un(x): J7.(«)| ^ |T2«(*) -/<*)! + \f(x)- Tr»(x)\ fi± + _£_ =^i±^
3. s. n. bernstein's theorems 99 It follows that n=m ' V n=mtna 1 — 2— 2™* so that we obtain m-l £ \f(x)-f(y)\<Z\Un(x)-Un(y)\+ — n=0 ^ and we set 1 — 2~a for shortness. But Un(x) is a polynomial belonging to H^; for its derivative the Bernstein inequality (110) yields therefore the estimate | U'n{x) | ^ 2»max \ Un(x) | ^ 2* ^(12^2q) = A (1 + 2") 2»(1-«). Following the mean value theorem we therefore have \Un(x) - Un(y)\ = |ff; (*)| \z-y\ ^ A(l + 2«) 2»<i-°><5, so that m-l ft \t{x)-f{y)\< A(l+2°)dZ 2»<i"«> +— • If we consider that x and y are subject only to one condition, viz., | x — y\ ^ 5, then the inequality last obtained shows that m-l d co(<5)^C<52,2w(1-«)+ _ , «=o 2wa where C = A(l + 2a). With the aid of resulting from (123), the last inequality takes the form m-l co(d) ^Cd£ 2**1-") + Bd* . (124) Until now our consideration included both a < 1 and a = 1. But from here on their ways part.
100 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS In the case of a < 1 we have m-l 9W»(1 —a) 1 Om(l-a) y 9n(l-a) = Z 1 <r _Z nt{,~ 21-« — 1 ^21~«-1 But according to (123) 2*^4- so that we obtain and, hence, or m—l 91—a ] y 2n<1-°> < — — ^)<W2l^^a + ^ «(*>< (21=^=1 Cf + iB)*' = 2><$a ((5~^)! which is equivalent to/(x) e Lip a. In the case that a = 1, inequality (124) takes the form co(<5) ^C<$m + £<5. From the inequality 2m~1 g - there follows ra — 1 ^ 'In gl • but if ra > 2 $ In 2 ' then m ^ 2(w — 1) so that mSj^lta*!, whence «>(<5)^<5^|ln«5| + 5J 2C follows. If K is a number greater than and B, then we obtain In 2 co(d)^Kd(\\nd\ + l) (d-^)' and the proof is complete.
3. s. n. bernstein's theorems 101 Together with Jackson's results in the preceding chapter there follows from the Bernstein theorem: For the function f(x) to satisfy a Lipschitz condition of an order a < 1, the condition n — na is necessary and sufficient. If a = 1, this condition remains necessary for f(x) to belong to Lip 1, but ceases to be sufficient since in this case the proof fails. That this is not due to a shortcoming of the proof itself is shown by the following example in which from n by no means f(x) e Lip 1 follows. Example* Let sin k x Since ^ — is a majorant of this series, then \f/(x) e C2ir. We set k2 Jfc=l K then I un{x) - v{x) i *J+i ± <J+i ^-Lp % J+i (^ -1) = 1. For function (125) En<- n holds therefore true. At the same time it does not satisfy the Lipschitz condition with a = 1. To prove this we note that a series ~ coskx »±i k '
102 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS obtained by differentiating series (125), converges uniformly on each segment containing none of the points 0, ±27r, ±4n,.. . Hence, for 0 < x < 2w we have ,, v ~ coskx V>(*)=2 —T— . k=l * We show now that lim ip' (x) = +oo , X-+ + 0 (126) In fact, for each value of A > 0 we can find a number N for which N 1 *-i k Now let 0 < x < — and let X be the number defined by condition 2N n jU JO As we see, X ^ N. Further ,, * coskx , ^ coskx , ~ coskx V>(*)=2 —T— + 2 —7— + 2 —T- • 7T For i^Zwe obtain fcx ^ - ; thus cos kx becomes equal to or greater than zero, whence it follows that ,, x ^ ^ coskx , ~ coskx W(x)^2 —r— + 2 —j— • * = 1 * *=X+1 k With the aid of Abel's theorem and Lemma 2 in the preceding section we obtain ~ coskx 11 1 2x < X X + 1 . # 71 sin -— sin —- 2 2 » X X and, since sin - > - , 2 7T cos k x I U=^+i & <2,
3. s. n. bernstein's theorems 103 whence £ cos k x follows. But lim cos kx = 1. For sufficiently small values of x(0 < x < a) x->0 holds, hence for x < min < cr, — y/(x)>4, which proves (126). Now we let a: and y converge to zero (x = y) and obtain ^)-^)l * — y I i.e., iA(x) does not belong to Lip 1. This result might give the idea that Jackson's theorem could be improved, i.e., that for the functions of class Lip 1 the quantity En converges towards zero more rapidly than prescribed by the estimate En^~- (127) n This, however, is not the case, hence estimate (127) is optimal; thus we are faced with the fact that class Lip 1 is a true part of the class of functions A f(x) for which En(f) < — holds (the latter class being contained in W). n That estimate (127) is not improvable can be shown, e.g., by the function | sin x |. It obviously belongs to Lip 1; but for it 1 E»> 2n{2n + 1) holds, as we shall show in § 3 of Chapter VIII.
104 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS Theorem 2. Let fix) e C2r and En be the best approximation to f{x) by polynomials from RTn. Then, if p being a natural number and 0 < a ^ 1, f(x) has continuous derivatives ff (x), f"(x)f . .. , fiP)(x)y moreover', the latter, i.e., /(p)(z), belongs to class Lip a if a < 1, and to W if a = 1. Proof. We introduce, as above, polynomials Tn(x) e HJ for which We also set U0(x) = T1{z), Un(x) = T2n(x) — T2«-i{x) (n > 0). We notice Un(x) e H^n in the first place, where | Un{x) | <; | I> (x) - /(«) I + I /(„ _ Tr+W I £ ^ + gs^jft,, so that we have If we apply to Un(x) the corollary from the Bernstein inequality, we find that W^(x)\^~. (128) The series 71=0 which we obtained by p-fold formal differentiation of the series f(x)=2Uu(z) n=0 ' therefore converges uniformly. But this signifies the existence of the p-th derivative (and all the more of all the preceding ones) of f(x) as well as the correctness of the equation /<">(*)=-ft 1P)(*). n=0
3. s. n. bernstein's theorems 105 Now there only remains to determine the class to which fiP)(x) belongs. By reason of (128) where n=0 J | n=m+l - *• 2 u£} (x) n=0 is a trigonometric polynomial from H2m If we denote by E(^ the best approximation to/(p)(x), we have For each natural number n > 2 we now define a number m by the inequalities 2m 5g n <2m+1. Then The p-th derivative offiP)(x) thus satisfies the assumptions of Bernstein's first theorem, and thus the proof is complete. If we combine this theorem with Corollary 1 from Theorem 2 of § 3 in Chapter IV we see that the inequality A En ^ -j^-a (p is a natural number, 0 < a < 1) is necessary and sufficient for the existence of p continuous derivatives of f(x) the last of which belongs to Lip a. For a = 1 this condition remains necessary but ceases to be sufficient. This can again be verified by the example (125); all we have to do is to consider the function ty(x) which we obtained by p-fold integration of the series (without constants of integration). For p = 1, we have, for example y<*>—scoskx *» for this function En < - ; its first derivative \p{x)) however, does not belong n2 to Lip 1.
106 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS Theorem 3. A function fix) e C2r has derivatives of arbitrarily high order if and only if for each value of p the relation lim (n?En) = 0 (129) n-+oo is satisjved. The necessity of (129) was proved by us at the end of the preceding chapter. If we assume now that (129) is fulfilled, then we can find a number Np such that for n > Np nP+1En < 1 is always valid. If A p is the largest of the numbers liJSli2P^E29...9Nl+1ENp, then En<:-^- n -nv+l holds for all values of n, whence there finally follows the existence and continuity of all the derivatives up to order p included. But since p is arbitrary, the theorem is proved. § 4. A. Zygmund's Theorems We found in the preceding sections that the class of those functions C fix) e C2t, for which En < — is, on the one hand, actually greater than class n Lip 1 and, on the other hand, it is a subclass of W. Thus the question arises as to which structural property is actually characteristic of this intermediate class. This problem was solved by A. Zygmund in [2]. The results of it are given below. Definition. We denote by Z the class of those functions f(x) e C2r which satisfy all the values of x and all the values of h > 0 of the inequality \f(x + h)~ 2f(x) +f(x — h)\^Mh. Here M is a constant defined by the function and independent of x and h. Theorem 1. A function f(x) e C2r belongs to the class Z if and only if its best approximation En by trigonometric polynomials satisfies the inequality E.<±. n
4. A. ZYGMUNd'S THEOREMS 107 Proof. Suppose that f(x) e Z. We resort here to Jackson's integral (96): t — x~ Um{x) = 2nn(2 * f n2 + 1) J lit) sin 7i- t— x sin- dt. This integral appeared in § 2 of Chapter IV, and we obtained equation (100) o n/2 /sinnA4 Since f(x) e Zf the above equation yields 6M \Uu(z)-f{z)\£ 7ini2n2 + n/2 / • A A -ft3^)dt. By virtue of lemma 4 in § 2 of Chapter IV we derive from it ZnM \Un(x)-f(x)\^ 4:71 However, Un{x) is a trigonometric polynomial of order 2n — 2, hence E 2fl-2 471 From this we derive En StzM 2n ' as in the case of the proof of Jackson's first theorem, thereby proving the necessity of the condition. Now we assume that n As in the proof of Bernstein's theorem, we introduce the polynomials of the best approximation Tn(x) and the polynomials Then, as above, Unix) = T2nix) - T*-i{z) fix) =J Unix), \Unix)\<^9 n-0 * [U0(z) = T^x)]
108 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS Un(x) being of order 2n. If we take any natural number m, we get 2mM*)\<2*4 = £i hence for each value of A > 0 | /(x + *) — 2/(a?) + /(a: — *) | m-l 94 A <Z\Un(x + h)-2Un(x)+Un(x-h)\+—. n=0 L By the mean value theorem Uu(x + A) — 2 Un(x) + Un(x - A) = [Un(x + A) - Un(x)] - [*7n(z) - *7„(z - A)] = A[t';(f) - tfifo)], wherein x — h<r]<x<£<x + h. Repeated application of the mean value theorem gives Un(x + A) - 2 Un(x) + Un(x - A) = A Vfi(C) (f - ti), whence it follows that I Un(x + h) - 2 *7n(z) + £/„(* - A) I rg 2A2 max | V^x) \. If we apply the corollary of Bernstein's inequality to Un(x), we obtain 9 A max I UZ(x) I ^ 22»^r = 3A2n Thus I /(a? + h) - 2/(a?) + /(a? - A) | rg 6il A« V2* + — n-o 2m or 24.4 I f(x + A) — 2/(x) +f(x — h)\£ 6Ah* 2™ ■ 2m ' Until now m was an arbitrary natural number. Now we subject it to the following condition: 2m ~~ 21W_1 (it may be assumed that A < 1). Thus the inequality above takes the form I f(x + A) — 2f{x) +f{x — h)\^ Z6AA,
5. EXISTENCE OF FUNCTIONS WITH PREASSIGNED BEST APPROXIMATION 109 and Zygmund's theorem is fully proved. In choosing m we assumed that h < 1. But if h ^ 1, then we can replace constant 36A by 4 max \f(x)\; as the quantity M sought for we then take the larger of the two numbers 36A and 4 max \f(x)\. We bring now the following results obtained by Zygmund without stopping to prove them. Theorem 2. The inequality ETn{f)< A n?+l is necessary and sufficient for the function f(x) e C2jr to possess the first p derivatives f'(x), f"(x), . . . ,fw(x), the latter, fw(x), belonging to Z. Theorem 3. The inequality El{f)<^- (lima, = 0) is necessary and sufficient for the function f(x) e C2jr to satisfy the condition 1/(3 + h) - 2f(x) + f(x-h)\< a(h)h , a (h) approaching zero with h. Theorem 4. The inequality En(f)<^T (liman = 0) is necessary and sufficient for the function f(x) e C2jr to possess the first p derivatives, the last,f(p)(x), satisfying the condition \f(p)(x + h) — 2fM(x)+fM(x — h)\ <a(h)h (\ima(h) = 0) § 5. The Existence of Functions with Preassigned Best Approximation S. B. Bernstein [13] completed the foregoing considerations by the following fundamental Theorem. For each sequence of numbers A0^AX^A2^ ..., lim^n = 0 n-+oo
110 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS there exists a function f(x) e C2ir whose best approximations are equal to the numbers of the given sequence, i.e., for which El(j)=An (7i = 0,l,2,...) Moreover, there exists always an even function belonging to C2ir which satisfies this condition. The proof of this theorem being fairly complex, we have to premise the following simpler lemmas. Lemma 1. If f(x) e C2jr and X is an arbitrary constant, then En(Xf) = \X\En(f). If T(x) is the polynomial of the best approximation to f{x), then we have \T(x)-f(x)\^En(f). Multiplying this inequality by |X| we get En(Xf)^\X\En(f). Assuming X = 0 (for X = 0 the theorem is trivial), we further obtain Em(f)=E%{±Xf}<±Em(Xf)% which proves the theorem. Lemma 2. For arbitrary functions fix) andg(x) from C2ir En(f + g)^En(f)+En(g). For, if Un{x) and Vn{x) are polynomials of the best approximation to f{x) and g{x), then \U(x) + g(x)] - [U(x) + V(x)]\ ^ En(f) + En (g), which proves the theorem. Lemma 3. If f{x) and g(x) are two functions from C2jr then the quantity y>(X) = En(f + Xg) is a continuous function of argument X.
5. EXISTENCE OF FUNCTIONS, WITH PREASSIGNED BEST APPROXIMATION 111 We choose a fixed quantity X = X0 and denote by T(x) the polynomial of the best approximation to the function f(*) + *o9(x); hence \f(x) + Xog(x)^T(x)\^W(X0). If X is an arbitrary quantity, then \f(x) + Xg(x) - T(x)\ ^ y(*o) + M\ X - JJ, and M = max \g(x)\. Whence it follows that v{X)^V{io) + M\X^io\. Here X and X0 are fully equivalent and interchangeable, so that we obtain lv(A)-y(^|^Jf lA-Jol and thus prove the lemma. Lemma 4. // the function g(x) in Lemma 3 does not belong to H%, then lim tp(X) = +oo. In fact, by Lemma 2 En(Xg)£En(f + Xg)+Eu(-f), which, by reason of Lemma 1, goes over into \l\En{g)£V(X) + Eu(f). £ince g(x) does not belong to ETn) En(g) > 0, which clearly proves the theorem. Lemma 5. If T(x) e Hi and fix) e C2v, then En(f+T)=En(f). Obviously En(T) = 0. Whence, by Lemma 2, it follows that Eu(f+T)£Eu(f).
112 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS On the other hand, the same lemma gives Kit) = £«[(/ +T) + (- Tj] ^ En(f + T). Lemma 6. Suppose thatf(x) is an even function from C2k. Then we can find a constant L for which En[f(x) + L cos (n +l)x] = En+l(f). Lei T(x) be the polynomial of the best approximation to/(x) of class #J_i. Since it is even, we have n+l T(x) = A + J£ ak coskx. The inequality \f(x)-T(x)\^En+l(f) defining T{x) can be written \[f(x) — an+l cos(n + 1)*] — \A + ]£ ak cos k x\\ <^,,_i(/), from which we read En\j{*) - o»+i cos (n + l)a?] ^ ^.+i(/). On the other hand, by reason of Lemma 5 we infer that En[f(x) - an+1 cos (n + l)x] ^ JP,+ 1[/(a;) - an+l cos (n + l)x] = J,+1(/). Thus, the quantity sought for is L = —an+1. Lemma 7. Suppose that f(x) is an even function from C2ir. If A^En+l(f), we can then find a constant M for which En[f(z) + M cos (n + l)x] = A . In the case where A = En+i(f)y this lemma is identical with Lemma 6. Let therefore A>En+1(f).
5. EXISTENCE OF FUNCTIONS WITH PREASSIGNED BEST APPROXIMATION 113 Now we set y>(A) = En[f(x) + A cos (n + l)x]. If L is the constant of Lemma 6, then y>(L)=En+1(f)<A. On the other hand, for large values of X, we have by Lemma 4 xp{X)> A. But since by Lemma 3 \p{\) is continuous, there exists a quantity X = M for which y)(M) =. A. Lemma 8. For each number system A0^A1^A2^"^An^0 there exists an even polynomial U(x) e Hn+i for which Ek(U) = Ak (fc = 0,1, ...,*) and which, in addition, satisfies the condition \U(x)\^A0. Proof. We put f(x) = 0. For this function En+i(f) = 0. By Lemma 7 there exists therefore a constant Mn+i for which En[Mn+l cos (n + l)x] = An . Having this constant we set f(x) = Mn+X cos (n + l)x and apply Lemma 7 to this function (with n — 1 in place of n). Thus we obtain a constant Mn which yields En_Y [Mn+l cos (n + l)x + Mn cosnx] = An_x . Together with Lemma 5, this gives En [Mn+1 cos (n + l)x + Mn cos nx] = En [Mn+1 cos (n + l)x] = An.
114 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS In other words: The addition of Mn cos nx to Mn+i cos (n + l)x gave us the value required for En-i without deteriorating the value obtained earlier for En. Further, if we set f(x) = Mn+l cos (n +l)x + Mn cosnx and again apply Lemma 7, we get a constant Mn_i such that En_2 [Mn+l cos (n + l)x + Mn cos nx + Mn_l cos (n— l)x] = An_2. The addition of summand Afn-i cos (n — l)x alters neither En-i(f) nor En(f), so that En_Y [Mn+1 cos(n + l)x + Mn cosnx + Mn_l cos (n — l)z] =^n_l5 En [Mn+1 cos (w + 1)# + Mn cosnx + Mn-\ c°s (w — 1)#] = An . Continuing this procedure, we arrive at a polynomial n+l W(x) =2 Mkcoskx, *=1 for which the equations En(W)=An, ..., E1(W)=A1, E0(W) = A0 hold true. If now we denote by — M0 the constant from which W(x) has the least deviation, then n+l M0 + ]£ Mk cos k x fiA0. Since, moreover, all the deviations Ek(f) of the polynomial n+l U(x) = M0 + £ Mk cos kx fc=l coincide, by Lemma 5, with the deviations Ek(W), then U(x) is the polynomial sought. Now we can finally approach the proof of Bernstein's theorem and point out right away that if a constant An+i of the theorem vanishes, this theorem goes over into Lemma 8. Hence we assume all the values of An to be positive. Then for each index n^Owe construct an even polynomial U „{x) e //J+i for which I CM*) I ^o, E0(Un)=A0, E1(Un)=A]i ..., En(Un) = An
5. EXISTENCE OF FUNCTIONS WITH PREASSIGNED BEST APPROXIMATION 115 and are able to show that from the sequence of polynomials {Un(x)\ we can single out a uniformly convergent subsequence {Uni(x)\ whose boundary function satisfies the conditions required by the theorem. To this end we denote by R{m(x) that polynomial in H^ which deviates the least from Un(x). We can easily see that \B™(x) - Un(x) \^Am (m = 0, 1, 2, ...) i since for m = 0, 1, 2, . . . , n this results from Am = Em{Un) and from the definition of R(m(x). But if m ^ n + 1, then the estimate is trivial since *£?(*) = Un(z) (m = » + l, n + 2, ...). From the same inequality it also follows that for all values of n, m, x the estimate holds true. Then we investigate the sequences of "polynomials" (each being a constant): B$\x), B^W, 42)(*)> i?o3)(*). From this bounded sequence we separate a convergent subsequence B^\x)9 BJp-^z), 2#8'0>(*), ..... limBif**)(x) = B0(z), A;-oo in which the exponents nh,o grow monotonically: ^l.o < n2,o < tt3>0 < * The polynomial sequence B^ix), B{?2>o)(x), B(?*°\z), ... is then formed analogously. All of these polynomials belong to H? and are bounded by a fixed number. By reason of the axiom of choice (Chapter III, § 2, Theorem 2) we separate from this sequence a convergent subsequence JRini'l)(o:), i?in2,l)(z), Mn3,l)(*), ..., \im B{?ktl)(x) = #!(*), k-+oo
116 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS where the sequence of exponents {nM} is rigorously increasing with increasing k and is simultaneously a subsequence of the foregoing sequence [nkto}. Continuing with this procedure we obtain for each value of m a uniformly convergent polynomial sequence R%>m)(x), R%'m)(x), R%»m)(x), ..., VimR^){x)=Em{xh k-*oo in which the exponential series {nk,m} increases rigorously as k increases and is a subsequence of {nk-i,m}. After finishing this construction we set rii = niti. If i ^ m, then n» is a term of the sequence \nk,m}, i.e., where as is easily recognizable. Now we determine a quantity m. Then we can find a number Nm such that for k ^ Nm we always have \R%*>m)(x)-Rm(x)\<Am. Then we set i(m) = maxjm, Nm}. If i ^ i(ra), where fciW) ^ i(m) ^ iVm. Hence On the other hand \Uni(x)-R^\x)\^Ami (130) and, therefore, for i ^ i(ra)
5. EXISTENCE OF FUNCTIONS WITH PREASSIGNED BEST APPROXIMATION 117 Hence, for every two numbers i and j greater than i{m)1 \UM(x)- Unf(x)\<±Am, which, together with the condition lim Am = 0 warrants the uniform convergence of the sequence { Uni(x)\. If we set f(x) =lim Uni(x), we can show that this function satisfies the requirements of the theorem. That this function is continuous, has a period 2w and is even is obvious; all we have to ascertain now is that it possesses the best approximations required. To this end, in inequality (130) we retain m and pass to the limit i —> oo. Thus we obtain \f(x)-Bm(x)\^Am, and use R^ (x) = R^ *<m>,m' (x) -> Bm(x). Since Rm{x) e Hi we may therefore infer that Em{f) ^ Am , and all there remains to do is to exclude the possibility that Em(f)<Am. Let us therefore assume that for a specific value m this inequality exists. Then let V(x) be the polynomial of the least deviation from f(x) in H^, that is, \V(x)-f(x)\^Em(f). For sufficiently large values of i we then have \Uni(x)-f(x)\<Am-Em(f), hence \V(x)-Uni(x)\<A,
118 V. BEST APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS whence it follows a fortiori that Em(Uni)<Am. This, however, contradicts the definition of the polynomials Un(x) for n< ^ m. Thus the theorem is fully proved. § 6. Density of Class HTn in Class LipM a. Jackson's estimate En(f)^ — (131) does by no means contradict the fact that for individual functions belonging to Lip>/ a, En decreases at a higher rate than n~a. This is the case, e.g., of all the differentiable functions belonging to all classes Lip a. Nonetheless it is impossible to improve the estimate (131) for class Lip aasa whole. We are now going to investigate this problem more thoroughly. If we retain a ^ 1 and let f(x) pass through the entire class 1 Lipi a, then let ru(a)=sup{En(f)}. This quantity is so to speak the measure for the density 2 of the polynomials from HTn in the class Lipi a. Theorem. For all values of n ^ 1 K{a) 12 wherein K(a) is a positive quantity dependent only upon a. Since, as we know, (131) is correct, it suffices to prove the existence of the quantity K(a). First of all, let a < 1. By Bernstein^ theorem in the preceding Section, there exists a function <pa(x) for which holds. ^f f(x) cLipj, a, then — f(x) cLip! a. On the other hand, En(Mf) = MEn(J). The M assumption that M = 1 is therefore no loss of generality for the following considerations. 2 The introduction of such characteristics is due to A. N. Kolmogorov. (A. N. Kol- morogov [3], but cf. also N. I. Achieser and M. G. Klein [1]; S. M. Nikolsky [1, 2, 3] S. N. Bernstein [14].)
6. DENSITY OF CLASS #J IN CLASS LIPM a 119 From Bernstein's results expounded in § 3 it follows that <pa(x) e Lip a (here we use a < 1). If the coefficient of the corresponding Lipschitz condition is Ma then \<Pa(x)—<Pa(y)\ ^Ma\x — y\«. The function therefore belongs to Lipi a and we obtain so that the quantity is the one required. For a = 1 this procedure no longer holds the power of proof since the function (fi(x) does not necessarily belong to Lip 1. We can, however, prove the validity of the theorem also for this case by resorting to function |sin x\ which clearly belongs to Lipi 1 and for which in accordance with an earlier remark.3 Thus we have so that the number — plays the role of K(l) here. 67T 3 The proof for this can be found in Chapter VIII (§3).
CHAPTER VI INTERRELATION BETWEEN STRUCTURAL PROPERTIES OF FUNCTIONS AND THEIR APPROXIMATION BY ALGEBRAIC POLYNOMIALS § 1. Lemmata In this Chapter we investigate the interrelation between the differential structure of functions belonging to class C{[a1 b]) and the rate of decrease of their best approximation by algebraic polynomials. As we shall see, this problem can be easily traced back to the analogous problem of trigonometric approximation dealt with in the two preceding Chapters. This first Section gives the respective lemmata. Let f(x) e C([a, b]). Then for -1 g u g 1 a^-a)u + (a + b) so that the function belongs to class C([ — 1, +1]). If, moreover, we set y)(0) = <p(cosO), then \p(d) is obviously a continuous function with period 2t defined for all real values of 0. Moreover, ${ — 6) = iA(0), hence \f/(6) is even. We shall define \p(6) as being a function induced by the initial function fix). Lemma 1. Let En be the best approximation to fix) e C([a, b]) by algebraic polynomials from Hn and ETnbe the best approximation of their induced function \p{6) by trigonometric polynomials from H n. Then Bn=ETn. (132) 120
1. LEMMATA 121 Proof. Let P(x)=J}ckx* *=0 be the polynomial deviating the least from f(x). Hence for x e [a, b] \f(x)-Zckx*\£En. For u e [-1, +1] it follows that <p(' *=o L 2 J ! and, hence, y>(0) — 2 c* *=o L )cos0 + (a + b)]*\ r #„ But according to Theorem 3 in § 2 of Chapter I the function *=o L ) cos 0+ (« + &) is a trigonometric polynomial from HTn, so that we have Elt^En. (133) We now prove the converse inequality. Let T(6) be the trigonometric polynomial deviating the least from 4/(0) belonging to Hi. Since, like \p(0)f it is even, it takes the form * T{0) =A + £ akcoskO. k=i By definition of the Tchebysheff polynomials cos kd = Tk(cos6); and, hence, T{6) =2Jckcosk6. (As a matter of fact, we need not revert to the theory of the Tchebysheff polynomials; all that matters here is that cos kd is a polynomial of fc-th degree of 0. This was proved in § 3 of Chapter II.) 1 See Chapter I, § 2, Lemma 4.
122 VI. FUNCTIONS AND APPROXIMATION BY ALGEBRAIC POLYNOMIALS The equation . \y(B)-T(d)\£EZ defining the polynomial T{6) can therefore be written <p (cos 0) — ]£ ck cos* 6 *=o ^< so that for u e [—1, +1] we obtain the estimate fW — 2Ckuk Jfc=0 £K. If, however, z e [a, b], then 6 — a if we substitute this ratio into (134) for u, we obtain But and f2x- (0 + 6)1 » |-2«-(q + 6)1» *5H &-• J (134) is an algebraic polynomial belonging to H„ so that we have This proves the theorem. Lemma 2. We retain the designations above and assume o>/(6) and w^(8) to be the moduli of continuity of the functions f(x) and ^(d). Then (135) Proof. We set a certain value S > 0 and assume \B' — d"\ Ji S. Then IV»(0") — V(0')l = Mcos0") — <p(cosd')\.
1. LEMMATA 123 However, |cos0"-cos0'|^ |0"—0'|^<5. Thus, if we introduce the modulus of continuity a>„(5) of the function <p(u)> we obtain |y(0")-?(»')I £<M*). and, hence, cov(d) £w9(i). (136) Now we estimate 0^(0). If \u" — u'\ ^ 5, the distance between the two points n __ (6-a)tt" + (a + 6) , _ (b - a)u'+ (a + b) does not exceed 5. Consequently, and therefore <M*)^/(^p*)- (137) We derive (135) from (136) and (137). Lemma 3. Designations as above. Moreover, let [a', b'] be a segment lying entirely on the interval (a, b). If <af(8) is the modulus of continuity of the function f(x) on the segment [a', &'], then a>i(d)£<o,(Kd), (138) where K is a constant dependent only on the segments [a, b] and [a', &']. Proof. We set _2a' — (a + b) o _ 2b' — (a + b) It is obvious that b — a ' b — a 1 <r <s < +1.
124 VI. FUNCTIONS AND APPROXIMATION BY ALGEBRAIC POLYNOMIALS Now let A = min{r + 1, 1 - *} = min (2 ^=-? , 2*-^]. ' \ 0 — a 0 — a) If %' and x" are two points of the segment [a', V] for which \x" — x'\ ^<5 holds, then the points , 2x'—{a + b) „ 2x" — la + b) u = , u" = j—- - 0 — a 0 — a fall into the segment [r, s] and, moreover, satisfy the condition Let, further, 0' = arc cos u', 0" = arc cos u", so that \f(x") - /(*') I = M*") - <p(u') I = |y>(0") - y>(0') I. (139) Then the mean value theorem yields 0" - 0' = - _!==(*'' - w'), |/1 — tZ2 u lying between u' and w", hence, a fortiori, between r and s. Consequently 1 _ u2 = (1 — u) (1 + u) >(1 — 5) (1 + r) ^ A2, whence it follows that From this and (139) we find
2. HOW PROPERTIES OF A FUNCTION AFFECT ITS APPROXIMATION 125 and therefore ^*^(i<r^) = «*<*«>• where JT 2 f 1 K = -t— = max f X{b-a)~ \af -^a' b - bf § 2. How the Structtural Properties of a Function Affect Its Approximation Theorem 1 (D. Jackson). // En is the best approximation of the function f{x) e C([a, b]) by polynomials from Hn, then En^\2co(^^); (140) wherein ^(5) is the modulus of continuity off(x). For, if we introduce the induced function \p{6)) then according to the Jackson theorem dealt with in § 3 of Chapter IV we have *£ 12^(1); but by (132) and (135) Corollary 1. // f(x) e LipMa (0 < a ^ 1), then with
126 VI. FUNCTIONS AND APPROXIMATION BY ALGEBRAIC POLYNOMIALS Corollary 2. If the function f(x) possesses a bounded derivative f'(x), such that \f'(x)\ ^ Mi, then EU&W—)M\ (142) In order to get another step further we need the following Lemma. If the function f(x) e C([a, b]) possesses a continuous derivative fix), then there exists between its best approximation En and the best approximation En-i of its derivative the relation n Proof. Let P(x) e Hn-X be the polynomial of the best approximation to f\x), that is, \f(x)-P(x)\^Efn_1. (144) If we set <p(x)=f(x)-fP(x)dxy 0 then inequality (144) can be written |?'(*)|£tf-i and thus we derive En(<f) ^ E'n-l from Corollary 2 of the preceding theorem. If Q(x) e Hn is the polynomial of the best approximation to <^(x), then 19{X)-Q{X)\£W^±E>U_1 or, which is the same, f(x)-Up(x)dx + Q(x)\\^^-^-E'H_1>
2. HOW PROPERTIES OF A FUNCTION AFFECT ITS APPROXIMATION 127 and since the sum x fP(x)dx + Q{x) o is a polynomial belonging to Hn) the proof is complete. Theorem 2 (D. Jackson). // there exist p continuous derivatives of the function f(x) on the segment [a, b] and o)p(8) is the modulus of continuity of the last derivative fiP)(x), then for n > p the estimate En<C_vil>-a) * / b — a \ —te=w <146) holds) and the quantity Cp depends only upon p. Proof. If we apply the lemma repeatedly, we obtain (using self-explanatory designations) a series of inequalities ^6(6-a) n 6(ft-a) S*~1^ n-l En~2' Multiplication of these inequalities yields qp {b _ a)P *"&»(n-l)...(n-p+l) —*• On the other hand, by Theorem 1, whence we get En <. 12 l ' 7i(7i — 1) ••• (n — p + Since n > p we have for k = 1, 2, ... , p — 1 n p
128 VI. FUNCTIONS AND APPROXIMATION BY ALGEBRAIC POLYNOMIALS whence it follows that p — k n — k > n. V If we multiply these inequalities we find (w _ i) (w _ 2) • • • (n - p + 1) > (P~ 1) (P~ 2) "• 1 „P-i and 71(71— 1) (71 — 2) ...(71 — p+ 1) >^^P- (147) But from the expression above and from (146) it follows that 12.6*rM6-«)P (b—\ (148) this being the required inequality (145) where for the constant Cp we have found C =12*£. (149) p! Corollary 1. // among the conditions of the theorem there occurs the case f){x)tUpMa, (0<agl) then for n > pthe estimate E c^-^1m holds trite. Here Cp is dependent only upon p and a. This is true since under the assumptions made Since n > p) therefore n ^ p + 1, hence i-£*i 71 p + 1
3. CONVERSE THEOREMS 129 and, consequently, n — p ^——. p+ 1 Whence it follows that p\2(n-V)J- \ 2 } n" Together with (145) this gives p\ \ 2 ) K ' nP+a and in order to get (150) we have only to set r. _1.->6pyp/P+l\" Cp-U-vT{-2-) ' (151) Corollary 2. If f(x) also possesses a bounded derivative /<»+» (x) such that |/(p+1)(x)| ^ Mh-i, «Aen «.ic;|t-^"", (in) and the constant C" depends only on p. For in estimate (150) we may then set a = 1 and M = Mp+i. If we consider also (151) with a = 1, then we immediately obtain (152), where c"> = -jr <* + *)• Corollary 3. If the function f(x) has derivatives of all orders, then for each p lim (n? En) = 0 n-+oo holds true. This follows directly from (152). §3* Converse Theorems Theorem 1 (S. N. Bernstein). Let the best approximation En to the function f(x) e C( [a, b]) satisfy the inequality En^a (0<a^l). (153)
130 VI. FUNCTIONS AND APPROXIMATION BY ALGEBRAIC POLYNOMIALS If a < 1, then fix) on each segment [a', b'] fully comprised in the interval (a, b) belongs to class Lip a. Buf if a = 1, then f(x) on each such segment belongs to class W. Proof. By (132), the equation El (V) = En holds for the induced function \p(6). Estimate (153) and Theorem 1 in § 3 of Chapter V convince us of the fact that \p(6) belongs to class Lip a or to W depending on whether a < 1 or a= 1. According to Lemma 3 in § 1, the estimate o)f{d) ^ cOy,{Kd) (154) holds for each segment [a', b% where K = max | -7 , 1. [a —a b — o J But if a < 1, then cov(d) ^M6a and, hence, according to (154) coff(d) ^ MKada, i.e., fix) belongs to class Lip a with the coefficient MKa. But if a = 1, then (Oy,(d) ^ if(5(1 + |ln<5|) and (o'f(d) ^ MKb (1 + |ln#<5|). Since 1 + |ln#<5| ^ 1 + |lnJT| + |ln<5| < (1 + |lnJT|)(l + |ln<5|) co'f(S) <MK{\ + |lnJC|)«(l + |ln<5|), which means that f(x) e W. As was already the case with the approximation by trigonometric polynomials, we see also here that the case of a = 1 is a particular one with respect to that of a < 1, which no longer surprises us. But a new particularity of the theorem consists in the fact that the characteristic properties
3. CONVERSE THEOREMS 131 of f(x) do not manifest themselves on the entire segment [a, b] but only on every segment [a', b'] comprised entirely in [a, 6]. This is inherent in the problem itself. For instance, we shall ascertain in § 1 of Chapter VII that for the function a(x) =]/! — x2 (155) the best approximation by polynomials from Hn on the segment [ — 1, +1] satisfies the inequality En<^-. (156) nn At the same time, however, function (155) on the entire segment [ —1, +1] belongs to no class Lip a for a > §. For, in fact, \a(z)-a(i)\ = yr+n.. |1 —a?|« |1—a|«-K' As the right-hand side of this inequality is not bounded in the neighborhood of the point x = 1, no constant M can be given for which \a(x) — o(l)\ ^ M |1 — a?|«. Thus, although (156) is fulfilled, yet <r(x) i Wf or: although yet <r(x) 4 Lip f. Theorem 2 (S. B. Bernstein). // (with the designations hitherto adopted) where p is a natural number and 0 < a ^ 1, there exists at all points of the interval (a, b) the derivative fiP>(x). If a < 1, then f{v)(x) on each segment Wi b'] C (a>y b) belongs to Lip a;ifa= 1, then over each such segment it belongs toW. The complete proof of this theorem is rather extensive and we shall deal with it in the following section. Here we confine ourselves to investigating the simplest case, viz., p = 1 and a < 1.
132 VI. FUNCTIONS AND APPROXIMATION BY ALGEBRAIC POLYNOMIALS As in the preceding theorem, the induced function yp(B) satisfies the condition Thus there exists i£'(0) and it belongs to Lip a. However, fix) = w arc cos r . L h —a J Hence, for a < x < b there exists fix), and ,,, , J 2x-(a + b)] -1 2 f'(x) =y> arc cos 7 , rL b-<* J-,/ r2x-(a + b)Yb-a or ,T 2x-(a + b)l — ip arc cos fix) = L7 *>-*_] (157) y(z — a) (b — x) and all we have to do is to demonstrate that f'(x) e Lip a on* the segment [a', &']. To this end we verify that the product (pi(x)<p2(x) of two functions from Lip a belongs again to Lip a. Indeed, \(p1(x)(p2(x) — (Pl(y)(p2(y)\ ^ l9?2(z)ll<M*) —<ft(v)I + lft(y)l !%(*) — <P*{y)\ Thus, if 4i and A2 are the upper bounds of the functions \i(x)\ and |2(z)l, and Mi and M2 are their Lipschitz coefficients, then \<Pi(x) <p2(x) — (fAy) <p2(y)\ ^(M1A2 + M2AX) \x— y\a. After this remark we go back to function (157). The factors ]/x — a' yb — x have on the segment [a'} b'] bounded derivatives: -1 +1 2}/(x — af ' 2]/(6— x)* ' they belong therefore to Lip 1, and a fortiori, to Lip a. As for factor , v ,T 2*—(a+ 6)1 g{x) = w arc cos — L 6 —« J
4. bernstein's second inequality 133 the same consideration as in Lemma 3, § 1, shows that (ofg(d)^(o^(Kd) holds for it, Jg(8) being the modulus of continuity of g(x) on [a', b'], and 0)^(8) the modulus of continuity of \p'(d) over the entire axis. Thus g(x) e Lip a on [a'} b'] so that/'fx) also belongs to the same class. § 4. Bernstein's Second Inequality To complete the proof of Theorem 2 in the preceding section we need an inequality found by S. N. Bernstein, which is interesting as such. Theorem. If the polynomial of n-th degree with real coefficients P(x) =c0 + clX+ ••• + cnxn satisfies on a segment [a, b] the inequality \P(x)\£M, then its derivative P'(x) satisfies on the interval (a, b) the inequality \P'(x)\^-7= (158) y(x-a)(b-x) Proof. The induced polynomial is of n-th order; moreover, \T(0)\ ^M. Hence, according to the Bernstein inequality proved earlier, \T'(0)\£Mn. (159) On the other hand, r{6) = -P> ^-a)oose + (a + b)y_-aain Q To any value of x e (a, b) a value 6 e (0, tt) is assigned by the relation (b — a) cos 6 + (a + b)
134 VI. FUNCTIONS AND APPROXIMATION BY ALGEBRAIC POLYNOMIALS whence we find 2 and, consequently, :sin0 =y(x—a) (b—z) T'{6) = — P' (x) }/(x—a) (b — x). If we substitute this expression into (159) we obtain (158).2 Corollary 1. If [a'f b'] C (a, b), then for every value of xe [a'} b'] the estimate \P'(x)\ <KMn (160) holds true, where K = msLx\- , - -1 , [a — a b — o J since for these values of x (161) , ' i, ' AK }/(x —a)(b — x) Y{a' — a) {b — V) is valid. Corollary 2. For the same values of x I P(p)(x) | ^ K*>MpPnP. (162) To prove this we decompose each of the two segments [a, a'] and [6', 6] into p equal parts by the points a = l0 < Zj < • • • < lp = a'; b' = mp < mp_l < • • • < ra0 = b and set A{ = [li9mi]. 2 Estimate (158) is optimal. For a = —1,6= +1 and P(x) = cos (n arc cos x) we have r>// x n sin (n arc cos x) ^ *" , ,, - „,, x n P'(x) = v '. For xi — cos — we have therefore P'(x.) = —=====. Vl - xz 2n Vl - x\
4. bernstein's second inequality 135 If now we apply inequality (160) repeatedly, we obtain \P'(x)\£K0Mn [s€j1,Z0 = max|r-!T, }1, \P"{x)\ ^K^Mnin—l) \xeA2iK1 = max Ir-^r-, 11, L 1*2—'*i Wj —m2JJ \Pip)(*)\ ^ Kp.x ..-K0Mn(n - 1) ••• (n — <p + 1) \xeApi iTp_1=max|z —, 1. In view of the fact that a' — a 6 — V li+i — k = ——, m{ — wf+1 = —— it follows from this and from (161) that Thus, for x € [a', 6'] \P(p){x)\ ^KVpPMn(n— 1) ••• (» —■ <p + 1) holds true and, all the more, (162). On the basis of (162) we can now carry out the entire proof of Theorem 2 in the preceding section. To do this we construct (according to the assumptions of this theorem) for each n a polynomial Pn(x) of degree n which satisfies the condition and set Q0(X) = PX(X)9 <?,(*) = P2»(X) - P2n-l(x). Obviously Since f(x) = 2Q»(*)- n=0 \Qn(x) | ^ | Pf (x) - f(x) i + \f(x) - P,-.(*) | =£ 2^ + 2(»_^(p+a) we have 5 \Qu(x)\£ 2np+n
136 VI. FUNCTIONS AND APPROXIMATION BY ALGEBRAIC POLYNOMIALS Since Qn(x) is of degree 2n, we therefore derive from (162) on [a', b'] C (a, b) the estimate The sequence \Qn] (*)I ^ Kp2^r«vp2np = §ca■ <163) n=0 thus converges uniformly over [a', 6'] so that everywhere on (a, 6) there exists the derivative f{p)(x) and i(p) If we also set n=»0 = 0 then we find from (163) that ZQT(x) = un(x), n=0 \fHx)- uu(,)\ < fi^- n=m+l "- - The degree of C/m(x) is 2m — p, which is smaller than 2m; the best approximation E^ of the derivative /(p) (x) satisfies therefore the estimate tf« <r * If n > 2 is a natural number and 2m ^ n < 2m+1, then Zfrn ^A2m<__-T_qT-<-- Thus, derivative /(p)(z) satisfies the assumptions of Theorem 1 in the preceding section. From this we derive the assertions in Theorem 2. Corollary. If the best approximation to a function fix) defined on [a, b] satisfies for all values p the condition lim (n*En) = 0, n—oo then f(x) possesses over the interval (a, b) derivatives of every order. § 5. Existence of a Function with Preassigned Approximations On the basis of the results of § 1 we can apply the Bernstein theorem
6. THE MARKOFF INEQUALITY 137 of the existence of a function with preassigned trigonometric approximations to the algebraic case. To do this we need another Lemma. Suppose that the function 4/(6) e C2ir is even and that a segment [a, b] is preassigned. Then there exists a function f(x) e C([a, b]) such that its induced function is 4/(6). In fact, if we set *t \ T 2x — (a + b)~\ f(x) = y> arc cos —— , then the induced function of f(x) is precisely the function ^(0). On the other hand, f(x) is defined and continuous on the segment [a, b], and thus the proof is complete. From this we can easily derive the following Theorem (S. N. Bernstein). If a number sequence A0^ Ax^ ,4a ^ ♦ #. lim An = 0 and a segment [a, b] are given, then there exists a function f(x) e C([a, b]) with the best approximations En(f) - An (n = 0, 1, 2, . . .). We can find, in fact, an even function 4/(6) e C2ir for which the numbers An represent the best approximations by trigonometric polynomials, and this holds also for the function f(x) e C([a, b]) whose induced function is ^(0). § 6. The Markoff Inequality The inequality (158) gives an estimate for the derivative of a polynomial at interior points of an interval (a, b). At these points it is empty since there its right-hand side becomes infinite. We also frequently need, however, an estimate of the derivative of the polynomial over the entire segment [a, b]. This was found by A. A. Markoff [3]; we write it as follows : 9Mn2 \P'(x)\^- (a^x^b), o — a P(x) being a polynomial of n-th degree whose absolute value on [a, b] has the maximum value M. We shall prove this estimate in the course of this Section. Lemma 1. Let Xk =COs —'— (k = 1,2, ...,») 2n
138 VI. FUNCTIONS AND APPROXIMATION BY ALGEBRAIC POLYNOMIALS be the roots of the Tchebysheff polynomial Tn(x). Then for each polynomial Q(x) e Hn-i the identity holds true. Proof. The expression T»(x) x — xk represents a polynomial of (n — l)-st degree. Let i be one of the numbers 1, 2, 3, . . ., n. Then, for k =£ i Tn(Xi) x{ — xk 0 T (x) holds since x» is a root of T»(x). To find the value of —^-^- at the point x — Xi x = x,- we consider T.(x) X — X, Since !Tn(x) = cos (n arc cos x), consequently, T'n(x) = sin (narc cos x). "J/1 — x2 From (2i — \)n arc cos x» = —-— we further derive (<2i \)n sin (n arc cos x*) = sin = (— l)*""1 and, hence, The right-hand side of the asserted identity is therefore equal to Q(x») at the point x = xiy hence it coincides with the left-hand side.3 But since both the right and the left sides of the assertion are polynomials from Hn-h there follows from the coincidence at these n points that they are fully identical. 3 In other words: the right-hand side of our assertion is a Lagrange interpolation polynomial for the left-hand side.
6. THE MARKOFF INEQUALITY Lemma 2. Suppose that Q(x) e #n-i. Let for x e [ -1, +1] \Q{x)\Yl — x2^l. Then on [—1, +1] the estimate \Q(*)\£n holds true. Proof. First we consider only the values x of the segment xn ^ x fg xl. Since xn = cos ■ 2n — 1 n — n = — cos —= — x± 2n 2n then for x e [xn, Xi] the estimate holds and, thence, 1*1 ^ *i which proves the assertion for x e [xn, Xi]. Suppose now that x satisfies one of the two conditions — 1 fg X < Xn, Xj<Xfgl. Then we apply the identity Q(*) _ proved in the preceding lemma. Since —} 2 (-1)*"1 yr^r^ Q{Xk) x— xk we therefore have \Q(*)\£ yi-4\Q{xk)\^i, Tn{x) ! V x— xk 139 U However (and this is essential here), the differences x — xk have for the value x considered by us one and the same sign, so that we can also write the last estimate in the form \Q(x)\ l n ft T„(x) I w x— xk\
140 VI. FUNCTIONS AND APPROXIMATION BY ALGEBRAIC POLYNOMIALS On the other hand, Tn(x) = 2»-ifj(x-xl), t=i hence n n rp / „\ and, consequently, \Q(z)\^-\T'(x)\. n Now we have only to estimate the derivative T'n(x) of the Tchebysheff polynomial Tn(x). We have »,, , x w sin (ti arc cos x) For arc cos x = 0 we have hence, according to (93) \Tn(x)\^n\ which concludes the proof. Lemma 3. If a polynomial Six) e Hn on the segment [ — 1, +1] satisfies the inequality \S(z)\£M, then its derivative S'{x) on [ — 1, +1] satisfies the inequality \S'ix)\ ^ Mn2. Proof. According to the Bernstein inequality (158) Mn The polynomial does therefore satisfy the assumptions of the second lemma, which yields Mn
6. THE MARKOFF INEQUALITY 141 Theorem. (A. A. Markoff). If a polynomial P(x) e Hn on segment [a, b] satisfies the inequality \P(z)\£M, then P'(x) satisfies on the same segment the inequality ip'(*)i^2. b — a Proof. We set s(u) = pf-a)u + {a + h)}. The polynomial S(u) satisfies on the segment [ — 1, +1] the condition \S(u)\ ^ M. But since holds, then and for S' Lemma 3 holds. Let us note, in addition, that Markoff's inequality is optimal. If, for instance, we set a = — 1, b = +1 and P(x) = cos (n arc cos x), then, as we have already seen, ^, sin (n arc cos x) sin7i0 P'(x) = n i-=— = n . . , Vl — x* sin 0 hence P'(l) = n2. r Markoff's inequality yields (with the same designations) also the following inequality : MOtn |pC»)(x)| ^———m2(« - l)2-..(«-m + l)2. Yet, this is no longer the optimum estimate. Such an estimate was found by V. A. Markoff 4 (a brother of A. A. Markoff). It is written I «»>/ a I < M2m w'(w2 - P) {n* - 22) •• • [»2 - (*» - !)']• |jr WI-(J_a)» (2tn-l)!! 4 V. A. Markoff [11, p. 93.
CHAPTER VII APPROXIMATION BY MEANS OF FOURIER SERIES § 1. The Fourier Series An infinite series of the form oo ^ + 2 (a» cos nx + *» s*n nx^ (164) n=l is said to be a trigonometric series. Theorem 1. If a function f{x) can be expanded in a uniformly convergent trigonometric series, then its coefficients are uniquely defined. Proof. It follows from the assumption that fix) e C2r. If we integrate the equation oo f(x) = A + ]£ (an cosnx + K sinnx) (165) over the segment [—w, w] and consider that n n I cos nxdx = sinnxdx = 0, then we obtain n ff(x)dx = 2tzA, — 71 and, consequently, A = h /"/(*>**• (166) — Tt If we multiply (165) by cos kx (which does not interfere with the uniform convergence) and integrate the equation thus obtained again over the segment [-tt, tt], then Lemmas 1 and 2 in § 2 of Chapter III yield the equation 71 I f(x) coskxdx = naky and, consequently, 1 F ak=— / f(x) cos kxdx, (167) and similarly „ bk=— \ f(x) sin kxdx. (168) tz J —n 142
1. THE FOURIER SERIES 143 Equations (166) through (168) prove the theorem. Definition. Let fix) be any function from C2ir. The numbers A, a*, bk derived from this function according to (166) to (168) are said to be the Fourier coefficients of the function f(x), while the trigonometric series whose coefficients are the Fourier coefficients of f(x) is said to be the Fourier series of this function. Thus we can reword Theorem 1 in the following fashion: Theorem 2. // a function can be expanded in a uniformly convergent trigonometric series, then this series is identical with its Fourier series. This theorem, however, mentions in no way that the Fourier series of any function (fx) e C2ir does really represent this function.1 None the less this is true in a vast majority of cases. To explain it we provfe a Lemma. // a function <p(x) e C2* is orthogonal to all the functions of the system 1, cos x, sin x, cos2a;, sin2a;, ... (169) that is, if all the integrals n n n I (p(x) dx, J (p(x) cosnxdx, I cp(x) sinnxdx equal zero, then <p(x) equals zero. This property of (169) is expressed by defining it to be completely belonging to the class C2ir. To prove this we note first that each trigonometric polynomial T(x) also satisfies the equation I (p{x)T{x)dx = 0. — 71 Now let e > 0 be arbitrary; then, according to the second Weierstrass theorem, we can find a trigonometric polynomial T(x) which satisfies the inequality \T(z)-ip{z)\<e for all x. Moreover, for this polynomial n n l<p2{x)dx= J <p(z)[<p(z) —T(z)]dz. — .-t — n 1 We want to point out that the function f(x) is not necessarily the sum of the Fourier series; the latter may not be convergent at all or it may converge to another value. (As a matter of fact, the latter is impossible in the case of f(x)«C2x as we shall prove subsequently.)
144 VII. APPROXIMATION BY MEANS OF FOURIER SERIES holds true. If now M is the maximum value of \<p(x)\, then f<p(x)[<p{z)-T(z)]dz ^2nMe, and therefore also / tp2(x)dx^ 2nMe. But since the right member of this inequality is arbitrarily small it follows that f(p2(x)dx = 0, which is possible only if <p(x) s 0. Theorem 3. // two functions fix) and g(x) of class C2ir have the same Fourier coefficients, then they are identical. This is true since their difference is orthogonal to all functions (169) and therefore identical with zero. From this we derive a theorem which in many cases warrants the expandability of a function in a Fourier series. Theorem 4. // the Fourier series of a function fix) e C2 converges uniformly, then f{x) is equal to its sum. To prove the above we assume that oo A + ]£ iancosnx + bnsinnx) (170) n«=l is a uniformly convergent Fourier series of f(x); let its sum be g(x). Then it is evident that also g{x) e C2ir. This function g(x) can, by definition, be expanded in the uniformly convergent trigonometric series (170) which, according to theorem 2, is also its Fourier series. Thus series (170) is the Fourier series of two functions from C2m namely, the initial function f{x) and the summation function g(x), so that f(x) and g(x) are identical in accordance with the preceding theorem. Thus, if we can prove that the Fourier series of a function from C2ir is uniformly convergent, then we can say that this Fourier series is a representation of that function. Example. We expand the function rp(x) = |sin a;| in a Fourier series.
1. THE FOURIER SERIES 145 Since \p(x) is an even function, then 1 r 1 ? 2 A = —- \sinx\dx = — sinxdx =—. 2n J 7t J n Further, "* ° an = — / sin x\ cosnxdx = — / sin #cosnxdx, 71 J 71 J -.i 0 from which it follows that 1 " an =— J [sin {n + \)x — sin (n — 1)x]dx 71 6 _ 1 [~C0S(7i + l)x COS (ft— ^xl71 71 L W+l 71—1 J0' and, consequently, 1 ["cos (ri + 1) 7r — 1 cos (ri — 1) 7i — 1] 7Z [ 71 + 1 ft— 1 J' If ft is odd, then COS (ft + \)7l = COS (ft — 1) 7T = 1 , and, hence, an = 0. But if n is even, then a - 2( l M- ^4 " Trlft+l ft— 1/ TC(ft2— 1)' Similarly, we obtain b„ = 0 for every n. Thus the Fouiueii series of \p{x) takes the form 2 4 ["cos 2 x cos 4 x cos 2 w a; ( 1 As a result of the uniform convergence of this series (for it is a minorant of the convergent positive series £ ) we obtain the equation 4n2 — 1/ . . . 2 4 ~ cos2fcx This example is highly instructive since it enables us to estimate the best approximation of the function \p(x) by polynomials from HJ. In fact, if
146 VII. APPROXIMATION BY MEANS OF FOURIER SERIES „ x 2 4 IH cos2kx then Sn(x) is obviously a polynomial from Hi (its order is n if n is even, and n — 1 when n is odd). From (171) we have where m = - + 1. But y_L=ly(_J L_\ *ti.4Jb«-l 2»ti,V2t-l 24 + 1/ whence it follows that 1 1 . r •»• 1 ^ 1 "i t?m iF=T = 2"]™ [»5, 2T^1 ~ k?m 2T+1J =lr r_i 1 1 1 2j?42w-l 2A~ + lJ 2(2m —1) ' thus we have 2 I lv(x)-£„(*)! 7t 2w— 1 "■ ~ !"»"■, , If n is even, then - = - : is n odd then - = . Therefore l_2j 2' L2J 2 a 21-1 + 1 = \n+1> *nis*™> 1 rt ' ' 71, if n is odd. In both cases we obtain a number which is not smaller than n, whence it follows that \v{x)-Su(x)\<— (172) Tin and, a fortiori, Tin
1. THE FOURIER SERIES 147 The estimate (172) enables us, moreover, to carry out the proof, not yet available, of estimate (156) of the best approximation of the function a(x) = yi — x2 by common algebraic polynomials on the segment [ — 1, +1]. The polynomial Sn(x) can be written n S*(x) = £ Ck cOs* x* *»0 On the other hand, the equation | sin x | = yi — cos2 x holds for every x. Inequality (172) can therefore be written ]/l — cos2x— £ ck cos* x whence for — 1 ^ x ^ 1 it follows that <- Tin \yi-x*-zckx» . 2 Tin and we arrive at the estimate 2 Tin Our example also leads us to another interesting inference. If in (172) we replace the argument x by y + - and consider the equations vfy + YJ =|sin(y+y)I = i°08 yl> _ / , n\ 2 4 !§ cos(2*y+M 2 4 4ifc2 — 1 [I] 71 71 Jfea.1 >oo8 2ty ^ *' 4*»-l' we obtain and find the inequality S. ■('+$-£ ak cos* y |cos y\ — ]£ak cos* y *=o 7T71 2 We can reach this result much faster if we consider that |sin x\ is a function induced by <t(x) and resort to Lemma 1 in § 1 of Chapter VI.
148 VII. APPROXIMATION BY MEANS OF FOURIER SERIES If we replace cos y by x, we have - 2 a*xk ^ — nn Thus we come to Theorem 5. The best approximation En(\x\) to function \x\ on the segment [ —1, +1] by algebraic polynomials from Hn satisfies the inequality En(\z\)< — . Tin We will prove below that the order of this estimate, like that of estimate (173), cannot be improved. The rectilinear application of theorem 4 consists in that we actually expand a function/(x) in a Fourier series and then investigate its convergence. In many cases, however, it is possible to predict the uniform convergence of a Fourier series of a function proceeding from the structural properties of that function. Generalized results will be given in the following Section, while a few very simple cases shall already be mentioned here. Theorem 6. // the function f(x) e C2ir has a derivative f'(x) and if f'{x) e Lip a, a being greater than 0, then f(x) can be expanded in a uniformly convergent Fourier series. Proof. We integrate by parts and find 1 l r sin n x i" 1 c «„-=— / fix) cosnxdx = — \f(x) f(x) sinnxdx. 7i J 7i \ n \_„ nn J Because of the periodicity of f(x), the first summand is zero, hence n —nnan = / f (x) sinnxdx. (174) —n We substitute x = t + - into the above and retain the limits because of the n periodicity of the integrand: — 7tnan = I f It H jsin(7i£ +n)dt = — / /' ( x H j sin nxdx. Together with (174) this yields — 2nnan = l \ f (x) — /' I x H UsinTixdx.
2. DEVIATION OF PARTIAL SUMS OF A FOURIER SERIES 149 Since |/'(*)-/'(y)|^ Jf |s-y|« it follows that f U'(x) — f'(x + ^)\sinnxdx\ ^ M(^X f \sinnx\ dx < 2nM% and, hence, , , Mna The estimate is found in a like fashion. These estimates warrant the convergence of the series n=l and therefore also the uniform convergence of the Fourier series oo A + 2 (an cos nx + ^n Sin TIX) n=l of the function/(x). This result together with Theorem 4 concludes the proof. § 2. Estimate of the Deviation of Partial Sums of a Fourier Series As we have seen, the Fourier series of a function f(x) from C2* is in many cases an expansion of precisely that function in a uniformly convergent series, so that in such cases the partial sums of the series can be used as approximate functions forf(x). But since these partial sums are trigonometric polynomials, we may regard the Fourier series as the source of trigonometric polynomials representing the function with any degree of accuracy.3 Thus the problem of the degree of approximation of these polynomials to the function arises of itself. Its solution will be attempted below. 3 Of course this does not apply to any function from C2* but only to such of which uniform convergence of their Fourier series is known.
150 VII. APPROXIMATION BY MEANS OF FOURIER SERIES Lemma 1. The identity .271 + 1 sin—-—a 1 = — + cos a + cos 2a + • • • + cos na (175) 2 sin — o (a 4= 2kn) holds true. To prove this we use the equation . 2n + l . a , f . 3 . al sin —-— a = sin — + sin — a — sin — , [ • 5 .31 ,[. 2n + l . 2n-l ] + sin -- a — sin — a M + sin —-— a — sin —-— a and apply the formula a • t> o • -4 — B A + B sin ^4 — sin B = 2 sin cos —: to the summands on the right-hand side. This leads to the equation . 2n + l sin—2~ a — 2 sin — — + cos a + cos 2a + • • • + cos wa , which coincides with (175). Now let f(z) € C2t and #„(a;) = A + JST (a* cos&x + 6* sin&x) *-i be a partial sum of its Fourier series. If we substitute into it the expressions for the Fourier coefficients A=LSmdt' 1 n 1 n ak =— f (t) cos ktdt, bk =— [ f(t) sin ktdt , we find 1 " n 1 r* £„(£) = — / /(£) dt + Y — / f(t) (cos &£ cos kx + sin &£ sin kx)dt, whence Sfl{X) = li I f{t) [T + j^008 * (' ~~ X)] ^
2. DEVIATION OF PARTIAL SUMS OF A FOURIER SERIES 151 follows. With the aid of identity (175) we further obtain . 271 + 1, 1 r Sm—2~(t~x) £n(*)=o- //(«) : dt. (176) The right-hand integral is known as a Dirichlet singular integral. It permits some transformations which will be useful in the following. If, in retaining the limits because of the periodicity of the integrand, we replace t by u + x, we obtain . 2n-fl sin—-—u An J . u sin- If we decompose the integral in two integrals over the segments [—7r, 0] and [0,7r] and replace in the first one u by —u, we obtain . 2n+l n Sin - U SU(X) = ^ / [/(* + U) + f(x - U)] ^— du. n° 8inT Finally, substitution u = 2t yields 8.(x) = - f [/(* + 2*) + /<*- 2<)] 3ia(2n+Vf it. (177) 7i *' sin t o Lemma 2. For any natural number n ^2 the inequality «/2 TioWs true. »/ sin (2ti + 1) t sin £ dt<2 + lnn (m) Proof. We decompose this integral in two over the segments 0, —^—TTTj J ancl ![ ? and apply to the first one the estimate (93), namely, 2 |sin nZ| ^ n |sin Z|, and to the second the estimate (94), namely, sin t ^ - t, and the estimate |sin (2n + 1) t\ ^ 1. Thus we obtain »/2 i iY2 , i f<ii i i sm« rc .' * -J * 4«+2
152 VII. APPROXIMATION BY MEANS OF FOURIER SERIES whence it follows that TT f ",2|sin(2n + 1)< no sin t dt<±+±ln(2n + l)9 But 2n + 1 < n6, hence In (2n + 1) <1 + In n. We can therefore derive (178) from the last inequality. Corollary. // the function f{x) is bounded, i.e., \f(x)\ ^ M, then the partial sums Sn(x) of its Fourier series satisfy the inequality \Su(x) | ^ M(2 + In n) (n ^ 2). (179) This is true since by (177) we have n J sin (2*1 + 1)* o and therefore (178) leads us to our aim. sin t dt, Theorem 1. (H. Lebesgue [1]). // En is the best approximation of a function f(x) by trigonometric polynomials from H%, then for all x the estimate \Sn(x) - f(x) | ^ (3 + In n) En (n ^ 2) (180) holds true. Proof. We now denote the partial sum of the Fourier series of f(x) by Sn[f] so as to underscore its dependence on f(x). For two functions / and g we thus obtain the evident relation Sn[f-g] = Sn[f]-Sn[g]. We consider now that every trigonometric polynomial is its own Fourier series. In particular, the equation Sn[T] = T(x) holds for a polynomial T(x) e i?J. Now, if Tn(x) is the polynomial of the best approximation to f(x), then \f(x)-Tn(x)\^En. (181) Thus, for (179), we have \Sn[f-Tn]\^(2 + \nn)En or \8u[f] - Tn(x)\ ^ (2 + In n) En. (182) The theorem follows directly from (181) and (182). Since In n increases very slowly, the Lebesgue theorem shows that the approximation of a function by the partial sums of its Fourier series is, roughly speaking, only slightly "worse" than the best approximation.
3. CONTINUOUS FUNCTION NOT EXPANDABLE IN A FOURIER SERIES 153 Moreover, it follows from the Lebesgue theorem that all those functions for which \im(En\nn) = 0 (183) can be expanded in uniformly convergent Fourier series. This estimate, however, is not a particularly suitable criterion of convergence. More appropriate for this purpose is Theorem 2. // the modulus of continuity w( 5) of the function f(x) e C2ir satisfies the relation lim [co(<5) In <5] = 0, (184) a-o then fix) can be expanded in a uniformly convergent Fourier series. Proof. According to Jackson's first theorem in Chapter IV, En ^ 12co hence mil. n I Thus, because of assumption (184), (183) is fulfilled. Despite its shortness, this proof should not be regarded as elementary since it relies upon the deeply significant Jackson theorem. Condition (184) is also known as the Dini-Lipschitz condition. It is fulfilled by all functions belonging to any class Lip a for a > 0. § 3. Example of a Continuous Function Not Expandable in a Fourier Series The criterion of convergence set up at the end of the preceding Section is quite comprehensive. We could verify that, e.g., all functions fulfilling a Lipschitz condition of any positive order also fulfill condition (184), i.e., they can be expanded in a Fourier series. Thus there arises of itself the problem whether in addition to the requirement of continuity any other restriction should be imposed on the function f(x) in order to warrant its expandability in a Fourier series. It appears that it is impossible to renounce other conditions completely. This was shown in 1876 by Du Bois- Reymond with an example of a continuous function not expandable in a Fourier series. Here we give a similar example due to Fejer. e En\nn^ \2co m
154 VII. APPROXIMATION BY MEANS OF FOURIER SERIES Lemma. If , v |"cos x , cos 2x , , cos nx~\ [cos (n - i then + 2)x cos(n + 3)x cos (2m + ■ H ~ r ••* H 2 « iy,,(*)i^4y» lif], is valid for all x, since * cos (ti + 1 — &)a; J!, cos (n + 1 + &)x <Pn(z) = 2 £ 2 jl > which, with the aid of the elementary formula n ft.A+B. B—A cos A — cos B = 2 sin -— sin —-— (185) (186) is converted to <pn(a;) = 2 sin (w + 1) x 21 —r— • From this and from the estimate (118) the assertion follows. Since <pn(x) is a trigonometric polynomial, then formula (185) is simultaneously the "expansion of <pn(x) in a Fourier series." We are now going to deal with the partial sum Sm[<pn; x] of this series. To obtain it we must omit from the right-hand side of (185) the cosines of the angles ix for i > m. This yields [ cos a: cos 2 a: cosmx 1 T + 1 7—z , 7i n — 1 ri + 1 — w bei 1 ^ m ^ 7i, cos x cos 2 a; cos n x n n — 1 1 bei m = n + 1, [cos a? cos 7ia:] ["cos(7i + 2)x — - + -j—J-[ I • sm[<pn; x] + COS X COS bei 7i+ 2 ^ m ^ 2ti+ 1, + 2)a: snxl rcos(ri- + ...+ cos rax 1 m — n— 1J cos (2ti + l)x 1 bei ra ^ 2ti + 1.
3. CONTINUOUS FUNCTION NOT EXPANDABLE IN A FOURIER SERIES 155 These equations show that for any pair of values m, n Sm[(pn; 0]^0 (187) holds. We now define the function interesting us: f(x)=Z^<P2*3(x), (188) which obviously belongs to C2r. The partial sum Sm[f; x] of its Fourier series is, as we shall immediately see, defined by the formula SmU;x] = 2^Su[<p%n*;x]9 (189) In fact, to find the cosine coefficient 1 n ak(f)=— ff(x) cos kxdx (190) 71 J — n we must multiply by cos kx the uniformly convergent expansion (188) and integrate the result. This yields ak(f) = i^2ak((f^). (191) n=>l n If we multiply (191) by cos kx and sum from k = 1 to k = m, we do obtain (189). According to (187) and (189) we have for any pair of values m, n the estimate ^m[/;0]^l^m[W8;0]. (192) Let, in particular, m = 2n\ Since 0 _ _ cos x cos2a; cos nx S.lp.ix\=*—+T—T+...+—T-, then £2»8[?2»3;0] = l+y+-"+^B-. We rate this sum downward. Since 1 [ *1 k J x
156 VII. APPROXIMATION BY MEANS OF FOURIER SERIES it is obvious that 2»3+l S2n» [q>2n»; 0] > / ^ = In (2n* + 1) > n3 In 2. 1 From this and from (192) £2»8[/;0] > 7i In 2 (193) follows so that the Fourier series of the function f(x) diverges at the point x = 0. By a more cumbersome construction we can also derive a function whose Fourier series diverges on an infinitely great set of points. However, the problem of the existence of continuous functions the Fourier series of which diverge anywhere has not been solved to this day. The reader may find it interesting to note that the Fourier series for a continuous function can at no point x converge to a value other than/(x). In other words, the Fourier Series of a function f(x) e C2lr either diverges at a point x or it converges to fix). This, however, will be proved later.
CHAPTER VIII THE SUMS OF FEJER AND DE LA VALLEE-POUSSIN § 1. Fejer Sums We have seen that the partial sums Sn(x) of the Fourier series for a function /(x) e C2ir do not always converge to the function value. Proceeding from the partial sums Sn(x), however, we can arrive at polynomials which converge uniformly to the function. Theorem 1. (L. Fejer [1]). Suppose that f(x) e C2ir and that Sn(x) are the partial sums of its Fourier series. If we set „ ,. S0(x) + Sl(x) + — + Sn_l(x) <7n(x) = , n then the sums <rn(x) (which we call Fejer sums) converge uniformly along the entire axis toward f(x): lim an(x) =f(x). n-*oo Proof. Every sum Sk(x) can be represented as a Dirichlet integral: 1 T , , rt sin (2k + l)t 7 Sk(x)=- / [f(x + 2t) + f(x-2t)] \ J dt. 7t J sin i o Whence it follows j_ ?m* + 2t) + f{*-2tM»-i idt nn J sin« [ito J o But1 2'1sin(2jfc+l)<=^- (194) and, hence, an(x)=— f [f{x + 2t) + f(x- 2t)](S-^)2dt. (195) nn J \ sin t J 1 Since 2 sin A sin B = cos (A — B) — cos (A +5), then n-i n-i 2sin*^sin(2fc + 1)* = ^[cos2fc* - cos (2k -f 2)*] = 1 - cos2n* = 2 sin2 ntf whence (194) follows. 157
158 VIII. THE SUMS OF FEJ^R AND DE LA VALL^E-POUSSIN The integral on the right is known as a singular Fej^r integral. Thus, a Fejer sum of any function f(x) can be represented by this function with the aid of a Fej^r integral. In particular, this is valid also for the function fix) ss 1 for which oo i + io is the Fourier series. For it all the sums Sn{x) as well as all the Fejer sums <rn(x) are identical with one. Hence if we apply (195) to this function, we obtain the identity l=±T2(sjnntVdt nn J \ smt J Now, we multiply (196) by f(x) and subtract the result from (195): I c /sin 7it\ an(x) - f(x) =— / U(x + 2t) + f(x-2t) - 2f(x)] (-r—- dt. (197) nn J \ sin t / For every e > 0 there exists a 5 > 0 such that from \x" -x'\ <2d there always follows !/(*")-/(*') I <j If we decompose the integral (197) into two integrals over the segments [0, 8] and 5, - , then in the first one |/(*+2*) + /(*-2*)-2/(*)| <e, hence, by (196), (5 \n i- [ [f(x + 2t) + f(x - 2«) - 2f(x)] l^fdt \n£ \ sin t ) ■ r /sin nt\ n J \ sin t J o e r /sin nt\2 T e o In the second integral we have \f(x + 2t) + f(x - 20 - 2f(x) | ^ 4M,
1. FEJ^R SUMS 159 where M = max \f(x)\ ; moreover and hence /sin nt\2 1 \ sin t ) = sin1 l- J [fix + 2t) + f(x - 2t) - 2f(x)] (^ffdt n~6 ^ 1 ±M n 2M n n sin2 <5 2 n sin2 <5 Altogether this yields \an(x)-f(x)\ <--+ e 2iJf 2 ' n sin2 5' But for sufficiently great values of n 2M e < n sin2 <5 2 and hence l*.(*)— /(*)I <£> which proves the theorem. Remarks. 1. Since <rn(x) is a trigonometric polynomial, this theorem is a further proof of WEIERSTRASS, second theorem. 2. The completeness 2 of the trigonometric system in C2ir follows directly from the theorem proved. In fact, if the function f(x) is orthogonal to all the functions (169), then an(x) = 0, hence fix) = lim <rn{x) = 0. n—► <» 3. Let |/(x)| ^ M for all values of x. Then, from formula (195) it follows that «/2 v " ~ nn J \sin« / which with the aid of (196) is transformed in |o-»fx)| ^ M. This result can be formulated as follows: 2 An identical proof of completeness is also given if we rely on the uniform approximation of f{x) by the de la Vallee-Poussin integral: #i T" %2 T" * * * T" ®n y* = — — • n
160 VIII. THE SUMS OF FEJ^R AND DE LA VALL^E-POUSSIN Theorem 2. The absolute value of a Fej^r sum of a function never exceeds the maximum absolute value of that function. Now, we only need a simple lemma from the theory of limits: — 71 Lemma. // the sequence {xn} has a limit, then the sequence xl ~T x2 "1 "~r xn y.= has the same limit. Proof. Let Km xn = I. n-*oo For a given value of e > 0 we can therefore find an index n0 such that \x„-l\ <^ always follows from n > n0. For these values of n we have ■ j, ^l*i-l| + " + l*--*l , Ix^+i —'I + " + |s,-l| n n Hence, if we set A(e) = \xl-l\ + --- + \xno-l\, we obtain i*-n<-ir+7- If we take n sufficiently large, then A(e) e_ 71 2 hence This proves the lemma.
2. SOME ESTIMATES OF FEJ^R SUMS 161 From this lemma and from Fejer's theorem we derive the theorem mentioned at the end of the preceding chapter. Theorem. // the Fourier series of a function f(x) e C2ir converges at a point Xo then its sum is equal tof(x0). In fact, if its sum is A, then, according to the lemma, the sum an(x0) also tends to A. But since the limit of <rn(xo) isf{x0), we have A = /(xo). § 2. Some Estimates of Fejer Sums Theorem 1. (S. N. Bernstein [3]). Let f(x) e C2v and f(x) e Lip^ for 0 < a < 1. Then C M \on(x)-Hx)\<-^- (19S) lb is valid for all values of x} the factor Ca depending only on a. Proof. If we apply the assumptions to formula (197), then \f(x + 2t) + f{x — 2t) — 2f(x) | ^ 21+*Mta, 4 and also sin21 > — t2. Thus we obtain 7T2 w/2 • / v i, v. n ** t sin2 nt . \a.(x)-f(x)\<-^¥=-aMj -^-dt, 0 which by substitution of nt = z changes into st/2 n«/2 /sin2 7i^ , , f sin2z _ o 6 Whence it follows that -f-oo i / v j / v. M 7i f sin2 2 _ 0 which proves the theorem. To gain a clearer idea we also estimate the factor Ca. It is obvious that + oo 1 -f-oo /" sin2z f , f dz 2 and, hence, c <-**-. 1 — a2
162 VIII. THE SUMS OF FEJ^R AND DE LA VALL^E-POUSSIN In juxtaposing this Bernstein theorem and the results of Chapters IV and V we are tempted to conclude for the functions of class Lip a for a < 1 that the deviation of the Fej^r sums decreases at the same rate as the least deviation. Thus formulated, however, the conclusion is incorrect since for individual functions of class Lip a the least deviation En decreases considerably faster. Such functions are, e.g., those of class Lip 1 which also belong to Lip a with a. < 1. Nonetheless, for every Lip a the rate of decrease of the Fejer sums is, as a whole, coincident with that of En. Let us take a closer look at this assertion. We found in § 6 of Chapter V that the density Tn(a) of the trigonometric polynomials from Hi in class Lipi a satisfies the conditions Now, we set ™*™*£- <199> An{a) = sup {max \an{x) — f(x) |}. Accordingly, we call this quantity introduced by S. N. Nikolskiy [1] the degree of approximation of the entire class Lipi a by Fej^r sums. Nikolskiy showed that for a < 1 the asymptotic relation . an nv ' n{\ — a) 7ia na is valid, pn tending to zero with increasing n. From inequality (198) we find directly na for a < 1, but since Tn(a) ^ An(a) then, finally, l^An(a)c Ca Tn(a)-K(a)m This relation shows that for the entire class Lip a the deviation of the Fej^r sums decreases generally at the same rate as the least deviation. For a = 1, however, this is no longer true. In this case the following theorem is valid: Theorem 2. (S. N. Bernstein). For every function f(x) e C2ir belonging to Lipjf 1 the estimate \an{x) - f(x) | < — (n > 1) (200) holds} A being an absolute constant.
2. SOME ESTIMATES OF FEJ^R SUMS 163 Proof. Again we employ formula (197). With the aid of the inequality \f(x + 2t) +f{x — 2t) — 2f{x)\ ^4Mt it yields the estimate \ou{x)—f{z)\£ / tl-T—) dt. nn J \ sin t ) o 2 Since sin t ^ - t we derive from it T w/2 ,, t ^,nM [ sin2 nt _ \on{x)-i{x)\ ^ — J —r~dL 0 But »/2 nn/2 »/2 nn/2 /sin2»$- fsin2z_ f , , / dz n , . ./ -i-* = J ~rf2<J *2 + j -=J + ]nn- boo ?r/2 Now, if n > 8, then In n > -, whence 2 so that finally3 we have w/2 "sin27i^ _ rtl dt < 2lnn) t o , , v ., xl 2jiM\rin \°n{x)-f{x)\< . The estimate (200) is optimal. This is proved by Theorem 3. Suppose that the function f(x) e C2r has at the point xo a finite derivative f+(xo) on the right and a finite derivative /_(a*o) on the left Then Hm U><*> -/W]| =f+(a:°)-/-(X°). (201) To prove this asymptotic formula we note, in the first place, that |im/(*,+ 2,,-/W *-+o ^ sin t 3 The last inequality shows the correctness of estimate (200) only for n ^ 8. But by increasing A we can easily obtain (200) for every n ^ 1.
164 VIII. THE SUMS OF FEJ^R AND DE LA VALL^E-POUSSIN and, therefore f(x0 + 2t) — f(x0) = 2/V(x0) sin* + a(t) sin*, a(t) tending to zero with t. Similarly, f(x0- 2t) - f(x0) = -2fL(x0)smt+P(t)smt, P(t) tending to zero with t. Together with equation (197) these formulas yield 7t/2 Tt/2 / v // v ^ /, / v. T sin27i^ T 1 f , sin2 nt _ 7i n J sin 2 ti J sin £ o o wherein we set y(0 =-[«(')+ £(*)]. Thus the assertion follows if we can prove the correctness of the two relations nm 2 T«^ntdt = lf (2Q2) n^oo In n J sin tf 2 t sin2 l n J sin o 2 lim y(<): o ji/2 sin 7i/ lim — I yU)— dt = 0 . (203) ^oolnnj rK' sin* v o To obtain the first of these equations we replace in formula (90) t by 2t and find the identity sin* nt = [sin 2 + sin 32 + ••• + sin (2n — l)t] sin t , whence it follows that »/2 /sin2 sir ntdt = 1+L+L+...+. l sint 3 5 ' 2ti— 1 6 But for k g x g k + 1 2fc + 1 ~2x— 1 -2fc— 1 By integrating these inequalities we find *+i i r dx i < « 7<; 2k + 1 / 2x— 1 2fc— I'
2. SOME ESTIMATES OF FEJER SUMS 165 i.e., and Hence 11 1 f dx 3 + 5 H b2n—l< J 2x—l n+l /dx <l + 3- + 2n— 1* n+l / dx I 2x— 1 <l+y + 2n— 1 i <l+r a. / dx or, which is the same, 1 ln(2n+l)<l+i + ---+^-T<l+4-ln(2w--1)- o Zn — I z Thus we have 7J/2 — In2» < 2 / sin £ / sin2 »* , , 1 1 _ / —: d* < 1 + — ln2n, / sir * ° whence (202) follows. ° Now, any variable representing a finite limit is bounded. For all values of n 7ll2 It 1 r sin2 ni Inn J sint dt<K (204) is therefore valid, K being a constant. For any e > 0 we can find a 5 > 0 such that on the segment [0, 5] lyWI <e everywhere. But then »/2 1 f , ain2n« . s\nt < 6 */2 e /'sin2 ri£ _ , 1 f, , x In 7i J sin £ In ?i J sin2 7i£ sin£ dt. We can readily see that the function 7^) is bounded 4 on the segment 5, - , y(t) = a{t) + ^ n f(xQ + 2t) - f(xQ) - 2/+ (rr0)sin^ 4 Since and, e.g., a(t) = - sin t
166 VIII. THE SUMS OF FEJ^R AND DE LA VALL^E-POUSSIN hence \y(t)\ < A. If we consider (204), we obtain Tt/2 sin27i* y(t)—;—-at \nnj ' sint <Ke 2 sin d In n and, for sufficiently great values of n, finally */2 1 t , v sm2 nt 7 / y(0— <^ In 7i / /w —' sin£ <(K+l)e. This proves the theorem. Relation (201) can also be written in the form ~ i°n{Xo) - f(x0)] = 1 [U(X0) - /L(X0)] + Qn , In n Km pn being equal to 0. Hence it follows that In n, an(xo) - f(xo) = — [/'+(*<>) - /'-(*<>)] + *.. wherein rn becomes infinitely small of higher order than In n n Theorem 4. (S. M. Nikolskiy). // we take all the functions f(x) with period 2t of class Lipi 1 and set An(l) = sup {max |an{x) — f(x) |}, then the asymptotic formula5 2 In n Inn is valid, lim pn &6in<7 6#waZ to 0. n—► <» In fact, by reason of (197), we have \°n{x)-f{x)\ nn */2 [ /sin7iA2, 5 S. M. Nikolskiy [1] gives this remainder in even more accurate terms.
3. DE LA VALL^E-POUSSIN's SUMS 167 so that An(l) is not greater than this integral. On the other hand, the function 6 with period 2tt, defined by 6(x) = \x\ on the segment [—?r, tt], belongs to Lipi 1. Hence Tt/2 ^„(1) ^ max\an(x) - 0(x)\ ^ a„(0) - 0(0) = ~ f tl^^fdt. nn J \ sin t ) Therefore ° An(l) = <t„(0) - 0(0) and, by (201), e'+W —0'_(O) 2 Hm [i^„(l)l n-^oo L w J n which is identical with the assertion. Let us, in conclusion, point out an interesting fact. Whereas the Fourier sums do not converge for every continuous function, the Fejer sums do. From a certain viewpoint the Fejer sums are therefore more "useful" than the Fourier sums. This, however, is not true with regard to the quality of their approximation. In fact, if the structure of the function to be approximated is very good (e.g., it has all the derivations), then the approximation by the Fourier sums is scarcely worse than the optimal one. They represent the function with a degree of approximation which is all the greater the better its structure. But the sums (rn(x) yield, as a rule, only one approximation of order -, and no improvement of the properties of the function n will improve this degree of approximation. For the function we have, e.g., x f(x) = 1 — cos x = 2 sin2 — 4 f 1 or (0) = / sin2 ntdt = — , o although f{x) is an integral function (i.e., it can be expanded in a power series with an infinitely great radius of convergence). To express it more freely, we may say that although more comprehensive in their scope, the Fejer sums yield rougher approximations than the Fourier sums. 3. De La Vallee-Poussin's Sums The Fourier sums Sn[f; x] of a function f(x) have the important property to coincide with the function f(x) for n ^ m when this function is a poly-
168 VIII. THE SUMS OF FEJ^R AND DE LA VALLEE-POUSSIN nomial from H^. The Fejer sums have no such property. Instead, they satisfy the inequality |orn[/; x]\ ^ max|/(x)|, while the Fourier sums do not. There are sums rn{x) = rn[/; x], however, which have both properties. They were specified by de la Vallee-Poussin.6 He sets r ,„, _ Sn(x) + 8u+l(x) + ••• + S2n_l(x) We immediately see that for f{x) e Hi and n ^ m t.[/; *]=/(*) always holds true. On the other hand, by the definition of <r»( x) S0(x) + 81(x) + ••• + 8k^(x) = kak(x). Whence it follows that Sk{x) = (k + 1) ok+1(x) — kak(x) and, therefore, S»{x) + Sn+l{x) H + ^2n_i(^) = 2no2n(x) — nan(x), so that rn(x) = 2a2n(x) — an{x). From this equation and from the above-mentioned property of the Fej^r sums it follows that \rn[f;x]\ fS3max|/(x)|, (205) the existence of the factor 3 being of slight importance. Theorem 1 (de la Vallee-Poussin). If f(x) eC2ir) then the deviations of de la Vallee-Poussin's sums rn{x) of this function satisfy the inequality7 \ru(x)-f(x)\ ^±En) (206) 6 de la Vallee-Poussin [3], p. 34. 7 The reader should bear in mind that rn(x) is not of n-th but of (2n — l)-st order. These sums are therefore by no means the solution to the problem of finding in H £ polynomials such that their deviation decreases in the order of magnitude of En, a problem which is unsolved to this day.
3. DE LA VALL^E-POUSSIN's SUMS 169 En representing the best approximation to f(x) by trigonometric polynomials belonging to Hn. Indeed, the sums rn{x) have the property rn[f — g;x] = rn[f; x] — rn[g; x] in common with the sums Sn(x). Let now T(x) be the polynomial of the best approximation to f(x) belonging to Hi so \f(x)-T(x)\^En. (207) By reason of (205) we have therefore \rn[f-T;x]\ ^3En or, which is the same, \rn[f;x]-rn[T;x]\ ^3En. But since rn[T; x] = T(x) It follows that \rn[f;x]-T(x)\ ^3En- (208) Formula (206) follows then from (207) and (208). Following is an application of this theorem. We showed in § 1 of Chapter VII that the Fourier series of the function \p(x) = |sin x\ has the form 2 4 ~ cos2kx 2j 7 Hence But n n *=i4&2 — 1 4 ~ cos2fcx V(x) — Sm(x) = ~ £ 4ta i- , v r . 2^^(x)->Sm(x) y>(x) — t„|>; x] = 2j z
170 VIII. THE SUMS OF FEJ^R AND DE LA VALL^E-POUSSIN If, in particular, we set x = 0, we obtain 4 2n-l [ oo i \ Each of the n sums oo l 2 TU r (m = », »+1, ..., 2»—1) [ml , *# — 1 is greater than the sum therefore JSi 4t« - 1 ; lv(0)-T.lv;0)|>i- j? l n k=n+i *k2 — 1 and, by (206), ,*, ^ 1 S 1 K(y>)>- Z n k-n+i 4fc2 — 1 * We computed already in § 1 of Chapter VI that f, _J__ = _J__ t£i4*a—1 2(2m—1) and, hence, *M>I«£+Ty (209) This is an estimate which we already mentioned in § 3 of Chapter V. It leads to a further interesting inference according to which for every trigonometric polynomial U{x) e H% the estimate max 11sin x\ - U(x)\ > * + ^ (210) holds true. Now let T{x) e #J. We can then apply (210) to 17(„ = *(*-»).
3. DE LA VALLriE-POUSSIN'S SUMS 171 If 2/0 is a value of the argument for which the maximum value in (210) is reached, then Isinyol ,(,-!) > 1 2n{2n + 1) or |cosx0| — T(x0)\ > 2n{2n + 1) if we set y0 = x0. Thus, for every polynomial T(x) belonging to Hi the deviation from the function |cos x\ is greater than 1 2n{2n + 1) This is true, in particular, for all even polynomials T(x) = % ck cos*x. In other words, we have always max |cos x\ — 2 ck cos*x > 2n{2n + 1) irrespective of how we select the coefficients c*. Thus we obtain the following theorem which is a complement of Theorem 5 in § 1 of Chapter VII. Theorem 2. The best approximation En to the function \x\ on the segment [ — 1, +1] by algebraic polynomials belonging to Hn satisfies the inequality *■<'«'> >2^r+ir
CHAPTER IX THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS § 1. The Concept of Analytic Functions The simplest way of defining the concept of analytic functions is with the aid of complex function theory. Nonetheless we shall continue in the following to avail ourselves, whenever possible, of real auxiliary means only. Definition. A function f(x) defined in the segment [a, b] is said to be analytic in this segment if there exists a positive number R such that for every value of x0 e [a, b] we can find a power series oo 2 Ck(X0) (X— X0)k convergent for \x — x0\ < R which represents the function at all points belonging simultaneously to [a, b] and (x0 — Ry x0 + R), i.e., at which fix) = It c*(Zo) {x - z0)*. k=0 (We could dispense with the independence of the number R from xo but this would not widen the domain of the class of functions investigated. This is obvious to any reader familiar with Borel's covering theorem.) We denote by A ([a, b]) the class of functions analytic in the segment [a, b]. Because of the elementary properties of power series, every function of class A([a, b]) is differentiate any number of times, whereas the converse is not true. For example, the function l e~x~\ which in the segment [0, 1] is differentiate any number of times, cannot be expanded in a series of powers of x. In fact, if e"*"-2 = c0 + cA x + c2 x2 + ..., then for n = 0 we would first have c0 = 0. This results in 1 _ -2 — e x = c1 + c2x + c3x2 + ••• , 1 For x = 0 we set e-*"2 = 0. 172
1. THE CONCEPT OF ANALYTIC FUNCTIONS 173 wherein the passage to limit x —»0 yields the equation ci = 0. By continuing this procedure all the coefficients ck would become equal to zero, which is obviously absurd since e~x~x =# 0 for x =*= 0. Thus the existence of all the derivatives does not quite fully characterize an analytic function, the latter being so-to-speak "better" than a function differentiable any number of times. And again, an analytic function is all the "better," the greater the number R appearing in the definition. If R = + oo f the function is said to be an entire function. In analogy with class A ([a, 6]) we introduce the following Definition. Let f(x) e C2ir. If there exists a number R > 0 such that to each point Xo there corresponds a power series oo 2 ck(Xo) (x~ *o)k convergent for I a; — x0| < R which in this interval represents the function/(:r): f{x) = Zck(x0)(x— x0)k, we say that f(x) belongs to class A2r. If, moreover, R = + oo, then f(x) is said to be an entire function. We denote the class of entire functions with period 2w by G2r. Every function of class A2lr also belongs to class A([0, 2w]). It would be wrong to believe, however, that A2ir is an intersection of classes A([0} 2t]) and C2t. For example, the function with period 2-k which in [0, 2w] coincides with (x — x)2, exi ts in the above intersection but does not belong to A2». Of course all the functions of class A2ir are differentiable any number of times. Theorem 1. A function f(x) with period 2w differentiable any number of times belongs to A 2lr if and only if for all the pairs of values x, m the inequality \f(m)(x)\ <Ammm (211) is fulfilled, A being a constant independent of x and m. Proof. Let f(x) e A2t. Because of the periodicity it suffices to prove the inequality (211) for 0 g x ^ 2tt. We can take the natural number N so large that 2n R
174 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS (here R is the number appearing in the definition of class A2w). Now we set 2tz . Xi=~Wl (* = 1,2, ...,N). To each point Zi there corresponds a power series oo Z ck{xi){x— x{)k convergent for \x — Xi\ < R. It is known from the elementary theory of power series that they are absolutely convergent, hence the N positive series 00 / R\k (i = l,2,...,i\0 (212) are convergent as well. Further, let S > 1 be a number greater than the sum of each series (212). Now, if a; € [0, 2w], then there exists an index i for which , R For this index we also have *=0 Consequently, f(m)(x) = 2ck{Xi)k{k— 1) ...(jfc — m+l)(x — *,)*-«• and, therefore, \fmKx)\^2\^(Xi)\m[—) , whence it follows that M-WIS&T.I1*'*"?^)' But since OO /v.71 »=ow!
1. THE CONCEPT OF ANALYTIC FUNCTIONS 175 then for z > 0 holds, hence x" e*>^\ («3> m -f- <m\ <mm and finally (9p\m oo /7?\* ~i) mmk?JCk{x^\2J• Thus, all the more, \fm){x)\ <S(-£\ mm (214) and because of S > 1 also -¥)mm- This proves the necessity of condition (211). The proof of its sufficiency is easier to conduct. For any two values of xo and x we have f{x) ^^^{x~ Xo)k+f-^{x-X°r' (215) The remainder is, by (211), not greater than Ammm. \x— x0\m, m! which, by (213), is not greater than (Ae\x— x0\)mt For |x — x0| < — the above expression tends to zero as the exponent m Ae increases, so that for these values of x the expansion holds. This proves the theorem.
176 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS Remark. For further purposes we make the remark here (which follows from the proof just conducted) that if estimate (211) is valid, the constant R need not be taken smaller than —. Ae Furthermore, we can also set R ^ — if the inequality Ae \f(m)(x)\ ^Ammm holds not for all the values of m but only for m > m0 since in equation (215) we can select m > m0 at the outset. Since (211) can also be written in the form m y^<A m = ' wherein Mm = max |/(m)(x)|, we can reword the theorem just proved as follows: Theorem 2. We have fix) e A2xif and only if the quantity m is bounded. (216) It will be shown that it is precisely this quantity to enable us to determine that a function belongs to class G2x of entire functions. Theorem 3. f(x) belongs to G2x if and only if m IiniV^-=o. (217) m-+oo 'ft' Proof. Let fix) e G2r. If e > 0 is preassigned arbitrarily, then we choose R > 0 so large that 2e If we fix R and follow literally the same considerations as in the proof of Theorem 1, we again get to inequality (214), which we can also write in the form m ^=<7S^. (218) m f R
2. BEST APPROXIMATION TO PERIODIC ANALYTIC FUNCTIONS 177 Since with a fixed R also S is a fixed quantity, therefore lim ]/S = l. For sufficiently great values of ra, however, the right-hand side of (218) becomes smaller than e, so that (216) is also smaller than e. This proves the necessity of condition (217) for f(x) e G2ir. That it is also sufficient is proved basically in the same way as in Theorem 1. In fact, if the condition is fulfilled, then for an arbitrarily small pre- assigned positive quantity A we can find a natural number m0 such that for m > m0 the inequality m. m holds. From this \f(m)(x)\ < Ammm follows. By the remark to Theorem 1, R is then not smaller than (Ae)~\ hence R = + °°. The two theorems just proved can also be applied to functions defined on a segment [a, b]. With a literally identical proof we thus come to Theorem 4. Let f{x) be defined on the segment [a, b] and be differentiable any number of times. In this case the function f(x) belongs to class A{[a} b\) if and only if its derivatives satisfy the conditions \fm)(x)\ ^ Amrnm. A being a constant independent of m and x. For this function also to be integral the condition limfe = 0 m-*oo nx is necessary and sufficient § 2. S. N. Bernstein's Theorems on the Best Approximation to Periodic Analytic Functions The following two theorems on the order of magnitude in which the best approximation to periodic analytic functions by trigonometric polynomials decreases are due to S. N. Bernstein [4, 5]. Theorem 1. A function f{x) e C2t belongs to class A2w if and only if K(f) ^ Ktf (219)
178 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS where K and q < 1 are constants. Theorem 2. This function belongs analogously to class G2r if and only if lim ^El(f) = 0 . (220) n—►oo We will prove both theorems at the same time. Suppose that/(x) e A2r- Then this function can be expanded in a Fourier series. If we denote the partial sums by Sn(x) we have ElZ{f)£mEx\f(z)-Su(z)\ ^ £ (1**1 + 1**1)- Now we estimate the coefficients a* and bk. To do this we integrate by parts m times the right-hand side of the formula 1 n ak=— if{x) cos kxdx . n J Owing to the periodicity all the terms vanish except those under the integral sign so that, depending on whether m is even or odd, we obtain the formulas 1 r Uk = ±—jfi J f{m)(x)coskxdx —n 1 n a* = ±—gk ff(m)(x)sinkxdx , from which 1*1*^ <221> follows. According to Theorem 1 in § 1 there exists a constant A for which Mm ^ Ammm. Thus we obtain 2Ammm \ak\^-^-. (222) Now let k > 2 A and m = |-jj . Then
2. BEST APPROXIMATION TO PERIODIC ANALYTIC FUNCTIONS 179 From this and from (222) it follows that \ak\ < 4g*, where we have set q = 22A . (223) The coefficient 6* is estimated in a like fashion, so that for w ^ 24 we have *-n+1 i — q Now let K be a number which, first, is greater than —— and, secondly, 1 - Q is greater than any of the fractions E[(f) B[U) El(f) g° ql q*o n0 denoting the largest natural number below 2A. This fulfills the estimate (219) for all values of n and proves the necessity of this condition if fix) is to belong to A2r. Now let f{x) e G2ir. Since it follows from this that fix) e A2ir) the estimate (221) is fulfilled. Further, by Theorem 3 from § 1 lim1^" = 0. m We can therefore find for any minimal value of A > 0 an index m0 such that Mm < Ammm is valid for all indices m > ra0. If now we take A to be a constant, we obtain for m > m0 the estimate (222) above. If we choose k so large that I — > m0, we again return to the two inequalities \ak\ <4g*, 16*| <4g*, g having the value (223). For n^n0=im0 + 1)2A we therefore have 2 (\ak\ + \h\) <8 I qk *-n+l *-n+l and, a fortiori
180 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS If e > 0 is preassigned arbitrarily, we can choose A > 0 such that or r«(/)<«|/7^' (224) -1 q = 22A < e . Let our constant value of A satisfy this condition. Then we take the number no defined above to be constant, so that n0 = n0(f). For n ^ n0 (224) is fulfilled. But for n > nx the right-hand side of (224) is smaller than e (since with increasing n it tends to q), hence for n > N = max (n0, n\) y nut) < £, which proves the necessity of condition (220) for f(x) to belong to G2ir- We now proceed to show that both conditions are also sufficient. Let (219) be fulfilled first. The polynomial of the least deviation from/(x) in Hi is denoted by Tk(x). Then \f(x)-Tk(x)\^Kq*; it follows that f(x) = T0(x) + £ [Tk(x) - TWx)], which by formal differentiation leads further to the equation f(m)(x) = jr* [Tk(x) - TWx)]^) (225) (the term T^\x) is zero since To(x) is a constant). If we bear in mind that \Tk(x)- Tk_Y{x)\ ^ \Tk(x)-f(x)\ + |/(s) - n_i(x) I ^ Kqk + Kq*-i = Lq* , then Bernstein's inequality (110) yields the estimate \[Tk(x) - Tk_x(x)^\ ^ Lq*k™. (226) Series (225) uniformly converges to it, which subsequently justifies the term- by-term differentiation. Thus we obtain \f(mHx)\ ^L jrkmqk (227)
2. BEST APPROXIMATION TO PERIODIC ANALYTIC FUNCTIONS 181 from (225) and (226). Now we estimate the sum on the right-hand side. To this end we differentiate the identity *=o 1 — x m times with respect to q and find JTk(k— 1) ...(& —m + !)£*- *~ x ' x ' ~,z (i_^+i- If we leave out the first m terms, we have oo ymrrhmam fc£i(»_1)...,»_»+l)i.<(_ja. k Now, if k ^ 2m, then k — m + 1 ^ - ; whence it follows that On the other hand, the maximum value which the function assumes for is smaller than Hence xmqx m In q 1 \m ro = -w(-rn^J (228) 2m—1 2m — 1 / 1 \m / 1 \m £ *v < £ «-(- nr,j < 2-w+1 (- i^j • (229) It follows from (228) and (229) that oo r / i \m 2mom i by reason of (227) [/ 1 \m 2mam 1
182 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS With the aid of the elementary inequality yA + b ^ fA + y b it follows that yi/w(x)i m/j: nrvt ' m . y2m 2q 1 In q 1 — qy y\ - q\ With increasing m, the limit on the right of this inequality is lng 1 — q Thus, for sufficiently large values of m m. y\fm){x)\ L \nq 1 — gJ m and \fm){x) | < Ammm (m > m0). A can be so increased that this last equation is also valid for the finitely many values of m ^ m0. This proves that/(x) e A2, and the proof of Theorem 1 is completed. To complete also the proof of Theorem 2, we go back to the remark in § 9, by which the number R appearing in the definition of class A2x need not be taken smaller than 1 1 (230) Ae L In q 1 — q\ in our case. But if condition (220) is fulfilled, then for a sufficiently small q we can find a number n0 such that for n > n0 Elit) < qn and, hence, a constant K such that the inequality ETnif) < Kqn is satisfied for the same value of q and for all values of n. From the proof above it follows that f(x) e A2in R not being smaller than the number (230); but since this number can be taken arbitrarily large by choosing q sufficiently small, then R = + °° and f(x) e G2x.
3. THE BEST APPROXIMATION TO FUNCTIONS ANALYTIC ON A SEGMENT 183 § 3. The Best Approximation to Functions Analytic on a Segment Bernstein's theorems in the preceding Section can be applied to the theory of approximation by means of algebraic polynomials. Theorem. Let f(x) e C([a, b]) and En = En(f) be the best approximation to f(x) by polynomials belonging to Hn. In this case f(x) e A ([a, b]) if and only if En <Kq«y (231) where K and q < 1 are constants. Moreover, f(x) is an integral function if and only if lim|^ = 0. (232) n-»oo The simplest way to prove this theorem is by means of the theory of functions with a complex variable; readers knowing this theory may look for the corresponding computations either in S. N. Bernstein's book 2 or in the well- known handbooks by V. L. Goncharov 3 and J. S. Besikovich.4 Here we conduct the proof with real values only. Although more difficult, it requires less preliminary knowledge from the reader. That conditions (231) and (232) are necessary can be proved in a fairly simple fashion. As we shall see later, the induced function f{6) /r(6-«)cose + (o + &)1 belongs to A2ir if f(x) belongs to A ([a, b]) and, specifically, \p(d) belongs to GW if /(z) is an integral function. But by Lemma 1, § 1, Chapter VI, El(y) =En(f), so that the necessity of conditions (231) and (232) appears as a direct consequence of the theorems of the preceding Section. Now there remains only to show that from the conditions imposed on fix) it follows that y(0) €^42*? V(0) €#2*. We remind ourselves of the equation y)(6) = <p (cos 0), 2 S. N. Bernstein [10], p. 75 following. 3 V. L. Goncharov [1], p. 292 following. 4 J. S. Besikovich [1], p. 216 following.
184 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS set *>w = /[(6-a)%+(a + 6)] and investigate the function <p(u) under the assumption that f(x) eA([a, b]). 2 Let uo and u be two points in [ — 1, +1] for which \u — u0\ < R, b — a R being the number contained in the definition of A ([a, b]) and belonging to fix). If (b — a)u + (a + b) _ (b — a)u0 + (a + b) then \x — Xo\ < R; hence f(x) = 2ck(xQ)(x— x0)h k=0 or, in another form, or, finally, having set P(«) = ^^(i£0)(tt-«0)*f (233) d*(*>o) = cjt(x0)|-— '6 — a\* Function <p(w) belongs therefore to A([—1, +11); and if, moreover, /(z) is an integral function, ^>(w) is an integral function as well. We now take a look at the function ^(0) = <p(cos 0). We take a specific 2 value of 0O as constant and let 0 satisfy the inequality |0 — 0O| < Rt b — a 2 Then also |cos 0 — cos 0o| < Ro and, by reason of (233), b — a v(Q) = 2 4 (cos e — cos e0)k, where, for shortness, we write dk instead of dk(cos 0).
3. THE BEST APPROXIMATION TO FUNCTIONS ANALYTIC ON A SEGMENT 185 From elementary analysis we know that cos 0 — cos 0O = 2 «<(0 — 0o)* , i=l (234) hence v(0) = 2<i*\!t«i(0--0o)i]k. Since the series (234) is absolutely convergent, in raising it to a power term- by-term multiplication is permissible; this yields Thence it follows that On the other hand and, therefore, V(0) = 2 dk\ £ a? (0-0o)'l. _ 1 fd^cosfl] ai~T\\ dV Je=V (235) (236) \«i\^- The absolute values of the terms of series (234) are therefore not greater than the corresponding terms of the series <-l *! If we set then, by reason of the above, |a(f | g A(f. Now we compare the series 2; i**i *=0 .<-* J with series (236). (237)
186 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS 2 Since for \z\ < R the series b — a Zdkz* is absolutely convergent, series (237) converges provided that 2 e\o-Oo\— 1 < B. (238) b — a v ' Assuming that this condition is fulfilled, we can regard series (237) as the result of a summation by rows \d0\ + + \d1\A?)\d- - 60\ + \di\ A? \6- + \d2\Af)\6- - 0ol2 + |dil 4" 10- 0O|3 + • -0O|* + I<*2|42)I0-0OI3 + - + |d3|43)|0-0ol3 + - + •■ •• + •• + •• + ••(239) Since this double series has positive terms and its summation by rows is a finite sum, it is therefore convergent. But series (236) is also the result of a summation by rows of the series d0 + + dia?(0 - fl0) + dia{i\e - 0O)2 + dA\0 ~ Go)3 + - + + d2af{d - d0)2 + d2af{d - 0O)3 + — + + dzaii)(0-0o)* + -.+ + .... (240) By reason of the above, the double series (239) is a majorant of the double series (240). Hence the latter is absolutely convergent, so that we may also add by columns, thus immediately obtaining the equation V(fl)= J^(fl-flo)*. (241) But since the validity of this equation depends only on the condition o (238) (whence \B - 0O| < R follows of itself) and the coefficients X< do b — a not depend on 6, \[/(d) e A2ir. But if f(x) is an entire function, then R = + <» and condition (238) places no restrictions on 6, so that \p{6) e G2jr. Thus we have proved that the conditions of the theorem are necessary. To prove that they are also sufficient; we resort to a
3. THE BEST APPROXIMATION TO FUNCTIONS ANALYTIC ON A SEGMENT 187 Lemma. If P(x) is a polynomial belonging to Hn whose absolute value on segment [a, b] does not exceed the number M, then on a larger segment [a0, 60] with Oo = a — fc (6 — a), b0 = b + h (b — a) (h > 0) this polynomial satisfies the estimate \P(x)\ ^ M(l + 2h + 2yh + A2)n. (242) Proof. Let Q(«) = p Ub - a)u + (a + 6)1 In absolute terms this polynomial does not exceed the number M on segment [-1, +1]. By (63), the estimate \Q(u)\ £M[\u\ + yu* — 1]» is valid for \u\ > 1. We now take a value x lying on [ao, bo] but not on [a, b]; for a concrete definition let b < x ^ 60. Then 2x — (a + b) u===—ir—^—> lj 0 — a and we obtain (since P(x) = Q(u)): 2x — (a + b) \P(x)\ ^ M whence it follows that + jf-T^T M I P(x) I ^ th_ v» [2x-a-b+2y(x-a)(x- fc)]\ (6 - a) If we consider 2X — a — 6 ^ (6 — a) (1 + 2A), (s — a) (s — b) ^ (6 — a)2 (h + A2), we come to (242). The case of ao ^ x < a can be treated in an analogous fashion; but for a ^ x ^ b the estimate (242) is trivial. The importance of this lemma lies in the fact that it leads from the estimate of a polynomial on a segment to the estimate of the same polynomial on the extension of that segment.
188 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS Let us now revert to proving that the conditions of Bernstein's theorem are sufficient. Let En(f) < Kqn. (243) A natural way of attaining this end is the following: We introduce the induced function \l/(0); since in that case El{\p) = Enij), it follows from (243) that yp{6) is analytic. We have therefore only to infer from this that also the initial function/(x) is analytic. This inference, however, cannot be drawn without supplementary considerations (although actually from the analytic behavior of yp{6) also that of f(x) follows) because the function arc cos x is analytic not on the entire segment [—1, +1]. This circumstance complicates the proof somewhat. If Pn(x) is the polynomial of the best approximation, then \P*(x)-f(x)\ <Kqn (244) and f(z) = P0(x) + % [Pn(x) - Pn_x(x)]. Considering the estimates \Pn(x) - Pn-l(x)\ ^ \Pn(x)-f(x)\ + \f(x) - Pn-i(x)\ it follows from (244) that |iV*)-P,_i(*)| <Lq>\ where L = K(l + g_1). The latter estimate is valid for a ^ x ^ b. Now, we choose h > 0 sufficiently small, so that q1 = q{l+2h +2yh + A*) <1. Then, in the wider segment [a0, 6o] with a0 = a — h (b — a), b0 = b + h (b — a) we have |P.(*)-P,_i(*)| <Lqf. Consequently the series Po(x) + 2 [Pn(x) - Pn-i(z)] (245) n=l
3. THE BEST APPROXIMATION TO FUNCTIONS ANALYTIC ON A SEGMENT 189 converges uniformly on segment [a0, 60]. We denote its sum on this segment by fo(x) (on the initial segment [a, b] f0{x) coincides with the initial function f{x)). If we also consider that |/0(x)-Pn(x)| ^ 2' \Pk(x)-Pk_l(x)\< 2 Lqk1=-—-qn1+\ fc=n+l fc=n + l x — <7l then, obviously En(fo)<r^--qni+1- 1 — qx Now, we introduce the induced function 5 f{6) = /, [(6° ~ a°> C082fl + (a + b)] , thus (and this is the gist of our consideration) we consider not the induced function of the initial function/(x) but its "continuation" fo(x). In view of the fact that El(xf^) = #n(/0), the last inequality warrants that \p(B) belongs to class A2x. Moreover, the number R belonging to ^(0) satisfies the condition B ^ -j ri =j. (246) 2e\ ln^ 1 — qi] Now we have to show that the initial function f(x) is analytic on the segment [a, b]. To this end we verify that the auxiliary function <p(u) — y>(arc cosu) is analytic on segment I - ^ ^ , 1 J . Let 6 and 0O be two points for which \0 — 00\ < R, so that y>(0) = Zck(0o)(6-0o)k. fc=0 Further, on segment , the derivative of function L 1 + 2h 1 + 2hj arc cos u is, in absolute terms, not greater than 2 }/h -f h2' 6 We remind that a0 + 60 = a + b.
190 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS Therefore, if u and u\ are two points on 1, \u — wol < L 1 + 2A ' 1 + 2hJ ' 0| R — , then |arc cos u — arc cos u0\ < R and, hence H oo <P(U) = 2J dk (arc cos w — arc cos u0)kt (247 where, for shortness, we have set dk = c*(arc cos u0). It is known from elementary analysis that the function arc cos u is analytic for I < 1 on every segment [-Z, +1], the number # belonging to it being equal to 1 — L Providing that 2h we have therefore oo arc cos w — arc cos u0 = ]£ at(u — uQ)1 (248) i-1 (of course, the points u and u0 belong, now as before, to segment , ). It is important in this case that L 1 + 2h ' 1 + 2AJ/ 1 \dl arc cos u\ Since arc cos u is analytic in segment , - , there exists L 1 + 2h 1 + 2/U a constant A for which |(arccostt)(i>| < AH1 holds, whence \a€\ <Ai^- <Aiei = Bi (B = Ae) follows. We notice immediately that the constant B decreases as h increases. From the last estimate it follows that the geometric series
3. THE BEST APPROXIMATION TO FUNCTIONS ANALYTIC ON A SEGMENT 191 which for B\u — w0| < 1, converges to the sum B\u — u0\ 1 — B\u — u0\ is a majorant of series (248). Since the series (248) is absolutely convergent, we may therefore raise to a power this series as well as its majorant (249) by term-by-term multiplication. Thus we obtain oo (arc cos u -— arc cos u0)k = £ a(*> (u — uQ)1 (250) ('"-"•^rrix)' (B\u-u0\ <1), and we also have R Now, if we choose \u - u0\ smaller than each of the three numbers —, 2h - , we find from equation (247): 1 + 2h ' B <p(u) = £ dk \ £ a^(u - u0)t]. (252) For this series Z\dk\\£Alk)\u--u0\i\ (253) is a majorant. If we also consider that the series Zdkz* *=o is absolutely convergent for \z\ < R, we can guarantee the convergence of the positive-termed series (253) provided that B\U Un I _ i—Wi— i < E • (254) 1 —- B \u — un\ v '
192 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS With the same right as earlier when equation (236) went over into equation (241) we obtain here (with the aid of double series) from equation (252) the equation (p(u)= j?J*(*-«%)*. (255) in which the coefficients X* do not depend on u. We remark that we have proved this equation (255) with the provision that . \R 2h 1 R I e |^-^0|<e = minj-, jjjj, -jg, B(l+M)]- Thus we have shown that <p(u) eA[\ , ). \L 1 + 2h ' 1 + 2/U/ Now we revert to our original function f(x). Given two points x and x0 on [a, b], we set 2x — {a + b) 2x0—(a + b) u = h 7. ' uo = h 7 • We can easily see that these numbers lie in the segment , L 1 + 2h 1 1 . 2 , \u — uq\ being equal to \x — x0\. 1+ 2hJ b0 — a0 Now, if 60 — a0 l»— xo\ < 2—Q ' we obtain (since/(x) = <p(u)) by reason of (255) the equation jfc=0 (O0 — «o) which proves that/(x) e A([a> b]) and, hence, that condition (231) is sufficient for this. Now let also condition (232) be fulfilled. Then in condition (231) q > 0 may be chosen arbitrarily small. Therefore, if an arbitrarily large value of h > 0 is preassigned, we can always choose ag>0 such that qi = g(i + 2h + 2|/A +h2) R 6The fourth number comes from estimate (254). If we consider that-——- < 1, then 1 -f- K we may just as well leave out the number — " B
3. THE BEST APPROXIMATION TO FUNCTIONS ANALYTIC ON A SEGMENT 193 is arbitrarily small. Therefore we again construct the functions f0(x) and yp(6). Then, as above, function yp{6) turns out to be analytic, the corresponding number R satisfies the inequality (246) and may therefore be chosen arbitrarily large (irrespective of the prescribed value of h). Thus function f(x) can again be represented by the series (256) (no matter how large we choose h since the expansion off(x) in powers of x — x0 is possible only in one way). To ensure convergence of the series (256), x must be subjected to the condition \x — xo\ < p or, more specifically, we must require that this difference \x — x0\ be smaller than each of the three numbers E b — a ui =Jf—2~(* +2h)> u2 = (b — a)ht * b-a(l+2hh B(l+B) 2 (257) But since R and h are completely independent of one another, both values may be chosen arbitrarily large, hence also U\ and u2 become arbitrarily large; since, finally, .. 1 +2h hm =— = +oo A-oo -B (as B does not increase when h increases), w3 may also be taken arbitrarily large. Thus the three numbers (253) place no restrictions whatever upon argument x, so that the series (256) converges for all values of x and represents the function/(x) in the segment [a, b]; this, however, means that/(x) is an integral function. Thus, the proof of Bernstein's theorem is complete. It is of interest to compare this theorem with the results obtained in § 3 of Chapter VI. While the latter merely enabled us to characterize the differential structural properties of the function within the interval (a, b), the last theorem gives us a characterization in the entire segment [a, 61. The reason for this is to be sought in the fact that the estimate En < Kqn makes the terms of series (245) become sufficiently small not only on the initial segment [a, b] but also on its expansion [a0, bo], whereas estimates of the form M.< A np+a do not perform accordingly. Now we bring a corollary of Bernstein's theorem which we have briefly mentioned before.
194 IX. THE BEST APPROXIMATION TO ANALYTIC FUNCTIONS Corollary. Given a function f(x) on the segment [—1, +1], if its induced function \p(6) = /(cos 6) is analytic, then the initial function f{x) is analytic in the entire segment [ —1, +1]. This is true since En(f) = ^JOA). Since \p(6) is analytic, it follows that Elty) < Kqn and, consequently, En(J) < Kqn, hence/(x) is analytic. We conclude with a precise formulation of Bernstein's results in the terminology of the theory of functions of a complex variable: I. 2/ the best approximation of the function f(x) e C([a, b]) is such that Ii^^7)=-1 (*>1) (258) then f(x) is holomorphic7 inside an ellipse the foci of which are the points a and by the semiaxes having the sum R, with a singular point lying on its boundary. II. If f{x) is holomorphic inside an ellipse whose foci are a and b} and a singular point lies on the ellipse boundary^ then the best approximation by polynomials in segment [a, b] satisfies the condition (258), R being the sum of both semiaxes of the ellipse. 7 Strictly speaking, there exists a function of a complex variable with the properties mentioned which coincides with f(x) in the segment [a, 6].
CHAPTER X PROPERTIES OF SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION In Chapters VII and VIII we have investigated Fourier series, Fejer and de la Vall^e-Poussin sums as means for approximation. In the present chapter we shall deal with some other analytic expressions suitable for this purpose. § 1. Expansion by Tchebysheff Polynomials The results obtained from investigating Fourier series are, mutatis mutandis, easily applicable to the expansion of a function by Tchebysheff polynomials. This can be done on the basis of the following simple Lemma. If a function yp{6) e C2ir is even, then its Fourier coefficients are 1 n A=— [w(0)dO, 71 J 0 2 (259) a* = — f y(d) coskddd, bk = 0. 7Z J 0 To prove this we have to represent each coefficient A, ak, bk, by an integral over the segment [—tt, tt]. If we split this integral into two parts extending over [—tt, 0] and [0, tt], and substitute 0 = — 0' in the former, we immediately obtain the formulas (259). From this we come to the following Theorem. If the function f(x) e C([-l, +1]) satisfies the Dini-Lipschitz hm co (o) In o = 0, then it can be expanded in a uniformly convergent series by Tchebysheff polynomials: f(x)=A +Jta*Tk{x)i (260) where , t1 ., x , n t1 , . \ C f(x)dx 2 f dx A=— / '; ak=— / f(x)Tk(x) _1 f f(x)dx _2 f -njyr=^> ak--Jf{x)Tk ]/l — x2 -i -l 195
196 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION the partial sums of this series satisfying the inequality \Sn(x) - f(x)\ ^ (3 + lnn)En(f) (n ^ 2). (261) Proof. We set y){6) =/(cos0). Then, by Lemma 2 from § 2 of Chapter VI, cov(d) ^ C0f(d). Thus, since the function satisfies the Dini-Lipschitz condition (184), it can be expanded in a uniformly convergent Fourier series y,(0) =A + ]Takcoslc6 , (262) whose partial sums satisfy the Lebesgue inequality (180). By putting cos 0 = z therein, we obtain f(x)=A + J>*7*(z). Moreover, substitution 6 = arc cos x yields and analogous expressions are obtained for the coefficients ak. By reason of Lemma 1, § 1, Chapter VI, we also have E^ty) = En(f) so that, finally, the inequality (180) goes over into the inequality (261). Corollary. If the function f(x) has a derivative f'(x) and \f(x)\^Ml9 it is expandable in the series (260) and, moreover, 19 M \Sn(x) - f(x)\ <—^(3 +lnn). Thus, by Jackson's theorem, En ^ . n § 2. Some Properties of Bernstein's Polynomials Now we complete the considerations on Bernstein's polynomials Bn(x) expounded in Chapter I.
2. SOME PROPERTIES OF BERNSTEIN'S POLYNOMIALS 197 Theorem 1 (T. Popoviciu). If f(x) e C([0,1]) and Bn(x) is a Bernstein polynomial of fix), then l \Bn(x)-f{x)\^-co {yt} Proof. From formula (15) we obtain Jfc=0 ®- /(*) C*x*(l — x)n~k. By virtue of the inequality w(X5) ^ (X + l)w(5) we further have /(i)-/w|s»(|i-»|)s(|i-^+i)«.(i) With the aid of identity (1) we obtain from it \Bn(x)-f(x)\ ^eo U) Jfc=0 k I # n C*nxk{\ — x)n-* + 1 (263) (264) On the other hand, Cauchy's inequality (71) yields [» I 1c I «. "I2 21 --x o*x*(i-»)»-* *=oIn I J whence with the aid of (1) and (10) 5,1 & 2, X n cknxk(i-xr-k^-±~ 2}/n (265) follows. And now, both estimates (264) and (265) combined yield the estimate (263). Specifically, if f(x) e Lip^ a, then 3M \Bu(x) - f(z)\ £ 2]/* M. Kac [1] found the latter estimate independently of Popoviciu. He also proved that its order cannot be improved. From the viewpoint of a possibly rapid approximation to any continuous function, Bernstein's polynomials thus turn out to be fairly inadequate. This is made particularly evident by the following L Popoviciu [1], Kac [2], I. P. Natanson [7].
198 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION Theorem 2 (E. V. Voronovskaya [1]). 7/ a bounded function f{x) has a finite second derivative f"(x) at the point x, then Bn(x) = /(*) + Q^ *(1 - x) + £= , and lim p» = 0. n—>ao The proof of this is based on several lemmas. Lemma 1. For the polynomials )mClnxk{ (m = 0,1,2,...) Sm(x) = 2 (* — nx)mCnXk(l — x)n~k *=o (266) (267) the recurrence formula Sm+i(z) = x(l - x) [S'm(x) + nmSm^(x)] is valid, since differentiation of Sm(x) yields (268) Whence it follows that &m(z) =Z(k— nxF-iClx*-1^ — x)*-*-l[—nmx(l — x) + (jfc — »*)*]. *-0 and, hence, x(l — x)S'm(x) = —nmx(l — x)^w^x(a:) + Sm+l(x), which is tantamount to (268). Lemma 2. TAc polynomial Sm(x) is representable in the form [|] sm(x) = 2Amti(x)m , functions Am,i{x) being certain polynomials independent of n. This is true for m = 0, 1, 2, which follows immediately from identities (1), (8) and (2). Suppose now that it were proved for m ^ r. Then, by (268) ~\i\ . IfL .1 ^r+i(«) = *(1 — «) t-0 t-o
2. SOME PROPERTIES OF BERNSTEIN'S POLYNOMIALS 199 and we only need the remark that Corollary. On the segment [0, 1] \8m(x)| ^K(m)n1**. (269) Lemma 3. If 8 > 0 and A(x) is the set of those values from the series 0, 1, 2, ..., n for which — x then for every value of m 2 cknx*a-*)*-* keJ(x) K(2m) holds true. In fact, $2 mi*) I OS *»<! - *>-* ^ 3=355 2(h - »*)«-C* ** (1 _ x)»-> = J£-L , Jte J(c) *=o and the assertion follows from (269). Specifically, if we set m = 3 and 8 = n~* we obtain, Z(6) 2? C* **(!-*)»-*:£ ^n-1/* ■j/n3 (270) Now we can turn to proving Theorem 2. From the existence of a finite derivative fix) there follows f(t) = /(*) + /'(*) (*-*) + [^ + *(«)] (« - *)* for lim X(0 = 0, and, further. / (A)./w + rw(±_.) + [^ + ,(4)](A_,)" By substituting this quantity into the expression for Bernstein's polynomial and considering identities (1) and (2) we find x(l — x) „, Bu(x) = f(z) + \ /"(*) + rtt In
200 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION for r*=?A^&-xJcU{l-x)n~k- (271) For a preassigned e > 0 there exists a number n such that for \t — x\ < n-* the estimate |X(0| < £ is valid. For such a number n we subdivide the sum (271) representing rn in two sums £i and £2, the summands for which A; I — — x < n~* being regarded as belonging to £1 and the remaining ones n I ones to £2. According to inequality (10) On the other hand, function X(£)(£ — z)2 is bounded; if M is the upper bound of its absolute value, then by (270) |*.-«'i.-V. V*3 \n I holds true. Therefore, for sufficiently large values of n . a MK(6) \nrn\ <-r-\ 7=-. If, in addition, we impose the following condition JfJC(6) 3 —7=-" <_re |/n 4 upon n, then |7irn| <e and, hence lim nrn = 0. If therein we substitute nrn by pn, we obtain (266). n—►<» This formula shows that no improvement of the properties of f{x) lets the measure of approximation of Bernstein's polynomials exceed the order - (except the case of linear functions with which their Bernstein poly- n nomials Bn(x) are identical when n > 0). The following flieorem and this one are closely related in their proof procedures. Theorem 3. If a bounded function f(x) has a finite derivative f(x) at the point x, then \imB'n(x) = f'(x).
2. SOME PROPERTIES OF BERNSTEIN'S POLYNOMIALS 201 Proof. If 0 < x < 1, we can write the derivative Bn(x) in the form 2 Vn(z) = Jtf(-)(k-nz)Ckna*-l(l - xr~*-i x(l — x)k=0 \nj If therein we replace / ( - ) by the expression /(i)-/M + [/-M + l(i)](i-,). lim X(0 being equal to 0, we obtain from it, considering the identities (8) t—>x and (2), Bn(x) = f{x) + l 2l&)fr ~ nx)*C*nxk(l ~ x)n~k nx(i — x) k=0 \n) or, in terms of (271) thus the theorem follows from the already proved relation lim nrn = 0. n—><» The cases when x = 0 and x — 1 are settled in a much simpler fashion. For example, for the case x = 0 we write B*n{x) in the form: B'n(x) = — »/(0)(l — x)n~l + /(--) n{\ —nx){\ — x)n~2 + £ f(-)(k- nx)Cknx*-l(l - a;)""*"1, whence we immediately obtain *;(0) = »[/(!)-/(<>)]-►/'(<>). The theorem just proved is purely local in character. There exists, however, a similar proposition for an entire segment: Theorem 4. // f(x) has a continuous derivative f'(x) everywhere on [0, 1] then Bn(x) converges uniformly to f'(x). To prove this we write Bn(x) in the form Bn(x) = /(0) (1 - X)* + SV-Vn**(l - X)n~k + /(l)^. * = l \»/ 2 To have this formula valid also for x = 0 and z = 1, a rearrangement of the summands would be required since otherwise we would run into summands of the form 0-1. 1 1 1 This formula behaves in a fashion similar to equations 1 = 1+ - — —• or 1 = a? — which for x = 0 are formally meaningless.
202 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION Hence we obtain B'n{x) = - n/(0)(l - X)"-1 + n£f(-)cknkx*-l(l - *)*"* — *£ f(—)c*'n — k)x*(l — x)n-k~l +nf(l)xn~l, or B'n{x) = £ /(AW^*-i(l - *)»-* *=i \n) - *2f (-)<£'» -k)xk(l- a;)"-*-1. *=o \n/ In the first sum the index can be so changed that it also runs from k = 0 to k = n — 1. This yields ;<*> =? [/(-1)(*+ 1)C*+1"' (£)(n ~ k)C*"\zk{l Sut (ft + l)C*n+1 = (n - ft)C* = »Cj-lf and, consequently *"" -\5'['(*-Ti)-^)]ci-,<I —r"'- Since the derivative /'(a;) exists on the entire segment, according to the mean value theorem. Thus we have B'n(x) ^fi^jC^X^l - X)»-1-* or, in another form:
2. SOME PROPERTIES OF BERNSTEIN'S POLYNOMIALS 203 In this expression, the first sum represents Bernstein's polynomial of (n - l)-th degree for the derivative f'(x), and converges therefore uniformly to f'(x) on [0,1]. On the other hand, Hence — < z* < < ^ k + 1 n n < whence, if we denote by wi(-) the modulus of continuity of f'(x)} it follows that ■«V(^7)|^G)' Thus the second sum in (272) is not greater than wi(5) and tends uniformly to zero. The following theorem deals with another property of Bernstein's polynomials. Theorem 5. If f(x) is a polynomial of m-th degree and n ^ m, then its Bernstein polynomial Bn(x) is of degree m (i.e., not of degree nfor n > m). It obviously suffices to show that this is true for f(x) = xm or (which affirms the same) that 2lmCknxh(\ — x)n~k *=o (273) for n ^ m is a polynomial of m-th degree. If we differentiate the identity SCknzk = (l- z)n m times but, after each differentiation, multiply first by z, we obtain the following left-hand member ZkmCkz*. On the right-hand side we get, now as before, a polynomial of n-th degree which, in addition, is divisible by (1 + z)n~m without remainder (as can be
204 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION proved without difficulty with a conclusion by induction about ra). Thus £**<&* = (1 +z)»->»Pm(z). X If therein we set z = and then multiply by (1 — x)n, we obtain 1 — x (273) on the left and (1-^p-(r^)- on the right, i.e., a polynomial of m-th degree. Corollary. If we denote Bernstein's polynomial of n-th order for the function xm by Bn,m(x), then the equation \imBnrm(x) = x™ n-+oo holds for all real values of x. In fact, this equation is fulfilled for [0, 1]; the rest follows by reason of the remark in § 1 of Chapter II. On the strength of this theorem we also prove Theorem 6 (L. V. Kantorovich [1]). // f(x) is an integral function, then the sequence Bn(x) of its Bernstein polynomials converges to it along the entire axis. For, /W=|cmxw, (274) m=0 the series on the right being absolutely convergent along the entire axis. Hence it follows that Bn{x) = ZcmBn,m{x). (275) Each term of series (275) tends to the corresponding term of (274). Thus it suffices to verify that (retaining argument x) the series (275) is uniformly convergent with respect to n. To this end it should be noted, first of all, that \B».n(*)\ = n 11r\m » t-0 \nJ X) n-k\ ^2'c*i*i*(i + i*!r-* = (2M + i)'1 and, accordingly, \Bn>m{x)\ ^(2|*| + l)2w for n ^ 2m.
2. SOME PROPERTIES OF BERNSTEIN'S POLYNOMIALS 205 On the other hand, in the expression n / h\™> , B,.m(z)=Z[-) C***(l-*)»-* -i5(4)CJL5'-"'«-"H,J the coefficient of xr is equal to t-0 \W/ \r—fc or (since CfcJ = C^Cj) ^r = -s ^(-l)'-**"^. (276) w fc=0 We may in this case count on r ^ m, since Bmtn(x) has no terms of degree higher than m. From (276) it follows that I I 7 I < -* rw y C* — —- G'r and, consequently Now, if n > 2m, then for r ^ m we have always C£ ^ Cn♦ hence IAr|<i^-C- But since £..m(*) = ^0 + ^2: + '•• + Am*™ it follows that \Bn.m(x)\ < (m + 1) —js- OT(|*l + 1)". Since m _ w(w— 1) ••• (w —m + 1) _ra™ n _ m! m!" '
206 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION therefore I*....(*)I < (m + l)^^(\x\ + l)m and, by reason of (213), a fortiori \Bn,m{x)\ <(m + l)2™e»>(\x\ + l)m. But m + 1 ^ 2m, hence, finally \Bn.m(x)\<[4e{\z\ + l)r for n > 2m. If we take A to be the larger of the two numbers (2\x\ + l)2 and 4e(|s| + 1), we have the estimate \BUwm(x)\ <A™ for all values of n. The series whose terms are no longer dependent on n, is therefore a majorant of the series (275). This proves the theorem. Let it also be noted that, as it follows from the proof, the series converges uniformly in every segment [-«, +B). § 3. Some Properties of De la Vallee-Poussin's Integral In the present Section this author expounds the result of some of his investigations relating to the properties of the integral of de la Vallee- Poussin. Theorem 1 (I. P. Natanson [7]). If a function f{x) e C2t has the modulus of continuity w(5) and x ' — n is one of its de la Vallee-Poussin integrals, then the inequality \Vn(x)-f(x)\ ^ScoU=\ (277) holds for all values of x.
3. SOME PROPERTIES OF DE LA VALLEE-POUSSIN'S INTEGRAL 207 Proof. It is easy to see that Vn(x)-f(x) = (2n)\\ 1 %+it t — X (2n IjTi^ /{/(<)-/(*)} cos-1^. But \f(t)-f(x)\f<o(\t-x\), i and, by reason of the inequality w(X5) ^ (X + l)w(5). co(\t - x\) ^ (\t -x\Yn + l)co i^J\ so that we get \Vn{x)-f(x)\ (2w)H co\ z+n (2w—1)!! 2n f [\t — z||/w"+l]cos2» —2~dt. -n t — X If in the latter integral we substitute for a new variable and consider the equation we find nj2 ( -»/2 (2n)\\ [ cos2ntdt = /0 xtt 7t9 <— w/2 ^-'^41+^^^M'^'"'T{ky In the expression [ (2n)H 1 L(2n-l)!lJ 2 2-44-6 (2w — 2)-2m Zi —7^7; 3^— • • • — 77; ^ ft 32 52 (2/i — l)2 each fraction on the right is smaller than 1, hence (278) On the other hand ;t/2 7J/2 f \t\cos2ntdt = 2 f tcos2ntdt, and, hence, -71/2 \Vn(x)-f(x)\ ^ — ( tcos*»tdt\<ol—=). (279) 71 </ J ty*l
208 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION All we have to do now is to evaluate the latter integral. For 0 ^ t ^ - , t 5^ —sinj, hence »/2 »/2 f J cos2n tdt < y /" cos2n J sin J d£ 0 71 71 ' 2(2n + 1) "^ 4~ti ' o o l = £L fZ2ndz== * <£!. (280) 2 i 2(2rc + 1) 4tT V ; We get (277) from (279) and (280). Corollary. If fix) e Lip^ a, then 3 M \Vn(x)-j(x)\ <*-}=. (281) yna Remark. The order of estimate (281) is not improvable. Given the function f(x) = |sin x\a which, as we can easily verify, belongs to the class Lip a, we have F»W=(2^Tli/8ina<cos2n4^ and thence or 2 (2»)" 1 FL_2n< F-<0)>^riyn*i,i,i"'co8,"Tc" 0 _1_ Vn (2n— l)\\ 7t J (2n)l\ 2 1 7T But for n > 1, —- < -, so that Vn 4 sin 2t > — t 71 on the integration interval, and F"(0>>^ (2^1)11/ff'00S,l>^ where 4 is a constant dependent upon a.
3. SOME PROPERTIES OF DE LA VALL^E-POUSSIN's INTEGRAL 209 But since [" (2n)!! ]2_ 2 [(2n-l)!!J "I- 22 42 (2» — 2)2 "3 "3 • 5'" (2n — 3) • (2n — l)2n — 1 and each fraction on the right is greater than unity, (2n)U {2n— 1)!! >>^. Whence it follows that _i_ Vn Vn(0) > Axyn j iacos2ntdt with a new constant A\. On the other hand for the function cost > 1 9 (282) <p(t) = cost + J9 increases for t > 0 since v'it) = t — sin t > 0. Thus ^>(0 > ^>(0) = 1 for t > 0, which proves the inequality (282) for positive values of t; but since both sides of (282) are even functions, it is valid everywhere. This gives i / t2 \ 2n Vn(0)>Aiynft*(l--) dt. Finally, the well-known Bernoulli inequality (1 + x)m ^ 1 -f mx (which can be easily proved by induction for m) holds for all values of x > — 1. With its aid we find i Vn Vn(0) > A,Yn j r(l — nt*)dt, and computation of the integral leads to vn(0)>A1yn wherein A2 is a positive constant. "1/1 V+1 n I 1 \a+3
210 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION The order of estimate (281) is therefore optimal. This, however, cannot be said of the constant factor 3 contained therein. Thus we come to Theorem 2 (I. P. Natanson [9]). If L\{a) = sup {max \Vm{x) — f(x)\], wherein f(x) runs through all functions with period 2w of class Lipi a, then VW \ 2 / Vna _ (283) \nna \ 2 I ]/n for lim pn = 0. We prove this theorem only for the special case of a = 1 for which formula (283) can be written in the simpler form *7n(l)=--L + -^. (284) ynn yn Now, if fix) e Lipi 1, then ir.<«>-/<*>i ^j^kl TV *io-* v* x ' x—n (2W)!! ' "Ulcos-l^. L f \n J (2n — 1)U2ti J l l 2 Hence also UJl) is not greater than the right-hand member. On the other hand, the function 6(x) with period 2ir which coincides with \x\ on segment [—it, w], belongs to Lipi 1. For it F»(0)-e(0)=(2^^2^_/U|COS24d<' and since Un(l) ^ 7»(0) — 0(0), we get exactly —n 0 Thence it follows that r st/2 »/2 f sin t cos2n tdt+ f (t — sin t) cos2n t dt (2n)\\ 4 vn(i)- [zn)" (2n—\)\\n hence 17.(1) = (2W)H 1 l-J— + rn), (283) nK ' (2n— 1)!!«\2» + 1T V
3. SOME PROPERTIES OF DE LA VALLEE-POUSSIN's INTEGRAL 211 wherein we have set Ji/2 t» = J (t — sin t) cos2n t dt. But8forO gtg. - 2 0 ^ * — sin* ^ 4- ^ ~r sin3 *• ~ — 6 - 48 Accordingly, 0<rB<g/sin3<cos^^==24(2w+^(2w + 3)<9-gl. (286) 0 With the aid of (278) we get from (285) and (286) (2n)\\ 4 1 n2 0 < Un(l)-{2n_l)ll-2n + j <f^j7p- Whence On the other hand «/2 »/2 st/2 /* cos2n+1 tdt < f cos2w tdt < f cos2*"1 tf dtf. d 6 6 The integral in the middle was computed earlier in (19), we find the exterior integrals by the same methods and get (2ti)!! (2n — l)l\7i (2n — 2)\\ # (2n + 1)!! K (2»)!! "2* "^ (2» — 1)!! ; hence it follows that 4 {2n)]l -]^TTX|/f (o<en<i) (2n— 1)!! 3 We have seen that cos t ^ 1 - % . The function sin J - J + t» is therefore increasing and is positive for t > 0. 4 The so-called Wallis formula.
212 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION and therefore also 17.(1) = *?* + *' * + -^, which is tantamount to formula (284), for Y2n + 6n 4 2==_^_ 2n + 1 |/2^ j/rT^r ]/» with lim pn = 0. n—>ao These results show that, like Bernstein's polynomials and the Fej^r sums, de la Vallee-Poussin's integrals yield a comparatively low accuracy of approximation irrespective of their comprehensive significance (they approximate uniformly every continuous function). No improvement of the structural properties of the function leads to an improvement of the order of errors beyond -. This follows from n Theorem 3 (I. P. Natanson [7]). If the function f(x) e C2r possesses for a value of x a finite second derivative f"(x)} then the formula F.<,)~/<*>+qi>+L. with lim p» = 0 holds true for this x. n—>ao Proof. The quotient f(t) — f(x) — f(x)am(t — z) sin2 (t — x) (287) takes the indeterminate form - as t approaches x. We eliminate this indeterminacy by l'Hospital's method (from the existence of /" at a point x there follows the existence of /' in a neighborhood of x). DiflFerentiation of the numerator and denominator with respect to t gives f(t) — f'{x)cos(t — x) 2 sin (t — x) cos (t — x)' Since the value of this ratio obviously tends to \f"(x) as t -> x, we get /(Q-/(*)-/'(*) sin (<--*) _ 1 sin2(,_x) -"2/ <*> + *<*>'
3. SOME PROPERTIES OF DE LA VALL^E-POUSSIN's INTEGRAL 213 with lim a(t) = 0, whence there follows a certain analogon to Taylor's formula /(0 = /(*) + /' (*) sin (* — x) + — f'(x) sin2 (t — x) + a(t) sin2 (t — x). (288) On the other hand, we found in Chapter I the expression Fn(x) =(2n-l)P^ / {/(* + 2° + /(X ~~ 2^cos2n^^ <289> for de la Vallee-Poussin's integral. By (288) we have f(x + 2t) = f(x) + f'(x) sin 2* + — f"(x) sin2 2t + a(x + 2t) sin2 2t and /(a? — 2t) = /(a?) — /'(a?) sin 2* + — /"(a?) sin2 215 + a(* — 2t) sin2 2/. From this and from (289) it follows that (9 \I I 1 n^ Vn(x)=(e)K n)\'- f [2f(x) + f"(x)sm*2t + p(t) sin2 2t] cos2* tdt, wherein Pit) = a(x + 20 + a(x — 2«). Whence, by virtue of (19), we obtain (9n\\\ f"(x\ nr2 Vn(x)=f(x)+ K Vmi^^- / sin2 2*cos2n tdt (Z71 XII! 71 J v ' 0 (9 \ I f 1 "-* + /-> Vxii— / P(t)sin22tcos2ntdt. (290) (2w, — 1)!! 7t J r o But «/2 st/2 /" sin2 2* cos2n«d« = 4 /* (cos2n+2 « — cos2n+4 t) dt, o ov hence, by (19),
214 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION Thus equation (290) takes the form Vix)-flx) i r(x)2n + l wherein we set (271)!! 1 */2 -— f ft(t) sin2 2t cos2* tdt. ,-\ 71 J Since (2n-l)!...0 1 2n + 1 1 5n + 4 n + 2 2n + 2 n n(n + 2) (2n + 2) ' it clearly suffices to verify that lim (nrn) = 0 (292) n-*oo to complete the proof. To this end we take e > 0 and can find a 8 > 0 such that \P(t)\ <* on the segment [0, $]. From the equation (2n)!! 1 [ r T rn = v _ — II p(t) sin2 2tcos2ntdt + I p{t)sin2 2t cos2n tdt there follows then the estimate ji/2 (2n-l)!!»\(/ ;t lT»i <,, ^ ,xtt — <g / sin22*cos2»^ + ilf--cos2»<5 (2n — 1)!! n I J £ where M is the upper bound of the function 0(0 sin2 (20 (the boundedness of which follows from (288)). From this, from (291) and (278) it follows further that ^<f^d^ + ^cos2re<5' hence \tor*\ < £ + MnYncos2n6. For sufficiently large n MnYncos2nd < e and, therefore \nrn\ < 2e, whence equation (292) and, thereby, the theorem follow of themselves.
3. SOME PROPERTIES OF DE LA VALLEE-POUSSIN'S INTEGRAL 215 Now we prove another two theorems on the derivative of a de la Vallee- Poussin integral. These theorems are actually due to Lebesgue and Hahn, since, on the whole, they can be derived in a trivial fashion from general theorems of these two authors on singular integrals. To keep the representation as simple as possible, we will prove them directly here. Theorem 4. If at a point xf f(x) e C2r has a finite derivative f'(x), then the relation limF;(a?) = /'(x) (293) holds true for this value of x. Proof. The derivative of integral Vn(x) can be found by Leibniz' formula, i.e., by differentiating under the integral sign. Thus Whence, by a simple transformation, it follows that ™ -(^WlTn 1 '<« + »cos-^sin^ and, therefore, st/2 V'nW =7o^n!xM- f f{x + 2*)oo8*»-i t sin tdt. -«/2 By dividing this integral into two parts extending over the segments y q and 0, ^ , and substituting in the former — t for t, we get (2n)\\ n n/2 V«W ={2n_i)n^ f M* + 2t) - f(x - 2t)]co***-ltsintdt. By the hypotheses of the theorem we therefore have f(t) = f(x) + f'(x) sin (t — x) + a(t) sin (t — x) with lim a(t) = 0. (The proof of this equation is fully analogous to that of t—>x (288).) Thus we obtain f(x + 2t)'— f(x — 2t) = 2f (x)sin2t + p(t)siri2t
216 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION with lim 0(0 = 0. Hence t—>o (9 \l! 9 *'2 F»(*)=/o Vim —/'(*) /" cos2^1 * sin * sin 2<d< (^71 — 1) ! ! 7Z y (2»— 1) (2 + /0(2yi)!V,, — / /?(*) cos2'1-11 sin«sin 2tdt (295) (zTi — 1)!! n J and */2 »/2 /" cos2n-^sin^sin2^^ = 2 /" cos2n « sin2 tdt. We can therefore easily show by means of formula (19) that the coefficient n of f'(x) in equation (295) equals ——7 . As for the second term in (295), it n + 1 approaches zero which can be proved as (292) above. Theorem 5. If the Junction fix) e Cir has a derivative continuous everywhere then limV'n(x) = f{z) n-*oo is uniform with respect to x. This theorem could be proved by closer investigating the nature of the remainder of formula (295). A proof independent of Theorem 4, however, is much simpler. We transform equation (294) in Vn)\\ 1 " ItA^_ 2nt-x Fi(*> = -i&i£;/'<'^«! (2n-l) Integration by parts yields The term without integral in the bracket vanishes because of periodicity; hence
4. bernstein-rogosinsky's sums 217 The right-hand side of this equation is obviously de la Vallee-Poussin's integral forf'(x), hence we have V'nU;x] = Vu[f;z], which reduces the proof to the de la Vallee-Poussin theorem in Chapter I. § 4. Bernstein-Rogosinsky's Sums We will study another possibility of constructing trigonometric polynomials which uniformly approximate any given function belonging to C2,.5 Let/(x) e C2t and S»(2) be a partial sum of its Fourier series. Then we set6 «<«> = *[/; x] = I[s.(* + ^ + *.(«-^j. This is said to be a Rogosinsky sum. We represent the sums on the right by Dirichlet integrals and get £"(a:) = 4^ / /(<) C0S 1~2^(* ~ X) .ft — x, 7t\ ./< — x n \ 3inl"~'+^rT2) sinb>—sr+ij. Lemma. T/ie inequality dt. n / 2» + 1 COS - £ sin (t n \ . / t n \ T + STT^J 3m[-2-Iu-r2) dt <8tz2 (296) Availing ourselves of the fact that the integrand is an even function, the usual transformations lead to the integral sin(tf+^) sin(tf — £-) \ 2m) \ 2m/ dt (m = 2n + 1). 'V. Rogosinsky [1], S. N. Bernstein [7], I. P. Natanson [3, 4], F. I. Kharshiladze [1, 3]. 6 We write B„(x) to avoid confusions with Bernstein's polynomial Bn(x).
218 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION By reason of |sin /3 — sin a\ ^ |/3 — a\f this integral is not greater than ji/2 cos mt nn(t — ^-)sin(t + —) \ 2m) \ 2m) dt. Now, we divide the latter integral into three summands 7i, 72, 73 taken over the segments 0, — , —, J- — -^ and ^ — ^—, ^ . 6 L mj Lm 2 2m J L2 2m 2j To evaluate 7i we note that for 0 ^ £ ^ — m cos mt \ 2m) sinm ('-£) 3in(f — ^-) \ 2 m/ < m, sin Hence Furthermore \ 2m/ 2m m Ttjm Ix < I m2 dt — mn. T r dt I sin 11 — -— sin [t + s— To every factor of the denominator we apply the estimate sin a ^ - a and find 71 /2<T dt t* 7rmln3 nm For* |7T 7T 7T~J L2 ~ 2m ' 2 J 4m2 we finally have (when m ^ 3)
4. bernstein's-rogosinsky's sums 219 , 2n Tim hence U < = < mY$ " 2 ' The evaluations for Ji, J2, L give (296). Corollary. If \f(x)\ S M then \£*n[f;x]\^2nM. (297) Theorem. For every function f(x) e Cir the inequality | h\[x) - f(x) \^(27t + l)En + co \^—^i) <298> holds true, En being the best approximation to f(x) by polynomials belonging to Hni and w(5) being the modulus of continuity of f(x). To prove this let T(x) be that polynomial from Hi for which \f(x)-T{x)\ ^En. Then \B*n[f; x] - B'n[T; x]\ = \ITn[f - T; x]\ ^ 27tEn; the designations used are self-explanatory. On the other hand !«[/; x] - f(x)\ ^ |K[/; x] - B-n[T; x]\ + \B-n[T; x] - f(z)\, and, therefore, \Bi[f; x] - f(x)\ ^ 2nEn + \B*n[T; x\ - f(x)\. (299) Since as a trigonometric polynomial T(x) is its own Fourier sum, «[r;»]=-i 2n + lJ, V 2n + 1'. This quantity differs from by En at most, and, in turn, fraction (300) differs from f(x) by w ( on j_ 1) at most. Thus with the aid of (299), (298) follows.
220 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION Corollary. For every function fix) e C2T limJj;(*)=/(a) n-*oo is uniformly valid along the entire axis. The Bernstein-Rogosinsky sums have also other interesting properties which we shall not stop to investigate here. Readers are referred to the literature cited at the beginning of this Section. § 5. Convergence Factors We will show now that the theorems relating to the uniform approximation of functions from C2x by Fejer and Bernstein-Rogosinsky sums, and de la Vallee-Poussin integrals can be summed up under a uniform viewpoint. Let jo) eo <?0 > „(1) Ql > Qi > (n) Ql > J® £?2 > Q2 , • .., 0n , (301) be a triangular matrix of real values. We call it a matrix of type (A) if the following two conditions are fulfilled: lime[n)= 1, 1 * 2^ J «<»>, Go +2 £ 6* cost* *=1 dt <K, (302) (303) wherein the constant K is not dependent on n. Given a function f(x) e C2T) its Fourier series be oo A + V (ak cos & x + bk sin k x). With the aid of this series and matrix (301) we then form the polynomial Uu{x) = Un[f; z] = qFA +Z£h(akcoskx + bksinkx). Theorem 1. If a matrix (301) is of type (A), lim Un(x) = f(x) n-*oo holds 'uniformly true for every function f(x) e C2v. (304) (305)
5. CONVERGENCE FACTORS 221 To prove this theorem we show in the first place that the polynomial Unix) satisfies the inequality \Un(x)\ ^KM 9 (306) where M = max \f(x)\. In fact, if we substitute into (304) the expressions for the Fourier coefficients A, akf bk, then Unix) = =- /' fit) [<#> + 2 2 Q? cos k(t- x)]dt, —n whence in connection with (303), (306) follows. If, on the other hand m T(x) = P + ^ (pic cos lex + qksinkx) is any one trigonometric polynomial of ra-th order, then Un[T; x] = ^ P + % qJT (Pk cos kx + qk sin kx) holds for n ^ m, whence, by reason of (302), the uniform convergence of Un[T; x] to T(x) along the entire axis follows. Given any € > 0, we can find a trigonometric polynomial T(x) satisfying the condition |/(*)- T(x)\ <e. Then, by (306), \Un[f;x]-Un[T;x]\<Ke. But I Un[f; x] - f(x) | < | Un[f, *] - Un[T; x]\ + \Un[T; x] - T(x)\ + \T(x) - f(x)\, hence I Un[f, x] - f(x) | < (K + 1)8 + I *7„[T; *] - T(x)\. For sufficiently large n also the latter summand becomes smaller than e, so that \Un[f;x]-f(x)\<(K + 2)e , which proves the theorem.
222 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION By choosing the numbers p(^, called "convergence factors" in one or the other way, we obtain the theorems of Fejer, de la Vallee-Poussin and Bernstein-Rogosinsky. Example 1. The Fejer sum . . S0(x) + S1(x) + --+Sn_l(x) an(x) = can also be written an(x) = A + £ 11 1 (akcoskx + bksinkx). *=i\ n/ In other words, it is a sum of the form (304) with eiB, = l-^. (307) For the matrix of these numbers, (302) is clearly valid. From (89) we get, on the other hand, nt\ , i sin —-' *\ 1 / 2 Qon) + 2ZQkn)coskt = l +2z(l--)coskt=- * = 1 Jk=l \ n) n t S1U~2) so that ^on) + 2 2l^n)cos^^0. The sign for the absolute value may therefore be omitted under the integral in (303). But since n / cos ktdt = 0, the equation — / |p(o} + 2 V p(f cos kt\ dt = 1 fc=i follows for the numbers (307), and Fejer's theorem appears as a special case of Theorem 1 just proved. We remark in this connection that condition (303) always follows from (302) when £0° + 2^Qkn)coskt^0. (308) Example 2. We verify the correctness of the identity C0S2nl=(2w-1)!![1+2f (»')' 7|Cqs4 (309) 2 (2»)M L *ri(n-i)!(n + *)I J
5. CONVERGENCE FACTORS 223 Its left-hand side being a trigonometric polynomial of n-th order, t n cos2*—= £ + 21 h cos kt. (310) By integrating this equation over the segment [—wf w], we immediately obtain from (19) (2n-l)l! (2w)!! * We multiply (310) by cos mt, integrate and get nlm= / cos2n — cosmtdt. Now / cos2n — sin mtdt = 0, —n since the integrand is an odd function. Hence 7 t nlm = I cos2n — eimtdt. But since then •_1 --— t e2+e*2 C0ST = 2 t 2n ^ 22nCOS2n = £ C*ne*«(*-»> 2 *=0 and, hence With p being an integer, 1 2n z. /■ _•£ I 2rc, when p = 0. Whence 2 (2n)\ 2 « 22w 2n 22w (n — m)! (w + w)! (2w—1)!! (n!)2 (2n)\\ (n — m)\ (n + m)\ which proves (309).
224 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION We substitute (309) into de la Vall&e-Poussin's integral "■<*>=b^t^/'«> cos-6 (2ti— \)\\2nJn'KV' ~~" ~Y~dt and get Vn(x) = A + £ — ——(akcoskx + bksinxk), *-i(w — k)l(n + k)\ wherein A, ak, bk are the Fourier functions of f(x). Thus Vn(x) is a polynomial of form (304) with PW= ^ (311) The numbers (311) obviously satisfy condition (302). By virtue of (309) they also fulfill (308) and, hence, (303), so that the de la Vallee-Poussin theorem may also be taken as a special case of Theorem 1 in this Section.7 Example 3. Now there only remains to show8 that the Bernstein- Rogosinsky sums B*n(x) also can be represented in the general form (304) if the quantities p(f are chosen appropriately. If, in fact Sn(x) = A + ]£ (akcoskx + bksinkx), *=i then hence ±in\J') rt n klZ = A + £ coso 7 (ak coskx + bksinkx); *=i 2n + 1 Again these factors clearly satisfy the condition (302). Condition (303) was actually verified for them already in the preceding Section. For, n i \ n kjz go +21 e* cos kt = 1 + 2 V cos- —cos kt = l+|[cos,(<+^rT)+cos.(<-^I)]. 71. P. Natanson [8]. 8 F. I. Kharsiiiladze [1, 3].
5. CONVERGENCE FACTORS 225 By (175) ir + S cosfca . 2n + l sin a 2 sin- hence *=1 2n + 1 cos 1 sin (l + ^TFi) sin(y-4^) By reason of (296) the numbers (312) therefore satisfy also the condition (303). It is of use to add to the theorem just proved some remarks of general character. Given a matrix (301), we can then form the sum «0 + «1 + «2 + (313) for any infinite series If a finite limit Qn Q = lim Qn exists, then we say that series (313) is summahle by means of factors p(^ and call Q its generalized sum. Of particular interest are now those summation methods (known as permanent methods) by means of which every ordinarily convergent series is also summable in such a fashion that its generalized and ordinary sums coincide. Theorem 2. Let in matrix (301) An) ^ in) Qo ^ Ql •••>o<n)^0 (314) and let, in addition, condition (302) he fulfilled; then the summation method created hy it is permanent. To prove this, suppose that series (313) converges, and its sum is S. We denote its partial sums by Sk, consider that a* = Sk — Sk-i for k > 0, set pn(+i = 0 and find n Qn = 2 (q* — ejM-i)^*- *=0
226 X. SOME ANALYTIC EXPRESSIONS AS MEANS OF APPROXIMATION On the other hand (n) _ v (An) (n) \ 2/ {Qk — Qk+i)- Hence Qn - eP s=i(g?-<?&)(5* -S). (315) k=0 For any given e > 0 there exists a number m such that |S*-S| <e holds for k > m. From this and from (314) and (315) the estimate 10. - eS°si ^ j^n) - eftO I* - si + «e£+i follows for n > m. If we take m to be constant, (302) yields Jim j?(eir)-efti)l<s»-si = o. Thus, for n > nh Z(Qin)-Q?li)\Sk-S\<e. Jfc=0 On the other hand (again by reason of (302)) p^ < 2 for n > n2. Thus, if n > max (m, ni, 712), ic-ef'si <3e. This gives lim(£n-^n)£) = 0, n=oo from which, again with the aid of (302), we finally get \imQn = S. n-*oo Thus the theorem is proved. We can recognize without difficulty that the three summation methods with the aid of the factors (n) , * (n) (n\)2 (n) _ kjZ are permanent. For the first and third method this is trivial, and for the second it follows from the inequality eff 1 _ n - fc o2» » + fc + 1
BIBLIOGRAPHY Akhiezer, N. I. and Krein, M. G. [1] 0 nailuchshem priblizhenii trigonometricheskimi summami differen- tsiruemykh periodicheskikh funktsiy (On the best approximation of differentiable periodical functions by trigonometrical polynomials). DAN, 1937 Bernstein, S. N, [1] Demonstration du the'oreme de Weierstrass fondee sur le calcul des probability (Demonstration of the Weierstrass theorem founded on the calculus of probabilities). Kharkov, Communications of the Mathematical Association, 1912 [2] Sur la valeur asymptotique de la meilleure approximation des fonctions analytiques (On the asymptotical value of the best approximation of analytical functions). C. R. Acad. Sc, 1912 [3] Sur Vordre de la meilleure approximation des fonctions continues par des polynomes de degre donne (On the order of the best approximation of continuous functions by polynomials of a given degree). M6m. Acad, de Belgique, 1912 [4] Ob asimptoticheskom znachenii nailuchshego priblizheniya analiticheskikh funktsiy (On the asymptotical value of the best approximation of analytical functions). Kharkov, Communications of the Mathematical Association, 1913 [5] Sur la valeur asymptotique de la meilleure approximation des fonctions analytiques, admettant des singularity donnees (On the asymptotical value of the best approximation of analytical functions admitting given singularities). Bull. Acad, de Belgique, 1913 [7] Sur un procede de sommations des series trigonometriques (On a summation process of trigonometrical series). C.R. Acad. Sci., 1930 [10] EkstremaVnye svoystva polinomov i nailuchshee priblizhenie nepreryvnykh funktsiy odnoy veshchestvennoy peremennoy (Extremal properties of the polynomials of best approximation to continuous functions of a real variable). M.-L., GONTI, 1937 [13] Sur le probleme inverse de la theorie de la meilleure approximation des fonctions continues (On the inverse problem of the theory of best approximation of continuous functions). C.R. Acad. Sc, 1938 [14] Obobshchenie odnogo rezuVtata S. M. NikoVskogo (Generalization of a result of S. M. Nikolskiy). DAN, 1946 227
228 BIBLIOGRAPHY Besikovich, J. S. [1] Ischislenie konechnykh raznostey (Finite difference calculus). Leningrad, University Press, 1939 Borel, E. [1] Legons sur les fonctions de variables reelles (Lessons on the functions of real variables). Paris, 1905 Fejer, L. [1] Untersuchungen uber Fouriersche Reihen (Investigations of Fourier series). Math. Ann., 1904 Goncharov, V. L. [1] Teoriya interpolirovanii i priblizheniya funktsiy (Theory of interpolation and approximation of functions). Moscow-Leningrad GTTI, 1934 Jackson, D. [1] Vber die Genauigkeit der Annaherung stetiger Funktionen durch ganze rationale Funktionen (On the accuracy of the approximation of continuous functions by integral rational functions). Dissertation, Gottingen, 1911 [2] On approximation by trigonometric sums and polynomials. Trans. Amer. Math. Soc, 1912 Kac, M. [2] Reconnaissance de prior ite relative a ma note "Une remarque sur les polyndmes de S. N. Bernstein11 (Recognition of priority relating to my note "A remark on the polynomials of S. N. Bernstein"). Studia Math., 1939 Kantorovich, L. V. [1] 0 skhodimosti posledovatelnosti polinomov S. N. Bernshteyna za predelami osnovnogo intervala (On the convergence of the Bernstein sequence of polynomials at the boundaries of the basic interval). IAN, phys.- math series, 1931 Kharshiladze, F. I. [1] 0 metode summirovaniya S. N. Bernshteyna i V. Rogozinskogo (On the summation method by S. N. Bernstein and V. Rogosinsky). DAN, 1941 [3] 0 metode summirovaniya S. N. Bernshteyna (On the summation method by S. N. Bernstein). Mat. Sb., 1942 Kolmogoroff, A. N. [3] Zur GroBenordnung des Restgliedes Foiirierscher Reihen differenzierharer
BIBLIOGRAPHY 229 Funktionen (On the order of magnitude of the remainder of Fourier series of differentiate functions). Ann. of Math., 1935 Lebesgue, H. [1] Sur les integrates singulieres (On singular integrals). Ann. de Toulouse, 1909 Markoff, A. A. [3] Ob odnom voprose D. I. Mendeleeva (On a question posed by D .1. Mendelejeff). SPB, IAN, 1889 Markoff, W. A. [1] 0 funktsiyakh naimenee uklonyayiishchikhsya od nulya v dannom promezhutke (On the functions deviating least from zero on a given interval). SPB, 1892 Natanson, I. P. [3] 0 summirovanii ryadov Fur'e po metodu S. N. Bernshteyna i V, Rogo- sinskogo (On the summation of Fourier series by the method of S. N. Bernstein and V. Rogosinski). Works of the Industrial Institutes, Phys.-Math. Dept., 1937 [4] K teorii Bernshteynovskogo sposoba summirovaniya ryadov Fur'e (On the theory of the Bernstein method of summation of Fourier series). Leningrad, Works of the Institute for Precision Mechanics and Optics, 1941 [7] Nekotorye otsenki, svyazannye s singulyarnym integralom Valle-Pussena (Some estimates connected with the singular integral of De La Vall6e- Poussin). DAN, 1944 [9] 0 prihlizhonnom predstavlenii funktsiy, udovletvoryayushchikh usloviyu Lipshitsa, s pomoshch'yu integrala Valle-Pussena (On the approximation of functions satisfying a Lipschitz condition by a De La Vall6e- Poussin integral). DAN, 1946 Nikolskiy, S. M. [1] Ob asimptoticheskom povedenii ostatka pri priblizhenii funktsiy, udovletvoryayushchikh usloviyu Lipshitsa, summami Feyera (On the asymptotic behavior of the remainder in the approximation by Fej6r sums of functions satisfying a Lipschitz condition). IAN, Math, series, 1940 [21 Priblizhenie periodicheskikh funktsiy trigonometncheskimi polinomami (Approximation of periodical functions by trigonometrical polynomials). Works of the math. Stekloff Institute, 1945 [3] Nailuchshee priblizhenie mnogochlenami funktsiy, udovletvoryayushchikh usloviyu Lipshitsa (The best approximation by polynomials of functions satisfying a Lipschitz condition). DAN, 1946
230 BIBLIOGRAPHY Popoviciu, T. [1] Sur Vapproximation des fonctions convexes d'ordre sup&rieur (On the approximation of convex functions of higher order). Mathematica, 1934 Rogosinsky, V. [1] (Jber die Abschnitte trigonometrischer Reihen (On the segments of trigonometrical series). Math. Ann., 1925 Tchebysheff, P. L. [1] Teoriya mekhanizmov, izvestnykh pod imenem parallelogrammov (The theory of mechanisms known by the name of parallelograms). Collected Works, 1853 [2] Voprosy o naimenshikh velichinakh, svyazannykh, s priblizhonnym predstavleniem funktsiy (The question of minimal quantities connected with the approximation of functions). 1857-1859, Collected works [3] 0 funktsiyakh, malo udalyayushchikhsya ot nulya pri nekotorykh velichi- nakh peremennoy (On functions deviating only slightly from zero for certain values of the variable). 1881-1882, Collected Works De La Vallee-Poussin, J. [1] Sur Vapproximation des fonctions dyune variable rSelle et de leurs deriv&es par des polynomes et des suites de Fourier (On the approximation of functions of a real variable and their derivatives by polynomials and Fourier series). Bull. Acad, de Belgique, 1908 [2] L'approximation des fonctions d'une variable rSelle (The approximation of functions of a real variable). L'enseign. math., 1918 [3] Lemons sur Vapproximation des fonctions oVune variable reelle (Lessons on the approximation of functions of a real variable). Paris, 1919 Weierstrass, K. [1] Uber die analytische Darstellbarkeit sogenannter willkurlicher Funktionen einer reellen Verdnderlichen (On the analytic representability of so- called arbitrary functions of a real variable). Minutes of the Academy in Berlin, 1885 Voronovskaya, E. V. [1] Opredelenie asimptoticheskogo vida priblizheniya funktsiy polinomami S. N. Bernshteyna (The asymptotic behavior of the approximation functions by their Bernstein polynomials). DAN, 1932 Zygmund, A. [2] Smooth functions, Duke Math. Journal, 1945
INDEX Alternant, Tchebysheff . . . . 28,66 Approximation best 20,64 degree of 162 mean value viii uniform viii Axiom of choice multidimensional Bolzano- Weierstrass 24 Condition Abel's 93 Dini-Lipschitz 153 Lipschitz 77 Continuity, modulus of 75 Convergence 23 Convergence factors 222 Density 50 measure for the 118 Deviation 20, 64 least 20,64 Formula Parseval 60 Wallis 211 Function analytic 172 entire 173 generating 42 induced 120 Image point method 59 Image point of a polynomial . . 24, 63 Inequality Bernoulli 209 S. N. Bernstein's 90,133 Cauchy's 61 Lebesgue's 152 A. A. Markoff's 141 W. A. Markoff's 141 Integral De La Vallee-Poussin .... 10 DlRICHLET 151 Fejer 158 Jackson's 83,107 Norm of a polynomial 21, 62 Ordinals of a trigonometric polynomial 14 Orthogonal with a density .... 51 Polynomial of the best approximation ... 20, 65 Bernstein 5 of the least deviation 20, 65 norm of a 21, 62 quasinorm of the 21, 62 Tchebysheff 38 image point of the 24, 63 Quasinorm of the polynomial . . 21, 62 Series Fourter 142 trigonometric 142 Sums Bernstein-Rogosinsky's . . . .217 De La Vallee-Poussin's .... 167 Fejer 157 partial—of a Fourier series . . . 149 System of functions, complete . . . 143 231
INDEX OF SYMBOLS A ([a, b]) 172 A2* 172 Bn(x) 5 Bi(x) =B*n[f;x] 217 C([a,6]) vii C2x vii CL 86 en 85 En 20,64 El 64 Gar 173 Hn 20 hi 88 Hi 62 L(P) 21 L{T) 58 Lip a 11 LipM ol 11 M(P) 2 M(T) 62 n\\ 9 S»\f\-Snlf;z] 152 Un(x) = Un[f; x] 220 Vn{x) = Vn[f;x] 10 W 79 Z 106 Tn(a) 118 A(P) 20 A(T) 64 An(a) 162 pW 222 <Tn(x) = <Tn\f; x] 157 Tn(x) = Tn[f; x] 168 +(6) 120 w(5) 75 p 35 232